Opened 4 years ago
Closed 3 years ago
#2188 closed enhancement (fixed)
nginx -t with many worker_processes/reuseport is expensive
Reported by: | Owned by: | ||
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | documentation | Version: | 1.18.x |
Keywords: | Cc: | ||
uname -a: | Linux laptop 5.10.0-1-amd64 #1 SMP Debian 5.10.4-1 (2020-12-31) x86_64 GNU/Linux | ||
nginx -V: | not relevant |
Description (last modified by )
nginx -t
forks worker_processes
worker processes just like normal nginx. With reuseport
in listen
directive in use each of the workers create separate socket. On modern multicore machine (assuming i.e. worker_processes auto;
) this creates a lot of sockets which are not of much value. Large configurations take a lot longer to test.
I think we could assume in testing that configuration has worker_processes 1;
. Is this problematic in any way?
We can't use -g
option to override configured value, because nginx complains about duplicated directive.
Change History (7)
comment:1 by , 4 years ago
Description: | modified (diff) |
---|
comment:2 by , 4 years ago
comment:3 by , 4 years ago
Yes, it is exactly as you wrote - I misread strace output.
As for the numbers - we have a quite large nginx instance running on 40 core linux machine.
Without reuseport test opens 1424 sockets:
real 0m0.768s user 0m0.286s sys 0m0.321s
With reuseport test opens 22796 sockets:
real 0m16.048s user 0m0.290s sys 0m15.600s
Times vary, but are many times longer than usual.
What is worth mentioning, stopping nginx instance makes nginx -t
faster again despite the same number of opened sockets:
real 0m0.805s user 0m0.287s sys 0m0.387s
I'm not sure why it is like that; maybe it is more costly to open new socket when packets are already flowing.
Lastly, when I change working_processes
from auto
to 1
, test opens 1151 sockets:
real 0m0.939s user 0m0.280s sys 0m0.510s
So it looks like we could gain a lot optimizing opening sockets.
comment:4 by , 3 years ago
Thanks for the details. Indeed, it looks like on Linux bind() takes a lot of time for the additional sockets, and it makes a little sense to try to open these sockets given we don't set SO_REUSEPORT anyway. Please test if the following patch work for you.
# HG changeset patch # User Maxim Dounin <mdounin@mdounin.ru> # Date 1621814442 -10800 # Mon May 24 03:00:42 2021 +0300 # Node ID 47945173813f5185e37385be3df5c02037be870d # Parent 394c6e488f03cb63e7ec2a0219a4882985ef52e0 Core: disabled cloning sockets when testing config (ticket #2188). Since we anyway do not set SO_REUSEPORT when testing configuration (see ecb5cd305b06), trying to open additional sockets does not make much sense, as all these additional sockets are expected to result in EADDRINUSE errors from bind(). On the other hand, there are reports that trying to open these sockets takes significant time under load: total configuration testing time greater than 15s was observed in tiket #2188, compared to less than 1s without load. With this change, no additional sockets are opened during testing configuration. diff --git a/src/event/ngx_event.c b/src/event/ngx_event.c --- a/src/event/ngx_event.c +++ b/src/event/ngx_event.c @@ -441,20 +441,23 @@ ngx_event_init_conf(ngx_cycle_t *cycle, #if (NGX_HAVE_REUSEPORT) - ls = cycle->listening.elts; - for (i = 0; i < cycle->listening.nelts; i++) { - - if (!ls[i].reuseport || ls[i].worker != 0) { - continue; - } - - if (ngx_clone_listening(cycle, &ls[i]) != NGX_OK) { - return NGX_CONF_ERROR; - } - - /* cloning may change cycle->listening.elts */ + if (!ngx_test_config) { ls = cycle->listening.elts; + for (i = 0; i < cycle->listening.nelts; i++) { + + if (!ls[i].reuseport || ls[i].worker != 0) { + continue; + } + + if (ngx_clone_listening(cycle, &ls[i]) != NGX_OK) { + return NGX_CONF_ERROR; + } + + /* cloning may change cycle->listening.elts */ + + ls = cycle->listening.elts; + } } #endif
comment:5 by , 3 years ago
Thank you for the patch, it works as expected. When applied, test creates 1153 sockets and takes:
real 0m0.761s user 0m0.286s sys 0m0.475s
Without patch, test creates 22796 sockets and takes:
real 0m16.627s user 0m0.297s sys 0m16.321s
comment:7 by , 3 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Patch committed, thanks for testing.
This is not what happens. No worker processes are forked during configuration testing. On the other hand, all listening sockets, including worker-specific ones when using reuseport, are opened. If this indeed causes a lot of overhead, we may consider explicitly optimizing this. Could you please be more specific about what "a lot longer" means in the particular case you are talking about?