Opened 5 months ago

Closed 5 months ago

#2188 closed enhancement (fixed)

nginx -t with many worker_processes/reuseport is expensive

Reported by: bartebor@… Owned by:
Priority: minor Milestone:
Component: documentation Version: 1.18.x
Keywords: Cc:
uname -a: Linux laptop 5.10.0-1-amd64 #1 SMP Debian 5.10.4-1 (2020-12-31) x86_64 GNU/Linux
nginx -V: not relevant

Description (last modified by bartebor@…)

nginx -t forks worker_processes worker processes just like normal nginx. With reuseport in listen directive in use each of the workers create separate socket. On modern multicore machine (assuming i.e. worker_processes auto;) this creates a lot of sockets which are not of much value. Large configurations take a lot longer to test.

I think we could assume in testing that configuration has worker_processes 1;. Is this problematic in any way?

We can't use -g option to override configured value, because nginx complains about duplicated directive.

Change History (7)

comment:1 by bartebor@…, 5 months ago

Description: modified (diff)

comment:2 by Maxim Dounin, 5 months ago

This is not what happens. No worker processes are forked during configuration testing. On the other hand, all listening sockets, including worker-specific ones when using reuseport, are opened. If this indeed causes a lot of overhead, we may consider explicitly optimizing this. Could you please be more specific about what "a lot longer" means in the particular case you are talking about?

comment:3 by bartebor@…, 5 months ago

Yes, it is exactly as you wrote - I misread strace output.

As for the numbers - we have a quite large nginx instance running on 40 core linux machine.
Without reuseport test opens 1424 sockets:

real    0m0.768s
user    0m0.286s
sys     0m0.321s

With reuseport test opens 22796 sockets:

real    0m16.048s
user    0m0.290s
sys     0m15.600s

Times vary, but are many times longer than usual.

What is worth mentioning, stopping nginx instance makes nginx -t faster again despite the same number of opened sockets:

real    0m0.805s 
user    0m0.287s
sys     0m0.387s

I'm not sure why it is like that; maybe it is more costly to open new socket when packets are already flowing.

Lastly, when I change working_processes from auto to 1, test opens 1151 sockets:

real    0m0.939s
user    0m0.280s
sys     0m0.510s

So it looks like we could gain a lot optimizing opening sockets.

comment:4 by Maxim Dounin, 5 months ago

Thanks for the details. Indeed, it looks like on Linux bind() takes a lot of time for the additional sockets, and it makes a little sense to try to open these sockets given we don't set SO_REUSEPORT anyway. Please test if the following patch work for you.

# HG changeset patch
# User Maxim Dounin <mdounin@mdounin.ru>
# Date 1621814442 -10800
#      Mon May 24 03:00:42 2021 +0300
# Node ID 47945173813f5185e37385be3df5c02037be870d
# Parent  394c6e488f03cb63e7ec2a0219a4882985ef52e0
Core: disabled cloning sockets when testing config (ticket #2188).

Since we anyway do not set SO_REUSEPORT when testing configuration
(see ecb5cd305b06), trying to open additional sockets does not make much
sense, as all these additional sockets are expected to result in EADDRINUSE
errors from bind().  On the other hand, there are reports that trying
to open these sockets takes significant time under load: total configuration
testing time greater than 15s was observed in tiket #2188, compared to less
than 1s without load.

With this change, no additional sockets are opened during testing
configuration.

diff --git a/src/event/ngx_event.c b/src/event/ngx_event.c
--- a/src/event/ngx_event.c
+++ b/src/event/ngx_event.c
@@ -441,20 +441,23 @@ ngx_event_init_conf(ngx_cycle_t *cycle, 
 
 #if (NGX_HAVE_REUSEPORT)
 
-    ls = cycle->listening.elts;
-    for (i = 0; i < cycle->listening.nelts; i++) {
-
-        if (!ls[i].reuseport || ls[i].worker != 0) {
-            continue;
-        }
-
-        if (ngx_clone_listening(cycle, &ls[i]) != NGX_OK) {
-            return NGX_CONF_ERROR;
-        }
-
-        /* cloning may change cycle->listening.elts */
+    if (!ngx_test_config) {
 
         ls = cycle->listening.elts;
+        for (i = 0; i < cycle->listening.nelts; i++) {
+
+            if (!ls[i].reuseport || ls[i].worker != 0) {
+                continue;
+            }
+
+            if (ngx_clone_listening(cycle, &ls[i]) != NGX_OK) {
+                return NGX_CONF_ERROR;
+            }
+
+            /* cloning may change cycle->listening.elts */
+
+            ls = cycle->listening.elts;
+        }
     }
 
 #endif

comment:5 by bartebor@…, 5 months ago

Thank you for the patch, it works as expected. When applied, test creates 1153 sockets and takes:

real 0m0.761s
user 0m0.286s
sys 0m0.475s

Without patch, test creates 22796 sockets and takes:

real    0m16.627s
user    0m0.297s
sys    0m16.321s

comment:6 by Maxim Dounin <mdounin@…>, 5 months ago

In 7867:c860f0b7010c/nginx:

Core: disabled cloning sockets when testing config (ticket #2188).

Since we anyway do not set SO_REUSEPORT when testing configuration
(see ecb5cd305b06), trying to open additional sockets does not make much
sense, as all these additional sockets are expected to result in EADDRINUSE
errors from bind(). On the other hand, there are reports that trying
to open these sockets takes significant time under load: total configuration
testing time greater than 15s was observed in ticket #2188, compared to less
than 1s without load.

With this change, no additional sockets are opened during testing
configuration.

comment:7 by Maxim Dounin, 5 months ago

Resolution: fixed
Status: newclosed

Patch committed, thanks for testing.

Note: See TracTickets for help on using tickets.