Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#362 closed defect (fixed)

deadlock on win32 with accept_mutex and multiple worker_processes

Reported by: Ogurets Aggressive
Owned by:
Priority: minor
Milestone:
Component: nginx-module
Version:
Keywords: win32 worker_processes accept_mutex deadlock
Cc:
uname -a: Microsoft Windows [Version 5.2.3790] (Windows 2003 Server SP2 Enterprise x64)
nginx -V: nginx version: nginx/1.4.1
TLS SNI support enabled
configure arguments: --with-cc=cl --builddir=objs.msvc8 --with-debug --prefix= --conf-path=conf/nginx.conf --pid-path=logs/nginx.pid --http-log-path=logs/access.log --error-log-path=logs/error.log --sbin-path=nginx.exe --http-client-body-temp-path=temp/client_body_temp --http-proxy-temp-path=temp/proxy_temp --http-fastcgi-temp-path=temp/fastcgi_temp --http-scgi-temp-path=temp/scgi_temp --http-uwsgi-temp-path=temp/uwsgi_temp --with-cc-opt=-DFD_SETSIZE=1024 --with-pcre=objs.msvc8/lib/pcre-8.32 --with-zlib=objs.msvc8/lib/zlib-1.2.7 --with-select_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_stub_status_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-mail --with-openssl=objs.msvc8/lib/openssl-1.0.1e --with-openssl-opt=enable-tlsext --with-http_ssl_module --with-mail_ssl_module --with-ipv6

Description

To reproduce the error, all of the following are required:

  1. worker_processes > 1
  2. accept_mutex on
  3. a server block like this:
    server {
        listen       80;
        server_name  localhost;
        location / {
            proxy_pass          http://SOME_IP;
            proxy_buffering     off;
        }
    }

Here "SOME_IP" is the address of another server (not sure whether that matters, though).
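
Put together, the three conditions might look like the following complete nginx.conf. This is only a sketch: the worker count of 2 and the worker_connections value are assumptions chosen for illustration, not values from the report, and "SOME_IP" remains a placeholder that has to be replaced with a real upstream address before nginx will start.

```nginx
# Hypothetical minimal nginx.conf combining the three conditions above.
worker_processes  2;            # condition 1: more than one worker (2 is an assumed value)

events {
    worker_connections  1024;   # assumed value, not taken from the report
    accept_mutex        on;     # condition 2: accept mutex enabled
}

http {
    server {                    # condition 3: the server block from the report
        listen       80;
        server_name  localhost;

        location / {
            proxy_pass       http://SOME_IP;   # placeholder, as in the report
            proxy_buffering  off;
        }
    }
}
```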

Accessing this location from a browser hangs in "waiting for answer" for a very long time.
The following workarounds seem to solve the problem:

  1. Set "worker_processes" to 1

or

  1. Set "accept_mutex" to "off"

Also, non-proxy locations (without "proxy_pass") seem to work fine under the same conditions; only proxied locations are affected.
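
As a rough sketch, the two workarounds correspond to the following top-level settings in nginx.conf. Either one alone is reported to be enough; the worker count of 2 in the commented-out alternative is an assumed value for illustration.

```nginx
# Workaround A: run a single worker process.
worker_processes  1;

# Workaround B (alternative to A): keep several workers
# but switch the accept mutex off.
#
# worker_processes  2;    # assumed value
#
# events {
#     accept_mutex  off;
# }
```

The accept_mutex directive belongs in the events block, while worker_processes is set at the top level of the configuration.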

Change History (3)

comment:1 by Maxim Dounin, 7 years ago

Keywords: win32 worker_processes accept_mutex added; proxy-pass worker mutex removed
Status: new → accepted
Summary: Infinite "waiting for answer" for servers with "proxy-pass" → deadlock on win32 with accept_mutex and multiple worker_processes

This is a known issue with the Windows version; thanks for filing the ticket. See http://nginx.org/en/docs/windows.html:

Although several workers can be started, only one of them actually does any work.

With accept_mutex switched on, this results in a deadlock if accept_mutex happens to be grabbed by a worker process that can't accept connections. The issue isn't related to proxy_pass, but using it might increase the probability of accept_mutex being obtained by the wrong worker process.

The following trivial patch completely disables accept_mutex on win32:

--- a/src/event/ngx_event.c
+++ b/src/event/ngx_event.c
@@ -607,6 +607,17 @@ ngx_event_process_init(ngx_cycle_t *cycl
         ngx_use_accept_mutex = 0;
     }
 
+#if (NGX_WIN32)
+
+    /*
+     * disable accept mutex on win32 as it may cause deadlock if
+     * grabbed by a process which can't accept connections
+     */
+
+    ngx_use_accept_mutex = 0;
+
+#endif
+
 #if (NGX_THREADS)
     ngx_posted_events_mutex = ngx_mutex_init(cycle->log, 0);
     if (ngx_posted_events_mutex == NULL) {

This is probably the right way to go, at least until the underlying limitation is resolved.

comment:2 by Maxim Dounin, 7 years ago

Resolution: fixed
Status: accepted → closed

The patch is committed as [c9fe549b127b].

comment:3 by Maxim Dounin <mdounin@…>, 7 years ago

In 1b70200d83e3e3f94883048f8517d1cc27cf90d3/nginx:

Win32: accept_mutex now always disabled (ticket #362).

Use of accept mutex on win32 may result in a deadlock if there are multiple
worker_processes configured and the mutex is grabbed by a process which
can't accept connections.
