Opened 13 years ago

Closed 13 years ago

#53 closed defect (fixed)

Nginx fails to accept new connection if active worker crashes

Reported by: Fasih
Owned by: somebody
Priority: major
Milestone:
Component: nginx-core
Version: 1.0.x
Keywords: multiple workers, ngx_accept_mutex, accept new connection fails
Cc:
uname -a: Linux faskiri-pc 2.6.32-24-generic #43-Ubuntu SMP Thu Sep 16 14:58:24 UTC 2010 x86_64 GNU/Linux
nginx -V: nginx: nginx version: nginx/1.0.5
nginx: built by gcc 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5)
nginx: configure arguments: --without-http_ssi_module --without-http_geo_module --without-http_fastcgi_module --without-http_uwsgi_module --without-http_scgi_module --without-http_memcached_module --without-mail_pop3_module --without-mail_imap_module --without-mail_smtp_module --with-pcre --with-debug

Description

If nginx is configured with multiple workers, killing the active worker causes nginx to "hang".

Further investigation (with nginx logging at debug level) showed that all threads are fine, but none of the workers is acquiring the ngx_accept_mutex_lock.

The master tries to release the ngx_accept_mutex_lock if the dead process was holding it [https://svn.nginx.org/nginx/browser/nginx/trunk/src/os/unix/ngx_process.c?annotate=blame#L503], but it doesn't look like the value is set anywhere.

I think https://svn.nginx.org/nginx/changeset/612/nginx broke it.

I have been using nginx for only a couple of months, so I am not very sure of the diagnosis; please feel free to correct me.
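The mechanism described above can be sketched as a pid-owned lock word. This is a minimal illustration, not nginx's actual code: the type `accept_lock_t` and the three functions are hypothetical names, and modelling the "value not set" case as a NULL pointer on the master side is an assumption about what the report means.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock word that would live in memory shared by all workers:
 * 0 means unlocked, any other value is the pid of the current owner. */
typedef atomic_long accept_lock_t;

/* Worker side: try to take the lock by atomically swapping 0 -> own pid. */
bool accept_lock_try(accept_lock_t *lock, long pid)
{
    long expected = 0;

    assert(pid != 0);   /* 0 is reserved for "unlocked" */
    return atomic_compare_exchange_strong(lock, &expected, pid);
}

/* Worker side: release the lock, but only if this pid actually owns it. */
bool accept_lock_release(accept_lock_t *lock, long pid)
{
    long expected = pid;

    return atomic_compare_exchange_strong(lock, &expected, 0);
}

/* Master side: called after a worker with pid dead_pid has exited.
 * Clears the lock only if the dead worker was the owner. If the master
 * has no valid reference to the lock (modelled here as NULL, which is an
 * assumption), nothing is released and the lock stays held forever. */
bool accept_lock_force_release(accept_lock_t *lock, long dead_pid)
{
    long expected = dead_pid;

    if (lock == NULL) {
        return false;
    }
    return atomic_compare_exchange_strong(lock, &expected, 0);
}
```

In this sketch, if worker 100 dies while holding the lock and the master calls `accept_lock_force_release(NULL, 100)`, every later `accept_lock_try` by the surviving workers fails, which matches the "hang" described above. Passing the real lock address lets the next worker acquire it again.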

Change History (4)

comment:1 by Gabriel De Luca, 13 years ago

I don't know about the ngx_accept_mutex_lock part because I have yet to look into the source code. But while running load tests to see how well nginx performs as a reverse proxy, I ran into something similar. Every time I ran 100 simultaneous connections at 100 requests per second, nginx would crash or stop answering requests. I had to shut down nginx, delete the pid file, and delete the logs. Then I had to repeatedly start and stop nginx before it started answering requests again.

By the way, I'm using nginx 1.0.9 on Windows XP (I've also tested 1.0.10, same problem).

Last edited 13 years ago by Gabriel De Luca

comment:2 by Maxim Dounin, 13 years ago

The following patch series should fix the accept mutex problem: http://mailman.nginx.org/pipermail/nginx-devel/2011-November/001471.html

The Windows problem is unrelated. If running under Windows, please make sure you are using only *one* worker or have accept_mutex switched off.
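For reference, the two workarounds mentioned above might look like this in nginx.conf. This is only a sketch using the standard `worker_processes` and `accept_mutex` directives; either one on its own should be enough according to the advice above, and the rest of the configuration is omitted.

```nginx
# Option 1: run a single worker process.
worker_processes 1;

events {
    # Option 2: keep several workers but switch the accept mutex off.
    accept_mutex off;
}
```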

comment:3 by Maxim Dounin, 13 years ago

Status: new → accepted

comment:4 by Maxim Dounin, 13 years ago

Resolution: fixed
Status: accepted → closed

Fix committed (r4310).
