Segfault in ngx_log_error_core related to UDP connections
|Reported by:||Benjamin Grössing||Owned by:|
|uname -a:||Linux server1 2.6.32-45-server #102-Ubuntu SMP Wed Jan 2 22:53:00 UTC 2013 x86_64 GNU/Linux|
nginx version: nginx/1.2.7
built by gcc 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1)
TLS SNI support enabled
configure arguments: --prefix=/opt/nginx --with-http_ssl_module --with-http_gzip_static_module --with-cc-opt=-Wno-error --add-module=/home/user1/.rvm/gems/ruby-1.9.3-p194/gems/passenger-3.0.18/ext/nginx --add-module=/usr/local/src/nginx-statsd
I am currently experiencing segmentation faults in nginx worker processes: about half of all requests end in a segfault, and no response is sent to the HTTP client. I am not sure whether the problem is in nginx itself or in a third-party module called nginx-statsd (which is very similar to nginx-udplog). This module basically sends out a UDP packet for every HTTP request.
The module itself works. However, when the target UDP server (in my case localhost) is unreachable, the segfault sometimes happens. I think this could be caused by a problem in nginx core.
The segfault occurs in ngx_log_error_core, which is called to log "send() failed" because of the ECONNREFUSED error. What's strange is that I can't always reproduce the problem: at first it always segfaulted on my local machine, and later, without any changes, it stopped segfaulting there. On my production server, however, it always crashes. I have attached the backtrace from gdb.
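For anyone who wants to trigger the ECONNREFUSED condition outside nginx, here is a minimal sketch. It is not code from nginx or nginx-statsd. The helper name `probe_udp_send` is hypothetical, and the sketch assumes Linux behaviour: on a connected UDP socket, the ICMP "port unreachable" caused by one datagram is reported as an error on a later send().

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of nginx): sends up to `attempts` one-byte
 * datagrams to 127.0.0.1:port over a connected UDP socket.
 *
 * Returns 0 if every send() succeeded, the errno of the first failing
 * send() (typically ECONNREFUSED when nothing listens on the port),
 * or -1 if the socket could not be created or connected.
 */
int
probe_udp_send(unsigned short port, int attempts)
{
    struct sockaddr_in  addr;
    int                 fd, i, err;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* connect() on UDP only fixes the peer; no packet is sent yet */
    if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }

    err = 0;

    for (i = 0; i < attempts; i++) {
        if (send(fd, "x", 1, 0) < 0) {
            err = errno;   /* error queued by an earlier datagram */
            break;
        }

        /* give the kernel ~10 ms to deliver the ICMP error */
        poll(NULL, 0, 10);
    }

    close(fd);
    return err;
}
```

Calling `probe_udp_send(port, 5)` with a port that has no UDP listener would be expected to return ECONNREFUSED on Linux, usually from the second send(). The timing is not guaranteed, which may also explain why the failure is intermittent.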
I am running nginx 1.2.7 (but I've had the same problem with 1.2.6) with the latest version of nginx-statsd from https://github.com/zebrafishlabs/nginx-statsd.
Why I think this problem is related to nginx (and not the module): I have found this entry in the mailing list archive: http://forum.nginx.org/read.php?2,66090,128251
The patch from Chaos Wang appears to be missing from that thread, but you can get it here: http://www.ruby-forum.com/topic/206352
It patches ngx_resolver, and ngx_resolver defines ngx_udp_connection_t, which in turn is used by nginx-statsd (and, I guess, by other third-party modules that use UDP). With this patch applied (slightly modified), the segmentation faults are gone.
I am wondering whether that patch from 2010 was ever merged or even reviewed by anyone. Could this bug still be present in the latest nginx?
I have attached the gdb backtrace, the full backtrace, and the patch for nginx 1.2.7. Also, nginx_http_statsd.c is the patched third-party module code (line numbers should match those in the backtrace).
I would really appreciate it if you could have a look at it.