Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#300 closed defect (invalid)

Segfault in ngx_log_error_core related to UDP connections

Reported by: Benjamin Grössing
Owned by:
Priority: minor
Milestone:
Component: nginx-core
Version: 1.3.x
Keywords: resolver, udp
Cc:
uname -a: Linux server1 2.6.32-45-server #102-Ubuntu SMP Wed Jan 2 22:53:00 UTC 2013 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.2.7
built by gcc 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1)
TLS SNI support enabled
configure arguments: --prefix=/opt/nginx --with-http_ssl_module --with-http_gzip_static_module --with-cc-opt=-Wno-error --add-module=/home/user1/.rvm/gems/ruby-1.9.3-p194/gems/passenger-3.0.18/ext/nginx --add-module=/usr/local/src/nginx-statsd

Description

Hello,

I am currently experiencing segmentation faults in nginx worker processes: about half of all requests end in a segfault, and no response is sent to the HTTP client. I am not sure whether the problem is in nginx itself or in a third-party module called nginx-statsd (which is very similar to nginx-udplog). This module sends out a UDP packet for every HTTP request.

The module itself works. However, when the target UDP server (in my case localhost) is unreachable, the segfault sometimes happens. I think this could be caused by a problem in nginx core.

The segfault occurs in ngx_log_error_core, which is called to log "send() failed" because of the ECONNREFUSED error. Strangely, I can't always reproduce the problem: at first it segfaulted every time on my local machine, and later, without any changes, it stopped segfaulting there; on my production server, however, it always crashes. I have attached the backtrace from gdb.
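
For context on where the ECONNREFUSED comes from, here is a minimal sketch in plain C, independent of nginx and of nginx-statsd. It assumes Linux-like behaviour, where a connected UDP socket reports a pending ICMP "port unreachable" as ECONNREFUSED on a later send(). The function name `demo_udp_send_to_closed_port` is made up for illustration.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: sends a few datagrams to a loopback port that
 * nobody listens on. Returns the errno of the first failing send(),
 * 0 if every send() succeeded, or -1 if the sockets could not be set up.
 */
int
demo_udp_send_to_closed_port(void)
{
    struct sockaddr_in  addr;
    socklen_t           len = sizeof(addr);
    int                 probe, s, i, err = 0;

    /* Find a port that is currently free: bind to port 0 and read it back. */
    probe = socket(AF_INET, SOCK_DGRAM, 0);
    if (probe == -1) {
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;

    if (bind(probe, (struct sockaddr *) &addr, sizeof(addr)) == -1
        || getsockname(probe, (struct sockaddr *) &addr, &len) == -1)
    {
        close(probe);
        return -1;
    }

    /* Closing the probe socket leaves the port without a listener. */
    close(probe);

    s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s == -1) {
        return -1;
    }

    /* connect() on UDP only records the peer; no packet is sent yet. */
    if (connect(s, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
        close(s);
        return -1;
    }

    /* The first send() usually succeeds; the error shows up on a later one. */
    for (i = 0; i < 3; i++) {
        if (send(s, "x", 1, 0) == -1) {
            err = errno;
            break;
        }
    }

    close(s);
    return err;
}
```

On a typical Linux host the return value is expected to be ECONNREFUSED, but this depends on the platform and on timing, so a caller has to handle both a failed and a successful send().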

I am running nginx 1.2.7 (but I've had the same problem with 1.2.6) with the latest version of nginx-statsd from https://github.com/zebrafishlabs/nginx-statsd.

Why I think this problem is related to nginx (and not the module): I found this entry in the mailing list archive: http://forum.nginx.org/read.php?2,66090,128251
The patch from Chaos Wang appears to be missing there, but you can get it here: http://www.ruby-forum.com/topic/206352
It patches ngx_resolver, which defines ngx_udp_connection_t, which in turn is used by nginx-statsd (and, I guess, by other third-party modules that use UDP). With this patch applied (slightly modified), the segmentation faults are gone.

I am wondering whether that patch from 2010 was ever merged, or even reviewed by anyone. Could this bug still be present in the latest nginx?

I have attached the gdb backtrace, the full backtrace, and the patch for nginx 1.2.7. In addition, ngx_http_statsd.c is the patched third-party module code (its line numbers should match those in the backtrace).

I would really appreciate it if you could take a look at this.

Regards
Benjamin Grössing

Attachments (4)

patch1.patch (2.0 KB) - added by Benjamin Grössing 12 years ago.
Patch for nginx 1.2.7
backtrace.txt (5.3 KB) - added by Benjamin Grössing 12 years ago.
gdb backtrace
backtrace-full.txt (6.8 KB) - added by Benjamin Grössing 12 years ago.
ngx_http_statsd.c (20.4 KB) - added by Benjamin Grössing 12 years ago.
slightly patched nginx-statsd module


Change History (6)

by Benjamin Grössing, 12 years ago

Attachment: patch1.patch added

Patch for nginx 1.2.7

by Benjamin Grössing, 12 years ago

Attachment: backtrace.txt added

gdb backtrace

by Benjamin Grössing, 12 years ago

Attachment: backtrace-full.txt added

by Benjamin Grössing, 12 years ago

Attachment: ngx_http_statsd.c added

slightly patched nginx-statsd module

comment:1 by Maxim Dounin, 12 years ago

Resolution: invalid
Status: new → closed

The patch to change ngx_udp_connection_t's log to be a pointer is just wrong, as it's not a pointer for a reason - it has to provide its own logging context to be able to use a custom log handler and a custom action (see r3409). With a pointer it will instead corrupt the global log's data.

Correct fix for the segfault in the resolver code was introduced in r4497 (ticket #91). Something similar should be done in 3rd party modules if they have the same problem.
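
To illustrate the point in comment:1 about the embedded log, here is a minimal sketch in plain C. It is not nginx code and does not reproduce r3409 or r4497; `demo_log_t`, `demo_udp_conn_t` and both functions are hypothetical, simplified stand-ins for ngx_log_t and ngx_udp_connection_t. It only shows why customising a private copy of a log is safe, while writing the same fields through a pointer changes the shared log for everyone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for ngx_log_t. */
typedef struct {
    const char  *action;   /* what the owner is doing, printed with errors */
    void        *data;     /* context passed to a custom log handler */
} demo_log_t;

/* Hypothetical, simplified stand-in for ngx_udp_connection_t. */
typedef struct {
    int          fd;
    demo_log_t   log;      /* embedded by value: a private logging context */
} demo_udp_conn_t;

/*
 * By-value variant: copy the shared log into the connection, then
 * customise only the private copy. The shared log stays untouched.
 */
void
demo_udp_init_log(demo_udp_conn_t *uc, const demo_log_t *shared)
{
    assert(uc != NULL && shared != NULL);

    uc->log = *shared;
    uc->log.action = "sending udp packet";
    uc->log.data = uc;
}

/*
 * Pointer variant: the same assignments made through a pointer to the
 * shared log overwrite its fields for every other user of that log.
 */
void
demo_udp_init_log_via_pointer(demo_log_t *shared, void *ctx)
{
    assert(shared != NULL);

    shared->action = "sending udp packet";
    shared->data = ctx;
}
```

In this sketch, calling demo_udp_init_log() leaves the shared log's action and data as they were, whereas demo_udp_init_log_via_pointer() replaces them. A module that embeds its own log this way also has to make sure the copy is filled in from a log that is still valid before the first error is logged; how r4497 achieves that for the resolver is best read from the changeset itself.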

comment:2 by Benjamin Grössing, 12 years ago

Thank you very much for your hint on r4497 and #91, Maxim! I have ported the changes to the module and that fixed it!

Keep up the great work!
