Opened 5 years ago

Closed 5 years ago

#822 closed defect (fixed)

Worker process crashing (exit, signal 11)

Reported by: Valerian Saliou
Owned by: Valentin V. Bartenev
Priority: major
Milestone:
Component: nginx-core
Version: 1.9.x
Keywords:
Cc:
uname -a: Linux jappix.com 3.16.0-4-686-pae #1 SMP Debian 3.16.7-ckt11-1+deb8u5 (2015-10-09) i686 GNU/Linux
nginx -V: nginx version: nginx/1.9.6

Description

Hello,

Since updating from 1.9.5 to 1.9.6 a few hours ago, I'm getting a lot of these errors in my error.log:


2015/10/28 12:51:59 [alert] 27226#27226: worker process 1412 exited on signal 11
2015/10/28 12:51:59 [alert] 27226#27226: worker process 1413 exited on signal 11
2015/10/28 12:52:00 [alert] 27226#27226: worker process 1414 exited on signal 11
2015/10/28 12:52:06 [alert] 27226#27226: worker process 1415 exited on signal 11
2015/10/28 12:52:07 [alert] 27226#27226: worker process 1416 exited on signal 11
2015/10/28 12:52:08 [alert] 27226#27226: worker process 1515 exited on signal 11


2 worker processes are running simultaneously at any time.
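For readers unfamiliar with the alert above: signal 11 is SIGSEGV on Linux, and a parent process normally learns which signal killed a child through waitpid(). The following is a minimal sketch of that mechanism, not nginx's own code. The helper name child_termination_signal is hypothetical and only illustrates where a number like 11 comes from.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of nginx): fork a child that kills itself
 * with `signo`, then return the signal number the parent observes.
 * Returns -1 on error or if the child was not terminated by a signal.
 */
int
child_termination_signal(int signo)
{
    pid_t  pid;
    int    status;

    assert(signo > 0);

    pid = fork();
    if (pid < 0) {
        return -1;
    }

    if (pid == 0) {
        /* Child: make sure the default action applies, then raise. */
        signal(signo, SIG_DFL);
        raise(signo);
        _exit(0);   /* only reached if the signal did not terminate us */
    }

    /* Parent: collect the child's exit status. */
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }

    /* WTERMSIG() is only meaningful when WIFSIGNALED() is true. */
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling child_termination_signal(SIGSEGV) on Linux should yield 11, the same number that appears in the "exited on signal 11" alerts.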

The service running this NGINX instance is doing A LOT of proxying (HTTP + WebSocket).

HTTP/2 is enabled on all hosts.
Custom SSL settings are specified.

I gist-ed my configuration for you to look at:

Let me know if you need further details. It seems this issue makes my NGINX instance drop a lot of incoming requests whenever a worker crashes while they are in flight.

Side note: I run 3 separate NGINX servers on the same server architecture (same uname), for 3 distinct services. Their base configuration is exactly the same (same nginx.conf), except for the SSL settings. I am only seeing this issue on 1 of my 3 servers, though the buggy server is the most heavily loaded one.

Change History (15)

comment:1 by Valerian Saliou, 5 years ago

As a side note: I could not reproduce the issue on my end. A buggy client seems to be periodically attempting to connect to NGINX - causing the crash to happen.


comment:2 by maxim, 5 years ago

Hi,

could you please provide a stack trace from the coredump?

https://www.nginx.com/resources/wiki/start/topics/tutorials/debugging/

Also, we need full nginx -V output, if you don't mind.

comment:3 by Valerian Saliou, 5 years ago

nginx -V output:


nginx version: nginx/1.9.6
built by gcc 4.9.2 (Debian 4.9.2-10)
built with OpenSSL 1.0.1k 8 Jan 2015
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-http_auth_request_module --with-threads --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-file-aio --with-http_v2_module --with-cc-opt='-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2' --with-ld-opt='-Wl,-z,relro -Wl,--as-needed' --with-ipv6

comment:4 by Valerian Saliou, 5 years ago

Core dump:


[root@jappix debugcores]# gdb /usr/sbin/nginx /srv/data_jappix/web/debugcores/core
GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i586-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/nginx...(no debugging symbols found)...done.
[New LWP 29146]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/i686/cmov/libthread_db.so.1".
Core was generated by `nginx: worker process '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x080aa8ef in ?? ()
(gdb) backtrace full
#0 0x080aa8ef in ?? ()
No symbol table info available.
#1 0x080abb4d in ?? ()
No symbol table info available.
#2 0x08066ced in ?? ()
No symbol table info available.
#3 0x080668c4 in ?? ()
No symbol table info available.
#4 0x0806c2a8 in ?? ()
No symbol table info available.
#5 0x0806adbc in ?? ()
No symbol table info available.
#6 0x0806d354 in ?? ()
No symbol table info available.
#7 0x0804dd51 in ?? ()
No symbol table info available.
#8 0xb729ca63 in __libc_start_main (main=0x804d2c0, argc=3, argv=0xbf81dec4, init=0x80e5c70, fini=0x80e5ce0,
    rtld_fini=0xb774fc90 <_dl_fini>, stack_end=0xbf81debc) at libc-start.c:287
        result = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {-1220374528, 0, 0, 0, -434543122, -1238285825}, mask_was_saved = 0}}, priv = {
            pad = {0x0, 0x0, 0x3, 0x804dde8}, data = {prev = 0x0, cleanup = 0x0, canceltype = 3}}}
        not_first_call = <optimized out>
#9 0x0804de09 in ?? ()
No symbol table info available.

comment:5 by Valerian Saliou, 5 years ago

I hope this is what you are asking for -- I'm not familiar with those tools / low-level debugging. Cheers!

comment:6 by maxim, 5 years ago

Hi,

not exactly -- the binary is missing debug symbols. There are two options:

  1. install the nginx-dbg package, which has a non-stripped binary, then reproduce the crash and get a backtrace from the new core with it.
  2. send us the core file.

The second is probably the easiest option, but I should warn you that the core may contain private data you don't want to share.

Let me know how you want to proceed.

Thanks!

in reply to:  6 comment:7 by Valerian Saliou, 5 years ago

Replying to maxim:


Hello,

Here's the complete dump: https://gist.github.com/valeriansaliou/7ab9eb6cf0a062e3d575

Hope that helps a little bit more :)

Cheers,

Valerian.

comment:8 by Valerian Saliou, 5 years ago

I see some XML traces in the dump. We're using NGINX to proxy a large number of XML packets per sec.
Let me know if you need some traces/exports of the request headers + data to help you reproduce.

Requests are made either via long-lived HTTPS or WebSocket (depending on browser support) - I assume a browser supporting HTTP/2 is using WebSocket in all cases.


comment:9 by Nick, 5 years ago

Confirming this bug - had to downgrade to 1.9.5. Nginx works as a proxy in front of Apache for websites. The problems appeared when using HTTP/2.

comment:10 by Valentin V. Bartenev, 5 years ago

Status: new → accepted

Thank you for the report. We are able to reproduce.

comment:11 by Valentin V. Bartenev, 5 years ago

The following patch should fix the issue:

diff -r b78018cfaa2f src/http/v2/ngx_http_v2.c
--- a/src/http/v2/ngx_http_v2.c Tue Oct 27 16:47:29 2015 +0300
+++ b/src/http/v2/ngx_http_v2.c Thu Oct 29 17:55:14 2015 +0300
@@ -870,8 +870,6 @@ ngx_http_v2_state_data(ngx_http_v2_conne
         return ngx_http_v2_state_skip_padded(h2c, pos, end);
     }
 
-    stream->in_closed = h2c->state.flags & NGX_HTTP_V2_END_STREAM_FLAG;
-
     h2c->state.stream = stream;
 
     return ngx_http_v2_state_read_data(h2c, pos, end);
@@ -899,6 +897,8 @@ ngx_http_v2_state_read_data(ngx_http_v2_
     }
 
     if (stream->skip_data) {
+        stream->in_closed = h2c->state.flags & NGX_HTTP_V2_END_STREAM_FLAG;
+
         ngx_log_debug1(NGX_LOG_DEBUG_HTTP, h2c->connection->log, 0,
                        "skipping http2 DATA frame, reason: %d",
                        stream->skip_data);
@@ -988,7 +988,9 @@ ngx_http_v2_state_read_data(ngx_http_v2_
                                       ngx_http_v2_state_read_data);
     }
 
-    if (stream->in_closed) {
+    if (h2c->state.flags & NGX_HTTP_V2_END_STREAM_FLAG) {
+        stream->in_closed = 1;
+
         if (r->headers_in.content_length_n < 0) {
             r->headers_in.content_length_n = rb->rest;
 

comment:12 by maxim, 5 years ago

Owner: set to Valentin V. Bartenev
Status: accepted → assigned

comment:13 by Bruno Bigras, 5 years ago

I had those crashes while using nagios. The patch seems to have fixed the problem.

comment:14 by Valentin Bartenev <vbart@…>, 5 years ago

In 6288:0f4b7800e681/nginx:

HTTP/2: backed out 16905ecbb49e (ticket #822).

It caused inconsistency between setting "in_closed" flag and the moment when
the last DATA frame was actually read. As a result, the body buffer might not
be initialized properly in ngx_http_v2_init_request_body(), which led to a
segmentation fault in ngx_http_v2_state_read_data(). It might also cause
processing of an incomplete body to start.

This issue could be triggered when the processing of a request was delayed,
e.g. in the limit_req or auth_request modules.
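The ordering problem described in this commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not nginx source: every toy_* name is hypothetical, and the model only tracks when an "in_closed"-style flag becomes visible relative to how much of the final DATA frame has actually been read. TOY_FLAG_AT_HEADER mimics the backed-out behaviour (flag set when the frame header is parsed), and TOY_FLAG_AFTER_PAYLOAD mimics the patched behaviour (flag set once the whole frame is consumed).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_END_STREAM  0x01   /* END_STREAM bit of an HTTP/2 DATA frame */

typedef enum {
    TOY_FLAG_AT_HEADER,        /* set in_closed while parsing the header */
    TOY_FLAG_AFTER_PAYLOAD     /* set in_closed after the payload is read */
} toy_mode_t;

typedef struct {
    unsigned  in_closed;       /* "peer sends no more data" marker */
    unsigned  frame_flags;     /* flags of the frame being read */
    size_t    frame_left;      /* payload bytes of that frame still unread */
    size_t    body_read;       /* total body bytes consumed so far */
} toy_stream_t;

/* Parse a DATA frame header announcing `length` payload bytes. */
void
toy_frame_header(toy_stream_t *s, toy_mode_t mode, unsigned flags,
    size_t length)
{
    assert(s != NULL);

    s->frame_flags = flags;
    s->frame_left = length;

    if (mode == TOY_FLAG_AT_HEADER) {
        s->in_closed = (flags & TOY_END_STREAM) != 0;
    }
}

/* Consume up to `n` bytes of the current frame's payload. */
void
toy_frame_payload(toy_stream_t *s, toy_mode_t mode, size_t n)
{
    assert(s != NULL);

    if (n > s->frame_left) {
        n = s->frame_left;
    }

    s->frame_left -= n;
    s->body_read += n;

    if (mode == TOY_FLAG_AFTER_PAYLOAD
        && s->frame_left == 0
        && (s->frame_flags & TOY_END_STREAM))
    {
        s->in_closed = 1;
    }
}

/* What a delayed request handler would conclude if it ran right now. */
int
toy_body_looks_complete(const toy_stream_t *s)
{
    return s->in_closed != 0;
}
```

In this model, after a 100-byte END_STREAM frame header and only 40 payload bytes, the TOY_FLAG_AT_HEADER stream already "looks complete" while the TOY_FLAG_AFTER_PAYLOAD stream does not. That gap is the window in which a handler resumed late (for example after limit_req or auth_request) could act on a body that has not fully arrived.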

comment:15 by Valentin V. Bartenev, 5 years ago

Resolution: fixed
Status: assigned → closed