Opened 10 years ago

Closed 10 years ago

Last modified 10 years ago

#503 closed defect (fixed)

Nginx worker process crash when the upstream reject the nginx connection.

Reported by: Hao Chen Owned by: Maxim Dounin
Priority: critical Milestone:
Component: nginx-core Version: 1.4.x
Keywords: ngnix worker crash Cc:
uname -a: Linux hostname 2.6.32-358.6.2.el6.x86_64 #1 SMP Thu May 16 20:59:36 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.4.4

Description

Actually, this defect is existed from v1.4.4 to v1.5.19

This defect is easy to reproduce. I use Nginx as Websocket reverse-proxy, when the ngnix connected upstream server, and upstream sever reject the connection after sending the error message because some conditions not satisfied, then, Ngnix worker process crashed.

gdb shows the crash information as below:

2628        if (upstream->read->timedout || upstream->write->timedout) {

(gdb) p upstream
$1 = (ngx_connection_t *) 0x0 <---- NULL Pointer

(gdb) bt
#0  ngx_http_upstream_process_upgraded (r=0x1808f90, from_upstream=0, do_write=1) at src/http/ngx_http_upstream.c:2628
#1  0x000000000044a37f in ngx_http_upstream_upgrade (r=0x1808f90, u=0x180a1b0) at src/http/ngx_http_upstream.c:2566
#2  ngx_http_upstream_send_response (r=0x1808f90, u=0x180a1b0) at src/http/ngx_http_upstream.c:2213
#3  ngx_http_upstream_process_header (r=0x1808f90, u=0x180a1b0) at src/http/ngx_http_upstream.c:1735
#4  0x00000000004474fa in ngx_http_upstream_handler (ev=<value optimized out>) at src/http/ngx_http_upstream.c:977
#5  0x000000000042c547 in ngx_epoll_process_events (cycle=<value optimized out>, timer=<value optimized out>, flags=<value optimized out>)
    at src/event/modules/ngx_epoll_module.c:691
#6  0x0000000000425593 in ngx_process_events_and_timers (cycle=0x1804f80) at src/event/ngx_event.c:248
#7  0x000000000042b1c9 in ngx_worker_process_cycle (cycle=0x1804f80, data=<value optimized out>) at src/os/unix/ngx_process_cycle.c:816
#8  0x0000000000429ab4 in ngx_spawn_process (cycle=0x1804f80, proc=0x42b0f6 <ngx_worker_process_cycle>, data=0x0, name=0x49b766 "worker process", respawn=-3)
    at src/os/unix/ngx_process.c:198
#9  0x000000000042a700 in ngx_start_worker_processes (cycle=0x1804f80, n=1, type=-3) at src/os/unix/ngx_process_cycle.c:364
#10 0x000000000042b707 in ngx_master_process_cycle (cycle=0x1804f80) at src/os/unix/ngx_process_cycle.c:136
#11 0x0000000000410cbc in main (argc=<value optimized out>, argv=<value optimized out>) at src/core/nginx.c:407
(gdb)

after debugging and looked into the source code, I found the following code: (we can see the ngx_http_upstream_process_upgraded() will be invoked twice )

http://trac.nginx.org/nginx/browser/nginx/src/http/ngx_http_upstream.c#L2557

2557	    if (u->peer.connection->read->ready
2558	        || u->buffer.pos != u->buffer.last)
2559	    {
2560	        ngx_http_upstream_process_upgraded(r, 1, 1);
2561	    }
2562	
2563	    ngx_http_upstream_process_upgraded(r, 0, 1);
2564	}

Take a look the function ngx_http_upstream_process_upgraded(), we can see if the upstream connection closed, then the ngx_http_upstream_finalize_request() will be invoked.

http://trac.nginx.org/nginx/browser/nginx/src/http/ngx_http_upstream.c#L2738

2738	    if (ngx_handle_read_event(upstream->read, 0) != NGX_OK) {
2739	        ngx_http_upstream_finalize_request(r, u, NGX_ERROR); <--- Free the memory
2740	        return;
2741	    }

The ngx_http_upstream_finalize_request() function will shutdown, close and free the memory of peer connection.

So, we can see the ngx_http_upstream_process_upgraded() will be called twice, but the second one hasn't any guard to check NULL pointer.


After check the source code change, found the following message:

http://trac.nginx.org/nginx/changeset/5532/nginx

r5500	 r5532	 
2561	2561	    } 
2562	2562	 
2563	 	    if (c->read->ready 
2564	 	        || r->header_in->pos != r->header_in->last) 
2565	 	    { 
2566	 	        ngx_http_upstream_process_upgraded(r, 0, 1); 
2567	 	    } 
 	2563	    ngx_http_upstream_process_upgraded(r, 0, 1); 
2568	2564	} 
2569	2565	


I think we should rollback that change and reconsider it.

Thanks,
Hao Chen

Change History (6)

comment:1 by Maxim Dounin, 10 years ago

Owner: set to Maxim Dounin
Status: newassigned

Thank you for your report, I'll take a look how to properly fix this.

comment:2 by Maxim Dounin, 10 years ago

Please try the following patch:

# HG changeset patch
# User Maxim Dounin <mdounin@mdounin.ru>
# Date 1392419939 -14400
#      Sat Feb 15 03:18:59 2014 +0400
# Node ID e4cc147b8b8c5c2b7f2c9de29d86d4b228c71ecf
# Parent  cff36d2d7fe6db1baa9d44ed30ebd26b20c05d06
Upstream: ngx_post_event() instead of upgraded call (ticket #503).

If a request is finalized in the first call to the
ngx_http_upstream_process_upgraded() function, e.g., because upstream
server closed the connection for some reason, in the second call
the u->peer.connection pointer will be null, resulting in segmentation
fault.

Fix is to avoid second direct call, and post event instead.  This ensures
that ngx_http_upstream_process_upgraded() won't be called again if
a request is finalized.

diff --git a/src/http/ngx_http_upstream.c b/src/http/ngx_http_upstream.c
--- a/src/http/ngx_http_upstream.c
+++ b/src/http/ngx_http_upstream.c
@@ -2557,7 +2557,9 @@ ngx_http_upstream_upgrade(ngx_http_reque
     if (u->peer.connection->read->ready
         || u->buffer.pos != u->buffer.last)
     {
+        ngx_post_event(c->read, &ngx_posted_events);
         ngx_http_upstream_process_upgraded(r, 1, 1);
+        return;
     }
 
     ngx_http_upstream_process_upgraded(r, 0, 1);

Note that the problem existed before the commit mentioned. The commit in question only changes conditions when the segfault appears.

comment:3 by Hao Chen, 10 years ago

With patched this fix, no crash happened. and looks no side effects. Thanks Maxim.

Last edited 10 years ago by Hao Chen (previous) (diff)

comment:4 by Maxim Dounin <mdounin@…>, 10 years ago

In 7586e7b2dbe9d9999014a99fe40e58c2e90b9cf5/nginx:

Upstream: ngx_post_event() instead of upgraded call (ticket #503).

If a request is finalized in the first call to the
ngx_http_upstream_process_upgraded() function, e.g., because upstream
server closed the connection for some reason, in the second call
the u->peer.connection pointer will be null, resulting in segmentation
fault.

Fix is to avoid second direct call, and post event instead. This ensures
that ngx_http_upstream_process_upgraded() won't be called again if
a request is finalized.

comment:5 by Maxim Dounin, 10 years ago

Resolution: fixed
Status: assignedclosed

Fix committed, thanks.

comment:6 by Maxim Dounin <mdounin@…>, 10 years ago

In c8a14fbd3ce9f59a0157fb0d0dbe37ca382aaba6/nginx:

Upstream: ngx_post_event() instead of upgraded call (ticket #503).

If a request is finalized in the first call to the
ngx_http_upstream_process_upgraded() function, e.g., because upstream
server closed the connection for some reason, in the second call
the u->peer.connection pointer will be null, resulting in segmentation
fault.

Fix is to avoid second direct call, and post event instead. This ensures
that ngx_http_upstream_process_upgraded() won't be called again if
a request is finalized.

Note: See TracTickets for help on using tickets.