Opened 7 years ago

Closed 7 years ago

#514 closed defect (fixed)

worker process caused 100% CPU usage on one core on win32

Reported by: Nos Apatka Owned by:
Priority: minor Milestone:
Component: nginx-core Version: 1.5.x
Keywords: high CPU usage win32 Cc:
uname -a:
nginx -V: nginx version: nginx/1.5.10
TLS SNI support enabled
configure arguments: --with-cc=cl --builddir=objs.msvc8 --with-debug --prefix= --conf-path=conf/nginx.conf --pid-path=logs/nginx.pid --http-log-path=logs/access.log --error-log-path=logs/error.log --sbin-path=nginx.exe --http-client-body-temp-path=temp/client_body_temp --http-proxy-temp-path=temp/proxy_temp --http-fastcgi-temp-path=temp/fastcgi_temp --http-scgi-temp-path=temp/scgi_temp --http-uwsgi-temp-path=temp/uwsgi_temp --with-cc-opt=-DFD_SETSIZE=1024 --with-pcre=objs.msvc8/lib/pcre-8.34 --with-zlib=objs.msvc8/lib/zlib-1.2.8 --with-select_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_stub_status_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_auth_request_module --with-http_random_index_module --with-http_secure_link_module --with-mail --with-openssl=objs.msvc8/lib/openssl-1.0.1f --with-openssl-opt=enable-tlsext --with-http_ssl_module --with-mail_ssl_module --with-ipv6

Description

Timing issue scenario:

source code file (tip level): \src\os\win32\ngx_process_cycle.c

Step 1: After the event “ngx_worker_quit_<pid>” has been signaled the code in function ngx_worker_process_cycle() of the worker process main thread set the ngx_quit to the value 1 in line #703 and jump out of the cycle;

Step 2: In ngx_worker_thread() the ngx_quit is set to the value 0 in line #840, then closing listening sockets and set the ngx_exiting = 1 in line #844.

Step 3: ngx_cache_manager_event is signaled in line #730 in function ngx_worker_process_cycle() of the worker process main thread;

Step 4: the cache manager thread has been waked up in ngx_cache_manager_process_handler() and returns back to ngx_cache_manager_thread(), but since both, ngx_terminate and ngx_quit are 0 it enters again ngx_cache_manager_process_handler() in line #944;

Step 5: since ngx_cache_manager_event remains in signaled state the WaitForSingleObject in line #982 will return immediately to ngx_cache_manager_thread() and the loop continues since ngx_terminate and ngx_quit are still 0; this thread will cause one core 100% usage;

Meanwhile, the ngx_worker_thread() will terminate and the main thread will be stuck waiting for cache manager thread to terminate in the for(;;) cycle line #740.

The solution to the issue either to check for ngx_exiting in line #939 or to set ngx_quit = 1 just before jumping out of the loop in line #827 of the worker thread. The second approach has been successfully tested on the customer system where the issue is often reproducible.

Attachments (2)

ngx_process_cycle.c (27.4 KB ) - added by Nos Apatka 7 years ago.
modified source code
realy+service.png (83.0 KB ) - added by Nos Apatka 7 years ago.
screen shot showing issue

Download all attachments as: .zip

Change History (5)

by Nos Apatka, 7 years ago

Attachment: ngx_process_cycle.c added

modified source code

by Nos Apatka, 7 years ago

Attachment: realy+service.png added

screen shot showing issue

comment:1 by Maxim Dounin, 7 years ago

Status: newaccepted

Thanks for the report. It looks like a fallout of the fact that cache manager is a thread on win32 (instead of a normal process as it used to be on unix systems).

I don't really like the idea of rearming ngx_quit, as it's looks semantically incorrect. Checking ngx_exiting probably will be good enough though, please see the patch below:

# HG changeset patch
# User Maxim Dounin <mdounin@mdounin.ru>
# Date 1395862308 -14400
#      Wed Mar 26 23:31:48 2014 +0400
# Node ID 734f0babfc133c2dc532f2794deadcf9d90245f7
# Parent  384c5cb7042d9770d688852dda2beee5f0f9587c
Win32: fixed cpu hog by cache manager on exit (ticket #514).

The "ngx_quit" may be reset by the worker thread before it's seen
by a ngx_cache_manager_thread(), resulting in an infinite loop.  Make
sure to test ngx_exiting as well.

diff --git a/src/os/win32/ngx_process_cycle.c b/src/os/win32/ngx_process_cycle.c
--- a/src/os/win32/ngx_process_cycle.c
+++ b/src/os/win32/ngx_process_cycle.c
@@ -926,7 +926,7 @@ ngx_cache_manager_thread(void *data)
          * ev == WAIT_ABANDONED_0 + 1
          */
 
-        if (ngx_terminate || ngx_quit) {
+        if (ngx_terminate || ngx_quit || ngx_exiting) {
             ngx_log_error(NGX_LOG_NOTICE, cycle->log, 0, "exiting");
             return 0;
         }
@@ -936,7 +936,7 @@ ngx_cache_manager_thread(void *data)
 
     for ( ;; ) {
 
-        if (ngx_terminate || ngx_quit) {
+        if (ngx_terminate || ngx_quit || ngx_exiting) {
             ngx_log_error(NGX_LOG_NOTICE, cycle->log, 0, "exiting");
             break;
         }
@@ -1005,7 +1005,7 @@ ngx_cache_loader_thread(void *data)
     path = cycle->paths.elts;
     for (i = 0; i < cycle->paths.nelts; i++) {
 
-        if (ngx_terminate || ngx_quit) {
+        if (ngx_terminate || ngx_quit || ngx_exiting) {
             break;
         }
 

(Please also take a look at http://nginx.org/en/docs/contributing_changes.html for further reference. Trac isn't the best place to submit patches.)

comment:2 by Maxim Dounin <mdounin@…>, 7 years ago

In b74f1106f920fe9e447c710e57a5ccdeae46d8e3/nginx:

Win32: fixed cpu hog by cache manager on exit (ticket #514).

The "ngx_quit" may be reset by the worker thread before it's seen
by a ngx_cache_manager_thread(), resulting in an infinite loop. Make
sure to test ngx_exiting as well.

comment:3 by Maxim Dounin, 7 years ago

Resolution: fixed
Status: acceptedclosed

Fix committed, thanks.

Note: See TracTickets for help on using tickets.