Opened 3 years ago

Closed 3 years ago

#2160 closed defect (worksforme)

Cache loader process doesn't exit after master process is stopped with SIGQUIT.

Reported by: ray.ward.bigcommerce.com@…
Owned by:
Priority: minor
Milestone:
Component: documentation
Version: 1.15.x
Keywords:
Cc:
uname -a: Linux store-lb-129k 4.19.0-14-cloud-amd64 #1 SMP Debian 4.19.171-2 (2021-01-30) x86_64 GNU/Linux
nginx -V: nginx version: openresty/1.15.8.2
built by gcc 8.3.0 (Debian 8.3.0-6)
built with OpenSSL 1.1.1d 10 Sep 2019
TLS SNI support enabled
configure arguments: --prefix=/usr/share/nginx/nginx --with-debug --with-cc-opt='-DNGX_LUA_USE_ASSERT -DNGX_LUA_ABORT_AT_PANIC -O2 -I/usr/include/ossp' --add-module=../ngx_devel_kit-0.3.1rc1 --add-module=../echo-nginx-module-0.61 --add-module=../xss-nginx-module-0.06 --add-module=../ngx_coolkit-0.2 --add-module=../set-misc-nginx-module-0.32 --add-module=../form-input-nginx-module-0.12 --add-module=../encrypted-session-nginx-module-0.08 --add-module=../srcache-nginx-module-0.31 --add-module=../ngx_lua-0.10.15 --add-module=../ngx_lua_upstream-0.07 --add-module=../headers-more-nginx-module-0.33 --add-module=../array-var-nginx-module-0.05 --add-module=../memc-nginx-module-0.19 --add-module=../redis2-nginx-module-0.15 --add-module=../redis-nginx-module-0.3.7 --add-module=../rds-json-nginx-module-0.15 --add-module=../rds-csv-nginx-module-0.09 --add-module=../ngx_stream_lua-0.0.7 --with-ld-opt='-Wl,-rpath,/usr/share/nginx/luajit/lib -lossp-uuid' --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --pid-path=/var/run/nginx.pid --with-compat --with-pcre-jit --with-http_ssl_module --with-ipv6 --lock-path=/var/lock/nginx.lock --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-http_gzip_static_module --with-openssl=/build/openssl/openssl-1.1.1d --with-http_realip_module --with-http_ssl_module --with-http_realip_module --with-http_auth_request_module --with-http_gunzip_module --with-file-aio --with-threads --with-http_stub_status_module --with-http_geoip_module --with-http_gunzip_module --with-http_sub_module --with-http_v2_module --add-module=/build/modules/ngx_cache_purge-2.3 --add-module=/build/modules/nginx-module-vts-0.1.18 --with-stream --with-stream_ssl_module --with-stream_ssl_preread_module

Description

When nginx is shut down gracefully with SIGQUIT while the cache loader process is still running (we have large caches made up of many small files), the cache loader keeps nginx from exiting. It seems the master process does not pass the signal on to the cache loader.

After SIGQUIT is sent to nginx, the workers are in the process of shutting down:

# ps aux | grep nginx | grep -vE "(grep|mtail)"
root     18737  0.0  0.0 2660400 16544 ?       Ss   23:32   0:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data 19174 55.8  0.6 2746968 223120 ?      S<   23:33   1:46 nginx: worker process is shutting down
www-data 19178 17.2  0.4 2692984 134156 ?      S<   23:33   0:32 nginx: worker process is shutting down
www-data 19180  5.8  0.2 2674952 87192 ?       S<   23:33   0:11 nginx: worker process is shutting down
www-data 19187 16.6  0.4 2658824 163728 ?      R    23:33   0:31 nginx: cache loader process
root     22216  0.0  0.0  32828  3736 pts/1    S+   23:35   0:00 systemctl stop nginx.service
root     22220  0.1  0.0   2304   820 ?        Ss   23:35   0:00 /sbin/start-stop-daemon --quiet --stop --retry QUIT/300/TERM/5/KILL/5 --pidfile /run/nginx.pid

After all the workers have stopped, the master process lingers because the cache loader is still running:

# ps aux | grep nginx | grep -vE "(grep|mtail)"
root     18737  0.0  0.0 2660400 16544 ?       Ss   23:32   0:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data 19187 19.2  1.0 2658824 343776 ?      S    23:33   0:56 nginx: cache loader process
root     22216  0.0  0.0  32828  3736 pts/1    S+   23:35   0:00 systemctl stop nginx.service
root     22220  0.1  0.0   2304   820 ?        Ss   23:35   0:00 /sbin/start-stop-daemon --quiet --stop --retry QUIT/300/TERM/5/KILL/5 --pidfile /run/nginx.pid
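
For context on the stop command in the listing above: in a start-stop-daemon --retry schedule, items alternate between a signal to send and a timeout in seconds to wait for the process to exit. With QUIT/300/TERM/5/KILL/5, the stop can therefore wait up to 300 seconds after SIGQUIT before escalating to SIGTERM. The following is a minimal sketch that adds up the timeouts in such a schedule; the function name schedule_max_wait is hypothetical and is not part of start-stop-daemon.

```shell
# Hypothetical helper: sum the numeric (timeout) items of a
# start-stop-daemon --retry schedule such as QUIT/300/TERM/5/KILL/5.
# Non-numeric items (signal names, or "forever") are skipped.
schedule_max_wait() {
    total=0
    old_ifs=$IFS
    IFS=/
    for item in $1; do
        case $item in
            ''|*[!0-9]*) ;;                   # not a plain number: a signal name
            *) total=$((total + item)) ;;     # a timeout in seconds
        esac
    done
    IFS=$old_ifs
    echo "$total"
}

schedule_max_wait "QUIT/300/TERM/5/KILL/5"
```

For the schedule used here, this prints 310, the longest the stop command would wait in total if nothing exits earlier.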

Sending SIGQUIT directly to the cache loader stops it and allows nginx to exit:

# kill -s SIGQUIT 19187
# ps aux | grep nginx | grep -vE "(grep|mtail)"
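
The manual workaround above can be sketched as a small helper that picks the cache loader PID out of ps aux output instead of copying it by hand. This is only an illustration of the workaround, not a fix for the underlying behaviour; the function name cache_loader_pids is hypothetical, and it assumes the ps aux column layout shown in this ticket, where the command starts at the 11th field.

```shell
# Hypothetical helper: read `ps aux`-style lines on stdin and print the PID
# (field 2) of every process whose command starts with
# "nginx: cache loader". Matching on field positions rather than on the
# whole line means the filter cannot match its own command line.
cache_loader_pids() {
    awk '$11 == "nginx:" && $12 == "cache" && $13 == "loader" { print $2 }'
}

# Possible usage (assumes GNU xargs for -r, which skips kill when no PID is found):
#   ps aux | cache_loader_pids | xargs -r kill -s QUIT
```

Fed the two lines from the second listing, the helper would print only 19187, the cache loader PID.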

Change History (2)

comment:1 by Maxim Dounin, 3 years ago

It looks like you are using a heavily patched fork of nginx with many 3rd party modules. Are you able to reproduce the issue with vanilla nginx as available from nginx.org, without any 3rd party patches and modules?

comment:2 by Maxim Dounin, 3 years ago

Resolution: worksforme
Status: new → closed

Feedback timeout.
