Opened 4 months ago
Last modified 4 months ago
#2685 new defect
stub_status counter leak when killing old workers
Reported by: | Owned by: | ||
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | nginx-core | Version: | 1.24.x |
Keywords: | Cc: | ||
uname -a: | Linux lb 5.15.0-100-generic #110-Ubuntu SMP Wed Feb 7 13:27:48 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | ||
nginx -V: |
nginx version: nginx/1.24.0
built by gcc 11.2.0 (Ubuntu 11.2.0-19ubuntu1)
built with OpenSSL 3.0.2 15 Mar 2022
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-g -O2 -ffile-prefix-map=/data/builder/debuild/nginx-1.24.0/debian/debuild-base/nginx-1.24.0=. -flto=auto -ffat-lto-objects -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fPIC' --with-ld-opt='-Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -flto=auto -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -pie' |
Description
Hello,
we've started killing (sending SIGTERM to) "old" nginx workers (nginx: worker process is shutting down) because we have regular configuration changes and a lot of websocket connections.
Since we started doing this, the counters from stub_status have been incorrect.
server status
> curl localhost/nginx_status
Active connections: 65369
server accepts handled requests
 1042173178 1042173178 5035465167
Reading: 0 Writing: 31968 Waiting: 5356
Adding up Writing and Waiting gives just 37324, well below the 65369 reported as active connections. But even this count is too high. The correct number should be around this:
> ss | grep https | wc -l
7327
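The comparison above can be sketched as a small script. This is only an illustration: the function name parse_stub_status and the dictionary keys are made up for this example, and the input is the output pasted above rather than a live request.

```python
import re


def parse_stub_status(text):
    """Parse the plain-text stub_status page into a dict of integer counters.

    Hypothetical helper for this ticket, not part of nginx.
    """
    active = re.search(r"Active connections:\s*(\d+)", text)
    totals = re.search(r"requests\s+(\d+)\s+(\d+)\s+(\d+)", text)
    states = re.search(
        r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
    )
    if not (active and totals and states):
        raise ValueError("unrecognized stub_status output")
    return {
        "active": int(active.group(1)),
        "accepts": int(totals.group(1)),
        "handled": int(totals.group(2)),
        "requests": int(totals.group(3)),
        "reading": int(states.group(1)),
        "writing": int(states.group(2)),
        "waiting": int(states.group(3)),
    }


# Output copied from the ticket.
SAMPLE = """Active connections: 65369
server accepts handled requests
 1042173178 1042173178 5035465167
Reading: 0 Writing: 31968 Waiting: 5356
"""

status = parse_stub_status(SAMPLE)
per_state = status["reading"] + status["writing"] + status["waiting"]
observed = 7327  # from `ss | grep https | wc -l` in the ticket

print("active:", status["active"])
print("reading + writing + waiting:", per_state)
print("excess over ss count:", per_state - observed)
```

Run periodically, a check like this would show whether the gap between the stub_status counters and the socket count grows after each round of killed workers.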
It's reproducible, e.g. using echo.websocket.org:
nginx configuration
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

server {
    listen 1234;

    location / {
        proxy_set_header Host echo.websocket.org;
        proxy_ssl_server_name on;
        proxy_ssl_name echo.websocket.org;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_pass https://echo.websocket.org:443;
    }

    location = /basic_status {
        stub_status;
    }
}
Test
- Open 2 tabs in the browser with the URL: http://localhost:1234/.ws
- Now there are 3 active connections (one connection is the request to /basic_status):
# curl localhost:1234/basic_status
Active connections: 3
server accepts handled requests
 4 4 4
Reading: 0 Writing: 3 Waiting: 0
- After that, reload nginx; you can then see a "nginx: worker process is shutting down" process:
# systemctl reload nginx
# ps aux | grep [n]ginx
root     2233850  0.0  0.0  55372  5652 ?  Ss  22:29  0:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data 2233851  0.0  0.0  55896  6044 ?  S   22:29  0:00 nginx: worker process is shutting down
www-data 2233940  0.0  0.0  55868  5400 ?  S   22:30  0:00 nginx: worker process
www-data 2233941  0.0  0.0  55868  5240 ?  S   22:30  0:00 nginx: worker process
# kill 2233851
- After we've killed the old process, the websocket clients in the browser reconnect. After that, stub_status reports two additional connections even though the old connections are gone:
# curl localhost:1234/basic_status
Active connections: 5
server accepts handled requests
 7 7 7
Reading: 0 Writing: 5 Waiting: 0
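The leak in this test can be quantified by comparing the two snapshots. The sketch below is only illustrative: the helper names active_connections and leaked_connections are hypothetical, and it assumes the real number of connections is the same before and after the kill (two websocket tabs plus the status request itself).

```python
import re


def active_connections(text):
    """Extract the 'Active connections' value from stub_status output."""
    match = re.search(r"Active connections:\s*(\d+)", text)
    if match is None:
        raise ValueError("unrecognized stub_status output")
    return int(match.group(1))


def leaked_connections(before, after, expected_delta=0):
    """Return how many connections the counter gained beyond expected_delta.

    expected_delta is the real change in open connections between the two
    snapshots; it is 0 here because both tabs simply reconnected.
    """
    return active_connections(after) - active_connections(before) - expected_delta


# Snapshots copied from the test above.
BEFORE = """Active connections: 3
server accepts handled requests
 4 4 4
Reading: 0 Writing: 3 Waiting: 0
"""

AFTER = """Active connections: 5
server accepts handled requests
 7 7 7
Reading: 0 Writing: 5 Waiting: 0
"""

leak = leaked_connections(BEFORE, AFTER)
print("leaked connections:", leak)
```

With these snapshots the difference equals the number of websocket connections that were open in the killed worker.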
I've tested it with the latest version and migrated this to github: https://github.com/nginx/nginx/issues/131
Please close this ticket.