Opened 2 weeks ago

Last modified 10 days ago

#2685 new defect

stub_status counter leak when killing old workers

Reported by: mookie-@… Owned by:
Priority: minor Milestone:
Component: nginx-core Version: 1.24.x
Keywords: Cc:
uname -a: Linux lb 5.15.0-100-generic #110-Ubuntu SMP Wed Feb 7 13:27:48 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.24.0
built by gcc 11.2.0 (Ubuntu 11.2.0-19ubuntu1)
built with OpenSSL 3.0.2 15 Mar 2022
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-g -O2 -ffile-prefix-map=/data/builder/debuild/nginx-1.24.0/debian/debuild-base/nginx-1.24.0=. -flto=auto -ffat-lto-objects -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fPIC' --with-ld-opt='-Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -flto=auto -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -pie'

Description

Hello,

We've started killing "old" nginx workers (the ones shown as nginx: worker process is shutting down) by sending them SIGTERM, because we have regular configuration changes and a lot of websocket connections.

Since we started doing this, the stub_status counters have been incorrect.
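For illustration, the worker selection described above could be sketched as follows. This is a minimal sketch, not the script we actually run: the function names and the embedded `ps` sample are assumptions made for the example, and it relies only on the process title nginx sets for workers that are shutting down.

```python
import os
import signal

# Process title nginx sets on workers that are draining after a reload.
OLD_WORKER_TITLE = "nginx: worker process is shutting down"


def shutting_down_pids(ps_output):
    """Return the PIDs of old workers found in `ps aux` style output."""
    pids = []
    for line in ps_output.splitlines():
        if OLD_WORKER_TITLE in line:
            # In `ps aux` output the PID is the second column.
            pids.append(int(line.split()[1]))
    return pids


def terminate(pids):
    """Send SIGTERM to each PID (hypothetical helper, not called here)."""
    for pid in pids:
        os.kill(pid, signal.SIGTERM)


# Sample input in the same shape as `ps aux | grep [n]ginx`.
PS_SAMPLE = """\
root     2233850  0.0  0.0  55372  5652 ?  Ss  22:29  0:00 nginx: master process /usr/sbin/nginx
www-data 2233851  0.0  0.0  55896  6044 ?  S   22:29  0:00 nginx: worker process is shutting down
www-data 2233940  0.0  0.0  55868  5400 ?  S   22:30  0:00 nginx: worker process
"""

old_workers = shutting_down_pids(PS_SAMPLE)
print(old_workers)
```

Only the selection step runs here; `terminate` is left uncalled so the sketch has no side effects.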

server status

> curl localhost/nginx_status
Active connections: 65369
server accepts handled requests
 1042173178 1042173178 5035465167
Reading: 0 Writing: 31968 Waiting: 5356

Writing and Waiting add up to only 37324, far below the 65369 reported as active connections. But even that sum is too high. The correct number should be around this:

> ss | grep https | wc -l
7327
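The mismatch above can be checked mechanically. The following is a minimal sketch, assuming that Active connections normally equals Reading + Writing + Waiting (which is how I read the counters); the function names are made up for the example and the sample text is the output shown above.

```python
import re

# Matches the plain-text page produced by stub_status.
STATUS_RE = re.compile(
    r"Active connections:\s*(\d+)\s+"
    r"server accepts handled requests\s+"
    r"(\d+)\s+(\d+)\s+(\d+)\s+"
    r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)"
)

FIELDS = ("active", "accepts", "handled", "requests",
          "reading", "writing", "waiting")


def parse_stub_status(text):
    """Turn stub_status output into a dict of integer counters."""
    match = STATUS_RE.search(text)
    if match is None:
        raise ValueError("input does not look like stub_status output")
    return dict(zip(FIELDS, map(int, match.groups())))


def unaccounted(status):
    """Active connections not covered by Reading + Writing + Waiting."""
    states = status["reading"] + status["writing"] + status["waiting"]
    return status["active"] - states


SAMPLE = """Active connections: 65369
server accepts handled requests
 1042173178 1042173178 5035465167
Reading: 0 Writing: 31968 Waiting: 5356
"""

status = parse_stub_status(SAMPLE)
print(unaccounted(status))  # 65369 - (0 + 31968 + 5356) = 28045
```

A non-zero result only shows that the counters disagree with each other; comparing against the socket count from ss is still needed to see that Writing itself is inflated.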

It's reproducible, e.g. using echo.websocket.org:

nginx configuration

map $http_upgrade $connection_upgrade {
	default upgrade;
	''      close;
}

server {
	listen 1234;
	location / {
		proxy_set_header  Host echo.websocket.org;
		proxy_ssl_server_name on;
		proxy_ssl_name echo.websocket.org;
		proxy_http_version 1.1;
		proxy_set_header Upgrade $http_upgrade;
		proxy_set_header Connection $connection_upgrade;

		proxy_pass https://echo.websocket.org:443;
	}
	location = /basic_status {
		stub_status;
	}
}

Test

  1. Open 2 tabs in the browser with the URL: http://localhost:1234/.ws
  2. Now there are 3 active connections (one of them is the request to /basic_status itself):
# curl localhost:1234/basic_status
Active connections: 3
server accepts handled requests
 4 4 4
Reading: 0 Writing: 3 Waiting: 0
  3. After that, reload nginx; the old worker then shows up as an nginx: worker process is shutting down process:
# systemctl reload nginx

# ps aux | grep [n]ginx
root     2233850  0.0  0.0  55372  5652 ?        Ss   22:29   0:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data 2233851  0.0  0.0  55896  6044 ?        S    22:29   0:00 nginx: worker process is shutting down
www-data 2233940  0.0  0.0  55868  5400 ?        S    22:30   0:00 nginx: worker process
www-data 2233941  0.0  0.0  55868  5240 ?        S    22:30   0:00 nginx: worker process

# kill 2233851
  4. After the old process has been killed, the websocket clients in the browser reconnect. stub_status then reports two additional connections, even though the old connections are gone:
# curl localhost:1234/basic_status
Active connections: 5
server accepts handled requests
 7 7 7
Reading: 0 Writing: 5 Waiting: 0
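To make the leak in the test above explicit, the two stub_status snapshots can be diffed. This is only a sketch: the helper name and the interpretation in the comments are mine, and the input is the two outputs from steps 2 and 4.

```python
import re

FIELDS = ("active", "accepts", "handled", "requests",
          "reading", "writing", "waiting")


def counters(text):
    """Read the seven stub_status numbers in their order of appearance."""
    values = [int(n) for n in re.findall(r"\d+", text)]
    if len(values) != len(FIELDS):
        raise ValueError("expected exactly seven numbers")
    return dict(zip(FIELDS, values))


BEFORE = """Active connections: 3
server accepts handled requests
 4 4 4
Reading: 0 Writing: 3 Waiting: 0
"""

AFTER = """Active connections: 5
server accepts handled requests
 7 7 7
Reading: 0 Writing: 5 Waiting: 0
"""

before, after = counters(BEFORE), counters(AFTER)
delta = {name: after[name] - before[name] for name in FIELDS}

# Three new accepts: two websocket reconnects plus the second curl request.
# The browser still holds only two websocket connections, so Active and
# Writing should be unchanged; the difference is the stale count.
print(delta["accepts"], delta["active"], delta["writing"])  # 3 2 2
```

The two stale connections match the two websocket connections that were held by the killed worker.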

Change History (1)

comment:1 by mookie-@…, 10 days ago

I've tested this with the latest version and migrated the ticket to GitHub: https://github.com/nginx/nginx/issues/131

Please close this ticket.
