Opened 6 years ago
Closed 5 years ago
#1610 closed defect (duplicate)
Writing connections counter leak in stub_status.
Reported by: | | Owned by: |
---|---|---|---
Priority: | major | Milestone: |
Component: | other | Version: | 1.12.x
Keywords: | | Cc: |
uname -a: Linux hostname 2.6.32-696.6.3.el6.x86_64 #1 SMP Wed Jul 12 14:17:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
nginx -V:
nginx version: nginx/1.12.2
built by gcc 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC)
built with OpenSSL 1.0.2l 25 May 2017
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-openssl=openssl-1.0.2l --with-openssl-opt=shared --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --add-dynamic-module=nginx-vod-module-1.22 --with-http_geoip_module=dynamic --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIC' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -pie'
Description
The Writing connections counter as reported by stub_status is slowly growing:
Active connections: 23374
server accepts handled requests
993360041 993360041 11278418586
Reading: 0 Writing: 13887 Waiting: 12213
However, netstat shows fewer connections:
# netstat -anp | grep -v CLOSE_WAIT | grep -c nginx
12137
If I restart nginx, the number of writing connections drops to around 300, which is close to the expected value:
Active connections: 12130
server accepts handled requests
26485 26485 198399
Reading: 0 Writing: 277 Waiting: 11817
but after a week or two it gradually grows back to the same large values (thousands).
On average the server receives about 2000 requests per second, and I can't pinpoint a specific request that permanently increases the writing connections counter. However, I think this problem started when we moved to HTTP/2 (not completely sure, though).
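For reference, the counters above come from a stub_status endpoint; a minimal sketch of such an endpoint is shown below. The listen port, location path, and allow rules here are assumptions for illustration, not taken from the actual configuration attached to this ticket.

```nginx
server {
    listen 8080;                # assumed monitoring port

    location = /status {
        stub_status;            # "stub_status on;" on older nginx versions
        access_log off;
        allow 127.0.0.1;        # restrict to local monitoring only
        deny all;
    }
}
```

In this output, Writing is the number of connections where nginx is currently writing a response back to the client, which is the counter that keeps growing here.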
Attachments (1)
Change History (11)
comment:1 by , 6 years ago
comment:2 by , 6 years ago
> open socket ... left in connection ...

Nope, not a single one in 2 weeks.

> Try disabling HTTP/2 to see if it fixes the issue

I'm afraid that won't be possible. Those nginx instances serve HLS traffic with the VOD module, so if I run them without 3rd party modules they won't get any traffic :)

> Try disabling HTTP/2 to see if it fixes the issue

Yep, I reconfigured one of the servers without http2 and will check it in a week.

Thank you!
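For context, reconfiguring a server without HTTP/2 here normally just means dropping the http2 flag from the TLS listen directive; the port below is only an example, not a line from the attached configuration.

```nginx
# With HTTP/2 enabled (as presumably used before the test):
# listen 443 ssl http2;

# Reconfigured for the test, HTTP/2 disabled:
listen 443 ssl;
```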
comment:3 by , 6 years ago
No (major) leaks with http/2 disabled.
A week after restart, server WITH http/2 enabled:
Active connections: 13913
server accepts handled requests
120731540 120731540 1453989168
Reading: 0 Writing: 1872 Waiting: 12288

# netstat -anp | grep -c nginx
12691
#
Server WITHOUT http/2 enabled:
Active connections: 14652
server accepts handled requests
232682967 232682967 1407244359
Reading: 0 Writing: 118 Waiting: 14462

# netstat -anp | grep -c nginx
13046
#
Both servers serve more or less the same traffic.
by , 6 years ago
Attachment: nginx.conf added
nginx.conf as reported by nginx -T, with paths and server names changed.
comment:4 by , 6 years ago
In both cases the number of active connections does not match the total number of connections reported by netstat: 13913 vs. 12691 with HTTP/2, and 14652 vs. 13046 without HTTP/2. So it looks like the problem exists both with and without HTTP/2, and it just manifests itself differently with HTTP/2 enabled.
This is also in line with the fact that you are not seeing "open socket ..." alerts on reloads (please note: these alerts are only expected to appear on configuration reloads and/or worker process shutdowns, not during normal work), and may indicate crashed or killed worker processes. Please make sure that no processes have crashed or been killed. Ideally, grab a ps output with all nginx processes at start, and make sure all of these processes remain running after a while.
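A sketch of how such a check could be scripted; the snapshot file paths are assumptions for illustration, not something suggested in the ticket.

```sh
# Snapshot the nginx PIDs and their start times right after (re)start:
ps -C nginx -o pid=,lstart= | sort -n > /tmp/nginx-pids.start

# ...days later, take another snapshot and compare:
ps -C nginx -o pid=,lstart= | sort -n > /tmp/nginx-pids.now
diff /tmp/nginx-pids.start /tmp/nginx-pids.now \
    && echo "no worker was restarted or replaced"
```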
Note well that overall it looks like a problem in the 3rd party module you are using. It is unlikely we can help if you cannot reproduce this without 3rd party modules.
comment:5 by , 6 years ago
My understanding is that since I can't run netstat at exactly the same moment I fetch the /status page, the numbers will always differ. I ran netstat 5 times in a loop and the results are within about +-2K:
# for i in `seq 1 5`; do netstat -anp | grep -c nginx; done
13001
11999
12928
12598
12829
#
So I guess the difference is within the margin of error.
I don't think the nginx processes crash: all the workers have been alive since the master started (Aug 22), and the PID numbers are closely clustered.
# ps -ef | grep nginx
root 6648 6632 0 15:29 pts/0 00:00:00 grep nginx
root 31709 1 0 Aug22 ? 00:00:04 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
1100 31710 31709 6 Aug22 ? 14:47:32 nginx: worker process
1100 31711 31709 5 Aug22 ? 11:48:59 nginx: worker process
1100 31712 31709 5 Aug22 ? 11:16:58 nginx: worker process
1100 31713 31709 6 Aug22 ? 12:47:15 nginx: worker process
1100 31714 31709 8 Aug22 ? 18:02:36 nginx: worker process
1100 31715 31709 6 Aug22 ? 14:30:48 nginx: worker process
1100 31716 31709 66 Aug22 ? 5-19:53:05 nginx: worker process
1100 31717 31709 6 Aug22 ? 14:32:46 nginx: worker process
1100 31719 31709 7 Aug22 ? 16:15:57 nginx: worker process
1100 31720 31709 6 Aug22 ? 14:00:28 nginx: worker process
1100 31721 31709 60 Aug22 ? 5-08:44:01 nginx: worker process
1100 31722 31709 6 Aug22 ? 13:19:19 nginx: worker process
1100 31723 31709 5 Aug22 ? 12:21:53 nginx: worker process
1100 31724 31709 6 Aug22 ? 12:53:08 nginx: worker process
1100 31725 31709 5 Aug22 ? 12:04:23 nginx: worker process
1100 31726 31709 5 Aug22 ? 11:24:42 nginx: worker process
1100 31727 31709 6 Aug22 ? 14:16:52 nginx: worker process
1100 31728 31709 9 Aug22 ? 20:14:31 nginx: worker process
1100 31729 31709 63 Aug22 ? 5-14:57:12 nginx: worker process
1100 31730 31709 65 Aug22 ? 5-18:52:53 nginx: worker process
1100 31731 31709 6 Aug22 ? 13:50:16 nginx: worker process
1100 31732 31709 66 Aug22 ? 5-20:52:52 nginx: worker process
1100 31733 31709 5 Aug22 ? 11:18:26 nginx: worker process
1100 31734 31709 41 Aug22 ? 3-15:17:54 nginx: worker process
1100 31735 31709 5 Aug22 ? 12:23:48 nginx: worker process
1100 31736 31709 5 Aug22 ? 12:00:03 nginx: worker process
1100 31737 31709 19 Aug22 ? 1-16:14:45 nginx: worker process
1100 31738 31709 5 Aug22 ? 11:18:00 nginx: worker process
1100 31739 31709 15 Aug22 ? 1-09:36:22 nginx: worker process
1100 31740 31709 5 Aug22 ? 11:23:54 nginx: worker process
1100 31741 31709 28 Aug22 ? 2-11:55:52 nginx: worker process
1100 31742 31709 57 Aug22 ? 5-01:35:41 nginx: worker process
1100 31743 31709 12 Aug22 ? 1-01:32:45 nginx: worker process
1100 31744 31709 5 Aug22 ? 11:27:39 nginx: worker process
1100 31745 31709 5 Aug22 ? 11:46:17 nginx: worker process
1100 31746 31709 13 Aug22 ? 1-05:03:12 nginx: worker process
1100 31747 31709 5 Aug22 ? 11:38:25 nginx: worker process
1100 31748 31709 10 Aug22 ? 22:31:50 nginx: worker process
1100 31749 31709 5 Aug22 ? 11:33:06 nginx: worker process
1100 31750 31709 53 Aug22 ? 4-17:31:15 nginx: worker process
1100 31751 31709 6 Aug22 ? 13:44:08 nginx: worker process
1100 31752 31709 34 Aug22 ? 3-00:32:47 nginx: worker process
1100 31753 31709 6 Aug22 ? 12:57:28 nginx: worker process
1100 31754 31709 47 Aug22 ? 4-04:05:54 nginx: worker process
1100 31755 31709 5 Aug22 ? 11:44:20 nginx: worker process
1100 31756 31709 23 Aug22 ? 2-01:26:29 nginx: worker process
1100 31757 31709 5 Aug22 ? 11:16:23 nginx: worker process
1100 31758 31709 52 Aug22 ? 4-15:22:58 nginx: worker process
I'll try to replicate this problem on a spare server without the VOD module.
comment:7 by , 5 years ago
I have the same problem with nginx/1.16.1 (http/2). The Writing connections counter resets after an nginx restart.
uname -a
Linux ********* 3.10.0-962.3.2.lve1.5.26.7.el7.x86_64 #1 SMP Wed Oct 2 07:53:12 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux
nginx -V
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -pie'
comment:10 by , 5 years ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
Writing connections counter increasing might indicate one of the following:

- Connections to clients that have already closed their side but are still counted as Writing rather than Waiting; such sockets typically sit in the CLOSE_WAIT state, so check the netstat output without the "grep -v CLOSE_WAIT" filter (see the example commands after this list).
- Leaked sockets. Leaked sockets are reported by nginx during graceful shutdown of a worker process (for example, after a configuration reload): if there are any leaked sockets, nginx will write "open socket ... left in connection ..." alerts to the error log.
- Enabling HTTP/2 might be an issue. It is known to have caused socket leaks in the past (the most recent one was fixed in nginx 1.11.3), and there are reports that there might be something not yet fixed, see #1163.
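As an illustration (these commands and the log path are assumptions, not taken from the ticket), the per-state breakdown and the leak alerts can be checked roughly like this:

```sh
# Count nginx TCP sockets per state, including CLOSE_WAIT:
netstat -antp | grep nginx | awk '{print $6}' | sort | uniq -c | sort -rn

# Look for leaked-socket alerts after a configuration reload
# (error log path is an assumption):
grep 'open socket' /var/log/nginx/error.log | tail
```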
To further investigate things, please do the following:

- Try to reproduce the issue with a recent nginx version.
- Check whether there are any "open socket ... left in connection ..." alerts in the error log on configuration reload.

If you'll be able to reproduce the issue with a recent version, please also provide:

- Full configuration as shown by nginx -T.
- A debug log that includes an "open socket ... left..." alert. See here for details on how to configure debug logging.
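A minimal sketch of such a debug logging setup, assuming nginx is built with --with-debug (the builds shown above are not); the log path and client network below are placeholders.

```nginx
error_log /var/log/nginx/error.log debug;

events {
    worker_connections 1024;

    # Optionally limit debug-level logging to selected client addresses:
    debug_connection 192.0.2.0/24;
}
```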