#353 closed defect (fixed)
nginx 1.4.[0|1] empty reply from server
Reported by: | Maximilian Weber | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | nginx-core | Version: | 1.3.x |
Keywords: | empty reply from server | Cc: | |
uname -a:
Linux app519.blah.com 2.6.32-220.el6.x86_64 #1 SMP Sat Dec 10 17:04:11 CST 2011 x86_64 x86_64 x86_64 GNU/Linux
nginx -V:
nginx version: nginx/1.4.1
built by gcc 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC)
TLS SNI support enabled
configure arguments: --user=nginx --group=nginx --prefix=/usr/share/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/tmp/client_body --http-proxy-temp-path=/var/lib/nginx/tmp/proxy --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi --pid-path=/var/run/nginx.pid --lock-path=/var/lock/subsys/nginx --with-http_stub_status_module --with-http_realip_module --with-http_ssl_module --with-http_gzip_static_module --with-mail --with-mail_ssl_module --with-ipv6 --add-module=/usr/lib/ruby/gems/1.8/gems/passenger-enterprise-server-4.0.0.rc4/ext/nginx
Description
Problem:
curl (random): curl: (52) Empty reply from server
chrome (random): Error 324 (net::ERR_EMPTY_RESPONSE): The server closed the connection without sending any data.
After upgrading to the new nginx version we see random errors when running our healthchecks against static files on the server. Passenger is not involved at this point - nginx is plainly serving static files.
- this only happens during high traffic
- there is no loadbalancer involved
- error.log with debug enabled doesn't show anything
- the error still occurs with iptables disabled
- no selinux
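For reference, traces like the two below can be captured with tcpdump; this is only a sketch - the interface name is an assumption, the client address is the one appearing in the captures:
# watch the HTTP handshake and teardown for a single client (eth0 is assumed)
tcpdump -i eth0 'tcp port 80 and host 178.210.115.51'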
empty reply:
14:57:07.231413 IP 178.210.115.51.59960 > app515.blah.com.http: Flags [S], seq 2421292407, win 14600, options [mss 1460,sackOK,TS val 92501540 ecr 0,nop,wscale 7], length 0
14:57:07.231447 IP app515.blah.com.http > 178.210.115.51.59960: Flags [S.], seq 1123747008, ack 2421292408, win 14480, options [mss 1460,sackOK,TS val 1552315767 ecr 92501540,nop,wscale 7], length 0
14:57:07.250915 IP 178.210.115.51.59960 > app515.blah.com.http: Flags [.], ack 1, win 115, options [nop,nop,TS val 92501545 ecr 1552315767], length 0
14:57:07.250973 IP app515.blah.com.http > 178.210.115.51.59960: Flags [F.], seq 1, ack 1, win 115, options [nop,nop,TS val 1552315855 ecr 92501545], length 0
14:57:07.251429 IP 178.210.115.51.59960 > app515.blah.com.http: Flags [P.], seq 1:95, ack 1, win 115, options [nop,nop,TS val 92501545 ecr 1552315767], length 94
14:57:07.251448 IP app515.blah.com.http > 178.210.115.51.59960: Flags [R], seq 1123747009, win 0, length 0
14:57:07.270451 IP 178.210.115.51.59960 > app515.blah.com.http: Flags [F.], seq 95, ack 2, win 115, options [nop,nop,TS val 92501549 ecr 1552315855], length 0
14:57:07.270474 IP app515.blah.com.http > 178.210.115.51.59960: Flags [R], seq 1123747010, win 0, length 0
14:57:07.670518 IP 178.210.115.51.54293 > app515.blah.com.http: Flags [F.], seq 13924, ack 2174, win 16276, length 0
14:57:07.674907 IP app515.blah.com.http > 178.210.115.51.54293: Flags [F.], seq 2174, ack 13925, win 334, length 0
14:57:07.702982 IP 178.210.115.51.54293 > app515.blah.com.http: Flags [.], ack 2175, win 16276, length 0
14:57:09.293112 IP 178.210.115.51.53615 > app515.blah.com.http: Flags [F.], seq 17172, ack 4741, win 16425, length 0
14:57:09.293153 IP app515.blah.com.http > 178.210.115.51.53615: Flags [F.], seq 4741, ack 17173, win 390, length 0
14:57:09.338779 IP 178.210.115.51.53615 > app515.blah.com.http: Flags [.], ack 4742, win 16425, length 0
good reply:
14:57:54.336213 IP 178.210.115.51.59975 > app515.blah.com.http: Flags [S], seq 1550226949, win 14600, options [mss 1460,sackOK,TS val 92513316 ecr 0,nop,wscale 7], length 0
14:57:54.336236 IP app515.blah.com.http > 178.210.115.51.59975: Flags [S.], seq 1532621207, ack 1550226950, win 14480, options [mss 1460,sackOK,TS val 1552362940 ecr 92513316,nop,wscale 7], length 0
14:57:54.544906 IP 178.210.115.51.59975 > app515.blah.com.http: Flags [.], ack 1, win 115, options [nop,nop,TS val 92513368 ecr 1552362940], length 0
14:57:54.545233 IP 178.210.115.51.59975 > app515.blah.com.http: Flags [P.], seq 1:95, ack 1, win 115, options [nop,nop,TS val 92513368 ecr 1552362940], length 94
14:57:54.545257 IP app515.blah.com.http > 178.210.115.51.59975: Flags [.], ack 95, win 114, options [nop,nop,TS val 1552363149 ecr 92513368], length 0
14:57:54.546194 IP app515.blah.com.http > 178.210.115.51.59975: Flags [P.], seq 1:231, ack 95, win 114, options [nop,nop,TS val 1552363150 ecr 92513368], length 230
14:57:54.749511 IP 178.210.115.51.59975 > app515.blah.com.http: Flags [.], ack 231, win 123, options [nop,nop,TS val 92513419 ecr 1552363150], length 0
14:57:54.750097 IP 178.210.115.51.59975 > app515.blah.com.http: Flags [F.], seq 95, ack 231, win 123, options [nop,nop,TS val 92513419 ecr 1552363150], length 0
14:57:54.750135 IP app515.blah.com.http > 178.210.115.51.59975: Flags [F.], seq 231, ack 96, win 114, options [nop,nop,TS val 1552363354 ecr 92513419], length 0
14:57:54.857239 IP 178.210.115.51.59975 > app515.blah.com.http: Flags [.], ack 232, win 123, options [nop,nop,TS val 92513446 ecr 1552363354], length 0
Change History (16)
comment:1 by , 12 years ago
comment:2 by , 12 years ago
2013/05/08 07:44:04 [info] 15847#0: *45649 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 78.46.35.x, server: 0.0.0.0:80
This info message seems to correlate with the "curl: (52) Empty reply from server" errors.
I found the error message here: http://nginx.org/patches/spdy/patch.spdy.txt
+#if (NGX_HAVE_DEFERRED_ACCEPT && defined TCP_DEFER_ACCEPT)
+ if (c->listening->deferred_accept) {
+ ngx_log_error(NGX_LOG_INFO, c->log, NGX_ETIMEDOUT,
+ "client timed out in deferred accept");
+ ngx_http_close_connection(c);
+ return;
+ }
+#endif
comment:3 by , 12 years ago
Could you please confirm the correlation? Please note that those messages are usually related to Chrome and are not a problem by themselves.
comment:4 (follow-up: 5) by , 12 years ago
Yes, I can confirm the correlation between the debug message and the empty reply.
The problem only happens with http requests; https works fine. spdy is not explicitly enabled:
server {
listen 80;
listen 443 ssl;
...
request loop:
Wed May 8 14:07:51 CEST 2013
mysql: OK<br>cache: OK<br>
Wed May 8 14:07:51 CEST 2013
curl: (52) Empty reply from server
Wed May 8 14:07:51 CEST 2013
mysql: OK<br>cache: OK<br>
logfile (debug):
2013/05/08 12:07:51 [info] 18283#0: *1665613 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 178.210.115.xX, server: 0.0.0.0:80
comment:5 by , 12 years ago
Replying to Maximilian Weber <maximilian.weber@gmail.com>:
> Yes, I can confirm the correlation between the debug message and the empty reply.
> The problem only happens with http requests; https works fine. spdy is not explicitly enabled:
> server {
> listen 80;
> listen 443 ssl;
> ...
Do you actually have the "deferred" option somewhere in one of your "listen" directives?
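(A quick way to check - assuming the conf-path /etc/nginx from the build above holds all included files - is to grep the whole configuration tree:)
# list every listen directive that carries the deferred parameter
grep -rn 'listen.*deferred' /etc/nginx/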
comment:7 by , 12 years ago
Replying to Maximilian Weber <maximilian.weber@gmail.com>:
> When using deferred the error is gone.
Well, the problem is that somehow nginx thinks you have enabled TCP_DEFER_ACCEPT, but you have not.
Could you please confirm that the problem is gone with vanilla nginx, without 3rd-party modules like "passenger"?
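(One way to see what actually gets applied to the listening socket - a sketch only, assuming strace is available and nginx is started in the foreground for the test - is to watch the setsockopt() calls made while the sockets are created:)
# stop the running instance first, or point this at a test config on another port;
# any line mentioning TCP_DEFER_ACCEPT means nginx requested deferred accept
strace -f -e trace=setsockopt /usr/sbin/nginx -g 'daemon off;' 2>&1 | grep TCP_DEFER_ACCEPT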
comment:8 by , 12 years ago
I get the same empty reply from server with vanilla nginx. I think the issue is my config file for this vhost: I basically have two different server configurations in the same file, pointing to the same root directory, because I need different error pages for my api domain.
server {
listen 80;
listen 443 ssl;
server_name iframe.blah.com;
.
.
}
server {
listen 80 deferred;
listen 443 ssl deferred;
server_name api.blah.com;
.
.
}
Splitting the two domains into two config files seems to fix the issue?!?
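(Worth keeping in mind: all server{} blocks that listen on the same address:port share a single listening socket in nginx, so socket-level listen parameters such as "deferred" end up affecting every virtual host on that port. A quick way to see that both vhosts above sit behind the same socket - assuming iproute2's ss is installed, run as root to get process names:)
# one listening socket per address:port, regardless of how many server{} blocks use it
ss -ltnp 'sport = :80'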
comment:9 by , 11 years ago
Just a quick note: the d8ee8ef5dcde changeset may be related. If a listening socket with the "deferred" parameter was added to the configuration during a binary upgrade, it may have the deferred_accept flag set, but setsockopt(TCP_DEFER_ACCEPT) not actually called for it. This case is expected to be fixed by the changeset in question.
comment:10 by , 11 years ago
I have a similar problem, not related to online upgrades. An excerpt from the error_log:
2013/11/27 11:33:36 [info] 9521#0: *954 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 95.56.35.229, server: 0.0.0.0:80
2013/11/27 11:33:37 [info] 9521#0: *955 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 88.147.174.169, server: 0.0.0.0:80
2013/11/27 11:33:37 [info] 9521#0: *956 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 195.189.17.77, server: 0.0.0.0:80
2013/11/27 11:33:37 [info] 9521#0: *957 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 193.151.12.69, server: 0.0.0.0:80
2013/11/27 11:33:37 [info] 9521#0: *958 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 95.56.35.229, server: 0.0.0.0:80
2013/11/27 11:33:37 [info] 9521#0: *959 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 95.56.35.229, server: 0.0.0.0:80
2013/11/27 11:33:37 [info] 9521#0: *960 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 193.151.12.69, server: 0.0.0.0:80
2013/11/27 11:33:37 [info] 9521#0: *961 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 95.56.35.229, server: 0.0.0.0:80
2013/11/27 11:33:38 [info] 9523#0: *965 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 2.61.70.127, server: 0.0.0.0:80
2013/11/27 11:33:39 [info] 9518#0: *966 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 2.61.70.127, server: 0.0.0.0:80
nginx from nginx repo.
nginx -V
nginx version: nginx/1.4.4
built by gcc 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC)
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-mail --with-mail_ssl_module --with-file-aio --with-ipv6 --with-cc-opt='-O2 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic'
uname -a
2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
I have 2 listen directives in the configuration:
nginx.conf: listen 80 default sndbuf=1024k backlog=128 deferred;
vhosts/cdn.conf: listen 80;
I can stop nginx, make sure it has fully stopped, and start it again - the problem is still there. And this is not a Chrome issue; I get the same with curl.
At the same time I see this in dmesg:
possible SYN flooding on port 80. Sending cookies.
possible SYN flooding on port 80. Sending cookies.
possible SYN flooding on port 80. Sending cookies.
Yet there is no SYN flood or any other abnormal activity.
If I change the listen options by removing "deferred", the problem is gone after a reload. Enabling "deferred" and reloading again brings the problem back. It doesn't reproduce 100% of the time - nginx can run for a while with "deferred" enabled from the start without problems - but once I hit the problem, removing "deferred" fixes it, while a stop and start does not help.
Here are the logs from when the problem "begins". It looks like a socket backlog overflow triggers it.
/var/log/messages
Nov 27 08:23:01 lenomp487 rhsmd: This system is registered to RHN Classic.
Nov 27 09:06:07 lenomp487 auditd[1710]: Audit daemon rotating log files
Nov 27 11:08:08 lenomp487 kernel: possible SYN flooding on port 80. Sending cookies.
Nov 27 11:09:08 lenomp487 kernel: possible SYN flooding on port 80. Sending cookies.
Nov 27 11:10:08 lenomp487 kernel: possible SYN flooding on port 80. Sending cookies
/var/log/nginx/error_log
2013/11/27 11:04:10 [info] 4711#0: *37626 client closed connection while waiting for request, client: 193.34.160.61, server: 0.0.0.0:80
2013/11/27 11:04:18 [info] 4715#0: *37629 client closed connection while waiting for request, client: 95.31.137.207, server: 0.0.0.0:80
2013/11/27 11:04:54 [info] 4711#0: *37675 client closed connection while waiting for request, client: 89.254.245.27, server: 0.0.0.0:80
2013/11/27 11:05:02 [info] 4715#0: *37684 client closed connection while waiting for request, client: 213.184.254.14, server: 0.0.0.0:80
2013/11/27 11:05:02 [info] 4711#0: *37685 client closed connection while waiting for request, client: 213.184.254.14, server: 0.0.0.0:80
2013/11/27 11:05:02 [info] 4711#0: *37686 client closed connection while waiting for request, client: 213.184.254.14, server: 0.0.0.0:80
2013/11/27 11:05:57 [info] 4710#0: *37740 client closed connection while waiting for request, client: 109.172.58.130, server: 0.0.0.0:80
2013/11/27 11:06:14 [info] 4715#0: *37758 client closed connection while waiting for request, client: 212.152.53.188, server: 0.0.0.0:80
2013/11/27 11:07:04 [info] 4711#0: *37885 client closed connection while waiting for request, client: 176.195.46.229, server: 0.0.0.0:80
2013/11/27 11:07:33 [info] 4712#0: *37933 client closed connection while waiting for request, client: 130.255.61.76, server: 0.0.0.0:80
2013/11/27 11:07:41 [info] 4715#0: *37953 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 178.158.134.225, server: 0.0.0.0:80
2013/11/27 11:07:48 [info] 4714#0: *37972 client closed connection while waiting for request, client: 95.154.76.138, server: 0.0.0.0:80
2013/11/27 11:08:08 [info] 4710#0: *37997 client closed connection while waiting for request, client: 95.154.76.138, server: 0.0.0.0:80
2013/11/27 11:08:12 [info] 4712#0: *37998 client closed connection while waiting for request, client: 178.172.229.243, server: 0.0.0.0:80
2013/11/27 11:08:15 [info] 4713#0: *37999 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 178.185.66.131, server: 0.0.0.0:80
2013/11/27 11:08:15 [info] 4713#0: *38000 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 178.185.66.131, server: 0.0.0.0:80
2013/11/27 11:08:19 [info] 4712#0: *38001 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 93.84.47.127, server: 0.0.0.0:80
2013/11/27 11:08:19 [info] 4712#0: *38002 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 93.84.47.127, server: 0.0.0.0:80
2013/11/27 11:08:20 [info] 4710#0: *38003 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 5.105.60.12, server: 0.0.0.0:80
2013/11/27 11:08:21 [info] 4712#0: *38004 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 5.105.60.12, server: 0.0.0.0:80
2013/11/27 11:08:21 [info] 4712#0: *38005 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 46.48.78.35, server: 0.0.0.0:80
2013/11/27 11:08:21 [info] 4712#0: *38006 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 46.48.78.35, server: 0.0.0.0:80
2013/11/27 11:08:22 [info] 4710#0: *38007 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 93.84.47.127, server: 0.0.0.0:80
2013/11/27 11:08:26 [info] 4711#0: *38008 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 178.172.229.243, server: 0.0.0.0:80
2013/11/27 11:08:26 [info] 4711#0: *38009 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 178.172.229.243, server: 0.0.0.0:80
2013/11/27 11:08:26 [info] 4714#0: *38010 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 86.57.190.217, server: 0.0.0.0:80
2013/11/27 11:08:27 [info] 4714#0: *38011 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 86.57.190.217, server: 0.0.0.0:80
2013/11/27 11:08:27 [info] 4712#0: *38012 client timed out in deferred accept (110: Connection timed out) while waiting for request, client: 178.185.66.131, server: 0.0.0.0:80
comment:11 by , 11 years ago
I don't think this is related to the original issue reported, but it nevertheless looks like a problem.
The "possible SYN flooding" messages are likely caused by too many connections sitting in the accept queue due to deferred accept. It looks like the kernel starts sending syncookies in this case, and connections established via syncookies probably bypass the deferred accept queue. This would explain why normal connections are dropped immediately in such a situation - they are passed to nginx immediately rather than after the deferred accept timeout.
Could you please try to test if the above is what actually happens?
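(A few things worth checking on the box while the problem is occurring - a sketch only, the counter names printed by netstat -s vary between kernels:)
# are syncookies enabled at all?
sysctl net.ipv4.tcp_syncookies
# accept-queue usage (Recv-Q) vs. configured backlog (Send-Q) for the :80 listener
ss -ltn 'sport = :80'
# kernel-wide counters for overflowed listen queues and syncookies sent
netstat -s | egrep -i 'listen|cookie'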
comment:12 by , 11 years ago
Hello,
Now I can trigger the problem for some time by lowering the listen backlog to something like 10 and overflowing it with deferred connections:
for i in `seq 1 30`;do nc 127.0.0.1 80 & done
After that I get "client timed out in deferred accept" in the error_log for new requests.
> Could you please try to test if the above is what actually happens?
It looks like you are right - overflowed and thus syncookie'd connections are passed to nginx immediately.
Anyway, it's wrong to use deferred with small backlog values.
comment:13 by , 11 years ago
Backlog size doesn't really matter; the actual problem is how the kernel handles the deferred queue overflow case. It turns out that the current handling doesn't allow us to assume anything about connections accepted with TCP_DEFER_ACCEPT, and changeset 05a56ebb084a should be reconsidered.
Here is an example test with nginx-1.4.1 running on the server:
while true; do date -R; curl -m 10 'http://app519.blah.com/watchdog'; done
mysql: OK<br>cache: OK<br>Wed, 08 May 2013 09:56:27 +0200
mysql: OK<br>cache: OK<br>Wed, 08 May 2013 09:56:27 +0200
mysql: OK<br>cache: OK<br>Wed, 08 May 2013 09:56:28 +0200
mysql: OK<br>cache: OK<br>Wed, 08 May 2013 09:56:28 +0200
mysql: OK<br>cache: OK<br>Wed, 08 May 2013 09:56:28 +0200
curl: (52) Empty reply from server
Wed, 08 May 2013 09:56:28 +0200
mysql: OK<br>cache: OK<br>Wed, 08 May 2013 09:56:28 +0200
mysql: OK<br>cache: OK<br>Wed, 08 May 2013 09:56:28 +0200
curl: (52) Empty reply from server
Wed, 08 May 2013 09:56:28 +0200
mysql: OK<br>cache: OK<br>Wed, 08 May 2013 09:56:28 +0200
mysql: OK<br>cache: OK<br>Wed, 08 May 2013 09:56:28 +0200
curl: (52) Empty reply from server
Wed, 08 May 2013 09:56:28 +0200
curl: (52) Empty reply from server
Wed, 08 May 2013 09:56:28 +0200
mysql: OK<br>cache: OK<br>Wed, 08 May 2013 09:56:29 +0200