Opened 8 years ago
Closed 6 years ago
#1005 closed defect (fixed)
client_body_timeout does not send 408 as advertised
Reported by: | Owned by: | ||
---|---|---|---|
Priority: | major | Milestone: | |
Component: | documentation | Version: | 1.9.x |
Keywords: | Cc: | ||
uname -a: | Linux ubuntu 4.2.0-30-generic #36-Ubuntu SMP Fri Feb 26 00:58:07 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | ||
nginx -V: |
nginx version: nginx/1.9.3 (Ubuntu)
built with OpenSSL 1.0.2d 9 Jul 2015 TLS SNI support enabled configure arguments: --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_geoip_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_image_filter_module --with-http_spdy_module --with-http_sub_module --with-http_xslt_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/build/nginx-fcyDbp/nginx-1.9.3/debian/modules/nginx-auth-pam --add-module=/build/nginx-fcyDbp/nginx-1.9.3/debian/modules/nginx-dav-ext-module --add-module=/build/nginx-fcyDbp/nginx-1.9.3/debian/modules/nginx-echo --add-module=/build/nginx-fcyDbp/nginx-1.9.3/debian/modules/nginx-upstream-fair --add-module=/build/nginx-fcyDbp/nginx-1.9.3/debian/modules/ngx_http_substitutions_filter_module |
Description
client_body_timeout [1] docs says "If a client does not transmit anything within this time, the 408 (Request Time-out) error is returned to the client." but this is not true. What happens is that nginx logs a 408 to the access log, but does not send a 408 to client, instead just ungracefully closing the connection. This dropping behavior causes problems with some clients, for example, the Amazon Elastic Load Balancer AKA ELB.
Here is how to reproduce:
Use the attached nginx.conf file (minimally modified from the default) and restart nginx.
Send 2 healthy POST requests to ensure that the nginx.conf is working:
$ perl -e '$|++; $request = qq[POST http://127.0.0.1:80/logpost HTTP/1.1\r\nhost: 127.0.0.1\r\nContent-length: 10\r\nConnection: keep-alive\r\n\r\n0123456789]; print $request; sleep 5; print $request; sleep 5;' | nc -v 127.0.0.1 80 2>&1 | perl -MTime::HiRes -lane 'printf qq[%f %s\n], Time::HiRes::time(), $_;'
1466811503.325930 Connection to 127.0.0.1 80 port [tcp/http] succeeded!
1466811503.325998 HTTP/1.1 200 OK
1466811503.326008 Server: nginx/1.9.3 (Ubuntu)
1466811503.326017 Date: Fri, 24 Jun 2016 23:38:23 GMT
1466811503.326023 Content-Type: application/octet-stream
1466811503.326030 Content-Length: 0
1466811503.326035 Connection: keep-alive
1466811503.326041
1466811508.323207 HTTP/1.1 200 OK
1466811508.323417 Server: nginx/1.9.3 (Ubuntu)
1466811508.323446 Date: Fri, 24 Jun 2016 23:38:28 GMT
1466811508.323461 Content-Type: application/octet-stream
1466811508.323474 Content-Length: 0
1466811508.323486 Connection: keep-alive
1466811508.323498
Check that they got logged correctly:
$ sudo tail -n 4 /var/log/nginx/access.log
127.0.0.1 - - [24/Jun/2016:16:38:23 -0700] "POST http://127.0.0.1:80/logpost HTTP/1.1" 200 0 "-" "-"
0123456789
127.0.0.1 - - [24/Jun/2016:16:38:28 -0700] "POST http://127.0.0.1:80/logpost HTTP/1.1" 200 0 "-" "-"
0123456789
Now send 2 unhealthy POST requests where the body is incomplete thus triggering the client_body_timeout:
$ perl -e '$|++; $request = qq[POST http://127.0.0.1:80/logpost HTTP/1.1\r\nhost: 127.0.0.1\r\nContent-length: 10\r\nConnection: keep-alive\r\n\r\n0]; print $request; sleep 5; print $request; sleep 5;' | nc -v 127.0.0.1 80 2>&1 | perl -MTime::HiRes -lane 'printf qq[%f %s\n], Time::HiRes::time(), $_;'
1466811649.281460 Connection to 127.0.0.1 80 port [tcp/http] succeeded!
$
According to the docs then we expect a 408 response, but no 408 response is returned.
However, the access log incorrectly states that a 408 response was made:
$ sudo tail -n 6 /var/log/nginx/access.log
127.0.0.1 - - [24/Jun/2016:16:38:23 -0700] "POST http://127.0.0.1:80/logpost HTTP/1.1" 200 0 "-" "-"
0123456789
127.0.0.1 - - [24/Jun/2016:16:38:28 -0700] "POST http://127.0.0.1:80/logpost HTTP/1.1" 200 0 "-" "-"
0123456789
127.0.0.1 - - [24/Jun/2016:16:40:50 -0700] "POST http://127.0.0.1:80/logpost HTTP/1.1" 408 0 "-" "-"
0
The RFC [2] clearly says that "When a client or server wishes to time-out it SHOULD issue a graceful close on the transport connection." and further says "A client, server, or proxy MAY close the transport connection at any time." implying that a graceful close is a "Connection: close" response.
Further it says "Servers SHOULD always respond to at least one request per connection, if at all possible. Servers SHOULD NOT close a connection in the middle of transmitting a response, unless a network or client failure is suspected."
Therefore, I submit that nginx not responding with a 408 response is a bug. And a bug which should be fixed sooner, because it is known to cause the Amazon ELB to further behave in a buggy way.
[1] http://nginx.org/en/docs/http/ngx_http_core_module.html#client_body_timeout
[2] https://tools.ietf.org/html/rfc2616#section-8.1.4
Attachments (1)
Change History (34)
by , 8 years ago
Attachment: | nginx.conf added |
---|
comment:1 by , 8 years ago
Component: | nginx-core → documentation |
---|
Yes, no 408 response is returned. That's expected behaviour, documentation should be fixed to document this.
comment:2 by , 8 years ago
Why is it preferred to drop the connection (potentially upsetting the downstream such as in the case of Amazon Elastic Load Balancer) rather than return a 408 response? The RFC tells to use a "graceful close". So why not do that?
comment:3 by , 8 years ago
Returning a response in case of timeouts almost always doesn't make sense and will just waste resources, as nobody is listening.
As for what RFC tells - "graceful close" RFC 2616 talks about is a connection close, not a response with "Connection: close". Returning a response in most case it's not possible at all.
comment:4 by , 8 years ago
Would it be possible to add an option to nginx to switch on sending the 408 response? Why?
Unfortunately the Amazon Elastic Load Balancer behaves badly after receiving the connection close from nginx. What happens? The ELB correctly translates the initial TCP connection close into a downstream 504 GATEWAY_TIMEOUT... BUT... the ELB response also contains the "connection: keep-alive" header. When sending the next query to the ELB on the kept alive connection which causes nginx to close the TCP connection then the ELB responds in one of three buggy ways: (1) 400 BAD_REQUEST, (2) 405 METHOD_NOT_ALLOWED, or (3) timing out after 60 seconds and dropping the connection itself. Although nginx consistently responds with a closed TCP connection, the ELB responds in a buggy and inconsistent way. It is unknown how long it will take Amazon to fix this bug -- if ever -- but it would be nice to have an option in nginx to workaround it. Would this option be possible?
comment:5 by , 8 years ago
From what you describe it looks like ELB just screws up protocol parsing. On the other hand, 504 looks very wrong for me, as the connection with upstream server (nginx) was closed, not timed out. This may indicate that something else in fact happens on ELB side. I would recommend the following:
- make sure the problem isn't caused by 3rd party modules (it looks your config uses echo_read_request_body, it may cause unexpected bugs);
- try reset_timedout_connection, it may help;
- consider using larger timeouts.
Reporting this to Amazon may help, too.
follow-up: 8 comment:6 by , 8 years ago
WRT 504: I agree with you that 504 is perhaps not the best choice. Another reason to have an nginx option to actually send the 408 with a 'connection: close' header? I also read through the RFC but there did not seem to be a section dealing with how the downstream should handle a TCP connection close to an *unfinished* POST request. Or do you know of such a section?
WRT Amazon: The issue is reported to Amazon but they have a closed ticketing system.
WRT 3rd part module: I only used echo_read_request_body because it seems like the easiest way to reproduce the issue using POST. I'm actually using embedded Perl to handle reading the POST body. So there is no 3rd party module involved and nginx is closing the TCP connection after the specified timeout BUT without sending the 408 HTTP response.
WRT larger timeouts: If the body never arrives for some reason then the larger timeouts will not help, or?
comment:7 by , 8 years ago
FYI Amazon is also refusing to change / fix anything. They say the following:
"...
a) Client sends the request with content-length bigger than its actually sending.
b) ELB uses a pre-open connection to the back-end and stream the data, it is still waiting for the rest of the payload from the client.
c) Back-end drops the connection without doing a problem Connection: Close, that's an immediate 504 (not a real timeout)
d) The subsequent request will find that connection on a weird state since the back-end closed the initial connection. After the ELB idle timeout (default 60) that will get closed.
...
The rule is DO NOT close connections to the ELB from the back-end to avoid weird behaviors.
..."
So their reply appears to be: Please change the behavior of nginx or move from nginx to an alternative?
I wonder which one of you will be the first to help fix the problem?
comment:8 by , 8 years ago
It looks like the problem is in the client then: it fails to send the request properly but still re-uses a connection. The exact perl code you've provided is certainly broken.
comment:9 by , 8 years ago
Thanks for looking at the perl code in case there's a bug in it.
However, the perl code is designed *not* to send the full body in order to reproduce the nginx timeout.
In addition, it tries to send again, just in case nginx returns a 408 HTTP response with connection keep-alive. Because of the perl code and tcpdump dump we know that nginx does not send any response and just drops the TCP connection.
When running the perl code against an Amazon ELB with upstream nginx then the perl code reports the 504 HTTP response for the first downstream response from the ELB (when upstream nginx times out), and the 2nd response is different depending upon the type of POST; either a '400 BAD_REQUEST', '405 METHOD_NOT_ALLOWED', or the ELB blocks for 60 seconds. This illustrates the bug ticket open with Amazon where the ELB enters a 'weird' state for the 2nd keep-alive connection.
Please let me know if you still think the perl code is broken after this explanation and if so, how do you think it is broken...
comment:10 by , 8 years ago
The perl code in question does the following:
- sends part of a request (just headers and 1 byte of a body),
- then sleeps for 5 seconds to trigger a timeout,
- then sends another part of a request.
The (3) is wrong, as the request from (1) was not yet sent. As a result server who is willing to maintain a connection interprets first 9 bytes sent in (3) as the body of the request from (1), and the rest is parsed as a separate request, leading to 400 Bad Request (or whatever).
The problem is that the code assumes it can send second request before the first one was send. This won't work, and should not happen with real HTTP clients. If you see this happens in practice with some real client - consider reporting this to client's authors.
comment:11 by , 8 years ago
Thanks for the comments.
Disclaimer: The perl code is just a quick and dirty way to easily reproduce the issue with both the Amazon ELB and nginx. I'm trying to keep things as simple and short as possible. If I wrote a larger script then it's more to comprehend and potentially to go wrong. If you have a better suggestion which is almost as simple then I'd love to see it :-)
"server who is willing to maintain a connection interprets first 9 bytes sent in (3) as the body of the request from (1)"
This is not true. Why? Because the Amazon ELB responds 1 second (because nginx times out as expected after 1 second) into the 5 seconds slept with a 504 response and 'connection: keep-alive' header. Therefore signalling to the client that the 1st request is finished and it's waiting for the second request. So then the perl script sleeps the remaining 4 seconds and sends the new query. So a more accurate list of events would be:
- sends part of a request (just headers and 1 byte of a body),
- then starts sleeping for 5 seconds to trigger a timeout,
- ELB replies with 504 and 'connection: keep-alive' after 1 second,
- perl continues to sleep for the remaining 4 seconds,
- perl re-sends the original request as a new request
If the Amazon ELB sends a 504 and connection: keep-alive then why do you think the client should not interpret that as "it's time for the next query"?
"should not happen with real HTTP clients"
This can happen with real HTTP clients and it happened to me which is why I'm looking into this problem. I agree though that it's not common and it's an edge case. But still it would be nice to get the Amazon ELB and nginx dealing nicely with one another for this "unhappy path" edge case, or?
comment:12 by , 8 years ago
In HTTP/1.x there is no way to "signal that the 1st request is finished", it doesn't work this way. HTTP explicitly specifies the message format, and you are required to follow it regardless of what other side returned. If you don't want to send the rest of the request you should close the connection. Anything else will break the protocol.
Moreover, if you'll try to assume a response returned indeed means "the request is finished", there is an unavoidable race: there is a time window between the time when the response was returned and the time when the client knows that it was returned. It's unspecified how many bytes the client will be able to send during this time window, and therefore it won't be possible to continue HTTP conversation. So the only valid behaviour is to close the connection.
comment:13 by , 8 years ago
"there is no way to "signal that the 1st request is finished""
This is not true. Isn't that what Section 8.2.2 [1] of the RFC specifies? It says "An HTTP/1.1 (or later) client sending a message-body SHOULD monitor the network connection for an error status while it is transmitting the request."
This is also why I think nginx does the wrong thing when it drops the TCP connection (without responding with an error status), because dropping the TCP connection is not an 'error status', or? Do you agree?
OTOH the section goes on to say: "If the body was preceded by a Content-Length header, the client MUST close the connection." I guess this is because of the race condition that you mention. So I would expect both nginx and the Amazon ELB to return a status *and* a 'connection: close' header. And yes, the client (or my quick and dirty perl script) should not try to send another request on the same TCP connection.
comment:14 by , 8 years ago
Monitoring a connection for errors is not the same as "signal that the request is finished". An error returned doesn't indicate the request is finished - it indicate that response is already prepared and there is no need to send the rest of the body. It's up to the client to react on this by either ceasing transmission of the request body (if possible), or sending the rest of the request, or closing a connection.
I see no problems in neither nginx behaviour, nor ELB behaviour here. And nothing in the protocol suggests that there should be a "Connection: close" header - a connection can be kept open just fine (and that's why RFC talks about using a zero length chunk).
The only part of the equation which is broken is the client.
follow-up: 19 comment:15 by , 8 years ago
Maybe we are talking about the same thing? When I say the "request is finished" I mean "the client stops sending because the upstream says timeout". I don't mean "the request has finished sending its complete POST".
I agree with you that it's up to the client to react, and the nginx client is the Amazon ELB which generates a 504 (which you said looks "very wrong") with 'connection: keep-alive' for its downstream client. However, at this point the Amazon ELB is in a 'weird state' for subsequent requests.
Yes, I agree with you that the client should drop the connection at this point even if the response says 'connection: keep-alive'. However, there are many more types of clients than there are proxies, and clients are not necessarily programmed to explicitly close a connection after a 504 especially if the upstream says "connection: keep-alive". Therefore, it would be 'safer' and more 'intention revealing' -- and less misleading too -- if the Amazon ELB responded with a 408 and a 'connection: close'. nginx is in a position to to facilitate this by being more 'intention revealing' with its response to the Amazon ELB, or?
So do you agree that making a small change to nginx would help the Amazon ELB to respond more accurately, which in turn would help clients to function better even if they are not conforming to Section 8.2.2 of the RFC?
Earlier you said "Returning a response in case of timeouts almost always doesn't make sense and will just waste resources, as nobody is listening." But in this case the Amazon ELB is listening and no resources are wasted, or? Plus through returning the response it is clear to the Amazon ELB what happened with nginx, and then the Amazon ELB can clearly report the issue further to its client. Wouldn't this make debugging issues like this easier in the future? For one thing, people wouldn't need to use tcpdump to confirm that nginx is dropping the connection instead of responding.
Isn't it better to have proxies which are more 'forgiving' to the clients?
Plus with the small change then the existing nginx documentation will become correct :-)
comment:16 by , 8 years ago
FYI I've been working on a small patch which works as follows:
If the patch is applied to nginx then nginx works just like it does now, i.e. nothing changes, and nothing gets broken :-) However, if the conf contains the following line then prior to the TCP connection drop upon timeout, nginx will send the 408 response with a 'connection: close' header:
client_body_timeout_and_send_408 on;
Here's the diff. It would be great if you could look it over in case there are any bugs. I manually tested and it seems to work fine:
$ diff --ignore-all-space --context src/http/ngx_http_core_module.h.orig src/http/ngx_http_core_module.h *** src/http/ngx_http_core_module.h.orig 2016-06-28 11:11:13.037770213 -0700 --- src/http/ngx_http_core_module.h 2016-06-27 16:48:30.437149955 -0700 *************** *** 378,383 **** --- 378,385 ---- size_t read_ahead; /* read_ahead */ ngx_msec_t client_body_timeout; /* client_body_timeout */ + ngx_flag_t client_body_timeout_and_send_408; + /* client_body_timeout causes 408 to be sent or just logged*/ ngx_msec_t send_timeout; /* send_timeout */ ngx_msec_t keepalive_timeout; /* keepalive_timeout */ ngx_msec_t lingering_time; /* lingering_time */ $ diff --ignore-all-space --context src/http/ngx_http_core_module.c.orig src/http/ngx_http_core_module.c *** src/http/ngx_http_core_module.c.orig 2016-06-28 11:10:57.694086472 -0700 --- src/http/ngx_http_core_module.c 2016-06-27 17:04:24.970113051 -0700 *************** *** 109,114 **** --- 109,115 ---- }; + static ngx_conf_enum_t ngx_http_core_request_body_in_file[] = { { ngx_string("off"), NGX_HTTP_REQUEST_BODY_FILE_OFF }, { ngx_string("on"), NGX_HTTP_REQUEST_BODY_FILE_ON }, *************** *** 376,381 **** --- 377,389 ---- offsetof(ngx_http_core_loc_conf_t, client_body_timeout), NULL }, + { ngx_string("client_body_timeout_and_send_408"), + NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_FLAG, + ngx_conf_set_flag_slot, + NGX_HTTP_LOC_CONF_OFFSET, + offsetof(ngx_http_core_loc_conf_t, client_body_timeout_and_send_408), + NULL }, + { ngx_string("client_body_temp_path"), NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_TAKE1234, ngx_conf_set_path_slot, *************** *** 3629,3634 **** --- 3637,3643 ---- clcf->client_max_body_size = NGX_CONF_UNSET; clcf->client_body_buffer_size = NGX_CONF_UNSET_SIZE; clcf->client_body_timeout = NGX_CONF_UNSET_MSEC; + clcf->client_body_timeout_and_send_408 = NGX_CONF_UNSET; clcf->satisfy = NGX_CONF_UNSET_UINT; clcf->if_modified_since = NGX_CONF_UNSET_UINT; clcf->max_ranges = NGX_CONF_UNSET_UINT; *************** *** 3837,3842 **** --- 3846,3854 ---- (size_t) 2 * ngx_pagesize); ngx_conf_merge_msec_value(conf->client_body_timeout, prev->client_body_timeout, 60000); + ngx_conf_merge_value(conf->client_body_timeout_and_send_408, + prev->client_body_timeout_and_send_408, + 0); ngx_conf_merge_bitmask_value(conf->keepalive_disable, prev->keepalive_disable, $ diff --ignore-all-space --context src/http/ngx_http_request.c.orig src/http/ngx_http_request.c *** src/http/ngx_http_request.c.orig 2016-06-28 11:10:27.438680709 -0700 --- src/http/ngx_http_request.c 2016-06-28 11:19:24.899108286 -0700 *************** *** 2301,2307 **** --- 2301,2320 ---- r->write_event_handler = ngx_http_request_finalizer; } + // see https://trac.nginx.org/nginx/ticket/1005 + // ngx_http_terminate_request(r, rc); + if (rc == NGX_HTTP_REQUEST_TIME_OUT) { + clcf = ngx_http_get_module_loc_conf(r, ngx_http_core_module); + if (clcf->client_body_timeout_and_send_408) { + ngx_http_finalize_request(r, ngx_http_special_response_handler(r, rc)); + } + else { + ngx_http_terminate_request(r, rc); + } + } + else { ngx_http_terminate_request(r, rc); + } return; } $ diff --ignore-all-space --context src/http/ngx_http_special_response.c.orig src/http/ngx_http_special_response.c *** src/http/ngx_http_special_response.c.orig 2016-06-28 11:12:03.016827154 -0700 --- src/http/ngx_http_special_response.c 2016-06-28 11:23:04.637314733 -0700 *************** *** 377,382 **** --- 377,383 ---- if (r->keepalive) { switch (error) { + case NGX_HTTP_REQUEST_TIME_OUT: /* AKA 408 */ case NGX_HTTP_BAD_REQUEST: case NGX_HTTP_REQUEST_ENTITY_TOO_LARGE: case NGX_HTTP_REQUEST_URI_TOO_LARGE:
Looking forward to your comments.
Is this (or a tweaked version) something that you might consider including in the nginx open source code?
comment:17 by , 8 years ago
Here's what the nginx timeout looks like with and without the patch:
$ # direct to nginx WITHOUT patch $ curl -v -X POST -H "content-length: 10" http://127.0.0.1:80/foo * Trying 127.0.0.1... * Connected to 127.0.0.1 (127.0.0.1) port 80 (#0) > POST /foo HTTP/1.1 > Host: 127.0.0.1:80 > User-Agent: curl/7.43.0 > Accept: */* > content-length: 10 > * Empty reply from server * Connection #0 to host 127.0.0.1 left intact curl: (52) Empty reply from server $ # direct to nginx WITH patch and 'client_body_timeout_and_send_408 on;' $ curl -v -X POST -H "content-length: 10" http://127.0.0.1:80/foo * Trying 127.0.0.1... * Connected to 127.0.0.1 (127.0.0.1) port 80 (#0) > POST /foo HTTP/1.1 > Host: 127.0.0.1:80 > User-Agent: curl/7.43.0 > Accept: */* > content-length: 10 > < HTTP/1.1 408 Request Time-out < Server: nginx-perl/1.8.1.9 < Date: Tue, 28 Jun 2016 20:16:55 GMT < Content-Type: text/html < Content-Length: 189 < Connection: close < <html> <head><title>408 Request Time-out</title></head> <body bgcolor="white"> <center><h1>408 Request Time-out</h1></center> <hr><center>nginx-perl/1.8.1.9</center> </body> </html> * Closing connection 0
comment:18 by , 8 years ago
Interestingly, the current nginx behavior of a TCP connection drop to symbolize a timeout seems to confuse the battle-tested curl client. Notice below how curl reports an "Empty reply from server" upon the TCP connection drop. And then gets confused when sending the next query reporting "Connection 0 seems to be dead!":
$ curl -v -X POST -H "content-length: 10" http://127.0.0.1:80/foo http://127.0.0.1:80/foo * Trying 127.0.0.1... * Connected to 127.0.0.1 (127.0.0.1) port 80 (#0) > POST /foo HTTP/1.1 > Host: 127.0.0.1:80 > User-Agent: curl/7.43.0 > Accept: */* > content-length: 10 > * Empty reply from server * Connection #0 to host 127.0.0.1 left intact curl: (52) Empty reply from server * Connection 0 seems to be dead! * Closing connection 0 * Hostname 127.0.0.1 was found in DNS cache * Trying 127.0.0.1... * Connected to 127.0.0.1 (127.0.0.1) port 80 (#1) > POST /foo HTTP/1.1 > Host: 127.0.0.1:80 > User-Agent: curl/7.43.0 > Accept: */* > content-length: 10 > * Empty reply from server * Connection #1 to host 127.0.0.1 left intact curl: (52) Empty reply from server
However, with the patch then everything looks much more intention revealing (no confusion, it just works):
$ curl -v -X POST -H "content-length: 10" http://127.0.0.1:80/foo http://127.0.0.1:80/foo * Trying 127.0.0.1... * Connected to 127.0.0.1 (127.0.0.1) port 80 (#0) > POST /foo HTTP/1.1 > Host: 127.0.0.1:80 > User-Agent: curl/7.43.0 > Accept: */* > content-length: 10 > < HTTP/1.1 408 Request Time-out < Server: nginx-perl/1.8.1.9 < Date: Tue, 28 Jun 2016 20:29:29 GMT < Content-Type: text/html < Content-Length: 189 < Connection: close < <html> <head><title>408 Request Time-out</title></head> <body bgcolor="white"> <center><h1>408 Request Time-out</h1></center> <hr><center>nginx-perl/1.8.1.9</center> </body> </html> * Closing connection 0 * Hostname 127.0.0.1 was found in DNS cache * Trying 127.0.0.1... * Connected to 127.0.0.1 (127.0.0.1) port 80 (#1) > POST /foo HTTP/1.1 > Host: 127.0.0.1:80 > User-Agent: curl/7.43.0 > Accept: */* > content-length: 10 > < HTTP/1.1 408 Request Time-out < Server: nginx-perl/1.8.1.9 < Date: Tue, 28 Jun 2016 20:29:30 GMT < Content-Type: text/html < Content-Length: 189 < Connection: close < <html> <head><title>408 Request Time-out</title></head> <body bgcolor="white"> <center><h1>408 Request Time-out</h1></center> <hr><center>nginx-perl/1.8.1.9</center> </body> </html> * Closing connection 1
comment:19 by , 8 years ago
Replying to simonhf@…:
However, at this point the Amazon ELB is in a 'weird state' for subsequent requests.
You haven't provided any indication of ELB being in a "weird state". It's your client which behaves incorrectly. Client has the following options:
- close the connection;
- keep sending the request body as previously advertised, and continue using the connection once the request is completely send.
Either way there should be no problem. And if the client behaves incorrectly there will be problems regardless of what server does. Fixing the client is a correct way to go if you want problems to be resolved.
comment:20 by , 8 years ago
"Fixing the client is a correct way to go if you want problems to be resolved."
Are you suggesting to also fix curl and wget to make them 'compatible' with nginx too? Above I showed an example of how the nginx timeout confuses curl. Below I show how the confusion is even worse with wget (another battle-tested HTTP client). I also show how the proposed patch un-confuses wget:
Watch wget go into an infinite loop of retrying the request because it does not know how to intrepret the response-less TCP connection drop from nginx due to the timeout:
$ wget --debug --output-document=- --method=POST --header='Content-Length: 10' http://127.0.0.1:80/foo Setting --output-document (outputdocument) to - Setting --method (method) to POST Setting --header (header) to Content-Length: 10 DEBUG output created by Wget 1.16.1 on linux-gnu. URI encoding = ‘UTF-8’ --2016-06-28 13:44:51-- http://127.0.0.1:80/foo Connecting to 127.0.0.1:80... connected. Created socket 3. Releasing 0x0000557d3cd13f20 (new refcount 0). Deleting unused 0x0000557d3cd13f20. ---request begin--- POST /foo HTTP/1.1 User-Agent: Wget/1.16.1 (linux-gnu) Accept: */* Accept-Encoding: identity Host: 127.0.0.1:80 Connection: Keep-Alive Content-Length: 10 ---request end--- HTTP request sent, awaiting response... No data received. Closed fd 3 Retrying. --2016-06-28 13:44:53-- (try: 2) http://127.0.0.1:80/foo Connecting to 127.0.0.1:80... connected. Created socket 3. Releasing 0x0000557d3cd13720 (new refcount 0). Deleting unused 0x0000557d3cd13720. ---request begin--- POST /foo HTTP/1.1 User-Agent: Wget/1.16.1 (linux-gnu) Accept: */* Accept-Encoding: identity Host: 127.0.0.1:80 Connection: Keep-Alive Content-Length: 10 ---request end--- HTTP request sent, awaiting response... No data received. Closed fd 3 Retrying. --2016-06-28 13:44:56-- (try: 3) http://127.0.0.1:80/foo Connecting to 127.0.0.1:80... connected. Created socket 3. Releasing 0x0000557d3cd13720 (new refcount 0). Deleting unused 0x0000557d3cd13720. ---request begin--- POST /foo HTTP/1.1 User-Agent: Wget/1.16.1 (linux-gnu) Accept: */* Accept-Encoding: identity Host: 127.0.0.1:80 Connection: Keep-Alive Content-Length: 10 ---request end--- HTTP request sent, awaiting response... No data received. Closed fd 3 Retrying. --2016-06-28 13:45:00-- (try: 4) http://127.0.0.1:80/foo Connecting to 127.0.0.1:80... connected. Created socket 3. Releasing 0x0000557d3cd13720 (new refcount 0). Deleting unused 0x0000557d3cd13720. ---request begin--- POST /foo HTTP/1.1 User-Agent: Wget/1.16.1 (linux-gnu) Accept: */* Accept-Encoding: identity Host: 127.0.0.1:80 Connection: Keep-Alive Content-Length: 10 ---request end--- HTTP request sent, awaiting response... No data received. Closed fd 3 Retrying. --2016-06-28 13:45:05-- (try: 5) http://127.0.0.1:80/foo Connecting to 127.0.0.1:80... connected. Created socket 3. Releasing 0x0000557d3cd13720 (new refcount 0). Deleting unused 0x0000557d3cd13720. ---request begin--- POST /foo HTTP/1.1 User-Agent: Wget/1.16.1 (linux-gnu) Accept: */* Accept-Encoding: identity Host: 127.0.0.1:80 Connection: Keep-Alive Content-Length: 10 ---request end--- HTTP request sent, awaiting response... No data received. Closed fd 3 Retrying. --2016-06-28 13:45:11-- (try: 6) http://127.0.0.1:80/foo Connecting to 127.0.0.1:80... connected. Created socket 3. Releasing 0x0000557d3cd13720 (new refcount 0). Deleting unused 0x0000557d3cd13720. ---request begin--- POST /foo HTTP/1.1 User-Agent: Wget/1.16.1 (linux-gnu) Accept: */* Accept-Encoding: identity Host: 127.0.0.1:80 Connection: Keep-Alive Content-Length: 10 ---request end--- HTTP request sent, awaiting response... No data received. Closed fd 3 Retrying. ^C
Whereas with the patch turned on everything just works as expected and it's easier to debug:
$ wget --debug --output-document=- --method=POST --header='Content-Length: 10' http://127.0.0.1:80/foo http://127.0.0.1:80/foo Setting --output-document (outputdocument) to - Setting --method (method) to POST Setting --header (header) to Content-Length: 10 DEBUG output created by Wget 1.16.1 on linux-gnu. URI encoding = ‘UTF-8’ --2016-06-28 13:46:25-- http://127.0.0.1:80/foo Connecting to 127.0.0.1:80... connected. Created socket 3. Releasing 0x0000561876ffff70 (new refcount 0). Deleting unused 0x0000561876ffff70. ---request begin--- POST /foo HTTP/1.1 User-Agent: Wget/1.16.1 (linux-gnu) Accept: */* Accept-Encoding: identity Host: 127.0.0.1:80 Connection: Keep-Alive Content-Length: 10 ---request end--- HTTP request sent, awaiting response... ---response begin--- HTTP/1.1 408 Request Time-out Server: nginx-perl/1.8.1.9 Date: Tue, 28 Jun 2016 20:46:26 GMT Content-Type: text/html Content-Length: 189 Connection: close ---response end--- 408 Request Time-out Closed fd 3 2016-06-28 13:46:26 ERROR 408: Request Time-out. URI encoding = ‘UTF-8’ --2016-06-28 13:46:26-- http://127.0.0.1:80/foo Connecting to 127.0.0.1:80... connected. Created socket 3. Releasing 0x0000561876ffffb0 (new refcount 0). Deleting unused 0x0000561876ffffb0. ---request begin--- POST /foo HTTP/1.1 User-Agent: Wget/1.16.1 (linux-gnu) Accept: */* Accept-Encoding: identity Host: 127.0.0.1:80 Connection: Keep-Alive Content-Length: 10 ---request end--- HTTP request sent, awaiting response... ---response begin--- HTTP/1.1 408 Request Time-out Server: nginx-perl/1.8.1.9 Date: Tue, 28 Jun 2016 20:46:27 GMT Content-Type: text/html Content-Length: 189 Connection: close ---response end--- 408 Request Time-out Closed fd 3 2016-06-28 13:46:27 ERROR 408: Request Time-out.
comment:21 by , 8 years ago
Here another example of how nginx confuses the httpie [1] client:
$ http POST 127.0.0.1:80/foo Content-Length:10 http: error: ConnectionError: ('Connection aborted.', BadStatusLine("''",)) $
And how the proposed patch un-confuses httpie:
$ http POST 127.0.0.1:80/foo Content-Length:10 HTTP/1.1 408 Request Time-out Connection: close Content-Length: 189 Content-Type: text/html Date: Tue, 28 Jun 2016 21:01:15 GMT Server: nginx-perl/1.8.1.9 <html> <head><title>408 Request Time-out</title></head> <body bgcolor="white"> <center><h1>408 Request Time-out</h1></center> <hr><center>nginx-perl/1.8.1.9</center> </body> </html> $
comment:22 by , 8 years ago
By manually defining "Content-Length" to an incorrect value you intentionally break the protocol. Assuming anything will work in such situation is wrong, and assuming your changes "un-confuse" anything is wrong either, https://en.wikipedia.org/wiki/Garbage_in,_garbage_out.
comment:23 by , 8 years ago
The reason I'm setting 'Content-Length' to an incorrect value is purely for the purposes of reproducing the issue. Or how else to do you expect it to be reproduced?
The issue itself can occur under different / more complicated circumstances in a production environment with the 'Content-Length' set to a correct value. Why? If the POST body is bigger and sent as multiple packets, and the typical MTU is 1,500 bytes, then there other reasons why some of those packets can get delayed for (longer) periods of time, thus triggering the upstream timeout.
When the issue does arise -- even if very infrequently -- then isn't it better to handle it gracefully and in an intention revealing way that is also forgiving to clients, and in a way that clients actually understand?
The purpose of the RFCs is to consider all such situations even with garbage input (see [1]), and to show what the correct response should be. IMO a mature service like Amazon ELB or a mature daemon like nginx should take something like Fuzz Testing without any problems and not give 'garbage out' as you suggest. Rather they should behave in a predictable manner according to the RFCs. Do you think otherwise?
As it is, the Amazon ELB to nginx timeout incompatibility could be used in theory as an attack vector. An attacker could send many intentionally incorrect POSTs knowing that nginx will timeout and cause the Amazon ELB to block for 60 seconds, thus over-whelming the ELB upstream connections and blocking the service for others. So this is the type of situation that would be uncovered via Fuzz Testing. It's just a coincidence that I discovered it in regular production traffic (luckily not due to an attack!).
Also, thank you for patiently discussing this issue.
comment:24 by , 8 years ago
Again: there is no "Amazon ELB to nginx timeout incompatibility". There is a broken client. And Amazon ELB is waiting for the remaining body bytes from the broken client for 60 seconds - it's not blocked, it's just waiting for the request, likely according to the timeouts configured on ELB. There is no problem with neither nginx nor ELB, the only side which sees the problem is the client - who is itself responsible for the problem. That is, broken behaviour of the client damages the client itself, and it's perfectly expected outcome.
comment:25 by , 8 years ago
Hmmm... there seems to be some confusion still. Why would the Amazon ELB wait for the remaining body bytes if it has *already* sent to the client a 504 response? Are you saying that the following is possible, 'legal', and expected HTTP protocol:
- Client sends POST with incomplete body to Amazon ELB.
- Amazon ELB forwards incomplete request to nginx.
- nginx times out and drops TCP connection to Amazon ELB without sending HTTP response status.
- Amazon ELB sends 504 to client together with 'connection: keep-alive'.
- Within 60 seconds, client sends the rest of the body for the original query even though it has already received the 504 from the Amazon ELB.
What are you expecting the Amazon ELB to do with the rest of the body bytes which arrive after it sent the 504? Establish a new connection to nginx and send them? And assuming it receives a 200 response then should it forward this response to the client? So the client effectively receives *two* HTTP responses to one POST request? Where is receiving two different HTTP responses to a single HTTP request defined in the RFCs?
I thought we discussed and agreed earlier that according to the RFCs, upon receiving an error to an incomplete request the client should itself close the connection. However, we have seen that common and battle tested clients do not do that in reality. So surely by the same token the Amazon ELB is also behaving incorrectly by keeping the connection open and waiting around for 60 seconds... when according to the RFCs the client should not be sending anything further?
And again, I submit that the client is not broken and that this situation can occur in production traffic from a client which sends a correctly formulated larger POST with the correct content length. Body packets can be delayed even after the not-broken client has sent them. Do you agree?
comment:26 by , 8 years ago
The only thing to do with the rest of body bytes after an error is to throw them away. But these bytes must be send if client wishes to continue using the connection.
Again, given a client used Content-Length
and got an error response before it finished sending the body, the client has two options:
- send the rest of the body (and reuse the connection later if needed);
- close the connection.
Any other behaviour of the client will break the protocol and will expectedly cause problems.
comment:27 by , 8 years ago
After researching the RFC, I agree with you about the second option.
But the first option does not make sense to me, and I cannot find anything in the RFC to support it.
Assuming closing the connection is the only sensible thing to do, then the Amazon ELB is definitely behaving poorly by offering the 'connection: keep-alive' in its 504 response. And this seems to be a knock on effect because of the nginx TCP connection drop.
comment:28 by , 8 years ago
The first option is basically identical to what RFC 2616 recommends in 8.2.2 Monitoring Connections for Error Status Messages for chunked encoding. It also occurs naturally if the client doesn't monitor the connection (and just continue sending the body regardless of the server response) or if the body was already send at the time the client got the response (but not necessary received by the server: e.g., it may still be in the socket buffer).
comment:29 by , 8 years ago
The RFC 2616 is foggy in some places. However, RFC 7231 attempts to unfog it and section 6.5.7 [1] clearly states what nginx should do in the case of timing out due to not receiving a complete request:
"6.5.7. 408 Request Timeout
The 408 (Request Timeout) status code indicates that the server did
not receive a complete request message within the time that it was
prepared to wait. A server SHOULD send the "close" connection option
(Section 6.1 of [RFC7230]) in the response, since 408 implies that
the server has decided to close the connection rather than continue
waiting. If the client has an outstanding request in transit, the
client MAY repeat that request on a new connection."
Further, this is what the battle-tested Apache web server does; see [2]:
"RequestReadTimeout Directive ... This directive can set various timeouts for receiving the request headers and the request body from the client. If the client fails to send headers or body within the configured time, a 408 REQUEST TIME OUT error is sent."
Further, let's imagine that a completely valid POST request is sent from the client and that the body consists of many packets. However, at least one of those packets arrives at nginx very late after being delayed somehow. Then, nginx does its current behavior of not sending a 408 and instead closing the TCP connection. So in this scenario, how does the client know if the request was processed or not before the connection was closed? So here the 408 is the only way to tell the client that the request was not handled at all and is safe to retry. Which is also what RFC 7231 section 6.5.7 says.
Are you convinced now?
[1] https://tools.ietf.org/html/rfc7231#section-6.5.7
[2] https://httpd.apache.org/docs/2.4/mod/mod_reqtimeout.html#requestreadtimeout
comment:30 by , 8 years ago
Do you plan to update the following documents to make them be consistent with current implementation? Currently it's obvious that the documents makes confusion.
client_body_timeout
Defines a timeout for reading client request body. The timeout is set only for a period between two successive read operations, not for the transmission of the whole request body. If a client does not transmit anything within this time, the 408 (Request Time-out) error is returned to the client.
client_header_timeout
Defines a timeout for reading client request header. If a client does not transmit the entire header within this time, the 408 (Request Time-out) error is returned to the client.
comment:31 by , 8 years ago
we have a load balancer - haproxy. haproxy is forwarding to nginx (with fastcgi). we see these days a client that sends to us a incomplete post request (empty body / no post data). nginx closed the connection without a response and haproxy logged a "bad gateway". because of "bad gateway" error it was not clear for us what happend.
as we activated the acces_log on nginx we saw the 4xx request. but i think a correct response is very useful for every "stuff" between client and nginx - eg a load balancer....
Minimally modified nginx.conf for reproducing the bug.