Opened 2 years ago

Closed 2 years ago

#1499 closed defect (invalid)

Frequent cancelling the HTTP/2 requests will cause server stop sending more data on the same connection

Reported by: jifang@… Owned by:
Priority: major Milestone:
Component: nginx-core Version: 1.13.x
Keywords: HTTP/2 RST_STREAM cancel Cc:
uname -a: Darwin Jis-MBP 16.7.0 Darwin Kernel Version 16.7.0: Thu Jan 11 22:59:40 PST 2018; root:xnu-3789.73.8~1/RELEASE_X86_64 x86_64
nginx -V: nginx version: nginx/1.13.10
built by clang 9.0.0 (clang-900.0.39.2)
built with OpenSSL 1.0.2n 7 Dec 2017
TLS SNI support enabled
configure arguments: --with-debug --prefix=/usr/local/Cellar/nginx/1.13.9 --sbin-path=/usr/local/Cellar/nginx/1.13.9/bin/nginx --with-cc-opt='-I/usr/local/opt/pcre/include -I/usr/local/opt/openssl/include' --with-ld-opt='-L/usr/local/opt/pcre/lib -L/usr/local/opt/openssl/lib' --conf-path=/usr/local/etc/nginx/nginx.conf --pid-path=/usr/local/var/run/nginx.pid --lock-path=/usr/local/var/run/nginx.lock --http-client-body-temp-path=/usr/local/var/run/nginx/client_body_temp --http-proxy-temp-path=/usr/local/var/run/nginx/proxy_temp --http-fastcgi-temp-path=/usr/local/var/run/nginx/fastcgi_temp --http-uwsgi-temp-path=/usr/local/var/run/nginx/uwsgi_temp --http-scgi-temp-path=/usr/local/var/run/nginx/scgi_temp --http-log-path=/usr/local/var/log/nginx/access.log --error-log-path=/usr/local/var/log/nginx/error.log --with-debug --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_degradation_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-ipv6 --with-mail --with-mail_ssl_module --with-pcre --with-pcre-jit --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module

Description

Description: When downloading a lot of medium-sized files from Nginx HTTP/2 server, if requests are frequently canceled (RST_STREAM), eventually the connection will timeout and not be able to send/receive any data from the server.

On the server side configure the server to support HTTP/2 with a self-signed certificate. And host about 100 files with size about 500k each.
On the client side, I built an Android test, using a single thread to download the file from file no.1 to file no.100 consequently. During each download, the request is canceled and then proceed to download the next file. Eventually, the request will timeout. Depending on the setup the timeout may happen on different download. But once the testing environment is set, it always failed on the same file.

The server IP is 10.0.0.186
The client IP is 10.0.0.135

Server timeout is the default 60 seconds
Client timeout is set to 120 seconds

A filtered (sid:75) server log shows that the header frame has been sent on 14:39:32 and then timed out after 60 seconds. Note, it is not always the header frame though. Sometimes it is the data frame not being sent out.

2018/03/05 14:39:32 [debug] 92282#0: *4 http2 frame type:1 f:5 l:14 sid:75
2018/03/05 14:39:32 [debug] 92282#0: *4 http2 HEADERS frame sid:75 depends on 0 excl:0 weight:16
2018/03/05 14:39:32 [debug] 92282#0: *4 http2 frame out: 00007FD4B082ABA0 sid:75 bl:1 len:122
2018/03/05 14:39:32 [debug] 92282#0: *4 http2 frame sent: 00007FD4B082ABA0 sid:75 bl:1 len:122
2018/03/05 14:40:32 [debug] 92282#0: *4 http2 send RST_STREAM frame sid:75, status:1

In the meantime, the client log shows the header is not received until 14:40:32

03-05 14:39:32.229 5979-5995/? V/VB_Network: Sending request https://10.0.0.186/files/37 on Connection{10.0.0.186:443, proxy=DIRECT hostAddress=/10.0.0.186:443 cipherSuite=TLS_RSA_WITH_AES_128_CBC_SHA protocol=h2}
                                             User-Agent: 
                                             Host: 10.0.0.186
                                             Connection: Keep-Alive
                                             Accept-Encoding: gzip
03-05 14:40:32.397 5979-5995/? V/VB_Network: Received response for https://10.0.0.186/files/37 (h2) in 60167.4ms
                                             server: nginx/1.13.10
                                             date: Mon, 05 Mar 2018 22:39:32 GMT
                                             content-type: application/octet-stream
                                             content-length: 562176
                                             last-modified: Sun, 04 Mar 2018 00:23:13 GMT
                                             etag: "5a9b3c71-89400"
                                             accept-ranges: bytes
03-05 14:40:32.400 5979-5995/? E/VB_Test: stream was reset: PROTOCOL_ERROR https://10.0.0.186/files/37

Wireshark capture confirms that the header is sent out on 14:40:32 instead of 14:39:32
https://github.com/jifang/nginx_bug_repo1/blob/master/files/wireshark.jpg
Full capture file

Server full log
Client full log

The test project is under git@github.com:jifang/nginx_bug_repo1.git

Attachments (1)

wireshark.jpg (159.8 KB ) - added by jifang@… 2 years ago.
Screen capture for wireshark

Download all attachments as: .zip

Change History (3)

by jifang@…, 2 years ago

Attachment: wireshark.jpg added

Screen capture for wireshark

comment:1 by jifang@…, 2 years ago

To add on, more information for the client.
OS Android 6.0 (reproducible on 5.0/7.0, too). Not reproducible on KitKat since it didn't support TLSv1.2, so it falls back to HTTP 1.1
HTTP library, OKHttp 3.10.0

comment:2 by Ruslan Ermilov, 2 years ago

Resolution: invalid
Status: newclosed

Your client never sends WINDOW_UPDATE except right after the connection establishment, so when the window is closed (after having sent approximately 37 500k files) the writing stalls and eventually times out. Here's a relevant snippet from the nginx debug log:

2018/03/05 14:39:23 [debug] 92282#0: *4 http2 setting 4:16777216
2018/03/05 14:39:23 [debug] 92282#0: *4 http2 WINDOW_UPDATE frame sid:0 window:16711681
[...]
2018/03/05 14:39:32 [debug] 92282#0: *4 http2:73 windows: conn:27648 stream:16449536
[...]
2018/03/05 14:39:32 [debug] 92282#0: *4 http2:73 DATA frame 00007FD4B0030138 was sent partially
2018/03/05 14:39:32 [debug] 92282#0: *4 event timer add: 3: 60000:1586768942
2018/03/05 14:39:32 [debug] 92282#0: *4 http2:73 windows: conn:0 stream:16421888
[...]
2018/03/05 14:40:32 [debug] 92282#0: timer delta: 60000
2018/03/05 14:40:32 [debug] 92282#0: *4 event timer del: 3: 1586769189
2018/03/05 14:40:32 [debug] 92282#0: *4 http run request: "/files/37?"
2018/03/05 14:40:32 [debug] 92282#0: *4 http writer handler: "/files/37?"
2018/03/05 14:40:32 [info] 92282#0: *4 client timed out (60: Operation timed out) while sending response to client, client: 10.0.0.135, server: localhost, request: "GET /files/37 HTTP/2.0", host: "10.0.0.186"
Note: See TracTickets for help on using tickets.