Opened 6 years ago

Last modified 6 years ago

#1763 new enhancement

HTTP/2 prioritization is intermittent and often ineffective

Reported by: patmeenan@… Owned by:
Priority: minor Milestone:
Component: other Version: 1.15.x
Keywords: HTTP/2 http2 Cc:
uname -a: Linux ubuntu 4.18.0-17-generic #18~18.04.1-Ubuntu SMP Fri Mar 15 15:27:12 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.15.11 (Ubuntu)
built by gcc 7.3.0 (Ubuntu 7.3.0-27ubuntu1~18.04)
built with OpenSSL 1.1.0h 27 Mar 2018
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --user=nginx --group=nginx --build=Ubuntu --builddir=nginx-1.15.11 --with-select_module --with-poll_module --with-threads --with-file-aio --with-http_ssl_module --with-http_v2_module --with-http_realip_module --with-http_addition_module --with-http_xslt_module=dynamic --with-http_image_filter_module=dynamic --with-http_geoip_module=dynamic --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_auth_request_module --with-http_random_index_module --with-http_secure_link_module --with-http_degradation_module --with-http_slice_module --with-http_stub_status_module --with-http_perl_module=dynamic --with-perl_modules_path=/usr/share/perl/5.26.1 --with-perl=/usr/bin/perl --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --with-mail=dynamic --with-mail_ssl_module --with-stream=dynamic --with-stream_ssl_module --with-stream_realip_module --with-stream_geoip_module=dynamic --with-stream_ssl_preread_module --with-compat --with-pcre=../pcre-8.42 --with-pcre-jit --with-zlib=../zlib-1.2.11 --with-openssl=../openssl-1.1.0h --with-openssl-opt=no-nextprotoneg --with-debug

Description

The core HTTP/2 prioritization support is solid and attempts to prioritize, but the data flow through Nginx itself appears to prevent it from actually prioritizing quite often.

For prioritization to be effective, the downstream (browser-facing) part of the connection has to have minimal buffering beyond the HTTP/2 prioritization logic and the upstream (origin/files/data source) needs to buffer enough data for every stream to be able to always fill the downstream connection with data from the highest current-priority request (or balance as weighting defines).
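A toy model of the upstream half of this requirement, as a minimal sketch: it assumes a strict-priority scheduler that sends one frame per tick from the highest-priority stream that has data buffered, and an upstream that refills every stream's buffer only every few ticks. The function name `send_order` and its parameters are hypothetical and are not taken from Nginx.

```python
def send_order(priority_order, capacity, refill_every, rounds):
    """Return the stream id sent on each tick.

    priority_order: stream ids, highest priority first.
    capacity: frames each stream holds right after a refill (assumed equal for all).
    refill_every: the upstream refills all buffers every this many ticks.
    """
    ready = {sid: capacity for sid in priority_order}
    sent = []
    for tick in range(rounds):
        if tick and tick % refill_every == 0:
            # Upstream read completes: every stream's buffer is full again.
            ready = {sid: capacity for sid in priority_order}
        for sid in priority_order:
            if ready[sid] > 0:
                # Strict priority: first stream with buffered data wins the tick.
                ready[sid] -= 1
                sent.append(sid)
                break
    return sent


# Buffer too small to cover the refill interval: low-priority frames leak in.
starved = send_order(["high", "low"], capacity=1, refill_every=2, rounds=6)
# Buffer large enough to cover the refill interval: only "high" is sent.
covered = send_order(["high", "low"], capacity=2, refill_every=2, rounds=6)
print(starved)
print(covered)
```

In this model the scheduler itself is correct in both runs; the interleaving in the first run comes only from the high-priority stream having nothing buffered when the connection is writable.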

Chrome builds an exclusive dependency list so there is only ever 1 request that is at the top of the tree and it is requested to get 100% of the bandwidth. At times higher priority requests will come in and be inserted at the front of the queue (every stream has the exclusive flag set). That makes it reasonably easy to test.
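The exclusive-dependency behavior can be sketched as follows. This is a simplified model of the exclusive insert rule from RFC 7540 section 5.3.1 (the new stream becomes the sole child of its parent and adopts the parent's former children); weights are ignored, and the class name `PriorityTree` and the stream ids are illustrative assumptions.

```python
class PriorityTree:
    """Minimal HTTP/2 dependency tree supporting only exclusive inserts."""

    def __init__(self):
        self.parent = {}           # stream id -> parent id (0 is the root)
        self.children = {0: []}    # stream id -> list of child ids

    def add_exclusive(self, stream_id, parent_id=0):
        # The new stream adopts all current children of its parent...
        adopted = self.children.get(parent_id, [])
        for child in adopted:
            self.parent[child] = stream_id
        self.children[stream_id] = list(adopted)
        # ...and becomes the parent's only child.
        self.children[parent_id] = [stream_id]
        self.parent[stream_id] = parent_id

    def chain(self):
        """Walk from the root; with exclusive-only inserts this is a single list."""
        order, node = [], 0
        while self.children.get(node):
            node = self.children[node][0]
            order.append(node)
        return order


tree = PriorityTree()
tree.add_exclusive(1)               # low-priority requests queue behind each other
tree.add_exclusive(3, parent_id=1)
tree.add_exclusive(5, parent_id=3)
tree.add_exclusive(7)               # high-priority request goes to the front
print(tree.chain())
```

With only exclusive inserts every node has at most one child, so exactly one stream sits at the top of the tree at any time, which is what makes the expected on-the-wire order easy to check.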

There is a test page here that exercises Chrome's prioritization by warming up the connection with a few serialized requests, queuing 30 low-priority requests, waiting a bit and then queuing 2 high-priority requests serially. When prioritization is working well, the 2 high-priority requests will interrupt the existing data flow and complete quickly (optimally starting in 1 RTT if all of the buffering is perfect). All of the requests will use 100% of the bandwidth and download exclusively unless interrupted by a higher-priority request (no interleaving of data across requests). When prioritization is not working well, one or both of the high-priority requests will be delayed and you may also see interleaving across requests.

The waterfalls below are from WebPageTest using Chrome (data from the raw netlog on the client side). The light parts of the bars are when the stream is idle and the dark parts of the bars are when header or data frames are flowing.

Here is what it looks like with h2o which has well-functioning prioritization out of the box with no server tuning:
https://www.webpagetest.org/waterfall.php

Since Nginx doesn't natively support pacing the downstream connection like h2o, it requires a bit of server tuning to minimize the downstream buffering. Specifically, BBR congestion control needs to be used to eliminate bufferbloat and tcp_notsent_lowat needs to be configured to keep the TCP send buffers from bloating. More details on why are available here.
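One way to confirm that a Linux test host actually has this tuning applied is to read the two sysctls back from procfs. The sketch below assumes the standard `net.ipv4.tcp_congestion_control` and `net.ipv4.tcp_notsent_lowat` entries under `/proc/sys/net/ipv4`; the helper names and the 16384-byte threshold are illustrative assumptions, not values taken from this ticket.

```python
from pathlib import Path


def read_tcp_tuning(proc_root="/proc/sys/net/ipv4"):
    """Read the two sysctls as strings; a missing entry is reported as None."""
    root = Path(proc_root)
    settings = {}
    for name in ("tcp_congestion_control", "tcp_notsent_lowat"):
        path = root / name
        settings[name] = path.read_text().strip() if path.is_file() else None
    return settings


def tuning_problems(settings, max_lowat=16384):
    """List deviations from the tuning described above.

    max_lowat is a hypothetical threshold chosen for illustration only.
    """
    problems = []
    if settings.get("tcp_congestion_control") != "bbr":
        problems.append("congestion control is not bbr")
    lowat = settings.get("tcp_notsent_lowat")
    if lowat is None or int(lowat) > max_lowat:
        problems.append("tcp_notsent_lowat is unset or above the chosen threshold")
    return problems


if __name__ == "__main__":
    current = read_tcp_tuning()
    print(current)
    for problem in tuning_problems(current):
        print("warning:", problem)
```

Running a check like this before each test run helps rule out a reverted sysctl as the cause of an inconsistent waterfall.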

Even with the system configured to minimize downstream buffering, the results with Nginx are inconsistent: sometimes prioritization works as expected, but it often fails:

https://www.webpagetest.org/waterfall.php

In this test case the image is served from local disk (SSD) and epoll is not enabled. We have seen the results differ depending on whether the data comes from disk, from a proxied TCP connection or from a proxied local unix domain socket, as well as on whether epoll is enabled. Sometimes the interleaving across requests is a lot more visible.

In this specific example, it is clear that the responses are all available very quickly with a very thin line near the beginning of each request for the HEADERS frame with the responses but the actual DATA frames are not being prioritized well. The exclusive streams are being interleaved even though the response data is available on the server MUCH faster than downstream consumes it and the ordering of the high-priority streams intermittently gets delayed behind the low priority streams.

We have seen the same issue going back to 1.14.x and see it in production on a lot of large Nginx deployments.

Change History (2)

comment:1 by Maxim Dounin, 6 years ago

Keywords: http2 added
Priority: major → minor
Type: defect → enhancement

Thank you for your report. Given the description of the test, results are affected by the amount of data from other requests sitting in the socket send buffer when data for a high-priority request becomes available. As such, it is no surprise that you'll get inconsistent results depending on various factors - as long as large TCP send buffers are used, it is unavoidable that at some point you'll end up with a filled buffer, and prioritized stream data will end up delayed more than it would be if the buffer were empty. While using TCP_NOTSENT_LOWAT may reduce this probability, it is not going to eliminate it.
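A back-of-the-envelope sketch of this effect, assuming a strictly FIFO send buffer and a fixed link rate: data already queued must drain before a newly scheduled high-priority frame reaches the wire. The function name, the buffer sizes and the 5 Mbit/s rate are illustrative assumptions, not measurements from this ticket.

```python
def queue_delay_ms(buffered_bytes, bandwidth_bits_per_s):
    """Time in milliseconds for data already in the send buffer to drain."""
    if bandwidth_bits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return buffered_bytes * 8 / bandwidth_bits_per_s * 1000


LINK_RATE = 5_000_000  # assumed 5 Mbit/s link

for label, size in (("16 KiB", 16 * 1024), ("256 KiB", 256 * 1024), ("4 MiB", 4 * 1024 * 1024)):
    delay = queue_delay_ms(size, LINK_RATE)
    print(f"{label} already queued -> about {delay:.0f} ms before new data is sent")
```

Under these assumptions the added delay grows linearly with the amount of unsent data in the buffer, which is why capping that amount matters more than the scheduling decision made above it.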

Note well that user-level testing tools trying to emulate limited bandwidth are unlikely to produce correct results as long as automatic TCP buffer sizing is used, since emulation won't affect bandwidth as seen by the server and buffer size tuning algorithms. That is, such emulated tests are expected to produce results which are worse than real-world results.

Summing up the above, I don't think there is a good way to mitigate this HTTP/2 buffering problem except by limiting the size of the send buffer, either configured or used by nginx (i.e., pacing). But the good news is that this is something expected to happen automatically with automatic sizing of TCP send buffers.

Keeping this open for now as a feature request to introduce pacing. Not sure it is needed though - proper automatic sizing of TCP send buffers in the kernel might be actually better than what nginx can do here.

comment:2 by patmeenan@…, 6 years ago

The WebPageTest traffic shaping is done at the kernel/packet level using tc/netem. The problem is not a large amount of data in the TCP send buffer; otherwise the HEADERS frame would also be delayed, but it is returned very quickly (thin dark line near the beginning of request 34 in the waterfall).

The buffering/ordering issues are somewhere within Nginx and outside of the TCP buffers.

Some areas we are exploring are the order that reads are done in the event loop when signaled and how the read buffers are drained for writes (without necessarily re-filling the buffer from the source before going on to the next available stream).
