Opened 6 years ago

Closed 5 years ago

Last modified 2 years ago

#1723 closed defect (wontfix)

proxy_cache_background_update does not really do updates in the background

Reported by: Nathalie Sautou
Owned by:
Priority: minor
Milestone:
Component: nginx-module
Version: 1.14.x
Keywords: cache
Cc:
uname -a: Linux test-cache-host 4.9.0-8-amd64 #1 SMP Debian 4.9.130-2 (2018-10-27) x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.14.1
built with OpenSSL 1.1.0f 25 May 2017 (running with OpenSSL 1.1.0j 20 Nov 2018)
TLS SNI support enabled
configure arguments: --with-cc-opt='-g -O2 -fdebug-prefix-map=/build/nginx-jqo7Nx/nginx-1.14.1=. -fstack-protector-strong -Wformat -Werror=format-security -fPIC -Wdate-time -D_FORTIFY_SOURCE=2' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -fPIC' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --modules-path=/usr/lib/nginx/modules --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_v2_module --with-http_dav_module --with-http_slice_module --with-threads --with-http_addition_module --with-http_geoip_module=dynamic --with-http_gunzip_module --with-http_gzip_static_module --with-http_image_filter_module=dynamic --with-http_sub_module --with-http_xslt_module=dynamic --with-stream=dynamic --with-stream_ssl_module --with-stream_ssl_preread_module --with-mail=dynamic --with-mail_ssl_module --add-dynamic-module=/build/nginx-jqo7Nx/nginx-1.14.1/debian/modules/http-auth-pam --add-dynamic-module=/build/nginx-jqo7Nx/nginx-1.14.1/debian/modules/http-dav-ext --add-dynamic-module=/build/nginx-jqo7Nx/nginx-1.14.1/debian/modules/http-echo --add-dynamic-module=/build/nginx-jqo7Nx/nginx-1.14.1/debian/modules/http-upstream-fair --add-dynamic-module=/build/nginx-jqo7Nx/nginx-1.14.1/debian/modules/http-subs-filter

Description

When successively requesting multiple stale urls using the same connection on a location with proxy_cache_background_update on, the update time of each request delays the processing of the next one.

To reproduce, you can use this simple site configuration:

proxy_cache_path  /srv/test_cache
                  keys_zone=keyz:32m
                  inactive=30s
                  max_size=1g;

server {
    listen 8080 default_server;

    location / {
        proxy_pass http://httpbin.org;
        proxy_set_header Host httpbin.org;

        proxy_cache keyz;
        proxy_cache_valid 200 10s;
        # refresh stale entries with a background subrequest while the stale
        # response is returned to the client
        proxy_cache_background_update on;
        proxy_cache_use_stale updating;

        add_header X-Cache-Status $upstream_cache_status;
    }
}

And the following test script:

#!/bin/sh
# -s: silent, -o /dev/null: discard the body, -w: print total time, status code and URL
CURL_OPTIONS='-so /dev/null -w "%{time_total} %{http_code} %{url_effective}\n"'
NGINX_HOST='test-cache-host:8080'

# Populate the cache: both URLs are fetched by a single curl invocation,
# i.e. over a single keep-alive connection
eval "curl ${CURL_OPTIONS} http://${NGINX_HOST}/delay/2 ${CURL_OPTIONS} http://${NGINX_HOST}/delay/3"
echo "Waiting for the cache to become stale..."
# proxy_cache_valid is 10s, so after 11s the cached entries are stale
sleep 11
# Repeat the same requests, again over a single connection
eval "curl ${CURL_OPTIONS} http://${NGINX_HOST}/delay/2 ${CURL_OPTIONS} http://${NGINX_HOST}/delay/3"

Running the script against this configuration gives output like the following. For brevity, the response headers are not shown, but the cache status is correct: MISS for the first two requests, STALE for the other two.

2,182513 200 http://test-cache-host:8080/delay/2
3,223398 200 http://test-cache-host:8080/delay/3
Waiting for the cache to become stale...
0,208964 200 http://test-cache-host:8080/delay/2
2,184092 200 http://test-cache-host:8080/delay/3

As you can see, the last request took 2 seconds to get an answer even though the stale version was used!

From the debug log and the client behavior, it looks like the following is happening:

  • the client requests http://test-cache-host:8080/delay/2
  • nginx immediately serves the stale version
  • nginx starts the "background" update of the cache entry; in fact it is still processing the first request, and it will not look at the next request until the update is finished
  • the client requests http://test-cache-host:8080/delay/3 on the same connection
  • nginx is still updating its cache
  • http://httpbin.org/delay/2 answers after 2 seconds
  • the cache entry for the first request is now up to date
  • nginx finally starts processing the second request
  • ...

Adding keepalive_timeout 0; to the location in the above configuration forces the client to set up a new connection after receiving the first response. As the two requests then come over different connections, the second one is not delayed (a sketch of the modified location follows the output below):

2,180887 200 http://test-cache-host:8080/delay/2
3,179635 200 http://test-cache-host:8080/delay/3
Waiting for the cache to become stale...
0,208349 200 http://test-cache-host:8080/delay/2
0,203867 200 http://test-cache-host:8080/delay/3
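
For reference, here is a minimal sketch of that workaround, assuming the same server block as in the description; only the keepalive_timeout directive is added, trading connection reuse for requests that cannot be blocked by a pending update:

location / {
    proxy_pass http://httpbin.org;
    proxy_set_header Host httpbin.org;

    proxy_cache keyz;
    proxy_cache_valid 200 10s;
    proxy_cache_background_update on;
    proxy_cache_use_stale updating;

    # close the connection after each response, so a pending background
    # update cannot delay the next request from the same client
    keepalive_timeout 0;

    add_header X-Cache-Status $upstream_cache_status;
}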

Change History (11)

comment:1 by Maxim Dounin, 6 years ago

Resolution: wontfix
Status: new → closed

This is expected behaviour. Quoting docs:

Allows starting a background subrequest to update an expired cache item, while a stale cached response is returned to the client.

That is, the background subrequest updates an expired cache item at the same time the response is returned to the client. If updating takes more time than returning the response, it can delay further requests on the same connection when using HTTP/1.x: no additional requests on that connection will be processed until the previous request, including all of its subrequests, has finished, much like when processing multiple pipelined HTTP requests.

comment:2 by Nathalie Sautou, 6 years ago

The documentation would benefit from clarifying what a background request is. In particular, is background referring to:

  1. returning a response without waiting for the end of the subrequest?
  2. performing the subrequest 'outside of the request scope'?

It certainly looks like (1) is what has been implemented, but this is not stated in the documentation, or at least I did not find it.

From a user point of view, enabling proxy_cache_background_update means that you care more about performance than freshness. Delaying the next request is probably not what you want.

comment:3 by Maxim Dounin, 6 years ago

See also #1738.

comment:4 by rulatir@…, 6 years ago

Resolution: wontfix
Status: closed → reopened

@mdounin, this is a non sequitur. The behavior you describe after "That is, ..." does not follow logically from the documentation passage you quoted. Your claim that "this is the expected behavior" is unsupported, and the bug should be reopened.

comment:5 by Maxim Dounin, 6 years ago

Resolution: wontfix
Status: reopened → closed

The behaviour I describe follows from the documentation quoted and from how background subrequests work. While this may not be immediately clear to those not familiar with background subrequests, and may warrant additional clarification in the docs, this is how it works and how it is expected to work.

comment:6 by rulatir@…, 5 years ago

Alright, this is ridiculous.

While this behavior may be consistent with the LETTER of its specification, it is completely inconsistent with its SPIRIT.

The CORE IDEA of background updating is to AVOID WAITING UNTIL COMPLICATED CONTENT IS RENDERED. With this brain-dead behavior THE WAITING IS EFFECTIVELY STILL THERE. It may not be a wait for the response to THE request that is backed by a background-updating cache, but if ACTUALLY DISPLAYING that response requires loading additional assets that are referenced IN the response, then requests for THOSE assets WILL be delayed by the background update process. For end users in those TYPICAL scenarios the result is EFFECTIVELY THE SAME as if the feature were absent or not enabled, and in a TYPICAL webpage scenario, where the page MUST load its CSS or JS before it can be shown, this feature FAILS TO PROVIDE ANY BENEFITS DUE TO THIS BUG. It is definitely a defect and should be fixed. If it is consistent with some design, then the design itself is defective and should be changed.

comment:7 by rulatir@…, 5 years ago

Resolution: wontfix
Status: closed → reopened

comment:8 by rulatir@…, 5 years ago

Additionally, the only possible mitigation of the issue - namely disabling keep-alive - CANNOT be applied selectively, because the handling of the Connection response header is mostly hardcoded into nginx, with little to no flexibility.

comment:9 by Maxim Dounin, 5 years ago

Resolution: wontfix
Status: reopened → closed

The core idea of background updating is to provide a clear benefit over foreground updates. In the current implementation, it does so by issuing the cache update subrequest in parallel with returning the response, and this is beneficial in all cases.

The current implementation also implies that no other requests on the same connection will be processed until the update subrequest finishes. While this may be seen as a limitation, this is how it is expected to work. There are at least two immediate reasons behind the current implementation:

  • This way it aligns nicely with nginx internals, including various configuration options such as proxy_no_cache, which may access request and/or connection details long after the subrequest is created (see the sketch after this list);
  • This preserves connection limits, not leading to unexpected limit_conn errors or to DoS with effectively unlimited load due to no limit_conn applied.
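
To illustrate the first point, a hypothetical snippet (the variables are the ones used in the nginx documentation examples, not taken from this ticket): a directive like proxy_no_cache may be evaluated for the update subrequest as well, and it references details of the client request, which is one reason the subrequest stays bound to that request and its connection.

# $http_pragma and $http_authorization come from the client request;
# proxy_no_cache may be evaluated long after the update subrequest is
# created, so the subrequest keeps a reference to that request/connection
proxy_no_cache $http_pragma $http_authorization;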

If you think you can suggest a better implementation, consider submitting a patch.

comment:10 by rulatir@…, 5 years ago

and this is beneficial in all cases

No, it is not beneficial in those cases where the effects of blocking further requests on the connection COMPLETELY DEVOUR all the benefits. I am arguing that this is in fact a TYPICAL scenario when serving a modern HTML page to a web browser.

comment:11 by royteeuwen@…, 2 years ago

Has anyone ever found an alternative fix for this? It is definitely also a show stopper for using background updates, seeing as the 500 ms delay between checks of whether the cache lock has expired is not configurable and far too long.
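
For reference, a minimal sketch of the cache-lock directives that are configurable (the values shown are the documented defaults, used here only as an illustration); the polling interval mentioned above is internal to nginx and has no corresponding directive:

location / {
    proxy_pass http://httpbin.org;
    proxy_cache keyz;

    # let only one request at a time populate a given new cache element
    proxy_cache_lock on;
    # how long other requests wait for that element before being passed
    # to the upstream server (their responses are then not cached)
    proxy_cache_lock_timeout 5s;
    # after this time, one more request may be passed to the upstream
    # to populate the element
    proxy_cache_lock_age 5s;
}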
