Opened 5 years ago
Closed 4 years ago
#1990 closed enhancement (wontfix)
proxy_cache_min_uses not counted per variant on initial requests
Reported by: | Owned by: | ||
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | nginx-core | Version: | 1.16.x |
Keywords: | Cc: | ||
uname -a: | Linux 10-0-1-120 4.19.0-0.bpo.6-amd64 #1 SMP Debian 4.19.67-2+deb10u2~bpo9+1 (2019-11-12) x86_64 GNU/Linux | ||
nginx -V: |
nginx version: nginx/1.16.0
built with OpenSSL 1.1.0k 28 May 2019 (running with OpenSSL 1.1.0l 10 Sep 2019) TLS SNI support enabled configure arguments: --with-http_stub_status_module --with-http_ssl_module --with-http_gzip_static_module --with-http_realip_module --with-pcre --with-debug --with-http_auth_request_module --with-http_sub_module --with-ipv6 --prefix=/usr --user=www-data --group=www-data --pid-path=/var/run/nginx.pid --lock-path=/var/lock/nginx.lock --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --http-proxy-temp-path=/var/lib/nginx/proxy --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-client-body-temp-path=/var/lib/nginx/body |
Description
When proxy_cache_min_uses is set to a value greater than 1, and multiple requests arrive which all resolve to a common primary proxy_cache_key but which are different variants according to the Vary response header, the requests are not counted on a per-variant basis at first, until at least one variant is cached.
Suppose proxy_cache_min_uses 2 is configured and two requests for the same URL arrive with different Accept-Encoding headers, and suppose the backend responds to both with Vary: Accept-Encoding.
The response to the second request will be cached in this case, even though only one "use" has so far been seen for that variant.
After this, however, requests are counted on a per-variant basis.
Consider the following sequence of requests for the same URL with proxy_cache_min_uses 2:

    Accept-Encoding: foo    MISS (not cached)
    Accept-Encoding: bar    MISS (cached)
    Accept-Encoding: bar    HIT
    Accept-Encoding: foo    MISS (not cached)
    Accept-Encoding: abc    MISS (not cached)
    Accept-Encoding: foo    MISS (cached)
    Accept-Encoding: abc    MISS (cached)
    Accept-Encoding: foo    HIT
    Accept-Encoding: abc    HIT

So the foo variant ends up not cached until its third request, while the bar variant is cached on its first request. Only the abc variant is correctly cached on its second request.
This appears to be due to the way that ngx_http_file_cache_open() first calls ngx_http_file_cache_exists() but only proceeds to ngx_http_file_cache_read() if a "primary" cache file is found, and it's only in ngx_http_file_cache_read() that the variant is tested and a "secondary" cache file may be opened instead using ngx_http_file_cache_reopen() (see https://hg.nginx.org/nginx/file/stable-1.16/src/http/ngx_http_file_cache.c#l598).
So "uses" are counted generically against the primary key, regardless of variation, until a first cache file is created for that key.
It would be ideal if, instead, per-variant use counting could be done from the start for each unique cache key.
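The counting behaviour described above can be reproduced with a small model. This is only a sketch, not nginx code: the class name MinUsesModel, the dictionary of nodes, and the URL "/page" are all made up for illustration, and it assumes the backend always answers with Vary: Accept-Encoding and that the raw Accept-Encoding value identifies the variant.

```python
class MinUsesModel:
    """Toy model of min_uses counting (hypothetical, not nginx code)."""

    def __init__(self, min_uses):
        self.min_uses = min_uses
        # key -> {"uses": request count, "variant": variant stored in the
        # cache file for this key, or None if no file exists yet}
        self.nodes = {}

    def _visit(self, key):
        """Count one use against key; return 'found', 'store' or 'scarce'."""
        node = self.nodes.get(key)
        if node is None:
            self.nodes[key] = {"uses": 1, "variant": None}
            return "store" if self.min_uses <= 1 else "scarce"
        node["uses"] += 1
        if node["variant"] is not None:
            return "found"
        return "store" if node["uses"] >= self.min_uses else "scarce"

    def request(self, url, accept_encoding):
        # The primary key is always consulted first; no Vary data is known yet.
        key = url
        state = self._visit(key)
        # Only once a primary cache file exists can the variant be compared,
        # and only then is a secondary (per-variant) key looked up.
        if state == "found" and self.nodes[key]["variant"] != accept_encoding:
            key = (url, accept_encoding)
            state = self._visit(key)
        if state == "found":
            return "HIT"
        if state == "scarce":
            return "MISS (not cached)"
        self.nodes[key]["variant"] = accept_encoding
        return "MISS (cached)"


model = MinUsesModel(min_uses=2)
sequence = ["foo", "bar", "bar", "foo", "abc", "foo", "abc", "foo", "abc"]
results = [model.request("/page", enc) for enc in sequence]
for enc, result in zip(sequence, results):
    print(f"Accept-Encoding: {enc}  {result}")
```

Under these assumptions the model prints the same nine outcomes as the sequence in the description: the first two requests are counted against the primary key only, and per-variant counters start once a primary cache file exists.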
Change History (2)
comment:1 by , 5 years ago
comment:2 by , 4 years ago
Component: | documentation → nginx-core |
---|---|
Resolution: | → wontfix |
Status: | new → closed |
Type: | defect → enhancement |
The problem is that it is not possible to determine the variant without a response, or, more strictly, without the Vary header from the response. As such, nginx cannot determine a particular variant to count min_uses against before it has cached at least one variant.
In theory, it could be possible to improve caching to additionally calculate secondary keys and update the uses counters of possible secondary cache nodes if min_uses is not yet reached by the primary key. This way, the sequence provided in the description would look as follows:

    Accept-Encoding: foo    MISS (not cached)
    Accept-Encoding: bar    MISS (cached)
    Accept-Encoding: bar    HIT
    Accept-Encoding: foo    MISS (cached)
    ...

This approach, however, would require additional non-trivial code (we refused to cache anything due to min_uses, but still try to update something in the cache later) and additional memory until min_uses is reached, and it would not be able to provide a distinction for the first variant cached. That is, it is not really possible to avoid caching "bar" on the second request, since the decision to cache or not to cache is made when we have no Vary information (and moving the decision elsewhere implies less effective caching, as we'll have to remove If-Modified-Since and similar headers from requests which are potentially cacheable).
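To make the trade-off concrete, here is a rough model of the theoretical approach discussed above. It is a sketch under assumptions, not a proposed patch: ProposedModel, its two dictionaries, and the URL "/page" are invented for illustration, the backend is assumed to always send Vary: Accept-Encoding, and the secondary counter is updated only after the response (and therefore the Vary header) has been seen.

```python
class ProposedModel:
    """Toy model of the hypothetical 'also count secondary keys' idea."""

    def __init__(self, min_uses):
        self.min_uses = min_uses
        self.uses = {}    # key -> number of uses counted so far
        self.stored = {}  # key -> variant held in that key's cache file

    def _bump(self, key):
        self.uses[key] = self.uses.get(key, 0) + 1
        return self.uses[key]

    def request(self, url, accept_encoding):
        secondary = (url, accept_encoding)
        if url in self.stored:
            # A primary file exists, so the variant can be checked up front.
            if self.stored[url] == accept_encoding or secondary in self.stored:
                return "HIT"
            if self._bump(secondary) >= self.min_uses:
                self.stored[secondary] = accept_encoding
                return "MISS (cached)"
            return "MISS (not cached)"
        # No primary file yet: the cache/no-cache decision is taken on the
        # primary counter alone, before any Vary information is available.
        cache_it = self._bump(url) >= self.min_uses
        if cache_it:
            self.stored[url] = accept_encoding
            return "MISS (cached)"
        # Extra bookkeeping: the response has arrived, Vary is now known,
        # so the use is also recorded against the secondary key.
        self._bump(secondary)
        return "MISS (not cached)"


proposed = ProposedModel(min_uses=2)
proposed_sequence = ["foo", "bar", "bar", "foo"]
proposed_results = [proposed.request("/page", enc) for enc in proposed_sequence]
for enc, result in zip(proposed_sequence, proposed_results):
    print(f"Accept-Encoding: {enc}  {result}")
```

In this model foo is cached on its second request, but bar is still cached on its first, because that decision is made before the response is available.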
An alternative approach would be to always store Vary information somewhere in the keys zone, but given that it can be large it certainly isn't a good idea to simply add it to the cache node structure, and this will require a dedicated mechanism.
Given the above implications and the fact that using multiple variants inevitably implies ineffective caching and cache duplication, it is believed that the current approach is good enough. If someone wants more effective caching, getting rid of the variants looks like a better way to go.
I should note that these examples assume a simple caching proxy configuration such as:
Also, I accidentally left the "Component" field on this ticket set to "documentation"; I'd intended to change it to "core".