Opened 4 years ago

Closed 4 years ago

#1990 closed enhancement (wontfix)

proxy_cache_min_uses not counted per variant on initial requests

Reported by: chrisd8088@…
Owned by:
Priority: minor
Milestone:
Component: nginx-core
Version: 1.16.x
Keywords:
Cc:
uname -a: Linux 10-0-1-120 4.19.0-0.bpo.6-amd64 #1 SMP Debian 4.19.67-2+deb10u2~bpo9+1 (2019-11-12) x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.16.0
built with OpenSSL 1.1.0k 28 May 2019 (running with OpenSSL 1.1.0l 10 Sep 2019)
TLS SNI support enabled
configure arguments: --with-http_stub_status_module --with-http_ssl_module --with-http_gzip_static_module --with-http_realip_module --with-pcre --with-debug --with-http_auth_request_module --with-http_sub_module --with-ipv6 --prefix=/usr --user=www-data --group=www-data --pid-path=/var/run/nginx.pid --lock-path=/var/lock/nginx.lock --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --http-proxy-temp-path=/var/lib/nginx/proxy --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-client-body-temp-path=/var/lib/nginx/body

Description

When proxy_cache_min_uses is set to a value greater than 1, and multiple requests arrive which all resolve to a common primary proxy_cache_key but which are different variants according to the Vary response header, the requests are not counted on a per-variant basis at first, until at least one variant is cached.

Suppose proxy_cache_min_uses 2 is configured and two requests for the same URL arrive with different Accept-Encoding headers, and suppose the backend responds to both with Vary: Accept-Encoding.

The response to the second request will be cached in this case, even though only one "use" has so far been seen for that variant.

After this, however, requests are counted on a per-variant basis.

Consider the following sequence of requests for the same URL with proxy_cache_min_uses 2:

Accept-Encoding: foo                MISS (not cached)
Accept-Encoding: bar                MISS (cached)
Accept-Encoding: bar                HIT

Accept-Encoding: foo                MISS (not cached)
Accept-Encoding: abc                MISS (not cached)
Accept-Encoding: foo                MISS (cached)
Accept-Encoding: abc                MISS (cached)
Accept-Encoding: foo                HIT
Accept-Encoding: abc                HIT

So the foo variant ends up not cached until its third request, while the bar variant is cached on its first request. Only the abc variant is correctly cached on its second request.

This appears to be due to the way that ngx_http_file_cache_open() first calls ngx_http_file_cache_exists() but only proceeds to ngx_http_file_cache_read() if a "primary" cache file is found, and it's only in ngx_http_file_cache_read() that the variant is tested and a "secondary" cache file may be opened instead using ngx_http_file_cache_reopen() (see https://hg.nginx.org/nginx/file/stable-1.16/src/http/ngx_http_file_cache.c#l598).

So "uses" are counted generically against the primary key, regardless of variation, until a first cache file is created for that key.
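To make the sequence above easier to follow, here is a minimal Python model of the counting behaviour. It is only a sketch, not nginx's actual code: it assumes the backend always answers with Vary: Accept-Encoding, identifies a variant by the raw Accept-Encoding value, and ignores expiry. The names Node and simulate are made up for illustration.

```python
class Node:
    """Simplified stand-in for a cache node in the keys zone."""

    def __init__(self):
        self.uses = 0        # number of requests counted against this node
        self.exists = False  # True once a cache file has been stored
        self.variant = None  # variant held by the stored file


def simulate(requests, min_uses=2):
    """Return (accept_encoding, status) for each request to one URL."""
    nodes = {}
    results = []
    for enc in requests:
        # Every request is first looked up under the primary key.
        node = nodes.setdefault("primary", Node())
        node.uses += 1

        # Only once a primary file exists can its stored Vary data be read;
        # a mismatching variant is then looked up under a secondary key.
        if node.exists and node.variant != enc:
            node = nodes.setdefault(("secondary", enc), Node())
            node.uses += 1

        if node.exists:
            results.append((enc, "HIT"))
        elif node.uses >= min_uses:
            node.exists = True
            node.variant = enc
            results.append((enc, "MISS (cached)"))
        else:
            results.append((enc, "MISS (not cached)"))
    return results
```

Calling simulate(["foo", "bar", "bar", "foo", "abc", "foo", "abc", "foo", "abc"]) under these assumptions yields the same statuses as the sequence listed in the description: the first two requests share the primary counter, and per-variant counters only start once the bar response has been stored.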

It would be ideal if, instead, per-variant use counting could be done from the start for each unique cache key.

Change History (2)

comment:1 by chrisd8088@…, 4 years ago

I should note that these examples assume a simple caching proxy configuration such as:

http {
  proxy_cache_path /path/to/mycache levels=1:2 keys_zone=mycache:10m;

  server {
    ...
    proxy_cache_key "$http_host$uri$is_args$args";
    proxy_cache_valid 15m;
    proxy_cache_min_uses 2;

    location / {
      proxy_pass http://mybackend.com;
      proxy_cache mycache;
    }
  }
}

Also, I accidentally left the "Component" field on this ticket set to "documentation"; I'd intended to change it to "core".

comment:2 by Maxim Dounin, 4 years ago

Component: documentation → nginx-core
Resolution: → wontfix
Status: new → closed
Type: defect → enhancement

The problem is that it is not possible to determine the variant without a response, or, more strictly, without the Vary header from the response. As such, nginx cannot determine a particular variant to count min_uses against before it has cached at least one variant.

In theory, it could be possible to improve caching to additionally calculate secondary keys and update the uses counters of possible secondary cache nodes if min_uses is not yet reached by the primary key. This way, the sequence provided in the description would look as follows:

Accept-Encoding: foo                MISS (not cached)
Accept-Encoding: bar                MISS (cached)
Accept-Encoding: bar                HIT

Accept-Encoding: foo                MISS (cached)
...

This approach, however, will require additional non-trivial code (we refused to cache anything due to min_uses, but still try to update something in the cache later), additional memory till min_uses is reached, and won't be able to provide distinction for the first variant cached. That is, it is not really possible to avoid caching "bar" on the second request, since the decision to cache or not to cache is made when we have no Vary information (and moving the decision elsewhere implies less effective caching, as we'll have to remove If-Modified-Since and similar headers from requests which are potentially cacheable).
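For illustration only, the hypothetical counting scheme described above could be modelled roughly as follows in Python. This is a sketch under stated assumptions, not a proposed patch: it assumes every response carries Vary: Accept-Encoding, treats the raw Accept-Encoding value as the variant, and ignores expiry. The function name simulate_proposed and its internal bookkeeping are invented for this example.

```python
from collections import defaultdict


def simulate_proposed(requests, min_uses=2):
    """Model of also counting per-variant uses before anything is cached."""
    primary_uses = 0
    primary_variant = None             # variant stored in the main cache file
    secondary_uses = defaultdict(int)  # per-variant counters
    secondary_cached = set()           # variants stored under secondary keys
    results = []
    for enc in requests:
        primary_uses += 1
        if primary_variant is None:
            # No Vary information is known when the caching decision is
            # made, so the decision still depends on the primary counter.
            if primary_uses >= min_uses:
                primary_variant = enc
                results.append((enc, "MISS (cached)"))
            else:
                # After the response arrives, its variant is counted even
                # though the response itself was not cached.
                secondary_uses[enc] += 1
                results.append((enc, "MISS (not cached)"))
            continue

        if enc == primary_variant:
            results.append((enc, "HIT"))
            continue

        secondary_uses[enc] += 1
        if enc in secondary_cached:
            results.append((enc, "HIT"))
        elif secondary_uses[enc] >= min_uses:
            secondary_cached.add(enc)
            results.append((enc, "MISS (cached)"))
        else:
            results.append((enc, "MISS (not cached)"))
    return results
```

With the requests foo, bar, bar, foo this model caches foo on its second request, but bar is still cached on its first, which is the limitation noted above.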

An alternative approach would be to always store Vary information somewhere in the keys zone, but given that it can be large it certainly isn't a good idea to simply add it to the cache node structure, and this will require a dedicated mechanism.

Given the above implications and the fact that using multiple variants inevitably implies ineffective caching and cache duplication, it is believed that the current approach is good enough. If someone wants more effective caching, getting rid of the variants looks like a better way to go.
