Opened 6 years ago

Last modified 6 years ago

#1459 accepted enhancement

Can't vary on request headers set by proxy_set_header (rev. proxy mode)

Reported by: Neil Craig Owned by:
Priority: minor Milestone:
Component: nginx-core Version: 1.13.x
Keywords: proxy_set_header vary cache Cc:
uname -a: Linux 9acd833e6438 4.9.60-linuxkit-aufs #1 SMP Mon Nov 6 16:00:12 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.13.8 (BBC GTM)
built with OpenSSL 1.0.2n 7 Dec 2017
TLS SNI support enabled
configure arguments: --build='BBC GTM' --prefix=/etc/nginx/current --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/current/nginx.conf --pid-path=/var/run/nginx.pid --error-log-path=/var/log/nginx/default-error.log --http-log-path=/var/log/nginx/default-access.log --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=gtmdaemon --group=gtmdaemon --with-http_realip_module --with-http_v2_module --with-http_ssl_module --with-http_geoip_module --with-pcre-jit --with-ipv6 --with-file-aio --with-threads --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic' --add-module=/builddir/build/BUILD/nginx-1.13.8/headers-more-nginx-module-0.33 --add-module=/builddir/build/BUILD/nginx-1.13.8/naxsi-0.55.3/naxsi_src --add-module=/builddir/build/BUILD/nginx-1.13.8/nginx-module-vts-0.1.15 --with-openssl=/builddir/build/BUILD/nginx-1.13.8/openssl-OpenSSL_1_0_2n --add-module=/builddir/build/BUILD/nginx-1.13.8/ngx_devel_kit-0.3.0 --add-module=/builddir/build/BUILD/nginx-1.13.8/lua-nginx-module-0.10.11

Description

Hi

We're using NGINX in reverse proxy mode for an internal traffic management service and I noticed that NGINX doesn't vary the cached object on request headers which we calculate and add in NGINX itself via proxy_set_header. This causes a major problem for our service as it's multi-tenant. I think it'd be logical and expected if NGINX did vary on request headers set by proxy_set_header. I have also tested setting the headers via more_set_input_headers and by setting the variable directly (and in Lua), but these don't work either, sadly.

I have included a reduced test case which hopefully illustrates the situation (a few comments help explain). Output from testing (against local/Docker) is:

# curl -k https://127.0.0.1:8443/a\?vv1\=1 -i
HTTP/1.1 200 OK
Server: nginx/1.13.8
Date: Mon, 15 Jan 2018 14:33:54 GMT
Content-Type: text/plain
Content-Length: 25
Connection: keep-alive
Cache-Control: public,max-age=30
Vary: vvrh1
vvrh1-val-rec: val is 1
Edge-Cache-Status: EXPIRED
Origin-Response-Status: 200
Origin-IP: 127.0.0.1:9000

2018-01-15T14:33:54+00:00%                                                                                                                                                               

# curl -k https://127.0.0.1:8443/a\?vv1\=1 -i
HTTP/1.1 200 OK
Server: nginx/1.13.8
Date: Mon, 15 Jan 2018 14:33:55 GMT
Content-Type: text/plain
Content-Length: 25
Connection: keep-alive
Cache-Control: public,max-age=30
Vary: vvrh1
vvrh1-val-rec: val is 1
Edge-Cache-Status: HIT

2018-01-15T14:33:54+00:00%                                                                                                                                                               

# curl -k https://127.0.0.1:8443/a\?vv1\=2 -i
HTTP/1.1 200 OK
Server: nginx/1.13.8
Date: Mon, 15 Jan 2018 14:33:58 GMT
Content-Type: text/plain
Content-Length: 25
Connection: keep-alive
Cache-Control: public,max-age=30
Vary: vvrh1
vvrh1-val-rec: val is 1
Edge-Cache-Status: HIT

2018-01-15T14:33:54+00:00%

I'd expect a cache miss on the final response because the query string argument "vv1" has changed, which means that proxy_set_header would set a different value for the "vvrh1" request header. To illustrate that this mechanism works, once the cached object has expired, we see:

# curl -k https://127.0.0.1:8443/a\?vv1\=2 -i
HTTP/1.1 200 OK
Server: nginx/1.13.8
Date: Mon, 15 Jan 2018 14:39:12 GMT
Content-Type: text/plain
Content-Length: 25
Connection: keep-alive
Cache-Control: public,max-age=30
Vary: vvrh1
vvrh1-val-rec: val is 2
Edge-Cache-Status: val EXPIRED
Origin-Response-Status: 200
Origin-IP: 127.0.0.1:9000

2018-01-15T14:39:12+00:00%                                                                                                                                                               

# curl -k https://127.0.0.1:8443/a\?vv1\=2 -i
HTTP/1.1 200 OK
Server: nginx/1.13.8
Date: Mon, 15 Jan 2018 14:39:15 GMT
Content-Type: text/plain
Content-Length: 25
Connection: keep-alive
Cache-Control: public,max-age=30
Vary: vvrh1
vvrh1-val-rec: val is 2
Edge-Cache-Status: val HIT

2018-01-15T14:39:12+00:00%

Might this be something which could be fixed? If not, is there a workaround you can think of? Or have I made a mistake?

Cheers

Attachments (1)

nginx-reduced-test-case-vary.conf (5.2 KB ) - added by Neil Craig 6 years ago.
reduced test case


Change History (7)

by Neil Craig, 6 years ago

reduced test case

comment:1 by Maxim Dounin, 6 years ago

Status: new → accepted
Type: defect → enhancement

I completely agree that using headers as sent to the upstream server in the Vary header processing is more logical than the current behaviour of using original request headers as received from the client. I've considered this when working on the Vary support. In particular, this will help to reduce cache duplication in the following trivial case:

proxy_pass ...
proxy_cache ...
proxy_set_header Accept-Encoding "";

assuming a response with Vary: Accept-Encoding.

Unfortunately, this is not trivial to implement, and will require saving individual headers as sent to the upstream somewhere - this is not what nginx currently does. As such, current behaviour is to use the original request headers, without any changes applied via the proxy_set_header directive.

A workaround would be to use proxy_cache_key with appropriate values included.
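
The workaround above could look roughly like the following sketch. It assumes the reduced test case derives the vvrh1 request header from the vv1 query argument and uses a cache key that does not already contain the query string. The variable name $tenant_variant and the zone name edge_cache are hypothetical, and edge_cache would need a matching proxy_cache_path in the http context.

```nginx
location / {
    # Hypothetical: compute the value once so the header and the cache key
    # cannot drift apart.
    set $tenant_variant $arg_vv1;

    proxy_pass       http://127.0.0.1:9000;
    proxy_cache      edge_cache;          # assumed zone name

    # Header sent to the upstream; nginx does not consult this value
    # when it processes the upstream's Vary response header.
    proxy_set_header vvrh1 $tenant_variant;

    # Make the computed value part of the cache key instead.
    proxy_cache_key  "$scheme$proxy_host$uri|vvrh1=$tenant_variant";
}
```

With a key like this, requests that produce different vvrh1 values are stored as separate cache entries regardless of what the upstream puts in Vary.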

comment:2 by Neil Craig, 6 years ago

Thanks for the quick response Maxim, very much appreciated.
Unfortunately, since my use case is a remote, multi-tenant system, I don't have any way to know in advance what each tenant wants to vary on, but I appreciate the suggestion. Is there any other way you can think of that I can use? I am out of ideas :-(.

Cheers

comment:3 by Maxim Dounin, 6 years ago

While you have no way to know what each tenant wants to vary on, you certainly know which headers you've modified using proxy_set_header, and so you can include these headers into the cache key.

Alternatively, you may want to simply disable caching of responses with Vary (proxy_no_cache $upstream_http_vary;), or responses where Vary includes the headers you've modified with proxy_set_header (similar to the above, but using map on $upstream_http_vary to check for appropriate headers).
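
A minimal sketch of the second variant, assuming vvrh1 is the only header rewritten with proxy_set_header. The variable name $vary_on_rewritten_header is hypothetical, and the regular expression is only an illustration of matching one field name in a comma-separated Vary value.

```nginx
# http context: set the flag to 1 when the upstream's Vary header
# names a request header that was rewritten with proxy_set_header.
map $upstream_http_vary $vary_on_rewritten_header {
    default                     0;
    "~*(^|,)\s*vvrh1\s*(,|$)"   1;
}

# server or location context: do not store such responses in the cache.
proxy_no_cache $vary_on_rewritten_header;
```

Responses whose Vary names only untouched client headers remain cacheable, so the cache is diluted less than with a blanket proxy_no_cache $upstream_http_vary;.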

comment:4 by Neil Craig, 6 years ago

Thanks Maxim, I'll think it through. This is going to be a very busy service, so I have to make sure we don't dilute the cache if it's not absolutely necessary. I may well use #1 as an interim. I appreciate your suggestions, thank you.

Just something which occurred to me on my way to work this morning: if possible, it'd be great to have more_set_input_headers request headers also be varied upon. So if it's possible to have that happen too, or to feed back to the Openresty folk, that'd be handy for many people too, I am sure. I realise that more_headers is a 3rd party module, but since it's popular, it'd be really good if that worked too.

Is this work something that I should ask if we can contribute to (my C is way too bad, unfortunately), or are you able/planning to do the work internally?

Also (I know this is probably a difficult question), is there any rough idea of how long this might take to implement? This is just so I know roughly how far into a temporary solution I should go.

Many thanks for your work and help on this :-)

comment:5 by Maxim Dounin, 6 years ago

I have no idea what more_set_input_headers does and why it doesn't work for you. On the other hand, as I have told its author many times, what it does is a hack, and trying to do it correctly is expected to be tricky.

As for the plans - as of now, there are no specific plans on this. If you are planning to work on this yourself, make sure to read this article.

comment:6 by Neil Craig, 6 years ago

OK, thanks again - understood on both counts :-)
