Opened 3 years ago
Closed 3 years ago
#2320 closed defect (duplicate)
Inconsistent URL decoding with X-Accel-Redirect depending on whether the original request URL was url-encoded
Reported by: | ThiefMaster | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | nginx-core | Version: | 1.19.x |
Keywords: | Cc: | ThiefMaster | |
uname -a: | Linux eluvian 5.15.19-gentoo #1 SMP Sun Feb 6 19:44:25 CET 2022 x86_64 AMD Ryzen 5 3600X 6-Core Processor AuthenticAMD GNU/Linux | ||
nginx -V: |
nginx version: nginx/1.20.1
built with OpenSSL 1.1.1l 24 Aug 2021 TLS SNI support enabled configure arguments: --prefix=/usr --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error_log --pid-path=/run/nginx.pid --lock-path=/run/lock/nginx.lock --with-cc-opt=-I/usr/include --with-ld-opt=-L/usr/lib64 --http-log-path=/var/log/nginx/access_log --http-client-body-temp-path=/var/lib/nginx/tmp/client --http-proxy-temp-path=/var/lib/nginx/tmp/proxy --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi --http-scgi-temp-path=/var/lib/nginx/tmp/scgi --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --with-compat --with-debug --with-http_v2_module --with-pcre --without-http_browser_module --without-http_empty_gif_module --without-http_geo_module --without-http_grpc_module --without-http_limit_req_module --without-http_limit_conn_module --without-http_memcached_module --without-http_mirror_module --without-http_scgi_module --without-http_ssi_module --without-http_split_clients_module --without-http_upstream_hash_module --without-http_upstream_ip_hash_module --without-http_upstream_keepalive_module --without-http_upstream_least_conn_module --without-http_upstream_zone_module --without-http_userid_module --with-http_addition_module --with-http_stub_status_module --with-http_sub_module --with-http_realip_module --add-module=external_module/ngx-fancyindex-0.4.4 --with-http_ssl_module --without-stream_access_module --without-stream_geo_module --without-stream_limit_conn_module --without-stream_map_module --without-stream_return_module --without-stream_split_clients_module --without-stream_upstream_hash_module --without-stream_upstream_least_conn_module --without-stream_upstream_zone_module --without-mail_imap_module --without-mail_pop3_module --without-mail_smtp_module --user=nginx --group=nginx |
Description
I'm using nginx and x-accel-redirect in what's probably a slightly creative way to proxy file downloads from S3 in order to not show the very ugly and temporary S3 URL to my users. This works fine, but I noticed that in a few very rare cases the requests fail (usually due to an invalid S3 signature).
In any case, I managed to reproduce it without S3 and it seem to happen because depending on whether the URL the user accesses contains url-encoded charaters or not, the URL encoding behavior in the internal request changes.
This is a problem of course, because it means I cannot reliable know whether I need to URL-encode the path included in my x-accel-redirect header or not unless I use a horrible workaround of checking whether the current request url contains a %
character.
I believe this behavior is so strange that it can only be a bug, but if there are any workarounds to get consistent behavior (so I can either always url-encode the path in my x-accel-redirect header or never do so) please let me know.
I'm attaching a minimal test case to reproduce this. The nginx version used there was the one that's stable on my Gentoo system, but I'm also including a Dockerfile which can be used to test against the nginx:latest
image on Docker hub in this Git repo: https://github.com/ThiefMaster/nginx-bug-repro
If you want to run the test case locally:
Put the two files in some folder, create a Python virtualenv, activate it and install flask in there (pip install flask
).
Then run the flask app with flask run -p 12345 -h 0.0.0.0
and nginx with nginx -c $(realpath nginx.conf) -e /dev/stdout
.
nginx.conf:
daemon off; worker_processes 2; pid /tmp/nginx/nginx.pid; error_log /dev/stdout info; events { worker_connections 1024; use epoll; } http { client_body_temp_path /tmp/nginx/client; proxy_temp_path /tmp/nginx/proxy; fastcgi_temp_path /tmp/nginx/fcgi; uwsgi_temp_path /tmp/nginx/uwsgi; log_format short '$remote_addr [$time_local] "$request" $status $bytes_sent'; server { listen 8000 default_server; listen [::]:8000 default_server; access_log /dev/stdout short; location ~ ^/_internal/(https?)/([^/]+)/(.+)$ { internal; set $download_protocol $1; set $download_host $2; set $download_path $3; set $download_url $download_protocol://$download_host/$download_path; resolver 127.0.0.1; proxy_pass $download_url$is_args$args; } location / { proxy_pass http://127.0.0.1:12345; } root /var/empty; } }
app.py:
from urllib.parse import quote from flask import Flask, url_for, request app = Flask(__name__) @app.route('/') def index(): return f''' WITHOUT explicitly encoding x-accel-redirect url:<br> <a href="{url_for('test_without')}">Test [main url without encoding, target without encoding]</a><br> <a href="{url_for('test_without', enc_target=1)}">Test [main url without encoding, target with encoding]</a> 💣<br> <br> <a href="{url_for('test_with')}">Test [main url with encoding, target without encoding]</a><br> <a href="{url_for('test_with', enc_target=1)}">Test [main url with encoding, target with encoding]</a><br> <br><br> WITH explicitly encoding x-accel-redirect url:<br> <a href="{url_for('test_without', enc_xar=1)}">Test [main url without encoding, target without encoding]</a><br> <a href="{url_for('test_without', enc_xar=1, enc_target=1)}">Test [main url without encoding, target with encoding]</a><br> <br> <a href="{url_for('test_with', enc_xar=1)}">Test [main url with encoding, target without encoding]</a><br> <a href="{url_for('test_with', enc_xar=1, enc_target=1)}">Test [main url with encoding, target with encoding]</a> ⚠ ''' def _test(): endpoint = 'target_with' if request.args.get('enc_target') == '1' else 'target_without' path = url_for(endpoint) internal_url = f'/_internal/http/{request.host}{path}' print(f'Redirecting internally to {internal_url}') response = app.make_response('') if request.args.get('enc_xar') == '1': internal_url = quote(internal_url) response.headers['X-Accel-Redirect'] = internal_url return response @app.route('/test-without/') def test_without(): return _test() @app.route('/test with/') def test_with(): return _test() def _target(): return f"target called url={request.base_url}" @app.route('/target-without') def target_without(): return _target() @app.route('/target with') def target_with(): return _target() @app.errorhandler(404) def notfound(exc): return f'NOT FOUND: {request.base_url}'
This looks like a duplicate of #348: excessive URI encoding is done by
set ... $1;
if there were escaped characters in the original request URI. As a workaround, consider using named captures to avoid this extra escaping.