Opened 4 years ago

Closed 3 years ago

#2191 closed defect (fixed)

Nginx doesn't escape unsafe characters on proxying

Reported by: ZigzagAK@…
Owned by:
Priority: major
Milestone:
Component: nginx-core
Version: 1.19.x
Keywords:
Cc:
uname -a:
nginx -V: nginx version: nginx/1.19.6
built by gcc 7.3.1 20180303 (Red Hat 7.3.1-5) (GCC)
built with OpenSSL 1.0.2k-fips 26 Jan 2017
TLS SNI support enabled
configure arguments: --prefix=/root/tmp/nginx-1.19.6 --with-http_v2_module --with-poll_module --with-threads --with-file-aio --with-pcre-jit --with-http_stub_status_module --with-http_ssl_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-stream --with-http_auth_request_module --with-http_realip_module --with-http_gunzip_module --with-http_sub_module

Description (last modified by ZigzagAK@…)

Example synthetic configuration:

upstream xxxx {
  server 127.0.0.1:3456;
}

server {
  listen 3456;
  location / {
    return 200 '$uri\n$request\n';
  }
}

server {
  listen 2345;
  location / {
    rewrite ^ $uri break;
    proxy_pass http://xxxx;
  }
}
[root@e078281ef0c9 gateway]# curl localhost:3456/xxx/aaa%3C%3E%22
/xxx/aaa<>"
GET /xxx/aaa%3C%3E%22 HTTP/1.1
[root@e078281ef0c9 gateway]# curl localhost:2345/xxx/aaa%3C%3E%22
/xxx/aaa<>"
GET /xxx/aaa<>" HTTP/1.1
[root@e078281ef0c9 gateway]#
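
The two responses differ because `$uri` holds the decoded path, while `$request` shows the request line as it arrived. The following is a minimal Python sketch of that decoding step only. It uses the standard library's `unquote()` as a stand-in and is not nginx's own code:

```python
from urllib.parse import unquote

# The path as sent by curl (percent-encoded).
raw_path = "/xxx/aaa%3C%3E%22"

# nginx's $uri holds the decoded form of the path; unquote() stands in
# for that decoding step in this illustration.
decoded_path = unquote(raw_path)

print(decoded_path)  # prints /xxx/aaa<>"
```

With `rewrite ^ $uri break;` the decoded form becomes the new request URI, so whether `<`, `>` and `"` reach the upstream encoded depends entirely on how nginx re-escapes it.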

tcpdump:

before nginx:

GET /xxx/aaa%3C%3E%22 HTTP/1.1
User-Agent: curl/7.29.0
Host: localhost:2345
Accept: */*

after nginx:

GET /xxx/aaa<>" HTTP/1.1
Connection: keep-alive
Host: localhost:2345
Connection: keep-alive
User-Agent: curl/7.29.0
Accept: */*
X-Forwarded-For: 127.0.0.1
X-Real-IP: 127.0.0.1

The real configuration is more complicated.

This causes an error on the backend side:

java.lang.IllegalArgumentException: Invalid character found in the request target [... code:test%23a%3Ft"t%25r]. The valid characters are defined in RFC 7230 and RFC 3986

https://datatracker.ietf.org/doc/html/rfc1738#section-2.2

... The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used to
delimit URLs in some systems.

All unsafe characters must always be encoded within a URL.

Cause: https://github.com/nginx/nginx/blob/master/src/core/ngx_string.c#L1496

Change History (6)

comment:1 by ZigzagAK@…, 4 years ago

Description: modified (diff)

comment:2 by Maxim Dounin, 4 years ago

Could you please be more specific about the backend software it causes problems with?

Note well that a trivial workaround is to avoid modifying the request URI in nginx and to proxy without URI modification instead, such as (note no rewrites, and no URI component in the proxy_pass directive):

location / {
    proxy_pass http://example.com;
}

When the request URI is modified within nginx, nginx only escapes characters which need to be escaped in HTTP, but not characters "used as the delimiters around URLs in free text". While this does not strictly conform to the RFC, it works in most cases. I tried to submit a patch for this a while ago, though the patch was rejected by Igor.

comment:3 by ZigzagAK@…, 4 years ago

> Note well that a trivial workaround is to avoid modification of the request URI in nginx.

We use nginx as an API gateway and need to modify the URL and arguments when proxying. Rewrites may happen at any time.

Actually, we use the Lua module and redefine the URI with the ngx.req.set_uri API (it is similar to the standard rewrite directive in the example above).

> Could you please be more specific about the backend software it causes problems with?

Apache Tomcat.

> I've tried to submit a patch for this a while ago, though the patch was rejected by Igor.

This patch looks correct.

The RFC tells us that software MUST escape unsafe characters.

We will probably also need to patch nginx to resolve our problem, because there is no other universal solution for our case.

comment:4 by ZigzagAK@…, 4 years ago

Hmm, RFC 1738 is obsolete.
I can't find this requirement in RFC 3986 or RFC 7230.

comment:5 by Maxim Dounin <mdounin@…>, 3 years ago

In 7880:dfd8dfb436e5/nginx:

Core: escaping of chars not allowed in URIs per RFC 3986.

Per RFC 3986 only the following characters are allowed in URIs unescaped:

unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
           / "*" / "+" / "," / ";" / "="

And "%" can appear as a part of escaping itself. The following
characters are not allowed and need to be escaped: %00-%1F, %7F-%FF,
" ", """, "<", ">", "\", "^", "`", "{", "|", "}".

Not escaping ">" is known to cause problems at least with MS Exchange (see
http://nginx.org/pipermail/nginx-ru/2010-January/031261.html) and in
Tomcat (ticket #2191).

The patch adds escaping of the following chars in all URI parts: """, "<",
">", "\", "^", "`", "{", "|", "}". Note that comments are mostly preserved
to outline important characters being escaped.
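
To illustrate the rule described in this commit message, here is a minimal Python sketch that percent-encodes every byte outside the unreserved, gen-delims and sub-delims sets of RFC 3986. The function name `escape_disallowed` is hypothetical, and the sketch does not reproduce nginx's per-context escape tables (which, for example, treat "?" and "#" differently depending on the URI part). It only shows which characters fall outside the allowed set:

```python
import string

# Characters RFC 3986 allows unescaped in a URI.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
ALLOWED = UNRESERVED | GEN_DELIMS | SUB_DELIMS


def escape_disallowed(uri: str) -> str:
    """Percent-encode every byte that is not allowed unescaped in a URI.

    Illustration only (hypothetical helper): "%" is encoded as well,
    so the input is expected to be already-decoded text.
    """
    out = []
    for byte in uri.encode("utf-8"):
        ch = chr(byte)
        if byte < 0x80 and ch in ALLOWED:
            out.append(ch)                # allowed: keep as-is
        else:
            out.append(f"%{byte:02X}")    # not allowed: %XX
    return "".join(out)
```

Under these assumptions, `escape_disallowed('/xxx/aaa<>"')` returns `/xxx/aaa%3C%3E%22`, which matches the encoded request line from the original report.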

comment:6 by Maxim Dounin, 3 years ago

Resolution: fixed
Status: new → closed

Fix committed. Thanks for prodding this.
