Opened 4 years ago

Closed 3 years ago

#2079 closed defect (fixed)

Empty path in URL

Reported by: ianton-ru@… Owned by:
Priority: minor Milestone:
Component: nginx-core Version: 1.19.x
Keywords: url parse Cc:
uname -a: Linux <hostname> 4.19.51-8 #1 SMP Mon Jun 17 18:06:03 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.14.2
built by clang
built with OpenSSL 1.1.1d 10 Sep 2019
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/var/run/nginx.pid --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/body --http-proxy-temp-path=/var/lib/nginx/proxy --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_degradation_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_xslt_module --with-http_v2_module --with-http_perl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module

Description

Nginx does not accept url with empty path like http://domain.com?param=value and returns error 400: Bad Request

RFC 7230 says that this kind of URL is valid and should be interpret as URL with path "/" (https://tools.ietf.org/html/rfc7230#section-2.7.3).

URLs like this makes for example official c++ SDK for AWS
(https://github.com/aws/aws-sdk-cpp/blob/master/aws-cpp-sdk-core/source/http/URI.cpp#L350)

As result, it's impossible to use nginx as proxy for requests to any AWS-compatible services (like Yandex Cloud Storage for example) from any software uses aws-sdk-cpp (like ClickHouse).

I believe that this thing is in nginx code here: https://github.com/nginx/nginx/blob/master/src/http/ngx_http_parse.c#L438 (and maybe some code does not expect empty path after parse).

PS. In nginx version field is our actual production version, but it reproducea with any version.

Change History (6)

comment:1 by Maxim Dounin, 4 years ago

Priority: majorminor

First of all, could you please clarify - are you using nginx as a forward proxy?

comment:2 by ianton-ru@…, 4 years ago

No. In known case Yandex Cloud Storage uses nginx as reverse proxy.

comment:3 by Maxim Dounin, 4 years ago

So why the client uses request-target in the absolute-form in the first place? Quoting RFC 7230:

   When making a request directly to an origin server, other than a
   CONNECT or server-wide OPTIONS request (as detailed below), a client
   MUST send only the absolute path and query components of the target
   URI as the request-target.  If the target URI's path component is
   empty, the client MUST send "/" as the path within the origin-form of
   request-target.  A Host header field is also sent, as defined in
   Section 5.4.

The only case when the client is expected to use absolute-form is when making requests to a proxy, that is, to a forward proxy.

While nginx is required to support absolute-form "to allow for transition to the absolute-form for all requests in some future version of HTTP", and the observed behaviour still needs to be fixed, the error is not expected to happen in practice unless the client misbehaves or nginx is misused as a forward proxy.

comment:4 by Maxim Dounin, 3 years ago

The following patch should fix this. Note though that if you are seeing such requests in practice, this means that the client is broken and needs to be fixed (unless nginx is misused as a reverse proxy).

# HG changeset patch
# User Maxim Dounin <mdounin@mdounin.ru>
# Date 1606783689 -10800
#      Tue Dec 01 03:48:09 2020 +0300
# Node ID a7d2c9724ccc03f437615291ab071f1bb1fb440b
# Parent  90cc7194e993f8d722347e9f46a00f65dffc3935
Fixed parsing of absolute URIs with empty path (ticket #2079).

When the request line contains request-target in the absolute-URI form,
it can contain path-empty instead of a single slash (see RFC 7230, RFC 3986).
Previously, the ngx_http_parse_request_line() function only accepted empty
path when there was no query string.

With this change, non-empty query is also correctly handled.  That is,
request line "GET http://example.com?foo HTTP/1.1" is accepted and results
in $uri "/" and $args "foo".

Note that $request_uri remains "?foo", similarly to how spaces in URIs
are handled.  Providing "/?foo", similarly to how "/" is provided for
"GET http://example.com HTTP/1.1", requires allocation.

diff --git a/src/http/ngx_http_parse.c b/src/http/ngx_http_parse.c
--- a/src/http/ngx_http_parse.c
+++ b/src/http/ngx_http_parse.c
@@ -389,6 +389,12 @@ ngx_http_parse_request_line(ngx_http_req
                 r->uri_end = r->schema_end + 2;
                 state = sw_host_http_09;
                 break;
+            case '?':
+                r->uri_start = p;
+                r->args_start = p + 1;
+                r->empty_path_in_uri = 1;
+                state = sw_uri;
+                break;
             default:
                 return NGX_HTTP_PARSE_INVALID_REQUEST;
             }
@@ -456,6 +462,12 @@ ngx_http_parse_request_line(ngx_http_req
                 r->uri_end = r->schema_end + 2;
                 state = sw_host_http_09;
                 break;
+            case '?':
+                r->uri_start = p;
+                r->args_start = p + 1;
+                r->empty_path_in_uri = 1;
+                state = sw_uri;
+                break;
             default:
                 return NGX_HTTP_PARSE_INVALID_REQUEST;
             }
diff --git a/src/http/ngx_http_request.c b/src/http/ngx_http_request.c
--- a/src/http/ngx_http_request.c
+++ b/src/http/ngx_http_request.c
@@ -1243,6 +1243,9 @@ ngx_http_process_request_uri(ngx_http_re
             return NGX_ERROR;
         }
 
+    } else if (r->empty_path_in_uri) {
+        ngx_str_set(&r->uri, "/");
+
     } else {
         r->uri.data = r->uri_start;
     }
@@ -1250,7 +1253,7 @@ ngx_http_process_request_uri(ngx_http_re
     r->unparsed_uri.len = r->uri_end - r->uri_start;
     r->unparsed_uri.data = r->uri_start;
 
-    r->valid_unparsed_uri = r->space_in_uri ? 0 : 1;
+    r->valid_unparsed_uri = (r->space_in_uri || r->empty_path_in_uri) ? 0 : 1;
 
     if (r->uri_ext) {
         if (r->args_start) {
diff --git a/src/http/ngx_http_request.h b/src/http/ngx_http_request.h
--- a/src/http/ngx_http_request.h
+++ b/src/http/ngx_http_request.h
@@ -470,6 +470,9 @@ struct ngx_http_request_s {
     /* URI with " " */
     unsigned                          space_in_uri:1;
 
+    /* URI with empty path */
+    unsigned                          empty_path_in_uri:1;
+
     unsigned                          invalid_header:1;
 
     unsigned                          add_uri_to_alias:1;

comment:5 by Maxim Dounin <mdounin@…>, 3 years ago

In 7752:8989fbd2f89a/nginx:

Fixed parsing of absolute URIs with empty path (ticket #2079).

When the request line contains request-target in the absolute-URI form,
it can contain path-empty instead of a single slash (see RFC 7230, RFC 3986).
Previously, the ngx_http_parse_request_line() function only accepted empty
path when there was no query string.

With this change, non-empty query is also correctly handled. That is,
request line "GET http://example.com?foo HTTP/1.1" is accepted and results
in $uri "/" and $args "foo".

Note that $request_uri remains "?foo", similarly to how spaces in URIs
are handled. Providing "/?foo", similarly to how "/" is provided for
"GET http://example.com HTTP/1.1", requires allocation.

comment:6 by Maxim Dounin, 3 years ago

Resolution: fixed
Status: newclosed

Fix committed, thanks for reporting this.

Note: See TracTickets for help on using tickets.