Opened 10 years ago

Last modified 4 years ago

#564 accepted defect

map regex matching affects rewrite directive

Reported by: Pascal Jungblut
Owned by:
Priority: minor
Milestone:
Component: nginx-core
Version:
Keywords:
Cc:
uname -a: FreeBSD freebsd 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014 root@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
nginx -V: nginx version: nginx/1.6.0
TLS SNI support enabled
configure arguments: --prefix=/usr/local/etc/nginx --with-cc-opt='-I /usr/local/include' --with-ld-opt='-L /usr/local/lib' --conf-path=/usr/local/etc/nginx/nginx.conf --sbin-path=/usr/local/sbin/nginx --pid-path=/var/run/nginx.pid --error-log-path=/var/log/nginx-error.log --user=www --group=www --http-client-body-temp-path=/var/tmp/nginx/client_body_temp --http-fastcgi-temp-path=/var/tmp/nginx/fastcgi_temp --http-proxy-temp-path=/var/tmp/nginx/proxy_temp --http-scgi-temp-path=/var/tmp/nginx/scgi_temp --http-uwsgi-temp-path=/var/tmp/nginx/uwsgi_temp --http-log-path=/var/log/nginx-access.log --with-http_stub_status_module --with-pcre --with-http_ssl_module

Description

Using a regex in the map directive changes the capture groups in a rewrite directive. This happens only if the regex in map is matched. A minimal example config:

http {
        map $http_accept_language $lang {
                default en;
                ~(de) de;
        }
        server {
                server_name test.local;
                listen 80;
                rewrite ^/(.*)$ http://example.com/$lang/$1 permanent;
        }
}

Expected:

$ curl -sI http://test.local/foo | grep Location
Location: http://example.com/en/foo
$ curl -H "Accept-Language: de" -sI http://test.local/foo | grep Location
Location: http://example.com/de/foo

Actual:

$ curl -sI http://test.local/foo | grep Location
Location: http://example.com/en/foo
$ curl -H "Accept-Language: de" -sI http://test.local/foo | grep Location
Location: http://example.com/de/de

If I leave out the parentheses in ~(de) de; (so it becomes ~de de;), $1 is simply empty:

$ curl -H "Accept-Language: de" -sI http://test.local/foo | grep Location
Location: http://example.com/de/

Change History (11)

comment:1 by Josh DeWald, 10 years ago

Per the docs, this seems to be by design:

http://nginx.org/en/docs/http/ngx_http_map_module.html

"A regular expression can contain named and positional captures that can later be used in other directives along with the resulting variable."

Though it looks like in practice the docs should make it clear that if you use a regex-based map variable it's going to take over the regex context of directives.

Last edited 10 years ago by Josh DeWald

comment:2 by Maxim Dounin, 10 years ago

Status: new → accepted

This case clearly shows that the current behaviour is bad and should be fixed. While positional captures are clearly bad in most cases, a plain rewrite should be simple enough for them to work.

comment:3 by claudeha@…, 8 years ago

It would be more useful IMO if the positional captures could be used within the map directive itself. Here's a silly example that could be achieved with other things, but just to show what I mean:

map $http_accept_language $myindex {
    default index.en.html;
    ~(.*) index.$1.html;
}

I recently ran into a case where it would have been very useful to have regexes with different numbers of matches replaced within a map directive: http://serverfault.com/questions/769373/porting-rewritecond-query-string-from-apache2-to-nginx
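
A named capture may be one way to get close to this. The sketch below assumes nginx 1.11.0 or later, where (per the map module documentation) a map value can combine text and variables. The capture name lang_tag and the [a-z]+ pattern are only illustrative, and like the example above it is deliberately simplistic:

```nginx
map $http_accept_language $myindex {
    default                  index.en.html;
    # Capture the leading language tag, e.g. "de" from "de-DE,de;q=0.9".
    # lang_tag is a hypothetical name chosen for this sketch.
    ~^(?<lang_tag>[a-z]+)    index.$lang_tag.html;
}
```

Because the value is built from the named variable rather than from $1, it should not depend on which regex ran last elsewhere in the request.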

Last edited 8 years ago by claudeha@…

comment:4 by Maxim Dounin, 7 years ago

See also #1044.

comment:5 by Maxim Dounin, 7 years ago

See also #1142.

comment:6 by higuita, 7 years ago

The nginx docs should be clear that named captures should always be used to avoid problems. Without that, users will use positional captures until something breaks and then waste lots of time debugging a simple, badly documented feature.

Simply point to named captures in the docs, give one example, and everyone will start to use them. Without it, people fall back to the simplest capture and fail at random times.
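
As an illustration of what such an example might look like, here is the configuration from the ticket description with the rewrite switched to a named capture. This is a sketch rather than text from the nginx documentation. The capture name rest is an arbitrary choice, and it assumes a PCRE build that supports named captures:

```nginx
http {
    map $http_accept_language $lang {
        default en;
        ~(de)   de;
    }

    server {
        listen      80;
        server_name test.local;

        # $rest is filled in when the rewrite regex matches. Evaluating
        # $lang afterwards runs the map regex, which replaces the positional
        # captures ($1) but leaves the named variable $rest alone.
        rewrite ^/(?<rest>.*)$ http://example.com/$lang/$rest permanent;
    }
}
```

With this change, the "Accept-Language: de" request from the description should redirect to the /de/foo URL rather than /de/de.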

comment:7 by Maxim Dounin, 7 years ago

See also #1285.

comment:8 by https://stackoverflow.com/users/1100117/higuita, 7 years ago

Can someone change the component to documentation? This is mostly a documentation bug.

thanks

comment:9 by Maxim Dounin, 6 years ago

See also #1498.

comment:10 by Maxim Dounin, 4 years ago

See also #1934.

comment:11 by edrandall@…, 4 years ago

This bug is classified as 'minor', but in our opinion it should be 'critical'.

Are named captures restricted to the same context in which they are used?
For example, if I were to use the same name ?<p1> for the first capture parameter in several places, would I also run into the same issues when using $p1 as if I'd used $1?

The documentation very clearly uses numbered positional parameters in regex examples, which encourages developers to fall victim to this bug.

For example: http://nginx.org/en/docs/http/ngx_http_rewrite_module.html#rewrite
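
To make the scoping question concrete: my understanding (an assumption, not something confirmed in this ticket) is that a named capture becomes an ordinary request-wide variable, so two regexes that reuse the same name would write to the same variable and the most recent match would win. Under that assumption, giving each regex its own capture name would avoid the clash. The names map_lang and doc_path and the /docs/ location below are hypothetical:

```nginx
# Fragment: the map block belongs in the http context.
map $http_accept_language $lang {
    default               en;
    ~(?<map_lang>de|fr)   $map_lang;
}

server {
    listen      80;
    server_name test.local;

    location ~ ^/docs/(?<doc_path>.*)$ {
        # $doc_path is set by the location regex, $map_lang by the map regex.
        # Because the names differ, neither match overwrites the other.
        return 301 http://example.com/$lang/docs/$doc_path;
    }
}
```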

Last edited 4 years ago by edrandall@…