Opened 7 years ago

Closed 7 years ago

#1219 closed defect (fixed)

Subrequest using slice stuck in infinite loop

Reported by: giovani.rinaldi.azion.com@…
Owned by:
Priority: minor
Milestone:
Component: nginx-module
Version: 1.11.x
Keywords:
Cc:
uname -a: Linux hostname 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.11.10
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC)
configure arguments: --prefix=/usr/local/nginx --sbin-path=/usr/local/nginx/sbin/nginx --conf-path=/usr/local/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/tmp/client_body --http-proxy-temp-path=/var/lib/nginx/tmp/proxy --pid-path=/run/nginx.pid --lock-path=/run/lock/subsys/nginx --user=nginx --group=nginx --with-pcre --with-http_slice_module --with-threads --with-debug --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic' --with-ld-opt='-Wl,-z,relro -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -Wl,-E'

Description

When nginx has to issue subrequests for one or more slices to a remote origin, a subrequest can get stuck in an infinite loop (i.e. the worker process runs at 100% CPU permanently) if all of the following hold: the config has an error_page pointing to a different location than the one serving slices, the error is cached by proxy_cache_valid, and the origin server stops responding so that nginx internally generates a 5xx error, e.g. 502. That 5xx error is then redirected to the error-specific location, which is cached or will be cached (in case the cache files are empty).

If proxy_cache_valid for the 5xx status (e.g. 502) is set to 0s, the loop does not occur: the subrequest keeps trying to connect to the origin until it comes back online, and the whole request (including the subsequent subrequests) is then successfully satisfied.

If error_page 502 /another_uri_for_erros/ is not set (or the custom location is not present at the server level), the loop does not occur either.
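
For readers without access to the attachment, the conditions above can be sketched as a configuration fragment. This is only an illustration assembled from the description, not the attached nginx_minimal.conf: the cache path, zone name, listen port, slice size, cache times, and both backend addresses are assumptions, and the fragment may not reproduce the loop exactly as the attachment does.

```nginx
# Hypothetical sketch; belongs inside the http {} block.
# Cache path, zone name, ports and backend addresses are assumptions.
proxy_cache_path /var/cache/nginx/slices keys_zone=slice_cache:10m;

server {
    listen 8080;

    # Condition 1: error_page points to a location other than
    # the one serving slices.
    error_page 502 /another_uri_for_erros/;

    location / {
        # Slice module setup as described in the nginx documentation.
        slice             1m;
        proxy_cache       slice_cache;
        proxy_cache_key   $uri$is_args$args$slice_range;
        proxy_set_header  Range $slice_range;
        proxy_cache_valid 200 206 1h;

        # Condition 2: the 5xx error is cached.
        # Per the report, using 0s here avoids the loop.
        proxy_cache_valid 502 10s;

        # Condition 3: this origin stops responding, so nginx
        # generates the 502 internally.
        proxy_pass http://127.0.0.1:9000;
    }

    # The error-specific location, itself cached (assumed layout).
    location /another_uri_for_erros/ {
        proxy_cache       slice_cache;
        proxy_cache_valid 200 10s;
        proxy_pass        http://127.0.0.1:9001;
    }
}
```

Removing the error_page line, or setting the 502 cache time to 0s, corresponds to the two cases described above in which the loop does not occur.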

This also happens with nginx-1.10.1 (although I have not reproduced it with the vanilla version, it seems not to be related to 3rd party modules).

Also, I have not tested this bug with other HTTP statuses, such as 4xx (I don't know whether the subrequest module treats them differently internally).

Attached are two files. The first is the minimal nginx config file that I used to reproduce this bug with vanilla nginx-1.11.10 (as per -V). The second is a gdb backtrace taken inside the innermost part of the loop.

I hope this helps. If needed, I'll be glad to provide more debugging information or test possible patches.

Attachments (3)

nginx_minimal.conf (1.2 KB) - added by giovani.rinaldi.azion.com@… 7 years ago.
Minimal nginx configuration to reproduce the loop
nginx_loop_gdb_bt (6.5 KB) - added by giovani.rinaldi.azion.com@… 7 years ago.
GDB Backtrace of loop
slice-redirect.patch (1.1 KB) - added by Roman Arutyunyan 7 years ago.
Patch disabling internal redirects in slice subrequests


Change History (8)

by giovani.rinaldi.azion.com@…, 7 years ago

Attachment: nginx_minimal.conf added

Minimal nginx configuration to reproduce the loop

by giovani.rinaldi.azion.com@…, 7 years ago

Attachment: nginx_loop_gdb_bt added

GDB Backtrace of loop

comment:1 by Roman Arutyunyan, 7 years ago

Thanks for reporting this.
Please try the attached patch.

by Roman Arutyunyan, 7 years ago

Attachment: slice-redirect.patch added

Patch disabling internal redirects in slice subrequests

comment:2 by giovani.rinaldi.azion.com@…, 7 years ago

Your patch does fix the loop indeed! Thanks for your quick reply!

Just one last question, is there any chance of having this patch backported to 1.10?

Thank you once again!

comment:3 by Roman Arutyunyan, 7 years ago

I'm not sure we'll commit exactly this patch; we have several solutions.

Anyway, if we commit the fix before the new stable version (1.12) is released then probably it will be backported to 1.10.

comment:4 by Roman Arutyunyan <arut@…>, 7 years ago

In 6963:3ff293cfdab8/nginx:

Slice filter: prevented slice redirection (ticket #1219).

When a slice subrequest was redirected to a new location, its context was lost.
After its completion, a new slice subrequest for the same slice was created.
This could lead to infinite loop. Now the slice module makes sure each slice
subrequest starts output with the slice context available.

comment:5 by Roman Arutyunyan, 7 years ago

Resolution: fixed
Status: new → closed