Error page directive forces upstream's keepalive to be closed
|Reported by:||Owned by:|
|Keywords:||upstream keepalive error_page||Cc:||giovani.rinaldi.azion.com@…|
|uname -a:||Linux hostname 3.10.0-1062.18.1.el7.x86_64 #1 SMP Tue Mar 17 23:49:17 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux|
nginx version: nginx/1.16.1
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC)
configure arguments: --with-debug
When using an upstream for serving a given location which has an error page configured (such as with code 302), the proxy module is not able to intercept the intermediary response headers and thus will not mark the utilised upstream server with keepalive.
For example, given the 3 nginx's servers "proxy", "redirect" and "origin", the "proxy" server uses an upstream configured with keepalive that resolves to the "redirect" server IP/port. A request to the "proxy" server gets passed, via upstream, to the "redirect" server, which only answers the fixed status code 302 with a location containing an URL to the "origin" server. Then, the error_page directive in the "proxy" server gets triggered and executes a named location which, in turn, will do a proxy pass to the "origin" server to obtain the object indicated by the "redirect" server. A minimal nginx.conf, which reproduces this scenario, is attached to this ticket.
By verifying nginx's error.log in debug, it's visible that keepalive is not doing the "saving connection" part when dealing with the intermediary (redirect with 302 + location) response. See file error.log.errorpage.nokeepalive attached.
If the error_page is disabled (by simply commenting the error_page in the minimal nginx.conf attached), the 302 answer is returned directly (instead of entering the named location and requesting "origin" server) and the upstream's keepalive works as intended, as can be seen in the attached file error.log.noerrorpage.keepalive.
As a workaround, the patch (attached as file set_keepalive_status_moved.patch) fixed the issue by allowing the function
ngx_http_proxy_module to verify if such response allows upstream's keepalive to be enabled. The attached file error.log.errorpage.keepalive.patched demonstrates the full scenario working properly with keepalive + error_page redirection.
Further investigation into the behaviour that does not use using the error_page directive indicates that keepalive is enabled by the function
ngx_http_proxy_copy_filter, but such function is never called when error_page is active (presumably because the copy filter will only be called when a final response is ready to be sent downstream, and the 302 is only the intermediary response).
As mentioned, said patch fixes such issue, but some questions remain:
- Is it the correct approach to avoid this type of keepalive from being shut down?
- If so, which other return codes could be included? Maybe externalising this as a directive to be tested/used as needed, such as it's done with e.g. proxy_next_upstream.
I haven't tested this issue with the current nginx versions such as 1.18/1.19, though I overlooked the mentioned functions' source code briefly and could not see any major change in their behaviour, but I could gladly run further tests if needed.
P.S.: In order for the attached scenarios to work properly, every host utilised ("example.com", "redirect.com" and "origin.com") must be configured locally and must resolve to the localhost IP in the recursive resolver of the host machine.