Opened 4 years ago

Closed 3 years ago

#788 closed defect (fixed)

Segfault with dynamic upstream and many SSI includes

Reported by: joh-m@…
Owned by:
Priority: major
Milestone:
Component: nginx-core
Version: 1.8.x
Keywords: ssi proxy_pass resolver segfault
Cc: jmueller@…
uname -a: Linux ip-10-0-154-103 3.14.48-33.39.amzn1.x86_64 #1 SMP Tue Jul 14 23:43:07 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.8.0
built by gcc 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC)
built with OpenSSL 1.0.1k-fips 8 Jan 2015
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-http_auth_request_module --add-module=ngx_pagespeed-1.9.32.4-beta --add-module=srcache-nginx-module-0.30 --with-file-aio --with-ipv6 --with-http_spdy_module --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic'

Description

Hello,

We are using nginx to combine fragments from several upstream services. We have defined multiple locations that proxy_pass to these services. A service can return an HTML document with many SSI includes, which refer to other locations also configured in nginx, and those locations in turn proxy_pass to further upstreams (example configuration attached). Because we host on AWS and use Elastic Load Balancers as upstreams, whose IP addresses change, we need to use the resolver.
We currently see a crash when parsing a document with ~800 SSI includes (all referencing the same location and therefore the same upstream). The worker segfaults with a long stack trace in ngx_http_finalize_connection (r=0x242b1d0) at src/http/ngx_http_request.c:2545 (stack trace attached).
This doesn't happen when the upstream is configured without a variable.
In the debug log we see many resolver messages. I assume nginx starts a resolver lookup for each include instead of relying on the cached response (TTL is 60s).

We tested with nginx 1.6.x (vanilla Amazon Linux), 1.8.0 (self-compiled with ngx_pagespeed), and 1.9.4 (self-compiled with ngx_pagespeed).

Regards
Johannes Müller

Attachments (5)

nginx_config.txt (323 bytes ) - added by joh-m@… 4 years ago.
example configuration with two upstreams
nginx_stacktrace.txt (59.8 KB ) - added by joh-m@… 4 years ago.
stacktrace
nginx_log.txt (819 bytes ) - added by joh-m@… 4 years ago.
log excerpt, combined out of different tests
posting-flag.patch (2.6 KB ) - added by Roman Arutyunyan 4 years ago.
use-after-free.log (18.9 KB ) - added by Sergey Kandaurov 3 years ago.
use-after-free -- ngx_http_run_posted_requests from ngx_http_upstream_resolve_handler


Change History (11)

by joh-m@…, 4 years ago

Attachment: nginx_config.txt added

example configuration with two upstreams

by joh-m@…, 4 years ago

Attachment: nginx_stacktrace.txt added

stacktrace

by joh-m@…, 4 years ago

Attachment: nginx_log.txt added

log excerpt, combined out of different tests

comment:1 by vl, 4 years ago

Please reproduce the issue without 3rd-party modules and attach the full configuration (i.e. include the SSI files).

comment:2 by Sergey Kandaurov, 4 years ago

Status: new → accepted

comment:3 by Roman Arutyunyan, 4 years ago

I was able to reproduce the issue. I'm attaching a patch that fixes it. Please try it and report the result.

by Roman Arutyunyan, 4 years ago

Attachment: posting-flag.patch added

comment:4 by joh-m@…, 4 years ago

I can confirm that with this patch nginx no longer segfaults.

by Sergey Kandaurov, 3 years ago

Attachment: use-after-free.log added

use-after-free -- ngx_http_run_posted_requests from ngx_http_upstream_resolve_handler

comment:5 by Roman Arutyunyan <arut@…>, 3 years ago

In 7041:6169dbad37d8/nginx:

Upstream: fixed running posted requests (ticket #788).

Previously, the upstream resolve handler always called
ngx_http_run_posted_requests() to run posted requests after processing the
resolver response. However, if the handler was called directly from the
ngx_resolve_name() function (for example, if the resolver response was cached),
running posted requests from the handler could lead to the following errors:

  • If the request was scheduled for termination, it could actually be terminated
    in the resolve handler. Upper stack frames could reference the freed request
    object in this case.

  • If a significant number of requests were posted, and for each of them the
    resolve handler was called directly from the ngx_resolve_name() function,
    posted requests could be run recursively and lead to stack overflow.

Now ngx_http_run_posted_requests() is only called from asynchronously invoked
resolve handlers.

comment:6 by Roman Arutyunyan, 3 years ago

Resolution: fixed
Status: accepted → closed