Opened 7 months ago

Closed 6 months ago

Last modified 6 months ago

#1950 closed defect (worksforme)

nginx/1.17.9 randomly truncating some large proxy responses

Reported by: Artem Russakovskii
Owned by:
Priority: major
Milestone:
Component: nginx-core
Version:
Keywords: truncate, truncation, cut off, response
Cc:
uname -a: Linux 5.4.10-x86_64-linode132 #1 SMP PREEMPT Thu Jan 9 21:17:12 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx/1.17.9
built by gcc 7.4.0 (SUSE Linux)
built with OpenSSL 1.1.0i-fips 14 Aug 2018 (running with OpenSSL 1.1.1f 31 Mar 2020)
TLS SNI support enabled
configure arguments: --prefix=/usr/ --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/run/nginx.pid --lock-path=/run/nginx.lock --http-client-body-temp-path=/var/lib/nginx/tmp/ --http-proxy-temp-path=/var/lib/nginx/proxy/ --http-fastcgi-temp-path=/var/lib/nginx/fastcgi/ --http-uwsgi-temp-path=/var/lib/nginx/uwsgi/ --http-scgi-temp-path=/var/lib/nginx/scgi/ --user=nginx --group=nginx --without-select_module --without-poll_module --with-threads --with-file-aio --with-ipv6 --with-http_ssl_module --with-http_v2_module --with-http_realip_module --with-http_addition_module --with-http_xslt_module=dynamic --with-http_image_filter_module=dynamic --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_auth_request_module --with-http_random_index_module --with-http_secure_link_module --with-http_degradation_module --with-http_slice_module --with-http_stub_status_module --with-http_perl_module=dynamic --with-perl=/usr/bin/perl --with-mail=dynamic --with-mail_ssl_module --with-stream=dynamic --with-stream_ssl_module --with-stream_realip_module --with-stream_ssl_preread_module --with-pcre --with-pcre-jit --with-libatomic --with-compat --add-module=nginx_upstream_check_module-0.3.0 --add-dynamic-module=ngx-fancyindex-0.4.2 --add-dynamic-module=headers-more-nginx-module-0.33 --add-dynamic-module=nginx-rtmp-module-1.2.1 --add-dynamic-module=ngx_http_geoip2_module-3.3 --with-cc-opt='-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g -fPIC -D_GNU_SOURCE' --with-ld-opt='-Wl,-z,relro,-z,now -pie'

Description (last modified by Artem Russakovskii)

Hi,

We're using nginx/1.17.9 on 5 servers. About a week ago, one of them started randomly truncating some proxied responses. The ones we noticed are all WordPress xmlrpc responses from Apache, which runs on the same host and which nginx proxies to.

A few things stand out about this.

  1. We haven't noticed any other data getting truncated, only these xmlrpc XML API responses (at least not yet).
  2. After we narrowed it down to nginx (querying Apache directly produced no truncation), the first thing I did was restart nginx. After the restart, the issue went away (at least for now). None of the other 4 web servers running nginx and Apache in the same manner have ever exhibited this behavior.
  3. The machine has plenty of free disk space (1.3TB), so running out of space is not the cause.
  4. Here's the most interesting bit. This machine was rebooted about a week ago, which is exactly when the issue started.

     The second interesting bit is that the reboot brought in a new kernel: 5.4.10-x86_64-linode132.

     All the other machines haven't been rebooted yet and are running 5.1.17-x86_64-linode128.

  5. The full response is about 600KB. Sometimes ~500KB was returned, sometimes less (~400KB), and sometimes the full response came through.

Is this a known issue? Do you have any ideas about why it may have happened?

Change History (4)

comment:1 by Artem Russakovskii, 7 months ago

Description: modified (diff)

comment:2 by Maxim Dounin, 6 months ago

Most likely there was an issue with writing temporary files into proxy_temp_path. When nginx cannot write temporary files, the result looks exactly as described: when a response is larger than what can be held in memory buffers, nginx tries to write the extra data to a temporary file, the write fails, and nginx closes the connection. The exact amount of data sent to the client may vary depending on the client's bandwidth.

Typical problems with proxy_temp_path include incorrect permissions and a missing proxy_temp_path directory. nginx checks permissions and creates the appropriate directories, but only on startup, so if, for example, the directories were removed during a package upgrade, things will stay broken until nginx is restarted. Given that in your case all nginx temporary directories are under /var/lib/nginx/, this is likely what happened if the package was upgraded or reinstalled and removed /var/lib/nginx/ with all directories in it while nginx was running.
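
As an illustration of the check described above, here is a minimal Python sketch that tests whether each temporary directory exists and can actually be written to. The directory list is copied from the configure arguments in this ticket; the function name check_temp_dir and its return strings are made up for this example and are not part of nginx. The result is only meaningful when the script runs as the same user as the nginx worker processes (--user=nginx in this build).

```python
import os
import tempfile

def check_temp_dir(path):
    """Return 'ok', 'missing', 'not a directory', or 'not writable' for path.

    Hypothetical helper for illustration; not an nginx tool.
    """
    if not os.path.exists(path):
        return "missing"
    if not os.path.isdir(path):
        return "not a directory"
    try:
        # Create and immediately delete a real file: this exercises the same
        # kind of operation that fails when a temp directory is unusable.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError:
        return "not writable"
    return "ok"

# Temp paths as given in the configure arguments of this ticket.
TEMP_DIRS = [
    "/var/lib/nginx/tmp/",
    "/var/lib/nginx/proxy/",
    "/var/lib/nginx/fastcgi/",
    "/var/lib/nginx/uwsgi/",
    "/var/lib/nginx/scgi/",
]

if __name__ == "__main__":
    for d in TEMP_DIRS:
        print(d, check_temp_dir(d))
```

A result of "missing" for a directory while nginx is still running would be consistent with the package-upgrade scenario, but it does not prove it.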

You may want to check the nginx error log for details. Also, if the above assumption about a package upgrade breaking things is true, it might be a good idea to report the problem to the package maintainer.
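
One rough way to narrow down the error log is sketched below in Python. It keeps only lines logged at error severity or higher that also mention the temporary directory prefix. The prefix /var/lib/nginx/ comes from the configure arguments in this ticket; the function names and the idea of filtering by path substring are assumptions of this sketch, not an nginx feature, and the exact wording of the messages is deliberately not relied upon.

```python
import re

# Severity tags as they appear in nginx error log lines, e.g. "[crit]".
SEVERE = re.compile(r"\[(error|crit|alert|emerg)\]")

def temp_path_errors(lines, temp_prefix="/var/lib/nginx/"):
    """Return lines at error level or above that mention temp_prefix.

    Hypothetical helper for illustration; temp_prefix is an assumption
    based on the configure arguments in this ticket.
    """
    return [ln.rstrip("\n") for ln in lines
            if SEVERE.search(ln) and temp_prefix in ln]

def scan_file(path, temp_prefix="/var/lib/nginx/"):
    """Apply temp_path_errors to a log file on disk."""
    with open(path, errors="replace") as fh:
        return temp_path_errors(fh, temp_prefix)
```

With this build, scan_file("/var/log/nginx/error.log") would read the log at the configured --error-log-path. An empty result does not rule out a temp file problem, since the log may have been rotated or the error_log level may be set differently in the configuration.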

comment:3 by Maxim Dounin, 6 months ago

Resolution: worksforme
Status: new → closed

Feedback timeout.

comment:4 by Artem Russakovskii, 6 months ago

To clarify our config: proxy_temp_path isn't set at all, so it's whatever the default value is. There was a lot of free disk space, over 1TB.

I'm still not sure what happened, but this does seem to have been a system issue, as I discovered that the server's access to attached block storage had hung as well. I'm not sure whether it's a kernel issue or something related to block storage (and nginx doesn't use this block storage for configuration or temp file storage).

So worksforme is an acceptable resolution here.
