Opened 7 years ago

Closed 7 years ago

#1174 closed defect (fixed)

Partial downloads (sendfile)

Reported by: pokotilenko@… Owned by:
Priority: minor Milestone:
Component: nginx-core Version: 1.10.x
Keywords: sendfile partial download client timeout Cc:
uname -a: Linux **** 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.10.0 (Ubuntu)
built with OpenSSL 1.0.2g 1 Mar 2016
TLS SNI support enabled
configure arguments: --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_geoip_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_image_filter_module --with-http_v2_module --with-http_sub_module --with-http_xslt_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads

Description

I'm seeing partial downloads. Investigation come to nginx closing connection mid-file on "client timed out (110: Connection timed out) while sending response to client".

See attached portion of debug log. There is a sequence of sendfiles, delay, and a client timeout (send_timeout 150 during this capture).

I've been watching socket send buffer (netstat -ntp | grep nginx | grep CLIENT_IP). While download is progressing send buffer is non-zero, like this:

tcp 0 2944800 SERVER_IP:80 CLIENT_IP:60338 ESTABLISHED 30723/nginx: worker

When downloading hangs, send-buffer is zero like this:
tcp 0 0 SERVER_IP:80 CLIENT_IP:60476 ESTABLISHED 5247/nginx: worker

At this time no write event is happening and no more sendfiles tried.

On a server which is dedicated for testing this issue I can 100% reproduce this after flushing linux fs cache.

Either of those helps:

  • sendfile off
  • aio threads
  • sendfile_max_chunk 128k

If I get it right, with "aio off" sendfile is blocking and regarding debug log it doesn't block for long, but it seems that under some condition the connection socket fd is not put to wait for output availability so no output (sendfile) happening.

I have tcp_nopush and tcp_nodelay enabled.

Attachments (6)

debug.log (4.5 KB ) - added by pokotilenko@… 7 years ago.
portion of debug.log
main_debug.log (86.4 KB ) - added by pokotilenko@… 7 years ago.
full debug.log (main)
location_debug.log (83.9 KB ) - added by pokotilenko@… 7 years ago.
full debug.log (location)
sendfile_test.c (6.2 KB ) - added by pokotilenko@… 7 years ago.
Direct epoll/sendfile test
sendfile_test.2.c (7.7 KB ) - added by pokotilenko@… 7 years ago.
Direct epoll/sendfile test with signal interruption
sendfile_test_loop.c (8.8 KB ) - added by pokotilenko@… 7 years ago.
Direct epoll/sendfile test with signal interruption and sendfile loop

Download all attachments as: .zip

Change History (23)

by pokotilenko@…, 7 years ago

Attachment: debug.log added

portion of debug.log

comment:1 by Maxim Dounin, 7 years ago

The debug log snippet provided is not complete and doesn't contain low-level event operations (you need to enable debug log at the global level for event operations to be logged, see here). Nevertheless, available logs suggests that the kernel reports that sendfile() wasn't able to send the whole file:

2017/01/09 21:39:13 [debug] 24708#24708: *28972226 sendfile: @23226945 4405733
2017/01/09 21:39:13 [debug] 24708#24708: *28972226 sendfile: 325055 of 4405733 @23226945

and then it doesn't notify nginx about further writes being possible - while it should, as nginx registers the socket for write events when it starts sending the response.

Most likely reason for this, as far as I understand, is that the file was modified while being returned, and the rest of the file is no longer available. Unfortunately, it is not always possible to distinguish such incorrect file modifications from an exhausted socket send buffer, but in most cases nginx should be able to report sendfile() reported that ... was truncated at ... alerts. If you see such errors anywhere in the logs, or similar errors appear when you switch off sendfile - that's definitely the case. Please make sure you don't modify files being served, update them atomically instead (that is: write a new file, and then rename it).

Two other alternatives would be:

  • An obscure nginx bug, which causes it to loose and/or ignore the even from the kernel for some reason. Full debug log is needed to see if it's the case.
  • A kernel bug. In this case you may want to debug it further to confirm and reproduce it without nginx, and then report to kernel developers as appropriate.

comment:2 by pokotilenko@…, 7 years ago

The file in question has not been modified for a long time, it is surely not being modified during tests. File's last modification date is almost a month.

Please see full debug logs attached.

Also, as it comes to nginx vs kernel issue I tried to reproduce this issue directly with epoll/sendfile with no success. Source code of direct test attached. With this test I always get complete file even with caches flushed.

by pokotilenko@…, 7 years ago

Attachment: main_debug.log added

full debug.log (main)

by pokotilenko@…, 7 years ago

Attachment: location_debug.log added

full debug.log (location)

by pokotilenko@…, 7 years ago

Attachment: sendfile_test.c added

Direct epoll/sendfile test

comment:3 by pokotilenko@…, 7 years ago

Today installed kernel 4.8.0 and made tests on it. Issue persists:

Linux **** 4.8.0-34-generic #36~16.04.1-Ubuntu SMP Wed Dec 21 18:55:08 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

comment:4 by Maxim Dounin, 7 years ago

It looks like you are using timer_resolution. Please try without it to see if it helps.

comment:5 by pokotilenko@…, 7 years ago

Meantime tried different nginx versions from offsite ubuntu repositiories (http://nginx.org/en/linux_packages.html):

  • 1.10.2-1~xenial
  • 1.11.8-1~xenial

This is 4.8.0 kernel. Issue persists.

comment:6 by pokotilenko@…, 7 years ago

Not using timer_resolution solves this issue for all tested cases:

  • kernel 4.4.0
    • 1.10.0 (Ubuntu)
  • kernel 4.8.0
    • 1.10.0 (Ubuntu)
    • 1.10.2-1~xenial
    • 1.11.8-1~xenial

Thanks.

comment:7 by Maxim Dounin, 7 years ago

Ok, thanks for testing. It looks like recent kernel changes in the sendfile() system call broke things. Now sendfile() can be interrupted in the middle of transmission, and the result seems to indistinguishable from a normal blocking on the socket buffer.

To further confirm this, you may also want to check if the issue can be reproduced with older kernels. It shouldn't exists before 4.3.

Not sure it can be properly solved on nginx side, as the existing sendfile() interface on linux doesn't allow to tell if the call was interrupted or exhausted socket buffer. The only option with the current behaviour seems to be to always assume EINTR if sendfile() returns less than it was asked to send.

Some related links, just for the record:

https://patchwork.kernel.org/patch/7374491/
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=296291cdd1629c308114504b850dc343eabc2782
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c725bfce7968009756ed2836a8cd7ba4dc163011

comment:8 by pokotilenko@…, 7 years ago

Good catch!

My understanding of what is happening is that:

  • since mentioned kernel commit sendfile() may be interrupted by a signal
  • if sendfile() is interrupted by a signal it:
    • cannot tell about that (it lacks EINTR errno)
    • does not fill up socket output buffer
  • if Edge-Triggered epoll mode is used and the socket output buffer have not been filled up - epoll_wait will not return socket as ready for output because: "edge-triggered mode only delivers events when changes occur on the monitored file descriptor"
  • the mentioned kernel commit makes sendfile() unsafe (unusable) in Edge-Triggered epoll mode.

But, even with new behaviour sendfile() still work good in Level-Triggered epoll mode.

I've done more tests and can confirm this behaviour with a simple test program (re-attached):

  • usage:
    ./sendfile_test
    Usage: ./sendfile_test bind_addr bind_port file_name timer_usec [et]
    	timer_usec: set timer to timer_usec microseconds, programm is
                        signalled with SIGALRM each time timer expires
    	et: if set use edge triggered epoll mode
    
  • this always hang:
    ./sendfile_test localhost 8888 a_20MB_file 1000 et
    
  • this never hang:
    ./sendfile_test localhost 8888 a_20MB_file 1000
    

While running sendfile_test emulate client with echo 3 > /proc/sys/vm/drop_caches; nc localhost 8888 > file.out

Maybe it's worth posting to kernel list about sendfile() now being unusable with socket watched by Edge-Triggered epoll mode.

On other hand the workaround is to use Level-Triggered epoll mode if it is affordable.

Last edited 7 years ago by pokotilenko@… (previous) (diff)

by pokotilenko@…, 7 years ago

Attachment: sendfile_test.2.c added

Direct epoll/sendfile test with signal interruption

comment:9 by pokotilenko@…, 7 years ago

Also on an older 3.2.0 kernel sendfile_test never hang in either Edge-Triggered or Level-Triggered epoll mode.

comment:10 by pokotilenko@…, 7 years ago

BTW, turning off timer_resolution doesn't mean this issue will never pop-up as there are other ways nginx could be signalled (HUP, USR1). Though less often of course.

in reply to:  8 comment:11 by Maxim Dounin, 7 years ago

Maybe it's worth posting to kernel list about sendfile() now being unusable with socket watched by Edge-Triggered epoll mode.

Yes, please do so.

BTW, turning off timer_resolution doesn't mean this issue will never pop-up as there are other ways nginx could be signalled (HUP, USR1). Though less often of course.

Sure, though timer_resolution is more or less the only feature that uses signals in nginx worker processes on Linux. Control signals are normally sent to master, and master in turn talks to worker processes via sockets. So there shouldn't be any signals to interrupt sendfile() during normal operation.

comment:12 by pokotilenko@…, 7 years ago

I've researched docs few more times and made some tests. It seems that my statement about "sendfile() now being unusable with socket watched by Edge-Triggered epoll mode" does not hold true.

Despite sendfile() behaviour change, both sendfile() and epoll_wait() still work as documented.

The thing is that in Edge-Triggered epoll mode it is programmer responsibility to read full socket buffer or fill up socket send buffer until EAGAIN. Otherwise no more Read-ready or Write-ready events will be reported.

This usually means a loop until EAGAIN is returned.

Nginx just tries sendfile once and there is a risk to not feed send buffer enough. This have not been making problems before mentioned kernel commit as sendfile() was uninterruptible.

So, generally, I see 2 options here:

  • do sendfile() in a loop until EAGAIN. But, if the socket send is faster than disk read and sendfile() is always interrupted you end-up with an entire file being sent in one loop. It could be plenty of time spent for one socket.
  • use Level-Triggered epoll mode.

I can confirm this behaviour with a new test program (re-attached):

  • usage:
    ./sendfile_test_loop
    Usage: ./sendfile_test_loop bind_addr bind_port file_name timer_usec [et|et_loop]
    	timer_usec: set timer to timer_usec microseconds, program is
                        signalled with SIGALRM each time timer expires
    	et: if set use edge triggered epoll mode
    	et_loop: if set use edge triggered epoll mode and sendfile in a loop until EAGAIN
    
  • this always hang:
    ./sendfile_test_loop localhost 8888 a_20MB_file 100 et
    
  • this never hang:
    ./sendfile_test_loop localhost 8888 a_20MB_file 100 et_loop
    
  • this also never hang (Level-Triggered without a loop):
    ./sendfile_test_loop localhost 8888 a_20MB_file 100
    

by pokotilenko@…, 7 years ago

Attachment: sendfile_test_loop.c added

Direct epoll/sendfile test with signal interruption and sendfile loop

comment:13 by Maxim Dounin, 7 years ago

Sure, this is basically what

The only option with the current behaviour seems to be to always assume EINTR if sendfile() returns less than it was asked to send.

in comment:7 means. The new behaviour of sendfile() on Linux doesn't allow effective handling and requires one to always call sendfile() in a loop till EAGAIN. Previous behaviour, as well as sendfile() implementations on other OSes, where EINTR is explicitly reported, allows more effective handling.

comment:14 by pokotilenko@…, 7 years ago

That is true for Edge-Triggered epoll mode.

What is the point of sticking with ET?

in reply to:  14 comment:15 by Valentin V. Bartenev, 7 years ago

Replying to pokotilenko@…:

That is true for Edge-Triggered epoll mode.

What is the point of sticking with ET?

LT requires additional handling (i.e. additional calls of epoll_ctl()) when you don't want to read data from the socket at the moment.

comment:16 by Maxim Dounin <mdounin@…>, 7 years ago

In 6949:ff0c8e11edbc/nginx:

Simplified and improved sendfile() code on Linux.

The ngx_linux_sendfile() function is now used for both normal sendfile()
and sendfile in threads. The ngx_linux_sendfile_thread() function was
modified to use the same interface as ngx_linux_sendfile(), and is simply
called from ngx_linux_sendfile() when threads are enabled.

Special return code NGX_DONE is used to indicate that a thread task was
posted and no further actions are needed.

If number of bytes sent is less that what we were sending, we now always
retry sending. This is needed for sendfile() in threads as the number
of bytes we are sending might have been changed since the thread task
was posted. And this is also needed for Linux 4.3+, as sendfile() might
be interrupted at any time and provides no indication if it was interrupted
or not (ticket #1174).

comment:17 by Maxim Dounin, 7 years ago

Resolution: fixed
Status: newclosed

Fix committed.

Note: See TracTickets for help on using tickets.