Opened 5 months ago

Closed 5 months ago

Last modified 5 months ago

#2183 closed enhancement (worksforme)

Don't do simultaneous download from origin

Reported by: ifel@… Owned by:
Priority: minor Milestone:
Component: nginx-module Version: 1.14.x
Keywords: Cc:
uname -a: Linux cc001.eag1.facebook.com 5.2.9-228_fbk15_4185_g357f49b36602 #1 SMP Mon Aug 10 09:34:08 PDT 2020 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.14.1
built by gcc 8.2.1 20180905 (Red Hat 8.2.1-3) (GCC)
built with OpenSSL 1.1.1 FIPS 11 Sep 2018 (running with OpenSSL 1.1.1g FIPS 21 Apr 2020)
TLS SNI support enabled
configure arguments: --prefix=/usr/share/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/tmp/client_body --http-proxy-temp-path=/var/lib/nginx/tmp/proxy --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --http-scgi-temp-path=/var/lib/nginx/tmp/scgi --pid-path=/run/nginx.pid --lock-path=/run/lock/subsys/nginx --user=nginx --group=nginx --with-file-aio --with-ipv6 --with-http_ssl_module --with-http_v2_module --with-http_realip_module --with-http_addition_module --with-http_xslt_module=dynamic --with-http_image_filter_module=dynamic --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_degradation_module --with-http_slice_module --with-http_stub_status_module --with-http_perl_module=dynamic --with-http_auth_request_module --with-mail=dynamic --with-mail_ssl_module --with-pcre --with-pcre-jit --with-stream=dynamic --with-stream_ssl_module --with-debug --with-cc-opt='-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -Wl,-E'

Description

We use nginx as a caching proxy. When a request comes in, nginx calculates a cache key and checks whether the object is in the cache and has not expired. If the object is missing or expired, nginx fetches it from the upstream and stores it in a temp file while fetching.
If it receives requests for the same object (and the same cache key) from several clients simultaneously, it starts several simultaneous fetches from the upstream and writes several temp files with the same content.
If the objects are large (images, packages, data blobs), this behavior causes several problems:

  1. Disk space usage - it spikes, because instead of taking 2GB of disk space to store one 2GB file, nginx needs to store 10-20 copies of that 2GB file during fetching, so 2GB vs 20-40GB. If one does not have a large disk, or prefers to use most of it for the cache, this ends up in errors like this:

"2021/05/10 12:00:29 [crit] 1040271#0: *53636648 pwritev() "/data/package_cache/pkg/3495c212e5b5ec27b0e55fe6746b4264.0000639744" failed (28: No space left on device) while reading upstream"

  2. Bandwidth usage - again, instead of downloading 2GB over the wire, it has to download many times more
  3. Fetch takes longer because of network utilization
  4. Load on an upstream/backend

It would be great if nginx checked whether other requests are already fetching this cache key and simply waited until that fetch completes and the object is available in the cache. In this case only the first connection would do a fetch, which would speed up all the responses and save resources.
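
The requested behavior can be illustrated with a minimal sketch in Python. This is not nginx code and makes no claim about how nginx implements caching. The class `SingleFlightCache`, the `slow_fetch` stand-in for the upstream, and the key `pkg/large.bin` are all hypothetical names chosen for illustration. The idea is that the first requester for a key performs the fetch, and every other requester for the same key waits for it to finish and then reads the cached result.

```python
import threading
import time


class SingleFlightCache:
    """Hypothetical cache: at most one upstream fetch per key at a time."""

    def __init__(self, fetch):
        self._fetch = fetch            # callable(key) -> bytes; stands in for the upstream
        self._lock = threading.Lock()  # guards _cache and _inflight
        self._cache = {}               # key -> cached object
        self._inflight = {}            # key -> Event that is set when the fetch ends

    def get(self, key):
        while True:
            with self._lock:
                if key in self._cache:
                    return self._cache[key]
                done = self._inflight.get(key)
                leader = done is None
                if leader:
                    # First requester for this key: register the in-flight fetch.
                    done = threading.Event()
                    self._inflight[key] = done
            if leader:
                try:
                    value = self._fetch(key)
                    with self._lock:
                        self._cache[key] = value
                    return value
                finally:
                    # Wake the waiters whether the fetch succeeded or failed.
                    with self._lock:
                        del self._inflight[key]
                    done.set()
            # Another requester is fetching: wait, then re-check the cache.
            # If that fetch failed, one of the waiters becomes the new leader.
            done.wait()


def demo(clients=10):
    """Run `clients` simultaneous requests for one key; return (fetch count, results)."""
    calls = []

    def slow_fetch(key):
        calls.append(key)
        time.sleep(0.1)  # simulate a slow download of a large object
        return b"payload for " + key.encode()

    cache = SingleFlightCache(slow_fetch)
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(cache.get("pkg/large.bin")))
        for _ in range(clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(calls), results
```

Calling `demo(10)` issues ten simultaneous requests for the same key, yet the upstream stand-in runs only once, so only one copy of the object is downloaded and stored.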

Change History (2)

comment:1 by Maxim Dounin, 5 months ago

Resolution: worksforme
Status: new → closed

comment:2 by ifel@…, 5 months ago

Thank you, I overlooked these.
