Opened 2 weeks ago

Last modified 3 days ago

#2254 new enhancement

cache loader ignores reopen signal

Reported by: mirek.chocholous.showmax.com@… Owned by:
Priority: minor Milestone:
Component: nginx-core Version: 1.19.x
Keywords: log reopen Cc:
uname -a: Linux redacted.showmax.com 4.19.0-17-amd64 #1 SMP Debian 4.19.194-3 (2021-07-18) x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.20.0
built with OpenSSL 1.1.1d 10 Sep 2019
TLS SNI support enabled
configure arguments: --prefix=/opt/nginx/nginx --with-cc-opt='-O2 -g -O3 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -fPIC -D_FORTIFY_SOURCE=2' --add-module=../ngx_devel_kit-0.3.1 --add-module=../echo-nginx-module-0.62 --add-module=../xss-nginx-module-0.06 --add-module=../ngx_coolkit-0.2 --add-module=../set-misc-nginx-module-0.32 --add-module=../form-input-nginx-module-0.12 --add-module=../encrypted-session-nginx-module-0.08 --add-module=../srcache-nginx-module-0.32 --add-module=../ngx_lua-0.10.20 --add-module=../ngx_lua_upstream-0.07 --add-module=../headers-more-nginx-module-0.33 --add-module=../array-var-nginx-module-0.05 --add-module=../memc-nginx-module-0.19 --add-module=../redis2-nginx-module-0.15 --add-module=../redis-nginx-module-0.3.7 --add-module=../rds-json-nginx-module-0.15 --add-module=../rds-csv-nginx-module-0.09 --add-module=../ngx_stream_lua-0.0.10 --with-ld-opt='-Wl,-rpath,/opt/nginx/luajit/lib -fPIE' --sbin-path=/opt/nginx/sbin/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --modules-path=/opt/nginx/modules --http-client-body-temp-path=/opt/nginx/body --http-fastcgi-temp-path=/opt/nginx/fastcgi --http-proxy-temp-path=/opt/nginx/proxy --modules-path=/opt/nginx/modules --http-client-body-temp-path=/opt/nginx/body --http-fastcgi-temp-path=/opt/nginx/fastcgi --http-proxy-temp-path=/opt/nginx/proxy --with-threads --with-file-aio --with-http_v2_module --with-http_slice_module --with-http_realip_module --with-http_stub_status_module --add-module=/nginx/openresty/nginx-1.20.0.1/bundle/nginx-http-slice --add-module=/nginx/openresty/nginx-1.20.0.1/bundle/nginx-upstream-dynamic-servers --add-module=/nginx/openresty/nginx-1.20.0.1/bundle/nginx-module-vts --with-stream --with-stream_ssl_module --with-stream_ssl_preread_module --with-http_ssl_module

Description

Hi,
in our environment we use logrotate to rotate the logs regularly. We call nginx -s reopen after the logrotate and most of the times it works OK. The only exception is when nginx was restarted and cache loader is reading all the information about cached data from disk.

When cache loader is running it ignores the nginx -s reopen and keeps the file descriptor to the original log. When the rotated log file is deleted the disk space can't be released and that may lead to full log partition.

Steps to reproduce:

# cat /etc/logrotate.d/cdn_nginx_access
/var/log/nginx/access.log {

create 0640 www-data adm
daily
missingok
nocopytruncate
notifempty
rotate 0
sharedscripts
postrotate

nginx -s reopen>/dev/null 2>&1

endscript
prerotate

if [ -d /etc/logrotate.d/httpd-prerotate ]; then run-parts /etc/logrotate.d/httpd-prerotate; fi

endscript

}

# cat /etc/logrotate.d/cdn_nginx_other

/var/log/nginx/error.log {

create 0640 www-data adm
daily
missingok
nocompress
nocopytruncate
notifempty
rotate 0
sharedscripts
postrotate

nginx -s reopen>/dev/null 2>&1

endscript
prerotate

if [ -d /etc/logrotate.d/httpd-prerotate ]; then run-parts /etc/logrotate.d/httpd-prerotate; fi

endscript

}
# systemctl restart nginx
# lsof /var/log | grep delet
# logrotate -f /etc/logrotate.d/cdn_nginx_access
# logrotate -f /etc/logrotate.d/cdn_nginx_other
# lsof /var/log | grep delet
nginx 26823 www-data 2w REG 9,2 711665 1835015 /var/log/nginx/error.log.1 (deleted)
nginx 26823 www-data 4w REG 9,2 5831379042 1835014 /var/log/nginx/access.log.1 (deleted)
nginx 26823 www-data 5w REG 9,2 711665 1835015 /var/log/nginx/error.log.1 (deleted)

# ps -efww | grep 26823
www-data 26823 26626 0 13:16 ? 00:00:01 nginx: cache loader process

Change History (2)

comment:1 by mirek.chocholous.showmax.com@…, 2 weeks ago

I forgot to mention one important detail: we have got 32 cache devices in place defined like this:

proxy_cache_path /run/cache/cache_01/cache_01 levels=2:2 keys_zone=DASH_cache_28:1024m inactive=720d max_size=2500g loader_threshold=200 loader_files=500 loader_sleep=50 use_temp_path=off;

And each of them currently contains 2TB of data so the cache loader process takes approx week(s) to finish.

Last edited 2 weeks ago by mirek.chocholous.showmax.com@… (previous) (diff)

comment:2 by Maxim Dounin, 3 days ago

Priority: majorminor
Type: defectenhancement

As currently implemented, cache loading happens in one run, without return to the event loop. As a result, reopening log files is only handled when the cache is fully loaded. While reopening logs is in fact implemented, it does not work, since cache loader exits once the cache is loaded. This does not seem to be a major issue though: at most, cache loader will prevent removal of one set of log files, and will release them once the cache loading is finished.

Note that certainly cache loading is not expected to work for weeks: rather, it should be something like hours at most. Cache loading implies degraded performance, and, more importantly, non-working cache management. That is, loading cache for weeks might have much worse consequences than keeping log files from being freed. If it takes weeks in your setup, it might be a good idea to reconsider the configuration. In particular, testing larger loader_files might be a good idea.

Keeping this open as an enhancement, in case somebody is willing to work on rewriting cache loading to be more friendly to very large caches.

Note: See TracTickets for help on using tickets.