Systemd is unable to read the pidfile after nginx start-up due to a race condition
uname -a: Linux n1 4.19.42-204.el7fc.x86_64 #1 SMP Tue May 14 21:31:24 +10 2019 x86_64 x86_64 x86_64 GNU/Linux
nginx version: nginx/1.14.1
built by gcc 7.3.1 20180303 (Red Hat 7.3.1-5) (GCC)
built with OpenSSL 1.0.2k-fips 26 Jan 2017
TLS SNI support enabled
configure arguments: --prefix=/opt/nginx-1.14.1 --pid-path=/var/run/nginx.pid --http-client-body-temp-path=/tmp/nginx-client_body_temp --http-proxy-temp-path=/tmp/nginx-proxy_temp --http-fastcgi-temp-path=/tmp/nginx-fastcgi_temp --http-uwsgi-temp-path=/tmp/nginx-uwsgi_temp --http-scgi-temp-path=/tmp/nginx-scgi_temp --with-pcre --with-pcre-jit --with-threads --with-http_ssl_module --with-http_v2_module --with-http_stub_status_module --with-http_gzip_static_module --with-http_gunzip_module --with-http_realip_module --with-stream --with-stream_ssl_module --with-http_geoip_module=dynamic
Systemd unit nginx.service (as packaged in your RPMs; the same setup is used almost everywhere):
...
[Service]
Type=forking
PIDFile=/var/run/nginx.pid
ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
...
The problem is that $MAINPID is sometimes empty, so a "reload" or "stop" operation won't work. $MAINPID is populated from PIDFile (see man systemd.service), and here we have a race condition:
- the parent nginx process forks and exits; at this moment systemd tries to read PIDFile, which is still empty (because the forked nginx child hasn't written the pid file yet)
- the forked nginx child then writes its pid to PIDFile, but that happens too late
This bug is also reported in the Ubuntu bug tracker (https://bugs.launchpad.net/ubuntu/+source/nginx/+bug/1581864).
There is a workaround, inserting a small delay, and it does alleviate the problem.
The real solution suggested in the Ubuntu bug tracker is to write the pidfile in the parent nginx process after fork but before exiting, which closes the race condition.
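For illustration, the delay workaround mentioned above could be sketched as a systemd drop-in like the one below. This is an assumption, not necessarily the exact form used on the affected system: the drop-in path, the use of ExecStartPost, and the 0.1 second value are all illustrative, chosen only to give the forked nginx child time to write /var/run/nginx.pid before systemd looks for it.

```ini
# Hypothetical drop-in: /etc/systemd/system/nginx.service.d/override.conf
# Assumption: a short pause after ExecStart gives the forked nginx child
# time to write the pid file. The 0.1s value is illustrative, not tuned.
[Service]
ExecStartPost=/bin/sleep 0.1
```

After adding such a drop-in, a `systemctl daemon-reload` followed by a restart of the service would be needed for it to take effect. A delay only narrows the window; it does not remove the race.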
Could you please port the suggested solution to the mainline codebase, or work out something along those lines, to solve this problem?