Opened 5 years ago

Closed 5 years ago

#1725 closed defect (invalid)

nginx ends up in a state where current and oldbin exist for days

Reported by: arekm.maven.pl@…
Owned by:
Priority: minor
Milestone:
Component: nginx-core
Version: 1.15.x
Keywords:
Cc:
uname -a: Linux web10 4.19.20-1 #1 SMP Thu Feb 7 13:20:21 CET 2019 x86_64 Intel(R)_Xeon(R)_CPU___________E5405__@_2.00GHz PLD Linux
nginx -V: # nginx -V
nginx version: nginx/1.15.8
built by gcc 8.2.0 20180726 (release) (PLD-Linux)
built with OpenSSL 1.0.2o 27 Mar 2018
TLS SNI support enabled
configure arguments: --prefix=/usr --modules-path=/usr/lib64/nginx/modules --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/lock/subsys/nginx --http-client-body-temp-path=/var/cache/nginx/client_body_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-select_module --with-poll_module --with-http_perl_module=dynamic --with-http_image_filter_module=dynamic --with-http_xslt_module=dynamic --with-http_geoip_module=dynamic --with-stream_geoip_module=dynamic --with-mail=dynamic --with-mail_ssl_module --with-stream=dynamic --with-stream_ssl_module --with-cc=x86_64-pld-linux-gcc --with-cc-opt='-O2 -fwrapv -pipe -Wformat -Werror=format-security -gdwarf-4 -fno-debug-types-section -fvar-tracking-assignments -g2 -Wp,-D_FORTIFY_SOURCE=2 -fstack-protector-strong --param=ssp-buffer-size=4 -fPIC -march=x86-64 -gdwarf-4 -fno-debug-types-section -fvar-tracking-assignments -g2' --with-ld-opt='-Wl,--as-needed -Wl,--no-copy-dt-needed-entries -Wl,-z,relro -Wl,-z,combreloc' --with-http_addition_module --with-http_dav_module --with-http_flv_module --with-http_sub_module --with-http_realip_module --with-http_stub_status_module --with-http_ssl_module --without-http_browser_module --with-http_auth_request_module --with-threads --with-http_v2_module --with-http_secure_link_module

Description

nginx 1.15.8

It was running, and on 2019-02-12 at 11:41 it was reloaded using the -USR2 method.

Unfortunately, it got stuck somehow. Two days later I still have both the oldbin and
the current processes.

At this moment all new connections still go to oldbin.

This also led to a situation where the log rotation script signalled only the current pid while oldbin still logged
to the old files (which eventually got rotated and compressed). This part is now solved here: I'm signalling both the oldbin pid and the current pid with -USR1.
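
A minimal sketch of that log rotation workaround, assuming the pid file location from --pid-path above (/var/run/nginx.pid) and the .oldbin suffix seen in this report. The function name reopen_logs is hypothetical; it is not part of nginx or of the actual rotation script used here.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): ask every nginx master
# that has a pid file to reopen its logs by sending it USR1.
# $1 is the main pid file; the old master's file is assumed to be "$1.oldbin".
reopen_logs() {
    pidfile=${1:-/var/run/nginx.pid}
    for f in "$pidfile" "$pidfile.oldbin"; do
        # Skip pid files that are missing or empty.
        [ -s "$f" ] || continue
        pid=$(cat "$f")
        # Report only the masters that were actually signalled.
        if kill -USR1 "$pid" 2>/dev/null; then
            echo "USR1 -> $pid"
        fi
    done
}
```

It could be called from a postrotate hook as `reopen_logs /var/run/nginx.pid`; when no .oldbin file exists, only the current master is signalled.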

        |-nginx(7996)-+-nginx(29556)-+-nginx(12180,nginx)
        |             |              |-nginx(12181,nginx)
        |             |              |-nginx(12182,nginx)
        |             |              |-nginx(12183,nginx)
        |             |              |-nginx(12184,nginx)
        |             |              |-nginx(12186,nginx)
        |             |              |-nginx(12187,nginx)
        |             |              |-nginx(12188,nginx)
        |             |              `-nginx(12189,nginx)
        |             |-nginx(19196,nginx)
        |             |-nginx(19197,nginx)
        |             |-nginx(19199,nginx)
        |             |-nginx(19200,nginx)
        |             |-nginx(19202,nginx)
        |             |-nginx(19203,nginx)
        |             |-nginx(19204,nginx)
        |             |-nginx(19205,nginx)
        |             `-nginx(19206,nginx)


# ps aux|grep nginx
root      7996  0.0  1.6 1070388 557612 ?      Ss   lut10   0:01 nginx: master process /usr/sbin/nginx -c /etc/nginx/my-nginx.conf
nginx    12180  0.0  1.6 1072480 556700 ?      SN   07:05   0:00 nginx: worker process
nginx    12181  0.0  1.6 1072480 556700 ?      SN   07:05   0:00 nginx: worker process
nginx    12182  0.0  1.6 1072480 556700 ?      SN   07:05   0:00 nginx: worker process
nginx    12183  0.0  1.6 1072480 556700 ?      SN   07:05   0:00 nginx: worker process
nginx    12184  0.0  1.6 1072480 556700 ?      SN   07:05   0:00 nginx: worker process
nginx    12186  0.0  1.6 1072480 556700 ?      SN   07:05   0:00 nginx: worker process
nginx    12187  0.0  1.6 1072480 556700 ?      SN   07:05   0:00 nginx: worker process
nginx    12188  0.0  1.6 1072480 556700 ?      SN   07:05   0:00 nginx: worker process
nginx    12189  0.0  1.6 1072480 556504 ?      S    07:05   0:00 nginx: cache manager process
root     17915  0.0  0.0 236088   860 pts/1    S+   08:55   0:00 grep nginx
nginx    19196  0.8  1.7 1070388 586616 ?      SN   lut11  34:46 nginx: worker process
nginx    19197  0.3  1.7 1070388 586608 ?      SN   lut11  15:35 nginx: worker process
nginx    19199  0.0  1.7 1070388 585968 ?      SN   lut11   0:25 nginx: worker process
nginx    19200  0.0  1.7 1070388 584620 ?      SN   lut11   0:03 nginx: worker process
nginx    19202  0.0  1.7 1070388 582268 ?      SN   lut11   0:01 nginx: worker process
nginx    19203  0.0  1.7 1070388 577480 ?      SN   lut11   0:00 nginx: worker process
nginx    19204  0.0  1.7 1070388 577292 ?      SN   lut11   0:00 nginx: worker process
nginx    19205  0.0  1.7 1070388 573752 ?      SN   lut11   0:00 nginx: worker process
nginx    19206  0.0  1.6 1070388 556756 ?      S    lut11   0:10 nginx: cache manager process
root     29556  0.0  1.7 1072480 562952 ?      S    lut12   0:19 nginx: master process /usr/sbin/nginx -c /etc/nginx/my-nginx.conf
2019/02/12 11:41:44 [notice] 29556#29556: using inherited sockets from "6;7;"
-rw-r--r-- 1 root root 6 02-12 11:41 /var/run/nginx.pid
-rw-r--r-- 1 root root 5 02-10 22:55 /var/run/nginx.pid.oldbin
# cat /var/run/nginx.pid
29556
# cat /var/run/nginx.pid.oldbin
7996
# lsof -n -p 29556
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF      NODE NAME
nginx   29556 root  cwd    DIR                8,3      284        96 /
nginx   29556 root  rtd    DIR                8,3      284        96 /
nginx   29556 root  txt    REG                8,3  1003768     37799 /usr/sbin/nginx
nginx   29556 root  DEL    REG                0,5            8388265 /dev/zero
nginx   29556 root  DEL    REG                0,5            8388264 /dev/zero
nginx   29556 root  DEL    REG                0,5            8388263 /dev/zero
nginx   29556 root  DEL    REG                0,5            8388262 /dev/zero
nginx   29556 root  DEL    REG                0,5            8388261 /dev/zero
nginx   29556 root  DEL    REG                0,5            8388260 /dev/zero
nginx   29556 root  DEL    REG                0,5            8388259 /dev/zero
nginx   29556 root  DEL    REG                0,5            8388258 /dev/zero
nginx   29556 root  mem    REG                8,3  8324458    403832 /var/lib/nscd/hosts
nginx   29556 root  mem    REG                8,3   217032    403831 /var/lib/nscd/group
nginx   29556 root  mem    REG                8,3   999959    403830 /var/lib/nscd/passwd
nginx   29556 root  mem    REG                8,3    10224 201327257 /lib64/libfreebl3.so
nginx   29556 root  mem    REG                8,3  1848936 201326868 /lib64/libc-2.28.so
nginx   29556 root  mem    REG                8,3    92552 201382109 /lib64/libz.so.1.2.11
nginx   29556 root  mem    REG                8,3  2447912 201400123 /lib64/libcrypto.so.1.0.0
nginx   29556 root  mem    REG                8,3   485608 201400124 /lib64/libssl.so.1.0.0
nginx   29556 root  mem    REG                8,3   469120 217524230 /lib64/libpcre.so.1.2.9
nginx   29556 root  mem    REG                8,3    47344 202428497 /lib64/libcrypt-2.28.so
nginx   29556 root  mem    REG                8,3   146792 201326916 /lib64/libpthread-2.28.so
nginx   29556 root  mem    REG                8,3    14464 201326870 /lib64/libdl-2.28.so
nginx   29556 root  mem    REG                8,3   191536 201819215 /lib64/ld-2.28.so
nginx   29556 root  DEL    REG                0,5            8388266 /dev/zero
nginx   29556 root    0u   CHR                1,3      0t0         9 /dev/null
nginx   29556 root    1u   CHR                1,3      0t0         9 /dev/null
nginx   29556 root    2w   REG              254,1   151895 136031228 /var/log/nginx/error.log
nginx   29556 root    3u  unix 0x000000004c597e0d      0t0  19472910 type=STREAM
nginx   29556 root    4u  unix 0x000000005e439593      0t0  19472911 type=STREAM
nginx   29556 root    5w   REG              254,1   151895 136031228 /var/log/nginx/error.log
nginx   29556 root    6u  IPv4              45078      0t0       TCP *:http (LISTEN)
nginx   29556 root    7u  IPv4              45079      0t0       TCP *:https (LISTEN)
nginx   29556 root    8w   REG              254,1  5171549 136031227 /var/log/nginx/access.log
nginx   29556 root   27u  unix 0x00000000ba2d8caa      0t0  19472912 type=STREAM
nginx   29556 root   28u  unix 0x0000000035206ed8      0t0  19472913 type=STREAM
nginx   29556 root   29u  unix 0x0000000059ce7eb5      0t0  19472914 type=STREAM
nginx   29556 root   30u  unix 0x00000000e5b141b3      0t0  19472915 type=STREAM
nginx   29556 root   31u  unix 0x0000000094aa6157      0t0  19472916 type=STREAM
nginx   29556 root   32u  unix 0x0000000073072da2      0t0  19472917 type=STREAM
nginx   29556 root   33u  unix 0x00000000f02a5505      0t0  19472918 type=STREAM
nginx   29556 root   34u  unix 0x00000000a4cd0be7      0t0  19472919 type=STREAM
nginx   29556 root   35u  unix 0x000000001a7c683b      0t0  19472920 type=STREAM
nginx   29556 root   36u  unix 0x0000000056f2b71f      0t0  19472921 type=STREAM
nginx   29556 root   37u  unix 0x000000001d7bacf6      0t0  19472922 type=STREAM
nginx   29556 root   38u  unix 0x0000000025b4a62c      0t0  19472923 type=STREAM
nginx   29556 root   39u  unix 0x0000000001863fa7      0t0  19472924 type=STREAM
nginx   29556 root   40u  unix 0x000000000057ad8b      0t0  19472925 type=STREAM
nginx   29556 root   41u  unix 0x000000004e0d2c27      0t0  19472926 type=STREAM
nginx   29556 root   42u  unix 0x00000000de5dcc32      0t0  19472927 type=STREAM
# cat /proc/29556/stack
[<0>] sigsuspend+0x46/0x80
[<0>] __se_sys_rt_sigsuspend+0x44/0x80
[<0>] do_syscall_64+0x55/0x100
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff
# gdb /usr/sbin/nginx 29556
GNU gdb (GDB) 8.1-1 (PLD Linux)
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pld-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/nginx...Reading symbols from /usr/lib/debug/usr/sbin/nginx.debug...done.
done.
Attaching to program: /usr/sbin/nginx, process 29556
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Reading symbols from /lib64/libcrypt.so.1...(no debugging symbols found)...done.
Reading symbols from /lib64/libpcre.so.1...(no debugging symbols found)...done.
Reading symbols from /lib64/libssl.so.1.0.0...(no debugging symbols found)...done.
Reading symbols from /lib64/libcrypto.so.1.0.0...(no debugging symbols found)...done.
Reading symbols from /lib64/libz.so.1...(no debugging symbols found)...done.
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Reading symbols from /lib64/libfreebl3.so...(no debugging symbols found)...done.
0x00007f8d220761ad in sigsuspend () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f8d220761ad in sigsuspend () from /lib64/libc.so.6
#1  0x000000000043ad4a in ngx_master_process_cycle (cycle=0xefb2770) at src/os/unix/ngx_process_cycle.c:164
#2  0x0000000000412275 in main (argc=<optimized out>, argv=<optimized out>) at src/core/nginx.c:382
(gdb) frame 1
#1  0x000000000043ad4a in ngx_master_process_cycle (cycle=0xefb2770) at src/os/unix/ngx_process_cycle.c:164
164	        sigsuspend(&set);
(gdb) l
159	            }
160	        }
161
162	        ngx_log_debug0(NGX_LOG_DEBUG_EVENT, cycle->log, 0, "sigsuspend");
163
164	        sigsuspend(&set);
165
166	        ngx_time_update();
167
168	        ngx_log_debug1(NGX_LOG_DEBUG_EVENT, cycle->log, 0,

Change History (2)

comment:1 by arekm.maven.pl@…, 5 years ago

Can be marked as invalid.

QUIT was not sent because the script doing the upgrade waited only 1s for the oldbin pid file to appear, which was not enough here.
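
A rough sketch of how such a script could wait longer before sending QUIT, assuming the same pid file paths as in this report. The function names wait_for_pidfile and upgrade_nginx and the 30s default are hypothetical and do not come from the actual script.

```shell
#!/bin/sh
# Hypothetical helper: wait up to $2 seconds (default 30) for pid file $1
# to exist and be non-empty. Returns 0 on success, 1 on timeout.
wait_for_pidfile() {
    file=$1
    timeout=${2:-30}
    waited=0
    while [ ! -s "$file" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Sketch of the upgrade sequence: USR2 to the running master, wait for
# nginx.pid.oldbin and the new nginx.pid, then QUIT the old master.
upgrade_nginx() {
    pidfile=${1:-/var/run/nginx.pid}
    timeout=${2:-30}
    old_pid=$(cat "$pidfile") || return 1
    kill -USR2 "$old_pid" || return 1
    if wait_for_pidfile "$pidfile.oldbin" "$timeout" &&
       wait_for_pidfile "$pidfile" "$timeout"; then
        kill -QUIT "$(cat "$pidfile.oldbin")"
    else
        # Do not leave the failure silent: the old master is still running.
        echo "timed out waiting for pid files; old master $old_pid left running" >&2
        return 1
    fi
}
```

The point of the design is that a timeout is reported as a failure instead of being skipped silently, so a leftover oldbin master like the one above would be noticed right away.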

comment:2 by Sergey Kandaurov, 5 years ago

Resolution: invalid
Status: new → closed

per reporter request
