Opened 9 years ago
Closed 9 years ago
#912 closed defect (invalid)
Worker process crashes
Reported by: | Owned by: | ||
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | nginx-core | Version: | 1.9.x |
Keywords: | gcc arm | Cc: | |
uname -a: | Linux rpi 4.1.15-v7+ #7 SMP PREEMPT Sun Dec 20 02:23:23 CET 2015 armv7l GNU/Linux | ||
nginx -V: |
nginx version: nginx/1.9.12
built by gcc 4.9.2 (Raspbian 4.9.2-10)
built with OpenSSL 1.0.1k 8 Jan 2015
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/etc/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-http_auth_request_module --with-http_xslt_module=dynamic --with-http_image_filter_module=dynamic --with-http_geoip_module=dynamic --with-threads --with-stream --with-stream_ssl_module --with-http_slice_module --with-mail --with-mail_ssl_module --with-file-aio --with-http_v2_module --with-cc-opt='-g -O2 -fstack-protector-strong -Wformat -Werror=format-security' --with-ld-opt=-Wl,-z,relro --with-ipv6 |
Description
I'm using nginx on a Raspberry Pi to proxy a WebSocket connection to Mosquitto. It worked up to nginx 1.9.10; with 1.9.12 the worker process crashes (I skipped 1.9.11).
location /mqtt/ {
    proxy_pass https://127.0.0.1:9002;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_buffering off;
}
I built the package from http://nginx.org/packages/mainline/debian/ with "apt-get -b source nginx".
root@rpi:/tmp/nginx/cores# gdb /usr/sbin/nginx /tmp/nginx/cores/core
GNU gdb (Raspbian 7.7.1+dfsg-5) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/nginx...Reading symbols from /usr/lib/debug//usr/sbin/nginx...done.
done.
[New LWP 10218]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `nginx: worker process '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x01459288 in ?? ()
(gdb) bt
#0 0x01459288 in ?? ()
#1 0x00066a84 in ngx_http_upstream_process_header (r=0x145c9a0, u=0x145d3a0) at src/http/ngx_http_upstream.c:2133
#2 0x00064a60 in ngx_http_upstream_handler (ev=<optimized out>) at src/http/ngx_http_upstream.c:1110
#3 0x00041928 in ngx_epoll_process_events (cycle=<optimized out>, timer=<optimized out>, flags=<optimized out>) at src/event/modules/ngx_epoll_module.c:822
#4 0x000397e4 in ngx_process_events_and_timers (cycle=cycle@entry=0x13c82e0) at src/event/ngx_event.c:242
#5 0x0003fa70 in ngx_worker_process_cycle (cycle=0x13c82e0, data=<optimized out>) at src/os/unix/ngx_process_cycle.c:753
#6 0x0003e488 in ngx_spawn_process (cycle=cycle@entry=0x13c82e0, proc=0x7e0, data=0x6, name=0xbef60 "worker process", respawn=respawn@entry=0) at src/os/unix/ngx_process.c:198
#7 0x00040bcc in ngx_reap_children (cycle=0x13c82e0) at src/os/unix/ngx_process_cycle.c:621
#8 ngx_master_process_cycle (cycle=cycle@entry=0x13c82e0) at src/os/unix/ngx_process_cycle.c:174
#9 0x0001f1a8 in main (argc=<optimized out>, argv=<optimized out>) at src/core/nginx.c:367
(gdb) q
Change History (8)
comment:1 by , 9 years ago
comment:2 by , 9 years ago
Thank you for your help!
The problem disappeared when I did your first point: configure without any arguments. I guess I will re-add the configure switches and see where it breaks again.
For the record, I could not reproduce the problem on x64 or i386, only on the Pi and in an ARM chroot (https://wiki.debian.org/RaspberryPi/qemu-user-static).
comment:3 by , 9 years ago
The original configure line from the package with -fPIE -pie added works:
./configure --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/etc/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-http_auth_request_module --with-http_xslt_module=dynamic --with-http_image_filter_module=dynamic --with-http_geoip_module=dynamic --with-threads --with-stream --with-stream_ssl_module --with-http_slice_module --with-mail --with-mail_ssl_module --with-file-aio --with-http_v2_module --with-cc-opt='-g -O2 -fPIE -pie -fstack-protector-strong -Wformat -Werror=format-security' --with-ld-opt=-Wl,-z,relro --with-ipv6
comment:4 by , 9 years ago
Thanks for the feedback. It would be interesting to see what exactly breaks it. I would suspect the following, in no particular order:
- --with-cc-opt="-O2"
- other --with-cc-opt and --with-ld-opt
Overall it looks like an ARM-related gcc bug triggered by a particular optimization level or other compiler options and/or particular code, something like this one. Trying some other gcc versions might also help to diagnose the problem.
comment:5 by , 9 years ago
-O1 instead of -O2 works.
-O2 without -fstack-protector-strong fails.
gcc version is 4.9.2 (Raspbian 4.9.2-10). There are apparently only older gccs in Raspbian.
I'm trying to build gcc-5.3.0, but it takes a long time (even in a qemu chroot https://wiki.debian.org/RaspberryPi/qemu-user-static) and the first attempts failed.
comment:6 by , 9 years ago
So it looks like something switched on with -O2 generates broken code. It might also be helpful to trace the specific option then. The full list of options -O2 implies, as per the gcc docs:
-fthread-jumps -falign-functions -falign-jumps -falign-loops -falign-labels -fcaller-saves -fcrossjumping -fcse-follow-jumps -fcse-skip-blocks -fdelete-null-pointer-checks -fdevirtualize -fdevirtualize-speculatively -fexpensive-optimizations -fgcse -fgcse-lm -fhoist-adjacent-loads -finline-small-functions -findirect-inlining -fipa-sra -fisolate-erroneous-paths-dereference -foptimize-sibling-calls -fpartial-inlining -fpeephole2 -freorder-blocks -freorder-functions -frerun-cse-after-loop -fsched-interblock -fsched-spec -fschedule-insns -fschedule-insns2 -fstrict-aliasing -fstrict-overflow -ftree-switch-conversion -ftree-tail-merge -ftree-pre -ftree-vrp
In particular it would be interesting to test if -fstrict-aliasing is a problem or not (that is, if -O2 -fno-strict-aliasing fixes things).
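Tracing the specific option could be scripted roughly as follows. This is only a sketch: it prints one candidate --with-cc-opt value per -O2 sub-option, each with a single option negated, and leaves the actual rebuild and crash test to you. The helper names negate_flag and candidate_cc_opts are made up for illustration, and the flag list is the one quoted above.

```shell
#!/bin/sh
# Sub-options implied by -O2, copied from the list above.
O2_FLAGS="-fthread-jumps -falign-functions -falign-jumps -falign-loops
-falign-labels -fcaller-saves -fcrossjumping -fcse-follow-jumps
-fcse-skip-blocks -fdelete-null-pointer-checks -fdevirtualize
-fdevirtualize-speculatively -fexpensive-optimizations -fgcse -fgcse-lm
-fhoist-adjacent-loads -finline-small-functions -findirect-inlining
-fipa-sra -fisolate-erroneous-paths-dereference -foptimize-sibling-calls
-fpartial-inlining -fpeephole2 -freorder-blocks -freorder-functions
-frerun-cse-after-loop -fsched-interblock -fsched-spec -fschedule-insns
-fschedule-insns2 -fstrict-aliasing -fstrict-overflow
-ftree-switch-conversion -ftree-tail-merge -ftree-pre -ftree-vrp"

# Turn "-ffoo" into "-fno-foo" (hypothetical helper).
negate_flag() {
    printf '%s\n' "$1" | sed 's/^-f/-fno-/'
}

# Print one candidate cc-opt string per sub-option, each disabling exactly one.
candidate_cc_opts() {
    for flag in $O2_FLAGS; do
        printf '%s\n' "-g -O2 $(negate_flag "$flag")"
    done
}

# Each printed line is meant to be passed as --with-cc-opt='...' to ./configure,
# followed by a rebuild and a check whether the worker still crashes.
candidate_cc_opts
```

A candidate that stops the crash points at the option that triggers the miscompilation.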
comment:7 by , 9 years ago
The original package works if it is compiled with gcc-5.3.0.
Using built-in specs.
COLLECT_GCC=/root/gcc-5.3.0/bin/gcc-5.3
COLLECT_LTO_WRAPPER=/root/gcc-5.3.0/libexec/gcc/arm-linux-gnueabihf/5.3.0/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../gcc-5.3.0/configure --prefix=/root/gcc-5.3.0 --enable-languages=c,c++ --program-suffix=-5.3 --enable-shared --enable-linker-build-id --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/5.3 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libitm --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --with-arch-directory=arm --enable-multiarch --disable-sjlj-exceptions --with-arch=armv6 --with-fpu=vfp --with-float=hard --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 5.3.0 (GCC)
nginx version: nginx/1.9.12
built by gcc 5.3.0 (GCC)
built with OpenSSL 1.0.1k 8 Jan 2015
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/etc/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-http_auth_request_module --with-http_xslt_module=dynamic --with-http_image_filter_module=dynamic --with-http_geoip_module=dynamic --with-threads --with-stream --with-stream_ssl_module --with-http_slice_module --with-mail --with-mail_ssl_module --with-file-aio --with-http_v2_module --with-cc-opt='-g -O2 -fstack-protector-strong -Wformat -Werror=format-security' --with-ld-opt=-Wl,-z,relro --with-ipv6
With gcc-4.9.2 again:
-O2 -fno-strict-aliasing: fails
Now I'm trying to find which -O2 -fno-... is the culprit.
comment:8 by , 9 years ago
Keywords: | gcc arm added |
---|---|
Resolution: | → invalid |
Status: | new → closed |
OK, thanks for testing. So it clearly looks like a GCC bug. The most relevant report for -foptimize-sibling-calls I was able to find is this bug.
Given that it's already fixed in recent GCC versions and various workarounds are available (including not using -O2, or using the -fno-optimize-sibling-calls option you've identified), it probably doesn't make sense to debug any further.
Closing this, as this clearly isn't an nginx bug, and the compile options nginx uses by default don't trigger it either. Thank you for testing.
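For anyone stuck on gcc 4.9.2, the second workaround could look like the sketch below. It takes the cc-opt string from the package's configure line in this ticket and adds -fno-optimize-sibling-calls after -O2. The helper name with_workaround is made up for illustration, and the script only prints the resulting option instead of running ./configure.

```shell
#!/bin/sh
# Compiler options from the package's configure line quoted in this ticket.
CC_OPT='-g -O2 -fstack-protector-strong -Wformat -Werror=format-security'

# Insert the workaround flag right after -O2 (hypothetical helper).
with_workaround() {
    printf '%s\n' "$1" | sed 's/-O2/-O2 -fno-optimize-sibling-calls/'
}

# Print the option rather than running ./configure, so the sketch has no side effects.
printf '%s\n' "--with-cc-opt='$(with_workaround "$CC_OPT")'"
```

The remaining configure arguments stay as in the original package build.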
There were previous reports about compilation issues on Raspberry Pi, see ticket #748. Likely you are facing something similar.
You can try tracing this further with:
Note well this comment which suggests that the problem could be solved by using -fPIE -pie flags. This might work for you as well.