Opened 12 years ago

Closed 12 years ago

#222 closed defect (invalid)

nginx segfaults on start

Reported by: semenko
Owned by: somebody
Priority: major
Milestone:
Component: nginx-core
Version: 1.1.x
Keywords: segfault, crash
Cc:
uname -a: Linux ip-10-160-45-203 3.2.0-29-virtual #46-Ubuntu SMP Fri Jul 27 17:23:50 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.1.19
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-log-path=/var/log/nginx/access.log --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --lock-path=/var/lock/nginx.lock --pid-path=/var/run/nginx.pid --with-debug --with-http_addition_module --with-http_dav_module --with-http_geoip_module --with-http_gzip_static_module --with-http_image_filter_module --with-http_realip_module --with-http_stub_status_module --with-http_ssl_module --with-http_sub_module --with-http_xslt_module --with-ipv6 --with-sha1=/usr/include/openssl --with-md5=/usr/include/openssl --with-mail --with-mail_ssl_module --add-module=/build/buildd/nginx-1.1.19/debian/modules/nginx-auth-pam --add-module=/build/buildd/nginx-1.1.19/debian/modules/nginx-echo --add-module=/build/buildd/nginx-1.1.19/debian/modules/nginx-upstream-fair --add-module=/build/buildd/nginx-1.1.19/debian/modules/nginx-dav-ext-module

Description

I've been having a semi-reproducible issue with nginx segfaulting at start.

This happens with reasonable frequency right after boot on an Amazon AWS EC2 t1.micro instance, running Ubuntu 12.04.1 LTS:

[ 550.264652] Process nginx (pid: 11293, threadinfo ffff88002337e000, task ffff8800232e44a0)
[ 550.264656] Stack:
[ 550.264661] ffff88002337fe68 00007fffd1225538 ffff8800232e44a0 000000018107b1ec
[ 550.264670] ffff8800232e48f0 0000000000000001 ffff88002337fee8 0000000000000001
[ 550.264678] 0000000000000001 00000000025e99d0 ffff88002337fe18 ffffffff8101362d
[ 550.264686] Call Trace:
[ 550.264692] [<ffffffff8101362d>] setup_rt_frame+0x9d/0xc0
[ 550.264697] [<ffffffff810136c1>] handle_signal+0x71/0x190
[ 550.264703] [<ffffffff810138a6>] do_signal+0xc6/0x130
[ 550.264710] [<ffffffff8165311a>] ? error_exit+0x2a/0x60
[ 550.264715] [<ffffffff81652c3c>] ? retint_restore_args+0x5/0x6
[ 550.264722] [<ffffffff8131464d>] ? put_user_4+0x1d/0x30
[ 550.264728] [<ffffffff81013ad5>] do_notify_resume+0x65/0x80
[ 550.264735] [<ffffffff8105ea6c>] ? schedule_tail+0x5c/0xb0
[ 550.264741] [<ffffffff8165b010>] int_signal+0x12/0x17
[ 550.264745] Code: ff 0f 85 3a 02 00 00 f6 43 08 04 0f 85 05 02 00 00 65 48 8b 04 25 c8 c4 00 00 44 8b 80 88 e0 ff ff c7 80 88 e0 ff ff 00 00 00 00 <48> 8b 05 b8 67 cc 00 a9 00 00 00 04 0f 85 89 01 00 00 49 c7 45
[ 550.264801] RIP [<ffffffff81013365>] setup_rt_frame+0x105/0x330
[ 550.264807] RSP <ffff88002337fda8>
[ 550.264811] ---[ end trace fd7fffe51de1b4c9 ]---
[ 555.642115] alignment check: 0000 #2 SMP
[ 555.642135] CPU 0
[ 555.642137] Modules linked in: tcp_htcp isofs acpiphp
[ 555.642146]
[ 555.642150] Pid: 11318, comm: nginx Tainted: G D 3.2.0-29-virtual #46-Ubuntu
[ 555.642158] RIP: e030:[<ffffffff81013365>] [<ffffffff81013365>] setup_rt_frame+0x105/0x330
[ 555.642171] RSP: e02b:ffff880002887da8 EFLAGS: 00050246
[ 555.642176] RAX: ffff880002887fd8 RBX: ffff880002887ee8 RCX: ffff8800232e0000
[ 555.642236] RDX: ffff880002887e68 RSI: ffff880002887fd8 RDI: 0000000000000000
[ 555.642244] RBP: ffff880002887df8 R08: 0000000000000000 R09: ffff8800232e0568
[ 555.642249] R10: 0000000000000001 R11: 0000000000000246 R12: ffff8800232e0558
[ 555.642254] R13: 00007fff93462f38 R14: ffff880002887f58 R15: 00007fff93463100
[ 555.642263] FS: 00007f75f95cb700(0000) GS:ffff880026400000(0000) knlGS:0000000000000000
[ 555.642269] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 555.642273] CR2: 00007fff93463100 CR3: 00000000232d5000 CR4: 0000000000002620
[ 555.642279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 555.642292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
[ 555.642299] Process nginx (pid: 11318, threadinfo ffff880002886000, task ffff8800232e0000)
[ 555.642304] Stack:
[ 555.642307] ffff880002887e68 00007fff93462f38 ffff8800232e0000 000000018107b1ec
[ 555.642324] ffff8800232e0450 0000000000000001 ffff880002887ee8 0000000000000001
[ 555.642332] 0000000000000001 0000000001efa9d0 ffff880002887e18 ffffffff8101362d
[ 555.642341] Call Trace:
[ 555.642360] [<ffffffff8101362d>] setup_rt_frame+0x9d/0xc0
[ 555.642368] [<ffffffff810136c1>] handle_signal+0x71/0x190
[ 555.642373] [<ffffffff810138a6>] do_signal+0xc6/0x130
[ 555.642380] [<ffffffff8165311a>] ? error_exit+0x2a/0x60
[ 555.642386] [<ffffffff81652c3c>] ? retint_restore_args+0x5/0x6
[ 555.642401] [<ffffffff8131464d>] ? put_user_4+0x1d/0x30
[ 555.642407] [<ffffffff81013ad5>] do_notify_resume+0x65/0x80
[ 555.642417] [<ffffffff8105ea6c>] ? schedule_tail+0x5c/0xb0
[ 555.642433] [<ffffffff8165b010>] int_signal+0x12/0x17
[ 555.642438] Code: ff 0f 85 3a 02 00 00 f6 43 08 04 0f 85 05 02 00 00 65 48 8b 04 25 c8 c4 00 00 44 8b 80 88 e0 ff ff c7 80 88 e0 ff ff 00 00 00 00 <48> 8b 05 b8 67 cc 00 a9 00 00 00 04 0f 85 89 01 00 00 49 c7 45
[ 555.642584] RIP [<ffffffff81013365>] setup_rt_frame+0x105/0x330
[ 555.642592] RSP <ffff880002887da8>
[ 555.642597] ---[ end trace fd7fffe51de1b4ca ]---

Change History (2)

comment:1 by semenko, 12 years ago

I've been deploying to EC2 instances using a python fabric script, which now runs:

sync
invoke-rc.d nginx start
sleep 5
invoke-rc.d nginx start
[I should probably change that to service nginx start ....]

The five second delay is a hackish workaround to try to avoid the segfault.

In this case, there were back-to-back segfaults. After nginx starts, it's completely stable (months of uptime, no problems).
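
The workaround above could also be written as a small retry helper in the Python deploy script. This is only a minimal sketch: `start_with_retry` and `start_nginx` are hypothetical names, and it assumes `invoke-rc.d nginx start` exits non-zero when the start fails, which the ticket does not confirm.

```python
import subprocess
import time


def start_with_retry(start, attempts=2, delay=5.0, sleep=time.sleep):
    """Call start() until it returns True or the attempts run out.

    Returns the 1-based number of the attempt that succeeded, or 0 if
    every attempt failed. `sleep` is injectable so the helper can be
    exercised without actually waiting.
    """
    for attempt in range(1, attempts + 1):
        if start():
            return attempt
        if attempt < attempts:
            sleep(delay)  # same five-second pause as the original workaround
    return 0


def start_nginx():
    """Run the init script from the ticket; True if it exits with status 0.

    Assumption: a failed start is visible in the exit status.
    """
    return subprocess.call(["invoke-rc.d", "nginx", "start"]) == 0
```

Calling `start_with_retry(start_nginx)` after `sync` would then only pause and retry when the first start reports a failure, instead of always starting twice.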

Here's the first segfault (five seconds before the one above):

[ 550.264545] alignment check: 0000 #1 SMP
[ 550.264559] CPU 0
[ 550.264562] Modules linked in: tcp_htcp isofs acpiphp
[ 550.264570]
[ 550.264573] Pid: 11293, comm: nginx Not tainted 3.2.0-29-virtual #46-Ubuntu
[ 550.264580] RIP: e030:[<ffffffff81013365>] [<ffffffff81013365>] setup_rt_frame+0x105/0x330
[ 550.264594] RSP: e02b:ffff88002337fda8 EFLAGS: 00050246
[ 550.264598] RAX: ffff88002337ffd8 RBX: ffff88002337fee8 RCX: ffff8800232e44a0
[ 550.264603] RDX: ffff88002337fe68 RSI: ffff88002337ffd8 RDI: 0000000000000000
[ 550.264608] RBP: ffff88002337fdf8 R08: 0000000000000000 R09: ffff8800232e4a08
[ 550.264613] R10: 0000000000000001 R11: 0000000000000246 R12: ffff8800232e49f8
[ 550.264618] R13: 00007fffd1225538 R14: ffff88002337ff58 R15: 00007fffd1225700
[ 550.264626] FS: 00007f4f5eb47700(0000) GS:ffff880026400000(0000) knlGS:0000000000000000
[ 550.264632] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 550.264636] CR2: 00007fffd1225700 CR3: 0000000002afc000 CR4: 0000000000002620
[ 550.264641] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 550.264646] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
[ 550.264652] Process nginx (pid: 11293, threadinfo ffff88002337e000, task ffff8800232e44a0)
[ 550.264656] Stack:
[ 550.264661] ffff88002337fe68 00007fffd1225538 ffff8800232e44a0 000000018107b1ec
[ 550.264670] ffff8800232e48f0 0000000000000001 ffff88002337fee8 0000000000000001
[ 550.264678] 0000000000000001 00000000025e99d0 ffff88002337fe18 ffffffff8101362d
[ 550.264686] Call Trace:
[ 550.264692] [<ffffffff8101362d>] setup_rt_frame+0x9d/0xc0
[ 550.264697] [<ffffffff810136c1>] handle_signal+0x71/0x190
[ 550.264703] [<ffffffff810138a6>] do_signal+0xc6/0x130
[ 550.264710] [<ffffffff8165311a>] ? error_exit+0x2a/0x60
[ 550.264715] [<ffffffff81652c3c>] ? retint_restore_args+0x5/0x6
[ 550.264722] [<ffffffff8131464d>] ? put_user_4+0x1d/0x30
[ 550.264728] [<ffffffff81013ad5>] do_notify_resume+0x65/0x80
[ 550.264735] [<ffffffff8105ea6c>] ? schedule_tail+0x5c/0xb0
[ 550.264741] [<ffffffff8165b010>] int_signal+0x12/0x17
[ 550.264745] Code: ff 0f 85 3a 02 00 00 f6 43 08 04 0f 85 05 02 00 00 65 48 8b 04 25 c8 c4 00 00 44 8b 80 88 e0 ff ff c7 80 88 e0 ff ff 00 00 00 00 <48> 8b 05 b8 67 cc 00 a9 00 00 00 04 0f 85 89 01 00 00 49 c7 45
[ 550.264801] RIP [<ffffffff81013365>] setup_rt_frame+0x105/0x330
[ 550.264807] RSP <ffff88002337fda8>
[ 550.264811] ---[ end trace fd7fffe51de1b4c9 ]---
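
The first line of each dump names the trap the kernel reported. A rough sketch for pulling those header lines out of a saved log follows; the `HEADER` pattern and `trap_headers` name are illustrative and only match the header format as it appears in this ticket.

```python
import re

# Matches the first line of a kernel dump as pasted in this ticket, e.g.
# "[ 550.264545] alignment check: 0000 #1 SMP".
# The optional brackets around "#1" are an assumption about other log formats.
HEADER = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"(?P<trap>[a-z ]+): (?P<code>[0-9a-f]{4}) \[?#(?P<n>\d+)\]? "
)


def trap_headers(lines):
    """Yield (timestamp, trap name, dump number) for each dump header line."""
    for line in lines:
        m = HEADER.match(line)
        if m:
            yield float(m.group("ts")), m.group("trap"), int(m.group("n"))
```

Run over the lines in this ticket, it would report two "alignment check" traps, numbered 1 and 2, about five seconds apart.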

comment:2 by Maxim Dounin, 12 years ago

Resolution: invalid
Status: new → closed

There is no segmentation fault in the kernel logs provided, but rather an "alignment check" problem. Something in the kernel turned on the EFLAGS.AC bit, which is most likely a bug in the kernel or hypervisor. This is certainly not a problem in nginx.
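
The EFLAGS value printed in both dumps (00050246) can be decoded to check the AC bit. A minimal sketch, assuming the standard x86 flag bit positions (AC is bit 18); `decode_eflags` is an illustrative name, not part of any tool:

```python
# Standard x86 EFLAGS bit positions (single-bit flags only; IOPL is omitted).
EFLAGS_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF", 8: "TF", 9: "IF",
    10: "DF", 11: "OF", 14: "NT", 16: "RF", 17: "VM", 18: "AC",
    19: "VIF", 20: "VIP", 21: "ID",
}


def decode_eflags(value):
    """Return the names of the flags set in value, lowest bit first."""
    return [name for bit, name in sorted(EFLAGS_BITS.items())
            if (value >> bit) & 1]


# EFLAGS as printed on the "RSP: ... EFLAGS: 00050246" lines of both dumps.
flags = decode_eflags(0x00050246)
print(flags)
```

AC (mask 0x40000) is among the flags set in the logged value, which is consistent with an alignment-check trap rather than a segmentation fault.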
