Opened 7 years ago

Closed 7 years ago

#222 closed defect (invalid)

nginx segfaults on start

Reported by: www.google.com/accounts/o8/id?id=AItOawm3Bg5xE6-Y6dfRPQZkXbZxjShTnZtINoE
Owned by: somebody
Priority: major
Milestone:
Component: nginx-core
Version: 1.1.x
Keywords: segfault, crash
Cc:
uname -a: Linux ip-10-160-45-203 3.2.0-29-virtual #46-Ubuntu SMP Fri Jul 27 17:23:50 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.1.19 TLS SNI support enabled configure arguments: --prefix=/etc/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-log-path=/var/log/nginx/access.log --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --lock-path=/var/lock/nginx.lock --pid-path=/var/run/nginx.pid --with-debug --with-http_addition_module --with-http_dav_module --with-http_geoip_module --with-http_gzip_static_module --with-http_image_filter_module --with-http_realip_module --with-http_stub_status_module --with-http_ssl_module --with-http_sub_module --with-http_xslt_module --with-ipv6 --with-sha1=/usr/include/openssl --with-md5=/usr/include/openssl --with-mail --with-mail_ssl_module --add-module=/build/buildd/nginx-1.1.19/debian/modules/nginx-auth-pam --add-module=/build/buildd/nginx-1.1.19/debian/modules/nginx-echo --add-module=/build/buildd/nginx-1.1.19/debian/modules/nginx-upstream-fair --add-module=/build/buildd/nginx-1.1.19/debian/modules/nginx-dav-ext-module

Description

I've been having a semi-reproducible issue with nginx segfaulting at start.

This happens with reasonable frequency right after boot on an Amazon AWS EC2 t1.micro instance, running Ubuntu 12.04.1 LTS:

[ 550.264652] Process nginx (pid: 11293, threadinfo ffff88002337e000, task ffff8800232e44a0)
[ 550.264656] Stack:
[ 550.264661] ffff88002337fe68 00007fffd1225538 ffff8800232e44a0 000000018107b1ec
[ 550.264670] ffff8800232e48f0 0000000000000001 ffff88002337fee8 0000000000000001
[ 550.264678] 0000000000000001 00000000025e99d0 ffff88002337fe18 ffffffff8101362d
[ 550.264686] Call Trace:
[ 550.264692] [<ffffffff8101362d>] setup_rt_frame+0x9d/0xc0
[ 550.264697] [<ffffffff810136c1>] handle_signal+0x71/0x190
[ 550.264703] [<ffffffff810138a6>] do_signal+0xc6/0x130
[ 550.264710] [<ffffffff8165311a>] ? error_exit+0x2a/0x60
[ 550.264715] [<ffffffff81652c3c>] ? retint_restore_args+0x5/0x6
[ 550.264722] [<ffffffff8131464d>] ? put_user_4+0x1d/0x30
[ 550.264728] [<ffffffff81013ad5>] do_notify_resume+0x65/0x80
[ 550.264735] [<ffffffff8105ea6c>] ? schedule_tail+0x5c/0xb0
[ 550.264741] [<ffffffff8165b010>] int_signal+0x12/0x17
[ 550.264745] Code: ff 0f 85 3a 02 00 00 f6 43 08 04 0f 85 05 02 00 00 65 48 8b 04 25 c8 c4 00 00 44 8b 80 88 e0 ff ff c7 80 88 e0 ff ff 00 00 00 00 <48> 8b 05 b8 67 cc 00 a9 00 00 00 04 0f 85 89 01 00 00 49 c7 45
[ 550.264801] RIP [<ffffffff81013365>] setup_rt_frame+0x105/0x330
[ 550.264807] RSP <ffff88002337fda8>
[ 550.264811] ---[ end trace fd7fffe51de1b4c9 ]---
[ 555.642115] alignment check: 0000 #2 SMP
[ 555.642135] CPU 0
[ 555.642137] Modules linked in: tcp_htcp isofs acpiphp
[ 555.642146]
[ 555.642150] Pid: 11318, comm: nginx Tainted: G D 3.2.0-29-virtual #46-Ubuntu
[ 555.642158] RIP: e030:[<ffffffff81013365>] [<ffffffff81013365>] setup_rt_frame+0x105/0x330
[ 555.642171] RSP: e02b:ffff880002887da8 EFLAGS: 00050246
[ 555.642176] RAX: ffff880002887fd8 RBX: ffff880002887ee8 RCX: ffff8800232e0000
[ 555.642236] RDX: ffff880002887e68 RSI: ffff880002887fd8 RDI: 0000000000000000
[ 555.642244] RBP: ffff880002887df8 R08: 0000000000000000 R09: ffff8800232e0568
[ 555.642249] R10: 0000000000000001 R11: 0000000000000246 R12: ffff8800232e0558
[ 555.642254] R13: 00007fff93462f38 R14: ffff880002887f58 R15: 00007fff93463100
[ 555.642263] FS: 00007f75f95cb700(0000) GS:ffff880026400000(0000) knlGS:0000000000000000
[ 555.642269] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 555.642273] CR2: 00007fff93463100 CR3: 00000000232d5000 CR4: 0000000000002620
[ 555.642279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 555.642292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
[ 555.642299] Process nginx (pid: 11318, threadinfo ffff880002886000, task ffff8800232e0000)
[ 555.642304] Stack:
[ 555.642307] ffff880002887e68 00007fff93462f38 ffff8800232e0000 000000018107b1ec
[ 555.642324] ffff8800232e0450 0000000000000001 ffff880002887ee8 0000000000000001
[ 555.642332] 0000000000000001 0000000001efa9d0 ffff880002887e18 ffffffff8101362d
[ 555.642341] Call Trace:
[ 555.642360] [<ffffffff8101362d>] setup_rt_frame+0x9d/0xc0
[ 555.642368] [<ffffffff810136c1>] handle_signal+0x71/0x190
[ 555.642373] [<ffffffff810138a6>] do_signal+0xc6/0x130
[ 555.642380] [<ffffffff8165311a>] ? error_exit+0x2a/0x60
[ 555.642386] [<ffffffff81652c3c>] ? retint_restore_args+0x5/0x6
[ 555.642401] [<ffffffff8131464d>] ? put_user_4+0x1d/0x30
[ 555.642407] [<ffffffff81013ad5>] do_notify_resume+0x65/0x80
[ 555.642417] [<ffffffff8105ea6c>] ? schedule_tail+0x5c/0xb0
[ 555.642433] [<ffffffff8165b010>] int_signal+0x12/0x17
[ 555.642438] Code: ff 0f 85 3a 02 00 00 f6 43 08 04 0f 85 05 02 00 00 65 48 8b 04 25 c8 c4 00 00 44 8b 80 88 e0 ff ff c7 80 88 e0 ff ff 00 00 00 00 <48> 8b 05 b8 67 cc 00 a9 00 00 00 04 0f 85 89 01 00 00 49 c7 45
[ 555.642584] RIP [<ffffffff81013365>] setup_rt_frame+0x105/0x330
[ 555.642592] RSP <ffff880002887da8>
[ 555.642597] ---[ end trace fd7fffe51de1b4ca ]---

Change History (2)

comment:1 Changed 7 years ago by www.google.com/accounts/o8/id?id=AItOawm3Bg5xE6-Y6dfRPQZkXbZxjShTnZtINoE

I've been deploying to EC2 instances using a Python Fabric script, which now runs:

sync
invoke-rc.d nginx start
sleep 5
invoke-rc.d nginx start
[I should probably change that to service nginx start.]

The five-second delay before the second start attempt is a hackish workaround to try to avoid the segfault.
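The start/sleep/start sequence above could also be written as a small retry helper in the deploy script. This is only a sketch: the function name start_with_retry and its parameters are hypothetical, and it assumes the init script exits with a nonzero status when nginx fails to start, which this ticket does not confirm.

```python
import subprocess
import time


def start_with_retry(command, attempts=2, delay=5.0,
                     run=subprocess.call, sleep=time.sleep):
    """Run `command` until it exits with status 0 or attempts run out.

    Returns the 1-based number of the attempt that succeeded, or None
    if every attempt failed. `run` and `sleep` are injectable so the
    helper can be exercised without touching a real service.
    """
    for attempt in range(1, attempts + 1):
        if run(command) == 0:
            return attempt
        # Only wait when another attempt is still to come.
        if attempt < attempts:
            sleep(delay)
    return None


# Hypothetical usage, mirroring the commands in this comment:
#   start_with_retry(["invoke-rc.d", "nginx", "start"], attempts=2, delay=5.0)
```

Unlike the unconditional second start, this version skips the retry once the first attempt reports success.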

In this case, there were back-to-back segfaults. After nginx starts, it's completely stable (months of uptime, no problems).

Here's the first segfault (five seconds before the one above):

[ 550.264545] alignment check: 0000 #1 SMP
[ 550.264559] CPU 0
[ 550.264562] Modules linked in: tcp_htcp isofs acpiphp
[ 550.264570]
[ 550.264573] Pid: 11293, comm: nginx Not tainted 3.2.0-29-virtual #46-Ubuntu
[ 550.264580] RIP: e030:[<ffffffff81013365>] [<ffffffff81013365>] setup_rt_frame+0x105/0x330
[ 550.264594] RSP: e02b:ffff88002337fda8 EFLAGS: 00050246
[ 550.264598] RAX: ffff88002337ffd8 RBX: ffff88002337fee8 RCX: ffff8800232e44a0
[ 550.264603] RDX: ffff88002337fe68 RSI: ffff88002337ffd8 RDI: 0000000000000000
[ 550.264608] RBP: ffff88002337fdf8 R08: 0000000000000000 R09: ffff8800232e4a08
[ 550.264613] R10: 0000000000000001 R11: 0000000000000246 R12: ffff8800232e49f8
[ 550.264618] R13: 00007fffd1225538 R14: ffff88002337ff58 R15: 00007fffd1225700
[ 550.264626] FS: 00007f4f5eb47700(0000) GS:ffff880026400000(0000) knlGS:0000000000000000
[ 550.264632] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 550.264636] CR2: 00007fffd1225700 CR3: 0000000002afc000 CR4: 0000000000002620
[ 550.264641] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 550.264646] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
[ 550.264652] Process nginx (pid: 11293, threadinfo ffff88002337e000, task ffff8800232e44a0)
[ 550.264656] Stack:
[ 550.264661] ffff88002337fe68 00007fffd1225538 ffff8800232e44a0 000000018107b1ec
[ 550.264670] ffff8800232e48f0 0000000000000001 ffff88002337fee8 0000000000000001
[ 550.264678] 0000000000000001 00000000025e99d0 ffff88002337fe18 ffffffff8101362d
[ 550.264686] Call Trace:
[ 550.264692] [<ffffffff8101362d>] setup_rt_frame+0x9d/0xc0
[ 550.264697] [<ffffffff810136c1>] handle_signal+0x71/0x190
[ 550.264703] [<ffffffff810138a6>] do_signal+0xc6/0x130
[ 550.264710] [<ffffffff8165311a>] ? error_exit+0x2a/0x60
[ 550.264715] [<ffffffff81652c3c>] ? retint_restore_args+0x5/0x6
[ 550.264722] [<ffffffff8131464d>] ? put_user_4+0x1d/0x30
[ 550.264728] [<ffffffff81013ad5>] do_notify_resume+0x65/0x80
[ 550.264735] [<ffffffff8105ea6c>] ? schedule_tail+0x5c/0xb0
[ 550.264741] [<ffffffff8165b010>] int_signal+0x12/0x17
[ 550.264745] Code: ff 0f 85 3a 02 00 00 f6 43 08 04 0f 85 05 02 00 00 65 48 8b 04 25 c8 c4 00 00 44 8b 80 88 e0 ff ff c7 80 88 e0 ff ff 00 00 00 00 <48> 8b 05 b8 67 cc 00 a9 00 00 00 04 0f 85 89 01 00 00 49 c7 45
[ 550.264801] RIP [<ffffffff81013365>] setup_rt_frame+0x105/0x330
[ 550.264807] RSP <ffff88002337fda8>
[ 550.264811] ---[ end trace fd7fffe51de1b4c9 ]---

comment:2 Changed 7 years ago by mdounin

  • Resolution set to invalid
  • Status changed from new to closed

There is no segmentation fault in the kernel logs provided, but rather an "alignment check" problem. Something in the kernel turned on the EFLAGS.AC bit, which is most likely a bug in the kernel or hypervisor. This is certainly not a problem in nginx.
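To illustrate the point about EFLAGS.AC, the EFLAGS value from the register dumps in this ticket (00050246) can be decoded bit by bit. The sketch below is not part of the original report. The bit positions follow the standard x86 EFLAGS layout (AC is bit 18), and the helper name decode_eflags is hypothetical.

```python
# Standard x86 EFLAGS bit positions (the two-bit IOPL field is omitted).
EFLAGS_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF", 8: "TF", 9: "IF",
    10: "DF", 11: "OF", 14: "NT", 16: "RF", 17: "VM", 18: "AC",
    19: "VIF", 20: "VIP", 21: "ID",
}


def decode_eflags(value):
    """Return the names of the flags set in `value`, lowest bit first."""
    return [name for bit, name in sorted(EFLAGS_BITS.items())
            if value & (1 << bit)]


# EFLAGS as logged in both oops reports above.
flags = decode_eflags(0x00050246)
print(flags)          # ['PF', 'ZF', 'IF', 'RF', 'AC']
print("AC" in flags)  # True
```

With AC set, unaligned accesses can raise an alignment-check exception, which matches the "alignment check: 0000" header of both oops reports.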
