Opened 7 years ago

Closed 7 years ago

#1261 closed defect (fixed)

nginx 1.13.0/Solaris 11.3/amd64; coredump when using ssl_session_cache

Reported by: Slarky@…
Owned by:
Priority: minor
Milestone:
Component: nginx-module
Version: 1.11.x
Keywords: solaris, 64bit, ssl, coredump
Cc:
uname -a: SunOS solaris 5.11 11.3 i86pc i386 i86pc
nginx -V: nginx version: nginx/1.13.0
built by Sun C 5.14 SunOS_i386 2016/05/31
built with OpenSSL 1.0.2k 26 Jan 2017
TLS SNI support enabled
configure arguments: --with-cc=/opt/developerstudio12.5/bin/cc --with-cpu-opt=amd64 --prefix=/tmp/nginx13 --with-http_ssl_module

Description

Hi, nginx 1.13 dumps core on Solaris 11.3 with Oracle Developer Studio 12.5 when compiled with -m64:

$ ./configure --with-cc=/opt/developerstudio12.5/bin/cc --with-cpu-opt=amd64 --prefix=/tmp/nginx13 --with-http_ssl_module

Remove both occurrences of "-fast -xipo" from objs/Makefile.

$ gmake
...
/opt/developerstudio12.5/bin/cc -c -m64 -errwarn=%all -g  -I src/core -I src/event -I src/event/modules -I src/os/unix -I /usr/include/pcre -I objs \
        -o objs/src/core/ngx_times.o \
        src/core/ngx_times.c src/os/unix/ngx_sunpro_amd64.il
"src/core/ngx_times.c", line 87: argument #2 is incompatible with prototype:
        prototype: pointer to void : "src/core/ngx_times.c", line 0
        argument : pointer to volatile unsigned long
"src/core/ngx_times.c", line 203: argument #2 is incompatible with prototype:
        prototype: pointer to void : "src/core/ngx_times.c", line 0
        argument : pointer to volatile unsigned long
cc: acomp failed for src/core/ngx_times.c

Remove -errwarn=%all from objs/Makefile as well...

$ gmake
...
$ gmake install

Add "ssl_session_cache shared:SSL:10m;" to /tmp/nginx13/conf/nginx.conf

$ /tmp/nginx13/sbin/nginx -t
nginx: the configuration file /tmp/nginx13/conf/nginx.conf syntax is ok
Memory fault(coredump)

$ dbx objs/nginx core
program terminated by signal SEGV (no mapping at the fault address)
Current function is ngx_slab_alloc
  176       return p;
(dbx) where                                                                  
=>[1] ngx_slab_alloc(pool = <bad address 0xffff80feffffffd8>, size = <bad address 0xffff80feffffffd0>), line 176 in "ngx_slab.c"

(dbx) list 165 177
  165   void *
  166   ngx_slab_alloc(ngx_slab_pool_t *pool, size_t size)
  167   {
  168       void  *p;
  169   
  170       ngx_shmtx_lock(&pool->mutex);
  171   
  172       p = ngx_slab_alloc_locked(pool, size);
  173   
  174       ngx_shmtx_unlock(&pool->mutex);
  175   
  176       return p;
  177   }

Change History (4)

comment:1 by skrueger8@…, 7 years ago

The root cause seems to be that nginx assumes it can use GCC atomics even though the Sun compiler is in use:

checking for gcc builtin atomic operations ... found

I don't get any coredumps or compile time errors when "NGX_HAVE_GCC_ATOMIC" is disabled in objs/ngx_auto_config.h (i.e. delete it or just comment it out).

comment:2 by Maxim Dounin, 7 years ago

It looks like there is more than one problem here. Oracle Developer Studio 12.5 introduced GCC-compatible __sync builtins, but:

  1. These builtins aren't really compatible with GCC, and generate warnings when used with volatile pointers.
  2. These builtins seem to be broken, at least with some compiler flags / optimizations.

Problem (1) causes the warnings during compilation (which -errwarn=%all turns into errors), and problem (2) causes the segmentation faults.

I was able to reproduce broken behaviour using the following test code:

$ cat t.c
#include <stdio.h>

long  n;

int
main(void)
{
    if (!__sync_bool_compare_and_swap(&n, 0, 1)) {
        fprintf(stderr, "__sync_bool_compare_and_swap() failed\n");
        return 1;
    }

    fprintf(stderr, "all clear\n");
}
$ /opt/developerstudio12.5/bin/cc t.c
$ ./a.out
all clear
$ /opt/developerstudio12.5/bin/cc -m64 t.c
$ ./a.out
__sync_bool_compare_and_swap() failed

The exact compiler options needed to trigger a failure seem to depend heavily on the particular code layout. For example, if "long n = 0;" is declared inside main() instead of at file scope, the test only fails with -m64 -O0.

Given the above, the only sensible solution seems to be to explicitly disable the use of GCC atomic operations when compiling with Sun C; see the patch below. As a quick fix, configure with --with-cc-opt="-DNGX_HAVE_GCC_ATOMIC=0".

# HG changeset patch
# User Maxim Dounin <mdounin@mdounin.ru>
# Date 1493918641 -10800
#      Thu May 04 20:24:01 2017 +0300
# Node ID 758f3bb7b84aa8937ea6a3dd27f54b3313cefc50
# Parent  f38647c651a8d5c884b5aacc9f9a5b1af196309b
Configure: disabled gcc atomics with Sun C (ticket #1261).

Oracle Developer Studio 12.5 introduced GCC-compatible __sync builtins.
Unfortunately, these builtins are neither GCC-compatible (they generate
warnings when used with volatile), nor working (unexpectedly fail on
unpredictable combinations of code layout and compiler flags).  As such,
the gcc builtin atomic operations configure test explicitly disabled when
compiling with Sun C.

diff --git a/auto/cc/conf b/auto/cc/conf
--- a/auto/cc/conf
+++ b/auto/cc/conf
@@ -178,21 +178,25 @@ if [ "$NGX_PLATFORM" != win32 ]; then
     fi
 
 
-    ngx_feature="gcc builtin atomic operations"
-    ngx_feature_name=NGX_HAVE_GCC_ATOMIC
-    ngx_feature_run=yes
-    ngx_feature_incs=
-    ngx_feature_path=
-    ngx_feature_libs=
-    ngx_feature_test="long  n = 0;
-                      if (!__sync_bool_compare_and_swap(&n, 0, 1))
-                          return 1;
-                      if (__sync_fetch_and_add(&n, 1) != 1)
-                          return 1;
-                      if (n != 2)
-                          return 1;
-                      __sync_synchronize();"
-    . auto/feature
+    if [ "$NGX_CC_NAME" = "sunc" ]; then
+        echo "checking for gcc builtin atomic operations ... disabled"
+    else
+        ngx_feature="gcc builtin atomic operations"
+        ngx_feature_name=NGX_HAVE_GCC_ATOMIC
+        ngx_feature_run=yes
+        ngx_feature_incs=
+        ngx_feature_path=
+        ngx_feature_libs=
+        ngx_feature_test="long  n = 0;
+                          if (!__sync_bool_compare_and_swap(&n, 0, 1))
+                              return 1;
+                          if (__sync_fetch_and_add(&n, 1) != 1)
+                              return 1;
+                          if (n != 2)
+                              return 1;
+                          __sync_synchronize();"
+        . auto/feature
+    fi
 
 
     if [ "$NGX_CC_NAME" = "ccc" ]; then

comment:3 by Maxim Dounin <mdounin@…>, 7 years ago

In 6999:4925a6d13746/nginx:

Configure: disabled gcc atomics with Sun C (ticket #1261).

Oracle Developer Studio 12.5 introduced GCC-compatible __sync builtins.
Unfortunately, these builtins are neither GCC-compatible (they generate
warnings when used with volatile), nor working (unexpectedly fail on
unpredictable combinations of code layout and compiler flags). As such,
the gcc builtin atomic operations configure test explicitly disabled when
compiling with Sun C.

comment:4 by Maxim Dounin, 7 years ago

Resolution: fixed
Status: new → closed

Patch committed, thanks for reporting this.
