Opened 13 years ago
Closed 8 years ago
#235 closed defect (fixed)
segfault with SNI and ssl_session_cache assymetrical configuration
| Reported by: | Antony Dovgal | Owned by: | somebody | 
|---|---|---|---|
| Priority: | minor | Milestone: | |
| Component: | nginx-core | Version: | 1.2.x | 
| Keywords: | Cc: | ||
| uname -a: | Linux www1 2.6.32.59-32.32-default #1 SMP 2012-05-29 21:43:39 +0200 x86_64 x86_64 x86_64 GNU/Linux | ||
| nginx -V: | nginx version: nginx/1.2.4 built by gcc 4.3.4 [gcc-4_3-branch revision 152973] (SUSE Linux) TLS SNI support enabled configure arguments: --with-debug --add-module=../ngx_http_cashew_module.git --add-module=../ngx_http_cc_module.git --add-module=../ngx_http_ip_tos_module.git --add-module=../ngx_http_jpegresize_module.git --add-module=../ngx_http_pinba_module.git --add-module=../ngx_http_rangemap_module.git --add-module=../ngx_http_tcp_tweak_module.git --prefix=/local/nginx --add-module=../substitutions4nginx-read-only --with-openssl=../openssl-1.0.1c --with-pcre=../pcre-8.31 --with-pcre-jit --with-cc-opt='-g3 -O0 -DNGX_HAVE_ACCEPT4=1 -DNGX_HAVE_OPENAT=0 -I../ipp_7.0.6.293_intel64.git/include -I../libjpeg-turbo-1.2.1' --with-http_stub_status_module --with-http_realip_module --with-http_flv_module --with-http_gzip_static_module --with-http_ssl_module | ||
Description
Nginx 1.2.4 segfaults in ngx_ssl_new_session():
Program terminated with signal 11, Segmentation fault.
#0  0x000000000043bdb4 in ngx_ssl_new_session (ssl_conn=0xe225760, sess=0xf53f80) at src/event/ngx_event_openssl.c:1678
1678    src/event/ngx_event_openssl.c: No such file or directory.
in src/event/ngx_event_openssl.c
(gdb) p shm_zone 
$1 = (ngx_shm_zone_t *) 0x0
The crash is only reproducible when using several SSL certificates AND using Safari on Mac as a browser.
Not reproducible with Firefox/Chrome/whatever on the same Mac.
The crash is caused by SSL_CTX_get_ex_data() returning NULL, which is then dereferenced without a check. Since this check is absent in trunk, I suppose the bug is reproducible there, too.
The fix seems to be quite simple, see attached patch.
Attachments (1)
Change History (12)
by , 13 years ago
| Attachment: | nginx_sni_null_data.diff added | 
|---|
follow-up: 10 comment:1 by , 13 years ago
| Priority: | major → minor | 
|---|---|
| Status: | new → accepted | 
| Summary: | [patch] nginx segfaults in ngx_ssl_new_session() (with Safari only) → segfault with SNI and ssl_session_cache assymetrical configuration | 
comment:2 by , 13 years ago
It's actually known issue
Yeah, Anight told me about that, but only after I 'fixed' and reported it to you =)
The patch suggested will obviously fix segfault, but not the underlying problem
Well, my intention was to fix the segfault, the cache issue isn't quite my priority atm.
I understand that this is just a workaround of a kind, but crashes create much more problems than non-working cache.
Anyway, thanks for the detailed explanation!
comment:3 by , 12 years ago
I can reproduce this with Opera as well. Not that it really matters.
However, I agree with Antony that leaving a bug open like this is not a good solution. If it is known to cause issues at the moment, maybe nginx should output a warning or error when such a configuration is used? That may be easier to do that fixing the underlying issue while still will prevent random segfaults on people who are not aware of it.
comment:4 by , 12 years ago
A colleague of mine and I have looked at this issue to help fix this bug. The underlying problem is how OpenSSL uses contexts and sessions with the presence of SNI extension. 
The first context created will be the one used for accepting the connection and marked as the "session_ctx". When creating new sessions, it always uses the session_ctx, as session resumption is supposed to ignore extensions as specified in RFC 3546. This leads to always using the first context for dispatching new session callbacks, even though the proper context is passed in as a parameter. This effectively means that if the session callbacks are registered with the goal of having multiple session caches, this will not be possible (not sure if different ports will lead to different behavior).
In the case of this bug, the first context defines a shared cache, which leads to registering the new session callback. The second context doesn't have one, so it has a NULL shared memory region. Since the callback dispatch is based on the first context, it gets called with the correct context which has NULL pointer.
A proposed solution to this problem is to always register the new session callbacks, since they will be dispatched with the proper context as parameter. Then in the callback return early if there is no actual cache defined for the specific context. I can't attach the patch for some reason, so I'm inlining it here:
diff --git a/src/event/ngx_event_openssl.c b/src/event/ngx_event_openssl.c
index cbe4136..b599ccb 100644
--- a/src/event/ngx_event_openssl.c
+++ b/src/event/ngx_event_openssl.c
@@ -1774,18 +1774,16 @@ ngx_ssl_session_cache(ngx_ssl_t *ssl, ngx_str_t *sess_ctx,
         }
     }
-    if (shm_zone) {
-        SSL_CTX_sess_set_new_cb(ssl->ctx, ngx_ssl_new_session);
-        SSL_CTX_sess_set_get_cb(ssl->ctx, ngx_ssl_get_cached_session);
-        SSL_CTX_sess_set_remove_cb(ssl->ctx, ngx_ssl_remove_session);
+    SSL_CTX_sess_set_new_cb(ssl->ctx, ngx_ssl_new_session);
+    SSL_CTX_sess_set_get_cb(ssl->ctx, ngx_ssl_get_cached_session);
+    SSL_CTX_sess_set_remove_cb(ssl->ctx, ngx_ssl_remove_session);
-        if (SSL_CTX_set_ex_data(ssl->ctx, ngx_ssl_session_cache_index, shm_zone)
-            == 0)
-        {
-            ngx_ssl_error(NGX_LOG_EMERG, ssl->log, 0,
-                          "SSL_CTX_set_ex_data() failed");
-            return NGX_ERROR;
-        }
+    if (SSL_CTX_set_ex_data(ssl->ctx, ngx_ssl_session_cache_index, shm_zone)
+          == 0)
+    {
+        ngx_ssl_error(NGX_LOG_EMERG, ssl->log, 0,
+                      "SSL_CTX_set_ex_data() failed");
+        return NGX_ERROR;
     }
     return NGX_OK;
@@ -1884,6 +1882,9 @@ ngx_ssl_new_session(ngx_ssl_conn_t *ssl_conn, ngx_ssl_session_t *sess)
     ssl_ctx = SSL_get_SSL_CTX(ssl_conn);
     shm_zone = SSL_CTX_get_ex_data(ssl_ctx, ngx_ssl_session_cache_index);
+    if (!shm_zone) {
+        return 0;
+    }
     cache = shm_zone->data;
     shpool = (ngx_slab_pool_t *) shm_zone->shm.addr;
OpenSSL also has a bug (id 2019) on file to fix session resumption with SNI extension, though it might not be fixed soon.
Let us know if this is good or if there is any other way to help resolve this issue.
comment:6 by , 10 years ago
| Resolution: | → fixed | 
|---|---|
| sensitive: | → 0 | 
| Status: | accepted → closed | 
Fix committed and will be available in upcoming nginx 1.9.6.
comment:7 by , 10 years ago
Still not fixed? I upgraded to 1.9.9 with the hope to fix, but no way...
I'm using SNI with HTTP/2 (before, using 1.8.0, SPDY) 
session_cache set globally to 10m
nginx -v
nginx version: nginx/1.9.9
2015/12/22 08:15:54 [alert] 7217#0: worker process 7392 exited on signal 11 (core dumped)
2015/12/22 08:16:17 [alert] 7217#0: worker process 7357 exited on signal 11 (core dumped)
2015/12/22 08:16:50 [alert] 7217#0: worker process 7364 exited on signal 11 (core dumped)
2015/12/22 08:16:54 [alert] 7217#0: worker process 7441 exited on signal 11 (core dumped)
2015/12/22 08:18:18 [alert] 7217#0: worker process 7471 exited on signal 11 (core dumped)
comment:8 by , 10 years ago
Can you please provide:
1) nginx -V output
2) full configuration
3) debug log (http://nginx.org/en/docs/debugging_log.html)
4) backtrace (https://www.nginx.com/resources/wiki/start/topics/tutorials/debugging/)
comment:10 by , 8 years ago
| Resolution: | fixed | 
|---|---|
| Status: | closed → reopened | 
new session callback called at SSL_ST_OK, and it after ngx_http_ssl_servername;
ngx_ssl_new_session get ssl context for the current server config, not the default config.
I don't use the origin ssl context is a good idea.
Replying to mdounin:
Thank you for your report. It's actually known issue and manifests itself in a configuration like this:
server { listen 443 default ssl; server_name test; ssl_certificate ... ssl_certificate_key ... ssl_session_cache shared:SSL:1m; } server { listen 443 ssl; server_name test2; ssl_certificate ... ssl_certificate_key ... ssl_session_cache none; }I.e. if shared ssl_session_cache is used in a default server, but "ssl_session_cache none" (== default) in some SNI-based virtual host. The test is as follows:
openssl s_client -connect 127.0.0.1:443 -no_ticket -servername test2Obvious workaround is to use the same session cache for all server blocks, e.g. to configure it on the http{} level.
The patch suggested will obviously fix segfault, but not the underlying problem with disagreement between SSL context used by OpenSSL to call new session callback and the context available via SSL_get_SSL_CTX(). As far as I remember there is no way to reach original context via OpenSSL API, so probably right solution would be to save original context somewhere in SNI callback.
comment:11 by , 8 years ago
| Resolution: | → fixed | 
|---|---|
| Status: | reopened → closed | 
The particular issue is fixed. Please avoid reopening it unless you think it is not fixed.


Thank you for your report. It's actually known issue and manifests itself in a configuration like this:
server { listen 443 default ssl; server_name test; ssl_certificate ... ssl_certificate_key ... ssl_session_cache shared:SSL:1m; } server { listen 443 ssl; server_name test2; ssl_certificate ... ssl_certificate_key ... ssl_session_cache none; }I.e. if shared ssl_session_cache is used in a default server, but "ssl_session_cache none" (== default) in some SNI-based virtual host. The test is as follows:
Obvious workaround is to use the same session cache for all server blocks, e.g. to configure it on the http{} level.
The patch suggested will obviously fix segfault, but not the underlying problem with disagreement between SSL context used by OpenSSL to call new session callback and the context available via SSL_get_SSL_CTX(). As far as I remember there is no way to reach original context via OpenSSL API, so probably right solution would be to save original context somewhere in SNI callback.