Opened 7 years ago

Closed 7 years ago

#1194 closed defect (fixed)

SSL_write fails with 'bad write retry' error

Reported by: 13@… Owned by:
Priority: minor Milestone:
Component: other Version: 1.9.x
Keywords: Cc:
uname -a: Linux 3.16.XXXX x86_64
nginx -V: nginx version: nginx/1.9.15 (1.9.15.dbx24, xxx, 2016-10-12 18:52:28 UTC (release))
built with OpenSSL 1.0.2j 26 Sep 2016
TLS SNI support enabled
configure arguments: --prefix=/usr/local --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --pid-path=/var/run/nginx.pid --lock-path=/var/lock/nginx.lock --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/client_temp --http-proxy-temp-path=/var/lib/nginx/proxy_temp --http-fastcgi-temp-path=/var/lib/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/lib/nginx/uwsgi_temp --http-scgi-temp-path=/var/lib/nginx/scgi_temp --with-http_stub_status_module --with-http_ssl_module --with-http_gzip_static_module --with-http_realip_module --with-ipv6 --with-md5-asm --with-sha1-asm --with-http_v2_module --with-cc-opt='-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Wformat-security -Werror=format-security' --with-ld-opt='-Wl,-Bsymbolic-functions -Wl,-z,relro' --build='1.9.15.dbx24, xxx, 2016-10-12 18:52:28 UTC (release)' --builddir=objs-release --with-openssl=opaque-build/openssl-1.0.2j --with-zlib=opaque-build/zlib-1.2.8 --with-pcre=opaque-build/pcre-8.38 --with-pcre-opt=-DSUPPORT_UCP --with-pcre-jit --with-openssl-opt='no-seed no-bf no-cast no-md2 no-md4 no-ripemd no-krb5 no-jpake no-capieng no-idea no-rc5 no-mdc2 no-ssl2 threads enable-tlsext no-shared enable-ec_nistp_64_gcc_128' --add-module=opaque-build/ngx_devel_kit-0.2.19 --add-module=opaque-build/lua-nginx-module-0.10.2 --add-module=opaque-build/headers-more-nginx-0.29 --add-module=opaque-build/ngx_http_ip_tos_filter_module --add-module=modules/dbx_dials_module --add-module=modules/nginx_upstream_check_module --add-module=modules/ngx_http_streaming_filter_module --add-module=modules/ngx_http_upstream_dynamic_module

Description

there is noticeable periodic rate of 'SSL_write() failed (SSL: error:1409F07F:SSL routines:ssl3_write_pending:bad write retry) while sending response to client' errors and according attached debug log the issue is happening when client closes connection right after ssl negotiation and sending http requests. It causes SSL_write error for sending 'send 100 Continue' followed by SSL_write with ssl3_write_pending error during sending 500 static page. Please check attached log for more details, but here is some details:

  1. SSL negotiation has been completed and nginx reads requests and client closes connection for some reason:
    2017/01/14 02:46:58 [debug] 14576#0: *358792 accept: X.X.X.X:51644 fd:117
    ...
    2017/01/14 02:46:58 [debug] 14576#0: *358792 SSL_do_handshake: 1
    ...
    2017/01/14 02:46:59 [debug] 14576#0: *358792 SSL_read: 455
    2017/01/14 02:46:59 [debug] 14576#0: *358792 SSL_read: 0
    2017/01/14 02:46:59 [debug] 14576#0: *358792 SSL_get_error: 6
    2017/01/14 02:46:59 [debug] 14576#0: *358792 peer shutdown SSL cleanly
    2017/01/14 02:46:59 [debug] 14576#0: *358792 reusable connection: 0
    
  2. nginx starts to handle request and after processing tries to send 100 continue to the client, but it fails
    2017/01/14 02:46:59 [debug] 14576#0: *358792 post access phase: 11
    2017/01/14 02:46:59 [debug] 14576#0: *358792 send 100 Continue
    2017/01/14 02:46:59 [debug] 14576#0: *358792 SSL to write: 25
    2017/01/14 02:46:59 [debug] 14576#0: *358792 SSL_write: -1
    2017/01/14 02:46:59 [debug] 14576#0: *358792 SSL_get_error: 5
    2017/01/14 02:46:59 [info] 14576#0: *358792 SSL_write() failed (SSL:) (32: Broken pipe), client: X.X.X.X, server: A.B.C, request: "POST /X/Y/Z?k1=v1&k2=v2 HTTP/1.1", host: "A.B.C"
    
  3. nginx tries to send static page with 500 status code as a result of broken pipe, but it fails with 'bad write retry' error.
    2017/01/14 02:46:59 [debug] 14576#0: *358792 http finalize request: 500, "/X/Y/Z?k1=v1&k2=v2" a:1, c:1
    2017/01/14 02:46:59 [debug] 14576#0: *358792 http special response: 500, "/X/Y/Z?k1=v1&k2=v2"
    ...
    2017/01/14 02:46:59 [debug] 14576#0: *358792 SSL buf copy: 300
    2017/01/14 02:46:59 [debug] 14576#0: *358792 SSL buf copy: 14343
    2017/01/14 02:46:59 [debug] 14576#0: *358792 SSL to write: 14643
    2017/01/14 02:46:59 [debug] 14576#0: *358792 SSL_write: -1
    2017/01/14 02:46:59 [debug] 14576#0: *358792 SSL_get_error: 1
    2017/01/14 02:46:59 [crit] 14576#0: *358792 SSL_write() failed (SSL: error:1409F07F:SSL routines:ssl3_write_pending:bad write retry) while sending response to client, client: X.X.X.X, server: A.B.C, request: "POST /X/Y/Z?k1=v1&k2=v2 HTTP/1.1", host: "A.B.C"
    
    

Attachments (1)

error_debug (1).txt (12.8 KB ) - added by 13@… 7 years ago.
debug log

Download all attachments as: .zip

Change History (6)

by 13@…, 7 years ago

Attachment: error_debug (1).txt added

debug log

comment:1 by Maxim Dounin, 7 years ago

Please try the following patch, it should help (untested though):

# HG changeset patch
# User Maxim Dounin <mdounin@mdounin.ru>
# Date 1486042210 -10800
#      Thu Feb 02 16:30:10 2017 +0300
# Node ID 26aa5888d71e1ca476d0f6b53553c0b4a8d6c7f7
# Parent  c4914b8c3e13a3fa525ab77c36764227cdd896f0
Request body: c->error on "100 Continue" errors (ticket #1194).

diff --git a/src/http/ngx_http_request_body.c b/src/http/ngx_http_request_body.c
--- a/src/http/ngx_http_request_body.c
+++ b/src/http/ngx_http_request_body.c
@@ -835,6 +835,8 @@ ngx_http_test_expect(ngx_http_request_t 
 
     /* we assume that such small packet should be send successfully */
 
+    r->connection->error = 1;
+
     return NGX_ERROR;
 }
 

comment:2 by Alexey Ivanov, 7 years ago

Thank you for the quick fix! It'll take us a while to test that though, because problem is fairly rare and we already have another binary in canary.

comment:3 by Sergey Kandaurov, 7 years ago

Reproduced with the above description and forcing client to send RST.
In perl terms:

        setsockopt($s, SOL_SOCKET, SO_LINGER, pack("ii", 1, 0)) or die "LINGER: $!";
        $s->close(SSL_fast_shutdown => 1);

debug log from unpatched and patched versions below, respectively.

2017/02/07 09:54:21 [debug] 5100#0: *1 send 100 Continue
2017/02/07 09:54:21 [debug] 5100#0: *1 SSL to write: 25
2017/02/07 09:54:21 [debug] 5100#0: *1 SSL_write: -1
2017/02/07 09:54:21 [debug] 5100#0: *1 SSL_get_error: 5
2017/02/07 09:54:21 [info] 5100#0: *1 SSL_write() failed (SSL:) (32: Broken pipe), client
: 127.0.0.1, server: localhost, request: "POST / HTTP/1.1", host: "localhost"
2017/02/07 09:54:21 [debug] 5100#0: *1 http finalize request: 500, "/?" a:1, c:1
2017/02/07 09:54:21 [debug] 5100#0: *1 http special response: 500, "/?"
2017/02/07 09:54:21 [debug] 5100#0: *1 internal redirect: "/local?"
<..>
2017/02/07 10:00:35 [debug] 5156#0: *1 write old buf t:1 f:0 00005581E1214980, pos 00005581E1214980, size: 162 file: 0, size: 0
2017/02/07 10:00:35 [debug] 5156#0: *1 write new buf t:0 f:0 0000000000000000, pos 0000000000000000, size: 0 file: 0, size: 0
2017/02/07 10:00:35 [debug] 5156#0: *1 http write filter: l:1 f:0 s:162
2017/02/07 10:00:35 [debug] 5156#0: *1 http write filter limit 0
2017/02/07 10:00:35 [debug] 5156#0: *1 posix_memalign: 00005581E1212A30:512 @16
2017/02/07 10:00:35 [debug] 5156#0: *1 malloc: 00005581E126CA30:16384
2017/02/07 10:00:35 [debug] 5156#0: *1 SSL buf copy: 162
2017/02/07 10:00:35 [debug] 5156#0: *1 SSL to write: 162
2017/02/07 10:00:35 [debug] 5156#0: *1 SSL_write: -1
2017/02/07 10:00:35 [debug] 5156#0: *1 SSL_get_error: 1
2017/02/07 10:00:35 [crit] 5156#0: *1 SSL_write() failed (SSL: error:1409F07F:SSL routines:ssl3_write_pending:bad write retry), client: 127.0.0.1, server: localhost, request: "POST / HTTP/1.1", host: "localhost"
2017/02/07 10:00:35 [debug] 5156#0: *1 http write filter FFFFFFFFFFFFFFFF
2017/02/07 10:00:35 [debug] 5156#0: *1 http copy filter: -1 "/local?"
2017/02/07 10:00:35 [debug] 5156#0: *1 http finalize request: -1, "/local?" a:1, c:2
2017/02/07 10:00:35 [debug] 5156#0: *1 http terminate request count:2
2017/02/07 10:00:35 [debug] 5156#0: *1 http terminate cleanup count:2 blk:0
2017/02/07 10:00:35 [debug] 5156#0: *1 http finalize request: -4, "/local?" a:1, c:2
2017/02/07 10:00:35 [debug] 5156#0: *1 http request count:2 blk:0
2017/02/07 10:00:35 [debug] 5156#0: *1 http posted request: "/local?"
2017/02/07 10:00:35 [debug] 5156#0: *1 http terminate handler count:1
2017/02/07 10:00:35 [debug] 5156#0: *1 http request count:1 blk:0
2017/02/07 10:00:35 [debug] 5156#0: *1 http close request
2017/02/07 10:00:35 [debug] 5156#0: *1 http log handler
2017/02/07 10:00:35 [debug] 5156#0: *1 free: 00005581E1213290, unused: 16
2017/02/07 10:00:35 [debug] 5156#0: *1 free: 00005581E12142A0, unused: 1937
2017/02/07 10:00:35 [debug] 5156#0: *1 close http connection: 3
2017/02/07 10:00:35 [debug] 5156#0: *1 SSL_shutdown: 1

patched version:

2017/02/07 09:51:40 [debug] 5074#0: *1 send 100 Continue
2017/02/07 09:51:40 [debug] 5074#0: *1 SSL to write: 25
2017/02/07 09:51:40 [debug] 5074#0: *1 SSL_write: -1
2017/02/07 09:51:40 [debug] 5074#0: *1 SSL_get_error: 5
2017/02/07 09:51:40 [info] 5074#0: *1 SSL_write() failed (SSL:) (32: Broken pipe
), client: 127.0.0.1, server: localhost, request: "POST / HTTP/1.1", host: "loca
lhost"
2017/02/07 09:51:40 [debug] 5074#0: *1 http finalize request: 500, "/?" a:1, c:1
2017/02/07 09:51:40 [debug] 5074#0: *1 http terminate request count:1
2017/02/07 09:51:40 [debug] 5074#0: *1 http terminate cleanup count:1 blk:0
2017/02/07 09:51:40 [debug] 5074#0: *1 http posted request: "/?"
2017/02/07 09:51:40 [debug] 5074#0: *1 http terminate handler count:1
2017/02/07 09:51:40 [debug] 5074#0: *1 http request count:1 blk:0
2017/02/07 09:51:40 [debug] 5074#0: *1 http close request
2017/02/07 09:51:40 [debug] 5074#0: *1 http log handler
2017/02/07 09:51:40 [debug] 5074#0: *1 free: 000055DD44F846F0, unused: 48
2017/02/07 09:51:40 [debug] 5074#0: *1 free: 000055DD44F7B280, unused: 2633
2017/02/07 09:51:40 [debug] 5074#0: *1 close http connection: 3
2017/02/07 09:51:40 [debug] 5074#0: *1 SSL_shutdown: 1

comment:4 by Maxim Dounin <mdounin@…>, 7 years ago

In 6903:92e771de7d89/nginx:

Request body: c->error on "100 Continue" errors (ticket #1194).

comment:5 by Maxim Dounin, 7 years ago

Resolution: fixed
Status: newclosed

Fix committed.

Note: See TracTickets for help on using tickets.