Opened 7 years ago

Closed 7 years ago

Last modified 3 years ago

#413 closed defect (fixed)

Extra roundtrip during SSL handshake with long certificate chains

Reported by: Jiri Horky Owned by:
Priority: major Milestone:
Component: nginx-core Version: 1.3.x
Keywords: SSL, Cc:
uname -a: 2.6.32-220.el6.x86_64 #1 SMP Tue Dec 6 19:48:22 GMT 2011 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.5.4
TLS SNI support enabled
configure arguments: --prefix=/root/horky/nginx-root/ --without-mail_smtp_module --without-mail_imap_module --without-mail_pop3_module --without-http_auth_basic_module --without-http_autoindex_module --without-http_geo_module --without-http_map_module --without-http_fastcgi_module --without-http_scgi_module --without-http_memcached_module --without-http_empty_gif_module --with-http_ssl_module --without-http_rewrite_module --with-poll_module

Description

Hi,

during some performance testing I noticed that there is an extra RTT involved during the SSL handshake when using a big enough certificate file (a 4096-bit self-signed certificate is big enough to trigger the error).
I have traced the problem to the size of OpenSSL's write buffers: DEFAULT_BUFFER_SIZE=4096 in crypto/bio/bf_buff.c. Because only as many bytes as fit in the buffer are flushed at a time, the certificate does not reach the client in a single write(2) to the socket. The remaining bytes are cached until OpenSSL has nothing else to send or decides to flush, which, unfortunately, waits for an ACK from the client.

Here is an excerpt of the communication from tcpdump (run on the client); the RTT is 100 ms. This shows a simple openssl s_client -connect operation (just the SSL handshake):

  1. 06:06:28.837676 IP client.42233 > server.com.https: Flags [S], seq 2917118389, win 2920, options [mss 1460,sackOK,TS val 32817878 ecr 0,nop,wscale 13], length 0
  2. 06:06:28.939046 IP server.com.https > client.42233: Flags [S.], seq 1395986130, ack 2917118390, win 30408, options [mss 1380,sackOK,TS val 4079167253 ecr 32817878,nop,wscale 13], length 0
  3. 06:06:29.033098 IP client.42233 > server.com.https: Flags [.], ack 1, win 1, options [nop,nop,TS val 32818003 ecr 4079167253], length 0
  4. 06:06:29.224189 IP client.42233 > server.com.https: Flags [P.], seq 1:113, ack 1, win 1, options [nop,nop,TS val 32818194 ecr 4079167253], length 112
  5. 06:06:29.255924 IP server.com.https > client.42233: Flags [.], ack 113, win 4, options [nop,nop,TS val 4079167570 ecr 32818194], length 0
  6. 06:06:29.256512 IP server.com.https > client.42233: Flags [.], seq 1:1369, ack 113, win 4, options [nop,nop,TS val 4079167570 ecr 32818194], length 1368
  7. 06:06:29.256520 IP server.com.https > client.42233: Flags [.], seq 1369:2737, ack 113, win 4, options [nop,nop,TS val 4079167570 ecr 32818194], length 1368
  8. 06:06:29.256522 IP server.com.https > client.42233: Flags [P.], seq 2737:4097, ack 113, win 4, options [nop,nop,TS val 4079167570 ecr 32818194], length 1360
  9. 06:06:29.350656 IP client.42233 > server.com.https: Flags [.], ack 1369, win 1, options [nop,nop,TS val 32818321 ecr 4079167570], length 0
  10. 06:06:29.350716 IP client.42233 > server.com.https: Flags [.], ack 2737, win 1, options [nop,nop,TS val 32818321 ecr 4079167570], length 0
  11. 06:06:29.350743 IP client.42233 > server.com.https: Flags [.], ack 4097, win 1, options [nop,nop,TS val 32818321 ecr 4079167570], length 0
  12. 06:06:29.383781 IP server.com.https > client.42233: Flags [P.], seq 4097:4939, ack 113, win 4, options [nop,nop,TS val 4079167696 ecr 32818321], length 842
  13. 06:06:29.517408 IP client.42233 > server.com.https: Flags [.], ack 4939, win 1, options [nop,nop,TS val 32818488 ecr 4079167696], length 0
  14. 06:06:29.530716 IP client.42233 > server.com.https: Flags [P.], seq 113:311, ack 4939, win 1, options [nop,nop,TS val 32818501 ecr 4079167696], length 198
  15. 06:06:29.564909 IP server.com.https > client.42233: Flags [P.], seq 4939:5173, ack 311, win 4, options [nop,nop,TS val 4079167879 ecr 32818501], length 234
  16. 06:06:29.738429 IP client.42233 > server.com.https: Flags [.], ack 5173, win 1, options [nop,nop,TS val 32818709 ecr 4079167879], length 0
  17. 06:06:33.026897 IP client.42233 > server.com.https: Flags [F.], seq 311, ack 5173, win 1, options [nop,nop,TS val 32821997 ecr 4079167879], length 0
  18. 06:06:33.053611 IP server.com.https > client.42233: Flags [F.], seq 5173, ack 312, win 4, options [nop,nop,TS val 4079171372 ecr 32821997], length 0
  19. 06:06:33.147658 IP client.42233 > server.com.https: Flags [.], ack 5174, win 1, options [nop,nop,TS val 32822118 ecr 4079171372], length 0

Please note that the server sends 4096 bytes in packets 6-8 and then waits for an ACK; only after that does it continue with the last part of the certificate in packet 12, which unnecessarily prolongs the whole handshake.

This does not happen if you use smaller certificates.

The point is that OpenSSL's default buffer size can be increased using the BIO_set_write_buffer_size call. Could you please increase it and, possibly, make it configurable?

I double-checked that neither the server's initial congestion window nor the client's window size is the problem.

BTW: The problem is also reproducible using embedded openssl s_server.

Attachments (2)

out.affinity.one.by.one.5s.2048key.RSA.aniCluster.patched.nginx.only.sorted.png (140.7 KB ) - added by Jiří Horký 7 years ago.
Performance with SSL handshake patch applied
out.affinity.one.by.one.5s.2048key.RSA.again.sorted.png (151.8 KB ) - added by Jiří Horký 7 years ago.


Change History (10)

comment:1 by Maxim Dounin, 7 years ago

Thank you for the report. I don't see how you were able to trigger this with a single 4096-bit certificate - the one generated by "openssl req -x509 -newkey rsa:4096 ..." is only slightly larger than 2k - but obviously a long enough certificate chain can easily trigger this.

Unfortunately, the buffering in question happens inside OpenSSL, and adjusting it isn't really trivial without touching OpenSSL internals. The following patch tries to do it in a sane way:

# HG changeset patch
# User Maxim Dounin <mdounin@mdounin.ru>
# Date 1379620217 -14400
# Node ID 0b7139824b53d2be0d14d5be92dbd3a912f10629
# Parent  f5e7682109be659b4372220c69ebe13df64da69b
SSL: adjust buffer used by OpenSSL during handshake (ticket #413).

diff --git a/src/event/ngx_event_openssl.c b/src/event/ngx_event_openssl.c
--- a/src/event/ngx_event_openssl.c
+++ b/src/event/ngx_event_openssl.c
@@ -521,8 +521,29 @@ ngx_ssl_verify_callback(int ok,
 static void
 ngx_ssl_info_callback(const ngx_ssl_conn_t *ssl_conn, int where, int ret)
 {
+    BIO               *rbio, *wbio;
     ngx_connection_t  *c;
 
+    if (where == SSL_CB_ACCEPT_LOOP) {
+        c = ngx_ssl_get_connection((ngx_ssl_conn_t *) ssl_conn);
+
+        if (!c->ssl->handshake_buffer_set) {
+             /*
+              * by default OpenSSL uses 4k buffer during a handshake,
+              * which is too low for long certificate chains and might
+              * result in extra round-trips
+              */
+
+            rbio = SSL_get_rbio(ssl_conn);
+            wbio = SSL_get_wbio(ssl_conn);
+
+            if (rbio != wbio) {
+                (void) BIO_set_write_buffer_size(wbio, NGX_SSL_BUFSIZE);
+                c->ssl->handshake_buffer_set = 1;
+            }
+        }
+    }
+
     if (where & SSL_CB_HANDSHAKE_START) {
         c = ngx_ssl_get_connection((ngx_ssl_conn_t *) ssl_conn);
 
diff --git a/src/event/ngx_event_openssl.h b/src/event/ngx_event_openssl.h
--- a/src/event/ngx_event_openssl.h
+++ b/src/event/ngx_event_openssl.h
@@ -48,6 +48,7 @@ typedef struct {
     unsigned                    buffer:1;
     unsigned                    no_wait_shutdown:1;
     unsigned                    no_send_shutdown:1;
+    unsigned                    handshake_buffer_set:1;
 } ngx_ssl_connection_t;
 
 

A better (and not nginx-only) solution would probably be in ssl_init_wbio_buffer() inside OpenSSL, which actually already has the relevant call, though it's commented out.

comment:2 by Jiří Horký, 7 years ago

Hi,

thank you for the prompt response! You are right, the bug does not trigger with 4096-bit certificates - I got lost in the many different scenarios I tested. I saw some additional latency with 4096-bit certs but never confirmed that it comes from the network (it was just CPU cost), whereas I knew the network was the cause for 2048-bit certs with the CA's certs bundled. Could you please change the ticket's title?

With the submitted patch, the extra roundtrip is gone, and I measured higher and more stable performance!

Thank you again.

Jiri Horky

comment:3 by Maxim Dounin, 7 years ago

Status: new → accepted
Summary: Extra roundtrip during SSL handshake with 4096-bit certificates (or smaller with other CAs bundled) → Extra roundtrip during SSL handshake with long certificate chains

Thanks for testing; the ticket's title has been adjusted. If you have any numbers to share, please do - it would be interesting.

by Jiří Horký, 7 years ago

Performance with SSL handshake patch applied

comment:4 by Jiří Horký, 7 years ago

Hi,

for a performance comparison, see the attached images. This is a measurement of SSL handshakes alone (no HTTP involved) from 28 clients, each one doing 10 handshakes in parallel as fast as possible. Clients are started 5 s apart (vertical lines); all clients run during the last 30 s. The RTT between the clients and the server was about 8.5 ms. The latency plot includes only successful connections; the 1000 ms maximum latency comes from a lost initial SYN from one of the clients. The server has one E5 2620 processor with HT enabled; nginx was configured with 6 workers with tuned affinity (for some reason, without tuning worker affinity, all of them appear to run on two cores - moreover, on the same physical one).

Jiri Horky

comment:5 by Maxim Dounin, 7 years ago

Thanks, looks good.

comment:6 by Maxim Dounin <mdounin@…>, 7 years ago

In a720f0b0e08345ebb01353250f4031bb6e141385/nginx:

SSL: adjust buffer used by OpenSSL during handshake (ticket #413).

comment:7 by Maxim Dounin, 7 years ago

Resolution: fixed
Status: accepted → closed

Slightly modified patch committed, thanks for testing.

comment:8 by Maxim Dounin <mdounin@…>, 3 years ago

In 7008:29c6d66b83ba/nginx:

SSL: set TCP_NODELAY on SSL connections before handshake.

With OpenSSL 1.1.0+, the workaround for handshake buffer size as introduced
in a720f0b0e083 (ticket #413) no longer works, as OpenSSL no longer exposes
handshake buffers, see https://github.com/openssl/openssl/commit/2e7dc7cd688.
Moreover, it is no longer possible to adjust handshake buffers at all now.

To avoid additional RTT if handshake uses more than 4k we now set TCP_NODELAY
on SSL connections before handshake. While this still results in sub-optimal
network utilization due to incomplete packets being sent, it seems to be
better than nothing.
