Opened 20 months ago

Closed 20 months ago

Last modified 20 months ago

#1519 closed defect (fixed)

grpc_pass causes grpc TCP reset when streaming a lot of data (with default gprc_buffer_size)

Reported by: naggie@… Owned by: mdounin
Priority: major Milestone:
Component: nginx-module Version: 1.13.x
Keywords: http_v2 grpc ubuntu Cc: callan.bryant@…
uname -a: Linux testbox 4.4.0-1052-aws #61-Ubuntu SMP Mon Feb 12 23:05:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.13.10 built by gcc 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5) built with OpenSSL 1.0.2g 1 Mar 2016 TLS SNI support enabled configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fPIC' --with-ld-opt='-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -pie'

Description

grpc_pass seems to cause grpc core to do a TCP reset when streaming a lot of data, ostensibly when response headers are being sent.

The grpc client experiences a HTTP/2 RST_STREAM frame.

The problem *appears* to be fixed if the grpc_buffer_size is set to a large number such as 100M. Presumably, this makes sense if the upstream GRPC server is able to produce data faster than the client can receive it; however I expected grpc_pass to act like proxy_pass and use a temporary file if the buffer size is exceeded.

Alternatively, disabling buffering and relying on flow-control seems like a good option, considering the GRPC stream data may be real-time in some use cases. With proxy_pass this is possible with proxy_buffering off. Unfortunately there is no such option with grpc_pass.


I have created a gist which includes a minimal example (with nginx configuration, versions etc) to reproduce this:

https://gist.github.com/naggie/1c432b41613a23497a3c6f67c0adac28

Thanks!

Change History (5)

comment:1 Changed 20 months ago by mdounin

  • Owner set to mdounin
  • Status changed from new to assigned

Thank you for your report, I was able to reproduce the problem and it seems to be a bug which prevents things from working properly here (normally, grpc_pass works similar to proxy_pass with proxy_buffering off).

Please try the following patch:

# HG changeset patch
# User Maxim Dounin <mdounin@mdounin.ru>
# Date 1522859697 -10800
#      Wed Apr 04 19:34:57 2018 +0300
# Node ID 63b9eade4ff18f08887287e2642776d05ace8a3c
# Parent  d4cc2edb4ff8391d0c7419e91e6fcc988c510654
Upstream: fixed u->conf->preserve_output (ticket #1519).

Previously, ngx_http_upstream_process_header() might be called after
we've finished reading response headers and switched to a different read
event handler, leading to errors with gRPC proxying.  Additionally,
the u->conf->read_timeout timer might re-armed during reading response
headers (while this is expected to be a single timeout on reading
the whole response header).

diff --git a/src/http/ngx_http_upstream.c b/src/http/ngx_http_upstream.c
--- a/src/http/ngx_http_upstream.c
+++ b/src/http/ngx_http_upstream.c
@@ -2013,8 +2013,6 @@ ngx_http_upstream_send_request(ngx_http_
 
     /* rc == NGX_OK */
 
-    u->request_body_sent = 1;
-
     if (c->write->timer_set) {
         ngx_del_timer(c->write);
     }
@@ -2041,11 +2039,15 @@ ngx_http_upstream_send_request(ngx_http_
         return;
     }
 
-    ngx_add_timer(c->read, u->conf->read_timeout);
-
-    if (c->read->ready) {
-        ngx_http_upstream_process_header(r, u);
-        return;
+    if (!u->request_body_sent) {
+        u->request_body_sent = 1;
+
+        ngx_add_timer(c->read, u->conf->read_timeout);
+
+        if (c->read->ready && !u->header_sent) {
+            ngx_http_upstream_process_header(r, u);
+            return;
+        }
     }
 }
 

comment:2 Changed 20 months ago by naggie@…

I am happy to confirm that this patch fixes the issue. I ran it against the Debian package provided for Ubuntu 16.04, and did not set a buffer size.

Thank you Maxim for the effort and fast response.


For anyone that needs to apply this fix in the mean time:

mkdir -p ~/src/debian
cd ~/src/debian/nginx-1.13.11/
patch -p1 < grpc_fix.patch
dpkg-buildpackage -uc -b -rfakeroot
cd ~/src/debian
dpkg -i nginx_1.13.11-1~xenial_amd64.deb

Beware of apt automatically updating the package, you'll have to freeze it until this patch makes it to mainline.

comment:3 Changed 20 months ago by Maxim Dounin <mdounin@…>

In 7260:08f114ed5449/nginx:

Upstream: fixed u->conf->preserve_output (ticket #1519).

Previously, ngx_http_upstream_process_header() might be called after
we've finished reading response headers and switched to a different read
event handler, leading to errors with gRPC proxying. Additionally,
the u->conf->read_timeout timer might be re-armed during reading response
headers (while this is expected to be a single timeout on reading
the whole response header).

comment:4 Changed 20 months ago by mdounin

  • Resolution set to fixed
  • Status changed from assigned to closed

Fix committed and will be available in the next release. Thanks for reporting this.

comment:5 Changed 20 months ago by Sergey Kandaurov <pluknet@…>

In 1318:6de2a27af2d3/nginx-tests:

Tests: grpc preserve output tests added (ticket #1519).

Note: See TracTickets for help on using tickets.