#2229 closed defect (fixed)
Grpc Upstream timeout
Reported by: | Owned by: | ||
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | documentation | Version: | 1.19.x |
Keywords: | grpc, c#, java, upstream, timeout | Cc: | |
uname -a: | Windows 10 Pro | ||
nginx -V: |
nginx version: nginx/1.21.1
built by cl 16.00.40219.01 for 80x86 built with OpenSSL 1.1.1k 25 Mar 2021 TLS SNI support enabled configure arguments: --with-cc=cl --builddir=objs.msvc8 --with-debug --prefix= --conf-path=conf/nginx.conf --pid-path=logs/nginx.pid --http-log-path=logs/access.log --error-log-path=logs/error.log --sbin-path=nginx.exe --http-client-body-temp-path=temp/client_body_temp --http-proxy-temp-path=temp/proxy_temp --http-fastcgi-temp-path=temp/fastcgi_temp --http-scgi-temp-path=temp/scgi_temp --http-uwsgi-temp-path=temp/uwsgi_temp --with-cc-opt=-DFD_SETSIZE=1024 --with-pcre=objs.msvc8/lib/pcre-8.44 --with-zlib=objs.msvc8/lib/zlib-1.2.11 --with-http_v2_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_stub_status_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_auth_request_module --with-http_random_index_module --with-http_secure_link_module --with-http_slice_module --with-mail --with-stream --with-openssl=objs.msvc8/lib/openssl-1.1.1k --with-openssl-opt='no-asm no-tests -D_WIN32_WINNT=0x0501' --with-http_ssl_module --with-mail_ssl_module --with-stream_ssl_module |
Description (last modified by )
So, i've got these things : a client, a server, nginx. They all run on the same machine.
The client does some network scanning and is supposed to send it's results to the server.
The server is supposed to take the scanning-results and put them into a database.
The server is based on ASP.Net.
The client and server use the following packages
- grpc-dotnet - https://github.com/grpc/grpc-dotnet
- protobuf-net - https://github.com/protobuf-net/protobuf-net
Both also use the SSL certificate and generally they both work as expected.
The nginx proxy is supposed to be used as a Reverse-Proxy -> Clients only need to know one Endpoint for all their Requests. Nginx is only configured to pass grpc requests, nothing else.
First of all, almost all grpc requests do work. But they stop working, if the request gets too big. I dont know the exact size, but it must be under 180kb.
(180kb is the size of the scan results.In a test, sending half of these results succeded, where sending all did not. Custom test data (with the same class of course) did show the same results. If they got too big, it stopped working)
Without nginx, there are no errors while sending data from client to server. With nginx however, the client does recieve a HTTP 504 error.
Content of error.log
2021/07/30 13:08:05 [error] 23372#21368: *44 upstream timed out (10060: Ein Verbindungsversuch ist fehlgeschlagen, da die Gegenstelle nach einer bestimmten Zeitspanne nicht richtig reagiert hat, oder die hergestellte Verbindung war fehlerhaft, da der verbundene Host nicht reagiert hat) while sending request to upstream, client: 127.0.0.1, server: , request: "POST /greet.Greeter/SayHello HTTP/2.0", upstream: "grpcs://127.0.0.1:5001", host: "localhost:5002"
The repository does only contain a simplified version, but with the same end results.
Please have a look at my sample project, wich should include everything you'd need.
- Github Repository with samples https://github.com/xTeare/GrpcAndNGINX
Debug logs
- for the working request with 500 entries https://pastebin.com/q54T55vR
- for the not working request 1000 entries https://pastebin.com/6vnxiNax
Do you guy's have heard of this behaviour / know how to fix this ?
Other Resources
- Stackoverflow Question https://stackoverflow.com/questions/68547210/c-sharp-nginx-upstream-timed-out
- 2nd Stackoverflow Question (with Java) https://stackoverflow.com/questions/68708438/grpc-communication-stucks-using-nginx
- /r/nginx https://www.reddit.com/r/nginx/comments/otwhma/c_grpc_with_nginx_does_not_work/
- /r/csharp https://old.reddit.com/r/csharp/comments/otwgta/grpc_with_nginx_does_not_work/
Thanks in advance
xTeare
Change History (8)
comment:1 by , 3 years ago
Description: | modified (diff) |
---|
comment:2 by , 3 years ago
Status: | new → accepted |
---|
comment:3 by , 3 years ago
Do you have a rough estimate when this will get fixed ? Our application is quite dependent on nginx :)
Thanks in advance !
comment:4 by , 3 years ago
I'm able to reproduce it with slightly modified grpc.t, quick-and-dirty patch to reproduce:
diff --git a/grpc.t b/grpc.t --- a/grpc.t +++ b/grpc.t @@ -33,6 +33,7 @@ my $t = Test::Nginx->new()->has(qw/http daemon off; events { + use select; } http { @@ -62,6 +63,10 @@ http { } } + location /FlowControl { + grpc_pass grpcs://127.0.0.1:8082; + } + location /KeepAlive { grpc_pass u; } @@ -86,10 +91,41 @@ http { set $args $arg_c; } } + + server { + listen 127.0.0.1:8082 http2 ssl; + server_name localhost; + + ssl_certificate localhost.crt; + ssl_certificate_key localhost.key; + + location / { + grpc_pass grpc://127.0.0.1:8081; + } + } } EOF +$t->write_file('openssl.conf', <<EOF); +[ req ] +default_bits = 2048 +encrypt_key = no +distinguished_name = req_distinguished_name +[ req_distinguished_name ] +EOF + +my $d = $t->testdir(); + +foreach my $name ('localhost') { + system('openssl req -x509 -new ' + . "-config $d/openssl.conf -subj /CN=$name/ " + . "-out $d/$name.crt -keyout $d/$name.key " + . ">>$d/openssl.out 2>&1") == 0 + or die "Can't create certificate for $name: $!\n"; +} + + $t->run(); ###############################################################################
Below is a patch to address this. Testing is appreciated.
# HG changeset patch # User Maxim Dounin <mdounin@mdounin.ru> # Date 1629154008 -10800 # Tue Aug 17 01:46:48 2021 +0300 # Node ID f6582b24d622998db1a8587fc0893a1e883c0a6b # Parent 419c066cb7103165fe008339d210037f68a72d4f Upstream: fixed timeouts with gRPC, SSL and select (ticket #2229). With SSL it is possible that an established connection is ready for reading after the handshake. Further, events might be already disabled in case of level-triggered event methods. If this happens and ngx_http_upstream_send_request() blocks waiting for some data from the upstream, such as for flow control in case of gRPC, the connection will time out due to no read events on the upstream connection. Fix is explicitly check for c->read->ready if ngx_http_upstream_send_request() blocks and post a read event if it is set. Note that while it is possible to modify ngx_ssl_handshake() to keep read events active, this won't completely resolve the issue, since there can be data already received during the SSL handshake. diff --git a/src/http/ngx_http_upstream.c b/src/http/ngx_http_upstream.c --- a/src/http/ngx_http_upstream.c +++ b/src/http/ngx_http_upstream.c @@ -2113,6 +2113,10 @@ ngx_http_upstream_send_request(ngx_http_ c->tcp_nopush = NGX_TCP_NOPUSH_UNSET; } + if (c->read->ready) { + ngx_post_event(c->read, &ngx_posted_events); + } + return; }
comment:5 by , 3 years ago
I was indeed able to fix the problem with your "quick-fix". I've tested it with the same Repository
(I've sent 10k generated Strings, beforehand only 500 worked)
nginx.conf : https://pastebin.com/H6qpMTNY debug log : https://pastebin.com/vL9JwKBx build configuration : https://pastebin.com/KiGF9t9s
comment:7 by , 3 years ago
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
Fix committed, thanks for reporting this.
Looks like there is a bug that causes removing read event after successful SSL handshake on the upstream connection. From the non-working request: