#2418 closed defect (fixed)
ngx_http_upstream_process_non_buffered_request do_write Dead-loop
| Reported by: | Owned by: | ||
|---|---|---|---|
| Priority: | minor | Milestone: | |
| Component: | documentation | Version: | 1.23.x |
| Keywords: | proxy_request_buffering off | Cc: | bullerdu@… |
| uname -a: | Linux cdn-dev011164234021.na61 4.19.91-008.ali4000.alios7.x86_64 #1 SMP Fri Sep 4 17:33:26 CST 2020 x86_64 x86_64 x86_64 GNU/Linux | ||
| nginx -V: |
nginx version: nginx/1.23.2
built by gcc 6.5.1 20220324 (GCC) configure arguments: --prefix=/home/yefei.dyf/nginx |
||
Description (last modified by )
In non-buffering transport mode(proxy_request_buffering off), I encountered a write dead loop problem with the following code path:
src/http/ngx_http_upstream.c:3666 ngx_http_upstream_process_non_buffered_request
->
src/http/ngx_http_upstream.c:3691
rc = ngx_http_output_filter(r, u->out_bufs); /* u->out_bufs=0x0 */
->
src/http/ngx_http_write_filter_module.c:48
ngx_http_write_filter
->
src/http/ngx_http_write_filter_module.c:295
chain = c->send_chain(c, r->out, limit);
src/os/uinx/ngx_linux_sendfile_chain.c:50
ngx_linux_sendfile_chain
/* code here return
if (!wev->ready) {
return in;
}
*/
src/http/ngx_http_write_filter_module.c:342
/* code here return
if (chain) {
c->buffered |= NGX_HTTP_WRITE_BUFFERED;
return NGX_AGAIN;
}
*/
src/http/ngx_http_upstream.c:3735
n = upstream->recv(upstream, b->last, size);
/* code here, return n = -1 */
->
src/http/ngx_http_upstream.c:3753
/* code here, continue */
->
src/http/ngx_http_upstream.c:3691
rc = ngx_http_output_filter(r, u->out_bufs);
My analysis is as follows:
First:
when the downstream is written, ngx_linux_sendfile_chain calls ngx_writev and returns NGX_AGAIN, setting wev->ready =0
src/os/uinx/ngx_linux_sendfile_chain.c:204
if (n == NGX_AGAIN) {
wev->ready = 0;
return in;
}
Second:
src/http/ngx_http_upstream.c:3691
ngx_http_output_filter returns NGX_AGAIN, u->out_bufs = NULL, u->busy_bufs! = NULL
size > 0 && upstream->read->ready = 1, upstream->recv returns -1, So I keep writing downstream(ngx_http_output_filter), but downstream->write->ready = 0 so ngx_http_output_filter keeps returning -2
The resulting problems:
ngx_http_output_filter loop call, some filter_modules have cpu consumption problem, CPU 100%
The Solution:
Check the return value of ngx_http_output_filter. If it is NGX_AGAIN, break the write loop
src/http/ngx_http_upstream.c:3691
rc = ngx_http_output_filter(r, u->out_bufs);
if (rc == NGX_ERROR) {
ngx_http_upstream_finalize_request(r, u, NGX_ERROR);
return;
}
if (rc == NGX_AGAIN) {
...
break;
}
Change History (9)
follow-up: 2 comment:1 by , 3 years ago
follow-up: 3 comment:2 by , 3 years ago
| Description: | modified (diff) |
|---|
Replying to Maxim Dounin:
The loop is expected to stop once the
u->bufferis full. Could you please clarify the configuration you are seeing the problem with?
First of all, u->buffer is not full in my scenario. And n = upstream->recv(upstream, b->last, size); Always return -1.
Because the nginx production environment configuration is very complicated, it cannot be provided. And this phenomenon is not easy to repeat, only occasionally.
But I was sure that size = b->end - b->last; the size was not 0 at that time. I gdb checked the scene.
I still believe this problem is related to n = ngx_writev(c, &header); returning -2.
follow-up: 4 comment:3 by , 3 years ago
Replying to bullerdu@…:
Replying to Maxim Dounin:
The loop is expected to stop once the
u->bufferis full. Could you please clarify the configuration you are seeing the problem with?
First of all,
u->bufferis not full in my scenario. Andn = upstream->recv(upstream, b->last, size);Always return -1.
So the upstream connection is closed with an error, and upstream->read->error is set, but nginx did not notice that due to u->busy_bufs is not empty, and tries to send the rest of the buffers to the client. Interesting.
But if upstream->recv() returns -1, the upstream->read->ready should be set to 0 (see ngx_unix_recv()), breaking the loop.
Is the nginx -V output you've provided in the ticket matches the actual nginx you are seeing the problem with, or it's just a placeholder? Looking at ngx_ssl_recv() I think the described behaviour might be possible with https upstream connections, but the nginx -V output you've provided suggests SSL is not used.
Also, are you using any 3rd party modules or patches?
comment:4 by , 3 years ago
Replying to Maxim Dounin:
Replying to bullerdu@…:
Replying to Maxim Dounin:
The loop is expected to stop once the
u->bufferis full. Could you please clarify the configuration you are seeing the problem with?
First of all,
u->bufferis not full in my scenario. Andn = upstream->recv(upstream, b->last, size);Always return -1.
So the upstream connection is closed with an error, and
upstream->read->erroris set, but nginx did not notice that due tou->busy_bufsis not empty, and tries to send the rest of the buffers to the client. Interesting.
But if
upstream->recv()returns-1, theupstream->read->readyshould be set to 0 (seengx_unix_recv()), breaking the loop.
Is the
nginx -Voutput you've provided in the ticket matches the actual nginx you are seeing the problem with, or it's just a placeholder? Looking atngx_ssl_recv()I think the described behaviour might be possible with https upstream connections, but thenginx -Voutput you've provided suggests SSL is not used.
Also, are you using any 3rd party modules or patches?
Yes, My problem does occur with https upstream connections. upstream->recv() was ngx_ssl_recv().
I used the nginx version based on the official version 1.23 with some customized patches.
And the ngx_ssl_recv() is the same as version 1.23.
The code in question is as follows:
c->ssl->last was -1.
switch (c->ssl->last) {
case NGX_DONE:
c->read->ready = 0;
c->read->eof = 1;
return 0;
case NGX_ERROR:
c->read->error = 1;
/* fall through */
case NGX_AGAIN:
return c->ssl->last;
}
And finally, this is a given problem, right? How to fix it? like this?
switch (c->ssl->last) {
case NGX_DONE:
c->read->ready = 0;
c->read->eof = 1;
return 0;
case NGX_ERROR:
c->read->ready = 0;
c->read->error = 1;
/* fall through */
case NGX_AGAIN:
return c->ssl->last;
}
follow-up: 6 comment:5 by , 3 years ago
Thanks for the details, I was able to reproduce this with an SSL error from the upstream server.
The following patch fixes the problem in my tests. Please test if it works for you:
# HG changeset patch
# User Maxim Dounin <mdounin@mdounin.ru>
# Date 1669621392 -10800
# Mon Nov 28 10:43:12 2022 +0300
# Node ID 78a94133b50f4f7d632d3eb71ac389163a3309f3
# Parent 0b360747c74e3fa7e439e0684a8cf1da2d14d8f6
SSL: fixed ngx_ssl_recv() to reset c->read->ready after errors.
With this change, behaviour of ngx_ssl_recv() now matches ngx_unix_recv(),
which used to always reset c->read->ready to 0 when returning errors.
This fixes an infinite loop in unbuffered SSL proxying if writing to the
client is blocked and an SSL error happens (ticket #2418).
With this change, the fix for a similar issue in the stream module
(6868:ee3645078759), which used a different approach of explicitly
testing c->read->error instead, is no longer needed and was reverted.
diff --git a/src/event/ngx_event_openssl.c b/src/event/ngx_event_openssl.c
--- a/src/event/ngx_event_openssl.c
+++ b/src/event/ngx_event_openssl.c
@@ -2204,6 +2204,7 @@ ngx_ssl_recv(ngx_connection_t *c, u_char
#endif
if (c->ssl->last == NGX_ERROR) {
+ c->read->ready = 0;
c->read->error = 1;
return NGX_ERROR;
}
@@ -2270,6 +2271,7 @@ ngx_ssl_recv(ngx_connection_t *c, u_char
#if (NGX_HAVE_FIONREAD)
if (ngx_socket_nread(c->fd, &c->read->available) == -1) {
+ c->read->ready = 0;
c->read->error = 1;
ngx_connection_error(c, ngx_socket_errno,
ngx_socket_nread_n " failed");
@@ -2306,6 +2308,7 @@ ngx_ssl_recv(ngx_connection_t *c, u_char
return 0;
case NGX_ERROR:
+ c->read->ready = 0;
c->read->error = 1;
/* fall through */
@@ -2326,6 +2329,7 @@ ngx_ssl_recv_early(ngx_connection_t *c,
size_t readbytes;
if (c->ssl->last == NGX_ERROR) {
+ c->read->ready = 0;
c->read->error = 1;
return NGX_ERROR;
}
@@ -2425,6 +2429,7 @@ ngx_ssl_recv_early(ngx_connection_t *c,
return 0;
case NGX_ERROR:
+ c->read->ready = 0;
c->read->error = 1;
/* fall through */
diff --git a/src/stream/ngx_stream_proxy_module.c b/src/stream/ngx_stream_proxy_module.c
--- a/src/stream/ngx_stream_proxy_module.c
+++ b/src/stream/ngx_stream_proxy_module.c
@@ -1675,9 +1675,8 @@ ngx_stream_proxy_process(ngx_stream_sess
size = b->end - b->last;
- if (size && src->read->ready && !src->read->delayed
- && !src->read->error)
- {
+ if (size && src->read->ready && !src->read->delayed) {
+
if (limit_rate) {
limit = (off_t) limit_rate * (ngx_time() - u->start_sec + 1)
- *received;
comment:6 by , 3 years ago
Replying to Maxim Dounin:
Thanks for the details, I was able to reproduce this with an SSL error from the upstream server.
The following patch fixes the problem in my tests. Please test if it works for you:
# HG changeset patch # User Maxim Dounin <mdounin@mdounin.ru> # Date 1669621392 -10800 # Mon Nov 28 10:43:12 2022 +0300 # Node ID 78a94133b50f4f7d632d3eb71ac389163a3309f3 # Parent 0b360747c74e3fa7e439e0684a8cf1da2d14d8f6 SSL: fixed ngx_ssl_recv() to reset c->read->ready after errors. With this change, behaviour of ngx_ssl_recv() now matches ngx_unix_recv(), which used to always reset c->read->ready to 0 when returning errors. This fixes an infinite loop in unbuffered SSL proxying if writing to the client is blocked and an SSL error happens (ticket #2418). With this change, the fix for a similar issue in the stream module (6868:ee3645078759), which used a different approach of explicitly testing c->read->error instead, is no longer needed and was reverted. diff --git a/src/event/ngx_event_openssl.c b/src/event/ngx_event_openssl.c --- a/src/event/ngx_event_openssl.c +++ b/src/event/ngx_event_openssl.c @@ -2204,6 +2204,7 @@ ngx_ssl_recv(ngx_connection_t *c, u_char #endif if (c->ssl->last == NGX_ERROR) { + c->read->ready = 0; c->read->error = 1; return NGX_ERROR; } @@ -2270,6 +2271,7 @@ ngx_ssl_recv(ngx_connection_t *c, u_char #if (NGX_HAVE_FIONREAD) if (ngx_socket_nread(c->fd, &c->read->available) == -1) { + c->read->ready = 0; c->read->error = 1; ngx_connection_error(c, ngx_socket_errno, ngx_socket_nread_n " failed"); @@ -2306,6 +2308,7 @@ ngx_ssl_recv(ngx_connection_t *c, u_char return 0; case NGX_ERROR: + c->read->ready = 0; c->read->error = 1; /* fall through */ @@ -2326,6 +2329,7 @@ ngx_ssl_recv_early(ngx_connection_t *c, size_t readbytes; if (c->ssl->last == NGX_ERROR) { + c->read->ready = 0; c->read->error = 1; return NGX_ERROR; } @@ -2425,6 +2429,7 @@ ngx_ssl_recv_early(ngx_connection_t *c, return 0; case NGX_ERROR: + c->read->ready = 0; c->read->error = 1; /* fall through */ diff --git a/src/stream/ngx_stream_proxy_module.c b/src/stream/ngx_stream_proxy_module.c --- a/src/stream/ngx_stream_proxy_module.c +++ b/src/stream/ngx_stream_proxy_module.c @@ -1675,9 +1675,8 @@ ngx_stream_proxy_process(ngx_stream_sess size = b->end - b->last; - if (size && src->read->ready && !src->read->delayed - && !src->read->error) - { + if (size && src->read->ready && !src->read->delayed) { + if (limit_rate) { limit = (off_t) limit_rate * (ngx_time() - u->start_sec + 1) - *received;
Thanks for your patch, it looks good to me.
comment:7 by , 3 years ago
Submitted for review to the nginx-devel@ mailing list:
https://mailman.nginx.org/pipermail/nginx-devel/2022-November/63QHXAIG6Y2VF7NRS4VHOZ2IWXQDJNF3.html
Thanks for testing.
comment:9 by , 3 years ago
| Resolution: | → fixed |
|---|---|
| Status: | new → closed |
Fix committed and will be available in the next release, thanks for reporting this.

The loop is expected to stop once the
u->bufferis full. Could you please clarify the configuration you are seeing the problem with?