#2418 closed defect (fixed)
ngx_http_upstream_process_non_buffered_request do_write Dead-loop
Reported by: | Owned by: | ||
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | documentation | Version: | 1.23.x |
Keywords: | proxy_request_buffering off | Cc: | bullerdu@… |
uname -a: | Linux cdn-dev011164234021.na61 4.19.91-008.ali4000.alios7.x86_64 #1 SMP Fri Sep 4 17:33:26 CST 2020 x86_64 x86_64 x86_64 GNU/Linux | ||
nginx -V: |
nginx version: nginx/1.23.2
built by gcc 6.5.1 20220324 (GCC) configure arguments: --prefix=/home/yefei.dyf/nginx |
Description (last modified by )
In non-buffering transport mode(proxy_request_buffering off), I encountered a write dead loop problem with the following code path:
src/http/ngx_http_upstream.c:3666 ngx_http_upstream_process_non_buffered_request -> src/http/ngx_http_upstream.c:3691 rc = ngx_http_output_filter(r, u->out_bufs); /* u->out_bufs=0x0 */ -> src/http/ngx_http_write_filter_module.c:48 ngx_http_write_filter -> src/http/ngx_http_write_filter_module.c:295 chain = c->send_chain(c, r->out, limit); src/os/uinx/ngx_linux_sendfile_chain.c:50 ngx_linux_sendfile_chain /* code here return if (!wev->ready) { return in; } */ src/http/ngx_http_write_filter_module.c:342 /* code here return if (chain) { c->buffered |= NGX_HTTP_WRITE_BUFFERED; return NGX_AGAIN; } */ src/http/ngx_http_upstream.c:3735 n = upstream->recv(upstream, b->last, size); /* code here, return n = -1 */ -> src/http/ngx_http_upstream.c:3753 /* code here, continue */ -> src/http/ngx_http_upstream.c:3691 rc = ngx_http_output_filter(r, u->out_bufs);
My analysis is as follows:
First:
when the downstream is written, ngx_linux_sendfile_chain calls ngx_writev and returns NGX_AGAIN, setting wev->ready =0
src/os/uinx/ngx_linux_sendfile_chain.c:204
if (n == NGX_AGAIN) {
wev->ready = 0;
return in;
}
Second:
src/http/ngx_http_upstream.c:3691
ngx_http_output_filter returns NGX_AGAIN, u->out_bufs = NULL, u->busy_bufs! = NULL
size > 0 && upstream->read->ready = 1, upstream->recv returns -1, So I keep writing downstream(ngx_http_output_filter), but downstream->write->ready = 0 so ngx_http_output_filter keeps returning -2
The resulting problems:
ngx_http_output_filter loop call, some filter_modules have cpu consumption problem, CPU 100%
The Solution:
Check the return value of ngx_http_output_filter. If it is NGX_AGAIN, break the write loop
src/http/ngx_http_upstream.c:3691
rc = ngx_http_output_filter(r, u->out_bufs);
if (rc == NGX_ERROR) {
ngx_http_upstream_finalize_request(r, u, NGX_ERROR);
return;
}
if (rc == NGX_AGAIN) {
...
break;
}
Change History (9)
follow-up: 2 comment:1 by , 2 years ago
follow-up: 3 comment:2 by , 2 years ago
Description: | modified (diff) |
---|
Replying to Maxim Dounin:
The loop is expected to stop once the
u->buffer
is full. Could you please clarify the configuration you are seeing the problem with?
First of all, u->buffer
is not full in my scenario. And n = upstream->recv(upstream, b->last, size);
Always return -1.
Because the nginx production environment configuration is very complicated, it cannot be provided. And this phenomenon is not easy to repeat, only occasionally.
But I was sure that size = b->end - b->last;
the size was not 0 at that time. I gdb checked the scene.
I still believe this problem is related to n = ngx_writev(c, &header);
returning -2.
follow-up: 4 comment:3 by , 2 years ago
Replying to bullerdu@…:
Replying to Maxim Dounin:
The loop is expected to stop once the
u->buffer
is full. Could you please clarify the configuration you are seeing the problem with?
First of all,
u->buffer
is not full in my scenario. Andn = upstream->recv(upstream, b->last, size);
Always return -1.
So the upstream connection is closed with an error, and upstream->read->error
is set, but nginx did not notice that due to u->busy_bufs
is not empty, and tries to send the rest of the buffers to the client. Interesting.
But if upstream->recv()
returns -1
, the upstream->read->ready
should be set to 0 (see ngx_unix_recv()
), breaking the loop.
Is the nginx -V
output you've provided in the ticket matches the actual nginx you are seeing the problem with, or it's just a placeholder? Looking at ngx_ssl_recv()
I think the described behaviour might be possible with https upstream connections, but the nginx -V
output you've provided suggests SSL is not used.
Also, are you using any 3rd party modules or patches?
comment:4 by , 2 years ago
Replying to Maxim Dounin:
Replying to bullerdu@…:
Replying to Maxim Dounin:
The loop is expected to stop once the
u->buffer
is full. Could you please clarify the configuration you are seeing the problem with?
First of all,
u->buffer
is not full in my scenario. Andn = upstream->recv(upstream, b->last, size);
Always return -1.
So the upstream connection is closed with an error, and
upstream->read->error
is set, but nginx did not notice that due tou->busy_bufs
is not empty, and tries to send the rest of the buffers to the client. Interesting.
But if
upstream->recv()
returns-1
, theupstream->read->ready
should be set to 0 (seengx_unix_recv()
), breaking the loop.
Is the
nginx -V
output you've provided in the ticket matches the actual nginx you are seeing the problem with, or it's just a placeholder? Looking atngx_ssl_recv()
I think the described behaviour might be possible with https upstream connections, but thenginx -V
output you've provided suggests SSL is not used.
Also, are you using any 3rd party modules or patches?
Yes, My problem does occur with https upstream connections. upstream->recv()
was ngx_ssl_recv()
.
I used the nginx version based on the official version 1.23 with some customized patches.
And the ngx_ssl_recv()
is the same as version 1.23.
The code in question is as follows:
c->ssl->last
was -1.
switch (c->ssl->last) { case NGX_DONE: c->read->ready = 0; c->read->eof = 1; return 0; case NGX_ERROR: c->read->error = 1; /* fall through */ case NGX_AGAIN: return c->ssl->last; }
And finally, this is a given problem, right? How to fix it? like this?
switch (c->ssl->last) { case NGX_DONE: c->read->ready = 0; c->read->eof = 1; return 0; case NGX_ERROR: c->read->ready = 0; c->read->error = 1; /* fall through */ case NGX_AGAIN: return c->ssl->last; }
follow-up: 6 comment:5 by , 2 years ago
Thanks for the details, I was able to reproduce this with an SSL error from the upstream server.
The following patch fixes the problem in my tests. Please test if it works for you:
# HG changeset patch # User Maxim Dounin <mdounin@mdounin.ru> # Date 1669621392 -10800 # Mon Nov 28 10:43:12 2022 +0300 # Node ID 78a94133b50f4f7d632d3eb71ac389163a3309f3 # Parent 0b360747c74e3fa7e439e0684a8cf1da2d14d8f6 SSL: fixed ngx_ssl_recv() to reset c->read->ready after errors. With this change, behaviour of ngx_ssl_recv() now matches ngx_unix_recv(), which used to always reset c->read->ready to 0 when returning errors. This fixes an infinite loop in unbuffered SSL proxying if writing to the client is blocked and an SSL error happens (ticket #2418). With this change, the fix for a similar issue in the stream module (6868:ee3645078759), which used a different approach of explicitly testing c->read->error instead, is no longer needed and was reverted. diff --git a/src/event/ngx_event_openssl.c b/src/event/ngx_event_openssl.c --- a/src/event/ngx_event_openssl.c +++ b/src/event/ngx_event_openssl.c @@ -2204,6 +2204,7 @@ ngx_ssl_recv(ngx_connection_t *c, u_char #endif if (c->ssl->last == NGX_ERROR) { + c->read->ready = 0; c->read->error = 1; return NGX_ERROR; } @@ -2270,6 +2271,7 @@ ngx_ssl_recv(ngx_connection_t *c, u_char #if (NGX_HAVE_FIONREAD) if (ngx_socket_nread(c->fd, &c->read->available) == -1) { + c->read->ready = 0; c->read->error = 1; ngx_connection_error(c, ngx_socket_errno, ngx_socket_nread_n " failed"); @@ -2306,6 +2308,7 @@ ngx_ssl_recv(ngx_connection_t *c, u_char return 0; case NGX_ERROR: + c->read->ready = 0; c->read->error = 1; /* fall through */ @@ -2326,6 +2329,7 @@ ngx_ssl_recv_early(ngx_connection_t *c, size_t readbytes; if (c->ssl->last == NGX_ERROR) { + c->read->ready = 0; c->read->error = 1; return NGX_ERROR; } @@ -2425,6 +2429,7 @@ ngx_ssl_recv_early(ngx_connection_t *c, return 0; case NGX_ERROR: + c->read->ready = 0; c->read->error = 1; /* fall through */ diff --git a/src/stream/ngx_stream_proxy_module.c b/src/stream/ngx_stream_proxy_module.c --- a/src/stream/ngx_stream_proxy_module.c +++ b/src/stream/ngx_stream_proxy_module.c @@ -1675,9 +1675,8 @@ ngx_stream_proxy_process(ngx_stream_sess size = b->end - b->last; - if (size && src->read->ready && !src->read->delayed - && !src->read->error) - { + if (size && src->read->ready && !src->read->delayed) { + if (limit_rate) { limit = (off_t) limit_rate * (ngx_time() - u->start_sec + 1) - *received;
comment:6 by , 2 years ago
Replying to Maxim Dounin:
Thanks for the details, I was able to reproduce this with an SSL error from the upstream server.
The following patch fixes the problem in my tests. Please test if it works for you:
# HG changeset patch # User Maxim Dounin <mdounin@mdounin.ru> # Date 1669621392 -10800 # Mon Nov 28 10:43:12 2022 +0300 # Node ID 78a94133b50f4f7d632d3eb71ac389163a3309f3 # Parent 0b360747c74e3fa7e439e0684a8cf1da2d14d8f6 SSL: fixed ngx_ssl_recv() to reset c->read->ready after errors. With this change, behaviour of ngx_ssl_recv() now matches ngx_unix_recv(), which used to always reset c->read->ready to 0 when returning errors. This fixes an infinite loop in unbuffered SSL proxying if writing to the client is blocked and an SSL error happens (ticket #2418). With this change, the fix for a similar issue in the stream module (6868:ee3645078759), which used a different approach of explicitly testing c->read->error instead, is no longer needed and was reverted. diff --git a/src/event/ngx_event_openssl.c b/src/event/ngx_event_openssl.c --- a/src/event/ngx_event_openssl.c +++ b/src/event/ngx_event_openssl.c @@ -2204,6 +2204,7 @@ ngx_ssl_recv(ngx_connection_t *c, u_char #endif if (c->ssl->last == NGX_ERROR) { + c->read->ready = 0; c->read->error = 1; return NGX_ERROR; } @@ -2270,6 +2271,7 @@ ngx_ssl_recv(ngx_connection_t *c, u_char #if (NGX_HAVE_FIONREAD) if (ngx_socket_nread(c->fd, &c->read->available) == -1) { + c->read->ready = 0; c->read->error = 1; ngx_connection_error(c, ngx_socket_errno, ngx_socket_nread_n " failed"); @@ -2306,6 +2308,7 @@ ngx_ssl_recv(ngx_connection_t *c, u_char return 0; case NGX_ERROR: + c->read->ready = 0; c->read->error = 1; /* fall through */ @@ -2326,6 +2329,7 @@ ngx_ssl_recv_early(ngx_connection_t *c, size_t readbytes; if (c->ssl->last == NGX_ERROR) { + c->read->ready = 0; c->read->error = 1; return NGX_ERROR; } @@ -2425,6 +2429,7 @@ ngx_ssl_recv_early(ngx_connection_t *c, return 0; case NGX_ERROR: + c->read->ready = 0; c->read->error = 1; /* fall through */ diff --git a/src/stream/ngx_stream_proxy_module.c b/src/stream/ngx_stream_proxy_module.c --- a/src/stream/ngx_stream_proxy_module.c +++ b/src/stream/ngx_stream_proxy_module.c @@ -1675,9 +1675,8 @@ ngx_stream_proxy_process(ngx_stream_sess size = b->end - b->last; - if (size && src->read->ready && !src->read->delayed - && !src->read->error) - { + if (size && src->read->ready && !src->read->delayed) { + if (limit_rate) { limit = (off_t) limit_rate * (ngx_time() - u->start_sec + 1) - *received;
Thanks for your patch, it looks good to me.
comment:7 by , 2 years ago
Submitted for review to the nginx-devel@ mailing list:
https://mailman.nginx.org/pipermail/nginx-devel/2022-November/63QHXAIG6Y2VF7NRS4VHOZ2IWXQDJNF3.html
Thanks for testing.
comment:9 by , 2 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Fix committed and will be available in the next release, thanks for reporting this.
The loop is expected to stop once the
u->buffer
is full. Could you please clarify the configuration you are seeing the problem with?