Opened 5 months ago

Last modified 4 months ago

#2568 new enhancement

Introduce send_timeout and proxy_send_timeout in the stream module

Reported by: felipeapolanco@…
Owned by:
Priority: minor
Milestone:
Component: nginx-module
Version: 1.25.x
Keywords: stream
Cc: felipeapolanco@…
uname -a: Linux localhost 3.10.0-1160.95.1.el7.x86_64 #1 SMP Mon Jul 24 13:59:37 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx -V

nginx version: nginx/1.20.1

built by gcc 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)

built with OpenSSL 1.1.1k FIPS 25 Mar 2021

TLS SNI support enabled

configure arguments: --prefix=/usr/share/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/tmp/client_body --http-proxy-temp-path=/var/lib/nginx/tmp/proxy --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --http-scgi-temp-path=/var/lib/nginx/tmp/scgi --pid-path=/run/nginx.pid --lock-path=/run/lock/subsys/nginx --user=nginx --group=nginx --with-compat --with-debug --with-file-aio --with-google_perftools_module --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_degradation_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_image_filter_module=dynamic --with-http_mp4_module --with-http_perl_module=dynamic --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-http_xslt_module=dynamic --with-mail=dynamic --with-mail_ssl_module --with-pcre --with-pcre-jit --with-stream=dynamic --with-stream_ssl_module --with-stream_ssl_preread_module --with-threads --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic' --with-ld-opt='-Wl,-z,relro -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -Wl,-E'

Description

myF5 Case # 00508180

Hello,
We would like to request an enhancement to NGINX for handling stale sockets.

On payment networks, the clients connect to the network, and the network sends authorization requests for the clients to approve.

We have observed that when a client becomes unresponsive (no TCP ACK) or the client host is too slow to process the data, the socket send buffer queues up until it is full. NGINX never closes the socket: the socket becomes stale, and NGINX keeps trying to send the queued data forever.

The problem is that the payload is time-sensitive: some authorization requests expire after a few seconds.

While we can address TCP retransmission timeouts and retries, there is no option to handle a TCP window size of 0 on the client side, which causes the NGINX send buffer to fill up.

Specifically, we want NGINX to close the socket if the buffer becomes full. Before NGINX, on our processor, we handled this situation as flow control: whenever the socket was full and returned an EWOULDBLOCK error, we closed the socket and signed off the client for authorization requests.

We are looking for flow control in NGINX.

proxy_timeout doesn't work in this scenario because these are long-lived TCP sockets; they stay open for months waiting for authorizations to come through.

We observe the Send-Q and Receive-Q going up in netstat when the issue happens, but NGINX doesn't close the socket.

Here are the configuration and the debug logs:

stream {
    server {
        listen 10.156.35.71:6007;
        listen 10.156.35.71:6003;
        listen 10.156.35.71:6005;
        listen 10.156.35.71:6006;
        listen 10.156.35.71:6070;

        proxy_pass 127.0.0.1:$server_port;
        proxy_protocol on;
        proxy_buffer_size 8k;
    }
}

2023/11/09 14:02:03 [debug] 65169#65169: *1365 write new buf t:1 f:0 0000000000000000, pos 0000557F1640E4D0, size: 306 file: 0, size: 0

2023/11/09 14:02:03 [debug] 65169#65169: *1365 stream write filter: l:0 f:1 s:306

2023/11/09 14:02:03 [debug] 65169#65169: *1365 writev: 306 of 306

2023/11/09 14:02:03 [debug] 65169#65169: *1365 stream write filter 0000000000000000

2023/11/09 14:02:03 [debug] 65169#65169: *1365 event timer del: 3: 31701382574

2023/11/09 14:02:03 [debug] 65169#65169: *1365 event timer add: 3: 31536000000:31701384196

2023/11/09 14:02:03 [debug] 65169#65169: *1365 event timer: 3, old: 31701384196, new: 31701384196

2023/11/09 14:02:03 [debug] 65169#65169: timer delta: 1622

2023/11/09 14:02:03 [debug] 65169#65169: worker cycle

2023/11/09 14:02:03 [debug] 65169#65169: epoll timer: 31536000000

2023/11/09 14:02:04 [debug] 65169#65169: epoll: fd:3 ev:0001 d:00007F4A3D6765B0

2023/11/09 14:02:04 [debug] 65169#65169: *1365 recv: eof:0, avail:-1

2023/11/09 14:02:04 [debug] 65169#65169: *1365 recv: fd:3 234 of 8192

2023/11/09 14:02:04 [debug] 65169#65169: *1365 write new buf t:1 f:0 0000000000000000, pos 0000557F16416830, size: 234 file: 0, size: 0

2023/11/09 14:02:04 [debug] 65169#65169: *1365 stream write filter: l:0 f:1 s:234

2023/11/09 14:02:04 [debug] 65169#65169: *1365 writev: 234 of 234

2023/11/09 14:02:04 [debug] 65169#65169: *1365 stream write filter 0000000000000000

2023/11/09 14:02:04 [debug] 65169#65169: *1365 event timer del: 3: 31701384196

2023/11/09 14:02:04 [debug] 65169#65169: *1365 event timer add: 3: 31536000000:31701384604

2023/11/09 14:02:04 [debug] 65169#65169: timer delta: 408

2023/11/09 14:02:04 [debug] 65169#65169: worker cycle

2023/11/09 14:02:04 [debug] 65169#65169: epoll timer: 31536000000

2023/11/09 14:02:05 [debug] 65169#65169: epoll: fd:11 ev:0005 d:00007F4A3D6766A0

2023/11/09 14:02:05 [debug] 65169#65169: *1365 recv: eof:0, avail:-1

2023/11/09 14:02:05 [debug] 65169#65169: *1365 recv: fd:11 306 of 8192

# cat /etc/centos-release

CentOS Linux release 7.9.2009 (Core)

Change History (6)

comment:1 by Maxim Dounin, 5 months ago

Keywords: financial flowcontrol flow control removed
Milestone: nginx-1.25
Priority: blocker → minor
Summary: "[Enhancement] [Financial] Add support to close stale sockets / Add Flow control to NGINX" → "Introduce send_timeout and proxy_send_timeout in the stream module"

> myF5 Case # 00508180

This is the Trac for nginx as an open source server. If you are looking for support for a commercial F5 product, please use the corresponding support contacts.

> Specifically, we want NGINX to close the socket if the buffer becomes full. Before NGINX, on our processor, we handled this situation as flow control: whenever the socket was full and returned an EWOULDBLOCK error, we closed the socket and signed off the client for authorization requests.

EWOULDBLOCK is a normal condition for non-blocking sockets, and means that the socket send buffer is full. So, basically, in your previous setup you limited the amount of buffered data and closed the connection right away once the limit was reached.
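
To illustrate the condition described above, here is a minimal Python sketch (not nginx code). It assumes a local socketpair as a stand-in for a TCP connection whose peer never reads, and the function names are hypothetical. It writes to a non-blocking socket until the kernel reports that the send buffer is full.

```python
import errno
import socket


def fill_until_would_block(sock, chunk=b"x" * 4096):
    """Write to a non-blocking socket until the kernel send buffer is full.

    Returns (bytes_accepted, errno_seen).
    """
    sock.setblocking(False)
    accepted = 0
    while True:
        try:
            accepted += sock.send(chunk)
        except BlockingIOError as exc:
            # EAGAIN / EWOULDBLOCK: not a failure, the buffer is simply full.
            return accepted, exc.errno


def demo_would_block():
    # A connected pair of local sockets; the "client" side never reads.
    backend, client = socket.socketpair()
    try:
        return fill_until_would_block(backend)
    finally:
        backend.close()
        client.close()
```

An application that treats this condition as fatal, as in the previous setup, would close the socket at the point where `BlockingIOError` is raised instead of waiting for the socket to become writable again.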

> proxy_timeout doesn't work in this scenario because these are long-lived TCP sockets; they stay open for months waiting for authorizations to come through.

As of now, the stream module provides only one timeout for established connections, proxy_timeout, which is set between successive read/write operations.

While it is probably possible to introduce additional timeouts, similar to send_timeout and proxy_send_timeout in HTTP, triggered only if there is data in the buffer on the nginx side, this won't completely solve the problem you are trying to solve: data can still be buffered in the socket buffer for an arbitrary time until proxy_timeout expires.

Overall, I would recommend reconsidering the setup and/or the protocol being used.

As a possible workaround for the existing setup, assuming unreachable peers, TCP keepalive settings might be helpful: see the so_keepalive parameter of the listen directive for client connections, and the proxy_socket_keepalive directive for upstream connections.
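
For reference, TCP keepalive is configured through per-socket options. The following Python sketch (not nginx code) shows the options involved; the helper names and the values 30/10/3 are illustrative assumptions, and the tuning options are only set where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle=30, interval=10, count=3):
    """Turn on TCP keepalive and, where supported, tune its timing.

    idle:     seconds of silence before the first probe (hypothetical value)
    interval: seconds between probes (hypothetical value)
    count:    unanswered probes before the connection is dropped
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    )
    for name, value in tuning:
        opt = getattr(socket, name, None)  # not defined on every platform
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


def demo_keepalive():
    # An unconnected TCP socket is enough to set and read back the option.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        enable_keepalive(s)
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
    finally:
        s.close()
```

Keepalive probes detect a peer that has become unreachable. They do not detect a peer that is reachable but not reading, since such a peer still acknowledges the probes.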

Also, tuning buffer sizes (proxy_buffer_size for in-nginx buffer, TCP socket buffer sizes) to smaller values should make it possible to detect such conditions on the backend side, similarly to how you do it without nginx.
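
The effect of a smaller socket buffer can be sketched in Python (not nginx code), assuming a local socketpair as a stand-in for the real connection; the function name and the 4096-byte value are hypothetical. With a smaller SO_SNDBUF, the sender reaches the "buffer full" condition after queuing less data, so a stalled peer is noticed sooner.

```python
import socket


def bytes_queued_before_block(sndbuf=None, chunk=b"x" * 1024):
    """Return how many bytes a non-blocking sender can queue when the
    peer never reads. If sndbuf is given, request that send buffer size."""
    sender, peer = socket.socketpair()
    try:
        if sndbuf is not None:
            # The kernel may round or adjust the requested size.
            sender.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
        sender.setblocking(False)
        queued = 0
        while True:
            try:
                queued += sender.send(chunk)
            except BlockingIOError:
                return queued  # send buffer is full
    finally:
        sender.close()
        peer.close()
```

Comparing `bytes_queued_before_block(4096)` with `bytes_queued_before_block()` shows how much less data sits in the buffer before the sender is told to wait. The exact numbers depend on the operating system.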

comment:2 by felipeapolanco@…, 5 months ago

Thanks for the reply,

TCP keepalive is not honored everywhere as specified by the RFC; some networks drop keepalive ACK packets.

send_timeout and proxy_send_timeout will not solve this situation: these are long-lived TCP sockets, and they must remain open even if no transactions pass through for a long time.

Also, it doesn't address the case of a spike of transactions within a very short time: all those transactions will expire while sitting in the socket buffer until the proposed timeout occurs.

The reason we need an action on EWOULDBLOCK is to close the socket. I agree it would be normal operation in cases such as network congestion, but can we at least have a timeout for the EWOULDBLOCK scenario?

In our application, we close any socket that has been in EWOULDBLOCK for 10 seconds.

If NGINX can provide a timeout for EWOULDBLOCK, that will address scenarios like TCP window size = 0 (the client is responsive but has no buffer space available to read).

comment:3 by Maxim Dounin, 5 months ago

> send_timeout and proxy_send_timeout will not solve this situation: these are long-lived TCP sockets, and they must remain open even if no transactions pass through for a long time.

The idea behind send_timeout in HTTP is that nginx will time out the request as long as a) nginx has something to send in its buffers (but cannot send, because the socket buffer is full) and b) the timeout since previous send operation expires. If nginx has nothing to send, other timeouts are expected to be set (such as proxy_read_timeout).

> In our application, we close any socket that has been in EWOULDBLOCK for 10 seconds.

That's basically equivalent to send_timeout 10s;.
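
The send_timeout semantics described in this comment can be sketched in Python. This is an illustration of the idea only, not nginx's implementation; the class and method names are hypothetical. The timer is armed only while data is waiting to be sent, it restarts whenever a send makes progress, and an idle connection never expires.

```python
class SendTimer:
    """Tracks how long buffered data has waited since the last successful send."""

    def __init__(self, timeout):
        self.timeout = timeout      # seconds, e.g. 10 for "send_timeout 10s"
        self.blocked_since = None   # None means no timer is armed

    def on_send(self, now, pending_bytes, sent_bytes):
        """Record one send attempt.

        pending_bytes: bytes waiting before the attempt
        sent_bytes:    bytes the socket accepted (0 on EWOULDBLOCK)
        """
        remaining = pending_bytes - sent_bytes
        if remaining == 0:
            # Nothing left to send: an idle connection has no send timer.
            self.blocked_since = None
        elif sent_bytes > 0 or self.blocked_since is None:
            # Progress was made, or this is the first blocked attempt:
            # (re)start the timer from now.
            self.blocked_since = now
        # Otherwise still blocked with no progress: keep the original start.

    def expired(self, now):
        """True if data has been stuck for at least `timeout` seconds."""
        return (
            self.blocked_since is not None
            and now - self.blocked_since >= self.timeout
        )
```

With `SendTimer(10.0)`, a connection that stays idle for months never expires, while a connection whose buffered data makes no progress for 10 seconds does. Data already handed to the kernel socket buffer is outside the scope of such a timer.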

comment:4 by felipeapolanco@…, 5 months ago

I understand. This will definitely work for our case.

Is there an estimated time for when these will be merged into the NGINX master branch?

comment:5 by Maxim Dounin, 5 months ago

> Is there an estimated time for when these will be merged into the NGINX master branch?

This is a feature request and some thoughts on how this can be done, if at all. There is nothing to merge, and it is an open question whether it needs to be implemented and/or merged if implemented. Note that even with this feature implemented, there will still be socket buffers which aren't subject to send_timeout. As previously suggested, I would recommend reconsidering the setup and/or the protocol being used.

comment:6 by felipeapolanco@…, 4 months ago

Thanks for the reply,

Understood.

Has there been any decision made to implement send_timeout and proxy_send_timeout?
