Opened 6 years ago

Closed 5 years ago

#1735 closed defect (fixed)

Nginx js freezing

Reported by: Roman Mingazeev Owned by:
Priority: critical Milestone:
Component: nginx-module Version: 1.15.x
Keywords: nginx js stream Cc:
uname -a: Linux ecss1 4.15.0-45-generic #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: configure arguments: --prefix=/tmp --sbin-path=/usr/lib/ecss/ecss-restfs/nginx --modules-path=/usr/lib/ecss/ecss-restfs/conf/nginx/modules --conf-path=/usr/lib/ecss/ecss-restfs/conf/nginx.conf --error-log-path=/var/log/ecss/restfs/error-nginx.log --http-log-path=/var/log/ecss/restfs/access-nginx.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --user=www-data --group=www-data --http-client-body-temp-path=/var/cache/ecss/nginx/client_temp --http-proxy-temp-path=/var/cache/ecss/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/ecss/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/ecss/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/ecss/nginx/scgi_temp --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fPIC' --with-ld-opt='-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -pie' --with-debug --add-dynamic-module=../njs/nginx --add-dynamic-module=../lua-nginx-module --add-dynamic-module=../ngx_devel_kit --add-dynamic-module=../nginx-dav-ext-module

Description

When I use stream with js_filter for proxying to the http server, I get the following:
curl localhost: port1 / somefile - endless download occurs

And it happens periodically, it is not clear why. But if I comment js_filter, then this is not observed

#stream.conf
stream {

js_include fix.js
server {

listen port1;
js_filter fix;
proxy_pass 127.0.0.1:port2;

}

}

#fix.js
function fix(s) {

s.on('upload', function(data, flags) { ... })

}

Attachments (4)

log.tar.gz (32.9 KB ) - added by Roman Mingazeev 6 years ago.
all logs
transfer-fix.js (3.8 KB ) - added by Roman Mingazeev 6 years ago.
js for fix transfer
server.conf (421 bytes ) - added by Roman Mingazeev 6 years ago.
http conf
server.2.conf (178 bytes ) - added by Roman Mingazeev 6 years ago.
proxy

Download all attachments as: .zip

Change History (15)

comment:1 by xeioex, 6 years ago

Please share the code inside function fix and the debug log.

by Roman Mingazeev, 6 years ago

Attachment: log.tar.gz added

all logs

by Roman Mingazeev, 6 years ago

Attachment: transfer-fix.js added

js for fix transfer

by Roman Mingazeev, 6 years ago

Attachment: server.conf added

http conf

by Roman Mingazeev, 6 years ago

Attachment: server.2.conf added

proxy

comment:2 by xeioex, 6 years ago

When I use stream with js_filter for proxying to the http server, I get the following:
curl localhost: port1 / somefile - endless download occurs

You provided a large chunk of code. why do you think the issue is in njs and not in your code? If you are sure that the problem is in njs, please, narrow it down to the simplest code snippet possible.

comment:3 by Maxim Dounin, 6 years ago

Just to make sure it is not overlooked: the code in question can be instructed to produce arbitrary large amount of data in chunk_last():

        s.send('0'.repeat(chunk_len) + '\r\n')

While this may not be the immediate reason for the behaviour observed, the code is certainly unsafe to use.

in reply to:  2 comment:4 by Roman Mingazeev, 6 years ago

Replying to xeioex:

When I use stream with js_filter for proxying to the http server, I get the following:
curl localhost: port1 / somefile - endless download occurs

You provided a large chunk of code. why do you think the issue is in njs and not in your code? If you are sure that the problem is in njs, please, narrow it down to the simplest code snippet possible.

I tried to comment all the code on js, and it still starts to hang periodically. But, if you remove the js_filter directive, everything works fine.

in reply to:  3 comment:5 by Roman Mingazeev, 6 years ago

Replying to mdounin:

Just to make sure it is not overlooked: the code in question can be instructed to produce arbitrary large amount of data in chunk_last():

        s.send('0'.repeat(chunk_len) + '\r\n')

While this may not be the immediate reason for the behaviour observed, the code is certainly unsafe to use.

I sinned that somewhere in the code could affect this. But, firstly, this action should only be performed in the PUT method, and the hang occurs in the GET method. Secondly, the entire code commented - it still hangs, that is, for me the contents of the js code did not affect the work

comment:6 by xeioex, 6 years ago

I am using the following nginx/njs configuration. Cannot reproduce your issue.

http { 
    js_include http.njs;
    server {
        listen 8080;
        location /trac1735 {
            js_content trac1735;
        }
    }
}

#http.njs:
function trac1735(r) {
    r.return(200, "x".repeat(2**20));
}

stream {
    server {
        listen      127.0.0.1:10000;                                                                                             
        js_filter   empty_filter;                                                                                                
        proxy_pass  127.0.0.1:8080;                                                                                              
    }                                                                                                                            
} 

# stream.njs
function empty_filter(s) {
}

comment:7 by maxim, 6 years ago

Milestone: njs-0.3.0

Ticket retargeted after milestone closed

comment:8 by Maxim Dounin, 6 years ago

Status: newaccepted

TWIMC, here are relevant debug log snippet as found in error-stream.log:

2019/03/02 18:54:37 [debug] 19845#19845: *29 write new buf t:1 f:0 0000000000000000, pos 0000559F706006F0, size: 16384 file: 0, size: 0
2019/03/02 18:54:37 [debug] 19845#19845: *29 stream write filter: l:0 f:1 s:16384
2019/03/02 18:54:37 [debug] 19845#19845: *29 writev: 16384 of 16384
2019/03/02 18:54:37 [debug] 19845#19845: *29 stream write filter 0000000000000000
2019/03/02 18:54:37 [debug] 19845#19845: *29 recv: eof:0, avail:1
2019/03/02 18:54:37 [debug] 19845#19845: *29 recv: fd:25 16384 of 16384
2019/03/02 18:54:37 [debug] 19845#19845: *29 stream js filter u:1
2019/03/02 18:54:37 [debug] 19845#19845: *29 write new buf t:1 f:0 0000000000000000, pos 0000559F706006F0, size: 16384 file: 0, size: 0
2019/03/02 18:54:37 [debug] 19845#19845: *29 stream write filter: l:0 f:1 s:16384
2019/03/02 18:54:37 [debug] 19845#19845: *29 writev: 14423 of 16384
2019/03/02 18:54:37 [debug] 19845#19845: *29 writev: -1 of 1961
2019/03/02 18:54:37 [debug] 19845#19845: *29 writev() not ready (11: Resource temporarily unavailable)
2019/03/02 18:54:37 [debug] 19845#19845: *29 stream write filter 0000559F704515B8
2019/03/02 18:54:37 [debug] 19845#19845: *29 epoll add event: fd:12 op:3 ev:80002005
2019/03/02 18:54:37 [debug] 19845#19845: *29 event timer: 12, old: 111514607, new: 111514607
2019/03/02 18:54:37 [debug] 19845#19845: *29 event timer: 12, old: 111514607, new: 111514607
2019/03/02 18:54:37 [debug] 19845#19845: *29 event timer: 12, old: 111514607, new: 111514611
2019/03/02 18:54:37 [debug] 19845#19845: *29 event timer: 12, old: 111514607, new: 111514611
2019/03/02 18:54:37 [debug] 19845#19845: *29 event timer: 12, old: 111514607, new: 111514611
2019/03/02 18:54:37 [debug] 19845#19845: *29 event timer: 12, old: 111514607, new: 111514611
2019/03/02 18:54:37 [debug] 19845#19845: *29 event timer: 12, old: 111514607, new: 111514611
2019/03/02 18:54:37 [debug] 19845#19845: *29 event timer: 12, old: 111514607, new: 111514611
2019/03/02 18:54:37 [debug] 19845#19845: *29 stream js filter u:1
2019/03/02 18:54:37 [debug] 19845#19845: *29 event timer: 12, old: 111514607, new: 111514611
2019/03/02 18:54:53 [debug] 19845#19845: *29 recv: eof:1, avail:1
2019/03/02 18:54:53 [debug] 19845#19845: *29 recv: fd:12 0 of 16384
2019/03/02 18:54:53 [debug] 19845#19845: *29 posix_memalign: 0000559F703F87D0:256 @16
2019/03/02 18:54:53 [debug] 19845#19845: *29 stream js filter u:0
2019/03/02 18:54:53 [debug] 19845#19845: *29 write new buf t:0 f:0 0000000000000000, pos 0000559F7067A5D0, size: 0 file: 0, size: 0
2019/03/02 18:54:53 [debug] 19845#19845: *29 stream write filter: l:1 f:1 s:0
2019/03/02 18:54:53 [info] 19845#19845: *29 client disconnected, bytes from/to client:90/2816087, bytes from/to upstream:2818048/90
2019/03/02 18:54:53 [debug] 19845#19845: *29 finalize stream proxy: 200

That is, it looks like sending to the client blocked:

2019/03/02 18:54:37 [debug] 19845#19845: *29 stream write filter: l:0 f:1 s:16384
2019/03/02 18:54:37 [debug] 19845#19845: *29 writev: 14423 of 16384
2019/03/02 18:54:37 [debug] 19845#19845: *29 writev: -1 of 1961
2019/03/02 18:54:37 [debug] 19845#19845: *29 writev() not ready (11: Resource temporarily unavailable)
2019/03/02 18:54:37 [debug] 19845#19845: *29 stream write filter 0000559F704515B8
2019/03/02 18:54:37 [debug] 19845#19845: *29 epoll add event: fd:12 op:3 ev:80002005

As can be seen from ev:80002005, nginx added events for both reading and writing. Unfortunately there are no main-level debug logs, so it is not clear what happens on the event level, but clearly js filter is called at least once after this, likely due to a write event reported by kernel:

2019/03/02 18:54:37 [debug] 19845#19845: *29 event timer: 12, old: 111514607, new: 111514611
2019/03/02 18:54:37 [debug] 19845#19845: *29 stream js filter u:1
2019/03/02 18:54:37 [debug] 19845#19845: *29 event timer: 12, old: 111514607, new: 111514611

It looks like ngx_stream_js_body_filter() fails to call next filter if there are no output data, and as a result data buffered in the stream write filter will not be flushed in such situation, leading to connection hang and an eventual timeout. Looks like a bug in ngx_stream_js_body_filter(), it should call next filter even if called with NULL input chain and/or resulting output chain is empty.

comment:9 by Maxim Dounin, 6 years ago

It looks like the problem was introduced in this njs commit:

changeset:   601:bbec3cdb747b
user:        Dmitry Volyntsev <xeioex@nginx.com>
date:        Tue Sep 11 15:34:50 2018 +0300
summary:     Nginx stream module refactored.

comment:10 by xeioex, 5 years ago

Fixed in: this commit.

comment:11 by xeioex, 5 years ago

Resolution: fixed
Status: acceptedclosed
Note: See TracTickets for help on using tickets.