Opened 2 years ago

Closed 2 years ago

#1477 closed defect (fixed)

Syslog stops working after few hours

Reported by: Undone@… Owned by:
Priority: major Milestone:
Component: nginx-module Version: 1.13.x
Keywords: syslog Cc:
uname -a: Linux *hostname* 4.4.96-mainline-rev1 #1 SMP Thu Nov 2 11:39:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.13.8
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC)
built with OpenSSL 1.0.2k-fips 26 Jan 2017
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -pie'

Description

After few hours, the access_log and error_log will stop sending to the syslog server, and the message below is outputted in the error_log file instead.

2018/02/11 12:07:36 [alert] 15865#15865: send() failed (22: Invalid argument)

Restarting nginx fixes this temporarily, until it starts again after a while.

Attachments (3)

nginx.conf (16.2 KB ) - added by Undone@… 2 years ago.
Nginx configurations
nginx-debug.log (28.0 KB ) - added by Undone@… 2 years ago.
syslog_tolerate.diff (1.1 KB ) - added by vl 2 years ago.
re-open socket on error

Download all attachments as: .zip

Change History (16)

comment:1 by Undone@…, 2 years ago

nginx.conf:

error_log syslog:server=syslog.domain.com:12302;

log_format graylog2_format '$remote_addr - $remote_user
[$time_local] "$request" $status $body_bytes_sent "$http_referer"
"$http_user_agent" "$http_x_forwarded_for"
<msec=$msec|connection=$connection|connection_requests=$connection_requests|millis=$request_time>';

access_log syslog:server=syslog.domain.com:12301 graylog2_format;

comment:2 by vl, 2 years ago

Can you please provide full configuration obtained from nginx -T
and a debug log?

by Undone@…, 2 years ago

Attachment: nginx.conf added

Nginx configurations

comment:3 by Undone@…, 2 years ago

I added the debug parameter to error_log, and interestingly enough it sends some messages to the syslog. They seem to be unrelated to the issue though.

domain.com nginx: 2018/02/13 04:28:48 [info] 25123#25123: *84 peer closed connection in SSL handshake while SSL handshaking, client: x.x.231.5, server: 0.0.0.0:443
2018-02-13 06:14:09.000	domain.com
domain.com nginx: 2018/02/13 04:14:09 [info] 25122#25122: *83 client ::1 closed keepalive connection
2018-02-13 05:35:01.000	domain.com
domain.com nginx: 2018/02/13 03:35:01 [notice] 25123#25123: reopening logs
2018-02-13 05:35:01.000	domain.com
domain.com nginx: 2018/02/13 03:35:01 [notice] 25121#25121: signal 10 (SIGUSR1) received from 25714, reopening logs
2018-02-13 05:35:01.000	domain.com
domain.com nginx: 2018/02/13 03:35:01 [notice] 25121#25121: reopening logs
2018-02-13 05:35:01.000	domain.com
domain.com nginx: 2018/02/13 03:35:01 [notice] 25122#25122: reopening logs
2018-02-13 05:17:00.000	domain.com
domain.com nginx: 2018/02/13 03:17:00 [info] 25122#25122: *80 peer closed connection in SSL handshake while SSL handshaking, client: x.x.231.5, server: 0.0.0.0:443
2018-02-13 04:40:49.000	domain.com
domain.com nginx: 2018/02/13 02:40:49 [info] 25123#25123: *76 client x.x.136.7 closed keepalive connection
2018-02-13 04:40:42.000	domain.com
domain.com nginx: 2018/02/13 02:40:42 [info] 25122#25122: *78 client 127.0.0.1 closed keepalive connection

Also the error message I mentioned earlier gets sometimes sent to the syslog server.

[alert] 23429#23429: send() failed (22: Invalid argument)

I have added the output of nginx -T as an attachment.

comment:4 by vl, 2 years ago

  • what syslog server is used? Any errors in its own logs?
  • does nginx receives any signals (i.e. do you perform reload or reopen logs at the time)?
  • please add 'debug' to file-based log in order to see any issues with syslog, i.e.

just add second 'error_log' directive:
error_log /path/to/debug.log debug;
please attach full debug.log to the ticket.

comment:5 by Maxim Dounin, 2 years ago

Note that to obtain a debug log you need an nginx binary compiled --with-debug, see http://nginx.org/en/docs/debugging_log.html for details.

by Undone@…, 2 years ago

Attachment: nginx-debug.log added

comment:6 by Undone@…, 2 years ago

I'm using Graylog as a syslog server. I attached the debug log. I don't reload nginx unless I change a configuration.

comment:7 by vl, 2 years ago

which OS are you runnging? Any virtualization? Is this run in Docker container?

comment:8 by Undone@…, 2 years ago

I'm running CentOS 7.4 in a virtual machine. The virtualisation technology is KVM I believe. No docker.

comment:9 by vl, 2 years ago

Does your IP address changes regularly? It looks like this may be the cause for such errors.

comment:10 by Undone@…, 2 years ago

The syslog server is accessed through an OpenVPN tunnel. The IP address shouldn't be changing, but's possible the tunnel could be doing something funny.

by vl, 2 years ago

Attachment: syslog_tolerate.diff added

re-open socket on error

comment:11 by vl, 2 years ago

Can you please try attached patch and see if it helps?

comment:12 by Vladimir Homutov <vl@…>, 2 years ago

In 7272:fa0e093b64d7/nginx:

Syslog: re-open syslog udp socket on send error (ticket #1477).

Previously, only unix domain sockets were reopened to tolerate cases when
local syslog server was restarted. It makes sense to treat other cases
(for example, local IP address changes) similarly.

comment:13 by vl, 2 years ago

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.