nginx + POST request = 408 for 0.1% of users, probably SSL related
uname -a: Linux so3 4.19.0-21-amd64 #1 SMP Debian 4.19.249-2 (2022-06-30) x86_64 GNU/Linux
nginx version: nginx/1.23.1
built by gcc 8.3.0 (Debian 8.3.0-6)
built with LibreSSL 3.5.3
TLS SNI support enabled
configure arguments: --with-openssl=/root/soft/nginx/libressl-3.5.3/ --with-http_geoip_module=dynamic --with-stream=dynamic --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_v2_module --with-http_dav_module --with-http_slice_module --with-threads --with-http_addition_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_image_filter_module=dynamic --with-http_sub_module
I have been struggling with this bug for almost two years. Please, I need your help.
What: 0.1% of the users of my website complain that the website sometimes returns errors.
I can clearly see their problems in the log but can't do anything to help them.
It looks like an HTTP 408 error with a response size of 0. Example:
220.127.116.11 - service-online.su [23/Jul/2022:08:46:08 +0300] "GET /blank/.........pdf HTTP/1.1" 200 62034 "........
This is a successful request; I added it only to show that the user can access the website. Then this happens:
18.104.22.168 - service-online.su [23/Jul/2022:09:37:26 +0300] "POST /forms/cenniki/blanc.php HTTP/1.1" 408 0 "https://service-online.su/forms/cenniki/" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/22.214.171.124 Safari/537.36" "-" rt=15.000 ut="-" uh=- cs=- cn=283675 n=1 request_length=1160 request_time=15.000 http_accept="text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
As you can see, I use a non-standard log format with information useful for debugging (a debug log is available too).
The error is ALWAYS a POST request with HTTP code 408 and size=0.
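To make the pattern easier to count, here is a minimal Python sketch that picks out access log lines matching it (POST, status 408, size 0). The regular expression is inferred from the two sample lines in this ticket, so the field layout is an assumption, and the names `LINE_RE`, `is_failed_post` and `failed_posts` are made up for illustration.

```python
import re

# Leading part of an access log line: address, two more fields, timestamp,
# request line, status, body size. This layout is an assumption based on
# the sample lines in this ticket, not the actual log_format definition.
LINE_RE = re.compile(
    r'^(?P<addr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<uri>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)


def is_failed_post(line):
    """Return True for a POST answered with 408 and a zero-size body."""
    m = LINE_RE.match(line)
    if m is None:
        return False  # line does not follow the assumed layout
    return m["method"] == "POST" and m["status"] == "408" and m["size"] == "0"


def failed_posts(lines):
    """Keep only the lines that match the failure pattern, in order."""
    return [line for line in lines if is_failed_post(line)]
```

Feeding it the lines of /var/log/nginx/access.log (the path from the configure arguments above) should give the list of affected requests, which can then be grouped by address or by time.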
It happens after a series of successful GET/POST requests: a POST suddenly fails, and it can keep failing 5-20 times in a row. Users keep clicking that cursed "submit" button in the hope that this time the POST request will succeed and they will get what they want.
This happens for different browsers and different operating systems.
This is difficult to debug and reproduce: all the information I have comes from website users, and I was unable to access their PCs.
The backend is php-fpm.
What I've tried:
1) Replacing Debian's nginx with a version built from source (1.23.1) does not help.
2) Replacing OpenSSL with LibreSSL does not help.
3) I played with keepalive settings - no effect.
4) I played with SSL ciphers and got some result: disabling TLSv1.2 decreases the number of 408 errors a bit.
5) I played with other settings and disabled http2, and this seems to help a bit.
The debug log for that particular 408 request (cn=283675) is attached to this ticket.
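For anyone who wants to pull the same kind of extract from a full debug log, here is a rough Python sketch. It assumes that `cn=` in my access log is the nginx connection number, and that debug log lines have the usual `date time [level] pid#tid: *connection message` shape. The function names are made up for illustration.

```python
import re

# "cn=<number>" field from the custom access log format (assumed to carry
# the nginx connection number).
CN_RE = re.compile(r'\bcn=(\d+)\b')

# Error/debug log line with a connection marker, e.g.
# "<date> <time> [debug] <pid>#<tid>: *<connection> <message>".
DEBUG_CONN_RE = re.compile(r'^\S+ \S+ \[\w+\] \d+#\d+: \*(\d+) ')


def connection_number(access_line):
    """Extract the cn= value from one access log line, or None if absent."""
    m = CN_RE.search(access_line)
    return m.group(1) if m else None


def debug_lines_for(conn, debug_lines):
    """Return the debug log lines that belong to the given connection."""
    selected = []
    for line in debug_lines:
        m = DEBUG_CONN_RE.match(line)
        if m and m.group(1) == conn:
            selected.append(line)
    return selected
```

Lines without a `*connection` marker (for example event loop timers) are skipped, so the result is only the per-connection part of the log.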