Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#2341 closed defect (invalid)

Sporadic "502 Bad Gateway" with reverse proxy

Reported by: robsch@…
Owned by:
Priority: major
Milestone:
Component: nginx-core
Version: 1.16.x
Keywords: reverse-proxy 502
Cc:
uname -a: Linux itsrv2493.esrv.local 4.18.0-348.12.2.el8_5.x86_64 #1 SMP Mon Jan 17 07:06:06 EST 2022 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.16.1
built by gcc 8.2.1 20180905 (Red Hat 8.2.1-3) (GCC)
built with OpenSSL 1.1.1 FIPS 11 Sep 2018 (running with OpenSSL 1.1.1k FIPS 25 Mar 2021)
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fPIC' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -pie

Description (last modified by robsch@…)

For unknown reasons I am getting 502 responses. As far as I know nothing was changed anywhere, although I cannot be 100% certain. Something must have changed, but nobody knows what. The setup worked for a long time and then suddenly stopped working.

The problem is this: if I try to fetch 100 images over one HTTP/2 connection, many requests get a 502 response. With 20 images, everything seems fine. The problem occurs only when the reverse proxy is used. If the images are fetched directly, there is never a 502 response. I tested this in browsers and also with curl on the command line.

The specified nginx version is 1.16.1 and the system is RHEL 8. I also tried Fedora and nginx 1.20.2 in a VM, with the same results.

Using curl on Linux (this does not work on a Mac), you can test the reverse proxy with this command (you should see some 502 responses):

curl --http2 -Z -svo /dev/null https://demos.colop.com/creator4/media-cache/media/nio/{62052c6ce8288600019bdeef,620511a6e8288600019bdecf,6205036be8288600019bdebb,6204cda3e8288600016305cd,62043bd5e8288600016305c5,6203f219e8288600016305b6,6202c041e8288600016305a0,6201b0fbe828860001630593,6201a9abe82886000163058c,620190a8e82886000163057f,620182a3e828860001630578,620155d0e82886000163056b,62013eede828860001630561,620123b3e828860001630557,620110a2e82886000163054c,6200e643e828860001630544,6200347ce828860001630536,61fff0aae828860001630516,61fff044e82886000163050e,61ffef65e828860001630506,61ffc945e8288600016304f3,61ffafb7e8288600016304e8,61ffae28e8288600016304e3,61ff96bce8288600016304d5,61ff961de8288600016304cf,61fefe9ce8288600016304c3,61fef568e8288600016304b4,61fef0d5e8288600016304a8,61fed732e828860001630497,61feb647e828860001630485,61fd8ea6e828860001630467,61fd80fee82886000163044f,61fd7be2e828860001630449,61fd67a8e828860001630435,61fd44c7e828860001630426,61fd1107e828860001630416,61fc3e1ee828860001630407,61fc2a5ae8288600016303ff,61fc27c9e8288600016303f5,61fc2479e8288600016303ef,61fc1f94e8288600016303e7,61fc168ae8288600016303df,61fbdab4e8288600016303d4,61fad9aee8288600016303bf,61fa6510e8288600016303b0,61fa4f03e8288600016303a6,61fa43a6e828860001630398,61f9a2eee828860001630386,61f96d87e828860001630375,61f7fe3ce828860001630352,61f7fc56e82886000163034d,61f784cee828860001630339,61f708dae82886000163032b,61f70726e828860001630324,61f7044fe828860001630316,61f6eb8ae82886000163030b,61f6e7b9e8288600016302ff,61f6e25de8288600016302f1,61f6abb8e8288600016302c2,61f695e4e8288600016302a6,61f68b71e82886000163029d,61f5c1c0e82886000163026e,61f5bfefe828860001630267,61f5bec8e828860001630260,61f5a84ee82886000163025a,61f58de0e828860001630252,61f589c5e82886000163024a,61f53270e82886000163022c,61f4517de82886000163020c,61f44163e8288600016301fe,61f43e8de8288600016301f7,61f43c39e8288600016301f1,61f42bc6e8288600016301e8,61f429ade8288600016301e1,61f427b9e8288600016301d9,61f424f7e8288600016301d3,61f419bae8288600016301cb,61f3ff05e8288600016301c2,61f3fdeae8288600016301bc,61f3eb80e82886000163019c,61f3e295e828860001630196,61f3d79ce82886000163018e,61f3d00be828860001630185,61f3cf95e828860001630180,61f2ea44e82886000163015e,61f2dcebe828860001630157,61f2d5ace828860001630150,61f2c5b3e828860001630146,61f2bf9fe828860001630136,61f2a627e82886000163012e,61f2782ee828860001630124,61f1dff5e828860001630111,61f1c86fe828860001630104,61f1bbcae8288600016300e6,61f1b46ae8288600016300c1,61f1b094e8288600016300a6,61f18ec6e828860001630094,61f18c1ee82886000163008e,61f185f1e828860001630086,61f16f85e82886000163006b}

You can test the direct requests, bypassing the reverse proxy, with this curl command (there should not be any 502 responses):

curl --http2 -Z -svo /dev/null https://api.colop-online.com/service/creator/media-cache/media/nio/{62052c6ce8288600019bdeef,620511a6e8288600019bdecf,6205036be8288600019bdebb,6204cda3e8288600016305cd,62043bd5e8288600016305c5,6203f219e8288600016305b6,6202c041e8288600016305a0,6201b0fbe828860001630593,6201a9abe82886000163058c,620190a8e82886000163057f,620182a3e828860001630578,620155d0e82886000163056b,62013eede828860001630561,620123b3e828860001630557,620110a2e82886000163054c,6200e643e828860001630544,6200347ce828860001630536,61fff0aae828860001630516,61fff044e82886000163050e,61ffef65e828860001630506,61ffc945e8288600016304f3,61ffafb7e8288600016304e8,61ffae28e8288600016304e3,61ff96bce8288600016304d5,61ff961de8288600016304cf,61fefe9ce8288600016304c3,61fef568e8288600016304b4,61fef0d5e8288600016304a8,61fed732e828860001630497,61feb647e828860001630485,61fd8ea6e828860001630467,61fd80fee82886000163044f,61fd7be2e828860001630449,61fd67a8e828860001630435,61fd44c7e828860001630426,61fd1107e828860001630416,61fc3e1ee828860001630407,61fc2a5ae8288600016303ff,61fc27c9e8288600016303f5,61fc2479e8288600016303ef,61fc1f94e8288600016303e7,61fc168ae8288600016303df,61fbdab4e8288600016303d4,61fad9aee8288600016303bf,61fa6510e8288600016303b0,61fa4f03e8288600016303a6,61fa43a6e828860001630398,61f9a2eee828860001630386,61f96d87e828860001630375,61f7fe3ce828860001630352,61f7fc56e82886000163034d,61f784cee828860001630339,61f708dae82886000163032b,61f70726e828860001630324,61f7044fe828860001630316,61f6eb8ae82886000163030b,61f6e7b9e8288600016302ff,61f6e25de8288600016302f1,61f6abb8e8288600016302c2,61f695e4e8288600016302a6,61f68b71e82886000163029d,61f5c1c0e82886000163026e,61f5bfefe828860001630267,61f5bec8e828860001630260,61f5a84ee82886000163025a,61f58de0e828860001630252,61f589c5e82886000163024a,61f53270e82886000163022c,61f4517de82886000163020c,61f44163e8288600016301fe,61f43e8de8288600016301f7,61f43c39e8288600016301f1,61f42bc6e8288600016301e8,61f429ade8288600016301e1,61f427b9e8288600016301d9,61f424f7e8288600016301d3,61f419bae8288600016301cb,61f3ff05e8288600016301c2,61f3fdeae8288600016301bc,61f3eb80e82886000163019c,61f3e295e828860001630196,61f3d79ce82886000163018e,61f3d00be828860001630185,61f3cf95e828860001630180,61f2ea44e82886000163015e,61f2dcebe828860001630157,61f2d5ace828860001630150,61f2c5b3e828860001630146,61f2bf9fe828860001630136,61f2a627e82886000163012e,61f2782ee828860001630124,61f1dff5e828860001630111,61f1c86fe828860001630104,61f1bbcae8288600016300e6,61f1b46ae8288600016300c1,61f1b094e8288600016300a6,61f18ec6e828860001630094,61f18c1ee82886000163008e,61f185f1e828860001630086,61f16f85e82886000163006b}

I also tried setting up a reverse proxy with Apache. I'm not sure I configured it correctly, but there seemed to be no problem with Apache.

My current (dumb) workaround is to let the browser load only 20 or so images at the same time.

I have attached the debug log, where you can find some errors.

What could be the problem? Why would things have worked and then suddenly stopped working?

Attachments (1)

test-reverse-proxy.log (1.2 MB ) - added by robsch@… 2 years ago.
debug log


Change History (7)

by robsch@…, 2 years ago

Attachment: test-reverse-proxy.log added

debug log

comment:1 by robsch@…, 2 years ago

The nginx.conf configuration is this (I also tried the default nginx.conf installed by dnf; it made no difference):

user                           nginx;
worker_processes               auto;
worker_rlimit_nofile           100000;

error_log                      /var/log/nginx/error.log warn;
pid                            /var/run/nginx.pid;


events {
    worker_connections         1024;
    use                        epoll;
    multi_accept               on;
}


http {
    # allow long server names
    server_names_hash_bucket_size 64;

    include                    /etc/nginx/mime.types;
    default_type               application/octet-stream;

    log_format                 main '$status [$time_local] '
                                    '$server_protocol $request_method $host:$server_port$request_uri    '
                                    '$remote_addr [$sent_http_location] [$remote_user] "$http_user_agent"';

    access_log                 /var/log/nginx/access.log  main;

    # spool uploads to disk instead of clobbering downstream servers
    client_body_temp_path      /var/spool/nginx-client-body 1 2;
    client_max_body_size       32m;
    client_body_buffer_size    512k;

    server_tokens              off;

    sendfile                   on;
    tcp_nopush                 on;
    tcp_nodelay                off;

    keepalive_timeout          120;

    ## Compression
    gzip                       on;
    gzip_http_version          1.0;
    gzip_comp_level            6;
    gzip_proxied               any;
   #gzip_min_length            500;
    gzip_buffers               16 8k;
    gzip_types                 text/plain
                               text/css
                               text/javascript
                               text/xml
                               application/x-javascript
                               application/javascript
                               application/xml
                               application/xml+rss;
    # Some version of IE 6 don't handle compression well on some mime-types,
    # so just disable for them
    gzip_disable               "MSIE [1-6].(?!.*SV1)";
    # Set a vary header so downstream proxies don't send cached gzipped
    # content to IE6
    gzip_vary                  on;

    # info from http://www.slashroot.in/nginx-web-server-performance-tuning-how-to-do-it
    #caching for metadata
    open_file_cache            max=10000 inactive=30s;
    open_file_cache_valid      60s;
    open_file_cache_min_uses   2;
    open_file_cache_errors     on;

    # proxy settings
    proxy_headers_hash_bucket_size 128;
    proxy_redirect             off;

    proxy_set_header           Host             $host;
    proxy_set_header           X-Real-IP        $remote_addr;
    proxy_set_header           X-Forwarded-For  $proxy_add_x_forwarded_for;
    proxy_max_temp_file_size   0;

    proxy_connect_timeout      300;
    proxy_send_timeout         300;
    proxy_read_timeout         300;

    proxy_buffer_size          4k;
    proxy_buffers              4 32k;
    proxy_busy_buffers_size    64k;
    proxy_temp_file_write_size 64k;
    proxy_buffering            off;

    include                    /etc/nginx/conf.d/*.conf;
}

and

server {
	listen					443 ssl http2;
	server_name				demos.colop.com;

	ssl_certificate			/etc/pki/nginx/ssl/colop/_colop_com.crt;
	ssl_certificate_key		/etc/pki/nginx/ssl/colop/keys/_colop_com_RSA_private.key;
	ssl_protocols			TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA;

	ssl_prefer_server_ciphers on;

	ssl_dhparam /etc/nginx/ssl/dhp/dhparams.pem;

	proxy_set_header          Host               $host;
	proxy_set_header          X-Real-IP          $remote_addr;
	proxy_set_header          X-Forwarded-For    $proxy_add_x_forwarded_for;
	proxy_set_header          X_FORWARDED_FOR    $proxy_add_x_forwarded_for;
	proxy_set_header          X_FORWARDED_HOST   $host;
	proxy_set_header          X_FORWARDED_SERVER $host;
	proxy_set_header          X-Forwarded-Proto  https;
	proxy_redirect            off;

    location /creator4/           {
        proxy_http_version 1.1;
        proxy_pass https://api.colop-online.com/service/creator/;
        proxy_redirect default;
        proxy_set_header Host api.colop-online.com;
    }
}

comment:2 by robsch@…, 2 years ago

Description: modified (diff)

comment:3 by robsch@…, 2 years ago

I have also experimented with various proxy and other settings (cache, buffers, ...) but without any luck. Honestly, I don't know much about these settings. Maybe I just need the right configuration. I would be glad if that were the case...

comment:4 by Maxim Dounin, 2 years ago

Resolution: invalid
Status: new → closed

The log clearly says the reason for 502:

2022/04/04 16:41:56 [error] 3495727#3495727: *102 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 212.241.114.172, server: demos.colop.com, request: "GET /creator4/media-cache/media/nio/61fc2479e8288600016303ef HTTP/2.0", upstream: "https://104.248.100.111:443/service/creator/media-cache/media/nio/61fc2479e8288600016303ef", host: "demos.colop.com"

That is, your upstream server closed the connection.

Given that this happens when you try to load more than 20 images over HTTP/2, most likely your upstream server is limited to 20 connections and simply closes any additional connections.

Note that with HTTP/2, a client can request many resources in parallel. Each of these requests is processed by nginx separately. If you proxy everything, nginx will open a separate connection to the upstream server for each request, and this might create significant load on the upstream servers.

Obvious fixes would be, in no particular order:

  • Fix your upstream server to handle appropriate load.
  • Change your upstream server to queue additional connections instead of closing them. This is what happens in many servers by default due to the socket listen queue, though it might not be what a particular server does.
  • Limit the number of connections nginx is going to open to a particular server by using the max_conns parameter of the server directive in an upstream block (server ... max_conns=...). Using queue might also be a good idea unless you have other servers to proxy requests to, though this is only available in the commercial version.
  • Limit the number of parallel requests nginx is willing to accept from an HTTP/2 client, see http2_max_concurrent_streams. Note that the upstream server limit can still be reached by multiple clients though.
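
The two nginx-side options in the list above could look roughly like the following. This is only a sketch under assumptions, not a tested configuration: the upstream group name creator_api is made up for illustration, and the value 20 is taken from the observation that 20 parallel images worked, not from any documented limit of the upstream server. The SSL directives of the original server block are left out for brevity.

```nginx
# Hypothetical upstream group; "creator_api" is an illustrative name.
upstream creator_api {
    # Without a shared memory zone, max_conns is counted per worker process.
    zone creator_api 64k;

    # Assumed limit of 20, based on the observation that 20 parallel requests worked.
    server api.colop-online.com:443 max_conns=20;
}

server {
    listen       443 ssl http2;
    server_name  demos.colop.com;

    # ssl_certificate and related directives as in the original server block.

    # Cap the number of parallel HTTP/2 streams per client connection (default is 128).
    http2_max_concurrent_streams 20;

    location /creator4/ {
        proxy_http_version 1.1;
        proxy_pass         https://creator_api/service/creator/;
        proxy_set_header   Host api.colop-online.com;
    }
}
```

Of the two, http2_max_concurrent_streams is probably the closer match to the browser-side workaround, since it makes a single client wait instead of opening more than 20 upstream connections at once. As far as I know, max_conns without queue does not hold back excess requests, so nginx would be expected to answer those with an error itself; it protects the upstream but may not remove the 502 responses on its own.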

If you need further help with configuring nginx, consider using the available support options.

in reply to:  4 comment:5 by robsch@…, 2 years ago

Many thanks for your quick reply! I think I've got your explanations.

Replying to Maxim Dounin:

> The log clearly says the reason for 502:
>
> 2022/04/04 16:41:56 [error] 3495727#3495727: *102 peer closed connection in SSL handshake while SSL handshaking to upstream, client: 212.241.114.172, server: demos.colop.com, request: "GET /creator4/media-cache/media/nio/61fc2479e8288600016303ef HTTP/2.0", upstream: "https://104.248.100.111:443/service/creator/media-cache/media/nio/61fc2479e8288600016303ef", host: "demos.colop.com"
>
> That is, your upstream server closed the connection.

It is quite helpful to hear that from the official side. However, is it really certain? Could this log entry be wrong for any reason, or is that impossible?

I'm wondering why I never got 502 responses when using the upstream directly. I have provided those curl calls; I don't know if you have tried them. Can you provide a curl command that shows 502 responses using api.colop-online.com? I'm not able to do it. For example, using curl -Z --parallel-max 100 --parallel-immediate with or without --http2 doesn't produce 502 responses. Is there any way to show definitively that this is not an nginx error? Then I could forward this to the people responsible for the upstream server. I'd be very, very glad if you could show me such a curl command or anything like that!

Many thanks again!


comment:6 by Maxim Dounin, 2 years ago

> I'm wondering why I never got 502 responses when using the upstream directly. I have provided those curl calls - I don't know if you have tried them. Can you provide a curl command that shows 502 responses using api.colop-online.com?

You won't get a 502 when connecting to the server directly; rather, you'll get an error in curl.

With curl --http1.1 ... or with curl -Z --parallel-max 100 --parallel-immediate ... (both are equivalent and will try to open many connections instead of using one HTTP/2 connection) I see multiple SSL_connect: SSL_ERROR_SYSCALL errors:

...
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to api.colop-online.com:443 
* Closing connection 27
* Connected to api.colop-online.com (104.248.100.111) port 443 (#28)
* ALPN, offering http/1.1
*  CAfile: /usr/local/share/certs/ca-root-nss.crt
*  CApath: none
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to api.colop-online.com:443 
* Closing connection 28
...

This looks exactly equivalent to the errors logged by nginx.
