Opened 12 months ago

Closed 4 weeks ago

#1926 closed defect (invalid)

resolver should not cache failures

Reported by: orgads@…
Owned by:
Priority: major
Milestone:
Component: nginx-core
Version: 1.17.x
Keywords: resolver
Cc:
uname -a: Linux e989be129352 5.3.0-24-generic #26-Ubuntu SMP Thu Nov 14 01:33:18 UTC 2019 x86_64 Linux
nginx -V: nginx version: nginx/1.17.5
built by gcc 8.3.0 (Alpine 8.3.0)
built with OpenSSL 1.1.1c 28 May 2019 (running with OpenSSL 1.1.1d 10 Sep 2019)
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --with-perl_modules_path=/usr/lib/perl5/vendor_perl --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-Os -fomit-frame-pointer' --with-ld-opt=-Wl,--as-needed

Description

If a domain name cannot be resolved, the failure appears to be cached for the duration set by the resolver's "valid" parameter, so when the name becomes resolvable again, nginx keeps failing to resolve it until the cache timeout expires.

This is likely to happen with docker-compose. Each container is accessed by its name using DNS resolution. If one of the containers is down and nginx looks it up, it will fail and will keep failing even if the container is brought up.

If I change the resolver to 127.0.0.11 valid=5s then it works as expected.

Attachments (1)

nginx-docker.tar.gz (584 bytes) - added by orgads@… 4 weeks ago.
Docker example

Change History (7)

comment:1 by Maxim Dounin, 12 months ago

Could you please provide more details about the configuration you are having issues with? In particular, full configuration and a debug log demonstrating the problem might be helpful. Negative DNS responses are not cached by nginx, and what you describe suggests that the issue is likely elsewhere.

comment:2 by orgads@…, 12 months ago

I'll try to create a minimal reproducible example.

comment:3 by Maxim Dounin, 9 months ago

Resolution: worksforme
Status: new → closed

Feedback timeout.

comment:4 by orgads@…, 4 weeks ago

Sorry for the late response. It hit us again recently.

I have a simple way to reproduce it. The relevant files are attached.

Extract the tarball.

Start with:
cd nginx
docker-compose up -d
# Test
curl http://localhost:5000/s1/
# Inspect the addresses:
docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' nginx_service1_1 nginx_service2_1
# Notice the order
docker-compose stop service1 service2
# Now swap the addresses.
# If the order was ascending:
docker-compose up -d service2
docker-compose up -d service1
# Otherwise, start service1 and then service2
# Test again
curl http://localhost:5000/s1/
# -> 502

I've noticed that the upstream address is only resolved when loading the configuration, regardless of the resolver value.

Related question: https://serverfault.com/q/240476/157997

by orgads@…, 4 weeks ago

Attachment: nginx-docker.tar.gz added

Docker example

comment:5 by orgads@…, 4 weeks ago

Resolution: worksforme
Status: closed → reopened

comment:6 by Maxim Dounin, 4 weeks ago

Resolution: invalid
Status: reopened → closed

The configuration in the archive provided is as follows:

resolver 127.0.0.11;
server {
    listen 5000;

    location /s1 {
        rewrite /s1/(.*) /$1 break;
        proxy_pass  http://service1:8060;
    }

    location /s2 {
        rewrite /s2/(.*) /$1 break;
        proxy_pass  http://service2:8070;
    }
}

Note that the proxy_pass directives do not use variables; they contain statically defined addresses. Statically defined addresses are resolved during configuration parsing, and the resolver is only used to resolve names that cannot be resolved during configuration parsing.
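
For illustration, a minimal sketch of how the first location might be rewritten so that the name is looked up at request time through the resolver. It was not posted in the ticket. It assumes the same Docker DNS address and the valid=5s value mentioned in the description; the variable name $service1_host is hypothetical. Because proxy_pass here has no URI part, the URI changed by rewrite ... break should still be the one passed upstream.

```nginx
# Sketch only, not the reporter's configuration.
resolver 127.0.0.11 valid=5s;    # valid=5s taken from the description; tune as needed

server {
    listen 5000;

    location /s1 {
        # Hypothetical variable name. Putting the host in a variable means
        # nginx cannot resolve it while parsing the configuration, so the
        # name is resolved at request time using the resolver above.
        set $service1_host service1;

        rewrite /s1/(.*) /$1 break;
        proxy_pass  http://$service1_host:8060;
    }
}
```

The same pattern would apply to /s2 with service2:8070. One trade-off to keep in mind: if the name cannot be resolved when a request arrives, that request fails at request time instead of nginx refusing to start.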
