Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#1305 closed defect (wontfix)

OCSP stapling fails due to not working DNS resolution

Reported by: grazzolini@…
Owned by:
Priority: minor
Milestone:
Component: other
Version: 1.13.x
Keywords: ocsp, dns, stapling
Cc:
uname -a: Linux arch 4.11.6-3-ARCH #1 SMP PREEMPT Thu Jun 22 12:21:46 CEST 2017 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.13.1
built with OpenSSL 1.1.0f 25 May 2017
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --conf-path=/etc/nginx/nginx.conf --sbin-path=/usr/bin/nginx --pid-path=/run/nginx.pid --lock-path=/run/lock/nginx.lock --user=http --group=http --http-log-path=/var/log/nginx/access.log --error-log-path=stderr --http-client-body-temp-path=/var/lib/nginx/client-body --http-proxy-temp-path=/var/lib/nginx/proxy --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-compat --with-file-aio --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_degradation_module --with-http_flv_module --with-http_geoip_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-pcre-jit --with-stream --with-stream_geoip_module --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-threads

Description

I'm the maintainer of Arch Linux's nginx-mainline package. We recently received a bug report, https://bugs.archlinux.org/task/54478, claiming that if no nameserver is available or there is no connectivity at boot time, OCSP stapling fails and nginx does not retry it, even after connectivity is restored.

We need to issue a restart to make it work. We have recently added a workaround to both of our packages, nginx and nginx-mainline, that only works on Arch Linux because it relies on systemd: we make the nginx service unit wait for network-online.target and nss-lookup.target. Even though this workaround works, it would be nice if nginx periodically checked for connectivity and re-issued an OCSP stapling request.

Change History (2)

comment:1 by Maxim Dounin, 7 years ago

Resolution: wontfix
Status: new → closed

With OCSP stapling enabled, nginx will parse the OCSP responder address from either the configuration (ssl_stapling_responder) or the certificate while reading the configuration file. If the address is invalid or cannot be resolved, nginx will disable OCSP stapling for that particular certificate and will log something like this to the console and the error log:

2017/06/29 01:46:17 [warn] 30472#100049: "ssl_stapling" ignored, host not found in OCSP responder "no-such-domain.example.com" in the certificate "/path/to/certificate"

This mostly agrees with how all names are handled in nginx: when names are used in the configuration, they are resolved during configuration parsing. (Normally, a name resolution error results in a fatal error and a failed startup. But as OCSP stapling is an optimization, we opted to ignore such errors with an appropriate warning and disable OCSP stapling instead.)

The resolver directive, as mentioned in the original report, is used to resolve names which are only available at runtime, as in proxy_pass with variables. In the case of OCSP stapling, if available, it is additionally used to re-resolve the OCSP responder name periodically, to make sure nginx will see updated IP addresses if they change. It is not a replacement for the initial name resolution during configuration parsing, though.

Requiring name resolution to work during nginx startup generally looks like the correct solution as long as names are used in the configuration, including the ones used in OCSP stapling.
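For illustration, the directives discussed above might be combined roughly as follows. This is only a sketch: the server name, certificate paths, responder URL, and resolver address are placeholders that do not come from the report, and the timing values are arbitrary.

```nginx
server {
    listen 443 ssl;
    server_name example.com;                      # hypothetical name

    ssl_certificate     /path/to/certificate;     # placeholder path
    ssl_certificate_key /path/to/private.key;     # placeholder path

    ssl_stapling on;

    # Optional override of the responder URL taken from the certificate.
    # The host name here (a made-up example) is resolved while the
    # configuration is read; if that fails, stapling is disabled for this
    # certificate with a warning.
    ssl_stapling_responder http://ocsp.example.com/;

    # Used at runtime to re-resolve the responder name periodically.
    # It does not replace the initial resolution at startup.
    # 192.0.2.53 is a documentation-range placeholder address.
    resolver 192.0.2.53 valid=300s;
    resolver_timeout 5s;
}
```

With a configuration like this, the responder name still has to resolve when nginx starts or reloads; the resolver only keeps the addresses fresh afterwards.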

comment:2 by grazzolini@…, 7 years ago

So, basically, setting the resolver directive lets nginx cope with network failures after it has started, but if it started without connectivity, it won't retry, even if the resolver becomes available afterwards?

I really think this shouldn't be dismissed right away as wontfix. But at least if this is truly a design decision, I think it should be properly documented.
