Opened 4 months ago

Closed 4 months ago

#2014 closed defect (invalid)

nginx dynamic configuration reload is jittery

Reported by: udaysha01@…
Owned by:
Priority: minor
Milestone:
Component: nginx-core
Version: 1.18.x
Keywords:
Cc:
uname -a: Linux uday-arch 5.6.8-arch1-1 #1 SMP PREEMPT Wed, 29 Apr 2020 16:22:56 +0000 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.18.0
built with OpenSSL 1.1.1g 21 Apr 2020
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --conf-path=/etc/nginx/nginx.conf --sbin-path=/usr/bin/nginx --pid-path=/run/nginx.pid --lock-path=/run/lock/nginx.lock --user=http --group=http --http-log-path=/var/log/nginx/access.log --error-log-path=stderr --http-client-body-temp-path=/var/lib/nginx/client-body --http-proxy-temp-path=/var/lib/nginx/proxy --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-cc-opt='-march=x86-64 -mtune=generic -O2 -pipe -fno-plt -D_FORTIFY_SOURCE=2' --with-ld-opt=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now --with-compat --with-debug --with-file-aio --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_degradation_module --with-http_flv_module --with-http_geoip_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-pcre-jit --with-stream --with-stream_geoip_module --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-threads

Description

I am trying to use nginx as a reverse proxy, listening on a port and forwarding all incoming requests to a service running on another port. I want to change the configuration and update which service nginx is passing the incoming traffic to, but when I update the configuration, nginx doesn't switch cleanly from the old service to the new service. It switches back and forth a number of times before eventually settling on the new service.

Steps to reproduce: First create the following files.

Just a simple nginx config file.
nginx.conf:

events {}

http {
  server {
    listen 1970;
    location / {
      proxy_pass http://127.0.0.1:3000;
    }
  }
}

A dummy service which listens and logs the path of each HTTP request. It also generates a UUID, which it sends back along with every response.
listen.py:

#!/usr/bin/env python3

import http.server as SimpleHTTPServer
import socketserver as SocketServer
import sys
import uuid

# Port to listen on, taken from the first command-line argument.
PORT = int(sys.argv[1])
# Unique ID for this instance, so each response shows which copy served it.
me = str(uuid.uuid4())

class GetHandler(
        SimpleHTTPServer.SimpleHTTPRequestHandler
        ):

    def log_message(self, format, *args):
        pass

    def do_GET(self):
        print(me + ": " + self.path)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(bytes("ok " + me + "\n", "utf-8"))

Handler = GetHandler
httpd = SocketServer.TCPServer(("", PORT), Handler)
try:
    httpd.serve_forever()
except KeyboardInterrupt:
    pass

httpd.server_close()

A script which sends requests one after the other to nginx. It looks at the UUID reported in the response to figure out which service responded to the request, and logs whenever the responding service changes.
spammer.py:

#!/usr/bin/python3

import requests

PORT = "1970"
i = 0
pid = None  # identifier of the service that answered the previous request
while True:
    URL = "http://localhost:" + PORT + "/" + str(i)
    r = requests.get(url = URL)
    if r.content[:2] != b"ok":
        print("Failed at %d" % i)
        break
    if r.content[2:-1] != pid:
        pid = r.content[2:-1]
        print("New responder: %s at %d" % (pid, i))
    i += 1

Then run the following steps.

  1. Run nginx via: nginx -c $(pwd)/nginx.conf -g "pid $(pwd)/nginx.pid; daemon off;"
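  2. Start two copies of listen.py, each in its own terminal: python listen.py 3000 and python listen.py 3001 (chaining them with && would not work, since the second copy would only start after the first one exits)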
  3. Start a copy of spammer.py with: python spammer.py
  4. Edit the port number in nginx.conf - change it from 3000 to 3001
  5. Tell nginx to reload configuration via: kill -HUP $(cat nginx.pid)
  6. Observe the "jitter" by looking at the output of spammer.py. Actual output is something like
    New responder: b' affe8cf0-0ae5-4565-8e20-a237367f5e83' at 0
    New responder: b' 7553a5ad-6875-4903-9b15-533bd4faf53f' at 21948
    New responder: b' affe8cf0-0ae5-4565-8e20-a237367f5e83' at 21951
    New responder: b' 7553a5ad-6875-4903-9b15-533bd4faf53f' at 21952
    New responder: b' affe8cf0-0ae5-4565-8e20-a237367f5e83' at 21953
    New responder: b' 7553a5ad-6875-4903-9b15-533bd4faf53f' at 21955
    New responder: b' affe8cf0-0ae5-4565-8e20-a237367f5e83' at 21956
    New responder: b' 7553a5ad-6875-4903-9b15-533bd4faf53f' at 21965
    New responder: b' affe8cf0-0ae5-4565-8e20-a237367f5e83' at 21966
    New responder: b' 7553a5ad-6875-4903-9b15-533bd4faf53f' at 21967
    New responder: b' affe8cf0-0ae5-4565-8e20-a237367f5e83' at 21974
    New responder: b' 7553a5ad-6875-4903-9b15-533bd4faf53f' at 21977
    New responder: b' affe8cf0-0ae5-4565-8e20-a237367f5e83' at 21993
    New responder: b' 7553a5ad-6875-4903-9b15-533bd4faf53f' at 21997
    

while the expected output would be more like

New responder: b' affe8cf0-0ae5-4565-8e20-a237367f5e83' at 0
New responder: b' 7553a5ad-6875-4903-9b15-533bd4faf53f' at 21948
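
To put a number on the jitter, the output of spammer.py can be summarized after the fact. The following is only a rough sketch: the function name summarize_switches is made up for illustration, and it assumes the log lines keep exactly the "New responder: ... at N" shape shown above.

```python
import re

# Matches lines such as: New responder: b' <uuid>' at 21948
LINE_RE = re.compile(r"New responder: b' ?([0-9a-f-]+)' at (\d+)")


def summarize_switches(log_text):
    """Return (switch_count, first_switch_index, last_switch_index).

    The first "New responder" line only records the initial service,
    so it is not counted as a switch.
    """
    events = [(m.group(1), int(m.group(2))) for m in LINE_RE.finditer(log_text)]
    switches = events[1:]
    if not switches:
        return 0, None, None
    return len(switches), switches[0][1], switches[-1][1]
```

With a clean switchover the count would be 1 and the first and last index would be equal; a larger count shows how many requests the overlap between old and new workers lasted.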

Change History (1)

comment:1 by Maxim Dounin, 4 months ago

Resolution: invalid
Status: new → closed

Updating the configuration means that nginx starts new worker processes and, once the new worker processes are started, gracefully stops the old worker processes, see here. This implies that there is a time frame when both new and old worker processes are running and serving connections. While this time frame is relatively small, it is incorrect to assume that the configuration is changed atomically. Your test demonstrates that the change is indeed non-atomic.
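
If a caller needs to know when the overlap described above has ended, one possible approach is to record the worker PIDs before sending HUP and poll until none of them remain. This is a minimal sketch under several assumptions: the helper names are hypothetical, it relies on the "nginx: worker process" process title appearing in the output of ps -eo pid,args, and it does not distinguish between several nginx instances running on the same host.

```python
import subprocess
import time


def worker_pids(ps_output):
    """Extract PIDs of nginx worker processes from `ps -eo pid,args` output."""
    pids = set()
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skip the header line and anything that is not an nginx worker.
        if len(parts) == 2 and parts[0].isdigit() and "nginx: worker process" in parts[1]:
            pids.add(int(parts[0]))
    return pids


def snapshot():
    """Return the set of nginx worker PIDs currently visible to ps."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return worker_pids(result.stdout)


def wait_until_replaced(old_pids, timeout=30.0, interval=0.2):
    """Poll until none of old_pids is still a worker; False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not (old_pids & snapshot()):
            return True
        time.sleep(interval)
    return False
```

A caller would take old = snapshot() before kill -HUP and then call wait_until_replaced(old). Old workers only exit once their open connections finish, so the wait can be long if clients hold connections open.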
