Opened 17 months ago

Closed 2 months ago

#1030 closed defect (fixed)

100% CPU with hash consistent balancing method without live upstreams

Reported by: alex.umansky@… Owned by:
Priority: major Milestone:
Component: nginx-core Version: 1.11.x
Keywords: Cc:
Sensitive: no
uname -a: Linux RP1 3.10.0-327.18.2.el7.x86_64 #1 SMP Thu May 12 11:03:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.11.1 (TFO custom build) built by gcc 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) built with OpenSSL 1.0.1e-fips 11 Feb 2013 TLS SNI support enabled configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error_log --http-log-path=/var/log/nginx --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --user=nginx --group=nginx --build='TFO custom build' --with-http_stub_status_module --with-http_ssl_module --with-http_realip_module --without-http_ssi_module --without-http_userid_module --without-http_geo_module --without-http_referer_module --without-http_uwsgi_module --without-http_scgi_module --without-http_memcached_module --without-http_browser_module --without-http_upstream_ip_hash_module --without-http_upstream_least_conn_module --with-http_gunzip_module --add-module=./ngx_devel_kit-0.3.0/ --add-module=./lua-nginx-module-0.10.5 --add-module=./headers-more-nginx-module-0.30 --with-pcre-jit --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DTCP_FASTOPEN=23'

Description

Good day,
For caching purposes I have the following upstream configuration:

upstream upstream1 {

hash $arg_1$arg_2$arg_3 consistent;

server backend1:80 weight=100;
server backend2:80 weight=100;
server backend3:80 weight=100;
server backend4:80 weight=100;

zone upstream 10m;
keepalive 512;
}

Once backend servers experience problems, and the proxy considers all of it as unavailable (leading to "no live upstreams while connecting to upstream" in error logs) then all workers go 100% CPU. Single restarts usually don't help. It ends with either few nginx restarts or wait for nginx to recover (yes, it recovers on its own after a while) or by taking care of backend.
I guess all 3 lead to same thing - backend availability.

Since backup upstream can't be used with 'hash' balancing algo, I tried adding another upstream (leading to nginx itself and some special response) with very low weight:

server backend1:80 weight=1000;
...
server localhost:81 weight=1;

But even that didn't help.

That "backup" location stops receiving requests. And once all other upstreams are down it also goes down.
So CPU goes 100%, and response time grows to unreal values like 1-2+ minutes regardless the timeouts set (well, server's CPU is 100% in use...)

Without 'consistent' it works fine as well as with basic RR balancing method that been in use so far.

Also, that proxy works with 10k-60k req/sec, and the problem might happen at any time.

Thanks!

Change History (6)

comment:1 Changed 17 months ago by pluknet

Such configuration results in a large total number of consistent hash points.
In case of all servers failed, every point will be tried to no avail and there is no limit on the number of such attempts.
Probably it needs to be specially handled to early return from balancing in such case.

comment:2 follow-up: Changed 15 months ago by savetherbtz@…

Shouldn't hash module have something like this in case all upstreams are down(like all other modules)?

            /* all peers failed, mark them as live for quick recovery */

            for (peer = hp->rrp.peers->peer; peer; peer = peer->next) {
                peer->fails = 0;
            }

Also using an algorithm with better asymptotics(e..g permutation-based consistent hashing) may help:
https://github.com/dropbox/godropbox/blob/master/hash2/consistent_hash.go#L34

comment:3 in reply to: ↑ 2 Changed 15 months ago by mdounin

Replying to savetherbtz@…:

Shouldn't hash module have something like this in case all upstreams are down(like all other modules)?

            /* all peers failed, mark them as live for quick recovery */

            for (peer = hp->rrp.peers->peer; peer; peer = peer->next) {
                peer->fails = 0;
            }

Right now we are considering removing this code from all modules, as there are multiple drawbacks of the code. Either way this will only hide the problem by avoiding all peers down situation in some cases, but not all. For example, adding the down flag to all servers will still trigger the same problem.

comment:4 Changed 15 months ago by mdounin

Probably most simple solution would be to just limit number of tries like we do for all other hash balancing algorithms. This will result in non-optimal distribution if nginx won't be able to select a server in a reasonable time, but at least this will prevent severe performance degradation as observed.

# HG changeset patch
# User Maxim Dounin <mdounin@mdounin.ru>
# Date 1474488921 -10800
#      Wed Sep 21 23:15:21 2016 +0300
# Node ID 7de2efccfb1d0a12bdf6a99b5886eaf6355d3314
# Parent  9cf2dce316e51b6eb942f0c0f5070d35298c7a63
Upstream hash: limit number of tries in consistent case (ticket #1030).

While this may result in non-ideal distribution of requests if nginx
won't be able to select a server in a reasonable number of attempts,
this still looks better than severe performance degradation observed
if there is no limit and there are many points configured.  This is
also in line with what we do for other hash balancing methods.

diff --git a/src/http/modules/ngx_http_upstream_hash_module.c b/src/http/modules/ngx_http_upstream_hash_module.c
--- a/src/http/modules/ngx_http_upstream_hash_module.c
+++ b/src/http/modules/ngx_http_upstream_hash_module.c
@@ -499,6 +499,11 @@ ngx_http_upstream_get_chash_peer(ngx_pee
 
     ngx_http_upstream_rr_peers_wlock(hp->rrp.peers);
 
+    if (hp->tries > 20 || hp->rrp.peers->single) {
+        ngx_http_upstream_rr_peers_unlock(hp->rrp.peers);
+        return hp->get_rr_peer(pc, &hp->rrp);
+    }
+
     pc->cached = 0;
     pc->connection = NULL;
 
@@ -570,10 +575,9 @@ ngx_http_upstream_get_chash_peer(ngx_pee
         hp->hash++;
         hp->tries++;
 
-        if (hp->tries >= points->number) {
-            pc->name = hp->rrp.peers->name;
+        if (hp->tries > 20) {
             ngx_http_upstream_rr_peers_unlock(hp->rrp.peers);
-            return NGX_BUSY;
+            return hp->get_rr_peer(pc, &hp->rrp);
         }
     }
 

Additionally, the following patch somewhat improves order of checks. The patch alone results in 5x better performance in my tests with degraded upstream, though it's still 100x times slower that it should be.

# HG changeset patch
# User Maxim Dounin <mdounin@mdounin.ru>
# Date 1474489827 -10800
#      Wed Sep 21 23:30:27 2016 +0300
# Node ID 6fee68ae4034ca7d5a191ff66632e6954b6eb70f
# Parent  7de2efccfb1d0a12bdf6a99b5886eaf6355d3314
Upstream hash: reordered peer checks.

This slightly reduces cost of selecting a peer if all or almost all peers
failed, see ticket #1030.  There should be no measureable difference with
other workloads.

diff --git a/src/http/modules/ngx_http_upstream_hash_module.c b/src/http/modules/ngx_http_upstream_hash_module.c
--- a/src/http/modules/ngx_http_upstream_hash_module.c
+++ b/src/http/modules/ngx_http_upstream_hash_module.c
@@ -540,16 +540,16 @@ ngx_http_upstream_get_chash_peer(ngx_pee
                 continue;
             }
 
-            if (peer->server.len != server->len
-                || ngx_strncmp(peer->server.data, server->data, server->len)
-                   != 0)
+            if (peer->max_fails
+                && peer->fails >= peer->max_fails
+                && now - peer->checked <= peer->fail_timeout)
             {
                 continue;
             }
 
-            if (peer->max_fails
-                && peer->fails >= peer->max_fails
-                && now - peer->checked <= peer->fail_timeout)
+            if (peer->server.len != server->len
+                || ngx_strncmp(peer->server.data, server->data, server->len)
+                   != 0)
             {
                 continue;
             }

comment:5 Changed 2 months ago by Maxim Dounin <mdounin@…>

In 7123:6c52b24fcf8e/nginx:

Upstream hash: limited number of tries in consistent case.

While this may result in non-ideal distribution of requests if nginx
won't be able to select a server in a reasonable number of attempts,
this still looks better than severe performance degradation observed
if there is no limit and there are many points configured (ticket #1030).
This is also in line with what we do for other hash balancing methods.

comment:6 Changed 2 months ago by mdounin

  • Resolution set to fixed
  • Status changed from new to closed

Patches committed.

Note: See TracTickets for help on using tickets.