Opened 4 years ago

Closed 4 years ago

#907 closed defect (fixed)

OPTIONS request timeout in Firefox (but only in a somewhat complicated situation)

Reported by: Pyry Hakulinen Owned by:
Priority: minor Milestone:
Component: nginx-core Version: 1.9.x
Keywords: http2 Cc:
uname -a: Linux somehost.fi 3.11.0-12-generic #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.9.11
built by gcc 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04)
built with OpenSSL 1.0.2f 28 Jan 2016
TLS SNI support enabled
configure arguments: --with-ipv6 --with-http_stub_status_module --with-http_ssl_module --with-http_v2_module --with-openssl=../openssl-1.0.2f --with-debug

Description

We noticed that CORS preflight OPTIONS requests don't work in the following scenario:

1) have a HTTP/2 enabled frontend
2) proxy_pass from frontend to backend
3) Use upstream keepalive (and the OPTIONS request needs to go through already open connection)
4) Use https:// upstream
5) Client needs to be Firefox (tested with 44.0.2 on Ubuntu 15.10, but it's not limited to Linux)

Everything works fine if we disable anything from the above list, it's just this combination that reliably fails to work. Debug logs are attached to the ticket (includes initial request to get the backend upstream keepalive connection going).

Steps to reproduce (tested on Ubuntu 14.04, but should be quite general):

1) Download nginx-1.9.11 & openssl-1.0.2f

2) Compile: ./configure --with-ipv6 --with-http_stub_status_module --with-http_ssl_module --with-http_v2_module --with-openssl=../openssl-1.0.2f --with-debug && make && make install

3) Make some self signed certificates:
cd /usr/local/nginx/conf;for i in api frontend backend;do openssl req -x509 -sha256 -nodes -days 365 -newkey rsa:2048 -keyout $i.example.com.key -out $i.example.com.cert -subj /CN=$i.example.com;done

4) Configure /usr/local/nginx/conf/nginx.conf

worker_processes  1;
events {
    worker_connections  1024;
}
http {
    include       mime.types;
    default_type  application/octet-stream;

    keepalive_timeout  65;

    upstream _backend {
        server 127.0.0.8:8888;
        keepalive 32;
    }

    server {
        listen 443 http2 ssl;

        server_name frontend.example.com;

        ssl_certificate frontend.example.com.cert;
        ssl_certificate_key frontend.example.com.key;
    }

    server {
        listen 443 http2 ssl;

        server_name api.example.com;

        ssl_certificate api.example.com.cert;
        ssl_certificate_key api.example.com.key;

        location / {
            proxy_pass https://_backend;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }

    server {
        listen 127.0.0.8:8888 ssl;

        server_name backend.example.com;

        ssl_certificate backend.example.com.cert;
        ssl_certificate_key backend.example.com.key;

        root html/backend;
    }
}

5) Add testing file to /usr/local/nginx/html/cors.html

<html>
<head>
</head>
<body>
Sending CORS preflight options request. Error: <span id="error"></span>
<script>
var xhr = new XMLHttpRequest();
xhr.open('GET', 'https://api.example.com/');
xhr.setRequestHeader('X-Foo', 'bar'); // This will trigger a preflight OPTIONS request
xhr.onerror = function() {
  document.getElementById("error").innerHTML = "xhr failed";
};
xhr.send();       
</script>
</body>

6) Add hosts file entries for api.example.com and frontend.example.com
7) Launch Firefox
8) Navigate to https://api.example.com/ and https://frontend.example.com/ and accept the self-signed certificates
9) Restart nginx (to make sure we are in a clean state)
10) Navigate to https://frontend.example.com/cors.html and page should (quickly) show: "Sending CORS preflight options request. Error: xhr failed" (this means everything is working)
11) Reload the page, and it will just show "Sending CORS preflight options request. Error: " (this means that the OPTIONS request is somehow stuck in nginx)
12) Eventually nginx disconnects the upstream connection and tries again, this time the request will work.

Attachments (4)

debug.log (73.3 KB ) - added by Pyry Hakulinen 4 years ago.
debug.log
body.patch (26.1 KB ) - added by Valentin V. Bartenev 4 years ago.
Please try the attached patch for nginx 1.9.12.
body2.patch (26.2 KB ) - added by Valentin V. Bartenev 4 years ago.
Thank you for testing. A fixed version of the patch is attached.
body3.patch (24.7 KB ) - added by Valentin V. Bartenev 4 years ago.
That is because part of the patch was already committed. Anyway, here is an updated version.

Download all attachments as: .zip

Change History (15)

by Pyry Hakulinen, 4 years ago

Attachment: debug.log added

debug.log

comment:1 by Maxim Dounin, 4 years ago

Keywords: http2 added
Priority: blockerminor
Status: newaccepted

Most relevant part of the debug log seems to be:

2016/02/23 15:03:21 [alert] 20906#0: *5 zero size buf in output t:0 r:0 f:1 0000000000000000 0000000000000000-0000000000000000 0000000001A53048 0-0 while sending request to upstream, client: 10.0.0.10, server: api.example.com, request: "OPTIONS / HTTP/2.0", upstream: "https://127.0.0.8:8888/", host: "api.example.com

This looks related to request body reading in HTTP/2, and is likely fixed by the upcoming rewrite of the HTTP/2 request body reading code.

comment:2 by Pyry Hakulinen, 4 years ago

Is there an ETA when the HTTP/2 request body rewrite will be available?

I ask because we see similar "zero size buf in output" errors and timeouts also for other clients. (GET requests from Facebook mobile apps). This OPTIONS request stall was just something that I could easily reproduce.

by Valentin V. Bartenev, 4 years ago

Attachment: body.patch added

Please try the attached patch for nginx 1.9.12.

comment:3 by Pyry Hakulinen, 4 years ago

Thank you for the patch. It fixed the OPTIONS requests. It also mostly fixed the Facebook mobile apps, but we do see some segfaults in production with a backtrace of:

(gdb) bt
#0  ngx_http_v2_process_request_body (r=0x22559e70, pos=pos@entry=0x224d1612 "", size=size@entry=0, last=1) at src/http/v2/ngx_http_v2.c:3552
#1  0x00000000004b156c in ngx_http_v2_state_read_data (h2c=h2c@entry=0x225b6220, pos=pos@entry=0x224d1612 "", end=end@entry=0x224d1612 "") at src/http/v2/ngx_http_v2.c:931
#2  0x00000000004b282e in ngx_http_v2_state_data (h2c=0x225b6220, pos=0x224d1612 "", end=0x224d1612 "") at src/http/v2/ngx_http_v2.c:897
#3  0x00000000004b14c8 in ngx_http_v2_state_head (h2c=<optimized out>, pos=<optimized out>, end=<optimized out>) at src/http/v2/ngx_http_v2.c:755
#4  0x00000000004b0e35 in ngx_http_v2_read_handler (rev=rev@entry=0x22232640) at src/http/v2/ngx_http_v2.c:363
#5  0x00000000004b10fc in ngx_http_v2_init (rev=0x22232640) at src/http/v2/ngx_http_v2.c:302
#6  0x000000000048be20 in ngx_http_ssl_handshake_handler (c=0x7f38473f5050) at src/http/ngx_http_request.c:793
#7  0x0000000000479573 in ngx_ssl_handshake_handler (ev=<optimized out>) at src/event/ngx_event_openssl.c:1266
#8  0x00000000004760d4 in ngx_epoll_process_events (cycle=<optimized out>, timer=<optimized out>, flags=<optimized out>) at src/event/modules/ngx_epoll_module.c:822
#9  0x000000000046e32e in ngx_process_events_and_timers (cycle=cycle@entry=0x112bf50) at src/event/ngx_event.c:242
#10 0x00000000004741e3 in ngx_worker_process_cycle (cycle=cycle@entry=0x112bf50, data=data@entry=0x8) at src/os/unix/ngx_process_cycle.c:753
#11 0x0000000000472a22 in ngx_spawn_process (cycle=cycle@entry=0x112bf50, proc=proc@entry=0x474162 <ngx_worker_process_cycle>, data=data@entry=0x8, 
    name=name@entry=0x6841ca "worker process", respawn=respawn@entry=-3) at src/os/unix/ngx_process.c:198
#12 0x0000000000473733 in ngx_start_worker_processes (cycle=cycle@entry=0x112bf50, n=12, type=type@entry=-3) at src/os/unix/ngx_process_cycle.c:358
#13 0x0000000000474906 in ngx_master_process_cycle (cycle=cycle@entry=0x112bf50) at src/os/unix/ngx_process_cycle.c:130
#14 0x00000000004557b5 in main (argc=<optimized out>, argv=<optimized out>) at src/core/nginx.c:367
(gdb) p r->request_body
$1 = (ngx_http_request_body_t *) 0x0

It only seems to happen right after binary upgrade and only a few of the new processes segfault. The segfault appear to be from the same type of clients that previously ended up in a stuck state.

The production build obviously has more stuff compiled in, but I hope the backtrace is still clean enough.

nginx version: nginx/1.9.12
built by gcc 4.7.2 (Debian 4.7.2-5) 
built with OpenSSL 1.0.2f  28 Jan 2016
TLS SNI support enabled
configure arguments: --with-http_stub_status_module --with-http_ssl_module --with-http_v2_module --conf-path=/usr/local/etc/nginx.conf --error-log-path=/var/log/nginx/error_log --http-log-path=/var/log/nginx/access_log --sbin-path=/usr/local/sbin/nginx --with-openssl=../openssl-1.0.2f/ --with-http_auth_request_module --with-threads --with-file-aio --add-module=../ngx_http_ghetto_module/ --add-module=../ngx_cache_purge-2.3/ --add-module=../ngx_devel_kit-0.2.19/ --add-module=../lua-nginx-module-0.10.1rc1/
Last edited 4 years ago by Pyry Hakulinen (previous) (diff)

by Valentin V. Bartenev, 4 years ago

Attachment: body2.patch added

Thank you for testing. A fixed version of the patch is attached.

comment:4 by Pyry Hakulinen, 4 years ago

Thanks for the patch, it works perfectly. It's running on some parts of our production environment and we haven't seen any issues with it.

Is this patch going in nginx-1.9.13 or do you need to wait until some later version?

comment:5 by Valentin V. Bartenev, 4 years ago

This patch is on review now. Not sure if the review will be finished till 1.9.13, it takes time.

comment:6 by creativeprogramming@…, 4 years ago

I'm probably getting the same issue, but I'm using SPDY (as facebook clients and mobile browsers is my main target, and they still not support HTTP/2)

this is the issue I got:

2016/03/10 13:05:44 [alert] 4855#4855: *3220 zero size buf in output t:0 r:0 f:1 0000000000000000 0000000000000000-0000000000000000 00000000038C1650 0-0 while sending request to upstream, client: 87.7.159.190, server: myhost.it, request: "GET /my/path/1002 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "myhost.it", referrer: "http://m.facebook.com/"
2016/03/10 13:05:44 [error] 4855#4855: *3220 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 87.7.159.190, server: myhost.it, request: "GET /my/path/1002 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "myhost.it", referrer: "http://m.facebook.com/"
2016/03/10 13:05:44 [alert] 4855#4855: *3220 zero size buf in output t:0 r:0 f:1 0000000000000000 0000000000000000-0000000000000000 00000000038C1650 0-0 while sending request to upstream, client: 87.7.159.190, server: myhost.it, request: "GET /my/path/1002 HTTP/1.1", upstream: "fastcgi://unix:/var/run/fallback-php-fpm-unix.sock:", host: "myhost.it", referrer: "http://m.facebook.com/"

Note: I'm using HHVM php at localhost:9000 with fallback (error_page 502 @fallback... location) to php5-fpm to the unix socket.

The url is working when i try to open with my browser but it seems i get this error with some clients

nginx -V
nginx version: nginx/1.9.4
built by gcc 4.9.2 (Ubuntu 4.9.2-10ubuntu13)
built with OpenSSL 1.0.2e 3 Dec 2015
TLS SNI support enabled
configure arguments: --with-http_realip_module --with-file-aio --with-threads --with-google_perftools_module --with-pcre --with-pcre-jit --with-ld-opt='-Wl,-z,relro -Wl,-E' --with-cc-opt='-D FD_SETSIZE=40048 -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -DTCP_FASTOPEN=23' --with-http_stub_status_module --with-http_spdy_module --with-http_ssl_module --prefix=/usr/local/nginx-pagespeed --sbin-path=/usr/local/nginx-pagespeed/nginx --conf-path=/etc/nginx/nginx.conf --pid-path=/var/run/nginx.pid --add-module=/root/nginx-source/ngx_pagespeed-release-1.10.33.5-beta/ --add-module=/root/nginx-source/ngx_cache_purge/ngx_cache_purge/ --add-module= --add-module= --with-http_gzip_static_module --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --with-http_sub_module


uname -a
Linux xxxxx 3.19.0-51-generic #58-Ubuntu SMP Fri Feb 26 21:22:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

I'll try to upgrade to latest stable and apply your patch.

comment:7 by creativeprogramming@…, 4 years ago

Ok i confirm that this patch applied to nginx 1.9.12 solves the issue with facebook clients with HTTP/2.

I'm running this in production.

Last edited 4 years ago by creativeprogramming@… (previous) (diff)

comment:8 by dave.qccareerschool.com@…, 4 years ago

body2.patch seems to be working for me too. No more referrer: "http://m.facebook.com/" [alerts] so far.

comment:9 by creativeprogramming@…, 4 years ago

Anyway i'm experiencing an issue with ngx_pagespeed when i leave this patch enabled: POST requests - but only if pagespeed optimization is enabled in the server block - are often delayed and timed out (e.g. i often cannot login in my backend). I'll cross post the issue to their github, here just to feedback for who's using it.

Last edited 4 years ago by creativeprogramming@… (previous) (diff)

comment:10 by Pyry Hakulinen, 4 years ago

Would it be possible to get an updated patch for 1.9.13? The body2.patch doesn't apply cleanly (because of [6420], which is slightly different than the one in body2.patch).

by Valentin V. Bartenev, 4 years ago

Attachment: body3.patch added

That is because part of the patch was already committed. Anyway, here is an updated version.

comment:11 by Valentin V. Bartenev, 4 years ago

Resolution: fixed
Status: acceptedclosed

This is fixed by 887cca40ba6a (nginx 1.9.14).

Note: See TracTickets for help on using tickets.