Opened 3 weeks ago

Closed 11 days ago

#1891 closed defect (wontfix)

$body_bytes_sent and $bytes_sent wrong during sliced subrequest

Reported by: arnili@… Owned by:
Priority: minor Milestone:
Component: other Version: 1.14.x
Keywords: Cc:
uname -a: Linux 3ea167ae5f93 4.4.0-145-generic #171-Ubuntu SMP Tue Mar 26 12:43:40 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.14.0 (Ubuntu) built with OpenSSL 1.1.1 11 Sep 2018 TLS SNI support enabled configure arguments: --with-cc-opt='-g -O2 -fdebug-prefix-map=/build/nginx-DUghaW/nginx-1.14.0=. -fstack-protector-strong -Wformat -Werror=format-security -fPIC -Wdate-time -D_FORTIFY_SOURCE=2' --with-ld-opt='-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now -fPIC' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --modules-path=/usr/lib/nginx/modules --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_v2_module --with-http_dav_module --with-http_slice_module --with-threads --with-http_addition_module --with-http_geoip_module=dynamic --with-http_gunzip_module --with-http_gzip_static_module --with-http_image_filter_module=dynamic --with-http_sub_module --with-http_xslt_module=dynamic --with-stream=dynamic --with-stream_ssl_module --with-mail=dynamic --with-mail_ssl_module

Description

When a partial slice hit triggers a subrequest to fetch more data from the backend, the $body_bytes_sent and $bytes_sent variables contain wrong information.

1.: Both variables should actually be 0. Data is sent to and accounted in the main request.
2.: The variables should certainly not be the same number.

log_format cachelog '[$cacheidentifier] $remote_addr [$time_local] "$request" $status $body_bytes_sent $bytes_sent $upstream_response_length $upstream_bytes_received "$upstream_cache_status" "$http_range" "$slice_range"';
log_subrequest on;
[abc123] 10.x.x.x [16/Nov/2019:17:27:44 +0000] "GET /abc123 HTTP/1.1" 206 266678 266678 1048576 1048955 "MISS" "bytes=1863680-2129919" "bytes=2097152-3145727"
[abc123] 10.x.x.x [16/Nov/2019:17:27:44 +0000] "GET /abc123 HTTP/1.1" 206 266240 266678 - - "HIT" "bytes=1863680-2129919" "-"

First request is a subrequest of the second triggered by a partial range hit.
Data sent to the client is logged twice.

Change History (7)

comment:1 Changed 3 weeks ago by arut

The $bytes_sent variable returns the number of bytes sent to the client. The value is the same for the main request and all subrequests because client connection is the same for all of them. Subrequest response is sent directly to the client. There's no fake client in a subrequest with a separate byte counter.

As for the $body_bytes_sent variable, this value currently does not make sense for a subrequest. It returns the value of $bytes_sent minus header size for the active request. For the main request it's the actual response header size, but it is zero for a subrequest.

comment:2 Changed 3 weeks ago by arnili@…

If this works as designed, how would I account all effective client and upstream bytes to evaluate cache effectiveness?

Currently some client bytes are logged multiple times with no way of knowing.

comment:3 Changed 3 weeks ago by arut

You can reach for upstream counters by accessing upstream-side variables rather than client-side variables. For example $upstream_response_length keeps the length of the upstream response.

http://nginx.org/en/docs/http/ngx_http_upstream_module.html#var_upstream_response_length

comment:4 follow-up: Changed 3 weeks ago by arnili@…

Yes, $upstream_response_length and $upstream_bytes_received work as expected. I have shown it above.

Back to the ticket topic:
When using slicing $body_bytes_sent and $bytes_sent will log more traffic than actually happened during a partial hit.
There is no way of knowing how many bytes were double logged and thus no way to evaluate cache efficiency reliably.

comment:5 in reply to: ↑ 4 Changed 12 days ago by arut

The documentation says:

$bytes_sent - number of bytes sent to a client

This is exactly what you get when you access it from any subrequest. There's no double logging, it's just the number of bytes went to the client socket.

If you're interested in the number of bytes (if any) sent by the upstream server while serving a slice, $upstream_response_length seems like the right choice.

comment:6 Changed 12 days ago by arnili@…

I'm unsure how to make this any clearer at this point, I'll try:

1.: Turn log_subrequest on and use a sliced cache that produces partial hits
2.: Sum up all the $bytes_sent in your log file
3.: You will have a number larger than bytes actually sent to clients

How can this be correct?

comment:7 Changed 11 days ago by mdounin

  • Resolution set to wontfix
  • Status changed from new to closed

While your confusion is understood, the other side of the problem is that you are trying to sum up numbers which refer to the same thing, hence an incorrect result is expected. Basically, we have two options:

  • Disable these variables in subrequests, artificially returning 0 or an empty string.
  • Keep the variables as is, explaining they shouldn't be looked at in subrequests (unless you understand that these are numbers for a particular connection at the time of logging of a particular subrequest), much like these variables shouldn't be used before the response is fully sent.

Current approach is the second one. Additionally, it is backed by the fact that log_subrequest is off by default, so these numbers are only visible if you understand what subrequests are, and explicitly switched on subrequest logging.

Back to the practical problem you have - if you want to evaluate cache efficiency and hence want to find out how many bytes were sent to the client, consider using $bytes_sent only from the main request log. Note though that it might be tricky to find out which request is the main one. Looking that the $request_length variable might be the way to go - it will be 0 for subrequests.

Note: See TracTickets for help on using tickets.