Opened 11 years ago
Closed 10 years ago
#377 closed defect (fixed)
ETag is not sent with gzip
Reported by: | Владимир Андреев | Owned by: | |
---|---|---|---|
Priority: | trivial | Milestone: | |
Component: | nginx-module | Version: | |
Keywords: | etag, gzip | Cc: | |
uname -a: | Linux host-1 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux | ||
nginx -V: |
nginx version: nginx/1.4.1
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-mail --with-mail_ssl_module --with-file-aio --with-http_spdy_module --with-ipv6
Description
I noticed strange behavior when the gzip and etag directives are both enabled for serving static files.
Suppose we have already requested a resource and it is cached in the browser. We then redeploy changed files to the server and request the resource again:
GET /css/reset.css HTTP/1.1
Host: example.com
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/css,*/*;q=0.1
If-None-Match: "51bca012-41b"
If-Modified-Since: Sat, 15 Jun 2013 17:10:42 GMT
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36
Referer: http://example.com/
Accept-Encoding: gzip,deflate,sdch
Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4
Cookie: _ym_visorc=w

HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Sat, 15 Jun 2013 17:11:37 GMT
Content-Type: text/css
Last-Modified: Sat, 15 Jun 2013 17:11:34 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Content-Encoding: gzip
As you can see, there is no ETag header here. We then refresh the page once more and see this:
GET /css/reset.css HTTP/1.1
Host: example.com
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/css,*/*;q=0.1
If-None-Match: "51bca012-41b"
If-Modified-Since: Sat, 15 Jun 2013 17:11:34 GMT
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36
Referer: http://example.com/
Accept-Encoding: gzip,deflate,sdch
Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4
Cookie: _ym_visorc=w

HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Sat, 15 Jun 2013 17:11:53 GMT
Content-Type: text/css
Last-Modified: Sat, 15 Jun 2013 17:11:34 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Content-Encoding: gzip
That is, the browser sends the modification date it received in response to the first request together with the etag left over from the previous version of the file (before the changes were deployed). NGINX sees that the first condition holds and the second does not, and serves the whole file again. Refreshing the page several more times shows exactly the same picture.
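The decision described above can be sketched in a few lines of C. This is only an illustration of the "every condition that is present must hold" rule, not nginx's actual filter code: the `cond_request` and `resource_state` types and the `not_modified` function are hypothetical names, and the exact-match comparison of timestamps is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical, simplified view of a conditional GET (not nginx types). */
typedef struct {
    const char *if_none_match;         /* NULL if the header is absent */
    bool        has_if_modified_since;
    time_t      if_modified_since;
} cond_request;

/* Hypothetical view of the file as the server currently sees it. */
typedef struct {
    const char *etag;                  /* NULL if no entity tag is known */
    time_t      last_modified;
} resource_state;

/* Returns true when 304 may be sent: every condition present must hold. */
bool
not_modified(const cond_request *req, const resource_state *res)
{
    if (req->if_none_match == NULL && !req->has_if_modified_since) {
        return false;                  /* unconditional request */
    }

    /* Assumption: timestamps must match exactly. */
    if (req->has_if_modified_since
        && res->last_modified != req->if_modified_since)
    {
        return false;
    }

    if (req->if_none_match != NULL
        && (res->etag == NULL
            || strcmp(req->if_none_match, res->etag) != 0))
    {
        return false;                  /* stale or missing entity tag */
    }

    return true;
}
```

In the second request above, the date matches but the browser still holds the pre-deploy etag, so a check of this shape keeps answering 200 until the client learns a new etag, which it never does because the gzipped response carries none.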
I don't know whether this can be called a bug, but the example above is fairly typical, and returning 200 with the entire file is not good.
Why not send ETag along with compression enabled? That would eliminate this problem.
P.S. If the files are not redeployed on the server, NGINX correctly returns 304, and with an ETag header at that.
Change History (14)
comment:1 by , 11 years ago
comment:2 by , 11 years ago
Hello, this is a major issue for us. It removes all of our ETags and causes significant increases to page load times for our Rails app, and is probably affecting other Rails users as well. Weak ETags would resolve these issues for us. Please implement weak ETags!
comment:4 by , 11 years ago
Last-Modified is not universal enough to be used for automated caching (we generate an ETag for every page we serve.) Our data has non-trivial caching logic that cannot be reduced to a simple date stamp.
comment:5 by , 11 years ago
Here is a patch for it. Can you apply it?
--- nginx-1.3.8/src/http/modules/ngx_http_gzip_filter_module.c 2012-07-07 17:22:27.000000000 -0400
+++ nginx-1.3.8-weak-etags-shorter/src/http/modules/ngx_http_gzip_filter_module.c 2012-11-21 17:05:12.758389000 -0500
@@ -306,7 +306,15 @@
ngx_http_clear_content_length(r);
ngx_http_clear_accept_ranges(r);
- ngx_http_clear_etag(r);
+
+ /* Clear etags unless they're marked as weak (prefixed with 'W/') */
+ h = r->headers_out.etag;
+ if (h && !(h->value.len >= 3 &&
+ h->value.data[0] == 'W' &&
+ h->value.data[1] == '/' &&
+ h->value.data[2] == '"')) {
+ ngx_http_clear_etag(r);
+ }
return ngx_http_next_header_filter(r);
}
comment:6 by , 11 years ago
This patch looks like a dirty hack. And please see Contributing Changes for a recommended way to submit patches.
comment:7 by , 11 years ago
I'm not good at C, and it's really simple logic. Could you clean it up and do the patch?
comment:8 by , 11 years ago
How about:
--- a/src/http/modules/ngx_http_gzip_filter_module.c Mon Oct 21 18:20:32 2013 +0800
+++ b/src/http/modules/ngx_http_gzip_filter_module.c Mon Oct 21 10:18:00 2013 -0700
@@ -306,7 +306,11 @@
ngx_http_clear_content_length(r);
ngx_http_clear_accept_ranges(r);
- ngx_http_clear_etag(r);
+
+ h = r->headers_out.etag;
+ if(h && ngx_strncmp(h->value.data, "W/", 2) != 0) {
+ ngx_http_clear_etag(r);
+ }
return ngx_http_next_header_filter(r);
}
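For readers comparing the two patches: both test whether the entity tag already carries the weak prefix `W/`. A standalone sketch of that test, checking the length first as the comment:5 patch does, might look like the following. The `counted_str` type is a hypothetical stand-in for a length-counted string and `etag_is_weak` is an illustrative name, not an nginx function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a length-counted, not necessarily
 * NUL-terminated, header value. */
typedef struct {
    size_t               len;
    const unsigned char *data;
} counted_str;

/* True if the value starts with W/" (a weak entity tag).
 * The length is tested first so short values are never over-read. */
bool
etag_is_weak(const counted_str *etag)
{
    return etag->len >= 3
           && etag->data[0] == 'W'
           && etag->data[1] == '/'
           && etag->data[2] == '"';
}
```

A filter that changes the body could then clear the header only when such a check returns false, which is the behavior both patches aim for.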
comment:9 by , 11 years ago
It looks like Apache just appends -gzip to the original etag to make this process transparent and simple.
Can we do something similar?
comment:10 by , 11 years ago
No. As far as I understand, the Apache approach is actually wrong, as the resulting entities can be different but will be served with identical strict etags (e.g., with different compression levels), and this directly contradicts RFC 2616. The correct approach would probably be to downgrade strict entity tags to weak ones.
(See also http://mailman.nginx.org/pipermail/nginx-devel/2013-November/004523.html for a patch which prevents clearing of weak etags by gzip module and friends. It's unlikely to be committed though.)
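To make the "downgrade" idea concrete, here is a minimal sketch of turning a strong entity tag into a weak one by prefixing `W/`. It is an illustration only and is not taken from the linked patch; `etag_make_weak` is a hypothetical helper and it assumes NUL-terminated strings for brevity.

```c
#include <stdio.h>
#include <string.h>

/* Writes the weak form of etag into out.
 * A tag that is already weak is copied unchanged.
 * Returns 0 on success, -1 if out is too small. */
int
etag_make_weak(const char *etag, char *out, size_t out_size)
{
    int  n;

    if (strncmp(etag, "W/", 2) == 0) {
        n = snprintf(out, out_size, "%s", etag);
    } else {
        n = snprintf(out, out_size, "W/%s", etag);
    }

    return (n < 0 || (size_t) n >= out_size) ? -1 : 0;
}
```

With this, `"51bca012-41b"` would become `W/"51bca012-41b"`. The weak tag no longer promises byte-for-byte identity, so it stays valid for bodies compressed at different levels.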
comment:11 by , 11 years ago
I respectfully disagree; I think the Apache solution is adequate. The debate regarding weak vs. strong ETags was exhaustively argued by the Apache team in https://issues.apache.org/bugzilla/show_bug.cgi?id=39727, and their resulting decision to append the -gzip suffix has not caused any reported problems. The fact that there was so much discussion over it suggests the RFC is not completely clear on the subject, so they had to make a decision, and it has not proved to be a bad one (i.e. it hasn't caused problems with intermediary caching).
The fact of the matter is that any difference in the compressed bytestream is irrelevant to the client; it is the uncompressed representation that the client cares about.
Given that there is already one HTTP reference implementation that has behaved this way without ill effect, I think it's a strong argument for nginx to behave this way as well unless it can be demonstrated that it will actually cause real-world problems.
comment:12 by , 11 years ago
Would be great if we could actually do something about it, instead of worrying about being RFC blah blah compliant. If we cannot use etags, that just means we use Apache, because they actually fixed it instead of burying their heads in the sand. We saw a 40% response time drop by implementing weak etags ourselves, something that cannot be done with Last-Modified.
Here is the patch for weak etag support:
https://github.com/grosser/puppet-nginx/blob/grosser/weak-etag/files/brews/nginx_weak_etag.patch
comment:13 by , 10 years ago
I'd like to see this problem solved too. Sometimes it is hard to implement the "Last-Modified" header for dynamic content because you have to remember when that content was modified. In such situations etags are very handy, and weak etags would work quite well.
comment:14 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
The e491b26fa5a1 change introduces downgrade of strong etags to weak ones, and af229f8cf987 implements weak comparison for If-None-Match.
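As an illustration of what "weak comparison" means for If-None-Match (two tags match if they are equal once any `W/` prefix is ignored), a small sketch follows. It is not the code from the changesets named above; `opaque_part` and `etag_weak_match` are hypothetical names and NUL-terminated strings are assumed.

```c
#include <stdbool.h>
#include <string.h>

/* Skips a leading W/ so only the quoted opaque tag remains. */
const char *
opaque_part(const char *etag)
{
    return (strncmp(etag, "W/", 2) == 0) ? etag + 2 : etag;
}

/* Weak comparison: the weakness flag on either side is ignored. */
bool
etag_weak_match(const char *a, const char *b)
{
    return strcmp(opaque_part(a), opaque_part(b)) == 0;
}
```

Under this rule a client that sends back `W/"51bca012-41b"` matches a file whose strong tag is `"51bca012-41b"`, which would allow a 304 even though gzip changed the bytes on the wire.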
When gzip is used, the response body changes, and the strong entity tag of the original response can no longer be used; otherwise byte-range requests would break. Accordingly, the ETag header is currently simply removed whenever the response is modified (by the gzip filter as well as by other filters that change the response, e.g. ssi).
It is interesting what Chrome is counting on by using the ETag of a response that is guaranteed to be stale (it has already been given a new response). RFC 2616 seems to tell us:
That is, a 304 can never be returned in the situation described. It may be worth reporting this problem to the Chrome developers.
What does look like a bug is that ETag is returned in the 304 response when gzip is enabled. We need to think about what can be done about this...