Opened 12 years ago

Closed 12 years ago

#340 closed defect (invalid)

autoindex does not treat multi-byte unicode correctly

Reported by: Calvin Li
Owned by:
Priority: trivial
Milestone:
Component: nginx-core
Version: 1.3.x
Keywords:
Cc:
uname -a: Linux epicurus 3.8.0-19-generic #29-Ubuntu SMP Wed Apr 17 18:16:28 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.2.6 (Ubuntu)
TLS SNI support enabled
configure arguments: --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-log-path=/var/log/nginx/access.log --http-proxy-temp-path=/var/lib/nginx/proxy --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --with-pcre-jit --with-http_gzip_static_module --with-http_ssl_module --with-ipv6 --without-http_browser_module --without-http_geo_module --without-http_limit_req_module --without-http_limit_zone_module --without-http_memcached_module --without-http_referer_module --without-http_scgi_module --without-http_split_clients_module --with-http_stub_status_module --without-http_ssi_module --without-http_userid_module --without-http_uwsgi_module --add-module=/build/buildd/nginx-1.2.6/debian/modules/nginx-echo

Description

The autoindex page uses spaces to line up columns. But it counts bytes, not characters, so multi-byte characters throw it off.

Example:

Index of /

../
hello                                              26-Apr-2013 00:56                   0
mon_frère                                         26-Apr-2013 00:57                   0
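
The misalignment in the listing above can be reproduced with a short sketch. This is not nginx's code; it is a minimal Python illustration of padding by byte count versus character count. The function names and the 50-column width are assumptions made for the example.

```python
# Illustration only: pad a file name to a fixed column width,
# once by UTF-8 byte count and once by character count.
# NAME_WIDTH is an assumed column width, not taken from nginx.
NAME_WIDTH = 50


def pad_by_bytes(name: str, width: int = NAME_WIDTH) -> str:
    """Pad as a byte-counting formatter would (multi-byte names come out short)."""
    byte_len = len(name.encode("utf-8"))
    return name + " " * max(width - byte_len, 0)


def pad_by_chars(name: str, width: int = NAME_WIDTH) -> str:
    """Pad by code point count, so every name fills the same number of columns."""
    return name + " " * max(width - len(name), 0)


if __name__ == "__main__":
    # "\u00e8" is the precomposed "è": one character, two bytes in UTF-8.
    for name in ("hello", "mon_fr\u00e8re"):
        print(pad_by_bytes(name) + "|")
        print(pad_by_chars(name) + "|")
```

With byte counting, the padded "mon_frère" entry is one column short, which matches the shifted date column in the example. Counting code points still ignores combining marks and double-width characters, so it is only a partial fix.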

(This is a well-known problem in PHP, if it helps: see mb_strlen() in http://php.net/manual/en/book.mbstring.php)

Solution: use string functions that support multi-byte Unicode. See PHP docs above.

Change History (2)

comment:1 by Alex Storn, 12 years ago

Enca is an Extremely Naive Charset Analyser. http://gitorious.org/enca

comment:2 by Maxim Dounin, 12 years ago

Resolution: invalid
Status: new → closed

You have to set the charset using the http://nginx.org/r/charset directive.
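
A minimal configuration sketch of that suggestion follows. The `location /` block is a hypothetical placement; `autoindex` and `charset` are the existing nginx directives. The assumption here is that the file names on disk are UTF-8 encoded, so declaring `utf-8` lets autoindex measure names in characters rather than bytes.

```nginx
# Hypothetical location block; adjust to the directory being listed.
location / {
    autoindex on;
    # Declare the response charset (assumes UTF-8 file names on disk).
    charset utf-8;
}
```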
