#813 closed defect (invalid)
Double encoding of utf-8 strings inside x509 certificates
Reported by: | Owned by: | ||
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | nginx-core | Version: | 1.6.x |
Keywords: | Cc: | ||
uname -a: | Linux frontend.modio.gerox.se 3.10.0-229.11.1.el7.x86_64 #1 SMP Thu Aug 6 01:06:18 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux | ||
nginx -V: |
nginx version: nginx/1.6.3
built by gcc 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) TLS SNI support enabled configure arguments: --prefix=/usr/share/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/tmp/client_body --http-proxy-temp-path=/var/lib/nginx/tmp/proxy --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --http-scgi-temp-path=/var/lib/nginx/tmp/scgi --pid-path=/run/nginx.pid --lock-path=/run/lock/subsys/nginx --user=nginx --group=nginx --with-file-aio --with-ipv6 --with-http_ssl_module --with-http_spdy_module --with-http_realip_module --with-http_addition_module --with-http_xslt_module --with-http_image_filter_module --with-http_geoip_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_degradation_module --with-http_stub_status_module --with-http_perl_module --with-mail --with-mail_ssl_module --with-pcre --with-pcre-jit --with-google_perftools_module --with-debug --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic' --with-ld-opt='-Wl,-z,relro -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -Wl,-E' |
Description
When an x509 certificate contains UTF-8 data in its fields, the data is double-encoded.
This is visible in the logs, where it turns out like this:
/C=SE/ST=\x5CxC3\x5Cx96sterg\x5CxC3\x5CxB6tland/L=Link\x5CxC3\x5CxB6ping/
The same happens when the value is passed in the environment to downstream applications.
This causes quite a few problems, as we expect unescaped UTF-8 strings downstream (or at most a single "unfuck encoding" step).
Change History (3)
comment:1 by , 9 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
comment:2 by , 9 years ago
This isn't only in access logs; it also affects parameters passed to, for example, PHP.
Example config (nginx):
fastcgi_param SSL_CLIENT_I_DN $ssl_client_i_dn;
fastcgi_param SSL_CLIENT_S_DN $ssl_client_s_dn;
Relevant parts of phpinfo():
_SERVER["SSL_CLIENT_I_DN"] /C=SE/OU=Caramel/L=Link\xC3\xB6ping/O=Modio AB/ST=\xC3\x96sterg\xC3\xB6tland/CN=Caramel Signing Certificate
_SERVER["SSL_CLIENT_S_DN"] /C=SE/ST=\xC3\x96sterg\xC3\xB6tland/L=Link\xC3\xB6ping/O=Modio AB/OU=Caramel/CN=R\xC3\xA4ksm\xC3\xB6rg\xC3\xA5s
This is closer to the real use case. We expect proper strings, not something that has to be run through multiple layers of unescaping to become useful. The test uses the traditional local test string "Räksmörgås" (which contains all the lowercase special characters).
comment:3 by , 9 years ago
The phpinfo() output provided shows the representation that OpenSSL produces, with all non-ASCII bytes escaped as \xXX. Try
openssl x509 -text -noout -in <certificate.pem>
on a relevant certificate to compare.
Either way, escaping of X509 names is unavoidable, as they can contain various non-printable characters, as well as characters used as separators in a string representation. The only question is how exactly they will be escaped. As previously said, nginx uses the traditional representation as produced by OpenSSL.
All control characters, non-ASCII characters, " and \ in access logs are escaped using the \xXX encoding. This is needed to allow unambiguous parsing of access logs with multiple fields. On the other hand, UTF-8 characters in x509 names are encoded by OpenSSL itself, using a similar encoding, when accessed as text. That's why the value appears to be double-encoded.
If you properly remove the \xXX encoding when parsing access logs, you should obtain exactly the same text representation that OpenSSL produces.
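As a rough illustration of the two layers described above, the following Python sketch removes one layer of \xXX escaping at a time, using the log line from the description. The helper name `unescape_once` is hypothetical and not part of nginx or OpenSSL. The first pass undoes the access log escaping and should yield the OpenSSL text form; the second pass is only a heuristic for recovering UTF-8 from that form, and assumes the name contains no literal text that happens to look like a \xXX sequence.

```python
import re

# Matches one \xXX escape sequence (two hex digits).
_ESCAPE = re.compile(rb"\\x([0-9A-Fa-f]{2})")


def unescape_once(data: bytes) -> bytes:
    """Replace every \\xXX sequence with the byte it denotes (one layer only)."""
    return _ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), data)


# Value as it appears in the access log (from the ticket description).
logged = rb"/C=SE/ST=\x5CxC3\x5Cx96sterg\x5CxC3\x5CxB6tland/L=Link\x5CxC3\x5CxB6ping/"

# Layer 1: undo the access log escaping, giving the OpenSSL text form.
openssl_form = unescape_once(logged)

# Layer 2 (heuristic): undo the OpenSSL-style escaping and decode as UTF-8.
utf8_text = unescape_once(openssl_form).decode("utf-8")

print(openssl_form.decode("ascii"))
print(utf8_text)
```

For values passed via fastcgi_param, as in the phpinfo() output above, only the second step would apply, since those values are not passed through access log escaping.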
See also ticket #366 about introducing (switching to) rfc2253-compatible versions of x509 strings, it may be interesting/related.