Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#813 closed defect (invalid)

Double encoding of utf-8 strings inside x509 certificates

Reported by: Spindel@… Owned by:
Priority: minor Milestone:
Component: nginx-core Version: 1.6.x
Keywords: Cc:
uname -a: Linux 3.10.0-229.11.1.el7.x86_64 #1 SMP Thu Aug 6 01:06:18 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.6.3
built by gcc 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC)
TLS SNI support enabled
configure arguments: --prefix=/usr/share/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/tmp/client_body --http-proxy-temp-path=/var/lib/nginx/tmp/proxy --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --http-scgi-temp-path=/var/lib/nginx/tmp/scgi --pid-path=/run/ --lock-path=/run/lock/subsys/nginx --user=nginx --group=nginx --with-file-aio --with-ipv6 --with-http_ssl_module --with-http_spdy_module --with-http_realip_module --with-http_addition_module --with-http_xslt_module --with-http_image_filter_module --with-http_geoip_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_degradation_module --with-http_stub_status_module --with-http_perl_module --with-mail --with-mail_ssl_module --with-pcre --with-pcre-jit --with-google_perftools_module --with-debug --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic' --with-ld-opt='-Wl,-z,relro -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -Wl,-E'


When an x509 certificate contains UTF-8 data in its fields, the data is double-encoded.

This is visible in the logs, where it turns out like this:


The same happens when the values are passed to downstream applications through their environment.

This causes quite a few problems, as we expect un-escaped UTF-8 strings downstream (or at most a single "unfuck encoding" step).

Change History (3)

comment:1 by Maxim Dounin, 9 years ago

Resolution: invalid
Status: new → closed

All control characters, non-ASCII characters, " and \ in access logs are escaped using the \xXX encoding. This is needed to allow unambiguous parsing of access logs with multiple fields. On the other hand, UTF-8 characters in x509 names are escaped with a similar encoding by OpenSSL itself when accessed as text. That's why the data appears to be double-encoded.

If you properly remove the \xXX encoding when parsing access logs, you should obtain exactly the same text representation OpenSSL produces.
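As a rough illustration of the point above, the following Python sketch removes one layer of \xXX escaping from a logged field. The function name and the sample value are hypothetical; the sample assumes that a backslash produced by OpenSSL is itself logged as \x5C, which follows from the escaping rule described above but was not taken from a real log.

```python
import re

# One \xXX escape, as used for escaped bytes in access logs.
_HEX_ESCAPE = re.compile(rb"\\x([0-9A-Fa-f]{2})")


def unescape_log_field(field: bytes) -> bytes:
    # Replace every \xXX sequence with the single byte it stands for.
    # re.sub scans the original input only, so bytes produced by a
    # replacement are never unescaped a second time.
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), field)


# Hypothetical logged value: the OpenSSL text "Link\xC3\xB6ping" after
# each backslash was escaped once more as \x5C.
logged = rb"/L=Link\x5CxC3\x5CxB6ping"
openssl_text = unescape_log_field(logged)
```

After this single step, `openssl_text` holds the OpenSSL text form, which still contains OpenSSL's own \xXX escapes.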

See also ticket #366 about introducing (switching to) RFC 2253-compatible versions of x509 strings; it may be relevant.

comment:2 by Spindel@…, 9 years ago

This isn't only in access logs; it also affects parameters passed to, for example, PHP.

Example config (nginx):

fastcgi_param SSL_CLIENT_I_DN $ssl_client_i_dn;
fastcgi_param SSL_CLIENT_S_DN $ssl_client_s_dn;

Relevant parts of phpinfo():

_SERVER["SSL_CLIENT_I_DN"]	/C=SE/OU=Caramel/L=Link\xC3\xB6ping/O=Modio AB/ST=\xC3\x96sterg\xC3\xB6tland/CN=Caramel Signing Certificate
_SERVER["SSL_CLIENT_S_DN"]	/C=SE/ST=\xC3\x96sterg\xC3\xB6tland/L=Link\xC3\xB6ping/O=Modio AB/OU=Caramel/CN=R\xC3\xA4ksm\xC3\xB6rg\xC3\xA5s

This is closer to our real use case. We expect proper strings, not something that has to be run through multiple layers of un-escaping to become useful. The test uses the traditional Swedish test string "Räksmörgås" (which contains all the lowercase special characters).

comment:3 by Maxim Dounin, 9 years ago

The phpinfo() output provided shows the representation that OpenSSL produces, with all non-ASCII bytes escaped with \xXX. Try

openssl x509 -text -noout -in <certificate.pem>

on a relevant certificate to compare.

Either way, escaping of X509 names is unavoidable, as they can contain various non-printable characters, as well as characters used as separators in the string representation. The only question is how exactly they will be escaped. As previously said, nginx uses the traditional representation as produced by OpenSSL.
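For a downstream application, undoing this representation could look like the Python sketch below. The function name is hypothetical, and the sketch assumes the value is pure ASCII with \xXX escapes, that the escaped bytes form valid UTF-8, and that the name contains no literal backslash followed by "x" and two hex digits (the traditional format cannot distinguish that case).

```python
import re

# One \xXX escape in the traditional OpenSSL text form of a name.
_HEX_ESCAPE = re.compile(rb"\\x([0-9A-Fa-f]{2})")


def decode_openssl_dn(dn: str) -> str:
    # The escaped form is expected to be ASCII-only; encode() raises
    # UnicodeEncodeError if that assumption does not hold.
    raw = dn.encode("ascii")
    # Turn each \xXX sequence back into the byte it represents.
    raw = _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    # Assumption: the certificate fields were UTF-8 encoded.
    return raw.decode("utf-8")


# Subject DN as shown in the phpinfo() output above.
subject = (
    r"/C=SE/ST=\xC3\x96sterg\xC3\xB6tland/L=Link\xC3\xB6ping"
    r"/O=Modio AB/OU=Caramel/CN=R\xC3\xA4ksm\xC3\xB6rg\xC3\xA5s"
)
decoded = decode_openssl_dn(subject)
```

This only reverses the byte escaping; splitting the result on "/" remains unreliable whenever a field value itself contains a separator character.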
