Opened 10 years ago

Closed 10 years ago

#501 closed defect (invalid)

Incorrect Host HTTP header when using upstream element in location

Reported by: Antti Huokko
Owned by:
Priority: major
Milestone:
Component: nginx-core
Version: 1.5.x
Keywords: upstream HTTP Host header
Cc:
uname -a: Linux <HOST-NAME> 2.6.32-431.3.1.el6oso.bz844450v4.x86_64 #1 SMP Mon Jan 27 17:37:39 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
nginx -V: nginx version: nginx/1.5.10
built by gcc 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)
configure arguments: --prefix=/var/lib/openshift/<USER>/app-root/data//nginx --with-pcre=/var/lib/openshift/<USER>/app-root/data//pcre --user=<USER> --group=<USER> --without-mail_pop3_module --without-mail_smtp_module --without-mail_imap_module --without-http_uwsgi_module --without-http_scgi_module --with-debug

Description

Hello,

I have a simple load-balancing configuration that uses an upstream block. When I reference the upstream block from a location block, the Host HTTP header is sent incorrectly: it contains the name of the upstream configuration block.

Here is an example of my configuration. For testing purposes I have configured only a single upstream server.

upstream backend {
  server backend.example.com;
}
location /test/ {
  proxy_pass http://backend/;
  proxy_set_header X-Real-IP $remote_addr;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

With this configuration the HTTP Host header gets the value 'backend' instead of the actual server address 'backend.example.com'. I am just wondering how it is possible that Nginx exposes part of its configuration as an HTTP header. Also, in my environment it is crucial that the Host header is sent correctly. This is in fact the only thing that now prevents me from replacing Apache with Nginx as a load balancer.

Change History (8)

comment:1 by Maxim Dounin, 10 years ago

Resolution: invalid
Status: new → closed

This is the expected behaviour. You specify the actual name of the server in the proxy_pass directive, and that name is used in the request to the upstream. The upstream block is a way to override the addresses into which the name resolves, much like DNS, which is used by default if there is no upstream block with the given name.
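To make the explanation above concrete, here is a minimal sketch of the reporter's configuration with the default Host handling written out. It relies on the documented default of the proxy module, `proxy_set_header Host $proxy_host;`, where `$proxy_host` is the name given in proxy_pass. The enclosing `server` block and the `listen` port are assumptions added so the fragment is complete.

```nginx
# http context
upstream backend {
    # Only the address is taken from here; the name "backend" is what
    # proxy_pass refers to and what ends up in the Host header.
    server backend.example.com;
}

server {
    listen 80;  # assumed port, for illustration only

    location /test/ {
        proxy_pass http://backend/;

        # Implicit default, spelled out for illustration:
        # $proxy_host is the host name from proxy_pass, so the
        # request to the upstream carries "Host: backend".
        proxy_set_header Host $proxy_host;

        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```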

If you want nginx to send Host: backend.example.com, there are two basic options:

  • Use this name in proxy_pass:
        upstream backend.example.com {
            server ...;
        }
    
        proxy_pass http://backend.example.com;
    
  • Use proxy_set_header to specify Host explicitly:
        proxy_pass http://backend;
        proxy_set_header Host backend.example.com;
    

See the nginx documentation for proxy_pass and proxy_set_header for more information.

comment:2 by Antti Huokko, 10 years ago

Hello,

Thank you for the quick reply.

However, I still don't understand how I can achieve real load balancing using the methods you described. In the actual scenario I will always have two or more servers in the upstream block, which means that I cannot use a static value for the HTTP Host header.

Here is a better example with 2 upstream servers.

upstream backend {
  server backend1.example.com;
  server backend2.example.com;
}

Now if I use this there is no load balancing. All the requests will now go to host 'backend1.example.com'.

location /test/ {
  proxy_pass http://backend1.example.com/;
}

If I configure load balancing this way, the Host header will be incorrect for every other request, i.e. whenever the load balancer chooses host 'backend2.example.com'.

location /test/ {
  proxy_pass http://backend;
  proxy_set_header Host backend1.example.com;
}

comment:3 by Antti Huokko, 10 years ago

Resolution: invalid
Status: closed → reopened

comment:4 by Maxim Dounin, 10 years ago

Resolution: invalid
Status: reopened → closed

You have to configure all your backend servers to recognize one single name.
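As a rough sketch of what this could look like: the proxy uses one shared name for the upstream group, and every backend is configured to answer to that name. The name `app.example.com`, the `listen` ports, and the assumption that the backends are themselves nginx servers are all hypothetical and not taken from the ticket.

On the proxy:

```nginx
# http context
# The group is named after the shared host name, so the default
# Host header ($proxy_host) is "app.example.com" for every request,
# whichever server is picked.
upstream app.example.com {
    server backend1.example.com;
    server backend2.example.com;
}

server {
    listen 80;

    location /test/ {
        proxy_pass http://app.example.com/;
    }
}
```

On each backend (assumed here to be nginx as well):

```nginx
server {
    listen 80;
    # Accept requests addressed to the shared name.
    server_name app.example.com;
}
```

The alternative from comment:1, keeping `proxy_pass http://backend;` and adding `proxy_set_header Host app.example.com;`, should have the same effect on the Host header.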

Please use the mailing list to ask questions; Trac is the wrong place to do so. Thank you.

comment:5 by Antti Huokko, 10 years ago

Hello,

I opened the issue one last time because I really want to report a bug.

According to the HTTP specification (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html): "The Host request-header field specifies the Internet host and port number of the resource being requested, as obtained from the original URI". This means that the only valid value the Host header can have is the host name that is currently being requested. In that sense it is very wrong that Nginx by default passes the name of the upstream configuration block as the Host header, because from the backend server's point of view it is gibberish.

What I would like to see is to be able to use for example a variable that holds the name of the upstream server currently chosen to be the backend that is handling the request.

location /test/ {
  proxy_pass http://backend;
  proxy_set_header Host $current_upstream_server_name;
}

If you still won't accept this as a bug, I will respect that and won't reopen this issue again. However, I strongly think that if an HTTP proxy fails to follow the HTTP specification, it should be considered a bug. Personally, I fail to see the point of an HTTP server or an HTTP proxy that does not follow the HTTP specification.

comment:6 by Antti Huokko, 10 years ago

Resolution: invalid
Status: closed → reopened

comment:7 by Valentin V. Bartenev, 10 years ago (in reply to comment:5)

Replying to Antti Huokko <huokko@gmail.com>:

> According to HTTP specification (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html): "The Host request-header field specifies the Internet host and port number of the resource being requested, as obtained from the original URI".

But the original URI in your case is: http://backend$request_uri.

> This means that only valid value that the Host header can have is the same host name that is currently being requested.

Please also note these parts of the quoted RFC:

connection

A transport layer virtual circuit established between two programs
for the purpose of communication.

client

A program that establishes connections for the purpose of sending
requests.

server

An application program that accepts connections in order to
service requests by sending back responses. Any given program may
be capable of being both a client and a server; our use of these
terms refers only to the role being performed by the program for a
particular connection, rather than to the program's capabilities
in general. Likewise, any server may act as an origin server,
proxy, gateway, or tunnel, switching behavior based on the nature
of each request.

In the connection to the upstream, nginx acts as a client, and the Host header on this connection depends on the given configuration; that's it.

comment:8 by Maxim Dounin, 10 years ago

Resolution: invalid
Status: reopened → closed

As already explained, the name of the host is the one written in the proxy_pass directive. The upstream block is much like a set of DNS records for the given name, with some additional options.

Your problem is perfectly clear: you want nginx to send different requests to different servers in your upstream block, and hence you are trying to convince us that what nginx is doing is wrong. No, it's not wrong; it's just different from what you are trying to do. Using upstream blocks will not work for you if you want nginx to send different requests to the different servers specified: the request is created before nginx ever tries to find out which servers are currently available.

If you want to use proxy_pass with multiple backends in a single upstream block, you have to configure all your backends to recognize one name, much as you would if you used DNS with multiple A records. If you can't configure your backends this way, you can't use proxy_pass this way. Instead, you have to proxy_pass to different servers, e.g., by selecting different servers based on a request URI prefix or by using something like split_clients.
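One possible reading of the split_clients suggestion, as a hedged sketch rather than a recommended setup: the server is chosen before the request is built, and each choice maps to an upstream group named after the host its backend expects. The variable name `$backend_host`, the 50/50 split, the hashing on `$remote_addr`, and the `listen` port are assumptions made for illustration.

```nginx
# http context
# One group per backend, each named after the host name that backend
# expects, so the default "Host: $proxy_host" matches the chosen server.
upstream backend1.example.com {
    server backend1.example.com;
}

upstream backend2.example.com {
    server backend2.example.com;
}

# Pick a backend name per client by hashing the client address.
split_clients "${remote_addr}" $backend_host {
    50%     backend1.example.com;
    *       backend2.example.com;
}

server {
    listen 80;

    location /test/ {
        # The variable's value is looked up among the upstream groups
        # defined above, so no resolver directive is needed here.
        # No URI part is given, so the original request URI
        # (including the /test/ prefix) is passed unchanged.
        proxy_pass http://$backend_host;
    }
}
```

Note the trade-off: because each group contains a single server, nginx cannot fail over to the other backend when the selected one is unavailable, and the split is static rather than load-aware.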

And again, please use the mailing list to ask questions; Trac is the wrong place to do so. Thank you.
