sebb wrote:
> HTTP requests include a "Host:" header which generally specifies the
> target hostname and port (omitted if it is the default port).
>
> AIUI, in virtual hosting situations, the name in the Host header may
> be different from the URL host.
> So for example a request to:
>
> http://localhost:8080/
>
> might be sent with the header:
>
> Host: stimpy:8080
>
> in order to direct the request to the "stimpy" virtual host.
>
> Does it ever make sense for the "Host" port to be different from the
> URL port? For example:
>
> Host: stimpy:9090
>
> As far as I can tell, Tomcat validates the format of the Host header,
> but otherwise ignores the port?
> Is that correct?
>
> Does anyone know of other servers that make use of the Host port setting?


With respect (and a willingness to help), I believe that you have a wrong understanding of the basics of how this all works, and as a consequence your questions and theories above are a bit off the mark (if not necessarily off-topic for this list).


When you enter a URL in the URL box of the browser and hit <return>, what happens is as follows. Let's suppose that the URL in question is :

http://someserver.company.com:8080/something?name1=value1


1) the browser splits the URL into several parts :
- http://  (the "protocol" part)
- someserver.company.com  (the hostname part)
- 8080  (the port, which is optional; if not specified, it is "80" for HTTP)
- the rest : /something?name1=value1
  (which is itself composed of different parts, but that is not important here)
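To make the split in step 1 concrete, here is a minimal sketch in Python using the standard urllib.parse module. The helper name split_url and the DEFAULT_PORTS table are mine, for illustration only; a real browser does more than this.

```python
from urllib.parse import urlsplit

# Default ports assumed when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}

def split_url(url):
    """Return (protocol, hostname, port, rest) for an http/https URL."""
    parts = urlsplit(url)
    # parts.port is None when the URL carries no explicit port.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    # "The rest" is the path plus the optional query string.
    rest = parts.path or "/"
    if parts.query:
        rest += "?" + parts.query
    return parts.scheme, parts.hostname, port, rest

print(split_url("http://someserver.company.com:8080/something?name1=value1"))
# prints ('http', 'someserver.company.com', 8080, '/something?name1=value1')
```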

2) the browser takes the hostname part "someserver.company.com", and asks its local operating system to translate this to an IP address. The part of the OS which does this is called the "resolver", and it makes use of the DNS system in order to make this name-to-address translation.
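As a rough illustration of step 2, a program can ask the OS resolver for the addresses behind a name through socket.getaddrinfo. The function name resolve is my own, and I use "localhost" only because it resolves on practically every machine; the result depends on the local configuration.

```python
import socket

def resolve(hostname, port):
    """Ask the OS resolver for the IP addresses behind a hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})

# Typically shows 127.0.0.1 and/or ::1, depending on the machine.
print(resolve("localhost", 8080))
```

Note that only the hostname goes to the resolver. The port plays no role in the name lookup; it is used afterwards, when the connection is opened.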

3) when the browser has the IP address of the target server, it creates a TCP connection *to that IP address*, and to the explicit or implicit port. The result of this is that there is now a direct connection between the browser and that server, over TCP/IP. And on that server, this connection is handled by whichever process was listening on that port. In general, for HTTP, this will be a webserver process (like Apache httpd or Tomcat).

So what the browser writes to that connection, is read by the webserver, and vice-versa what the webserver writes to that connection, is received and read by the browser.

4) on this connection, the browser now writes an HTTP request. That request is composed of several lines, of which there are at least the following 3 lines :
GET /something?name1=value1 HTTP/1.1
Host: someserver.company.com:8080

(the empty line indicates the end of the request headers, which for a "GET" request is also the end of the request)
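A minimal sketch of how such a request could be assembled, assuming (as the original question states) that the port is left out of the Host: header when it is the default one. The function name build_get_request is hypothetical.

```python
def build_get_request(hostname, port, target, default_port=80):
    """Build the raw bytes of a minimal HTTP/1.1 GET request."""
    # The port is only written into Host: when it is not the default.
    host_value = hostname if port == default_port else f"{hostname}:{port}"
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {host_value}",
        "",  # the empty line that ends the headers
        "",  # makes the join below terminate the empty line with CRLF
    ]
    # HTTP lines are terminated by CRLF.
    return "\r\n".join(lines).encode("ascii")

print(build_get_request("someserver.company.com", 8080, "/something?name1=value1"))
```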

Now we look at the webserver, which receives this request over the connection.

1) The server reads the request lines, and first looks at the "Host:" line.
That tells it which "virtual server" (or "virtual host") the browser is trying to reach. In this case the browser, through that Host: header line, indicated that it is a virtual host named "someserver.company.com".

2) The webserver then checks in its configuration, if it really has a virtual host named that way. For Tomcat, that would mean that in its "server.xml" file, there exists a tag like :
<Host name="someserver.company.com" ...>.

2a) if there is no such virtual host, then the server will use its "default host" to process this request. For Tomcat, that is the Host (also identified by a <Host ..> tag) whose name is given in the "defaultHost" attribute of the <Engine> tag, like :
<Engine name="Catalina" defaultHost="localhost">

2b) if there is a <Host ..> tag where the "name" attribute matches the request "Host:" header, then the server will pick that Host to answer this request.
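The selection logic of steps 2, 2a and 2b can be sketched as follows. This is only an illustration of the idea, not Tomcat's actual code: the function name is mine, the port is simply dropped, names are compared case-insensitively, and IPv6 literals in the Host: header are not handled.

```python
def select_virtual_host(host_header, configured_hosts, default_host):
    """Pick the virtual host for a request from its Host: header value."""
    # Drop an optional ":port" suffix; only the name is used for matching.
    name = host_header.strip().split(":", 1)[0].lower()
    # 2b) a configured host matches; 2a) otherwise fall back to the default.
    return name if name in configured_hosts else default_host

hosts = {"someserver.company.com", "localhost"}
print(select_virtual_host("someserver.company.com:8080", hosts, "localhost"))
print(select_virtual_host("stimpy:9090", hosts, "localhost"))
```

With this configuration, the second call falls back to "localhost", because no host named "stimpy" is configured.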

3) Now the server looks at the first line of the request again, to see what the browser wants inside that selected Host (here it is thus "/something?name1=value1"). In this case, it would be a "web application" (or "webapp", or "context"), with the name "/something". The webserver now passes the whole browser request (first line, other header lines, and possibly a request body) to the application "/something" within this Host. In the case of Tomcat, if there is no such application, it will pass the request to the "default application" (also named "ROOT").
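Step 3 can be sketched in the same spirit. Again this is a simplified illustration and not Tomcat's real mapper: I assume that the longest matching context path wins and that anything else goes to ROOT. The function name and the example context paths are hypothetical.

```python
def select_context(request_target, context_paths):
    """Pick the webapp (context) for a request target such as '/something?x=1'."""
    # Only the path counts for the choice; the query string is ignored.
    path = request_target.split("?", 1)[0]
    best = ""  # the empty string stands for the default (ROOT) application
    for ctx in context_paths:
        # A context matches the exact path or anything below it.
        matches = path == ctx or path.startswith(ctx + "/")
        if matches and len(ctx) > len(best):
            best = ctx
    return best or "ROOT"

print(select_context("/something?name1=value1", ["/something", "/other"]))
print(select_context("/missing/page", ["/something", "/other"]))
```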

4) that application creates a response, and writes it back into the TCP connection.

and now finally the browser reads that response from the connection, and displays it to the user in the browser window.


Now the above is really a drastic summary, and in reality there is a lot more that happens between each of the steps above. I have also taken some liberties with the language. But the above is fundamentally true, for all webservers, not only for Tomcat. And that is because all browsers and all webservers, with respect to what is exchanged over the connection, follow the HTTP protocol, and that is a general Internet standard defined independently of any browser and any webserver, and valid for all of them (which is the point of a standard of course).

So, to get back to your questions above :


> HTTP requests include a "Host:" header which generally specifies the
> target hostname and port (omitted if it is the default port).

According to HTTP RFC 2616, the Host: header MUST be present in every HTTP/1.1 request, and MUST specify the target hostname.

> AIUI, in virtual hosting situations, the name in the Host header may
> be different from the URL host.
> So for example a request to:
>
> http://localhost:8080/
>
> might be sent with the header:
>
> Host: stimpy:8080
>
> in order to direct the request to the "stimpy" virtual host.

If you have understood the above explanation, you will now know that this was wrong.

In order to even connect to the machine that runs the webserver of interest, the browser will first need to use a hostname (in the URL) that can be resolved by the OS to a valid IP address. And when it has that, it will make a connection to that IP address (and to the port indicated in the URL, or by default to the port 80). And then, in the request itself that it sends onto that connection, it will /repeat/ that same hostname in the Host: header.

(RFC 2616 says that a port can be present in the Host: header; but it does not mention what the server should do with it. And I can't think of what it could do with it either, since by the time the server reads this header, the connection is already established with the webserver anyway.)
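For anyone who wants to try this out, here is a small self-contained experiment. It starts Python's built-in http.server on a free local port and sends it a hand-written request whose Host: header names a different host and port. It only shows how that one simple server behaves; it says nothing about Tomcat or any other server. The handler class and helper names are mine.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_hosts = []  # Host: header values as received by the server

class EchoHostHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        seen_hosts.append(self.headers.get("Host"))
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

def request_with_host(port, host_header):
    """Connect to 127.0.0.1:port and send a GET with an arbitrary Host: value."""
    request = f"GET / HTTP/1.1\r\nHost: {host_header}\r\nConnection: close\r\n\r\n"
    with socket.create_connection(("127.0.0.1", port), timeout=5) as conn:
        conn.sendall(request.encode("ascii"))
        response = b""
        while chunk := conn.recv(4096):
            response += chunk
    # Return only the status line of the response.
    return response.split(b"\r\n", 1)[0].decode("ascii")

# Port 0 lets the OS pick any free port for the test server.
server = HTTPServer(("127.0.0.1", 0), EchoHostHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
status = request_with_host(server.server_address[1], "stimpy:9090")
server.shutdown()
server.server_close()
print(status, seen_hosts)
```

The connection is made to whatever port the server really listens on. The server still reads "stimpy:9090" from the header and answers the request, because by then the connection already exists.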

> Does it ever make sense for the "Host" port to be different from the
> URL port? For example:
>
> Host: stimpy:9090
>

As far as I know, no.  And there is also no standard browser which would do such a thing.

> As far as I can tell, Tomcat validates the format of the Host header,
> but otherwise ignores the port?
> Is that correct?
>
Kind of. It will probably ignore the port, because it is irrelevant.

> Does anyone know of other servers that make use of the Host port setting?
>
As far as I know, none.




