On 17 May 2011 23:35, André Warnier <a...@ice-sa.com> wrote:
> sebb wrote:
>>
>> HTTP requests include a "Host:" header which generally specifies the
>> target hostname and port (omitted if it is the default port).
>
>> AIUI, in virtual hosting situations, the name in the Host header may
>> be different from the URL host.
>> So for example a request to:
>>
>> http://localhost:8080/
>>
>> might be sent with the header:
>>
>> Host: stimpy:8080
>>
>> in order to direct the request to the "stimpy" virtual host.
>>
>> Does it ever make sense for the "Host" port to be different from the
>> URL port? For example:
>>
>> Host: stimpy:9090
>>
>> As far as I can tell, Tomcat validates the format of the Host header,
>> but otherwise ignores the port?
>> Is that correct?
>>
>> Does anyone know of other servers that make use of the Host port setting?
>>
>
> With respect (and a willingness to help), I believe that you have a wrong
> understanding of the basics of how this all works, and as a consequence your
> questions and theories above are a bit off the mark (if not necessarily
> off-topic for this list).
>
>
> When you enter a URL in the URL box of the browser and hit <return>, what
> happens is as follows.  Let's suppose that the URL in question is :
>
> http://someserver.company.com:8080/something?name1=value1
>
>
> 1) the browser splits the URL into several parts :
> - http://  (the "protocol" part)
> - someserver.company.com  (the hostname part)
> - (optionally) 8080 : (the port, which if not specified is "80" for HTTP)
> - the rest : /something?name1=value1
>  (which is itself composed of different parts, but that is not important
> here)
>
> 2) the browser takes the hostname part "someserver.company.com", and asks
> its local operating system to translate this to an IP address.  The part of
> the OS which does this is called the "resolver", and it makes use of the DNS
> system in order to make this name-to-address translation.
>
> 3) when the browser has the IP address of the target server, it creates a
> TCP connection *to that IP address*, and to the explicit or implicit port.
> The result of this is that there is now a direct connection between the
> browser and that server, over TCP/IP.
> And on that server, this connection is handled by whichever process was
> listening on that port.  In general, for HTTP, this will be a webserver
> process (like Apache httpd or Tomcat).
>
> So what the browser writes to that connection, is read by the webserver, and
> vice-versa what the webserver writes to that connection, is received and
> read by the browser.
>
> 4) on this connection, the browser now writes a HTTP request.  That request
> is composed of several lines, of which there are at least the following 3
> lines :
> GET /something?name1=value1 HTTP/1.1
> Host: someserver.company.com
>
> (the empty line indicates the end of the request headers, which for a "GET"
> request is also the end of the request)
>
> Now we look at the webserver, which receives this request over the
> connection.
>
> 1) The server reads the request lines, and first looks at the "Host:" line.
> That tells it which "virtual server" (or "virtual host") the browser is
> trying to reach. In this case the browser, through that Host: header line,
> indicated that it is a virtual host named "someserver.company.com".
>
> 2) The webserver then checks in its configuration, if it really has a
> virtual host named that way.  For Tomcat, that would mean that in its
> "server.xml" file, there exists a tag like :
> <Host name="someserver.company.com" ...>.
>
> 2a) if there is no such virtual host, then the server will use its "default
> host" to process this request.  For Tomcat, that is the Host (also
> identified by a <Host ..> tag) whose name is given in the <Engine> tag, like :
> <Engine name="Catalina" defaultHost="localhost">
>
> 2b) if there is a <Host ..> tag where the "name" attribute matches the
> request "Host:" header, then the server will pick that Host to answer this
> request.
>
> 3) Now the server looks at the first line of the request again, to see what
> the browser wants inside that selected Host (here it is thus
> "/something?name1=value1").  In this case, it would be a "web application"
> (or "webapp", or "context"), with the name "/something".
> The webserver now passes the whole browser request (first line, other header
> lines, and maybe also a content), to the application "/something" within
> this Host.
> (and in the case of Tomcat, if there is no such application, it will pass
> the request to the "default application" (also named "ROOT")).
>
> 4) that application creates a response, and writes it back into the TCP
> connection.
>
> and now finally the browser reads that response from the connection, and
> displays it to the user in the browser window.

Yes, I understand all of that (I have been working on JMeter for some
while now ...).
But thanks for the recap.
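
For reference, the client-side steps in the recap (split the URL, work out
where to connect, write the request lines) can be sketched roughly as below.
This is only an illustration in Python; build_get_request and DEFAULT_PORTS
are made-up names, not part of any HTTP stack. Unlike the simplified recap,
the sketch adds the port to the Host header when it is not the default one.

```python
from urllib.parse import urlsplit

# Assumed default ports per scheme (hypothetical helper table).
DEFAULT_PORTS = {"http": 80, "https": 443}


def build_get_request(url):
    """Return ((host, port) to connect to, request text) for a GET of url."""
    parts = urlsplit(url)
    host = parts.hostname
    default_port = DEFAULT_PORTS[parts.scheme]
    port = parts.port or default_port
    # Request target: path plus query string, "/" if the path is empty.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # The Host header repeats the URL host; the port is left out when default.
    host_header = host if port == default_port else f"{host}:{port}"
    request = f"GET {target} HTTP/1.1\r\nHost: {host_header}\r\n\r\n"
    return (host, port), request
```

The (host, port) pair is what gets resolved and connected to; the request
text is what is then written to that connection.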

>
> Now the above is really a dramatic summary, and in reality there is a lot
> more that happens between each of the steps above.  I have also taken some
> liberties with the language.  But the above is fundamentally true, for all
> webservers, not only for Tomcat.
> And that is because all browsers and all webservers, with respect to what is
> exchanged over the connection, follow the HTTP protocol, and that is a
> general Internet standard defined independently of any browser and any
> webserver, and valid for all of them (which is the point of a standard of
> course).

> So, to get back to your questions above :

>
>> HTTP requests include a "Host:" header which generally specifies the
>> target hostname and port (omitted if it is the default port).
>
> According to HTTP RFC 2616, the Host: header MUST be present in every
> HTTP/1.1 request, and MUST specify the target hostname.
>
>> AIUI, in virtual hosting situations, the name in the Host header may
>> be different from the URL host.
>> So for example a request to:
>>
>> http://localhost:8080/
>>
>> might be sent with the header:
>>
>> Host: stimpy:8080
>>
>> in order to direct the request to the "stimpy" virtual host.
>
> If you have understood the above explanation, you will now know that this
> was wrong.

If "stimpy" resolves to "localhost", what is wrong with that?

However, it was a misleading example; I should have written:

http://123.456.123.456:8080/

Host: stimpy:8080

> In order to even connect to the machine that runs the webserver of interest,
> the browser will first need to use a hostname (in the URL) that can be
> resolved by the OS to a valid IP address.  And when it has that, it will
> make a connection to that IP address (and to the port indicated in the URL,
> or by default to the port 80).

Yes, of course.

> And then, in the request itself that it sends onto that connection, it will
> /repeat/ that same hostname in the Host: header.

By default, yes, that's what browsers (and HTTP stacks) do.
But some stacks allow the Host header to be overridden, e.g. to select
a virtual host which is not in DNS.
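
As an illustration of such an override, here is a minimal sketch using
Python's standard http.client. It connects to one address but sends a
different name and port in the Host header. The EchoHostHandler test server,
get_with_host and demo are hypothetical names for this example only; the
server simply echoes back whatever Host value it received.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHostHandler(BaseHTTPRequestHandler):
    """Test server that replies with the Host header it received."""

    def do_GET(self):
        body = self.headers.get("Host", "").encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def get_with_host(address, port, host_header):
    """Connect to address:port but send host_header in the Host: line."""
    conn = http.client.HTTPConnection(address, port, timeout=5)
    try:
        # Supplying our own Host header stops http.client from adding one.
        conn.request("GET", "/", headers={"Host": host_header})
        return conn.getresponse().read().decode("ascii")
    finally:
        conn.close()


def demo():
    """Start the echo server on a free local port and query it once."""
    server = HTTPServer(("127.0.0.1", 0), EchoHostHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return get_with_host("127.0.0.1", server.server_port, "stimpy:9090")
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the Host value as the server saw it, which shows that
the header travels independently of the address and port actually used for
the TCP connection.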

> (RFC 2616 says that a port can be present in the Host: header; but it does
> not mention what the server should do with it.  And I can't think of what it
> could do with it either, since by the time the server reads this header, the
> connection is already established with the webserver anyway.)

That's exactly my question.

If one did not know about virtual hosts, one could ask the same
question about the hostname part.

>> Does it ever make sense for the "Host" port to be different from the
>> URL port? For example:
>>
>> Host: stimpy:9090
>>
>
> As far as I know, no.  And there is also no standard browser which would do
> such a thing.
>
>> As far as I can tell, Tomcat validates the format of the Host header,
>> but otherwise ignores the port?
>> Is that correct?
>>
> Kind of. It will probably ignore the port, because it is irrelevant.
>
>> Does anyone know of other servers that make use of the Host port setting?
>>
> As far as I know, none.

Ok, that's what I was asking.
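
To summarise the behaviour discussed above (pick the virtual host by name,
fall back to the default host, ignore the port), here is a rough sketch.
It is an assumption-laden illustration, not Tomcat's actual code;
select_virtual_host and its parameters are invented for the example.

```python
def select_virtual_host(host_header, configured_hosts, default_host):
    """Pick a virtual host name from a Host header value, ignoring the port.

    Simplified: bracketed IPv6 literals such as "[::1]:8080" are not handled.
    """
    value = host_header.strip().lower()
    # Drop an optional ":port" suffix; only the name is used for matching.
    name = value.split(":", 1)[0]
    # Unknown names fall back to the default host.
    return name if name in configured_hosts else default_host
```

With configured hosts {"localhost", "stimpy"} and default host "localhost",
both "stimpy:8080" and "stimpy:9090" select "stimpy", and an unknown name
selects "localhost".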

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
