On 2022-11-01 11:38, Grant Taylor wrote:
On 10/30/22 6:59 AM, squ...@treenet.co.nz wrote:
Duane W. would be the best one to ask about the details.

What I know is that some 10-12 years ago I discovered a message by Duane mentioning that W3C had (given or accepted) port 3128 for Squid use. I've checked the squid-cache archives and am not seeing the message.

Right now it looks like the W3C changed their systems and only track the standards documents. So I cannot reference their (outdated?) protocol registry :-{ . I also checked the squid-cache archives and am not finding it in the email history. Sorry.

Did you by chance mean IANA?

No I meant W3C. Back in the before times things were a bit messy.


I looked and 3128 is registered to something other than Squid.


Indeed, thus we cannot register it with IETF/IANA now. The IANA http-alt port would probably be best if we did go official.


Nor did their search bring anything up for Squid.

I mean "authority" as used by HTTP specification, which refers to https://www.rfc-editor.org/rfc/rfc3986#section-3.2

Yes exactly. That is the source of the problem, perpetuated by the need to retain on-wire byte/octet backward compatibility until HTTP/2 changed to binary format.

Consider what the proxy has to do when (not if) the IP:port being connected to is that proxy's (eg localhost:80) and the URL is only a path ("/") on an origin server somewhere else. Does the "GET / HTTP/1.0" mean "http://example.com/" or "http://example.net/"?

I would hope that it would return an error page, much like Squid does when it can't resolve a domain name or the connection times out.

You see my point I hope. A gateway proxy that returns an error to *every* request is not very good.



The key point is that the proxy host:port and the origin host:port are two different authorities and only the origin may be passed along in the URL (or URL+Host header).

Agreed.

When the client uses port 80 and 443 thinking they are origin services it is *required* (per https://www.rfc-editor.org/rfc/rfc9112.html#name-origin-form) to omit the real origin's info. Enter problems.
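
To make the ambiguity concrete, here is a minimal Python sketch of how a receiver might classify the request-target forms described in RFC 9112 and work out which origin authority, if any, the request identifies. The function names `classify_request_target` and `origin_authority` are made up for illustration; this is not how Squid implements it.

```python
from urllib.parse import urlsplit


def classify_request_target(method, target):
    """Name the RFC 9112 request-target form (simplified, illustrative only)."""
    if target == "*":
        return "asterisk-form"
    if method.upper() == "CONNECT":
        return "authority-form"
    if target.startswith("/"):
        return "origin-form"
    if "://" in target:
        return "absolute-form"
    return "invalid"


def origin_authority(request_line, host_header=None):
    """Return the origin authority the request identifies, or None if unknown."""
    method, target, _version = request_line.split(" ", 2)
    form = classify_request_target(method, target)
    if form == "absolute-form":
        # A client that knows it is talking to a proxy sends the full URL.
        return urlsplit(target).netloc
    if form == "authority-form":
        # CONNECT carries host:port directly.
        return target
    if form == "origin-form":
        # Only a path was sent. The Host header (if any) is all that is left;
        # an HTTP/1.0 client may send none, so the origin can be unknowable.
        return host_header
    return None


print(origin_authority("GET / HTTP/1.0"))                   # None
print(origin_authority("GET http://example.com/ HTTP/1.1")) # example.com
```

With origin-form and no Host header, the sketch has nothing to return: the only authority known at that point is the proxy's own listening address.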

Why would a client (worth its disk space) ever conflate the value of its configured proxy as the origin server?


There is no "configured proxy" for this use-case.


I can see a potential for confusion when using (network) transparent / intercepting proxies.


Those are the two most/extremely common instances of the problematic use-cases. All implicit uses of a proxy (or gateway) have the same issue.



The defaults though are tuned for origin server (or reverse-proxy) direct contact.

I don't see how that precludes their use for (forward) proxy servers.


I think you are getting stuck with the subtle difference between "use for case X" and "use by default".

ANY port number can be used for *some* use-case(s). "by default" has to work for *all* use-cases.


No Browser I know supports "http-alt://proxy.example.com?http://origin.example.net/index.html" URLs.

But I bet that many browsers would support:

   http://proxy.example.com:8080/?http://origin.example.net/index.html


Note that you are now having to add a non-default port "8080" and path "/" to the URL to make it valid/accepted by the Browser.

Clients speaking HTTP origin-form (the http:// scheme) are not permitted to request tunnels or equivalent gateway services. They can only ask for resource representations.


Also, I'm talking about "http://" and "https://" using their default ports of 80 & 443.


Port is just a number, it can be anything *IF* it is made explicit.
The scheme determines what protocol syntax is being spoken and thus what restrictions and/or requirements are.
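
As a rough illustration of "explicit port versus scheme default", the following Python sketch resolves the endpoint a URL points at. The `DEFAULT_PORTS` table and the `effective_endpoint` helper are assumptions made up for this example; only `urllib.parse.urlsplit` is a real library call.

```python
from urllib.parse import urlsplit

# Scheme defaults discussed in this thread; any other scheme has no entry here.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_endpoint(url):
    """Return (scheme, host, port), using the scheme default when no port is explicit."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port


# No explicit port: the scheme decides.
print(effective_endpoint("http://origin.example.net/index.html"))
# Explicit port: the number in the URL wins, whatever the scheme default is.
print(effective_endpoint(
    "http://proxy.example.com:8080/?http://origin.example.net/index.html"))
```

In the second URL the origin URL ends up in the query string of a request addressed to proxy.example.com:8080, so the proxy address has to be spelled out in full rather than left to any default.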

... and so the protocol for talking to a webcache service is http-alt://, whose default port is neither 80 nor 443, for all the same reasons that Squid's default listening port is 3128.

If we wanted to, we could easily switch Squid's default port to http-alt/8080 without causing technical issues. But it would be annoying to update all the existing documentation around the Internet, so it is not worth the effort of changing now.



It is based on experience. Squid used to be a lot more lenient and tried for decades to do the syntax auto-detection. The path from that to separate ports is littered with CVEs. Most notably the curse that keeps on giving: CVE-2009-0801, which is just the trigger issue for a whole nest of bad side effects.

I wonder how much of that problematic history was related to HTTP/0.9 vs HTTP/1.0 vs HTTP/1.1 clients.

Ditto. Though the legacy install base has a long long long tail. 26 years after HTTP/1.0 came out and HTTP/0.9 still has use-cases alive.


I similarly wonder how much HTTP/1.0, or even HTTP/0.9, protocol is used these days.

Decreasing, but still a potentially significant amount of traffic seen by Squid in general.


Also, there is the elephant in the room: we're talking about a proxy server which is frequently, but not always, on a dedicated system or IP. As such, I have no problem predicating the use of the HTTP(80) and HTTPS(443) ports when there is no possible chance of confusion between forward proxy roles and origin server / reverse proxy roles.


Ah, if you have been treating it like an irrelevant elephant that is your confusion. The "but not always" is a critical detail in the puzzle - its side-effects are the answer to your initial question of *why* Squid defaults to X instead of 80/443.

Amos
_______________________________________________
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users
