Hey,
I was convinced that there was an option to disable the host forgery
test, which would make more sense if you run BIND and intercept
all DNS traffic into it.
About your idea for an upstream cache:
It's a pretty nice idea. I am pretty sure that the host forgery test can
be disabled when you are using an upstream cache_peer.
If it is not in the code yet, it should be reported as a bug (some can
argue it is a wanted feature).
The idea by itself is not crazier than what I have done at:
http://wiki.squid-cache.org/ConfigExamples/DynamicContent/Coordinator
The idea of pre-fetching is old, and I had intended to write an ICAP
service that would do something like that but with the full original
request headers.
The issue with pre-fetching a file is that you will be required to
download the file at least twice, and the first request might not be
saved into the cache as it should.
If you plan to implement pre-fetching consider using some ICAP service
that will know about the full request headers to mimic the exact same
request.
If you are interested in the ICAP idea, take a look at the ICAP
service I wrote (in Golang) at:
https://github.com/elico/squidblocker-icap-server
You can see that in the filterByUrl function the req.Request object
content can be dumped and re-used for the pre-fetch.
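
To make the "mimic the exact same request" point concrete, here is a minimal
sketch in Python (not code from squidblocker-icap-server). It assumes the
client's method, URL and headers have already been captured somewhere, for
example by an ICAP service. The function name build_prefetch_request and the
sample values are hypothetical. It copies the end-to-end headers into a new
request and drops the hop-by-hop ones, which only describe the original
connection.

```python
import urllib.request

# Hop-by-hop headers describe a single connection and should not be replayed.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}


def build_prefetch_request(method, url, headers):
    """Build a replay request that carries the client's end-to-end headers.

    `headers` is a plain dict of the captured request headers (an assumption
    about how the capture is stored).
    """
    # Headers named inside "Connection:" are hop-by-hop as well.
    listed = set()
    for name, value in headers.items():
        if name.lower() == "connection":
            listed = {tok.strip().lower() for tok in value.split(",") if tok.strip()}

    replay = urllib.request.Request(url, method=method)
    for name, value in headers.items():
        lowered = name.lower()
        if lowered in HOP_BY_HOP or lowered in listed:
            continue
        replay.add_header(name, value)
    return replay
```

The returned object can be handed to urllib.request.urlopen() through an
opener that points at the proxy, so the pre-fetch goes through the cache with
the same headers the client sent.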
Eliezer
On 30/10/2015 05:09, Jester Purtteman wrote:
We've got a couple thoughts going at once here, so let me condense it a bit.
First, yes, this is coming in over a satellite and that is part of the bugger.
Nothing like 560 ms to bring a connection to a halt. Part of my plan is
exactly as you say, optimize the links by setting huge tcp_windows and all the
rest so that I can get full bandwidth. The other part of the story (and I
could just be misunderstanding this too) is that it appears that if I have say,
3 or 4 clients connect for a file over the course of the period of the
download, if any one of them (or maybe just the last one, again, insufficient
testing so I don't know the exact course of events here) ends up requesting
an IP different than what is looked up, it appeared to drop the file.
>I think a worse problem is if the DNS TTL is shorter than a client connection's
>TCP connected time.
>Then requests arriving after the DNS TTL expired would no longer match the
>initial dst-IP.
That is what I think I was seeing: if by that you mean, clients A, B, and C
all request a large file (few hundred MB), it downloads but takes more than 300
seconds (which has become a pretty common TTL, when did that happen?), and then
D requests it too, but the DNS updates while it's coming in and it suddenly gets
flagged as a host forgery and is no longer cacheable. I could be wrong, so I
need to experiment, but I think that’s what I am seeing.
My crazy solution is, I have a server on a fast connection on which I set up a
cache with a pretty big minimum and maximum file size (say, 10 MB minimum
object size, 8,000MB maximum) and set it up as a parent cache to the cache out
at the slow end of the universe, which is a transparent proxy. The transparent
proxy then uses the parent proxy to request the files, and when the files
happen to be very big, I set up the connection to do a pre-cache (because a 100
MB file is a piece of cake for a 100 Mbps connection) and it stores it, because
the time to download was trivial compared to the DNS TTL. I set the cache up
on the slow end to cache more aggressively, but the point is that once the
cache down south has the file, the cache up north is requesting the file from a
system much more optimized to pull big files over, and that improves the odds
that the DNS has not updated before the transfer completes.
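
A back-of-the-envelope sketch of that reasoning, in Python. It ignores
latency, TCP ramp-up and protocol overhead, so the numbers are best-case. The
100 MB file, the 100 Mbps link and the 300 second TTL come from the text
above; the function names and the 4 Mbps figure for the slow link are
assumptions for illustration only.

```python
def transfer_seconds(size_mb, link_mbps):
    """Ideal transfer time: megabytes * 8 bits per byte / megabits per second."""
    return size_mb * 8 / link_mbps


def outlives_ttl(size_mb, link_mbps, ttl_seconds=300):
    """True if the download is still running when the DNS TTL expires."""
    return transfer_seconds(size_mb, link_mbps) > ttl_seconds


# 100 MB over the fast 100 Mbps link: 8 seconds, far below a 300 s TTL.
fast = transfer_seconds(100, 100)

# 300 MB over an assumed 4 Mbps slow link: 600 seconds, twice the TTL.
slow = transfer_seconds(300, 4)
```

On those assumptions the fast-side cache finishes well inside one TTL, while
the slow-side transfer spans at least one DNS refresh.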
I'm not convinced my idea is valid, so I'll have to ponder it a bit, but I'm
going to give it a shot and let you know if it makes a difference. Bottom line
is, it is a pretty nasty workaround, and there is probably a better solution
if someone out there who knows C worth beans is into it. I don't think there
are ANY answers that don't involve setting up your own DNS, but after
configuring BIND in about 7 minutes last night, I am thinking that’s not a big
issue. The obvious answers I can think of are (1) to maintain a short table of
IPs associated with a specific domain request until all transfers referring
back to it have passed and rewrite the DNS resolution calls to refer to that
table or (2) tag the requested IP and resolved IP.
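
Option (1) could be sketched roughly as follows, in Python rather than C.
This is only an illustration of the bookkeeping, not how Squid resolves names
internally. The class name PinnedResolver and its methods are hypothetical,
and the resolver is passed in as a plain callable so the sketch stays
self-contained.

```python
class PinnedResolver:
    """Keep one IP per host for as long as any transfer still refers to it."""

    def __init__(self, resolve):
        self._resolve = resolve   # callable: host -> IP string
        self._pins = {}           # host -> [pinned_ip, active_transfers]

    def acquire(self, host):
        """Return the pinned IP for host, resolving only on the first use."""
        entry = self._pins.get(host)
        if entry is None:
            entry = [self._resolve(host), 0]
            self._pins[host] = entry
        entry[1] += 1
        return entry[0]

    def release(self, host):
        """Mark one transfer as finished; drop the pin when none remain."""
        entry = self._pins[host]
        entry[1] -= 1
        if entry[1] == 0:
            del self._pins[host]
```

Each transfer calls acquire() when it starts and release() when it ends, so a
DNS change in the middle of a long download is only seen by requests that
arrive after the last active transfer for that host has finished.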
The last line of C I wrote was in the 90s, but I'll dig in and see if I can
find the right place to start making a mess:).
In any event, you and Eliezer have helped me get farther since Tuesday night
than I had since August. Thank you both!