On 20/10/2013 5:55 PM, Raphael Geissert wrote:
> Stephen Gran wrote:
>> That's mostly because we're not actually 'using' them now - we're just
>> allowing them to cache.  Most CDNs have a decache mechanism of some sort
>> or other that we could use on mirror pulses, or we could tune the cache
>> headers to actually make it possible for CDNs to do the right thing, etc.
> I'm aware of those methods to expire objects and I can tell you that they 
> are already in use for cloudfront.d.n. Even with them I still need to re-
> enable cloudfront.d.n from http.debian.net as from time to time it fails a 
> consistency check and gets banned.
>
> Looking at report.txt right now it seems that some got banned again.

It's been some time since I have delved into these deeply, but I'd
love to continue tuning down the cache expiry headers on
cloudfront.debian.net as needed; for those not aware, I covered this in
detail in my presentation at DebConf. Here's a summary of the
'introduced' Cache-Control max-age headers that are currently in place:

Default: 24 hours (CloudFront default on objects that have no headers)
/debian/dists/*: 15 minutes (default for all files, overridden by
    subsequent rules)
/debian/dists/(unstable|sid)/.*: 5 minutes
/debian/dists/.*\.diff/[\d-]+\.gz: 2 hours - these are datestamped
    filenames and don't change once uploaded
/debian/dists/.*\.diff/(Index)?: 10 seconds
/debian/dists/.*/(Contents-.*\.(bz|gz)|(In)?Release(\.gpg)?)?: 20 seconds
/debian/dists/(unstable|sid)/(Contents-.*\.(bz|gz)|(In)?Release(\.gpg)?)?:
    10 seconds
/debian/dists/.*/i18n/(Index|Translation-.*)?: 10 seconds
/debian/dists/.*/(binary-.*|source)/(Packages(\..*)?|Sources(\..*)?|Release)?:
    10 seconds
/debian/project/.*: 10 seconds
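
To make the "overridden by subsequent rules" behaviour concrete, here is
a minimal Python sketch using a subset of the rules above. It is not the
Apache configuration itself. It assumes that patterns must match the
whole path, that the last matching rule wins, and that the glob
/debian/dists/* is equivalent to the regex /debian/dists/.* ; the names
RULES and max_age_for are made up for illustration.

```python
import re

# CloudFront's default when the origin sends no caching header.
DEFAULT_MAX_AGE = 24 * 3600

# (pattern, max-age in seconds), in the order they are listed above.
# This is only a subset of the rules, kept short for illustration.
RULES = [
    (r"/debian/dists/.*", 15 * 60),
    (r"/debian/dists/(unstable|sid)/.*", 5 * 60),
    (r"/debian/dists/.*/(Contents-.*\.(bz|gz)|(In)?Release(\.gpg)?)?", 20),
    (r"/debian/dists/(unstable|sid)/(Contents-.*\.(bz|gz)|(In)?Release(\.gpg)?)?", 10),
    (r"/debian/project/.*", 10),
]


def max_age_for(path):
    """Return the max-age for a path; the last matching rule wins (assumed)."""
    max_age = DEFAULT_MAX_AGE
    for pattern, seconds in RULES:
        # Assumption: the pattern has to cover the whole path.
        if re.fullmatch(pattern, path):
            max_age = seconds
    return max_age


if __name__ == "__main__":
    for p in ("/debian/dists/stable/Release",
              "/debian/dists/sid/InRelease",
              "/debian/pool/main/a/apt/apt.dsc"):
        print(p, max_age_for(p))
```

Under these assumptions a Release file in a stable suite gets 20
seconds, the same file in sid gets 10 seconds, and anything outside
/debian/dists/ and /debian/project/ falls back to the 24 hour default.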



These rules are all running on a tiny little Apache instance, because
the upstream mirrors do not send any Cache-Control headers. I am hoping
that once we are happy with these cache times, we can migrate the rules
to an upstream HTTP server (this currently uses ftp.debian.org) and
do away with the 'interstitial' server that is squirting these headers
into the response.
 
0 seconds would work but would probably overload things - that's no
caching at all, so every edge hit goes back to the origin. And 1 second
means that for objects in heavy use, we'll have 43 hits/sec (one per
CloudFront location), so something a little longer than that... hence
the 10/20 second lines above.
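
As a rough back-of-the-envelope check of those numbers, here is a small
Python sketch. It assumes the worst case for a hot object: each of the
43 CloudFront locations refetches it from the origin once every max-age
seconds, independently of the others. The function name is made up for
illustration.

```python
def origin_hits_per_second(edge_locations, max_age_seconds):
    """Worst-case origin request rate for one heavily used object.

    Assumes every edge location refetches the object exactly once per
    max-age interval, with no sharing between locations.
    """
    if max_age_seconds <= 0:
        raise ValueError("max-age of 0 means no caching: every edge hit reaches the origin")
    return edge_locations / max_age_seconds


if __name__ == "__main__":
    for ttl in (1, 10, 20):
        rate = origin_hits_per_second(43, ttl)
        print(f"max-age {ttl:>2}s: about {rate:.2f} hits/sec")
```

With 10 or 20 seconds the worst case drops to a handful of origin
requests per second per hot object, rather than 43.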


CloudFront currently has 43 locations worldwide and is continuing to
expand. That's far fewer than the 400 mirrors in the Debian list, so I
would not recommend abandoning the mirrors.

  James
(It's just mod_proxy and mod_headers in Apache)
(PS: I am travelling this week and possibly slower at answering email -
ping jameseb_AT_amazon.com if you need me urgently)
(PPS: Happy to give any DD/DM read access into the AWS account - just
send me a signed email - and read/write (i.e., for DSA) if you want).

-- 
/Mobile:/ +61 422 166 708, /Email:/ james_AT_rcpt.to
