Re: generating hash from packet content

2014-09-02 Thread Alan M. Carroll
Tuesday, September 2, 2014, 3:04:55 AM, you wrote: > How does ATS manage the expired objects? If an object has expired or been removed, I > assume allocated space for it can later be used by other objects. ATS doesn't have allocated space in the cache. It's a circular buffer. All objects, including exp

RE: generating hash from packet content

2014-09-02 Thread Rasim Saltuk Alakuş
: Re: generating hash from packet content Luca, Monday, September 1, 2014, 1:49:13 PM, you wrote: > You can also choose to store it in cache and delete dupe in a second moment. You could, but once you've written to cache, you have advanced the write cursor. Deleting the duplicate late

RE: generating hash from packet content

2014-09-02 Thread Luca Rea
Ok, what about using a second ATS (parent of the first one) with a small storage as a buffering system?

Re: generating hash from packet content

2014-09-01 Thread Alan M. Carroll
Luca, Monday, September 1, 2014, 1:49:13 PM, you wrote: > You can also choose to store it in cache and delete dupe in a second moment. You could, but once you've written to cache, you have advanced the write cursor. Deleting the duplicate later has no effect except changing a directory entry.
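
A toy model may make the write-cursor point concrete. This is only a sketch under assumptions, not ATS code: the class name CyclicCache, the byte sizes, and the eviction rule are all hypothetical. It shows that deleting a duplicate after it has been written removes a directory entry but leaves the cursor where it was, so an older object is still overwritten on the next lap.

```python
class CyclicCache:
    """Toy ring-buffer cache (hypothetical model, not the ATS implementation)."""

    def __init__(self, size):
        self.size = size        # capacity of the ring in bytes
        self.written = 0        # total bytes ever written; never decreases
        self.directory = {}     # key -> (absolute start position, length)

    @property
    def cursor(self):
        # Physical write offset inside the ring.
        return self.written % self.size

    def write(self, key, length):
        if length > self.size:
            raise ValueError("object larger than the cache")
        self.directory[key] = (self.written, length)
        self.written += length
        # Entries whose bytes have been lapped by the cursor are lost.
        floor = self.written - self.size
        self.directory = {
            k: (start, n) for k, (start, n) in self.directory.items() if start >= floor
        }

    def delete(self, key):
        # Only the directory entry goes away; the cursor does not move back.
        self.directory.pop(key, None)


cache = CyclicCache(100)
cache.write("a", 40)
cache.write("dup", 40)      # duplicate content written before we know it is a dupe
cache.delete("dup")         # later cleanup
cursor_after_delete = cache.cursor
cache.write("b", 40)        # wraps around and overwrites the start of "a"
```

In this model the 40 bytes taken by "dup" are never given back, which is why buffering before the write, as discussed elsewhere in the thread, is the only way to avoid the cost.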

RE: generating hash from packet content

2014-09-01 Thread Luca Rea
You can also choose to store it in cache and delete dupe in a second moment.

Re: generating hash from packet content

2014-09-01 Thread Leif Hedstrom
I assume the idea would be to not write to the cache until you have decided it is not a dupe? Which would imply, buffering it somewhere which does not move the write header forward. -- Leif > On Sep 1, 2014, at 9:55 AM, Luca Rea wrote: > > Can it help to keep only the useful cache and limit

RE: generating hash from packet content

2014-09-01 Thread Luca Rea
Can it help to keep only the useful cache and limit the recycle process?

Re: generating hash from packet content

2014-09-01 Thread Alan M. Carroll
Rasim, Monday, September 1, 2014, 10:22:06 AM, you wrote: > Looks like it is not feasible/possible to remove URL hash map solution > completely.. However, storage optimization was another topic in our mind, > which content hash can save the day. We are thinking this can be a nice > feature if

RE: generating hash from packet content

2014-09-01 Thread Rasim Saltuk Alakuş
decide to implement please let us know. regards Saltuk From: Rasim Saltuk Alakuş Sent: Wednesday, August 27, 2014 7:17 PM To: dev@trafficserver.apache.org; us...@trafficserver.apache.org Subject: generating hash from packet content Hi All, ATS uses

RE: generating hash from packet content

2014-08-29 Thread Luca Rea
What about a post-optimization of the cache? I mean... 1. when ATS receives a large amount of data, it stores the URLs with a rounded timestamp and the flag "checked:true/false" in an RDBMS (e.g. PostgreSQL) with a unique constraint on the URL and timestamp fields 2. a batch process periodically gets URLs (

Re: generating hash from packet content

2014-08-29 Thread Yongming Zhao
I agree that Leif pointed out the problem here. We may call this a de-duplication solution, but by the time we have saved the content fetched from the origin, it is already wasting your disk storage: you will only get the same hash after all the data has arrived from the origin, and the disk already was

Re: generating hash from packet content

2014-08-28 Thread Leif Hedstrom
On Aug 28, 2014, at 12:19 PM, Bill Zeng wrote: > > > > On Thu, Aug 28, 2014 at 10:41 AM, Leif Hedstrom wrote: > > On Aug 28, 2014, at 11:35 AM, Bill Zeng wrote: > > > Just to throw another idea your way. We can insert another level of > > indirection between URL's and objects. Every obje

RE: generating hash from packet content

2014-08-28 Thread Luca Rea
Mmmm... could something like the following help?
Client(Request=Normal URL) -> ATS(Lua) -> NoSQL (PUT: key=hash,value=url object) -> Origin
Client <- ATS(Cache) <- Origin
Client(Request=HASH Cache) -> ATS(Lua) -> NoSQL (GET: url object) -> ATS(Cache)
Client <- ATS(Cache)
You can use an i

Re: generating hash from packet content

2014-08-28 Thread Alan M. Carroll
Well, it would definitely be possible to store an indirection object to implement Bill's idea. The URL is used to do a lookup and the object that is returned is a forwarding header, which then causes another lookup. Basically it's a form of remap for the cache, using the cache itself to store th
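
The forwarding idea can be sketched with a single store that holds two kinds of entries. This is an illustration only: Forward, Stored, lookup, the key strings, and the hop limit are hypothetical names and choices, not an ATS API.

```python
from dataclasses import dataclass


@dataclass
class Forward:
    """Cache entry that points at another cache key (hypothetical)."""
    target: str


@dataclass
class Stored:
    """Cache entry that holds the actual object body (hypothetical)."""
    body: bytes


def lookup(cache, key, max_hops=1):
    """Look up key; if the entry is a forwarding header, do a second lookup."""
    entry = cache.get(key)
    hops = 0
    while isinstance(entry, Forward):
        if hops >= max_hops:
            return None         # refuse to follow chains or loops
        entry = cache.get(entry.target)
        hops += 1
    return entry.body if isinstance(entry, Stored) else None


cache = {
    "http://a.example/video": Forward("content:abc"),
    "http://b.example/video": Forward("content:abc"),
    "content:abc": Stored(b"payload"),
}
hit = lookup(cache, "http://b.example/video")
```

The hop limit is a design choice of the sketch: it keeps a forwarding entry that points at itself from turning one lookup into an endless one.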

Re: generating hash from packet content

2014-08-28 Thread Bill Zeng
On Thu, Aug 28, 2014 at 10:41 AM, Leif Hedstrom wrote: > > On Aug 28, 2014, at 11:35 AM, Bill Zeng wrote: > > > Just to throw another idea your way. We can insert another level of > indirection between URL's and objects. Every object has a unique hash. > URL's point to the hashes instead of obje

Re: generating hash from packet content

2014-08-28 Thread Leif Hedstrom
On Aug 28, 2014, at 11:35 AM, Bill Zeng wrote: > Just to throw another idea your way. We can insert another level of > indirection between URL's and objects. Every object has a unique hash. URL's > point to the hashes instead of objects. The hashes are used to look up > objects. Even if multi

Re: generating hash from packet content

2014-08-28 Thread Bill Zeng
Just to throw another idea your way. We can insert another level of indirection between URL's and objects. Every object has a unique hash. URL's point to the hashes instead of objects. The hashes are used to look up objects. Even if multiple URL's are duplicated and hence their hashes, they always
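
A minimal sketch of the two-level lookup described here, with assumptions stated up front: DedupIndex and its methods are hypothetical names, SHA-256 is an arbitrary choice of hash, and the body is assumed to be fully in memory when it is stored.

```python
import hashlib


class DedupIndex:
    """URL -> content hash -> object (hypothetical sketch, not ATS code)."""

    def __init__(self):
        self.url_to_hash = {}   # first level: URL points at a content hash
        self.objects = {}       # second level: content hash points at the object

    def store(self, url, body):
        digest = hashlib.sha256(body).hexdigest()
        self.objects.setdefault(digest, body)   # keep one copy per distinct body
        self.url_to_hash[url] = digest
        return digest

    def lookup(self, url):
        digest = self.url_to_hash.get(url)
        return None if digest is None else self.objects.get(digest)


index = DedupIndex()
index.store("http://cdn1.example/movie?token=1", b"same bytes")
index.store("http://cdn2.example/movie?token=2", b"same bytes")
```

Both URLs resolve to the same stored copy. The sketch says nothing about where the body sits while its hash is being computed, which is the cost raised in the replies.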

Re: generating hash from packet content

2014-08-28 Thread Niki Gorchilov
Hi, Rasim, AFAICT the metalink plugin has code to calculate a checksum of the object contents. Still, I don't understand how this is going to resolve the problem you're trying to address. In order to have the hash, you need to download the whole object from the origin server, thus you learn if you have it
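
The point about needing the whole object can be seen in how an incremental hash works. The sketch below is not the metalink plugin's code; digest_stream is a hypothetical helper and SHA-256 is an assumed algorithm. The digest is only available after the last chunk has been consumed.

```python
import hashlib


def digest_stream(chunks):
    """Hash a body that arrives in chunks; return (hex digest, total bytes)."""
    hasher = hashlib.sha256()
    total = 0
    for chunk in chunks:
        hasher.update(chunk)    # state changes with every chunk received
        total += len(chunk)
    # Only here, after the final chunk, is the content hash known.
    return hasher.hexdigest(), total


digest, size = digest_stream([b"first part, ", b"second part"])
```

By the time the function returns, every byte has already crossed the network from the origin, so the hash can save cache space but not origin bandwidth.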

Re: generating hash from packet content

2014-08-27 Thread Susan Hinrichs
On 8/27/2014 3:22 PM, Leif Hedstrom wrote: On Aug 27, 2014, at 1:51 PM, Nick Kew wrote: On Wed, 27 Aug 2014 16:17:17 + Rasim Saltuk Alakuş wrote: Hi All, ATS uses URL hash for cache storage. And CacheUrl plugin adds some more flexibility in URL hashing strategy. We think of creating

Re: generating hash from packet content

2014-08-27 Thread Bill Zeng
Just as a side question, do we have statistics on the extent of duplication we have on ATS cache? Say, how many URL's point to the same object on average? It seems like a trade-off between duplication and computation (space and time). On Wed, Aug 27, 2014 at 1:22 PM, Leif Hedstrom wrote: > On

Re: generating hash from packet content

2014-08-27 Thread Leif Hedstrom
On Aug 27, 2014, at 1:51 PM, Nick Kew wrote: > On Wed, 27 Aug 2014 16:17:17 + > Rasim Saltuk Alakuş wrote: > >> >> Hi All, >> >> ATS uses URL hash for cache storage. And CacheUrl plugin adds some more >> flexibility in URL hashing strategy. >> >> We think of creating hash based on packe

Re: generating hash from packet content

2014-08-27 Thread Nick Kew
On Wed, 27 Aug 2014 16:17:17 + Rasim Saltuk Alakuş wrote: > > Hi All, > > ATS uses URL hash for cache storage. And CacheUrl plugin adds some more > flexibility in URL hashing strategy. > > We think of creating hash based on packet content and use it as the hash > while storing and retrie

Re: generating hash from packet content

2014-08-27 Thread Susan Hinrichs
I've been thinking about it recently. But it hasn't come to the top of my priority queue yet. Some of the content providers are making the mapping of fixed asset ID to URL more and more obscure, so ultimately a hash-based solution becomes necessary. You pay some on startup costs (having to f

Re: generating hash from packet content

2014-08-27 Thread Leif Hedstrom
On Aug 27, 2014, at 9:42 AM, Rasim Saltuk Alakuş wrote: > Hi All, > > ATS uses URL hash for cache storage. And CacheUrl plugin adds some more > flexibility in URL hashing strategy. > > We think of creating hash based on packet content and use it as the hash > while storing and retrieving fro

generating hash from packet content

2014-08-27 Thread Rasim Saltuk Alakuş
Hi All, ATS uses a URL hash for cache storage, and the CacheUrl plugin adds some more flexibility in the URL hashing strategy. We are thinking of creating a hash based on packet content and using it as the hash while storing and retrieving from cache. This looks like a better solution, so that URI changes won't hurt ca
