On Thu, Jan 10, 2013 at 4:18 PM, Benoît Canet wrote:
> Now I understand. This case covers overwriting existing data with new
> contents. That is common :).
>
> But are you seeing a cluster with refcount > 1 being overwritten
> often? If so, it's worth looking into why that happens. It may be a
> common pattern for certain file systems or applica
On Wed, Jan 9, 2013 at 5:40 PM, Benoît Canet wrote:
>> > I.5) cluster removal
>> > When an L2 entry to a cluster becomes stale, the qcow2 code decrements the
>> > refcount.
>> > When the refcount reaches zero, the L2 hash block of the stale cluster
>> > is written to clear the hash.
>> > This happens ofte
On Wed, Jan 9, 2013 at 5:32 PM, Eric Blake wrote:
> On 01/09/2013 09:16 AM, Stefan Hajnoczi wrote:
>
>>> I.6) max refcount reached
>>> The L2 hash block of the cluster is written in order to remember at the next
>>> startup that it must not be used anymore for deduplication. The hash is dropped
> > Two GTrees are used to give access to the hashes: one indexed by hash and
> > the other indexed by physical offset.
>
> What is the GTree indexed by physical offset used for?
I think I can get rid of the second GTree for RAM-based deduplication.
It needs to:
-Start qcow2 with the deduplicati
On Wed, Jan 9, 2013 at 4:24 PM, Benoît Canet wrote:
> Here is a mail to open a discussion on QCOW2 deduplication design and
> performance.
>
> The current deduplication strategy is RAM-based.
> One of the goals of the project is to plan and implement an alternative way to
> do the lookups from di
>
> What is the GTree indexed by physical offset used for?
It's used for two things: deletion and loading of the hashes.
-Deletion is a hook in the refcount code that triggers when zero is reached.
The only information the code has is the physical offset of the cluster that is
about to be discarded. The hash m
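
The role of the offset-indexed tree in the deletion path described above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the QEMU code: ordinary dicts stand in for the two GTrees, SHA-256 is an assumed hash function (the thread does not name one here), and the names `DedupIndex`, `write_cluster`, and `release_cluster` are hypothetical.

```python
import hashlib


class DedupIndex:
    """Toy in-memory index; plain dicts stand in for the two GTrees."""

    def __init__(self):
        self.by_hash = {}    # cluster hash -> physical offset
        self.by_offset = {}  # physical offset -> cluster hash
        self.refcount = {}   # physical offset -> reference count

    def write_cluster(self, data, new_offset):
        """Return the offset that holds the data, reusing a duplicate if any."""
        digest = hashlib.sha256(data).digest()  # assumed hash function
        offset = self.by_hash.get(digest)
        if offset is None:
            # First time this content is seen: register it in both indexes.
            offset = new_offset
            self.by_hash[digest] = offset
            self.by_offset[offset] = digest
            self.refcount[offset] = 0
        self.refcount[offset] += 1
        return offset

    def release_cluster(self, offset):
        """Refcount hook: only the physical offset is known at this point."""
        self.refcount[offset] -= 1
        if self.refcount[offset] == 0:
            # The offset index gives the hash, which allows removing the
            # entry from the hash index without scanning it.
            digest = self.by_offset.pop(offset)
            del self.by_hash[digest]
            del self.refcount[offset]


index = DedupIndex()
first = index.write_cluster(b"A" * 4096, 0x10000)
second = index.write_cluster(b"A" * 4096, 0x20000)  # same content, deduplicated
index.release_cluster(first)
index.release_cluster(second)  # refcount reaches zero, hash is forgotten
```

Without the offset-keyed index, the hook would have to walk the whole hash index to find the entry that points at the discarded offset, which is the cost the second tree avoids in this sketch.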
On 01/09/2013 09:16 AM, Stefan Hajnoczi wrote:
>> I.6) max refcount reached
>> The L2 hash block of the cluster is written in order to remember at the next
>> startup that it must not be used anymore for deduplication. The hash is dropped
>> from the gtrees.
>
> Interesting case. This means