[Qemu-devel] QCOW2 deduplication key value store

2013-03-27 Thread Benoît Canet
Hello, I am starting this thread so we can discuss the choice of a good key/value store for QCOW2 deduplication. One of the main goals is to keep the ratio between the number of clusters written and the number of dedup metadata I/Os high. I initially thought about taking the first two stages
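The goal stated in this preview, keeping the ratio of clusters written to dedup metadata I/Os high, can be illustrated with a toy write-back cache that batches dirty metadata blocks. Everything below (class name, entries-per-block constant) is an illustrative assumption, not the on-disk format under discussion:

```python
# Toy model of dedup metadata write batching; names and sizes are
# illustrative assumptions, not the qcow2 on-disk format.
HASHES_PER_BLOCK = 128  # hash entries sharing one metadata block


class MetadataCache:
    def __init__(self):
        self.dirty_blocks = set()  # metadata blocks with pending updates
        self.metadata_ios = 0      # metadata writes actually issued
        self.clusters_written = 0

    def record_cluster_write(self, hash_index):
        # Writing a cluster dirties (at most) one metadata block.
        self.clusters_written += 1
        self.dirty_blocks.add(hash_index // HASHES_PER_BLOCK)

    def flush(self):
        # One I/O per dirty block, however many hashes it accumulated.
        self.metadata_ios += len(self.dirty_blocks)
        self.dirty_blocks.clear()

    def ratio(self):
        # Clusters written per metadata I/O: the figure to keep high.
        return self.clusters_written / max(self.metadata_ios, 1)
```

With sequential hash indexes, 256 cluster writes dirty only two metadata blocks, so one flush costs two I/Os and the ratio stays at 128.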

[Qemu-devel] Qcow2 Deduplication

2013-03-18 Thread Gaurab Basu
Hi, This is with reference to the deduplication patch for qcow2 images (http://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02811.html). I applied the patch and the code compiled without any error. I converted a raw image to a qcow2 image using the usual qemu-img convert command. Then I created

Re: [Qemu-devel] QCOW2 deduplication

2013-02-28 Thread Kevin Wolf
On 28.02.2013 at 10:59, Stefan Hajnoczi wrote: > On Wed, Feb 27, 2013 at 05:40:53PM +0100, Kevin Wolf wrote: > > On 27.02.2013 at 16:58, Benoît Canet wrote: > > > > > The current prototype of the QCOW2 deduplication uses 32-byte SHA256 > > > > > or SKEIN > > > > > hashes to iden

Re: [Qemu-devel] QCOW2 deduplication

2013-02-28 Thread Stefan Hajnoczi
On Wed, Feb 27, 2013 at 05:40:53PM +0100, Kevin Wolf wrote: > On 27.02.2013 at 16:58, Benoît Canet wrote: > > > > The current prototype of the QCOW2 deduplication uses 32-byte SHA256 > > > > or SKEIN > > > > hashes to identify each 4KB cluster with a very low probability of > > > > col

Re: [Qemu-devel] QCOW2 deduplication

2013-02-27 Thread Kevin Wolf
On 27.02.2013 at 16:58, Benoît Canet wrote: > > > The current prototype of the QCOW2 deduplication uses 32-byte SHA256 or > > > SKEIN > > > hashes to identify each 4KB cluster with a very low probability of > > > collisions. > > > > How do you handle the rare collision cases? Do you r

Re: [Qemu-devel] QCOW2 deduplication

2013-02-27 Thread Benoît Canet
> > The current prototype of the QCOW2 deduplication uses 32-byte SHA256 or > > SKEIN > > hashes to identify each 4KB cluster with a very low probability of > > collisions. > > How do you handle the rare collision cases? Do you read the original > cluster and compare the exact contents when th
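The collision question raised in this exchange (read the candidate cluster back and compare contents on a hash hit) can be sketched as follows; the function and container names are hypothetical, not from the patch series:

```python
# Hypothetical sketch of a collision-safe dedup lookup; "clusters" stands
# in for reading a cluster off disk, "hash_to_offset" for the dedup index.
import hashlib

CLUSTER_SIZE = 4096


def dedup_lookup(clusters, hash_to_offset, data):
    """Return the offset of an identical stored cluster, or None.

    On a hash match the candidate cluster is re-read and compared
    byte-for-byte, so a (vanishingly rare) SHA-256 collision can never
    silently alias two different clusters.
    """
    digest = hashlib.sha256(data).digest()
    offset = hash_to_offset.get(digest)
    if offset is None:
        return None                # unknown hash: plain miss
    if clusters[offset] == data:
        return offset              # verified duplicate
    return None                    # collision: treat as a miss, write new data
```

The byte comparison costs one extra read per hash hit, which is the trade-off the thread is weighing against trusting the 256-bit hash outright.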

Re: [Qemu-devel] QCOW2 deduplication

2013-02-27 Thread Kevin Wolf
On 26.02.2013 at 18:14, Benoît Canet wrote: > > Hello Kevin, > > As you are the best person to discuss QCOW2 implementation issues with, I am > writing > this mail so you know what has been done on deduplication and what I am > planning to do next. > > In short, I need your feedback bef

[Qemu-devel] QCOW2 deduplication

2013-02-26 Thread Benoît Canet
Hello Kevin, As you are the best person to discuss QCOW2 implementation issues with, I am writing this mail so you know what has been done on deduplication and what I am planning to do next. In short, I need your feedback before going into another code sprint and being in need of another code rev

Re: [Qemu-devel] QCOW2 deduplication design

2013-01-10 Thread Stefan Hajnoczi
On Thu, Jan 10, 2013 at 4:18 PM, Benoît Canet wrote: >> Now I understand. This case covers overwriting existing data with new >> contents. That is common :). >> >> But are you seeing a cluster with refcount > 1 being overwritten >> often? If so, it's worth looking into why that happens. It may

Re: [Qemu-devel] QCOW2 deduplication design

2013-01-10 Thread Benoît Canet
> Now I understand. This case covers overwriting existing data with new > contents. That is common :). > > But are you seeing a cluster with refcount > 1 being overwritten > often? If so, it's worth looking into why that happens. It may be a > common pattern for certain file systems or applica
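A minimal sketch of the case discussed here, overwriting a cluster whose refcount is greater than 1: the write cannot go in place, so sharing is broken copy-on-write style. All names and the dict-based image layout are illustrative assumptions:

```python
# Illustrative model of breaking sharing on overwrite; "image" is a toy
# stand-in for a qcow2 file (cluster list, refcounts, one L2 mapping).
def allocate_cluster(image, data):
    # Hypothetical allocator: append a new physical cluster, refcount 1.
    offset = len(image["clusters"])
    image["clusters"].append(data)
    image["refcount"][offset] = 1
    return offset


def write_cluster(image, guest_index, data):
    """Overwrite the cluster mapped at guest_index, breaking sharing
    (copy-on-write) when its refcount is greater than 1."""
    old = image["l2"].get(guest_index)
    if old is not None and image["refcount"][old] == 1:
        image["clusters"][old] = data      # sole owner: write in place
        return old
    if old is not None:
        image["refcount"][old] -= 1        # shared: drop only our reference
    new = allocate_cluster(image, data)    # fresh cluster for the new data
    image["l2"][guest_index] = new
    return new
```

This is why a deduplicated (refcount > 1) cluster being overwritten is the expensive pattern the thread asks about: every such write forces a new allocation.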

Re: [Qemu-devel] QCOW2 deduplication design

2013-01-10 Thread Stefan Hajnoczi
On Wed, Jan 9, 2013 at 5:40 PM, Benoît Canet wrote: >> > I.5) cluster removal >> > When an L2 entry to a cluster becomes stale the qcow2 code decrements the >> > refcount. >> > When the refcount reaches zero the L2 hash block of the stale cluster >> > is written to clear the hash. >> > This happens ofte

Re: [Qemu-devel] QCOW2 deduplication design

2013-01-09 Thread Stefan Hajnoczi
On Wed, Jan 9, 2013 at 5:32 PM, Eric Blake wrote: > On 01/09/2013 09:16 AM, Stefan Hajnoczi wrote: > >>> I.6) max refcount reached >>> The L2 hash block of the cluster is written in order to remember at next >>> startup >>> that it must not be used anymore for deduplication. The hash is dropped

Re: [Qemu-devel] QCOW2 deduplication design

2013-01-09 Thread Benoît Canet
> > Two GTrees are used to give access to the hashes: one indexed by hash and > > another indexed by physical offset. > > What is the GTree indexed by physical offset used for? I think I can get rid of the second GTree for RAM-based deduplication. It needs to: -Start qcow2 with the deduplicati

Re: [Qemu-devel] QCOW2 deduplication design

2013-01-09 Thread Stefan Hajnoczi
On Wed, Jan 9, 2013 at 4:24 PM, Benoît Canet wrote: > Here is a mail to open a discussion on QCOW2 deduplication design and > performance. > > The current deduplication strategy is RAM-based. > One of the goals of the project is to plan and implement an alternative way to > do > the lookups from di

Re: [Qemu-devel] QCOW2 deduplication design

2013-01-09 Thread Benoît Canet
> > What is the GTree indexed by physical offset used for? It's used for two things: deletion and loading of the hashes. -Deletion is a hook in the refcount code that triggers when zero is reached. The only information the code has is the physical offset of the yet-to-be-discarded cluster. The hash m
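The deletion path described above can be sketched with two plain dicts standing in for the two GTrees; the offset-keyed index exists precisely because the refcount hook only knows the physical offset of the cluster being discarded. Names are illustrative:

```python
# Toy stand-in for the two GTrees: two mirrored indexes over the same
# hash entries, one per lookup direction.
class DedupIndex:
    def __init__(self):
        self.by_hash = {}    # digest -> physical offset (dedup lookup path)
        self.by_offset = {}  # physical offset -> digest (deletion path)

    def insert(self, digest, offset):
        self.by_hash[digest] = offset
        self.by_offset[offset] = digest

    def drop_by_offset(self, offset):
        # Called from the refcount hook when a cluster's refcount reaches
        # zero: only the offset is known, so we need the reverse index
        # to find which hash entry to remove.
        digest = self.by_offset.pop(offset, None)
        if digest is not None:
            self.by_hash.pop(digest, None)
```

Without the reverse index, the hook would have to scan every hash entry to find the one pointing at the freed cluster.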

Re: [Qemu-devel] QCOW2 deduplication design

2013-01-09 Thread Eric Blake
On 01/09/2013 09:16 AM, Stefan Hajnoczi wrote: >> I.6) max refcount reached >> The L2 hash block of the cluster is written in order to remember at next >> startup >> that it must not be used anymore for deduplication. The hash is dropped from >> the >> gtrees. > > Interesting case. This means
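The "max refcount reached" case quoted here can be sketched as a guard in the dedup lookup: once a cluster's refcount saturates, its hash is dropped so no further writes deduplicate against it. The 0xFFFF limit reflects qcow2's default 16-bit refcount width; the function name and dict layout are hypothetical:

```python
# Illustrative sketch of the I.6 case: a saturated cluster stops being
# a dedup target and its hash entry is removed from both indexes.
MAX_REFCOUNT = 0xFFFF  # qcow2's default refcount width is 16 bits


def try_dedup(refcount, by_hash, by_offset, digest):
    """Return the offset of a deduplicable cluster, or None if the hash
    is unknown or its cluster can take no more references."""
    offset = by_hash.get(digest)
    if offset is None:
        return None
    if refcount[offset] >= MAX_REFCOUNT:
        # Saturated: drop the hash so this cluster is never used for
        # deduplication again (mirroring the on-disk hash clearing
        # that makes the decision survive a restart).
        del by_hash[digest]
        del by_offset[offset]
        return None
    refcount[offset] += 1
    return offset
```

A later write of the same contents then simply allocates a second physical cluster and starts a fresh hash entry for it.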

[Qemu-devel] QCOW2 deduplication design

2013-01-09 Thread Benoît Canet
Hello, Here is a mail to open a discussion on QCOW2 deduplication design and performance. The current deduplication strategy is RAM-based. One of the goals of the project is to plan and implement an alternative way to do the lookups from disk for bigger images. I will in a first section enumerate
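The RAM-based strategy this opening mail describes boils down to one write path: hash the 4 KB cluster, probe an in-memory table, and either take another reference to an existing cluster or allocate a new one. A hedged sketch under those assumptions, not the actual patch code:

```python
# Minimal model of the RAM-based dedup write path; "store" stands in for
# the image's physical clusters, "by_hash" for the in-memory hash table.
import hashlib

CLUSTER_SIZE = 4096


def dedup_write(store, refcount, by_hash, data):
    """Write one 4 KB cluster; return (physical offset, was_deduplicated)."""
    assert len(data) == CLUSTER_SIZE
    digest = hashlib.sha256(data).digest()
    offset = by_hash.get(digest)
    if offset is not None:
        refcount[offset] += 1      # duplicate: just take a reference
        return offset, True
    offset = len(store)            # miss: allocate and index the hash
    store.append(data)
    refcount[offset] = 1
    by_hash[digest] = offset
    return offset, False
```

The disk-based alternative the mail asks about would replace the `by_hash` dict probe with an on-disk key/value lookup, which is where the metadata I/O cost discussed later in the thread comes from.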