On 17.07.2013 at 00:45, Benoît Canet wrote:
> > Simple is good. Even for deduplication alone, I think data integrity is
> > critical - otherwise we risk stale dedup metadata pointing to clusters
> > that are unallocated or do not contain the right data. So the journal
> > will probably need to follow techniques for commits/checksums.
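The commit/checksum technique mentioned in the quoted text can be sketched as follows. This is an illustrative model only, not qcow2 code: the record layout (length, CRC32, payload) is an assumption, chosen to show why a checksum lets replay stop at the first torn or partial write without trusting stale bytes.

```python
import struct
import zlib

def encode_entry(payload: bytes) -> bytes:
    # Hypothetical on-disk record: 4-byte length, 4-byte CRC32, payload.
    header = struct.pack("<II", len(payload), zlib.crc32(payload))
    return header + payload

def replay(journal: bytes):
    # Yield committed entries in order; stop at the first entry whose
    # checksum fails or whose payload is incomplete.  Everything before
    # that point was fully committed and stays valid.
    off = 0
    while off + 8 <= len(journal):
        length, crc = struct.unpack_from("<II", journal, off)
        payload = journal[off + 8 : off + 8 + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn or corrupt write: discard this entry and the rest
        yield payload
        off += 8 + length
```

On startup, dedup metadata would be rebuilt only from entries that verify, so stale metadata can never point at clusters that were not fully recorded.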
On Wed, Jul 03, 2013 at 02:53:27PM +0200, Benoît Canet wrote:
> By the way, I don't know much about journalling techniques. So I'm
> asking you these questions so that either you can answer them straight
> away or because they might warrant a look at existing journal
> implementations like:
I tried to do something simple and performant for the deduplication.
> Does this mean the journal forms the first-stage data structure for
> deduplication? Dedup records will accumulate in the journal until it
> becomes time to convert them in bulk into a more compact representation?
The journal is mainly used to persist the last inserted dedup metadata across
QEMU restarts.
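The first-stage idea from the quoted question — dedup records accumulate in the journal, then get converted in bulk into a more compact representation — can be sketched like this. It is a toy model: the field names, the threshold, and the dict-based compact table are all assumptions for illustration, not the qcow2 design.

```python
# Toy model: dedup records (content hash -> cluster) accumulate in an
# append-only journal; once the journal grows past a threshold they are
# folded in bulk into a compact table and the journal is truncated.

JOURNAL_LIMIT = 4  # deliberately tiny, for illustration

def insert(state, content_hash, cluster):
    state["journal"].append((content_hash, cluster))
    if len(state["journal"]) >= JOURNAL_LIMIT:
        compact(state)

def compact(state):
    # Bulk conversion: merge all journal records into the compact table.
    state["table"].update(dict(state["journal"]))
    state["journal"].clear()

def lookup(state, content_hash):
    # Recent journal records take precedence over the compact table.
    for h, c in reversed(state["journal"]):
        if h == content_hash:
            return c
    return state["table"].get(content_hash)
```

Replaying the journal at startup restores exactly the records that had not yet been folded into the compact representation.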
> Care to explain that in more detail? Why shouldn't it work on spinning
> disks?
Hashes are random, so they introduce random read accesses.
With a QCOW2 cluster size of 4KB, the deduplication code will do one random
read per 4KB block when writing duplicated data.
A server grade hard disk sustains only on the order of 100-200 random reads
per second, which caps deduplication write throughput at well under 1 MB/s.
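The arithmetic behind this claim is simple; the IOPS figures below are assumptions typical of the hardware classes involved, not numbers taken from the thread:

```python
# One random read per duplicated 4 KB cluster caps dedup write throughput
# at iops * cluster_size bytes/second.

cluster_size = 4 * 1024      # bytes: the 4KB qcow2 cluster size discussed
spinning_disk_iops = 150     # assumed random reads/s for a 7200 rpm drive
ssd_iops = 50_000            # assumed, for contrast

def max_dedup_throughput(iops: int) -> int:
    return iops * cluster_size  # bytes/second

print(max_dedup_throughput(spinning_disk_iops))  # 614400 -> ~0.6 MB/s
print(max_dedup_throughput(ssd_iops))            # 204800000 -> ~200 MB/s
```

At roughly 0.6 MB/s a spinning disk makes the scheme impractical, while an SSD's random-read rate keeps it usable — which is the point being made here.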
On Tue, Jul 02, 2013 at 11:23:56PM +0200, Benoît Canet wrote:
> > Any ideas how existing journals handle this?
By the way, I don't know much about journalling techniques. So I'm
asking you these questions so that either you can answer them straight
away or because they might warrant a look at existing journal
implementations.
On 02.07.2013 at 23:26, Benoît Canet wrote:
> > > 2. Byte-granularity means that read-modify-write is necessary to append
> > >    entries to the journal. Therefore a failure could destroy previously
> > >    committed entries.
> > >
> > >    Any ideas how existing journals handle this?
On 02.07.2013 at 23:23, Benoît Canet wrote:
> Also, since deduplication will not work on spinning disks, I discarded the
> seek time factor.
Care to explain that in more detail? Why shouldn't it work on spinning
disks?
Kevin
On Tue, Jul 02, 2013 at 11:23:56PM +0200, Benoît Canet wrote:
> > > +QCOW2 can use one or more instance of a metadata journal.
> >
> > s/instance/instances/
> >
> > Is there a reason to use multiple journals rather than a single journal
> > for all entry types? The single journal area avoids seeks.
On Tue, Jul 02, 2013 at 04:54:46PM +0200, Kevin Wolf wrote:
> On 02.07.2013 at 16:42, Stefan Hajnoczi wrote:
> > On Thu, Jun 20, 2013 at 04:26:09PM +0200, Benoît Canet wrote:
> > > ---
> > > docs/specs/qcow2.txt | 42 ++
> > > 1 file changed, 42 insertions(+)
> > 2. Byte-granularity means that read-modify-write is necessary to append
> >    entries to the journal. Therefore a failure could destroy previously
> >    committed entries.
> >
> >    Any ideas how existing journals handle this?
>
> You commit only whole blocks.
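The whole-block commit idea can be modelled as follows; the block size and the buffering scheme are illustrative assumptions. The point is that committed blocks are never read-modify-written, so a crash during a commit can only lose the block currently being written, never previously committed entries.

```python
BLOCK_SIZE = 512  # assumed journal block size, for illustration

class BlockJournal:
    """Append-only journal that writes in whole-block units only."""

    def __init__(self):
        self.disk = bytearray()    # stands in for the on-disk journal area
        self.buffer = bytearray()  # entries not yet committed

    def append(self, entry: bytes) -> None:
        # Entries only accumulate in memory until commit().
        self.buffer += entry

    def commit(self) -> None:
        # Pad the pending entries to a full block and append it as one
        # unit.  Old blocks are never rewritten, so byte-granularity
        # read-modify-write (and the torn-write hazard) never happens.
        pad = (-len(self.buffer)) % BLOCK_SIZE
        self.disk += self.buffer + b"\x00" * pad
        self.buffer = bytearray()
```

A torn write during `commit()` corrupts at most the new block; replay would discard it and keep everything committed before it.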
> > +QCOW2 can use one or more instance of a metadata journal.
>
> s/instance/instances/
>
> Is there a reason to use multiple journals rather than a single journal
> for all entry types? The single journal area avoids seeks.
Here are the main reasons for this:
On 02.07.2013 at 16:42, Stefan Hajnoczi wrote:
> On Thu, Jun 20, 2013 at 04:26:09PM +0200, Benoît Canet wrote:
> > ---
> > docs/specs/qcow2.txt | 42 ++
> > 1 file changed, 42 insertions(+)
> >
> > diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
On Thu, Jun 20, 2013 at 04:26:09PM +0200, Benoît Canet wrote:
> ---
> docs/specs/qcow2.txt | 42 ++
> 1 file changed, 42 insertions(+)
>
> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index 36a559d..a4ffc85 100644
> --- a/docs/specs/qcow2.txt
---
docs/specs/qcow2.txt | 42 ++
1 file changed, 42 insertions(+)
diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 36a559d..a4ffc85 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -350,3 +350,45 @@ Snapshot table entry: