I am pleased to report that a short version of the FVD-cow paper I
previously posted here was just accepted to USENIX Annual Technical
Conference (USENIX ATC'11), which is a prestigious and highly competitive
research conference in the systems field. This shows from another angle
(in additio
> > > FVD's novel use of the reference count table reduces the metadata update
> > > overhead to literally zero during normal execution of a VM. This gets
> > > the best of QCOW2's reference count table without its overhead. In
> > > FVD, the reference count table is only updated whe
> > Here is a detailed description. Relevant to the discussion of snapshot,
> > FVD uses a one-level lookup table and a refcount table. FVD’s one-level
> > lookup table is very similar to QCOW2’s two-level lookup table, except
> > that it is much smaller in FVD, and is preallocated and hence
> > FVD's novel use of the reference count table reduces the metadata update
> > overhead to literally zero during normal execution of a VM. This gets
> > the best of QCOW2's reference count table without its overhead. In
> > FVD, the reference count table is only updated when creati
> >> The file system can keep a lot of these things around pretty easily but
> >> with your proposal, it seems like there can only be one. If you support
> >> many of them, I think you'll degenerate to something as complex as a
> >> reference count table.
> > IIUC, he already uses a refcount tab
> On Mon, Mar 14, 2011 at 1:53 PM, Chunqiang Tang wrote:
> > Therefore, during normal execution of a
> > VM, images with snapshots are as fast as images without snapshot.
>
> Hang on, an image with a snapshot still needs to do copy-on-write,
> just like backing files. T
> > Your use of "current-state" is confusing me because AFAICT,
> > current-state is just semantically another snapshot.
> >
> > It's writable because it has no children. You only keep around one
> > writable snapshot and to make another snapshot writable, you have to
> > discard the former.
>
> IIUC, he already uses a refcount table. Actually, I think that a
> refcount table is a requirement to provide the interesting properties
> that internal snapshots have (see my other mail).
>
> Refcount tables aren't a very complex thing either. In fact, it makes a
> format much simpler to have o
> No, because the copy-on-write is another layer on top of the snapshot
> and AFAICT, they don't persist when moving between snapshots.
>
> The equivalent for external snapshots would be:
>
> base0 <- base1 <- base2 <- image
>
> And then if I wanted to move to base1 without destroying base2 and
> > In short, FVD's internal snapshot achieves the ideal properties of G1-G6,
> > by 1) using the reference count table to only track "static" snapshots, 2)
> > not keeping the reference count table in memory, 3) not updating the
> > on-disk "static" reference count table when the VM runs, and 4)
> It seems that there is great interest in QCOW2's
> internal snapshot feature. If we really want to do that, the right solution is
> to follow VMDK's approach of storing each snapshot as a separate COW file (see
> http://www.vmware.com/app/vmdk/?src=vmdk ), rather than using the reference
> c
> Am 01.03.2011 10:55, schrieb Stefan Hajnoczi:
> > On Mon, Feb 28, 2011 at 3:48 PM, Kevin Wolf wrote:
> >> Am 28.02.2011 16:35, schrieb Stefan Hajnoczi:
> >>> On Mon, Feb 28, 2011 at 3:12 PM, Kevin Wolf wrote:
> Am 28.02.2011 12:49, schrieb Prerna Saxena:
> > The following patchset int
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds FVD's implementation of the bdrv_aio_readv() interface. It
supports read and copy-on-read in FVD.
Signed-off-by: Chunqiang Tang
---
block/fvd-bitmap.c | 88 ++
bloc
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds the skeleton of the block device driver for
Fast Virtual Disk (FVD).
Signed-off-by: Chunqiang Tang
---
Makefile.objs |2 +-
block/fvd-create.c | 21 +++
block/fvd
bug. This makes debugging much easier.
Signed-off-by: Chunqiang Tang
---
qemu-io-auto.c | 947
qemu-io-sim.c | 127
qemu-io.c | 50 +++-
qemu-tool.c| 107 ++-
4 files changed, 1209 insertions(+), 22 deletions(-)
cre
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds the support for aio_cancel into FVD. FVD faithfully cleans up
all resources upon aio_cancel.
Signed-off-by: Chunqiang Tang
---
block/fvd-journal-buf.c | 16 +++
block
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds FVD's implementation of the bdrv_create() interface. It
supports FVD image creation.
Signed-off-by: Chunqiang Tang
---
block/fvd-create.c |
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
test-vdi.sh drives 'qemu-io --auto' to perform fully automated testing for VDI.
Signed-off-by: Chunqiang Tang
---
test-vdi.sh | 83
conditions. Bugs found by blksim under rare
race conditions are guaranteed to be precisely reproducible.
Signed-off-by: Chunqiang Tang
---
Makefile.objs |1 +
block/blksim.c | 757
block/blksim.h | 35 +++
3 files changed, 793
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds FVD's implementation of the bdrv_aio_writev() interface. It
supports copy-on-write in FVD.
Signed-off-by: Chunqiang Tang
---
block/fvd-bitmap.c | 150
bloc
initiated by the FVD driver rather than
triggered by the VM's read requests. FVD's prefetching is conservative in
that, if it detects resource contention, it will back off and temporarily
pause prefetching.
Signed-off-by: Chunqiang Tang
---
block/fvd-prefetc
'qemu-img resize' can be considered as two special cases
of update.
Signed-off-by: Chunqiang Tang
---
block_int.h |3 +
qemu-img-cmds.hx |6 +++
qemu-img.c | 125 +++---
qemu-option.c| 79 ++
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds FVD's implementation of the bdrv_probe() interface.
Signed-off-by: Chunqiang Tang
---
block/fvd-misc.c |9 -
1 files changed, 8 insertions(+), 1 deletions(-)
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch makes FVD's header file fvd.h more complete, by adding type
definition for BDRVFvdState, FvdAIOCB, etc.
Signed-off-by: Chunqiang Tang
---
block/fvd.h |
Hi Andreas, Anthony, Stefan H., and Stefan W.,
I just posted the latest series of FVD patches to the mailing list, which
address the review comments you previously made on FVD. Thank you for
the feedback. Off the mailing list, Stefan Weil provided guidance on
porting FVD to win32 and also sen
: Chunqiang Tang
---
block/fvd-store.c | 459 +
block/fvd-utils.c | 65
2 files changed, 524 insertions(+), 0 deletions(-)
diff --git a/block/fvd-store.c b/block/fvd-store.c
index 85e45d4..fe670eb 100644
--- a/block/fvd-store.c
+++ b
.
Signed-off-by: Chunqiang Tang
---
block/fvd-load.c | 448 +
block/fvd-utils.c | 40 +
2 files changed, 488 insertions(+), 0 deletions(-)
diff --git a/block/fvd-load.c b/block/fvd-load.c
index 80ab32c..88e5fb4 100644
--- a/block/fvd
: Chunqiang Tang
---
block.c |2 +-
block/fvd-bitmap.c | 57
block/fvd-journal-buf.c | 34 ++
block/fvd-journal.c | 814 ++-
block/fvd-write.c |1 +
block/fvd.c | 19 ++
6 files changed, 920 insertions
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds FVD's implementation of the bdrv_update() interface.
Signed-off-by: Chunqiang Tang
---
block/fvd-update.c | 274 +++-
1
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds FVD's implementation of the bdrv_close() interface.
Signed-off-by: Chunqiang Tang
---
block/fvd-misc.c | 78 ++
1
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
test-qcow2.sh drives 'qemu-io --auto' to perform fully automated testing for
QCOW2.
Signed-off-by: Chunqiang Tang
---
test-qcow2
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
test-fvd.sh drives 'qemu-io --auto' to perform fully automated testing for FVD.
Signed-off-by: Chunqiang Tang
---
test-fvd.sh | 161
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds FVD's implementation of the bdrv_is_allocated() interface.
Signed-off-by: Chunqiang Tang
---
block/fvd-misc.c | 67
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds FVD's implementation of the bdrv_flush() and bdrv_aio_flush()
interfaces.
Signed-off-by: Chunqiang Tang
---
block/fvd-flush.c |
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch enhances FVD's journal with the capability of buffering
multiple metadata updates and sending them to the journal in a single write.
Signed-off-by: Chunqiang Tang
---
block/fvd-jo
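
The buffering idea described in this patch can be sketched as follows. This is
only a minimal illustration under stated assumptions, not the code in the
patch: the names MetaUpdate, FakeJournal, JournalBuf, jbuf_add() and
jbuf_flush() are hypothetical, the record layout is invented, and the "journal"
is a counter rather than a real disk write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not FVD's journal code. */

#define JBUF_CAPACITY 8            /* records held before a forced flush */

typedef struct {
    uint32_t chunk;                /* which chunk's metadata changed */
    uint32_t value;                /* new metadata value for that chunk */
} MetaUpdate;

/* Stand-in for the journal device: counts writes instead of doing disk I/O. */
typedef struct {
    size_t num_writes;
    size_t bytes_written;
} FakeJournal;

typedef struct {
    MetaUpdate pending[JBUF_CAPACITY];
    size_t count;
    FakeJournal *journal;
} JournalBuf;

static void jbuf_init(JournalBuf *jb, FakeJournal *journal)
{
    jb->count = 0;
    jb->journal = journal;
    journal->num_writes = 0;
    journal->bytes_written = 0;
}

/* Send every pending record to the journal in a single write. */
static void jbuf_flush(JournalBuf *jb)
{
    if (jb->count == 0) {
        return;                    /* nothing buffered, so no write is issued */
    }
    jb->journal->num_writes += 1;
    jb->journal->bytes_written += jb->count * sizeof(MetaUpdate);
    jb->count = 0;
}

/* Buffer one update; flush first if the buffer is already full. */
static void jbuf_add(JournalBuf *jb, uint32_t chunk, uint32_t value)
{
    if (jb->count == JBUF_CAPACITY) {
        jbuf_flush(jb);
    }
    assert(jb->count < JBUF_CAPACITY);
    jb->pending[jb->count].chunk = chunk;
    jb->pending[jb->count].value = value;
    jb->count++;
}
```

With this shape, five calls to jbuf_add() followed by one jbuf_flush() cost one
journal write instead of five. A real driver would also have to flush before
acknowledging any request whose durability depends on the buffered records.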
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds some debugging utilities to FVD.
Signed-off-by: Chunqiang Tang
---
block/blksim.c |7 +-
block/fvd-debug.c | 369
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds FVD's implementation of the bdrv_has_zero_init() interface.
Signed-off-by: Chunqiang Tang
---
block/fvd-misc.c |9 -
1 files changed, 8 insertions(+), 1 dele
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds FVD's implementation of the bdrv_get_info() interface.
Signed-off-by: Chunqiang Tang
---
block/fvd-misc.c | 98 +-
1
This patch is part of the Fast Virtual Disk (FVD) proposal.
See http://wiki.qemu.org/Features/FVD.
This patch adds FVD's implementation of the bdrv_file_open() interface.
It supports opening an FVD image.
Signed-off-by: Chunqiang Tang
---
block/fvd-journal.c |6 +
block/fvd-o
> In any case, the next step is to get down to specifics. Here is the
> page with the current QCOW3 roadmap:
>
> http://wiki.qemu.org/Qcow3_Roadmap
>
> Please raise concrete requirements or features so they can be
> discussed and captured.
Now it turns into a more productive discussion, but it s
Hi Stefan,
I applied FVD's fully automated testing tool to the VDI block device
driver and found several bugs. Some bugs are easy to fix whereas others
need some thoughts on design. Therefore, I thought you might be able to
handle the bugs better than me. These bugs occur only if I/O errors or
> Am 15.02.2011 20:45, schrieb Chunqiang Tang:
> >> Chunqiang Tang/Watson/IBM wrote on 01/28/2011 05:13:27 PM:
> >> As you requested, I set up a wiki page for FVD at
> > http://wiki.qemu.org/Features/FVD
> >> . It includes a summary of FVD, a detailed specifica
> On Tue, Feb 15, 2011 at 7:45 PM, Chunqiang Tang wrote:
> >> Chunqiang Tang/Watson/IBM wrote on 01/28/2011 05:13:27 PM:
> >> As you requested, I set up a wiki page for FVD at
> > http://wiki.qemu.org/Features/FVD
> >> . It includes a summary of FVD, a d
> Chunqiang Tang/Watson/IBM wrote on 01/28/2011 05:13:27 PM:
> As you requested, I set up a wiki page for FVD at
http://wiki.qemu.org/Features/FVD
> . It includes a summary of FVD, a detailed specification of FVD, and a
> comparison of the design and performance of FVD and QED.
Hi Kevin,
Fast Virtual Disk (FVD) has an automated testing tool (see
http://wiki.qemu.org/Features/FVD/Engineering). For a long time, I knew
that QCOW2 could not pass the automated tests. Today I finally sat down to
look into those bugs. I have already submitted multiple patches for different
bugs,
> Oops, thanks for catching this. I thought this was fixed long ago, but
> apparently it wasn't.
Not me, the testing tool caught it without my supervision. :-)
> > @@ -495,8 +497,10 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
> > }
> > } else if (acb->cluster_offset & Q
und=10 --parallel=100
--io_size=1048576 --fail_prob=0.1 --cancel_prob=0 --instant_qemubh=true
Signed-off-by: Chunqiang Tang
---
block/qcow2.c |8 ++--
1 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/block/qcow2.c b/block/qcow2.c
index 8c906d1..6f6d56f 100644
--- a/blo
85760 --fail_prob=0 --cancel_prob=0 --instant_qemubh=true
Signed-off-by: Chunqiang Tang
---
block/qcow2.c |5 ++---
cutils.c | 31 +++
qemu-common.h |2 ++
3 files changed, 35 insertions(+), 3 deletions(-)
diff --git a/block/qcow2.c b/block/qcow2.c
index db
> After thinking about it more, qemu-img update does also serve a
> purpose. Sometimes it is necessary to set options on many images in
> bulk or from provisioning scripts instead of at runtime.
>
> I guess my main fear of qemu-img update is that it adds a new
> interface that only FVD exploits s
Hi Anthony,
As you requested, I set up a wiki page for FVD at
http://wiki.qemu.org/Features/FVD . It includes a summary of FVD, a
detailed specification of FVD, and a comparison of the design and
performance of FVD and QED. I copied the comparison part below for easy
reference.
=
> It should be possible to change prefetching and copy-on-read while the
> VM is running. For example, having to shut down a VM in order to
> pause the prefetching is not workable. In the QED image streaming
> tree there are monitor commands for this:
>
> http://repo.or.cz/w/qemu/stefanha.git/sh
> On Fri, Jan 21, 2011 at 05:19:13PM -0500, Chunqiang Tang wrote:
> > This patch adds the 'update' command to qemu-img. FVD stores various
> > image-specific configurable parameters in the image header. A user can use
> > 'qemu-img update' to modify
> Headers usually start with a one-line summary, "QEMU simulated block
> driver" maybe?
> > + * Copyright (c) 2010-2011 IBM
> > + *
> > + * Authors:
> > + * Chunqiang Tang
> > + *
> > + * This work is licensed under the terms of the GNU
> Before going any further with this series, I'd like to see
>
> 1) a specification (on the QEMU wiki) describing this image format that
> can be reviewed
>
> 2) a concise explanation of why qcow2/qed cannot satisfy the use cases
> addressed by FVD
>
> 3) performance data to back up the claims
> Read CODING_STYLE and go through your code.
Went through CODING_STYLE. The white space issue in FVD had already been
fixed previously. FVD’s variable and type names are fine, and line width
is fine. The only remaining issue in FVD is '}' before 'else', which will
be fixed. CODING_STYLE does n
> On 20 January 2011 17:08, Stefan Weil wrote:
> > Yes, that's a problem with some parts of the old code.
> > For files which you want to modify, you could remove
> > the spaces with your script before applying your other
> > modifications and create a separate patch which only
> > removes the sup
> I think the root of the problem is that your series didn't maintain
> bisectability.
>
> IOW, each patch needs to be able to be applied one at a time such that
> at each point, the build doesn't break and functionality doesn't break.
>
> Otherwise, tools like git bisect don't work.
This was
> Coding style.
>
> In general, I like the idea of the simulator but the coding style is off
> quite a bit.
Please be specific and I would be happy to take suggestions. The header
issue should be easy to fix.
> > -void qemu_bh_schedule(QEMUBH *bh)
> > -{
> > -bh->cb(bh->opaque);
> > -}
> > -
> > -void qemu_bh_cancel(QEMUBH *bh)
> > -{
> > -}
> > -
> > -void qemu_bh_delete(QEMUBH *bh)
> > -{
> > -qemu_free(bh);
> > -}
> > -
> > int qemu_set_fd_handler2(int fd,
> >IOC
pose is to comprehensively test a block
device driver under failures and race conditions. Bugs found by blksim under
rare race conditions are guaranteed to be precisely reproducible, with no
dependency on thread timing etc., which makes debugging much easier.
Signed-off-by: Chunqiang Tang
---
Makefile.ob
producible, with no
dependency on thread timing etc., which makes debugging much easier.
Signed-off-by: Chunqiang Tang
---
qemu-io-sim.c | 109 +
qemu-io.c | 38 +++-
2 files changed, 138 insertions(+), 9 deletions(
e-fit-all manner.
Signed-off-by: Chunqiang Tang
---
block_int.h |1 +
qemu-img-cmds.hx |6 ++
qemu-img.c | 43 +++
3 files changed, 50 insertions(+), 0 deletions(-)
diff --git a/block_int.h b/block_int.h
index 12663e8..e98872a
> Please try to split the patches into logical parts, and use descriptive
> subject lines for each patch.
> E.g. adding the new sim command to qemu-io could be one patch, adding
> the img_update (why not just update?) command to qemu-img another,
> moving code into qemu-tool-time.c one more, etc.
> when I tried to use your patch, I found several problems:
>
> * The patch does not apply cleanly to latest QEMU.
>This is caused by recent changes in QEMU git master.
>
> * The new code uses tabs instead of spaces (QEMU coding rules).
>
> * Some lines of the new code end with blank characters.
.
Signed-off-by: Chunqiang Tang
---
Makefile | 10 +---
Makefile.objs|1 +
block.c | 12 +-
block_int.h |5 ++-
configure|2 +-
qemu-img-cmds.hx |6 +
qemu-img.c | 62
> Actually current filesystems do pretty well on thinly provisioned
> storage, as long as your extent size is not too small. Starting from
> extent size in the 64M to 256M range there's almost no difference to
> non-virtualized storage.
>
> Again, sparse images with a large enough allocation size
> It's something filesystems have to deal with. Real storage is getting
> increasingly virtualized. While this didn't matter for the real high
> end storage which has been doing this for a long time it's getting more
> and more exposed to the filesystem. That includes LVM layouts and
> thinly pr
> >> Moreover, using a host file system not only adds overhead, but
> >> also introduces data integrity issues. Specifically, if I/Os use O_DSYNC,
> >> it may be too slow. If I/Os use O_DIRECT, it cannot guarantee data
> >> integrity in the event of a host crash. See
> >> http://lwn.net/Articles/
> > Doing both fault injection and verification together introduces some
> > subtlety. For example, even under the random failure mode, two disk writes
> > triggered by one VM-issued write must either fail together or succeed
> > together. Otherwise, the truth image and the test image will dive
r. Otherwise, the truth image and the test image will diverge and
verification won't succeed. Currently, qemu-test carefully works with the
'sim' driver to guarantee those conditions. Those conditions need to be
retained after the code is restructured.
Best regards,
Chunqiang Tang
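
The "fail together or succeed together" condition above can be sketched as
follows. This is a minimal illustration under stated assumptions, not the
qemu-test or 'sim' driver code: PairedImages, paired_write() and
paired_verify() are hypothetical names, the images are small in-memory
buffers, and failures come from a seeded pseudo-random generator so that a run
can be replayed exactly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; this is not the qemu-test code. */

#define IMG_SIZE 64

typedef struct {
    uint8_t truth[IMG_SIZE];       /* reference ("truth") image */
    uint8_t test[IMG_SIZE];        /* image written through the driver under test */
    uint32_t rng_state;            /* seeded state, so runs are reproducible */
    unsigned fail_percent;         /* chance (0..100) that a write is failed */
} PairedImages;

/* Small linear congruential generator: the same seed gives the same sequence. */
static uint32_t next_random(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

static void paired_init(PairedImages *p, uint32_t seed, unsigned fail_percent)
{
    assert(fail_percent <= 100);
    memset(p, 0, sizeof(*p));
    p->rng_state = seed;
    p->fail_percent = fail_percent;
}

/*
 * One VM-issued write becomes two disk writes. The failure decision is drawn
 * once per VM-issued write, so both disk writes share the same fate.
 * Returns true if both images were updated, false if neither was
 * (injected failure or out-of-range request).
 */
static bool paired_write(PairedImages *p, size_t off,
                         const uint8_t *buf, size_t len)
{
    if (off > IMG_SIZE || len > IMG_SIZE - off) {
        return false;              /* out of range: reject, touch nothing */
    }
    if (next_random(&p->rng_state) % 100 < p->fail_percent) {
        return false;              /* injected failure: neither image changes */
    }
    memcpy(p->truth + off, buf, len);
    memcpy(p->test + off, buf, len);
    return true;
}

/* Verification: the two images must never diverge. */
static bool paired_verify(const PairedImages *p)
{
    return memcmp(p->truth, p->test, IMG_SIZE) == 0;
}
```

Drawing the failure decision separately for each of the two disk writes would
let one succeed while the other fails, which is exactly the divergence that
makes verification fail for reasons unrelated to the driver under test.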
> The community block I/O test suite is qemu-iotests:
> http://git.kernel.org/?p=linux/kernel/git/hch/qemu-iotests.git;a=summary
> If you have tests that you'd like to contribute, please put them into
> that framework so other developers can run them as part of their
> regular testing.
Hi Stefan,
> Based on my limited understanding, I think FVD shares a
> lot in common with the COW format (block/cow.c).
>
> But I think most of the advantages you mention could be considered as
> additions to either qcow2 or qed. At any rate, the right way to have
> that discussion is in the form of patc
base image. QCOW2 experts please take a look at this "potential"
bug.
Best Regards,
Chunqiang Tang
Homepage: http://www.research.ibm.com/people/c/ctang