On 2014-10-17 at 16:59, Sandeep Joshi wrote:
Hi there,
Do let me know if I am asking these questions on the wrong forum. I'd
like to write a QEMU block driver which forwards IO requests to a
custom-built storage cluster.
I have seen Jeff Cody's presentation <http://bugnik.us/kvm2013> and
also browsed the source code for sheepdog, nbd and gluster in the
"block" directory and had a few questions to confirm or correct my
understanding.
1) What is the difference between bdrv_open and bdrv_file_open
function pointers in the BlockDriver ?
I'm not sure, but the main difference should be that bdrv_file_open() is
invoked for protocol block drivers, whereas bdrv_open() is invoked for
format block drivers. A couple of months ago, there was still a
top-level bdrv_file_open() function, which has since been integrated
into bdrv_open(), so we will probably want to remove the
bdrv_file_open() callback in the future as well...
But for now, use bdrv_file_open() for protocol drivers and bdrv_open()
for format drivers.
2) Is it possible to implement only a protocol driver without a format
driver (the distinction that Jeff made in his presentation above) ?
In other words, can I only set the "protocol_name" and not
"format_name" in BlockDriver ? I'd like to support all image formats
(qemu, raw, etc) without having to reimplement the logic for each.
Setting format_name does not make a block driver a format driver. A
block driver is either a protocol driver or a format driver, never both.
The distinction is probably made (again, I'd have to look it up to be
sure) by protocol drivers setting protocol_name and bdrv_file_open(),
which format drivers do not set.
So you just need to set protocol_name and bdrv_file_open() (and
format_name as well, see nbd for an example where protocol_name and
format_name differ) and qemu knows your block driver is a protocol
driver and any format drivers will work on top of it. You should not set
bdrv_open(), however.
Once again, I'm not 100 % sure, but it should work that way.
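
To make the rule of thumb above concrete, here is a minimal sketch. It
does not use QEMU's headers: ToyBlockDriver is a simplified stand-in
for the real BlockDriver struct, the callback signatures are
placeholders, and "mycluster", "toyfmt" and toy_is_protocol_driver()
are hypothetical names invented for illustration. The check mirrors
the guess stated above (protocol_name and bdrv_file_open() set), not
necessarily what qemu's code really tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for QEMU's BlockDriver, NOT the real definition.
 * Only the four fields discussed above are modelled; the callback
 * signatures are placeholders. */
typedef struct ToyBlockDriver {
    const char *format_name;
    const char *protocol_name;
    int (*bdrv_open)(void *opaque);       /* format drivers */
    int (*bdrv_file_open)(void *opaque);  /* protocol drivers */
} ToyBlockDriver;

/* Hypothetical open callback for an imaginary "mycluster" protocol. */
static int mycluster_file_open(void *opaque)
{
    (void)opaque;
    return 0;
}

/* Hypothetical open callback for an imaginary "toyfmt" image format. */
static int toyfmt_open(void *opaque)
{
    (void)opaque;
    return 0;
}

/* Protocol driver: protocol_name and bdrv_file_open are set,
 * bdrv_open is deliberately left NULL. */
static const ToyBlockDriver toy_mycluster = {
    .format_name    = "mycluster",
    .protocol_name  = "mycluster",
    .bdrv_file_open = mycluster_file_open,
};

/* Format driver: only format_name and bdrv_open are set. */
static const ToyBlockDriver toy_format = {
    .format_name = "toyfmt",
    .bdrv_open   = toyfmt_open,
};

/* Assumed classification rule, taken from the explanation above. */
static bool toy_is_protocol_driver(const ToyBlockDriver *drv)
{
    assert(drv != NULL);
    return drv->protocol_name != NULL && drv->bdrv_file_open != NULL;
}
```

With these definitions, toy_is_protocol_driver(&toy_mycluster) is true
and toy_is_protocol_driver(&toy_format) is false, even though both
drivers set format_name.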
By the way, I can very well imagine that the distinction between
protocol and format block drivers will disappear (at least in the code)
in the future. But that should not be any of your concern. :-)
3) The control flow for creating a file starts with the image format
driver and later invokes the protocol driver.
image_driver->bdrv_create()
  --> bdrv_create_file
    --> bdrv_find_protocol(filename)
    --> bdrv_create
      --> Protocol_driver->bdrv_create()
Is this the case for all functions? Does the read/write first flow
through the image format driver before getting passed down to the
protocol driver (possibly via some coroutine invoked from the block
layer or virtio-blk ) ? Can someone give me a hint as to how I can
trace the control flow ?
Well, you can always use gdb with break points and backtraces. At least
that's what I'd do.
For your first question: yes. For each guest device, or let's say each
virtual guest device (creating an image is not done through a guest
device, but the only thing missing from a guest device configuration
there is the device itself), there is a tree of BlockDriverStates. Every
request runs through this tree. It may not touch all nodes, but it
starts at the top (normally a format BDS) and proceeds downwards for as
long as the block drivers issue new requests to their children.
Or, to be more technical: A request only goes to the topmost node in the
BDS tree (the root). If need be, that node's driver forwards it
explicitly to its child (normally bs->file, if bs is a pointer to the
BlockDriverState) or children (e.g. bs->backing_hd, the backing file, or
driver-specific ones, such as the children of the quorum block driver,
which are not stored in the BlockDriverState).
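
As a toy illustration of that forwarding, here is a small sketch. It is
not QEMU code: ToyBDS, toy_read() and toy_demo_depth() are hypothetical
names, and the "forwards" flag merely stands in for a driver deciding
whether it needs to issue a request to its child.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a BlockDriverState node, NOT the real struct. */
typedef struct ToyBDS ToyBDS;
struct ToyBDS {
    const char *driver_name;
    ToyBDS *file;        /* child node, playing the role of bs->file */
    int forwards;        /* nonzero: this driver passes requests down */
    int requests_seen;   /* how many requests reached this node */
};

/* A request enters at one node. That node decides whether to forward
 * it to its child. Returns the number of nodes the request touched. */
static int toy_read(ToyBDS *bs)
{
    assert(bs != NULL);
    bs->requests_seen++;
    if (bs->forwards && bs->file != NULL) {
        return 1 + toy_read(bs->file);
    }
    return 1;
}

/* Build a two-node chain (format on top, protocol below), send one
 * request to the root and report how many nodes it touched. */
static int toy_demo_depth(int format_forwards)
{
    ToyBDS proto = { "mycluster", NULL, 0, 0 };
    ToyBDS fmt   = { "toyfmt", &proto, format_forwards, 0 };
    return toy_read(&fmt);
}
```

Here toy_demo_depth(1) returns 2, because the format node forwards the
request to its protocol child, while toy_demo_depth(0) returns 1,
because the request stops at the root. That second case is what "it may
not touch all nodes" means above.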
This doesn't apply so well to bdrv_create(), because that function does
not work on BlockDriverStates, but I hope you get the point.
Shameless self-plug: regarding this whole BDS tree thing, I can recommend
Kevin's and my presentation from this year's KVM Forum:
http://events.linuxfoundation.org/sites/events/files/slides/blockdev.pdf
Max