[adding nbd list] On 07/19/2016 12:21 AM, Fam Zheng wrote: > On Mon, 07/18 22:08, Eric Blake wrote: >> Upstream NBD protocol recently added the ability to efficiently >> write zeroes without having to send the zeroes over the wire, >> along with a flag to control whether the client wants a hole. >> >> Signed-off-by: Eric Blake <ebl...@redhat.com> >> >> ---
>> @@ -1235,6 +1242,37 @@ static void nbd_trip(void *opaque) >> } >> break; >> >> + case NBD_CMD_WRITE_ZEROES: >> + TRACE("Request type is WRITE_ZEROES"); >> + >> + if (exp->nbdflags & NBD_FLAG_READ_ONLY) { >> + TRACE("Server is read-only, return error"); >> + reply.error = EROFS; >> + goto error_reply; >> + } >> + >> + TRACE("Writing to device"); >> + >> + flags = 0; >> + if (request.flags & NBD_CMD_FLAG_FUA) { >> + flags |= BDRV_REQ_FUA; >> + } >> + if (!(request.flags & NBD_CMD_FLAG_NO_HOLE)) { >> + flags |= BDRV_REQ_MAY_UNMAP; > > If I'm reading the NBD proto.md correctly, this is not enough if > NBD_CMD_FLAG_NO_HOLE is specified. We probably need to use a zeroed buffer > with > blk_pwrite, or pass a new flag (BDRV_RED_NO_HOLE) to blk_pwrite_zeroes to > enforce the bdrv_driver_pwritev() branch in bdrv_co_do_pwrite_zeroes(). If that's how you read it, then my proposal to proto.md needs updating. I specifically wrote the proposal to be as close as possible to the existing qemu semantics, except that we negated the sense of the bit because we wanted to allow the bit value of 0 to allow the server the most flexibility in performing optimizations. The code here (and in 14/14 on the client side) is merely catering to the fact that the bit has opposite sense in the two projects. That is, the rules in qemu are: MAY_UNMAP == 0 : must write zeroes MAY_UNMAP == 1 : may optimize if supported (where reads will see 0), but must write zeroes if not while the rules in NBD are: FLAG_NO_HOLE == 1 : must write zeroes FLAG_NO_HOLE == 0 : may optimize if supported (where reads will see 0), but must write zeroes if not Or another way of putting it: in qemu, the ability to punch holes was added after the fact (default of no holes being 0 due to back-compat), where prior to its addition full allocation was the only option, and most callers have to worry about passing MAY_UNMAP when they care about optimal use of storage; while in NBD we want to allow the server the freedom to have optimal usage of storage by default, but need a way to specifically ask for full allocation. If you think the NBD flag is poorly named, we have not yet committed to the NBD_CMD_WRITE_EXTENSIONS documentation yet, and are free to patch proto.md to choose a better name and/or wording to better describe what we actually mean on the NBD side of things. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
signature.asc
Description: OpenPGP digital signature