Hi, On Wed, Mar 23, 2016 at 05:16:01PM +0300, Denis V. Lunev wrote: > From: Pavel Borzenkov <pborzen...@virtuozzo.com> > > There exist some cases when a client knows that the data it is going to > write is all zeroes. Such cases include mirroring or backing up a device > implemented by a sparse file. > > With current NBD command set, the client has to issue NBD_CMD_WRITE > command with zeroed payload and transfer these zero bytes through the > wire. The server has to write the data onto disk, effectively denying > the sparseness. > > To remedy this, the patch adds WRITE_ZEROES extension with one new > NBD_CMD_WRITE_ZEROES command. > > Signed-off-by: Pavel Borzenkov <pborzen...@virtuozzo.com> > Reviewed-by: Roman Kagan <rka...@virtuozzo.com> > Signed-off-by: Denis V. Lunev <d...@openvz.org> > CC: Wouter Verhelst <w...@uter.be> > CC: Paolo Bonzini <pbonz...@redhat.com> > CC: Kevin Wolf <kw...@redhat.com> > CC: Stefan Hajnoczi <stefa...@redhat.com> > --- > doc/proto.md | 44 ++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 44 insertions(+) > > diff --git a/doc/proto.md b/doc/proto.md > index 463ef8a..cda213c 100644 > --- a/doc/proto.md > +++ b/doc/proto.md > @@ -241,6 +241,8 @@ immediately after the global flags field in oldstyle > negotiation: > schedule I/O accesses as for a rotational medium > - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports > `NBD_CMD_TRIM` commands > +- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server > + supports `NBD_CMD_WRITE_ZEROES` commands > > ##### Client flags > > @@ -471,6 +473,10 @@ The following request types exist: > about the contents of the export affected by this command, until > overwriting it again with `NBD_CMD_WRITE`. > > +* `NBD_CMD_WRITE_ZEROES` (6) > + > + Defined by the experimental `WRITE_ZEROES` extension; see below. > + > * Other requests > > Some third-party implementations may require additional protocol > @@ -594,6 +600,44 @@ option reply type. > message if they do not also send it as a reply to the > `NBD_OPT_SELECT` message. > > +### `WRITE_ZEROES` extension > + > +There exist some cases when a client knows that the data it is going to write > +is all zeroes. Such cases include mirroring or backing up a device > implemented > +by a sparse file. With current NBD command set, the client has to issue > +`NBD_CMD_WRITE` command with zeroed payload and transfer these zero bytes > +through the wire. The server has to write the data onto disk, effectively > +denying the sparseness. > + > +To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds > +one new command with two command flags. > + > +* `NBD_CMD_WRITE_ZEROES` (6) > + > + A write request with no payload. Length and offset define the location > + and amount of data to be zeroed. > + > + The server MUST zero out the data on disk, and then send the reply > + message. The server MAY send the reply message before the data has > + reached permanent storage. > + > + If the `NBD_FLAG_SEND_FUA` flag ("Force Unit Access") was set in the > + export flags field, the client MAY set the flag `NBD_CMD_FLAG_FUA` (bit > 0) > + in the command flags field. If this flag was set, the server MUST NOT > send > + the reply until it has ensured that the newly-zeroed data has reached > + permanent storage. > + > + If the flag `NBD_CMD_FLAG_MAY_TRIM` (bit 1) was set by the client in the > + command flags field, the server MAY use trimming to zero out the area, > + but it MUST ensure that the data reads back as zero. > + > + If an error occurs, the server SHOULD set the appropriate error code > + in the error field. The server MAY then close the connection. > + > +The server SHOULD return `ENOSPC` if it receives a write zeroes request > +including one or more sectors beyond the size of the device. It SHOULD > +return `EPERM` if it receives a write zeroes request on a read-only export. > +
So, the semantics of your proposed WRITE_ZEROES are exactly the same as the WRITE command, except that no payload is sent? In that case, I think it's slightly more sensible if we don't add a new command, but instead just have an NBD_CMD_FLAG_ZEROES added to the WRITE command instead. After all, they're going to be (mostly) the same anyway. Did you propose a separate command for a specific reason that I'm missing (or forgetting), or is that just an oversight? -- < ron> I mean, the main *practical* problem with C++, is there's like a dozen people in the world who think they really understand all of its rules, and pretty much all of them are just lying to themselves too. -- #debian-devel, OFTC, 2016-02-12