Re: [Qemu-devel] [Nbd] [PATCH v3] doc: Add NBD_CMD_BLOCK_STATUS extension

Wouter Verhelst Tue, 29 Nov 2016 05:12:13 -0800

Hi Vladimir,

On Tue, Nov 29, 2016 at 03:41:10PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Hi,
> 
> 29.11.2016 13:50, Wouter Verhelst wrote:
> 
>     Hi,
> 
>     On Mon, Nov 28, 2016 at 06:33:24PM +0100, Wouter Verhelst wrote:
> 
>         However, I'm arguing that if we're going to provide information about
>         snapshots, we should be able to properly refer to these snapshots from
>         within an NBD context. My previous mail suggested adding a negotiation
>         message that would essentially ask the server "tell me about the
>         snapshots you know about", giving them an NBD identifier in the 
> process
>         (accompanied by a "foreign" identifier that is decidedly *not* an NBD
>         identifier and that could be used to match the NBD identifier to
>         something implementation-defined). This would be read-only 
> information;
>         the client cannot ask the server to create new snapshots. We can then
>         later in the protocol refer to these snapshots by way of that NBD
>         identifier.
> 
>     To make this a bit more concrete, I've changed the proposal like so:
> 
[...]
>     +##### Allocation contexts
>     +
>     +Allocation context 0 is the basic "exists at all" allocation context. If
>     +an extent is not allocated at allocation context 0, it MUST NOT be
>     +listed as allocated at another allocation context. This supports sparse
> 
> 
> allocated here is range with unset NBD_STATE_HOLE bit?


Eh, yes. I should clarify that a bit further (no time right now though,
but patches are certainly welcome).

>     +file semantics on the server side. If a server has only one allocation
>     +context (the default), then writing to an extent which is allocated in
>     +that allocation context 0 MUST NOT fail with ENOSPC.
>     +
>     +For all other cases, this specification requires no specific semantics
>     +of allocation contexts. Implementations could support allocation
>     +contexts with semantics like the following:
>     +
>     +- Incremental snapshots; if a block is allocated in one allocation
>     +  context, that implies that it is also allocated in the next level up.
>     +- Various bits of data about the backend of the storage; e.g., if the
>     +  storage is written on a RAID array, an allocation context could
>     +  return information about the redundancy level of a given extent
>     +- If the backend implements a write-through cache of some sort, or
>     +  synchronises with other servers, an allocation context could state
>     +  that an extent is "allocated" once it has reached permanent storage
>     +  and/or is synchronized with other servers.
>     +
>     +The only requirement of an allocation context is that it MUST be
>     +representable with the flags as defined for `NBD_CMD_BLOCK_STATUS`.
>     +
>     +Likewise, the syntax of query strings is not specified by this document.
>     +
>     +Server implementations SHOULD document their syntax for query strings
>     +and semantics for resulting allocation contexts in a document like this
>     +one.
> 
> 
> IMHO, redundant paragraph for this spec.

The "SHOULD document" one? I want that there at any rate, just to make
clear that it's probably a good thing to have it (but no, it's not part
of the formal protocol spec).

>     +
>      ### Transmission phase
> 
>      #### Flag fields
>     @@ -983,6 +1065,9 @@ valid may depend on negotiation during the handshake 
> phase.
>         content chunk in reply.  MUST NOT be set unless the transmission
>         flags include `NBD_FLAG_SEND_DF`.  Use of this flag MAY trigger an
>         `EOVERFLOW` error chunk, if the request length is too large.
>     +- bit 3, `NBD_CMD_FLAG_REQ_ONE`; valid during `NBD_CMD_BLOCK_STATUS`. If
>     +  set, the client is interested in only one extent per allocation
>     +  context.
> 
>      ##### Structured reply flags
> 
>     @@ -1371,38 +1456,48 @@ adds a new `NBD_CMD_BLOCK_STATUS` command which 
> returns a list of
>      ranges with their respective states.  This extension is not available
>      unless the client also negotiates the `STRUCTURED_REPLY` extension.
> 
>     -* `NBD_FLAG_SEND_BLOCK_STATUS`
>     -
>     -    The server SHOULD set this transmission flag to 1 if structured
>     -    replies have been negotiated, and the `NBD_CMD_BLOCK_STATUS`
>     -    request is supported.
>     -
>      * `NBD_REPLY_TYPE_BLOCK_STATUS`
> 
>     -    *length* MUST be a positive integer multiple of 8.  This reply
>     +    *length* MUST be 4 + (a positive integer multiple of 8).  This reply
>          represents a series of consecutive block descriptors where the sum
>          of the lengths of the descriptors MUST not be greater than the
>     -    length of the original request.  This chunk type MUST appear at most
>     -    once in a structured reply. Valid as a reply to
>     +    length of the original request. This chunk type MUST appear at most
>     +    once per allocation ID in a structured reply. Valid as a reply to
>          `NBD_CMD_BLOCK_STATUS`.
> 
>     -    The payload is structured as a list of one or more descriptors,
>     -    each with this layout:
>     +    Servers MUST return an `NBD_REPLY_TYPE_BLOCK_STATUS` chunk for every
>     +    allocation context ID, except if the semantics of particular
>     +    allocation contexts mean that the information for one allocation
>     +    context is implied by the information for another.
> 
> 
> So, actually, instead of selecting with a nbd_cmd or with external tool which
> bitmap we want to access, we just reply with all bitmaps (negotiated at the
> beginning). Personally I dislike this. With such approach, we will have to
> export allocation bitmap always, even if we need only dirtyness. Consider, 
> that
> requesting allocation bitmap may be much more expensive in time that 
> requesting
> dirtyness.

Ah, yes, didn't consider that. I suppose more updates will be required,
then.

What would you suggest instead, then?

> Or allocation bitmap information is implied by dirtyness?
> 
> Furthermore, as allocation context semantics defined externally, 'semantics
> mean that information is implied' states nothing, and we actually return to 
> way
> #3, where external tool decides which bitmap to export.
> 
> 
>     +
>     +    The payload starts with:
>     +
>     +        * 32 bits, allocation context ID
>     +
>     +    and is followed by a list of one or more descriptors, each with this
>     +    layout:
> 
>              * 32 bits, length (unsigned, MUST NOT be zero)
>              * 32 bits, status flags
> 
>     -    The definition of the status flags is determined based on the
>     -    flags present in the original request.
>     +    If the client used the `NBD_CMD_FLAG_REQ_ONE` flag in the request,
>     +    then every reply chunk MUST NOT contain more than one descriptor.
>     +
>     +    Even if the client did not use the `NBD_CMD_FLAG_REQ_ONE` flag in
>     +    its request, the server MAY return less descriptors in the reply
>     +    than would be required to fully specify the whole range of requested
>     +    information to the client, if the number of descriptors would be
>     +    over 16 otherwise and looking up the information would be too
>     +    resource-intensive for the server.
> 
> 
> So, if there are <=16 extents, they all MUST be present in reply.. (just note)

That's the proposal, yes. I think it makes sense to have a minimum that
MUST be present (unless the client asked for REQ_ONE), although the
exact count can be different from 16, if needs be.

[...]
>     Thoughts?
> 
> For me considering dirtyness like one of allocation contexts sounds a bit
> weird, as dirtyness is not allocation.. But it is not so important.

I would certainly be willing to change the name, if that helps. The idea
is that you have various types of information that you can query the
server about. I called these "allocation contexts", but I'm certainly
not of the opinion that it *has* to be called that. Allocation is one of
the information types, but there can be more; my proposed spec
explicitly does not go into detail about the others.

-- 
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
       people in the world who think they really understand all of its rules,
       and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12

Re: [Qemu-devel] [Nbd] [PATCH v3] doc: Add NBD_CMD_BLOCK_STATUS extension

Reply via email to