Re: [Qemu-devel] [PATCH 3/9] nbd: BLOCK_STATUS for standard get_block_status function: server part

Eric Blake Fri, 16 Feb 2018 09:04:03 -0800

On 02/16/2018 08:43 AM, Vladimir Sementsov-Ogievskiy wrote:

16.02.2018 16:21, Eric Blake wrote:

On 02/15/2018 07:51 AM, Vladimir Sementsov-Ogievskiy wrote:

Minimal realization: only one extent in server answer is supported.


Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
---

Again, function comments are useful.
+    if (query[0] == '\0' || strcmp(query, "allocation") == 0) {
+        /* Note: empty query should select all contexts within base
+         * namespace. */
+        meta->base_allocation = true;
From the client perspective, this handling of the empty leaf-name works well for NBD_OPT_LIST_META_CONTEXT (I want to see what leaf nodes the server supports), but not so well for NBD_OPT_SET_META_CONTEXT (asking the server to send ALL base allocations, even when I don't necessarily know how to interpret anything other than base:allocation, is a waste). So this function needs a bool parameter that says whether it is being invoked from _LIST (empty string okay, to advertise ALL base leaf names back to client, which for now is just base:allocation) or from _SET (empty string is ignored as invalid; client has to specifically ask for base:allocation by name).
"empty string is ignored as invalid", hm, do we have this in spec? I think SET and LIST must select exactly same sets of contexts.

No, it is specifically NOT intended that SET and LIST have to produce the same set of contexts; although I do see your point that as currently written, it does not appear to require SET to ignore "base:" or to treat it as an error. At any rate, the spec gives the example of:

In either case, however, for any given namespace the server MAY, instead of 
exhaustively listing every matching context available to select (or every 
context available to select where no query is given), send sufficient context 
records back to allow a client with knowledge of the namespace to select any 
context. This may be helpful where a client can construct algorithmic queries. 
For instance, a client might reply simply with the namespace with no leaf-name 
(e.g. 'X-FooBar:') or with a range of values (e.g. 
'X-ModifiedDate:20160310-20161214'). The semantics of such a reply are a matter 
for the definition of the namespace. However each namespace returned MUST begin 
with the relevant namespace, followed by a colon, and then other UTF-8 
characters, with the entire string following the restrictions for strings set 
out earlier in this document.

with the intent being that for some namespaces, it may be easy to perform an algorithmic query via _LIST to see what ranges are supported, but that you cannot select ALL elements in the range simultaneously. The empty query for _LIST exists to enumerate what is supported, but does not have to equate to an empty query for _SET selecting everything possible. I could even see it being possible to have some round-trips, depending on the namespace (of course, any namespace other than "base:" will be tightly coordinated between both client and server, so they understand each other - the point was that the NBD spec didn't want to constrain what a client and server could do as long as they stay within the generic framework):


C> LIST ""
S> REPLY "base:allocation" id 0
S> REPLY "X-FooBar:" id 0
S> ACK
C> LIST "X-FooBar:"
S> REPLY "X-FooBar:A_Required", id 0
S> REPLY "X-FooBar:A_Min=100", id 0
S> REPLY "X-FooBar:A_Max=200", id 0
S> REPLY "X-FooBar:B_Default=300", id 0
S> REPLY "X-FooBar:B_Min=300", id 0
S> REPLY "X-FooBar:B_Max=400", id 0
S> ACK
C> SET "X-FooBar:A=150" "base:allocation"
S> REPLY "X-FooBar:A=150,B=300", id 1
S> REPLY "base:allocation", id 2
S> ACK

where the global query of all available contexts merely lists that X-FooBar: is understood, but that a specific query is needed for more details (to avoid the client having to parse those specifics if it doesn't care about X-FooBar:), and the specific query sets up the algorithmic description (parameter A is required, between 100 and 200; parameter B is optional, between 300 and 400, defaulting to 300), and the final SET gives the actual request (A given a value, B left to its default; but the reply names the entire context rather than repeating the shorter request). So the spec is written to permit something like that for third-party namespaces, while also trying to be very specific about the "base:" context as that is the one that needs the most interoperability.

It is strange behavior of client to set "base:", but it is its decision. And I don't think that it is invalid.

For LIST, querying "base:" makes total sense (for the sake of example, we may add "base:frob" down the road that does something new. Being able to LIST whether "base:" turns into just "base:allocation" or into "base:allocation"+"base:frob" may be useful to a client that understands both formats and wants to probe if the server is new; and even for a client right now, the client can gracefully note that it doesn't want to select "base:frob"). But for SET, it does not (if "base:" turns into "base:allocation" + "base:frob" down the road, then the server is wasting time preparing the response to "base:frob" for every NBD_CMD_BLOCK_STATUS, and the client is wasting time unpacking from the wire and ignoring it), so having the empty query work on LIST but not on SET makes sense.
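
As a rough illustration of the LIST/SET distinction argued above, here is a minimal sketch of how a "base:" leaf-name handler could take a bool parameter. The type MetaContextsSketch and the function base_meta_query_sketch are hypothetical stand-ins, not the patch's actual code, and whether SET should really reject the empty leaf-name is still an open question for the NBD spec.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the patch's NBDExportMetaContexts. */
typedef struct MetaContextsSketch {
    bool base_allocation;
} MetaContextsSketch;

/*
 * Handle the leaf-name part of a "base:" query (the text after the colon).
 * is_list is true when invoked for NBD_OPT_LIST_META_CONTEXT and false
 * for NBD_OPT_SET_META_CONTEXT.
 * Returns true if the query selected at least one context.
 */
static bool base_meta_query_sketch(MetaContextsSketch *meta,
                                   const char *leaf, bool is_list)
{
    if (strcmp(leaf, "allocation") == 0) {
        /* An explicit request by name is honored for both LIST and SET. */
        meta->base_allocation = true;
        return true;
    }
    if (leaf[0] == '\0' && is_list) {
        /* Empty leaf on LIST: advertise every base context (one for now). */
        meta->base_allocation = true;
        return true;
    }
    /* Empty leaf on SET, or an unknown leaf: select nothing. */
    return false;
}
```

With this shape, a future "base:frob" would only need another branch for the explicit name plus one more flag set in the LIST-only branch.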

Formally we may answer with NBD_REP_ERR_TOO_BIG, but it will look weird, as client sees that both base: and base:allocation return _one_ context, but in one case it is too big. But if we will have several base: contexts, server may fairly answer with NBD_REP_ERR_TOO_BIG.

Hmm, you have a point that while a client can ask for "namespace:", the server should always respond with "namespace:leaf" for the actual contexts that it supports/selects, so that the client knows which leaves it actually got, if it does not fail with ERR_TOO_BIG. You are also right that failing with ERR_TOO_BIG for "base:" seems odd, but it may make more sense for other namespaces.


So, I think for now the code is ok.

Then this is probably something worth bringing up on the NBD list, if we need to tweak wording to be more explicit (whether we should allow/forbid wildcards during SET, or if wildcard queries are intended only for LIST). Sounds like I have more spec emails to write to the NBD list.

Also, I don't see NBD_REP_ERR_TOO_BIG possible reply in NBD_OPT_LIST_META_CONTEXT description. Should it be here?

Yeah, that's probably also worth adding to the upstream spec, even though it already encourages LIST results to send compressed information back that allows a client to construct valid specific queries, rather than an exhaustive list of selecting everything possible.

+/* nbd_negotiate_meta_query
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_negotiate_meta_query(NBDClient *client,
+                                    NBDExportMetaContexts *meta, Error **errp)
+{
+    int ret;
+    char *query, *colon, *namespace, *subquery;
Is it worth stack-allocating query here, so you don't have to g_free() it later? NBD limits the maximum string to 4k, which is a little bit big for stack allocation (on an operating system with 4k pages, allocating more than 4k on the stack in one function risks missing the guard page on stack overflow), but we also have the benefit that we KNOW that the set of meta-context namespaces that we support have a much smaller maximum limit of what we even care about.
it is not stack allocated, nbd_alloc_read_size_string calls g_malloc.

Hence my question - do we NEED the malloc'd version, or can we get away with a stack-allocated space? Although I then revised my question...

+
+    ret = nbd_alloc_read_size_string(client, &query, errp);
+    if (ret <= 0) {
+        return ret;
+    }
+
+    colon = strchr(query, ':');
+    if (colon == NULL) {
+        ret = nbd_opt_invalid(client, errp, "no colon in query");
+        goto out;
Hmm, that puts a slight wrinkle into my proposal, or else maybe it is something I should bring up on the NBD list. If we only read 5 characters (because the max namespace WE support is "base:"), but a client asks for namespace "X-longname:", we should gracefully ignore the client's request; while we still want to reply with an error to a client that asks for "garbage" with no colon at all. The question for the NBD spec, then, is whether detecting bad client requests that didn't use colon is mandatory for the server (meaning we MUST read the entire namespace request, and search for the colon) or merely best effort (we only have to read 5 characters, and if we silently ignore instead of diagnose a bad namespace request that was longer than that, oh well). Worded from the client, it switches to a question of whether the client should expect the server to diagnose all requests, or must be prepared for the server to ignore requests even where those requests are bogus. Or, the NBD spec may change slightly to pass namespace and leafname as separate fields, both with lengths, rather than a colon, to make it easier for the server to skip over an unknown namespace/leaf pair without having to parse whether a colon was present. I'll send that in a separate email (the upstream NBD list doesn't need to see all my review comments on this thread).
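
To make the trade-off concrete, here is a minimal sketch of what a server can and cannot conclude from a partially read query. The enum and function names are hypothetical and are not part of the patch or of QEMU; the sketch only assumes that the server knows the query's total wire length and has read its first few bytes.

```c
#include <stddef.h>
#include <string.h>

enum query_kind_sketch {
    QUERY_OURS,      /* starts with "base:" */
    QUERY_FOREIGN,   /* contains a colon, but names another namespace */
    QUERY_INVALID,   /* fully read, and no colon anywhere */
    QUERY_UNDECIDED  /* no colon in the bytes read so far, more unread */
};

/*
 * buf holds the first `have` bytes of a query whose length on the wire
 * is `total` (have <= total). Classify using only the bytes read.
 */
static enum query_kind_sketch
classify_query_prefix_sketch(const char *buf, size_t have, size_t total)
{
    static const char ns[] = "base:";
    const size_t ns_len = sizeof(ns) - 1;

    if (have >= ns_len && memcmp(buf, ns, ns_len) == 0) {
        return QUERY_OURS;
    }
    if (memchr(buf, ':', have) != NULL) {
        return QUERY_FOREIGN;
    }
    /* Only a complete read can prove that the colon is missing. */
    return have == total ? QUERY_INVALID : QUERY_UNDECIDED;
}
```

After reading only 5 bytes, both "X-longname:" and "garbage" come back as QUERY_UNDECIDED, which is exactly the case where the server must either read the rest of the string or silently skip the request.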


... in light of this thread now on the NBD list.


Thank you for careful review!

No problem. We still have some things to sort out on the NBD list as well, but I want to make sure we get something that is likely to work well with other implementations (I'm also trying, on the side, to get nbdkit to support structured reads so I have something available for testing cross-implementation support, but it is slow going).


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
