Re: [Qemu-devel] [PATCH RFC v2 45/47] qapi: New QMP command query-schema for QMP schema introspection

Markus Armbruster Wed, 29 Jul 2015 02:22:41 -0700

Eric Blake <ebl...@redhat.com> writes:

> On 07/28/2015 08:33 AM, Markus Armbruster wrote:
>> Eric Blake <ebl...@redhat.com> writes:
>> 
>>> On 07/01/2015 02:22 PM, Markus Armbruster wrote:
>>>> Caution, rough edges.
>>>
>>> No joke. It doesn't even compile without this fixup to a rebase snafu
>>> (see [0] below):
>> 
>> Uh, how did that happen?  I compiled it a million times... (except for
>> the last time, obviously).
>
> At least you're not alone; I've done dumb things like that, too.  And at
> least it was easy to figure out, so I could test it.
>
>>>>
>>>> FIXME it can generate awfully long lines
>>>
>>> We already have long lines in generated output, but I agree that finding
>>> ways to break it up might be nice.  Actually,
>> 
>> This one's different in that the length is due to string literals.
>> indent is happy to break lines for us, but not string literals.
>> 
>> Longest line is a bit over 4KiB for me.
>> 
>
> If we break up string literals, at least use some indentation to make it
> obvious that multiple lines merge to a single array entry. For example
> (after patch 47):
>
> ...
>     "{ 'name': ':abr', 'meta-type': 'object', "
>       "'members': [ "
>         "{ 'name': 'device', 'type': ':acg', 'default': null }, "
>         "{ 'name': 'node-name', 'type': ':acg', 'default': null }, "
>         "{ 'name': 'snapshot-file', 'type': ':acg' }, "
>         "{ 'name': 'snapshot-node-name', 'type': ':acg', 'default': null
> }, "
>         "{ 'name': 'format', 'type': ':acg', 'default': null }, "
>         "{ 'name': 'mode', 'type': ':afo', 'default': null } ] }, "
>     "{ 'name': ... "


Unconventional indentation, but if it helps the reader...

> Oh, and just thought of something while typing: although patch 47 masks
> type names from the end user, it would be VERY worthwhile if the C code
> had strategic comments that pointed back to the qapi type name:
>
>     "{ 'name': ':abr', " /* TypeFoo */
>       "'meta-type': 'object'"
> ...

If we do that for uses and definitions, we don't need a flag to generate
nice typenames.  But doing it for uses might complicate the code too
much.

> (and then there's the '' vs. "" issue which impacts the output, as
> well).  And if you do the "," between entries as a separate line, that
> might impact whether you do a trailing "]" on a line by its own (see [3]
> below).
>
>>>> +
>>>> +{ 'struct': 'SchemaInfoEnum',
>>>> +  'data': { 'values': ['str'] } }
>>>
>>> Do we want to document anything about sort ordering of this list?  Is it
>>> worth sorting the array by name, to allow clients to bsearch for whether
>>> a particular enum value is supported, rather than having to linear search?
>> 
>> Haven't thought about it.  We currently emit them in definition order,
>> which is not sorted.  Largest enums:
>> 
>>     Q_KEY_CODE_MAX = 125,
>>     BLKDEBUG_EVENT_MAX = 43,
>>     BLOCKDEV_DRIVER_MAX = 28,
>>     CHARDEV_BACKEND_KIND_MAX = 19,
>>     RUN_STATE_MAX = 15,
>>     NET_CLIENT_OPTIONS_KIND_MAX = 12,
>> 
>> Most enums are small: median is 3.
>> 
>> Would libvirt prefer them sorted?
>
> Libvirt can probably live without sorting of enum constants (searching
> Q_KEY_CODE_MAX isn't going to be that noticeable of an impact on O(n)
> vs. O(logn); I predict much more time will be spent searching for type
> "xyz" referenced by command "abc" from the overall array, and repeating
> that search for multiple feature probes).
>
> But if we do want sorting, we need it up front (it will be easier to
> decide to use bsearch down the road if we are guaranteed that output is
> sorted, than it would be to try and learn whether whether we are talking
> to a new vs. old qemu to learn if sorting is in effect because it was
> added as an after-thought).

I agree adding sorting later is prone to add another instance of rarely
used, bit-rotting backward-compatibility code to libvirt, along with the
logic whether to use it.  Best avoided.

>                              And if we don't want sorting, documenting
> that data is NOT guaranteed to be position-dependent, in spite of being
> in a JSON array, is a nice touch.

What do you mean by "position-dependent"?

>>>> +
>>>> +{ 'struct': 'SchemaInfoObjectVariant',
>>>> +  'data': { 'case': 'str',
>>>> +            'members': [ 'SchemaInfoObjectMember' ] } }
>>>
>>> Would it be simpler to just have:
>>>
>>> 'data': { 'case': 'str', 'type': 'str' }
>>>
>>> and make the user refer recursively to the (possibly-implicit) type for
>>> the members?
>> 
>> Hmmm...
>> 
>> QAPISchemaObjectTypeVariant has members name, type, flat, and a few more
>> that don't matter here.
>> 
>> For a non-flat variant with name=N, type=T, my code creates
>> 
>>     { 'case': 'N', 'members': [ { 'name': 'data', 'type': 'T' } ] }
>> 
>> This means when the tag is 'N', we have a member 'data' of type 'T'.
>> 
>> For a flat variant, it creates
>> 
>>     { 'case': 'N', 'members': [ { ... the members of T ... } ] }
>> 
>> This means when the tag is 'N', we have all the members of T.
>> 
>> If I understand you correctly, you're proposing
>> 
>>     { 'case': 'N', 'type': 'T' }
>> 
>> to mean when the tag is 'N', we have all the members of T.  For the flat
>> variant above, we'd create exactly that.
>> 
>> T must not have variants, but the schema doesn't reflect that.
>
> That's our current restriction, but it's one we might decide to lift in
> the future.  Having a type with two different discriminators could get a
> bit weird to think about, but doesn't seem to be technically impossible.
>
>> 
>> For the simple variant, we'd then create
>> 
>>     { 'case': 'N', 'type': 'TT' }
>> 
>> where TT is a new implicit object type with a single member with name
>> data and type T.
>> 
>> Correct?
>
> Yes.

I'm starting to like it.

>>> In particular, if we ever decide to allow a flat union to have another
>>> union as a branch, rather than the current restriction that all branches
>>> must be structs, then referring to the type of a branch may be easier
>>> than breaking out all members of a struct.
>> 
>> Not sure I completely get you here.  Using an object type instead of a
>> member list is obviously more flexible, because for any member list we
>> can make up an object type with the same meaning, but not vice versa.
>
> Indeed, and that's the restriction that I mention we might someday want
> to lift.  Or, in (modified, due to inline {} types) qapi terms, if we
> ever want to allow:
>
> { 'union': 'Helper', 'data': { 'number': 'int', 'string': 'str' } }
> { 'enum': 'MyEnum', 'data': [ 'a', 'b' ] }
> { 'union': 'Main', 'base': { 'switch': 'MyEnum' },
>   'discriminator': 'switch', 'data': { 'a': { 'boolean': 'bool' },
>                                        'b': 'Helper' } }
> { 'command': 'foo', 'data': { 'data': 'Main' } }
>
> then that would permit QMP invocations of:
> { "execute": "foo", "arguments": { "data": {
>   "switch": "a", "boolean": true } } }
> { "execute": "foo", "arguments": { "data": {
>   "switch": "b", "type": "number", "data": 1 } } }
> { "execute": "foo", "arguments": { "data": {
>   "switch": "b", "type": "string", "data": "hello" } } }
>
> which we can express as a list of case/types for the primary variants of
> 'Main' (those types in turn refer to the secondary variants of type
> Helper), but which we cannot express as a [ 'SchemaInfoObjectMember' ]
> list, because the type of the "data" member depends on the secondary
> discriminator that is called into use on the "b" case of the primary
> discriminator.
>
> So I think we're saying the same thing, that a [
> 'SchemaInfoObjectMember' ] can always be written as a reference to an
> object type name, but not all object type names can be broken back into
> an array of SchemaInfoObjectMember (only those types that are pure
> structs without variants); and that although we currently do not allow
> sub-variants within a union, we should not get in the way of that being
> a possible future extension.
>
>> 
>>>                                             And if that's the case, it
>>> may have knock-on simplifications to your earlier patches for tracking
>>> variants. See [1] below for more thoughts...
>>>
>>> Do we want to guarantee anything about the sort ordering in this list?
>> 
>> Again, haven't thought about it.
>> 
>> Do we expect member lists to get so large binary search is called for?
>
> Probably not, since such a list would be unwieldy for both client and
> server.  We tend to add boxing and optional sub-structs rather than
> direct parameters if we have that much information to pass along (think
> about how adding throttling parameters to a new block device was done
> with a single top-level parameter pointing to a throttling sub-struct,
> rather than adding lots of throttling parameters at top level).
>
> But, as with enum sorting, actually documenting our choice will help
> cement the expectations of clients on what they have to do when learning
> if a parameter was added.

We may want to adopt a consistent rule on sorting stuff.

>>>> +
>>>> +{ 'struct': 'SchemaInfoObject',
>>>> +  'data': { 'members': [ 'SchemaInfoObjectMember' ],
>>>> +            '*tag': 'str',
>>>> +            '*variants': [ 'SchemaInfoObjectVariant' ] } }
>>>
>>> or these?
>> 
>> Same question.
>> 
>
> Here, if enums are sorted, then case branches within variants should be
> sorted. If enums are unsorted, then I'm fine if case branches are also
> unsorted (and possibly in a different order than the enum was),

Okay.

>                                                                 but be
> sure to document that.

Documentation I produce tends to err on the side of brevity, sometimes
too much.

If something is meant to produce sorted results, I document the
sortedness.  If not, I habitually say nothing.  Since nothing's said,
you better assume nothing.

Please keep calling out instances of excessive brevity :)

>>>> +
>>>> +{ 'struct': 'SchemaInfoAlternate',
>>>> +  'data': { 'members': [ 'SchemaInfoObjectMember' ] } }
>>>
>>> Here's an example of what you generated:
>>>     "{ 'name': 'BlockdevRef', 'meta-type': 'alternate', 'members': [ {
>>> 'name': 'definition', 'type': 'BlockdevOptions' }, { 'name':
>>> 'reference', 'type': 'str' } ] }, "
>>>
>>> I think you could get away with something simpler:
>>>
>>>  'data': { 'types': [ 'str' ] }
>>>
>>> as in:
>>>  "{ 'name': 'BlockdevRef', 'meta-type': 'alternate', 'types': [
>>> 'BlockdevOptions', 'str' ] }, "
>> 
>> I.e. have a list of types instead of a list of members.
>> 
>> Let's see what we'd lose, by enumerating the members of
>> SchemaInfoObjectMember:
>> 
>> * name: not ABI, should not be examined (see commit message), thus no
>>   loss.
>> 
>> * type: kept.
>> 
>> * default: never present (see commit message), thus no loss
>> 
>>> the only worry is whether we might want future extensions, where we'd
>>> want additional information per element of that array, vs. being forced
>>> to return two arrays in parallel (arrays of structs are more extensible
>>> than arrays of strings).
>>> Seems like this would be just a
>> 
>> Yes?
>> 
>> Choices:
>> 
>> * The only piece of information we need on an alternative right now is
>>   the type, so make members a list of types.  Nice now, awkward if we
>>   ever need more,
>> 
>> * To provide for future additions, make it a list of
>>   SchemaInfoAlternateMember, where SchemaInfoAlternateMember has just
>>   one member type now.
>> 
>> * Reuse existing SchemaInfoObjectMember, because that's close and I'm
>>   lazy.
>> 
>> Preferences?
>
> At this point, my vote is with a new SchemaInfoAlternateMember class
> (SchemaInfoObjectMember may diverge in a different direction, and it
> would eliminate the question of how to not expose the branch names as
> ABI; but keeping things as a (one-member, for now) dictionary will allow
> future extensions).

Since Kevin also questioned my reuse of SchemaInfoObjectMember for
alternates in his review of RFC v1, let's go with a separate
SchemaInfoAlternateMember.

>>>> @@ -265,7 +265,7 @@ class QAPISchemaGenCommandVisitor(QAPISchemaVisitor):
>>>>          self.defn = None
>>>>          self.regy = None
>>>>          self.visited_rets = None
>>>> -    def visit_begin(self):
>>>> +    def visit_begin(self, schema):
>>>
>>> And again my python object-oriented newness is showing through; where I
>>> guess all children have to update signatures to still be polymorphic to
>>> a parent adding a parameter.
>> 
>> Maybe they need to be updated, maybe not, but updating them neatly
>> sidesteps Python polymorphism questions from mere mortals like you and
>> me.
>
> see reference to [4] below
>
>> 
>>>> +''',
>>>> +                          c_name=c_name(name))
>>>> +        self.defn = mcgen('''
>>>> +char %(c_name)s[] = "["
>>>> +    "%(c_jsons)s]";
>>>
>>> And again. Also, I'd consider putting the "]" on its own line, like the
>>> "[" was, so that you can more easily cut and paste individual lines of
>>> generated output (but since JSON doesn't allow trailing comma, I guess
>>> the last line is still always going to be special).
>> 
>> That's my reason for keepint the "]" there.
>> 
>
> [3] and that works for me, unless we want to break long lines into
> shorter ones by virtue of putting "," (and thus "]") on their own line
> to make it even more obvious where we are breaking between elements of
> the overall array.
>
>>>> +''',
>>>> +                          c_name=c_name(name),
>>>> +                          c_jsons=', "\n    "'.join(self.jsons))
>>>
>>> Cool syntax :)
>> 
>> Possibly bordering on too cool :)
>
> Java was the first language I encountered where "".foo() was valid
> syntax; sometimes, I wish C had made strings a first class type.  I'm
> fine with it as it is.
>
>>> [1] Ah, so .flat is still in use here, to avoid having to create
>>> implicit types everywhere.  But if we create implicit types for simple
>>> unions, and just track variants by their case name/type instead of case
>>> name/[members], it will allow us to have a union as a case branch (I
>>> don't know that we need that much flexibility), and not have to worry
>>> about exposing .flat everywhere.  It may even result in a smaller JSON
>>> string (you'd have to play with it to know for sure).
>> 
>> We should try hard to get the introspection schema right from the start.
>> 
>> But the internal representation is malleable.  I know we can implement
>> non-flat unions completely in terms of flat ones (and that means no
>> .flat), but I also know I failed at doing it in this series.  I'm not
>> sure your idea will do the trick completely.  It's fine, we can finish
>> the job later.
>
> Yes, I've come to that conclusion myself - doesn't matter what we do on
> the first round internally as long as the output is right; cleaning up
> the internals can come later.
>
>
>> I could shoehorn both views into a single visitor function, by passing
>> both views, .base + .local_members, and .members.  All implementations
>> will use only one of the views, but it's not immediately obvious which
>> one.  So I chose to have two visitor functions.  Matter of taste.
>
> I can live with it (documentation that sub-classes should override at
> most only one of the two visitors might help the cause, though).

Actually, overriding both would be just fine.  We just haven't had a use
for that.

>>>> +++ b/scripts/qapi.py
>>>> @@ -764,7 +764,7 @@ class QAPISchemaEntity(object):
>>>>          pass
>>>>  
>>>>  class QAPISchemaVisitor(object):
>>>> -    def visit_begin(self):
>>>> +    def visit_begin(self, schema):
>>>
>>> I don't know enough python to know if making schema optional in the
>>> parent class affects what the child class is allowed to implement while
>>> still overriding things.
>> 
>> Not sure I get you here.
>
> It was an idle musing on whether
>
> class QAPISchemaVisitor(object):
>     def visit_begin(self, schema=None):
>
> would permit:
> class QAPIIntrospectVisitor(QAPISchemaVisitor):
>     def visit_begin(self):
>
> with proper polymorphism.  But you've already come to the conclusion
> above [4] that it's easier to not mess with optional parameters (leave
> that for the python gurus), and that mere mortals are better off using
> something that obviously works instead of requiring knowledge about
> language internals.
>
>
>>>> +++ b/tests/.gitignore
>>>> @@ -19,6 +19,7 @@ test-opts-visitor
>>>>  test-qapi-event.[ch]
>>>>  test-qapi-types.[ch]
>>>>  test-qapi-visit.[ch]
>>>> +test-qapi-introspect.[ch]
>>>
>>> [2] Ah, maybe this is the file that wasn't quite right.
>> 
>> Anything I need to fix?
>
> The generated files created by 'make check' wer named
> test-qmp-introspect.[ch] (either s/qapi/qmp/ here, or else fix a
> Makefile rule to generate the desired name).

Scales falling from my eyes...

>> Thanks!  I really appreciate your review.  Must have been a big chunk of
>> work.
>
> Yes, doing a thorough review took me the better part of a week to go
> through it all, so I can only imagine how much time you've invested in
> it.  Hopefully v3 will be easier to review, because it will be diffs
> against this version and mainly focusing on whether review comments were
> addressed.

I'll do my best to keep the diffs in check.

Re: [Qemu-devel] [PATCH RFC v2 45/47] qapi: New QMP command query-schema for QMP schema introspection

Reply via email to