Eric Blake <ebl...@redhat.com> writes: > On 07/28/2015 08:33 AM, Markus Armbruster wrote: >> Eric Blake <ebl...@redhat.com> writes: >> >>> On 07/01/2015 02:22 PM, Markus Armbruster wrote: >>>> Caution, rough edges. >>> >>> No joke. It doesn't even compile without this fixup to a rebase snafu >>> (see [0] below): >> >> Uh, how did that happen? I compiled it a million times... (except for >> the last time, obviously). > > At least you're not alone; I've done dumb things like that, too. And at > least it was easy to figure out, so I could test it. > >>>> >>>> FIXME it can generate awfully long lines >>> >>> We already have long lines in generated output, but I agree that finding >>> ways to break it up might be nice. Actually, >> >> This one's different in that the length is due to string literals. >> indent is happy to break lines for us, but not string literals. >> >> Longest line is a bit over 4KiB for me. >> > > If we break up string literals, at least use some indentation to make it > obvious that multiple lines merge to a single array entry. For example > (after patch 47): > > ... > "{ 'name': ':abr', 'meta-type': 'object', " > "'members': [ " > "{ 'name': 'device', 'type': ':acg', 'default': null }, " > "{ 'name': 'node-name', 'type': ':acg', 'default': null }, " > "{ 'name': 'snapshot-file', 'type': ':acg' }, " > "{ 'name': 'snapshot-node-name', 'type': ':acg', 'default': null > }, " > "{ 'name': 'format', 'type': ':acg', 'default': null }, " > "{ 'name': 'mode', 'type': ':afo', 'default': null } ] }, " > "{ 'name': ... "
Unconventional indentation, but if it helps the reader... > Oh, and just thought of something while typing: although patch 47 masks > type names from the end user, it would be VERY worthwhile if the C code > had strategic comments that pointed back to the qapi type name: > > "{ 'name': ':abr', " /* TypeFoo */ > "'meta-type': 'object'" > ... If we do that for uses and definitions, we don't need a flag to generate nice typenames. But doing it for uses might complicate the code too much. > (and then there's the '' vs. "" issue which impacts the output, as > well). And if you do the "," between entries as a separate line, that > might impact whether you do a trailing "]" on a line by its own (see [3] > below). > >>>> + >>>> +{ 'struct': 'SchemaInfoEnum', >>>> + 'data': { 'values': ['str'] } } >>> >>> Do we want to document anything about sort ordering of this list? Is it >>> worth sorting the array by name, to allow clients to bsearch for whether >>> a particular enum value is supported, rather than having to linear search? >> >> Haven't thought about it. We currently emit them in definition order, >> which is not sorted. Largest enums: >> >> Q_KEY_CODE_MAX = 125, >> BLKDEBUG_EVENT_MAX = 43, >> BLOCKDEV_DRIVER_MAX = 28, >> CHARDEV_BACKEND_KIND_MAX = 19, >> RUN_STATE_MAX = 15, >> NET_CLIENT_OPTIONS_KIND_MAX = 12, >> >> Most enums are small: median is 3. >> >> Would libvirt prefer them sorted? > > Libvirt can probably live without sorting of enum constants (searching > Q_KEY_CODE_MAX isn't going to be that noticeable of an impact on O(n) > vs. O(logn); I predict much more time will be spent searching for type > "xyz" referenced by command "abc" from the overall array, and repeating > that search for multiple feature probes). > > But if we do want sorting, we need it up front (it will be easier to > decide to use bsearch down the road if we are guaranteed that output is > sorted, than it would be to try and learn whether whether we are talking > to a new vs. old qemu to learn if sorting is in effect because it was > added as an after-thought). I agree adding sorting later is prone to add another instance of rarely used, bit-rotting backward-compatibility code to libvirt, along with the logic whether to use it. Best avoided. > And if we don't want sorting, documenting > that data is NOT guaranteed to be position-dependent, in spite of being > in a JSON array, is a nice touch. What do you mean by "position-dependent"? >>>> + >>>> +{ 'struct': 'SchemaInfoObjectVariant', >>>> + 'data': { 'case': 'str', >>>> + 'members': [ 'SchemaInfoObjectMember' ] } } >>> >>> Would it be simpler to just have: >>> >>> 'data': { 'case': 'str', 'type': 'str' } >>> >>> and make the user refer recursively to the (possibly-implicit) type for >>> the members? >> >> Hmmm... >> >> QAPISchemaObjectTypeVariant has members name, type, flat, and a few more >> that don't matter here. >> >> For a non-flat variant with name=N, type=T, my code creates >> >> { 'case': 'N', 'members': [ { 'name': 'data', 'type': 'T' } ] } >> >> This means when the tag is 'N', we have a member 'data' of type 'T'. >> >> For a flat variant, it creates >> >> { 'case': 'N', 'members': [ { ... the members of T ... } ] } >> >> This means when the tag is 'N', we have all the members of T. >> >> If I understand you correctly, you're proposing >> >> { 'case': 'N', 'type': 'T' } >> >> to mean when the tag is 'N', we have all the members of T. For the flat >> variant above, we'd create exactly that. >> >> T must not have variants, but the schema doesn't reflect that. > > That's our current restriction, but it's one we might decide to lift in > the future. Having a type with two different discriminators could get a > bit weird to think about, but doesn't seem to be technically impossible. > >> >> For the simple variant, we'd then create >> >> { 'case': 'N', 'type': 'TT' } >> >> where TT is a new implicit object type with a single member with name >> data and type T. >> >> Correct? > > Yes. I'm starting to like it. >>> In particular, if we ever decide to allow a flat union to have another >>> union as a branch, rather than the current restriction that all branches >>> must be structs, then referring to the type of a branch may be easier >>> than breaking out all members of a struct. >> >> Not sure I completely get you here. Using an object type instead of a >> member list is obviously more flexible, because for any member list we >> can make up an object type with the same meaning, but not vice versa. > > Indeed, and that's the restriction that I mention we might someday want > to lift. Or, in (modified, due to inline {} types) qapi terms, if we > ever want to allow: > > { 'union': 'Helper', 'data': { 'number': 'int', 'string': 'str' } } > { 'enum': 'MyEnum', 'data': [ 'a', 'b' ] } > { 'union': 'Main', 'base': { 'switch': 'MyEnum' }, > 'discriminator': 'switch', 'data': { 'a': { 'boolean': 'bool' }, > 'b': 'Helper' } } > { 'command': 'foo', 'data': { 'data': 'Main' } } > > then that would permit QMP invocations of: > { "execute": "foo", "arguments": { "data": { > "switch": "a", "boolean": true } } } > { "execute": "foo", "arguments": { "data": { > "switch": "b", "type": "number", "data": 1 } } } > { "execute": "foo", "arguments": { "data": { > "switch": "b", "type": "string", "data": "hello" } } } > > which we can express as a list of case/types for the primary variants of > 'Main' (those types in turn refer to the secondary variants of type > Helper), but which we cannot express as a [ 'SchemaInfoObjectMember' ] > list, because the type of the "data" member depends on the secondary > discriminator that is called into use on the "b" case of the primary > discriminator. > > So I think we're saying the same thing, that a [ > 'SchemaInfoObjectMember' ] can always be written as a reference to an > object type name, but not all object type names can be broken back into > an array of SchemaInfoObjectMember (only those types that are pure > structs without variants); and that although we currently do not allow > sub-variants within a union, we should not get in the way of that being > a possible future extension. > >> >>> And if that's the case, it >>> may have knock-on simplifications to your earlier patches for tracking >>> variants. See [1] below for more thoughts... >>> >>> Do we want to guarantee anything about the sort ordering in this list? >> >> Again, haven't thought about it. >> >> Do we expect member lists to get so large binary search is called for? > > Probably not, since such a list would be unwieldy for both client and > server. We tend to add boxing and optional sub-structs rather than > direct parameters if we have that much information to pass along (think > about how adding throttling parameters to a new block device was done > with a single top-level parameter pointing to a throttling sub-struct, > rather than adding lots of throttling parameters at top level). > > But, as with enum sorting, actually documenting our choice will help > cement the expectations of clients on what they have to do when learning > if a parameter was added. We may want to adopt a consistent rule on sorting stuff. >>>> + >>>> +{ 'struct': 'SchemaInfoObject', >>>> + 'data': { 'members': [ 'SchemaInfoObjectMember' ], >>>> + '*tag': 'str', >>>> + '*variants': [ 'SchemaInfoObjectVariant' ] } } >>> >>> or these? >> >> Same question. >> > > Here, if enums are sorted, then case branches within variants should be > sorted. If enums are unsorted, then I'm fine if case branches are also > unsorted (and possibly in a different order than the enum was), Okay. > but be > sure to document that. Documentation I produce tends to err on the side of brevity, sometimes too much. If something is meant to produce sorted results, I document the sortedness. If not, I habitually say nothing. Since nothing's said, you better assume nothing. Please keep calling out instances of excessive brevity :) >>>> + >>>> +{ 'struct': 'SchemaInfoAlternate', >>>> + 'data': { 'members': [ 'SchemaInfoObjectMember' ] } } >>> >>> Here's an example of what you generated: >>> "{ 'name': 'BlockdevRef', 'meta-type': 'alternate', 'members': [ { >>> 'name': 'definition', 'type': 'BlockdevOptions' }, { 'name': >>> 'reference', 'type': 'str' } ] }, " >>> >>> I think you could get away with something simpler: >>> >>> 'data': { 'types': [ 'str' ] } >>> >>> as in: >>> "{ 'name': 'BlockdevRef', 'meta-type': 'alternate', 'types': [ >>> 'BlockdevOptions', 'str' ] }, " >> >> I.e. have a list of types instead of a list of members. >> >> Let's see what we'd lose, by enumerating the members of >> SchemaInfoObjectMember: >> >> * name: not ABI, should not be examined (see commit message), thus no >> loss. >> >> * type: kept. >> >> * default: never present (see commit message), thus no loss >> >>> the only worry is whether we might want future extensions, where we'd >>> want additional information per element of that array, vs. being forced >>> to return two arrays in parallel (arrays of structs are more extensible >>> than arrays of strings). >>> Seems like this would be just a >> >> Yes? >> >> Choices: >> >> * The only piece of information we need on an alternative right now is >> the type, so make members a list of types. Nice now, awkward if we >> ever need more, >> >> * To provide for future additions, make it a list of >> SchemaInfoAlternateMember, where SchemaInfoAlternateMember has just >> one member type now. >> >> * Reuse existing SchemaInfoObjectMember, because that's close and I'm >> lazy. >> >> Preferences? > > At this point, my vote is with a new SchemaInfoAlternateMember class > (SchemaInfoObjectMember may diverge in a different direction, and it > would eliminate the question of how to not expose the branch names as > ABI; but keeping things as a (one-member, for now) dictionary will allow > future extensions). Since Kevin also questioned my reuse of SchemaInfoObjectMember for alternates in his review of RFC v1, let's go with a separate SchemaInfoAlternateMember. >>>> @@ -265,7 +265,7 @@ class QAPISchemaGenCommandVisitor(QAPISchemaVisitor): >>>> self.defn = None >>>> self.regy = None >>>> self.visited_rets = None >>>> - def visit_begin(self): >>>> + def visit_begin(self, schema): >>> >>> And again my python object-oriented newness is showing through; where I >>> guess all children have to update signatures to still be polymorphic to >>> a parent adding a parameter. >> >> Maybe they need to be updated, maybe not, but updating them neatly >> sidesteps Python polymorphism questions from mere mortals like you and >> me. > > see reference to [4] below > >> >>>> +''', >>>> + c_name=c_name(name)) >>>> + self.defn = mcgen(''' >>>> +char %(c_name)s[] = "[" >>>> + "%(c_jsons)s]"; >>> >>> And again. Also, I'd consider putting the "]" on its own line, like the >>> "[" was, so that you can more easily cut and paste individual lines of >>> generated output (but since JSON doesn't allow trailing comma, I guess >>> the last line is still always going to be special). >> >> That's my reason for keepint the "]" there. >> > > [3] and that works for me, unless we want to break long lines into > shorter ones by virtue of putting "," (and thus "]") on their own line > to make it even more obvious where we are breaking between elements of > the overall array. > >>>> +''', >>>> + c_name=c_name(name), >>>> + c_jsons=', "\n "'.join(self.jsons)) >>> >>> Cool syntax :) >> >> Possibly bordering on too cool :) > > Java was the first language I encountered where "".foo() was valid > syntax; sometimes, I wish C had made strings a first class type. I'm > fine with it as it is. > >>> [1] Ah, so .flat is still in use here, to avoid having to create >>> implicit types everywhere. But if we create implicit types for simple >>> unions, and just track variants by their case name/type instead of case >>> name/[members], it will allow us to have a union as a case branch (I >>> don't know that we need that much flexibility), and not have to worry >>> about exposing .flat everywhere. It may even result in a smaller JSON >>> string (you'd have to play with it to know for sure). >> >> We should try hard to get the introspection schema right from the start. >> >> But the internal representation is malleable. I know we can implement >> non-flat unions completely in terms of flat ones (and that means no >> .flat), but I also know I failed at doing it in this series. I'm not >> sure your idea will do the trick completely. It's fine, we can finish >> the job later. > > Yes, I've come to that conclusion myself - doesn't matter what we do on > the first round internally as long as the output is right; cleaning up > the internals can come later. > > >> I could shoehorn both views into a single visitor function, by passing >> both views, .base + .local_members, and .members. All implementations >> will use only one of the views, but it's not immediately obvious which >> one. So I chose to have two visitor functions. Matter of taste. > > I can live with it (documentation that sub-classes should override at > most only one of the two visitors might help the cause, though). Actually, overriding both would be just fine. We just haven't had a use for that. >>>> +++ b/scripts/qapi.py >>>> @@ -764,7 +764,7 @@ class QAPISchemaEntity(object): >>>> pass >>>> >>>> class QAPISchemaVisitor(object): >>>> - def visit_begin(self): >>>> + def visit_begin(self, schema): >>> >>> I don't know enough python to know if making schema optional in the >>> parent class affects what the child class is allowed to implement while >>> still overriding things. >> >> Not sure I get you here. > > It was an idle musing on whether > > class QAPISchemaVisitor(object): > def visit_begin(self, schema=None): > > would permit: > class QAPIIntrospectVisitor(QAPISchemaVisitor): > def visit_begin(self): > > with proper polymorphism. But you've already come to the conclusion > above [4] that it's easier to not mess with optional parameters (leave > that for the python gurus), and that mere mortals are better off using > something that obviously works instead of requiring knowledge about > language internals. > > >>>> +++ b/tests/.gitignore >>>> @@ -19,6 +19,7 @@ test-opts-visitor >>>> test-qapi-event.[ch] >>>> test-qapi-types.[ch] >>>> test-qapi-visit.[ch] >>>> +test-qapi-introspect.[ch] >>> >>> [2] Ah, maybe this is the file that wasn't quite right. >> >> Anything I need to fix? > > The generated files created by 'make check' wer named > test-qmp-introspect.[ch] (either s/qapi/qmp/ here, or else fix a > Makefile rule to generate the desired name). Scales falling from my eyes... >> Thanks! I really appreciate your review. Must have been a big chunk of >> work. > > Yes, doing a thorough review took me the better part of a week to go > through it all, so I can only imagine how much time you've invested in > it. Hopefully v3 will be easier to review, because it will be diffs > against this version and mainly focusing on whether review comments were > addressed. I'll do my best to keep the diffs in check.