On 02/03/2014 05:15 PM, Eric Blake wrote:
> On 01/23/2014 07:46 AM, Amos Kong wrote:
>> This is a code generator for qapi introspection. It will parse
>> qapi-schema.json, extend schema definitions and generate a schema
>> table with metadata, it references to the new structs which we used
>> to describe dynamic data structs. The metadata will help C code to
>> allocate right structs and provide useful information to management
>> to checking suported feature and QMP commandline detail. The schema
>
> s/suported/supported/
>
>> table will be saved to qapi-introspect.h.
>>
>
> I am NOT a python expert.  But what I _can_ do is apply this patch and
> review the generated code for sanity.

>> +fdecl.write(mcgen('''
>> +/* AUTOMATICALLY GENERATED, DO NOT MODIFY */
>> +
>> +/*
>> + * Head file to store parsed information of QAPI schema

s/Head/Header/

> Like I said, I think my review will be more helpful by looking at the
> generated file; I'll follow up in another email (probably tomorrow,
> since it's now late for me) with more comments once I actually finish that.
>

It took me longer than planned to get back to this.  But here go my
impressions of the generated file:

-rw-rw-r--. 1 eblake eblake 667643 Feb 10 16:16 qapi-introspect.h
-rw-rw-r--. 1 eblake eblake 126261 Feb  7 17:12 qapi-schema.json
-rw-rw-r--. 1 eblake eblake  80170 Feb  7 17:12 qapi-types.h

Wow, that's a LOT of code.  Why does it take more than 5 times as much C
code as the qapi file that represents everything to begin with?  Are we
going too far with inlining type information?  A larger file MIGHT be
okay, if that's what it takes to make C code that is expressive of the
information at hand (after all, the whole point of our qapi is to give
us some shorthand, so that we can quickly define types in less syntax
than C) - but I want to be sure that we really need that much content.
For comparison, the generated qapi-types.h is smaller than the input;
sure, that's in part due to the comments being stripped out of the input
file, but it's evidence that the C code representation of qapi should be
about the same size as the input file, not approaching an order of
magnitude larger.

const char *const qmp_schema_table[] = {
    /* OrderedDict([('enum', 'ErrorClass'), ('data', ['GenericError', 'CommandNotFound', 'DeviceEncrypted', 'DeviceNotActive', 'DeviceNotFound', 'KVMMissingCap'])]) */
    "{'_obj_member': 'False', '_obj_type': 'enum', '_obj_name': 'ErrorClass', '_obj_data': {'data': ['GenericError', 'CommandNotFound', 'DeviceEncrypted', 'DeviceNotActive', 'DeviceNotFound', 'KVMMissingCap']}, '_obj_recursive': 'False'}",

    /* OrderedDict([('command', 'add_client'), ('data', OrderedDict([('protocol', 'str'), ('fdname', 'str'), ('*skipauth', 'bool'), ('*tls', 'bool')]))]) */
    "{'_obj_member': 'False', '_obj_type': 'command', '_obj_name': 'add_client', '_obj_data': {'data': {'_obj_type': 'anonymous-struct', '_obj_member': 'True', '_obj_name': '', '_obj_data': {'*skipauth': 'bool', 'protocol': 'str', 'fdname': 'str', '*tls': 'bool'}, '_obj_recursive': 'False'}}, '_obj_recursive': 'False'}",

Long lines!  Just because it's generated doesn't mean it can't have nice
line wraps.  Make your generator output some strategic whitespace, so
that a human perusing the file stands a chance of understanding it (look
at qapi-types.h for comparison).

No sorting?  This looks like you just dumped the output in hash-table
order.  Please sort the array by type and/or name, so that if we add
filtering, the C code can then do O(log n) lookup via bsearch rather
than an O(n) linear crawl (or maybe even multiple tables, one per type,
with each table sorted by name within that type).

The comments before each string entry are redundant.  Cut your file in
half by listing only what the C compiler cares about - after all, the
information is supposed to be self-describing enough that if the comment
actually added anything, we failed at introspecting enough useful
information to the user.

A flat-out array of pre-compiled strings?
I guess it makes generating the output of your command a little faster
(just replay the pre-computed strings, instead of having to stringify a
QObject), but it is lousy if you ever have to process the data
differently.  I was totally expecting an array of structs.  And not just
any struct, but an array of the C struct that gets generated from the
QAPI code.  That is, since qapi is _already_ the mechanism for
generating decent C code structs, and we want introspection to be
self-describing, then your 'struct DataObject' from qapi-types.h (as
generated by patch 1/5) should already be sufficient as the base of your
array.  Or maybe we make an array of yet one more layer of type:

typedef struct QIntrospection QIntrospection;
struct QIntrospection {
    DataObject data;
    const char *string;
};

so that the C code has access to both the qapi struct and the
pre-rendered string, and can thus get at whichever piece of information
is needed (array[i].data.name when filtering by name, array[i].string
when outputting pre-formatted text).

What I find most appalling about the generated file is that each entry
of the array is a JSON string, but that the JSON string does NOT match
the JSON that would be sent over the wire when sending a DataObject.
Thus, the only way your generated file is usable is if you run the
string through a JSON parser, peel out the portions of the string you
need, and then reconstruct things back into the QMP command.  You REALLY
want either a DataObject[] or a QIntrospection[], so that you DON'T have
to parse strings in your C code.  The python code should have already
generated the header file with everything already placed in C structs
and/or QMP wire format strings in the format that best suits your needs!
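
As a rough illustration of the array-of-structs idea combined with the
sorting suggestion, a minimal sketch might look like the following.  The
one-field DataObject is a hypothetical stand-in for the real generated
struct, schema_lookup() and compare_name() are made-up helper names, and
the JSON strings are abbreviated placeholders rather than real wire
output.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: the real DataObject generated into
 * qapi-types.h carries more than a name. */
typedef struct DataObject {
    const char *name;
} DataObject;

typedef struct QIntrospection QIntrospection;
struct QIntrospection {
    DataObject data;
    const char *string;     /* pre-rendered text, abbreviated here */
};

/* The generator would have to emit entries in strcmp() order of
 * data.name; bsearch() gives wrong answers on an unsorted table. */
static const QIntrospection qmp_schema_table[] = {
    { { "ErrorClass" }, "{\"name\":\"ErrorClass\",\"type\":\"enumeration\"}" },
    { { "add_client" }, "{\"name\":\"add_client\",\"type\":\"command\"}" },
};

/* bsearch() comparator: the key is a plain C string, the element is
 * one table entry. */
static int compare_name(const void *key, const void *elem)
{
    const QIntrospection *entry = elem;

    return strcmp(key, entry->data.name);
}

/* O(log n) lookup by name; returns NULL when nothing matches. */
static const QIntrospection *schema_lookup(const char *name)
{
    assert(name != NULL);
    return bsearch(name, qmp_schema_table,
                   sizeof(qmp_schema_table) / sizeof(qmp_schema_table[0]),
                   sizeof(qmp_schema_table[0]), compare_name);
}
```

A caller that filters by name would use the returned entry's data
member, while one that only replays output would emit its string member
unchanged.  Note that plain strcmp() ordering puts uppercase names ahead
of lowercase ones, so the generator and the lookup have to agree on the
collation.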

That is, I think you want something more like this (rendering the first
two lines that I quoted above in pseudo-C):

const DataObject qmp_schema_table[] = {
  { .has_name = true, .name = "ErrorClass",
    .kind = DATA_OBJECT_KIND_ENUMERATION,
    .enumeration = {
      .value = "GenericError", .next = {
      .value = "CommandNotFound", .next = {
      .value = "DeviceEncrypted", .next = {
      .value = "DeviceNotActive", .next = {
      .value = "DeviceNotFound", .next = {
      .value = "KVMMissingCap", .next = NULL } } } } } },
    .has_recursive = false,
  },
  { .has_name = true, .name = "add_client",
    .kind = DATA_OBJECT_KIND_COMMAND,
    .command = {
      .has_data = true, .data = {
        .value = {
          .type = {
            .kind = DATA_OBJECT_MEMBER_TYPE_KIND_REFERENCE,
            .reference = "str" },
          .has_name = true, .name = "protocol",
          .has_optional = false, .has_recursive = false },
        .next = {
        .value = {
          .type = {
            .kind = DATA_OBJECT_MEMBER_TYPE_KIND_REFERENCE,
            .reference = "bool" },
          .has_name = true, .name = "skipauth",
          .has_optional = true, .optional = true,
          .has_recursive = false },
        ...
        .next = NULL }...},
      .has_returns = false, .has_gen = false,
    },
    .has_recursive = false,
  },

and so on.  Of course, as I typed that, I realize that you can't
actually initialize DataObjectCommand* and other object pointers via
simple {} initializers; so you actually need to be more verbose and use
some C99 typed initializers:

...
    .enumeration = &(DataObjectEnumeration) {
      .value = "GenericError", .next = &(DataObjectEnumeration) {
      .value = "CommandNotFound",
...

Or maybe all you need is:

const QIntrospection qmp_schema_table[] = {
  { .name = "ErrorClass",
    .str = "{\"name\":\"ErrorClass\","
           "\"type\":\"enumeration\","
           "\"data\":["
             "\"GenericError\","
             "\"CommandNotFound\","
             "\"DeviceEncrypted\","
             "\"DeviceNotActive\","
             "\"DeviceNotFound\","
             "\"KVMMissingCap\""
           "]}" },
  { .name = "add_client",
    .str = "{\"name\":\"add_client\","
           "\"type\":\"command\","
           "\"data\":{"
...

with just the object name and pre-rendered string in each array entry.
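
For reference, the C99 typed-initializer (compound literal) form does
compile at file scope, because compound literals outside a function have
static storage duration and their addresses are constant expressions.
The following is only a sketch under simplifying assumptions: StrList
and EnumInfo are hypothetical, cut-down stand-ins for the generated
DataObjectEnumeration and DataObject types, and the two helper functions
are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical simplification of a QAPI-style singly linked list. */
typedef struct StrList StrList;
struct StrList {
    const char *value;
    const StrList *next;
};

/* Hypothetical simplification of one enum entry in the schema table. */
typedef struct EnumInfo {
    const char *name;
    const StrList *values;
} EnumInfo;

/* Each &(const StrList) { ... } is a compound literal; at file scope it
 * has static storage duration, so its address may initialize .next. */
static const EnumInfo error_class = {
    .name = "ErrorClass",
    .values = &(const StrList) {
      .value = "GenericError", .next = &(const StrList) {
      .value = "CommandNotFound", .next = &(const StrList) {
      .value = "DeviceEncrypted", .next = &(const StrList) {
      .value = "DeviceNotActive", .next = &(const StrList) {
      .value = "DeviceNotFound", .next = &(const StrList) {
      .value = "KVMMissingCap", .next = NULL } } } } } },
};

/* Count the nodes in a list. */
static size_t str_list_length(const StrList *list)
{
    size_t n = 0;

    for (; list; list = list->next) {
        n++;
    }
    return n;
}

/* Report whether a list holds the given string. */
static bool str_list_contains(const StrList *list, const char *value)
{
    assert(value != NULL);
    for (; list; list = list->next) {
        if (strcmp(list->value, value) == 0) {
            return true;
        }
    }
    return false;
}
```

The nesting grows one level per list element, so a generator would
probably want to emit it with consistent wrapping, or emit each node as
a separately named static object, to keep the header readable.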

But my point remains - let the python code generate something USEFUL,
and not something that the C code has to re-parse.  If you're going to
store a string, store it in the format that QMP will already hand over
the wire.

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org