qapi-code-gen: Improve QAPI schema language doc

Eric Blake Tue, 10 Sep 2019 10:38:57 -0700

On 9/10/19 1:37 AM, Markus Armbruster wrote:
> We document the language by giving patterns of valid JSON objects.
> The patterns contain placeholders we don't define anywhere; their
> names have to speak for themselves.  I guess they do, but I'd prefer a
> bit more rigor.  Provide a grammar instead, and rework the text
> accordingly.
> 
> Documentation for QAPI schema conditionals (commit 967c885108,
> 6cc32b0e14, 87adbbffd4..3e270dcacc) and feature flags (commit
> 6a8c0b5102) was bolted on.  The sections documenting types, commands
> and events don't mention them.  Section "Features" and "Configuring
> the schema" then provide additional syntax for types, commands and
> events.  I hate that.  Fix the sections documenting types, commands
> and events to provide accurate syntax, and point to "Features" and
> "Configuring the schema" for details.
> 
> We talk about "(top-level) expressions other than include and pragma".
> Adopt more convenient terminology: a (top-level) expression is either
> a directive (include or pragma) or a definition (anything else).
> 
> Avoid the terms "dictionary" and "key".  Stick to JSON terminology
> "object" and "member name" instead.
> 
> While there, make spacing more consistent.
> 
> Signed-off-by: Markus Armbruster <arm...@redhat.com>
> ---
>  docs/devel/qapi-code-gen.txt | 592 ++++++++++++++++++++++-------------
>  1 file changed, 369 insertions(+), 223 deletions(-)


>  
> -The order of keys within JSON objects does not matter unless
> +* Production rules look like non-terminal = expression
> +* Concatenation: Expression A B matches expression A, then B
> +* Alternation: Expression A | B matches expression A or B
> +* Repetition: Expression A... matches zero or more occurences of

occurrences

> +  expression A; expression A, ... likewise, but separated by ,

worth calling out that trailing , are not allowed?

Is the 'expression A;' a copy-paste from RFC text, irrelevant to our
usage here?

> +* Grouping: Expression ( A ) matches expression A
> +* JSON's structural characters are terminals: { } [ ] : ,
> +* JSON's literal names are terminals: false true null
> +* String literals enclosed in 'single quotes' are terminal, and match
> +  this JSON string, with a leading '*' stripped off
> +* When JSON object member's name starts with '*', the member is
> +  optional.
> +* The symbol STRING is a terminal, and matches any JSON string
> +* The symbol BOOL is a terminal, and matches JSON false or true
> +* ALL-CAPS words other than STRING are non-terminals
> +
> +The order of members within JSON objects does not matter unless
>  explicitly noted.
>  

> +A QAPI schema consists of a series of top-level expressions:
> +
> +    SCHEMA = TOP-LEVEL-EXPR...
> +
> +The top-level expressions are all JSON objects.  Their order does not
> +matter.

Is that always true for all directives?

Would it be worth reformulating as something like:

SCHEMA = DIRECTIVE... DEFINITION...

allowing zero-or-more of either, but where directives come first?

> +
> +A top-level expressions is either a directive or a definition:
> +
> +    TOP-LEVEL-EXPR = DIRECTIVE | DEFINITION
> +
> +There are two kinds of directives and six kinds of definitions:
> +
> +    DIRECTIVE = INCLUDE | PRAGMA
> +    DEFINITION = ENUM | STRUCT | UNION | ALTERNATE | COMMAND | EVENT
> +
> +These are discussed in detail below.
>  
>  

>  === Pragma directives ===
>  
> -Usage: { 'pragma': DICT }
> +Syntax:
> +    PRAGMA = { 'pragma': { '*doc-required': BOOL },
> +                           '*returns-whitelist': [ STRING, ... ],
> +                           '*name-case-whitelist': [ STRING, ... ] }

This has matched {}, but looks wrong.  I think you meant

PRAGMA = { 'pragma': { '*doc-required': BOOL,
                       '*returns-whitelist': [ STRING, ... ],
                       '*name-case-whitelist': [ STRING, ... ] } }

>  === Enumeration types ===
>  
> -Usage: { 'enum': STRING, 'data': ARRAY-OF-STRING }
> -       { 'enum': STRING, '*prefix': STRING, 'data': ARRAY-OF-STRING }
> +Syntax:
> +    ENUM = { 'enum': STRING,
> +             'data': [ ENUM-VALUE, ... ],
> +             '*prefix': STRING,
> +             '*if': COND }
> +    ENUM-VALUE = STRING
> +               | { 'name': STRING, '*if': COND }
>  

> +=== Type references and array types ===
> +
> +Syntax:
> +    TYPE-REF = STRING | ARRAY-TYPE
> +    ARRAY-TYPE = [ STRING ]

As written (and enforced by current code), we don't allow 2-d arrays; if
we ever decide we need that, we'd rewrite to ARRAY-TYPE = [ ARRAY-TYPE ]

>  === Struct types ===
>  
> -Usage: { 'struct': STRING, 'data': DICT, '*base': STRUCT-NAME }
> +Syntax:
> +    STRUCT = { 'struct': STRING,
> +               'data': MEMBERS,
> +               '*base': STRING,
> +               '*if': COND,
> +               '*features': FEATURES }
> +    MEMBERS = { MEMBER, ... }
> +    MEMBER = STRING : TYPE-REF
> +           | STRING : { 'type': TYPE-REF, '*if': COND }

Down the road, we'll probably be amending this line:

 | STRING : { 'type': TYPE-REF, '*if': COND, '*default': VALUE }

with appropriate documentation on how VALUE must match type.


>  === Union types ===
>  
> -Usage: { 'union': STRING, 'data': DICT }
> -or:    { 'union': STRING, 'data': DICT, 'base': STRUCT-NAME-OR-DICT,
> -         'discriminator': ENUM-MEMBER-OF-BASE }
> -
> -Union types are used to let the user choose between several different
> -variants for an object.  There are two flavors: simple (no
> -discriminator or base), and flat (both discriminator and base).  A union
> -type is defined using a data dictionary as explained in the following
> -paragraphs.  Unions must have at least one branch.
> +Syntax:
> +    UNION = { 'union': STRING,
> +              'data': BRANCHES,
> +              '*if': COND }
> +          | { 'union': STRING,
> +              'data': BRANCHES,
> +              'base': ( MEMBERS | STRING ),
> +              'discriminator': STRING,
> +              '*if': COND }
> +    BRANCHES = { BRANCH, ... }

This syntactically allows zero or more branches per the intro, but in
code we semantically require at least one.  Is that a problem?

> +    BRANCH = STRING : TYPE-REF
> +           | STRING : { 'type': TYPE-REF, '*if': COND }
> +
> +Member 'union' names the union type.
> +
> +There are two flavors of union types: simple (no discriminator or
> +base), and flat (both discriminator and base).
> +
> +Each BRANCH of the 'data' object defines a branch of the union.  A
> +union must have at least one branch.

Is it worth trying to represent one-or-more in the grammar, in a
different manner than zero-or-more?


> +
> +All flat union branches must be of struct type.
> +

We may eventually relax that to permit another union, hopefully we
remember to fix the docs when doing so.

> +In the Client JSON Protocol, a flat union is represented by an object
> +with the common members (from the base type) and the selected branch's
> +members.  The two sets of member names must be disjoint.  Member
> +'discriminator' must name a non-optional enum-typed member of the base
> +struct.

We've already had patches proposed for relaxing that to allow an
optional discriminator as a way to select a default branch.  They'll
have to be rebased on top of this.


>  === Configuring the schema ===
>  
> -The 'struct', 'enum', 'union', 'alternate', 'command' and 'event'
> -top-level expressions can take an 'if' key.  Its value must be a string
> -or a list of strings.  A string is shorthand for a list containing just
> -that string.  The code generated for the top-level expression will then
> -be guarded by #if COND for each COND in the list.
> +Syntax:
> +    COND = STRING
> +         | [ STRING, ... ]
> +
> +All definitions take an optional 'if' member.  Its value must be a
> +string or a list of strings.  A string is shorthand for a list
> +containing just that string.  The code generated for the definition
> +will then be guarded by #if COND for each COND in the list.

Should this read 'guarded by #if STRING for each STRING in the COND list'?


> -==== Expression documentation ====
> +==== Definition documentation ====
>  
> -Expressions other than include and pragma directives may be preceded
> -by a documentation block.  Such blocks are called expression
> -documentation blocks.
> +Definition documentation, if present, must immediately precede the
> +definition it documents.
>  
> -When documentation is required (see pragma 'doc-required'), expression
> -documentation blocks are mandatory.
> +When documentation is required (see pragma 'doc-required'), every
> +definition must have documentation.
>  
> -The documentation block consists of a first line naming the
> -expression, an optional overview, a description of each argument (for
> -commands and events) or member (for structs, unions and alternates),
> -and optional tagged sections.
> +Definition documentation starts with a line naming the definition,
> +followed by an optional overview, a description of each argument (for
> +commands and events), member (for structs and unions), branch (for
> +alternates), or value (for enums), and finally optional tagged
> +sections.
>  
>  FIXME: the parser accepts these things in almost any order.
> +FIXME: union branches should be described, too.
>  
> -Extensions added after the expression was first released carry a
> +Extensions added after the definition was first released carry a
>  '(since x.y.z)' comment.

Not this patch's problem - but we are inconsistent between x.y and x.y.z.

There are a couple of typo or grammar fixes worth fixing, but I'm
comfortable enough giving:

Reviewed-by: Eric Blake <ebl...@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 15/16] docs/devel/qapi-code-gen: Improve QAPI schema language doc

Reply via email to