Re: [Qemu-devel] [PATCH v8 10/17] qapi: Simplify visiting of alternate types

Markus Armbruster Wed, 04 Nov 2015 08:06:55 -0800

Eric Blake <ebl...@redhat.com> writes:

> On 11/03/2015 11:30 AM, Markus Armbruster wrote:
>> Eric Blake <ebl...@redhat.com> writes:
>> 
>>> Previously, working with alternates required two enums, and
>>> some indirection: for type Foo, we created Foo_qtypes[] which
>>> maps each qtype to a member of FooKind_lookup[], then use
>> 
>> member of FooKind, actually.
>
> Or entry in the FooKind_lookup[] array.
>
>> 
>>> FooKind_lookup[] like we do for other union types.
>> 
>> You probably mean FooKind here as well.
>
> I'll play with the wording.
>
>> 
>>> This has a subtle bug: since the values of FooKind_lookup
>>> start at zero, all entries of Foo_qtypes that were not
>>> explicitly initialized map to the same branch of the union as
>>> the first member of the alternate, rather than triggering a
>>> failure in visit_get_next_type().  Fortunately, the bug
>>> seldom bites; the very next thing the input visitor does is
>>> try to parse the incoming JSON with the wrong parser, which
>>> fails; the output visitor is not used with a C struct in that
>>> state, and the dealloc visitor has nothing to clean up (so
>>> there is no leak).
>> 
>> Yes, I remember us discussing this bug.
>> 
>> While reading code to double-check your description, I stumbled over
>> this beauty in generated qapi-visit.c:
>> 
>>     visit_get_next_type(v, (int*) &(*obj)->type, BlockdevRef_qtypes,
>>                         name, &err);
>> 
>> This casts enum BlockdevRefKind * to int *, which assumes the compiler
>> represents the enum BlockdevRefKind as int or unsigned.  It is free to
>> use any integer type, though.  Common mistake of programmers with
>> insufficiently developed wariness of C's subtleties.
>> 
>> visit_get_next_type() passes the fishy int * on to v->get_next_type().
>> Only implementation is qmp_input_get_next_type(), which uses it so:
>> 
>>     *kind = qobjects[qobject_type(qobj)];
>> 
>> Latent death trap.
>> 
>> Does your patch clean this up?
>
> Yes, and I need to also document that this is an additional bug fix.
>
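
To spell out the cast problem for the archives, here is a minimal sketch
of the safe pattern: store through a real int, then convert by value.
All names in it (ExampleKind, example_get_next_type(),
example_read_kind()) are made up for illustration and are not QEMU code.

```c
/* Hypothetical stand-in for a generated alternate discriminator. */
typedef enum ExampleKind {
    EXAMPLE_KIND_A = 0,
    EXAMPLE_KIND_B = 1,
} ExampleKind;

/*
 * Hypothetical stand-in for v->get_next_type(): it stores through an
 * int *, so the pointee had better really be an int.
 */
static void example_get_next_type(int *kind, int value)
{
    *kind = value;
}

/*
 * Safe pattern: let the callee write into a genuine int, then convert
 * the value.  This works whatever integer type the compiler picked to
 * represent ExampleKind, unlike passing (int *)&some_enum_variable.
 */
static ExampleKind example_read_kind(int value)
{
    int tmp;

    example_get_next_type(&tmp, value);
    return (ExampleKind)tmp;
}
```

Calling example_read_kind(1) yields EXAMPLE_KIND_B without ever writing
sizeof(int) bytes into an object that may be smaller.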
>> 
>>> However, it IS observable in one case: the behavior of an
>>> alternate that contains a 'number' member but no 'int' member
>>> differs according to whether the 'number' was first in the
>>> qapi definition, and when the input being parsed is an integer;
>>> this is because the 'number' parser accepts QTYPE_QINT in
>>> addition to the expected QTYPE_QFLOAT.  A later patch will worry
>>> about fixing alternates to parse all inputs that a non-alternate
>>> 'number' would accept, for now it is still marked FIXME.
>
> See [1] below.
>
>>>
>>> This patch fixes the validation bug by deleting the indirection,
>>> and modifying get_next_type() to directly return a qtype code.
>> 
>> get_next_type() doesn't return anything.  Do you mean "store a qtype
>> code"?
>
> Yes.
>
>> 
>>> There is no longer a need to generate an implicit FooKind array
>> 
>> FooKind is an enum, not an array.
>
> ...to generate an implicit FooKind enum, nor FooKind_lookup[] array.
>
>> 
>>> associated with the alternate type (since the QMP wire format
>>> never uses the stringized counterparts of the C union member
>>> names); that also means we no longer have a collision with an
>>> alternate branch named 'max'.  Next, the generated visitor is
>>> fixed to properly detect unexpected qtypes in the switch
>>> statement.  This is done via the use of a new
>>> QAPISchemaAlternateTypeTag subclass and the use of a new
>>> member.c_type() method when producing qapi-types.  The new
>>> subtype also allows us to clean up a TODO left in the previous
>>> commit.
>>>
>>> Callers now have to know the QTYPE_* mapping when looking at the
>>> discriminator; but so far, only the testsuite was even using the
>>> C struct of an alternate type.  If that gets too confusing, we
>>> could reintroduce FooKind, but initialize it differently than
>>> most generated arrays, as in:
>>>   typedef enum FooKind {
>>>       FOO_KIND_A = QTYPE_QDICT,
>>>       FOO_KIND_B = QTYPE_QINT,
>>>   } FooKind;
>>> to create nicer aliases for knowing when to use foo->a or foo->b
>>> when inspecting foo->type.  But without a current client, I
>>> didn't see the point of doing it now.
>
> You have a point below that we either need to reserve MAX and require no
> case-insensitive clashes, or that we will never want to add it.  I'm
> leaning towards never going back, because the new way feels so much nicer.
[...]
>>> +++ b/tests/qapi-schema/qapi-schema-test.json
>>> @@ -131,7 +131,7 @@
>>>    'data': { 'value1': 'UserDefZero', 'has_a': 'UserDefZero',
>>>              'u': 'UserDefZero', 'type': 'UserDefZero' } }
>>>  { 'alternate': 'AltName', 'data': { 'type': 'int', 'u': 'bool',
>>> -                                    'myKind': 'has_a' } }
>>> +                                    'myKind': 'has_a', 'max': 'str' } }
>> 
>> Here, you add the positive test that alternate name 'max' works.
>> 
>> One, not mentioned in the commit message.
>
> D'oh.
>
>> 
>> Two, the commit message says we may reintroduce FooKind if working with
>> qtype_code turns out to be too confusing.  If we ever do that, alternate
>> name 'max' breaks, doesn't it?  Shouldn't we keep it reserved then, just
>> in case?
>> 
>
> See my comment above; at this point, I doubt we'll ever want to go back,
> so maybe I just need to be more definitive in stating that.


Fair enough.

However, we should also consider QAPI language regularity and
simplicity.  To make that argument, I need to back up a bit.

There are three kinds of name collisions: QAPI schema, QMP wire,
generated C.

QAPI schema collisions are the obvious ones:

* JSON object member names must be distinct (enforced by our JSON
  parser).

* Enumeration values must be distinct.

* Each flat union's variant's members must be distinct from the
  non-variant members.

A translation of the schema into a protocol or code can add additional
restrictions.

I believe the translation to QMP wire doesn't add any.  We don't mangle
names, we don't add object members or enumeration values.

The translation to C does add some:

* Mangled names can collide even when unmangled names don't.

* Names can collide with names used by the implementation.

It can also remove collision possibilities (e.g. variant and non-variant
members end up in separate name spaces), but that's not important here.

One possible approach would be to let qapi.py worry about QAPI schema /
QMP wire collisions, and the C compiler about generated C collisions.
We rejected that approach, because navigating from the C compiler's
error messages to the broken spot in the QAPI schema is a bother.

Instead, qapi.py knows enough about the generators to predict collisions
in generated C.  This is feasible only because the generators follow
simple and regular rules:

* Names are mangled by c_name(); prefixes or suffixes may be tacked on.

* Exception: enumeration values are mangled by camel_to_upper().  Given
  the way c_enum_const() is defined, though, you have to look closely
  to see it.

* Any pair of names that clashes before mangling also clashes after.
  This permits omitting collision checks for unmangled names when the
  mangled names are checked.

* Tag values are mangled *both* ways, because they occur both as C union
  members and as C enum members.

* We reserve a bunch of names for generator use: mangled names can't
  start with 'q_', mangled member names can't start with 'has_', mangled
  enumeration values can't be 'MAX', and so forth.
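
To illustrate why tag values need checking under both manglers, here is a
rough sketch in C.  The two functions are deliberately simplified
stand-ins I made up for this mail; they are not the real c_name() and
camel_to_upper() from qapi.py, which handle more cases.  The sketch_*
names and the Mangler typedef are likewise hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-in for c_name(): map every character that is not a
 * letter or digit to '_'.  size must be at least 1.
 */
static void sketch_c_name(const char *in, char *out, size_t size)
{
    size_t i = 0;

    for (; *in && i + 1 < size; in++) {
        out[i++] = isalnum((unsigned char)*in) ? *in : '_';
    }
    out[i] = '\0';
}

/*
 * Simplified stand-in for camel_to_upper(): insert '_' at each
 * lower-to-upper transition, then upcase.  size must be at least 1.
 */
static void sketch_camel_to_upper(const char *in, char *out, size_t size)
{
    size_t i = 0;
    char prev = '\0';

    /* Each iteration writes at most two characters. */
    for (; *in && i + 2 < size; prev = *in++) {
        unsigned char c = (unsigned char)*in;

        if (isupper(c) && islower((unsigned char)prev)) {
            out[i++] = '_';
        }
        out[i++] = isalnum(c) ? (char)toupper(c) : '_';
    }
    out[i] = '\0';
}

typedef void (*Mangler)(const char *, char *, size_t);

/* Return 1 if the two names are identical after mangling, else 0. */
static int sketch_collide(Mangler mangle, const char *a, const char *b)
{
    char ma[64], mb[64];

    mangle(a, ma, sizeof(ma));
    mangle(b, mb, sizeof(mb));
    return strcmp(ma, mb) == 0;
}
```

With these stand-ins, 'a-b' and 'a_b' collide under sketch_c_name(),
while 'fooBar' and 'foo-bar' stay distinct there but collide under
sketch_camel_to_upper().  A name used both as a C union member and as a
C enum member has to survive both checks.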

Remarks:

* A simple union's member names are also tag values, and are therefore
  mangled both ways, and both reserved member and enum value names
  apply.

* The moment you use an enumeration as the type of a flat union tag,
  the other mangling and the other set of reserved names kick in.

Conclusions:

* Having two different name manglers is a headache we could do without,
  especially since the second one, camel_to_upper(), is pretty magic.

  We have it only to get

    typedef enum BlockDeviceIoStatus {
        BLOCK_DEVICE_IO_STATUS_OK = 0,
        BLOCK_DEVICE_IO_STATUS_FAILED = 1,
        BLOCK_DEVICE_IO_STATUS_NOSPACE = 2,
        BLOCK_DEVICE_IO_STATUS_MAX = 3,
    } BlockDeviceIoStatus;

  instead of

    typedef enum BlockDeviceIoStatus {
        BlockDeviceIoStatus_ok = 0,
        BlockDeviceIoStatus_failed = 1,
        BlockDeviceIoStatus_nospace = 2,
        BlockDeviceIoStatus_MAX = 3,
    } BlockDeviceIoStatus;

  Bah!  CODING_STYLE doesn't even ask for shouting enumeration
  constants.  Can't see why we do.

* Keeping the complexity of the rules under control is good both for
  qapi.py and for the QAPI schema language.

  To that end, I think we should consider reserving the same set of
  names both for members and tag values.  It gets rid of complications
  like enumerations you can't use as flat union tags.

  Additionally, the question whether to keep the door open for
  generating an enum for the alternate cases becomes moot.
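
As a rough sketch of what "the same set of names" could look like, here
is one predicate applied to member names and tag values alike.  It is
written in C only to match the other examples; the real check would live
in qapi.py.  The sketch_* names are hypothetical, the input is assumed to
be already mangled by c_name(), and the list covers only the three
reservations named above, not the "and so forth".

```c
#include <ctype.h>
#include <string.h>

/* Case-insensitive string equality, since enum mangling upcases names. */
static int sketch_equal_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings ended together */
}

/*
 * One rule for members and tag values alike: reject the 'q_' and 'has_'
 * prefixes, and anything that would upcase to 'MAX'.
 */
static int sketch_name_is_reserved(const char *name)
{
    return strncmp(name, "q_", 2) == 0
        || strncmp(name, "has_", 4) == 0
        || sketch_equal_nocase(name, "max");
}
```

Under such a rule 'max', 'has_a' and 'q_foo' are rejected everywhere,
while 'maximum' and 'type' pass.  An alternate branch named 'max' would
then be refused regardless of whether we ever generate an enum for it.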

What do you think?

[...]
