Re: [Qemu-devel] [PATCH RFC 3/6] qapi: rewrite string-input-visitor

David Hildenbrand Wed, 14 Nov 2018 11:57:31 -0800

On 14.11.18 18:38, Markus Armbruster wrote:
> David Hildenbrand <da...@redhat.com> writes:
> 
>> The input visitor has some problems right now, especially
>> - unsigned type "Range" is used to process signed ranges, resulting in
>>   inconsistent behavior and ugly/magical code
>> - uint64_t are parsed like int64_t, so big uint64_t values are not
>>   supported and error messages are misleading
>> - lists/ranges of int64_t are accepted although no list is parsed and
>>   we should rather report an error
>> - lists/ranges are preparsed using int64_t, making it hard to
>>   implement uint64_t values or uint64_t lists
>> - types that don't support lists don't bail out
> 
> Known weirdness: empty list is invalid (test-string-input-visitor.c
> demonstates).  Your patch doesn't change that (or else it would update
> the test).  Should it be changed?
>


I don't change the test, so the old behavior still works.
(empty string -> error)

>>
>> So let's rewrite it by getting rid of usage of the type "Range" and
>> properly supporting lists of int64_t and uint64_t (including ranges of
>> both types), fixing the above mentioned issues.
>>
>> Lists of other types are not supported and will properly report an
>> error. Virtual walks are now supported.
>>
>> Tests have to be fixed up:
>> - Two BUGs were hardcoded that are fixed now
>> - The string-input-visitor now actually returns a parsed list and not
>>   an ordered set.
>>
>> While at it, do some smaller style changes (e.g. use g_assert).
> 
> Please don't replace assert() by g_assert() in files that consistently
> use assert() now.

Alright, will use assert() and don't replace. (I thought g_assert() was
the new fancy thing to do it in QEMU).

> 
> In my opinion, g_assert() is one of the cases where GLib pointlessly
> reinvents wheels that have been carrying load since forever.
> 
>> Signed-off-by: David Hildenbrand <da...@redhat.com>
>> ---
>>  include/qapi/string-input-visitor.h |   3 +-
>>  qapi/string-input-visitor.c         | 436 +++++++++++++++++-----------
>>  tests/test-string-input-visitor.c   |  18 +-
>>  3 files changed, 264 insertions(+), 193 deletions(-)
>>
>> diff --git a/include/qapi/string-input-visitor.h 
>> b/include/qapi/string-input-visitor.h
>> index 33551340e3..2c40f7f5a6 100644
>> --- a/include/qapi/string-input-visitor.h
>> +++ b/include/qapi/string-input-visitor.h
>> @@ -19,8 +19,7 @@ typedef struct StringInputVisitor StringInputVisitor;
>>  
>>  /*
>>   * The string input visitor does not implement support for visiting
>> - * QAPI structs, alternates, null, or arbitrary QTypes.  It also
>> - * requires a non-null list argument to visit_start_list().
>> + * QAPI structs, alternates, null, or arbitrary QTypes.
> 
> Preexisting: neglects to spell out the list limitations, i.e. can do
> only flat lists of integers.  Care do fix that, too?

Added "Only flat lists of integers (int64/uint64) are supported."

> 
>>   */
>>  Visitor *string_input_visitor_new(const char *str);
>>  
>> diff --git a/qapi/string-input-visitor.c b/qapi/string-input-visitor.c
>> index dee708d384..16da47409e 100644
>> --- a/qapi/string-input-visitor.c
>> +++ b/qapi/string-input-visitor.c
>> @@ -4,10 +4,10 @@
>>   * Copyright Red Hat, Inc. 2012-2016
>>   *
>>   * Author: Paolo Bonzini <pbonz...@redhat.com>
>> + *         David Hildenbrand <da...@redhat.com>
>>   *
>>   * This work is licensed under the terms of the GNU LGPL, version 2.1 or 
>> later.
>>   * See the COPYING.LIB file in the top-level directory.
>> - *
>>   */
>>  
>>  #include "qemu/osdep.h"
>> @@ -18,21 +18,39 @@
>>  #include "qapi/qmp/qerror.h"
>>  #include "qapi/qmp/qnull.h"
>>  #include "qemu/option.h"
>> -#include "qemu/queue.h"
>> -#include "qemu/range.h"
>>  #include "qemu/cutils.h"
>>  
>> +typedef enum ListMode {
>> +    /* no list parsing active / no list expected */
>> +    LM_NONE,
>> +    /* we have an unparsed string remaining */
>> +    LM_UNPARSED,
>> +    /* we have an unfinished int64 range */
>> +    LM_INT64_RANGE,
>> +    /* we have an unfinished uint64 range */
>> +    LM_UINT64_RANGE,
>> +    /* we have parsed the string completely and no range is remaining */
>> +    LM_END,
>> +} ListMode;
>> +
>> +typedef union RangeLimit {
>> +    int64_t i64;
>> +    uint64_t u64;
>> +} RangeLimit;
>>  
>>  struct StringInputVisitor
>>  {
>>      Visitor visitor;
>>  
>> -    GList *ranges;
>> -    GList *cur_range;
>> -    int64_t cur;
>> +    /* Porperties related to list processing */
> 
> "Properties"
> 
> Suggest
> 
>        /* List parsing state */

Yes, will use that.

> 
>> +    ListMode lm;
>> +    RangeLimit rangeNext;
>> +    RangeLimit rangeEnd;
> 
> RangeLimit is a good name for rangeEnd, but not so hot for rangeNext.
> Naming is hard.

Initially I had two unnamed unions to not have to give a name ;)

RangeElement ? Will use that for now.

> 
>> +    const char *unparsed_string;
>> +    void *list;
> 
> You drop /* Only needed for sanity checking the caller */.
> Double-checking: is that not true anymore?

I consider such comments bad as they can easily become outdated (and
IMHO don't help anybody). But I can leave it here, don't really care.

[...]

>> +        error_setg(errp, "Already processing a list.");
> 
> Drop the period, please.  error_setg()'s contract stipulates:
> 
>  * The resulting message should be a single phrase, with no newline or
>  * trailing punctuation.
> 
> More of the same below.

All dots dropped :)

[...]
>>  static GenericList *next_list(Visitor *v, GenericList *tail, size_t size)
>>  {
>>      StringInputVisitor *siv = to_siv(v);
>> -    Range *r;
>> -
>> -    if (!siv->ranges || !siv->cur_range) {
>> -        return NULL;
>> -    }
>>  
>> -    r = siv->cur_range->data;
>> -    if (!r) {
>> +    switch (siv->lm) {
>> +    case LM_NONE:
>> +    case LM_END:
>> +        /* we have reached the end of the list already or have no list */
>>          return NULL;
> 
> Hmm.  Is there a way to reach case LM_NONE other than a programming
> error?  If no, we should abort then.  To do so, fold LM_NONE into the
> default case below.

Alright, all programming errors will be encoded as asserts.

> 
>> -    }
>> -
>> -    if (!range_contains(r, siv->cur)) {
>> -        siv->cur_range = g_list_next(siv->cur_range);
>> -        if (!siv->cur_range) {
>> -            return NULL;
>> -        }
>> -        r = siv->cur_range->data;
>> -        if (!r) {
>> -            return NULL;
>> -        }
>> -        siv->cur = range_lob(r);
>> +    case LM_INT64_RANGE:
>> +    case LM_UINT64_RANGE:
>> +    case LM_UNPARSED:
>> +        /* we have an unparsed string or something left in a range */
>> +        break;
>> +    default:
>> +        g_assert_not_reached();
> 
> abort() suffices, and is more consistent with the rest of qapi/.

Right, qapi is special ;)

Will use s/g_assert/assert/
s/g_assert_not_reached/abort/

> 
>>      }
>>  
>>      tail->next = g_malloc0(size);
>> @@ -179,88 +109,216 @@ static GenericList *next_list(Visitor *v, GenericList 
>> *tail, size_t size)
>>  static void check_list(Visitor *v, Error **errp)
>>  {
>>      const StringInputVisitor *siv = to_siv(v);
>> -    Range *r;
>> -    GList *cur_range;
>>  
>> -    if (!siv->ranges || !siv->cur_range) {
>> +    switch (siv->lm) {
>> +    case LM_NONE:
>> +        error_setg(errp, "Not processing a list.");
> 
> Same question as for next_list().

abort() it is :) -> default case

> 
>> +    case LM_INT64_RANGE:
>> +    case LM_UINT64_RANGE:
>> +    case LM_UNPARSED:
>> +        error_setg(errp, "There are elements remaining in the list.");
> 
> Hmm.  qobject_input_check_list() reports "Only %u list elements expected
> in %s" here.  Looks unintuitive, until you remember how it's normally
> used: to parse user input.  The error gets reported when the parsing
> didn't consume the whole list.  Can only happen with a virtual walk.
> And when it happens, the issue to report is "you provided a longer list
> than I can accept".  qobject_input_check_list() does exactly that.  Your
> message is less clear.

We don't keep track of element count (and I don't like adding it just to
print a error message).

What about "Less list elements expected"? That at least matches the
other error.

> 
>>          return;
>> -    }
>> -
>> -    r = siv->cur_range->data;
>> -    if (!r) {
>> +    case LM_END:
>>          return;
>> +    default:
>> +        g_assert_not_reached();
>>      }
>> -
>> -    if (!range_contains(r, siv->cur)) {
>> -        cur_range = g_list_next(siv->cur_range);
>> -        if (!cur_range) {
>> -            return;
>> -        }
>> -        r = cur_range->data;
>> -        if (!r) {
>> -            return;
>> -        }
>> -    }
>> -
>> -    error_setg(errp, "Range contains too many values");
>>  }
>>  
>>  static void end_list(Visitor *v, void **obj)
>>  {
>>      StringInputVisitor *siv = to_siv(v);
>>  
>> -    assert(siv->list == obj);
> 
> I suspect state LM_NONE is also a programming error here.  If that's
> correct, let's assert(siv->lm != LM_NONE).

Indeed, added.

[...]
>> +        if (siv->rangeNext.i64 > siv->rangeEnd.i64 || *obj == INT64_MAX) {
>> +            /* end of range, check if there is more to parse */
>> +            if (siv->unparsed_string[0]) {
>> +                siv->lm = LM_UNPARSED;
>> +            } else {
>> +                siv->lm = LM_END;
>> +            }
>>          }
> 
> Are you sure this is kosher?
> 
> if siv->rangeNext.i64 and siv->rangeEnd are both initially INT64_MAX,
> siv->rangeNext++ wraps around to INT64_MIN.  Except when the compiler
> exploits undefined behavior.

*obj == INT64_MAX checks exactly that

(it is the old value of siv->rangeNext.i64 before a possible wraparound).

BTW I added a test case for exactly that as I had the BUG you described
in the code ;)

[...]
>> +        return;
>> +    case LM_UNPARSED:
>> +        if (try_parse_int64_list_entry(siv, obj)) {
>> +            error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name ? name : 
>> "null",
>> +                       "list of int64 values or ranges");
>> +        }
> 
> I figure I'd make try_parse_int64_list_entry() just parse, and on
> success fall through to case LM_INT64_RANGE.  But your solution works,
> too.

Then we would have to represent even single values as ranges, which is
something I'd like to avoid.

> 
>> +        return;
>> +    case LM_END:
>> +        error_setg(errp, "No more elements in the list.");
> 
> This is the buddy of check_list() case LM_UNPARSED etc.  There, input
> provides more list members than we expect.  Here, input provides less.
> Again, the error is less clear than the one the QObject input visitor
> reports: "Parameter '%s' is missing".
> 
> Same for the unsigned copy below.

Will also use "Less list elements expected".

"Parameter '%s' is missing" does not really apply to list elements
without a name. We can discuss when I resend.

> 
>> +        return;
>> +    default:
>> +        error_setg(errp, "Lists don't support mixed types.");
> 
> Is this a programming error?
> 
> Same for the unsigned copy below.

abort() it is.

[...]
>> +
>> +        if (siv->rangeNext.u64 > siv->rangeEnd.u64 || *obj == UINT64_MAX) {
>> +            /* end of range, check if there is more to parse */
>> +            if (siv->unparsed_string[0]) {
>> +                siv->lm = LM_UNPARSED;
>> +            } else {
>> +                siv->lm = LM_END;
>> +            }
>> +        }
> 
> Unsigned wraps around fine.  But when you fix the signed code, you
> should probably keep this unsigned code as similar as practical.

Again, *obj == UINT64_MAX checks this.

> 
>> +        return;
>> +    case LM_UNPARSED:
>> +        if (try_parse_uint64_list_entry(siv, obj)) {
>> +            error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name ? name : 
>> "null",
>> +                       "list of uint64 values or ranges");
>> +        }
>> +        return;
>> +    case LM_END:
>> +        error_setg(errp, "No more elements in the list.");
>> +        return;
>> +    default:
>> +        error_setg(errp, "Lists don't support mixed types.");
>> +        return;
>>      }
>>  }
>>  
>> @@ -271,6 +329,11 @@ static void parse_type_size(Visitor *v, const char 
>> *name, uint64_t *obj,
>>      Error *err = NULL;
>>      uint64_t val;
>>  
>> +    if (siv->lm != LM_NONE) {
>> +        error_setg(errp, "Lists not supported for type \"size\"");
>> +        return;
>> +    }
>> +
> 
> Programming error?  More of the same below.

assert() it is.

[...]
>>      visit_type_uint64List(v, NULL, &res, &error_abort);
>>      tail = res;
>>      for (i = 0; i < n; i++) {
>> @@ -117,10 +107,10 @@ static void check_ulist(Visitor *v, uint64_t 
>> *expected, size_t n)
>>  static void test_visitor_in_intList(TestInputVisitorData *data,
>>                                      const void *unused)
>>  {
>> -    /* Note: the visitor *sorts* ranges *unsigned* */
>> -    int64_t expect1[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 20 };
>> +    int64_t expect1[] = { 1, 2, 0, 2, 3, 4, 20, 5, 6, 7, 8, 9, 1, 2, 3, 4, 
>> 5,
> 
> Please wrap this line a bit earlier.

Will do.

> 
>> +                          6, 7, 8 };
>>      int64_t expect2[] = { 32767, -32768, -32767 };
>> -    int64_t expect3[] = { INT64_MAX, INT64_MIN };
>> +    int64_t expect3[] = { INT64_MIN, INT64_MAX };
> 
> Did you mean to change this line?
> 

Yes, we don't sort the list anymore.

>>      uint64_t expect4[] = { UINT64_MAX };
>>      Error *err = NULL;
>>      int64List *res = NULL;
>> @@ -189,7 +179,7 @@ static void test_visitor_in_intList(TestInputVisitorData 
>> *data,
>>      visit_type_int64(v, NULL, &tail->value, &err);
>>      g_assert_cmpint(tail->value, ==, 0);
>>      visit_type_int64(v, NULL, &val, &err);
>> -    g_assert_cmpint(val, ==, 1); /* BUG */
>> +    error_free_or_abort(&err);
> 
> Which of the problems listed in commit message is this?

I guess the not mentioned "visiting beyond the end of a list is not
handled properly" bug (most probably due to old range handling). Will
add it to the description,

> 
>>      visit_check_list(v, &error_abort);
>>      visit_end_list(v, (void **)&res);
> 
> Please don't let my review comments discourage you!  This is so much
> better than the code we have now, and feels pretty close to ready.
> Thanks a lot!
> 

I appreciate people reviewing my code :) So don't worry, I'll do
requested changes and resend, then we'll see what we are still missing.

BTW: I'll add support for single-element ranges (1-1) as the previous
code seemed to accepted that.

-- 

Thanks,

David / dhildenb

Re: [Qemu-devel] [PATCH RFC 3/6] qapi: rewrite string-input-visitor

Reply via email to