Hi On Fri, Jul 20, 2018 at 10:49 AM, Markus Armbruster <arm...@redhat.com> wrote: > Marc-André Lureau <marcandre.lur...@redhat.com> writes: > >> qobject_from_jsonv() returns a single object. Let's make sure that >> during parsing we don't leak an intermediary object. Instead of >> returning the last object, set a parsing error. >> >> Also, the lexer/parser keeps consuming all the data. There might be an >> error set earlier. Let's keep it and not call json_parser_parse() with >> the remaining data. Eventually, we may teach the message parser to >> stop consuming the data. > > Took me a while to figure out what you mean :) > > qobject_from_jsonv() feeds a complete string to the JSON parser, with > json_message_parser_feed(). This actually feeds one character after the > other to the lexer, via json_lexer_feed_char(). > > Whenever a character completes a token, the lexer feeds that token to > the streamer via a callback that is always json_message_process_token(). > > The attentive reader may wonder what it does with trailing characters > that aren't a complete token. More on that below. > > The streamer accumulates tokens, counting various parenthesis. When all > counts are zero or any is negative, the group is complete, and is fed to > the parser via another callback. That callback is parse_json() here. > There are more elsewhere. > > The attentive reader may wonder what it does with trailing tokens that > aren't a complete group. More on that below. > > The callbacks all pass the tokens to json_parser_parse() to do the > actual parsing. Returns the abstract syntax tree as QObject on success. > > Note that the callback can be called any number of times. > > In my opinion, this is over-engineered and under-thought. There's more > flexibility than we're using, and the associated complexity breeds bugs. > > In particular, parse_json() is wrong: > > s->result = json_parser_parse(tokens, s->ap, &s->err); > > If the previous call succeeded, we leak its result. This is the leak > you mentioned. > > If any previous call set an error, we pass &s->err pointing to non-null, > which is forbidden. If json_parser_parse() passed it to error_setg(), > this would crash. Since it passes it only to error_propagate(), all > errors but the first one are ignored. Latent crash bug. > > If the last call succeeds and any previous call failed, the function > simultaneously succeeds and fails: we return both a result and set an > error. > > Let's demonstrate these bugs before we fix them, by inserting the > appended patch before this one. > > Are the other callbacks similarly buggy? Turns out they're okay: > > * handle_qmp_command() consumes result and error on each call. > > * process_event() does, too. > > * qmp_response() treats errors as fatal, and asserts "no previous > response". > > On trailing tokens that don't form a group: they get silently ignored, > as demonstrated by check-qjson test cases unterminated_array(), > unterminated_array_comma(), unterminated_dict(), > unterminated_dict_comma(), all marked BUG. > > On trailing characters that don't form a token: they get silently > ignored, as demonstrated by check-qjson test cases > unterminated_string(), unterminated_sq_string(), unterminated_escape(), > all marked BUG, except when they aren't, as in test case > unterminated_literal(). > > The BUG marks are all gone at the end of this series. I guess you're > fixing all that :) > > Note that these "trailing characters / tokens are silently ignored" bugs > *do* affect the other callbacks. Reproducer for handle_qmp_command(): > > $ echo -e '{ "execute": "qmp_capabilities" }\n{ "execute": "query-name" > }\n"unterminated' | socat UNIX:test-qmp STDIO > {"QMP": {"version": {"qemu": {"micro": 90, "minor": 12, "major": 2}, > "package": "v3.0.0-rc1-20-g6a024cd461"}, "capabilities": ["oob"]}} > {"return": {}} > {"return": {}} > > Note there's no error reported for the last line. > >> Signed-off-by: Marc-André Lureau <marcandre.lur...@redhat.com> >> --- >> qobject/qjson.c | 16 +++++++++++++++- >> tests/check-qjson.c | 11 +++++++++++ >> 2 files changed, 26 insertions(+), 1 deletion(-) >> >> diff --git a/qobject/qjson.c b/qobject/qjson.c >> index ee04e61096..8a9d116150 100644 >> --- a/qobject/qjson.c >> +++ b/qobject/qjson.c >> @@ -22,6 +22,7 @@ >> #include "qapi/qmp/qlist.h" >> #include "qapi/qmp/qnum.h" >> #include "qapi/qmp/qstring.h" >> +#include "qapi/qmp/qerror.h" >> #include "qemu/unicode.h" >> >> typedef struct JSONParsingState >> @@ -36,7 +37,20 @@ static void parse_json(JSONMessageParser *parser, GQueue >> *tokens) >> { >> JSONParsingState *s = container_of(parser, JSONParsingState, parser); >> >> - s->result = json_parser_parse(tokens, s->ap, &s->err); >> + if (s->result || s->err) { >> + if (s->result) { >> + qobject_unref(s->result); >> + s->result = NULL; >> + if (!s->err) { > > Conditional is necessary because a previous call to json_parser_parse() > could have set s->err already. Without it, error_setg() would fail the > assertion in error_setv() then. Subtle. > >> + error_setg(&s->err, QERR_JSON_PARSING); > > Hmm. Is this really "Invalid JSON syntax"? In a way, it is, because > RFC 7159 defines JSON-text as a single value. Still, a friendlier error > message like "Expecting at most one JSON value" woun't hurt. > > What if !tokens? That shouldn't be an error. > >> + } >> + } >> + if (tokens) { >> + g_queue_free_full(tokens, g_free); > > g_queue_free_full(NULL, ...) doesn't work?!? That would be lame.
It warns and return. We could use g_clear_pointer(&tokens, g_queue_free_full)... > >> + } >> + } else { >> + s->result = json_parser_parse(tokens, s->ap, &s->err); >> + } >> } > > What about this: > > JSONParsingState *s = container_of(parser, JSONParsingState, parser); > Error *err = NULL; > > if (!tokens) { > return; > } I would rather leave handling of NULL tokens to json_parser_parse() for consistency with other callers. > if (s->result || s->err) { > if (s->result) { > qobject_unref(s->result); > s->result = NULL; > error_setg(&err, "Expecting at most one JSON value"); > } > g_queue_free_full(tokens, g_free); missing null check > } else { > s->result = json_parser_parse(tokens, s->ap, &err); > } > error_propagate(&s->err, err); how do you ensure s->err is not already set? > assert(!s->result != !s->err); > >> >> QObject *qobject_from_jsonv(const char *string, va_list *ap, Error **errp) >> diff --git a/tests/check-qjson.c b/tests/check-qjson.c >> index da582df3e9..7d3547e0cc 100644 >> --- a/tests/check-qjson.c >> +++ b/tests/check-qjson.c >> @@ -1417,6 +1417,16 @@ static void limits_nesting(void) >> g_assert(obj == NULL); >> } >> >> +static void multiple_objects(void) >> +{ >> + Error *err = NULL; >> + QObject *obj; >> + >> + obj = qobject_from_json("{} {}", &err); >> + g_assert(obj == NULL); >> + error_free_or_abort(&err); >> +} >> + >> int main(int argc, char **argv) >> { >> g_test_init(&argc, &argv, NULL); >> @@ -1454,6 +1464,7 @@ int main(int argc, char **argv) >> g_test_add_func("/errors/invalid_dict_comma", invalid_dict_comma); >> g_test_add_func("/errors/unterminated/literal", unterminated_literal); >> g_test_add_func("/errors/limits/nesting", limits_nesting); >> + g_test_add_func("/errors/multiple_objects", multiple_objects); >> >> return g_test_run(); >> } > > From 18064b774ea7a79a9dbabe5cf75458f0326070d5 Mon Sep 17 00:00:00 2001 > From: Markus Armbruster <arm...@redhat.com> > Date: Fri, 20 Jul 2018 10:22:41 +0200 > Subject: [PATCH] check-qjson: Cover multiple JSON objects in same string > > qobject_from_json() & friends misbehave when the JSON text has more > than one JSON value. Add test coverage to demonstrate the bugs. > > Signed-off-by: Markus Armbruster <arm...@redhat.com> test looks better than mine, should I include it in the series before the fix? > --- > tests/check-qjson.c | 20 ++++++++++++++++++++ > 1 file changed, 20 insertions(+) > > diff --git a/tests/check-qjson.c b/tests/check-qjson.c > index da582df3e9..c5fd439263 100644 > --- a/tests/check-qjson.c > +++ b/tests/check-qjson.c > @@ -1417,6 +1417,25 @@ static void limits_nesting(void) > g_assert(obj == NULL); > } > > +static void multiple_objects(void) > +{ > + Error *err = NULL; > + QObject *obj; > + > + /* BUG this leaks the syntax tree for "false" */ > + obj = qobject_from_json("false true", &err); > + g_assert(qbool_get_bool(qobject_to(QBool, obj))); > + g_assert(!err); > + qobject_unref(obj); > + > + /* BUG simultaneously succeeds and fails */ > + /* BUG calls json_parser_parse() with errp pointing to non-null */ > + obj = qobject_from_json("} true", &err); > + g_assert(qbool_get_bool(qobject_to(QBool, obj))); > + error_free_or_abort(&err); > + qobject_unref(obj); > +} > + > int main(int argc, char **argv) > { > g_test_init(&argc, &argv, NULL); > @@ -1454,6 +1473,7 @@ int main(int argc, char **argv) > g_test_add_func("/errors/invalid_dict_comma", invalid_dict_comma); > g_test_add_func("/errors/unterminated/literal", unterminated_literal); > g_test_add_func("/errors/limits/nesting", limits_nesting); > + g_test_add_func("/errors/multiple_objects", multiple_objects); > > return g_test_run(); > } > -- > 2.17.1 >