On Thu, Apr 09, 2015 at 02:55:45PM -0400, Russell Bryant wrote: > On 04/01/2015 12:52 AM, Ben Pfaff wrote: > Nice work! This is really impressive stuff. Thanks a bunch for the > extensive testing. I'm relying on that pretty heavily for correctness > of most of the logic. I was mostly reading through and leaving style > nits and looking for other subtle things. Feel free to incorporate (or > not) whichever pieces you'd like. > Acked-by: Russell Bryant <rbry...@redhat.com>
Thank you for the review! > > +/* Returns the name of measurement level 'level'. */ > > +const char * > > +expr_level_to_string(enum expr_level level) > > As a general comment, I probably would have used an 'ovn_' prefix for > all the new types and functions exposed by libovn. If I were you, I'd > probably pass on making this change at this point because it would be a > pain. :-) You might be right. I'll think about it. I did do this for OVSDB. It's not really harder to do this after commit, so I'm going to postpone it for now. > > +{ > > + switch (level) { > > + case EXPR_L_NOMINAL: return "nominal"; > > + case EXPR_L_BOOLEAN: return "Boolean"; > > + case EXPR_L_ORDINAL: return "ordinal"; > > + default: OVS_NOT_REACHED(); > > This is a case where I'd drop default so you get the compiler warning if > a new entry is added to the enum but not to this switch statement. The compiler options we use ensure that this happens regardless. Try deleting one of the cases above. You get Clang and GCC diagnostics: ../ovn/expr.c:33:13: error: enumeration value 'EXPR_L_ORDINAL' not explicitly handled in switch [-Werror,-Wswitch-enum] ../ovn/expr.c: In function 'expr_level_to_string': ../ovn/expr.c:33:5: error: enumeration value 'EXPR_L_ORDINAL' not handled in switch [-Werror=switch-enum] > I won't repeat this same comment for any other cases in this patch, but > it generally applies to any of the default: OVS_NOT_REACHED() cases. We can't really delete the OVS_NOT_REACHED() cases because then GCC gives a different diagnostic: ../ovn/expr.c: In function 'expr_level_to_string': ../ovn/expr.c:38:1: error: control reaches end of non-void function [-Werror=return-type] > > + } > > +} > > + > > +bool > > +expr_relop_from_token(enum lex_type type, enum expr_relop *relop) > > +{ > > + enum expr_relop r; > > + > > + switch ((int) type) { > > I'm curious why the cast was needed here. Oh, that suppresses warnings related to the point above, which aren't really helpful here. Try removing the cast: ../ovn/expr.c:63:13: error: 19 enumeration values not explicitly handled in switch: 'LEX_T_END', 'LEX_T_ID', 'LEX_T_STRING'... [-Werror,-Wswitch-enum] or ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_END' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_ID' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_STRING' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_INTEGER' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_MASKED_INTEGER' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_ERROR' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_LPAREN' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_RPAREN' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_LCURLY' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_RCURLY' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_LSQUARE' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_RSQUARE' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_LOG_NOT' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_LOG_AND' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_LOG_OR' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_ELLIPSIS' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_COMMA' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_SEMICOLON' not handled in switch [-Werror=switch-enum] ../ovn/expr.c:63:5: error: enumeration value 'LEX_T_EQUALS' not handled in switch [-Werror=switch-enum] It didn't seem worth listing cases for all the non-relational operator tokens here. > > +/* Constructing and manipulating expressions. */ > > + > > +/* Creates and returns a logical AND or OR expression (according to 'type', > > + * which must be EXPR_T_AND or EXPR_T_OR) that initially has no > > + * sub-expressions. (To satisfy the invariants for expressions, the caller > > + * must add at least two sub-expressions whose types are different from > > + * 'type'.) */ > > +struct expr * > > +expr_create_andor(enum expr_type type) > > +{ > > + struct expr *e = xmalloc(sizeof *e); > > + e->type = type; > > + list_init(&e->andor); > > + return e; > > This leaves e->node uninitialized. Maybe use xzalloc() instead? I see limited value in initializing the node member, since it isn't meaningful outside of the context of some parent node. > > +/* Returns an EXPR_T_BOOLEAN expression with value 'b'. */ > > +struct expr * > > +expr_create_boolean(bool b) > > +{ > > + struct expr *e = xmalloc(sizeof *e); > > Maybe use xzalloc() here to avoid e->node being uninitialized? Same here. > > + e->type = EXPR_T_BOOLEAN; > > + e->boolean = b; > > + return e; > > +} > > > +static void > > +expr_format_string(const char *s, struct ds *ds) > > +{ > > + struct json json; > > + json.type = JSON_STRING; > > + json.u.string = CONST_CAST(char *, s); > > This could be: > > struct json json = { > .type = JSON_STRING, > .u.string = CONST_CAST(char *, s), > }; > > It doesn't make a difference now, but would ensure the whole struct is > initialized if more was ever added to struct json. Sure, good idea, I made that change. I made the same change to lex_token_format_string(), which is a cut-and-paste of the same code. (I have another patch, which I'm not sure I've posted yet, that unifies 3 copies of this same snippet in the tree.) > > + json_to_ds(&json, 0, ds); > > +} > > + > > +static void > > +expr_format_cmp(const struct expr *e, struct ds *s) > > +{ > > + if (e->cmp.symbol->width) { > > really minor style nit ... I'd be tempted to change this to: > > if (!e->cmp.symbol->width) { > ds_put_format(s, "%s %s ", e->cmp.symbol->name, > expr_relop_to_string(e->cmp.relop)); > expr_format_string(e->cmp.string, s); > return; > } > > and then bring the majority of the function back to not being nested. I don't care much either way, so I made this change. > > + int ofs, n; > > + > > + find_bitwise_range(&e->cmp.mask, e->cmp.symbol->width, &ofs, &n); > > + if (n == 1 > > + && (e->cmp.relop == EXPR_R_EQ || e->cmp.relop == EXPR_R_NE)) { > > + bool positive; > > + > > + positive = bitwise_get_bit(&e->cmp.value, sizeof e->cmp.value, > > + ofs); > > + positive ^= e->cmp.relop == EXPR_R_NE; > > + if (!positive) { > > + ds_put_char(s, '!'); > > + } > > + ds_put_cstr(s, e->cmp.symbol->name); > > + if (e->cmp.symbol->width > 1) { > > + ds_put_format(s, "[%d]", ofs); > > + } > > + return; > > + } > > + > > + ds_put_cstr(s, e->cmp.symbol->name); > > + if (n > 0 && n < e->cmp.symbol->width) { > > + if (n > 1) { > > + ds_put_format(s, "[%d..%d]", ofs, ofs + n - 1); > > + } else { > > + ds_put_format(s, "[%d]", ofs); > > + } > > + } > > + > > + ds_put_format(s, " %s ", expr_relop_to_string(e->cmp.relop)); > > + > > + if (n) { > > + union mf_subvalue value; > > + > > + memset(&value, 0, sizeof value); > > I prefer something like the following: > > union mf_subvalue value = { > .integer = 0, > }; > > It results in the same thing as the memset(), but gives the opportunity > for the memset to be optimized out if possible. There are some other > cases where the same change could be made if you like it. I do tend to forget about the .member initialization syntax, because OVS style didn't allow it until recently (now that recent Visual C supports it), so I do underuse it. But there are a couple of reasons why I don't think it's the best choice here. First, it reads to the programmer as if I only cared about the 'integer' member, when really I want the whole thing zero-initialized. It does read to the compiler as "initialize all the members to zero" but in a case like this, that always makes me go look and figure out whether there is any non-member padding that needs to be zero-initialized (the compiler isn't obligated to do that and I don't know whether Clang and GCC and MSVC consistently do). Second, the C standard has some pretty fussy "aliasing" requirements for unions. If you write one member, you're strictly supposed to only read from that member. GCC, in some modes, will screw up your code if you don't pay attention to that rule, and we've seen it happen, see the following commits: commit 538919d3a672eab8a561048acf9a8e1721cd559c Author: Ben Pfaff <b...@nicira.com> Date: Tue Sep 30 17:03:07 2014 -0700 configure: Disable strict aliasing. The C standard allows compilers to do type-based alias analysis, which means that the compiler is allowed to assume that pointers to objects of different types are pointers to different objects. For example, a compiler may assume that "uint16_t *a" and "uint32_t *b" point to different and nonoverlapping locations because the pointed-to types are different. This can lead to surprising "optimizations" with compilers that by default do this kind of analysis, which includes GCC and Clang. The one escape clause that the C standard gives us is that character types must be assumed to alias any other object. We've always tried to use this escape clause to avoid problems with type-based alias analysis in the past. I think that we should continue to try to do this in the future. It's hard to tell what compiler we might want to use in the future, and one never knows what kind of control that compiler allows over alias analysis. However, recently I helped another developer debug a nasty and confusing issue, which turned out to be the result of a surprising compiler optimization due to alias analysis. I've seen enough of these that I don't think it's worthwhile to risk more problems than we have to. Thus, this commit turns off type-based alias analysis in GCC and Clang. Linus Torvalds thinks that type-base alias analysis is not sane, at least as GCC implements it: https://lkml.org/lkml/2003/2/26/158 The GCC manual says that -Wstrict-aliasing is only effective without -fno-strict-aliasing, otherwise I'd keep -Wstrict-aliasing also. Indications are that MSVC doesn't do type-based alias analysis by default. Signed-off-by: Ben Pfaff <b...@nicira.com> Acked-by: Jarno Rajahalme <jrajaha...@nicira.com> commit 02334943a07e2e9a40e350397192d7793b9962f4 Author: Jarno Rajahalme <jrajaha...@nicira.com> Date: Tue Aug 5 13:51:19 2014 -0700 Fix strict aliasing violations with GCC 4.1 and 4.4. The typical use of struct sockaddr_storage is flagged as a strict aliasing violation by GCC 4.4.7. Using an explicit union lets the compiler know that accessing the same location via different types is not an error. GCC 4.1.2 had a similar complaint about a cast of ukey's key_buf to nlattr. After this patch there are no further warnings with the XenServer build, so we could start treating warnings as errors in the builds. Signed-off-by: Jarno Rajahalme <jrajaha...@nicira.com> Acked-by: Ben Pfaff <b...@nicira.com> I see that the lkml.org link above is dead, but it's in the Wayback Machine: https://web.archive.org/web/20150130022255/https://lkml.org/lkml/2003/2/26/158 Since memset() is character-based, it's safer than initializing a non-character-based union member. Finally, I don't know why the compiler would be better or worse at memset() versus initialization. GCC understands what memset() does, so I think it ought to be able to do just as good a job. Also, old versions of GCC (3.x?) had pretty bad reputations regarding complicated initializers; I don't have citations, but I remember Dave Miller on netdev telling people not to use them because they generated bad code. So, while I like member-based initialization better than memset() in general, I don't think that this is a good place to substitute it. > > +static void OVS_PRINTF_FORMAT(2, 3) > > +expr_error(struct expr_context *ctx, const char *message, ...) > > +{ > > + if (!ctx->error) { > > + if (ctx->lexer->token.type == LEX_T_ERROR) { > > + ctx->error = xstrdup(ctx->lexer->token.s); > > + } else { > > + va_list args; > > + va_start(args, message); > > + ctx->error = xvasprintf(message, args); > > + va_end(args); > > + } > > + } > > +} > > This will silently drop the error message if expr_error() is called > twice. It avoids a mem leak, which is great, but I wonder if a > VLOG_WARN would be good here. > > I suppose another alternative would be to make it stack error messages. > It could append something like: "(next error: %s)". I figured that dropping the later error messages was a feature, since the cascade seemed to me to be unlikely to be useful. > > +static void OVS_PRINTF_FORMAT(2, 3) > > +expr_syntax_error(struct expr_context *ctx, const char *message, ...) > > +{ > > + if (!ctx->error) { > > + if (ctx->lexer->token.type == LEX_T_ERROR) { > > + ctx->error = xstrdup(ctx->lexer->token.s); > > + } else { > > + struct ds s; > > + > > + ds_init(&s); > > + ds_put_cstr(&s, "Syntax error "); > > + if (ctx->lexer->token.type == LEX_T_END) { > > + ds_put_cstr(&s, "at end of input "); > > + } else if (ctx->lexer->start) { > > + ds_put_format(&s, "at `%.*s' ", > > + (int) (ctx->lexer->input - > > ctx->lexer->start), > > + ctx->lexer->start); > > + } > > + > > + va_list args; > > + va_start(args, message); > > + ds_put_format_valist(&s, message, args); > > + va_end(args); > > + > > + ctx->error = ds_steal_cstr(&s); > > + } > > + } > > +} > > You could reduce nesting in the 2 functions above with: > > if (ctx->error) { > return; > } I guess they have a little common code too. Thanks, I folded this in (with "diff -w" to ignore white space changes): @@ -460,28 +460,44 @@ static void expr_not(struct expr *); static void expr_constant_set_destroy(struct expr_constant_set *); static bool parse_field(struct expr_context *, struct expr_field *); -static void OVS_PRINTF_FORMAT(2, 3) -expr_error(struct expr_context *ctx, const char *message, ...) +static bool +expr_error_handle_common(struct expr_context *ctx) { - if (!ctx->error) { - if (ctx->lexer->token.type == LEX_T_ERROR) { + if (ctx->error) { + /* Already have an error, suppress this one since the cascade seems + * unlikely to be useful. */ + return true; + } else if (ctx->lexer->token.type == LEX_T_ERROR) { + /* The lexer signaled an error. Nothing at the expression level + * accepts an error token, so we'll inevitably end up here with some + * meaningless parse error. Report the lexical error instead. */ ctx->error = xstrdup(ctx->lexer->token.s); + return true; } else { + return false; + } +} + +static void OVS_PRINTF_FORMAT(2, 3) +expr_error(struct expr_context *ctx, const char *message, ...) +{ + if (expr_error_handle_common(ctx)) { + return; + } + va_list args; va_start(args, message); ctx->error = xvasprintf(message, args); va_end(args); } - } -} static void OVS_PRINTF_FORMAT(2, 3) expr_syntax_error(struct expr_context *ctx, const char *message, ...) { - if (!ctx->error) { - if (ctx->lexer->token.type == LEX_T_ERROR) { - ctx->error = xstrdup(ctx->lexer->token.s); - } else { + if (expr_error_handle_common(ctx)) { + return; + } + struct ds s; ds_init(&s); @@ -501,8 +517,6 @@ expr_syntax_error(struct expr_context *ctx, const char *message, ...) ctx->error = ds_steal_cstr(&s); } - } -} static struct expr * make_cmp__(const struct expr_field *f, enum expr_relop r, > > +static bool > > +parse_constant_set(struct expr_context *ctx, struct expr_constant_set *cs) > > +{ > > + size_t allocated_values = 0; > > + bool ok; > > + > > + memset(cs, 0, sizeof *cs); > > I went looking around the code seeing if there was any possibility of > leaking anything here. Perhaps add a comment above the function that > clarifies the precondition that cs is uninitialized (or at least empty), > as this function will initialize it. > > An expr_constant_set_init() might be nice for more consistency, bit it > may be overkill to do that for everything. I added a comment: /* Parses a single or {}-enclosed set of integer or string constants into 'cs', * which the caller need not have initialized. Returns true on success, in * which case the caller owns 'cs', false on failure, in which case 'cs' is * indeterminate. */ > > + if (list_is_empty(&expr->andor)) { > > + if (is_all_zeros(&mask, sizeof mask)) { > > + expr_destroy(expr); > > + return expr_create_boolean(true); > > + } else { > > + struct expr *cmp; > > + cmp = xmalloc(sizeof *cmp); > > + cmp->type = EXPR_T_CMP; > > + cmp->cmp.symbol = symbol; > > + cmp->cmp.relop = EXPR_R_EQ; > > + cmp->cmp.value = value; > > + cmp->cmp.mask = mask; > > + expr_destroy(expr); > > + return cmp; > > This code leaves part of cmp unitialized (cmp->node). I guess I just think of 'node' as owned by the parent of an expression, not by the expression itself, so I don't bother to initialize it until it's a subexpression of something else. > > +static struct expr_match * > > +expr_match_new(const struct match *m, uint8_t clause, uint8_t n_clauses, > > + uint32_t conj_id) > > +{ > > + struct expr_match *match = xmalloc(sizeof *match); > > How about xzalloc() here? > > if m is NULL, match->match is uninitialized. I didn't want to initialize match->match if the caller doesn't know what it is yet (because "struct match" is large--hundreds of bytes), so I left that possibility. However, both callers that do that first call match_init_catchall(&match->match), so I guess we might as well do that in expr_match_new(). I folded this in: @@ -2048,6 +2066,17 @@ expr_normalize(struct expr *expr) OVS_NOT_REACHED(); } +/* Creates, initializes, and returns a new 'struct expr_match'. If 'm' is + * nonnull then it is copied into the new expr_match, otherwise the new + * expr_match's 'match' member is initialized to a catch-all match for the + * caller to refine in-place. + * + * If 'conj_id' is nonzero, adds one conjunction based on 'conj_id', 'clause', + * and 'n_clauses' to the returned 'struct expr_match', otherwise the + * expr_match will not have any conjunctions. + * + * The caller should use expr_match_add() to add the expr_match to a hash table + * after it is finalized. */ static struct expr_match * expr_match_new(const struct match *m, uint8_t clause, uint8_t n_clauses, uint32_t conj_id) @@ -2055,6 +2084,8 @@ expr_match_new(const struct match *m, uint8_t clause, uint8_t n_clauses, struct expr_match *match = xmalloc(sizeof *match); if (m) { match->match = *m; + } else { + match_init_catchall(&m->match); } if (conj_id) { match->conjunctions = xmalloc(sizeof *match->conjunctions); @@ -2071,6 +2102,11 @@ expr_match_new(const struct match *m, uint8_t clause, uint8_t n_clauses, return match; } +/* Adds 'match' to hash table 'matches', which becomes the new owner of + * 'match'. + * + * This might actually destroy 'match' because it gets merged together with + * some existing conjunction.*/ static void expr_match_add(struct hmap *matches, struct expr_match *match) { @@ -2175,7 +2211,6 @@ static void add_cmp_flow(const struct expr *cmp, struct hmap *matches) { struct expr_match *m = expr_match_new(NULL, 0, 0, 0); - match_init_catchall(&m->match); constrain_match(cmp, &m->match); expr_match_add(matches, m); } @@ -2221,7 +2256,6 @@ expr_to_matches(const struct expr *expr, struct hmap *matches) case EXPR_T_BOOLEAN: if (expr->boolean) { struct expr_match *m = expr_match_new(NULL, 0, 0, 0); - match_init_catchall(&m->match); expr_match_add(matches, m); } else { /* No match. */ > All cases leave match->hmap_node uninitialized. I'm comfortable with that, it's owned by the parent hash table not by the node itself. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev