On Wed, 13 Jan 2021, Jakub Jelinek wrote: > Hi! > > The following patch doesn't actually fix the print_mem_ref bugs, I've kept > it for now as broken as it was, but at least improves the cases where > we can unambiguously map back MEM[&something + off] into some particular > reference (e.g. something.foo[1].bar etc.). > In the distant past I think we were folding such MEM_REFs back to > COMPONENT_REFs and ARRAY_REFs, but we've stopped doing that.
Yeah, because it has different semantics - *(((int *)t + 3) accesses an int object while t.u.b accesses a 't' object from the TBAA perspective. > But for > diagnostics that is what the user actually want to see IMHO. > So on the attached testcase, instead of printing what is in left column > it prints what is in right column: > ((int*)t) + 3 t.u.b > ((int*)t) + 6 t.u.e.i > ((int*)t) + 8 t.v > s + 1 s[1] so while that's "nice" in general, for TBAA diagnostics it might actually be misleading. I wonder whether we absolutely need to print a C expression here. We could print, instead of *((int *)t + 3), "access to a memory object of type 'int' at offset 12 bytes from 't'", thus explain in plain english. That said, *((int *)t + 3) is exactly what the access is, semantically. There's the case of a mismatch of the access type and the TBAA type which we cannot write down in C terms but maybe we want to have a builtin for this? __builtin_access (ptr, lvalue-type, tbaa-type)? > Of course, print_mem_ref needs to be also fixed to avoid printing the > nonsense it is printing right now, t is a structure type, so it can't be > cast to int* in C and in C++ only using some user operator, and > the result of what it printed is a pointer, while the uninitialized reads > are int. > > I was hoping Martin would fix that, but given his comment in the PR I think > I'll fix it myself tomorrow. > > Anyway, this patch is useful on its own. Bootstrapped/regtested on > x86_64-linux and i686-linux, ok for trunk? In the light of Martins patch this is probably reasonable but still the general direction is wrong (which is why I didn't approve Martins original patch). I'm also somewhat disappointed we're breaking this so late in the cycle. c_fold_indirect_ref_for_warn doesn't look like it is especially careful about error recovery issues (error_mark_node in random places of the trees). Maybe that never happens. Thanks, Richard. > 2021-01-13 Jakub Jelinek <ja...@redhat.com> > > PR tree-optimization/98597 > * c-pretty-print.c (c_fold_indirect_ref_for_warn): New function. > (print_mem_ref): Call it. > > * gcc.dg/uninit-40.c: New test. > > --- gcc/c-family/c-pretty-print.c.jj 2021-01-13 08:02:09.425498954 +0100 > +++ gcc/c-family/c-pretty-print.c 2021-01-13 15:02:57.860329998 +0100 > @@ -1809,6 +1809,113 @@ pp_c_call_argument_list (c_pretty_printe > pp_c_right_paren (pp); > } > > +/* Try to fold *(type *)&op into op.fld.fld2[1] if possible. > + Only used for printing expressions. Should punt if ambiguous > + (e.g. in unions). */ > + > +static tree > +c_fold_indirect_ref_for_warn (location_t loc, tree type, tree op, > + offset_int &off) > +{ > + tree optype = TREE_TYPE (op); > + if (off == 0) > + { > + if (lang_hooks.types_compatible_p (optype, type)) > + return op; > + /* *(foo *)&complexfoo => __real__ complexfoo */ > + else if (TREE_CODE (optype) == COMPLEX_TYPE > + && lang_hooks.types_compatible_p (type, TREE_TYPE (optype))) > + return build1_loc (loc, REALPART_EXPR, type, op); > + } > + /* ((foo*)&complexfoo)[1] => __imag__ complexfoo */ > + else if (TREE_CODE (optype) == COMPLEX_TYPE > + && lang_hooks.types_compatible_p (type, TREE_TYPE (optype)) > + && tree_to_uhwi (TYPE_SIZE_UNIT (type)) == off) > + { > + off = 0; > + return build1_loc (loc, IMAGPART_EXPR, type, op); > + } > + /* ((foo *)&fooarray)[x] => fooarray[x] */ > + if (TREE_CODE (optype) == ARRAY_TYPE > + && TYPE_SIZE_UNIT (TREE_TYPE (optype)) > + && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (optype))) == INTEGER_CST > + && !integer_zerop (TYPE_SIZE_UNIT (TREE_TYPE (optype)))) > + { > + tree type_domain = TYPE_DOMAIN (optype); > + tree min_val = size_zero_node; > + if (type_domain && TYPE_MIN_VALUE (type_domain)) > + min_val = TYPE_MIN_VALUE (type_domain); > + offset_int el_sz = wi::to_offset (TYPE_SIZE_UNIT (TREE_TYPE (optype))); > + offset_int idx = off / el_sz; > + offset_int rem = off % el_sz; > + if (TREE_CODE (min_val) == INTEGER_CST) > + { > + tree index > + = wide_int_to_tree (sizetype, idx + wi::to_offset (min_val)); > + op = build4_loc (loc, ARRAY_REF, TREE_TYPE (optype), op, index, > + NULL_TREE, NULL_TREE); > + off = rem; > + if (tree ret = c_fold_indirect_ref_for_warn (loc, type, op, off)) > + return ret; > + return op; > + } > + } > + /* ((foo *)&struct_with_foo_field)[x] => COMPONENT_REF */ > + else if (TREE_CODE (optype) == RECORD_TYPE) > + { > + for (tree field = TYPE_FIELDS (optype); > + field; field = DECL_CHAIN (field)) > + if (TREE_CODE (field) == FIELD_DECL > + && TREE_TYPE (field) != error_mark_node > + && TYPE_SIZE_UNIT (TREE_TYPE (field)) > + && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (field))) == INTEGER_CST) > + { > + tree pos = byte_position (field); > + if (TREE_CODE (pos) != INTEGER_CST) > + continue; > + offset_int upos = wi::to_offset (pos); > + offset_int el_sz > + = wi::to_offset (TYPE_SIZE_UNIT (TREE_TYPE (field))); > + if (upos <= off && off < upos + el_sz) > + { > + tree cop = build3_loc (loc, COMPONENT_REF, TREE_TYPE (field), > + op, field, NULL_TREE); > + off = off - upos; > + if (tree ret = c_fold_indirect_ref_for_warn (loc, type, cop, > + off)) > + return ret; > + return cop; > + } > + } > + } > + /* Similarly for unions, but in this case try to be very conservative, > + only match if some field has type compatible with type and it is the > + only such field. */ > + else if (TREE_CODE (optype) == UNION_TYPE) > + { > + tree fld = NULL_TREE; > + for (tree field = TYPE_FIELDS (optype); > + field; field = DECL_CHAIN (field)) > + if (TREE_CODE (field) == FIELD_DECL > + && TREE_TYPE (field) != error_mark_node > + && lang_hooks.types_compatible_p (TREE_TYPE (field), type)) > + { > + if (fld) > + return NULL_TREE; > + else > + fld = field; > + } > + if (fld) > + { > + off = 0; > + return build3_loc (loc, COMPONENT_REF, TREE_TYPE (fld), op, fld, > + NULL_TREE); > + } > + } > + > + return NULL_TREE; > +} > + > /* Print the MEM_REF expression REF, including its type and offset. > Apply casts as necessary if the type of the access is different > from the type of the accessed object. Produce compact output > @@ -1836,6 +1943,17 @@ print_mem_ref (c_pretty_printer *pp, tre > const bool addr = TREE_CODE (arg) == ADDR_EXPR; > if (addr) > { > + tree op > + = c_fold_indirect_ref_for_warn (EXPR_LOCATION (e), TREE_TYPE (e), > + TREE_OPERAND (arg, 0), byte_off); > + if (op > + && byte_off == 0 > + && lang_hooks.types_compatible_p (TREE_TYPE (e), TREE_TYPE (op))) > + { > + pp->expression (op); > + return; > + } > + byte_off = wi::to_offset (TREE_OPERAND (e, 1)); > arg = TREE_OPERAND (arg, 0); > if (byte_off == 0) > { > --- gcc/testsuite/gcc.dg/uninit-40.c.jj 2021-01-13 15:23:47.366134905 > +0100 > +++ gcc/testsuite/gcc.dg/uninit-40.c 2021-01-13 15:23:15.208500279 +0100 > @@ -0,0 +1,31 @@ > +/* PR tree-optimization/98597 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -Wuninitialized" } */ > + > +union U { double d; int i; float f; }; > +struct S { char a; int b; char c; unsigned d; union U e; }; > +struct T { char t; struct S u; int v; }; > +typedef short U[2][2]; > +void baz (U *); > + > +static inline int > +bar (char *p) > +{ > + return *(int *) p; > +} > + > +void > +foo (int *q) > +{ > + struct T t; > + t.t = 1; > + t.u.c = 2; > + char *pt = (char *) &t; > + q[0] = bar (pt + __builtin_offsetof (struct T, u.b)); /* { dg-warning > "'t\\.u\\.b' is used uninitialized" } */ > + q[1] = bar (pt + __builtin_offsetof (struct T, u.e)); /* { dg-warning > "'t\\.u\\.e\\.i' is used uninitialized" } */ > + q[2] = bar (pt + __builtin_offsetof (struct T, v)); /* { dg-warning > "'t\\.v' is used uninitialized" } */ > + int s[3]; > + s[0] = 1; > + char *ps = (char *) s; > + q[3] = bar (ps + sizeof (int)); /* { dg-warning > "'s\\\[1\\\]' is used uninitialized" } */ > +} > > Jakub > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)