Re: [PATCH] c-family: Improve MEM_REF printing for diagnostics [PR98597]

Richard Biener Wed, 13 Jan 2021 23:44:04 -0800

On Wed, 13 Jan 2021, Jakub Jelinek wrote:

> Hi!
> 
> The following patch doesn't actually fix the print_mem_ref bugs, I've kept
> it for now as broken as it was, but at least improves the cases where
> we can unambiguously map back MEM[&something + off] into some particular
> reference (e.g. something.foo[1].bar etc.).
> In the distant past I think we were folding such MEM_REFs back to
> COMPONENT_REFs and ARRAY_REFs, but we've stopped doing that.


Yeah, because it has different semantics - *(((int *)t + 3)
accesses an int object while t.u.b accesses a 't' object from the TBAA
perspective.

>  But for
> diagnostics that is what the user actually want to see IMHO.
> So on the attached testcase, instead of printing what is in left column
> it prints what is in right column:
> ((int*)t) + 3         t.u.b
> ((int*)t) + 6         t.u.e.i
> ((int*)t) + 8         t.v
> s + 1                 s[1]

so while that's "nice" in general, for TBAA diagnostics it might actually
be misleading.

I wonder whether we absolutely need to print a C expression here.
We could print, instead of *((int *)t + 3), "access to a memory
object of type 'int' at offset 12 bytes from 't'", thus explain
in plain english.

That said, *((int *)t + 3) is exactly what the access is,
semantically.  There's the case of a mismatch of the access type
and the TBAA type which we cannot write down in C terms but maybe
we want to have a builtin for this?  __builtin_access (ptr, lvalue-type,
tbaa-type)?

> Of course, print_mem_ref needs to be also fixed to avoid printing the
> nonsense it is printing right now, t is a structure type, so it can't be
> cast to int* in C and in C++ only using some user operator, and
> the result of what it printed is a pointer, while the uninitialized reads
> are int.
> 
> I was hoping Martin would fix that, but given his comment in the PR I think
> I'll fix it myself tomorrow.
> 
> Anyway, this patch is useful on its own.  Bootstrapped/regtested on
> x86_64-linux and i686-linux, ok for trunk?

In the light of Martins patch this is probably reasonable but still
the general direction is wrong (which is why I didn't approve Martins
original patch).  I'm also somewhat disappointed we're breaking this
so late in the cycle.

c_fold_indirect_ref_for_warn doesn't look like it is especially
careful about error recovery issues (error_mark_node in random
places of the trees).  Maybe that never happens.

Thanks,
Richard.

> 2021-01-13  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR tree-optimization/98597
>       * c-pretty-print.c (c_fold_indirect_ref_for_warn): New function.
>       (print_mem_ref): Call it.
> 
>       * gcc.dg/uninit-40.c: New test.
> 
> --- gcc/c-family/c-pretty-print.c.jj  2021-01-13 08:02:09.425498954 +0100
> +++ gcc/c-family/c-pretty-print.c     2021-01-13 15:02:57.860329998 +0100
> @@ -1809,6 +1809,113 @@ pp_c_call_argument_list (c_pretty_printe
>    pp_c_right_paren (pp);
>  }
>  
> +/* Try to fold *(type *)&op into op.fld.fld2[1] if possible.
> +   Only used for printing expressions.  Should punt if ambiguous
> +   (e.g. in unions).  */
> +
> +static tree
> +c_fold_indirect_ref_for_warn (location_t loc, tree type, tree op,
> +                           offset_int &off)
> +{
> +  tree optype = TREE_TYPE (op);
> +  if (off == 0)
> +    {
> +      if (lang_hooks.types_compatible_p (optype, type))
> +     return op;
> +      /* *(foo *)&complexfoo => __real__ complexfoo */
> +      else if (TREE_CODE (optype) == COMPLEX_TYPE
> +            && lang_hooks.types_compatible_p (type, TREE_TYPE (optype)))
> +     return build1_loc (loc, REALPART_EXPR, type, op);
> +    }
> +  /* ((foo*)&complexfoo)[1] => __imag__ complexfoo */
> +  else if (TREE_CODE (optype) == COMPLEX_TYPE
> +        && lang_hooks.types_compatible_p (type, TREE_TYPE (optype))
> +        && tree_to_uhwi (TYPE_SIZE_UNIT (type)) == off)
> +    {
> +      off = 0;
> +      return build1_loc (loc, IMAGPART_EXPR, type, op);
> +    }
> +  /* ((foo *)&fooarray)[x] => fooarray[x] */
> +  if (TREE_CODE (optype) == ARRAY_TYPE
> +      && TYPE_SIZE_UNIT (TREE_TYPE (optype))
> +      && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (optype))) == INTEGER_CST
> +      && !integer_zerop (TYPE_SIZE_UNIT (TREE_TYPE (optype))))
> +    {
> +      tree type_domain = TYPE_DOMAIN (optype);
> +      tree min_val = size_zero_node;
> +      if (type_domain && TYPE_MIN_VALUE (type_domain))
> +     min_val = TYPE_MIN_VALUE (type_domain);
> +      offset_int el_sz = wi::to_offset (TYPE_SIZE_UNIT (TREE_TYPE (optype)));
> +      offset_int idx = off / el_sz;
> +      offset_int rem = off % el_sz;
> +      if (TREE_CODE (min_val) == INTEGER_CST)
> +     {
> +       tree index
> +         = wide_int_to_tree (sizetype, idx + wi::to_offset (min_val));
> +       op = build4_loc (loc, ARRAY_REF, TREE_TYPE (optype), op, index,
> +                        NULL_TREE, NULL_TREE);
> +       off = rem;
> +       if (tree ret = c_fold_indirect_ref_for_warn (loc, type, op, off))
> +         return ret;
> +       return op;
> +     }
> +    }
> +  /* ((foo *)&struct_with_foo_field)[x] => COMPONENT_REF */
> +  else if (TREE_CODE (optype) == RECORD_TYPE)
> +    {
> +      for (tree field = TYPE_FIELDS (optype);
> +        field; field = DECL_CHAIN (field))
> +     if (TREE_CODE (field) == FIELD_DECL
> +         && TREE_TYPE (field) != error_mark_node
> +         && TYPE_SIZE_UNIT (TREE_TYPE (field))
> +         && TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (field))) == INTEGER_CST)
> +       {
> +         tree pos = byte_position (field);
> +         if (TREE_CODE (pos) != INTEGER_CST)
> +           continue;
> +         offset_int upos = wi::to_offset (pos);
> +         offset_int el_sz
> +           = wi::to_offset (TYPE_SIZE_UNIT (TREE_TYPE (field)));
> +         if (upos <= off && off < upos + el_sz)
> +           {
> +             tree cop = build3_loc (loc, COMPONENT_REF, TREE_TYPE (field),
> +                                    op, field, NULL_TREE);
> +             off = off - upos;
> +             if (tree ret = c_fold_indirect_ref_for_warn (loc, type, cop,
> +                                                          off))
> +               return ret;
> +             return cop;
> +           }
> +       }
> +    }
> +  /* Similarly for unions, but in this case try to be very conservative,
> +     only match if some field has type compatible with type and it is the
> +     only such field.  */
> +  else if (TREE_CODE (optype) == UNION_TYPE)
> +    {
> +      tree fld = NULL_TREE;
> +      for (tree field = TYPE_FIELDS (optype);
> +        field; field = DECL_CHAIN (field))
> +     if (TREE_CODE (field) == FIELD_DECL
> +         && TREE_TYPE (field) != error_mark_node
> +         && lang_hooks.types_compatible_p (TREE_TYPE (field), type))
> +       {
> +         if (fld)
> +           return NULL_TREE;
> +         else
> +           fld = field;
> +       }
> +      if (fld)
> +     {
> +       off = 0;
> +       return build3_loc (loc, COMPONENT_REF, TREE_TYPE (fld), op, fld,
> +                          NULL_TREE);
> +     }
> +    }
> +
> +  return NULL_TREE;
> +}
> +
>  /* Print the MEM_REF expression REF, including its type and offset.
>     Apply casts as necessary if the type of the access is different
>     from the type of the accessed object.  Produce compact output
> @@ -1836,6 +1943,17 @@ print_mem_ref (c_pretty_printer *pp, tre
>    const bool addr = TREE_CODE (arg) == ADDR_EXPR;
>    if (addr)
>      {
> +      tree op
> +     = c_fold_indirect_ref_for_warn (EXPR_LOCATION (e), TREE_TYPE (e),
> +                                     TREE_OPERAND (arg, 0), byte_off);
> +      if (op
> +       && byte_off == 0
> +       && lang_hooks.types_compatible_p (TREE_TYPE (e), TREE_TYPE (op)))
> +     {
> +       pp->expression (op);
> +       return;
> +     }
> +      byte_off = wi::to_offset (TREE_OPERAND (e, 1));
>        arg = TREE_OPERAND (arg, 0);
>        if (byte_off == 0)
>       {
> --- gcc/testsuite/gcc.dg/uninit-40.c.jj       2021-01-13 15:23:47.366134905 
> +0100
> +++ gcc/testsuite/gcc.dg/uninit-40.c  2021-01-13 15:23:15.208500279 +0100
> @@ -0,0 +1,31 @@
> +/* PR tree-optimization/98597 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -Wuninitialized" } */
> +
> +union U { double d; int i; float f; };
> +struct S { char a; int b; char c; unsigned d; union U e; };
> +struct T { char t; struct S u; int v; };
> +typedef short U[2][2];
> +void baz (U *);
> +
> +static inline int
> +bar (char *p)
> +{
> +  return *(int *) p;
> +}
> +
> +void
> +foo (int *q)
> +{
> +  struct T t;
> +  t.t = 1;
> +  t.u.c = 2;
> +  char *pt = (char *) &t;
> +  q[0] = bar (pt + __builtin_offsetof (struct T, u.b));      /* { dg-warning 
> "'t\\.u\\.b' is used uninitialized" } */
> +  q[1] = bar (pt + __builtin_offsetof (struct T, u.e));      /* { dg-warning 
> "'t\\.u\\.e\\.i' is used uninitialized" } */
> +  q[2] = bar (pt + __builtin_offsetof (struct T, v));        /* { dg-warning 
> "'t\\.v' is used uninitialized" } */
> +  int s[3];
> +  s[0] = 1;
> +  char *ps = (char *) s;
> +  q[3] = bar (ps + sizeof (int));                    /* { dg-warning 
> "'s\\\[1\\\]' is used uninitialized" } */
> +}
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: [PATCH] c-family: Improve MEM_REF printing for diagnostics [PR98597]

Reply via email to