> Am 18.03.2022 um 18:01 schrieb Jakub Jelinek via Gcc-patches 
> <gcc-patches@gcc.gnu.org>:
> 
> Hi!
> 
> Starting with GCC11 we keep emitting false positive -Warray-bounds or
> -Wstringop-overflow etc. warnings on widely used *(type *)0x12345000
> style accesses (or memory/string routines to such pointers).
> This is a standard programming style supported by all C/C++ compilers
> I've ever tried, used mostly in kernel or DSP programming, but sometimes
> also together with mmap MAP_FIXED when certain things, often I/O registers
> but could be anything else too are known to be present at fixed
> addresses.
> 
> Such INTEGER_CST addresses can appear in code either because a user
> used it like that (in which case it is fine) or because somebody used
> pointer arithmetics (including &((struct whatever *)NULL)->field) on
> a NULL pointer.  The middle-end warning code wrongly assumes that the
> latter case is what is very likely, while the former is unlikely and
> users should change their code.
> 
> The following patch attempts to differentiate between the two, but
> because we are in stage4, does it in a temporary compromise way.
> For GCC 13, the aim is to represent in the IL the user (type *)0x12345000
> style addresses as INTEGER_CSTs as before, and represent the addresses
> derived from pointer arithmetics on NULL pointer as
> &MEM_REF[(void*)0B + offset].  When we see say POINTER_PLUS of
> NULL and non-zero offset, I think we should fold it to this form.
> We IMNSHO need to support the poor man's offsetof I'm afraid still
> in wide use, so we should IMHO fold casts of such addresses to integers
> into INTEGER_CST offset.  SCCVN IMHO shouldn't the
> &MEM_REF[(void*)0B + offset] and (void*)offset forms as equivalent (yes,
> it will be a small missed optimization but RTL will see them the same),
> but e.g. the patch already does ==/!= folding between the two forms.
> As doing this fully seems to be more like a stage1 project, the following
> patch keeps doing what GCC11/GCC12 did until now for pointers smaller than
> a parameter, by default equal to 4KB, which means that PR102037 will
> remain unfixed for now (unless users lower the new param manually) and for
> the most common cases nothing changes right now, but for higher
> addresses we'll assume that such INTEGER_CSTs are valid direct address
> accesses and use the &MEM_REF[(void*)0B + offset] forms for the invalid
> code on which we'll warn.
> 
> Yet another alternative would be just the params.opt and
> pointer-query.cc (compute_objsize_r) hunks which will stop warning
> if we go more than 4KB from a NULL pointer altogether and doing
> the full set of changes only in GCC 13 (I've bootstrapped/regtested
> such patch last night on x86_64-linux and i686-linux).

I think I’d prefer this simpler approach at this point.  Thus that one is OK.

Richard.

> 
> So far tested with make check-gcc/check-g++ on x86_64-linux,
> furthermore also tested with the parameter's default changed from 4096
> to 16.
> 
> Ok for trunk if it passes full bootstrap/regtest?
> 
> 2022-03-18  Jakub Jelinek  <ja...@redhat.com>
> 
>    PR middle-end/99578
>    PR middle-end/100680
>    PR tree-optimization/100834
>    * params.opt (--param=min-pagesize=): New parameter.
>    * gimple-expr.cc (is_gimple_invariant_address): Allow ADDR_EXPR of
>    MEM_REF with integer_zerop as first operand as invariant.
>    (is_gimple_ip_invariant_address): Likewise.
>    * match.pd (&MEM[0 + off] == off): New simplification.
>    * gimple-ssa-warn-restrict.cc (builtin_memref::set_base_and_offset):
>    Ignore MEM_REFs with integer_zerop as address.
>    * gimple-fold.cc (maybe_canonicalize_mem_ref_addr): Don't fold
>    &MEM[0 + off] into off if off is larger than min-pagesize.
>    * gimple-array-bounds.cc (array_bounds_checker::check_mem_ref): Return
>    false for MEM_REFs with integer_zerop as address.
>    * pointer-query.cc (handle_mem_ref): Handle MEM_REFs with integer_zerop
>    as address specially.
>    (compute_objsize_r) <case ARRAY_REF>: Formatting fix.
>    (compute_objsize_r) <case INTEGER_CST>: Use maximum object size instead
>    of zero for pointer constants equal or larger than min-pagesize.
>    * fold-const.cc (build_fold_addr_expr_with_type_loc):
> 
>    * gcc.dg/tree-ssa/pr99578-1.c: New test.
>    * gcc.dg/pr99578-1.c: New test.
>    * gcc.dg/pr99578-2.c: New test.
>    * gcc.dg/pr99578-3.c: New test.
>    * gcc.dg/pr100680.c: New test.
>    * gcc.dg/pr100834.c: New test.
> 
> --- gcc/params.opt.jj    2022-03-17 18:48:06.723406664 +0100
> +++ gcc/params.opt    2022-03-18 11:59:31.518452714 +0100
> @@ -613,6 +613,10 @@ The maximum number of insns in loop head
> Common Joined UInteger Var(param_max_modulo_backtrack_attempts) Init(40) 
> Param Optimization
> The maximum number of backtrack attempts the scheduler should make when 
> modulo scheduling a loop.
> 
> +-param=min-pagesize=
> +Common Joined UInteger Var(param_min_pagesize) Init(4096) Param Optimization
> +Minimum page size for warning purposes.
> +
> -param=max-partial-antic-length=
> Common Joined UInteger Var(param_max_partial_antic_length) Init(100) Param 
> Optimization
> Maximum length of partial antic set when performing tree pre optimization.
> --- gcc/gimple-expr.cc.jj    2022-03-17 18:48:06.691407102 +0100
> +++ gcc/gimple-expr.cc    2022-03-18 11:59:31.484453184 +0100
> @@ -690,9 +690,13 @@ is_gimple_invariant_address (const_tree
>   if (TREE_CODE (op) == MEM_REF)
>     {
>       const_tree op0 = TREE_OPERAND (op, 0);
> -      return (TREE_CODE (op0) == ADDR_EXPR
> -          && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0))
> -          || decl_address_invariant_p (TREE_OPERAND (op0, 0))));
> +      return ((TREE_CODE (op0) == ADDR_EXPR
> +           && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0))
> +           || decl_address_invariant_p (TREE_OPERAND (op0, 0))))
> +          /* Allow &MEM <int[4]> [(void *)0B + 8192B] as representation
> +         of &NULL->field_at_offset_8192 to differentiate for
> +         warning purposes those cases from (type *)0x2000.  */
> +          || integer_zerop (op0));
>     }
> 
>   return CONSTANT_CLASS_P (op) || decl_address_invariant_p (op);
> @@ -716,9 +720,10 @@ is_gimple_ip_invariant_address (const_tr
>   if (TREE_CODE (op) == MEM_REF)
>     {
>       const_tree op0 = TREE_OPERAND (op, 0);
> -      return (TREE_CODE (op0) == ADDR_EXPR
> -          && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0))
> -          || decl_address_ip_invariant_p (TREE_OPERAND (op0, 0))));
> +      return ((TREE_CODE (op0) == ADDR_EXPR
> +           && (CONSTANT_CLASS_P (TREE_OPERAND (op0, 0))
> +           || decl_address_ip_invariant_p (TREE_OPERAND (op0, 0))))
> +          || integer_zerop (op0));
>     }
> 
>   return CONSTANT_CLASS_P (op) || decl_address_ip_invariant_p (op);
> --- gcc/match.pd.jj    2022-03-16 10:56:18.995730484 +0100
> +++ gcc/match.pd    2022-03-18 14:53:55.726876501 +0100
> @@ -5606,6 +5606,25 @@ (define_operator_list SYNC_FETCH_AND_AND
>       (if (cmp == NE_EXPR)
>        { constant_boolean_node (true, type); })))))))
> 
> +/* Simplify pointer equality compares of &MEM_REF[0 + offset] and integer.  
> */
> +(for neeq (ne eq)
> + (simplify
> +  (neeq (convert? addr@0) INTEGER_CST@1)
> +   (if (TYPE_PRECISION (TREE_TYPE (@1)) == TYPE_PRECISION (TREE_TYPE (@0))
> +    && TYPE_PRECISION (TREE_TYPE (@1)) == TYPE_PRECISION (sizetype))
> +    (with { poly_int64 off;
> +        HOST_WIDE_INT coff;
> +        tree base = get_addr_base_and_unit_offset (TREE_OPERAND (@0, 0),
> +                               &off); }
> +     (if (base
> +      && TREE_CODE (base) == MEM_REF
> +      && integer_zerop (TREE_OPERAND (base, 0))
> +      && TREE_CODE (TREE_OPERAND (base, 1)) == INTEGER_CST
> +      && off.is_constant (&coff))
> +      { constant_boolean_node ((wi::to_wide (TREE_OPERAND (base, 1)) + coff
> +                == wi::to_wide (@1))
> +                   ^ (neeq != EQ_EXPR), type); })))))
> +
> /* Simplify pointer equality compares using PTA.  */
> (for neeq (ne eq)
>  (simplify
> --- gcc/gimple-ssa-warn-restrict.cc.jj    2022-02-04 14:36:55.000000000 +0100
> +++ gcc/gimple-ssa-warn-restrict.cc    2022-03-18 13:42:05.590650262 +0100
> @@ -521,7 +521,7 @@ builtin_memref::set_base_and_offset (tre
>    offrange[1] += maxobjsize;
>     }
> 
> -  if (TREE_CODE (base) == MEM_REF)
> +  if (TREE_CODE (base) == MEM_REF && !integer_zerop (TREE_OPERAND (base, 0)))
>     {
>       tree memrefoff = fold_convert (ptrdiff_type_node, TREE_OPERAND (base, 
> 1));
>       extend_offset_range (memrefoff);
> --- gcc/gimple-fold.cc.jj    2022-03-17 18:48:06.709406855 +0100
> +++ gcc/gimple-fold.cc    2022-03-18 11:59:31.500452963 +0100
> @@ -6018,7 +6018,25 @@ maybe_canonicalize_mem_ref_addr (tree *t
>          if (wi::to_poly_wide (TREE_OPERAND (base, 0)).to_shwi (&moffset))
>        {
>          coffset += moffset;
> -          *orig_t = build_int_cst (TREE_TYPE (*orig_t), coffset);
> +          /* Don't fold &NULL->field into INTEGER_CSTs, so that we
> +             for warning purposes keep the distinction between such
> +             invalid cases and valid (type *)0x7e124000.  */
> +          if (integer_zerop (TREE_OPERAND (*t, 0))
> +              && maybe_ge (coffset, param_min_pagesize))
> +            {
> +              if (t == &TREE_OPERAND (*orig_t, 0))
> +            return res;
> +              location_t loc = EXPR_LOCATION (*orig_t);
> +              tree ptype = TREE_TYPE (*orig_t);
> +              tree tem = build2_loc (loc, MEM_REF, TREE_TYPE (ptype),
> +                         TREE_OPERAND (*t, 0),
> +                         build_int_cst (ptr_type_node,
> +                                coffset));
> +              *orig_t
> +            = build_fold_addr_expr_with_type_loc (loc, tem, ptype);
> +            }
> +          else
> +            *orig_t = build_int_cst (TREE_TYPE (*orig_t), coffset);
>          return true;
>        }
>        }
> --- gcc/gimple-array-bounds.cc.jj    2022-03-17 18:48:06.691407102 +0100
> +++ gcc/gimple-array-bounds.cc    2022-03-18 11:59:31.509452838 +0100
> @@ -393,7 +393,8 @@ bool
> array_bounds_checker::check_mem_ref (location_t location, tree ref,
>                     bool ignore_off_by_one)
> {
> -  if (warning_suppressed_p (ref, OPT_Warray_bounds))
> +  if (warning_suppressed_p (ref, OPT_Warray_bounds)
> +      || integer_zerop (TREE_OPERAND (ref, 0)))
>     return false;
> 
>   /* The statement used to allocate the array or null.  */
> --- gcc/pointer-query.cc.jj    2022-03-17 18:48:06.724406650 +0100
> +++ gcc/pointer-query.cc    2022-03-18 16:14:38.184691128 +0100
> @@ -1968,6 +1968,17 @@ handle_mem_ref (tree mref, gimple *stmt,
>     }
> 
>   tree mrefop = TREE_OPERAND (mref, 0);
> +  if (integer_zerop (mrefop))
> +    {
> +      addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (mref));
> +      if (targetm.addr_space.zero_address_valid (as))
> +    pref->set_max_size_range ();
> +      else
> +    pref->sizrng[0] = pref->sizrng[1] = 0;
> +      pref->ref = null_pointer_node;
> +      return true;
> +    }
> +
>   if (!compute_objsize_r (mrefop, stmt, false, ostype, pref, snlim, qry))
>     return false;
> 
> @@ -2243,7 +2254,7 @@ compute_objsize_r (tree ptr, gimple *stm
>       }
> 
>     case ARRAY_REF:
> -    return handle_array_ref (ptr, stmt, addr, ostype, pref, snlim, qry);
> +      return handle_array_ref (ptr, stmt, addr, ostype, pref, snlim, qry);
> 
>     case COMPONENT_REF:
>       return handle_component_ref (ptr, stmt, addr, ostype, pref, snlim, qry);
> @@ -2264,12 +2275,14 @@ compute_objsize_r (tree ptr, gimple *stm
>       }
> 
>     case INTEGER_CST:
> -      /* Pointer constants other than null are most likely the result
> -     of erroneous null pointer addition/subtraction.  Unless zero
> -     is a valid address set size to zero.  For null pointers, set
> -     size to the maximum for now since those may be the result of
> -     jump threading.  */
> -      if (integer_zerop (ptr))
> +      /* Pointer constants other than null smaller than param_min_pagesize
> +     might be the result of erroneous null pointer addition/subtraction.
> +     Unless zero is a valid address set size to zero.  For null pointers,
> +     set size to the maximum for now since those may be the result of
> +     jump threading.  Similarly, for values >= param_min_pagesize in
> +     order to support (type *) 0x7cdeab00.  */
> +      if (integer_zerop (ptr)
> +      || wi::to_widest (ptr) >= param_min_pagesize)
>    pref->set_max_size_range ();
>       else if (POINTER_TYPE_P (TREE_TYPE (ptr)))
>    {
> --- gcc/fold-const.cc.jj    2022-03-17 18:48:06.675407320 +0100
> +++ gcc/fold-const.cc    2022-03-18 11:59:31.541452395 +0100
> @@ -9206,7 +9206,12 @@ build_fold_addr_expr_with_type_loc (loca
>    t = fold_convert_loc (loc, ptrtype, t);
>     }
>   else if (TREE_CODE (t) == MEM_REF
> -       && TREE_CODE (TREE_OPERAND (t, 0)) == INTEGER_CST)
> +       && TREE_CODE (TREE_OPERAND (t, 0)) == INTEGER_CST
> +       /* Don't fold &MEM <int[4]> [(void *)0B + 8192B]
> +          as a representation of &((type *)NULL)->field_at_offset_8192.  */
> +       && (!integer_zerop (TREE_OPERAND (t, 0))
> +           || TREE_CODE (TREE_OPERAND (t, 1)) != INTEGER_CST
> +           || wi::to_widest (TREE_OPERAND (t, 1)) < param_min_pagesize))
>     return fold_binary (POINTER_PLUS_EXPR, ptrtype,
>            TREE_OPERAND (t, 0),
>            convert_to_ptrofftype (TREE_OPERAND (t, 1)));
> --- gcc/testsuite/gcc.dg/tree-ssa/pr99578-1.c.jj    2022-03-18 
> 15:12:47.470156316 +0100
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr99578-1.c    2022-03-18 
> 15:18:30.249401754 +0100
> @@ -0,0 +1,22 @@
> +/* PR middle-end/99578 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not "&MEM" "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "PHI <-?1\\\(\[0-9\]+\\\), 
> -?1\\\(\[0-9\]+\\\)>" 2 "optimized" } } */
> +
> +struct S { int a, b[4]; };
> +struct T { int a, b[8192], c[4]; };
> +
> +int
> +foo (struct S *p)
> +{
> +  if (p) return -1;
> +  return p->b == (void *)4;
> +}
> +
> +int
> +bar (struct T *p)
> +{
> +  if (p) return -1;
> +  return p->c == (void *)32772;
> +}
> --- gcc/testsuite/gcc.dg/pr99578-1.c.jj    2022-03-18 15:04:22.825163042 +0100
> +++ gcc/testsuite/gcc.dg/pr99578-1.c    2022-03-18 15:09:26.693941208 +0100
> @@ -0,0 +1,26 @@
> +/* PR middle-end/99578 */
> +/* { dg-do compile { target int32 } } */
> +/* { dg-options "-O2 -Warray-bounds" } */
> +
> +struct S { int a, b[4]; };
> +struct T { int a, b[8192], c[4]; };
> +
> +void
> +foo (struct S *p)
> +{
> +  if (p) return;
> +  __builtin_memset (p->b, 0, sizeof p->b);    /* { dg-warning "offset \\\[0, 
> 15\\\] is out of the bounds \\\[0, 0\\\]" } */
> +}
> +
> +void
> +bar (struct T *p)
> +{
> +  if (p) return;
> +  __builtin_memset (p->c, 0, sizeof p->c);    /* { dg-warning "offset \\\[0, 
> 15\\\] is out of the bounds \\\[0, 0\\\]" } */
> +}
> +
> +void
> +baz (void)
> +{
> +  __builtin_memset ((void *) 0x8004, 0, 16);    /* { dg-bogus "is out of the 
> bounds" } */
> +}
> --- gcc/testsuite/gcc.dg/pr99578-2.c.jj    2022-03-18 15:09:58.339502263 +0100
> +++ gcc/testsuite/gcc.dg/pr99578-2.c    2022-03-18 15:10:33.308017231 +0100
> @@ -0,0 +1,26 @@
> +/* PR middle-end/99578 */
> +/* { dg-do compile { target int32 } } */
> +/* { dg-options "-O2 -Wstringop-overflow" } */
> +
> +struct S { int a, b[4]; };
> +struct T { int a, b[8192], c[4]; };
> +
> +void
> +foo (struct S *p)
> +{
> +  if (p) return;
> +  __builtin_memset (p->b, 0, sizeof p->b);    /* { dg-warning "writing 16 
> bytes into a region of size 0 overflows the destination" } */
> +}
> +
> +void
> +bar (struct T *p)
> +{
> +  if (p) return;
> +  __builtin_memset (p->c, 0, sizeof p->c);    /* { dg-warning "writing 16 
> bytes into a region of size 0 overflows the destination" } */
> +}
> +
> +void
> +baz (void)
> +{
> +  __builtin_memset ((void *) 0x8004, 0, 16);    /* { dg-bogus "overflows the 
> destination" } */
> +}
> --- gcc/testsuite/gcc.dg/pr99578-3.c.jj    2022-03-18 16:43:56.303435806 +0100
> +++ gcc/testsuite/gcc.dg/pr99578-3.c    2022-03-18 16:43:51.379503346 +0100
> @@ -0,0 +1,13 @@
> +/* PR middle-end/99578 */
> +/* { dg-do compile { target size32plus } } */
> +/* { dg-options "-O2 -Wstringop-overread" } */
> +
> +struct S { unsigned int s; };
> +extern struct S v;
> +extern void *memcpy (void *, const void *, __SIZE_TYPE__);
> +
> +void
> +foo (void)
> +{
> +  memcpy (&v, (void *)(0xe8ffc000), sizeof (struct S));    /* { dg-bogus 
> "from a region of size 0" } */
> +}
> --- gcc/testsuite/gcc.dg/pr100680.c.jj    2022-03-18 16:48:39.136556295 +0100
> +++ gcc/testsuite/gcc.dg/pr100680.c    2022-03-18 16:55:37.512876159 +0100
> @@ -0,0 +1,31 @@
> +/* PR middle-end/100680 */
> +/* { dg-do compile { target size32plus } } */
> +/* { dg-options "-O2 -Wstringop-overread" } */
> +
> +struct s {
> +  char a[8];
> +  int i;
> +  long l;
> +};
> +
> +extern char ea[8];
> +static char sa[8] = { 1, 2, 3, 4 };
> +
> +int
> +test (void)
> +{
> +  const struct s *ps = (const struct s *) 0x12345678L;
> +  if (__builtin_memcmp (ps->a, ps->a, 8))
> +    return 0;
> +
> +  if (__builtin_memcmp (ps->a, ea, 8))        /* { dg-bogus "exceeds source 
> size 0" } */
> +    return 0;
> +
> +  if (__builtin_memcmp (ps->a, sa, 8))        /* { dg-bogus "exceeds source 
> size 0" } */
> +    return 0;
> +
> +  if (__builtin_memcmp (ps->a, "abcdABCD", 8))    /* { dg-bogus "exceeds 
> source size 0" } */
> +    return 0;
> +
> +  return 1;
> +}
> --- gcc/testsuite/gcc.dg/pr100834.c.jj    2022-03-18 17:23:13.802552517 +0100
> +++ gcc/testsuite/gcc.dg/pr100834.c    2022-03-18 17:24:21.985630103 +0100
> @@ -0,0 +1,42 @@
> +/* PR tree-optimization/100834 */
> +/* { dg-do compile { target size32plus } } */
> +/* { dg-options "-O2 -Wall" } */
> +
> +#define PAGE_SIZE    4096
> +#define STACK_SIZE    PAGE_SIZE
> +
> +union registers
> +{
> +  struct
> +  {
> +    unsigned long r15, r14, r13, r12, r11, r10, r9, r8;
> +    unsigned long rdi, rsi, rbp, unused, rbx, rdx, rcx, rax;
> +  };
> +  unsigned long by_index[16];
> +};
> +
> +struct per_cpu
> +{
> +  union
> +  {
> +    unsigned char stack[STACK_SIZE];
> +    struct
> +    {
> +      unsigned char __fill[STACK_SIZE - sizeof (union registers)];
> +      union registers guest_regs;
> +    };
> +  };
> +} __attribute__((aligned (PAGE_SIZE)));
> +
> +static inline struct per_cpu *
> +this_cpu_data (void)
> +{
> +  return (struct per_cpu *) 0xdeadbeef;
> +}
> +
> +void
> +foo (void)
> +{
> +  struct per_cpu *cpu_data = this_cpu_data ();
> +  __builtin_memset (&cpu_data->guest_regs, 0, sizeof 
> (cpu_data->guest_regs));    /* { dg-bogus "is out of the bounds" } */
> +}
> 
> 
>    Jakub
> 

Reply via email to