Thanks for the comments and suggestions.

> On May 15, 2024, at 10:00, David Malcolm <dmalc...@redhat.com> wrote:
> 
> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
>> On Mon, 13 May 2024, Qing Zhao wrote:
>> 
>>> -Warray-bounds is an important option to enable linux kernal to
>>> keep
>>> the array out-of-bound errors out of the source tree.
>>> 
>>> However, due to the false positive warnings reported in PR109071
>>> (-Warray-bounds false positive warnings due to code duplication
>>> from
>>> jump threading), -Warray-bounds=1 cannot be added on by default.
>>> 
>>> Although it's impossible to elinimate all the false positive
>>> warnings
>>> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
>>> documentation says "always out of bounds"), we should minimize the
>>> false positive warnings in -Warray-bounds=1.
>>> 
>>> The root reason for the false positive warnings reported in
>>> PR109071 is:
>>> 
>>> When the thread jump optimization tries to reduce the # of branches
>>> inside the routine, sometimes it needs to duplicate the code and
>>> split into two conditional pathes. for example:
>>> 
>>> The original code:
>>> 
>>> void sparx5_set (int * ptr, struct nums * sg, int index)
>>> {
>>>   if (index >= 4)
>>>     warn ();
>>>   *ptr = 0;
>>>   *val = sg->vals[index];
>>>   if (index >= 4)
>>>     warn ();
>>>   *ptr = *val;
>>> 
>>>   return;
>>> }
>>> 
>>> With the thread jump, the above becomes:
>>> 
>>> void sparx5_set (int * ptr, struct nums * sg, int index)
>>> {
>>>   if (index >= 4)
>>>     {
>>>       warn ();
>>>       *ptr = 0;         // Code duplications since "warn" does
>>> return;
>>>       *val = sg->vals[index];   // same this line.
>>>                                 // In this path, since it's under
>>> the condition
>>>                                 // "index >= 4", the compiler knows
>>> the value
>>>                                 // of "index" is larger then 4,
>>> therefore the
>>>                                 // out-of-bound warning.
>>>       warn ();
>>>     }
>>>   else
>>>     {
>>>       *ptr = 0;
>>>       *val = sg->vals[index];
>>>     }
>>>   *ptr = *val;
>>>   return;
>>> }
>>> 
>>> We can see, after the thread jump optimization, the # of branches
>>> inside
>>> the routine "sparx5_set" is reduced from 2 to 1, however,  due to
>>> the
>>> code duplication (which is needed for the correctness of the code),
>>> we
>>> got a false positive out-of-bound warning.
>>> 
>>> In order to eliminate such false positive out-of-bound warning,
>>> 
>>> A. Add one more flag for GIMPLE: is_splitted.
>>> B. During the thread jump optimization, when the basic blocks are
>>>    duplicated, mark all the STMTs inside the original and
>>> duplicated
>>>    basic blocks as "is_splitted";
>>> C. Inside the array bound checker, add the following new heuristic:
>>> 
>>> If
>>>    1. the stmt is duplicated and splitted into two conditional
>>> paths;
>>> +  2. the warning level < 2;
>>> +  3. the current block is not dominating the exit block
>>> Then not report the warning.
>>> 
>>> The false positive warnings are moved from -Warray-bounds=1 to
>>>  -Warray-bounds=2 now.
>>> 
>>> Bootstrapped and regression tested on both x86 and aarch64.
>>> adjusted
>>>  -Warray-bounds-61.c due to the false positive warnings.
>>> 
>>> Let me know if you have any comments and suggestions.
>> 
>> At the last Cauldron I talked with David Malcolm about these kind of
>> issues and thought of instead of suppressing diagnostics to record
>> how a block was duplicated.  For jump threading my idea was to record
>> the condition that was proved true when entering the path and do this
>> by recording the corresponding locations

Is only recording the location for the TRUE path  enough?
We might need to record the corresponding locations for both TRUE and FALSE 
paths since the VRP might be more accurate on both paths. 
Is only recording the location is enough? 
Do we need to record the pointer to the original condition stmt?


>> so that in the end we can
>> use the diagnostic-path infrastructure to say
>> 
>> warning: array index always above array bounds
>> events 1:
>> 
>>> 3 |  if (index >= 4)
>>          |
>>         (1) when index >= 4
>> 
>> it would be possible to record the info as part of the ad-hoc
>> location data on each duplicated stmt or, possibly simpler,
>> as part of a debug stmt of new kind.
>> 
>> I'm not sure pruning the warnings is a good thing to do.  One
>> would argue we should instead isolate such path as unreachable
>> since it invokes undefined behavior.  In particular your
>> example is clearly a bug and should be diagnosed.
>> 
>> Note very similar issues happen when unrolling a loop.
>> 
>> Note all late diagnostics are prone to these kind of issues.
> 
> To recap our chat at Cauldron: any GCC diagnostic can potentially have
> a diagnostic_path associated with it (not just the analyzer).  The
> current mechanism is:
> (a) use a rich_location for the diagnostic, and 
> (b) create an instance of some diagnostic_path subclass (e.g.
> simple_diagnostic_path, or something else), and 
> (c) call  richloc.set_path (&path);  to associate the path with the
> rich_location
> 
> See gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_paths.c for an
> example of using simple_diagnostic_path that doesn't use the analyzer.

Thanks for the information. 
Yes, If we have recorded all necessary information for the diagnostic during 
the jump threading or loop unrolling transformation, the current rich_location 
and diagnostic_path infrastruture looks a very nice fit to use to report the 
warning with more details. 
> 
> 
> If we want *every* late diagnostic to potentially have a path, it
> sounds like we might want some extra infrastructure (perhaps a hook
> that autogenerates the paths from the ad-hoc data???)

Recording the minimum and necessary information into the IR during 
transformation to help the late diagnostic is the key to this approach.  What 
kind of information need to be recorded and where to record such information to 
minimize the cost need more thinking and discussion.

>  But probably
> best to get it working for just a specific example first before trying
> to be fancy and generalize.

Yes. 

Thanks.

Qing
> 
> Dave
> 
> 
>> 
>> Richard.
>> 
>>> Thanks.
>>> 
>>> Qing
>>> 
>>> 
>>>         PR tree optimization/109071
>>> 
>>> gcc/ChangeLog:
>>> 
>>>         * gimple-array-bounds.cc (check_out_of_bounds_and_warn):
>>> Add two new
>>>         arguments for the new heuristic to not issue warnings.
>>>         (array_bounds_checker::check_array_ref): Call the new
>>> prototype of the
>>>         routine check_out_of_bounds_and_warn.
>>>         (array_bounds_checker::check_mem_ref): Add one new argument
>>> for the
>>>         new heuristic to not issue warnings.
>>>         (array_bounds_checker::check_addr_expr): Call the new
>>> prototype of the
>>>         routine check_mem_ref, add new heuristic for not issue
>>> warnings.
>>>         (array_bounds_checker::check_array_bounds): Call the new
>>> prototype of
>>>         the routine check_mem_ref.
>>>         * gimple-array-bounds.h: New prototype of check_mem_ref.
>>>         * gimple.h (struct GTY): Add one new flag is_splitted for
>>> gimple.
>>>         (gimple_is_splitted_p): New function.
>>>         (gimple_set_is_splitted): New function.
>>>         * tree-ssa-threadupdate.cc (set_stmts_in_bb_is_splitted):
>>> New function.
>>>         (back_jt_path_registry::duplicate_thread_path): Mark all
>>> the stmts in
>>>         both original and copied blocks as IS_SPLITTED.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>>         * gcc.dg/Warray-bounds-61.c: Adjust testing case.
>>>         * gcc.dg/pr109071-1.c: New test.
>>>         * gcc.dg/pr109071.c: New test.
>>> ---
>>>  gcc/gimple-array-bounds.cc              | 46
>>> +++++++++++++++++++++----
>>>  gcc/gimple-array-bounds.h               |  2 +-
>>>  gcc/gimple.h                            | 21 +++++++++--
>>>  gcc/testsuite/gcc.dg/Warray-bounds-61.c |  6 ++--
>>>  gcc/testsuite/gcc.dg/pr109071-1.c       | 22 ++++++++++++
>>>  gcc/testsuite/gcc.dg/pr109071.c         | 22 ++++++++++++
>>>  gcc/tree-ssa-threadupdate.cc            | 15 ++++++++
>>>  7 files changed, 122 insertions(+), 12 deletions(-)
>>>  create mode 100644 gcc/testsuite/gcc.dg/pr109071-1.c
>>>  create mode 100644 gcc/testsuite/gcc.dg/pr109071.c
>>> 
>>> diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-
>>> bounds.cc
>>> index 008071cd5464..4a2975623bc1 100644
>>> --- a/gcc/gimple-array-bounds.cc
>>> +++ b/gcc/gimple-array-bounds.cc
>>> @@ -264,7 +264,9 @@ check_out_of_bounds_and_warn (location_t
>>> location, tree ref,
>>>                               tree up_bound, tree up_bound_p1,
>>>                               const value_range *vr,
>>>                               bool ignore_off_by_one, bool
>>> for_array_bound,
>>> -                             bool *out_of_bound)
>>> +                             bool *out_of_bound,
>>> +                             bool is_splitted,
>>> +                             bool is_dominate_exit)
>>>  {
>>>    tree min, max;
>>>    tree low_bound = array_ref_low_bound (ref);
>>> @@ -273,6 +275,13 @@ check_out_of_bounds_and_warn (location_t
>>> location, tree ref,
>>>    bool warned = false;
>>>    *out_of_bound = false;
>>>  
>>> +  /* If the stmt is duplicated and splitted, the warning level is
>>> not 2,
>>> +     and the current block is not dominating the exit block, not
>>> report
>>> +     the warning.  */
>>> +  if (is_splitted && warn_array_bounds < 2
>>> +      && !is_dominate_exit)
>>> +    return false;
>>> +
>>>    /* Empty array.  */
>>>    if (up_bound && tree_int_cst_equal (low_bound, up_bound_p1))
>>>      {
>>> @@ -386,12 +395,17 @@ array_bounds_checker::check_array_ref
>>> (location_t location, tree ref,
>>>         }
>>>      }
>>>  
>>> +  bool is_dominate_exit = dominated_by_p (CDI_DOMINATORS,
>>> +                                         EXIT_BLOCK_PTR_FOR_FN
>>> (fun),
>>> +                                         gimple_bb (stmt));
>>> +
>>>    warned = check_out_of_bounds_and_warn (location, ref,
>>>                                          low_sub_org, low_sub,
>>> up_sub,
>>>                                          up_bound, up_bound_p1,
>>> &vr,
>>>                                          ignore_off_by_one,
>>> warn_array_bounds,
>>> -                                        &out_of_bound);
>>> -
>>> +                                        &out_of_bound,
>>> +                                        gimple_is_splitted_p
>>> (stmt),
>>> +                                        is_dominate_exit);
>>>  
>>>    if (!warned && sam == special_array_member::int_0)
>>>      warned = warning_at (location, OPT_Wzero_length_bounds,
>>> @@ -476,7 +490,7 @@ array_bounds_checker::check_array_ref
>>> (location_t location, tree ref,
>>>  
>>>  bool
>>>  array_bounds_checker::check_mem_ref (location_t location, tree
>>> ref,
>>> -                                    bool ignore_off_by_one)
>>> +                                    gimple *stmt, bool
>>> ignore_off_by_one)
>>>  {
>>>    if (warning_suppressed_p (ref, OPT_Warray_bounds_))
>>>      return false;
>>> @@ -574,6 +588,16 @@ array_bounds_checker::check_mem_ref
>>> (location_t location, tree ref,
>>>         }
>>>      }
>>>  
>>> +  /* If the stmt is duplicated and splitted, the warning level is
>>> not 2,
>>> +     and the current block is not dominating the exit block, not
>>> report
>>> +     the warning.  */
>>> +  bool is_dominate_exit = dominated_by_p (CDI_DOMINATORS,
>>> +                                         EXIT_BLOCK_PTR_FOR_FN
>>> (fun),
>>> +                                         gimple_bb (stmt));
>>> +  if (gimple_is_splitted_p (stmt) && warn_array_bounds < 2
>>> +      && !is_dominate_exit)
>>> +    return false;
>>> +
>>>    bool warned = false;
>>>    if (lboob)
>>>      {
>>> @@ -654,7 +678,7 @@ array_bounds_checker::check_addr_expr
>>> (location_t location, tree t,
>>>           ignore_off_by_one = false;
>>>         }
>>>        else if (TREE_CODE (t) == MEM_REF)
>>> -       warned = check_mem_ref (location, t, ignore_off_by_one);
>>> +       warned = check_mem_ref (location, t, stmt,
>>> ignore_off_by_one);
>>>  
>>>        if (warned)
>>>         suppress_warning (t, OPT_Warray_bounds_);
>>> @@ -690,6 +714,16 @@ array_bounds_checker::check_addr_expr
>>> (location_t location, tree t,
>>>    if (!mem_ref_offset (t).is_constant (&idx))
>>>      return;
>>>  
>>> +  /* If the stmt is duplicated and splitted, the warning level is
>>> not 2,
>>> +     and the current block is not dominating the exit block, not
>>> report
>>> +     the warning.  */
>>> +  bool is_dominate_exit = dominated_by_p (CDI_DOMINATORS,
>>> +                                         EXIT_BLOCK_PTR_FOR_FN
>>> (fun),
>>> +                                         gimple_bb (stmt));
>>> +  if (gimple_is_splitted_p (stmt) && warn_array_bounds < 2
>>> +      && !is_dominate_exit)
>>> +    return;
>>> +
>>>    bool warned = false;
>>>    idx = wi::sdiv_trunc (idx, wi::to_offset (el_sz));
>>>    if (idx < 0)
>>> @@ -809,7 +843,7 @@ array_bounds_checker::check_array_bounds (tree
>>> *tp, int *walk_subtree,
>>>      warned = checker->check_array_ref (location, t, wi->stmt,
>>>                                        false/*ignore_off_by_one*/);
>>>    else if (TREE_CODE (t) == MEM_REF)
>>> -    warned = checker->check_mem_ref (location, t,
>>> +    warned = checker->check_mem_ref (location, t, wi->stmt,
>>>                                      false /*ignore_off_by_one*/);
>>>    else if (TREE_CODE (t) == ADDR_EXPR)
>>>      {
>>> diff --git a/gcc/gimple-array-bounds.h b/gcc/gimple-array-bounds.h
>>> index 3e077d0178ff..7c98f02204c9 100644
>>> --- a/gcc/gimple-array-bounds.h
>>> +++ b/gcc/gimple-array-bounds.h
>>> @@ -33,7 +33,7 @@ public:
>>>  private:
>>>    static tree check_array_bounds (tree *tp, int *walk_subtree,
>>> void *data);
>>>    bool check_array_ref (location_t, tree, gimple *, bool
>>> ignore_off_by_one);
>>> -  bool check_mem_ref (location_t, tree, bool ignore_off_by_one);
>>> +  bool check_mem_ref (location_t, tree, gimple *, bool
>>> ignore_off_by_one);
>>>    void check_addr_expr (location_t, tree, gimple *);
>>>    void get_value_range (irange &r, const_tree op, gimple *);
>>>  
>>> diff --git a/gcc/gimple.h b/gcc/gimple.h
>>> index bd315ffc2dd4..08f52d107084 100644
>>> --- a/gcc/gimple.h
>>> +++ b/gcc/gimple.h
>>> @@ -254,8 +254,8 @@ struct GTY((desc ("gimple_statement_structure
>>> (&%h)"), tag ("GSS_BASE"),
>>>    /* Nonzero if this statement contains volatile operands.  */
>>>    unsigned has_volatile_ops    : 1;
>>>  
>>> -  /* Padding to get subcode to 16 bit alignment.  */
>>> -  unsigned pad                 : 1;
>>> +  /* Nonzero if this statement is duplicated and splitted to two
>>> pathes.  */
>>> +  unsigned is_splitted         : 1;
>>>  
>>>    /* The SUBCODE field can be used for tuple-specific flags for
>>> tuples
>>>       that do not require subcodes.  Note that SUBCODE should be at
>>> @@ -2327,6 +2327,23 @@ gimple_set_has_volatile_ops (gimple *stmt,
>>> bool volatilep)
>>>      stmt->has_volatile_ops = (unsigned) volatilep;
>>>  }
>>>  
>>> +/* Return true if statement G's is_splitted field has
>>> +   been set.  */
>>> +
>>> +inline bool
>>> +gimple_is_splitted_p (const gimple *g)
>>> +{
>>> +  return (bool) g->is_splitted;
>>> +}
>>> +
>>> +/* Set the IS_SPLITTED flag to IS_SPLITTEDP.  */
>>> +
>>> +inline void
>>> +gimple_set_is_splitted (gimple *s, bool is_splittedp)
>>> +{
>>> +    s->is_splitted = (unsigned) is_splittedp;
>>> +}
>>> +
>>>  /* Return true if STMT is in a transaction.  */
>>>  
>>>  inline bool
>>> diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-61.c
>>> b/gcc/testsuite/gcc.dg/Warray-bounds-61.c
>>> index 5b66cdc0aab1..cb3c64a813d7 100644
>>> --- a/gcc/testsuite/gcc.dg/Warray-bounds-61.c
>>> +++ b/gcc/testsuite/gcc.dg/Warray-bounds-61.c
>>> @@ -23,7 +23,7 @@ void test_ua3_ua0_a0 (int i)
>>>  
>>>    if (i < __LINE__)
>>>      i = 5;
>>> -  ua3_a0.a0[i] = 0;           // { dg-warning "\\\[-Warray-bounds"
>>> }
>>> +  ua3_a0.a0[i] = 0;           // { dg-bogus "\\\[-Warray-bounds" }
>>>  
>>>    if (i > -1)
>>>      i = -1;
>>> @@ -44,7 +44,7 @@ void test_ua3_ua0_a1 (int i)
>>>  
>>>    if (i > -1)
>>>      i = -1;
>>> -  ua3_a0.a1[i] = 0;           // { dg-warning "\\\[-Warray-bounds"
>>> }
>>> +  ua3_a0.a1[i] = 0;           // { dg-bogus "\\\[-Warray-bounds" }
>>>  
>>>    if (i < 7)
>>>      i = 7;
>>> @@ -60,7 +60,7 @@ void test_ua3_ua0_a2 (int i)
>>>  
>>>    if (i < __LINE__)
>>>      i = __LINE__;
>>> -  ua3_a0.a2[i] = 0;           // { dg-warning "\\\[-Warray-bounds"
>>> }
>>> +  ua3_a0.a2[i] = 0;           // { dg-bogus "\\\[-Warray-bounds" }
>>>  
>>>    if (i > -1)
>>>      i = -1;
>>> diff --git a/gcc/testsuite/gcc.dg/pr109071-1.c
>>> b/gcc/testsuite/gcc.dg/pr109071-1.c
>>> new file mode 100644
>>> index 000000000000..a405c80bd549
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/pr109071-1.c
>>> @@ -0,0 +1,22 @@
>>> +/* PR tree-optimization/109071 -Warray-bounds false positive
>>> warnings
>>> +   due to code duplication from jump threading 
>>> +   { dg-do compile }
>>> +   { dg-options "-O2 -Warray-bounds=2" }
>>> + */
>>> +
>>> +extern void warn(void);
>>> +static inline void assign(int val, int *regs, int index)
>>> +{
>>> +  if (index >= 4)
>>> +    warn();
>>> +  *regs = val;
>>> +}
>>> +struct nums {int vals[4];};
>>> +
>>> +void sparx5_set (int *ptr, struct nums *sg, int index)
>>> +{
>>> +  int *val = &sg->vals[index]; /* { dg-warning "is above array
>>> bounds" } */
>>> +
>>> +  assign(0,    ptr, index);
>>> +  assign(*val, ptr, index);
>>> +}
>>> diff --git a/gcc/testsuite/gcc.dg/pr109071.c
>>> b/gcc/testsuite/gcc.dg/pr109071.c
>>> new file mode 100644
>>> index 000000000000..782dfad84ea2
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/pr109071.c
>>> @@ -0,0 +1,22 @@
>>> +/* PR tree-optimization/109071 -Warray-bounds false positive
>>> warnings
>>> +   due to code duplication from jump threading 
>>> +   { dg-do compile }
>>> +   { dg-options "-O2 -Wall" }
>>> + */
>>> +
>>> +extern void warn(void);
>>> +static inline void assign(int val, int *regs, int index)
>>> +{
>>> +  if (index >= 4)
>>> +    warn();
>>> +  *regs = val;
>>> +}
>>> +struct nums {int vals[4];};
>>> +
>>> +void sparx5_set (int *ptr, struct nums *sg, int index)
>>> +{
>>> +  int *val = &sg->vals[index]; /* { dg-bogus "is above array
>>> bounds" } */
>>> +
>>> +  assign(0,    ptr, index);
>>> +  assign(*val, ptr, index);
>>> +}
>>> diff --git a/gcc/tree-ssa-threadupdate.cc b/gcc/tree-ssa-
>>> threadupdate.cc
>>> index fa61ba9512b7..9f338dd4d54d 100644
>>> --- a/gcc/tree-ssa-threadupdate.cc
>>> +++ b/gcc/tree-ssa-threadupdate.cc
>>> @@ -2371,6 +2371,17 @@
>>> back_jt_path_registry::adjust_paths_after_duplication (unsigned
>>> curr_path_num)
>>>      }
>>>  }
>>>  
>>> +/* Set all the stmts in the basic block BB as IS_SPLITTED.  */
>>> +
>>> +static void
>>> +set_stmts_in_bb_is_splitted (basic_block bb)
>>> +{
>>> +  gimple_stmt_iterator gsi;
>>> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>>> +    gimple_set_is_splitted (gsi_stmt (gsi), true);
>>> +  return;
>>> +}
>>> +
>>>  /* Duplicates a jump-thread path of N_REGION basic blocks.
>>>     The ENTRY edge is redirected to the duplicate of the region.
>>>  
>>> @@ -2418,6 +2429,10 @@ back_jt_path_registry::duplicate_thread_path
>>> (edge entry,
>>>    basic_block *region_copy = XNEWVEC (basic_block, n_region);
>>>    copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy,
>>> loop,
>>>             split_edge_bb_loc (entry), false);
>>> +  /* Mark all the stmts in both original and copied basic blocks
>>> +     as IS_SPLITTED.  */
>>> +  set_stmts_in_bb_is_splitted (*region);
>>> +  set_stmts_in_bb_is_splitted (*region_copy);
>>>  
>>>    /* Fix up: copy_bbs redirects all edges pointing to copied
>>> blocks.  The
>>>       following code ensures that all the edges exiting the jump-
>>> thread path are
>>> 
>> 
> 

Reply via email to