92539] Improve code and avoid Warray-bounds false positive

Richard Biener Thu, 09 Jan 2025 00:08:58 -0800

On Wed, Jan 8, 2025 at 5:34 PM Qing Zhao <[email protected]> wrote:
>
>
>
> > On Jan 7, 2025, at 07:29, Richard Biener <[email protected]> wrote:
> >
> > On Mon, Jan 6, 2025 at 5:40 PM Qing Zhao <[email protected]> wrote:
> >>
> >>
> >>
> >>> On Jan 6, 2025, at 11:01, Richard Biener <[email protected]> 
> >>> wrote:
> >>>
> >>> On Mon, Jan 6, 2025 at 3:43 PM Qing Zhao <[email protected]> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> On Jan 6, 2025, at 09:21, Jeff Law <[email protected]> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 1/6/25 7:11 AM, Qing Zhao wrote:
> >>>>>>>
> >>>>>>> Given it doesn't cause user visible UB, we could insert the trap 
> >>>>>>> *before* the UB inducing statement.  That would then make the 
> >>>>>>> statement unreachable and it'd get removed avoiding the false 
> >>>>>>> positive diagnostic.
> >>>>>> Yes, that’s a good idea.
> >>>>>> However, in order to distinguish a user visible UB and a UB in the IL 
> >>>>>> that is introduced purely by compiler, we might need some new marking 
> >>>>>> in the IR?
> >>>>> I don't think we've ever really tackled that question; the closest I 
> >>>>> can think of would be things like integer overflow which we try to 
> >>>>> avoid allowing the compiler to introduce.  If we take the integer 
> >>>>> overflow as the model, then that would say we should be tackling this 
> >>>>> during loop unrolling.
> >>>>
> >>>> UB that is introduced by compiler transformation is one important cause 
> >>>> of false positive warnings.
> >>>>
> >>>> There are two approaches to tackle this problem from my understanding:
> >>>>
> >>>> 1. Avoid generating such UB from the beginning. i.e, for every compiler 
> >>>> transformation that might introduce such UB, we should add check to 
> >>>> avoid generating it.
> >>>>
> >>>> 2. Marking the IR portion that were generated by compiler 
> >>>> transformations, then check whether the UB is compiler generated when 
> >>>> issue static checker warnings.
> >>>>
> >>>> Are there other approaches?
> >>>
> >>> Note unrolling doesn't introduce UB - it makes conditional UB
> >>> "obvious”.
> >>
> >> So, you mean this is the same issue as PR109071 (and PR85788, PR88771, 
> >> etc), i.e, the compiler optimization make the conditional UB that’s 
> >> originally in the source code “obvious” after code duplication?
> >>
> >> (I need to study the testing case in PR92539 more carefully to make sure 
> >> this is the case...)
> >>
> >> If so, then the claimed false positive warning in PR92539 actually is a 
> >> real bug in the original source code,  and my patch that introduced the 
> >> new option “--fdiagnostics-details” should also include loop unrolling to 
> >> provide more details on the warning introduced by loop unrolling.
> >>
> >>
> >>> Note -Warray-bounds wants to
> >>> diagnose UB, so doing path isolation and removing the UB would make
> >>> -Warray-bounds useless.
> >>>
> >>> So unless the condition guarding the UB unrolling exposes is visibly
> >>> false to the compiler but we fail
> >>> to exploit that (missed optimization) there's not much that we can do.
> >>> I think "folding" away the UB
> >>> like what Jeff proposes trades false negatives for the false positive
> >>> diagnostics.
> >>>
> >>> Note the unroller knows UB that effectively bounds the number of
> >>> iterations, even on conditional
> >>> paths and it uses this to limit the number of copies _and_ to prune
> >>> unreachable paths (exploiting
> >>> UB, avoiding diagnostics).  But one of the limitations is that it only
> >>> prunes paths in the last unrolled
> >>> copy which can be insufficient (ISTR some PR where I noticed this).
> >>>
> >>> That said - I think for these unroller exposed cases of apparent false
> >>> positives we should improve
> >>> the path pruning in the unroller itself.  For the other cases the path
> >>> diagnostic might help clarify
> >>> that the UB happens on the 'n-th' iteration of the loop when some
> >>> additional condition is true/false.
> >>
> >> So, the “other cases” refer to the situation similar as PR109071, i.e, 
> >> “conditional UB” in the original source code is made obvious after loop 
> >> unrolling?
> >> Yes, for such cases, the new option I have been trying to add, 
> >> “-fdiagnostic-details” should be able to track and provide more details on 
> >> the conditions that lead to the UB.
> >> Is this understanding correct?
> >
> > I think so, but I didn't look into the testcase of the referenced PR.
>
> I took a detailed study of the test case of PR92539 yesterday.  The following 
> is a brief summary:
>
> 1. The pass that caused the issue is: cunrolli.
>      Adding -fdisable-tree-cunrolli eliminate the false positive warnings.
>
> 2. The IR Before cunrolli:
>
> const char *local_iterator = beginning address of string "aa";
> const char *last = last address of string "aa";
>
> for (int i = 0; i < 3; ++i)
>   if (local_iterator != last)           // pointer comparison 1
>     {
>       local_iterator++;
>       if (local_iterator != last)       // pointer comparison 2
>         local_iterator++;
>     }
>
> I think that the IR has NO UB at this moment;
> (The  pointer comparison 2 in the 2nd iteration of “I” loop and
> both pointer comparison in the 3rd iteration of “I” loop will
> NOT execute at all)
>
> **After cunrolli (fully unrolling of the above i loop):
>
> const char *local_iterator = beginning address of string "aa";
> const char *last = last address of string "aa";
> int i = 0;
>
> if (local_iterator != last)                     // pointer comparison 1
>   {
>     local_iterator++;
>     if (local_iterator != last)         // pointer comparison 2
>       {
>         local_iterator++;
>         i++;
>         if (local_iterator != last)             // pointer comparison 3
>           {
>             local_iterator++;
>             if (local_iterator != last) // pointer comparison 4
>               {
>                 local_iterator++;
>                 i++;
>                 if (local_iterator != last)     // pointer comparison 5
>                   {
>                     local_iterator++;
>                     if (local_iterator != last) // pointer comparison 6
>                       {
>                         local_iterator++;
>                         i++;
>                       }
>                   }
>               }
>           }
>       }
>   }
>
> Also, I think that the IR has NO UB at this moment too.
> (The pointer comparison 4 and later will NOT execute at all)
>
> ** The false positive warnings claimed for pointer comparison 4, 5, and 6
> assuming that these comparison will be executed during runtime but actually
> they will not be executed at all.
>
>
> So, based on the above study, my impression is:
>
> A.  Unrolling transformation didn’t do anything wrong.
> B.  Only after const propagation, the compiler will know that the pointer 
> “local_iterator” will
>       point to an out-of-bound position of string “aa”, the pointer 
> comparison 4, 5, 6 is invalid
>       After that.
>
> So, I guess that Jeff’s patch to identify the invalid pointer address at 
> “vrp1” is reasonable.
>
> What’s your opinion on this?


Your analysis is correct - the long-standing design issue is that -Warray-bounds
does not distinguish conditional UB from unconditional UB (but that concept is
somewhat wishful thinking given a function isn't guaranteed to be invoked either
unless it is main()).  The other thing is that while -Warray-bounds
without unrolling
likely sees the value-range of i_1 for &a[i_1] includes out-of-bound values it
chooses to only diagnose cases where the whole value-range is out-of-bound
(correctly so to avoid false positives just because of conservative ranges).

I think Jeff's patch is not reasonable since it boils down to not diagnose
-Warray-bounds but instead remove those stmts.

The problem in this particular testcase is that we fail to optimize the
compares - not because we don't exploit the visible UB but because
we fail to compute the final value of 'last' from

    auto last = "aa";
    while (*last)
        ++last;

one way to attack this is to recognize this pattern in niter analysis
and record niter as strlen (last) (but it would be a "first" to have
a niter pattern with a memory operation), another would be to
handle this in loop_niter_by_eval and allow us to take advantage
of the constant &'aa' initial value (that would only scale so much).

>
> > Note the unroller explicitly tries to
> > prune those "bugs" given it happily exploits UB when computing the
> > number of iterations of a loop,
> > like for
> >
> > int a[6];
> > for (int i = 0; i < n; ++i)
> >  a[i] = 1;
> >
> > it knows the loop can at most iterate 5 times because a[i] = 1 with i
> > == 6 would invoke UB.
>
> So, the unroller has some heuristic to limit the potential UB invoked by 
> unrolling? Could you point me the corresponding routine for such heuristic if 
> convenient?

It cuts off those via
tree-ssa-loop-ivcanon.cc:remove_exits_and_undefined_stmts, but this is
invoked only for the last copy,
and it only handles stmts that we recorded bounds for via
infer_loop_bounds_from_undefined.

It's a bit difficult to extend this to non-last copies, we'd have to
incrementally copy and we'll possibly disrupt the
batching of transforms, causing compile-time issues.  But I haven't
had time to investigate.

To summarize the following things should be done to improve things in
this and similar situations:
  a) improve niter analysis / final value replacement to constant fold
'last' in such cases
  b) improve unroll to cut out UB stmts of earlier iterations than the
last (size estimates also do not
      consider this, so it's also a missed optimization)
  c) better explain diagnostics

There's also the possibility to disable UB diagnostics for all unroll
copies when unroll used the
iteration bound rather than the exact number of iterations (I think we
don't record whether the
bound was from UB or from sth else).  Possibly we can try to disable
UB diagnostic only for
stmts with UB as recorded from bounds, similar to
remove_exits_and_undefined_stmts, but
with easier to guarantee to not affect batching, since we're not
inserting stmts or removing edges.

Richard.

> thanks.
>
> Qing
>
> >
> > Richard.
> >
> >>
> >> thanks.
> >>
> >> Qing
> >>>
> >>> So in the end Jeff - I think your patch isn't a good approach for the
> >>> issue at hand.
> >>>
> >>> Richard.
> >>>
> >>>> The above is very rough and initial idea at this moment.
> >>>>
> >>>> Qing
> >>>>
> >>>>>
> >>>>> jeff
>
>

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

Reply via email to