92539] Improve code and avoid Warray-bounds false positive

Sam James Thu, 09 Jan 2025 00:18:12 -0800

Richard Biener <richard.guent...@gmail.com> writes:

> On Wed, Jan 8, 2025 at 5:34 PM Qing Zhao <qing.z...@oracle.com> wrote:
>>
>>
>>
>> > On Jan 7, 2025, at 07:29, Richard Biener <richard.guent...@gmail.com> 
>> > wrote:
>> >
>> > On Mon, Jan 6, 2025 at 5:40 PM Qing Zhao <qing.z...@oracle.com> wrote:
>> >>
>> >>
>> >>
>> >>> On Jan 6, 2025, at 11:01, Richard Biener <richard.guent...@gmail.com> 
>> >>> wrote:
>> >>>
>> >>> On Mon, Jan 6, 2025 at 3:43 PM Qing Zhao <qing.z...@oracle.com> wrote:
>> >>>>
>> >>>>
>> >>>>
>> >>>>> On Jan 6, 2025, at 09:21, Jeff Law <jeffreya...@gmail.com> wrote:
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On 1/6/25 7:11 AM, Qing Zhao wrote:
>> >>>>>>>
>> >>>>>>> Given it doesn't cause user visible UB, we could insert the
>> >>>>>>> trap *before* the UB inducing statement.  That would then
>> >>>>>>> make the statement unreachable and it'd get removed avoiding
>> >>>>>>> the false positive diagnostic.
>> >>>>>> Yes, that’s a good idea.
>> >>>>>> However, in order to distinguish a user visible UB and a UB in the IL 
>> >>>>>> that is introduced purely by compiler, we might need some new marking 
>> >>>>>> in the IR?
>> >>>>> I don't think we've ever really tackled that question; the
>> >>>>> closest I can think of would be things like integer overflow
>> >>>>> which we try to avoid allowing the compiler to introduce.  If
>> >>>>> we take the integer overflow as the model, then that would say
>> >>>>> we should be tackling this during loop unrolling.
>> >>>>
>> >>>> UB that is introduced by compiler transformation is one important cause 
>> >>>> of false positive warnings.
>> >>>>
>> >>>> There are two approaches to tackle this problem from my understanding:
>> >>>>
>> >>>> 1. Avoid generating such UB from the beginning. i.e, for every compiler 
>> >>>> transformation that might introduce such UB, we should add check to 
>> >>>> avoid generating it.
>> >>>>
>> >>>> 2. Marking the IR portion that were generated by compiler 
>> >>>> transformations, then check whether the UB is compiler generated when 
>> >>>> issue static checker warnings.
>> >>>>
>> >>>> Are there other approaches?
>> >>>
>> >>> Note unrolling doesn't introduce UB - it makes conditional UB
>> >>> "obvious”.
>> >>
>> >> So, you mean this is the same issue as PR109071 (and PR85788,
>> >> PR88771, etc), i.e, the compiler optimization make the
>> >> conditional UB that’s originally in the source code “obvious”
>> >> after code duplication?
>> >>
>> >> (I need to study the testing case in PR92539 more carefully to make sure 
>> >> this is the case...)
>> >>
>> >> If so, then the claimed false positive warning in PR92539
>> >> actually is a real bug in the original source code, and my patch
>> >> that introduced the new option “--fdiagnostics-details” should
>> >> also include loop unrolling to provide more details on the
>> >> warning introduced by loop unrolling.
>> >>
>> >>
>> >>> Note -Warray-bounds wants to
>> >>> diagnose UB, so doing path isolation and removing the UB would make
>> >>> -Warray-bounds useless.
>> >>>
>> >>> So unless the condition guarding the UB unrolling exposes is visibly
>> >>> false to the compiler but we fail
>> >>> to exploit that (missed optimization) there's not much that we can do.
>> >>> I think "folding" away the UB
>> >>> like what Jeff proposes trades false negatives for the false positive
>> >>> diagnostics.
>> >>>
>> >>> Note the unroller knows UB that effectively bounds the number of
>> >>> iterations, even on conditional
>> >>> paths and it uses this to limit the number of copies _and_ to prune
>> >>> unreachable paths (exploiting
>> >>> UB, avoiding diagnostics).  But one of the limitations is that it only
>> >>> prunes paths in the last unrolled
>> >>> copy which can be insufficient (ISTR some PR where I noticed this).
>> >>>
>> >>> That said - I think for these unroller exposed cases of apparent false
>> >>> positives we should improve
>> >>> the path pruning in the unroller itself.  For the other cases the path
>> >>> diagnostic might help clarify
>> >>> that the UB happens on the 'n-th' iteration of the loop when some
>> >>> additional condition is true/false.
>> >>
>> >> So, the “other cases” refer to the situation similar as PR109071, i.e, 
>> >> “conditional UB” in the original source code is made obvious after loop 
>> >> unrolling?
>> >> Yes, for such cases, the new option I have been trying to add, 
>> >> “-fdiagnostic-details” should be able to track and provide more details 
>> >> on the conditions that lead to the UB.
>> >> Is this understanding correct?
>> >
>> > I think so, but I didn't look into the testcase of the referenced PR.
>>
>> I took a detailed study of the test case of PR92539 yesterday.  The 
>> following is a brief summary:
>>
>> 1. The pass that caused the issue is: cunrolli.
>>      Adding -fdisable-tree-cunrolli eliminate the false positive warnings.
>>
>> 2. The IR Before cunrolli:
>>
>> const char *local_iterator = beginning address of string "aa";
>> const char *last = last address of string "aa";
>>
>> for (int i = 0; i < 3; ++i)
>>   if (local_iterator != last)           // pointer comparison 1
>>     {
>>       local_iterator++;
>>       if (local_iterator != last)       // pointer comparison 2
>>         local_iterator++;
>>     }
>>
>> I think that the IR has NO UB at this moment;
>> (The  pointer comparison 2 in the 2nd iteration of “I” loop and
>> both pointer comparison in the 3rd iteration of “I” loop will
>> NOT execute at all)
>>
>> **After cunrolli (fully unrolling of the above i loop):
>>
>> const char *local_iterator = beginning address of string "aa";
>> const char *last = last address of string "aa";
>> int i = 0;
>>
>> if (local_iterator != last)                     // pointer comparison 1
>>   {
>>     local_iterator++;
>>     if (local_iterator != last)         // pointer comparison 2
>>       {
>>         local_iterator++;
>>         i++;
>>         if (local_iterator != last)             // pointer comparison 3
>>           {
>>             local_iterator++;
>>             if (local_iterator != last) // pointer comparison 4
>>               {
>>                 local_iterator++;
>>                 i++;
>>                 if (local_iterator != last)     // pointer comparison 5
>>                   {
>>                     local_iterator++;
>>                     if (local_iterator != last) // pointer comparison 6
>>                       {
>>                         local_iterator++;
>>                         i++;
>>                       }
>>                   }
>>               }
>>           }
>>       }
>>   }
>>
>> Also, I think that the IR has NO UB at this moment too.
>> (The pointer comparison 4 and later will NOT execute at all)
>>
>> ** The false positive warnings claimed for pointer comparison 4, 5, and 6
>> assuming that these comparison will be executed during runtime but actually
>> they will not be executed at all.
>>
>>
>> So, based on the above study, my impression is:
>>
>> A.  Unrolling transformation didn’t do anything wrong.
>> B.  Only after const propagation, the compiler will know that the pointer 
>> “local_iterator” will
>>       point to an out-of-bound position of string “aa”, the pointer 
>> comparison 4, 5, 6 is invalid
>>       After that.
>>
>> So, I guess that Jeff’s patch to identify the invalid pointer address at 
>> “vrp1” is reasonable.
>>
>> What’s your opinion on this?
>
> Your analysis is correct - the long-standing design issue is that 
> -Warray-bounds
> does not distinguish conditional UB from unconditional UB (but that concept is
> somewhat wishful thinking given a function isn't guaranteed to be invoked 
> either
> unless it is main()).  The other thing is that while -Warray-bounds
> without unrolling
> likely sees the value-range of i_1 for &a[i_1] includes out-of-bound values it
> chooses to only diagnose cases where the whole value-range is out-of-bound
> (correctly so to avoid false positives just because of conservative ranges).
>
> I think Jeff's patch is not reasonable since it boils down to not diagnose
> -Warray-bounds but instead remove those stmts.
>
> The problem in this particular testcase is that we fail to optimize the
> compares - not because we don't exploit the visible UB but because
> we fail to compute the final value of 'last' from
>
>     auto last = "aa";
>     while (*last)
>         ++last;
>
> one way to attack this is to recognize this pattern in niter analysis
> and record niter as strlen (last) (but it would be a "first" to have
> a niter pattern with a memory operation), another would be to
> handle this in loop_niter_by_eval and allow us to take advantage
> of the constant &'aa' initial value (that would only scale so much).
>
>>
>> > Note the unroller explicitly tries to
>> > prune those "bugs" given it happily exploits UB when computing the
>> > number of iterations of a loop,
>> > like for
>> >
>> > int a[6];
>> > for (int i = 0; i < n; ++i)
>> >  a[i] = 1;
>> >
>> > it knows the loop can at most iterate 5 times because a[i] = 1 with i
>> > == 6 would invoke UB.
>>
>> So, the unroller has some heuristic to limit the potential UB invoked by 
>> unrolling? Could you point me the corresponding routine for such heuristic 
>> if convenient?
>
> It cuts off those via
> tree-ssa-loop-ivcanon.cc:remove_exits_and_undefined_stmts, but this is
> invoked only for the last copy,
> and it only handles stmts that we recorded bounds for via
> infer_loop_bounds_from_undefined.
>
> It's a bit difficult to extend this to non-last copies, we'd have to
> incrementally copy and we'll possibly disrupt the
> batching of transforms, causing compile-time issues.  But I haven't
> had time to investigate.
>
> To summarize the following things should be done to improve things in
> this and similar situations:
>   a) improve niter analysis / final value replacement to constant fold
> 'last' in such cases
>   b) improve unroll to cut out UB stmts of earlier iterations than the
> last (size estimates also do not
>       consider this, so it's also a missed optimization)
>   c) better explain diagnostics


Maybe even with -fdiagnostics-details in 15 as a late Christmas present?
:)

The only issues I've found when using Qing's patch are cases I need to
reduce where we may be able to diagnose more, but nothing else. Kees is
using it for kernel issues already too.

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

Reply via email to