92539] Improve code and avoid Warray-bounds false positive

Qing Zhao Thu, 09 Jan 2025 09:48:57 -0800


> On Jan 9, 2025, at 03:08, Richard Biener <richard.guent...@gmail.com> wrote:
> 
> On Wed, Jan 8, 2025 at 5:34 PM Qing Zhao <qing.z...@oracle.com> wrote:
>> 
>> 
>> 
>>> On Jan 7, 2025, at 07:29, Richard Biener <richard.guent...@gmail.com> wrote:
>>> 
>>> On Mon, Jan 6, 2025 at 5:40 PM Qing Zhao <qing.z...@oracle.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Jan 6, 2025, at 11:01, Richard Biener <richard.guent...@gmail.com> 
>>>>> wrote:
>>>>> 
>>>>> On Mon, Jan 6, 2025 at 3:43 PM Qing Zhao <qing.z...@oracle.com> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Jan 6, 2025, at 09:21, Jeff Law <jeffreya...@gmail.com> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On 1/6/25 7:11 AM, Qing Zhao wrote:
>>>>>>>>> 
>>>>>>>>> Given it doesn't cause user visible UB, we could insert the trap 
>>>>>>>>> *before* the UB inducing statement.  That would then make the 
>>>>>>>>> statement unreachable and it'd get removed avoiding the false 
>>>>>>>>> positive diagnostic.
>>>>>>>> Yes, that’s a good idea.
>>>>>>>> However, in order to distinguish a user visible UB and a UB in the IL 
>>>>>>>> that is introduced purely by compiler, we might need some new marking 
>>>>>>>> in the IR?
>>>>>>> I don't think we've ever really tackled that question; the closest I 
>>>>>>> can think of would be things like integer overflow which we try to 
>>>>>>> avoid allowing the compiler to introduce.  If we take the integer 
>>>>>>> overflow as the model, then that would say we should be tackling this 
>>>>>>> during loop unrolling.
>>>>>> 
>>>>>> UB that is introduced by compiler transformation is one important cause 
>>>>>> of false positive warnings.
>>>>>> 
>>>>>> There are two approaches to tackle this problem from my understanding:
>>>>>> 
>>>>>> 1. Avoid generating such UB from the beginning. i.e, for every compiler 
>>>>>> transformation that might introduce such UB, we should add check to 
>>>>>> avoid generating it.
>>>>>> 
>>>>>> 2. Marking the IR portion that were generated by compiler 
>>>>>> transformations, then check whether the UB is compiler generated when 
>>>>>> issue static checker warnings.
>>>>>> 
>>>>>> Are there other approaches?
>>>>> 
>>>>> Note unrolling doesn't introduce UB - it makes conditional UB
>>>>> "obvious”.
>>>> 
>>>> So, you mean this is the same issue as PR109071 (and PR85788, PR88771, 
>>>> etc), i.e, the compiler optimization make the conditional UB that’s 
>>>> originally in the source code “obvious” after code duplication?
>>>> 
>>>> (I need to study the testing case in PR92539 more carefully to make sure 
>>>> this is the case...)
>>>> 
>>>> If so, then the claimed false positive warning in PR92539 actually is a 
>>>> real bug in the original source code,  and my patch that introduced the 
>>>> new option “--fdiagnostics-details” should also include loop unrolling to 
>>>> provide more details on the warning introduced by loop unrolling.
>>>> 
>>>> 
>>>>> Note -Warray-bounds wants to
>>>>> diagnose UB, so doing path isolation and removing the UB would make
>>>>> -Warray-bounds useless.
>>>>> 
>>>>> So unless the condition guarding the UB unrolling exposes is visibly
>>>>> false to the compiler but we fail
>>>>> to exploit that (missed optimization) there's not much that we can do.
>>>>> I think "folding" away the UB
>>>>> like what Jeff proposes trades false negatives for the false positive
>>>>> diagnostics.
>>>>> 
>>>>> Note the unroller knows UB that effectively bounds the number of
>>>>> iterations, even on conditional
>>>>> paths and it uses this to limit the number of copies _and_ to prune
>>>>> unreachable paths (exploiting
>>>>> UB, avoiding diagnostics).  But one of the limitations is that it only
>>>>> prunes paths in the last unrolled
>>>>> copy which can be insufficient (ISTR some PR where I noticed this).
>>>>> 
>>>>> That said - I think for these unroller exposed cases of apparent false
>>>>> positives we should improve
>>>>> the path pruning in the unroller itself.  For the other cases the path
>>>>> diagnostic might help clarify
>>>>> that the UB happens on the 'n-th' iteration of the loop when some
>>>>> additional condition is true/false.
>>>> 
>>>> So, the “other cases” refer to the situation similar as PR109071, i.e, 
>>>> “conditional UB” in the original source code is made obvious after loop 
>>>> unrolling?
>>>> Yes, for such cases, the new option I have been trying to add, 
>>>> “-fdiagnostic-details” should be able to track and provide more details on 
>>>> the conditions that lead to the UB.
>>>> Is this understanding correct?
>>> 
>>> I think so, but I didn't look into the testcase of the referenced PR.
>> 
>> I took a detailed study of the test case of PR92539 yesterday.  The 
>> following is a brief summary:
>> 
>> 1. The pass that caused the issue is: cunrolli.
>>     Adding -fdisable-tree-cunrolli eliminate the false positive warnings.
>> 
>> 2. The IR Before cunrolli:
>> 
>> const char *local_iterator = beginning address of string "aa";
>> const char *last = last address of string "aa";
>> 
>> for (int i = 0; i < 3; ++i)
>>  if (local_iterator != last)           // pointer comparison 1
>>    {
>>      local_iterator++;
>>      if (local_iterator != last)       // pointer comparison 2
>>        local_iterator++;
>>    }
>> 
>> I think that the IR has NO UB at this moment;
>> (The  pointer comparison 2 in the 2nd iteration of “I” loop and
>> both pointer comparison in the 3rd iteration of “I” loop will
>> NOT execute at all)
>> 
>> **After cunrolli (fully unrolling of the above i loop):
>> 
>> const char *local_iterator = beginning address of string "aa";
>> const char *last = last address of string "aa";
>> int i = 0;
>> 
>> if (local_iterator != last)                     // pointer comparison 1
>>  {
>>    local_iterator++;
>>    if (local_iterator != last)         // pointer comparison 2
>>      {
>>        local_iterator++;
>>        i++;
>>        if (local_iterator != last)             // pointer comparison 3
>>          {
>>            local_iterator++;
>>            if (local_iterator != last) // pointer comparison 4
>>              {
>>                local_iterator++;
>>                i++;
>>                if (local_iterator != last)     // pointer comparison 5
>>                  {
>>                    local_iterator++;
>>                    if (local_iterator != last) // pointer comparison 6
>>                      {
>>                        local_iterator++;
>>                        i++;
>>                      }
>>                  }
>>              }
>>          }
>>      }
>>  }
>> 
>> Also, I think that the IR has NO UB at this moment too.
>> (The pointer comparison 4 and later will NOT execute at all)
>> 
>> ** The false positive warnings claimed for pointer comparison 4, 5, and 6
>> assuming that these comparison will be executed during runtime but actually
>> they will not be executed at all.
>> 
>> 
>> So, based on the above study, my impression is:
>> 
>> A.  Unrolling transformation didn’t do anything wrong.
>> B.  Only after const propagation, the compiler will know that the pointer 
>> “local_iterator” will
>>      point to an out-of-bound position of string “aa”, the pointer 
>> comparison 4, 5, 6 is invalid
>>      After that.
>> 
>> So, I guess that Jeff’s patch to identify the invalid pointer address at 
>> “vrp1” is reasonable.
>> 
>> What’s your opinion on this?
> 
> Your analysis is correct - the long-standing design issue is that 
> -Warray-bounds
> does not distinguish conditional UB from unconditional UB (but that concept is
> somewhat wishful thinking given a function isn't guaranteed to be invoked 
> either
> unless it is main()).  The other thing is that while -Warray-bounds
> without unrolling
> likely sees the value-range of i_1 for &a[i_1] includes out-of-bound values it
> chooses to only diagnose cases where the whole value-range is out-of-bound
> (correctly so to avoid false positives just because of conservative ranges).
> 
> I think Jeff's patch is not reasonable since it boils down to not diagnose
> -Warray-bounds but instead remove those stmts.


If these stmts are dead-code that are generated by compiler optimization (NOT 
from source code),
removing them before diagnosis is correct. (To avoid false positive warnings). 

I have a question here, when these stmts are dead-code from source code in the 
beginning, 
shall the -Warray-bounds report the warnings in the dead-code of source code? 
> 
> The problem in this particular testcase is that we fail to optimize the
> compares - not because we don't exploit the visible UB but because
> we fail to compute the final value of 'last' from
> 
>    auto last = "aa";
>    while (*last)
>        ++last;
> 
> one way to attack this is to recognize this pattern in niter analysis
> and record niter as strlen (last) (but it would be a "first" to have
> a niter pattern with a memory operation), another would be to
> handle this in loop_niter_by_eval and allow us to take advantage
> of the constant &'aa' initial value (that would only scale so much).

Yes, I agree. if the loop unrolling phase can successfully avoid such UB code, 
that’s the best
solution to this PR92539. 
> 
>> 
>>> Note the unroller explicitly tries to
>>> prune those "bugs" given it happily exploits UB when computing the
>>> number of iterations of a loop,
>>> like for
>>> 
>>> int a[6];
>>> for (int i = 0; i < n; ++i)
>>> a[i] = 1;
>>> 
>>> it knows the loop can at most iterate 5 times because a[i] = 1 with i
>>> == 6 would invoke UB.
>> 
>> So, the unroller has some heuristic to limit the potential UB invoked by 
>> unrolling? Could you point me the corresponding routine for such heuristic 
>> if convenient?
> 
> It cuts off those via
> tree-ssa-loop-ivcanon.cc:remove_exits_and_undefined_stmts, but this is
> invoked only for the last copy,
> and it only handles stmts that we recorded bounds for via
> infer_loop_bounds_from_undefined.
> 
> It's a bit difficult to extend this to non-last copies, we'd have to
> incrementally copy and we'll possibly disrupt the
> batching of transforms, causing compile-time issues.  But I haven't
> had time to investigate.

For this specific PR, it’s probably feasible to enhance the current heuristic
in loop unrolling to avoid generating the UB code. 

However, I guess that it’s impossible to completely eliminate such situation in 
general. Right?

> 
> To summarize the following things should be done to improve things in
> this and similar situations:
>  a) improve niter analysis / final value replacement to constant fold
> 'last' in such cases
>  b) improve unroll to cut out UB stmts of earlier iterations than the
> last (size estimates also do not
>      consider this, so it's also a missed optimization)
>  c) better explain diagnostics

For c), I think that the patch set I have proposed to add the new option
 -fdiagnostics-details, is the right direction to go in general. 
However, currently, the patches didn’t include loop-unrolling transformation 
there. 

Could you please take a look at the patch set for -fdiagnostics-details 
(Version 4), and provide
More advice on:

A. Whether the middle-end design and implementation is reasonable and 
extendable to more other
     optimizations (including loop unrolling)?
B. If not, how to improve the current design to make it more extendable?
C. If yes, whether it’s okay to add this patches into GCC15, and then improve 
it in GCC16?
> 
> There's also the possibility to disable UB diagnostics for all unroll
> copies when unroll used the
> iteration bound rather than the exact number of iterations (I think we
> don't record whether the
> bound was from UB or from sth else).

Then it might be possible that we miss some real warnings?

>  Possibly we can try to disable
> UB diagnostic only for
> stmts with UB as recorded from bounds, similar to
> remove_exits_and_undefined_stmts, but
> with easier to guarantee to not affect batching, since we're not
> inserting stmts or removing edges.
That could be done too, then we need additional marking on the IRs to 
distinguish these stmts during loop unrolling?

Qing
> 
> Richard.
> 
>> thanks.
>> 
>> Qing
>> 
>>> 
>>> Richard.
>>> 
>>>> 
>>>> thanks.
>>>> 
>>>> Qing
>>>>> 
>>>>> So in the end Jeff - I think your patch isn't a good approach for the
>>>>> issue at hand.
>>>>> 
>>>>> Richard.
>>>>> 
>>>>>> The above is very rough and initial idea at this moment.
>>>>>> 
>>>>>> Qing
>>>>>> 
>>>>>>> 
>>>>>>> jeff

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

Reply via email to