92539] Improve code and avoid Warray-bounds false positive

Qing Zhao Wed, 08 Jan 2025 08:35:00 -0800


> On Jan 7, 2025, at 07:29, Richard Biener <richard.guent...@gmail.com> wrote:
> 
> On Mon, Jan 6, 2025 at 5:40 PM Qing Zhao <qing.z...@oracle.com> wrote:
>> 
>> 
>> 
>>> On Jan 6, 2025, at 11:01, Richard Biener <richard.guent...@gmail.com> wrote:
>>> 
>>> On Mon, Jan 6, 2025 at 3:43 PM Qing Zhao <qing.z...@oracle.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Jan 6, 2025, at 09:21, Jeff Law <jeffreya...@gmail.com> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On 1/6/25 7:11 AM, Qing Zhao wrote:
>>>>>>> 
>>>>>>> Given it doesn't cause user visible UB, we could insert the trap 
>>>>>>> *before* the UB inducing statement.  That would then make the statement 
>>>>>>> unreachable and it'd get removed avoiding the false positive diagnostic.
>>>>>> Yes, that’s a good idea.
>>>>>> However, in order to distinguish a user visible UB and a UB in the IL 
>>>>>> that is introduced purely by compiler, we might need some new marking in 
>>>>>> the IR?
>>>>> I don't think we've ever really tackled that question; the closest I can 
>>>>> think of would be things like integer overflow which we try to avoid 
>>>>> allowing the compiler to introduce.  If we take the integer overflow as 
>>>>> the model, then that would say we should be tackling this during loop 
>>>>> unrolling.
>>>> 
>>>> UB that is introduced by compiler transformation is one important cause of 
>>>> false positive warnings.
>>>> 
>>>> There are two approaches to tackle this problem from my understanding:
>>>> 
>>>> 1. Avoid generating such UB from the beginning. i.e, for every compiler 
>>>> transformation that might introduce such UB, we should add check to avoid 
>>>> generating it.
>>>> 
>>>> 2. Marking the IR portion that were generated by compiler transformations, 
>>>> then check whether the UB is compiler generated when issue static checker 
>>>> warnings.
>>>> 
>>>> Are there other approaches?
>>> 
>>> Note unrolling doesn't introduce UB - it makes conditional UB
>>> "obvious”.
>> 
>> So, you mean this is the same issue as PR109071 (and PR85788, PR88771, etc), 
>> i.e, the compiler optimization make the conditional UB that’s originally in 
>> the source code “obvious” after code duplication?
>> 
>> (I need to study the testing case in PR92539 more carefully to make sure 
>> this is the case...)
>> 
>> If so, then the claimed false positive warning in PR92539 actually is a real 
>> bug in the original source code,  and my patch that introduced the new 
>> option “--fdiagnostics-details” should also include loop unrolling to 
>> provide more details on the warning introduced by loop unrolling.
>> 
>> 
>>> Note -Warray-bounds wants to
>>> diagnose UB, so doing path isolation and removing the UB would make
>>> -Warray-bounds useless.
>>> 
>>> So unless the condition guarding the UB unrolling exposes is visibly
>>> false to the compiler but we fail
>>> to exploit that (missed optimization) there's not much that we can do.
>>> I think "folding" away the UB
>>> like what Jeff proposes trades false negatives for the false positive
>>> diagnostics.
>>> 
>>> Note the unroller knows UB that effectively bounds the number of
>>> iterations, even on conditional
>>> paths and it uses this to limit the number of copies _and_ to prune
>>> unreachable paths (exploiting
>>> UB, avoiding diagnostics).  But one of the limitations is that it only
>>> prunes paths in the last unrolled
>>> copy which can be insufficient (ISTR some PR where I noticed this).
>>> 
>>> That said - I think for these unroller exposed cases of apparent false
>>> positives we should improve
>>> the path pruning in the unroller itself.  For the other cases the path
>>> diagnostic might help clarify
>>> that the UB happens on the 'n-th' iteration of the loop when some
>>> additional condition is true/false.
>> 
>> So, the “other cases” refer to the situation similar as PR109071, i.e, 
>> “conditional UB” in the original source code is made obvious after loop 
>> unrolling?
>> Yes, for such cases, the new option I have been trying to add, 
>> “-fdiagnostic-details” should be able to track and provide more details on 
>> the conditions that lead to the UB.
>> Is this understanding correct?
> 
> I think so, but I didn't look into the testcase of the referenced PR.


I took a detailed study of the test case of PR92539 yesterday.  The following 
is a brief summary:

1. The pass that caused the issue is: cunrolli.
     Adding -fdisable-tree-cunrolli eliminate the false positive warnings.

2. The IR Before cunrolli:

const char *local_iterator = beginning address of string "aa";
const char *last = last address of string "aa";

for (int i = 0; i < 3; ++i)
  if (local_iterator != last)           // pointer comparison 1
    {
      local_iterator++;
      if (local_iterator != last)       // pointer comparison 2
        local_iterator++;
    }

I think that the IR has NO UB at this moment;
(The  pointer comparison 2 in the 2nd iteration of “I” loop and 
both pointer comparison in the 3rd iteration of “I” loop will 
NOT execute at all)

**After cunrolli (fully unrolling of the above i loop):

const char *local_iterator = beginning address of string "aa";
const char *last = last address of string "aa";
int i = 0;

if (local_iterator != last)                     // pointer comparison 1
  {
    local_iterator++;
    if (local_iterator != last)         // pointer comparison 2
      {
        local_iterator++;
        i++;
        if (local_iterator != last)             // pointer comparison 3
          {
            local_iterator++;
            if (local_iterator != last) // pointer comparison 4
              {
                local_iterator++;
                i++;
                if (local_iterator != last)     // pointer comparison 5
                  {
                    local_iterator++;
                    if (local_iterator != last) // pointer comparison 6
                      {
                        local_iterator++;
                        i++;
                      }
                  }
              }
          }
      }
  }

Also, I think that the IR has NO UB at this moment too. 
(The pointer comparison 4 and later will NOT execute at all)

** The false positive warnings claimed for pointer comparison 4, 5, and 6 
assuming that these comparison will be executed during runtime but actually
they will not be executed at all. 


So, based on the above study, my impression is:

A.  Unrolling transformation didn’t do anything wrong. 
B.  Only after const propagation, the compiler will know that the pointer 
“local_iterator” will
      point to an out-of-bound position of string “aa”, the pointer comparison 
4, 5, 6 is invalid
      After that. 

So, I guess that Jeff’s patch to identify the invalid pointer address at “vrp1” 
is reasonable. 

What’s your opinion on this?


> Note the unroller explicitly tries to
> prune those "bugs" given it happily exploits UB when computing the
> number of iterations of a loop,
> like for
> 
> int a[6];
> for (int i = 0; i < n; ++i)
>  a[i] = 1;
> 
> it knows the loop can at most iterate 5 times because a[i] = 1 with i
> == 6 would invoke UB.

So, the unroller has some heuristic to limit the potential UB invoked by 
unrolling? Could you point me the corresponding routine for such heuristic if 
convenient?

thanks.

Qing

> 
> Richard.
> 
>> 
>> thanks.
>> 
>> Qing
>>> 
>>> So in the end Jeff - I think your patch isn't a good approach for the
>>> issue at hand.
>>> 
>>> Richard.
>>> 
>>>> The above is very rough and initial idea at this moment.
>>>> 
>>>> Qing
>>>> 
>>>>> 
>>>>> jeff

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

Reply via email to