Re: [PATCH] Make strlen range computations more conservative

Jeff Law Tue, 07 Aug 2018 08:33:39 -0700

On 08/06/2018 09:38 PM, Martin Sebor wrote:
> On 08/06/2018 11:40 AM, Jeff Law wrote:
>> On 08/06/2018 11:15 AM, Martin Sebor wrote:
>>>>> These examples do not aim to be valid C, they just point out
>>>>> limitations of the middle-end design, and a good deal of the
>>>>> problems are due to trying to do things that are not safe within
>>>>> the boundaries given by the middle-end design.
>>>> I really think this is important -- and as such I think we need to move
>>>> away from trying to describe scenarios in C because doing so keeps
>>>> bringing us back to the "C doesn't allow XYZ" kinds of arguments when
>>>> what we're really discussing are GIMPLE semantic issues.
>>>>
>>>> So examples should be GIMPLE.  You might start with (possibly
>>>> invalid) C code to generate the GIMPLE, but the actual discussion
>>>> needs to be looking at GIMPLE.  We might include the C code in case
>>>> someone wants to look at things in a debugger, but bringing the
>>>> focus to GIMPLE is really important here.
>>>
>>> I don't understand the goal of this exercise.  Unless the GIMPLE
>>> code is the result of a valid test case (in some language GCC
>>> supports), what does it matter what it looks like?  The basis of
>>> every single transformation done by a compiler is that the source
>>> code is correct.  If it isn't then all bets are off.  I'm no GIMPLE
>>> expert but even I can come up with any number of GIMPLE expressions
>>> that have undefined behavior.  What would that prove?
>> The GIMPLE IL is less restrictive than the original source language.
>> The process of translation into GIMPLE and optimization can create
>> situations in the GIMPLE IL that can't be validly represented in the
>> original source language.  Subobject crossing being one such case, there
>> are certainly others.  We have to handle these scenarios correctly.
> 
> Sure, but a valid C test case still needs to exist to show that
> such a transformation is possible.  Until someone comes up with
> one it's all speculation.
No, not at all.  The defined semantics in this space come from actually
bumping against these problems in the past -- resulting in defining the
semantics in the way we have.



> 
> Under normal circumstances the burden of proof that there is
> a problem is on the reporter.  In this case, the requirement
> has turned into one to prove a negative.  Effectively, you
> are asking for a proof that there is no bug, either in
> the assumptions behind the strlen optimization, or somewhere
> else in GCC that would lead the optimization to invalidate
> a valid piece of code.  That's impossible.
I disagree strongly.  We have *defined* a set of semantics in GIMPLE
based on the language lowering processes and needs of the optimizers.
For anything which transforms the IL, you must adhere to the semantics
of GIMPLE.  It's that simple.

I am sympathetic to the desire to use C semantics to get better refined
ranges, but that's just wrong for anything which impacts code generation.
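
To make the subobject-crossing concern concrete, here is a minimal
sketch in C.  The struct and function names are hypothetical and chosen
only for illustration.  The code itself is valid C because the pointer
passed to strlen designates the whole object.  The assumption being
illustrated is that, once lowered, the IL may no longer distinguish this
pointer from the address of the first member, so a bound derived from
the size of that member alone would not be safe to use for code
generation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type for illustration: two adjacent member arrays. */
struct S {
    char a[4];
    char b[4];
};

/* Valid C: the pointer designates the whole object, so strlen may read
   past the end of s->a and continue into s->b. */
size_t len_whole(const struct S *s)
{
    return strlen((const char *)s);
}

/* The upper bound a purely C-level analysis might assume for a string
   stored in the member a: its size minus the terminating NUL. */
size_t naive_max_len(void)
{
    return sizeof(((struct S *)0)->a) - 1;
}

/* Store a string that spans both members, then measure it through a
   pointer to the whole object. */
size_t demo_length(void)
{
    struct S s;
    static const char text[] = "abcdef";   /* 6 characters plus NUL */

    assert(sizeof text <= sizeof s);
    memcpy(&s, text, sizeof text);
    return len_whole(&s);
}
```

Here demo_length() returns 6 while naive_max_len() returns 3, so a
transformation that assumed the smaller bound for an address it could
not tell apart from &s.a would miscompile this code.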

This discussion doesn't seem to be moving beyond that basic point, which
is concerning.  It really feels like we should be moving towards the
question of how to avoid violating GIMPLE semantics for codegen/opt
issues while still getting good warnings.



Jeff
