Re: [RFA] [tree-optimization/80576] Handle non-constant sizes in DSE

Jeff Law Fri, 23 Aug 2019 08:39:49 -0700

On 8/22/19 10:39 AM, Martin Sebor wrote:
> On 8/21/19 4:34 PM, Jeff Law wrote:
>> On 8/16/19 4:21 PM, Martin Sebor wrote:
>>> On 8/16/19 12:15 PM, Jeff Law wrote:
>>>> On 8/16/19 12:09 PM, Marc Glisse wrote:
>>>>> On Fri, 16 Aug 2019, Jeff Law wrote:
>>>>>
>>>>>> This patch improves our ability to detect dead stores by handling
>>>>>> cases
>>>>>> where the size memcpy, memset, strncpy, etc call is not constant. 
>>>>>> This
>>>>>> addresses some, but not all, of the issues in 80576.
>>>>>>
>>>>>> The key here is when the size is not constant we can make
>>>>>> conservative
>>>>>> decisions that still give us a chance to analyze the code for dead
>>>>>> stores.
>>>>>>
>>>>>> Remember that for dead store elimination, we're trying to prove that
>>>>>> given two stores, the second store overwrites (partially or fully)
>>>>>> the
>>>>>> same memory locations as the first store.  That makes the first store
>>>>>> either partially or fully dead.
>>>>>>
>>>>>> When we encounter the first store, we set up a bitmap of bytes
>>>>>> written
>>>>>> by that store (live_bytes).  We then look at subsequent stores and
>>>>>> clear
>>>>>> the appropriate entries in the bitmap.
>>>>>>
>>>>>> If the first store has a nonconstant length argument we can use the
>>>>>> range of the length argument (max) and the size of the destination
>>>>>> object to make a conservative estimation of how many bytes are
>>>>>> written.
>>>>>>
>>>>>> For the second store the conservative thing to do for a non-constant
>>>>>> length is to use the minimum of the range of the length argument.
>>>>>
>>>>> So I guess it won't handle things like
>>>>>
>>>>> void f(char*p,int n){
>>>>>     __builtin_memset(p,3,n);
>>>>>     __builtin_memset(p,7,n);
>>>>> }
>>>>>
>>>>> where we know nothing about the length, except that it is the same? Or
>>>>> do you look at symbolic ranges?
>>>> Nope.  I think ao_ref can represent that, so it'd just be a matter of
>>>> recording "n" as the length, then verifying that the second call's
>>>> length is "n" as well.  That makes the first call dead.  We'd have to
>>>> bypass the byte tracking in that case, but I think that's trivial
>>>> because we already have a means to do that when the sizes are too
>>>> large.
>>>>
>>>>>
>>>>>> This doesn't come up a lot in practice.  But it also happens to put
>>>>>> some
>>>>>> of the infrastructure in place to handle strcpy and strcpy_chk which
>>>>>> are
>>>>>> needed to fully resolve 80576.
>>>>>>
>>>>>> Bootstrapped and regression tested on x86, x86_64, ppc64le, ppc64,
>>>>>> ppc32, aarch64, sparc, s390x and probably others.  Also verified that
>>>>>> the tests work on the various *-elf targets in my tester.
>>>>>>
>>>>>> OK for the trunk?
>>>>>
>>>>> ENOPATCH
>>>> Opps.    Attached.
>>>
>>> It's an improvement and I realize you said it doesn't handle
>>> everything and that you don't think it comes up a lot, but...
>>> I would actually expect the following example (from the bug)
>>> not to be that uncommon:
>>>
>>>    void g (char *s)
>>>    {
>>>      char a[8];
>>>      __builtin_strcpy (a, s);
>>>      __builtin_memset (a, 0, sizeof a);
>>>      f (a);
>>>    }
>>>
>>> or at least to be more common than the equivalent alternative
>>> the improvement does optimize:
>>>
>>>    void g (char *s)
>>>    {
>>>      char a[8];
>>>      __builtin_memcpy (a, s,  __builtin_strlen (s));
>>>      __builtin_memset (a, 0, 8);
>>>      f (a);
>>>    }
>>>
>>> It seems that making the first one work should be just a matter
>>> of handling strcpy analogously (the string length doesn't matter).
>>>
>>> As an aside, the new tests make me realize that -Wstringop-overflow
>>> should be enhanced to detect this problem (i.e., a consider
>>> the largest size in a PHI).
>> Certainly not seeing much of these in the code I've looked at.  It may
>> be a case that aliasing gets in the way often.
>>
>> Also note that we've got the same issues in this space that we did with
>> the strlen optimization improvements for last year.  I totally spaced
>> that and the net result may be we have to avoid using the type size for
>> anything in DSE which may make it impossible to really do a good job.
> 
> The issue with the strlen optimization was that it relied on
> the type of subobjects of multidimensional arrays and structs
> to determine the maximum valid size of the access to them.
> This ran afoul of assumptions in code that doesn't respect
> subobject boundaries.
Right.  And my proposed DSE changes have the same fundamental problem.


> 
> This is not a concern for outermost objects like in the example
> above.  strlen and other parts of GCC still make use of their
> size for optimization.  The middle-end diagnostics I've been
> adding still expect code to respect subobject boundaries.  I've
> been doing that for this very reason: to let us do a better job
> not just optimizing code, but also diagnosing bugs: find more
> real problems with fewer false positives.
True, but the proposed DSE code does not restrict itself to outermost
objects.  That's what I need to fix before resubmitting.  Thankfully
it's not terribly hard.

jeff

Re: [RFA] [tree-optimization/80576] Handle non-constant sizes in DSE

Reply via email to