On 12/1/20 2:21 PM, Martin Sebor wrote:
> On 12/1/20 1:57 PM, Martin Sebor wrote:
>> On 11/23/20 2:04 PM, Jeff Law wrote:
>>>
>>>
>>> On 11/4/20 5:58 PM, Martin Sebor via Gcc-patches wrote:
>>>> To determine the target of a pointer expression and the offset into
>>>> it, the increasingly widely used compute_objsize function traverses
>>>> the IL following the DEF statements of pointer variables, aggregating
>>>> offsets from POINTER_PLUS assignments along the way.  It does that
>>>> for many statements that involve pointers, including calls to
>>>> built-in functions and (so far only) accesses to char arrays.  When
>>>> a function has many such statements with pointers to the same objects
>>>> but with different offsets, the traversal ends up visiting the same
>>>> pointer assignments repeatedly and unnecessarily.
>>>>
>>>> To avoid this repeated traversal, the attached patch adds the ability
>>>> to cache results obtained in prior calls to the function.  The cache
>>>> is optional and only used when enabled.
>>>>
>>>> To exercise the cache I have enabled it for the strlen pass (which
>>>> is probably the heaviest compute_objsize user).  That happens to
>>>> resolve PR 97373, which tracks the pass's failure to detect sprintf
>>>> overflowing allocated buffers at a non-constant offset.  I thought
>>>> about making this a separate patch but the sprintf/strlen changes
>>>> are completely mechanical so it didn't seem worth the effort.
>>>>
>>>> In the benchmarking I've done the cache isn't a huge win there, but
>>>> it does make a measurable difference in the project I'm wrapping up,
>>>> where most pointer assignments need to be examined.  The space used
>>>> for the cache is negligible on average: fewer than 20 entries per
>>>> Glibc function and about 6 for GCC.  The worst case is about 6,000
>>>> entries in Glibc and 10,000 in GCC.  Since the entries are sizable
>>>> (216 bytes each), the worst-case memory consumption can be reduced
>>>> by adding a level of indirection.  A further savings can be obtained
>>>> by replacing some of the offset_int members of the entries with
>>>> HOST_WIDE_INT.
>>>>
>>>> The efficiency benefits of the cache should increase further as more
>>>> of the access checking code is integrated into the same pass.  This
>>>> should eventually include the checking currently done in the built-in
>>>> expanders.
>>>>
>>>> Tested on x86_64-linux, along with Glibc and Binutils/GDB.
>>>>
>>>> Martin
>>>>
>>>> PS The patch adds the new pointer_query class (loosely modeled on
>>>> range_query) to builtins.{h,c}.  This should be only temporary,
>>>> until the access checking code is moved into a file (and ultimately
>>>> a pass) of its own.
>>>>
>>>> gcc-97373.diff
>>>>
>>>> PR middle-end/97373 - missing warning on sprintf into allocated
>>>> destination
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>     PR middle-end/97373
    * builtins.c (compute_objsize): Rename...
>>>>     (compute_objsize_r): ...to this.  Change order and types of
>>>>     arguments.  Use new argument.  Adjust calls to self.
>>>>     (access_ref::get_ref): New member function.
>>>>     (pointer_query::pointer_query): New member function.
>>>>     (pointer_query::get_ref): Same.
>>>>     (pointer_query::put_ref): Same.
>>>>     (handle_min_max_size): Change order and types of arguments.
>>>>     (maybe_emit_free_warning): Add a test.
>>>>     * builtins.h (class pointer_query): New class.
>>>>     (compute_objsize): Declare an overload.
>>>>     * gimple-ssa-sprintf.c (get_destination_size):
>>>>     (handle_printf_call):
    * tree-ssa-strlen.c (adjust_last_stmt): Add an argument and
>>>>     use it.
>>>>     (maybe_warn_overflow): Same.
>>>>     (handle_builtin_strcpy): Same.
>>>>     (maybe_diag_stxncpy_trunc): Same.
>>>>     (handle_builtin_memcpy): Change argument type.  Adjust calls.
>>>>     (handle_builtin_strcat): Same.
>>>>     (handle_builtin_memset): Same.
>>>>     (handle_store): Same.
>>>>     (strlen_check_and_optimize_call): Same.
>>>>     (check_and_optimize_stmt): Same.
>>>>     (strlen_dom_walker): Add new data members.
>>>>     (strlen_dom_walker::before_dom_children): Use new member.
>>>>     (printf_strlen_execute): Dump cache performance counters.
>>>>     * tree-ssa-strlen.h (maybe_diag_stxncpy_trunc): Add argument.
>>>>     (handle_printf_call): Change argument type.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>>     PR middle-end/97373
>>>>     * gcc.dg/warn-strnlen-no-nul.c:
>>>>     * g++.dg/warn/Wmismatched-new-delete-2.C: New test.
>>>>     * gcc.dg/tree-ssa/builtin-sprintf-warn-25.c: New test.
>>> I'm going to hesitatingly ACK this.
>>
>> Besides rebasing the changes on top of the latest trunk, I've also
>> adjusted them to avoid the "inefficient hacks" I mentioned in
>> the subsequent patch(*).  They were not only inefficient but
>> also would have caused the function to return incorrect results
>> in some cases.  After retesting with Binutils/GDB and Glibc
>> I committed the attached patch in r11-5622.
>>
>> Martin
>>
>> [*] https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558775.html
>>
>>>
>>> The reason I say "hesitatingly" is that I suspect that we're going to
>>> see more cases where caching would be useful and I'd like to have a
>>> general way to do that.  We've already got a ton of caching code in
>>> Ranger and I fully expect more as we continue to write analysis code
>>> that wants to walk backwards.  I don't want to see multiple
>>> implementations of simple caching code.
>
> I neglected to respond to this: if/when a general caching mechanism
> becomes available I'll be happy to switch over to it.  This is just
> a couple of vectors to avoid the repetitive traversal of the use-def
> chains that you have been objecting to, and to make constant-time
> retrieval of the object sizes possible.
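A rough sketch of what such a two-vector cache can look like (illustrative only: the names object_ref and pointer_cache below are hypothetical stand-ins; the real class is pointer_query in builtins.{h,c}, it caches access_ref entries, and its details differ):

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the result of resolving a pointer
// (the real cache stores access_ref entries).
struct object_ref
{
  int object_id;   // identity of the pointed-to object
  long offset;     // offset into it
};

// A minimal two-vector cache keyed by SSA name version, loosely
// modeled on the scheme described above.  The first vector maps
// a version number to an index into the second; zero means "not
// cached yet," so freshly grown slots need no initialization.
class pointer_cache
{
  std::vector<unsigned> index_by_version;  // version -> index + 1
  std::vector<object_ref> refs;            // cached results

public:
  unsigned hits = 0, misses = 0;

  // Return the cached entry for VERSION, or null on a miss.
  const object_ref *get (unsigned version)
  {
    if (version < index_by_version.size ())
      if (unsigned idx = index_by_version[version])
        {
          ++hits;
          return &refs[idx - 1];
        }
    ++misses;
    return nullptr;
  }

  // Record REF as the result for VERSION.
  void put (unsigned version, const object_ref &ref)
  {
    if (version >= index_by_version.size ())
      index_by_version.resize (version + 1);
    refs.push_back (ref);
    index_by_version[version] = refs.size ();
  }
};
```

Reserving zero as the "not cached" marker lets the index vector grow lazily, and both lookup and insertion stay (amortized) constant time.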
Understood.  What I'm trying to avoid is having a variety of bespoke
caching implementations for on-demand optimization/analysis phases.  I
can easily see that scenario creeping up on us.

>
>>>
>>> Additionally, if you're doing so many queries in a subsequent patch
>>> that caching is needed, then that may be an indication that you're
>>> doing too much work.  I can't really evaluate that yet as I haven't
>>> looked at the subsequent patch where this becomes a significant issue.
>
> I also feel like I should address this comment, although I'd prefer
> to have the discussion in a review of that patch.
>
> The work that the subsequent patch does (and this one makes possible
> to do more efficiently) is to determine the identities of objects
> pointed to by pointers at the time of their assignment, as opposed
> to the time of their use in expressions like ARRAY_REF, MEM_REF,
> COMPONENT_REF, POINTER_PLUS, etc.  The current approach on trunk
> is inefficient when the same object is referenced many times by
> distinct pointers (the common case), because each such reference
> involves the traversal of the pointer's use-def chain to discover
> what it points to.
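As a toy model of that traversal (not GCC code; toy_def and total_offset are made-up names), each pointer is either a base object or a POINTER_PLUS of another pointer and a constant offset, and resolving a pointer walks its use-def chain summing offsets.  A memo table keyed by pointer turns the repeated walk into a single visit per definition:

```cpp
#include <cassert>
#include <vector>

// Toy IL: a pointer is either a base object (def == -1) or is defined
// as POINTER_PLUS <def, off>: another pointer plus a constant offset.
struct toy_def { int def; long off; };

// Walk the use-def chain of pointer P, summing offsets back to the
// base.  STEPS counts visited definitions.  MEMO, when supplied, maps
// a pointer to its already-computed offset from the base (-1 means
// unknown; the sentinel is safe here because toy offsets are
// nonnegative) and short-circuits the walk.
long total_offset (const std::vector<toy_def> &il, int p,
                   unsigned &steps, std::vector<long> *memo = nullptr)
{
  if (memo && (*memo)[p] != -1)
    return (*memo)[p];
  ++steps;
  long result = 0;
  if (il[p].def != -1)
    result = il[p].off + total_offset (il, il[p].def, steps, memo);
  if (memo)
    (*memo)[p] = result;
  return result;
}
```

Resolving every pointer along a chain of length N visits O(N^2) definitions without the memo table but only O(N) with it, which is the saving the cache is meant to provide.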
Understood.

Jeff

