A gentle ping
> On 30 Oct 2025, at 10:42 AM, Prachi Godbole <[email protected]> wrote:
>
> This patch attempts to reduce compile time for the locality cloning pass by
> reducing recursive calls to partition_callchain (). This is achieved by
> precomputing caller and callee information into locality_info. locality_info
> stores all callees of a node, reached either directly or via inlined nodes,
> thereby avoiding calls to partition_callchain () for inlined nodes, which are
> already partitioned with their inlined_to nodes. locality_info also stores
> precomputed accumulated incoming edge frequencies per unique caller, which
> avoids repeated computation within partition_callchain (), as well as
> preaccumulated and sorted outgoing edge frequencies for unique callees.
>
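
For anyone skimming the thread, the precomputation described in the paragraph
above can be pictured roughly as follows. This is only a sketch under
simplifying assumptions, not the code in the patch: the node and edge types,
the caller_freq and callee_freq fields, and the root_of and create_infos
helpers are all hypothetical stand-ins for the real cgraph structures, and
frequencies are plain doubles.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the call graph; not GCC's types.
struct edge { int caller; int callee; double freq; };

struct node
{
  int inlined_to = -1;  // -1 when the node is not inlined into another node
};

// Illustrative per-node summary; field names are assumptions.
struct locality_info
{
  // Unique caller -> accumulated incoming edge frequency.
  std::map<int, double> caller_freq;
  // Unique callee -> accumulated outgoing edge frequency, hottest first.
  std::vector<std::pair<int, double>> callee_freq;
};

// Follow inlined_to links to the node whose body finally contains N.
static int
root_of (const std::vector<node> &nodes, int n)
{
  while (nodes[n].inlined_to != -1)
    n = nodes[n].inlined_to;
  return n;
}

// Build all summaries in one pass over the edges.
static std::vector<locality_info>
create_infos (const std::vector<node> &nodes, const std::vector<edge> &edges)
{
  std::vector<locality_info> infos (nodes.size ());
  std::vector<std::map<int, double>> out (nodes.size ());

  for (const edge &e : edges)
    {
      // An inlined callee is placed together with its inlined_to node,
      // so it never needs an entry of its own.
      if (nodes[e.callee].inlined_to != -1)
        continue;
      // Calls made from an inlined body are attributed to its root.
      int caller = root_of (nodes, e.caller);
      infos[e.callee].caller_freq[caller] += e.freq;
      out[caller][e.callee] += e.freq;
    }

  for (std::size_t i = 0; i < nodes.size (); i++)
    {
      std::vector<std::pair<int, double>> &v = infos[i].callee_freq;
      v.assign (out[i].begin (), out[i].end ());
      // Sort once here so that later consumers need not re-sort.
      std::stable_sort (v.begin (), v.end (),
                        [] (const std::pair<int, double> &a,
                            const std::pair<int, double> &b)
                        { return a.second > b.second; });
    }
  return infos;
}
```

With summaries like these built once up front, a partitioning walk could look
up the accumulated frequency for a (caller, callee) pair and iterate callees
in hotness order without revisiting inlined nodes or re-summing edges on every
recursive call.
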
> This patch refines the is_entry_node_p () check by calling local_p () instead
> of only checking for aliases.
>
> An approximately 45% compile time improvement is observed for the
> bootstrap-lto-locality config, which now takes 2-5% more time than
> bootstrap-lto.
>
> This patch also adds appropriate memory management for pass-specific data
> structures.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for mainline?
>
> Thanks,
> Prachi
>
> Signed-off-by: Prachi Godbole <[email protected]>
>
> gcc/ChangeLog:
>
> * ipa-locality-cloning.cc (struct locality_callee_info): New struct.
> (struct locality_info): Ditto.
> (loc_infos): Ditto.
> (get_locality_info): New function.
> (sort_all_callees_default): Ditto.
> (callee_default_cmp): Ditto.
> (populate_callee_locality_info): Ditto.
> (populate_caller_locality_info): Ditto.
> (create_locality_info): Ditto.
> (adjust_recursive_callees): Access node_to_clone by reference.
> (inline_clones): Access node_to_clone and clone_to_node by reference.
> (clone_node_as_needed): Ditto.
> (accumulate_incoming_edge_frequency): Remove function.
> (clone_node_p): New function.
> (partition_callchain): Refactor the function.
> (is_entry_node_p): Call local_p ().
> (locality_determine_ipa_order): Call create_locality_info ().
> (locality_determine_static_order): Ditto.
> (locality_partition_and_clone): Update call to partition_callchain ()
> according to prototype.
> (lc_execute): Allocate and free node_to_ch_info, node_to_clone,
> clone_to_node.
>
> <0002-PATCH-2-3-ipa-reorder-for-locality-Address-compile-t.patch>