Re: [PATCH][RFC] Introduce TREE_AOREFWRAP to cache ao_ref in the IL

Michael Matz via Gcc-patches Wed, 13 Oct 2021 07:57:43 -0700

Hello,

[this is the fourth attempt to write a comment/review/opinion for this 
ao_ref-in-tcc_reference, please accept some possible incoherence]

On Tue, 12 Oct 2021, Richard Biener via Gcc-patches wrote:

> This prototype hack introduces a new tcc_reference TREE_AOREFWRAP
> which we can use to wrap a reference tree, recording the ao_ref
> associated with it.  That comes in handy when trying to optimize
> the constant factor involved with alias stmt walking (or alias
> queries in general) where there's parts that are liner in the
> reference expression complexity, namely get_ref_base_and_extent,
> which shows up usually high on profiles.

So, generally I like storing things into the IL that are impossible to 
(re-)discover.  Remembering things that are merely slowish to rediscover 
is less clear, the increase of IL memory use, and potentially the 
necessary pointer chasing might just trade one clearly measurable slow 
point (the rediscover computation) with many slow points all over the 
place (the pointer chasing/cache effects).  Here ...

> The following patch is minimal as to make tree-ssa.exp=ssa-fre-*
> not ICE and make the testcases from PR28071 and PR39326 compile
> successfully at -O1 (both testcases show a moderately high
> load on alias stmt walking around 25%, resp. 34%).  With the
> patch which makes use of the cache only from stmt_may_clobber_ref_p
> for now the compile-time improves by 7%, resp. 19% which means
> overall the idea might be worth pursuing.

... you seem to have a point, though.  Also, I am of the opinion that our 
gimple memrefs could be even fatter (and also encode things like 
multi-dimensional accesses, either right from the source code or 
discovered by index analysis), and your idea goes into that direction.  
So, yay for encoding memref properties into the IL, even though here it's 
only a cache.  You solved the necessary invalidation already.  Perhaps 
only partly, that will be seen once exposed to the wild.

So, the only question seems to be how to encode it: either by ...

> transparent by instead of wrapping the refs with another tree

... this (wrapping) ...

> to reallocate the outermost handled_component_p (and only those),

... or that (aggregation).  There is a third possibility if you want this 
only in the gimple world (which is the case): encode it not in the trees 
but in the gimple statements.  This sort of works easily for everything 
except calls.  I will not consider this variant, nor the side table 
implementation.

While writing this email I switched between more liking one or the other, 
multiple times.  So, I'll write down some basic facts/requirements:

1) You basically want to add stuff to an existing structure:
(a) by wrapping: to work seemlessly the outer tree should have similar 
    enough properties to the inner tree (e.g. also be tcc_reference) to be 
    used interchangably in most code, except that which needs to look at 
    the added stuff.
(b) by aggregating the stuff into the existing structure itself: if you 
    need both structs (with and without stuff) the pure thing to do is to 
    actually create two structs, once with, once without stuff.
2) the added stuff is optional
3) we have multiple things (all tcc_reference) to which to add stuff
4) all tcc_reference are tree_exp, which is variable number of operands,
   which constrain things we can do naturally (e.g. we can't add stuff 
   after tree_exp, except by constraining the number of operands)

Considering this it seems that aggregation is worse: you basically double 
the number of structure types (at least conceptually, if you go with your 
bit-idea).  So, some idea of wrapping seems more natural.

(I think your idea of aggregation but going with a bit flag to indicate if 
this tcc_reference is or isn't annotated, and therefore has things 
allocated after the variable number of operands, is a terrible hack)

There is another possibility doing something like your bit-flag 
aggregation but with fewer hackery: if ao_ref would be a tree it could be 
a normal operand of a tree_exp (and hence tcc_reference), just that the 
number of operands then would vary depending on if it's annotated or not.

Making ao_ref into a tree would also enable the use of ANNOTATE_EXPR for a 
generic wrapping tree.  (Currently its used only in very specific cases, 
so ANNOTATE_EXPR handling would need to be extended all over, and as it's 
not a tcc_reference it would probably rather mean to introduce a new 
ANNOTATE_REF).

Anyway, with this:

struct tree_ref_annotation {
  struct tree_base base;
  struct ao_ref ao_ref;
};

DEFTREECODE(TREE_MEM_ANNO, "mem_anno", tcc_exceptional, 0);

you could then add

DEFTREECODE(MEM_REF_A, "mem_ref_a", tcc_reference, 3);

where TREE_OPERAND(memref, 2) would then be a TREE_MEM_ANNO.  If we were 
to add one operand slot to each tcc_reference we could even do without new 
tree codes: the existence of an ao_ref would simply be indicated by 
TREE_OPERAND(ref, position) being non-NULL.  It would of course enlargen 
each tcc_reference by one pointer, but maybe we could move more things 
into the above ref_annotation from somewhere else (points-to sets? scev 
info?) and then trade saving memory there with this pointer which then 
will more often be used.  (I think the eight bytes one has to pay for 
wrapping the ao_ref into a tree by adding tree_base might be quite 
acceptabable, or, if ao_ref would be a tree itself (not just wrapped into 
one), then e.g. the ao_ref.volatile_p flag could be grabbed from tree_base 
and free up the 8 bytes again)

So, at _this_ write-through of the email I think I like the above idea 
best: make ao_ref be a tree (at least its storage, because it currently 
is a one-member-function class), make ao_ref.volatile_p be 
tree_base.volatile_flag (hence TREE_VOLATILE(ao_ref)) (this reduces 
sizeof(ao_ref) by 8), increase all nr-of-operand of each tcc_reference by 
1, and make TREE_AO_REF(reftree) be "TREE_OPERAND(reftree, 
TREE_CODE_LENGTH(reftree) - 1)", i.e. the last operand of such 
tcc_reference tree.

Hmm, food for thought :)

Ciao,
Michael.

Re: [PATCH][RFC] Introduce TREE_AOREFWRAP to cache ao_ref in the IL

Reply via email to