[Bug tree-optimization/102943] [12 Regression] Jump threader compile-time hog with 521.wrf_r

amacleod at redhat dot com via Gcc-bugs Thu, 10 Mar 2022 06:18:11 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102943


--- Comment #42 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Richard Biener from comment #37)
> I'm looking at range_def_chain::m_def_chain, its use is well obfuscated by
> inheritance but comments suggest that we have one such structure either for
> each edge in the CFG or for each basic-block.  In particular this

There is one structure per ssa-name globally.  It is the dependency list for an
ssa-name: the list of other ssa-names in the chain of stmts which are used to
construct it.  Meaning, if one of those dependent names changes, this name
could change.

The dependency chain of statements does not extend beyond a basic block boundary.



> m_def_chain vector looks very sparse and fat, replacing that with a
> 
>   hash_map<int, rdc *>
> 
> and allocating rdcs from another obstack (in principle re-using
> m_bitmap.obstack would be possible but somewhat ugly) should make this
> more cache and memory friendly (whether SSA name version or pointer is
> used as key would remain to be determined).
> 
> The ssa1 and ssa2 members are also quite odd, we always record into the
> bitmap so those seem to be a waste of time?  Changing allocation the

The bitmap holds the exhaustive dependencies (up to a limit) within the block.
ssa1/ssa2 are basically cached names for fast direct-dependency lookup.

  i_7 = .....

<bb4>
  _1 = i_7 < 0;
  _2 = j_8 < 0;
  _3 = _1 | _2;
  if (_3 != 0)

Imports: i_7  j_8
Exports: _1  _2  _3  i_7  j_8
depchains:
         _1 : i_7(I)                      // ssa1 = i_7
         _2 : j_8(I)                      // ssa1 = j_8
         _3 : _1  _2  i_7(I)  j_8(I)      // ssa1 = _1  ssa2 = _2
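
As a rough illustration of how such records could be built for the block above,
here is a minimal sketch.  It is not the GCC implementation: dep_record,
def_stmt and build_dep_chains are hypothetical names, std::set stands in for
the bitmap, and the dependency limit is ignored.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model, not GCC's actual classes: one record per SSA name.
struct dep_record
{
  std::string ssa1;             // first SSA operand on the def stmt, "" if none
  std::string ssa2;             // second SSA operand on the def stmt, "" if none
  std::set<std::string> chain;  // every name feeding this one within the block
};

// A definition inside the block: lhs is computed from op1 and (optionally) op2.
struct def_stmt
{
  std::string lhs, op1, op2;    // op2 is "" when there is a single SSA operand
};

// Build the records for the statements of ONE basic block, in program order.
// Names with no record are imports (defined outside the block), so the chain
// stops at them and never crosses the block boundary.
std::map<std::string, dep_record>
build_dep_chains (const std::vector<def_stmt> &block)
{
  std::map<std::string, dep_record> recs;
  for (const def_stmt &s : block)
    {
      assert (!s.lhs.empty ());
      dep_record r;
      r.ssa1 = s.op1;
      r.ssa2 = s.op2;
      const std::string *ops[] = { &s.op1, &s.op2 };
      for (const std::string *op : ops)
        {
          if (op->empty ())
            continue;
          // The operand itself is a dependency...
          r.chain.insert (*op);
          // ...and so is everything the operand depends on in this block.
          auto it = recs.find (*op);
          if (it != recs.end ())
            r.chain.insert (it->second.chain.begin (), it->second.chain.end ());
        }
      recs[s.lhs] = r;
    }
  return recs;
}
```

Feeding it the three definitions of bb4 (_1 from i_7, _2 from j_8, _3 from _1
and _2) gives _3 a chain of _1, _2, i_7 and j_8, with ssa1 = _1 and ssa2 = _2.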


The ssa1 and ssa2 fields are used to specify up to 2 ssa-names that occur on
the def stmt itself, and are used during global cache lookup in conjunction
with the timestamp to determine if the current global value is stale or not.

i.e., it's a fast check.  Ask for the range of _2: the ssa1 field is set to j_8,
so we simply check the timestamp on j_8 vs the timestamp on _2 to ensure it's up
to date.  If it's stale, then we recalculate _2.

Otherwise we would either have to parse the stmt or loop through the bitmap and
check each element.  They were once in their own data structure, but it was more
efficient to simply include them here in this structure.
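
The staleness check can be sketched roughly as follows.  This is only a toy
model: toy_cache, cache_entry, set_value and is_stale are hypothetical names,
not the ranger's API, and it assumes "stale" means that a direct dependency
carries a newer timestamp than the name being queried.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of a global cache entry; names are illustrative only.
struct cache_entry
{
  unsigned timestamp = 0;   // when this name's value was last computed
  std::string ssa1, ssa2;   // direct SSA operands on the def stmt, "" if none
};

struct toy_cache
{
  std::map<std::string, cache_entry> entries;
  unsigned clock = 0;       // monotonically increasing counter

  // Record that NAME was (re)computed now, with its direct dependencies.
  void set_value (const std::string &name,
                  const std::string &ssa1 = "",
                  const std::string &ssa2 = "")
  {
    assert (!name.empty ());
    cache_entry &e = entries[name];
    e.timestamp = ++clock;
    e.ssa1 = ssa1;
    e.ssa2 = ssa2;
  }

  // Assumption: NAME is stale when it has no entry yet, or when ssa1 or ssa2
  // was updated after NAME was.  Only two timestamps are compared; there is
  // no stmt parsing and no walk over the full dependency bitmap.
  bool is_stale (const std::string &name) const
  {
    auto it = entries.find (name);
    if (it == entries.end ())
      return true;
    const cache_entry &e = it->second;
    auto newer = [&] (const std::string &dep)
      {
        if (dep.empty ())
          return false;
        auto d = entries.find (dep);
        return d != entries.end () && d->second.timestamp > e.timestamp;
      };
    return newer (e.ssa1) || newer (e.ssa2);
  }
};
```

With this model, computing j_8 and then _2 (ssa1 = j_8) leaves _2 up to date;
updating j_8 afterwards makes is_stale ("_2") return true until _2 is
recalculated.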



> above way would also enable embedding bitmap_head, removing one pointer
> indirection.  Unfortunately we use bitmap_ior_into so using the more
> efficient tree form for bitmap queries isn't possible until somebody
> implements (efficient!) bitmap_ior_into on tree form.
> 
> It wouldn't fix the apparent algorithmic issues of course, so just food
> for thought.  Complexity wise it would reduce O (n-edges * n-ssa-names)
> to O (n-edges * n-deps/imports-on-edge).

So it's just O(ssa-name) already.
