On November 19, 2021 4:00:01 PM GMT+01:00, Andrew MacLeod <amacl...@redhat.com> wrote: >On 11/19/21 04:21, Richard Biener wrote: >> On Thu, Nov 18, 2021 at 8:30 PM Andrew MacLeod via Gcc-patches >> <gcc-patches@gcc.gnu.org> wrote: >>> At issue here is the dynamic approach we currently use for outgoing edge >>> calculations. It isn't normally a problem, but once you get a very >>> large number of possible outgoing values (ie very long unrolled blocks) >>> with pairs of values on a statement, and individual queries for each >>> one, it becomes exponential if they relate to each other. >>> >>> So. We previously introduced a param for PR 100081 which limited the >>> depth of logical expressions GORI would pursue because we always make 2 >>> or 4 queries for each logical expression, which accelerated the >>> exponential behaviour much quicker. >>> >>> This patch reuses that to apply to any statement which has 2 ssa-names, >>> as the potential for 2 queries can happen then as well.. leading to >>> exponential behaviour. Each statement we step back through with >>> multiple ssa-names decreases the odds of calculating a useful outgoing >>> range anyway. This will remove ridiculous behaviour like this PR >>> demonstrates. >>> >>> Next release I plan to revamp GORI to cache outgoing values, which will >>> eliminate all this sort of behaviour, and we can remove all such >>> restrictions. >>> >>> Bootstrapped on x86_64-pc-linux-gnu with no regressions. OK for trunk? >> I think it's reasonable. SCEV tries to limit the overall number of SSA -> >> def >> visits, capturing deep vs. wide in a more uniform (and single) way. >> (--param scev-max-expr-size and the duplicate - huh - --param >> scev-max-expr-complexity). >> >> For SCEV running into the limit means fatal fail of the analysis, for ranger >> it only means less refinement? So it might make a difference which path >> you visit and which not (just thinking about reproducability of range >> analysis >> when we say, swap operands of a stmt)? >> >Yes, Its means less refinement for "deeper" dependencies in the block.. >It will not be affected by operand swapping because we will either look >at dependencies for both operands of a stmt, or neither. The path you >visit shouldn't make any difference. Note this has nothing to do with >generating ranges for those statements, this is only for refining >outgoing ranges created by the condition at the bottom of the block. > >To be precise, we will stop looking for possible range changes once our >dependency search, which always starts from the bottom of the block, >encounters 6 stmts with multiple SSA operands. Statements with a single >SSA operand will continue to build the dependency chains. > >so for instance, following just the chain for first operands of this >somewhat truncated listing: > <...> > _137 = (short int) j_131; > _138 = _129 & _137; > _139 = (int) _138; > j_140 = j_131 & _139; << depth 6 ... we don't go look at j_131 >or _139 for further dependencies > _146 = (short int) j_140; > _147 = _138 & _146; > _148 = (int) _147; > j_149 = j_140 & _148; << depth 5 > _155 = (short int) j_149; > _156 = _147 & _155; > _157 = (int) _156; > j_158 = j_149 & _157; << depth 4 > _164 = (short int) j_158; > _165 = _156 & _164; > _166 = (int) _165; > j_167 = j_158 & _166; <<-- depth 3 > _1 = (short int) j_167; > _3 = _1 & _165; <<-depth 2 > _5 = (int) _3; > j_22 = _5 & j_167; <<-- depth 1 > j_20 = j_22 + 4; > if (j_20 == 4) > goto <bb 3>; > >We'll generate dependencies, and thus able to generate outgoing ranges >for all these ssa-names still until we hit a dependency depth of 6 (the >current default), which for one chain comes to an end at > >j_140 = j_131 & _139 > > We will be able to generate outgoing ranges for j_131 and _139 on the >edges, but we will not try for any ssa_names used to define those.. ie, >_138 might not be able to have outgoing ranges generated for it in this >block. > >working the j_20 : [4,4] range on the true edge back thru those >expressions, once we get that deep the odds of finding something of >value is remote :-) > >We will also only generate these outgoing ranges when they are asked >for. Many of these names are not used outside the block (especially >back the deeper you go), and so never get asked for anyway... but you >could :-)
I see. I suppose you could prune the list simply by asking whether a Def has a single use only (that you just followed) and follow the chain but not generate an outgoing range for the Def. Not sure if that will make a big difference in compile time. >Anyway, this limit in no way introduces any kind of ordering variation.. > >Andrew > >PS for amusement and information overload... it turns out in this hunk >of code we end up knowing that globally the range of most of these names >are [0, 4], and the actual ranges we COULD generate if asked: > >3->5 (T) _1 : short int [0, 4] >3->5 (T) _3 : short int [0, 4] >3->5 (T) _5 : int [0, 4] >3->5 (T) j_20 : int [4, 4] >3->5 (T) j_22 : int [0, 0] >3->5 (T) _120 : short int [0, 4] >3->5 (T) _121 : int [0, 4] >3->5 (T) _128 : short int [0, 4] >3->5 (T) _129 : short int [0, 4] >3->5 (T) _130 : int [0, 4] >3->5 (T) j_131 : int [0, 4] >3->5 (T) _137 : short int [0, 4] >3->5 (T) _138 : short int [0, 4] >3->5 (T) _139 : int [0, 4] >3->5 (T) j_140 : int [0, 4] >3->5 (T) _146 : short int [0, 4] >3->5 (T) _147 : short int [0, 4] >3->5 (T) _148 : int [0, 4] >3->5 (T) j_149 : int [0, 4] >3->5 (T) _155 : short int [0, 4] >3->5 (T) _156 : short int [0, 4] >3->5 (T) _157 : int [0, 4] >3->5 (T) j_158 : int [0, 4] >3->5 (T) _164 : short int [0, 4] >3->5 (T) _165 : short int [0, 4] >3->5 (T) _166 : int [0, 4] >3->5 (T) j_167 : int [0, 4] >3->4 (F) _1 : short int [1, 4] >3->4 (F) _3 : short int [1, 4] >3->4 (F) _5 : int [1, 4] >3->4 (F) j_20 : int [5, 8] >3->4 (F) j_22 : int [1, 4] >3->4 (F) _120 : short int [1, 4] >3->4 (F) _121 : int [1, 4] >3->4 (F) _128 : short int [1, 4] >3->4 (F) _129 : short int [1, 4] >3->4 (F) _130 : int [1, 4] >3->4 (F) j_131 : int [1, 4] >3->4 (F) _137 : short int [1, 4] >3->4 (F) _138 : short int [1, 4] >3->4 (F) _139 : int [1, 4] >3->4 (F) j_140 : int [1, 4] >3->4 (F) _146 : short int [1, 4] >3->4 (F) _147 : short int [1, 4] >3->4 (F) _148 : int [1, 4] >3->4 (F) j_149 : int [1, 4] >3->4 (F) _155 : short int [1, 4] >3->4 (F) _156 : short int [1, 4] >3->4 (F) _157 : int [1, 4] >3->4 (F) j_158 : int [1, 4] >3->4 (F) _164 : short int [1, 4] >3->4 (F) _165 : short int [1, 4] >3->4 (F) _166 : int [1, 4] >3->4 (F) j_167 : int [1, 4] > >So the chains go slightly deeper than my example, but the end result is >most of the names in the _30 thru _119 range on the false edge would >come back with [0,4] instead of [1,4] with this change. > >The actual dependency chains from the dump: > >Exports: _1 _3 _5 j_20 j_22 _120 _128 _129 j_131 _137 _138 >_139 j_140 _146 _147 _148 j_149 _155 _156 _157 j_158 _164 >_165 _166 j_167 > >The exports are the names that ranger can potentially generate a range >on outgoing edges that may differ from the range in the block. It means >GORI might be able to take the condition at the bottom and adjust those >ranges. Any name not in the list will simply use the range it has in >the block. Adjusting that param value will change the number of these. > >The dependency list for each ssa name is also shown: and we can see >how the higher in the block we go, the deeper in the dependency chain >from the bottom of the block we are, and thus there are less names in >the list for each name > > _1 : _120 _128 _129 j_131 _137 _138 _139 j_140 _146 >_147 _148 j_149 _155 _156 _157 j_158 _164 _165 _166 j_167 > _3 : _1 _120 _128 _129 j_131 _137 _138 _139 j_140 >_146 _147 _148 j_149 _155 _156 _157 j_158 _164 _165 _166 j_167 > _5 : _1 _3 _120 _128 _129 j_131 _137 _138 _139 j_140 >_146 _147 _148 j_149 _155 _156 _157 j_158 _164 _165 _166 j_167 > j_20 : _1 _3 _5 j_22 _120 _128 _129 j_131 _137 _138 >_139 j_140 _146 _147 _148 j_149 _155 _156 _157 j_158 _164 >_165 _166 j_167 > j_22 : _1 _3 _5 _120 _128 _129 j_131 _137 _138 _139 >j_140 _146 _147 _148 j_149 _155 _156 _157 j_158 _164 _165 >_166 j_167 > _129 : _120 _128 > _137 : j_131 > _138 : _120 _128 _129 j_131 _137 > j_140 : j_131 _139 > _146 : j_131 _139 j_140 > _147 : _120 _128 _129 j_131 _137 _138 _139 j_140 _146 > _148 : _147 > j_149 : j_131 _139 j_140 _147 _148 > _155 : j_131 _139 j_140 _147 _148 j_149 > _156 : _120 _128 _129 j_131 _137 _138 _139 j_140 _146 >_147 _148 j_149 _155 > _157 : _120 _128 _129 j_131 _137 _138 _139 j_140 _146 >_147 _148 j_149 _155 _156 > j_158 : _120 _128 _129 j_131 _137 _138 _139 j_140 _146 >_147 _148 j_149 _155 _156 _157 > _164 : _120 _128 _129 j_131 _137 _138 _139 j_140 _146 >_147 _148 j_149 _155 _156 _157 j_158 > _165 : _120 _128 _129 j_131 _137 _138 _139 j_140 _146 >_147 _148 j_149 _155 _156 _157 j_158 _164 > _166 : _120 _128 _129 j_131 _137 _138 _139 j_140 _146 >_147 _148 j_149 _155 _156 _157 j_158 _164 _165 > j_167 : _120 _128 _129 j_131 _137 _138 _139 j_140 _146 >_147 _148 j_149 _155 _156 _157 j_158 _164 _165 _166 >