[Bug tree-optimization/114932] Improvement in CHREC can give large performance gains

tnfchris at gcc dot gnu.org via Gcc-bugs Fri, 03 May 2024 01:09:58 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932


--- Comment #3 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> > which is harder for prefetchers to follow.
> 
> This seems like a limitation in the HW prefetcher rather than anything else.
> Maybe the cost model for addressing mode should punish base+index if so.
> Many HW prefetchers I know of are based on the final VA (or even PA) rather
> than looking at the instruction to see if it increments or not ...

That was the first thing we tried, and even increasing the cost of
register_offset to something ridiculously high doesn't change a thing.

IVopts still thinks it needs the base+index form and generates:

  _1150 = (voidD.26 *) _1148;
  _1152 = (sizetype) l0_78(D);
  _1154 = _1152 * 324;
  _1156 = _1154 + 216;
  # VUSE <.MEM_421>
  vect__349.614_1418 = MEM <vector(2) integer(kind=4)D.9> [(integer(kind=4)D.9 *)_1150 + _1156 * 1 clique 2 base 0];

Hence the bug report to see what's going on.
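
For readers less used to GIMPLE, the address arithmetic in the dump above can be read roughly as the C below. This is only a sketch of the shape of the computation, not what the compiler emits: the names vec2_i32, byte_offset and load_base_plus_index are made up for illustration, base stands in for _1150 and l0 for l0_78.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for vector(2) integer(kind=4): two 32-bit lanes. */
typedef struct { int32_t lane[2]; } vec2_i32;

/* Mirrors _1154 = _1152 * 324 and _1156 = _1154 + 216 (offset in bytes). */
static size_t byte_offset(size_t l0)
{
    return l0 * 324 + 216;
}

/* Mirrors MEM[(integer(kind=4) *)_1150 + _1156 * 1]: the base and the
   offset are kept in separate values and only added in the access itself,
   which is the base+index (register_offset) form. */
static vec2_i32 load_base_plus_index(const void *base, size_t l0)
{
    vec2_i32 v;
    memcpy(&v, (const char *)base + byte_offset(l0), sizeof v);
    return v;
}
```

The point is that the offset lives in its own register next to the base instead of being folded into a single pointer.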
