https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118945

Andrew Waterman <andrew at sifive dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrew at sifive dot com

--- Comment #8 from Andrew Waterman <andrew at sifive dot com> ---
>  In fact, I'd be rather surprised to see anything preferring tail undisturbed.

Right.  To be precise, microarchitectures without register renaming absolutely
do prefer to leave the tail undisturbed.  But that's why the ISA defines the
agnostic mode in such a way that undisturbed is a valid implementation of
agnostic.  (The in-order microarchitectures I've worked on simply ignore the
tail-/mask-agnostic setting; the state bits that control the mode are
essentially vestigial.)
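
As a rough illustration of why undisturbed is a valid implementation of
agnostic, here is a toy C model of an 8-bit vector add.  It is a sketch only:
model_vadd, tail_policy and impl_writes_ones are hypothetical names, not taken
from the ISA spec or from GCC.  The one fact it relies on is that agnostic
tail elements may either keep their old value or be overwritten with all 1s.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum tail_policy { TAIL_UNDISTURBED, TAIL_AGNOSTIC };

/*
 * Toy model of an element-wise 8-bit add over a register of vlmax elements,
 * of which the first vl are active.  impl_writes_ones selects which of the
 * two permitted agnostic behaviours this "implementation" uses.
 */
static void model_vadd(uint8_t *vd, const uint8_t *vs1, const uint8_t *vs2,
                       size_t vl, size_t vlmax, enum tail_policy policy,
                       int impl_writes_ones)
{
    assert(vl <= vlmax);

    /* Body elements are always computed. */
    for (size_t i = 0; i < vl; i++)
        vd[i] = (uint8_t)(vs1[i] + vs2[i]);

    /*
     * Tail elements: undisturbed must keep the old destination values.
     * Agnostic may keep them too, or overwrite them with all 1s.
     */
    if (policy == TAIL_AGNOSTIC && impl_writes_ones) {
        for (size_t i = vl; i < vlmax; i++)
            vd[i] = 0xFF;
    }
}
```

With impl_writes_ones set to 0 the two policies produce identical results,
which is the "simply ignore the setting" behaviour described above.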

Since no plausible implementation will benefit from being in undisturbed mode,
we don't need to consider that aspect of the problem, but...

> I prefer fewer "vsetvli" (which allows more fusion) by default.

...but here's the rub.  Implementations that don't benefit from the agnostic
setting would definitely prefer to avoid the extra vsetvl instructions, not
because they're expensive, but because they're not free.

> Some designs aren't sensitive to the number of vsetvls and I would expect
> that over time that's where high performance designs will land.

Low-performance ones, too.  (Making vset[i]vli fast is more of an engineering
cost than a silicon cost.)  But the instructions still have to be fetched and
decoded, and registers have to be read and written, so the perf cost will
converge on that of, say, an ADDI instruction, which is to say cheap but not
zero.  For narrow-issue machines, this does matter.

> Obviously for your design you'll want to set the knob which says "minimize 
> vsetvls" as opposed to "avoid false dependencies by preferring tail 
> agnostic". That's easily handled by putting the data in the tuning structure 
> for each design.

And so this is the right answer :)
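
A minimal sketch of what such a per-design knob could look like, in C.  The
struct and function names here (vector_tune_sketch, prefer_tail_agnostic,
want_policy_switch) are made up for illustration and are not GCC's actual
tuning structure or vsetvl pass.

```c
#include <stdbool.h>

/* Hypothetical per-core tuning record; not GCC's real structure. */
struct vector_tune_sketch {
    /*
     * true:  avoid false dependencies by preferring tail agnostic
     *        (e.g. a core with vector register renaming).
     * false: minimize vsetvls (e.g. a narrow in-order core).
     */
    bool prefer_tail_agnostic;
};

/*
 * Decide whether to emit an extra vsetvli whose only purpose is to switch
 * the tail policy from undisturbed to agnostic.  This is only considered
 * when the tail of the destination is dead, i.e. nothing reads it later.
 */
static bool want_policy_switch(const struct vector_tune_sketch *tune,
                               bool currently_agnostic, bool tail_is_dead)
{
    if (currently_agnostic || !tail_is_dead)
        return false;               /* nothing to gain, or not legal */
    return tune->prefer_tail_agnostic;
}
```

A core that ignores the agnostic bits would set prefer_tail_agnostic to false
and never pay for the extra instruction.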
