Not directly.  The AVX512 instructions include some significant
permute/shuffle/mask hardware, available on pretty much all instructions.
That hardware in turn leads to very long capacitance chains (i.e.,
transistors in series that have to stabilize each clock), which constrains
how fast the clock can run.  For Larrabee that wasn't a big deal, as the
clock was relatively slow, but modern 4-5GHz clocks do not like structures
this long.  Besides running the clock slower, another choice is to split
the registers into two banks that run on alternate clocks, and then settle
the permutations in a post-process, using register renaming to avoid
hazards.  I don't know whether any chips have been implemented this way,
as that would be after my time.

It looks a lot like the old CISC vs. RISC arguments, only now about
shuffles.  The classical GPU architecture won with what's effectively the
"RISC" version.

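To make the cost of the permute/mask hardware concrete, here is a minimal
scalar sketch in C of roughly what a merge-masked 32-bit lane permute
(something along the lines of VPERMD with a writemask) has to compute.
The name masked_permute is hypothetical and chosen only for illustration,
and the code models the semantics, not the circuit: the point is that
every destination lane needs a 16-to-1 select followed by a mask merge.

```c
#include <stdint.h>

#define LANES 16  /* a 512-bit register holds sixteen 32-bit lanes */

/* Hypothetical scalar model (an assumption for illustration, not vendor
 * code) of a merge-masked full permute.  Each destination lane may read
 * from ANY source lane; the mask then chooses between the permuted value
 * and the previous destination contents. */
static void masked_permute(uint32_t dst[LANES], const uint32_t src[LANES],
                           const uint32_t idx[LANES], uint16_t mask)
{
    uint32_t out[LANES];

    for (int i = 0; i < LANES; i++) {
        /* 16-to-1 select: only the low 4 bits of the index matter. */
        uint32_t picked = src[idx[i] % LANES];

        /* Merge masking: lanes whose mask bit is clear keep dst[i]. */
        out[i] = ((mask >> i) & 1u) ? picked : dst[i];
    }

    /* Copy back at the end so the update behaves like one register write. */
    for (int i = 0; i < LANES; i++)
        dst[i] = out[i];
}
```

In hardware the loop body exists sixteen times side by side, so the select
network grows with the square of the lane count, which is one way to see
why these structures are long.
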
Paul

On Fri, Dec 27, 2024 at 1:41 PM Kurt H Maier via 9fans <9fans@9fans.net>
wrote:

> Is the power consumption the reason the cores downclock when you start
> sending AVX512 instructions?
>

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T7692a612f26c8ec5-M6fde8bb81da995615fa76603