Not directly. The AVX-512 instructions include some significant permute/shuffle/mask hardware, available on pretty much all instructions. These in turn lead to very long capacitance chains (i.e., transistors in series that have to stabilize every clock) and so constrain how fast the clock can run. For Larrabee that wasn't a big deal, as the clock was relatively slow, but modern 4-5GHz clocks do not like structures this long.

Another choice is to split the registers into two banks that run on alternate clocks and then settle the permutations in a post-process, using register renaming to avoid hazards. I don't know whether any designs have been implemented this way, as that would be after my time. It looks a lot like the old CISC vs RISC arguments, only now about shuffles. The classical GPU architecture won with what's effectively the "RISC" version.
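To make the mask and permute behaviour concrete, here is a rough software model in Python. It is only a sketch of the lane-level semantics, not a description of how any hardware implements them. The function names `masked_add` and `permute` are made up for illustration; they are meant to mirror what the intrinsics `_mm512_mask_add_epi32` and `_mm512_permutexvar_epi32` compute on sixteen 32-bit lanes.

```python
LANES = 16  # a 512-bit register viewed as sixteen 32-bit lanes

def masked_add(src, mask, a, b):
    """Merge-masked add (illustrative model, hypothetical name).

    Lane i receives (a[i] + b[i]) mod 2**32 when bit i of mask is set,
    otherwise it keeps the value from src[i].
    """
    return [
        (a[i] + b[i]) & 0xFFFFFFFF if (mask >> i) & 1 else src[i]
        for i in range(LANES)
    ]

def permute(idx, a):
    """Full cross-lane permute (illustrative model, hypothetical name).

    Output lane i may read from ANY input lane, selected by idx[i].
    """
    return [a[idx[i] % LANES] for i in range(LANES)]

a = list(range(LANES))
b = [100] * LANES
src = [0] * LANES

low_half = masked_add(src, 0x00FF, a, b)              # only lanes 0..7 are updated
reversed_a = permute([15 - i for i in range(LANES)], a)  # reverse the lane order
```

The point of the model is the data dependence: in `permute`, every output lane can depend on every input lane, and in `masked_add` every lane also depends on a mask bit and a fallback value. In silicon that any-to-any selection has to be wired and settled within the clock period, which is the kind of long structure described above.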
Paul

On Fri, Dec 27, 2024 at 1:41 PM Kurt H Maier via 9fans <9fans@9fans.net> wrote:

> Is the power consumption the reason the cores downclock when you start
> sending AVX512 instructions?