On 13.06.24 04:00, Nathan Bossart wrote:
> That's true, but my point is that as soon as we start avoiding function
> pointers more commonly, it becomes difficult to justify adding them back in
> order to support new instruction sets. Should we just compile in the SSE
> 4.2 version, or should we take a chance on AVX-512 with the function
> pointer?
>
> The idea that's been floating around recently is to build a bunch of
> different versions of Postgres and to choose one on startup based on what
> the CPU supports. That seems like quite a lot of work, and it'll increase
> the size of the builds quite a bit, but it at least doesn't have the
> aforementioned problem.
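
To make the trade-off concrete, here is a minimal sketch of the
function-pointer approach being discussed, assuming GCC or Clang on x86.
All names (crc32c_impl, crc32c_choose, crc32c_sse42, crc32c_portable) are
made up for illustration and are not PostgreSQL's actual code; the point is
only to show where the indirect call comes from.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <nmmintrin.h>			/* _mm_crc32_u8 */
#define HAVE_X86_DISPATCH 1
#endif

/* Hypothetical names throughout; assumes GCC or Clang. */
typedef uint32_t (*crc32c_fn) (uint32_t crc, const void *data, size_t len);

/* Portable bitwise CRC-32C (reflected polynomial 0x82F63B78). */
static uint32_t
crc32c_portable(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
	{
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

#ifdef HAVE_X86_DISPATCH
/* Same computation using the SSE 4.2 CRC32 instruction. */
__attribute__((target("sse4.2")))
static uint32_t
crc32c_sse42(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
		crc = _mm_crc32_u8(crc, *p++);
	return crc;
}
#endif

static uint32_t crc32c_choose(uint32_t crc, const void *data, size_t len);

/* Starts out pointing at the chooser; the first call replaces it. */
static crc32c_fn crc32c_impl = crc32c_choose;

/* Pick an implementation based on what the running CPU supports. */
static uint32_t
crc32c_choose(uint32_t crc, const void *data, size_t len)
{
	crc32c_impl = crc32c_portable;
#ifdef HAVE_X86_DISPATCH
	if (__builtin_cpu_supports("sse4.2"))
		crc32c_impl = crc32c_sse42;
#endif
	return crc32c_impl(crc, data, len);
}

/* Every caller goes through one indirect call via crc32c_impl. */
uint32_t
crc32c(const void *data, size_t len)
{
	return ~crc32c_impl(~(uint32_t) 0, data, len);
}
```

A build that hard-codes one variant would call something like
crc32c_sse42() directly and drop the pointer, at the cost of requiring that
instruction set on every machine the binary runs on.
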
>> I guess another idea would be for the PGDG packagers or someone else
>> interested in performance to create repos with binaries built for
>> these microarch levels and users can research what they need. The new
>> -v2 etc levels are a lot more practical than the microarch names and
>> individual features...
>
> Heartily agreed.

One thing that is perhaps not clear (to me?) is how much this matters
and which parts of it matter. Obviously, we know that it matters some,
otherwise we'd not be working on it. But does it, like, matter only
with checksums, or with thousands of partitions, or with many CPUs, or
certain types of indexes, etc.?

If we're going to, say, create some recommendations for packagers around
this, how are they supposed to determine the tradeoffs? It might be
easy for a packager to set some slightly-higher -march flag that is in
line with their distro's policies, but it would probably be a lot more
work to create separate binaries or a separate repository for, say,
moving from SSE-something to AVX-something. And how are they supposed
to decide that, and how are they supposed to communicate that to their
users? (And how can we get different packagers to make somewhat
consistent decisions around this?)

In a somewhat similar case, we have documented quite clearly that without
native spinlock support everything will be terrible. And there is
probably some information out there that without certain CPU support
checksum performance will be terrible. But beyond that we probably
don't have much.