On Mon, Nov 06, 2023 at 09:59:26PM -0600, Nathan Bossart wrote:
> On Mon, Nov 06, 2023 at 07:15:01PM -0800, Noah Misch wrote:
> > On Mon, Nov 06, 2023 at 09:52:58PM -0500, Tom Lane wrote:
> >> Nathan Bossart <nathandboss...@gmail.com> writes:
> >> > Like I said, I don't have any proposals yet, but assuming we do want to
> >> > support newer intrinsics, either open-coded or via auto-vectorization, I
> >> > suspect we'll need to gather consensus for a new policy/strategy.
> >>
> >> Yeah. The function-pointer solution kind of sucks, because for the
> >> sort of operation we're considering here, adding a call and return
> >> is probably order-of-100% overhead. Worse, it adds similar overhead
> >> for everyone who doesn't get the benefit of the optimization.
> >
> > The glibc/gcc "ifunc" mechanism was designed to solve this problem of choosing
> > a function implementation based on the runtime CPU, without incurring function
> > pointer overhead. I would not attempt to use AVX512 on non-glibc systems, and
> > I would use ifunc to select the desired popcount implementation on glibc:
> > https://gcc.gnu.org/onlinedocs/gcc-4.8.5/gcc/Function-Attributes.html
>
> Thanks, that seems promising for the function pointer cases. I'll plan on
> trying to convert one of the existing ones to use it. BTW it looks like
> LLVM has something similar [0].
>
> IIUC this unfortunately wouldn't help for cases where we wanted to keep
> stuff inlined, such as is_valid_ascii() and the functions in pg_lfind.h,
> unless we applied it to the calling functions, but that doesn't sound
> particularly maintainable.

Agreed, it doesn't solve inline cases. If the gains are big enough, we should move toward packages containing N CPU-specialized copies of the postgres binary, with bin/postgres just exec'ing the right one.
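
To make the exec'ing idea concrete, here is a rough sketch of what such a wrapper could do. Everything in it is an assumption for illustration: the variant names ("avx512", "popcnt", "generic"), the "postgres-<variant>" file naming, the libexecdir parameter, and all function names are hypothetical, and the CPU probe relies on the GCC/Clang builtins __builtin_cpu_init() and __builtin_cpu_supports(), which exist on x86 only.

```c
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Map CPU capabilities to a build variant name.  The names are
 * hypothetical; a real package would define its own set.
 */
const char *
pick_variant(int has_avx512f, int has_popcnt)
{
    if (has_avx512f)
        return "avx512";
    if (has_popcnt)
        return "popcnt";
    return "generic";
}

/* Probe the running CPU; fall back to "generic" off x86 or off GCC/Clang. */
const char *
detect_variant(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    return pick_variant(__builtin_cpu_supports("avx512f"),
                        __builtin_cpu_supports("popcnt"));
#else
    return pick_variant(0, 0);
#endif
}

/*
 * Build "<libexecdir>/postgres-<variant>" into buf.
 * Returns 0 on success, -1 if the path does not fit.
 */
int
variant_path(char *buf, size_t buflen,
             const char *libexecdir, const char *variant)
{
    int         n = snprintf(buf, buflen, "%s/postgres-%s",
                             libexecdir, variant);

    return (n > 0 && (size_t) n < buflen) ? 0 : -1;
}

/*
 * What the bin/postgres wrapper's main() would call: replace the current
 * process with the specialized binary.  Returns -1 only if that failed.
 */
int
exec_specialized(const char *libexecdir, char **argv)
{
    char        path[1024];

    if (variant_path(path, sizeof(path), libexecdir, detect_variant()) != 0)
        return -1;
    execv(path, argv);
    return -1;                  /* reached only if execv() failed */
}
```

The wrapper's main() would be little more than a call to exec_specialized() with its own argv, plus an error message if that returns. The cost is paid once per process start, so inlined code in each copy can be compiled for its target CPU with no per-call dispatch.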