On Mon, Aug 15, 2022 at 08:33:21PM +0700, John Naylor wrote:
> The attached implements the above, more or less, using new pg_lfind8()
> and pg_lfind8_le(), which in turn are based on helper functions that
> act on a single vector. The pg_lfind* functions have regression tests,
> but I haven't done the same for json yet. I went the extra step to use
> bit-twiddling for non-SSE builds using uint64 as a "vector", which
> still gives a pretty good boost (test below, min of 3):

Looks pretty reasonable to me.

> +#ifdef USE_SSE2
> +		chunk = _mm_loadu_si128((const __m128i *) &base[i]);
> +#else
> +		memcpy(&chunk, &base[i], sizeof(chunk));
> +#endif							/* USE_SSE2 */

> +#ifdef USE_SSE2
> +		chunk = _mm_loadu_si128((const __m128i *) &base[i]);
> +#else
> +		memcpy(&chunk, &base[i], sizeof(chunk));
> +#endif							/* USE_SSE2 */

Perhaps there should be a macro or inline function for loading a vector so
that these USE_SSE2 checks can be abstracted away, too.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
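
To make the suggestion concrete, here is a rough sketch of what such a
loading helper might look like. The names Vector8 and vector8_load() are
hypothetical and not taken from the patch; only USE_SSE2,
_mm_loadu_si128() and the memcpy() fallback come from the quoted hunks.
It assumes the build defines USE_SSE2 where SSE2 is available and
otherwise treats a uint64 as the "vector", as described upthread.

```c
#include <stdint.h>
#include <string.h>

#ifdef USE_SSE2
#include <emmintrin.h>
typedef __m128i Vector8;		/* hypothetical name: 16 bytes per vector */
#else
typedef uint64_t Vector8;		/* hypothetical name: 8 bytes per "vector" */
#endif

/*
 * Load sizeof(Vector8) bytes starting at s into *v.  The source pointer
 * does not need to be aligned in either branch.
 */
static inline void
vector8_load(Vector8 *v, const uint8_t *s)
{
#ifdef USE_SSE2
	*v = _mm_loadu_si128((const __m128i *) s);
#else
	memcpy(v, s, sizeof(Vector8));
#endif
}
```

With something along these lines, each call site could shrink to a
single vector8_load(&chunk, &base[i]), and the #ifdef would live in one
place.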