On Wed, 28 May 2025 at 11:24, John Naylor <johncnaylo...@gmail.com> wrote:
> https://lemire.me/blog/2025/04/13/detect-control-characters-quotes-and-backslashes-efficiently-using-swar/
>
> I don't find this use of SWAR that bad for readability, and there's
> only one obtuse clever part that merits a comment. Plus, it seems json
> escapes are pretty much set in stone?

I think we'll end up needing some SWAR code. There are plenty of
places where 16 bytes is too much to do at once. e.g. looking for the
delimiter character in a COPY FROM, 16 is likely too many when you're
importing a bunch of smallish ints. A 4 or 8 byte SWAR search is
likely better for that. With 16 you're probably going to find a
delimiter every time you look and do byte-at-a-time processing to find
that delimiter.
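
As a rough sketch of what an 8-byte SWAR delimiter search could look
like: the function names below are hypothetical and not taken from any
patch in this thread. It uses the well-known "has zero byte" trick on
chunk XOR pattern, and falls back to a byte loop to pin down the exact
position.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero iff at least one byte of v is zero. */
static inline uint64_t
swar_haszero(uint64_t v)
{
    return (v - UINT64_C(0x0101010101010101)) & ~v &
        UINT64_C(0x8080808080808080);
}

/*
 * Hypothetical helper: return the offset of the first 'delim' in
 * buf[0..len), or len if there is none.
 */
static size_t
swar_find_delim(const char *buf, size_t len, char delim)
{
    /* Broadcast the delimiter into every byte of a 64-bit word. */
    const uint64_t pattern =
        UINT64_C(0x0101010101010101) * (unsigned char) delim;
    size_t      i = 0;

    /* Skip 8 bytes at a time while no byte in the chunk matches. */
    for (; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t))
    {
        uint64_t    chunk;

        memcpy(&chunk, buf + i, sizeof(chunk));
        if (swar_haszero(chunk ^ pattern))
            break;              /* a match is somewhere in this chunk */
    }

    /* Locate the exact byte, or scan the tail shorter than 8 bytes. */
    for (; i < len; i++)
    {
        if (buf[i] == delim)
            return i;
    }
    return len;
}
```

With short fields the byte loop still runs for the chunk that contains
the delimiter, but it only has to cover at most 8 bytes rather than 16.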

> I gave this a spin with
> https://www.postgresql.org/message-id/attachment/163406/json_bench.sh.txt
>
> master:
>
> Test 1
> tps = 321.522667 (without initial connection time)
> tps = 315.070985 (without initial connection time)
> tps = 331.070054 (without initial connection time)
> Test 2
> tps = 35.107257 (without initial connection time)
> tps = 34.977670 (without initial connection time)
> tps = 35.898471 (without initial connection time)
> Test 3
> tps = 33.575570 (without initial connection time)
> tps = 32.383352 (without initial connection time)
> tps = 31.876192 (without initial connection time)
> Test 4
> tps = 810.676116 (without initial connection time)
> tps = 745.948518 (without initial connection time)
> tps = 747.651923 (without initial connection time)
>
> swar patch:
>
> Test 1
> tps = 291.919004 (without initial connection time)
> tps = 294.446640 (without initial connection time)
> tps = 307.670464 (without initial connection time)
> Test 2
> tps = 30.984440 (without initial connection time)
> tps = 31.660630 (without initial connection time)
> tps = 32.538174 (without initial connection time)
> Test 3
> tps = 29.828546 (without initial connection time)
> tps = 30.332913 (without initial connection time)
> tps = 28.873059 (without initial connection time)
> Test 4
> tps = 748.676688 (without initial connection time)
> tps = 768.798734 (without initial connection time)
> tps = 766.924632 (without initial connection time)
>
> While noisy, this test seems a bit faster with SWAR, and it's more
> portable to boot. I'm not sure where I'd put the new function so both
> call sites can see it, but that's a small detail...

Isn't that mostly a performance regression? How does it do with ANSI
chars where the high bit is set?
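
For reference, a minimal sketch of one common way a SWAR check for
JSON-escapable bytes deals with high-bit bytes. This is an assumed
formulation in the spirit of the linked post, not the code from the
patch, and the function name is hypothetical. Bytes >= 0x80 are masked
out, so they never count as needing an escape.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: true iff any of the 8 bytes at p is a control
 * character (< 0x20), a double quote, or a backslash.
 */
static inline bool
swar_needs_json_escape(const char *p)
{
    const uint64_t ones = UINT64_C(0x0101010101010101);
    const uint64_t high = UINT64_C(0x8080808080808080);
    uint64_t    x;

    memcpy(&x, p, sizeof(x));

    /* High bit set per byte only where the input byte is < 0x80. */
    uint64_t    is_ascii = ~x & high;

    /* For ASCII bytes, the high bit ends up set when the byte is < 0x20. */
    uint64_t    lt32 = x - ones * 0x20;

    /* XOR turns a matching byte into zero; subtracting 1 sets its high bit. */
    uint64_t    eq_quote = (x ^ (ones * '"')) - ones;
    uint64_t    eq_bslash = (x ^ (ones * '\\')) - ones;

    /*
     * Borrows can set spurious bits only above a byte that already
     * matched, so the overall yes/no answer stays exact.
     */
    return ((lt32 | eq_quote | eq_bslash) & is_ascii) != 0;
}
```

In this formulation, input made up entirely of high-bit bytes takes the
same path as plain ASCII and returns false.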

I had in mind we'd have a swar.h header and have a bunch of inline
functions for this in there. I've not yet studied how well compilers
would inline multiple such SWAR functions to de-duplicate the common
parts.

David

