On Wed, 28 May 2025 at 11:24, John Naylor <johncnaylo...@gmail.com> wrote:
> https://lemire.me/blog/2025/04/13/detect-control-characters-quotes-and-backslashes-efficiently-using-swar/
>
> I don't find this use of SWAR that bad for readability, and there's
> only one obtuse clever part that merits a comment. Plus, it seems json
> escapes are pretty much set in stone?

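For anyone who hasn't read the linked post, here is a rough sketch of the idea as I understand it. It is not the code from the patch. The names swar_has_json_escape and needs_json_escape are made up for illustration, and the 8-byte chunk size is an assumption. It checks 8 bytes at a time for anything below 0x20, a double quote, or a backslash, and masks out bytes with the high bit set so they can never match.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (name is illustrative): returns true if any of the
 * 8 bytes packed in x is a control character (< 0x20), '"' or '\\'.
 */
static inline bool
swar_has_json_escape(uint64_t x)
{
    const uint64_t ones = UINT64_C(0x0101010101010101);
    const uint64_t highs = UINT64_C(0x8080808080808080);

    /* Per-byte flag for bytes whose high bit is clear; others can't match. */
    uint64_t ascii = ~x & highs;

    /* High bit of a byte gets set when that byte is below 0x20. */
    uint64_t lt32 = x - ones * 0x20;

    /* XOR turns a matching byte into zero; subtracting 1 then sets its high bit. */
    uint64_t eq_quote = (x ^ (ones * '"')) - ones;
    uint64_t eq_bslash = (x ^ (ones * '\\')) - ones;

    return ((lt32 | eq_quote | eq_bslash) & ascii) != 0;
}

/* Hypothetical wrapper: does s[0..len) contain a byte JSON must escape? */
static bool
needs_json_escape(const char *s, size_t len)
{
    size_t i = 0;

    /* Process full 8-byte chunks; memcpy avoids unaligned-access issues. */
    for (; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t))
    {
        uint64_t chunk;

        memcpy(&chunk, s + i, sizeof(chunk));
        if (swar_has_json_escape(chunk))
            return true;
    }

    /* Handle the remaining tail one byte at a time. */
    for (; i < len; i++)
    {
        unsigned char c = (unsigned char) s[i];

        if (c < 0x20 || c == '"' || c == '\\')
            return true;
    }
    return false;
}
```

Borrows can leak from one byte into the next during the subtractions, but only above a byte that genuinely matched, so the "any byte matches" answer stays exact even though the individual per-byte flags are not.
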
I think we'll end up needing some SWAR code. There are plenty of places
where 16 bytes is too much to do at once. For example, when looking for
the delimiter character in a COPY FROM, 16 is likely too many when
you're importing a bunch of smallish ints. A 4 or 8 byte SWAR search is
likely better for that. With 16 you're probably going to find a
delimiter every time you look and then do byte-at-a-time processing to
find that delimiter.

> I gave this a spin with
> https://www.postgresql.org/message-id/attachment/163406/json_bench.sh.txt
>
> master:
>
> Test 1
> tps = 321.522667 (without initial connection time)
> tps = 315.070985 (without initial connection time)
> tps = 331.070054 (without initial connection time)
> Test 2
> tps = 35.107257 (without initial connection time)
> tps = 34.977670 (without initial connection time)
> tps = 35.898471 (without initial connection time)
> Test 3
> tps = 33.575570 (without initial connection time)
> tps = 32.383352 (without initial connection time)
> tps = 31.876192 (without initial connection time)
> Test 4
> tps = 810.676116 (without initial connection time)
> tps = 745.948518 (without initial connection time)
> tps = 747.651923 (without initial connection time)
>
> swar patch:
>
> Test 1
> tps = 291.919004 (without initial connection time)
> tps = 294.446640 (without initial connection time)
> tps = 307.670464 (without initial connection time)
> Test 2
> tps = 30.984440 (without initial connection time)
> tps = 31.660630 (without initial connection time)
> tps = 32.538174 (without initial connection time)
> Test 3
> tps = 29.828546 (without initial connection time)
> tps = 30.332913 (without initial connection time)
> tps = 28.873059 (without initial connection time)
> Test 4
> tps = 748.676688 (without initial connection time)
> tps = 768.798734 (without initial connection time)
> tps = 766.924632 (without initial connection time)
>
> While noisy, this test seems a bit faster with SWAR, and it's more
> portable to boot.
> I'm not sure where I'd put the new function so both
> call sites can see it, but that's a small detail...

Isn't that mostly a performance regression? Tests 1 to 3 all show lower
tps with the patch, and only Test 4 looks about even. How does it do
with ANSI chars where the high bit is set?

I had in mind we'd have a swar.h header and have a bunch of inline
functions for this in there. I've not yet studied how well compilers
would inline multiple such SWAR functions to de-duplicate the common
parts.

David
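
To make the swar.h idea a bit more concrete, below is a minimal sketch of the sort of inline helpers I mean, applied to the delimiter search case. Nothing here exists in the tree. The names swar_broadcast, swar_has_zero_byte, swar_has_byte and find_delim are hypothetical, and the 8-byte width is just one choice. The SWAR test only answers "is the delimiter somewhere in these 8 bytes?", and the exact position is then found byte-at-a-time, which sidesteps any endianness concerns.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical swar.h-style helpers; all names are illustrative only. */

/* Copy byte c into every byte of a 64-bit word. */
static inline uint64_t
swar_broadcast(unsigned char c)
{
    return UINT64_C(0x0101010101010101) * c;
}

/* True if any byte of v is zero (the classic "has zero byte" bit trick). */
static inline bool
swar_has_zero_byte(uint64_t v)
{
    return ((v - UINT64_C(0x0101010101010101)) & ~v &
            UINT64_C(0x8080808080808080)) != 0;
}

/* True if any byte of x equals c; XOR maps matching bytes to zero. */
static inline bool
swar_has_byte(uint64_t x, unsigned char c)
{
    return swar_has_zero_byte(x ^ swar_broadcast(c));
}

/* Return the offset of the first delim in s[0..len), or len if absent. */
static size_t
find_delim(const char *s, size_t len, char delim)
{
    size_t i = 0;

    /* Skip whole 8-byte chunks that cannot contain the delimiter. */
    while (i + sizeof(uint64_t) <= len)
    {
        uint64_t chunk;

        memcpy(&chunk, s + i, sizeof(chunk));
        if (swar_has_byte(chunk, (unsigned char) delim))
            break;              /* delimiter is within these 8 bytes */
        i += sizeof(uint64_t);
    }

    /* Locate it (or scan the short tail) one byte at a time. */
    for (; i < len; i++)
    {
        if (s[i] == delim)
            return i;
    }
    return len;
}
```

The `& ~v` term is what keeps bytes with the high bit set from producing false matches, so multibyte or other non-ASCII input should behave the same as plain ASCII here. Whether a compiler merges the shared constants when several such helpers are inlined together is exactly the part I haven't looked at.
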