Karan Sikka <karanssi...@gmail.com> writes:
> Just for the record, I'm leaning to the side of "not worth it".  My
> thoughts are, if you are at a scale where you care about this 20%
> speedup, you would want to go all the way to an indexed structure,
> because the gains you would realize would exceed 20%, and 20% may not be
> a sufficient speedup anyway.
If I'm reading your test case correctly, 20% is actually a rather
impressive number, because it's 20% *overall* gain on queries that will
also involve TOAST fetch and decompress on the source data.  (Decompress
definitely, and I'm guessing those 5K strings don't compress well enough
to avoid getting pushed out-of-line; though it might be worth repeating
the test with chunks of 10K or 20K to be sure.)  So the percentage
improvement in the LIKE test proper must have been a lot more than that.

However, I'm dubious that LIKE patterns with long fixed substrings are a
common use-case, so I'm afraid that this might be quite a lot of work
for something that won't much benefit most users.  I'm also worried that
the setup costs might be enough to make it a net loss in many cases.
There are probably ways to amortize the setup costs, since typical
scenarios involve the same LIKE pattern across many rows, but
implementing that would add even more work.

(Having said that, I've had a bee in my bonnet for a long time about
removing per-row setup cost for repetitive regex matches, and whatever
infrastructure that needs would work for this too.  And for strpos'
B-M-H setup, looks like.  So this might be something to look into with a
suitably wide view of what the problem is.)

Not sure what advice to give you here.  I think this is in the grey
zone where it's hard to be sure whether it's worth putting work into.

			regards, tom lane
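PS: To illustrate the sort of setup-cost amortization I mean: below is
a minimal, self-contained Boyer-Moore-Horspool sketch, not proposed
backend code.  All the names here (bmh_state, bmh_prepare, bmh_search)
are invented for illustration.  The point is just that the shift-table
build is paid once per pattern, while each per-row call pays only for
the scan itself.

/*
 * Illustrative sketch only; names are hypothetical.  The O(len + 256)
 * table build in bmh_prepare() runs once per pattern, and bmh_search()
 * can then be called once per row with no per-row setup cost.
 */
#include <stddef.h>
#include <string.h>

typedef struct bmh_state
{
	const char *needle;
	size_t		needle_len;
	size_t		skip[256];		/* bad-character shift table */
} bmh_state;

/* One-time setup: build the shift table for a fixed search pattern. */
static void
bmh_prepare(bmh_state *st, const char *needle, size_t needle_len)
{
	st->needle = needle;
	st->needle_len = needle_len;
	for (int c = 0; c < 256; c++)
		st->skip[c] = needle_len;
	for (size_t i = 0; i + 1 < needle_len; i++)
		st->skip[(unsigned char) needle[i]] = needle_len - 1 - i;
}

/* Per-row call: reuses the precomputed table, pays only for the scan. */
static const char *
bmh_search(const bmh_state *st, const char *haystack, size_t haystack_len)
{
	size_t		n = st->needle_len;

	if (n == 0 || n > haystack_len)
		return (n == 0) ? haystack : NULL;
	for (size_t pos = 0; pos <= haystack_len - n;)
	{
		if (memcmp(haystack + pos, st->needle, n) == 0)
			return haystack + pos;
		/* shift by the skip for the character under the window's end */
		pos += st->skip[(unsigned char) haystack[pos + n - 1]];
	}
	return NULL;
}

In the backend, the prepared state would presumably hang off something
like the function's fn_extra, keyed on the pattern, which is also where
a cached regex compilation could plausibly live.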