On Mon, Jan 10, 2022 at 11:25:08AM +0800, Julien Rouhaud wrote:
> On Fri, Jan 07, 2022 at 03:25:28PM +0100, Peter Eisentraut wrote:
> >
> > I tested this a bit. I used the following setup:
> >
> > create table t1 (a text);
> > insert into t1 select md5(generate_series(1, 10000000)::text);
> > select count(*) from t1 where a > '';
> >
> > And then I changed in varstr_cmp():
> >
> > if (collid != DEFAULT_COLLATION_OID)
> >     mylocale = pg_newlocale_from_collation(collid);
> >
> > to just
> >
> > mylocale = pg_newlocale_from_collation(collid);
> >
> > I find that the \timing results are indistinguishable. (I used locale
> > "en_US.UTF-8" and made sure that that code path is actually hit.)
> >
> > Does anyone have other insights?
>
> Looking at the git history, you added this comment in 414c5a2ea65.
>
> After a bit a digging in the lists, I found that you introduced it to fix a
> reported 13% slowdown in varstr_cmp():
> https://www.postgresql.org/message-id/20110129075253.GA18784%40tornado.leadboat.com
> https://www.postgresql.org/message-id/1296748408.6442.1.camel%40vanquo.pezone.net

So I tried to run Noah's benchmark to see if I could reproduce the slowdown.
Unfortunately, the results I'm getting don't really make sense: removing the
optimisation gives a 15% speedup, and after a few more runs I can see about
25% noise between runs. The difference is therefore within the noise, so
there isn't much I can do to help.