Hello, I looked at this patch too.

It's a good idea. I think we could have a function or macro that
compares two varlena sizes, something like:

    Datum
    texteq(PG_FUNCTION_ARGS)
    {
        if (!datumsHasSameLength(PG_GETARG_DATUM(0), PG_GETARG_DATUM(1)))
            PG_RETURN_BOOL(false);

        /* ... actual code ... */
    }
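The helper itself could be a one-line macro. Just a sketch (the name
datumsHasSameLength is only a placeholder; I am assuming
toast_raw_datum_size() from access/tuptoaster.h, which computes the raw
length from the varlena header without detoasting the value):

    #include "access/tuptoaster.h"

    /*
     * True if two varlena datums have the same raw (detoasted) length.
     * toast_raw_datum_size() reads the length from the header alone, so
     * neither value has to be detoasted or decompressed for this test.
     */
    #define datumsHasSameLength(d1, d2) \
        (toast_raw_datum_size(d1) == toast_raw_datum_size(d2))

Then every varlena equality function (texteq, byteaeq, ...) could share
the same cheap pre-check.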
Regards

Pavel Stehule

2011/1/16 Andy Colson <a...@squeakycode.net>:
> This is a review of:
> https://commitfest.postgresql.org/action/patch_view?id=468
>
> Purpose:
> ========
> Equal and not-equal _may_ be quickly determined if their lengths are
> different. This _may_ be a huge speed-up, because we don't have to
> detoast.
>
>
> The Patch:
> ==========
> I was able to read and understand the patch; it's a simple change and
> looked correct to me (a non PG hacker).
> It applies cleanly to git head, and compiles and runs fine with debug
> enabled.
>
> make check passes
>
>
> Usability:
> ==========
> I used _may_ above. The benchmark included with the patch, showing
> huge speedups, is really contrived. It uses a where clause with a
> thousand-character constant: (where c =
> 'long...long...long...long...ConstantText...etc'). In my opinion this
> is very uncommon (the author does note this is a "best case"). If you
> have a field large enough to be toasted, you are not going to be
> searching on it; you are going to have an ID field that is indexed.
> (select c where id = 7)
>
> This also only touches = and <>; < and LIKE won't be touched. So I
> think the scope of this is limited.
>
> THAT being said, the patch is simple, and if you do happen to hit the
> code, it will speed things up. As a user of PG I'd like to have this
> included. It's a corner case, but a big corner, and it's a small,
> simple change that won't slow anything else down.
>
>
> Performance:
> ============
> I created a more real-world test, with a table with indexes, IDs, and
> a large toasted field:
>
> create table junk(id serial primary key, xgroup integer, c text);
> create index junk_group on junk(xgroup);
>
> I filled it full of junk:
>
> do $$
> declare i integer;
> declare j integer;
> begin
>     for i in 1..100 loop
>         for j in 1..500 loop
>             insert into junk(xgroup, c) values (j, 'c'||i);
>             insert into junk(xgroup, c)
>                 select j, repeat('abc', 2000) || n
>                 from generate_series(1, 5) n;
>         end loop;
>     end loop;
> end$$;
>
> This makes about 600 records within each xgroup, as well as a simple
> 'c15'-style value in c that we can search for. My thinking is you may
> not know the exact unique id, but you do know what group it's in, so
> that'll cut out 90% of the records, and then you'll have to add
> " and c = 'c15'" to get the exact one you want.
>
> I still saw a nice performance boost.
>
> Old PG:
> $ psql < bench3.sql
> Timing is on.
> DO
> Time: 2010.412 ms
>
> Patched:
> $ psql < bench3.sql
> Timing is on.
> DO
> Time: 184.602 ms
>
> bench3.sql:
>
> do $$
> declare i integer;
> begin
>     for i in 1..400 loop
>         perform count(*) from junk
>             where xgroup = i and c like 'c' || i;
>     end loop;
> end$$;
>
>
> Summary:
> ========
> Performance speed-up: Oh yeah! If you just happen to hit it. (Though
> if you do hit it, you might want to rethink your layout a little bit.)
>
> Do I want it? Yes, please.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers