On Thu, Sep 4, 2014 at 11:12 AM, Peter Geoghegan <p...@heroku.com> wrote:
> What I consider an open question is whether or not we should do that
> on the first call when there is no abbreviated comparison, such as on
> the second or subsequent attribute in a multi-column sort, in the
> hope that equality will just happen to be indicated.
> Let me try and come up with some numbers for a really unsympathetic
> case, since you've already seen sympathetic numbers.

So I came up with what I imagined to be an unsympathetic case:

postgres=# create table opt_eq_test as select 1 as dummy, country || ', ' || city as country from cities order by city;
SELECT 317102
[local]/postgres=# select * from opt_eq_test limit 5;
 dummy |         country
-------+--------------------------
     1 | India, 108 Kalthur
     1 | Mexico, 10 de Abril
     1 | India, 113 Thalluru
     1 | Argentina, 11 de Octubre
     1 | India, 11 Dlp
(5 rows)

I added the dummy column to prevent abbreviated keys from being used - this is all about the question of trying to get away with a "memcmp() == 0" in all circumstances, without abbreviated keys/statistics on attribute cardinality. This is a question that has nothing to do with abbreviated keys in particular.
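To make that concrete, the fast path being exercised looks roughly like this (a minimal sketch only, using plain C strings and strcoll() as stand-ins; the real code is bttextfastcmp_locale() in varlena.c, which works on text datums that are not NUL-terminated):

#include <string.h>

static int
text_cmp_sketch(const char *a, const char *b)
{
	size_t		len1 = strlen(a);
	size_t		len2 = strlen(b);

	/*
	 * Opportunistic fast path: equal lengths plus equal bytes prove the
	 * strings equal, so the expensive locale-aware comparison is skipped
	 * entirely.  When the strings turn out to be unequal, the memcmp()
	 * is wasted work -- that waste is what this unsympathetic case is
	 * meant to measure.
	 */
	if (len1 == len2 && memcmp(a, b, len1) == 0)
		return 0;

	/* Authoritative locale-aware comparison (the expensive part). */
	return strcoll(a, b);
}

In the patch as posted, that fast path is only taken when the comparator is called as an abbreviated key tie-breaker (ABBREVIATED_KEYS_TIE); the one-line diff further down removes that condition so the memcmp() is attempted in all circumstances.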
With the most recent revision of the patch, performance of a representative query against this data stabilizes as follows on my laptop:

LOG: duration: 2252.500 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2261.505 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2315.903 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2260.132 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2247.340 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2246.723 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2276.297 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2241.911 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2259.540 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2248.740 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2245.913 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2230.583 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;

If I apply the additional optimization that we're on the fence about:

--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -1967,7 +1967,7 @@ bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup)
 	len1 = VARSIZE_ANY_EXHDR(arg1);
 	len2 = VARSIZE_ANY_EXHDR(arg2);
 
-	if (ssup->abbrev_state == ABBREVIATED_KEYS_TIE && len1 == len2)
+	if (len1 == len2)
 	{
 		/*
 		 * Being called as authoritative tie-breaker for an abbreviated key

Then the equivalent numbers look like this:

LOG: duration: 2178.220 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2175.005 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2219.648 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2174.865 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2246.387 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2234.023 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2186.957 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2177.778 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2186.709 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2171.557 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2211.822 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2224.198 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;
LOG: duration: 2192.506 ms statement: select * from (select * from opt_eq_test order by dummy, country offset 1000000) d;

That looks like a small improvement. It turns out that some cities are not the only city with that name in their country. For example, there are about 5 Dublins in the United States, although they are fairly far apart from one another. There are 285,260 distinct country + city combinations out of a total of 317,102 cities in this sample dataset.

I thought about doing something equivalent with (dummy, province || city), but that's not interesting because there are lots of distinct "provinces" (e.g. American states, Canadian provinces, British counties) in the world, so a wasted memcmp() is minimally wasteful. I also thought about looking at (dummy, country || province || city), but the expense of sorting larger strings tends to dominate there.

I'm now more confident that being optimistic about getting away with a "memcmp() == 0" in all circumstances is the right decision. I'm still not quite sure whether:

1) We should worry about very long strings, where our optimism is unjustified, or

2) It's worth surfacing ABBREVIATED_KEYS_TIE within the abbreviated key infrastructure at all. Perhaps there is something to be said for "getting out of the way of branch prediction".

In any case, even if I actually saw a small regression here, I'd probably still say it was worth it to get the big improvements to sympathetic though reasonably representative cases that we've seen with this technique [1]. As things stand, it looks well worth it to me.

[1] http://www.postgresql.org/message-id/cam3swzqtyv3kp+cakzjzv3rwb1ojjahwpcz9coyjxpkhbtc...@mail.gmail.com

-- 
Peter Geoghegan