On Wed, Jun 19, 2019 at 8:51 PM Binguo Bao <djydew...@gmail.com> wrote:

> [v4 patch]
Hi Binguo,

I can verify I get no warnings with the v4 patch. I've done some
additional performance testing. First, to sum up your results:

> insert into detoast_c (a) select
> repeat('1234567890-=abcdefghijklmnopqrstuvwxyz', 1000000)||'321' as a from
> generate_series(1,100);

When the search pattern was at the beginning, the patch was several
times faster, and when the pattern was at the end, it was 3% slower
when uncompressed and 9% slower when compressed.

First, I'd like to advocate for caution when using synthetic
benchmarks involving compression. Consider this test:

insert into detoast_c (a)
select 'abc' ||
       repeat((SELECT string_agg(md5(chr(i)), '')
               FROM generate_series(1,127) i), 10000)
       || 'xyz'
from generate_series(1,100);

The results for the uncompressed case were not much different than in
your test. However, in the compressed case the iterator doesn't buy us
much with beginning searches, since full decompression is already fast:

               master    patch
comp. beg.      869ms    837ms
comp. end     14100ms  16100ms
uncomp. beg.   6360ms    800ms
uncomp. end   21100ms  21400ms

and with compression it's 14% slower searching to the end. This is
pretty contrived, but I include it for demonstration.

To test something hopefully a bit more realistic, I loaded 100 records
each containing the 1995 CIA fact book (~3MB of ascii) with a pattern
string put at the beginning and end. For the end search, I used a
longer needle to speed up the consumption of text, hoping to put more
stress on the detoasting algorithms, for example:

select max(position('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' in a))
from detoast_*;

               master    patch
comp. beg.      836ms     22ms
comp. end      1510ms   1700ms
uncomp. beg.    185ms     12ms
uncomp. end     851ms    903ms

Here, the "beginning" case is ~15-35x faster, which is very impressive
and much faster than with your generated contents. The "end" case is up
to 13% slower. It would be interesting to see where the break-even
point is, where the results are the same.

Reading the thread where you're working on optimizing partial
decompression [1], it seems you have two separate solutions for the two
problems. Maybe this is fine, but I'd like to bring up the possibility
of using the same approach for both kinds of callers. I'm not an expert
on TOAST, but maybe one way to solve both problems is to work at the
level of whole TOAST chunks. In that case, the current patch would look
like this:

1. The caller requests more of the attribute value from the de-TOAST
iterator.
2. The iterator gets the next chunk and either copies or decompresses
the whole chunk into the buffer. (If inline, just decompress the whole
thing.)

(A rough sketch of what such an interface could look like is at the end
of this mail.)

This seems simpler and also easy to adapt to callers that do know how
big a slice they want. I also suspect this way would be easier to adapt
to future TOAST formats not tied to heap, or to a certain compression
algorithm. With less bookkeeping overhead, maybe there'll be less
worst-case performance degradation, while not giving up much in the
best case.

(Note also that commit 9556aa01c6 already introduced some performance
degradation in near-end searches when using multibyte strings. This
patch would add to that.) The regression doesn't seem large, but I see
more than your test showed, and it would be nice to avoid it.

Thoughts, anyone?

[1] https://www.postgresql.org/message-id/flat/CAL-OGkux7%2BBm_J%3Dt5VpH7fJGGSm%2BPxWJtgs1%2BWU2g6cmLru%3D%3DA%40mail.gmail.com#705d074aa4ae305ed3d992b7e5b7af3c

--
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
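P.S. To make the chunk-level idea above a bit more concrete, here's a
minimal sketch of what such an iterator could look like, under the
assumption that whole chunks are always expanded into a single output
buffer. All of the names here (DetoastIterator, fetch_next_toast_chunk,
pglz_decompress_iterate) are made up for illustration and are not from
the v4 patch; a real version would also need to handle inline values,
short varlenas, and error cases.

#include "postgres.h"
#include "lib/stringinfo.h"

typedef struct DetoastIterator
{
	StringInfoData buf;			/* de-TOASTed bytes produced so far */
	bool		compressed;		/* is the stored value compressed? */
	bool		done;			/* no more chunks to fetch? */
	/* ... plus whatever state is needed to walk the chunks in order */
} DetoastIterator;

/* hypothetical helpers, assumed to exist for this sketch */
extern bool fetch_next_toast_chunk(DetoastIterator *iter,
								   char **chunk, int32 *chunklen);
extern void pglz_decompress_iterate(StringInfo dst,
									const char *src, int32 srclen);

/*
 * Make at least "needed" bytes of the value available in iter->buf,
 * fetching and expanding one whole TOAST chunk per loop iteration.
 */
static void
detoast_iterate(DetoastIterator *iter, int32 needed)
{
	char	   *chunk;
	int32		chunklen;

	while (iter->buf.len < needed && !iter->done)
	{
		/* 1. caller asked for more: get the next chunk */
		if (!fetch_next_toast_chunk(iter, &chunk, &chunklen))
		{
			iter->done = true;
			break;
		}

		/* 2. copy or decompress the whole chunk into the buffer */
		if (iter->compressed)
			pglz_decompress_iterate(&iter->buf, chunk, chunklen);
		else
			appendBinaryStringInfo(&iter->buf, chunk, chunklen);
	}
}

A caller like text search could then just call detoast_iterate() with a
growing "needed" value and scan iter->buf as it fills, stopping as soon
as it finds a match.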