Re: [PATCH] Incremental sort (was: PoC: Partial sort)

Tomas Vondra Fri, 13 Mar 2020 17:13:29 -0700

On Fri, Mar 13, 2020 at 04:31:16PM -0400, James Coleman wrote:

On Fri, Mar 13, 2020 at 2:23 PM James Coleman <[email protected]> wrote:


On Tue, Mar 10, 2020 at 10:44 PM Tomas Vondra
<[email protected]> wrote:
> 3) Most of the execution plans look reasonable, except that some of the
> plans look like this:
>
>
>                           QUERY PLAN
>    ---------------------------------------------------------
>     Limit
>       ->  GroupAggregate
>             Group Key: t.a, t.b, t.c, t.d
>             ->  Incremental Sort
>                   Sort Key: t.a, t.b, t.c, t.d
>                   Presorted Key: t.a, t.b, t.c
>                   ->  Incremental Sort
>                         Sort Key: t.a, t.b, t.c
>                         Presorted Key: t.a, t.b
>                         ->  Index Scan using t_a_b_idx on t
>    (10 rows)
>
> i.e. there are two incremental sorts on top of each other, with
> different prefixes. But this this is not a new issue - it happens with
> queries like this:
>
>    SELECT a, b, c, d, count(*) FROM (
>      SELECT * FROM t ORDER BY a, b, c
>    ) foo GROUP BY a, b, c, d limit 1000;
>
> i.e. there's a subquery with a subset of pathkeys. Without incremental
> sort the plan looks like this:
>
>                     QUERY PLAN
>    ---------------------------------------------
>     Limit
>       ->  GroupAggregate
>             Group Key: t.a, t.b, t.c, t.d
>             ->  Sort
>                   Sort Key: t.a, t.b, t.c, t.d
>                   ->  Sort
>                         Sort Key: t.a, t.b, t.c
>                         ->  Seq Scan on t
>    (8 rows)
>
> so essentially the same plan shape. What bugs me though is that there
> seems to be some sort of memory leak, so that this query consumes
> gigabytes os RAM before it gets killed by OOM. But the memory seems not
> to be allocated in any memory context (at least MemoryContextStats don't
> show anything like that), so I'm not sure what's going on.
>
> Reproducing it is fairly simple:
>
>    CREATE TABLE t (a bigint, b bigint, c bigint, d bigint);
>    INSERT INTO t SELECT
>      1000*random(), 1000*random(), 1000*random(), 1000*random()
>    FROM generate_series(1,10000000) s(i);
>    CREATE INDEX idx ON t(a,b);
>    ANALYZE t;
>
>    EXPLAIN ANALYZE SELECT a, b, c, d, count(*)
>    FROM (SELECT * FROM t ORDER BY a, b, c) foo GROUP BY a, b, c, d
>    LIMIT 100;

While trying to reproduce this, instead of lots of memory usage, I got
the attached assertion failure instead.


And, without the EXPLAIN ANALYZE was able to get this one, which will
probably be a lot more helpful.


Hmmm, I'll try reproducing it, but can you investigate the values in the
Assert? I mean, it fails on this:

  Assert(total_allocated == context->mem_allocated);

so can you get a core or attach to the process using gdb, and see what's
the expected / total value?

BTW, I might have copied the wrong query - can you try with a higher
value in the LIMIT clause? For example:

    EXPLAIN ANALYZE SELECT a, b, c, d, count(*)
    FROM (SELECT * FROM t ORDER BY a, b, c) foo GROUP BY a, b, c, d
    LIMIT 1000000;

I think this might be the differenc ewhy you don't see the memory leak.
Or maybe it was because of asserts? I'm not sure if I had enabled them
in the build ...


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PATCH] Incremental sort (was: PoC: Partial sort)

Reply via email to