Re: to_jsonb performance on array aggregated correlated subqueries

Nico Heller Fri, 12 Aug 2022 12:18:54 -0700


On 12.08.2022 at 21:15, Rick Otten wrote:



On Fri, Aug 12, 2022 at 3:07 PM Nico Heller wrote:

    On 12.08.2022 at 21:02, Rick Otten wrote:



    On Fri, Aug 12, 2022 at 2:50 PM Nico Heller wrote:

        Good day,

        consider the following query:

        WITH aggregation AS (
             SELECT
                    a.*,
                    (SELECT array_agg(b.*) FROM b WHERE b.a_id = a.id) as "bs",
                    (SELECT array_agg(c.*) FROM c WHERE c.a_id = a.id) as "cs",
                    (SELECT array_agg(d.*) FROM d WHERE d.a_id = a.id) as "ds",
                    (SELECT array_agg(e.*) FROM e WHERE e.a_id = a.id) as "es"
             FROM a WHERE a.id IN (<some big list, ranging from 20-180 entries>)
        )
        SELECT to_jsonb(aggregation.*) as "value" FROM aggregation;


    - You do have an index on `b.a_id` and `c.a_id`, etc... ?  You
    didn't say...

    Yes there are indices on all referenced columns of the subselect
    (they are all primary keys anyway)

    - Are you sure it is the `to_jsonb` that is making this query slow?

    Yes, EXPLAIN ANALYZE shows a doubling of execution time - I don't
    have numbers on the memory usage difference though


    - Since you are serializing this for easy machine readable
    consumption outside of the database, does it make a difference if
    you use `to_json` instead?

    Using to_json vs. to_jsonb makes no difference with regard to
    runtime. I will check whether the memory consumption is different
    on Monday - thank you for the idea!

One other thought. Does it help if you convert the arrays to json
first before you convert the whole row? ie, add some to_json()'s
around the bs, cs, ds, es columns in the CTE. I'm wondering if
breaking the json conversions up into smaller pieces will let the
outer to_json() have less work to do and overall run faster. You
could even separately serialize the elements inside the array too. I
wouldn't think it would make a huge difference, you'd be making a
bunch of extra to_json calls, but maybe it avoids some large memory
structure that would otherwise have to be constructed to serialize all
of those objects in all of the arrays all at the same time.
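
A minimal sketch of that suggestion, reusing the table names from the query quoted above. The ids in the IN list are placeholders (the real list and the type of a.id are not given in the thread), and only the b and c subqueries are shown for brevity. Whether this helps at all would have to be measured with EXPLAIN ANALYZE.

```sql
-- Sketch only: serialize each aggregated array inside the CTE, so the
-- outer to_jsonb() receives jsonb values instead of composite arrays.
WITH aggregation AS (
    SELECT
        a.*,
        (SELECT to_jsonb(array_agg(b.*)) FROM b WHERE b.a_id = a.id) AS "bs",
        (SELECT to_jsonb(array_agg(c.*)) FROM c WHERE c.a_id = a.id) AS "cs"
    FROM a
    WHERE a.id IN (1, 2, 3)  -- placeholder ids, not from the original query
)
SELECT to_jsonb(aggregation.*) AS "value" FROM aggregation;
```

Columns that are already jsonb are embedded as-is by the outer to_jsonb(), so the final document has the same shape as before.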

Using jsonb_agg and another to_jsonb at the end (it's still needed to
create one value at the end and to include the columns "a.*") worsens
the query performance by 100%. I can't speak for the memory usage
because I would have to push these changes to preproduction - will try
this on Monday, thanks.
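
For reference, the variant described here might look like the sketch below. It assumes the aggregate in question is PostgreSQL's jsonb_agg; the ids are placeholders and only the b and c subqueries are shown. According to the report above this form ran slower, so it is included only to make the comparison concrete.

```sql
-- Sketch only: aggregate straight into jsonb in each subquery, then
-- wrap the whole row (including a.*) in one final to_jsonb().
WITH aggregation AS (
    SELECT
        a.*,
        (SELECT jsonb_agg(b.*) FROM b WHERE b.a_id = a.id) AS "bs",
        (SELECT jsonb_agg(c.*) FROM c WHERE c.a_id = a.id) AS "cs"
    FROM a
    WHERE a.id IN (1, 2, 3)  -- placeholder ids, not from the original query
)
SELECT to_jsonb(aggregation.*) AS "value" FROM aggregation;
```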
