Re: sql/json remaining issue

Amit Langote Tue, 09 Apr 2024 04:38:00 -0700

Hi,

On Tue, Apr 9, 2024 at 4:47 PM jian he <jian.universal...@gmail.com> wrote:
>
> hi.
> `
>  | NESTED [ PATH ] json_path_specification [ AS json_path_name ]
> COLUMNS ( json_table_column [, ...] )
> NESTED [ PATH ] json_path_specification [ AS json_path_name ] COLUMNS
> ( json_table_column [, ...] )
> `
>  "json_path_specification" should be "path_expression"?
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> drop table s1;
> create or replace function random_text_1000() returns text
> as $$select string_agg(md5(random()::text),'') from
> generate_Series(1,1000) s $$ LANGUAGE SQL;
>
> create unlogged table s1(a int GENERATED BY DEFAULT AS IDENTITY, js jsonb);
> insert into s1(js)
> select jsonb ('{"a":{"za":[{"z1": [11,2222]},{"z21": [22, 234,' || g
> || ']},{"z22": [32, 204,145]}]},"c": ' || g
> || ',"id": "' || random_text_1000() || '"}')
> from generate_series(1_000_000, 1_000_000) g;
> insert into s1(js)
> select jsonb ('{"a":{"za":[{"z1": [11,2222]},{"z21": [22, 234,' || g
> || ']},{"z22": [32, 204,145]}]},"c": ' || g
> || ',"id": "' || random_text_1000() || '"}')
> from generate_series(235, 235 + 200000,1) g;
>
> select count(*), pg_size_pretty(pg_total_relation_size('s1')) from s1;
>  count  | pg_size_pretty
> --------+----------------
>  200002 | 6398 MB
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> explain(analyze, costs off,buffers, timing)
> SELECT sub.*, s.a as s_a FROM s,
> (values(23)) x(x),
> generate_series(13, 13) y,
> JSON_TABLE(js, '$' AS c1 PASSING x AS x, y AS y
> COLUMNS (
> xx1 int PATH '$.c',
> NESTED PATH '$.a.za[2]' COLUMNS (NESTED PATH '$.z22[*]' as z22 COLUMNS
> (c int PATH '$')),
> NESTED PATH '$.a.za[1]' COLUMNS (d json PATH '$ ? (@.z21[*] == ($"x" -1))'),
> NESTED PATH '$.a.za[0]' COLUMNS (NESTED PATH '$.z1[*] ? (@ >= ($"x"
> -2))' as z1 COLUMNS (a int PATH '$')),
> NESTED PATH '$.a.za[1]' COLUMNS
> (NESTED PATH '$.z21[*] ? (@ >= ($"y" +121))' as z21 COLUMNS (b int
> PATH '$ ? (@ <= ($"x" + 999976))' default -1000 ON EMPTY))
> )) sub;
>
> for one jsonb, it can expand to 7 rows, the above query will return
> around 1.4 million rows.
> i use the above query, and pg_log_backend_memory_contexts in another
> session to check the memory usage.
> didn't find memory over consumed issue.
>
> but I am not sure this is the right way to check the memory consumption.
> ----------------------------------------------------------------------------------------------------------------------
> begin;
> SELECT sub.*, s.a as s_a FROM s,
> (values(23)) x(x),
> generate_series(13, 13) y,
> JSON_TABLE(js, '$' AS c1 PASSING x AS x, y AS y
> COLUMNS (
> xx1 int PATH '$.c',
> NESTED PATH '$.a.za[2]' COLUMNS (NESTED PATH '$.z22[*]' as z22 COLUMNS
> (c int PATH '$')),
> NESTED PATH '$.a.za[1]' COLUMNS (d json PATH '$ ? (@.z21[*] == ($"x" -1))'),
> NESTED PATH '$.a.za[0]' COLUMNS (NESTED PATH '$.z1[*] ? (@ >= ($"x"
> -2))' as z1 COLUMNS (a int PATH '$')),
> NESTED PATH '$.a.za[1]' COLUMNS
> (NESTED PATH '$.z21[*] ? (@ >= ($"y" +121))' as z21 COLUMNS (b int
> PATH '$ ? (@ <= ($"x" + 999976))' error ON EMPTY))
> )) sub;
> rollback;
>
> only the last row will fail, because of "error ON EMPTY", (1_000_000
> <= 23 + 999976) is false.
> I remember the very previous patch, because of error cleanup, it took
> a lot of resources.
> does our current implementation, only the very last row fail, will it
> be easy to clean up the transaction?


I am not sure I understand your concern.  Could you please rephrase
it?  Which previous patch are you referring to and what problem did it
cause with respect to error cleanup?

Per-row memory allocated for each successful output row JSON_TABLE()
doesn't pile up, because it's allocated in a context that is reset
after evaluating each row; see tfuncLoadRows().  But again I may be
misunderstanding your concern.

> the last query error message is:
> `
> ERROR:  no SQL/JSON item
> `
>
> we are in ExecEvalJsonExprPath, can we output it to be:
> `
> ERROR:  after applying json_path "5s", no SQL/JSON item found
> `
> in a json_table query, we can have multiple path_expressions, like the
> above query.
> it's not easy to know applying which path_expression failed.

Hmm, I'm not so sure about mentioning the details of the path because
path names are optional and printing path expression itself is not a
good idea.  Perhaps, we could mention the column name which would
always be there, but we'd then need to add a new field column_name
that's optionally set to JsonFuncExpr and JsonExpr, that is, when they
are being set up for JSON_TABLE() columns.  As shown in the attached.
With the patch you'll get:

ERROR:  no SQL/JSON item found for column "b"

--
Thanks, Amit Langote

v1-0001-JSON_TABLE-mention-column-name-in-the-ON-EMPTY-er.patch
Description: Binary data

Re: sql/json remaining issue

Reply via email to