Hi Tomas,

On Wed, Mar 6, 2024 at 6:30 AM Tomas Vondra
<tomas.von...@enterprisedb.com> wrote:
>
> Hi,
>
> I know very little about sql/json and all the json internals, but I
> decided to do some black box testing. I built a large JSONB table
> (single column, ~7GB of data after loading). And then I did a query
> transforming the data into tabular form using JSON_TABLE.
>
> The JSON_TABLE query looks like this:
>
> SELECT jt.* FROM
>   title_jsonb t,
>   json_table(t.info, '$'
>     COLUMNS (
>       "id" text path '$."id"',
>       "type" text path '$."type"',
>       "title" text path '$."title"',
>       "original_title" text path '$."original_title"',
>       "is_adult" text path '$."is_adult"',
>       "start_year" text path '$."start_year"',
>       "end_year" text path '$."end_year"',
>       "minutes" text path '$."minutes"',
>       "genres" text path '$."genres"',
>       "aliases" text path '$."aliases"',
>       "directors" text path '$."directors"',
>       "writers" text path '$."writers"',
>       "ratings" text path '$."ratings"',
>       NESTED PATH '$."aliases"[*]'
>         COLUMNS (
>           "alias_title" text path '$."title"',
>           "alias_region" text path '$."region"'
>         ),
>       NESTED PATH '$."directors"[*]'
>         COLUMNS (
>           "director_name" text path '$."name"',
>           "director_birth_year" text path '$."birth_year"',
>           "director_death_year" text path '$."death_year"'
>         ),
>       NESTED PATH '$."writers"[*]'
>         COLUMNS (
>           "writer_name" text path '$."name"',
>           "writer_birth_year" text path '$."birth_year"',
>           "writer_death_year" text path '$."death_year"'
>         ),
>       NESTED PATH '$."ratings"[*]'
>         COLUMNS (
>           "rating_average" text path '$."average"',
>           "rating_votes" text path '$."votes"'
>         )
>     )
>   ) as jt;
>
> again, not particularly complex. But if I run this, it consumes multiple
> gigabytes of memory, before it gets killed by the OOM killer. This
> happens even when run using
>
>   COPY (...) TO '/dev/null'
>
> so there's nothing sent to the client.
> I did catch memory context info, where it looks like this (complete
> stats attached):
>
> ------
> TopMemoryContext: 97696 total in 5 blocks; 13056 free (11 chunks);
>   84640 used
>   ...
>   TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); ...
>     PortalContext: 1024 total in 1 blocks; 560 free (0 chunks); ...
>       ExecutorState: 2541764672 total in 314 blocks; 6528176 free
>         (1208 chunks); 2535236496 used
>         printtup: 8192 total in 1 blocks; 7952 free (0 chunks); ...
>   ...
> ...
> Grand total: 2544132336 bytes in 528 blocks; 7484504 free
>   (1340 chunks); 2536647832 used
> ------
>
> I'd say 2.5GB in ExecutorState seems a bit excessive ... Seems there's
> some memory management issue? My guess is we're not releasing memory
> allocated while parsing the JSON or building JSON output.
>
> I'm not attaching the data, but I can provide that if needed - it's
> about 600MB compressed. The structure is not particularly complex, it's
> movie info from [1] combined into a JSON document (one per movie).
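For anyone else wanting to watch ExecutorState grow while the query runs: on PostgreSQL 14 and later, a second session can ask a running backend to dump its memory context stats to the server log without attaching a debugger. A minimal sketch (assumes the JSON_TABLE query is the only backend whose query text matches):

```sql
-- From a second session, while the JSON_TABLE query is running.
-- pg_log_backend_memory_contexts() (PostgreSQL 14+) makes the target
-- backend write its memory context stats to the server log.
SELECT pg_log_backend_memory_contexts(pid)
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
  AND query ILIKE '%json_table%';
```

Running this a few times over the life of the query shows whether ExecutorState is growing steadily per row or jumping at particular points.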
Thanks for the report.  Yeah, I'd like to see the data to try to drill
down into what's piling up in ExecutorState.  I'd also like to be sure
that the first patch (the query functions patch) is not implicated in
this, because I'd like to get that one out of the way sooner rather
than later.

--
Thanks,
Amit Langote