Re: Pluggable toaster

Simon Riggs Wed, 05 Jan 2022 06:46:26 -0800

On Thu, 30 Dec 2021 at 16:40, Teodor Sigaev <teo...@sigaev.ru> wrote:


> We are working on a custom toaster for JSONB [1], because the current TOAST
> is universal for any data type and because of that it has some disadvantages:
>     - "one toast fits all" may not be the best solution for a particular
>       type and/or use case
>     - it doesn't know the internal structure of the data type, so it cannot
>       choose an optimal toast strategy
>     - it can't share common parts between different rows and even
>       versions of rows

Agreed. Oleg has presented a very clear analysis of the value of having
a higher degree of control over toasting from within the datatype.

In my understanding, we want to be able to:
1. Access data from a toasted object one slice at a time, by using
knowledge of the structure
2. If toasted data is updated, then update a minimum number of
slice(s), without rewriting the existing slices
3. If toasted data is expanded, then allow new slices to be appended to
the object without rewriting the existing slices
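
To make the three goals above concrete, here is a minimal in-memory
sketch. It is not PostgreSQL code: SlicedObject, SLICE_SIZE and the
sliced_* functions are hypothetical names, and a plain array stands in
for separately stored toast chunks. Each function returns how many
slices it had to touch, which is the cost we would like to keep small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SLICE_SIZE 4    /* tiny on purpose; real TOAST chunks are roughly 2 kB */

/* Hypothetical out-of-line object: an array of independently stored slices. */
typedef struct SlicedObject
{
    size_t  nslices;                /* slices currently stored */
    size_t  total_len;              /* logical length in bytes */
    char  (*slices)[SLICE_SIZE];    /* stand-in for rows of a toast table */
} SlicedObject;

static void sliced_init(SlicedObject *obj)
{
    memset(obj, 0, sizeof(*obj));
}

/* Number of slices covering the byte range [offset, offset + len). */
static size_t slices_spanned(size_t offset, size_t len)
{
    if (len == 0)
        return 0;
    return (offset + len - 1) / SLICE_SIZE - offset / SLICE_SIZE + 1;
}

/* Goal 1: read a byte range by visiting only the slices that cover it. */
static size_t sliced_read(const SlicedObject *obj, size_t offset,
                          size_t len, char *out)
{
    assert(offset + len <= obj->total_len);
    for (size_t i = 0; i < len; i++)
        out[i] = obj->slices[(offset + i) / SLICE_SIZE][(offset + i) % SLICE_SIZE];
    return slices_spanned(offset, len);
}

/* Goal 2: overwrite a byte range in place; other slices are untouched. */
static size_t sliced_update(SlicedObject *obj, size_t offset,
                            const char *data, size_t len)
{
    assert(offset + len <= obj->total_len);
    for (size_t i = 0; i < len; i++)
        obj->slices[(offset + i) / SLICE_SIZE][(offset + i) % SLICE_SIZE] = data[i];
    return slices_spanned(offset, len);
}

/* Goal 3: append bytes; only a partially filled tail slice and any
 * newly created slices are written. */
static size_t sliced_append(SlicedObject *obj, const char *data, size_t len)
{
    size_t start = obj->total_len;

    for (size_t i = 0; i < len; i++)
    {
        size_t idx = obj->total_len / SLICE_SIZE;

        if (idx == obj->nslices)
        {
            void *p = realloc(obj->slices, (idx + 1) * sizeof(*obj->slices));

            assert(p != NULL);
            obj->slices = p;
            obj->nslices = idx + 1;
        }
        obj->slices[idx][obj->total_len % SLICE_SIZE] = data[i];
        obj->total_len++;
    }
    return slices_spanned(start, len);
}
```

With 4-byte slices, appending 10 bytes creates three slices, and a
2-byte update that straddles a slice boundary touches two of them. A
later append only fills the tail slice or adds new ones.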

> Modification of the current toaster for all tasks and cases looks too
> complex; moreover, it will not work for custom data types. Postgres
> is an extensible database, so why not extend its extensibility even
> further, to have pluggable TOAST! We propose an idea to separate the
> toaster from the heap using a toaster API similar to the table AM API etc.
> The following patches are applicable over the patch in [1]

ISTM that we would want the toast algorithm to be associated with the
datatype, not the column?
Can you explain your thinking?

We already have the Expanded toast format, in-memory, which was designed
specifically to allow us to access the sub-structure of the datatype
in-memory. So I was expecting to see an Expanded, on-disk, toast
format that roughly matched that concept, since Tom has already shown
us the way (varatt_expanded). This would be usable by both JSON and
PostGIS.


Some other thoughts:

I imagine the data type might want to keep some kind of dictionary
inside the main toast pointer, so we could make allowance for some
optional datatype-specific private area in the toast pointer itself,
allowing a mix of inline and out-of-line data, and/or a table of
contents to the slices.
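
As a rough illustration of that idea, a toast pointer with a private
area and a table of contents could look like the sketch below. All names
and sizes here (SketchToastPointer, TocEntry, PRIVATE_AREA_MAX, TOC_MAX,
toc_find_slice) are hypothetical; this is not the layout of the existing
varatt_external, just one way the pieces might fit together.

```c
#include <assert.h>
#include <stdint.h>

#define PRIVATE_AREA_MAX 16     /* assumed cap on the datatype-private area */
#define TOC_MAX           8     /* assumed cap on table-of-contents entries */

/* One table-of-contents entry: where a slice starts and which slice it is. */
typedef struct TocEntry
{
    uint32_t first_offset;      /* logical byte offset of the slice's first byte */
    uint32_t slice_id;          /* identifier of the out-of-line slice */
} TocEntry;

/* Hypothetical toast pointer mixing inline and out-of-line data. */
typedef struct SketchToastPointer
{
    uint64_t value_id;          /* 64-bit identifier of the toasted value */
    uint8_t  private_len;       /* bytes used in private_data */
    uint8_t  private_data[PRIVATE_AREA_MAX];    /* owned by the datatype,
                                                 * e.g. a small dictionary */
    uint8_t  toc_len;           /* entries used in toc */
    TocEntry toc[TOC_MAX];      /* sorted by first_offset */
} SketchToastPointer;

/*
 * Return the slice_id of the slice holding the given logical offset,
 * or -1 if the table of contents is empty or starts after the offset.
 */
static int64_t toc_find_slice(const SketchToastPointer *ptr, uint32_t offset)
{
    int64_t found = -1;

    assert(ptr->toc_len <= TOC_MAX);
    for (uint8_t i = 0; i < ptr->toc_len; i++)
    {
        if (ptr->toc[i].first_offset > offset)
            break;
        found = ptr->toc[i].slice_id;
    }
    return found;
}
```

The point of the table of contents is that the datatype can jump
straight to the one slice it needs without scanning the others. A real
design would need a variable-length layout rather than fixed arrays.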

I'm thinking we could also tackle these things at the same time:
* We want to expand TOAST to 64-bit pointers, so we can have more
pointers in a table
* We want to avoid putting the data length into the toast pointer, so
we can allow the toasted data to be expanded without rewriting
everything (to avoid O(N^2) cost)
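
The O(N^2) cost can be seen with simple counting. The sketch below
assumes a value grows by one chunk at a time, N times; the function
names are made up for illustration. If every expansion rewrites the
whole value, step k writes k chunks, for N(N+1)/2 in total, whereas
append-only growth writes N chunks.

```c
#include <assert.h>
#include <stdint.h>

/* Chunks written if each of n one-chunk expansions rewrites the whole value. */
static uint64_t chunks_written_full_rewrite(uint64_t n)
{
    uint64_t total = 0;

    assert(n <= 1000000);       /* keep the illustrative loop small */
    for (uint64_t k = 1; k <= n; k++)
        total += k;             /* step k rewrites all k chunks */
    return total;
}

/* Chunks written if each expansion only appends the new chunk. */
static uint64_t chunks_written_append_only(uint64_t n)
{
    return n;
}
```

For 1000 expansions that is 500500 chunk writes with full rewrites
versus 1000 with appends.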

--
Simon Riggs                http://www.EnterpriseDB.com/
