Currently our code can do lazily detoast by design, for example: SELECT toast_col FROM t; SELECT toast_col FROM t ORDER BY b; SELECT toast_col FROM t join t2 using(c);
it is only detoast at {type}_out function. The benefits includes: 1. The life time of detoast datum is pretty short which is good for general memory usage. 2. In the order by / hash case, the less memory usage can let the work_mem hold more tuples so it is good for performance aspect. Recently I run into a user case like this: create table b(big jsonb); ... select big->'1', big->'2', big->'3', big->'5', big->'10' from b; In the above query, we can see the 'big' datum is detoasted 5 times, and if the toast value is huge, it causes a pretty bad performance. jsonb will be a common case to access the toast value multi times, but it is possible for other data type as well. for example: SELECT f1(big_toast_col), f2(big_toast_col) FROM t; I attached a POC patch which eagerly detoast the datum during EEOP_INNER/OUTER/SCAN_VAR step and store the detoast value back to the original slot->tts_values, so the later call of slot->tts_values[n] will use the detoast value automatically. With the attached setup.sql and the patch, the performance is easy to reduced to 310ms from 1600ms. select big->'1', big->'2', big->'3', big->'5', big->'10' from b; QUERY PLAN --------------------------------------------------------------- Seq Scan on b (actual time=1.731..1577.911 rows=1001 loops=1) Planning Time: 0.099 ms Execution Time: 1578.411 ms (3 rows) set jit to off; select big->'1', big->'2', big->'3', big->'5', big->'10' from b; QUERY PLAN -------------------------------------------------------------- Seq Scan on b (actual time=0.417..309.937 rows=1001 loops=1) Planning Time: 0.097 ms Execution Time: 310.255 m (I used 'jit=off' to turn on this feature just because I'm still not ready for JIT code.) However this patch just throws away almost all the benefits of toast, so how can we draw a line between should vs should not do this code path? IMO, we should only run the 'eagerly detoast' when we know that we will have a FuncCall against the toast_col on the current plan node. I think this information can be get from Qual and TargetList. If so, we can set the slot->detoast_attrs accordingly. if we code like this: SELECT f1(toast_col) FROM t join t2 using(c); We only apply the code path on the join plan node, so even the join method is hash / sort merge, the benefit of toast is still there. 'SELECT f1(toast_col) FROM t;' will apply this code path, but nothing gain and nothing lost. Applying this code path only when the toast datum is accessed 1+ times needs some extra run-time effort. I don't implement this so far, I'd like to see if I miss some obvious points. Any feedback is welcome.
>From a65f4746b9cde8f19192d720d78c2edece5d26c2 Mon Sep 17 00:00:00 2001 From: "yizhi.fzh" <yizhi....@alibaba-inc.com> Date: Mon, 4 Dec 2023 14:22:03 +0800 Subject: [PATCH v1] A PoC code to show the benefit of eagerly detoast code path. How to avoid this code path for the normal toast designed case needs more discussion. --- src/backend/executor/execExprInterp.c | 17 +++++++++++++++++ src/backend/executor/execTuples.c | 16 ++++++++++++++++ src/include/executor/tuptable.h | 3 +++ 3 files changed, 36 insertions(+) diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c index 24c2b60c62..062b3d884a 100644 --- a/src/backend/executor/execExprInterp.c +++ b/src/backend/executor/execExprInterp.c @@ -56,6 +56,7 @@ */ #include "postgres.h" +#include "access/detoast.h" #include "access/heaptoast.h" #include "catalog/pg_type.h" #include "commands/sequence.h" @@ -567,6 +568,12 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) * have an Assert to check that that did happen. */ Assert(attnum >= 0 && attnum < innerslot->tts_nvalid); + + if (bms_is_member(attnum, innerslot->detoast_attrs) && + !innerslot->tts_isnull[attnum]) + innerslot->tts_values[attnum] = (Datum) pg_detoast_datum( + (struct varlena *) innerslot->tts_values[attnum]); + *op->resvalue = innerslot->tts_values[attnum]; *op->resnull = innerslot->tts_isnull[attnum]; @@ -580,6 +587,11 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) /* See EEOP_INNER_VAR comments */ Assert(attnum >= 0 && attnum < outerslot->tts_nvalid); + if (bms_is_member(attnum, outerslot->detoast_attrs) && + !outerslot->tts_isnull[attnum]) + outerslot->tts_values[attnum] = (Datum) pg_detoast_datum( + (struct varlena *) outerslot->tts_values[attnum]); + *op->resvalue = outerslot->tts_values[attnum]; *op->resnull = outerslot->tts_isnull[attnum]; @@ -593,6 +605,11 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) /* See EEOP_INNER_VAR comments */ Assert(attnum >= 0 && attnum < scanslot->tts_nvalid); + if (bms_is_member(attnum, scanslot->detoast_attrs) && + !scanslot->tts_isnull[attnum]) + scanslot->tts_values[attnum] = (Datum) pg_detoast_datum( + (struct varlena *) scanslot->tts_values[attnum]); + *op->resvalue = scanslot->tts_values[attnum]; *op->resnull = scanslot->tts_isnull[attnum]; diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c index 2c2712ceac..629c45fb52 100644 --- a/src/backend/executor/execTuples.c +++ b/src/backend/executor/execTuples.c @@ -62,6 +62,7 @@ #include "access/tupdesc_details.h" #include "catalog/pg_type.h" #include "funcapi.h" +#include "jit/jit.h" #include "nodes/nodeFuncs.h" #include "storage/bufmgr.h" #include "utils/builtins.h" @@ -1143,6 +1144,7 @@ MakeTupleTableSlot(TupleDesc tupleDesc, if (tupleDesc != NULL) { + int i; slot->tts_values = (Datum *) (((char *) slot) + MAXALIGN(basesz)); @@ -1152,6 +1154,20 @@ MakeTupleTableSlot(TupleDesc tupleDesc, + MAXALIGN(tupleDesc->natts * sizeof(Datum))); PinTupleDesc(tupleDesc); + + /* not ready jit so far, it's OK for a poc patch. */ + if (!jit_enabled) + { + for (i = 0; i < tupleDesc->natts; i++) + { + Form_pg_attribute attr = TupleDescAttr(tupleDesc, i); + if (attr->attlen == -1 && attr->attstorage != TYPSTORAGE_PLAIN) + { + slot->detoast_attrs = bms_add_member(slot->detoast_attrs, + attr->attnum - 1); + } + } + } } /* diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h index 4210d6d838..75779ed31f 100644 --- a/src/include/executor/tuptable.h +++ b/src/include/executor/tuptable.h @@ -18,6 +18,7 @@ #include "access/htup_details.h" #include "access/sysattr.h" #include "access/tupdesc.h" +#include "nodes/bitmapset.h" #include "storage/buf.h" /*---------- @@ -128,6 +129,8 @@ typedef struct TupleTableSlot MemoryContext tts_mcxt; /* slot itself is in this context */ ItemPointerData tts_tid; /* stored tuple's tid */ Oid tts_tableOid; /* table oid of tuple */ + + Bitmapset *detoast_attrs; /* Index start from 0. */ } TupleTableSlot; /* routines for a TupleTableSlot implementation */ -- 2.34.1
setup.sql
Description: application/sql
-- Best Regards Andy Fan