I did some research and found that although the patch itself is correct (i.e. we do benefit from replacing TTSOpsMinimalTuple with TTSOpsVirtual for the Unique node), my explanation of WHY was wrong.

1. We always materialize the new unique tuple in the slot, regardless of which type of tuple table slot we use.
2. But virtual tuple materialization (tts_virtual_copyslot) has performance benefits over the minimal tuple one (tts_minimal_copyslot):
   - tts_minimal_copyslot() always allocates zeroed memory with palloc0 (expensive according to the flame graph);
   - tts_virtual_copyslot() doesn't allocate additional memory if the tuple is built only from pass-by-value columns (for variable-size columns we still need a memory allocation);
   - if tts_virtual_copyslot() does need an allocation, it doesn't need to zero the memory.

So, as a result, we seriously benefit from the virtual TTS for tuples built from fixed-size columns when there is a Unique node in the plan.
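To illustrate the difference described above, here is a small standalone sketch. It is a toy model and not PostgreSQL code: ToySlot, toy_minimal_copy() and toy_virtual_copy() are hypothetical names, calloc() stands in for palloc0, and only pass-by-value columns are modeled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for PostgreSQL's Datum; everything below is a hypothetical toy. */
typedef uintptr_t Datum;

typedef struct ToySlot
{
	int		natts;		/* number of columns */
	Datum  *values;		/* per-slot array, allocated once at init */
	Datum  *mintuple;	/* heap copy owned by the slot, or NULL */
	size_t	allocs;		/* allocations performed by copy calls */
} ToySlot;

void
toy_slot_init(ToySlot *slot, int natts)
{
	slot->natts = natts;
	slot->values = malloc(sizeof(Datum) * (size_t) natts);
	slot->mintuple = NULL;
	slot->allocs = 0;
	if (slot->values == NULL)
		abort();
}

void
toy_slot_free(ToySlot *slot)
{
	free(slot->values);
	free(slot->mintuple);
	slot->values = NULL;
	slot->mintuple = NULL;
}

/* "Minimal" style: every copy forms a fresh, zero-filled tuple. */
void
toy_minimal_copy(ToySlot *dst, const Datum *src)
{
	free(dst->mintuple);
	/* calloc plays the role of palloc0: allocate and zero on each call */
	dst->mintuple = calloc((size_t) dst->natts, sizeof(Datum));
	if (dst->mintuple == NULL)
		abort();
	dst->allocs++;
	memcpy(dst->mintuple, src, sizeof(Datum) * (size_t) dst->natts);
}

/* "Virtual" style, pass-by-value columns only: reuse the slot's own array. */
void
toy_virtual_copy(ToySlot *dst, const Datum *src)
{
	memcpy(dst->values, src, sizeof(Datum) * (size_t) dst->natts);
}
```

Copying N rows through toy_minimal_copy() performs N zeroing allocations, while toy_virtual_copy() performs none. Variable-size columns are left out of the sketch; as noted above, they would still need an allocation in the virtual case, just without the zeroing.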
commit 148642d81f046b7d72b3a40182c165e42a8ab6d7
Author: Denis Smirnov <darthu...@gmail.com>
Date:   Thu Aug 31 08:51:14 2023 +0700

    Change tuple table slot for Unique node to "virtual"

    The Unique node uses minimal TTS implementation to copy the unique
    tuples from the sorted stream into the resulting tuple slot. But if
    we replace the minimal TTS with the virtual TTS copy method, the
    performance improves.

    1. Minimal TTS always allocates zeroed memory for the materialized
       tuple.
    2. Virtual TTS doesn't allocate additional memory for the tuples
       with the columns passed by value. For the columns with external
       memory we don't need to zero the bytes but can simply take the
       memory chunk from the free list "as is".

diff --git a/src/backend/executor/nodeUnique.c b/src/backend/executor/nodeUnique.c
index 45035d74fa..c859add6e0 100644
--- a/src/backend/executor/nodeUnique.c
+++ b/src/backend/executor/nodeUnique.c
@@ -141,7 +141,7 @@ ExecInitUnique(Unique *node, EState *estate, int eflags)
 	 * Initialize result slot and type. Unique nodes do no projections, so
 	 * initialize projection info for this node appropriately.
 	 */
-	ExecInitResultTupleSlotTL(&uniquestate->ps, &TTSOpsMinimalTuple);
+	ExecInitResultTupleSlotTL(&uniquestate->ps, &TTSOpsVirtual);
 	uniquestate->ps.ps_ProjInfo = NULL;
 
 	/*