On Sat, 21 Dec 2024 at 00:41, Amit Langote <amitlangot...@gmail.com> wrote:
> To address (1), I tried assigning specialized functions to
> PlanState.ExecProcNode in ExecInitSeqScan() based on whether qual or
> projInfo are NULL. Inspired by David Rowley’s suggestion to look at
> ExecHashJoinImpl(), I wrote variants like ExecSeqScanNoQual() (for
> qual == NULL) and ExecSeqScanNoProj() (for projInfo == NULL). These
> call a local version of ExecScan() that lives in nodeSeqScan.c, marked
> always-inline. This local copy takes qual and projInfo as arguments,
> letting compilers inline and optimize unnecessary branches away.
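The approach quoted above can be sketched outside PostgreSQL roughly as follows. This is only an illustration under stated assumptions: every name in it (Scan, scan_impl, scan_init and so on) is hypothetical and none of it is the patch's code. It shows a generic always-inline loop, thin wrappers that pass literal NULLs, and an init step that picks the wrapper once.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not PostgreSQL APIs. */
#if defined(__GNUC__) || defined(__clang__)
#define ALWAYS_INLINE static inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE static inline
#endif

#define SCAN_DONE (-1)			/* sentinel; rows are assumed non-negative */

typedef struct Scan Scan;
typedef int (*Pred) (int value);
typedef int (*Proj) (int value);

struct Scan
{
	const int  *rows;
	size_t		nrows;
	size_t		pos;
	Pred		qual;			/* NULL when nothing is filtered */
	Proj		proj;			/* NULL when rows are returned as-is */
	int			(*next) (Scan *scan);	/* picked once by scan_init() */
};

/* Generic loop; qual and proj are literal NULLs in some callers. */
ALWAYS_INLINE int
scan_impl(Scan *scan, Pred qual, Proj proj)
{
	while (scan->pos < scan->nrows)
	{
		int			value = scan->rows[scan->pos++];

		if (qual != NULL && !qual(value))
			continue;			/* row fails the filter, try the next one */
		return (proj != NULL) ? proj(value) : value;
	}
	return SCAN_DONE;
}

/* Thin variants: each passes constants so dead branches can be dropped. */
static int
scan_plain(Scan *scan)
{
	assert(scan->qual == NULL && scan->proj == NULL);
	return scan_impl(scan, NULL, NULL);
}

static int
scan_with_qual(Scan *scan)
{
	assert(scan->qual != NULL && scan->proj == NULL);
	return scan_impl(scan, scan->qual, NULL);
}

static int
scan_project(Scan *scan)
{
	assert(scan->qual == NULL && scan->proj != NULL);
	return scan_impl(scan, NULL, scan->proj);
}

static int
scan_with_qual_project(Scan *scan)
{
	assert(scan->qual != NULL && scan->proj != NULL);
	return scan_impl(scan, scan->qual, scan->proj);
}

/* Choose the variant once, so the per-row path has no such decisions. */
static void
scan_init(Scan *scan, const int *rows, size_t nrows, Pred qual, Proj proj)
{
	scan->rows = rows;
	scan->nrows = nrows;
	scan->pos = 0;
	scan->qual = qual;
	scan->proj = proj;
	if (qual == NULL)
		scan->next = (proj == NULL) ? scan_plain : scan_project;
	else
		scan->next = (proj == NULL) ? scan_with_qual : scan_with_qual_project;
}

/* Sample callbacks used only to exercise the sketch. */
static int	is_even(int value) { return value % 2 == 0; }
static int	times_ten(int value) { return value * 10; }
```

A caller would run scan_init() once and then call scan->next(scan) per row. Whether the compiler really drops the unused branches depends on the compiler and optimization level, so any gain needs to be measured.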

I tested the performance of this and I do see close to a 5% performance increase in TPC-H Q1. Nice.

I'm a little concerned with the method the patch takes, where it copies most of ExecScan() and includes it in nodeSeqscan.c. If there are any future changes to ExecScan(), someone might forget to propagate those changes into nodeSeqscan.c's version. What if instead you moved ExecScan() into a header file and made it static inline? That way callers would get their own inlined copy with the callback functions inlined too, which for nodeSeqscan is good, since the recheck callback does nothing.

An additional reason I think this might be a better idea is that the patch doesn't quite keep things equivalent: by having ExecSeqScanNoEPQImpl() call SeqNext() directly without going through ExecScanFetch(), you've lost a call to CHECK_FOR_INTERRUPTS().

On the other hand, one possible drawback of making ExecScan() a static inline is that any non-core code that uses ExecScan() won't get any bug fixes if we were to fix some bug in ExecScan() in a minor release, unless the extension is compiled again. That could be fixed by keeping ExecScan() as an extern function and maybe just having ExecScanExtended() as the static inline version.

Another thing I wondered about is the naming convention you're using for these ExecSeqScan variant functions.

+ExecSeqScanNoQualNoProj(PlanState *pstate)
+ExecSeqScanNoQual(PlanState *pstate)
+ExecSeqScanNoProj(PlanState *pstate)
+ExecSeqScanNoEPQ(PlanState *pstate)

I think it's better to have a naming convention that aims to convey what the function does do rather than what it does not do.

I've attached my workings of what I was messing around with. It seems to perform about the same as your version. I think maybe we'd need some sort of execScan.h instead of where I've stuffed the functions in.

It would also be good if there was some way to give guarantees to the compiler that a given pointer isn't NULL.
For example, in:

    return ExecScanExtended(&node->ss,
                            (ExecScanAccessMtd) SeqNext,
                            (ExecScanRecheckMtd) SeqRecheck,
                            NULL,
                            pstate->qual,
                            NULL);

it would be good if, when ExecScanExtended is inlined, the compiler wouldn't emit code for the "if (qual == NULL)" ... part. I don't know if there's any way to do that. I thought I'd mention it in case someone can think of a way... I guess you could add another parameter that gets passed as a const and have the "if" test look at that instead. That's a bit ugly, though.

David

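The "extra const parameter" idea mentioned above might look something like the following standalone sketch. It is not taken from the attached patch: count_matches_impl, has_qual and the other names are hypothetical, and whether a given compiler actually removes the test is an assumption that would need checking in the generated code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this only illustrates the flag idea. */
#if defined(__GNUC__) || defined(__clang__)
#define ALWAYS_INLINE static inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE static inline
#endif

typedef bool (*Qual) (int value);

/*
 * The per-row test looks at has_qual, which is a literal in every caller,
 * rather than at the qual pointer, whose value is only known at run time.
 */
ALWAYS_INLINE size_t
count_matches_impl(const int *rows, size_t nrows, Qual qual,
				   const bool has_qual)
{
	size_t		matches = 0;

	for (size_t i = 0; i < nrows; i++)
	{
		if (!has_qual || qual(rows[i]))
			matches++;
	}
	return matches;
}

/* No filter: has_qual is false, so qual is never called. */
static size_t
count_all(const int *rows, size_t nrows)
{
	return count_matches_impl(rows, nrows, NULL, false);
}

/* Filter required: the caller promises qual is non-NULL. */
static size_t
count_filtered(const int *rows, size_t nrows, Qual qual)
{
	assert(qual != NULL);
	return count_matches_impl(rows, nrows, qual, true);
}

/* Sample callback used only to exercise the sketch. */
static bool
is_odd(int value)
{
	return value % 2 != 0;
}
```

The cost is the one already noted: each caller has to keep the flag and the pointer in agreement by hand, which is why the assert is there.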
diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 556a5d98e7..31e028dc84 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -21,238 +21,6 @@
 #include "executor/executor.h"
 #include "miscadmin.h"
 
-
-
-/*
- * ExecScanFetch -- check interrupts & fetch next potential tuple
- *
- * This routine is concerned with substituting a test tuple if we are
- * inside an EvalPlanQual recheck. If we aren't, just execute
- * the access method's next-tuple routine.
- */
-static inline TupleTableSlot *
-ExecScanFetch(ScanState *node,
-			  ExecScanAccessMtd accessMtd,
-			  ExecScanRecheckMtd recheckMtd)
-{
-	EState	   *estate = node->ps.state;
-
-	CHECK_FOR_INTERRUPTS();
-
-	if (estate->es_epq_active != NULL)
-	{
-		EPQState   *epqstate = estate->es_epq_active;
-
-		/*
-		 * We are inside an EvalPlanQual recheck. Return the test tuple if
-		 * one is available, after rechecking any access-method-specific
-		 * conditions.
-		 */
-		Index		scanrelid = ((Scan *) node->ps.plan)->scanrelid;
-
-		if (scanrelid == 0)
-		{
-			/*
-			 * This is a ForeignScan or CustomScan which has pushed down a
-			 * join to the remote side. The recheck method is responsible not
-			 * only for rechecking the scan/join quals but also for storing
-			 * the correct tuple in the slot.
-			 */
-
-			TupleTableSlot *slot = node->ss_ScanTupleSlot;
-
-			if (!(*recheckMtd) (node, slot))
-				ExecClearTuple(slot);	/* would not be returned by scan */
-			return slot;
-		}
-		else if (epqstate->relsubs_done[scanrelid - 1])
-		{
-			/*
-			 * Return empty slot, as either there is no EPQ tuple for this rel
-			 * or we already returned it.
-			 */
-
-			TupleTableSlot *slot = node->ss_ScanTupleSlot;
-
-			return ExecClearTuple(slot);
-		}
-		else if (epqstate->relsubs_slot[scanrelid - 1] != NULL)
-		{
-			/*
-			 * Return replacement tuple provided by the EPQ caller.
-			 */
-
-			TupleTableSlot *slot = epqstate->relsubs_slot[scanrelid - 1];
-
-			Assert(epqstate->relsubs_rowmark[scanrelid - 1] == NULL);
-
-			/* Mark to remember that we shouldn't return it again */
-			epqstate->relsubs_done[scanrelid - 1] = true;
-
-			/* Return empty slot if we haven't got a test tuple */
-			if (TupIsNull(slot))
-				return NULL;
-
-			/* Check if it meets the access-method conditions */
-			if (!(*recheckMtd) (node, slot))
-				return ExecClearTuple(slot);	/* would not be returned by
-												 * scan */
-			return slot;
-		}
-		else if (epqstate->relsubs_rowmark[scanrelid - 1] != NULL)
-		{
-			/*
-			 * Fetch and return replacement tuple using a non-locking rowmark.
-			 */
-
-			TupleTableSlot *slot = node->ss_ScanTupleSlot;
-
-			/* Mark to remember that we shouldn't return more */
-			epqstate->relsubs_done[scanrelid - 1] = true;
-
-			if (!EvalPlanQualFetchRowMark(epqstate, scanrelid, slot))
-				return NULL;
-
-			/* Return empty slot if we haven't got a test tuple */
-			if (TupIsNull(slot))
-				return NULL;
-
-			/* Check if it meets the access-method conditions */
-			if (!(*recheckMtd) (node, slot))
-				return ExecClearTuple(slot);	/* would not be returned by
-												 * scan */
-			return slot;
-		}
-	}
-
-	/*
-	 * Run the node-type-specific access method function to get the next tuple
-	 */
-	return (*accessMtd) (node);
-}
-
-/* ----------------------------------------------------------------
- *		ExecScan
- *
- *		Scans the relation using the 'access method' indicated and
- *		returns the next qualifying tuple.
- *		The access method returns the next tuple and ExecScan() is
- *		responsible for checking the tuple returned against the qual-clause.
- *
- *		A 'recheck method' must also be provided that can check an
- *		arbitrary tuple of the relation against any qual conditions
- *		that are implemented internal to the access method.
- *
- *		Conditions:
- *		  -- the "cursor" maintained by the AMI is positioned at the tuple
- *			 returned previously.
- *
- *		Initial States:
- *		  -- the relation indicated is opened for scanning so that the
- *			 "cursor" is positioned before the first qualifying tuple.
- * ----------------------------------------------------------------
- */
-TupleTableSlot *
-ExecScan(ScanState *node,
-		 ExecScanAccessMtd accessMtd,	/* function returning a tuple */
-		 ExecScanRecheckMtd recheckMtd)
-{
-	ExprContext *econtext;
-	ExprState  *qual;
-	ProjectionInfo *projInfo;
-
-	/*
-	 * Fetch data from node
-	 */
-	qual = node->ps.qual;
-	projInfo = node->ps.ps_ProjInfo;
-	econtext = node->ps.ps_ExprContext;
-
-	/* interrupt checks are in ExecScanFetch */
-
-	/*
-	 * If we have neither a qual to check nor a projection to do, just skip
-	 * all the overhead and return the raw scan tuple.
-	 */
-	if (!qual && !projInfo)
-	{
-		ResetExprContext(econtext);
-		return ExecScanFetch(node, accessMtd, recheckMtd);
-	}
-
-	/*
-	 * Reset per-tuple memory context to free any expression evaluation
-	 * storage allocated in the previous tuple cycle.
-	 */
-	ResetExprContext(econtext);
-
-	/*
-	 * get a tuple from the access method. Loop until we obtain a tuple that
-	 * passes the qualification.
-	 */
-	for (;;)
-	{
-		TupleTableSlot *slot;
-
-		slot = ExecScanFetch(node, accessMtd, recheckMtd);
-
-		/*
-		 * if the slot returned by the accessMtd contains NULL, then it means
-		 * there is nothing more to scan so we just return an empty slot,
-		 * being careful to use the projection result slot so it has correct
-		 * tupleDesc.
-		 */
-		if (TupIsNull(slot))
-		{
-			if (projInfo)
-				return ExecClearTuple(projInfo->pi_state.resultslot);
-			else
-				return slot;
-		}
-
-		/*
-		 * place the current tuple into the expr context
-		 */
-		econtext->ecxt_scantuple = slot;
-
-		/*
-		 * check that the current tuple satisfies the qual-clause
-		 *
-		 * check for non-null qual here to avoid a function call to ExecQual()
-		 * when the qual is null ... saves only a few cycles, but they add up
-		 * ...
-		 */
-		if (qual == NULL || ExecQual(qual, econtext))
-		{
-			/*
-			 * Found a satisfactory scan tuple.
-			 */
-			if (projInfo)
-			{
-				/*
-				 * Form a projection tuple, store it in the result tuple slot
-				 * and return it.
-				 */
-				return ExecProject(projInfo);
-			}
-			else
-			{
-				/*
-				 * Here, we aren't projecting, so just return scan tuple.
-				 */
-				return slot;
-			}
-		}
-		else
-			InstrCountFiltered1(node, 1);
-
-		/*
-		 * Tuple fails qual, so free per-tuple memory and try again.
-		 */
-		ResetExprContext(econtext);
-	}
-}
-
 /*
  * ExecAssignScanProjectionInfo
  *		Set up projection info for a scan node, if necessary.
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index fa2d522b25..c150224f2e 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -99,9 +99,10 @@ SeqRecheck(SeqScanState *node, TupleTableSlot *slot)
  *		ExecSeqScan(node)
  *
  *		Scans the relation sequentially and returns the next qualifying
- *		tuple.
- *		We call the ExecScan() routine and pass it the appropriate
- *		access method functions.
+ *		tuple. This variant is used when there is no es_epq_active, no qual
+ *		and no projection. Passing const-NULLs for these to ExecScanExtended
+ *		allows the compiler to eliminate the additional code that would
+ *		ordinarily be required for evaluation of these.
  * ----------------------------------------------------------------
  */
 static TupleTableSlot *
@@ -109,12 +110,94 @@ ExecSeqScan(PlanState *pstate)
 {
 	SeqScanState *node = castNode(SeqScanState, pstate);
 
+	Assert(pstate->state->es_epq_active == NULL);
+	Assert(pstate->qual == NULL);
+	Assert(pstate->ps_ProjInfo == NULL);
+
+	return ExecScanExtended(&node->ss,
+							(ExecScanAccessMtd) SeqNext,
+							(ExecScanRecheckMtd) SeqRecheck,
+							NULL,
+							NULL,
+							NULL);
+}
+
+/*
+ * Variant of ExecSeqScan() but when qual evaluation is required.
+ */
+static TupleTableSlot *
+ExecSeqScanWithQual(PlanState *pstate)
+{
+	SeqScanState *node = castNode(SeqScanState, pstate);
+
+	Assert(pstate->state->es_epq_active == NULL);
+	Assert(pstate->qual != NULL);
+	Assert(pstate->ps_ProjInfo == NULL);
+
+	return ExecScanExtended(&node->ss,
+							(ExecScanAccessMtd) SeqNext,
+							(ExecScanRecheckMtd) SeqRecheck,
+							NULL,
+							pstate->qual,
+							NULL);
+}
+
+/*
+ * Variant of ExecSeqScan() but when projection is required.
+ */
+static TupleTableSlot *
+ExecSeqScanProject(PlanState *pstate)
+{
+	SeqScanState *node = castNode(SeqScanState, pstate);
+
+	Assert(pstate->state->es_epq_active == NULL);
+	Assert(pstate->qual == NULL);
+	Assert(pstate->ps_ProjInfo != NULL);
+
+	return ExecScanExtended(&node->ss,
+							(ExecScanAccessMtd) SeqNext,
+							(ExecScanRecheckMtd) SeqRecheck,
+							NULL,
+							NULL,
+							pstate->ps_ProjInfo);
+}
+
+/*
+ * Variant of ExecSeqScan() but when qual evaluation and projection are
+ * required.
+ */
+static TupleTableSlot *
+ExecSeqScanWithQualProject(PlanState *pstate)
+{
+	SeqScanState *node = castNode(SeqScanState, pstate);
+
+	Assert(pstate->state->es_epq_active == NULL);
+	Assert(pstate->qual != NULL);
+	Assert(pstate->ps_ProjInfo != NULL);
+
+	return ExecScanExtended(&node->ss,
+							(ExecScanAccessMtd) SeqNext,
+							(ExecScanRecheckMtd) SeqRecheck,
+							NULL,
+							pstate->qual,
+							pstate->ps_ProjInfo);
+}
+
+/*
+ * Variant of ExecSeqScan for when EPQ evaluation is required. We don't
+ * bother adding variants of this for with/without qual and projection as
+ * EPQ doesn't seem as exciting a case to optimize for.
+ */
+static TupleTableSlot *
+ExecSeqScanEPQ(PlanState *pstate)
+{
+	SeqScanState *node = castNode(SeqScanState, pstate);
+
 	return ExecScan(&node->ss,
 					(ExecScanAccessMtd) SeqNext,
 					(ExecScanRecheckMtd) SeqRecheck);
 }
 
-
 /* ----------------------------------------------------------------
  *		ExecInitSeqScan
  * ----------------------------------------------------------------
@@ -137,7 +220,6 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
 	scanstate = makeNode(SeqScanState);
 	scanstate->ss.ps.plan = (Plan *) node;
 	scanstate->ss.ps.state = estate;
-	scanstate->ss.ps.ExecProcNode = ExecSeqScan;
 
 	/*
 	 * Miscellaneous initialization
@@ -171,6 +253,28 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
 	scanstate->ss.ps.qual =
 		ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
 
+	/*
+	 * When EvalPlanQual() is not in use, assign ExecProcNode for this node
+	 * based on the presence of qual and projection. Each ExecSeqScan*()
+	 * variant is optimized for the specific combination of these conditions.
+	 */
+	if (scanstate->ss.ps.state->es_epq_active != NULL)
+		scanstate->ss.ps.ExecProcNode = ExecSeqScanEPQ;
+	else if (scanstate->ss.ps.qual == NULL)
+	{
+		if (scanstate->ss.ps.ps_ProjInfo == NULL)
+			scanstate->ss.ps.ExecProcNode = ExecSeqScan;
+		else
+			scanstate->ss.ps.ExecProcNode = ExecSeqScanProject;
+	}
+	else
+	{
+		if (scanstate->ss.ps.ps_ProjInfo == NULL)
+			scanstate->ss.ps.ExecProcNode = ExecSeqScanWithQual;
+		else
+			scanstate->ss.ps.ExecProcNode = ExecSeqScanWithQualProject;
+	}
+
 	return scanstate;
 }
 
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index f8a8d03e53..940fcb7789 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -18,6 +18,7 @@
 #include "fmgr.h"
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
+#include "miscadmin.h"
 #include "utils/memutils.h"
 
 
@@ -486,8 +487,6 @@ extern Datum ExecMakeFunctionResultSet(SetExprState *fcache,
 typedef TupleTableSlot *(*ExecScanAccessMtd) (ScanState *node);
 typedef bool (*ExecScanRecheckMtd) (ScanState *node, TupleTableSlot *slot);
 
-extern TupleTableSlot *ExecScan(ScanState *node, ExecScanAccessMtd accessMtd,
-								ExecScanRecheckMtd recheckMtd);
 extern void ExecAssignScanProjectionInfo(ScanState *node);
 extern void ExecAssignScanProjectionInfoWithVarno(ScanState *node, int varno);
 extern void ExecScanReScan(ScanState *node);
@@ -695,4 +694,280 @@ extern ResultRelInfo *ExecLookupResultRelByOid(ModifyTableState *node,
 											   bool missing_ok,
 											   bool update_cache);
 
+
+/*
+ * inline functions for execScan.c
+ */
+/*
+ * ExecScanFetch -- check interrupts & fetch next potential tuple
+ *
+ * This routine is concerned with substituting a test tuple if we are
+ * inside an EvalPlanQual recheck. If we aren't, just execute
+ * the access method's next-tuple routine.
+ */
+static pg_attribute_always_inline TupleTableSlot *
+ExecScanFetch(ScanState *node,
+			  EPQState *epqstate,
+			  ExecScanAccessMtd accessMtd,
+			  ExecScanRecheckMtd recheckMtd)
+{
+	CHECK_FOR_INTERRUPTS();
+
+	if (epqstate != NULL)
+	{
+		/*
+		 * We are inside an EvalPlanQual recheck. Return the test tuple if
+		 * one is available, after rechecking any access-method-specific
+		 * conditions.
+		 */
+		Index		scanrelid = ((Scan *) node->ps.plan)->scanrelid;
+
+		if (scanrelid == 0)
+		{
+			/*
+			 * This is a ForeignScan or CustomScan which has pushed down a
+			 * join to the remote side. The recheck method is responsible not
+			 * only for rechecking the scan/join quals but also for storing
+			 * the correct tuple in the slot.
+			 */
+
+			TupleTableSlot *slot = node->ss_ScanTupleSlot;
+
+			if (!(*recheckMtd) (node, slot))
+				ExecClearTuple(slot);	/* would not be returned by scan */
+			return slot;
+		}
+		else if (epqstate->relsubs_done[scanrelid - 1])
+		{
+			/*
+			 * Return empty slot, as either there is no EPQ tuple for this rel
+			 * or we already returned it.
+			 */
+
+			TupleTableSlot *slot = node->ss_ScanTupleSlot;
+
+			return ExecClearTuple(slot);
+		}
+		else if (epqstate->relsubs_slot[scanrelid - 1] != NULL)
+		{
+			/*
+			 * Return replacement tuple provided by the EPQ caller.
+			 */
+
+			TupleTableSlot *slot = epqstate->relsubs_slot[scanrelid - 1];
+
+			Assert(epqstate->relsubs_rowmark[scanrelid - 1] == NULL);
+
+			/* Mark to remember that we shouldn't return it again */
+			epqstate->relsubs_done[scanrelid - 1] = true;
+
+			/* Return empty slot if we haven't got a test tuple */
+			if (TupIsNull(slot))
+				return NULL;
+
+			/* Check if it meets the access-method conditions */
+			if (!(*recheckMtd) (node, slot))
+				return ExecClearTuple(slot);	/* would not be returned by
+												 * scan */
+			return slot;
+		}
+		else if (epqstate->relsubs_rowmark[scanrelid - 1] != NULL)
+		{
+			/*
+			 * Fetch and return replacement tuple using a non-locking rowmark.
+			 */
+
+			TupleTableSlot *slot = node->ss_ScanTupleSlot;
+
+			/* Mark to remember that we shouldn't return more */
+			epqstate->relsubs_done[scanrelid - 1] = true;
+
+			if (!EvalPlanQualFetchRowMark(epqstate, scanrelid, slot))
+				return NULL;
+
+			/* Return empty slot if we haven't got a test tuple */
+			if (TupIsNull(slot))
+				return NULL;
+
+			/* Check if it meets the access-method conditions */
+			if (!(*recheckMtd) (node, slot))
+				return ExecClearTuple(slot);	/* would not be returned by
+												 * scan */
+			return slot;
+		}
+	}
+
+	/*
+	 * Run the node-type-specific access method function to get the next tuple
+	 */
+	return (*accessMtd) (node);
+}
+
+/* ----------------------------------------------------------------
+ *		ExecScanWithQualAndProjection
+ *
+ *		Scans the relation using the 'access method' indicated and
+ *		returns the next qualifying tuple.
+ *		The access method returns the next tuple and the tuple is checked
+ *		against the optional 'qual'.
+ *
+ *		A 'recheck method' must also be provided that can check an
+ *		arbitrary tuple of the relation against any qual conditions
+ *		that are implemented internal to the access method.
+ *
+ *		When a non-NULL 'projInfo' is given, qualifying tuples are projected
+ *		using this.
+ *
+ *		This function may be used as an alternative to ExecScan when
+ *		callers don't have a 'qual' or don't have a 'projInfo'. The inlining
+ *		allows the compiler to eliminate the non-relevant branches, which
+ *		can save having to do run-time checks on every tuple.
+ *
+ *		Conditions:
+ *		  -- the "cursor" maintained by the AMI is positioned at the tuple
+ *			 returned previously.
+ *
+ *		Initial States:
+ *		  -- the relation indicated is opened for scanning so that the
+ *			 "cursor" is positioned before the first qualifying tuple.
+ * ----------------------------------------------------------------
+ */
+static pg_attribute_always_inline TupleTableSlot *
+ExecScanExtended(ScanState *node,
+				 ExecScanAccessMtd accessMtd,	/* function returning a tuple */
+				 ExecScanRecheckMtd recheckMtd,
+				 EPQState *epqstate,
+				 ExprState *qual,
+				 ProjectionInfo *projInfo)
+{
+	ExprContext *econtext = node->ps.ps_ExprContext;
+
+	/* interrupt checks are in ExecScanFetch */
+
+	/*
+	 * If we have neither a qual to check nor a projection to do, just skip
+	 * all the overhead and return the raw scan tuple.
+	 */
+	if (!qual && !projInfo)
+	{
+		ResetExprContext(econtext);
+		return ExecScanFetch(node, epqstate, accessMtd, recheckMtd);
+	}
+
+	/*
+	 * Reset per-tuple memory context to free any expression evaluation
+	 * storage allocated in the previous tuple cycle.
+	 */
+	ResetExprContext(econtext);
+
+	/*
+	 * get a tuple from the access method. Loop until we obtain a tuple that
+	 * passes the qualification.
+	 */
+	for (;;)
+	{
+		TupleTableSlot *slot;
+
+		slot = ExecScanFetch(node, epqstate, accessMtd, recheckMtd);
+
+		/*
+		 * if the slot returned by the accessMtd contains NULL, then it means
+		 * there is nothing more to scan so we just return an empty slot,
+		 * being careful to use the projection result slot so it has correct
+		 * tupleDesc.
+		 */
+		if (TupIsNull(slot))
+		{
+			if (projInfo)
+				return ExecClearTuple(projInfo->pi_state.resultslot);
+			else
+				return slot;
+		}
+
+		/*
+		 * place the current tuple into the expr context
+		 */
+		econtext->ecxt_scantuple = slot;
+
+		/*
+		 * check that the current tuple satisfies the qual-clause
+		 *
+		 * check for non-null qual here to avoid a function call to ExecQual()
+		 * when the qual is null ... saves only a few cycles, but they add up
+		 * ...
+		 */
+		if (qual == NULL || ExecQual(qual, econtext))
+		{
+			/*
+			 * Found a satisfactory scan tuple.
+			 */
+			if (projInfo)
+			{
+				/*
+				 * Form a projection tuple, store it in the result tuple slot
+				 * and return it.
+				 */
+				return ExecProject(projInfo);
+			}
+			else
+			{
+				/*
+				 * Here, we aren't projecting, so just return scan tuple.
+				 */
+				return slot;
+			}
+		}
+		else
+			InstrCountFiltered1(node, 1);
+
+		/*
+		 * Tuple fails qual, so free per-tuple memory and try again.
+		 */
+		ResetExprContext(econtext);
+	}
+}
+
+/* ----------------------------------------------------------------
+ *		ExecScan
+ *
+ *		Scans the relation using the 'access method' indicated and
+ *		returns the next qualifying tuple.
+ *		The access method returns the next tuple and ExecScan() is
+ *		responsible for checking the tuple returned against the qual-clause.
+ *
+ *		A 'recheck method' must also be provided that can check an
+ *		arbitrary tuple of the relation against any qual conditions
+ *		that are implemented internal to the access method.
+ *
+ *		Conditions:
+ *		  -- the "cursor" maintained by the AMI is positioned at the tuple
+ *			 returned previously.
+ *
+ *		Initial States:
+ *		  -- the relation indicated is opened for scanning so that the
+ *			 "cursor" is positioned before the first qualifying tuple.
+ * ----------------------------------------------------------------
+ */
+static inline TupleTableSlot *
+ExecScan(ScanState *node,
+		 ExecScanAccessMtd accessMtd,	/* function returning a tuple */
+		 ExecScanRecheckMtd recheckMtd)
+
+{
+	EPQState   *epqstate;
+	ExprState  *qual;
+	ProjectionInfo *projInfo;
+
+	epqstate = node->ps.state->es_epq_active;
+	qual = node->ps.qual;
+	projInfo = node->ps.ps_ProjInfo;
+
+	return ExecScanExtended(node,
+							accessMtd,
+							recheckMtd,
+							epqstate,
+							qual,
+							projInfo);
+}
+
 #endif							/* EXECUTOR_H */