Em qui., 11 de jul. de 2024 às 09:09, David Rowley <dgrowle...@gmail.com> escreveu:
> On Wed, 10 Jul 2024 at 02:42, Ashutosh Bapat > <ashutosh.bapat....@gmail.com> wrote: > > Observations > > 1. The numbers corresponding to 10 and 100 partitions are higher when > > patched. That might be just noise. I don't see any reason why it would > > impact negatively when there are a small number of partitions. The > > lower partition cases also have a higher number of rows per partition, > > so is the difference between MemoryContextDelete() vs > > MemoryContextReset() making any difference here. May be worth > > verifying those cases carefully. Otherwise upto 1000 partitions, it > > doesn't show any differences. > > I think this might just be noise as a result of rearranging code. In > terms of C code, I don't see any reason for it to be slower. If you > look at GenerationDelete() (as what is getting called from > MemoryContextDelete()), it just calls GenerationReset(). So resetting > is going to always be less work than deleting the context, especially > given we don't need to create the context again when we reset it. > > I wrote the attached script to see if I can also see the slowdown and > I do see the patched code come out slightly slower (within noise > levels) in lower partition counts. > > To get my compiler to produce code in a more optimal order for the > common case, I added unlikely() to the "if (winstate->all_first)" > condition. This is only evaluated on the first time after a rescan, > so putting that code at the end of the function makes more sense. The > attached v2 patch has it this way. Not that it's going to make any difference, but since you're going to touch this code, why not? Function *begin_partition*: 1. You can reduce the scope for variable *outerPlan*, it is not needed anywhere else. + /* + * If this is the very first partition, we need to fetch the first input + * row to store in first_part_slot. + */ + if (TupIsNull(winstate->first_part_slot)) + { + PlanState *outerPlan = outerPlanState(winstate); + TupleTableSlot *outerslot = ExecProcNode(outerPlan); 2. There is once reference to variable numFuncs You can reduce the scope. + /* reset mark and seek positions for each real window function */ + for (int i = 0; i < winstate->numfuncs; i++) Function *prepare_tuplestore. * 1. There is once reference to variable numFuncs You can reduce the scope. /* create mark and read pointers for each real window function */ - for (i = 0; i < numfuncs; i++) + for (int i = 0; i < winstate->numfuncs; i++) 2. You can securely initialize the default value for variable *winstate->grouptail_ptr* in *else* part. if ((frameOptions & (FRAMEOPTION_EXCLUDE_GROUP | FRAMEOPTION_EXCLUDE_TIES)) && node->ordNumCols != 0) winstate->grouptail_ptr = tuplestore_alloc_read_pointer(winstate->buffer, 0); } + else + winstate->grouptail_ptr = -1; best regards, Ranier Vilela