I am still trying to understand the reason for the DSA overflow in hash join.
In addition to the two suspicious places where the number of buckets is doubled without a check for overflow (nodeHash.c:1668 and nodeHash.c:3290), there is one more place where the number of batches is multiplied by `EstimateParallelHashJoinBatch(hashtable)`, which is

    sizeof(ParallelHashJoinBatch) +
        (sizeof(SharedTuplestore) +
         sizeof(SharedTuplestoreParticipant) * participants) * 2

which works out to 480 bytes!
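For a sense of scale, here is a back-of-the-envelope calculation (my own arithmetic, not PostgreSQL code; it assumes MaxAllocSize = 0x3fffffff as in memutils.h and takes the 480-byte figure above at face value):

    #include <stdio.h>

    #define MAX_ALLOC_SIZE ((size_t) 0x3fffffff)   /* MaxAllocSize, ~1 GB */

    int
    main(void)
    {
        size_t per_batch = 480;     /* the EstimateParallelHashJoinBatch() value above */

        /* nbatch at which the shared batch array alone reaches MaxAllocSize */
        printf("%zu\n", MAX_ALLOC_SIZE / per_batch);    /* prints 2236962 */

        return 0;
    }

So with a 480-byte stride, the batch array hits the 1 GB allocation ceiling at roughly 2.2 million batches.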

But when we calculate the maximal number of batches, we limit it by the maximal number of pointers (8 bytes each):

    max_pointers = hash_table_bytes / sizeof(HashJoinTuple);
    max_pointers = Min(max_pointers, MaxAllocSize / sizeof(HashJoinTuple));
    /* If max_pointers isn't a power of 2, must round it down to one */
    max_pointers = pg_prevpower2_size_t(max_pointers);

    /* Also ensure we avoid integer overflow in nbatch and nbuckets */
    /* (this step is redundant given the current value of MaxAllocSize) */
    max_pointers = Min(max_pointers, INT_MAX / 2 + 1);

    dbuckets = ceil(ntuples / NTUP_PER_BUCKET);
    dbuckets = Min(dbuckets, max_pointers);
    nbuckets = (int) dbuckets;
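For illustration, plugging numbers into the clamp above (my arithmetic, assuming 64-bit pointers so sizeof(HashJoinTuple) = 8, and a hash_table_bytes large enough that the MaxAllocSize term is the binding one):

    MaxAllocSize / sizeof(HashJoinTuple) = 0x3fffffff / 8 = 134217727
    pg_prevpower2_size_t(134217727)                       = 67108864  (2^26)
    Min(67108864, INT_MAX / 2 + 1)                        = 67108864

So, if I read ExecChooseHashTableSize correctly, the same max_pointers cap lets nbatch grow to about 67 million.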


But as we can see, for the parallel batch array the multiplier is 480 bytes, not 8 bytes.
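Just to make the mismatch tangible, this is the shape of clamp I would have expected somewhere in this path (a standalone sketch under my reading above, not a patch; the helper name and the idea of clamping this way are mine):

    #include <stddef.h>

    #define MAX_ALLOC_SIZE ((size_t) 0x3fffffff)   /* MaxAllocSize */

    /*
     * Hypothetical helper: cap the number of batches so that the shared
     * batch array (per_batch_bytes per batch) stays under MaxAllocSize.
     */
    size_t
    clamp_nbatch(size_t nbatch, size_t per_batch_bytes)
    {
        size_t  limit = MAX_ALLOC_SIZE / per_batch_bytes;

        return nbatch < limit ? nbatch : limit;
    }

With per_batch_bytes = 480 this would cap nbatch around 2.2 million, instead of the roughly 67 million that the pointer-based limit currently allows.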

