On Fri, Feb 25, 2022 at 4:41 AM Melanie Plageman
<melanieplage...@gmail.com> wrote:
>
> I'm trying to understand why hash indexes are built primarily in shared
> buffers except when allocating a new splitpoint's worth of bucket pages
> -- which is done with smgrextend() directly in _hash_alloc_buckets().
>
> Is this just so that the value returned by smgrnblocks() includes the
> new splitpoint's worth of bucket pages?
>
> All writes of tuple data to pages in this new splitpoint will go
> through shared buffers (via _hash_getnewbuf()).
>
> I asked this and got some thoughts from Robert in [1], but I still
> don't really get it.
>
> When a new page is needed during the hash index build, why can't
> _hash_expandtable() just call ReadBufferExtended() with P_NEW instead
> of _hash_getnewbuf()? Does it have to do with the BUCKET_TO_BLKNO
> mapping?
>
Yes. We allocate each new splitpoint's chunk of bucket pages (a
power-of-2 group) in one go at the time of the split, which makes them
appear at consecutive block numbers in the index. That, in turn, lets
us compute the physical block number from the bucket number with
simple arithmetic (the BUCKET_TO_BLKNO mapping) plus some minimal
control information; a standalone sketch of that arithmetic is below.

--
With Regards,
Amit Kapila.
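
To make the arithmetic concrete, here is a minimal standalone C sketch.
It is not the actual PostgreSQL source: the real macro is
BUCKET_TO_BLKNO in src/include/access/hash.h (which additionally
handles the finer-grained splitpoint phases used for large tables), and
splitpoint_of(), bucket_to_blkno(), and spares[] below are hypothetical
stand-ins for _hash_spareindex(), the macro itself, and the metapage's
hashm_spares[] array.

/*
 * Standalone sketch of the BUCKET_TO_BLKNO idea.  Because each
 * splitpoint's bucket pages are allocated as one consecutive chunk,
 * finding bucket b's block needs only the count of overflow pages
 * allocated before b's splitpoint -- the small spares[] array.
 */
#include <stdint.h>
#include <stdio.h>

typedef uint32_t Bucket;
typedef uint32_t BlockNumber;

/*
 * Splitpoint (allocation group) of bucket b: group 0 holds bucket 0;
 * group s (s >= 1) holds buckets 2^(s-1) .. 2^s - 1.
 */
static int
splitpoint_of(Bucket b)
{
	int		s = 0;

	while (b >= ((Bucket) 1 << s))	/* find smallest s with b < 2^s */
		s++;
	return s;
}

/*
 * spares[s] holds the total number of overflow pages sitting before
 * splitpoint s's bucket pages, so the mapping is pure arithmetic.
 */
static BlockNumber
bucket_to_blkno(const uint32_t *spares, Bucket b)
{
	return 1					/* block 0 is the metapage */
		+ b						/* bucket pages are consecutive */
		+ spares[splitpoint_of(b)]; /* skip earlier overflow pages */
}

int
main(void)
{
	/*
	 * Say 2 overflow pages were added while splitpoint 2 was current;
	 * splitpoint 3's buckets then start 2 blocks later than b + 1.
	 */
	uint32_t	spares[] = {0, 0, 0, 2};
	Bucket		b;

	for (b = 0; b < 8; b++)
		printf("bucket %u -> block %u\n",
			   (unsigned) b, (unsigned) bucket_to_blkno(spares, b));
	return 0;
}

The point is that the mapping is pure arithmetic over a tiny array that
already lives in the metapage. If buckets were instead extended one
page at a time (e.g., via ReadBufferExtended() with P_NEW), bucket
pages and overflow pages would interleave unpredictably and no such
closed-form mapping would exist.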