On Wed, Feb 8, 2017 at 11:26 PM, Robert Haas <robertmh...@gmail.com> wrote: > On Wed, Feb 8, 2017 at 11:58 AM, Ashutosh Sharma <ashu.coe...@gmail.com> > wrote: >>> And then, instead, you need to add some code to set bit based on the >>> bitmap page, like what you have: >>> >>> + mapbuf = _hash_getbuf(indexRel, bitmapblkno, HASH_READ, >>> LH_BITMAP_PAGE); >>> + mappage = BufferGetPage(mapbuf); >>> + freep = HashPageGetBitmap(mappage); >>> + >>> + if (ISSET(freep, bitmapbit)) >>> + bit = 1; >>> >>> Except I would write that last line as... >>> >>> bit = ISSET(freep, bitmapbit) != 0 >>> >>> ...instead of using an if statement. >>> >>> I'm sort of confused as to why the idea of not reading the underlying >>> page is so hard to understand. >> >> It was not so hard to understand your point. The only reason for >> reading overflow page is to ensure that we are passing an overflow >> block as input to '_hash_ovflblkno_to_bitno(metap, (BlockNumber) >> ovflblkno)'. I had removed the code for reading an overflow page >> assuming that _hash_ovflblkno_to_bitno() will throw an error if we >> pass block number of a page other than overflow page but, it doesn't >> seem to guarantee that it will always return value for an overflow >> page. Here are my observations, >> >> postgres=# select hash_page_type(get_raw_page('con_hash_index', 65)); >> hash_page_type >> ---------------- >> bucket >> (1 row) >> >> postgres=# SELECT * FROM >> hash_bitmap_info(get_raw_page('con_hash_index', 0), 'con_hash_index', >> 65); >> bitmapblkno | bitmapbit | bitstatus >> -------------+-----------+----------- >> 33 | 0 | t >> (1 row) >> >> postgres=# select hash_page_type(get_raw_page('con_hash_index', 64)); >> hash_page_type >> ---------------- >> bucket >> (1 row) >> >> postgres=# SELECT * FROM >> hash_bitmap_info(get_raw_page('con_hash_index', 0), 'con_hash_index', >> 64); >> ERROR: invalid overflow block number 64 >> >> postgres=# select hash_page_type(get_raw_page('con_hash_index', 63)); >> hash_page_type >> ---------------- >> bucket >> (1 row) >> >> postgres=# SELECT * FROM >> hash_bitmap_info(get_raw_page('con_hash_index', 0), 'con_hash_index', >> 63); >> ERROR: invalid overflow block number 63 >> >> postgres=# select hash_page_type(get_raw_page('con_hash_index', 33)); >> hash_page_type >> ---------------- >> bitmap >> (1 row) >> >> postgres=# SELECT * FROM >> hash_bitmap_info(get_raw_page('con_hash_index', 0), 'con_hash_index', >> 33); >> bitmapblkno | bitmapbit | bitstatus >> -------------+-----------+----------- >> 33 | 0 | t >> (1 row) > > Right, I understand. But if the caller cares about that, they can use > hash_page_type() in order to find out whether the result of > hash_bitmap_info() will be meaningful. The problem with the way > you've done it is that if someone wants to check the status of a > million bitmap bits, that currently requires reading a million pages > (8GB) whereas if you don't read the underlying page it requires > reading only enough pages to contain a million bitmap bits (~128kB). > That's a big difference. > > If you want to verify that the supplied page number is an overflow > page before returning the bit, I think you should do that calculation > based on the metapage. There's enough information in hashm_spares to > figure out which pages are primary bucket pages as opposed to overflow > pages, and there's enough information in hashm_mapp to identify which > pages are bitmap pages, and the metapage is always page 0. If you > take that approach, then you can check a million bitmap bits reading > only the relevant bitmap pages plus the metapage, which is a LOT less > I/O than reading a million index pages. >
Thanks for the inputs. Attached is the patch modified as per your suggestions. -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com
simplify_hash_bitmap_info_v2.patch
Description: application/download
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers