> The heap tuple is on page 3407872 at line pointer offset 12584? > That's an awfully but not implausibly big page number (about 26GB into > the relation) and an awfully and implausibly large line pointer offset > (how do we fit 12584 index tuples into an 8192-byte page?). Also, the > fact that this example has two index entries with the same hash code > pointing at the same heap TID seems wrong; wouldn't that result in > index scans returning duplicates? I think what's actually happening > here is that (3407872,12584) is actually the attempt to interpret some > chunk of memory containing the text representation of a TID as an > actual TID. When I print those numbers as hex, I get 0x343128, and > those are the digits "4" and "1" and an opening parenthesis ")" as > ASCII, so that might fit this theory.
I too had a similar observations when debugging hash_page_items() and I think we were trying to represent tid as a text rather than tid itself which was not correct. Attched patch fixes this bug. Below is what i observed and it is somewhat similar to your explanation in the above comment: (gdb) p PageGetItemId(uargs->page, uargs->offset) $2 = (ItemIdData *) 0x1d84754 (gdb) p *(ItemIdData *) 0x1d84754 $3 = {lp_off = 1664, lp_flags = 1, lp_len = 16} (gdb) p *(IndexTuple) (page + 1664) $4 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 839}, ip_posid = 137}, t_info = 16} (gdb) p BlockIdGetBlockNumber(&(itup->t_tid.ip_blkid)) $5 = 839 (gdb) p (itup->t_tid.ip_posid) $6 = 137 (gdb) p CStringGetTextDatum(psprintf("(%u,%u)", 839, 137)) $7 = 30959896 (gdb) p DatumGetCString(30959896) $8 = 0x1d86918 "4" (gdb) p (char *)0x1d86918 $23 = 0x1d86918 "4" > > The code that tries to extract the hashcode from the item also looks > suspicious. I'm not 100% sure whether it's wrong or just > strangely-written, but it doesn't make much sense to extract the item > contents, then using sprintf() to turn that into a hex string of > bytes, then parse the hex string using strtol() to get an integer > back. I think what it ought to be doing is getting a pointer to the > index tuple and then calling _hash_get_indextuple_hashkey. > I think there is nothing wrong with the hashcode being shown but i do agree that it is a bit lengthly method to find a hashcode considering that we already have a extern function _hash_get_indextuple_hashkey() that can be used to find the hashcode. I have corrected this in the attached patch. > Another minor complaint about hash_page_items is that it doesn't > pfree(uargs->page). I'm not sure it's necessary to pfree anything > here, but if we're going to do it I think we might as well be > consistent with the btree case. Anyway it can hardly make sense to > free the 8-byte structure and not the 8192-byte page to which it's > pointing. > In case of btree, we are copying entire page into uargs->page. <btreefuncs.c> memcpy(uargs->page, BufferGetPage(buffer), BLCKSZ); </btrrefuncs.c> But in our case we have a raw_page and uargs->page contains pointer to the page so no need to pfree uargs->page separately. > In hash_bimap_info(), we go to the trouble of creating a raw page but > never do anything with it. I guess the idea here is just to error out > if the supplied page number is not an overflow page, but there are no > comments explaining that. Anyway, I'm not sure it's a very good idea, > because it means that if you try to write a query to interrogate the > status of all the bitmap pages, it's going to read all of the overflow > pages to which they point, which makes the operation a LOT more > expensive. I think it would be better to leave this part out; if the > user wants to know which pages are actually overflow pages, they can > use hash_page_type() to figure it out. Let's make it the job of this > function just to check the available/free status of the page in the > bitmap, without reading the bitmap itself. Point taken. I am now checking the status of an overflow page without reading bitmap page. > > In doing that, I think it should either return (bitmapblkno, > bitmapbit, bit) or just bit. Returning bitmapblkno but not bitmapbit > seems strange. Also, I think it should return bit as a bool, not > int4. > The new bitmap_info() now returns bitmapblkno, bitmapbit and bitstatus as bool. > Another general note is that, in general, if you use the > BlahGetDatum() function to construct a datum, the SQL type should be > match the macro you picked - e.g. if the SQL type is int4, the macro > used should be Int32GetDatum(), not UInt32GetDatum() or anything else. > If the SQL type is bool, you should be using BoolGetDatum(). Sorry to mention but I didn't find any SQL datatype equivalent to uint32 or uint16 in 'C'. So, currently i am using int4 for uint16 and int8 for uint32. > Apart from the above, I did a little work to improve the reporting > when a page of the wrong type is verified. Updated patch with those > changes attached. okay. Thanks. I have done changes on top of this patch. -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com
0001-Add-support-for-hash-index-in-pageinspect-contrib-mo_v9.patch
Description: invalid/octet-stream
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers