On Tue, 14 Oct 2025 at 01:43, Kirill Reshke <[email protected]> wrote: > > Hi! > I do frequently interact with gin indexes. I also frequently use > pageinspect to check how data is stored in indixies. The latter is > beneficial in both education and corruption fixing purposes. > > I was always wondering on why GIN pageinspect module lacks support for > anything rather that Posting Tree leaf pages. So, I implemented > extended GIN support. > > Patch includes two functions, gin_entrypage_items and gin_datapage_items. > > gin_entrypage_items is a function to work with entry tree. It supports > both entry tree leaf and internal pages. > > GIN entry tree stores an index tuple with a single indexed column in > single-index-column case. > Otherwise, each tuple on GIN entry tree page is tuple of (attrnum, > attrvalue), where the first item in tuple shows which indexes column > this tuple refers to. GIN internal pages also contains downlink to > other entry tree pages, while entry tree leaf page may contain > compressed TIDs of the posting list, or downlink to posting tree. > > > example of output on entry tree internal page of multi-column index: > > ``` > > reshke=# select * from gin_entrypage_items(get_raw_page('x_i_j_idx', > 1), 'x_i_j_idx'::regclass); > itemoffset | downlink | tids | keys > ------------+----------+------+------------------------------------ > 1 | (3,0) | {} | i=113 > 2 | (5,0) | {} | j=34173cb38f07f89ddbebc2ac9128303f > 3 | (2,0) | {} | j=a0a080f42e6f13b3a2df133f073095dd > 4 | (4,0) | {} | j=fc490ca45c00b1249bbe3554a4fdf6fb > (4 rows) > > ``` > > example of output on entry tree leaf page of multi-column index: > > ``` > reshke=# select * from gin_entrypage_items(get_raw_page('x_i_j_idx', > 2), 'x_i_j_idx'::regclass); > itemoffset | downlink | tids | > keys > ------------+----------------+------------------------------+------------------------------------ > 1 | (2147483696,3) | {"(1,39)","(1,40)","(2,1)"} | > j=35f4a8d465e6e1edc05f3d8ab658c551 > 2 | (2147483696,3) | {"(4,10)","(4,11)","(4,12)"} | > j=3636638817772e42b59d74cff571fbb3 > 3 | (2147483696,3) | {"(5,1)","(5,2)","(5,3)"} | > j=3644a684f98ea8fe223c713b77189a77 > 4 | (2147483696,3) | {"(0,25)","(0,26)","(0,27)"} | > j=37693cfc748049e45d87b8c7d8b9aacd > 5 | (2147483696,3) | {"(3,33)","(3,34)","(3,35)"} | > j=37a749d808e46495a8da1e5352d03cae > > ``` > > downlink on the leaf page has a different meaning than on the internal > page, but I didn't handle it any differently. In the example ouput, > (2147483696,3) = ((1<<31) + 48, 3), meaning the next 48 bytes is the > entry tree key, and after that there is 3 posting items, > varbyte-encoded. > > I also tested this function on GIN index with nulls, it works. Let me > know if I'm wrong, I know that GIN handles NULLS very differently, but > I had trouble with it, which makes me think that I'm missing > something. > > Also turns out this gin_entrypage_items actually works for fast lists > (GIN_LIST and GIN_LIST_FULLROW pages). But output is something > strange, I cannot validate is output is sane. For tids, I get values > like > > ``` > 1 | (1,19) | > {"(16782080,4096)","(16777216,14336)","(1280,0)","(469774336,0)","(65536,0)","(2734686208,57344)","(8389120,0)","(3372220424,65280)","(4294967295,65535)","(16711680,0)","(0,0)","( > 33554688,0)","(614,0)","(67108864,1024)","(16777216,0)","(73793536,0)","(0,0)","(0,0)","(0,0)"} > ``` > > > > > gin_datapage_items is for posting trees, but not leaf pages. For leaf > pages, users are expected to still use the gin_leafpage_items > function. > Example output for gin_datapage_items:
In v2 I decided to simply reject these pages.
> ```
> reshke=# select * from gin_datapage_items(get_raw_page('x_i_j_idx',
> 43), 'x_i_j_idx'::regclass);
> itemoffset | downlink | item_tid
> ------------+----------+----------
> 1 | 124 | (162,12)
> 2 | 123 | (314,37)
> 3 | 251 | (467,23)
> 4 | 373 | (0,0)
> (4 rows)
>
> ```
>
> Patch still is very raw, many things to improve.
> Comments?
>
> --
> Best regards,
> Kirill Reshke
Attached v2 with minor fixes and new test cases in gin.sql.
--
Best regards,
Kirill Reshke
v2-0001-GIN-pageinspect-support-for-entry-tree-and-postin.patch
Description: Binary data
