On Tue, 26 Nov 2024 at 15:39, Nazir Bilal Yavuz <byavu...@gmail.com> wrote:
>
> Hi,
>
> Thank you for working on this!
>
> On Mon, 25 Nov 2024 at 21:17, Kirill Reshke <reshkekir...@gmail.com> wrote:
> > While reviewing other threads implementing stream API for various core
> > subsystems, I spotted that pgstattuple could also benefit from that.
> > So, PFA.
> >
> > Notice refactoring around pgstat_hash_page and changes in pgstat_page
> > signature. This is needed because pgstat_hash_tuple uses
> > _hash_getbuf_with_strategy rather than ReadBufferExtended, which is
> > OK, but makes adapting the streaming API impossible. So, I changed
> > this place. Old codes should behave exactly the same way as new ones.
> > I eliminated the additional check that _hash_getbuf_with_strategy
> > performed, which was blkno == P_NEW. However, since the streaming API
> > already delivers the right buffer, I think it's okay.
>
> I agree that this looks okay.
>
> > This change behaves sanely on my by-hand testing.
>
> I encountered some problems while playing with your patch. This query
> [1] fails when the patch is applied although CI passes, it wasn't
> failing before. I looked at the code and found that:
>
> === 1
>
>     blkno = start;
>     for (;;)
>     {
>         /* Get the current relation length */
>         LockRelationForExtension(rel, ExclusiveLock);
>         nblocks = RelationGetNumberOfBlocks(rel);
>         UnlockRelationForExtension(rel, ExclusiveLock);
>
> I think you need to set p.last_exclusive to nblocks here.
>
> === 2
>
> -        for (; blkno < nblocks; blkno++)
> +        while(BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
>          {
>              CHECK_FOR_INTERRUPTS();
>
> -            pagefn(&stat, rel, blkno, bstrategy);
> +            pagefn(&stat, rel, buf);
>          }
>
> blkno doesn't get increased now and this causes an infinite loop.
>
> ===
>
> Also, since this problem couldn't be found by the CI, you may want to
> add a test for the pgstat_index function.
>
> [1] CREATE EXTENSION pgstattuple; create table test (a int primary
> key, b int[]); create index test_hashidx on test using hash (b);
> select pgstattuple(oid) from pg_class where relname = 'test_hashidx';
>
> --
> Regards,
> Nazir Bilal Yavuz
> Microsoft


Hello! Thank you for taking a peek. Your review comments have been
corrected. Since my changes were wrong, I honestly don't know why this
worked in version 1. By a miracle.

As for CI, i rechecked v1:

```
db2=#
select * from pgstathashindex('test_hashidx');
 version | bucket_pages | overflow_pages | bitmap_pages | unused_pages
| live_items | dead_items | free_percent
---------+--------------+----------------+--------------+--------------+------------+------------+--------------
       4 |            4 |              0 |            1 |            0
|          0 |          0 |          100
(1 row)

db2=#
select * from pgstattuple('test_hashidx');
ERROR:  could not read blocks 6..16 in file "base/16454/16473": read
only 0 of 90112 bytes
```

In v2 this behaves correctly.

Should we add `pgstattuple(...)` after `pgstat*type*index(..)`
everywhere in pgstattuple regression tests?


P.S.
I want to mention something that I forgot in my initial email: I
reused codes from here https://commitfest.postgresql.org/50/5327/ with
some adaptation changes. So, Andrey is the author of this patch too.


-- 
Best regards,
Kirill Reshke

Attachment: v2-0001-Use-stream-read-interface-for-pgstattuple-routine.patch
Description: Binary data

Reply via email to