Re: automatic analyze: readahead - add "IO read time" log message

Heikki Linnakangas Tue, 26 Jan 2021 02:23:51 -0800

On 13/01/2021 23:17, Stephen Frost wrote:

Would be great to get a review / comments from others as to if there's
any concerns.  I'll admit that it seems reasonably straight-forward to
me, but hey, I wrote most of it, so that's not really a fair
assessment... ;)


Look good overall. A few minor comments:

The patch consists of two part: add stats to the log for auto-analyze,and implement prefetching. They seem like independent features, considersplitting into two patches.

It's a bit weird that you get more stats in the log forautovacuum/autoanalyze than you get with VACUUM/ANALYZE VERBOSE. Notreally this patch's fault though.

This conflicts with the patch athttps://commitfest.postgresql.org/31/2907/, to refactor the table AManalyze API. That's OK, it's straightforward to resolve regardless ofwhich patch is committed first.

        /* Outer loop over blocks to sample */
        while (BlockSampler_HasMore(&bs))
        {
#ifdef USE_PREFETCH
                BlockNumber prefetch_targblock = InvalidBlockNumber;
#endif
                BlockNumber targblock = BlockSampler_Next(&bs);

#ifdef USE_PREFETCH

                /*
                 * Make sure that every time the main BlockSampler is moved 
forward
                 * that our prefetch BlockSampler also gets moved forward, so 
that we
                 * always stay out ahead.
                 */
                if (BlockSampler_HasMore(&prefetch_bs))
                        prefetch_targblock = BlockSampler_Next(&prefetch_bs);
#endif

                vacuum_delay_point();

                if (!table_scan_analyze_next_block(scan, targblock, 
vac_strategy))
                        continue;

#ifdef USE_PREFETCH

                /*
                 * When pre-fetching, after we get a block, tell the kernel 
about the
                 * next one we will want, if there's any left.
                 */
                if (effective_io_concurrency && prefetch_targblock != 
InvalidBlockNumber)
                        PrefetchBuffer(scan->rs_rd, MAIN_FORKNUM, 
prefetch_targblock);
#endif

                while (table_scan_analyze_next_tuple(scan, OldestXmin, &liverows, 
&deadrows, slot))
                {
                        ...
                }

                pgstat_progress_update_param(PROGRESS_ANALYZE_BLOCKS_DONE,
                                                                         
++blksdone);
        }

If effective_io_concurrency == 0, this callsBlockSampler_Next(&prefetch_bs) anyway, which is a waste of cycles.

If table_scan_analyze_next_block() returns false, we skip thePrefetchBuffer() call. That seem wrong.

Is there any potential harm from calling PrefetchBuffer() on a page thattable_scan_analyze_next_block() later deems as unsuitable for smaplingand returns false? That's theoretical at the moment, becauseheapam_scan_analyze_next_block() always returns true. (The tableamANALYZE API refactor patch will make this moot, as it moves this logicinto the tableam's implementation, so the implementation can do whatevermake sense for the particular AM.)


- Heikki

Re: automatic analyze: readahead - add "IO read time" log message

Reply via email to