On Wed, Oct 26, 2022 at 10:55 AM Melanie Plageman <melanieplage...@gmail.com> wrote:
+ The <structname>pg_statio_</structname> and + <structname>pg_stat_io</structname> views are primarily useful to determine + the effectiveness of the buffer cache. When the number of actual disk reads Totally nitpicking, but this reads a little funny to me. Previously the trailing underscore suggested this is a group, and now with pg_stat_io itself added (stupid question: should this be "pg_statio"?), it sounds like we're talking about two views: pg_stat_io and "pg_statio_". Maybe something like "The pg_stat_io view and the pg_statio_ set of views are primarily..."? + by that backend type in that IO context. Currently only a subset of IO + operations are tracked here. WAL IO, IO on temporary files, and some forms + of IO outside of shared buffers (such as when building indexes or moving a + table from one tablespace to another) could be added in the future. Again nitpicking, but should this be "may be added"? I think "could" suggests the possibility of implementation, whereas "may" feels more like a hint as to how the feature could evolve. + portion of the main shared buffer pool. This pattern is called a + <quote>Buffer Access Strategy</quote> in the + <productname>PostgreSQL</productname> source code and the fixed-size + ring buffer is referred to as a <quote>strategy ring buffer</quote> for + the purposes of this view's documentation. + </para></entry> Nice, I think this explanation is very helpful. You also use the term "strategy context" and "strategy operation" below. I think it's fairly obvious what those mean, but pointing it out in case we want to note that here, too. + <varname>read</varname> and <varname>extended</varname> for Maybe "plus" instead of "and" here for clarity (I'm assuming that's what the "and" means)? + <varname>backend_type</varname>s <literal>autovacuum launcher</literal>, + <literal>autovacuum worker</literal>, <literal>client backend</literal>, + <literal>standalone backend</literal>, <literal>background + worker</literal>, and <literal>walsender</literal> for all + <varname>io_context</varname>s is similar to the sum of I'm reviewing the rendered docs now, and I noticed sentences like this are a bit hard to scan: they force the reader to parse a big list of backend types before even getting to the meat of what this is talking about. Should we maybe reword this so that the backend list comes at the end of the sentence? Or maybe even use a list (e.g., like in the "state" column description in pg_stat_activity)? + <varname>heap_blks_read</varname>, <varname>idx_blks_read</varname>, + <varname>tidx_blks_read</varname>, and + <varname>toast_blks_read</varname> in <link + linkend="monitoring-pg-statio-all-tables-view"> + <structname>pg_statio_all_tables</structname></link>. and + <varname>blks_read</varname> from <link I think that's a stray period before the "and." + <para>If using the <productname>PostgreSQL</productname> extension, + <xref linkend="pgstatstatements"/>, + <varname>read</varname> for + <varname>backend_type</varname>s <literal>autovacuum launcher</literal>, + <literal>autovacuum worker</literal>, <literal>client backend</literal>, + <literal>standalone backend</literal>, <literal>background + worker</literal>, and <literal>walsender</literal> for all + <varname>io_context</varname>s is equivalent to Same comment as above re: the lengthy list. + Normal client backends should be able to rely on maintenance processes + like the checkpointer and background writer to write out dirty data as Nice--it's great to see this mentioned. But I think these are generally referred to as "auxiliary" not "maintenance" processes, no? + <para>If using the <productname>PostgreSQL</productname> extension, + <xref linkend="pgstatstatements"/>, <varname>written</varname> and + <varname>extended</varname> for <varname>backend_type</varname>s Again, should this be "plus" instead of "and"? + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>bytes_conversion</structfield> <type>bigint</type> + </para> I think this general approach works (instead of unit). I'm not wild about the name, but I don't really have a better suggestion. Maybe "op_bytes" (since each cell is counting the number of I/O operations)? But I think bytes_conversion is okay. Also, is this (in the middle of the table) the right place for this column? I would have expected to see it before or after all the actual I/O op cells. + <varname>io_context</varname>s. When a <quote>Buffer Access + Strategy</quote> reuses a buffer in the strategy ring, it must evict its + contents, incrementing <varname>reused</varname>. When a <quote>Buffer + Access Strategy</quote> adds a new shared buffer to the strategy ring + and this shared buffer is occupied, the <quote>Buffer Access + Strategy</quote> must evict the contents of the shared buffer, + incrementing <varname>evicted</varname>. I think the parallel phrasing here makes this a little hard to follow. Specifically, I think "must evict its contents" for the strategy case sounds like a bad thing, but in fact this is a totally normal thing that happens as part of strategy access, no? The idea is you probably won't need that buffer again, so it's fine to evict it. I'm not sure how to reword, but I think the current phrasing is misleading. + The number of times a <literal>bulkread</literal> found the current + buffer in the fixed-size strategy ring dirty and requiring flush. Maybe "...found ... to be dirty..."? + frequent vacuuming or more aggressive autovacuum settings, as buffers are + dirtied during a bulkread operation when updating the hint bit or when + performing on-access pruning. Are there docs to cross-reference here, especially for pruning? I couldn't find much except a few un-explained mentions in the page layout docs [2], and most of the search results refer to partition pruning. Searching for hint bits at least gives some info in blog posts and the wiki. + again. A high number of repossessions is a sign of contention for the + blocks operated on by the strategy operation. This (and in general the repossession description) makes sense, but I'm not sure what to do with the information. Maybe Andres is right that we could skip this in the first version? On Mon, Oct 24, 2022 at 12:39 PM Melanie Plageman <melanieplage...@gmail.com> wrote: > > I don't quite follow this: does this mean that I should expect > > 'reused' and 'evicted' to be equal in the 'shared' context, because > > they represent the same thing? Or will 'reused' just be null because > > it's not distinct from 'evicted'? It looks like it's null right now, > > but I find the wording here confusing. > > You should only see evictions when the strategy evicts shared buffers > and reuses when the strategy evicts existing strategy buffers. > > How about this instead in this docs? > > the number of times an existing buffer in the strategy ring was reused > as part of an operation in the <literal>bulkread</literal>, > <literal>bulkwrite</literal>, or <literal>vacuum</literal> > <varname>io_context</varname>s. when a buffer access strategy > <quote>reuses</quote> a buffer in the strategy ring, it must evict its > contents, incrementing <varname>reused</varname>. when a buffer access > strategy adds a new shared buffer to the strategy ring and this shared > buffer is occupied, the buffer access strategy must evict the contents > of the shared buffer, incrementing <varname>evicted</varname>. It looks like you ended up with different wording in the patch, but both this explanation and what's in the patch now make sense to me. Thanks for clarifying. Also, I noticed that the commit message explains missing rows for some backend_type / io_context combinations and NULL (versus 0) in some cells, but the docs don't really talk about that. Do you think that should be in there as well? Thanks, Maciek [1]: https://www.postgresql.org/docs/15/glossary.html#GLOSSARY-AUXILIARY-PROC [2]: https://www.postgresql.org/docs/15/storage-page-layout.html