On Fri, Mar 27, 2020 at 8:21 PM Kirill Bychik <kirill.byc...@gmail.com> wrote: > > > > >>>>> I'm attaching a v5 with fp records only for temp tables, so there's > > > >>>>> no risk of > > > >>>>> instability. As I previously said I'm fine with your two patches, > > > >>>>> so unless > > > >>>>> you have objections on the fpi test for temp tables or the > > > >>>>> documentation > > > >>>>> changes, I believe those should be ready for committer. > > > >>>> > > > >>>> You added the columns into pg_stat_database, but seem to forget to > > > >>>> update the document for pg_stat_database. > > > >>> > > > >>> Ah right, I totally missed that when I tried to clean up the original > > > >>> POC. > > > >>> > > > >>>> Is it really reasonable to add the columns for vacuum's WAL usage > > > >>>> into > > > >>>> pg_stat_database? I'm not sure how much the information about > > > >>>> the amount of WAL generated by vacuum per database is useful. > > > >>> > > > >>> The amount per database isn't really useful, but I didn't had a > > > >>> better idea on > > > >>> how to expose (auto)vacuum WAL usage until this: > > > >>> > > > >>>> Isn't it better to make VACUUM VERBOSE and autovacuum log include > > > >>>> that information, instead, to see how much each vacuum activity > > > >>>> generates the WAL? Sorry if this discussion has already been done > > > >>>> upthread. > > > >>> > > > >>> That's a way better idea! I'm attaching the full patchset with the > > > >>> 3rd patch > > > >>> to use this approach instead. There's a bit a duplicate code for > > > >>> computing the > > > >>> WalUsage, as I didn't find a better way to avoid that without exposing > > > >>> WalUsageAccumDiff(). > > > >>> > > > >>> Autovacuum log sample: > > > >>> > > > >>> 2020-03-19 15:49:05.708 CET [5843] LOG: automatic vacuum of table > > > >>> "rjuju.public.t1": index scans: 0 > > > >>> pages: 0 removed, 2213 remain, 0 skipped due to pins, 0 > > > >>> skipped frozen > > > >>> tuples: 250000 removed, 250000 remain, 0 are dead but not > > > >>> yet removable, oldest xmin: 502 > > > >>> buffer usage: 4448 hits, 4 misses, 4 dirtied > > > >>> avg read rate: 0.160 MB/s, avg write rate: 0.160 MB/s > > > >>> system usage: CPU: user: 0.13 s, system: 0.00 s, elapsed: > > > >>> 0.19 s > > > >>> WAL usage: 6643 records, 4 full page records, 1402679 bytes > > > >>> > > > >>> VACUUM log sample: > > > >>> > > > >>> # vacuum VERBOSE t1; > > > >>> INFO: vacuuming "public.t1" > > > >>> INFO: "t1": removed 50000 row versions in 443 pages > > > >>> INFO: "t1": found 50000 removable, 0 nonremovable row versions in > > > >>> 443 out of 443 pages > > > >>> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 512 > > > >>> There were 50000 unused item identifiers. > > > >>> Skipped 0 pages due to buffer pins, 0 frozen pages. > > > >>> 0 pages are entirely empty. > > > >>> 1332 WAL records, 4 WAL full page records, 306901 WAL bytes > > > >>> CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s. > > > >>> INFO: "t1": truncated 443 to 0 pages > > > >>> DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s > > > >>> INFO: vacuuming "pg_toast.pg_toast_16385" > > > >>> INFO: index "pg_toast_16385_index" now contains 0 row versions in 1 > > > >>> pages > > > >>> DETAIL: 0 index row versions were removed. > > > >>> 0 index pages have been deleted, 0 are currently reusable. > > > >>> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > > > >>> INFO: "pg_toast_16385": found 0 removable, 0 nonremovable row > > > >>> versions in 0 out of 0 pages > > > >>> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 513 > > > >>> There were 0 unused item identifiers. > > > >>> Skipped 0 pages due to buffer pins, 0 frozen pages. > > > >>> 0 pages are entirely empty. > > > >>> 0 WAL records, 0 WAL full page records, 0 WAL bytes > > > >>> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > > > >>> VACUUM > > > >>> > > > >>> Note that the 3rd patch is an addition on top of Kirill's original > > > >>> patch, as > > > >>> this is information that would have been greatly helpful to > > > >>> investigate in some > > > >>> performance issues I had to investigate recently. I'd be happy to > > > >>> have it land > > > >>> into v13, but if that's controversial or too late I'm happy to > > > >>> postpone it to > > > >>> v14 if the infrastructure added in Kirill's patches can make it to > > > >>> v13. > > > >> > > > >> Dear all, can we please focus on getting the core patch committed? > > > >> Given the uncertainity regarding autovacuum stats, can we please get > > > >> parts 1 and 2 into the codebase, and think about exposing autovacuum > > > >> stats later? > > > > > > > > Here are the comments for 0001 patch. > > > > > > > > + /* > > > > + * Report a full page image constructed for the WAL record > > > > + */ > > > > + pgWalUsage.wal_fp_records++; > > > > > > > > Isn't it better to use "fpw" or "fpi" for the variable name rather than > > > > "fp" here? In other places, "fpw" and "fpi" are used for full page > > > > writes/image. > > > > > > > > ISTM that this counter could be incorrect if XLogInsertRecord() > > > > determines to > > > > calculate again whether FPI is necessary or not. No? IOW, this issue > > > > could > > > > happen if XLogInsert() calls XLogRecordAssemble() multiple times in > > > > its do-while loop. Isn't this problematic? > > > > > > > > + long wal_bytes; /* size of wal records produced */ > > > > > > > > Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this > > > > variable > > > > rather than long? > > > > > > > > + shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space); > > > > > > > > bufusage_space should be walusage_space here? > > > > > > > > /* > > > > * Finish parallel execution. We wait for parallel workers to finish, > > > > and > > > > * accumulate their buffer usage. > > > > */ > > > > > > > > There are some comments mentioning buffer usage, in execParallel.c. > > > > For example, the top comment for ExecParallelFinish(), as the above. > > > > These should be updated. > > > > > > Here are the comments for 0002 patch. > > > > > > + OUT wal_write_bytes int8, > > > + OUT wal_write_records int8, > > > + OUT wal_write_fp_records int8 > > > > > > Isn't "write" part in the column names confusing because it's WAL > > > *generated* (not written) by the statement? > > > > > > +RETURNS SETOF record > > > +AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4' > > > +LANGUAGE C STRICT VOLATILE; > > > > > > PARALLEL SAFE should be specified? > > > > > > +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */ > > > > > > ISTM it's good timing to have also pg_stat_statements--1.8.sql since > > > the definition of pg_stat_statements() is changed. Thought? > > > > > > +-- CHECKPOINT before WAL tests to ensure test stability > > > +CHECKPOINT; > > > > > > Is this true? I thought you added this because the number of FPI > > > should be larger than zero in the subsequent test. No? But there > > > seems no such test. I'm not excited about adding the test checking > > > the number of FPI because it looks fragile, though... > > > > > > +UPDATE pgss_test SET b = '333' WHERE a = 3 \; > > > +UPDATE pgss_test SET b = '444' WHERE a = 4 ; > > > > > > Could you tell me why several queries need to be run to test > > > the WAL usage? Isn't running a few query enough for the test purpase? > > > > FTR I marked the commitfest entry as waiting on author. > > > > Kirill do you think you'll have time to address Fuji-san's review > > shortly? The end of the commitfest is approaching quite fast :( > > All these are really valuable objections. Unfortunately, I won't be > able to get all sorted out soon, due to total lack of time. I would be > very glad if somebody could step in for this patch.
I'll try to do that tomorrow!