On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju...@gmail.com> wrote: > > Hi, > > I'm replying here to all reviews that have been sent, thanks a lot! > > On Wed, Apr 01, 2020 at 04:29:16PM +0530, Amit Kapila wrote: > > On Wed, Apr 1, 2020 at 1:32 PM Julien Rouhaud <rjuju...@gmail.com> wrote: > > > > > > So here's a v9, rebased on top of the latest versions of Sawada-san's bug > > > fixes > > > (Amit's v6 for vacuum and Sawada-san's v2 for create index), with all > > > previously mentionned changes. > > > > > > > Few other comments: > > v9-0003-Add-infrastructure-to-track-WAL-usage > > 1. > > static void BufferUsageAdd(BufferUsage *dst, const BufferUsage *add); > > - > > +static void WalUsageAdd(WalUsage *dst, WalUsage *add); > > > > Looks like a spurious line removal > > > Fixed. > > > > 2. > > + /* Report a full page imsage constructed for the WAL record */ > > + *num_fpw += 1; > > > > Typo. /imsage/image > > > Ah sorry I though I fixed it previously, fixed. > > > > 3. Doing some testing with and without parallelism to ensure WAL usage > > data is correct would be great and if possible, share the results? > > > I just saw that Dilip did some testing, but just in case here is some > additional one > > - vacuum, after a truncate, loading 1M row and a "UPDATE t1 SET id = id" > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from > pg_stat_statements where query ilike '%vacuum%'; > query | calls | wal_bytes | wal_records | wal_num_fpw > ------------------------+-------+-----------+-------------+------------- > vacuum (parallel 3) t1 | 1 | 20098962 | 34104 | 2 > vacuum (parallel 0) t1 | 1 | 20098962 | 34104 | 2 > (2 rows) > > - create index, overload t1's parallel_workers, using the 1M line just > vacuumed: > > =# alter table t1 set (parallel_workers = 2); > ALTER TABLE > > =# create index t1_parallel_2 on t1(id); > CREATE INDEX > > =# alter table t1 set (parallel_workers = 0); > ALTER TABLE > > =# create index t1_parallel_0 on t1(id); > CREATE INDEX > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from > pg_stat_statements where query ilike '%create index%'; > query | calls | wal_bytes | wal_records | > wal_num_fpw > --------------------------------------+-------+-----------+-------------+------------- > create index t1_parallel_0 on t1(id) | 1 | 20355540 | 2762 | > 2745 > create index t1_parallel_2 on t1(id) | 1 | 20406811 | 2762 | > 2758 > (2 rows) > > It all looks good to me. > > > > v9-0005-Keep-track-of-WAL-usage-in-pg_stat_statements > > 4. > > +-- SELECT usage data, check WAL usage is reported, wal_records equal > > rows count for INSERT/UPDATE/DELETE > > +SELECT query, calls, rows, > > +wal_bytes > 0 as wal_bytes_generated, > > +wal_records > 0 as wal_records_generated, > > +wal_records = rows as wal_records_as_rows > > +FROM pg_stat_statements ORDER BY query COLLATE "C"; > > + query | > > calls | rows | wal_bytes_generated | wal_records_generated | > > wal_records_as_rows > > +------------------------------------------------------------------+-------+------+---------------------+-----------------------+--------------------- > > + DELETE FROM pgss_test WHERE a > $1 | > > 1 | 1 | t | t | t > > + DROP TABLE pgss_test | > > 1 | 0 | t | t | f > > + INSERT INTO pgss_test (a, b) VALUES ($1, $2), ($3, $4), ($5, $6) | > > 1 | 3 | t | t | t > > + INSERT INTO pgss_test VALUES(generate_series($1, $2), $3) | > > 1 | 10 | t | t | t > > + SELECT * FROM pgss_test ORDER BY a | > > 1 | 12 | f | f | f > > + SELECT * FROM pgss_test WHERE a > $1 ORDER BY a | > > 2 | 4 | f | f | f > > + SELECT * FROM pgss_test WHERE a IN ($1, $2, $3, $4, $5) | > > 1 | 8 | f | f | f > > + SELECT pg_stat_statements_reset() | > > 1 | 1 | f | f | f > > + SET pg_stat_statements.track_utility = FALSE | > > 1 | 0 | f | f | t > > + UPDATE pgss_test SET b = $1 WHERE a = $2 | > > 6 | 6 | t | t | t > > + UPDATE pgss_test SET b = $1 WHERE a > $2 | > > 1 | 3 | t | t | t > > +(11 rows) > > + > > > > I am not sure if the above tests make much sense as they are just > > testing that if WAL is generated for these commands. I understand it > > is not easy to make these tests reliable but in that case, we can > > think of some simple tests. It seems to me that the difficulty is due > > to full_page_writes as that depends on the checkpoint. Can we make > > full_page_writes = off for these tests and check some simple > > Insert/Update/Delete cases? Alternatively, if you can present the > > reason why that is unstable or are tricky to write, then we can simply > > get rid of these tests because I don't see tests for BufferUsage. Let > > not write tests for the sake of writing it unless they can detect bugs > > in the future or are meaningfully covering the new code added. > > > I don't think that we can have any hope in a stable amount of WAL bytes > generated, so testing a positive number looks sensible to me. Then testing > that each 1-line-write query generates a WAL record also looks sensible, so I > kept this. I realized that Kirill used an existing set of queries that were > previously added to validate the multi queries commands behavior, so there's > no > need to have all of them again. I just kept one of each (insert, update, > delete, select) to make sure that we do record WAL activity there, but I don't > think that more can really be done. I still think that this is better than > nothing, but if you disagree feel free to drop those tests. > > > > 5. > > -SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE > > "C"; > > - query | calls | rows > > ------------------------------------+-------+------ > > - SELECT $1::TEXT | 1 | 1 > > - SELECT PLUS_ONE($1) | 2 | 2 > > - SELECT PLUS_TWO($1) | 2 | 2 > > - SELECT pg_stat_statements_reset() | 1 | 1 > > +SELECT query, calls, rows, wal_bytes, wal_records FROM > > pg_stat_statements ORDER BY query COLLATE "C"; > > + query | calls | rows | wal_bytes | wal_records > > +-----------------------------------+-------+------+-----------+------------- > > + SELECT $1::TEXT | 1 | 1 | 0 | 0 > > + SELECT PLUS_ONE($1) | 2 | 2 | 0 | 0 > > + SELECT PLUS_TWO($1) | 2 | 2 | 0 | 0 > > + SELECT pg_stat_statements_reset() | 1 | 1 | 0 | 0 > > (4 rows) > > > > Again, I am not sure if these modifications make much sense? > > > Those are queries that were previously executed. As those are read-only > query, > that are pretty much guaranteed to not cause any WAL activity, I don't see how > it hurts to test at the same time that that's we indeed record with > pg_stat_statements, just to be safe. Once again, feel free to drop the extra > wal_* columns from the output if you disagree. > > > > 6. > > static void pgss_shmem_startup(void); > > @@ -313,6 +318,7 @@ static void pgss_store(const char *query, uint64 > > queryId, > > int query_location, int query_len, > > double total_time, uint64 rows, > > const BufferUsage *bufusage, > > + const WalUsage* walusage, > > pgssJumbleState *jstate); > > > > The alignment for walusage doesn't seem to be correct. Running > > pgindent will fix this. > > > Indeed, fixed. > > > 7. > > + values[i++] = Int64GetDatumFast(tmp.wal_records); > > + values[i++] = UInt64GetDatum(tmp.wal_num_fpw); > > > > Why are they different? I think we should use the same *GetDatum API > > (probably Int64GetDatumFast) for these. > > > Oops, that's a mistake from when I was working on the wal_bytes output, fixed. > > > > v9-0005-Keep-track-of-WAL-usage-in-pg_stat_statements > > > > > > > One more comment related to this patch. > > + > > + snprintf(buf, sizeof buf, UINT64_FORMAT, tmp.wal_bytes); > > + > > + /* Convert to numeric. */ > > + wal_bytes = DirectFunctionCall3(numeric_in, > > + CStringGetDatum(buf), > > + ObjectIdGetDatum(0), > > + Int32GetDatum(-1)); > > + > > + values[i++] = wal_bytes; > > > > I see that other places that display uint64 values use BIGINT datatype > > in SQL, so why can't we do the same here? See the usage of queryid in > > pg_stat_statements or internal_pages, *_pages exposed via > > pgstatindex.c. > > > That's because it's harmless to report a signed number for a hash (at least > comapred to the overhead of having it unsigned), while that's certainly not > wanted to report a negative amount of WAL bytes generated if it goes beyond > bigint limit. See the usage of pg_lsn_mi in pg_lsn.c for instance. > > On Wed, Apr 01, 2020 at 07:20:31PM +0530, Dilip Kumar wrote: > > > > I have reviewed 0003 and 0004, I have a few comments. > > v9-0003-Add-infrastructure-to-track-WAL-usage > > > > 1. > > /* Points to buffer usage area in DSM */ > > BufferUsage *buffer_usage; > > + /* Points to WAL usage area in DSM */ > > + WalUsage *wal_usage; > > > > Better to give one blank line between the previous statement/variable > > declaration and the next comment line. > > > Fixed. > > > > 2. > > @@ -2154,7 +2157,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, > > IndexBulkDeleteResult **stats, > > WaitForParallelWorkersToFinish(lps->pcxt); > > > > for (i = 0; i < lps->pcxt->nworkers_launched; i++) > > - InstrAccumParallelQuery(&lps->buffer_usage[i]); > > + InstrAccumParallelQuery(&lps->buffer_usage[i], &lps->wal_usage[i]); > > } > > > > The existing comment above this loop, which just mentions the buffer > > usage, not the wal usage so I guess we need to change that. > > > Ah indeed, I thought I caught all the comments but missed this one. Fixed. > > > > v9-0004-Add-option-to-report-WAL-usage-in-EXPLAIN-and-aut > > > > 3. > > + if (usage->wal_num_fpw > 0) > > + appendStringInfo(es->str, " full page records=%ld", > > + usage->wal_num_fpw); > > + if (usage->wal_bytes > 0) > > + appendStringInfo(es->str, " bytes=" UINT64_FORMAT, > > + usage->wal_bytes); > > > > Shall we change to 'full page writes' or 'full page image' instead of > > full page records? > > > Indeed, I changed it in the (auto)vacuum output but missed this one. Fixed. > > > > Apart from this, I have some testing to see the wal_usage with the > > parallel vacuum and the results look fine. > > > > postgres[104248]=# CREATE TABLE test (a int, b int); > > CREATE TABLE > > postgres[104248]=# INSERT INTO test SELECT i, i FROM > > GENERATE_SERIES(1,2000000) as i; > > INSERT 0 2000000 > > postgres[104248]=# CREATE INDEX idx1 on test(a); > > CREATE INDEX > > postgres[104248]=# VACUUM (PARALLEL 1) test; > > VACUUM > > postgres[104248]=# select query , wal_bytes, wal_records, wal_num_fpw > > from pg_stat_statements where query like 'VACUUM%'; > > query | wal_bytes | wal_records | wal_num_fpw > > --------------------------+-----------+-------------+------------- > > VACUUM (PARALLEL 1) test | 72814331 | 8857 | 8855 > > > > > > > > postgres[106479]=# CREATE TABLE test (a int, b int); > > CREATE TABLE > > postgres[106479]=# INSERT INTO test SELECT i, i FROM > > GENERATE_SERIES(1,2000000) as i; > > INSERT 0 2000000 > > postgres[106479]=# CREATE INDEX idx1 on test(a); > > CREATE INDEX > > postgres[106479]=# VACUUM (PARALLEL 0) test; > > VACUUM > > postgres[106479]=# select query , wal_bytes, wal_records, wal_num_fpw > > from pg_stat_statements where query like 'VACUUM%'; > > query | wal_bytes | wal_records | wal_num_fpw > > --------------------------+-----------+-------------+------------- > > VACUUM (PARALLEL 0) test | 72814331 | 8857 | 8855 > > > Thanks! I did some similar testing, with also seq/parallel index creation and > got similar results. > > > > By tomorrow, I will try to finish reviewing 0005 and 0006.
I have reviewed these patches and I have a few cosmetic comments. v10-0005-Keep-track-of-WAL-usage-in-pg_stat_statements 1. + uint64 wal_bytes; /* total amount of wal bytes written */ + int64 wal_records; /* # of wal records written */ + int64 wal_num_fpw; /* # of full page wal records written */ /s/# of full page wal records written / /* # of WAL full page image produced */ 2. static void pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, ProcessUtilityContext context, ParamListInfo params, QueryEnvironment *queryEnv, - DestReceiver *dest, QueryCompletion *qc); + DestReceiver *dest, QueryCompletion * qc); Useless hunk. 3. v10-0006-Expose-WAL-usage-counters-in-verbose-auto-vacuum @@ -3105,7 +3105,7 @@ show_wal_usage(ExplainState *es, const WalUsage *usage) { ExplainPropertyInteger("WAL records", NULL, usage->wal_records, es); - ExplainPropertyInteger("WAL full page records", NULL, + ExplainPropertyInteger("WAL full page writes", NULL, usage->wal_num_fpw, es); Just noticed that in 0004 you have first added "WAL full page records", which is later corrected to "WAL full page writes" in 0006. I think we better keep this proper in 0004 itself and avoid this hunk in 0006, otherwise, it creates confusion while reviewing. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com