On Thu, Aug 1, 2019 at 4:45 AM Thomas Munro <thomas.mu...@gmail.com> wrote: > On Tue, Jul 23, 2019 at 4:51 PM Tatsuro Yamada > <tatsuro.yamada...@nttcom.co.jp> wrote: > > Attached v4 patch file only includes this fix. > > I've moved this to the September CF, where it is in "Needs review" state.
/me reviews. + <entry><structfield>scanning_table</structfield></entry> I think this should be retitled to something that ends in 'relid', like all of the corresponding cases in existing progress views. Perhaps 'active_relid' or 'current_relid'. + The command is computing extended stats from the samples obtained in the previous phase. I think you should change this (and the previous one) to say "from the samples obtained during the table scan." + Total number of heap blocks in the scanning_table. Perhaps I'm confused, but it looks to me like what you are advertising is the number of blocks that will be sampled, not the total number of blocks in the table. I think that's the right thing to advertise, but then the column should be named and documented that way. + { + const int index[] = { + PROGRESS_ANALYZE_TOTAL_BLOCKS, + PROGRESS_ANALYZE_SCANREL + }; + const int64 val[] = { + nblocks, + RelationGetRelid(onerel) + }; + + pgstat_progress_update_multi_param(2, index, val); + } This block seems to be introduced just so you can declare variables; I don't think that's good style. It's arguably unnecessary because we now are selectively allowing variable declarations within functions, but I think you should just move the first array to the top of the function and the second declaration to the top of the function dropping const, and then just do val[0] = nblocks and val[1] = RelationGetRelid(onerel). Maybe you can also come up with better names than 'index' and 'val'. Same comment applies to another place where you have something similar. Patch seems to need minor rebasing. Maybe "scanning table" should be renamed "acquiring sample rows," to match the names used in the code? I'm not a fan of the way you set the scan-table phase and inh flag in one place, and then slightly later set the relation OID and block count. That creates a race during which users could see the first bit of data set and the second not set. I don't see any reason not to set all four fields together. Please be sure to make the names of the constants you use match up with the external names as far as it reasonably makes sense, e.g. +#define PROGRESS_ANALYZE_PHASE_SCAN_TABLE 1 +#define PROGRESS_ANALYZE_PHASE_COMPUTING 2 +#define PROGRESS_ANALYZE_PHASE_COMPUTING_EXTENDED 3 +#define PROGRESS_ANALYZE_PHASE_FINALIZE 4 vs. + WHEN 0 THEN 'initializing'::text + WHEN 1 THEN 'scanning table'::text + WHEN 2 THEN 'computing stats'::text + WHEN 3 THEN 'computing extended stats'::text + WHEN 4 THEN 'finalizing analyze'::text Not terrible, but it could be closer. Similarly with the column names (include_children vs. INH). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company