Hi, I didn't find any major issues with the patch. It works as expected. However, I have few minor comments that I would like to share,
+ <entry> + Total number of sample rows. The sample it reads is taken randomly. + Its size depends on the <varname>default_statistics_target</> parameter value. + </entry> 1) I think it would be better if you could specify reference link to the guc variable 'default_statistics_target'. Something like this, <xref linkend='guc-default-statistics-target'>. If you go through monitoring.sgml, you would find that there is reference link to all guc variables being used. 2) I feel that the 'computing_statistics' phase could have been splitted into 'computing_column_stats', 'computing_index_stats'.... Please let me know your thoughts on this. + certain commands during command execution. Currently, the command + which supports progress reporting is <command>VACUUM</> and <command>ANALYZE</>. This may be expanded in the future. 3) I think above needs to be rephrased. Something like...Currently, the supported progress reporting commands are 'VACUUM' and 'ANALYZE'. Moreover, I also ran your patch on Windows platform and didn't find any issues. Thanks. -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com On Sat, Mar 18, 2017 at 5:30 PM, Ashutosh Sharma <ashu.coe...@gmail.com> wrote: > Hi Vinayak, > > Here are couple of review comments that may need your attention. > > 1. Firstly, I am seeing some trailing whitespace errors when trying to > apply your v3 patch using git apply command. > > [ashu@localhost postgresql]$ git apply pg_stat_progress_analyze_v3.patch > pg_stat_progress_analyze_v3.patch:155: trailing whitespace. > CREATE VIEW pg_stat_progress_analyze AS > pg_stat_progress_analyze_v3.patch:156: trailing whitespace. > SELECT > pg_stat_progress_analyze_v3.patch:157: trailing whitespace. > S.pid AS pid, S.datid AS datid, D.datname AS datname, > pg_stat_progress_analyze_v3.patch:158: trailing whitespace. > S.relid AS relid, > pg_stat_progress_analyze_v3.patch:159: trailing whitespace. > CASE S.param1 WHEN 0 THEN 'initializing' > error: patch failed: doc/src/sgml/monitoring.sgml:521 > > 2) The test-case 'rules' is failing. I think it's failing because in > rules.sql 'ORDERBY viewname' is used which will put > 'pg_stat_progress_analyze' ahead of 'pg_stat_progress_vacuum' in the > list of catalog views. You may need to correct rules.out file > accordingly. This is what i could see in rules.sql, > > SELECT viewname, definition FROM pg_views WHERE schemaname <> > 'information_schema' ORDER BY viewname; > > I am still reviewing your patch and doing some testing. Will update if > i find any issues. Thanks. > > -- > With Regards, > Ashutosh Sharma > EnterpriseDB:http://www.enterprisedb.com > > On Fri, Mar 17, 2017 at 3:16 PM, vinayak > <pokale_vinayak...@lab.ntt.co.jp> wrote: >> >> On 2017/03/17 10:38, Robert Haas wrote: >>> >>> On Fri, Mar 10, 2017 at 2:46 AM, vinayak >>> <pokale_vinayak...@lab.ntt.co.jp> wrote: >>>> >>>> Thank you for reviewing the patch. >>>> >>>> The attached patch incorporated Michael and Amit comments also. >>> >>> I reviewed this tonight. >> >> Thank you for reviewing the patch. >>> >>> + /* Report compute index stats phase */ >>> + pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE, >>> + >>> PROGRESS_ANALYZE_PHASE_COMPUTE_INDEX_STATS); >>> >>> Hmm, is there really any point to this? And is it even accurate? It >>> doesn't look to me like we are computing any index statistics here; >>> we're only allocating a few in-memory data structures here, I think, >>> which is not a statistics computation and probably doesn't take any >>> significant time. >>> >>> + /* Report compute heap stats phase */ >>> + pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE, >>> + >>> PROGRESS_ANALYZE_PHASE_COMPUTE_HEAP_STATS); >>> >>> The phase you label as computing heap statistics also includes the >>> computation of index statistics. I wouldn't bother separating the >>> computation of heap and index statistics; I'd just have a "compute >>> statistics" phase that begins right after the comment that starts >>> "Compute the statistics." >> >> Understood. Fixed in the attached patch. >>> >>> >>> + /* Report that we are now doing index cleanup */ >>> + pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE, >>> + PROGRESS_ANALYZE_PHASE_INDEX_CLEANUP); >>> >>> OK, this doesn't make any sense either. We are not cleaning up the >>> indexes here. We are just closing them and releasing resources. I >>> don't see why you need this phase at all; it can't last more than some >>> tiny fraction of a second. It seems like you're trying to copy the >>> exact same phases that exist for vacuum instead of thinking about what >>> makes sense for ANALYZE. >> >> Understood. I have removed this phase. >>> >>> >>> + /* Report number of heap blocks scaned so far*/ >>> + pgstat_progress_update_param(PROGRESS_ANALYZE_HEAP_BLKS_SCANNED, >>> targblock); >>> >>> While I don't think this is necessarily a bad thing to report, I find >>> it very surprising that you're not reporting what seems to me like the >>> most useful thing here - namely, the number of blocks that will be >>> sampled in total and the number of those that you've selected. >>> Instead, you're just reporting the length of the relation and the >>> last-sampled block, which is a less-accurate guide to total progress. >>> There are enough counters to report both things, so maybe we should. >>> >>> + /* Report total number of sample rows*/ >>> + pgstat_progress_update_param(PROGRESS_ANALYZE_TOTAL_SAMPLE_ROWS, >>> numrows); >>> >>> As an alternative to the previous paragraph, yet another thing you >>> could report is number of rows sampled out of total rows to be >>> sampled. But this isn't the way to do it: you are reporting the >>> number of rows you sampled only once you've finished sampling. The >>> point of progress reporting is to report progress -- that is, to >>> report this value as it's updated, not just to report it when ANALYZE >>> is practically done and the value has reached its maximum. >> >> Understood. >> I have reported number of rows sampled out of total rows to be sampled. >> I have not reported the number of blocks that will be sampled in total. >> So currect pg_stat_progress_analyze view looks like: >> >> =# \d+ pg_stat_progress_analyze >> View "pg_catalog.pg_stat_progress_analyze" >> Column | Type | Collation | Nullable | Default | Storage >> | Descripti >> on >> ------------------------+---------+-----------+----------+---------+----------+---------- >> --- >> pid | integer | | | | plain >> | >> datid | oid | | | | plain >> | >> datname | name | | | | plain >> | >> relid | oid | | | | plain >> | >> phase | text | | | | >> extended | >> num_target_sample_rows | bigint | | | | plain >> | >> num_rows_sampled | bigint | | | | plain >> | >> View definition: >> SELECT s.pid, >> s.datid, >> d.datname, >> s.relid, >> CASE s.param1 >> WHEN 0 THEN 'initializing'::text >> WHEN 1 THEN 'collecting sample rows'::text >> WHEN 2 THEN 'computing statistics'::text >> ELSE NULL::text >> END AS phase, >> s.param2 AS num_target_sample_rows, >> s.param3 AS num_rows_sampled >> FROM pg_stat_get_progress_info('ANALYZE'::text) s(pid, datid, relid, >> param1, param2, p >> aram3, param4, param5, param6, param7, param8, param9, param10) >> LEFT JOIN pg_database d ON s.datid = d.oid; >> >> Comment? >>> >>> The documentation for this patch isn't very good, either. You forgot >>> to update the part of the documentation that says that VACUUM is the >>> only command that currently supports progress reporting, and the >>> descriptions of the phases and progress counters are brief and not >>> entirely accurate - e.g. heap_blks_scanned is not actually the number >>> of heap blocks scanned, because we are sampling, not reading all the >>> blocks. >> >> Understood. I have updated the documentation. >> I will also try to improve documentation. >>> >>> When adding calls to the progress reporting functions, please consider >>> whether a blank line should be added before or after the new code, or >>> both, or neither. I'd say some blank lines are needed in a few places >>> where you didn't add them. >> >> +1. I have added blank lines in a few places. >> >> Regards, >> Vinayak Pokale >> >> >> -- >> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-hackers >> -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers