Hello. At Tue, 9 Jul 2019 17:38:44 +0900, Tatsuro Yamada <tatsuro.yamada...@nttcom.co.jp> wrote in <244cb241-168b-d6a9-c45f-a80c34cdc...@nttcom.co.jp> > Hi Alvaro, Anthony, Julien and Robert, > > On 2019/07/09 3:47, Julien Rouhaud wrote: > > On Mon, Jul 8, 2019 at 8:44 PM Robert Haas <robertmh...@gmail.com> > > wrote: > >> > >> On Mon, Jul 8, 2019 at 2:18 PM Alvaro Herrera > >> <alvhe...@2ndquadrant.com> wrote: > >>> Yeah, I got the impression that that was determined to be the > >>> desirable > >>> behavior, so I made it do that, but I'm not really happy about it > >>> either. We're not too late to change the CREATE INDEX behavior, but > >>> let's discuss what is it that we want. > >> > >> I don't think I intended to make any such determination -- which > >> commit do you think established this as the canonical behavior? > >> > >> I propose that once a field is set, we should leave it set until the > >> end. > > +1 > > Note that this patch is already behaving like that if the table only > > contains dead rows.
+1 from me. > I fixed the patch including: > > - Replace "if" to "else if". (Suggested by Julien) > - Fix typo s/ech/each/. (Suggested by Anthony) > - Add Phase "analyzing complete" in the pgstat view. (Suggested by > - Julien, Robert and me) > It was overlooked to add it in system_views.sql. > > I share my re-test result, see below: > > --------------------------------------------------------- > [Session #1] > create table hoge as select * from generate_series(1, 1000000) a; > analyze verbose hoge; > > [Session #2] > \a \t > select * from pg_stat_progress_analyze; \watch 0.001 > > 3785|13599|postgres|16384|f|16384|scanning table|4425|6 > 3785|13599|postgres|16384|f|16384|scanning table|4425|31 > 3785|13599|postgres|16384|f|16384|scanning table|4425|70 > 3785|13599|postgres|16384|f|16384|scanning table|4425|109 > ... > 3785|13599|postgres|16384|f|16384|scanning table|4425|4425 > 3785|13599|postgres|16384|f|16384|scanning table|4425|4425 > 3785|13599|postgres|16384|f|16384|scanning table|4425|4425 > 3785|13599|postgres|16384|f|16384|analyzing sample|0|0 > 3785|13599|postgres|16384|f|16384|analyzing complete|0|0 <-- Added and > fixed. :) > --------------------------------------------------------- I have some comments. + const int index[] = { + PROGRESS_ANALYZE_PHASE, + PROGRESS_ANALYZE_TOTAL_BLOCKS, + PROGRESS_ANALYZE_BLOCKS_DONE + }; + const int64 val[] = { + PROGRESS_ANALYZE_PHASE_ANALYSIS, + 0, 0 Do you oppose to leaving the total/done blocks alone here:p? + BlockNumber nblocks; + double blksdone = 0; Why is it a double even though blksdone is of the same domain with nblocks? And finally: + pgstat_progress_update_param(PROGRESS_ANALYZE_BLOCKS_DONE, + ++blksdone); It is converted into int64. + WHEN 2 THEN 'analyzing sample' + WHEN 3 THEN 'analyzing sample (extended stats)' I think we should avoid parenthesized phrases as far as not-necessary. That makes the column unnecessarily long. The phase is internally called as "compute stats" so *I* would prefer something like the followings: + WHEN 2 THEN 'computing stats' + WHEN 3 THEN 'computing extended stats' + WHEN 4 THEN 'analyzing complete' And if you are intending by this that (as mentioned in the documentation) "working to complete this analyze", this would be: + WHEN 4 THEN 'completing analyze' + WHEN 4 THEN 'finalizing analyze' + <entry>Process ID of backend.</entry> of "the" backend. ? And period is not attached on all descriptions consisting of a single sentence. + <entry>OID of the database to which this backend is connected.</entry> + <entry>Name of the database to which this backend is connected.</entry> "database to which .. is connected" is phrased as "database this backend is connected to" in pg_stat_acttivity. (Just for consistency) + <entry>Whether the current scan includes legacy inheritance children.</entry> This apparently excludes partition tables but actually it is included. "Whether scanning through child tables" ? I'm not sure "child tables" is established as the word to mean both "partition tables and inheritance children".. + The table being scanned (differs from <literal>relid</literal> + only when processing partitions or inheritance children). Is <literal> needed? And I think the parentheses are not needed. OID of the table currently being scanned. Can differ from relid when analyzing tables that have child tables. + Total number of heap blocks to scan in the current table. Number of heap blocks on scanning_table to scan? It might be better that this description describes that this and the next column is meaningful only while the phase "scanning table". + The command is currently scanning the table. "sample(s)" comes somewhat abruptly in the next item. Something like "The command is currently scanning the table <structfield>scanning_table</structfield> to obtain samples" might be better. + <command>ANALYZE</command> is currently extracting statistical data + from the sample obtained. Something like "The command is computing stats from the samples obtained in the previous phase" might be better. regards. - Kyotaro Horiguchi NTT Open Source Software Center