Hi Amit-san,

I wonder two things below. What do you think?

1)
For now, I'm not sure it should be set current_child_table_relid to zero
when the current phase is changed from "acquiring inherited sample rows" to
"computing stats". See <Test result> bellow.

In the upthread discussion [1], Robert asked to *not* do such things,
that is, resetting some values due to phase change.  I'm not sure his
point applies to this case too though.

Yeah, I understood.
I'll check target relid of "computing stats" to re-read a code of
analyze command later. :)


Finally, I understood after investigation of the code. :)
Call stack is the following, and analyze_rel() calls "N + 1" times
for partitioned table and each partitions.

analyze_rel start
 do_analyze_rel inh==true start
  onerel: hoge2
   acq_inh_sample_rows start
    childrel: hoge2_10000
    childrel: hoge2_20000
    childrel: hoge2_30000
    childrel: hoge2_default
   acq_inh_sample_rows end
   compute_stats start
   compute_stats end
   compute_index_stats start
   compute_index_stats end
   finalizing start
   finalizing end
 do_analyze_rel inh==true end
analyze_rel end
...


Also, I checked my test result. ("//" is my comments)


# select oid,relname,relkind from pg_class where relname like 'hoge2%';
  oid  |    relname    | relkind
-------+---------------+---------
 36081 | hoge2         | p
 36084 | hoge2_10000   | r
 36087 | hoge2_20000   | r
 36090 | hoge2_30000   | r
 36093 | hoge2_default | r
(6 rows)

# select relid,
         current_child_table_relid,
         phase,
         sample_blks_total,
         sample_blks_scanned,
         ext_stats_total,
         ext_stats_computed,
         child_tables_total,
         child_tables_done
  from pg_stat_progress_analyze; \watch 0.00001

== for partitioned table hoge2 ==
//hoge2_10000
36081|36084|acquiring inherited sample rows|45|20|0|0|4|0
36081|36084|acquiring inherited sample rows|45|42|0|0|4|0
36081|36084|acquiring inherited sample rows|45|45|0|0|4|0
36081|36084|acquiring inherited sample rows|45|45|0|0|4|0

//hoge2_20000
36081|36087|acquiring inherited sample rows|45|3|0|0|4|1
36081|36087|acquiring inherited sample rows|45|31|0|0|4|1
36081|36087|acquiring inherited sample rows|45|45|0|0|4|1
36081|36087|acquiring inherited sample rows|45|45|0|0|4|1

//hoge2_30000
36081|36090|acquiring inherited sample rows|45|12|0|0|4|2
36081|36090|acquiring inherited sample rows|45|35|0|0|4|2
36081|36090|acquiring inherited sample rows|45|45|0|0|4|2
36081|36090|acquiring inherited sample rows|45|45|0|0|4|2

//hoge2_default
36081|36093|acquiring inherited sample rows|45|18|0|0|4|3
36081|36093|acquiring inherited sample rows|45|38|0|0|4|3
36081|36093|acquiring inherited sample rows|45|45|0|0|4|3
36081|36093|acquiring inherited sample rows|45|45|0|0|4|3

//Below "computing stats" is for the partitioned table hoge,
//therefore the second column from the left side would be
//better to set Zero to easy to understand.
//I guessd that user think which relid is the target of
//"computing stats"?!
//Of course, other option is to write it on document.

36081|36093|computing stats     |45|45|0|0|4|4
36081|36093|computing stats     |45|45|0|0|4|4
36081|36093|computing stats     |45|45|0|0|4|4
36081|36093|computing stats     |45|45|0|0|4|4
36081|36093|finalizing analyze  |45|45|0|0|4|4

== for each partitions such as hoge2_10000 ... hoge2_default ==

//hoge2_10000
36084|0|acquiring sample rows   |45|25|0|0|0|0
36084|0|computing stats         |45|45|0|0|0|0
36084|0|computing extended stats|45|45|0|0|0|0
36084|0|finalizing analyze      |45|45|0|0|0|0

//hoge2_20000
36087|0|acquiring sample rows   |45|14|0|0|0|0
36087|0|computing stats         |45|45|0|0|0|0
36087|0|computing extended stats|45|45|0|0|0|0
36087|0|finalizing analyze      |45|45|0|0|0|0

//hoge2_30000
36090|0|acquiring sample rows   |45|12|0|0|0|0
36090|0|acquiring sample rows   |45|44|0|0|0|0
36090|0|computing extended stats|45|45|0|0|0|0
36090|0|finalizing analyze      |45|45|0|0|0|0

//hoge2_default
36093|0|acquiring sample rows   |45|10|0|0|0|0
36093|0|acquiring sample rows   |45|43|0|0|0|0
36093|0|computing extended stats|45|45|0|0|0|0
36093|0|finalizing analyze      |45|45|0|0|0|0



2)
There are many "finalizing analyze" phases based on relids in the case
of partitioning tables. Would it better to fix the document? or it
would be better to reduce it to one?

<Document>
---------------------------------------------------------
       <entry><literal>finalizing analyze</literal></entry>
       <entry>
         The command is updating pg_class. When this phase is completed,
         <command>ANALYZE</command> will end.
---------------------------------------------------------

When a partitioned table is analyzed, its partitions are analyzed too.
So, the ANALYZE command effectively runs N + 1 times if there are N
partitions -- first analyze partitioned table to collect "inherited"
statistics by collecting row samples using
acquire_inherited_sample_rows(), then each partition to collect its
own statistics.  Note that this recursive application to ANALYZE to
partitions (child tables) only occurs for partitioned tables, not for
legacy inheritance.

Thanks for your explanation.
I understand Analyzing Partitioned table a little.


It would be better to modify the document of "finalizing analyze" phase.

  # Before modify
   The command is updating pg_class. When this phase is completed,
   <command>ANALYZE</command> will end.

  # Modified
   The command is updating pg_class. When this phase is completed,
   <command>ANALYZE</command> will end. In the case of partitioned table,
   it might be shown on each partitions.

What do you think that? I'm going to fix it, if you agreed. :)

Thanks,
Tatsuro Yamada





Reply via email to