On Wed, Mar 24, 2021 at 01:24:46AM -0500, Justin Pryzby wrote:
> It seems like you're preferring to use pluralized "statistics" in a lot of
> places that sound wrong to me.  For example:
> > Currently the first statistics wins, which seems silly.
> I can write more separately, but I think this is resolved and clarified if you
> write "statistics object" and not just "statistics".  

In HEAD:catalogs.sgml, pg_statistic_ext (the table) says "object":
|Name of the statistics object
|Owner of the statistics object
|An array of attribute numbers, indicating which table columns are covered by 
this statistics object;

But pg_stats_ext (the view) doesn't say "object", which sounds wrong:
|Name of extended statistics
|Owner of the extended statistics
|Names of the columns the extended statistics is defined on

Other pre-existing issues: should be singular "statistic":
doc/src/sgml/perform.sgml:     Another type of statistics stored for each 
column are most-common value
doc/src/sgml/ref/psql-ref.sgml:        The status of each kind of extended 
statistics is shown in a column

Pre-existing issues: doesn't say "object" but I think it should:
src/backend/commands/statscmds.c:                                        
errmsg("statistics creation on system columns is not supported")));
src/backend/commands/statscmds.c:                                        
errmsg("cannot have more than %d columns in statistics",
src/backend/commands/statscmds.c:        * If we got here and the OID is not 
valid, it means the statistics does
src/backend/commands/statscmds.c: * Select a nonconflicting name for a new 
statistics.
src/backend/commands/statscmds.c: * Generate "name2" for a new statistics given 
the list of column names for it
src/backend/statistics/extended_stats.c:                /* compute statistics 
target for this statistics */
src/backend/statistics/extended_stats.c: * attributes the statistics is defined 
on, and then the default statistics
src/backend/statistics/mcv.c: * The input is the OID of the statistics, and 
there are no rows returned if

should say "for a statistics object" or "for statistics objects"
src/backend/statistics/extended_stats.c: * target for a statistics objects 
(from the object target, attribute targets

Your patch adds these:

Should say "object":
+        * Check if we actually have a matching statistics for the expression.  
                                                                                
                                                                   
+               /* evaluate expressions (if the statistics has any) */          
                                                                                
                                                                   
+        * for the extended statistics. The second option seems more 
reasonable.                                                                     
                                                                              
+                * the statistics had all options enabled on the original 
version.                                                                        
                                                                         
+                * But if the statistics is defined on just a single column, it 
has to                                                                          
                                                                   
+       /* has the statistics expressions? */                                   
                                                                                
                                                                   
+                       /* expression - see if it's in the statistics */        
                                                                                
                                                                   
+                                        * column(s) the statistics depends on. 
 Also require all                                                               
                                                                   
+        * statistics is defined on more than one column/expression).           
                                                                                
                                                                   
+        * statistics is useless, but harmless).                                
                                                                                
                                                                   
+        * If there are no simply-referenced columns, give the statistics an 
auto                                                                            
                                                                      


+                        * Then the first statistics matches no expressions and 
3 vars,                                                                         
                                                                   
+                        * while the second statistics matches one expression 
and 1 var.                                                                      
                                                                     
+                        * Currently the first statistics wins, which seems 
silly.                                                                          
                                                                       

+                        * [(a+c), d]. But maybe it's better than failing to 
match the                                                                       
                                                                      
+                        * second statistics?                                   
                                                                                
                                                                   

I can make patches for these (separate patches for HEAD and your patch), but I
don't think your patch has to wait on it, since the user-facing documentation
is consistent with what's already there, and the rest are internal comments.

-- 
Justin


Reply via email to