> * Doc build error and malformatting. > Looking into it.
> * I'm not certain that we want all changes to relation stats to be non- > transactional. Are there transactional use cases? Should it be an > option? Should it be transactional for pg_set_relation_stats() but non- > transactional for pg_restore_relation_stats()? > It's non-transactional because that's how ANALYZE does it to avoid bloating pg_class. We _could_ do it transactionally, but on restore we'd immediately have a pg_class that was 50% bloat. > > * The documentation for the pg_set_attribute_stats() still refers to > upgrade scenarios -- shouldn't that be in the > pg_restore_attribute_stats() docs? I imagine the pg_set variant to be > used for ad-hoc planner stuff rather than upgrades. > Noted. > > * For the "WARNING: stat names must be of type text" I think we need an > ERROR instead. The calling convention of name/value pairs is broken and > we can't safely continue. > They can't be errors, because any one error fails the whole pg_upgrade. > * The huge list of "else if (strcmp(statname, mc_freqs_name) == 0) ..." > seems wasteful and hard to read. I think we already discussed this, > what was the reason we can't just use an array to map the arg name to > an arg position type OID? > That was my overreaction to the dislike that the P_argname enum got in previous reviews. We'd need an array of struct like argname (ex. "mc_vals") argtypeoid (one of: int, text, real, rea[]) argtypename (name we want to call the argtypeoid (integer, text. real, real[] about covers it). argpos (position in the arg list of the corresponding pg_set_ function > > * How much error checking did we decide is appropriate? Do we need to > check that range_length_hist is always specified with range_empty_frac, > or should we just call that the planner's problem if one is specified > and the other not? Similarly, range stats for a non-range type. > I suppose we can let that go, and leave incomplete stat pairs in there. The big risk is that somebody packs the call with more than 5 statkinds, which would overflow the struct. > * I think most of the tests should be of pg_set_*_stats(). For > pg_restore_, we just want to know that it's translating the name/value > pairs reasonably well and throwing WARNINGs when appropriate. Then, for > pg_dump tests, it should exercise pg_restore_*_stats() more completely. > I was afraid you'd suggest that, in which case I'd break up the patch into the pg_sets and the pg_restores. > * It might help to clarify which arguments are important (like > n_distinct) vs not. I assume the difference is that it's a non-NULLable > column in pg_statistic. > There are NOT NULL stats...now. They might not be in the future. Does that change your opinion? > > * Some arguments, like the relid, just seem absolutely required, and > it's weird to just emit a WARNING and return false in that case. Again, we can't fail.Any one failure breaks pg_upgrade. > * To clarify: a return of "true" means all settings were successfully > applied, whereas "false" means that some were applied and some were > unrecognized, correct? Or does it also mean that some recognized > options may not have been applied? > True means "at least some stats were applied. False means "nothing was modified". > * pg_set_attribute_stats(): why initialize the output tuple nulls array > to false? It seems like initializing it to true would be safer. > +1 > > * please use a better name for "k" and add some error checking to make > sure it doesn't overrun the available slots. > k was an inheritance from analzye.c, from whence the very first version was cribbed. No objection to renaming. > * the pg_statistic tuple is always completely replaced, but the way you > can call pg_set_attribute_stats() doesn't imply that -- calling > pg_set_attribute_stats(..., most_common_vals => ..., most_common_freqs > => ...) looks like it would just replace the most_common_vals+freqs and > leave histogram_bounds as it was, but it actually clears > histogram_bounds, right? Should we make that work or should we document > better that it doesn't? > That would complicate things. How would we intentionally null-out one stat, while leaving others unchanged? However, this points out that I didn't re-instate the re-definition that applied the NULL defaults.