On 03/26/2018 06:21 PM, Dean Rasheed wrote: > On 26 March 2018 at 14:08, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: >> On 03/26/2018 12:31 PM, Dean Rasheed wrote: >>> A wider concern I have is that I think this function is trying to be >>> too clever by only resetting selected stats. IMO it should just reset >>> all stats unconditionally when the column type changes, which would >>> be consistent with what we do for regular stats. >>> >> The argument a year ago was that it's more plausible that the semantics >> remains the same. I think the question is how the type change affects >> precision - had the type change in the opposite direction (int to real) >> there would be no problem, because both ndistinct and dependencies would >> produce the same statistics. >> >> In my experience people are far more likely to change data types in a >> way that preserves precision, so I think the current behavior is OK. > > Hmm, I don't really buy that argument. Altering a column's type > allows the data in it to be rewritten in arbitrary ways, and I don't > think we should presume that the statistics will still be valid just > because the user *probably* won't do something that changes the data > much. >
Maybe, I can only really speak about my experience, and in those cases it's usually "the column is an INT and I need a FLOAT". But you're right it's not guaranteed to be like that, perhaps the right thing to do is resetting the stats. Another reason to do that might be consistency - resetting just some of the stats might be surprising for users. And we're are already resetting per-column stats on that column, so the users running ANALYZE anyway. BTW in my response I claimed this: > > The other reason is that when reducing precision, it generally > enforces the dependency (you can't violate functional dependencies or > break grouping by merging values). So you will have stale stats with > weaker dependencies, but it's still better than not having any.> That's actually bogus. For example for functional dependencies, it's important on which side of the dependency we reduce precision. With (a->b) dependency, reducing precision of "b" does indeed strengthen it, but reducing precision of "a" does weaken it. So I take that back. So, I'm not particularly opposed to just resetting extended stats referencing the altered column. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services