Hi, Attached is an updated patch, fixing all the issues pointed out so far. Unless there are some objections, I plan to commit the 0001 part by the end of this CF. Part 0002 is a matter for PG13, as previously agreed.
On 3/24/19 1:17 AM, David Rowley wrote: > On Sun, 24 Mar 2019 at 12:41, Tomas Vondra <tomas.von...@2ndquadrant.com> > wrote: >> >> On 3/21/19 4:05 PM, David Rowley wrote: >>> 11. In get_mincount_for_mcv_list() it's probably better to have the >>> numerical literals of 0.0 instead of just 0. >> >> Why? > > Isn't it what we do for float and double literals? > OK, fixed. >> >>> 12. I think it would be better if you modified build_attnums_array() >>> to add an output argument that sets the size of the array. It seems >>> that most places you call this function you perform bms_num_members() >>> to determine the array size. >> >> Hmmm. I've done this, but I'm not sure I like it very much - there's no >> protection the value passed in is the right one, so the array might be >> allocated either too small or too large. I think it might be better to >> make it work the other way, i.e. pass the value out instead. > > When I said "that sets the size", I meant "that gets set to the size", > sorry for the confusion. I mean, if you're doing bms_num_members() > inside build_attnums_array() anyway, then this will save you from > having to do that in the callers too. > OK, I've done this now, and I'm fairly happy with it. >>> 28. Can you explain what this is? >>> >>> uint32 type; /* type of MCV list (BASIC) */ >>> >>> I see: #define STATS_MCV_TYPE_BASIC 1 /* basic MCV list type */ >>> >>> but it's not really clear to me what else could exist. Maybe the >>> "type" comment can explain there's only one type for now, but more >>> might exist in the future? >>> >> >> It's the same idea as for dependencies/mvdistinct stats, i.e. >> essentially a version number for the data structure so that we can >> perhaps introduce some improved version of the data structure in the future. >> >> But now that I think about it, it seems a bit pointless. We would only >> do that in a major version anyway, and we don't keep statistics during >> upgrades. So we could just as well introduce the version/flag/... if >> needed. We can't do this for regular persistent data, but for stats it >> does not matter. >> >> So I propose we just remove this thingy from both the existing stats and >> this patch. > > I see. I wasn't aware that existed for the other types. It certainly > gives some wiggle room if some mistakes were discovered after the > release, but thinking about it, we could probably just change the > "magic" number and add new code in that branch only to ignore the old > magic number, perhaps with a WARNING to analyze the table again. The > magic field seems sufficiently early in the struct that we could do > that. In the master branch we'd just error if the magic number didn't > match, since we wouldn't have to deal with stats generated by the > previous version's bug. > OK. I've decided to keep the field for now, for sake of consistency with the already existing statistic types. I think we can rethink that in the future, if needed. >>> 29. Looking at the tests I see you're testing that you get bad >>> estimates without extended stats. That does not really seem like >>> something that should be done in tests that are meant for extended >>> statistics. >>> >> >> True, it might be a bit unnecessary. Initially the tests were meant to >> show old/new estimates for development purposes, but it might not be >> appropriate for regression tests. I don't think it's a big issue, it's >> not like it'd slow down the tests significantly. Opinions? > > My thoughts were that if someone did something to improve non-MV > stats, then is it right for these tests to fail? What should the > developer do in the case? update the expected result? remove the test? > It's not so clear. > I think such changes would affect a number of other places in regression tests (changing plans, ...), so I don't see why fixing these tests would be any different. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
0001-multivariate-MCV-lists-20190325.patch.gz
Description: application/gzip
0002-multivariate-histograms-20190325.patch.gz
Description: application/gzip