David Rowley <dgrowle...@gmail.com> writes: > ... The main > thing I'd like to understand here is if there's not enough time to get > the entire patch set committed, is there much benefit to just having > the EquivalenceMember index stuff in by itself without the > RestrictInfo changes.
I finally made some time to look at this patchset, and I'm pretty disappointed, because after 35 versions I'd expect to see something that looks close to committable. This doesn't really. I like the basic idea of taking child EC members out of ECs' main ec_members lists, but there are too many weird details and underexplained/overcomplicated/unmaintainable data structures. One thing I don't love is putting the children into RelOptInfos. That seems like an unrelated data structure. Have you thought about instead having, in each EC that needs it, an array indexed by RTI of per-relation child-member lists? I think this might net out as less storage because there typically aren't that many ECs in a query. But the main thing is to not have so many interconnections between ECs and RelOptInfos. Another thing I really don't like is the back-link from EMs to ECs: + EquivalenceClass *em_ec; /* EquivalenceClass which has this member */ That makes the data structure circular, which will cause pprint to recurse infinitely. (The fact that you hadn't noticed that makes me wonder how you debugged any of these data structure changes.) We could prevent the recursion with suitable annotation on this field, but I'd really rather not have the field in the first place. Circular pointers are dangerous and best avoided. Also, it's bloating a node type that you are concerned about supporting a lot of. Another point is that I don't see any code to take care of updating these links during an EC merge. Some thoughts about the iterator stuff: * setup_eclass_member_iterator_with_children is a carpal-tunnel-inducing name. Could we drop the "_with_children" part? It doesn't seem to add much, since there's no variant for "without children". * The root parameter should be first; IMO there should be no exceptions to that within the planner. Perhaps putting the target iterator parameter last would make it read more nicely. Or you could rely on struct assignment: it = setup_eclass_member_iterator(root, ec, relids); * Why did you define the iterator as possibly returning irrelevant members? Doesn't that mean that every caller has to double-check? Wouldn't it make for less code and fewer bugs for the iterator to have that responsibility? If there is a good reason to do it like that, the comments should explain why. I don't really like the concept of 0004 at all. Putting *all* the EC-related RelOptInfos into a root-stored list seems to be doubling down very hard on the assumption that no performance-critical operations will ever need to search that whole list. Is there a good reason to do it like that, rather than say using the bitmap-index concept separately within each EC? That might also alleviate the problem you're having with the bitmapsets getting too big. Given that we've only got a week left, I see little hope of getting any of this into v18. regards, tom lane