On 2017/08/08 4:34, Robert Haas wrote: > On Mon, Aug 7, 2017 at 2:54 AM, Amit Langote > <langote_amit...@lab.ntt.co.jp> wrote: >> As long as find_all_inheritors() is a place only to determine the order in >> which partitions will be locked, it's fine. My concern is about the time >> of actual locking, which in the current planner implementation is too soon >> that we end up needlessly locking all the partitions. > > I don't think avoiding that problem is going to be easy. We need a > bunch of per-relation information, like the size of each relation, and > what indexes it has, and how big they are, and the statistics for each > one. It was at one point proposed by someone that every partition > should be required to have the same indexes, but (1) we didn't > implement it like that and (2) if we had done that it wouldn't solve > this problem anyway because the sizes are still going to vary.
Sorry, I didn't mean to say we shouldn't lock and open partitions at all. We do need their relation descriptors for planning and there is no doubt about that. I was just saying that we should do that only for the partitions that are not pruned. But, as you say, I can see that the planner changes required to be able to do that might be hard. >> The locking-partitions-too-soon issue, I think, is an important one and >> ISTM, we'd want to lock the partitions after we've determined the specific >> ones a query needs to scan using the information returned by >> RelationGetPartitionDispatchInfo. That means the latter had better locked >> the relations whose cached partition descriptors will be used to determine >> the result that it produces. One way to do that might be to lock all the >> tables in the list returned by find_all_inheritors that are partitioned >> tables before calling RelationGetPartitionDispatchInfo. It seems what the >> approach you've outlined below will let us do that. > > Yeah, I think so. I think we could possibly open and lock partitioned > children only, then prune away leaf partitions that we can determine > aren't needed, then open and lock the leaf partitions that are needed. Yes. >> BTW, IIUC, there will be two lists of OIDs we'll have: one in the >> find_all_inheritors order, say, L1 and the other determined by using >> partitioning-specific information for the given query, say L2. >> >> To lock, we iterate L1 and if a given member is in L2, we lock it. It >> might be possible to make it as cheap as O(nlogn). > > Commonly, we'll prune no partitions or all but one; and we should be > able to make those cases very fast. Agreed. >> Maybe, we can make the initial patch use syscache to get the relkind for a >> given child. If the syscache bloat is unbearable, we go with the >> denormalization approach. > > Yeah. Maybe if you write that patch, you can also test it to see how > bad the bloat is. I will try and see, but maybe the syscache solution doesn't get us past the proof-of-concept stage. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers