On Thu, Aug 22, 2024 at 8:25 PM Robert Haas <robertmh...@gmail.com> wrote: > > On Thu, Aug 22, 2024 at 12:43 PM Alexander Korotkov > <aekorot...@gmail.com> wrote: > > Thank you for your feedback. Yes, it seems that there is not enough > > time to even carefully analyze all the issues in these features. The > > rule of thumb I can get from this experience is "think multiple times > > before accessing something already opened by its name". I'm going to > > revert these features during next couple days. > > Thanks, and sorry about that. I would say even "think multiple times" > is possibly not strong enough -- it might almost be "just don't ever > do it". Even if (in some particular case) the invalidation mechanism > seems to protect you from getting wrong answers, there are often holes > in that, specifically around search_path = foo, bar and you're > operating on an object in schema bar and an identically-named object > is created in schema foo at just the wrong time. Sometimes there are > problems even when search_path is not involved, but when it is, there > are more. > > Here, aside from the name lookup issues, there are also problems with > expression evaluation: we can't split partitions without reindexing > rows that those partitions contain, and it is critical to think > through which is going to do the evaluation and make sure it's > properly sandboxed. I think we might need > SECURITY_RESTRICTED_OPERATION here. > > Another thing I want to highlight if you do have another go at this > patch is that it's really critical to think about where every single > property of the newly-created tables comes from. The original patch > didn't consider relpersistence or tableam, and here I just discovered > that owner is also an issue that probably needs more consideration, > but it goes way beyond that. For example, I was surprised to discover > that if I put per-partition constraints or triggers on a partition and > then split it, they were not duplicated to the new partitions. Now, > maybe that's actually the behavior we want -- I'm not 100% positive -- > but it sure wasn't what I was expecting. If we did duplicate them when > splitting, then what's supposed to happen when merging occurs? That is > not at all obvious, at least to me, but it needs careful thought. ACLs > and rules and default values and foreign keys (both outbond and > inbound) all need to be considered too, along with 27 other things > that I'm sure I'm not thinking about right now. Some of this behavior > should probably be explicitly documented, but all of it should be > considered carefully enough before commit to avoid surprises later. I > say that both from a security point of view and also just from a user > experience point of view. Even if things aren't insecure, they can > still be annoying, but it's not uncommon in cases like this for > annoying things to turn out to also be insecure. > > Finally, if you do revisit this, I believe it would be a good idea to > think a bit harder about how data is moved around. My impression (and > please correct me if I am mistaken) is that currently, any split or > merge operation rewrites all the data in the source partition(s). If a > large partition is being split nearly equally, I think that has a good > chance of being optimal, but I think that might be the only case. If > we're merging partitions, wouldn't it be better to adjust the > constraints on the first partition -- or perhaps the largest partition > if we want to be clever -- and insert the data from all of the others > into it? Maybe that would even have syntax that puts the user in > control of which partition survives, e.g. ALTER TABLE tab1 MERGE > PARTITION part1 WITH part2, part3, .... That would also make it really > obvious to the user what all of the properties of part1 will be after > the merge: they will be exactly the same as they were before the > merge, except that the partition constraint will have been adjusted. > You basically dodge everything in the previous paragraph in one shot, > and it seems like it would also be faster. Splitting there's no > similar get-out-of-jail free card, at least not that I can see. Even > if you add syntax that splits a partition by using INSERT/DELETE to > move some rows to a newly-created partition, you still have to make at > least one new partition. But possibly that syntax is worth having > anyway, because it would be a lot quicker in the case of a highly > asymmetric split. On the other hand, maybe even splits are much more > likely and we don't really need it. I don't know.
Thank you for so valuable feedback! When I have another go over this patch I will ensure this is addressed. ------ Regards, Alexander Korotkov Supabase