On Mon, Oct 05, 2020 at 10:07:33PM -0500, Justin Pryzby wrote:
> Honestly, I think you're over-thinking and over-engineering indisclustered.
>
> If "clusteredness" was something we offered to maintain across DML, I think
> that might be important to provide stronger guarantees.  As it is now, I don't
> think this patch is worth changing the catalog definition.

Well, this use case is new because we are discussing the relationship
of indisclustered across multiple transactions for multiple indexes,
so I'd rather have this discussion than not, and I have learnt the
hard way with REINDEX that we should care a lot about the consistency
of partition trees at any step of the operation.

Let's imagine a simple example here.  Take this partition tree: p
(parent), and two partitions p1 and p2.  p has two partitioned indexes
i and j, indexes also present in p1 and p2 as i1, i2, j1 and j2.
Let's assume that the user has done a CLUSTER on p USING i that
completes, meaning that i, i1 and i2 have indisclustered set.  Now
let's assume that the user does a CLUSTER on p USING j this time, and
that this command fails while processing p2, meaning that
indisclustered is set for j1, i2, and perhaps i or j depending on what
the patch does.  Per the latest arguments, j would be the one set to
indisclustered.  From this inconsistent state come a couple of
interesting things:
- A database-wide CLUSTER would end up using j1 and i2 for the
operation on the partitions, while the intention was to use j2 for the
second partition as the previous command failed.
- With CLUSTER p, without USING, I would logically assume that we
would rely on the value of indisclustered on j, meaning that we
would *force* p2 to use j2.  But it could also be seen as incorrect
by the user because we would not use the index originally marked as
such.

So keeping this consistent has the advantage of giving clear rules here.

> I think it would be strange if we refused "ALTER..CLUSTER ON" for a partition
> just because a different partitioned index was set clustered.  We'd clear that,
> like always, and then (in my proposal) also clear its parents "indisclustered".
> I still don't think that's essential, though.

Why?
Blocking a partition, which may itself be partitioned, from switching
to a different index when its partitioned parent uses something else
sounds kind of logical to me in the end, because the user originally
intended to use CLUSTER with a specific index on this tree.  So I
would say that the partitioned table takes priority, and this should
be released with a SET WITHOUT CLUSTER on the partitioned table.

> I didn't think it's worth the overhead of closing and opening more CFs.
> But I don't mind.

Thanks, I'll do some cleanup.
--
Michael
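
The failure scenario with p, p1 and p2 can be modelled roughly as
below.  This is only a toy sketch in Python, not PostgreSQL code: the
helper names (new_tree, mark, cluster_using, clustered) are made up
for illustration, and it assumes that the parent index is marked
before the partitions are processed and that each partition is
committed separately, which is one possible reading of the patch.

```python
# Toy model of pg_index.indisclustered for the partition tree in the example.
# Table and index names come from the example; every function here is
# hypothetical and does not mirror actual PostgreSQL internals.

# Mapping from a partitioned index to its counterpart on each partition.
CHILD_INDEX = {
    "i": {"p1": "i1", "p2": "i2"},
    "j": {"p1": "j1", "p2": "j2"},
}


def new_tree():
    """Return table -> {index name: indisclustered}, with nothing clustered."""
    return {
        "p": {"i": False, "j": False},
        "p1": {"i1": False, "j1": False},
        "p2": {"i2": False, "j2": False},
    }


def mark(tree, table, index):
    """Set indisclustered on one index and clear it on the table's others."""
    for name in tree[table]:
        tree[table][name] = (name == index)


def cluster_using(tree, parent_index, fail_on=None):
    """Model CLUSTER p USING parent_index, one transaction per partition.

    Assumption: the parent flag is switched first, then each partition is
    processed and committed on its own, so a failure leaves earlier work.
    """
    mark(tree, "p", parent_index)
    for part in ("p1", "p2"):
        if part == fail_on:
            raise RuntimeError(f"CLUSTER failed while processing {part}")
        mark(tree, part, CHILD_INDEX[parent_index][part])


def clustered(tree):
    """Return the sorted names of all indexes with indisclustered set."""
    return sorted(idx for t in tree.values() for idx, on in t.items() if on)


tree = new_tree()
cluster_using(tree, "i")                  # completes: i, i1, i2 are set
state_after_first = clustered(tree)

try:
    cluster_using(tree, "j", fail_on="p2")  # fails while processing p2
except RuntimeError:
    pass

print(clustered(tree))  # ['i2', 'j', 'j1']
```

The final state mixes j1 with i2, which is the inconsistency that a
later database-wide CLUSTER or a CLUSTER p without USING would have to
interpret.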