On Mon, Jun 15, 2020 at 09:33:05PM +0800, 李杰(慎追) wrote: > This is a good idea. > We should maintain the consistency of the entire partition table. > However, this is not a small change in the code. > May be we need to make a new design for DefineIndex function....
Indeed. I have looked at the patch set a bit and here is the related bit for CIC in 0001, meaning that you handle the basic index creation for a partition tree within one transaction for each: + /* + * CIC needs to mark a partitioned table as VALID, which itself + * requires setting READY, which is unset for CIC (even though + * it's meaningless for an index without storage). + */ + if (concurrent) + { + PopActiveSnapshot(); + CommitTransactionCommand(); + StartTransactionCommand(); + index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY); + + CommandCounterIncrement(); + index_set_state_flags(indexRelationId, INDEX_CREATE_SET_VALID); } I am afraid that this makes the error handling more complicated, with risks of having inconsistent partition trees. That's the point you raised. This one is going to need more thoughts. > But most importantly, if we use three steps to implement CIC, > we will spend more time than ordinary tables, especially in high > concurrency cases. To wait for one of partitions which the users to > use frequently, the creation of other partition indexes is delayed. > Is it worth doing this? CIC is an operation that exists while allowing read and writes to still happen in parallel, so that's fine IMO if it takes time. Now it is true that it may not scale well so we should be careful with the approach taken. Also, I think that the case of REINDEX would require less work overall because we already have some code in place to gather multiple indexes from one or more relations and work on these consistently, all at once. -- Michael
signature.asc
Description: PGP signature