Hi Robert, I have tried to address your comments in the V22 set of patches[1]. Please find my feedback inlined on your comments.
On Thu, Jun 15, 2017 at 1:21 AM, Robert Haas <robertmh...@gmail.com> wrote: > > - Needs to be rebased over b08df9cab777427fdafe633ca7b8abf29817aa55. > Rebased on master latest commit: ca793c59a51e94cedf8cbea5c29668bf8fa298f3 > - Still no documentation. > Yes, this is long pending, and I will make this is a priority to get it included in next set of my patches. - Should probably be merged with the patch to add default partitioning > for ranges. Beena is already rebasing her patch on my latest patches, so I think getting them merged here won't be an issue, mostly will be just like one more patch on top my patches. > Other stuff I noticed: > > - The regression tests don't seem to check that the scan-skipping > logic works as expected. We have regression tests for that case for > attaching regular partitions, and it seems like it would be worth > testing the default-partition case as well. > Added a test case for default in alter_table.sql. - check_default_allows_bound() assumes that if > canSkipPartConstraintValidation() fails for the default partition, it > will also fail for every subpartition of the default partition. That > is, once we commit to scanning one child partition, we're committed to > scanning them all. In practice, that's probably not a huge > limitation, but if it's not too much code, we could keep the top-level > check but also check each partitioning individually as we reach it, > and skip the scan for any individual partitions for which the > constraint can be proven. For example, suppose the top-level table is > list-partitioned with a partition for each of the most common values, > and then we range-partition the default partition. > I have tried to address this in patch 0007, please let me know your views on that patch. - The changes to the regression test results in 0004 make the error > messages slightly worse. The old message names the default partition, > whereas the new one does not. Maybe that's worth avoiding. The only way for this, I can think of to achieve this is to store the name of the default relation in AlteredTableInfo, currently I am using a flag for realizing if the scanned table is a default partition to throw specific error. But, IMO storing a string in AlteredTableInfo just for error purpose might be overkill. Your suggestions? > Specific comments: > > + * Also, invalidate the parent's and a sibling default partition's > relcache, > + * so that the next rebuild will load the new partition's info into > parent's > + * partition descriptor and default partition constraints(which are > dependent > + * on other partition bounds) are built anew. > > I find this a bit unclear, and it also repeats the comment further > down. Maybe something like: Also, invalidate the parent's relcache > entry, so that the next rebuild will load he new partition's info into > its partition descriptor. If there is a default partition, we must > invalidate its relcache entry as well. > > Done. > + /* > + * The default partition constraints depend upon the partition bounds > of > + * other partitions. Adding a new(or even removing existing) partition > + * would invalidate the default partition constraints. Invalidate the > + * default partition's relcache so that the constraints are built > anew and > + * any plans dependent on those constraints are invalidated as well. > + */ > > Here, I'd write: The partition constraint for the default partition > depends on the partition bounds of every other partition, so we must > invalidate the relcache entry for that partition every time a > partition is added or removed. > > Done. > + /* > + * Default partition cannot be added if there already > + * exists one. > + */ > + if (spec->is_default) > + { > + overlap = partition_bound_has_default(boundinfo); > + with = boundinfo->default_index; > + break; > + } > > To support default partitioning for range, this is going to have to be > moved above the switch rather than done inside of it. And there's > really no downside to putting it there. > > Done. > + * constraint, by *proving* that the existing constraints of the table > + * *imply* the given constraints. We include the table's check > constraints and > > Both the comma and the asterisks are unnecessary. > > Done. > + * Check whether all rows in the given table (scanRel) obey given > partition > > obey the given > > I think the larger comment block could be tightened up a bit, like > this: Check whether all rows in the given table obey the given > partition constraint; if so, it can be attached as a partition. We do > this by scanning the table (or all of its leaf partitions) row by row, > except when the existing constraints are sufficient to prove that the > new partitioning constraint must already hold. > > Done. > + /* Check if we can do away with having to scan the table being > attached. */ > > If possible, skip the validation scan. > > Fixed. > + * Set up to have the table be scanned to validate the partition > + * constraint If it's a partitioned table, we instead schedule its > leaf > + * partitions to be scanned. > > I suggest: Prepare to scan the default partition (or, if it is itself > partitioned, all of its leaf partitions). > > Done. > + int default_index; /* Index of the default partition if any; > -1 > + * if there isn't one */ > > "if any" is a bit redundant with "if there isn't one"; note the > phrasing of the preceding entry. > Done. > + /* > + * Skip if it's a partitioned table. Only RELKIND_RELATION > relations > + * (ie, leaf partitions) need to be scanned. > + */ > + if (part_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE || > + part_rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE) > > The comment talks about what must be included in our list of things to > scan, but the code tests for the things that can be excluded. I > suspect the comment has the right idea and the code should be adjusted > to match, but anyway they should be consistent. Also, the correct way > to punctuate i.e. is like this: (i.e. leaf partitions) You should have > a period after each letter, but no following comma. > > Done. > + * The default partition must be already having an > AccessExclusiveLock. > > I think we should instead change DefineRelation to open (rather than > just lock) the default partition and pass the Relation as an argument > here so that we need not reopen it. > > I have fixed this as a part of patch 0006. > + /* Construct const from datum */ > + val = makeConst(key->parttypid[0], > + key->parttypmod[0], > + key->parttypcoll[0], > + key->parttyplen[0], > + *boundinfo->datums[i], > + false, /* isnull */ > + key->parttypbyval[0] /* byval */ ); > > The /* byval */ comment looks a bit redundant, but I think this could > use a comment along the lines of: /* Only single-column list > partitioning is supported, so we only need to worry about the > partition key with index 0. */ And I'd also add an Assert() verifying > the the partition key has exactly 1 column, so that this breaks a bit > more obviously if someone removes that restriction in the future. > Removed the /* byval */ comment. The assert is taken care as part of commit 5efccc1cb43005a832776ed9158d2704fd976f8f. > + * Handle NULL partition key here if there's a null-accepting list > + * partition, else later it will be routed to the default > partition if > + * one exists. > > This isn't a great update of the existing comment -- it's drifted from > explaining the code to which it is immediately attached to a more > general discussion of NULL handling. I'd just say something like: If > this is a NULL, send it to the null-accepting partition. Otherwise, > route by searching the array of partition bounds. > > Done. > + if (tab->is_default_partition) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > + errmsg("updated partition constraint for > default partition would be violated by some row"))); > + else > + ereport(ERROR, > + (errcode(ERRCODE_CHECK_VIOLATION), > > While there's room for debate about the correct error code here, it's > hard for me to believe that it's correct to use one error code for the > is_default_partition case and a different error code the rest of the > time. > > Per discussion here[2], I had changed this error code, but as of now I have restored this to ERRCODE_CHECK_VIOLATION to be consistent with the error when non-default partition being attached has some existing row that violates partition constraints. Similarly, for consistency I have changed this in check_default_allows_bound() too. I agree that there is still a room for debate here after this change too, and also this change reverts the suggestion by Ashutosh. + * previously cached default partition constraints; those > constraints > + * won't stand correct after addition(or even removal) of a > partition. > > won't be correct after addition or removal > Done. > > + * allow any row that qualifies for this new partition. So, check > if > + * the existing data in the default partition satisfies this > *would be* > + * default partition constraint. > > check that the existing data in the default partition satisfies the > constraint as it will exist after adding this partition > Done. > > + * Need to take a lock on the default partition, refer comment for > locking > + * the default partition in DefineRelation(). > > I'd say: We must also lock the default partition, for the same reasons > explained in DefineRelation(). > > And similarly in the other places that refer to that same comment. > Done. > > + /* > + * In case of the default partition, the constraint is of the form > + * "!(result)" i.e. one of the following two forms: > + * 1. NOT ((keycol IS NULL) OR (keycol = ANY (arr))) > + * 2. NOT ((keycol IS NOT NULL) AND (keycol = ANY (arr))), where arr > is an > + * array of datums in boundinfo->datums. > + */ > > Does this survive pgindent? You might need to surround the comment > with dashes to preserve formatting. > > Yes, this din't survive pg_indent, but even adding dashes '--' did not make the deal(may be I misunderstood the workaround), I have instead added blank line in the bullets. I think it would be worth adding a little more text this comment, > something like this: Note that, in general, applying NOT to a > constraint expression doesn't necessarily invert the set of rows it > accepts, because NOT NULL is NULL. However, the partition constraints > we construct here never evaluate to NULL, so applying NOT works as > intended. > > Added. > + * Check whether default partition has a row that would fit the > partition > + * being attached by negating the partition constraint derived from > the > + * bounds. Since default partition is already part of the partitioned > + * table, we don't need to validate the constraints on the partitioned > + * table. > > Here again, I'd add to the end of the first sentence a parenthetical > note, like this: ...from the bounds (the partition constraint never > evaluates to NULL, so negating it like this is safe). > > Done. > I don't understand the second sentence. It seems to contradict the first > one. > Fixed, I removed the second sentence. > +extern List *get_default_part_validation_constraint(List > *new_part_constaints); > #endif /* PARTITION_H */ > > There should be a blank line after the last prototype and before the > #endif. > > +-- default partition table when it is being used in cahced plan. > > Typo. > Fixed. Thanks, Jeevan Ladhe Refs: [1] https://www.postgresql.org/message-id/CAOgcT0OARciE2X%2BU0rjSKp9VuC279dYcCGkc3nCWKhHQ1_m2rw%40mail.gmail.com [2] https://www.postgresql.org/message-id/CA%2BTgmobkFaptsmQiP94sbAKTtDKS6Azz%2BP4Bw1FxzmNrnyVa0w%40mail.gmail.com