On Tue, Nov 13, 2018 at 10:34:50AM +0900, Amit Langote wrote: > Hmm, Assert or elog is your call, but I'd learned that we prefer elog when > checking a value read from the disk?
Let's keep elog() then, which is consistent with the previous code. If there is any meaning in switching to an assert(), it could always be changed later, if that proves necessary. > Thank you, the new logic seems sane to me. > > I noticed a few things while going over the new patch. Thanks for going through it. > I'd accidentally ended up changing the mapping elements to mean that they > map canonical indexes of partitions to their original indexes, but your > rewrite has restored the original meaning, so the following comment that I > wrote is obsolete: > > + * Upon return from this function, *mapping is set to an array of > + * list_length(boundspecs) elements, each of which maps the canonical > + * index of a given partition to its 0-based position in the original list. > > It should be: "each of which maps the original index of a partition to its > canonical index." Indeed. Good point. I have missed that bit. > Maybe we can slightly rearrange this comment (along with fixing typos and > obsolete facts) as: > > /* > * For each partitioning method, we first convert the partition bounds f > * from their parser node representation to the internal representation, > * along with any additional preprocessing (such as de-duplicating range > * bounds). Resulting bound datums are then added to the 'datums' array > * in PartitionBoundInfo. For each datum added, an integer indicating the > * canonical partition index is added to the 'indexes' array. > * > * For each bound, we remember its partition's position (0-based) in the > * original list to later map it to the canonical index. > */ OK, that looks cleaner. > I'd proposed: > > /* > * Set the canonical value for null_index, if any. > * > * It's possible that the null-accepting partition has not been > * assigned an index yet, which could happen if such partition > * accepts only null and hence not handled in the above loop > * which only looked at non-null values. > */ No issues with that. So canonical index is used as a more common term, which seems fine. > I'd proposed: > > /* Set the canonical value for default_index, if any. */ And this one either, using a conditional at the end is more adapted anyway. And there are two occurences of that. Attached is an updated patch. Perhaps you are spotting something else? -- Michael
diff --git a/src/backend/partitioning/partbounds.c b/src/backend/partitioning/partbounds.c index c94f73aadc..70cee6862b 100644 --- a/src/backend/partitioning/partbounds.c +++ b/src/backend/partitioning/partbounds.c @@ -36,6 +36,61 @@ #include "utils/ruleutils.h" #include "utils/syscache.h" +/* + * When qsort'ing partition bounds after reading from the catalog, each bound + * is represented with one of the following structs. + */ + +/* One bound of a hash partition */ +typedef struct PartitionHashBound +{ + int modulus; + int remainder; + int index; +} PartitionHashBound; + +/* One value coming from some (index'th) list partition */ +typedef struct PartitionListValue +{ + int index; + Datum value; +} PartitionListValue; + +/* One bound of a range partition */ +typedef struct PartitionRangeBound +{ + int index; + Datum *datums; /* range bound datums */ + PartitionRangeDatumKind *kind; /* the kind of each datum */ + bool lower; /* this is the lower (vs upper) bound */ +} PartitionRangeBound; + +static int32 qsort_partition_hbound_cmp(const void *a, const void *b); +static int32 qsort_partition_list_value_cmp(const void *a, const void *b, + void *arg); +static int32 qsort_partition_rbound_cmp(const void *a, const void *b, + void *arg); +static PartitionBoundInfo create_hash_bounds(List *boundspecs, + PartitionKey key, + int **mapping); +static PartitionBoundInfo create_list_bounds(List *boundspecs, + PartitionKey key, + int **mapping); +static PartitionBoundInfo create_range_bounds(List *boundspecs, + PartitionKey key, + int **mapping); +static PartitionRangeBound *make_one_partition_rbound(PartitionKey key, int index, + List *datums, bool lower); +static int32 partition_hbound_cmp(int modulus1, int remainder1, int modulus2, + int remainder2); +static int32 partition_rbound_cmp(int partnatts, FmgrInfo *partsupfunc, + Oid *partcollation, Datum *datums1, + PartitionRangeDatumKind *kind1, bool lower1, + PartitionRangeBound *b2); +static int partition_range_bsearch(int partnatts, FmgrInfo *partsupfunc, + Oid *partcollation, + PartitionBoundInfo boundinfo, + PartitionRangeBound *probe, bool *is_equal); static int get_partition_bound_num_indexes(PartitionBoundInfo b); static Expr *make_partition_op_expr(PartitionKey key, int keynum, uint16 strategy, Expr *arg1, Expr *arg2); @@ -92,6 +147,522 @@ get_qual_from_partbound(Relation rel, Relation parent, return my_qual; } +/* + * partition_bounds_create + * Build a PartitionBoundInfo struct from a list of PartitionBoundSpec + * nodes + * + * This function creates a PartitionBoundInfo and fills the values of its + * various members based on the input list. Importantly, 'datums' array will + * contain Datum representation of individual bounds (possibly after + * de-duplication as in case of range bounds), sorted in a canonical order + * defined by qsort_partition_* functions of respective partitioning methods. + * 'indexes' array will contain as many elements as there are bounds (specific + * exceptions to this rule are listed in the function body), which represent + * the 0-based canonical positions of partitions. + * + * Upon return from this function, *mapping is set to an array of + * list_length(boundspecs) elements, each of which maps the original index of + * a partition to its canonical index. + * + * Note: All the memory allocated by this function, including that of the + * the returned PartitionBoundInfo and its members is allocated in the context + * that was active when the function was called. + */ +PartitionBoundInfo +partition_bounds_create(List *boundspecs, PartitionKey key, int **mapping) +{ + int nparts = list_length(boundspecs); + int i; + + Assert(nparts > 0); + + /* + * For each partitioning method, we first convert the partition bounds + * from their parser node representation to the internal representation, + * along with any additional preprocessing (such as de-duplicating range + * bounds). Resulting bound datums are then added to the 'datums' array + * in PartitionBoundInfo. For each datum added, an integer indicating the + * canonical partition index is added to the 'indexes' array. + * + * For each bound, we remember its partition's position (0-based) in the + * original list to later map it to the canonical index. + */ + + /* + * Initialize mapping array with invalid values, this is filled within + * each sub-routine below depending on the bound type. + */ + *mapping = (int *) palloc(sizeof(int) * nparts); + for (i = 0; i < nparts; i++) + (*mapping)[i] = -1; + + switch (key->strategy) + { + case PARTITION_STRATEGY_HASH: + return create_hash_bounds(boundspecs, key, mapping); + + case PARTITION_STRATEGY_LIST: + return create_list_bounds(boundspecs, key, mapping); + + case PARTITION_STRATEGY_RANGE: + return create_range_bounds(boundspecs, key, mapping); + + default: + elog(ERROR, "unexpected partition strategy: %d", + (int) key->strategy); + break; + } + + Assert(false); + return NULL; /* keep compiler quiet */ +} + +/* + * create_hash_bounds + * Create a PartitionBoundInfo for a hash partitioned table + */ +static PartitionBoundInfo +create_hash_bounds(List *boundspecs, PartitionKey key, int **mapping) +{ + PartitionBoundInfo boundinfo; + PartitionHashBound **hbounds = NULL; + ListCell *cell; + int i, + nparts = list_length(boundspecs); + int ndatums = 0; + int greatest_modulus; + + boundinfo = (PartitionBoundInfoData *) + palloc0(sizeof(PartitionBoundInfoData)); + boundinfo->strategy = key->strategy; + /* No special hash partitions. */ + boundinfo->null_index = -1; + boundinfo->default_index = -1; + + ndatums = nparts; + hbounds = (PartitionHashBound **) + palloc(nparts * sizeof(PartitionHashBound *)); + + /* Convert from node to the internal representation */ + i = 0; + foreach(cell, boundspecs) + { + PartitionBoundSpec *spec = castNode(PartitionBoundSpec, lfirst(cell)); + + if (spec->strategy != PARTITION_STRATEGY_HASH) + elog(ERROR, "invalid strategy in partition bound spec"); + + hbounds[i] = (PartitionHashBound *) palloc(sizeof(PartitionHashBound)); + hbounds[i]->modulus = spec->modulus; + hbounds[i]->remainder = spec->remainder; + hbounds[i]->index = i; + i++; + } + + /* Sort all the bounds in ascending order */ + qsort(hbounds, nparts, sizeof(PartitionHashBound *), + qsort_partition_hbound_cmp); + + /* After sorting, moduli are now stored in ascending order. */ + greatest_modulus = hbounds[ndatums - 1]->modulus; + + boundinfo->ndatums = ndatums; + boundinfo->datums = (Datum **) palloc0(ndatums * sizeof(Datum *)); + boundinfo->indexes = (int *) palloc(greatest_modulus * sizeof(int)); + for (i = 0; i < greatest_modulus; i++) + boundinfo->indexes[i] = -1; + + /* + * For hash partitioning, there are as many datums (modulus and remainder + * pairs) as there are partitions. Indexes are simply values ranging from + * 0 to (nparts - 1). + */ + for (i = 0; i < nparts; i++) + { + int modulus = hbounds[i]->modulus; + int remainder = hbounds[i]->remainder; + + boundinfo->datums[i] = (Datum *) palloc(2 * sizeof(Datum)); + boundinfo->datums[i][0] = Int32GetDatum(modulus); + boundinfo->datums[i][1] = Int32GetDatum(remainder); + + while (remainder < greatest_modulus) + { + /* overlap? */ + Assert(boundinfo->indexes[remainder] == -1); + boundinfo->indexes[remainder] = i; + remainder += modulus; + } + + (*mapping)[hbounds[i]->index] = i; + pfree(hbounds[i]); + } + pfree(hbounds); + + return boundinfo; +} + +/* + * create_list_bounds + * Create a PartitionBoundInfo for a list partitioned table + */ +static PartitionBoundInfo +create_list_bounds(List *boundspecs, PartitionKey key, int **mapping) +{ + PartitionBoundInfo boundinfo; + PartitionListValue **all_values = NULL; + ListCell *cell; + int i = 0; + int ndatums = 0; + int next_index = 0; + int default_index = -1; + int null_index = -1; + List *non_null_values = NIL; + + boundinfo = (PartitionBoundInfoData *) + palloc0(sizeof(PartitionBoundInfoData)); + boundinfo->strategy = key->strategy; + /* Will be set correctly below. */ + boundinfo->null_index = -1; + boundinfo->default_index = -1; + + /* Create a unified list of non-null values across all partitions. */ + foreach(cell, boundspecs) + { + PartitionBoundSpec *spec = castNode(PartitionBoundSpec, lfirst(cell)); + ListCell *c; + + if (spec->strategy != PARTITION_STRATEGY_LIST) + elog(ERROR, "invalid strategy in partition bound spec"); + + /* + * Note the index of the partition bound spec for the default + * partition. There's no datum to add to the list on non-null datums + * for this partition. + */ + if (spec->is_default) + { + default_index = i; + i++; + continue; + } + + foreach(c, spec->listdatums) + { + Const *val = castNode(Const, lfirst(c)); + PartitionListValue *list_value = NULL; + + if (!val->constisnull) + { + list_value = (PartitionListValue *) + palloc0(sizeof(PartitionListValue)); + list_value->index = i; + list_value->value = val->constvalue; + } + else + { + /* + * Never put a null into the values array, flag instead for + * the code further down below where we construct the actual + * relcache struct. + */ + if (null_index != -1) + elog(ERROR, "found null more than once"); + null_index = i; + } + + if (list_value) + non_null_values = lappend(non_null_values, list_value); + } + + i++; + } + + ndatums = list_length(non_null_values); + + /* + * Collect all list values in one array. Alongside the value, we also save + * the index of partition the value comes from. + */ + all_values = (PartitionListValue **) + palloc(ndatums * sizeof(PartitionListValue *)); + i = 0; + foreach(cell, non_null_values) + { + PartitionListValue *src = lfirst(cell); + + all_values[i] = (PartitionListValue *) + palloc(sizeof(PartitionListValue)); + all_values[i]->value = src->value; + all_values[i]->index = src->index; + i++; + } + + qsort_arg(all_values, ndatums, sizeof(PartitionListValue *), + qsort_partition_list_value_cmp, (void *) key); + + boundinfo->ndatums = ndatums; + boundinfo->datums = (Datum **) palloc0(ndatums * sizeof(Datum *)); + boundinfo->indexes = (int *) palloc(ndatums * sizeof(int)); + + /* + * Copy values. Canonical indexes are values ranging from 0 to (nparts - + * 1) assigned to each partition such that all datums of a given partition + * receive the same value. The value for a given partition is the index of + * that partition's smallest datum in the all_values[] array. + */ + for (i = 0; i < ndatums; i++) + { + int orig_index = all_values[i]->index; + + boundinfo->datums[i] = (Datum *) palloc(sizeof(Datum)); + boundinfo->datums[i][0] = datumCopy(all_values[i]->value, + key->parttypbyval[0], + key->parttyplen[0]); + + /* If the old index has no mapping, assign one */ + if ((*mapping)[orig_index] == -1) + (*mapping)[orig_index] = next_index++; + + boundinfo->indexes[i] = (*mapping)[orig_index]; + } + + /* + * Set the canonical value for null_index, if any. + * + * It is possible that the null-accepting partition has not been assigned + * an index yet, which could happen if such partition accepts only null + * and hence not handled in the above loop which only looked at non-null + * values. + */ + if (null_index != -1) + { + Assert(null_index >= 0); + if ((*mapping)[null_index] == -1) + (*mapping)[null_index] = next_index++; + boundinfo->null_index = (*mapping)[null_index]; + } + + /* Set the canonical value for default_index, if any. */ + if (default_index != -1) + { + /* + * The default partition accepts any value not specified in the lists + * of other partitions, hence it should not get mapped index while + * assigning those for non-null datums. + */ + Assert(default_index >= 0); + Assert((*mapping)[default_index] == -1); + (*mapping)[default_index] = next_index++; + boundinfo->default_index = (*mapping)[default_index]; + } + + /* All partition must now have been assigned canonical indexes. */ + Assert(next_index == list_length(boundspecs)); + return boundinfo; +} + +/* + * create_range_bounds + * Create a PartitionBoundInfo for a range partitioned table + */ +static PartitionBoundInfo +create_range_bounds(List *boundspecs, PartitionKey key, int **mapping) +{ + PartitionBoundInfo boundinfo; + PartitionRangeBound **rbounds = NULL; + PartitionRangeBound **all_bounds, + *prev; + ListCell *cell; + int i, + k, + nparts = list_length(boundspecs); + int ndatums = 0; + int default_index = -1; + int next_index = 0; + + boundinfo = (PartitionBoundInfoData *) + palloc0(sizeof(PartitionBoundInfoData)); + boundinfo->strategy = key->strategy; + /* There is no special null-accepting range partition. */ + boundinfo->null_index = -1; + /* Will be set correctly below. */ + boundinfo->default_index = -1; + + all_bounds = (PartitionRangeBound **) + palloc0(2 * nparts * sizeof(PartitionRangeBound *)); + + /* Create a unified list of range bounds across all the partitions. */ + i = ndatums = 0; + foreach(cell, boundspecs) + { + PartitionBoundSpec *spec = castNode(PartitionBoundSpec, lfirst(cell)); + PartitionRangeBound *lower, + *upper; + + if (spec->strategy != PARTITION_STRATEGY_RANGE) + elog(ERROR, "invalid strategy in partition bound spec"); + + /* + * Note the index of the partition bound spec for the default + * partition. There's no datum to add to the all_bounds array for + * this partition. + */ + if (spec->is_default) + { + default_index = i++; + continue; + } + + lower = make_one_partition_rbound(key, i, spec->lowerdatums, true); + upper = make_one_partition_rbound(key, i, spec->upperdatums, false); + all_bounds[ndatums++] = lower; + all_bounds[ndatums++] = upper; + i++; + } + + Assert(ndatums == nparts * 2 || + (default_index != -1 && ndatums == (nparts - 1) * 2)); + + /* Sort all the bounds in ascending order */ + qsort_arg(all_bounds, ndatums, + sizeof(PartitionRangeBound *), + qsort_partition_rbound_cmp, + (void *) key); + + /* Save distinct bounds from all_bounds into rbounds. */ + rbounds = (PartitionRangeBound **) + palloc(ndatums * sizeof(PartitionRangeBound *)); + k = 0; + prev = NULL; + for (i = 0; i < ndatums; i++) + { + PartitionRangeBound *cur = all_bounds[i]; + bool is_distinct = false; + int j; + + /* Is the current bound distinct from the previous one? */ + for (j = 0; j < key->partnatts; j++) + { + Datum cmpval; + + if (prev == NULL || cur->kind[j] != prev->kind[j]) + { + is_distinct = true; + break; + } + + /* + * If the bounds are both MINVALUE or MAXVALUE, stop now and treat + * them as equal, since any values after this point must be + * ignored. + */ + if (cur->kind[j] != PARTITION_RANGE_DATUM_VALUE) + break; + + cmpval = FunctionCall2Coll(&key->partsupfunc[j], + key->partcollation[j], + cur->datums[j], + prev->datums[j]); + if (DatumGetInt32(cmpval) != 0) + { + is_distinct = true; + break; + } + } + + /* + * Only if the bound is distinct save it into a temporary array, i.e, + * rbounds which is later copied into boundinfo datums array. + */ + if (is_distinct) + rbounds[k++] = all_bounds[i]; + + prev = cur; + } + + /* Update ndatums to hold the count of distinct datums. */ + ndatums = k; + + /* + * Add datums to boundinfo. Canonical indexes are values ranging from 0 + * to nparts - 1, assigned in that order to each partition's upper bound. + * For 'datums' elements that are lower bounds, there is -1 in the + * 'indexes' array to signify that no partition exists for the values less + * than such a bound and greater than or equal to the previous upper + * bound. + */ + boundinfo->ndatums = ndatums; + boundinfo->datums = (Datum **) palloc0(ndatums * sizeof(Datum *)); + boundinfo->kind = (PartitionRangeDatumKind **) + palloc(ndatums * + sizeof(PartitionRangeDatumKind *)); + + /* + * For range partitioning, an additional value of -1 is stored as the last + * element. + */ + boundinfo->indexes = (int *) palloc((ndatums + 1) * sizeof(int)); + + for (i = 0; i < ndatums; i++) + { + int j; + + boundinfo->datums[i] = (Datum *) palloc(key->partnatts * + sizeof(Datum)); + boundinfo->kind[i] = (PartitionRangeDatumKind *) + palloc(key->partnatts * + sizeof(PartitionRangeDatumKind)); + for (j = 0; j < key->partnatts; j++) + { + if (rbounds[i]->kind[j] == PARTITION_RANGE_DATUM_VALUE) + boundinfo->datums[i][j] = + datumCopy(rbounds[i]->datums[j], + key->parttypbyval[j], + key->parttyplen[j]); + boundinfo->kind[i][j] = rbounds[i]->kind[j]; + } + + /* + * There is no mapping for invalid indexes. + * + * Any lower bounds in the rbounds array have invalid indexes + * assigned, because the values between the previous bound (if there + * is one) and this (lower) bound are not part of the range of any + * existing partition. + */ + if (rbounds[i]->lower) + boundinfo->indexes[i] = -1; + else + { + int orig_index = rbounds[i]->index; + + /* If the old index has no mapping, assign one */ + if ((*mapping)[orig_index] == -1) + (*mapping)[orig_index] = next_index++; + + boundinfo->indexes[i] = (*mapping)[orig_index]; + } + } + + /* Set the canonical value for default_index, if any. */ + if (default_index != -1) + { + Assert(default_index >= 0 && (*mapping)[default_index] == -1); + (*mapping)[default_index] = next_index++; + boundinfo->default_index = (*mapping)[default_index]; + } + + /* The extra -1 element. */ + Assert(i == ndatums); + boundinfo->indexes[i] = -1; + + /* All partition must now have been assigned canonical indexes. */ + Assert(next_index == nparts); + return boundinfo; +} + /* * Are two partition bound collections logically equal? * @@ -763,7 +1334,7 @@ get_hash_partition_greatest_modulus(PartitionBoundInfo bound) * and a flag telling whether the bound is lower or not. Made into a function * because there are multiple sites that want to use this facility. */ -PartitionRangeBound * +static PartitionRangeBound * make_one_partition_rbound(PartitionKey key, int index, List *datums, bool lower) { PartitionRangeBound *bound; @@ -819,7 +1390,7 @@ make_one_partition_rbound(PartitionKey key, int index, List *datums, bool lower) * structure, which only stores the upper bound of a common boundary between * two contiguous partitions. */ -int32 +static int32 partition_rbound_cmp(int partnatts, FmgrInfo *partsupfunc, Oid *partcollation, Datum *datums1, PartitionRangeDatumKind *kind1, @@ -914,7 +1485,7 @@ partition_rbound_datum_cmp(FmgrInfo *partsupfunc, Oid *partcollation, * * Compares modulus first, then remainder if modulus is equal. */ -int32 +static int32 partition_hbound_cmp(int modulus1, int remainder1, int modulus2, int remainder2) { if (modulus1 < modulus2) @@ -977,7 +1548,7 @@ partition_list_bsearch(FmgrInfo *partsupfunc, Oid *partcollation, * *is_equal is set to true if the range bound at the returned index is equal * to the input range bound */ -int +static int partition_range_bsearch(int partnatts, FmgrInfo *partsupfunc, Oid *partcollation, PartitionBoundInfo boundinfo, @@ -1101,6 +1672,55 @@ partition_hash_bsearch(PartitionBoundInfo boundinfo, return lo; } +/* + * qsort_partition_hbound_cmp + * + * Hash bounds are sorted by modulus, then by remainder. + */ +static int32 +qsort_partition_hbound_cmp(const void *a, const void *b) +{ + PartitionHashBound *h1 = (*(PartitionHashBound *const *) a); + PartitionHashBound *h2 = (*(PartitionHashBound *const *) b); + + return partition_hbound_cmp(h1->modulus, h1->remainder, + h2->modulus, h2->remainder); +} + +/* + * qsort_partition_list_value_cmp + * + * Compare two list partition bound datums. + */ +static int32 +qsort_partition_list_value_cmp(const void *a, const void *b, void *arg) +{ + Datum val1 = (*(const PartitionListValue **) a)->value, + val2 = (*(const PartitionListValue **) b)->value; + PartitionKey key = (PartitionKey) arg; + + return DatumGetInt32(FunctionCall2Coll(&key->partsupfunc[0], + key->partcollation[0], + val1, val2)); +} + +/* + * qsort_partition_rbound_cmp + * + * Used when sorting range bounds across all range partitions. + */ +static int32 +qsort_partition_rbound_cmp(const void *a, const void *b, void *arg) +{ + PartitionRangeBound *b1 = (*(PartitionRangeBound *const *) a); + PartitionRangeBound *b2 = (*(PartitionRangeBound *const *) b); + PartitionKey key = (PartitionKey) arg; + + return partition_rbound_cmp(key->partnatts, key->partsupfunc, + key->partcollation, b1->datums, b1->kind, + b1->lower, b2); +} + /* * get_partition_bound_num_indexes * diff --git a/src/backend/utils/cache/partcache.c b/src/backend/utils/cache/partcache.c index 5757301d05..020bca1f54 100644 --- a/src/backend/utils/cache/partcache.c +++ b/src/backend/utils/cache/partcache.c @@ -38,12 +38,6 @@ static List *generate_partition_qual(Relation rel); -static int32 qsort_partition_hbound_cmp(const void *a, const void *b); -static int32 qsort_partition_list_value_cmp(const void *a, const void *b, - void *arg); -static int32 qsort_partition_rbound_cmp(const void *a, const void *b, - void *arg); - /* * RelationBuildPartitionKey @@ -260,36 +254,22 @@ RelationBuildPartitionKey(Relation relation) void RelationBuildPartitionDesc(Relation rel) { - List *inhoids, - *partoids; - Oid *oids = NULL; + PartitionDesc partdesc; + PartitionBoundInfo boundinfo; + List *inhoids; List *boundspecs = NIL; ListCell *cell; int i, nparts; PartitionKey key = RelationGetPartitionKey(rel); - PartitionDesc result; MemoryContext oldcxt; - - int ndatums = 0; - int default_index = -1; - - /* Hash partitioning specific */ - PartitionHashBound **hbounds = NULL; - - /* List partitioning specific */ - PartitionListValue **all_values = NULL; - int null_index = -1; - - /* Range partitioning specific */ - PartitionRangeBound **rbounds = NULL; + Oid *oids_orig; + int *mapping; /* Get partition oids from pg_inherits */ inhoids = find_inheritance_children(RelationGetRelid(rel), NoLock); /* Collect bound spec nodes in a list */ - i = 0; - partoids = NIL; foreach(cell, inhoids) { Oid inhrelid = lfirst_oid(cell); @@ -325,245 +305,10 @@ RelationBuildPartitionDesc(Relation rel) } boundspecs = lappend(boundspecs, boundspec); - partoids = lappend_oid(partoids, inhrelid); ReleaseSysCache(tuple); } - nparts = list_length(partoids); - - if (nparts > 0) - { - oids = (Oid *) palloc(nparts * sizeof(Oid)); - i = 0; - foreach(cell, partoids) - oids[i++] = lfirst_oid(cell); - - /* Convert from node to the internal representation */ - if (key->strategy == PARTITION_STRATEGY_HASH) - { - ndatums = nparts; - hbounds = (PartitionHashBound **) - palloc(nparts * sizeof(PartitionHashBound *)); - - i = 0; - foreach(cell, boundspecs) - { - PartitionBoundSpec *spec = castNode(PartitionBoundSpec, - lfirst(cell)); - - if (spec->strategy != PARTITION_STRATEGY_HASH) - elog(ERROR, "invalid strategy in partition bound spec"); - - hbounds[i] = (PartitionHashBound *) - palloc(sizeof(PartitionHashBound)); - - hbounds[i]->modulus = spec->modulus; - hbounds[i]->remainder = spec->remainder; - hbounds[i]->index = i; - i++; - } - - /* Sort all the bounds in ascending order */ - qsort(hbounds, nparts, sizeof(PartitionHashBound *), - qsort_partition_hbound_cmp); - } - else if (key->strategy == PARTITION_STRATEGY_LIST) - { - List *non_null_values = NIL; - - /* - * Create a unified list of non-null values across all partitions. - */ - i = 0; - null_index = -1; - foreach(cell, boundspecs) - { - PartitionBoundSpec *spec = castNode(PartitionBoundSpec, - lfirst(cell)); - ListCell *c; - - if (spec->strategy != PARTITION_STRATEGY_LIST) - elog(ERROR, "invalid strategy in partition bound spec"); - - /* - * Note the index of the partition bound spec for the default - * partition. There's no datum to add to the list of non-null - * datums for this partition. - */ - if (spec->is_default) - { - default_index = i; - i++; - continue; - } - - foreach(c, spec->listdatums) - { - Const *val = castNode(Const, lfirst(c)); - PartitionListValue *list_value = NULL; - - if (!val->constisnull) - { - list_value = (PartitionListValue *) - palloc0(sizeof(PartitionListValue)); - list_value->index = i; - list_value->value = val->constvalue; - } - else - { - /* - * Never put a null into the values array, flag - * instead for the code further down below where we - * construct the actual relcache struct. - */ - if (null_index != -1) - elog(ERROR, "found null more than once"); - null_index = i; - } - - if (list_value) - non_null_values = lappend(non_null_values, - list_value); - } - - i++; - } - - ndatums = list_length(non_null_values); - - /* - * Collect all list values in one array. Alongside the value, we - * also save the index of partition the value comes from. - */ - all_values = (PartitionListValue **) palloc(ndatums * - sizeof(PartitionListValue *)); - i = 0; - foreach(cell, non_null_values) - { - PartitionListValue *src = lfirst(cell); - - all_values[i] = (PartitionListValue *) - palloc(sizeof(PartitionListValue)); - all_values[i]->value = src->value; - all_values[i]->index = src->index; - i++; - } - - qsort_arg(all_values, ndatums, sizeof(PartitionListValue *), - qsort_partition_list_value_cmp, (void *) key); - } - else if (key->strategy == PARTITION_STRATEGY_RANGE) - { - int k; - PartitionRangeBound **all_bounds, - *prev; - - all_bounds = (PartitionRangeBound **) palloc0(2 * nparts * - sizeof(PartitionRangeBound *)); - - /* - * Create a unified list of range bounds across all the - * partitions. - */ - i = ndatums = 0; - foreach(cell, boundspecs) - { - PartitionBoundSpec *spec = castNode(PartitionBoundSpec, - lfirst(cell)); - PartitionRangeBound *lower, - *upper; - - if (spec->strategy != PARTITION_STRATEGY_RANGE) - elog(ERROR, "invalid strategy in partition bound spec"); - - /* - * Note the index of the partition bound spec for the default - * partition. There's no datum to add to the allbounds array - * for this partition. - */ - if (spec->is_default) - { - default_index = i++; - continue; - } - - lower = make_one_partition_rbound(key, i, spec->lowerdatums, - true); - upper = make_one_partition_rbound(key, i, spec->upperdatums, - false); - all_bounds[ndatums++] = lower; - all_bounds[ndatums++] = upper; - i++; - } - - Assert(ndatums == nparts * 2 || - (default_index != -1 && ndatums == (nparts - 1) * 2)); - - /* Sort all the bounds in ascending order */ - qsort_arg(all_bounds, ndatums, - sizeof(PartitionRangeBound *), - qsort_partition_rbound_cmp, - (void *) key); - - /* Save distinct bounds from all_bounds into rbounds. */ - rbounds = (PartitionRangeBound **) - palloc(ndatums * sizeof(PartitionRangeBound *)); - k = 0; - prev = NULL; - for (i = 0; i < ndatums; i++) - { - PartitionRangeBound *cur = all_bounds[i]; - bool is_distinct = false; - int j; - - /* Is the current bound distinct from the previous one? */ - for (j = 0; j < key->partnatts; j++) - { - Datum cmpval; - - if (prev == NULL || cur->kind[j] != prev->kind[j]) - { - is_distinct = true; - break; - } - - /* - * If the bounds are both MINVALUE or MAXVALUE, stop now - * and treat them as equal, since any values after this - * point must be ignored. - */ - if (cur->kind[j] != PARTITION_RANGE_DATUM_VALUE) - break; - - cmpval = FunctionCall2Coll(&key->partsupfunc[j], - key->partcollation[j], - cur->datums[j], - prev->datums[j]); - if (DatumGetInt32(cmpval) != 0) - { - is_distinct = true; - break; - } - } - - /* - * Only if the bound is distinct save it into a temporary - * array i.e. rbounds which is later copied into boundinfo - * datums array. - */ - if (is_distinct) - rbounds[k++] = all_bounds[i]; - - prev = cur; - } - - /* Update ndatums to hold the count of distinct datums. */ - ndatums = k; - } - else - elog(ERROR, "unexpected partition strategy: %d", - (int) key->strategy); - } + nparts = list_length(boundspecs); /* Now build the actual relcache partition descriptor */ rel->rd_pdcxt = AllocSetContextCreate(CacheMemoryContext, @@ -572,210 +317,41 @@ RelationBuildPartitionDesc(Relation rel) MemoryContextCopyAndSetIdentifier(rel->rd_pdcxt, RelationGetRelationName(rel)); oldcxt = MemoryContextSwitchTo(rel->rd_pdcxt); - - result = (PartitionDescData *) palloc0(sizeof(PartitionDescData)); - result->nparts = nparts; - if (nparts > 0) - { - PartitionBoundInfo boundinfo; - int *mapping; - int next_index = 0; - - result->oids = (Oid *) palloc0(nparts * sizeof(Oid)); - - boundinfo = (PartitionBoundInfoData *) - palloc0(sizeof(PartitionBoundInfoData)); - boundinfo->strategy = key->strategy; - boundinfo->default_index = -1; - boundinfo->ndatums = ndatums; - boundinfo->null_index = -1; - boundinfo->datums = (Datum **) palloc0(ndatums * sizeof(Datum *)); - - /* Initialize mapping array with invalid values */ - mapping = (int *) palloc(sizeof(int) * nparts); - for (i = 0; i < nparts; i++) - mapping[i] = -1; - - switch (key->strategy) - { - case PARTITION_STRATEGY_HASH: - { - /* Moduli are stored in ascending order */ - int greatest_modulus = hbounds[ndatums - 1]->modulus; - - boundinfo->indexes = (int *) palloc(greatest_modulus * - sizeof(int)); - - for (i = 0; i < greatest_modulus; i++) - boundinfo->indexes[i] = -1; - - for (i = 0; i < nparts; i++) - { - int modulus = hbounds[i]->modulus; - int remainder = hbounds[i]->remainder; - - boundinfo->datums[i] = (Datum *) palloc(2 * - sizeof(Datum)); - boundinfo->datums[i][0] = Int32GetDatum(modulus); - boundinfo->datums[i][1] = Int32GetDatum(remainder); - - while (remainder < greatest_modulus) - { - /* overlap? */ - Assert(boundinfo->indexes[remainder] == -1); - boundinfo->indexes[remainder] = i; - remainder += modulus; - } - - mapping[hbounds[i]->index] = i; - pfree(hbounds[i]); - } - pfree(hbounds); - break; - } - - case PARTITION_STRATEGY_LIST: - { - boundinfo->indexes = (int *) palloc(ndatums * sizeof(int)); - - /* - * Copy values. Indexes of individual values are mapped - * to canonical values so that they match for any two list - * partitioned tables with same number of partitions and - * same lists per partition. One way to canonicalize is - * to assign the index in all_values[] of the smallest - * value of each partition, as the index of all of the - * partition's values. - */ - for (i = 0; i < ndatums; i++) - { - boundinfo->datums[i] = (Datum *) palloc(sizeof(Datum)); - boundinfo->datums[i][0] = datumCopy(all_values[i]->value, - key->parttypbyval[0], - key->parttyplen[0]); - - /* If the old index has no mapping, assign one */ - if (mapping[all_values[i]->index] == -1) - mapping[all_values[i]->index] = next_index++; - - boundinfo->indexes[i] = mapping[all_values[i]->index]; - } - - /* - * If null-accepting partition has no mapped index yet, - * assign one. This could happen if such partition - * accepts only null and hence not covered in the above - * loop which only handled non-null values. - */ - if (null_index != -1) - { - Assert(null_index >= 0); - if (mapping[null_index] == -1) - mapping[null_index] = next_index++; - boundinfo->null_index = mapping[null_index]; - } - - /* Assign mapped index for the default partition. */ - if (default_index != -1) - { - /* - * The default partition accepts any value not - * specified in the lists of other partitions, hence - * it should not get mapped index while assigning - * those for non-null datums. - */ - Assert(default_index >= 0 && - mapping[default_index] == -1); - mapping[default_index] = next_index++; - boundinfo->default_index = mapping[default_index]; - } - - /* All partition must now have a valid mapping */ - Assert(next_index == nparts); - break; - } - - case PARTITION_STRATEGY_RANGE: - { - boundinfo->kind = (PartitionRangeDatumKind **) - palloc(ndatums * - sizeof(PartitionRangeDatumKind *)); - boundinfo->indexes = (int *) palloc((ndatums + 1) * - sizeof(int)); - - for (i = 0; i < ndatums; i++) - { - int j; - - boundinfo->datums[i] = (Datum *) palloc(key->partnatts * - sizeof(Datum)); - boundinfo->kind[i] = (PartitionRangeDatumKind *) - palloc(key->partnatts * - sizeof(PartitionRangeDatumKind)); - for (j = 0; j < key->partnatts; j++) - { - if (rbounds[i]->kind[j] == PARTITION_RANGE_DATUM_VALUE) - boundinfo->datums[i][j] = - datumCopy(rbounds[i]->datums[j], - key->parttypbyval[j], - key->parttyplen[j]); - boundinfo->kind[i][j] = rbounds[i]->kind[j]; - } - - /* - * There is no mapping for invalid indexes. - * - * Any lower bounds in the rbounds array have invalid - * indexes assigned, because the values between the - * previous bound (if there is one) and this (lower) - * bound are not part of the range of any existing - * partition. - */ - if (rbounds[i]->lower) - boundinfo->indexes[i] = -1; - else - { - int orig_index = rbounds[i]->index; - - /* If the old index has no mapping, assign one */ - if (mapping[orig_index] == -1) - mapping[orig_index] = next_index++; - - boundinfo->indexes[i] = mapping[orig_index]; - } - } - - /* Assign mapped index for the default partition. */ - if (default_index != -1) - { - Assert(default_index >= 0 && mapping[default_index] == -1); - mapping[default_index] = next_index++; - boundinfo->default_index = mapping[default_index]; - } - boundinfo->indexes[i] = -1; - break; - } - - default: - elog(ERROR, "unexpected partition strategy: %d", - (int) key->strategy); - } - - result->boundinfo = boundinfo; - - /* - * Now assign OIDs from the original array into mapped indexes of the - * result array. Order of OIDs in the former is defined by the - * catalog scan that retrieved them, whereas that in the latter is - * defined by canonicalized representation of the partition bounds. - */ - for (i = 0; i < nparts; i++) - result->oids[mapping[i]] = oids[i]; - pfree(mapping); - } + partdesc = (PartitionDescData *) palloc0(sizeof(PartitionDescData)); + partdesc->nparts = nparts; + /* oids and boundinfo are allocated below. */ MemoryContextSwitchTo(oldcxt); - rel->rd_partdesc = result; + + if (nparts == 0) + { + rel->rd_partdesc = partdesc; + return; + } + + /* First create the PartitionBoundInfo in caller's context. */ + boundinfo = partition_bounds_create(boundspecs, key, &mapping); + oids_orig = (Oid *) palloc(sizeof(Oid) * partdesc->nparts); + i = 0; + foreach(cell, inhoids) + oids_orig[i++] = lfirst_oid(cell); + + /* Now copy boundinfo and oids into partdesc. */ + oldcxt = MemoryContextSwitchTo(rel->rd_pdcxt); + partdesc->boundinfo = partition_bounds_copy(boundinfo, key); + partdesc->oids = (Oid *) palloc(partdesc->nparts * sizeof(Oid)); + + /* + * Now assign OIDs from the original array into mapped indexes of the + * result array. Order of OIDs in the former is defined by the catalog + * scan that retrieved them, whereas that in the latter is defined by + * canonicalized representation of the partition bounds. + */ + for (i = 0; i < partdesc->nparts; i++) + partdesc->oids[mapping[i]] = oids_orig[i]; + MemoryContextSwitchTo(oldcxt); + + rel->rd_partdesc = partdesc; } /* @@ -917,48 +493,3 @@ generate_partition_qual(Relation rel) return result; } - -/* - * qsort_partition_hbound_cmp - * - * We sort hash bounds by modulus, then by remainder. - */ -static int32 -qsort_partition_hbound_cmp(const void *a, const void *b) -{ - PartitionHashBound *h1 = (*(PartitionHashBound *const *) a); - PartitionHashBound *h2 = (*(PartitionHashBound *const *) b); - - return partition_hbound_cmp(h1->modulus, h1->remainder, - h2->modulus, h2->remainder); -} - -/* - * qsort_partition_list_value_cmp - * - * Compare two list partition bound datums - */ -static int32 -qsort_partition_list_value_cmp(const void *a, const void *b, void *arg) -{ - Datum val1 = (*(const PartitionListValue **) a)->value, - val2 = (*(const PartitionListValue **) b)->value; - PartitionKey key = (PartitionKey) arg; - - return DatumGetInt32(FunctionCall2Coll(&key->partsupfunc[0], - key->partcollation[0], - val1, val2)); -} - -/* Used when sorting range bounds across all range partitions */ -static int32 -qsort_partition_rbound_cmp(const void *a, const void *b, void *arg) -{ - PartitionRangeBound *b1 = (*(PartitionRangeBound *const *) a); - PartitionRangeBound *b2 = (*(PartitionRangeBound *const *) b); - PartitionKey key = (PartitionKey) arg; - - return partition_rbound_cmp(key->partnatts, key->partsupfunc, - key->partcollation, b1->datums, b1->kind, - b1->lower, b2); -} diff --git a/src/include/partitioning/partbounds.h b/src/include/partitioning/partbounds.h index c7535e32fc..7a697d1c0a 100644 --- a/src/include/partitioning/partbounds.h +++ b/src/include/partitioning/partbounds.h @@ -75,40 +75,14 @@ typedef struct PartitionBoundInfoData #define partition_bound_accepts_nulls(bi) ((bi)->null_index != -1) #define partition_bound_has_default(bi) ((bi)->default_index != -1) -/* - * When qsort'ing partition bounds after reading from the catalog, each bound - * is represented with one of the following structs. - */ - -/* One bound of a hash partition */ -typedef struct PartitionHashBound -{ - int modulus; - int remainder; - int index; -} PartitionHashBound; - -/* One value coming from some (index'th) list partition */ -typedef struct PartitionListValue -{ - int index; - Datum value; -} PartitionListValue; - -/* One bound of a range partition */ -typedef struct PartitionRangeBound -{ - int index; - Datum *datums; /* range bound datums */ - PartitionRangeDatumKind *kind; /* the kind of each datum */ - bool lower; /* this is the lower (vs upper) bound */ -} PartitionRangeBound; - extern int get_hash_partition_greatest_modulus(PartitionBoundInfo b); extern uint64 compute_partition_hash_value(int partnatts, FmgrInfo *partsupfunc, Datum *values, bool *isnull); extern List *get_qual_from_partbound(Relation rel, Relation parent, PartitionBoundSpec *spec); +extern PartitionBoundInfo partition_bounds_create(List *boundspecs, + PartitionKey key, + int **mapping); extern bool partition_bounds_equal(int partnatts, int16 *parttyplen, bool *parttypbyval, PartitionBoundInfo b1, PartitionBoundInfo b2); @@ -120,14 +94,6 @@ extern void check_default_partition_contents(Relation parent, Relation defaultRel, PartitionBoundSpec *new_spec); -extern PartitionRangeBound *make_one_partition_rbound(PartitionKey key, int index, - List *datums, bool lower); -extern int32 partition_hbound_cmp(int modulus1, int remainder1, int modulus2, - int remainder2); -extern int32 partition_rbound_cmp(int partnatts, FmgrInfo *partsupfunc, - Oid *partcollation, Datum *datums1, - PartitionRangeDatumKind *kind1, bool lower1, - PartitionRangeBound *b2); extern int32 partition_rbound_datum_cmp(FmgrInfo *partsupfunc, Oid *partcollation, Datum *rb_datums, PartitionRangeDatumKind *rb_kind, @@ -136,10 +102,6 @@ extern int partition_list_bsearch(FmgrInfo *partsupfunc, Oid *partcollation, PartitionBoundInfo boundinfo, Datum value, bool *is_equal); -extern int partition_range_bsearch(int partnatts, FmgrInfo *partsupfunc, - Oid *partcollation, - PartitionBoundInfo boundinfo, - PartitionRangeBound *probe, bool *is_equal); extern int partition_range_datum_bsearch(FmgrInfo *partsupfunc, Oid *partcollation, PartitionBoundInfo boundinfo,
signature.asc
Description: PGP signature