Hi, Thank your for taking a look.
On 2018/11/05 16:21, Michael Paquier wrote: > On Thu, Nov 01, 2018 at 01:03:00PM +0900, Amit Langote wrote: >> Done a few moments ago. :) > > From the file size this move is actually negative. From what I can see > partcache decreases to 400 lines, while partbounds increases to 3k > lines. Hmm, is that because partbounds.c contains more functionality? > There are a couple of things that this patch is doing: > 1) Move the functions comparing two bounds into partbounds.c. > 2) Remove the chunk in charge of building PartitionBoundData into > partbounds.c for each method: list, hash and values. > > From what I can see, this patch brings actually more confusion by doing > more things than just building all the PartitionBound structures as it > fills in each structure and then builds a mapping which is used to save > each partition OID into the correct mapping position. Wouldn't it move > on with a logic without this mapping so as the partition OIDs are > directly part of PartitionBound? It looks wrong to me to have > build_partition_boundinfo create not only partdesc->boundinfo but also > partdesc->oids, and the new routine is here to fill in data for the > former, not the latter. I think we can design the interface of partition_bounds_create such that it returns information needed to re-arrange OIDs to be in the canonical partition bound order, so the actual re-ordering of OIDs can be done by the caller itself. (It may not always be OIDs, could be something else like a struct to represent fake partitions.) The canonical partition bound order is basically partition bounds sorted in ascending order, or in some manner as defined by the qsort_partition_* functions of individual partitioning methods. We need to map the new canonical positions to the older ones by generating a map, which it only makes sense to do in partition_bounds_create. > The first phase building the bounds should switch to a switch/case like > the second phase. That bothered me too, so I looked at whether it'd be a good idea to have just one switch () block and put all the code for different partitioning methods into it. The resulting code seems more readable to me as one doesn't have to shuffle between looking at the first block that generates Datums from PartitionBoundSpecs and the second block that assigns them to the Datum array in PartitionBoundInfo, for each partitioning method respectively. > PartitionHashBound & friends can become structures local to > partbounds.c as they are used only there. Ah, I missed that; moved in. > To be more consistent with all the other routines, like > partition_bounds_equal/copy, wouldn't it be better to call the new > routine partition_bounds_build or partition_bounds_create? Agree about naming consistency; I went with partition_bounds_create. I've broken this down this into 3 patches for ease of review: 1. Local re-arrangement of the code in RelationBuildPartitionDesc such that the code that creates PartitionBoundInfo is clearly separated. I've also applied some of the comments above to that piece of code such as unifying all of that code in one switch() block. I also found an opportunity to simplify the code that maps canonical partition indexes to original partition indexes. 2. Re-arrangement local to partcache.c that moves out the separated code into partition_bounds_create() with the interface mentioned above. 3. Move partition_bounds_create to partbounds.c and make some functions that are not needed outside partbounds.c static to that file. Thanks, Amit
>From 65dec23fc1fa0c99b773c3c5254462acffb56c17 Mon Sep 17 00:00:00 2001 From: amit <amitlangot...@gmail.com> Date: Wed, 7 Nov 2018 13:52:43 +0900 Subject: [PATCH v2 1/3] Code re-arrangement in RelationBuildPartitionDesc Rearranged such that the code that creates PartitionBoundInfo is separated from the rest. Also, simplified the portion of that code that handles mapping of canonical partition indexes to the original indexes. --- src/backend/utils/cache/partcache.c | 941 +++++++++++++++++++----------------- 1 file changed, 486 insertions(+), 455 deletions(-) diff --git a/src/backend/utils/cache/partcache.c b/src/backend/utils/cache/partcache.c index 5757301d05..399ef82193 100644 --- a/src/backend/utils/cache/partcache.c +++ b/src/backend/utils/cache/partcache.c @@ -260,36 +260,25 @@ RelationBuildPartitionKey(Relation relation) void RelationBuildPartitionDesc(Relation rel) { - List *inhoids, - *partoids; - Oid *oids = NULL; + PartitionDesc partdesc; + PartitionBoundInfo boundinfo; + List *inhoids; List *boundspecs = NIL; ListCell *cell; int i, nparts; PartitionKey key = RelationGetPartitionKey(rel); - PartitionDesc result; MemoryContext oldcxt; - int ndatums = 0; + int next_index = 0; int default_index = -1; - - /* Hash partitioning specific */ - PartitionHashBound **hbounds = NULL; - - /* List partitioning specific */ - PartitionListValue **all_values = NULL; - int null_index = -1; - - /* Range partitioning specific */ - PartitionRangeBound **rbounds = NULL; + Oid *oids_orig; + int *mapping; /* Get partition oids from pg_inherits */ inhoids = find_inheritance_children(RelationGetRelid(rel), NoLock); /* Collect bound spec nodes in a list */ - i = 0; - partoids = NIL; foreach(cell, inhoids) { Oid inhrelid = lfirst_oid(cell); @@ -325,245 +314,10 @@ RelationBuildPartitionDesc(Relation rel) } boundspecs = lappend(boundspecs, boundspec); - partoids = lappend_oid(partoids, inhrelid); ReleaseSysCache(tuple); } - nparts = list_length(partoids); - - if (nparts > 0) - { - oids = (Oid *) palloc(nparts * sizeof(Oid)); - i = 0; - foreach(cell, partoids) - oids[i++] = lfirst_oid(cell); - - /* Convert from node to the internal representation */ - if (key->strategy == PARTITION_STRATEGY_HASH) - { - ndatums = nparts; - hbounds = (PartitionHashBound **) - palloc(nparts * sizeof(PartitionHashBound *)); - - i = 0; - foreach(cell, boundspecs) - { - PartitionBoundSpec *spec = castNode(PartitionBoundSpec, - lfirst(cell)); - - if (spec->strategy != PARTITION_STRATEGY_HASH) - elog(ERROR, "invalid strategy in partition bound spec"); - - hbounds[i] = (PartitionHashBound *) - palloc(sizeof(PartitionHashBound)); - - hbounds[i]->modulus = spec->modulus; - hbounds[i]->remainder = spec->remainder; - hbounds[i]->index = i; - i++; - } - - /* Sort all the bounds in ascending order */ - qsort(hbounds, nparts, sizeof(PartitionHashBound *), - qsort_partition_hbound_cmp); - } - else if (key->strategy == PARTITION_STRATEGY_LIST) - { - List *non_null_values = NIL; - - /* - * Create a unified list of non-null values across all partitions. - */ - i = 0; - null_index = -1; - foreach(cell, boundspecs) - { - PartitionBoundSpec *spec = castNode(PartitionBoundSpec, - lfirst(cell)); - ListCell *c; - - if (spec->strategy != PARTITION_STRATEGY_LIST) - elog(ERROR, "invalid strategy in partition bound spec"); - - /* - * Note the index of the partition bound spec for the default - * partition. There's no datum to add to the list of non-null - * datums for this partition. - */ - if (spec->is_default) - { - default_index = i; - i++; - continue; - } - - foreach(c, spec->listdatums) - { - Const *val = castNode(Const, lfirst(c)); - PartitionListValue *list_value = NULL; - - if (!val->constisnull) - { - list_value = (PartitionListValue *) - palloc0(sizeof(PartitionListValue)); - list_value->index = i; - list_value->value = val->constvalue; - } - else - { - /* - * Never put a null into the values array, flag - * instead for the code further down below where we - * construct the actual relcache struct. - */ - if (null_index != -1) - elog(ERROR, "found null more than once"); - null_index = i; - } - - if (list_value) - non_null_values = lappend(non_null_values, - list_value); - } - - i++; - } - - ndatums = list_length(non_null_values); - - /* - * Collect all list values in one array. Alongside the value, we - * also save the index of partition the value comes from. - */ - all_values = (PartitionListValue **) palloc(ndatums * - sizeof(PartitionListValue *)); - i = 0; - foreach(cell, non_null_values) - { - PartitionListValue *src = lfirst(cell); - - all_values[i] = (PartitionListValue *) - palloc(sizeof(PartitionListValue)); - all_values[i]->value = src->value; - all_values[i]->index = src->index; - i++; - } - - qsort_arg(all_values, ndatums, sizeof(PartitionListValue *), - qsort_partition_list_value_cmp, (void *) key); - } - else if (key->strategy == PARTITION_STRATEGY_RANGE) - { - int k; - PartitionRangeBound **all_bounds, - *prev; - - all_bounds = (PartitionRangeBound **) palloc0(2 * nparts * - sizeof(PartitionRangeBound *)); - - /* - * Create a unified list of range bounds across all the - * partitions. - */ - i = ndatums = 0; - foreach(cell, boundspecs) - { - PartitionBoundSpec *spec = castNode(PartitionBoundSpec, - lfirst(cell)); - PartitionRangeBound *lower, - *upper; - - if (spec->strategy != PARTITION_STRATEGY_RANGE) - elog(ERROR, "invalid strategy in partition bound spec"); - - /* - * Note the index of the partition bound spec for the default - * partition. There's no datum to add to the allbounds array - * for this partition. - */ - if (spec->is_default) - { - default_index = i++; - continue; - } - - lower = make_one_partition_rbound(key, i, spec->lowerdatums, - true); - upper = make_one_partition_rbound(key, i, spec->upperdatums, - false); - all_bounds[ndatums++] = lower; - all_bounds[ndatums++] = upper; - i++; - } - - Assert(ndatums == nparts * 2 || - (default_index != -1 && ndatums == (nparts - 1) * 2)); - - /* Sort all the bounds in ascending order */ - qsort_arg(all_bounds, ndatums, - sizeof(PartitionRangeBound *), - qsort_partition_rbound_cmp, - (void *) key); - - /* Save distinct bounds from all_bounds into rbounds. */ - rbounds = (PartitionRangeBound **) - palloc(ndatums * sizeof(PartitionRangeBound *)); - k = 0; - prev = NULL; - for (i = 0; i < ndatums; i++) - { - PartitionRangeBound *cur = all_bounds[i]; - bool is_distinct = false; - int j; - - /* Is the current bound distinct from the previous one? */ - for (j = 0; j < key->partnatts; j++) - { - Datum cmpval; - - if (prev == NULL || cur->kind[j] != prev->kind[j]) - { - is_distinct = true; - break; - } - - /* - * If the bounds are both MINVALUE or MAXVALUE, stop now - * and treat them as equal, since any values after this - * point must be ignored. - */ - if (cur->kind[j] != PARTITION_RANGE_DATUM_VALUE) - break; - - cmpval = FunctionCall2Coll(&key->partsupfunc[j], - key->partcollation[j], - cur->datums[j], - prev->datums[j]); - if (DatumGetInt32(cmpval) != 0) - { - is_distinct = true; - break; - } - } - - /* - * Only if the bound is distinct save it into a temporary - * array i.e. rbounds which is later copied into boundinfo - * datums array. - */ - if (is_distinct) - rbounds[k++] = all_bounds[i]; - - prev = cur; - } - - /* Update ndatums to hold the count of distinct datums. */ - ndatums = k; - } - else - elog(ERROR, "unexpected partition strategy: %d", - (int) key->strategy); - } + nparts = list_length(boundspecs); /* Now build the actual relcache partition descriptor */ rel->rd_pdcxt = AllocSetContextCreate(CacheMemoryContext, @@ -572,210 +326,487 @@ RelationBuildPartitionDesc(Relation rel) MemoryContextCopyAndSetIdentifier(rel->rd_pdcxt, RelationGetRelationName(rel)); oldcxt = MemoryContextSwitchTo(rel->rd_pdcxt); - - result = (PartitionDescData *) palloc0(sizeof(PartitionDescData)); - result->nparts = nparts; - if (nparts > 0) - { - PartitionBoundInfo boundinfo; - int *mapping; - int next_index = 0; - - result->oids = (Oid *) palloc0(nparts * sizeof(Oid)); - - boundinfo = (PartitionBoundInfoData *) - palloc0(sizeof(PartitionBoundInfoData)); - boundinfo->strategy = key->strategy; - boundinfo->default_index = -1; - boundinfo->ndatums = ndatums; - boundinfo->null_index = -1; - boundinfo->datums = (Datum **) palloc0(ndatums * sizeof(Datum *)); - - /* Initialize mapping array with invalid values */ - mapping = (int *) palloc(sizeof(int) * nparts); - for (i = 0; i < nparts; i++) - mapping[i] = -1; - - switch (key->strategy) - { - case PARTITION_STRATEGY_HASH: - { - /* Moduli are stored in ascending order */ - int greatest_modulus = hbounds[ndatums - 1]->modulus; - - boundinfo->indexes = (int *) palloc(greatest_modulus * - sizeof(int)); - - for (i = 0; i < greatest_modulus; i++) - boundinfo->indexes[i] = -1; - - for (i = 0; i < nparts; i++) - { - int modulus = hbounds[i]->modulus; - int remainder = hbounds[i]->remainder; - - boundinfo->datums[i] = (Datum *) palloc(2 * - sizeof(Datum)); - boundinfo->datums[i][0] = Int32GetDatum(modulus); - boundinfo->datums[i][1] = Int32GetDatum(remainder); - - while (remainder < greatest_modulus) - { - /* overlap? */ - Assert(boundinfo->indexes[remainder] == -1); - boundinfo->indexes[remainder] = i; - remainder += modulus; - } - - mapping[hbounds[i]->index] = i; - pfree(hbounds[i]); - } - pfree(hbounds); - break; - } - - case PARTITION_STRATEGY_LIST: - { - boundinfo->indexes = (int *) palloc(ndatums * sizeof(int)); - - /* - * Copy values. Indexes of individual values are mapped - * to canonical values so that they match for any two list - * partitioned tables with same number of partitions and - * same lists per partition. One way to canonicalize is - * to assign the index in all_values[] of the smallest - * value of each partition, as the index of all of the - * partition's values. - */ - for (i = 0; i < ndatums; i++) - { - boundinfo->datums[i] = (Datum *) palloc(sizeof(Datum)); - boundinfo->datums[i][0] = datumCopy(all_values[i]->value, - key->parttypbyval[0], - key->parttyplen[0]); - - /* If the old index has no mapping, assign one */ - if (mapping[all_values[i]->index] == -1) - mapping[all_values[i]->index] = next_index++; - - boundinfo->indexes[i] = mapping[all_values[i]->index]; - } - - /* - * If null-accepting partition has no mapped index yet, - * assign one. This could happen if such partition - * accepts only null and hence not covered in the above - * loop which only handled non-null values. - */ - if (null_index != -1) - { - Assert(null_index >= 0); - if (mapping[null_index] == -1) - mapping[null_index] = next_index++; - boundinfo->null_index = mapping[null_index]; - } - - /* Assign mapped index for the default partition. */ - if (default_index != -1) - { - /* - * The default partition accepts any value not - * specified in the lists of other partitions, hence - * it should not get mapped index while assigning - * those for non-null datums. - */ - Assert(default_index >= 0 && - mapping[default_index] == -1); - mapping[default_index] = next_index++; - boundinfo->default_index = mapping[default_index]; - } - - /* All partition must now have a valid mapping */ - Assert(next_index == nparts); - break; - } - - case PARTITION_STRATEGY_RANGE: - { - boundinfo->kind = (PartitionRangeDatumKind **) - palloc(ndatums * - sizeof(PartitionRangeDatumKind *)); - boundinfo->indexes = (int *) palloc((ndatums + 1) * - sizeof(int)); - - for (i = 0; i < ndatums; i++) - { - int j; - - boundinfo->datums[i] = (Datum *) palloc(key->partnatts * - sizeof(Datum)); - boundinfo->kind[i] = (PartitionRangeDatumKind *) - palloc(key->partnatts * - sizeof(PartitionRangeDatumKind)); - for (j = 0; j < key->partnatts; j++) - { - if (rbounds[i]->kind[j] == PARTITION_RANGE_DATUM_VALUE) - boundinfo->datums[i][j] = - datumCopy(rbounds[i]->datums[j], - key->parttypbyval[j], - key->parttyplen[j]); - boundinfo->kind[i][j] = rbounds[i]->kind[j]; - } - - /* - * There is no mapping for invalid indexes. - * - * Any lower bounds in the rbounds array have invalid - * indexes assigned, because the values between the - * previous bound (if there is one) and this (lower) - * bound are not part of the range of any existing - * partition. - */ - if (rbounds[i]->lower) - boundinfo->indexes[i] = -1; - else - { - int orig_index = rbounds[i]->index; - - /* If the old index has no mapping, assign one */ - if (mapping[orig_index] == -1) - mapping[orig_index] = next_index++; - - boundinfo->indexes[i] = mapping[orig_index]; - } - } - - /* Assign mapped index for the default partition. */ - if (default_index != -1) - { - Assert(default_index >= 0 && mapping[default_index] == -1); - mapping[default_index] = next_index++; - boundinfo->default_index = mapping[default_index]; - } - boundinfo->indexes[i] = -1; - break; - } - - default: - elog(ERROR, "unexpected partition strategy: %d", - (int) key->strategy); - } - - result->boundinfo = boundinfo; - - /* - * Now assign OIDs from the original array into mapped indexes of the - * result array. Order of OIDs in the former is defined by the - * catalog scan that retrieved them, whereas that in the latter is - * defined by canonicalized representation of the partition bounds. - */ - for (i = 0; i < nparts; i++) - result->oids[mapping[i]] = oids[i]; - pfree(mapping); - } + partdesc = (PartitionDescData *) palloc0(sizeof(PartitionDescData)); + partdesc->nparts = nparts; + /* oids and boundinfo are allocated below. */ MemoryContextSwitchTo(oldcxt); - rel->rd_partdesc = result; + + if (nparts == 0) + { + rel->rd_partdesc = partdesc; + return; + } + + /* + * For each partitioning method, we first convert the partition bounds + * from their parser node representation to the internal representation, + * along with any additional preprocessing (such performing de-duplication + * on range bounds). For each bound, we remember its partition's position + * (0-based) in the original list, so that we can add it to the mapping + * array. + * + * Resulting bound datums are then added to the 'datums' array in + * PartitionBoundInfo. For each datum added, an integer indicating the + * canonical partition index is added to the 'indexes' array. + */ + boundinfo = (PartitionBoundInfoData *) + palloc0(sizeof(PartitionBoundInfoData)); + boundinfo->strategy = key->strategy; + boundinfo->null_index = -1; + boundinfo->default_index = -1; + mapping = (int *) palloc(sizeof(int) * nparts); + switch (key->strategy) + { + case PARTITION_STRATEGY_HASH: + { + PartitionHashBound **hbounds = NULL; + int greatest_modulus; + + ndatums = nparts; + hbounds = (PartitionHashBound **) + palloc(nparts * sizeof(PartitionHashBound *)); + + i = 0; + foreach(cell, boundspecs) + { + PartitionBoundSpec *spec = castNode(PartitionBoundSpec, + lfirst(cell)); + + if (spec->strategy != PARTITION_STRATEGY_HASH) + elog(ERROR, "invalid strategy in partition bound spec"); + + hbounds[i] = (PartitionHashBound *) + palloc(sizeof(PartitionHashBound)); + + hbounds[i]->modulus = spec->modulus; + hbounds[i]->remainder = spec->remainder; + hbounds[i]->index = i; + i++; + } + + /* Sort all the bounds in ascending order */ + qsort(hbounds, nparts, sizeof(PartitionHashBound *), + qsort_partition_hbound_cmp); + + /* Moduli are stored in ascending order */ + greatest_modulus = hbounds[ndatums - 1]->modulus; + + boundinfo->ndatums = ndatums; + boundinfo->datums = (Datum **) palloc0(ndatums * + sizeof(Datum *)); + boundinfo->indexes = (int *) palloc(greatest_modulus * + sizeof(int)); + + for (i = 0; i < greatest_modulus; i++) + boundinfo->indexes[i] = -1; + + /* + * For hash partitioning, there are as many datums (modulus and + * remainder pairs) as there are partitions. Indexes are + * simply values ranging from 0 to npart - 1. + */ + for (i = 0; i < nparts; i++) + { + int orig_index = hbounds[i]->index; + int modulus = hbounds[i]->modulus; + int remainder = hbounds[i]->remainder; + + boundinfo->datums[i] = (Datum *) palloc(2 * + sizeof(Datum)); + boundinfo->datums[i][0] = Int32GetDatum(modulus); + boundinfo->datums[i][1] = Int32GetDatum(remainder); + + while (remainder < greatest_modulus) + { + /* overlap? */ + Assert(boundinfo->indexes[remainder] == -1); + boundinfo->indexes[remainder] = i; + remainder += modulus; + } + + /* Add the mapping element. */ + mapping[i] = orig_index; + + pfree(hbounds[i]); + } + pfree(hbounds); + break; + } + + case PARTITION_STRATEGY_LIST: + { + PartitionListValue **all_values = NULL; + int null_index = -1; + List *non_null_values = NIL; + int *listpart_canon_idx; + + /* + * Create a unified list of non-null values across all + * partitions. + */ + i = 0; + foreach(cell, boundspecs) + { + PartitionBoundSpec *spec = castNode(PartitionBoundSpec, + lfirst(cell)); + ListCell *c; + + if (spec->strategy != PARTITION_STRATEGY_LIST) + elog(ERROR, "invalid strategy in partition bound spec"); + + /* + * Note the index of the partition bound spec for the + * default partition. There's no datum to add to the list + * of non-null datums for this partition. + */ + if (spec->is_default) + { + default_index = i; + i++; + continue; + } + + foreach(c, spec->listdatums) + { + Const *val = castNode(Const, lfirst(c)); + PartitionListValue *list_value = NULL; + + if (!val->constisnull) + { + list_value = (PartitionListValue *) + palloc0(sizeof(PartitionListValue)); + list_value->index = i; + list_value->value = val->constvalue; + } + else + { + /* + * Never put a null into the values array, flag + * instead for the code further down below where + * we construct the actual relcache struct. + */ + if (null_index != -1) + elog(ERROR, "found null more than once"); + null_index = i; + } + + if (list_value) + non_null_values = lappend(non_null_values, + list_value); + } + + i++; + } + + ndatums = list_length(non_null_values); + + /* + * Collect all list values in one array. Alongside the value, + * we also save the index of partition the value comes from. + */ + all_values = (PartitionListValue **) palloc(ndatums * + sizeof(PartitionListValue *)); + i = 0; + foreach(cell, non_null_values) + { + PartitionListValue *src = lfirst(cell); + + all_values[i] = (PartitionListValue *) + palloc(sizeof(PartitionListValue)); + all_values[i]->value = src->value; + all_values[i]->index = src->index; + i++; + } + + qsort_arg(all_values, ndatums, sizeof(PartitionListValue *), + qsort_partition_list_value_cmp, (void *) key); + + boundinfo->ndatums = ndatums; + boundinfo->datums = (Datum **) palloc0(ndatums * + sizeof(Datum *)); + boundinfo->indexes = (int *) palloc(ndatums * sizeof(int)); + + /* + * Copy values. Indexes are values ranging from 0 to + * npart - 1, assigned to each partition such that all datums + * of a given partition receive the same value. The value + * for a given partition is the index of that partition's + * smallest datum in the all_values[] array. We keep track + * of whether a given partition has been assigned an index + * yet using the 'listpart_canon_idx' mapping array. + * + * Initialize listpart_canon_idx elements to -1 to indicate + * that an index hasn't been assigned yet. + */ + listpart_canon_idx = (int *) palloc(sizeof(int) * nparts); + memset(listpart_canon_idx, -1, sizeof(int) * nparts); + for (i = 0; i < ndatums; i++) + { + int orig_index = all_values[i]->index; + + boundinfo->datums[i] = (Datum *) palloc(sizeof(Datum)); + boundinfo->datums[i][0] = datumCopy(all_values[i]->value, + key->parttypbyval[0], + key->parttyplen[0]); + + /* + * Add the mapping element, if not already done for this + * partition. + */ + if (listpart_canon_idx[orig_index] == -1) + { + /* Add the mapping element. */ + mapping[next_index] = orig_index; + listpart_canon_idx[orig_index] = next_index++; + } + + boundinfo->indexes[i] = listpart_canon_idx[orig_index]; + } + + /* + * Set null_index, if any. + * + * It's possible that the null-accepting partition has not been + * assigned an index yet, which could happen if such partition + * accepts only null and hence not handled in the above loop + * which only looked at non-null values. + */ + if (null_index != -1) + { + Assert(null_index >= 0); + if (listpart_canon_idx[null_index] == -1) + { + /* Add the mapping element. */ + mapping[next_index] = null_index; + listpart_canon_idx[null_index] = next_index++; + } + boundinfo->null_index = listpart_canon_idx[null_index]; + } + + /* Set default_index, if any. */ + if (default_index != -1) + { + /* + * The default partition accepts any value not + * specified in the lists of other partitions, hence + * it should not have been assigned an index yet. + */ + Assert(default_index >= 0); + Assert(listpart_canon_idx[default_index] == -1); + /* Add the mapping element. */ + mapping[next_index] = default_index; + boundinfo->default_index = next_index++; + } + + /* All partition must now have a valid mapping */ + Assert(next_index == nparts); + break; + } + + case PARTITION_STRATEGY_RANGE: + { + PartitionRangeBound **rbounds = NULL; + PartitionRangeBound **all_bounds, + *prev; + int k; + + all_bounds = (PartitionRangeBound **) palloc0(2 * nparts * + sizeof(PartitionRangeBound *)); + + /* Create a unified list of range bounds across all the partitions. */ + i = ndatums = 0; + foreach(cell, boundspecs) + { + PartitionBoundSpec *spec = castNode(PartitionBoundSpec, + lfirst(cell)); + PartitionRangeBound *lower, + *upper; + + if (spec->strategy != PARTITION_STRATEGY_RANGE) + elog(ERROR, "invalid strategy in partition bound spec"); + + /* + * Note the index of the partition bound spec for the + * default partition. There's no datum to add to the + * allbounds array for this partition. + */ + if (spec->is_default) + { + default_index = i++; + continue; + } + + lower = make_one_partition_rbound(key, i, + spec->lowerdatums, + true); + upper = make_one_partition_rbound(key, i, + spec->upperdatums, + false); + all_bounds[ndatums++] = lower; + all_bounds[ndatums++] = upper; + i++; + } + + Assert(ndatums == nparts * 2 || + (default_index != -1 && ndatums == (nparts - 1) * 2)); + + /* Sort all the bounds in ascending order */ + qsort_arg(all_bounds, ndatums, + sizeof(PartitionRangeBound *), + qsort_partition_rbound_cmp, + (void *) key); + + /* + * De-duplicate: Save distinct bounds from 'all_bounds' into + * 'rbounds'. In many cases, lower bound of a partition + * matches exactly with the upper of the previous partition, + * in which case, we only store one. + */ + rbounds = (PartitionRangeBound **) + palloc(ndatums * sizeof(PartitionRangeBound *)); + k = 0; + prev = NULL; + for (i = 0; i < ndatums; i++) + { + PartitionRangeBound *cur = all_bounds[i]; + bool is_distinct = false; + int j; + + /* Is the current bound distinct from the previous one? */ + for (j = 0; j < key->partnatts; j++) + { + Datum cmpval; + + if (prev == NULL || cur->kind[j] != prev->kind[j]) + { + is_distinct = true; + break; + } + + /* + * If the bounds are both MINVALUE or MAXVALUE, stop + * now and treat them as equal, since any values after + * this point must be ignored. + */ + if (cur->kind[j] != PARTITION_RANGE_DATUM_VALUE) + break; + + cmpval = FunctionCall2Coll(&key->partsupfunc[j], + key->partcollation[j], + cur->datums[j], + prev->datums[j]); + if (DatumGetInt32(cmpval) != 0) + { + is_distinct = true; + break; + } + } + + /* + * Only if the bound is distinct save it into a temporary + * array i.e. rbounds which is later copied into boundinfo + * datums array. + */ + if (is_distinct) + rbounds[k++] = all_bounds[i]; + + prev = cur; + } + + /* Update ndatums to hold the count of distinct datums. */ + ndatums = k; + + /* + * Add datums to boundinfo. Indexes are values ranging from + * 0 to nparts - 1, assigned in that order to each partition's + * upper bound. For 'datums' elements that are lower bounds, + * there is -1 in 'indexes' array to signify that no partition + * exists for the values less than such a bound and greater + * than or equal to the previous upper bound. + */ + boundinfo->ndatums = ndatums; + boundinfo->datums = (Datum **) palloc0(ndatums * + sizeof(Datum *)); + boundinfo->kind = (PartitionRangeDatumKind **) + palloc(ndatums * + sizeof(PartitionRangeDatumKind *)); + /* + * For range partitioning, an additional value of -1 is stored + * as the last element. + */ + boundinfo->indexes = (int *) palloc((ndatums + 1) * + sizeof(int)); + + for (i = 0; i < ndatums; i++) + { + int j; + + boundinfo->datums[i] = (Datum *) palloc(key->partnatts * + sizeof(Datum)); + boundinfo->kind[i] = (PartitionRangeDatumKind *) + palloc(key->partnatts * + sizeof(PartitionRangeDatumKind)); + for (j = 0; j < key->partnatts; j++) + { + if (rbounds[i]->kind[j] == PARTITION_RANGE_DATUM_VALUE) + boundinfo->datums[i][j] = + datumCopy(rbounds[i]->datums[j], + key->parttypbyval[j], + key->parttyplen[j]); + boundinfo->kind[i][j] = rbounds[i]->kind[j]; + } + + /* Assign -1 to lower bounds as described above. */ + if (rbounds[i]->lower) + boundinfo->indexes[i] = -1; + else + { + int orig_index = rbounds[i]->index; + + /* Add the mapping element. */ + mapping[next_index] = orig_index; + + boundinfo->indexes[i] = next_index++; + } + } + + /* Set default_index, if any. */ + if (default_index != -1) + { + Assert(default_index >= 0); + /* Add the mapping element. */ + mapping[next_index] = default_index; + boundinfo->default_index = next_index++; + } + boundinfo->indexes[i] = -1; + + /* All partition must now have a valid mapping */ + Assert(next_index == nparts); + break; + } + + default: + elog(ERROR, "unexpected partition strategy: %d", + (int) key->strategy); + break; + } + + oids_orig = (Oid *) palloc(sizeof(Oid) * partdesc->nparts); + i = 0; + foreach(cell, inhoids) + oids_orig[i++] = lfirst_oid(cell); + + /* Now copy boundinfo and oids into partdesc. */ + oldcxt = MemoryContextSwitchTo(rel->rd_pdcxt); + partdesc->boundinfo = partition_bounds_copy(boundinfo, key); + partdesc->oids = (Oid *) palloc(partdesc->nparts * sizeof(Oid)); + /* oids are copied in the canonical partition bound order. */ + for (i = 0; i < partdesc->nparts; i++) + partdesc->oids[i] = oids_orig[mapping[i]]; + MemoryContextSwitchTo(oldcxt); + + rel->rd_partdesc = partdesc; } /* -- 2.11.0
>From ba3ed61cdf744441fc1e70280525ee05dc6e1250 Mon Sep 17 00:00:00 2001 From: amit <amitlangot...@gmail.com> Date: Wed, 7 Nov 2018 14:09:51 +0900 Subject: [PATCH v2 2/3] Move out the PartitionBoundInfo creation code into a function --- src/backend/utils/cache/partcache.c | 955 +++++++++++++++++++----------------- 1 file changed, 498 insertions(+), 457 deletions(-) diff --git a/src/backend/utils/cache/partcache.c b/src/backend/utils/cache/partcache.c index 399ef82193..342bb46a55 100644 --- a/src/backend/utils/cache/partcache.c +++ b/src/backend/utils/cache/partcache.c @@ -43,6 +43,9 @@ static int32 qsort_partition_list_value_cmp(const void *a, const void *b, void *arg); static int32 qsort_partition_rbound_cmp(const void *a, const void *b, void *arg); +static PartitionBoundInfo partition_bounds_create(List *boundspecs, + PartitionKey key, + int **mapping); /* @@ -269,9 +272,6 @@ RelationBuildPartitionDesc(Relation rel) nparts; PartitionKey key = RelationGetPartitionKey(rel); MemoryContext oldcxt; - int ndatums = 0; - int next_index = 0; - int default_index = -1; Oid *oids_orig; int *mapping; @@ -338,460 +338,8 @@ RelationBuildPartitionDesc(Relation rel) return; } - /* - * For each partitioning method, we first convert the partition bounds - * from their parser node representation to the internal representation, - * along with any additional preprocessing (such performing de-duplication - * on range bounds). For each bound, we remember its partition's position - * (0-based) in the original list, so that we can add it to the mapping - * array. - * - * Resulting bound datums are then added to the 'datums' array in - * PartitionBoundInfo. For each datum added, an integer indicating the - * canonical partition index is added to the 'indexes' array. - */ - boundinfo = (PartitionBoundInfoData *) - palloc0(sizeof(PartitionBoundInfoData)); - boundinfo->strategy = key->strategy; - boundinfo->null_index = -1; - boundinfo->default_index = -1; - mapping = (int *) palloc(sizeof(int) * nparts); - switch (key->strategy) - { - case PARTITION_STRATEGY_HASH: - { - PartitionHashBound **hbounds = NULL; - int greatest_modulus; - - ndatums = nparts; - hbounds = (PartitionHashBound **) - palloc(nparts * sizeof(PartitionHashBound *)); - - i = 0; - foreach(cell, boundspecs) - { - PartitionBoundSpec *spec = castNode(PartitionBoundSpec, - lfirst(cell)); - - if (spec->strategy != PARTITION_STRATEGY_HASH) - elog(ERROR, "invalid strategy in partition bound spec"); - - hbounds[i] = (PartitionHashBound *) - palloc(sizeof(PartitionHashBound)); - - hbounds[i]->modulus = spec->modulus; - hbounds[i]->remainder = spec->remainder; - hbounds[i]->index = i; - i++; - } - - /* Sort all the bounds in ascending order */ - qsort(hbounds, nparts, sizeof(PartitionHashBound *), - qsort_partition_hbound_cmp); - - /* Moduli are stored in ascending order */ - greatest_modulus = hbounds[ndatums - 1]->modulus; - - boundinfo->ndatums = ndatums; - boundinfo->datums = (Datum **) palloc0(ndatums * - sizeof(Datum *)); - boundinfo->indexes = (int *) palloc(greatest_modulus * - sizeof(int)); - - for (i = 0; i < greatest_modulus; i++) - boundinfo->indexes[i] = -1; - - /* - * For hash partitioning, there are as many datums (modulus and - * remainder pairs) as there are partitions. Indexes are - * simply values ranging from 0 to npart - 1. - */ - for (i = 0; i < nparts; i++) - { - int orig_index = hbounds[i]->index; - int modulus = hbounds[i]->modulus; - int remainder = hbounds[i]->remainder; - - boundinfo->datums[i] = (Datum *) palloc(2 * - sizeof(Datum)); - boundinfo->datums[i][0] = Int32GetDatum(modulus); - boundinfo->datums[i][1] = Int32GetDatum(remainder); - - while (remainder < greatest_modulus) - { - /* overlap? */ - Assert(boundinfo->indexes[remainder] == -1); - boundinfo->indexes[remainder] = i; - remainder += modulus; - } - - /* Add the mapping element. */ - mapping[i] = orig_index; - - pfree(hbounds[i]); - } - pfree(hbounds); - break; - } - - case PARTITION_STRATEGY_LIST: - { - PartitionListValue **all_values = NULL; - int null_index = -1; - List *non_null_values = NIL; - int *listpart_canon_idx; - - /* - * Create a unified list of non-null values across all - * partitions. - */ - i = 0; - foreach(cell, boundspecs) - { - PartitionBoundSpec *spec = castNode(PartitionBoundSpec, - lfirst(cell)); - ListCell *c; - - if (spec->strategy != PARTITION_STRATEGY_LIST) - elog(ERROR, "invalid strategy in partition bound spec"); - - /* - * Note the index of the partition bound spec for the - * default partition. There's no datum to add to the list - * of non-null datums for this partition. - */ - if (spec->is_default) - { - default_index = i; - i++; - continue; - } - - foreach(c, spec->listdatums) - { - Const *val = castNode(Const, lfirst(c)); - PartitionListValue *list_value = NULL; - - if (!val->constisnull) - { - list_value = (PartitionListValue *) - palloc0(sizeof(PartitionListValue)); - list_value->index = i; - list_value->value = val->constvalue; - } - else - { - /* - * Never put a null into the values array, flag - * instead for the code further down below where - * we construct the actual relcache struct. - */ - if (null_index != -1) - elog(ERROR, "found null more than once"); - null_index = i; - } - - if (list_value) - non_null_values = lappend(non_null_values, - list_value); - } - - i++; - } - - ndatums = list_length(non_null_values); - - /* - * Collect all list values in one array. Alongside the value, - * we also save the index of partition the value comes from. - */ - all_values = (PartitionListValue **) palloc(ndatums * - sizeof(PartitionListValue *)); - i = 0; - foreach(cell, non_null_values) - { - PartitionListValue *src = lfirst(cell); - - all_values[i] = (PartitionListValue *) - palloc(sizeof(PartitionListValue)); - all_values[i]->value = src->value; - all_values[i]->index = src->index; - i++; - } - - qsort_arg(all_values, ndatums, sizeof(PartitionListValue *), - qsort_partition_list_value_cmp, (void *) key); - - boundinfo->ndatums = ndatums; - boundinfo->datums = (Datum **) palloc0(ndatums * - sizeof(Datum *)); - boundinfo->indexes = (int *) palloc(ndatums * sizeof(int)); - - /* - * Copy values. Indexes are values ranging from 0 to - * npart - 1, assigned to each partition such that all datums - * of a given partition receive the same value. The value - * for a given partition is the index of that partition's - * smallest datum in the all_values[] array. We keep track - * of whether a given partition has been assigned an index - * yet using the 'listpart_canon_idx' mapping array. - * - * Initialize listpart_canon_idx elements to -1 to indicate - * that an index hasn't been assigned yet. - */ - listpart_canon_idx = (int *) palloc(sizeof(int) * nparts); - memset(listpart_canon_idx, -1, sizeof(int) * nparts); - for (i = 0; i < ndatums; i++) - { - int orig_index = all_values[i]->index; - - boundinfo->datums[i] = (Datum *) palloc(sizeof(Datum)); - boundinfo->datums[i][0] = datumCopy(all_values[i]->value, - key->parttypbyval[0], - key->parttyplen[0]); - - /* - * Add the mapping element, if not already done for this - * partition. - */ - if (listpart_canon_idx[orig_index] == -1) - { - /* Add the mapping element. */ - mapping[next_index] = orig_index; - listpart_canon_idx[orig_index] = next_index++; - } - - boundinfo->indexes[i] = listpart_canon_idx[orig_index]; - } - - /* - * Set null_index, if any. - * - * It's possible that the null-accepting partition has not been - * assigned an index yet, which could happen if such partition - * accepts only null and hence not handled in the above loop - * which only looked at non-null values. - */ - if (null_index != -1) - { - Assert(null_index >= 0); - if (listpart_canon_idx[null_index] == -1) - { - /* Add the mapping element. */ - mapping[next_index] = null_index; - listpart_canon_idx[null_index] = next_index++; - } - boundinfo->null_index = listpart_canon_idx[null_index]; - } - - /* Set default_index, if any. */ - if (default_index != -1) - { - /* - * The default partition accepts any value not - * specified in the lists of other partitions, hence - * it should not have been assigned an index yet. - */ - Assert(default_index >= 0); - Assert(listpart_canon_idx[default_index] == -1); - /* Add the mapping element. */ - mapping[next_index] = default_index; - boundinfo->default_index = next_index++; - } - - /* All partition must now have a valid mapping */ - Assert(next_index == nparts); - break; - } - - case PARTITION_STRATEGY_RANGE: - { - PartitionRangeBound **rbounds = NULL; - PartitionRangeBound **all_bounds, - *prev; - int k; - - all_bounds = (PartitionRangeBound **) palloc0(2 * nparts * - sizeof(PartitionRangeBound *)); - - /* Create a unified list of range bounds across all the partitions. */ - i = ndatums = 0; - foreach(cell, boundspecs) - { - PartitionBoundSpec *spec = castNode(PartitionBoundSpec, - lfirst(cell)); - PartitionRangeBound *lower, - *upper; - - if (spec->strategy != PARTITION_STRATEGY_RANGE) - elog(ERROR, "invalid strategy in partition bound spec"); - - /* - * Note the index of the partition bound spec for the - * default partition. There's no datum to add to the - * allbounds array for this partition. - */ - if (spec->is_default) - { - default_index = i++; - continue; - } - - lower = make_one_partition_rbound(key, i, - spec->lowerdatums, - true); - upper = make_one_partition_rbound(key, i, - spec->upperdatums, - false); - all_bounds[ndatums++] = lower; - all_bounds[ndatums++] = upper; - i++; - } - - Assert(ndatums == nparts * 2 || - (default_index != -1 && ndatums == (nparts - 1) * 2)); - - /* Sort all the bounds in ascending order */ - qsort_arg(all_bounds, ndatums, - sizeof(PartitionRangeBound *), - qsort_partition_rbound_cmp, - (void *) key); - - /* - * De-duplicate: Save distinct bounds from 'all_bounds' into - * 'rbounds'. In many cases, lower bound of a partition - * matches exactly with the upper of the previous partition, - * in which case, we only store one. - */ - rbounds = (PartitionRangeBound **) - palloc(ndatums * sizeof(PartitionRangeBound *)); - k = 0; - prev = NULL; - for (i = 0; i < ndatums; i++) - { - PartitionRangeBound *cur = all_bounds[i]; - bool is_distinct = false; - int j; - - /* Is the current bound distinct from the previous one? */ - for (j = 0; j < key->partnatts; j++) - { - Datum cmpval; - - if (prev == NULL || cur->kind[j] != prev->kind[j]) - { - is_distinct = true; - break; - } - - /* - * If the bounds are both MINVALUE or MAXVALUE, stop - * now and treat them as equal, since any values after - * this point must be ignored. - */ - if (cur->kind[j] != PARTITION_RANGE_DATUM_VALUE) - break; - - cmpval = FunctionCall2Coll(&key->partsupfunc[j], - key->partcollation[j], - cur->datums[j], - prev->datums[j]); - if (DatumGetInt32(cmpval) != 0) - { - is_distinct = true; - break; - } - } - - /* - * Only if the bound is distinct save it into a temporary - * array i.e. rbounds which is later copied into boundinfo - * datums array. - */ - if (is_distinct) - rbounds[k++] = all_bounds[i]; - - prev = cur; - } - - /* Update ndatums to hold the count of distinct datums. */ - ndatums = k; - - /* - * Add datums to boundinfo. Indexes are values ranging from - * 0 to nparts - 1, assigned in that order to each partition's - * upper bound. For 'datums' elements that are lower bounds, - * there is -1 in 'indexes' array to signify that no partition - * exists for the values less than such a bound and greater - * than or equal to the previous upper bound. - */ - boundinfo->ndatums = ndatums; - boundinfo->datums = (Datum **) palloc0(ndatums * - sizeof(Datum *)); - boundinfo->kind = (PartitionRangeDatumKind **) - palloc(ndatums * - sizeof(PartitionRangeDatumKind *)); - /* - * For range partitioning, an additional value of -1 is stored - * as the last element. - */ - boundinfo->indexes = (int *) palloc((ndatums + 1) * - sizeof(int)); - - for (i = 0; i < ndatums; i++) - { - int j; - - boundinfo->datums[i] = (Datum *) palloc(key->partnatts * - sizeof(Datum)); - boundinfo->kind[i] = (PartitionRangeDatumKind *) - palloc(key->partnatts * - sizeof(PartitionRangeDatumKind)); - for (j = 0; j < key->partnatts; j++) - { - if (rbounds[i]->kind[j] == PARTITION_RANGE_DATUM_VALUE) - boundinfo->datums[i][j] = - datumCopy(rbounds[i]->datums[j], - key->parttypbyval[j], - key->parttyplen[j]); - boundinfo->kind[i][j] = rbounds[i]->kind[j]; - } - - /* Assign -1 to lower bounds as described above. */ - if (rbounds[i]->lower) - boundinfo->indexes[i] = -1; - else - { - int orig_index = rbounds[i]->index; - - /* Add the mapping element. */ - mapping[next_index] = orig_index; - - boundinfo->indexes[i] = next_index++; - } - } - - /* Set default_index, if any. */ - if (default_index != -1) - { - Assert(default_index >= 0); - /* Add the mapping element. */ - mapping[next_index] = default_index; - boundinfo->default_index = next_index++; - } - boundinfo->indexes[i] = -1; - - /* All partition must now have a valid mapping */ - Assert(next_index == nparts); - break; - } - - default: - elog(ERROR, "unexpected partition strategy: %d", - (int) key->strategy); - break; - } - + /* First create the PartitionBoundInfo in caller's context. */ + boundinfo = partition_bounds_create(boundspecs, key, &mapping); oids_orig = (Oid *) palloc(sizeof(Oid) * partdesc->nparts); i = 0; foreach(cell, inhoids) @@ -993,3 +541,496 @@ qsort_partition_rbound_cmp(const void *a, const void *b, void *arg) key->partcollation, b1->datums, b1->kind, b1->lower, b2); } + +/* + * partition_bounds_create + * Build a PartitionBoundInfo struct from a list of PartitionBoundSpec + * nodes + * + * This function creates a PartitionBoundInfo and fills the values of its + * various members based on the input list. Importantly, 'datums' array will + * contain Datum representation of individual bounds (possibly after + * de-duplication as in case of range bounds), sorted in a canonical order + * defined by qsort_partition_* functions of respective partitioning methods. + * 'indexes' array will contain as many elements as there are bounds (specific + * exceptions to this rule are listed in the function body), which represent + * the 0-based canonical positions of partitions. + * + * Upon return from this function, *mapping is set to an array of + * list_length(boundspecs) / nparts elements, each of which maps the canonical + * index of a given partition to its 0-based position in the original list. + * + * Note: All the memory allocated by this function, including that of the + * the returned PartitionBoundInfo and its members is allocated in the context + * that was active when the function was called. + */ +static PartitionBoundInfo +partition_bounds_create(List *boundspecs, PartitionKey key, int **mapping) +{ + PartitionBoundInfo boundinfo; + ListCell *cell; + int i, + nparts; + int ndatums = 0; + int default_index = -1; + int next_index = 0; + + nparts = list_length(boundspecs); + Assert(nparts > 0); + + /* + * For each partitioning method, we first convert the partition bounds + * from their parser node representation to the internal representation, + * along with any additional preprocessing (such performing de-duplication + * on range bounds). For each bound, we remember its partition's position + * (0-based) in the original list, so that we can add it to the *mapping + * array. + * + * Resulting bound datums are then added to the 'datums' array in + * PartitionBoundInfo. For each datum added, an integer indicating the + * canonical partition index is added to the 'indexes' array. + */ + boundinfo = (PartitionBoundInfoData *) + palloc0(sizeof(PartitionBoundInfoData)); + boundinfo->strategy = key->strategy; + boundinfo->null_index = -1; + boundinfo->default_index = -1; + *mapping = (int *) palloc(sizeof(int) * nparts); + switch (key->strategy) + { + case PARTITION_STRATEGY_HASH: + { + PartitionHashBound **hbounds = NULL; + int greatest_modulus; + + ndatums = nparts; + hbounds = (PartitionHashBound **) + palloc(nparts * sizeof(PartitionHashBound *)); + + i = 0; + foreach(cell, boundspecs) + { + PartitionBoundSpec *spec = castNode(PartitionBoundSpec, + lfirst(cell)); + + if (spec->strategy != PARTITION_STRATEGY_HASH) + elog(ERROR, "invalid strategy in partition bound spec"); + + hbounds[i] = (PartitionHashBound *) + palloc(sizeof(PartitionHashBound)); + + hbounds[i]->modulus = spec->modulus; + hbounds[i]->remainder = spec->remainder; + hbounds[i]->index = i; + i++; + } + + /* Sort all the bounds in ascending order */ + qsort(hbounds, nparts, sizeof(PartitionHashBound *), + qsort_partition_hbound_cmp); + + /* Moduli are stored in ascending order */ + greatest_modulus = hbounds[ndatums - 1]->modulus; + + boundinfo->ndatums = ndatums; + boundinfo->datums = (Datum **) palloc0(ndatums * + sizeof(Datum *)); + boundinfo->indexes = (int *) palloc(greatest_modulus * + sizeof(int)); + + for (i = 0; i < greatest_modulus; i++) + boundinfo->indexes[i] = -1; + + /* + * For hash partitioning, there are as many datums (modulus and + * remainder pairs) as there are partitions. Indexes are + * simply values ranging from 0 to npart - 1. + */ + for (i = 0; i < nparts; i++) + { + int orig_index = hbounds[i]->index; + int modulus = hbounds[i]->modulus; + int remainder = hbounds[i]->remainder; + + boundinfo->datums[i] = (Datum *) palloc(2 * + sizeof(Datum)); + boundinfo->datums[i][0] = Int32GetDatum(modulus); + boundinfo->datums[i][1] = Int32GetDatum(remainder); + + while (remainder < greatest_modulus) + { + /* overlap? */ + Assert(boundinfo->indexes[remainder] == -1); + boundinfo->indexes[remainder] = i; + remainder += modulus; + } + + /* Add the mapping element. */ + (*mapping)[i] = orig_index; + + pfree(hbounds[i]); + } + pfree(hbounds); + break; + } + + case PARTITION_STRATEGY_LIST: + { + PartitionListValue **all_values = NULL; + int null_index = -1; + List *non_null_values = NIL; + int *listpart_canon_idx; + + /* + * Create a unified list of non-null values across all + * partitions. + */ + i = 0; + foreach(cell, boundspecs) + { + PartitionBoundSpec *spec = castNode(PartitionBoundSpec, + lfirst(cell)); + ListCell *c; + + if (spec->strategy != PARTITION_STRATEGY_LIST) + elog(ERROR, "invalid strategy in partition bound spec"); + + /* + * Note the index of the partition bound spec for the + * default partition. There's no datum to add to the list + * of non-null datums for this partition. + */ + if (spec->is_default) + { + default_index = i; + i++; + continue; + } + + foreach(c, spec->listdatums) + { + Const *val = castNode(Const, lfirst(c)); + PartitionListValue *list_value = NULL; + + if (!val->constisnull) + { + list_value = (PartitionListValue *) + palloc0(sizeof(PartitionListValue)); + list_value->index = i; + list_value->value = val->constvalue; + } + else + { + /* + * Never put a null into the values array, flag + * instead for the code further down below where + * we construct the actual relcache struct. + */ + if (null_index != -1) + elog(ERROR, "found null more than once"); + null_index = i; + } + + if (list_value) + non_null_values = lappend(non_null_values, + list_value); + } + + i++; + } + + ndatums = list_length(non_null_values); + + /* + * Collect all list values in one array. Alongside the value, + * we also save the index of partition the value comes from. + */ + all_values = (PartitionListValue **) palloc(ndatums * + sizeof(PartitionListValue *)); + i = 0; + foreach(cell, non_null_values) + { + PartitionListValue *src = lfirst(cell); + + all_values[i] = (PartitionListValue *) + palloc(sizeof(PartitionListValue)); + all_values[i]->value = src->value; + all_values[i]->index = src->index; + i++; + } + + qsort_arg(all_values, ndatums, sizeof(PartitionListValue *), + qsort_partition_list_value_cmp, (void *) key); + + boundinfo->ndatums = ndatums; + boundinfo->datums = (Datum **) palloc0(ndatums * + sizeof(Datum *)); + boundinfo->indexes = (int *) palloc(ndatums * sizeof(int)); + + /* + * Copy values. Indexes are values ranging from 0 to + * npart - 1, assigned to each partition such that all datums + * of a given partition receive the same value. The value + * for a given partition is the index of that partition's + * smallest datum in the all_values[] array. We keep track + * of whether a given partition has been assigned an index + * yet using the 'listpart_canon_idx' array. + * + * Initialize listpart_canon_idx elements to -1 to indicate + * that an index hasn't been assigned yet. + */ + listpart_canon_idx = (int *) palloc(sizeof(int) * nparts); + memset(listpart_canon_idx, -1, sizeof(int) * nparts); + for (i = 0; i < ndatums; i++) + { + int orig_index = all_values[i]->index; + + boundinfo->datums[i] = (Datum *) palloc(sizeof(Datum)); + boundinfo->datums[i][0] = datumCopy(all_values[i]->value, + key->parttypbyval[0], + key->parttyplen[0]); + + /* + * Add the mapping element, if not already done for this + * partition. + */ + if (listpart_canon_idx[orig_index] == -1) + { + /* Add the mapping element. */ + (*mapping)[next_index] = orig_index; + listpart_canon_idx[orig_index] = next_index++; + } + + boundinfo->indexes[i] = listpart_canon_idx[orig_index]; + } + + /* + * Set null_index, if any. + * + * It's possible that the null-accepting partition has not been + * assigned an index yet, which could happen if such partition + * accepts only null and hence not handled in the above loop + * which only looked at non-null values. + */ + if (null_index != -1) + { + Assert(null_index >= 0); + if (listpart_canon_idx[null_index] == -1) + { + /* Add the mapping element. */ + (*mapping)[next_index] = null_index; + listpart_canon_idx[null_index] = next_index++; + } + boundinfo->null_index = listpart_canon_idx[null_index]; + } + + /* Set default_index, if any. */ + if (default_index != -1) + { + /* + * The default partition accepts any value not + * specified in the lists of other partitions, hence + * it should not have been assigned an index yet. + */ + Assert(default_index >= 0); + Assert(listpart_canon_idx[default_index] == -1); + /* Add the mapping element. */ + (*mapping)[next_index] = default_index; + boundinfo->default_index = next_index++; + } + + /* All partition must now have a valid mapping */ + Assert(next_index == nparts); + break; + } + + case PARTITION_STRATEGY_RANGE: + { + PartitionRangeBound **rbounds = NULL; + PartitionRangeBound **all_bounds, + *prev; + int k; + + all_bounds = (PartitionRangeBound **) palloc0(2 * nparts * + sizeof(PartitionRangeBound *)); + + /* Create a unified list of range bounds across all the partitions. */ + i = ndatums = 0; + foreach(cell, boundspecs) + { + PartitionBoundSpec *spec = castNode(PartitionBoundSpec, + lfirst(cell)); + PartitionRangeBound *lower, + *upper; + + if (spec->strategy != PARTITION_STRATEGY_RANGE) + elog(ERROR, "invalid strategy in partition bound spec"); + + /* + * Note the index of the partition bound spec for the + * default partition. There's no datum to add to the + * allbounds array for this partition. + */ + if (spec->is_default) + { + default_index = i++; + continue; + } + + lower = make_one_partition_rbound(key, i, + spec->lowerdatums, + true); + upper = make_one_partition_rbound(key, i, + spec->upperdatums, + false); + all_bounds[ndatums++] = lower; + all_bounds[ndatums++] = upper; + i++; + } + + Assert(ndatums == nparts * 2 || + (default_index != -1 && ndatums == (nparts - 1) * 2)); + + /* Sort all the bounds in ascending order */ + qsort_arg(all_bounds, ndatums, + sizeof(PartitionRangeBound *), + qsort_partition_rbound_cmp, + (void *) key); + + /* + * De-duplicate: Save distinct bounds from 'all_bounds' into + * 'rbounds'. In many cases, lower bound of a partition + * matches exactly with the upper of the previous partition, + * in which case, we only store one. + */ + rbounds = (PartitionRangeBound **) + palloc(ndatums * sizeof(PartitionRangeBound *)); + k = 0; + prev = NULL; + for (i = 0; i < ndatums; i++) + { + PartitionRangeBound *cur = all_bounds[i]; + bool is_distinct = false; + int j; + + /* Is the current bound distinct from the previous one? */ + for (j = 0; j < key->partnatts; j++) + { + Datum cmpval; + + if (prev == NULL || cur->kind[j] != prev->kind[j]) + { + is_distinct = true; + break; + } + + /* + * If the bounds are both MINVALUE or MAXVALUE, stop + * now and treat them as equal, since any values after + * this point must be ignored. + */ + if (cur->kind[j] != PARTITION_RANGE_DATUM_VALUE) + break; + + cmpval = FunctionCall2Coll(&key->partsupfunc[j], + key->partcollation[j], + cur->datums[j], + prev->datums[j]); + if (DatumGetInt32(cmpval) != 0) + { + is_distinct = true; + break; + } + } + + /* + * Only if the bound is distinct save it into a temporary + * array i.e. rbounds which is later copied into boundinfo + * datums array. + */ + if (is_distinct) + rbounds[k++] = all_bounds[i]; + + prev = cur; + } + + /* Update ndatums to hold the count of distinct datums. */ + ndatums = k; + + /* + * Add datums to boundinfo. Indexes are values ranging from + * 0 to nparts - 1, assigned in that order to each partition's + * upper bound. For 'datums' elements that are lower bounds, + * there is -1 in 'indexes' array to signify that no partition + * exists for the values less than such a bound and greater + * than or equal to the previous upper bound. + */ + boundinfo->ndatums = ndatums; + boundinfo->datums = (Datum **) palloc0(ndatums * + sizeof(Datum *)); + boundinfo->kind = (PartitionRangeDatumKind **) + palloc(ndatums * + sizeof(PartitionRangeDatumKind *)); + /* + * For range partitioning, an additional value of -1 is stored + * as the last element. + */ + boundinfo->indexes = (int *) palloc((ndatums + 1) * + sizeof(int)); + + for (i = 0; i < ndatums; i++) + { + int j; + + boundinfo->datums[i] = (Datum *) palloc(key->partnatts * + sizeof(Datum)); + boundinfo->kind[i] = (PartitionRangeDatumKind *) + palloc(key->partnatts * + sizeof(PartitionRangeDatumKind)); + for (j = 0; j < key->partnatts; j++) + { + if (rbounds[i]->kind[j] == PARTITION_RANGE_DATUM_VALUE) + boundinfo->datums[i][j] = + datumCopy(rbounds[i]->datums[j], + key->parttypbyval[j], + key->parttyplen[j]); + boundinfo->kind[i][j] = rbounds[i]->kind[j]; + } + + /* Assign -1 to lower bounds as described above. */ + if (rbounds[i]->lower) + boundinfo->indexes[i] = -1; + else + { + int orig_index = rbounds[i]->index; + + /* Add the mapping element. */ + (*mapping)[next_index] = orig_index; + + boundinfo->indexes[i] = next_index++; + } + } + + /* Set default_index, if any. */ + if (default_index != -1) + { + Assert(default_index >= 0); + /* Add the mapping element. */ + (*mapping)[next_index] = default_index; + boundinfo->default_index = next_index++; + } + boundinfo->indexes[i] = -1; + + /* All partition must now have a valid mapping */ + Assert(next_index == nparts); + break; + } + + default: + elog(ERROR, "unexpected partition strategy: %d", + (int) key->strategy); + break; + } + + return boundinfo; +} -- 2.11.0
>From f72164c5b73042d3d82fa79a0b1a5ef8efccaace Mon Sep 17 00:00:00 2001 From: amit <amitlangot...@gmail.com> Date: Wed, 7 Nov 2018 14:24:15 +0900 Subject: [PATCH v2 3/3] Move certain code to partbounds.c partition_bounds_create() contains a bunch of logic that's better contained in partbounds.c. Move partition bound sorting routines as well, because they're only needed by partition_bounds_create. Also mark as static a few other functions that were exported out of partbounds.c because partition_bounds_create needed them. --- src/backend/partitioning/partbounds.c | 593 +++++++++++++++++++++++++++++++++- src/backend/utils/cache/partcache.c | 547 ------------------------------- src/include/partitioning/partbounds.h | 44 +-- 3 files changed, 592 insertions(+), 592 deletions(-) diff --git a/src/backend/partitioning/partbounds.c b/src/backend/partitioning/partbounds.c index c94f73aadc..b8aa1c9710 100644 --- a/src/backend/partitioning/partbounds.c +++ b/src/backend/partitioning/partbounds.c @@ -36,6 +36,54 @@ #include "utils/ruleutils.h" #include "utils/syscache.h" +/* + * When qsort'ing partition bounds after reading from the catalog, each bound + * is represented with one of the following structs. + */ + +/* One bound of a hash partition */ +typedef struct PartitionHashBound +{ + int modulus; + int remainder; + int index; +} PartitionHashBound; + +/* One value coming from some (index'th) list partition */ +typedef struct PartitionListValue +{ + int index; + Datum value; +} PartitionListValue; + +/* One bound of a range partition */ +typedef struct PartitionRangeBound +{ + int index; + Datum *datums; /* range bound datums */ + PartitionRangeDatumKind *kind; /* the kind of each datum */ + bool lower; /* this is the lower (vs upper) bound */ +} PartitionRangeBound; + +static int32 qsort_partition_hbound_cmp(const void *a, const void *b); +static int32 qsort_partition_list_value_cmp(const void *a, const void *b, + void *arg); +static int32 qsort_partition_rbound_cmp(const void *a, const void *b, + void *arg); +static PartitionRangeBound *make_one_partition_rbound(PartitionKey key, int index, + List *datums, bool lower); +static int32 partition_hbound_cmp(int modulus1, int remainder1, int modulus2, + int remainder2); +static int32 partition_rbound_cmp(int partnatts, FmgrInfo *partsupfunc, + Oid *partcollation, Datum *datums1, + PartitionRangeDatumKind *kind1, bool lower1, + PartitionRangeBound *b2); + +static int partition_range_bsearch(int partnatts, FmgrInfo *partsupfunc, + Oid *partcollation, + PartitionBoundInfo boundinfo, + PartitionRangeBound *probe, bool *is_equal); + static int get_partition_bound_num_indexes(PartitionBoundInfo b); static Expr *make_partition_op_expr(PartitionKey key, int keynum, uint16 strategy, Expr *arg1, Expr *arg2); @@ -93,6 +141,498 @@ get_qual_from_partbound(Relation rel, Relation parent, } /* + * partition_bounds_create + * Build a PartitionBoundInfo struct from a list of PartitionBoundSpec + * nodes + * + * This function creates a PartitionBoundInfo and fills the values of its + * various members based on the input list. Importantly, 'datums' array will + * contain Datum representation of individual bounds (possibly after + * de-duplication as in case of range bounds), sorted in a canonical order + * defined by qsort_partition_* functions of respective partitioning methods. + * 'indexes' array will contain as many elements as there are bounds (specific + * exceptions to this rule are listed in the function body), which represent + * the 0-based canonical positions of partitions. + * + * Upon return from this function, *mapping is set to an array of + * list_length(boundspecs) / nparts elements, each of which maps the canonical + * index of a given partition to its 0-based position in the original list. + * + * Note: All the memory allocated by this function, including that of the + * the returned PartitionBoundInfo and its members is allocated in the context + * that was active when the function was called. + */ +PartitionBoundInfo +partition_bounds_create(List *boundspecs, PartitionKey key, int **mapping) +{ + PartitionBoundInfo boundinfo; + ListCell *cell; + int i, + nparts; + int ndatums = 0; + int default_index = -1; + int next_index = 0; + + nparts = list_length(boundspecs); + Assert(nparts > 0); + + /* + * For each partitioning method, we first convert the partition bounds + * from their parser node representation to the internal representation, + * along with any additional preprocessing (such performing de-duplication + * on range bounds). For each bound, we remember its partition's position + * (0-based) in the original list, so that we can add it to the *mapping + * array. + * + * Resulting bound datums are then added to the 'datums' array in + * PartitionBoundInfo. For each datum added, an integer indicating the + * canonical partition index is added to the 'indexes' array. + */ + boundinfo = (PartitionBoundInfoData *) + palloc0(sizeof(PartitionBoundInfoData)); + boundinfo->strategy = key->strategy; + boundinfo->null_index = -1; + boundinfo->default_index = -1; + *mapping = (int *) palloc(sizeof(int) * nparts); + switch (key->strategy) + { + case PARTITION_STRATEGY_HASH: + { + PartitionHashBound **hbounds = NULL; + int greatest_modulus; + + ndatums = nparts; + hbounds = (PartitionHashBound **) + palloc(nparts * sizeof(PartitionHashBound *)); + + i = 0; + foreach(cell, boundspecs) + { + PartitionBoundSpec *spec = castNode(PartitionBoundSpec, + lfirst(cell)); + + if (spec->strategy != PARTITION_STRATEGY_HASH) + elog(ERROR, "invalid strategy in partition bound spec"); + + hbounds[i] = (PartitionHashBound *) + palloc(sizeof(PartitionHashBound)); + + hbounds[i]->modulus = spec->modulus; + hbounds[i]->remainder = spec->remainder; + hbounds[i]->index = i; + i++; + } + + /* Sort all the bounds in ascending order */ + qsort(hbounds, nparts, sizeof(PartitionHashBound *), + qsort_partition_hbound_cmp); + + /* Moduli are stored in ascending order */ + greatest_modulus = hbounds[ndatums - 1]->modulus; + + boundinfo->ndatums = ndatums; + boundinfo->datums = (Datum **) palloc0(ndatums * + sizeof(Datum *)); + boundinfo->indexes = (int *) palloc(greatest_modulus * + sizeof(int)); + + for (i = 0; i < greatest_modulus; i++) + boundinfo->indexes[i] = -1; + + /* + * For hash partitioning, there are as many datums (modulus and + * remainder pairs) as there are partitions. Indexes are + * simply values ranging from 0 to npart - 1. + */ + for (i = 0; i < nparts; i++) + { + int orig_index = hbounds[i]->index; + int modulus = hbounds[i]->modulus; + int remainder = hbounds[i]->remainder; + + boundinfo->datums[i] = (Datum *) palloc(2 * + sizeof(Datum)); + boundinfo->datums[i][0] = Int32GetDatum(modulus); + boundinfo->datums[i][1] = Int32GetDatum(remainder); + + while (remainder < greatest_modulus) + { + /* overlap? */ + Assert(boundinfo->indexes[remainder] == -1); + boundinfo->indexes[remainder] = i; + remainder += modulus; + } + + /* Add the mapping element. */ + (*mapping)[i] = orig_index; + + pfree(hbounds[i]); + } + pfree(hbounds); + break; + } + + case PARTITION_STRATEGY_LIST: + { + PartitionListValue **all_values = NULL; + int null_index = -1; + List *non_null_values = NIL; + int *listpart_canon_idx; + + /* + * Create a unified list of non-null values across all + * partitions. + */ + i = 0; + foreach(cell, boundspecs) + { + PartitionBoundSpec *spec = castNode(PartitionBoundSpec, + lfirst(cell)); + ListCell *c; + + if (spec->strategy != PARTITION_STRATEGY_LIST) + elog(ERROR, "invalid strategy in partition bound spec"); + + /* + * Note the index of the partition bound spec for the + * default partition. There's no datum to add to the list + * of non-null datums for this partition. + */ + if (spec->is_default) + { + default_index = i; + i++; + continue; + } + + foreach(c, spec->listdatums) + { + Const *val = castNode(Const, lfirst(c)); + PartitionListValue *list_value = NULL; + + if (!val->constisnull) + { + list_value = (PartitionListValue *) + palloc0(sizeof(PartitionListValue)); + list_value->index = i; + list_value->value = val->constvalue; + } + else + { + /* + * Never put a null into the values array, flag + * instead for the code further down below where + * we construct the actual relcache struct. + */ + if (null_index != -1) + elog(ERROR, "found null more than once"); + null_index = i; + } + + if (list_value) + non_null_values = lappend(non_null_values, + list_value); + } + + i++; + } + + ndatums = list_length(non_null_values); + + /* + * Collect all list values in one array. Alongside the value, + * we also save the index of partition the value comes from. + */ + all_values = (PartitionListValue **) palloc(ndatums * + sizeof(PartitionListValue *)); + i = 0; + foreach(cell, non_null_values) + { + PartitionListValue *src = lfirst(cell); + + all_values[i] = (PartitionListValue *) + palloc(sizeof(PartitionListValue)); + all_values[i]->value = src->value; + all_values[i]->index = src->index; + i++; + } + + qsort_arg(all_values, ndatums, sizeof(PartitionListValue *), + qsort_partition_list_value_cmp, (void *) key); + + boundinfo->ndatums = ndatums; + boundinfo->datums = (Datum **) palloc0(ndatums * + sizeof(Datum *)); + boundinfo->indexes = (int *) palloc(ndatums * sizeof(int)); + + /* + * Copy values. Indexes are values ranging from 0 to + * npart - 1, assigned to each partition such that all datums + * of a given partition receive the same value. The value + * for a given partition is the index of that partition's + * smallest datum in the all_values[] array. We keep track + * of whether a given partition has been assigned an index + * yet using the 'listpart_canon_idx' array. + * + * Initialize listpart_canon_idx elements to -1 to indicate + * that an index hasn't been assigned yet. + */ + listpart_canon_idx = (int *) palloc(sizeof(int) * nparts); + memset(listpart_canon_idx, -1, sizeof(int) * nparts); + for (i = 0; i < ndatums; i++) + { + int orig_index = all_values[i]->index; + + boundinfo->datums[i] = (Datum *) palloc(sizeof(Datum)); + boundinfo->datums[i][0] = datumCopy(all_values[i]->value, + key->parttypbyval[0], + key->parttyplen[0]); + + /* + * Add the mapping element, if not already done for this + * partition. + */ + if (listpart_canon_idx[orig_index] == -1) + { + /* Add the mapping element. */ + (*mapping)[next_index] = orig_index; + listpart_canon_idx[orig_index] = next_index++; + } + + boundinfo->indexes[i] = listpart_canon_idx[orig_index]; + } + + /* + * Set null_index, if any. + * + * It's possible that the null-accepting partition has not been + * assigned an index yet, which could happen if such partition + * accepts only null and hence not handled in the above loop + * which only looked at non-null values. + */ + if (null_index != -1) + { + Assert(null_index >= 0); + if (listpart_canon_idx[null_index] == -1) + { + /* Add the mapping element. */ + (*mapping)[next_index] = null_index; + listpart_canon_idx[null_index] = next_index++; + } + boundinfo->null_index = listpart_canon_idx[null_index]; + } + + /* Set default_index, if any. */ + if (default_index != -1) + { + /* + * The default partition accepts any value not + * specified in the lists of other partitions, hence + * it should not have been assigned an index yet. + */ + Assert(default_index >= 0); + Assert(listpart_canon_idx[default_index] == -1); + /* Add the mapping element. */ + (*mapping)[next_index] = default_index; + boundinfo->default_index = next_index++; + } + + /* All partition must now have a valid mapping */ + Assert(next_index == nparts); + break; + } + + case PARTITION_STRATEGY_RANGE: + { + PartitionRangeBound **rbounds = NULL; + PartitionRangeBound **all_bounds, + *prev; + int k; + + all_bounds = (PartitionRangeBound **) palloc0(2 * nparts * + sizeof(PartitionRangeBound *)); + + /* Create a unified list of range bounds across all the partitions. */ + i = ndatums = 0; + foreach(cell, boundspecs) + { + PartitionBoundSpec *spec = castNode(PartitionBoundSpec, + lfirst(cell)); + PartitionRangeBound *lower, + *upper; + + if (spec->strategy != PARTITION_STRATEGY_RANGE) + elog(ERROR, "invalid strategy in partition bound spec"); + + /* + * Note the index of the partition bound spec for the + * default partition. There's no datum to add to the + * allbounds array for this partition. + */ + if (spec->is_default) + { + default_index = i++; + continue; + } + + lower = make_one_partition_rbound(key, i, + spec->lowerdatums, + true); + upper = make_one_partition_rbound(key, i, + spec->upperdatums, + false); + all_bounds[ndatums++] = lower; + all_bounds[ndatums++] = upper; + i++; + } + + Assert(ndatums == nparts * 2 || + (default_index != -1 && ndatums == (nparts - 1) * 2)); + + /* Sort all the bounds in ascending order */ + qsort_arg(all_bounds, ndatums, + sizeof(PartitionRangeBound *), + qsort_partition_rbound_cmp, + (void *) key); + + /* + * De-duplicate: Save distinct bounds from 'all_bounds' into + * 'rbounds'. In many cases, lower bound of a partition + * matches exactly with the upper of the previous partition, + * in which case, we only store one. + */ + rbounds = (PartitionRangeBound **) + palloc(ndatums * sizeof(PartitionRangeBound *)); + k = 0; + prev = NULL; + for (i = 0; i < ndatums; i++) + { + PartitionRangeBound *cur = all_bounds[i]; + bool is_distinct = false; + int j; + + /* Is the current bound distinct from the previous one? */ + for (j = 0; j < key->partnatts; j++) + { + Datum cmpval; + + if (prev == NULL || cur->kind[j] != prev->kind[j]) + { + is_distinct = true; + break; + } + + /* + * If the bounds are both MINVALUE or MAXVALUE, stop + * now and treat them as equal, since any values after + * this point must be ignored. + */ + if (cur->kind[j] != PARTITION_RANGE_DATUM_VALUE) + break; + + cmpval = FunctionCall2Coll(&key->partsupfunc[j], + key->partcollation[j], + cur->datums[j], + prev->datums[j]); + if (DatumGetInt32(cmpval) != 0) + { + is_distinct = true; + break; + } + } + + /* + * Only if the bound is distinct save it into a temporary + * array i.e. rbounds which is later copied into boundinfo + * datums array. + */ + if (is_distinct) + rbounds[k++] = all_bounds[i]; + + prev = cur; + } + + /* Update ndatums to hold the count of distinct datums. */ + ndatums = k; + + /* + * Add datums to boundinfo. Indexes are values ranging from + * 0 to nparts - 1, assigned in that order to each partition's + * upper bound. For 'datums' elements that are lower bounds, + * there is -1 in 'indexes' array to signify that no partition + * exists for the values less than such a bound and greater + * than or equal to the previous upper bound. + */ + boundinfo->ndatums = ndatums; + boundinfo->datums = (Datum **) palloc0(ndatums * + sizeof(Datum *)); + boundinfo->kind = (PartitionRangeDatumKind **) + palloc(ndatums * + sizeof(PartitionRangeDatumKind *)); + /* + * For range partitioning, an additional value of -1 is stored + * as the last element. + */ + boundinfo->indexes = (int *) palloc((ndatums + 1) * + sizeof(int)); + + for (i = 0; i < ndatums; i++) + { + int j; + + boundinfo->datums[i] = (Datum *) palloc(key->partnatts * + sizeof(Datum)); + boundinfo->kind[i] = (PartitionRangeDatumKind *) + palloc(key->partnatts * + sizeof(PartitionRangeDatumKind)); + for (j = 0; j < key->partnatts; j++) + { + if (rbounds[i]->kind[j] == PARTITION_RANGE_DATUM_VALUE) + boundinfo->datums[i][j] = + datumCopy(rbounds[i]->datums[j], + key->parttypbyval[j], + key->parttyplen[j]); + boundinfo->kind[i][j] = rbounds[i]->kind[j]; + } + + /* Assign -1 to lower bounds as described above. */ + if (rbounds[i]->lower) + boundinfo->indexes[i] = -1; + else + { + int orig_index = rbounds[i]->index; + + /* Add the mapping element. */ + (*mapping)[next_index] = orig_index; + + boundinfo->indexes[i] = next_index++; + } + } + + /* Set default_index, if any. */ + if (default_index != -1) + { + Assert(default_index >= 0); + /* Add the mapping element. */ + (*mapping)[next_index] = default_index; + boundinfo->default_index = next_index++; + } + boundinfo->indexes[i] = -1; + + /* All partition must now have a valid mapping */ + Assert(next_index == nparts); + break; + } + + default: + elog(ERROR, "unexpected partition strategy: %d", + (int) key->strategy); + break; + } + + return boundinfo; +} +/* * Are two partition bound collections logically equal? * * Used in the keep logic of relcache.c (ie, in RelationClearRelation()). @@ -763,7 +1303,7 @@ get_hash_partition_greatest_modulus(PartitionBoundInfo bound) * and a flag telling whether the bound is lower or not. Made into a function * because there are multiple sites that want to use this facility. */ -PartitionRangeBound * +static PartitionRangeBound * make_one_partition_rbound(PartitionKey key, int index, List *datums, bool lower) { PartitionRangeBound *bound; @@ -819,7 +1359,7 @@ make_one_partition_rbound(PartitionKey key, int index, List *datums, bool lower) * structure, which only stores the upper bound of a common boundary between * two contiguous partitions. */ -int32 +static int32 partition_rbound_cmp(int partnatts, FmgrInfo *partsupfunc, Oid *partcollation, Datum *datums1, PartitionRangeDatumKind *kind1, @@ -914,7 +1454,7 @@ partition_rbound_datum_cmp(FmgrInfo *partsupfunc, Oid *partcollation, * * Compares modulus first, then remainder if modulus is equal. */ -int32 +static int32 partition_hbound_cmp(int modulus1, int remainder1, int modulus2, int remainder2) { if (modulus1 < modulus2) @@ -977,7 +1517,7 @@ partition_list_bsearch(FmgrInfo *partsupfunc, Oid *partcollation, * *is_equal is set to true if the range bound at the returned index is equal * to the input range bound */ -int +static int partition_range_bsearch(int partnatts, FmgrInfo *partsupfunc, Oid *partcollation, PartitionBoundInfo boundinfo, @@ -1102,6 +1642,51 @@ partition_hash_bsearch(PartitionBoundInfo boundinfo, } /* + * qsort_partition_hbound_cmp + * + * We sort hash bounds by modulus, then by remainder. + */ +static int32 +qsort_partition_hbound_cmp(const void *a, const void *b) +{ + PartitionHashBound *h1 = (*(PartitionHashBound *const *) a); + PartitionHashBound *h2 = (*(PartitionHashBound *const *) b); + + return partition_hbound_cmp(h1->modulus, h1->remainder, + h2->modulus, h2->remainder); +} + +/* + * qsort_partition_list_value_cmp + * + * Compare two list partition bound datums + */ +static int32 +qsort_partition_list_value_cmp(const void *a, const void *b, void *arg) +{ + Datum val1 = (*(const PartitionListValue **) a)->value, + val2 = (*(const PartitionListValue **) b)->value; + PartitionKey key = (PartitionKey) arg; + + return DatumGetInt32(FunctionCall2Coll(&key->partsupfunc[0], + key->partcollation[0], + val1, val2)); +} + +/* Used when sorting range bounds across all range partitions */ +static int32 +qsort_partition_rbound_cmp(const void *a, const void *b, void *arg) +{ + PartitionRangeBound *b1 = (*(PartitionRangeBound *const *) a); + PartitionRangeBound *b2 = (*(PartitionRangeBound *const *) b); + PartitionKey key = (PartitionKey) arg; + + return partition_rbound_cmp(key->partnatts, key->partsupfunc, + key->partcollation, b1->datums, b1->kind, + b1->lower, b2); +} + +/* * get_partition_bound_num_indexes * * Returns the number of the entries in the partition bound indexes array. diff --git a/src/backend/utils/cache/partcache.c b/src/backend/utils/cache/partcache.c index 342bb46a55..882e8c6b58 100644 --- a/src/backend/utils/cache/partcache.c +++ b/src/backend/utils/cache/partcache.c @@ -38,15 +38,6 @@ static List *generate_partition_qual(Relation rel); -static int32 qsort_partition_hbound_cmp(const void *a, const void *b); -static int32 qsort_partition_list_value_cmp(const void *a, const void *b, - void *arg); -static int32 qsort_partition_rbound_cmp(const void *a, const void *b, - void *arg); -static PartitionBoundInfo partition_bounds_create(List *boundspecs, - PartitionKey key, - int **mapping); - /* * RelationBuildPartitionKey @@ -496,541 +487,3 @@ generate_partition_qual(Relation rel) return result; } - -/* - * qsort_partition_hbound_cmp - * - * We sort hash bounds by modulus, then by remainder. - */ -static int32 -qsort_partition_hbound_cmp(const void *a, const void *b) -{ - PartitionHashBound *h1 = (*(PartitionHashBound *const *) a); - PartitionHashBound *h2 = (*(PartitionHashBound *const *) b); - - return partition_hbound_cmp(h1->modulus, h1->remainder, - h2->modulus, h2->remainder); -} - -/* - * qsort_partition_list_value_cmp - * - * Compare two list partition bound datums - */ -static int32 -qsort_partition_list_value_cmp(const void *a, const void *b, void *arg) -{ - Datum val1 = (*(const PartitionListValue **) a)->value, - val2 = (*(const PartitionListValue **) b)->value; - PartitionKey key = (PartitionKey) arg; - - return DatumGetInt32(FunctionCall2Coll(&key->partsupfunc[0], - key->partcollation[0], - val1, val2)); -} - -/* Used when sorting range bounds across all range partitions */ -static int32 -qsort_partition_rbound_cmp(const void *a, const void *b, void *arg) -{ - PartitionRangeBound *b1 = (*(PartitionRangeBound *const *) a); - PartitionRangeBound *b2 = (*(PartitionRangeBound *const *) b); - PartitionKey key = (PartitionKey) arg; - - return partition_rbound_cmp(key->partnatts, key->partsupfunc, - key->partcollation, b1->datums, b1->kind, - b1->lower, b2); -} - -/* - * partition_bounds_create - * Build a PartitionBoundInfo struct from a list of PartitionBoundSpec - * nodes - * - * This function creates a PartitionBoundInfo and fills the values of its - * various members based on the input list. Importantly, 'datums' array will - * contain Datum representation of individual bounds (possibly after - * de-duplication as in case of range bounds), sorted in a canonical order - * defined by qsort_partition_* functions of respective partitioning methods. - * 'indexes' array will contain as many elements as there are bounds (specific - * exceptions to this rule are listed in the function body), which represent - * the 0-based canonical positions of partitions. - * - * Upon return from this function, *mapping is set to an array of - * list_length(boundspecs) / nparts elements, each of which maps the canonical - * index of a given partition to its 0-based position in the original list. - * - * Note: All the memory allocated by this function, including that of the - * the returned PartitionBoundInfo and its members is allocated in the context - * that was active when the function was called. - */ -static PartitionBoundInfo -partition_bounds_create(List *boundspecs, PartitionKey key, int **mapping) -{ - PartitionBoundInfo boundinfo; - ListCell *cell; - int i, - nparts; - int ndatums = 0; - int default_index = -1; - int next_index = 0; - - nparts = list_length(boundspecs); - Assert(nparts > 0); - - /* - * For each partitioning method, we first convert the partition bounds - * from their parser node representation to the internal representation, - * along with any additional preprocessing (such performing de-duplication - * on range bounds). For each bound, we remember its partition's position - * (0-based) in the original list, so that we can add it to the *mapping - * array. - * - * Resulting bound datums are then added to the 'datums' array in - * PartitionBoundInfo. For each datum added, an integer indicating the - * canonical partition index is added to the 'indexes' array. - */ - boundinfo = (PartitionBoundInfoData *) - palloc0(sizeof(PartitionBoundInfoData)); - boundinfo->strategy = key->strategy; - boundinfo->null_index = -1; - boundinfo->default_index = -1; - *mapping = (int *) palloc(sizeof(int) * nparts); - switch (key->strategy) - { - case PARTITION_STRATEGY_HASH: - { - PartitionHashBound **hbounds = NULL; - int greatest_modulus; - - ndatums = nparts; - hbounds = (PartitionHashBound **) - palloc(nparts * sizeof(PartitionHashBound *)); - - i = 0; - foreach(cell, boundspecs) - { - PartitionBoundSpec *spec = castNode(PartitionBoundSpec, - lfirst(cell)); - - if (spec->strategy != PARTITION_STRATEGY_HASH) - elog(ERROR, "invalid strategy in partition bound spec"); - - hbounds[i] = (PartitionHashBound *) - palloc(sizeof(PartitionHashBound)); - - hbounds[i]->modulus = spec->modulus; - hbounds[i]->remainder = spec->remainder; - hbounds[i]->index = i; - i++; - } - - /* Sort all the bounds in ascending order */ - qsort(hbounds, nparts, sizeof(PartitionHashBound *), - qsort_partition_hbound_cmp); - - /* Moduli are stored in ascending order */ - greatest_modulus = hbounds[ndatums - 1]->modulus; - - boundinfo->ndatums = ndatums; - boundinfo->datums = (Datum **) palloc0(ndatums * - sizeof(Datum *)); - boundinfo->indexes = (int *) palloc(greatest_modulus * - sizeof(int)); - - for (i = 0; i < greatest_modulus; i++) - boundinfo->indexes[i] = -1; - - /* - * For hash partitioning, there are as many datums (modulus and - * remainder pairs) as there are partitions. Indexes are - * simply values ranging from 0 to npart - 1. - */ - for (i = 0; i < nparts; i++) - { - int orig_index = hbounds[i]->index; - int modulus = hbounds[i]->modulus; - int remainder = hbounds[i]->remainder; - - boundinfo->datums[i] = (Datum *) palloc(2 * - sizeof(Datum)); - boundinfo->datums[i][0] = Int32GetDatum(modulus); - boundinfo->datums[i][1] = Int32GetDatum(remainder); - - while (remainder < greatest_modulus) - { - /* overlap? */ - Assert(boundinfo->indexes[remainder] == -1); - boundinfo->indexes[remainder] = i; - remainder += modulus; - } - - /* Add the mapping element. */ - (*mapping)[i] = orig_index; - - pfree(hbounds[i]); - } - pfree(hbounds); - break; - } - - case PARTITION_STRATEGY_LIST: - { - PartitionListValue **all_values = NULL; - int null_index = -1; - List *non_null_values = NIL; - int *listpart_canon_idx; - - /* - * Create a unified list of non-null values across all - * partitions. - */ - i = 0; - foreach(cell, boundspecs) - { - PartitionBoundSpec *spec = castNode(PartitionBoundSpec, - lfirst(cell)); - ListCell *c; - - if (spec->strategy != PARTITION_STRATEGY_LIST) - elog(ERROR, "invalid strategy in partition bound spec"); - - /* - * Note the index of the partition bound spec for the - * default partition. There's no datum to add to the list - * of non-null datums for this partition. - */ - if (spec->is_default) - { - default_index = i; - i++; - continue; - } - - foreach(c, spec->listdatums) - { - Const *val = castNode(Const, lfirst(c)); - PartitionListValue *list_value = NULL; - - if (!val->constisnull) - { - list_value = (PartitionListValue *) - palloc0(sizeof(PartitionListValue)); - list_value->index = i; - list_value->value = val->constvalue; - } - else - { - /* - * Never put a null into the values array, flag - * instead for the code further down below where - * we construct the actual relcache struct. - */ - if (null_index != -1) - elog(ERROR, "found null more than once"); - null_index = i; - } - - if (list_value) - non_null_values = lappend(non_null_values, - list_value); - } - - i++; - } - - ndatums = list_length(non_null_values); - - /* - * Collect all list values in one array. Alongside the value, - * we also save the index of partition the value comes from. - */ - all_values = (PartitionListValue **) palloc(ndatums * - sizeof(PartitionListValue *)); - i = 0; - foreach(cell, non_null_values) - { - PartitionListValue *src = lfirst(cell); - - all_values[i] = (PartitionListValue *) - palloc(sizeof(PartitionListValue)); - all_values[i]->value = src->value; - all_values[i]->index = src->index; - i++; - } - - qsort_arg(all_values, ndatums, sizeof(PartitionListValue *), - qsort_partition_list_value_cmp, (void *) key); - - boundinfo->ndatums = ndatums; - boundinfo->datums = (Datum **) palloc0(ndatums * - sizeof(Datum *)); - boundinfo->indexes = (int *) palloc(ndatums * sizeof(int)); - - /* - * Copy values. Indexes are values ranging from 0 to - * npart - 1, assigned to each partition such that all datums - * of a given partition receive the same value. The value - * for a given partition is the index of that partition's - * smallest datum in the all_values[] array. We keep track - * of whether a given partition has been assigned an index - * yet using the 'listpart_canon_idx' array. - * - * Initialize listpart_canon_idx elements to -1 to indicate - * that an index hasn't been assigned yet. - */ - listpart_canon_idx = (int *) palloc(sizeof(int) * nparts); - memset(listpart_canon_idx, -1, sizeof(int) * nparts); - for (i = 0; i < ndatums; i++) - { - int orig_index = all_values[i]->index; - - boundinfo->datums[i] = (Datum *) palloc(sizeof(Datum)); - boundinfo->datums[i][0] = datumCopy(all_values[i]->value, - key->parttypbyval[0], - key->parttyplen[0]); - - /* - * Add the mapping element, if not already done for this - * partition. - */ - if (listpart_canon_idx[orig_index] == -1) - { - /* Add the mapping element. */ - (*mapping)[next_index] = orig_index; - listpart_canon_idx[orig_index] = next_index++; - } - - boundinfo->indexes[i] = listpart_canon_idx[orig_index]; - } - - /* - * Set null_index, if any. - * - * It's possible that the null-accepting partition has not been - * assigned an index yet, which could happen if such partition - * accepts only null and hence not handled in the above loop - * which only looked at non-null values. - */ - if (null_index != -1) - { - Assert(null_index >= 0); - if (listpart_canon_idx[null_index] == -1) - { - /* Add the mapping element. */ - (*mapping)[next_index] = null_index; - listpart_canon_idx[null_index] = next_index++; - } - boundinfo->null_index = listpart_canon_idx[null_index]; - } - - /* Set default_index, if any. */ - if (default_index != -1) - { - /* - * The default partition accepts any value not - * specified in the lists of other partitions, hence - * it should not have been assigned an index yet. - */ - Assert(default_index >= 0); - Assert(listpart_canon_idx[default_index] == -1); - /* Add the mapping element. */ - (*mapping)[next_index] = default_index; - boundinfo->default_index = next_index++; - } - - /* All partition must now have a valid mapping */ - Assert(next_index == nparts); - break; - } - - case PARTITION_STRATEGY_RANGE: - { - PartitionRangeBound **rbounds = NULL; - PartitionRangeBound **all_bounds, - *prev; - int k; - - all_bounds = (PartitionRangeBound **) palloc0(2 * nparts * - sizeof(PartitionRangeBound *)); - - /* Create a unified list of range bounds across all the partitions. */ - i = ndatums = 0; - foreach(cell, boundspecs) - { - PartitionBoundSpec *spec = castNode(PartitionBoundSpec, - lfirst(cell)); - PartitionRangeBound *lower, - *upper; - - if (spec->strategy != PARTITION_STRATEGY_RANGE) - elog(ERROR, "invalid strategy in partition bound spec"); - - /* - * Note the index of the partition bound spec for the - * default partition. There's no datum to add to the - * allbounds array for this partition. - */ - if (spec->is_default) - { - default_index = i++; - continue; - } - - lower = make_one_partition_rbound(key, i, - spec->lowerdatums, - true); - upper = make_one_partition_rbound(key, i, - spec->upperdatums, - false); - all_bounds[ndatums++] = lower; - all_bounds[ndatums++] = upper; - i++; - } - - Assert(ndatums == nparts * 2 || - (default_index != -1 && ndatums == (nparts - 1) * 2)); - - /* Sort all the bounds in ascending order */ - qsort_arg(all_bounds, ndatums, - sizeof(PartitionRangeBound *), - qsort_partition_rbound_cmp, - (void *) key); - - /* - * De-duplicate: Save distinct bounds from 'all_bounds' into - * 'rbounds'. In many cases, lower bound of a partition - * matches exactly with the upper of the previous partition, - * in which case, we only store one. - */ - rbounds = (PartitionRangeBound **) - palloc(ndatums * sizeof(PartitionRangeBound *)); - k = 0; - prev = NULL; - for (i = 0; i < ndatums; i++) - { - PartitionRangeBound *cur = all_bounds[i]; - bool is_distinct = false; - int j; - - /* Is the current bound distinct from the previous one? */ - for (j = 0; j < key->partnatts; j++) - { - Datum cmpval; - - if (prev == NULL || cur->kind[j] != prev->kind[j]) - { - is_distinct = true; - break; - } - - /* - * If the bounds are both MINVALUE or MAXVALUE, stop - * now and treat them as equal, since any values after - * this point must be ignored. - */ - if (cur->kind[j] != PARTITION_RANGE_DATUM_VALUE) - break; - - cmpval = FunctionCall2Coll(&key->partsupfunc[j], - key->partcollation[j], - cur->datums[j], - prev->datums[j]); - if (DatumGetInt32(cmpval) != 0) - { - is_distinct = true; - break; - } - } - - /* - * Only if the bound is distinct save it into a temporary - * array i.e. rbounds which is later copied into boundinfo - * datums array. - */ - if (is_distinct) - rbounds[k++] = all_bounds[i]; - - prev = cur; - } - - /* Update ndatums to hold the count of distinct datums. */ - ndatums = k; - - /* - * Add datums to boundinfo. Indexes are values ranging from - * 0 to nparts - 1, assigned in that order to each partition's - * upper bound. For 'datums' elements that are lower bounds, - * there is -1 in 'indexes' array to signify that no partition - * exists for the values less than such a bound and greater - * than or equal to the previous upper bound. - */ - boundinfo->ndatums = ndatums; - boundinfo->datums = (Datum **) palloc0(ndatums * - sizeof(Datum *)); - boundinfo->kind = (PartitionRangeDatumKind **) - palloc(ndatums * - sizeof(PartitionRangeDatumKind *)); - /* - * For range partitioning, an additional value of -1 is stored - * as the last element. - */ - boundinfo->indexes = (int *) palloc((ndatums + 1) * - sizeof(int)); - - for (i = 0; i < ndatums; i++) - { - int j; - - boundinfo->datums[i] = (Datum *) palloc(key->partnatts * - sizeof(Datum)); - boundinfo->kind[i] = (PartitionRangeDatumKind *) - palloc(key->partnatts * - sizeof(PartitionRangeDatumKind)); - for (j = 0; j < key->partnatts; j++) - { - if (rbounds[i]->kind[j] == PARTITION_RANGE_DATUM_VALUE) - boundinfo->datums[i][j] = - datumCopy(rbounds[i]->datums[j], - key->parttypbyval[j], - key->parttyplen[j]); - boundinfo->kind[i][j] = rbounds[i]->kind[j]; - } - - /* Assign -1 to lower bounds as described above. */ - if (rbounds[i]->lower) - boundinfo->indexes[i] = -1; - else - { - int orig_index = rbounds[i]->index; - - /* Add the mapping element. */ - (*mapping)[next_index] = orig_index; - - boundinfo->indexes[i] = next_index++; - } - } - - /* Set default_index, if any. */ - if (default_index != -1) - { - Assert(default_index >= 0); - /* Add the mapping element. */ - (*mapping)[next_index] = default_index; - boundinfo->default_index = next_index++; - } - boundinfo->indexes[i] = -1; - - /* All partition must now have a valid mapping */ - Assert(next_index == nparts); - break; - } - - default: - elog(ERROR, "unexpected partition strategy: %d", - (int) key->strategy); - break; - } - - return boundinfo; -} diff --git a/src/include/partitioning/partbounds.h b/src/include/partitioning/partbounds.h index c7535e32fc..991f294e84 100644 --- a/src/include/partitioning/partbounds.h +++ b/src/include/partitioning/partbounds.h @@ -75,40 +75,14 @@ typedef struct PartitionBoundInfoData #define partition_bound_accepts_nulls(bi) ((bi)->null_index != -1) #define partition_bound_has_default(bi) ((bi)->default_index != -1) -/* - * When qsort'ing partition bounds after reading from the catalog, each bound - * is represented with one of the following structs. - */ - -/* One bound of a hash partition */ -typedef struct PartitionHashBound -{ - int modulus; - int remainder; - int index; -} PartitionHashBound; - -/* One value coming from some (index'th) list partition */ -typedef struct PartitionListValue -{ - int index; - Datum value; -} PartitionListValue; - -/* One bound of a range partition */ -typedef struct PartitionRangeBound -{ - int index; - Datum *datums; /* range bound datums */ - PartitionRangeDatumKind *kind; /* the kind of each datum */ - bool lower; /* this is the lower (vs upper) bound */ -} PartitionRangeBound; - extern int get_hash_partition_greatest_modulus(PartitionBoundInfo b); extern uint64 compute_partition_hash_value(int partnatts, FmgrInfo *partsupfunc, Datum *values, bool *isnull); extern List *get_qual_from_partbound(Relation rel, Relation parent, PartitionBoundSpec *spec); +extern PartitionBoundInfo partition_bounds_create(List *boundspecs, + PartitionKey key, + int **mapping); extern bool partition_bounds_equal(int partnatts, int16 *parttyplen, bool *parttypbyval, PartitionBoundInfo b1, PartitionBoundInfo b2); @@ -120,14 +94,6 @@ extern void check_default_partition_contents(Relation parent, Relation defaultRel, PartitionBoundSpec *new_spec); -extern PartitionRangeBound *make_one_partition_rbound(PartitionKey key, int index, - List *datums, bool lower); -extern int32 partition_hbound_cmp(int modulus1, int remainder1, int modulus2, - int remainder2); -extern int32 partition_rbound_cmp(int partnatts, FmgrInfo *partsupfunc, - Oid *partcollation, Datum *datums1, - PartitionRangeDatumKind *kind1, bool lower1, - PartitionRangeBound *b2); extern int32 partition_rbound_datum_cmp(FmgrInfo *partsupfunc, Oid *partcollation, Datum *rb_datums, PartitionRangeDatumKind *rb_kind, @@ -136,10 +102,6 @@ extern int partition_list_bsearch(FmgrInfo *partsupfunc, Oid *partcollation, PartitionBoundInfo boundinfo, Datum value, bool *is_equal); -extern int partition_range_bsearch(int partnatts, FmgrInfo *partsupfunc, - Oid *partcollation, - PartitionBoundInfo boundinfo, - PartitionRangeBound *probe, bool *is_equal); extern int partition_range_datum_bsearch(FmgrInfo *partsupfunc, Oid *partcollation, PartitionBoundInfo boundinfo, -- 2.11.0