> > While working on [1], I observed that extra memory is allocated in
> > [1] 
> > https://mail.google.com/mail/u/2/#search/multi+column+list/KtbxLxgZZTjRxNrBWvmHzDTHXCHLssSprg?compose=CllgCHrjDqKgWCBNMmLqhzKhmrvHhSRlRVZxPCVcLkLmFQwrccpTpqLNgbWqKkTkTFCHMtZjWnV

I am really sorry for this. I wanted to point to the thread subjected
to 'Multi-Column List Partitioning'.

> If it's worth counting list elements in advance, then you can also allocate 
> the
> PartitionListValue as a single chunk, rather than palloc in a loop.
> This may help large partition heirarchies.
>
> And the same thing in create_hash_bounds with hbounds.

I agree and thanks for creating those patches. I am not able to apply
the patch on the latest HEAD. Kindly check and upload the modified
patches.

> I'm not able to detect that this is saving more than about ~1% less RAM, to
> create or select from 1000 partitions, probably because other data structures
> are using much more, and savings here are relatively small.

Yes it does not save huge memory but it's an improvement.

> I'm going to add to the next CF.  You can add yourself as an author, and watch
> that the patch passes tests in cfbot.
> https://commitfest.postgresql.org/
> http://cfbot.cputube.org/

Thanks for creating the commitfest entry.

> Since the function returns the total number of non null bound values, should 
> it be named get_non_null_list_bounds_count ?
> It can also be named get_count_of_... but that's longer.

Changed it to 'get_non_null_list_bounds_count'.

> The palloc() call would take place even if ndatums is 0. I think in that 
> case, palloc() doesn't need to be called.

I feel there is no such case where the 'ndatums' is 0 because as we
can see below, there is an assert in the 'partition_bounds_create'
function from where we call the 'create_list_bounds' function. Kindly
provide such a case if I am wrong.

PartitionBoundInfo
partition_bounds_create(PartitionBoundSpec **boundspecs, int nparts,
                        PartitionKey key, int **mapping)
{
    int         i;

    Assert(nparts > 0);

Thanks & Regards,
Nitin Jadhav
On Sun, May 16, 2021 at 10:52 PM Zhihong Yu <z...@yugabyte.com> wrote:
>
>
>
> On Sun, May 16, 2021 at 10:00 AM Justin Pryzby <pry...@telsasoft.com> wrote:
>>
>> On Sat, May 15, 2021 at 02:40:45PM +0530, Nitin Jadhav wrote:
>> > While working on [1], I observed that extra memory is allocated in
>> > [1] 
>> > https://mail.google.com/mail/u/2/#search/multi+column+list/KtbxLxgZZTjRxNrBWvmHzDTHXCHLssSprg?compose=CllgCHrjDqKgWCBNMmLqhzKhmrvHhSRlRVZxPCVcLkLmFQwrccpTpqLNgbWqKkTkTFCHMtZjWnV
>>
>> This is a link to your gmail, not to anything public.
>>
>> If it's worth counting list elements in advance, then you can also allocate 
>> the
>> PartitionListValue as a single chunk, rather than palloc in a loop.
>> This may help large partition heirarchies.
>>
>> And the same thing in create_hash_bounds with hbounds.
>>
>> create_range_bounds() already doesn't call palloc in a loop.  However, then
>> there's an asymmetry in create_range_bounds, which is still takes a
>> double-indirect pointer.
>>
>> I'm not able to detect that this is saving more than about ~1% less RAM, to
>> create or select from 1000 partitions, probably because other data structures
>> are using much more, and savings here are relatively small.
>>
>> I'm going to add to the next CF.  You can add yourself as an author, and 
>> watch
>> that the patch passes tests in cfbot.
>> https://commitfest.postgresql.org/
>> http://cfbot.cputube.org/
>>
>> Thanks,
>> --
>> Justin
>
> Hi,
> For 0001-Removed-extra-memory-allocations-from-create_list_bo.patch :
>
> +static int
> +get_non_null_count_list_bounds(PartitionBoundSpec **boundspecs, int nparts)
>
> Since the function returns the total number of non null bound values, should 
> it be named get_non_null_list_bounds_count ?
> It can also be named get_count_of_... but that's longer.
>
> +   all_values = (PartitionListValue **)
> +       palloc(ndatums * sizeof(PartitionListValue *));
>
> The palloc() call would take place even if ndatums is 0. I think in that 
> case, palloc() doesn't need to be called.
>
> Cheers
>
diff --git a/src/backend/partitioning/partbounds.c b/src/backend/partitioning/partbounds.c
index 7925fcc..e0a04fe 100644
--- a/src/backend/partitioning/partbounds.c
+++ b/src/backend/partitioning/partbounds.c
@@ -433,6 +433,32 @@ create_hash_bounds(PartitionBoundSpec **boundspecs, int nparts,
 }
 
 /*
+ * get_non_null_list_bounds_count
+ * 		Calculates the total number of non null bound values of
+ * 		all the partitions.
+ */
+static int
+get_non_null_list_bounds_count(PartitionBoundSpec **boundspecs, int nparts)
+{
+	int	i = 0;
+	int count = 0;
+
+	for (i = 0; i < nparts; i++)
+	{
+		ListCell   *c;
+
+		foreach(c, boundspecs[i]->listdatums)
+		{
+			Const   *val = castNode(Const, lfirst(c));
+
+			if (!val->constisnull)
+				count++;
+		}
+	}
+
+	return count;
+}
+/*
  * create_list_bounds
  *		Create a PartitionBoundInfo for a list partitioned table
  */
@@ -442,13 +468,12 @@ create_list_bounds(PartitionBoundSpec **boundspecs, int nparts,
 {
 	PartitionBoundInfo boundinfo;
 	PartitionListValue **all_values = NULL;
-	ListCell   *cell;
 	int			i = 0;
+	int			j = 0;
 	int			ndatums = 0;
 	int			next_index = 0;
 	int			default_index = -1;
 	int			null_index = -1;
-	List	   *non_null_values = NIL;
 
 	boundinfo = (PartitionBoundInfoData *)
 		palloc0(sizeof(PartitionBoundInfoData));
@@ -457,6 +482,10 @@ create_list_bounds(PartitionBoundSpec **boundspecs, int nparts,
 	boundinfo->null_index = -1;
 	boundinfo->default_index = -1;
 
+	ndatums = get_non_null_list_bounds_count(boundspecs, nparts);
+	all_values = (PartitionListValue **)
+		palloc(ndatums * sizeof(PartitionListValue *));
+
 	/* Create a unified list of non-null values across all partitions. */
 	for (i = 0; i < nparts; i++)
 	{
@@ -480,14 +509,14 @@ create_list_bounds(PartitionBoundSpec **boundspecs, int nparts,
 		foreach(c, spec->listdatums)
 		{
 			Const	   *val = castNode(Const, lfirst(c));
-			PartitionListValue *list_value = NULL;
 
 			if (!val->constisnull)
 			{
-				list_value = (PartitionListValue *)
+				all_values[j] = (PartitionListValue *)
 					palloc0(sizeof(PartitionListValue));
-				list_value->index = i;
-				list_value->value = val->constvalue;
+				all_values[j]->index = i;
+				all_values[j]->value = val->constvalue;
+				j++;
 			}
 			else
 			{
@@ -499,32 +528,9 @@ create_list_bounds(PartitionBoundSpec **boundspecs, int nparts,
 					elog(ERROR, "found null more than once");
 				null_index = i;
 			}
-
-			if (list_value)
-				non_null_values = lappend(non_null_values, list_value);
 		}
 	}
 
-	ndatums = list_length(non_null_values);
-
-	/*
-	 * Collect all list values in one array. Alongside the value, we also save
-	 * the index of partition the value comes from.
-	 */
-	all_values = (PartitionListValue **)
-		palloc(ndatums * sizeof(PartitionListValue *));
-	i = 0;
-	foreach(cell, non_null_values)
-	{
-		PartitionListValue *src = lfirst(cell);
-
-		all_values[i] = (PartitionListValue *)
-			palloc(sizeof(PartitionListValue));
-		all_values[i]->value = src->value;
-		all_values[i]->index = src->index;
-		i++;
-	}
-
 	qsort_arg(all_values, ndatums, sizeof(PartitionListValue *),
 			  qsort_partition_list_value_cmp, (void *) key);
 

Reply via email to