Hi,

While working on [1], I noticed that the 'create_list_bounds' function
allocates more memory than it needs to. The attached patch removes these
extra allocations and also removes the unused variable 'cell'.

The existing code in create_list_bounds() works as follows:

   1. It iterates through all the partitions and, for each partition,
      iterates through the list of datums named 'listdatums'.
         - For each non-null value in 'listdatums', it allocates memory
           for 'list_value', whose type is 'PartitionListValue', and
           stores the value and index information in it.
         - It appends 'list_value' to a list named 'non_null_values'.
   2. It allocates memory for 'all_values', which holds the list bounds
      of all the partitions. The number of entries allocated is the total
      number of non-null values collected in step 1.
   3. It iterates through each item of the 'non_null_values' list.
         - For each item, it allocates memory for 'all_values[i]', whose
           type is 'PartitionListValue', and copies the information from
           'list_value'.

The patch changes this logic to the following:

   1. It calls the function 'get_non_null_count_list_bounds()', which
      iterates through all the partitions and, for each partition,
      iterates through its list of datums to count all the non-null
      bound values.
   2. It allocates memory for 'all_values', which holds the list bounds
      of all the partitions. The number of entries allocated is the total
      number of non-null values computed in step 1.
   3. It iterates through all the partitions and, for each partition,
      iterates through the list of datums named 'listdatums'.
         - For each non-null value in 'listdatums', it allocates memory
           for 'all_values[i]', whose type is 'PartitionListValue', and
           stores the value and index information in it directly.
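
The count-then-fill approach can be sketched in standalone C as shown
below. This is only an illustration and not the code from the patch: the
types 'ListDatum', 'Partition' and 'ListValue' and the functions
'count_non_null', 'collect_list_values' and 'free_list_values' are
hypothetical stand-ins, and it uses malloc() where PostgreSQL would use
palloc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the PostgreSQL structures (assumptions). */
typedef struct ListDatum { bool isnull; long value; } ListDatum;
typedef struct Partition { int ndatums; const ListDatum *datums; } Partition;
typedef struct ListValue { int index; long value; } ListValue;

/* Pass 1: count the non-null bound values across all partitions. */
static int
count_non_null(const Partition *parts, int nparts)
{
    int count = 0;

    for (int i = 0; i < nparts; i++)
        for (int j = 0; j < parts[i].ndatums; j++)
            if (!parts[i].datums[j].isnull)
                count++;
    return count;
}

/* Free the first nvalues entries and the pointer array itself. */
static void
free_list_values(ListValue **values, int nvalues)
{
    if (values == NULL)
        return;
    for (int i = 0; i < nvalues; i++)
        free(values[i]);
    free(values);
}

/*
 * Pass 2: size the pointer array from the count, then allocate each entry
 * once and fill it in place. No intermediate list, no copy step.
 * Returns NULL on allocation failure.
 */
static ListValue **
collect_list_values(const Partition *parts, int nparts, int *nvalues)
{
    int         total = count_non_null(parts, nparts);
    int         next = 0;
    ListValue **all_values;

    /* Allocate at least one slot so an empty result is not NULL. */
    all_values = malloc((size_t) (total > 0 ? total : 1) * sizeof(ListValue *));
    if (all_values == NULL)
        return NULL;

    for (int i = 0; i < nparts; i++)
    {
        for (int j = 0; j < parts[i].ndatums; j++)
        {
            if (parts[i].datums[j].isnull)
                continue;       /* NULL bounds are not stored here */

            all_values[next] = malloc(sizeof(ListValue));
            if (all_values[next] == NULL)
            {
                free_list_values(all_values, next);
                return NULL;
            }
            all_values[next]->index = i;    /* partition index */
            all_values[next]->value = parts[i].datums[j].value;
            next++;
        }
    }

    assert(next == total);      /* both passes must agree */
    *nvalues = total;
    return all_values;
}
```

The trade-off in this sketch is that the datums are walked twice, in
exchange for allocating each 'ListValue' exactly once.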

This change removes the extra memory allocations. As an example, consider
a table with 10 partitions where each partition contains 11 bounds,
including a NULL value.

Parameters                                  | Existing code   | With patch
--------------------------------------------+-----------------+----------------
Memory allocations of 'PartitionListValue'  | 100 + 100 = 200 | 100
Total number of iterations                  | 110 + 100 = 210 | 110 + 110 = 220

As the data above shows, the total number of iterations increases slightly
(only when the bounds contain NULL values; otherwise there is no change),
but the number of memory allocations is halved. As memory allocations are
costly operations, I feel we should consider changing the existing code.

Please share your thoughts.

[1] - https://mail.google.com/mail/u/2/#search/multi+column+list/KtbxLxgZZTjRxNrBWvmHzDTHXCHLssSprg?compose=CllgCHrjDqKgWCBNMmLqhzKhmrvHhSRlRVZxPCVcLkLmFQwrccpTpqLNgbWqKkTkTFCHMtZjWnV

Thanks & Regards,
Nitin Jadhav

Attachment: v1_remove_extra_mem_alloc_from_list_bounds.patch