Hi,

While working on [1], I noticed that create_list_bounds() allocates more memory than it needs to. The attached patch removes these extra allocations and also removes the unused variable 'cell'.

In the existing code, create_list_bounds() does the following:

1. Iterates through all the partitions and, for each partition,
   - iterates through the list of datums named 'listdatums';
   - for each non-null value in 'listdatums', allocates memory for a 'list_value' of type 'PartitionListValue' and stores the value and index information in it;
   - appends 'list_value' to a list named 'non_null_values'.
2. Allocates memory for 'all_values', which holds the list bounds of all the partitions. The number of entries allocated is the total number of non-null values collected in step 1.
3. Iterates through each item of the 'non_null_values' list and, for each item, allocates memory for 'all_values[i]' (of type 'PartitionListValue') and copies the information from 'list_value'.

With the patch, the logic becomes:

1. Calls get_non_null_count_list_bounds(), which iterates through all the partitions and, for each partition, iterates through its list of datums and counts the non-null bound values.
2. Allocates memory for 'all_values', which holds the list bounds of all the partitions. The number of entries allocated is the total number of non-null values computed in step 1.
3. Iterates through all the partitions and, for each partition,
   - iterates through the list of datums named 'listdatums';
   - for each non-null value in 'listdatums', allocates memory for 'all_values[i]' (of type 'PartitionListValue') and stores the value and index information in it directly.

This removes the extra memory allocations, because the intermediate 'list_value' objects are no longer allocated.

Let's consider an example. Suppose there are 10 partitions and each partition contains 11 bounds, one of which is NULL.

Parameters                                    Existing code      With patch
--------------------------------------------  -----------------  -----------------
Memory allocations of 'PartitionListValue'    100 + 100 = 200    100
Total number of iterations                    110 + 100 = 210    110 + 110 = 220

As the table shows, the total number of iterations increases slightly (only when the bounds contain NULL values; otherwise there is no change), but the number of memory allocations drops from 200 to 100. As memory allocations are costly operations, I feel we should consider changing the existing code. Please share your thoughts.

[1] - https://mail.google.com/mail/u/2/#search/multi+column+list/KtbxLxgZZTjRxNrBWvmHzDTHXCHLssSprg?compose=CllgCHrjDqKgWCBNMmLqhzKhmrvHhSRlRVZxPCVcLkLmFQwrccpTpqLNgbWqKkTkTFCHMtZjWnV

Thanks & Regards,
Nitin Jadhav
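
For illustration, the count-then-fill approach described above can be sketched in standalone C as follows. This is a simplified model and not the patch itself: the 'Sketch*' types and the sketch_* function names are hypothetical stand-ins for the PostgreSQL structures, malloc() is used in place of palloc(), and the result is one flat array rather than a separately allocated 'PartitionListValue' per element.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the PostgreSQL definitions. */
typedef struct SketchDatum
{
    bool        isnull;     /* true if this bound is NULL */
    int         value;      /* bound value, meaningful only if !isnull */
} SketchDatum;

typedef struct SketchPartition
{
    int         ndatums;    /* number of bounds in this partition */
    const SketchDatum *datums;
} SketchPartition;

typedef struct SketchListValue
{
    int         index;      /* partition the value belongs to */
    int         value;
} SketchListValue;

/* Pass 1: count the non-null bound values across all partitions. */
static int
sketch_count_non_null(const SketchPartition *parts, int nparts)
{
    int         count = 0;

    for (int i = 0; i < nparts; i++)
        for (int j = 0; j < parts[i].ndatums; j++)
            if (!parts[i].datums[j].isnull)
                count++;
    return count;
}

/*
 * Pass 2: allocate the result once, then store each non-null value
 * directly, with no intermediate per-value list.  Returns NULL (and sets
 * *nvalues to 0) if there is nothing to collect or allocation fails.
 */
static SketchListValue *
sketch_collect_values(const SketchPartition *parts, int nparts, int *nvalues)
{
    int         count = sketch_count_non_null(parts, nparts);
    int         next = 0;
    SketchListValue *all_values;

    *nvalues = 0;
    if (count == 0)
        return NULL;

    all_values = malloc((size_t) count * sizeof(SketchListValue));
    if (all_values == NULL)
        return NULL;

    for (int i = 0; i < nparts; i++)
    {
        for (int j = 0; j < parts[i].ndatums; j++)
        {
            if (parts[i].datums[j].isnull)
                continue;   /* NULL bounds are skipped here */
            all_values[next].index = i;
            all_values[next].value = parts[i].datums[j].value;
            next++;
        }
    }

    assert(next == count);  /* both passes must agree on the count */
    *nvalues = count;
    return all_values;
}
```

The caller owns the returned array and frees it with free(). The extra counting pass is what accounts for the slightly higher iteration count in the table above.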
v1_remove_extra_mem_alloc_from_list_bounds.patch