Hello hackers, This patch adds a mul_var_large() that is dispatched to from mul_var() for var1ndigits >= 8, regardless of rscale.
The main idea with mul_var_large() is to reduce the "n" in O(n^2) by a factor of two. This is achieved by first converting the (ndigits) number of int16 NBASE digits, to (ndigits/2) number of int32 NBASE^2 digits, as well as upgrading the int32 variables to int64-variables so that the products and carry values fit. The existing mul_var() algorithm is then executed without structural change. Finally, the int32 NBASE^2 result digits are converted back to twice the number of int16 NBASE digits. This adds overhead of approximately 4 * O(n), due to the conversion. Benchmarks indicates mul_var_large() starts to be beneficial when var1 is at least 8 ndigits, or perhaps a little more. Definitively an interesting speed-up for 100 ndigits and above. Benchmarked on Apple M3 Max so far: -- var1ndigits == var2ndigits == 10 SELECT COUNT(*) FROM n_10 WHERE product = var1 * var2; Time: 3957.740 ms (00:03.958) -- HEAD Time: 3943.795 ms (00:03.944) -- mul_var_large -- var1ndigits == var2ndigits == 100 SELECT COUNT(*) FROM n_100 WHERE product = var1 * var2; Time: 1532.594 ms (00:01.533) -- HEAD Time: 1065.974 ms (00:01.066) -- mul_var_large -- var1ndigits == var2ndigits == 1000 SELECT COUNT(*) FROM n_1000 WHERE product = var1 * var2; Time: 3055.372 ms (00:03.055) -- HEAD Time: 2295.888 ms (00:02.296) -- mul_var_large -- var1ndigits and var2ndigits completely random, -- with random number of decimal digits SELECT COUNT(*) FROM n_mixed WHERE product = var1 * var2; Time: 46796.240 ms (00:46.796) -- HEAD Time: 27970.006 ms (00:27.970) -- mul_var_large -- var1ndigits == var2ndigits == 16384 SELECT COUNT(*) FROM n_max WHERE product = var1 * var2; Time: 3191.145 ms (00:03.191) -- HEAD Time: 1836.404 ms (00:01.836) -- mul_var_large Regards, Joel
mul_var_large.patch
Description: Binary data
bench_mul.sql
Description: Binary data
bench_mul-init.sql
Description: Binary data