Re: Optimize mul_var() for var1ndigits >= 8

2024-09-04 Thread Joel Jacobson
On Wed, Sep 4, 2024, at 09:22, Dean Rasheed wrote: > On Tue, 3 Sept 2024 at 21:31, Tom Lane wrote: >> >> Dean Rasheed writes: >> > Ah, OK. I've pushed a fix. >> >> There is an open CF entry pointing at this thread [1]. >> Shouldn't it be marked committed now? >> > > Oops, yes I missed that CF ent

Re: Optimize mul_var() for var1ndigits >= 8

2024-09-04 Thread Dean Rasheed
On Tue, 3 Sept 2024 at 21:31, Tom Lane wrote: > > Dean Rasheed writes: > > Ah, OK. I've pushed a fix. > > There is an open CF entry pointing at this thread [1]. > Shouldn't it be marked committed now? > Oops, yes I missed that CF entry. I've closed it now. Joel, are you still planning to work o

Re: Optimize mul_var() for var1ndigits >= 8

2024-09-03 Thread Tom Lane
Dean Rasheed writes: > Ah, OK. I've pushed a fix. There is an open CF entry pointing at this thread [1]. Shouldn't it be marked committed now? regards, tom lane [1] https://commitfest.postgresql.org/49/5115/

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-26 Thread Dean Rasheed
On Sat, 24 Aug 2024 at 19:17, Tom Lane wrote: > > About a dozen buildfarm members are complaining > Ah, OK. I've pushed a fix. Regards, Dean

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-24 Thread Tom Lane
Dean Rasheed writes: > On Wed, 14 Aug 2024 at 07:31, Joel Jacobson wrote: >> I think this is acceptable, since it produces more correct results. > Thanks for checking. I did a bit more testing myself and didn't see > any problems, so I have committed both these patches. About a dozen buildfarm

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-15 Thread Dean Rasheed
On Wed, 14 Aug 2024 at 07:31, Joel Jacobson wrote: > > I think this is acceptable, since it produces more correct results. > Thanks for checking. I did a bit more testing myself and didn't see any problems, so I have committed both these patches. > In addition, I've traced the rscale_adjustment

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-13 Thread Joel Jacobson
On Tue, Aug 13, 2024, at 13:01, Joel Jacobson wrote: > I think this is acceptable, since it produces more correct results. In addition, I've traced the rscale_adjustment -15 mul_var() calls to originate from numeric_exp() and numeric_power(), so I thought it would be good to brute-force test those

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-13 Thread Joel Jacobson
On Tue, Aug 13, 2024, at 12:23, Dean Rasheed wrote: > On Tue, 13 Aug 2024 at 08:49, Joel Jacobson wrote: >> >> I reran the tests and v5 produces much fewer diffs than v4. >> Not sure if the remaining ones are problematic or not. ... > > Yes, that's exactly the sort of thing you'd expect to see. Th

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-13 Thread Dean Rasheed
On Tue, 13 Aug 2024 at 08:49, Joel Jacobson wrote: > > I reran the tests and v5 produces much fewer diffs than v4. > Not sure if the remaining ones are problematic or not. > > joel@Joels-MBP postgresql % ./test-mul_var-verify-v5.sh > HEAD is now at a67a49648d Rename C23 keyword > SET > DROP TABLE

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-13 Thread Joel Jacobson
On Tue, Aug 13, 2024, at 09:49, Joel Jacobson wrote: > I reran the tests and v5 produces much fewer diffs than v4. > Not sure if the remaining ones are problematic or not. Attaching scripts if needed. Regards, Joel#!/bin/bash git reset --hard && \ { patch -p1 < ~/numeric_mul_rscale.patch && \

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-13 Thread Joel Jacobson
On Tue, Aug 13, 2024, at 00:56, Dean Rasheed wrote: > On Mon, 12 Aug 2024 at 16:17, Joel Jacobson wrote: >> >> On Mon, Aug 12, 2024, at 17:14, Joel Jacobson wrote: >> > The case found with the smallest rscale adjustment was this one: >> > -[ RECORD 1 ]--+ >> > v

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-12 Thread Dean Rasheed
On Mon, 12 Aug 2024 at 16:17, Joel Jacobson wrote: > > On Mon, Aug 12, 2024, at 17:14, Joel Jacobson wrote: > > The case found with the smallest rscale adjustment was this one: > > -[ RECORD 1 ]--+ > > var1 | 0.9873307197037692 > > var2

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-12 Thread Joel Jacobson
On Mon, Aug 12, 2024, at 17:14, Joel Jacobson wrote: > The case found with the smallest rscale adjustment was this one: > -[ RECORD 1 ]--+ > var1 | 0.9873307197037692 > var2 | 0.426697279270850 > rscale_adjustment | -15 >

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-12 Thread Joel Jacobson
On Mon, Aug 12, 2024, at 12:47, Joel Jacobson wrote: >> 2). Attempt to fix the formulae incorporating maxdigits mentioned >> above. This part really made my brain hurt, and I'm still not quite >> sure that I've got it right. In particular, it needs double-checking >> to ensure that it's not losing

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-12 Thread Joel Jacobson
On Sun, Aug 11, 2024, at 22:04, Joel Jacobson wrote: >> Attachments: >> * v4-0001-Extend-mul_var_short-to-5-and-6-digit-inputs.patch >> * v4-0002-Optimise-numeric-multiplication-using-base-NBASE-.patch > > I've reviewed and tested both patches and think they are ready to be > committed. In additi

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-11 Thread Joel Jacobson
On Tue, Aug 6, 2024, at 13:52, Dean Rasheed wrote: > On Mon, 5 Aug 2024 at 13:34, Joel Jacobson wrote: >> >> Noted from 5e1f3b9eb: >> "While it adds some space on 32-bit machines, we aren't optimizing for that >> case anymore." >> >> In this case, the extra 32-bit numeric_mul code seems to be 89

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-06 Thread Dean Rasheed
On Mon, 5 Aug 2024 at 13:34, Joel Jacobson wrote: > > Noted from 5e1f3b9eb: > "While it adds some space on 32-bit machines, we aren't optimizing for that > case anymore." > > In this case, the extra 32-bit numeric_mul code seems to be 89 lines of code, > excluding comments. > To me, this seems l

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-05 Thread Joel Jacobson
On Tue, Jul 30, 2024, at 00:31, Tom Lane wrote: > Dean Rasheed writes: >> On Mon, 29 Jul 2024 at 21:39, Joel Jacobson wrote: >>> I think it's non-obvious if the separate code paths for 32-bit and 64-bit, >>> using `#if SIZEOF_DATUM < 8`, to get *fast* 32-bit support, outweighs >>> the benefits of

Re: Optimize mul_var() for var1ndigits >= 8

2024-07-29 Thread Tom Lane
Dean Rasheed writes: > On Mon, 29 Jul 2024 at 21:39, Joel Jacobson wrote: >> I think it's non-obvious if the separate code paths for 32-bit and 64-bit, >> using `#if SIZEOF_DATUM < 8`, to get *fast* 32-bit support, outweighs >> the benefits of simpler code. > Looking at that other thread that yo

Re: Optimize mul_var() for var1ndigits >= 8

2024-07-29 Thread Dean Rasheed
On Mon, 29 Jul 2024 at 21:39, Joel Jacobson wrote: > > I like these simplifications, how `var2ndigits` is used instead of > `res_ndigits`: > - for (int i = res_ndigits - 3; i >= 1; i--) > + for (int i = var2ndigits - 1; i >= 1; i--) > > But I wonder why

Re: Optimize mul_var() for var1ndigits >= 8

2024-07-29 Thread Joel Jacobson
On Mon, Jul 29, 2024, at 22:01, Dean Rasheed wrote: > On Mon, 29 Jul 2024 at 18:57, Joel Jacobson wrote: >> >> Thanks to v3-0002, they are all still significantly faster when both patches >> have been applied, >> but I wonder if it is expected or not, that v3-0001 temporarily made them a >> bit

Re: Optimize mul_var() for var1ndigits >= 8

2024-07-29 Thread Dean Rasheed
On Mon, 29 Jul 2024 at 18:57, Joel Jacobson wrote: > > Thanks to v3-0002, they are all still significantly faster when both patches > have been applied, > but I wonder if it is expected or not, that v3-0001 temporarily made them a > bit slower? > There's no obvious reason why 0001 would make th

Re: Optimize mul_var() for var1ndigits >= 8

2024-07-29 Thread Joel Jacobson
On Mon, Jul 29, 2024, at 16:42, Joel Jacobson wrote: > New results with less noise below. > > Pardon the exceeding of 80 chars line width, > but felt important to include commit hash and relative delta. > > > ndigits|rate| change | accum | commit | > su

Re: Optimize mul_var() for var1ndigits >= 8

2024-07-29 Thread Joel Jacobson
On Mon, Jul 29, 2024, at 02:23, Joel Jacobson wrote: > Then, I've used sched_setaffinity() from to ensure the > benchmark run on CPU core id 31. I fixed a bug in my measure function, I had forgot to reset affinity after each benchmark, so the PostgreSQL backend continued to use the core even aft

Re: Optimize mul_var() for var1ndigits >= 8

2024-07-28 Thread Joel Jacobson
On Sun, Jul 28, 2024, at 21:18, Dean Rasheed wrote: > Attachments: > * v3-0002-Optimise-numeric-multiplication-using-base-NBASE-.patch > * v3-0001-Extend-mul_var_short-to-5-and-6-digit-inputs.patch Very nice. I've done some initial benchmarks on my Intel Core i9-14900K machine. To reduce noise,

Re: Optimize mul_var() for var1ndigits >= 8

2024-07-28 Thread Dean Rasheed
On Fri, 12 Jul 2024 at 13:34, Dean Rasheed wrote: > > Then I tried compiling with -m32, and unfortunately this made the > patch slower than HEAD for small inputs: > > -- var1ndigits1=5, var2ndigits2=5 [-m32, SIMD disabled] > call rate=5.052332e+06 -- HEAD > call rate=3.883459e+06 -- v2 patch > >

Re: Optimize mul_var() for var1ndigits >= 8

2024-07-12 Thread Dean Rasheed
On Sun, 7 Jul 2024 at 20:46, Joel Jacobson wrote: > > This patch adds a mul_var_large() that is dispatched to from mul_var() > for var1ndigits >= 8, regardless of rscale. > > -- var1ndigits == var2ndigits == 16384 > SELECT COUNT(*) FROM n_max WHERE product = var1 * var2; > Time: 3191.145 ms (00:03