Hi all,
I will be retiring from Intel at the end of this week. I wanted to introduce
the engineer who will be taking over the CRC32c proposal and commit fest entry.
Devulapalli, Raghuveer
I have brought him up to speed and he will be the go-to for technical review
comments and questions. Plea
> Things like sizeof() and offsetof() are known at compile time, so the compiler
> will recognize when a condition is always true or false and optimize it out
> accordingly. In cases where the value cannot be known at compile time,
> checking the length in the macro and dispatching to a different
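To make the idea in the quote above concrete, here is a minimal sketch of a macro that checks the length and dispatches to one of two implementations. Everything in it is an assumption for illustration: the names COMP_CRC32C_DISPATCH, crc32c_small and crc32c_large, and the 64-byte threshold are hypothetical and are not taken from the patch. Both paths use the same bitwise reference CRC-32C so the sketch stays self-contained. When the length is a compile-time constant such as sizeof(x), the compiler can fold the comparison and keep only one branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold; the thread does not state one for this macro. */
#define CRC32C_SMALL_LEN 64

/* Bitwise reference CRC-32C (Castagnoli, reflected polynomial 0x82F63B78). */
static uint32_t
crc32c_bitwise(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    assert(p != NULL || len == 0);
    while (len--)
    {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Stand-in for a path tuned for short inputs (hypothetical name). */
static uint32_t
crc32c_small(uint32_t crc, const void *data, size_t len)
{
    return crc32c_bitwise(crc, data, len);
}

/* Stand-in for a path tuned for long inputs (hypothetical name). */
static uint32_t
crc32c_large(uint32_t crc, const void *data, size_t len)
{
    return crc32c_bitwise(crc, data, len);
}

/*
 * Length check inside the macro. If (len) is a compile-time constant,
 * the condition is constant too and the unused branch can be dropped.
 */
#define COMP_CRC32C_DISPATCH(crc, data, len) \
    ((crc) = ((len) < CRC32C_SMALL_LEN) \
        ? crc32c_small((crc), (data), (len)) \
        : crc32c_large((crc), (data), (len)))

/* Convenience wrapper applying the usual initial value and final XOR. */
uint32_t
crc32c_of(const void *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    COMP_CRC32C_DISPATCH(crc, data, len);
    return crc ^ 0xFFFFFFFFu;
}
```

With the standard check string, crc32c_of("123456789", 9) yields 0xE3069283. Whether a dispatch of this kind actually removes the small-buffer regression would still have to be measured.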
> IMHO that would be useful to establish the current state of the patch set from
> a performance standpoint, especially since you've added code intended to
> mitigate the regression.
Ok.
> +#define COMP_CRC32C_SMALL(crc, data, len) \
> + ((crc) = pg_comp_crc32c_sse42((crc), (data), (len)))
>
> And this still shows the ~14% regression in your original post?
At the small buffer sizes the margin of error or "noise" is larger, 7-11%. My
average could be just bad luck. It will take me a while to re-setup for full
data collection runs but I can try it again if you like.
Paul
> I'm curious about where exactly the regression is coming from. Is it possible
> that your build for the SSE 4.2 tests was using it unconditionally, i.e.,
> optimizing away the function pointer?
I am calling the SSE 4.2 implementation directly; I am not even building the
pg_sse42_*_choose.c fil
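For readers following the function pointer question, the sketch below shows the general shape of runtime dispatch through a pointer that starts out aimed at a "choose" function. It is an illustration under assumptions, not the code from the tree: crc_indirect, crc_choose, crc_impl_portable and crc_choose_calls are hypothetical names, and the chooser always selects the portable path instead of probing the CPU. A build that calls one implementation directly skips this indirection entirely, which is the difference being asked about.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*crc_fn) (uint32_t crc, const void *data, size_t len);

/* Portable bitwise CRC-32C, used here as the only real implementation. */
static uint32_t
crc_impl_portable(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len--)
    {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

static uint32_t crc_choose(uint32_t crc, const void *data, size_t len);

/* Callers always go through this pointer; it initially points at the chooser. */
crc_fn      crc_indirect = crc_choose;

/* Counts chooser invocations, only to show that it runs once. */
int         crc_choose_calls = 0;

static uint32_t
crc_choose(uint32_t crc, const void *data, size_t len)
{
    crc_choose_calls++;

    /*
     * A real chooser would test CPU features here. This sketch always
     * installs the portable implementation (an assumption for brevity).
     */
    crc_indirect = crc_impl_portable;
    return crc_indirect(crc, data, len);
}
```

After the first call the pointer refers to the selected implementation, so later calls pay for one indirect call but not for the selection logic.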
> Upthread [0], Andres suggested dispatching to a different implementation for
> compile-time-known small lengths. Have you looked into that? In your
> original post, you noted a 14% regression for records smaller than 256 bytes,
> which is not an uncommon case for Postgres. IMO we should try to
Hi,
Here are the latest patches for the accelerated CRC32c algorithm. I did the
following to create these refactored patches:
1) From the main branch I moved all x86_64 hardware checks from the various
locations into a single location. I did not move any ARM tests as I would have
no way to tes
> Okay, that is very interesting. Yes, we will have no problem reproducing the
> exact license text in the source code. I think we can remove the license
> issue
> as a blocker for this patch.
Hi,
I was wondering if I can get a review, please. I am interested in the refactor
question for the
> It would be good to know exactly what, if any, changes the Intel lawyers want
> us to make to our license if we accept this patch.
I asked about this and there is nothing Intel requires here license-wise. They
believe that there is nothing wrong with including 3-Clause BSD-like licenses
under
> Hmm, I wonder if the "(c) 2024 Intel" line is going to bring us trouble.
> (I bet it's not really necessary anyway.)
Our lawyer agrees; copyright is covered by the "PostgreSQL Global Development
Group" copyright line as a contributor.
> And this bit doesn't look good. The LICENSE file says:
.
> This is extremely workload dependent, it's not hard to find workloads with
> lots of very small records and very few big ones... What you observed might
> have "just" been the warmup behaviour where more full page writes have to
> be written.
Can you tell me how to avoid capturing this "warm-up"
> -Original Message-
> From: Andres Freund
> Sent: Wednesday, June 12, 2024 1:12 PM
> To: Amonson, Paul D
> FWIW, I tried the v2 patch on my Xeon Gold 5215 workstation, and it dies early
> on with SIGILL:
Nice catch!!! I was testing the bit for the vpclmulqdq in
> The project is currently in feature-freeze in preparation for the next major
> release so new development and ideas are not the top priority right now.
> Additionally there is a large developer meeting shortly which many are busy
> preparing for. Exercise some patience, and I'm sure there will
Hi, forgive the top-post but I have not seen any response to this post?
Thanks,
Paul
> -Original Message-
> From: Amonson, Paul D
> Sent: Wednesday, May 1, 2024 8:56 AM
> To: pgsql-hackers@lists.postgresql.org
> Cc: Nathan Bossart ; Shankaran, Akash
>
> Subject:
Hi,
Comparing the current SSE4.2 implementation of the CRC32C algorithm in
Postgres, to an optimized AVX-512 algorithm [0] we observed significant gains.
The result was an average performance improvement of ~6.6X, measured on
3 different Intel products. Details below. The AVX-512 algorit
> A counterexample is the CRC32C code. AFAICT we assume the presence of
> CPUID in that code (and #error otherwise). I imagine it's probably safe to
> assume the compiler understands CPUID if it understands AVX512 intrinsics,
> but that is still mostly a guess.
If AVX-512 intrinsics are available
> On Thu, Mar 28, 2024 at 11:10:33PM +0100, Alvaro Herrera wrote:
> > We don't do MSVC via autoconf/Make. We used to have a special build
> > framework for MSVC which parsed Makefiles to produce "solution" files,
> > but it was removed as soon as Meson was mature enough to build. See
> > commit 1
> -Original Message-
>
> Cool. I think we should run the benchmarks again to be safe, though.
Ok, sure go ahead. :)
> >> I forgot to mention that I also want to understand whether we can
> >> actually assume availability of XGETBV when CPUID says we support
> >> AVX512:
> >
> > You canno
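The concern in this exchange is that a CPUID feature bit alone does not show that the OS or hypervisor has enabled the wider register state. A minimal sketch of the usual ordering is below, assuming GCC or Clang on x86 (it relies on <cpuid.h> and inline assembly, so it does not cover MSVC). The function name avx512f_usable is hypothetical. XGETBV is only executed after CPUID reports OSXSAVE, because the instruction faults otherwise.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/*
 * Returns 1 if AVX-512 Foundation instructions look usable, else 0.
 * Hypothetical helper; on non-x86 builds it always returns 0.
 */
int
avx512f_usable(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    uint32_t    xcr0_lo, xcr0_hi;

    /* Leaf 1, ECX bit 27 (OSXSAVE): the OS has enabled XGETBV/XSAVE. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    if ((ecx & (1u << 27)) == 0)
        return 0;

    /* Read XCR0. Safe only because OSXSAVE was confirmed above. */
    __asm__ volatile ("xgetbv" : "=a" (xcr0_lo), "=d" (xcr0_hi) : "c" (0));

    /*
     * Bits 1-2 cover SSE and AVX state; bits 5-7 cover the opmask and
     * ZMM state. All five must be set for the OS to save ZMM registers.
     */
    if ((xcr0_lo & 0xE6u) != 0xE6u)
        return 0;

    /* Leaf 7, subleaf 0, EBX bit 16: the CPU implements AVX512F. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return (ebx & (1u << 16)) != 0;
#else
    return 0;
#endif
}
```

An algorithm that needs further extensions (for example the vpclmulqdq bit mentioned elsewhere in this thread) would need its own CPUID test in addition to this one.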
> -Original Message-
> From: Amonson, Paul D
> Sent: Thursday, March 28, 2024 3:03 PM
> To: Nathan Bossart
> ...
> I will review the new patch to see if there is anything that jumps out at me.
I see in the meson.build you added the new file twice?
@@ -7,6 +7,7
> -Original Message-
> From: Nathan Bossart
> Sent: Thursday, March 28, 2024 2:39 PM
> To: Amonson, Paul D
>
> * The latest patch set from Paul Amonson appeared to support MSVC in the
> meson build, but not the autoconf one. I don't have much expertise
> -Original Message-
> From: Nathan Bossart
> Sent: Wednesday, March 27, 2024 3:00 PM
> To: Amonson, Paul D
>
> ... (I realize that I'm essentially
> recanting much of my previous feedback, which I apologize for.)
It happens. LOL As long as the algorithm fo
> -Original Message-
> From: Amonson, Paul D
> Sent: Monday, March 25, 2024 8:20 AM
> To: Tom Lane
> Cc: David Rowley ; Nathan Bossart
> ; Andres Freund ; Alvaro
> Herrera ; Shankaran, Akash
> ; Noah Misch ; Matthias
> van de Meent ; pgsql-
> hack...@list
> -Original Message-
> From: Tom Lane
> Sent: Monday, March 25, 2024 8:12 AM
> To: Amonson, Paul D
> Cc: David Rowley ; Nathan Bossart
> Subject: Re: Popcount optimization using AVX512
>...
> Just for a note --- the cfbot will re-test existing patches every so of
> -Original Message-
> From: Amonson, Paul D
> Sent: Thursday, March 21, 2024 12:18 PM
> To: David Rowley
> Cc: Nathan Bossart ; Andres Freund
I am re-posting the patches as CI for Mac failed (CI error not code/test
error). The patches are the same as last time.
Than
> -Original Message-
> From: David Rowley
> Sent: Wednesday, March 20, 2024 5:28 PM
> To: Amonson, Paul D
> Cc: Nathan Bossart ; Andres Freund
>
> I'm not sure about this "extern negates inline" comment. It seems to me the
> compiler is perfectly f
> -Original Message-
> From: David Rowley
> Sent: Tuesday, March 19, 2024 9:26 PM
> To: Amonson, Paul D
>
> AMD's Zen4 also has AVX512, so it's misleading to indicate it's an Intel only
> instruction. Also, writing the date isn't necessary as
> -Original Message-
> From: Nathan Bossart
>
> Committed. Thanks for the suggestion and for reviewing!
>
> Paul, I suspect your patches will need to be rebased after commit cc4826d.
> Would you mind doing so?
Changed in this patch set.
* Rebased.
* Direct *slow* calls via macros as sh
> -Original Message-
> From: Nathan Bossart
> Sent: Monday, March 18, 2024 2:08 PM
> To: David Rowley
> Cc: Amonson, Paul D ; Andres Freund
>...
>
> The only reason I left it out was because I couldn't convince myself that it
> wasn't dead code, give
> -Original Message-
> From: Nathan Bossart
> Sent: Monday, March 18, 2024 9:20 AM
> ...
> I don't think David was suggesting that we need to remove the runtime checks
> for AVX512. IIUC he was pointing out that most of the performance gain is
> from removing the function call overhead, w
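As a rough illustration of the function call overhead point, the sketch below counts bits in a buffer two ways: with an inlinable helper, and through a function pointer that the compiler generally cannot inline. It is a sketch under assumptions, not the patch: popcount64_portable, popcount_buf_inline and popcount_buf_indirect are hypothetical names, and the bit-clearing loop is a plain portable method rather than the POPCNT or AVX-512 code discussed in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable popcount: clear the lowest set bit until none remain. */
static inline int
popcount64_portable(uint64_t word)
{
    int         n = 0;

    while (word)
    {
        word &= word - 1;
        n++;
    }
    return n;
}

/* Direct call: the helper can be inlined into the loop body. */
static inline uint64_t
popcount_buf_inline(const unsigned char *buf, size_t len)
{
    uint64_t    total = 0;

    for (size_t i = 0; i < len; i++)
        total += (uint64_t) popcount64_portable(buf[i]);
    return total;
}

typedef int (*popcount64_fn) (uint64_t word);

/* Indirect call: every iteration goes through the function pointer. */
static inline uint64_t
popcount_buf_indirect(popcount64_fn fn, const unsigned char *buf, size_t len)
{
    uint64_t    total = 0;

    for (size_t i = 0; i < len; i++)
        total += (uint64_t) fn(buf[i]);
    return total;
}
```

Both variants return the same count; any speed difference between them comes from the call overhead and would need to be benchmarked on the target machine.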
om: Nathan Bossart
> Sent: Monday, March 18, 2024 8:29 AM
> To: David Rowley
> Cc: Amonson, Paul D ; Andres Freund
> ; Alvaro Herrera ; Shankaran,
> Akash ; Noah Misch ;
> Tom Lane ; Matthias van de Meent
> ; pgsql-hackers@lists.postgresql.org
> Subject: Re: Popcount optimizati
> -Original Message-
> From: Amonson, Paul D
> Sent: Friday, March 15, 2024 8:31 AM
> To: Nathan Bossart
...
> When I tested the code outside postgres in a micro benchmark I got 200-
> 300% improvements. Your results are interesting, as it implies more than
> 300% i
> -Original Message-
> From: Nathan Bossart
> Sent: Friday, March 15, 2024 8:06 AM
> To: Amonson, Paul D
> Cc: Andres Freund ; Alvaro Herrera ; Shankaran, Akash ; Noah Misch
> ; Tom Lane ; Matthias van de
> Meent ; pgsql-
> hack...@lists.postgresql.
> -Original Message-
> From: Nathan Bossart
> Sent: Monday, March 11, 2024 6:35 PM
> To: Amonson, Paul D
> Thanks. There's no need to wait to post the AVX portion. I recommend using
> "git format-patch" to construct the patch set for the lists.
> -Original Message-
> From: Nathan Bossart
> Sent: Wednesday, March 13, 2024 9:39 AM
> To: Amonson, Paul D
> +extern int pg_popcount32_slow(uint32 word);
> +extern int pg_popcount64_slow(uint64 word);
>
> +/* In pg_popcnt_*_accel source file. */ extern i
> -Original Message-
> From: Nathan Bossart
> Sent: Thursday, March 7, 2024 1:36 PM
> Subject: Re: Popcount optimization using AVX512
I will be splitting the request into 2 patches. I am attaching the first patch
(refactoring only) and I updated the commitfest entry to match this patch.
-Original Message-
>From: Nathan Bossart
>Sent: Tuesday, March 5, 2024 8:38 AM
>To: Amonson, Paul D
>Cc: Andres Freund ; Alvaro Herrera
>; Shankaran, Akash ; Noah
>Misch ; Tom Lane ; Matthias van de
>Meent ; pgsql-hackers@lists.postgresql.org
>Subject
apply and build. It
succeeded.
Thanks,
Paul
-Original Message-
From: Nathan Bossart
Sent: Monday, March 4, 2024 2:21 PM
To: Amonson, Paul D
Cc: Andres Freund ; Alvaro Herrera
; Shankaran, Akash ; Noah
Misch ; Tom Lane ; Matthias van de Meent
; pgsql-hackers@lists.postgresql.org
S
an be picked
up by a committer, given it has been reviewed by multiple committers so far?
The scope of the change is pretty contained as well.
[0] https://wiki.postgresql.org/wiki/Submitting_a_Patch
Thanks,
Paul
-Original Message-
From: Nathan Bossart
Sent: Friday, March 1, 2024 1
. Both meson
and autoconf are updated with the new refactor.
I am attaching the new patch.
Paul
-Original Message-
From: Amonson, Paul D
Sent: Monday, February 26, 2024 9:57 AM
To: Amonson, Paul D ; Andres Freund
Cc: Alvaro Herrera ; Shankaran, Akash
; Nathan Bossart ; Noah
Misch
. Can someone with Windows/MSVC experience
help me?
* Code: https://github.com/paul-amonson/postgresql/tree/popcnt_patch
* CI build: https://cirrus-ci.com/task/4927666021728256
Thanks,
Paul
-Original Message-
From: Amonson, Paul D
Sent: Wednesday, February 21, 2024 9:36 AM
To: Andres F
ild is at https://cirrus-ci.com/task/4927666021728256.
Thanks,
Paul
-Original Message-
From: Andres Freund
Sent: Monday, February 12, 2024 12:37 PM
To: Amonson, Paul D
Cc: Alvaro Herrera ; Shankaran, Akash
; Nathan Bossart ; Noah
Misch ; Tom Lane ; Matthias van de Meent
64/512bit x86
implementations).
I'm not an expert in meson, but splitting might add complexity to meson.build.
Could you elaborate if there are other benefits to the split file approach?
Paul
-Original Message-
From: Andres Freund
Sent: Friday, February 9, 2024 10:35 AM
To: Amons
OS or hypervisor even if the CPU supports AVX512.
The big change is adding all old and new build support to meson. I am new to
meson/ninja so please review carefully.
Thanks,
Paul
-Original Message-
From: Alvaro Herrera
Sent: Wednesday, February 7, 2024 2:13 AM
To: Amonson, Paul D
Cc
1:49 AM
To: Shankaran, Akash
Cc: Nathan Bossart ; Noah Misch ;
Amonson, Paul D ; Tom Lane ;
Matthias van de Meent ;
pgsql-hackers@lists.postgresql.org
Subject: Re: Popcount optimization using AVX512
On 2024-Jan-25, Shankaran, Akash wrote:
> With the updated patch, we observed significant im