On Wed, Jan 22, 2025 at 10:58:09AM +, chiranmoy.bhattacha...@fujitsu.com
wrote:
>> The functions that test the length before potentially calling a function
>> pointer should probably be inlined (see pg_popcount() in pg_bitutils.h).
>> I wouldn't be surprised if some compilers are inlining this
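[Editor's sketch, not taken from the patch.] The inlining suggestion quoted above could look roughly like the following: a small static inline wrapper tests the length and only goes through the function pointer for longer inputs. Every name here (hex_encode_dispatch, hex_encode_scalar, hex_encode_wide, hex_encode_optimized, HEX_INLINE_THRESHOLD, wide_calls) is hypothetical, the threshold of 16 is an arbitrary assumption, and hex_encode_wide is only a counting stand-in for a SIMD implementation.

```c
#include <stddef.h>

static const char hex_digits[] = "0123456789abcdef";

/* Hypothetical cutoff below which the indirect call is skipped. */
#define HEX_INLINE_THRESHOLD 16

/* Counts calls that went through the function pointer (for illustration). */
static int wide_calls = 0;

/* Portable scalar encoder: two output characters per input byte. */
static size_t
hex_encode_scalar(const char *src, size_t len, char *dst)
{
    for (size_t i = 0; i < len; i++)
    {
        unsigned char b = (unsigned char) src[i];

        *dst++ = hex_digits[b >> 4];
        *dst++ = hex_digits[b & 0xF];
    }
    return len * 2;
}

/* Stand-in for a vectorized variant; it just delegates to the scalar code. */
static size_t
hex_encode_wide(const char *src, size_t len, char *dst)
{
    wide_calls++;
    return hex_encode_scalar(src, len, dst);
}

/*
 * Assumption: in a real build this pointer would be chosen once, after
 * checking at run time which CPU features are available.
 */
static size_t (*hex_encode_optimized) (const char *, size_t, char *) =
    hex_encode_wide;

/* Inlined length test: short inputs never pay for the indirect call. */
static inline size_t
hex_encode_dispatch(const char *src, size_t len, char *dst)
{
    if (len < HEX_INLINE_THRESHOLD)
        return hex_encode_scalar(src, len, dst);
    return hex_encode_optimized(src, len, dst);
}
```

With this shape, the compiler can fold the length check into each caller, so the function-pointer call is only made when the input is long enough for a wider implementation to pay off.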
On Wed, Jan 22, 2025 at 11:10:10AM +, chiranmoy.bhattacha...@fujitsu.com
wrote:
> I realized I didn't attach the patch.
Thanks. Would you mind creating a commitfest entry for this one?
--
nathan
I realized I didn't attach the patch.
Attachment: v2-0001-SVE-support-for-hex-encode-and-hex-decode.patch
> The approach looks generally reasonable to me, but IMHO the code needs
> much more commentary to explain how it works.
Added comments to explain the SVE implementation.
> I would be interested to see how your bytea test compares with the
> improvements added in commit e24d770 and with sending the
With commit e24d770 in place, I took a closer look at hex_decode(), and I
concluded that doing anything better without intrinsics would likely
require either a huge lookup table or something with complexity rivalling
the intrinsics approach (while also not rivalling its performance). So, I
took a
David Rowley writes:
> I agree that the evidence you (John) gathered is enough reason to use
> memcpy().
Okay ... doesn't quite match my intuition, but intuition is a poor
guide to such things.
regards, tom lane
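[Editor's sketch, not the committed code.] The nibble-versus-byte lookup discussed in this thread, together with the memcpy() point, can be illustrated as below. The function and table names are hypothetical. The byte table is filled at startup here only to keep the example short; a constant string literal holding all 512 characters would serve the same purpose.

```c
#include <stddef.h>
#include <string.h>

static const char nibble_tbl[] = "0123456789abcdef";

/* The hex expansion of each possible byte value (two chars per value). */
static char byte_tbl[256 * 2];

/* Fill byte_tbl from the nibble table; call once before hex_encode_byte(). */
static void
init_byte_tbl(void)
{
    for (int i = 0; i < 256; i++)
    {
        byte_tbl[2 * i] = nibble_tbl[i >> 4];
        byte_tbl[2 * i + 1] = nibble_tbl[i & 0xF];
    }
}

/* Nibble lookup: two table reads and two one-byte stores per input byte. */
static size_t
hex_encode_nibble(const char *src, size_t len, char *dst)
{
    for (size_t i = 0; i < len; i++)
    {
        unsigned char b = (unsigned char) src[i];

        *dst++ = nibble_tbl[b >> 4];
        *dst++ = nibble_tbl[b & 0xF];
    }
    return len * 2;
}

/* Byte lookup: one table index per input byte, copied as a two-char unit. */
static size_t
hex_encode_byte(const char *src, size_t len, char *dst)
{
    for (size_t i = 0; i < len; i++)
    {
        unsigned char b = (unsigned char) src[i];

        memcpy(dst, &byte_tbl[2 * b], 2);
        dst += 2;
    }
    return len * 2;
}
```

Both functions produce identical output. Whether the two-byte memcpy() compiles to a single store depends on the compiler, which is the trade-off debated in the messages above.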
On Wed, 15 Jan 2025 at 23:57, John Naylor wrote:
>
> On Wed, Jan 15, 2025 at 2:14 PM Tom Lane wrote:
> > Compilers that inline memcpy() may arrive at the same machine code,
> > but why rely on the compiler to make that optimization? If the
> > compiler fails to do so, an out-of-line memcpy() cal
Hi.
On Wed, Jan 15, 2025 at 07:57, John Naylor
wrote:
> On Wed, Jan 15, 2025 at 2:14 PM Tom Lane wrote:
>
> > Couple of thoughts:
> >
> > 1. I was actually hoping for a comment on the constant's definition,
> > perhaps along the lines of
> >
> > /*
> > * The hex expansion of each pos
On Wed, Jan 15, 2025 at 2:14 PM Tom Lane wrote:
> Couple of thoughts:
>
> 1. I was actually hoping for a comment on the constant's definition,
> perhaps along the lines of
>
> /*
> * The hex expansion of each possible byte value (two chars per value).
> */
Works for me. With that, did you mean
John Naylor writes:
> Okay, I added a comment. I also agree with Michael that my quick
> one-off was a bit hard to read so I've cleaned it up a bit. I plan to
> commit the attached by Friday, along with any bikeshedding that
> happens by then.
Couple of thoughts:
1. I was actually hoping for a c
On Tue, Jan 14, 2025 at 11:57 PM Nathan Bossart
wrote:
>
> On Tue, Jan 14, 2025 at 12:59:04AM -0500, Tom Lane wrote:
> > John Naylor writes:
> >> We can do about as well simply by changing the nibble lookup to a byte
> >> lookup, which works on every compiler and architecture:
>
> Nice. I tried
On Tue, Jan 14, 2025 at 12:59:04AM -0500, Tom Lane wrote:
> John Naylor writes:
>> We can do about as well simply by changing the nibble lookup to a byte
>> lookup, which works on every compiler and architecture:
Nice. I tried enabling auto-vectorization and loop unrolling on top of
this patch,
John Naylor writes:
> We can do about as well simply by changing the nibble lookup to a byte
> lookup, which works on every compiler and architecture:
I didn't attempt to verify your patch, but I do prefer addressing
this issue in a machine-independent fashion. I also like the brevity
of the pat
On Tue, Jan 14, 2025 at 12:27:30PM +0700, John Naylor wrote:
> We can do about as well simply by changing the nibble lookup to a byte
> lookup, which works on every compiler and architecture:
>
> select hex_encode_test(100, 1024);
> master:
> Time: 1158.700 ms
> v2:
> Time: 777.443 ms
>
> If
On Sat, Jan 11, 2025 at 3:46 AM Nathan Bossart wrote:
>
> I was able to get auto-vectorization to take effect on Apple clang 16 with
> the following addition to src/backend/utils/adt/Makefile:
>
> encode.o: CFLAGS += ${CFLAGS_VECTORIZE} -mllvm -force-vector-width=8
>
> This gave the follow
On Mon, Jan 13, 2025 at 03:48:49PM +, chiranmoy.bhattacha...@fujitsu.com
wrote:
> There is a 30% improvement using auto-vectorization.
It might be worth enabling auto-vectorization independently of any patches
that use intrinsics, then.
> Currently, it is assumed that all aarch64 machine sup
On Fri, Jan 10, 2025 at 09:38:14AM -0600, Nathan Bossart wrote:
> Do you mean that the auto-vectorization worked and you observed no
> performance improvement, or the auto-vectorization had no effect on the
> code generated?
Auto-vectorization is working now with the following addition on Graviton
On Fri, Jan 10, 2025 at 09:38:14AM -0600, Nathan Bossart wrote:
> On Fri, Jan 10, 2025 at 11:10:03AM +, chiranmoy.bhattacha...@fujitsu.com
> wrote:
>> We tried auto-vectorization and observed no performance improvement.
>
> Do you mean that the auto-vectorization worked and you observed no
>
On Fri, Jan 10, 2025 at 11:10:03AM +, chiranmoy.bhattacha...@fujitsu.com
wrote:
> We tried auto-vectorization and observed no performance improvement.
Do you mean that the auto-vectorization worked and you observed no
performance improvement, or the auto-vectorization had no effect on the
cod
Hello Nathan,
We tried auto-vectorization and observed no performance improvement.
The instructions in src/include/port/simd.h are based on older SIMD
architectures like NEON, whereas the patch uses the newer SVE, so some of the
instructions used in the patch may not have direct equivalents in N
On Thu, Jan 09, 2025 at 11:22:05AM +, devanga.susmi...@fujitsu.com wrote:
> This email aims to discuss the contribution of optimized hex_encode and
> hex_decode functions for ARM (aarch64) machines. These functions are
> widely used for encoding and decoding binary data in the bytea data type.