Jorge Cardoso Leitão writes:
> Yes, I expect aligned SIMD loads to be faster.
>
> My understanding is that we do not need an alignment requirement for this,
> though: split the buffer in 3, [unaligned][aligned][unaligned], use aligned
> loads for the middle and un-aligned (or not even SIMD) for t
Thanks Yibo,
Yes, I expect aligned SIMD loads to be faster.
My understanding is that we do not need an alignment requirement for this,
though: split the buffer in 3, [unaligned][aligned][unaligned], use aligned
loads for the middle and un-aligned (or not even SIMD) for the prefix and
suffix. This
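
The three-way split described above can be sketched in Rust with the standard library's `slice::align_to`, which returns exactly such a (prefix, aligned middle, suffix) triple. This is only an illustration under assumptions: `Chunk` and `split_sum` are hypothetical names, not Arrow APIs, and the per-chunk loop is ordinary scalar code that a compiler may or may not vectorize.

```rust
/// Hypothetical 64-byte-aligned block of sixteen u32 lanes (not an Arrow type).
#[repr(C, align(64))]
pub struct Chunk([u32; 16]);

/// Sums `values` by splitting the buffer into
/// [unaligned prefix][64-byte-aligned middle][unaligned suffix].
pub fn split_sum(values: &[u32]) -> u64 {
    // SAFETY: every bit pattern is a valid u32, and `Chunk` is sixteen u32
    // values with a stricter alignment, so reinterpreting the middle is sound.
    let (prefix, middle, suffix) = unsafe { values.align_to::<Chunk>() };

    // Plain scalar path for the two unaligned ends.
    let scalar = |part: &[u32]| part.iter().map(|&v| u64::from(v)).sum::<u64>();
    let mut total = scalar(prefix) + scalar(suffix);

    // Fixed-size loop over aligned chunks; this is where aligned SIMD loads
    // could be used.
    for chunk in middle {
        total += chunk.0.iter().map(|&v| u64::from(v)).sum::<u64>();
    }
    total
}
```

The result does not depend on where the buffer starts: `split_sum(&data[1..])` only moves elements between the prefix and the middle.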
Thanks Jorge,
I'm wondering whether the 64-byte alignment requirement is for the cache
or for SIMD registers (AVX-512?).
For SIMD, it looks like alignment to the register width does help.
E.g., _mm_load_si128 can only load data aligned to 128 bits (16 bytes); it
performs better than _mm_loadu_si128, which supports unaligned loads.
Thanks,
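
To make the difference between the two intrinsics concrete, here is a small Rust sketch, offered as an assumption-laden illustration rather than a benchmark. `load16` is a hypothetical helper: on x86_64 it uses `_mm_load_si128` only when the address is 16-byte aligned and falls back to `_mm_loadu_si128` otherwise; on other architectures it simply copies the bytes.

```rust
/// Loads the first 16 bytes of `bytes` (hypothetical helper for illustration).
#[cfg(target_arch = "x86_64")]
pub fn load16(bytes: &[u8]) -> Option<[u8; 16]> {
    use std::arch::x86_64::{__m128i, _mm_load_si128, _mm_loadu_si128, _mm_storeu_si128};

    if bytes.len() < 16 {
        return None;
    }
    let src = bytes.as_ptr() as *const __m128i;
    let mut out = [0u8; 16];
    // SAFETY: SSE2 is part of the x86_64 baseline, `src` points to at least
    // 16 readable bytes, and the aligned load is only used when the address
    // is a multiple of 16.
    unsafe {
        let v = if (src as usize) % 16 == 0 {
            _mm_load_si128(src) // requires a 16-byte-aligned address
        } else {
            _mm_loadu_si128(src) // accepts any address
        };
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, v);
    }
    Some(out)
}

/// Portable fallback: a plain copy, no SIMD intrinsics.
#[cfg(not(target_arch = "x86_64"))]
pub fn load16(bytes: &[u8]) -> Option<[u8; 16]> {
    if bytes.len() < 16 {
        return None;
    }
    let mut out = [0u8; 16];
    out.copy_from_slice(&bytes[..16]);
    Some(out)
}
```

Both branches return the same bytes; any speed difference between them would have to be measured on the target CPU.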
I think that the alignment requirement in IPC is different from this one:
we enforce 8/64 byte alignment when serializing for IPC, but we (only)
recommend 64 byte alignment in memory addresses (at least this is my
understanding from the above link).
I did test adding two arrays and the re
I did a quick benchmark of accessing a long buffer that is not 8-byte
aligned. Under the right conditions, it does look like unaligned access has
some penalty over aligned access, but I don't think this is an issue in
practice.
Please be very skeptical of this benchmark. It's hard to get it right
given the c
>
> My own impression is that the emphasis may be slightly exaggerated. But
> perhaps some other benchmarks would prove differently.
This is probably true. [1] is the original mailing list discussion. I
think lack of measurable differences and high overhead for 64 byte
alignment was the reason f
On 06/09/2021 at 23:20, Jorge Cardoso Leitão wrote:
Thanks a lot Antoine for the pointers. Much appreciated!
Generally, it should not hurt to align allocations to 64 bytes anyway,
since you are generally dealing with large enough data that the
(small) memory overhead doesn't matter.
Not f
To add to Antoine's points, besides data alignment being beneficial for
reducing cache line reads/writes and overall using the cache more
effectively, another key point is when using vector (SIMD) registers.
Although recent CPUs can load unaligned data to vector registers at similar
speeds as aligne
Thanks a lot Antoine for the pointers. Much appreciated!
Generally, it should not hurt to align allocations to 64 bytes anyway,
> since you are generally dealing with large enough data that the
> (small) memory overhead doesn't matter.
>
Not for performance. However, 64 byte alignment in Rust req
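
For readers unfamiliar with how a 64-byte-aligned allocation is requested in Rust, the following is a minimal sketch using only `std::alloc`. It is not taken from any Arrow implementation; `AlignedBuf` is a hypothetical type introduced for illustration.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};
use std::ptr::NonNull;

/// Hypothetical owner of a zero-initialized, 64-byte-aligned byte buffer.
pub struct AlignedBuf {
    ptr: NonNull<u8>,
    layout: Layout,
}

impl AlignedBuf {
    /// Allocates `len` zeroed bytes aligned to 64 bytes.
    /// Returns None for `len == 0`, an invalid layout, or allocation failure.
    pub fn new(len: usize) -> Option<Self> {
        if len == 0 {
            return None; // the global allocator must not be asked for 0 bytes
        }
        let layout = Layout::from_size_align(len, 64).ok()?;
        // SAFETY: `layout` has a non-zero size.
        let ptr = NonNull::new(unsafe { alloc_zeroed(layout) })?;
        Some(Self { ptr, layout })
    }

    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: `ptr` is valid for `layout.size()` zero-initialized bytes.
        unsafe { std::slice::from_raw_parts(self.ptr.as_ptr(), self.layout.size()) }
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        // SAFETY: `ptr` was allocated with exactly this layout.
        unsafe { dealloc(self.ptr.as_ptr(), self.layout) }
    }
}
```

The buffer must be freed with the same `Layout` it was allocated with, which is why the sketch stores the layout alongside the pointer.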
On 06/09/2021 at 19:45, Antoine Pitrou wrote:
Specifically, I performed two types of tests, a "random sum" where we
compute the sum of the values taken at random indices, and "sum", where we
sum all values of the array (buffer[1] of the primitive array), both for
array ranging from 2^10 to
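
The two access patterns named above can be sketched as follows. This is not the benchmark code from the thread: the function names, the element type, and the small linear congruential generator used to produce indices are all assumptions made for illustration, and no timing is included.

```rust
/// "sum": sequential pass over every value in the buffer.
pub fn sum(values: &[u32]) -> u64 {
    values.iter().map(|&v| u64::from(v)).sum()
}

/// "random sum": sum of the values found at the given indices.
/// Panics if an index is out of bounds.
pub fn random_sum(values: &[u32], indices: &[usize]) -> u64 {
    indices.iter().map(|&i| u64::from(values[i])).sum()
}

/// Hypothetical helper: `count` deterministic pseudo-random indices in
/// `0..len`, from a simple linear congruential generator (std has no RNG).
pub fn pseudo_random_indices(len: usize, count: usize, seed: u64) -> Vec<usize> {
    if len == 0 {
        return Vec::new();
    }
    let mut state = seed;
    (0..count)
        .map(|_| {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            // Use the high bits, which are better distributed in an LCG.
            ((state >> 33) as usize) % len
        })
        .collect()
}
```

To compare aligned and unaligned inputs, one option is to run both functions on `&values[..]` and on a sub-slice that starts at a different offset, keeping the lengths equal.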
On Mon, 6 Sep 2021 18:09:31 +0100
Jorge Cardoso Leitão wrote:
> Hi,
>
> We have a whole section related to byte alignment (
> https://arrow.apache.org/docs/format/Columnar.html#buffer-alignment-and-padding)
> recommending 64-byte alignment and referring to Intel's manual.
>
> Do we have evidence
Hi,
We have a whole section related to byte alignment (
https://arrow.apache.org/docs/format/Columnar.html#buffer-alignment-and-padding)
recommending 64-byte alignment and referring to Intel's manual.
Do we have evidence that this alignment helps (besides Intel's claims)?
I am asking because going