On 2023/11/10 9:18 PM, Morten Brørup wrote:
Dear Ruifeng,

Hi Morten,

+CC: all CPU architecture maintainers,

I'm trying to figure out the requirements for supporting unaligned memory 
access in DPDK (specifically the ring library), and need your expert feedback.

The #define RTE_ARCH_STRICT_ALIGN - which is undocumented, but probably means that CPU 
memory access must be aligned - is only set by "generic_aarch32".

So we will assume that all other CPU architectures supported by DPDK can access 
unaligned memory.

As discussed in this thread, "generic_aarch32" has special requirements for 
performing 64-bit load/store at unaligned addresses.

Yes, 64-bit load/store at unaligned addresses causes alignment fault.


Now comes the big question: Can "generic_aarch32" perform 32-bit load/store at 
unaligned addresses without similar special requirements? Then the ring library already 
supports unaligned 32-bit objects, and doesn't need to be fixed in this regard.

Yes, 32-bit load/store support unaligned data accesses to normal memory (not device memory). This is documented in Arm Architecture Reference Manual.

Thanks,
Ruifeng


Med venlig hilsen / Kind regards,
-Morten Brørup


From: Morten Brørup [mailto:m...@smartsharesystems.com]
Sent: Friday, 10 November 2023 11.44

From: Konstantin Ananyev [mailto:konstantin.anan...@huawei.com]
Sent: Friday, 10 November 2023 10.45

From: Morten Brørup <m...@smartsharesystems.com>
Sent: Friday, November 10, 2023 9:34 AM

+CC Gavin, reviewed the test case

From: Ruifeng Wang [mailto:ruifeng.w...@arm.com]
Sent: Friday, 10 November 2023 09.40

On 2023/11/4 8:04 AM, Morten Brørup wrote:
I have for a long time now wondered why the ring functions for
enqueue/dequeue of 64-bit objects supports unaligned addresses,
and
now
I finally found the patch introducing it.

From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Phil Yang
Sent: Monday, 9 March 2020 18.20

The 32-bit arm machine doesn't support unaligned memory
access.
It
will cause a bus error on aarch32 with the custom element size
ring.

Thread 1 "test" received signal SIGBUS, Bus error.
__rte_ring_enqueue_elems_64 (n=1, obj_table=0xf5edfe41,
prod_head=0,
\
r=0xf5edfb80) at /build/dpdk/build/include/rte_ring_elem.h:177
177                             ring[idx++] = obj[i++];

Which test is this? Why is it using an unaligned array of 64-
bit
objects? (Notice that obj_table=0xf5edfe41.)

The test case is:


https://elixir.bootlin.com/dpdk/latest/source/app/test/test_ring.c#L112
1
This case deliberately use unaligned objects.

Thank you, Ruifeng.

Honnappa, I see you signed off on the patch introducing the test
for
unaligned objects:


http://git.dpdk.org/dpdk/commit/app/test/test_ring.c?id=a9fe152363e283d
4c590ab8e8d51396f2ffab9ff

What was the rationale behind testing support for unaligned object
pointers? Did any applications/customers use unaligned object
pointers, or is it a purely academic test case?



Nobody in their right mind would use an unaligned array of 64-
bit
objects. You can only create such an array if you force the
compiler to
prevent automatic alignment! And all the functions in your
application
using this array would also need to support unaligned addressing
of
these objects.

It could be just one elem, not an array.
And the user can use 'packed' struct or so...
Agree, not a common case, but probably still possible.

Very good point, Konstantin. A single unaligned object in a packed
structure is not as exotic as an unaligned array of objects (which I
consider completely unrealistic).

If anyone is using an architecture which doesn't support unaligned
accesses, I would expect them to completely avoid using unaligned
objects. But perhaps you are right; however unlikely, it might happen.

If we think this is a real use case, should we add support for
unaligned 32 bit objects?
(128 bit objects already support unaligned access; they are type casted
to void pointer and accessed using memcpy().)

And how about the stack library, should it support unaligned objects
too?



This seems extremely exotic, and not something any real
application
would do!

I would like to revert this patch for performance reasons.

Morten, could you probably explain first the problems you encountered
with this patch?
You mention about 'performance reasons', so did you notice any real
slowdown?

Please check my reply to the same question here:
http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35E9EFD3@smarts
erver.smartshare.dk/



I could add an RTE_ASSERT() to verify that the pointer is aligned,
for debugging purposes.



Fixes: cc4b218790f6 ("ring: support configurable element
size")

Signed-off-by: Phil Yang <phil.y...@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
Reviewed-by: Honnappa Nagarahalli
<honnappa.nagaraha...@arm.com>
---
   lib/librte_ring/rte_ring_elem.h | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ring/rte_ring_elem.h
b/lib/librte_ring/rte_ring_elem.h
index 3976757..663addc 100644
--- a/lib/librte_ring/rte_ring_elem.h
+++ b/lib/librte_ring/rte_ring_elem.h
@@ -160,7 +160,7 @@ __rte_ring_enqueue_elems_64(struct
rte_ring
*r,
uint32_t prod_head,
        const uint32_t size = r->size;
        uint32_t idx = prod_head & r->mask;
        uint64_t *ring = (uint64_t *)&r[1];
-       const uint64_t *obj = (const uint64_t *)obj_table;
+       const unaligned_uint64_t *obj = (const unaligned_uint64_t
*)obj_table;
        if (likely(idx + n < size)) {
                for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
                        ring[idx] = obj[i];
@@ -294,7 +294,7 @@ __rte_ring_dequeue_elems_64(struct
rte_ring
*r,
uint32_t prod_head,
        const uint32_t size = r->size;
        uint32_t idx = prod_head & r->mask;
        uint64_t *ring = (uint64_t *)&r[1];
-       uint64_t *obj = (uint64_t *)obj_table;
+       unaligned_uint64_t *obj = (unaligned_uint64_t *)obj_table;
        if (likely(idx + n < size)) {
                for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
                        obj[i] = ring[idx];
--
2.7.4


References:



https://git.dpdk.org/dpdk/commit/lib/librte_ring/rte_ring_elem.h?id=3ba
51478a3ab3132c33effc8b132641233275b36
https://patchwork.dpdk.org/project/dpdk/patch/1583774395-10233-
1-
git-
send-email-phil.y...@arm.com/

Reply via email to