On 10/26/22 17:44, Morten Brørup wrote:
Add __rte_cache_aligned to the objs array.

It makes no difference in the general case, but if get/put operations are
always 32 objects, it will reduce the number of memory (or last level
cache) accesses from five to four 64 B cache lines for every get/put
operation.
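
As a minimal sketch of what the attribute does to the layout, the two
structures below use standard C11 `_Alignas` in place of
`__rte_cache_aligned`. The struct names, the `size` and `flushthresh`
fields and the array size are hypothetical stand-ins, not the actual DPDK
definitions; only `len` and `objs` come from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only: 64 B cache lines, arbitrary array size. */
#define CACHE_LINE_SIZE 64
#define CACHE_OBJS 1024

/* Without the attribute: objs follows the header fields directly,
 * so the first objects share cache line 0 with len. */
struct cache_plain {
    unsigned int size;          /* hypothetical header field */
    unsigned int flushthresh;   /* hypothetical header field */
    unsigned int len;
    void *objs[CACHE_OBJS];
};

/* With the alignment: objs is pushed to the start of the next cache line. */
struct cache_aligned {
    unsigned int size;          /* hypothetical header field */
    unsigned int flushthresh;   /* hypothetical header field */
    unsigned int len;
    _Alignas(CACHE_LINE_SIZE) void *objs[CACHE_OBJS];
};

/* Compile-time check that objs starts on a cache line boundary. */
static_assert(offsetof(struct cache_aligned, objs) % CACHE_LINE_SIZE == 0,
              "objs must start on a cache line boundary");
```

The cost is the padding between `len` and `objs`, which is at most one
cache line per cache structure.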

For readability reasons, an example using 16 objects follows:

Currently, with 16 objects (128 B), we access 3 cache lines:

       ┌────────┐
       │len     │
cache  │********│---
line0  │********│ ^
       │********│ |
       ├────────┤ | 16 objects
       │********│ | 128B
cache  │********│ |
line1  │********│ |
       │********│ |
       ├────────┤ |
       │********│_v_
cache  │        │
line2  │        │
       │        │
       └────────┘

With the alignment, it is also 3 cache lines:

       ┌────────┐
       │len     │
cache  │        │
line0  │        │
       │        │
       ├────────┤---
       │********│ ^
cache  │********│ |
line1  │********│ |
       │********│ |
       ├────────┤ | 16 objects
       │********│ | 128B
cache  │********│ |
line2  │********│ |
       │********│ v
       └────────┘---

However, accessing the objects at the bottom of the mempool cache is a
special case, where cache line0 is also used for objects.

Consider the next burst (and any following bursts):

Current:
       ┌────────┐
       │len     │
cache  │        │
line0  │        │
       │        │
       ├────────┤
       │        │
cache  │        │
line1  │        │
       │        │
       ├────────┤
       │        │
cache  │********│---
line2  │********│ ^
       │********│ |
       ├────────┤ | 16 objects
       │********│ | 128B
cache  │********│ |
line3  │********│ |
       │********│ |
       ├────────┤ |
       │********│_v_
cache  │        │
line4  │        │
       │        │
       └────────┘
4 cache lines touched, incl. line0 for len.

With the proposed alignment:
       ┌────────┐
       │len     │
cache  │        │
line0  │        │
       │        │
       ├────────┤
       │        │
cache  │        │
line1  │        │
       │        │
       ├────────┤
       │        │
cache  │        │
line2  │        │
       │        │
       ├────────┤
       │********│---
cache  │********│ ^
line3  │********│ |
       │********│ | 16 objects
       ├────────┤ | 128B
       │********│ |
cache  │********│ |
line4  │********│ |
       │********│_v_
       └────────┘
Only 3 cache lines touched, incl. line0 for len.
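
The counts in the diagrams can be reproduced with a little arithmetic. The
sketch below is only an illustration: the helper names are hypothetical,
and it assumes 8 B object pointers, 64 B cache lines, and an objs array
starting 16 B into the structure without the alignment and 64 B into it
with the alignment, as drawn above.

```c
#include <assert.h>

#define CACHE_LINE_SIZE 64u   /* bytes per cache line, as in the text */
#define OBJ_SIZE 8u           /* assumed pointer size: 16 objects = 128 B */

/* Number of cache lines covered by objs[first .. first + n - 1],
 * where the objs array begins objs_offset bytes into the structure. */
static unsigned obj_lines(unsigned objs_offset, unsigned first, unsigned n)
{
    assert(n > 0);
    unsigned start = objs_offset + first * OBJ_SIZE;  /* first byte touched */
    unsigned end = start + n * OBJ_SIZE - 1;          /* last byte touched */
    return end / CACHE_LINE_SIZE - start / CACHE_LINE_SIZE + 1;
}

/* Same count, plus line0 (which holds len) whenever the burst
 * does not already start in line0. */
static unsigned total_lines(unsigned objs_offset, unsigned first, unsigned n)
{
    unsigned start = objs_offset + first * OBJ_SIZE;
    unsigned lines = obj_lines(objs_offset, first, n);
    return start / CACHE_LINE_SIZE == 0 ? lines : lines + 1;
}
```

Under these assumptions, total_lines(16, 16, 16) gives 4 and
total_lines(64, 16, 16) gives 3 for the second burst of 16 objects, and
obj_lines(16, 32, 32) gives 5 against obj_lines(64, 32, 32) giving 4 for
the 32-object case mentioned at the top.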

Credits go to Olivier Matz for the nice ASCII graphics.

Signed-off-by: Morten Brørup <m...@smartsharesystems.com>

Reviewed-by: Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>

