On 8/27/2024 5:10 PM, Vipin Varghese wrote:
As core density continues to increase, chiplet-based
core packing has become a key trend. In AMD EPYC SoC
architectures, core complexes within the same chiplet
share a Last-Level Cache (LLC). By packing logical cores
within the same LLC, we can enhance pipeline processing
stages due to reduced latency and improved data locality.
To leverage these benefits, DPDK libraries and examples
can utilize localized lcores. This approach ensures more
consistent latencies by minimizing the dispersion of lcores
across different chiplet complexes and enhances packet
processing by ensuring that data for subsequent pipeline
stages is likely to reside within the LLC.
< Function: Purpose >
---------------------
- rte_get_llc_first_lcores: Retrieves the first lcore of each shared LLC.
- rte_get_llc_lcore: Retrieves all lcores that share the LLC.
- rte_get_llc_n_lcore: Retrieves the first n lcores, or skips the first n
lcores, in the shared LLC.
< MACRO: Purpose >
------------------
- RTE_LCORE_FOREACH_LLC_FIRST: iterates through the first lcore of each LLC.
- RTE_LCORE_FOREACH_LLC_FIRST_WORKER: iterates through the first worker lcore
of each LLC.
- RTE_LCORE_FOREACH_LLC_WORKER: iterates through the lcores of the LLC
selected by a hint (lcore id).
- RTE_LCORE_FOREACH_LLC_SKIP_FIRST_WORKER: iterates through the lcores of an
LLC, skipping the first worker.
- RTE_LCORE_FOREACH_LLC_FIRST_N_WORKER: iterates through the first `n` lcores
of each LLC.
- RTE_LCORE_FOREACH_LLC_SKIP_N_WORKER: skips the first `n` lcores, then
iterates through the remaining lcores of each LLC.
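To make the "first lcore of each LLC" semantics concrete, here is a minimal
standalone sketch. It is not the patch's implementation and does not use any
DPDK API: the helper name llc_first_lcores() and the llc_of[] mapping (LLC id
per lcore index) are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (illustration only): llc_of[i] is the LLC id of
 * lcore i. Stores the lowest-numbered lcore of each LLC in first[], up to
 * first_cap entries, and returns how many were stored.
 */
static size_t
llc_first_lcores(const unsigned int *llc_of, size_t n_lcores,
		 unsigned int *first, size_t first_cap)
{
	size_t n_first = 0;

	assert(llc_of != NULL && first != NULL);

	for (size_t i = 0; i < n_lcores; i++) {
		int seen = 0;

		/* An lcore is "first" if no earlier lcore shares its LLC. */
		for (size_t j = 0; j < i; j++) {
			if (llc_of[j] == llc_of[i]) {
				seen = 1;
				break;
			}
		}
		if (!seen && n_first < first_cap)
			first[n_first++] = (unsigned int)i;
	}
	return n_first;
}
```

With lcores 0..5 mapped to LLC ids {0, 0, 1, 1, 0, 2}, the first lcores
selected are 0, 2 and 5.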
Hi Vipin,
I recently looked into how Intel's Sub-NUMA Clustering would work within
DPDK, and found that I actually didn't have to do anything, because the
SNC "clusters" present themselves as NUMA nodes, which DPDK already
supports natively.
Do AMD's chiplets not report themselves as separate NUMA nodes? If they
do, I don't really think any changes are required, because NUMA nodes
would give you the same thing, would they not?
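One way to check this on a given Linux machine is to compare the CPU list
of a cache domain (/sys/devices/system/cpu/cpuN/cache/indexM/shared_cpu_list,
where indexM is whichever entry reports the last cache level) with the CPU
list of the NUMA node (/sys/devices/system/node/nodeN/cpulist). A rough
sketch of that comparison follows; the function names and the MAX_CPUS limit
are assumptions for illustration, and reading the sysfs files is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CPUS 1024 /* assumed upper bound, for illustration only */

/* Parse a sysfs-style CPU list such as "0-7,64-71" into a per-CPU flag array.
 * Returns 0 on success, -1 on malformed input or an out-of-range CPU id. */
static int
cpulist_parse(const char *s, unsigned char set[MAX_CPUS])
{
	assert(s != NULL && set != NULL);
	memset(set, 0, MAX_CPUS);

	while (*s != '\0' && *s != '\n') {
		char *end;
		unsigned long lo, hi;

		if (*s < '0' || *s > '9')
			return -1;
		lo = strtoul(s, &end, 10);
		hi = lo;
		s = end;
		if (*s == '-') { /* range "lo-hi" */
			s++;
			if (*s < '0' || *s > '9')
				return -1;
			hi = strtoul(s, &end, 10);
			s = end;
		}
		if (hi < lo || hi >= MAX_CPUS)
			return -1;
		for (unsigned long c = lo; c <= hi; c++)
			set[c] = 1;
		if (*s == ',')
			s++;
		else if (*s != '\0' && *s != '\n')
			return -1;
	}
	return 0;
}

/* Returns 1 if both lists name the same CPUs, 0 if not, -1 on parse error. */
static int
cpulist_equal(const char *a, const char *b)
{
	unsigned char sa[MAX_CPUS], sb[MAX_CPUS];

	if (cpulist_parse(a, sa) != 0 || cpulist_parse(b, sb) != 0)
		return -1;
	return memcmp(sa, sb, MAX_CPUS) == 0;
}
```

If the two lists are equal for every LLC, the NUMA topology already
describes the LLC domains; if an LLC list is a strict subset of its node's
list, several LLCs share one NUMA node.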
--
Thanks,
Anatoly