On Wed, 23 Oct 2024 19:59:35 +0200
Mattias Rönnblom <hof...@lysator.liu.se> wrote:

> > diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> > index 883e59a927..b90dc8793b 100644
> > --- a/lib/ethdev/ethdev_driver.h
> > +++ b/lib/ethdev/ethdev_driver.h
> > @@ -1235,6 +1235,70 @@ typedef int (*eth_count_aggr_ports_t)(struct rte_eth_dev *dev);
> >   typedef int (*eth_map_aggr_tx_affinity_t)(struct rte_eth_dev *dev, uint16_t tx_queue_id,
> >                                       uint8_t affinity);
> >   
> > +/**
> > + * @internal
> > + * Set cache stashing hint in the ethernet device.
> > + *
> > + * @param dev
> > + *   Port (ethdev) handle.
> > + * @param cpuid
> > + *   ID of the targeted CPU.
> > + * @param cache_level
> > + *   Level of the cache to stash data.  
> 
> If we had a hwtopo API in DPDK, we could just use a node id in such a
> graph (of CPUs and caches) to describe where the data ideally would land.
> In such a case, you could have a node id for DDR as well, and thus you
> could drop the notion of "stashing". Just a "drop off the data here,
> please, if you can" API.
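
For illustration, a rough sketch of what such a node-id based hint could
look like (the rte_hwtopo_* lookup and the driver op below are
hypothetical, not existing DPDK APIs):

/* Hypothetical: resolve a node in a CPU/cache/memory topology graph,
 * e.g. the L3 cache shared by the lcores of a given socket. A DDR
 * node would be looked up the same way. */
int node_id = rte_hwtopo_find_node(RTE_HWTOPO_L3, socket_id);

/* Hypothetical driver op: "drop the RX data off at this node, if you
 * can". It covers caches as well as DDR, so no separate notion of
 * "stashing" is needed. */
typedef int (*eth_rx_data_placement_hint_t)(struct rte_eth_dev *dev,
                                            uint16_t rx_queue_id,
                                            int hwtopo_node_id);
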
> 
> I don't think this API and its documentation should talk about what the 
> "CPU" needs, since it's somewhat misleading.
> 
> For example, you can imagine you want the packet payload to land in the
> LLC, even though it's not for any CPU to consume, in case you know with
> some certainty that the packet will soon be transmitted (and thus
> consumed by the NIC).
> 
> The same scenario can happen when the consumer is an accelerator (e.g.,
> a crypto engine).
> 
> Likewise, you may know that the whole packet will be read by some CPU
> core, but you also know the system tends to buffer packets before they
> are processed. In such a case, it's better to go to DRAM right away, to
> avoid trashing the LLC (or some other cache).
> 
> Also, why do you need to use the word "host"? Seems like a PCI thing. 
> This may be implemented in PCI, but surely can be done (and has been 
> done) without PCI.

+1 for the concept of having a CPU and PCI topology map that
can be queried by drivers and applications. Dumpster diving into sysfs
is hard to get right, and the amount of it keeps growing. I wonder if
there exists an open source library that is a good enough starting
point for this already.
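
One existing open source library in that space is hwloc, which builds
exactly such a topology tree (packages, cores, caches, and optionally
PCI devices); whether it is a good enough starting point for DPDK is a
separate question. As a rough, untested sketch (hwloc >= 2.0 object
names), mapping each core to the L3 cache above it would look roughly
like this:

#include <stdio.h>
#include <hwloc.h>

int main(void)
{
	hwloc_topology_t topo;
	int i, ncores;

	hwloc_topology_init(&topo);
	hwloc_topology_load(topo);

	ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
	for (i = 0; i < ncores; i++) {
		hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, i);
		/* Walk up the topology tree to the L3 cache above this core. */
		hwloc_obj_t l3 = hwloc_get_ancestor_obj_by_type(topo, HWLOC_OBJ_L3CACHE, core);

		printf("core %d (OS index %u): L3 %s\n", i, core->os_index,
		       l3 != NULL ? "present" : "not found");
	}

	hwloc_topology_destroy(topo);
	return 0;
}

(Built and linked with -lhwloc.)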
