On Wed, Jun 16, 2021 at 4:57 PM Morten Brørup <m...@smartsharesystems.com> 
wrote:
>
> > From: Jerin Jacob [mailto:jerinjac...@gmail.com]
> > Sent: Wednesday, 16 June 2021 11.42
> >
> > On Tue, Jun 15, 2021 at 12:18 PM Thomas Monjalon <tho...@monjalon.net>
> > wrote:
> > >
> > > 14/06/2021 17:48, Morten Brørup:
> > > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas
> > Monjalon
> > > > It would be much simpler to just increase RTE_MAX_ETHPORTS to
> > something big enough to hold a sufficiently large array. And possibly
> > add an rte_max_ethports variable to indicate the number of populated
> > entries in the array, for use when iterating over the array.
> > > >
> > > > Can we come up with another example than RTE_MAX_ETHPORTS where
> > this library provides a better benefit?
> > >
> > > What is big enough?
> > > Is 640KB enough for RAM? ;)
> >
> > If I understand it correctly, the Linux process allocates 640KB because
> > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS] is currently global
> > and placed in BSS.
>
> Correct.
>
> > If we allocate this from the heap, i.e. use malloc() for this memory,
> > then in my understanding Linux won't allocate real pages to back it
> > until someone writes to or reads from this memory.
>
> If the array is allocated from the heap, its members will be accessed through
> a pointer to the array, e.g. in rte_eth_rx/tx_burst(). This might affect 
> performance, which is probably why the array is allocated the way it is.
>
> Although it might be worth investigating how much it actually affects the 
> performance.

It should not. From the CPU and compiler point of view, it is the same.
If you look at cryptodev, it uses the following:

static struct rte_cryptodev rte_crypto_devices[RTE_CRYPTO_MAX_DEVS];
struct rte_cryptodev *rte_cryptodevs = rte_crypto_devices;

and accesses the devices through rte_cryptodevs[].

Also, this structure is not cache aligned. We probably need to fix that.
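
To make the two points above concrete, here is a minimal sketch of the
cryptodev-style pattern (a static array published through a pointer) with
the element type aligned to a cache line. All demo_* names and the 64-byte
line size are assumptions for illustration, not DPDK definitions.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_DEVS   64  /* hypothetical, stands in for RTE_CRYPTO_MAX_DEVS */
#define DEMO_CACHE_LINE 64  /* assumed cache line size in bytes */

/* Hypothetical device struct. Aligning the first member to the cache line
 * raises the alignment of the whole struct, and its size is padded to a
 * multiple of that alignment, so no two array elements share a line. */
struct demo_dev {
    alignas(DEMO_CACHE_LINE) void *fast_path_ops;
    void *data;
    uint16_t dev_id;
};

static_assert(sizeof(struct demo_dev) % DEMO_CACHE_LINE == 0,
              "each element must cover whole cache lines");

/* Same shape as the cryptodev declarations: the storage is a static array,
 * but all users go through the pointer. */
static struct demo_dev demo_devices[DEMO_MAX_DEVS];
struct demo_dev *demo_devs = demo_devices;

/* Fast-path style lookup through the pointer. If demo_devs later pointed
 * at heap memory instead, this function would not change. */
static inline struct demo_dev *demo_dev_get(uint16_t dev_id)
{
    return &demo_devs[dev_id];
}
```

Since callers only ever see demo_devs, switching the backing storage from
the static array to a heap allocation would not change the access code.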


> So we need to do something else if we want to conserve memory and still allow 
> a large rte_eth_devices[] array.
>
> Looking at struct rte_eth_dev, we could reduce its size as follows:
>
> 1. Change the two callback arrays 
> post_rx/pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT] to pointers to callback 
> arrays, which are allocated from the heap.
> With the default RTE_MAX_QUEUES_PER_PORT of 1024, these two arrays are the 
> sinners that make the struct rte_eth_dev use so much memory. This 
> modification would save 16 KB (minus 16 bytes for the pointers to the two 
> arrays) per port.
> Furthermore, these callback arrays would only need to be allocated if the 
> application is compiled with callbacks enabled (#define 
> RTE_ETHDEV_RXTX_CALLBACKS). And they would only need to be sized to the 
> actual number of queues for the port.
>
> The disadvantage is that this would add another level of indirection, 
> although only for applications compiled with callbacks enabled.
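
The 16 KB figure in point 1 can be checked with a small sketch. The demo_*
types below are hypothetical stand-ins, not the real struct rte_eth_dev;
they only mirror the two callback arrays and the proposed pointer-based
layout, and the 16 KB result assumes 8-byte pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Mirrors the default RTE_MAX_QUEUES_PER_PORT of 1024 quoted above. */
#define DEMO_MAX_QUEUES_PER_PORT 1024

struct demo_cb; /* opaque, hypothetical callback node */

/* Current style: both arrays are embedded and sized for the maximum.
 * With 8-byte pointers: 2 * 1024 * 8 = 16384 bytes per port. */
struct demo_dev_embedded {
    struct demo_cb *post_rx_burst_cbs[DEMO_MAX_QUEUES_PER_PORT];
    struct demo_cb *pre_tx_burst_cbs[DEMO_MAX_QUEUES_PER_PORT];
};

/* Proposed style: two pointers to arrays sized to the configured queues. */
struct demo_dev_indirect {
    struct demo_cb **post_rx_burst_cbs;
    struct demo_cb **pre_tx_burst_cbs;
    uint16_t nb_rx_queues;
    uint16_t nb_tx_queues;
};

/* Allocate zeroed callback arrays for the actual queue counts.
 * Returns 0 on success, -1 on bad arguments or allocation failure. */
static int demo_cbs_alloc(struct demo_dev_indirect *dev,
                          uint16_t nb_rx, uint16_t nb_tx)
{
    if (dev == NULL || nb_rx == 0 || nb_tx == 0)
        return -1;

    struct demo_cb **rx = calloc(nb_rx, sizeof(*rx));
    struct demo_cb **tx = calloc(nb_tx, sizeof(*tx));
    if (rx == NULL || tx == NULL) {
        free(rx);
        free(tx);
        return -1;
    }
    dev->post_rx_burst_cbs = rx;
    dev->pre_tx_burst_cbs = tx;
    dev->nb_rx_queues = nb_rx;
    dev->nb_tx_queues = nb_tx;
    return 0;
}

/* Release both arrays and reset the bookkeeping. */
static void demo_cbs_free(struct demo_dev_indirect *dev)
{
    free(dev->post_rx_burst_cbs);
    free(dev->pre_tx_burst_cbs);
    dev->post_rx_burst_cbs = NULL;
    dev->pre_tx_burst_cbs = NULL;
    dev->nb_rx_queues = 0;
    dev->nb_tx_queues = 0;
}
```

A port configured with 4 RX and 4 TX queues would then hold 8 pointers in
its callback arrays instead of 2048.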

I think we don't need one more level of indirection if everything is
allocated from the heap, as memory is not wasted if it is not touched by
the CPU.
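
A rough way to observe this on Linux is sketched below. It is only an
illustration: it uses an anonymous mmap() instead of malloc() so that page
boundaries are explicit, it queries residency with mincore(), and the
demo_* helpers are hypothetical names. The exact counts depend on the
kernel and its settings.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve len bytes of zero-filled virtual memory; NULL on failure. */
static void *demo_reserve(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Count pages in [addr, addr + len) that mincore() reports as resident.
 * addr must be page aligned. Returns -1 on error. */
static long demo_resident_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;
    unsigned char *vec = malloc(npages);
    long resident = 0;

    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < npages; i++)
        resident += vec[i] & 1; /* bit 0 set means the page is resident */
    free(vec);
    return resident;
}

/* Write one byte into each of the first npages pages to fault them in. */
static void demo_touch_pages(void *addr, size_t npages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = addr;

    for (size_t i = 0; i < npages; i++)
        p[i * page] = 1;
}
```

Reserving a large region with demo_reserve(), touching a few pages with
demo_touch_pages(), and comparing demo_resident_pages() before and after
should show that only the touched pages consume physical memory.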

>
> 2. Remove reserved_64s[4] and reserved_ptrs[4]. This would save 64 bytes per 
> port. Not much, but worth considering if we are changing the API/ABI anyway.
>
>
