On Wed, Jun 16, 2021 at 4:57 PM Morten Brørup <m...@smartsharesystems.com> wrote:
>
> > From: Jerin Jacob [mailto:jerinjac...@gmail.com]
> > Sent: Wednesday, 16 June 2021 11.42
> >
> > On Tue, Jun 15, 2021 at 12:18 PM Thomas Monjalon <tho...@monjalon.net> wrote:
> > >
> > > 14/06/2021 17:48, Morten Brørup:
> > > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
> > > > It would be much simpler to just increase RTE_MAX_ETHPORTS to
> > > > something big enough to hold a sufficiently large array. And possibly
> > > > add an rte_max_ethports variable to indicate the number of populated
> > > > entries in the array, for use when iterating over the array.
> > > >
> > > > Can we come up with another example than RTE_MAX_ETHPORTS where
> > > > this library provides a better benefit?
> > >
> > > What is big enough?
> > > Is 640KB enough for RAM? ;)
> >
> > If I understand it correctly, Linux process allocates 640KB due to
> > that fact currently
> > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS] is global and it
> > is from BSS.
>
> Correct.
>
> > If we make this from heap i.e use malloc() to allocate this memory
> > then in my understanding Linux
> > really won't allocate the real page for backend memory until unless,
> > someone write/read to this memory.
>
> If the array is allocated from the heap, its members will be accessed though
> a pointer to the array, e.g. in rte_eth_rx/tx_burst(). This might affect
> performance, which is probably why the array is allocated the way it is.
>
> Although it might be worth investigating how much it actually affects the
> performance.

It should not. From the CPU and compiler point of view it is the same. If you
look at cryptodev, it uses the following:

    static struct rte_cryptodev rte_crypto_devices[RTE_CRYPTO_MAX_DEVS];
    struct rte_cryptodev *rte_cryptodevs = rte_crypto_devices;

and accesses the devices through rte_cryptodevs[].

Also, this structure is not cache aligned. We probably need to fix that.

> So we need to do something else if we want to conserve memory and still allow
> a large rte_eth_devices[] array.
>
> Looking at struct rte_eth_dev, we could reduce its size as follows:
>
> 1. Change the two callback arrays
> post_rx/pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT] to pointers to callback
> arrays, which are allocated from the heap.
> With the default RTE_MAX_QUEUES_PER_PORT of 1024, these two arrays are the
> sinners that make the struct rte_eth_dev use so much memory. This
> modification would save 16 KB (minus 16 bytes for the pointers to the two
> arrays) per port.
> Furthermore, these callback arrays would only need to be allocated if the
> application is compiled with callbacks enabled (#define
> RTE_ETHDEV_RXTX_CALLBACKS). And they would only need to be sized to the
> actual number of queues for the port.
>
> The disadvantage is that this would add another level of indirection,
> although only for applications compiled with callbacks enabled.

I think we don't need one more indirection if everything is allocated from the
heap, as memory is not wasted if it is not touched by the CPU.

>
> 2. Remove reserved_64s[4] and reserved_ptrs[4]. This would save 64 bytes per
> port. Not much, but worth considering if we are changing the API/ABI anyway.
>
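
To make option 1 from the quoted proposal concrete, here is a minimal sketch of the two layouts being compared. It is not the real ethdev code: every `sketch_*` / `SKETCH_*` name is a hypothetical stand-in, and only the two callback arrays are modelled. The value 1024 is the default RTE_MAX_QUEUES_PER_PORT mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for RTE_MAX_QUEUES_PER_PORT (default 1024). */
#define SKETCH_MAX_QUEUES_PER_PORT 1024

struct sketch_cb; /* opaque callback node, never dereferenced here */

/* Layout as described in the thread: two fixed-size pointer arrays. */
struct sketch_dev_fixed {
	struct sketch_cb *post_rx_burst_cbs[SKETCH_MAX_QUEUES_PER_PORT];
	struct sketch_cb *pre_tx_burst_cbs[SKETCH_MAX_QUEUES_PER_PORT];
};

/* Proposed layout: pointers to arrays sized to the port's queue counts. */
struct sketch_dev_dyn {
	struct sketch_cb **post_rx_burst_cbs;
	struct sketch_cb **pre_tx_burst_cbs;
	uint16_t nb_rx_queues;
	uint16_t nb_tx_queues;
};

/* Allocate zeroed callback arrays; returns 0 on success, -1 on failure. */
int sketch_dev_dyn_init(struct sketch_dev_dyn *dev,
			uint16_t nb_rx, uint16_t nb_tx)
{
	assert(dev != NULL);
	if (nb_rx == 0 || nb_tx == 0)
		return -1; /* this sketch treats zero queues as invalid */

	dev->post_rx_burst_cbs = calloc(nb_rx, sizeof(*dev->post_rx_burst_cbs));
	dev->pre_tx_burst_cbs = calloc(nb_tx, sizeof(*dev->pre_tx_burst_cbs));
	if (dev->post_rx_burst_cbs == NULL || dev->pre_tx_burst_cbs == NULL) {
		free(dev->post_rx_burst_cbs);
		free(dev->pre_tx_burst_cbs);
		dev->post_rx_burst_cbs = NULL;
		dev->pre_tx_burst_cbs = NULL;
		return -1;
	}
	dev->nb_rx_queues = nb_rx;
	dev->nb_tx_queues = nb_tx;
	return 0;
}

/* Release the callback arrays and reset the struct. */
void sketch_dev_dyn_fini(struct sketch_dev_dyn *dev)
{
	assert(dev != NULL);
	free(dev->post_rx_burst_cbs);
	free(dev->pre_tx_burst_cbs);
	dev->post_rx_burst_cbs = NULL;
	dev->pre_tx_burst_cbs = NULL;
	dev->nb_rx_queues = 0;
	dev->nb_tx_queues = 0;
}

/* Bytes used for callbacks by the dynamic layout, struct included. */
size_t sketch_cb_bytes_dyn(uint16_t nb_rx, uint16_t nb_tx)
{
	return sizeof(struct sketch_dev_dyn) +
	       ((size_t)nb_rx + nb_tx) * sizeof(struct sketch_cb *);
}
```

With 8-byte pointers, `sizeof(struct sketch_dev_fixed)` is 2 * 1024 * 8 bytes, which is the 16 KB per port quoted above, while the dynamic layout costs only the two pointers plus one pointer per configured queue. Whether the extra pointer load matters in the rx/tx burst path is exactly the open question in this thread; the sketch says nothing about that.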