On Wed, Jun 16, 2021 at 05:01:46PM +0200, Morten Brørup wrote:
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Wednesday, 16 June 2021 15.03
> >
> > On Wed, Jun 16, 2021 at 01:27:17PM +0200, Morten Brørup wrote:
> > > > From: Jerin Jacob [mailto:jerinjac...@gmail.com]
> > > > Sent: Wednesday, 16 June 2021 11.42
> > > >
> > > > On Tue, Jun 15, 2021 at 12:18 PM Thomas Monjalon <tho...@monjalon.net>
> > > > wrote:
> > > > >
> > > > > 14/06/2021 17:48, Morten Brørup:
> > > > > > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
> > > > > > > It would be much simpler to just increase RTE_MAX_ETHPORTS to
> > > > > > > something big enough to hold a sufficiently large array. And
> > > > > > > possibly add an rte_max_ethports variable to indicate the number
> > > > > > > of populated entries in the array, for use when iterating over
> > > > > > > the array.
> > > > > > >
> > > > > > > Can we come up with another example than RTE_MAX_ETHPORTS where
> > > > > > > this library provides a better benefit?
> > > > >
> > > > > What is big enough?
> > > > > Is 640KB enough for RAM? ;)
> > > >
> > > > If I understand it correctly, a Linux process allocates 640KB due to
> > > > the fact that currently
> > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS] is global and
> > > > is from BSS.
> > >
> > > Correct.
> > >
> > > > If we make this from heap, i.e. use malloc() to allocate this memory,
> > > > then in my understanding Linux
> > > > really won't allocate the real pages for the backing memory until
> > > > someone writes/reads to this memory.
> > >
> > > If the array is allocated from the heap, its members will be accessed
> > > through a pointer to the array, e.g. in rte_eth_rx/tx_burst(). This
> > > might affect performance, which is probably why the array is allocated
> > > the way it is.
> > >
> >
> > It depends on whether the array contains pointers to malloced elements or
> > the array itself is just a single malloced array of all the structures.
> > While I think the parray proposal referred to the former - which would have
> > an extra level of indirection - the switch we are discussing here is the
> > latter which should have no performance difference, since the method of
> > accessing the elements will be the same, only with the base address
> > pointing to a different area of memory.
>
> I was not talking about an array of pointers. And it is not the same:
>
> int arr[27];
> int * parr = arr;
>
> // direct access
> int dir(int i) { return arr[i]; }
>
> // indirect access
> int indir(int i) { return parr[i]; }
>
> The direct access knows the address of arr, so it will compile to:
> movsx rdi, edi
> mov eax, DWORD PTR arr[0+rdi*4]
> ret
>
> The indirect access needs to first read the memory location holding the
> pointer to the array, and then it can read the array member, so it will
> compile to:
> mov rax, QWORD PTR parr[rip]
> movsx rdi, edi
> mov eax, DWORD PTR [rax+rdi*4]
> ret
>
Interesting, thanks. Definitely seems like a bit of perf testing will be
needed whatever way we go.
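
For anyone wanting to try this, a minimal sketch of the three layouts
discussed above (static array, single malloced array reached through a
pointer, and array of pointers to malloced elements) might look like the
following. All names here (struct dev, MAX_PORTS, the get_*() accessors and
layouts_init()) are made up for illustration and are not DPDK APIs; the
struct is a one-field stand-in for struct rte_eth_dev.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_PORTS 32            /* hypothetical stand-in for RTE_MAX_ETHPORTS */

struct dev {                    /* hypothetical stand-in for struct rte_eth_dev */
	uint64_t rx_count;
};

/* Layout A: static array in BSS, base address known at link time. */
static struct dev devs_static[MAX_PORTS];

/* Layout B: one heap block holding all the structs; the base address
 * has to be loaded from devs_heap before the element can be read. */
static struct dev *devs_heap;

/* Layout C: array of pointers to individually allocated structs, so
 * the element pointer is loaded first, then the field behind it. */
static struct dev *devs_ptrs[MAX_PORTS];

uint64_t get_static(unsigned int i) { return devs_static[i].rx_count; }
uint64_t get_heap(unsigned int i)   { return devs_heap[i].rx_count; }
uint64_t get_ptrs(unsigned int i)   { return devs_ptrs[i]->rx_count; }

/* Allocate layouts B and C and fill all three with the same values.
 * Returns 0 on success, -1 on allocation failure (no cleanup is done
 * on failure; this is only a sketch). */
int layouts_init(void)
{
	unsigned int i;

	devs_heap = calloc(MAX_PORTS, sizeof(*devs_heap));
	if (devs_heap == NULL)
		return -1;

	for (i = 0; i < MAX_PORTS; i++) {
		devs_ptrs[i] = calloc(1, sizeof(*devs_ptrs[i]));
		if (devs_ptrs[i] == NULL)
			return -1;
		devs_static[i].rx_count = i;
		devs_heap[i].rx_count = i;
		devs_ptrs[i]->rx_count = i;
	}
	return 0;
}
```

Building this with optimisation and looking at the generated code for the
three get_*() functions, or timing each of them in a tight loop, should give
a first idea of the difference. The result will likely depend on compiler,
flags and target, so it is only a starting point for proper testing in the
rx/tx burst paths.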