On Mon, Jun 14, 2021 at 8:29 PM Ananyev, Konstantin
<konstantin.anan...@intel.com> wrote:
>
> > 14/06/2021 15:15, Bruce Richardson:
> > > On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
> > > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
> > > > > Sent: Monday, 14 June 2021 12.59
> > > > >
> > > > > Performance of access in a fixed-size array is very good
> > > > > because of cache locality
> > > > > and because there is a single pointer to dereference.
> > > > > The only drawback is the lack of flexibility:
> > > > > the size of such an array cannot be increased at runtime.
> > > > >
> > > > > An approach to this problem is to allocate the array at runtime,
> > > > > being as efficient as static arrays, but still limited to a maximum.
> > > > >
> > > > > That's why the API rte_parray is introduced,
> > > > > allowing declaration of an array of pointers which can be resized
> > > > > dynamically and automatically at runtime while keeping good read
> > > > > performance.
> > > > >
> > > > > After a resize, the previous array is kept until the next resize
> > > > > to avoid crashes during a read without any lock.
> > > > >
> > > > > Each element is a pointer to a dynamically allocated memory chunk.
> > > > > This is not good for cache locality, but it allows keeping the same
> > > > > memory per element, no matter how the array is resized.
> > > > > Cache locality could be improved with mempools.
> > > > > The other drawback is having to dereference one more pointer
> > > > > to read an element.
> > > > >
> > > > > There are not many locks, so the API is for internal use only.
> > > > > This API may be used to completely remove some compilation-time
> > > > > maximums.
> > > >
> > > > I get the purpose and overall intention of this library.
> > > >
> > > > I probably already mentioned that I prefer "embedded style
> > > > programming" with fixed-size arrays, rather than runtime
> > > > configurability. It's my personal opinion, and the DPDK Tech Board
> > > > clearly prefers reducing the amount of compile-time configurability,
> > > > so there is no way for me to stop this progress, and I do not intend
> > > > to oppose this library. :-)
> > > >
> > > > This library is likely to become a core library of DPDK, so I think
> > > > it is important to get it right. Could you please mention a few
> > > > examples of where you think this internal library should be used,
> > > > and where it should not be used? Then it is easier to discuss whether
> > > > the borderline between control path and data plane is correct.
> > > > E.g. this library is not intended to be used for dynamically sized
> > > > packet queues that grow and shrink in the fast path.
> > > >
> > > > If the library becomes a core DPDK library, it should probably be
> > > > public instead of internal. E.g. if the library is used to make
> > > > RTE_MAX_ETHPORTS dynamic instead of compile-time fixed, then some
> > > > applications might also need dynamically sized arrays for their
> > > > application-specific per-port runtime data, and this library could
> > > > serve that purpose too.
> > > >
> > > Thanks Thomas for starting this discussion and Morten for the
> > > follow-up.
> > >
> > > My thinking is as follows, and I'm particularly keeping in mind the
> > > case of e.g. RTE_MAX_ETHPORTS as a leading candidate here.
> > >
> > > While I dislike the hard-coded limits in DPDK, I'm also not convinced
> > > that we should switch away from the flat arrays or that we need fully
> > > dynamic arrays that grow/shrink at runtime for ethdevs.
> > > I would suggest a half-way house here, where we keep the ethdevs as
> > > an array, but one allocated/sized at runtime rather than statically.
> > > This would allow us to have a compile-time default value, but, for
> > > use cases that need it, allow use of a flag e.g. "max-ethdevs" to
> > > change the size of the parameter given to the malloc call for the
> > > array. This max limit could then be provided to apps too if they want
> > > to match any array sizes. [Alternatively, those apps could check the
> > > provided size and error out if the size has been increased beyond
> > > what the app is designed to use?] There would be no extra
> > > dereferences per rx/tx burst call in this scenario, so performance
> > > should be the same as before (potentially better if the array is in
> > > hugepage memory, I suppose).
> >
> > I think we need some benchmarks to decide what is the best tradeoff.
> > I spent time on this implementation, but sorry, I won't have time for
> > benchmarks.
> > Volunteers?
>
> I had only a quick look at your approach so far.
> But from what I can read, in an MT environment your suggestion will
> require extra synchronization for each read-write access to such a
> parray element (lock, RCU, ...).
> I think what Bruce suggests will be much lighter, easier to implement
> and less error-prone.
> At least for rte_ethdevs[] and friends.
+1

> Konstantin
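For concreteness, here is a minimal sketch of the kind of init-time
sizing Bruce describes: a flat array allocated once at startup, with
the size taken from a runtime value instead of a compile-time constant.
The names (struct port, port_array_init(), DEFAULT_MAX_PORTS, the idea
of a "max-ethdevs"-style option feeding requested_max) are assumptions
made up for this sketch, not existing DPDK code:

/*
 * Illustrative only: flat device array allocated once at startup,
 * sized from a runtime value rather than a compile-time constant.
 */
#include <stdlib.h>
#include <stdint.h>

#define DEFAULT_MAX_PORTS 32            /* compile-time default */

struct port {                           /* stand-in for the real per-port struct */
    void *private_data;
};

static struct port *ports;              /* flat array, allocated once */
static uint16_t max_ports;

/* Called once at init; requested_max could come from a "max-ethdevs"-style flag. */
static int
port_array_init(uint16_t requested_max)
{
    max_ports = requested_max ? requested_max : DEFAULT_MAX_PORTS;
    ports = calloc(max_ports, sizeof(*ports));
    return ports == NULL ? -1 : 0;
}

/* Fast-path lookup: one dereference, same as a static array. */
static inline struct port *
port_get(uint16_t port_id)
{
    return &ports[port_id];             /* caller must ensure port_id < max_ports */
}

The fast path keeps a single dereference into a flat array, which is
why this scheme should not change rx/tx burst performance.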
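And, for comparison, a minimal sketch of the structure described in the
parray patch above: an array of element pointers that can grow at
runtime, where the previous array is kept after a resize so that a
reader without a lock never dereferences freed memory. Again, the names
and layout are assumptions for illustration, not the actual rte_parray
API:

/*
 * Illustrative only: growable array of element pointers where the
 * previous array is kept across a resize so an unlocked reader does
 * not crash. Not the actual rte_parray implementation.
 */
#include <stdlib.h>
#include <string.h>

struct parray {
    void **data;        /* current array of element pointers */
    void **old_data;    /* previous array, kept until the next resize */
    int size;
};

static int
parray_grow(struct parray *pa, int new_size)
{
    void **new_data = calloc(new_size, sizeof(void *));

    if (new_data == NULL)
        return -1;
    if (pa->data != NULL)
        memcpy(new_data, pa->data, pa->size * sizeof(void *));

    free(pa->old_data);      /* readers of this older array are assumed gone */
    pa->old_data = pa->data; /* keep the just-replaced array for in-flight readers */
    pa->data = new_data;     /* a real version needs a release store / RCU here */
    pa->size = new_size;
    return 0;
}

/* Read path: two dereferences (array pointer, then element pointer). */
static inline void *
parray_get(const struct parray *pa, int idx)
{
    return pa->data[idx];
}

The extra indirection on the read path and the need for some
synchronization (lock, RCU, ...) on concurrent read-write access are
exactly the costs being weighed against the flat-array approach.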