On Sat, Dec 21, 2019 at 2:37 AM Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> wrote:
>
> <snip>
>
> > > > From: Jerin Jacob <jer...@marvell.com>
> > > >
> > > > The existing optimize_object_size() function addresses the memory
> > > > object alignment constraint on x86 for better performance.
> > > >
> > > > Different (micro)architectures may have different memory alignment
> > > > constraints for better performance, and these are not the same as
> > > > what the existing optimize_object_size() function implements. Some
> > > > use an XOR (kind of CRC) scheme to enable DRAM channel distribution
> > > > based on the address, and some may have a different formula.
> > > If I understand correctly, address interleaving is a characteristic
> > > of the memory controller and not the CPU.
> > > For example, different SoCs using the same Arm architecture might
> > > have different memory controllers. So, the solution should not be
> > > architecture specific, but SoC specific.
> >
> > Yes. See below.
> >
> > > > -static unsigned optimize_object_size(unsigned obj_size)
> > > > +static unsigned
> > > > +arch_mem_object_align(unsigned obj_size)
> > > >  {
> > > >  	unsigned nrank, nchan;
> > > >  	unsigned new_obj_size;
> > > > @@ -99,6 +101,13 @@ static unsigned optimize_object_size(unsigned obj_size)
> > > >  		new_obj_size++;
> > > >  	return new_obj_size * RTE_MEMPOOL_ALIGN;
> > > >  }
> > > > +#else
> > > This applies to all Arm (PPC as well) SoCs, which might have
> > > different schemes depending on the memory controller. IMO, this
> > > should not be architecture specific.
> >
> > I agree in principle.
> > I will summarize the
> > https://www.mail-archive.com/dev@dpdk.org/msg149157.html feedback:
> >
> > 1) For the x86 arch, it is architecture-specific.
> > 2) For the Power PC arch, it is architecture-specific.
> > 3) For the ARM case, it will be memory controller specific.
> > 4) For the ARM case, the memory controller is not using the existing
> > x86 arch formula.
> > 5) If it is memory/arch-specific, can userspace code find the optimal
> > alignment? In the case of octeontx2/arm64, the memory controller does
> > an XOR on the PA address, over which userspace code doesn't have much
> > control.
> >
> > This patch addresses the known cases of (1), (2), (4) and (5). (2) can
> > be added to this framework when POWER9 folks want it.
> >
> > We can extend this patch to address (3) if there is a case. Without the
> > actual requirement (if someone can share an alignment formula which is
> > memory controller specific and does not come under (4)), then we can
> > create an extra layer for the memory controller and an abstraction to
> > probe it.
> > Again, there is no standard way of probing the memory controller in
> > userspace, and we would need a platform #define, which won't work for
> > distribution builds.
> > So the solution needs to be arch-specific and then fine-tuned to the
> > memory controller if possible.
> >
> > I can work on creating an extra layer of code if someone can provide
> > the details of the memory controller and probing mechanism, or this
> > patch can be extended
> Inputs for BlueField, DPAAx, ThunderX2 would be helpful.

Yes. Probably the memory controller used in the n1sdp SoC as well.

> > to support such a case if it arises in future.
> >
> > Thoughts?
> How much memory will this save for your platform? Is it affecting
> performance?

No performance difference.

The existing code adds a trailer to each object. The additional
(trailer) space is a function of the number of objects in the mempool,
the obj_size, its alignment, and the values returned by
rte_memory_get_nchannel() and rte_memory_get_nrank().

I will wait for inputs from BlueField, DPAAx, ThunderX2 and n1sdp (if
any) before reworking the patch.

> > > >
> > > > +static unsigned
> > > > +arch_mem_object_align(unsigned obj_size)
> > > > +{
> > > > +	return obj_size;
> > > > +}
> > > > +#endif
> > > >
> > > >  struct pagesz_walk_arg {
> > > >  	int socket_id;
> > > > @@ -234,8 +243,8 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
> > > >  	 */
> > > >  	if ((flags & MEMPOOL_F_NO_SPREAD) == 0) {
> > > >  		unsigned new_size;
> > > > -		new_size = optimize_object_size(sz->header_size + sz->elt_size +
> > > > -			sz->trailer_size);
> > > > +		new_size = arch_mem_object_align
> > > > +			(sz->header_size + sz->elt_size +
> > > > +			 sz->trailer_size);
> > > >  		sz->trailer_size = new_size - sz->header_size -
> > > >  			sz->elt_size;
> > > >  	}
> > > >
> > > > --
> > > > 2.24.1
> > >
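
To make the trailer-space remark above concrete, here is a minimal
standalone sketch of an x86-style spread calculation next to the
pass-through fallback from the patch. It is not the DPDK source: the
sketch_* names and SKETCH_ALIGN are hypothetical, the 64-byte alignment
stands in for RTE_MEMPOOL_ALIGN, and the fallback values of 4 channels
and 1 rank are assumptions. The channel and rank counts are passed in as
parameters instead of being read from rte_memory_get_nchannel() and
rte_memory_get_nrank().

```c
#include <stddef.h>

/* Assumed alignment unit in bytes (stand-in for RTE_MEMPOOL_ALIGN). */
#define SKETCH_ALIGN 64u

/* Greatest common divisor (Euclid's algorithm). */
static unsigned sketch_gcd(unsigned a, unsigned b)
{
	while (b != 0) {
		unsigned t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/*
 * x86-style spread: round obj_size up to whole alignment units, then
 * grow it until the unit count is coprime with nchan * nrank, so that
 * consecutive objects start on different channels/ranks.
 */
static unsigned sketch_spread_size(unsigned obj_size, unsigned nchan,
				   unsigned nrank)
{
	unsigned units;

	if (nchan == 0)
		nchan = 4;	/* assumed default when unknown */
	if (nrank == 0)
		nrank = 1;	/* assumed default when unknown */

	units = (obj_size + SKETCH_ALIGN - 1) / SKETCH_ALIGN;
	while (sketch_gcd(units, nchan * nrank) != 1)
		units++;
	return units * SKETCH_ALIGN;
}

/* Fallback as in the patch's #else branch: no extra padding. */
static unsigned sketch_identity_size(unsigned obj_size)
{
	return obj_size;
}

/* Total extra trailer bytes the spread costs across a whole pool. */
static size_t sketch_extra_bytes(unsigned obj_size, unsigned nchan,
				 unsigned nrank, size_t n_objs)
{
	unsigned padded = sketch_spread_size(obj_size, nchan, nrank);

	return (size_t)(padded - obj_size) * n_objs;
}
```

Under these assumptions, with 4 channels and 1 rank a 2048-byte object
grows to 2112 bytes (33 units of 64 bytes), so a pool of 8192 such
objects carries 512 KiB of additional trailer space, while the
pass-through variant adds none.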