On Fri, Jul 19, 2024 at 04:31:11PM +0100, Ferruh Yigit wrote: > On 7/19/2024 2:22 PM, Bruce Richardson wrote: > > On Fri, Jul 19, 2024 at 12:10:24PM +0100, Ferruh Yigit wrote: > >> One option can be adding a warning log to the fallback case, saying that > >> memory allocated from non-local socket and performance will be less. > >> Although this message may not mean much to a new user, it may still help > >> via a support engineer or internet search... > >> > > > > Yes, we could add a warning, but that is probably better in the app itself. > > Thinking about where we get issues, they primarily stem from running the > > cores on a different numa node Since the private_data struct is accessed > > by cores not devices, any perf degradation will come from having it remote > > to the cores. Because of that, I actually think the original implementation > > should really have used "rte_socket_id()" rather than the device socket id > > for the allocation. > > > > Yes I guess private_data is not accessed by device, but it may be > accessed by cores that is running the datapath. > > This API may be called by core that is not involved to the datapath, so > it may not correct to allocate memory from its numa. > > Will it be wrong to assume that cores used for datapath will be same > numa with device, if so allocating memory from that numa (which device > is in) makes more sense. Am I missing something? >
It depends on which you think is more likely for the polling cores: - they will be on the same socket as the device, but different socket to the main lcore. - they will be on the same socket as the main lcore, but different socket to the device. Personally, I'd suspect both to be unlikely, but also both to be possible. For the first scenario, I don't see anything being broken or affected by the proposed fix here, since priority is still being given to memory on the same socket as the device. It just opens up the possibility of scenario two. /Bruce