On Sun, Jul 21, 2024 at 11:56:08PM +0100, Ferruh Yigit wrote:
> On 7/19/2024 5:10 PM, Bruce Richardson wrote:
> > On Fri, Jul 19, 2024 at 04:31:11PM +0100, Ferruh Yigit wrote:
> >> On 7/19/2024 2:22 PM, Bruce Richardson wrote:
> >>> On Fri, Jul 19, 2024 at 12:10:24PM +0100, Ferruh Yigit wrote:
> >>>> One option can be adding a warning log to the fallback case, saying that
> >>>> memory allocated from non-local socket and performance will be less.
> >>>> Although this message may not mean much to a new user, it may still help
> >>>> via a support engineer or internet search...
> >>>>
> >>>
> >>> Yes, we could add a warning, but that is probably better in the app 
> >>> itself.
> >>> Thinking about where we get issues, they primarily stem from running the
> >>> cores on a different numa node  Since the private_data struct is accessed
> >>> by cores not devices, any perf degradation will come from having it remote
> >>> to the cores. Because of that, I actually think the original 
> >>> implementation
> >>> should really have used "rte_socket_id()" rather than the device socket id
> >>> for the allocation.
> >>>
> >>
> >> Yes I guess private_data is not accessed by device, but it may be
> >> accessed by cores that is running the datapath.
> >>
> >> This API may be called by core that is not involved to the datapath, so
> >> it may not correct to allocate memory from its numa.
> >>
> >> Will it be wrong to assume that cores used for datapath will be same
> >> numa with device, if so allocating memory from that numa (which device
> >> is in) makes more sense. Am I missing something?
> >>
> > 
> > It depends on which you think is more likely for the polling cores:
> > - they will be on the same socket as the device, but different socket to
> >   the main lcore.
> > - they will be on the same socket as the main lcore, but different socket
> >   to the device.
> > 
> > Personally, I'd suspect both to be unlikely, but also both to be possible.
> > For the first scenario, I don't see anything being broken or affected by
> > the proposed fix here, since priority is still being given to memory on the
> > same socket as the device. It just opens up the possibility of scenario
> > two.
> > 
> 
> My comment was on suggestion to use "rte_socket_id()" rather than the
> device socket id,
> if both nodes have memory, memory should be allocated from the one where
> device is in, because although device doesn't use private_data, polling
> cores will and polling cores will be most likely in the same node with
> device and memory, but we don't know main core is in.
> So I think first try for memory allocation should be node where device
> is in, which is the existing code.
> 
> If node that has device doesn't have any memory attached, your change
> enables this case, as already there is memory only in one node, it
> doesn't matter if we check device node or main core node anyway.
> 
> 
> Briefly, I am OK to current patch with a warning log in fallback, but
> not to "rte_socket_id()" change.
>
Ack, makes sense. Done in V2 patch.

Thanks for review.

/Bruce 

Reply via email to