+CC: EAL and Memory maintainers.

> From: Don Wallwork [mailto:d...@xsightlabs.com]
> Sent: Tuesday, 26 April 2022 23.26
> 
> On 4/26/2022 5:21 PM, Stephen Hemminger wrote:
> > On Tue, 26 Apr 2022 17:01:18 -0400
> > Don Wallwork <d...@xsightlabs.com> wrote:
> >
> >> On 4/26/2022 10:58 AM, Stephen Hemminger wrote:
> >>> On Tue, 26 Apr 2022 08:19:59 -0400
> >>> Don Wallwork <d...@xsightlabs.com> wrote:
> >>>
> >>>> Add support for using hugepages for worker lcore stack memory.
> >>>> The intent is to improve performance by reducing stack memory
> >>>> related TLB misses and also by using memory local to the NUMA
> >>>> node of each lcore.

This certainly seems like a good idea!

However, I wonder: Does the O/S assign memory local to the NUMA node when 
instantiating an lcore-pinned thread's stack? And does the DPDK EAL ensure 
that the preconditions for the O/S to do that are present?

(Not relevant for this patch, but the same locality questions come to mind 
regarding Thread Local Storage.)
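
As a cross-check, something like the following could report which NUMA node 
actually backs a stack page at runtime. This is only a sketch of mine, not 
part of the patch, assuming libnuma is available (move_pages() with a NULL 
nodes array does not move anything; it only queries each page's current 
location):

#include <numaif.h>	/* move_pages(); link with -lnuma */
#include <stdint.h>
#include <unistd.h>

static int stack_page_node(void)
{
	int status = -1;
	/* Round a stack variable's address down to a page boundary. */
	uintptr_t mask = ~((uintptr_t)sysconf(_SC_PAGESIZE) - 1);
	void *page = (void *)((uintptr_t)&status & mask);

	if (move_pages(0 /* self */, 1, &page, NULL, &status, 0) != 0)
		return -1;
	return status;	/* NUMA node id, or a negative errno per page */
}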

> >>>>
> >>>> Platforms desiring to make use of this capability must enable the
> >>>> associated option flag and stack size settings in platform config
> >>>> files.
> >>>> ---
> >>>>    lib/eal/linux/eal.c | 39 +++++++++++++++++++++++++++++++++++++++
> >>>>    1 file changed, 39 insertions(+)
> >>>>
> >>> Good idea, but having a fixed-size stack makes writing complex
> >>> applications more difficult. Plus you lose the safety of guard pages.

Would it be possible to add a guard page or guard region by using the O/S 
memory allocator instead of rte_zmalloc_socket()? Since the stack is 
considered private to the process, i.e. not accessible from other processes, 
this patch does not need to provide remote access to stack memory from 
secondary processes - and thus it is not a requirement for this feature to 
use DPDK managed memory.
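
For illustration, a minimal sketch (my assumption of how it could look, not 
code from the patch) that allocates a worker stack with mmap() and turns the 
lowest page into a guard region with mprotect():

#include <sys/mman.h>
#include <stddef.h>

/* Returns the usable stack base, or NULL on failure. With hugepage
 * backing (MAP_HUGETLB), the guard would occupy a whole hugepage,
 * since protection changes apply at the mapping's page granularity. */
static void *alloc_stack_with_guard(size_t stack_size, size_t page_size)
{
	size_t total = stack_size + page_size;
	void *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);

	if (base == MAP_FAILED)
		return NULL;
	/* Stacks grow down: make the lowest page inaccessible, so an
	 * overflow faults instead of silently corrupting memory. */
	if (mprotect(base, page_size, PROT_NONE) != 0) {
		munmap(base, total);
		return NULL;
	}
	return (char *)base + page_size;
}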

> >> Thanks for the quick reply.
> >>
> >> The expectation is that use of this optional feature would be limited
> >> to cases where the performance gains justify the implications of these
> >> tradeoffs. For example, a specific data plane application may be okay
> >> with limited stack size and could be tested to ensure stack usage
> >> remains within limits.

How would one identify the required stack size and verify that it is not 
exceeded? If aiming for small stacks, some instrumentation would be nice, 
along the lines of rte_mempool_audit() and rte_mempool_list_dump().
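
For example (a sketch with hypothetical names, just to illustrate the idea): 
pre-fill each worker stack with a known pattern before launching the thread, 
then scan for the high-water mark when auditing:

#include <stdint.h>
#include <stddef.h>

#define STACK_FILL_PATTERN 0xA5A5A5A5A5A5A5A5ULL

static void stack_prefill(uint64_t *base, size_t words)
{
	for (size_t i = 0; i < words; i++)
		base[i] = STACK_FILL_PATTERN;
}

/* The stack grows down, so untouched pattern words remain at the low
 * end; everything above the first overwritten word has been used. */
static size_t stack_high_water_bytes(const uint64_t *base, size_t words)
{
	size_t i = 0;

	while (i < words && base[i] == STACK_FILL_PATTERN)
		i++;
	return (words - i) * sizeof(uint64_t);
}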

Alternatively, just assume that the stack is "always big enough", and don't 
worry about it - like the default O/S stack size. And as Stephen already 
mentioned: Regardless of stack size, overflowing the stack will cause memory 
corruption instead of a segmentation fault.

Keep in mind that the required stack size not only depends on the application, 
but also on DPDK and other libraries being used by the application.

> >>
> >> Also, since this applies only to worker threads, the main thread
> >> would not be impacted by this change.
> >>
> >>
> > I would prefer it as a runtime, not compile time option.
> > That way distributions could ship DPDK and an application could opt in
> > if it wanted.
> Good point. I'll work on a v2 and will post that when it's ready.

May I suggest using the stack size configured in the O/S, from 
pthread_attr_getstacksize() or similar, instead of choosing the stack size 
manually? If you want it to be configurable, use the default size unless 
explicitly specified otherwise.
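
E.g. something like this sketch (with glibc, a default-initialized attribute 
reports the process default stack size, typically derived from RLIMIT_STACK):

#include <pthread.h>
#include <stddef.h>

static size_t default_stack_size(void)
{
	pthread_attr_t attr;
	size_t size = 0;

	if (pthread_attr_init(&attr) != 0)
		return 0;
	pthread_attr_getstacksize(&attr, &size);
	pthread_attr_destroy(&attr);
	return size;
}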

Do the worker threads need a different stack size than the main thread? In my 
opinion: "Nice to have", not "must have".

Do the worker threads need different stack sizes individually? In my opinion: 
Perhaps "nice to have", certainly not "must have".
