On Sat, May 20, 2023 at 23:03, Burakov, Anatoly <anatoly.bura...@intel.com> wrote:
>
> Hi,
>
> On 5/16/2023 1:21 PM, Fengnan Chang wrote:
> > Under legacy mode, if the number of contiguous memsegs is greater
> > than RTE_MAX_MEMSEG_PER_LIST, EAL init will fail even though
> > another memseg list is empty, because only one memseg list is
> > checked in remap_needed_hugepages.
> >
> > For example:
> > hugepage configuration:
> > 20480
> > 13370
> > 7110
> >
> > startup log:
> > EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> > EAL: Detected memory type: socket_id:1 hugepage_sz:2097152
> > EAL: Creating 4 segment lists: n_segs:8192 socket_id:0 hugepage_sz:2097152
> > EAL: Creating 4 segment lists: n_segs:8192 socket_id:1 hugepage_sz:2097152
> > EAL: Requesting 13370 pages of size 2MB from socket 0
> > EAL: Requesting 7110 pages of size 2MB from socket 1
> > EAL: Attempting to map 14220M on socket 1
> > EAL: Allocated 14220M on socket 1
> > EAL: Attempting to map 26740M on socket 0
> > EAL: Could not find space for memseg. Please increase 32768 and/or 65536 in
> > configuration.
>
> Unrelated, but this is probably the wrong message: it should have
> called out the config options to change, not their values. Sounds like
> a log message needs fixing somewhere...

In the older version, the log was:
EAL: Could not find space for memseg. Please increase
CONFIG_RTE_MAX_MEMSEG_PER_TYPE and/or CONFIG_RTE_MAX_MEM_PER_TYPE in
configuration.
Maybe that's better?
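
For what it's worth, the current option names appear to be
RTE_MAX_MEMSEG_PER_TYPE (32768) and RTE_MAX_MEM_MB_PER_TYPE (65536) in
rte_config.h, so a fixed message could name them instead of printing
their values. A rough sketch (the exact location of this log call in
EAL is an assumption on my part):

	/* name the config options rather than their current values */
	RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase "
		"RTE_MAX_MEMSEG_PER_TYPE and/or RTE_MAX_MEM_MB_PER_TYPE in "
		"configuration.\n");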

>
> > EAL: Couldn't remap hugepage files into memseg lists
> > EAL: FATAL: Cannot init memory
> > EAL: Cannot init memory
> >
> > Signed-off-by: Fengnan Chang <changfeng...@bytedance.com>
> > Signed-off-by: Lin Li <lilint...@bytedance.com>
> > ---
> >   lib/eal/linux/eal_memory.c | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
> > index 60fc8cc6ca..36b9e78f5f 100644
> > --- a/lib/eal/linux/eal_memory.c
> > +++ b/lib/eal/linux/eal_memory.c
> > @@ -1001,6 +1001,8 @@ remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
> >               if (cur->size == 0)
> >                       break;
> >
> > +             if (cur_page - seg_start_page >= RTE_MAX_MEMSEG_PER_LIST)
> > +                     new_memseg = 1;
>
> I don't think this is quite right because, technically,
> `RTE_MAX_MEMSEG_PER_LIST` only applies to segment lists for smaller
> page sizes - segment lists for larger page sizes will hit their limits
> earlier. So, while this will work for 2MB pages, it won't work for page
> sizes whose segment list length is smaller than the maximum (such as
> 1GB pages).
>
> I think this solution could be improved upon by trying to break up the
> contiguous area instead. I suspect the core of the issue is not even the
> fact that we're exceeding limits of one memseg list, but that we're
> always attempting to map exactly N pages in `remap_hugepages`, which
> results in us leaving large contiguous zones inside memseg lists unused
> because we couldn't satisfy current allocation request and skipped to a
> new memseg list.

Correct, I didn't consider the 1GB page case; I get your point.
Thanks.
>
> For example, let's suppose we found a large contiguous area that
> would've exceeded the limits of the current memseg list. Sooner or
> later, this contiguous area will end, and we'll attempt to remap this
> virtual area into a memseg list. Whenever that happens, we call into
> the remap code, which will start with the first segment list, attempt
> to find exactly N free spots, fail to do so, and skip to the next
> segment list.
>
> Thus, sooner or later, if we get contiguous areas that are large enough,
> we will not populate our memseg lists but instead skip through them, and
> start with a new memseg list every time we need a large contiguous area.
> We prioritize having a large contiguous area over using up all of our
> memory map.
>
> If, instead, we could break up the allocation - that is, use
> `rte_fbarray_find_biggest_free()` instead of
> `rte_fbarray_find_next_n_free()`, and keep doing it until we run out of
> segment lists, we will achieve the same result your patch does, but have
> it work for all page sizes, because now we would be targeting the actual
> issue (under-utilization of memseg lists), not its symptoms (exceeding
> segment list limits for large allocations).
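
To make this concrete, here is a rough, untested sketch of such a
splitting loop; next_compatible_msl() and remap_pages_at() are
hypothetical placeholders for the existing memseg-list selection and
remap logic in eal_memory.c:

	/* Map `need` pages starting at hugepages[start], splitting the
	 * run across memseg lists by taking the biggest free chunk of
	 * each list instead of demanding a single run of `need` slots.
	 * next_compatible_msl() is assumed to return the next candidate
	 * list for this page size/socket, or NULL when exhausted. */
	static int
	remap_split(struct hugepage_file *hugepages, int start, int need,
			uint64_t page_sz, int socket_id)
	{
		int mapped = 0;

		while (mapped < need) {
			struct rte_memseg_list *msl =
				next_compatible_msl(page_sz, socket_id);
			if (msl == NULL)
				return mapped; /* out of memseg lists */

			/* biggest contiguous free run in this list */
			int idx = rte_fbarray_find_biggest_free(
					&msl->memseg_arr, 0);
			if (idx < 0)
				continue; /* list full, try the next one */
			int len = rte_fbarray_find_contig_free(
					&msl->memseg_arr, idx);
			int n = RTE_MIN(len, need - mapped);

			/* remap n pages at msl slot idx */
			if (remap_pages_at(msl, idx,
					&hugepages[start + mapped], n) < 0)
				return -1;
			mapped += n;
		}
		return mapped;
	}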
>
> This logic could either be inside `remap_hugepages`, or we could just
> return the number of pages mapped from `remap_hugepages` and have the
> calling code (`remap_needed_hugepages`) try again, this time with a
> different start segment, reflecting how many pages we actually mapped.
> IMO this would be easier to implement, as `remap_hugepages` is overly
> complex as it is!
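
And a sketch of that second option (again untested; it assumes
remap_hugepages() is changed to return the number of pages it actually
mapped, negative on error, with variable names borrowed from
remap_needed_hugepages()):

	/* Once a contiguous run [seg_start_page, cur_page) is found,
	 * keep remapping whatever remains instead of requiring a single
	 * memseg list to hold the entire run. */
	int to_map = cur_page - seg_start_page;

	while (to_map > 0) {
		int mapped = remap_hugepages(hugepages,
				seg_start_page, to_map);
		if (mapped <= 0)
			return -1; /* no space left in any memseg list */
		seg_start_page += mapped;
		to_map -= mapped;
	}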
>
> >               if (cur_page == 0)
> >                       new_memseg = 1;
> >               else if (cur->socket_id != prev->socket_id)
>
> --
> Thanks,
> Anatoly
>
