On 2024-11-09 00:52, Morten Brørup wrote:
From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
Sent: Friday, 8 November 2024 23.23

On 2024-11-08 20:53, Morten Brørup wrote:
From: Morten Brørup [mailto:m...@smartsharesystems.com]
Sent: Friday, 8 November 2024 19.35

From: David Marchand [mailto:david.march...@redhat.com]
Sent: Friday, 8 November 2024 19.18

OVS locks all pages to avoid page faults while processing packets.

It sounds smart, so I just took a look at how it does this. I'm not
sure, but it seems like it only locks pages that are actually mapped
(current and future).


mlockall(MCL_CURRENT) will bring in the whole BSS, it seems, plus all
the rest: unused parts of the execution stacks, the data section, and
unused code (text) in the binary and all libraries it has linked to.

It makes a simple (e.g., a unit test) DPDK 24.07 program use ~33x more
resident memory. With lcore variables, the same MCL_CURRENT-locked
program is ~30% larger than before. So, a relatively modest increase.
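
A minimal sketch of how such a measurement could be reproduced outside DPDK, assuming Linux with /proc mounted. The 32 MiB buffer and the names unused_bss, rss_kib() and rss_around_mlockall() are hypothetical and are not taken from DPDK or OVS. The buffer is zero-initialized and never written, so it sits in BSS without being resident until mlockall(MCL_CURRENT) faults it in.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical 32 MiB zero-initialized buffer; lives in BSS, never written. */
#define UNUSED_BSS_BYTES (32u * 1024u * 1024u)
char unused_bss[UNUSED_BSS_BYTES];

/* Resident set size in KiB from /proc/self/statm (Linux only); -1 on error. */
long rss_kib(void)
{
    long total_pages, resident_pages;
    FILE *f = fopen("/proc/self/statm", "r");

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld %ld", &total_pages, &resident_pages) != 2) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return resident_pages * (sysconf(_SC_PAGESIZE) / 1024);
}

/*
 * Record RSS before and after mlockall(MCL_CURRENT).
 * Returns 0 on success, -1 if mlockall() failed (for example because
 * RLIMIT_MEMLOCK is too low for an unprivileged process).
 */
int rss_around_mlockall(long *before_kib, long *after_kib)
{
    assert(before_kib != NULL && after_kib != NULL);

    *before_kib = rss_kib();
    if (mlockall(MCL_CURRENT) != 0) {
        *after_kib = *before_kib;
        return -1;
    }
    *after_kib = rss_kib();
    munlockall(); /* undo the lock so the rest of the process is unaffected */
    return 0;
}
```

Calling rss_around_mlockall() from main() and printing both values should show the RSS growing by roughly the size of the untouched buffer when the lock succeeds, although the exact figures depend on the toolchain and on what else is mapped into the process.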

Thank you for testing this, Mattias.
What are the absolute numbers, i.e. in KB, to get an idea of what I
should be looking for?


Hello world type program with static linking. Default DPDK config. x86_64.

DPDK version  MAX_LCORE_VAR   EAL params         mlock  RSS [MB]
22.11         -               --no-huge -m 1000  no     22
24.11         1048576         --no-huge -m 1000  no     22
24.11         1048576         --no-huge -m 1000  yes    1576
24.11         4096            --no-huge -m 1000  yes    1445
22.11         -               -                  yes    333*
24.11         1048576         -                  yes    542*
24.11         4096            -                  yes    411*

* Excluding huge pages

If you are more selective about which libraries you bring in, the footprint will be lower. How large a fraction is effectively unavoidable, I don't know. The relative increase will depend on how much memory the application uses, obviously. The hello world app doesn't have any app-level state.

I wonder why the footprint grows at all... Intuitively the same variables
should consume approximately the same amount of RAM, regardless of how they
are allocated.
Speculating...

lcore variables use malloc(), which in turn does not bring in memory pages unless they are needed. Much of the lcore buffer will be unused, and not resident. I covered this, including an example calculation of the space savings, in an earlier thread. It may be in the programmer's guide as well; I don't remember.
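
The lazy behaviour described here can be observed with mincore(). The following is a sketch under assumptions, not DPDK code: it uses an anonymous mmap() as a stand-in for a large malloc() (glibc typically serves large requests that way), and the helper names resident_pages() and lazy_paging_demo() are made up for illustration.

```c
#define _DEFAULT_SOURCE /* for mincore() and MAP_ANONYMOUS under strict -std= */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count pages of [addr, addr + len) that mincore() reports resident; -1 on error. */
long resident_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t n_pages = (len + page - 1) / page;
    unsigned char *vec = malloc(n_pages);
    long count = 0;

    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < n_pages; i++)
        count += vec[i] & 1; /* bit 0 set means the page is resident */
    free(vec);
    return count;
}

/*
 * Map total_pages anonymous pages, write to the first touched_pages of them,
 * and report residency before and after the writes. Returns 0 on success.
 */
int lazy_paging_demo(size_t total_pages, size_t touched_pages,
                     long *resident_before, long *resident_after)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = total_pages * page;
    char *buf;

    assert(touched_pages <= total_pages);

    buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    *resident_before = resident_pages(buf, len);
    for (size_t i = 0; i < touched_pages; i++)
        buf[i * page] = 1; /* the first write to a page faults it in */
    *resident_after = resident_pages(buf, len);

    munmap(buf, len);
    return (*resident_before < 0 || *resident_after < 0) ? -1 : 0;
}
```

For example, lazy_paging_demo(1024, 8, &before, &after) should report at least 8 resident pages afterwards, with the untouched remainder normally staying non-resident. Transparent huge pages can make the kernel bring in more than was touched.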

The lcore_states were allocated through rte_calloc() and thus used some space 
in the already allocated hugepages, so they didn't add more pages to the 
footprint. But they do when allocated and initialized as lcore variables.
The first lcore variable allocated/initialized uses RTE_MAX_LCORE (128) pages 
of 4 KB each = 512 KB total. It seems unlikely that adding 512 KB increases the 
footprint by 30 %.


mlockall() brings in all currently-untouched malloc()ed pages, growing the set of resident pages.
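
As a rough worked calculation, assuming the lcore variable buffer is sized as RTE_MAX_LCORE times MAX_LCORE_VAR bytes (an assumption, not something stated in this thread), a fully locked buffer would be 128 MiB at MAX_LCORE_VAR = 1048576 and 512 KiB at 4096. The difference, about 127.5 MiB, is in the same range as the 131 MB gap between the 1048576 and 4096 rows in the table above (542 vs 411, and 1576 vs 1445). The function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold max_lcore_var bytes for each of n_lcores lcores.
 * The product formula is an assumption made for this sketch. */
uint64_t lcore_buffer_bytes(uint64_t n_lcores, uint64_t max_lcore_var)
{
    assert(n_lcores > 0 && max_lcore_var > 0);
    return n_lcores * max_lcore_var;
}

/* Convert bytes to KiB (integer division). */
uint64_t bytes_to_kib(uint64_t bytes)
{
    return bytes / 1024u;
}
```

With RTE_MAX_LCORE = 128 as quoted above, bytes_to_kib(lcore_buffer_bytes(128, 4096)) gives 512 and bytes_to_kib(lcore_buffer_bytes(128, 1048576)) gives 131072, i.e. 128 MiB.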


The numbers are less drastic, obviously, for many real-world programs,
which have large packet pools and other memory hogs.

Agree.
However, it would be good to understand why switching to lcore variables has 
this effect on the footprint when using mlockall() like OVS.

