On Mon, 11 Nov 2024 08:22:46 +0100 Mattias Rönnblom <hof...@lysator.liu.se> wrote:
> On 2024-11-09 00:52, Morten Brørup wrote:
> >> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> >> Sent: Friday, 8 November 2024 23.23
> >>
> >> On 2024-11-08 20:53, Morten Brørup wrote:
> >>>> From: Morten Brørup [mailto:m...@smartsharesystems.com]
> >>>> Sent: Friday, 8 November 2024 19.35
> >>>>
> >>>>> From: David Marchand [mailto:david.march...@redhat.com]
> >>>>> Sent: Friday, 8 November 2024 19.18
> >>>>>
> >>>>> OVS locks all pages to avoid page faults while processing packets.
> >>>
> >>> It sounds smart, so I just took a look at how it does this. I'm not
> >>> sure, but it seems like it only locks pages that are actually mapped
> >>> (current and future).
> >>>
> >>
> >> mlockall(MCL_CURRENT) will bring in the whole BSS, it seems. Plus all
> >> the rest, like unused parts of the execution stacks, the data section,
> >> and unused code (text) in the binary and all the libraries it has
> >> linked to.
> >>
> >> It makes a simple (e.g., a unit test) DPDK 24.07 program use ~33x more
> >> resident memory. After lcore variables, the same MCL_CURRENT-ed
> >> program is ~30% larger than before. So, a relatively modest increase.
> >
> > Thank you for testing this, Mattias.
> > What are the absolute numbers, i.e. in KB, to get an idea of the
> > numbers I should be looking for?
> >
>
> Hello world type program with static linking. Default DPDK config. x86_64.
>
> DPDK version  MAX_LCORE_VAR  EAL params         mlock  RSS [MB]
> 22.11         -              --no-huge -m 1000  no     22
> 24.11         1048576        --no-huge -m 1000  no     22
> 24.11         1048576        --no-huge -m 1000  yes    1576
> 24.11         4096           --no-huge -m 1000  yes    1445
> 22.11         -              -                  yes    333*
> 24.11         1048576        -                  yes    542*
> 24.11         4096           -                  yes    411*
>
> * Excluding huge pages
>
> If you are more selective about which libraries you bring in, the
> footprint will be lower. How large a fraction is effectively
> unavoidable, I don't know. The relative increase will depend on how
> much memory the application uses, obviously.
> The hello world app doesn't have any app-level state.
>
> > I wonder why the footprint grows at all... Intuitively, the same
> > variables should consume approximately the same amount of RAM,
> > regardless of how they are allocated.
>
> Speculating...
>
> lcore variables use malloc(), which in turn does not bring in memory
> pages unless they are needed. Much of the lcore buffer will be unused,
> and not resident. I covered this, including some example calculations
> of the space savings, in an earlier thread. It may be in the
> programmer's guide as well; I don't remember.

I suspect that glibc malloc assumes a virtual-memory-backed model. It is
lazy about reclaiming memory and grows in large chunks. This is one of
the reasons malloc() is faster than rte_malloc() for allocation.

https://sourceware.org/glibc/wiki/MallocInternals