Re: [vpp-dev] VPP Memory usage

Andrew Yourtchenko Mon, 20 Aug 2018 13:25:22 -0700

Dear Rubina,

On 8/20/18, Rubina Bianchi <r_bian...@outlook.com> wrote:
> Hi dear Andrew
>
> What we talked about before was the "Worker Thread Deadlock".


We had that discussion in March or May. :-)

The one I had in mind was another thread, starting with your mail on
January 30. I forwarded it to you unicast :-)

>
> I tried to test the scenario as you explained and started with 1M entries,
> and after that I doubled it at each run.
> When I tested with the 4M entry size, I logged two things:
> 1. ps aux | grep vpp
> 2. First 5 lines of "vppctl show acl-plugin session"
>
> At first, I ran VPP and configured it with the script that I attached to
> my previous email.
> After that I ran my logger script.
> Finally I ran Trex with this command: ./t-rex-64 --cfg cfg/trex_config.yaml
> -f cap2/sfr.yaml -m 50 -c 3 -d 10000 -p
> After tracing the VPP logs I found some signs of a leak. In the logs,
> the RSS of VPP (6th column in the ps aux output) increases continuously
> (sometimes more and sometimes less), while at the same time the Trex
> Total-Rx decreases.
> After about 3000 seconds, I stopped Trex and waited until the session table
> was cleared, but the RSS did not change.
> Then I ran Trex again without any change, and again I saw the RSS increase
> while the Trex Total-Rx decreased.

Based on the counters, in this test we are continuously churning
through the half-open sessions, because we are hitting the maximum
session limit. Session creation is quite expensive (at least at this
point; I have not optimized that code much yet).

>
> This is my RAM status when vpp is stopped:
> root@debian-hp:~# free -m
>              total       used       free     shared    buffers     cached
> Mem:        129135       3414     125721         12         99        591
> -/+ buffers/cache:       2723     126412
> Swap:         2518          0       2518
>
> I also attached my logs to this email. These logs are gathered every 20
> seconds.
>
> With the 40M entry size I saw this behavior too, but it happens much faster
> than with the 4M entry size.

Yes, because you create more sessions and use more buckets, I think
(though this is speculation at this point, since we don't have the
memory outputs).
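
As a rough sanity check on the numbers from your init.conf, here is a small
Python sketch of the arithmetic. It only computes the heap budget per session
and the average bucket load if the table were completely full; it does not
model the actual bihash overhead, which depends on how entries are distributed
over the buckets. The function names are made up for illustration.

```python
# Values copied from the init.conf quoted in this thread.
MAX_ENTRIES = 40_000_000        # session table max-entries
HASH_BUCKETS = 1_000_000        # session table hash-table-buckets
HASH_MEMORY = 17_179_869_184    # session table hash-table-memory, bytes (16 GiB)


def budget_per_entry(memory_bytes, max_entries):
    """Bytes of hash heap available per session if the table is full."""
    return memory_bytes / max_entries


def avg_entries_per_bucket(max_entries, buckets):
    """Average number of sessions per bucket if the table is full."""
    return max_entries / buckets


print(f"budget per entry: {budget_per_entry(HASH_MEMORY, MAX_ENTRIES):.1f} bytes")
print(f"avg entries per bucket: {avg_entries_per_bucket(MAX_ENTRIES, HASH_BUCKETS):.1f}")
```

So at 40M sessions there are roughly 429 bytes of heap per session and about
40 sessions per bucket on average. Whether that is enough is exactly what the
memory outputs should tell us.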

What is the maximum number of simultaneous sessions on the T-rex and
what is the connections-per-second rate?

> I also have a question about your phrase "Using this method you can
> arrive to the number of maximum connections that your memory configuration
> can support".
> Is there any formula to configure init.conf in an efficient way? Because VPP
> didn't return any error about misconfiguration.

No, there is no formula, unfortunately - hence I cannot print an
error about a misconfiguration.
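
In place of a formula, the doubling procedure from my previous mail can be
written down as a short Python sketch. It only generates the sequence of
max-entries values and the matching CLI line (the same command as in your
init.conf); the 1M start and 40M ceiling are the values discussed in this
thread, and the function names are hypothetical.

```python
def max_entries_schedule(start=1_000_000, limit=40_000_000):
    """Yield max-entries values, doubling each run, without exceeding limit."""
    n = start
    while n <= limit:
        yield n
        n *= 2


def cli_line(max_entries):
    """CLI line that sets the session table size for one test run."""
    return f"set acl-plugin session table max-entries {max_entries}"


for n in max_entries_schedule():
    print(cli_line(n))
```

For each value, rerun the same traffic, wait for the memory usage to
stabilize, note it, then stop the traffic and check that it is released
before moving on to the next value.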

You can use "show acl-plugin memory" as I described in the other mail, to
see what the memory usage in the session bihash is and what the
number of active elements is - could you have a look at doing that?
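
If it helps, the logger could collect these outputs next to the RSS value you
already record. Below is a minimal Python sketch, assuming vppctl is in the
PATH and accepts the command as arguments (as in your "vppctl show acl-plugin
session" call). The helper names and the sample structure are made up for
illustration; the 20 second interval matches your current logs.

```python
import subprocess
import time


def rss_kib(ps_aux_line):
    """Return the RSS column (6th field of a `ps aux` line) as an integer."""
    return int(ps_aux_line.split()[5])


def run(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True).stdout


def log_samples(count, interval=20, runner=run, sleep=time.sleep):
    """Collect `count` samples of the ACL plugin memory and session outputs."""
    samples = []
    for i in range(count):
        samples.append({
            "memory": runner(["vppctl", "show", "acl-plugin", "memory"]),
            "sessions": runner(["vppctl", "show", "acl-plugin", "sessions"]),
        })
        if i + 1 < count:
            sleep(interval)  # wait before taking the next sample
    return samples
```

Calling log_samples(180) would cover about an hour at the default interval.
The runner and sleep parameters are there only so the loop can be tried
without a running VPP.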

--a

>
> Thanks,
> Sincerely
>
>
>
>
> ________________________________
> From: Andrew 👽 Yourtchenko <ayour...@gmail.com>
> Sent: Sunday, August 19, 2018 8:28 AM
> To: Rubina Bianchi
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] VPP Memory usage
>
> Dear Rubina,
>
> The ACL plugin does all the necessary allocations at startup for all data
> structures except the connection bihash.
>
> You would need to check the current number of the connections as your test
> progresses. I believe we had a communication a while ago regarding the
> gradual growth of background memory usage within the bihash data structure
> as you churn through random addresses. Since then there were some changes
> aimed to address this. Please verify what the current total session
> count looks like in “show acl-plugin sessions” as your test progresses -
> based on what you described I think it continuously increases.
>
> If the bihash memory requirement for active connections goes above what
> is available from the OS, then there is no feedback to the user code (acl
> plugin) other than a full crash.
>
> The only safeguard I could come up with against this situation is the
> maximum connection count, which is checked before attempting to insert an
> entry into the bihash.
>
> Your current value is 40 million which is quite a lot, while the hash table
> heap size is 17 gigabytes. This might not be enough to hold all the 40
> million entries as the churn progresses and you need to create more
> buckets.
>
> I suggest you keep all the other parameters as they are and start with the
> value of maximum connections of 1 million and rerun the test, and monitor
> the memory usage within the ACL plugin heap (“show acl-plugin memory”) - it
> should stabilize over time at some value and there should be no crash. The
> exact usage will depend on the distribution of session entries over buckets
> (note that in the worst case you may have one entry per bucket which may
> give a lot of overhead). Note that value.
>
> If you stop the traffic, as the session count goes down to zero, the memory
> should get released.
>
> Then double the max conn count and recheck the behavior same as above - the
> usage probably would be about double of the previous one.
>
> Using this method you can arrive to the number of maximum connections that
> your memory configuration can support, and get a gauge of how much memory
> you would need for the target amount of connections.
>
> If in the initial iteration test you observe the memory usage never
> stabilizing or if you see that the memory is not being released as the
> connection count goes down to zero, then it would be a bug, which we will
> need to further troubleshoot - though from your description so far it seems
> more a case of tuning the parameters. So please apply the method above and
> let me know how it goes! Thanks!
>
> --a
>
> On 19 Aug 2018, at 07:26, Rubina Bianchi
> <r_bian...@outlook.com> wrote:
>
>
> Hi dear VPP
>
>
> I configured vpp stable/1807 and added permit+reflect acl on input and
> output of my network interfaces. I configured vpp with 9 cpu (1 main and 8
> worker cpu). My init.conf is:
>
>
> vppctl>
>
> set acl-plugin session table max-entries 40000000
> set acl-plugin session table hash-table-buckets 1000000
> set acl-plugin session table hash-table-memory 17179869184
> set acl-plugin session timeout udp idle 20
> set acl-plugin session timeout tcp idle 120
> set acl-plugin session timeout tcp transient 30
>
>
> vpp_api_test>
>
> acl_add_replace permit
> acl_add_replace permit+reflect
>
> acl_interface_add_del TenGigabitEthernet3/0/0 add output acl 1
> acl_interface_add_del TenGigabitEthernet3/0/1 add output acl 1
> acl_interface_add_del TenGigabitEthernet3/0/0 add input acl 1
> acl_interface_add_del TenGigabitEthernet3/0/1 add input acl 1
>
> exec set interface l2 bridge TenGigabitEthernet3/0/0 1
> exec set interface l2 bridge TenGigabitEthernet3/0/1 1
> exec set int state TenGigabitEthernet3/0/0 up
> exec set int state TenGigabitEthernet3/0/1 up
>
> My startup.conf is pasted in this link:
> https://paste.ubuntu.com/p/MhQDyqF6Xd/
>
>
> I used Trex as traffic generator as following:
>
> ./t-rex-64 --cfg cfg/trex_config.yaml  -f cap2/sfr.yaml -m 50 -c 3 -d 3600
> -p
>
>
> During execution of my test, Total-rx continuously decreased and after a
> while, it reached 0. I checked the vpp status and it had received a SIGKILL
> signal from the OS.
>
> I monitored vpp memory and it was increasing until it crashed.
>
> Does acl_plugin session management have any memory leak problem?
>
>
> Regards,
>
> Rubina
>

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10224): https://lists.fd.io/g/vpp-dev/message/10224
Mute This Topic: https://lists.fd.io/mt/24729023/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-
