I'll leave that to the NAT folks to comment… They have multiple tables, and there are two per thread…
— Damjan

> On 26.11.2020., at 20:27, Marcos - Mgiga <mar...@mgiga.com.br> wrote:
>
> Of course.
>
> Since I intend to implement VPP as a deterministic CGN gateway, I have
> some parameters regarding NAT config, for example: translation hash
> buckets, translation hash memory, user hash buckets and user hash
> memory, to be configured in startup.conf.
>
> In this context I would like to know how to give the right values to
> those parameters.
>
> Thanks
>
> Marcos
>
> -----Original Message-----
> From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On behalf of Damjan
> Marion via lists.fd.io
> Sent: Thursday, 26 November 2020 16:17
> To: Marcos - Mgiga <mar...@mgiga.com.br>
> Cc: Elias Rudberg <elias.rudb...@bahnhof.net>; vpp-dev@lists.fd.io
> Subject: Re: RES: [vpp-dev] NAT memory usage problem for VPP 20.09
> compared to 20.05 due to larger translation_buckets value
>
> Sorry, I don’t understand your question. Can you elaborate further?
>
> --
> Damjan
>
>> On 26.11.2020., at 20:05, Marcos - Mgiga <mar...@mgiga.com.br> wrote:
>>
>> Hello,
>>
>> Taking advantage of the topic: how do you suggest monitoring whether
>> the translation hash buckets value is appropriate? What about
>> translation hash memory, user hash buckets and user hash memory?
>>
>> How do I know whether to increase or decrease those values?
>>
>> Best Regards
>>
>> Marcos
>>
>> -----Original Message-----
>> From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On behalf of Damjan
>> Marion via lists.fd.io
>> Sent: Thursday, 26 November 2020 14:53
>> To: Elias Rudberg <elias.rudb...@bahnhof.net>
>> Cc: vpp-dev@lists.fd.io
>> Subject: Re: [vpp-dev] NAT memory usage problem for VPP 20.09
>> compared to 20.05 due to larger translation_buckets value
>>
>> Dear Elias,
>>
>> Let me try to explain the underlying mechanics a bit.
>> Let’s assume your target number of sessions is 10M and we are talking
>> about a 16-byte key size. That means each hash entry (KV) is 24 bytes
>> (16 bytes key and 8 bytes value).
>>
>> In the setup you mentioned, with 1<<20 buckets, you will need to fit
>> 10 KVs into each bucket. The initial bihash bucket holds 4 KVs, so to
>> accommodate 10 KVs (assuming the hash function gives us an equal
>> distribution) you will need to grow each bucket 2 times. Growing
>> means doubling the bucket size. So at the end you will have 1<<20
>> buckets where each holds 16 KVs.
>>
>> The math is:
>> 1<<20 * (16 * 24 /* KV size in bytes */ + 8 /* bucket header size */)
>> which means 392 MB of memory.
>>
>> If you keep the target number of 10M sessions but increase the number
>> of buckets to 1<<22 (which is roughly what the formula below is
>> trying to do), you end up with the following math:
>>
>> 1<<22 * (4 * 24 /* KV size in bytes */ + 8 /* bucket header size */)
>> which means 416 MB of memory.
>>
>> So why is the 2nd one better? Several reasons:
>>
>> - in the first case you need to grow each bucket twice, which means
>> allocating memory for the new bucket, copying the existing data from
>> the old bucket and putting the old bucket on the free list. This
>> operation increases key insertion time and lowers performance
>>
>> - growing will likely result in a significant number of old buckets
>> sitting in the free list, and they are effectively wasted memory
>> (bihash tries to reuse that memory, but at some point there is no
>> demand anymore for the smaller buckets)
>>
>> - performance-wise, the original bucket (the one holding the first 4
>> KVs) is collocated with the bucket header. This is new behaviour Dave
>> introduced earlier this year (and I think it is present in 20.09). A
>> bucket collocated with its header means that no dependent prefetch is
>> needed, as both the header and at least part of the data sit in the
>> same cache line. This significantly improves lookup performance.
>>
>> So in general, for best performance and optimal memory usage, the
>> number of buckets should be big enough that it is unlikely to grow
>> with your target number of KVs. A rule of thumb would be rounding the
>> target number of entries to the closer power-of-2 value and then
>> dividing that number by 2. For example, for 10M entries the first
>> lower power-of-2 number is 1<<23 (8M) and the first higher is 1<<24
>> (16M). 1<<23 is closer; when we divide that by 2 we get 1<<22 (4M)
>> buckets.
>>
>> Hope this explains….
>>
>> —
>> Damjan
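The bucket-memory arithmetic above can be double-checked with a small
standalone C program. This is an illustrative sketch only: the 24-byte
KV, the 8-byte bucket header and the 4-KV initial bucket size are taken
from the explanation in the email, not read out of the bihash source.

#include <stdio.h>
#include <stdint.h>

int
main (void)
{
  const uint64_t kv_size = 24;   /* 16-byte key + 8-byte value */
  const uint64_t hdr_size = 8;   /* bucket header, per the text above */

  /* Case 1: 1<<20 buckets, each grown twice (4 -> 8 -> 16 KVs)
     so that ~10 KVs per bucket fit for 10M sessions. */
  uint64_t mem_grown = (1ULL << 20) * (16 * kv_size + hdr_size);

  /* Case 2: 1<<22 buckets, no growth needed (4 KVs each). */
  uint64_t mem_flat = (1ULL << 22) * (4 * kv_size + hdr_size);

  /* Prints 392 and 416 (MB), matching the figures in the email. */
  printf ("1<<20 buckets, grown twice: %llu MB\n",
          (unsigned long long) (mem_grown >> 20));
  printf ("1<<22 buckets, ungrown:     %llu MB\n",
          (unsigned long long) (mem_flat >> 20));
  return 0;
}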
>>> On 26.11.2020., at 17:54, Elias Rudberg <elias.rudb...@bahnhof.net> wrote:
>>>
>>> Hello VPP experts,
>>>
>>> We are using VPP for NAT44 and are currently looking at how to move
>>> from VPP 20.05 to 20.09. There are some differences in the way the
>>> NAT plugin is configured.
>>>
>>> One difficulty for us is the maximum number of sessions allowed; we
>>> need to handle large numbers of sessions, so that limit can be
>>> important for us. For VPP 20.05 we used "translation hash buckets
>>> 1048576", and then the maximum number of sessions per thread becomes
>>> 10 times that because of this line in the source code in
>>> snat_config():
>>>
>>> sm->max_translations = 10 * translation_buckets;
>>>
>>> So we got a limit of about 10 million sessions per thread, which we
>>> have been happy with so far.
>>>
>>> With VPP 20.09, however, things have changed so that the maximum
>>> number of sessions is now configured explicitly, and the
>>> relationship between max_translations_per_thread and
>>> translation_buckets is no longer a factor of 10 but is instead given
>>> by the nat_calc_bihash_buckets() function:
>>>
>>> static u32
>>> nat_calc_bihash_buckets (u32 n_elts)
>>> {
>>>   return 1 << (max_log2 (n_elts >> 1) + 1);
>>> }
>>>
>>> The above function corresponds to a factor of somewhere between 1
>>> and 2 instead of 10. So, if I understood this correctly, for a given
>>> maximum number of sessions, the corresponding translation_buckets
>>> value will be something like 5 to 10 times larger in VPP 20.09
>>> compared to how it was in VPP 20.05, leading to a significantly
>>> increased memory requirement given that we want to have the same
>>> maximum number of sessions as before.
>>>
>>> It seems a little strange that the translation_buckets value would
>>> change so much between VPP versions; was that change intentional?
>>> The old relationship "max_translations = 10 * translation_buckets"
>>> seems to have worked well in practice, at least for our use case.
>>>
>>> What could we do to get around this, if we want to switch to VPP
>>> 20.09 but without reducing the maximum number of sessions? If we
>>> were to simply divide the nat_calc_bihash_buckets() value by 8 or so
>>> to make it more similar to how it was earlier, would that lead to
>>> other problems?
>>>
>>> Best regards,
>>> Elias
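As a further illustration, here is a minimal standalone sketch
comparing the bucket count produced by the quoted
nat_calc_bihash_buckets() with the rule of thumb from Damjan's reply,
for a target of 10M sessions. The max_log2() below is a local stand-in
for VPP's vppinfra helper, assumed to round up to the next power-of-two
exponent, and rule_of_thumb_buckets() is a hypothetical name for the
heuristic described above.

#include <stdio.h>
#include <stdint.h>

typedef uint32_t u32;

/* Stand-in for VPP's max_log2(): assumed to return the smallest l
   such that (1 << l) >= x. */
static u32
max_log2 (u32 x)
{
  u32 l = 0;
  while (((u32) 1 << l) < x)
    l++;
  return l;
}

/* As quoted from the VPP 20.09 NAT plugin in the email above. */
static u32
nat_calc_bihash_buckets (u32 n_elts)
{
  return 1 << (max_log2 (n_elts >> 1) + 1);
}

/* Hypothetical helper implementing the rule of thumb: round n_elts
   to the closer power of 2, then divide by 2. */
static u32
rule_of_thumb_buckets (u32 n_elts)
{
  u32 hi = max_log2 (n_elts);   /* exponent with 2^hi >= n_elts */
  u32 lo = hi - 1;              /* candidate exponent below */
  u32 closer = (n_elts - ((u32) 1 << lo)) <= (((u32) 1 << hi) - n_elts)
    ? lo : hi;
  return 1 << (closer - 1);     /* divide the closer power of 2 by 2 */
}

int
main (void)
{
  u32 n = 10000000;   /* 10M target sessions */

  /* Prints 16777216 (1<<24) under the max_log2() assumption above. */
  printf ("nat_calc_bihash_buckets: %u\n",
          (unsigned) nat_calc_bihash_buckets (n));

  /* Prints 4194304 (1<<22), matching the worked example above. */
  printf ("rule-of-thumb buckets:   %u\n",
          (unsigned) rule_of_thumb_buckets (n));
  return 0;
}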