[vpp-dev] [NAT] Assign same external IP

2019-02-03 Thread JB
Hello,

 

Breaking this out into its own thread.

 

Currently, when creating a new dynamic NAT session, the source IP and source 
port are considered. If I've understood this correctly, the next time the user 
(source IP) sends traffic matching the previous traffic (source IP + source 
port), the same external IP should be assigned.

 

However, I'd want to make sure the source IP always gets the same external IP, 
regardless of port used.

 

I'm looking for suggestions and ideas as to how this can be achieved. Is the 
4-tuple hash key the only deciding factor here?

Thanks!


Re: [vpp-dev] [NAT] Assign same external IP

2019-02-03 Thread Ole Troan
Hi there,

> Breaking this out into its own thread.
> 
>  
> Currently when creating a new dynamic NAT session, the source IP and source 
> port are considered. If I've understood this right, the next time the user 
> (source IP) sends traffic matching the previous traffic (source IP + source 
> port), the same external IP should be assigned.
> 
>  
> However, I'd want to make sure the source IP always gets the same external 
> IP, regardless of port used.
> 
>  
> I'm looking for suggestions and ideas as to how this can be achieved. Is the 
> 4-tuple hash key the only deciding factor here?

The NAT implementation's address and port allocation algorithm is pluggable; you can add your own. Check out nat.h:
  /* Address and port allocation function */
  nat_alloc_out_addr_and_port_function_t *alloc_addr_and_port;
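
For illustration only, a custom allocator keyed purely on the inside source address could look roughly like the sketch below. The argument list, types and field names are simplified assumptions; the real callback prototype is whatever nat_alloc_out_addr_and_port_function_t declares in nat.h.

  /* Sketch: choose the external address from a hash of the inside source
   * IP, so every session from the same user maps to the same external IP
   * regardless of source port. Simplified types, not the exact plugin API. */
  #include <stdint.h>

  typedef struct { uint32_t as_u32; } ip4_address_t;      /* placeholder */
  typedef struct { ip4_address_t addr; } ext_address_t;   /* placeholder pool entry */

  static int
  my_alloc_out_addr_and_port (ext_address_t *pool, uint32_t n_pool_addrs,
                              ip4_address_t in_addr, uint16_t in_port,
                              ip4_address_t *out_addr, uint16_t *out_port)
  {
    if (n_pool_addrs == 0)
      return -1;                            /* no external addresses configured */

    /* deterministic choice: same inside IP -> same external IP, port independent */
    uint32_t idx = in_addr.as_u32 % n_pool_addrs;
    *out_addr = pool[idx].addr;

    /* keep whatever per-session port allocation the plugin normally does;
     * reusing the inside port here is just a stand-in */
    *out_port = in_port;
    return 0;
  }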

Best regards,
Ole



Re: [vpp-dev] RFC: buffer manager rework

2019-02-03 Thread Damjan Marion via Lists.Fd.Io


> On 3 Feb 2019, at 16:58, Nitin Saxena  wrote:
> 
> Hi Damjan,
> 
> I have few queries regarding this patch.
> 
>  - DPDK mempools are not used anymore, we register custom mempool ops, and 
> dpdk is taking buffers from VPP
> Some of the targets uses hardware memory allocator like OCTEONTx family and 
> NXP's dpaa. Those hardware allocators are exposed as dpdk mempools.

Which exact operation do they accelerate?

> Now with this change I can see rte_mempool_populate_iova() is not anymore 
> called.

Yes, but the new code does pretty much the same thing: it populates both elt_list 
and mem_list. The new code also puts the IOVA into mempool_objhdr.

> So what is your suggestion to support such hardware.

Before I can provide any suggestion, I need to understand better what those 
hardware buffer managers do and why they are better than the pure software 
solution we have today.

>  
> 
>  - first 64-bytes of metadata are initialised on free, so buffer alloc is 
> very fast
> Is it fair to say if a mempool is created per worker core per sw_index 
> (interface) then buffer template copy can be avoided even during free (It can 
> be done only once at init time)

The really expensive part of the buffer free operation is bringing the cacheline into 
L1, and we need to do that anyway to verify the reference count of the packet.
Once the data is in L1, simply copying the template does not cost much: 1-2 clocks on 
x86; I'm not sure about Arm, but I still expect it to come out as 4 128-bit stores.
That was the rationale for resetting the metadata during buffer free.

So, to answer your question, having a buffer pool per sw-interface will likely improve 
performance a bit, but it will also cause sub-optimal use of buffer memory.
Such a solution will also have problems scaling, for example if you have 
hundreds of virtual interfaces...
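
To make the reset-on-free idea concrete, here is a minimal sketch: the free path touches the buffer's first metadata cacheline anyway for the reference-count check, and while that line is hot it overwrites it with a pre-built 64-byte template. The names, the separate ref_count argument and the template layout are simplifications for illustration, not the exact vlib code.

  /* Sketch of the "reset metadata on free" idea described above.
   * buffer_template is a pre-initialised 64-byte block; copying it while
   * the cacheline is already hot in L1 costs only a couple of clocks. */
  #include <stdint.h>
  #include <string.h>

  #define METADATA_RESET_BYTES 64

  static inline void
  buffer_free_one (uint8_t *buffer_metadata, int16_t *ref_count,
                   const uint8_t buffer_template[METADATA_RESET_BYTES])
  {
    /* bringing this cacheline into L1 is the expensive part */
    if (--(*ref_count) > 0)
      return;                     /* buffer still referenced elsewhere */

    /* the line is hot now, so the 64-byte template copy is cheap
     * (a handful of 128/256-bit stores) */
    memcpy (buffer_metadata, buffer_template, METADATA_RESET_BYTES);
  }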


> 
> Thanks,
> Nitin
> 
> From: vpp-dev@lists.fd.io on behalf of Damjan Marion via Lists.Fd.Io 
> <dmarion=me@lists.fd.io>
> Sent: Friday, January 25, 2019 10:38 PM
> To: vpp-dev
> Cc: vpp-dev@lists.fd.io 
> Subject: [vpp-dev] RFC: buffer manager rework
>  
> External Email
> 
> I am very close to the finish line with buffer management rework patch, and 
> would like to
> ask people to take a look before it is merged.
> 
> https://gerrit.fd.io/r/16638 
> 
> It significantly improves performance of buffer alloc free and introduces 
> numa awareness.
> On my skylake platinum 8180 system, with native AVF driver observed 
> performance improvement is:
> 
> - single core, 2 threads, ipv4 base forwarding test, CPU running at 2.5GHz 
> (TB off):
> 
> old code - dpdk buffer manager: 20.4 Mpps
> old code - old native buffer manager: 19.4 Mpps
> new code: 24.9 Mpps
> 
> With DPDK drivers performance stays same as DPDK is maintaining own internal 
> buffer cache.
> So major perf gain should be observed in native code like: vhost-user, memif, 
> AVF, host stack.
> 
> user facing changes:
> to change number of buffers:
>   old startup.conf:
> dpdk { num-mbufs  }
>   new startup.conf:
> buffers { buffers-per-numa }
> 
> Internal changes:
>  - free lists are deprecated
>  - buffer metadata is always initialised.
>  - first 64-bytes of metadata are initialised on free, so buffer alloc is 
> very fast
>  - DPDK mempools are not used anymore, we register custom mempool ops, and 
> dpdk is taking buffers from VPP
>  - to support such operation plugin can request external header space - in 
> case of DPDK it stores rte_mbuf + rte_mempool_objhdr
> 
> I'm still running some tests so possible minor changes are possible, but 
> nothing major expected.
> 
> --
> Damjan
> 

-- 
Damjan



Re: [vpp-dev] RFC: buffer manager rework

2019-02-03 Thread Nitin Saxena
Hi Damjan,


I have few queries regarding this patch.


 - DPDK mempools are not used anymore, we register custom mempool ops, and dpdk 
is taking buffers from VPP

Some targets use a hardware memory allocator, like the OCTEONTx family and 
NXP's dpaa. Those hardware allocators are exposed as DPDK mempools. Now, with 
this change, I can see that rte_mempool_populate_iova() is no longer called. So 
what is your suggestion to support such hardware?

 - first 64-bytes of metadata are initialised on free, so buffer alloc is very 
fast
Is it fair to say that if a mempool is created per worker core per sw_index 
(interface), then the buffer template copy can be avoided even during free? (It 
could be done only once, at init time.)

Thanks,
Nitin



From: vpp-dev@lists.fd.io  on behalf of Damjan Marion via 
Lists.Fd.Io 
Sent: Friday, January 25, 2019 10:38 PM
To: vpp-dev
Cc: vpp-dev@lists.fd.io
Subject: [vpp-dev] RFC: buffer manager rework

External Email

I am very close to the finish line with the buffer management rework patch, and 
would like to ask people to take a look before it is merged.

https://gerrit.fd.io/r/16638

It significantly improves the performance of buffer alloc/free and introduces NUMA 
awareness.
On my Skylake Platinum 8180 system, with the native AVF driver, the observed 
performance improvement is:

- single core, 2 threads, ipv4 base forwarding test, CPU running at 2.5GHz (TB 
off):

old code - dpdk buffer manager: 20.4 Mpps
old code - old native buffer manager: 19.4 Mpps
new code: 24.9 Mpps

With DPDK drivers, performance stays the same, as DPDK maintains its own internal 
buffer cache.
So the major perf gain should be observed in native code like vhost-user, memif, 
AVF, and the host stack.

user facing changes:
to change number of buffers:
  old startup.conf:
dpdk { num-mbufs  }
  new startup.conf:
buffers { buffers-per-numa }

Internal changes:
 - free lists are deprecated
 - buffer metadata is always initialised.
 - first 64-bytes of metadata are initialised on free, so buffer alloc is very 
fast
 - DPDK mempools are not used anymore, we register custom mempool ops, and dpdk 
is taking buffers from VPP
 - to support such operation, a plugin can request external header space; in the case 
of DPDK it stores rte_mbuf + rte_mempool_objhdr

I'm still running some tests, so minor changes are still possible, but nothing 
major is expected.

--
Damjan



Re: [vpp-dev] RFC: buffer manager rework

2019-02-03 Thread Nitin Saxena
Hi Damjan,

Which exact operation do they accelerate?
There are many; the basic features are:
- They accelerate buffer free and alloc: a single instruction is required for each 
operation.
- The free list is maintained by hardware, not software.

Furthermore, other co-processors depend on buffers being managed by hardware 
instead of software, so support for hardware mempools must be added to VPP. A 
software mempool will not work with the other packet engines.

Thanks,
Nitin

On 03-Feb-2019, at 10:34 PM, Damjan Marion via Lists.Fd.Io 
<dmarion=me@lists.fd.io> wrote:


External Email


On 3 Feb 2019, at 16:58, Nitin Saxena <nsax...@marvell.com> wrote:

Hi Damjan,

I have few queries regarding this patch.

 - DPDK mempools are not used anymore, we register custom mempool ops, and dpdk 
is taking buffers from VPP
Some of the targets uses hardware memory allocator like OCTEONTx family and 
NXP's dpaa. Those hardware allocators are exposed as dpdk mempools.

Which exact operation do they accelerate?

Now with this change I can see rte_mempool_populate_iova() is not anymore 
called.

Yes, but new code does pretty much the same thing, it populates both elt_list 
and mem_list. Also new code puts IOVA into mempool_objhdr.

So what is your suggestion to support such hardware.

Before I can provide any suggestion I need to understand better what those 
hardware buffer managers do
and why they are better than pure software solution we have today.



 - first 64-bytes of metadata are initialised on free, so buffer alloc is very 
fast
Is it fair to say if a mempool is created per worker core per sw_index 
(interface) then buffer template copy can be avoided even during free (It can 
be done only once at init time)

The really expensive part of buffer free operation is bringing cacheline into 
L1, and we need to do that to verify reference count of the packet.
At the moment when data is in L1, simply copying template will not cost much. 
1-2 clocks on x86, not sure about arm but still i expect that it will result in 
4 128-bit stores.
That was the rationale for resetting the metadata during buffer free.

So to answer your question, having buffer per sw-interface will likely improve 
performance a bit, but it will also cause sub-optimal use of buffer memory.
Such solution will also have problem in scaling, for example if you have 
hundreds of virtual interfaces...



Thanks,
Nitin


From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> on behalf of Damjan Marion 
via Lists.Fd.Io <dmarion=me@lists.fd.io>
Sent: Friday, January 25, 2019 10:38 PM
To: vpp-dev
Cc: vpp-dev@lists.fd.io
Subject: [vpp-dev] RFC: buffer manager rework

External Email

I am very close to the finish line with buffer management rework patch, and 
would like to
ask people to take a look before it is merged.

https://gerrit.fd.io/r/16638

It significantly improves performance of buffer alloc free and introduces numa 
awareness.
On my skylake platinum 8180 system, with native AVF driver observed performance 
improvement is:

- single core, 2 threads, ipv4 base forwarding test, CPU running at 2.5GHz (TB 
off):

old code - dpdk buffer manager: 20.4 Mpps
old code - old native buffer manager: 19.4 Mpps
new code: 24.9 Mpps

With DPDK drivers performance stays same as DPDK is maintaining own internal 
buffer cache.
So major perf gain should be observed in native code like: vhost-user, memif, 
AVF, host stack.

user facing changes:
to change number of buffers:
  old startup.conf:
dpdk { num-mbufs  }
  new startup.conf:
buffers { buffers-per-numa }

Internal changes:
 - free lists are deprecated
 - buffer metadata is always initialised.
 - first 64-bytes of metadata are initialised on free, so buffer alloc is very 
fast
 - DPDK mempools are not used anymore, we register custom mempool ops, and dpdk 
is taking buffers from VPP
 - to support such operation plugin can request external header space - in case 
of DPDK it stores rte_mbuf + rte_mempool_objhdr

I'm still running some tests so possible minor changes are possible, but 
nothing major expected.

--
Damjan


--
Damjan


Re: [vpp-dev] RFC: buffer manager rework

2019-02-03 Thread Damjan Marion via Lists.Fd.Io


> On 3 Feb 2019, at 18:38, Nitin Saxena  wrote:
> 
> Hi Damjan,
> 
>> Which exact operation do they accelerate?
> There are many…basic features are…
> - they accelerate fast buffer free and alloc. Single instruction required for 
> both operations. 

I quickly looked into the DPDK octeontx_fpavf_dequeue() and it looks to me like much 
more than one instruction.

In the case of DPDK, how does that work with the DPDK mempool cache, or are you 
disabling the mempool cache completely?

Does the single-instruction alloc/free include:
 - the reference_count check and decrement?
 - user metadata initialization?

> - Free list is maintained by hardware and not software.  

It sounds to me like it is slower to program the hardware than to simply add a few 
buffer indices to the end of a vector, but I may be wrong...

> 
> Further other co-processors are dependent on buffer being managed by hardware 
> instead of software so it is must to add support of hardware mem-pool in VPP. 
> Software mempool will not work with other packet engines.

But that can also be handled internally by the device driver...

So, if you are able to show with numbers that the current software solution performs 
poorly and you are confident that you can do significantly better, I will be happy 
to work with you on implementing support for a hardware buffer manager.

> 
> Thanks,
> Nitin
> 
>> On 03-Feb-2019, at 10:34 PM, Damjan Marion via Lists.Fd.Io 
>> <dmarion=me@lists.fd.io> wrote:
>> 
>> External Email
>> 
>> 
>> 
>>> On 3 Feb 2019, at 16:58, Nitin Saxena wrote:
>>> 
>>> Hi Damjan,
>>> 
>>> I have few queries regarding this patch.
>>> 
>>>  - DPDK mempools are not used anymore, we register custom mempool ops, and 
>>> dpdk is taking buffers from VPP
>>> Some of the targets uses hardware memory allocator like OCTEONTx family and 
>>> NXP's dpaa. Those hardware allocators are exposed as dpdk mempools.
>> 
>> Which exact operation do they accelerate?
>> 
>>> Now with this change I can see rte_mempool_populate_iova() is not anymore 
>>> called.
>> 
>> Yes, but new code does pretty much the same thing, it populates both 
>> elt_list and mem_list. Also new code puts IOVA into mempool_objhdr.
>> 
>>> So what is your suggestion to support such hardware.
>> 
>> Before I can provide any suggestion I need to understand better what those 
>> hardware buffer managers do
>> and why they are better than pure software solution we have today.
>> 
>>>  
>>> 
>>>  - first 64-bytes of metadata are initialised on free, so buffer alloc is 
>>> very fast
>>> Is it fair to say if a mempool is created per worker core per sw_index 
>>> (interface) then buffer template copy can be avoided even during free (It 
>>> can be done only once at init time)
>> 
>> The really expensive part of buffer free operation is bringing cacheline 
>> into L1, and we need to do that to verify reference count of the packet.
>> At the moment when data is in L1, simply copying template will not cost 
>> much. 1-2 clocks on x86, not sure about arm but still i expect that it will 
>> result in 4 128-bit stores.
>> That was the rationale for resetting the metadata during buffer free.
>> 
>> So to answer your question, having buffer per sw-interface will likely 
>> improve performance a bit, but it will also cause sub-optimal use of buffer 
>> memory.
>> Such solution will also have problem in scaling, for example if you have 
>> hundreds of virtual interfaces...
>> 
>> 
>>> 
>>> Thanks,
>>> Nitin
>>> 
>>> From: vpp-dev@lists.fd.io on behalf of Damjan Marion via Lists.Fd.Io 
>>> <dmarion=me@lists.fd.io>
>>> Sent: Friday, January 25, 2019 10:38 PM
>>> To: vpp-dev
>>> Cc: vpp-dev@lists.fd.io 
>>> Subject: [vpp-dev] RFC: buffer manager rework
>>>  
>>> External Email
>>> 
>>> I am very close to the finish line with buffer management rework patch, and 
>>> would like to
>>> ask people to take a look before it is merged.
>>> 
>>> https://gerrit.fd.io/r/16638 
>>> 
>>> It significantly improves performance of buffer alloc free and introduces 
>>> numa awareness.
>>> On my skylake platinum 8180 system, with native AVF driver observed 
>>> performance improvement is:
>>> 
>>> - single core, 2 threads, ipv4 base forwarding test, CPU running at 2.5GHz 
>>> (TB off):
>>> 
>>> old code - dpdk buffer manager: 20.4 Mpps
>>> old code - old native buffer manager: 19.4 Mpps
>>> new code: 24.9 Mpps
>>> 
>>> With DPDK drivers performance stays same as DPDK is maintaining own 
>>> internal buffer cache.
>>> So major perf gain should be observed in native code like: vhost-user, 
>>> memif, AVF, host stack.
>>> 
>>> user facing changes:
>>> to change number of buffers:
>>>   old startup.conf:
>>> dpdk { num-mbufs  }
>>>   new startup.conf:
>>> buffers { buffers-per-numa }
>>> 
>>> Internal changes:
>>>  - free lists are deprecated
>>>  - buffer metadata is always initialised.

Re: [vpp-dev] RFC: buffer manager rework

2019-02-03 Thread Nitin Saxena
Hi Damjan,

See the function octeontx_fpa_bufpool_alloc(), called by octeontx_fpa_dequeue(). It's 
a single read instruction to get the pointer to the data.
Similarly, octeontx_fpa_bufpool_free() is a single write instruction.
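
Conceptually, and only as a sketch (this is not the actual octeontx_fpavf code, and the register pointers below are hypothetical), a hardware-managed pool reduces alloc and free to a single access to a per-pool device register:

  /* Conceptual sketch of a hardware-managed buffer pool. The FPA-style
   * coprocessor keeps the free list; software just reads or writes a
   * per-pool device register. Register pointers here are hypothetical. */
  #include <stdint.h>

  static inline void *
  hw_pool_alloc (volatile uint64_t *pool_alloc_reg)
  {
    /* one load: hardware pops and returns the next free buffer address
     * (0 when the pool is empty) */
    uint64_t addr = *pool_alloc_reg;
    return (void *) (uintptr_t) addr;
  }

  static inline void
  hw_pool_free (volatile uint64_t *pool_free_reg, void *buf)
  {
    /* one store: hardware pushes the buffer back onto its free list */
    *pool_free_reg = (uint64_t) (uintptr_t) buf;
  }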

So, If you are able to prove with numbers that current software solution is 
low-performant and that you are confident that you can do significantly better, 
I will be happy to work with you on implementing support for hardware buffer 
manager.
First of all, I welcome your patch, as we were also trying to remove the latencies 
seen by the memcpy_x4() of the buffer template. As I said earlier, the hardware 
buffer coprocessor is used by the other packet engines, hence support for it has to 
be added in VPP. I am looking for suggestions on how to resolve this.

Thanks,
Nitin

On 03-Feb-2019, at 11:39 PM, Damjan Marion via Lists.Fd.Io 
<dmarion=me@lists.fd.io> wrote:


External Email


On 3 Feb 2019, at 18:38, Nitin Saxena <nitin.sax...@cavium.com> wrote:

Hi Damjan,

Which exact operation do they accelerate?
There are many…basic features are…
- they accelerate fast buffer free and alloc. Single instruction required for 
both operations.

I quickly looked into DPDK octeontx_fpavf_dequeue() and it looks to me much 
more than one instruction.

In case of DPDK, how that works with DPDK mempool cache or are you disabling 
mempool cache completely?

Does single instruction alloc/free include:
 - reference_count check and decrement?
 - user metadata initialization ?

- Free list is maintained by hardware and not software.

Sounds to me that it is slower to program hardware, than to simply add few 
buffer indices to the end of vector but I may be wrong...


Further other co-processors are dependent on buffer being managed by hardware 
instead of software so it is must to add support of hardware mem-pool in VPP. 
Software mempool will not work with other packet engines.

But that can also be handled internally by device driver...

So, If you are able to prove with numbers that current software solution is 
low-performant and that you are confident that you can do significantly better, 
I will be happy to work with you on implementing support for hardware buffer 
manager.


Thanks,
Nitin

On 03-Feb-2019, at 10:34 PM, Damjan Marion via Lists.Fd.Io 
<dmarion=me@lists.fd.io> wrote:


External Email


On 3 Feb 2019, at 16:58, Nitin Saxena <nsax...@marvell.com> wrote:

Hi Damjan,

I have few queries regarding this patch.

 - DPDK mempools are not used anymore, we register custom mempool ops, and dpdk 
is taking buffers from VPP
Some of the targets uses hardware memory allocator like OCTEONTx family and 
NXP's dpaa. Those hardware allocators are exposed as dpdk mempools.

Which exact operation do they accelerate?

Now with this change I can see rte_mempool_populate_iova() is not anymore 
called.

Yes, but new code does pretty much the same thing, it populates both elt_list 
and mem_list. Also new code puts IOVA into mempool_objhdr.

So what is your suggestion to support such hardware.

Before I can provide any suggestion I need to understand better what those 
hardware buffer managers do
and why they are better than pure software solution we have today.



 - first 64-bytes of metadata are initialised on free, so buffer alloc is very 
fast
Is it fair to say if a mempool is created per worker core per sw_index 
(interface) then buffer template copy can be avoided even during free (It can 
be done only once at init time)

The really expensive part of buffer free operation is bringing cacheline into 
L1, and we need to do that to verify reference count of the packet.
At the moment when data is in L1, simply copying template will not cost much. 
1-2 clocks on x86, not sure about arm but still i expect that it will result in 
4 128-bit stores.
That was the rationale for resetting the metadata during buffer free.

So to answer your question, having buffer per sw-interface will likely improve 
performance a bit, but it will also cause sub-optimal use of buffer memory.
Such solution will also have problem in scaling, for example if you have 
hundreds of virtual interfaces...



Thanks,
Nitin


From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> on behalf of Damjan Marion 
via Lists.Fd.Io <dmarion=me@lists.fd.io>
Sent: Friday, January 25, 2019 10:38 PM
To: vpp-dev
Cc: vpp-dev@lists.fd.io
Subject: [vpp-dev] RFC: buffer manager rework

External Email

I am very close to the finish line with buffer management rework patch, and 
would like to
ask people to take a look before it is merged.

https://gerrit.fd.io/r/16638

It significantly improves performance of buffer alloc free and introduces numa 
awareness.
On my skylake platinum 8180 system, with native AVF driver observed performance 
improvement is:

- single core, 2 threads, ipv4 base forwarding test, CPU running at 2.5GHz (TB 
off):

old code - dpdk buffer manager: 20.4 Mpps

Re: [vpp-dev] RFC: buffer manager rework

2019-02-03 Thread Damjan Marion via Lists.Fd.Io


> On 3 Feb 2019, at 20:13, Saxena, Nitin  wrote:
> 
> Hi Damjan,
> 
> See function octeontx_fpa_bufpool_alloc() called by octeontx_fpa_dequeue(). 
> Its a single read instruction to get the pointer of data.

Yeah, saw that, and today the VPP buffer manager can grab up to 16 buffer indices 
with one instruction, so no big deal here.
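
For comparison, the software path being described is the vectorized vlib buffer API, used roughly like this (a minimal sketch; the surrounding node context and error handling are omitted, and the 16-buffer batch mirrors the figure above):

  /* Minimal sketch of batched buffer alloc/free with the vlib API.
   * One call fills a vector of buffer indices; indices are kept in a plain
   * vector internally, so grabbing a batch is essentially a copy of indices. */
  #include <vlib/vlib.h>

  static void
  alloc_and_free_batch (vlib_main_t *vm)
  {
    u32 buffers[16];
    u32 n = vlib_buffer_alloc (vm, buffers, 16);   /* may return fewer than 16 */

    /* ... fill and send the buffers, or just give them back ... */

    vlib_buffer_free (vm, buffers, n);             /* metadata template reset happens here */
  }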

> Similarly, octeontx_fpa_bufpool_free() is also a single write instruction. 
> 
>> So, If you are able to prove with numbers that current software solution is 
>> low-performant and that you are confident that you can do significantly 
>> better, I will be happy to work with you on implementing support for 
>> hardware buffer manager.
> First of all I welcome your patch as we were also trying to remove latencies 
> seen by memcpy_x4() of buffer template. As I said earlier hardware buffer 
> coprocessor is being used by other packet engines hence the support has to be 
> added in VPP. I am looking for suggestion for its resolution. 

You can hardly get any suggestions from my side if you are ignoring the questions 
I asked in my previous email to get a better understanding of what your hardware 
does.

"It is hardware so it is fast" is not a real argument; we need real data points 
before investing time into this area.

> 
> Thanks,
> Nitin
> 
>> On 03-Feb-2019, at 11:39 PM, Damjan Marion via Lists.Fd.Io 
>> <dmarion=me@lists.fd.io> wrote:
>> 
>> External Email
>> 
>> 
>> 
>>> On 3 Feb 2019, at 18:38, Nitin Saxena wrote:
>>> 
>>> Hi Damjan,
>>> 
 Which exact operation do they accelerate?
>>> There are many…basic features are…
>>> - they accelerate fast buffer free and alloc. Single instruction required 
>>> for both operations. 
>> 
>> I quickly looked into DPDK octeontx_fpavf_dequeue() and it looks to me much 
>> more than one instruction.
>> 
>> In case of DPDK, how that works with DPDK mempool cache or are you disabling 
>> mempool cache completely?
>> 
>> Does single instruction alloc/free include:
>>  - reference_count check and decrement?
>>  - user metadata initialization ?
>> 
>>> - Free list is maintained by hardware and not software.  
>> 
>> Sounds to me that it is slower to program hardware, than to simply add few 
>> buffer indices to the end of vector but I may be wrong...
>> 
>>> 
>>> Further other co-processors are dependent on buffer being managed by 
>>> hardware instead of software so it is must to add support of hardware 
>>> mem-pool in VPP. Software mempool will not work with other packet engines.
>> 
>> But that can also be handled internally by device driver...
>> 
>> So, If you are able to prove with numbers that current software solution is 
>> low-performant and that you are confident that you can do significantly 
>> better, I will be happy to work with you on implementing support for 
>> hardware buffer manager.
>> 
>>> 
>>> Thanks,
>>> Nitin
>>> 
On 03-Feb-2019, at 10:34 PM, Damjan Marion via Lists.Fd.Io 
<dmarion=me@lists.fd.io> wrote:
 
 External Email
 
 
 
> On 3 Feb 2019, at 16:58, Nitin Saxena wrote:
> 
> Hi Damjan,
> 
> I have few queries regarding this patch.
> 
>  - DPDK mempools are not used anymore, we register custom mempool ops, 
> and dpdk is taking buffers from VPP
> Some of the targets uses hardware memory allocator like OCTEONTx family 
> and NXP's dpaa. Those hardware allocators are exposed as dpdk mempools.
 
 Which exact operation do they accelerate?
 
> Now with this change I can see rte_mempool_populate_iova() is not anymore 
> called.
 
 Yes, but new code does pretty much the same thing, it populates both 
 elt_list and mem_list. Also new code puts IOVA into mempool_objhdr.
 
> So what is your suggestion to support such hardware.
 
 Before I can provide any suggestion I need to understand better what those 
 hardware buffer managers do
 and why they are better than pure software solution we have today.
 
>  
> 
>  - first 64-bytes of metadata are initialised on free, so buffer alloc is 
> very fast
> Is it fair to say if a mempool is created per worker core per sw_index 
> (interface) then buffer template copy can be avoided even during free (It 
> can be done only once at init time)
 
 The really expensive part of buffer free operation is bringing cacheline 
 into L1, and we need to do that to verify reference count of the packet.
 At the moment when data is in L1, simply copying template will not cost 
 much. 1-2 clocks on x86, not sure about arm but still i expect that it 
 will result in 4 128-bit stores.
 That was the rationale for resetting the metadata during buffer free.
 
 So to answer your question, having buffer per sw-interface will likely 
 improve performance a bit, but it will also cause sub-optimal use of 
 buffer memory.

FW: [vpp-dev] VPP register node change upper limit

2019-02-03 Thread Abeeha Aqeel

I am using the VPP PPPoE plugin and that's how it's working. I do see an option 
in vnet/interface.c to create interfaces that do not need TX nodes, but I am not 
sure how to use it. 

Also, I cannot figure out where the nodes created along with the PPPoE sessions 
are being used, as they do not show up in "show runtime" or in the trace of 
packets. 

Regards,
 
Abeeha

From: Abeeha Aqeel
Sent: Friday, February 1, 2019 5:36 PM
Cc: vpp-dev@lists.fd.io
Subject: FW: [vpp-dev] VPP register node change upper limit




From: Abeeha Aqeel
Sent: Friday, February 1, 2019 5:32 PM
To: dmar...@me.com
Subject: RE: [vpp-dev] VPP register node change upper limit

I am using the VPP PPPoE plugin and that's how it's working. I do see an option 
in vnet/interface.c to create interfaces that do not need TX nodes, but I am not 
sure how to use it. 

Also, I cannot figure out where the nodes created along with the PPPoE sessions 
are being used, as they do not show up in "show runtime" or in the trace of 
packets. 



From: Damjan Marion via Lists.Fd.Io
Sent: Friday, February 1, 2019 5:23 PM
To: Abeeha Aqeel
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] VPP register node change upper limit



On 1 Feb 2019, at 11:32, Abeeha Aqeel  wrote:

Dear All,
 
I am trying to create 64k PPPoE sessions with VPP but VPP crashes after 
creating 216 sessions each time. From the system logs it seems that it crashes 
while trying to register a node and that node’s index is greater than the limit 
(1024). (attached screenshot of the trace)
 
From the "show vlib graph", I can see that two new nodes are registered for 
each session, i.e. pppoe_session0-tx and pppoe_session0-output.
 
Can someone guide me to how to increase the upper limit on the number of nodes?

Currently, the number of nodes is limited by buffer metadata space and by the way 
we calculate node errors (vlib_error_t).
vlib_error_t is currently a u16, and 10 bits are used for the node index. That gives 
you 1 << 10 node indices, so roughly
300-400 interfaces (2 nodes per interface + other registered nodes < 1024).
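
To make the arithmetic concrete (the 10-bit node field comes from the explanation above; the count of already-registered nodes is an assumption and varies per image):

  /* Back-of-the-envelope version of the node limit described above. */
  #include <stdio.h>

  int
  main (void)
  {
    int node_bits = 10;               /* bits of the u16 vlib_error_t used for the node index */
    int max_nodes = 1 << node_bits;   /* 1024 node indices in total */
    int other_nodes = 400;            /* assumption: core + plugin nodes already registered */
    int nodes_per_session = 2;        /* pppoe_sessionX-tx and pppoe_sessionX-output */

    printf ("max pppoe sessions ~= %d\n",
            (max_nodes - other_nodes) / nodes_per_session);   /* ~312 */
    return 0;
  }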

This is something we can improve, but the real question is: do you really want 
to go that way?
Have you considered using a lighter-weight way to deal with a large number of 
sessions?

-- 
Damjan






Re: [vpp-dev] Streaming Telemetry with gNMI server

2019-02-03 Thread Jerome Tollet via Lists.Fd.Io
Hi Yohan & Stevan,
Great work. Thanks!
Jerome

On 02/02/2019 23:35, "vpp-dev@lists.fd.io on behalf of Yohan Pipereau" 
 wrote:

Hi everyone,

Stevan and I have developed a small gRPC server to stream VPP metrics to
an analytic stack.

That's right, there is already a program that does this in VPP:
vpp_prometheus_export. Here are the main details/improvements of
our implementation:

* Our implementation is based on the gNMI specification, a network standard
co-written by several network actors to allow configuration and
telemetry via RPCs.
* Thanks to the gNMI protobuf file, messages are easier to parse, and they
use a binary format for better performance.
* We are using gRPC and Protobuf, so this is an HTTP/2 server.
* We are using a push model (streaming) instead of a pull model. This
means that clients subscribe to metric paths with a sample interval, and
our server streams counters according to that interval.
* As we said just before, contrary to vpp_prometheus_export, our
application lets clients decide which metrics are streamed and how often.
* For interface-related counters, we also provide conversion of
interface indexes into interface names,
e.g. /if/rx would be output as /if/rx/tap0/thread0.
At this stage, this conversion is expensive because it uses a loop
to collect VAPI interface events. We plan to write paths with
interface names into the STAT shared memory segment to avoid this loop.

Here is the link to our project:
https://github.com/vpp-telemetry-pfe/gnmi-grpc

We have provided a docker scenario to illustrate our work. It can be
found in docker directory of the project. You can follow the guide named
guide.md.

Do not hesitate to give us feedback regarding the scenario or the code.

Yohan




Re: [sweetcomb-dev] [vpp-dev] Streaming Telemetry with gNMI server

2019-02-03 Thread Jerome Tollet via Lists.Fd.Io
Hi Hongjun,
Integrating this work with Sweetcomb would be interesting, because the stats may be 
"enriched" with extra information that is not exposed in the stats shared memory 
segment.
Because of Chinese New Year, there won't be a weekly call this Thursday, but maybe 
Yohan & Stevan could attend the next call.
Regards,
Jerome

On 03/02/2019 05:32, "sweetcomb-...@lists.fd.io on behalf of Ni, Hongjun" 
 wrote:

Hi Yohan and Stevan,

Thank you for your great work!

FD.io has a sub-project named Sweetcomb, which provides a gNMI northbound 
interface to upper-layer applications.
The Sweetcomb project will push its first release on Feb 6, 2019.
Please take a look at the link below for details from Pantheon Technologies:
https://www.youtube.com/watch?v=hTv6hFnyAhE 

Not sure if your work could be integrated with the Sweetcomb project?

Thanks a lot,
Hongjun


-Original Message-
From: vpp-dev@lists.fd.io [mailto:vpp-dev@lists.fd.io] On Behalf Of Yohan 
Pipereau
Sent: Sunday, February 3, 2019 5:55 AM
To: vpp-dev@lists.fd.io
Cc: Stevan COROLLER 
Subject: [vpp-dev] Streaming Telemetry with gNMI server

Hi everyone,

Stevan and I have developed a small gRPC server to stream VPP metrics to an 
analytic stack.

That's right, there is already a program to do this in VPP, it is 
vpp_prometheus_export. Here are the main details/improvements regarding our 
implementation:

* Our implementation is based on gNMI specification, a network standard 
co-written by several network actors to allow configuration and telemetry with 
RPCs.
* Thanks to gNMI protobuf file, messages are easier to parse and use a 
binary format for better performances.
* We are using gRPC and Protobuf, so this is a HTTP2 server
* We are using a push model (or streaming) instead of a pull model. This 
mean that clients subscribe to metric paths with a sample interval, and our 
server streams counters according to the sample interval.
* As we said just before, contrary to vpp_prometheus_export, our 
application let clients decide which metric will be streamed and how often.
* For interface related counters, we also provide conversion of interface 
indexes into interface names.
Ex: /if/rx would be output as /if/rx/tap0/thread0 But at this stage, this 
conversion is expensive because it uses a loop to collect vapi interface 
events. It is planned to write paths with interface names in STAT shared memory 
segment to avoid this loop.

Here is the link to our project:
https://github.com/vpp-telemetry-pfe/gnmi-grpc

We have provided a docker scenario to illustrate our work. It can be found 
in docker directory of the project. You can follow the guide named guide.md.

Do not hesitate to give us feedbacks regarding the scenario or the code.

Yohan






Re: [vpp-dev] VPP register node change upper limit

2019-02-03 Thread Damjan Marion via Lists.Fd.Io

It is a bit of a shame that the plugin doesn't scale. Somebody will need to 
rewrite that plugin to make it right; for example, simple use of sub-interfaces 
would likely make this limitation disappear...

— 
Damjan

> On Feb 4, 2019, at 5:56 AM, Abeeha Aqeel wrote:
> 
>  
> I am using the vpp pppoe plugin and that’s how its working. I do see an 
> option in the vnet/interface.c to create interfaces that do not need TX 
> nodes, but I am not sure how to use that.
>  
> Also I can not figure out where the nodes created along with the pppoe 
> sessions are being used as they do not show up in the “show runtime” or the 
> trace of packets.
>  
> Regards,
>  
> Abeeha
>  
> From: Abeeha Aqeel
> Sent: Friday, February 1, 2019 5:36 PM
> Cc: vpp-dev@lists.fd.io
> Subject: FW: [vpp-dev] VPP register node change upper limit
>  
>  
>  
>  
> From: Abeeha Aqeel
> Sent: Friday, February 1, 2019 5:32 PM
> To: dmar...@me.com
> Subject: RE: [vpp-dev] VPP register node change upper limit
>  
> I am using the vpp pppoe plugin and that’s how its working. I do see an 
> option in the vnet/interface.c to create interfaces that do not need TX 
> nodes, but I am not sure how to use that.
>  
> Also I can not figure out where the nodes created along with the pppoe 
> sessions are being used as they do not show up in the “show runtime” or the 
> trace of packets.
>  
>  
>  
> From: Damjan Marion via Lists.Fd.Io
> Sent: Friday, February 1, 2019 5:23 PM
> To: Abeeha Aqeel
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] VPP register node change upper limit
>  
>  
>  
> 
> On 1 Feb 2019, at 11:32, Abeeha Aqeel  wrote:
>  
> Dear All,
>  
> I am trying to create 64k PPPoE sessions with VPP but VPP crashes after 
> creating 216 sessions each time. From the system logs it seems that it 
> crashes while trying to register a node and that node’s index is greater than 
> the limit (1024). (attached screenshot of the trace)
>  
> From the “show vlib graph”, I can see that two new nodes are registered for 
> each session i.e. pppoe_session0-tx and pppoe_session0-output.
>  
> Can someone guide me to how to increase the upper limit on the number of 
> nodes?
>  
> Currently number of nodes is limited by buffer metadata space, and the way 
> how we calculate node errors (vlib_error_t).
> Currently vlib_error_t is u16, and 10 bits are used for node. That gives you 
> 1 << 10 of node indices, so roughly
> 300-400 interfaces (2 nodes per interface  + other registered nodes < 1024).
>  
> This is something we can improve, but the real question is, do you really 
> want to go that way.
> Have you considered using some more lighter way to deal with large number of 
> sessions...
>  
> -- 
> Damjan
>  
>  
>  
>  