Hi Paweł,

On Sat, Dec 17, 2022 at 6:28 PM Paweł Staszewski <pstaszew...@itcare.pl>
wrote:

> Hi
>
>
> So without bgp (lcp) and only one static route, performance is basically as
> it is on "paper": 24Mpps/100Gbit/s without problems.
>
> And then, no matter whether with bond or without bond (with lcp), the
> problems start.
>

When you tried it without bgp, did you still use lcp to manage
interfaces/addresses and add the single static route? If not, could you try
that and report whether the problem still occurs?
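
Something like the following is what I have in mind - a minimal sketch, where
the sub-interface name, next hop, netns and Linux device names are just
placeholders, and I'm assuming your single static route is a default route:

  # Route added directly in VPP's FIB:
  vppctl ip route add 0.0.0.0/0 via 198.51.100.2 BondEthernet0.514
  # Or added on the Linux side (inside the netns holding the lcp host
  # interfaces) so that linux-nl mirrors it into VPP:
  ip netns exec dataplane ip route add 0.0.0.0/0 via 198.51.100.2 dev vlan514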

How many routes is your BGP daemon adding to VPP's FIB?

Are you isolating the cores that worker threads are bound to (e.g. via
isolcpus kernel argument or via cpuset)?
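
For example (the core list below is only a placeholder - it has to match the
workers you configure in startup.conf and your CPU topology):

  # kernel command line (e.g. via GRUB): keep the worker cores away from the
  # general scheduler
  isolcpus=2-5 nohz_full=2-5 rcu_nocbs=2-5

  # startup.conf: pin VPP's workers onto those same isolated cores
  cpu {
    main-core 1
    corelist-workers 2-5
  }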


>
> Basically, the side where I'm receiving most of the traffic that needs to be
> TX-ed to the other interface is OK.
>
> The interface with most RX traffic, on vlan 2906, has an IP address, and when
> I ping from this IP address to the point-to-point IP of the other side, there
> are no packet drops.
>
> But on the interface where the traffic that was RX-ed from vlan 2906 needs to
> be TX-ed, on vlan 514, there are drops to the point-to-point IP of the other
> side - from 10 to 20%.
>
> The same happens when I ping/mtr from the RX side to the TX side: there are
> drops - but there are no drops when I ping from the TX side to the RX side,
> so on the other side forwarding goes out through the interface that has most
> RX and less TX.
>

How are you measuring packet loss? You mentioned 10 to 20% drops by ping and
mtr above. Are those tools all you're using, or are you running a traffic
generator like TRex? My reason for asking is that when I look at the 'show
runtime' output you sent and compare the number of packets ("vectors") handled
by rdma-input on each thread to the number handled by enp59s0f0-rdma-output
and enp59s0f1-rdma-output on the same thread, the difference is much, much
smaller than 10-20%. So some more specific information on how you're measuring
that 10-20% packet loss would be useful. Is the 10-20% packet loss only
observed when communicating directly with the interface addresses on the host
system, or are 10-20% of the packets which should be forwarded between
interfaces by VPP being dropped?
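
If it helps, one way to measure the loss on forwarded traffic from VPP itself
rather than with ping/mtr is to reset the counters while traffic is flowing
and read them again after a fixed interval (the 10-second window here is
arbitrary):

  vppctl clear runtime
  vppctl clear interfaces
  vppctl clear errors
  sleep 10
  vppctl show runtime     # per-node vector counts for the last 10 seconds
  vppctl show interface   # rx/tx/drops per interface for the same window
  vppctl show errors      # per-node error/drop counters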

I notice you mentioned "lcpng", which is a customized version of linux-cp. I'm
not sure what the differences are between the stock versions of
linux-cp/linux-nl and the code in lcpng. Have you tried this experiment with
the stock version of linux-cp/linux-nl from the VPP master branch on gerrit?
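
In case it's useful, switching to the in-tree plugins looks roughly like this
(the plugin names are the stock ones; the netns and host interface names are
only examples):

  # startup.conf
  plugins {
    plugin linux_cp_plugin.so { enable }
    plugin linux_nl_plugin.so { enable }
  }

  # then pair each VPP interface with a Linux tap, e.g.:
  vppctl lcp create BondEthernet0.2906 host-if vlan2906 netns dataplane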

Also, as Ben previously requested, the output of 'show errors' and 'show
hardware-interfaces' would be helpful.

Thanks,
-Matt

> So it looks like the interface busy with RX traffic is OK - the problem is
> when an interface is mostly TX-ing traffic that was RX-ed from the other
> interface... but I don't know how to check what is causing it... ethtool -S
> for any interface shows no errors/drops at the interface level.
>
>
>
>
> On 12/16/22 10:51 AM, Benoit Ganne (bganne) via lists.fd.io wrote:
>
> Hi,
>
>
> So the hardware is:
> Intel 6246R
> 96GB ram
> Mellanox Connect-X 5 2x 100GB Ethernet NIC
> And a simple configuration with vpp/frr where all traffic is RX-ed on one
> vlan interface and TX-ed on a second vlan interface - it is normal internet
> traffic - about 20Gbit/s with 2Mpps
>
> 2Mpps looks definitely too low, in a similar setup, CSIT measures IPv4 NDR 
> with rdma at ~17.6Mpps with 2 workers on 1 core (2 hyperthreads): 
> http://csit.fd.io/trending/#eNrlkk0OwiAQhU-DGzNJwdKuXFh7D0NhtE36QwBN6-mljXHahTt3LoCQb-Y95gUfBocXj-2RyYLlBRN5Y-LGDqd9PB7WguhBtyPwJLmhsFyPUmYKnOkUNDaFLK2Aa8BQz7e4KuUReuNmFXGeVcw9bCSJ2Hoi8t2IGpRDRR3RjVBAv7LZvoeqrk516JsnUmmcgLiOeRDieqsfJrui7yHzcqn4XXj2H8Kzn_BkuesH1y0_UJYvWG6xEg
>
> The output of 'sh err' and 'sh hard' would be useful too.
>
>
> Below vpp config:
>
> To start with, I'd recommend doing a simple test removing lcp, vlan & bond to
> see if you can reproduce CSIT performance, and then maybe add bond and
> finally lcp and vlan. This could help narrow down where performance drops.
>
>
> Below also show run
>
> The vector rate is really low, so it is really surprising there are drops...
> Do you capture the show run output when you're dropping packets? Basically, 
> when traffic is going through VPP and performance is maxing out, do 'cle run' 
> and then 'sh run' to see the instantaneous values and not averages.
>
>
> Anyone know how to interpret this data? What are the Suspends for
> api-rx-from-ring?
>
> This is a control plane task in charge of processing API messages. VPP uses 
> cooperative multitasking within the main thread for control plane tasks, 
> Suspends counts the number of times this specific task voluntarily released 
> the CPU, yielding to other tasks.
>
>
> and how to check what type of error (traffic) is causing the drops:
>
> You can capture dropped traffic:
> pcap trace drop
> <wait for some traffic to be dropped...>
> pcap trace drop off
>
> You can also use VPP packet tracer:
> tr add rdma-input 1000
> <wait for some traffic to be dropped...>
> tr filter include error-drop 1000
> sh tr max 1000
>
> Best
> ben
>
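
Regarding Ben's suggestion above to first test without lcp, vlan and bond: a
minimal sketch of such a setup could look like the following. The host
interface names are taken from your 'show run' output, but the VPP interface
names, queue counts, addressing and the assumption of a single default route
are mine:

  vppctl create interface rdma host-if enp59s0f0 name rdma-0 num-rx-queues 4
  vppctl create interface rdma host-if enp59s0f1 name rdma-1 num-rx-queues 4
  vppctl set interface state rdma-0 up
  vppctl set interface state rdma-1 up
  vppctl set interface ip address rdma-0 192.0.2.1/30
  vppctl set interface ip address rdma-1 198.51.100.1/30
  # single static route towards the traffic sink - no bond, vlan or lcp:
  vppctl ip route add 0.0.0.0/0 via 198.51.100.2 rdma-1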