All high-performance networking devices on the market have a pipeline 
architecture.

The pipeline consists of "stages".



ASICs have stages fixed to particular functions:

[inline image image002.png not included]

Well, some stages are driven by code these days (a little flexibility).



Juniper is pipeline-based too (like any ASIC). They just invented one special 
stage in 1996 for lookup (a sequential, nibble-by-nibble search in a tree held 
in big external memory) – it was public information up to the year 2000. It is 
a different principle from TCAM search – performance is traded for 
flexibility/simplicity/cost.
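
To make the "search by nibble" idea concrete, here is a minimal Python sketch of a 16-way tree walked 4 bits at a time for IPv4 longest-prefix match. It is only an illustration of the general principle, not Juniper's actual design: the names NibbleTrie, insert and lookup are made up, and prefix lengths are restricted to multiples of 4 to keep it short. The "reads" counter stands in for the sequential memory accesses that a TCAM would replace with a single parallel match.

```python
import ipaddress


class NibbleTrie:
    """Toy 16-way trie: one node visit (memory read) per 4-bit nibble."""

    def __init__(self):
        self.root = {"children": {}, "next_hop": None}

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        if net.prefixlen % 4:
            raise ValueError("sketch only handles prefix lengths that are multiples of 4")
        addr = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen // 4):
            nibble = (addr >> (28 - 4 * i)) & 0xF
            node = node["children"].setdefault(
                nibble, {"children": {}, "next_hop": None}
            )
        node["next_hop"] = next_hop

    def lookup(self, address):
        """Return (best matching next hop, number of nodes read below the root)."""
        addr = int(ipaddress.ip_address(address))
        node = self.root
        best = node["next_hop"]
        reads = 0
        for i in range(8):  # an IPv4 address is 8 nibbles
            nibble = (addr >> (28 - 4 * i)) & 0xF
            node = node["children"].get(nibble)
            if node is None:
                break
            reads += 1
            if node["next_hop"] is not None:
                best = node["next_hop"]  # remember the longest match so far
        return best, reads


trie = NibbleTrie()
trie.insert("0.0.0.0/0", "default")
trie.insert("10.0.0.0/8", "A")
trie.insert("10.1.0.0/16", "B")
```

The lookup cost grows with the depth of the match, which is the performance being traded away; in exchange the table is ordinary memory and the walk is just code.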



Network Processors emulate stages on general-purpose ARM cores. It is a 
pipeline too (different cores for different functions, many cores for every 
function), it is just a virtual pipeline.



Ed/

-----Original Message-----
From: NANOG [mailto:nanog-bounces+vasilenko.eduard=huawei....@nanog.org] On 
Behalf Of Saku Ytti
Sent: Monday, July 25, 2022 10:03 PM
To: James Bensley <jwbensley+na...@gmail.com>
Cc: NANOG <nanog@nanog.org>
Subject: Re: 400G forwarding - how does it work?



On Mon, 25 Jul 2022 at 21:51, James Bensley 
<jwbensley+na...@gmail.com> wrote:



> I have no frame of reference here, but in comparison to Gen 6 Trio of

> NP5, that seems very high to me (to the point where I assume I am

> wrong).



No, you are right, FP has many more PPEs than Trio.



For a fair calculation, compare how many lines FP has to how many PPEs Trio 
has, because in Trio a single PPE handles the entire packet, and all PPEs run 
identical ucode, performing the same work.



In FP each PPE in a line has its own function: the first PPE in line could be 
parsing the packet and extracting keys from it, the second could be doing 
ingress ACL, the 3rd ingress QoS, the 4th ingress lookup and so forth.
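
A back-of-the-envelope Python sketch of why lines, not PPEs, are the unit to compare. This is an idealized model with made-up numbers (cycle counts, clock rate, PPE counts are all hypothetical, not vendor figures), and it ignores memory latency, instruction storage and everything else that might favour one design over the other.

```python
def run_to_completion_pps(num_ppes, cycles_per_packet, clock_hz):
    """Trio-like model: every PPE runs the whole ucode on its own packet."""
    return num_ppes * clock_hz / cycles_per_packet


def pipeline_pps(num_lines, stage_cycles, clock_hz):
    """FP-like model: a line of PPEs is only as fast as its slowest stage."""
    return num_lines * clock_hz / max(stage_cycles)


# Hypothetical per-packet cost of each function, in cycles:
# parse, ingress ACL, ingress QoS, ingress lookup.
stages = [100, 150, 120, 200]
clock = 1_000_000_000  # 1 GHz, assumed

# 16 PPEs in both cases: 16 independent PPEs versus 4 lines of 4 PPEs.
rtc = run_to_completion_pps(num_ppes=16, cycles_per_packet=sum(stages), clock_hz=clock)
pipe = pipeline_pps(num_lines=4, stage_cycles=stages, clock_hz=clock)

print(f"run-to-completion: {rtc / 1e6:.1f} Mpps, pipeline: {pipe / 1e6:.1f} Mpps")
```

In this toy model only 4 packets are being completed in parallel on the pipelined side, so counting 16 PPEs against 16 PPEs would overstate it.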



Why choose this NP design instead of the Trio design, I don't know. I don't 
understand the upsides.



The downside is easy to understand: picture yourself as a ucode developer, and 
you get the task to 'add this magic feature in the ucode'.

Implementing it in Trio seems trivial: add the code in the ucode, rock on.

On FP, you might have to go 'aww shit, I need to do this before PPE5 but after 
PPE3 in the pipeline, but the instruction cost it adds isn't in the budget that 
I have in PPE4, crap, now I need to shuffle around and figure out which PPE 
in line runs what function to keep the PPS we promise to the customer'.
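
The budget problem can be sketched in a few lines of Python. Everything here is hypothetical: the stage names, cycle counts, clock and promised rate are invented for illustration, and real ucode budgeting is certainly more involved.

```python
def cycle_budget(clock_hz, promised_pps_per_line):
    """Cycles each stage may spend per packet if a line must sustain the promised rate."""
    return clock_hz // promised_pps_per_line


def stages_over_budget(stage_cycles, budget):
    """Names of the stages whose per-packet cost exceeds the budget."""
    return [name for name, cycles in stage_cycles.items() if cycles > budget]


# Hypothetical current cost per stage, in cycles per packet.
stage_cycles = {"PPE3": 180, "PPE4": 190, "PPE5": 160}
budget = cycle_budget(clock_hz=1_000_000_000, promised_pps_per_line=5_000_000)

before = stages_over_budget(stage_cycles, budget)

# The new feature must run after PPE3 and before PPE5, so it lands in PPE4.
stage_cycles["PPE4"] += 30
after = stages_over_budget(stage_cycles, budget)

print(budget, before, after)
```

With run-to-completion the same 30 cycles would just be added to the per-packet total; here they push one stage over its limit even though the other stages have slack.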



Let's look at it from another vantage point. Let's cook up an IPv6 header with 
a crapton of EHs (extension headers). In Trio, the PPE keeps churning through 
it, taking a long time, but eventually it gets there, or raises an exception 
and gives up.

Every other PPE in the box is fully available to perform work.

Same thing in FP? You have HOLB (head-of-line blocking): the PPEs in the line 
after this PPE are not doing anything and can't do anything until the PPE 
before them in line is done.
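
A small Python simulation of that head-of-line blocking, under simplifying assumptions that are mine: all packets are already waiting at time 0, one line of 4 stages versus a pool of 4 identical PPEs, in-order processing, and made-up cycle costs. The function names are hypothetical.

```python
import heapq


def pipeline_finish_times(packets):
    """packets[i][s] = cycles packet i needs in stage s of one in-order line."""
    stage_free = [0] * len(packets[0])  # when each stage finished its previous packet
    finished = []
    for costs in packets:
        t = 0  # packet is assumed to be waiting at the head of the line from time 0
        for s, cost in enumerate(costs):
            # A stage starts once the packet has arrived AND the stage is idle.
            t = max(t, stage_free[s]) + cost
            stage_free[s] = t
        finished.append(t)
    return finished


def run_to_completion_finish_times(packets, num_ppes):
    """Each packet goes to the earliest-free PPE, which runs all the work itself."""
    ppe_free = [0] * num_ppes
    heapq.heapify(ppe_free)
    finished = []
    for costs in packets:
        start = heapq.heappop(ppe_free)
        end = start + sum(costs)
        heapq.heappush(ppe_free, end)
        finished.append(end)
    return finished


# Packet 0 carries a long extension-header chain: parsing costs 1000 cycles.
packets = [[1000, 10, 10, 10]] + [[10, 10, 10, 10]] * 5

line = pipeline_finish_times(packets)
pool = run_to_completion_finish_times(packets, num_ppes=4)
print("line:", line)
print("pool:", pool)
```

In the line, every packet behind the slow one finishes only after it; in the pool, only the PPE that picked up the slow packet is tied up.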



Today Cisco and Juniper do 'proper' CoPP, that is, they do ingress ACL before 
and after lookup. The before-lookup pass is what a normal ingress ACL needs, 
but the after-lookup pass is needed for CoPP (we only know after lookup whether 
it is a control-plane packet). Nokia doesn't do this at all, and I bet they 
can't do it, because if they added it in the core, where it needs to be in 
line, total PPS would go down, as there is no budget for an additional ACL. 
Instead all control-plane packets from the ingress FP are sent to the 
control-plane FP, and inshallah we don't congest the connection there or the 
control-plane FP itself.
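
The ordering being described (ACL, lookup, then a second ACL only for packets the lookup decided to punt) can be sketched in Python as below. This is a toy model, not any vendor's implementation: the addresses come from documentation ranges, the function names and filter rules are invented, and real CoPP normally also polices rates rather than just permitting or denying.

```python
import ipaddress

# Hypothetical values, chosen from documentation address ranges.
LOCAL_ADDRESSES = {ipaddress.ip_address("192.0.2.1")}    # router's own address
TRUSTED_MGMT = ipaddress.ip_network("198.51.100.0/24")   # allowed to reach the control plane
BLOCKED_SOURCES = ipaddress.ip_network("203.0.113.0/24")  # ordinary interface ACL entry


def pre_lookup_acl(pkt):
    """Normal ingress ACL: applied to every packet before the lookup."""
    return ipaddress.ip_address(pkt["src"]) not in BLOCKED_SOURCES


def lookup(pkt):
    """Only the lookup result tells us whether the packet is for the control plane."""
    return "punt" if ipaddress.ip_address(pkt["dst"]) in LOCAL_ADDRESSES else "forward"


def post_lookup_acl(pkt):
    """CoPP filter: evaluated only for packets the lookup decided to punt."""
    return ipaddress.ip_address(pkt["src"]) in TRUSTED_MGMT


def process(pkt):
    if not pre_lookup_acl(pkt):
        return "drop (ingress ACL)"
    if lookup(pkt) == "forward":
        return "forward"
    if not post_lookup_acl(pkt):
        return "drop (CoPP)"
    return "punt to control plane"
```

The second ACL is an extra step in the per-packet path, which is exactly the cost that has to fit somewhere in a fixed line of PPEs.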





>

> Cheers,

> James.







--

  ++ytti
