Hi Dave,

Per my action from last week's TSC meeting (item 4.c. in [1]), here is the list of HW that the FD.io project needs and that we can order any time:
1. 28 NICs, 2p100GbE from Nvidia / Mellanox - preferred: MCX613106A-VDAT, less preferred: MCX556A-EDAT, to cover the following testbeds:
   a. Performance 3-Node-ICX, 2 testbeds, 4 SUTs, 2 TGs
   b. Performance 2-Node-ICX, 4 testbeds, 4 SUTs, 4 TGs
   c. ICX TGs for other systems, 3 TGs
   d. 3-Node-Alt (Ampere Altra Arm N1), 1 testbed, 2 SUTs, 1 TG
   e. (exact breakdown in my email from 28 Jan 2022 in the thread below)
2. If we also want to add an MLX NIC for functional vpp_device tests, that would be an additional 2 MLX 2p100GbE NICs.

Things that we originally planned, but can't place orders for yet, as the HW is not available:

3. TBC number of 2-socket Xeon Sapphire Rapids servers
   a. Intel Xeon processor SKUs are not yet available to us - expecting an update any week now.
   b. Related SuperMicro SKUs are not yet available to us - expecting an update any week now.

Hope this helps. Happy to answer any questions.

Cheers,
-Maciek

[1] https://ircbot.wl.linuxfoundation.org/meetings/fdio-meeting/2022/fd_io_tsc/fdio-meeting-fd_io_tsc.2022-06-09-15.00.html

On 5 Apr 2022, at 12:55, Maciek Konstantynowicz (mkonstan) <mkons...@cisco.com> wrote:

Super, thanks!

On 4 Apr 2022, at 20:22, Dave Wallace <dwallac...@gmail.com> wrote:

Hi Maciek,

I have added this information to the TSC Agenda [0].

Thanks,
-daw-

[0] https://wiki.fd.io/view/TSC#Agenda

On 4/4/2022 10:46 AM, Maciek Konstantynowicz (mkonstan) wrote:

Begin forwarded message:

From: mkonstan <mkons...@cisco.com>
Subject: Re: [tsc-private] [csit-dev] TRex - replacing CVL with MLX for 100GbE
Date: 3 March 2022 at 16:23:08 GMT
To: Ed Warnicke <e...@cisco.com>
Cc: "tsc-priv...@lists.fd.io" <tsc-priv...@lists.fd.io>, Lijian Zhang <lijian.zh...@arm.com>

+Lijian

Hi,

Resending the email from January so it is refreshed in our collective memory, as discussed on the TSC call just now. The number of 2p100GE MLX NICs needed for performance testing of the Ampere Altra servers is listed under point 4 below.

Let me know if anything is unclear and/or if there are any questions.

Cheers,
Maciek

On 28 Jan 2022, at 17:35, Maciek Konstantynowicz (mkonstan) via lists.fd.io <mkonstan=cisco....@lists.fd.io> wrote:

Hi Ed, Trishan,

One correction regarding my last email from 25-Jan: for the Intel Xeon Icelake testbeds, apart from just replacing E810s on the TRex servers, we should also consider adding MLX 100GbE NICs for the SUTs, so that FD.io could benchmark MLX on the latest Intel Xeon CPUs. Exactly as discussed in our side conversation, Ed.

Here is an updated calculation with the breakdown for the Icelake (ICX) builds (the Cascadelake part stays as per my previous email):

// Sorry for the TL;DR - if you just want the number of NICs, scroll to the bottom of this message :)

(SUT, system under test: server running VPP plus the NICs under test)
(TG, traffic generator: server running TRex, needs link speeds matching the SUTs')

1. 3-Node-ICX, 2 testbeds, 4 SUTs, 2 TGs
   - 4 SUT/VPP/dpdk servers
     - 4 ConnectX NICs, 1 per SUT - test ConnectX on SUT
   - 2 TG/TRex servers
     - 2 ConnectX NICs, 1 per TG - replace E810s and test E810 on SUT
     - 2 ConnectX NICs, 1 per TG - test ConnectX on SUT
   - 1 ConnectX NIC, 1 per testbed type - for TRex calibration
   - sub-total 9 NICs
2. 2-Node-ICX, 4 testbeds, 4 SUTs, 4 TGs
   - 4 SUT/VPP/dpdk servers
     - 4 ConnectX NICs, 1 per SUT - test ConnectX on SUT
   - 4 TG/TRex servers
     - 4 ConnectX NICs, 1 per TG - replace E810s and test E810 on SUT
     - 4 ConnectX NICs, 1 per TG - test ConnectX on SUT
   - 1 ConnectX NIC, 1 per testbed type - for TRex calibration
   - sub-total 13 NICs
3. ICX TGs for other systems, 3 TGs
   - 3 TG/TRex servers
     - 3 ConnectX NICs, 1 per TG - replace E810s and test ConnectX and other 100GbE NICs on SUTs
   - 1 ConnectX NIC, 1 per testbed type - for TRex calibration
   - sub-total 4 NICs
4. 3-Node-Alt (Ampere Altra Arm N1), 1 testbed, 2 SUTs, 1 TG
   - 2 SUT/VPP/dpdk servers
     - 2 ConnectX NICs, 1 per SUT - test ConnectX on SUT
   - 1 TG/TRex server - will use one of the ICX TGs listed in point 3.
   - sub-total 2 NICs

Total 28 NICs.
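As a cross-check of the sub-totals above, here is a minimal tally in C. It is illustrative only and not part of any CSIT tooling; the labels are shortened versions of the testbed names in this email.

```c
/* Illustrative tally of the ConnectX NIC counts listed above.
 * Labels and counts are copied from the email; the program is only
 * a cross-check of the arithmetic, not part of any CSIT tooling. */
#include <stdio.h>

int main(void)
{
    struct { const char *testbed; int nics; } plan[] = {
        { "3-Node-ICX (2 testbeds)",               9 },  /* 4 SUT + 2+2 TG + 1 calibration */
        { "2-Node-ICX (4 testbeds)",              13 },  /* 4 SUT + 4+4 TG + 1 calibration */
        { "ICX TGs for other systems (3 TGs)",     4 },  /* 3 TG + 1 calibration */
        { "3-Node-Alt (Ampere Altra, 1 testbed)",  2 },  /* 2 SUT, TG shared with item 3 */
    };
    int total = 0;

    for (unsigned i = 0; i < sizeof(plan) / sizeof(plan[0]); i++) {
        printf("%-40s %2d NICs\n", plan[i].testbed, plan[i].nics);
        total += plan[i].nics;
    }
    printf("%-40s %2d NICs\n", "Total", total);  /* prints 28 */
    return 0;
}
```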
Hope this makes sense ...

Cheers,
Maciek

P.S. I'm on PTO now until 7-Feb, so email responses will be delayed.

On 25 Jan 2022, at 16:38, mkonstan <mkons...@cisco.com> wrote:

Hi Ed, Trishan,

Following on from the last TSC call, here are the details about the Nvidia Mellanox NICs that we are after for CSIT.

For the existing Intel Xeon Cascadelake testbeds we have one option:
- MCX556A-EDAT NIC 2p100GbE - $1,195.00 - details in [2].
- need 4 NICs, plus 1 spare => 5 NICs

For the new Intel Xeon Icelake testbeds we have two options:
- MCX556A-EDAT NIC 2p100GbE - $1,195.00 - same as above, OR
- MCX613106A-VDAT 2p100GbE - $1,795.00 - details in [3] (limited availability)
- need 7 NICs, plus 1 spare => 8 NICs

We need Nvidia Mellanox advice and assistance with two things:
1. Which NIC model should we get for Icelake with PCIe Gen4 x16 slots?
2. How many of the listed NIC quantities can Nvidia Mellanox donate, vs the LFN FD.io project purchasing them through a retail channel?

Let me know if you're still good to help here, and the next steps.

Hope this makes sense; let me know if there are questions.

Cheers,
Maciek

Begin forwarded message:

From: "Maciek Konstantynowicz (mkonstan) via lists.fd.io" <mkonstan=cisco....@lists.fd.io>
Subject: [csit-dev] TRex - replacing CVL with MLX for 100GbE
Date: 23 January 2022 at 20:03:59 GMT
To: csit-dev <csit-...@lists.fd.io>
Reply-To: mkons...@cisco.com

Hi,

Following the discussion on the CSIT call last Wednesday [1], we would like to move forward with using only Mellanox NICs to drive 100 GbE links, and disconnecting (or removing) the E810 CVL NICs from the TG (TRex) servers. This is due to a number of show-stopper issues preventing CSIT use of TRex with the DPDK ICE driver [ICE], with no line of sight to having them addressed. This impacts our production 2n-clx testbeds, as well as the new icx testbeds that are being built.

For 2n-clx, I believe we agreed on a call to use the same NIC model that is already there, MCX556A-EDAT (ConnectX-5 Ex) 2p100GbE [2], and just add more NICs for 100GbE capacity.

For the icx testbeds, with servers supporting PCIe Gen4, we could also use MCX556A-EDAT (it supports PCIe Gen4), or take it up a notch and use the ConnectX-6 MCX613106A-VDAT [3], which is advertised with "Up to 215 million messages/sec", which may mean "215 Mpps". If anybody has experience (or knows someone who does) with ConnectX-6 and the DPDK driver, it would be great to hear.

Anyway, here is a quick calculation of how many NICs we would need:

1. 2n-clx, 3 testbeds, 3 TG servers => 4 NICs
   - s34-t27-tg1, s36-t28-tg1, s38-t29-tg1, see [4]
   - NIC model: MCX556A-EDAT NIC 2p100GbE
   - 1 NIC per TG server => 3 NICs
   - 1 NIC per TG server type for calibration => 1 NIC
2. 2n-icx, 4 testbeds, 4 TG servers => 5 NICs
   - NIC model: MCX556A-EDAT 2p100GbE QSFP28
   - Or MCX613106A-VDAT 2p100GbE (accepts QSFP28 NRZ per [5])
   - 1 NIC per TG server => 4 NICs
   - 1 NIC per TG server type for calibration => 1 NIC
3. 3n-icx, 2 testbeds, 2 TG servers => 2 NICs
   - NIC model: MCX556A-EDAT NIC 2p100GbE
   - Or MCX613106A-VDAT 2p100GbE (accepts QSFP28 NRZ per [5])
   - 1 NIC per TG server => 2 NICs

Thoughts?

Cheers,
Maciek

[1] https://ircbot.wl.linuxfoundation.org/meetings/fdio-meeting/2022/fd_io_tsc/fdio-meeting-fd_io_tsc.2022-01-13-16.00.log.html#l-79
[2] https://store.nvidia.com/en-us/networking/store/product/MCX556A-EDAT/nvidiamcx556a-edatconnectx-5exvpiadaptercardedr100gbe/
[3] https://store.nvidia.com/en-us/networking/store/product/MCX613106A-VDAT/nvidiamcx613106a-vdatconnectx-6enadaptercard200gbe/
[4] https://git.fd.io/csit/tree/docs/lab/testbed_specifications.md#n1189
[5] https://community.mellanox.com/s/question/0D51T00008Cdv1g/qsfp56-ports-accept-qsfp28-devices

[ICE] Summary of issues with DPDK ICE driver support for TRex (a minimal rte_flow sketch illustrating items 1 and 2 follows this list):
1. TO-DO. Drop All. CVL rte_flow doesn't support a match criteria of ANY.
   - TRex: HW assist for rx packet counters; SW mode not fit for purpose.
   - Status: POC attempted, incomplete.
2. TO-DO. Steer all to a queue. CVL rte_flow doesn't support a match criteria of ANY.
   - TRex: HW assist for TODO; needed for STL; SW mode not fit for purpose.
   - Status: POC attempted, incomplete.
3. TO-VERIFY. CVL doesn't support ipv4.id.
   - TRex: HW assist for flow stats and latency stream redirect.
   - Status: Completed in DPDK 21.08.
4. TO-VERIFY. CVL PF doesn't support LL (Low Latency) / HP (High Priority) for PF queues.
   - TRex: Needed for ASTF (stateful).
   - Status: the CVL (E810) NIC does not have this API but has the capability.
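To make items 1 and 2 above concrete, here is a minimal DPDK rte_flow sketch of the kind of rule TRex relies on for HW-assisted operation: match every ingress packet (an empty, catch-all pattern) and count it while redirecting it to a specific RX queue. This is an illustrative fragment, not TRex or CSIT code; the function name, port id and queue id are arbitrary. Per the summary above, it is exactly this catch-all pattern that the ice PMD did not support at the time, which is part of the motivation for moving the TG NICs to ConnectX.

```c
/* Minimal sketch of a "match ANY, count and steer to one queue" rte_flow
 * rule, as used by TRex for HW-assisted rx counters and queue steering
 * (items 1 and 2 above). Illustrative only: ids are arbitrary and error
 * handling is reduced to the bare minimum. */
#include <rte_flow.h>
#include <rte_ethdev.h>

static struct rte_flow *
steer_all_to_queue(uint16_t port_id, uint16_t queue_id)
{
    struct rte_flow_error err;

    /* Ingress-only rule, default group and priority. */
    struct rte_flow_attr attr = { .ingress = 1 };

    /* Empty pattern (only the END item) - i.e. "match ANY packet".
     * This is the construct the ice PMD rejected per the list above. */
    struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };

    /* Count every matched (i.e. every) packet in HW and redirect it
     * to a single RX queue. */
    struct rte_flow_action_count count = { 0 };
    struct rte_flow_action_queue queue = { .index = queue_id };
    struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_COUNT, .conf = &count },
        { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };

    /* Validate first so unsupported patterns fail cleanly, then create. */
    if (rte_flow_validate(port_id, &attr, pattern, actions, &err) != 0)
        return NULL;
    return rte_flow_create(port_id, &attr, pattern, actions, &err);
}
```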