On Mon, 2021-12-20 at 18:51 +0530, Jerin Jacob wrote:
> On Fri, Dec 17, 2021 at 5:10 PM Van Haaren, Harry
> <harry.van.haa...@intel.com> wrote:
> >
> > +CC Thomas;
> >
> > > -----Original Message-----
> > > From: Jerin Jacob <jerinjac...@gmail.com>
> > > Sent: Wednesday, December 15, 2021 12:41 PM
> > > To: Randles, Ronan <ronan.rand...@intel.com>
> > > Cc: dpdk-dev <dev@dpdk.org>; Van Haaren, Harry
> > > <harry.van.haa...@intel.com>
> > > Subject: Re: [PATCH 05/12] gen: add raw packet data API and tests
> > >
> > > On Tue, Dec 14, 2021 at 7:43 PM Ronan Randles
> > > <ronan.rand...@intel.com> wrote:
> > > >
> > > > From: Harry van Haaren <harry.van.haa...@intel.com>
> >
> > <snip some patch contents>
> >
> > > > +	const uint32_t base_size = gen->base_pkt->pkt_len;
> > > > +	const uint8_t *base_data = rte_pktmbuf_mtod(gen->base_pkt, uint8_t *);
> > >
> > > I think, the very next feature will be generating packets for
> > > incrementing IP addresses or so.
> >
> > Hah, yes! It's a logical next step, and indeed we have POC code
> > internally that Ronan and I have worked on that does this :) I've been
> > using this internal POC of testing of OVS for ~ a year now, and it
> > provides a pretty nice workflow for me.
> >
> > > In this case, one packet-based template will not work.
> >
> > Why not? I agree that "pre-calculating" all packets will not work, but
> > the approach we have taken for this library is different. See below;
> >
> > > May we worth consider that use case into API framework first and add
> > > support later for implementation as it may change the complete land
> > > space of API to have better performance. Options like struct rte_gen
> > > logical object can have N templates instead of one is an option on
> > > the table. :-)
> >
> > Agree - more complex usages have been designed for too.
> > Let me explain;
> >
> > 1) A single gen instance uses a single template, and has "modifiers"
> > that allow manipulation of the packet before sending. The
> > gen->base_pkt is copied to the destination mbuf, and then updated by
> > the modifiers. This approach is much better to allow for huge
> > flow-counts (> 1 million?) as pre-calculating and storing 1 million
> > packets is a waste of memory, and causes a lot of mem-IO for the
> > datapath core.
> >
> > 2) The "modifiers" approach allows any number of things to be changed,
> > with little mem-IO, and variable CPU cycle cost based on the modifiers
> > themselves. If the CPU cycle cost of generating packets is too high,
> > just add more cores :)
> >
> > 3) There are also some smarts we can apply for pre-calculating only a
> > small amount of data per packet (e.g. uniformly-random distributed src
> > ip). The memory footprint is lower than pre-calc of whole packets, and
> > the runtime overhead of uniform-random is moved to configure time
> > instead of on the datapath.
> >
> > 4) Dynamically generating packets by modification of templates allows
> > for cool things to be added, e.g. adding timestamps to packets, and
> > calculating latency can be done using the modifier concept and a
> > protocol string "Ether()/IP()/UDP()/TSC()". If the packet is being
> > decapped by the target application, the string params can provide
> > context for where to "retrieve" the TSC from on RX in the generator:
> > "TSC(rx_offset=30)". I've found this approach to be very flexible and
> > nice, so am a big fan :)
> >
> > 5) In order to have multiple streams of totally-different traffic
> > types (read "from multiple templates") the user can initialize
> > multiple rte_gen instances. This allows applications that require
> > multi-stream traffic to achieve that too, with the same abstraction as
> > a single template stream.
> > Initially the generator app is just providing a single stream, but
> > this application can be expanded to many usages over the next year
> > before 22.11 :)
>
> OK. I thought "modifiers" will need some sort of critical section in
> multiple producer use cases. If so, one option could be N streams in
> one gen instance vs N gen instance. Just my 2c. Anyway, you folks can
> decide on one option vs another. Only my concern was including such
> features affect the prototype of existing APIs or not?
> In either case, No strong opinion.
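
For anyone following the thread, here is a rough sketch of the
template-plus-modifier flow from points 1) to 5) above, in plain C with
no DPDK dependency. Everything in it (struct gen_sketch, gen_one(),
mod_inc_src_ip(), the fixed offset 26) is hypothetical and only for
illustration; it is not the API proposed in the patch set. The modifier
state lives inside the instance, so one instance per producer core would
need no critical section, which is an assumption about how the instances
would be used.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the proposed rte_gen API. */
#define PKT_MAX    64u
#define SRC_IP_OFF 26u /* 14-byte Ethernet header + offset 12 in the IPv4 header */

struct gen_sketch {
	uint8_t  base[PKT_MAX]; /* template packet bytes (plays the role of base_pkt) */
	uint32_t base_len;      /* number of valid bytes in base[] */
	uint32_t next_src_ip;   /* per-instance modifier state, host byte order */
};

/*
 * Modifier: write an incrementing IPv4 source address in network byte order.
 * A real modifier would also have to fix up the IPv4 header checksum.
 */
static void mod_inc_src_ip(struct gen_sketch *g, uint8_t *pkt)
{
	uint32_t ip = g->next_src_ip++;

	pkt[SRC_IP_OFF + 0] = (uint8_t)(ip >> 24);
	pkt[SRC_IP_OFF + 1] = (uint8_t)(ip >> 16);
	pkt[SRC_IP_OFF + 2] = (uint8_t)(ip >> 8);
	pkt[SRC_IP_OFF + 3] = (uint8_t)ip;
}

/* Copy the template into dst, then let the modifier patch the copy. */
static uint32_t gen_one(struct gen_sketch *g, uint8_t *dst, uint32_t dst_cap)
{
	assert(g->base_len <= dst_cap);
	assert(g->base_len >= SRC_IP_OFF + 4u);

	memcpy(dst, g->base, g->base_len); /* the per-packet copy cost */
	mod_inc_src_ip(g, dst);            /* template itself stays untouched */
	return g->base_len;
}
```

Each call to gen_one() yields one packet whose source IP is one higher
than the previous one, while the stored template is never modified.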

Sometimes we need RSS with only a limited number of streams. Modifiers
have to copy the template data and then modify it, which costs more
cycles and data cache, so they are not a good choice for a performance
test. Cloning from a list of templates copies only the mbuf headers,
which seems more efficient for such a case. I agree that modifiers are a
flexible and more powerful way to do things; hopefully we can have both :)

> >
> >
> > I could ramble on a bit more, but mostly diminishing returns I think...
> > I'll just use this email as a reply to Thomas' tweet;
> > https://twitter.com/tmonjalo/status/1337313985662771201
> >
> > Regards, -Harry
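
To illustrate the header-only cloning idea from the reply above, here is
a minimal sketch in plain C. It assumes a small, fixed list of pre-built
templates (one per flow, enough to spread across RSS queues) and hands
out lightweight headers that point at the shared payload instead of
copying it. The types and functions (struct tmpl, struct pkt_hdr,
clone_burst(), pkt_free()) are hypothetical; in DPDK the comparable
mechanism would be rte_pktmbuf_clone(), which attaches a new mbuf to the
existing data rather than copying it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not struct rte_mbuf. */
struct tmpl {
	const uint8_t *data;   /* immutable payload shared by all clones */
	uint32_t       len;
	uint32_t       refcnt; /* number of live headers pointing at data */
};

struct pkt_hdr {
	const uint8_t *data;   /* points into the template, no payload copy */
	uint32_t       len;
	struct tmpl   *owner;
};

struct tmpl_list {
	struct tmpl *tmpls;
	uint32_t     n;
	uint32_t     next;     /* round-robin cursor */
};

/* Fill burst[] with header-only clones, cycling through the templates. */
static uint32_t clone_burst(struct tmpl_list *l, struct pkt_hdr *burst,
			    uint32_t count)
{
	assert(l->n > 0);

	for (uint32_t i = 0; i < count; i++) {
		struct tmpl *t = &l->tmpls[l->next];

		burst[i].data = t->data; /* only the header fields are written */
		burst[i].len = t->len;
		burst[i].owner = t;
		t->refcnt++;
		l->next = (l->next + 1) % l->n;
	}
	return count;
}

/* Drop one reference; the shared payload is never freed in this sketch. */
static void pkt_free(struct pkt_hdr *p)
{
	assert(p->owner != NULL && p->owner->refcnt > 0);
	p->owner->refcnt--;
	p->owner = NULL;
	p->data = NULL;
}
```

The trade-off against modifiers is that the flow count is bounded by the
number of stored templates, and the shared payload must stay read-only
while any clone is in flight.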