On Thu, 2021-09-30 at 09:59 +0000, Ananyev, Konstantin wrote:
On Wed, 2021-09-29 at 10:20 +0000, Ananyev, Konstantin wrote:

In the current DPDK framework, each Rx queue is pre-loaded with mbufs for incoming packets. When the number of representors scales out in a switch domain, the memory consumption becomes significant. More importantly, polling all ports leads to high cache miss rates, high latency and low throughput.

This patch introduces the shared Rx queue. Ports with the same configuration in a switch domain can share an Rx queue set by specifying a sharing group. Polling any queue that uses the same shared Rx queue receives packets from all member ports, with the source port identified by mbuf->port. Each port in a shared group must have the same number of queues, and queue indexes are mapped 1:1 within the group. A shared Rx queue is supposed to be polled on the same thread. Multiple groups are supported via group ID.
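For concreteness, here is a minimal setup sketch, assuming the RTE_ETH_RX_OFFLOAD_SHARED_RXQ flag from this series plus a share_group field in rte_eth_rxconf (final names may differ); mb_pool is assumed to be a pre-created mempool:

#include <string.h>
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mempool.h>

/* Sketch: configure queue 0 of every capable port into shared group 1.
 * Ports in the same group with the same queue index share one Rx queue
 * set; the source port of each packet is recovered via mbuf->port.
 */
static void
setup_shared_rxq(struct rte_mempool *mb_pool)
{
    struct rte_eth_dev_info dev_info;
    struct rte_eth_rxconf rxq_conf;
    uint16_t port_id;

    RTE_ETH_FOREACH_DEV(port_id) {
        if (rte_eth_dev_info_get(port_id, &dev_info) != 0)
            continue;
        if (!(dev_info.rx_offload_capa & RTE_ETH_RX_OFFLOAD_SHARED_RXQ))
            continue; /* port cannot join a shared Rx queue */
        memset(&rxq_conf, 0, sizeof(rxq_conf));
        rxq_conf.offloads = RTE_ETH_RX_OFFLOAD_SHARED_RXQ;
        rxq_conf.share_group = 1; /* non-zero group ID enables sharing */
        rte_eth_rx_queue_setup(port_id, 0, 512, rte_socket_id(),
                               &rxq_conf, mb_pool);
    }
}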
Is this offload specific to the representor? If so, can the name be changed to refer to representors specifically?

Yes, PF and representor ports in a switch domain can take advantage of it.

If it is for the generic case, how will flow ordering be maintained?

Not quite sure I understood your question. The control path is almost the same as before: PF and representor ports are still needed, and rte_flow is not impacted. Queues are still needed for each member port; in my PMD implementation the descriptors (mbufs) are supplied from the shared Rx queue.

My question was: if we create a generic RTE_ETH_RX_OFFLOAD_SHARED_RXQ offload and multiple ethdev receive queues land in the same receive queue, how is flow order maintained for the respective receive queues?

I guess the question is about the testpmd forward stream? The forwarding logic has to be changed slightly in the shared-rxq case: for each packet in the rx_burst result, look up the source stream according to mbuf->port and forward the packet to the target fs. Packets from the same source port can be grouped into a small sub-burst for processing, which accelerates performance when traffic comes from a limited number of ports. I'll introduce a common API to do shared-rxq forwarding that takes a packet-handling callback, so it suits all forwarding engines. Will send patches soon.
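A rough sketch of that sub-burst dispatch follows; struct fwd_stream is testpmd's stream type (opaque here), and fwd_stream_of_port() is a hypothetical stand-in for the actual mbuf->port-to-stream lookup:

#include <rte_mbuf.h>

struct fwd_stream; /* testpmd forwarding stream, opaque here */

/* Hypothetical lookup from a source port ID to its forwarding stream. */
struct fwd_stream *fwd_stream_of_port(uint16_t port_id);

/* Split one Rx burst from a shared queue into per-source-port
 * sub-bursts and hand each one to the engine callback of the stream
 * owning that source port. Consecutive mbufs from the same port are
 * forwarded together, so the overhead stays small when traffic comes
 * from a limited number of ports.
 */
static void
shared_rxq_dispatch(struct rte_mbuf **pkts, uint16_t nb_rx,
                    void (*fwd_cb)(struct fwd_stream *fs,
                                   struct rte_mbuf **pkts, uint16_t nb))
{
    uint16_t i, start = 0;

    for (i = 1; i <= nb_rx; i++) {
        /* Flush the accumulated sub-burst when the source port
         * changes or the burst ends.
         */
        if (i == nb_rx || pkts[i]->port != pkts[start]->port) {
            fwd_cb(fwd_stream_of_port(pkts[start]->port),
                   &pkts[start], i - start);
            start = i;
        }
    }
}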
All ports will put their packets into the same queue (the shared queue), right? Does this mean only a single core will poll it? What will happen if multiple cores poll it; won't that cause problems? And if this requires specific changes in the application, I am not sure about the solution; can't this work in a way that is transparent to the application?

Discussed with Jerin: a new API introduced in v3 2/8 aggregates the ports in the same group into one new port. Users can schedule polling on the aggregated port instead of on all member ports.

The v3 still has testpmd changes in the fastpath, right? IMO, for this feature we should not change the fastpath of the testpmd application. Instead, testpmd could use aggregated ports, probably as a separate fwd_engine, to show how to use this feature.

Good point to discuss :) There are two strategies for polling a shared Rxq:

1. Polling each member port. All forwarding engines can be reused to work as before; my testpmd patches are an effort in this direction.

Does your PMD support this?

Unfortunately not. More than that, every application needs to change to support this model.

Both strategies need the user application to resolve the port ID from the mbuf and process accordingly. This one doesn't demand an aggregated port and requires no change to the polling schedule.

I was thinking the mbuf would be updated by the driver/aggregator port by the time it reaches the application.

2. Polling the aggregated port. Besides the forwarding engine, more work is needed to demo it. This is an optional API, not supported by my PMD yet. We are thinking of implementing it in our PMD when the need arises, i.e. without application changes in the fastpath logic. The fastpath has to resolve the port ID anyway and forward according to its logic, and the forwarding engine needs to adapt to support shared Rxq; fortunately, in testpmd this can be done with an abstract API. Let's defer part 2 until some PMD really supports it and it has been tested; what do you think?

We are not planning to use this feature, so either way is OK with me. I leave it to the ethdev maintainers to decide between 1 and 2. I do have a strong opinion against changing the basic testpmd forward engines for this feature. I would like to keep them simple and fastpath-optimized, and would like to add a separate forwarding engine as the means to verify this feature.

+1 to that. I don't think it is a 'common' feature, so a separate fwd mode seems like the best choice to me.

-1 :) There was an internal requirement from our test team: they need to verify that all features (packet content, RSS, VLAN, checksum, rte_flow, ...) work on top of a shared Rx queue.

Then I suppose you'll need to write a really comprehensive fwd-engine to satisfy your test team :) Speaking seriously, I still don't understand why you need all the available fwd-engines to verify this feature.

The shared Rxq is a low-level feature; we need to make sure the driver's higher-level features work properly on top of it. Fwd-engines like csum check the input packet and enable L3/L4 checksum and tunnel offloads accordingly, and other engines do their own feature verification. All existing test automation can be reused once these engines support shared Rxq seamlessly.

From what I understand, the main purpose of your testpmd changes is to allow forwarding a packet through a different fwd_stream (TX through a different HW queue).

Yes. Each mbuf in the burst can come from a different port, and testpmd's current fwd-engines rely heavily on the source forwarding stream. That's why the patch divides the burst result into sub-bursts and uses the original fwd-engine callback to handle each one; how packets are handled is not changed.

In theory, if implemented in a generic and extendable way, that might be a useful add-on to testpmd's fwd functionality. But the current implementation looks very case-specific, and as I don't think it is a common case, I don't see much point in polluting the basic fwd cases with it.

Shared Rxq is an ethdev feature that impacts how packets get handled, so it's natural to update the forwarding engines to avoid breaking them.

Why is that? All it affects is the way you RX the packets, so why do *all* fwd engines have to be updated?

People will ask why some fwd engines can't work.

That can be documented: which fwd engines are supposed to work properly with this feature and which are not. BTW, as I understand it, as long as Rx queues are properly assigned to lcores, any fwd engine will continue to work; it's just that, for engines unaware of this feature, packets may be sent out via the wrong Tx queue. Say, what specifically are you going to test with macswap vs. macfwd mode for this feature?

If people want to test the NIC against a real switch, or make sure the L2 layer doesn't get corrupted.

I understand that; what I am saying is that you probably don't need both to test this feature. I still think one specific fwd engine is enough to cover the majority of test cases.

Yes, rxonly should be sufficient to verify the fundamentals, but to verify csum or timing we need others. Some back-to-back test systems need io forwarding, and real switch deployments need macswap...

Ok, but nothing stops you from picking, for your purposes, the fwd engine with the most comprehensive functionality: macswap, 5tswap, even 5tswap plus csum update...

The new macro is introduced to minimize the performance impact; I'm also wondering whether there is a more elegant solution :) I think Jerin suggested a good alternative with eventdev.

As another approach, one might consider adding an Rx callback that returns packets only for one particular port, while keeping packets for the other ports cached internally (a sketch appears at the end of this message). This and the aggregated-port API could become options in the ethdev layer later. It can't be the fundamental mechanism, due to the performance loss and potential cache misses.

As a 'wild' thought: change the testpmd fwd logic to allow multiple Tx queues per fwd_stream and add a function to do the Tx switching logic. But that's probably quite a big change that needs a lot of work.

The current performance penalty is one "if unlikely" per burst.

It is not only about the performance impact; it is about keeping the testpmd code simple and maintainable.

Think in the reverse direction: if we don't update the fwd-engines here, they all malfunction when shared rxq is enabled, and users can't verify driver features. Is that what you expect?

I expect developers not to rewrite the whole testpmd fwd code for each new ethdev feature.

This just abstracts duplicated code out of the fwd-engines, an improvement that keeps the testpmd code simple and maintainable.

Especially for a feature that is not widely used.

Based on the huge memory savings and the performance and latency gains, it will be popular with users. But testpmd is not critical to this feature; I'm OK with dropping the fwd-engine support if you agree.

Not sure I fully understand you here... If you are saying that you have decided to demonstrate this feature via some other app/example and prefer to abandon these changes in testpmd, then yes, I don't see any problem with that.

Hi Ananyev & Jerin, a new v4 is posted with a dedicated fwd engine to demonstrate this feature; do you have time to check it?

Thanks,
Xueming
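For reference, the Rx-callback idea mentioned above could look roughly like the sketch below. This is only an illustration: stash_mbuf() and the per-port cache are hypothetical, and a complete version would also have to drain previously stashed mbufs for the polled port and bound the cache size.

#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Hypothetical per-port stash for mbufs that belong to other members
 * of the shared group; a real implementation must bound its size.
 */
void stash_mbuf(void *cache, struct rte_mbuf *m);

/* Rx callback (rte_rx_callback_fn signature): keep only the mbufs
 * whose source port matches the polled port and stash the rest, so
 * each polled queue appears to deliver only its own port's traffic.
 */
static uint16_t
shared_rxq_filter_cb(uint16_t port_id, uint16_t queue_id __rte_unused,
                     struct rte_mbuf **pkts, uint16_t nb_pkts,
                     uint16_t max_pkts __rte_unused, void *user_param)
{
    uint16_t i, kept = 0;

    for (i = 0; i < nb_pkts; i++) {
        if (pkts[i]->port == port_id)
            pkts[kept++] = pkts[i];
        else
            stash_mbuf(user_param, pkts[i]);
    }
    /* A complete version would refill pkts[kept..max_pkts-1] from the
     * stash entries that belong to port_id before returning.
     */
    return kept;
}

/* Installed per member port/queue, e.g.:
 * rte_eth_add_rx_callback(port_id, queue_id, shared_rxq_filter_cb, cache);
 */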