> > > >>>>>>>>>>>> On Thu, Mar 11, 2021 at 12:01 AM Honnappa Nagarahalli
> > > >>>>>>>>>>>> <honnappa.nagaraha...@arm.com> wrote:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Hello,
> > > >>>>>>>>>>>>> Performance of L3fwd example application is one
> > > >>>>>>>>>>>>> of the key
> > > >>>>>>>>>>>> benchmarks in DPDK. However, the application does not
> > > >>>>>>>>>>>> have many debugging statistics to understand the
> > > >>>>>>>>>>>> performance issues. We have added L3fwd as another
> > > >>>>>>>>>>>> mode/stream to testpmd which provides
> > > >>>>>>>>>> enough
> > > >>>>>>>>>>>> statistics at various levels. This has allowed us to
> > > >>>>>>>>>>>> debug the performance issues effectively.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> There is more work to be done to get it to upstreamable
> > > >>>>>>>>>>>>> state. I am
> > > >>>>>>>>>>>> wondering if such a patch is helpful for others and if
> > > >>>>>>>>>>>> the community would be interested in taking a look.
> > > >>>>>>>>>>>> Please let me know
> > > >>>>>>>>> what you think.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> We are using app/proc-info/ to attach and analyze the
> > > >>>>> performance.
> > > >>>>>>>>>>>> That helps to analyze the unmodified application. I
> > > >>>>>>>>>>>> think, if something is missing in proc-info app, in my
> > > >>>>>>>>>>>> opinion it is better to enhance proc-info so that it can
> > > >>>>>>>>>>>> help other third-party
> > > >>>>>>> applications.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Just my 2c.
> > > >>>>>>>>>>> Thanks Jerin. We will explore that.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I agree it is dangerous to rely too much on testpmd for
> > > >> everything.
> > > >>>>>>>>>> Please tell us what in testpmd could be useful out of it.
> > > >>>>>>>>>>
> > > >>>>>>>>> Things that are very helpful in testpmd are: 1) HW
> > > >>>>>>>>> statistics from the NIC 2) Forwarding stats 3) Burst stats
> > > >>>>>>>>> (indication of headroom
> > > >>>>>>>>> availability) 4) Easy to set parameters like RX and TX queue
> > > >>>>>>>>> depths (among others) without having to recompile.
> > > >>>>>>>>
> > > >>>>>>>> [Kathleen Capella]
> > > >>>>>>>> Thank you for the suggestion of app/proc-info. I've tried it
> > > >>>>>>>> out with l3fwd and see that it does have the HW stats from
> > > >>>>>>>> the NIC and the forwarding
> > > >>>>>>> stats.
> > > >>>>>>>> However, it does not have the burst stats testpmd offers, nor
> > > >>>>>>>> the
> > > >>>>>>>
> > > >>>>>>> One option to see such level of debugging would be to have
> > > >>>>>>> - Create a memzone in the primary process
> > > >>>>>>> - Application under test can update the stats in memzone based
> > > >>>>>>> on the code flow
> > > >>>>>>> - proc-info can read the counters updated by application under
> > > >>>>>>> test using the memzone object got through
> > > >> rte_memzone_lookup()
> > > >>>>>> Agreed. Currently, using app/proc-info does not provide this
> > > >>>>>> ability. We
> > > >>>>> cannot add this capability to app/proc-info as these stats would
> > > >>>>> be specific to L3fwd application.
> > > >>>>>
> > > >>>>> I meant creating generic counter-read/write infra via memzone to
> > > >>>>> not make it as l3fwd specific.
> > > >>>> Currently, app/proc-info is able to print the stats as they are
> > > >>>> standardized
> > > >> via the API. But for statistics that are generated in the
> > > >> application, they are very specific to that application. For ex:
> > > >> burst stats in testpmd are very specific to it and another
> > > >> application might implement the same in a very different manner.
> > > >>>>
> > > >>>> In needs to be something like the app/proc-info just needs to be
> > > >>>> a dumb
> > > >> displaying utility and the application has to do all the heavy
> > > >> lifting of copying the exact display strings to the memory.
> > > >>>
> > > >>> Yes.
> > > >>>
> > > >>>>
> > > >>>>>>>
> > > >>>>>>> Another approach will be using rte_trace()[1] for
> > > >>>>>>> debugging/tracing by adding tracepoints in l3fwd for such events.
> > > >>>>>>> It has a timestamp and the trace format is opensource trace
> > > >>>>>>> format(CTF(Common trace format)), so that we can use post
> > > >>>>>>> posting tools to analyze.
> > > >>>>>>> [1]
> > > >>>>>>> https://doc.dpdk.org/guides/prog_guide/trace_lib.html
> > > >>>>>> This is good for analyzing an incident. I think it is an
> > > >>>>>> overhead for
> > > >>>>> development purposes.
> > > >>>>>
> > > >>>>> Consider if one wants to add burst stats, one can add stats
> > > >>>>> increment under RTE_TRACE_POINT_FP, it will be emitted
> > whenever
> > > >>>>> code flow through that path. Set of events of can be viewed in
> > > >>>>> trace viewer[1]. Would that be enough?
> > > >>>>> Adding traces to l3fwd can be upstreamed as it is useful for
> > > >>>>> others for debugging.
> > > >>>>>
> > > >>>>> [1]
> > > >>>>> https://github.com/jerinjacobk/share/blob/master/dpdk_trace.JPG
> > > >>>> This needs post processing of the trace info to derive the
> > > >>>> information, is it
> > > >> correct? For ex: for burst stats, there will be several traces
> > > >> generated collecting the number of packets returned by
> > > >> rte_eth_rx_burst which needs to be post processed.
> > > >>>
> > > >>> Or You can have an additional variable to acculate it.
> > > >>>
> > > >>>> Also, adding traces is equivalent to adding statistics in L3fwd.
> > > >>>
> > > >>> Yes.
> > > >>>
> > > >>> If the sole purpose only stats then it is better to add status in
> > > >>> l3fwd without performance impact. I thought some thing else.
> > > >>>
> > > >>>>
> > > >>>>>>>
> > > >>>>>>>> ability to easily change parameters without having to
> > > >>>>>>>> recompile, which helps reduce debugging time significantly.
> > > >>>> We will not be able to fix this above issue.
> > > >>>
> > > >>> It depends on what you want to debug. Trace can be disabled at
> > runtime.
> > > >>
> > > >>
> > > >> DPDK has existing API's for application metrics but they are rarely
> > > >> used.
> > > >>
> > > >> Why not implement rte_metrics in l3fwd and proc-info?
> > > > This discussion has ended up as a stats discussion. But, we also need to
> > be able to change the configurable parameters easily.
> > > > If we implement the stats and ability to change the configurable
> > > > parameters, then it is essentially bringing in some of the
> > > > capabilities from
> > > testpmd to the sample application. I think that will result in lot more
> > > code in
> > the sample application and will make it complicated.
> > > >
> > > > Instead our proposal is to take L3fwd to testpmd and use all the
> > > > infra code that testpmd provides. We see that this approach results
> > > > in less
> > > amount of code added to DPDK overall.
> > > >
> > >
> > > Agree that it may help testing to have l3fwd support on the testpmd.
> > >
> > > Two concerns,
> > > 1) Testpmd already too complex.
> > > 2) Code duplication.
> > >
> > > For 1), if the l3fwd can be implemented in testpmd as new, independent
> > > forwarding mode, without touching rest of the testpmd, I think it can be
> > OK.
> Yes, this is what we have done. It is a new forwarding mode.
> We could remove some forwarding modes from testpmd. For ex: macfwd, macswap
> seem very similar to iofwd mode.

Not really, iofwd doesn't touch packet data at all, while macfwd and macswap change L2 headers. In fact I found all of them quite helpful (just for different cases), so please keep them.

>
> >
> > In fact, l3fwd is also quite big and complex:
> > $ wc -l examples/l3fwd/*.[h,c] |grep total
> > 6969 total
> >
> > Plus it will introduce extra dependencies (fib, lpm, hash, might-be acl?) I am
> > not sure it is a good idea to pull all these complexities into test-pmd.
> I do not suggest pulling all these in. In our case, I see that the ask is
> only on LPM. I am open to hearing what others see as the requirement.

Ok, but the l3fwd forwarding model is quite different from the current PMD one (egress queue selection, TX packet buffering, etc.). I suppose you'll need to pull all that in from l3fwd too?

>
> > I can't imagine that l3fwd app need ability to configure each and every PMD
> > parameter.
> > From my experience in l3fwd most of cycles are spent not in PMD itself, but
> > in actual packet processing: header parsing and checking, classification,
> > routing table lookup, etc.
> During our work, we had to experiment with burst size, rx/tx queue depths
> along with other PMD specific configuration parameters. The
> packet processing code remains the same and there is not much to optimize.

I think burst-size and rx/tx queue size can be added to l3fwd as new config parameters. That doesn't look like a major issue to me. PMD-specific parameters could be a problem... is there anything in particular you plan to use?

> >
> > > Not sure how to address 2), also lets say we want to add new feature
> > > to l3fwd, where it should go, to the sample or to the testpmd?
> L3fwd example will remain as the example. We have to duplicate the code into
> testpmd. If L3fwd example is changed, it needs to be
> changed in testpmd as well.

Usually code duplication is not a good sign. I understand that sometimes it is unavoidable, but why do we have to do it here?
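
To make the "new config parameters" idea above a bit more concrete, here is a rough sketch of how burst size and rx/tx descriptor counts could be read from the command line with getopt_long(), so they can be changed without recompiling. It is only an illustration, not l3fwd code: the option names (--burst-size, --rxd, --txd), the struct fwd_params type, the helper names and the accepted ranges are all assumptions made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <getopt.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for the tunables discussed above. */
struct fwd_params {
	uint16_t burst_size; /* packets requested per RX burst */
	uint16_t nb_rxd;     /* RX descriptors per queue */
	uint16_t nb_txd;     /* TX descriptors per queue */
};

/* Parse a decimal value and check it against [min, max]; 0 on success. */
static int
parse_u16(const char *s, uint16_t min, uint16_t max, uint16_t *out)
{
	char *end = NULL;
	unsigned long v;

	assert(s != NULL && out != NULL);
	errno = 0;
	v = strtoul(s, &end, 10);
	if (errno != 0 || end == s || *end != '\0' || v < min || v > max)
		return -1;
	*out = (uint16_t)v;
	return 0;
}

/*
 * Fill *p from the command line. Fields whose option is absent keep the
 * defaults the caller put in *p. The limits below are assumed values
 * for the sketch, not limits taken from any PMD.
 */
static int
parse_fwd_params(int argc, char **argv, struct fwd_params *p)
{
	static const struct option lopts[] = {
		{"burst-size", required_argument, NULL, 'b'},
		{"rxd",        required_argument, NULL, 'r'},
		{"txd",        required_argument, NULL, 't'},
		{NULL, 0, NULL, 0}
	};
	int opt;

	optind = 1; /* start scanning at the first argument */
	while ((opt = getopt_long(argc, argv, "", lopts, NULL)) != -1) {
		switch (opt) {
		case 'b':
			if (parse_u16(optarg, 1, 512, &p->burst_size) != 0)
				return -1;
			break;
		case 'r':
			if (parse_u16(optarg, 64, 4096, &p->nb_rxd) != 0)
				return -1;
			break;
		case 't':
			if (parse_u16(optarg, 64, 4096, &p->nb_txd) != 0)
				return -1;
			break;
		default:
			return -1; /* unknown option or missing argument */
		}
	}
	return 0;
}
```

The parsed values would then be passed to the queue setup and RX burst calls in place of compile-time constants. PMD-specific parameters are deliberately left out, since they do not map onto a generic option list like this.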