> > > >>>>>>>>>>>> On Thu, Mar 11, 2021 at 12:01 AM Honnappa Nagarahalli
> > > >>>>>>>>>>>> <honnappa.nagaraha...@arm.com> wrote:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Hello,
> > > >>>>>>>>>>>>>         Performance of L3fwd example application is one
> > > >>>>>>>>>>>>> of the key
> > > >>>>>>>>>>>> benchmarks in DPDK. However, the application does not
> > > >>>>>>>>>>>> have many debugging statistics to understand the
> > > >>>>>>>>>>>> performance issues. We have added L3fwd as another
> > > >>>>>>>>>>>> mode/stream to testpmd which provides
> > > >>>>>>>>>> enough
> > > >>>>>>>>>>>> statistics at various levels. This has allowed us to
> > > >>>>>>>>>>>> debug the performance issues effectively.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> There is more work to be done to get it to upstreamable
> > > >>>>>>>>>>>>> state. I am
> > > >>>>>>>>>>>> wondering if such a patch is helpful for others and if
> > > >>>>>>>>>>>> the community would be interested in taking a look.
> > > >>>>>>>>>>>> Please let me know
> > > >>>>>>>>> what you think.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> We are using app/proc-info/ to attach and analyze the
> > > >>>>> performance.
> > > >>>>>>>>>>>> That helps to analyze the unmodified application. I
> > > >>>>>>>>>>>> think, if something is missing in proc-info app, in my
> > > >>>>>>>>>>>> opinion it is better to enhance proc-info so that it can
> > > >>>>>>>>>>>> help other third-party
> > > >>>>>>> applications.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Just my 2c.
> > > >>>>>>>>>>> Thanks Jerin. We will explore that.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I agree it is dangerous to rely too much on testpmd for
> > > >> everything.
> > > >>>>>>>>>> Please tell us what in testpmd could be useful out of it.
> > > >>>>>>>>>>
> > > >>>>>>>>> Things that are very helpful in testpmd are: 1) HW
> > > >>>>>>>>> statistics from the NIC 2) Forwarding stats 3) Burst stats
> > > >>>>>>>>> (indication of headroom
> > > >>>>>>>>> availability) 4) Easy to set parameters like RX and TX queue
> > > >>>>>>>>> depths (among others) without having to recompile.
> > > >>>>>>>>
> > > >>>>>>>> [Kathleen Capella]
> > > >>>>>>>> Thank you for the suggestion of app/proc-info. I've tried it
> > > >>>>>>>> out with l3fwd and see that it does have the HW stats from
> > > >>>>>>>> the NIC and the forwarding
> > > >>>>>>> stats.
> > > >>>>>>>> However, it does not have the burst stats testpmd offers, nor
> > > >>>>>>>> the
> > > >>>>>>>
> > > >>>>>>> One option to get such a level of debugging would be to:
> > > >>>>>>> - Create a memzone in the primary process
> > > >>>>>>> - Application under test can update the stats in memzone based
> > > >>>>>>> on the code flow
> > > >>>>>>> - proc-info can read the counters updated by application under
> > > >>>>>>> test using the memzone object got through
> > > >> rte_memzone_lookup()
> > > >>>>>> Agreed. Currently, using app/proc-info does not provide this
> > > >>>>>> ability. We
> > > >>>>> cannot add this capability to app/proc-info as these stats would
> > > >>>>> be specific to L3fwd application.
> > > >>>>>
> > > >>>>> I meant creating generic counter-read/write infra via memzone to
> > > >>>>> not make it as l3fwd specific.
> > > >>>> Currently, app/proc-info is able to print the stats as they are
> > > >>>> standardized
> > > >> via the API. But for statistics that are generated in the
> > > >> application, they are very specific to that application. For ex:
> > > >> burst stats in testpmd are very specific to it and another
> > > >> application might implement the same in a very different manner.
> > > >>>>
> > > >>>> It needs to be something like: the app/proc-info just needs to be
> > > >>>> a dumb
> > > >> displaying utility and the application has to do all the heavy
> > > >> lifting of copying the exact display strings to the memory.
> > > >>>
> > > >>> Yes.
> > > >>>
> > > >>>>
> > > >>>>>>>
> > > >>>>>>> Another approach will be using rte_trace()[1] for
> > > >>>>>>> debugging/tracing by adding tracepoints in l3fwd for such events.
> > > >>>>>>> It has a timestamp and the trace format is opensource trace
> > > >>>>>>> format (CTF, Common Trace Format), so that we can use
> > > >>>>>>> post-processing tools to analyze.
> > > >>>>>>> [1]
> > > >>>>>>> https://doc.dpdk.org/guides/prog_guide/trace_lib.html
> > > >>>>>> This is good for analyzing an incident. I think it is an
> > > >>>>>> overhead for
> > > >>>>> development purposes.
> > > >>>>>
> > > >>>>> Consider if one wants to add burst stats, one can add stats
> > > >>>>> increment under RTE_TRACE_POINT_FP, it will be emitted
> > whenever
> > > >>>>> code flows through that path. The set of events can be viewed in
> > > >>>>> trace viewer[1]. Would that be enough?
> > > >>>>> Adding traces to l3fwd can be upstreamed as it is useful for
> > > >>>>> others for debugging.
> > > >>>>>
> > > >>>>> [1]
> > > >>>>> https://github.com/jerinjacobk/share/blob/master/dpdk_trace.JPG
> > > >>>> This needs post processing of the trace info to derive the
> > > >>>> information, is it
> > > >> correct? For ex: for burst stats, there will be several traces
> > > >> generated collecting the number of packets returned by
> > > >> rte_eth_rx_burst which needs to be post processed.
> > > >>>
> > > >>> Or you can have an additional variable to accumulate it.
> > > >>>
> > > >>>> Also, adding traces is equivalent to adding statistics in L3fwd.
> > > >>>
> > > >>> Yes.
> > > >>>
> > > >>> If the sole purpose is only stats, then it is better to add stats in
> > > >>> l3fwd without performance impact. I thought it was something else.
> > > >>>
> > > >>>>
> > > >>>>>>>
> > > >>>>>>>> ability to easily change parameters without having to
> > > >>>>>>>> recompile, which helps reduce debugging time significantly.
> > > >>>> We will not be able to fix this above issue.
> > > >>>
> > > >>> It depends on what you want to debug. Trace can be disabled at
> > runtime.
> > > >>
> > > >>
> > > >> DPDK has existing API's for application metrics but they are rarely 
> > > >> used.
> > > >>
> > > >> Why not implement rte_metrics in l3fwd and proc-info?
> > > > This discussion has ended up as a stats discussion. But, we also need to
> > be able to change the configurable parameters easily.
> > > > If we implement the stats and ability to change the configurable
> > > > parameters, then it is essentially bringing in some of the
> > > > capabilities from
> > > testpmd to the sample application. I think that will result in lot more 
> > > code in
> > the sample application and will make it complicated.
> > > >
> > > > Instead our proposal is to take L3fwd to testpmd and use all the
> > > > infra code that testpmd provides. We see that this approach results
> > > > in less
> > > amount of code added to DPDK overall.
> > > >
> > >
> > > Agree that it may help testing to have l3fwd support on the testpmd.
> > >
> > > Two concerns,
> > > 1) Testpmd is already too complex.
> > > 2) Code duplication.
> > >
> > > For 1), if the l3fwd can be implemented in testpmd as new, independent
> > > forwarding mode, without touching rest of the testpmd, I think it can be
> > OK.
> Yes, this is what we have done. It is a new forwarding mode.
> We could remove some forwarding modes from testpmd. For ex: macfwd, macswap 
> seem very similar to iofwd mode.

Not really, iofwd doesn't touch packet data at all, while macfwd and macswap
change L2 headers.
In fact I found all of them quite helpful (just for different cases), so please 
keep them.
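
To make the difference concrete, here is a minimal sketch in plain C (not
the actual testpmd code; the function names fwd_io and fwd_macswap and the
raw-buffer frame layout are assumptions for illustration only). The io-style
path never reads or writes the frame, while the macswap-style path rewrites
the first 12 bytes of the Ethernet header:

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6 /* length of an Ethernet MAC address in bytes */

/* io-style forwarding (illustrative): the frame bytes are left untouched. */
static void fwd_io(uint8_t *frame)
{
    (void)frame;
}

/*
 * macswap-style forwarding (illustrative): exchange the destination MAC
 * (bytes 0..5) and the source MAC (bytes 6..11) in place. The EtherType
 * and payload that follow are not modified.
 */
static void fwd_macswap(uint8_t *frame)
{
    uint8_t tmp[MAC_LEN];

    memcpy(tmp, frame, MAC_LEN);
    memcpy(frame, frame + MAC_LEN, MAC_LEN);
    memcpy(frame + MAC_LEN, tmp, MAC_LEN);
}
```

So the macswap-style path pulls the packet header into the core's cache and
dirties it, while the io-style path does not, which is why they exercise
different cases.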

> 
> >
> > In fact, l3fwd is also quite big and complex:
> > $ wc -l examples/l3fwd/*.[h,c] |grep total
> >   6969 total
> >
> > Plus it will introduce extra dependencies (fib, lpm, hash, might-be acl?) I 
> > am
> > not sure it is a good idea to pull all these complexities into test-pmd.
> I do not suggest pulling all these in. In our case, I see that the ask is 
> only on LPM. I am open to hearing what others see as the requirement.

Ok, but the l3fwd forwarding model is quite different from the current testpmd one
(egress queue selection, TX packets buffering, etc.).
I suppose you'll need to pull all that too from l3fwd?

> 
> > I can't imagine that l3fwd app need ability to configure each and every PMD
> > parameter.
> > From my experience in l3fwd most of cycles are spent not in PMD itself, but
> > in actual packet processing: header parsing and checking, classification,
> > routing table lookup, etc.
> During our work, we had to experiment with burst size, rx/tx queue depths 
> along with other PMD specific configuration parameters. The
> packet processing code remains the same and there is not much to optimize.

I think burst-size and rx/tx queue size can be added into l3fwd as new config 
parameters.
Doesn't look like a major issue to me.
PMD specific parameters could be a problem... anything particular you plan to 
use?
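
As a rough sketch of what I mean by new config parameters for burst size and
rx/tx queue size (the option names --burst, --rxd and --txd, the struct and
the function names below are hypothetical, not existing l3fwd code), the
parsing could look something like this with getopt_long():

```c
#include <getopt.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical runtime settings that would replace compile-time constants. */
struct fwd_config {
    uint16_t burst_size; /* max packets per RX/TX burst */
    uint16_t nb_rxd;     /* RX queue depth (descriptors) */
    uint16_t nb_txd;     /* TX queue depth (descriptors) */
};

/* Parse a decimal value in [1, 65535]; return 0 on success, -1 on error. */
static int parse_u16(const char *s, uint16_t *out)
{
    char *end;
    unsigned long v = strtoul(s, &end, 10);

    if (*s == '\0' || *end != '\0' || v == 0 || v > UINT16_MAX)
        return -1;
    *out = (uint16_t)v;
    return 0;
}

/* Fill cfg from long options; fields without an option keep their value. */
static int parse_fwd_args(int argc, char **argv, struct fwd_config *cfg)
{
    static const struct option opts[] = {
        {"burst", required_argument, NULL, 'b'},
        {"rxd",   required_argument, NULL, 'r'},
        {"txd",   required_argument, NULL, 't'},
        {NULL, 0, NULL, 0},
    };
    int opt;

    optind = 1; /* restart scanning so the function can be called again */
    while ((opt = getopt_long(argc, argv, "", opts, NULL)) != -1) {
        uint16_t *dst;

        switch (opt) {
        case 'b': dst = &cfg->burst_size; break;
        case 'r': dst = &cfg->nb_rxd; break;
        case 't': dst = &cfg->nb_txd; break;
        default:  return -1; /* unknown option or missing argument */
        }
        if (parse_u16(optarg, dst) != 0)
            return -1;
    }
    return 0;
}
```

The caller would initialise the struct with the current defaults, call
parse_fwd_args() once at startup, and then use the resulting values for the
queue setup and burst calls instead of the fixed defines.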
 
> >
> > > Not sure how to address 2), also lets say we want to add new feature
> > > to l3fwd, where it should go, to the sample or to the testpmd?
> L3fwd example will remain as the example. We have to duplicate the code into 
> testpmd. If L3fwd example is changed, it needs to be
> changed in testpmd as well.

Usually code duplication is not a good sign.
I understand that sometimes it is unavoidable, but why do we have to do it here?
