On Tue, Aug 10, 2021 at 5:12 PM Ferruh Yigit <ferruh.yi...@intel.com> wrote:
>
> On 8/10/2021 8:57 AM, 王志宏 wrote:
> > Thanks for the review Ferruh :)
> >
> > On Mon, Aug 9, 2021 at 11:18 PM Ferruh Yigit <ferruh.yi...@intel.com> wrote:
> >>
> >> On 8/9/2021 7:52 AM, Zhihong Wang wrote:
> >>> This patch aims to:
> >>>  1. Add flexibility by supporting IP & UDP src/dst fields
> >>
> >> What is the reason/"use case" of this flexibility?
> >
> > The purpose is to emulate pkt generator behaviors.
> >
>
> 'flowgen' forwarding already emulates a pkt generator, but it was only
> changing the destination IP.
>
> What additional benefit does changing the udp ports of the packets bring? What is
> your use case for this change?

Pkt generators like pktgen/trex/ixia/spirent can vary several fields,
including ip/udp src/dst.

Keeping the cfg_n_* knobs but setting cfg_n_ip_dst = 1024 and the others
to 1 leaves the default behavior exactly unchanged. Do you think that
makes sense?
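
To illustrate that point, here is a minimal, self-contained sketch in plain C
(not the testpmd code itself; the cfg_n_* names mirror the patch, the loop
count is made up). With cfg_n_ip_dst = 1024 and every other cfg_n_* set to 1,
only the dst IP offset ever rotates, i.e. the original flowgen behavior:

#include <stdint.h>
#include <stdio.h>

static const uint32_t cfg_n_ip_dst  = 1024;
static const uint32_t cfg_n_ip_src  = 1;
static const uint32_t cfg_n_udp_dst = 1;
static const uint32_t cfg_n_udp_src = 1;

int main(void)
{
	uint32_t ip_dst = 0, ip_src = 0, udp_dst = 0, udp_src = 0;
	int pkt;

	for (pkt = 0; pkt < 2050; pkt++) {
		/* a real packet would be built here from the base header
		 * fields plus the current per-field offsets */
		if (++udp_dst < cfg_n_udp_dst)
			continue;
		udp_dst = 0;
		if (++udp_src < cfg_n_udp_src)
			continue;
		udp_src = 0;
		if (++ip_dst < cfg_n_ip_dst)
			continue;
		ip_dst = 0;
		if (++ip_src < cfg_n_ip_src)
			continue;
		ip_src = 0;
	}
	/* only ip_dst moved: 2050 % 1024 == 2, all other offsets stay 0 */
	printf("ip_dst=%u ip_src=%u udp_dst=%u udp_src=%u\n",
	       ip_dst, ip_src, udp_dst, udp_src);
	return 0;
}
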

>
> >>
> >>>  2. Improve multi-core performance by using per-core vars
> >>
> >> On multi-core this also has a synchronization problem, so OK to make it
> >> per-core. Do you have any observed performance difference, and if so how
> >> much is it?
> >
> > Huge difference. One example: 8-core flowgen -> rxonly results: 43 Mpps
> > (per-core) vs. 9.3 Mpps (shared); of course the numbers "vary depending
> > on system configuration".
> >
>
> Thanks for clarification.
>
> >>
> >> And can you please separate this into its own patch? It can come before the
> >> ip/udp update.
> >
> > Will do.
> >
> >>
> >>> v2: fix assigning ip header cksum
> >>>
> >>
> >> +1 to the update, can you please make it a separate patch?
> >
> > Sure.
> >
> >>
> >> So overall this can be a patchset with 4 patches:
> >> 1- Fix retry logic (nb_rx -> nb_pkt)
> >> 2- Use 'rte_ipv4_cksum()' API (instead of static 'ip_sum()')
> >> 3- Use per-core variable (for 'next_flow')
> >> 4- Support varying ip/udp src/dst in the packets
> >>
> >
> > Great summary. Thanks a lot.
> >
> >>> Signed-off-by: Zhihong Wang <wangzhihong....@bytedance.com>
> >>> ---
> >>>  app/test-pmd/flowgen.c | 137 +++++++++++++++++++++++++++++++------------------
> >>>  1 file changed, 86 insertions(+), 51 deletions(-)
> >>>
> >>
> >> <...>
> >>
> >>> @@ -185,30 +193,57 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
> >>>               }
> >>>               pkts_burst[nb_pkt] = pkt;
> >>>
> >>> -             next_flow = (next_flow + 1) % cfg_n_flows;
> >>> +             if (++next_udp_dst < cfg_n_udp_dst)
> >>> +                     continue;
> >>> +             next_udp_dst = 0;
> >>> +             if (++next_udp_src < cfg_n_udp_src)
> >>> +                     continue;
> >>> +             next_udp_src = 0;
> >>> +             if (++next_ip_dst < cfg_n_ip_dst)
> >>> +                     continue;
> >>> +             next_ip_dst = 0;
> >>> +             if (++next_ip_src < cfg_n_ip_src)
> >>> +                     continue;
> >>> +             next_ip_src = 0;
> >>
> >> What is the logic here? Can you please clarify the packet generation logic,
> >> both in a comment here and in the commit log?
> >
> > It's round-robin field by field. Will add the comments.
> >
>
> Thanks. If the receiving end is doing RSS based on IP address, the dst address
> will change every 100 packets and the src every 10000 packets. This is
> a slight behavior change.
>
> When it was only the dst ip, it was simple to just increment it; I'm not sure
> about it in this case. I wonder if we should set all of them randomly for each
> packet. I don't know what the better logic is here, we can discuss it more in
> the next version.

More sophisticated pkt generators offer a choice among "step-by-step" /
"random" / etc.

Supporting multiple fields naturally brings the step-by-step behavior
implicitly. It won't be a problem, as it can be configured by setting the
cfg_n_* values as we discussed above.

I think rte_rand() is a good option; anyway, this can be tweaked easily
once the framework takes shape.
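
For reference, a hedged sketch of what the "random" option could look like
(not part of this patch; rand_field_offset() is a hypothetical helper, only
rte_rand() is an existing DPDK API):

#include <stdint.h>
#include <rte_random.h>

/* Pick one offset in [0, cfg_n) per field, per packet, instead of
 * round-robin. Modulo bias is negligible for traffic generation. */
static inline uint32_t
rand_field_offset(uint32_t cfg_n)
{
	return (uint32_t)(rte_rand() % cfg_n);
}

Each packet would then call it once per configured field, e.g.
rand_field_offset(cfg_n_ip_dst), when filling the headers.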

>
> >>
> >>>       }
> >>>
> >>>       nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
> >>>       /*
> >>>        * Retry if necessary
> >>>        */
> >>> -     if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
> >>> +     if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
> >>>               retry = 0;
> >>> -             while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> >>> +             while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
> >>>                       rte_delay_us(burst_tx_delay_time);
> >>>                       nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> >>> -                                     &pkts_burst[nb_tx], nb_rx - nb_tx);
> >>> +                                     &pkts_burst[nb_tx], nb_pkt - nb_tx);
> >>>               }
> >>
> >> +1 to this fix, thanks for it. But can you please make a separate patch for
> >> this, with a proper 'Fixes:' tag etc.?
> >
> > Ok.
> >
> >>
> >>>       }
> >>> -     fs->tx_packets += nb_tx;
> >>>
> >>>       inc_tx_burst_stats(fs, nb_tx);
> >>> -     if (unlikely(nb_tx < nb_pkt)) {
> >>> -             /* Back out the flow counter. */
> >>> -             next_flow -= (nb_pkt - nb_tx);
> >>> -             while (next_flow < 0)
> >>> -                     next_flow += cfg_n_flows;
> >>> +     fs->tx_packets += nb_tx;
> >>> +     /* Catch up flow idx by actual sent. */
> >>> +     for (i = 0; i < nb_tx; ++i) {
> >>> +             RTE_PER_LCORE(_next_udp_dst) = RTE_PER_LCORE(_next_udp_dst) + 1;
> >>> +             if (RTE_PER_LCORE(_next_udp_dst) < cfg_n_udp_dst)
> >>> +                     continue;
> >>> +             RTE_PER_LCORE(_next_udp_dst) = 0;
> >>> +             RTE_PER_LCORE(_next_udp_src) = RTE_PER_LCORE(_next_udp_src) + 1;
> >>> +             if (RTE_PER_LCORE(_next_udp_src) < cfg_n_udp_src)
> >>> +                     continue;
> >>> +             RTE_PER_LCORE(_next_udp_src) = 0;
> >>> +             RTE_PER_LCORE(_next_ip_dst) = RTE_PER_LCORE(_next_ip_dst) + 1;
> >>> +             if (RTE_PER_LCORE(_next_ip_dst) < cfg_n_ip_dst)
> >>> +                     continue;
> >>> +             RTE_PER_LCORE(_next_ip_dst) = 0;
> >>> +             RTE_PER_LCORE(_next_ip_src) = RTE_PER_LCORE(_next_ip_src) + 1;
> >>> +             if (RTE_PER_LCORE(_next_ip_src) < cfg_n_ip_src)
> >>> +                     continue;
> >>> +             RTE_PER_LCORE(_next_ip_src) = 0;
> >>> +     }
> >>
> >> Why are per-core variables not used in the forward function, but local
> >> variables (like 'next_ip_src' etc.) used instead? Is it for performance;
> >> if so, what is the impact?
> >>
> >> And why not directly assign from the local variables to the per-core
> >> variables, instead of the catch-up loop above?
> >>
> >>
> >
> > Local vars are used to generate the pkts; the global ones catch up at the
> > end, once nb_tx is known.
>
> Why are you not using the global ones to generate the packets? Wouldn't that
> remove the need for the catch up?

When there are multiple fields, backing out the overrun of the indexes caused
by dropped packets is not that straightforward -- it's the "carry" problem in
addition.
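
One way to sidestep the carry (a minimal sketch, not the patch's code;
flow_from_index() and its parameters are hypothetical) would be to keep a
single flat flow index per lcore and derive the per-field offsets
arithmetically, so back-out and catch-up each become one add/subtract on a
single counter:

#include <stdint.h>

struct flow_fields {
	uint32_t udp_dst, udp_src, ip_dst, ip_src;
};

/* Decompose a flat flow index into per-field offsets; udp dst varies
 * fastest, matching the round-robin order in the patch. */
static inline struct flow_fields
flow_from_index(uint64_t idx,
		uint32_t n_udp_dst, uint32_t n_udp_src,
		uint32_t n_ip_dst, uint32_t n_ip_src)
{
	struct flow_fields f;

	f.udp_dst = idx % n_udp_dst; idx /= n_udp_dst;
	f.udp_src = idx % n_udp_src; idx /= n_udp_src;
	f.ip_dst  = idx % n_ip_dst;  idx /= n_ip_dst;
	f.ip_src  = idx % n_ip_src;
	return f;
}

With that, the back out is again a single "next_flow -= nb_pkt - nb_tx;" as
in the original single-field code, only on a per-lcore counter.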

>
> > So the flow indexes only advance by the number of packets actually sent.
> > It serves the same purpose as the original "/* backout the flow counter */".
> > My math isn't good enough to make it look more intelligent, though.
> >
>
> Maybe I am missing something, but for this case why not just assign back from
> the locals to the globals?

As above.

However, this can be simplified if we discard the "back out" mechanism: say we
generate 32 pkts and 20 of them are sent while the remaining 12 are dropped --
the question is whether the idx should start from 21 or from 33 next time.
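
If starting from 33 is acceptable, the whole catch-up loop collapses into one
statement. A rough sketch under that assumption ('_next_flow' is a
hypothetical flat per-lcore counter, not a variable from the patch):

#include <stdint.h>
#include <rte_per_lcore.h>

RTE_DEFINE_PER_LCORE(uint64_t, _next_flow); /* hypothetical flat counter */

/* Called once per burst, after rte_eth_tx_burst() and the retry loop. */
static inline void
advance_flow(uint16_t nb_pkt, uint16_t nb_tx)
{
	/* "start from 33": count every prepared packet, drops skip flows */
	RTE_PER_LCORE(_next_flow) += nb_pkt;

	/* "start from 21": count only sent packets, so dropped flows are
	 * re-generated in the next burst:
	 * RTE_PER_LCORE(_next_flow) += nb_tx;
	 */
	(void)nb_tx;
}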
