On 8/9/2021 7:52 AM, Zhihong Wang wrote:
> This patch aims to:
> 1. Add flexibility by supporting IP & UDP src/dst fields

What is the reason/"use case" of this flexibility?

> 2. Improve multi-core performance by using per-core vars
>

On multi-core this also has a synchronization problem, so OK to make it per-core. Do you have any observed performance difference, and if so, how much is it?

And can you please separate this into its own patch? This can come before the ip/udp update.

> v2: fix assigning ip header cksum
>

+1 to the update, can you please make it a separate patch?

So overall this can be a patchset with 4 patches:
1- Fix retry logic (nb_rx -> nb_pkt)
2- Use 'rte_ipv4_cksum()' API (instead of static 'ip_sum()')
3- Use per-core variable (for 'next_flow')
4- Support ip/udp src/dst variety of packets

> Signed-off-by: Zhihong Wang <wangzhihong....@bytedance.com>
> ---
>  app/test-pmd/flowgen.c | 137 +++++++++++++++++++++++++++++++------------------
>  1 file changed, 86 insertions(+), 51 deletions(-)
>

<...>

> @@ -185,30 +193,57 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
>  		}
>  		pkts_burst[nb_pkt] = pkt;
>
> -		next_flow = (next_flow + 1) % cfg_n_flows;
> +		if (++next_udp_dst < cfg_n_udp_dst)
> +			continue;
> +		next_udp_dst = 0;
> +		if (++next_udp_src < cfg_n_udp_src)
> +			continue;
> +		next_udp_src = 0;
> +		if (++next_ip_dst < cfg_n_ip_dst)
> +			continue;
> +		next_ip_dst = 0;
> +		if (++next_ip_src < cfg_n_ip_src)
> +			continue;
> +		next_ip_src = 0;

What is the logic here? Can you please clarify the packet generation logic both in a comment here and in the commit log?
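
As far as I can tell from the quoted hunk, the chain of increments behaves like an odometer: the UDP destination index moves fastest, and each wrap carries into the next index (UDP source, then IP destination, then IP source). If that reading is right, it may be worth saying so explicitly. A minimal standalone sketch of that reading, not the patch code; 'struct flow_idx', 'struct flow_cfg' and 'flow_idx_next()' are hypothetical names used only for illustration:

```c
/* Hypothetical container for the four flow indexes (not in the patch). */
struct flow_idx {
	unsigned int udp_dst, udp_src, ip_dst, ip_src;
};

/* Hypothetical limits, standing in for cfg_n_udp_dst and friends. */
struct flow_cfg {
	unsigned int n_udp_dst, n_udp_src, n_ip_dst, n_ip_src;
};

/*
 * Advance to the next flow: bump the fastest-moving index and
 * carry into the next one each time an index wraps back to 0.
 */
static void
flow_idx_next(struct flow_idx *f, const struct flow_cfg *c)
{
	if (++f->udp_dst < c->n_udp_dst)
		return;
	f->udp_dst = 0;
	if (++f->udp_src < c->n_udp_src)
		return;
	f->udp_src = 0;
	if (++f->ip_dst < c->n_ip_dst)
		return;
	f->ip_dst = 0;
	if (++f->ip_src < c->n_ip_src)
		return;
	f->ip_src = 0;
}
```

Under this reading, the sequence would repeat after n_udp_dst * n_udp_src * n_ip_dst * n_ip_src packets, which is the kind of statement that would help in the commit log.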

>  	}
>
>  	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
>  	/*
>  	 * Retry if necessary
>  	 */
> -	if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
> +	if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
>  		retry = 0;
> -		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> +		while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
>  			rte_delay_us(burst_tx_delay_time);
>  			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> -					&pkts_burst[nb_tx], nb_rx - nb_tx);
> +					&pkts_burst[nb_tx], nb_pkt - nb_tx);
>  		}

+1 to this fix, thanks for it. But can you please make a separate patch for this, with a proper 'Fixes:' tag etc.

>  	}
> -	fs->tx_packets += nb_tx;
>
>  	inc_tx_burst_stats(fs, nb_tx);
> -	if (unlikely(nb_tx < nb_pkt)) {
> -		/* Back out the flow counter. */
> -		next_flow -= (nb_pkt - nb_tx);
> -		while (next_flow < 0)
> -			next_flow += cfg_n_flows;
> +	fs->tx_packets += nb_tx;
> +	/* Catch up flow idx by actual sent. */
> +	for (i = 0; i < nb_tx; ++i) {
> +		RTE_PER_LCORE(_next_udp_dst) = RTE_PER_LCORE(_next_udp_dst) + 1;
> +		if (RTE_PER_LCORE(_next_udp_dst) < cfg_n_udp_dst)
> +			continue;
> +		RTE_PER_LCORE(_next_udp_dst) = 0;
> +		RTE_PER_LCORE(_next_udp_src) = RTE_PER_LCORE(_next_udp_src) + 1;
> +		if (RTE_PER_LCORE(_next_udp_src) < cfg_n_udp_src)
> +			continue;
> +		RTE_PER_LCORE(_next_udp_src) = 0;
> +		RTE_PER_LCORE(_next_ip_dst) = RTE_PER_LCORE(_next_ip_dst) + 1;
> +		if (RTE_PER_LCORE(_next_ip_dst) < cfg_n_ip_dst)
> +			continue;
> +		RTE_PER_LCORE(_next_ip_dst) = 0;
> +		RTE_PER_LCORE(_next_ip_src) = RTE_PER_LCORE(_next_ip_src) + 1;
> +		if (RTE_PER_LCORE(_next_ip_src) < cfg_n_ip_src)
> +			continue;
> +		RTE_PER_LCORE(_next_ip_src) = 0;
> +	}

Why are the per-core variables not used in the forward function, while local variables (like 'next_ip_src' etc.) are used instead? Is it for performance, and if so, what is the impact?

And why not assign directly from the local variables to the per-core variables, instead of having the catch-up loop above?
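
To make the last question concrete: the local indexes appear to advance once per generated packet (nb_pkt), while the catch-up loop advances the per-core ones once per sent packet (nb_tx), so the two would differ after a partial Tx burst. If that is the reason for the loop, one possible alternative is to advance the saved state by nb_tx arithmetically, as a mixed-radix addition with carry. A rough sketch under that assumption; 'struct flow_idx', 'struct flow_cfg' and 'flow_idx_advance()' are hypothetical names, not part of the patch or of DPDK, and all limits are assumed to be non-zero:

```c
#include <stdint.h>

/* Hypothetical container for the four flow indexes (not in the patch). */
struct flow_idx {
	uint32_t udp_dst, udp_src, ip_dst, ip_src;
};

/* Hypothetical limits, standing in for cfg_n_udp_dst etc.; all must be > 0. */
struct flow_cfg {
	uint32_t n_udp_dst, n_udp_src, n_ip_dst, n_ip_src;
};

/*
 * Advance the flow indexes by n steps in one go. Each index is one
 * digit of a mixed-radix counter: add to the fastest digit, keep the
 * remainder, and pass the quotient on as the carry for the next digit.
 */
static void
flow_idx_advance(struct flow_idx *f, const struct flow_cfg *c, uint32_t n)
{
	uint64_t v;

	v = (uint64_t)f->udp_dst + n;
	f->udp_dst = (uint32_t)(v % c->n_udp_dst);
	v = f->udp_src + v / c->n_udp_dst;
	f->udp_src = (uint32_t)(v % c->n_udp_src);
	v = f->ip_dst + v / c->n_udp_src;
	f->ip_dst = (uint32_t)(v % c->n_ip_dst);
	v = f->ip_src + v / c->n_ip_dst;
	f->ip_src = (uint32_t)(v % c->n_ip_src);
}
```

This trades the nb_tx-iteration loop for a handful of divisions. Whether that is actually cheaper in the forwarding path is something I have not measured, so please take it as an illustration of the question rather than a request.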