On Thu, 2017-05-11 at 04:14 +0000, Sekhar, Ashwin wrote:
...
> > > Combining all the above comments, I made some changes on top of
> > > your patch. These changes are giving 3-4% improvement over your
> > > version.
> > >
> > > You may find the changes at
> > > https://gist.github.com/ashwinyes/34cbdd999784402c859c71613587fafc
> > >
> > Is it correct that in line 103/104 you only process one packet in
> > the last FWDSTEP packets?
> It's doing processx4_* there, so it's processing 4 packets.
>
> > Actually, I don't like your change in l3fwd_lpm_send_packets,
> > making the simple logic complicated. And I don't think it can help
> > to improve performance. :-)
> It's not making it complicated. The number of lines of code may be
> higher by maybe 10 lines, but the conditions of the loops are
> simplified, which reduces the number of branch instructions and helps
> the processor go through them faster.
>
> If possible, please try it out on your machine.

Missed out one point. Since the 2 loops are of the form
"for (i = 0; i < FWDSTEP; i++)", i.e. looping for a constant number of
iterations, the compiler will easily unroll them.

Thanks
Ashwin

> > > Please check it out and let me know your comments.
> > >
> > > Thanks
> > > Ashwin
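
To illustrate the point about constant trip counts, here is a minimal
sketch in C. It is not the code from the patch or the gist: process_one()
and process_burst() are hypothetical names, and FWDSTEP is assumed to be 4
based on the processx4_* naming. The burst is split into full groups of
FWDSTEP packets plus a tail, so the inner loop always runs exactly FWDSTEP
times and a compiler can typically unroll it.

```c
#include <stdint.h>

#define FWDSTEP 4 /* assumed group size; not taken from the patch */

/* Hypothetical stand-in for per-packet work; not a DPDK function. */
static inline uint32_t process_one(uint32_t pkt)
{
    return pkt * 2u + 1u;
}

/* Accumulate results for n packets: full groups first, then the tail. */
uint32_t process_burst(const uint32_t *pkts, int n)
{
    uint32_t acc = 0;
    int full = n - (n % FWDSTEP); /* packets covered by full groups */
    int i, j;

    for (j = 0; j < full; j += FWDSTEP) {
        /* Constant trip count: a candidate for full unrolling. */
        for (i = 0; i < FWDSTEP; i++)
            acc += process_one(pkts[j + i]);
    }

    /* Tail: at most FWDSTEP - 1 packets, variable trip count. */
    for (; j < n; j++)
        acc += process_one(pkts[j]);

    return acc;
}
```

Whether the unrolling actually happens, and whether it yields a measurable
gain, depends on the compiler and optimization level, so it is worth
checking the generated assembly on the target machine.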