On Fri, 5 Jul 2019 at 13:31, Medvedkin, Vladimir <vladimir.medved...@intel.com> wrote:
>
> Hi Stephen,
>
> On 28/06/2019 16:35, Stephen Hemminger wrote:
> > On Fri, 28 Jun 2019 15:16:30 +0100
> > "Medvedkin, Vladimir" <vladimir.medved...@intel.com> wrote:
> >
> >> Hi Honnappa,
> >>
> >> On 28/06/2019 14:57, Honnappa Nagarahalli wrote:
> >>>> Hi all,
> >>>>
> >>>> On 28/06/2019 05:34, Stephen Hemminger wrote:
> >>>>> On Fri, 28 Jun 2019 02:44:54 +0000
> >>>>> "Ruifeng Wang (Arm Technology China)"<ruifeng.w...@arm.com> wrote:
> >>>>>
> >>>>>>>> Tests showed that the function inlining caused performance drop on
> >>>>>>>> some x86 platforms with the memory ordering patches applied.
> >>>>>>>> By force no-inline functions, the performance was better than
> >>>>>>>> before on x86 and no impact to arm64 platforms.
> >>>>>>>>
> >>>>>>>> Suggested-by: Medvedkin Vladimir<vladimir.medved...@intel.com>
> >>>>>>>> Signed-off-by: Ruifeng Wang<ruifeng.w...@arm.com>
> >>>>>>>> Reviewed-by: Gavin Hu<gavin...@arm.com>
> >>>>>>> {
> >>>>>>>
> >>>>>>> Do you actually need to force noinline, or is just taking off inline
> >>>>>>> enough?
> >>>>>>> In general, letting the compiler decide is often best practice.
> >>>>>> The forced noinline is an optimization for x86 platforms to keep
> >>>>>> rte_lpm_add() API performance with memory ordering applied.
> >>>>> I don't think you answered my question. What does a recent version of
> >>>>> GCC do if you drop the inline?
> >>>>>
> >>>>> Actually all the functions in rte_lpm should drop inline.
> >>>> I agree with Stephen. If it is not a fastpath and the size of the
> >>>> function is not minimal, it is good to remove the inline qualifier for
> >>>> other control plane functions such as rule_add/delete/find/etc and let
> >>>> the compiler decide whether to inline them (unless it affects
> >>>> performance).
> >>> IMO, the rule needs to be simple. If it is a control plane function, we
> >>> should leave it to the compiler to decide. I do not think we need to
> >>> worry too much about performance for control plane functions.
> >> Control plane speed is not as important as data plane speed, but it is
> >> still important. For lpm we are talking not about initialization, but
> >> about runtime route add/del related functions. If they are very slow the
> >> library will be totally unusable, because after it receives a route
> >> update it will be blocked for a long time and the route update queue
> >> would overflow.
> > Control plane performance is more impacted by algorithmic choice.
> > The original LPM had terrible (n^2?) control path. Current code is better.
> > I had a patch using RB tree, but it was rejected because it used the
> > /usr/include/bsd/sys/tree.h which added a dependency.
>
> You're absolutely right, control plane performance mostly depends on the
> algorithm. The current LPM implementation has a number of problems there.
> One problem is rules_tbl[], a flat array containing routes for control
> plane purposes. Replacing it with an rb-tree solves this problem, but
> there are other problems. For example, when you try to add a route
> 10.0.0.0/8 while a number of subroutes exist in the table (for example
> 10.20.0.0/16), the current implementation will load tbl_entry -> do
> some checks (depth, ext entry) -> conditionally store the new entry.
> Under some circumstances this takes a lot of time. But in fact it needs
> to unconditionally rewrite only two ranges: from 10.0.0.0 to
> 10.19.255.255 and from 10.21.0.0 to 10.255.255.255. And the control plane
> could help us to get these two ranges. The best struct to do so is an
> lc-tree, because it is relatively easy to traverse the subtree (described
> by 10.0.0.0/8) and get the subroutes. We are working on a new
> implementation; hopefully it will be ready soon.
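
The range computation described in the quoted paragraph can be illustrated with a minimal sketch in C. This is not rte_lpm code: struct prefix, struct range, ipv4() and uncovered_ranges() are hypothetical names made up for the illustration, addresses are in host byte order, and the subroutes are assumed to be handed over already sorted by address and contained in the parent. A plain sorted array stands in for whatever the control plane structure (rb-tree, lc-tree) would provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper types for illustration; not part of rte_lpm. */
struct prefix {
    uint32_t ip;     /* network address, host byte order */
    uint8_t depth;   /* prefix length, 0..32 */
};

struct range {
    uint32_t first;  /* first address, inclusive */
    uint32_t last;   /* last address, inclusive */
};

/* Build a host-order IPv4 address from its four octets. */
static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* Network mask for a prefix length; depth 0 is handled separately
 * because shifting a 32-bit value by 32 is undefined. */
static uint32_t prefix_mask(uint8_t depth)
{
    assert(depth <= 32);
    return depth == 0 ? 0 : (uint32_t)(UINT32_MAX << (32 - depth));
}

static uint32_t prefix_first(struct prefix p)
{
    return p.ip & prefix_mask(p.depth);
}

static uint32_t prefix_last(struct prefix p)
{
    return p.ip | ~prefix_mask(p.depth);
}

/*
 * Collect the address ranges inside `parent` that are not covered by any
 * of the `n` more-specific prefixes in `subs`.
 * Assumptions: `subs` is sorted by first address and every entry lies
 * inside `parent`; nested entries are tolerated.
 * Writes at most n + 1 ranges to `out` and returns how many were written.
 */
static size_t uncovered_ranges(struct prefix parent,
                               const struct prefix *subs, size_t n,
                               struct range *out)
{
    /* 64-bit cursors so that "last + 1" cannot wrap at 255.255.255.255. */
    uint64_t next = prefix_first(parent);
    uint64_t end = (uint64_t)prefix_last(parent) + 1;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t sub_first = prefix_first(subs[i]);
        uint64_t sub_end = (uint64_t)prefix_last(subs[i]) + 1;

        /* Gap between the cursor and this subroute: needs rewriting. */
        if (sub_first > next) {
            out[count].first = (uint32_t)next;
            out[count].last = (uint32_t)(sub_first - 1);
            count++;
        }
        /* Skip the subroute; never move the cursor backwards. */
        if (sub_end > next)
            next = sub_end;
    }
    /* Tail of the parent after the last subroute. */
    if (next < end) {
        out[count].first = (uint32_t)next;
        out[count].last = (uint32_t)(end - 1);
        count++;
    }
    return count;
}
```

Called with 10.0.0.0/8 as the parent and 10.20.0.0/16 as the only subroute, this yields the two ranges from the example: 10.0.0.0 to 10.19.255.255 and 10.21.0.0 to 10.255.255.255. Each range could then be stored unconditionally, without the per-entry depth checks.
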
Have you considered switching to this algorithm?
http://www.nxlab.fer.hr/dxr/

--
Alex