Yes, ALWAYS_INLINE looks a good solution. miniflow_extract() is one of the most cpu-consuming functions. Perf tool in a phy-to-phy test without CT was infact showing
29% emc_processing() 20% miniflow_extract() <== 13% _recv_raw_pkts_vec() 11% ixgbe_xmit_pkts_vec() 8% dp_netdev_process_rxq_port() 3% netdev_dpdk_rxq_recv() ... > -----Original Message----- > From: Daniele Di Proietto [mailto:diproiet...@vmware.com] > Sent: Monday, February 1, 2016 7:38 PM > To: Fischetti, Antonio > Cc: dev@openvswitch.org > Subject: Re: [ovs-dev] [PATCH 03/12] flow: Introduce parse_dl_type(). > > Hi Antonio, > > Thanks for investigating and sharing these results. > > You're right, this patch changes slightly the assembly output in > miniflow_extract() and introduces a performance regression. > > > On 01/02/2016 03:13, "Fischetti, Antonio" <antonio.fische...@intel.com> > wrote: > > >Hi, I was running some Regression Test to check performance in a > >Phy-2-Phy test, ie > >ovs-ofctl add-flow br0 in_port=1,action=output:2 > > > >After applying patches #1 and then #2 the performance > >is still ok. > >Instead, after I apply this patch #3 the throughput goes from 12.1 > >down to 11.6 Mpps. > > > >I found out 2 strange alternative ways to restore performance. > >Either: > >1. Comment the new parse_dl_type() function. I know it's odd > >because it isn't yet called from anywhere. > > I guess that the compiler doesn't know that when it's compiling > flow.c. > > >Or: > >2. Adding these weird changes to the 2 function prototypes which are > >called by parse_dl_type() > > > >- static inline ovs_be16 > >+/*static inline*/ ovs_be16 > >parse_vlan(const void **datap, size_t *sizep) > > > >and > > > >- static inline ovs_be16 > >+ /*static*/ inline ovs_be16 > >parse_ethertype(const void **datap, size_t *sizep) > > It looks like my patch prevents the compiler from inlining part of > these functions, and this confirms it. > > I propose the following incremental, that seems to resolve the > regression. What do you think? > > ---8<--- > > diff --git a/lib/flow.c b/lib/flow.c > index 08e199c..238a7f5 100644 > --- a/lib/flow.c > +++ b/lib/flow.c > @@ -290,7 +290,7 @@ parse_mpls(const void **datap, size_t *sizep) > return MIN(count, FLOW_MAX_MPLS_LABELS); > } > > -static inline ovs_be16 > +static inline ALWAYS_INLINE ovs_be16 > parse_vlan(const void **datap, size_t *sizep) > { > const struct eth_header *eth = *datap; > @@ -312,7 +312,7 @@ parse_vlan(const void **datap, size_t *sizep) > return 0; > } > > -static inline ovs_be16 > +static inline ALWAYS_INLINE ovs_be16 > parse_ethertype(const void **datap, size_t *sizep) > { > const struct llc_snap_header *llc; > > > ---8<--- > > I'm not a fan of this kind of optimizations, but I believe that > miniflow_extract() deserves some special treatment. > > Thanks, > > Daniele > > > > > >It looks like some compiler optimization is failing for some reason? > >I didn't investigate further on the output map of the executable. > > > >Antonio > > > > > >> -----Original Message----- > >> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Daniele > Di > >> Proietto > >> Sent: Monday, November 16, 2015 6:21 AM > >> To: dev@openvswitch.org > >> Subject: [ovs-dev] [PATCH 03/12] flow: Introduce parse_dl_type(). > >> > >> The function simply returns the ethernet type of the packet (after > >> eventually discarding the VLAN tag). It will be used by a following > >> commit. > >> > >> Signed-off-by: Daniele Di Proietto <diproiet...@vmware.com> > >> --- > >> lib/flow.c | 10 ++++++++++ > >> lib/flow.h | 1 + > >> 2 files changed, 11 insertions(+) > >> > >> diff --git a/lib/flow.c b/lib/flow.c > >> index 2bdce26..d8c1228 100644 > >> --- a/lib/flow.c > >> +++ b/lib/flow.c > >> @@ -789,6 +789,16 @@ miniflow_extract(struct dp_packet *packet, > struct > >> miniflow *dst) > >> dst->map = mf.map; > >> } > >> > >> +ovs_be16 > >> +parse_dl_type(const struct eth_header *data_, size_t size) > >> +{ > >> + const void *data = data_; > >> + > >> + parse_vlan(&data, &size); > >> + > >> + return parse_ethertype(&data, &size); > >> +} > >> + > >> /* For every bit of a field that is wildcarded in 'wildcards', sets the > >> * corresponding bit in 'flow' to zero. */ > >> void > >> diff --git a/lib/flow.h b/lib/flow.h > >> index de15a98..7465af8 100644 > >> --- a/lib/flow.h > >> +++ b/lib/flow.h > >> @@ -237,6 +237,7 @@ void flow_compose(struct dp_packet *, const > struct > >> flow *); > >> > >> bool parse_ipv6_ext_hdrs(const void **datap, size_t *sizep, uint8_t > >> *nw_proto, > >> uint8_t *nw_frag); > >> +ovs_be16 parse_dl_type(const struct eth_header *data_, size_t size); > >> > >> static inline uint64_t > >> flow_get_xreg(const struct flow *flow, int idx) > >> -- > >> 2.1.4 > >> > >> _______________________________________________ > >> dev mailing list > >> dev@openvswitch.org > >> http://openvswitch.org/mailman/listinfo/dev _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev