> On Feb 4, 2016, at 6:08 AM, Fischetti, Antonio <antonio.fische...@intel.com> > wrote: > > Hi William, > Thanks for your interest on this idea. > I'm currently working to provide a patch, once it is ready I'll be glad to > share. > > In order to compare the throughput between the classifier and the acl > I changed the code to bypass the EMC table. So in both cases any packet > was hitting the classifier/acl. That's just for my tests, of course. >
Could you share the comparable numbers WITH the EMC cache? Jarno > I installed flows like > sudo ./utilities/ovs-ofctl add-flow br0 > dl_type=0x0800,nw_dst=34.34.34.34,action=output:2 > sudo ./utilities/ovs-ofctl add-flow br0 > idle_timeout=0,dl_type=0x0800,nw_dst=34.34.34.35,dl_src=01:02:03:04:05:08,action=output:2 > sudo ./utilities/ovs-ofctl add-flow br0 > dl_type=0x0800,nw_proto=17,udp_dst=64,action=output:2 > sudo ./utilities/ovs-ofctl add-flow br0 > dl_src=01:02:03:04:05:06,action=output:2 > sudo ./utilities/ovs-ofctl add-flow br0 > dl_type=0x0800,nw_proto=17,actions=output:2 > sudo ./utilities/ovs-ofctl add-flow br0 > dl_type=0x0800,nw_tos=2,actions=output:2 > sudo ./utilities/ovs-ofctl add-flow br0 vlan_tci=0,actions=output:2 > > so that the classifier was using 6 Subtables. > > > Antonio > > > From: William Tu [mailto:u9012...@gmail.com] > Sent: Wednesday, February 3, 2016 11:52 PM > To: Fischetti, Antonio > Cc: dev@openvswitch.org > Subject: Re: [ovs-dev] Wildcard Matching optimization idea > > Hi Fischetti, > I'm interested in trying your experiment with 2 DPDK ACL table swapping. Do > you have a patch for me to start with? Or any comments about how to reproduce > your experiment results? (the one comparing classifier and DPDK ACL) >> +------------+------------+ >> | Classifier | 2 ACLs | >> +----------------+------------+------------+ >> | Max Throughput | 2.2 | 5.4 | >> | [Mpps] | | | >> +----------------+------------+------------+ > Thank you > William > > On Thu, Jan 14, 2016 at 7:28 AM, Fischetti, Antonio > <antonio.fische...@intel.com> wrote: > Thanks Wei Li, please see my replies inline. > If I missed something or for further detail, just let me know. > >> -----Original Message----- >> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Wei Li >> Sent: Thursday, January 14, 2016 6:38 AM >> To: dev@openvswitch.org >> Subject: Re: [ovs-dev] Wildcard Matching optimization idea >> >> insert a rule to "mirror" and swap "mirror" and "real" >> >> then real does not contain the new inserted rule > > Correct, it needs to be updated as well. > >> >> how to sync? take the "long" time as inserting to "mirror"? >> > > Yes, it will take the same long time. > The thread that has previously inserted the rule into what was the Shadow > (or "mirror") will repeat the same work on the other table. > > Below is the sequence with few more details to show the latency for a new > rule. > > 1. Insert the new rule into the background table > ---------------------------------------------------------- > Please Note: while this is happening the new rule will not affect the lookups > as they're always carried out on the foreground Table. > Background table updated? Go to Step #2. > > 2. Swap tables > ------------------ > The updated table is 'moved' to the foreground so that lookups are now > affected by the new rule. > Anyway we're not done yet, we updated just one of the 2 tables. > In case further new rules need to be added at this stage, they will get > buffered > in a 'waiting list' to be processed later. > Background table updated? Go to Step #3. > > 3. Done > ---------- > Both tables are synced up. > Is there a new rule to insert and/or anything in the 'waiting list'? Go to > Step #1. > > >> 在 2016/1/12 18:37, Fischetti, Antonio 写道: >>> Thanks Zoltan for your questions. >>> >>> Anyone else has any new feedback? >>> >>>> -----Original Message----- >>>> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Fischetti, >>>> Antonio >>>> Sent: Friday, December 18, 2015 3:44 PM >>>> To: Zoltan Kiss; dev@openvswitch.org >>>> Subject: Re: [ovs-dev] Wildcard Matching optimization idea >>>> >>>> >>>>> -----Original Message----- >>>>> From: Zoltan Kiss [mailto:zoltan.k...@linaro.org] >>>>> Sent: Friday, December 18, 2015 12:38 PM >>>>> To: Fischetti, Antonio; dev@openvswitch.org >>>>> Subject: Re: [ovs-dev] Wildcard Matching optimization idea >>>>> >>>>> >>>>> >>>>> On 17/12/15 16:23, Fischetti, Antonio wrote: >>>>>> Hi Zoltan, thanks for your questions. >>>>>> Please find below my answers inline. >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Zoltan Kiss [mailto:zoltan.k...@linaro.org] >>>>>>> Sent: Thursday, December 17, 2015 2:33 PM >>>>>>> To: Fischetti, Antonio; dev@openvswitch.org >>>>>>> Subject: Re: [ovs-dev] Wildcard Matching optimization idea >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 17/12/15 10:41, Fischetti, Antonio wrote: >>>>>>>> Hi All, >>>>>>>> Here's an optimization idea for the datapath classifier table. >>>>>>>> I'd like to get some feedback. >>>>>>>> >>>>>>>> I used the DPDK ACL tables. They can perform a wildcarded matching >>>>> and >>>>>>> each >>>>>>>> lookup requires less CPU cycles than the Classifier. >>>>>>>> Anyway there's a negative aspect with ACLs. They take a very long >> time >>>>> to >>>>>>>> insert a new Rule. >>>>>>>> It can be 50 times greater than an insertion into the Classifier. See >>>> Note >>>>>>> below >>>>>>>> for further details. >>>>>>>> >>>>>>>> So a simple 1:1 replacement of the Classifier with an ACL table is not >> a >>>>>>> viable >>>>>>>> solution. >>>>>>>> >>>>>>>> The idea described below is instead to replace the Classifier with 2 >> ACL >>>>>>>> tables. One is the 'Operating', while the other is a 'Shadow' table. >>>>>>>> >>>>>>>> Any lookup will be performed on the Operating table. >>>>>>>> >>>>>>>> Instead any new insertion will be executed on the Shadow table by >>>>> means >>>>>>> of a >>>>>>>> separate thread. >>>>>>>> After the insertion is done, the 2 tables will be swapped. >>>>> Are you swapping after each insertion, or in batches? >>>> In batches. >>>> >>>>> The new shadow table needs to get updated first to be sync with >>>> Operating, does it take >>>>> a similar amount of time? >>>> No, the shadow table acts as a 'mirror' of the operating. So the 2 >>>> tables are supposed to contain exactly the same entries. >>>> An exception is during the transient insertion procedure. But after it >>>> is completed the 2 tables will contain again the same entries. >>>> >>>>> Instead of having this 2 table, how about have one, and make it possible >>>>> that you can look up while an insertion is on place? Something in an RCU >>>>> fashion? >>>> Unluckly not. While an insertion is taking place it is not possible to >>>> access >> the >>>> ACL >>>> to read its entries. >>>> More precisely, an ACL insertion means 2 actions: add + rebuild. The >> rebuild >>>> takes >>>> the 95% cpu cycles of all the insertion. >>>> You could read while the 'add' is in progress. Instead you can't read while >> the >>>> 'rebuild' is still happening. >>>> That's why I'm using 2 ACL tables. >>>> >>>>>>> So while this insertion happens, you still look up in the actual >>>>>>> Operating table. >>>>>> Yes, while insertion is in progress any lookup will still be carried out >>>>>> on >>>> the >>>>>> Operating table. >>>>> I don't know how the classifier works exactly, but is the following >>>>> scenario possible?: >>>>> >>>>> Rule A matches a flow and specifies an action. A new insertion would >>>>> essentially remove Rule A and add B which matches the same flow but >>>>> specifies a different action. While that happens, packets would still >>>>> match A, while the expectation probably would be to match B. >>>>> >>>> The same issue can happen with the Classifier, in this case it will be >>>> worse. >>>> The solution with 2 ACLs have an insertion latency much longer. >>>> That is because an ACL insertion can be about 50 times greater than an >>>> insertion into the Classifier. >>>> >>>>>>> What happens if you have a new insertion in the meantime? >>>>>> The new Rule gets buffered into a 'wait' queue. >>>>>> >>>>>>> Especially, what happens if your lookup yields the same rule >>>>>>> which is inserted at the moment? >>>>>> That's a good point. At the current stage it is simply added into the >> wait >>>>> queue. >>>>>> So I could potentially have duplications where different rules into the >> ACL >>>>> are >>>>>> referring to the same netdev-flow. >>>>>> To avoid these duplications there could be 2 approaches. >>>>>> One option would be to check that in the wait queue that rule is not >>>>> present. >>>>>> Another option would be to store it into the wait queue anyway and >> then >>>>> check >>>>>> that the ACL does not already contain that rule. >>>>> >>>>> >>>>> >>>>>>>> Thus the Shadow table will now become the Operating one, and >>>>> viceversa. >>>>>>>> >>>>>>>> Is the following ok with real use cases? >>>>>>>> ======================================== >>>>>>>> An Assumption was made: new sets of Rules arrive with a frequency >>>>> lower >>>>>>>> than 1 (Rule Sets)/sec. >>>>>>>> Would this be ok with real use cases? >>>>>>>> >>>>>>>> >>>>>>>> Performance Figures >>>>>>>> =================== >>>>>>>> The table below refers to a mono-directional test where the >>>>> performance >>>>>>> is >>>>>>>> compared between the 2 implementations. >>>>>>>> Some Flows were installed so that the Classifier was using 7 >> SubTables. >>>>>>>> The ACL Rule format was {Protocol, IPdest, MACsrc, UdpPortDest, >> ToS, >>>>>>> VlanTci}. >>>>>>>> The performance figures are expressed in Mpps. >>>>>>>> >>>>>>>> +------------+------------+ >>>>>>>> | Classifier | 2 ACLs | >>>>>>>> +----------------+------------+------------+ >>>>>>>> | Max Throughput | 2.2 | 5.4 | >>>>>>>> | [Mpps] | | | >>>>>>>> +----------------+------------+------------+ >>>>>>>> >>>>>>>> >>>>>>>> Conclusions >>>>>>>> =========== >>>>>>>> At this stage it would really be helpful to have an initial feedback >> from >>>>> the >>>>>>>> Community. Any comment or suggestion will be useful to drive >> further >>>>>>>> developments. >>>>>>>> >>>>>>>> >>>>>>>> References >>>>>>>> ========== >>>>>>>> DPDK ACL Rules, how to: >>>>>>>> >>>>> http://dpdk.org/doc/guides/prog_guide/packet_classif_access_ctrl.html >>>>>>>> >>>>>>>> Notes >>>>>>>> ===== >>>>>>>> When an ACL table contains about 2000 Rules with a structure like >>>>>>>> {Protocol, IPsource, IPdest, PortSource, PortDest} >>>>>>>> a new insertion costs about 69000 CPUcycles/Rule. >>>>>>>> Instead under similar operating conditions the Classifier would >> require >>>>>>> about >>>>>>>> 1300 CPUcycles/Rule. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Antonio >>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> dev mailing list >>>>>>>> dev@openvswitch.org >>>>>>>> http://openvswitch.org/mailman/listinfo/dev >>>>>>>> >>>> _______________________________________________ >>>> dev mailing list >>>> dev@openvswitch.org >>>> http://openvswitch.org/mailman/listinfo/dev >>> _______________________________________________ >>> dev mailing list >>> dev@openvswitch.org >>> http://openvswitch.org/mailman/listinfo/dev >> >> _______________________________________________ >> dev mailing list >> dev@openvswitch.org >> http://openvswitch.org/mailman/listinfo/dev > _______________________________________________ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev > > _______________________________________________ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev