Thanks Zoltan for your questions. Anyone else has any new feedback?
> -----Original Message----- > From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Fischetti, > Antonio > Sent: Friday, December 18, 2015 3:44 PM > To: Zoltan Kiss; dev@openvswitch.org > Subject: Re: [ovs-dev] Wildcard Matching optimization idea > > > > -----Original Message----- > > From: Zoltan Kiss [mailto:zoltan.k...@linaro.org] > > Sent: Friday, December 18, 2015 12:38 PM > > To: Fischetti, Antonio; dev@openvswitch.org > > Subject: Re: [ovs-dev] Wildcard Matching optimization idea > > > > > > > > On 17/12/15 16:23, Fischetti, Antonio wrote: > > > Hi Zoltan, thanks for your questions. > > > Please find below my answers inline. > > > > > >> -----Original Message----- > > >> From: Zoltan Kiss [mailto:zoltan.k...@linaro.org] > > >> Sent: Thursday, December 17, 2015 2:33 PM > > >> To: Fischetti, Antonio; dev@openvswitch.org > > >> Subject: Re: [ovs-dev] Wildcard Matching optimization idea > > >> > > >> > > >> > > >> On 17/12/15 10:41, Fischetti, Antonio wrote: > > >>> Hi All, > > >>> Here's an optimization idea for the datapath classifier table. > > >>> I'd like to get some feedback. > > >>> > > >>> I used the DPDK ACL tables. They can perform a wildcarded matching > > and > > >> each > > >>> lookup requires less CPU cycles than the Classifier. > > >>> Anyway there's a negative aspect with ACLs. They take a very long time > > to > > >>> insert a new Rule. > > >>> It can be 50 times greater than an insertion into the Classifier. See > Note > > >> below > > >>> for further details. > > >>> > > >>> So a simple 1:1 replacement of the Classifier with an ACL table is not a > > >> viable > > >>> solution. > > >>> > > >>> The idea described below is instead to replace the Classifier with 2 ACL > > >>> tables. One is the 'Operating', while the other is a 'Shadow' table. > > >>> > > >>> Any lookup will be performed on the Operating table. > > >>> > > >>> Instead any new insertion will be executed on the Shadow table by > > means > > >> of a > > >>> separate thread. > > >>> After the insertion is done, the 2 tables will be swapped. > > > > Are you swapping after each insertion, or in batches? > > In batches. > > > The new shadow table needs to get updated first to be sync with > Operating, does it take > > a similar amount of time? > > No, the shadow table acts as a 'mirror' of the operating. So the 2 > tables are supposed to contain exactly the same entries. > An exception is during the transient insertion procedure. But after it > is completed the 2 tables will contain again the same entries. > > > Instead of having this 2 table, how about have one, and make it possible > > that you can look up while an insertion is on place? Something in an RCU > > fashion? > > Unluckly not. While an insertion is taking place it is not possible to access > the > ACL > to read its entries. > More precisely, an ACL insertion means 2 actions: add + rebuild. The rebuild > takes > the 95% cpu cycles of all the insertion. > You could read while the 'add' is in progress. Instead you can't read while > the > 'rebuild' is still happening. > That's why I'm using 2 ACL tables. > > > > > >> > > >> So while this insertion happens, you still look up in the actual > > >> Operating table. > > > > > > Yes, while insertion is in progress any lookup will still be carried out > > > on > the > > > Operating table. > > > > I don't know how the classifier works exactly, but is the following > > scenario possible?: > > > > Rule A matches a flow and specifies an action. A new insertion would > > essentially remove Rule A and add B which matches the same flow but > > specifies a different action. While that happens, packets would still > > match A, while the expectation probably would be to match B. > > > > The same issue can happen with the Classifier, in this case it will be worse. > The solution with 2 ACLs have an insertion latency much longer. > That is because an ACL insertion can be about 50 times greater than an > insertion into the Classifier. > > > > > > >> What happens if you have a new insertion in the meantime? > > > > > > The new Rule gets buffered into a 'wait' queue. > > > > > >> Especially, what happens if your lookup yields the same rule > > >> which is inserted at the moment? > > > > > > That's a good point. At the current stage it is simply added into the > > > wait > > queue. > > > So I could potentially have duplications where different rules into the > > > ACL > > are > > > referring to the same netdev-flow. > > > To avoid these duplications there could be 2 approaches. > > > One option would be to check that in the wait queue that rule is not > > present. > > > Another option would be to store it into the wait queue anyway and then > > check > > > that the ACL does not already contain that rule. > > > > > > > > > > > > > >> > > >>> Thus the Shadow table will now become the Operating one, and > > viceversa. > > >>> > > >>> > > >>> Is the following ok with real use cases? > > >>> ======================================== > > >>> An Assumption was made: new sets of Rules arrive with a frequency > > lower > > >>> than 1 (Rule Sets)/sec. > > >>> Would this be ok with real use cases? > > >>> > > >>> > > >>> Performance Figures > > >>> =================== > > >>> The table below refers to a mono-directional test where the > > performance > > >> is > > >>> compared between the 2 implementations. > > >>> Some Flows were installed so that the Classifier was using 7 SubTables. > > >>> The ACL Rule format was {Protocol, IPdest, MACsrc, UdpPortDest, ToS, > > >> VlanTci}. > > >>> The performance figures are expressed in Mpps. > > >>> > > >>> +------------+------------+ > > >>> | Classifier | 2 ACLs | > > >>> +----------------+------------+------------+ > > >>> | Max Throughput | 2.2 | 5.4 | > > >>> | [Mpps] | | | > > >>> +----------------+------------+------------+ > > >>> > > >>> > > >>> Conclusions > > >>> =========== > > >>> At this stage it would really be helpful to have an initial feedback > > >>> from > > the > > >>> Community. Any comment or suggestion will be useful to drive further > > >>> developments. > > >>> > > >>> > > >>> References > > >>> ========== > > >>> DPDK ACL Rules, how to: > > >>> > > http://dpdk.org/doc/guides/prog_guide/packet_classif_access_ctrl.html > > >>> > > >>> > > >>> Notes > > >>> ===== > > >>> When an ACL table contains about 2000 Rules with a structure like > > >>> {Protocol, IPsource, IPdest, PortSource, PortDest} > > >>> a new insertion costs about 69000 CPUcycles/Rule. > > >>> Instead under similar operating conditions the Classifier would require > > >> about > > >>> 1300 CPUcycles/Rule. > > >>> > > >>> > > >>> Thanks, > > >>> Antonio > > >>> > > >>> > > >>> _______________________________________________ > > >>> dev mailing list > > >>> dev@openvswitch.org > > >>> http://openvswitch.org/mailman/listinfo/dev > > >>> > _______________________________________________ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev