Hi All,
Here's an optimization idea for the datapath classifier table. 
I'd like to get some feedback.

I used the DPDK ACL tables. They support wildcard matching, and each lookup
requires fewer CPU cycles than the Classifier.
However, ACLs have a drawback: inserting a new Rule takes a very long time.
An insertion can cost about 50 times as many CPU cycles as an insertion into
the Classifier. See the Notes below for further details.

So a simple 1:1 replacement of the Classifier with an ACL table is not a viable
solution.

The idea described below is instead to replace the Classifier with 2 ACL
tables. One is the 'Operating' table, while the other is the 'Shadow' table.

Every lookup is performed on the Operating table.

Every new insertion is instead executed on the Shadow table by a separate
thread.
After the insertion is done, the 2 tables are swapped.
Thus the Shadow table becomes the Operating one, and vice versa.
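
The swap described above could be sketched roughly as follows. This is only
an illustration, not the actual implementation: 'struct rule_table' is a
hypothetical stand-in for a DPDK ACL table (a plain array of rule ids), and
all the names (table_pair, pair_lookup, pair_insert_and_swap) are made up for
this example. It also assumes that the Shadow table is first brought up to
date with the Operating one before the new Rules are added.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_RULES 64

/* Hypothetical stand-in for an ACL table: just a list of rule ids. */
struct rule_table {
    size_t n_rules;
    int rules[MAX_RULES];
};

struct table_pair {
    struct rule_table tables[2];
    _Atomic(struct rule_table *) operating; /* read by the datapath */
    struct rule_table *shadow;              /* owned by the insertion thread */
};

static void pair_init(struct table_pair *p)
{
    p->tables[0].n_rules = 0;
    p->tables[1].n_rules = 0;
    atomic_init(&p->operating, &p->tables[0]);
    p->shadow = &p->tables[1];
}

/* Datapath side: lookups only ever touch the Operating table. */
static int pair_lookup(struct table_pair *p, int rule_id)
{
    struct rule_table *t =
        atomic_load_explicit(&p->operating, memory_order_acquire);

    for (size_t i = 0; i < t->n_rules; i++) {
        if (t->rules[i] == rule_id) {
            return 1;
        }
    }
    return 0;
}

/* Insertion thread: update the Shadow table, then swap the two roles. */
static void pair_insert_and_swap(struct table_pair *p,
                                 const int *new_rules, size_t n)
{
    struct rule_table *op =
        atomic_load_explicit(&p->operating, memory_order_relaxed);
    struct rule_table *sh = p->shadow;

    assert(op->n_rules + n <= MAX_RULES);

    *sh = *op; /* bring the Shadow table up to date first */
    for (size_t i = 0; i < n; i++) {
        sh->rules[sh->n_rules++] = new_rules[i];
    }

    /* Publish the new Operating table; the old one becomes the Shadow. */
    atomic_store_explicit(&p->operating, sh, memory_order_release);
    p->shadow = op;
    /* Not shown: before the next insertion writes to 'op', the thread must
     * wait until no lookup is still reading it (e.g. an RCU-style grace
     * period). */
}
```

The slow insertion happens entirely off the lookup path; the datapath only
ever sees a single pointer store. The grace period before reusing the old
table is left out here and would need to be handled in a real design.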


Is the following ok with real use cases?
========================================
An assumption was made: new sets of Rules arrive at a rate lower than
1 Rule Set per second.
Does this assumption hold for real use cases?


Performance Figures
===================
The table below refers to a unidirectional test that compares the performance
of the 2 implementations.
Some Flows were installed so that the Classifier was using 7 SubTables.
The ACL Rule format was {Protocol, IPdest, MACsrc, UdpPortDest, ToS, VlanTci}.
The performance figures are expressed in Mpps.

                 +------------+------------+
                 | Classifier |   2 ACLs   |
+----------------+------------+------------+
| Max Throughput |    2.2     |    5.4     |
|     [Mpps]     |            |            |
+----------------+------------+------------+


Conclusions
===========
At this stage it would be really helpful to have some initial feedback from
the Community. Any comment or suggestion will be useful to drive further
development.


References
==========
DPDK ACL Rules, how to:
    http://dpdk.org/doc/guides/prog_guide/packet_classif_access_ctrl.html


Notes
=====
When an ACL table contains about 2000 Rules with a structure like
{Protocol, IPsource, IPdest, PortSource, PortDest},
a new insertion costs about 69000 CPU cycles/Rule.
Under similar operating conditions, the Classifier requires about
1300 CPU cycles/Rule.
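
To put the two figures above side by side, here is a small sketch of the
arithmetic. The helper names are hypothetical, and the CPU clock frequency is
a parameter that the caller has to supply; it was not part of the
measurement, so any value passed in is purely an assumption.

```c
#include <assert.h>

/* Insertion costs quoted in the Notes, in CPU cycles per Rule. */
#define ACL_INSERT_CYCLES        69000.0
#define CLASSIFIER_INSERT_CYCLES 1300.0

/* How many times more expensive an ACL insertion is. */
static double insert_cost_ratio(void)
{
    return ACL_INSERT_CYCLES / CLASSIFIER_INSERT_CYCLES;
}

/* Time in seconds to insert n_rules Rules one by one, for a given cost per
 * Rule and an assumed CPU clock frequency in Hz. */
static double insert_time_seconds(double cycles_per_rule, unsigned n_rules,
                                  double cpu_hz)
{
    assert(cpu_hz > 0.0);
    return cycles_per_rule * n_rules / cpu_hz;
}
```

insert_cost_ratio() evaluates to roughly 53, which is where the "about 50
times" figure comes from. As an example, with an assumed 2 GHz clock,
insert_time_seconds(ACL_INSERT_CYCLES, 100, 2e9) gives about 3.45 ms for a
set of 100 Rules.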


Thanks,
Antonio

