On Thu, Aug 13, 2020 at 5:52 AM Jamal Hadi Salim <j...@mojatatu.com> wrote: > > The _main_ requirement is to scale to a large number of filters > (a million is a good handwave number). Scale means > 1) fast datapath lookup time + 2) fast insertion/deletion/get/dump > from control/user space. > fwmark is good at all these goals today for #2. It is good for #1 for > maybe 1K rules (limitation is the 256 buckets, constrained by rcu > trickery). Then you start having collisions in a bucket and your > lookup requires long linked list walks. > > Generally something like a hash table with sufficient number of buckets > will work out ok. > There maybe other approaches (IDR in the kernel looks interesting, > but i didnt look closely). > > So to the implementation issue: > Major issue is removing ambiguity while at the same time trying > to get good performance. > > Lets say we decided to classify skbmark and skbhash at this point. > For a hash table, one simple approach is to set > lookupkey = hash<<32|mark > > the key is used as input to the hash algo to find the bucket. > > There are two outstanding challenges in my mind: > > 1) To use the policy like you describe above > as an example: > > $TC filter add dev $DEV1 parent ffff: protocol ip prio 3 handle 1 > skb hash Y flowid 1:12 action ok > > and say you receive a packet with both skb->hash and skb->mark set > Then there is ambiguity > > How do you know whether to use hash or mark or both > for that specific key?
Hmm, you can just unconditionally pass skb->hash and skb->mark, no? Something like: if (filter_parameter_has_hash) { match skb->hash with cls->param_hash } if (filter_parameter_has_mark) { match skb->mark with cls->param_mark } fw_classify() uses skb->mark unconditionally anyway, without checking whether it is set or not first. But if filters were put in a global hashtable, the above would be much harder to implement. > You can probably do some trick but I cant think of a cheap way to > achieve this goal. Of course this issue doesnt exist if you have > separate classifiers. > > 2) If you decide tomorrow to add tcindex/prio etc, you will have to > rework this as well. > > #2 is not as a big deal as #1. Well, I think #2 is more serious than #1, if we have to use a hashtable. (If we don't have to, then it would be much easier to extend, of course.) THanks.