On Tue, 2006-03-14 at 12:45 +1000, Russell Stuart wrote:
> On Sat, 2006-03-11 at 10:56 -0500, jamal wrote:
> > [1] Essentially, if you teach u32 in 2.4 to spread rules over
> > different buckets it will not do so evenly. Off the top of my head I
> > remember that the best you could ever do is have bucket selection in
> > increments of 4 (so buckets 0, 4, 8, ...) with certain divisor sizes
> > - it mostly works with 256; to work properly you need it to be
> > (0, 1, 2, 3, ...). Implies --> when it does work, it is detrimental
> > to lookup speed when you hit thousands of rules, because some
> > buckets get overloaded and others are totally unused; and when it
> > does not work, it results in classifier misses for rules you
> > inserted.
>
> Hmm. I can't see how this could be so. Can you give
> specific examples?

I will try to find the scripts.

> The only time I can think of where the 2.4 XOR hash would
> be worse is where there is a correlation between the bits
> in different bytes.

Actually, if memory serves me right, that's part of it ;->

> I can't think of a real life example
> of where that would be so.

The end goal is to reduce the total lookups at the expense of more RAM;
I don't see using 20M as a big deal, for example. Imagine a few
thousand hosts that you want to distribute across many hash tables (I
recall the test I had was for 64K unique hosts). One of the techniques
you could use is to have several layers of hash tables before
eventually hitting the terminal one; at each level you spread the bits
(of, let's say, the 3rd octet of the src IP, in 3-bit hash masks),
moving the mask diagonally from one level to the next. The result is
many hash tables created (a lot of memory consumed) but faster lookups.
In such a case the selection within a hash table could be made with bit
pattern 101 or 111 etc. as the mask, on a hash table with only 8
buckets. I am not sure if that made sense.

> In every other case it will
> be as good as, but typically better than, the 2.6
> algorithm.

That could be debated. There are many variables you need to consider
with u32 before reaching that conclusion; here is what comes to mind:

1) The number of buckets in the hash table - you picked 256 in your
   example.
2) The mask and its size (as you have found out with masks of exactly,
   more than, or less than 1 byte), and whether it is contiguous.
3) The bit offset at which the ANDing happens; different bit positions
   will result in different distributions.

To properly analyze things you need to vary all of these. When I last
compared the two, the all-round winner was the 2.6.x one.

The equation you provided for evaluating the different hashes does look
clever, but I have never seen it used anywhere. What I did was to
compute the standard deviation from the best (perfectly uniform) value.

cheers,
jamal
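
To make footnote [1] concrete, here is a rough userspace sketch of the
two folds. The fold bodies are reconstructed from memory of cls_u32.c
rather than copied from it, the names fold_24/fold_26 are made up, and
the bucket index is assumed to be the fold result ANDed with
(divisor - 1) for a power-of-two divisor; byte-order handling is
omitted. It shows both failure modes discussed above: a mask whose
lowest set bit is not bit 0 only ever selects every 4th bucket, and a
two-byte mask over correlated bytes collapses everything into one
bucket.

/* Sketch contrasting the 2.4 and 2.6 u32 hash folds (illustrative,
 * not verbatim kernel code). */
#include <stdio.h>
#include <stdint.h>

/* 2.4-style fold: mask, then XOR the upper bits down.  There is no
 * shift, so the bit position of the mask leaks into the bucket index. */
static unsigned int fold_24(uint32_t key, uint32_t hmask)
{
        uint32_t h = key & hmask;

        h ^= h >> 16;
        h ^= h >> 8;
        return h;
}

/* 2.6-style fold: mask, then shift right so the lowest set mask bit
 * lands at bit 0 (the kernel precomputes such a shift from the mask). */
static unsigned int fold_26(uint32_t key, uint32_t hmask, int fshift)
{
        return (key & hmask) >> fshift;
}

static void count_buckets(const char *name, unsigned int divisor,
                          const unsigned int *hits)
{
        unsigned int i, used = 0;

        for (i = 0; i < divisor; i++)
                used += hits[i] != 0;
        printf("%-12s %3u/%u buckets used\n", name, used, divisor);
}

int main(void)
{
        unsigned int hits24[256] = {0}, hits26[256] = {0};
        unsigned int corr24[256] = {0}, corr26[256] = {0};
        uint32_t key;

        /* Demo 1: 64 buckets keyed on bits 2..7 (hmask 0xfc, fshift 2).
         * The 2.4 fold only ever lands on buckets 0, 4, 8, ...; rules
         * that a script filed into buckets 1, 2, 3, ... would never be
         * consulted at lookup time - the classifier misses mentioned
         * above. */
        for (key = 0; key < 256; key++) {
                hits24[fold_24(key, 0xfc) & 63]++;
                hits26[fold_26(key, 0xfc, 2) & 63]++;
        }
        count_buckets("2.4, 0xfc:", 64, hits24);
        count_buckets("2.6, 0xfc:", 64, hits26);

        /* Demo 2: correlated bytes.  With a two-byte mask and keys
         * whose two bytes are equal (an a.b.b.c addressing plan, say),
         * the XOR fold cancels them out and piles every key into
         * bucket 0. */
        for (key = 0; key < 256; key++) {
                uint32_t k = (key << 8) | key;

                corr24[fold_24(k, 0xffff) & 255]++;
                corr26[fold_26(k, 0xffff, 0) & 255]++;
        }
        count_buckets("2.4, corr:", 256, corr24);
        count_buckets("2.6, corr:", 256, corr26);
        return 0;
}

Built with gcc and run, this should report 16/64 buckets used for the
2.4 fold versus 64/64 for 2.6 in the first demo, and 1/256 versus
256/256 in the correlated-bytes demo.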
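The layered-table idea (3-bit slices of the address selecting among
8-bucket tables, level after level) can be sketched the same way.
Reading "diagonal" as "a different 3-bit slice at each level" is an
interpretation, and the slice masks, the /16, and the 64K-host
population below are made up for illustration; the point is the trade:
1 + 8 + 64 = 73 eight-bucket tables (a lot of RAM) in exchange for
terminal chains of 128 entries instead of one 64K-entry list.

/* Sketch of layering 8-bucket hash tables over 3-bit address slices. */
#include <stdio.h>
#include <stdint.h>

/* 2.6-style fold: mask, then shift the lowest mask bit down to bit 0. */
static unsigned int bucket(uint32_t key, uint32_t hmask, int fshift)
{
        return (key & hmask) >> fshift;
}

int main(void)
{
        /* Three 3-bit slices walking down the host part of a /16. */
        struct { uint32_t hmask; int fshift; } level[3] = {
                { 0x0000e000, 13 },   /* top 3 bits of the 3rd octet   */
                { 0x00001c00, 10 },   /* next 3 bits                   */
                { 0x00000380,  7 },   /* spills into the 4th octet     */
        };
        unsigned int leaf[8][8][8] = {{{0}}};
        uint32_t host;

        /* 64K hosts in 10.1.0.0/16: count how the leaves fill up. */
        for (host = 0; host < 0x10000; host++) {
                uint32_t addr = (10u << 24) | (1u << 16) | host;
                unsigned int b0 = bucket(addr, level[0].hmask, level[0].fshift);
                unsigned int b1 = bucket(addr, level[1].hmask, level[1].fshift);
                unsigned int b2 = bucket(addr, level[2].hmask, level[2].fshift);

                leaf[b0][b1][b2]++;
        }
        /* 8*8*8 = 512 terminal chains; each should hold
         * 65536/512 = 128 hosts, i.e. a 128-entry walk instead of a
         * 64K-entry one. */
        printf("hosts per terminal chain: %u\n", leaf[0][0][0]);
        return 0;
}

Note that a non-contiguous slice mask (the "bit pattern 101" case)
would still address the same 8-bucket table; only the selection of
which address bits feed the bucket changes.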
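Finally, the evaluation metric. A natural reading of "the standard
deviation from the best value" is the root-mean-square deviation of
per-bucket occupancy from the uniform ideal: lower is better, and a
perfect spread scores zero. A minimal sketch of that reading, with a
made-up helper name and toy bucket counts (link with -lm):

/* Score a bucket distribution by its deviation from the uniform ideal. */
#include <stdio.h>
#include <math.h>

static double stddev_from_best(const unsigned int *hits,
                               unsigned int divisor, unsigned int total)
{
        double best = (double)total / divisor;   /* ideal bucket load */
        double sum = 0.0;
        unsigned int i;

        for (i = 0; i < divisor; i++) {
                double d = (double)hits[i] - best;
                sum += d * d;
        }
        return sqrt(sum / divisor);
}

int main(void)
{
        /* Toy example: 16 rules over 4 buckets.  The skewed layout is
         * what a degenerate fold produces; the even one is the best
         * case. */
        unsigned int skewed[4] = { 16, 0, 0, 0 };
        unsigned int even[4]   = {  4, 4, 4, 4 };

        printf("skewed: %.2f\n", stddev_from_best(skewed, 4, 16));
        printf("even:   %.2f\n", stddev_from_best(even, 4, 16));
        return 0;
}

This prints roughly 6.93 for the skewed layout and 0.00 for the even
one, which is the kind of all-round comparison described above.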