On 17/11/17 08:14, Nikolay Aleksandrov wrote: > On 17/11/17 07:26, Willy Tarreau wrote: >> Hi Stephen, >> >> On Thu, Nov 16, 2017 at 04:27:18PM -0800, Stephen Hemminger wrote: >>> On Thu, 16 Nov 2017 21:21:55 +0100 >>> Vincent Bernat <ber...@luffy.cx> wrote: >>> >>>> ? 16 novembre 2017 20:23 +0100, Andrew Lunn <and...@lunn.ch> : >>>> >>>>> struct net_bridge_fdb_entry is 40 bytes. >>>>> >>>>> My WiFi access point which is also a 5 port bridge, currently has 97MB >>>>> free RAM. That is space for about 2.5M FDB entries. So even Roopa's >>>>> 128K is not really a problem, in terms of memory. >>>> >>>> I am also interested in Sarah's patch because we can now have bridge >>>> with many ports through VXLAN. The FDB can be replicated to an external >>>> daemon with BGP and the cost of each additional MAC address is therefore >>>> higher than just a few bytes. It seems simpler to implement a limiting >>>> policy early (at the port or bridge level). >>>> >>>> Also, this is a pretty standard limit to have for a bridge (switchport >>>> port-security maximum on Cisco, set interface X mac-limit on >>>> Juniper). And it's not something easy to do with ebtables. >>> >>> I want an optional limit per port, it makes a lot of sense. >>> If for no other reason that huge hash tables are a performance problems. >> >> Except its not a limit in that it doesn't prevent new traffic from going >> in, it only prevents new MACs from being learned, so suddenly you start >> flooding all ports with new traffic once the limit is reached, which is >> not trivial to detect nor diagnose. >> >>> There is a bigger question about which fdb to evict but just dropping the >>> new one seems to be easiest and as good as any other solution. >> >> Usually it's better to apply LRU or random here in my opinion, as the >> new entry is much more likely to be needed than older ones by definition. >> In terms of CPU usage it may even be better to kill an entire series in >> the hash table (eg: all nodes in the same table entry for example), as >> the operation can be almost as cheap and result in not being needed for >> a while again. >> >> Willy >> > > Hi, > I have been thinking about this and how to try and keep everyone happy > while maintaining performance, so how about this: > > * Add a per-port fdb LRU list which is used only when there are >= 2000 > global fdb entries _or_ a per-port limit is set. If the list is in use, > update the fdb list position once per second. I think these properties will > allow us to scale and have a better LRU locking granularity (per-port), > and in > smaller setups (not needing LRU) the cost will be a single test in fast > path. >
It seems I spoke too soon, I just tried a quick and dirty version of this and the per-port LRU becomes a nightmare with the completely lockless fdb dst port changes. So I think going back to the original proposition with new fdb dropping is best for now, otherwise we need to add some synchronization on fdb port move. In general the per-port locks would be tricky to handle with moving fdbs, so even if we add a per-port LRU list we'll need some global lock to handle moves and updates in a simple manner, and I'd like to avoid adding a new bridge lock. > * Use the above LRU list for per-port limit evictions > > * More importantly use the LRU list for fdb expire, removing the need to > > walk > over all fdbs every time we expire entries. This would be of great help for > larger setups with many fdbs (it will kick in on >= 2000 fdb entries). > Will have to drop the above point for now, even though I was mostly interested in that. > * (optional) Make the global LRU kick in limit an option, people might want > to > minimize traffic blocking due to expire process. > > Any comments and suggestions are welcome. When we agree on the details I'll do > the RFC patches and run some tests before submitting. Defaults will be kept as > they are now. I've chosen the 2000 limit arbitrarily and am happy to change it > if people have something else in mind. This should play nice with the > resizeable > hashtable change. > > Thanks, > Nik > > > > >