On Wed, Dec 09, 2020 at 11:01:25PM +0100, Tobias Waldekranz wrote: > It is not the Fibonacci sequence or anything, it is an integer in the > range 0..num_lags-1. I realize that some hardware probably allocate IDs > from some shared (and thus possibly non-contiguous) pool. Maybe ocelot > works like that. But it seems reasonable to think that at least some > other drivers could make use of a linear range.
In the ocelot RFC patches that I've sent to the list yesterday, you could see that the ports within the same bond must have the same logical port ID (as opposed to regular mode, when each port has a logical ID equal to its physical ID, i.e. swp0 -> 0, swp1 -> 1, etc). We can't use the contiguous LAG ID assignment that you do in DSA, because maybe we have swp1 and swp2 in a bond, and the LAG ID you give that bond is 0. But if we assign logical port ID 0 to physical ports 1 and 2, then we end up also bonding with swp0... So what is done in ocelot is that the LAG ID is derived from the index of the first port that is part of the bond, and the logical port IDs are all assigned to that value. It's really simple when you think about it. It would have probably been the same for Marvell too if it weren't for that cross-chip thing. If I were to take a look at Florian's b53-bond branch, I do see that Broadcom switches also expect a contiguous range of LAG IDs: https://github.com/ffainelli/linux/tree/b53-bond So ok, maybe ocelot is in the minority. Not an issue. If you add that lookup table in the DSA layer, then you could get your linear "LAG ID" by searching through it using the struct net_device *bond as key. Drivers which don't need this linear array will just not use it. > > I think that there is a low practical risk that the assumption will not > > hold true basically forever. But I also see why you might like your > > approach more. Maybe Vivien, Andrew, Florian could also chime in and we > > can see if struct dsa_lag "bothers" anybody else except me (bothers in > > the sense that it's an unnecessary complication to hold in DSA). We > > could, of course, also take the middle ground, which would be to keep > > the 16-entry array of bonding net_device pointers in DSA, and you could > > still call your dsa_lag_dev_by_id() and pretend it's generic, and that > > would just look up that table. Even with this middle ground, we are > > getting rid of the port lists and of the reference counting, which is > > still a welcome simplification in my book. > > Yeah I agree that we can trim it down to just the array. Going beyond > that point, i.e. doing something like how sja1105 works, is more painful > but possible if Andrew can live with it. I did not get the reference to sja1105 here. That does not support bonding offload, but does work properly with software bridging thanks to your patches.