On Mon, Apr 12, 2021 at 11:22:45PM +0200, Tobias Waldekranz wrote:
> On Mon, Apr 12, 2021 at 21:30, Marek Behun <marek.be...@nic.cz> wrote:
> > On Mon, 12 Apr 2021 14:46:11 +0200
> > Tobias Waldekranz <tob...@waldekranz.com> wrote:
> >
> >> I agree. Unless you only have a few really wideband flows, a LAG will
> >> typically do a great job with balancing. This will happen without the
> >> user having to do any configuration at all. It would also perform well
> >> in "router-on-a-stick"-setups where the incoming and outgoing port is
> >> the same.
> >
> > TLDR: The problem with LAGs how they are currently implemented is that
> > for Turris Omnia, basically in 1/16 of configurations the traffic would
> > go via one CPU port anyway.
> >
> > One potencial problem that I see with using LAGs for aggregating CPU
> > ports on mv88e6xxx is how these switches determine the port for a
> > packet: only the src and dst MAC address is used for the hash that
> > chooses the port.
> >
> > The most common scenario for Turris Omnia, for example, where we have 2
> > CPU ports and 5 user ports, is that into these 5 user ports the user
> > plugs 5 simple devices (no switches, so only one peer MAC address for
> > port). So we have only 5 pairs of src + dst MAC addresses. If we simply
> > fill the LAG table as it is done now, then there is 2 * 0.5^5 = 1/16
> > chance that all packets would go through one CPU port.
> >
> > In order to have real load balancing in this scenario, we would either
> > have to recompute the LAG mask table depending on the MAC addresses, or
> > rewrite the LAG mask table somewhat randomly periodically. (This could
> > be in theory offloaded onto the Z80 internal CPU for some of the
> > switches of the mv88e6xxx family, but not for Omnia.)
>
> I thought that the option to associate each port netdev with a DSA
> master would only be used on transmit. Are you saying that there is a
> way to configure an mv88e6xxx chip to steer packets to different CPU
> ports depending on the incoming port?
>
> The reason that the traffic is directed towards the CPU is that some
> kind of entry in the ATU says so, and the destination of that entry will
> either be a port vector or a LAG. Of those two, only the LAG will offer
> any kind of balancing. What am I missing?
>
> Transmit is easy; you are already in the CPU, so you can use an
> arbitrarily fancy hashing algo/ebpf classifier/whatever to load balance
> in that case.

Say a user port receives a broadcast frame. Based on your understanding where user-to-CPU port assignments are used only for TX, which CPU port should be selected by the switch for this broadcast packet, and by which mechanism?
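
As an aside, the 2 * 0.5^5 = 1/16 figure quoted above can be checked by
brute force. The sketch below is only an illustration, not a model of the
actual mv88e6xxx hash: it assumes each src + dst MAC pair lands on one of
the CPU ports independently and with equal probability, which is the
assumption the quoted arithmetic implies. The function name and parameters
are made up for this example.

```python
from fractions import Fraction
from itertools import product


def prob_all_on_one_port(num_flows: int, num_cpu_ports: int) -> Fraction:
    """Exact probability that every flow lands on the same CPU port.

    Assumption (not taken from the hardware): each flow is mapped to a
    CPU port independently and uniformly at random.
    """
    # Enumerate every possible assignment of flows to CPU ports.
    outcomes = list(product(range(num_cpu_ports), repeat=num_flows))
    # An assignment is degenerate if only one distinct port is used.
    degenerate = sum(1 for assignment in outcomes if len(set(assignment)) == 1)
    return Fraction(degenerate, len(outcomes))


# 5 user ports with one peer each, 2 CPU ports, as in the Omnia example.
print(prob_all_on_one_port(num_flows=5, num_cpu_ports=2))  # prints 1/16
```

Under the same assumption, more flows make the degenerate case rarer
(10 flows over 2 ports gives 1/512), which is consistent with the point
that a LAG balances well unless there are only a few flows.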