On Mon, Feb 08, 2021 at 13:05, Jakub Kicinski <k...@kernel.org> wrote:
> On Mon, 08 Feb 2021 20:54:29 +0100 Tobias Waldekranz wrote:
>> On Thu, Feb 04, 2021 at 21:16, Jakub Kicinski <k...@kernel.org> wrote:
>> > On Wed, 3 Feb 2021 18:54:56 +0200 Vadym Kochan wrote:
>> >> From: Serhiy Boiko <serhiy.bo...@plvision.eu>
>> >>
>> >> The following features are supported:
>> >>
>> >> - LAG basic operations
>> >>     - create/delete LAG
>> >>     - add/remove a member to LAG
>> >>     - enable/disable member in LAG
>> >> - LAG Bridge support
>> >> - LAG VLAN support
>> >> - LAG FDB support
>> >>
>> >> Limitations:
>> >>
>> >> - Only HASH lag tx type is supported
>> >> - The Hash parameters are not configurable. They are applied
>> >>     during the LAG creation stage.
>> >> - Enslaving a port to the LAG device that already has an
>> >>     upper device is not supported.
>> >
>> > Tobias, Vladimir, you worked on LAG support recently, would you mind
>> > taking a look at this one?
>>
>> I took a quick look at it, and what I found left me very puzzled. I hope
>> you do not mind me asking a generic question about the policy around
>> switchdev drivers. If someone published a driver using something similar
>> to the following configuration flow:
>>
>>      iproute2   daemon(SDK)
>>         |        ^    |
>>         :        :    : user/kernel boundary
>>         v        |    |
>>      netlink     |    |
>>         |        |    |
>>         v        |    |
>>      driver      |    |
>>         |        |    |
>>         '--------'    |
>>                       : kernel/hardware boundary
>>                       v
>>                      ASIC
>>
>> My guess is that they would be (rightly IMO) told something along the
>> lines of "we do not accept drivers that are just shims for proprietary
>> SDKs".
>>
>> But it seems like if that same someone has enough area to spare in their
>> ASIC to embed a CPU, it is perfectly fine to run that same SDK on it,
>> call it "firmware", and then push a shim driver into the kernel tree.
>>
>>      iproute2
>>         |
>>         :               user/kernel boundary
>>         v
>>      netlink
>>         |
>>         v
>>      driver
>>         |
>>         |
>>         :               kernel/hardware boundary
>>         '-------------.
>>                       v
>>                  daemon(SDK)
>>                       |
>>                       v
>>                      ASIC
>>
>> What have we, the community, gained by this? In the old world, the
>> vendor usually at least had to ship me the SDK in source form. Having
>> seen the inside of some of those sausage factories, they are not the
>> kinds of code bases that I want at the bottom of my stack; even less so
>> in binary form where I am entirely at the vendor's mercy for bugfixes.
>>
>> We are talking about a pure Ethernet fabric here, so there is no fig
>> leaf of "regulatory requirements" to hide behind, in contrast to WiFi
>> for example.
>>
>> Is it the opinion of the netdev community that it is OK for vendors to
>> use this model?
>
> I ask myself that question pretty much every day. Sadly I have no clear
> answer.

Thank you for your candid answer, really appreciate it. I do not envy
you one bit, making those decisions must be extremely hard.

> Silicon is cheap, you can embed a reasonable ARM or Risc-V core in the
> chip for the area and power draw comparable to one high speed serdes
> lane.
>
> The drivers landing in the kernel are increasingly meaningless. My day
> job is working for a hyperscaler. Even though we have one of the most
> capable kernel teams on the planet most of issues with HW we face
> result in "something is wrong with the FW, let's call the vendor".

Right, and being a hyperscaler probably at least gets you some
attention when you call your vendor. My day job is working for a
nanoscaler, so my experience is that we must be prepared to solve all
issues in-house; if we get any help from the vendor that is just a
bonus.

> And even when I say "drivers landing" it is an overstatement.
> If you look at high speed anything these days the drivers cover
> multiple generations of hardware, seems like ~5 years ago most
> NIC vendors reached sufficient FW saturation to cover up differences
> between HW generations.
>
> At the same time some FW is necessary. Certain chip functions, are
> best driven by a micro-controller running a tight control loop.

I agree. But I still do not understand why vendors cling to the source
of these like it was their wallet. That is the beauty of selling
silicon; you can fully leverage OSS and still have a very
straightforward business model.

> The complexity of FW is a spectrum, from basic to Qualcomm.
> The problem is there is no way for us to know what FW is hiding
> by just looking at the driver.
>
> Where do we draw the line?

Yeah it is a very hard problem.
In this particular case though, the vendor explicitly said that what
they have done is compiled their existing SDK to run on the ASIC:

https://lore.kernel.org/netdev/bn6pr18mb1587eb225c6b80bf35a44ebfba...@bn6pr18mb1587.namprd18.prod.outlook.com

So there is no reason that it could not be done as a proper driver.

> Personally I'd really like to see us pushing back stronger.

Hear, hear!