On 4/7/25 13:21, Christian Marangi (Ansuel) wrote: > Il giorno lun 7 apr 2025 alle ore 19:00 Sean Anderson > <sean.ander...@linux.dev> ha scritto: >> >> On 4/7/25 12:46, Christian Marangi (Ansuel) wrote: >> > Il giorno lun 7 apr 2025 alle ore 18:33 Sean Anderson >> > <sean.ander...@linux.dev> ha scritto: >> >> >> >> On 4/7/25 12:27, Kory Maincent wrote: >> >> > On Thu, 3 Apr 2025 14:18:54 -0400 >> >> > Sean Anderson <sean.ander...@linux.dev> wrote: >> >> > >> >> >> This series adds support for creating PCSs as devices on a bus with a >> >> >> driver (patch 3). As initial users, >> >> >> >> >> >> - The Lynx PCS (and all of its users) is converted to this system >> >> >> (patch 5) >> >> >> - The Xilinx PCS is broken out from the AXI Ethernet driver (patches >> >> >> 6-8) >> >> >> - The Cadence MACB driver is converted to support external PCSs (namely >> >> >> the Xilinx PCS) (patches 9-10). >> >> >> >> >> >> The last few patches add device links for pcs-handle to improve boot >> >> >> times, >> >> >> and add compatibles for all Lynx PCSs. >> >> >> >> >> >> Care has been taken to ensure backwards-compatibility. The main source >> >> >> of this is that many PCS devices lack compatibles and get detected as >> >> >> PHYs. To address this, pcs_get_by_fwnode_compat allows drivers to edit >> >> >> the devicetree to add appropriate compatibles. >> >> > >> >> > I don't dive into your patch series and I don't know if you have heard >> >> > about it >> >> > but Christian Marangi is currently working on fwnode for PCS: >> >> > https://lore.kernel.org/netdev/20250406221423.9723-1-ansuels...@gmail.com >> >> > >> >> > Maybe you should sync with him! >> >> >> >> I saw that series and made some comments. He is CC'd on this one. >> >> >> >> I think this approach has two advantages: >> >> >> >> - It completely solves the problem of the PCS being unregistered while >> >> the netdev >> >> (or whatever) is up >> >> - I have designed the interface to make it easy to convert existing >> >> drivers that may not be able to use the "standard" probing process >> >> (because they have to support other devicetree structures for >> >> backwards-compatibility). >> >> >> > >> > I notice this and it's my fault for taking too long to post v2 of the PCS >> > patch. >> > There was also this idea of entering the wrapper hell but I scrapped that >> > early >> > as I really feel it's a workaround to the current problem present for >> > PCS handling. >> >> It's no workaround. The fundamental problem is that drivers can become >> unbound at any time, and we cannot make consumers drop their references. >> Every subsystem must deal with this reality, or suffer from >> user-after-free bugs. See [1-3] for discussion of this problem in >> relation to PCSs and PHYs, and [4] for more discussion of my approach. >> >> [1] https://lore.kernel.org/netdev/yv7kp2k8vvn7j...@shell.armlinux.org.uk/ >> [2] >> https://lore.kernel.org/netdev/20220816163701.1578850-1-sean.ander...@seco.com/ >> [3] >> https://lore.kernel.org/netdev/9747f8ef-66b3-0870-cbc0-c1783896b...@seco.com/ >> [3] https://lpc.events/event/17/contributions/1627/ >> >> > And the real problem IMHO is that currently PCS handling is fragile and >> > with too >> > many assumptions. With Daniel we also discussed backwards-compatibility. >> > (mainly needed for mt7621 and mt7986 (for mediatek side those are the 2 >> > that slipped in before it was correctly complained that things were >> > taking a bad path) >> > >> > We feel v2 permits correct support of old implementations. >> > The ""legacy"" implementation pose the assumption that PCS is never removed >> > (unless the MAC driver is removed) >> > That fits v2 where a MAC has to initially provide a list of PCS to >> > phylink instance. >> >> And what happens when the driver is unbound from the device and suddenly >> a PCS on that list is free'd memory but is in active use by a netdev? >> > > driver bug for not correctly implementing the removal task. > > The approach is remove as provider and call phylink removal phase > under rtnl lock. > We tested this with unbind, that is actually the main problem we are > trying to address > and works correctly.
OK, so this is a different approach since your last submission. Please CC me on your series. - Fundamentally this is going to make backwards compatibility very difficult, since your approach cannot work with mac_select_pcs. How are you going to handle the case of MAC-internal PCSs? Are you going to make them create a swnode and bind to it just to create a PCS for e.g. MMIO registers? And how is the MAC supposed to know how to select the PCS? From what I can tell you don't even notify the MAC about which PCS it's using. I considered an approach like this, where the phylink would be in the driver's seat (so to speak), but I decided not to persue it due to the problems listed above. A lot of PCSs are tightly-integrated with their MACs, so it does not make sense to introduce this little coupling. I think it is better to let the MAC select the PCS e.g. based on the phy interface. This tends to be a few lines of code for the MAC and saves so much complexity in phylink. I think you should try doing the macb and lynx conversions for your approach. It will make the above problems obvious. - Your approach is very intrusive. There are lots of changes all over phylink across several patches and it is hard to verify all the assumptions. Whereas a wrapper keeps everything contained to one file, and most of the functions can be evaluated independently. --Sean