+Tom Rini (actually adding Tom to the conversation)

> On 3. Apr 2025, at 19:54, Simon Glass <s...@chromium.org> wrote:
>
> +Tom Rini in case this affects the release
>
> Hi Christian,
>
> On Fri, 4 Apr 2025 at 01:46, Christian Kohlschütter
> <christ...@kohlschutter.com> wrote:
>>
>> Hi Simon,
>>
>>> On 1. Apr 2025, at 17:51, Simon Glass <s...@chromium.org> wrote:
>>>>
>>>> but I don't know precisely what these various functions are supposed to
>>>> do, and I can't find any path that leads from any of these to eth_halt().
>>>>
>>>> Is it possible that U-Boot is failing to call eth_halt() in response to
>>>> ExitBootServices(), and is therefore leaving the network device active
>>>> and performing DMA while the kernel starts up?
>>>
>>> The dm_remove_devices_active() is supposed to handle this, but it is
>>> possible that one of the drivers lacks a remove() method.
>>
>> for what it's worth, this is happening on both ODROID N2+ (meson8b-dwmac) as
>> well as NanoPI R4S (rk_gmac-dwmac).
>>
>> I also don't understand why reverting
>> 06ef8089f876b6dabf56caba31a05c003f03c629 "fixes" this behavior. Does it mean
>> that any memalign / malloc may cause this?
>> Also, how can this failure mode be detected and prevented in code?
>>
>> I really don't fancy the thought of remote code injection into Linux initrd
>> by spoofing Ethernet packages, but it really looks like a possibility.
>
> Here's my idea of what might be going on based on Michael's observations:
>
> 1. designware_eth_start() starts the interface and
> designware_eth_stop() stops it
>
> 2. tx_descs_init() sets up the DMA and uses the device-private info
> (struct dw_eth_dev)
>
> 3. When the device is removed, the struct is freed, meaning that a
> future malloc() can use that same space.

Yes, that sounds plausible. How can such allocations be prevented?
I assume U-Boot's malloc does not know anything about iPXE's or Linux's
allocation, and vice versa?

>
> 4. DMA traffic could then write over the malloc() region
>
> I'm not seeing where the Ethernet device's stop() is called. The
> dwmac_meson8b driver does not have a remove() method, so presumably
> DMA is still running after the device is removed. Probably the correct
> fix would be to add a remove() method to that driver.

Right. Of course this means that there's still a chance that some future
driver would again fail to do this. How can we prevent this?
Can some removal hook be added automatically upon registration?

>
> Regards,
> Simon

Thanks for looking into this!
Christian
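
To make the "removal hook" question concrete, here is a minimal,
self-contained sketch of the idea: the core removal path always calls the
driver's stop() before freeing the private data, so a driver without a
remove() method cannot leave DMA running into freed memory. This is not
U-Boot code. Every name in it (toy_priv, toy_driver, toy_device,
toy_device_remove() and so on) is hypothetical, and the "DMA" is only a
flag, so it illustrates the ordering rather than any real driver model API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical private state; stands in for something like struct dw_eth_dev. */
struct toy_priv {
	bool dma_running;
};

/* Hypothetical driver ops; remove() is optional, as in the case discussed. */
struct toy_driver {
	void (*start)(struct toy_priv *priv);
	void (*stop)(struct toy_priv *priv);
	void (*remove)(struct toy_priv *priv);	/* may be NULL */
};

struct toy_device {
	const struct toy_driver *drv;
	struct toy_priv *priv;
};

static int toy_active_dma;	/* number of devices whose DMA is still running */

static void toy_eth_start(struct toy_priv *priv)
{
	if (!priv->dma_running) {
		priv->dma_running = true;
		toy_active_dma++;
	}
}

/* Idempotent, so the core may call it even if the interface is already down. */
static void toy_eth_stop(struct toy_priv *priv)
{
	if (priv->dma_running) {
		priv->dma_running = false;
		toy_active_dma--;
	}
}

/* A driver that provides no remove() method. */
static const struct toy_driver toy_eth_driver = {
	.start = toy_eth_start,
	.stop = toy_eth_stop,
	.remove = NULL,
};

static int toy_device_probe(struct toy_device *dev, const struct toy_driver *drv)
{
	dev->drv = drv;
	dev->priv = calloc(1, sizeof(*dev->priv));
	return dev->priv ? 0 : -1;
}

/* Core-side removal: always quiesce first, then run the optional remove(). */
static void toy_device_remove(struct toy_device *dev)
{
	if (dev->drv->stop)
		dev->drv->stop(dev->priv);
	if (dev->drv->remove)
		dev->drv->remove(dev->priv);
	assert(!dev->priv->dma_running);	/* must hold before the memory is reused */
	free(dev->priv);
	dev->priv = NULL;
}

/* Returns the number of devices still doing DMA after removal. */
static int toy_demo(void)
{
	struct toy_device dev;

	if (toy_device_probe(&dev, &toy_eth_driver))
		return -1;
	dev.drv->start(dev.priv);
	toy_device_remove(&dev);
	return toy_active_dma;
}
```

The design choice here is to put the guarantee in the common removal path
instead of in each driver, which is what would protect against a future
driver forgetting it. Whether that fits the real driver model is a
separate question that this sketch does not answer.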