On Thu Aug 28, 2025 at 8:27 PM JST, Danilo Krummrich wrote: > On 8/26/25 6:07 AM, Alexandre Courbot wrote: >> /// Structure encapsulating the firmware blobs required for the GPU to >> operate. >> #[expect(dead_code)] >> pub(crate) struct Firmware { >> @@ -36,7 +123,10 @@ pub(crate) struct Firmware { >> booter_unloader: BooterFirmware, >> /// GSP bootloader, verifies the GSP firmware before loading and >> running it. >> gsp_bootloader: RiscvFirmware, >> - gsp: firmware::Firmware, >> + /// GSP firmware. >> + gsp: Pin<KBox<GspFirmware>>, > > Is there a reason why we don't just propagate it through struct Gpu, which > uses > pin-init already? > > You can make Firmware pin_data too and then everything is within the single > allocation of struct Gpu.
I tried doing that at first, and hit the problem that the `impl PinInit` returned by `GspFirmware::new` borrows a reference to the GSP firmware binary loaded by `Firmware::new` - when `Firmware::new` returns, the firmware gets freed, and the borrow checker complains. We could move the GSP firmware loading code into the `pin_init!` of `Firmware::new`, but now we hit another problem: in `Gpu::new` the following code is executed: FbLayout::new(chipset, bar, &fw.gsp_bootloader, &fw.gsp)? which requires the `Firmware` instance, which doesn't exist yet as the `Gpu` object isn't initialized until the end of the method. So we could move `FbLayout`, and everything else created by `Gpu::new` to become members of the `Gpu` instance. It does make sense actually: this `new` method is doing a lot of stuff, such as running FRTS, and with Alistair's series it even runs Booter, the sequencer and so on. Maybe we should move all firmware execution to a separate method that is called by `probe` after the `Gpu` is constructed, as right now the `Gpu` constructor looks like it does a bit more than it should. ... but even when doing that, `Firmware::new` and `FbLayout::new` still require a reference to the `Bar`, and... you get the idea. :) So I don't think the current design allows us to do that easily or at all, and even if it does, it will be at a significant cost in code clarity. There is also the fact that I am considering making the firmware member of `Gpu` a trait object: the boot sequence is so different between pre and post-Hopper that I don't think it makes sense to share the same `Firmware` structure between the two. I would rather see `Firmware` as an opaque trait object, which provides high-level methods such as "start GSP" behind which the specifics of each GPU family are hidden. If we go with this design, `Firmware` will become a trait object and so cannot be pinned into `Gpu`. This doesn't change my observation that `Gpu::new` should not IMHO do steps like booting the GSP - it should just acquire the resources it needs, return the pinned GPU object, and then `probe` can continue the boot sequence. Having the GPU object pinned and constructed early simplifies things quite a bit as the more we progress with boot, the harder it becomes to construct everything in place (and the `PinInit` closure also becomes more and more complex). I'm still laying down the general design, but I'm pretty convinced that having `Firmware` as a trait object is the right way to abstract the differences between GPU families.