On Mon Jun 2, 2025 at 9:06 PM JST, Danilo Krummrich wrote:
> On Wed, May 21, 2025 at 03:45:09PM +0900, Alexandre Courbot wrote:
>> Add the common Falcon code and HAL for Ampere GPUs, and instantiate the
>> GSP and SEC2 Falcons that will be required to boot the GSP.
>
> Maybe add a few more words about the architectural approach taken here?

Sure, note this will only be valid for Ampere though.

>
>> +/// Valid values for the `size` field of the 
>> [`crate::regs::NV_PFALCON_FALCON_DMATRFCMD`] register.
>> +#[repr(u8)]
>> +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq)]
>> +pub(crate) enum DmaTrfCmdSize {
>> +    /// 256 bytes transfer.
>> +    #[default]
>> +    Size256B = 0x6,
>
> Can we use a constant from `regs` to assign this value? Or is *this* meant to 
> be
> the corresponding constant?

This is the corresponding constant, as meant by the register value.

>
>> +}
>
> I wonder what's the correct thing to do for enum variants that do *not* have 
> an
> arbitrary value, but match a specific register value in general.
>
> Should those be part of the `regs` module?

Both approaches seem possible. I like to keep `regs` focused on
registers and importing the types it needs from other modules. After
all, these types are used in the code as well, so putting then into `regs`
would turn that file into a mix of completely unrelated types. IMHO
having these in their respective module is cleaner, and also helps
keeping their names short as we don't need to prefix the type with the
module's name (i.e. if the type above was declared in `regs` it would
likely have to be named `FalconDmaTrfCmdSize`).

>
>> +    /// Wait for memory scrubbing to complete.
>> +    fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result {
>> +        util::wait_on(Duration::from_millis(20), || {
>
> I general, I think there can be quite a lot of parameters such timeouts can
> depend on, e.g. chipset, firmware version, etc.
>
> I think it could make sense to establish a rule for the project that for such
> timeouts we require a dedicated `// TIMEOUT: ` comment that mentions the worst
> case scenario, which we derived this timeout value from.

Not opposed to it. When the timeouts differ for some reason, I'd
recommend putting the different requirements into their own HAL and use
the accurate/expected values for each though.

>
>> +    /// Perform a DMA write according to `load_offsets` from `dma_handle` 
>> into the falcon's
>> +    /// `target_mem`.
>> +    ///
>> +    /// `sec` is set if the loaded firmware is expected to run in secure 
>> mode.
>> +    fn dma_wr(
>> +        &self,
>> +        bar: &Bar0,
>> +        dma_handle: bindings::dma_addr_t,
>> +        target_mem: FalconMem,
>> +        load_offsets: FalconLoadTarget,
>> +        sec: bool,
>> +    ) -> Result {
>> +        const DMA_LEN: u32 = 256;
>> +
>> +        // For IMEM, we want to use the start offset as a virtual address 
>> tag for each page, since
>> +        // code addresses in the firmware (and the boot vector) are virtual.
>> +        //
>> +        // For DMEM we can fold the start offset into the DMA handle.
>> +        let (src_start, dma_start) = match target_mem {
>> +            FalconMem::Imem => (load_offsets.src_start, dma_handle),
>> +            FalconMem::Dmem => (
>> +                0,
>> +                dma_handle + load_offsets.src_start as bindings::dma_addr_t,
>
> We should make this a method of CoherentAllocation, such that we can get a
> boundary check on the offset calculation.

Do you mean getting a dma_handle with a specific offset? Guess this
could be an opportunity to also define a type for DMA handles as Lyude
suggested.

>
> For this purpose dma_rw() should also have the `F: FalconFirmware<Target = E>`
> generic I think.
>
> (No worries about the dependencies; I can create a shared tag for the DMA
> patches and merge it into the nova tree, such that it doesn't block this
> series.)
>
>> +            // Wait for the transfer to complete.
>> +            util::wait_on(Duration::from_millis(2000), || {
>
> Yeah, I really think some timeout justification would be nice.

Is "OpenRM does this" an acceptable justification? :) I think here we
are just waiting some arbitrarily large amount of time to be confident
that the transfer has indeed failed.

>
>> +/// Hardware Abstraction Layer for Falcon cores.
>> +///
>> +/// Implements chipset-specific low-level operations. The trait is generic 
>> against [`FalconEngine`]
>> +/// so its `BASE` parameter can be used in order to avoid runtime bound 
>> checks when accessing
>> +/// registers.
>> +pub(crate) trait FalconHal<E: FalconEngine>: Sync {
>> +    // Activates the Falcon core if the engine is a risvc/falcon dual 
>> engine.
>> +    fn select_core(&self, _falcon: &Falcon<E>, _bar: &Bar0) -> Result<()> {
>> +        Ok(())
>> +    }
>> +
>> +    /// Returns the fused version of the signature to use in order to run a 
>> HS firmware on this
>> +    /// falcon instance. `engine_id_mask` and `ucode_id` are obtained from 
>> the firmware header.
>> +    fn get_signature_reg_fuse_version(
>
> Unless the method increases a reference count, please don't use the 'get'
> prefix.

Ack.

>
>> +        &self,
>> +        falcon: &Falcon<E>,
>> +        bar: &Bar0,
>> +        engine_id_mask: u16,
>> +        ucode_id: u8,
>> +    ) -> Result<u32>;
>> +
>> +    // Program the boot ROM registers prior to starting a secure firmware.
>> +    fn program_brom(&self, falcon: &Falcon<E>, bar: &Bar0, params: 
>> &FalconBromParams)
>> +        -> Result<()>;
>> +}
>> +
>> +impl Chipset {
>> +    /// Returns a boxed falcon HAL adequate for this chipset.
>> +    ///
>> +    /// We use a heap-allocated trait object instead of a statically 
>> defined one because the
>> +    /// generic `FalconEngine` argument makes it difficult to define all 
>> the combinations
>> +    /// statically.
>> +    ///
>> +    /// TODO: replace the return type with `KBox` once it gains the ability 
>> to host trait objects.
>
> I think we can do this for v5. :-)

Happy to! :)

Reply via email to