[I] [FEATURE] The design issues regarding DMA alignment in device drivers [nuttx]

via GitHub Tue, 20 May 2025 05:08:37 -0700


snikeguo opened a new issue, #16415:
URL: https://github.com/apache/nuttx/issues/16415


   ### Is your feature request related to a problem? Please describe.
   
   RT
   
   ### Describe the solution you'd like
   
   During my recent research on STM32H7 + SDMMC, I identified several design 
issues in the NuttX device driver implementation. For example, the introduction 
of CONFIG_FAT_FORCE_INDIRECT is a flawed design decision. The current 
implementation logic of the fat_read function is as follows:
   ```
   fat_read(...buf...)
   {
       while (!read_end)
       {
           
   #if CONFIG_FAT_FORCE_INDIRECT
           fat_hwread(buf + offset, 1~16 sectors);
   #else
           fat_hwread(tempbuff + offset, 1 sector); // 
           memcpy(buf + offset, tempbuff, 512 bytes);
   #endif
           offset += ...;
           
       }
   }
   ```
   I believe the root cause of this design stems from the chip's DMA transfer 
requirement, which mandates that data addresses must be N-byte aligned. 
However, in the fat_read function, the incoming buffer address may not meet 
this alignment requirement, necessitating this workaround. Otherwise, no 
developer would intentionally write such convoluted code.
   
   Reviewing the code in stm32_sdmmc.c reveals that since the upper layer 
(MMCSD) does not handle DMA alignment, the low-level driver (stm32_sdmmc.c) 
must address alignment and DCache management. The current driver hierarchy is: 
vfs->fat->mmcsd->stm32_sdmmc.
   
   We need to design a new mechanism to completely solve the DMA alignment 
issue. For instance, in scenarios like vfs->char device->stm32 uart driver 
(with DMA), similar alignment problems arise. Therefore, this mechanism should 
be applicable to all DMA transfer scenarios.
   
   It is critical that low-level drivers remain as simple as possible to 
encourage broader participation in driver development. I have observed the 
following development model:
   
   Design is more critical than implementation.
   Developers with philosophical thinking tend to design better models, but 
such individuals are rare. Most developers are implementers who excel at 
realizing existing designs and prefer simplicity.
   
   Take the flash driver as an example: if the upper layer handles Flash 
read/write alignment, the low-level driver can be significantly simplified. For 
example, the flash_read function could be:
   ```
   flash_read(...buf...)
   {
       // Operate registers to write data
   }
   ```
   This way, the low-level driver does not need to handle DMA alignment.
   Designing a DMA Alignment Mechanism
   Considering the vfs->fat->mmcsd->stm32_sdmmc hierarchy, where should a DMA 
Alignment Manager be placed? Let’s first consider placing it in the FAT layer:
   ```
   uint32_t get_bsp_mmcsd_dma_align_bytes(void) // Layout: STM32 SDMMC BSP
   {
       return ARMV7M_DCACHE_LINESIZE;
   }
   
   uint32_t get_mmcsd_dma_align_bytes(void) // Layout: Device driver layer
   {
       return get_bsp_mmcsd_dma_align_bytes();
   }
   
   typedef struct dma_align_manager 
   {
       uint8_t *original_buf;
       uint8_t *aligned_buf;
       size_t aligned_buf_size;
       size_t aligned_buf_offset;
   } dma_align_manager_t;
   
   int dma_manager_init(dma_align_manager_t *manager, uint8_t *buf, size_t 
size, uint32_t align_bytes) // Layout: Independent component
   {
       // Initialize DMA alignment manager
       manager->original_buf = buf;
       manager->aligned_buf = (uint8_t *)align_malloc(size, align_bytes); // 
align_bytes: e.g., 4, 8, 16, 32, 64
       if (!manager->aligned_buf)
       {
           // Allocation failed
           return -1;
       }
       manager->aligned_buf_offset = 0;
       return 0;
   }
   
   ssize_t fat_read(...uint8_t *buf...) // Layout: FAT layer
   {
       uint32_t align_bytes = get_mmcsd_dma_align_bytes();
       if ((uint32_t)buf % align_bytes == 0) // Buffer is aligned
       {
           // Direct read: no alignment needed
           while (!read_end)
           {
               fat_hwread(buf + offset, 1~16 sectors);
               offset += ...;
           }
       }
       else // Buffer is unaligned
       {
           dma_align_manager_t dma_mgr;
           if (dma_manager_init(&dma_mgr, buf, max_sectors * 512, align_bytes) 
!= 0)
           {
               return -ENOMEM;
           }
           while (!read_end)
           {
               fat_hwread(dma_mgr.aligned_buf + dma_mgr.aligned_buf_offset, 
1~16 sectors);
               // Update offsets and copy data from aligned buffer to original 
buffer as needed
               dma_mgr.aligned_buf_offset += sectors * 512;
           }
           dma_manager_finalize(&dma_mgr); // Free aligned buffer
       }
       return total_bytes_read;
   }
   ```
   This approach automatically handles alignment and requires only one aligned 
memory allocation for multiple fat_hwread calls.
   Alternative: Placing the Manager in the MMCSD Layer
   Now consider moving the DMA alignment logic to the MMCSD layer:
   ```
   ssize_t fat_read(...uint8_t *buf...) // Layout: FAT layer
   {
       while (!read_end)
       {
           fat_hwread(buf + offset, 1~16 sectors);
           offset += ...;
       }
   }
   
   static ssize_t fat_hwread(uint8_t *buf, uint32_t sectors) // Layout: FAT 
layer
   {
       return mmcsd_read(buf, sectors);
   }
   
   ssize_t mmcsd_read(uint8_t *buf, uint32_t sectors) // Layout: MMCSD layer
   {
       uint32_t align_bytes = get_bsp_mmcsd_dma_align_bytes();
       bool use_dma = get_bsp_mmcsd_use_dma();
       
       if ((uint32_t)buf % align_bytes == 0) // Aligned buffer
       {
           if (use_dma)
           {
               return bsp_sdmmc_dma_read(buf, sectors); // Direct DMA transfer
           }
           else
           {
               return bsp_sdmmc_polled_read(buf, sectors); // Polled read
           }
       }
       else // Unaligned buffer
       {
           dma_align_manager_t dma_mgr;
           if (dma_manager_init(&dma_mgr, buf, sectors * 512, align_bytes) != 0)
           {
               return -ENOMEM;
           }
           
           ssize_t ret = 0;
           if (use_dma)
           {
               ret = bsp_sdmmc_dma_read(dma_mgr.aligned_buf, sectors); // DMA 
to aligned buffer
           }
           else
           {
               ret = bsp_sdmmc_polled_read(dma_mgr.aligned_buf, sectors); // 
Polled read to aligned buffer
           }
           
           // Copy data from aligned buffer to original buffer if necessary
           if (buf != dma_mgr.aligned_buf)
           {
               memcpy(buf, dma_mgr.aligned_buf, sectors * 512);
           }
           
           dma_manager_finalize(&dma_mgr);
           return ret;
       }
   }
   ```
   While this requires memory allocation for each read, it is more appropriate 
for the device driver layer (MMCSD) to handle DMA alignment rather than the FAT 
layer. This keeps the FAT layer agnostic to hardware-specific alignment 
requirements, adhering to the principle that low-level drivers should handle 
hardware details.
   General Applicability
   For other scenarios like vfs->char device->stm32 uart driver (with DMA), the 
DMA Alignment Manager should be integrated into the char device layer (e.g., 
UART driver), following the same logic: let the hardware-specific layer handle 
alignment, keeping upper layers (like VFS) abstract and simplified.
   Key Takeaways
   Separation of Concerns: Hardware-specific alignment logic belongs in 
low-level drivers (e.g., MMCSD, UART), not in upper layers (FAT, VFS).
   Simpler Low-Level Drivers: By abstracting alignment via a reusable manager, 
low-level drivers become easier to develop and maintain, welcoming more 
contributors.
   Reusability: The DMA Alignment Manager can be generalized across all 
DMA-enabled drivers, reducing code duplication and improving consistency.
   
   Thank you for reviewing this technical analysis! Let me know if further 
refinements are needed.
   My English is not very good, and the above content is translated by AI.
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Verification
   
   - [x] I have verified before submitting the report.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@nuttx.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[I] [FEATURE] The design issues regarding DMA alignment in device drivers [nuttx]

Reply via email to