snikeguo opened a new issue, #16415: URL: https://github.com/apache/nuttx/issues/16415
### Is your feature request related to a problem? Please describe. RT ### Describe the solution you'd like During my recent research on STM32H7 + SDMMC, I identified several design issues in the NuttX device driver implementation. For example, the introduction of CONFIG_FAT_FORCE_INDIRECT is a flawed design decision. The current implementation logic of the fat_read function is as follows: ``` fat_read(...buf...) { while (!read_end) { #if CONFIG_FAT_FORCE_INDIRECT fat_hwread(buf + offset, 1~16 sectors); #else fat_hwread(tempbuff + offset, 1 sector); // memcpy(buf + offset, tempbuff, 512 bytes); #endif offset += ...; } } ``` I believe the root cause of this design stems from the chip's DMA transfer requirement, which mandates that data addresses must be N-byte aligned. However, in the fat_read function, the incoming buffer address may not meet this alignment requirement, necessitating this workaround. Otherwise, no developer would intentionally write such convoluted code. Reviewing the code in stm32_sdmmc.c reveals that since the upper layer (MMCSD) does not handle DMA alignment, the low-level driver (stm32_sdmmc.c) must address alignment and DCache management. The current driver hierarchy is: vfs->fat->mmcsd->stm32_sdmmc. We need to design a new mechanism to completely solve the DMA alignment issue. For instance, in scenarios like vfs->char device->stm32 uart driver (with DMA), similar alignment problems arise. Therefore, this mechanism should be applicable to all DMA transfer scenarios. It is critical that low-level drivers remain as simple as possible to encourage broader participation in driver development. I have observed the following development model: Design is more critical than implementation. Developers with philosophical thinking tend to design better models, but such individuals are rare. Most developers are implementers who excel at realizing existing designs and prefer simplicity. Take the flash driver as an example: if the upper layer handles Flash read/write alignment, the low-level driver can be significantly simplified. For example, the flash_read function could be: ``` flash_read(...buf...) { // Operate registers to write data } ``` This way, the low-level driver does not need to handle DMA alignment. Designing a DMA Alignment Mechanism Considering the vfs->fat->mmcsd->stm32_sdmmc hierarchy, where should a DMA Alignment Manager be placed? Let’s first consider placing it in the FAT layer: ``` uint32_t get_bsp_mmcsd_dma_align_bytes(void) // Layout: STM32 SDMMC BSP { return ARMV7M_DCACHE_LINESIZE; } uint32_t get_mmcsd_dma_align_bytes(void) // Layout: Device driver layer { return get_bsp_mmcsd_dma_align_bytes(); } typedef struct dma_align_manager { uint8_t *original_buf; uint8_t *aligned_buf; size_t aligned_buf_size; size_t aligned_buf_offset; } dma_align_manager_t; int dma_manager_init(dma_align_manager_t *manager, uint8_t *buf, size_t size, uint32_t align_bytes) // Layout: Independent component { // Initialize DMA alignment manager manager->original_buf = buf; manager->aligned_buf = (uint8_t *)align_malloc(size, align_bytes); // align_bytes: e.g., 4, 8, 16, 32, 64 if (!manager->aligned_buf) { // Allocation failed return -1; } manager->aligned_buf_offset = 0; return 0; } ssize_t fat_read(...uint8_t *buf...) // Layout: FAT layer { uint32_t align_bytes = get_mmcsd_dma_align_bytes(); if ((uint32_t)buf % align_bytes == 0) // Buffer is aligned { // Direct read: no alignment needed while (!read_end) { fat_hwread(buf + offset, 1~16 sectors); offset += ...; } } else // Buffer is unaligned { dma_align_manager_t dma_mgr; if (dma_manager_init(&dma_mgr, buf, max_sectors * 512, align_bytes) != 0) { return -ENOMEM; } while (!read_end) { fat_hwread(dma_mgr.aligned_buf + dma_mgr.aligned_buf_offset, 1~16 sectors); // Update offsets and copy data from aligned buffer to original buffer as needed dma_mgr.aligned_buf_offset += sectors * 512; } dma_manager_finalize(&dma_mgr); // Free aligned buffer } return total_bytes_read; } ``` This approach automatically handles alignment and requires only one aligned memory allocation for multiple fat_hwread calls. Alternative: Placing the Manager in the MMCSD Layer Now consider moving the DMA alignment logic to the MMCSD layer: ``` ssize_t fat_read(...uint8_t *buf...) // Layout: FAT layer { while (!read_end) { fat_hwread(buf + offset, 1~16 sectors); offset += ...; } } static ssize_t fat_hwread(uint8_t *buf, uint32_t sectors) // Layout: FAT layer { return mmcsd_read(buf, sectors); } ssize_t mmcsd_read(uint8_t *buf, uint32_t sectors) // Layout: MMCSD layer { uint32_t align_bytes = get_bsp_mmcsd_dma_align_bytes(); bool use_dma = get_bsp_mmcsd_use_dma(); if ((uint32_t)buf % align_bytes == 0) // Aligned buffer { if (use_dma) { return bsp_sdmmc_dma_read(buf, sectors); // Direct DMA transfer } else { return bsp_sdmmc_polled_read(buf, sectors); // Polled read } } else // Unaligned buffer { dma_align_manager_t dma_mgr; if (dma_manager_init(&dma_mgr, buf, sectors * 512, align_bytes) != 0) { return -ENOMEM; } ssize_t ret = 0; if (use_dma) { ret = bsp_sdmmc_dma_read(dma_mgr.aligned_buf, sectors); // DMA to aligned buffer } else { ret = bsp_sdmmc_polled_read(dma_mgr.aligned_buf, sectors); // Polled read to aligned buffer } // Copy data from aligned buffer to original buffer if necessary if (buf != dma_mgr.aligned_buf) { memcpy(buf, dma_mgr.aligned_buf, sectors * 512); } dma_manager_finalize(&dma_mgr); return ret; } } ``` While this requires memory allocation for each read, it is more appropriate for the device driver layer (MMCSD) to handle DMA alignment rather than the FAT layer. This keeps the FAT layer agnostic to hardware-specific alignment requirements, adhering to the principle that low-level drivers should handle hardware details. General Applicability For other scenarios like vfs->char device->stm32 uart driver (with DMA), the DMA Alignment Manager should be integrated into the char device layer (e.g., UART driver), following the same logic: let the hardware-specific layer handle alignment, keeping upper layers (like VFS) abstract and simplified. Key Takeaways Separation of Concerns: Hardware-specific alignment logic belongs in low-level drivers (e.g., MMCSD, UART), not in upper layers (FAT, VFS). Simpler Low-Level Drivers: By abstracting alignment via a reusable manager, low-level drivers become easier to develop and maintain, welcoming more contributors. Reusability: The DMA Alignment Manager can be generalized across all DMA-enabled drivers, reducing code duplication and improving consistency. Thank you for reviewing this technical analysis! Let me know if further refinements are needed. My English is not very good, and the above content is translated by AI. ### Describe alternatives you've considered _No response_ ### Verification - [x] I have verified before submitting the report. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@nuttx.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org