Introduce DMA_BUF_IOCTL_RW_FILE ioctl for direct file I/O on dma-buf objects.
Current flow:
  1. Allocate dma-buf (buf_fd)    # get buffer descriptor
  2. Map memory (vaddr)           # access via virtual address
  3. File ops: open/lseek/read    # read into mapped memory

Problem:
- dma-buf has no direct I/O support
- ~70% of read time is spent on page cache management and memcpy
- Buffered I/O adds latency and power cost

Solution:
Add an rw_file callback to the dma-buf exporter. While holding the
buffer's sgtable exclusively, the exporter:
- builds a bio_vec over the buffer pages and sets the IOCB_DIRECT flag
- calls vfs_iocb_iter_read() to perform direct I/O

Improved usage:
  dmabuf_fd = dmabuf_alloc(len, heap_fd)
  file_fd = open(file_path, O_RDONLY)
  if (direct_io)
      arg.flags |= DMA_BUF_RW_FLAGS_DIRECT
  ioctl(dmabuf_fd, DMA_BUF_IOCTL_RW_FILE, &arg)

(Hedged sketches of the baseline flow, the new userspace usage, and a
possible exporter-side implementation are appended at the end of this
letter.)

Performance gains:
- Throughput: 1032 MB/s -> 3776 MB/s (UFS 4.0 @ 4 GB/s)
- Zero page cache overhead
- The direct path eliminates memory copies

Use cases:
- AI model loading
- Real-time data streaming
- Task snapshot storage

vs udmabuf:
- udmabuf creation is slower
- udmabuf direct I/O is slower than dma-buf direct I/O
- sendfile still incurs one copy, vs the dma-buf path's zero copies

Test (32x 32 MB buffers, 1 GB file, UFS @ 4 GB/s, CPU @ 1 GHz):

Metric                  | alloc (ms) | read (ms) | total (ms)
------------------------|------------|-----------|-----------
udmabuf buffer read     |        539 |      2017 |       2555
udmabuf direct read     |        522 |       658 |       1179
udmabuf buffer sendfile |        505 |      1040 |       1546
udmabuf direct sendfile |        510 |      2269 |       2780
dmabuf buffer read      |         51 |      1068 |       1118
patch 1-2 direct read   |         52 |       297 |        349

v1: https://lore.kernel.org/all/20250513092803.2096-1-tao.wang...@honor.com

v1 -> v2:
- The dma-buf exporter now verifies exclusive access to the dma-buf's
  sgtable before issuing I/O.

wangtao (2):
  dmabuf: add DMA_BUF_IOCTL_RW_FILE
  dmabuf/heaps: implement DMA_BUF_IOCTL_RW_FILE for system_heap

 drivers/dma-buf/dma-buf.c           |   8 ++
 drivers/dma-buf/heaps/system_heap.c | 121 ++++++++++++++++++++++++++++
 include/linux/dma-buf.h             |  15 ++++
 include/uapi/linux/dma-buf.h        |  28 +++++++
 4 files changed, 172 insertions(+)

--
2.17.1
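
Appendix: hedged illustrative sketches. First, the baseline buffered
flow from "Current flow" above as a minimal C sketch; error handling
is omitted and buf_fd is assumed to be an already-allocated dma-buf fd.

  #include <fcntl.h>
  #include <sys/mman.h>
  #include <unistd.h>

  /* Baseline: mmap the dma-buf, then read() the file into the mapping.
   * This read() goes through the page cache plus a memcpy, which is
   * where most of the time is spent per the numbers above.
   */
  void legacy_load(int buf_fd, const char *path, size_t len)
  {
          void *vaddr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                             MAP_SHARED, buf_fd, 0);
          int fd = open(path, O_RDONLY);

          lseek(fd, 0, SEEK_SET);
          read(fd, vaddr, len);   /* page cache + memcpy into the buffer */

          close(fd);
          munmap(vaddr, len);
  }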
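
Next, a sketch of the improved userspace flow. The argument struct
name and field names below are assumptions for illustration only; the
authoritative definitions (including DMA_BUF_IOCTL_RW_FILE and
DMA_BUF_RW_FLAGS_DIRECT) come from the include/uapi/linux/dma-buf.h
changes in patch 1, so this only builds against the patched header.

  #include <fcntl.h>
  #include <sys/ioctl.h>
  #include <unistd.h>
  #include <linux/dma-buf.h>      /* patched uapi header from this series */

  int dmabuf_read_file(int dmabuf_fd, const char *path, __u64 len,
                       int direct_io)
  {
          struct dma_buf_rw_file arg = {0};   /* assumed struct name */

          arg.file_fd = open(path, O_RDONLY); /* assumed field names */
          if (arg.file_fd < 0)
                  return -1;
          arg.len = len;
          if (direct_io)
                  arg.flags |= DMA_BUF_RW_FLAGS_DIRECT;

          if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_RW_FILE, &arg) < 0) {
                  close(arg.file_fd);
                  return -1;
          }
          close(arg.file_fd);
          return 0;
  }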
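
Last, a hedged sketch of the exporter-side direct read path described
under "Solution": build a bio_vec over the buffer's pages, mark the
kiocb with IOCB_DIRECT, and call vfs_iocb_iter_read(). The helper name
and signature are assumptions, not the series' actual code; the real
callback (patch 2) must also first verify exclusive access to the
sgtable per the v2 change, and satisfy direct-I/O alignment rules.

  #include <linux/bvec.h>
  #include <linux/fs.h>
  #include <linux/scatterlist.h>
  #include <linux/slab.h>
  #include <linux/uio.h>

  static ssize_t sketch_direct_read(struct file *file, struct sg_table *sgt,
                                    loff_t pos, size_t len)
  {
          struct scatterlist *sg;
          struct bio_vec *bvec;
          struct iov_iter iter;
          struct kiocb kiocb;
          unsigned int i;
          ssize_t ret;

          bvec = kvcalloc(sgt->orig_nents, sizeof(*bvec), GFP_KERNEL);
          if (!bvec)
                  return -ENOMEM;

          /* Point the iterator straight at the dma-buf pages: no bounce copy. */
          for_each_sgtable_sg(sgt, sg, i)
                  bvec_set_page(&bvec[i], sg_page(sg), sg->length, sg->offset);
          iov_iter_bvec(&iter, ITER_DEST, bvec, sgt->orig_nents, len);

          init_sync_kiocb(&kiocb, file);
          kiocb.ki_pos = pos;
          kiocb.ki_flags |= IOCB_DIRECT;      /* bypass the page cache */

          ret = vfs_iocb_iter_read(file, &kiocb, &iter);

          kvfree(bvec);
          return ret;
  }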