Hi, Gao,

> So I tend to just constrain the case to your limited case
> first, could you explain more if chunk deduplication is
> needed for your scenarios? and what's your real `chunksize`?

Chunk deduplication is needed. As I wrote in the commit message, we expect the scenario below:

1. Mount with -o dax=always.
2. The application calls mmap(addr, file).
3. The application reads from addr, and a page fault is triggered.

We hope that in the kernel the fault can be handled by erofs_dax_vm_ops.huge_fault(), without falling back to erofs_dax_vm_ops.fault().

This requires file bodies on blob devices to be aligned to the huge page size (2M), and the deduplication unit to be 2M as well. We can specify --dsunit=512 and --chunksize=2*1024*1024 to fulfill this.

I don't think we need a new command option.

Currently, '--dsunit' can be set when formatting a blobdev. The following command line completes successfully. The user will certainly assume that mkfs.erofs has applied the --dsunit alignment, but it actually has not. This patch simply makes the actual behavior match what the command line suggests.

mkfs.erofs --blobdev /dev/sdb1 --dsunit 512 ......

If `--dsunit` does not actually work on blobdev, mkfs.erofs should print a warning message to the user.

Best Regards
Friendy Su

________________________________________
From: Gao Xiang <hsiang...@linux.alibaba.com>
Sent: Thursday, August 21, 2025 10:00
To: Su, Friendy; linux-erofs@lists.ozlabs.org
Cc: Mo, Yuezhang; Palmer, Daniel (SGC)
Subject: Re: [PATCH v1] erofs-utils: mkfs: Implement 'dsunit' alignment on blobdev

On 2025/8/20 17:38, friendy...@sony.com wrote:
> Hi, Gao,
>
>> What's your `--chunksize` ? consider the following:
>
> chunksize = 4096
> dsunit = 512 = 2M
>
>> and two inodes:
>
>> inode A (8k) 2M, 2M+4k
>
>> inode B (12k) 4M, 2M, 4M+4k, 4M+8k?
>
>> Is it possible? what's the expected behavior of
>> this case.
>
> Yes. This is the expected behavior.
> See runtime below:

I understand that is the expected behavior according to this patch,
but I'm just unsure if it's an expected behavior for the future
wider setups (because some users may use `--dsunit` for other
usage).

So I tend to just constrain the case to your limited case
first, could you explain more if chunk deduplication is
needed for your scenarios? and what's your real `chunksize`?

Maybe adding another command option for this is better.

Thanks,
Gao Xiang
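
For readers following the numbers in this thread, the alignment arithmetic can be sketched as below. This is only an illustration, not code from erofs-utils or the kernel: it assumes a 4096-byte block size (consistent with "dsunit = 512 = 2M" above), and the helper names `dsunit_bytes`, `round_up` and `meets_2m_requirement` are hypothetical. The last helper only checks the condition stated in the reply (data alignment and deduplication unit both multiples of 2M). It does not model what the kernel actually does on a fault.

```python
# Illustrative arithmetic only; not part of erofs-utils.
BLOCK_SIZE = 4096              # assumed block size in bytes
HUGE_PAGE = 2 * 1024 * 1024    # 2M huge page size mentioned in the thread


def dsunit_bytes(dsunit_blocks, block_size=BLOCK_SIZE):
    """Convert a --dsunit value (counted in blocks) to bytes."""
    return dsunit_blocks * block_size


def round_up(offset, alignment):
    """Round a byte offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def meets_2m_requirement(dsunit_blocks, chunksize, block_size=BLOCK_SIZE):
    """Check the condition stated in the reply: both the data alignment
    and the deduplication unit (chunk size) are multiples of 2M."""
    align = dsunit_bytes(dsunit_blocks, block_size)
    return align % HUGE_PAGE == 0 and chunksize % HUGE_PAGE == 0


if __name__ == "__main__":
    print(dsunit_bytes(512) == HUGE_PAGE)
    print(round_up(8192, dsunit_bytes(512)))
    print(meets_2m_requirement(512, 2 * 1024 * 1024))
    print(meets_2m_requirement(512, 4096))
```

With --dsunit=512 and --chunksize=2*1024*1024 both quantities are 2M, which is the configuration proposed in the reply. With chunksize = 4096, as in the quoted example, the deduplication unit is smaller than a huge page, so the check fails.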