On 2025/8/21 11:33, Gao Xiang wrote:
Hi Friendy,

On 2025/8/21 11:17, friendy...@sony.com wrote:
Hi, Gao,

So I tend to just constrain the case to your limited case
first, could you explain more if chunk deduplication is
needed for your scenarios? and what's your real `chunksize`?

Chunk deduplication is needed.

As I wrote in the commit message, we expect the scenario below:

1. mount with -o dax=always,
2. the application calls mmap(addr, file),
3. the application reads from addr, and a page fault is triggered.
We hope that, in the kernel, the fault can be handled by 
erofs_dax_vm_ops.huge_fault() without falling back to erofs_dax_vm_ops.fault().

I totally understand the runtime side, but in short:


This requires the file body on blob devices to be aligned to the huge page 
size (2M), and the deduplication unit to be 2M as well. We can specify 
--dsunit=512 and --chunksize=2*1024*1024 to fulfill this.
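
As a quick sanity check of these values, here is a minimal sketch of the arithmetic. It assumes a 4096-byte block size (implied by "dsunit = 512 = 2M" elsewhere in this thread) and that `--dsunit` is counted in blocks; the helper name is made up for illustration and is not part of erofs-utils.

```python
BLOCK_SIZE = 4096            # assumed filesystem block size in bytes
PMD_SIZE = 2 * 1024 * 1024   # 2M huge page size discussed in this thread


def dsunit_bytes(dsunit_blocks, block_size=BLOCK_SIZE):
    """Convert a dsunit value given in blocks to bytes (hypothetical helper)."""
    return dsunit_blocks * block_size


dsunit = dsunit_bytes(512)       # --dsunit=512
chunksize = 2 * 1024 * 1024      # --chunksize=2*1024*1024

# Both the alignment unit and the deduplication unit match the huge page size.
print(dsunit == PMD_SIZE, chunksize == PMD_SIZE)
```

With both values equal to 2M, every chunk is exactly one huge page long, which is the precondition described above.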

I don't think we need a new command option.
Currently, '--dsunit' can be set when formatting a blobdev. The following 
cmdline completes successfully, so the user will certainly think mkfs.erofs 
has applied the --dsunit alignment.
But actually, it has not. This patch simply makes the actual behavior match 
what the cmdline suggests.

mkfs.erofs --blobdev /dev/sdb1 --dsunit 512 ......

If `--dsunit` actually does not work on blobdev, mkfs.erofs should print a 
warning message to the user.

My concern is why `--chunksize=4096 --dsunit=512` will not
lead each 4k chunk to a 2M data boundary; is it obvious?

chunksize = 4096
dsunit = 512 = 2M

inode A (8k)    2M, 4M
inode B (12k)    6M, 2M, 4M, 8M?

Are you sure there will be no such use case in the
future? Mixing `--chunksize=4096 --dsunit=512` seems
non-obvious for this case.
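
To make the layout in question concrete, here is a rough sketch of one naive reading of "align every new chunk to dsunit" combined with chunk deduplication. It is not the patch's code: the content IDs, the choice of which chunks of inode B duplicate inode A, and the starting offset of 2M are all assumptions picked for illustration.

```python
MiB = 1024 * 1024
CHUNKSIZE = 4096
DSUNIT = 2 * MiB  # 512 blocks, assuming 4096-byte blocks


def round_up(x, align):
    """Round x up to the next multiple of align."""
    return (x + align - 1) // align * align


def place_chunks(files, start=DSUNIT):
    """Place chunks on a blob, aligning each *new* chunk to DSUNIT.

    files: list of (name, [content ids]); equal ids mean identical chunks.
    Returns {name: [physical offsets]}. Hypothetical model, not mkfs.erofs.
    """
    seen = {}          # content id -> physical offset already assigned
    next_off = start   # assumed first usable offset on the blob
    layout = {}
    for name, chunks in files:
        offsets = []
        for cid in chunks:
            if cid not in seen:  # new chunk: align, then consume one chunk
                next_off = round_up(next_off, DSUNIT)
                seen[cid] = next_off
                next_off += CHUNKSIZE
            offsets.append(seen[cid])  # deduplicated chunks reuse the offset
        layout[name] = offsets
    return layout


# inode A (8k) has two chunks; inode B (12k) has one new chunk plus A's two.
layout = place_chunks([("A", ["x", "y"]), ("B", ["z", "x", "y"])])
for name, offsets in layout.items():
    print(name, [f"{off // MiB}M" for off in offsets])
```

Under these assumptions each 4k chunk lands on its own 2M boundary, so the blob is mostly padding and inode B's chunks are not physically contiguous.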

Or, as I mentioned before, I'm fine with having each chunk that is logically
aligned to `dsunit` (but not any deduplicated chunk in this logical range)
aligned to the dsunit value, since it enables PMD-mapping as you mentioned.

But if there are some deduplicated chunks at the logical dsunit
boundary, don't align them at all since there is no real benefit.
Although I'm still not sure what the default behavior of `dsunit`
should be for chunks.

Thanks,
Gao Xiang


Thanks,
Gao Xiang



Best Regards
Friendy Su

________________________________________
From: Gao Xiang <hsiang...@linux.alibaba.com>
Sent: Thursday, August 21, 2025 10:00
To: Su, Friendy; linux-erofs@lists.ozlabs.org
Cc: Mo, Yuezhang; Palmer, Daniel (SGC)
Subject: Re: [PATCH v1] erofs-utils: mkfs: Implement 'dsunit' alignment on 
blobdev

On 2025/8/20 17:38, friendy...@sony.com wrote:
Hi, Gao,

What's your `--chunksize` ? consider the following:

    chunksize = 4096
    dsunit = 512 = 2M

and two inodes:

inode A (8k)    2M, 2M+4k

inode B (12k)   4M, 2M, 4M+4k, 4M+8k?

Is it possible? What's the expected behavior in
this case?

Yes. This is the expected behavior. See runtime below:

I understand that is the expected behavior according to this
patch, but I'm just unsure if it's an expected behavior for
the future wider setups (because some users may use `--dsunit`
for other usage).

So I tend to just constrain the case to your limited case
first, could you explain more if chunk deduplication is
needed for your scenarios? and what's your real `chunksize`?

Maybe adding another command option for this is better.

Thanks,
Gao Xiang


