Hi, Gao, Thanks for your review !
> As for this patch, what if the inode itself is > chunk-deduplicated, could we apply this if the inode > only has one new chunk instead at least for now? Do you mean inode has 3 chunks, chunk#2 and chunk#3 duplicate chunck#1? This patch only makes the 1st chunk exactly written to blobdev aligned on dsunit. Deduplicated chunks will not be written to blobdev. In example above, chunk#1 is written to dsunit aligned block addr, chunk#2,#3 are not written. The next file will be written from next dsunit alignment addr. Best Regards Friendy Su ________________________________________ From: Gao Xiang <hsiang...@linux.alibaba.com> Sent: Wednesday, August 20, 2025 15:44 To: Su, Friendy; linux-erofs@lists.ozlabs.org Cc: Mo, Yuezhang; Palmer, Daniel (SGC) Subject: Re: [PATCH v1] erofs-utils: mkfs: Implement 'dsunit' alignment on blobdev Hi Friendly, On 2025/8/20 15: 23, Friendy Su wrote: > Set proper 'dsunit' to let file body align on huge page on blobdev, > > where 'dsunit' * 'blocksize' = huge page size (2M). > > When do mmap() a file mounted with dax=always, Hi Friendly, On 2025/8/20 15:23, Friendy Su wrote: > Set proper 'dsunit' to let file body align on huge page on blobdev, > > where 'dsunit' * 'blocksize' = huge page size (2M). > > When do mmap() a file mounted with dax=always, aligning on huge page > makes kernel map huge page(2M) per page fault exception, compared with > mapping normal page(4K) per page fault. > > This greatly improves mmap() performance by reducing times of page > fault being triggered. > > Signed-off-by: Friendy Su <friendy...@sony.com> > Reviewed-by: Yuezhang Mo <yuezhang...@sony.com> > Reviewed-by: Daniel Palmer <daniel.pal...@sony.com> Sigh, thanks for the patch! Actually the blob chunk implementation (this file) is now an implementation debt since: 1) In principle, chunks (and deduplicated pclusters) should across blobs (considering the main device is also a blob); 2) Each blob should have its own space allocation context which is independent to the in-memory chunk records... My mid-term plan tends to use the current metabox way (i.e. use buffer management for all extra blobs too..) > --- > lib/blobchunk.c | 15 +++++++++++++++ > 1 file changed, 15 insertions(+) > > diff --git a/lib/blobchunk.c b/lib/blobchunk.c > index bbc69cf..e47afb5 100644 > --- a/lib/blobchunk.c > +++ b/lib/blobchunk.c > @@ -309,6 +309,21 @@ int erofs_blob_write_chunked_file(struct erofs_inode > *inode, int fd, > minextblks = BLK_ROUND_UP(sbi, inode->i_size); > interval_start = 0; > > + /* Align file on 'dsunit' */ > + if (sbi->bmgr->dsunit > 1) { > + off_t off = lseek(blobfile, 0, SEEK_CUR); > + > + erofs_dbg("Try to round up 0x%llx to align on %d blocks > (dsunit)", > + off, sbi->bmgr->dsunit); > + off = roundup(off, sbi->bmgr->dsunit * erofs_blksiz(sbi)); > + if (lseek(blobfile, off, SEEK_SET) != off) { > + ret = -errno; > + erofs_err("lseek to blobdev 0x%llx error", off); > + goto err; > + } > + erofs_dbg("Aligned on 0x%llx", off); > + } But since you have use case on the current chunk approach, so... As for this patch, what if the inode itself is chunk-deduplicated, could we apply this if the inode only has one new chunk instead at least for now? Thanks, Gao Xiang