On 2025/7/18 09:52, Chao Yu wrote:
Hi Xiang,
On 2025/7/17 16:26, Gao Xiang wrote:
Hi Chao,
On 2025/7/14 17:39, Chao Yu wrote:
This patch supports to readahead more blocks in erofs_readdir(), it can
enhance readdir performance in large direcotry.
readdir test in a large directory which contains 12000 sub-files.
files_per_second
Before: 926385.54
After: 2380435.562
Meanwhile, let's introduces a new sysfs entry to control readahead
bytes to provide more flexible policy for readahead of readdir().
- location: /sys/fs/erofs/<disk>/dir_ra_bytes
- default value: 16384
- disable readahead: set the value to 0
Signed-off-by: Chao Yu <c...@kernel.org>
---
v3:
- add EROFS prefix for macro
- update new sysfs interface to 1) use bytes instead of pages
2) remove upper boundary limitation
- fix bug of pageidx calculation
Documentation/ABI/testing/sysfs-fs-erofs | 8 ++++++++
fs/erofs/dir.c | 13 +++++++++++++
fs/erofs/internal.h | 4 ++++
fs/erofs/super.c | 2 ++
fs/erofs/sysfs.c | 2 ++
5 files changed, 29 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-fs-erofs b/Documentation/
ABI/testing/sysfs-fs-erofs
index bf3b6299c15e..85fa56ca092c 100644
--- a/Documentation/ABI/testing/sysfs-fs-erofs
+++ b/Documentation/ABI/testing/sysfs-fs-erofs
@@ -35,3 +35,11 @@ Description: Used to set or show hardware accelerators in
effect
and multiple accelerators are separated by '\n'.
Supported accelerator(s): qat_deflate.
Disable all accelerators with an empty string (echo > accel).
+
+What: /sys/fs/erofs/<disk>/dir_ra_bytes
+Date: July 2025
+Contact: "Chao Yu" <c...@kernel.org>
+Description: Used to set or show readahead bytes during readdir(), by
+ default the value is 16384.
+
+ - 0: disable readahead.
diff --git a/fs/erofs/dir.c b/fs/erofs/dir.c
index 3e4b38bec0aa..950d6b0046f4 100644
--- a/fs/erofs/dir.c
+++ b/fs/erofs/dir.c
@@ -47,8 +47,10 @@ static int erofs_readdir(struct file *f, struct dir_context
*ctx)
struct inode *dir = file_inode(f);
struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
struct super_block *sb = dir->i_sb;
+ struct file_ra_state *ra = &f->f_ra;
unsigned long bsz = sb->s_blocksize;
unsigned int ofs = erofs_blkoff(sb, ctx->pos);
+ unsigned long nr_pages = DIV_ROUND_UP_POW2(dir->i_size, PAGE_SIZE);
pgoff_t ra_pages = PAGE_ALIGN(EROFS_SB(dir)->dir_ra_bytes);
Do you mean?
pgoff_t ra_pages = PAGE_ALIGN(EROFS_SB(dir)->dir_ra_bytes) >> PAGE_SHIFT?
That seems more complicated, sigh, I think the original
DIV_ROUND_UP_POW2() is better.
int err = 0;
bool initial = true;
@@ -63,6 +65,17 @@ static int erofs_readdir(struct file *f, struct dir_context
*ctx)
break;
}
+ /* readahead blocks to enhance performance in large directory */
+ if (EROFS_I_SB(dir)->dir_ra_bytes) {
> > if (ra_pages) {
+ unsigned long idx = DIV_ROUND_UP(ctx->pos, PAGE_SIZE);
+ pgoff_t ra_pages = DIV_ROUND_UP(
+ EROFS_I_SB(dir)->dir_ra_bytes, PAGE_SIZE);
I put calculation here because if the value is zero, we don't need unnecessary
calculation above, anyway, will update as you suggested, but let me know if you
have any other concerns. :)
If only shift calculatiion is needed, I guess it
should not have much overhead, like:
if (ra_pages) {
pgoff_t idx = DIV_ROUND_UP_POW2(ctx->pos, PAGE_SIZE);
pgoff_t pages = min(nr_pages - idx, ra_pages);
...
Thanks,
Gao Xiang