On 2024/7/18 15:11, Hongbo Li wrote:
On 2024/7/18 10:40, Gao Xiang wrote:
Hi Hongbo,
I'd like to request Jingbo's review too.
On 2024/7/18 09:05, Hongbo Li wrote:
erofs over fscache cannot handle direct read I/O: when a file is
opened with the O_DIRECT flag, -EINVAL is returned. Support direct I/O
(DIO) in erofs over fscache by bypassing the erofs page cache and
reading the target data into the user buffer directly from fscache's
backing file.
Could you give more hints in the commit message on the target user
of fscache DIO?
To be honest, I haven't come across containers using direct I/O yet. I've
just run fio and some other tests in direct mode inside containers, and they
failed at open time. If a traditional container that uses direct I/O is
migrated to the erofs over fscache solution (i.e. Nydus), it won't run
properly, because the current on-demand mode of erofs does not support
direct I/O. Currently, I can think of two approaches to solve this: 1.
fall back from direct I/O to buffered I/O (simpler, but seems unreasonable;
see the sketch below); 2. implement the direct I/O path as this patch does.
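
For reference, option 1 would be roughly the following untested sketch
(simplified; it just clears IOCB_DIRECT in fscache mode so the generic
path degrades to buffered reads):

static ssize_t erofs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
	struct inode *inode = file_inode(iocb->ki_filp);

	/* untested sketch of option 1, not what this patch does:
	 * silently degrade O_DIRECT reads to buffered reads */
	if (erofs_is_fscache_mode(inode->i_sb))
		iocb->ki_flags &= ~IOCB_DIRECT;
	return generic_file_read_iter(iocb, to);
}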
I think benchmarking might be a clear use case; personally,
I'm fine with adding direct I/O for unencoded I/Os like this.
Thanks,
Hongbo
For Android use cases, direct I/O support is mainly used for loop
device direct mode.
The alignment of the buffer memory, offset and size is now checked
by erofs, since `i_blocksize` granularity is enough for the underlying
filesystems.
Signed-off-by: Hongbo Li <lihongb...@huawei.com>
---
fs/erofs/data.c | 3 ++
fs/erofs/fscache.c | 95 +++++++++++++++++++++++++++++++++++++++++++---
2 files changed, 93 insertions(+), 5 deletions(-)
diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 8be60797ea2f..dbfafe358de4 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -391,6 +391,9 @@ static ssize_t erofs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		     iov_iter_alignment(to)) & blksize_mask)
 			return -EINVAL;
 
+		if (erofs_is_fscache_mode(inode->i_sb))
+			return generic_file_read_iter(iocb, to);
+
 		return iomap_dio_rw(iocb, to, &erofs_iomap_ops,
 				    NULL, 0, NULL, 0);
 	}
diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index fda16eedafb5..f5a09b168539 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -35,6 +35,8 @@ struct erofs_fscache_io {
 
 struct erofs_fscache_rq {
 	struct address_space	*mapping;	/* The mapping being accessed */
+	struct iov_iter		*iter;		/* dst buf for direct io */
+	struct completion	done;		/* for synced direct io */
 	loff_t			start;		/* Start position */
 	size_t			len;		/* Length of the request */
 	size_t			submitted;	/* Length of submitted */
@@ -76,7 +78,11 @@ static void erofs_fscache_req_put(struct erofs_fscache_rq *req)
 {
 	if (!refcount_dec_and_test(&req->ref))
 		return;
-	erofs_fscache_req_complete(req);
+
+	if (req->iter)
+		complete(&req->done);
+	else
+		erofs_fscache_req_complete(req);
 	kfree(req);
 }
@@ -88,6 +94,7 @@ static struct erofs_fscache_rq *erofs_fscache_req_alloc(struct address_space *ma
 	if (!req)
 		return NULL;
 	req->mapping = mapping;
+	req->iter = NULL;
 	req->start = start;
 	req->len = len;
 	refcount_set(&req->ref, 1);
@@ -253,6 +260,55 @@ static int erofs_fscache_meta_read_folio(struct file *data, struct folio *folio)
 	return ret;
 }
 
+static int erofs_fscache_data_dio_read(struct erofs_fscache_rq *req)
Is it possible to merge this helper into erofs_fscache_data_read_slice?
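Something like the following untested sketch is what I mean (names
taken from this patch; the mapping and submission logic is elided):

static int erofs_fscache_data_read_slice(struct erofs_fscache_rq *req)
{
	struct iov_iter local_iter, *iter;
	loff_t pos = req->start + req->submitted;
	size_t count;

	/* ... erofs_map_blocks() / erofs_map_dev() as before ... */

	if (req->iter) {		/* direct I/O: user-supplied buffer */
		iter = req->iter;
		count = iov_iter_count(iter);
	} else {			/* buffered I/O: page cache pages */
		count = req->len - req->submitted;
		iov_iter_xarray(&local_iter, ITER_DEST,
				&req->mapping->i_pages, pos, count);
		iter = &local_iter;
	}
	/* ... issue the fscache read against *iter as before ... */
	return 0;
}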
Also, it seems that it doesn't handle tailpacking inline files
(with EROFS_MAP_META set). Nydus itself doesn't generate such
files, but later I will add a new fscache backend to erofs-utils too.
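For the inline case, maybe something like the untested sketch below on
the DIO path (it assumes the byte-offset erofs_read_metabuf() calling
convention, so adjust to your tree):

static int erofs_fscache_dio_read_inline(struct erofs_fscache_rq *req,
					 struct erofs_map_blocks *map)
{
	struct super_block *sb = req->mapping->host->i_sb;
	struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
	size_t size = map->m_llen;
	void *src;
	int err = 0;

	/* copy the tailpacking inline data straight into the user iter */
	src = erofs_read_metabuf(&buf, sb, map->m_pa, EROFS_KMAP);
	if (IS_ERR(src))
		return PTR_ERR(src);
	if (copy_to_iter(src, size, req->iter) != size)
		err = -EFAULT;
	erofs_put_metabuf(&buf);
	return err;
}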
Thanks,
Gao Xiang