On 9/8/25 21:09, Brian Song wrote:
>
>
> On 9/3/25 7:51 AM, Stefan Hajnoczi wrote:
>> On Fri, Aug 29, 2025 at 10:50:23PM -0400, Brian Song wrote:
>>> https://docs.kernel.org/filesystems/fuse-io-uring.html
>>>
>>> As described in the kernel documentation, after FUSE-over-io_uring
>>> initialization and handshake, FUSE interacts with the kernel using
>>> SQE/CQE to send requests and receive responses. This corresponds to
>>> the "Sending requests with CQEs" section in the docs.
>>>
>>> This patch implements three key parts: registering the CQE handler
>>> (fuse_uring_cqe_handler), processing FUSE requests
>>> (fuse_uring_co_process_request), and sending response results
>>> (fuse_uring_send_response). It also merges the traditional /dev/fuse
>>> request handling with the FUSE-over-io_uring handling functions.
>>>
>>> Suggested-by: Kevin Wolf <kw...@redhat.com>
>>> Suggested-by: Stefan Hajnoczi <stefa...@redhat.com>
>>> Signed-off-by: Brian Song <hibrians...@gmail.com>
>>> ---
>>> block/export/fuse.c | 457 ++++++++++++++++++++++++++++++--------------
>>> 1 file changed, 309 insertions(+), 148 deletions(-)
>>>
>>> diff --git a/block/export/fuse.c b/block/export/fuse.c
>>> index 19bf9e5f74..07f74fc8ec 100644
>>> --- a/block/export/fuse.c
>>> +++ b/block/export/fuse.c
>>> @@ -310,6 +310,47 @@ static const BlockDevOps fuse_export_blk_dev_ops = {
>>> };
>>>
>>> #ifdef CONFIG_LINUX_IO_URING
>>> +static void coroutine_fn fuse_uring_co_process_request(FuseRingEnt *ent);
>>> +
>>> +static void coroutine_fn co_fuse_uring_queue_handle_cqes(void *opaque)
>>
>> This function appears to handle exactly one cqe. A singular function
>> name would be clearer than a plural: co_fuse_uring_queue_handle_cqe().
>>
>>> +{
>>> + FuseRingEnt *ent = opaque;
>>> + FuseExport *exp = ent->rq->q->exp;
>>> +
>>> + /* Going to process requests */
>>> + fuse_inc_in_flight(exp);
>>
>> What is the rationale for taking a reference here? Normally something
>> already holds a reference (e.g. the request itself) and it will be
>> dropped somewhere inside a function we're about to call, but we still
>> need to access exp afterwards, so we temporarily take a reference.
>> Please document the specifics in a comment.
>>
>> I think blk_exp_ref()/blk_exp_unref() are appropriate instead of
>> fuse_inc_in_flight()/fuse_dec_in_flight() since we only need to hold
>> onto the export and don't care about drain behavior.
>>
>
> Stefan:
>
> When handling FUSE requests, we don’t want the FuseExport to be
> accidentally deleted. Therefore, we use fuse_inc_in_flight in the CQE
> handler to increment the in_flight counter, and when a request is
> completed, we call fuse_dec_in_flight to decrement it. Once the last
> request has been processed, fuse_dec_in_flight brings the in_flight
> counter down to 0, indicating that the export can safely be deleted. The
> usage of in_flight follows the same logic as in traditional FUSE request
> handling.
>
> Since submitted SQEs for FUSE cannot be canceled, once we register or
> commit them we must wait for the kernel to return a CQE. Otherwise, the
> kernel may deliver a CQE and invoke its handler after the export has
> already been deleted. For this reason, we directly call blk_exp_ref and
> blk_exp_unref when submitting an SQE and when receiving its CQE, to
> explicitly control the export reference and prevent accidental deletion.
>
> The doc/comment for co_fuse_uring_queue_handle_cqe:
>
> Protect FuseExport from premature deletion while handling FUSE requests.
> CQE handlers inc/dec the in_flight counter; when it reaches 0, the
> export can be freed. This follows the same logic as traditional FUSE.
>
> Since FUSE SQEs cannot be canceled, a CQE may arrive after commit even
> if the export is deleted. To prevent this, we ref/unref the export
> explicitly at SQE submission and CQE completion.
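
To make the lifetime rule above concrete, here is a minimal sketch of
"take a reference at SQE submission, drop it at CQE completion". All
names (DemoExport, demo_submit_sqe, demo_handle_cqe, ...) are
hypothetical stand-ins and not QEMU code; the real blk_exp_ref() and
blk_exp_unref() operate on BlockExport and are not modeled here. The
counter is a plain int, so this is single-threaded only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an export object, not QEMU's FuseExport. */
typedef struct DemoExport {
    int refcount;
    bool *freed;    /* set on release so the caller can observe it */
} DemoExport;

static DemoExport *demo_export_new(bool *freed)
{
    DemoExport *exp = malloc(sizeof(*exp));
    if (exp) {
        exp->refcount = 1;      /* reference held by the owner */
        exp->freed = freed;
    }
    return exp;
}

static void demo_export_ref(DemoExport *exp)
{
    exp->refcount++;
}

static void demo_export_unref(DemoExport *exp)
{
    assert(exp->refcount > 0);
    if (--exp->refcount == 0) {
        *exp->freed = true;
        free(exp);
    }
}

/*
 * SQE submission: the request cannot be canceled from here on, so pin
 * the export until the matching CQE has been handled.
 */
static void demo_submit_sqe(DemoExport *exp)
{
    demo_export_ref(exp);
}

/* CQE completion: exp is still valid thanks to the submission reference. */
static void demo_handle_cqe(DemoExport *exp)
{
    /* ... process the request here ... */
    demo_export_unref(exp);
}

/* Returns true if the export outlives deletion until the CQE arrives. */
bool demo_delete_with_pending_sqe(void)
{
    bool freed = false;
    DemoExport *exp = demo_export_new(&freed);
    bool alive_while_pending;

    if (!exp) {
        return false;
    }
    demo_submit_sqe(exp);       /* refcount: 2 */
    demo_export_unref(exp);     /* owner deletes the export: refcount 1 */
    alive_while_pending = !freed;
    demo_handle_cqe(exp);       /* late CQE drops the last reference */
    return alive_while_pending && freed;
}
```

Calling demo_delete_with_pending_sqe() walks through the problematic
ordering: the owner drops its reference while an SQE is still
outstanding, and the object is only released once the late CQE has been
handled.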
>
>>> +
>>> + /* A ring entry returned */
>>> + fuse_uring_co_process_request(ent);
>>> +
>>> + /* Finished processing requests */
>>> + fuse_dec_in_flight(exp);
>>> +}
>>> +
>>> +static void fuse_uring_cqe_handler(CqeHandler *cqe_handler)
>>> +{
>>> + FuseRingEnt *ent = container_of(cqe_handler, FuseRingEnt,
>>> fuse_cqe_handler);
>>> + Coroutine *co;
>>> + FuseExport *exp = ent->rq->q->exp;
>>> +
>>> + if (unlikely(exp->halted)) {
>>> + return;
>>> + }
>>> +
>>> + int err = cqe_handler->cqe.res;
>>> +
>>> + if (err != 0) {
>>> + /* -ENOTCONN is ok on umount */
>>> + if (err != -EINTR && err != -EAGAIN &&
>>> + err != -ENOTCONN) {
>>> + fuse_export_halt(exp);
>>> + }
>>
>> How are EINTR and EAGAIN handled if they are silently ignored? When did
>> you encounter these error codes?
>
> Bernd:
>
> I have the same question about this. As for how the kernel returns
> errors, I haven’t studied each case yet. In libfuse it’s implemented the
> same way, could you briefly explain why we choose to ignore these two
> errors, and under what circumstances we might encounter them?

I think I remember why I added these. Initially the ring threads
didn't inherit the signal handlers libfuse worker threads have. I
fixed that later, and these error conditions are a leftover.

In libfuse the idea is that the main thread gets all signals and then
sets se->exited. Worker threads, including ring threads, are not
supposed to receive or handle signals at all, but have to monitor
se->exited.
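
Roughly, that model looks like the sketch below. This is not libfuse
code: it uses plain pthreads, the `exited` flag is a hypothetical
stand-in for se->exited, and keeping signals away from the worker by
blocking them in its signal mask is my assumption about one way to get
that behavior.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stdatomic.h>
#include <unistd.h>

/* Hypothetical stand-in for libfuse's se->exited flag. */
static atomic_int exited;

/* Runs in the main thread only; a lock-free atomic store is signal-safe. */
static void on_sigterm(int sig)
{
    (void)sig;
    atomic_store(&exited, 1);
}

/* Worker: never sees a signal, only polls the flag. */
static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&exited)) {
        sched_yield();  /* a real worker would wait for requests here */
    }
    return NULL;
}

/* Returns 1 if the worker stopped because the main thread set the flag. */
int run_signal_demo(void)
{
    struct sigaction sa;
    sigset_t all, old;
    pthread_t tid;
    int ret;

    atomic_store(&exited, 0);

    sa.sa_handler = on_sigterm;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTERM, &sa, NULL) != 0) {
        return -1;
    }

    /*
     * Block all signals while creating the worker: a new thread inherits
     * the creator's mask, so the worker starts with everything blocked.
     */
    sigfillset(&all);
    pthread_sigmask(SIG_SETMASK, &all, &old);
    ret = pthread_create(&tid, NULL, worker, NULL);
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    if (ret != 0) {
        return -1;
    }

    /* Process-directed signal: only the main thread has it unblocked. */
    kill(getpid(), SIGTERM);

    pthread_join(tid, NULL);
    return atomic_load(&exited);
}
```

run_signal_demo() should return 1: the handler runs in the main thread,
and the worker leaves its loop only because it observes the flag, never
because a blocking call returned EINTR.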
Good catch Stefan, I think I can remove these conditions in libfuse.
Thanks,
Bernd