block: Add read support for block drivers

Kevin Wolf Wed, 19 Feb 2025 05:03:49 -0800

Am 19.02.2025 um 07:11 hat Paolo Bonzini geschrieben:
> On 2/18/25 19:20, Kevin Wolf wrote:
> > +    /// The described blocks are stored in a child node.
> > +    Data {
> > +        /// Child node in which the data is stored
> > +        node: Arc<BdrvChild>,
> 
> Having Arc<> here shouldn't be necessary, since the BdrvChild is already
> reference counted.  Since the code is called under the bdrv_graph_rdlock
> there's no risk of the BdrvChild going away, and you can just make it a
> &BdrvChild.


That would mean that you need keep the BlockDriver borrowed as long as
you're using the mapping. It would work today, but as soon as I want to
cache mappings, it won't any more.

> Likewise, even BochsImage should not need a standard Rust Arc<BdrvChild>.
> However you need to add your own block::Arc<BdrvChild> and map Clone/Drop to
> bdrv_ref/bdrv_unref.  Then BochsImage can use block::Arc<BdrvChild>; this
> makes it even clearer that Mapping should not use the Arc<> wrapper, because
> bdrv_ref is GLOBAL_STATE_CODE() and would abort if run from a non-main
> thread.

It's not BdrvChild that is refcounted on the C side, but
BlockDriverState. We definitely don't bdrv_ref()/unref() for each
request on the C side and we shouldn't on the Rust side either. The
refcount only changes when you modify the graph.

I'm not entirely sure how your block::Arc<T> is supposed to work. It
would be tied to one specific type (BlockDriverState), not generic.
Which probably means that it can't be a separate pointer type, but
BlockDriverState itself should just implement Clone with bdrv_ref().

Though that doesn't help here, obviously, because we have a BdrvChild.

> That said, I'm not sure how to include "block graph lock must be taken" into
> the types, yet.  That has to be taken into account too, sooner or later.
> You probably have a lot of items like this one so it'd be nice to have TODO
> comments as much as you can.

Actually, I'm not aware of that many items. But yes, there is a TODO
item for the graph lock.

I think I'll have something like:

    pub struct BdrvChild {
        child: GraphLock<*mut bindings::BdrvChild>,
    }

where you can access the inner object either by calling a lock function,
or passing another graph lock guard that you already own. And for the
FFI boundary unsafe functions like "I promise I already own the lock".

> (This boundary is where you get an unholy mix of C and Rust concepts. It
> takes a while to get used to, and it teaches you a lot of the parts of Rust
> that you usually take for granted.  So while it's not hard, it's unusual and
> it does feel like water and oil in the beginning).
> 
> > +) -> std::os::raw::c_int {
> > +    let s = unsafe { &mut *((*bs).opaque as *mut D) };
> 
> &mut is not safe here (don't worry, we went through the same thing for
> devices :)).  You can only get an & unless you go through an UnsafeCell (or
> something that contains one).

Right, we can have multiple requests in flight.

The fix is easy here: Even though bindgen gives us a *mut, we only want
a immutable reference.

> You'll need to split the mutable and immutable parts of BochsImage in
> separate structs, and embed the former into the latter.  Long term you
> there should be a qemu_api::coroutine::CoMutex<>, but for the short
> term you can just use a BqlRefCell<> or a standard Rust RefCell<>.
> You can see how PL011Registers is included into PL011State in
> rust/hw/char/pl011/src/device.rs, and a small intro is also present in
> docs/devel/rust.rst.

There is no mutable part in BochsImage, which makes this easy. The only
thing is the *mut bindings::BdrvChild, but we never dereference that in
Rust. It is also essentially interior mutability protected by the graph
lock, even though this isn't explicit yet.

But if we were to introduce a mutable part (I think we will add write
support to it sooner or later), then BqlRefCell or RefCell are
definitely not right. They would only turn the UB into a safe panic when
you have more than one request in flight. (Or actually, BqlRefCell
should already panic with just one request from an iothread, because we
don't actually hold the BQL.)

> Anyway, the BdrvChild needs to remain in BochsImage, so that it is
> accessible outside the CoMutex critical section and can be placed into
> the Mapping.
> 
> > +    let mut offset = offset as u64;
> > +    let mut bytes = bytes as u64;
> > +
> > +    while bytes > 0 {
> > +        let req = Request::Read { offset, len: bytes };
> > +        let mapping = match qemu_co_run_future(s.map(&req)) {
> > +            Ok(mapping) => mapping,
> > +            Err(e) => return -i32::from(Errno::from(e).0),
> 
> This is indeed not great, but it's partly so because you're doing a
> lot (for some definition of "a lot") in the function.  While it would
> be possible to use a trait, I wrote the API thinking of minimal glue
> code that only does the C<->Rust conversion.
> 
> In this case, because you have a lot more code than just a call into
> the BlockDriver trait, you'd have something like
> 
> fn bdrv_co_preadv_part(
>     bs: &dyn BlockDriver,
>     offset: i64,
>     bytes: i64,
>     qiov: &bindings::QEMUIOVector,
>     mut qiov_offset: usize,
>     flags: bindings::BdrvRequestFlags) -> io::Result<()>
> 
> and then a wrapper (e.g. rust_co_preadv_part?) that only does
> 
>    let s = unsafe { &mut *((*bs).opaque as *mut D) };
>    let qiov = unsafe { &*qiov };
>    let result = bdrv_co_preadv_part(s, offset, bytes,
>          qiov, qiov_offset, flags);
>    errno::into_negative_errno(result)
> 
> This by the way has also code size benefits because &dyn, unlike
> generics, does not need to result in duplicated code.

I don't really like the aesthetics of having two functions on the
Rust side for each C function, but I guess ugliness is expected in
bindings...

For the errno conversion functions, I'm still not sure that they should
only support trivial wrappers with no early returns. I reluctantly buy
your &dyn argument (though in C block drivers this is generally
duplicated code, too), but that's unrelated to error handling.

> For now, I'd rather keep into_negative_errno() this way, to keep an
> eye on other cases where you have an io::Error<()>.  Since Rust rarely
> has Error objects that aren't part of a Result, it stands to reason
> that the same is true of QEMU code, but if I'm wrong then it can be
> changed.

This one is part of a Result, too. But not a result that is directly
returned in both success and error cases, but where the error leads to
an early return. That is, an equivalent for the ubiquitous pattern:

    ret = foo();
    if (ret < 0) {
        return ret;
    }

    /* Do something with the successful result of foo() */

Kevin

Re: [PATCH v2 09/11] rust/block: Add read support for block drivers

Reply via email to