Kevin, Max and I took the opportunity to meet and discuss block layer matters. We examined two topics in some depth: BlockBackend, and block filters and dynamic reconfiguration.
Not nearly enough people to call it a block summit. But the local dialect is known for its use of diminutives, and "Gipfele" is the diminutive of "summit" :)

= BlockBackend =

Background: BlockBackend (BB) was split off BlockDriverState (BDS) to separate the block layer's external interface (BB) from its internal building block (BDS). Block layer clients such as device models and the NBD server attach to a BB by BB name. A BB has zero or one BDS (zero means no medium).

Multiple device models using the same BB is dangerous, so we allow attaching only one. We don't currently enforce an "only one" restriction for other clients. This is problematic, because

* Different clients may want to configure the BB in conflicting ways, e.g. writeback caching mode (still to be moved from the BDS's enable_write_cache to the BB).

* When the BDS graph gets dynamically reconfigured, say when a block filter gets spliced in, clients that started out in the same spot may need to move differently.

Instead, each client should connect to its own BB. This leads to the next question: how should this BB be created?

Initially, what is now the BB was mashed into the BDS. In a way, the BB got created along with the BDS. The current code lets you create a BB along with a BDS when you need one, or create a new BB for an existing BDS. The BB has a name, and the BDS may have a node-name.

The obvious low-level building blocks would be "create BB", "connect BB to a BDS" (we have that as x-blockdev-insert-medium), "disconnect BB from a BDS" (x-blockdev-remove-medium) and "destroy BB" (x-blockdev-del). Management applications probably don't mind having to work at this low level, but for human users, it's cumbersome.

Perhaps the BB should be created along with the client, at least optionally. Means to create BBs separately are mostly useful when the BB needs to be configured by the user: instead of duplicating the BB configuration within each client, we keep it neatly separate. We're not aware of user-configurable knobs, though.

Currently, a client is configured to attach to a BB by specifying a BB name. For instance, a device model has a "drive" property that names a BB. If we create the BB automatically, client configuration needs to name a BDS instead, i.e. we need a node-name instead of a BB name.

Of course, we'll have to keep the legacy configuration working. The "drive" property will have to refer to a BDS, like it did before BBs were invented. We could:

* Move the BB name back into the BDS.

* Move the BB name into DriveInfo, where the other legacy stuff lives. DriveInfo then needs to be changed to hang off the BDS rather than the BB.

Regardless, dynamic reconfiguration may have to move the name / the DriveInfo to a different BDS.

We're not entirely sure whether automatic creation of BBs is worthwhile.

Next steps:

* Support multiple BBs sharing the same BDS.

* Restrict a BB to only one client of any kind instead of special-casing device models.

* Block jobs should go through a BB.

* Investigate automatic creation of BBs.

= Block filters =

We already have a few block filters:

* blkdebug, blkverify, quorum

Encryption should become another one.

Moreover, we have a few things mashed into the BDS that should be filters:

* throttle (only at a root, i.e. right below a BB), copy-on-read, notifier (for the backup block job), detect-zero

Dynamic reconfiguration means altering the BDS graph while it's in use. Existing mutators:

* snapshot, mirror-complete, commit-complete, x-blockdev-change
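All of these mutators boil down to redirecting parent/child links in a graph that is in active use. As a purely illustrative sketch (a toy model in C, not QEMU code; Node and splice_filter are made-up names), splicing a filter in looks like this:

    /*
     * Toy model, not QEMU code: Node and splice_filter are made up for
     * illustration.  A mutator redirects parent/child links in a chain
     * that is in active use.
     */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct Node {
        const char *name;
        struct Node *child;     /* stands in for the file/backing link */
    } Node;

    static Node *node_new(const char *name, Node *child)
    {
        Node *n = calloc(1, sizeof(*n));
        n->name = name;
        n->child = child;
        return n;
    }

    /* Splice @filter in between @parent and its current child. */
    static void splice_filter(Node *parent, Node *filter)
    {
        filter->child = parent->child;
        parent->child = filter;
    }

    static void print_chain(const Node *top)
    {
        for (const Node *n = top; n; n = n->child) {
            printf("%s%s", n->name, n->child ? " -> " : "\n");
        }
    }

    int main(void)
    {
        Node *bds = node_new("BDS", NULL);
        Node *bb = node_new("BB", bds);

        print_chain(bb);                            /* BB -> BDS */
        splice_filter(bb, node_new("throttle", NULL));
        print_chain(bb);                            /* BB -> throttle -> BDS */
        return 0;
    }

The pointer surgery itself is trivial; the open questions are about which spot in the graph a mutation should apply to once implicit filters are present, which is what follows.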
Things become interesting when nodes get implicitly inserted into the graph, e.g.:

* A backup job inserts its notifier filter.

* We create an implicit throttle filter to implement legacy throttling configuration.

And so forth. Nothing of the sort exists just yet.

What should happen when the user asks for a mutation at a place where we have implicit filter(s)? First, let's examine what such a chain could look like.

If we read the current code correctly, it behaves as if we had a chain

    BB
    |
    throttle
    |
    detect-zero
    |
    copy-on-read
    |
    BDS

Except for the backup job, which behaves as if we had

      backup job
     /
    notifier
    |
    detect-zero
    |
    BDS

We believe that the following cleaned up filter stack should work:

    BB
    |
    throttle       \
    |               \
    copy-on-read     ) fixed at creation time
    |               /
    detect-zero    /
    |
    |   backup job
    |  /
    notifier         ) dynamically inserted by the job
    |
    BDS

Clients (device model, NBD server) connect through a BB on top.

Snapshot cuts in between the BDS and its implicit filters, like this:

    BB
    |
    throttle
    |
    copy-on-read
    |
    detect-zero
    |
    qcow2       \
    snapshot     ) inserted by snapshot
    overlay     /
    |
    BDS

The notifier filter is not shown, because we can't currently snapshot while a block job is active.

Still to do: similar analysis for the other mutators.
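To restate the snapshot case above in code form, here is a purely illustrative sketch (again a toy model in C, not QEMU code; Node and snapshot_insert are made-up names): the snapshot overlay gets spliced in directly above the old BDS, underneath the implicit filters, rather than directly below the BB.

    /*
     * Toy model, not QEMU code: Node, snapshot_insert and the chain
     * below are made up for illustration.  The point is the insertion
     * site: the overlay goes directly above the old BDS, so the
     * implicit filters stay on top of it.
     */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct Node {
        const char *name;
        struct Node *child;
    } Node;

    static Node *node_new(const char *name, Node *child)
    {
        Node *n = calloc(1, sizeof(*n));
        n->name = name;
        n->child = child;
        return n;
    }

    /* Insert @overlay directly above @old_bds (assumed to be in the chain). */
    static void snapshot_insert(Node *top, Node *old_bds, Node *overlay)
    {
        Node *parent = top;
        while (parent->child != old_bds) {
            parent = parent->child;     /* walk past the implicit filters */
        }
        overlay->child = old_bds;
        parent->child = overlay;
    }

    static void print_chain(const Node *top)
    {
        for (const Node *n = top; n; n = n->child) {
            printf("%s%s", n->name, n->child ? " -> " : "\n");
        }
    }

    int main(void)
    {
        Node *bds = node_new("BDS", NULL);
        Node *bb = node_new("BB",
                   node_new("throttle",
                   node_new("copy-on-read",
                   node_new("detect-zero", bds))));

        snapshot_insert(bb, bds, node_new("overlay", NULL));
        print_chain(bb);
        /* BB -> throttle -> copy-on-read -> detect-zero -> overlay -> BDS */
        return 0;
    }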