Kevin Wolf <kw...@redhat.com> writes: > [ CCed Benoît and Max, this is blockdev work ] > [ CCed Jeff, we're also talking about op blockers ] > > Not stripping quoted text for their convenience. > > Am 02.12.2014 um 20:06 hat Markus Armbruster geschrieben: >> = Introduction = >> >> The block layer and its monitor commands have evolved, resulting in a >> bit of a mess when it comes to referring to block layer objects in >> commands. >> >> Semantically, we refer to two different kinds of objects: backends and >> nodes within backends. >> >> Example: eject parameter @device names a backend. >> >> Example: change-backing-file parameter @image-node-name names a node. >> >> Backend and node names share a name space. Names are unique. >> Corollary: a name unambiguously identifies either a backend, or a node, >> or nothing. >> >> Example: blockdev-add parameter @options is a union with discriminator >> "driver". With its value is "raw", member "file" is an anonymous union. >> Its string value can name either a backend or a node. >> >> Node names are a fairly recent feature. Before, we referenced nodes by >> their file name. Pretty schlocky. Should be replaced by node name. >> >> Example: block-commit parameter @base is the "file name of the backing >> image to write data into". In other words, it identifies a node. We >> should add a node name parameter, and deprecate @base. >> >> Some commands predating the node name feature can reference only >> backends even though nodes could make sense, too. Such restrictions >> should be lifted. Others reference backends where only nodes make >> sense. Should be cleaned up. >> >> Example: drive-mirror parameter @device names a backend. This restricts >> mirroring to backends. If we want to support mirroring nodes, we need >> to extend the command to permit referencing nodes. >> >> Example: block_passwd parameter @device names a backend. However, >> backends aren't encryped, nodes are. In 2.0, we cleaned this up: we >> added parameter @node-name. We kept @device for backward compatibility. >> Either @device or @node-name must be given. >> >> Note: in block_passwd, we have separate parameters for backend and node >> name. In the blockdev-add example above, we have only one, and use its >> value to figure out what it is. >> >> I find this inconsistency rather ugly. We discussed cleaning it up in >> the "BB name vs. BDS name in the QAPI schema" thread. >> >> A backend has a tree of nodes. We call the tree's root the backend's >> root node. >> >> For convenience, we sometimes accept a backend name when a node name is >> wanted, and resolve it to the backend's root node. >> >> Example: block_passwd again; it's how its @device parameter works. >> >> Recent development (blockdev-add) permits nodes to be shared by multiple >> backends. Op blockers should ensure the sharing is safe. I wouldn't >> bet a nickel on this to work today. >> >> >> = Block layer data structures = >> >> A backend is represented by a BlockBackend object (short: BB). >> >> A node is represented by a BlockDriverState object (short: BDS). >> >> A backend (BB) has a tree of nodes (BDSes). >> >> Two of a nodes children carry special meaning: the one stored in BDS >> member file (sometimes called "parent"), and the one stored in BDS >> member backing_hd (sometimes called "backing"). These used to be the >> only children a node could have. >> >> A BB always has a name. We sometimes call it device name. Stored in BB >> member name. >> >> A BDS may have a name. We call it node name. Stored in BDS member >> node_name[], empty when the node has no name. >> >> A BDS has a file name. The actual matching of a command's file name >> argument against a BDS's file name is complex, but we don't have to >> worry about that here. >> >> BBs are a fairly recent invention. Before, a backend was represented by >> a BDS with a non-empty member device_name[]. >> >> >> = Commands = >> >> I'm going to discuss all QMP commands whose handlers are defined in >> block*. To make sense of it, you'll probably have to peruse the command >> documentation in the schema and/or qmp-commands.hx. >> >> When I think it's fairly obvious what needs to be done for a command, I >> write it down as TODO. Please challenge it if you think I'm wrong. >> >> When it's not so obvious, I write it down as questions. Answers >> appreciated. >> >> == block-core.json == >> >> * block-commit >> >> @device names a backend, @top and @base each name one of its nodes via >> file name matching. >> >> TODO: support specifying the two nodes via node name. >> >> Why do we need @device when a top node is specified? >> >> * We currently need the backend to set op blockers, and finding a >> node's backend isn't easy. Implementation detail shows through the >> interface, blech. >> >> * We need to know whether the top node is the active layer. >> >> Complication: if it's shared by multiple backends, it may be the >> active layer in one but not the other. Specifying the backend >> resolves the ambiguity. Whether that makes sense I don't know. > > It doesn't. > > The real requirement for the commit job (for inactive layers) is that > base...top (I'm using git-rev-parse syntax for describing nodes and > subtrees from here on) is read-only for the duration of the commit > operation. For committing the active layer, we use a mirror job > internally, which works for images that are written during the > operation. > > In theory, the mirror should work in all cases, it's just less efficient > (and the implementation of the completion code which reconfigures the > tree would have to be changed). Jeff, is this correct? > > What commit should be doing: > - Check whether it can block (as in op blockers) writes to top, i.e. no > other user is currently requiring the ability to write > - If it can, block writes on base...top and start a commit job > - If it can't, block writes on base...top^ and start a mirror job > - If it can't block at least base...top^, fail > > This automatically solves the problem of multiple parents, as long as > these parents advertise correctly which op blocker capabilities they are > using (no, they don't, and we don't have the infrastructure for them to > do so yet - but I think it's becoming clearer what it would have to look > like).
Makes sense. >> * block-job-cancel >> * block-job-complete >> * block-job-pause >> * block-job-resume >> * block-job-set-speed >> >> @device names a backend. >> >> The question whether we need to support a node name here is moot, >> because jobs shouldn't be tied to a single backend (or node) in the >> first place. They should have their own, independent job name. > > Correct. > > TODO: add a @name argument to commands starting block jobs > TODO: add a @job-name argument to these job management commands and > deprecate @device > > How to get rid of bs->job to actually allow multiple jobs on one BDS and > make the job IDs useful is a separate question that is harder to answer. Good point: we can fix the interface before we fix the implementation. >> * block-stream >> >> @device names a backend, @base names a node via file name matching. >> >> TODO: support specifying the node via node name. >> >> I think backend is inappropriate here, we should support streaming up >> to a node, just like block-commit supports committing down from a >> node. > > Yes. I think Benoît suggested this before. I don't remember if he already > coded up something. > >> * block_passwd >> * block_resize >> >> @node-name names a node. >> >> @device names a backend, and references its root node, for >> compatibility. >> >> Either @device or @node-name must be given. >> >> TODO: should have single mandatory parameter instead of two optionals. > > How is this "fairly obvious what needs to be done"? We can't get rid of > any field because of compatibility. Do you mean this: > > TODO: Make @device accept node names as well Obviousness is subjective. I like to think about a nice, clean interface first, and worry about compatibility second. Compatibility may force us to compromise on cleanliness. I feel this helps finding the least unclean interface. Furthermore, I like to deprecate unwanted interfaces. Advertize the clean core, not the compatibility gunk. Let's try this approach here. I think we can agree that the clean interface is "single mandatory parameter". Ironically, that's what we had until we screwed it up in 2.0. @device is a sub-optimal name for this single parameter. Either we accept that and move on, or we deprecate it in favor of a new parameter with a better name. I guess the better name isn't worth that much trouble, in particular since the command name is already ugly. When @node-name is given, @device must not be given. So @device is mandatory *except* in the @node-name usage we retain for compatibility. Deprecate the usage. Command documentation could then look like this: ## # @block-resize # # Resize a block image while a guest is running. # # @device: the name of the block backend or node to resize (node names # supported since 2.3) # # @size: new image size in bytes # # Deprecated usage (since 2.3): # # @device: #optional the name of the block backend to resize # # @node-name: #optional name of the node to resize (since 2.0) # # Either @device or @node-name must be set but not both. # # Since: 0.14.0 >> * blockdev-snapshot-sync >> >> @node-name and @device as for block_passwd, including TODO. >> >> @snapshot-node-name defines the new node's node name. >> >> * block_set_io_throttle >> >> @device names a backend. >> >> TODO: support nodes. >> >> Note: we'd like to redo throttling as a block filter node, so maybe >> we'll have a completely different command instead. >> >> * blockdev-add >> >> This command defines a backend and its node tree, where sub-trees may >> be references to existing nodes. >> >> @id defines the backend's name. >> >> @node-name defines its root node's node name. >> >> Note: blockdev-add always defines a backend. When you build up a >> backend bottom-up with several commands, you get a bunch of unwanted >> backends "in the middle". I'd like to make @id optional, and omit the >> backend when it's missing. > > Yes, this makes sense, as we discussed privately a while ago. > >> * change-backing-file >> >> @device names a backend, @image-node-name names a node. >> >> As far as I can tell, we need the backend only to set op blockers. >> Implementation detail shows through the interface. > > Once we have the new op blockers, we'll make @device optional then and > completely ignore its value. > >> * drive-backup >> >> @device names a backend. >> >> Do we want to be able to back up any node, or only a backend? >> >> Note: documentation of @target sounds like it could somehow name a >> backend, but as far as I can tell it's always interpreted as file >> name. > > I can't think of a reasonable use case for backing up non-roots because > the only thing that would write to them are other block jobs. So you > could backup the old snapshot contents while committing to it. > Usefulness questionable. > > That said, while there's no urgent need for it, it would be nicer to > allow all operations that can safely be performed, and this is one of > them. Might fall out almost natually once we have op blockers, so that > we really only need to add a @node-name option. > > But that's something for later. > >> * drive-mirror >> >> @device names a backend, @replaces names a node, and @node-name >> defines the name of the new node. >> >> Do we want to be able to mirror any node, or only a backend? >> >> Note: documentation of @target sounds like it could somehow name a >> backend, but as far as I can tell it's always interpreted as file >> name. >> >> Note: drive-mirror supports @replaces, but drive-backup doesn't. Odd. > > We probably want to mirror any node (Benoît has a use case for this with > replacing broken quorum children). > >> * query-block >> >> Returns information on all backends as array of BlockInfo. BlockInfo >> member @device is the backend name. >> >> * query-named-block-nodes >> >> Returns information on all named nodes as an array of BlockDeviceInfo >> (we suck at naming). BlockDeviceInfo member @node-name is the node >> name. >> >> You can't get information on nodes that don't have a node name. >> >> * query-blockstats >> >> Returns some stats for all backends as array of BlockStats. >> BlockStats member @device is the backend name. Member @stats contains >> the stats for the backend's root node. Member @parent is BlockStats >> for the child node that is stored in BDS member file. Member @backing >> is BlockStats for the chold node that is stored in BDS member >> backing_hd. Stats for other children aren't returned. >> >> TODO: generalize this to the full tree, include node names. > > Do we want the same thing for query-block? There are a couple of ways to skin the query-cat. Let's get the easy case out of the way first: a query that reports only backend properties and not node properties can return an array where each element carries a backend name, like query-block does now. For queries that report node properties, we have a couple of options: * Flat array with node names Like current query-named-block-nodes. Can't report on nodes without names. Jeff's project to give all nodes names would moot this issue. * Array of trees mirroring the actual node forest, Similar to current query-blockstats. Tree roots correspond to backends, and backends have names. Unfortunately, the nodes don't actually form a forest: node trees may be shared. To turn it into make a forest, we'd have to duplicate the shared trees. * Hybrid: array of trees, but named sub-trees are represented by name Like the previous one, except the recursion stops at named nodes. Instead of including such a sub-tree, we reference it by name, then add it to the array if it's not already there. "Flat array" seems simplest. >> * query-block-jobs >> >> Returns information on block jobs as array of BlockJobInfo. A block >> job is always tied to a backend, and BlockJobInfo member @device is >> its name. >> >> The question whether we need a node name here is moot; see >> block-job-cancel above. >> >> == block.json == >> >> * blockdev-snapshot-delete-internal-sync >> * blockdev-snapshot-internal-sync >> >> @device names a backend. >> >> Do we want to be able to snapshot any node, or only a backend? > > Probably only a backend. At least until someone comes up with a > reasonable use case. Internal snapshots and backing files don't mix very > well anyway. > >> * eject >> >> @device names a backend. Appropriate, because this is really a >> backend operation. >> >> * nbd-server-add >> >> @device names a backend. >> >> I think backend is appropriate here. The NBD server sits on top of >> the block layer just like device models do. It should therefore use >> the BB API. See Max's [PATCH v2 0/6] nbd: Use BlockBackend. > > Agreed. > >> * nbd-server-start >> * nbd-server-stop >> >> No backend or node name parameters. >> >> == qapi-schema.json == >> >> * transaction >> >> This is a wrapper around a list of transaction-capable commands. >> Thus, nothing new here. > > Surprisingly few really hard questions. Good :)