As some people here may know, virt-v2v (a tool which converts guests
from VMware to run on KVM) and some of the other OpenStack-to-OpenStack
migration tools we have use "qemu-img convert" to copy the data around.
Historically we've had bugs here.  The most recent was discussed in
the thread on this list called "Bug? qemu-img convert to preallocated
image makes it sparse"
(https://www.mail-archive.com/qemu-block@nongnu.org/msg60479.html)

We've been kicking around the idea of writing some alternate tool.  My
proposal would be a tool (not yet written, maybe it will never be
written) called nbdcp for copying between NBD servers and local files.
An outline manual page for this proposed tool is attached.

Some of the things which this tool might do which qemu-img convert
cannot do right now:

 - Hint that the target already contains zeroes.  It's almost always
   the case that we know this, but we cannot tell qemu.  This was the
   cause of a big performance regression last year.

 - Declare that we want the target to be either sparse or
   preallocated.  qemu-img convert can sort of do this in a
   round-about way (create the target in advance and use the -n
   option), but also it's broken at the moment.

 - NBD multi-conn.  In my tests this makes a really massive
   performance difference in certain situations.  Again, virt-v2v has
   a lot of information that we cannot pass to qemu: we know, for
   example, exactly if the server supports the feature, how many
   threads are available, in some situations even have information
   about the network and backing disks that the data will travel over
   / be stored on.

 - Machine-parsable progress bars.  You can, sort of, parse the
   progress bar from qemu-img convert, but it's not as easy as it
   could be.  In particular it would be nice if the format was treated
   as ABI, and if there was a way to have the tool write the progress
   bar info to a precreated file descriptor.

 - External block lists.  This is a rather obscure requirement, but
   it's necessary in the case where we can get the allocated block map
   from another source (eg. pyvmomi) and then want to use that with an
   NBD source that does not support extents (eg. nbdkit-ssh-plugin /
   libssh / sftp).
   [Having said that, it may be possible to implement this as an
   nbdkit filter, so maybe this is not a blocking feature.]

One thing which qemu-img convert can do which nbdcp could not:

 - Read or write from qcow2 files.

So instead of splitting the ecosystem and writing a new tool that
doesn't do as much as qemu-img convert, I wonder what qemu developers
think about the above missing features?  For example, are they in
scope for qemu-img convert?

Rich.

----------------------------------------------------------------------
nbdcp(1)                            LIBNBD                            nbdcp(1)

NAME
       nbdcp - copy between NBD servers and local files

SYNOPSIS
        nbdcp [-a|--target-allocation allocated|sparse]
              [-b|--block-list <blocksfile>]
              [-m|--multi-conn <n>] [-M|--multi-conn-target <n>]
              [-p|--progress-bar]
              [-S|--sparse-detect <n>]
              [-T|--threads <n>]
              [-z|--target-is-zero]
              'nbd://...'|DISK.IMG 'nbd://...'|DISK.IMG

DESCRIPTION
       nbdcp is a utility that can copy quickly between NBD servers and
       local raw format files (or block devices).  It can copy:

       from NBD server to file (or block device)
           For example, this command copies from the NBD server listening
           on port 10809 on "example.com" to a local file called disk.img:

            nbdcp nbd://example.com disk.img

       from file (or block device) to NBD server
           For example, this command copies from a local block device
           /dev/sda to the NBD server listening on Unix domain socket
           /tmp/socket:

            nbdcp /dev/sda 'nbd+unix:///?socket=/tmp/socket'

       from NBD server to NBD server
           For example this copies between two different exports on the
           same NBD server:

            nbdcp nbd://example.com/export1 nbd://example.com/export2

       This program cannot: copy from file to file (use cp(1) or dd(1)),
       copy to or from formats other than raw (use qemu-img(1) convert),
       or access servers other than NBD servers (also use qemu-img(1)).
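
The rule that at least one of the two arguments must be an NBD URI
can be illustrated with a short sketch.  This is only an illustration
under stated assumptions: nbdcp does not exist yet, the function names
is_nbd_uri and check_endpoints are hypothetical, and the scheme list
covers only the four basic schemes from the NBD URI standard.

```python
from urllib.parse import urlsplit

# Assumption: only the four basic schemes from the NBD URI standard are
# recognised here; a real tool might accept more.
NBD_SCHEMES = {"nbd", "nbds", "nbd+unix", "nbds+unix"}


def is_nbd_uri(arg):
    """Return True if arg looks like an NBD URI rather than a local path."""
    return urlsplit(arg).scheme in NBD_SCHEMES


def check_endpoints(source, target):
    """Classify both arguments and reject the unsupported file-to-file case.

    Hypothetical helper: returns a pair such as ("nbd", "file").
    """
    kinds = tuple("nbd" if is_nbd_uri(a) else "file" for a in (source, target))
    if kinds == ("file", "file"):
        raise ValueError("file to file copy is not supported; use cp(1) or dd(1)")
    return kinds
```

For instance, check_endpoints("/dev/sda", "nbd+unix:///?socket=/tmp/socket")
classifies the source as a file and the target as an NBD server, while
two plain paths raise an error.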

       NBD servers are specified by their URI, following the NBD URI
       standard at
       https://github.com/NetworkBlockDevice/nbd/blob/master/doc/uri.md

   Controlling sparseness or preallocation in the target
       The options -a (--target-allocation), -S (--sparse-detect) and -z
       (--target-is-zero) together control sparseness in the target file.

       By default nbdcp tries to both preserve sparseness from the source
       and will detect runs of allocated zeroes and turn them into
       sparseness.  To turn off detection of sparseness use "-S 0".

       The -z option should be used if and only if you know that the
       target block device is zeroed already.  This allows an important
       optimization where nbdcp can skip zeroing or trimming parts of the
       disk that are already zero.

       The -a option is used to control the desired final preallocation
       state of the target.  The default is "-a sparse" which makes the
       target as sparse as possible.  "-a allocated" makes the target
       fully allocated.

OPTIONS
       --help
           Display brief command line help and exit.

       -a allocated
       --target-allocation=allocated
           Make the target fully allocated.

       -a sparse
       --target-allocation=sparse
           Make the target as sparse as possible.  This is the default.
           See also "Controlling sparseness or preallocation in the
           target".

       -b BLOCKSFILE
       --block-list=BLOCKSFILE
           Load the list of extents from an external file.  nbdcp
           considers this to be the truth for source extents.  The file
           should contain one record per line in the same format as
           nbdkit-sh-plugin(1), ie:

            offset length type

           with "offset" and "length" in bytes, and the "type" field being
           a comma-separated list of the words "hole" and "zero".  For
           example:

            0  1M
            1M 9M  hole,zero

           Any parts of the source which don't have descriptions are
           assumed to be of type "hole,zero".

       -m N
       --multi-conn=N
           Enable NBD multi-conn with up to "N" connections.  Only some
           NBD servers support this but it can greatly improve
           performance.

           The default is to enable multi-conn if we detect that the
           server supports it, with up to 4 connections.

       -M N
       --multi-conn-target=N
           If you are copying between NBD servers, use -m to control the
           multi-conn setting for the source server, and this option (-M)
           to control the multi-conn setting for the target server.

       -p
       --progress-bar
           Display a progress bar during copying.

       -p machine:FD
       --progress-bar=machine:FD
           Write a machine-readable progress bar to file descriptor "FD".
           This progress bar prints lines with the format "COPIED/TOTAL"
           (where "COPIED" and "TOTAL" are 64 bit unsigned integers).

       -S 0
       --sparse-detect=0
           Turn off sparseness detection.

       -S N
       --sparse-detect=N
           Detect runs of zero bytes of at least size "N" bytes and turn
           them into sparse blocks on the target (if "-a sparse" is used).
           This is the default, with a 512 byte block size.

       -T N
       --threads=N
           Use at most "N" threads when copying.  Usually more threads
           lead to better performance, up to the limit of the number of
           cores on your machine and the parallelism of the underlying
           disk or network.  The default is to use the number of online
           processors.

       -z
       --target-is-zero
           Declare that the target block device contains only zero bytes
           (or sparseness that reads back as zeroes).  You must only use
           this option if you are sure that this is true, since it means
           that nbdcp will enable an optimization where it skips zeroing
           parts of the disk that are zero on the source.

       -V
       --version
           Display the package name and version and exit.

SEE ALSO
       qemu-img(1), libnbd(3), nbdsh(1).

AUTHORS
       Richard W.M. Jones

COPYRIGHT
       Copyright (C) 2020 Red Hat Inc.

LICENSE
       This library is free software; you can redistribute it and/or
       modify it under the terms of the GNU Lesser General Public License
       as published by the Free Software Foundation; either version 2 of
       the License, or (at your option) any later version.

       This library is distributed in the hope that it will be useful,
       but WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
       Lesser General Public License for more details.

       You should have received a copy of the GNU Lesser General Public
       License along with this library; if not, write to the Free
       Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
       Boston, MA 02110-1301 USA

libnbd-1.3.1                      2020-01-23                          nbdcp(1)

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
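
To make the proposed -b / --block-list file format more concrete, here
is a minimal sketch of a parser for it.  It is not part of the
proposal: the names parse_size and parse_block_list are hypothetical,
the size suffixes are assumed to be powers of 1024, and the disk_size
parameter is an assumption needed to fill the tail of the disk.  Gaps
between records are filled with "hole,zero" extents, as the manual
page describes.

```python
import re

# Assumption: suffixes are powers of 1024 (K, M, G, T); no suffix means bytes.
_SUFFIXES = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
_SIZE_RE = re.compile(r"(\d+)([KMGT]?)", re.IGNORECASE)
_GAP = frozenset({"hole", "zero"})


def parse_size(text):
    """Convert a size such as "512" or "1M" to a number of bytes."""
    m = _SIZE_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"bad size: {text!r}")
    return int(m.group(1)) * _SUFFIXES[m.group(2).upper()]


def parse_block_list(lines, disk_size):
    """Parse "offset length [type]" records into a gap-free extent list.

    Returns (offset, length, flags) tuples sorted by offset, where flags is
    a frozenset drawn from {"hole", "zero"}.  An empty set means allocated
    data.  Undescribed ranges are reported as "hole,zero".
    """
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) not in (2, 3):
            raise ValueError(f"bad record: {line!r}")
        offset, length = parse_size(fields[0]), parse_size(fields[1])
        flags = frozenset(fields[2].split(",")) if len(fields) == 3 else frozenset()
        if not flags <= _GAP:
            raise ValueError(f"bad type in record: {line!r}")
        records.append((offset, length, flags))
    records.sort(key=lambda r: r[0])

    extents, pos = [], 0
    for offset, length, flags in records:
        if offset < pos:
            raise ValueError("overlapping extents")
        if offset > pos:
            extents.append((pos, offset - pos, _GAP))  # undescribed gap
        extents.append((offset, length, flags))
        pos = offset + length
    if pos > disk_size:
        raise ValueError("extents run past the end of the disk")
    if pos < disk_size:
        extents.append((pos, disk_size - pos, _GAP))  # undescribed tail
    return extents
```

Given the two example records from the manual page and a 16M disk,
this returns three extents: 1M of data, 9M of hole, and a further 6M
hole covering the undescribed tail.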