On Fri, Jan 30, 2015 at 02:32:25PM -0500, Max Reitz wrote: > On 2015-01-30 at 13:41, Kashyap Chamarthy wrote: > >On Fri, Jan 30, 2015 at 06:15:21PM +0100, Kevin Wolf wrote: > >>Am 29.01.2015 um 17:25 hat Kashyap Chamarthy geschrieben:
[. . .] > >>Copying Stefan because he's the master of AIO contexts and it is > >>bs->aio_context that becomes NULL. I couldn't see anything obvious. > >> > >> > >>In the meantime, could you retest on git master? > >Just tested from git, and I can still reproduce it. > > > >That's the commit I'm at: > > > > $ git describe > > v2.2.0-682-g16017c4 > > > > > >Run the NBD server, from git: > > > > $ /home/kashyapc/build/qemu/qemu-nbd -f qcow2 \ > > -p10809 ./f21vm.qcow2 -t > > > > > >Create the overlay: > > > > $ /home/kashyapc/build/qemu/qemu-img create \ > > -f qcow2 -F nbd -o backing_file=nbd://localhost > > overlay2-of-f21vm.qcow2 > > Segmentation fault (core dumped) > > You want to use -F raw. The file format is raw, not nbd (nbd is the protocol > over which the data is read, which is in format raw). Noted, thanks for this detail. However, the `qemu-img` binary from the system (version noted earlier in the thread) honors the "-F nbd" option just fine: $ qemu-img create -f qcow2 -F nbd \ -o backing_file=nbd://localhost overlay3-of-f21vm.qcow2 Formatting 'overlay3-of-f21vm.qcow2', fmt=qcow2 size=42949672960 backing_file='nbd://localhost' backing_fmt='nbd' encryption=off cluster_size=65536 lazy_refcounts=off Then, the segfault with the git-compiled `qemu-img` binary seems like some kind of an incorrect "regression" (if you could call it so). Anyhow, the `qemu-img` from git works correctly as per your reasoning: $ /home/kashyapc/build/qemu/qemu-img create \ -f qcow2 -F raw -o backing_file=nbd://localhost > Anyway, -F nbd shouldn't result in a segfault. One way to prevent this is to > check whether the backing file format specified (or any format given to > qemu-img in general) is a real format or the name of a protocol driver and > then error out if it's the latter; but that would be more of a hotfix. > > Kevin, Stefan: The real problem is that block/nbd.c stores a BDRVNBDState > object in bs->opaque and passes &BDRVNBDState.client (an NbdClientSession > object) to the block/nbd-client.c functions. Those functions then receive > the BDS pointer from client->bs. If an NBD BDS is a root BDS (as in this > case), at some point a bdrv_swap() may happen (and it does happen here) > which leads to ((BDRVNBDState *)bs->opaque)->client.bs != bs, and that's > where the segfault comes from (bdrv_get_aio_context() returns NULL). > > One way to fix this real problem is to remove the BDS pointer from the > NbdClientSession and to always pass the BDS explicitly to the > block/nbd-client.c functions; the other is to always update the BDS pointer > in NbdClientSession in block/nbd.c. I'll try the former, and if it doesn't > work, will do the latter (if you don't object). Thank you for investigating. -- /kashyap