2011/1/21 Kevin Wolf <kw...@redhat.com>:
> On 21.01.2011 09:08, Pierre Riteau wrote:
>> On 20 Jan 2011, at 17:18, Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>
>> wrote:
>>
>>> 2011/1/20 Pierre Riteau <pierre.rit...@irisa.fr>:
>>>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>
>>>>> 2011/1/19 Pierre Riteau <pierre.rit...@irisa.fr>:
>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>
>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>> ---
>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>
>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>> index 1475325..eeb9c62 100644
>>>>>> --- a/block-migration.c
>>>>>> +++ b/block-migration.c
>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>      int64_t addr;
>>>>>>      BlockDriverState *bs;
>>>>>>      uint8_t *buf;
>>>>>> +    int64_t total_sectors;
>>>>>> +    int nr_sectors;
>>>>>>
>>>>>>      do {
>>>>>>          addr = qemu_get_be64(f);
>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>                  return -EINVAL;
>>>>>>              }
>>>>>>
>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>> +            if (total_sectors <= 0) {
>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>> +                return -EINVAL;
>>>>>> +            }
>>>>>> +
>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>> +            } else {
>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>> +            }
>>>>>> +
>>>>>>              buf = qemu_malloc(BLOCK_SIZE);
>>>>>>
>>>>>>              qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>
>>>>>>              qemu_free(buf);
>>>>>>              if (ret < 0) {
>>>>>> --
>>>>>> 1.7.3.5
>>>>>>
>>>>>
>>>>> Hi Pierre,
>>>>>
>>>>> I don't think the fix above is correct. If you have a file which
>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>> patch. However, the receiver doesn't know how many sectors the
>>>>> sender wants to be written, so the guest may fail after
>>>>> migration because some data may not be written. IIUC, although
>>>>> changing the bytestream should be avoided as much as possible, we
>>>>> should save/load total_sectors to check that an appropriate file is
>>>>> allocated on the receiver side.
>>>>
>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>
>>> I personally don't like that; it demands too much of the user.
>>> Can't we expand the image on the fly? We can just abort if expanding
>>> failed anyway.
>>
>> At first I thought your expansion idea was best, but now I think there are
>> valid scenarios where it fails.
>>
>> Imagine both sides are not using a file but a disk partition as storage. If
>> the partition size is not rounded to 1 MB, the last write will fail with the
>> current code, and there is no way we can expand the partition.
>
> Actually, being able to change the image size is a special case. It only
> works on raw with file and sheepdog, and on qcow2 and qed. All other
> block drivers can't do it.
>
>>>> But I guess changing the protocol would be best as it would avoid
>>>> headaches to people who mistakenly created a file that is too small.
>>>
>>> We should think carefully before changing the protocol.
>>>
>>> Kevin?
>
> Can we do it in a compatible way? I agree that it would be nice to catch
> this error, but changing the protocol in an incompatible way for it
> seems to be too much.

No. However, this is not only about catching this error, but also about
improving the usability of block migration. I don't expect everything to
change at once, but I think it would be worthwhile to discuss whether we
want to improve block migration.

Yoshi

> Anyway, it's independent of this patch and can be done on top.
>
> Kevin
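
For reference, the size arithmetic debated in this thread can be sketched in isolation. This is only an illustration, not the code in block-migration.c: the function name sectors_in_chunk and the macro names are hypothetical, and the values assume 512-byte sectors together with the 1 MB chunk size mentioned in the patch description. The range check on addr is an addition of this sketch; the quoted patch does not perform it.

```c
#include <stdint.h>

/* Assumed values for illustration only: 512-byte sectors and the 1 MB
 * chunk size mentioned in the patch description. These names are
 * hypothetical stand-ins, not the real QEMU macros. */
#define SECTOR_BITS        9
#define CHUNK_BYTES        (1 << 20)
#define SECTORS_PER_CHUNK  (CHUNK_BYTES >> SECTOR_BITS)

/* Number of sectors to write for the chunk that starts at sector `addr`
 * on a device holding `total_sectors` sectors. The last chunk is clamped
 * to whatever remains on the device. Returns -1 when the device length
 * is not positive or `addr` lies outside the device (a check added in
 * this sketch, not present in the quoted patch). */
static int sectors_in_chunk(int64_t total_sectors, int64_t addr)
{
    if (total_sectors <= 0 || addr < 0 || addr >= total_sectors) {
        return -1;
    }

    int64_t remaining = total_sectors - addr;

    return remaining < SECTORS_PER_CHUNK ? (int)remaining : SECTORS_PER_CHUNK;
}
```

With these assumed constants, a device of 3000 sectors gives a full chunk of 2048 sectors at addr 0 and a short final chunk of 952 sectors at addr 2048. As noted above, clamping only avoids the failing write; the receiver still cannot tell whether its local device has the size the sender expects unless that size is transmitted.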