On 21.01.2011 14:59, Yoshiaki Tamura wrote:
> 2011/1/21 Pierre Riteau <pierre.rit...@irisa.fr>:
>> On 21 Jan 2011, at 13:36, Yoshiaki Tamura wrote:
>>
>>> 2011/1/21 Kevin Wolf <kw...@redhat.com>:
>>>> On 21.01.2011 13:15, Yoshiaki Tamura wrote:
>>>>> 2011/1/21 Pierre Riteau <pierre.rit...@irisa.fr>:
>>>>>> On 20 Jan 2011, at 17:18, Yoshiaki Tamura
>>>>>> <tamura.yoshi...@lab.ntt.co.jp> wrote:
>>>>>>
>>>>>>> 2011/1/20 Pierre Riteau <pierre.rit...@irisa.fr>:
>>>>>>>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>>>>
>>>>>>>>> 2011/1/19 Pierre Riteau <pierre.rit...@irisa.fr>:
>>>>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>>>>
>>>>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>>>>> ---
>>>>>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>>>>> index 1475325..eeb9c62 100644
>>>>>>>>>> --- a/block-migration.c
>>>>>>>>>> +++ b/block-migration.c
>>>>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>      int64_t addr;
>>>>>>>>>>      BlockDriverState *bs;
>>>>>>>>>>      uint8_t *buf;
>>>>>>>>>> +    int64_t total_sectors;
>>>>>>>>>> +    int nr_sectors;
>>>>>>>>>>
>>>>>>>>>>      do {
>>>>>>>>>>          addr = qemu_get_be64(f);
>>>>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>                  return -EINVAL;
>>>>>>>>>>              }
>>>>>>>>>>
>>>>>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>>>>> +            if (total_sectors <= 0) {
>>>>>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>>>>> +                return -EINVAL;
>>>>>>>>>> +            }
>>>>>>>>>> +
>>>>>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>>>>>> +            } else {
>>>>>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>>>>> +            }
>>>>>>>>>> +
>>>>>>>>>>              buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>>>>
>>>>>>>>>>              qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>>>>
>>>>>>>>>>              qemu_free(buf);
>>>>>>>>>>              if (ret < 0) {
>>>>>>>>>> --
>>>>>>>>>> 1.7.3.5
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Pierre,
>>>>>>>>>
>>>>>>>>> I don't think the fix above is correct. If you have a file which
>>>>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>>>>> patch.
>>>>>>>>> However, the receiver doesn't know how many sectors the
>>>>>>>>> sender wants to be written, so the guest may fail after
>>>>>>>>> migration because some data may not be written. IIUC, although
>>>>>>>>> changing the bytestream should be avoided as much as possible, we
>>>>>>>>> should save/load total_sectors to check that an appropriate file is
>>>>>>>>> allocated on the receiver side.
>>>>>>>>
>>>>>>>> Isn't the guest supposed to be started using a file with the correct
>>>>>>>> size?
>>>>>>>
>>>>>>> I personally don't like that; it demands too much of the user.
>>>>>>> Can't we expand the image on the fly? We can just abort if expanding
>>>>>>> fails anyway.
>>>>>>
>>>>>> At first I thought your expansion idea was best, but now I think there
>>>>>> are valid scenarios where it fails.
>>>>>>
>>>>>> Imagine both sides are not using a file but a disk partition as storage.
>>>>>> If the partition size is not rounded to 1 MB, the last write will fail
>>>>>> with the current code, and there is no way we can expand the partition.
>>>>>>
>>>>>
>>>>> Right. But in the case of a partition, doesn't the check in the patch
>>>>> below return an error? Does bdrv_getlength return the size correctly?
>>>>
>>>> I'm pretty sure that it does. We would have problems in other places if
>>>> it didn't (e.g. we're checking if I/O requests are within the disk size).
>>>
>>> Sorry for the noise. I just learned it's returning the value of lseek
>>> in the case of raw-posix.
>>
>>
>> And it does an ioctl call on platforms other than Linux.
>
> Thanks. Just a quick question regarding total_sectors.
> BlockDriverState seems to contain total_sectors. Can we avoid
> calling bdrv_getlength() if bs->total_sectors is already there?

I'd need to check the details, but I think it may not be correct with
growable files.

Kevin
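
For reference, the clamping that the quoted patch applies to the final chunk can be sketched in isolation. This is only an illustration under stated assumptions: chunk_sectors is a hypothetical helper and not a QEMU function, the constants assume 512-byte sectors and the 1 MB BLOCK_SIZE mentioned in the commit message, and the guard for an address at or past the end of the device is an addition that the patch itself does not have.

```c
#include <stdint.h>

/* Assumed values: 512-byte sectors and a 1 MB migration block,
 * matching "BLOCK_SIZE (currently 1 MB)" in the commit message. */
#define SECTOR_BITS        9
#define BLOCK_SIZE_BYTES   (1 << 20)
#define SECTORS_PER_CHUNK  (BLOCK_SIZE_BYTES >> SECTOR_BITS)   /* 2048 */

/*
 * Hypothetical helper (not part of QEMU): number of sectors to write
 * for the chunk starting at sector `addr` on a device that holds
 * `total_sectors` sectors.
 */
static int chunk_sectors(int64_t total_sectors, int64_t addr)
{
    int64_t remaining = total_sectors - addr;

    if (remaining <= 0) {
        /* Guard added for this sketch only: nothing left to write. */
        return 0;
    }
    if (remaining < SECTORS_PER_CHUNK) {
        /* Final, partial chunk: write only what fits on the device. */
        return (int)remaining;
    }
    return SECTORS_PER_CHUNK;
}
```

With a 1.5 MB device (3072 sectors), chunk_sectors(3072, 0) yields a full 2048 sectors and chunk_sectors(3072, 2048) is clamped to the remaining 1024. Note that in the quoted diff the receiver still reads a full BLOCK_SIZE buffer from the stream before writing, so the clamp only limits what is written, which is the concern raised earlier in the thread.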