On Wed, Jun 19, 2013 at 12:50:16PM +0200, Kevin Wolf wrote:
> Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> > +    DPRINTF("%s enter %s C%" PRId64 " %" PRId64 " %d\n",
> > +            __func__, bdrv_get_device_name(bs), start, sector_num,
> > +            nb_sectors);
> 
> Maybe put the first "%s" and __func__ directly into the DPRINTF macro?

Will use trace events instead.
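Roughly like this: an entry in the trace-events file plus a trace_*()
call replacing the DPRINTF. The event name is illustrative, not final.

    # trace-events
    backup_do_cow_enter(void *job, int64_t start, int64_t sector_num, int nb_sectors) "job %p start %"PRId64" sector_num %"PRId64" nb_sectors %d"

And in block/backup.c:

    #include "trace.h"

    trace_backup_do_cow_enter(job, start, sector_num, nb_sectors);

That also makes the __func__ question moot, since the event name already
identifies the call site.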
> > +    wait_for_overlapping_requests(job, start, end);
> > +    cow_request_begin(&cow_request, job, start, end);
> > +
> > +    total_sectors = bdrv_getlength(bs);
> > +    if (total_sectors < 0) {
> > +        if (error_is_read) {
> > +            *error_is_read = true;
> > +        }
> > +        ret = total_sectors;
> > +        goto out;
> > +    }
> > +    total_sectors /= BDRV_SECTOR_SIZE;
> 
> Why is this needed? There are two callers of the function: one is the
> write notifier, which already has a sector number that is checked
> against the image size. The other one is the background job, which gets
> the size once when it starts. I don't think there's a reason to call the
> block driver each time and potentially do an expensive request.
> 
> At first I thought it had something to do with resizing images, but that
> is forbidden while a block job is running (otherwise the job's main loop
> exit condition would be wrong, too), so it doesn't make this necessary.

Thanks, I will eliminate the call.
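A sketch of what I mean, so we're on the same page: cache the size once
when the job starts and have backup_do_cow() read the cached value. The
total_sectors field and the cleanup label are illustrative, not
necessarily what the next revision will look like.

    typedef struct BackupBlockJob {
        BlockJob common;
        BlockDriverState *target;
        int64_t total_sectors;  /* cached image size in sectors; resize
                                 * is forbidden while the job runs */
        /* ... */
    } BackupBlockJob;

At the start of backup_run():

    int64_t len = bdrv_getlength(bs);
    if (len < 0) {
        ret = len;
        goto cleanup;
    }
    job->total_sectors = len / BDRV_SECTOR_SIZE;

The per-request bdrv_getlength() call and its error_is_read handling in
backup_do_cow() then disappear.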
> > +    /* Publish progress */
> > +    job->sectors_read += n;
> > +    job->common.offset += n * BDRV_SECTOR_SIZE;
> 
> This is interesting, because the function is not only called by the
> background job, but also by write notifiers. So 'offset' in a literal
> sense doesn't make too much sense, because we're not operating purely
> sequentially.
> 
> The QAPI documentation describes 'offset' like this:
> 
>     # @offset: the current progress value
> 
> If we take it as just that, I think we could actually consider this code
> correct, because it's a useful measure of the progress (each sector is
> copied only once, either by the job or by a notifier), even though it
> really has nothing to do with an offset into the image.
> 
> Maybe a comment would be appropriate.

Will add the comment.

> > +static BlockJobType backup_job_type = {
> > +    .instance_size = sizeof(BackupBlockJob),
> > +    .job_type = "backup",
> > +    .set_speed = backup_set_speed,
> > +    .iostatus_reset = backup_iostatus_reset,
> > +};
> 
> Align = on the same column? Should probably be const, too.

Thanks for pointing out the const. I'll throw in the alignment as a
bonus; there is a team of ASCII artists here who do amazing whitespace
work ;).

> > +static void coroutine_fn backup_run(void *opaque)
> > +{
> > +    BackupBlockJob *job = opaque;
> > +    BlockDriverState *bs = job->common.bs;
> > +    BlockDriverState *target = job->target;
> > +    BlockdevOnError on_target_error = job->on_target_error;
> > +    NotifierWithReturn before_write = {
> > +        .notify = backup_before_write_notify,
> > +    };
> > +    int64_t start, end;
> > +    int ret = 0;
> > +
> > +    QLIST_INIT(&job->inflight_reqs);
> > +    qemu_co_rwlock_init(&job->flush_rwlock);
> > +
> > +    start = 0;
> > +    end = DIV_ROUND_UP(bdrv_getlength(bs) / BDRV_SECTOR_SIZE,
> 
> bdrv_getlength() can fail.

Will fix.

> > +                       BACKUP_SECTORS_PER_CLUSTER);
> > +
> > +    job->bitmap = hbitmap_alloc(end, 0);
> > +
> > +    bdrv_set_on_error(target, on_target_error, on_target_error);
> > +    bdrv_iostatus_enable(target);
> 
> Mirroring also has:
> 
>     bdrv_set_enable_write_cache(s->target, true);
> 
> Wouldn't it make sense for backup as well?

I guess so.

> > +        /* we need to yield so that qemu_aio_flush() returns.
> > +         * (without, VM does not reboot)
> > +         */
> > +        if (job->common.speed) {
> > +            uint64_t delay_ns = ratelimit_calculate_delay(
> > +                &job->limit, job->sectors_read);
> > +            job->sectors_read = 0;
> > +            block_job_sleep_ns(&job->common, rt_clock, delay_ns);
> 
> Here's the other implication of counting the progress of copies
> initiated by write notifiers: the more copies they trigger, the fewer
> additional copies are made by the background job.
> 
> I'm willing to count this as a feature rather than a bug.

Yes, the guest does not get throttled by the block job.

> > +        } else {
> > +            block_job_sleep_ns(&job->common, rt_clock, 0);
> > +        }
> > +
> > +        if (block_job_is_cancelled(&job->common)) {
> > +            break;
> > +        }
> > +
> > +        DPRINTF("backup_run loop C%" PRId64 "\n", start);
> > +
> > +        ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER, 1,
> > +                            &error_is_read);
> 
> You're taking advantage of the fact that backup_do_cow() rounds this one
> sector up to 64k, which is a much more reasonable size. But why not
> specify BACKUP_SECTORS_PER_CLUSTER as nb_sectors when this is what you
> really assume?

Sounds good, though I need to double-check whether we run into issues
when hitting the end of the disk (if it is not aligned to
BACKUP_SECTORS_PER_CLUSTER).
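If it turns out to be a problem, the fix I have in mind is to clamp
nb_sectors in the loop so the last cluster doesn't reach past the end of
the disk; total_sectors is the cached image size in sectors from the
sketch above. Just a sketch, not tested:

    for (; start < end; start++) {
        /* Copy a whole cluster, except possibly a final partial one */
        int n = MIN(BACKUP_SECTORS_PER_CLUSTER,
                    job->total_sectors - start * BACKUP_SECTORS_PER_CLUSTER);

        ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER, n,
                            &error_is_read);
        /* ... error handling as before ... */
    }

Whether the clamp lives here or inside backup_do_cow() itself is a
detail I still need to look at.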