On 07/31/2017 03:00 PM, Fabian Grünbichler wrote:
> On Thu, Jul 27, 2017 at 11:25:41AM +0200, Emmanuel Kasper wrote:
>> It can happen that the qmp connection gets lost while mirroring a disk.
>> In that case the current block job gets cancelled, but the real cause of
>> the failure is lost, because we die() at a later step with the generic
>> message
>> "die "$job: mirroring has been cancelled\n"
>
> I am not quite sure I can follow.. see below
>
>>
>> example:
>> ...
>> drive-scsi0: transferred: 5524946944 bytes remaining: 918355968 bytes total:
>> 6443302912 bytes progression: 85.75 % busy: 1 ready: 0
>> drive-scsi0: Cancelling block job
>> drive-scsi0: Done.
>> 2017-07-26 15:39:56 ERROR: online migrate failure - mirroring error:
>> drive-scsi0: mirroring has been cancelled
>> 2017-07-26 15:39:56 aborting phase 2 - cleanup resources
>> 2017-07-26 15:39:56 migrate_cancel
>> ...
>
> but this must be from dying in line 6054 (caught by the eval in 6030),
> not from dying in line 6036? which means that query-block-jobs maybe
> returned an empty array (or undef?)..

Yes, you're right: this is caused by the die in line 6054. Since that die
has a misleading message, I sent a new patch for it.

>>
>> after patch applied:
>> 2017-07-27 09:43:37 ERROR: online migrate failure - mirroring error: lost
>> connection to qemu machine protocol: VM 600 not running
>> 2017-07-27 09:43:37 aborting phase 2 - cleanup resources
>
> but this would mean vm_qmp_command (called by vm_mon_cmd) died in line
> 4798, because check_running returned false??
>
> I'd rather fix check_running returning false then, because obviously the
> VM IS running isn't it? ;)

In that case the VM was NOT running, so check_running was right :)
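
To illustrate the point about the misleading message, here is a minimal sketch of the general idea: when the status query itself fails, report that failure instead of falling through to the generic "cancelled" error. This is Python rather than the Perl used in qemu-server, and `MirrorError`, `monitor_job` and `query_jobs` are hypothetical names made up for the example, not the actual code or the actual patch.

```python
class MirrorError(Exception):
    """Raised when a (hypothetical) mirror job cannot complete."""


def monitor_job(job, query_jobs):
    """Poll once and report why the job is gone, keeping the root cause.

    `query_jobs` is a hypothetical callable standing in for a
    query-block-jobs style call: it returns a list of running job
    names, or raises ConnectionError if the VM cannot be reached.
    """
    try:
        running = query_jobs()
    except ConnectionError as err:
        # Surface the real cause instead of falling through to the
        # generic "cancelled" message below.
        raise MirrorError(f"{job}: lost connection: {err}") from err

    if job not in running:
        # The query succeeded but the job is no longer listed.
        raise MirrorError(f"{job}: mirroring has been cancelled")
    return "running"
```

With this shape, a stopped VM yields an error that names the lost connection, while the "cancelled" message is reserved for the case where the query worked and the job is simply missing from the result.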