Hi! I've been fighting with this problem for a long time, but Xen's lack of documentation and poor error messages make it hard to solve:
I've configured a Xen paravirtualized VM using tap:aio on top of an OCFS2 filesystem. I could live-migrate the VM between two hosts several times, but when I try to reboot the VM (xm reboot vm), the VM doesn't find its boot disk:

[...]
[ 0.198834] registered taskstats version 1
[ 0.198853] Magic number: 1:252:3141
[ 0.198866] XENBUS: Device with no driver: device/vbd/51712
[ 0.198869] XENBUS: Device with no driver: device/vbd/51728
[ 0.198871] XENBUS: Device with no driver: device/vif/0
[ 0.198873] XENBUS: Device with no driver: device/vif/1
[ 0.198875] XENBUS: Device with no driver: device/vif/2
[ 0.198987] Freeing unused kernel memory: 428k freed
[ 0.199111] Write protecting the kernel read-only data: 6868k
doing fast boot
[ 5.356160] XENBUS: Waiting for devices to initialise: 295s...290s...285s...280s...275s...270s...265s...260s...255s...250s...245s...240s...235s...230s...225s...220s...215s...210s...205s...200s...195s...190s...185s...180s...175s...170s...165s...160s...155s...150s...145s...140s...135s...130s...125s...120s...115s...110s...105s...100s...95s...90s...85s...80s...75s...70s...65s...60s...55s...50s...45s...40s...35s...30s...25s...20s...15s...10s...5s...0s...
[ 300.356355] XENBUS: Timeout connecting to device: device/vbd/51712 (local state 3, remote state 2)
[ 300.356370] XENBUS: Device not ready: device/vbd/51712
[ 300.356554] XENBUS: Timeout connecting to device: device/vbd/51728 (local state 3, remote state 2)
[ 300.356565] XENBUS: Device not ready: device/vbd/51728
[...]

In a cluster with 10 VMs, some VMs boot without problems while others cannot, and the set of VMs that cannot boot changes over time. I don't see any pattern; my only guesses are some obscure software bug in blktap or some obscure configuration problem.

I'm using SLES11 SP2 on x86_64. The problem occurs with both Intel and AMD CPUs.

Could it be related to the contents of "tapdisk-ioemu.log", which says the following?

connected disks: 0 => 1
connected disks: 1 => 0
Last image is closed, exiting.
connected disks: 0 => 1
connected disks: 1 => 2
connected disks: 2 => 1
(it seems the device mapping changed)

BTW: The log without timestamps is quite useless, as shown on another host:

[...]
connected disks: 2 => 3
connected disks: 3 => 4
connected disks: 4 => 5
connected disks: 5 => 6
connected disks: 0 => 1
connected disks: 1 => 2
connected disks: 2 => 3
connected disks: 3 => 4
connected disks: 4 => 3

Regards,
Ulrich
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
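
PS: For anyone reading the XENBUS timeout lines above, here is a minimal Python sketch that decodes the numeric states and the vbd device numbers. The state names follow the XenbusState enum in Xen's public header (xen/include/public/io/xenbus.h), and the device number decoding assumes the usual xvd layout (block major 202, 16 minors per disk). The function names are made up for illustration and are not part of any Xen tool.

```python
import re

# XenbusState values from Xen's public header xen/include/public/io/xenbus.h
XENBUS_STATES = {
    0: "Unknown",
    1: "Initialising",
    2: "InitWait",
    3: "Initialised",
    4: "Connected",
    5: "Closing",
    6: "Closed",
    7: "Reconfiguring",
    8: "Reconfigured",
}

XVD_MAJOR = 202        # Linux block major used for xvd devices
MINORS_PER_DISK = 16   # each xvd disk owns 16 minors (disk + 15 partitions)

TIMEOUT_RE = re.compile(
    r"Timeout connecting to device: device/vbd/(\d+) "
    r"\(local state (\d+), remote state (\d+)\)"
)


def vbd_to_name(devid):
    """Map a vbd number such as 51712 to a name such as 'xvda' (hypothetical helper)."""
    major, minor = devid >> 8, devid & 0xFF
    if major != XVD_MAJOR:
        return None  # not in the classic xvd encoding; left undecoded here
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "xvd" + chr(ord("a") + disk)
    return name + (str(part) if part else "")


def decode_timeout(line):
    """Decode one 'Timeout connecting to device' line, or return None if it does not match."""
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    devid, local, remote = (int(g) for g in m.groups())
    return {
        "device": vbd_to_name(devid),
        "local": XENBUS_STATES.get(local, "?"),
        "remote": XENBUS_STATES.get(remote, "?"),
    }


if __name__ == "__main__":
    sample = ("[ 300.356355] XENBUS: Timeout connecting to device: "
              "device/vbd/51712 (local state 3, remote state 2)")
    print(decode_timeout(sample))
```

Applied to the two timeout lines above, this gives xvda and xvdb, each with local state Initialised and remote state InitWait. If I read that correctly, the frontend in the guest finished its part while the backend in dom0 was still waiting, but that alone does not say why.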
