On Thu, 19 Jun 2014, Zhang, Eniac wrote:
> Hi,
>
> I am trying to use AHCI controllers with xen. The main reason for that is
> because windows is so picky about which type of controller it can boot on,
> and if you don’t have the right registry settings, you get Error 7B
> (http://support.microsoft.com/kb/324103)
>
> So here’s my attempt (in "hw/i386/pc_piix.c"):
>
>     ide_drive_get(hd, MAX_IDE_BUS);
>     if (pci_enabled) {
>         PCIDevice *dev;
>     #if 1 // Eniac
>         dev = pci_create_simple_multifunction(pci_bus,
>                 piix3_devfn + 1, true, "ich9-ahci");
>     #else // original code
>         if (xen_enabled()) {
>             dev = pci_piix3_xen_ide_init(pci_bus, hd, piix3_devfn + 1);
>         } else {
>             dev = pci_piix3_ide_init(pci_bus, hd, piix3_devfn + 1);
>         }
>     #endif // Eniac
>         idebus[0] = qdev_get_child_bus(&dev->qdev, "ide.0");
>         idebus[1] = qdev_get_child_bus(&dev->qdev, "ide.1");
>     } else {
>         for(i = 0; i < MAX_IDE_BUS; i++) {
>             ISADevice *dev;
>             dev = isa_ide_init(isa_bus, ide_iobase[i], ide_iobase2[i],
>                                ide_irq[i],
>                                hd[MAX_IDE_DEVS * i], hd[MAX_IDE_DEVS * i + 1]);
>             idebus[i] = qdev_get_child_bus(DEVICE(dev), "ide.0");
>         }
>     }
>
> And the config file to start xen fvm:
>
>     name = 'personal'
>     builder = 'hvm'
>     device_model_version = 'qemu-xen'
>     device_model_override = '/vm/qemu161test/qemu-161-vanilla'
>
>     vcpus = 4
>     memory = 2848
>     maxmem = 2848
>
>     usb = 1
>     usbdevice = 'tablet'
>     vnc = 1
>     vnclisten = '0.0.0.0:0'
>     serial='pty'
>
>     boot = "d"
>     device_model_args = [ "-drive", "if=none,file=/dev/sda,id=hd", "-device",
>         "ide-hd,drive=hd,bus=ide.0", "-drive",
>         "if=none,file=/vm/qemu161test/output.iso,id=cd", "-device",
>         "ide-cd,drive=cd,bus=ide.1" ]
>
> It works… but not flawlessly. The grub can load the Ubuntu menu, kernel and
> initrd but then crash when it’s trying to mount the rootfs. In other words,
> bios int13h can read the disk fine but Linux kernel driver can’t.
> I’ve seen similar corruption happening on Windows boot as well. After
> several days of triage, I realized that it might have to do with async read.
> I found a defect report on this and applied that
> (https://code.grnet.gr/projects/qemu/repository/revisions/8464b273d69c61e33c55347e5b6bc0659687bae2)
> but the problem persists.
>
> Here’s the command I used to demonstrate the corruption:
>
>     bash-4.2# for k in `seq 20`; do dd if=/dev/sr0 bs=51200 count=10 2>/dev/null |>
>     2393220578 512000
>     2393220578 512000
>     1498197434 512000
>     2393220578 512000
>     2393220578 512000
>     2393220578 512000
>     2393220578 512000
>     2393220578 512000
>     2393220578 512000
>     2393220578 512000
>     2393220578 512000
>     2393220578 512000
>     2393220578 512000
>     2393220578 512000
>     2393220578 512000
>     1652232720 512000
>     2393220578 512000
>     2393220578 512000
>     2393220578 512000
>     2393220578 512000
>
> One may need to run it several times to see the corruption. It looks like a
> random racing condition.
>
> I then tried several test with vanilla qemu:
>
> # non-xen test
> Qemu-1.6.1-vanilla with q35 chipset: no corruption
> Qemu-1.6.1-vanilla with 440fx chipset: no corruption
> Qemu-1.6.1 with AHCI patch and 440fx chipset: no corruption
>
> # test with vanilla-xen
> Qemu-1.6.1-vanilla with 440fx chipset: no corruption
> Qemu-1.6.1 with AHCI patch and 440fx chipset: corruption

Can you try with a more recent QEMU version? Maybe QEMU 2.0?

> So the problem lies between the interaction between AHCI controller and xen.
> Has anyone else tried this and/or can take a look
> to see what’s happening here?

In these cases it is usually a mapcache (see xen-mapcache.c) problem. Not a
bug in the mapcache per se, but maybe it is not called correctly from common
code. To give you an idea of a possible bad interaction between the mapcache
and common code, see:

commit a41087bc7110e8378cd49ddd06aa7c9d361f3673
Author: Stefano Stabellini <stefano.stabell...@eu.citrix.com>
Date:   Thu Jan 30 12:46:05 2014 +0000

    address_space_translate: do not cross page boundaries
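
To illustrate what "do not cross page boundaries" means in practice, here is a
rough sketch of clamping an access length so that it ends at the next page
boundary. This is not the code from that commit: the helper name
clamp_len_to_page and the caller-supplied page_size parameter are hypothetical
and only show the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): given an access of `len` bytes
 * starting at `addr`, return how many of those bytes fit before the next
 * page boundary. `page_size` is an assumption supplied by the caller and
 * must be a non-zero power of two.
 */
static uint64_t clamp_len_to_page(uint64_t addr, uint64_t len,
                                  uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    /* Bytes remaining from addr up to the end of the page containing it. */
    uint64_t left_in_page = page_size - (addr & (page_size - 1));

    /* Never hand back more than fits in the current page. */
    return len < left_in_page ? len : left_in_page;
}
```

With a 4096-byte page, an access of 0x100 bytes at address 0x1ff0 would be
clamped to 0x10 bytes, and the caller would have to issue the remainder as a
separate access starting at 0x2000.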