Re: [Qemu-devel] OUT_ASM on two different systems
On 17/07/2016 04:06, Ayaz Akram wrote:
> Hi all!
>
> I ran a program with qemu in user mode emulation and generated a trace of
> the generated host instructions using (-d OUT_ASM) on two different Linux
> systems. I expected that the addresses in the two trace files could differ.
> But the total number of lines in the two files is different as well. I mean
> the generated host instructions in the two files are different (I have not
> yet looked into the details of those differences). Qemu and the program's
> binary are exactly the same on both systems. I wonder if someone can help
> me explain this?
>
> Thanks for your time!
>

It's difficult to answer your question without also seeing an example of
those differences.

Paolo
[Qemu-devel] [Bug 1603693] Re: Disks in mptsas1068 scsi controller not seen by linux
Linux requires that you specify a WWN for the disk (through the wwn
property of the scsi-disk device).

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1603693

Title:
  Disks in mptsas1068 scsi controller not seen by linux

Status in QEMU:
  New

Bug description:
  When using the mptsas1068 scsi controller, linux detects the
  controller itself but not the drives attached to it. FreeBSD works.
  Using a different controller with linux works. VMware with linux
  works.

  qemu 2.6.50 (v2.6.0-1925-g6b92bbf)
  seabios rel-1.9.0-139-gae3f78f (master branch, required for mptsas1068 support)

  Test script, loosely based off what libvirt runs and the libvirt tests
  that Paolo Bonzini wrote [1]

  #
  iso=archlinux-2016.07.01-dual.iso
  #iso=FreeBSD-10.3-RELEASE-amd64-bootonly.iso
  device=mptsas1068
  #device=lsi
  img=empty.img
  qemu-img create -f qcow2 $img 1G
  /usr/bin/qemu-system-x86_64 \
    -enable-kvm \
    -m 1024 \
    -boot menu=on \
    -device $device,id=scsi0,bus=pci.0,addr=0x9 \
    -drive file=$img,format=qcow2,if=none,id=drive-scsi0-0-0-0 \
    -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 \
    -drive file=$iso,format=raw,if=none,id=drive-ide0-0-1,readonly=on \
    -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=1
  #

  The ISOs can be downloaded from [2] and [3].

  After booting linux, do "lsblk". /dev/sda should exist.
  After booting freebsd, do "geom disk list". A da0 / "QEMU QEMU HARDDISK" should be mentioned.

  With device=mptsas1068 this fails in linux. With device=lsi it works
  in both.

  With VMWare and a linux VM (opensuse 10.1, kernel 2.6.18) which only
  loads modules for mptsas1068, this works.

  I also reproduced this with the debian 8.5 netinstall image, but it
  insists on making you pick a driver from a list of modules when it
  fails to mount it, instead of dropping to a shell.
  Arch linux dmesg output snippet (full output attached as
  arch-linux-dmesg.txt):

  #
  root@archiso ~ # dmesg | grep -i -e mpt -e scsi -e ioc0
  [0.00] Linux version 4.6.3-1-ARCH (builduser@tobias) (gcc version 6.1.1 20160602 (GCC) ) #1 SMP PREEMPT Fri Jun 24 21:19:13 CEST 2016
  [0.00] Normal empty
  [0.00] Preemptible hierarchical RCU implementation.
  [1.879616] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249)
  [1.951581] SCSI subsystem initialized
  [1.957113] Fusion MPT base driver 3.04.20
  [1.957618] Fusion MPT SAS Host driver 3.04.20
  [2.281773] scsi host0: ata_piix
  [2.285372] scsi host1: ata_piix
  [2.305803] mptbase: ioc0: Initiating bringup
  [2.363555] ioc0: LSISAS1068 A0: Capabilities={Initiator}
  [2.444390] scsi 0:0:1:0: CD-ROM QEMU QEMU DVD-ROM 2.5+ PQ: 0 ANSI: 5
  [2.500572] scsi host2: ioc0: LSISAS1068 A0, FwRev=01329200h, Ports=8, MaxQ=128, IRQ=11
  [2.507024] sr 0:0:1:0: [sr0] scsi3-mmc drive: 4x/4x cd/rw xa/form2 tray
  [2.507274] sr 0:0:1:0: Attached scsi CD-ROM sr0
  #

  The controller itself is detected, the disk isn't.

  An early version of this patch [4] said that it was only tested with
  FreeBSD:

  >Tested with FreeBSD for now. The previous version (before the
  >configuration page rewrite) worked with RHEL and Windows guests as well.
  >
  >TODO: write qtest for (at least) config pages, test Linux and Windows.

  [1]: https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=fc922eb2080a3fa7b24bc8a8b0aabfd394480143
  [2]: https://www.archlinux.org/download
  [3]: https://www.freebsd.org/where.html
  [4]: https://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg06475.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1603693/+subscriptions
[Qemu-devel] [Bug 1603734] [NEW] Hang in fsqrt
Public bug reported:

At least qemu-i386 and qemu-x86_64 hang in floatx80_sqrt in versions
2.6.0 and git (2.6.50) for some input values, likely due to an infinite
loop at fpu/softfloat.c:6569.

Steps to reproduce:
1) Compile attached code: gcc -o test test.c -lm
2) `qemu-i386 test` and `qemu-x86_64 test` will hang at 100% cpu

** Affects: qemu
     Importance: Undecided
         Status: New

** Attachment added: "minimal example"
   https://bugs.launchpad.net/bugs/1603734/+attachment/4702260/+files/test.c

https://bugs.launchpad.net/bugs/1603734
Re: [Qemu-devel] [v9 00/19] QEMU:Xen stubdom vTPM for HVM virtual machine(QEMU Part)
On 2016 Jul 14 (Thu) 23:34, Stefano Stabellini wrote:
> Hi Quan,
>
> thanks for CC'ing me. sstabell...@kernel.org is the right address to
> reach me now.
>
> I am also CC'ing Anthony Perard who is Xen co-maintainer in QEMU.
>
> Cheers,
>
> Stefano

thanks in advance!! :):)
Quan
Re: [Qemu-devel] [Xen-devel] [PATCH 01/19] xen: Create a new file xen_pvdev.c
[Quan:]: comment starts with [Quan:] The purpose of the new file is to store generic functions shared by frontendand backends such as xenstore operations, xendevs. Signed-off-by: Quan Xu Signed-off-by: Emil Condrea --- hw/xen/Makefile.objs | 2 +- hw/xen/xen_backend.c | 125 +--- hw/xen/xen_pvdev.c | 149 +++ include/hw/xen/xen_backend.h | 63 +- include/hw/xen/xen_pvdev.h | 71 + 5 files changed, 223 insertions(+), 187 deletions(-) create mode 100644 hw/xen/xen_pvdev.c create mode 100644 include/hw/xen/xen_pvdev.h diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs index d367094..591cdc2 100644 --- a/hw/xen/Makefile.objs +++ b/hw/xen/Makefile.objs @@ -1,5 +1,5 @@ # xen backend driver support -common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o +common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o xen_pvdev.o obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_graphics.o xen_pt_msi.o diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c index bab79b1..a251a4a 100644 --- a/hw/xen/xen_backend.c +++ b/hw/xen/xen_backend.c @@ -30,6 +30,7 @@ #include "sysemu/char.h" #include "qemu/log.h" #include "hw/xen/xen_backend.h" +#include "hw/xen/xen_pvdev.h" #include @@ -56,8 +57,6 @@ static QTAILQ_HEAD(xs_dirs_head, xs_dirs) xs_cleanup = static QTAILQ_HEAD(XenDeviceHead, XenDevice) xendevs = QTAILQ_HEAD_INITIALIZER(xendevs); static int debug = 0; -/* - */ - static void xenstore_cleanup_dir(char *dir) { struct xs_dirs *d; @@ -76,34 +75,6 @@ void xen_config_cleanup(void) } } -int xenstore_write_str(const char *base, const char *node, const char *val) -{ -char abspath[XEN_BUFSIZE]; - -snprintf(abspath, sizeof(abspath), "%s/%s", base, node); -if (!xs_write(xenstore, 0, abspath, val, strlen(val))) { -return -1; -} -return 0; -} - -char *xenstore_read_str(const char *base, const char *node) -{ -char abspath[XEN_BUFSIZE]; -unsigned int len; -char *str, *ret = NULL; - 
-snprintf(abspath, sizeof(abspath), "%s/%s", base, node); -str = xs_read(xenstore, 0, abspath, &len); -if (str != NULL) { -/* move to qemu-allocated memory to make sure - * callers can savely g_free() stuff. */ -ret = g_strdup(str); -free(str); -} -return ret; -} - int xenstore_mkdir(char *path, int p) { struct xs_permissions perms[2] = { @@ -128,48 +99,6 @@ int xenstore_mkdir(char *path, int p) return 0; } -int xenstore_write_int(const char *base, const char *node, int ival) -{ -char val[12]; - [Quan:]: why 12 ? what about XEN_BUFSIZE? -snprintf(val, sizeof(val), "%d", ival); -return xenstore_write_str(base, node, val); -} - -int xenstore_write_int64(const char *base, const char *node, int64_t ival) -{ -char val[21]; - [Quan:]: why 21 ? what about XEN_BUFSIZE? -snprintf(val, sizeof(val), "%"PRId64, ival); -return xenstore_write_str(base, node, val); -} - -int xenstore_read_int(const char *base, const char *node, int *ival) -{ -char *val; -int rc = -1; - -val = xenstore_read_str(base, node); [Quan:]: IMO, it is better to initialize val when declares. 
the same comment for the other 'val' -if (val && 1 == sscanf(val, "%d", ival)) { -rc = 0; -} -g_free(val); -return rc; -} - -int xenstore_read_uint64(const char *base, const char *node, uint64_t *uval) -{ -char *val; -int rc = -1; - -val = xenstore_read_str(base, node);-if (val && 1 == sscanf(val, "%"SCNu64, uval)) { -rc = 0; -} -g_free(val); -return rc; -} - int xenstore_write_be_str(struct XenDevice *xendev, const char *node, const char *val) { return xenstore_write_str(xendev->be, node, val); @@ -212,20 +141,6 @@ int xenstore_read_fe_uint64(struct XenDevice *xendev, const char *node, uint64_t /* - */ -const char *xenbus_strstate(enum xenbus_state state) -{ -static const char *const name[] = { -[ XenbusStateUnknown ] = "Unknown", -[ XenbusStateInitialising ] = "Initialising", -[ XenbusStateInitWait ] = "InitWait", -[ XenbusStateInitialised ] = "Initialised", -[ XenbusStateConnected] = "Connected", -[ XenbusStateClosing ] = "Closing", -[ XenbusStateClosed ] = "Closed", -}; -return (state < ARRAY_SIZE(name)) ? name[state] : "INVALID"; -} - int xen_be_set_state(struct XenDevice *xendev, enum xenbus_state state) { int rc; @@ -833,44 +748,6 @@ int xen_be_send_notify(struct XenDevice *xendev) return xenevtchn_notify(xendev->evtchndev, xendev->local_port); } -/* - * msg_level: - * 0
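The review comments above ask why xenstore_write_int() uses val[12] and xenstore_write_int64() uses val[21]. A minimal sketch of where such sizes can come from, assuming a platform with a 32-bit int: the buffer has to hold the longest decimal rendering of the type plus the terminating NUL. The helper names below are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: bytes needed for the longest decimal rendering of
 * an int, including the terminating NUL. The most negative value has the
 * longest text because of the sign character.
 */
static size_t int_decimal_bufsize(void)
{
    char buf[64];
    int n = snprintf(buf, sizeof(buf), "%d", INT_MIN);

    assert(n > 0 && (size_t)n < sizeof(buf));
    return (size_t)n + 1;
}

/* Hypothetical helper: the same measurement for int64_t. */
static size_t int64_decimal_bufsize(void)
{
    char buf[64];
    int n = snprintf(buf, sizeof(buf), "%" PRId64, (int64_t)INT64_MIN);

    assert(n > 0 && (size_t)n < sizeof(buf));
    return (size_t)n + 1;
}
```

Under that assumption the two helpers return 12 and 21, which matches the array sizes in the moved code. A larger buffer such as XEN_BUFSIZE would also work; the small arrays just avoid reserving more stack than the formatted number can use.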
[Qemu-devel] [Bug 1603779] [NEW] AC97 can allocate ~500MB of host RAM
Public bug reported:

While working with qtest test cases generated via fuzzing with QEMU
2.5.0, I discovered some odd behavior for the AC97 virtual device with
qemu-system-i386. If AC97_MIC_ADC_RATE is set to the value of 1, the
QEMU process allocates over 500MB of additional host RAM. You probably
would not normally notice this on a modern PC, except that I was using a
"ulimit" command to restrict the maximum amount of virtual memory
allowed for the QEMU process, so the process would crash with a SIGTRAP
(signal 5) on the failed memory allocation.

My minimized qtest code to reproduce the issue is:

static void test_crash(void)
{
    uint64_t barsize;

    dev = get_device();

    dev_base[0] = qpci_iomap(dev, 0, &barsize);
    dev_base[1] = qpci_iomap(dev, 1, &barsize);

    qpci_device_enable(dev);

    qpci_io_writew(dev, dev_base[0]+0x32, 0x0001);
}

I ran a "ulimit -sv 65" command and then launched the tests/ac97-test
binary with this crash test case included in it. I can then see the
QEMU process crash on an allocation of 722538464 bytes. I can gradually
increase the ulimit memory limit to ~120 and then no longer see the
issue, hence my estimate of 500 MB of RAM allocated by the device.

** Affects: qemu
     Importance: Undecided
         Status: New

** Tags: ac97

https://bugs.launchpad.net/bugs/1603779
[Qemu-devel] [Bug 1603785] [NEW] trace_usb_port_attach prints junk data
Public bug reported:

Running qemu with tracing (-D ~/qemu_trace -d trace:\*) will result in a
trace file with unprintable characters.

example:
usb_port_attach bus 0, port 1, devspeed <90>l.U, portspeed full+high

The problem is in hw/usb/bus.c usb_mask_to_str. If speedmask doesn't
match any of the defined speeds, nothing is written to *dest and
uninitialized data is printed to the log. This happens with a real usb
device that is forwarded into the machine.

My qemu version is 2.6.0 but it looks like the problem exists in latest
git also.

** Affects: qemu
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1603785
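The failure mode described in this report can be modelled in a few lines. This is a simplified stand-alone sketch, not the code from hw/usb/bus.c: the mask constants, the table and the function name mask_to_str are assumptions made for illustration. It shows one possible fix, which is to always terminate the buffer and to write a fallback string when no known speed bit is set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical speed bits; the real QEMU constants may differ. */
#define SPEED_MASK_FULL  (1u << 1)
#define SPEED_MASK_HIGH  (1u << 2)
#define SPEED_MASK_SUPER (1u << 3)

static const struct {
    unsigned int mask;
    const char *name;
} speeds[] = {
    { SPEED_MASK_FULL,  "full"  },
    { SPEED_MASK_HIGH,  "high"  },
    { SPEED_MASK_SUPER, "super" },
};

/*
 * Render speedmask as "full+high" style text. Unlike the behaviour in
 * the report, dest is always written: it starts out as an empty string
 * and becomes "unknown" when no known bit matches.
 */
static void mask_to_str(char *dest, size_t size, unsigned int speedmask)
{
    size_t pos = 0;
    size_t i;

    assert(dest != NULL);
    if (size == 0) {
        return;
    }
    dest[0] = '\0';

    for (i = 0; i < sizeof(speeds) / sizeof(speeds[0]); i++) {
        if ((speeds[i].mask & speedmask) && pos < size) {
            int n = snprintf(dest + pos, size - pos, "%s%s",
                             pos ? "+" : "", speeds[i].name);
            if (n < 0) {
                break;
            }
            pos += (size_t)n;
        }
    }

    if (pos == 0) {
        snprintf(dest, size, "unknown");
    }
}
```

With this shape, a caller that passes a stack buffer containing leftover bytes and a mask of 0 gets a printable string back instead of whatever was on the stack.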
[Qemu-devel] [PATCH V5 0/7] pxb: fix 64-bit MMIO allocation
v4 -> v5:
 Addressed the pull request issues: (Peter Maydell)
 See: https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00882.html
 - clang warning -> "hw/pci/pci.c:196:23: runtime error: shift exponent -1 is negative":
     The PCIe Root port was not initialized properly, the interrupt pin was left 0.
     This is a long standing issue exposed by the new test. (Patch 1/7)
 - 'make check' fails on 32-bit:
     Fix it by changing the ivshmem mem size from 4G to 1G, since 4G is
     not a valid value on 32-bit archs. (Patch 2/7)
     (4G is truncated to 0 on 32-bit systems)
 - Rebased on mst's pci branch.

 Since all the new changes are not related to the series, I kept
 the existing "Reviewed-by"/"Tested-by" signatures.

v3 -> v4:
 Addressed Igor's comments (thanks for the productive review!)
 - Split pxb test patch (previously patch 3/3) into the test itself
   (patch 1/6) and the blobs (patch 6/6).
 - New patch declaring pxb/pxb-pcie as not hot-pluggable.
     - Note that it does not solve the DSDT issue, but it is a
       prerequisite for the next patch.
 - New patch solving the DSDT issue spotted by Igor.
 - Using V=1 DIFF=diff make check does make it easier to review the
   ACPI changes, thanks.
 - Patches 4 and 5 untouched (previously patches 1/3 and 2/3)

v2 -> v3:
 - split original series "pci: better support for 64-bit MMIO allocation"
   into 2 series:
     - this is the first part dealing with correct 64-bit MMIO ACPI computation
     - the second one will include 64-bit MMIO reservation for PCI hotplug
 - Add pxb/pxb-pcie tests (Igor)
     - See diffs below (*)
 - Re-based on latest master.

v1 -> v2:
 - resolved some styling issues (Laszlo)
 - rebase on latest master (Laszlo)

64-bit BARs allocations fix for devices behind PXBs/PXB-PCIEs.

In build_crs() the calculation and merging of the ranges already happens
in 64-bit, but the entry boundaries are silently truncated to 32-bit in the
call to aml_dword_memory(). Fix it by handling the 64-bit MMIO ranges separately.

Thank you,
Marcel

Marcel Apfelbaum (7):
  hw/pcie-root-port: Fix PCIe root port initialization
  tests/acpi: add pxb/pxb-pcie tests
  hw/pxb: declare pxb devices as not hot-pluggable
  hw/acpi: fix a DSDT table issue when a pxb is present.
  acpi: refactor pxb crs computation
  hw/apci: handle 64-bit MMIO regions correctly
  tests/acpi: Add pxb/pxb-pcie tests blobs

 hw/i386/acpi-build.c                   | 131 -
 hw/pci-bridge/ioh3420.c                |   1 +
 hw/pci-bridge/pci_expander_bridge.c    |   2 +
 tests/acpi-test-data/pc/DSDT.pxb       | Bin 0 -> 6286 bytes
 tests/acpi-test-data/q35/DSDT.pxb_pcie | Bin 0 -> 9098 bytes
 tests/bios-tables-test.c               |  37 ++
 6 files changed, 135 insertions(+), 36 deletions(-)
 create mode 100644 tests/acpi-test-data/pc/DSDT.pxb
 create mode 100644 tests/acpi-test-data/q35/DSDT.pxb_pcie

-- 
2.4.3
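The truncation this series fixes can be illustrated outside QEMU. The sketch below is a stand-alone model with hypothetical helper names, not code from hw/i386/acpi-build.c. It mirrors the check that patch 6/7 uses to decide between the 32-bit and 64-bit range lists: a range stays 32-bit only if both its limit and its length fit in 32 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if [base, limit] can be described by a
 * 32-bit (DWord) memory descriptor, using the same condition as the
 * patch: range_limit <= UINT32_MAX && length <= UINT32_MAX.
 */
static bool fits_dword(uint64_t base, uint64_t limit)
{
    uint64_t length;

    assert(base <= limit);
    length = limit - base + 1;
    return limit <= UINT32_MAX && length <= UINT32_MAX;
}

/* What happens when a 64-bit boundary is stored in a 32-bit field. */
static uint32_t truncate_to_dword(uint64_t value)
{
    return (uint32_t)value;
}
```

For example, a BAR placed at 0x800000000 would be recorded as 0 after truncation, which is why such ranges need a QWord descriptor instead. Note that the full range 0 to 0xffffffff also fails the check, because its length (2^32) no longer fits in 32 bits.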
[Qemu-devel] [PATCH V5 1/7] hw/pcie-root-port: Fix PCIe root port initialization
Specify the root port interrupt pin as part of the init
process for cases when msi/msix are not enabled.

Fixes "hw/pci/pci.c:196:23: runtime error: shift exponent -1 is negative"
warning from clang's sanitizer.

Reported-by: Peter Maydell
Signed-off-by: Marcel Apfelbaum
---
 hw/pci-bridge/ioh3420.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/pci-bridge/ioh3420.c b/hw/pci-bridge/ioh3420.c
index 93c6f0b..d88cae5 100644
--- a/hw/pci-bridge/ioh3420.c
+++ b/hw/pci-bridge/ioh3420.c
@@ -100,6 +100,7 @@ static int ioh3420_initfn(PCIDevice *d)
     int rc;
     Error *err = NULL;
 
+    pci_config_set_interrupt_pin(d->config, 1);
     pci_bridge_initfn(d, TYPE_PCIE_BUS);
     pcie_port_init_reg(d);
 
-- 
2.4.3
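A rough model of why an interrupt pin of 0 produces the "shift exponent -1" report. This is a sketch under the assumption that the zero-based INTx index is derived as "pin register minus one"; the function names are hypothetical and this is not the code at hw/pci/pci.c:196.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * The Interrupt Pin register is one-based: 1 = INTA ... 4 = INTD, and
 * 0 means "no pin". Assumed derivation of the zero-based index.
 */
static int intx_index(uint8_t interrupt_pin)
{
    return (int)interrupt_pin - 1;
}

static bool intx_index_valid(int idx)
{
    return idx >= 0 && idx < 4;
}

/*
 * Set or clear the level bit for a pin in a small state word. Shifting
 * by a negative index is undefined behaviour in C, so this model
 * refuses to shift when the pin was left at 0.
 */
static unsigned int set_irq_bit(unsigned int state, uint8_t pin, int level)
{
    int idx = intx_index(pin);

    assert(level == 0 || level == 1);
    if (!intx_index_valid(idx)) {
        return state;
    }
    state &= ~(1u << idx);
    return state | ((unsigned int)level << idx);
}
```

With the pin left at 0 the index is -1; after the patch sets the pin to 1 (INTA) the index becomes 0 and the shift is well defined.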
[Qemu-devel] [PATCH V5 5/7] acpi: refactor pxb crs computation
Instead of always passing both IO and MEM ranges when computing CRS ranges, define a new CrsRangeSet structure that include them both. This is done before introducing a third type of range, 64-bit MEM, so it will be easier to pass them all around. Reviewed-by: Igor Mammedov Signed-off-by: Marcel Apfelbaum Tested-by: Laszlo Ersek --- hw/i386/acpi-build.c | 81 1 file changed, 50 insertions(+), 31 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 5ed2bbd..d8b3543 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -748,6 +748,23 @@ static void crs_range_free(gpointer data) g_free(entry); } +typedef struct CrsRangeSet { +GPtrArray *io_ranges; +GPtrArray *mem_ranges; + } CrsRangeSet; + +static void crs_range_set_init(CrsRangeSet *range_set) +{ +range_set->io_ranges = g_ptr_array_new_with_free_func(crs_range_free); +range_set->mem_ranges = g_ptr_array_new_with_free_func(crs_range_free); +} + +static void crs_range_set_free(CrsRangeSet *range_set) +{ +g_ptr_array_free(range_set->io_ranges, true); +g_ptr_array_free(range_set->mem_ranges, true); +} + static gint crs_range_compare(gconstpointer a, gconstpointer b) { CrsRangeEntry *entry_a = *(CrsRangeEntry **)a; @@ -832,18 +849,17 @@ static void crs_range_merge(GPtrArray *range) g_ptr_array_free(tmp, true); } -static Aml *build_crs(PCIHostState *host, - GPtrArray *io_ranges, GPtrArray *mem_ranges) +static Aml *build_crs(PCIHostState *host, CrsRangeSet *range_set) { Aml *crs = aml_resource_template(); -GPtrArray *host_io_ranges = g_ptr_array_new_with_free_func(crs_range_free); -GPtrArray *host_mem_ranges = g_ptr_array_new_with_free_func(crs_range_free); +CrsRangeSet temp_range_set; CrsRangeEntry *entry; uint8_t max_bus = pci_bus_num(host->bus); uint8_t type; int devfn; int i; +crs_range_set_init(&temp_range_set); for (devfn = 0; devfn < ARRAY_SIZE(host->bus->devices); devfn++) { uint64_t range_base, range_limit; PCIDevice *dev = host->bus->devices[devfn]; @@ -867,9 +883,11 @@ static Aml 
*build_crs(PCIHostState *host, } if (r->type & PCI_BASE_ADDRESS_SPACE_IO) { -crs_range_insert(host_io_ranges, range_base, range_limit); +crs_range_insert(temp_range_set.io_ranges, + range_base, range_limit); } else { /* "memory" */ -crs_range_insert(host_mem_ranges, range_base, range_limit); +crs_range_insert(temp_range_set.mem_ranges, + range_base, range_limit); } } @@ -888,7 +906,8 @@ static Aml *build_crs(PCIHostState *host, * that do not support multiple root buses */ if (range_base && range_base <= range_limit) { -crs_range_insert(host_io_ranges, range_base, range_limit); +crs_range_insert(temp_range_set.io_ranges, + range_base, range_limit); } range_base = @@ -901,7 +920,8 @@ static Aml *build_crs(PCIHostState *host, * that do not support multiple root buses */ if (range_base && range_base <= range_limit) { -crs_range_insert(host_mem_ranges, range_base, range_limit); +crs_range_insert(temp_range_set.mem_ranges, + range_base, range_limit); } range_base = @@ -914,35 +934,36 @@ static Aml *build_crs(PCIHostState *host, * that do not support multiple root buses */ if (range_base && range_base <= range_limit) { -crs_range_insert(host_mem_ranges, range_base, range_limit); +crs_range_insert(temp_range_set.mem_ranges, + range_base, range_limit); } } } -crs_range_merge(host_io_ranges); -for (i = 0; i < host_io_ranges->len; i++) { -entry = g_ptr_array_index(host_io_ranges, i); +crs_range_merge(temp_range_set.io_ranges); +for (i = 0; i < temp_range_set.io_ranges->len; i++) { +entry = g_ptr_array_index(temp_range_set.io_ranges, i); aml_append(crs, aml_word_io(AML_MIN_FIXED, AML_MAX_FIXED, AML_POS_DECODE, AML_ENTIRE_RANGE, 0, entry->base, entry->limit, 0, entry->limit - entry->base + 1)); -crs_range_insert(io_ranges, entry->base, entry->limit); +crs_range_insert(range_set->io_ranges, entry->base, entry->limit); } -g_ptr_array_free(host_io_ranges, true); -crs_range_merge(host_mem_ranges); -for (i = 0; i < host_mem_ranges->len; i++) { -entry = 
g_ptr_array_index(host_mem_ranges, i
[Qemu-devel] [PATCH V5 2/7] tests/acpi: add pxb/pxb-pcie tests
Add an ivshmem device with 1G shared memory to pxb
in order to check the ACPI code of 64bit MMIO allocation.

Suggested-by: Igor Mammedov
Signed-off-by: Marcel Apfelbaum
Tested-by: Laszlo Ersek
---
 tests/bios-tables-test.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c
index de4019e..b23c6b0 100644
--- a/tests/bios-tables-test.c
+++ b/tests/bios-tables-test.c
@@ -864,6 +864,41 @@ static void test_acpi_piix4_tcg_ipmi(void)
     free_test_data(&data);
 }
 
+static void test_acpi_piix4_tcg_pxb(void)
+{
+    test_data data;
+
+    memset(&data, 0, sizeof(data));
+    data.machine = MACHINE_PC;
+    data.variant = ".pxb";
+    data.required_struct_types = base_required_struct_types;
+    data.required_struct_types_len = ARRAY_SIZE(base_required_struct_types);
+    test_acpi_one("-machine accel=tcg"
+                  " -device pxb,id=pxb,bus_nr=0x80,bus=pci.0"
+                  " -object memory-backend-file,size=1G,mem-path=/tmp/shmem,share,id=mb"
+                  " -device ivshmem-plain,memdev=mb,bus=pxb",
+                  &data);
+    free_test_data(&data);
+}
+
+static void test_acpi_q35_tcg_pxb_pcie(void)
+{
+    test_data data;
+
+    memset(&data, 0, sizeof(data));
+    data.machine = MACHINE_Q35;
+    data.variant = ".pxb_pcie";
+    data.required_struct_types = base_required_struct_types;
+    data.required_struct_types_len = ARRAY_SIZE(ipmi_required_struct_types);
+    test_acpi_one("-machine q35,accel=tcg"
+                  " -device pxb-pcie,id=pxb,bus_nr=0x80,bus=pcie.0"
+                  " -device ioh3420,id=rp,bus=pxb,slot=1"
+                  " -object memory-backend-file,size=1G,mem-path=/tmp/shmem,share,id=mb"
+                  " -device ivshmem-plain,memdev=mb,bus=rp",
+                  &data);
+    free_test_data(&data);
+}
+
 int main(int argc, char *argv[])
 {
     const char *arch = qtest_get_arch();
@@ -884,6 +919,8 @@ int main(int argc, char *argv[])
         qtest_add_func("acpi/q35/tcg/ipmi", test_acpi_q35_tcg_ipmi);
         qtest_add_func("acpi/piix4/tcg/cpuhp", test_acpi_piix4_tcg_cphp);
         qtest_add_func("acpi/q35/tcg/cpuhp", test_acpi_q35_tcg_cphp);
+        qtest_add_func("acpi/piix4/tcg/pxb", test_acpi_piix4_tcg_pxb);
+        qtest_add_func("acpi/q35/tcg/pxb-pcie", test_acpi_q35_tcg_pxb_pcie);
     }
     ret = g_test_run();
     boot_sector_cleanup(disk);
-- 
2.4.3
[Qemu-devel] [PATCH V5 7/7] tests/acpi: Add pxb/pxb-pcie tests blobs
Signed-off-by: Marcel Apfelbaum --- tests/acpi-test-data/pc/DSDT.pxb | Bin 0 -> 6286 bytes tests/acpi-test-data/q35/DSDT.pxb_pcie | Bin 0 -> 9098 bytes 2 files changed, 0 insertions(+), 0 deletions(-) create mode 100644 tests/acpi-test-data/pc/DSDT.pxb create mode 100644 tests/acpi-test-data/q35/DSDT.pxb_pcie diff --git a/tests/acpi-test-data/pc/DSDT.pxb b/tests/acpi-test-data/pc/DSDT.pxb new file mode 100644 index ..8839fcfe4246cdca093c5890c98e64dc5a10f8b8 GIT binary patch literal 6286 zcmb_gZEqXL5uPQF(s4;iN9kxJ`>C{{&-PD9e?opj`WI5P_9v?|cU+35bbtetfT*3FXP=qb zot@pKZt0D`%mT1EcaO)3J{M0JZ7AQjusB-5uyne%Ecn_st~y>p!s_*x^&Mqt&fcicfCGF=8YK z31AbHvlq!5a@X!0HE(9~zOTtBFm%Pt=Cug$U1cFQ1kYtKxAwa?9P?tqCS8}qiWorl|i8XKo?wKIBFf%@&8-|!=DHWcIT6-*%z+tjQEOTY^*V~UfYVGgdQ@k>wJZL4$*x1#)lGAcoW zR?L*Hp`utb>MWrI3KFzpZPiHItXS2SK|O0~rE3xUG)cR=UW=68Cc6tX+&)L$JD}8X z3l$w@nN!-UI(N)^1H9=-+x4LNycmDlIM+QB`eou7JqrtlO<>ne!05>i?SqJbQH__gc zbq3!06SoNun>(((b>>o(yrsQDyN#Lk>lqopvh!RzQYsh zeC4`Uk0YR-Nlo&^H)9a@lWh~er9K|R?@BN6w1qLZsJ%nocBeXyVPY}|8#0LngvyhQ zb7iFt^OwQ^6i2C&)cD<>(a)zN9v9`#n>%t6_&+^5r9rRkzr; zt-uScX7vj7_pKL4tYiMZv-Rwd^{j6_i~bE;DJo&)OY>V%J8j^T;nSpBplt^Udc* z%;!SP=lbULQS*2}ymR#VA@h0Pe161yKE!<9?yN`{y)|lH3e1ZXqEiHyT$J5}I_597 znh55YT*WS0U8Qm*IPf%nYp>{=_0qruAM=xILNiy><@*o3PapmaLB`wus6-}xMD zrkT~{e)*l?e2KwJc*&AjqcN%cD)N#i?XA-!6dV-yHO*)lYMfQL?G4VdU*A4&rE-)MXSlZ{?F-0X37oD`I8HwE_IZYA zkGap5rCxbImA{?`Q}O72E`%f3!GRiaZN+LN!vZ~Uz4kea5qozJh|A-V(O!by8|}m? 
zIM6%$D?GU|t#FUw2Qw5?OWvUb!z45cjZ}imAk*CowF!qZoRR|6Ne6>UP-M`$drv)% zxx8eSoKhx-tL0y~D6eUWmD+{9)3{T(V3f_E;!Kz|GgHXSqa`f(6ULp-?r&_STl==9 zI^$RSUc77QwL0EJ@UXF&;UB|=F#}D&o(}WHK25sa-RHl3>e2|$HE<8~X1WF?h=Hq$ zX~jr*cDSc@4O(lzmsXaf4VTuENOqWd!JwT#MC<*%1wLV-Ym_XfzlD6Xs}^Prto*oV z0v7~UHtO@2Q+b~G_SXP-qGBsyCSn$y@U2hI1ZCDD5xoBiIy zM&V{3dkmXp&!`v6G>95iBEm+sOarJH#Rmg!8{QVjCq4d(#Tsv`3voOYhwhoR@prsu;>D8U-?LqFvghBa+Wl7GAjG--G;0TFzpLL0X5o1RumMGcNlx~B%@i=00f zK!88{@Fn&-`8vsD>}SEAdxbo&lLL9ePsM)^Cn(3nq&p^kA~5U}$mlK+)92A!nt!6A z?{52(VNDMM`C$+UJ@qsRb0v7{jg&v?AG(X+)4fsPi6A=vF7ngMpI$PFNUlj{*{lZ{ z_jV1%2|3_iv!4rIzze40Onz_+tFM)JrpF|P>w@je&a^Dy>k;3DD1Ag4idgy0-jv&& zi;KV076S*amyA_BKyrj0@mPr|F}gh|+MoQy#UCVwvfn=k{E9vlBwq44%+iIcY|{G+ zIG?q*PLU9$`jw_?hF?``Rk4NJI&E*ABtQWcxL=X6USOT~%R0U8?5#t=c)E+x4#zJ` zORHm4^drjRXMP$LQHyh%ure8Z2-B}9cIUj6?F*PV5l}J2td0i|HFBbnr-t2`va(l$ zfryP_#|Ku923xB&jyGB)#D08PPy2@`y&lOTseLGoC1^}QV`0#|V8y#^z&aKR6|Ca| zIvxhSELicEf`C;Hg$mZVfX2h19|=}2pga^RSTW}V2NGe>dBK_#&}1l7uucf*L>Tmn zV4W1u$xx_Z#kXqU>r@!@s$f+FR0)L&)|7yz!k`}u)@cEq4uuNV83COMgBAoU=64X5 z8VVJxhXnLc7_=x@4-4qwP^e%%BA`dYpqgNv70}sGs9-%Rphv@?x?nvfpvOX?g7vt7 z9uI?-1nUU_JrN2OtS1HZWEgZ&u)Zgt?}b7I>nQ;}br4itKBnKOH6p#_;}-r*v`q!< zt}moXxY`#MP{1M-Xxb!@X5w)xfzO0(3oA)qijXwWNho1G=<8&X20Ai^ySlFx&!h(ut*0zOvOvrd@p}eQZvuoSJ-6Hh(ltFnHTj26~EHAJS9gm>&aun&ct<0?N431AUuh)T^iH W_K3e%(+`idYIumPvVKG7(faeuZXaVrb*+3l3XX~M{07HB3W@Iilp;V8XzSrt?abO zLa~83Mi5B>64ws}64pWcRNByU{zCH;^40)->T6$%B7TZ`?#vE7LrOr*hvnSabMBq9 zJ3D8&OTX#29$aM1x>a2FD$QK+t-9x<&ti;Go4%cD;yP>Z`NeX_O2-mbdp|AO*c5H| zCBJyBZ2fho`(?L#`?IdK8M3X%&cm(E@RR%9Ek>X_TOsGva9-#%i=Fmf-D|e2H2aEY zM7wCEGhb0!^cSzzy=u{R)-wWnP|hIE+nq|)&lJAlw%ze$D{glReuLSaYgON}o7d`f zW_zthjk$iRv)=R)58h>-|K?(UWxThOo#9S9_}B7x3+JwW`R47y*T4Gb?;gC%696pX z+s5~DNCD-c#xkY4M;m^gwteS|-c9Qx>{%4(#s1tEJ%y9q4($vHO#P2CvX@FM9=_`Y zw$xLyEA>)K>9L>! 
z^TRH8y8X=Oy)Nr@(?>^#tUvjAZ8W8p#P}YMwf9d`n|bR&AZw=asv=cuA6#m*JQS7ZHI4H~Bc127Wh-PM zb7B{mRkn87E{i4avi8BNt3`hRkH_7VI35Fug4ktUW`VWMw()%-2BqzT=Y#cY1LD-o zFwy2?7nno9*P&$qOfYv|9&s9fx1YpOK4gy)gMtp9C|e)IKZx7*g*{QIl)y1rREowd zIWtunIv^*dc@lpTXZ8h0q*)@X^u#=|Co$x)k;krAy%pwWI{3E29g3Vk zzirVeaJ>JBORC;z)u&hA-7+~#syQ2faagbDxh)61XP8@ zR8=tVC067~2&RH2BvTC@p{{ex&^acU3OXj4YVZhkok>GyQZN-XY3f9%>l`<9jvG41 zO`QmJowlLVHgwviPK3HntPt9fP8m8=rcQ*qPHyPrhE8tkM5yab8#>d5&a|l$p{^4v zm3BTehR%$s6QQni!q7Qk=$tThBGh$G8agKpos*_cgu2ctL+6yCbIR0-P}k`gIvqo& zW9mex>&zNDvxd&BsS}~DbK1~3ZRngfbt2St&KNpp44pHkPK3Hn*U;%2I$cvILS5&q zp>x*IIcw@fsOy|Fbj}$%=S-alb)BaSou>?)r%as)b)EBu&Ur)Uyr~nRuJg2D7NV8# zv|tva#qP9ZdQU@Q`Vsbw!8~Iy&zMYvI`gc-JZmt|noNW`^PIsvXE4v1OoTe~yumzg zFwdJzggWy%!Bj5uoM5Uo{yE813pYYyDkoVmYAzTx7tERnWlb9fTF?x%+^h}^6m1la z3={#%${>zF6@`IHoUkMd_d5cp95W`NDkP?o1_PB?WuO|A3{*k|lMECgMxzW=Vxx#^ zDz7C2mC%8KBE--I76vM@QAA92aLGU=R4~av5h|T9P>B^x7^ns%1C>z0Bm+gLbizO- zRxn|p8k7uFLIsly6rs`y1C?08gn?>MGEfN>Ofpb}N+%3dVg(ZhszJ#>B~&oUKoKgP zFi?pVOc2Z#2?Lc_!GwWoP%=;n6-+Wv zgi0q2RAL1a2C6~HKqXW#$v_b*oiI>|6-*eY1|DxEM;i4{y3s0Jki zl~BPX14XEG!ayZfFkzq?lnhit1(OUEq0$Khl~}=qfof1PPze=GGEjs{Ck#|#1rr9U zLCHWRR4~av5h|T9P>B^x7^ns%1C>z0Bm+gLbizO-Rxn|p8k7uFLIsly6rs`y1C?08 zgn?>MGEfN>Ofpb}N+%3dVg(ZhszJ#>B~&oUKoKgPFi=FAfg;ik6rpaQ2sH!Mm@rU{ zNd~Gh$v`zG3{+#nKs6>AsKz7%)tE3)jR^zQm}Hl%# zMWoIy3=|R0m^heZpa^jqR7gy9Y+;~?)UkzuB2vee3>2X{wq&3PedeC&z(QF+(-S{d z59tT#qa=M5+WS9$B~HIm=~)3ijj-yCjRG8ZqevT%Hond9Y-FQE?G@UTX;ZP?nd$}} zFZEOR_}&g4f}{?4+Pz%e@ER!q8u{!bPyZ>&QuJsGzmy+PA0NZE8s6LCQ;b>|L<3Jo zYU8*^D5&4!89Yv*hcz~a-OSGMnLkakR2=TcuzoYF#K`7O$>$O@1dOKOW;=eh$v&bs z-xEDk^rV9w(tt#Ks?%&Vtp_{N7^pn;N7ai%)$v&22Ujm@)r+Ef5#!UV7v1Bl7uyE` zY4?Vz^Vl<#_q6h!DDMrG_f9D9MdkhB@_1bKgVp<5d0&+GhsygWl=q|ZrQ!0
[Qemu-devel] [PATCH V5 4/7] hw/acpi: fix a DSDT table issue when a pxb is present.
PXBs do not support hotplug so they don't have a PCNT function.
Since the PXB's PCI root-bus is a child bus of bus 0, the
build_dsdt code will add a call to the corresponding PCNT function.

Fix this by skipping the PCNT call for the above case.
While at it skip also PCIe child buses.

Reported-by: Igor Mammedov
Signed-off-by: Marcel Apfelbaum
Tested-by: Laszlo Ersek
---
 hw/i386/acpi-build.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index fbba461..5ed2bbd 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -597,6 +597,10 @@ static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus,
         QLIST_FOREACH(sec, &bus->child, sibling) {
             int32_t devfn = sec->parent_dev->devfn;
 
+            if (pci_bus_is_root(sec) || pci_bus_is_express(sec)) {
+                continue;
+            }
+
             aml_append(method, aml_name("^S%.02X.PCNT", devfn));
         }
     }
-- 
2.4.3
[Qemu-devel] [PATCH V5 3/7] hw/pxb: declare pxb devices as not hot-pluggable
Prevent future issues when hotplug will work for devices
attached to pxbs.

Suggested-by: Igor Mammedov
Signed-off-by: Marcel Apfelbaum
Tested-by: Laszlo Ersek
---
 hw/pci-bridge/pci_expander_bridge.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index ab86121..b4f8ca2 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -310,6 +310,7 @@ static void pxb_dev_class_init(ObjectClass *klass, void *data)
 
     dc->desc = "PCI Expander Bridge";
     dc->props = pxb_dev_properties;
+    dc->hotpluggable = false;
     set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 }
 
@@ -343,6 +344,7 @@ static void pxb_pcie_dev_class_init(ObjectClass *klass, void *data)
 
     dc->desc = "PCI Express Expander Bridge";
     dc->props = pxb_dev_properties;
+    dc->hotpluggable = false;
     set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 }
 
-- 
2.4.3
[Qemu-devel] [PATCH V5 6/7] hw/apci: handle 64-bit MMIO regions correctly
In build_crs(), the calculation and merging of the ranges already happens in 64-bit, but the entry boundaries are silently truncated to 32-bit in the call to aml_dword_memory(). Fix it by handling the 64-bit MMIO ranges separately. This fixes 64-bit BARs behind PXBs. Reported-by: Laszlo Ersek Reviewed-by: Igor Mammedov Tested-by: Laszlo Ersek Signed-off-by: Marcel Apfelbaum --- hw/i386/acpi-build.c | 54 +++- 1 file changed, 45 insertions(+), 9 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index d8b3543..b1adf04 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -751,18 +751,22 @@ static void crs_range_free(gpointer data) typedef struct CrsRangeSet { GPtrArray *io_ranges; GPtrArray *mem_ranges; +GPtrArray *mem_64bit_ranges; } CrsRangeSet; static void crs_range_set_init(CrsRangeSet *range_set) { range_set->io_ranges = g_ptr_array_new_with_free_func(crs_range_free); range_set->mem_ranges = g_ptr_array_new_with_free_func(crs_range_free); +range_set->mem_64bit_ranges = +g_ptr_array_new_with_free_func(crs_range_free); } static void crs_range_set_free(CrsRangeSet *range_set) { g_ptr_array_free(range_set->io_ranges, true); g_ptr_array_free(range_set->mem_ranges, true); +g_ptr_array_free(range_set->mem_64bit_ranges, true); } static gint crs_range_compare(gconstpointer a, gconstpointer b) @@ -920,8 +924,14 @@ static Aml *build_crs(PCIHostState *host, CrsRangeSet *range_set) * that do not support multiple root buses */ if (range_base && range_base <= range_limit) { -crs_range_insert(temp_range_set.mem_ranges, - range_base, range_limit); +uint64_t length = range_limit - range_base + 1; +if (range_limit <= UINT32_MAX && length <= UINT32_MAX) { +crs_range_insert(temp_range_set.mem_ranges, + range_base, range_limit); +} else { +crs_range_insert(temp_range_set.mem_64bit_ranges, + range_base, range_limit); +} } range_base = @@ -934,8 +944,14 @@ static Aml *build_crs(PCIHostState *host, CrsRangeSet *range_set) * that do not support multiple root 
buses */ if (range_base && range_base <= range_limit) { -crs_range_insert(temp_range_set.mem_ranges, - range_base, range_limit); +uint64_t length = range_limit - range_base + 1; +if (range_limit <= UINT32_MAX && length <= UINT32_MAX) { +crs_range_insert(temp_range_set.mem_ranges, + range_base, range_limit); +} else { +crs_range_insert(temp_range_set.mem_64bit_ranges, + range_base, range_limit); +} } } } @@ -963,6 +979,19 @@ static Aml *build_crs(PCIHostState *host, CrsRangeSet *range_set) crs_range_insert(range_set->mem_ranges, entry->base, entry->limit); } +crs_range_merge(temp_range_set.mem_64bit_ranges); +for (i = 0; i < temp_range_set.mem_64bit_ranges->len; i++) { +entry = g_ptr_array_index(temp_range_set.mem_64bit_ranges, i); +aml_append(crs, + aml_qword_memory(AML_POS_DECODE, AML_MIN_FIXED, +AML_MAX_FIXED, AML_NON_CACHEABLE, +AML_READ_WRITE, +0, entry->base, entry->limit, 0, +entry->limit - entry->base + 1)); +crs_range_insert(range_set->mem_64bit_ranges, + entry->base, entry->limit); +} + crs_range_set_free(&temp_range_set); aml_append(crs, @@ -2085,11 +2114,18 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, } if (!range_is_empty(pci_hole64)) { -aml_append(crs, -aml_qword_memory(AML_POS_DECODE, AML_MIN_FIXED, AML_MAX_FIXED, - AML_CACHEABLE, AML_READ_WRITE, - 0, range_lob(pci_hole64), range_upb(pci_hole64), 0, - range_upb(pci_hole64) + 1 - range_lob(pci_hole64))); +crs_replace_with_free_ranges(crs_range_set.mem_64bit_ranges, + range_lob(pci_hole64), + range_upb(pci_hole64)); +for (i = 0; i < crs_range_set.mem_64bit_ranges->len; i++) { +entry = g_ptr_array_index(crs_range_set.mem_64bit_ranges, i); +aml_append(crs, + aml_qword_memory(AML_POS_DECODE, AML_MIN_FIXED, +AML_MAX_FIXED, +
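The split between the 32-bit and 64-bit range lists in the hunks above comes down to one predicate. A minimal sketch of it follows, assuming a standalone helper; the name crs_range_fits_32bit is hypothetical and does not appear in the patch, which open-codes the same test inside build_crs().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): returns true when the inclusive
 * range [base, limit] can be described by a 32-bit DWord memory descriptor
 * without losing bits, i.e. both the upper boundary and the length fit in
 * 32 bits. Ranges that fail this test would go to the 64-bit list.
 */
static bool crs_range_fits_32bit(uint64_t base, uint64_t limit)
{
    uint64_t length = limit - base + 1;

    return limit <= UINT32_MAX && length <= UINT32_MAX;
}
```

A window such as 0xE0000000..0xEFFFFFFF passes the test, while a BAR window placed above 4 GiB does not and would need a QWord descriptor.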
Re: [Qemu-devel] [Xen-devel] [PATCH 01/19] xen: Create a new file xen_pvdev.c
On Jul 17, 2016 10:41, "Quan Xu" wrote: > > > [Quan:]: comment starts with [Quan:] > Thanks, Quan for your comments. The first patches from this series just move some code from xen_backend to xen_pvdev file. I would not group the reorg from xen_backend with refactoring in the same patch. Eventually this can be done in another patch later. > > > The purpose of the new file is to store generic functions shared by frontend > and backends such as xenstore operations, xendevs. > > Signed-off-by: Quan Xu > Signed-off-by: Emil Condrea > --- > hw/xen/Makefile.objs | 2 +- > hw/xen/xen_backend.c | 125 +--- > hw/xen/xen_pvdev.c | 149 +++ > include/hw/xen/xen_backend.h | 63 +- > include/hw/xen/xen_pvdev.h | 71 + > 5 files changed, 223 insertions(+), 187 deletions(-) > create mode 100644 hw/xen/xen_pvdev.c > create mode 100644 include/hw/xen/xen_pvdev.h > > diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs > index d367094..591cdc2 100644 > --- a/hw/xen/Makefile.objs > +++ b/hw/xen/Makefile.objs > @@ -1,5 +1,5 @@ > # xen backend driver support > -common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o > +common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o xen_pvdev.o > > obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o > obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o > xen_pt_graphics.o xen_pt_msi.o > diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c > index bab79b1..a251a4a 100644 > --- a/hw/xen/xen_backend.c > +++ b/hw/xen/xen_backend.c > @@ -30,6 +30,7 @@ > #include "sysemu/char.h" > #include "qemu/log.h" > #include "hw/xen/xen_backend.h" > +#include "hw/xen/xen_pvdev.h" > > #include > > @@ -56,8 +57,6 @@ static QTAILQ_HEAD(xs_dirs_head, xs_dirs) xs_cleanup = > static QTAILQ_HEAD(XenDeviceHead, XenDevice) xendevs = > QTAILQ_HEAD_INITIALIZER(xendevs); > static int debug = 0; > > -/* - */ > - > static void xenstore_cleanup_dir(char *dir) > { > struct xs_dirs *d; > @@ -76,34 +75,6 @@ void 
xen_config_cleanup(void) > } > } > > -int xenstore_write_str(const char *base, const char *node, const char *val) > -{ > -char abspath[XEN_BUFSIZE]; > - > -snprintf(abspath, sizeof(abspath), "%s/%s", base, node); > -if (!xs_write(xenstore, 0, abspath, val, strlen(val))) { > -return -1; > -} > -return 0; > -} > - > -char *xenstore_read_str(const char *base, const char *node) > -{ > -char abspath[XEN_BUFSIZE]; > -unsigned int len; > -char *str, *ret = NULL; > - > -snprintf(abspath, sizeof(abspath), "%s/%s", base, node); > -str = xs_read(xenstore, 0, abspath, &len); > -if (str != NULL) { > -/* move to qemu-allocated memory to make sure > - * callers can savely g_free() stuff. */ > -ret = g_strdup(str); > -free(str); > -} > -return ret; > -} > - > int xenstore_mkdir(char *path, int p) > { > struct xs_permissions perms[2] = { > @@ -128,48 +99,6 @@ int xenstore_mkdir(char *path, int p) > return 0; > } > > -int xenstore_write_int(const char *base, const char *node, int ival) > -{ > -char val[12]; > - > > [Quan:]: why 12 ? what about XEN_BUFSIZE? > > -snprintf(val, sizeof(val), "%d", ival); > -return xenstore_write_str(base, node, val); > -} > - > -int xenstore_write_int64(const char *base, const char *node, int64_t ival) > -{ > -char val[21]; > - > > [Quan:]: why 21 ? what about XEN_BUFSIZE? > > > -snprintf(val, sizeof(val), "%"PRId64, ival); > -return xenstore_write_str(base, node, val); > -} > - > -int xenstore_read_int(const char *base, const char *node, int *ival) > -{ > -char *val; > -int rc = -1; > - > -val = xenstore_read_str(base, node); > > [Quan:]: IMO, it is better to initialize val when declares. 
the same comment for the other 'val' > > -if (val && 1 == sscanf(val, "%d", ival)) { > -rc = 0; > -} > -g_free(val); > -return rc; > -} > - > -int xenstore_read_uint64(const char *base, const char *node, uint64_t *uval) > -{ > -char *val; > -int rc = -1; > - > -val = xenstore_read_str(base, node); > -if (val && 1 == sscanf(val, "%"SCNu64, uval)) { > -rc = 0; > -} > -g_free(val); > -return rc; > -} > - > int xenstore_write_be_str(struct XenDevice *xendev, const char *node, const > char *val) > { > return xenstore_write_str(xendev->be, node, val); > @@ -212,20 +141,6 @@ int xenstore_read_fe_uint64(struct XenDevice *xendev, > const char *node, uint64_t > > /* - */ > > -const char *xenbus_strstate(enum xenbus_state state) > -{ > -static const char *const name[] = { > -[ XenbusStateUnknown ] = "Unknown", > -[ XenbusStateInitialising ] = "Initialising", > -
Re: [Qemu-devel] [PATCH] vfio/pci: Hide ARI capability
On 07/15/2016 08:30 PM, Alex Williamson wrote:
> QEMU supports ARI on downstream ports and assigned devices may support
> ARI in their extended capabilities. The endpoint ARI capability
> specifies the next function, such that the OS doesn't need to walk
> each possible function, however this next function is relative to the
> host, not the guest. This leads to device discovery issues when we
> combine separate functions into virtual multi-function packages in a
> guest. For example, SR-IOV VFs are not enumerated by simply probing
> the function address space, therefore the ARI next-function field is
> zero. When we combine multiple VFs together as a multi-function device
> in the guest, the guest OS identifies ARI is enabled, relies on this
> next-function field, and stops looking for additional function after
> the first is found.

Hi Alex,

> Long term we should expose the ARI capability to the guest to enable
> configurations with more than 8 functions per slot, but this requires
> additional QEMU PCI infrastructure to manage the next-function field
> for multiple, otherwise independent devices.

The ARI implementation is on my "to-do" list.

> In the short term, hiding this capability allows equivalent
> functionality to what we currently have on non-express chipsets.

I agree.

Reviewed-by: Marcel Apfelbaum

Thanks,
Marcel

> Signed-off-by: Alex Williamson
> ---
>  hw/vfio/pci.c |    1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 44783c5..c8436a1 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1828,6 +1828,7 @@ static int vfio_add_ext_cap(VFIOPCIDevice *vdev)
>
>      switch (cap_id) {
>      case PCI_EXT_CAP_ID_SRIOV: /* Read-only VF BARs confuse OVMF */
> +    case PCI_EXT_CAP_ID_ARI: /* XXX Needs next function virtualization */
>          trace_vfio_add_ext_cap_dropped(vdev->vbasedev.name, cap_id, next);
>          break;
>      default:
Re: [Qemu-devel] [PATCH v2 1/2] tests: Resort check-qtest entries in Makefile.include
On Fri, Jul 15, 2016 at 05:39:38PM +0200, Thomas Huth wrote: > The rather random list of check-qtest-xxx entries caused some > confusion in the past, where to use "=" and where to use "+=" > (see commits 0ccac16f59462b8e2b9afbc1 and 1f5c1cfbaec0792cd2e5da > for example). > Sorting the check-qtest-xxx entries by architecure instead and > using some empty lines inbetween should help to ease this > situation a little bit, so that it is hopefully now obvious > that new tests should be added with "+=" instead of "=". > While we are at it, this patch also comments out two of the > "gcov-files-..." lines since the corresponding m48t59-test is > disabled for sparc and sparc64, too. > > Signed-off-by: Thomas Huth Reviewed-by: David Gibson > --- > tests/Makefile.include | 38 +- > 1 file changed, 25 insertions(+), 13 deletions(-) > > diff --git a/tests/Makefile.include b/tests/Makefile.include > index 2010b11..3d76cf4 100644 > --- a/tests/Makefile.include > +++ b/tests/Makefile.include > @@ -240,33 +240,45 @@ check-qtest-i386-y += tests/postcopy-test$(EXESUF) > check-qtest-x86_64-y += $(check-qtest-i386-y) > gcov-files-i386-y += i386-softmmu/hw/timer/mc146818rtc.c > gcov-files-x86_64-y = $(subst > i386-softmmu/,x86_64-softmmu/,$(gcov-files-i386-y)) > + > check-qtest-mips-y = tests/endianness-test$(EXESUF) > + > check-qtest-mips64-y = tests/endianness-test$(EXESUF) > + > check-qtest-mips64el-y = tests/endianness-test$(EXESUF) > + > check-qtest-ppc-y = tests/endianness-test$(EXESUF) > -check-qtest-ppc64-y = tests/endianness-test$(EXESUF) > +check-qtest-ppc-y += tests/boot-order-test$(EXESUF) > +check-qtest-ppc-y += tests/prom-env-test$(EXESUF) > + > +check-qtest-ppc64-y = tests/spapr-phb-test$(EXESUF) > +gcov-files-ppc64-y = ppc64-softmmu/hw/ppc/spapr_pci.c > +check-qtest-ppc64-y += tests/endianness-test$(EXESUF) > +check-qtest-ppc64-y += tests/boot-order-test$(EXESUF) > +check-qtest-ppc64-y += tests/prom-env-test$(EXESUF) > + > check-qtest-sh4-y = 
tests/endianness-test$(EXESUF) > + > check-qtest-sh4eb-y = tests/endianness-test$(EXESUF) > + > +check-qtest-sparc-y = tests/prom-env-test$(EXESUF) > +#check-qtest-sparc-y += tests/m48t59-test$(EXESUF) > +#gcov-files-sparc-y = hw/timer/m48t59.c > + > check-qtest-sparc64-y = tests/endianness-test$(EXESUF) > -#check-qtest-sparc-y = tests/m48t59-test$(EXESUF) > #check-qtest-sparc64-y += tests/m48t59-test$(EXESUF) > -gcov-files-sparc-y += hw/timer/m48t59.c > -gcov-files-sparc64-y += hw/timer/m48t59.c > +#gcov-files-sparc64-y += hw/timer/m48t59.c > +#Disabled for now, triggers a TCG bug on 32-bit hosts > +#check-qtest-sparc64-y += tests/prom-env-test$(EXESUF) > + > check-qtest-arm-y = tests/tmp105-test$(EXESUF) > check-qtest-arm-y += tests/ds1338-test$(EXESUF) > gcov-files-arm-y += hw/misc/tmp105.c > check-qtest-arm-y += tests/virtio-blk-test$(EXESUF) > gcov-files-arm-y += arm-softmmu/hw/block/virtio-blk.c > -check-qtest-ppc-y += tests/boot-order-test$(EXESUF) > -check-qtest-ppc64-y += tests/boot-order-test$(EXESUF) > -check-qtest-ppc64-y += tests/spapr-phb-test$(EXESUF) > -gcov-files-ppc64-y += ppc64-softmmu/hw/ppc/spapr_pci.c > -check-qtest-ppc-y += tests/prom-env-test$(EXESUF) > -check-qtest-ppc64-y += tests/prom-env-test$(EXESUF) > -check-qtest-sparc-y += tests/prom-env-test$(EXESUF) > -#Disabled for now, triggers a TCG bug on 32-bit hosts > -#check-qtest-sparc64-y += tests/prom-env-test$(EXESUF) > + > check-qtest-microblazeel-y = $(check-qtest-microblaze-y) > + > check-qtest-xtensaeb-y = $(check-qtest-xtensa-y) > > check-qtest-generic-y += tests/qom-test$(EXESUF) -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH v2 2/2] tests: Check serial output of firmware boot of some machines
On Fri, Jul 15, 2016 at 05:39:39PM +0200, Thomas Huth wrote: > Some of the machines that we have got a firmware image for write > some output to the serial console while booting up. We can use > this output to make sure that the machine is basically working, > so this adds a test that checks the output of these machines > for some well-known "magic" strings. > > Signed-off-by: Thomas Huth Reviewed-by: David Gibson > --- > tests/Makefile.include | 8 > tests/boot-serial-test.c | 111 > +++ > 2 files changed, 119 insertions(+) > create mode 100644 tests/boot-serial-test.c > > diff --git a/tests/Makefile.include b/tests/Makefile.include > index 3d76cf4..3986093 100644 > --- a/tests/Makefile.include > +++ b/tests/Makefile.include > @@ -194,6 +194,7 @@ check-qtest-i386-y += tests/hd-geo-test$(EXESUF) > gcov-files-i386-y += hw/block/hd-geometry.c > check-qtest-i386-y += tests/boot-order-test$(EXESUF) > check-qtest-i386-y += tests/bios-tables-test$(EXESUF) > +check-qtest-i386-y += tests/boot-serial-test$(EXESUF) > check-qtest-i386-y += tests/pxe-test$(EXESUF) > check-qtest-i386-y += tests/rtc-test$(EXESUF) > check-qtest-i386-y += tests/ipmi-kcs-test$(EXESUF) > @@ -241,6 +242,8 @@ check-qtest-x86_64-y += $(check-qtest-i386-y) > gcov-files-i386-y += i386-softmmu/hw/timer/mc146818rtc.c > gcov-files-x86_64-y = $(subst > i386-softmmu/,x86_64-softmmu/,$(gcov-files-i386-y)) > > +check-qtest-alpha-y = tests/boot-serial-test$(EXESUF) > + > check-qtest-mips-y = tests/endianness-test$(EXESUF) > > check-qtest-mips64-y = tests/endianness-test$(EXESUF) > @@ -250,12 +253,14 @@ check-qtest-mips64el-y = tests/endianness-test$(EXESUF) > check-qtest-ppc-y = tests/endianness-test$(EXESUF) > check-qtest-ppc-y += tests/boot-order-test$(EXESUF) > check-qtest-ppc-y += tests/prom-env-test$(EXESUF) > +check-qtest-ppc-y += tests/boot-serial-test$(EXESUF) > > check-qtest-ppc64-y = tests/spapr-phb-test$(EXESUF) > gcov-files-ppc64-y = ppc64-softmmu/hw/ppc/spapr_pci.c > check-qtest-ppc64-y += 
tests/endianness-test$(EXESUF) > check-qtest-ppc64-y += tests/boot-order-test$(EXESUF) > check-qtest-ppc64-y += tests/prom-env-test$(EXESUF) > +check-qtest-ppc64-y += tests/boot-serial-test$(EXESUF) > > check-qtest-sh4-y = tests/endianness-test$(EXESUF) > > @@ -281,6 +286,8 @@ check-qtest-microblazeel-y = $(check-qtest-microblaze-y) > > check-qtest-xtensaeb-y = $(check-qtest-xtensa-y) > > +check-qtest-s390x-y = tests/boot-serial-test$(EXESUF) > + > check-qtest-generic-y += tests/qom-test$(EXESUF) > > qapi-schema += alternate-any.json > @@ -579,6 +586,7 @@ tests/ipmi-kcs-test$(EXESUF): tests/ipmi-kcs-test.o > tests/ipmi-bt-test$(EXESUF): tests/ipmi-bt-test.o > tests/hd-geo-test$(EXESUF): tests/hd-geo-test.o > tests/boot-order-test$(EXESUF): tests/boot-order-test.o $(libqos-obj-y) > +tests/boot-serial-test$(EXESUF): tests/boot-serial-test.o $(libqos-obj-y) > tests/bios-tables-test$(EXESUF): tests/bios-tables-test.o \ > tests/boot-sector.o $(libqos-obj-y) > tests/pxe-test$(EXESUF): tests/pxe-test.o tests/boot-sector.o $(libqos-obj-y) > diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c > new file mode 100644 > index 000..fd60337 > --- /dev/null > +++ b/tests/boot-serial-test.c > @@ -0,0 +1,111 @@ > +/* > + * Test serial output of some machines. > + * > + * Copyright 2016 Thomas Huth, Red Hat Inc. > + * > + * This work is licensed under the terms of the GNU GPL, version 2 > + * or later. See the COPYING file in the top-level directory. > + * > + * This test is used to check that the serial output of the firmware > + * (that we provide for some machines) contains an expected string. > + * Thus we check that the firmware still boots at least to a certain > + * point and so we know that the machine is not completely broken. 
> + */ > + > +#include "qemu/osdep.h" > +#include "libqtest.h" > + > +typedef struct testdef { > +const char *arch; /* Target architecture */ > +const char *machine;/* Name of the machine */ > +const char *extra; /* Additional parameters */ > +const char *expect; /* Expected string in the serial output */ > +} testdef_t; > + > +static testdef_t tests[] = { > +{ "alpha", "clipper", "", "PCI:" }, > +{ "ppc", "ppce500", "", "U-Boot" }, > +{ "ppc", "prep", "", "Open Hack'Ware BIOS" }, > +{ "ppc64", "ppce500", "", "U-Boot" }, > +{ "ppc64", "prep", "", "Open Hack'Ware BIOS" }, > +{ "ppc64", "pseries", "", "Open Firmware" }, > +{ "i386", "isapc", "-device sga", "SGABIOS" }, > +{ "i386", "pc", "-device sga", "SGABIOS" }, > +{ "i386", "q35", "-device sga", "SGABIOS" }, > +{ "x86_64", "isapc", "-device sga", "SGABIOS" }, > +{ "x86_64", "pc", "-device sga", "SGABIOS" }, > +{ "x86_64", "q35", "-device sga", "SGABIOS" }, > +{ "s390x", "s390-ccw-virtio", > + "-nodefaults -device sclpconsole,chardev=serial0", "virtio device" }, > +{ NU
Re: [Qemu-devel] [PATCH] target-ppc: fix left shift overflow in hpte_page_shift
On Fri, Jul 15, 2016 at 05:22:10PM +0200, Paolo Bonzini wrote:
> ps->pte_enc is a 32-bit value, which is shifted left and then compared
> to a 64-bit value. It needs a cast before the shift.
>
> Reported by Coverity.
>
> Signed-off-by: Paolo Bonzini

Applied to ppc-for-2.7, thanks.

> ---
>  target-ppc/mmu-hash64.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
> index 82c2186..8f7e5b4 100644
> --- a/target-ppc/mmu-hash64.c
> +++ b/target-ppc/mmu-hash64.c
> @@ -479,7 +479,7 @@ static unsigned hpte_page_shift(const struct ppc_one_seg_page_size *sps,
>
>          mask = ((1ULL << ps->page_shift) - 1) & HPTE64_R_RPN;
>
> -        if ((pte1 & mask) == (ps->pte_enc << HPTE64_R_RPN_SHIFT)) {
> +        if ((pte1 & mask) == ((uint64_t)ps->pte_enc << HPTE64_R_RPN_SHIFT)) {
>              return ps->page_shift;
>          }
>      }

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
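For readers who want to see the overflow in isolation, here is a small sketch of the two expressions from the patch. It assumes a shift count of 12 as a stand-in for HPTE64_R_RPN_SHIFT, and the function names are made up for illustration; only the cast-before-shift pattern is taken from the patch.

```c
#include <stdint.h>

/* Assumed value for illustration; stands in for HPTE64_R_RPN_SHIFT. */
#define MODEL_RPN_SHIFT 12

/* Old form: the shift is evaluated in 32 bits, so high bits are lost
 * before the result is widened for the 64-bit comparison. */
static uint64_t enc_shifted_32bit(uint32_t pte_enc)
{
    return pte_enc << MODEL_RPN_SHIFT;
}

/* Fixed form: widen first, then shift, so no bits are lost. */
static uint64_t enc_shifted_64bit(uint32_t pte_enc)
{
    return (uint64_t)pte_enc << MODEL_RPN_SHIFT;
}
```

With pte_enc = 0x00100001, the first form yields 0x1000 and the second yields 0x100001000, so any encoding with bits at or above position 20 would compare incorrectly in the old code under this assumed shift.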
Re: [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation
On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote: > On Thu, 14 Jul 2016 21:59:45 +1000 > David Gibson wrote: > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell > > > wrote: > > > > On 14 July 2016 at 08:57, David Gibson > > > > wrote: > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done > > > >> differently > > > >> than for full system targets. This method turns out to be broken, > > > >> since > > > >> it can fairly easily result in duplicate cpu_index values for > > > >> simultaneously active cpus (i.e. threads in the emulated process). > > > >> > > > >> Consider this sequence: > > > >> Create thread 1 > > > >> Create thread 2 > > > >> Exit thread 1 > > > >> Create thread 3 > > > >> > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > > >> threads in the cpus list at the point of its creation). > > > >> > > > >> We mostly get away with this because cpu_index values aren't that > > > >> important > > > >> for userspace emulation. Still, it can't be good, so this patch fixes > > > >> it > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that > > > >> full > > > >> system targets already use. 
> > > >> > > > >> Signed-off-by: David Gibson > > > >> --- > > > >> exec.c | 19 --- > > > >> 1 file changed, 19 deletions(-) > > > >> > > > >> diff --git a/exec.c b/exec.c > > > >> index 011babd..e410dab 100644 > > > >> --- a/exec.c > > > >> +++ b/exec.c > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, > > > >> int asidx) > > > >> } > > > >> #endif > > > >> > > > >> -#ifndef CONFIG_USER_ONLY > > > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > > >> > > > >> static int cpu_get_free_index(Error **errp) > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > > > >> { > > > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > > > >> } > > > >> -#else > > > >> - > > > >> -static int cpu_get_free_index(Error **errp) > > > >> -{ > > > >> -CPUState *some_cpu; > > > >> -int cpu_index = 0; > > > >> - > > > >> -CPU_FOREACH(some_cpu) { > > > >> -cpu_index++; > > > >> -} > > > >> -return cpu_index; > > > >> -} > > > >> - > > > >> -static void cpu_release_index(CPUState *cpu) > > > >> -{ > > > >> -return; > > > >> -} > > > >> -#endif > > > > > > > > Won't this change impose a maximum limit of 256 simultaneous > > > > threads? That seems a little low for comfort. > > > > > > This was the reason why the bitmap logic wasn't applied to > > > CONFIG_USER_ONLY when it was introduced. > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html > > > > Ah.. good point. > > > > Hrm, ok, my next idea would be to just (globally) sequentially > > allocate cpu_index values for CONFIG_USER, and never try to re-use > > them. Does that seem reasonable? > > > > Isn't it only deferring the problem to later ? You mean that we could get duplicate indexes after the value wraps around? I suppose, but duplicates after spawning 4 billion threads seems like a substantial improvement over duplicates after spawning 3 in the wrong order.. 
> Maybe it is possible to define MAX_CPUMASK_BITS to a much higher > value fo CONFIG_USER only instead ? Perhaps. It does mean carrying around a huge bitmap, though. Another option is to remove cpu_index entirely for the user only case. I have some patches for this, which are very ugly but it's possible they can be cleaned up to something reasonable (the biggest chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY for what I think are registers that aren't accessible in user mode). > > > But then we didn't have actual removal, but we do now. > > > > You mean patch 1/2 in this set? Or something else? > > > > Even so, 256 does seem a bit low for a number of simultaneously active > > threads - there are some bug hairy multi-threaded programs out there. > > > -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
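The create/exit sequence discussed in this thread can be reproduced with a toy model. The sketch below is an assumption-laden simplification: the CPU list is reduced to a counter of live CPUs, and all names are hypothetical rather than taken from exec.c. It contrasts the count-based scheme that the patch removes with the "allocate sequentially, never reuse" idea floated in the reply.

```c
/* Toy model of the old CONFIG_USER_ONLY scheme: the new index is the
 * number of CPUs currently in the list (hypothetical names throughout). */
static int live_cpus;

static int create_cpu_counting(void)
{
    return live_cpus++;     /* index = CPUs present before insertion */
}

static void exit_cpu_counting(void)
{
    live_cpus--;            /* the exiting CPU leaves the list */
}

/* Alternative from the thread: a global counter that is never reused.
 * Duplicates can only appear after the counter wraps around. */
static unsigned next_cpu_index;

static unsigned create_cpu_sequential(void)
{
    return next_cpu_index++;
}
```

Running "main thread, create 1, create 2, exit 1, create 3" through the counting model gives thread 3 the same index as thread 2, which is the duplicate described in the commit message. The sequential model hands out 0, 1, 2, 3 for the same sequence.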
Re: [Qemu-devel] [PATCH] ppc: Yet another fix for the huge page support detection mechanism
On Fri, Jul 15, 2016 at 10:10:25AM +0200, Thomas Huth wrote: > Commit 86b50f2e1bef ("Disable huge page support if it is not available > for main RAM") already made sure that huge page support is not announced > to the guest if the normal RAM of non-NUMA configurations is not backed > by a huge page filesystem. However, there is one more case that can go > wrong: NUMA is enabled, but the RAM of the NUMA nodes are not configured > with huge page support (and only the memory of a DIMM is configured with > it). When QEMU is started with the following command line for example, > the Linux guest currently crashes because it is trying to use huge pages > on a memory region that does not support huge pages: > > qemu-system-ppc64 -enable-kvm ... -m 1G,slots=4,maxmem=32G -object \ >memory-backend-file,policy=default,mem-path=/hugepages,size=1G,id=mem-mem1 > \ >-device pc-dimm,id=dimm-mem1,memdev=mem-mem1 -smp 2 \ >-numa node,nodeid=0 -numa node,nodeid=1 > > To fix this issue, we've got to make sure to disable huge page support, > too, when there is a NUMA node that is not using a memory backend with > huge page support. > > Fixes: 86b50f2e1befc33407bdfeb6f45f7b0d2439a740 > Signed-off-by: Thomas Huth > --- > target-ppc/kvm.c | 10 +++--- > 1 file changed, 7 insertions(+), 3 deletions(-) Applied to ppc-for-2.7, thanks. > > diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c > index 884d564..7a8f555 100644 > --- a/target-ppc/kvm.c > +++ b/target-ppc/kvm.c > @@ -389,12 +389,16 @@ static long getrampagesize(void) > > object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize); > > -if (hpsize == LONG_MAX) { > +if (hpsize == LONG_MAX || hpsize == getpagesize()) { > return getpagesize(); > } > > -if (nb_numa_nodes == 0 && hpsize > getpagesize()) { > -/* No NUMA nodes and normal RAM without -mem-path ==> no huge pages! > */ > +/* If NUMA is disabled or the NUMA nodes are not backed with a > + * memory-backend, then there is at least one node using "normal" > + * RAM. 
And since normal RAM has not been configured with "-mem-path" > + * (what we've checked earlier here already), we can not use huge pages! > + */ > +if (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL) { > static bool warned; > if (!warned) { > error_report("Huge page support disabled (n/a for main > memory)."); -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/4] Pass generic CPUState to gen_intermediate_code()
On Fri, Jul 15, 2016 at 06:12:05PM +0200, Lluís Vilanova wrote: > Needed to implement a target-agnostic gen_intermediate_code() in the > future. > > Signed-off-by: Lluís Vilanova > --- > include/exec/exec-all.h |2 +- > target-alpha/translate.c | 11 +-- > target-arm/translate.c| 24 > target-cris/translate.c | 17 - > target-i386/translate.c | 13 ++--- > target-lm32/translate.c | 22 +++--- > target-m68k/translate.c | 15 +++ > target-microblaze/translate.c | 24 > target-mips/translate.c | 15 +++ > target-moxie/translate.c | 14 +++--- > target-openrisc/translate.c | 24 > target-ppc/translate.c| 15 +++ > target-s390x/translate.c | 13 ++--- > target-sh4/translate.c| 15 +++ > target-sparc/translate.c | 11 +-- > target-tilegx/translate.c |7 +++ > target-tricore/translate.c|9 - > target-unicore32/translate.c | 17 - > target-xtensa/translate.c | 13 ++--- > translate-all.c |2 +- > 20 files changed, 135 insertions(+), 148 deletions(-) target-ppc portion Reviewed-by: David Gibson > diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h > index 7362095..06c2400 100644 > --- a/include/exec/exec-all.h > +++ b/include/exec/exec-all.h > @@ -66,7 +66,7 @@ typedef struct TranslationBlock TranslationBlock; > > #include "qemu/log.h" > > -void gen_intermediate_code(CPUArchState *env, struct TranslationBlock *tb); > +void gen_intermediate_code(CPUState *env, struct TranslationBlock *tb); > void restore_state_to_opc(CPUArchState *env, struct TranslationBlock *tb, >target_ulong *data); > > diff --git a/target-alpha/translate.c b/target-alpha/translate.c > index 5b86992..faeccf8 100644 > --- a/target-alpha/translate.c > +++ b/target-alpha/translate.c > @@ -2860,10 +2860,9 @@ static ExitStatus translate_one(DisasContext *ctx, > uint32_t insn) > return ret; > } > > -void gen_intermediate_code(CPUAlphaState *env, struct TranslationBlock *tb) > +void gen_intermediate_code(CPUState *cpu, struct TranslationBlock *tb) > { > -AlphaCPU *cpu = alpha_env_get_cpu(env); > -CPUState *cs = CPU(cpu); 
> +CPUAlphaState *env = cpu->env_ptr; > DisasContext ctx, *ctxp = &ctx; > target_ulong pc_start; > target_ulong pc_mask; > @@ -2878,7 +2877,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct > TranslationBlock *tb) > ctx.pc = pc_start; > ctx.mem_idx = cpu_mmu_index(env, false); > ctx.implver = env->implver; > -ctx.singlestep_enabled = cs->singlestep_enabled; > +ctx.singlestep_enabled = cpu->singlestep_enabled; > > #ifdef CONFIG_USER_ONLY > ctx.ir = cpu_std_ir; > @@ -2917,7 +2916,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct > TranslationBlock *tb) > tcg_gen_insn_start(ctx.pc); > num_insns++; > > -if (unlikely(cpu_breakpoint_test(cs, ctx.pc, BP_ANY))) { > +if (unlikely(cpu_breakpoint_test(cpu, ctx.pc, BP_ANY))) { > ret = gen_excp(&ctx, EXCP_DEBUG, 0); > /* The address covered by the breakpoint must be included in > [tb->pc, tb->pc + tb->size) in order to for it to be > @@ -2991,7 +2990,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct > TranslationBlock *tb) > #ifdef DEBUG_DISAS > if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) { > qemu_log("IN: %s\n", lookup_symbol(pc_start)); > -log_target_disas(cs, pc_start, ctx.pc - pc_start, 1); > +log_target_disas(cpu, pc_start, ctx.pc - pc_start, 1); > qemu_log("\n"); > } > #endif > diff --git a/target-arm/translate.c b/target-arm/translate.c > index 940ec8d..837ceda 100644 > --- a/target-arm/translate.c > +++ b/target-arm/translate.c > @@ -11587,10 +11587,10 @@ static bool insn_crosses_page(CPUARMState *env, > DisasContext *s) > } > > /* generate intermediate code for basic block 'tb'. 
*/ > -void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb) > +void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb) > { > -ARMCPU *cpu = arm_env_get_cpu(env); > -CPUState *cs = CPU(cpu); > +CPUARMState *env = cpu->env_ptr; > +ARMCPU *arm_cpu = arm_env_get_cpu(env); > DisasContext dc1, *dc = &dc1; > target_ulong pc_start; > target_ulong next_page_start; > @@ -11604,7 +11604,7 @@ void gen_intermediate_code(CPUARMState *env, > TranslationBlock *tb) > * the A32/T32 complexity to do with conditional execution/IT blocks/etc. > */ > if (ARM_TBFLAG_AARCH64_STATE(tb->flags)) { > -gen_intermediate_code_a64(cpu, tb); > +gen_intermediate_code_a64(arm_cpu, tb); > return; >
Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi
On Fri, Jul 15, 2016 at 03:55:20PM +0800, Zhangfei Gao wrote:
> Dear Dave
>
> On Wed, Jul 13, 2016 at 7:03 AM, Dave Chinner wrote:
> > On Tue, Jul 12, 2016 at 12:43:24PM -0400, Theodore Ts'o wrote:
> >> On Tue, Jul 12, 2016 at 03:14:38PM +0800, Zhangfei Gao wrote:
> >> > Some update:
> >> >
> >> > If test with ext2, no problem in iblock.
> >> > If test with ext4, ext4_mb_generate_buddy reported error in the
> >> > removing files after reboot.
> >> >
> >> >
> >> > root@(none)$ rm test
> >> > [ 21.006549] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: group 18, block bitmap and bg descriptor inconsistent: 26464 vs 25600 free clusters
> >> > [ 21.008249] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
> >> >
> >> > Any special notes of using ext4 in qemu?
> >>
> >> Ext4 has more runtime consistency checking than ext2. So just because
> >> ext4 complains doesn't mean that there isn't a problem with the file
> >> system; it just means that ext4 is more likely to notice before you
> >> lose user data.
> >>
> >> So if you test with ext2, try running e2fsck afterwards, to make sure
> >> the file system is consistent.
> >>
> >> Given that I'm regularly testing ext4 using kvm, and I haven't seen
> >> anything like this in a very long time, I suspect the problem is with
> >> your SCSI code, and not with ext4.
> >
> > It's the same error I reported yesterday for ext3 on 4.7-rc6 when
> > rebooting a VM after it hung.
>
>
> Any link of this error?

http://article.gmane.org/gmane.comp.file-systems.ext4/53792

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: [Qemu-devel] [RFC PATCH V2] qemu-char: Fix context for g_source_attach()
Hi all,

Could you give me some feedback on this patch? We need more comments.

The COLO project depends on this patch: with it, colo-compare can run the handlers registered through qemu_chr_add_handlers() in the compare thread, which reduces the load on the main loop when the network is busy. The idea comes from Jason.

Thanks
Zhang Chen

On 07/11/2016 10:12 AM, Zhang Chen wrote: On 07/08/2016 10:27 PM, Paolo Bonzini wrote: On 08/07/2016 10:54, Daniel P. Berrange wrote: On Fri, Jul 08, 2016 at 09:48:23AM +0800, Fam Zheng wrote: On Wed, 06/22 18:49, Zhang Chen wrote: We want to poll and handle chardev in another thread other than main loop. But qemu_chr_add_handlers() can only work for global default context other than thread default context. So we use g_source_attach(xx, g_main_context_get_thread_default()) replace g_source_attach(xx, NULL) to attach g_source. Comments from jason. Signed-off-by: Zhang Chen Signed-off-by: Jason Wang --- io/channel.c | 2 +- qemu-char.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/io/channel.c b/io/channel.c index 692eb17..cd25677 100644 --- a/io/channel.c +++ b/io/channel.c @@ -146,7 +146,7 @@ guint qio_channel_add_watch(QIOChannel *ioc, g_source_set_callback(source, (GSourceFunc)func, user_data, notify); -id = g_source_attach(source, NULL); +id = g_source_attach(source, g_main_context_get_thread_default()); g_source_unref(source); return id; diff --git a/qemu-char.c b/qemu-char.c index 84f49ac..4340457 100644 --- a/qemu-char.c +++ b/qemu-char.c @@ -859,7 +859,7 @@ static gboolean io_watch_poll_prepare(GSource *source, gint *timeout_) iwp->src = qio_channel_create_watch( iwp->ioc, G_IO_IN | G_IO_ERR | G_IO_HUP | G_IO_NVAL); g_source_set_callback(iwp->src, iwp->fd_read, iwp->opaque, NULL); -g_source_attach(iwp->src, NULL); +g_source_attach(iwp->src, g_main_context_get_thread_default()); } else { g_source_destroy(iwp->src); g_source_unref(iwp->src); @@ -918,7 +918,7 @@ static guint io_add_watch_poll(QIOChannel *ioc, iwp->fd_read =
(GSourceFunc) fd_read; iwp->src = NULL; -tag = g_source_attach(&iwp->parent, NULL); +tag = g_source_attach(&iwp->parent, g_main_context_get_thread_default()); g_source_unref(&iwp->parent); return tag; } @@ -3982,7 +3982,7 @@ int qemu_chr_fe_add_watch(CharDriverState *s, GIOCondition cond, } g_source_set_callback(src, (GSourceFunc)func, user_data, NULL); -tag = g_source_attach(src, NULL); +tag = g_source_attach(src, g_main_context_get_thread_default()); g_source_unref(src); return tag; -- IIRC this opens a gate for your special thread (COLO compare thread?) to use QIOChannel. I've no real objection to this proposed patch, though it is fairly pointless to take it now without seeing any following patch that actually makes use of this added feature. I agree. Should I move this patch into the "[RFC PATCH V5 0/4] Introduce COLO-compare" patch set? That would show how it works. You can see how it is used in this patch: http://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg06754.html (in colo_compare_thread()). I think in the long run it is better to think about allowing integrating QIO to AioContext, to support its usage outside main loop. Given how opaque GSource is, I'm not sure how feasible that is, or how useful it will be. Anyway we should definitely hear more opinions from Daniel and Paolo. Personally I think it is preferable to stick as close to the standard GSource model as possible, as that's a widely used & well understood API, compared to the QEMU specific AioContext. AioContext is more optimized for the case where the callbacks are static. In general this is not the case for qemu-char.c. I'm not sure AioContext can do this job well, but I think we can make QEMU flexible enough that the same job can be done in more than one way. All roads lead to Rome. Thanks Zhang Chen Paolo . -- Thanks zhangchen
Re: [Qemu-devel] [RFC 4/6] target-ppc: add cmprb instruction
On Tue, Jul 12, 2016 at 11:33:20PM +0530, Nikunj A Dadhania wrote: > ISA 3.0 Compare Ranged Byte instruction useful for > isupper/islower/isaplha kind of operation. At least until you have locale-aware versions of those... > Signed-off-by: Nikunj A Dadhania Reviewed-by: David Gibson > --- > target-ppc/translate.c | 40 > 1 file changed, 40 insertions(+) > > diff --git a/target-ppc/translate.c b/target-ppc/translate.c > index 93c7c66..8de217f 100644 > --- a/target-ppc/translate.c > +++ b/target-ppc/translate.c > @@ -817,6 +817,45 @@ static void gen_cmpli(DisasContext *ctx) > } > } > > +/* cmprb - range comparison: isupper, isaplha, islower*/ > +static void gen_cmprb(DisasContext *ctx) > +{ > +TCGLabel *lab1 = gen_new_label(); > +TCGLabel *lab2 = gen_new_label(); > +TCGv src1 = tcg_temp_local_new(); > +TCGv src2 = tcg_temp_local_new(); > +TCGv src2lo = tcg_temp_local_new(); > +TCGv src2hi = tcg_temp_local_new(); > + > +tcg_gen_andi_tl(src1, cpu_gpr[rA(ctx->opcode)], 0xFF); > +tcg_gen_andi_tl(src2, cpu_gpr[rB(ctx->opcode)], 0x); > + > +tcg_gen_andi_tl(src2lo, src2, 0xFF); > +tcg_gen_shri_tl(src2hi, src2, 8); > +tcg_gen_andi_tl(src2hi, src2hi, 0xFF); > + > +tcg_gen_brcond_tl(TCG_COND_GTU, src1, src2hi, lab1); > +tcg_gen_brcond_tl(TCG_COND_LTU, src1, src2lo, lab1); > +tcg_gen_movi_i32(cpu_crf[crfD(ctx->opcode)], 1 << CRF_GT); > +tcg_gen_br(lab2); > +gen_set_label(lab1); > + > +if (ctx->opcode & 0x0020) { > +tcg_gen_shri_tl(src2hi, src2, 24); > +tcg_gen_andi_tl(src2hi, src2hi, 0xFF); > +tcg_gen_shri_tl(src2lo, src2, 16); > +tcg_gen_andi_tl(src2lo, src2lo, 0xFF); > +tcg_gen_brcond_tl(TCG_COND_GTU, src1, src2hi, lab2); > +tcg_gen_brcond_tl(TCG_COND_LTU, src1, src2lo, lab2); > +tcg_gen_movi_i32(cpu_crf[crfD(ctx->opcode)], 1 << CRF_GT); > +} > +gen_set_label(lab2); > +tcg_temp_free(src1); > +tcg_temp_free(src2); > +tcg_temp_free(src2lo); > +tcg_temp_free(src2hi); > +} > + > /* isel (PowerPC 2.03 specification) */ > static void gen_isel(DisasContext *ctx) > { > @@ -9898,6 
+9937,7 @@ GEN_HANDLER(cmpi, 0x0B, 0xFF, 0xFF, 0x0040, > PPC_INTEGER), > GEN_HANDLER(cmpl, 0x1F, 0x00, 0x01, 0x0040, PPC_INTEGER), > GEN_HANDLER(cmpli, 0x0A, 0xFF, 0xFF, 0x0040, PPC_INTEGER), > GEN_HANDLER_E(cmpb, 0x1F, 0x1C, 0x0F, 0x0001, PPC_NONE, PPC2_ISA205), > +GEN_HANDLER_E(cmprb, 0x1F, 0x00, 0x06, 0x0041, PPC_NONE, PPC2_ISA300), > GEN_HANDLER(isel, 0x1F, 0x0F, 0xFF, 0x0001, PPC_ISEL), > GEN_HANDLER(addi, 0x0E, 0xFF, 0xFF, 0x, PPC_INTEGER), > GEN_HANDLER(addic, 0x0C, 0xFF, 0xFF, 0x, PPC_INTEGER), -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
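For anyone reading the TCG sequence above without the ISA text at hand, here is a small reference model of what the posted gen_cmprb() appears to compute, written as plain C. It is only a sketch: the demo_ names are made up, the value 0x4 for "1 << CRF_GT" is an assumption about the CR field layout, and returning 0 for the out-of-range case is my reading of the intent rather than something the patch spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of "1 << CRF_GT", i.e. the GT bit of a 4-bit CR field. */
#define DEMO_CR_GT 0x4u

/* True when lo <= value <= hi, all compared as unsigned bytes. */
static bool demo_in_range(uint8_t value, uint8_t lo, uint8_t hi)
{
    return lo <= value && value <= hi;
}

/*
 * Hypothetical reference model of cmprb.
 *   ra: only the low byte is compared
 *   rb: packed ranges, bits 0-7 = lo1, 8-15 = hi1, 16-23 = lo2, 24-31 = hi2
 *   l:  when true, the second range is checked as well
 * Returns the modelled CR field: GT set if the byte is in range, else 0.
 */
unsigned demo_cmprb(uint64_t ra, uint64_t rb, bool l)
{
    uint8_t src = ra & 0xFF;
    bool hit = demo_in_range(src, rb & 0xFF, (rb >> 8) & 0xFF);

    if (l) {
        hit = hit || demo_in_range(src, (rb >> 16) & 0xFF, (rb >> 24) & 0xFF);
    }
    return hit ? DEMO_CR_GT : 0;
}

/* Usage: one range for isupper, two ranges for isalpha (ASCII only). */
void demo_cmprb_examples(void)
{
    uint64_t upper = ((uint64_t)'Z' << 8) | 'A';
    uint64_t alpha = ((uint64_t)'z' << 24) | ((uint64_t)'a' << 16) | upper;

    assert(demo_cmprb('Q', upper, false) == DEMO_CR_GT);
    assert(demo_cmprb('q', upper, false) == 0);
    assert(demo_cmprb('q', alpha, true) == DEMO_CR_GT);
    assert(demo_cmprb('1', alpha, true) == 0);
}
```

The two-range form is what makes an isalpha-style test a single instruction; as noted in the review, this only covers the non-locale-aware case.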
Re: [Qemu-devel] [RFC 1/6] target-ppc: Introduce Power9 family
On Tue, Jul 12, 2016 at 11:33:17PM +0530, Nikunj A Dadhania wrote: > From: "Aneesh Kumar K.V" > > Signed-off-by: Aneesh Kumar K.V > [ rebased and added POWER9 alias ] > Signed-off-by: Nikunj A Dadhania > --- > target-ppc/cpu-models.c | 5 +++ > target-ppc/cpu-models.h | 2 ++ > target-ppc/cpu-qom.h| 7 > target-ppc/mmu_helper.c | 3 +- > target-ppc/translate_init.c | 85 > - > 5 files changed, 100 insertions(+), 2 deletions(-) > > diff --git a/target-ppc/cpu-models.c b/target-ppc/cpu-models.c > index 5209e63..901cf40 100644 > --- a/target-ppc/cpu-models.c > +++ b/target-ppc/cpu-models.c > @@ -1147,6 +1147,10 @@ > "POWER8NVL v1.0") > POWERPC_DEF("970_v2.2", CPU_POWERPC_970_v22,970, > "PowerPC 970 v2.2") > + > +POWERPC_DEF("POWER9_v1.0", CPU_POWERPC_POWER9_BASE,POWER9, > +"POWER9 v1.0") > + > POWERPC_DEF("970fx_v1.0",CPU_POWERPC_970FX_v10, 970, > "PowerPC 970FX v1.0 (G5)") > POWERPC_DEF("970fx_v2.0",CPU_POWERPC_970FX_v20, 970, > @@ -1395,6 +1399,7 @@ PowerPCCPUAlias ppc_cpu_aliases[] = { > { "POWER8E", "POWER8E_v2.1" }, > { "POWER8", "POWER8_v2.0" }, > { "POWER8NVL", "POWER8NVL_v1.0" }, > +{ "POWER9", "POWER9_v1.0" }, > { "970", "970_v2.2" }, > { "970fx", "970fx_v3.1" }, > { "970mp", "970mp_v1.1" }, > diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h > index f21a44c..beeaaba 100644 > --- a/target-ppc/cpu-models.h > +++ b/target-ppc/cpu-models.h > @@ -562,6 +562,8 @@ enum { > CPU_POWERPC_POWER8_v20 = 0x004D0200, > CPU_POWERPC_POWER8NVL_BASE = 0x004C, > CPU_POWERPC_POWER8NVL_v10 = 0x004C0100, > +CPU_POWERPC_POWER9_BASE= 0x004E, > +CPU_POWERPC_POWER9_MAM = 0x004E0100, > CPU_POWERPC_970_v22= 0x00390202, > CPU_POWERPC_970FX_v10 = 0x00391100, > CPU_POWERPC_970FX_v20 = 0x003C0200, > diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h > index 2864105..df2fb65 100644 > --- a/target-ppc/cpu-qom.h > +++ b/target-ppc/cpu-qom.h > @@ -86,6 +86,13 @@ enum powerpc_mmu_t { > POWERPC_MMU_2_07 = POWERPC_MMU_64 | POWERPC_MMU_1TSEG > | POWERPC_MMU_64K > | POWERPC_MMU_AMR | 
0x0004, > +/* for now , We can add radix later if needed */ I'm guessing this means you're only thinking about the guest-side presentation of the P9 MMU at this point? IIUC the host side presentation is so different that sharing any constants with pre-P9 MMUs probably doesn't make sense. I'm not immediately sure how we should make this distinction in the target-ppc code, since these values are supposed to belong to the CPU regardless of operating mode. > +/* POWERPC_MMU_3_00 = POWERPC_MMU_64 | POWERPC_MMU_1TSEG > + * | POWERPC_MMU_AMR | 0x0005, > + */ > + > +POWERPC_MMU_3_00 = POWERPC_MMU_64 | POWERPC_MMU_AMR | 0x0005, > + > /* Architecture 2.07 "degraded" (no 1T segments) */ > POWERPC_MMU_2_07a = POWERPC_MMU_64 | POWERPC_MMU_AMR > | 0x0004, > diff --git a/target-ppc/mmu_helper.c b/target-ppc/mmu_helper.c > index 485d5b8..6219c4a 100644 > --- a/target-ppc/mmu_helper.c > +++ b/target-ppc/mmu_helper.c > @@ -1935,13 +1935,14 @@ void ppc_tlb_invalidate_all(CPUPPCState *env) > case POWERPC_MMU_2_06a: > case POWERPC_MMU_2_07: > case POWERPC_MMU_2_07a: > +case POWERPC_MMU_3_00: > #endif /* defined(TARGET_PPC64) */ > env->tlb_need_flush = 0; > tlb_flush(CPU(cpu), 1); > break; > default: > /* XXX: TODO */ > -cpu_abort(CPU(cpu), "Unknown MMU model\n"); > +cpu_abort(CPU(cpu), "Unknown MMU model %d\n", env->mmu_model); > break; > } > } > diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c > index 8f257fb..51bab23 100644 > --- a/target-ppc/translate_init.c > +++ b/target-ppc/translate_init.c > @@ -7459,7 +7459,8 @@ enum BOOK3S_CPU_TYPE { > BOOK3S_CPU_POWER5PLUS, > BOOK3S_CPU_POWER6, > BOOK3S_CPU_POWER7, > -BOOK3S_CPU_POWER8 > +BOOK3S_CPU_POWER8, > +BOOK3S_CPU_POWER9 > }; > > static void gen_fscr_facility_check(DisasContext *ctx, int facility_sprn, > @@ -8241,6 +8242,7 @@ static void init_proc_book3s_64(CPUPPCState *env, int > version) > break; > case BOOK3S_CPU_POWER7: > case BOOK3S_CPU_POWER8: > +case BOOK3S_CPU_POWER9: > gen_spr_book3s_ids(env); > 
gen_spr_amr(env, version >= BOOK3S_CPU_POWER8); > gen_spr_book3s_purr(env); > @@ -8293,6 +8295,7 @@ static void init_proc_book3s_64(CPUPPCState *env, int > version) > break; > case BOOK3S_CPU_POWER7: > case BOOK3S_CPU_POWER8: > +case BOOK3S_CPU_POWER9
Re: [Qemu-devel] [RFC 2/6] target-ppc: Introduce POWER ISA 3.0 flag
On Tue, Jul 12, 2016 at 11:33:18PM +0530, Nikunj A Dadhania wrote: > This flag will be used for POWER9 instructions. > > Signed-off-by: Nikunj A Dadhania Reviewed-by: David Gibson > --- > target-ppc/cpu.h| 5 - > target-ppc/translate_init.c | 2 +- > 2 files changed, 5 insertions(+), 2 deletions(-) > > diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h > index 2666a3f..f48ff0f 100644 > --- a/target-ppc/cpu.h > +++ b/target-ppc/cpu.h > @@ -2093,6 +2093,8 @@ enum { > PPC2_TM= 0x0002ULL, > /* Server PM instructgions (ISA 2.06, Book III) > */ > PPC2_PM_ISA206 = 0x0004ULL, > +/* POWER ISA 3.0 > */ > +PPC2_ISA300= 0x0008ULL, > > #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | > \ > PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \ > @@ -2100,7 +2102,8 @@ enum { > PPC2_FP_CVT_ISA206 | PPC2_FP_TST_ISA206 | \ > PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | \ > PPC2_ALTIVEC_207 | PPC2_ISA207S | PPC2_DFP | \ > -PPC2_FP_CVT_S64 | PPC2_TM | PPC2_PM_ISA206) > +PPC2_FP_CVT_S64 | PPC2_TM | PPC2_PM_ISA206 | \ > +PPC2_ISA300) > }; > > > /*/ > diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c > index 51bab23..9852524 100644 > --- a/target-ppc/translate_init.c > +++ b/target-ppc/translate_init.c > @@ -8820,7 +8820,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data) > PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 | > PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 | > PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 | > -PPC2_TM | PPC2_PM_ISA206; > +PPC2_TM | PPC2_PM_ISA206 | PPC2_ISA300; > pcc->msr_mask = (1ull << MSR_SF) | > (1ull << MSR_TM) | > (1ull << MSR_VR) | -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [RFC 5/6] target-ppc: add modulo word operations
On Tue, Jul 12, 2016 at 11:33:21PM +0530, Nikunj A Dadhania wrote: > Adding following instructions: > > moduw: Modulo Unsigned Word > modsw: Modulo Signed Word > > Signed-off-by: Nikunj A Dadhania Hrm.. any reason you're not using the TCG inbuilt remainder ops (tcg_gen_rem_i32() etc.)? > --- > target-ppc/translate.c | 50 > ++ > 1 file changed, 50 insertions(+) > > diff --git a/target-ppc/translate.c b/target-ppc/translate.c > index 8de217f..c505684 100644 > --- a/target-ppc/translate.c > +++ b/target-ppc/translate.c > @@ -1178,6 +1178,54 @@ GEN_DIVE(divde, divde, 0); > GEN_DIVE(divdeo, divde, 1); > #endif > > +static inline void gen_op_arith_modw(DisasContext *ctx, TCGv ret, TCGv arg1, > + TCGv arg2, int sign) > +{ > +TCGLabel *l1 = gen_new_label(); > +TCGLabel *l2 = gen_new_label(); > +TCGv_i32 t0 = tcg_temp_local_new_i32(); > +TCGv_i32 t1 = tcg_temp_local_new_i32(); > +TCGv_i32 t2 = tcg_temp_local_new_i32(); > + > +tcg_gen_trunc_tl_i32(t0, arg1); > +tcg_gen_trunc_tl_i32(t1, arg2); > +tcg_gen_brcondi_i32(TCG_COND_EQ, t1, 0, l1); > +if (sign) { > +TCGLabel *l3 = gen_new_label(); > +tcg_gen_brcondi_i32(TCG_COND_NE, t1, -1, l3); > +tcg_gen_brcondi_i32(TCG_COND_EQ, t0, INT32_MIN, l1); > +gen_set_label(l3); > +tcg_gen_div_i32(t2, t0, t1); > +} else { > +tcg_gen_divu_i32(t2, t0, t1); > +} > +tcg_gen_mul_i32(t2, t2, t1); > +tcg_gen_sub_i32(t2, t0, t2); > +tcg_gen_br(l2); > +gen_set_label(l1); > +if (sign) { > +tcg_gen_sari_i32(t2, t0, 31); > +} else { > +tcg_gen_movi_i32(t2, 0); > +} > +gen_set_label(l2); > +tcg_gen_extu_i32_tl(ret, t2); > +tcg_temp_free_i32(t0); > +tcg_temp_free_i32(t1); > +tcg_temp_free_i32(t2); > +} > + > +#define GEN_INT_ARITH_MODW(name, opc3, sign)\ > +static void glue(gen_, name)(DisasContext *ctx) \ > +{ \ > +gen_op_arith_modw(ctx, cpu_gpr[rD(ctx->opcode)],\ > + cpu_gpr[rA(ctx->opcode)], cpu_gpr[rB(ctx->opcode)], \ > + sign);\ > +} > + > +GEN_INT_ARITH_MODW(modsw, 0x18, 1); > +GEN_INT_ARITH_MODW(moduw, 0x08, 0); > + > /* mulhw mulhw. 
*/ > static void gen_mulhw(DisasContext *ctx) > { > @@ -10244,6 +10292,8 @@ GEN_HANDLER_E(divwe, 0x1F, 0x0B, 0x0D, 0, PPC_NONE, > PPC2_DIVE_ISA206), > GEN_HANDLER_E(divweo, 0x1F, 0x0B, 0x1D, 0, PPC_NONE, PPC2_DIVE_ISA206), > GEN_HANDLER_E(divweu, 0x1F, 0x0B, 0x0C, 0, PPC_NONE, PPC2_DIVE_ISA206), > GEN_HANDLER_E(divweuo, 0x1F, 0x0B, 0x1C, 0, PPC_NONE, PPC2_DIVE_ISA206), > +GEN_HANDLER_E(modsw, 0x1F, 0x0B, 0x18, 0x0001, PPC_NONE, PPC2_ISA300), > +GEN_HANDLER_E(moduw, 0x1F, 0x0B, 0x08, 0x0001, PPC_NONE, PPC2_ISA300), > > #if defined(TARGET_PPC64) > #undef GEN_INT_ARITH_DIVD -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
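To make the branches in gen_op_arith_modw() easier to follow, the same logic can be written as ordinary C. This is a sketch of what the posted patch computes, not of what the ISA mandates: the demo_ names are invented, and the values returned for a zero divisor and for INT32_MIN % -1 simply mirror the patch's fallback path (0 for unsigned, the sign of the dividend replicated for signed).

```c
#include <assert.h>
#include <stdint.h>

/* moduw model: unsigned 32-bit remainder; a zero divisor yields 0. */
uint32_t demo_moduw(uint32_t a, uint32_t b)
{
    if (b == 0) {
        return 0;
    }
    /* remainder via divide, multiply, subtract, as in the patch */
    return a - (a / b) * b;
}

/* modsw model: signed 32-bit remainder, result returned zero-extended. */
uint32_t demo_modsw(int32_t a, int32_t b)
{
    if (b == 0 || (b == -1 && a == INT32_MIN)) {
        /* mirrors the arithmetic shift right by 31 of the dividend */
        return a < 0 ? UINT32_MAX : 0;
    }
    return (uint32_t)(a - (a / b) * b);
}

/* Usage: the ordinary cases agree with C's % operator. */
void demo_mod_examples(void)
{
    assert(demo_moduw(7, 3) == 1);
    assert(demo_modsw(7, -2) == 1);
    assert(demo_modsw(-7, 2) == (uint32_t)-1);
    assert(demo_modsw(INT32_MIN, -1) == UINT32_MAX);
}
```

The two guarded cases are exactly the ones where a host division instruction could trap, which is presumably why the patch branches around the divide; a built-in remainder op would still need the same guards on the operands.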
Re: [Qemu-devel] [RFC 3/6] target-ppc: adding addpcis instruction
On Tue, Jul 12, 2016 at 11:33:19PM +0530, Nikunj A Dadhania wrote: > ISA 3.0 instruction for adding immediate value with next instruction > address and return the result in the target register. > > Signed-off-by: Nikunj A Dadhania Reviewed-by: David Gibson > --- > target-ppc/translate.c | 27 +++ > 1 file changed, 27 insertions(+) > > diff --git a/target-ppc/translate.c b/target-ppc/translate.c > index 92030b6..93c7c66 100644 > --- a/target-ppc/translate.c > +++ b/target-ppc/translate.c > @@ -432,6 +432,20 @@ static inline uint32_t name(uint32_t opcode) > \ > return (((opcode >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) | > \ > ((opcode >> (shift2)) & ((1 << (nb2)) - 1)); > \ > } > + > +#define EXTRACT_HELPER_DXFORM(name, > \ > + d0_bits, shift_op_d0, shift_d0, > \ > + d1_bits, shift_op_d1, shift_d1, > \ > + d2_bits, shift_op_d2, shift_d2) > \ > +static inline int16_t name(uint32_t opcode) > \ > +{ > \ > +return > \ > +(((opcode >> (shift_op_d0)) & ((1 << (d0_bits)) - 1)) << (shift_d0)) > | \ > +(((opcode >> (shift_op_d1)) & ((1 << (d1_bits)) - 1)) << (shift_d1)) > | \ > +(((opcode >> (shift_op_d2)) & ((1 << (d2_bits)) - 1)) << > (shift_d2)); \ > +} > + > + > /* Opcode part 1 */ > EXTRACT_HELPER(opc1, 26, 6); > /* Opcode part 2 */ > @@ -501,6 +515,9 @@ EXTRACT_HELPER(FPL, 25, 1); > EXTRACT_HELPER(FPFLM, 17, 8); > EXTRACT_HELPER(FPW, 16, 1); > > +/* addpcis */ > +EXTRACT_HELPER_DXFORM(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0) > + > /***Jump target decoding > ***/ > /* Immediate address */ > static inline target_ulong LI(uint32_t opcode) > @@ -984,6 +1001,15 @@ static void gen_addis(DisasContext *ctx) > } > } > > +/* addpcis */ > +static void gen_addpcis(DisasContext *ctx) > +{ > +target_long d = DX(ctx->opcode); > + > +tcg_gen_movi_tl(cpu_gpr[rD(ctx->opcode)], ctx->nip); > +tcg_gen_addi_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rD(ctx->opcode)], d); > +} > + > static inline void gen_op_arith_divw(DisasContext *ctx, TCGv ret, TCGv arg1, > TCGv arg2, int sign, int compute_ov) > { > @@ 
-9877,6 +9903,7 @@ GEN_HANDLER(addi, 0x0E, 0xFF, 0xFF, 0x, > PPC_INTEGER), > GEN_HANDLER(addic, 0x0C, 0xFF, 0xFF, 0x, PPC_INTEGER), > GEN_HANDLER2(addic_, "addic.", 0x0D, 0xFF, 0xFF, 0x, PPC_INTEGER), > GEN_HANDLER(addis, 0x0F, 0xFF, 0xFF, 0x, PPC_INTEGER), > +GEN_HANDLER_E(addpcis, 0x13, 0x2, 0xFF, 0x, PPC_NONE, PPC2_ISA300), > GEN_HANDLER(mulhw, 0x1F, 0x0B, 0x02, 0x0400, PPC_INTEGER), > GEN_HANDLER(mulhwu, 0x1F, 0x0B, 0x00, 0x0400, PPC_INTEGER), > GEN_HANDLER(mullw, 0x1F, 0x0B, 0x07, 0x, PPC_INTEGER), -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
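The EXTRACT_HELPER_DXFORM(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0) instantiation above is dense, so here is the same field extraction expanded by hand. It is a sketch only: demo_extract_dx() is a made-up name, and the explicit sign conversion is my own addition to avoid relying on implementation-defined narrowing to int16_t. It covers only the immediate extraction, not the address arithmetic in gen_addpcis().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hand-expanded model of the DX-form immediate:
 *   d0: 10 bits taken from opcode bit 6,  placed at bit 6 of the result
 *   d1:  5 bits taken from opcode bit 16, placed at bit 1
 *   d2:  1 bit  taken from opcode bit 0,  placed at bit 0
 * The 16-bit result is interpreted as a signed value.
 */
int16_t demo_extract_dx(uint32_t opcode)
{
    uint32_t d0 = (opcode >> 6) & 0x3FF;
    uint32_t d1 = (opcode >> 16) & 0x1F;
    uint32_t d2 = opcode & 0x1;
    uint32_t raw = (d0 << 6) | (d1 << 1) | d2;

    /* two's complement reinterpretation without implementation-defined casts */
    return raw >= 0x8000 ? (int16_t)((int32_t)raw - 0x10000) : (int16_t)raw;
}

/* Usage: each field alone, then all fields set. */
void demo_extract_dx_examples(void)
{
    assert(demo_extract_dx(1u << 6) == 64);    /* lowest bit of d0 */
    assert(demo_extract_dx(1u << 16) == 2);    /* lowest bit of d1 */
    assert(demo_extract_dx(1u) == 1);          /* d2 */
    assert(demo_extract_dx((0x3FFu << 6) | (0x1Fu << 16) | 1u) == -1);
}
```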
[Qemu-devel] [Bug 1603693] Re: Disks in mptsas1068 scsi controller not seen by linux
Welp. Yeah now I see it, it was in the test case I linked. Thanks. Vmware doesn't seem to need this. Seems like it assigns a WWN of 0x5000c293944837df to my disk (not in the vm config files as far as i can see, seems to persist across reboots) [2.305111] ioc0: LSISAS1068 B0: Capabilities={Initiator} [2.445800] scsi host2: ioc0: LSISAS1068 B0, FwRev=01032920h, Ports=1, MaxQ=128, IRQ=18 [2.447672] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x5000c293944837df [2.448806] scsi 2:0:0:0: Direct-Access VMware, VMware Virtual S 1.0 PQ: 0 ANSI: 2 Qemu with the manually specified WWN, for reference: [3.656894] ioc0: LSISAS1068 A0: Capabilities={Initiator} [3.790680] scsi host0: ioc0: LSISAS1068 A0, FwRev=01329200h, Ports=8, MaxQ=128, IRQ=10 [3.792232] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x5000c50015ea71ac [3.792476] scsi 0:0:0:0: Direct-Access QEMU QEMU HARDDISK2.5+ PQ: 0 ANSI: 5 Also vmware doesn't populate /dev/disk/by-id/wwn-*: # ls /dev/disk/by-id ata-VMware_Virtual_IDE_CDROM_Drive_0001@ dm-name-arch_airootfs@ Qemu: # ls /dev/disk/by-id ata-QEMU_DVD-ROM_QM2@ scsi-35000c50015ea71ac@ scsi-35000c50015ea71ac-part2@ wwn-0x5000c50015ea71ac@ wwn-0x5000c50015ea71ac-part2@ dm-name-arch_airootfs@ scsi-35000c50015ea71ac-part1@ scsi-35000c50015ea71ac-part3@ wwn-0x5000c50015ea71ac-part1@ wwn-0x5000c50015ea71ac-part3@ Not directly related: after getting the arch iso cd to boot, I found that the VM that I actually wanted to get working uses mptspi instead of mptsas. So I didn't even need this controller... The non-working vmware config says `scsi0.virtualDev = "lsilogic"` (that's mptspi, LSI53C1030 or "LSI Logic Ultra 320"). For the mptsas tests above, I changed it to `scsi0.virtualDev = "lsisas1068"`. Is it correct to say that the LSI53C1030 parts of [1] were never applied? 
[1]: http://lists.gnu.org/archive/html/qemu-devel/2012-09/msg01608.html -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1603693 Title: Disks in mptsas1068 scsi controller not seen by linux Status in QEMU: New Bug description: When using the mptsas1068 scsi controller, linux detects the controller itself but not the drives attached to it. Freebsd works. Using a different controller with linux works. VMware with linux works. qemu 2.6.50 (v2.6.0-1925-g6b92bbf) seabios rel-1.9.0-139-gae3f78f (master branch, required for mptsas1068 support) Test script, loosely based off what libvirt runs and the libvirt tests that Paolo Bonzini wrote [1] # iso=archlinux-2016.07.01-dual.iso #iso=FreeBSD-10.3-RELEASE-amd64-bootonly.iso device=mptsas1068 #device=lsi img=empty.img qemu-img create -f qcow2 $img 1G /usr/bin/qemu-system-x86_64 \ -enable-kvm \ -m 1024 \ -boot menu=on \ -device $device,id=scsi0,bus=pci.0,addr=0x9 \ -drive file=$img,format=qcow2,if=none,id=drive-scsi0-0-0-0 \ -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 \ -drive file=$iso,format=raw,if=none,id=drive-ide0-0-1,readonly=on \ -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=1 # The ISOs can be downloaded from [2] and [3]. After booting linux, do "lsblk". /dev/sda should exist. After booting freebsd, do "geom disk list". A da0 / "QEMU QEMU HARDDISK" should be mentioned. With device=mptsas1068 this fails in linux. With device=lsi line it works in both. With VMWare and a linux VM (opensuse 10.1, kernel 2.6.18) which only loads modules for mptsas1068, this works. I also reproduced this with the debian 8.5 netinstall image, but it insists in making you pick a driver from a list of modules when it fails to mount it, instead of dropping to a shell. 
Arch linux dmesg output snippet (full output attached as arch-linux- dmesg.txt): # root@archiso ~ # dmesg | grep -i -e mpt -e scsi -e ioc0 [0.00] Linux version 4.6.3-1-ARCH (builduser@tobias) (gcc version 6.1.1 20160602 (GCC) ) #1 SMP PREEMPT Fri Jun 24 21:19:13 CEST 2016 [0.00] Normal empty [0.00] Preemptible hierarchical RCU implementation. [1.879616] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249) [1.951581] SCSI subsystem initialized [1.957113] Fusion MPT base driver 3.04.20 [1.957618] Fusion MPT SAS Host driver 3.04.20 [2.281773] scsi host0: ata_piix [2.285372] scsi host1: ata_piix [2.305803] mptbase: ioc0: Initiating bringup [2.363555] ioc0: LSISAS1068 A0: Capabilities={Initiator} [2.444390] scsi 0:0:1:0: CD-ROMQEMU QEMU DVD-ROM 2.5+ PQ: 0 ANSI: 5 [2.500572] scsi host2: ioc0: LSISAS1068 A0, FwRev=013292
[Qemu-devel] [PATCH qemu] xhci: Fix possible side effect from assert()
A static analysis tool called BEAM detected a possible side effect from assert() calling a helper which may change an XHCI ring on every call. This moves xhci_ring_fetch() out of assert() so that it is called whether or not debugging is enabled. Signed-off-by: Alexey Kardashevskiy --- hw/usb/hcd-xhci.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c index 976bfb0..188f954 100644 --- a/hw/usb/hcd-xhci.c +++ b/hw/usb/hcd-xhci.c @@ -2201,7 +2201,9 @@ static void xhci_kick_ep(XHCIState *xhci, unsigned int slotid, xfer->trb_count = length; for (i = 0; i < length; i++) { -assert(xhci_ring_fetch(xhci, ring, &xfer->trbs[i], NULL)); +TRBType type; +type = xhci_ring_fetch(xhci, ring, &xfer->trbs[i], NULL); +assert(type); } xfer->streamid = streamid; -- 2.5.0.rc3
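The hazard described in the patch above can be reproduced in a few lines. The sketch below is not QEMU code: DemoRing and the demo_ functions are invented stand-ins, and DEMO_ASSERT_DISABLED models how assert() expands when NDEBUG is defined (the argument is never evaluated), so both behaviours can be compared in one build.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a ring: fetching advances the read index. */
typedef struct {
    int items[4];
    size_t next;
} DemoRing;

static int demo_ring_fetch(DemoRing *ring)
{
    return ring->items[ring->next++];
}

/* Models assert() under NDEBUG: the expression is dropped entirely. */
#define DEMO_ASSERT_DISABLED(expr) ((void)0)

/* Fetch inside the assertion: the ring never advances once asserts are off. */
size_t demo_fetch_inside_assert(DemoRing *ring, size_t count)
{
    assert(count <= 4);
    for (size_t i = 0; i < count; i++) {
        DEMO_ASSERT_DISABLED(demo_ring_fetch(ring));
    }
    return ring->next;
}

/* Fetch first, assert on the result: the side effect always happens. */
size_t demo_fetch_then_assert(DemoRing *ring, size_t count)
{
    assert(count <= 4);
    for (size_t i = 0; i < count; i++) {
        int type = demo_ring_fetch(ring);
        DEMO_ASSERT_DISABLED(type);
        (void)type; /* avoid an unused-variable warning when asserts are off */
    }
    return ring->next;
}
```

With three fetches requested, the first variant leaves the read index at 0 while the second advances it to 3, which is the difference the patch removes.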
[Qemu-devel] [PATCH] virtio-blk: dataplane cleanup
No need to duplicate the check; there is already one at function entry. Cc: Stefan Hajnoczi Cc: Kevin Wolf Cc: Max Reitz Signed-off-by: Cao jin --- hw/block/dataplane/virtio-blk.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 54b9ac1..704a763 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -112,10 +112,8 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf, s->vdev = vdev; s->conf = conf; -if (conf->iothread) { -s->iothread = conf->iothread; -object_ref(OBJECT(s->iothread)); -} +s->iothread = conf->iothread; +object_ref(OBJECT(s->iothread)); s->ctx = iothread_get_aio_context(s->iothread); s->bh = aio_bh_new(s->ctx, notify_guest_bh, s); s->batch_notify_vqs = bitmap_new(conf->num_queues); -- 2.1.0
[Qemu-devel] [PULL 12/14] ppc/mmu-hash64: Remove duplicated #include statement
From: Thomas Huth No need to include error-report.h twice here. Signed-off-by: Thomas Huth Signed-off-by: David Gibson --- target-ppc/mmu-hash64.c | 1 - 1 file changed, 1 deletion(-) diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c index 82c2186..f6ffe35 100644 --- a/target-ppc/mmu-hash64.c +++ b/target-ppc/mmu-hash64.c @@ -24,7 +24,6 @@ #include "exec/helper-proto.h" #include "qemu/error-report.h" #include "sysemu/kvm.h" -#include "qemu/error-report.h" #include "kvm_ppc.h" #include "mmu-hash64.h" #include "exec/log.h" -- 2.7.4
[Qemu-devel] [PULL 07/14] dbdma: reset io->processing flag for unassigned DBDMA channel rw accesses
From: Mark Cave-Ayland Otherwise MacOS 9 hangs upon shutdown. Signed-off-by: Mark Cave-Ayland Acked-by: Benjamin Herrenschmidt Signed-off-by: David Gibson --- hw/misc/macio/mac_dbdma.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c index ef5b0a5..15452b9 100644 --- a/hw/misc/macio/mac_dbdma.c +++ b/hw/misc/macio/mac_dbdma.c @@ -778,6 +778,7 @@ static void dbdma_unassigned_rw(DBDMA_io *io) DBDMA_channel *ch = io->channel; qemu_log_mask(LOG_GUEST_ERROR, "%s: use of unassigned channel %d\n", __func__, ch->channel); +ch->io.processing = false; } static void dbdma_unassigned_flush(DBDMA_io *io) -- 2.7.4
[Qemu-devel] [PULL 02/14] dbdma: always define DBDMA_DPRINTF and enable debug with DEBUG_DBDMA
From: Mark Cave-Ayland Enabling DBDMA_DPRINTF unconditionally ensures that any errors in debug statements are picked up immediately. Signed-off-by: Mark Cave-Ayland Acked-by: Benjamin Herrenschmidt Signed-off-by: David Gibson --- hw/misc/macio/mac_dbdma.c | 15 +++ 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c index f116f9c..b6639f4 100644 --- a/hw/misc/macio/mac_dbdma.c +++ b/hw/misc/macio/mac_dbdma.c @@ -45,14 +45,13 @@ #include "sysemu/dma.h" /* debug DBDMA */ -//#define DEBUG_DBDMA +#define DEBUG_DBDMA 0 -#ifdef DEBUG_DBDMA -#define DBDMA_DPRINTF(fmt, ...) \ -do { printf("DBDMA: " fmt , ## __VA_ARGS__); } while (0) -#else -#define DBDMA_DPRINTF(fmt, ...) -#endif +#define DBDMA_DPRINTF(fmt, ...) do { \ +if (DEBUG_DBDMA) { \ +printf("DBDMA: " fmt , ## __VA_ARGS__); \ +} \ +} while (0); /* */ @@ -62,7 +61,7 @@ static DBDMAState *dbdma_from_ch(DBDMA_channel *ch) return container_of(ch, DBDMAState, channels[ch->channel]); } -#ifdef DEBUG_DBDMA +#if DEBUG_DBDMA static void dump_dbdma_cmd(dbdma_cmd *cmd) { printf("dbdma_cmd %p\n", cmd); -- 2.7.4
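The idea in the patch above, keeping the debug printf visible to the compiler and letting a constant condition remove it, can be tried in isolation. This is a minimal sketch with made-up DEMO_ names. The ", ## __VA_ARGS__" form is a GNU extension, as in the patch. Unlike the posted macro, the sketch leaves off the trailing semicolon after "while (0)" so that the macro behaves as a single statement inside if/else.

```c
#include <assert.h>
#include <stdio.h>

/* Compile-time switch: 0 disables output but keeps type checking. */
#define DEMO_DEBUG 0

/*
 * The printf() call is always parsed and its arguments checked against the
 * format string; with DEMO_DEBUG == 0 the optimizer drops the dead branch.
 */
#define DEMO_DPRINTF(fmt, ...) do { \
    if (DEMO_DEBUG) { \
        printf("DEMO: " fmt, ## __VA_ARGS__); \
    } \
} while (0)

/* Hypothetical user: returns how many 4-byte words cover len bytes. */
int demo_words_for(int len)
{
    assert(len >= 0);
    DEMO_DPRINTF("transfer of %d bytes\n", len);
    return (len + 3) / 4;
}
```

With the older #ifdef form, a stale format string or a renamed variable inside a debug statement goes unnoticed until someone enables debugging; here it is a compile error or warning in every build.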
[Qemu-devel] [PULL 06/14] dbdma: set FLUSH bit upon reception of flush command for unassigned DBDMA channels
From: Mark Cave-Ayland This fixes MacOS 9 whereby it continually flushes and polls the status bits until they are set to indicate a successful flush. Signed-off-by: Mark Cave-Ayland Acked-by: Benjamin Herrenschmidt Signed-off-by: David Gibson --- hw/misc/macio/mac_dbdma.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c index c5dd0ac..ef5b0a5 100644 --- a/hw/misc/macio/mac_dbdma.c +++ b/hw/misc/macio/mac_dbdma.c @@ -783,8 +783,18 @@ static void dbdma_unassigned_rw(DBDMA_io *io) static void dbdma_unassigned_flush(DBDMA_io *io) { DBDMA_channel *ch = io->channel; +dbdma_cmd *current = &ch->current; +uint16_t cmd; qemu_log_mask(LOG_GUEST_ERROR, "%s: use of unassigned channel %d\n", __func__, ch->channel); + +cmd = le16_to_cpu(current->command) & COMMAND_MASK; +if (cmd == OUTPUT_MORE || cmd == OUTPUT_LAST || +cmd == INPUT_MORE || cmd == INPUT_LAST) { +current->xfer_status = cpu_to_le16(ch->regs[DBDMA_STATUS] | FLUSH); +current->res_count = cpu_to_le16(io->len); +dbdma_cmdptr_save(ch); +} } void* DBDMA_init (MemoryRegion **dbdma_mem) -- 2.7.4
[Qemu-devel] [PULL 00/14] ppc-for-2.7 queue 20160718
The following changes since commit 6b92bbfe812746fe7841a24c24e6460f5359ce72:

  Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' into staging (2016-07-15 16:56:08 +0100)

are available in the git repository at:

  git://github.com/dgibson/qemu.git tags/ppc-for-2.7-20160718

for you to fetch changes up to 159d2e39a8602c369542a92573a52acb5f5f58f2:

  ppc: Yet another fix for the huge page support detection mechanism (2016-07-18 10:52:19 +1000)

ppc patch queue 2016-07-18

Here's what ought to be the final ppc pull request before the 2.7 hard freeze. This set contains a rework of the DBDMA device for Mac platforms, and some assorted cleanups and bugfixes.

Benjamin Herrenschmidt (1):
      ppc: Fix support for odd MSR combinations

Bharata B Rao (1):
      spapr: Ensure CPU cores are added contiguously and removed in LIFO order

David Gibson (1):
      vfio/spapr: Remove stale ioctl() call

Greg Kurz (2):
      spapr: fix core unplug crash
      ppc: abort if compat property contains an unknown value

Mark Cave-Ayland (6):
      dbdma: always define DBDMA_DPRINTF and enable debug with DEBUG_DBDMA
      dbdma: add per-channel debugging enabled via DEBUG_DBDMA_CHANMASK
      dbdma: fix endian of DBDMA_CMDPTR_LO during branch
      dbdma: fix load_word/store_word value endianness
      dbdma: set FLUSH bit upon reception of flush command for unassigned DBDMA channels
      dbdma: reset io->processing flag for unassigned DBDMA channel rw accesses

Paolo Bonzini (1):
      target-ppc: fix left shift overflow in hpte_page_shift

Thomas Huth (2):
      ppc/mmu-hash64: Remove duplicated #include statement
      ppc: Yet another fix for the huge page support detection mechanism

 hw/misc/macio/mac_dbdma.c   | 125 +++-
 hw/ppc/spapr_cpu_core.c     |  27 --
 hw/vfio/spapr.c             |   1 -
 target-ppc/helper_regs.h    |  46
 target-ppc/kvm.c            |  10 ++--
 target-ppc/mmu-hash64.c     |   3 +-
 target-ppc/translate_init.c |   4 +-
 7 files changed, 119 insertions(+), 97 deletions(-)
[Qemu-devel] [PULL 14/14] ppc: Yet another fix for the huge page support detection mechanism
From: Thomas Huth Commit 86b50f2e1bef ("Disable huge page support if it is not available for main RAM") already made sure that huge page support is not announced to the guest if the normal RAM of non-NUMA configurations is not backed by a huge page filesystem. However, there is one more case that can go wrong: NUMA is enabled, but the RAM of the NUMA nodes is not configured with huge page support (and only the memory of a DIMM is configured with it). When QEMU is started with the following command line for example, the Linux guest currently crashes because it is trying to use huge pages on a memory region that does not support huge pages: qemu-system-ppc64 -enable-kvm ... -m 1G,slots=4,maxmem=32G -object \ memory-backend-file,policy=default,mem-path=/hugepages,size=1G,id=mem-mem1 \ -device pc-dimm,id=dimm-mem1,memdev=mem-mem1 -smp 2 \ -numa node,nodeid=0 -numa node,nodeid=1 To fix this issue, we've got to make sure to disable huge page support, too, when there is a NUMA node that is not using a memory backend with huge page support. Fixes: 86b50f2e1befc33407bdfeb6f45f7b0d2439a740 Signed-off-by: Thomas Huth Signed-off-by: David Gibson --- target-ppc/kvm.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index 884d564..7a8f555 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -389,12 +389,16 @@ static long getrampagesize(void) object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize); -if (hpsize == LONG_MAX) { +if (hpsize == LONG_MAX || hpsize == getpagesize()) { return getpagesize(); } -if (nb_numa_nodes == 0 && hpsize > getpagesize()) { -/* No NUMA nodes and normal RAM without -mem-path ==> no huge pages! */ +/* If NUMA is disabled or the NUMA nodes are not backed with a + * memory-backend, then there is at least one node using "normal" + * RAM. And since normal RAM has not been configured with "-mem-path" + * (what we've checked earlier here already), we can not use huge pages! 
+ */ +if (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL) { static bool warned; if (!warned) { error_report("Huge page support disabled (n/a for main memory)."); -- 2.7.4
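The decision the patched getrampagesize() makes can be isolated into a pure function for discussion. This is a sketch under assumptions: demo_ram_page_size() and its parameters are invented, node0_has_memdev stands in for "numa_info[0].node_memdev != NULL", and the return of the base page size after the warning is assumed, since that part of the function lies outside the quoted hunk.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * hpsize:           largest page size supported by all memory backends,
 *                   or LONG_MAX if no backend was found
 * base_pagesize:    what getpagesize() would return on the host
 * nb_numa_nodes:    number of configured NUMA nodes
 * node0_has_memdev: whether node 0 is backed by a memory backend object
 */
long demo_ram_page_size(long hpsize, long base_pagesize,
                        int nb_numa_nodes, bool node0_has_memdev)
{
    assert(base_pagesize > 0);

    /* No backend at all, or backends that only offer normal pages. */
    if (hpsize == LONG_MAX || hpsize == base_pagesize) {
        return base_pagesize;
    }
    /* Some node still uses normal RAM, so huge pages must not be announced. */
    if (nb_numa_nodes == 0 || !node0_has_memdev) {
        return base_pagesize;
    }
    return hpsize;
}
```

The command line from the commit message corresponds to the second check: two NUMA nodes without a memdev plus one huge-page-backed DIMM, where the old condition (nb_numa_nodes == 0) did not fire.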
[Qemu-devel] [PULL 09/14] vfio/spapr: Remove stale ioctl() call
This ioctl() call to VFIO_IOMMU_SPAPR_TCE_REMOVE was left over from an earlier version of the code and has since been folded into vfio_spapr_remove_window(). It wasn't caught because although the argument structure has been removed, the libc function remove() means this didn't trigger a compile failure. The ioctl() was also almost certain to fail silently and harmlessly with the bogus argument, so this wasn't caught in testing. Suggested-by: Paolo Bonzini Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy --- hw/vfio/spapr.c | 1 - 1 file changed, 1 deletion(-) diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index 0af3423..7443d34 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -177,7 +177,6 @@ int vfio_spapr_create_window(VFIOContainer *container, error_report("Host doesn't support DMA window at %"HWADDR_PRIx", must be %"PRIx64, section->offset_within_address_space, (uint64_t)create.start_addr); -ioctl(container->fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove); return -EINVAL; } trace_vfio_spapr_create_window(create.page_shift, -- 2.7.4
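Why the stale call kept compiling is easy to demonstrate outside QEMU. In the sketch below, demo_ioctl() is a made-up variadic stand-in for ioctl() that always fails, and 0x1234 is an arbitrary request number. The point is only that, with no local variable called "remove" in scope, "&remove" quietly resolves to the C library function declared in <stdio.h>.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for ioctl(): variadic, rejects every request. */
static int demo_ioctl(int fd, unsigned long request, ...)
{
    (void)fd;
    (void)request;
    return -1;
}

/*
 * No struct or variable named "remove" exists here, yet the call compiles:
 * the identifier names int remove(const char *) from <stdio.h>.
 */
int demo_stale_call(void)
{
    int (*fn)(const char *) = &remove;

    assert(fn != NULL);
    return demo_ioctl(-1, 0x1234UL, fn);
}
```

Because the third argument of a variadic function is not type checked, the compiler has no reason to complain, and the failing return value is easy to overlook when the caller is already on an error path.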
[Qemu-devel] [PULL 03/14] dbdma: add per-channel debugging enabled via DEBUG_DBDMA_CHANMASK
From: Mark Cave-Ayland By default large amounts of DBDMA debugging are produced when often it is just 1 or 2 channels that are of interest. Introduce DEBUG_DBDMA_CHANMASK to allow the developer to select the channels of interest at compile time, and then further add the extra channel information to each debug statement where possible. Also clearly mark the start/end of DBDMA_run_bh to allow tracking the bottom half execution. Signed-off-by: Mark Cave-Ayland Acked-by: Benjamin Herrenschmidt Signed-off-by: David Gibson --- hw/misc/macio/mac_dbdma.c | 75 ++- 1 file changed, 42 insertions(+), 33 deletions(-) diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c index b6639f4..e692312 100644 --- a/hw/misc/macio/mac_dbdma.c +++ b/hw/misc/macio/mac_dbdma.c @@ -46,6 +46,7 @@ /* debug DBDMA */ #define DEBUG_DBDMA 0 +#define DEBUG_DBDMA_CHANMASK ((1ull << DBDMA_CHANNELS) - 1) #define DBDMA_DPRINTF(fmt, ...) do { \ if (DEBUG_DBDMA) { \ @@ -53,6 +54,14 @@ } \ } while (0); +#define DBDMA_DPRINTFCH(ch, fmt, ...) 
do { \ +if (DEBUG_DBDMA) { \ +if ((1ul << (ch)->channel) & DEBUG_DBDMA_CHANMASK) { \ +printf("DBDMA[%02x]: " fmt , (ch)->channel, ## __VA_ARGS__); \ +} \ +} \ +} while (0); + /* */ @@ -79,26 +88,26 @@ static void dump_dbdma_cmd(dbdma_cmd *cmd) #endif static void dbdma_cmdptr_load(DBDMA_channel *ch) { -DBDMA_DPRINTF("dbdma_cmdptr_load 0x%08x\n", - ch->regs[DBDMA_CMDPTR_LO]); +DBDMA_DPRINTFCH(ch, "dbdma_cmdptr_load 0x%08x\n", +ch->regs[DBDMA_CMDPTR_LO]); dma_memory_read(&address_space_memory, ch->regs[DBDMA_CMDPTR_LO], &ch->current, sizeof(dbdma_cmd)); } static void dbdma_cmdptr_save(DBDMA_channel *ch) { -DBDMA_DPRINTF("dbdma_cmdptr_save 0x%08x\n", - ch->regs[DBDMA_CMDPTR_LO]); -DBDMA_DPRINTF("xfer_status 0x%08x res_count 0x%04x\n", - le16_to_cpu(ch->current.xfer_status), - le16_to_cpu(ch->current.res_count)); +DBDMA_DPRINTFCH(ch, "dbdma_cmdptr_save 0x%08x\n", +ch->regs[DBDMA_CMDPTR_LO]); +DBDMA_DPRINTFCH(ch, "xfer_status 0x%08x res_count 0x%04x\n", +le16_to_cpu(ch->current.xfer_status), +le16_to_cpu(ch->current.res_count)); dma_memory_write(&address_space_memory, ch->regs[DBDMA_CMDPTR_LO], &ch->current, sizeof(dbdma_cmd)); } static void kill_channel(DBDMA_channel *ch) { -DBDMA_DPRINTF("kill_channel\n"); +DBDMA_DPRINTFCH(ch, "kill_channel\n"); ch->regs[DBDMA_STATUS] |= DEAD; ch->regs[DBDMA_STATUS] &= ~ACTIVE; @@ -114,7 +123,7 @@ static void conditional_interrupt(DBDMA_channel *ch) uint32_t status; int cond; -DBDMA_DPRINTF("%s\n", __func__); +DBDMA_DPRINTFCH(ch, "%s\n", __func__); intr = le16_to_cpu(current->command) & INTR_MASK; @@ -123,7 +132,7 @@ static void conditional_interrupt(DBDMA_channel *ch) return; case INTR_ALWAYS: /* always interrupt */ qemu_irq_raise(ch->irq); -DBDMA_DPRINTF("%s: raise\n", __func__); +DBDMA_DPRINTFCH(ch, "%s: raise\n", __func__); return; } @@ -138,13 +147,13 @@ static void conditional_interrupt(DBDMA_channel *ch) case INTR_IFSET: /* intr if condition bit is 1 */ if (cond) { qemu_irq_raise(ch->irq); -DBDMA_DPRINTF("%s: raise\n", 
__func__); +DBDMA_DPRINTFCH(ch, "%s: raise\n", __func__); } return; case INTR_IFCLR: /* intr if condition bit is 0 */ if (!cond) { qemu_irq_raise(ch->irq); -DBDMA_DPRINTF("%s: raise\n", __func__); +DBDMA_DPRINTFCH(ch, "%s: raise\n", __func__); } return; } @@ -158,7 +167,7 @@ static int conditional_wait(DBDMA_channel *ch) uint32_t status; int cond; -DBDMA_DPRINTF("conditional_wait\n"); +DBDMA_DPRINTFCH(ch, "conditional_wait\n"); wait = le16_to_cpu(current->command) & WAIT_MASK; @@ -217,7 +226,7 @@ static void conditional_branch(DBDMA_channel *ch) uint32_t status; int cond; -DBDMA_DPRINTF("conditional_branch\n"); +DBDMA_DPRINTFCH(ch, "conditional_branch\n"); /* check if we must branch */ @@ -262,7 +271,7 @@ static void dbdma_end(DBDMA_io *io) DBDMA_channel *ch = io->channel; dbdma_cmd *current = &ch->current; -DBDMA_DPRINTF("%s\n", __func__); +DBDMA_DPRINTFCH(ch, "%s\n", __func__); if (conditional_wait(ch)) goto wait; @@ -288,13 +297,13 @@ wait: static void start_output(DBDMA_channel *ch, int key, uint32_t addr, uint16_t req_count, int is_last) { -DBDMA_DPRINTF("start_output\n"); +DBDMA_DPRINTFCH(ch, "start_output\n"); /* KEY_REGS, KEY_DEVICE and KEY_STREAM * are not implemented in the mac-io chip */ -DBDMA_D
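The per-channel filter that DBDMA_DPRINTFCH applies can be sketched outside QEMU. This is a minimal sketch with hypothetical helper names (all_channels_mask, channel_debug_enabled); the channel count of 32 used in the usage note is an assumption for illustration and is not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with one bit per channel, mirroring ((1ull << DBDMA_CHANNELS) - 1). */
static uint64_t all_channels_mask(unsigned channels)
{
    assert(channels > 0 && channels <= 64);
    return channels == 64 ? UINT64_MAX : (UINT64_C(1) << channels) - 1;
}

/* True if debug output for this channel is selected by the mask. */
static bool channel_debug_enabled(uint64_t mask, unsigned channel)
{
    assert(channel < 64);
    return ((mask >> channel) & 1u) != 0;
}
```

To watch only one or two channels, a developer would replace the all-ones default with an OR of the relevant bits, for example (UINT64_C(1) << 0x16) | (UINT64_C(1) << 0x0d); with an assumed 32 channels, all_channels_mask(32) is the equivalent of the default.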
[Qemu-devel] [PULL 01/14] spapr: fix core unplug crash
From: Greg Kurz If the host has 8 threads/core and the guest is started with: -smp cores=1,threads=4,maxcpus=12 It is possible to crash QEMU by doing: (qemu) device_add host-spapr-cpu-core,core-id=16,id=foo (qemu) device_del foo Segmentation fault This happens because spapr_core_unplug() assumes cpu_dt_id == core_id. As long as cpu_dt_id is derived from the non-table cpu_index, this is only true when you plug cores with contiguous ids. It is safer to be consistent: the DR connector was created with an index that is immediately written to cc->core_id, and spapr_core_plug() also relies on cc->core_id. Let's use it also in spapr_core_unplug(). Signed-off-by: Greg Kurz Reviewed-by: Bharata B Rao Signed-off-by: David Gibson --- hw/ppc/spapr_cpu_core.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c index 9347f07..bc52b3c 100644 --- a/hw/ppc/spapr_cpu_core.c +++ b/hw/ppc/spapr_cpu_core.c @@ -126,11 +126,9 @@ static void spapr_core_release(DeviceState *dev, void *opaque) void spapr_core_unplug(HotplugHandler *hotplug_dev, DeviceState *dev, Error **errp) { -sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev)); -PowerPCCPU *cpu = POWERPC_CPU(core->threads); -int id = ppc_get_vcpu_dt_id(cpu); +CPUCore *cc = CPU_CORE(dev); sPAPRDRConnector *drc = -spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id); +spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cc->core_id); sPAPRDRConnectorClass *drck; Error *local_err = NULL; -- 2.7.4
[Qemu-devel] [PULL 11/14] ppc: abort if compat property contains an unknown value
From: Greg Kurz It is not possible to set the compat property to an unknown value with powerpc_set_compat(). Something must have gone terribly wrong in QEMU, if we detect an "Internal error" in powerpc_get_compat(). Let's abort then. This patch also drops the "max_compat ? *max_compat : -1" construct. It is useless since max_compat is dereferenced a few lines above. Signed-off-by: Greg Kurz Signed-off-by: David Gibson --- target-ppc/translate_init.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c index 7cb7842..5ecafc7 100644 --- a/target-ppc/translate_init.c +++ b/target-ppc/translate_init.c @@ -8446,8 +8446,8 @@ static void powerpc_get_compat(Object *obj, Visitor *v, const char *name, case 0: break; default: -error_setg(errp, "Internal error: compat is set to %x", - max_compat ? *max_compat : -1); +error_report("Internal error: compat is set to %x", *max_compat); +abort(); break; } -- 2.7.4
[Qemu-devel] [PULL 10/14] spapr: Ensure CPU cores are added contiguously and removed in LIFO order
From: Bharata B Rao If CPU core addition or removal is allowed in random order leading to holes in the core id range (and hence in the cpu_index range), migration can fail as migration with holes in cpu_index range isn't yet handled correctly. Prevent this situation by enforcing the addition in contiguous order and removal in LIFO order so that we never end up with holes in cpu_index range. Signed-off-by: Bharata B Rao Signed-off-by: David Gibson --- hw/ppc/spapr_cpu_core.c | 21 - 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c index bc52b3c..4bfc96b 100644 --- a/hw/ppc/spapr_cpu_core.c +++ b/hw/ppc/spapr_cpu_core.c @@ -126,12 +126,23 @@ static void spapr_core_release(DeviceState *dev, void *opaque) void spapr_core_unplug(HotplugHandler *hotplug_dev, DeviceState *dev, Error **errp) { +sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev)); CPUCore *cc = CPU_CORE(dev); sPAPRDRConnector *drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cc->core_id); sPAPRDRConnectorClass *drck; Error *local_err = NULL; +int smt = kvmppc_smt_threads(); +int index = cc->core_id / smt; +int spapr_max_cores = max_cpus / smp_threads; +int i; +for (i = spapr_max_cores - 1; i > index; i--) { +if (spapr->cores[i]) { +error_setg(errp, "core-id %d should be removed first", i * smt); +return; +} +} g_assert(drc); drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc); @@ -214,7 +225,7 @@ void spapr_core_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev, sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(OBJECT(hotplug_dev)); sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev)); int spapr_max_cores = max_cpus / smp_threads; -int index; +int index, i; int smt = kvmppc_smt_threads(); Error *local_err = NULL; CPUCore *cc = CPU_CORE(dev); @@ -252,6 +263,14 @@ void spapr_core_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev, goto out; } +for (i = 0; i < index; i++) { +if (!spapr->cores[i]) { +error_setg(&local_err, 
"core-id %d should be added first", + i * smt); +goto out; +} +} + out: g_free(base_core_type); error_propagate(errp, local_err); -- 2.7.4
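The two ordering checks added by this patch can be sketched on a plain array of slots. This is a minimal sketch with hypothetical names (first_missing_below, last_present_above) and a bool array standing in for spapr->cores[]; it returns slot indexes, whereas the patch reports i * smt as the core-id in its error messages.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * cores[i] is true when slot i is occupied. Returns -1 if the core at
 * "index" may be plugged, otherwise the lowest empty slot below it,
 * which has to be filled first.
 */
static int first_missing_below(const bool *cores, int index)
{
    for (int i = 0; i < index; i++) {
        if (!cores[i]) {
            return i;
        }
    }
    return -1;
}

/*
 * Returns -1 if the core at "index" may be unplugged, otherwise the
 * highest occupied slot above it, which has to be removed first.
 */
static int last_present_above(const bool *cores, int max_cores, int index)
{
    assert(index < max_cores);
    for (int i = max_cores - 1; i > index; i--) {
        if (cores[i]) {
            return i;
        }
    }
    return -1;
}
```

Together the two checks keep the occupied slots a prefix of the array, which is the "no holes in the cpu_index range" property the commit message asks for.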
[Qemu-devel] [PULL 13/14] target-ppc: fix left shift overflow in hpte_page_shift
From: Paolo Bonzini ps->pte_enc is a 32-bit value, which is shifted left and then compared to a 64-bit value. It needs a cast before the shift. Reported by Coverity. Signed-off-by: Paolo Bonzini Signed-off-by: David Gibson --- target-ppc/mmu-hash64.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c index f6ffe35..5de1358 100644 --- a/target-ppc/mmu-hash64.c +++ b/target-ppc/mmu-hash64.c @@ -478,7 +478,7 @@ static unsigned hpte_page_shift(const struct ppc_one_seg_page_size *sps, mask = ((1ULL << ps->page_shift) - 1) & HPTE64_R_RPN; -if ((pte1 & mask) == (ps->pte_enc << HPTE64_R_RPN_SHIFT)) { +if ((pte1 & mask) == ((uint64_t)ps->pte_enc << HPTE64_R_RPN_SHIFT)) { return ps->page_shift; } } -- 2.7.4
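The effect of the missing cast can be reproduced with two small functions. This is a minimal sketch with hypothetical names (shift_without_cast, shift_with_cast); the shift count of 12 used in the usage note is for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Shift performed in 32 bits: high bits are lost before widening. */
static uint64_t shift_without_cast(uint32_t enc, unsigned shift)
{
    assert(shift < 32);
    uint32_t narrow = enc << shift;   /* wraps modulo 2^32 */
    return narrow;
}

/* Widen first, then shift: all bits survive. */
static uint64_t shift_with_cast(uint32_t enc, unsigned shift)
{
    assert(shift < 64);
    return (uint64_t)enc << shift;
}
```

For small encodings both forms agree, which is why such a bug can go unnoticed: shifting 0x38 by 12 gives 0x38000 either way, while shifting 0x00100000 by 12 gives 0 without the cast and 0x100000000 with it.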
[Qemu-devel] [PULL 08/14] ppc: Fix support for odd MSR combinations
From: Benjamin Herrenschmidt MacOS uses an architecturally illegal MSR combination that seems nonetheless supported by 32-bit processors, which is to have MSR[PR]=1 and one or more of MSR[DR/IR/EE]=0. This adds support for it. To work properly we need to also properly include support for PR=1,{I,D}R=0 to the MMU index used by the qemu TLB. Signed-off-by: Benjamin Herrenschmidt Tested-by: Mark Cave-Ayland Signed-off-by: David Gibson --- target-ppc/helper_regs.h | 46 ++ 1 file changed, 22 insertions(+), 24 deletions(-) diff --git a/target-ppc/helper_regs.h b/target-ppc/helper_regs.h index 8d38828..3d279f1 100644 --- a/target-ppc/helper_regs.h +++ b/target-ppc/helper_regs.h @@ -41,17 +41,19 @@ static inline void hreg_swap_gpr_tgpr(CPUPPCState *env) static inline void hreg_compute_mem_idx(CPUPPCState *env) { -/* This is our encoding for server processors +/* This is our encoding for server processors. The architecture + * specifies that there is no such thing as userspace with + * translation off, however it appears that MacOS does it and + * some 32-bit CPUs support it. Weird... * * 0 = Guest User space virtual mode * 1 = Guest Kernel space virtual mode - * 2 = Guest Kernel space real mode - * 3 = HV User space virtual mode - * 4 = HV Kernel space virtual mode - * 5 = HV Kernel space real mode - * - * The combination PR=1 IR&DR=0 is invalid, we will treat - * it as IR=DR=1 + * 2 = Guest User space real mode + * 3 = Guest Kernel space real mode + * 4 = HV User space virtual mode + * 5 = HV Kernel space virtual mode + * 6 = HV User space real mode + * 7 = HV Kernel space real mode * * For BookE, we need 8 MMU modes as follow: * @@ -71,20 +73,11 @@ static inline void hreg_compute_mem_idx(CPUPPCState *env) env->immu_idx += msr_gs ? 4 : 0; env->dmmu_idx += msr_gs ? 
4 : 0; } else { -/* First calucalte a base value independent of HV */ -if (msr_pr != 0) { -/* User space, ignore IR and DR */ -env->immu_idx = env->dmmu_idx = 0; -} else { -/* Kernel, setup a base I/D value */ -env->immu_idx = msr_ir ? 1 : 2; -env->dmmu_idx = msr_dr ? 1 : 2; -} -/* Then offset it for HV */ -if (msr_hv) { -env->immu_idx += 3; -env->dmmu_idx += 3; -} +env->immu_idx = env->dmmu_idx = msr_pr ? 0 : 1; +env->immu_idx += msr_ir ? 0 : 2; +env->dmmu_idx += msr_dr ? 0 : 2; +env->immu_idx += msr_hv ? 4 : 0; +env->dmmu_idx += msr_hv ? 4 : 0; } } @@ -136,8 +129,13 @@ static inline int hreg_store_msr(CPUPPCState *env, target_ulong value, /* Change the exception prefix on PowerPC 601 */ env->excp_prefix = ((value >> MSR_EP) & 1) * 0xFFF0; } -/* If PR=1 then EE, IR and DR must be 1 */ -if ((value >> MSR_PR) & 1) { +/* If PR=1 then EE, IR and DR must be 1 + * + * Note: We only enforce this on 64-bit processors. It appears that + * 32-bit implementations supports PR=1 and EE/DR/IR=0 and MacOS + * exploits it. + */ +if ((env->insns_flags & PPC_64B) && ((value >> MSR_PR) & 1)) { value |= (1 << MSR_EE) | (1 << MSR_DR) | (1 << MSR_IR); } #endif -- 2.7.4
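The new server-processor encoding from the comment in this patch can be checked against the arithmetic that replaces the old if/else. This is a minimal sketch: server_mmu_idx() is a hypothetical function that takes the MSR bits as plain booleans instead of reading msr_pr, msr_ir/msr_dr and msr_hv from the CPU state, and it computes a single index rather than separate instruction and data indexes.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Index layout from the patched comment (server processors):
 *   bit 0: 0 = user (PR=1), 1 = kernel (PR=0)
 *   bit 1: set when translation is off (real mode)
 *   bit 2: set in hypervisor mode (HV=1)
 */
static int server_mmu_idx(bool pr, bool translation_on, bool hv)
{
    int idx = pr ? 0 : 1;
    idx += translation_on ? 0 : 2;
    idx += hv ? 4 : 0;
    return idx;
}
```

The previously invalid combination PR=1 with translation off now gets its own indexes (2 for guest, 6 for HV) instead of being folded into the virtual-mode user index.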
[Qemu-devel] [PULL 04/14] dbdma: fix endian of DBDMA_CMDPTR_LO during branch
From: Mark Cave-Ayland The current DBDMA command is stored in little-endian format, so make sure we convert it to match our CPU when updating the DBDMA_CMDPTR_LO register. Signed-off-by: Mark Cave-Ayland Acked-by: Benjamin Herrenschmidt Signed-off-by: David Gibson --- hw/misc/macio/mac_dbdma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c index e692312..c4ee381 100644 --- a/hw/misc/macio/mac_dbdma.c +++ b/hw/misc/macio/mac_dbdma.c @@ -213,7 +213,7 @@ static void branch(DBDMA_channel *ch) { dbdma_cmd *current = &ch->current; -ch->regs[DBDMA_CMDPTR_LO] = current->cmd_dep; +ch->regs[DBDMA_CMDPTR_LO] = le32_to_cpu(current->cmd_dep); ch->regs[DBDMA_STATUS] |= BT; dbdma_cmdptr_load(ch); } -- 2.7.4
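What the added le32_to_cpu() call does to cmd_dep can be sketched portably. This is a minimal sketch, not QEMU's implementation: sketch_le32_to_cpu() and raw_from_le_bytes() are hypothetical helpers that decode bytes explicitly, so they give the same result on little-endian and big-endian hosts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Portable stand-in for le32_to_cpu(): interpret the in-memory bytes of
 * "le" as a little-endian 32-bit value, whatever the host byte order.
 */
static uint32_t sketch_le32_to_cpu(uint32_t le)
{
    uint8_t b[4];

    memcpy(b, &le, sizeof(b));
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Build the in-memory representation of a field stored little-endian. */
static uint32_t raw_from_le_bytes(uint8_t b0, uint8_t b1,
                                  uint8_t b2, uint8_t b3)
{
    const uint8_t b[4] = { b0, b1, b2, b3 };
    uint32_t raw;

    memcpy(&raw, b, sizeof(raw));
    return raw;
}
```

On a little-endian host the conversion is a no-op, so a missing call only shows up on big-endian hosts, where the branch target would otherwise be byte-swapped.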
[Qemu-devel] [PULL 05/14] dbdma: fix load_word/store_word value endianness
From: Mark Cave-Ayland The values to read/write to/from physical memory are copied directly to the physical address with no endian swapping required. Also add some extra information to debugging output while we are here. Signed-off-by: Mark Cave-Ayland Acked-by: Benjamin Herrenschmidt Signed-off-by: David Gibson --- hw/misc/macio/mac_dbdma.c | 24 +--- 1 file changed, 5 insertions(+), 19 deletions(-) diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c index c4ee381..c5dd0ac 100644 --- a/hw/misc/macio/mac_dbdma.c +++ b/hw/misc/macio/mac_dbdma.c @@ -350,9 +350,8 @@ static void load_word(DBDMA_channel *ch, int key, uint32_t addr, uint16_t len) { dbdma_cmd *current = &ch->current; -uint32_t val; -DBDMA_DPRINTFCH(ch, "load_word\n"); +DBDMA_DPRINTFCH(ch, "load_word %d bytes, addr=%08x\n", len, addr); /* only implements KEY_SYSTEM */ @@ -362,14 +361,7 @@ static void load_word(DBDMA_channel *ch, int key, uint32_t addr, return; } -dma_memory_read(&address_space_memory, addr, &val, len); - -if (len == 2) -val = (val << 16) | (current->cmd_dep & 0x); -else if (len == 1) -val = (val << 24) | (current->cmd_dep & 0x00ff); - -current->cmd_dep = val; +dma_memory_read(&address_space_memory, addr, &current->cmd_dep, len); if (conditional_wait(ch)) goto wait; @@ -389,9 +381,9 @@ static void store_word(DBDMA_channel *ch, int key, uint32_t addr, uint16_t len) { dbdma_cmd *current = &ch->current; -uint32_t val; -DBDMA_DPRINTFCH(ch, "store_word\n"); +DBDMA_DPRINTFCH(ch, "store_word %d bytes, addr=%08x pa=%x\n", +len, addr, le32_to_cpu(current->cmd_dep)); /* only implements KEY_SYSTEM */ @@ -401,13 +393,7 @@ static void store_word(DBDMA_channel *ch, int key, uint32_t addr, return; } -val = current->cmd_dep; -if (len == 2) -val >>= 16; -else if (len == 1) -val >>= 24; - -dma_memory_write(&address_space_memory, addr, &val, len); +dma_memory_write(&address_space_memory, addr, &current->cmd_dep, len); if (conditional_wait(ch)) goto wait; -- 2.7.4
Re: [Qemu-devel] [RFC 5/6] target-ppc: add modulo word operations
David Gibson writes: > On Tue, Jul 12, 2016 at 11:33:21PM +0530, Nikunj A Dadhania wrote: >> Adding following instructions: >> >> moduw: Modulo Unsigned Word >> modsw: Modulo Signed Word >> >> Signed-off-by: Nikunj A Dadhania > > Hrm.. any reason you're not using the TCG inbuilt remainder ops > (tcg_gen_rem_i32() etc.)? I have an updated version that uses the inbuilt ops. I was searching for modulo operations, did not find any, and so wrote my own. I found out later that the op is called tcg_gen_rem. I will send it in the next version. Regards, Nikunj
Re: [Qemu-devel] [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
On Sun, Jul 17, 2016 at 9:37 PM, Namhyung Kim wrote: > The virtio pstore driver provides interface to the pstore subsystem so > that the guest kernel's log/dump message can be saved on the host > machine. Users can access the log file directly on the host, or on the > guest at the next boot using pstore filesystem. It currently deals with > kernel log (printk) buffer only, but we can extend it to have other > information (like ftrace dump) later. > > It supports legacy PCI device using single order-2 page buffer. As all > operation of pstore is synchronous, it would be fine IMHO. However I > don't know how to make write operation synchronous since it's called > with a spinlock held (from any context including NMI). > > Cc: Paolo Bonzini > Cc: Radim Krčmář > Cc: "Michael S. Tsirkin" > Cc: Anthony Liguori > Cc: Anton Vorontsov > Cc: Colin Cross > Cc: Kees Cook > Cc: Tony Luck > Cc: Steven Rostedt > Cc: Ingo Molnar > Cc: Minchan Kim > Cc: k...@vger.kernel.org > Cc: qemu-devel@nongnu.org > Cc: virtualizat...@lists.linux-foundation.org > Signed-off-by: Namhyung Kim This looks great to me! I'd love to use this in qemu. (Right now I go through hoops to use the ramoops backend for testing.) Reviewed-by: Kees Cook Notes below... > --- > drivers/virtio/Kconfig | 10 ++ > drivers/virtio/Makefile| 1 + > drivers/virtio/virtio_pstore.c | 317 > + > include/uapi/linux/Kbuild | 1 + > include/uapi/linux/virtio_ids.h| 1 + > include/uapi/linux/virtio_pstore.h | 53 +++ > 6 files changed, 383 insertions(+) > create mode 100644 drivers/virtio/virtio_pstore.c > create mode 100644 include/uapi/linux/virtio_pstore.h > > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig > index 77590320d44c..8f0e6c796c12 100644 > --- a/drivers/virtio/Kconfig > +++ b/drivers/virtio/Kconfig > @@ -58,6 +58,16 @@ config VIRTIO_INPUT > > If unsure, say M. 
> > +config VIRTIO_PSTORE > + tristate "Virtio pstore driver" > + depends on VIRTIO > + depends on PSTORE > + ---help--- > +This driver supports virtio pstore devices to save/restore > +panic and oops messages on the host. > + > +If unsure, say M. > + > config VIRTIO_MMIO > tristate "Platform bus driver for memory mapped virtio devices" > depends on HAS_IOMEM && HAS_DMA > diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile > index 41e30e3dc842..bee68cb26d48 100644 > --- a/drivers/virtio/Makefile > +++ b/drivers/virtio/Makefile > @@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o > virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o > obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o > obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o > +obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o > diff --git a/drivers/virtio/virtio_pstore.c b/drivers/virtio/virtio_pstore.c > new file mode 100644 > index ..6fe62c0f1508 > --- /dev/null > +++ b/drivers/virtio/virtio_pstore.c > @@ -0,0 +1,317 @@ > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#define VIRT_PSTORE_ORDER2 > +#define VIRT_PSTORE_BUFSIZE (4096 << VIRT_PSTORE_ORDER) > + > +struct virtio_pstore { > + struct virtio_device*vdev; > + struct virtqueue*vq; > + struct pstore_info pstore; > + struct virtio_pstore_hdr hdr; > + size_t buflen; > + u64 id; > + > + /* Waiting for host to ack */ > + wait_queue_head_t acked; > +}; > + > +static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id > type) > +{ > + u16 ret; > + > + switch (type) { > + case PSTORE_TYPE_DMESG: > + ret = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_DMESG); > + break; > + default: > + ret = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN); > + break; > + } I would love to see this support PSTORE_TYPE_CONSOLE too. It should be relatively easy to add: I think it'd just be another virtio command? 
> + > + return ret; > +} > + > +static enum pstore_type_id from_virtio_type(struct virtio_pstore *vps, u16 > type) > +{ > + enum pstore_type_id ret; > + > + switch (virtio16_to_cpu(vps->vdev, type)) { > + case VIRTIO_PSTORE_TYPE_DMESG: > + ret = PSTORE_TYPE_DMESG; > + break; > + default: > + ret = PSTORE_TYPE_UNKNOWN; > + break; > + } > + > + return ret; > +} > + > +static void virtpstore_ack(struct virtqueue *vq) > +{ > + struct virtio_pstore *vps = vq->vdev->priv; > + > + wake_up(&vps->acked); > +} > + > +static int virt_pstore_open(struct pstore_info *psi) > +{ > + struct virtio_pstore *vps = psi->data; > + stru
Re: [Qemu-devel] [RFC 1/6] target-ppc: Introduce Power9 family
David Gibson writes: > [ Unknown signature status ] > On Tue, Jul 12, 2016 at 11:33:17PM +0530, Nikunj A Dadhania wrote: >> From: "Aneesh Kumar K.V" >> >> Signed-off-by: Aneesh Kumar K.V >> [ rebased and added POWER9 alias ] >> Signed-off-by: Nikunj A Dadhania >> --- >> target-ppc/cpu-models.c | 5 +++ >> target-ppc/cpu-models.h | 2 ++ >> target-ppc/cpu-qom.h| 7 >> target-ppc/mmu_helper.c | 3 +- >> target-ppc/translate_init.c | 85 >> - >> 5 files changed, 100 insertions(+), 2 deletions(-) >> >> diff --git a/target-ppc/cpu-models.c b/target-ppc/cpu-models.c >> index 5209e63..901cf40 100644 >> --- a/target-ppc/cpu-models.c >> +++ b/target-ppc/cpu-models.c >> @@ -1147,6 +1147,10 @@ >> "POWER8NVL v1.0") >> POWERPC_DEF("970_v2.2", CPU_POWERPC_970_v22,970, >> "PowerPC 970 v2.2") >> + >> +POWERPC_DEF("POWER9_v1.0", CPU_POWERPC_POWER9_BASE,POWER9, >> +"POWER9 v1.0") >> + >> POWERPC_DEF("970fx_v1.0",CPU_POWERPC_970FX_v10, 970, >> "PowerPC 970FX v1.0 (G5)") >> POWERPC_DEF("970fx_v2.0",CPU_POWERPC_970FX_v20, 970, >> @@ -1395,6 +1399,7 @@ PowerPCCPUAlias ppc_cpu_aliases[] = { >> { "POWER8E", "POWER8E_v2.1" }, >> { "POWER8", "POWER8_v2.0" }, >> { "POWER8NVL", "POWER8NVL_v1.0" }, >> +{ "POWER9", "POWER9_v1.0" }, >> { "970", "970_v2.2" }, >> { "970fx", "970fx_v3.1" }, >> { "970mp", "970mp_v1.1" }, >> diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h >> index f21a44c..beeaaba 100644 >> --- a/target-ppc/cpu-models.h >> +++ b/target-ppc/cpu-models.h >> @@ -562,6 +562,8 @@ enum { >> CPU_POWERPC_POWER8_v20 = 0x004D0200, >> CPU_POWERPC_POWER8NVL_BASE = 0x004C, >> CPU_POWERPC_POWER8NVL_v10 = 0x004C0100, >> +CPU_POWERPC_POWER9_BASE= 0x004E, >> +CPU_POWERPC_POWER9_MAM = 0x004E0100, >> CPU_POWERPC_970_v22= 0x00390202, >> CPU_POWERPC_970FX_v10 = 0x00391100, >> CPU_POWERPC_970FX_v20 = 0x003C0200, >> diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h >> index 2864105..df2fb65 100644 >> --- a/target-ppc/cpu-qom.h >> +++ b/target-ppc/cpu-qom.h >> @@ -86,6 +86,13 @@ enum 
powerpc_mmu_t { >> POWERPC_MMU_2_07 = POWERPC_MMU_64 | POWERPC_MMU_1TSEG >> | POWERPC_MMU_64K >> | POWERPC_MMU_AMR | 0x0004, >> +/* for now , We can add radix later if needed */ > > I'm guessing this means you're only thinking about the guest-side > presentation of the P9 MMU at this point? IIUC the host side > presentation is so different that sharing any constants with pre-P9 > MMUs probably doesn't make sense. > > I'm not immediately sure how we should make this distinction in the > target-ppc code, since these values are supposed to belong to the CPU > regardless of operating mode. Currently, this is just a placeholder patch and is not close to being committed. I needed these family defines in order to add the new instructions. Regards, Nikunj
Re: [Qemu-devel] [PATCH] virtio-blk: dataplane cleanup
On Mon, 07/18 12:05, Cao jin wrote: > No need duplicate the judgment, there is one in function entry. > > Cc: Stefan Hajnoczi > Cc: Kevin Wolf > Cc: Max Reitz > Signed-off-by: Cao jin > --- > hw/block/dataplane/virtio-blk.c | 6 ++ > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c > index 54b9ac1..704a763 100644 > --- a/hw/block/dataplane/virtio-blk.c > +++ b/hw/block/dataplane/virtio-blk.c > @@ -112,10 +112,8 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, > VirtIOBlkConf *conf, > s->vdev = vdev; > s->conf = conf; > > -if (conf->iothread) { > -s->iothread = conf->iothread; > -object_ref(OBJECT(s->iothread)); > -} > +s->iothread = conf->iothread; > +object_ref(OBJECT(s->iothread)); > s->ctx = iothread_get_aio_context(s->iothread); > s->bh = aio_bh_new(s->ctx, notify_guest_bh, s); > s->batch_notify_vqs = bitmap_new(conf->num_queues); > -- > 2.1.0 > > > > Reviewed-by: Fam Zheng
Re: [Qemu-devel] [RFC PATCH V2] qemu-char: Fix context for g_source_attach()
On 2016-07-18 09:55, Zhang Chen wrote:
> Hi all,
> Can you give me some feedback on this patch? We need more comments. The COLO project depends on this patch to work. With this patch, colo-compare can make the handler of qemu_chr_add_handlers() run in the compare thread, which reduces the workload of main_loop when the network is busy. This idea came from Jason.
> Thanks
> Zhang Chen

I think you can put this patch in the COLO compare thread series, which shows how it is used. Then you can ask for an Acked-by or Reviewed-by from other maintainers.

Thanks
Re: [Qemu-devel] [RFC PATCH V2] qemu-char: Fix context for g_source_attach()
On 07/18/2016 01:31 PM, Jason Wang wrote:
> On 2016-07-18 09:55, Zhang Chen wrote:
>> Hi all,
>> Can you give me some feedback on this patch? We need more comments. The COLO project depends on this patch to work. With this patch, colo-compare can make the handler of qemu_chr_add_handlers() run in the compare thread, which reduces the workload of main_loop when the network is busy. This idea came from Jason.
>> Thanks
>> Zhang Chen
>
> I think you can put this patch in the COLO compare thread series, which shows how it is used. Then you can ask for an Acked-by or Reviewed-by from other maintainers.
>
> Thanks

Makes sense. I will add this patch to the next colo-compare series.

Thanks
Zhang Chen
-- Thanks zhangchen
Re: [Qemu-devel] [PATCH] e1000e: fix building without CONFIG_VMXNET3_PCI
On 2016-07-13 10:42, Jason Wang wrote:
> e1000e needs net_tx_pkt.o and net_rx_pkt.o too.
>
> Cc: Dmitry Fleytman
> Cc: Leonid Bloch
> Signed-off-by: Jason Wang
> ---
>  hw/net/Makefile.objs | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/net/Makefile.objs b/hw/net/Makefile.objs
> index fe61e9f..610ed3e 100644
> --- a/hw/net/Makefile.objs
> +++ b/hw/net/Makefile.objs
> @@ -7,6 +7,7 @@ common-obj-$(CONFIG_EEPRO100_PCI) += eepro100.o
>  common-obj-$(CONFIG_PCNET_PCI) += pcnet-pci.o
>  common-obj-$(CONFIG_PCNET_COMMON) += pcnet.o
>  common-obj-$(CONFIG_E1000_PCI) += e1000.o e1000x_common.o
> +common-obj-$(CONFIG_E1000E_PCI) += net_tx_pkt.o net_rx_pkt.o
>  common-obj-$(CONFIG_E1000E_PCI) += e1000e.o e1000e_core.o e1000x_common.o
>  common-obj-$(CONFIG_RTL8139_PCI) += rtl8139.o
>  common-obj-$(CONFIG_VMXNET3_PCI) += net_tx_pkt.o net_rx_pkt.o

Applied, thanks.
Re: [Qemu-devel] [PATCH] net: fix incorrect argument to iov_to_buf
On 2016-07-15 16:41, Paolo Bonzini wrote:
> Coverity reports a "suspicious sizeof" which is indeed wrong. Signed-off-by: Paolo Bonzini --- net/eth.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/eth.c b/net/eth.c index 95fe15c..0be59c2 100644 --- a/net/eth.c +++ b/net/eth.c @@ -418,7 +418,7 @@ _eth_get_rss_ex_dst_addr(const struct iovec *pkt, int pkt_frags, bytes_read = iov_to_buf(pkt, pkt_frags, rthdr_offset + sizeof(*ext_hdr), -dst_addr, sizeof(dst_addr)); +dst_addr, sizeof(*dst_addr)); return bytes_read == sizeof(dst_addr); } @@ -467,7 +467,7 @@ _eth_get_rss_ex_src_addr(const struct iovec *pkt, int pkt_frags, bytes_read = iov_to_buf(pkt, pkt_frags, opt_offset + sizeof(opthdr), -src_addr, sizeof(src_addr)); +src_addr, sizeof(*src_addr)); return bytes_read == sizeof(src_addr); }

Applied to -net. Thanks
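The difference between the two sizeof expressions in this patch can be made concrete. This is a minimal sketch with a hypothetical 16-byte type (struct sketch_addr) standing in for the address structure; the helper names are illustrative and not part of net/eth.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte address type, used only for this illustration. */
struct sketch_addr {
    uint8_t bytes[16];
};

/* Size of the pointer itself: typically 4 or 8, unrelated to the data. */
static size_t size_of_pointer(const struct sketch_addr *addr)
{
    return sizeof(addr);
}

/* Size of the object pointed to: the number of bytes to copy. */
static size_t size_of_pointee(const struct sketch_addr *addr)
{
    return sizeof(*addr);
}
```

sizeof does not evaluate its operand, so both helpers are safe to call with any pointer value. The unchanged context lines in the quoted hunks compare bytes_read against sizeof(dst_addr) and sizeof(src_addr), to which the same distinction applies.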
Re: [Qemu-devel] [PATCH] net: fix incorrect access to pointer
On 2016-07-15 16:43, Paolo Bonzini wrote:
> This is not dereferencing the pointer, and instead checking only the value of the pointer. Signed-off-by: Paolo Bonzini --- net/eth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/eth.c b/net/eth.c index 0be59c2..df81efb 100644 --- a/net/eth.c +++ b/net/eth.c @@ -211,7 +211,7 @@ void eth_get_protocols(const struct iovec *iov, int iovcnt, *l4hdr_off, sizeof(l4hdr_info->hdr.tcp), &l4hdr_info->hdr.tcp); -if (istcp) { +if (*istcp) { *l5hdr_off = *l4hdr_off + TCP_HEADER_DATA_OFFSET(&l4hdr_info->hdr.tcp);

Applied to -net. Thanks
Re: [Qemu-devel] [PATCH] e1000e: fix incorrect access to pointer
On 2016-07-15 16:44, Paolo Bonzini wrote:
> This is not dereferencing the pointer, and instead checking only the value of the pointer. Signed-off-by: Paolo Bonzini --- hw/net/e1000e_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c index 6050d8b..badb1fe 100644 --- a/hw/net/e1000e_core.c +++ b/hw/net/e1000e_core.c @@ -281,7 +281,7 @@ e1000e_intrmgr_delay_rx_causes(E1000ECore *core, uint32_t *causes) /* Check if delayed RX interrupts disabled by client or if there are causes that cannot be delayed */ -if ((rdtr == 0) || (causes != 0)) { +if ((rdtr == 0) || (*causes != 0)) { return false; } @@ -322,7 +322,7 @@ e1000e_intrmgr_delay_tx_causes(E1000ECore *core, uint32_t *causes) *causes &= ~delayable_causes; /* If there are causes that cannot be delayed */ -if (causes != 0) { +if (*causes != 0) { return false; }

Applied to -net. Thanks
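The class of bug fixed here (testing a pointer where the pointed-to value was meant) is easy to reproduce. This is a minimal sketch with hypothetical function names; it is not the e1000e code, only the two forms of the condition side by side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Buggy form: tests the pointer, which is non-NULL for any valid caller. */
static bool has_causes_buggy(const uint32_t *causes)
{
    return causes != 0;
}

/* Fixed form: tests the value the pointer refers to. */
static bool has_causes_fixed(const uint32_t *causes)
{
    assert(causes != NULL);
    return *causes != 0;
}
```

With a valid pointer to a zero value, the buggy form still reports true, so a caller that returns early on this condition would do so on every call.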
Re: [Qemu-devel] [PATCH] tap: fix memory leak on failure to create a multiqueue tap device
On 2016-07-15 16:56, Paolo Bonzini wrote:
> Reported by Coverity. Signed-off-by: Paolo Bonzini --- net/tap.c | 22 -- 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/net/tap.c b/net/tap.c index e9c32f3..6a2cedc 100644 --- a/net/tap.c +++ b/net/tap.c @@ -787,8 +787,8 @@ int net_init_tap(const NetClientOptions *opts, const char *name, return -1; } } else if (tap->has_fds) { -char **fds = g_new(char *, MAX_TAP_QUEUES); -char **vhost_fds = g_new(char *, MAX_TAP_QUEUES); +char **fds = g_new0(char *, MAX_TAP_QUEUES); +char **vhost_fds = g_new0(char *, MAX_TAP_QUEUES); int nfds, nvhosts; if (tap->has_ifname || tap->has_script || tap->has_downscript || @@ -806,7 +806,7 @@ int net_init_tap(const NetClientOptions *opts, const char *name, if (nfds != nvhosts) { error_setg(errp, "The number of fds passed does not match " "the number of vhostfds passed"); -return -1; +goto free_fail; } } @@ -814,7 +814,7 @@ int net_init_tap(const NetClientOptions *opts, const char *name, fd = monitor_fd_param(cur_mon, fds[i], &err); if (fd == -1) { error_propagate(errp, err); -return -1; +goto free_fail; } fcntl(fd, F_SETFL, O_NONBLOCK); @@ -824,7 +824,7 @@ int net_init_tap(const NetClientOptions *opts, const char *name, } else if (vnet_hdr != tap_probe_vnet_hdr(fd)) { error_setg(errp, "vnet_hdr not consistent across given tap fds"); -return -1; +goto free_fail; } net_init_tap_one(tap, peer, "tap", name, ifname, @@ -833,11 +833,21 @@ int net_init_tap(const NetClientOptions *opts, const char *name, vnet_hdr, fd, &err); if (err) { error_propagate(errp, err); -return -1; +goto free_fail; } } g_free(fds); g_free(vhost_fds); +return 0; + +free_fail: +for (i = 0; i < nfds; i++) { +g_free(fds[i]); +g_free(vhost_fds[i]); +} +g_free(fds); +g_free(vhost_fds); +return -1; } else if (tap->has_helper) { if (tap->has_ifname || tap->has_script || tap->has_downscript || tap->has_vnet_hdr || tap->has_queues || tap->has_vhostfds) {

Applied to -net. Thanks
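The cleanup pattern this patch introduces (a zero-initialised pointer array plus a goto label that every error path reaches) can be sketched with plain libc. This is a minimal sketch, not the net/tap.c code: process_names(), dup_counted(), free_counted(), SKETCH_MAX and the live_allocs counter are all hypothetical, calloc() stands in for g_new0(), and a single exit label is used instead of the patch's separate success and failure paths.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX 8

static int live_allocs;   /* outstanding strings, for the demo only */

static char *dup_counted(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy) {
        memcpy(copy, s, len);
        live_allocs++;
    }
    return copy;
}

static void free_counted(char *s)
{
    if (s) {
        live_allocs--;
        free(s);
    }
}

/*
 * Duplicate "n" names, failing on the first empty one. Returns 0 on
 * success and -1 on failure; either way nothing is leaked.
 */
static int process_names(const char *const *names, int n)
{
    /* calloc() zeroes the array, so unused slots are NULL. */
    char **copies = calloc(SKETCH_MAX, sizeof(*copies));
    int ret = -1;
    int i;

    if (!copies || n > SKETCH_MAX) {
        goto out;
    }
    for (i = 0; i < n; i++) {
        if (names[i][0] == '\0') {
            goto out;               /* early exit still reaches cleanup */
        }
        copies[i] = dup_counted(names[i]);
        if (!copies[i]) {
            goto out;
        }
    }
    ret = 0;

out:
    if (copies) {
        for (i = 0; i < SKETCH_MAX; i++) {
            free_counted(copies[i]);    /* NULL slots are skipped */
        }
    }
    free(copies);
    return ret;
}
```

Zeroing the array matters for the same reason the patch switches from g_new() to g_new0(): the cleanup loop may visit slots that were never filled, and freeing a NULL pointer is harmless while freeing an uninitialised one is not.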
Re: [Qemu-devel] [patch qemu] MAINTAINERS: release Scott from being a rocker maintainer
On 2016-07-11 15:49, Jiri Pirko wrote:
> From: Jiri Pirko As requested by Scott, removing him. Signed-off-by: Scott Feldman Signed-off-by: Jiri Pirko --- MAINTAINERS | 1 - 1 file changed, 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 1d0e2c3..5928f22 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -971,7 +971,6 @@ F: hw/net/vmxnet* F: hw/scsi/vmw_pvscsi* Rocker -M: Scott Feldman M: Jiri Pirko S: Maintained F: hw/net/rocker/

Applied to -net. Thanks
[Qemu-devel] [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
The virtio pstore driver provides interface to the pstore subsystem so
that the guest kernel's log/dump message can be saved on the host
machine. Users can access the log file directly on the host, or on the
guest at the next boot using pstore filesystem. It currently deals with
kernel log (printk) buffer only, but we can extend it to have other
information (like ftrace dump) later.

It supports legacy PCI device using single order-2 page buffer. As all
operation of pstore is synchronous, it would be fine IMHO. However I
don't know how to make write operation synchronous since it's called
with a spinlock held (from any context including NMI).

Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: "Michael S. Tsirkin"
Cc: Anthony Liguori
Cc: Anton Vorontsov
Cc: Colin Cross
Cc: Kees Cook
Cc: Tony Luck
Cc: Steven Rostedt
Cc: Ingo Molnar
Cc: Minchan Kim
Cc: k...@vger.kernel.org
Cc: qemu-devel@nongnu.org
Cc: virtualizat...@lists.linux-foundation.org
Signed-off-by: Namhyung Kim
---
 drivers/virtio/Kconfig             |  10 ++
 drivers/virtio/Makefile            |   1 +
 drivers/virtio/virtio_pstore.c     | 317 +
 include/uapi/linux/Kbuild          |   1 +
 include/uapi/linux/virtio_ids.h    |   1 +
 include/uapi/linux/virtio_pstore.h |  53 +++
 6 files changed, 383 insertions(+)
 create mode 100644 drivers/virtio/virtio_pstore.c
 create mode 100644 include/uapi/linux/virtio_pstore.h

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 77590320d44c..8f0e6c796c12 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -58,6 +58,16 @@ config VIRTIO_INPUT

          If unsure, say M.

+config VIRTIO_PSTORE
+        tristate "Virtio pstore driver"
+        depends on VIRTIO
+        depends on PSTORE
+        ---help---
+         This driver supports virtio pstore devices to save/restore
+         panic and oops messages on the host.
+
+         If unsure, say M.
+ config VIRTIO_MMIO tristate "Platform bus driver for memory mapped virtio devices" depends on HAS_IOMEM && HAS_DMA diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile index 41e30e3dc842..bee68cb26d48 100644 --- a/drivers/virtio/Makefile +++ b/drivers/virtio/Makefile @@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o +obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o diff --git a/drivers/virtio/virtio_pstore.c b/drivers/virtio/virtio_pstore.c new file mode 100644 index ..6fe62c0f1508 --- /dev/null +++ b/drivers/virtio/virtio_pstore.c @@ -0,0 +1,317 @@ +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include + +#define VIRT_PSTORE_ORDER2 +#define VIRT_PSTORE_BUFSIZE (4096 << VIRT_PSTORE_ORDER) + +struct virtio_pstore { + struct virtio_device*vdev; + struct virtqueue*vq; + struct pstore_info pstore; + struct virtio_pstore_hdr hdr; + size_t buflen; + u64 id; + + /* Waiting for host to ack */ + wait_queue_head_t acked; +}; + +static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type) +{ + u16 ret; + + switch (type) { + case PSTORE_TYPE_DMESG: + ret = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_DMESG); + break; + default: + ret = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN); + break; + } + + return ret; +} + +static enum pstore_type_id from_virtio_type(struct virtio_pstore *vps, u16 type) +{ + enum pstore_type_id ret; + + switch (virtio16_to_cpu(vps->vdev, type)) { + case VIRTIO_PSTORE_TYPE_DMESG: + ret = PSTORE_TYPE_DMESG; + break; + default: + ret = PSTORE_TYPE_UNKNOWN; + break; + } + + return ret; +} + +static void virtpstore_ack(struct virtqueue *vq) +{ + struct virtio_pstore *vps = vq->vdev->priv; + + wake_up(&vps->acked); +} + +static int virt_pstore_open(struct pstore_info *psi) +{ + struct 
virtio_pstore *vps = psi->data; + struct virtio_pstore_hdr *hdr = &vps->hdr; + struct scatterlist sg[1]; + unsigned int len; + + hdr->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_OPEN); + + sg_init_one(sg, hdr, sizeof(*hdr)); + virtqueue_add_outbuf(vps->vq, sg, 1, vps, GFP_KERNEL); + virtqueue_kick(vps->vq); + + wait_event(vps->acked, virtqueue_get_buf(vps->vq, &len)); + return 0; +} + +static int virt_pstore_close(struct pstore_info *psi) +{ + struct virtio_pstore *vps = psi->data; + struct virtio_pstore_hdr *hdr = &vps->hdr; + struct scatterlist sg[1]; + unsigned int len; + + hdr->cmd = cpu_to_
[Qemu-devel] [PATCH 2/3] qemu: Implement virtio-pstore device
From: Namhyung Kim

Add virtio pstore device to allow kernel log files saved on the host.
It will save the log files on the directory given by pstore device
option.

  $ qemu-system-x86_64 -device virtio-pstore,directory=dir-xx ...

  (guest) # echo c > /proc/sysrq-trigger

  $ ls dir-xx
  dmesg-0.enc.z  dmesg-1.enc.z

The log files are usually compressed using zlib. Users can see the log
messages directly on the host or on the guest (using pstore filesystem).

Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: "Michael S. Tsirkin"
Cc: Anthony Liguori
Cc: Anton Vorontsov
Cc: Colin Cross
Cc: Kees Cook
Cc: Tony Luck
Cc: Steven Rostedt
Cc: Ingo Molnar
Cc: Minchan Kim
Cc: k...@vger.kernel.org
Cc: qemu-devel@nongnu.org
Cc: virtualizat...@lists.linux-foundation.org
Signed-off-by: Namhyung Kim
---
 hw/virtio/Makefile.objs                         |   2 +-
 hw/virtio/virtio-pci.c                          |  50
 hw/virtio/virtio-pci.h                          |  14 +
 hw/virtio/virtio-pstore.c                       | 328 +
 include/hw/pci/pci.h                            |   1 +
 include/hw/virtio/virtio-pstore.h               |  30 ++
 include/standard-headers/linux/virtio_ids.h     |   1 +
 .../linux/{virtio_ids.h => virtio_pstore.h}     |  48 +--
 qdev-monitor.c                                  |   1 +
 9 files changed, 455 insertions(+), 20 deletions(-)
 create mode 100644 hw/virtio/virtio-pstore.c
 create mode 100644 include/hw/virtio/virtio-pstore.h
 copy include/standard-headers/linux/{virtio_ids.h => virtio_pstore.h} (63%)

diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 3e2b175..aae7082 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -4,4 +4,4 @@ common-obj-y += virtio-bus.o
 common-obj-y += virtio-mmio.o

 obj-y += virtio.o virtio-balloon.o
-obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
+obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o virtio-pstore.o
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 2b34b43..8281b80 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2416,6 +2416,55 @@ static const TypeInfo virtio_host_pci_info = {
 };
 #endif

+/* virtio-pstore-pci */
+
+static void virtio_pstore_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+    VirtIOPstorePCI *vps = VIRTIO_PSTORE_PCI(vpci_dev);
+    DeviceState *vdev = DEVICE(&vps->vdev);
+    Error *err = NULL;
+
+    qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
+    object_property_set_bool(OBJECT(vdev), true, "realized", &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+}
+
+static void virtio_pstore_pci_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+    PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
+
+    k->realize = virtio_pstore_pci_realize;
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+
+    pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
+    pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_PSTORE;
+    pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
+    pcidev_k->class_id = PCI_CLASS_OTHERS;
+}
+
+static void virtio_pstore_pci_instance_init(Object *obj)
+{
+    VirtIOPstorePCI *dev = VIRTIO_PSTORE_PCI(obj);
+
+    virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+                                TYPE_VIRTIO_PSTORE);
+    object_property_add_alias(obj, "directory", OBJECT(&dev->vdev),
+                              "directory", &error_abort);
+}
+
+static const TypeInfo virtio_pstore_pci_info = {
+    .name          = TYPE_VIRTIO_PSTORE_PCI,
+    .parent        = TYPE_VIRTIO_PCI,
+    .instance_size = sizeof(VirtIOPstorePCI),
+    .instance_init = virtio_pstore_pci_instance_init,
+    .class_init    = virtio_pstore_pci_class_init,
+};
+
 /* virtio-pci-bus */

 static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
@@ -2485,6 +2534,7 @@ static void virtio_pci_register_types(void)
 #ifdef CONFIG_VHOST_SCSI
     type_register_static(&vhost_scsi_pci_info);
 #endif
+    type_register_static(&virtio_pstore_pci_info);
 }

 type_init(virtio_pci_register_types)
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index e4548c2..b4c039f 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -31,6 +31,7 @@
 #ifdef CONFIG_VHOST_SCSI
 #include "hw/virtio/vhost-scsi.h"
 #endif
+#include "hw/virtio/virtio-pstore.h"

 typedef struct VirtIOPCIProxy VirtIOPCIProxy;
 typedef struct VirtIOBlkPCI VirtIOBlkPCI;
@@ -44,6 +45,7 @@ typedef struct VirtIOInputPCI VirtIOInputPCI;
 typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
 typedef struct VirtIOInputHostPCI VirtIOInputHostPCI;
 typedef struct VirtIOGPUPCI VirtIOGPUPCI;
+typedef struct VirtIOPstorePCI VirtIOPstorePCI;

 /* virtio-pci-bus */

@@ -311,6 +313,18 @@ struct VirtIOGPUPCI {
     VirtIOGPU vdev;
 };

+/*
+ * virtio-pstore-pci: This extends Virti
[Qemu-devel] [Bug 1603693] Re: Disks in mptsas1068 scsi controller not seen by linux
> The non-working vmware config says `scsi0.virtualDev = "lsilogic"`
> (that's mptspi, LSI53C1030 or "LSI Logic Ultra 320"). For the mptsas
> tests above, I changed it to `scsi0.virtualDev = "lsisas1068"`.
>
> Is it correct to say that the LSI53C1030 parts of [1] were never applied?

Yes, that's correct. The patch you linked was almost entirely
rewritten.

** Changed in: qemu
       Status: New => Invalid

--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1603693

Title:
  Disks in mptsas1068 scsi controller not seen by linux

Status in QEMU:
  Invalid

Bug description:
  When using the mptsas1068 scsi controller, linux detects the
  controller itself but not the drives attached to it. Freebsd works.
  Using a different controller with linux works. VMware with linux
  works.

  qemu 2.6.50 (v2.6.0-1925-g6b92bbf)
  seabios rel-1.9.0-139-gae3f78f (master branch, required for mptsas1068 support)

  Test script, loosely based off what libvirt runs and the libvirt tests
  that Paolo Bonzini wrote [1]

  #
  iso=archlinux-2016.07.01-dual.iso
  #iso=FreeBSD-10.3-RELEASE-amd64-bootonly.iso
  device=mptsas1068
  #device=lsi
  img=empty.img
  qemu-img create -f qcow2 $img 1G
  /usr/bin/qemu-system-x86_64 \
  -enable-kvm \
  -m 1024 \
  -boot menu=on \
  -device $device,id=scsi0,bus=pci.0,addr=0x9 \
  -drive file=$img,format=qcow2,if=none,id=drive-scsi0-0-0-0 \
  -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 \
  -drive file=$iso,format=raw,if=none,id=drive-ide0-0-1,readonly=on \
  -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=1
  #

  The ISOs can be downloaded from [2] and [3].

  After booting linux, do "lsblk". /dev/sda should exist.
  After booting freebsd, do "geom disk list". A da0 / "QEMU QEMU HARDDISK" should be mentioned.

  With device=mptsas1068 this fails in linux. With device=lsi line it
  works in both.

  With VMWare and a linux VM (opensuse 10.1, kernel 2.6.18) which only
  loads modules for mptsas1068, this works.

  I also reproduced this with the debian 8.5 netinstall image, but it
  insists in making you pick a driver from a list of modules when it
  fails to mount it, instead of dropping to a shell.

  Arch linux dmesg output snippet (full output attached as arch-linux-
  dmesg.txt):

  #
  root@archiso ~ # dmesg | grep -i -e mpt -e scsi -e ioc0
  [0.00] Linux version 4.6.3-1-ARCH (builduser@tobias) (gcc version 6.1.1 20160602 (GCC) ) #1 SMP PREEMPT Fri Jun 24 21:19:13 CEST 2016
  [0.00] Normal empty
  [0.00] Preemptible hierarchical RCU implementation.
  [1.879616] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249)
  [1.951581] SCSI subsystem initialized
  [1.957113] Fusion MPT base driver 3.04.20
  [1.957618] Fusion MPT SAS Host driver 3.04.20
  [2.281773] scsi host0: ata_piix
  [2.285372] scsi host1: ata_piix
  [2.305803] mptbase: ioc0: Initiating bringup
  [2.363555] ioc0: LSISAS1068 A0: Capabilities={Initiator}
  [2.444390] scsi 0:0:1:0: CD-ROM QEMU QEMU DVD-ROM 2.5+ PQ: 0 ANSI: 5
  [2.500572] scsi host2: ioc0: LSISAS1068 A0, FwRev=01329200h, Ports=8, MaxQ=128, IRQ=11
  [2.507024] sr 0:0:1:0: [sr0] scsi3-mmc drive: 4x/4x cd/rw xa/form2 tray
  [2.507274] sr 0:0:1:0: Attached scsi CD-ROM sr0
  #

  The controller itself is detected, the disk isn't.

  An early version of this patch [4] said that it was only tested with
  FreeBSD:

  >Tested with FreeBSD for now. The previous version (before the
  >configuration page rewrite) worked with RHEL and Windows guests as well.
  >
  >TODO: write qtest for (at least) config pages, test Linux and Windows.

  [1]: https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=fc922eb2080a3fa7b24bc8a8b0aabfd394480143
  [2]: https://www.archlinux.org/download
  [3]: https://www.freebsd.org/where.html
  [4]: https://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg06475.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1603693/+subscriptions
[Qemu-devel] [RFC/PATCHSET 0/3] virtio-pstore: Implement virtio pstore device
Hello,

This patchset is a proof of concept of the virtio-pstore idea [1]. It has
some rough edges and I'm not familiar with this area, so please give
me feedback and advice if I'm going in the wrong direction.

It started from the fact that dumping the ftrace buffer at kernel
oops/panic takes too much time. Although there's a way to reduce the
size of the original data, sometimes I want to have as much
information as possible. Maybe kexec/kdump can solve this problem,
but it consumes some portion of guest memory so I'd like to avoid it.
And I know that qemu + crashtool can dump and analyze the whole guest
memory including the ftrace buffer without wasting guest memory, but
it adds one more layer and has some limitations as an out-of-tree
tool, like not being in sync with kernel changes.

So I think it'd be great to use the pstore interface to dump guest
kernel data on the host. One can read the data on the host directly
or on the guest (at the next boot) using the pstore filesystem as usual.
While this patchset only implements dumping the kernel log buffer, it
can be extended to cover the ftrace buffer and probably some more.

Patch 0001 implements the virtio pstore driver. It has a single
virtqueue, a pstore buffer and a header structure. The
virtio_pstore_hdr struct gives information about the current pstore
operation.

Patches 0002 and 0003 implement the virtio-pstore legacy PCI device on
qemu-kvm and kvmtool respectively. I referenced the virtio-balloon and
virtio-rng implementations, and I don't know whether kvmtool supports
the modern virtio 1.0+ spec.

For example, using virtio-pstore on qemu looks like below:

  $ qemu-system-x86_64 -enable-kvm -device virtio-pstore,directory=xxx

When the guest kernel panics, the log messages will be saved under the
xxx directory.

  $ ls xxx
  dmesg-0.enc.z  dmesg-1.enc.z

As you can see, the pstore subsystem compresses the log data using zlib.
The data can be extracted with the following command:

  $ cat xxx/dmesg-0.enc.z | \
  > python -c 'import sys, zlib; print(zlib.decompress(sys.stdin.read()))'
  Oops#1 Part1
  <5>[0.00] Linux version 4.6.0kvm+ (namhyung@danjae) (gcc version 5.3.0 (GCC) ) #145 SMP Mon Jul 18 10:22:45 KST 2016
  <6>[0.00] Command line: root=/dev/vda console=ttyS0
  <6>[0.00] x86/fpu: Legacy x87 FPU detected.
  <6>[0.00] x86/fpu: Using 'eager' FPU context switches.
  <6>[0.00] e820: BIOS-provided physical RAM map:
  <6>[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
  <6>[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
  <6>[0.00] BIOS-e820: [mem 0x000f-0x000f] reserved
  <6>[0.00] BIOS-e820: [mem 0x0010-0x07fddfff] usable
  <6>[0.00] BIOS-e820: [mem 0x07fde000-0x07ff] reserved
  <6>[0.00] BIOS-e820: [mem 0xfeffc000-0xfeff] reserved
  <6>[0.00] BIOS-e820: [mem 0xfffc-0x] reserved
  <6>[0.00] NX (Execute Disable) protection: active
  <6>[0.00] SMBIOS 2.8 present.
  <7>[0.00] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
  ...

Maybe we can add a config option to control the compression later.

Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: "Michael S. Tsirkin"
Cc: Anthony Liguori
Cc: Anton Vorontsov
Cc: Colin Cross
Cc: Kees Cook
Cc: Tony Luck
Cc: Steven Rostedt
Cc: Ingo Molnar
Cc: Minchan Kim
Cc: k...@vger.kernel.org
Cc: qemu-devel@nongnu.org
Cc: virtualizat...@lists.linux-foundation.org

[1] https://lkml.org/lkml/2016/7/1/6

Thanks,
Namhyung
Re: [Qemu-devel] [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
Hello,

On Sun, Jul 17, 2016 at 10:12:26PM -0700, Kees Cook wrote:
> On Sun, Jul 17, 2016 at 9:37 PM, Namhyung Kim wrote:
> > The virtio pstore driver provides interface to the pstore subsystem so
> > that the guest kernel's log/dump message can be saved on the host
> > machine. Users can access the log file directly on the host, or on the
> > guest at the next boot using pstore filesystem. It currently deals with
> > kernel log (printk) buffer only, but we can extend it to have other
> > information (like ftrace dump) later.
> >
> > It supports legacy PCI device using single order-2 page buffer. As all
> > operation of pstore is synchronous, it would be fine IMHO. However I
> > don't know how to make write operation synchronous since it's called
> > with a spinlock held (from any context including NMI).
> >
> > Cc: Paolo Bonzini
> > Cc: Radim Krčmář
> > Cc: "Michael S. Tsirkin"
> > Cc: Anthony Liguori
> > Cc: Anton Vorontsov
> > Cc: Colin Cross
> > Cc: Kees Cook
> > Cc: Tony Luck
> > Cc: Steven Rostedt
> > Cc: Ingo Molnar
> > Cc: Minchan Kim
> > Cc: k...@vger.kernel.org
> > Cc: qemu-devel@nongnu.org
> > Cc: virtualizat...@lists.linux-foundation.org
> > Signed-off-by: Namhyung Kim
>
> This looks great to me! I'd love to use this in qemu. (Right now I go
> through hoops to use the ramoops backend for testing.)
>
> Reviewed-by: Kees Cook

Thank you!

>
> Notes below...
>

[SNIP]
> > +static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type)
> > +{
> > +        u16 ret;
> > +
> > +        switch (type) {
> > +        case PSTORE_TYPE_DMESG:
> > +                ret = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_DMESG);
> > +                break;
> > +        default:
> > +                ret = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN);
> > +                break;
> > +        }
>
> I would love to see this support PSTORE_TYPE_CONSOLE too. It should be
> relatively easy to add: I think it'd just be another virtio command?

Do you want to append the data to the host file as guest does
printk()? I think it needs some kind of buffer management, but it's
not hard to add IMHO.

>
> > +
> > +        return ret;
> > +}
> > +

[SNIP]
> > +static int notrace virt_pstore_write(enum pstore_type_id type,
> > +                                     enum kmsg_dump_reason reason,
> > +                                     u64 *id, unsigned int part, int count,
> > +                                     bool compressed, size_t size,
> > +                                     struct pstore_info *psi)
> > +{
> > +        struct virtio_pstore *vps = psi->data;
> > +        struct virtio_pstore_hdr *hdr = &vps->hdr;
> > +        struct scatterlist sg[2];
> > +        unsigned int flags = compressed ? VIRTIO_PSTORE_FL_COMPRESSED : 0;
> > +
> > +        *id = vps->id++;
> > +
> > +        hdr->cmd   = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_WRITE);
> > +        hdr->id    = cpu_to_virtio64(vps->vdev, *id);
> > +        hdr->flags = cpu_to_virtio32(vps->vdev, flags);
> > +        hdr->type  = to_virtio_type(vps, type);
> > +
> > +        sg_init_table(sg, 2);
> > +        sg_set_buf(&sg[0], hdr, sizeof(*hdr));
> > +        sg_set_buf(&sg[1], psi->buf, size);
> > +        virtqueue_add_outbuf(vps->vq, sg, 2, vps, GFP_ATOMIC);
> > +        virtqueue_kick(vps->vq);
> > +
> > +        /* TODO: make it synchronous */
> > +        return 0;
>
> The down side to this being asynchronous is the lack of error
> reporting. Perhaps this could check hdr->type before queuing and error
> for any VIRTIO_PSTORE_TYPE_UNKNOWN message instead of trying to send
> it?

I cannot follow, sorry. Could you please elaborate it more?

>
> > +}
> > +
> > +static int virt_pstore_erase(enum pstore_type_id type, u64 id, int count,
> > +                             struct timespec time, struct pstore_info *psi)
> > +{
> > +        struct virtio_pstore *vps = psi->data;
> > +        struct virtio_pstore_hdr *hdr = &vps->hdr;
> > +        struct scatterlist sg[1];
> > +        unsigned int len;
> > +
> > +        hdr->cmd  = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_ERASE);
> > +        hdr->id   = cpu_to_virtio64(vps->vdev, id);
> > +        hdr->type = to_virtio_type(vps, type);
> > +
> > +        sg_init_one(sg, hdr, sizeof(*hdr));
> > +        virtqueue_add_outbuf(vps->vq, sg, 1, vps, GFP_KERNEL);
> > +        virtqueue_kick(vps->vq);
> > +
> > +        wait_event(vps->acked, virtqueue_get_buf(vps->vq, &len));
> > +        return 0;
> > +}
> > +
> > +static int virt_pstore_init(struct virtio_pstore *vps)
> > +{
> > +        struct pstore_info *psinfo = &vps->pstore;
> > +        int err;
> > +
> > +        vps->id = 0;
> > +        vps->buflen = 0;
> > +        psinfo->bufsize = VIRT_PSTORE_BUFSIZE;
> > +        psinfo->buf = (void *)__get_free_pages(GFP_KERNEL, VIRT_PSTORE_ORDER);
> > +        if (!psinfo->buf) {
> > +                pr_err("cannot allocate pstore buffer\n");
> > +                return -ENOMEM;
> > +        }
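One possible reading of the review comment about VIRTIO_PSTORE_TYPE_UNKNOWN is sketched below. It is an interpretation only, not something either poster confirmed: the write path maps the pstore type first and returns an error for anything the device would not understand, so the caller gets at least that much feedback even though the queueing itself stays asynchronous. Everything here is a userspace stand-in; `pstore_write_checked`, `to_wire_type`, the enum values and the `queued` counter are hypothetical names, not kernel or virtio API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the pstore and virtio-pstore type ids. */
enum pstore_type { PSTORE_DMESG, PSTORE_CONSOLE, PSTORE_OTHER };
enum { WIRE_TYPE_UNKNOWN = 0, WIRE_TYPE_DMESG = 1 };

/* Number of requests handed to the imaginary virtqueue. */
static int queued;

/* Map a pstore type to the on-the-wire type, as to_virtio_type() does. */
static uint16_t to_wire_type(enum pstore_type type)
{
    switch (type) {
    case PSTORE_DMESG:
        return WIRE_TYPE_DMESG;
    default:
        return WIRE_TYPE_UNKNOWN;
    }
}

/*
 * Validate before queueing: an unsupported type is rejected with
 * -EINVAL instead of being sent to the host and silently ignored.
 */
int pstore_write_checked(enum pstore_type type, size_t size)
{
    uint16_t wire_type = to_wire_type(type);

    assert(size > 0);

    if (wire_type == WIRE_TYPE_UNKNOWN) {
        return -EINVAL;
    }

    /* The real driver would build the header and kick the queue here. */
    queued++;
    return 0;
}
```

Calling `pstore_write_checked(PSTORE_DMESG, 16)` returns 0 and bumps `queued`, while `pstore_write_checked(PSTORE_CONSOLE, 16)` returns `-EINVAL` without queueing anything.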
[Qemu-devel] [RFC PATCH V7 0/7] Introduce COLO-compare
COLO-compare is a part of the COLO project. It is used to compare network
packets to help COLO decide whether to do a checkpoint.

The full version is in this github:
https://github.com/zhangckid/qemu/tree/colo-v2.7-proxy-mode-compare-with-colo-base-jul18

v7:
 p5:
  - add [PATCH]qemu-char: Fix context for g_source_attach()
    in this patch series.

v6:
 p6:
  - add more commit log.
  - fix icmp comparison to compare all packets.
 p5:
  - add more comments in commit log.
  - change REGULAR_CHECK_MS to REGULAR_PACKET_CHECK_MS
  - make the old packet check independent of the compare thread
  - remove thread_status
 p4:
  - change this patch to be only about Connection and ConnectionKey.
  - add some comments in commit log.
  - remove mode in fill_connection_key().
  - fix some comments and bugs.
  - move colo_conn_state to patch of
    "work with colo-frame"
  - remove conn_list_lock.
  - add MAX_QUEUE_SIZE, if primary_list or
    secondary_list is bigger than MAX_QUEUE_SIZE
    we will drop the packet.
 p3:
  - add new independent kernel jhash patch.
 p2:
  - add new independent colo-base patch.
 p1:
  - add an ascii figure and some comments to explain it
  - move trace.h to p2
  - move QTAILQ_HEAD(, CompareState) net_compares to
    patch of "work with colo-frame"
  - add some comments in qemu-option.hx

v5:
 p3:
  - comments from Jason
    we poll and handle chardev in the compare thread.
    This way, there's no need for extra
    synchronization with the main loop.
    This depends on another patch:
    qemu-char: Fix context for g_source_attach()
  - remove QemuEvent
 p2:
  - remove conn->list_lock
 p1:
  - move compare_pri/sec_chr_in to p3
  - move compare_chr_send to p2

v4:
 p4:
  - add some comments
  - fix some trace-events
  - fix tcp compare error
 p3:
  - add rcu_read_lock().
  - fix trace name
  - fix jason's other comments
  - rebase some Dave's branch function
 p2:
  - colo_compare_connection() change g_queue_push_head() to
    g_queue_push_tail() to match sorted order.
  - remove pkt->s
  - move data structure to colo-base.h
  - add colo-base.c to reuse code for filter-rewriter
  - add some structs that filter-rewriter needs
  - depends on previous SocketReadState patch
 p1:
  - except move qemu_chr_add_handlers()
    to colo thread
  - remove class_finalize
  - remove secondary arp code
  - depends on previous SocketReadState patch

v3:
  - rebase colo-compare to colo-frame v2.7
  - fix most of Dave's comments
    (except RCU)
  - add TCP,UDP,ICMP and other packet comparison
  - add trace-event
  - add some comments
  - other bug fixes
  - add RFC index
  - add usage in patch 1/4

v2:
  - add jhash.h

v1:
  - initial patch

Zhang Chen (7):
  colo-compare: introduce colo compare initialization
  colo-base: add colo-base to define and handle packet
  Jhash: add linux kernel jhashtable in qemu
  colo-compare: track connection and enqueue packet
  qemu-char: Fix context for g_source_attach()
  colo-compare: introduce packet comparison thread
  colo-compare: add TCP,UDP,ICMP packet comparison

 include/qemu/jhash.h |  61
 io/channel.c         |   2 +-
 net/Makefile.objs    |   2 +
 net/colo-base.c      | 183
 net/colo-base.h      |  71 +
 net/colo-compare.c   | 769 +++
 qemu-char.c          |   6 +-
 qemu-options.hx      |  38 +++
 trace-events         |   9 +
 vl.c                 |   3 +-
 10 files changed, 1139 insertions(+), 5 deletions(-)
 create mode 100644 include/qemu/jhash.h
 create mode 100644 net/colo-base.c
 create mode 100644 net/colo-base.h
 create mode 100644 net/colo-compare.c

--
2.7.4
[Qemu-devel] [RFC PATCH V7 4/7] colo-compare: track connection and enqueue packet
In this patch we use the kernel jhash table to track connections,
and then enqueue net packets like this:

  + CompareState ++
  |               |
  +---------------+   +---------------+         +---------------+
  |conn list      +--->conn           +--------->conn           |
  +---------------+   +---------------+         +---------------+
  |               |     |           |             |          |
  +---------------+ +---v----+  +---v----+    +---v----+ +---v----+
                    |primary |  |secondary    |primary | |secondary
                    |packet  |  |packet  +    |packet  | |packet  +
                    +--------+  +--------+    +--------+ +--------+
                        |           |             |          |
                    +---v----+  +---v----+    +---v----+ +---v----+
                    |primary |  |secondary    |primary | |secondary
                    |packet  |  |packet  +    |packet  | |packet  +
                    +--------+  +--------+    +--------+ +--------+
                        |           |             |          |
                    +---v----+  +---v----+    +---v----+ +---v----+
                    |primary |  |secondary    |primary | |secondary
                    |packet  |  |packet  +    |packet  | |packet  +
                    +--------+  +--------+    +--------+ +--------+

We use conn_list to record connection info.
When we want to enqueue a packet, we first get the
connection from connection_track_table, then
push the packet to the g_queue (pri/sec) in its own conn.

Signed-off-by: Zhang Chen
Signed-off-by: Li Zhijian
Signed-off-by: Wen Congyang
---
 net/colo-base.c    | 108 +
 net/colo-base.h    |  30 +++
 net/colo-compare.c |  70 +-
 3 files changed, 198 insertions(+), 10 deletions(-)

diff --git a/net/colo-base.c b/net/colo-base.c
index f5d5de9..7e91dec 100644
--- a/net/colo-base.c
+++ b/net/colo-base.c
@@ -16,6 +16,29 @@
 #include "qemu/error-report.h"
 #include "net/colo-base.h"

+uint32_t connection_key_hash(const void *opaque)
+{
+    const ConnectionKey *key = opaque;
+    uint32_t a, b, c;
+
+    /* Jenkins hash */
+    a = b = c = JHASH_INITVAL + sizeof(*key);
+    a += key->src.s_addr;
+    b += key->dst.s_addr;
+    c += (key->src_port | key->dst_port << 16);
+    __jhash_mix(a, b, c);
+
+    a += key->ip_proto;
+    __jhash_final(a, b, c);
+
+    return c;
+}
+
+int connection_key_equal(const void *key1, const void *key2)
+{
+    return memcmp(key1, key2, sizeof(ConnectionKey)) == 0;
+}
+
 int parse_packet_early(Packet *pkt)
 {
     int network_length;
@@ -47,6 +70,62 @@ int parse_packet_early(Packet *pkt)
     return 0;
 }

+void fill_connection_key(Packet *pkt, ConnectionKey *key)
+{
+    uint32_t tmp_ports;
+
+    key->ip_proto = pkt->ip->ip_p;
+
+    switch (key->ip_proto) {
+    case IPPROTO_TCP:
+    case IPPROTO_UDP:
+    case IPPROTO_DCCP:
+    case IPPROTO_ESP:
+    case IPPROTO_SCTP:
+    case IPPROTO_UDPLITE:
+        tmp_ports = *(uint32_t *)(pkt->transport_layer);
+        key->src = pkt->ip->ip_src;
+        key->dst = pkt->ip->ip_dst;
+        key->src_port = ntohs(tmp_ports & 0xffff);
+        key->dst_port = ntohs(tmp_ports >> 16);
+        break;
+    case IPPROTO_AH:
+        tmp_ports = *(uint32_t *)(pkt->transport_layer + 4);
+        key->src = pkt->ip->ip_src;
+        key->dst = pkt->ip->ip_dst;
+        key->src_port = ntohs(tmp_ports & 0xffff);
+        key->dst_port = ntohs(tmp_ports >> 16);
+        break;
+    default:
+        key->src_port = 0;
+        key->dst_port = 0;
+        break;
+    }
+}
+
+Connection *connection_new(ConnectionKey *key)
+{
+    Connection *conn = g_slice_new(Connection);
+
+    conn->ip_proto = key->ip_proto;
+    conn->processing = false;
+    g_queue_init(&conn->primary_list);
+    g_queue_init(&conn->secondary_list);
+
+    return conn;
+}
+
+void connection_destroy(void *opaque)
+{
+    Connection *conn = opaque;
+
+    g_queue_foreach(&conn->primary_list, packet_destroy, NULL);
+    g_queue_free(&conn->primary_list);
+    g_queue_foreach(&conn->secondary_list, packet_destroy, NULL);
+    g_queue_free(&conn->secondary_list);
+    g_slice_free(Connection, conn);
+}
+
 Packet *packet_new(const void *data, int size)
 {
     Packet *pkt = g_slice_new(Packet);
@@ -72,3 +151,32 @@ void connection_hashtable_reset(GHashTable *connection_track_table)
 {
     g_hash_table_remove_all(connection_track_table);
 }
+
+/* if not found, create a new connection and add to hash table */
+Connection *connection_get(GHashTable *connection_track_table,
+                           ConnectionKey *key,
+                           uint32_t *hashtable_size)
+{
+    Connection *conn = g_hash_table_lookup(connection_track_table, key);
+
+    if (conn == NULL) {
+        ConnectionKey *new_key = g_memdup(key, sizeof(*key));
+
+        conn = connection_new(key);
+
+        (*hashtable_size) += 1;
+        if (*hashtable_size > HASHTABLE_MAX_SIZE) {
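As a side note on fill_connection_key(): the patch reads both ports with a single 32-bit load and then splits the value with a mask and a shift, so which half holds the source port depends on the host byte order, and the load assumes the transport header is suitably aligned. A byte-wise variant avoids both assumptions. The sketch below is not what the patch does; `struct ports` and `read_ports` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Source and destination port in host byte order. */
struct ports {
    uint16_t src;
    uint16_t dst;
};

/*
 * TCP, UDP, SCTP, DCCP and UDP-Lite all start with a 16-bit source
 * port followed by a 16-bit destination port, both big-endian.
 * Assembling them byte by byte gives the same result on any host.
 */
struct ports read_ports(const uint8_t *transport_header)
{
    struct ports p;

    assert(transport_header != NULL);

    p.src = (uint16_t)((transport_header[0] << 8) | transport_header[1]);
    p.dst = (uint16_t)((transport_header[2] << 8) | transport_header[3]);
    return p;
}
```

For a header beginning with the bytes 0x1f 0x90 0x00 0x50, this yields source port 8080 and destination port 80 regardless of host endianness.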
[Qemu-devel] [RFC PATCH V7 3/7] Jhash: add linux kernel jhashtable in qemu
Jhash used by colo-compare and filter-rewriter
to save and lookup net connection info

Signed-off-by: Zhang Chen
Signed-off-by: Li Zhijian
Signed-off-by: Wen Congyang
---
 include/qemu/jhash.h | 61
 1 file changed, 61 insertions(+)
 create mode 100644 include/qemu/jhash.h

diff --git a/include/qemu/jhash.h b/include/qemu/jhash.h
new file mode 100644
index 000..0fcd875
--- /dev/null
+++ b/include/qemu/jhash.h
@@ -0,0 +1,61 @@
+/* jhash.h: Jenkins hash support.
+ *
+ * Copyright (C) 2006. Bob Jenkins (bob_jenk...@burtleburtle.net)
+ *
+ * http://burtleburtle.net/bob/hash/
+ *
+ * These are the credits from Bob's sources:
+ *
+ * lookup3.c, by Bob Jenkins, May 2006, Public Domain.
+ *
+ * These are functions for producing 32-bit hashes for hash table lookup.
+ * hashword(), hashlittle(), hashlittle2(), hashbig(), mix(), and final()
+ * are externally useful functions. Routines to test the hash are
+included
+ * if SELF_TEST is defined. You can use this free for any purpose.
+It's in
+ * the public domain. It has no warranty.
+ *
+ * Copyright (C) 2009-2010 Jozsef Kadlecsik (kad...@blackhole.kfki.hu)
+ *
+ * I've modified Bob's hash to be useful in the Linux kernel, and
+ * any bugs present are my fault.
+ * Jozsef
+ */
+
+#ifndef QEMU_JHASH_H__
+#define QEMU_JHASH_H__
+
+#include "qemu/bitops.h"
+
+/*
+ * hashtable relation copy from linux kernel jhash
+ */
+
+/* __jhash_mix -- mix 3 32-bit values reversibly. */
+#define __jhash_mix(a, b, c)                \
+{                                           \
+    a -= c;  a ^= rol32(c, 4);  c += b;     \
+    b -= a;  b ^= rol32(a, 6);  a += c;     \
+    c -= b;  c ^= rol32(b, 8);  b += a;     \
+    a -= c;  a ^= rol32(c, 16); c += b;     \
+    b -= a;  b ^= rol32(a, 19); a += c;     \
+    c -= b;  c ^= rol32(b, 4);  b += a;     \
+}
+
+/* __jhash_final - final mixing of 3 32-bit values (a,b,c) into c */
+#define __jhash_final(a, b, c)  \
+{                               \
+    c ^= b; c -= rol32(b, 14);  \
+    a ^= c; a -= rol32(c, 11);  \
+    b ^= a; b -= rol32(a, 25);  \
+    c ^= b; c -= rol32(b, 16);  \
+    a ^= c; a -= rol32(c, 4);   \
+    b ^= a; b -= rol32(a, 14);  \
+    c ^= b; c -= rol32(b, 24);  \
+}
+
+/* An arbitrary initial parameter */
+#define JHASH_INITVAL           0xdeadbeef
+
+#endif /* QEMU_JHASH_H__ */
--
2.7.4
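To see how these two macros are meant to be combined, here is a small self-contained sketch that hashes a connection key the same way connection_key_hash() in patch 4/7 does, using functions instead of macros. It is an illustration, not QEMU code: `struct conn_key`, `conn_key_init`, `conn_key_hash` and `conn_key_equal` are hypothetical names, and plain integers replace `struct in_addr`. Two details are assumptions worth flagging: the key is zeroed before being filled, so that a `memcmp()`-based equality check is not confused by padding bytes, and the port is cast to `uint32_t` before shifting so the shift happens on an unsigned value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define JHASH_INITVAL 0xdeadbeefu

/* Hypothetical simplified key; the patch uses struct in_addr for src/dst. */
struct conn_key {
    uint32_t src;
    uint32_t dst;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  ip_proto;
};

/* Rotate a 32-bit word left by shift bits (0 < shift < 32). */
static uint32_t rol32(uint32_t word, unsigned int shift)
{
    assert(shift > 0 && shift < 32);
    return (word << shift) | (word >> (32 - shift));
}

/* Function form of __jhash_mix(): mix three 32-bit values reversibly. */
static void jhash_mix(uint32_t *a, uint32_t *b, uint32_t *c)
{
    *a -= *c; *a ^= rol32(*c, 4);  *c += *b;
    *b -= *a; *b ^= rol32(*a, 6);  *a += *c;
    *c -= *b; *c ^= rol32(*b, 8);  *b += *a;
    *a -= *c; *a ^= rol32(*c, 16); *c += *b;
    *b -= *a; *b ^= rol32(*a, 19); *a += *c;
    *c -= *b; *c ^= rol32(*b, 4);  *b += *a;
}

/* Function form of __jhash_final(): final mixing of a, b, c into c. */
static void jhash_final(uint32_t *a, uint32_t *b, uint32_t *c)
{
    *c ^= *b; *c -= rol32(*b, 14);
    *a ^= *c; *a -= rol32(*c, 11);
    *b ^= *a; *b -= rol32(*a, 25);
    *c ^= *b; *c -= rol32(*b, 16);
    *a ^= *c; *a -= rol32(*c, 4);
    *b ^= *a; *b -= rol32(*a, 14);
    *c ^= *b; *c -= rol32(*b, 24);
}

/* Zero the whole key first so padding bytes compare equal in memcmp(). */
void conn_key_init(struct conn_key *key, uint32_t src, uint32_t dst,
                   uint16_t src_port, uint16_t dst_port, uint8_t ip_proto)
{
    memset(key, 0, sizeof(*key));
    key->src = src;
    key->dst = dst;
    key->src_port = src_port;
    key->dst_port = dst_port;
    key->ip_proto = ip_proto;
}

/* Same sequence as connection_key_hash(): seed, mix, add proto, final. */
uint32_t conn_key_hash(const struct conn_key *key)
{
    uint32_t a, b, c;

    a = b = c = JHASH_INITVAL + (uint32_t)sizeof(*key);
    a += key->src;
    b += key->dst;
    c += (uint32_t)key->src_port | ((uint32_t)key->dst_port << 16);
    jhash_mix(&a, &b, &c);

    a += key->ip_proto;
    jhash_final(&a, &b, &c);

    return c;
}

bool conn_key_equal(const struct conn_key *x, const struct conn_key *y)
{
    return memcmp(x, y, sizeof(*x)) == 0;
}
```

A hash and an equality function with these roles are what a GHashTable needs for a custom key type, which is how the series uses them for connection_track_table.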
[Qemu-devel] [RFC PATCH V7 6/7] colo-compare: introduce packet comparison thread
If primary packet is same with secondary packet, we will send primary packet and drop secondary packet, otherwise notify COLO frame to do checkpoint. If primary packet comes and secondary packet not, after REGULAR_PACKET_CHECK_MS milliseconds we set the primary packet as old_packet,then do a checkpoint. Signed-off-by: Zhang Chen Signed-off-by: Li Zhijian Signed-off-by: Wen Congyang --- net/colo-base.c| 1 + net/colo-base.h| 3 + net/colo-compare.c | 216 + trace-events | 2 + 4 files changed, 222 insertions(+) diff --git a/net/colo-base.c b/net/colo-base.c index 7e91dec..eb1b631 100644 --- a/net/colo-base.c +++ b/net/colo-base.c @@ -132,6 +132,7 @@ Packet *packet_new(const void *data, int size) pkt->data = g_memdup(data, size); pkt->size = size; +pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_HOST); return pkt; } diff --git a/net/colo-base.h b/net/colo-base.h index 0505608..06d6dca 100644 --- a/net/colo-base.h +++ b/net/colo-base.h @@ -17,6 +17,7 @@ #include "slirp/slirp.h" #include "qemu/jhash.h" +#include "qemu/timer.h" #define HASHTABLE_MAX_SIZE 16384 @@ -28,6 +29,8 @@ typedef struct Packet { }; uint8_t *transport_layer; int size; +/* Time of packet creation, in wall clock ms */ +int64_t creation_ms; } Packet; typedef struct ConnectionKey { diff --git a/net/colo-compare.c b/net/colo-compare.c index 5f87710..942e326 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -36,6 +36,8 @@ #define COMPARE_READ_LEN_MAX NET_BUFSIZE #define MAX_QUEUE_SIZE 1024 +/* TODO: Should be configurable */ +#define REGULAR_PACKET_CHECK_MS 3000 /* + CompareState ++ @@ -83,6 +85,10 @@ typedef struct CompareState { GQueue unprocessed_connections; /* proxy current hash size */ uint32_t hashtable_size; +/* compare thread, a thread for each NIC */ +QemuThread thread; +/* Timer used on the primary to find packets that are never matched */ +QEMUTimer *timer; } CompareState; typedef struct CompareClass { @@ -170,6 +176,112 @@ static int packet_enqueue(CompareState *s, int mode) return 0; 
} +/* + * The IP packets sent by primary and secondary + * will be compared in here + * TODO support ip fragment, Out-Of-Order + * return:0 means packet same + *> 0 || < 0 means packet different + */ +static int colo_packet_compare(Packet *ppkt, Packet *spkt) +{ +trace_colo_compare_ip_info(ppkt->size, inet_ntoa(ppkt->ip->ip_src), + inet_ntoa(ppkt->ip->ip_dst), spkt->size, + inet_ntoa(spkt->ip->ip_src), + inet_ntoa(spkt->ip->ip_dst)); + +if (ppkt->size == spkt->size) { +return memcmp(ppkt->data, spkt->data, spkt->size); +} else { +return -1; +} +} + +static int colo_packet_compare_all(Packet *spkt, Packet *ppkt) +{ +trace_colo_compare_main("compare all"); +return colo_packet_compare(ppkt, spkt); +} + +static void colo_old_packet_check_one(void *opaque_packet, + void *opaque_found) +{ +int64_t now; +bool *found_old = (bool *)opaque_found; +Packet *ppkt = (Packet *)opaque_packet; + +if (*found_old) { +/* Someone found an old packet earlier in the queue */ +return; +} + +now = qemu_clock_get_ms(QEMU_CLOCK_HOST); +if ((now - ppkt->creation_ms) > REGULAR_PACKET_CHECK_MS) { +trace_colo_old_packet_check_found(ppkt->creation_ms); +*found_old = true; +} +} + +static void colo_old_packet_check_one_conn(void *opaque, + void *user_data) +{ +bool found_old = false; +Connection *conn = opaque; + +g_queue_foreach(&conn->primary_list, colo_old_packet_check_one, +&found_old); +if (found_old) { +/* do checkpoint will flush old packet */ +/* TODO: colo_notify_checkpoint();*/ +} +} + +/* + * Look for old packets that the secondary hasn't matched, + * if we have some then we have to checkpoint to wake + * the secondary up. 
+ */ +static void colo_old_packet_check(void *opaque) +{ +CompareState *s = opaque; + +g_queue_foreach(&s->conn_list, colo_old_packet_check_one_conn, NULL); +} + +/* + * called from the compare thread on the primary + * for compare connection + */ +static void colo_compare_connection(void *opaque, void *user_data) +{ +CompareState *s = user_data; +Connection *conn = opaque; +Packet *pkt = NULL; +GList *result = NULL; +int ret; + +while (!g_queue_is_empty(&conn->primary_list) && + !g_queue_is_empty(&conn->secondary_list)) { +pkt = g_queue_pop_tail(&conn->primary_list); +result = g_queue_find_custom(&conn->secondary_list, + pkt, (GCompareFunc)colo_packet_compare_all); + +if (result) { +ret = compare_chr_send(s->chr_
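For readers following the logic of this patch, here is a minimal standalone sketch of the two checks it describes: a byte-for-byte comparison of equal-sized packets, and the age test against the REGULAR_PACKET_CHECK_MS threshold. It is not the patch's code. DemoPacket, demo_packet_compare() and demo_packet_is_old() are hypothetical names, and the current time is passed in explicitly instead of being read with qemu_clock_get_ms().

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Same value as REGULAR_PACKET_CHECK_MS in the patch. */
#define DEMO_PACKET_CHECK_MS 3000

/* Hypothetical stand-in for the patch's Packet struct. */
typedef struct DemoPacket {
    const uint8_t *data;   /* raw packet bytes */
    int size;              /* number of bytes in data */
    int64_t creation_ms;   /* time the packet was queued, in ms */
} DemoPacket;

/*
 * Returns 0 when both packets have the same size and the same bytes,
 * non-zero otherwise (mirrors the return convention in the patch).
 */
int demo_packet_compare(const DemoPacket *ppkt, const DemoPacket *spkt)
{
    if (ppkt->size != spkt->size) {
        return -1;
    }
    return memcmp(ppkt->data, spkt->data, (size_t)spkt->size);
}

/*
 * Returns true when the primary packet has waited longer than the
 * threshold without being matched, i.e. a checkpoint would be needed.
 */
bool demo_packet_is_old(const DemoPacket *ppkt, int64_t now_ms)
{
    return (now_ms - ppkt->creation_ms) > DEMO_PACKET_CHECK_MS;
}
```

Passing now_ms as a parameter is only a convenience for the sketch; it keeps the age test deterministic and easy to exercise outside QEMU.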
[Qemu-devel] [RFC PATCH V7 7/7] colo-compare: add TCP, UDP, ICMP packet comparison
We add TCP, UDP and ICMP packet comparison to replace the plain IP packet comparison. This increases the accuracy of the packet comparison: fewer checkpoints, more efficiency.
Signed-off-by: Zhang Chen
Signed-off-by: Li Zhijian
Signed-off-by: Wen Congyang
---
 net/colo-compare.c | 174 +++--
 trace-events       | 4 ++
 2 files changed, 174 insertions(+), 4 deletions(-)
diff --git a/net/colo-compare.c b/net/colo-compare.c index 942e326..9737ec6 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -18,6 +18,7 @@ #include "qapi/qmp/qerror.h" #include "qapi/error.h" #include "net/net.h" +#include "net/eth.h" #include "net/vhost_net.h" #include "qom/object_interfaces.h" #include "qemu/iov.h" @@ -197,9 +198,158 @@ static int colo_packet_compare(Packet *ppkt, Packet *spkt) } } -static int colo_packet_compare_all(Packet *spkt, Packet *ppkt) +/* + * called from the compare thread on the primary + * for compare tcp packet + * compare_tcp copied from Dr. David Alan Gilbert's branch + */ +static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt) +{ +struct tcphdr *ptcp, *stcp; +int res; +char *sdebug, *ddebug; + +trace_colo_compare_main("compare tcp"); +if (ppkt->size != spkt->size) { +if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) { +trace_colo_compare_main("pkt size not same"); +} +return -1; +} + +ptcp = (struct tcphdr *)ppkt->transport_layer; +stcp = (struct tcphdr *)spkt->transport_layer; + +if (ptcp->th_seq != stcp->th_seq) { +if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) { +trace_colo_compare_main("pkt tcp seq not same"); +} +return -1; +} + +/* + * The 'identification' field in the IP header is *very* random + * it almost never matches. Fudge this by ignoring differences in + * unfragmented packets; they'll normally sort themselves out if different + * anyway, and it should recover at the TCP level.
+ * An alternative would be to get both the primary and secondary to rewrite + * somehow; but that would need some sync traffic to sync the state + */ +if (ntohs(ppkt->ip->ip_off) & IP_DF) { +spkt->ip->ip_id = ppkt->ip->ip_id; +/* and the sum will be different if the IDs were different */ +spkt->ip->ip_sum = ppkt->ip->ip_sum; +} + +res = memcmp(ppkt->data + ETH_HLEN, spkt->data + ETH_HLEN, +(spkt->size - ETH_HLEN)); + +if (res != 0 && trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) { +sdebug = strdup(inet_ntoa(ppkt->ip->ip_src)); +ddebug = strdup(inet_ntoa(ppkt->ip->ip_dst)); +fprintf(stderr, "%s: src/dst: %s/%s p: seq/ack=%u/%u" +" s: seq/ack=%u/%u res=%d flags=%x/%x\n", __func__, + sdebug, ddebug, + ntohl(ptcp->th_seq), ntohl(ptcp->th_ack), + ntohl(stcp->th_seq), ntohl(stcp->th_ack), + res, ptcp->th_flags, stcp->th_flags); + +trace_colo_compare_tcp_miscompare("Primary len", ppkt->size); +qemu_hexdump((char *)ppkt->data, stderr, "colo-compare", ppkt->size); +trace_colo_compare_tcp_miscompare("Secondary len", spkt->size); +qemu_hexdump((char *)spkt->data, stderr, "colo-compare", spkt->size); + +g_free(sdebug); +g_free(ddebug); +} + +return res; +} + +/* + * called from the compare thread on the primary + * for compare udp packet + */ +static int colo_packet_compare_udp(Packet *spkt, Packet *ppkt) +{ +int ret; + +trace_colo_compare_main("compare udp"); +ret = colo_packet_compare(ppkt, spkt); + +if (ret) { +trace_colo_compare_udp_miscompare("primary pkt size", ppkt->size); +qemu_hexdump((char *)ppkt->data, stderr, "colo-compare", ppkt->size); +trace_colo_compare_udp_miscompare("Secondary pkt size", spkt->size); +qemu_hexdump((char *)spkt->data, stderr, "colo-compare", spkt->size); +} + +return ret; +} + +/* + * called from the compare thread on the primary + * for compare icmp packet + */ +static int colo_packet_compare_icmp(Packet *spkt, Packet *ppkt) { -trace_colo_compare_main("compare all"); +int network_length; +struct icmp *icmp_ppkt, *icmp_spkt; + 
+trace_colo_compare_main("compare icmp"); +network_length = ppkt->ip->ip_hl * 4; +if (ppkt->size != spkt->size || +ppkt->size < network_length + ETH_HLEN) { +return -1; +} +icmp_ppkt = (struct icmp *)(ppkt->data + network_length + ETH_HLEN); +icmp_spkt = (struct icmp *)(spkt->data + network_length + ETH_HLEN); + +if ((icmp_ppkt->icmp_type == icmp_spkt->icmp_type) && +(icmp_ppkt->icmp_code == icmp_spkt->icmp_code)) { +if (icmp_ppkt->icmp_type == ICMP_REDIRECT) { +if (icmp_ppkt->icmp_gwaddr.s_addr != +icmp_spkt->icmp_gwaddr.s_addr) { +trace_colo_compare
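The TCP comparison in this patch ignores the IP identification field (and the header checksum that depends on it) when the primary packet has the "don't fragment" flag set. A rough standalone sketch of that idea follows. It is an illustration under stated assumptions, not the patch's code: it works on raw bytes starting at the IPv4 header, it skips the two fields instead of overwriting them in the secondary packet as the patch does, and demo_ipv4_equal_ignoring_id() and the DEMO_* constants are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_IPV4_MIN_HLEN 20      /* IPv4 header without options */
#define DEMO_IP_DF         0x4000  /* "don't fragment" bit, flags/offset field */

/* Read a 16-bit big-endian (network order) value. */
static uint16_t demo_read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Compare two equal-length buffers that each start at an IPv4 header.
 * If the primary packet has DF set, bytes 4-5 (identification) and
 * bytes 10-11 (header checksum) are excluded from the comparison.
 * Otherwise every byte must match.
 */
bool demo_ipv4_equal_ignoring_id(const uint8_t *pip, const uint8_t *sip,
                                 size_t len)
{
    if (len < DEMO_IPV4_MIN_HLEN) {
        return false;
    }
    if (!(demo_read_be16(pip + 6) & DEMO_IP_DF)) {
        return memcmp(pip, sip, len) == 0;
    }
    /* Compare bytes 0-3, 6-9 and 12..end; skip id and checksum. */
    return memcmp(pip, sip, 4) == 0 &&
           memcmp(pip + 6, sip + 6, 4) == 0 &&
           memcmp(pip + 12, sip + 12, len - 12) == 0;
}
```

Skipping the fields leaves both buffers untouched, which may be easier to reason about than rewriting the secondary packet, at the cost of three memcmp() calls instead of one.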
[Qemu-devel] [RFC PATCH V7 5/7] qemu-char: Fix context for g_source_attach()
We want to poll and handle the chardev in a thread other than the main loop. But qemu_chr_add_handlers() only works with the global default context, not with a thread-default context. So we replace g_source_attach(xx, NULL) with g_source_attach(xx, g_main_context_get_thread_default()) when attaching each GSource. When no thread-default context has been pushed, g_main_context_get_thread_default() returns NULL, so callers running in the main loop still attach to the global default context. Comments from Jason.
Signed-off-by: Zhang Chen
Signed-off-by: Jason Wang
---
 io/channel.c | 2 +-
 qemu-char.c  | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/io/channel.c b/io/channel.c index 692eb17..cd25677 100644 --- a/io/channel.c +++ b/io/channel.c @@ -146,7 +146,7 @@ guint qio_channel_add_watch(QIOChannel *ioc, g_source_set_callback(source, (GSourceFunc)func, user_data, notify); -id = g_source_attach(source, NULL); +id = g_source_attach(source, g_main_context_get_thread_default()); g_source_unref(source); return id;
diff --git a/qemu-char.c b/qemu-char.c index b597ee1..ed7e1f7 100644 --- a/qemu-char.c +++ b/qemu-char.c @@ -860,7 +860,7 @@ static gboolean io_watch_poll_prepare(GSource *source, gint *timeout_) iwp->src = qio_channel_create_watch( iwp->ioc, G_IO_IN | G_IO_ERR | G_IO_HUP | G_IO_NVAL); g_source_set_callback(iwp->src, iwp->fd_read, iwp->opaque, NULL); -g_source_attach(iwp->src, NULL); +g_source_attach(iwp->src, g_main_context_get_thread_default()); } else { g_source_destroy(iwp->src); g_source_unref(iwp->src); @@ -919,7 +919,7 @@ static guint io_add_watch_poll(QIOChannel *ioc, iwp->fd_read = (GSourceFunc) fd_read; iwp->src = NULL; -tag = g_source_attach(&iwp->parent, NULL); +tag = g_source_attach(&iwp->parent, g_main_context_get_thread_default()); g_source_unref(&iwp->parent); return tag; } @@ -3983,7 +3983,7 @@ int qemu_chr_fe_add_watch(CharDriverState *s, GIOCondition cond, } g_source_set_callback(src, (GSourceFunc)func, user_data, NULL); -tag = g_source_attach(src, NULL); +tag = g_source_attach(src, g_main_context_get_thread_default()); g_source_unref(src); return tag;
--
2.7.4
[Qemu-devel] [RFC PATCH V7 2/7] colo-base: add colo-base to define and handle packet
COLO-base is used by colo-compare and filter-rewriter. It lets them share common data structures, such as the net packet, as well as other helper functions.
Signed-off-by: Zhang Chen
Signed-off-by: Li Zhijian
Signed-off-by: Wen Congyang
---
 net/Makefile.objs  | 1 +
 net/colo-base.c    | 74 +
 net/colo-base.h    | 38 +
 net/colo-compare.c | 119 -
 trace-events       | 3 ++
 5 files changed, 233 insertions(+), 2 deletions(-)
 create mode 100644 net/colo-base.c
 create mode 100644 net/colo-base.h
diff --git a/net/Makefile.objs b/net/Makefile.objs index ba92f73..119589f 100644 --- a/net/Makefile.objs +++ b/net/Makefile.objs @@ -17,3 +17,4 @@ common-obj-y += filter.o common-obj-y += filter-buffer.o common-obj-y += filter-mirror.o common-obj-y += colo-compare.o +common-obj-y += colo-base.o
diff --git a/net/colo-base.c b/net/colo-base.c new file mode 100644 index 000..f5d5de9 --- /dev/null +++ b/net/colo-base.c @@ -0,0 +1,74 @@ +/* + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO) + * (a.k.a. Fault Tolerance or Continuous Replication) + * + * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD. + * Copyright (c) 2016 FUJITSU LIMITED + * Copyright (c) 2016 Intel Corporation + * + * Author: Zhang Chen + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * later. See the COPYING file in the top-level directory.
+ */ + +#include "qemu/osdep.h" +#include "qemu/error-report.h" +#include "net/colo-base.h" + +int parse_packet_early(Packet *pkt) +{ +int network_length; +uint8_t *data = pkt->data; +uint16_t l3_proto; +ssize_t l2hdr_len = eth_get_l2_hdr_length(data); + +if (pkt->size < ETH_HLEN) { +error_report("pkt->size < ETH_HLEN"); +return 1; +} +pkt->network_layer = data + ETH_HLEN; +l3_proto = eth_get_l3_proto(data, l2hdr_len); +if (l3_proto != ETH_P_IP) { +return 1; +} + +network_length = pkt->ip->ip_hl * 4; +if (pkt->size < ETH_HLEN + network_length) { +error_report("pkt->size < network_layer + network_length"); +return 1; +} +pkt->transport_layer = pkt->network_layer + network_length; +if (!pkt->transport_layer) { +error_report("pkt->transport_layer is valid"); +return 1; +} + +return 0; +} + +Packet *packet_new(const void *data, int size) +{ +Packet *pkt = g_slice_new(Packet); + +pkt->data = g_memdup(data, size); +pkt->size = size; + +return pkt; +} + +void packet_destroy(void *opaque, void *user_data) +{ +Packet *pkt = opaque; + +g_free(pkt->data); +g_slice_free(Packet, pkt); +} + +/* + * Clear hashtable, stop this hash growing really huge + */ +void connection_hashtable_reset(GHashTable *connection_track_table) +{ +g_hash_table_remove_all(connection_track_table); +} diff --git a/net/colo-base.h b/net/colo-base.h new file mode 100644 index 000..48835e7 --- /dev/null +++ b/net/colo-base.h @@ -0,0 +1,38 @@ +/* + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO) + * (a.k.a. Fault Tolerance or Continuous Replication) + * + * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD. + * Copyright (c) 2016 FUJITSU LIMITED + * Copyright (c) 2016 Intel Corporation + * + * Author: Zhang Chen + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * later. See the COPYING file in the top-level directory. 
+ */ + +#ifndef QEMU_COLO_BASE_H +#define QEMU_COLO_BASE_H + +#include "slirp/slirp.h" +#include "qemu/jhash.h" + +#define HASHTABLE_MAX_SIZE 16384 + +typedef struct Packet { +void *data; +union { +uint8_t *network_layer; +struct ip *ip; +}; +uint8_t *transport_layer; +int size; +} Packet; + +int parse_packet_early(Packet *pkt); +void connection_hashtable_reset(GHashTable *connection_track_table); +Packet *packet_new(const void *data, int size); +void packet_destroy(void *opaque, void *user_data); + +#endif /* QEMU_COLO_BASE_H */ diff --git a/net/colo-compare.c b/net/colo-compare.c index 0402958..7c52cc8 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -27,13 +27,38 @@ #include "sysemu/char.h" #include "qemu/sockets.h" #include "qapi-visit.h" +#include "net/colo-base.h" +#include "trace.h" #define TYPE_COLO_COMPARE "colo-compare" #define COLO_COMPARE(obj) \ OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE) #define COMPARE_READ_LEN_MAX NET_BUFSIZE +#define MAX_QUEUE_SIZE 1024 +/* + + CompareState ++ + | | + +---+ +---+ +---+ + |conn list +--->conn +->conn | + +---+ +---+ +---+ + | | | | | | + +---+ +---v+ +---v++---v+ +---v+ +|primary | |secondary|primary | |secondary +
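As a reading aid for parse_packet_early() in this patch, the following is a simplified standalone sketch of the same bounds checks: Ethernet header present, EtherType is IPv4, and the buffer is long enough for the IP header length the packet claims. It is only an approximation under stated assumptions. It handles untagged frames only (no VLAN), it reports offsets instead of pointers, and DemoLayers, demo_parse_packet_early() and the DEMO_* constants are hypothetical names rather than QEMU APIs.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_ETH_HLEN      14      /* untagged Ethernet header length */
#define DEMO_ETH_P_IP      0x0800  /* EtherType for IPv4 */
#define DEMO_IPV4_MIN_HLEN 20      /* IPv4 header without options */

/* Offsets of the network and transport headers inside the frame. */
typedef struct DemoLayers {
    size_t network_off;
    size_t transport_off;
} DemoLayers;

/*
 * Returns 0 and fills *out for a well-formed untagged IPv4 frame,
 * returns 1 otherwise (same convention as the patch).
 */
int demo_parse_packet_early(const uint8_t *data, size_t size,
                            DemoLayers *out)
{
    uint16_t l3_proto;
    size_t ip_hlen;

    /* Check the length before touching any header byte. */
    if (size < DEMO_ETH_HLEN) {
        return 1;
    }
    l3_proto = (uint16_t)((data[12] << 8) | data[13]);
    if (l3_proto != DEMO_ETH_P_IP) {
        return 1;
    }
    if (size < DEMO_ETH_HLEN + DEMO_IPV4_MIN_HLEN) {
        return 1;
    }
    /* The low nibble of the first IP byte is the header length in words. */
    ip_hlen = (size_t)(data[DEMO_ETH_HLEN] & 0x0f) * 4;
    if (ip_hlen < DEMO_IPV4_MIN_HLEN || size < DEMO_ETH_HLEN + ip_hlen) {
        return 1;
    }
    out->network_off = DEMO_ETH_HLEN;
    out->transport_off = DEMO_ETH_HLEN + ip_hlen;
    return 0;
}
```

In this sketch every length check runs before the corresponding bytes are read, so a short buffer is rejected without being dereferenced.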
[Qemu-devel] [RFC PATCH V7 1/7] colo-compare: introduce colo compare initialization
This is a COLO net ASCII figure:
[ASCII figure lost in archiving. It shows Primary qemu and Secondary qemu side by side, each with a guest on top of a netfilter chain. Primary chain: filter mirror, filter redirector, colo compare. Secondary chain: filter redirector, TCP rewriter (adjust ack, adjust seq), filter redirector. Arrows mark the filter execute order, the tx/rx/all directions, "guest receive" and "guest send", with a tap device at the bottom.]
NOTE: filter direction is rx/tx/all
rx: receive packets sent to the netdev
tx: receive packets sent by the netdev
In COLO-compare:
Packets coming from the primary char indev will be sent to outdev.
Packets coming from the secondary char dev will be dropped.
colo-compare needs two input chardevs and one output chardev:
primary_in=chardev1-id
secondary_in=chardev2-id
outdev=chardev3-id
usage:
primary: -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device