date:20160717

Re: [Qemu-devel] OUT_ASM on two different systems

2016-07-17 Thread Paolo Bonzini



On 17/07/2016 04:06, Ayaz Akram wrote:
> Hi all !
> 
> I ran a program with qemu in user mode emulation and generated trace for
> generated host instructions using (-d OUT_ASM) on two different linux
> systems. I expected that the addresses in two trace files can be different.
> But the total number of lines in two files is different as well. I mean the
> generated host instructions in two files are different (I have not yet
> looked into details of those differences). Qemu and program's binary are
> exactly same on both systems. I wonder if someone can help me in explaining
> this ?
> 
> Thanks for your time !
> 

It's difficult to answer your question without also seeing an example of
those differences.

Paolo

[Qemu-devel] [Bug 1603693] Re: Disks in mptsas1068 scsi controller not seen by linux

2016-07-17 Thread Paolo Bonzini

Linux requires that you specify a WWN for the disk (through the wwn
property of the scsi-disk device).

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1603693

Title:
  Disks in mptsas1068 scsi controller not seen by linux

Status in QEMU:
  New

Bug description:
  When using the mptsas1068 scsi controller, linux detects the
  controller itself but not the drives attached to it. Freebsd works.
  Using a different controller with linux works. VMware with linux
  works.

  qemu 2.6.50 (v2.6.0-1925-g6b92bbf)
  seabios rel-1.9.0-139-gae3f78f (master branch, required for mptsas1068 support)

  Test script, loosely based off what libvirt runs and the libvirt tests
  that Paolo Bonzini wrote [1]

  #
  iso=archlinux-2016.07.01-dual.iso
  #iso=FreeBSD-10.3-RELEASE-amd64-bootonly.iso
  device=mptsas1068
  #device=lsi

  img=empty.img
  qemu-img create -f qcow2 $img 1G

  /usr/bin/qemu-system-x86_64 \
  -enable-kvm \
  -m 1024 \
  -boot menu=on \
  -device $device,id=scsi0,bus=pci.0,addr=0x9 \
  -drive file=$img,format=qcow2,if=none,id=drive-scsi0-0-0-0 \
  -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 \
  -drive file=$iso,format=raw,if=none,id=drive-ide0-0-1,readonly=on \
  -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=1
  #

  The ISOs can be downloaded from [2] and [3].

  After booting linux, do "lsblk". /dev/sda should exist.

  After booting freebsd, do "geom disk list". A da0 / "QEMU QEMU
  HARDDISK" should be mentioned.

  With device=mptsas1068 this fails in linux.

  With device=lsi line it works in both.

  With VMWare and a linux VM (opensuse 10.1, kernel 2.6.18) which only
  loads modules for mptsas1068, this works.

  I also reproduced this with the debian 8.5 netinstall image, but it
  insists in making you pick a driver from a list of modules when it
  fails to mount it, instead of dropping to a shell.

  Arch linux dmesg output snippet (full output attached as arch-linux-
  dmesg.txt):

  #
  root@archiso ~ # dmesg | grep -i -e mpt -e scsi -e ioc0
  [0.00] Linux version 4.6.3-1-ARCH (builduser@tobias) (gcc version 6.1.1 20160602 (GCC) ) #1 SMP PREEMPT Fri Jun 24 21:19:13 CEST 2016
  [0.00]   Normal   empty
  [0.00] Preemptible hierarchical RCU implementation.
  [1.879616] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249)
  [1.951581] SCSI subsystem initialized
  [1.957113] Fusion MPT base driver 3.04.20
  [1.957618] Fusion MPT SAS Host driver 3.04.20
  [2.281773] scsi host0: ata_piix
  [2.285372] scsi host1: ata_piix
  [2.305803] mptbase: ioc0: Initiating bringup
  [2.363555] ioc0: LSISAS1068 A0: Capabilities={Initiator}
  [2.444390] scsi 0:0:1:0: CD-ROM QEMU QEMU DVD-ROM 2.5+ PQ: 0 ANSI: 5
  [2.500572] scsi host2: ioc0: LSISAS1068 A0, FwRev=01329200h, Ports=8, MaxQ=128, IRQ=11
  [2.507024] sr 0:0:1:0: [sr0] scsi3-mmc drive: 4x/4x cd/rw xa/form2 tray
  [2.507274] sr 0:0:1:0: Attached scsi CD-ROM sr0
  #

  The controller itself is detected, the disk isn't.

  An early version of this patch [4] said that it was only tested with
  FreeBSD:

  >Tested with FreeBSD for now.  The previous version (before the
  >configuration page rewrite) worked with RHEL and Windows guests as well.
  >
  >TODO: write qtest for (at least) config pages, test Linux and Windows.

  [1]: 
https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=fc922eb2080a3fa7b24bc8a8b0aabfd394480143
  [2]: https://www.archlinux.org/download
  [3]: https://www.freebsd.org/where.html
  [4]: https://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg06475.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1603693/+subscriptions

[Qemu-devel] [Bug 1603734] [NEW] Hang in fsqrt

2016-07-17 Thread Robert Femmer

Public bug reported:

At least qemu-i386 and qemu-x86_64 hang in floatx80_sqrt in versions
2.6.0 and git (2.6.50) for some input values, likely due to an infinite
loop at fpu/softfloat.c:6569.

Steps to reproduce:
1) Compile attached code: gcc -o test test.c -lm
2) `qemu-i386 test` and `qemu-x86_64 test` will hang at 100% cpu

** Affects: qemu
 Importance: Undecided
 Status: New

** Attachment added: "minimal example"
   https://bugs.launchpad.net/bugs/1603734/+attachment/4702260/+files/test.c

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1603734

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1603734/+subscriptions

Re: [Qemu-devel] [v9 00/19] QEMU:Xen stubdom vTPM for HVM virtual machine(QEMU Part)

2016-07-17 Thread Quan Xu


On 2016 Jul 14 (Thu) 23:34, Stefano Stabellini wrote:
> Hi Quan,
> 
> thanks for CC'ing me. sstabell...@kernel.org is the right address to
> reach me now.
>
> I am also CC'ing Anthony Perard who is Xen co-maintainer in QEMU.
> 
> Cheers,
>
> Stefano

Thanks in advance!! :):)
Quan

Re: [Qemu-devel] [Xen-devel] [PATCH 01/19] xen: Create a new file xen_pvdev.c

2016-07-17 Thread Quan Xu


[Quan:]: comment starts with [Quan:]


The purpose of the new file is to store generic functions shared by frontend and
backends, such as xenstore operations and xendevs.

Signed-off-by: Quan Xu 
Signed-off-by: Emil Condrea 
---
 hw/xen/Makefile.objs |   2 +-
 hw/xen/xen_backend.c | 125 +---
 hw/xen/xen_pvdev.c   | 149 +++
 include/hw/xen/xen_backend.h |  63 +-
 include/hw/xen/xen_pvdev.h   |  71 +
 5 files changed, 223 insertions(+), 187 deletions(-)
 create mode 100644 hw/xen/xen_pvdev.c
 create mode 100644 include/hw/xen/xen_pvdev.h

diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs
index d367094..591cdc2 100644
--- a/hw/xen/Makefile.objs
+++ b/hw/xen/Makefile.objs
@@ -1,5 +1,5 @@
 # xen backend driver support
-common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o
+common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o xen_pvdev.o
 
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_graphics.o xen_pt_msi.o
diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index bab79b1..a251a4a 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -30,6 +30,7 @@
 #include "sysemu/char.h"
 #include "qemu/log.h"
 #include "hw/xen/xen_backend.h"
+#include "hw/xen/xen_pvdev.h"
 
 #include 
 
@@ -56,8 +57,6 @@ static QTAILQ_HEAD(xs_dirs_head, xs_dirs) xs_cleanup =
 static QTAILQ_HEAD(XenDeviceHead, XenDevice) xendevs = QTAILQ_HEAD_INITIALIZER(xendevs);
 static int debug = 0;
 
-/* - */
-
 static void xenstore_cleanup_dir(char *dir)
 {
 struct xs_dirs *d;
@@ -76,34 +75,6 @@ void xen_config_cleanup(void)
 }
 }
 
-int xenstore_write_str(const char *base, const char *node, const char *val)
-{
-char abspath[XEN_BUFSIZE];
-
-snprintf(abspath, sizeof(abspath), "%s/%s", base, node);
-if (!xs_write(xenstore, 0, abspath, val, strlen(val))) {
-return -1;
-}
-return 0;
-}
-
-char *xenstore_read_str(const char *base, const char *node)
-{
-char abspath[XEN_BUFSIZE];
-unsigned int len;
-char *str, *ret = NULL;
-
-snprintf(abspath, sizeof(abspath), "%s/%s", base, node);
-str = xs_read(xenstore, 0, abspath, &len);
-if (str != NULL) {
-/* move to qemu-allocated memory to make sure
- * callers can savely g_free() stuff. */
-ret = g_strdup(str);
-free(str);
-}
-return ret;
-}
-
 int xenstore_mkdir(char *path, int p)
 {
 struct xs_permissions perms[2] = {
@@ -128,48 +99,6 @@ int xenstore_mkdir(char *path, int p)
 return 0;
 }
 
-int xenstore_write_int(const char *base, const char *node, int ival)
-{
-char val[12];
-
[Quan:]: why 12 ? what about XEN_BUFSIZE? 
-snprintf(val, sizeof(val), "%d", ival);
-return xenstore_write_str(base, node, val);
-}
-
-int xenstore_write_int64(const char *base, const char *node, int64_t ival)
-{
-char val[21];
-
[Quan:]: why 21 ? what about XEN_BUFSIZE?

-snprintf(val, sizeof(val), "%"PRId64, ival);
-return xenstore_write_str(base, node, val);
-}
-
-int xenstore_read_int(const char *base, const char *node, int *ival)
-{
-char *val;
-int rc = -1;
-
-val = xenstore_read_str(base, node);
[Quan:]: IMO, it is better to initialize val at its declaration. The same
comment applies to the other 'val' variables.
-if (val && 1 == sscanf(val, "%d", ival)) {
-rc = 0;
-}
-g_free(val);
-return rc;
-}
-
-int xenstore_read_uint64(const char *base, const char *node, uint64_t *uval)
-{
-char *val;
-int rc = -1;
-
-val = xenstore_read_str(base, node);
-if (val && 1 == sscanf(val, "%"SCNu64, uval)) {
-rc = 0;
-}
-g_free(val);
-return rc;
-}
-
 int xenstore_write_be_str(struct XenDevice *xendev, const char *node, const char *val)
 {
 return xenstore_write_str(xendev->be, node, val);
@@ -212,20 +141,6 @@ int xenstore_read_fe_uint64(struct XenDevice *xendev, const char *node, uint64_t
 
 /* - */
 
-const char *xenbus_strstate(enum xenbus_state state)
-{
-static const char *const name[] = {
-[ XenbusStateUnknown  ] = "Unknown",
-[ XenbusStateInitialising ] = "Initialising",
-[ XenbusStateInitWait ] = "InitWait",
-[ XenbusStateInitialised  ] = "Initialised",
-[ XenbusStateConnected] = "Connected",
-[ XenbusStateClosing  ] = "Closing",
-[ XenbusStateClosed   ] = "Closed",
-};
-return (state < ARRAY_SIZE(name)) ? name[state] : "INVALID";
-}
-
 int xen_be_set_state(struct XenDevice *xendev, enum xenbus_state state)
 {
 int rc;
@@ -833,44 +748,6 @@ int xen_be_send_notify(struct XenDevice *xendev)
 return xenevtchn_notify(xendev->evtchndev, xendev->local_port);
 }
 
-/*
- * msg_level:
- *  0
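
On the two buffer-size questions raised in the [Quan:] comments above (val[12] and val[21]): a minimal sketch, assuming a 32-bit int, of why those sizes are enough for the longest decimal strings that "%d" and "%"PRId64 can produce. The helper names are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers, not part of the patch: return the buffer size
 * (including the trailing NUL) needed to print the longest value of each
 * type, which is always the most negative one. */
static int decimal_bufsize_int(void)
{
    /* snprintf(NULL, 0, ...) writes nothing and reports the needed length */
    return snprintf(NULL, 0, "%d", INT_MIN) + 1;
}

static int decimal_bufsize_int64(void)
{
    return snprintf(NULL, 0, "%" PRId64, INT64_MIN) + 1;
}

/* Checks the sizes used by xenstore_write_int() and xenstore_write_int64() */
static void check_patch_buffer_sizes(void)
{
    assert(decimal_bufsize_int() <= 12);    /* "-2147483648" + NUL */
    assert(decimal_bufsize_int64() <= 21);  /* "-9223372036854775808" + NUL */
}
```

Under that assumption the small fixed arrays are exact fits, so using XEN_BUFSIZE would only be a matter of style, not of correctness.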

[Qemu-devel] [Bug 1603779] [NEW] AC97 can allocate ~500MB of host RAM

2016-07-17 Thread Andrew Henderson

Public bug reported:

While working with qtest test cases generated via fuzzing with QEMU
2.5.0, I discovered some odd behavior for the AC97 virtual device with
qemu-system-i386. If AC97_MIC_ADC_RATE is set to the value of 1, the
QEMU process allocates over 500MB of additional host RAM. You probably
would not normally notice this on a modern PC, except that I was using a
"ulimit" command to restrict the maximum amount of virtual memory
allowed for the QEMU process, so the process would crash with a SIGTRAP
(signal 5) on the failed memory allocation.

My minimized qtest code to reproduce the issue is:

static void test_crash(void)
{
  uint64_t barsize;
  dev = get_device();

  dev_base[0] = qpci_iomap(dev, 0, &barsize);
  dev_base[1] = qpci_iomap(dev, 1, &barsize);
  qpci_device_enable(dev);
  qpci_io_writew(dev, dev_base[0]+0x32, 0x0001);
} 

I ran a "ulimit -sv 65" command and then launched the
tests/ac97-test binary with this crash test case included in it. I can
then see the QEMU process crash on an allocation of 722538464 bytes. I
can gradually increase the ulimit memory limit to ~120 and then no
longer see the issue, hence my estimate of 500 MB of RAM allocated by
the device.

** Affects: qemu
 Importance: Undecided
 Status: New


** Tags: ac97

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1603779

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1603779/+subscriptions

[Qemu-devel] [Bug 1603785] [NEW] trace_usb_port_attach prints junk data

2016-07-17 Thread Thomas

Public bug reported:

Running qemu with tracing (-D ~/qemu_trace -d trace:\*) will result in a
trace file with unprintable characters.

example: usb_port_attach bus 0, port 1, devspeed <90>l.U,
portspeed full+high

The problem is in hw/usb/bus.c usb_mask_to_str. If speedmask doesn't
match any of the defined speed nothing is written to *dest and
uninitialized data is printed to the log.
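
A minimal sketch of one way to avoid the junk output, assuming the function's job is to join the names of the set speed bits with "+". The bit values, the function name, the explicit size parameter and the "none" fallback are all hypothetical and are not taken from hw/usb/bus.c. The point is only that dest is terminated before any bit is tested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical speed bits for illustration; QEMU defines its own. */
enum {
    SPEED_MASK_LOW   = 1 << 0,
    SPEED_MASK_FULL  = 1 << 1,
    SPEED_MASK_HIGH  = 1 << 2,
    SPEED_MASK_SUPER = 1 << 3,
};

/* Writes e.g. "full+high" into dest; dest is always NUL-terminated. */
static void speed_mask_to_str(char *dest, size_t size, unsigned int speedmask)
{
    static const struct {
        unsigned int mask;
        const char *name;
    } speeds[] = {
        { SPEED_MASK_LOW,   "low"   },
        { SPEED_MASK_FULL,  "full"  },
        { SPEED_MASK_HIGH,  "high"  },
        { SPEED_MASK_SUPER, "super" },
    };
    size_t pos = 0;
    size_t i;

    assert(dest != NULL && size > 0);
    dest[0] = '\0';                 /* defined content even if nothing matches */

    for (i = 0; i < sizeof(speeds) / sizeof(speeds[0]); i++) {
        int n;

        if (!(speedmask & speeds[i].mask)) {
            continue;
        }
        n = snprintf(dest + pos, size - pos, "%s%s",
                     pos ? "+" : "", speeds[i].name);
        if (n < 0 || (size_t)n >= size - pos) {
            return;                 /* stop on error or truncation */
        }
        pos += (size_t)n;
    }

    if (pos == 0) {
        snprintf(dest, size, "none");   /* no known speed bit was set */
    }
    assert(strlen(dest) < size);
}
```

With this shape, a speedmask of 0 prints a fixed word instead of whatever happened to be on the stack.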

This happens with a real usb device that is forwarded into the machine.

My qemu version is 2.6.0 but it looks like the problem exists in latest
git also.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1603785

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1603785/+subscriptions

[Qemu-devel] [PATCH V5 0/7] pxb: fix 64-bit MMIO allocation

2016-07-17 Thread Marcel Apfelbaum


v4 -> v5:
  Addressed the pull request issues: (Peter Maydell)
  See: https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00882.html
  - clang warning -> "hw/pci/pci.c:196:23: runtime error: shift exponent -1 is negative":
    The PCIe Root port was not initialized properly, the interrupt pin was left 0.
    This is a long standing issue exposed by the new test. (Patch 1/7)
  - 'make check' fails on 32-bit:
 Fix it by changing the ivshmem mem size from 4G
 to 1G, since 4G is not a valid value on 32-bit archs. (Patch 2/7)
 (4G is truncated to 0 on 32-bit systems)
  - Rebased on mst's pci branch.
  Since all the new changes are not related to the series, I kept the existing
  "Reviewed-by"/"Tested-by" signatures.

v3 -> v4:
 Addressed Igor's comments (thanks for the productive review!)
 - Split pxb test patch (previously patch 3/3) into the test itself (patch 1/6)
   and the blobs (patch 6/6).
 - New patch declaring pxb/pxb-pcie as not hot-pluggable.
   - Note that it does not solve the DSDT issue, but it is a prerequisite for
     the next patch.
 - New patch solving the DSDT issue spotted by Igor.
 - Using V=1 DIFF=diff make check does make it easier to review the ACPI
   changes, thanks.
 - Patches 4 and 5 untouched (previously patches 1/3 and 2/3)

v2 -> v3:
 - split original series "pci: better support for 64-bit MMIO allocation" into
   2 series:
   - this is the first part dealing with correct 64-bit MMIO ACPI computation
   - the second one will include 64-bit MMIO reservation for PCI hotplug
 - Add pxb/pxb-pcie tests (Igor) - See diffs below (*)
 - Re-based on latest master.
   
v1 -> v2:
 - resolved some styling issues (Laszlo)
 - rebase on latest master (Laszlo)



64-bit BARs allocations fix for devices behind PXBs/PXB-PCIEs.

In build_crs() the calculation and merging of the ranges already happens
in 64-bit, but the entry boundaries are silently truncated to 32-bit in the
call to aml_dword_memory(). Fix it by handling the 64-bit MMIO ranges 
separately.
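
A minimal sketch of the truncation described above, assuming that a DWord descriptor keeps only the low 32 bits of each boundary. Both helper names are hypothetical and are not the code of this series. They only show why ranges above 4G need a separate, 64-bit list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What a 32-bit (DWord) descriptor field would keep of a 64-bit value. */
static uint32_t dword_field(uint64_t value)
{
    return (uint32_t)value;     /* the upper 32 bits are silently dropped */
}

/* Hypothetical predicate: can [base, limit] be described with 32-bit
 * fields, or does it belong in a separate 64-bit range list? */
static bool range_fits_dword(uint64_t base, uint64_t limit)
{
    assert(base <= limit);
    return limit <= UINT32_MAX;
}
```

For example, a BAR placed at 0x100000000 would be reported as starting at 0 if it were pushed through the 32-bit path.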


Thank you,
Marcel

Marcel Apfelbaum (7):
  hw/pcie-root-port: Fix PCIe root port initialization
  tests/acpi: add pxb/pxb-pcie tests
  hw/pxb: declare pxb devices as not hot-pluggable
  hw/acpi: fix a DSDT table issue when a pxb is present.
  acpi: refactor pxb crs computation
  hw/apci: handle 64-bit MMIO regions correctly
  tests/acpi: Add pxb/pxb-pcie tests blobs

 hw/i386/acpi-build.c   | 131 -
 hw/pci-bridge/ioh3420.c|   1 +
 hw/pci-bridge/pci_expander_bridge.c|   2 +
 tests/acpi-test-data/pc/DSDT.pxb   | Bin 0 -> 6286 bytes
 tests/acpi-test-data/q35/DSDT.pxb_pcie | Bin 0 -> 9098 bytes
 tests/bios-tables-test.c   |  37 ++
 6 files changed, 135 insertions(+), 36 deletions(-)
 create mode 100644 tests/acpi-test-data/pc/DSDT.pxb
 create mode 100644 tests/acpi-test-data/q35/DSDT.pxb_pcie

-- 
2.4.3

[Qemu-devel] [PATCH V5 1/7] hw/pcie-root-port: Fix PCIe root port initialization

2016-07-17 Thread Marcel Apfelbaum

Specify the root port interrupt pin as part of the init
process for cases when msi/msix are not enabled.

Fixes "hw/pci/pci.c:196:23: runtime error: shift exponent -1 is negative"
warning from clang's sanitizer.

Reported-by: Peter Maydell 
Signed-off-by: Marcel Apfelbaum 
---
 hw/pci-bridge/ioh3420.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/pci-bridge/ioh3420.c b/hw/pci-bridge/ioh3420.c
index 93c6f0b..d88cae5 100644
--- a/hw/pci-bridge/ioh3420.c
+++ b/hw/pci-bridge/ioh3420.c
@@ -100,6 +100,7 @@ static int ioh3420_initfn(PCIDevice *d)
 int rc;
 Error *err = NULL;
 
+pci_config_set_interrupt_pin(d->config, 1);
 pci_bridge_initfn(d, TYPE_PCIE_BUS);
 pcie_port_init_reg(d);
 
-- 
2.4.3

[Qemu-devel] [PATCH V5 5/7] acpi: refactor pxb crs computation

2016-07-17 Thread Marcel Apfelbaum

Instead of always passing both IO and MEM ranges when
computing CRS ranges, define a new CrsRangeSet structure
that includes them both.

This is done before introducing a third type of range,
64-bit MEM, so it will be easier to pass them all around.

Reviewed-by: Igor Mammedov 
Signed-off-by: Marcel Apfelbaum 
Tested-by: Laszlo Ersek 
---
 hw/i386/acpi-build.c | 81 
 1 file changed, 50 insertions(+), 31 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 5ed2bbd..d8b3543 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -748,6 +748,23 @@ static void crs_range_free(gpointer data)
 g_free(entry);
 }
 
+typedef struct CrsRangeSet {
+GPtrArray *io_ranges;
+GPtrArray *mem_ranges;
+ } CrsRangeSet;
+
+static void crs_range_set_init(CrsRangeSet *range_set)
+{
+range_set->io_ranges = g_ptr_array_new_with_free_func(crs_range_free);
+range_set->mem_ranges = g_ptr_array_new_with_free_func(crs_range_free);
+}
+
+static void crs_range_set_free(CrsRangeSet *range_set)
+{
+g_ptr_array_free(range_set->io_ranges, true);
+g_ptr_array_free(range_set->mem_ranges, true);
+}
+
 static gint crs_range_compare(gconstpointer a, gconstpointer b)
 {
  CrsRangeEntry *entry_a = *(CrsRangeEntry **)a;
@@ -832,18 +849,17 @@ static void crs_range_merge(GPtrArray *range)
 g_ptr_array_free(tmp, true);
 }
 
-static Aml *build_crs(PCIHostState *host,
-  GPtrArray *io_ranges, GPtrArray *mem_ranges)
+static Aml *build_crs(PCIHostState *host, CrsRangeSet *range_set)
 {
 Aml *crs = aml_resource_template();
-GPtrArray *host_io_ranges = g_ptr_array_new_with_free_func(crs_range_free);
-GPtrArray *host_mem_ranges = g_ptr_array_new_with_free_func(crs_range_free);
+CrsRangeSet temp_range_set;
 CrsRangeEntry *entry;
 uint8_t max_bus = pci_bus_num(host->bus);
 uint8_t type;
 int devfn;
 int i;
 
+crs_range_set_init(&temp_range_set);
 for (devfn = 0; devfn < ARRAY_SIZE(host->bus->devices); devfn++) {
 uint64_t range_base, range_limit;
 PCIDevice *dev = host->bus->devices[devfn];
@@ -867,9 +883,11 @@ static Aml *build_crs(PCIHostState *host,
 }
 
 if (r->type & PCI_BASE_ADDRESS_SPACE_IO) {
-crs_range_insert(host_io_ranges, range_base, range_limit);
+crs_range_insert(temp_range_set.io_ranges,
+ range_base, range_limit);
 } else { /* "memory" */
-crs_range_insert(host_mem_ranges, range_base, range_limit);
+crs_range_insert(temp_range_set.mem_ranges,
+ range_base, range_limit);
 }
 }
 
@@ -888,7 +906,8 @@ static Aml *build_crs(PCIHostState *host,
  * that do not support multiple root buses
  */
 if (range_base && range_base <= range_limit) {
-crs_range_insert(host_io_ranges, range_base, range_limit);
+crs_range_insert(temp_range_set.io_ranges,
+ range_base, range_limit);
 }
 
 range_base =
@@ -901,7 +920,8 @@ static Aml *build_crs(PCIHostState *host,
  * that do not support multiple root buses
  */
 if (range_base && range_base <= range_limit) {
-crs_range_insert(host_mem_ranges, range_base, range_limit);
+crs_range_insert(temp_range_set.mem_ranges,
+ range_base, range_limit);
 }
 
 range_base =
@@ -914,35 +934,36 @@ static Aml *build_crs(PCIHostState *host,
  * that do not support multiple root buses
  */
 if (range_base && range_base <= range_limit) {
-crs_range_insert(host_mem_ranges, range_base, range_limit);
+crs_range_insert(temp_range_set.mem_ranges,
+ range_base, range_limit);
 }
 }
 }
 
-crs_range_merge(host_io_ranges);
-for (i = 0; i < host_io_ranges->len; i++) {
-entry = g_ptr_array_index(host_io_ranges, i);
+crs_range_merge(temp_range_set.io_ranges);
+for (i = 0; i < temp_range_set.io_ranges->len; i++) {
+entry = g_ptr_array_index(temp_range_set.io_ranges, i);
 aml_append(crs,
aml_word_io(AML_MIN_FIXED, AML_MAX_FIXED,
AML_POS_DECODE, AML_ENTIRE_RANGE,
0, entry->base, entry->limit, 0,
entry->limit - entry->base + 1));
-crs_range_insert(io_ranges, entry->base, entry->limit);
+crs_range_insert(range_set->io_ranges, entry->base, entry->limit);
 }
-g_ptr_array_free(host_io_ranges, true);
 
-crs_range_merge(host_mem_ranges);
-for (i = 0; i < host_mem_ranges->len; i++) {
-entry = g_ptr_array_index(host_mem_ranges, i

[Qemu-devel] [PATCH V5 2/7] tests/acpi: add pxb/pxb-pcie tests

2016-07-17 Thread Marcel Apfelbaum

Add an ivshmem device with 1G shared memory to
pxb in order to check the ACPI code of 64bit MMIO allocation.

Suggested-by: Igor Mammedov 
Signed-off-by: Marcel Apfelbaum 
Tested-by: Laszlo Ersek 
---
 tests/bios-tables-test.c | 37 +
 1 file changed, 37 insertions(+)

diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c
index de4019e..b23c6b0 100644
--- a/tests/bios-tables-test.c
+++ b/tests/bios-tables-test.c
@@ -864,6 +864,41 @@ static void test_acpi_piix4_tcg_ipmi(void)
 free_test_data(&data);
 }
 
+static void test_acpi_piix4_tcg_pxb(void)
+{
+test_data data;
+
+memset(&data, 0, sizeof(data));
+data.machine = MACHINE_PC;
+data.variant = ".pxb";
+data.required_struct_types = base_required_struct_types;
+data.required_struct_types_len = ARRAY_SIZE(base_required_struct_types);
+test_acpi_one("-machine accel=tcg"
+  " -device pxb,id=pxb,bus_nr=0x80,bus=pci.0"
+  " -object memory-backend-file,size=1G,mem-path=/tmp/shmem,share,id=mb"
+  " -device ivshmem-plain,memdev=mb,bus=pxb",
+  &data);
+free_test_data(&data);
+}
+
+static void test_acpi_q35_tcg_pxb_pcie(void)
+{
+test_data data;
+
+memset(&data, 0, sizeof(data));
+data.machine = MACHINE_Q35;
+data.variant = ".pxb_pcie";
+data.required_struct_types = base_required_struct_types;
+data.required_struct_types_len = ARRAY_SIZE(base_required_struct_types);
+test_acpi_one("-machine q35,accel=tcg"
+  " -device pxb-pcie,id=pxb,bus_nr=0x80,bus=pcie.0"
+  " -device ioh3420,id=rp,bus=pxb,slot=1"
+  " -object memory-backend-file,size=1G,mem-path=/tmp/shmem,share,id=mb"
+  " -device ivshmem-plain,memdev=mb,bus=rp",
+  &data);
+free_test_data(&data);
+}
+
 int main(int argc, char *argv[])
 {
 const char *arch = qtest_get_arch();
@@ -884,6 +919,8 @@ int main(int argc, char *argv[])
 qtest_add_func("acpi/q35/tcg/ipmi", test_acpi_q35_tcg_ipmi);
 qtest_add_func("acpi/piix4/tcg/cpuhp", test_acpi_piix4_tcg_cphp);
 qtest_add_func("acpi/q35/tcg/cpuhp", test_acpi_q35_tcg_cphp);
+qtest_add_func("acpi/piix4/tcg/pxb", test_acpi_piix4_tcg_pxb);
+qtest_add_func("acpi/q35/tcg/pxb-pcie", test_acpi_q35_tcg_pxb_pcie);
 }
 ret = g_test_run();
 boot_sector_cleanup(disk);
-- 
2.4.3

[Qemu-devel] [PATCH V5 7/7] tests/acpi: Add pxb/pxb-pcie tests blobs

2016-07-17 Thread Marcel Apfelbaum

Signed-off-by: Marcel Apfelbaum 
---
 tests/acpi-test-data/pc/DSDT.pxb   | Bin 0 -> 6286 bytes
 tests/acpi-test-data/q35/DSDT.pxb_pcie | Bin 0 -> 9098 bytes
 2 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 tests/acpi-test-data/pc/DSDT.pxb
 create mode 100644 tests/acpi-test-data/q35/DSDT.pxb_pcie

diff --git a/tests/acpi-test-data/pc/DSDT.pxb b/tests/acpi-test-data/pc/DSDT.pxb
new file mode 100644
index ..8839fcfe4246cdca093c5890c98e64dc5a10f8b8
GIT binary patch
literal 6286
[base85 payload omitted: the archived copy of the binary patch data is garbled and incomplete]

[Qemu-devel] [PATCH V5 4/7] hw/acpi: fix a DSDT table issue when a pxb is present.

2016-07-17 Thread Marcel Apfelbaum

PXBs do not support hotplug so they don't have a PCNT function.
Since the PXB's PCI root-bus is a child bus of bus 0, the
build_dsdt code will add a call to the corresponding PCNT function.

Fix this by skipping the PCNT call for the above case.
While at it skip also PCIe child buses.

Reported-by: Igor Mammedov 
Signed-off-by: Marcel Apfelbaum 
Tested-by: Laszlo Ersek 
---
 hw/i386/acpi-build.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index fbba461..5ed2bbd 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -597,6 +597,10 @@ static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus,
 QLIST_FOREACH(sec, &bus->child, sibling) {
 int32_t devfn = sec->parent_dev->devfn;
 
+if (pci_bus_is_root(sec) || pci_bus_is_express(sec)) {
+continue;
+}
+
 aml_append(method, aml_name("^S%.02X.PCNT", devfn));
 }
 }
-- 
2.4.3

[Qemu-devel] [PATCH V5 3/7] hw/pxb: declare pxb devices as not hot-pluggable

2016-07-17 Thread Marcel Apfelbaum

Prevent future issues when hotplug will work for devices
attached to pxbs.

Suggested-by: Igor Mammedov 
Signed-off-by: Marcel Apfelbaum 
Tested-by: Laszlo Ersek 
---
 hw/pci-bridge/pci_expander_bridge.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index ab86121..b4f8ca2 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -310,6 +310,7 @@ static void pxb_dev_class_init(ObjectClass *klass, void *data)
 
 dc->desc = "PCI Expander Bridge";
 dc->props = pxb_dev_properties;
+dc->hotpluggable = false;
 set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 }
 
@@ -343,6 +344,7 @@ static void pxb_pcie_dev_class_init(ObjectClass *klass, void *data)
 
 dc->desc = "PCI Express Expander Bridge";
 dc->props = pxb_dev_properties;
+dc->hotpluggable = false;
 set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH V5 6/7] hw/apci: handle 64-bit MMIO regions correctly

2016-07-17 Thread Marcel Apfelbaum

In build_crs(), the calculation and merging of the ranges already happens
in 64-bit, but the entry boundaries are silently truncated to 32-bit in the
call to aml_dword_memory(). Fix it by handling the 64-bit MMIO ranges 
separately.
This fixes 64-bit BARs behind PXBs.
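
The classification used by this patch can be sketched in isolation. This is a minimal illustration, not QEMU code: range_fits_dword() and truncate_to_dword() are hypothetical helper names, and the second one only models what a 32-bit descriptor field does to a 64-bit boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not in QEMU): true when the inclusive range
 * [base, limit] can be described by a 32-bit (DWord) memory descriptor,
 * using the same test as the patch. */
static bool range_fits_dword(uint64_t base, uint64_t limit)
{
    assert(base <= limit);
    uint64_t length = limit - base + 1;
    return limit <= UINT32_MAX && length <= UINT32_MAX;
}

/* Hypothetical helper: models the silent truncation that happens when a
 * 64-bit boundary is passed through a 32-bit parameter. */
static uint32_t truncate_to_dword(uint64_t value)
{
    return (uint32_t)value;
}
```

A range such as 0xE0000000..0xEFFFFFFF passes the check and stays in the 32-bit list, while a BAR placed at 0x800000000 fails it; truncating that base to 32 bits would yield 0, which is why such ranges need a QWord descriptor.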

Reported-by: Laszlo Ersek 
Reviewed-by: Igor Mammedov 
Tested-by: Laszlo Ersek 
Signed-off-by: Marcel Apfelbaum 
---
 hw/i386/acpi-build.c | 54 +++-
 1 file changed, 45 insertions(+), 9 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index d8b3543..b1adf04 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -751,18 +751,22 @@ static void crs_range_free(gpointer data)
 typedef struct CrsRangeSet {
 GPtrArray *io_ranges;
 GPtrArray *mem_ranges;
+GPtrArray *mem_64bit_ranges;
  } CrsRangeSet;
 
 static void crs_range_set_init(CrsRangeSet *range_set)
 {
 range_set->io_ranges = g_ptr_array_new_with_free_func(crs_range_free);
 range_set->mem_ranges = g_ptr_array_new_with_free_func(crs_range_free);
+range_set->mem_64bit_ranges =
+g_ptr_array_new_with_free_func(crs_range_free);
 }
 
 static void crs_range_set_free(CrsRangeSet *range_set)
 {
 g_ptr_array_free(range_set->io_ranges, true);
 g_ptr_array_free(range_set->mem_ranges, true);
+g_ptr_array_free(range_set->mem_64bit_ranges, true);
 }
 
 static gint crs_range_compare(gconstpointer a, gconstpointer b)
@@ -920,8 +924,14 @@ static Aml *build_crs(PCIHostState *host, CrsRangeSet *range_set)
  * that do not support multiple root buses
  */
 if (range_base && range_base <= range_limit) {
-crs_range_insert(temp_range_set.mem_ranges,
- range_base, range_limit);
+uint64_t length = range_limit - range_base + 1;
+if (range_limit <= UINT32_MAX && length <= UINT32_MAX) {
+crs_range_insert(temp_range_set.mem_ranges,
+ range_base, range_limit);
+} else {
+crs_range_insert(temp_range_set.mem_64bit_ranges,
+ range_base, range_limit);
+}
 }
 
 range_base =
@@ -934,8 +944,14 @@ static Aml *build_crs(PCIHostState *host, CrsRangeSet *range_set)
  * that do not support multiple root buses
  */
 if (range_base && range_base <= range_limit) {
-crs_range_insert(temp_range_set.mem_ranges,
- range_base, range_limit);
+uint64_t length = range_limit - range_base + 1;
+if (range_limit <= UINT32_MAX && length <= UINT32_MAX) {
+crs_range_insert(temp_range_set.mem_ranges,
+ range_base, range_limit);
+} else {
+crs_range_insert(temp_range_set.mem_64bit_ranges,
+ range_base, range_limit);
+}
 }
 }
 }
@@ -963,6 +979,19 @@ static Aml *build_crs(PCIHostState *host, CrsRangeSet *range_set)
 crs_range_insert(range_set->mem_ranges, entry->base, entry->limit);
 }
 
+crs_range_merge(temp_range_set.mem_64bit_ranges);
+for (i = 0; i < temp_range_set.mem_64bit_ranges->len; i++) {
+entry = g_ptr_array_index(temp_range_set.mem_64bit_ranges, i);
+aml_append(crs,
+   aml_qword_memory(AML_POS_DECODE, AML_MIN_FIXED,
+AML_MAX_FIXED, AML_NON_CACHEABLE,
+AML_READ_WRITE,
+0, entry->base, entry->limit, 0,
+entry->limit - entry->base + 1));
+crs_range_insert(range_set->mem_64bit_ranges,
+ entry->base, entry->limit);
+}
+
 crs_range_set_free(&temp_range_set);
 
 aml_append(crs,
@@ -2085,11 +2114,18 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 }
 
 if (!range_is_empty(pci_hole64)) {
-aml_append(crs,
-aml_qword_memory(AML_POS_DECODE, AML_MIN_FIXED, AML_MAX_FIXED,
- AML_CACHEABLE, AML_READ_WRITE,
- 0, range_lob(pci_hole64), range_upb(pci_hole64), 0,
- range_upb(pci_hole64) + 1 - range_lob(pci_hole64)));
+crs_replace_with_free_ranges(crs_range_set.mem_64bit_ranges,
+ range_lob(pci_hole64),
+ range_upb(pci_hole64));
+for (i = 0; i < crs_range_set.mem_64bit_ranges->len; i++) {
+entry = g_ptr_array_index(crs_range_set.mem_64bit_ranges, i);
+aml_append(crs,
+   aml_qword_memory(AML_POS_DECODE, AML_MIN_FIXED,
+AML_MAX_FIXED,
+

Re: [Qemu-devel] [Xen-devel] [PATCH 01/19] xen: Create a new file xen_pvdev.c

2016-07-17 Thread Emil Condrea

On Jul 17, 2016 10:41, "Quan Xu"  wrote:
>
>
> [Quan:]: comment starts with [Quan:]
>
Thanks, Quan, for your comments.

The first patches in this series only move code from xen_backend to the new
xen_pvdev file. I would rather not mix that reorganization with refactoring
in the same patch; the refactoring can be done in a separate patch later.
>
>
>
> The purpose of the new file is to store generic functions shared by frontend
> and backends such as xenstore operations, xendevs.
>
> Signed-off-by: Quan Xu 
> Signed-off-by: Emil Condrea 
> ---
>  hw/xen/Makefile.objs |   2 +-
>  hw/xen/xen_backend.c | 125 +---
>  hw/xen/xen_pvdev.c   | 149 +++
>  include/hw/xen/xen_backend.h |  63 +-
>  include/hw/xen/xen_pvdev.h   |  71 +
>  5 files changed, 223 insertions(+), 187 deletions(-)
>  create mode 100644 hw/xen/xen_pvdev.c
>  create mode 100644 include/hw/xen/xen_pvdev.h
>
> diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs
> index d367094..591cdc2 100644
> --- a/hw/xen/Makefile.objs
> +++ b/hw/xen/Makefile.objs
> @@ -1,5 +1,5 @@
>  # xen backend driver support
> -common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o
> +common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o xen_pvdev.o
>
>  obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
>  obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o
> xen_pt_graphics.o xen_pt_msi.o
> diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
> index bab79b1..a251a4a 100644
> --- a/hw/xen/xen_backend.c
> +++ b/hw/xen/xen_backend.c
> @@ -30,6 +30,7 @@
>  #include "sysemu/char.h"
>  #include "qemu/log.h"
>  #include "hw/xen/xen_backend.h"
> +#include "hw/xen/xen_pvdev.h"
>
>  #include 
>
> @@ -56,8 +57,6 @@ static QTAILQ_HEAD(xs_dirs_head, xs_dirs) xs_cleanup =
>  static QTAILQ_HEAD(XenDeviceHead, XenDevice) xendevs =
> QTAILQ_HEAD_INITIALIZER(xendevs);
>  static int debug = 0;
>
> -/* - */
> -
>  static void xenstore_cleanup_dir(char *dir)
>  {
>  struct xs_dirs *d;
> @@ -76,34 +75,6 @@ void xen_config_cleanup(void)
>  }
>  }
>
> -int xenstore_write_str(const char *base, const char *node, const char *val)
> -{
> -char abspath[XEN_BUFSIZE];
> -
> -snprintf(abspath, sizeof(abspath), "%s/%s", base, node);
> -if (!xs_write(xenstore, 0, abspath, val, strlen(val))) {
> -return -1;
> -}
> -return 0;
> -}
> -
> -char *xenstore_read_str(const char *base, const char *node)
> -{
> -char abspath[XEN_BUFSIZE];
> -unsigned int len;
> -char *str, *ret = NULL;
> -
> -snprintf(abspath, sizeof(abspath), "%s/%s", base, node);
> -str = xs_read(xenstore, 0, abspath, &len);
> -if (str != NULL) {
> -/* move to qemu-allocated memory to make sure
> - * callers can savely g_free() stuff. */
> -ret = g_strdup(str);
> -free(str);
> -}
> -return ret;
> -}
> -
>  int xenstore_mkdir(char *path, int p)
>  {
>  struct xs_permissions perms[2] = {
> @@ -128,48 +99,6 @@ int xenstore_mkdir(char *path, int p)
>  return 0;
>  }
>
> -int xenstore_write_int(const char *base, const char *node, int ival)
> -{
> -char val[12];
> -
>
> [Quan:]: why 12 ? what about XEN_BUFSIZE?
>
> -snprintf(val, sizeof(val), "%d", ival);
> -return xenstore_write_str(base, node, val);
> -}
> -
> -int xenstore_write_int64(const char *base, const char *node, int64_t ival)
> -{
> -char val[21];
> -
>
> [Quan:]: why 21 ? what about XEN_BUFSIZE?
>
>
> -snprintf(val, sizeof(val), "%"PRId64, ival);
> -return xenstore_write_str(base, node, val);
> -}
> -
> -int xenstore_read_int(const char *base, const char *node, int *ival)
> -{
> -char *val;
> -int rc = -1;
> -
> -val = xenstore_read_str(base, node);
>
> [Quan:]: IMO, it is better to initialize val where it is declared. The same
> comment applies to the other 'val'.
>
> -if (val && 1 == sscanf(val, "%d", ival)) {
> -rc = 0;
> -}
> -g_free(val);
> -return rc;
> -}
> -
> -int xenstore_read_uint64(const char *base, const char *node, uint64_t *uval)
> -{
> -char *val;
> -int rc = -1;
> -
> -val = xenstore_read_str(base, node);
> -if (val && 1 == sscanf(val, "%"SCNu64, uval)) {
> -rc = 0;
> -}
> -g_free(val);
> -return rc;
> -}
> -
>  int xenstore_write_be_str(struct XenDevice *xendev, const char *node, const
> char *val)
>  {
>  return xenstore_write_str(xendev->be, node, val);
> @@ -212,20 +141,6 @@ int xenstore_read_fe_uint64(struct XenDevice *xendev,
> const char *node, uint64_t
>
>  /* - */
>
> -const char *xenbus_strstate(enum xenbus_state state)
> -{
> -static const char *const name[] = {
> -[ XenbusStateUnknown  ] = "Unknown",
> -[ XenbusStateInitialising ] = "Initialising",
> -

Re: [Qemu-devel] [PATCH] vfio/pci: Hide ARI capability

2016-07-17 Thread Marcel Apfelbaum


On 07/15/2016 08:30 PM, Alex Williamson wrote:

QEMU supports ARI on downstream ports and assigned devices may support
ARI in their extended capabilities.  The endpoint ARI capability
specifies the next function, such that the OS doesn't need to walk
each possible function, however this next function is relative to the
host, not the guest.  This leads to device discovery issues when we
combine separate functions into virtual multi-function packages in a
guest.  For example, SR-IOV VFs are not enumerated by simply probing
the function address space, therefore the ARI next-function field is
zero.  When we combine multiple VFs together as a multi-function
device in the guest, the guest OS identifies ARI is enabled, relies on
this next-function field, and stops looking for additional functions
after the first is found.



Hi Alex,


Long term we should expose the ARI capability to the guest to enable
configurations with more than 8 functions per slot, but this requires
additional QEMU PCI infrastructure to manage the next-function field
for multiple, otherwise independent devices.


The ARI implementation is on my "to-do" list.

  In the short term,

hiding this capability allows equivalent functionality to what we
currently have on non-express chipsets.



I agree.

Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel



Signed-off-by: Alex Williamson 
---
  hw/vfio/pci.c |1 +
  1 file changed, 1 insertion(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 44783c5..c8436a1 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1828,6 +1828,7 @@ static int vfio_add_ext_cap(VFIOPCIDevice *vdev)

  switch (cap_id) {
  case PCI_EXT_CAP_ID_SRIOV: /* Read-only VF BARs confuse OVMF */
+case PCI_EXT_CAP_ID_ARI: /* XXX Needs next function virtualization */
  trace_vfio_add_ext_cap_dropped(vdev->vbasedev.name, cap_id, next);
  break;
  default:
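
The enumeration problem described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: count_functions_ari() and count_functions_probe() are hypothetical names, the arrays stand in for config-space reads, and no real guest OS code is reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FUNCS 256

/* Toy model of ARI-based discovery: next_fn[f] is the Next Function
 * Number reported by function f, and 0 terminates the list. */
static int count_functions_ari(const uint8_t next_fn[MAX_FUNCS])
{
    int found = 0;
    unsigned fn = 0;

    do {
        found++;
        assert(found <= MAX_FUNCS); /* guard against a cyclic list */
        fn = next_fn[fn];
    } while (fn != 0);
    return found;
}

/* Toy model of discovery without ARI: probe each of the 8 conventional
 * function numbers and count the ones that respond. */
static int count_functions_probe(const uint8_t present[8])
{
    int found = 0;

    for (int fn = 0; fn < 8; fn++) {
        if (present[fn]) {
            found++;
        }
    }
    return found;
}
```

With three VFs grouped into one guest slot, each still reporting a next-function value of 0 from the host, the ARI walk finds only one function, whereas plain probing finds all three. That is the behavior hiding the capability restores.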

Re: [Qemu-devel] [PATCH v2 1/2] tests: Resort check-qtest entries in Makefile.include

2016-07-17 Thread David Gibson

On Fri, Jul 15, 2016 at 05:39:38PM +0200, Thomas Huth wrote:
> The rather random list of check-qtest-xxx entries caused some
> confusion in the past, where to use "=" and where to use "+="
> (see commits 0ccac16f59462b8e2b9afbc1 and 1f5c1cfbaec0792cd2e5da
> for example).
> Sorting the check-qtest-xxx entries by architecture instead and
> using some empty lines in between should help to ease this
> situation a little bit, so that it is hopefully now obvious
> that new tests should be added with "+=" instead of "=".
> While we are at it, this patch also comments out two of the
> "gcov-files-..." lines since the corresponding m48t59-test is
> disabled for sparc and sparc64, too.
> 
> Signed-off-by: Thomas Huth 

Reviewed-by: David Gibson 

> ---
>  tests/Makefile.include | 38 +-
>  1 file changed, 25 insertions(+), 13 deletions(-)
> 
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index 2010b11..3d76cf4 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -240,33 +240,45 @@ check-qtest-i386-y += tests/postcopy-test$(EXESUF)
>  check-qtest-x86_64-y += $(check-qtest-i386-y)
>  gcov-files-i386-y += i386-softmmu/hw/timer/mc146818rtc.c
>  gcov-files-x86_64-y = $(subst 
> i386-softmmu/,x86_64-softmmu/,$(gcov-files-i386-y))
> +
>  check-qtest-mips-y = tests/endianness-test$(EXESUF)
> +
>  check-qtest-mips64-y = tests/endianness-test$(EXESUF)
> +
>  check-qtest-mips64el-y = tests/endianness-test$(EXESUF)
> +
>  check-qtest-ppc-y = tests/endianness-test$(EXESUF)
> -check-qtest-ppc64-y = tests/endianness-test$(EXESUF)
> +check-qtest-ppc-y += tests/boot-order-test$(EXESUF)
> +check-qtest-ppc-y += tests/prom-env-test$(EXESUF)
> +
> +check-qtest-ppc64-y = tests/spapr-phb-test$(EXESUF)
> +gcov-files-ppc64-y = ppc64-softmmu/hw/ppc/spapr_pci.c
> +check-qtest-ppc64-y += tests/endianness-test$(EXESUF)
> +check-qtest-ppc64-y += tests/boot-order-test$(EXESUF)
> +check-qtest-ppc64-y += tests/prom-env-test$(EXESUF)
> +
>  check-qtest-sh4-y = tests/endianness-test$(EXESUF)
> +
>  check-qtest-sh4eb-y = tests/endianness-test$(EXESUF)
> +
> +check-qtest-sparc-y = tests/prom-env-test$(EXESUF)
> +#check-qtest-sparc-y += tests/m48t59-test$(EXESUF)
> +#gcov-files-sparc-y = hw/timer/m48t59.c
> +
>  check-qtest-sparc64-y = tests/endianness-test$(EXESUF)
> -#check-qtest-sparc-y = tests/m48t59-test$(EXESUF)
>  #check-qtest-sparc64-y += tests/m48t59-test$(EXESUF)
> -gcov-files-sparc-y += hw/timer/m48t59.c
> -gcov-files-sparc64-y += hw/timer/m48t59.c
> +#gcov-files-sparc64-y += hw/timer/m48t59.c
> +#Disabled for now, triggers a TCG bug on 32-bit hosts
> +#check-qtest-sparc64-y += tests/prom-env-test$(EXESUF)
> +
>  check-qtest-arm-y = tests/tmp105-test$(EXESUF)
>  check-qtest-arm-y += tests/ds1338-test$(EXESUF)
>  gcov-files-arm-y += hw/misc/tmp105.c
>  check-qtest-arm-y += tests/virtio-blk-test$(EXESUF)
>  gcov-files-arm-y += arm-softmmu/hw/block/virtio-blk.c
> -check-qtest-ppc-y += tests/boot-order-test$(EXESUF)
> -check-qtest-ppc64-y += tests/boot-order-test$(EXESUF)
> -check-qtest-ppc64-y += tests/spapr-phb-test$(EXESUF)
> -gcov-files-ppc64-y += ppc64-softmmu/hw/ppc/spapr_pci.c
> -check-qtest-ppc-y += tests/prom-env-test$(EXESUF)
> -check-qtest-ppc64-y += tests/prom-env-test$(EXESUF)
> -check-qtest-sparc-y += tests/prom-env-test$(EXESUF)
> -#Disabled for now, triggers a TCG bug on 32-bit hosts
> -#check-qtest-sparc64-y += tests/prom-env-test$(EXESUF)
> +
>  check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
> +
>  check-qtest-xtensaeb-y = $(check-qtest-xtensa-y)
>  
>  check-qtest-generic-y += tests/qom-test$(EXESUF)

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v2 2/2] tests: Check serial output of firmware boot of some machines

2016-07-17 Thread David Gibson

On Fri, Jul 15, 2016 at 05:39:39PM +0200, Thomas Huth wrote:
> Some of the machines that we have got a firmware image for write
> some output to the serial console while booting up. We can use
> this output to make sure that the machine is basically working,
> so this adds a test that checks the output of these machines
> for some well-known "magic" strings.
> 
> Signed-off-by: Thomas Huth 

Reviewed-by: David Gibson 

> ---
>  tests/Makefile.include   |   8 
>  tests/boot-serial-test.c | 111 +++
>  create mode 100644 tests/boot-serial-test.c
> 
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index 3d76cf4..3986093 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -194,6 +194,7 @@ check-qtest-i386-y += tests/hd-geo-test$(EXESUF)
>  gcov-files-i386-y += hw/block/hd-geometry.c
>  check-qtest-i386-y += tests/boot-order-test$(EXESUF)
>  check-qtest-i386-y += tests/bios-tables-test$(EXESUF)
> +check-qtest-i386-y += tests/boot-serial-test$(EXESUF)
>  check-qtest-i386-y += tests/pxe-test$(EXESUF)
>  check-qtest-i386-y += tests/rtc-test$(EXESUF)
>  check-qtest-i386-y += tests/ipmi-kcs-test$(EXESUF)
> @@ -241,6 +242,8 @@ check-qtest-x86_64-y += $(check-qtest-i386-y)
>  gcov-files-i386-y += i386-softmmu/hw/timer/mc146818rtc.c
>  gcov-files-x86_64-y = $(subst 
> i386-softmmu/,x86_64-softmmu/,$(gcov-files-i386-y))
>  
> +check-qtest-alpha-y = tests/boot-serial-test$(EXESUF)
> +
>  check-qtest-mips-y = tests/endianness-test$(EXESUF)
>  
>  check-qtest-mips64-y = tests/endianness-test$(EXESUF)
> @@ -250,12 +253,14 @@ check-qtest-mips64el-y = tests/endianness-test$(EXESUF)
>  check-qtest-ppc-y = tests/endianness-test$(EXESUF)
>  check-qtest-ppc-y += tests/boot-order-test$(EXESUF)
>  check-qtest-ppc-y += tests/prom-env-test$(EXESUF)
> +check-qtest-ppc-y += tests/boot-serial-test$(EXESUF)
>  
>  check-qtest-ppc64-y = tests/spapr-phb-test$(EXESUF)
>  gcov-files-ppc64-y = ppc64-softmmu/hw/ppc/spapr_pci.c
>  check-qtest-ppc64-y += tests/endianness-test$(EXESUF)
>  check-qtest-ppc64-y += tests/boot-order-test$(EXESUF)
>  check-qtest-ppc64-y += tests/prom-env-test$(EXESUF)
> +check-qtest-ppc64-y += tests/boot-serial-test$(EXESUF)
>  
>  check-qtest-sh4-y = tests/endianness-test$(EXESUF)
>  
> @@ -281,6 +286,8 @@ check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
>  
>  check-qtest-xtensaeb-y = $(check-qtest-xtensa-y)
>  
> +check-qtest-s390x-y = tests/boot-serial-test$(EXESUF)
> +
>  check-qtest-generic-y += tests/qom-test$(EXESUF)
>  
>  qapi-schema += alternate-any.json
> @@ -579,6 +586,7 @@ tests/ipmi-kcs-test$(EXESUF): tests/ipmi-kcs-test.o
>  tests/ipmi-bt-test$(EXESUF): tests/ipmi-bt-test.o
>  tests/hd-geo-test$(EXESUF): tests/hd-geo-test.o
>  tests/boot-order-test$(EXESUF): tests/boot-order-test.o $(libqos-obj-y)
> +tests/boot-serial-test$(EXESUF): tests/boot-serial-test.o $(libqos-obj-y)
>  tests/bios-tables-test$(EXESUF): tests/bios-tables-test.o \
>   tests/boot-sector.o $(libqos-obj-y)
>  tests/pxe-test$(EXESUF): tests/pxe-test.o tests/boot-sector.o $(libqos-obj-y)
> diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c
> new file mode 100644
> index 000..fd60337
> --- /dev/null
> +++ b/tests/boot-serial-test.c
> @@ -0,0 +1,111 @@
> +/*
> + * Test serial output of some machines.
> + *
> + * Copyright 2016 Thomas Huth, Red Hat Inc.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2
> + * or later. See the COPYING file in the top-level directory.
> + *
> + * This test is used to check that the serial output of the firmware
> + * (that we provide for some machines) contains an expected string.
> + * Thus we check that the firmware still boots at least to a certain
> + * point and so we know that the machine is not completely broken.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "libqtest.h"
> +
> +typedef struct testdef {
> +const char *arch;   /* Target architecture */
> +const char *machine;/* Name of the machine */
> +const char *extra;  /* Additional parameters */
> +const char *expect; /* Expected string in the serial output */
> +} testdef_t;
> +
> +static testdef_t tests[] = {
> +{ "alpha", "clipper", "", "PCI:" },
> +{ "ppc", "ppce500", "", "U-Boot" },
> +{ "ppc", "prep", "", "Open Hack'Ware BIOS" },
> +{ "ppc64", "ppce500", "", "U-Boot" },
> +{ "ppc64", "prep", "", "Open Hack'Ware BIOS" },
> +{ "ppc64", "pseries", "", "Open Firmware" },
> +{ "i386", "isapc", "-device sga", "SGABIOS" },
> +{ "i386", "pc", "-device sga", "SGABIOS" },
> +{ "i386", "q35", "-device sga", "SGABIOS" },
> +{ "x86_64", "isapc", "-device sga", "SGABIOS" },
> +{ "x86_64", "pc", "-device sga", "SGABIOS" },
> +{ "x86_64", "q35", "-device sga", "SGABIOS" },
> +{ "s390x", "s390-ccw-virtio",
> +  "-nodefaults -device sclpconsole,chardev=serial0", "virtio device" },
> +{ NU

Re: [Qemu-devel] [PATCH] target-ppc: fix left shift overflow in hpte_page_shift

2016-07-17 Thread David Gibson

On Fri, Jul 15, 2016 at 05:22:10PM +0200, Paolo Bonzini wrote:
> ps->pte_enc is a 32-bit value, which is shifted left and then compared
> to a 64-bit value.  It needs a cast before the shift.
> 
> Reported by Coverity.
> 
> Signed-off-by: Paolo Bonzini 

Applied to ppc-for-2.7, thanks.

> ---
>  target-ppc/mmu-hash64.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
> index 82c2186..8f7e5b4 100644
> --- a/target-ppc/mmu-hash64.c
> +++ b/target-ppc/mmu-hash64.c
> @@ -479,7 +479,7 @@ static unsigned hpte_page_shift(const struct 
> ppc_one_seg_page_size *sps,
>  
>  mask = ((1ULL << ps->page_shift) - 1) & HPTE64_R_RPN;
>  
> -if ((pte1 & mask) == (ps->pte_enc << HPTE64_R_RPN_SHIFT)) {
> +if ((pte1 & mask) == ((uint64_t)ps->pte_enc << HPTE64_R_RPN_SHIFT)) {
>  return ps->page_shift;
>  }
>  }
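
For reference, the effect of the missing cast can be shown with a minimal sketch. The function names are hypothetical, and RPN_SHIFT is assumed here to be 12 to mirror HPTE64_R_RPN_SHIFT; the point is only that the shift is evaluated in 32 bits unless the operand is widened first.

```c
#include <assert.h>
#include <stdint.h>

#define RPN_SHIFT 12 /* assumption: stands in for HPTE64_R_RPN_SHIFT */

/* Shift performed in 32 bits: high bits are lost before the result is
 * widened to 64 bits for the comparison. */
static uint64_t shift_without_cast(uint32_t pte_enc)
{
    assert(RPN_SHIFT < 32); /* shift count must be below the type width */
    return pte_enc << RPN_SHIFT;
}

/* Operand widened first, so the shift is performed in 64 bits. */
static uint64_t shift_with_cast(uint32_t pte_enc)
{
    return (uint64_t)pte_enc << RPN_SHIFT;
}
```

For small encodings both forms agree, which is presumably why the problem went unnoticed; they diverge only once the shifted value no longer fits in 32 bits.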

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation

2016-07-17 Thread David Gibson

On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote:
> On Thu, 14 Jul 2016 21:59:45 +1000
> David Gibson  wrote:
> 
> > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote:
> > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell  
> > > wrote:  
> > > > On 14 July 2016 at 08:57, David Gibson  
> > > > wrote:  
> > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done 
> > > >> differently
> > > >> than for full system targets.  This method turns out to be broken, 
> > > >> since
> > > >> it can fairly easily result in duplicate cpu_index values for
> > > >> simultaneously active cpus (i.e. threads in the emulated process).
> > > >>
> > > >> Consider this sequence:
> > > >> Create thread 1
> > > >> Create thread 2
> > > >> Exit thread 1
> > > >> Create thread 3
> > > >>
> > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get
> > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2
> > > >> threads in the cpus list at the point of its creation).
> > > >>
> > > >> We mostly get away with this because cpu_index values aren't that 
> > > >> important
> > > >> for userspace emulation.  Still, it can't be good, so this patch fixes 
> > > >> it
> > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that 
> > > >> full
> > > >> system targets already use.
> > > >>
> > > >> Signed-off-by: David Gibson 
> > > >> ---
> > > >>  exec.c | 19 ---
> > > >>  1 file changed, 19 deletions(-)
> > > >>
> > > >> diff --git a/exec.c b/exec.c
> > > >> index 011babd..e410dab 100644
> > > >> --- a/exec.c
> > > >> +++ b/exec.c
> > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, 
> > > >> int asidx)
> > > >>  }
> > > >>  #endif
> > > >>
> > > >> -#ifndef CONFIG_USER_ONLY
> > > >>  static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS);
> > > >>
> > > >>  static int cpu_get_free_index(Error **errp)
> > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu)
> > > >>  {
> > > >>  bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> > > >>  }
> > > >> -#else
> > > >> -
> > > >> -static int cpu_get_free_index(Error **errp)
> > > >> -{
> > > >> -CPUState *some_cpu;
> > > >> -int cpu_index = 0;
> > > >> -
> > > >> -CPU_FOREACH(some_cpu) {
> > > >> -cpu_index++;
> > > >> -}
> > > >> -return cpu_index;
> > > >> -}
> > > >> -
> > > >> -static void cpu_release_index(CPUState *cpu)
> > > >> -{
> > > >> -return;
> > > >> -}
> > > >> -#endif  
> > > >
> > > > Won't this change impose a maximum limit of 256 simultaneous
> > > > threads? That seems a little low for comfort.  
> > > 
> > > This was the reason why the bitmap logic wasn't applied to
> > > CONFIG_USER_ONLY when it was introduced.
> > > 
> > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html  
> > 
> > Ah.. good point.
> > 
> > Hrm, ok, my next idea would be to just (globally) sequentially
> > allocate cpu_index values for CONFIG_USER, and never try to re-use
> > them.  Does that seem reasonable?
> > 
> 
> Isn't that only deferring the problem until later?

You mean that we could get duplicate indexes after the value wraps
around?

I suppose, but duplicates after spawning 4 billion threads seems like
a substantial improvement over duplicates after spawning 3 in the
wrong order..
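
The two schemes can be compared with a small sketch. These helpers are hypothetical stand-ins, not the functions in exec.c: index_from_count() models the current CONFIG_USER_ONLY logic (index = number of CPUs in the list), and index_sequential() models the never-reuse counter suggested above.

```c
#include <assert.h>

/* Models the current user-mode scheme: the new index is simply the
 * number of CPUs in the list at creation time. */
static int index_from_count(int live_cpus)
{
    assert(live_cpus >= 0);
    return live_cpus;
}

/* Models the suggested scheme: hand out indexes sequentially and never
 * reuse them; duplicates appear only after the counter wraps. */
static unsigned index_sequential(unsigned *next_index)
{
    return (*next_index)++;
}
```

Replaying the sequence from the patch description (main thread present, create thread 1, create thread 2, exit thread 1, create thread 3) gives 1, 2, 2 with the count-based scheme and 1, 2, 3 with the sequential one.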

> Maybe it is possible to define MAX_CPUMASK_BITS to a much higher
> value for CONFIG_USER only instead ?

Perhaps.  It does mean carrying around a huge bitmap, though.

Another option is to remove cpu_index entirely for the user only
case.  I have some patches for this, which are very ugly but it's
possible they can be cleaned up to something reasonable (the biggest
chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY
for what I think are registers that aren't accessible in user mode).


> > > But then we didn't have actual removal, but we do now.  
> > 
> > You mean patch 1/2 in this set?  Or something else?
> > 
> > Even so, 256 does seem a bit low for a number of simultaneously active
> > threads - there are some big hairy multi-threaded programs out there.
> > 
> 



-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH] ppc: Yet another fix for the huge page support detection mechanism

2016-07-17 Thread David Gibson

On Fri, Jul 15, 2016 at 10:10:25AM +0200, Thomas Huth wrote:
> Commit 86b50f2e1bef ("Disable huge page support if it is not available
> for main RAM") already made sure that huge page support is not announced
> to the guest if the normal RAM of non-NUMA configurations is not backed
> by a huge page filesystem. However, there is one more case that can go
> wrong: NUMA is enabled, but the RAM of the NUMA nodes are not configured
> with huge page support (and only the memory of a DIMM is configured with
> it). When QEMU is started with the following command line for example,
> the Linux guest currently crashes because it is trying to use huge pages
> on a memory region that does not support huge pages:
> 
>  qemu-system-ppc64 -enable-kvm ... -m 1G,slots=4,maxmem=32G -object \
>memory-backend-file,policy=default,mem-path=/hugepages,size=1G,id=mem-mem1 \
>-device pc-dimm,id=dimm-mem1,memdev=mem-mem1 -smp 2 \
>-numa node,nodeid=0 -numa node,nodeid=1
> 
> To fix this issue, we've got to make sure to disable huge page support,
> too, when there is a NUMA node that is not using a memory backend with
> huge page support.
> 
> Fixes: 86b50f2e1befc33407bdfeb6f45f7b0d2439a740
> Signed-off-by: Thomas Huth 
> ---
>  target-ppc/kvm.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)

Applied to ppc-for-2.7, thanks.

> 
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 884d564..7a8f555 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -389,12 +389,16 @@ static long getrampagesize(void)
>  
>  object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
>  
> -if (hpsize == LONG_MAX) {
> +if (hpsize == LONG_MAX || hpsize == getpagesize()) {
>  return getpagesize();
>  }
>  
> -if (nb_numa_nodes == 0 && hpsize > getpagesize()) {
> -/* No NUMA nodes and normal RAM without -mem-path ==> no huge pages! */
> +/* If NUMA is disabled or the NUMA nodes are not backed with a
> + * memory-backend, then there is at least one node using "normal"
> + * RAM. And since normal RAM has not been configured with "-mem-path"
> + * (what we've checked earlier here already), we can not use huge pages!
> + */
> +if (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL) {
>  static bool warned;
>  if (!warned) {
>  error_report("Huge page support disabled (n/a for main memory).");

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/4] Pass generic CPUState to gen_intermediate_code()

2016-07-17 Thread David Gibson

On Fri, Jul 15, 2016 at 06:12:05PM +0200, Lluís Vilanova wrote:
> Needed to implement a target-agnostic gen_intermediate_code() in the
> future.
> 
> Signed-off-by: Lluís Vilanova 
> ---
>  include/exec/exec-all.h   |2 +-
>  target-alpha/translate.c  |   11 +--
>  target-arm/translate.c|   24 
>  target-cris/translate.c   |   17 -
>  target-i386/translate.c   |   13 ++---
>  target-lm32/translate.c   |   22 +++---
>  target-m68k/translate.c   |   15 +++
>  target-microblaze/translate.c |   24 
>  target-mips/translate.c   |   15 +++
>  target-moxie/translate.c  |   14 +++---
>  target-openrisc/translate.c   |   24 
>  target-ppc/translate.c|   15 +++
>  target-s390x/translate.c  |   13 ++---
>  target-sh4/translate.c|   15 +++
>  target-sparc/translate.c  |   11 +--
>  target-tilegx/translate.c |7 +++
>  target-tricore/translate.c|9 -
>  target-unicore32/translate.c  |   17 -
>  target-xtensa/translate.c |   13 ++---
>  translate-all.c   |2 +-
>  20 files changed, 135 insertions(+), 148 deletions(-)

target-ppc portion

Reviewed-by: David Gibson 

> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 7362095..06c2400 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -66,7 +66,7 @@ typedef struct TranslationBlock TranslationBlock;
>  
>  #include "qemu/log.h"
>  
> -void gen_intermediate_code(CPUArchState *env, struct TranslationBlock *tb);
> +void gen_intermediate_code(CPUState *env, struct TranslationBlock *tb);
>  void restore_state_to_opc(CPUArchState *env, struct TranslationBlock *tb,
>target_ulong *data);
>  
> diff --git a/target-alpha/translate.c b/target-alpha/translate.c
> index 5b86992..faeccf8 100644
> --- a/target-alpha/translate.c
> +++ b/target-alpha/translate.c
> @@ -2860,10 +2860,9 @@ static ExitStatus translate_one(DisasContext *ctx, 
> uint32_t insn)
>  return ret;
>  }
>  
> -void gen_intermediate_code(CPUAlphaState *env, struct TranslationBlock *tb)
> +void gen_intermediate_code(CPUState *cpu, struct TranslationBlock *tb)
>  {
> -AlphaCPU *cpu = alpha_env_get_cpu(env);
> -CPUState *cs = CPU(cpu);
> +CPUAlphaState *env = cpu->env_ptr;
>  DisasContext ctx, *ctxp = &ctx;
>  target_ulong pc_start;
>  target_ulong pc_mask;
> @@ -2878,7 +2877,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct 
> TranslationBlock *tb)
>  ctx.pc = pc_start;
>  ctx.mem_idx = cpu_mmu_index(env, false);
>  ctx.implver = env->implver;
> -ctx.singlestep_enabled = cs->singlestep_enabled;
> +ctx.singlestep_enabled = cpu->singlestep_enabled;
>  
>  #ifdef CONFIG_USER_ONLY
>  ctx.ir = cpu_std_ir;
> @@ -2917,7 +2916,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct 
> TranslationBlock *tb)
>  tcg_gen_insn_start(ctx.pc);
>  num_insns++;
>  
> -if (unlikely(cpu_breakpoint_test(cs, ctx.pc, BP_ANY))) {
> +if (unlikely(cpu_breakpoint_test(cpu, ctx.pc, BP_ANY))) {
>  ret = gen_excp(&ctx, EXCP_DEBUG, 0);
>  /* The address covered by the breakpoint must be included in
> [tb->pc, tb->pc + tb->size) in order to for it to be
> @@ -2991,7 +2990,7 @@ void gen_intermediate_code(CPUAlphaState *env, struct 
> TranslationBlock *tb)
>  #ifdef DEBUG_DISAS
>  if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
>  qemu_log("IN: %s\n", lookup_symbol(pc_start));
> -log_target_disas(cs, pc_start, ctx.pc - pc_start, 1);
> +log_target_disas(cpu, pc_start, ctx.pc - pc_start, 1);
>  qemu_log("\n");
>  }
>  #endif
> diff --git a/target-arm/translate.c b/target-arm/translate.c
> index 940ec8d..837ceda 100644
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -11587,10 +11587,10 @@ static bool insn_crosses_page(CPUARMState *env, DisasContext *s)
>  }
>  
>  /* generate intermediate code for basic block 'tb'.  */
> -void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb)
> +void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb)
>  {
> -ARMCPU *cpu = arm_env_get_cpu(env);
> -CPUState *cs = CPU(cpu);
> +CPUARMState *env = cpu->env_ptr;
> +ARMCPU *arm_cpu = arm_env_get_cpu(env);
>  DisasContext dc1, *dc = &dc1;
>  target_ulong pc_start;
>  target_ulong next_page_start;
> @@ -11604,7 +11604,7 @@ void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb)
>   * the A32/T32 complexity to do with conditional execution/IT blocks/etc.
>   */
>  if (ARM_TBFLAG_AARCH64_STATE(tb->flags)) {
> -gen_intermediate_code_a64(cpu, tb);
> +gen_intermediate_code_a64(arm_cpu, tb);
>  return;
>

Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-07-17 Thread Dave Chinner

On Fri, Jul 15, 2016 at 03:55:20PM +0800, Zhangfei Gao wrote:
> Dear Dave
> 
> On Wed, Jul 13, 2016 at 7:03 AM, Dave Chinner  wrote:
> > On Tue, Jul 12, 2016 at 12:43:24PM -0400, Theodore Ts'o wrote:
> >> On Tue, Jul 12, 2016 at 03:14:38PM +0800, Zhangfei Gao wrote:
> >> > Some update:
> >> >
> >> > If test with ext2, no problem in iblock.
> >> > If test with ext4, ext4_mb_generate_buddy reported error in the
> >> > removing files after reboot.
> >> >
> >> >
> >> > root@(none)$ rm test
> >> > [   21.006549] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: group 18, block bitmap and bg descriptor inconsistent: 26464 vs 25600 free clusters
> >> > [   21.008249] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
> >> >
> >> > Any special notes of using ext4 in qemu?
> >>
> >> Ext4 has more runtime consistency checking than ext2.  So just because
> >> ext4 complains doesn't mean that there isn't a problem with the file
> >> system; it just means that ext4 is more likely to notice before you
> >> lose user data.
> >>
> >> So if you test with ext2, try running e2fsck afterwards, to make sure
> >> the file system is consistent.
> >>
> >> Given that I'm regularly testing ext4 using kvm, and I haven't seen
> >> anything like this in a very long time, I suspect the problem is with
> >> your SCSI code, and not with ext4.
> >
> > It's the same error I reported yesterday for ext3 on 4.7-rc6 when
> > rebooting a VM after it hung.
> 
> 
> Any link of this error?

http://article.gmane.org/gmane.comp.file-systems.ext4/53792

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

Re: [Qemu-devel] [RFC PATCH V2] qemu-char: Fix context for g_source_attach()

2016-07-17 Thread Zhang Chen


Hi all,

Could you give me some feedback on this patch? We need more comments.

The COLO project depends on this patch. With it, colo-compare can run the
handlers registered through qemu_chr_add_handlers() in the compare thread,
which reduces the main_loop workload when the network is busy.

The idea is from Jason.


Thanks
Zhang Chen




On 07/11/2016 10:12 AM, Zhang Chen wrote:



On 07/08/2016 10:27 PM, Paolo Bonzini wrote:


On 08/07/2016 10:54, Daniel P. Berrange wrote:

On Fri, Jul 08, 2016 at 09:48:23AM +0800, Fam Zheng wrote:

On Wed, 06/22 18:49, Zhang Chen wrote:

We want to poll and handle chardev in another thread
other than main loop. But qemu_chr_add_handlers() can only
work for global default context other than thread default context.
So we use g_source_attach(xx, g_main_context_get_thread_default())
replace g_source_attach(xx, NULL) to attach g_source.
Comments from jason.

Signed-off-by: Zhang Chen 
Signed-off-by: Jason Wang 
---
  io/channel.c | 2 +-
  qemu-char.c  | 6 +++---
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/io/channel.c b/io/channel.c
index 692eb17..cd25677 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -146,7 +146,7 @@ guint qio_channel_add_watch(QIOChannel *ioc,
g_source_set_callback(source, (GSourceFunc)func, 
user_data, notify);

  -id = g_source_attach(source, NULL);
+id = g_source_attach(source, 
g_main_context_get_thread_default());

  g_source_unref(source);
return id;
diff --git a/qemu-char.c b/qemu-char.c
index 84f49ac..4340457 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -859,7 +859,7 @@ static gboolean io_watch_poll_prepare(GSource 
*source, gint *timeout_)

  iwp->src = qio_channel_create_watch(
  iwp->ioc, G_IO_IN | G_IO_ERR | G_IO_HUP | G_IO_NVAL);
  g_source_set_callback(iwp->src, iwp->fd_read, 
iwp->opaque, NULL);

-g_source_attach(iwp->src, NULL);
+g_source_attach(iwp->src, 
g_main_context_get_thread_default());

  } else {
  g_source_destroy(iwp->src);
  g_source_unref(iwp->src);
@@ -918,7 +918,7 @@ static guint io_add_watch_poll(QIOChannel *ioc,
  iwp->fd_read = (GSourceFunc) fd_read;
  iwp->src = NULL;
  -tag = g_source_attach(&iwp->parent, NULL);
+tag = g_source_attach(&iwp->parent, 
g_main_context_get_thread_default());

  g_source_unref(&iwp->parent);
  return tag;
  }
@@ -3982,7 +3982,7 @@ int qemu_chr_fe_add_watch(CharDriverState 
*s, GIOCondition cond,

  }
g_source_set_callback(src, (GSourceFunc)func, user_data, 
NULL);

-tag = g_source_attach(src, NULL);
+tag = g_source_attach(src, g_main_context_get_thread_default());
  g_source_unref(src);
return tag;
--
IIRC this opens a gate for your special thread (COLO compare thread?) to use
QIOChannel.

I've no real objection to this proposed patch, though it is fairly pointless
to take it now without seeing any following patch that actually makes use
of this added feature.

I agree.


Should I move this patch into the "[RFC PATCH V5 0/4] Introduce COLO-compare"
patch set? That would show how it works.

You can see how it is used, in colo_compare_thread(), in this patch:
http://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg06754.html




I think in the long run it is better to think about allowing integrating QIO to
AioContext, to support its usage outside main loop.  Given how opaque GSource
is, I'm not sure how feasible that is, or how useful it will be.  Anyway we
should definitely hear more opinions from Daniel and Paolo.

Personally I think it is preferable to stick as close to the standard GSource
model as possible, as that's widely used & well understood API, compared to the
QEMU specific AioContext.

AioContext is more optimized for the case where the callbacks are
static.  In general this is not the case for qemu-char.c.


I'm not sure AioContext can do this job well, but I think
we can make QEMU flexible enough to allow more than one way of doing the
same job. All roads lead to Rome.

Thanks
Zhang Chen



Paolo


.





--
Thanks
zhangchen

Re: [Qemu-devel] [RFC 4/6] target-ppc: add cmprb instruction

2016-07-17 Thread David Gibson

On Tue, Jul 12, 2016 at 11:33:20PM +0530, Nikunj A Dadhania wrote:
> ISA 3.0 Compare Ranged Byte instruction useful for
> isupper/islower/isalpha kind of operation.

At least until you have locale-aware versions of those...

> Signed-off-by: Nikunj A Dadhania 

Reviewed-by: David Gibson 

> ---
>  target-ppc/translate.c | 40 
>  1 file changed, 40 insertions(+)
> 
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 93c7c66..8de217f 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -817,6 +817,45 @@ static void gen_cmpli(DisasContext *ctx)
>  }
>  }
>  
> +/* cmprb - range comparison: isupper, isaplha, islower*/
> +static void gen_cmprb(DisasContext *ctx)
> +{
> +TCGLabel *lab1 = gen_new_label();
> +TCGLabel *lab2 = gen_new_label();
> +TCGv src1 = tcg_temp_local_new();
> +TCGv src2 = tcg_temp_local_new();
> +TCGv src2lo = tcg_temp_local_new();
> +TCGv src2hi = tcg_temp_local_new();
> +
> +tcg_gen_andi_tl(src1, cpu_gpr[rA(ctx->opcode)], 0xFF);
> +tcg_gen_andi_tl(src2, cpu_gpr[rB(ctx->opcode)], 0x);
> +
> +tcg_gen_andi_tl(src2lo, src2, 0xFF);
> +tcg_gen_shri_tl(src2hi, src2, 8);
> +tcg_gen_andi_tl(src2hi, src2hi, 0xFF);
> +
> +tcg_gen_brcond_tl(TCG_COND_GTU, src1, src2hi, lab1);
> +tcg_gen_brcond_tl(TCG_COND_LTU, src1, src2lo, lab1);
> +tcg_gen_movi_i32(cpu_crf[crfD(ctx->opcode)], 1 << CRF_GT);
> +tcg_gen_br(lab2);
> +gen_set_label(lab1);
> +
> +if (ctx->opcode & 0x0020) {
> +tcg_gen_shri_tl(src2hi, src2, 24);
> +tcg_gen_andi_tl(src2hi, src2hi, 0xFF);
> +tcg_gen_shri_tl(src2lo, src2, 16);
> +tcg_gen_andi_tl(src2lo, src2lo, 0xFF);
> +tcg_gen_brcond_tl(TCG_COND_GTU, src1, src2hi, lab2);
> +tcg_gen_brcond_tl(TCG_COND_LTU, src1, src2lo, lab2);
> +tcg_gen_movi_i32(cpu_crf[crfD(ctx->opcode)], 1 << CRF_GT);
> +}
> +gen_set_label(lab2);
> +tcg_temp_free(src1);
> +tcg_temp_free(src2);
> +tcg_temp_free(src2lo);
> +tcg_temp_free(src2hi);
> +}
> +
>  /* isel (PowerPC 2.03 specification) */
>  static void gen_isel(DisasContext *ctx)
>  {
> @@ -9898,6 +9937,7 @@ GEN_HANDLER(cmpi, 0x0B, 0xFF, 0xFF, 0x0040, PPC_INTEGER),
>  GEN_HANDLER(cmpl, 0x1F, 0x00, 0x01, 0x0040, PPC_INTEGER),
>  GEN_HANDLER(cmpli, 0x0A, 0xFF, 0xFF, 0x0040, PPC_INTEGER),
>  GEN_HANDLER_E(cmpb, 0x1F, 0x1C, 0x0F, 0x0001, PPC_NONE, PPC2_ISA205),
> +GEN_HANDLER_E(cmprb, 0x1F, 0x00, 0x06, 0x0041, PPC_NONE, PPC2_ISA300),
>  GEN_HANDLER(isel, 0x1F, 0x0F, 0xFF, 0x0001, PPC_ISEL),
>  GEN_HANDLER(addi, 0x0E, 0xFF, 0xFF, 0x, PPC_INTEGER),
>  GEN_HANDLER(addic, 0x0C, 0xFF, 0xFF, 0x, PPC_INTEGER),
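
[Editor's sketch] The range test that the quoted gen_cmprb() builds out of
TCG branches can be written as a small C reference model. This is only a
sketch read off the quoted patch, not the ISA text: the function name
cmprb_in_range and the two_ranges parameter are hypothetical, with
two_ranges standing in for the opcode bit the patch tests before checking
the second range.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical reference model of the comparison in the quoted gen_cmprb().
 * The low byte of ra is compared against one or two [lo, hi] byte ranges
 * packed into rb: range 1 is (bits 0-7 = lo, bits 8-15 = hi), range 2 is
 * (bits 16-23 = lo, bits 24-31 = hi).
 */
static bool cmprb_in_range(uint64_t ra, uint64_t rb, bool two_ranges)
{
    uint8_t src1 = ra & 0xFF;
    uint8_t lo1 = rb & 0xFF;
    uint8_t hi1 = (rb >> 8) & 0xFF;

    /* First range: matches when lo1 <= src1 <= hi1. */
    if (src1 >= lo1 && src1 <= hi1) {
        return true;
    }

    /* Second range is only consulted when requested. */
    if (two_ranges) {
        uint8_t lo2 = (rb >> 16) & 0xFF;
        uint8_t hi2 = (rb >> 24) & 0xFF;

        if (src1 >= lo2 && src1 <= hi2) {
            return true;
        }
    }
    return false;
}
```

With rb = ('Z' << 8) | 'A' the model behaves like an ASCII-only isupper();
adding ('z' << 24) | ('a' << 16) and setting two_ranges gives an ASCII-only
isalpha(), which is the locale caveat raised in the review comment.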

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [RFC 1/6] target-ppc: Introduce Power9 family

2016-07-17 Thread David Gibson

On Tue, Jul 12, 2016 at 11:33:17PM +0530, Nikunj A Dadhania wrote:
> From: "Aneesh Kumar K.V" 
> 
> Signed-off-by: Aneesh Kumar K.V 
> [ rebased and added POWER9 alias ]
> Signed-off-by: Nikunj A Dadhania 
> ---
>  target-ppc/cpu-models.c |  5 +++
>  target-ppc/cpu-models.h |  2 ++
>  target-ppc/cpu-qom.h|  7 
>  target-ppc/mmu_helper.c |  3 +-
>  target-ppc/translate_init.c | 85 
> -
>  5 files changed, 100 insertions(+), 2 deletions(-)
> 
> diff --git a/target-ppc/cpu-models.c b/target-ppc/cpu-models.c
> index 5209e63..901cf40 100644
> --- a/target-ppc/cpu-models.c
> +++ b/target-ppc/cpu-models.c
> @@ -1147,6 +1147,10 @@
>  "POWER8NVL v1.0")
>  POWERPC_DEF("970_v2.2",  CPU_POWERPC_970_v22,970,
>  "PowerPC 970 v2.2")
> +
> +POWERPC_DEF("POWER9_v1.0",   CPU_POWERPC_POWER9_BASE,POWER9,
> +"POWER9 v1.0")
> +
>  POWERPC_DEF("970fx_v1.0",CPU_POWERPC_970FX_v10,  970,
>  "PowerPC 970FX v1.0 (G5)")
>  POWERPC_DEF("970fx_v2.0",CPU_POWERPC_970FX_v20,  970,
> @@ -1395,6 +1399,7 @@ PowerPCCPUAlias ppc_cpu_aliases[] = {
>  { "POWER8E", "POWER8E_v2.1" },
>  { "POWER8", "POWER8_v2.0" },
>  { "POWER8NVL", "POWER8NVL_v1.0" },
> +{ "POWER9", "POWER9_v1.0" },
>  { "970", "970_v2.2" },
>  { "970fx", "970fx_v3.1" },
>  { "970mp", "970mp_v1.1" },
> diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h
> index f21a44c..beeaaba 100644
> --- a/target-ppc/cpu-models.h
> +++ b/target-ppc/cpu-models.h
> @@ -562,6 +562,8 @@ enum {
>  CPU_POWERPC_POWER8_v20 = 0x004D0200,
>  CPU_POWERPC_POWER8NVL_BASE = 0x004C,
>  CPU_POWERPC_POWER8NVL_v10  = 0x004C0100,
> +CPU_POWERPC_POWER9_BASE= 0x004E,
> +CPU_POWERPC_POWER9_MAM = 0x004E0100,
>  CPU_POWERPC_970_v22= 0x00390202,
>  CPU_POWERPC_970FX_v10  = 0x00391100,
>  CPU_POWERPC_970FX_v20  = 0x003C0200,
> diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h
> index 2864105..df2fb65 100644
> --- a/target-ppc/cpu-qom.h
> +++ b/target-ppc/cpu-qom.h
> @@ -86,6 +86,13 @@ enum powerpc_mmu_t {
>  POWERPC_MMU_2_07   = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
>   | POWERPC_MMU_64K
>   | POWERPC_MMU_AMR | 0x0004,
> +/* for now , We can add radix later if needed */

I'm guessing this means you're only thinking about the guest-side
presentation of the P9 MMU at this point?  IIUC the host side
presentation is so different that sharing any constants with pre-P9
MMUs probably doesn't make sense.

I'm not immediately sure how we should make this distinction in the
target-ppc code, since these values are supposed to belong to the CPU
regardless of operating mode.


> +/* POWERPC_MMU_3_00   = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
> + * | POWERPC_MMU_AMR | 0x0005,
> + */
> +
> +POWERPC_MMU_3_00   = POWERPC_MMU_64 | POWERPC_MMU_AMR | 0x0005,
> +
>  /* Architecture 2.07 "degraded" (no 1T segments)   */
>  POWERPC_MMU_2_07a  = POWERPC_MMU_64 | POWERPC_MMU_AMR
>   | 0x0004,
> diff --git a/target-ppc/mmu_helper.c b/target-ppc/mmu_helper.c
> index 485d5b8..6219c4a 100644
> --- a/target-ppc/mmu_helper.c
> +++ b/target-ppc/mmu_helper.c
> @@ -1935,13 +1935,14 @@ void ppc_tlb_invalidate_all(CPUPPCState *env)
>  case POWERPC_MMU_2_06a:
>  case POWERPC_MMU_2_07:
>  case POWERPC_MMU_2_07a:
> +case POWERPC_MMU_3_00:
>  #endif /* defined(TARGET_PPC64) */
>  env->tlb_need_flush = 0;
>  tlb_flush(CPU(cpu), 1);
>  break;
>  default:
>  /* XXX: TODO */
> -cpu_abort(CPU(cpu), "Unknown MMU model\n");
> +cpu_abort(CPU(cpu), "Unknown MMU model %d\n", env->mmu_model);
>  break;
>  }
>  }
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index 8f257fb..51bab23 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -7459,7 +7459,8 @@ enum BOOK3S_CPU_TYPE {
>  BOOK3S_CPU_POWER5PLUS,
>  BOOK3S_CPU_POWER6,
>  BOOK3S_CPU_POWER7,
> -BOOK3S_CPU_POWER8
> +BOOK3S_CPU_POWER8,
> +BOOK3S_CPU_POWER9
>  };
>  
>  static void gen_fscr_facility_check(DisasContext *ctx, int facility_sprn,
> @@ -8241,6 +8242,7 @@ static void init_proc_book3s_64(CPUPPCState *env, int version)
>  break;
>  case BOOK3S_CPU_POWER7:
>  case BOOK3S_CPU_POWER8:
> +case BOOK3S_CPU_POWER9:
>  gen_spr_book3s_ids(env);
>  gen_spr_amr(env, version >= BOOK3S_CPU_POWER8);
>  gen_spr_book3s_purr(env);
> @@ -8293,6 +8295,7 @@ static void init_proc_book3s_64(CPUPPCState *env, int version)
>  break;
>  case BOOK3S_CPU_POWER7:
>  case BOOK3S_CPU_POWER8:
> +case BOOK3S_CPU_POWER9

Re: [Qemu-devel] [RFC 2/6] target-ppc: Introduce POWER ISA 3.0 flag

2016-07-17 Thread David Gibson

On Tue, Jul 12, 2016 at 11:33:18PM +0530, Nikunj A Dadhania wrote:
> This flag will be used for POWER9 instructions.
> 
> Signed-off-by: Nikunj A Dadhania 

Reviewed-by: David Gibson 

> ---
>  target-ppc/cpu.h| 5 -
>  target-ppc/translate_init.c | 2 +-
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
> index 2666a3f..f48ff0f 100644
> --- a/target-ppc/cpu.h
> +++ b/target-ppc/cpu.h
> @@ -2093,6 +2093,8 @@ enum {
>  PPC2_TM= 0x0002ULL,
>  /* Server PM instructgions (ISA 2.06, Book III)  */
>  PPC2_PM_ISA206 = 0x0004ULL,
> +/* POWER ISA 3.0 */
> +PPC2_ISA300= 0x0008ULL,
>  
>  #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
>  PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \
> @@ -2100,7 +2102,8 @@ enum {
>  PPC2_FP_CVT_ISA206 | PPC2_FP_TST_ISA206 | \
>  PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | \
>  PPC2_ALTIVEC_207 | PPC2_ISA207S | PPC2_DFP | \
> -PPC2_FP_CVT_S64 | PPC2_TM | PPC2_PM_ISA206)
> +PPC2_FP_CVT_S64 | PPC2_TM | PPC2_PM_ISA206 | \
> +PPC2_ISA300)
>  };
>  
>  
> /*/
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index 51bab23..9852524 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -8820,7 +8820,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
>  PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 |
>  PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 |
>  PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
> -PPC2_TM | PPC2_PM_ISA206;
> +PPC2_TM | PPC2_PM_ISA206 | PPC2_ISA300;
>  pcc->msr_mask = (1ull << MSR_SF) |
>  (1ull << MSR_TM) |
>  (1ull << MSR_VR) |

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [RFC 5/6] target-ppc: add modulo word operations

2016-07-17 Thread David Gibson

On Tue, Jul 12, 2016 at 11:33:21PM +0530, Nikunj A Dadhania wrote:
> Adding following instructions:
> 
> moduw: Modulo Unsigned Word
> modsw: Modulo Signed Word
> 
> Signed-off-by: Nikunj A Dadhania 

Hrm.. any reason you're not using the TCG inbuilt remainder ops
(tcg_gen_rem_i32() etc.)?

> ---
>  target-ppc/translate.c | 50 
> ++
>  1 file changed, 50 insertions(+)
> 
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 8de217f..c505684 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -1178,6 +1178,54 @@ GEN_DIVE(divde, divde, 0);
>  GEN_DIVE(divdeo, divde, 1);
>  #endif
>  
> +static inline void gen_op_arith_modw(DisasContext *ctx, TCGv ret, TCGv arg1,
> + TCGv arg2, int sign)
> +{
> +TCGLabel *l1 = gen_new_label();
> +TCGLabel *l2 = gen_new_label();
> +TCGv_i32 t0 = tcg_temp_local_new_i32();
> +TCGv_i32 t1 = tcg_temp_local_new_i32();
> +TCGv_i32 t2 = tcg_temp_local_new_i32();
> +
> +tcg_gen_trunc_tl_i32(t0, arg1);
> +tcg_gen_trunc_tl_i32(t1, arg2);
> +tcg_gen_brcondi_i32(TCG_COND_EQ, t1, 0, l1);
> +if (sign) {
> +TCGLabel *l3 = gen_new_label();
> +tcg_gen_brcondi_i32(TCG_COND_NE, t1, -1, l3);
> +tcg_gen_brcondi_i32(TCG_COND_EQ, t0, INT32_MIN, l1);
> +gen_set_label(l3);
> +tcg_gen_div_i32(t2, t0, t1);
> +} else {
> +tcg_gen_divu_i32(t2, t0, t1);
> +}
> +tcg_gen_mul_i32(t2, t2, t1);
> +tcg_gen_sub_i32(t2, t0, t2);
> +tcg_gen_br(l2);
> +gen_set_label(l1);
> +if (sign) {
> +tcg_gen_sari_i32(t2, t0, 31);
> +} else {
> +tcg_gen_movi_i32(t2, 0);
> +}
> +gen_set_label(l2);
> +tcg_gen_extu_i32_tl(ret, t2);
> +tcg_temp_free_i32(t0);
> +tcg_temp_free_i32(t1);
> +tcg_temp_free_i32(t2);
> +}
> +
> +#define GEN_INT_ARITH_MODW(name, opc3, sign)\
> +static void glue(gen_, name)(DisasContext *ctx) \
> +{   \
> +gen_op_arith_modw(ctx, cpu_gpr[rD(ctx->opcode)],\
> +  cpu_gpr[rA(ctx->opcode)], cpu_gpr[rB(ctx->opcode)],   \
> +  sign);\
> +}
> +
> +GEN_INT_ARITH_MODW(modsw, 0x18, 1);
> +GEN_INT_ARITH_MODW(moduw, 0x08, 0);
> +
>  /* mulhw  mulhw. */
>  static void gen_mulhw(DisasContext *ctx)
>  {
> @@ -10244,6 +10292,8 @@ GEN_HANDLER_E(divwe, 0x1F, 0x0B, 0x0D, 0, PPC_NONE, PPC2_DIVE_ISA206),
>  GEN_HANDLER_E(divweo, 0x1F, 0x0B, 0x1D, 0, PPC_NONE, PPC2_DIVE_ISA206),
>  GEN_HANDLER_E(divweu, 0x1F, 0x0B, 0x0C, 0, PPC_NONE, PPC2_DIVE_ISA206),
>  GEN_HANDLER_E(divweuo, 0x1F, 0x0B, 0x1C, 0, PPC_NONE, PPC2_DIVE_ISA206),
> +GEN_HANDLER_E(modsw, 0x1F, 0x0B, 0x18, 0x0001, PPC_NONE, PPC2_ISA300),
> +GEN_HANDLER_E(moduw, 0x1F, 0x0B, 0x08, 0x0001, PPC_NONE, PPC2_ISA300),
>  
>  #if defined(TARGET_PPC64)
>  #undef GEN_INT_ARITH_DIVD
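
[Editor's sketch] The quoted gen_op_arith_modw() computes the remainder as
a - (a / b) * b, guarded against a zero divisor and INT32_MIN / -1, which is
what a dedicated remainder op would replace. The C model below mirrors that
arithmetic. The function names are hypothetical, and the values returned on
the guarded paths simply copy what the quoted patch does there (sign fill
for the signed case, zero for the unsigned case); nothing here is taken from
the ISA text.

```c
#include <stdint.h>

/* Signed 32-bit remainder via divide, multiply, subtract. */
static int32_t mods_via_div(int32_t a, int32_t b)
{
    /* These two cases are undefined behaviour for C's / and % as well. */
    if (b == 0 || (a == INT32_MIN && b == -1)) {
        /* Mirrors tcg_gen_sari_i32(t2, t0, 31) in the quoted patch. */
        return a < 0 ? -1 : 0;
    }
    /* Truncating division, so this equals a % b for valid inputs. */
    return a - (a / b) * b;
}

/* Unsigned 32-bit remainder via divide, multiply, subtract. */
static uint32_t modu_via_div(uint32_t a, uint32_t b)
{
    if (b == 0) {
        /* Mirrors tcg_gen_movi_i32(t2, 0) in the quoted patch. */
        return 0;
    }
    return a - (a / b) * b;
}
```

For every input that passes the guards the result is identical to C's %
operator, which is why a single remainder op can stand in for the
divide/multiply/subtract sequence while the guards stay as they are.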

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [RFC 3/6] target-ppc: adding addpcis instruction

2016-07-17 Thread David Gibson

On Tue, Jul 12, 2016 at 11:33:19PM +0530, Nikunj A Dadhania wrote:
> ISA 3.0 instruction that adds an immediate value to the next instruction
> address and returns the result in the target register.
> 
> Signed-off-by: Nikunj A Dadhania 

Reviewed-by: David Gibson 

> ---
>  target-ppc/translate.c | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 92030b6..93c7c66 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -432,6 +432,20 @@ static inline uint32_t name(uint32_t opcode) \
>  return (((opcode >> (shift1)) & ((1 << (nb1)) - 1)) << nb2) | \
>  ((opcode >> (shift2)) & ((1 << (nb2)) - 1)); \
>  }
> +
> +#define EXTRACT_HELPER_DXFORM(name, \
> +  d0_bits, shift_op_d0, shift_d0, \
> +  d1_bits, shift_op_d1, shift_d1, \
> +  d2_bits, shift_op_d2, shift_d2) \
> +static inline int16_t name(uint32_t opcode) \
> +{ \
> +return \
> +(((opcode >> (shift_op_d0)) & ((1 << (d0_bits)) - 1)) << (shift_d0)) | \
> +(((opcode >> (shift_op_d1)) & ((1 << (d1_bits)) - 1)) << (shift_d1)) | \
> +(((opcode >> (shift_op_d2)) & ((1 << (d2_bits)) - 1)) << (shift_d2)); \
> +}
> +
> +
>  /* Opcode part 1 */
>  EXTRACT_HELPER(opc1, 26, 6);
>  /* Opcode part 2 */
> @@ -501,6 +515,9 @@ EXTRACT_HELPER(FPL, 25, 1);
>  EXTRACT_HELPER(FPFLM, 17, 8);
>  EXTRACT_HELPER(FPW, 16, 1);
>  
> +/* addpcis */
> +EXTRACT_HELPER_DXFORM(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
> +
>  /***Jump target decoding   ***/
>  /* Immediate address */
>  static inline target_ulong LI(uint32_t opcode)
> @@ -984,6 +1001,15 @@ static void gen_addis(DisasContext *ctx)
>  }
>  }
>  
> +/* addpcis */
> +static void gen_addpcis(DisasContext *ctx)
> +{
> +target_long d = DX(ctx->opcode);
> +
> +tcg_gen_movi_tl(cpu_gpr[rD(ctx->opcode)], ctx->nip);
> +tcg_gen_addi_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rD(ctx->opcode)], d);
> +}
> +
>  static inline void gen_op_arith_divw(DisasContext *ctx, TCGv ret, TCGv arg1,
>   TCGv arg2, int sign, int compute_ov)
>  {
> @@ -9877,6 +9903,7 @@ GEN_HANDLER(addi, 0x0E, 0xFF, 0xFF, 0x, PPC_INTEGER),
>  GEN_HANDLER(addic, 0x0C, 0xFF, 0xFF, 0x, PPC_INTEGER),
>  GEN_HANDLER2(addic_, "addic.", 0x0D, 0xFF, 0xFF, 0x, PPC_INTEGER),
>  GEN_HANDLER(addis, 0x0F, 0xFF, 0xFF, 0x, PPC_INTEGER),
> +GEN_HANDLER_E(addpcis, 0x13, 0x2, 0xFF, 0x, PPC_NONE, PPC2_ISA300),
>  GEN_HANDLER(mulhw, 0x1F, 0x0B, 0x02, 0x0400, PPC_INTEGER),
>  GEN_HANDLER(mulhwu, 0x1F, 0x0B, 0x00, 0x0400, PPC_INTEGER),
>  GEN_HANDLER(mullw, 0x1F, 0x0B, 0x07, 0x, PPC_INTEGER),
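
[Editor's sketch] Expanding the quoted EXTRACT_HELPER_DXFORM(DX, 10, 6, 6,
5, 16, 1, 1, 0, 0) by hand shows how the split immediate is reassembled from
three opcode fields. The function name dx_field is hypothetical; the shifts
and widths are copied from the macro arguments in the quoted patch, and the
sketch covers only the field extraction, not how gen_addpcis() then uses the
value.

```c
#include <stdint.h>

/*
 * Hand expansion of the DX extractor from the quoted patch:
 *   d0: 10 bits taken from opcode bit 6,  placed at result bit 6
 *   d1:  5 bits taken from opcode bit 16, placed at result bit 1
 *   d2:  1 bit  taken from opcode bit 0,  placed at result bit 0
 */
static int16_t dx_field(uint32_t opcode)
{
    uint32_t d0 = (opcode >> 6) & 0x3FF;
    uint32_t d1 = (opcode >> 16) & 0x1F;
    uint32_t d2 = opcode & 0x1;

    /* The int16_t return type supplies the sign extension, as in the
     * quoted macro; the narrowing relies on two's complement wraparound. */
    return (int16_t)((d0 << 6) | (d1 << 1) | d2);
}
```

The three pieces occupy result bits 6-15, 1-5 and 0 without overlap, so an
opcode with all three fields set to ones yields 0xFFFF, which reads back as
-1 through the int16_t return type.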

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



[Qemu-devel] [Bug 1603693] Re: Disks in mptsas1068 scsi controller not seen by linux

2016-07-17 Thread Dx

Welp. Yeah now I see it, it was in the test case I linked. Thanks.

Vmware doesn't seem to need this. Seems like it assigns a WWN of
0x5000c293944837df to my disk (not in the vm config files as far as i
can see, seems to persist across reboots)

[2.305111] ioc0: LSISAS1068 B0: Capabilities={Initiator}
[2.445800] scsi host2: ioc0: LSISAS1068 B0, FwRev=01032920h, Ports=1, MaxQ=128, IRQ=18
[2.447672] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x5000c293944837df
[2.448806] scsi 2:0:0:0: Direct-Access VMware,  VMware Virtual S 1.0  PQ: 0 ANSI: 2

Qemu with the manually specified WWN, for reference:

[3.656894] ioc0: LSISAS1068 A0: Capabilities={Initiator}
[3.790680] scsi host0: ioc0: LSISAS1068 A0, FwRev=01329200h, Ports=8, MaxQ=128, IRQ=10
[3.792232] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x5000c50015ea71ac
[3.792476] scsi 0:0:0:0: Direct-Access QEMU QEMU HARDDISK2.5+ PQ: 0 ANSI: 5

Also vmware doesn't populate /dev/disk/by-id/wwn-*:

# ls /dev/disk/by-id
ata-VMware_Virtual_IDE_CDROM_Drive_0001@  dm-name-arch_airootfs@

Qemu:

# ls /dev/disk/by-id
ata-QEMU_DVD-ROM_QM2@  scsi-35000c50015ea71ac@
scsi-35000c50015ea71ac-part2@  wwn-0x5000c50015ea71ac@
wwn-0x5000c50015ea71ac-part2@
dm-name-arch_airootfs@ scsi-35000c50015ea71ac-part1@  
scsi-35000c50015ea71ac-part3@  wwn-0x5000c50015ea71ac-part1@  
wwn-0x5000c50015ea71ac-part3@


Not directly related: after getting the arch iso cd to boot, I found
that the VM that I actually wanted to get working uses mptspi instead of
mptsas. So I didn't even need this controller...

The non-working vmware config says `scsi0.virtualDev = "lsilogic"`
(that's mptspi, LSI53C1030 or "LSI Logic Ultra 320"). For the mptsas
tests above, I changed it to `scsi0.virtualDev = "lsisas1068"`.

Is it correct to say that the LSI53C1030 parts of [1] were never
applied?

[1]: http://lists.gnu.org/archive/html/qemu-devel/2012-09/msg01608.html

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1603693

Title:
  Disks in mptsas1068 scsi controller not seen by linux

Status in QEMU:
  New

Bug description:
  When using the mptsas1068 scsi controller, linux detects the
  controller itself but not the drives attached to it. Freebsd works.
  Using a different controller with linux works. VMware with linux
  works.

  qemu 2.6.50 (v2.6.0-1925-g6b92bbf)
  seabios rel-1.9.0-139-gae3f78f (master branch, required for mptsas1068 support)

  Test script, loosely based off what libvirt runs and the libvirt tests
  that Paolo Bonzini wrote [1]

  #
  iso=archlinux-2016.07.01-dual.iso
  #iso=FreeBSD-10.3-RELEASE-amd64-bootonly.iso
  device=mptsas1068
  #device=lsi

  img=empty.img
  qemu-img create -f qcow2 $img 1G

  /usr/bin/qemu-system-x86_64 \
  -enable-kvm \
  -m 1024 \
  -boot menu=on \
  -device $device,id=scsi0,bus=pci.0,addr=0x9 \
  -drive file=$img,format=qcow2,if=none,id=drive-scsi0-0-0-0 \
  -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 \
  -drive file=$iso,format=raw,if=none,id=drive-ide0-0-1,readonly=on \
  -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=1
  #

  The ISOs can be downloaded from [2] and [3].

  After booting linux, do "lsblk". /dev/sda should exist.

  After booting freebsd, do "geom disk list". A da0 / "QEMU QEMU
  HARDDISK" should be mentioned.

  With device=mptsas1068 this fails in linux.

  With device=lsi line it works in both.

  With VMWare and a linux VM (opensuse 10.1, kernel 2.6.18) which only
  loads modules for mptsas1068, this works.

  I also reproduced this with the debian 8.5 netinstall image, but it
  insists in making you pick a driver from a list of modules when it
  fails to mount it, instead of dropping to a shell.

  Arch linux dmesg output snippet (full output attached as arch-linux-
  dmesg.txt):

  #
  root@archiso ~ # dmesg | grep -i -e mpt -e scsi -e ioc0
  [0.00] Linux version 4.6.3-1-ARCH (builduser@tobias) (gcc version 6.1.1 20160602 (GCC) ) #1 SMP PREEMPT Fri Jun 24 21:19:13 CEST 2016
  [0.00]   Normal   empty
  [0.00] Preemptible hierarchical RCU implementation.
  [1.879616] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249)
  [1.951581] SCSI subsystem initialized
  [1.957113] Fusion MPT base driver 3.04.20
  [1.957618] Fusion MPT SAS Host driver 3.04.20
  [2.281773] scsi host0: ata_piix
  [2.285372] scsi host1: ata_piix
  [2.305803] mptbase: ioc0: Initiating bringup
  [2.363555] ioc0: LSISAS1068 A0: Capabilities={Initiator}
  [2.444390] scsi 0:0:1:0: CD-ROMQEMU QEMU DVD-ROM 2.5+ PQ: 0 ANSI: 5
  [2.500572] scsi host2: ioc0: LSISAS1068 A0, FwRev=013292

[Qemu-devel] [PATCH qemu] xhci: Fix possible side effect from assert()

2016-07-17 Thread Alexey Kardashevskiy

A static analysis tool called BEAM detected a possible side effect in an
assert(): the asserted expression calls a helper which may change an XHCI
ring on every call.

This moves xhci_ring_fetch() out of assert() so that it is called whether
or not assertions are compiled in.

Signed-off-by: Alexey Kardashevskiy 
---
 hw/usb/hcd-xhci.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index 976bfb0..188f954 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -2201,7 +2201,9 @@ static void xhci_kick_ep(XHCIState *xhci, unsigned int slotid,
 xfer->trb_count = length;
 
 for (i = 0; i < length; i++) {
-assert(xhci_ring_fetch(xhci, ring, &xfer->trbs[i], NULL));
+TRBType type;
+type = xhci_ring_fetch(xhci, ring, &xfer->trbs[i], NULL);
+assert(type);
 }
 xfer->streamid = streamid;
 
-- 
2.5.0.rc3
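
[Editor's sketch] The hazard addressed by the patch above can be reproduced
in a few lines of standalone C. Everything named here (RELEASE_ASSERT,
toy_ring, toy_ring_fetch) is hypothetical and only stands in for the real
xhci code; RELEASE_ASSERT imitates what the standard assert() macro expands
to when NDEBUG is defined, namely an expression that is never evaluated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for assert() under NDEBUG: the argument is
 * discarded at preprocessing time and never evaluated. */
#define RELEASE_ASSERT(expr) ((void)0)

/* Hypothetical ring with a read cursor; not QEMU's XHCIRing. */
struct toy_ring {
    size_t next;
};

/* Returns a nonzero "type" and advances the cursor (the side effect). */
static int toy_ring_fetch(struct toy_ring *ring)
{
    ring->next++;
    return 1;
}

/* Buggy shape: the fetch lives inside the assertion and disappears. */
static size_t fetch_inside_assert(size_t length)
{
    struct toy_ring ring = { 0 };

    for (size_t i = 0; i < length; i++) {
        RELEASE_ASSERT(toy_ring_fetch(&ring));
    }
    return ring.next;
}

/* Fixed shape: fetch unconditionally, then assert on the result. */
static size_t fetch_then_assert(size_t length)
{
    struct toy_ring ring = { 0 };

    for (size_t i = 0; i < length; i++) {
        int type = toy_ring_fetch(&ring);
        assert(type);
        (void)type; /* keeps the variable used when assert() is disabled */
    }
    return ring.next;
}
```

In the first function the cursor never moves because the call was compiled
out along with the assertion; in the second the cursor advances once per
iteration regardless of how assert() is configured.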

[Qemu-devel] [PATCH] virtio-blk: dataplane cleanup

2016-07-17 Thread Cao jin

No need to duplicate the check; there is already one at function entry.

Cc: Stefan Hajnoczi 
Cc: Kevin Wolf 
Cc: Max Reitz 
Signed-off-by: Cao jin 
---
 hw/block/dataplane/virtio-blk.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 54b9ac1..704a763 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -112,10 +112,8 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf,
 s->vdev = vdev;
 s->conf = conf;
 
-if (conf->iothread) {
-s->iothread = conf->iothread;
-object_ref(OBJECT(s->iothread));
-}
+s->iothread = conf->iothread;
+object_ref(OBJECT(s->iothread));
 s->ctx = iothread_get_aio_context(s->iothread);
 s->bh = aio_bh_new(s->ctx, notify_guest_bh, s);
 s->batch_notify_vqs = bitmap_new(conf->num_queues);
-- 
2.1.0

[Qemu-devel] [PULL 12/14] ppc/mmu-hash64: Remove duplicated #include statement

2016-07-17 Thread David Gibson

From: Thomas Huth 

No need to include error-report.h twice here.

Signed-off-by: Thomas Huth 
Signed-off-by: David Gibson 
---
 target-ppc/mmu-hash64.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
index 82c2186..f6ffe35 100644
--- a/target-ppc/mmu-hash64.c
+++ b/target-ppc/mmu-hash64.c
@@ -24,7 +24,6 @@
 #include "exec/helper-proto.h"
 #include "qemu/error-report.h"
 #include "sysemu/kvm.h"
-#include "qemu/error-report.h"
 #include "kvm_ppc.h"
 #include "mmu-hash64.h"
 #include "exec/log.h"
-- 
2.7.4

[Qemu-devel] [PULL 07/14] dbdma: reset io->processing flag for unassigned DBDMA channel rw accesses

2016-07-17 Thread David Gibson

From: Mark Cave-Ayland 

Otherwise MacOS 9 hangs upon shutdown.

Signed-off-by: Mark Cave-Ayland 
Acked-by: Benjamin Herrenschmidt 
Signed-off-by: David Gibson 
---
 hw/misc/macio/mac_dbdma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c
index ef5b0a5..15452b9 100644
--- a/hw/misc/macio/mac_dbdma.c
+++ b/hw/misc/macio/mac_dbdma.c
@@ -778,6 +778,7 @@ static void dbdma_unassigned_rw(DBDMA_io *io)
 DBDMA_channel *ch = io->channel;
 qemu_log_mask(LOG_GUEST_ERROR, "%s: use of unassigned channel %d\n",
   __func__, ch->channel);
+ch->io.processing = false;
 }
 
 static void dbdma_unassigned_flush(DBDMA_io *io)
-- 
2.7.4

[Qemu-devel] [PULL 02/14] dbdma: always define DBDMA_DPRINTF and enable debug with DEBUG_DBDMA

2016-07-17 Thread David Gibson

From: Mark Cave-Ayland 

Enabling DBDMA_DPRINTF unconditionally ensures that any errors in debug
statements are picked up immediately.

Signed-off-by: Mark Cave-Ayland 
Acked-by: Benjamin Herrenschmidt 
Signed-off-by: David Gibson 
---
 hw/misc/macio/mac_dbdma.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c
index f116f9c..b6639f4 100644
--- a/hw/misc/macio/mac_dbdma.c
+++ b/hw/misc/macio/mac_dbdma.c
@@ -45,14 +45,13 @@
 #include "sysemu/dma.h"
 
 /* debug DBDMA */
-//#define DEBUG_DBDMA
+#define DEBUG_DBDMA 0
 
-#ifdef DEBUG_DBDMA
-#define DBDMA_DPRINTF(fmt, ...) \
-do { printf("DBDMA: " fmt , ## __VA_ARGS__); } while (0)
-#else
-#define DBDMA_DPRINTF(fmt, ...)
-#endif
+#define DBDMA_DPRINTF(fmt, ...) do { \
+if (DEBUG_DBDMA) { \
+printf("DBDMA: " fmt , ## __VA_ARGS__); \
+} \
+} while (0);
 
 /*
  */
@@ -62,7 +61,7 @@ static DBDMAState *dbdma_from_ch(DBDMA_channel *ch)
 return container_of(ch, DBDMAState, channels[ch->channel]);
 }
 
-#ifdef DEBUG_DBDMA
+#if DEBUG_DBDMA
 static void dump_dbdma_cmd(dbdma_cmd *cmd)
 {
 printf("dbdma_cmd %p\n", cmd);
-- 
2.7.4
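
[Editor's sketch] The pattern in the patch above, a debug macro that is
always compiled but guarded by a constant, can be shown in isolation. The
names DEMO_DEBUG, DEMO_DPRINTF and classify are hypothetical. One deliberate
difference from the quoted macro: the sketch omits the semicolon after
"while (0)", on the assumption that the macro should behave like a single
statement inside an unbraced if/else. The ", ## __VA_ARGS__" form is the
same GNU extension the quoted code relies on.

```c
#include <stdio.h>

/* Hypothetical switch, mirroring the 0/1 style of DEBUG_DBDMA. */
#define DEMO_DEBUG 0

/*
 * Always compiled, so a mismatch between the format string and its
 * arguments is diagnosed even while output is disabled; the constant
 * condition lets the compiler discard the call. There is no trailing
 * semicolon after "while (0)": the caller supplies it.
 */
#define DEMO_DPRINTF(fmt, ...) do { \
        if (DEMO_DEBUG) { \
            printf("DEMO: " fmt, ## __VA_ARGS__); \
        } \
    } while (0)

/* Uses the macro in an unbraced if/else, which only compiles because the
 * macro expands to exactly one statement. */
static int classify(int value)
{
    if (value < 0)
        DEMO_DPRINTF("negative value %d\n", value);
    else
        DEMO_DPRINTF("non-negative value %d\n", value);

    return value < 0 ? -1 : 1;
}
```

Had the macro ended in "while (0);", the caller's own semicolon would add an
empty statement after the if branch and the else would no longer attach,
giving a compile error in classify().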

[Qemu-devel] [PULL 06/14] dbdma: set FLUSH bit upon reception of flush command for unassigned DBDMA channels

2016-07-17 Thread David Gibson

From: Mark Cave-Ayland 

This fixes MacOS 9 whereby it continually flushes and polls the status bits
until they are set to indicate a successful flush.

Signed-off-by: Mark Cave-Ayland 
Acked-by: Benjamin Herrenschmidt 
Signed-off-by: David Gibson 
---
 hw/misc/macio/mac_dbdma.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c
index c5dd0ac..ef5b0a5 100644
--- a/hw/misc/macio/mac_dbdma.c
+++ b/hw/misc/macio/mac_dbdma.c
@@ -783,8 +783,18 @@ static void dbdma_unassigned_rw(DBDMA_io *io)
 static void dbdma_unassigned_flush(DBDMA_io *io)
 {
 DBDMA_channel *ch = io->channel;
+dbdma_cmd *current = &ch->current;
+uint16_t cmd;
 qemu_log_mask(LOG_GUEST_ERROR, "%s: use of unassigned channel %d\n",
   __func__, ch->channel);
+
+cmd = le16_to_cpu(current->command) & COMMAND_MASK;
+if (cmd == OUTPUT_MORE || cmd == OUTPUT_LAST ||
+cmd == INPUT_MORE || cmd == INPUT_LAST) {
+current->xfer_status = cpu_to_le16(ch->regs[DBDMA_STATUS] | FLUSH);
+current->res_count = cpu_to_le16(io->len);
+dbdma_cmdptr_save(ch);
+}
 }
 
 void* DBDMA_init (MemoryRegion **dbdma_mem)
-- 
2.7.4

[Qemu-devel] [PULL 00/14] ppc-for-2.7 queue 20160718

2016-07-17 Thread David Gibson

The following changes since commit 6b92bbfe812746fe7841a24c24e6460f5359ce72:

  Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' 
into staging (2016-07-15 16:56:08 +0100)

are available in the git repository at:

  git://github.com/dgibson/qemu.git tags/ppc-for-2.7-20160718

for you to fetch changes up to 159d2e39a8602c369542a92573a52acb5f5f58f2:

  ppc: Yet another fix for the huge page support detection mechanism 
(2016-07-18 10:52:19 +1000)


ppc patch queue 2016-07-18

Here's what ought to be the final ppc pull request before the 2.7 hard
freeze.  This set contains a rework of the DBDMA device for Mac
platforms, and some assorted cleanups and bugfixes.


Benjamin Herrenschmidt (1):
  ppc: Fix support for odd MSR combinations

Bharata B Rao (1):
  spapr: Ensure CPU cores are added contiguously and removed in LIFO order

David Gibson (1):
  vfio/spapr: Remove stale ioctl() call

Greg Kurz (2):
  spapr: fix core unplug crash
  ppc: abort if compat property contains an unknown value

Mark Cave-Ayland (6):
  dbdma: always define DBDMA_DPRINTF and enable debug with DEBUG_DBDMA
  dbdma: add per-channel debugging enabled via DEBUG_DBDMA_CHANMASK
  dbdma: fix endian of DBDMA_CMDPTR_LO during branch
  dbdma: fix load_word/store_word value endianness
  dbdma: set FLUSH bit upon reception of flush command for unassigned DBDMA 
channels
  dbdma: reset io->processing flag for unassigned DBDMA channel rw accesses

Paolo Bonzini (1):
  target-ppc: fix left shift overflow in hpte_page_shift

Thomas Huth (2):
  ppc/mmu-hash64: Remove duplicated #include statement
  ppc: Yet another fix for the huge page support detection mechanism

 hw/misc/macio/mac_dbdma.c   | 125 +++-
 hw/ppc/spapr_cpu_core.c |  27 --
 hw/vfio/spapr.c |   1 -
 target-ppc/helper_regs.h|  46 
 target-ppc/kvm.c|  10 ++--
 target-ppc/mmu-hash64.c |   3 +-
 target-ppc/translate_init.c |   4 +-
 7 files changed, 119 insertions(+), 97 deletions(-)

[Qemu-devel] [PULL 14/14] ppc: Yet another fix for the huge page support detection mechanism

2016-07-17 Thread David Gibson

From: Thomas Huth 

Commit 86b50f2e1bef ("Disable huge page support if it is not available
for main RAM") already made sure that huge page support is not announced
to the guest if the normal RAM of non-NUMA configurations is not backed
by a huge page filesystem. However, there is one more case that can go
wrong: NUMA is enabled, but the RAM of the NUMA nodes is not configured
with huge page support (and only the memory of a DIMM is configured with
it). When QEMU is started with the following command line for example,
the Linux guest currently crashes because it is trying to use huge pages
on a memory region that does not support huge pages:

 qemu-system-ppc64 -enable-kvm ... -m 1G,slots=4,maxmem=32G -object \
   memory-backend-file,policy=default,mem-path=/hugepages,size=1G,id=mem-mem1 \
   -device pc-dimm,id=dimm-mem1,memdev=mem-mem1 -smp 2 \
   -numa node,nodeid=0 -numa node,nodeid=1

To fix this issue, we've got to make sure to disable huge page support,
too, when there is a NUMA node that is not using a memory backend with
huge page support.

Fixes: 86b50f2e1befc33407bdfeb6f45f7b0d2439a740
Signed-off-by: Thomas Huth 
Signed-off-by: David Gibson 
---
 target-ppc/kvm.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 884d564..7a8f555 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -389,12 +389,16 @@ static long getrampagesize(void)
 
 object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
 
-if (hpsize == LONG_MAX) {
+if (hpsize == LONG_MAX || hpsize == getpagesize()) {
 return getpagesize();
 }
 
-if (nb_numa_nodes == 0 && hpsize > getpagesize()) {
-/* No NUMA nodes and normal RAM without -mem-path ==> no huge pages! */
+/* If NUMA is disabled or the NUMA nodes are not backed with a
+ * memory-backend, then there is at least one node using "normal"
+ * RAM. And since normal RAM has not been configured with "-mem-path"
+ * (what we've checked earlier here already), we can not use huge pages!
+ */
+if (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL) {
 static bool warned;
 if (!warned) {
 error_report("Huge page support disabled (n/a for main memory).");
-- 
2.7.4
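
The decision described in this commit message can be sketched as a pure function. This is only an illustration under stated assumptions: example_ram_page_size and its parameters are hypothetical, and because the hunk above is cut off after the warning, the fallback to the base page size in the second branch is assumed rather than quoted.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * hpsize:           smallest page size found among the memory backends,
 *                   or LONG_MAX if no backend was found
 * base_page_size:   what getpagesize() would return
 * nb_numa_nodes:    number of configured NUMA nodes
 * node0_has_memdev: whether NUMA node 0 is backed by a memory backend
 */
static long example_ram_page_size(long hpsize, long base_page_size,
                                  int nb_numa_nodes, bool node0_has_memdev)
{
    /* No backend at all, or nothing larger than a base page. */
    if (hpsize == LONG_MAX || hpsize == base_page_size) {
        return base_page_size;
    }

    /* Some RAM is "normal" RAM, so huge pages cannot be announced
     * (assumed fallback, see the note above). */
    if (nb_numa_nodes == 0 || !node0_has_memdev) {
        return base_page_size;
    }

    return hpsize;
}
```

The failing command line from the commit message corresponds to two NUMA nodes without a memory backend plus one huge-page DIMM, which this sketch maps to the base page size.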

[Qemu-devel] [PULL 09/14] vfio/spapr: Remove stale ioctl() call

2016-07-17 Thread David Gibson

This ioctl() call to VFIO_IOMMU_SPAPR_TCE_REMOVE was left over from an
earlier version of the code and has since been folded into
vfio_spapr_remove_window().

It wasn't caught because although the argument structure has been removed,
the libc function remove() means this didn't trigger a compile failure.
The ioctl() was also almost certain to fail silently and harmlessly with
the bogus argument, so this wasn't caught in testing.

Suggested-by: Paolo Bonzini 
Signed-off-by: David Gibson 
Reviewed-by: Alexey Kardashevskiy 
---
 hw/vfio/spapr.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c
index 0af3423..7443d34 100644
--- a/hw/vfio/spapr.c
+++ b/hw/vfio/spapr.c
@@ -177,7 +177,6 @@ int vfio_spapr_create_window(VFIOContainer *container,
 error_report("Host doesn't support DMA window at %"HWADDR_PRIx", must 
be %"PRIx64,
  section->offset_within_address_space,
  (uint64_t)create.start_addr);
-ioctl(container->fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
 return -EINVAL;
 }
 trace_vfio_spapr_create_window(create.page_shift,
-- 
2.7.4

[Qemu-devel] [PULL 03/14] dbdma: add per-channel debugging enabled via DEBUG_DBDMA_CHANMASK

2016-07-17 Thread David Gibson

From: Mark Cave-Ayland 

By default large amounts of DBDMA debugging are produced when often it is just
1 or 2 channels that are of interest. Introduce DEBUG_DBDMA_CHANMASK to allow
the developer to select the channels of interest at compile time, and then
further add the extra channel information to each debug statement where
possible.

Also clearly mark the start/end of DBDMA_run_bh to allow tracking the bottom
half execution.

Signed-off-by: Mark Cave-Ayland 
Acked-by: Benjamin Herrenschmidt 
Signed-off-by: David Gibson 
---
 hw/misc/macio/mac_dbdma.c | 75 ++-
 1 file changed, 42 insertions(+), 33 deletions(-)

diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c
index b6639f4..e692312 100644
--- a/hw/misc/macio/mac_dbdma.c
+++ b/hw/misc/macio/mac_dbdma.c
@@ -46,6 +46,7 @@
 
 /* debug DBDMA */
 #define DEBUG_DBDMA 0
+#define DEBUG_DBDMA_CHANMASK ((1ull << DBDMA_CHANNELS) - 1)
 
 #define DBDMA_DPRINTF(fmt, ...) do { \
 if (DEBUG_DBDMA) { \
@@ -53,6 +54,14 @@
 } \
 } while (0);
 
+#define DBDMA_DPRINTFCH(ch, fmt, ...) do { \
+if (DEBUG_DBDMA) { \
+if ((1ul << (ch)->channel) & DEBUG_DBDMA_CHANMASK) { \
+printf("DBDMA[%02x]: " fmt , (ch)->channel, ## __VA_ARGS__); \
+} \
+} \
+} while (0);
+
 /*
  */
 
@@ -79,26 +88,26 @@ static void dump_dbdma_cmd(dbdma_cmd *cmd)
 #endif
 static void dbdma_cmdptr_load(DBDMA_channel *ch)
 {
-DBDMA_DPRINTF("dbdma_cmdptr_load 0x%08x\n",
-  ch->regs[DBDMA_CMDPTR_LO]);
+DBDMA_DPRINTFCH(ch, "dbdma_cmdptr_load 0x%08x\n",
+ch->regs[DBDMA_CMDPTR_LO]);
 dma_memory_read(&address_space_memory, ch->regs[DBDMA_CMDPTR_LO],
 &ch->current, sizeof(dbdma_cmd));
 }
 
 static void dbdma_cmdptr_save(DBDMA_channel *ch)
 {
-DBDMA_DPRINTF("dbdma_cmdptr_save 0x%08x\n",
-  ch->regs[DBDMA_CMDPTR_LO]);
-DBDMA_DPRINTF("xfer_status 0x%08x res_count 0x%04x\n",
-  le16_to_cpu(ch->current.xfer_status),
-  le16_to_cpu(ch->current.res_count));
+DBDMA_DPRINTFCH(ch, "dbdma_cmdptr_save 0x%08x\n",
+ch->regs[DBDMA_CMDPTR_LO]);
+DBDMA_DPRINTFCH(ch, "xfer_status 0x%08x res_count 0x%04x\n",
+le16_to_cpu(ch->current.xfer_status),
+le16_to_cpu(ch->current.res_count));
 dma_memory_write(&address_space_memory, ch->regs[DBDMA_CMDPTR_LO],
  &ch->current, sizeof(dbdma_cmd));
 }
 
 static void kill_channel(DBDMA_channel *ch)
 {
-DBDMA_DPRINTF("kill_channel\n");
+DBDMA_DPRINTFCH(ch, "kill_channel\n");
 
 ch->regs[DBDMA_STATUS] |= DEAD;
 ch->regs[DBDMA_STATUS] &= ~ACTIVE;
@@ -114,7 +123,7 @@ static void conditional_interrupt(DBDMA_channel *ch)
 uint32_t status;
 int cond;
 
-DBDMA_DPRINTF("%s\n", __func__);
+DBDMA_DPRINTFCH(ch, "%s\n", __func__);
 
 intr = le16_to_cpu(current->command) & INTR_MASK;
 
@@ -123,7 +132,7 @@ static void conditional_interrupt(DBDMA_channel *ch)
 return;
 case INTR_ALWAYS: /* always interrupt */
 qemu_irq_raise(ch->irq);
-DBDMA_DPRINTF("%s: raise\n", __func__);
+DBDMA_DPRINTFCH(ch, "%s: raise\n", __func__);
 return;
 }
 
@@ -138,13 +147,13 @@ static void conditional_interrupt(DBDMA_channel *ch)
 case INTR_IFSET:  /* intr if condition bit is 1 */
 if (cond) {
 qemu_irq_raise(ch->irq);
-DBDMA_DPRINTF("%s: raise\n", __func__);
+DBDMA_DPRINTFCH(ch, "%s: raise\n", __func__);
 }
 return;
 case INTR_IFCLR:  /* intr if condition bit is 0 */
 if (!cond) {
 qemu_irq_raise(ch->irq);
-DBDMA_DPRINTF("%s: raise\n", __func__);
+DBDMA_DPRINTFCH(ch, "%s: raise\n", __func__);
 }
 return;
 }
@@ -158,7 +167,7 @@ static int conditional_wait(DBDMA_channel *ch)
 uint32_t status;
 int cond;
 
-DBDMA_DPRINTF("conditional_wait\n");
+DBDMA_DPRINTFCH(ch, "conditional_wait\n");
 
 wait = le16_to_cpu(current->command) & WAIT_MASK;
 
@@ -217,7 +226,7 @@ static void conditional_branch(DBDMA_channel *ch)
 uint32_t status;
 int cond;
 
-DBDMA_DPRINTF("conditional_branch\n");
+DBDMA_DPRINTFCH(ch, "conditional_branch\n");
 
 /* check if we must branch */
 
@@ -262,7 +271,7 @@ static void dbdma_end(DBDMA_io *io)
 DBDMA_channel *ch = io->channel;
 dbdma_cmd *current = &ch->current;
 
-DBDMA_DPRINTF("%s\n", __func__);
+DBDMA_DPRINTFCH(ch, "%s\n", __func__);
 
 if (conditional_wait(ch))
 goto wait;
@@ -288,13 +297,13 @@ wait:
 static void start_output(DBDMA_channel *ch, int key, uint32_t addr,
 uint16_t req_count, int is_last)
 {
-DBDMA_DPRINTF("start_output\n");
+DBDMA_DPRINTFCH(ch, "start_output\n");
 
 /* KEY_REGS, KEY_DEVICE and KEY_STREAM
  * are not implemented in the mac-io chip
  */
 
-DBDMA_D
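
The channel-mask idea from this patch can be tried in isolation with a small sketch like the one below. All EXAMPLE_* names and struct example_channel are hypothetical stand-ins, the mask value is chosen arbitrarily, and the sketch uses 1ull for both shifts, which is a choice made here rather than what the patch does.

```c
#include <stdio.h>

#define EXAMPLE_DEBUG 1
/* Only channels 0 and 3 are of interest in this sketch. */
#define EXAMPLE_CHANMASK ((1ull << 0) | (1ull << 3))

/* Hypothetical stand-in for a DMA channel descriptor. */
struct example_channel {
    int channel;
};

/* Returns 1 if debug output is enabled for this channel, else 0. */
static int example_channel_selected(const struct example_channel *ch)
{
    return EXAMPLE_DEBUG &&
           ((1ull << ch->channel) & EXAMPLE_CHANMASK) != 0;
}

/* Prefix every message with the channel number, as the patch does. */
#define EXAMPLE_DPRINTFCH(ch, fmt, ...) do { \
    if (example_channel_selected(ch)) { \
        printf("EXAMPLE[%02x]: " fmt, (unsigned)(ch)->channel, \
               ## __VA_ARGS__); \
    } \
} while (0)

/* Emits one trace line and reports whether it was actually printed. */
static int example_trace(const struct example_channel *ch)
{
    EXAMPLE_DPRINTFCH(ch, "start_output\n");
    return example_channel_selected(ch);
}
```

Selecting channels at compile time keeps the disabled paths free of runtime cost once the compiler folds the constant mask test.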

[Qemu-devel] [PULL 01/14] spapr: fix core unplug crash

2016-07-17 Thread David Gibson

From: Greg Kurz 

If the host has 8 threads/core and the guest is started with:

-smp cores=1,threads=4,maxcpus=12

It is possible to crash QEMU by doing:

(qemu) device_add host-spapr-cpu-core,core-id=16,id=foo
(qemu) device_del foo
Segmentation fault

This happens because spapr_core_unplug() assumes cpu_dt_id == core_id.
As long as cpu_dt_id is derived from the non-table cpu_index, this is
only true when you plug cores with contiguous ids.

It is safer to be consistent: the DR connector was created with an
index that is immediately written to cc->core_id, and spapr_core_plug()
also relies on cc->core_id.

Let's use it also in spapr_core_unplug().

Signed-off-by: Greg Kurz 
Reviewed-by: Bharata B Rao 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_cpu_core.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 9347f07..bc52b3c 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -126,11 +126,9 @@ static void spapr_core_release(DeviceState *dev, void 
*opaque)
 void spapr_core_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
Error **errp)
 {
-sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev));
-PowerPCCPU *cpu = POWERPC_CPU(core->threads);
-int id = ppc_get_vcpu_dt_id(cpu);
+CPUCore *cc = CPU_CORE(dev);
 sPAPRDRConnector *drc =
-spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
+spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cc->core_id);
 sPAPRDRConnectorClass *drck;
 Error *local_err = NULL;
 
-- 
2.7.4

[Qemu-devel] [PULL 11/14] ppc: abort if compat property contains an unknown value

2016-07-17 Thread David Gibson

From: Greg Kurz 

It is not possible to set the compat property to an unknown value with
powerpc_set_compat(). Something must have gone terribly wrong in QEMU,
if we detect an "Internal error" in powerpc_get_compat(). Let's abort then.

This patch also drops the "max_compat ? *max_compat : -1" construct. It is
useless since max_compat is dereferenced a few lines above.

Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 target-ppc/translate_init.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 7cb7842..5ecafc7 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8446,8 +8446,8 @@ static void powerpc_get_compat(Object *obj, Visitor *v, 
const char *name,
 case 0:
 break;
 default:
-error_setg(errp, "Internal error: compat is set to %x",
-   max_compat ? *max_compat : -1);
+error_report("Internal error: compat is set to %x", *max_compat);
+abort();
 break;
 }
 
-- 
2.7.4

[Qemu-devel] [PULL 10/14] spapr: Ensure CPU cores are added contiguously and removed in LIFO order

2016-07-17 Thread David Gibson

From: Bharata B Rao 

If CPU core addition or removal is allowed in random order leading to
holes in the core id range (and hence in the cpu_index range), migration
can fail as migration with holes in cpu_index range isn't yet handled
correctly.

Prevent this situation by enforcing the addition in contiguous order
and removal in LIFO order so that we never end up with holes in
cpu_index range.

Signed-off-by: Bharata B Rao 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_cpu_core.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index bc52b3c..4bfc96b 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -126,12 +126,23 @@ static void spapr_core_release(DeviceState *dev, void 
*opaque)
 void spapr_core_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
Error **errp)
 {
+sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
 CPUCore *cc = CPU_CORE(dev);
 sPAPRDRConnector *drc =
 spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, cc->core_id);
 sPAPRDRConnectorClass *drck;
 Error *local_err = NULL;
+int smt = kvmppc_smt_threads();
+int index = cc->core_id / smt;
+int spapr_max_cores = max_cpus / smp_threads;
+int i;
 
+for (i = spapr_max_cores - 1; i > index; i--) {
+if (spapr->cores[i]) {
+error_setg(errp, "core-id %d should be removed first", i * smt);
+return;
+}
+}
 g_assert(drc);
 
 drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
@@ -214,7 +225,7 @@ void spapr_core_pre_plug(HotplugHandler *hotplug_dev, 
DeviceState *dev,
 sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(OBJECT(hotplug_dev));
 sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
 int spapr_max_cores = max_cpus / smp_threads;
-int index;
+int index, i;
 int smt = kvmppc_smt_threads();
 Error *local_err = NULL;
 CPUCore *cc = CPU_CORE(dev);
@@ -252,6 +263,14 @@ void spapr_core_pre_plug(HotplugHandler *hotplug_dev, 
DeviceState *dev,
 goto out;
 }
 
+for (i = 0; i < index; i++) {
+if (!spapr->cores[i]) {
+error_setg(&local_err, "core-id %d should be added first",
+   i * smt);
+goto out;
+}
+}
+
 out:
 g_free(base_core_type);
 error_propagate(errp, local_err);
-- 
2.7.4

[Qemu-devel] [PULL 13/14] target-ppc: fix left shift overflow in hpte_page_shift

2016-07-17 Thread David Gibson

From: Paolo Bonzini 

ps->pte_enc is a 32-bit value, which is shifted left and then compared
to a 64-bit value.  It needs a cast before the shift.

Reported by Coverity.

Signed-off-by: Paolo Bonzini 
Signed-off-by: David Gibson 
---
 target-ppc/mmu-hash64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
index f6ffe35..5de1358 100644
--- a/target-ppc/mmu-hash64.c
+++ b/target-ppc/mmu-hash64.c
@@ -478,7 +478,7 @@ static unsigned hpte_page_shift(const struct 
ppc_one_seg_page_size *sps,
 
 mask = ((1ULL << ps->page_shift) - 1) & HPTE64_R_RPN;
 
-if ((pte1 & mask) == (ps->pte_enc << HPTE64_R_RPN_SHIFT)) {
+if ((pte1 & mask) == ((uint64_t)ps->pte_enc << HPTE64_R_RPN_SHIFT)) {
 return ps->page_shift;
 }
 }
-- 
2.7.4

[Qemu-devel] [PULL 08/14] ppc: Fix support for odd MSR combinations

2016-07-17 Thread David Gibson

From: Benjamin Herrenschmidt 

MacOS uses an architecturally illegal MSR combination that
seems nonetheless supported by 32-bit processors, which is
to have MSR[PR]=1 and one or more of MSR[DR/IR/EE]=0.

This adds support for it. To work properly we need to also
properly include support for PR=1,{I,D}R=0 to the MMU index
used by the qemu TLB.

Signed-off-by: Benjamin Herrenschmidt 
Tested-by: Mark Cave-Ayland 
Signed-off-by: David Gibson 
---
 target-ppc/helper_regs.h | 46 ++
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/target-ppc/helper_regs.h b/target-ppc/helper_regs.h
index 8d38828..3d279f1 100644
--- a/target-ppc/helper_regs.h
+++ b/target-ppc/helper_regs.h
@@ -41,17 +41,19 @@ static inline void hreg_swap_gpr_tgpr(CPUPPCState *env)
 
 static inline void hreg_compute_mem_idx(CPUPPCState *env)
 {
-/* This is our encoding for server processors
+/* This is our encoding for server processors. The architecture
+ * specifies that there is no such thing as userspace with
+ * translation off, however it appears that MacOS does it and
+ * some 32-bit CPUs support it. Weird...
  *
  *   0 = Guest User space virtual mode
  *   1 = Guest Kernel space virtual mode
- *   2 = Guest Kernel space real mode
- *   3 = HV User space virtual mode
- *   4 = HV Kernel space virtual mode
- *   5 = HV Kernel space real mode
- *
- * The combination PR=1 IR&DR=0 is invalid, we will treat
- * it as IR=DR=1
+ *   2 = Guest User space real mode
+ *   3 = Guest Kernel space real mode
+ *   4 = HV User space virtual mode
+ *   5 = HV Kernel space virtual mode
+ *   6 = HV User space real mode
+ *   7 = HV Kernel space real mode
  *
  * For BookE, we need 8 MMU modes as follow:
  *
@@ -71,20 +73,11 @@ static inline void hreg_compute_mem_idx(CPUPPCState *env)
 env->immu_idx += msr_gs ? 4 : 0;
 env->dmmu_idx += msr_gs ? 4 : 0;
 } else {
-/* First calucalte a base value independent of HV */
-if (msr_pr != 0) {
-/* User space, ignore IR and DR */
-env->immu_idx = env->dmmu_idx = 0;
-} else {
-/* Kernel, setup a base I/D value */
-env->immu_idx = msr_ir ? 1 : 2;
-env->dmmu_idx = msr_dr ? 1 : 2;
-}
-/* Then offset it for HV */
-if (msr_hv) {
-env->immu_idx += 3;
-env->dmmu_idx += 3;
-}
+env->immu_idx = env->dmmu_idx = msr_pr ? 0 : 1;
+env->immu_idx += msr_ir ? 0 : 2;
+env->dmmu_idx += msr_dr ? 0 : 2;
+env->immu_idx += msr_hv ? 4 : 0;
+env->dmmu_idx += msr_hv ? 4 : 0;
 }
 }
 
@@ -136,8 +129,13 @@ static inline int hreg_store_msr(CPUPPCState *env, 
target_ulong value,
 /* Change the exception prefix on PowerPC 601 */
 env->excp_prefix = ((value >> MSR_EP) & 1) * 0xFFF0;
 }
-/* If PR=1 then EE, IR and DR must be 1 */
-if ((value >> MSR_PR) & 1) {
+/* If PR=1 then EE, IR and DR must be 1
+ *
+ * Note: We only enforce this on 64-bit processors. It appears that
+ * 32-bit implementations supports PR=1 and EE/DR/IR=0 and MacOS
+ * exploits it.
+ */
+if ((env->insns_flags & PPC_64B) && ((value >> MSR_PR) & 1)) {
 value |= (1 << MSR_EE) | (1 << MSR_DR) | (1 << MSR_IR);
 }
 #endif
-- 
2.7.4

[Qemu-devel] [PULL 04/14] dbdma: fix endian of DBDMA_CMDPTR_LO during branch

2016-07-17 Thread David Gibson

From: Mark Cave-Ayland 

The current DBDMA command is stored in little-endian format, so make sure
we convert it to match our CPU when updating the DBDMA_CMDPTR_LO register.

Signed-off-by: Mark Cave-Ayland 
Acked-by: Benjamin Herrenschmidt 
Signed-off-by: David Gibson 
---
 hw/misc/macio/mac_dbdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c
index e692312..c4ee381 100644
--- a/hw/misc/macio/mac_dbdma.c
+++ b/hw/misc/macio/mac_dbdma.c
@@ -213,7 +213,7 @@ static void branch(DBDMA_channel *ch)
 {
 dbdma_cmd *current = &ch->current;
 
-ch->regs[DBDMA_CMDPTR_LO] = current->cmd_dep;
+ch->regs[DBDMA_CMDPTR_LO] = le32_to_cpu(current->cmd_dep);
 ch->regs[DBDMA_STATUS] |= BT;
 dbdma_cmdptr_load(ch);
 }
-- 
2.7.4

[Qemu-devel] [PULL 05/14] dbdma: fix load_word/store_word value endianness

2016-07-17 Thread David Gibson

From: Mark Cave-Ayland 

The values to read/write to/from physical memory are copied directly to the
physical address with no endian swapping required.

Also add some extra information to debugging output while we are here.

Signed-off-by: Mark Cave-Ayland 
Acked-by: Benjamin Herrenschmidt 
Signed-off-by: David Gibson 
---
 hw/misc/macio/mac_dbdma.c | 24 +---
 1 file changed, 5 insertions(+), 19 deletions(-)

diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c
index c4ee381..c5dd0ac 100644
--- a/hw/misc/macio/mac_dbdma.c
+++ b/hw/misc/macio/mac_dbdma.c
@@ -350,9 +350,8 @@ static void load_word(DBDMA_channel *ch, int key, uint32_t 
addr,
  uint16_t len)
 {
 dbdma_cmd *current = &ch->current;
-uint32_t val;
 
-DBDMA_DPRINTFCH(ch, "load_word\n");
+DBDMA_DPRINTFCH(ch, "load_word %d bytes, addr=%08x\n", len, addr);
 
 /* only implements KEY_SYSTEM */
 
@@ -362,14 +361,7 @@ static void load_word(DBDMA_channel *ch, int key, uint32_t 
addr,
 return;
 }
 
-dma_memory_read(&address_space_memory, addr, &val, len);
-
-if (len == 2)
-val = (val << 16) | (current->cmd_dep & 0x);
-else if (len == 1)
-val = (val << 24) | (current->cmd_dep & 0x00ff);
-
-current->cmd_dep = val;
+dma_memory_read(&address_space_memory, addr, &current->cmd_dep, len);
 
 if (conditional_wait(ch))
 goto wait;
@@ -389,9 +381,9 @@ static void store_word(DBDMA_channel *ch, int key, uint32_t 
addr,
   uint16_t len)
 {
 dbdma_cmd *current = &ch->current;
-uint32_t val;
 
-DBDMA_DPRINTFCH(ch, "store_word\n");
+DBDMA_DPRINTFCH(ch, "store_word %d bytes, addr=%08x pa=%x\n",
+len, addr, le32_to_cpu(current->cmd_dep));
 
 /* only implements KEY_SYSTEM */
 
@@ -401,13 +393,7 @@ static void store_word(DBDMA_channel *ch, int key, 
uint32_t addr,
 return;
 }
 
-val = current->cmd_dep;
-if (len == 2)
-val >>= 16;
-else if (len == 1)
-val >>= 24;
-
-dma_memory_write(&address_space_memory, addr, &val, len);
+dma_memory_write(&address_space_memory, addr, &current->cmd_dep, len);
 
 if (conditional_wait(ch))
 goto wait;
-- 
2.7.4

Re: [Qemu-devel] [RFC 5/6] target-ppc: add modulo word operations

2016-07-17 Thread Nikunj A Dadhania

David Gibson  writes:

> On Tue, Jul 12, 2016 at 11:33:21PM +0530, Nikunj A Dadhania wrote:
>> Adding following instructions:
>> 
>> moduw: Modulo Unsigned Word
>> modsw: Modulo Signed Word
>> 
>> Signed-off-by: Nikunj A Dadhania 
>
> Hrm.. any reason you're not using the TCG inbuilt remainder ops
> (tcg_gen_rem_i32() etc.)?

I have an updated version that uses the inbuilt ops. I was searching
for modulo operations, didn't find any, and so wrote my own. I found
later that they are called tcg_gen_rem. Will send in the next version.

Regards
Nikunj
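
As a reference for the results the two instructions are expected to produce, here is a plain C sketch of 32-bit unsigned and signed remainder. The function names are hypothetical. Returning 0 for a zero divisor and for INT32_MIN % -1 is only a choice made in this sketch to keep the C well defined; the thread does not say what the instructions yield in those cases.

```c
#include <stdint.h>

/* Unsigned 32-bit remainder, guarded against a zero divisor. */
static uint32_t example_moduw(uint32_t a, uint32_t b)
{
    return b == 0 ? 0 : a % b;
}

/* Signed 32-bit remainder; the result takes the sign of the dividend.
 * The two cases that are undefined behaviour in C are guarded. */
static int32_t example_modsw(int32_t a, int32_t b)
{
    if (b == 0 || (a == INT32_MIN && b == -1)) {
        return 0;
    }
    return a % b;
}
```

A helper of this shape could serve as an oracle when testing a translation built on the inbuilt remainder ops mentioned above.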

Re: [Qemu-devel] [PATCH 1/3] virtio: Basic implementation of virtio pstore driver

2016-07-17 Thread Kees Cook

On Sun, Jul 17, 2016 at 9:37 PM, Namhyung Kim  wrote:
> The virtio pstore driver provides interface to the pstore subsystem so
> that the guest kernel's log/dump message can be saved on the host
> machine.  Users can access the log file directly on the host, or on the
> guest at the next boot using pstore filesystem.  It currently deals with
> kernel log (printk) buffer only, but we can extend it to have other
> information (like ftrace dump) later.
>
> It supports legacy PCI device using single order-2 page buffer.  As all
> operation of pstore is synchronous, it would be fine IMHO.  However I
> don't know how to make write operation synchronous since it's called
> with a spinlock held (from any context including NMI).
>
> Cc: Paolo Bonzini 
> Cc: Radim Krčmář 
> Cc: "Michael S. Tsirkin" 
> Cc: Anthony Liguori 
> Cc: Anton Vorontsov 
> Cc: Colin Cross 
> Cc: Kees Cook 
> Cc: Tony Luck 
> Cc: Steven Rostedt 
> Cc: Ingo Molnar 
> Cc: Minchan Kim 
> Cc: k...@vger.kernel.org
> Cc: qemu-devel@nongnu.org
> Cc: virtualizat...@lists.linux-foundation.org
> Signed-off-by: Namhyung Kim 

This looks great to me! I'd love to use this in qemu. (Right now I go
through hoops to use the ramoops backend for testing.)

Reviewed-by: Kees Cook 

Notes below...

> ---
>  drivers/virtio/Kconfig |  10 ++
>  drivers/virtio/Makefile|   1 +
>  drivers/virtio/virtio_pstore.c | 317 
> +
>  include/uapi/linux/Kbuild  |   1 +
>  include/uapi/linux/virtio_ids.h|   1 +
>  include/uapi/linux/virtio_pstore.h |  53 +++
>  6 files changed, 383 insertions(+)
>  create mode 100644 drivers/virtio/virtio_pstore.c
>  create mode 100644 include/uapi/linux/virtio_pstore.h
>
> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> index 77590320d44c..8f0e6c796c12 100644
> --- a/drivers/virtio/Kconfig
> +++ b/drivers/virtio/Kconfig
> @@ -58,6 +58,16 @@ config VIRTIO_INPUT
>
>  If unsure, say M.
>
> +config VIRTIO_PSTORE
> +   tristate "Virtio pstore driver"
> +   depends on VIRTIO
> +   depends on PSTORE
> +   ---help---
> +This driver supports virtio pstore devices to save/restore
> +panic and oops messages on the host.
> +
> +If unsure, say M.
> +
>   config VIRTIO_MMIO
> tristate "Platform bus driver for memory mapped virtio devices"
> depends on HAS_IOMEM && HAS_DMA
> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> index 41e30e3dc842..bee68cb26d48 100644
> --- a/drivers/virtio/Makefile
> +++ b/drivers/virtio/Makefile
> @@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
>  virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
>  obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
>  obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
> +obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o
> diff --git a/drivers/virtio/virtio_pstore.c b/drivers/virtio/virtio_pstore.c
> new file mode 100644
> index ..6fe62c0f1508
> --- /dev/null
> +++ b/drivers/virtio/virtio_pstore.c
> @@ -0,0 +1,317 @@
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define VIRT_PSTORE_ORDER 2
> +#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)
> +
> +struct virtio_pstore {
> +   struct virtio_device*vdev;
> +   struct virtqueue*vq;
> +   struct pstore_info   pstore;
> +   struct virtio_pstore_hdr hdr;
> +   size_t   buflen;
> +   u64  id;
> +
> +   /* Waiting for host to ack */
> +   wait_queue_head_t   acked;
> +};
> +
> +static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id 
> type)
> +{
> +   u16 ret;
> +
> +   switch (type) {
> +   case PSTORE_TYPE_DMESG:
> +   ret = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_DMESG);
> +   break;
> +   default:
> +   ret = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN);
> +   break;
> +   }

I would love to see this support PSTORE_TYPE_CONSOLE too. It should be
relatively easy to add: I think it'd just be another virtio command?

> +
> +   return ret;
> +}
> +
> +static enum pstore_type_id from_virtio_type(struct virtio_pstore *vps, u16 
> type)
> +{
> +   enum pstore_type_id ret;
> +
> +   switch (virtio16_to_cpu(vps->vdev, type)) {
> +   case VIRTIO_PSTORE_TYPE_DMESG:
> +   ret = PSTORE_TYPE_DMESG;
> +   break;
> +   default:
> +   ret = PSTORE_TYPE_UNKNOWN;
> +   break;
> +   }
> +
> +   return ret;
> +}
> +
> +static void virtpstore_ack(struct virtqueue *vq)
> +{
> +   struct virtio_pstore *vps = vq->vdev->priv;
> +
> +   wake_up(&vps->acked);
> +}
> +
> +static int virt_pstore_open(struct pstore_info *psi)
> +{
> +   struct virtio_pstore *vps = psi->data;
> +   stru

Re: [Qemu-devel] [RFC 1/6] target-ppc: Introduce Power9 family

2016-07-17 Thread Nikunj A Dadhania

David Gibson  writes:

> On Tue, Jul 12, 2016 at 11:33:17PM +0530, Nikunj A Dadhania wrote:
>> From: "Aneesh Kumar K.V" 
>> 
>> Signed-off-by: Aneesh Kumar K.V 
>> [ rebased and added POWER9 alias ]
>> Signed-off-by: Nikunj A Dadhania 
>> ---
>>  target-ppc/cpu-models.c |  5 +++
>>  target-ppc/cpu-models.h |  2 ++
>>  target-ppc/cpu-qom.h|  7 
>>  target-ppc/mmu_helper.c |  3 +-
>>  target-ppc/translate_init.c | 85 
>> -
>>  5 files changed, 100 insertions(+), 2 deletions(-)
>> 
>> diff --git a/target-ppc/cpu-models.c b/target-ppc/cpu-models.c
>> index 5209e63..901cf40 100644
>> --- a/target-ppc/cpu-models.c
>> +++ b/target-ppc/cpu-models.c
>> @@ -1147,6 +1147,10 @@
>>  "POWER8NVL v1.0")
>>  POWERPC_DEF("970_v2.2",  CPU_POWERPC_970_v22,970,
>>  "PowerPC 970 v2.2")
>> +
>> +POWERPC_DEF("POWER9_v1.0",   CPU_POWERPC_POWER9_BASE,POWER9,
>> +"POWER9 v1.0")
>> +
>>  POWERPC_DEF("970fx_v1.0",CPU_POWERPC_970FX_v10,  970,
>>  "PowerPC 970FX v1.0 (G5)")
>>  POWERPC_DEF("970fx_v2.0",CPU_POWERPC_970FX_v20,  970,
>> @@ -1395,6 +1399,7 @@ PowerPCCPUAlias ppc_cpu_aliases[] = {
>>  { "POWER8E", "POWER8E_v2.1" },
>>  { "POWER8", "POWER8_v2.0" },
>>  { "POWER8NVL", "POWER8NVL_v1.0" },
>> +{ "POWER9", "POWER9_v1.0" },
>>  { "970", "970_v2.2" },
>>  { "970fx", "970fx_v3.1" },
>>  { "970mp", "970mp_v1.1" },
>> diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h
>> index f21a44c..beeaaba 100644
>> --- a/target-ppc/cpu-models.h
>> +++ b/target-ppc/cpu-models.h
>> @@ -562,6 +562,8 @@ enum {
>>  CPU_POWERPC_POWER8_v20 = 0x004D0200,
>>  CPU_POWERPC_POWER8NVL_BASE = 0x004C,
>>  CPU_POWERPC_POWER8NVL_v10  = 0x004C0100,
>> +CPU_POWERPC_POWER9_BASE= 0x004E,
>> +CPU_POWERPC_POWER9_MAM = 0x004E0100,
>>  CPU_POWERPC_970_v22= 0x00390202,
>>  CPU_POWERPC_970FX_v10  = 0x00391100,
>>  CPU_POWERPC_970FX_v20  = 0x003C0200,
>> diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h
>> index 2864105..df2fb65 100644
>> --- a/target-ppc/cpu-qom.h
>> +++ b/target-ppc/cpu-qom.h
>> @@ -86,6 +86,13 @@ enum powerpc_mmu_t {
>>  POWERPC_MMU_2_07   = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
>>   | POWERPC_MMU_64K
>>   | POWERPC_MMU_AMR | 0x0004,
>> +/* for now , We can add radix later if needed */
>
> I'm guessing this means you're only thinking about the guest-side
> presentation of the P9 MMU at this point?  IIUC the host side
> presentation is so different that sharing any constants with pre-P9
> MMUs probably doesn't make sense.
>
> I'm not immediately sure how we should make this distinction in the
> target-ppc code, since these values are supposed to belong to the CPU
> regardless of operating mode.

Currently, this is just a placeholder patch, not close to committing
yet. I needed these family defines in order to add the new instructions.

Regards,
Nikunj

Re: [Qemu-devel] [PATCH] virtio-blk: dataplane cleanup

2016-07-17 Thread Fam Zheng

On Mon, 07/18 12:05, Cao jin wrote:
> No need to duplicate the check; there is already one at function entry.
> 
> Cc: Stefan Hajnoczi 
> Cc: Kevin Wolf 
> Cc: Max Reitz 
> Signed-off-by: Cao jin 
> ---
>  hw/block/dataplane/virtio-blk.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
> index 54b9ac1..704a763 100644
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -112,10 +112,8 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, 
> VirtIOBlkConf *conf,
>  s->vdev = vdev;
>  s->conf = conf;
>  
> -if (conf->iothread) {
> -s->iothread = conf->iothread;
> -object_ref(OBJECT(s->iothread));
> -}
> +s->iothread = conf->iothread;
> +object_ref(OBJECT(s->iothread));
>  s->ctx = iothread_get_aio_context(s->iothread);
>  s->bh = aio_bh_new(s->ctx, notify_guest_bh, s);
>  s->batch_notify_vqs = bitmap_new(conf->num_queues);
> -- 
> 2.1.0
> 
> 
> 
> 

Reviewed-by: Fam Zheng

Re: [Qemu-devel] [RFC PATCH V2] qemu-char: Fix context for g_source_attach()

2016-07-17 Thread Jason Wang




On 2016-07-18 09:55, Zhang Chen wrote:
> Hi all,
>
> Can you give me some feedback on this patch?
> We need more comments.
>
> The COLO project depends on this patch to work. With this patch,
> colo-compare can make the handler registered through
> qemu_chr_add_handlers() run in the compare thread, which reduces the
> workload of main_loop when the network is busy.
>
> This idea is from Jason.
>
> Thanks
> Zhang Chen

I think you can put this patch in the COLO compare thread series,
which shows how it is used. Then you can ask for an Acked-by or
Reviewed-by from other maintainers.

Thanks

Re: [Qemu-devel] [RFC PATCH V2] qemu-char: Fix context for g_source_attach()

2016-07-17 Thread Zhang Chen




On 07/18/2016 01:31 PM, Jason Wang wrote:
> On 2016-07-18 09:55, Zhang Chen wrote:
> > Hi all,
> >
> > Can you give me some feedback on this patch?
> > We need more comments.
> >
> > The COLO project depends on this patch to work. With this patch,
> > colo-compare can make the handler registered through
> > qemu_chr_add_handlers() run in the compare thread, which reduces the
> > workload of main_loop when the network is busy.
> >
> > This idea is from Jason.
> >
> > Thanks
> > Zhang Chen
>
> I think you can put this patch in the COLO compare thread series,
> which shows how it is used. Then you can ask for an Acked-by or
> Reviewed-by from other maintainers.
>
> Thanks

Makes sense. I will add this patch to the next colo-compare series.

Thanks
Zhang Chen






--
Thanks
zhangchen

Re: [Qemu-devel] [PATCH] e1000e: fix building without CONFIG_VMXNET3_PCI

2016-07-17 Thread Jason Wang




On 2016-07-13 10:42, Jason Wang wrote:

e1000e needs net_tx_pkt.o and net_rx_pkt.o too.

Cc: Dmitry Fleytman 
Cc: Leonid Bloch 
Signed-off-by: Jason Wang 
---
  hw/net/Makefile.objs | 1 +
  1 file changed, 1 insertion(+)

diff --git a/hw/net/Makefile.objs b/hw/net/Makefile.objs
index fe61e9f..610ed3e 100644
--- a/hw/net/Makefile.objs
+++ b/hw/net/Makefile.objs
@@ -7,6 +7,7 @@ common-obj-$(CONFIG_EEPRO100_PCI) += eepro100.o
  common-obj-$(CONFIG_PCNET_PCI) += pcnet-pci.o
  common-obj-$(CONFIG_PCNET_COMMON) += pcnet.o
  common-obj-$(CONFIG_E1000_PCI) += e1000.o e1000x_common.o
+common-obj-$(CONFIG_E1000E_PCI) += net_tx_pkt.o net_rx_pkt.o
  common-obj-$(CONFIG_E1000E_PCI) += e1000e.o e1000e_core.o e1000x_common.o
  common-obj-$(CONFIG_RTL8139_PCI) += rtl8139.o
  common-obj-$(CONFIG_VMXNET3_PCI) += net_tx_pkt.o net_rx_pkt.o


Applied, thanks.

Re: [Qemu-devel] [PATCH] net: fix incorrect argument to iov_to_buf

2016-07-17 Thread Jason Wang




On 2016-07-15 16:41, Paolo Bonzini wrote:

Coverity reports a "suspicious sizeof" which is indeed wrong.

Signed-off-by: Paolo Bonzini 
---
  net/eth.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/eth.c b/net/eth.c
index 95fe15c..0be59c2 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -418,7 +418,7 @@ _eth_get_rss_ex_dst_addr(const struct iovec *pkt, int pkt_frags,
  
  bytes_read = iov_to_buf(pkt, pkt_frags,

  rthdr_offset + sizeof(*ext_hdr),
-dst_addr, sizeof(dst_addr));
+dst_addr, sizeof(*dst_addr));
  
  return bytes_read == sizeof(dst_addr);

  }
@@ -467,7 +467,7 @@ _eth_get_rss_ex_src_addr(const struct iovec *pkt, int pkt_frags,
  
  bytes_read = iov_to_buf(pkt, pkt_frags,

  opt_offset + sizeof(opthdr),
-src_addr, sizeof(src_addr));
+src_addr, sizeof(*src_addr));
  
  return bytes_read == sizeof(src_addr);

  }


Applied to -net. Thanks
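
The kind of mistake Coverity flags as a "suspicious sizeof" can be
reproduced outside QEMU. The sketch below is illustrative only: struct
addr16, copy_bytes() and the two copy_addr_*() functions are hypothetical
stand-ins, not QEMU code, and it assumes the destination is a pointer to a
16-byte address structure, as the patch suggests. With a pointer parameter,
sizeof(dst) yields the size of the pointer (typically 4 or 8 bytes), while
sizeof(*dst) yields the size of the object it points to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a 16-byte IPv6 address; not a QEMU type. */
struct addr16 {
    unsigned char bytes[16];
};

/* Plain byte copy, used instead of memcpy() to keep the sketch explicit.
 * Returns the number of bytes copied. */
static size_t copy_bytes(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < len; i++) {
        d[i] = s[i];
    }
    return len;
}

/* Buggy: sizeof(dst) is the size of the pointer, not of the address. */
size_t copy_addr_buggy(struct addr16 *dst, const unsigned char *src)
{
    return copy_bytes(dst, src, sizeof(dst));
}

/* Fixed: sizeof(*dst) is the size of the pointed-to structure (16 bytes). */
size_t copy_addr_fixed(struct addr16 *dst, const unsigned char *src)
{
    return copy_bytes(dst, src, sizeof(*dst));
}
```

The same reasoning applies to any comparison made after the copy: the
context lines in the hunks above still test bytes_read against
sizeof(dst_addr) and sizeof(src_addr), so those checks may deserve the same
treatment.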

Re: [Qemu-devel] [PATCH] net: fix incorrect access to pointer

2016-07-17 Thread Jason Wang




On 2016-07-15 16:43, Paolo Bonzini wrote:

This is not dereferencing the pointer, and instead checking only
the value of the pointer.

Signed-off-by: Paolo Bonzini 
---
  net/eth.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/eth.c b/net/eth.c
index 0be59c2..df81efb 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -211,7 +211,7 @@ void eth_get_protocols(const struct iovec *iov, int iovcnt,
   *l4hdr_off, sizeof(l4hdr_info->hdr.tcp),
   &l4hdr_info->hdr.tcp);
  
-if (istcp) {

+if (*istcp) {
  *l5hdr_off = *l4hdr_off +
  TCP_HEADER_DATA_OFFSET(&l4hdr_info->hdr.tcp);
  


Applied to -net. Thanks
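
As a minimal sketch of why the missing dereference matters (the function
names and parameters here are hypothetical and not taken from net/eth.c): a
test on the pointer itself only checks that the pointer is non-NULL, so the
TCP branch is taken even when the flag it points to is false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Buggy: tests whether the pointer is non-NULL, which is always true here,
 * so the TCP header length is added for every packet. */
size_t l5_offset_buggy(const bool *is_tcp, size_t l4_off, size_t tcp_hdr_len)
{
    assert(is_tcp != NULL);
    if (is_tcp) {
        return l4_off + tcp_hdr_len;
    }
    return l4_off;
}

/* Fixed: tests the flag the pointer refers to. */
size_t l5_offset_fixed(const bool *is_tcp, size_t l4_off, size_t tcp_hdr_len)
{
    assert(is_tcp != NULL);
    if (*is_tcp) {
        return l4_off + tcp_hdr_len;
    }
    return l4_off;
}
```

With a flag set to false, the buggy variant still returns the offset past a
TCP header, while the fixed variant returns the L4 offset unchanged.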

Re: [Qemu-devel] [PATCH] e1000e: fix incorrect access to pointer

2016-07-17 Thread Jason Wang




On 2016-07-15 16:44, Paolo Bonzini wrote:

This is not dereferencing the pointer, and instead checking only
the value of the pointer.

Signed-off-by: Paolo Bonzini 
---
  hw/net/e1000e_core.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 6050d8b..badb1fe 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -281,7 +281,7 @@ e1000e_intrmgr_delay_rx_causes(E1000ECore *core, uint32_t *causes)
  
  /* Check if delayed RX interrupts disabled by client

 or if there are causes that cannot be delayed */
-if ((rdtr == 0) || (causes != 0)) {
+if ((rdtr == 0) || (*causes != 0)) {
  return false;
  }
  
@@ -322,7 +322,7 @@ e1000e_intrmgr_delay_tx_causes(E1000ECore *core, uint32_t *causes)

  *causes &= ~delayable_causes;
  
  /* If there are causes that cannot be delayed */

-if (causes != 0) {
+if (*causes != 0) {
  return false;
  }
  


Applied to -net. Thanks

Re: [Qemu-devel] [PATCH] tap: fix memory leak on failure to create a multiqueue tap device

2016-07-17 Thread Jason Wang




On 2016-07-15 16:56, Paolo Bonzini wrote:

Reported by Coverity.

Signed-off-by: Paolo Bonzini 
---
  net/tap.c | 22 ++++++++++++++++------
  1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/net/tap.c b/net/tap.c
index e9c32f3..6a2cedc 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -787,8 +787,8 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
  return -1;
  }
  } else if (tap->has_fds) {
-char **fds = g_new(char *, MAX_TAP_QUEUES);
-char **vhost_fds = g_new(char *, MAX_TAP_QUEUES);
+char **fds = g_new0(char *, MAX_TAP_QUEUES);
+char **vhost_fds = g_new0(char *, MAX_TAP_QUEUES);
  int nfds, nvhosts;
  
  if (tap->has_ifname || tap->has_script || tap->has_downscript ||

@@ -806,7 +806,7 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
  if (nfds != nvhosts) {
  error_setg(errp, "The number of fds passed does not match "
 "the number of vhostfds passed");
-return -1;
+goto free_fail;
  }
  }
  
@@ -814,7 +814,7 @@ int net_init_tap(const NetClientOptions *opts, const char *name,

  fd = monitor_fd_param(cur_mon, fds[i], &err);
  if (fd == -1) {
  error_propagate(errp, err);
-return -1;
+goto free_fail;
  }
  
  fcntl(fd, F_SETFL, O_NONBLOCK);

@@ -824,7 +824,7 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
  } else if (vnet_hdr != tap_probe_vnet_hdr(fd)) {
  error_setg(errp,
 "vnet_hdr not consistent across given tap fds");
-return -1;
+goto free_fail;
  }
  
  net_init_tap_one(tap, peer, "tap", name, ifname,

@@ -833,11 +833,21 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
   vnet_hdr, fd, &err);
  if (err) {
  error_propagate(errp, err);
-return -1;
+goto free_fail;
  }
  }
  g_free(fds);
  g_free(vhost_fds);
+return 0;
+
+free_fail:
+for (i = 0; i < nfds; i++) {
+g_free(fds[i]);
+g_free(vhost_fds[i]);
+}
+g_free(fds);
+g_free(vhost_fds);
+return -1;
  } else if (tap->has_helper) {
  if (tap->has_ifname || tap->has_script || tap->has_downscript ||
  tap->has_vnet_hdr || tap->has_queues || tap->has_vhostfds) {


Applied to -net. Thanks
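
The shape of the fix above (zero-initialised pointer arrays plus a single
cleanup label reached through goto) can be sketched in plain C. This is a
hedged illustration, not the net/tap.c code: process_names(), the tracked_*
helpers, MAX_ENTRIES and the fail_at parameter are all hypothetical, and
calloc()/free() stand in for g_new0()/g_free(). Zero-initialising the array
is what makes the cleanup loop safe, because entries that were never filled
are NULL and freeing NULL is a no-op.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ENTRIES 8

/* Outstanding allocations, tracked only so the sketch can be checked. */
static int live_allocs;

static void *tracked_calloc(size_t n, size_t size)
{
    void *p = calloc(n, size);
    if (p) {
        live_allocs++;
    }
    return p;
}

static void tracked_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Copies `count` names into a temporary table. A fail_at value >= 0
 * simulates an error at that index. Returns 0 on success, -1 on failure;
 * every allocation is released on both paths. */
int process_names(const char **names, int count, int fail_at)
{
    char **copies;
    int i;

    assert(count >= 0 && count <= MAX_ENTRIES);

    /* Zero-initialised, so unused slots are NULL. */
    copies = tracked_calloc(MAX_ENTRIES, sizeof(*copies));
    if (!copies) {
        return -1;
    }

    for (i = 0; i < count; i++) {
        if (i == fail_at) {
            goto free_fail;      /* simulated mid-loop error */
        }
        copies[i] = tracked_calloc(strlen(names[i]) + 1, 1);
        if (!copies[i]) {
            goto free_fail;
        }
        strcpy(copies[i], names[i]);
    }

    /* Success path: release the entries, then the table. */
    for (i = 0; i < count; i++) {
        tracked_free(copies[i]);
    }
    tracked_free(copies);
    return 0;

free_fail:
    /* Failure path: NULL slots are skipped by tracked_free(). */
    for (i = 0; i < MAX_ENTRIES; i++) {
        tracked_free(copies[i]);
    }
    tracked_free(copies);
    return -1;
}
```

Routing every error exit through one label keeps the cleanup in a single
place, which is the property the early "return -1" statements lacked.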

Re: [Qemu-devel] [patch qemu] MAINTAINERS: release Scott from being a rocker maintainer

2016-07-17 Thread Jason Wang




On 2016-07-11 15:49, Jiri Pirko wrote:

From: Jiri Pirko 

As requested by Scott, removing him.

Signed-off-by: Scott Feldman 
Signed-off-by: Jiri Pirko 
---
  MAINTAINERS | 1 -
  1 file changed, 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 1d0e2c3..5928f22 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -971,7 +971,6 @@ F: hw/net/vmxnet*
  F: hw/scsi/vmw_pvscsi*
  
  Rocker

-M: Scott Feldman 
  M: Jiri Pirko 
  S: Maintained
  F: hw/net/rocker/


Applied to -net. Thanks

[Qemu-devel] [PATCH 1/3] virtio: Basic implementation of virtio pstore driver

2016-07-17 Thread Namhyung Kim

The virtio pstore driver provides an interface to the pstore subsystem so
that the guest kernel's log/dump messages can be saved on the host
machine.  Users can access the log file directly on the host, or on the
guest at the next boot using the pstore filesystem.  It currently deals
with the kernel log (printk) buffer only, but we can extend it to carry
other information (like an ftrace dump) later.

It supports a legacy PCI device using a single order-2 page buffer.  As all
pstore operations are synchronous, that should be fine IMHO.  However, I
don't know how to make the write operation synchronous, since it is called
with a spinlock held (from any context, including NMI).

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: "Michael S. Tsirkin" 
Cc: Anthony Liguori 
Cc: Anton Vorontsov 
Cc: Colin Cross 
Cc: Kees Cook 
Cc: Tony Luck 
Cc: Steven Rostedt 
Cc: Ingo Molnar 
Cc: Minchan Kim 
Cc: k...@vger.kernel.org
Cc: qemu-devel@nongnu.org
Cc: virtualizat...@lists.linux-foundation.org
Signed-off-by: Namhyung Kim 
---
 drivers/virtio/Kconfig |  10 ++
 drivers/virtio/Makefile|   1 +
 drivers/virtio/virtio_pstore.c | 317 +
 include/uapi/linux/Kbuild  |   1 +
 include/uapi/linux/virtio_ids.h|   1 +
 include/uapi/linux/virtio_pstore.h |  53 +++
 6 files changed, 383 insertions(+)
 create mode 100644 drivers/virtio/virtio_pstore.c
 create mode 100644 include/uapi/linux/virtio_pstore.h

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 77590320d44c..8f0e6c796c12 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -58,6 +58,16 @@ config VIRTIO_INPUT
 
 If unsure, say M.
 
+config VIRTIO_PSTORE
+   tristate "Virtio pstore driver"
+   depends on VIRTIO
+   depends on PSTORE
+   ---help---
+This driver supports virtio pstore devices to save/restore
+panic and oops messages on the host.
+
+If unsure, say M.
+
  config VIRTIO_MMIO
tristate "Platform bus driver for memory mapped virtio devices"
depends on HAS_IOMEM && HAS_DMA
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 41e30e3dc842..bee68cb26d48 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -5,3 +5,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
 virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
+obj-$(CONFIG_VIRTIO_PSTORE) += virtio_pstore.o
diff --git a/drivers/virtio/virtio_pstore.c b/drivers/virtio/virtio_pstore.c
new file mode 100644
index 000000000000..6fe62c0f1508
--- /dev/null
+++ b/drivers/virtio/virtio_pstore.c
@@ -0,0 +1,317 @@
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define VIRT_PSTORE_ORDER    2
+#define VIRT_PSTORE_BUFSIZE  (4096 << VIRT_PSTORE_ORDER)
+
+struct virtio_pstore {
+   struct virtio_device*vdev;
+   struct virtqueue*vq;
+   struct pstore_info   pstore;
+   struct virtio_pstore_hdr hdr;
+   size_t   buflen;
+   u64  id;
+
+   /* Waiting for host to ack */
+   wait_queue_head_t   acked;
+};
+
+static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type)
+{
+   u16 ret;
+
+   switch (type) {
+   case PSTORE_TYPE_DMESG:
+   ret = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_DMESG);
+   break;
+   default:
+   ret = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN);
+   break;
+   }
+
+   return ret;
+}
+
+static enum pstore_type_id from_virtio_type(struct virtio_pstore *vps, u16 type)
+{
+   enum pstore_type_id ret;
+
+   switch (virtio16_to_cpu(vps->vdev, type)) {
+   case VIRTIO_PSTORE_TYPE_DMESG:
+   ret = PSTORE_TYPE_DMESG;
+   break;
+   default:
+   ret = PSTORE_TYPE_UNKNOWN;
+   break;
+   }
+
+   return ret;
+}
+
+static void virtpstore_ack(struct virtqueue *vq)
+{
+   struct virtio_pstore *vps = vq->vdev->priv;
+
+   wake_up(&vps->acked);
+}
+
+static int virt_pstore_open(struct pstore_info *psi)
+{
+   struct virtio_pstore *vps = psi->data;
+   struct virtio_pstore_hdr *hdr = &vps->hdr;
+   struct scatterlist sg[1];
+   unsigned int len;
+
+   hdr->cmd = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_OPEN);
+
+   sg_init_one(sg, hdr, sizeof(*hdr));
+   virtqueue_add_outbuf(vps->vq, sg, 1, vps, GFP_KERNEL);
+   virtqueue_kick(vps->vq);
+
+   wait_event(vps->acked, virtqueue_get_buf(vps->vq, &len));
+   return 0;
+}
+
+static int virt_pstore_close(struct pstore_info *psi)
+{
+   struct virtio_pstore *vps = psi->data;
+   struct virtio_pstore_hdr *hdr = &vps->hdr;
+   struct scatterlist sg[1];
+   unsigned int len;
+
+   hdr->cmd = cpu_to_

[Qemu-devel] [PATCH 2/3] qemu: Implement virtio-pstore device

2016-07-17 Thread Namhyung Kim

From: Namhyung Kim 

Add virtio pstore device to allow kernel log files saved on the host.
It will save the log files on the directory given by pstore device
option.

  $ qemu-system-x86_64 -device virtio-pstore,directory=dir-xx ...

  (guest) # echo c > /proc/sysrq-trigger

  $ ls dir-xx
  dmesg-0.enc.z  dmesg-1.enc.z

The log files are usually compressed using zlib.  Users can see the log
messages directly on the host or on the guest (using pstore filesystem).

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: "Michael S. Tsirkin" 
Cc: Anthony Liguori 
Cc: Anton Vorontsov 
Cc: Colin Cross 
Cc: Kees Cook 
Cc: Tony Luck 
Cc: Steven Rostedt 
Cc: Ingo Molnar 
Cc: Minchan Kim 
Cc: k...@vger.kernel.org
Cc: qemu-devel@nongnu.org
Cc: virtualizat...@lists.linux-foundation.org
Signed-off-by: Namhyung Kim 
---
 hw/virtio/Makefile.objs|   2 +-
 hw/virtio/virtio-pci.c |  50 
 hw/virtio/virtio-pci.h |  14 +
 hw/virtio/virtio-pstore.c  | 328 +
 include/hw/pci/pci.h   |   1 +
 include/hw/virtio/virtio-pstore.h  |  30 ++
 include/standard-headers/linux/virtio_ids.h|   1 +
 .../linux/{virtio_ids.h => virtio_pstore.h}|  48 +--
 qdev-monitor.c |   1 +
 9 files changed, 455 insertions(+), 20 deletions(-)
 create mode 100644 hw/virtio/virtio-pstore.c
 create mode 100644 include/hw/virtio/virtio-pstore.h
 copy include/standard-headers/linux/{virtio_ids.h => virtio_pstore.h} (63%)

diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 3e2b175..aae7082 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -4,4 +4,4 @@ common-obj-y += virtio-bus.o
 common-obj-y += virtio-mmio.o
 
 obj-y += virtio.o virtio-balloon.o 
-obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
+obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o virtio-pstore.o
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 2b34b43..8281b80 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2416,6 +2416,55 @@ static const TypeInfo virtio_host_pci_info = {
 };
 #endif
 
+/* virtio-pstore-pci */
+
+static void virtio_pstore_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+VirtIOPstorePCI *vps = VIRTIO_PSTORE_PCI(vpci_dev);
+DeviceState *vdev = DEVICE(&vps->vdev);
+Error *err = NULL;
+
+qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
+object_property_set_bool(OBJECT(vdev), true, "realized", &err);
+if (err) {
+error_propagate(errp, err);
+return;
+}
+}
+
+static void virtio_pstore_pci_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
+
+k->realize = virtio_pstore_pci_realize;
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+
+pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
+pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_PSTORE;
+pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
+pcidev_k->class_id = PCI_CLASS_OTHERS;
+}
+
+static void virtio_pstore_pci_instance_init(Object *obj)
+{
+VirtIOPstorePCI *dev = VIRTIO_PSTORE_PCI(obj);
+
+virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+TYPE_VIRTIO_PSTORE);
+object_property_add_alias(obj, "directory", OBJECT(&dev->vdev),
+  "directory", &error_abort);
+}
+
+static const TypeInfo virtio_pstore_pci_info = {
+.name  = TYPE_VIRTIO_PSTORE_PCI,
+.parent= TYPE_VIRTIO_PCI,
+.instance_size = sizeof(VirtIOPstorePCI),
+.instance_init = virtio_pstore_pci_instance_init,
+.class_init= virtio_pstore_pci_class_init,
+};
+
 /* virtio-pci-bus */
 
 static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
@@ -2485,6 +2534,7 @@ static void virtio_pci_register_types(void)
 #ifdef CONFIG_VHOST_SCSI
 type_register_static(&vhost_scsi_pci_info);
 #endif
+type_register_static(&virtio_pstore_pci_info);
 }
 
 type_init(virtio_pci_register_types)
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index e4548c2..b4c039f 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -31,6 +31,7 @@
 #ifdef CONFIG_VHOST_SCSI
 #include "hw/virtio/vhost-scsi.h"
 #endif
+#include "hw/virtio/virtio-pstore.h"
 
 typedef struct VirtIOPCIProxy VirtIOPCIProxy;
 typedef struct VirtIOBlkPCI VirtIOBlkPCI;
@@ -44,6 +45,7 @@ typedef struct VirtIOInputPCI VirtIOInputPCI;
 typedef struct VirtIOInputHIDPCI VirtIOInputHIDPCI;
 typedef struct VirtIOInputHostPCI VirtIOInputHostPCI;
 typedef struct VirtIOGPUPCI VirtIOGPUPCI;
+typedef struct VirtIOPstorePCI VirtIOPstorePCI;
 
 /* virtio-pci-bus */
 
@@ -311,6 +313,18 @@ struct VirtIOGPUPCI {
 VirtIOGPU vdev;
 };
 
+/*
+ * virtio-pstore-pci: This extends Virti

[Qemu-devel] [Bug 1603693] Re: Disks in mptsas1068 scsi controller not seen by linux

2016-07-17 Thread Paolo Bonzini

> The non-working vmware config says `scsi0.virtualDev = "lsilogic"`
> (that's mptspi, LSI53C1030 or "LSI Logic Ultra 320"). For the mptsas
> tests above, I changed it to `scsi0.virtualDev = "lsisas1068"`.
>
> Is it correct to say that the LSI53C1030 parts of [1] were never applied?

Yes, that's correct.  The patch you linked was almost entirely
rewritten.

** Changed in: qemu
   Status: New => Invalid

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1603693

Title:
  Disks in mptsas1068 scsi controller not seen by linux

Status in QEMU:
  Invalid

Bug description:
  When using the mptsas1068 scsi controller, linux detects the
  controller itself but not the drives attached to it. Freebsd works.
  Using a different controller with linux works. VMware with linux
  works.

  qemu 2.6.50 (v2.6.0-1925-g6b92bbf)
  seabios rel-1.9.0-139-gae3f78f (master branch, required for mptsas1068 support)

  Test script, loosely based off what libvirt runs and the libvirt tests
  that Paolo Bonzini wrote [1]

  #
  iso=archlinux-2016.07.01-dual.iso
  #iso=FreeBSD-10.3-RELEASE-amd64-bootonly.iso
  device=mptsas1068
  #device=lsi

  img=empty.img
  qemu-img create -f qcow2 $img 1G

  /usr/bin/qemu-system-x86_64 \
  -enable-kvm \
  -m 1024 \
  -boot menu=on \
  -device $device,id=scsi0,bus=pci.0,addr=0x9 \
  -drive file=$img,format=qcow2,if=none,id=drive-scsi0-0-0-0 \
  -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 \
  -drive file=$iso,format=raw,if=none,id=drive-ide0-0-1,readonly=on \
  -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=1
  #

  The ISOs can be downloaded from [2] and [3].

  After booting linux, do "lsblk". /dev/sda should exist.

  After booting freebsd, do "geom disk list". A da0 / "QEMU QEMU
  HARDDISK" should be mentioned.

  With device=mptsas1068 this fails in linux.

  With device=lsi line it works in both.

  With VMWare and a linux VM (opensuse 10.1, kernel 2.6.18) which only
  loads modules for mptsas1068, this works.

  I also reproduced this with the debian 8.5 netinstall image, but it
  insists in making you pick a driver from a list of modules when it
  fails to mount it, instead of dropping to a shell.

  Arch linux dmesg output snippet (full output attached as arch-linux-
  dmesg.txt):

  #
  root@archiso ~ # dmesg | grep -i -e mpt -e scsi -e ioc0
  [0.00] Linux version 4.6.3-1-ARCH (builduser@tobias) (gcc version 6.1.1 20160602 (GCC) ) #1 SMP PREEMPT Fri Jun 24 21:19:13 CEST 2016
  [0.00]   Normal   empty
  [0.00] Preemptible hierarchical RCU implementation.
  [1.879616] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249)
  [1.951581] SCSI subsystem initialized
  [1.957113] Fusion MPT base driver 3.04.20
  [1.957618] Fusion MPT SAS Host driver 3.04.20
  [2.281773] scsi host0: ata_piix
  [2.285372] scsi host1: ata_piix
  [2.305803] mptbase: ioc0: Initiating bringup
  [2.363555] ioc0: LSISAS1068 A0: Capabilities={Initiator}
  [2.444390] scsi 0:0:1:0: CD-ROM QEMU QEMU DVD-ROM 2.5+ PQ: 0 ANSI: 5
  [2.500572] scsi host2: ioc0: LSISAS1068 A0, FwRev=01329200h, Ports=8, MaxQ=128, IRQ=11
  [2.507024] sr 0:0:1:0: [sr0] scsi3-mmc drive: 4x/4x cd/rw xa/form2 tray
  [2.507274] sr 0:0:1:0: Attached scsi CD-ROM sr0
  #

  The controller itself is detected, the disk isn't.

  An early version of this patch [4] said that it was only tested with
  FreeBSD:

  >Tested with FreeBSD for now.  The previous version (before the
  >configuration page rewrite) worked with RHEL and Windows guests as well.
  >
  >TODO: write qtest for (at least) config pages, test Linux and Windows.

  [1]: https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=fc922eb2080a3fa7b24bc8a8b0aabfd394480143
  [2]: https://www.archlinux.org/download
  [3]: https://www.freebsd.org/where.html
  [4]: https://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg06475.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1603693/+subscriptions

[Qemu-devel] [RFC/PATCHSET 0/3] virtio-pstore: Implement virtio pstore device

2016-07-17 Thread Namhyung Kim

Hello,

This patchset is a proof of concept of the virtio-pstore idea [1].  It has
some rough edges and I'm not familiar with this area, so please give
me feedback and advice if I'm going in the wrong direction.

It started from the fact that dumping the ftrace buffer at kernel
oops/panic takes too much time.  Although there's a way to reduce the
size of the original data, sometimes I want to have as much
information as possible.  Maybe kexec/kdump can solve this problem, but
it consumes some portion of guest memory, so I'd like to avoid it.  And I
know that qemu + crashtool can dump and analyze the whole guest memory,
including the ftrace buffer, without wasting guest memory, but it adds
one more layer and has some limitations as an out-of-tree tool, such as
not being in sync with kernel changes.

So I think it'd be great to use the pstore interface to dump guest
kernel data on the host.  One can read the data on the host directly
or on the guest (at the next boot) using the pstore filesystem as usual.
While this patchset only implements dumping the kernel log buffer, it can
be extended to cover the ftrace buffer and probably some more.

Patch 0001 implements the virtio pstore driver.  It has a single
virtqueue, a pstore buffer and a header structure.  The virtio_pstore_hdr
struct gives information about the current pstore operation.

Patches 0002 and 0003 implement the virtio-pstore legacy PCI device on
qemu-kvm and kvmtool respectively.  I referenced the virtio-balloon and
virtio-rng implementations, and I don't know whether kvmtool supports
the modern virtio 1.0+ spec.

For example, using virtio-pstore on qemu looks like below:

  $ qemu-system-x86_64 -enable-kvm -device virtio-pstore,directory=xxx

When the guest kernel panics, the log messages will be saved under the
xxx directory.

  $ ls xxx
  dmesg-0.enc.z  dmesg-1.enc.z

As you can see the pstore subsystem compresses the log data using
zlib.  The data can be extracted with the following command:

  $ cat xxx/dmesg-0.enc.z | \
  > python -c 'import sys, zlib; print(zlib.decompress(sys.stdin.read()))'
  Oops#1 Part1
  <5>[0.00] Linux version 4.6.0kvm+ (namhyung@danjae) (gcc version 5.3.0 (GCC) ) #145 SMP Mon Jul 18 10:22:45 KST 2016
  <6>[0.00] Command line: root=/dev/vda console=ttyS0
  <6>[0.00] x86/fpu: Legacy x87 FPU detected.
  <6>[0.00] x86/fpu: Using 'eager' FPU context switches.
  <6>[0.00] e820: BIOS-provided physical RAM map:
  <6>[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
  <6>[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
  <6>[0.00] BIOS-e820: [mem 0x000f-0x000f] reserved
  <6>[0.00] BIOS-e820: [mem 0x0010-0x07fddfff] usable
  <6>[0.00] BIOS-e820: [mem 0x07fde000-0x07ff] reserved
  <6>[0.00] BIOS-e820: [mem 0xfeffc000-0xfeff] reserved
  <6>[0.00] BIOS-e820: [mem 0xfffc-0x] reserved
  <6>[0.00] NX (Execute Disable) protection: active
  <6>[0.00] SMBIOS 2.8 present.
  <7>[0.00] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
  ...

Maybe we can add a config option to control the compression later.


Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: "Michael S. Tsirkin" 
Cc: Anthony Liguori 
Cc: Anton Vorontsov 
Cc: Colin Cross 
Cc: Kees Cook 
Cc: Tony Luck 
Cc: Steven Rostedt 
Cc: Ingo Molnar 
Cc: Minchan Kim 
Cc: k...@vger.kernel.org
Cc: qemu-devel@nongnu.org
Cc: virtualizat...@lists.linux-foundation.org

[1] https://lkml.org/lkml/2016/7/1/6


Thanks,
Namhyung

Re: [Qemu-devel] [PATCH 1/3] virtio: Basic implementation of virtio pstore driver

2016-07-17 Thread Namhyung Kim

Hello,

On Sun, Jul 17, 2016 at 10:12:26PM -0700, Kees Cook wrote:
> On Sun, Jul 17, 2016 at 9:37 PM, Namhyung Kim  wrote:
> > The virtio pstore driver provides interface to the pstore subsystem so
> > that the guest kernel's log/dump message can be saved on the host
> > machine.  Users can access the log file directly on the host, or on the
> > guest at the next boot using pstore filesystem.  It currently deals with
> > kernel log (printk) buffer only, but we can extend it to have other
> > information (like ftrace dump) later.
> >
> > It supports legacy PCI device using single order-2 page buffer.  As all
> > operation of pstore is synchronous, it would be fine IMHO.  However I
> > don't know how to make write operation synchronous since it's called
> > with a spinlock held (from any context including NMI).
> >
> > Cc: Paolo Bonzini 
> > Cc: Radim Krčmář 
> > Cc: "Michael S. Tsirkin" 
> > Cc: Anthony Liguori 
> > Cc: Anton Vorontsov 
> > Cc: Colin Cross 
> > Cc: Kees Cook 
> > Cc: Tony Luck 
> > Cc: Steven Rostedt 
> > Cc: Ingo Molnar 
> > Cc: Minchan Kim 
> > Cc: k...@vger.kernel.org
> > Cc: qemu-devel@nongnu.org
> > Cc: virtualizat...@lists.linux-foundation.org
> > Signed-off-by: Namhyung Kim 
> 
> This looks great to me! I'd love to use this in qemu. (Right now I go
> through hoops to use the ramoops backend for testing.)
> 
> Reviewed-by: Kees Cook 

Thank you!

> 
> Notes below...
>

[SNIP]
> > +static u16 to_virtio_type(struct virtio_pstore *vps, enum pstore_type_id type)
> > +{
> > +   u16 ret;
> > +
> > +   switch (type) {
> > +   case PSTORE_TYPE_DMESG:
> > +   ret = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_DMESG);
> > +   break;
> > +   default:
> > +   ret = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_TYPE_UNKNOWN);
> > +   break;
> > +   }
> 
> I would love to see this support PSTORE_TYPE_CONSOLE too. It should be
> relatively easy to add: I think it'd just be another virtio command?

Do you want to append the data to the host file as the guest calls
printk()?  I think it needs some kind of buffer management, but it's
not hard to add IMHO.


> 
> > +
> > +   return ret;
> > +}
> > +

[SNIP]
> > +static int notrace virt_pstore_write(enum pstore_type_id type,
> > +enum kmsg_dump_reason reason,
> > +u64 *id, unsigned int part, int count,
> > +bool compressed, size_t size,
> > +struct pstore_info *psi)
> > +{
> > +   struct virtio_pstore *vps = psi->data;
> > +   struct virtio_pstore_hdr *hdr = &vps->hdr;
> > +   struct scatterlist sg[2];
> > +   unsigned int flags = compressed ? VIRTIO_PSTORE_FL_COMPRESSED : 0;
> > +
> > +   *id = vps->id++;
> > +
> > +   hdr->cmd   = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_WRITE);
> > +   hdr->id= cpu_to_virtio64(vps->vdev, *id);
> > +   hdr->flags = cpu_to_virtio32(vps->vdev, flags);
> > +   hdr->type  = to_virtio_type(vps, type);
> > +
> > +   sg_init_table(sg, 2);
> > +   sg_set_buf(&sg[0], hdr, sizeof(*hdr));
> > +   sg_set_buf(&sg[1], psi->buf, size);
> > +   virtqueue_add_outbuf(vps->vq, sg, 2, vps, GFP_ATOMIC);
> > +   virtqueue_kick(vps->vq);
> > +
> > +   /* TODO: make it synchronous */
> > +   return 0;
> 
> The down side to this being asynchronous is the lack of error
> reporting. Perhaps this could check hdr->type before queuing and error
> for any VIRTIO_PSTORE_TYPE_UNKNOWN message instead of trying to send
> it?

I cannot follow, sorry.  Could you please elaborate it more?
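
One possible reading of the suggestion above, as a sketch only: the type is
translated before anything is queued, and an unsupported type is rejected
with an error instead of being sent as VIRTIO_PSTORE_TYPE_UNKNOWN.  All
names below (record_type, wire_type, to_wire_type, queue_write) are
hypothetical stand-ins chosen for illustration; they are not the driver's
API, and the -EINVAL return value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical record types a caller may ask to store. */
enum record_type { RECORD_DMESG, RECORD_CONSOLE, RECORD_FTRACE };

/* Hypothetical on-the-wire types; only DMESG is supported in this sketch. */
enum wire_type { WIRE_UNKNOWN = 0, WIRE_DMESG = 1 };

static enum wire_type to_wire_type(enum record_type type)
{
    switch (type) {
    case RECORD_DMESG:
        return WIRE_DMESG;
    default:
        return WIRE_UNKNOWN;
    }
}

/* Validates the type first; only supported records are counted as queued.
 * Returns 0 on success or -EINVAL for an unsupported type. */
int queue_write(enum record_type type, unsigned int *queued)
{
    assert(queued != NULL);

    if (to_wire_type(type) == WIRE_UNKNOWN) {
        return -EINVAL;     /* reported synchronously, nothing is sent */
    }

    (*queued)++;            /* stands in for adding the buffer and kicking */
    return 0;
}
```

This gives the caller at least one synchronous error, even though the write
itself stays asynchronous.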


> 
> > +}
> > +
> > +static int virt_pstore_erase(enum pstore_type_id type, u64 id, int count,
> > +struct timespec time, struct pstore_info *psi)
> > +{
> > +   struct virtio_pstore *vps = psi->data;
> > +   struct virtio_pstore_hdr *hdr = &vps->hdr;
> > +   struct scatterlist sg[1];
> > +   unsigned int len;
> > +
> > +   hdr->cmd   = cpu_to_virtio16(vps->vdev, VIRTIO_PSTORE_CMD_ERASE);
> > +   hdr->id= cpu_to_virtio64(vps->vdev, id);
> > +   hdr->type  = to_virtio_type(vps, type);
> > +
> > +   sg_init_one(sg, hdr, sizeof(*hdr));
> > +   virtqueue_add_outbuf(vps->vq, sg, 1, vps, GFP_KERNEL);
> > +   virtqueue_kick(vps->vq);
> > +
> > +   wait_event(vps->acked, virtqueue_get_buf(vps->vq, &len));
> > +   return 0;
> > +}
> > +
> > +static int virt_pstore_init(struct virtio_pstore *vps)
> > +{
> > +   struct pstore_info *psinfo = &vps->pstore;
> > +   int err;
> > +
> > +   vps->id = 0;
> > +   vps->buflen = 0;
> > +   psinfo->bufsize = VIRT_PSTORE_BUFSIZE;
> > +   psinfo->buf = (void *)__get_free_pages(GFP_KERNEL, VIRT_PSTORE_ORDER);
> > +   if (!psinfo->buf) {
> > +   pr_err("cannot allocate pstore buffer\n");
> > +   return -ENOMEM;
> > +   }

[Qemu-devel] [RFC PATCH V7 0/7] Introduce COLO-compare

2016-07-17 Thread Zhang Chen

COLO-compare is a part of the COLO project. It is used
to compare network packets to help COLO decide
whether to do a checkpoint.

The full version is available on GitHub:
https://github.com/zhangckid/qemu/tree/colo-v2.7-proxy-mode-compare-with-colo-base-jul18


v7:
 p5:
   - add [PATCH]qemu-char: Fix context for g_source_attach()
 in this patch series.

v6: 
 p6:
   - add more commit log.
   - fix icmp comparison to compare all packet.

 p5:
   - add more comments in commit log.
   - change REGULAR_CHECK_MS to REGULAR_PACKET_CHECK_MS
   - make check old packet independent to compare thread
   - remove thread_status

 p4:
   - change this patch only about
 Connection and ConnectionKey.
   - add some comments in commit log.
   - remove mode in fill_connection_key().
   - fix some comments and bug.
   - move colo_conn_state to patch of
 "work with colo-frame"
   - remove conn_list_lock.
   - add MAX_QUEUE_SIZE; if primary_list or
     secondary_list is bigger than MAX_QUEUE_SIZE,
     we will drop the packet.

 p3:
   - add new independent kernel jhash patch.

 p2:
   - add new independent colo-base patch.

 p1:
   - add a ascii figure and some comments to explain it
   - move trace.h to p2
   - move QTAILQ_HEAD(, CompareState) net_compares to
 patch of "work with colo-frame"
   - add some comments in qemu-option.hx


v5:
 p3:
- comments from Jason
  we poll and handle chardev in the compare thread,
  Through this way, there's no need for extra 
  synchronization with main loop
  this depend on another patch:
  qemu-char: Fix context for g_source_attach()
- remove QemuEvent
 p2:
- remove conn->list_lock
 p1:
- move compare_pri/sec_chr_in to p3
- move compare_chr_send to p2

v4:
 p4:
- add some comments
- fix some trace-events
- fix tcp compare error
 p3:
- add rcu_read_lock().
- fix trace name
- fix jason's other comments
- rebase some Dave's branch function
 p2:
- colo_compare_connection(): change g_queue_push_head() to
  g_queue_push_tail() to match the sorted order.
- remove pkt->s
- move data structure to colo-base.h
- add colo-base.c reuse codes for filter-rewriter
- add some filter-rewriter needs struct
- depends on previous SocketReadState patch
 p1:
- except move qemu_chr_add_handlers()
  to colo thread
- remove class_finalize
- remove secondary arp codes
- depends on previous SocketReadState patch

v3:
  - rebase colo-compare to colo-frame v2.7
  - fix most of Dave's comments
(except RCU)
  - add TCP,UDP,ICMP and other packet comparison
  - add trace-event
  - add some comments
  - other bug fix
  - add RFC index
  - add usage in patch 1/4

v2:
  - add jhash.h

v1:
  - initial patch


Zhang Chen (7):
  colo-compare: introduce colo compare initialization
  colo-base: add colo-base to define and handle packet
  Jhash: add linux kernel jhashtable in qemu
  colo-compare: track connection and enqueue packet
  qemu-char: Fix context for g_source_attach()
  colo-compare: introduce packet comparison thread
  colo-compare: add TCP,UDP,ICMP packet comparison

 include/qemu/jhash.h |  61 
 io/channel.c |   2 +-
 net/Makefile.objs|   2 +
 net/colo-base.c  | 183 
 net/colo-base.h  |  71 +
 net/colo-compare.c   | 769 +++
 qemu-char.c  |   6 +-
 qemu-options.hx  |  38 +++
 trace-events |   9 +
 vl.c |   3 +-
 10 files changed, 1139 insertions(+), 5 deletions(-)
 create mode 100644 include/qemu/jhash.h
 create mode 100644 net/colo-base.c
 create mode 100644 net/colo-base.h
 create mode 100644 net/colo-compare.c

-- 
2.7.4

[Qemu-devel] [RFC PATCH V7 4/7] colo-compare: track connection and enqueue packet

2016-07-17 Thread Zhang Chen

In this patch we use the kernel jhash table to track
connections, and then enqueue net packets like this:

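  + CompareState ++
  |               |
  +---------------+   +---------------+         +---------------+
  |conn list      +--->conn           +--------->conn           |
  +---------------+   +---------------+         +---------------+
  |               |     |           |             |          |
  +---------------+ +---v----+  +---v----+    +---v----+ +---v----+
                    |primary |  |secondary    |primary | |secondary
                    |packet  |  |packet  +    |packet  | |packet  +
                    +--------+  +--------+    +--------+ +--------+
                        |           |             |          |
                    +---v----+  +---v----+    +---v----+ +---v----+
                    |primary |  |secondary    |primary | |secondary
                    |packet  |  |packet  +    |packet  | |packet  +
                    +--------+  +--------+    +--------+ +--------+
                        |           |             |          |
                    +---v----+  +---v----+    +---v----+ +---v----+
                    |primary |  |secondary    |primary | |secondary
                    |packet  |  |packet  +    |packet  | |packet  +
                    +--------+  +--------+    +--------+ +--------+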

We use conn_list to record connection info.
When we want to enqueue a packet, we first get the
connection from connection_track_table, then push
the packet to the g_queue (pri/sec) in its own conn.
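
The lookup-then-enqueue flow described above can be sketched in plain C as
follows. This is only an illustration under stated assumptions, not the code
from the patch: a small fixed array stands in for the GHashTable, integer
packet ids stand in for Packet pointers, and the names ConnKey, Conn,
conn_get(), MAX_CONNS and the small MAX_QUEUE_SIZE value are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define MAX_CONNS      8   /* hypothetical table size for this sketch */
#define MAX_QUEUE_SIZE 4   /* the patch uses 1024; kept small here */

typedef struct {
    uint32_t src, dst;             /* IPv4 addresses */
    uint16_t src_port, dst_port;
    uint8_t  ip_proto;
} ConnKey;

typedef struct {
    ConnKey key;
    int used;
    int primary[MAX_QUEUE_SIZE];   /* queued packet ids from the primary */
    int n_primary;
    int secondary[MAX_QUEUE_SIZE]; /* queued packet ids from the secondary */
    int n_secondary;
} Conn;

static Conn table[MAX_CONNS];

/* Field-wise comparison, so struct padding never influences the result. */
static int key_equal(const ConnKey *a, const ConnKey *b)
{
    return a->src == b->src && a->dst == b->dst &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->ip_proto == b->ip_proto;
}

/* Find the connection for key, creating it if it is not tracked yet. */
static Conn *conn_get(const ConnKey *key)
{
    Conn *free_slot = NULL;

    for (int i = 0; i < MAX_CONNS; i++) {
        if (table[i].used && key_equal(&table[i].key, key)) {
            return &table[i];
        }
        if (!table[i].used && !free_slot) {
            free_slot = &table[i];
        }
    }
    if (free_slot) {
        memset(free_slot, 0, sizeof(*free_slot));
        free_slot->key = *key;
        free_slot->used = 1;
    }
    return free_slot;   /* NULL when the table is full */
}

/*
 * Queue a packet id on the primary or secondary side of its connection.
 * Returns 0 on success, -1 if the packet has to be dropped.
 */
static int packet_enqueue(const ConnKey *key, int from_primary, int pkt_id)
{
    Conn *conn = conn_get(key);

    if (!conn) {
        return -1;
    }
    int *queue = from_primary ? conn->primary : conn->secondary;
    int *len = from_primary ? &conn->n_primary : &conn->n_secondary;

    if (*len >= MAX_QUEUE_SIZE) {
        return -1;      /* queue full: drop, as the v6 changelog describes */
    }
    queue[(*len)++] = pkt_id;
    return 0;
}
```

Packets from both sides that share the same 5-tuple end up in the two queues
of one Conn, which is what lets a later comparison step pair them up.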

Signed-off-by: Zhang Chen 
Signed-off-by: Li Zhijian 
Signed-off-by: Wen Congyang 
---
 net/colo-base.c| 108 +
 net/colo-base.h|  30 +++
 net/colo-compare.c |  70 +-
 3 files changed, 198 insertions(+), 10 deletions(-)

diff --git a/net/colo-base.c b/net/colo-base.c
index f5d5de9..7e91dec 100644
--- a/net/colo-base.c
+++ b/net/colo-base.c
@@ -16,6 +16,29 @@
 #include "qemu/error-report.h"
 #include "net/colo-base.h"
 
+uint32_t connection_key_hash(const void *opaque)
+{
+const ConnectionKey *key = opaque;
+uint32_t a, b, c;
+
+/* Jenkins hash */
+a = b = c = JHASH_INITVAL + sizeof(*key);
+a += key->src.s_addr;
+b += key->dst.s_addr;
+c += (key->src_port | key->dst_port << 16);
+__jhash_mix(a, b, c);
+
+a += key->ip_proto;
+__jhash_final(a, b, c);
+
+return c;
+}
+
+int connection_key_equal(const void *key1, const void *key2)
+{
+return memcmp(key1, key2, sizeof(ConnectionKey)) == 0;
+}
+
 int parse_packet_early(Packet *pkt)
 {
 int network_length;
@@ -47,6 +70,62 @@ int parse_packet_early(Packet *pkt)
 return 0;
 }
 
+void fill_connection_key(Packet *pkt, ConnectionKey *key)
+{
+uint32_t tmp_ports;
+
+key->ip_proto = pkt->ip->ip_p;
+
+switch (key->ip_proto) {
+case IPPROTO_TCP:
+case IPPROTO_UDP:
+case IPPROTO_DCCP:
+case IPPROTO_ESP:
+case IPPROTO_SCTP:
+case IPPROTO_UDPLITE:
+tmp_ports = *(uint32_t *)(pkt->transport_layer);
+key->src = pkt->ip->ip_src;
+key->dst = pkt->ip->ip_dst;
+key->src_port = ntohs(tmp_ports & 0xffff);
+key->dst_port = ntohs(tmp_ports >> 16);
+break;
+case IPPROTO_AH:
+tmp_ports = *(uint32_t *)(pkt->transport_layer + 4);
+key->src = pkt->ip->ip_src;
+key->dst = pkt->ip->ip_dst;
+key->src_port = ntohs(tmp_ports & 0xffff);
+key->dst_port = ntohs(tmp_ports >> 16);
+break;
+default:
+key->src_port = 0;
+key->dst_port = 0;
+break;
+}
+}
+
+Connection *connection_new(ConnectionKey *key)
+{
+Connection *conn = g_slice_new(Connection);
+
+conn->ip_proto = key->ip_proto;
+conn->processing = false;
+g_queue_init(&conn->primary_list);
+g_queue_init(&conn->secondary_list);
+
+return conn;
+}
+
+void connection_destroy(void *opaque)
+{
+Connection *conn = opaque;
+
+g_queue_foreach(&conn->primary_list, packet_destroy, NULL);
+g_queue_free(&conn->primary_list);
+g_queue_foreach(&conn->secondary_list, packet_destroy, NULL);
+g_queue_free(&conn->secondary_list);
+g_slice_free(Connection, conn);
+}
+
 Packet *packet_new(const void *data, int size)
 {
 Packet *pkt = g_slice_new(Packet);
@@ -72,3 +151,32 @@ void connection_hashtable_reset(GHashTable *connection_track_table)
 {
 g_hash_table_remove_all(connection_track_table);
 }
+
+/* if not found, create a new connection and add to hash table */
+Connection *connection_get(GHashTable *connection_track_table,
+   ConnectionKey *key,
+   uint32_t *hashtable_size)
+{
+Connection *conn = g_hash_table_lookup(connection_track_table, key);
+
+if (conn == NULL) {
+ConnectionKey *new_key = g_memdup(key, sizeof(*key));
+
+conn = connection_new(key);
+
+(*hashtable_size) += 1;
+if (*hashtable_size > HASHTABLE_MAX_SIZE) {
+

[Qemu-devel] [RFC PATCH V7 3/7] Jhash: add linux kernel jhashtable in qemu

2016-07-17 Thread Zhang Chen

Jhash is used by colo-compare and filter-rewriter
to save and look up net connection info.
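
As a rough standalone illustration of how the mix and final steps turn a
connection 5-tuple into a 32-bit hash, consider the sketch below. The rotation
constants are the ones in the header added by this patch; rol32() is written
out locally instead of coming from qemu/bitops.h, and the ConnKey layout and
conn_key_hash() name are assumptions made for the example.

```c
#include <stdint.h>

#define JHASH_INITVAL 0xdeadbeef

/* Rotate a 32-bit word left; shift must be in the range 1..31. */
static inline uint32_t rol32(uint32_t word, unsigned int shift)
{
    return (word << shift) | (word >> ((32 - shift) & 31));
}

/* Mix three 32-bit values reversibly. */
#define JHASH_MIX(a, b, c) do {             \
    a -= c;  a ^= rol32(c, 4);  c += b;     \
    b -= a;  b ^= rol32(a, 6);  a += c;     \
    c -= b;  c ^= rol32(b, 8);  b += a;     \
    a -= c;  a ^= rol32(c, 16); c += b;     \
    b -= a;  b ^= rol32(a, 19); a += c;     \
    c -= b;  c ^= rol32(b, 4);  b += a;     \
} while (0)

/* Final mixing of three 32-bit values into c. */
#define JHASH_FINAL(a, b, c) do {           \
    c ^= b; c -= rol32(b, 14);              \
    a ^= c; a -= rol32(c, 11);              \
    b ^= a; b -= rol32(a, 25);              \
    c ^= b; c -= rol32(b, 16);              \
    a ^= c; a -= rol32(c, 4);               \
    b ^= a; b -= rol32(a, 14);              \
    c ^= b; c -= rol32(b, 24);              \
} while (0)

/* Hypothetical key layout for this sketch. */
typedef struct {
    uint32_t src, dst;
    uint16_t src_port, dst_port;
    uint8_t  ip_proto;
} ConnKey;

static uint32_t conn_key_hash(const ConnKey *key)
{
    uint32_t a, b, c;

    a = b = c = JHASH_INITVAL + (uint32_t)sizeof(*key);
    a += key->src;
    b += key->dst;
    c += (uint32_t)key->src_port | ((uint32_t)key->dst_port << 16);
    JHASH_MIX(a, b, c);

    a += key->ip_proto;
    JHASH_FINAL(a, b, c);

    return c;
}
```

Two keys with identical fields always produce the same value, which is the
property a hash table lookup relies on; different keys may still collide, so
an equality check on the key is needed as well.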

Signed-off-by: Zhang Chen 
Signed-off-by: Li Zhijian 
Signed-off-by: Wen Congyang 
---
 include/qemu/jhash.h | 61 
 1 file changed, 61 insertions(+)
 create mode 100644 include/qemu/jhash.h

diff --git a/include/qemu/jhash.h b/include/qemu/jhash.h
new file mode 100644
index 000..0fcd875
--- /dev/null
+++ b/include/qemu/jhash.h
@@ -0,0 +1,61 @@
+/* jhash.h: Jenkins hash support.
+  *
+  * Copyright (C) 2006. Bob Jenkins (bob_jenk...@burtleburtle.net)
+  *
+  * http://burtleburtle.net/bob/hash/
+  *
+  * These are the credits from Bob's sources:
+  *
+  * lookup3.c, by Bob Jenkins, May 2006, Public Domain.
+  *
+  * These are functions for producing 32-bit hashes for hash table lookup.
+  * hashword(), hashlittle(), hashlittle2(), hashbig(), mix(), and final()
+  * are externally useful functions.  Routines to test the hash are
+included
+  * if SELF_TEST is defined.  You can use this free for any purpose.
+It's in
+  * the public domain.  It has no warranty.
+  *
+  * Copyright (C) 2009-2010 Jozsef Kadlecsik (kad...@blackhole.kfki.hu)
+  *
+  * I've modified Bob's hash to be useful in the Linux kernel, and
+  * any bugs present are my fault.
+  * Jozsef
+  */
+
+#ifndef QEMU_JHASH_H__
+#define QEMU_JHASH_H__
+
+#include "qemu/bitops.h"
+
+/*
+ * hashtable relation copy from linux kernel jhash
+ */
+
+/* __jhash_mix -- mix 3 32-bit values reversibly. */
+#define __jhash_mix(a, b, c)\
+{   \
+a -= c;  a ^= rol32(c, 4);  c += b; \
+b -= a;  b ^= rol32(a, 6);  a += c; \
+c -= b;  c ^= rol32(b, 8);  b += a; \
+a -= c;  a ^= rol32(c, 16); c += b; \
+b -= a;  b ^= rol32(a, 19); a += c; \
+c -= b;  c ^= rol32(b, 4);  b += a; \
+}
+
+/* __jhash_final - final mixing of 3 32-bit values (a,b,c) into c */
+#define __jhash_final(a, b, c)  \
+{   \
+c ^= b; c -= rol32(b, 14);  \
+a ^= c; a -= rol32(c, 11);  \
+b ^= a; b -= rol32(a, 25);  \
+c ^= b; c -= rol32(b, 16);  \
+a ^= c; a -= rol32(c, 4);   \
+b ^= a; b -= rol32(a, 14);  \
+c ^= b; c -= rol32(b, 24);  \
+}
+
+/* An arbitrary initial parameter */
+#define JHASH_INITVAL   0xdeadbeef
+
+#endif /* QEMU_JHASH_H__ */
-- 
2.7.4

[Qemu-devel] [RFC PATCH V7 6/7] colo-compare: introduce packet comparison thread

2016-07-17 Thread Zhang Chen

If the primary packet is the same as the secondary packet,
we send the primary packet and drop the secondary
packet; otherwise we notify the COLO frame to do a checkpoint.
If a primary packet arrives and the secondary packet does not,
after REGULAR_PACKET_CHECK_MS milliseconds we treat
the primary packet as an old packet, then do a checkpoint.
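
The decision described above can be sketched as a small pure function. This is
an assumption-laden illustration rather than the patch's implementation: the
Pkt type, the Action enum and decide() are hypothetical names, and time is
passed in as a parameter instead of being read from a QEMU clock.

```c
#include <stdint.h>
#include <string.h>

#define REGULAR_PACKET_CHECK_MS 3000

typedef struct {
    const uint8_t *data;
    int size;
    int64_t creation_ms;   /* when the packet was queued */
} Pkt;

typedef enum {
    ACT_SEND_PRIMARY,      /* packets match: forward primary, drop secondary */
    ACT_CHECKPOINT,        /* mismatch or stale primary packet */
    ACT_WAIT               /* no secondary packet yet, primary still fresh */
} Action;

/* Returns 0 when both packets are byte-for-byte identical. */
static int pkt_compare(const Pkt *ppkt, const Pkt *spkt)
{
    if (ppkt->size != spkt->size) {
        return -1;
    }
    return memcmp(ppkt->data, spkt->data, (size_t)ppkt->size);
}

/* secondary may be NULL when nothing has arrived from the secondary yet. */
static Action decide(const Pkt *primary, const Pkt *secondary, int64_t now_ms)
{
    if (secondary) {
        return pkt_compare(primary, secondary) == 0 ? ACT_SEND_PRIMARY
                                                    : ACT_CHECKPOINT;
    }
    if (now_ms - primary->creation_ms > REGULAR_PACKET_CHECK_MS) {
        return ACT_CHECKPOINT;
    }
    return ACT_WAIT;
}
```

Keeping the clock outside the function makes the timeout rule easy to
exercise with fixed timestamps.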

Signed-off-by: Zhang Chen 
Signed-off-by: Li Zhijian 
Signed-off-by: Wen Congyang 
---
 net/colo-base.c|   1 +
 net/colo-base.h|   3 +
 net/colo-compare.c | 216 +
 trace-events   |   2 +
 4 files changed, 222 insertions(+)

diff --git a/net/colo-base.c b/net/colo-base.c
index 7e91dec..eb1b631 100644
--- a/net/colo-base.c
+++ b/net/colo-base.c
@@ -132,6 +132,7 @@ Packet *packet_new(const void *data, int size)
 
 pkt->data = g_memdup(data, size);
 pkt->size = size;
+pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_HOST);
 
 return pkt;
 }
diff --git a/net/colo-base.h b/net/colo-base.h
index 0505608..06d6dca 100644
--- a/net/colo-base.h
+++ b/net/colo-base.h
@@ -17,6 +17,7 @@
 
 #include "slirp/slirp.h"
 #include "qemu/jhash.h"
+#include "qemu/timer.h"
 
 #define HASHTABLE_MAX_SIZE 16384
 
@@ -28,6 +29,8 @@ typedef struct Packet {
 };
 uint8_t *transport_layer;
 int size;
+/* Time of packet creation, in wall clock ms */
+int64_t creation_ms;
 } Packet;
 
 typedef struct ConnectionKey {
diff --git a/net/colo-compare.c b/net/colo-compare.c
index 5f87710..942e326 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -36,6 +36,8 @@
 
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
+/* TODO: Should be configurable */
+#define REGULAR_PACKET_CHECK_MS 3000
 
 /*
   + CompareState ++
@@ -83,6 +85,10 @@ typedef struct CompareState {
 GQueue unprocessed_connections;
 /* proxy current hash size */
 uint32_t hashtable_size;
+/* compare thread, a thread for each NIC */
+QemuThread thread;
+/* Timer used on the primary to find packets that are never matched */
+QEMUTimer *timer;
 } CompareState;
 
 typedef struct CompareClass {
@@ -170,6 +176,112 @@ static int packet_enqueue(CompareState *s, int mode)
 return 0;
 }
 
+/*
+ * The IP packets sent by primary and secondary
+ * will be compared in here
+ * TODO support ip fragment, Out-Of-Order
+ * return:0  means packet same
+ *> 0 || < 0 means packet different
+ */
+static int colo_packet_compare(Packet *ppkt, Packet *spkt)
+{
+trace_colo_compare_ip_info(ppkt->size, inet_ntoa(ppkt->ip->ip_src),
+   inet_ntoa(ppkt->ip->ip_dst), spkt->size,
+   inet_ntoa(spkt->ip->ip_src),
+   inet_ntoa(spkt->ip->ip_dst));
+
+if (ppkt->size == spkt->size) {
+return memcmp(ppkt->data, spkt->data, spkt->size);
+} else {
+return -1;
+}
+}
+
+static int colo_packet_compare_all(Packet *spkt, Packet *ppkt)
+{
+trace_colo_compare_main("compare all");
+return colo_packet_compare(ppkt, spkt);
+}
+
+static void colo_old_packet_check_one(void *opaque_packet,
+  void *opaque_found)
+{
+int64_t now;
+bool *found_old = (bool *)opaque_found;
+Packet *ppkt = (Packet *)opaque_packet;
+
+if (*found_old) {
+/* Someone found an old packet earlier in the queue */
+return;
+}
+
+now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+if ((now - ppkt->creation_ms) > REGULAR_PACKET_CHECK_MS) {
+trace_colo_old_packet_check_found(ppkt->creation_ms);
+*found_old = true;
+}
+}
+
+static void colo_old_packet_check_one_conn(void *opaque,
+   void *user_data)
+{
+bool found_old = false;
+Connection *conn = opaque;
+
+g_queue_foreach(&conn->primary_list, colo_old_packet_check_one,
+&found_old);
+if (found_old) {
+/* do checkpoint will flush old packet */
+/* TODO: colo_notify_checkpoint();*/
+}
+}
+
+/*
+ * Look for old packets that the secondary hasn't matched,
+ * if we have some then we have to checkpoint to wake
+ * the secondary up.
+ */
+static void colo_old_packet_check(void *opaque)
+{
+CompareState *s = opaque;
+
+g_queue_foreach(&s->conn_list, colo_old_packet_check_one_conn, NULL);
+}
+
+/*
+ * called from the compare thread on the primary
+ * for compare connection
+ */
+static void colo_compare_connection(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+GList *result = NULL;
+int ret;
+
+while (!g_queue_is_empty(&conn->primary_list) &&
+   !g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_tail(&conn->primary_list);
+result = g_queue_find_custom(&conn->secondary_list,
+  pkt, (GCompareFunc)colo_packet_compare_all);
+
+if (result) {
+ret = compare_chr_send(s->chr_

[Qemu-devel] [RFC PATCH V7 7/7] colo-compare: add TCP, UDP, ICMP packet comparison

2016-07-17 Thread Zhang Chen

We add TCP, UDP and ICMP packet comparison to replace
the IP packet comparison. This can increase the
accuracy of the packet comparison:
fewer checkpoints, more efficiency.
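
One way protocol-aware comparison reduces false mismatches is the IP
identification handling used in the TCP path of this patch: when the "don't
fragment" flag is set, the id and header checksum of the secondary packet are
overwritten before comparing. A minimal sketch on raw IPv4 bytes, assuming the
buffers start at the IP header (the function name and offset macros are
hypothetical):

```c
#include <stdint.h>
#include <string.h>

#define IP_OFF_ID    4     /* identification field, 2 bytes */
#define IP_OFF_FRAG  6     /* flags and fragment offset, 2 bytes */
#define IP_OFF_SUM   10    /* header checksum, 2 bytes */
#define IP_FLAG_DF   0x40  /* "don't fragment" bit within byte 6 */
#define IP_MIN_HLEN  20    /* IPv4 header without options */

/*
 * Compare two IPv4 packets of the same length.
 * Like the patch, this modifies spkt when DF is set on the primary packet.
 * Returns 0 when the packets are considered equal.
 */
static int ip_compare_ignore_id(const uint8_t *ppkt, uint8_t *spkt, size_t len)
{
    if (len < IP_MIN_HLEN) {
        return -1;
    }
    if (ppkt[IP_OFF_FRAG] & IP_FLAG_DF) {
        /* Unfragmented packet: the id carries no meaning, so ignore it. */
        memcpy(spkt + IP_OFF_ID, ppkt + IP_OFF_ID, 2);
        /* The checksum differs whenever the ids differed. */
        memcpy(spkt + IP_OFF_SUM, ppkt + IP_OFF_SUM, 2);
    }
    return memcmp(ppkt, spkt, len);
}
```

Without the DF flag the id is needed for reassembly, so in that case a
difference in the id still counts as a mismatch.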

Signed-off-by: Zhang Chen 
Signed-off-by: Li Zhijian 
Signed-off-by: Wen Congyang 
---
 net/colo-compare.c | 174 +++--
 trace-events   |   4 ++
 2 files changed, 174 insertions(+), 4 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 942e326..9737ec6 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -18,6 +18,7 @@
 #include "qapi/qmp/qerror.h"
 #include "qapi/error.h"
 #include "net/net.h"
+#include "net/eth.h"
 #include "net/vhost_net.h"
 #include "qom/object_interfaces.h"
 #include "qemu/iov.h"
@@ -197,9 +198,158 @@ static int colo_packet_compare(Packet *ppkt, Packet *spkt)
 }
 }
 
-static int colo_packet_compare_all(Packet *spkt, Packet *ppkt)
+/*
+ * called from the compare thread on the primary
+ * for compare tcp packet
+ * compare_tcp copied from Dr. David Alan Gilbert's branch
+ */
+static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
+{
+struct tcphdr *ptcp, *stcp;
+int res;
+char *sdebug, *ddebug;
+
+trace_colo_compare_main("compare tcp");
+if (ppkt->size != spkt->size) {
+if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
+trace_colo_compare_main("pkt size not same");
+}
+return -1;
+}
+
+ptcp = (struct tcphdr *)ppkt->transport_layer;
+stcp = (struct tcphdr *)spkt->transport_layer;
+
+if (ptcp->th_seq != stcp->th_seq) {
+if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
+trace_colo_compare_main("pkt tcp seq not same");
+}
+return -1;
+}
+
+/*
+ * The 'identification' field in the IP header is *very* random
+ * it almost never matches.  Fudge this by ignoring differences in
+ * unfragmented packets; they'll normally sort themselves out if different
+ * anyway, and it should recover at the TCP level.
+ * An alternative would be to get both the primary and secondary to rewrite
+ * somehow; but that would need some sync traffic to sync the state
+ */
+if (ntohs(ppkt->ip->ip_off) & IP_DF) {
+spkt->ip->ip_id = ppkt->ip->ip_id;
+/* and the sum will be different if the IDs were different */
+spkt->ip->ip_sum = ppkt->ip->ip_sum;
+}
+
+res = memcmp(ppkt->data + ETH_HLEN, spkt->data + ETH_HLEN,
+(spkt->size - ETH_HLEN));
+
+if (res != 0 && trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
+sdebug = strdup(inet_ntoa(ppkt->ip->ip_src));
+ddebug = strdup(inet_ntoa(ppkt->ip->ip_dst));
+fprintf(stderr, "%s: src/dst: %s/%s p: seq/ack=%u/%u"
+" s: seq/ack=%u/%u res=%d flags=%x/%x\n", __func__,
+   sdebug, ddebug,
+   ntohl(ptcp->th_seq), ntohl(ptcp->th_ack),
+   ntohl(stcp->th_seq), ntohl(stcp->th_ack),
+   res, ptcp->th_flags, stcp->th_flags);
+
+trace_colo_compare_tcp_miscompare("Primary len", ppkt->size);
+qemu_hexdump((char *)ppkt->data, stderr, "colo-compare", ppkt->size);
+trace_colo_compare_tcp_miscompare("Secondary len", spkt->size);
+qemu_hexdump((char *)spkt->data, stderr, "colo-compare", spkt->size);
+
+g_free(sdebug);
+g_free(ddebug);
+}
+
+return res;
+}
+
+/*
+ * called from the compare thread on the primary
+ * for compare udp packet
+ */
+static int colo_packet_compare_udp(Packet *spkt, Packet *ppkt)
+{
+int ret;
+
+trace_colo_compare_main("compare udp");
+ret = colo_packet_compare(ppkt, spkt);
+
+if (ret) {
+trace_colo_compare_udp_miscompare("primary pkt size", ppkt->size);
+qemu_hexdump((char *)ppkt->data, stderr, "colo-compare", ppkt->size);
+trace_colo_compare_udp_miscompare("Secondary pkt size", spkt->size);
+qemu_hexdump((char *)spkt->data, stderr, "colo-compare", spkt->size);
+}
+
+return ret;
+}
+
+/*
+ * called from the compare thread on the primary
+ * for compare icmp packet
+ */
+static int colo_packet_compare_icmp(Packet *spkt, Packet *ppkt)
 {
-trace_colo_compare_main("compare all");
+int network_length;
+struct icmp *icmp_ppkt, *icmp_spkt;
+
+trace_colo_compare_main("compare icmp");
+network_length = ppkt->ip->ip_hl * 4;
+if (ppkt->size != spkt->size ||
+ppkt->size < network_length + ETH_HLEN) {
+return -1;
+}
+icmp_ppkt = (struct icmp *)(ppkt->data + network_length + ETH_HLEN);
+icmp_spkt = (struct icmp *)(spkt->data + network_length + ETH_HLEN);
+
+if ((icmp_ppkt->icmp_type == icmp_spkt->icmp_type) &&
+(icmp_ppkt->icmp_code == icmp_spkt->icmp_code)) {
+if (icmp_ppkt->icmp_type == ICMP_REDIRECT) {
+if (icmp_ppkt->icmp_gwaddr.s_addr !=
+icmp_spkt->icmp_gwaddr.s_addr) {
+trace_colo_compare

[Qemu-devel] [RFC PATCH V7 5/7] qemu-char: Fix context for g_source_attach()

2016-07-17 Thread Zhang Chen

We want to poll and handle the chardev in a thread
other than the main loop. But qemu_chr_add_handlers() only
works with the global default context, not the thread-default context.
So we use g_source_attach(xx, g_main_context_get_thread_default())
instead of g_source_attach(xx, NULL) to attach the g_source.
Comments from Jason.

Signed-off-by: Zhang Chen 
Signed-off-by: Jason Wang 
---
 io/channel.c | 2 +-
 qemu-char.c  | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/io/channel.c b/io/channel.c
index 692eb17..cd25677 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -146,7 +146,7 @@ guint qio_channel_add_watch(QIOChannel *ioc,
 
 g_source_set_callback(source, (GSourceFunc)func, user_data, notify);
 
-id = g_source_attach(source, NULL);
+id = g_source_attach(source, g_main_context_get_thread_default());
 g_source_unref(source);
 
 return id;
diff --git a/qemu-char.c b/qemu-char.c
index b597ee1..ed7e1f7 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -860,7 +860,7 @@ static gboolean io_watch_poll_prepare(GSource *source, gint *timeout_)
 iwp->src = qio_channel_create_watch(
 iwp->ioc, G_IO_IN | G_IO_ERR | G_IO_HUP | G_IO_NVAL);
 g_source_set_callback(iwp->src, iwp->fd_read, iwp->opaque, NULL);
-g_source_attach(iwp->src, NULL);
+g_source_attach(iwp->src, g_main_context_get_thread_default());
 } else {
 g_source_destroy(iwp->src);
 g_source_unref(iwp->src);
@@ -919,7 +919,7 @@ static guint io_add_watch_poll(QIOChannel *ioc,
 iwp->fd_read = (GSourceFunc) fd_read;
 iwp->src = NULL;
 
-tag = g_source_attach(&iwp->parent, NULL);
+tag = g_source_attach(&iwp->parent, g_main_context_get_thread_default());
 g_source_unref(&iwp->parent);
 return tag;
 }
@@ -3983,7 +3983,7 @@ int qemu_chr_fe_add_watch(CharDriverState *s, GIOCondition cond,
 }
 
 g_source_set_callback(src, (GSourceFunc)func, user_data, NULL);
-tag = g_source_attach(src, NULL);
+tag = g_source_attach(src, g_main_context_get_thread_default());
 g_source_unref(src);
 
 return tag;
-- 
2.7.4

[Qemu-devel] [RFC PATCH V7 2/7] colo-base: add colo-base to define and handle packet

2016-07-17 Thread Zhang Chen

COLO-base is used by colo-compare and filter-rewriter.
It lets them share common data structures, such as the net packet,
and other functions.
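
The core of the shared packet handling is locating the network and transport
layers inside a captured frame. A simplified sketch, assuming an untagged
Ethernet II frame carrying IPv4 (no VLAN handling, unlike the
eth_get_l2_hdr_length() helper used in the patch); the Pkt type and
parse_layers() are hypothetical names:

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN     14      /* untagged Ethernet header length */
#define ETH_P_IP     0x0800  /* EtherType for IPv4 */
#define IP_MIN_HLEN  20      /* IPv4 header without options */

typedef struct {
    const uint8_t *data;             /* start of the Ethernet frame */
    int size;                        /* total frame length in bytes */
    const uint8_t *network_layer;    /* set by parse_layers() */
    const uint8_t *transport_layer;  /* set by parse_layers() */
} Pkt;

/* Returns 0 on success, 1 if the frame is not a well-formed IPv4 frame. */
static int parse_layers(Pkt *pkt)
{
    if (pkt->size < ETH_HLEN + IP_MIN_HLEN) {
        return 1;
    }
    /* EtherType is stored big-endian in bytes 12 and 13. */
    uint16_t ethertype = (uint16_t)((pkt->data[12] << 8) | pkt->data[13]);
    if (ethertype != ETH_P_IP) {
        return 1;
    }
    pkt->network_layer = pkt->data + ETH_HLEN;

    /* The low nibble of the first IP byte is the header length in words. */
    int ip_hdr_len = (pkt->network_layer[0] & 0x0f) * 4;
    if (ip_hdr_len < IP_MIN_HLEN || pkt->size < ETH_HLEN + ip_hdr_len) {
        return 1;
    }
    pkt->transport_layer = pkt->network_layer + ip_hdr_len;
    return 0;
}
```

All length checks happen before any header byte is read, so a truncated frame
is rejected instead of being dereferenced.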

Signed-off-by: Zhang Chen 
Signed-off-by: Li Zhijian 
Signed-off-by: Wen Congyang 
---
 net/Makefile.objs  |   1 +
 net/colo-base.c|  74 +
 net/colo-base.h|  38 +
 net/colo-compare.c | 119 -
 trace-events   |   3 ++
 5 files changed, 233 insertions(+), 2 deletions(-)
 create mode 100644 net/colo-base.c
 create mode 100644 net/colo-base.h

diff --git a/net/Makefile.objs b/net/Makefile.objs
index ba92f73..119589f 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -17,3 +17,4 @@ common-obj-y += filter.o
 common-obj-y += filter-buffer.o
 common-obj-y += filter-mirror.o
 common-obj-y += colo-compare.o
+common-obj-y += colo-base.o
diff --git a/net/colo-base.c b/net/colo-base.c
new file mode 100644
index 000..f5d5de9
--- /dev/null
+++ b/net/colo-base.c
@@ -0,0 +1,74 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * Author: Zhang Chen 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "net/colo-base.h"
+
+int parse_packet_early(Packet *pkt)
+{
+int network_length;
+uint8_t *data = pkt->data;
+uint16_t l3_proto;
+ssize_t l2hdr_len = eth_get_l2_hdr_length(data);
+
+if (pkt->size < ETH_HLEN) {
+error_report("pkt->size < ETH_HLEN");
+return 1;
+}
+pkt->network_layer = data + ETH_HLEN;
+l3_proto = eth_get_l3_proto(data, l2hdr_len);
+if (l3_proto != ETH_P_IP) {
+return 1;
+}
+
+network_length = pkt->ip->ip_hl * 4;
+if (pkt->size < ETH_HLEN + network_length) {
+error_report("pkt->size < network_layer + network_length");
+return 1;
+}
+pkt->transport_layer = pkt->network_layer + network_length;
+if (!pkt->transport_layer) {
+error_report("pkt->transport_layer is invalid");
+return 1;
+}
+
+return 0;
+}
+
+Packet *packet_new(const void *data, int size)
+{
+Packet *pkt = g_slice_new(Packet);
+
+pkt->data = g_memdup(data, size);
+pkt->size = size;
+
+return pkt;
+}
+
+void packet_destroy(void *opaque, void *user_data)
+{
+Packet *pkt = opaque;
+
+g_free(pkt->data);
+g_slice_free(Packet, pkt);
+}
+
+/*
+ * Clear hashtable, stop this hash growing really huge
+ */
+void connection_hashtable_reset(GHashTable *connection_track_table)
+{
+g_hash_table_remove_all(connection_track_table);
+}
diff --git a/net/colo-base.h b/net/colo-base.h
new file mode 100644
index 000..48835e7
--- /dev/null
+++ b/net/colo-base.h
@@ -0,0 +1,38 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * Author: Zhang Chen 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_COLO_BASE_H
+#define QEMU_COLO_BASE_H
+
+#include "slirp/slirp.h"
+#include "qemu/jhash.h"
+
+#define HASHTABLE_MAX_SIZE 16384
+
+typedef struct Packet {
+void *data;
+union {
+uint8_t *network_layer;
+struct ip *ip;
+};
+uint8_t *transport_layer;
+int size;
+} Packet;
+
+int parse_packet_early(Packet *pkt);
+void connection_hashtable_reset(GHashTable *connection_track_table);
+Packet *packet_new(const void *data, int size);
+void packet_destroy(void *opaque, void *user_data);
+
+#endif /* QEMU_COLO_BASE_H */
diff --git a/net/colo-compare.c b/net/colo-compare.c
index 0402958..7c52cc8 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -27,13 +27,38 @@
 #include "sysemu/char.h"
 #include "qemu/sockets.h"
 #include "qapi-visit.h"
+#include "net/colo-base.h"
+#include "trace.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
 OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
 
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
+#define MAX_QUEUE_SIZE 1024
 
+/*
+  + CompareState ++
+  |   |
+  +---+   +---+ +---+
+  |conn list  +--->conn   +->conn   |
+  +---+   +---+ +---+
+  |   | |   | |  |
+  +---+ +---v+  +---v++---v+ +---v+
+|primary |  |secondary|primary | |secondary
+

[Qemu-devel] [RFC PATCH V7 1/7] colo-compare: introduce colo compare initialization

2016-07-17 Thread Zhang Chen

This is a COLO net ASCII figure:

[Figure: ASCII diagram, garbled in this archive. It shows "Primary qemu" and
"Secondary qemu" side by side, each containing a guest and a netfilter chain.
Labels that survive on the primary side: filter mirror, filter redirector,
colo compare, filter redirector, tx, rx, "filter execute order". Labels that
survive on the secondary side: filter redirector, TCP rewriter (adjust ack,
adjust seq), filter redirector, tx, all, rx, "filter execute order". Below the
two boxes a tap device carries "guest receive" and "guest send" traffic.]

  NOTE: filter direction is rx/tx/all
  rx: receive packets sent to the netdev
  tx: receive packets sent by the netdev

In COLO-compare:
Packets coming from the primary char indev will be sent to outdev.
Packets coming from the secondary char dev will be dropped.
colo-compare needs two input chardevs and one output chardev:
primary_in=chardev1-id
secondary_in=chardev2-id
outdev=chardev3-id

usage:

primary:
-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
-device
