Re: [Qemu-devel] [PATCH v3 1/3] uhci: fix segfault when hot-unplugging uhci controller

2015-03-18 Thread Gerd Hoffmann
  Hi,

> -static void uhci_reset(void *opaque)
> +static void uhci_reset(DeviceState *dev)
>  {
> -UHCIState *s = opaque;
> +PCIDevice *d = PCI_DEVICE(dev);
> +UHCIState *s = DO_UPCAST(UHCIState, dev, d);

Uh, oh, DO_UPCAST() is long deprecated.  There are other instances of
this in the uhci emulation though, so we need a cleanup & qom-ify pass
for the code anyway.  So I think it's ok for a bugfix patch.

I'll queue it up (and the other two too of course).

cheers,
  Gerd





Re: [Qemu-devel] [Qemu-block] [PATCH] qtest/ahci: Fix clang 3.5.0 error

2015-03-18 Thread Markus Armbruster
John Snow  writes:

> A thinko that clang 3.5.0 caught.
> Thankfully does not introduce any new failures.
>
> Signed-off-by: John Snow 

How you caught the bug is interesting enough to be mentioned in the
commit message, but the nature of the bug surely is more interesting
than that.  Suggest to change the subject to

qtest/ahci: Fix a bit mask expression

Perhaps it can be fixed up on commit.

Reviewed-by: Markus Armbruster 



Re: [Qemu-devel] [PATCH] add pci-bridge-seat

2015-03-18 Thread Gerd Hoffmann
  Hi,

> Could you explain a bit more please? What does it mean that
> a bridge is a new seat? How is this used?

seat: kbd+mouse+display adapter.

multiseat: When you have multiple of them on a single machine. When
configured properly logind will show a login screen on each seat.

why a pci bridge: grouping all devices belonging to a seat by placing
them all below a pci bridge simplifies guest configuration (instead of a
list of devices you'll only need to say "this bridge and all devices
underneath it").

why a new pci bridge device: this allows to make the guest-side seat
configuration automatic with a udev rule.

more details: see docs/multiseat.txt

cheers,
  Gerd





Re: [Qemu-devel] [PATCH v3 1/3] uhci: fix segfault when hot-unplugging uhci controller

2015-03-18 Thread Gonglei
On 2015/3/18 15:02, Gerd Hoffmann wrote:
>   Hi,
> 
>> -static void uhci_reset(void *opaque)
>> +static void uhci_reset(DeviceState *dev)
>>  {
>> -UHCIState *s = opaque;
>> +PCIDevice *d = PCI_DEVICE(dev);
>> +UHCIState *s = DO_UPCAST(UHCIState, dev, d);
> 
> Uh, oh, DO_UPCAST() is long deprecated.  There are other instances of
> this in the uhci emulation though, so we need a cleanup & qom-ify pass
> for the code anyway.  So I think it's ok for a bugfix patch.
> 
Yes, I noticed that, but I haven't a good idea for qom-ifing uhci. :)
May we refer to the realization of ehci ?

> I'll queue it up (and the other two too of course).
> 
Thanks.

Regards,
-Gonglei






Re: [Qemu-devel] RFC: -object vs -chardev creation order

2015-03-18 Thread Paolo Bonzini


On 17/03/2015 20:08, Daniel P. Berrange wrote:
> A third option is to not process -object args in one go, instead process
> each type of object at a time. eg we'd first create all the
> -object tls-credential instances, then create the -chardev instances,
> then create the -object rng-egd instances. This is probably the least
> amount of work in short term, but not all that scalable, unless we do
> a catch-all default case, so we only need code up hacks for a few
> particular object types.
> 
> Thus my gut feeling is to do option 3, but I'd like other opinions before
> embarking on this

Yes, for now it is the best.

Another idea is to build objects lazily.  This requires adding a
"missing property" interface and making /objects implement it.  We can
think about it later.

Paolo



Re: [Qemu-devel] Duplicate definition of 'qemu_time' building git master

2015-03-18 Thread Paolo Bonzini


On 17/03/2015 21:36, B Cran wrote:
> When trying to build the latest git master on openSUSE 13.2, it failed with:
> 
> vl.c:711:15: error: ‘qemu_time’ redeclared as different kind of symbol
>  static time_t qemu_time(void)
>^
> In file included from qemu/include/block/aio.h:23:0,
>  from qemu/include/hw/hw.h:13,
>  from vl.c:62:
> qemu/include/qemu/timer.h:1005:16: note: previous declaration of
> ‘qemu_time’ was here
>  extern int64_t qemu_time, qemu_time_start;
> ^
>   CCbackends/rng-egd.o
>   CCbackends/rng-random.o
> qemu/rules.mak:57: recipe for target 'vl.o' failed
> make: *** [vl.o] Error 1
> make: *** Waiting for unfinished jobs
> 
> 
> The ./configure commandline I used (which has previously worked) was:
> 
> ./configure --prefix=~ --enable-sdl --disable-xen --enable-kvm
> --enable-curses --enable-fdt --disable-bluez --disable-user
> --enable-uuid --disable-vde --enable-linux-aio --enable-vhost-net
> --enable-spice --enable-libusb --enable-usb-redir --enable-lzo
> --enable-libssh2 --enable-profiler --enable-system
> --target-list=i386-softmmu,x86_64-softmmu
> 

A patch has been posted to fix this.  You can disable --enable-profiler,
you probably have never used it.

Paolo



Re: [Qemu-devel] [v2][PATCH 2/2] libxl: introduce gfx_passthru_kind

2015-03-18 Thread Chen, Tiejun

I think the _libxl_ message needs to be just "Unable to detect graphics
passthru kind". i.e. it can't/shouldn't reference anything to do with xl
config options etc (which would make no sense if libvirt was being
used).

That's not very user friendly though, so you may want to consider adding
a new specific error code for this case and returning it here, such that
xl or libvirt can then give a more comprehensible error message.


But,
args = libxl__build_device_model_args(gc, "stubdom-dm", guest_domid,
  guest_config, d_state, NULL);
|
	+ return libxl__build_device_model_args_new(gc, dm,guest_domid, 
guest_config, state, dm_state_fd);

|
+  Currently we always return "NULL" inside.

if (!args) {
ret = ERROR_FAIL;
goto out;
}

So I'm not very sure how to pass this new specific error code to xl/libvirt.




So looks the whole policy should be something like this,

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 5c40e84..5518759 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1953,8 +1953,27 @@ skip_vfb:
   xlu_cfg_replace_string (config, "spice_streaming_video",
   &b_info->u.hvm.spice.streaming_video, 0);
   xlu_cfg_get_defbool(config, "nographic",
&b_info->u.hvm.nographic, 0);
-xlu_cfg_get_defbool(config, "gfx_passthru",
-&b_info->u.hvm.gfx_passthru, 0);
+if (!xlu_cfg_get_long(config, "gfx_passthru", &l, 1)) {
+libxl_defbool_set(&b_info->u.hvm.gfx_passthru, l);
+} else if (!xlu_cfg_get_string(config, "gfx_passthru", &buf, 0)) {
+if (libxl_gfx_passthru_kind_from_string(buf,
+
&b_info->u.hvm.gfx_passthru_kind)) {
+fprintf(stderr,
+"ERROR: invalid value \"%s\" for
\"gfx_passthru\"\n",
+buf);
+exit (1);
+}
+libxl_defbool_set(&b_info->u.hvm.gfx_passthru, true);
+}


Up to here is fine.


Thanks but I guess I'd better to paste this whole patch to avoid I'm 
still missing something :)


---
 tools/libxl/libxl.h  |  6 ++
 tools/libxl/libxl_dm.c   | 25 ++---
 tools/libxl/libxl_internal.h |  5 +
 tools/libxl/libxl_types.idl  |  6 ++
 tools/libxl/xl_cmdimpl.c | 14 --
 5 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 6bbc52d..62b3ae5 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -720,6 +720,12 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, 
libxl_mac *src);

 #define LIBXL_HAVE_PSR_MBM 1
 #endif

+/*
+ * libxl_domain_build_info has the u.hvm.gfx_passthru_kind field and
+ * the libxl_gfx_passthru_kind enumeration is defined.
+*/
+#define LIBXL_HAVE_GFX_PASSTHRU_KIND
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 8599a6a..045d48a 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -710,9 +710,6 @@ static char ** 
libxl__build_device_model_args_new(libxl__gc *gc,

 flexarray_append(dm_args, "-net");
 flexarray_append(dm_args, "none");
 }
-if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
-flexarray_append(dm_args, "-gfx_passthru");
-}
 } else {
 if (!sdl && !vnc) {
 flexarray_append(dm_args, "-nographic");
@@ -757,6 +754,28 @@ static char ** 
libxl__build_device_model_args_new(libxl__gc *gc,

 machinearg, max_ram_below_4g);
 }
 }
+
+if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
+switch (b_info->u.hvm.gfx_passthru_kind) {
+case LIBXL_GFX_PASSTHRU_KIND_DEFAULT:
+if (libxl__is_igd_vga_passthru(gc, guest_config)) {
+machinearg = GCSPRINTF("%s,igd-passthru=on", 
machinearg);

+} else {
+LOG(ERROR, "Unable to detect graphics passthru kind,"
+" please set gfx_passthru_kind. See xl.cfg(5) 
for more"

+" information.\n");
+return NULL;
+}
+break;
+case LIBXL_GFX_PASSTHRU_KIND_IGD:
+machinearg = GCSPRINTF("%s,igd-passthru=on", machinearg);
+break;
+default:
+LOG(WARN, "gfx_passthru_kind is invalid so ignored.\n");
+break;
+}
+}
+
 flexarray_append(dm_args, machinearg);
 for (i = 0; b_info->extra_hvm && b_info->extra_hvm[i] != NULL; 
i++)

 flexarray_append(dm_args, b_info->extra_hvm[i]);
diff --git a/tools/libxl/libxl_internal.h

Re: [Qemu-devel] [PATCH] virtio-scsi-dataplane: fix memory leak in virtio_scsi_vring_init

2015-03-18 Thread Paolo Bonzini


On 18/03/2015 10:42, Bo Su wrote:
> if k->set_host_notifier failed, VirtIOSCSIVring *r will leak
> 
> Signed-off-by: Bo Su 
> ---
>  hw/scsi/virtio-scsi-dataplane.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
> index 3f40ff0..c069cd7 100644
> --- a/hw/scsi/virtio-scsi-dataplane.c
> +++ b/hw/scsi/virtio-scsi-dataplane.c
> @@ -45,7 +45,7 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI 
> *s,
>  {
>  BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s)));
>  VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> -VirtIOSCSIVring *r = g_slice_new(VirtIOSCSIVring);
> +VirtIOSCSIVring *r;
>  int rc;
>  
>  /* Set up virtqueue notify */
> @@ -56,6 +56,8 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI 
> *s,
>  s->dataplane_fenced = true;
>  return NULL;
>  }
> +
> +r = g_slice_new(VirtIOSCSIVring);
>  r->host_notifier = *virtio_queue_get_host_notifier(vq);
>  r->guest_notifier = *virtio_queue_get_guest_notifier(vq);
>  aio_set_event_notifier(s->ctx, &r->host_notifier, handler);
> 

Thanks, queued for 2.3.

Paolo



Re: [Qemu-devel] [PATCH v3 1/3] uhci: fix segfault when hot-unplugging uhci controller

2015-03-18 Thread Markus Armbruster
Gerd Hoffmann  writes:

>   Hi,
>
>> -static void uhci_reset(void *opaque)
>> +static void uhci_reset(DeviceState *dev)
>>  {
>> -UHCIState *s = opaque;
>> +PCIDevice *d = PCI_DEVICE(dev);
>> +UHCIState *s = DO_UPCAST(UHCIState, dev, d);
>
> Uh, oh, DO_UPCAST() is long deprecated.  There are other instances of
[...]

Only 378 instances left in the code %-}



Re: [Qemu-devel] [PATCH] raw-posix: Deprecate host floppy passthrough

2015-03-18 Thread Markus Armbruster
John Snow  writes:

> On 03/17/2015 12:02 PM, Markus Armbruster wrote:
>> Raise your hand if you have a physical floppy drive in a computer
>> you've powered on in 2015.  Okay, I see we got a few weirdos in the
>> audience.  That's okay, weirdos are welcome here.
>>
>
> May I introduce to you my reference Q35 machine:
> http://i.imgur.com/4gCnetj.jpg

Take good care of it, so it doesn't die on us before pc-q35 is done!



Re: [Qemu-devel] [PATCH] raw-posix: Deprecate host floppy passthrough

2015-03-18 Thread Gerd Hoffmann
On Di, 2015-03-17 at 17:02 +0100, Markus Armbruster wrote:
> Raise your hand if you have a physical floppy drive in a computer
> you've powered on in 2015.

/me raises the hand, my amd test box has one.

I was surprised when I got the machine a few years back.
And I never ever actually used the floppy drive in all those years.

> Kidding aside, media change detection doesn't fully work, isn't going
> to be fixed, and floppy passthrough just isn't earning its keep
> anymore.

If you really need floppies there is the option to go for usb.  Recent
machines don't have floppy connectors any more, so that is the only kind
of floppy device you can use anyway.  And you can usb passthrough it to
the guest.  Case closed.

Reviewed-by: Gerd Hoffmann 

cheers,
  Gerd






Re: [Qemu-devel] [PATCH v3 1/3] uhci: fix segfault when hot-unplugging uhci controller

2015-03-18 Thread Gonglei
On 2015/3/18 15:35, Markus Armbruster wrote:
> Gerd Hoffmann  writes:
> 
>>   Hi,
>>
>>> -static void uhci_reset(void *opaque)
>>> +static void uhci_reset(DeviceState *dev)
>>>  {
>>> -UHCIState *s = opaque;
>>> +PCIDevice *d = PCI_DEVICE(dev);
>>> +UHCIState *s = DO_UPCAST(UHCIState, dev, d);
>>
>> Uh, oh, DO_UPCAST() is long deprecated.  There are other instances of
> [...]
> 
> Only 378 instances left in the code %-}
> 
Maybe I can do some cleanup work for 2.4 ;)

Regards,
-Gonglei




Re: [Qemu-devel] [PATCH v3 2/3] ehci: fix segfault when hot-unplugging ehci controller

2015-03-18 Thread Gerd Hoffmann
On Mi, 2015-03-18 at 09:49 +0800, arei.gong...@huawei.com wrote:
> From: Gonglei 
> 
> When hot-unplugging the usb controllers (ehci/uhci),
> we have to clean all resouce of these devices,
> involved registered reset handler. Otherwise, it
> may cause NULL pointer access and/or segmentation fault
> if we reboot the guest os after hot-unplugging.
> 
> Let's hook up reset via DeviceClass->reset() and drop
> the qemu_register_reset() call. Then Qemu will register
> and unregister the reset handler automatically.

Fails "make check" (for aarch64).  My guess is the sysbus variants lost
the reset hookup.

cheers,
  Gerd





Re: [Qemu-devel] [RFC 0/5] Memory transaction attributes API

2015-03-18 Thread Edgar E. Iglesias
On Mon, Mar 16, 2015 at 05:20:17PM +, Peter Maydell wrote:
> This is an RFC patchset aimed at getting comment on
> some memory API changes to support "transaction attributes",
> ie sideband information that goes along with a memory read
> or write access to define things like ARM secure/nonsecure,
> CPU/transaction master ID, privileged/nonprivileged.

Hi Peter,

Generally I think the direction of this looks very good, thanks!

A few comments:

1. Would it make sense to instead of having the MemTxAttrs be
a uint64_t instead have them be a struct with bitfields?
We could still pass the struct by value I think and as long
as it doesn't grow large the compiler will emit similar or
the same code IIUC.

2. The series introduces the read and write _with_attrs. Another
approach could be to add a (for example) single .access callback.
The callback could take a structure pointer as input, e.g:

struct MemTx {
bool rw;
hwaddr addr;
int size;
uint64_t data;
MemTxAttrs attrs;
}

MemTxResult access(MemTx *tx)

The benefit I see is that this will make it easier for us in the
future to add new fields if needed.

Best regards,
Edgar


> 
> (See also previous discussion:
> https://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg01026.html )
> 
> Patch 1 is the API changes themselves:
> 1) new read_with_attrs and write_with_attrs fields in the
>MemoryRegionOps struct; the old read/write still exist
>for backwards compatibility, but devices that care about
>attributes can register with these function pointers instead
> 2) the functions return a success/failure status, so a device
>can actually fail bad transactions rather than merely
>returning bogus data. (This isn't wired up in this patchset
>but I include it to avoid revving the memory API twice.)
> 
> Patches 2, 3 and 4 then plumb the memory attribute parameters
> through the various functions, working upwards to being able
> to put them in the iotlb. Patch 5 implements the target-arm
> changes to provide a secure/nonsecure tx attribute based on
> the page table walk, as a demonstration.
> 
> There are obviously more APIs within QEMU for memory access
> functions which need to change to either always take a tx
> attribute, or to have extra with-tx-attribute versions of the
> functions. For the moment things are stubbed out with passing
> in "no attributes specified" values.
> 
> I've modelled the transaction attributes as a (typedefed)
> uint64_t, whose bits will be defined as we find requirements
> for them (the meaning will not be per-architecture). When
> we originally discussed this on-list, Edgar suggested making
> the attributes be a (pointer to a) struct; however I found
> the ownership/copying semantics on this too awkward, because
> the access path needs to take attributes set up in the TLB
> and then modify them according to details of the access
> actually being made before passing them to the device, so
> took the simpler implementation route.
> 
> I intend to continue working on this (filling in the gaps,
> etc), but wanted to send this series out early for comment
> on the memory API changes in particular.
> 
> thanks
> -- PMM
> 
> Peter Maydell (5):
>   memory: Define API for MemoryRegionOps to take attrs and return status
>   memory: Add MemTxAttrs argument to io_mem_read and io_mem_write
>   Make CPU iotlb a structure rather than a plain hwaddr
>   Add MemTxAttrs to the IOTLB
>   target-arm: Honour NS bits in page tables
> 
>  cputlb.c |  22 +++---
>  exec.c   |  29 +++---
>  hw/s390x/s390-pci-inst.c |   7 ++--
>  hw/vfio/pci.c|   4 +-
>  include/exec/cpu-defs.h  |  15 ++-
>  include/exec/exec-all.h  |   8 +++-
>  include/exec/memattrs.h  |  37 +
>  include/exec/memory.h|  22 ++
>  memory.c | 102 
> +--
>  softmmu_template.h   |  36 +
>  target-arm/helper.c  |  83 --
>  11 files changed, 288 insertions(+), 77 deletions(-)
>  create mode 100644 include/exec/memattrs.h
> 
> -- 
> 1.9.1
> 



Re: [Qemu-devel] [PATCH v3 2/3] ehci: fix segfault when hot-unplugging ehci controller

2015-03-18 Thread Gonglei
On 2015/3/18 16:23, Gerd Hoffmann wrote:
> On Mi, 2015-03-18 at 09:49 +0800, arei.gong...@huawei.com wrote:
>> From: Gonglei 
>>
>> When hot-unplugging the usb controllers (ehci/uhci),
>> we have to clean all resouce of these devices,
>> involved registered reset handler. Otherwise, it
>> may cause NULL pointer access and/or segmentation fault
>> if we reboot the guest os after hot-unplugging.
>>
>> Let's hook up reset via DeviceClass->reset() and drop
>> the qemu_register_reset() call. Then Qemu will register
>> and unregister the reset handler automatically.
> 
> Fails "make check" (for aarch64).  My guess is the sysbus variants lost
> the reset hookup.
> 
Actually, these fails were introduced by the follow patch:

commit c3cf77cb63b71618224129df41f114488e0f74e4
Author: David Gibson 
Date:   Wed Feb 18 16:01:01 2015 +1100

Make sysbus EHCI devices ARM only by default

A number of ARM embedded boards include EHCI USB host controllers which
appear as directly mapped devices, rather than sitting on a PCI bus.

At present code to emulate such devices is included whenever EHCI support
is included.  This patch adjusts teh config options to only include them
in builds targetting ARM by default.

But on the other hand, the sysbus variants lost the reset hookup is a real bug,
I will fix them in the next version. thanks!

Regards,
-Gonglei






[Qemu-devel] [v7][PATCH 00/10] xen: add Intel IGD passthrough support

2015-03-18 Thread Tiejun Chen
v7:

* Instead of "-gfx_passthru" we'd like to make that a machine
  option, "-machine xxx,igd-passthru=on"" 
* try to make something as common shared by others like KvmGT in
  the future
* Just read those real value from host bridge pci
  configuration space when create host bridge then put in dev->config.

v6:

* Drop introducing a new machine specific to IGD passthrough
* Try to share some codes from KVM stuff in qemu to retrive VGA BIOS
* Currently IGD drivers always need to access PCH by 1f.0. But we
  don't want to poke that directly to get ID, and although in real
  world different GPU should have different PCH. But actually the
  different PCH DIDs likely map to different PCH SKUs. We do the
  same thing for the GPU. For PCH, the different SKUs are going to
  be all the same silicon design and implementation, just different
  features turn on and off with fuses. The SW interfaces should be
  consistent across all SKUs in a given family (eg LPT). But just
  same features may not be supported.
 
  Most of these different PCH features probably don't matter to the
  Gfx driver, but obviously any difference in display port connections
  will so it should be fine with any PCH in case of passthrough.
 
  So currently use one PCH version, 0x8c4e, to cover all HSW(Haswell)
  scenarios, 0x9cc3 for BDW(Broadwell).
* Drop igd write ops since its fine to emulate that, and we also shrink
  those igd read ops as necessary.
* Rebase and cleanup all patches.

v5:

* Simplify to make sure its really inherited from the standard one in patch #3
* Then drop the original patch #3

v4:

* Rebase on latest tree
* Drop patch #2
* Regenerate patches after Michael introduce patch #1
* We need to use this pci_type as a index to reuse I440FX_PCI_DEVICE()
* Test: boot with a preinstalled winxp
  ./i386-softmmu/qemu-system-i386 -hda winxp-32.img -m 2560 -boot c -machine pc

v3:

* Drop patch #4
* Add one patch #1 from Michael
* Rebase
* In./i386-softmmu/qemu-system-i386 -hda test.img -m 2560 -boot c -machine pc

v2:

* Fix some coding style
* New patch to separate i440fx_init
* Just add prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough
* Based on patch #2 to regenerate
* Unify prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough like patch #3
* Test: boot with a preinstalled ubuntu 14.04
  ./i386-softmmu/qemu-system-i386 -hda test.img -m 2560 -boot c -machine pc

As we discussed we need to create a separate machine to support current
IGD passthrough.


Michael S. Tsirkin (1):
  i440fx: make types configurable at run-time
 
Tiejun Chen (9):
  pc_init1: pass parameters just with types
  piix: create host bridge to passthrough
  hw/pci-assign: split pci-assign.c
  xen, gfx passthrough: basic graphics passthrough support
  xen, gfx passthrough: retrieve VGA BIOS to work
  igd gfx passthrough: create a isa bridge
  xen, gfx passthrough: register a isa bridge
  xen, gfx passthrough: register host bridge specific to passthrough
  xen, gfx passthrough: add opregion mapping
 
 hw/core/machine.c |  20 +++
 hw/i386/Makefile.objs |   1 +
 hw/i386/kvm/pci-assign.c  |  82 +-
 hw/i386/pc_piix.c | 149 ++-
 hw/i386/pci-assign-load-rom.c |  93 
 hw/pci-host/piix.c|  91 +++-
 hw/xen/Makefile.objs  |   1 +
 hw/xen/xen-host-pci-device.c  |   5 +
 hw/xen/xen-host-pci-device.h  |   1 +
 hw/xen/xen_pt.c   |  32 
 hw/xen/xen_pt.h   |  21 ++-
 hw/xen/xen_pt_config_init.c   |  52 ++-
 hw/xen/xen_pt_graphics.c  | 272 ++
 include/hw/boards.h   |   1 +
 include/hw/i386/pc.h  |   8 +-
 include/hw/pci/pci-assign.h   |  27 
 include/hw/xen/xen.h  |   1 +
 qemu-options.hx   |   3 +
 vl.c  |  10 ++
 19 files changed, 778 insertions(+), 92 deletions(-)
 create mode 100644 hw/i386/pci-assign-load-rom.c
 create mode 100644 hw/xen/xen_pt_graphics.c
 create mode 100644 include/hw/pci/pci-assign.h

Thanks
Tiejun



[Qemu-devel] [v7][PATCH 01/10] i440fx: make types configurable at run-time

2015-03-18 Thread Tiejun Chen
From: "Michael S. Tsirkin" 

IGD passthrough wants to supply a different pci and
host devices, inheriting i440fx devices. Make types
configurable.

Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Tiejun Chen 
---
 hw/i386/pc_piix.c| 4 +++-
 hw/pci-host/piix.c   | 9 -
 include/hw/i386/pc.h | 6 +-
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 36c69d7..07faec9 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -202,7 +202,9 @@ static void pc_init1(MachineState *machine,
 }
 
 if (pci_enabled) {
-pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, &isa_bus, gsi,
+pci_bus = i440fx_init(TYPE_I440FX_PCI_HOST_BRIDGE,
+  TYPE_I440FX_PCI_DEVICE,
+  &i440fx_state, &piix3_devfn, &isa_bus, gsi,
   system_memory, system_io, machine->ram_size,
   below_4g_mem_size,
   above_4g_mem_size,
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 723836f..c812eaa 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -40,7 +40,6 @@
  * http://download.intel.com/design/chipsets/datashts/29054901.pdf
  */
 
-#define TYPE_I440FX_PCI_HOST_BRIDGE "i440FX-pcihost"
 #define I440FX_PCI_HOST_BRIDGE(obj) \
 OBJECT_CHECK(I440FXState, (obj), TYPE_I440FX_PCI_HOST_BRIDGE)
 
@@ -91,7 +90,6 @@ typedef struct PIIX3State {
 MemoryRegion rcr_mem;
 } PIIX3State;
 
-#define TYPE_I440FX_PCI_DEVICE "i440FX"
 #define I440FX_PCI_DEVICE(obj) \
 OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)
 
@@ -304,7 +302,8 @@ static void i440fx_realize(PCIDevice *dev, Error **errp)
 cpu_smm_register(&i440fx_set_smm, d);
 }
 
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
+PCIBus *i440fx_init(const char *host_type, const char *pci_type,
+PCII440FXState **pi440fx_state,
 int *piix3_devfn,
 ISABus **isa_bus, qemu_irq *pic,
 MemoryRegion *address_space_mem,
@@ -324,7 +323,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
 unsigned i;
 I440FXState *i440fx;
 
-dev = qdev_create(NULL, TYPE_I440FX_PCI_HOST_BRIDGE);
+dev = qdev_create(NULL, host_type);
 s = PCI_HOST_BRIDGE(dev);
 b = pci_bus_new(dev, NULL, pci_address_space,
 address_space_io, 0, TYPE_PCI_BUS);
@@ -332,7 +331,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
 object_property_add_child(qdev_get_machine(), "i440fx", OBJECT(dev), NULL);
 qdev_init_nofail(dev);
 
-d = pci_create_simple(b, 0, TYPE_I440FX_PCI_DEVICE);
+d = pci_create_simple(b, 0, pci_type);
 *pi440fx_state = I440FX_PCI_DEVICE(d);
 f = *pi440fx_state;
 f->system_memory = address_space_mem;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 1b35168..99dc26b 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -230,7 +230,11 @@ extern int no_hpet;
 struct PCII440FXState;
 typedef struct PCII440FXState PCII440FXState;
 
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn,
+#define TYPE_I440FX_PCI_HOST_BRIDGE "i440FX-pcihost"
+#define TYPE_I440FX_PCI_DEVICE "i440FX"
+
+PCIBus *i440fx_init(const char *host_type, const char *pci_type,
+PCII440FXState **pi440fx_state, int *piix_devfn,
 ISABus **isa_bus, qemu_irq *pic,
 MemoryRegion *address_space_mem,
 MemoryRegion *address_space_io,
-- 
1.9.1




[Qemu-devel] [v7][PATCH 06/10] xen, gfx passthrough: retrieve VGA BIOS to work

2015-03-18 Thread Tiejun Chen
Now we retrieve VGA bios like kvm stuff in qemu but we need to
fix Device Identification in case if its not matched with the
real IGD device since Seabios is always trying to compare this
ID to work out VGA BIOS.

Signed-off-by: Tiejun Chen 
---
 hw/xen/xen_pt.c  | 10 ++
 hw/xen/xen_pt.h  |  5 +++
 hw/xen/xen_pt_graphics.c | 79 
 3 files changed, 94 insertions(+)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 1d78021..fcc9f1c 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -672,6 +672,16 @@ static int xen_pt_initfn(PCIDevice *d)
 s->memory_listener = xen_pt_memory_listener;
 s->io_listener = xen_pt_io_listener;
 
+/* Setup VGA bios for passthrough GFX */
+if ((s->real_device.domain == 0) && (s->real_device.bus == 0) &&
+(s->real_device.dev == 2) && (s->real_device.func == 0)) {
+if (xen_pt_setup_vga(s, &s->real_device) < 0) {
+XEN_PT_ERR(d, "Setup VGA BIOS of passthrough GFX failed!\n");
+xen_host_pci_device_put(&s->real_device);
+return -1;
+}
+}
+
 /* Handle real device's MMIO/PIO BARs */
 xen_pt_register_regions(s);
 
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 3325387..d9b1ce0 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -298,6 +298,11 @@ static inline bool 
xen_pt_has_msix_mapping(XenPCIPassthroughState *s, int bar)
 return s->msix && s->msix->bar_index == bar;
 }
 
+extern void *pci_assign_dev_load_option_rom(PCIDevice *dev,
+struct Object *owner, int *size,
+unsigned int domain,
+unsigned int bus, unsigned int 
slot,
+unsigned int function);
 extern bool has_igd_gfx_passthru;
 static inline bool is_igd_vga_passthrough(XenHostPCIDevice *dev)
 {
diff --git a/hw/xen/xen_pt_graphics.c b/hw/xen/xen_pt_graphics.c
index 9b3df81..3232296 100644
--- a/hw/xen/xen_pt_graphics.c
+++ b/hw/xen/xen_pt_graphics.c
@@ -109,3 +109,82 @@ int xen_pt_unregister_vga_regions(XenHostPCIDevice *dev)
 
 return 0;
 }
+
+static void *get_vgabios(XenPCIPassthroughState *s, int *size,
+   XenHostPCIDevice *dev)
+{
+return pci_assign_dev_load_option_rom(&s->dev, OBJECT(&s->dev), size,
+  dev->domain, dev->bus,
+  dev->dev, dev->func);
+}
+
+/* Refer to Seabios. */
+struct rom_header {
+uint16_t signature;
+uint8_t size;
+uint8_t initVector[4];
+uint8_t reserved[17];
+uint16_t pcioffset;
+uint16_t pnpoffset;
+} __attribute__((packed));
+
+struct pci_data {
+uint32_t signature;
+uint16_t vendor;
+uint16_t device;
+uint16_t vitaldata;
+uint16_t dlen;
+uint8_t drevision;
+uint8_t class_lo;
+uint16_t class_hi;
+uint16_t ilen;
+uint16_t irevision;
+uint8_t type;
+uint8_t indicator;
+uint16_t reserved;
+} __attribute__((packed));
+
+int xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev)
+{
+unsigned char *bios = NULL;
+struct rom_header *rom;
+int bios_size;
+char *c = NULL;
+char checksum = 0;
+uint32_t len = 0;
+struct pci_data *pd = NULL;
+
+if (!is_igd_vga_passthrough(dev)) {
+return -1;
+}
+
+bios = get_vgabios(s, &bios_size, dev);
+if (!bios) {
+XEN_PT_ERR(&s->dev, "VGA: Can't getting VBIOS!\n");
+return -1;
+}
+
+/* Currently we fixed this address as a primary. */
+rom = (struct rom_header *)bios;
+pd = (void *)(bios + (unsigned char)rom->pcioffset);
+
+/* We may need to fixup Device Identification. */
+if (pd->device != s->real_device.device_id) {
+pd->device = s->real_device.device_id;
+
+len = rom->size * 512;
+/* Then adjust the bios checksum */
+for (c = (char *)bios; c < ((char *)bios + len); c++) {
+checksum += *c;
+}
+if (checksum) {
+bios[len - 1] -= checksum;
+XEN_PT_LOG(&s->dev, "vga bios checksum is adjusted %x!\n",
+   checksum);
+}
+}
+
+/* Currently we fixed this address as a primary for legacy BIOS. */
+cpu_physical_memory_rw(0xc, bios, bios_size, 1);
+return 0;
+}
-- 
1.9.1




[Qemu-devel] [v7][PATCH 05/10] xen, gfx passthrough: basic graphics passthrough support

2015-03-18 Thread Tiejun Chen
basic gfx passthrough support:
- add a vga type for gfx passthrough
- register/unregister legacy VGA I/O ports and MMIOs for passthrough GFX

Signed-off-by: Tiejun Chen 
Signed-off-by: Yang Zhang 
---
 hw/core/machine.c|  20 
 hw/xen/Makefile.objs |   1 +
 hw/xen/xen-host-pci-device.c |   5 ++
 hw/xen/xen-host-pci-device.h |   1 +
 hw/xen/xen_pt.c  |   4 ++
 hw/xen/xen_pt.h  |  10 +++-
 hw/xen/xen_pt_graphics.c | 111 +++
 include/hw/boards.h  |   1 +
 qemu-options.hx  |   3 ++
 vl.c |  10 
 10 files changed, 165 insertions(+), 1 deletion(-)
 create mode 100644 hw/xen/xen_pt_graphics.c

diff --git a/hw/core/machine.c b/hw/core/machine.c
index cb1185a..6ac3ee4 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -225,6 +225,20 @@ static void machine_set_usb(Object *obj, bool value, Error 
**errp)
 ms->usb = value;
 }
 
+static bool machine_get_igd_gfx_passthru(Object *obj, Error **errp)
+{
+MachineState *ms = MACHINE(obj);
+
+return ms->igd_gfx_passthru;
+}
+
+static void machine_set_igd_gfx_passthru(Object *obj, bool value, Error **errp)
+{
+MachineState *ms = MACHINE(obj);
+
+ms->igd_gfx_passthru = value;
+}
+
 static char *machine_get_firmware(Object *obj, Error **errp)
 {
 MachineState *ms = MACHINE(obj);
@@ -379,6 +393,12 @@ static void machine_initfn(Object *obj)
 object_property_set_description(obj, "usb",
 "Set on/off to enable/disable usb",
 NULL);
+object_property_add_bool(obj, "igd-passthru",
+ machine_get_igd_gfx_passthru,
+ machine_set_igd_gfx_passthru, NULL);
+object_property_set_description(obj, "igd-passthru",
+"Set on/off to enable/disable igd 
passthrou",
+NULL);
 object_property_add_str(obj, "firmware",
 machine_get_firmware,
 machine_set_firmware, NULL);
diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs
index a0ca0aa..a9ad7e7 100644
--- a/hw/xen/Makefile.objs
+++ b/hw/xen/Makefile.objs
@@ -3,3 +3,4 @@ common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o 
xen_devconfig.o
 
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_msi.o
+obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o 
xen_pt_msi.o xen_pt_graphics.o
diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index 743b37b..a54b7de 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -376,6 +376,11 @@ int xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t 
domain,
 goto error;
 }
 d->irq = v;
+rc = xen_host_pci_get_hex_value(d, "class", &v);
+if (rc) {
+goto error;
+}
+d->class_code = v;
 d->is_virtfn = xen_host_pci_dev_is_virtfn(d);
 
 return 0;
diff --git a/hw/xen/xen-host-pci-device.h b/hw/xen/xen-host-pci-device.h
index c2486f0..f1e1c30 100644
--- a/hw/xen/xen-host-pci-device.h
+++ b/hw/xen/xen-host-pci-device.h
@@ -25,6 +25,7 @@ typedef struct XenHostPCIDevice {
 
 uint16_t vendor_id;
 uint16_t device_id;
+uint32_t class_code;
 int irq;
 
 XenHostPCIIORegion io_regions[PCI_NUM_REGIONS - 1];
diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index f2893b2..1d78021 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -450,6 +450,7 @@ static int xen_pt_register_regions(XenPCIPassthroughState 
*s)
d->rom.size, d->rom.base_addr);
 }
 
+xen_pt_register_vga_regions(d);
 return 0;
 }
 
@@ -748,6 +749,7 @@ out:
 static void xen_pt_unregister_device(PCIDevice *d)
 {
 XenPCIPassthroughState *s = DO_UPCAST(XenPCIPassthroughState, dev, d);
+XenHostPCIDevice *host_dev = &s->real_device;
 uint8_t machine_irq = s->machine_irq;
 uint8_t intx = xen_pt_pci_intx(s);
 int rc;
@@ -791,6 +793,8 @@ static void xen_pt_unregister_device(PCIDevice *d)
 /* delete all emulated config registers */
 xen_pt_config_delete(s);
 
+xen_pt_unregister_vga_regions(host_dev);
+
 memory_listener_unregister(&s->memory_listener);
 memory_listener_unregister(&s->io_listener);
 
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 942dc60..3325387 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -298,5 +298,13 @@ static inline bool 
xen_pt_has_msix_mapping(XenPCIPassthroughState *s, int bar)
 return s->msix && s->msix->bar_index == bar;
 }
 
-
+extern bool has_igd_gfx_passthru;
+static inline bool is_igd_vga_passthrough(XenHostPCIDevice *dev)
+{
+return (has_igd_gfx_passthru
+&& ((dev->class_code >> 0x8) == PCI_CLASS_DISPLAY_VGA));
+}
+int xen_pt_register_vga_regions(XenHostPCIDevice *dev);
+int xen_pt_unregister_vga_regions(Xen

[Qemu-devel] [v7][PATCH 08/10] xen, gfx passthrough: register a isa bridge

2015-03-18 Thread Tiejun Chen
Currently we just register this isa bridge when we use IGD
passthrough in Xen side.

Signed-off-by: Tiejun Chen 
---
 hw/xen/xen_pt.c  | 18 ++
 include/hw/xen/xen.h |  1 +
 2 files changed, 19 insertions(+)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index fcc9f1c..2d5cebb 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -632,6 +632,21 @@ static const MemoryListener xen_pt_io_listener = {
 .priority = 10,
 };
 
+static void
+xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
+  XenHostPCIDevice *dev)
+{
+uint16_t gpu_dev_id;
+PCIDevice *d = &s->dev;
+
+if (!is_igd_vga_passthrough(dev)) {
+return;
+}
+
+gpu_dev_id = dev->device_id;
+igd_passthrough_isa_bridge_create(d->bus, gpu_dev_id);
+}
+
 /* init */
 
 static int xen_pt_initfn(PCIDevice *d)
@@ -680,6 +695,9 @@ static int xen_pt_initfn(PCIDevice *d)
 xen_host_pci_device_put(&s->real_device);
 return -1;
 }
+
+/* Register ISA bridge for passthrough GFX. */
+xen_igd_passthrough_isa_bridge_create(s, &s->real_device);
 }
 
 /* Handle real device's MMIO/PIO BARs */
diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
index 4356af4..703148e 100644
--- a/include/hw/xen/xen.h
+++ b/include/hw/xen/xen.h
@@ -51,4 +51,5 @@ void xen_register_framebuffer(struct MemoryRegion *mr);
 #  define HVM_MAX_VCPUS 32
 #endif
 
+extern void igd_passthrough_isa_bridge_create(PCIBus *bus, uint16_t 
gpu_dev_id);
 #endif /* QEMU_HW_XEN_H */
-- 
1.9.1




[Qemu-devel] [v7][PATCH 07/10] igd gfx passthrough: create a isa bridge

2015-03-18 Thread Tiejun Chen
Currently IGD drivers always need to access PCH by 1f.0. But we
don't want to poke that directly to get ID, and although in real
world different GPU should have different PCH. But actually the
different PCH DIDs likely map to different PCH SKUs. We do the
same thing for the GPU. For PCH, the different SKUs are going to
be all the same silicon design and implementation, just different
features turn on and off with fuses. The SW interfaces should be
consistent across all SKUs in a given family (eg LPT). But just
same features may not be supported.

Most of these different PCH features probably don't matter to the
Gfx driver, but obviously any difference in display port connections
will so it should be fine with any PCH in case of passthrough.

So currently use one PCH version, 0x8c4e, to cover all HSW(Haswell)
scenarios, 0x9cc3 for BDW(Broadwell).

Signed-off-by: Tiejun Chen 
---
 hw/i386/pc_piix.c | 113 ++
 1 file changed, 113 insertions(+)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index cea3a5c..8fbfc09 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -956,6 +956,119 @@ static QEMUMachine pc_machine_v0_10 = {
 .hw_version = "0.10",
 };
 
+typedef struct {
+uint16_t gpu_device_id;
+uint16_t pch_device_id;
+uint8_t pch_revision_id;
+} IGDDeviceIDInfo;
+
+/* In real world different GPU should have different PCH. But actually
+ * the different PCH DIDs likely map to different PCH SKUs. We do the
+ * same thing for the GPU. For PCH, the different SKUs are going to be
+ * all the same silicon design and implementation, just different
+ * features turn on and off with fuses. The SW interfaces should be
+ * consistent across all SKUs in a given family (eg LPT). But just same
+ * features may not be supported.
+ *
+ * Most of these different PCH features probably don't matter to the
+ * Gfx driver, but obviously any difference in display port connections
+ * will so it should be fine with any PCH in case of passthrough.
+ *
+ * So currently use one PCH version, 0x8c4e, to cover all HSW(Haswell)
+ * scenarios, 0x9cc3 for BDW(Broadwell).
+ */
+static const IGDDeviceIDInfo igd_combo_id_infos[] = {
+/* HSW Classic */
+{0x0402, 0x8c4e, 0x04}, /* HSWGT1D, HSWD_w7 */
+{0x0406, 0x8c4e, 0x04}, /* HSWGT1M, HSWM_w7 */
+{0x0412, 0x8c4e, 0x04}, /* HSWGT2D, HSWD_w7 */
+{0x0416, 0x8c4e, 0x04}, /* HSWGT2M, HSWM_w7 */
+{0x041E, 0x8c4e, 0x04}, /* HSWGT15D, HSWD_w7 */
+/* HSW ULT */
+{0x0A06, 0x8c4e, 0x04}, /* HSWGT1UT, HSWM_w7 */
+{0x0A16, 0x8c4e, 0x04}, /* HSWGT2UT, HSWM_w7 */
+{0x0A26, 0x8c4e, 0x06}, /* HSWGT3UT, HSWM_w7 */
+{0x0A2E, 0x8c4e, 0x04}, /* HSWGT3UT28W, HSWM_w7 */
+{0x0A1E, 0x8c4e, 0x04}, /* HSWGT2UX, HSWM_w7 */
+{0x0A0E, 0x8c4e, 0x04}, /* HSWGT1ULX, HSWM_w7 */
+/* HSW CRW */
+{0x0D26, 0x8c4e, 0x04}, /* HSWGT3CW, HSWM_w7 */
+{0x0D22, 0x8c4e, 0x04}, /* HSWGT3CWDT, HSWD_w7 */
+/* HSW Server */
+{0x041A, 0x8c4e, 0x04}, /* HSWSVGT2, HSWD_w7 */
+/* HSW SRVR */
+{0x040A, 0x8c4e, 0x04}, /* HSWSVGT1, HSWD_w7 */
+/* BSW */
+{0x1606, 0x9cc3, 0x03}, /* BDWULTGT1, BDWM_w7 */
+{0x1616, 0x9cc3, 0x03}, /* BDWULTGT2, BDWM_w7 */
+{0x1626, 0x9cc3, 0x03}, /* BDWULTGT3, BDWM_w7 */
+{0x160E, 0x9cc3, 0x03}, /* BDWULXGT1, BDWM_w7 */
+{0x161E, 0x9cc3, 0x03}, /* BDWULXGT2, BDWM_w7 */
+{0x1602, 0x9cc3, 0x03}, /* BDWHALOGT1, BDWM_w7 */
+{0x1612, 0x9cc3, 0x03}, /* BDWHALOGT2, BDWM_w7 */
+{0x1622, 0x9cc3, 0x03}, /* BDWHALOGT3, BDWM_w7 */
+{0x162B, 0x9cc3, 0x03}, /* BDWHALO28W, BDWM_w7 */
+{0x162A, 0x9cc3, 0x03}, /* BDWGT3WRKS, BDWM_w7 */
+{0x162D, 0x9cc3, 0x03}, /* BDWGT3SRVR, BDWM_w7 */
+};
+
+static void isa_bridge_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+dc->desc= "ISA bridge faked to support IGD PT";
+k->vendor_id= PCI_VENDOR_ID_INTEL;
+k->class_id = PCI_CLASS_BRIDGE_ISA;
+};
+
+static TypeInfo isa_bridge_info = {
+.name  = "igd-passthrough-isa-bridge",
+.parent= TYPE_PCI_DEVICE,
+.instance_size = sizeof(PCIDevice),
+.class_init = isa_bridge_class_init,
+};
+
+static void pt_graphics_register_types(void)
+{
+type_register_static(&isa_bridge_info);
+}
+type_init(pt_graphics_register_types)
+
+void igd_passthrough_isa_bridge_create(PCIBus *bus, uint16_t gpu_dev_id)
+{
+struct PCIDevice *bridge_dev;
+int i, num;
+uint16_t pch_dev_id = 0x;
+uint8_t pch_rev_id;
+
+num = ARRAY_SIZE(igd_combo_id_infos);
+for (i = 0; i < num; i++) {
+if (gpu_dev_id == igd_combo_id_infos[i].gpu_device_id) {
+pch_dev_id = igd_combo_id_infos[i].pch_device_id;
+pch_rev_id = igd_combo_id_infos[i].pch_revision_id;
+}
+}
+
+if (pch_dev_id == 0x) {
+return;
+}
+
+/* Currently IGD drivers always need to access

[Qemu-devel] [v7][PATCH 03/10] piix: create host bridge to passthrough

2015-03-18 Thread Tiejun Chen
Implement a pci host bridge specific to passthrough. Actually
this just inherits the standard one. And we also just expose
a minimal real host bridge pci configuration subset.

Signed-off-by: Tiejun Chen 
---
 hw/pci-host/piix.c   | 82 
 include/hw/i386/pc.h |  2 ++
 2 files changed, 84 insertions(+)

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index c812eaa..0906ba5 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -728,6 +728,87 @@ static const TypeInfo i440fx_info = {
 .class_init= i440fx_class_init,
 };
 
+/* IGD Passthrough Host Bridge. */
+typedef struct {
+uint8_t offset;
+uint8_t len;
+} IGDHostInfo;
+
+/* Here we just expose minimal host bridge offset subset. */
+static const IGDHostInfo igd_host_bridge_infos[] = {
+{0x08, 2},  /* revision id */
+{0x2c, 2},  /* sybsystem vendor id */
+{0x2e, 2},  /* sybsystem id */
+{0x50, 2},  /* SNB: processor graphics control register */
+{0x52, 2},  /* processor graphics control register */
+{0xa4, 4},  /* SNB: graphics base of stolen memory */
+{0xa8, 4},  /* SNB: base of GTT stolen memory */
+};
+
+static int host_pci_config_read(int pos, int len, uint32_t val)
+{
+char path[PATH_MAX];
+int config_fd;
+ssize_t size = sizeof(path);
+/* Access real host bridge. */
+int rc = snprintf(path, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s",
+  0, 0, 0, 0, "config");
+
+if (rc >= size || rc < 0) {
+return -ENODEV;
+}
+
+config_fd = open(path, O_RDWR);
+if (config_fd < 0) {
+return -ENODEV;
+}
+
+do {
+rc = pread(config_fd, (uint8_t *)&val, len, pos);
+} while (rc < 0 && (errno == EINTR || errno == EAGAIN));
+if (rc != len) {
+return -errno;
+}
+
+return 0;
+}
+
+static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev)
+{
+uint32_t val = 0;
+int rc, i, num;
+int pos, len;
+
+num = ARRAY_SIZE(igd_host_bridge_infos);
+for (i = 0; i < num; i++) {
+pos = igd_host_bridge_infos[i].offset;
+len = igd_host_bridge_infos[i].len;
+rc = host_pci_config_read(pos, len, val);
+if (rc) {
+return -ENODEV;
+}
+pci_default_write_config(pci_dev, pos, val, len);
+}
+
+return 0;
+}
+
+static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+k->init = igd_pt_i440fx_initfn;
+dc->desc = "IGD Passthrough Host bridge";
+}
+
+static const TypeInfo igd_passthrough_i440fx_info = {
+.name  = TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE,
+.parent= TYPE_I440FX_PCI_DEVICE,
+.instance_size = sizeof(PCII440FXState),
+.class_init= igd_passthrough_i440fx_class_init,
+};
+
 static const char *i440fx_pcihost_root_bus_path(PCIHostState *host_bridge,
 PCIBus *rootbus)
 {
@@ -769,6 +850,7 @@ static const TypeInfo i440fx_pcihost_info = {
 static void i440fx_register_types(void)
 {
 type_register_static(&i440fx_info);
+type_register_static(&igd_passthrough_i440fx_info);
 type_register_static(&piix3_info);
 type_register_static(&piix3_xen_info);
 type_register_static(&i440fx_pcihost_info);
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 99dc26b..ec3ca88 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -233,6 +233,8 @@ typedef struct PCII440FXState PCII440FXState;
 #define TYPE_I440FX_PCI_HOST_BRIDGE "i440FX-pcihost"
 #define TYPE_I440FX_PCI_DEVICE "i440FX"
 
+#define TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE "xen-igd-passthrough-i440FX"
+
 PCIBus *i440fx_init(const char *host_type, const char *pci_type,
 PCII440FXState **pi440fx_state, int *piix_devfn,
 ISABus **isa_bus, qemu_irq *pic,
-- 
1.9.1




[Qemu-devel] [v7][PATCH 02/10] pc_init1: pass parameters just with types

2015-03-18 Thread Tiejun Chen
Pass types to configure pc_init1().

Signed-off-by: Tiejun Chen 
---
 hw/i386/pc_piix.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 07faec9..cea3a5c 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -75,7 +75,9 @@ static bool has_reserved_memory = true;
 /* PC hardware initialisation */
 static void pc_init1(MachineState *machine,
  int pci_enabled,
- int kvmclock_enabled)
+ int kvmclock_enabled,
+ const char *host_type,
+ const char *pci_type)
 {
 PCMachineState *pc_machine = PC_MACHINE(machine);
 MemoryRegion *system_memory = get_system_memory();
@@ -202,8 +204,8 @@ static void pc_init1(MachineState *machine,
 }
 
 if (pci_enabled) {
-pci_bus = i440fx_init(TYPE_I440FX_PCI_HOST_BRIDGE,
-  TYPE_I440FX_PCI_DEVICE,
+pci_bus = i440fx_init(host_type,
+  pci_type,
   &i440fx_state, &piix3_devfn, &isa_bus, gsi,
   system_memory, system_io, machine->ram_size,
   below_4g_mem_size,
@@ -309,7 +311,8 @@ static void pc_init1(MachineState *machine,
 
 static void pc_init_pci(MachineState *machine)
 {
-pc_init1(machine, 1, 1);
+pc_init1(machine, 1, 1, TYPE_I440FX_PCI_HOST_BRIDGE,
+ TYPE_I440FX_PCI_DEVICE);
 }
 
 static void pc_compat_2_2(MachineState *machine)
@@ -478,7 +481,8 @@ static void pc_init_pci_1_2(MachineState *machine)
 static void pc_init_pci_no_kvmclock(MachineState *machine)
 {
 pc_compat_1_2(machine);
-pc_init1(machine, 1, 0);
+pc_init1(machine, 1, 0, TYPE_I440FX_PCI_HOST_BRIDGE,
+ TYPE_I440FX_PCI_DEVICE);
 }
 
 static void pc_init_isa(MachineState *machine)
@@ -495,7 +499,8 @@ static void pc_init_isa(MachineState *machine)
 }
 x86_cpu_compat_kvm_no_autoenable(FEAT_KVM, 1 << KVM_FEATURE_PV_EOI);
 enable_compat_apic_id_mode();
-pc_init1(machine, 0, 1);
+pc_init1(machine, 0, 1, TYPE_I440FX_PCI_HOST_BRIDGE,
+ TYPE_I440FX_PCI_DEVICE);
 }
 
 #ifdef CONFIG_XEN
-- 
1.9.1




[Qemu-devel] [v7][PATCH 10/10] xen, gfx passthrough: add opregion mapping

2015-03-18 Thread Tiejun Chen
The OpRegion shouldn't be mapped 1:1 because the address in the host
can't be used in the guest directly.

This patch traps read and write access to the opregion of the Intel
GPU config space (offset 0xfc).

The original patch is from Jean Guyader 

Signed-off-by: Tiejun Chen 
Signed-off-by: Yang Zhang 
---
 hw/xen/xen_pt.h |  6 +++-
 hw/xen/xen_pt_config_init.c | 52 ++--
 hw/xen/xen_pt_graphics.c| 82 +
 3 files changed, 137 insertions(+), 3 deletions(-)

diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index d9b1ce0..20cc9b9 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -36,6 +36,9 @@ typedef struct XenPTReg XenPTReg;
 
 typedef struct XenPCIPassthroughState XenPCIPassthroughState;
 
+uint32_t igd_read_opregion(XenPCIPassthroughState *s);
+void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val);
+
 /* function type for config reg */
 typedef int (*xen_pt_conf_reg_init)
 (XenPCIPassthroughState *, XenPTRegInfo *, uint32_t real_offset,
@@ -62,8 +65,9 @@ typedef int (*xen_pt_conf_byte_read)
 #define XEN_PT_BAR_ALLF 0x
 #define XEN_PT_BAR_UNMAPPED (-1)
 
-#define PCI_CAP_MAX 48
+#define XEN_PCI_CAP_MAX 48
 
+#define XEN_PCI_INTEL_OPREGION 0xfc
 
 typedef enum {
 XEN_PT_GRP_TYPE_HARDWIRED = 0,  /* 0 Hardwired reg group */
diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index d99c22e..abcfe1b 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -573,6 +573,22 @@ static int 
xen_pt_exp_rom_bar_reg_write(XenPCIPassthroughState *s,
 return 0;
 }
 
+static int xen_pt_intel_opregion_read(XenPCIPassthroughState *s,
+  XenPTReg *cfg_entry,
+  uint32_t *value, uint32_t valid_mask)
+{
+*value = igd_read_opregion(s);
+return 0;
+}
+
+static int xen_pt_intel_opregion_write(XenPCIPassthroughState *s,
+   XenPTReg *cfg_entry, uint32_t *value,
+   uint32_t dev_value, uint32_t valid_mask)
+{
+igd_write_opregion(s, *value);
+return 0;
+}
+
 /* Header Type0 reg static information table */
 static XenPTRegInfo xen_pt_emu_reg_header0[] = {
 /* Vendor ID reg */
@@ -1438,6 +1454,20 @@ static XenPTRegInfo xen_pt_emu_reg_msix[] = {
 },
 };
 
+static XenPTRegInfo xen_pt_emu_reg_igd_opregion[] = {
+/* Intel IGFX OpRegion reg */
+{
+.offset = 0x0,
+.size   = 4,
+.init_val   = 0,
+.no_wb  = 1,
+.u.dw.read   = xen_pt_intel_opregion_read,
+.u.dw.write  = xen_pt_intel_opregion_write,
+},
+{
+.size = 0,
+},
+};
 
 /
  * Capabilities
@@ -1675,6 +1705,14 @@ static const XenPTRegGroupInfo xen_pt_emu_reg_grps[] = {
 .size_init   = xen_pt_msix_size_init,
 .emu_regs = xen_pt_emu_reg_msix,
 },
+/* Intel IGD Opregion group */
+{
+.grp_id  = XEN_PCI_INTEL_OPREGION,
+.grp_type= XEN_PT_GRP_TYPE_EMU,
+.grp_size= 0x4,
+.size_init   = xen_pt_reg_grp_size_init,
+.emu_regs= xen_pt_emu_reg_igd_opregion,
+},
 {
 .grp_size = 0,
 },
@@ -1725,7 +1763,7 @@ out:
 static uint8_t find_cap_offset(XenPCIPassthroughState *s, uint8_t cap)
 {
 uint8_t id;
-unsigned max_cap = PCI_CAP_MAX;
+unsigned max_cap = XEN_PCI_CAP_MAX;
 uint8_t pos = PCI_CAPABILITY_LIST;
 uint8_t status = 0;
 
@@ -1804,7 +1842,8 @@ int xen_pt_config_init(XenPCIPassthroughState *s)
 uint32_t reg_grp_offset = 0;
 XenPTRegGroup *reg_grp_entry = NULL;
 
-if (xen_pt_emu_reg_grps[i].grp_id != 0xFF) {
+if (xen_pt_emu_reg_grps[i].grp_id != 0xFF
+&& xen_pt_emu_reg_grps[i].grp_id != XEN_PCI_INTEL_OPREGION) {
 if (xen_pt_hide_dev_cap(&s->real_device,
 xen_pt_emu_reg_grps[i].grp_id)) {
 continue;
@@ -1817,6 +1856,15 @@ int xen_pt_config_init(XenPCIPassthroughState *s)
 }
 }
 
+/*
+ * By default we will trap up to 0x40 in the cfg space.
+ * If an intel device is pass through we need to trap 0xfc,
+ * therefore the size should be 0xff.
+ */
+if (xen_pt_emu_reg_grps[i].grp_id == XEN_PCI_INTEL_OPREGION) {
+reg_grp_offset = XEN_PCI_INTEL_OPREGION;
+}
+
 reg_grp_entry = g_new0(XenPTRegGroup, 1);
 QLIST_INIT(®_grp_entry->reg_tbl_list);
 QLIST_INSERT_HEAD(&s->reg_grps, reg_grp_entry, entries);
diff --git a/hw/xen/xen_pt_graphics.c b/hw/xen/xen_pt_graphics.c
index 3232296..df6069b 100644
--- a/hw/xen/xen_pt_graphics.c
+++ b/hw/xen/xen_pt_graphics.c
@@ -5,6 +5,11 @@
 #include "xen-host-pci-device.h"
 #include "hw/xen/xen_backend.h"
 
+static unsigned long igd_guest_opregion;
+static unsigned long igd_host_opregion;
+
+#define XEN_PCI_INTEL_O

[Qemu-devel] [v7][PATCH 09/10] xen, gfx passthrough: register host bridge specific to passthrough

2015-03-18 Thread Tiejun Chen
Just register that pci host bridge specific to passthrough.

Signed-off-by: Tiejun Chen 
---
 hw/i386/pc_piix.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 8fbfc09..eae2d20 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -50,7 +50,8 @@
 #include "cpu.h"
 #include "qemu/error-report.h"
 #ifdef CONFIG_XEN
-#  include 
+#include 
+#include "hw/xen/xen_pt.h"
 #endif
 
 #define MAX_IDE_BUS 2
@@ -504,11 +505,25 @@ static void pc_init_isa(MachineState *machine)
 }
 
 #ifdef CONFIG_XEN
+static void igd_passthrough_pc_init_pci(MachineState *machine)
+{
+pc_init1(machine, 1, 1, TYPE_I440FX_PCI_HOST_BRIDGE,
+ TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE);
+}
+
+static void pc_igd_init_pci(MachineState *machine)
+{
+if (has_igd_gfx_passthru)
+igd_passthrough_pc_init_pci(machine);
+else
+pc_init_pci(machine);
+}
+
 static void pc_xen_hvm_init(MachineState *machine)
 {
 PCIBus *bus;
 
-pc_init_pci(machine);
+pc_igd_init_pci(machine);
 
 bus = pci_find_primary_bus();
 if (bus != NULL) {
-- 
1.9.1




[Qemu-devel] [v7][PATCH 04/10] hw/pci-assign: split pci-assign.c

2015-03-18 Thread Tiejun Chen
We will try to reuse assign_dev_load_option_rom in xen side, and
especially its a good beginning to unify pci assign codes both on
kvm and xen in the future.

Signed-off-by: Tiejun Chen 
---
 hw/i386/Makefile.objs |  1 +
 hw/i386/kvm/pci-assign.c  | 82 --
 hw/i386/pci-assign-load-rom.c | 93 +++
 include/hw/pci/pci-assign.h   | 27 +
 4 files changed, 128 insertions(+), 75 deletions(-)
 create mode 100644 hw/i386/pci-assign-load-rom.c
 create mode 100644 include/hw/pci/pci-assign.h

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index e058a39..8bd1932 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -7,6 +7,7 @@ obj-$(CONFIG_XEN) += ../xenpv/ xen/
 
 obj-y += kvmvapic.o
 obj-y += acpi-build.o
+obj-y += pci-assign-load-rom.o
 hw/i386/acpi-build.o: hw/i386/acpi-build.c \
hw/i386/acpi-dsdt.hex hw/i386/q35-acpi-dsdt.hex \
hw/i386/ssdt-tpm.hex
diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index 9db7c77..fe870c9 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -37,6 +37,7 @@
 #include "hw/pci/pci.h"
 #include "hw/pci/msi.h"
 #include "kvm_i386.h"
+#include "hw/pci/pci-assign.h"
 
 #define MSIX_PAGE_SIZE 0x1000
 
@@ -48,17 +49,6 @@
 #define IORESOURCE_PREFETCH 0x2000  /* No side effects */
 #define IORESOURCE_MEM_64   0x0010
 
-//#define DEVICE_ASSIGNMENT_DEBUG
-
-#ifdef DEVICE_ASSIGNMENT_DEBUG
-#define DEBUG(fmt, ...)   \
-do {  \
-fprintf(stderr, "%s: " fmt, __func__ , __VA_ARGS__);  \
-} while (0)
-#else
-#define DEBUG(fmt, ...)
-#endif
-
 typedef struct PCIRegion {
 int type;   /* Memory or port I/O */
 int valid;
@@ -1893,73 +1883,15 @@ static void assign_register_types(void)
 
 type_init(assign_register_types)
 
-/*
- * Scan the assigned devices for the devices that have an option ROM, and then
- * load the corresponding ROM data to RAM. If an error occurs while loading an
- * option ROM, we just ignore that option ROM and continue with the next one.
- */
 static void assigned_dev_load_option_rom(AssignedDevice *dev)
 {
-char name[32], rom_file[64];
-FILE *fp;
-uint8_t val;
-struct stat st;
-void *ptr;
-
-/* If loading ROM from file, pci handles it */
-if (dev->dev.romfile || !dev->dev.rom_bar) {
-return;
-}
+int size = 0;
 
-snprintf(rom_file, sizeof(rom_file),
- "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/rom",
- dev->host.domain, dev->host.bus, dev->host.slot,
- dev->host.function);
+pci_assign_dev_load_option_rom(&dev->dev, OBJECT(dev), &size,
+   dev->host.domain, dev->host.bus,
+   dev->host.slot, dev->host.function);
 
-if (stat(rom_file, &st)) {
-return;
-}
-
-if (access(rom_file, F_OK)) {
-error_report("pci-assign: Insufficient privileges for %s", rom_file);
-return;
-}
-
-/* Write "1" to the ROM file to enable it */
-fp = fopen(rom_file, "r+");
-if (fp == NULL) {
-return;
+if (!size) {
+error_report("pci-assign: Invalid ROM.");
 }
-val = 1;
-if (fwrite(&val, 1, 1, fp) != 1) {
-goto close_rom;
-}
-fseek(fp, 0, SEEK_SET);
-
-snprintf(name, sizeof(name), "%s.rom",
-object_get_typename(OBJECT(dev)));
-memory_region_init_ram(&dev->dev.rom, OBJECT(dev), name, st.st_size,
-   &error_abort);
-vmstate_register_ram(&dev->dev.rom, &dev->dev.qdev);
-ptr = memory_region_get_ram_ptr(&dev->dev.rom);
-memset(ptr, 0xff, st.st_size);
-
-if (!fread(ptr, 1, st.st_size, fp)) {
-error_report("pci-assign: Cannot read from host %s", rom_file);
-error_printf("Device option ROM contents are probably invalid "
- "(check dmesg).\nSkip option ROM probe with rombar=0, "
- "or load from file with romfile=\n");
-goto close_rom;
-}
-
-pci_register_bar(&dev->dev, PCI_ROM_SLOT, 0, &dev->dev.rom);
-dev->dev.has_rom = true;
-close_rom:
-/* Write "0" to disable ROM */
-fseek(fp, 0, SEEK_SET);
-val = 0;
-if (!fwrite(&val, 1, 1, fp)) {
-DEBUG("%s\n", "Failed to disable pci-sysfs rom file");
-}
-fclose(fp);
 }
diff --git a/hw/i386/pci-assign-load-rom.c b/hw/i386/pci-assign-load-rom.c
new file mode 100644
index 000..a04b540
--- /dev/null
+++ b/hw/i386/pci-assign-load-rom.c
@@ -0,0 +1,93 @@
+/*
+ * This is splited from hw/i386/kvm/pci-assign.c
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hw/hw.h"
+#include "hw/i386/pc.h"
+#include "qemu/error-report.h"
+#include "ui/console.h"
+#include "hw/loader.h"
+#include "monitor/monitor.h"
+#include "qemu/range.h"
+#include "sysemu/sysemu.h"

[Qemu-devel] [PATCH v4 1/3] uhci: fix segfault when hot-unplugging uhci controller

2015-03-18 Thread arei.gonglei
From: Gonglei 

When hot-unplugging the usb controllers (ehci/uhci),
we have to clean all resouce of these devices,
involved registered reset handler. Otherwise, it
may cause NULL pointer access and/or segmentation fault
if we reboot the guest os after hot-unplugging.

Let's hook up reset via DeviceClass->reset() and drop
the qemu_register_reset() call. Then Qemu will register
and unregister the reset handler automatically.

Cc: qemu-stable 
Reported-by: Lidonglin 
Signed-off-by: Gonglei 
---
 hw/usb/hcd-uhci.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/usb/hcd-uhci.c b/hw/usb/hcd-uhci.c
index f903de7..d8f0577 100644
--- a/hw/usb/hcd-uhci.c
+++ b/hw/usb/hcd-uhci.c
@@ -348,9 +348,10 @@ static void uhci_update_irq(UHCIState *s)
 pci_set_irq(&s->dev, level);
 }
 
-static void uhci_reset(void *opaque)
+static void uhci_reset(DeviceState *dev)
 {
-UHCIState *s = opaque;
+PCIDevice *d = PCI_DEVICE(dev);
+UHCIState *s = DO_UPCAST(UHCIState, dev, d);
 uint8_t *pci_conf;
 int i;
 UHCIPort *port;
@@ -454,11 +455,11 @@ static void uhci_port_write(void *opaque, hwaddr addr,
 port = &s->ports[i];
 usb_device_reset(port->port.dev);
 }
-uhci_reset(s);
+uhci_reset(DEVICE(s));
 return;
 }
 if (val & UHCI_CMD_HCRESET) {
-uhci_reset(s);
+uhci_reset(DEVICE(s));
 return;
 }
 s->cmd = val;
@@ -1226,8 +1227,6 @@ static int usb_uhci_common_initfn(PCIDevice *dev)
 s->num_ports_vmstate = NB_PORTS;
 QTAILQ_INIT(&s->queues);
 
-qemu_register_reset(uhci_reset, s);
-
 memory_region_init_io(&s->io_bar, OBJECT(s), &uhci_ioport_ops, s,
   "uhci", 0x20);
 
@@ -1303,6 +1302,7 @@ static void uhci_class_init(ObjectClass *klass, void 
*data)
 k->revision  = info->revision;
 k->class_id  = PCI_CLASS_SERIAL_USB;
 dc->vmsd = &vmstate_uhci;
+dc->reset = uhci_reset;
 if (!info->unplug) {
 /* uhci controllers in companion setups can't be hotplugged */
 dc->hotpluggable = false;
-- 
1.7.12.4





[Qemu-devel] [PATCH v4 0/3] usb: fix segfault when hot-unplugging usb host adapter

2015-03-18 Thread arei.gonglei
From: Gonglei 

When hot-unplugging the usb controllers (ehci/uhci),
we have to clean all resouce of these devices,
involved registered reset handler. Otherwise, it
may cause NULL pointer access and/or segmentation fault
if we reboot the guest os after hot-unplugging.

Let's hook up reset via DeviceClass->reset() and drop
the qemu_register_reset() call. Then Qemu will register
and unregister the reset handler automatically.

Cc: qemu-stable 

v4 -> v3:
 - add reset hookup for sysbus devices (ehci/ohci). (Gerd)

v2 -> v3:
 - rewrite each patch's title

v1 -> v2:
 - hooking up reset via DeviceClass->reset 
 and drop the qemu_register_reset() calls.   (Gerd)

Gonglei (3):
  uhci: fix segfault when hot-unplugging uhci controller
  ehci: fix segfault when hot-unplugging ehci controller
  ohci: fix resource cleanup leak

 hw/usb/hcd-ehci-pci.c| 10 ++
 hw/usb/hcd-ehci-sysbus.c | 10 ++
 hw/usb/hcd-ehci.c|  3 +--
 hw/usb/hcd-ehci.h|  1 +
 hw/usb/hcd-ohci.c| 20 +++-
 hw/usb/hcd-uhci.c| 12 ++--
 6 files changed, 47 insertions(+), 9 deletions(-)

-- 
1.7.12.4





[Qemu-devel] [PATCH V4 00/19] Support more virtio queues

2015-03-18 Thread Jason Wang
We current limit the max virtio queues to 64. This is not sufficient
to support multiqueue devices (e.g recent Linux support up to 256
tap queues). So this series tries to let virtio to support more queues.

No much works need to be done except:

- Introducing transport specific queue limitation.
- Let each virtio transport to use specific limit.
- Speedup the MSI-X masking and unmasking through per vector queue
  list, and increase the maximum MSI-X vectors supported by qemu.
- With the above enhancements, increase the maximum number of
  virtqueues supported by PCI from 64 to 513.
- Compat the changes for legacy machine types.

With this patch, we can support up to 256 queues. Since x86 can only
allow about 240 interrupt vectors for MSI-X, current Linux driver can
only have about 80 queue pairs has their private MSI-X interrupt
vectors. With sharing IRQ with queue pairs (RFC posted in
https://lkml.org/lkml/2014/12/25/169), Linux driver can have up
to about 186 queue pairs has their private MSI-X interrupt vectors.

Stress/migration test on virtio-pci, compile test on other
targets. And make check on s390x-softmmu and ppc64-softmmu.

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Michael S. Tsirkin 
Cc: Alexander Graf 
Cc: Keith Busch 
Cc: Kevin Wolf 
Cc: Stefan Hajnoczi 
Cc: Christian Borntraeger 
Cc: Cornelia Huck 
Cc: Amit Shah 
Cc: qemu-...@nongnu.org

Please review

Thanks

Changes from V3:
- rebase to master and target to 2.4
- handling compat issues for spapr
- fixes for hmp command completion when we have a nic with 256 queues
- using VIRTIO_NO_VECTOR instead of 0 for invalid virtqueue in
  virtio-ccw
- fix compile issues for ppc64-softmmu
- don't export VIRTIO_CCW_QUEUE_MAX by introducing a gsi limit and
  inherit in ccw
- use transport specific queue limit in virtio-serial
- correct the stale comment for AdapterRoutes and move it to ccw patch
- replace 128 with queue_max + 64 and add a comment for this in
  virtio_ccw_notify()

Changes from V2:
- move transport specific limitation to their implementation. (The
  left is VIRTIO_CCW_QUEUE_MAX since I fail to find a common header
  files other than virtio.h)
- use virtio_get_queue_max() if possible in virtio.c
- AdapterRoutes should use ccw limit
- introduce a vector to queue mapping for virito devices and speedup
  the MSI-X masking and unmasking through this.

Changes from V1:
- add a validation against the bus limitation
- switch to use a bus specific queue limit instead of a global one,
  this will allow us to just increase the limit of one transport
  without disturbing others.
- only increase the queue limit of virtio-pci
- limit the maximum number of virtio queues to 64 for legacy machine
  types

Jason Wang (19):
  pc: add 2.4 machine types
  spapr: add machine type specific instance init function
  ppc: spapr: add 2.4 machine type
  monitor: replace the magic number 255 with MAX_QUEUE_NUM
  monitor: check return value of qemu_find_net_clients_except()
  virtio-ccw: using VIRTIO_NO_VECTOR instead of 0 for invalid virtqueue
  virtio-net: validate backend queue numbers against bus limitation
  virtio-net: fix the upper bound when trying to delete queues
  virito: introduce bus specific queue limit
  virtio-ccw: introduce ccw specific queue limit
  virtio-s390: switch to bus specific queue limit
  virtio-mmio: switch to bus specific queue limit
  virtio-pci: switch to use bus specific queue limit
  virtio: introduce vector to virtqueues mapping
  virtio: introduce virtio_queue_get_index()
  virtio-pci: speedup MSI-X masking and unmasking
  virtio-pci: increase the maximum number of virtqueues to 513
  pci: remove hard-coded bar size in msix_init_exclusive_bar()
  virtio-pci: introduce auto_msix_bar_size property

 hw/block/nvme.c|  2 +-
 hw/char/virtio-serial-bus.c|  2 +-
 hw/i386/pc_piix.c  | 42 ---
 hw/i386/pc_q35.c   | 39 +++--
 hw/misc/ivshmem.c  |  2 +-
 hw/net/virtio-net.c|  9 -
 hw/pci/msix.c  | 18 --
 hw/ppc/spapr.c | 70 +++--
 hw/s390x/s390-virtio-bus.c |  7 ++--
 hw/s390x/s390-virtio-ccw.c |  7 ++--
 hw/s390x/virtio-ccw.c  | 20 +++
 hw/scsi/virtio-scsi.c  |  4 +--
 hw/virtio/virtio-mmio.c|  7 ++--
 hw/virtio/virtio-pci.c | 78 --
 hw/virtio/virtio-pci.h |  3 ++
 hw/virtio/virtio.c | 75 +++-
 include/hw/compat.h| 11 ++
 include/hw/pci/msix.h  |  2 +-
 include/hw/s390x/s390_flic.h   |  4 ++-
 include/hw/virtio/virtio-bus.h |  2 ++
 include/hw/virtio/virtio.h |  7 ++--
 monitor.c  | 25 --
 22 files changed, 339 insertions(+), 97 deletions(-)

-- 
2.1.0




[Qemu-devel] [PATCH V4 02/19] spapr: add machine type specific instance init function

2015-03-18 Thread Jason Wang
This patches adds machine type specific instance initialization
functions. Those functions will be used by following patches to compat
class properties for legacy machine types.

Cc: Alexander Graf 
Cc: qemu-...@nongnu.org
Signed-off-by: Jason Wang 
---
 hw/ppc/spapr.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 0487f52..7fb174f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1803,6 +1803,27 @@ static const TypeInfo spapr_machine_info = {
 #define SPAPR_COMPAT_2_1 \
 SPAPR_COMPAT_2_2
 
+static void spapr_compat_2_2(Object *obj)
+{
+}
+
+static void spapr_compat_2_1(Object *obj)
+{
+spapr_compat_2_2(obj);
+}
+
+static void spapr_machine_2_2_instance_init(Object *obj)
+{
+spapr_compat_2_2(obj);
+spapr_machine_initfn(obj);
+}
+
+static void spapr_machine_2_1_instance_init(Object *obj)
+{
+spapr_compat_2_1(obj);
+spapr_machine_initfn(obj);
+}
+
 static void spapr_machine_2_1_class_init(ObjectClass *oc, void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
@@ -1821,6 +1842,7 @@ static const TypeInfo spapr_machine_2_1_info = {
 .name  = TYPE_SPAPR_MACHINE "2.1",
 .parent= TYPE_SPAPR_MACHINE,
 .class_init= spapr_machine_2_1_class_init,
+.instance_init = spapr_machine_2_1_instance_init,
 };
 
 static void spapr_machine_2_2_class_init(ObjectClass *oc, void *data)
@@ -1840,6 +1862,7 @@ static const TypeInfo spapr_machine_2_2_info = {
 .name  = TYPE_SPAPR_MACHINE "2.2",
 .parent= TYPE_SPAPR_MACHINE,
 .class_init= spapr_machine_2_2_class_init,
+.instance_init = spapr_machine_2_2_instance_init,
 };
 
 static void spapr_machine_2_3_class_init(ObjectClass *oc, void *data)
-- 
2.1.0




Re: [Qemu-devel] [PATCH] MAINTAINERS: Add myself as the maintainer of the Quorum driver

2015-03-18 Thread Alberto Garcia
On Tue, Mar 17, 2015 at 05:23:56PM +, Stefan Hajnoczi wrote:

> > +
> > +Quorum
> > +M: Alberto Garcia 
> > +S: Supported
> > +F: block/quorum.c
> 
> I'd like to also add the qemu-bl...@nongnu.org mailing list to your
> MAINTAINERS stanza because patches should be CCed to the block list:
> 
> L: qemu-bl...@nongnu.org
> 
> Does that sound good?  If you have any questions, just let me know.

Yes, please go ahead.

Thanks,

Berto



[Qemu-devel] [PATCH V4 03/19] ppc: spapr: add 2.4 machine type

2015-03-18 Thread Jason Wang
The following patches will limit the following things to legacy
machine type:

- maximum number of virtqueues for virtio-pci were limited to 64
- auto msix bar size for virtio-net-pci were disabled by default

Cc: Alexander Graf 
Cc: qemu-...@nongnu.org

Signed-off-by: Jason Wang 
---
 hw/ppc/spapr.c | 31 +--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 7fb174f..22f4ae4 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1803,8 +1803,13 @@ static const TypeInfo spapr_machine_info = {
 #define SPAPR_COMPAT_2_1 \
 SPAPR_COMPAT_2_2
 
+static void spapr_compat_2_3(Object *obj)
+{
+}
+
 static void spapr_compat_2_2(Object *obj)
 {
+spapr_compat_2_3(obj);
 }
 
 static void spapr_compat_2_1(Object *obj)
@@ -1812,6 +1817,12 @@ static void spapr_compat_2_1(Object *obj)
 spapr_compat_2_2(obj);
 }
 
+static void spapr_machine_2_3_instance_init(Object *obj)
+{
+spapr_compat_2_3(obj);
+spapr_machine_initfn(obj);
+}
+
 static void spapr_machine_2_2_instance_init(Object *obj)
 {
 spapr_compat_2_2(obj);
@@ -1871,14 +1882,29 @@ static void spapr_machine_2_3_class_init(ObjectClass 
*oc, void *data)
 
 mc->name = "pseries-2.3";
 mc->desc = "pSeries Logical Partition (PAPR compliant) v2.3";
-mc->alias = "pseries";
-mc->is_default = 1;
 }
 
 static const TypeInfo spapr_machine_2_3_info = {
 .name  = TYPE_SPAPR_MACHINE "2.3",
 .parent= TYPE_SPAPR_MACHINE,
 .class_init= spapr_machine_2_3_class_init,
+.instance_init = spapr_machine_2_3_instance_init,
+};
+
+static void spapr_machine_2_4_class_init(ObjectClass *oc, void *data)
+{
+MachineClass *mc = MACHINE_CLASS(oc);
+
+mc->name = "pseries-2.4";
+mc->desc = "pSeries Logical Partition (PAPR compliant) v2.4";
+mc->alias = "pseries";
+mc->is_default = 1;
+}
+
+static const TypeInfo spapr_machine_2_4_info = {
+.name  = TYPE_SPAPR_MACHINE "2.4",
+.parent= TYPE_SPAPR_MACHINE,
+.class_init= spapr_machine_2_4_class_init,
 };
 
 static void spapr_machine_register_types(void)
@@ -1887,6 +1913,7 @@ static void spapr_machine_register_types(void)
 type_register_static(&spapr_machine_2_1_info);
 type_register_static(&spapr_machine_2_2_info);
 type_register_static(&spapr_machine_2_3_info);
+type_register_static(&spapr_machine_2_4_info);
 }
 
 type_init(spapr_machine_register_types)
-- 
2.1.0




[Qemu-devel] [PATCH V4 08/19] virtio-net: fix the upper bound when trying to delete queues

2015-03-18 Thread Jason Wang
Virtqueue were indexed from zero, so don't delete virtqueue whose
index is n->max_queues * 2 + 1.

Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
 hw/net/virtio-net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 59f76bc..b6fac9c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1309,7 +1309,7 @@ static void virtio_net_set_multiqueue(VirtIONet *n, int 
multiqueue)
 
 n->multiqueue = multiqueue;
 
-for (i = 2; i <= n->max_queues * 2 + 1; i++) {
+for (i = 2; i < n->max_queues * 2 + 1; i++) {
 virtio_del_queue(vdev, i);
 }
 
-- 
2.1.0




[Qemu-devel] [PATCH V4 13/19] virtio-pci: switch to use bus specific queue limit

2015-03-18 Thread Jason Wang
Instead of depending on a macro, switch to use a bus specific queue
limit.

Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
 hw/virtio/virtio-pci.c | 12 +++-
 include/hw/virtio/virtio.h |  2 --
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 075b13b..e556919 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -42,6 +42,8 @@
  * configuration space */
 #define VIRTIO_PCI_CONFIG_SIZE(dev) 
VIRTIO_PCI_CONFIG_OFF(msix_enabled(dev))
 
+#define VIRTIO_PCI_QUEUE_MAX 64
+
 static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
VirtIOPCIProxy *dev);
 
@@ -171,7 +173,7 @@ static void virtio_pci_start_ioeventfd(VirtIOPCIProxy 
*proxy)
 return;
 }
 
-for (n = 0; n < VIRTIO_PCI_QUEUE_MAX; n++) {
+for (n = 0; n < virtio_get_queue_max(vdev); n++) {
 if (!virtio_queue_get_num(vdev, n)) {
 continue;
 }
@@ -207,7 +209,7 @@ static void virtio_pci_stop_ioeventfd(VirtIOPCIProxy *proxy)
 return;
 }
 
-for (n = 0; n < VIRTIO_PCI_QUEUE_MAX; n++) {
+for (n = 0; n < virtio_get_queue_max(vdev); n++) {
 if (!virtio_queue_get_num(vdev, n)) {
 continue;
 }
@@ -243,11 +245,11 @@ static void virtio_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
 virtio_queue_set_addr(vdev, vdev->queue_sel, pa);
 break;
 case VIRTIO_PCI_QUEUE_SEL:
-if (val < VIRTIO_PCI_QUEUE_MAX)
+if (val < virtio_get_queue_max(vdev))
 vdev->queue_sel = val;
 break;
 case VIRTIO_PCI_QUEUE_NOTIFY:
-if (val < VIRTIO_PCI_QUEUE_MAX) {
+if (val < virtio_get_queue_max(vdev)) {
 virtio_queue_notify(vdev, val);
 }
 break;
@@ -748,7 +750,7 @@ static int virtio_pci_set_guest_notifiers(DeviceState *d, 
int nvqs, bool assign)
 bool with_irqfd = msix_enabled(&proxy->pci_dev) &&
 kvm_msi_via_irqfd_enabled();
 
-nvqs = MIN(nvqs, VIRTIO_PCI_QUEUE_MAX);
+nvqs = MIN(nvqs, virtio_get_queue_max(vdev));
 
 /* When deassigning, pass a consistent nvqs value
  * to avoid leaking notifiers.
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 91fd673..7ff40ac 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -48,8 +48,6 @@ typedef struct VirtQueueElement
 struct iovec out_sg[VIRTQUEUE_MAX_SIZE];
 } VirtQueueElement;
 
-#define VIRTIO_PCI_QUEUE_MAX 64
-
 #define VIRTIO_NO_VECTOR 0x
 
 #define TYPE_VIRTIO_DEVICE "virtio-device"
-- 
2.1.0




[Qemu-devel] [PATCH V4 01/19] pc: add 2.4 machine types

2015-03-18 Thread Jason Wang
The following patches will limit the following things to legacy
machine type:

- maximum number of virtqueues for virtio-pci were limited to 64
- auto msix bar size for virtio-net-pci were disabled by default

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
 hw/i386/pc_piix.c | 29 +
 hw/i386/pc_q35.c  | 26 +++---
 2 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 36c69d7..175e582 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -310,8 +310,13 @@ static void pc_init_pci(MachineState *machine)
 pc_init1(machine, 1, 1);
 }
 
+static void pc_compat_2_3(MachineState *machine)
+{
+}
+
 static void pc_compat_2_2(MachineState *machine)
 {
+pc_compat_2_3(machine);
 rsdp_in_ram = false;
 x86_cpu_compat_set_features("kvm64", FEAT_1_EDX, 0, CPUID_VME);
 x86_cpu_compat_set_features("kvm32", FEAT_1_EDX, 0, CPUID_VME);
@@ -417,6 +422,12 @@ static void pc_compat_1_2(MachineState *machine)
 x86_cpu_compat_kvm_no_autoenable(FEAT_KVM, 1 << KVM_FEATURE_PV_EOI);
 }
 
+static void pc_init_pci_2_3(MachineState *machine)
+{
+pc_compat_2_3(machine);
+pc_init_pci(machine);
+}
+
 static void pc_init_pci_2_2(MachineState *machine)
 {
 pc_compat_2_2(machine);
@@ -516,19 +527,28 @@ static void pc_xen_hvm_init(MachineState *machine)
 .desc = "Standard PC (i440FX + PIIX, 1996)", \
 .hot_add_cpu = pc_hot_add_cpu
 
-#define PC_I440FX_2_3_MACHINE_OPTIONS   \
+#define PC_I440FX_2_4_MACHINE_OPTIONS   \
 PC_I440FX_MACHINE_OPTIONS,  \
 .default_machine_opts = "firmware=bios-256k.bin",   \
 .default_display = "std"
 
-static QEMUMachine pc_i440fx_machine_v2_3 = {
-PC_I440FX_2_3_MACHINE_OPTIONS,
-.name = "pc-i440fx-2.3",
+
+static QEMUMachine pc_i440fx_machine_v2_4 = {
+PC_I440FX_2_4_MACHINE_OPTIONS,
+.name = "pc-i440fx-2.4",
 .alias = "pc",
 .init = pc_init_pci,
 .is_default = 1,
 };
 
+#define PC_I440FX_2_3_MACHINE_OPTIONS PC_I440FX_2_4_MACHINE_OPTIONS
+
+static QEMUMachine pc_i440fx_machine_v2_3 = {
+PC_I440FX_2_3_MACHINE_OPTIONS,
+.name = "pc-i440fx-2.3",
+.init = pc_init_pci_2_3,
+};
+
 #define PC_I440FX_2_2_MACHINE_OPTIONS PC_I440FX_2_3_MACHINE_OPTIONS
 
 static QEMUMachine pc_i440fx_machine_v2_2 = {
@@ -974,6 +994,7 @@ static QEMUMachine xenfv_machine = {
 
 static void pc_machine_init(void)
 {
+qemu_register_pc_machine(&pc_i440fx_machine_v2_4);
 qemu_register_pc_machine(&pc_i440fx_machine_v2_3);
 qemu_register_pc_machine(&pc_i440fx_machine_v2_2);
 qemu_register_pc_machine(&pc_i440fx_machine_v2_1);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index bc40537..df44d25 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -289,8 +289,13 @@ static void pc_q35_init(MachineState *machine)
 }
 }
 
+static void pc_compat_2_3(MachineState *machine)
+{
+}
+
 static void pc_compat_2_2(MachineState *machine)
 {
+pc_compat_2_3(machine);
 rsdp_in_ram = false;
 x86_cpu_compat_set_features("kvm64", FEAT_1_EDX, 0, CPUID_VME);
 x86_cpu_compat_set_features("kvm32", FEAT_1_EDX, 0, CPUID_VME);
@@ -365,6 +370,12 @@ static void pc_compat_1_4(MachineState *machine)
 x86_cpu_compat_set_features("Westmere", FEAT_1_ECX, 0, 
CPUID_EXT_PCLMULQDQ);
 }
 
+static void pc_q35_init_2_3(MachineState *machine)
+{
+pc_compat_2_3(machine);
+pc_q35_init(machine);
+}
+
 static void pc_q35_init_2_2(MachineState *machine)
 {
 pc_compat_2_2(machine);
@@ -414,16 +425,24 @@ static void pc_q35_init_1_4(MachineState *machine)
 .hot_add_cpu = pc_hot_add_cpu, \
 .units_per_default_bus = 1
 
-#define PC_Q35_2_3_MACHINE_OPTIONS  \
+#define PC_Q35_2_4_MACHINE_OPTIONS  \
 PC_Q35_MACHINE_OPTIONS, \
 .default_machine_opts = "firmware=bios-256k.bin",   \
 .default_display = "std"
 
+static QEMUMachine pc_q35_machine_v2_4 = {
+PC_Q35_2_4_MACHINE_OPTIONS,
+.name = "pc-q35-2.4",
+.alias = "q35",
+.init = pc_q35_init,
+};
+
+#define PC_Q35_2_3_MACHINE_OPTIONS PC_Q35_2_4_MACHINE_OPTIONS
+
 static QEMUMachine pc_q35_machine_v2_3 = {
 PC_Q35_2_3_MACHINE_OPTIONS,
 .name = "pc-q35-2.3",
-.alias = "q35",
-.init = pc_q35_init,
+.init = pc_q35_init_2_3,
 };
 
 #define PC_Q35_2_2_MACHINE_OPTIONS PC_Q35_2_3_MACHINE_OPTIONS
@@ -510,6 +529,7 @@ static QEMUMachine pc_q35_machine_v1_4 = {
 
 static void pc_q35_machine_init(void)
 {
+qemu_register_pc_machine(&pc_q35_machine_v2_4);
 qemu_register_pc_machine(&pc_q35_machine_v2_3);
 qemu_register_pc_machine(&pc_q35_machine_v2_2);
 qemu_register_pc_machine(&pc_q35_machine_v2_1);
-- 
2.1.0




[Qemu-devel] [PATCH V4 16/19] virtio-pci: speedup MSI-X masking and unmasking

2015-03-18 Thread Jason Wang
This patch tries to speed up the MSI-X masking and unmasking through
the mapping between vector and queues. With this patch it will there's
no need to go through all possible virtqueues, which may help to
reduce the time spent when doing MSI-X masking/unmasking a single
vector when more than hundreds or even thousands of virtqueues were
supported.

Tested with 80 queue pairs virito-net-pci by changing the smp affinity
in the background and doing netperf in the same time:

Before the patch:
5711.70 Gbits/sec
After the patch:
6830.98 Gbits/sec

About 19.6% improvements in throughput.

Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
 hw/virtio/virtio-pci.c | 40 +---
 1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index c38f33f..9a5242a 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -632,28 +632,30 @@ static int virtio_pci_vector_unmask(PCIDevice *dev, 
unsigned vector,
 {
 VirtIOPCIProxy *proxy = container_of(dev, VirtIOPCIProxy, pci_dev);
 VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
-int ret, queue_no;
+VirtQueue *vq = virtio_vector_first_queue(vdev, vector);
+int ret, index, unmasked = 0;
 
-for (queue_no = 0; queue_no < proxy->nvqs_with_notifiers; queue_no++) {
-if (!virtio_queue_get_num(vdev, queue_no)) {
+while (vq) {
+index = virtio_queue_get_index(vdev, vq);
+if (!virtio_queue_get_num(vdev, index)) {
 break;
 }
-if (virtio_queue_vector(vdev, queue_no) != vector) {
-continue;
-}
-ret = virtio_pci_vq_vector_unmask(proxy, queue_no, vector, msg);
+ret = virtio_pci_vq_vector_unmask(proxy, index, vector, msg);
 if (ret < 0) {
 goto undo;
 }
+vq = virtio_vector_next_queue(vq);
+++unmasked;
 }
+
 return 0;
 
 undo:
-while (--queue_no >= 0) {
-if (virtio_queue_vector(vdev, queue_no) != vector) {
-continue;
-}
-virtio_pci_vq_vector_mask(proxy, queue_no, vector);
+vq = virtio_vector_first_queue(vdev, vector);
+while (vq && --unmasked >= 0) {
+index = virtio_queue_get_index(vdev, vq);
+virtio_pci_vq_vector_mask(proxy, index, vector);
+vq = virtio_vector_next_queue(vq);
 }
 return ret;
 }
@@ -662,16 +664,16 @@ static void virtio_pci_vector_mask(PCIDevice *dev, 
unsigned vector)
 {
 VirtIOPCIProxy *proxy = container_of(dev, VirtIOPCIProxy, pci_dev);
 VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
-int queue_no;
+VirtQueue *vq = virtio_vector_first_queue(vdev, vector);
+int index;
 
-for (queue_no = 0; queue_no < proxy->nvqs_with_notifiers; queue_no++) {
-if (!virtio_queue_get_num(vdev, queue_no)) {
+while (vq) {
+index = virtio_queue_get_index(vdev, vq);
+if (!virtio_queue_get_num(vdev, index)) {
 break;
 }
-if (virtio_queue_vector(vdev, queue_no) != vector) {
-continue;
-}
-virtio_pci_vq_vector_mask(proxy, queue_no, vector);
+virtio_pci_vq_vector_mask(proxy, index, vector);
+vq = virtio_vector_next_queue(vq);
 }
 }
 
-- 
2.1.0




[Qemu-devel] [PATCH V4 06/19] virtio-ccw: using VIRTIO_NO_VECTOR instead of 0 for invalid virtqueue

2015-03-18 Thread Jason Wang
There's no need to use vector 0 for invalid virtqueue. So this patch
changes to use VIRTIO_NO_VECTOR instead.

Cc: Michael S. Tsirkin 
Cc: Cornelia Huck 
CC: Christian Borntraeger 
Cc: Richard Henderson 
Cc: Alexander Graf 
Signed-off-by: Jason Wang 
---
 hw/s390x/virtio-ccw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index 130535c..c8b87aa 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -281,7 +281,7 @@ static int virtio_ccw_set_vqs(SubchDev *sch, uint64_t addr, 
uint32_t align,
 
 virtio_queue_set_addr(vdev, index, addr);
 if (!addr) {
-virtio_queue_set_vector(vdev, index, 0);
+virtio_queue_set_vector(vdev, index, VIRTIO_NO_VECTOR);
 } else {
 /* Fail if we don't have a big enough queue. */
 /* TODO: Add interface to handle vring.num changing */
-- 
2.1.0




[Qemu-devel] [PATCH V4 09/19] virito: introduce bus specific queue limit

2015-03-18 Thread Jason Wang
This patch introduces a bus specific queue limitation. It will be
useful for increasing the limit for one of the bus without disturbing
other buses.

Cc: Michael S. Tsirkin 
Cc: Alexander Graf 
Cc: Richard Henderson 
Cc: Cornelia Huck 
Cc: Christian Borntraeger 
Cc: Paolo Bonzini 
Signed-off-by: Jason Wang 
---
 hw/char/virtio-serial-bus.c|  2 +-
 hw/net/virtio-net.c|  4 ++--
 hw/s390x/s390-virtio-bus.c |  1 +
 hw/s390x/virtio-ccw.c  |  1 +
 hw/scsi/virtio-scsi.c  |  4 ++--
 hw/virtio/virtio-mmio.c|  1 +
 hw/virtio/virtio-pci.c |  1 +
 hw/virtio/virtio.c | 40 +---
 include/hw/virtio/virtio-bus.h |  1 +
 include/hw/virtio/virtio.h |  1 +
 10 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index c86814f..dec2f2d 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -942,7 +942,7 @@ static void virtio_serial_device_realize(DeviceState *dev, 
Error **errp)
 }
 
 /* Each port takes 2 queues, and one pair is for the control queue */
-max_supported_ports = VIRTIO_PCI_QUEUE_MAX / 2 - 1;
+max_supported_ports = virtio_get_queue_max(vdev) / 2 - 1;
 
 if (vser->serial.max_virtserial_ports > max_supported_ports) {
 error_setg(errp, "maximum ports supported: %u", max_supported_ports);
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index b6fac9c..bf286f5 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1588,10 +1588,10 @@ static void virtio_net_device_realize(DeviceState *dev, 
Error **errp)
 virtio_init(vdev, "virtio-net", VIRTIO_ID_NET, n->config_size);
 
 n->max_queues = MAX(n->nic_conf.peers.queues, 1);
-if (n->max_queues * 2 + 1 > VIRTIO_PCI_QUEUE_MAX) {
+if (n->max_queues * 2 + 1 > virtio_get_queue_max(vdev)) {
 error_setg(errp, "Invalid number of queues (= %" PRIu32 "), "
"must be a postive integer less than %d.",
-   n->max_queues, (VIRTIO_PCI_QUEUE_MAX - 1) / 2);
+   n->max_queues, (virtio_get_queue_max(vdev) - 1) / 2);
 virtio_cleanup(vdev);
 return;
 }
diff --git a/hw/s390x/s390-virtio-bus.c b/hw/s390x/s390-virtio-bus.c
index 047c963..2b41e32 100644
--- a/hw/s390x/s390-virtio-bus.c
+++ b/hw/s390x/s390-virtio-bus.c
@@ -715,6 +715,7 @@ static void virtio_s390_bus_class_init(ObjectClass *klass, 
void *data)
 bus_class->max_dev = 1;
 k->notify = virtio_s390_notify;
 k->get_features = virtio_s390_get_features;
+k->queue_max = VIRTIO_PCI_QUEUE_MAX;
 }
 
 static const TypeInfo virtio_s390_bus_info = {
diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index c8b87aa..dfe6ded 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -1711,6 +1711,7 @@ static void virtio_ccw_bus_class_init(ObjectClass *klass, 
void *data)
 k->load_queue = virtio_ccw_load_queue;
 k->save_config = virtio_ccw_save_config;
 k->load_config = virtio_ccw_load_config;
+k->queue_max = VIRTIO_PCI_QUEUE_MAX;
 }
 
 static const TypeInfo virtio_ccw_bus_info = {
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index da0cff8..feb5e07 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -822,10 +822,10 @@ void virtio_scsi_common_realize(DeviceState *dev, Error 
**errp,
 sizeof(VirtIOSCSIConfig));
 
 if (s->conf.num_queues == 0 ||
-s->conf.num_queues > VIRTIO_PCI_QUEUE_MAX - 2) {
+s->conf.num_queues > virtio_get_queue_max(vdev) - 2) {
 error_setg(errp, "Invalid number of queues (= %" PRIu32 "), "
  "must be a positive integer less than %d.",
-   s->conf.num_queues, VIRTIO_PCI_QUEUE_MAX - 2);
+   s->conf.num_queues, virtio_get_queue_max(vdev) - 2);
 virtio_cleanup(vdev);
 return;
 }
diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 10123f3..2ae6942 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -403,6 +403,7 @@ static void virtio_mmio_bus_class_init(ObjectClass *klass, 
void *data)
 k->device_plugged = virtio_mmio_device_plugged;
 k->has_variable_vring_alignment = true;
 bus_class->max_dev = 1;
+k->queue_max = VIRTIO_PCI_QUEUE_MAX;
 }
 
 static const TypeInfo virtio_mmio_bus_info = {
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index c7c3f72..075b13b 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1498,6 +1498,7 @@ static void virtio_pci_bus_class_init(ObjectClass *klass, 
void *data)
 k->vmstate_change = virtio_pci_vmstate_change;
 k->device_plugged = virtio_pci_device_plugged;
 k->device_unplugged = virtio_pci_device_unplugged;
+k->queue_max = VIRTIO_PCI_QUEUE_MAX;
 }
 
 static const TypeInfo virtio_pci_bus_info = {
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 3c6e430..023eb04 100644
--- a/hw/v

[Qemu-devel] [PATCH V4 04/19] monitor: replace the magic number 255 with MAX_QUEUE_NUM

2015-03-18 Thread Jason Wang
This patch replace the magic number 255, and increase it to
MAX_QUEUE_NUM which is maximum number of queues supported by a nic.

Cc: Luiz Capitulino 
Signed-off-by: Jason Wang 
---
 monitor.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/monitor.c b/monitor.c
index 42116a9..07dfed0 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4475,10 +4475,11 @@ void set_link_completion(ReadLineState *rs, int 
nb_args, const char *str)
 len = strlen(str);
 readline_set_completion_index(rs, len);
 if (nb_args == 2) {
-NetClientState *ncs[255];
+NetClientState *ncs[MAX_QUEUE_NUM];
 int count, i;
 count = qemu_find_net_clients_except(NULL, ncs,
- NET_CLIENT_OPTIONS_KIND_NONE, 
255);
+ NET_CLIENT_OPTIONS_KIND_NONE,
+ MAX_QUEUE_NUM);
 for (i = 0; i < count; i++) {
 const char *name = ncs[i]->name;
 if (!strncmp(str, name, len)) {
@@ -4494,7 +4495,7 @@ void set_link_completion(ReadLineState *rs, int nb_args, 
const char *str)
 void netdev_del_completion(ReadLineState *rs, int nb_args, const char *str)
 {
 int len, count, i;
-NetClientState *ncs[255];
+NetClientState *ncs[MAX_QUEUE_NUM];
 
 if (nb_args != 2) {
 return;
@@ -4503,7 +4504,7 @@ void netdev_del_completion(ReadLineState *rs, int 
nb_args, const char *str)
 len = strlen(str);
 readline_set_completion_index(rs, len);
 count = qemu_find_net_clients_except(NULL, ncs, 
NET_CLIENT_OPTIONS_KIND_NIC,
- 255);
+ MAX_QUEUE_NUM);
 for (i = 0; i < count; i++) {
 QemuOpts *opts;
 const char *name = ncs[i]->name;
@@ -4569,14 +4570,15 @@ void host_net_add_completion(ReadLineState *rs, int 
nb_args, const char *str)
 
 void host_net_remove_completion(ReadLineState *rs, int nb_args, const char 
*str)
 {
-NetClientState *ncs[255];
+NetClientState *ncs[MAX_QUEUE_NUM];
 int count, i, len;
 
 len = strlen(str);
 readline_set_completion_index(rs, len);
 if (nb_args == 2) {
 count = qemu_find_net_clients_except(NULL, ncs,
- NET_CLIENT_OPTIONS_KIND_NONE, 
255);
+ NET_CLIENT_OPTIONS_KIND_NONE,
+ MAX_QUEUE_NUM);
 for (i = 0; i < count; i++) {
 int id;
 char name[16];
@@ -4592,7 +4594,8 @@ void host_net_remove_completion(ReadLineState *rs, int 
nb_args, const char *str)
 return;
 } else if (nb_args == 3) {
 count = qemu_find_net_clients_except(NULL, ncs,
- NET_CLIENT_OPTIONS_KIND_NIC, 255);
+ NET_CLIENT_OPTIONS_KIND_NIC,
+ MAX_QUEUE_NUM);
 for (i = 0; i < count; i++) {
 int id;
 const char *name;
-- 
2.1.0




[Qemu-devel] [PATCH V4 18/19] pci: remove hard-coded bar size in msix_init_exclusive_bar()

2015-03-18 Thread Jason Wang
This patch let msix_init_exclusive_bar() can accept bar_size parameter
other than a hard-coded limit 4096. Then caller can specify a bar_size
depends on msix entries and can use up to 2048 msix entries as PCI
spec allows. To keep migration compatibility, 4096 is used for all
callers and pba were start from half of bar size.

Cc: Keith Busch 
Cc: Kevin Wolf 
Cc: Stefan Hajnoczi 
Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
 hw/block/nvme.c|  2 +-
 hw/misc/ivshmem.c  |  2 +-
 hw/pci/msix.c  | 18 +++---
 hw/virtio/virtio-pci.c |  2 +-
 include/hw/pci/msix.h  |  2 +-
 5 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 0f3dfb9..09d7884 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -787,7 +787,7 @@ static int nvme_init(PCIDevice *pci_dev)
 pci_register_bar(&n->parent_obj, 0,
 PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64,
 &n->iomem);
-msix_init_exclusive_bar(&n->parent_obj, n->num_queues, 4);
+msix_init_exclusive_bar(&n->parent_obj, n->num_queues, 4, 4096);
 
 id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
 id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 5d272c8..3e2a2d4 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -631,7 +631,7 @@ static uint64_t ivshmem_get_size(IVShmemState * s) {
 
 static void ivshmem_setup_msi(IVShmemState * s)
 {
-if (msix_init_exclusive_bar(PCI_DEVICE(s), s->vectors, 1)) {
+if (msix_init_exclusive_bar(PCI_DEVICE(s), s->vectors, 1, 4096)) {
 IVSHMEM_DPRINTF("msix initialization failed\n");
 exit(1);
 }
diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index 24de260..9a1894f 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -291,33 +291,29 @@ int msix_init(struct PCIDevice *dev, unsigned short 
nentries,
 }
 
 int msix_init_exclusive_bar(PCIDevice *dev, unsigned short nentries,
-uint8_t bar_nr)
+uint8_t bar_nr, uint32_t bar_size)
 {
 int ret;
 char *name;
+uint32_t bar_pba_offset = bar_size / 2;
 
 /*
  * Migration compatibility dictates that this remains a 4k
  * BAR with the vector table in the lower half and PBA in
  * the upper half.  Do not use these elsewhere!
  */
-#define MSIX_EXCLUSIVE_BAR_SIZE 4096
-#define MSIX_EXCLUSIVE_BAR_TABLE_OFFSET 0
-#define MSIX_EXCLUSIVE_BAR_PBA_OFFSET (MSIX_EXCLUSIVE_BAR_SIZE / 2)
-#define MSIX_EXCLUSIVE_CAP_OFFSET 0
-
-if (nentries * PCI_MSIX_ENTRY_SIZE > MSIX_EXCLUSIVE_BAR_PBA_OFFSET) {
+if (nentries * PCI_MSIX_ENTRY_SIZE > bar_pba_offset) {
 return -EINVAL;
 }
 
 name = g_strdup_printf("%s-msix", dev->name);
-memory_region_init(&dev->msix_exclusive_bar, OBJECT(dev), name, 
MSIX_EXCLUSIVE_BAR_SIZE);
+memory_region_init(&dev->msix_exclusive_bar, OBJECT(dev), name, bar_size);
 g_free(name);
 
 ret = msix_init(dev, nentries, &dev->msix_exclusive_bar, bar_nr,
-MSIX_EXCLUSIVE_BAR_TABLE_OFFSET, &dev->msix_exclusive_bar,
-bar_nr, MSIX_EXCLUSIVE_BAR_PBA_OFFSET,
-MSIX_EXCLUSIVE_CAP_OFFSET);
+0, &dev->msix_exclusive_bar,
+bar_nr, bar_pba_offset,
+0);
 if (ret) {
 return ret;
 }
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 02e3ce8..4a5febb 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -937,7 +937,7 @@ static void virtio_pci_device_plugged(DeviceState *d)
 config[PCI_INTERRUPT_PIN] = 1;
 
 if (proxy->nvectors &&
-msix_init_exclusive_bar(&proxy->pci_dev, proxy->nvectors, 1)) {
+msix_init_exclusive_bar(&proxy->pci_dev, proxy->nvectors, 1, 4096)) {
 error_report("unable to init msix vectors to %" PRIu32,
  proxy->nvectors);
 proxy->nvectors = 0;
diff --git a/include/hw/pci/msix.h b/include/hw/pci/msix.h
index 954d82b..43edebc 100644
--- a/include/hw/pci/msix.h
+++ b/include/hw/pci/msix.h
@@ -11,7 +11,7 @@ int msix_init(PCIDevice *dev, unsigned short nentries,
   unsigned table_offset, MemoryRegion *pba_bar,
   uint8_t pba_bar_nr, unsigned pba_offset, uint8_t cap_pos);
 int msix_init_exclusive_bar(PCIDevice *dev, unsigned short nentries,
-uint8_t bar_nr);
+uint8_t bar_nr, uint32_t bar_size);
 
 void msix_write_config(PCIDevice *dev, uint32_t address, uint32_t val, int 
len);
 
-- 
2.1.0




[Qemu-devel] [PATCH V4 05/19] monitor: check return value of qemu_find_net_clients_except()

2015-03-18 Thread Jason Wang
qemu_find_net_clients_except() may return a value which is greater
than the size of array we provided. So we should check this value
before using it, otherwise this may cause unexpected memory access.

This patch fixes the net related command completion when we have a
virtio-net nic with more than 255 queues.

Cc: Luiz Capitulino 
Signed-off-by: Jason Wang 
---
 monitor.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/monitor.c b/monitor.c
index 07dfed0..3c0abfd 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4480,7 +4480,7 @@ void set_link_completion(ReadLineState *rs, int nb_args, 
const char *str)
 count = qemu_find_net_clients_except(NULL, ncs,
  NET_CLIENT_OPTIONS_KIND_NONE,
  MAX_QUEUE_NUM);
-for (i = 0; i < count; i++) {
+for (i = 0; i < MIN(count, MAX_QUEUE_NUM); i++) {
 const char *name = ncs[i]->name;
 if (!strncmp(str, name, len)) {
 readline_add_completion(rs, name);
@@ -4505,7 +4505,7 @@ void netdev_del_completion(ReadLineState *rs, int 
nb_args, const char *str)
 readline_set_completion_index(rs, len);
 count = qemu_find_net_clients_except(NULL, ncs, 
NET_CLIENT_OPTIONS_KIND_NIC,
  MAX_QUEUE_NUM);
-for (i = 0; i < count; i++) {
+for (i = 0; i < MIN(count, MAX_QUEUE_NUM); i++) {
 QemuOpts *opts;
 const char *name = ncs[i]->name;
 if (strncmp(str, name, len)) {
@@ -4579,7 +4579,7 @@ void host_net_remove_completion(ReadLineState *rs, int 
nb_args, const char *str)
 count = qemu_find_net_clients_except(NULL, ncs,
  NET_CLIENT_OPTIONS_KIND_NONE,
  MAX_QUEUE_NUM);
-for (i = 0; i < count; i++) {
+for (i = 0; i < MIN(count, MAX_QUEUE_NUM); i++) {
 int id;
 char name[16];
 
@@ -4596,7 +4596,7 @@ void host_net_remove_completion(ReadLineState *rs, int 
nb_args, const char *str)
 count = qemu_find_net_clients_except(NULL, ncs,
  NET_CLIENT_OPTIONS_KIND_NIC,
  MAX_QUEUE_NUM);
-for (i = 0; i < count; i++) {
+for (i = 0; i < MIN(count, MAX_QUEUE_NUM); i++) {
 int id;
 const char *name;
 
-- 
2.1.0




[Qemu-devel] [PATCH V4 07/19] virtio-net: validate backend queue numbers against bus limitation

2015-03-18 Thread Jason Wang
We don't validate the backend queue numbers against bus limitation,
this will easily crash qemu if it exceeds the limitation. Fixing this
by doing the validation and fail early.

Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
 hw/net/virtio-net.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 27adcc5..59f76bc 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1588,6 +1588,13 @@ static void virtio_net_device_realize(DeviceState *dev, 
Error **errp)
 virtio_init(vdev, "virtio-net", VIRTIO_ID_NET, n->config_size);
 
 n->max_queues = MAX(n->nic_conf.peers.queues, 1);
+if (n->max_queues * 2 + 1 > VIRTIO_PCI_QUEUE_MAX) {
+error_setg(errp, "Invalid number of queues (= %" PRIu32 "), "
+   "must be a postive integer less than %d.",
+   n->max_queues, (VIRTIO_PCI_QUEUE_MAX - 1) / 2);
+virtio_cleanup(vdev);
+return;
+}
 n->vqs = g_malloc0(sizeof(VirtIONetQueue) * n->max_queues);
 n->vqs[0].rx_vq = virtio_add_queue(vdev, 256, virtio_net_handle_rx);
 n->curr_queues = 1;
-- 
2.1.0




[Qemu-devel] [PATCH V4 14/19] virtio: introduce vector to virtqueues mapping

2015-03-18 Thread Jason Wang
Currently we will try to traverse all virtqueues to find a subset that
using a specific vector. This is sub optimal when we will support
hundreds or even thousands of virtqueues. So this patch introduces a
method which could be used by transport to get all virtqueues that
using a same vector. This is done through QLISTs and the number of
QLISTs was queried through a transport specific method. When guest
setting vectors, the virtqueue will be linked and helpers for traverse
the list was also introduced.

The first user will be virtio pci which will use this to speed up
MSI-X masking and unmasking handling.

Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
 hw/virtio/virtio-pci.c |  8 
 hw/virtio/virtio.c | 32 ++--
 include/hw/virtio/virtio-bus.h |  1 +
 include/hw/virtio/virtio.h |  3 +++
 4 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index e556919..c38f33f 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -910,6 +910,13 @@ static const TypeInfo virtio_9p_pci_info = {
  * virtio-pci: This is the PCIDevice which has a virtio-pci-bus.
  */
 
+static int virtio_pci_query_nvectors(DeviceState *d)
+{
+VirtIOPCIProxy *proxy = VIRTIO_PCI(d);
+
+return proxy->nvectors;
+}
+
 /* This is called by virtio-bus just after the device is plugged. */
 static void virtio_pci_device_plugged(DeviceState *d)
 {
@@ -1500,6 +1507,7 @@ static void virtio_pci_bus_class_init(ObjectClass *klass, 
void *data)
 k->vmstate_change = virtio_pci_vmstate_change;
 k->device_plugged = virtio_pci_device_plugged;
 k->device_unplugged = virtio_pci_device_unplugged;
+k->query_nvectors = virtio_pci_query_nvectors;
 k->queue_max = VIRTIO_PCI_QUEUE_MAX;
 }
 
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 023eb04..ca157e8 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -89,6 +89,7 @@ struct VirtQueue
 VirtIODevice *vdev;
 EventNotifier guest_notifier;
 EventNotifier host_notifier;
+QLIST_ENTRY(VirtQueue) node;
 };
 
 /* virt queue functions */
@@ -613,7 +614,7 @@ void virtio_reset(void *opaque)
 vdev->vq[i].vring.used = 0;
 vdev->vq[i].last_avail_idx = 0;
 vdev->vq[i].pa = 0;
-vdev->vq[i].vector = VIRTIO_NO_VECTOR;
+virtio_queue_set_vector(vdev, i, VIRTIO_NO_VECTOR);
 vdev->vq[i].signalled_used = 0;
 vdev->vq[i].signalled_used_valid = false;
 vdev->vq[i].notification = true;
@@ -738,6 +739,16 @@ void virtio_queue_set_num(VirtIODevice *vdev, int n, int 
num)
 virtqueue_init(&vdev->vq[n]);
 }
 
+VirtQueue *virtio_vector_first_queue(VirtIODevice *vdev, uint16_t vector)
+{
+return QLIST_FIRST(&vdev->vector_queues[vector]);
+}
+
+VirtQueue *virtio_vector_next_queue(VirtQueue *vq)
+{
+return QLIST_NEXT(vq, node);
+}
+
 int virtio_queue_get_num(VirtIODevice *vdev, int n)
 {
 return vdev->vq[n].vring.num;
@@ -788,8 +799,17 @@ uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
 
 void virtio_queue_set_vector(VirtIODevice *vdev, int n, uint16_t vector)
 {
-if (n < virtio_get_queue_max(vdev))
+VirtQueue *vq = &vdev->vq[n];
+
+if (n < virtio_get_queue_max(vdev)) {
+if (vdev->vq[n].vector != VIRTIO_NO_VECTOR) {
+QLIST_REMOVE(vq, node);
+}
 vdev->vq[n].vector = vector;
+if (vector != VIRTIO_NO_VECTOR) {
+QLIST_INSERT_HEAD(&vdev->vector_queues[vector], vq, node);
+}
+}
 }
 
 VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
@@ -1097,6 +1117,7 @@ void virtio_cleanup(VirtIODevice *vdev)
 qemu_del_vm_change_state_handler(vdev->vmstate);
 g_free(vdev->config);
 g_free(vdev->vq);
+g_free(vdev->vector_queues);
 }
 
 static void virtio_vmstate_change(void *opaque, int running, RunState state)
@@ -1134,7 +1155,14 @@ void virtio_instance_init_common(Object *proxy_obj, void 
*data,
 void virtio_init(VirtIODevice *vdev, const char *name,
  uint16_t device_id, size_t config_size)
 {
+BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
+VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
 int i, queue_max = virtio_get_queue_max(vdev);
+int nvectors = k->query_nvectors ?
+k->query_nvectors(qbus->parent) : queue_max;
+
+vdev->vector_queues = g_malloc0(sizeof(*vdev->vector_queues) * nvectors);
+
 vdev->device_id = device_id;
 vdev->status = 0;
 vdev->isr = 0;
diff --git a/include/hw/virtio/virtio-bus.h b/include/hw/virtio/virtio-bus.h
index 4da8022..12386d5 100644
--- a/include/hw/virtio/virtio-bus.h
+++ b/include/hw/virtio/virtio-bus.h
@@ -62,6 +62,7 @@ typedef struct VirtioBusClass {
  * This is called by virtio-bus just before the device is unplugged.
  */
 void (*device_unplugged)(DeviceState *d);
+int (*query_nvectors)(DeviceState *d);
 /*
  * Does the transport have variable vring alignment?
  

[Qemu-devel] [PATCH V4 11/19] virtio-s390: switch to bus specific queue limit

2015-03-18 Thread Jason Wang
Instead of depending on marco, switch to use a bus specific queue
limit.

Cc: Alexander Graf 
Cc: Richard Henderson 
Cc: Christian Borntraeger 
Cc: Cornelia Huck 
Signed-off-by: Jason Wang 
---
 hw/s390x/s390-virtio-bus.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/s390x/s390-virtio-bus.c b/hw/s390x/s390-virtio-bus.c
index 2b41e32..9d6028d 100644
--- a/hw/s390x/s390-virtio-bus.c
+++ b/hw/s390x/s390-virtio-bus.c
@@ -45,6 +45,8 @@
 do { } while (0)
 #endif
 
+#define VIRTIO_S390_QUEUE_MAX 64
+
 static void virtio_s390_bus_new(VirtioBusState *bus, size_t bus_size,
 VirtIOS390Device *dev);
 
@@ -344,7 +346,7 @@ static ram_addr_t 
s390_virtio_device_num_vq(VirtIOS390Device *dev)
 VirtIODevice *vdev = dev->vdev;
 int num_vq;
 
-for (num_vq = 0; num_vq < VIRTIO_PCI_QUEUE_MAX; num_vq++) {
+for (num_vq = 0; num_vq < virtio_get_queue_max(vdev); num_vq++) {
 if (!virtio_queue_get_num(vdev, num_vq)) {
 break;
 }
@@ -446,7 +448,7 @@ VirtIOS390Device *s390_virtio_bus_find_vring(VirtIOS390Bus 
*bus,
 QTAILQ_FOREACH(kid, &bus->bus.children, sibling) {
 VirtIOS390Device *dev = (VirtIOS390Device *)kid->child;
 
-for(i = 0; i < VIRTIO_PCI_QUEUE_MAX; i++) {
+for (i = 0; i < virtio_get_queue_max(dev->vdev); i++) {
 if (!virtio_queue_get_addr(dev->vdev, i))
 break;
 if (virtio_queue_get_addr(dev->vdev, i) == mem) {
@@ -715,7 +717,7 @@ static void virtio_s390_bus_class_init(ObjectClass *klass, 
void *data)
 bus_class->max_dev = 1;
 k->notify = virtio_s390_notify;
 k->get_features = virtio_s390_get_features;
-k->queue_max = VIRTIO_PCI_QUEUE_MAX;
+k->queue_max = VIRTIO_S390_QUEUE_MAX;
 }
 
 static const TypeInfo virtio_s390_bus_info = {
-- 
2.1.0




Re: [Qemu-devel] RFC: -object vs -chardev creation order

2015-03-18 Thread Markus Armbruster
"Daniel P. Berrange"  writes:

> The current QEMU startup code will create -chardev backends first, then
> create -object backends, then -fsdev backends and so on, in some pretty
> arbitrary order of types.
>
> There is already a dependancy from the rng-egd object type, which has a
> link to a chardev, which requires -chardev options be processed before
> -object options.
>
> In my work to introduce an object type for TLS credentials though, we
> encounter exactly the opposite dependancy. A -chardev option will
> require that -object options be processed first, so that the TLS
> credantials can be created.
>
> This is a no-win situation. Both orders of creation are wrong for some
> set of use cases. This is only going to get worse over time, as more
> types of user creatable objects are defined.
>
> The application invoking QEMU could easily solve it, by putting the
> args on the command line in the order in which they need to processed,
> but then QEMU throws away this ordering information and creates them
> based on type of backend.

Yes, and that's stupid.

> I'm unclear on whether anyone has thought about ways to solve this yet,
> but it is a blocker for my work now, so I need to figure out some kind
> of solution asap, even if only a temporary workaround.
>
> One option is to convert the -chardev backends so they become user
> creatable objects, and thus -object can obsolete use of -chardev
> for these. This sounds simple, but I fear it could be a rather large
> yak shaving exercise to get that done.
>
> Another idea is to perhaps change the way options are stored, so that
> instead of having a separate QemuOptsList for each type of option, have
> one single list and store the type with each entry in the list. Then we
> can iterate over the single list creating things in exactly the same
> order that they were passed on the command line.

It's how we should've done it, but:

>   I fear this may cause
> regressions for apps using QEMU though - eg if some app is placing
> '-object rng-egd' in the args, before the -chardev it would currently
> work, but if we switch to honour ordering it would fail. We could
> introduce a flag -strict-ordering but that feels gross in itself.

Yes.

I'm afraid we'll have to break our byzantine command line at some
point...  I'd rather not break it now, though.

> A third option is to not process -object args in one go, instead process
> each type of object at a time. eg we'd first create all the
> -object tls-credential instances, then create the -chardev instances,
> then create the -object rng-egd instances. This is probably the least
> amount of work in short term, but not all that scalable, unless we do
> a catch-all default case, so we only need code up hacks for a few
> particular object types.
>
> Thus my gut feeling is to do option 3, but I'd like other opinions before
> embarking on this

Piling yet another hack on top isn't great, put telling you "no more
hacks" after everybody and his dog already got to pile on some doesn't
feel fair.

That said: can we sort the creation work topologically?  Requires
identifying references.  In the QemuOpts world, a reference is a
string-valued parameter that is interpreted as ID in a well-known
QemuOptsList.  Adding the necessary information to QemuOptDesc desc[]
shouldn't be too hard.  The ones with empty desc[] could be troublesome,
as usual.



[Qemu-devel] [PATCH V4 10/19] virtio-ccw: introduce ccw specific queue limit

2015-03-18 Thread Jason Wang
Instead of depending on marco, using a bus specific limit.

Cc: Alexander Graf 
Cc: Cornelia Huck 
Cc: Christian Borntraeger 
Cc: Richard Henderson 
Signed-off-by: Jason Wang 
---
 hw/s390x/s390-virtio-ccw.c   |  7 +--
 hw/s390x/virtio-ccw.c| 19 ---
 include/hw/s390x/s390_flic.h |  4 +++-
 3 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index afb539a..96b84ab 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -55,6 +55,7 @@ void io_subsystem_reset(void)
 
 static int virtio_ccw_hcall_notify(const uint64_t *args)
 {
+VirtIODevice *vdev;
 uint64_t subch_id = args[0];
 uint64_t queue = args[1];
 SubchDev *sch;
@@ -67,10 +68,12 @@ static int virtio_ccw_hcall_notify(const uint64_t *args)
 if (!sch || !css_subch_visible(sch)) {
 return -EINVAL;
 }
-if (queue >= VIRTIO_PCI_QUEUE_MAX) {
+
+vdev = virtio_ccw_get_vdev(sch);
+if (queue >= virtio_get_queue_max(vdev)) {
 return -EINVAL;
 }
-virtio_queue_notify(virtio_ccw_get_vdev(sch), queue);
+virtio_queue_notify(vdev, queue);
 return 0;
 
 }
diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index dfe6ded..935b880 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -32,6 +32,8 @@
 static QTAILQ_HEAD(, IndAddr) indicator_addresses =
 QTAILQ_HEAD_INITIALIZER(indicator_addresses);
 
+#define VIRTIO_CCW_QUEUE_MAX ADAPTER_ROUTES_MAX_GSI
+
 static IndAddr *get_indicator(hwaddr ind_addr, int len)
 {
 IndAddr *indicator;
@@ -170,7 +172,7 @@ static void virtio_ccw_start_ioeventfd(VirtioCcwDevice *dev)
 return;
 }
 vdev = virtio_bus_get_device(&dev->bus);
-for (n = 0; n < VIRTIO_PCI_QUEUE_MAX; n++) {
+for (n = 0; n < virtio_get_queue_max(vdev); n++) {
 if (!virtio_queue_get_num(vdev, n)) {
 continue;
 }
@@ -205,7 +207,7 @@ static void virtio_ccw_stop_ioeventfd(VirtioCcwDevice *dev)
 return;
 }
 vdev = virtio_bus_get_device(&dev->bus);
-for (n = 0; n < VIRTIO_PCI_QUEUE_MAX; n++) {
+for (n = 0; n < virtio_get_queue_max(vdev); n++) {
 if (!virtio_queue_get_num(vdev, n)) {
 continue;
 }
@@ -265,8 +267,9 @@ static int virtio_ccw_set_vqs(SubchDev *sch, uint64_t addr, 
uint32_t align,
   uint16_t index, uint16_t num)
 {
 VirtIODevice *vdev = virtio_ccw_get_vdev(sch);
+int queue_max = virtio_get_queue_max(vdev);
 
-if (index > VIRTIO_PCI_QUEUE_MAX) {
+if (index > queue_max) {
 return -EINVAL;
 }
 
@@ -291,7 +294,7 @@ static int virtio_ccw_set_vqs(SubchDev *sch, uint64_t addr, 
uint32_t align,
 virtio_queue_set_vector(vdev, index, index);
 }
 /* tell notify handler in case of config change */
-vdev->config_vector = VIRTIO_PCI_QUEUE_MAX;
+vdev->config_vector = queue_max;
 return 0;
 }
 
@@ -1036,14 +1039,16 @@ static uint8_t virtio_set_ind_atomic(SubchDev *sch, 
uint64_t ind_loc,
 static void virtio_ccw_notify(DeviceState *d, uint16_t vector)
 {
 VirtioCcwDevice *dev = to_virtio_ccw_dev_fast(d);
+VirtIODevice *vdev = virtio_bus_get_device(&dev->bus);
 SubchDev *sch = dev->sch;
 uint64_t indicators;
 
-if (vector >= 128) {
+/* queue indicators + secondary indicators */
+if (vector >= virtio_get_queue_max(vdev) + 64) {
 return;
 }
 
-if (vector < VIRTIO_PCI_QUEUE_MAX) {
+if (vector < virtio_get_queue_max(vdev)) {
 if (!dev->indicators) {
 return;
 }
@@ -1711,7 +1716,7 @@ static void virtio_ccw_bus_class_init(ObjectClass *klass, 
void *data)
 k->load_queue = virtio_ccw_load_queue;
 k->save_config = virtio_ccw_save_config;
 k->load_config = virtio_ccw_load_config;
-k->queue_max = VIRTIO_PCI_QUEUE_MAX;
+k->queue_max = VIRTIO_CCW_QUEUE_MAX;
 }
 
 static const TypeInfo virtio_ccw_bus_info = {
diff --git a/include/hw/s390x/s390_flic.h b/include/hw/s390x/s390_flic.h
index 489d73b..e1b0389 100644
--- a/include/hw/s390x/s390_flic.h
+++ b/include/hw/s390x/s390_flic.h
@@ -17,10 +17,12 @@
 #include "hw/s390x/adapter.h"
 #include "hw/virtio/virtio.h"
 
+#define ADAPTER_ROUTES_MAX_GSI 64
+
 typedef struct AdapterRoutes {
 AdapterInfo adapter;
 int num_routes;
-int gsi[VIRTIO_PCI_QUEUE_MAX];
+int gsi[ADAPTER_ROUTES_MAX_GSI];
 } AdapterRoutes;
 
 #define TYPE_S390_FLIC_COMMON "s390-flic"
-- 
2.1.0




[Qemu-devel] [PATCH V4 12/19] virtio-mmio: switch to bus specific queue limit

2015-03-18 Thread Jason Wang
Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
 hw/virtio/virtio-mmio.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 2ae6942..dbd44b6 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -34,6 +34,8 @@ do { printf("virtio_mmio: " fmt , ## __VA_ARGS__); } while (0)
 #define DPRINTF(fmt, ...) do {} while (0)
 #endif
 
+#define VIRTIO_MMIO_QUEUE_MAX 64
+
 /* QOM macros */
 /* virtio-mmio-bus */
 #define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus"
@@ -237,7 +239,7 @@ static void virtio_mmio_write(void *opaque, hwaddr offset, 
uint64_t value,
 proxy->guest_page_shift);
 break;
 case VIRTIO_MMIO_QUEUESEL:
-if (value < VIRTIO_PCI_QUEUE_MAX) {
+if (value < virtio_get_queue_max(vdev)) {
 vdev->queue_sel = value;
 }
 break;
@@ -257,7 +259,7 @@ static void virtio_mmio_write(void *opaque, hwaddr offset, 
uint64_t value,
 }
 break;
 case VIRTIO_MMIO_QUEUENOTIFY:
-if (value < VIRTIO_PCI_QUEUE_MAX) {
+if (value < virtio_get_queue_max(vdev)) {
 virtio_queue_notify(vdev, value);
 }
 break;
@@ -403,7 +405,7 @@ static void virtio_mmio_bus_class_init(ObjectClass *klass, 
void *data)
 k->device_plugged = virtio_mmio_device_plugged;
 k->has_variable_vring_alignment = true;
 bus_class->max_dev = 1;
-k->queue_max = VIRTIO_PCI_QUEUE_MAX;
+k->queue_max = VIRTIO_MMIO_QUEUE_MAX;
 }
 
 static const TypeInfo virtio_mmio_bus_info = {
-- 
2.1.0




[Qemu-devel] [PATCH V4 15/19] virtio: introduce virtio_queue_get_index()

2015-03-18 Thread Jason Wang
This patch introduces a helper that can get the queue index of a
VirtQueue. This is useful when traversing the list of VirtQueues.

Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
---
 hw/virtio/virtio.c | 5 +
 include/hw/virtio/virtio.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index ca157e8..d4c49ef 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -725,6 +725,11 @@ hwaddr virtio_queue_get_addr(VirtIODevice *vdev, int n)
 return vdev->vq[n].pa;
 }
 
+int virtio_queue_get_index(VirtIODevice *vdev, VirtQueue *vq)
+{
+return vq - vdev->vq;
+}
+
 void virtio_queue_set_num(VirtIODevice *vdev, int n, int num)
 {
 /* Don't allow guest to flip queue between existent and
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index e3adb1d..f148590 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -171,6 +171,7 @@ void virtio_config_writew(VirtIODevice *vdev, uint32_t 
addr, uint32_t data);
 void virtio_config_writel(VirtIODevice *vdev, uint32_t addr, uint32_t data);
 void virtio_queue_set_addr(VirtIODevice *vdev, int n, hwaddr addr);
 hwaddr virtio_queue_get_addr(VirtIODevice *vdev, int n);
+int virtio_queue_get_index(VirtIODevice *vdev, VirtQueue *vq);
 void virtio_queue_set_num(VirtIODevice *vdev, int n, int num);
 int virtio_queue_get_num(VirtIODevice *vdev, int n);
 void virtio_queue_set_align(VirtIODevice *vdev, int n, int align);
-- 
2.1.0




[Qemu-devel] [PATCH V4 17/19] virtio-pci: increase the maximum number of virtqueues to 513

2015-03-18 Thread Jason Wang
This patch increases the maximum number of virtqueues for pci from 64
to 513. This will allow booting a virtio-net-pci device with 256 queue
pairs.

To keep migration compatibility, 64 was kept for legacy machine
types. This is because qemu in fact allows guest to probe the limit of
virtqueues through trying to set queue_sel. So for legacy machine
type, we should make sure setting queue_sel to more than 63 won't
make effect.

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Michael S. Tsirkin 
Cc: Alexander Graf 
Cc: qemu-...@nongnu.org
Signed-off-by: Jason Wang 
---
 hw/i386/pc_piix.c  | 5 +
 hw/i386/pc_q35.c   | 5 +
 hw/ppc/spapr.c | 5 +
 hw/virtio/virtio-pci.c | 2 +-
 4 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 175e582..0796719 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -49,6 +49,7 @@
 #include "hw/acpi/acpi.h"
 #include "cpu.h"
 #include "qemu/error-report.h"
+#include "hw/virtio/virtio-bus.h"
 #ifdef CONFIG_XEN
 #  include 
 #endif
@@ -312,6 +313,10 @@ static void pc_init_pci(MachineState *machine)
 
 static void pc_compat_2_3(MachineState *machine)
 {
+ObjectClass *klass = object_class_by_name("virtio-pci-bus");
+VirtioBusClass *k = VIRTIO_BUS_CLASS(klass);
+
+k->queue_max = 64;
 }
 
 static void pc_compat_2_2(MachineState *machine)
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index df44d25..a8a34a4 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -45,6 +45,7 @@
 #include "hw/usb.h"
 #include "hw/cpu/icc_bus.h"
 #include "qemu/error-report.h"
+#include "include/hw/virtio/virtio-bus.h"
 
 /* ICH9 AHCI has 6 ports */
 #define MAX_SATA_PORTS 6
@@ -291,6 +292,10 @@ static void pc_q35_init(MachineState *machine)
 
 static void pc_compat_2_3(MachineState *machine)
 {
+ObjectClass *klass = object_class_by_name("virtio-pci-bus");
+VirtioBusClass *k = VIRTIO_BUS_CLASS(klass);
+
+k->queue_max = 64;
 }
 
 static void pc_compat_2_2(MachineState *machine)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 22f4ae4..5f25dd3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -50,6 +50,7 @@
 #include "hw/pci/pci.h"
 #include "hw/scsi/scsi.h"
 #include "hw/virtio/virtio-scsi.h"
+#include "hw/virtio/virtio-bus.h"
 
 #include "exec/address-spaces.h"
 #include "hw/usb.h"
@@ -1805,6 +1806,10 @@ static const TypeInfo spapr_machine_info = {
 
 static void spapr_compat_2_3(Object *obj)
 {
+ObjectClass *klass = object_class_by_name("virtio-pci-bus");
+VirtioBusClass *k = VIRTIO_BUS_CLASS(klass);
+
+k->queue_max = 64;
 }
 
 static void spapr_compat_2_2(Object *obj)
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 9a5242a..02e3ce8 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -42,7 +42,7 @@
  * configuration space */
 #define VIRTIO_PCI_CONFIG_SIZE(dev) 
VIRTIO_PCI_CONFIG_OFF(msix_enabled(dev))
 
-#define VIRTIO_PCI_QUEUE_MAX 64
+#define VIRTIO_PCI_QUEUE_MAX 513
 
 static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
VirtIOPCIProxy *dev);
-- 
2.1.0




[Qemu-devel] [PATCH V4 19/19] virtio-pci: introduce auto_msix_bar_size property

2015-03-18 Thread Jason Wang
Currently we don't support more than 128 MSI-X vectors for a pci
devices, trying to use vector=129 for a virtio-net-pci device may get:

qemu-system-x86_64: -device virtio-net-pci,netdev=hn0,vectors=129:
unable to init msix vectors to 129

This this because the MSI-X bar size were hard-coded as 4096. So this
patch introduces boolean auto_msix_bar_size property for
virito-pci devices. Enable this will let the device calculate the msix
bar size based on the number of MSI-X entries instead of previous 4096
hard-coded limit.

This is a must to let virtio-net can up to 256 queues and each queue
were associated with a specific MSI-X entry.

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Michael S. Tsirkin 
Cc: Alexander Graf 
Cc: qemu-...@nongnu.org
Signed-off-by: Jason Wang 
---
 hw/i386/pc_piix.c  |  8 
 hw/i386/pc_q35.c   |  8 
 hw/ppc/spapr.c | 11 ++-
 hw/virtio/virtio-pci.c | 17 +++--
 hw/virtio/virtio-pci.h |  3 +++
 include/hw/compat.h| 11 +++
 6 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 0796719..8808500 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -552,6 +552,10 @@ static QEMUMachine pc_i440fx_machine_v2_3 = {
 PC_I440FX_2_3_MACHINE_OPTIONS,
 .name = "pc-i440fx-2.3",
 .init = pc_init_pci_2_3,
+.compat_props = (GlobalProperty[]) {
+HW_COMPAT_2_3,
+{ /* end of list */ }
+},
 };
 
 #define PC_I440FX_2_2_MACHINE_OPTIONS PC_I440FX_2_3_MACHINE_OPTIONS
@@ -560,6 +564,10 @@ static QEMUMachine pc_i440fx_machine_v2_2 = {
 PC_I440FX_2_2_MACHINE_OPTIONS,
 .name = "pc-i440fx-2.2",
 .init = pc_init_pci_2_2,
+.compat_props = (GlobalProperty[]) {
+HW_COMPAT_2_2,
+{ /* end of list */ }
+},
 };
 
 #define PC_I440FX_2_1_MACHINE_OPTIONS   \
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index a8a34a4..4a34349 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -448,6 +448,10 @@ static QEMUMachine pc_q35_machine_v2_3 = {
 PC_Q35_2_3_MACHINE_OPTIONS,
 .name = "pc-q35-2.3",
 .init = pc_q35_init_2_3,
+.compat_props = (GlobalProperty[]) {
+HW_COMPAT_2_3,
+{ /* end of list */ }
+},
 };
 
 #define PC_Q35_2_2_MACHINE_OPTIONS PC_Q35_2_3_MACHINE_OPTIONS
@@ -456,6 +460,10 @@ static QEMUMachine pc_q35_machine_v2_2 = {
 PC_Q35_2_2_MACHINE_OPTIONS,
 .name = "pc-q35-2.2",
 .init = pc_q35_init_2_2,
+.compat_props = (GlobalProperty[]) {
+HW_COMPAT_2_2,
+{ /* end of list */ }
+},
 };
 
 #define PC_Q35_2_1_MACHINE_OPTIONS  \
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 5f25dd3..853a5cc 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1794,12 +1794,16 @@ static const TypeInfo spapr_machine_info = {
 },
 };
 
+#define SPAPR_COMPAT_2_3 \
+HW_COMPAT_2_3
+
 #define SPAPR_COMPAT_2_2 \
+SPAPR_COMPAT_2_3, \
 {\
 .driver   = TYPE_SPAPR_PCI_HOST_BRIDGE,\
 .property = "mem_win_size",\
 .value= "0x2000",\
-}
+} \
 
 #define SPAPR_COMPAT_2_1 \
 SPAPR_COMPAT_2_2
@@ -1883,10 +1887,15 @@ static const TypeInfo spapr_machine_2_2_info = {
 
 static void spapr_machine_2_3_class_init(ObjectClass *oc, void *data)
 {
+static GlobalProperty compat_props[] = {
+SPAPR_COMPAT_2_3,
+{ /* end of list */ }
+};
 MachineClass *mc = MACHINE_CLASS(oc);
 
 mc->name = "pseries-2.3";
 mc->desc = "pSeries Logical Partition (PAPR compliant) v2.3";
+mc->compat_props = compat_props;
 }
 
 static const TypeInfo spapr_machine_2_3_info = {
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 4a5febb..f4cd405 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -925,7 +925,7 @@ static void virtio_pci_device_plugged(DeviceState *d)
 VirtIOPCIProxy *proxy = VIRTIO_PCI(d);
 VirtioBusState *bus = &proxy->bus;
 uint8_t *config;
-uint32_t size;
+uint32_t size, bar_size;
 
 config = proxy->pci_dev.config;
 if (proxy->class_code) {
@@ -936,8 +936,19 @@ static void virtio_pci_device_plugged(DeviceState *d)
 pci_set_word(config + PCI_SUBSYSTEM_ID, virtio_bus_get_vdev_id(bus));
 config[PCI_INTERRUPT_PIN] = 1;
 
+if (proxy->flags & VIRTIO_PCI_FLAG_AUTO_MSIX_SIZE) {
+bar_size = proxy->nvectors * PCI_MSIX_ENTRY_SIZE * 2;
+if (bar_size & (bar_size - 1)) {
+bar_size = 1 << qemu_fls(bar_size);
+}
+} else {
+/* For migration compatibility */
+bar_size = 4096;
+}
+
 if (proxy->nvectors &&
-msix_init_exclusive_bar(&proxy->pci_dev, proxy->nvectors, 1, 4096)) {
+msix_init_exclusive_bar(&proxy->pci_dev, proxy->nvectors, 1,
+bar_size)) {
 error_report("unable to init msix vectors to %" PRIu32,
  proxy->nvectors);
 proxy->nvect

Re: [Qemu-devel] RFC: -object vs -chardev creation order

2015-03-18 Thread Daniel P. Berrange
On Wed, Mar 18, 2015 at 10:43:25AM +0100, Markus Armbruster wrote:
> "Daniel P. Berrange"  writes:
> > A third option is to not process -object args in one go, instead process
> > each type of object at a time. eg we'd first create all the
> > -object tls-credential instances, then create the -chardev instances,
> > then create the -object rng-egd instances. This is probably the least
> > amount of work in short term, but not all that scalable, unless we do
> > a catch-all default case, so we only need code up hacks for a few
> > particular object types.
> >
> > Thus my gut feeling is to do option 3, but I'd like other opinions before
> > embarking on this
> 
> Piling yet another hack on top isn't great, put telling you "no more
> hacks" after everybody and his dog already got to pile on some doesn't
> feel fair.

I think I'll do this right now, since it should be a fast fix.

> That said: can we sort the creation work topologically?  Requires
> identifying references.  In the QemuOpts world, a reference is a
> string-valued parameter that is interpreted as ID in a well-known
> QemuOptsList.  Adding the necessary information to QemuOptDesc desc[]
> shouldn't be too hard.  The ones with empty desc[] could be troublesome,
> as usual.

This sounds intriguing though - we should have enough info around to
do a topological sort - its a shame that properties are only registered
at time the object is created, rather than being registered against the
classes, as then we wouldn't even need to add more info to the
QemuOptsList.  I'll have a think about this approach though and see if
I can come up with something that works

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[Qemu-devel] [PATCH v4 2/3] ehci: fix segfault when hot-unplugging ehci controller

2015-03-18 Thread arei.gonglei
From: Gonglei 

When hot-unplugging the usb controllers (ehci/uhci),
we have to clean all resouce of these devices,
involved registered reset handler. Otherwise, it
may cause NULL pointer access and/or segmentation fault
if we reboot the guest os after hot-unplugging.

Let's hook up reset via DeviceClass->reset() and drop
the qemu_register_reset() call. Then Qemu will register
and unregister the reset handler automatically.

Cc: qemu-stable 
Reported-by: Lidonglin 
Signed-off-by: Gonglei 
---
 hw/usb/hcd-ehci-pci.c| 10 ++
 hw/usb/hcd-ehci-sysbus.c | 10 ++
 hw/usb/hcd-ehci.c|  3 +--
 hw/usb/hcd-ehci.h|  1 +
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/hw/usb/hcd-ehci-pci.c b/hw/usb/hcd-ehci-pci.c
index 4c80707..7afa5f9 100644
--- a/hw/usb/hcd-ehci-pci.c
+++ b/hw/usb/hcd-ehci-pci.c
@@ -101,6 +101,15 @@ static void usb_ehci_pci_exit(PCIDevice *dev)
 }
 }
 
+static void usb_ehci_pci_reset(DeviceState *dev)
+{
+PCIDevice *pci_dev = PCI_DEVICE(dev);
+EHCIPCIState *i = PCI_EHCI(pci_dev);
+EHCIState *s = &i->ehci;
+
+ehci_reset(s);
+}
+
 static void usb_ehci_pci_write_config(PCIDevice *dev, uint32_t addr,
   uint32_t val, int l)
 {
@@ -143,6 +152,7 @@ static void ehci_class_init(ObjectClass *klass, void *data)
 k->config_write = usb_ehci_pci_write_config;
 dc->vmsd = &vmstate_ehci_pci;
 dc->props = ehci_pci_properties;
+dc->reset = usb_ehci_pci_reset;
 }
 
 static const TypeInfo ehci_pci_type_info = {
diff --git a/hw/usb/hcd-ehci-sysbus.c b/hw/usb/hcd-ehci-sysbus.c
index 19ed2c2..cd1cc14 100644
--- a/hw/usb/hcd-ehci-sysbus.c
+++ b/hw/usb/hcd-ehci-sysbus.c
@@ -42,6 +42,15 @@ static void usb_ehci_sysbus_realize(DeviceState *dev, Error 
**errp)
 sysbus_init_irq(d, &s->irq);
 }
 
+static void usb_ehci_sysbus_reset(DeviceState *dev)
+{
+SysBusDevice *d = SYS_BUS_DEVICE(dev);
+EHCISysBusState *i = SYS_BUS_EHCI(d);
+EHCIState *s = &i->ehci;
+
+ehci_reset(s);
+}
+
 static void ehci_sysbus_init(Object *obj)
 {
 SysBusDevice *d = SYS_BUS_DEVICE(obj);
@@ -70,6 +79,7 @@ static void ehci_sysbus_class_init(ObjectClass *klass, void 
*data)
 dc->realize = usb_ehci_sysbus_realize;
 dc->vmsd = &vmstate_ehci_sysbus;
 dc->props = ehci_sysbus_properties;
+dc->reset = usb_ehci_sysbus_reset;
 set_bit(DEVICE_CATEGORY_USB, dc->categories);
 }
 
diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index ccf54b6..2a84f46 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -845,7 +845,7 @@ static USBDevice *ehci_find_device(EHCIState *ehci, uint8_t 
addr)
 }
 
 /* 4.1 host controller initialization */
-static void ehci_reset(void *opaque)
+void ehci_reset(void *opaque)
 {
 EHCIState *s = opaque;
 int i;
@@ -2471,7 +2471,6 @@ void usb_ehci_realize(EHCIState *s, DeviceState *dev, 
Error **errp)
 s->async_bh = qemu_bh_new(ehci_frame_timer, s);
 s->device = dev;
 
-qemu_register_reset(ehci_reset, s);
 s->vmstate = qemu_add_vm_change_state_handler(usb_ehci_vm_state_change, s);
 }
 
diff --git a/hw/usb/hcd-ehci.h b/hw/usb/hcd-ehci.h
index 2bc259c..87b240f 100644
--- a/hw/usb/hcd-ehci.h
+++ b/hw/usb/hcd-ehci.h
@@ -325,6 +325,7 @@ extern const VMStateDescription vmstate_ehci;
 void usb_ehci_init(EHCIState *s, DeviceState *dev);
 void usb_ehci_realize(EHCIState *s, DeviceState *dev, Error **errp);
 void usb_ehci_unrealize(EHCIState *s, DeviceState *dev, Error **errp);
+void ehci_reset(void *opaque);
 
 #define TYPE_PCI_EHCI "pci-ehci-usb"
 #define PCI_EHCI(obj) OBJECT_CHECK(EHCIPCIState, (obj), TYPE_PCI_EHCI)
-- 
1.7.12.4





[Qemu-devel] [PATCH v4 3/3] ohci: fix resource cleanup leak

2015-03-18 Thread arei.gonglei
From: Gonglei 

When hot-unplugging the usb controllers (ehci/uhci),
we have to clean all resouce of these devices,
involved registered reset handler. Otherwise, it
may cause NULL pointer access and/or segmentation fault
if we reboot the guest os after hot-unplugging.

Let's hook up reset via DeviceClass->reset() and drop
the qemu_register_reset() call. Then Qemu will register
and unregister the reset handler automatically.

Ohci does't support hotplugging/hotunplugging yet, but
existing resource cleanup leak logic likes ehci/uhci.

Cc: qemu-stable 
Signed-off-by: Gonglei 
---
 hw/usb/hcd-ohci.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index a0d478e..ca7aeb3 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -1879,7 +1879,6 @@ static int usb_ohci_init(OHCIState *ohci, DeviceState 
*dev,
 usb_packet_init(&ohci->usb_packet);
 
 ohci->async_td = 0;
-qemu_register_reset(ohci_reset, ohci);
 
 return 0;
 }
@@ -1951,6 +1950,15 @@ static void usb_ohci_exit(PCIDevice *dev)
 }
 }
 
+static void usb_ohci_reset_pci(DeviceState *d)
+{
+PCIDevice *dev = PCI_DEVICE(d);
+OHCIPCIState *ohci = PCI_OHCI(dev);
+OHCIState *s = &ohci->state;
+
+ohci_reset(s);
+}
+
 #define TYPE_SYSBUS_OHCI "sysbus-ohci"
 #define SYSBUS_OHCI(obj) OBJECT_CHECK(OHCISysBusState, (obj), TYPE_SYSBUS_OHCI)
 
@@ -1976,6 +1984,14 @@ static void ohci_realize_pxa(DeviceState *dev, Error 
**errp)
 sysbus_init_mmio(sbd, &s->ohci.mem);
 }
 
+static void usb_ohci_reset_sysbus(DeviceState *dev)
+{
+OHCISysBusState *s = SYSBUS_OHCI(dev);
+OHCIState *ohci = &s->ohci;
+
+ohci_reset(ohci);
+}
+
 static Property ohci_pci_properties[] = {
 DEFINE_PROP_STRING("masterbus", OHCIPCIState, masterbus),
 DEFINE_PROP_UINT32("num-ports", OHCIPCIState, num_ports, 3),
@@ -2097,6 +2113,7 @@ static void ohci_pci_class_init(ObjectClass *klass, void 
*data)
 dc->props = ohci_pci_properties;
 dc->hotpluggable = false;
 dc->vmsd = &vmstate_ohci;
+dc->reset = usb_ohci_reset_pci;
 }
 
 static const TypeInfo ohci_pci_info = {
@@ -2120,6 +2137,7 @@ static void ohci_sysbus_class_init(ObjectClass *klass, 
void *data)
 set_bit(DEVICE_CATEGORY_USB, dc->categories);
 dc->desc = "OHCI USB Controller";
 dc->props = ohci_sysbus_properties;
+dc->reset = usb_ohci_reset_sysbus;
 }
 
 static const TypeInfo ohci_sysbus_info = {
-- 
1.7.12.4





Re: [Qemu-devel] [v7][PATCH 03/10] piix: create host bridge to passthrough

2015-03-18 Thread Gerd Hoffmann
On Mi, 2015-03-18 at 17:06 +0800, Tiejun Chen wrote:
> Implement a pci host bridge specific to passthrough. Actually
> this just inherits the standard one. And we also just expose
> a minimal real host bridge pci configuration subset.

> +/* Here we just expose minimal host bridge offset subset. */
> +static const IGDHostInfo igd_host_bridge_infos[] = {
> +{0x08, 2},  /* revision id */
> +{0x2c, 2},  /* sybsystem vendor id */
> +{0x2e, 2},  /* sybsystem id */
> +{0x50, 2},  /* SNB: processor graphics control register */
> +{0x52, 2},  /* processor graphics control register */
> +{0xa4, 4},  /* SNB: graphics base of stolen memory */
> +{0xa8, 4},  /* SNB: base of GTT stolen memory */
> +};

Hmm, no vendor/device id here?  Will the idg guest drivers happily read
graphics control registers from i440fx even though this chipset never
existed in a igd variant?

[ just wondering, if it works that way fine, it certainly makes things
  easier for the firmware which uses host bridge pci id to figure
  whenever it is running @ i440fx or q35 ].

> +static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev)
> +{
> +uint32_t val = 0;
> +int rc, i, num;
> +int pos, len;
> +
> +num = ARRAY_SIZE(igd_host_bridge_infos);
> +for (i = 0; i < num; i++) {
> +pos = igd_host_bridge_infos[i].offset;
> +len = igd_host_bridge_infos[i].len;
> +rc = host_pci_config_read(pos, len, val);
> +if (rc) {
> +return -ENODEV;
> +}
> +pci_default_write_config(pci_dev, pos, val, len);
> +}
> +
> +return 0;
> +}

Nothing i440fx specific in here, just copying some host bridge pci
config space bits.  I guess we can sub-class the q35 host bridge the
same way and even reuse the init function?

> +#define TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE "xen-igd-passthrough-i440FX"

One xen leftover slipped through here.

cheers,
  Gerd





Re: [Qemu-devel] [RFC 0/5] Memory transaction attributes API

2015-03-18 Thread Peter Maydell
On 18 March 2015 at 08:38, Edgar E. Iglesias  wrote:
> 2. The series introduces the read and write _with_attrs. Another
> approach could be to add a (for example) single .access callback.
> The callback could take a structure pointer as input, e.g:
>
> struct MemTx {
> bool rw;
> hwaddr addr;
> int size;
> uint64_t data;
> MemTxAttrs attrs;
> }
>
> MemTxResult access(MemTx *tx)
>
> The benefit I see is that this will make it easier for us in the
> future to add new fields if needed.

I see the reasoning, but I looked at this and it just felt
awkward when I tried writing the code that way -- the code
calling the access function ends up having to create and
mutate this MemTx struct as it loops round calling the
access callback. In particular, memory.c has kept the
read and write paths separate all the way through, so
squashing them back together into a single callback
which will then immediately want to split them back out
into read vs write seemed a bit daft.

-- PMM



Re: [Qemu-devel] [v2][PATCH 2/2] libxl: introduce gfx_passthru_kind

2015-03-18 Thread Ian Campbell
On Wed, 2015-03-18 at 15:32 +0800, Chen, Tiejun wrote:
> > I think the _libxl_ message needs to be just "Unable to detect graphics
> > passthru kind". i.e. it can't/shouldn't reference anything to do with xl
> > config options etc (which would make no sense if libvirt was being
> > used).
> >
> > That's not very user friendly though, so you may want to consider adding
> > a new specific error code for this case and returning it here, such that
> > xl or libvirt can then give a more comprehensible error message.
> 
> But,
>  args = libxl__build_device_model_args(gc, "stubdom-dm", guest_domid,
>guest_config, d_state, NULL);
>   |
>   + return libxl__build_device_model_args_new(gc, dm,guest_domid, 
> guest_config, state, dm_state_fd);
>   |
>   +  Currently we always return "NULL" inside.
> 
>  if (!args) {
>  ret = ERROR_FAIL;
>  goto out;
>  }
> 
> So I'm not very sure how to pass this new specific error code to xl/libvirt.

Bah, yes. I don't think it is reasonable to ask you to rework the error
handling here completely for this. Lets leave this as a potential future
enhancement.

> Thanks but I guess I'd better to paste this whole patch to avoid I'm 
> still missing something :)
> 
> ---
>   tools/libxl/libxl.h  |  6 ++
>   tools/libxl/libxl_dm.c   | 25 ++---
>   tools/libxl/libxl_internal.h |  5 +
>   tools/libxl/libxl_types.idl  |  6 ++
>   tools/libxl/xl_cmdimpl.c | 14 --
>   5 files changed, 51 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 6bbc52d..62b3ae5 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -720,6 +720,12 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, 
> libxl_mac *src);
>   #define LIBXL_HAVE_PSR_MBM 1
>   #endif
> 
> +/*
> + * libxl_domain_build_info has the u.hvm.gfx_passthru_kind field and
> + * the libxl_gfx_passthru_kind enumeration is defined.
> +*/
> +#define LIBXL_HAVE_GFX_PASSTHRU_KIND
> +
>   typedef char **libxl_string_list;
>   void libxl_string_list_dispose(libxl_string_list *sl);
>   int libxl_string_list_length(const libxl_string_list *sl);
> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
> index 8599a6a..045d48a 100644
> --- a/tools/libxl/libxl_dm.c
> +++ b/tools/libxl/libxl_dm.c
> @@ -710,9 +710,6 @@ static char ** 
> libxl__build_device_model_args_new(libxl__gc *gc,
>   flexarray_append(dm_args, "-net");
>   flexarray_append(dm_args, "none");
>   }
> -if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
> -flexarray_append(dm_args, "-gfx_passthru");
> -}
>   } else {
>   if (!sdl && !vnc) {
>   flexarray_append(dm_args, "-nographic");
> @@ -757,6 +754,28 @@ static char ** 
> libxl__build_device_model_args_new(libxl__gc *gc,
>   machinearg, max_ram_below_4g);
>   }
>   }
> +
> +if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
> +switch (b_info->u.hvm.gfx_passthru_kind) {
> +case _info->u.hvm.gfx_passthru_kind:
> +if (libxl__is_igd_vga_passthru(gc, guest_config)) {
> +machinearg = GCSPRINTF("%s,igd-passthru=on", 
> machinearg);
> +} else {
> +LOG(ERROR, "Unable to detect graphics passthru kind,"
> +" please set gfx_passthru_kind. See xl.cfg(5) 
> for more"
> +" information.\n");

Please change this to "Unable to detect graphics passthru kind" to avoid
talking xl-isms in libxl.

> +return NULL;
> +}
> +break;
> +case LIBXL_GFX_PASSTHRU_KIND_IGD:
> +machinearg = GCSPRINTF("%s,igd-passthru=on", machinearg);
> +break;

This duplicates the code from above. I think this would be best done as:

static int libxl__detect_gfx_passthru_kind(libxl__gc *gc, guest_config)
{
if (b_info->u.hvm.gfx_passthru_kind != LIBXL_GFX_PASSTHRU_KIND_DEFAULT)
return 0;

if (libxl__is_igd_vga_passthru(gc, guest_config)) {
b_info->u.hvm.gfx_passthru_kind = LIBXL_GFX_PASSTHRU_KIND_IGD;
return 0;
}

LOG(ERROR, "Unable to detect graphics passthru kind");
return 1;
}

Then for the code in libxl__build_device_model_args_new:

 if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
 if (!libxl__detect_gfx_passthru_kind(gc, guest_config))
  return NULL
 switch (b_info->u.hvm.gfx_passthru_kind) {
 case LIBXL_GFX_PASSTHRU_KIND_IGD:
 machinearg = GCSPRINTF("%s,igd-passthru=on", machinearg);
 break;
 default:
 LOG(ERROR, "unknown gfx_passthru_kind\n");
return NULL;
 }
}

Tha

Re: [Qemu-devel] [PATCH for-2.3] numa: pc: fix default VCPU to node mapping

2015-03-18 Thread Igor Mammedov
On Tue, 17 Mar 2015 13:42:36 -0300
Eduardo Habkost  wrote:

> On Tue, Mar 17, 2015 at 03:48:38PM +, Igor Mammedov wrote:
> > since commit
> >dd0247e0 pc: acpi: mark all possible CPUs as enabled in SRAT
> > Linux kernel actually tries to use CPU to Node mapping from
> > QEMU provided SRAT table instead of discarding it, and that
> > in some cases breaks build_sched_domains() which expects
> > sane mapping where cores/threads belonging to the same socket
> > are on the same NUMA node.
> > 
> > With current default round-robin mapping of VCPUs to nodes
> > guest ends-up with cores/threads belonging to the same socket
> > being on different NUMA nodes.
> > 
> > For example with following CLI:
> > qemu-kvm -m 4G -smp 5,sockets=1,cores=4,threads=1,maxcpus=8 \
> >  -numa node,nodeid=0 -numa node,nodeid=1
> > 2.6.32 based kernels will hang on boot due to incorrectly build
> > sched_group-s list in update_sd_lb_stats()
> > so comment in QEMU justifying dumb default mapping:
> >  "
> >   guest OSes must cope with this anyway, because there are BIOSes
> >   out there in real machines which also use this scheme.
> >  "
> > isn't really valid.
> > 
> > Replacing default mapping withi a manual, where VCPUs belonging to
> > the same socket are on the same NUMA node, fixes issue for
> > guests which can't handle nonsense topology i.e. cnaging CLI to:
> >   -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
> > 
> > So instead of simply scattering VCPUs around nodes, map
> > the same socket VCPUs to the same NUMA node, which is what
> > guest would expect from a sane hardware/BIOS.
> > 
> > Signed-off-by: Igor Mammedov 
> 
> I believe the proposed behavior is much better. But if we are going to
> break compatibility, shouldn't we at least do that before the first -rc
> so we get feedback in case it break existing configurations?
Aren't existing setups that actually care about node mapping typically do not 
use default mapping in favor of manually specifying which CPUs go to which node?
BTW what it would break on already broken configuration?

My thinking on this was that we prefer to not keep bugs in ACPI tables and
fix them instead, since it doesn't affect currently running (migrating)
guest and allows restarted guest use fixed ACPI tables.

Considering that CPU info from SRAT is actually used only since 2.2,
I'd prefer to fix it in 2.3 and 2.2 stable then allow default buggy SRAT
spread over several releases.


> 
> About qemu_cpu_socket_id_from_index(): all qemu-system-* binaries have
> smp_cores and smp_threads available (even if machines ignore it), but
> the default stub can return values that are larger than the number of
> sockets if smp_cores*smp_threads > 1, which would be obviously
> incorrect. Isn't it easier to simply make
> "cpu_index/(smp_cores*smp_sockets)" be the default cpu_index->socket
> mapping function, and allow machine-specific (not arch-specific)
> overrides if necessary?
Since it's hard-freeze now, patch focuses on fixing PC machine (which I can 
test)
and do not affecting other targets (i.e. keep legacy behavior for them)
that's why stub returns cpu_index. So I'd prefer hold off global
"cpu_index/(smp_cores*smp_sockets)" change for now.

So would something like below acceptable 2.3?

 node_id = cpu_index % nb_numa_nodes;
 if (machine_class->socket_from_cpu_index) {
 node_id = machine_class->socket_from_cpu_index(cpu_index) % nb_numa_nodes;
 }
  


> 
> 
> > ---
> >  include/sysemu/cpus.h |  3 +++
> >  numa.c| 14 ++
> >  stubs/Makefile.objs   |  1 +
> >  stubs/qemu_cpu_socket_id_from_index.c |  6 ++
> >  target-i386/cpu.c | 11 +++
> >  5 files changed, 31 insertions(+), 4 deletions(-)
> >  create mode 100644 stubs/qemu_cpu_socket_id_from_index.c
> > 
> > diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
> > index 3f162a9..aacabcb 100644
> > --- a/include/sysemu/cpus.h
> > +++ b/include/sysemu/cpus.h
> > @@ -1,5 +1,6 @@
> >  #ifndef QEMU_CPUS_H
> >  #define QEMU_CPUS_H
> > +#include "qemu-common.h"
> >  
> >  /* cpus.c */
> >  void qemu_init_cpu_loop(void);
> > @@ -18,6 +19,8 @@ void qtest_clock_warp(int64_t dest);
> >  /* vl.c */
> >  extern int smp_cores;
> >  extern int smp_threads;
> > +
> > +unsigned qemu_cpu_socket_id_from_index(unsigned int cpu_index);
> >  #else
> >  /* *-user doesn't have configurable SMP topology */
> >  #define smp_cores   1
> > diff --git a/numa.c b/numa.c
> > index ffbec68..5297749 100644
> > --- a/numa.c
> > +++ b/numa.c
> > @@ -26,6 +26,7 @@
> >  #include "exec/cpu-common.h"
> >  #include "qemu/bitmap.h"
> >  #include "qom/cpu.h"
> > +#include "sysemu/cpus.h"
> >  #include "qemu/error-report.h"
> >  #include "include/exec/cpu-common.h" /* for RAM_ADDR_FMT */
> >  #include "qapi-visit.h"
> > @@ -233,13 +234,18 @@ void parse_numa_opts(void)
> >  break;
> >  }
> >  }
> > -/* assigning 

Re: [Qemu-devel] [PATCH for-2.3] numa: pc: fix default VCPU to node mapping

2015-03-18 Thread Igor Mammedov
On Tue, 17 Mar 2015 17:59:48 +0100
Andreas Färber  wrote:

> Am 17.03.2015 um 17:42 schrieb Eduardo Habkost:
> > On Tue, Mar 17, 2015 at 03:48:38PM +, Igor Mammedov wrote:
> >> since commit
> >>dd0247e0 pc: acpi: mark all possible CPUs as enabled in SRAT
> >> Linux kernel actually tries to use CPU to Node mapping from
> >> QEMU provided SRAT table instead of discarding it, and that
> >> in some cases breaks build_sched_domains() which expects
> >> sane mapping where cores/threads belonging to the same socket
> >> are on the same NUMA node.
> >>
> >> With current default round-robin mapping of VCPUs to nodes
> >> guest ends-up with cores/threads belonging to the same socket
> >> being on different NUMA nodes.
> >>
> >> For example with following CLI:
> >> qemu-kvm -m 4G -smp 5,sockets=1,cores=4,threads=1,maxcpus=8 \
> >>  -numa node,nodeid=0 -numa node,nodeid=1
> >> 2.6.32 based kernels will hang on boot due to incorrectly build
> >> sched_group-s list in update_sd_lb_stats()
> >> so comment in QEMU justifying dumb default mapping:
> >>  "
> >>   guest OSes must cope with this anyway, because there are BIOSes
> >>   out there in real machines which also use this scheme.
> >>  "
> >> isn't really valid.
> >>
> >> Replacing default mapping withi a manual, where VCPUs belonging to
> >> the same socket are on the same NUMA node, fixes issue for
> >> guests which can't handle nonsense topology i.e. cnaging CLI to:
> >>   -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
> >>
> >> So instead of simply scattering VCPUs around nodes, map
> >> the same socket VCPUs to the same NUMA node, which is what
> >> guest would expect from a sane hardware/BIOS.
> >>
> >> Signed-off-by: Igor Mammedov 
> > 
> > I believe the proposed behavior is much better. But if we are going to
> > break compatibility, shouldn't we at least do that before the first -rc
> > so we get feedback in case it break existing configurations?
> > 
> > About qemu_cpu_socket_id_from_index(): all qemu-system-* binaries have
> > smp_cores and smp_threads available (even if machines ignore it), but
> > the default stub can return values that are larger than the number of
> > sockets if smp_cores*smp_threads > 1, which would be obviously
> > incorrect. Isn't it easier to simply make
> > "cpu_index/(smp_cores*smp_sockets)" be the default cpu_index->socket
> > mapping function, and allow machine-specific (not arch-specific)
> > overrides if necessary?
> 
> Agree that the proposed stub solution is not so nice. Can you propose a
> MachineClass based solution instead?
sure

> 
> The example I keep bringing up for x86 is that the Galileo boards or
> even the Minnow boards don't really have sockets, being a SoC.
> 
> Thanks,
> Andreas
> 




Re: [Qemu-devel] [PATCH] exec: Respect as_tranlsate_internal length clamp

2015-03-18 Thread Paolo Bonzini


On 17/03/2015 06:35, Peter Crosthwaite wrote:
> address_space_translate_internal will clamp the *plen length argument
> based on the size of the memory region being queried. The iommu walker
> logic in addresss_space_translate was ignoring this by discarding the
> post fn call value of *plen. Fix by just always using *plen as the
> length argument throughout the fn, removing the len local variable.
> 
> This fixes a bootloader bug when a single elf section spans multiple
> QEMU memory regions.
> 
> Signed-off-by: Peter Crosthwaite 
> ---
>  exec.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index e97071a..8b922db 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -380,7 +380,6 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
> hwaddr addr,
>  IOMMUTLBEntry iotlb;
>  MemoryRegionSection *section;
>  MemoryRegion *mr;
> -hwaddr len = *plen;
>  
>  rcu_read_lock();
>  for (;;) {
> @@ -395,7 +394,7 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
> hwaddr addr,
>  iotlb = mr->iommu_ops->translate(mr, addr, is_write);
>  addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
>  | (addr & iotlb.addr_mask));
> -len = MIN(len, (addr | iotlb.addr_mask) - addr + 1);
> +*plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
>  if (!(iotlb.perm & (1 << is_write))) {
>  mr = &io_mem_unassigned;
>  break;
> @@ -406,10 +405,9 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
> hwaddr addr,
>  
>  if (xen_enabled() && memory_access_is_direct(mr, is_write)) {
>  hwaddr page = ((addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE) - addr;
> -len = MIN(page, len);
> +*plen = MIN(page, *plen);
>  }
>  
> -*plen = len;
>  *xlat = addr;
>  rcu_read_unlock();
>  return mr;
> 

Applied, thanks.

Paolo



Re: [Qemu-devel] [PATCH] migration: flush the bdrv before stopping VM

2015-03-18 Thread Kevin Wolf
[ Cc: qemu-block ]

Am 18.03.2015 um 04:19 hat Li, Liang Z geschrieben:
> > This needs further review/changes on the block layer.
> > 
> > First explanation, why I think this don't fix the full problem.
> > Whith this patch, we fix the problem where we have a dirty block layer but
> > basically nothing dirtying the memory on the guest (we are moving the 20
> > seconds from max_downtime for the blocklayer flush), to 20 seconds until
> > we have decided that the amount of dirty memory is small enough to be
> > transferred during max_downtime.  But it is still going to take 20 seconds 
> > to
> > flush the block layer, and during that 20 seconds, the amount of memory that
> > can be dirty is HUGE.
> 
> It's true.

What kind of cache is it actually that takes 20s to flush here?

Inside of qemu, we don't cache a whole lot. If you use qcow2, we do use
a writeback cache for some metadata that might need to be written back,
but it is small and shouldn't take any significant time.

Then we have the kernel page cache, or for some network protocols caches
in the respective libs. This sounds like the right size for a 20s stall,
but don't we require cache.direct=on for live migration anyway for
coherency, i.e. bypassing any such cache?

Lastly there may be a disk cache, but it's too small either.

> > I think our ouptions are:
> > 
> > - tell the block layer at the beggining of migration
> >   Hey, we are migrating, could you please start flusing data now, and
> >   don't get the caches to grow too much, please, pretty please.
> >   (I left the API to the block layer)
> > - Add on that point a new function:
> >   bdrvr_flush_all_start()
> >   That starts the sending of pages, and we "hope" that by the time that
> >   we have migrated all memory, they have also finished (so our last
> >   call to block_flush_all() have less work to do)
> > - Add another function:
> >   int bdrv_flush_all_timeout(int timeout)
> >   that returns if timeout pass, telling if it has migrated all pages or
> >   timeout has passed.  So we can got back to the iterative stage if it
> >   has taken too long.

The problem is that the block layer really doesn't have an option to
control what is getting synced once the data is cached outside of qemu.
Essentially we can do an fdatasync() or we can leave it, that's the only
choice we have.

Now doing an asynchronous fdatasync() in the background is completely
reasonable in order to reduce the amount of data to be flushed later.
But the patch is doing it while holding both the BQL and the AIOContext
lock of the device, which doesn't feel right. Maybe it should schedule a
BH in the AIOContext instead and flush from there asynchronously.


The other thing is that flushing once doesn't mean that new data isn't
accumulating in the cache, especially if you decide to do the background
flush at the start of the migration.

The obvious way to avoid that would be to switch to a writethrough mode,
so any write request goes directly to the disk. This will, however,
impact performance so heavily that it's probably not a realistic option.

An alternative approach could be to repeat the background flush
periodically, either time based or after every x bytes that are written
to a device. Time based should actually be quite easy to implement.


Once we have the flushing in the background, the migration code can
apply any timeouts it wants. If the timeout is exceeded, the flush
wouldn't be aborted but keep running in the background, but migration
can go back to the iterative state anyway.

> > Notice that *normally* bdrv_flush_all() is very fast, the problem is that
> > sometimes it get really, really slow (NFS decided to go slow, TCP drop a
> > packet, whatever).
> > 
> > Right now, we don't have an interface to detect that cases and got back to
> > the iterative stage.
> 
> How about go back to the iterative stage when detect that the pending_size is 
> larger
> Than max_size, like this:
> 
> +/* do flush here is aimed to shorten the VM downtime,
> + * bdrv_flush_all is a time consuming operation
> + * when the guest has done some file writing */
> +bdrv_flush_all();
> +pending_size = qemu_savevm_state_pending(s->file, max_size);
> +if (pending_size && pending_size >= max_size) {
> +qemu_mutex_unlock_iothread();
> +continue;
> +}
>   ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>   if (ret >= 0) {
>   qemu_file_set_rate_limit(s->file, INT64_MAX);
> 
> and this is quite simple.

Yes, but it is too simple. If you hold all the locks during
bdrv_flush_all(), your VM will effectively stop as soon as it performs
the next I/O access, so you don't win much. And you still don't have a
timeout for cases where the flush takes really long.

Kevin



[Qemu-devel] [PULL 01/19] nbd: Fix overflow return value

2015-03-18 Thread Paolo Bonzini
From: Yik Fang 

The value of reply.error should be the type unsigned int.

Signed-off-by: Yik Fang 
Message-Id: <1423722111-12902-1-git-send-email-eric.fan...@huawei.com>
Signed-off-by: Paolo Bonzini 
---
 nbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/nbd.c b/nbd.c
index 71159af..02ba0fe 100644
--- a/nbd.c
+++ b/nbd.c
@@ -1295,7 +1295,7 @@ static void nbd_trip(void *opaque)
 default:
 LOG("invalid request type (%u) received", request.type);
 invalid_request:
-reply.error = -EINVAL;
+reply.error = EINVAL;
 error_reply:
 if (nbd_co_send_reply(req, &reply, 0) < 0) {
 goto out;
-- 
2.3.0





[Qemu-devel] [PULL 06/19] nbd: Handle blk_getlength() failure

2015-03-18 Thread Paolo Bonzini
From: Max Reitz 

Signed-off-by: Max Reitz 
Message-Id: <1424887718-10800-9-git-send-email-mre...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 blockdev-nbd.c  |  6 +-
 include/block/nbd.h |  3 ++-
 nbd.c   | 16 ++--
 qemu-nbd.c  | 10 +-
 4 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 22e95d1..b29e456 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -105,7 +105,11 @@ void qmp_nbd_server_add(const char *device, bool 
has_writable, bool writable,
 writable = false;
 }
 
-exp = nbd_export_new(blk, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY, NULL);
+exp = nbd_export_new(blk, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY, NULL,
+ errp);
+if (!exp) {
+return;
+}
 
 nbd_export_set_name(exp, device);
 
diff --git a/include/block/nbd.h b/include/block/nbd.h
index ca9a5ac..2c20138 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -86,7 +86,8 @@ typedef struct NBDExport NBDExport;
 typedef struct NBDClient NBDClient;
 
 NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size,
-  uint32_t nbdflags, void (*close)(NBDExport *));
+  uint32_t nbdflags, void (*close)(NBDExport *),
+  Error **errp);
 void nbd_export_close(NBDExport *exp);
 void nbd_export_get(NBDExport *exp);
 void nbd_export_put(NBDExport *exp);
diff --git a/nbd.c b/nbd.c
index 34f4dbb..8837e75 100644
--- a/nbd.c
+++ b/nbd.c
@@ -966,7 +966,8 @@ static void blk_aio_detach(void *opaque)
 }
 
 NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size,
-  uint32_t nbdflags, void (*close)(NBDExport *))
+  uint32_t nbdflags, void (*close)(NBDExport *),
+  Error **errp)
 {
 NBDExport *exp = g_malloc0(sizeof(NBDExport));
 exp->refcount = 1;
@@ -974,7 +975,14 @@ NBDExport *nbd_export_new(BlockBackend *blk, off_t 
dev_offset, off_t size,
 exp->blk = blk;
 exp->dev_offset = dev_offset;
 exp->nbdflags = nbdflags;
-exp->size = size == -1 ? blk_getlength(blk) : size;
+exp->size = size < 0 ? blk_getlength(blk) : size;
+if (exp->size < 0) {
+error_setg_errno(errp, -exp->size,
+ "Failed to determine the NBD export's length");
+goto fail;
+}
+exp->size -= exp->size % BDRV_SECTOR_SIZE;
+
 exp->close = close;
 exp->ctx = blk_get_aio_context(blk);
 blk_ref(blk);
@@ -986,6 +994,10 @@ NBDExport *nbd_export_new(BlockBackend *blk, off_t 
dev_offset, off_t size,
  */
 blk_invalidate_cache(blk, NULL);
 return exp;
+
+fail:
+g_free(exp);
+return NULL;
 }
 
 NBDExport *nbd_export_find(const char *name)
diff --git a/qemu-nbd.c b/qemu-nbd.c
index d8daf1d..0cb0e4e 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -717,6 +717,10 @@ int main(int argc, char **argv)
 
 bs->detect_zeroes = detect_zeroes;
 fd_size = blk_getlength(blk);
+if (fd_size < 0) {
+errx(EXIT_FAILURE, "Failed to determine the image length: %s",
+ strerror(-fd_size));
+}
 
 if (partition != -1) {
 ret = find_partition(blk, partition, &dev_offset, &fd_size);
@@ -726,7 +730,11 @@ int main(int argc, char **argv)
 }
 }
 
-exp = nbd_export_new(blk, dev_offset, fd_size, nbdflags, 
nbd_export_closed);
+exp = nbd_export_new(blk, dev_offset, fd_size, nbdflags, nbd_export_closed,
+ &local_err);
+if (!exp) {
+errx(EXIT_FAILURE, "%s", error_get_pretty(local_err));
+}
 
 if (sockpath) {
 fd = unix_socket_incoming(sockpath);
-- 
2.3.0





[Qemu-devel] [PULL 03/19] qemu-nbd: Detect unused partitions by system == 0

2015-03-18 Thread Paolo Bonzini
From: Max Reitz 

Unused partitions do not necessarily have a total sector count of 0
(although they should have), but they always do have the system field
set to 0, so use that for testing whether a partition is in use rather
than the sector count field alone.

Signed-off-by: Max Reitz 
Message-Id: <1424887718-10800-3-git-send-email-mre...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 qemu-nbd.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 064b000..d8daf1d 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -167,8 +167,9 @@ static int find_partition(BlockBackend *blk, int partition,
 for (i = 0; i < 4; i++) {
 read_partition(&data[446 + 16 * i], &mbr[i]);
 
-if (!mbr[i].nb_sectors_abs)
+if (!mbr[i].system || !mbr[i].nb_sectors_abs) {
 continue;
+}
 
 if (mbr[i].system == 0xF || mbr[i].system == 0x5) {
 struct partition_record ext[4];
@@ -182,8 +183,9 @@ static int find_partition(BlockBackend *blk, int partition,
 
 for (j = 0; j < 4; j++) {
 read_partition(&data1[446 + 16 * j], &ext[j]);
-if (!ext[j].nb_sectors_abs)
+if (!ext[j].system || !ext[j].nb_sectors_abs) {
 continue;
+}
 
 if ((ext_partnum + j + 1) == partition) {
 *offset = (uint64_t)ext[j].start_sector_abs << 9;
-- 
2.3.0





[Qemu-devel] [PULL 00/19] Misc bugfixes for 2.3.0-rc1

2015-03-18 Thread Paolo Bonzini
The following changes since commit cd232acfa0d70002fed89e9293f04afda577a513:

  Update version for v2.3.0-rc0 release (2015-03-17 18:58:33 +)

are available in the git repository at:

  git://github.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to c3c1bb99d1c11978d9ce94d1bdcf0705378c1459:

  exec: Respect as_tranlsate_internal length clamp (2015-03-18 12:09:42 +0100)


- kvm: ioeventfd fix for PPC64LE
- virtio-scsi: misc fixes
- fix for --enable-profiler
- nbd: fixes from Max
- build: fix for scripts/make_device_config.sh
- exec: fix for address_space_translate



Let's see if patchew likes the Message-Id tags from git-am. :)

Alexey Kardashevskiy (1):
  profiler: Reenable built-in profiler

Bo Su (1):
  virtio-scsi-dataplane: fix memory leak in virtio_scsi_vring_init

Fam Zheng (1):
  virtio-scsi: Fix assert in virtio_scsi_push_event

Greg Kurz (1):
  kvm: fix ioeventfd endianness on bi-endian architectures

Max Reitz (12):
  util/uri: Add overflow check to rfc3986_parse_port
  qemu-nbd: Detect unused partitions by system == 0
  nbd: Fix nbd_establish_connection()'s return value
  nbd: Pass return value from nbd_handle_list()
  nbd: Handle blk_getlength() failure
  qemu-nbd: fork() can fail
  nbd: Fix potential signed overflow issues
  nbd: Set block size to BDRV_SECTOR_SIZE
  nbd: Fix nbd_receive_options()
  nbd: Fix interpretation of the export flags
  nbd: Drop unexpected data for NBD_OPT_LIST
  coroutine-io: Return -errno in case of error

Paolo Bonzini (1):
  build: pass .d file name to scripts/make_device_config.sh, fix makefile 
target

Peter Crosthwaite (1):
  exec: Respect as_tranlsate_internal length clamp

Yik Fang (1):
  nbd: Fix overflow return value

 Makefile|   3 +-
 block/nbd-client.c  |   3 +-
 block/nbd-client.h  |   1 -
 block/nbd.c |   2 +-
 blockdev-nbd.c  |   6 ++-
 cpus.c  |   2 +-
 exec.c  |   6 +--
 hw/scsi/virtio-scsi-dataplane.c |   4 +-
 hw/scsi/virtio-scsi.c   |   8 ++-
 include/block/nbd.h |  11 ++--
 include/qemu/timer.h|   5 +-
 kvm-all.c   |  24 -
 monitor.c   |   6 +--
 nbd.c   | 112 
 qemu-coroutine-io.c |   2 +-
 qemu-nbd.c  |  30 +++
 scripts/make_device_config.sh   |  18 ---
 util/uri.c  |  24 +
 18 files changed, 178 insertions(+), 89 deletions(-)
-- 
2.3.0




[Qemu-devel] [PULL 04/19] nbd: Fix nbd_establish_connection()'s return value

2015-03-18 Thread Paolo Bonzini
From: Max Reitz 

unix_connect_opts() and inet_connect_opts() do not necessarily set errno
(if at all); therefore, nbd_establish_connection() should not literally
return -errno on error.

Signed-off-by: Max Reitz 
Message-Id: <1424887718-10800-4-git-send-email-mre...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 block/nbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/nbd.c b/block/nbd.c
index 6634a69..2176186 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -248,7 +248,7 @@ static int nbd_establish_connection(BlockDriverState *bs, 
Error **errp)
 /* Failed to establish connection */
 if (sock < 0) {
 logout("Failed to establish connection to NBD server\n");
-return -errno;
+return -EIO;
 }
 
 return sock;
-- 
2.3.0





[Qemu-devel] [PULL 02/19] util/uri: Add overflow check to rfc3986_parse_port

2015-03-18 Thread Paolo Bonzini
From: Max Reitz 

And while at it, replace tabs by eight spaces in this function.

Signed-off-by: Max Reitz 
Message-Id: <1424887718-10800-2-git-send-email-mre...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 util/uri.c | 24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/util/uri.c b/util/uri.c
index 1cfd78b..550b984 100644
--- a/util/uri.c
+++ b/util/uri.c
@@ -320,19 +320,23 @@ static int
 rfc3986_parse_port(URI *uri, const char **str)
 {
 const char *cur = *str;
+int port = 0;
 
 if (ISA_DIGIT(cur)) {
-   if (uri != NULL)
-   uri->port = 0;
-   while (ISA_DIGIT(cur)) {
-   if (uri != NULL)
-   uri->port = uri->port * 10 + (*cur - '0');
-   cur++;
-   }
-   *str = cur;
-   return(0);
+while (ISA_DIGIT(cur)) {
+port = port * 10 + (*cur - '0');
+if (port > 65535) {
+return 1;
+}
+cur++;
+}
+if (uri) {
+uri->port = port;
+}
+*str = cur;
+return 0;
 }
-return(1);
+return 1;
 }
 
 /**
-- 
2.3.0





[Qemu-devel] [PULL 18/19] virtio-scsi-dataplane: fix memory leak in virtio_scsi_vring_init

2015-03-18 Thread Paolo Bonzini
From: Bo Su 

if k->set_host_notifier failed, VirtIOSCSIVring *r will leak

Signed-off-by: Bo Su 
Message-Id: <1426671732-80213-1-git-send-email-su...@huawei.com>
Reviewed-by: Fam Zheng 
Reviewed-by: Gonglei 
Signed-off-by: Paolo Bonzini 
---
 hw/scsi/virtio-scsi-dataplane.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 3f40ff0..c069cd7 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -45,7 +45,7 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
 {
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s)));
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
-VirtIOSCSIVring *r = g_slice_new(VirtIOSCSIVring);
+VirtIOSCSIVring *r;
 int rc;
 
 /* Set up virtqueue notify */
@@ -56,6 +56,8 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
 s->dataplane_fenced = true;
 return NULL;
 }
+
+r = g_slice_new(VirtIOSCSIVring);
 r->host_notifier = *virtio_queue_get_host_notifier(vq);
 r->guest_notifier = *virtio_queue_get_guest_notifier(vq);
 aio_set_event_notifier(s->ctx, &r->host_notifier, handler);
-- 
2.3.0





[Qemu-devel] [PULL 05/19] nbd: Pass return value from nbd_handle_list()

2015-03-18 Thread Paolo Bonzini
From: Max Reitz 

While it does not make a difference in practice, nbd_receive_options()
generally returns -errno, so it should do that here as well; and the
easiest way to achieve this is by passing on the value returned by
nbd_handle_list().

Signed-off-by: Max Reitz 
Message-Id: <1424887718-10800-7-git-send-email-mre...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 nbd.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/nbd.c b/nbd.c
index 02ba0fe..34f4dbb 100644
--- a/nbd.c
+++ b/nbd.c
@@ -351,7 +351,7 @@ fail:
 static int nbd_receive_options(NBDClient *client)
 {
 while (1) {
-int csock = client->sock;
+int csock = client->sock, ret;
 uint32_t tmp, length;
 uint64_t magic;
 
@@ -398,8 +398,9 @@ static int nbd_receive_options(NBDClient *client)
 TRACE("Checking option");
 switch (be32_to_cpu(tmp)) {
 case NBD_OPT_LIST:
-if (nbd_handle_list(client, length) < 0) {
-return 1;
+ret = nbd_handle_list(client, length);
+if (ret < 0) {
+return ret;
 }
 break;
 
-- 
2.3.0





[Qemu-devel] [PULL 09/19] nbd: Set block size to BDRV_SECTOR_SIZE

2015-03-18 Thread Paolo Bonzini
From: Max Reitz 

Signed-off-by: Max Reitz 
Message-Id: <1424887718-10800-13-git-send-email-mre...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 block/nbd-client.c  |  3 +--
 block/nbd-client.h  |  1 -
 include/block/nbd.h |  4 ++--
 nbd.c   | 15 +++
 qemu-nbd.c  |  5 ++---
 5 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 259f5a3..e1bb919 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -386,8 +386,7 @@ int nbd_client_init(BlockDriverState *bs, int sock, const 
char *export,
 logout("session init %s\n", export);
 qemu_set_block(sock);
 ret = nbd_receive_negotiate(sock, export,
-&client->nbdflags, &client->size,
-&client->blocksize, errp);
+&client->nbdflags, &client->size, errp);
 if (ret < 0) {
 logout("Failed to negotiate with the NBD server\n");
 closesocket(sock);
diff --git a/block/nbd-client.h b/block/nbd-client.h
index fa4ff42..e841340 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -20,7 +20,6 @@ typedef struct NbdClientSession {
 int sock;
 uint32_t nbdflags;
 off_t size;
-size_t blocksize;
 
 CoMutex send_mutex;
 CoMutex free_sema;
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 53726e8..65f409d 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -75,8 +75,8 @@ enum {
 
 ssize_t nbd_wr_sync(int fd, void *buffer, size_t size, bool do_read);
 int nbd_receive_negotiate(int csock, const char *name, uint32_t *flags,
-  off_t *size, size_t *blocksize, Error **errp);
-int nbd_init(int fd, int csock, uint32_t flags, off_t size, size_t blocksize);
+  off_t *size, Error **errp);
+int nbd_init(int fd, int csock, uint32_t flags, off_t size);
 ssize_t nbd_send_request(int csock, struct nbd_request *request);
 ssize_t nbd_receive_reply(int csock, struct nbd_reply *reply);
 int nbd_client(int fd);
diff --git a/nbd.c b/nbd.c
index 8837e75..3d550f7 100644
--- a/nbd.c
+++ b/nbd.c
@@ -495,7 +495,7 @@ fail:
 }
 
 int nbd_receive_negotiate(int csock, const char *name, uint32_t *flags,
-  off_t *size, size_t *blocksize, Error **errp)
+  off_t *size, Error **errp)
 {
 char buf[256];
 uint64_t magic, s;
@@ -603,7 +603,6 @@ int nbd_receive_negotiate(int csock, const char *name, 
uint32_t *flags,
 goto fail;
 }
 *size = be64_to_cpu(s);
-*blocksize = 1024;
 TRACE("Size is %" PRIu64, *size);
 
 if (!name) {
@@ -630,7 +629,7 @@ fail:
 }
 
 #ifdef __linux__
-int nbd_init(int fd, int csock, uint32_t flags, off_t size, size_t blocksize)
+int nbd_init(int fd, int csock, uint32_t flags, off_t size)
 {
 TRACE("Setting NBD socket");
 
@@ -640,17 +639,17 @@ int nbd_init(int fd, int csock, uint32_t flags, off_t 
size, size_t blocksize)
 return -serrno;
 }
 
-TRACE("Setting block size to %lu", (unsigned long)blocksize);
+TRACE("Setting block size to %lu", (unsigned long)BDRV_SECTOR_SIZE);
 
-if (ioctl(fd, NBD_SET_BLKSIZE, blocksize) < 0) {
+if (ioctl(fd, NBD_SET_BLKSIZE, (size_t)BDRV_SECTOR_SIZE) < 0) {
 int serrno = errno;
 LOG("Failed setting NBD block size");
 return -serrno;
 }
 
-TRACE("Setting size to %zd block(s)", (size_t)(size / blocksize));
+TRACE("Setting size to %zd block(s)", (size_t)(size / BDRV_SECTOR_SIZE));
 
-if (ioctl(fd, NBD_SET_SIZE_BLOCKS, size / blocksize) < 0) {
+if (ioctl(fd, NBD_SET_SIZE_BLOCKS, size / (size_t)BDRV_SECTOR_SIZE) < 0) {
 int serrno = errno;
 LOG("Failed setting size (in blocks)");
 return -serrno;
@@ -715,7 +714,7 @@ int nbd_client(int fd)
 return ret;
 }
 #else
-int nbd_init(int fd, int csock, uint32_t flags, off_t size, size_t blocksize)
+int nbd_init(int fd, int csock, uint32_t flags, off_t size)
 {
 return -ENOTSUP;
 }
diff --git a/qemu-nbd.c b/qemu-nbd.c
index a4a9a0c..7e690ff 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -279,7 +279,6 @@ static void *nbd_client_thread(void *arg)
 {
 char *device = arg;
 off_t size;
-size_t blocksize;
 uint32_t nbdflags;
 int fd, sock;
 int ret;
@@ -292,7 +291,7 @@ static void *nbd_client_thread(void *arg)
 }
 
 ret = nbd_receive_negotiate(sock, NULL, &nbdflags,
-&size, &blocksize, &local_error);
+&size, &local_error);
 if (ret < 0) {
 if (local_error) {
 fprintf(stderr, "%s\n", error_get_pretty(local_error));
@@ -308,7 +307,7 @@ static void *nbd_client_thread(void *arg)
 goto out_socket;
 }
 
-ret = nbd_init(fd, sock, nbdflags, size, blocksize);
+ret = nbd_init(fd, sock, nbdflags, size);
 if (ret < 0) {
 goto out_fd;
 }
-- 
2.3.0





[Qemu-devel] [PULL 12/19] nbd: Drop unexpected data for NBD_OPT_LIST

2015-03-18 Thread Paolo Bonzini
From: Max Reitz 

When requesting the list of exports, no data should be sent. If data is
sent, the NBD server should not just inform the client of the invalid
request, but also drop the data.

Signed-off-by: Max Reitz 
Message-Id: <1424887718-10800-22-git-send-email-mre...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 nbd.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/nbd.c b/nbd.c
index 563e820..91b7d56 100644
--- a/nbd.c
+++ b/nbd.c
@@ -193,6 +193,26 @@ static ssize_t read_sync(int fd, void *buffer, size_t size)
 return nbd_wr_sync(fd, buffer, size, true);
 }
 
+static ssize_t drop_sync(int fd, size_t size)
+{
+ssize_t ret, dropped = size;
+uint8_t *buffer = g_malloc(MIN(65536, size));
+
+while (size > 0) {
+ret = read_sync(fd, buffer, MIN(65536, size));
+if (ret < 0) {
+g_free(buffer);
+return ret;
+}
+
+assert(ret <= size);
+size -= ret;
+}
+
+g_free(buffer);
+return dropped;
+}
+
 static ssize_t write_sync(int fd, void *buffer, size_t size)
 {
 int ret;
@@ -303,6 +323,9 @@ static int nbd_handle_list(NBDClient *client, uint32_t 
length)
 
 csock = client->sock;
 if (length) {
+if (drop_sync(csock, length) != length) {
+return -EIO;
+}
 return nbd_send_rep(csock, NBD_REP_ERR_INVALID, NBD_OPT_LIST);
 }
 
-- 
2.3.0





[Qemu-devel] [PULL 17/19] profiler: Reenable built-in profiler

2015-03-18 Thread Paolo Bonzini
From: Alexey Kardashevskiy 

2ed1ebcf6 "timer: replace time() with QEMU_CLOCK_HOST" broke compile
when configured with --enable-profiler. Turned out the profiler has been
broken for a while.

This does s/qemu_time/tcg_time/ as the profiler only works in a TCG mode.
This also fixes the compile error.

This changes profile_getclock() to return nanoseconds rather than
CPU ticks as the "profile" HMP command prints seconds and there is no
platform-independent way to get ticks-per-second rate.
Since TCG is quite slow and get_clock() returns nanoseconds (fine
enough), this should not affect precision much.

This removes unused qemu_time_start and tlb_flush_time.

Signed-off-by: Alexey Kardashevskiy 
Message-Id: <1426478258-29961-1-git-send-email-...@ozlabs.ru>
Signed-off-by: Paolo Bonzini 
---
 cpus.c   | 2 +-
 include/qemu/timer.h | 5 ++---
 monitor.c| 6 +++---
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/cpus.c b/cpus.c
index 1ce90a1..314df16 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1353,7 +1353,7 @@ static int tcg_cpu_exec(CPUArchState *env)
 }
 ret = cpu_exec(env);
 #ifdef CONFIG_PROFILER
-qemu_time += profile_getclock() - ti;
+tcg_time += profile_getclock() - ti;
 #endif
 if (use_icount) {
 /* Fold pending instructions back into the
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index eba8b21..e5bd494 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -999,11 +999,10 @@ static inline int64_t cpu_get_real_ticks (void)
 #ifdef CONFIG_PROFILER
 static inline int64_t profile_getclock(void)
 {
-return cpu_get_real_ticks();
+return get_clock();
 }
 
-extern int64_t qemu_time, qemu_time_start;
-extern int64_t tlb_flush_time;
+extern int64_t tcg_time;
 extern int64_t dev_time;
 #endif
 
diff --git a/monitor.c b/monitor.c
index 42116a9..65a63df 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1975,7 +1975,7 @@ static void hmp_info_numa(Monitor *mon, const QDict 
*qdict)
 
 #ifdef CONFIG_PROFILER
 
-int64_t qemu_time;
+int64_t tcg_time;
 int64_t dev_time;
 
 static void hmp_info_profile(Monitor *mon, const QDict *qdict)
@@ -1983,8 +1983,8 @@ static void hmp_info_profile(Monitor *mon, const QDict 
*qdict)
 monitor_printf(mon, "async time  %" PRId64 " (%0.3f)\n",
dev_time, dev_time / (double)get_ticks_per_sec());
 monitor_printf(mon, "qemu time   %" PRId64 " (%0.3f)\n",
-   qemu_time, qemu_time / (double)get_ticks_per_sec());
-qemu_time = 0;
+   tcg_time, tcg_time / (double)get_ticks_per_sec());
+tcg_time = 0;
 dev_time = 0;
 }
 #else
-- 
2.3.0





[Qemu-devel] [PULL 08/19] nbd: Fix potential signed overflow issues

2015-03-18 Thread Paolo Bonzini
From: Max Reitz 

Signed-off-by: Max Reitz 
Message-Id: <1424887718-10800-11-git-send-email-mre...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 include/block/nbd.h | 4 ++--
 qemu-nbd.c  | 5 +++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 2c20138..53726e8 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -54,8 +54,8 @@ struct nbd_reply {
 /* Reply types. */
 #define NBD_REP_ACK (1) /* Data sending finished. */
 #define NBD_REP_SERVER  (2) /* Export description. */
-#define NBD_REP_ERR_UNSUP   ((1 << 31) | 1) /* Unknown option. */
-#define NBD_REP_ERR_INVALID ((1 << 31) | 3) /* Invalid length. */
+#define NBD_REP_ERR_UNSUP   ((UINT32_C(1) << 31) | 1) /* Unknown option. */
+#define NBD_REP_ERR_INVALID ((UINT32_C(1) << 31) | 3) /* Invalid length. */
 
 #define NBD_CMD_MASK_COMMAND   0x
 #define NBD_CMD_FLAG_FUA   (1 << 16)
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 0c9e807..a4a9a0c 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -142,8 +142,9 @@ static void read_partition(uint8_t *p, struct 
partition_record *r)
 r->end_head = p[5];
 r->end_cylinder = p[7] | ((p[6] << 2) & 0x300);
 r->end_sector = p[6] & 0x3f;
-r->start_sector_abs = p[8] | p[9] << 8 | p[10] << 16 | p[11] << 24;
-r->nb_sectors_abs = p[12] | p[13] << 8 | p[14] << 16 | p[15] << 24;
+
+r->start_sector_abs = le32_to_cpup((uint32_t *)(p +  8));
+r->nb_sectors_abs   = le32_to_cpup((uint32_t *)(p + 12));
 }
 
 static int find_partition(BlockBackend *blk, int partition,
-- 
2.3.0





[Qemu-devel] [PULL 10/19] nbd: Fix nbd_receive_options()

2015-03-18 Thread Paolo Bonzini
From: Max Reitz 

The client flags are sent exactly once overall, not once per option.

Signed-off-by: Max Reitz 
Message-Id: <1424887718-10800-19-git-send-email-mre...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 nbd.c | 49 +
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/nbd.c b/nbd.c
index 3d550f7..fb8a4d4 100644
--- a/nbd.c
+++ b/nbd.c
@@ -350,30 +350,39 @@ fail:
 
 static int nbd_receive_options(NBDClient *client)
 {
+int csock = client->sock;
+uint32_t flags;
+
+/* Client sends:
+[ 0 ..   3]   client flags
+
+[ 0 ..   7]   NBD_OPTS_MAGIC
+[ 8 ..  11]   NBD option
+[12 ..  15]   Data length
+...   Rest of request
+
+[ 0 ..   7]   NBD_OPTS_MAGIC
+[ 8 ..  11]   Second NBD option
+[12 ..  15]   Data length
+...   Rest of request
+*/
+
+if (read_sync(csock, &flags, sizeof(flags)) != sizeof(flags)) {
+LOG("read failed");
+return -EIO;
+}
+TRACE("Checking client flags");
+be32_to_cpus(&flags);
+if (flags != 0 && flags != NBD_FLAG_C_FIXED_NEWSTYLE) {
+LOG("Bad client flags received");
+return -EIO;
+}
+
 while (1) {
-int csock = client->sock, ret;
+int ret;
 uint32_t tmp, length;
 uint64_t magic;
 
-/* Client sends:
-[ 0 ..   3]   client flags
-[ 4 ..  11]   NBD_OPTS_MAGIC
-[12 ..  15]   NBD option
-[16 ..  19]   length
-...   Rest of request
-*/
-
-if (read_sync(csock, &tmp, sizeof(tmp)) != sizeof(tmp)) {
-LOG("read failed");
-return -EINVAL;
-}
-TRACE("Checking client flags");
-tmp = be32_to_cpu(tmp);
-if (tmp != 0 && tmp != NBD_FLAG_C_FIXED_NEWSTYLE) {
-LOG("Bad client flags received");
-return -EINVAL;
-}
-
 if (read_sync(csock, &magic, sizeof(magic)) != sizeof(magic)) {
 LOG("read failed");
 return -EINVAL;
-- 
2.3.0





[Qemu-devel] [PULL 14/19] build: pass .d file name to scripts/make_device_config.sh, fix makefile target

2015-03-18 Thread Paolo Bonzini
The .d file name must match exactly what is used in the SUBDIR_DEVICES_MAK_DEP
variable.  Instead of making assumptions in the make_device_config.sh script,
just pass it in.

Similarly, the makefile target may not match the output file name, because
Makefile uses a temporary file.  Instead of making assumptions on what the
Makefile does, emit the config-devices.mak file to stdout, and use the
passed-in destination as the makefile target

Reported-by: Peter Maydell 
Cc: Michael S. Tsirkin 
Signed-off-by: Paolo Bonzini 
---
 Makefile  |  3 ++-
 scripts/make_device_config.sh | 18 ++
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/Makefile b/Makefile
index 884b59d..88bce56 100644
--- a/Makefile
+++ b/Makefile
@@ -112,7 +112,8 @@ endif
 -include $(SUBDIR_DEVICES_MAK_DEP)
 
 %/config-devices.mak: default-configs/%.mak
-   $(call quiet-command,$(SHELL) $(SRC_PATH)/scripts/make_device_config.sh 
$@.tmp $<, "  GEN   $@.tmp")
+   $(call quiet-command, \
+$(SHELL) $(SRC_PATH)/scripts/make_device_config.sh $< 
$*-config-devices.mak.d $@ > $@.tmp, "  GEN   $@.tmp")
$(call quiet-command, if test -f $@; then \
  if cmp -s $@.old $@; then \
mv $@.tmp $@; \
diff --git a/scripts/make_device_config.sh b/scripts/make_device_config.sh
index 7958086..c1afb3f 100644
--- a/scripts/make_device_config.sh
+++ b/scripts/make_device_config.sh
@@ -1,10 +1,12 @@
 #! /bin/sh
-# Construct a target device config file from a default, pulling in any
-# files from include directives.
+# Writes a target device config file to stdout, from a default and from
+# include directives therein.  Also emits Makefile dependencies.
+#
+# Usage: make_device_config.sh SRC DEPFILE-NAME DEPFILE-TARGET > DEST
 
-dest=$1
-dep=`dirname $1`-`basename $1`.d
-src=$2
+src=$1
+dep=$2
+target=$3
 src_dir=`dirname $src`
 all_includes=
 
@@ -22,7 +24,7 @@ while [ -n "$f" ] ; do
   [ $? = 0 ] || exit 1
   all_includes="$all_includes $f"
 done
-process_includes $src > $dest
+process_includes $src
 
-cat $src $all_includes | grep -v '^include' > $dest
-echo "$1: $all_includes" > $dep
+cat $src $all_includes | grep -v '^include'
+echo "$target: $all_includes" > $dep
-- 
2.3.0





[Qemu-devel] [PULL 07/19] qemu-nbd: fork() can fail

2015-03-18 Thread Paolo Bonzini
From: Max Reitz 

It is very unlikely, but it is possible.

Signed-off-by: Max Reitz 
Message-Id: <1424887718-10800-10-git-send-email-mre...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 qemu-nbd.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 0cb0e4e..0c9e807 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -635,7 +635,9 @@ int main(int argc, char **argv)
  * print errors and exit with the proper status code.
  */
 pid = fork();
-if (pid == 0) {
+if (pid < 0) {
+err(EXIT_FAILURE, "Failed to fork");
+} else if (pid == 0) {
 close(stderr_fd[0]);
 ret = qemu_daemon(1, 0);
 
-- 
2.3.0





[Qemu-devel] [PULL 19/19] exec: Respect as_tranlsate_internal length clamp

2015-03-18 Thread Paolo Bonzini
From: Peter Crosthwaite 

address_space_translate_internal will clamp the *plen length argument
based on the size of the memory region being queried. The iommu walker
logic in addresss_space_translate was ignoring this by discarding the
post fn call value of *plen. Fix by just always using *plen as the
length argument throughout the fn, removing the len local variable.

This fixes a bootloader bug when a single elf section spans multiple
QEMU memory regions.

Signed-off-by: Peter Crosthwaite 
Message-Id: <1426570554-15940-1-git-send-email-peter.crosthwa...@xilinx.com>
Signed-off-by: Paolo Bonzini 
---
 exec.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/exec.c b/exec.c
index e97071a..8b922db 100644
--- a/exec.c
+++ b/exec.c
@@ -380,7 +380,6 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
hwaddr addr,
 IOMMUTLBEntry iotlb;
 MemoryRegionSection *section;
 MemoryRegion *mr;
-hwaddr len = *plen;
 
 rcu_read_lock();
 for (;;) {
@@ -395,7 +394,7 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
hwaddr addr,
 iotlb = mr->iommu_ops->translate(mr, addr, is_write);
 addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
 | (addr & iotlb.addr_mask));
-len = MIN(len, (addr | iotlb.addr_mask) - addr + 1);
+*plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
 if (!(iotlb.perm & (1 << is_write))) {
 mr = &io_mem_unassigned;
 break;
@@ -406,10 +405,9 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
hwaddr addr,
 
 if (xen_enabled() && memory_access_is_direct(mr, is_write)) {
 hwaddr page = ((addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE) - addr;
-len = MIN(page, len);
+*plen = MIN(page, *plen);
 }
 
-*plen = len;
 *xlat = addr;
 rcu_read_unlock();
 return mr;
-- 
2.3.0




[Qemu-devel] [PULL 11/19] nbd: Fix interpretation of the export flags

2015-03-18 Thread Paolo Bonzini
From: Max Reitz 

The export flags are a 16 bit value, so be16_to_cpu() has to be used to
interpret them correctly. This makes discard and flush actually work
for named NBD exports (they did not work before, because the client
always assumed them to be unsupported because of the bug fixed by this
patch).

Signed-off-by: Max Reitz 
Message-Id: <1424887718-10800-20-git-send-email-mre...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 nbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/nbd.c b/nbd.c
index fb8a4d4..563e820 100644
--- a/nbd.c
+++ b/nbd.c
@@ -625,7 +625,7 @@ int nbd_receive_negotiate(int csock, const char *name, 
uint32_t *flags,
 error_setg(errp, "Failed to read export flags");
 goto fail;
 }
-*flags |= be32_to_cpu(tmp);
+*flags |= be16_to_cpu(tmp);
 }
 if (read_sync(csock, &buf, 124) != 124) {
 error_setg(errp, "Failed to read reserved block");
-- 
2.3.0





[Qemu-devel] [PULL 16/19] kvm: fix ioeventfd endianness on bi-endian architectures

2015-03-18 Thread Paolo Bonzini
From: Greg Kurz 

KVM expects host endian values. Hosts that don't use the default endianness
need to negate the swap performed in adjust_endianness().

Suggested-by: Paolo Bonzini 
Signed-off-by: Greg Kurz 
Message-Id: <20150313212337.31142.3991.stgit@bahia.local>
Signed-off-by: Paolo Bonzini 
---
 kvm-all.c | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 55025cc..335438a 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -528,13 +528,33 @@ int kvm_vm_check_extension(KVMState *s, unsigned int 
extension)
 return ret;
 }
 
+static uint32_t adjust_ioeventfd_endianness(uint32_t val, uint32_t size)
+{
+#if defined(HOST_WORDS_BIGENDIAN) != defined(TARGET_WORDS_BIGENDIAN)
+/* The kernel expects ioeventfd values in HOST_WORDS_BIGENDIAN
+ * endianness, but the memory core hands them in target endianness.
+ * For example, PPC is always treated as big-endian even if running
+ * on KVM and on PPC64LE.  Correct here.
+ */
+switch (size) {
+case 2:
+val = bswap16(val);
+break;
+case 4:
+val = bswap32(val);
+break;
+}
+#endif
+return val;
+}
+
 static int kvm_set_ioeventfd_mmio(int fd, hwaddr addr, uint32_t val,
   bool assign, uint32_t size, bool datamatch)
 {
 int ret;
 struct kvm_ioeventfd iofd;
 
-iofd.datamatch = datamatch ? val : 0;
+iofd.datamatch = datamatch ? adjust_ioeventfd_endianness(val, size) : 0;
 iofd.addr = addr;
 iofd.len = size;
 iofd.flags = 0;
@@ -564,7 +584,7 @@ static int kvm_set_ioeventfd_pio(int fd, uint16_t addr, 
uint16_t val,
  bool assign, uint32_t size, bool datamatch)
 {
 struct kvm_ioeventfd kick = {
-.datamatch = datamatch ? val : 0,
+.datamatch = datamatch ? adjust_ioeventfd_endianness(val, size) : 0,
 .addr = addr,
 .flags = KVM_IOEVENTFD_FLAG_PIO,
 .len = size,
-- 
2.3.0





[Qemu-devel] [PULL 15/19] virtio-scsi: Fix assert in virtio_scsi_push_event

2015-03-18 Thread Paolo Bonzini
From: Fam Zheng 

Hotplugging a scsi-disk may trigger the assertion in qemu_sgl_concat.

qemu-system-x86_64: qemu/hw/scsi/virtio-scsi.c:115: qemu_sgl_concat:
Assertion `skip == 0' failed.

This is introduced by commit 55783a55 (virtio-scsi: work around bug in
old BIOSes) which didn't check out_num when accessing out_sg[0].iov_len
(the same to in sg). For virtio_scsi_push_event, looking into out_sg
doesn't make sense because 0 req_size is intended.

Cc: qemu-sta...@nongnu.org
[Cc'ing qemu-stable because 55783a55 did it too]
Signed-off-by: Fam Zheng 
Message-Id: <1426233354-525-1-git-send-email-f...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 hw/scsi/virtio-scsi.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index da0cff8..c9bea06 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -146,8 +146,12 @@ static int virtio_scsi_parse_req(VirtIOSCSIReq *req,
  * TODO: always disable this workaround for virtio 1.0 devices.
  */
 if (!virtio_has_feature(vdev, VIRTIO_F_ANY_LAYOUT)) {
-req_size = req->elem.out_sg[0].iov_len;
-resp_size = req->elem.in_sg[0].iov_len;
+if (req->elem.out_num) {
+req_size = req->elem.out_sg[0].iov_len;
+}
+if (req->elem.in_num) {
+resp_size = req->elem.in_sg[0].iov_len;
+}
 }
 
 out_size = qemu_sgl_concat(req, req->elem.out_sg,
-- 
2.3.0





[Qemu-devel] [PULL 13/19] coroutine-io: Return -errno in case of error

2015-03-18 Thread Paolo Bonzini
From: Max Reitz 

In case qemu_co_sendv_recvv() fails without any data read, there is no
reason not to return the perfectly fine error number retrieved from
socket_error().

Signed-off-by: Max Reitz 
Message-Id: <1424887718-10800-16-git-send-email-mre...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 qemu-coroutine-io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-coroutine-io.c b/qemu-coroutine-io.c
index d404926..28dc735 100644
--- a/qemu-coroutine-io.c
+++ b/qemu-coroutine-io.c
@@ -45,7 +45,7 @@ qemu_co_sendv_recvv(int sockfd, struct iovec *iov, unsigned 
iov_cnt,
 if (err == EAGAIN || err == EWOULDBLOCK) {
 qemu_coroutine_yield();
 } else if (done == 0) {
-return -1;
+return -err;
 } else {
 break;
 }
-- 
2.3.0





[Qemu-devel] [PATCH] virtio: move sanity checks to ifdef DEBUG

2015-03-18 Thread Michael S. Tsirkin
All that happens when virtqueue_fill is invoked incorrectly is that we
corrupt guest memory, so this check is not a security measure.
Move the check to ifdef DEBUG to make sure we don't introduce new
crashes close to release.
Array scans aren't free either, so it's a good idea from performance
point of view, too.

Cc: Rusty Russell 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 27429c2..37fb2ee 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -243,15 +243,22 @@ int virtio_queue_empty(VirtQueue *vq)
 void virtqueue_fill(VirtQueue *vq, const VirtQueueElement *elem,
 unsigned int len_written, unsigned int idx)
 {
-unsigned int offset, tot_wlen;
+unsigned int offset;
 int i;
 
 trace_virtqueue_fill(vq, elem, len_written, idx);
 
-for (tot_wlen = i = 0; i < elem->in_num; i++) {
-tot_wlen += elem->in_sg[i].iov_len;
+#ifdef DEBUG_VIRTIO
+{
+/* Check that len_written is <= the writable length. */
+unsigned int tot_wlen;
+
+for (tot_wlen = i = 0; i < elem->in_num; i++) {
+tot_wlen += elem->in_sg[i].iov_len;
+}
+assert(len_written <= tot_wlen);
 }
-assert(len_written <= tot_wlen);
+#endif
 
 offset = 0;
 for (i = 0; i < elem->in_num; i++) {
-- 
MST



Re: [Qemu-devel] GSOC 2015 Project Proposal

2015-03-18 Thread Stefan Hajnoczi
On Mon, Feb 9, 2015 at 12:44 AM, Xin Tong  wrote:
> I would like to do GSOC this summer. The project i have in mind is to
> implement a set of facilities to make implementing Hardware
> transactional memory (HTM) easier in QEMU.
>
> HTM has become available in many architecture supported by QEMU, e.g.
> i386, PowerPC, etc. Currently, necessary memory tracking. conflict
> detection and transaction rollbak/commit are not available in QEMU. As
> a result, HTM is supported in a very rudimentary fashion in PowerPC,
> i.e. the transaction begins (tbegin in PowerPC) always trigger a fault
> to the fallback code path. Even though HTM is supported by different
> architectures, the underlying principle are very similar and therefore
> it is beneficial to provide a set of facilities to make implementing
> HTM easier in QEMU.
>
> These facilities should include.
>
> A modified software TLB to make memory address and value tracking simple.
> A performant and memory efficient value/address tracking facility to
> detect read/write conflicts for transactions.
> A performant and memory efficient mechanism to rollback and commit
> memory accesses.
> A mechanism to abort transactions on the current processor as well as
> remote processor.
>
> I will come up with a more detailed proposal as application time draws
> close. Any suggestions are appreciated at the moment.

Hi Xin,
Thanks for proposing this project idea.  There haven't been any
responses yet.  I have CCed Alexander Graf and David Gibson, who have
worked on the PowerPC target, and Richard Henderson in case he's
interested in transactional memory.

You need to find a mentor willing to supervise this project idea.
Hopefully bumping this email thread will remind people to consider
your idea.

If no QEMU regular contributors are willing to mentor the project idea
then I'm afraid you would have to choose another idea to apply for
QEMU GSoC.  Let me know if you have any questions or need help.

Stefan



Re: [Qemu-devel] [PATCH v4 0/3] usb: fix segfault when hot-unplugging usb host adapter

2015-03-18 Thread Gerd Hoffmann
On Mi, 2015-03-18 at 17:33 +0800, arei.gong...@huawei.com wrote:
> v4 -> v3:
>  - add reset hookup for sysbus devices (ehci/ohci). (Gerd)

Replaced patches.  make check failure is gone.  Good.

thanks,
  Gerd





Re: [Qemu-devel] [PATCH v2] hw/usb: Include USB files only if necessary

2015-03-18 Thread Gerd Hoffmann
On Di, 2015-03-17 at 14:52 +0100, Thomas Huth wrote:
> Everything should now also compile fine again when configure has
> been run with the --enable-usb-redir and/or --enable-libusb option.

Survived test now.  Added to usb queue.

thanks,
  Gerd





[Qemu-devel] [PATCH] pcie_aer: fix comment to match pcie spec

2015-03-18 Thread Michael S. Tsirkin
Code comment says "table 6-2" but in fact it's is not a table, it is
"Figure 6-2" on page 479.

Cc: Chen Fan 
Reported-by: Michael Tokarev 
Signed-off-by: Michael S. Tsirkin 
---
 hw/pci/pcie_aer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 9126058..eaa3e6e 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -622,8 +622,8 @@ static bool pcie_aer_inject_uncor_error(PCIEAERInject *inj, 
bool is_fatal)
  *
  * 6.2.4 Error Logging
  * 6.2.5 Sequence of Device Error Signaling and Logging Operations
- * table 6-2: Flowchart Showing Sequence of Device Error Signaling and Logging
- *Operations
+ * Figure 6-2: Flowchart Showing Sequence of Device Error Signaling and Logging
+ * Operations
  */
 int pcie_aer_inject_error(PCIDevice *dev, const PCIEAERErr *err)
 {
-- 
MST



Re: [Qemu-devel] Google SoC 2015 Idea - Extracting Virtual Device Implementation from QEMU

2015-03-18 Thread Stefan Hajnoczi
On Tue, Mar 17, 2015 at 7:38 PM, Guodong Zhu  wrote:
> The purpose of the project is to implement an interface that can extract the
> implementation of the virtual devices and make them executable outside of
> the QEMU environment, which will benefit later testing and making the test
> of the virtual devices more flexible and more thoroughly.

Hi,
Take a look at tests/libqtest.c.  Test cases use this interface to
exercise specific device models.

Although qtest involves running a full QEMU system emulator, it allows
isolated device model tests similar to what you have described.

Is there a project idea on the wiki that you are interested in?
http://qemu-project.org/Google_Summer_of_Code_2015

Or maybe you have another idea you'd like to propose?

Stefa



[Qemu-devel] [Bug 1297218] Re: guest hangs after live migration due to tsc jump

2015-03-18 Thread Launchpad Bug Tracker
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: glusterfs (Ubuntu Trusty)
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1297218

Title:
  guest hangs after live migration due to tsc jump

Status in QEMU:
  New
Status in glusterfs package in Ubuntu:
  Invalid
Status in qemu package in Ubuntu:
  Triaged
Status in glusterfs source package in Trusty:
  Confirmed
Status in qemu source package in Trusty:
  Confirmed

Bug description:
  We have two identical Ubuntu servers running libvirt/kvm/qemu, sharing
  a Gluster filesystem. Guests can be live migrated between them.
  However, live migration often leads to the guest being stuck at 100%
  for a while. In that case, the dmesg output for such a guest will show
  (once it recovers): Clocksource tsc unstable (delta = 662463064082
  ns). In this particular example, a guest was migrated and only after
  11 minutes (662 seconds) did it become responsive again.

  It seems that newly booted guests doe not suffer from this problem,
  these can be migrated back and forth at will. After a day or so, the
  problem becomes apparent. It also seems that migrating from server A
  to server B causes much more problems than going from B back to A. If
  necessary, I can do more measurements to qualify these observations.

  The VM servers run Ubuntu 13.04 with these packages:
  Kernel: 3.8.0-35-generic x86_64
  Libvirt: 1.0.2
  Qemu: 1.4.0
  Gluster-fs: 3.4.2 (libvirt access the images via the filesystem, not using 
libgfapi yet as the Ubuntu libvirt is not linked against libgfapi).
  The interconnect between both machines (both for migration and gluster) is 
10GbE. 
  Both servers are synced to NTP and well within 1ms form one another.

  Guests are either Ubuntu 13.04 or 13.10.

  On the guests, the current_clocksource is kvm-clock.
  The XML definition of the guests only contains:   

  Now as far as I've read in the documentation of kvm-clock, it specifically 
supports live migrations, so I'm a bit surprised at these problems. There isn't 
all that much information to find on these issue, although I have found 
postings by others that seem to have run into the same issues, but without a 
solution.
  --- 
  ApportVersion: 2.14.1-0ubuntu3
  Architecture: amd64
  DistroRelease: Ubuntu 14.04
  Package: libvirt (not installed)
  ProcCmdline: BOOT_IMAGE=/boot/vmlinuz-3.13.0-24-generic 
root=UUID=1b0c3c6d-a9b8-4e84-b076-117ae267d178 ro console=ttyS1,115200n8 
BOOTIF=01-00-25-90-75-b5-c8
  ProcVersionSignature: Ubuntu 3.13.0-24.47-generic 3.13.9
  Tags:  trusty apparmor apparmor apparmor apparmor apparmor
  Uname: Linux 3.13.0-24-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: True
  modified.conffile..etc.default.libvirt.bin: [modified]
  modified.conffile..etc.libvirt.libvirtd.conf: [modified]
  modified.conffile..etc.libvirt.qemu.conf: [modified]
  modified.conffile..etc.libvirt.qemu.networks.default.xml: [deleted]
  mtime.conffile..etc.default.libvirt.bin: 2014-05-12T19:07:40.020662
  mtime.conffile..etc.libvirt.libvirtd.conf: 2014-05-13T14:40:25.894837
  mtime.conffile..etc.libvirt.qemu.conf: 2014-05-12T18:58:27.885506

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1297218/+subscriptions



Re: [Qemu-devel] [PATCH v4] block/null: Latency simulation by adding new option "latency-ns"

2015-03-18 Thread Eric Blake
On 03/18/2015 12:49 AM, Fam Zheng wrote:
> Aio context switch should just work because the requests will be
> drained, so the scheduled timer(s) on the old context will be freed.
> 
> Signed-off-by: Fam Zheng 
> 
> ---
> v4: If errp is not null, return -EINVAL. [Kevin]
> ---
>  block/null.c | 59 
> ++--
>  qapi/block-core.json |  5 -
>  2 files changed, 57 insertions(+), 7 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 2/3] block: Add QMP support for streaming to an intermediate layer

2015-03-18 Thread Alberto Garcia
On Thu, Mar 12, 2015 at 04:45:17PM +0100, Kevin Wolf wrote:

> > One issue that I'm finding is that when we move the block-stream
> > job to an intermediate node, where the device name is empty, we
> > get messages like "Device '' is busy".

> My first thought was "then make it 'Source/Target device is busy'
> without mentioning any name". In the context of any given command,
> it would still be clear which BDS is meant. In fact, I have argued
> before that mentioning the device name in an error to a command that
> refers to a specific device is redundant and should be avoided.
> 
> The problem here is that it's not stream_start() that generates the
> error, but block_job_create(), which doesn't know which role it's bs
> argument has for the block job. So it can't decide whether to say
> "source device", "target device" or something completely different.

The problem is actually not there. The error message generated by
block_job_create() is "block device is in use by block job: stream".

It's bdrv_op_is_blocked() that adds the extra "Device '' is busy".

error_setg(errp, "Device '%s' is busy: %s",
   bdrv_get_device_name(bs),
   error_get_pretty(blocker->reason));

I can use the same approach as in the BlockJobInfo case and fall back
to the node name if the device name is empty, but the problem is that
bdrv_get_device_name() is used all over the place, so this probably
needs a more general solution.

Even at the moment the backing blocker set by bdrv_set_backing_hd()
has problems:

error_setg(&bs->backing_blocker,
   "device is used as backing hd of '%s'",
   bdrv_get_device_name(bs));

This only works if 'bs' is a root node, but if you try to perform an
operation on the backing image of another backing image, you get a
"device is used as backing hd of ''".

Error messages aside, I would probably need to check all uses of
bdrv_get_device_name() because there could be more surprises if the
node is not at the root.

Berto



Re: [Qemu-devel] [PATCH 0/2] virtio len fixes for qemu.

2015-03-18 Thread Michael S. Tsirkin
On Wed, Mar 11, 2015 at 04:29:30PM +1030, Rusty Russell wrote:
> The virtio 'used' ring describes descriptors which have been used.  It
> also says how many bytes have been written to the ring.  For some cases,
> this value is ignored by Linux guests, thus errors have not been noticed.
> I was working on increasing the checking in Linux when I noticed this
> behaviour.
> 
> The first patch changes the 'len' formal parameter name to 'len_written' to
> make the API clearer, and adds an assert(). The second fixes block writes.
> 
> Cheers,
> Rusty.
> PS.  It's based on MST's virtio-1.0 tree, but should be easily ported.

After going back and forth on this, I decided it's
best to defer this change to 2.4.
Guests can't depend on this behaviour without checking virtio-1 anyway.


> Rusty Russell (2):
>   virtio: make it clear that "len" for a used descriptor is len written.
>   virtio-blk: fix length calculations for write operations.
> 
>  hw/block/virtio-blk.c  |  9 -
>  hw/virtio/virtio.c | 19 ---
>  include/hw/virtio/virtio.h |  4 ++--
>  3 files changed, 22 insertions(+), 10 deletions(-)
> 
> -- 
> 2.1.0



[Qemu-devel] [PULL 1/8] virtio: validate the existence of handle_output before calling it

2015-03-18 Thread Michael S. Tsirkin
From: Jason Wang 

We don't validate the existence of handle_output which may let a buggy
guest to trigger a SIGSEV easily. E.g:

1) write 10 to queue_sel to a virtio net device with only 1 queue
2) setup an arbitrary pfn
3) then notify queue 10

Fixing this by validating the existence of handle_output before.

Cc: qemu-sta...@nongnu.org
Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Don Koch 
Reviewed-by: Fam Zheng 
---
 hw/virtio/virtio.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 3c6e430..17c1260 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -759,8 +759,9 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, int 
align)
 
 void virtio_queue_notify_vq(VirtQueue *vq)
 {
-if (vq->vring.desc) {
+if (vq->vring.desc && vq->handle_output) {
 VirtIODevice *vdev = vq->vdev;
+
 trace_virtio_queue_notify(vdev, vq - vdev->vq, vq);
 vq->handle_output(vdev, vq);
 }
-- 
MST




[Qemu-devel] [PULL 0/8] pci, virtio bugfixes for 2.3

2015-03-18 Thread Michael S. Tsirkin
The following changes since commit 18bf9e2f379334306530cbfd44218748eceaf67d:

  virtio-scsi: remove empty wrapper for cmd (2015-03-11 18:24:30 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to ce394947a75296fc10f1676932473e92aa8be11a:

  pcie_aer: fix comment to match pcie spec (2015-03-18 12:48:21 +0100)


pci, virtio bugfixes for 2.3

Just a bunch of bugfixes. Should be nothing remarkable here.

Signed-off-by: Michael S. Tsirkin 


Chen Fan (5):
  pcie: correct mistaken register bit for End-End TLP Prefix Blocking
  aer: fix wrong check on expose aer tlp prefix log
  pcie_aer: fix typos in pcie_aer_inject_error comment
  aer: fix a wrong init PCI_ERR_COR_STATUS w1cmask type register
  pci: fix several trivial typos in comment

Jason Wang (1):
  virtio: validate the existence of handle_output before calling it

Michael S. Tsirkin (1):
  pcie_aer: fix comment to match pcie spec

Stefan Weil (1):
  virtio: Fix memory leaks reported by Coverity

 include/hw/pci/pci.h   |  2 +-
 include/hw/pci/pcie_aer.h  |  2 +-
 include/hw/pci/pcie_regs.h |  2 +-
 hw/9pfs/virtio-9p-local.c  | 28 
 hw/pci/pcie.c  |  2 +-
 hw/pci/pcie_aer.c  | 12 ++--
 hw/virtio/virtio.c |  3 ++-
 7 files changed, 20 insertions(+), 31 deletions(-)




Re: [Qemu-devel] [PATCH] migration: flush the bdrv before stopping VM

2015-03-18 Thread Juan Quintela
Kevin Wolf  wrote:
> [ Cc: qemu-block ]
>
> Am 18.03.2015 um 04:19 hat Li, Liang Z geschrieben:
>> > This needs further review/changes on the block layer.
>> > 
>> > First explanation, why I think this don't fix the full problem.
>> > Whith this patch, we fix the problem where we have a dirty block layer but
>> > basically nothing dirtying the memory on the guest (we are moving the 20
>> > seconds from max_downtime for the blocklayer flush), to 20 seconds until
>> > we have decided that the amount of dirty memory is small enough to be
>> > transferred during max_downtime.  But it is still going to take 20 seconds 
>> > to
>> > flush the block layer, and during that 20 seconds, the amount of memory 
>> > that
>> > can be dirty is HUGE.
>> 
>> It's true.
>
> What kind of cache is it actually that takes 20s to flush here?

That is a very good question.   When I meassured this (long, long ago),
testing with the same workload, bdrv_flush_all() could take form
100-300ms (what I expected and I can live with that), to several
seconds, what I can't live with.

Basically (this was around RHEL6 times, whatever upstream were there at
the time), my notes of the time point to:

aio.c:quemu_aio_wait()
.
ret = select(max_fd, &rdfds, &wrfds, NULL, NULL);

notice that we are doing a select without a timeout in the iothread,
bad.

I know that the code has changed a lot on that area, the select() don't
exist anymore.  But this exemplifies the problem, something asks the
block layer to do an operation, and it blocks until it finishes, even if
this time it is taking more than usual for whatever reason.  What I
would really is a way to be able to bdrv_flush_all() to take a timeout
parameter and return an error case that it is taking too long.

> Inside of qemu, we don't cache a whole lot. If you use qcow2, we do use
> a writeback cache for some metadata that might need to be written back,
> but it is small and shouldn't take any significant time.
>
> Then we have the kernel page cache, or for some network protocols caches
> in the respective libs. This sounds like the right size for a 20s stall,
> but don't we require cache.direct=on for live migration anyway for
> coherency, i.e. bypassing any such cache?
>
> Lastly there may be a disk cache, but it's too small either.

>> > I think our ouptions are:
>> > 
>> > - tell the block layer at the beggining of migration
>> >   Hey, we are migrating, could you please start flusing data now, and
>> >   don't get the caches to grow too much, please, pretty please.
>> >   (I left the API to the block layer)
>> > - Add on that point a new function:
>> >   bdrvr_flush_all_start()
>> >   That starts the sending of pages, and we "hope" that by the time that
>> >   we have migrated all memory, they have also finished (so our last
>> >   call to block_flush_all() have less work to do)
>> > - Add another function:
>> >   int bdrv_flush_all_timeout(int timeout)
>> >   that returns if timeout pass, telling if it has migrated all pages or
>> >   timeout has passed.  So we can got back to the iterative stage if it
>> >   has taken too long.
>
> The problem is that the block layer really doesn't have an option to
> control what is getting synced once the data is cached outside of qemu.
> Essentially we can do an fdatasync() or we can leave it, that's the only
> choice we have.

See my explanation, if all that qemu is doing is an fdatasync(), just
spawn a thread that do the fdatasync() and if the thread don't finish in
 time, just return one error.  You can implement that behaviour
whatever you want.


> Now doing an asynchronous fdatasync() in the background is completely
> reasonable in order to reduce the amount of data to be flushed later.
> But the patch is doing it while holding both the BQL and the AIOContext
> lock of the device, which doesn't feel right. Maybe it should schedule a
> BH in the AIOContext instead and flush from there asynchronously.

Position is wrong, definitelly.  We want to start the asynchronous
fdatasync() at the start of migration, or each X milliseconds.  At this
point, we *think* that we can finish the migration on max_downtime
(basically we are ignoring the time that is going to take to migrate
device state and do the block layer flush, but normally this takes in
the other of 100-200ms, so it don't matter at all).

> The other thing is that flushing once doesn't mean that new data isn't
> accumulating in the cache, especially if you decide to do the background
> flush at the start of the migration.

But "pontentially", we would arrive to this point with less "cached"
data everything on the system.

> The obvious way to avoid that would be to switch to a writethrough mode,
> so any write request goes directly to the disk. This will, however,
> impact performance so heavily that it's probably not a realistic option.
>
> An alternative approach could be to repeat the background flush
> periodically, either time based or after every x bytes that are wri

[Qemu-devel] [PULL 4/8] aer: fix wrong check on expose aer tlp prefix log

2015-03-18 Thread Michael S. Tsirkin
From: Chen Fan 

when specify TLP Prefix log as using pcie_aer_inject_error,
the TLP prefix log is always discarded. because the check
is incorrect, the End-End TLP Prefix Supported bit
(PCI_EXP_DEVCAP2_EETLPP) should be in Device Capabilities 2 Register.

Signed-off-by: Chen Fan 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/pci/pcie_aer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 5a25c32..c7fad34 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -433,7 +433,7 @@ static void pcie_aer_update_log(PCIDevice *dev, const 
PCIEAERErr *err)
 }
 
 if ((err->flags & PCIE_AER_ERR_TLP_PREFIX_PRESENT) &&
-(pci_get_long(dev->config + dev->exp.exp_cap + PCI_EXP_DEVCTL2) &
+(pci_get_long(dev->config + dev->exp.exp_cap + PCI_EXP_DEVCAP2) &
  PCI_EXP_DEVCAP2_EETLPP)) {
 for (i = 0; i < ARRAY_SIZE(err->prefix); ++i) {
 /* 7.10.12 tlp prefix log register */
-- 
MST




[Qemu-devel] [PULL 5/8] pcie_aer: fix typos in pcie_aer_inject_error comment

2015-03-18 Thread Michael S. Tsirkin
From: Chen Fan 

Refer to "PCI Express Base Spec3.0", this comments can't
fit the description in spec, so we should fix them.

Signed-off-by: Chen Fan 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/pci/pcie_aer.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index c7fad34..9daebc2 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -618,11 +618,11 @@ static bool pcie_aer_inject_uncor_error(PCIEAERInject 
*inj, bool is_fatal)
  * non-Function specific error must be recorded in all functions.
  * It is the responsibility of the caller of this function.
  * It is also caller's responsibility to determine which function should
- * report the rerror.
+ * report the error.
  *
  * 6.2.4 Error Logging
- * 6.2.5 Sqeunce of Device Error Signaling and Logging Operations
- * table 6-2: Flowchard Showing Sequence of Device Error Signaling and Logging
+ * 6.2.5 Sequence of Device Error Signaling and Logging Operations
+ * table 6-2: Flowchart Showing Sequence of Device Error Signaling and Logging
  *Operations
  */
 int pcie_aer_inject_error(PCIDevice *dev, const PCIEAERErr *err)
-- 
MST




[Qemu-devel] [PULL 2/8] virtio: Fix memory leaks reported by Coverity

2015-03-18 Thread Michael S. Tsirkin
From: Stefan Weil 

All four leaks are similar, so fix them in one patch.

Signed-off-by: Stefan Weil 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/9pfs/virtio-9p-local.c | 28 
 1 file changed, 8 insertions(+), 20 deletions(-)

diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/virtio-9p-local.c
index d05c917..d66abcd 100644
--- a/hw/9pfs/virtio-9p-local.c
+++ b/hw/9pfs/virtio-9p-local.c
@@ -488,7 +488,7 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath 
*dir_path,
 int err = -1;
 int serrno = 0;
 V9fsString fullname;
-char *buffer;
+char *buffer = NULL;
 
 v9fs_string_init(&fullname);
 v9fs_string_sprintf(&fullname, "%s/%s", dir_path->data, name);
@@ -499,7 +499,6 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath 
*dir_path,
 buffer = rpath(fs_ctx, path);
 err = mknod(buffer, SM_LOCAL_MODE_BITS|S_IFREG, 0);
 if (err == -1) {
-g_free(buffer);
 goto out;
 }
 err = local_set_xattr(buffer, credp);
@@ -512,7 +511,6 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath 
*dir_path,
 buffer = rpath(fs_ctx, path);
 err = mknod(buffer, SM_LOCAL_MODE_BITS|S_IFREG, 0);
 if (err == -1) {
-g_free(buffer);
 goto out;
 }
 err = local_set_mapped_file_attr(fs_ctx, path, credp);
@@ -525,7 +523,6 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath 
*dir_path,
 buffer = rpath(fs_ctx, path);
 err = mknod(buffer, credp->fc_mode, credp->fc_rdev);
 if (err == -1) {
-g_free(buffer);
 goto out;
 }
 err = local_post_create_passthrough(fs_ctx, path, credp);
@@ -539,8 +536,8 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath 
*dir_path,
 err_end:
 remove(buffer);
 errno = serrno;
-g_free(buffer);
 out:
+g_free(buffer);
 v9fs_string_free(&fullname);
 return err;
 }
@@ -552,7 +549,7 @@ static int local_mkdir(FsContext *fs_ctx, V9fsPath 
*dir_path,
 int err = -1;
 int serrno = 0;
 V9fsString fullname;
-char *buffer;
+char *buffer = NULL;
 
 v9fs_string_init(&fullname);
 v9fs_string_sprintf(&fullname, "%s/%s", dir_path->data, name);
@@ -563,7 +560,6 @@ static int local_mkdir(FsContext *fs_ctx, V9fsPath 
*dir_path,
 buffer = rpath(fs_ctx, path);
 err = mkdir(buffer, SM_LOCAL_DIR_MODE_BITS);
 if (err == -1) {
-g_free(buffer);
 goto out;
 }
 credp->fc_mode = credp->fc_mode|S_IFDIR;
@@ -576,7 +572,6 @@ static int local_mkdir(FsContext *fs_ctx, V9fsPath 
*dir_path,
 buffer = rpath(fs_ctx, path);
 err = mkdir(buffer, SM_LOCAL_DIR_MODE_BITS);
 if (err == -1) {
-g_free(buffer);
 goto out;
 }
 credp->fc_mode = credp->fc_mode|S_IFDIR;
@@ -590,7 +585,6 @@ static int local_mkdir(FsContext *fs_ctx, V9fsPath 
*dir_path,
 buffer = rpath(fs_ctx, path);
 err = mkdir(buffer, credp->fc_mode);
 if (err == -1) {
-g_free(buffer);
 goto out;
 }
 err = local_post_create_passthrough(fs_ctx, path, credp);
@@ -604,8 +598,8 @@ static int local_mkdir(FsContext *fs_ctx, V9fsPath 
*dir_path,
 err_end:
 remove(buffer);
 errno = serrno;
-g_free(buffer);
 out:
+g_free(buffer);
 v9fs_string_free(&fullname);
 return err;
 }
@@ -659,7 +653,7 @@ static int local_open2(FsContext *fs_ctx, V9fsPath 
*dir_path, const char *name,
 int err = -1;
 int serrno = 0;
 V9fsString fullname;
-char *buffer;
+char *buffer = NULL;
 
 /*
  * Mark all the open to not follow symlinks
@@ -675,7 +669,6 @@ static int local_open2(FsContext *fs_ctx, V9fsPath 
*dir_path, const char *name,
 buffer = rpath(fs_ctx, path);
 fd = open(buffer, flags, SM_LOCAL_MODE_BITS);
 if (fd == -1) {
-g_free(buffer);
 err = fd;
 goto out;
 }
@@ -690,7 +683,6 @@ static int local_open2(FsContext *fs_ctx, V9fsPath 
*dir_path, const char *name,
 buffer = rpath(fs_ctx, path);
 fd = open(buffer, flags, SM_LOCAL_MODE_BITS);
 if (fd == -1) {
-g_free(buffer);
 err = fd;
 goto out;
 }
@@ -706,7 +698,6 @@ static int local_open2(FsContext *fs_ctx, V9fsPath 
*dir_path, const char *name,
 buffer = rpath(fs_ctx, path);
 fd = open(buffer, flags, credp->fc_mode);
 if (fd == -1) {
-g_free(buffer);
 err = fd;
 goto out;
 }
@@ -724,8 +715,8 @@ err_end:
 close(fd);
 remove(buffer);
 errno = serrno;
-g_free(buffer);
 out:
+g_free(buffer);
 v9fs_string_free(&fullname);
 return err;
 }
@@ -738,7 +729,7 @@ static int local_symlink(FsContext *fs_ctx, const char 
*oldpath,
 int serrno = 0;
 char *newpath;
 V9fsString fullname;
-

[Qemu-devel] [PULL 8/8] pcie_aer: fix comment to match pcie spec

2015-03-18 Thread Michael S. Tsirkin
Code comment says "table 6-2" but in fact it's is not a table, it is
"Figure 6-2" on page 479.

Cc: Chen Fan 
Reported-by: Michael Tokarev 
Signed-off-by: Michael S. Tsirkin 
---
 hw/pci/pcie_aer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 9126058..eaa3e6e 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -622,8 +622,8 @@ static bool pcie_aer_inject_uncor_error(PCIEAERInject *inj, 
bool is_fatal)
  *
  * 6.2.4 Error Logging
  * 6.2.5 Sequence of Device Error Signaling and Logging Operations
- * table 6-2: Flowchart Showing Sequence of Device Error Signaling and Logging
- *Operations
+ * Figure 6-2: Flowchart Showing Sequence of Device Error Signaling and Logging
+ * Operations
  */
 int pcie_aer_inject_error(PCIDevice *dev, const PCIEAERErr *err)
 {
-- 
MST




[Qemu-devel] [PULL 3/8] pcie: correct mistaken register bit for End-End TLP Prefix Blocking

2015-03-18 Thread Michael S. Tsirkin
From: Chen Fan 

from pcie spec 7.8.17, the End-End TLP Prefix Blocking bit local
is 15(e.g. 0x8000) in device control 2 register.

Signed-off-by: Chen Fan 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/pci/pcie_regs.h | 2 +-
 hw/pci/pcie.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/hw/pci/pcie_regs.h b/include/hw/pci/pcie_regs.h
index 652d9fc..848ab1c 100644
--- a/include/hw/pci/pcie_regs.h
+++ b/include/hw/pci/pcie_regs.h
@@ -72,7 +72,7 @@
 #define PCI_EXP_DEVCAP2_EFF 0x10
 #define PCI_EXP_DEVCAP2_EETLPP  0x20
 
-#define PCI_EXP_DEVCTL2_EETLPPB 0x80
+#define PCI_EXP_DEVCTL2_EETLPPB 0x8000
 
 /* ARI */
 #define PCI_ARI_VER 1
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 1a1..1463e65 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -84,7 +84,7 @@ int pcie_cap_init(PCIDevice *dev, uint8_t offset, uint8_t 
type, uint8_t port)
 pci_set_long(exp_cap + PCI_EXP_DEVCAP2,
  PCI_EXP_DEVCAP2_EFF | PCI_EXP_DEVCAP2_EETLPP);
 
-pci_set_word(dev->wmask + pos, PCI_EXP_DEVCTL2_EETLPPB);
+pci_set_word(dev->wmask + pos + PCI_EXP_DEVCTL2, PCI_EXP_DEVCTL2_EETLPPB);
 return pos;
 }
 
-- 
MST




[Qemu-devel] [PULL 7/8] pci: fix several trivial typos in comment

2015-03-18 Thread Michael S. Tsirkin
From: Chen Fan 

Signed-off-by: Chen Fan 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/pci/pci.h  | 2 +-
 include/hw/pci/pcie_aer.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index be2d9b8..b97c295 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -137,7 +137,7 @@ enum {
 #define PCI_CONFIG_HEADER_SIZE 0x40
 /* Size of the standard PCI config space */
 #define PCI_CONFIG_SPACE_SIZE 0x100
-/* Size of the standart PCIe config space: 4KB */
+/* Size of the standard PCIe config space: 4KB */
 #define PCIE_CONFIG_SPACE_SIZE  0x1000
 
 #define PCI_NUM_PINS 4 /* A-D */
diff --git a/include/hw/pci/pcie_aer.h b/include/hw/pci/pcie_aer.h
index bcac80a..2fb8388 100644
--- a/include/hw/pci/pcie_aer.h
+++ b/include/hw/pci/pcie_aer.h
@@ -51,7 +51,7 @@ struct PCIEAERLog {
 PCIEAERErr *log;
 };
 
-/* aer error message: error signaling message has only error sevirity and
+/* aer error message: error signaling message has only error severity and
source id. See 2.2.8.3 error signaling messages */
 struct PCIEAERMsg {
 /*
-- 
MST




[Qemu-devel] [PULL 6/8] aer: fix a wrong init PCI_ERR_COR_STATUS w1cmask type register

2015-03-18 Thread Michael S. Tsirkin
From: Chen Fan 

Error Status Register, so this patch fix a wrong definition
for PCI_ERR_COR_STATUS register with w1cmask type.

Signed-off-by: Chen Fan 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/pci/pcie_aer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 9daebc2..9126058 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -123,7 +123,7 @@ int pcie_aer_init(PCIDevice *dev, uint16_t offset)
  PCI_ERR_UNC_SUPPORTED);
 
 pci_long_test_and_set_mask(dev->w1cmask + offset + PCI_ERR_COR_STATUS,
-   PCI_ERR_COR_STATUS);
+   PCI_ERR_COR_SUPPORTED);
 
 pci_set_long(dev->config + offset + PCI_ERR_COR_MASK,
  PCI_ERR_COR_MASK_DEFAULT);
-- 
MST




  1   2   3   >