Re: [Qemu-devel] [PATCH] vfio-pci: unparent BAR subregions

2015-01-31 Thread Paolo Bonzini


On 31/01/2015 00:55, Alex Williamson wrote:
> Commit d8d95814609e replaced a number of memory_region_destroy()
> calls with object_unparent() calls.  The logic appears to be that
> subregions need to be unparented, but the base region is destroyed
> with the device object.  Doing hotplug testing with vfio-pci I
> occasionally get a segfault from object_finalize_child_property()
> due to completely bogus class pointers on the child Object.  Adding
> the explicit object_unparent() for these subregions resolves the
> problem, however I question the sanity of the Memory API now where
> we sometimes need to destroy MemoryRegions, but the rules aren't
> clear

There is no memory_region_destroy API because you cannot destroy
MemoryRegions.  All you do is releasing the link between the VFIO device
(the parent, specified in memory_region_init*) and the MemoryRegion.
The link caused the VFIO device to keep the MemoryRegion alive.

There can be pending references to the VFIO device at unrealize time,
and this is why the memory_region_destroy() API was not enough.  For
example if someone was doing I/O to a BAR and thus address_space_map is
keeping the VFIO device alive.

The explicit memory_region_destroy() function made it much harder to
handle this case.  You had to define an instance_finalize function for
every class, and do memory_region_destroy() there.  Not surprisingly, no
one did that.  Sure, it's not a common case and a well-behaving guest
does not do that, but if it does it means use-after-frees and thus a
possible guest->host escalation.

Instead, the implicit destruction via reference counting makes this case
easy to handle, because reclamation is done automatically when the VFIO
device dies.

Explicit object_unparent() is only needed if you recreate the memory
region during the lifetime of the object.  This is rarely needed, and it
is simple to spot if it's needed.  If you do memory_region_init* outside
the realize function, most likely you need a matching object_unparent
somewhere else in the device logic.

This was the idea behind commit d8d95814609e.  It only touched a handful
of files because almost no one does memory_region_init* outside the
realize function, and in particular VFIO doesn't.  VFIO follows the
common convention of only creating regions in realize, and thus does not
need object_unparent.

> and there's no longer a memory_region_destroy() function, so
> we need to reach over to some other random QEMU API

It's not random.  Object is the parent class of MemoryRegion.
object_unparent is a method for MemoryRegion.

> and unparent an object that we barely know about

I'm not sure about this?  You certainly know the memory regions you create.

> and certainly didn't explicitly parent previously.

You did when you passed the VFIO device to memory_region_init*.

I'm afraid this patch is incorrect.  You have to find out where the
region is being overwritten.

Thanks,

Paolo

> Signed-off-by: Alex Williamson 
> Cc: Paolo Bonzini 
> Cc: qemu-sta...@nongnu.org
> ---
> 
>  hw/vfio/pci.c |2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 014a92c..c71499e 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2294,10 +2294,12 @@ static void vfio_unmap_bar(VFIOPCIDevice *vdev, int 
> nr)
>  
>  memory_region_del_subregion(&bar->region.mem, &bar->region.mmap_mem);
>  munmap(bar->region.mmap, memory_region_size(&bar->region.mmap_mem));
> +object_unparent(OBJECT(&bar->region.mmap_mem));
>  
>  if (vdev->msix && vdev->msix->table_bar == nr) {
>  memory_region_del_subregion(&bar->region.mem, &vdev->msix->mmap_mem);
>  munmap(vdev->msix->mmap, memory_region_size(&vdev->msix->mmap_mem));
> +object_unparent(OBJECT(&vdev->msix->mmap_mem));
>  }
>  }
>  
> 
> 
> 



Re: [Qemu-devel] [PATCH] vl.c: fix -usb option assertion failure in qemu_opt_get_bool_helper()

2015-01-31 Thread Jan Kiszka
On 2015-01-05 12:37, Jan Kiszka wrote:
> On 2015-01-05 12:22, Stefan Hajnoczi wrote:
>> Commit 49d2e648e8087d154d8bf8b91f27c8e05e79d5a6 ("machine: remove
>> qemu_machine_opts global list") removed option descriptions from the
>> -machine QemuOptsList to avoid repeating MachineState's QOM properties.
>>
>> This change broke vl.c:usb_enabled() because qemu_opt_get_bool() cannot
>> be used on QemuOptsList without option descriptions since QemuOpts
>> doesn't know the type and therefore left an unparsed string value.
>>
>> This patch avoids calling qemu_opt_get_bool() to fix the assertion
>> failure:
>>
>>   $ qemu-system-x86_64 -usb
>>   qemu_opt_get_bool_helper: Assertion `opt->desc && opt->desc->type == 
>> QEMU_OPT_BOOL' failed.
>>
>> Test the presence of -usb using qemu_opt_find() but use the
>> MachineState->usb field instead of qemu_opt_get_bool().
>>
>> Cc: Marcel Apfelbaum 
>> Cc: Tiejun Chen 
>> Signed-off-by: Stefan Hajnoczi 
>> ---
>>  vl.c | 7 +--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/vl.c b/vl.c
>> index bea9656..6e8889c 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -999,8 +999,11 @@ static int parse_name(QemuOpts *opts, void *opaque)
>>  
>>  bool usb_enabled(bool default_usb)
>>  {
>> -return qemu_opt_get_bool(qemu_get_machine_opts(), "usb",
>> - has_defaults && default_usb);
>> +if (qemu_opt_find(qemu_get_machine_opts(), "usb")) {
>> +return current_machine->usb;
>> +} else {
>> +return has_defaults && default_usb;
>> +}
>>  }
>>  
>>  #ifndef _WIN32
>>
> 
> That still leaves the other boolean machine options broken. A generic
> fix would be good. Or revert the original commit until we have one.

Just pulled current master, and at least the iommu machine option is
still triggering an abort.

What is the status of fixing these fallouts? Was anything else addressed
by now, just this one forgotten?

Jan




signature.asc
Description: OpenPGP digital signature


[Qemu-devel] 2.1.0 unexpected abort after canceling migrate

2015-01-31 Thread Wangting (Kathy)
Hello,

Qemu 2.1.0 is currently crashing with the log

[2015-01-30T08:17:29.512340Z] handle_qmp_command:5100 qmp_cmd_name: 
migrate_cancel
[2015-01-30T08:17:29.857114Z] handle_qmp_command:5110 qmp_cmd_name: 
block-job-cancel, qmp_cmd_arguments: {"device": "drive-scsi0-0-0-0"}
[2015-01-30T08:17:29.890641Z] handle_qmp_command:5110 qmp_cmd_name: 
block-job-cancel, qmp_cmd_arguments: {"device": "drive-scsi0-0-0-4"}
nbd.c:nbd_receive_reply():L797: read failed
Co-routine re-entered recursively
[2015-01-30 16:17:30]: shutting down

after execution of the commands "migrate_cancel" and "block-job-cancel"

That can be easily reproduceable when using ide or virtio-scsi disks.




[Qemu-devel] [PATCH 1/2] Add GDB qAttached support

2015-01-31 Thread Jan Kiszka
From: Jan Kiszka 

With this patch QEMU handles qAttached request from gdb. When QEMU
replies 1, GDB sends a "detach" command at the end of a debugging
session otherwise GDB sends "kill".

The default value for qAttached is 1 on system emulation and 0 on user
emulation.

Based on original version by Fabien Chouteau.

Signed-off-by: Jan Kiszka 
---

Long pending in my queue. Hope we can finally get these two in via
trivial (that's what they are).

 gdbstub.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gdbstub.c b/gdbstub.c
index e4a1a79..da3e7cb 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -41,6 +41,12 @@
 #include "qemu/sockets.h"
 #include "sysemu/kvm.h"
 
+#ifdef CONFIG_USER_ONLY
+#define GDB_ATTACHED "0"
+#else
+#define GDB_ATTACHED "1"
+#endif
+
 static inline int target_memory_rw_debug(CPUState *cpu, target_ulong addr,
  uint8_t *buf, int len, bool is_write)
 {
@@ -1187,6 +1193,10 @@ static int gdb_handle_packet(GDBState *s, const char 
*line_buf)
 put_packet_binary(s, buf, len + 1);
 break;
 }
+if (strncmp(p, "Attached", 8) == 0) {
+put_packet(s, GDB_ATTACHED);
+break;
+}
 /* Unrecognised 'q' command.  */
 goto unknown_command;
 
-- 
2.1.4



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH 2/2] Revert "gdbstub: Do not kill target in system emulation mode"

2015-01-31 Thread Jan Kiszka
From: Fabien Chouteau 

The requirements described in this patch are implemented by "Add GDB
qAttached support".

This reverts commit 00e94dbc7fd0110b0555d59592b004333adfb4b8.

Signed-off-by: Fabien Chouteau 
Signed-off-by: Jan Kiszka 
---
 gdbstub.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index da3e7cb..14e1f0c 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -880,11 +880,9 @@ static int gdb_handle_packet(GDBState *s, const char 
*line_buf)
 goto unknown_command;
 }
 case 'k':
-#ifdef CONFIG_USER_ONLY
 /* Kill the target */
 fprintf(stderr, "\nQEMU: Terminated via GDBstub\n");
 exit(0);
-#endif
 case 'D':
 /* Detach packet */
 gdb_breakpoint_remove_all();
-- 
2.1.4



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v2 3/7] softfloat: Convert `*_default_nan' variables into inline functions

2015-01-31 Thread Maciej W. Rozycki
On Fri, 30 Jan 2015, Peter Maydell wrote:

> >  Hmm, so perhaps my idea for a later improvement:
> >
> >>  Eventually we might want to move the new inline functions into a
> >> separate header to be included from softfloat.h instead of softfloat.c,
> >> but let's make changes one step at a time.
> >
> > will actually have to be made right away.  I suspect GCC is more liberal
> > here due to its convoluted extern/static/inline semantics history.
> > Sigh...
> 
> I would suggest just using "static inline", as we do elsewhere
> for little utility functions.

 Yes, that's exactly what they'd have to be moved into a separate header 
for.

  Maciej



Re: [Qemu-devel] [PATCH v2 3/7] softfloat: Convert `*_default_nan' variables into inline functions

2015-01-31 Thread Peter Maydell
On 31 January 2015 at 11:56, Maciej W. Rozycki  wrote:
> On Fri, 30 Jan 2015, Peter Maydell wrote:
>
>> >  Hmm, so perhaps my idea for a later improvement:
>> >
>> >>  Eventually we might want to move the new inline functions into a
>> >> separate header to be included from softfloat.h instead of softfloat.c,
>> >> but let's make changes one step at a time.
>> >
>> > will actually have to be made right away.  I suspect GCC is more liberal
>> > here due to its convoluted extern/static/inline semantics history.
>> > Sigh...
>>
>> I would suggest just using "static inline", as we do elsewhere
>> for little utility functions.
>
>  Yes, that's exactly what they'd have to be moved into a separate header
> for.

Why do they need to be moved into a different header to do this?
I must be missing something...

-- PMM



Re: [Qemu-devel] [Xen-devel] [RFC][PATCH 1/1] libxl: add one machine property to support IGD GFX passthrough

2015-01-31 Thread Wei Liu
On Sat, Jan 31, 2015 at 07:07:16AM +, Xu, Quan wrote:
> 
> 
> > -Original Message-
> > From: xen-devel-boun...@lists.xen.org
> > [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Wei Liu
> > Sent: Friday, January 30, 2015 8:26 PM
> > To: Chen, Tiejun
> > Cc: Wei Liu; ian.campb...@citrix.com; m...@redhat.com; Ian Jackson;
> > qemu-devel@nongnu.org; xen-de...@lists.xen.org; Gerd Hoffmann
> > Subject: Re: [Xen-devel] [Qemu-devel] [RFC][PATCH 1/1] libxl: add one 
> > machine
> > property to support IGD GFX passthrough
> > 
> > On Fri, Jan 30, 2015 at 08:56:48AM +0800, Chen, Tiejun wrote:
> > [...]
> > > >>>
> > > >>>Just remember to handle old option in libxl if your old option is
> > > >>>already released by some older version of QEMUs.
> > > >>
> > > >>I just drop that old option, -gfx_passthru, if we're under qemu
> > > >>upstream circumstance, like this,
> > > >>
> > > >
> > > >The question is, is there any version of qemu upstream that has been
> > > >released that has the old option (-gfx-passthru)?
> > >
> > > No. Just now we're starting to support IGD passthrough in qemu upstream.
> > >
> > 
> > Right, as of QEMU 2.2.0 there's no support of IGD passthrough in QMEU
> > upstream.
> > 
> 
> Just a question:
>Now what features do vt-d support? Thanks.
> 

I don't know whether vt-d is supported in qemu upstream.

But if there is support in upstream and you want to change some options,
the same principle in my previous email still applies.

Wei.



Re: [Qemu-devel] PCI iommu issues

2015-01-31 Thread Knut Omang
On Fri, 2015-01-30 at 08:40 +0100, Jan Kiszka wrote:
> Adding Knut to CC as he particularly looked into and fixed the bridging
> issues or the vtd emulation. I will have to refresh my memories first.

Thanks for cc'ing me, Jan -
Yes I did some work specifically to serve my need of being able to use
the intel iommu with a PCIe device in a root port.

> Jan
> 
> On 2015-01-30 06:45, Benjamin Herrenschmidt wrote:
> > Hi folks !
> > 
> > 
> > I've looked at the intel iommu code to try to figure out how to properly
> > implement a Power8 "native" iommu model and encountered some issues.
> > 
> > Today "pseries" ppc machine is paravirtualized and so uses a pretty
> > simplistic iommu model that essentially has one address space per host
> > bridge.
> > 
> > However, the real HW model I'm working on is closer to Intel in that we
> > have various tables walked by HW that match an originator RID to what
> > we called a "PE" (Partitionable Endpoint) to which corresponds an
> > address space.
> > 
> > So on a given domain, individual functions can have different iommu
> > address spaces & translation structures, or group of devices etc...
> > which can all be configured dynamically by the guest OS. This is similar
> > as far as I understand things to the Intel model though the details of
> > the implementation are very different.
> > 
> > So I implemented something along the lines of what you guys did for q35
> > and intel_iommu, and quickly discovered that it doesn't work, which
> > makes me wonder whether the intel stuff in qemu actually works, or
> > rather, does it work when adding bridges & switches into the picture.
> > 
> > I basically have two problems but they are somewhat related. Firstly
> > the way the intel code works is that it creates lazily context
> > structures that contain the address space, and get associated with
> > devices when pci_device_iommu_address_space() is called which in
> > turns calls the bridge iommu_fn which performs the association.
> > 
> > The first problem is that the association is done based on bus/dev/fn
> > of the device... at a time where bus numbers have not been assigned yet.
> >
> > In fact, the bus numbers are assigned dynamically by SW, the BIOS
> > typically, but the OS can renumber things and it's bogus to assume thus
> > that the RID (bus/dev/fn) of a PCI device/function is fixed. However
> > that's exactly what the code does as it calls
> > pci_device_iommu_address_space() once at device instanciation time in
> > qemu, even before SW had a chance to assign anything.

This is very much in line with my understanding, and I implemented a
solution that works for me in the root port and downstream cases on
Intel in the patch leading up to this:

http://lists.gnu.org/archive/html/qemu-devel/2014-10/msg03121.html

I still have on my TODO list to follow up on your feedback, Michael, but
i figured it needs a bit of continuous focus and unfortunately I have
had far too much to do on other fronts lately to be able to get forward
on this.
 
As Jan argued, my solution cannot be used with non-pci devices, and I
didnt have much to work on wrt other archtectures since the
implementations there, as you indicate, is fairly simple so far. 

I also understand there has been quite some work on "platform" devices
lately, maybe that makes it easier to integrate at a more generic level?

> > So as far as I can tell, things will work as long as you are on bus 0
> > and there is no bridge, otherwise, it's broken by design, unless I'm
> > missing something...

Yes, as it stands today it does not work with bridges on Intel.

> > I've hacked that locally in my code by using the PCIBus * pointer
> > instead of the bus number to match the device to the iommu context.
> > 
> > The second problem is that pci_device_iommu_address_space(), as it walks
> > up the hierarchy to find the iommu_fn, drops the original device
> > information. That means that if a device is below a switch or a p2p
> > bridge of some sort, once you reach the host bridge top level bus, all
> > we know is the bus & devfn of the last p2p entity along the path, we
> > lose the original bus & devfn information.
> >
> > This is incorrect for that sort of iommu, at least while in the PCIe
> > domain, as the original RID is carried along with DMA transactions and i
> > thus needed to properly associate the device/function with a context.
> > 
> > One fix could be to populate the iommu_fn of every bus down the food
> > chain but that's fairly cumbersome... unless we make the PCI bridges by
> > default "inherit" from their parent iommu_fn.
> > 
> > Here, I've done a hack locally to keep the original device information
> > in pci_device_iommu_address_space() but it's not a proper way to do it
> > either, ultimately, each bridge need to be able to tell whether it
> > properly forwards the RID information or not, so the bridge itself need
> > to have some attribute to control that. Typically a PCIe switch or root
> > complex will always 

Re: [Qemu-devel] [PATCH v2 3/7] softfloat: Convert `*_default_nan' variables into inline functions

2015-01-31 Thread Maciej W. Rozycki
On Sat, 31 Jan 2015, Peter Maydell wrote:

> >> >  Hmm, so perhaps my idea for a later improvement:
> >> >
> >> >>  Eventually we might want to move the new inline functions into a
> >> >> separate header to be included from softfloat.h instead of softfloat.c,
> >> >> but let's make changes one step at a time.
> >> >
> >> > will actually have to be made right away.  I suspect GCC is more liberal
> >> > here due to its convoluted extern/static/inline semantics history.
> >> > Sigh...
> >>
> >> I would suggest just using "static inline", as we do elsewhere
> >> for little utility functions.
> >
> >  Yes, that's exactly what they'd have to be moved into a separate header
> > for.
> 
> Why do they need to be moved into a different header to do this?
> I must be missing something...

 This is because fpu/softfloat-specialize.h is an implementation header 
private to SoftFloat and therefore such inline definitions won't be seen 
by users outside SoftFloat, such as target-mips/msa_helper.c.  And IMO 
they shouldn't be moved into include/fpu/softfloat.h itself as this header 
contains generic stuff and is supposed to have no TARGET_foo stuff, as 
observed by current usage and inferred from comments in fpu/softfloat.c.

 So ultimately I think the newly converted `*_default_nan' inline 
functions will need to go into include/fpu/softfloat-public-specialize.h 
or suchlike, which will be pulled from include/fpu/softfloat.h for general 
availability.

 Overall I think the whole arrangement in fpu/softfloat-specialize.h has a 
potential to being cleaned up by removing all the chains of #ifdef's and 
splitting the conditional bits into separate headers matching the current 
variants.  E.g. the presence of this construct:

#if defined(TARGET_SPARC)
const float64 float64_default_nan = const_float64(LIT64( 0x7FFF ));
#elif defined(TARGET_PPC) || defined(TARGET_ARM) || defined(TARGET_ALPHA)
const float64 float64_default_nan = const_float64(LIT64( 0x7FF8 ));
#elif SNAN_BIT_IS_ONE
const float64 float64_default_nan = const_float64(LIT64( 0x7FF7 ));
#else
const float64 float64_default_nan = const_float64(LIT64( 0xFFF8 ));
#endif

asks for 4 independent headers defining the 4 bit pattern styles used by 
targets for symbolic FP data, one each -- perhaps even pulled indirectly 
via target headers rather than copying the chain of #ifdef's above around 
#include's from wherever they'll be pulled.

 Please correct me if I am wrong, but it appears to me possible right away 
by removing fpu/softfloat-specialize.h and instead creating individual 
target-*/softfloat-specialize.h headers that'll pull the right variant 
each target requires from a pool of templates made available in fpu/.  
Then include/fpu/softfloat.h could do the same for the public stuff (i.e. 
these NaN bits concerned here) placed similarly in include/fpu/.

 FWIW,

  Maciej



Re: [Qemu-devel] [PATCH] vfio-pci: unparent BAR subregions

2015-01-31 Thread Alex Williamson
On Sat, 2015-01-31 at 09:43 +0100, Paolo Bonzini wrote:
> 
> On 31/01/2015 00:55, Alex Williamson wrote:
> > Commit d8d95814609e replaced a number of memory_region_destroy()
> > calls with object_unparent() calls.  The logic appears to be that
> > subregions need to be unparented, but the base region is destroyed
> > with the device object.  Doing hotplug testing with vfio-pci I
> > occasionally get a segfault from object_finalize_child_property()
> > due to completely bogus class pointers on the child Object.  Adding
> > the explicit object_unparent() for these subregions resolves the
> > problem, however I question the sanity of the Memory API now where
> > we sometimes need to destroy MemoryRegions, but the rules aren't
> > clear
> 
> There is no memory_region_destroy API because you cannot destroy
> MemoryRegions.  All you do is releasing the link between the VFIO device
> (the parent, specified in memory_region_init*) and the MemoryRegion.
> The link caused the VFIO device to keep the MemoryRegion alive.
> 
> There can be pending references to the VFIO device at unrealize time,
> and this is why the memory_region_destroy() API was not enough.  For
> example if someone was doing I/O to a BAR and thus address_space_map is
> keeping the VFIO device alive.
> 
> The explicit memory_region_destroy() function made it much harder to
> handle this case.  You had to define an instance_finalize function for
> every class, and do memory_region_destroy() there.  Not surprisingly, no
> one did that.  Sure, it's not a common case and a well-behaving guest
> does not do that, but if it does it means use-after-frees and thus a
> possible guest->host escalation.
> 
> Instead, the implicit destruction via reference counting makes this case
> easy to handle, because reclamation is done automatically when the VFIO
> device dies.
> 
> Explicit object_unparent() is only needed if you recreate the memory
> region during the lifetime of the object.  This is rarely needed, and it
> is simple to spot if it's needed.  If you do memory_region_init* outside
> the realize function, most likely you need a matching object_unparent
> somewhere else in the device logic.
> 
> This was the idea behind commit d8d95814609e.  It only touched a handful
> of files because almost no one does memory_region_init* outside the
> realize function, and in particular VFIO doesn't.  VFIO follows the
> common convention of only creating regions in realize, and thus does not
> need object_unparent.
> 
> > and there's no longer a memory_region_destroy() function, so
> > we need to reach over to some other random QEMU API
> 
> It's not random.  Object is the parent class of MemoryRegion.
> object_unparent is a method for MemoryRegion.
> 
> > and unparent an object that we barely know about
> 
> I'm not sure about this?  You certainly know the memory regions you create.
> 
> > and certainly didn't explicitly parent previously.
> 
> You did when you passed the VFIO device to memory_region_init*.
> 
> I'm afraid this patch is incorrect.  You have to find out where the
> region is being overwritten.

Thanks Paolo, so if I look more closely at where you added
object_unparent() calls in d8d95814609e, I can see that they're
associated with dynamically allocated objects that are freed as part of
the vfio device exitfn.  vdev->msix is also such a structure and is the
property causing us the segfaults.  Being associated with a free also
explains the randomness of the segfault.  So, I think the second
object_unparent() call is correct and that the guiding principle is that
any MemoryRegion associated with a dynamically allocated structure and
freed as part of the class exit callback needs to be explicitly
unparented.  Does that sound right?  Thanks,

Alex

> > Signed-off-by: Alex Williamson 
> > Cc: Paolo Bonzini 
> > Cc: qemu-sta...@nongnu.org
> > ---
> > 
> >  hw/vfio/pci.c |2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > index 014a92c..c71499e 100644
> > --- a/hw/vfio/pci.c
> > +++ b/hw/vfio/pci.c
> > @@ -2294,10 +2294,12 @@ static void vfio_unmap_bar(VFIOPCIDevice *vdev, int 
> > nr)
> >  
> >  memory_region_del_subregion(&bar->region.mem, &bar->region.mmap_mem);
> >  munmap(bar->region.mmap, memory_region_size(&bar->region.mmap_mem));
> > +object_unparent(OBJECT(&bar->region.mmap_mem));
> >  
> >  if (vdev->msix && vdev->msix->table_bar == nr) {
> >  memory_region_del_subregion(&bar->region.mem, 
> > &vdev->msix->mmap_mem);
> >  munmap(vdev->msix->mmap, 
> > memory_region_size(&vdev->msix->mmap_mem));
> > +object_unparent(OBJECT(&vdev->msix->mmap_mem));
> >  }
> >  }
> >  
> > 
> > 
> > 






[Qemu-devel] debugging qemu arm

2015-01-31 Thread Attila Csosz
Hi,

I'm trying to debug qemu when it executes a simple arm executable. Where is
in the qemu code when executing a single arm asm instruction?

Thanks
 Attila


[Qemu-devel] [Bug 1308341] Re: Multiple CPUs causes blue screen on Windows guest (14.04 regression)

2015-01-31 Thread Fred Thoma
*** This bug is a duplicate of bug 1346917 ***
https://bugs.launchpad.net/bugs/1346917

Just wanted to add that upgrading my kernel to a newer version fixed the
problem for me, too.

Host: 2x E5-2620V2, Ubuntu 14.04 LTS
Guest: 24 virtual cores, Windows Server 2008 R2

Before fix:
sudo uname -a
Linux x.contabo.net 3.13.0-44-generic #73-Ubuntu SMP Tue Dec 16 00:22:43 UTC 
2014 x86_64 x86_64 x86_64 GNU/Linux
Bluescreen stop 0x005c every few hours

After fix:
sudo uname -a
Linux x.contabo.net 3.16.0-23-generic #31-Ubuntu SMP Tue Oct 21 17:56:17 UTC 
2014 x86_64 x86_64 x86_64 GNU/Linux
No Bluescreens or other crashes since 7 days under full load

Upgraded with this tutorial http://askubuntu.com/questions/541775/how-
can-i-install-ubuntu-14-10s-kernel-in-ubuntu-14-04-lts

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1308341

Title:
  Multiple CPUs causes blue screen on Windows guest (14.04 regression)

Status in QEMU:
  New
Status in linux package in Ubuntu:
  Confirmed
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  Configuring a Windows 7 guest using more than one CPU cases the guest to 
fail. This happens after a few hours after guest boot. This is the error on the 
blue screen:
  "A clock interrupt was not received on a secondary processor within the 
allocated time interval"

  After resetting, the guest will never boot and a new bluescreen with
  the error "STOP: 0x005c" appears. Shutting down the guest
  completely and restarting it will allow it to boot and run for a few
  hours again.

  The guest was created using virt-manager. The error happens with or
  without virtio devices and with both 32-bit and 64-bit Windows 7
  guests.

  I am using Ubuntu 14.04 release candidate.

  qemu-kvm version 2.0.0~rc1+dfsg-0ubuntu3

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1308341/+subscriptions



Re: [Qemu-devel] [Xen-devel] [RFC][PATCH 1/1] libxl: add one machine property to support IGD GFX passthrough

2015-01-31 Thread Xu, Quan


> -Original Message-
> From: Wei Liu [mailto:wei.l...@citrix.com]
> Sent: Saturday, January 31, 2015 10:33 PM
> To: Xu, Quan
> Cc: Wei Liu; Chen, Tiejun; ian.campb...@citrix.com; m...@redhat.com; Ian 
> Jackson;
> qemu-devel@nongnu.org; xen-de...@lists.xen.org; Gerd Hoffmann
> Subject: Re: [Xen-devel] [Qemu-devel] [RFC][PATCH 1/1] libxl: add one machine
> property to support IGD GFX passthrough
> 
> On Sat, Jan 31, 2015 at 07:07:16AM +, Xu, Quan wrote:
> >
> >
> > > -Original Message-
> > > From: xen-devel-boun...@lists.xen.org
> > > [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Wei Liu
> > > Sent: Friday, January 30, 2015 8:26 PM
> > > To: Chen, Tiejun
> > > Cc: Wei Liu; ian.campb...@citrix.com; m...@redhat.com; Ian Jackson;
> > > qemu-devel@nongnu.org; xen-de...@lists.xen.org; Gerd Hoffmann
> > > Subject: Re: [Xen-devel] [Qemu-devel] [RFC][PATCH 1/1] libxl: add
> > > one machine property to support IGD GFX passthrough
> > >
> > > On Fri, Jan 30, 2015 at 08:56:48AM +0800, Chen, Tiejun wrote:
> > > [...]
> > > > >>>
> > > > >>>Just remember to handle old option in libxl if your old option
> > > > >>>is already released by some older version of QEMUs.
> > > > >>
> > > > >>I just drop that old option, -gfx_passthru, if we're under qemu
> > > > >>upstream circumstance, like this,
> > > > >>
> > > > >
> > > > >The question is, is there any version of qemu upstream that has
> > > > >been released that has the old option (-gfx-passthru)?
> > > >
> > > > No. Just now we're starting to support IGD passthrough in qemu upstream.
> > > >
> > >
> > > Right, as of QEMU 2.2.0 there's no support of IGD passthrough in
> > > QMEU upstream.
> > >
> >
> > Just a question:
> >Now what features do vt-d support? Thanks.
> >
> 
> I don't know whether vt-d is supported in qemu upstream.
> 
> But if there is support in upstream and you want to change some options, the
> same principle in my previous email still applies.
> 
> Wei.

Thanks.  -Quan



Re: [Qemu-devel] debugging qemu arm

2015-01-31 Thread Peter Maydell
On 31 January 2015 at 12:25, Attila Csosz  wrote:
> I'm trying to debug qemu when it executes a simple arm executable. Where is
> in the qemu code when executing a single arm asm instruction?

QEMU works in two phases:
 (1) we translate ARM code into x86 instructions
 (2) we run the instructions created in phase 1

So when we're executing an ARM instruction we're in phase 2;
this code was generated at runtime and isn't part of QEMU's
source code at all.

-- PMM



Re: [Qemu-devel] debugging qemu arm

2015-01-31 Thread Attila Csosz
Where is the arm-to-x86 call in QEMU code? Which tool/library call
generates this code?

Attila


On Sat, Jan 31, 2015 at 5:43 PM, Peter Maydell 
wrote:

> On 31 January 2015 at 12:25, Attila Csosz  wrote:
> > I'm trying to debug qemu when it executes a simple arm executable. Where
> is
> > in the qemu code when executing a single arm asm instruction?
>
> QEMU works in two phases:
>  (1) we translate ARM code into x86 instructions
>  (2) we run the instructions created in phase 1
>
> So when we're executing an ARM instruction we're in phase 2;
> this code was generated at runtime and isn't part of QEMU's
> source code at all.
>
> -- PMM
>


Re: [Qemu-devel] debugging qemu arm

2015-01-31 Thread Peter Maydell
On 31 January 2015 at 16:50, Attila Csosz  wrote:
> Where is the arm-to-x86 call in QEMU code? Which tool/library call generates
> this code?

We generate the code in target-arm/translate.c (actually we generate
a TCG intermediate representation which is subsequently turned into
x86 instructions by the TCG backend in tcg/i386/). The main runtime
loop is in exec.c: when we call tcg_qemu_tb_exec() this is actually
a jump to generated code:
# define tcg_qemu_tb_exec(env, tb_ptr) \
((uintptr_t (*)(void *, void *))tcg_ctx.code_gen_prologue)(env, tb_ptr)

-- PMM



Re: [Qemu-devel] [PATCH v6 3/5] acpi: add build_append_namestring() helper

2015-01-31 Thread Michael S. Tsirkin
On Fri, Jan 30, 2015 at 12:46:41PM +0100, Igor Mammedov wrote:
> On Wed, 28 Jan 2015 17:16:17 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Wed, Jan 28, 2015 at 02:34:48PM +, Igor Mammedov wrote:
> > > Use build_append_namestring() instead of build_append_nameseg()
> > > So user won't have to care whether name is NameSeg, NamePath or
> > > NameString.
> > > 
> > > See for refernce ACPI 5.0: 20.2.2 Name Objects Encoding
> > > 
> > 
> > typo
> > 
> > > Signed-off-by: Igor Mammedov 
> > > Reviewed-by: Claudio Fontana 
> > > Acked-by: Marcel Apfelbaum 
> > > ---
> > > v4:
> > >  * fix codding style of ACPI_DualNamePrefix & ACPI_MultiNamePrefix
> > >it's imposible to use literals as suggested due to
> > >g_array_append_val() requiring reference value
> > > 
> > > v3:
> > >  assert on wrong Segcount earlier and extend condition to
> > >   seg_count > 0 && seg_count <= 255
> > >  as suggested by Claudio Fontana 
> > > ---
> > >  hw/acpi/aml-build.c | 92
> > > -
> > > hw/i386/acpi-build.c| 35 -
> > > include/hw/acpi/aml-build.h |  2 +- 3 files changed, 108
> > > insertions(+), 21 deletions(-)
> > > 
> > > diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> > > index b28c1f4..ed4ab56 100644
> > > --- a/hw/acpi/aml-build.c
> > > +++ b/hw/acpi/aml-build.c
> > > @@ -52,7 +52,7 @@ void build_append_array(GArray *array, GArray
> > > *val) 
> > >  #define ACPI_NAMESEG_LEN 4
> > >  
> > > -void GCC_FMT_ATTR(2, 3)
> > > +static void GCC_FMT_ATTR(2, 3)
> > >  build_append_nameseg(GArray *array, const char *format, ...)
> > >  {
> > >  /* It would be nicer to use g_string_vprintf but it's only
> > > there in 2.22 */ @@ -71,6 +71,96 @@ build_append_nameseg(GArray
> > > *array, const char *format, ...) g_array_append_vals(array, "",
> > > ACPI_NAMESEG_LEN - len); }
> > >  
> > > +static void
> > > +build_append_namestringv(GArray *array, const char *format,
> > > va_list ap) +{
> > > +/* It would be nicer to use g_string_vprintf but it's only
> > > there in 2.22 */
> > > +char *s;
> > > +int len;
> > > +va_list va_len;
> > > +char **segs;
> > > +char **segs_iter;
> > > +int seg_count = 0;
> > > +
> > > +va_copy(va_len, ap);
> > > +len = vsnprintf(NULL, 0, format, va_len);
> > > +va_end(va_len);
> > > +len += 1;
> > > +s = g_new(typeof(*s), len);
> > > +
> > > +len = vsnprintf(s, len, format, ap);
> > > +
> > > +segs = g_strsplit(s, ".", 0);
> > > +g_free(s);
> > > +
> > > +/* count segments */
> > > +segs_iter = segs;
> > > +while (*segs_iter) {
> > > +++segs_iter;
> > > +++seg_count;
> > >
> > How about we split out formatting?
> > Make +build_append_namestringv acceptconst char **segs, int nsegs,
> > put all code above to build_append_namestring.
> Can't do this, AML API calls which accept format string will
> use build_append_namestringv()

OK but I still think it's a good idea to split this out,
and have a static function that accepts an array of segments.




Re: [Qemu-devel] debugging qemu arm

2015-01-31 Thread Attila Csosz
Where is the arm-to-tcg translation?

Attila


On Sat, Jan 31, 2015 at 5:59 PM, Peter Maydell 
wrote:

> On 31 January 2015 at 16:50, Attila Csosz  wrote:
> > Where is the arm-to-x86 call in QEMU code? Which tool/library call
> generates
> > this code?
>
> We generate the code in target-arm/translate.c (actually we generate
> a TCG intermediate representation which is subsequently turned into
> x86 instructions by the TCG backend in tcg/i386/). The main runtime
> loop is in exec.c: when we call tcg_qemu_tb_exec() this is actually
> a jump to generated code:
> # define tcg_qemu_tb_exec(env, tb_ptr) \
> ((uintptr_t (*)(void *, void *))tcg_ctx.code_gen_prologue)(env, tb_ptr)
>
> -- PMM
>


Re: [Qemu-devel] [PATCH v3 4/4] virtio-blk: introduce multiread

2015-01-31 Thread Peter Lieven
Am 30.01.2015 um 22:30 schrieb Max Reitz:
> On 2015-01-30 at 16:05, Peter Lieven wrote:
>> Am 30.01.2015 um 18:16 schrieb Max Reitz:
>>> On 2015-01-30 at 09:33, Peter Lieven wrote:
 this patch finally introduces multiread support to virtio-blk. While
 multiwrite support was there for a long time, read support was missing.

 The complete merge logic is moved into virtio-blk.c which has
 been the only user of request merging ever since. This is required
 to be able to merge chunks of requests and immediately invoke callbacks
 for those requests. Secondly, this is required to switch to
 direct invocation of coroutines which is planned at a later stage.

 The following benchmarks show the performance of running fio with
 4 worker threads on a local ram disk. The numbers show the average
 of 10 test runs after 1 run as warmup phase.

 |4k|   64k|4k
 MB/s  | rd seq | rd rand | rd seq | rd rand | wr seq | wr rand
 --++-++-++
 master| 1221   | 1187| 4178   | 4114| 1745   | 1213
 multiread | 1829   | 1189| 4639   | 4110| 1894   | 1216

 Signed-off-by: Peter Lieven 
 ---
hw/block/dataplane/virtio-blk.c |   8 +-
hw/block/virtio-blk.c   | 288 
 +++-
include/hw/virtio/virtio-blk.h  |  19 +--
trace-events|   1 +
4 files changed, 210 insertions(+), 106 deletions(-)

 diff --git a/hw/block/dataplane/virtio-blk.c 
 b/hw/block/dataplane/virtio-blk.c
 index 39c5d71..93472e9 100644
 --- a/hw/block/dataplane/virtio-blk.c
 +++ b/hw/block/dataplane/virtio-blk.c
 @@ -96,9 +96,7 @@ static void handle_notify(EventNotifier *e)
event_notifier_test_and_clear(&s->host_notifier);
blk_io_plug(s->conf->conf.blk);
for (;;) {
 -MultiReqBuffer mrb = {
 -.num_writes = 0,
 -};
 +MultiReqBuffer mrb = {};
int ret;
  /* Disable guest->host notifies to avoid unnecessary vmexits 
 */
 @@ -120,7 +118,9 @@ static void handle_notify(EventNotifier *e)
virtio_blk_handle_request(req, &mrb);
}
-virtio_submit_multiwrite(s->conf->conf.blk, &mrb);
 +if (mrb.num_reqs) {
 +virtio_submit_multireq(s->conf->conf.blk, &mrb);
 +}
  if (likely(ret == -EAGAIN)) { /* vring emptied */
/* Re-enable guest->host notifies and stop processing the 
 vring.
 diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
 index e04adb8..77890a0 100644
 --- a/hw/block/virtio-blk.c
 +++ b/hw/block/virtio-blk.c
 @@ -34,6 +34,7 @@ VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
req->dev = s;
req->qiov.size = 0;
req->next = NULL;
 +req->mr_next = NULL;
return req;
}
@@ -84,20 +85,29 @@ static int 
 virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
  static void virtio_blk_rw_complete(void *opaque, int ret)
{
 -VirtIOBlockReq *req = opaque;
 +VirtIOBlockReq *next = opaque;
-trace_virtio_blk_rw_complete(req, ret);
 +while (next) {
 +VirtIOBlockReq *req = next;
 +next = req->mr_next;
 +trace_virtio_blk_rw_complete(req, ret);
-if (ret) {
 -int p = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
 -bool is_read = !(p & VIRTIO_BLK_T_OUT);
 -if (virtio_blk_handle_rw_error(req, -ret, is_read))
 -return;
 -}
 +if (req->qiov.nalloc != -1) {
 +qemu_iovec_destroy(&req->qiov);
 +}
-virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
 -block_acct_done(blk_get_stats(req->dev->blk), &req->acct);
 -virtio_blk_free_request(req);
 +if (ret) {
 +int p = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
 +bool is_read = !(p & VIRTIO_BLK_T_OUT);
 +if (virtio_blk_handle_rw_error(req, -ret, is_read)) {
 +continue;
 +}
 +}
 +
 +virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
 +block_acct_done(blk_get_stats(req->dev->blk), &req->acct);
 +virtio_blk_free_request(req);
 +}
}
  static void virtio_blk_flush_complete(void *opaque, int ret)
 @@ -291,24 +301,125 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq 
 *req)
}
}
-void virtio_submit_multiwrite(BlockBackend *blk, MultiReqBuffer *mrb)
 +static inline void
>>> Is the "inline" really worth it? T

Re: [Qemu-devel] [PATCH v3 4/4] virtio-blk: introduce multiread

2015-01-31 Thread Peter Lieven
Am 30.01.2015 um 22:15 schrieb Kevin Wolf:
> Am 30.01.2015 um 22:05 hat Peter Lieven geschrieben:
>> Am 30.01.2015 um 18:16 schrieb Max Reitz:
>>> On 2015-01-30 at 09:33, Peter Lieven wrote:
 this patch finally introduces multiread support to virtio-blk. While
 multiwrite support was there for a long time, read support was missing.

 The complete merge logic is moved into virtio-blk.c which has
 been the only user of request merging ever since. This is required
 to be able to merge chunks of requests and immediately invoke callbacks
 for those requests. Secondly, this is required to switch to
 direct invocation of coroutines which is planned at a later stage.

 The following benchmarks show the performance of running fio with
 4 worker threads on a local ram disk. The numbers show the average
 of 10 test runs after 1 run as warmup phase.

|4k|   64k|4k
 MB/s  | rd seq | rd rand | rd seq | rd rand | wr seq | wr rand
 --++-++-++
 master| 1221   | 1187| 4178   | 4114| 1745   | 1213
 multiread | 1829   | 1189| 4639   | 4110| 1894   | 1216

 Signed-off-by: Peter Lieven 
 ---
   hw/block/dataplane/virtio-blk.c |   8 +-
   hw/block/virtio-blk.c   | 288 
 +++-
   include/hw/virtio/virtio-blk.h  |  19 +--
   trace-events|   1 +
   4 files changed, 210 insertions(+), 106 deletions(-)
 +int64_t sector_num = 0;
 +
 +if (mrb->num_reqs == 1) {
 +virtio_submit_multireq2(blk, mrb, 0, 1, -1);
 +mrb->num_reqs = 0;
   return;
   }
   -ret = blk_aio_multiwrite(blk, mrb->blkreq, mrb->num_writes);
 -if (ret != 0) {
 -for (i = 0; i < mrb->num_writes; i++) {
 -if (mrb->blkreq[i].error) {
 -virtio_blk_rw_complete(mrb->blkreq[i].opaque, -EIO);
 +max_xfer_len = blk_get_max_transfer_length(mrb->reqs[0]->dev->blk);
 +max_xfer_len = MIN_NON_ZERO(max_xfer_len, INT_MAX);
 +
 +qsort(mrb->reqs, mrb->num_reqs, sizeof(*mrb->reqs),
 +  &virtio_multireq_compare);
 +
 +for (i = 0; i < mrb->num_reqs; i++) {
 +VirtIOBlockReq *req = mrb->reqs[i];
 +if (num_reqs > 0) {
 +bool merge = true;
 +
 +/* merge would exceed maximum number of IOVs */
 +if (niov + req->qiov.niov + 1 > IOV_MAX) {
>>> Hm, why the +1?
>> A really good question. I copied this piece from the old merge routine. It 
>> seems
>> definetely wrong.
> The old code merged requests even if they were overlapping. This could
> result in one area being split in two.
>
> I think you don't support this here, so removing the + 1 is probably
> okay.

I don't support it because it was a good source for bugs in the past and I think
a good guest should not create overlapping requests at all.

Peter




Re: [Qemu-devel] PCI iommu issues

2015-01-31 Thread Benjamin Herrenschmidt
On Sat, 2015-01-31 at 15:42 +0100, Knut Omang wrote:

> I agree with you that they are two distinct problems, with perhaps a 
> 3rd problem of how to generalize this to other devices than PCI devices?
> 
> Right now the 
> handling of requester IDs across bridges seems very rudimentary in the
> QEMU model, and as you indicate, it is perhaps not necessary to emulate
> all details of broken switches etc, but maybe we need some kind of
> requester ID translation function (similar to pci_bridge_map_irq()) that
> a bridge can choose to implement, where the default is the unit
> translation?

That could work.

> Wrt. handling solving the original issue of IOMMU support for devices on
> secondary buses (and beyond?) My approach in the patch above is just to
> pass a PCIDevice pointer instead of PCIBus* + devfn, eg:
> 
> -typedef AddressSpace *(*PCIIOMMUFunc)(PCIBus *, void *, int);
> +typedef AddressSpace *(*PCIIOMMUFunc)(PCIDevice *, void *);

This is something I was going to suggest :-)

> For the current Intel Iommu implementation, that seems to work fine as
> no additional logic is need to "propagate" the enumeration, as accurate
> IDs can be found via the device pointer.

Agreed.

>  The current API leaves the task
> of maintaining multiple address spaces for each IOMMU implementation,
> maybe a more explicit notion of an IOMMU group similar to the way it is
> used by VFIO is more close to how a common denominator of hardware
> works? Adding Alex as well, as he might have more insight into the
> details of different IOMMU implementations from his VFIO work.

I would first make it work, let more than one implementation be written,
and only then, see what can be factored. 

As for non-PCI, I wouldn't worry too much yet, ie, I don't see a point
in delaying fixing broken PCI IOMMUs for an hypothetical generic layer
than may or may not work or take a year in the making.

As for Michael feedback, well, if we feel we can hack all the way to the
core DMA functions, I would have gone further and not attached address
spaces to devices permanently like we do ... the address spaces are
really the domains, so to mimmic the HW properly, ideally we would
capture the bus number on config space, and lazily resolve the address
space on each DMA ... We can invalidate the cached pointer when cfg
space is written or when the iommu invalidates its cache, but it's more
work and it could be difficult with vfio...

Also it makes it harder to use a memory region as child of the address
space that we enable/disable to deal with the operations of the bus
master enable bit (but then the way we do that prevents us from
implementing broken devices that ignore that bit :-)

Cheers,
Ben.

> Knut
> 
> > > Cheers,
> > > Ben.
> > > 
> > > 
> > 
> 





Re: [Qemu-devel] [PATCH] vfio-pci: unparent BAR subregions

2015-01-31 Thread Paolo Bonzini


On 31/01/2015 16:10, Alex Williamson wrote:
>> Explicit object_unparent() is only needed if you recreate the memory
>> region during the lifetime of the object.  This is rarely needed, and it
>> is simple to spot if it's needed.  If you do memory_region_init* outside
>> the realize function, most likely you need a matching object_unparent
>> somewhere else in the device logic.
>>
>> This was the idea behind commit d8d95814609e.  It only touched a handful
>> of files because almost no one does memory_region_init* outside the
>> realize function, and in particular VFIO doesn't.  VFIO follows the
>> common convention of only creating regions in realize, and thus does not
>> need object_unparent.
> 
> Thanks Paolo, so if I look more closely at where you added
> object_unparent() calls in d8d95814609e, I can see that they're
> associated with dynamically allocated objects that are freed as part of
> the vfio device exitfn.  vdev->msix is also such a structure and is the
> property causing us the segfaults.

Yeah, this is a different case than the one I mentioned above, and
you're right that this is also not exactly the common convention---so it
is a second case of needing an explicit object_unparent.

> So, I think the second
> object_unparent() call is correct and that the guiding principle is that
> any MemoryRegion associated with a dynamically allocated structure and
> freed as part of the class exit callback needs to be explicitly
> unparented.  Does that sound right?

The pedant in me asks to do the object_unparent in vfio_put_device, just
before freeing vdev->msix.  This way you already abide by RCU's
principle of separating removal and reclamation (and I won't have to
move it in part 3 of the RCU patches, which is the next to be posted;
that part will move the vfio_put_device call to instance_finalize).

Another possible change would be to make vdev->msix statically allocated
(checking vdev->msix.entries != 0 instead of vdev->msix != NULL,
possibly hidden beneath an inline function vfio_has_msix).  Then you'd
be in the exact same case as the other memory regions and wouldn't need
an unparent at all.  But I am not certain myself of the relative beauty
of this, and it is obviously less suitable for qemu-stable, so do go
ahead with the one-liner.

I'll improve the documentation as soon as the code actually follows the
principles that have to be documented.  But now that the ball is rolling
on RCU and multithreading, documentation is indeed getting more and more
important.

Thanks,

Paolo

> Alex
> 
>>> Signed-off-by: Alex Williamson 
>>> Cc: Paolo Bonzini 
>>> Cc: qemu-sta...@nongnu.org
>>> ---
>>>
>>>  hw/vfio/pci.c |2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>> index 014a92c..c71499e 100644
>>> --- a/hw/vfio/pci.c
>>> +++ b/hw/vfio/pci.c
>>> @@ -2294,10 +2294,12 @@ static void vfio_unmap_bar(VFIOPCIDevice *vdev, int 
>>> nr)
>>>  
>>>  memory_region_del_subregion(&bar->region.mem, &bar->region.mmap_mem);
>>>  munmap(bar->region.mmap, memory_region_size(&bar->region.mmap_mem));
>>> +object_unparent(OBJECT(&bar->region.mmap_mem));
>>>  
>>>  if (vdev->msix && vdev->msix->table_bar == nr) {
>>>  memory_region_del_subregion(&bar->region.mem, 
>>> &vdev->msix->mmap_mem);
>>>  munmap(vdev->msix->mmap, 
>>> memory_region_size(&vdev->msix->mmap_mem));
>>> +object_unparent(OBJECT(&vdev->msix->mmap_mem));
>>>  }
>>>  }
>>>  
>>>
>>>
>>>
> 
> 
> 
> 
>