Re: [Qemu-devel] [PATCH v9] spec: add qcow2 bitmaps extension specification

2016-02-03 Thread Fam Zheng
On Tue, 02/02 09:35, Vladimir Sementsov-Ogievskiy wrote:
> The new feature for qcow2: storing bitmaps.
> 
> This patch adds new header extension to qcow2 - Bitmaps Extension. It
> provides an ability to store virtual disk related bitmaps in a qcow2
> image. For now there is only one type of such bitmaps: Dirty Tracking
> Bitmap, which just tracks virtual disk changes from some moment.
> 
> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
> should be stored in this qcow2 file. The size of each bitmap
> (considering its granularity) is equal to virtual disk size.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
> 
> v9
> - rewordings, thanks to Max
> 
> v8
> - rewordings
> - bitmap_directory_size: 4b -> 8b
> - add more descriptive description in == Bitmaps == section
> - add paragraph "Dirty tracking bitmaps"
> 
>   Bitmap directory entry:
> - extra data should not allocate additional clusters
> - padding must be all-bytes-zero
> - add extra_data_compatible flag (now behavior in case of unknown
>   extra data is defined by this flag)
> 
> v7:
> 
> - Rewordings, grammar.
>   Max, Eric, John, thank you very much.
> 
> - add last paragraph: remaining bits in bitmap data clusters must be
>   zero.
> 
> - s/Bitmap Directory/bitmap directory/ and other names like this at
>   the request of Max.
> 
> v6:
> 
> - reword bitmap_directory_size description
> - bitmap type: make 0 reserved
> - extra_data_size: resize to 4bytes
>   Also, I've marked this field as "must be zero". We can always change
>   it, if we decide allowing managing app to specify any extra data, by
>   defining some magic value as a top of user extra data.. So, for now
>   non zeor extra_data_size should be considered as an error.
> - swap name and extra_data to give good alignment to extra_data.
> 
> 
> v5:
> 
> - 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
>   bitmaps.
> - rewordings
> - move upper bounds to "Notes about Qemu limits"
> - s/should/must somewhere. (but not everywhere)
> - move name_size field closer to name itself in bitmap header
> - add extra data area to bitmap header
> - move bitmap data description to separate section
> 
> 
>  docs/specs/qcow2.txt | 223 
> ++-
>  1 file changed, 222 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index f236d8c..db5e666 100644
> --- a/docs/specs/qcow2.txt
> +++ b/docs/specs/qcow2.txt
> @@ -103,7 +103,18 @@ in the description of a field.
>  write to an image with unknown auto-clear features if it
>  clears the respective bits from this field first.
>  
> -Bits 0-63:  Reserved (set to 0)
> +Bit 0:  Bitmaps extension bit
> +This bit indicates consistency for the 
> bitmaps
> +extension data.
> +
> +It is an error if this bit is set without the
> +bitmaps extension present.
> +
> +If the bitmaps extension is present but this
> +bit is unset, the bitmaps extension data 
> must be
> +considered inconsistent.
> +
> +Bits 1-63:  Reserved (set to 0)
>  
>   96 -  99:  refcount_order
>  Describes the width of a reference count block entry 
> (width
> @@ -123,6 +134,7 @@ be stored. Each extension has a structure like the 
> following:
>  0x - End of the header extension area
>  0xE2792ACA - Backing file format name
>  0x6803f857 - Feature name table
> +0x23852875 - Bitmaps extension
>  other  - Unknown header extension, can be safely
>   ignored
>  
> @@ -166,6 +178,36 @@ the header extension data. Each entry look like this:
>  terminated if it has full length)
>  
>  
> +== Bitmaps extension ==
> +
> +The bitmaps extension is an optional header extension. It provides the 
> ability
> +to store bitmaps related to a virtual disk. For now, there is only one bitmap
> +type: the dirty tracking bitmap, which tracks virtual disk changes from some
> +point in time.
> +
> +The data of the extension should be considered consistent only if the
> +corresponding auto-clear feature bit is set, see autoclear_features above.
> +
> +The fields of the bitmaps extension are:
> +
> +Byte  0 -  3:  nb_bitmaps
> +   The number of bitmaps contained in the image. Must be
> +   greater than or equal to 1.
> +
> +   Note: Qemu currently only supports up to 65535 bitmaps per
> +   image.
> +
> +  4 -  7:  Reserved, must be zero.
> +
> +  

[Qemu-devel] RE: [iGVT-g] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)

2016-02-03 Thread Tian, Kevin
> From: Zhiyuan Lv
> Sent: Tuesday, February 02, 2016 3:35 PM
> 
> Hi Gerd/Alex,
> 
> On Mon, Feb 01, 2016 at 02:44:55PM -0700, Alex Williamson wrote:
> > On Mon, 2016-02-01 at 14:10 +0100, Gerd Hoffmann wrote:
> > >   Hi,
> > >
> > > > > Unfortunately it's not the only one. Another example is, device-model
> > > > > may want to write-protect a gfn (RAM). In case that this request goes
> > > > > to VFIO .. how it is supposed to reach KVM MMU?
> > > >
> > > > Well, let's work through the problem.  How is the GFN related to the
> > > > device?  Is this some sort of page table for device mappings with a base
> > > > register in the vgpu hardware?
> > >
> > > IIRC this is needed to make sure the guest can't bypass execbuffer
> > > verification and works like this:
> > >
> > >   (1) guest submits execbuffer.
> > >   (2) host makes execbuffer readonly for the guest
> > >   (3) verify the buffer (make sure it only accesses resources owned by
> > >   the vm).
> > >   (4) pass on execbuffer to the hardware.
> > >   (5) when the gpu is done with it make the execbuffer writable again.
> >
> > Ok, so are there opportunities to do those page protections outside of
> > KVM?  We should be able to get the vma for the buffer, can we do
> > something with that to make it read-only.  Alternatively can the vgpu
> > driver copy it to a private buffer and hardware can execute from that?
> > I'm not a virtual memory expert, but it doesn't seem like an
> > insurmountable problem.  Thanks,
> 
> Originally iGVT-g used write-protection for privilege execbuffers, as Gerd
> described. Now the latest implementation has removed wp to do buffer copy
> instead, since the privilege command buffers are usually small. So that part
> is fine.
> 
> But we need write-protection for graphics page table shadowing as well. Once
> guest driver modifies gpu page table, we need to know that and manipulate
> shadow page table accordingly. buffer copy cannot help here. Thanks!
> 

After walking through the whole thread again, let me do a summary here
so everyone can be on the same page. 

First, Jike told me before his vacation, that we cannot do any change to 
KVM module according to community comments. Now I think it's not true. 
We can do necessary changes, as long as it is done in a structural/layered 
approach, w/o hard assumption on KVMGT as the only user. That's the 
guideline we need to obey. :-)

Mostly we care about two aspects regarding to a vgpu driver:
  - services/callbacks which vgpu driver provides to external framework
(e.g. vgpu core driver and VFIO);
  - services/callbacks which vgpu driver relies on for proper emulation
(e.g. from VFIO and/or hypervisor);

The former is being discussed in another thread. So here let's focus
on the latter.

In general Intel GVT-g requires below services for emulation:

1) Selectively pass-through a region to a VM
--
This can be supported by today's VFIO framework, by setting
VFIO_REGION_INFO_FLAG_MMAP for concerned regions. Then Qemu
will mmap that region which will finally be added to the EPT table of
the target VM

2) Trap-and-emulate a region
--
Similarly, this can be easily achieved by clearing MMAP flag for concerned
regions. Then every access from VM will go through Qemu and then VFIO
and finally reach vgpu driver. The only concern is in the performance
part. We need some general mechanism to allow delivering I/O emulation
request directly from KVM in kernel. For example, Alex mentioned some
flavor based on file descriptor + offset. Likely let's move forward with
the default Qemu forwarding, while brainstorming exit-less delivery in parallel.

3) Inject a virtual interrupt
--
We can leverage existing VFIO IRQ injection interface, including configuration
and irqfd interface.

4) Map/unmap guest memory
--
It's there for KVM.

5) Pin/unpin guest memory
--
IGD or any PCI passthru should have same requirement. So we should be
able to leverage existing code in VFIO. The only tricky thing (Jike may
elaborate after he is back), is that KVMGT requires to pin EPT entry too,
which requires some further change in KVM side. But I'm not sure whether
it still holds true after some design changes made in this thread. So I'll
leave to Jike to further comment.

6) Write-protect a guest memory page
--
The primary purpose is for GPU page table shadowing. We need to track
modifications on guest GPU page table, so shadow part can be synchronized
accordingly. Just think about CPU page table shadowing. And old example
as Zhiyuan pointed out, is to write-protect guest cmd buffer. But it becomes
not necessary now.

So we need KVM to provide an interface so some agents can request such
write-protection action (not just for KVMGT. could be for other tracking 
usages). Guangrong has been working on a general page tracking mechanism,
upon which write-protection can be easily built on. The review is still in 
progress.

7) GPA->IOVA/HVA translation
--
It's required in various places, e.g.:
- read a guest structure accord

Re: [Qemu-devel] [PATCH v2 0/6] external backup api

2016-02-03 Thread Fam Zheng
On Sat, 01/30 13:56, Vladimir Sementsov-Ogievskiy wrote:
> Hi all.
> 
> These series which aims to add external backup api. This is needed to allow
> backup software use our dirty bitmaps.
> 
> Vmware and Parallels Cloud Server have this feature.

What is the advantage of this appraoch over "drive-backup sync=incremental
..."?

> 
> There are three things are done:
> - add query-block-dirty-bitmap-ranges qmp command
> - add qmp commands for dirty-bitmap functions: create_successor, abdicate,
>   reclaim.
> - make create-successor command transaction-able
> 
> Then, external backup should be done like this:
> 
> 1.  qmp transaction {
> external-snapshot
> bitmap-create-successor
> }
> 
> 2.  qmp query frozen bitmap, not acquiring aio context.

What do you do with the query response, read the snapshot image with an
external tool?

Fam

> 
> 3.  do external backup, using snapshot and bitmap
> 
> 
> 4.  if (success backup)
> qmp bitmap-abdicate
> else
> qmp bitmap-reclaime
> 
> 5.  qmp merge snapshot
> 
> 
> v2: a lot of additions and changes, no sense to compare with v1
> 
> 
> Vladimir Sementsov-Ogievskiy (6):
>   block dirty bitmap: add next_zero function
>   qmp: add query-block-dirty-bitmap-ranges
>   iotests: test query-block-dirty-bitmap-ranges
>   qapi: add qmp commands for some dirty bitmap functions
>   qapi: make block-dirty-bitmap-create-successor transaction-able
>   iotests: test external backup api
> 
>  block/dirty-bitmap.c |  60 ++
>  blockdev.c   | 113 +
>  include/block/dirty-bitmap.h |   9 
>  include/qemu/hbitmap.h   |   8 +++
>  qapi-schema.json |   4 +-
>  qapi/block-core.json |  90 +
>  qmp-commands.hx  | 118 
> +++
>  tests/qemu-iotests/150   |  88 
>  tests/qemu-iotests/150.out   |  21 
>  tests/qemu-iotests/151   |  77 
>  tests/qemu-iotests/151.out   |   7 +++
>  tests/qemu-iotests/group |   2 +
>  util/hbitmap.c   |  26 ++
>  13 files changed, 622 insertions(+), 1 deletion(-)
>  create mode 100755 tests/qemu-iotests/150
>  create mode 100644 tests/qemu-iotests/150.out
>  create mode 100755 tests/qemu-iotests/151
>  create mode 100644 tests/qemu-iotests/151.out
> 
> -- 
> 1.8.3.1
> 



Re: [Qemu-devel] [PATCH v9 01/16] block: Release named dirty bitmaps in bdrv_close()

2016-02-03 Thread Fam Zheng
On Fri, 01/29 16:36, Max Reitz wrote:
> bdrv_delete() is not very happy about deleting BlockDriverStates with
> dirty bitmaps still attached to them. In the past, we got around that
> very easily by relying on bdrv_close_all() bypassing bdrv_delete(), and
> bdrv_close() simply ignoring that condition. We should fix that by
> releasing all named dirty bitmaps in bdrv_close() (there should not be
> any unnamed bitmaps left) and moving the assertion from bdrv_delete()
> there.
> 
> Signed-off-by: Max Reitz 
> ---
>  block.c | 39 +++
>  1 file changed, 31 insertions(+), 8 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 5709d3d..41ab00e 100644
> --- a/block.c
> +++ b/block.c
> @@ -88,6 +88,8 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const 
> char *filename,
>   const BdrvChildRole *child_role, Error **errp);
>  
>  static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
> +static void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs);
> +
>  /* If non-zero, use only whitelisted block drivers */
>  static int use_bdrv_whitelist;
>  
> @@ -2157,6 +2159,9 @@ void bdrv_close(BlockDriverState *bs)
>  
>  notifier_list_notify(&bs->close_notifiers, bs);
>  
> +bdrv_release_named_dirty_bitmaps(bs);
> +assert(QLIST_EMPTY(&bs->dirty_bitmaps));
> +
>  if (bs->blk) {
>  blk_dev_change_media_cb(bs->blk, false);
>  }
> @@ -2366,7 +2371,6 @@ static void bdrv_delete(BlockDriverState *bs)
>  assert(!bs->job);
>  assert(bdrv_op_blocker_is_empty(bs));
>  assert(!bs->refcnt);
> -assert(QLIST_EMPTY(&bs->dirty_bitmaps));
>  
>  bdrv_close(bs);
>  
> @@ -3582,21 +3586,40 @@ static void 
> bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
>  }
>  }
>  
> -void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
> +static void bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
> +  BdrvDirtyBitmap *bitmap,
> +  bool only_named)
>  {
>  BdrvDirtyBitmap *bm, *next;
>  QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
> -if (bm == bitmap) {
> +if ((!bitmap || bm == bitmap) && (!only_named || bm->name)) {
>  assert(!bdrv_dirty_bitmap_frozen(bm));
> -QLIST_REMOVE(bitmap, list);
> -hbitmap_free(bitmap->bitmap);
> -g_free(bitmap->name);
> -g_free(bitmap);
> -return;
> +QLIST_REMOVE(bm, list);
> +hbitmap_free(bm->bitmap);
> +g_free(bm->name);
> +g_free(bm);
> +
> +if (bitmap) {
> +return;
> +}
>  }
>  }
>  }
>  
> +void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
> +{
> +bdrv_do_release_matching_dirty_bitmap(bs, bitmap, false);
> +}
> +
> +/**
> + * Release all named dirty bitmaps attached to a BDS (for use in 
> bdrv_close()).
> + * There must not be any frozen bitmaps attached.
> + */
> +static void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs)
> +{
> +bdrv_do_release_matching_dirty_bitmap(bs, NULL, true);
> +}
> +
>  void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
>  {
>  assert(!bdrv_dirty_bitmap_frozen(bitmap));
> -- 
> 2.7.0
> 

Reviewed-by: Fam Zheng 



Re: [Qemu-devel] [RFC PATCH v1 1/1] vGPU core driver : to provide common interface for vGPU.

2016-02-03 Thread Gerd Hoffmann
  Hi,

> Actually I have a long puzzle in this area. Definitely libvirt will use UUID 
> to
> mark a VM. And obviously UUID is not recorded within KVM. Then how does
> libvirt talk to KVM based on UUID? It could be a good reference to this 
> design.

libvirt keeps track which qemu instance belongs to which vm.
qemu also gets started with "-uuid ...", so one can query qemu via
monitor ("info uuid") to figure what the uuid is.  It is also in the
smbios tables so the guest can see it in the system information table.

The uuid is not visible to the kernel though, the kvm kernel driver
doesn't know what the uuid is (and neither does vfio).  qemu uses file
handles to talk to both kvm and vfio.  qemu notifies both kvm and vfio
about anything relevant events (guest address space changes etc) and
connects file descriptors (eventfd -> irqfd).

qemu needs a sysfs node as handle to the vfio device, something
like /sys/devices/virtual/vgpu/.   can be a uuid if you want
have it that way, but it could be pretty much anything.  The sysfs node
will probably show up as-is in the libvirt xml when assign a vgpu to a
vm.  So the name should be something stable (i.e. when using a uuid as
name you should better not generate a new one on each boot).

cheers,
  Gerd




Re: [Qemu-devel] [PATCH v9 05/16] virtio-scsi: Catch BDS-BB removal/insertion

2016-02-03 Thread Fam Zheng
On Fri, 01/29 16:36, Max Reitz wrote:
> Make use of the BDS-BB removal and insertion notifiers to remove or set
> up, respectively, virtio-scsi's op blockers.
> 
> Signed-off-by: Max Reitz 

Reviewed-by: Fam Zheng 



Re: [Qemu-devel] RE: [iGVT-g] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)

2016-02-03 Thread Neo Jia
On Wed, Feb 03, 2016 at 08:04:16AM +, Tian, Kevin wrote:
> > From: Zhiyuan Lv
> > Sent: Tuesday, February 02, 2016 3:35 PM
> > 
> > Hi Gerd/Alex,
> > 
> > On Mon, Feb 01, 2016 at 02:44:55PM -0700, Alex Williamson wrote:
> > > On Mon, 2016-02-01 at 14:10 +0100, Gerd Hoffmann wrote:
> > > >   Hi,
> > > >
> > > > > > Unfortunately it's not the only one. Another example is, 
> > > > > > device-model
> > > > > > may want to write-protect a gfn (RAM). In case that this request 
> > > > > > goes
> > > > > > to VFIO .. how it is supposed to reach KVM MMU?
> > > > >
> > > > > Well, let's work through the problem.  How is the GFN related to the
> > > > > device?  Is this some sort of page table for device mappings with a 
> > > > > base
> > > > > register in the vgpu hardware?
> > > >
> > > > IIRC this is needed to make sure the guest can't bypass execbuffer
> > > > verification and works like this:
> > > >
> > > >   (1) guest submits execbuffer.
> > > >   (2) host makes execbuffer readonly for the guest
> > > >   (3) verify the buffer (make sure it only accesses resources owned by
> > > >   the vm).
> > > >   (4) pass on execbuffer to the hardware.
> > > >   (5) when the gpu is done with it make the execbuffer writable again.
> > >
> > > Ok, so are there opportunities to do those page protections outside of
> > > KVM?  We should be able to get the vma for the buffer, can we do
> > > something with that to make it read-only.  Alternatively can the vgpu
> > > driver copy it to a private buffer and hardware can execute from that?
> > > I'm not a virtual memory expert, but it doesn't seem like an
> > > insurmountable problem.  Thanks,
> > 
> > Originally iGVT-g used write-protection for privilege execbuffers, as Gerd
> > described. Now the latest implementation has removed wp to do buffer copy
> > instead, since the privilege command buffers are usually small. So that part
> > is fine.
> > 
> > But we need write-protection for graphics page table shadowing as well. Once
> > guest driver modifies gpu page table, we need to know that and manipulate
> > shadow page table accordingly. buffer copy cannot help here. Thanks!
> > 
> 
> 
> 4) Map/unmap guest memory
> --
> It's there for KVM.
> 
> 5) Pin/unpin guest memory
> --
> IGD or any PCI passthru should have same requirement. So we should be
> able to leverage existing code in VFIO. The only tricky thing (Jike may
> elaborate after he is back), is that KVMGT requires to pin EPT entry too,
> which requires some further change in KVM side. But I'm not sure whether
> it still holds true after some design changes made in this thread. So I'll
> leave to Jike to further comment.
> 

Hi Kevin,

I think you should be able to map and pin guest memory via the IOMMU API, not
KVM.

> Well, then I realize pretty much opens have been covered with a solution
> when ending this write-up. Then we should move forward to come up a
> prototype upon which we can then identify anything missing or overlooked
> (definitely there would be), and also discuss several remaining opens atop
>  (such as exit-less emulation, pin/unpin, etc.). Another thing we need
> to think is whether this new design is still compatible to Xen side.
> 
> Thanks a lot all for the great discussion (especially Alex with many good
> inputs)! I believe it becomes much clearer now than 2 weeks ago, about 
> how to integrate KVMGT with VFIO. :-)
> 

It is great to see you guys are onboard with VFIO solution! As Kirti has
mentioned in other threads, let's review the current registration APIs and
figure out what we need to add for both solutions.

Thanks,
Neo

> Thanks
> Kevin
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Qemu-devel] [PATCH v2] scripts/kvm/kvm_stat: Fix tracefs access checking

2016-02-03 Thread Janosch Frank
On kernels build without CONFIG_TRACING kvm_stat will bail out even
when traces are not used. This is not very helpful, especially if the
user can't install a new kernel. Instead, we should warn the user and
fall back to debugfs statistics.

These changes check if trace statistics were selected without kernel
support and warn the user. If they were not selected explicitly, we
wait five seconds to let the user read the warning and then set the
debugfs statistics to True and the tracefs ones to False. Otherwise we
exit.

Fixes: 7aa4ee5 ('scripts/kvm/kvm_stat: Improve debugfs access checking')
Signed-off-by: Janosch Frank 
---
 v1 to v2:
Exit if -t is set explicitly

 scripts/kvm/kvm_stat | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/scripts/kvm/kvm_stat b/scripts/kvm/kvm_stat
index d43e8f3..4756eee 100755
--- a/scripts/kvm/kvm_stat
+++ b/scripts/kvm/kvm_stat
@@ -22,6 +22,7 @@ import resource
 import struct
 import re
 from collections import defaultdict
+from time import sleep
 
 VMX_EXIT_REASONS = {
 'EXCEPTION_NMI':0,
@@ -778,7 +779,7 @@ def get_providers(options):
 
 return providers
 
-def check_access():
+def check_access(options):
 if not os.path.exists('/sys/kernel/debug'):
 sys.stderr.write('Please enable CONFIG_DEBUG_FS in your kernel.')
 sys.exit(1)
@@ -790,14 +791,26 @@ def check_access():
  "Also ensure, that the kvm modules are loaded.\n")
 sys.exit(1)
 
-if not os.path.exists(PATH_DEBUGFS_TRACING):
-sys.stderr.write("Please make {0} readable by the current user.\n"
+if not os.path.exists(PATH_DEBUGFS_TRACING) and (options.tracepoints
+ or not options.debugfs):
+sys.stderr.write("Please enable CONFIG_TRACING in your kernel "
+ "when using the option -t (default).\n"
+ "If it is enabled, make {0} readable by the "
+ "current user.\n"
  .format(PATH_DEBUGFS_TRACING))
-sys.exit(1)
+
+if options.tracepoints:
+sys.exit(1)
+
+sys.stderr.write("Falling back to debugfs statistics!\n")
+options.debugfs = True
+sleep(5)
+
+return options
 
 def main():
-check_access()
 options = get_options()
+options = check_access(options)
 providers = get_providers(options)
 stats = Stats(providers, fields=options.fields)
 
-- 
2.3.0




Re: [Qemu-devel] [PATCH v4] Add optionrom compatible with fw_cfg DMA version

2016-02-03 Thread Gerd Hoffmann
  Hi,

> I think the "dma_enabled" property is not exposed to the user.

It is: "-global fw_cfg.dma_enabled=off" works (as in: doesn't throw an
error).  Has no effect through as it gets overridden later on.

> The default value of "dma_enabled" in both fw_cfg_io_properties and
> fw_cfg_mem_properties is irrelevant; the actual property value is always
> overwritten in fw_cfg_init_io_dma() and fw_cfg_init_mem_wide(), which
> all of the init paths go through.

And IMHO we should not do that, so setting the property actually has an
effect.

> I agree that DMA capability should be filtered with machine type.
> However, that distinction should not be made using the current
> "dma_enabled" properties (i.e., of "fw_cfg_io_properties" and
> "fw_cfg_mem_properties". Instead, it should be made in the
> board-specific callers of fw_cfg_init_(io_dma|mem_wide).

Why?

cheers,
  Gerd




Re: [Qemu-devel] [RFC PATCH v2 7/7] vfio/pci: Find and expose Intel IGD OpRegion

2016-02-03 Thread Gerd Hoffmann
On Di, 2016-02-02 at 13:09 -0700, Alex Williamson wrote:
> This is provided via a device specific region, look for it on Intel
> VGA class devices.  Our default mechanism to expose this to the BIOS
> is via fw_cfg where it's expected that the BIOS will copy the data
> into a reserved RAM area and update the ASL Storage register to
> reference the GPA of that buffer.

>   We also support directly mapping
> the OpRegion through to the host in response to the ASL Storage
> register write, which makes the data "live" and potentially provides
> write access should the underlying vfio region support writes.

This should better be splitted into a separate patch.

> This
> option is automatically enabled if we somehow don't support fw_cfg (is
> this a good idea?).

I think this can't happen.  And even in case it can: we have bigger
problems than the opregion then.

cheers,
  Gerd




Re: [Qemu-devel] [PATCH v14 7/8] Implement new driver for block replication

2016-02-03 Thread Stefan Hajnoczi
On Wed, Feb 03, 2016 at 09:29:15AM +0800, Wen Congyang wrote:
> On 02/02/2016 10:34 PM, Stefan Hajnoczi wrote:
> > On Mon, Feb 01, 2016 at 09:13:36AM +0800, Wen Congyang wrote:
> >> On 01/29/2016 11:46 PM, Stefan Hajnoczi wrote:
> >>> On Fri, Jan 29, 2016 at 11:13:42AM +0800, Changlong Xie wrote:
>  On 01/28/2016 11:15 PM, Stefan Hajnoczi wrote:
> > On Thu, Jan 28, 2016 at 09:13:24AM +0800, Wen Congyang wrote:
> >> On 01/27/2016 10:46 PM, Stefan Hajnoczi wrote:
> >>> On Wed, Jan 13, 2016 at 05:18:31PM +0800, Changlong Xie wrote:
> > I'm concerned that the bdrv_drain_all() in vm_stop() can take a long
> > time if the disk is slow/failing.  bdrv_drain_all() blocks until all
> > in-flight I/O requests have completed.  What does the Primary do if the
> > Secondary becomes unresponsive?
> 
>  Actually, we knew this problem. But currently, there seems no better way 
>  to
>  resolve it. If you have any ideas?
> >>>
> >>> Is it possible to hold the checkpoint information and acknowledge the
> >>> checkpoint right away, without waiting for bdrv_drain_all() or any
> >>> Secondory guest activity to complete?
> >>
> >> There is no way to know that secondary becomes unreponsive.
> > 
> > I meant whether it is necessary for the Secondary to vm_stop() and apply
> > the checkpoint before acknowledging the checkpoint to the Primary?
> 
> I don't understand this.
> Here is the COLO checkpoint flow:
> 
> PrimarySecondary
> new checkpoint notice --->
> vm_stop()  vm_stop()
> vm state(device state, memory, cpu)   --->
>load state
>   <--- done
> vm_start() vm_start()

If the Secondary's vm_stop() call blocks then the Primary is stuck too.

I was wondering whether the Secondary can do:

<---  done
  vm_stop()
  load state

It simply receives the checkpoint data into a buffer and immediately
replies with "done".  vm_stop() and load state is only performed after
sending "done".

The advantage is that the Primary will not be delayed by the Secondary.
It's an approach that doesn't block.

But perhaps it's a problem if the Secondary is slower than the Primary
since the Secondary still needs to complete vm_stop() and load state
before it can resume execution?

> >>> I think this really means falling back to microcheckpointing until the
> >>> Secondary guest can checkpoint.  Instead of a blocking vm_stop() we
> >>> would prevent vcpus from running and when the last pending I/O finishes
> >>> the Secondary could apply the last checkpoint.  This approach does not
> >>> block QEMU (the monitor, etc).
> >>>
> >>
> >> If secondary host becomes unresponsive, it means that we cannot do 
> >> mocrocheckpointing.
> >> We should do failover in this case.
> > 
> > This is dangerous because it means that a delay/failure in the Secondary
> > would cause the Primary to fail over to the broken Secondary.  All the
> > more reason not to perform blocking operations on the Secondary in the
> > checkpoint code path.
> 
> If the secondary is broken, primary qemu will take over.

Does the Primary use a timeout between "new checkpoint notice" and
Secondary's "done" so it can move on if the Secondary is unresponsive?

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 05/16] iothread: release AioContext around aio_poll

2016-02-03 Thread Stefan Hajnoczi
On Tue, Feb 02, 2016 at 04:01:55PM +0100, Paolo Bonzini wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
> 
> 
> 
> On 02/02/2016 15:52, Stefan Hajnoczi wrote:
> >>> @@ -110,6 +111,8 @@ static void *test_acquire_thread(void
> >>> *opaque) qemu_mutex_lock(&data->start_lock); 
> >>> qemu_mutex_unlock(&data->start_lock);
> >>> 
> >>> +g_usleep(50);
> > Sleep?
> 
> What about it? :)

Sleep in a loop is inefficient but at least correct.

A sleep outside a loop is a race condition.  If the machine is heavily
loaded, whatever you are waiting for may not happen in 500 ms and the
test case is unreliable.

What is the purpose of this sleep?

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] Fix inconsistency between comment and variable name

2016-02-03 Thread Markus Armbruster
Michael Tokarev  writes:

> 03.02.2016 06:19, Cao jin wrote:
>> Signed-off-by: Cao jin 
>> ---
>>  include/hw/qdev-core.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
>> index abcdee8..42fa5db 100644
>> --- a/include/hw/qdev-core.h
>> +++ b/include/hw/qdev-core.h
>> @@ -221,7 +221,7 @@ typedef struct BusChild {
>>  
>>  /**
>>   * BusState:
>> - * @hotplug_device: link to a hotplug device associated with bus.
>> + * @hotplug_handler: link to a hotplug device associated with bus.
>
> Hmm.  Now while the field name in comment and in the structure
> do match, the comment is still wrong, since it is a linke to
> a handler, not a device… :)

Do you mean to suggest the line should be changed to

 * @hotplug_device: link to a hotplug handler associated with bus.

?



[Qemu-devel] [PATCH 2/2] m68k: Build the opcode table only once to avoid multithreading issues

2016-02-03 Thread John Paul Adrian Glaubitz
Signed-off-by: John Paul Adrian Glaubitz 
---
 target-m68k/translate.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 535d7f9..f508d1e 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2828,6 +2828,10 @@ register_opcode (disas_proc proc, uint16_t opcode, 
uint16_t mask)
Later insn override earlier ones.  */
 void register_m68k_insns (CPUM68KState *env)
 {
+/* Build the opcode table only once to avoid
+   issues with multithreading. */
+if(opcode_table[0] != NULL)
+   return;
 #define INSN(name, opcode, mask, feature) do { \
 if (m68k_feature(env, M68K_FEATURE_##feature)) \
 register_opcode(disas_##name, 0x##opcode, 0x##mask); \
-- 
2.7.0




[Qemu-devel] [PATCH 1/2] m68k: Fix opcode mask for fbcc instruction

2016-02-03 Thread John Paul Adrian Glaubitz
Signed-off-by: John Paul Adrian Glaubitz 
---
 target-m68k/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 342c040..535d7f9 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2940,7 +2940,7 @@ void register_m68k_insns (CPUM68KState *env)
 INSN(shift_reg, e0a0, f0f0, CF_ISA_A);
 INSN(undef_fpu, f000, f000, CF_ISA_A);
 INSN(fpu,   f200, ffc0, CF_FPU);
-INSN(fbcc,  f280, ffc0, CF_FPU);
+INSN(fbcc,  f280, ff80, CF_FPU);
 INSN(frestore,  f340, ffc0, CF_FPU);
 INSN(fsave, f340, ffc0, CF_FPU);
 INSN(intouch,   f340, ffc0, CF_ISA_A);
-- 
2.7.0




[Qemu-devel] m68k: More bug fixes for translation code

2016-02-03 Thread John Paul Adrian Glaubitz
Hi Laurent!

As promised, here are the fixes for the two recently discovered
bugs in the m68k translation code.

The first patch fixes the opcode mask for the fbcc instruction which
is currently incorrect as it masks the 6th bit as constant (0xffc0).
However, according to the ColdFire reference manual, this bit is
used to determine the size of the displacement for the jump, either
16 or 32 bits:

> http://www.nxp.com/files/dsp/doc/ref_manual/CFPRM.pdf (p. 229)

Looking at DISAS_INSN(fbcc), the emulated instruction actually tests
for the 6th bit and sets the offset accordingly. However, since the
current opcode mask ignores this bit, long jumps can never work. In
fact, what we actually see is an illegal instruction: 0xf2e0.

Changing the opcode mask to 0xff80 makes the 6th bit variable and
allows long jumps to work as expected.

The second patch addresses a problem with the thread safety of
register_m68k_insns(). It turns out, that the opcode table is
rebuild for every thread that is started which means that in
a multithreaded environment, one thread can destroy the opcode
table of a concurrent thread which makes this thread crash
with an illegal instruction.

This patch changes register_m68k_insns() such that it returns
without doing anything in case the opcode table has already been
built and re-registering the instructions is therefore not necessary
but rather harmful.

Credits go to Michael Karcher for helping to debug these issues!

Cheers,
Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913




Re: [Qemu-devel] [PATCH 1/2] m68k: Fix opcode mask for fbcc instruction

2016-02-03 Thread Laurent Vivier


Le 03/02/2016 10:37, John Paul Adrian Glaubitz a écrit :
> Signed-off-by: John Paul Adrian Glaubitz 
> ---
>  target-m68k/translate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target-m68k/translate.c b/target-m68k/translate.c
> index 342c040..535d7f9 100644
> --- a/target-m68k/translate.c
> +++ b/target-m68k/translate.c
> @@ -2940,7 +2940,7 @@ void register_m68k_insns (CPUM68KState *env)
>  INSN(shift_reg, e0a0, f0f0, CF_ISA_A);
>  INSN(undef_fpu, f000, f000, CF_ISA_A);
>  INSN(fpu,   f200, ffc0, CF_FPU);
> -INSN(fbcc,  f280, ffc0, CF_FPU);
> +INSN(fbcc,  f280, ff80, CF_FPU);
>  INSN(frestore,  f340, ffc0, CF_FPU);
>  INSN(fsave, f340, ffc0, CF_FPU);
>  INSN(intouch,   f340, ffc0, CF_ISA_A);
> 

Reviewed-by: Laurent Vivier 



Re: [Qemu-devel] [PATCH 2/2] m68k: Build the opcode table only once to avoid multithreading issues

2016-02-03 Thread Laurent Vivier


Le 03/02/2016 10:37, John Paul Adrian Glaubitz a écrit :
> Signed-off-by: John Paul Adrian Glaubitz 
> ---
>  target-m68k/translate.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/target-m68k/translate.c b/target-m68k/translate.c
> index 535d7f9..f508d1e 100644
> --- a/target-m68k/translate.c
> +++ b/target-m68k/translate.c
> @@ -2828,6 +2828,10 @@ register_opcode (disas_proc proc, uint16_t opcode, 
> uint16_t mask)
> Later insn override earlier ones.  */
>  void register_m68k_insns (CPUM68KState *env)
>  {
> +/* Build the opcode table only once to avoid
> +   issues with multithreading. */
> +if(opcode_table[0] != NULL)
> +   return;
>  #define INSN(name, opcode, mask, feature) do { \
>  if (m68k_feature(env, M68K_FEATURE_##feature)) \
>  register_opcode(disas_##name, 0x##opcode, 0x##mask); \
> 
Reviewed-by: Laurent Vivier 



Re: [Qemu-devel] [PATCH 1/2] m68k: Fix opcode mask for fbcc instruction

2016-02-03 Thread John Paul Adrian Glaubitz
Strange. There should be a cover letter coming along as well which
explains my changes. Did you get it?

On 02/03/2016 10:37 AM, John Paul Adrian Glaubitz wrote:
> Signed-off-by: John Paul Adrian Glaubitz 
> ---
>  target-m68k/translate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target-m68k/translate.c b/target-m68k/translate.c
> index 342c040..535d7f9 100644
> --- a/target-m68k/translate.c
> +++ b/target-m68k/translate.c
> @@ -2940,7 +2940,7 @@ void register_m68k_insns (CPUM68KState *env)
>  INSN(shift_reg, e0a0, f0f0, CF_ISA_A);
>  INSN(undef_fpu, f000, f000, CF_ISA_A);
>  INSN(fpu,   f200, ffc0, CF_FPU);
> -INSN(fbcc,  f280, ffc0, CF_FPU);
> +INSN(fbcc,  f280, ff80, CF_FPU);
>  INSN(frestore,  f340, ffc0, CF_FPU);
>  INSN(fsave, f340, ffc0, CF_FPU);
>  INSN(intouch,   f340, ffc0, CF_ISA_A);
> 


-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: [Qemu-devel] [PATCH 1/2] m68k: Fix opcode mask for fbcc instruction

2016-02-03 Thread Laurent Vivier


Le 03/02/2016 10:40, John Paul Adrian Glaubitz a écrit :
> Strange. There should be a cover letter coming along as well which
> explains my changes. Did you get it?

We have the cover letter, but it is never sent to the sender :)

Laurent
> 
> On 02/03/2016 10:37 AM, John Paul Adrian Glaubitz wrote:
>> Signed-off-by: John Paul Adrian Glaubitz 
>> ---
>>  target-m68k/translate.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/target-m68k/translate.c b/target-m68k/translate.c
>> index 342c040..535d7f9 100644
>> --- a/target-m68k/translate.c
>> +++ b/target-m68k/translate.c
>> @@ -2940,7 +2940,7 @@ void register_m68k_insns (CPUM68KState *env)
>>  INSN(shift_reg, e0a0, f0f0, CF_ISA_A);
>>  INSN(undef_fpu, f000, f000, CF_ISA_A);
>>  INSN(fpu,   f200, ffc0, CF_FPU);
>> -INSN(fbcc,  f280, ffc0, CF_FPU);
>> +INSN(fbcc,  f280, ff80, CF_FPU);
>>  INSN(frestore,  f340, ffc0, CF_FPU);
>>  INSN(fsave, f340, ffc0, CF_FPU);
>>  INSN(intouch,   f340, ffc0, CF_ISA_A);
>>
> 
> 



Re: [Qemu-devel] [PATCH v4] Add optionrom compatible with fw_cfg DMA version

2016-02-03 Thread Stefan Hajnoczi
On Tue, Feb 02, 2016 at 01:58:14PM +0100, Marc Marí wrote:
> El Tue, 02 Feb 2016 12:06:27 +0100
> Gerd Hoffmann  escribió:
> >   Hi,
> > 
> > >  %.img: %.o
> > > - $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 -e
> > > _start -s -o $@ $<,"  Building $(TARGET_DIR)$@")
> > > + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386
> > > -Ttext 0 -e _start -s -o $@ $<,"  Building $(TARGET_DIR)$@")
> > 
> > Hmm, that breaks the windows cross build:
> > 
> > make: Entering directory `/home/kraxel/projects/qemu/build-win32'
> >   Building optionrom/linuxboot_dma.img
> > i686-w64-mingw32-ld: unrecognised emulation mode: elf_i386
> > Supported emulations: i386pe
> > make[1]: *** [linuxboot_dma.img] Error 1
> 
> Thanks for reporting.
> 
> I don't know much about Windows cross-builds. Any idea on how to solve
> the issue?

The Windows toolchain doesn't use ELF binaries so -m elf_i386 doesn't
work there (the emulation is called "i386pe" in i686-w64-mingw32-ld).

I wonder whether it's possible to use gcc -m32 ... -o $@ %< as a wrapper
that automatically does the right thing.  But I guess it won't work
since gcc wants a C source file and not an object file as input.

You could make the emulation ("elf_i386" vs "i386pe") conditional on the
host platform (CONFIG_WIN32=y).

I'm not sure what the most elegant solution is.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v6 1/5] util: Introduce error reporting functions with fatal/abort

2016-02-03 Thread Markus Armbruster
David Gibson  writes:

> On Tue, Feb 02, 2016 at 10:47:35PM +0100, Thomas Huth wrote:
>> On 02.02.2016 19:53, Markus Armbruster wrote:
>> > Lluís Vilanova  writes:
>> ...
>> 
>> >> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
>> >> index 7ab2355..6c2f142 100644
>> >> --- a/include/qemu/error-report.h
>> >> +++ b/include/qemu/error-report.h
>> >> @@ -43,4 +43,23 @@ void error_report(const char *fmt, ...) 
>> >> GCC_FMT_ATTR(1, 2);
>> >>  const char *error_get_progname(void);
>> >>  extern bool enable_timestamp_msg;
>> >>  
>> >> +/* Report message and exit with error */
>> >> +void QEMU_NORETURN error_vreport_fatal(const char *fmt, va_list ap) 
>> >> GCC_FMT_ATTR(1, 0);
>> >> +void QEMU_NORETURN error_report_fatal(const char *fmt, ...) 
>> >> GCC_FMT_ATTR(1, 2);
>> > 
>> > This lets people write things like
>> > 
>> > error_report_fatal("The sky is falling");
>> > 
>> > instead of
>> > 
>> > error_report("The sky is falling");
>> > exit(1);
>> > 
>> > or
>> > 
>> > fprintf(stderr, "The sky is falling\n");
>> > exit(1);
>> > 
>> > I don't think that's an improvement in clarity.
>> 
>> The problem is not the existing code, but that in a couple of new
>> patches, I've now already seen that people are trying to use
>> 
>>  error_setg(&error_fatal, ... );
>
> So, I don't actually see any real advantage to error_report_fatal(...)
> over error_setg(&error_fatal, ...).

I do.  Compare:

(a) error_report(...);
exit(1);

(b) error_report_fatal(...);

(c) error_setg(&error_fatal, ...);

In my opinion, (a) is clearest: even a relatively clueless reader will
know what exit(1) does, can guess what error_report() approximately
does, and doesn't need to know what it does exactly.  (b) is slightly
less obvious, and (c) is positively opaque.

Let's stick to the obvious (a) and be done with it.

[...]



[Qemu-devel] [PULL 1/1] hmp: fix sendkey out of bounds write (CVE-2015-8619)

2016-02-03 Thread Markus Armbruster
From: Wolfgang Bumiller 

When processing 'sendkey' command, hmp_sendkey routine null
terminates the 'keyname_buf' array. This results in an OOB
write issue, if 'keyname_len' was to fall outside of
'keyname_buf' array.

Since the keyname's length is known the keyname_buf can be
removed altogether by adding a length parameter to
index_from_key() and using it for the error output as well.

Reported-by: Ling Liu 
Signed-off-by: Wolfgang Bumiller 
Message-Id: <20160113080958.GA18934@olga>
[Comparison with "<" dumbed down, test for junk after strtoul()
tweaked]
Signed-off-by: Markus Armbruster 
---
 hmp.c| 18 --
 include/ui/console.h |  2 +-
 ui/input-legacy.c|  5 +++--
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/hmp.c b/hmp.c
index 54f2620..9c571f5 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1731,21 +1731,18 @@ void hmp_sendkey(Monitor *mon, const QDict *qdict)
 int has_hold_time = qdict_haskey(qdict, "hold-time");
 int hold_time = qdict_get_try_int(qdict, "hold-time", -1);
 Error *err = NULL;
-char keyname_buf[16];
 char *separator;
 int keyname_len;
 
 while (1) {
 separator = strchr(keys, '-');
 keyname_len = separator ? separator - keys : strlen(keys);
-pstrcpy(keyname_buf, sizeof(keyname_buf), keys);
 
 /* Be compatible with old interface, convert user inputted "<" */
-if (!strncmp(keyname_buf, "<", 1) && keyname_len == 1) {
-pstrcpy(keyname_buf, sizeof(keyname_buf), "less");
+if (keys[0] == '<' && keyname_len == 1) {
+keys = "less";
 keyname_len = 4;
 }
-keyname_buf[keyname_len] = 0;
 
 keylist = g_malloc0(sizeof(*keylist));
 keylist->value = g_malloc0(sizeof(*keylist->value));
@@ -1758,16 +1755,17 @@ void hmp_sendkey(Monitor *mon, const QDict *qdict)
 }
 tmp = keylist;
 
-if (strstart(keyname_buf, "0x", NULL)) {
+if (strstart(keys, "0x", NULL)) {
 char *endp;
-int value = strtoul(keyname_buf, &endp, 0);
-if (*endp != '\0') {
+int value = strtoul(keys, &endp, 0);
+assert(endp <= keys + keyname_len);
+if (endp != keys + keyname_len) {
 goto err_out;
 }
 keylist->value->type = KEY_VALUE_KIND_NUMBER;
 keylist->value->u.number = value;
 } else {
-int idx = index_from_key(keyname_buf);
+int idx = index_from_key(keys, keyname_len);
 if (idx == Q_KEY_CODE__MAX) {
 goto err_out;
 }
@@ -1789,7 +1787,7 @@ out:
 return;
 
 err_out:
-monitor_printf(mon, "invalid parameter: %s\n", keyname_buf);
+monitor_printf(mon, "invalid parameter: %.*s\n", keyname_len, keys);
 goto out;
 }
 
diff --git a/include/ui/console.h b/include/ui/console.h
index adac36d..116bc2b 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -448,7 +448,7 @@ static inline int vnc_display_pw_expire(const char *id, 
time_t expires)
 void curses_display_init(DisplayState *ds, int full_screen);
 
 /* input.c */
-int index_from_key(const char *key);
+int index_from_key(const char *key, size_t key_length);
 
 /* gtk.c */
 void early_gtk_display_init(int opengl);
diff --git a/ui/input-legacy.c b/ui/input-legacy.c
index 35dfc27..3454055 100644
--- a/ui/input-legacy.c
+++ b/ui/input-legacy.c
@@ -57,12 +57,13 @@ struct QEMUPutLEDEntry {
 static QTAILQ_HEAD(, QEMUPutLEDEntry) led_handlers =
 QTAILQ_HEAD_INITIALIZER(led_handlers);
 
-int index_from_key(const char *key)
+int index_from_key(const char *key, size_t key_length)
 {
 int i;
 
 for (i = 0; QKeyCode_lookup[i] != NULL; i++) {
-if (!strcmp(key, QKeyCode_lookup[i])) {
+if (!strncmp(key, QKeyCode_lookup[i], key_length) &&
+!QKeyCode_lookup[i][key_length]) {
 break;
 }
 }
-- 
2.4.3




[Qemu-devel] [PULL 0/1] Monitor patches for 2016-02-03

2016-02-03 Thread Markus Armbruster
The following changes since commit c65db7705b7926f4a084b93778e4bd5dd3990aad:

  Merge remote-tracking branch 
'remotes/maxreitz/tags/pull-block-for-peter-2016-02-02' into staging 
(2016-02-02 18:04:04 +)

are available in the git repository at:

  git://repo.or.cz/qemu/armbru.git tags/pull-monitor-2016-02-03

for you to fetch changes up to 64ffbe04eaafebf4045a3ace52a360c14959d196:

  hmp: fix sendkey out of bounds write (CVE-2015-8619) (2016-02-03 10:13:06 
+0100)


Monitor patches for 2016-02-03


Wolfgang Bumiller (1):
  hmp: fix sendkey out of bounds write (CVE-2015-8619)

 hmp.c| 18 --
 include/ui/console.h |  2 +-
 ui/input-legacy.c|  5 +++--
 3 files changed, 12 insertions(+), 13 deletions(-)

-- 
2.4.3




Re: [Qemu-devel] [PATCH 05/16] iothread: release AioContext around aio_poll

2016-02-03 Thread Paolo Bonzini


On 03/02/2016 10:34, Stefan Hajnoczi wrote:
> +g_usleep(50);
> Sleep?
>>> 
>>> What about it? :)
> Sleep in a loop is inefficient but at least correct.
> 
> A sleep outside a loop is a race condition.  If the machine is
> heavily loaded, whatever you are waiting for may not happen in 500
> ms and the test case is unreliable.
> 
> What is the purpose of this sleep?

In the test the desired ordering of events is

main thread additional thread
-
lock start_lock
start thread
lock start_lock (waits)
acquire context
unlock start_lock
lock start_lock (returns)
unlock start_lock
aio_poll
   poll()   
event_notifier_set  

Comparing the test to QEMU, the roles of the threads are reversed. The
test's "main thread" is QEMU's dataplane thread, and the test's
additional thread is QEMU's main thread.

The sleep "ensures" that poll() blocks before event_notifier_set.  The
sleep represents how QEMU's main thread acquires the context rarely,
and not right after starting the dataplane thread.

Because this is a file descriptor source, there is really no
difference between the code's behavior, no matter if aio_poll starts
before or after the event_notifier_set.  The test passes even if you
remove the sleep.

Do you have any suggestion?  Just add a comment?

Paolo



Re: [Qemu-devel] [PATCH v1 14/22] migration: convert savevm to use QIOChannel for writing to files

2016-02-03 Thread Dr. David Alan Gilbert
* Daniel P. Berrange (berra...@redhat.com) wrote:
> Convert the exec savevm code to use QIOChannel and QEMUFileChannel,
> instead of the stdio APIs.
> 
> Signed-off-by: Daniel P. Berrange 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/savevm.c   |  8 +---
>  tests/Makefile   |  4 ++--
>  tests/test-vmstate.c | 11 ++-
>  3 files changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index f2e1880..e57d7ce 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -51,6 +51,7 @@
>  #include "block/snapshot.h"
>  #include "block/qapi.h"
>  #include "io/channel-buffer.h"
> +#include "io/channel-file.h"
>  
>  
>  #ifndef ETH_P_RARP
> @@ -1990,6 +1991,7 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
>  void qmp_xen_save_devices_state(const char *filename, Error **errp)
>  {
>  QEMUFile *f;
> +QIOChannelFile *ioc;
>  int saved_vm_running;
>  int ret;
>  
> @@ -1997,11 +1999,11 @@ void qmp_xen_save_devices_state(const char *filename, 
> Error **errp)
>  vm_stop(RUN_STATE_SAVE_VM);
>  global_state_store_running();
>  
> -f = qemu_fopen(filename, "wb");
> -if (!f) {
> -error_setg_file_open(errp, errno, filename);
> +ioc = qio_channel_file_new_path(filename, O_WRONLY | O_CREAT, 0660, 
> errp);
> +if (!ioc) {
>  goto the_end;
>  }
> +f = qemu_fopen_channel_output(QIO_CHANNEL(ioc));
>  ret = qemu_save_device_state(f);
>  qemu_fclose(f);
>  if (ret < 0) {
> diff --git a/tests/Makefile b/tests/Makefile
> index 5a1b25f..e8f759a 100644
> --- a/tests/Makefile
> +++ b/tests/Makefile
> @@ -423,8 +423,8 @@ tests/test-qdev-global-props$(EXESUF): 
> tests/test-qdev-global-props.o \
>   $(test-qapi-obj-y)
>  tests/test-vmstate$(EXESUF): tests/test-vmstate.o \
>   migration/vmstate.o migration/qemu-file.o \
> -migration/qemu-file-unix.o qjson.o \
> - $(test-qom-obj-y)
> +migration/qemu-file-channel.o qjson.o \
> + $(test-io-obj-y)
>  tests/test-timed-average$(EXESUF): tests/test-timed-average.o qemu-timer.o \
>   $(test-util-obj-y)
>  tests/test-base64$(EXESUF): tests/test-base64.o \
> diff --git a/tests/test-vmstate.c b/tests/test-vmstate.c
> index 0f943d5..6bbda8a 100644
> --- a/tests/test-vmstate.c
> +++ b/tests/test-vmstate.c
> @@ -28,6 +28,7 @@
>  #include "migration/migration.h"
>  #include "migration/vmstate.h"
>  #include "qemu/coroutine.h"
> +#include "io/channel-file.h"
>  
>  static char temp_file[] = "/tmp/vmst.test.XX";
>  static int temp_fd;
> @@ -48,11 +49,17 @@ void yield_until_fd_readable(int fd)
>  static QEMUFile *open_test_file(bool write)
>  {
>  int fd = dup(temp_fd);
> +QIOChannel *ioc;
>  lseek(fd, 0, SEEK_SET);
>  if (write) {
>  g_assert_cmpint(ftruncate(fd, 0), ==, 0);
>  }
> -return qemu_fdopen(fd, write ? "wb" : "rb");
> +ioc = QIO_CHANNEL(qio_channel_file_new_fd(fd));
> +if (write) {
> +return qemu_fopen_channel_output(ioc);
> +} else {
> +return qemu_fopen_channel_input(ioc);
> +}
>  }
>  
>  #define SUCCESS(val) \
> @@ -468,6 +475,8 @@ int main(int argc, char **argv)
>  {
>  temp_fd = mkstemp(temp_file);
>  
> +module_call_init(MODULE_INIT_QOM);
> +
>  g_test_init(&argc, &argv, NULL);
>  g_test_add_func("/vmstate/simple/primitive", test_simple_primitive);
>  g_test_add_func("/vmstate/versioned/load/v1", test_load_v1);
> -- 
> 2.5.0
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v14 7/8] Implement new driver for block replication

2016-02-03 Thread Wen Congyang
On 02/03/2016 05:32 PM, Stefan Hajnoczi wrote:
> On Wed, Feb 03, 2016 at 09:29:15AM +0800, Wen Congyang wrote:
>> On 02/02/2016 10:34 PM, Stefan Hajnoczi wrote:
>>> On Mon, Feb 01, 2016 at 09:13:36AM +0800, Wen Congyang wrote:
 On 01/29/2016 11:46 PM, Stefan Hajnoczi wrote:
> On Fri, Jan 29, 2016 at 11:13:42AM +0800, Changlong Xie wrote:
>> On 01/28/2016 11:15 PM, Stefan Hajnoczi wrote:
>>> On Thu, Jan 28, 2016 at 09:13:24AM +0800, Wen Congyang wrote:
 On 01/27/2016 10:46 PM, Stefan Hajnoczi wrote:
> On Wed, Jan 13, 2016 at 05:18:31PM +0800, Changlong Xie wrote:
>>> I'm concerned that the bdrv_drain_all() in vm_stop() can take a long
>>> time if the disk is slow/failing.  bdrv_drain_all() blocks until all
>>> in-flight I/O requests have completed.  What does the Primary do if the
>>> Secondary becomes unresponsive?
>>
>> Actually, we knew this problem. But currently, there seems no better way 
>> to
>> resolve it. If you have any ideas?
>
> Is it possible to hold the checkpoint information and acknowledge the
> checkpoint right away, without waiting for bdrv_drain_all() or any
> Secondory guest activity to complete?

 There is no way to know that secondary becomes unreponsive.
>>>
>>> I meant whether it is necessary for the Secondary to vm_stop() and apply
>>> the checkpoint before acknowledging the checkpoint to the Primary?
>>
>> I don't understand this.
>> Here is the COLO checkpoint flow:
>>
>> PrimarySecondary
>> new checkpoint notice --->
>> vm_stop()  vm_stop()
>> vm state(device state, memory, cpu)   --->
>>load state
>>   <--- done
>> vm_start() vm_start()
> 
> If the Secondary's vm_stop() call blocks then the Primary is stuck too.
> 
> I was wondering whether the Secondary can do:
> 
> <---  done
>   vm_stop()
>   load state
> 
> It simply receives the checkpoint data into a buffer and immediately
> replies with "done".  vm_stop() and load state is only performed after
> sending "done".

Secondary vm is running, so we should also get the pages that are dirtied
by secondary vm, but not dirtied by primary vm.
We have two ways to do it:
1. Cache all original memory in the secondary qemu
2. Send the dirty pfn list to primary qemu, and get it.

If we ack the checkpoint and the call vm_stop(), we only can select 1. It
means that secondary qemu costs more memory.
In COLO mode, we will compare the output socket, and will do checkpoint if
the application level data is different. If we ack the checkpoint and the
call vm_stop(), the client can not get any more data until secondary vm
is running again. So we still 'wait' the secondary vm.


> 
> The advantage is that the Primary will not be delayed by the Secondary.
> It's an approach that doesn't block.
> 
> But perhaps it's a problem if the Secondary is slower than the Primary
> since the Secondary still needs to complete vm_stop() and load state
> before it can resume execution?
> 
> I think this really means falling back to microcheckpointing until the
> Secondary guest can checkpoint.  Instead of a blocking vm_stop() we
> would prevent vcpus from running and when the last pending I/O finishes
> the Secondary could apply the last checkpoint.  This approach does not
> block QEMU (the monitor, etc).
>

 If secondary host becomes unresponsive, it means that we cannot do 
 mocrocheckpointing.
 We should do failover in this case.
>>>
>>> This is dangerous because it means that a delay/failure in the Secondary
>>> would cause the Primary to fail over to the broken Secondary.  All the
>>> more reason not to perform blocking operations on the Secondary in the
>>> checkpoint code path.
>>
>> If the secondary is broken, primary qemu will take over.
> 
> Does the Primary use a timeout between "new checkpoint notice" and
> Secondary's "done" so it can move on if the Secondary is unresponsive?

To hailiang:
IIRC, we don't use a timeout but I think we can do it. In our design, there is
an exteranl heartbeat to check primary and secondary status, and decide when
to do checkpoint.

Thanks
Wen Congyang


> 
> Stefan
> 






Re: [Qemu-devel] [PATCH] block: add missing call to bdrv_drain_recurse

2016-02-03 Thread Paolo Bonzini


On 11/01/2016 09:32, Paolo Bonzini wrote:
> 
> 
> On 25/12/2015 02:55, Fam Zheng wrote:
>> On Wed, 12/23 11:48, Paolo Bonzini wrote:
>>> This is also needed in bdrv_drain_all, not just in bdrv_drain.
>>>
>>> Signed-off-by: Paolo Bonzini 
>>> ---
>>>  block/io.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/block/io.c b/block/io.c
>>> index 841f5b5..bfe2544 100644
>>> --- a/block/io.c
>>> +++ b/block/io.c
>>> @@ -293,6 +293,7 @@ void bdrv_drain_all(void)
>>>  if (bs->job) {
>>>  block_job_pause(bs->job);
>>>  }
>>> +bdrv_drain_recurse(bs);
>>>  aio_context_release(aio_context);
>>>  
>>>  if (!g_slist_find(aio_ctxs, aio_context)) {
>>> -- 
>>> 2.5.0
>>>
>>>
>>
>> Reviewed-by: Fam Zheng 
>>
>>
> 
> Ping?

Ping^2?



Re: [Qemu-devel] [PATCH v1 15/22] migration: delete QEMUFile buffer implementation

2016-02-03 Thread Dr. David Alan Gilbert
* Daniel P. Berrange (berra...@redhat.com) wrote:
> The qemu_bufopen() method is no longer used, so the memory
> buffer based QEMUFile backend can be deleted entirely.
> 
> Signed-off-by: Daniel P. Berrange 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  include/migration/qemu-file.h |  6 ---
>  migration/qemu-file-buf.c | 96 
> ---
>  2 files changed, 102 deletions(-)
> 
> diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
> index cb79311..6b12960 100644
> --- a/include/migration/qemu-file.h
> +++ b/include/migration/qemu-file.h
> @@ -141,7 +141,6 @@ QEMUFile *qemu_fopen_socket(int fd, const char *mode);
>  QEMUFile *qemu_fopen_channel_input(QIOChannel *ioc);
>  QEMUFile *qemu_fopen_channel_output(QIOChannel *ioc);
>  QEMUFile *qemu_popen_cmd(const char *command, const char *mode);
> -QEMUFile *qemu_bufopen(const char *mode, QEMUSizedBuffer *input);
>  void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks);
>  int qemu_get_fd(QEMUFile *f);
>  int qemu_fclose(QEMUFile *f);
> @@ -167,11 +166,6 @@ ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t 
> *buf,
>   off_t pos, size_t count);
>  
>  
> -/*
> - * For use on files opened with qemu_bufopen
> - */
> -const QEMUSizedBuffer *qemu_buf_get(QEMUFile *f);
> -
>  static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
>  {
>  qemu_put_byte(f, (int)v);
> diff --git a/migration/qemu-file-buf.c b/migration/qemu-file-buf.c
> index 49516b8..c0bc38c 100644
> --- a/migration/qemu-file-buf.c
> +++ b/migration/qemu-file-buf.c
> @@ -365,99 +365,3 @@ ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t 
> *source,
>  
>  return count;
>  }
> -
> -typedef struct QEMUBuffer {
> -QEMUSizedBuffer *qsb;
> -QEMUFile *file;
> -bool qsb_allocated;
> -} QEMUBuffer;
> -
> -static ssize_t buf_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
> -  size_t size)
> -{
> -QEMUBuffer *s = opaque;
> -ssize_t len = qsb_get_length(s->qsb) - pos;
> -
> -if (len <= 0) {
> -return 0;
> -}
> -
> -if (len > size) {
> -len = size;
> -}
> -return qsb_get_buffer(s->qsb, pos, len, buf);
> -}
> -
> -static ssize_t buf_put_buffer(void *opaque, const uint8_t *buf,
> -  int64_t pos, size_t size)
> -{
> -QEMUBuffer *s = opaque;
> -
> -return qsb_write_at(s->qsb, buf, pos, size);
> -}
> -
> -static int buf_close(void *opaque)
> -{
> -QEMUBuffer *s = opaque;
> -
> -if (s->qsb_allocated) {
> -qsb_free(s->qsb);
> -}
> -
> -g_free(s);
> -
> -return 0;
> -}
> -
> -const QEMUSizedBuffer *qemu_buf_get(QEMUFile *f)
> -{
> -QEMUBuffer *p;
> -
> -qemu_fflush(f);
> -
> -p = f->opaque;
> -
> -return p->qsb;
> -}
> -
> -static const QEMUFileOps buf_read_ops = {
> -.get_buffer = buf_get_buffer,
> -.close =  buf_close,
> -};
> -
> -static const QEMUFileOps buf_write_ops = {
> -.put_buffer = buf_put_buffer,
> -.close =  buf_close,
> -};
> -
> -QEMUFile *qemu_bufopen(const char *mode, QEMUSizedBuffer *input)
> -{
> -QEMUBuffer *s;
> -
> -if (mode == NULL || (mode[0] != 'r' && mode[0] != 'w') ||
> -mode[1] != '\0') {
> -error_report("qemu_bufopen: Argument validity check failed");
> -return NULL;
> -}
> -
> -s = g_new0(QEMUBuffer, 1);
> -s->qsb = input;
> -
> -if (s->qsb == NULL) {
> -s->qsb = qsb_create(NULL, 0);
> -s->qsb_allocated = true;
> -}
> -if (!s->qsb) {
> -g_free(s);
> -error_report("qemu_bufopen: qsb_create failed");
> -return NULL;
> -}
> -
> -
> -if (mode[0] == 'r') {
> -s->file = qemu_fopen_ops(s, &buf_read_ops);
> -} else {
> -s->file = qemu_fopen_ops(s, &buf_write_ops);
> -}
> -return s->file;
> -}
> -- 
> 2.5.0
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v1 16/22] migration: delete QEMUSizedBuffer struct

2016-02-03 Thread Dr. David Alan Gilbert
* Daniel P. Berrange (berra...@redhat.com) wrote:
> Now that we don't have have a buffer based QemuFile
> implementation, the QEMUSizedBuffer code is also
> unused and can be deleted. A simpler buffer class
> also exists in util/buffer.c which other code can
> used as needed.
> 
> Signed-off-by: Daniel P. Berrange 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  include/migration/qemu-file.h |  16 --
>  include/qemu/typedefs.h   |   1 -
>  migration/Makefile.objs   |   2 +-
>  migration/qemu-file-buf.c | 367 
> --
>  4 files changed, 1 insertion(+), 385 deletions(-)
>  delete mode 100644 migration/qemu-file-buf.c
> 
> diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
> index 6b12960..da67931 100644
> --- a/include/migration/qemu-file.h
> +++ b/include/migration/qemu-file.h
> @@ -127,13 +127,6 @@ typedef struct QEMUFileHooks {
>  QEMURamSaveFunc *save_page;
>  } QEMUFileHooks;
>  
> -struct QEMUSizedBuffer {
> -struct iovec *iov;
> -size_t n_iov;
> -size_t size; /* total allocated size in all iov's */
> -size_t used; /* number of used bytes */
> -};
> -
>  QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops);
>  QEMUFile *qemu_fopen(const char *filename, const char *mode);
>  QEMUFile *qemu_fdopen(int fd, const char *mode);
> @@ -156,15 +149,6 @@ void qemu_put_buffer_async(QEMUFile *f, const uint8_t 
> *buf, size_t size);
>  bool qemu_file_mode_is_not_valid(const char *mode);
>  bool qemu_file_is_writable(QEMUFile *f);
>  
> -QEMUSizedBuffer *qsb_create(const uint8_t *buffer, size_t len);
> -void qsb_free(QEMUSizedBuffer *);
> -size_t qsb_set_length(QEMUSizedBuffer *qsb, size_t length);
> -size_t qsb_get_length(const QEMUSizedBuffer *qsb);
> -ssize_t qsb_get_buffer(const QEMUSizedBuffer *, off_t start, size_t count,
> -   uint8_t *buf);
> -ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t *buf,
> - off_t pos, size_t count);
> -
>  
>  static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
>  {
> diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
> index 78fe6e8..3f8dfbf 100644
> --- a/include/qemu/typedefs.h
> +++ b/include/qemu/typedefs.h
> @@ -77,7 +77,6 @@ typedef struct QemuOpt QemuOpt;
>  typedef struct QemuOpts QemuOpts;
>  typedef struct QemuOptsList QemuOptsList;
>  typedef struct QEMUSGList QEMUSGList;
> -typedef struct QEMUSizedBuffer QEMUSizedBuffer;
>  typedef struct QEMUTimer QEMUTimer;
>  typedef struct QEMUTimerListGroup QEMUTimerListGroup;
>  typedef struct QObject QObject;
> diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> index 3c90c44..8ecb941 100644
> --- a/migration/Makefile.objs
> +++ b/migration/Makefile.objs
> @@ -1,6 +1,6 @@
>  common-obj-y += migration.o tcp.o unix.o fd.o exec.o
>  common-obj-y += vmstate.o
> -common-obj-y += qemu-file.o qemu-file-buf.o qemu-file-unix.o 
> qemu-file-stdio.o
> +common-obj-y += qemu-file.o qemu-file-unix.o qemu-file-stdio.o
>  common-obj-y += qemu-file-channel.o
>  common-obj-y += xbzrle.o postcopy-ram.o
>  
> diff --git a/migration/qemu-file-buf.c b/migration/qemu-file-buf.c
> deleted file mode 100644
> index c0bc38c..000
> --- a/migration/qemu-file-buf.c
> +++ /dev/null
> @@ -1,367 +0,0 @@
> -/*
> - * QEMU System Emulator
> - *
> - * Copyright (c) 2003-2008 Fabrice Bellard
> - * Copyright (c) 2014 IBM Corp.
> - *
> - * Authors:
> - *  Stefan Berger 
> - *
> - * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> - * of this software and associated documentation files (the "Software"), to 
> deal
> - * in the Software without restriction, including without limitation the 
> rights
> - * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> - * copies of the Software, and to permit persons to whom the Software is
> - * furnished to do so, subject to the following conditions:
> - *
> - * The above copyright notice and this permission notice shall be included in
> - * all copies or substantial portions of the Software.
> - *
> - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> - * THE SOFTWARE.
> - */
> -#include "qemu-common.h"
> -#include "qemu/error-report.h"
> -#include "qemu/iov.h"
> -#include "qemu/sockets.h"
> -#include "qemu/coroutine.h"
> -#include "migration/migration.h"
> -#include "migration/qemu-file.h"
> -#include "migration/qemu-file-internal.h"
> -#include "trace.h"
> -
> -#define QSB_CHUNK_SIZE  (1 << 10)
> -#define QSB_MAX_CHUNK_SIZE  (16 * QSB_CHUNK_SIZE)
> -
> 

Re: [Qemu-devel] [PATCH v1 17/22] migration: delete QEMUFile sockets implementation

2016-02-03 Thread Dr. David Alan Gilbert
* Daniel P. Berrange (berra...@redhat.com) wrote:
> Now that the tcp, unix and fd migration backends have converted
> to use the QIOChannel based QEMUFile, there is no user remaining
> for the sockets based QEMUFile impl and it can be deleted.
> 
> Signed-off-by: Daniel P. Berrange 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  include/migration/qemu-file.h |   2 -
>  migration/Makefile.objs   |   2 +-
>  migration/qemu-file-unix.c| 324 
> --
>  3 files changed, 1 insertion(+), 327 deletions(-)
>  delete mode 100644 migration/qemu-file-unix.c
> 
> diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
> index da67931..dc4b7ab 100644
> --- a/include/migration/qemu-file.h
> +++ b/include/migration/qemu-file.h
> @@ -129,8 +129,6 @@ typedef struct QEMUFileHooks {
>  
>  QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops);
>  QEMUFile *qemu_fopen(const char *filename, const char *mode);
> -QEMUFile *qemu_fdopen(int fd, const char *mode);
> -QEMUFile *qemu_fopen_socket(int fd, const char *mode);
>  QEMUFile *qemu_fopen_channel_input(QIOChannel *ioc);
>  QEMUFile *qemu_fopen_channel_output(QIOChannel *ioc);
>  QEMUFile *qemu_popen_cmd(const char *command, const char *mode);
> diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> index 8ecb941..2c71056 100644
> --- a/migration/Makefile.objs
> +++ b/migration/Makefile.objs
> @@ -1,6 +1,6 @@
>  common-obj-y += migration.o tcp.o unix.o fd.o exec.o
>  common-obj-y += vmstate.o
> -common-obj-y += qemu-file.o qemu-file-unix.o qemu-file-stdio.o
> +common-obj-y += qemu-file.o qemu-file-stdio.o
>  common-obj-y += qemu-file-channel.o
>  common-obj-y += xbzrle.o postcopy-ram.o
>  
> diff --git a/migration/qemu-file-unix.c b/migration/qemu-file-unix.c
> deleted file mode 100644
> index 6ca53e7..000
> --- a/migration/qemu-file-unix.c
> +++ /dev/null
> @@ -1,324 +0,0 @@
> -/*
> - * QEMU System Emulator
> - *
> - * Copyright (c) 2003-2008 Fabrice Bellard
> - *
> - * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> - * of this software and associated documentation files (the "Software"), to 
> deal
> - * in the Software without restriction, including without limitation the 
> rights
> - * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> - * copies of the Software, and to permit persons to whom the Software is
> - * furnished to do so, subject to the following conditions:
> - *
> - * The above copyright notice and this permission notice shall be included in
> - * all copies or substantial portions of the Software.
> - *
> - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> - * THE SOFTWARE.
> - */
> -#include "qemu-common.h"
> -#include "qemu/error-report.h"
> -#include "qemu/iov.h"
> -#include "qemu/sockets.h"
> -#include "qemu/coroutine.h"
> -#include "migration/qemu-file.h"
> -#include "migration/qemu-file-internal.h"
> -
> -typedef struct QEMUFileSocket {
> -int fd;
> -QEMUFile *file;
> -} QEMUFileSocket;
> -
> -static ssize_t socket_writev_buffer(void *opaque, struct iovec *iov, int 
> iovcnt,
> -int64_t pos)
> -{
> -QEMUFileSocket *s = opaque;
> -ssize_t len;
> -ssize_t size = iov_size(iov, iovcnt);
> -ssize_t offset = 0;
> -int err;
> -
> -while (size > 0) {
> -len = iov_send(s->fd, iov, iovcnt, offset, size);
> -
> -if (len > 0) {
> -size -= len;
> -offset += len;
> -}
> -
> -if (size > 0) {
> -err = socket_error();
> -
> -if (err != EAGAIN && err != EWOULDBLOCK) {
> -error_report("socket_writev_buffer: Got err=%d for 
> (%zu/%zu)",
> - err, (size_t)size, (size_t)len);
> -/*
> - * If I've already sent some but only just got the error, I
> - * could return the amount validly sent so far and wait for 
> the
> - * next call to report the error, but I'd rather flag the 
> error
> - * immediately.
> - */
> -return -err;
> -}
> -
> -/* Emulate blocking */
> -GPollFD pfd;
> -
> -pfd.fd = s->fd;
> -pfd.events = G_IO_OUT | G_IO_ERR;
> -pfd.revents = 0;
> -TFR(err = g_poll(&pfd, 1, -1 /* no timeout */));
> -/* Errors other than EINTR intentionally ignored */
> -}
> - }
> -
> -return offset;
>

Re: [Qemu-devel] [PATCH 2/2] m68k: Build the opcode table only once to avoid multithreading issues

2016-02-03 Thread Laurent Vivier


Le 03/02/2016 10:39, Laurent Vivier a écrit :
> 
> 
> Le 03/02/2016 10:37, John Paul Adrian Glaubitz a écrit :
>> Signed-off-by: John Paul Adrian Glaubitz 
>> ---
>>  target-m68k/translate.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/target-m68k/translate.c b/target-m68k/translate.c
>> index 535d7f9..f508d1e 100644
>> --- a/target-m68k/translate.c
>> +++ b/target-m68k/translate.c
>> @@ -2828,6 +2828,10 @@ register_opcode (disas_proc proc, uint16_t opcode, 
>> uint16_t mask)
>> Later insn override earlier ones.  */
>>  void register_m68k_insns (CPUM68KState *env)
>>  {
>> +/* Build the opcode table only once to avoid
>> +   issues with multithreading. */
>> +if(opcode_table[0] != NULL)
>> +   return;
>>  #define INSN(name, opcode, mask, feature) do { \
>>  if (m68k_feature(env, M68K_FEATURE_##feature)) \
>>  register_opcode(disas_##name, 0x##opcode, 0x##mask); \
>>
> Reviewed-by: Laurent Vivier 

In fact, the line should be:

+if (opcode_table[0] != NULL)
+return;

thanks to scripts/checkpatch.pl.

Laurent



Re: [Qemu-devel] [PATCH v6 1/5] util: Introduce error reporting functions with fatal/abort

2016-02-03 Thread Thomas Huth
On 03.02.2016 10:48, Markus Armbruster wrote:
> David Gibson  writes:
> 
>> On Tue, Feb 02, 2016 at 10:47:35PM +0100, Thomas Huth wrote:
>>> On 02.02.2016 19:53, Markus Armbruster wrote:
 Lluís Vilanova  writes:
>>> ...
>>>
> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
> index 7ab2355..6c2f142 100644
> --- a/include/qemu/error-report.h
> +++ b/include/qemu/error-report.h
> @@ -43,4 +43,23 @@ void error_report(const char *fmt, ...) 
> GCC_FMT_ATTR(1, 2);
>  const char *error_get_progname(void);
>  extern bool enable_timestamp_msg;
>  
> +/* Report message and exit with error */
> +void QEMU_NORETURN error_vreport_fatal(const char *fmt, va_list ap) 
> GCC_FMT_ATTR(1, 0);
> +void QEMU_NORETURN error_report_fatal(const char *fmt, ...) 
> GCC_FMT_ATTR(1, 2);

 This lets people write things like

 error_report_fatal("The sky is falling");

 instead of

 error_report("The sky is falling");
 exit(1);

 or

 fprintf(stderr, "The sky is falling\n");
 exit(1);

 I don't think that's an improvement in clarity.
>>>
>>> The problem is not the existing code, but that in a couple of new
>>> patches, I've now already seen that people are trying to use
>>>
>>>  error_setg(&error_fatal, ... );
>>
>> So, I don't actually see any real advantage to error_report_fatal(...)
>> over error_setg(&error_fatal, ...).
> 
> I do.  Compare:
> 
> (a) error_report(...);
> exit(1);
> 
> (b) error_report_fatal(...);
> 
> (c) error_setg(&error_fatal, ...);
> 
> In my opinion, (a) is clearest: even a relatively clueless reader will
> know what exit(1) does, can guess what error_report() approximately
> does, and doesn't need to know what it does exactly.  (b) is slightly
> less obvious, and (c) is positively opaque.
> 
> Let's stick to the obvious (a) and be done with it.

Ok, (a) is fine for me too, as long as we avoid (c). Lluís, could you
maybe add that information to your patch that updates the HACKING text?
(and sorry for the fuzz with error_report_fatal() ... I thought it would
be a good solution to avoid (c), but if (a) is preferred instead, then
we should go with that solution instead).

And, by the way, what about the spots that currently already use
error_setg(&error_abort, ) ? Should they be turned into
error_report() + abort() instead? Or only abort(), without error
message, since abort() is only about programming errors?

 Thomas




Re: [Qemu-devel] [PATCH v1 18/22] migration: delete QEMUFile stdio implementation

2016-02-03 Thread Dr. David Alan Gilbert
* Daniel P. Berrange (berra...@redhat.com) wrote:
> Now that the exec migration backend and savevm have converted
> to use the QIOChannel based QEMUFile, there is no user remaining
> for the stdio based QEMUFile impl and it can be deleted.
> 
> Signed-off-by: Daniel P. Berrange 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  include/migration/qemu-file.h |   2 -
>  migration/Makefile.objs   |   2 +-
>  migration/qemu-file-stdio.c   | 195 
> --
>  3 files changed, 1 insertion(+), 198 deletions(-)
>  delete mode 100644 migration/qemu-file-stdio.c
> 
> diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
> index dc4b7ab..6a66735 100644
> --- a/include/migration/qemu-file.h
> +++ b/include/migration/qemu-file.h
> @@ -128,10 +128,8 @@ typedef struct QEMUFileHooks {
>  } QEMUFileHooks;
>  
>  QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops);
> -QEMUFile *qemu_fopen(const char *filename, const char *mode);
>  QEMUFile *qemu_fopen_channel_input(QIOChannel *ioc);
>  QEMUFile *qemu_fopen_channel_output(QIOChannel *ioc);
> -QEMUFile *qemu_popen_cmd(const char *command, const char *mode);
>  void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks);
>  int qemu_get_fd(QEMUFile *f);
>  int qemu_fclose(QEMUFile *f);
> diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> index 2c71056..a127f31 100644
> --- a/migration/Makefile.objs
> +++ b/migration/Makefile.objs
> @@ -1,6 +1,6 @@
>  common-obj-y += migration.o tcp.o unix.o fd.o exec.o
>  common-obj-y += vmstate.o
> -common-obj-y += qemu-file.o qemu-file-stdio.o
> +common-obj-y += qemu-file.o
>  common-obj-y += qemu-file-channel.o
>  common-obj-y += xbzrle.o postcopy-ram.o
>  
> diff --git a/migration/qemu-file-stdio.c b/migration/qemu-file-stdio.c
> deleted file mode 100644
> index 9bde9db..000
> --- a/migration/qemu-file-stdio.c
> +++ /dev/null
> @@ -1,195 +0,0 @@
> -/*
> - * QEMU System Emulator
> - *
> - * Copyright (c) 2003-2008 Fabrice Bellard
> - *
> - * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> - * of this software and associated documentation files (the "Software"), to 
> deal
> - * in the Software without restriction, including without limitation the 
> rights
> - * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> - * copies of the Software, and to permit persons to whom the Software is
> - * furnished to do so, subject to the following conditions:
> - *
> - * The above copyright notice and this permission notice shall be included in
> - * all copies or substantial portions of the Software.
> - *
> - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> - * THE SOFTWARE.
> - */
> -#include "qemu-common.h"
> -#include "qemu/coroutine.h"
> -#include "migration/qemu-file.h"
> -
> -typedef struct QEMUFileStdio {
> -FILE *stdio_file;
> -QEMUFile *file;
> -} QEMUFileStdio;
> -
> -static int stdio_get_fd(void *opaque)
> -{
> -QEMUFileStdio *s = opaque;
> -
> -return fileno(s->stdio_file);
> -}
> -
> -static ssize_t stdio_put_buffer(void *opaque, const uint8_t *buf, int64_t 
> pos,
> -size_t size)
> -{
> -QEMUFileStdio *s = opaque;
> -size_t res;
> -
> -res = fwrite(buf, 1, size, s->stdio_file);
> -
> -if (res != size) {
> -return -errno;
> -}
> -return res;
> -}
> -
> -static ssize_t stdio_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
> -size_t size)
> -{
> -QEMUFileStdio *s = opaque;
> -FILE *fp = s->stdio_file;
> -ssize_t bytes;
> -
> -for (;;) {
> -clearerr(fp);
> -bytes = fread(buf, 1, size, fp);
> -if (bytes != 0 || !ferror(fp)) {
> -break;
> -}
> -if (errno == EAGAIN) {
> -yield_until_fd_readable(fileno(fp));
> -} else if (errno != EINTR) {
> -break;
> -}
> -}
> -return bytes;
> -}
> -
> -static int stdio_pclose(void *opaque)
> -{
> -QEMUFileStdio *s = opaque;
> -int ret;
> -ret = pclose(s->stdio_file);
> -if (ret == -1) {
> -ret = -errno;
> -} else if (!WIFEXITED(ret) || WEXITSTATUS(ret) != 0) {
> -/* close succeeded, but non-zero exit code: */
> -ret = -EIO; /* fake errno value */
> -}
> -g_free(s);
> -return ret;
> -}
> -
> -static int stdio_fclose(void *opaque)
> -{
> -QEMUFileStdio *s = opaque;
> -int ret = 0;
> -
> -if (qemu_file_is_writable(s->file)) {
> -int f

Re: [Qemu-devel] [PATCHv3 2/4] Split serial-isa into its own config option

2016-02-03 Thread Markus Armbruster
David Gibson  writes:

> At present, the core device model code for 8250-like serial ports
> (serial.c) and the code for serial ports attached to ISA-style legacy IO
> (serial-isa.c) are both controlled by the CONFIG_SERIAL variable.
>
> There are lots and lots of embedded platforms that have 8250-like serial
> ports but have never had anything resembling ISA legacy IO.  Therefore,
> split serial-isa into its own CONFIG_SERIAL_ISA option so it can be
> disabled for platforms where it's not appropriate.
>
> For now, I enabled CONFIG_SERIAL_ISA in every default-config where
> CONFIG_SERIAL is enabled, excepting microblaze, moxie, or32, and
> xtensa.  As best as I can tell, those platforms never used legacy ISA,
> and also don't include PCI support (which would allow connection of a
> PCI->ISA bridge and/or a southbridge including legacy ISA serial
> ports).
>
> Signed-off-by: David Gibson 
> ---
>  default-configs/alpha-softmmu.mak| 1 +
>  default-configs/arm-softmmu.mak  | 1 +
>  default-configs/i386-softmmu.mak | 1 +
>  default-configs/mips-softmmu.mak | 1 +
>  default-configs/mips64-softmmu.mak   | 1 +
>  default-configs/mips64el-softmmu.mak | 1 +
>  default-configs/mipsel-softmmu.mak   | 1 +
>  default-configs/ppc-softmmu.mak  | 1 +
>  default-configs/ppc64-softmmu.mak| 1 +
>  default-configs/ppcemb-softmmu.mak   | 1 +
>  default-configs/sh4-softmmu.mak  | 1 +
>  default-configs/sh4eb-softmmu.mak| 1 +
>  default-configs/sparc64-softmmu.mak  | 1 +
>  default-configs/x86_64-softmmu.mak   | 1 +
>  hw/char/Makefile.objs| 3 ++-
>  15 files changed, 16 insertions(+), 1 deletion(-)

Ignorant question: what about the CONFIG_SERIAL in pci.mak?  Should it
trigger CONFIG_SERIAL_ISA, too?  If not, should the commit message
explain why not?



Re: [Qemu-devel] [PATCH v1 10/22] migration: convert tcp socket protocol to use QIOChannel

2016-02-03 Thread Daniel P. Berrange
On Tue, Feb 02, 2016 at 06:19:03PM +, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > Convert the tcp socket migration protocol driver to use
> > QIOChannel and QEMUFileChannel, instead of plain sockets
> > APIs.
> > 
> > While this now looks pretty similar to the migration/unix.c
> > file from the previous patch, it was decided not to merge
> > the two, because when TLS is added to the TCP impl later,
> > this file diverge from unix.c once again.
> 
> Hmm OK, although I'd kind of like to see merging, but lets
> see the TLS code later in the series
> 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  migration/tcp.c | 119 
> > ++--
> >  1 file changed, 82 insertions(+), 37 deletions(-)
> > 
> > diff --git a/migration/tcp.c b/migration/tcp.c
> > index ae89172..ac73977 100644
> > --- a/migration/tcp.c
> > +++ b/migration/tcp.c
> > @@ -2,9 +2,11 @@
> >   * QEMU live migration
> >   *
> >   * Copyright IBM, Corp. 2008
> > + * Copyright Red Hat, Inc. 2015
> >   *
> >   * Authors:
> >   *  Anthony Liguori   
> > + *  Daniel P. Berrange 
> >   *
> >   * This work is licensed under the terms of the GNU GPL, version 2.  See
> >   * the COPYING file in the top-level directory.
> > @@ -17,11 +19,9 @@
> >  
> >  #include "qemu-common.h"
> >  #include "qemu/error-report.h"
> > -#include "qemu/sockets.h"
> >  #include "migration/migration.h"
> >  #include "migration/qemu-file.h"
> > -#include "block/block.h"
> > -#include "qemu/main-loop.h"
> > +#include "io/channel-socket.h"
> >  
> >  //#define DEBUG_MIGRATION_TCP
> >  
> > @@ -33,71 +33,116 @@
> >  do { } while (0)
> >  #endif
> >  
> > -static void tcp_wait_for_connect(int fd, Error *err, void *opaque)
> > +
> > +static SocketAddress *tcp_build_address(const char *host_port, Error 
> > **errp)
> > +{
> > +InetSocketAddress *iaddr = inet_parse(host_port, errp);
> > +SocketAddress *saddr;
> > +
> > +if (!iaddr) {
> > +return NULL;
> > +}
> > +
> > +saddr = g_new0(SocketAddress, 1);
> > +saddr->type = SOCKET_ADDRESS_KIND_INET;
> > +saddr->u.inet = iaddr;
> > +
> > +return saddr;
> > +}
> > +
> > +
> > +static void tcp_outgoing_migration(Object *src,
> > +   Error *err,
> > +   gpointer opaque)
> >  {
> >  MigrationState *s = opaque;
> > +QIOChannel *sioc = QIO_CHANNEL(src);
> >  
> > -if (fd < 0) {
> > +if (err) {
> >  DPRINTF("migrate connect error: %s\n", error_get_pretty(err));
> >  s->file = NULL;
> >  migrate_fd_error(s);
> >  } else {
> >  DPRINTF("migrate connect success\n");
> > -s->file = qemu_fopen_socket(fd, "wb");
> > +s->file = qemu_fopen_channel_output(sioc);
> >  migrate_fd_connect(s);
> >  }
> > +object_unref(src);
> >  }
> >  
> > -void tcp_start_outgoing_migration(MigrationState *s, const char 
> > *host_port, Error **errp)
> > +
> > +void tcp_start_outgoing_migration(MigrationState *s,
> > +  const char *host_port,
> > +  Error **errp)
> >  {
> > -inet_nonblocking_connect(host_port, tcp_wait_for_connect, s, errp);
> > +SocketAddress *saddr = tcp_build_address(host_port, errp);
> > +QIOChannelSocket *sioc;
> > +
> > +if (!saddr) {
> > +return;
> > +}
> > +
> > +sioc = qio_channel_socket_new();
> > +qio_channel_socket_connect_async(sioc,
> > + saddr,
> > + tcp_outgoing_migration,
> > + s,
> > + NULL);
> > +qapi_free_SocketAddress(saddr);
> >  }
> >  
> > -static void tcp_accept_incoming_migration(void *opaque)
> > +
> > +static gboolean tcp_accept_incoming_migration(QIOChannel *ioc,
> > +  GIOCondition condition,
> > +  gpointer opaque)
> >  {
> > -struct sockaddr_in addr;
> > -socklen_t addrlen = sizeof(addr);
> > -int s = (intptr_t)opaque;
> >  QEMUFile *f;
> > -int c, err;
> > +QIOChannelSocket *cioc;
> > +Error *err = NULL;
> >  
> > -do {
> > -c = qemu_accept(s, (struct sockaddr *)&addr, &addrlen);
> > -err = socket_error();
> > -} while (c < 0 && err == EINTR);
> > -qemu_set_fd_handler(s, NULL, NULL, NULL);
> > -closesocket(s);
> > -
> > -DPRINTF("accepted migration\n");
> > -
> > -if (c < 0) {
> > +cioc = qio_channel_socket_accept(QIO_CHANNEL_SOCKET(ioc),
> > + &err);
> > +if (!cioc) {
> >  error_report("could not accept migration connection (%s)",
> > - strerror(err));
> > -return;
> > -}
> > -
> > -f = qemu_fopen_socket(c, "rb");
> > -if (f == NULL) {
> > -error_

Re: [Qemu-devel] [PATCH v1 11/22] migration: convert fd socket protocol to use QIOChannel

2016-02-03 Thread Daniel P. Berrange
On Tue, Feb 02, 2016 at 06:46:01PM +, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > Convert the fd socket migration protocol driver to use
> > QIOChannel and QEMUFileChannel, instead of plain sockets
> > APIs. It can be unconditionally built because the
> > QIOChannel APIs it uses will take care to report suitable
> > error messages if needed.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  migration/Makefile.objs |  4 ++--
> >  migration/fd.c  | 57 
> > -
> >  migration/migration.c   |  4 
> >  3 files changed, 39 insertions(+), 26 deletions(-)
> > 
> > diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> > index a5f8a03..64f95cd 100644
> > --- a/migration/Makefile.objs
> > +++ b/migration/Makefile.objs
> > @@ -1,11 +1,11 @@
> > -common-obj-y += migration.o tcp.o unix.o
> > +common-obj-y += migration.o tcp.o unix.o fd.o
> >  common-obj-y += vmstate.o
> >  common-obj-y += qemu-file.o qemu-file-buf.o qemu-file-unix.o 
> > qemu-file-stdio.o
> >  common-obj-y += qemu-file-channel.o
> >  common-obj-y += xbzrle.o postcopy-ram.o
> >  
> >  common-obj-$(CONFIG_RDMA) += rdma.o
> > -common-obj-$(CONFIG_POSIX) += exec.o fd.o
> > +common-obj-$(CONFIG_POSIX) += exec.o
> >  
> >  common-obj-y += block.o
> >  
> > diff --git a/migration/fd.c b/migration/fd.c
> > index 3e4bed0..8d48e0d 100644
> > --- a/migration/fd.c
> > +++ b/migration/fd.c
> > @@ -20,6 +20,8 @@
> >  #include "monitor/monitor.h"
> >  #include "migration/qemu-file.h"
> >  #include "block/block.h"
> > +#include "io/channel-file.h"
> > +#include "io/channel-socket.h"
> >  
> >  //#define DEBUG_MIGRATION_FD
> >  
> > @@ -33,56 +35,71 @@
> >  
> >  static bool fd_is_socket(int fd)
> >  {
> > -struct stat stat;
> > -int ret = fstat(fd, &stat);
> > -if (ret == -1) {
> > -/* When in doubt say no */
> > -return false;
> > -}
> > -return S_ISSOCK(stat.st_mode);
> > +int optval;
> > +socklen_t optlen;
> > +optlen = sizeof(optval);
> > +return getsockopt(fd,
> > +  SOL_SOCKET,
> > +  SO_TYPE,
> > +  (char *)&optval,
> > +  &optlen) == 0;
> 
> Should that be qemu_getsockopt ?

Yes.

> >  void fd_start_outgoing_migration(MigrationState *s, const char *fdname, 
> > Error **errp)
> >  {
> > +QIOChannel *ioc;
> >  int fd = monitor_get_fd(cur_mon, fdname, errp);
> >  if (fd == -1) {
> >  return;
> >  }
> >  
> >  if (fd_is_socket(fd)) {
> > -s->file = qemu_fopen_socket(fd, "wb");
> > +ioc = QIO_CHANNEL(qio_channel_socket_new_fd(fd, errp));
> > +if (!ioc) {
> > +close(fd);
> > +return;
> > +}
> 
> Have you considered moving this fd_is_socket detection
> glue down into channel-file.c?  It's here so that a socket
> passed via an fd will be able to use a shutdown() method.
> It would be nice to do the same thing to sockets in the same
> situation in other places, so it would make sense if it worked
> on any fd that went through channels.
> The tricky bit is at that level you return a QIOChannelFile rather
> than a generic QIOChannel pointer.
> (One place I'd like to be able to a shutdown is on an nbd channel
> for example).

Yeah, the complexity is that we have to return two different
class instances depending on the type of FD in use. We could
introduce some helper method todo this though.


> > -static void fd_accept_incoming_migration(void *opaque)
> > +static gboolean fd_accept_incoming_migration(QIOChannel *ioc,
> > + GIOCondition condition,
> > + gpointer opaque)
> >  {
> >  QEMUFile *f = opaque;
> > -
> > -qemu_set_fd_handler(qemu_get_fd(f), NULL, NULL, NULL);
> >  process_incoming_migration(f);
> > +return FALSE;
> 
> Please comment the magical FALSE.

FALSE is the standard value any glib event loop handled need to
return to cause glib to unregister it. I'm not sure we really
want to comment every event handle in this way.

> >  void fd_start_incoming_migration(const char *infd, Error **errp)
> >  {
> > -int fd;
> >  QEMUFile *f;
> > +QIOChannel *ioc;
> > +int fd;
> >  
> >  DPRINTF("Attempting to start an incoming migration via fd\n");
> >  
> >  fd = strtol(infd, NULL, 0);
> >  if (fd_is_socket(fd)) {
> > -f = qemu_fopen_socket(fd, "rb");
> > +ioc = QIO_CHANNEL(qio_channel_socket_new_fd(fd, errp));
> > +if (!ioc) {
> > +close(fd);
> > +return;
> > +}
> 
> Wouldn't it be better to move this check outside of this if, so that
> you test the output of both the socket_new_fd and the file_new_fd ?

qio_channel_file_new_fd() can never fail - only the socket_new_fd can
fail (when trying to query the sockaddr_t data).

> >  } else {
> > -f = q

Re: [Qemu-devel] [PATCH 2/2] m68k: Build the opcode table only once to avoid multithreading issues

2016-02-03 Thread John Paul Adrian Glaubitz
On 02/03/2016 10:57 AM, Laurent Vivier wrote:
> In fact, the line should be:
> 
> +if (opcode_table[0] != NULL)
> +return;
> 
> thanks to scripts/checkpatch.pl.

Want me to re-send it?

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: [Qemu-devel] [PATCHv3 3/4] Allow ISA bus to be configured out

2016-02-03 Thread Markus Armbruster
David Gibson  writes:

> Currently, the code to handle the legacy ISA bus is always included in
> qemu.  However there are lots of platforms that don't include ISA legacy
> devies, and quite a few that have never used ISA legacy devices at all.
>
> This patch allows the ISA bus code to be disabled in the configuration for
> platforms where it doesn't make sense.
>
> For now, the default configs are adjusted to include ISA on all platforms
> including PCI: anything with PCI can at least in principle add an i82378
> PCI->ISA bridge.  Also, CONFIG_IDE_CORE which is already in pci.mak
> requires ISA support.
>
> We also explicitly enable ISA on some other non-PCI platforms which include
> ISA devices.  We may want to pare this down in future.

Impact?  Please list the targets that lose ISA because of this patch.

> Signed-off-by: David Gibson 
> Acked-by: Michael S. Tsirkin 
> ---
>  default-configs/moxie-softmmu.mak | 1 +
>  default-configs/pci.mak   | 2 ++
>  default-configs/sparc-softmmu.mak | 1 +
>  default-configs/unicore32-softmmu.mak | 1 +
>  hw/isa/Makefile.objs  | 2 +-
>  5 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/default-configs/moxie-softmmu.mak 
> b/default-configs/moxie-softmmu.mak
> index 1a95476..3886275 100644
> --- a/default-configs/moxie-softmmu.mak
> +++ b/default-configs/moxie-softmmu.mak
> @@ -1,5 +1,6 @@
>  # Default configuration for moxie-softmmu
>  
> +CONFIG_ISA_BUS=y
>  CONFIG_MC146818RTC=y
>  CONFIG_SERIAL=y
>  CONFIG_VGA=y

Uh, PATCH 2 excepted moxie from "add CONFIG_SERIAL_ISA to every config
that has CONFIG_SERIAL", but now you're giving it an ISA bus.  Please
explain.

> diff --git a/default-configs/pci.mak b/default-configs/pci.mak
> index f250119..bcf18f0 100644
> --- a/default-configs/pci.mak
> +++ b/default-configs/pci.mak
> @@ -1,4 +1,6 @@
>  CONFIG_PCI=y
> +# For now, IDE_CORE requires ISA, so we enable it here

You mean CONFIG_IDE_CORE?  If yes, please spell it out, so grep finds
it.

> +CONFIG_ISA_BUS=y
>  CONFIG_VIRTIO_PCI=y
>  CONFIG_VIRTIO=y
>  CONFIG_USB_UHCI=y
> diff --git a/default-configs/sparc-softmmu.mak 
> b/default-configs/sparc-softmmu.mak
> index ab796b3..004b0f4 100644
> --- a/default-configs/sparc-softmmu.mak
> +++ b/default-configs/sparc-softmmu.mak
> @@ -1,5 +1,6 @@
>  # Default configuration for sparc-softmmu
>  
> +CONFIG_ISA_BUS=y
>  CONFIG_ECC=y
>  CONFIG_ESP=y
>  CONFIG_ESCC=y
> diff --git a/default-configs/unicore32-softmmu.mak 
> b/default-configs/unicore32-softmmu.mak
> index de38577..5f6c4a8 100644
> --- a/default-configs/unicore32-softmmu.mak
> +++ b/default-configs/unicore32-softmmu.mak
> @@ -1,4 +1,5 @@
>  # Default configuration for unicore32-softmmu
> +CONFIG_ISA_BUS=y
>  CONFIG_PUV3=y
>  CONFIG_PTIMER=y
>  CONFIG_PCKBD=y
> diff --git a/hw/isa/Makefile.objs b/hw/isa/Makefile.objs
> index 9164556..fb37c55 100644
> --- a/hw/isa/Makefile.objs
> +++ b/hw/isa/Makefile.objs
> @@ -1,4 +1,4 @@
> -common-obj-y += isa-bus.o
> +common-obj-$(CONFIG_ISA_BUS) += isa-bus.o
>  common-obj-$(CONFIG_APM) += apm.o
>  common-obj-$(CONFIG_I82378) += i82378.o
>  common-obj-$(CONFIG_PC87312) += pc87312.o



Re: [Qemu-devel] [PATCH v5 02/10] qemu-img: add support for --object command line arg

2016-02-03 Thread Daniel P. Berrange
On Tue, Feb 02, 2016 at 05:24:32PM -0700, Eric Blake wrote:
> On 02/02/2016 05:57 AM, Daniel P. Berrange wrote:
> > Allow creation of user creatable object types with qemu-img
> > via a new --object command line arg. This will be used to supply
> > passwords and/or encryption keys to the various block driver
> > backends via the recently added 'secret' object type.
> > 
> >  # printf letmein > mypasswd.txt
> >  # qemu-img info --object secret,id=sec0,file=mypasswd.txt \
> >   ...other info args...
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  qemu-img-cmds.hx |  44 -
> >  qemu-img.c   | 269 
> > +--
> >  qemu-img.texi|   8 ++
> >  3 files changed, 291 insertions(+), 30 deletions(-)
> > 
> 
> > +++ b/qemu-img.c
> > @@ -94,6 +97,10 @@ static void QEMU_NORETURN help(void)
> > "\n"
> > "Command parameters:\n"
> > "  'filename' is a disk image filename\n"
> > +   "  'objectdef' is a QEMU user creatable object definition. See 
> > the @code{qemu(1)}\n"
> 
> Drop @code; this is the --help text.
> 
> > +   "manual page for a description of the object properties. 
> > The common object\n"
> > +   "type that it makes sense to define is 'secret' object, 
> > which is used to\n"
> 
> s/is/is a/
> 
> or maybe go for something shorter:
> 
> The most common object type is a 'secret', which is used...
> 
> or match the text you put in the info:
> 
> The only object type that it makes sense to define is the 'secret'
> object, which is used...
> 
> > @@ -275,7 +291,14 @@ static int img_create(int argc, char **argv)
> >  bool quiet = false;
> >  
> >  for(;;) {
> > -c = getopt(argc, argv, "F:b:f:he6o:q");
> > +int option_index = 0;
> > +static const struct option long_options[] = {
> > +{"help", no_argument, 0, 'h'},
> > +{"object", required_argument, 0, OPTION_OBJECT},
> > +{0, 0, 0, 0}
> > +};
> > +c = getopt_long(argc, argv, "F:b:f:he6o:q",
> > +long_options, &option_index);
> 
> Can't you pass NULL for the last parameter, if you aren't going to use
> option_index in your error reporting?

Oh, I didn't realize that it allowed NULL for that parameter.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH 2/2] m68k: Build the opcode table only once to avoid multithreading issues

2016-02-03 Thread Laurent Vivier


Le 03/02/2016 11:06, John Paul Adrian Glaubitz a écrit :
> On 02/03/2016 10:57 AM, Laurent Vivier wrote:
>> In fact, the line should be:
>>
>> +if (opcode_table[0] != NULL)
>> +return;
>>
>> thanks to scripts/checkpatch.pl.
> 
> Want me to re-send it?
> 

I don't know if the one who will commit this to the tree will want to
update this.

BTW, Peter, perhaps it's the time to add me as m68k maintainer ?
(so I will manage that :) )

Laurent



[Qemu-devel] [PATCH] m68k: Build the opcode table only once to avoid multithreading issues

2016-02-03 Thread John Paul Adrian Glaubitz
Signed-off-by: John Paul Adrian Glaubitz 
---
 target-m68k/translate.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 535d7f9..a989961 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2828,6 +2828,10 @@ register_opcode (disas_proc proc, uint16_t opcode, 
uint16_t mask)
Later insn override earlier ones.  */
 void register_m68k_insns (CPUM68KState *env)
 {
+/* Build the opcode table only once to avoid
+   multithreading issues. */
+if (opcode_table[0] != NULL)
+return;
 #define INSN(name, opcode, mask, feature) do { \
 if (m68k_feature(env, M68K_FEATURE_##feature)) \
 register_opcode(disas_##name, 0x##opcode, 0x##mask); \
-- 
2.7.0




[Qemu-devel] Updated patch for the opcode table

2016-02-03 Thread John Paul Adrian Glaubitz
Hi Laurent!

Here's the second patch again, this time checked with ./scripts/checkpatch.pl.

Thanks for the heads-up!

Adrian




Re: [Qemu-devel] [PATCH 2/2] m68k: Build the opcode table only once to avoid multithreading issues

2016-02-03 Thread John Paul Adrian Glaubitz
On 02/03/2016 11:13 AM, Laurent Vivier wrote:
> I don't know if the one who will commit this to the tree will want to
> update this.

Just sent an updated one, just in case :).

> BTW, Peter, perhaps it's the time to add me as m68k maintainer ?
> (so I will manage that :) )

Yes, please. I'm all for that! I'm really motivated to get m68k
support into best shape. qemu-m68k is fun to hack on!

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: [Qemu-devel] [PATCH v1 11/22] migration: convert fd socket protocol to use QIOChannel

2016-02-03 Thread Dr. David Alan Gilbert
* Daniel P. Berrange (berra...@redhat.com) wrote:
> On Tue, Feb 02, 2016 at 06:46:01PM +, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > Convert the fd socket migration protocol driver to use
> > > QIOChannel and QEMUFileChannel, instead of plain sockets
> > > APIs. It can be unconditionally built because the
> > > QIOChannel APIs it uses will take care to report suitable
> > > error messages if needed.
> > > 
> > > Signed-off-by: Daniel P. Berrange 
> > > ---
> > >  migration/Makefile.objs |  4 ++--
> > >  migration/fd.c  | 57 
> > > -
> > >  migration/migration.c   |  4 
> > >  3 files changed, 39 insertions(+), 26 deletions(-)
> > > 
> > > diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> > > index a5f8a03..64f95cd 100644
> > > --- a/migration/Makefile.objs
> > > +++ b/migration/Makefile.objs
> > > @@ -1,11 +1,11 @@
> > > -common-obj-y += migration.o tcp.o unix.o
> > > +common-obj-y += migration.o tcp.o unix.o fd.o
> > >  common-obj-y += vmstate.o
> > >  common-obj-y += qemu-file.o qemu-file-buf.o qemu-file-unix.o 
> > > qemu-file-stdio.o
> > >  common-obj-y += qemu-file-channel.o
> > >  common-obj-y += xbzrle.o postcopy-ram.o
> > >  
> > >  common-obj-$(CONFIG_RDMA) += rdma.o
> > > -common-obj-$(CONFIG_POSIX) += exec.o fd.o
> > > +common-obj-$(CONFIG_POSIX) += exec.o
> > >  
> > >  common-obj-y += block.o
> > >  
> > > diff --git a/migration/fd.c b/migration/fd.c
> > > index 3e4bed0..8d48e0d 100644
> > > --- a/migration/fd.c
> > > +++ b/migration/fd.c
> > > @@ -20,6 +20,8 @@
> > >  #include "monitor/monitor.h"
> > >  #include "migration/qemu-file.h"
> > >  #include "block/block.h"
> > > +#include "io/channel-file.h"
> > > +#include "io/channel-socket.h"
> > >  
> > >  //#define DEBUG_MIGRATION_FD
> > >  
> > > @@ -33,56 +35,71 @@
> > >  
> > >  static bool fd_is_socket(int fd)
> > >  {
> > > -struct stat stat;
> > > -int ret = fstat(fd, &stat);
> > > -if (ret == -1) {
> > > -/* When in doubt say no */
> > > -return false;
> > > -}
> > > -return S_ISSOCK(stat.st_mode);
> > > +int optval;
> > > +socklen_t optlen;
> > > +optlen = sizeof(optval);
> > > +return getsockopt(fd,
> > > +  SOL_SOCKET,
> > > +  SO_TYPE,
> > > +  (char *)&optval,
> > > +  &optlen) == 0;
> > 
> > Should that be qemu_getsockopt ?
> 
> Yes.
> 
> > >  void fd_start_outgoing_migration(MigrationState *s, const char *fdname, 
> > > Error **errp)
> > >  {
> > > +QIOChannel *ioc;
> > >  int fd = monitor_get_fd(cur_mon, fdname, errp);
> > >  if (fd == -1) {
> > >  return;
> > >  }
> > >  
> > >  if (fd_is_socket(fd)) {
> > > -s->file = qemu_fopen_socket(fd, "wb");
> > > +ioc = QIO_CHANNEL(qio_channel_socket_new_fd(fd, errp));
> > > +if (!ioc) {
> > > +close(fd);
> > > +return;
> > > +}
> > 
> > Have you considered moving this fd_is_socket detection
> > glue down into channel-file.c?  It's here so that a socket
> > passed via an fd will be able to use a shutdown() method.
> > It would be nice to do the same thing to sockets in the same
> > situation in other places, so it would make sense if it worked
> > on any fd that went through channels.
> > The tricky bit is at that level you return a QIOChannelFile rather
> > than a generic QIOChannel pointer.
> > (One place I'd like to be able to a shutdown is on an nbd channel
> > for example).
> 
> Yeah, the complexity is that we have to return two different
> class instances depending on the type of FD in use. We could
> introduce some helper method todo this though.

Is there any reason they return the subclasses rather than the
common parent?

> > > -static void fd_accept_incoming_migration(void *opaque)
> > > +static gboolean fd_accept_incoming_migration(QIOChannel *ioc,
> > > + GIOCondition condition,
> > > + gpointer opaque)
> > >  {
> > >  QEMUFile *f = opaque;
> > > -
> > > -qemu_set_fd_handler(qemu_get_fd(f), NULL, NULL, NULL);
> > >  process_incoming_migration(f);
> > > +return FALSE;
> > 
> > Please comment the magical FALSE.
> 
> FALSE is the standard value any glib event loop handled need to
> return to cause glib to unregister it. I'm not sure we really
> want to comment every event handle in this way.

I'd prefer we did.  It doesn't need to be big and detailed,
but either a function comment, or something as simple as:
return FALSE; /* unregister */

it's not something I immediately recognised, and it took a bit of digging
around for me to find the answer.

> > >  void fd_start_incoming_migration(const char *infd, Error **errp)
> > >  {
> > > -int fd;
> > >  QEMUFile *f;
> > > +QIOChannel *ioc;
> > > +int fd;
> > >  
> > >  DPR

Re: [Qemu-devel] [PATCH v1 10/22] migration: convert tcp socket protocol to use QIOChannel

2016-02-03 Thread Dr. David Alan Gilbert
* Daniel P. Berrange (berra...@redhat.com) wrote:
> On Tue, Feb 02, 2016 at 06:19:03PM +, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > Convert the tcp socket migration protocol driver to use
> > > QIOChannel and QEMUFileChannel, instead of plain sockets
> > > APIs.
> > > 
> > > While this now looks pretty similar to the migration/unix.c
> > > file from the previous patch, it was decided not to merge
> > > the two, because when TLS is added to the TCP impl later,
> > > this file diverge from unix.c once again.
> > 
> > Hmm OK, although I'd kind of like to see merging, but lets
> > see the TLS code later in the series
> > 
> > > Signed-off-by: Daniel P. Berrange 
> > > ---
> > >  migration/tcp.c | 119 
> > > ++--
> > >  1 file changed, 82 insertions(+), 37 deletions(-)
> > > 
> > > diff --git a/migration/tcp.c b/migration/tcp.c
> > > index ae89172..ac73977 100644
> > > --- a/migration/tcp.c
> > > +++ b/migration/tcp.c
> > > @@ -2,9 +2,11 @@
> > >   * QEMU live migration
> > >   *
> > >   * Copyright IBM, Corp. 2008
> > > + * Copyright Red Hat, Inc. 2015
> > >   *
> > >   * Authors:
> > >   *  Anthony Liguori   
> > > + *  Daniel P. Berrange 
> > >   *
> > >   * This work is licensed under the terms of the GNU GPL, version 2.  See
> > >   * the COPYING file in the top-level directory.
> > > @@ -17,11 +19,9 @@
> > >  
> > >  #include "qemu-common.h"
> > >  #include "qemu/error-report.h"
> > > -#include "qemu/sockets.h"
> > >  #include "migration/migration.h"
> > >  #include "migration/qemu-file.h"
> > > -#include "block/block.h"
> > > -#include "qemu/main-loop.h"
> > > +#include "io/channel-socket.h"
> > >  
> > >  //#define DEBUG_MIGRATION_TCP
> > >  
> > > @@ -33,71 +33,116 @@
> > >  do { } while (0)
> > >  #endif
> > >  
> > > -static void tcp_wait_for_connect(int fd, Error *err, void *opaque)
> > > +
> > > +static SocketAddress *tcp_build_address(const char *host_port, Error 
> > > **errp)
> > > +{
> > > +InetSocketAddress *iaddr = inet_parse(host_port, errp);
> > > +SocketAddress *saddr;
> > > +
> > > +if (!iaddr) {
> > > +return NULL;
> > > +}
> > > +
> > > +saddr = g_new0(SocketAddress, 1);
> > > +saddr->type = SOCKET_ADDRESS_KIND_INET;
> > > +saddr->u.inet = iaddr;
> > > +
> > > +return saddr;
> > > +}
> > > +
> > > +
> > > +static void tcp_outgoing_migration(Object *src,
> > > +   Error *err,
> > > +   gpointer opaque)
> > >  {
> > >  MigrationState *s = opaque;
> > > +QIOChannel *sioc = QIO_CHANNEL(src);
> > >  
> > > -if (fd < 0) {
> > > +if (err) {
> > >  DPRINTF("migrate connect error: %s\n", error_get_pretty(err));
> > >  s->file = NULL;
> > >  migrate_fd_error(s);
> > >  } else {
> > >  DPRINTF("migrate connect success\n");
> > > -s->file = qemu_fopen_socket(fd, "wb");
> > > +s->file = qemu_fopen_channel_output(sioc);
> > >  migrate_fd_connect(s);
> > >  }
> > > +object_unref(src);
> > >  }
> > >  
> > > -void tcp_start_outgoing_migration(MigrationState *s, const char 
> > > *host_port, Error **errp)
> > > +
> > > +void tcp_start_outgoing_migration(MigrationState *s,
> > > +  const char *host_port,
> > > +  Error **errp)
> > >  {
> > > -inet_nonblocking_connect(host_port, tcp_wait_for_connect, s, errp);
> > > +SocketAddress *saddr = tcp_build_address(host_port, errp);
> > > +QIOChannelSocket *sioc;
> > > +
> > > +if (!saddr) {
> > > +return;
> > > +}
> > > +
> > > +sioc = qio_channel_socket_new();
> > > +qio_channel_socket_connect_async(sioc,
> > > + saddr,
> > > + tcp_outgoing_migration,
> > > + s,
> > > + NULL);
> > > +qapi_free_SocketAddress(saddr);
> > >  }
> > >  
> > > -static void tcp_accept_incoming_migration(void *opaque)
> > > +
> > > +static gboolean tcp_accept_incoming_migration(QIOChannel *ioc,
> > > +  GIOCondition condition,
> > > +  gpointer opaque)
> > >  {
> > > -struct sockaddr_in addr;
> > > -socklen_t addrlen = sizeof(addr);
> > > -int s = (intptr_t)opaque;
> > >  QEMUFile *f;
> > > -int c, err;
> > > +QIOChannelSocket *cioc;
> > > +Error *err = NULL;
> > >  
> > > -do {
> > > -c = qemu_accept(s, (struct sockaddr *)&addr, &addrlen);
> > > -err = socket_error();
> > > -} while (c < 0 && err == EINTR);
> > > -qemu_set_fd_handler(s, NULL, NULL, NULL);
> > > -closesocket(s);
> > > -
> > > -DPRINTF("accepted migration\n");
> > > -
> > > -if (c < 0) {
> > > +cioc = qio_channel_socket_accep

Re: [Qemu-devel] [PATCH v6 1/5] util: Introduce error reporting functions with fatal/abort

2016-02-03 Thread Markus Armbruster
Thomas Huth  writes:

> On 03.02.2016 10:48, Markus Armbruster wrote:
>> David Gibson  writes:
>> 
>>> On Tue, Feb 02, 2016 at 10:47:35PM +0100, Thomas Huth wrote:
 On 02.02.2016 19:53, Markus Armbruster wrote:
> Lluís Vilanova  writes:
 ...

>> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
>> index 7ab2355..6c2f142 100644
>> --- a/include/qemu/error-report.h
>> +++ b/include/qemu/error-report.h
>> @@ -43,4 +43,23 @@ void error_report(const char *fmt, ...) 
>> GCC_FMT_ATTR(1, 2);
>>  const char *error_get_progname(void);
>>  extern bool enable_timestamp_msg;
>>  
>> +/* Report message and exit with error */
>> +void QEMU_NORETURN error_vreport_fatal(const char *fmt, va_list ap) 
>> GCC_FMT_ATTR(1, 0);
>> +void QEMU_NORETURN error_report_fatal(const char *fmt, ...) 
>> GCC_FMT_ATTR(1, 2);
>
> This lets people write things like
>
> error_report_fatal("The sky is falling");
>
> instead of
>
> error_report("The sky is falling");
> exit(1);
>
> or
>
> fprintf(stderr, "The sky is falling\n");
> exit(1);
>
> I don't think that's an improvement in clarity.

 The problem is not the existing code, but that in a couple of new
 patches, I've now already seen that people are trying to use

  error_setg(&error_fatal, ... );
>>>
>>> So, I don't actually see any real advantage to error_report_fatal(...)
>>> over error_setg(&error_fatal, ...).
>> 
>> I do.  Compare:
>> 
>> (a) error_report(...);
>> exit(1);
>> 
>> (b) error_report_fatal(...);
>> 
>> (c) error_setg(&error_fatal, ...);
>> 
>> In my opinion, (a) is clearest: even a relatively clueless reader will
>> know what exit(1) does, can guess what error_report() approximately
>> does, and doesn't need to know what it does exactly.  (b) is slightly
>> less obvious, and (c) is positively opaque.
>> 
>> Let's stick to the obvious (a) and be done with it.
>
> Ok, (a) is fine for me too, as long as we avoid (c). Lluís, could you
> maybe add that information to your patch that updates the HACKING text?

I feel such detailed advice belings into error.h.  Sketch appended.

If that doesn't succeed in keeping (c) out, make checkpatch flag it.

> (and sorry for the fuzz with error_report_fatal() ... I thought it would
> be a good solution to avoid (c), but if (a) is preferred instead, then
> we should go with that solution instead).
>
> And, by the way, what about the spots that currently already use
> error_setg(&error_abort, ) ? Should they be turned into
> error_report() + abort() instead? Or only abort(), without error
> message, since abort() is only about programming errors?

As I wrote in my first reply to this thread, I'd like them to be cleaned
up to just abort() or assert().

I like assert(), because it gives me exactly what I can use to debug the
programming error: a core dump (if enabled) and a source location
(useful when no core dump).  I never bought the argument that we should
use abort() instead of assert(0) because "what if NDEBUG?!?".  If you
define NDEBUG, our 600+ abort()s won't save you from our 4000+
assert()s.



diff --git a/include/qapi/error.h b/include/qapi/error.h
index 45d6c72..ea7e74f 100644
--- a/include/qapi/error.h
+++ b/include/qapi/error.h
@@ -162,6 +162,9 @@ ErrorClass error_get_class(const Error *err);
  * human-readable error message is made from printf-style @fmt, ...
  * The resulting message should be a single phrase, with no newline or
  * trailing punctuation.
+ * Please don't error_setg(&error_fatal, ...), use error_report() and
+ * exit(), because that's more obvious.
+ * Likewise, don't error_setg(&error_abort, ...), use assert().
  */
 #define error_setg(errp, fmt, ...)  \
 error_setg_internal((errp), __FILE__, __LINE__, __func__,   \
@@ -213,6 +216,8 @@ void error_setg_win32_internal(Error **errp,
  * the error object.
  * Else, move the error object from @local_err to *@dst_errp.
  * On return, @local_err is invalid.
+ * Please don't error_propagate(&error_fatal, ...), use
+ * error_report_err() and exit(), because that's more obvious.
  */
 void error_propagate(Error **dst_errp, Error *local_err);
 
@@ -291,12 +296,14 @@ void error_set_internal(Error **errp,
 GCC_FMT_ATTR(6, 7);
 
 /*
- * Pass to error_setg() & friends to abort() on error.
+ * Special error destination to abort on error.
+ * See error_setg() and error_propagate() for details.
  */
 extern Error *error_abort;
 
 /*
- * Pass to error_setg() & friends to exit(1) on error.
+ * Special error destination to exit(1) on error.
+ * See error_setg() and error_propagate() for details.
  */
 extern Error *error_fatal;
 



Re: [Qemu-devel] ping: [PATCH v13] block/raw-posix.c: Make physical devices usable in QEMU under Mac OS X host

2016-02-03 Thread Kevin Wolf
Am 02.02.2016 um 20:29 hat Eric Blake geschrieben:
> On 02/02/2016 10:28 AM, Programmingkid wrote:
> 
> >> Whats the rationale here ? Using pre-allocated fixed
> >> length arrays is pretty bad practice in general, but
> >> especially so for filenames
> > 
> > With an automatic variable there is no worry about when to release it. 
> 
> Yeah, but it comes with the downside of having to worry about exhausting
> stack space (there are platforms where MAXPATHLEN is intentionally
> undefined [hello, GNU Hurd]), or with the downside of arbitrary
> limitations.  And at the same time, MAXPATHLEN is usually wasteful -
> allocating 1k or 4k of stack to store what is typically less than 100
> bytes is dumb.  Really, storing file names in fixed length arrays is
> better off to avoid, and just get used to dynamic management.

Just let me add that while it's probably harmless here in .bdrv_open(),
other paths in the block layer which run in a coroutine have a much more
limited stack size and the danger of stack overflows is very real there.

Kevin



Re: [Qemu-devel] [PATCH v1 11/22] migration: convert fd socket protocol to use QIOChannel

2016-02-03 Thread Daniel P. Berrange
On Wed, Feb 03, 2016 at 10:29:50AM +, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > On Tue, Feb 02, 2016 at 06:46:01PM +, Dr. David Alan Gilbert wrote:
> > > >  void fd_start_incoming_migration(const char *infd, Error **errp)
> > > >  {
> > > > -int fd;
> > > >  QEMUFile *f;
> > > > +QIOChannel *ioc;
> > > > +int fd;
> > > >  
> > > >  DPRINTF("Attempting to start an incoming migration via fd\n");
> > > >  
> > > >  fd = strtol(infd, NULL, 0);
> > > >  if (fd_is_socket(fd)) {
> > > > -f = qemu_fopen_socket(fd, "rb");
> > > > +ioc = QIO_CHANNEL(qio_channel_socket_new_fd(fd, errp));
> > > > +if (!ioc) {
> > > > +close(fd);
> > > > +return;
> > > > +}
> > > 
> > > Wouldn't it be better to move this check outside of this if, so that
> > > you test the output of both the socket_new_fd and the file_new_fd ?
> > 
> > qio_channel_file_new_fd() can never fail - only the socket_new_fd can
> > fail (when trying to query the sockaddr_t data).
> 
> OK, I'm not too fussed about this bit, but:
>   1) It's easier to read - one level less of nesting
>   2) It avoids making an assumption about qio_channel_file_new_fd() never
>  failing, which is something you happen to know but you treat as
>  API;  it costs nothing to avoid making that assumption.

Note that qio_channel_file_new_fd() does not accept a 'Error **errp' since
there is no failure condition. So if we pushed the failure check up it'd
be feel like a logic error as we'd be introducing a error return path
from the method where 'errp' may not be set.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [Qemu-block] [PATCH] block: add missing call to bdrv_drain_recurse

2016-02-03 Thread Kevin Wolf
Am 23.12.2015 um 11:48 hat Paolo Bonzini geschrieben:
> This is also needed in bdrv_drain_all, not just in bdrv_drain.
> 
> Signed-off-by: Paolo Bonzini 

Reviewed-by: Kevin Wolf 



Re: [Qemu-devel] [PATCH 2/6] i.MX: simplify CCM to only handle clock required by timers.

2016-02-03 Thread Peter Maydell
On 2 February 2016 at 22:22, Jean-Christophe DUBOIS  
wrote:
> Peter Maydell wrote:
>> These are just renaming NOCLK to CLK_NONE and fixing formatting?
>> Again, please don't put that in the same patch as substantive
>> code changes.
>
>
> I just wanted to make things more coherent at the naming convention level.
>
> But if you prefer NOCLK, I'll put it back.

I have no preference either way about the name. It's just hard
to review patches if they mix lots of cleanups in at once,
and especially if they mix stylistic changes in with
behaviour changes. It's probably sufficient just to split this
patch up into logically distinct changes with suitable commit
messages.

thanks
-- PMM



Re: [Qemu-devel] [PATCH 1/1] arm: virt: change GPIO trigger interrupt to pulse

2016-02-03 Thread Peter Maydell
On 3 February 2016 at 07:15, Michael Tokarev  wrote:
> 28.01.2016 21:22, Wei Huang wrote:
>> When QEMU is hook'ed up with libvirt/virsh, the first ACPI reboot
>> request will succeed; but the following shutdown/reboot requests
>> fail to trigger VMs to react. Notice that in mach-virt machine
>> model GPIO is defined as edge-triggered and active-high in ACPI.
>> This patch changes the behavior of powerdown notifier from PULLUP
>> to PULSE. It solves the problem described above (i.e. reboot
>> continues to work).
>
> So, what's the outcome of this? :)

This patch is definitely wrong. The patch to fix up the
gpio reset stuff is definitely the right idea. Whether it
fixes the reported failure or some further change is also
needed is currently unclear.

thanks
-- PMM



Re: [Qemu-devel] [PATCH] Fix inconsistency between comment and variable name

2016-02-03 Thread Cao jin



On 02/03/2016 05:35 PM, Markus Armbruster wrote:

Michael Tokarev  writes:


03.02.2016 06:19, Cao jin wrote:

Signed-off-by: Cao jin 
---
  include/hw/qdev-core.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index abcdee8..42fa5db 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -221,7 +221,7 @@ typedef struct BusChild {

  /**
   * BusState:
- * @hotplug_device: link to a hotplug device associated with bus.
+ * @hotplug_handler: link to a hotplug device associated with bus.


Hmm.  Now while the field name in comment and in the structure
do match, the comment is still wrong, since it is a linke to
a handler, not a device… :)


Do you mean to suggest the line should be changed to

  * @hotplug_device: link to a hotplug handler associated with bus.

?



If I understand it right, hotplug_handler is a link property of 
BusState, see qbus_initfn(). Tt actually points to a device, see 
qbus_set_hotplug_handler(), and the actually hotplug handler is reside 
in this device, see how qdev_get_hotplug_handler() is used.


So, base on the talking above, the original comment could be understood 
by me:)


--
Yours Sincerely,

Cao jin





Re: [Qemu-devel] [PATCH v2 0/6] external backup api

2016-02-03 Thread Vladimir Sementsov-Ogievskiy

On 03.02.2016 11:14, Fam Zheng wrote:

On Sat, 01/30 13:56, Vladimir Sementsov-Ogievskiy wrote:

Hi all.

These series which aims to add external backup api. This is needed to allow
backup software use our dirty bitmaps.

Vmware and Parallels Cloud Server have this feature.

What is the advantage of this appraoch over "drive-backup sync=incremental
..."?


Hmm, you are asking about advantage of external backup over internal? It 
depends on external backup tool.





There are three things are done:
- add query-block-dirty-bitmap-ranges qmp command
- add qmp commands for dirty-bitmap functions: create_successor, abdicate,
   reclaim.
- make create-successor command transaction-able

Then, external backup should be done like this:

1.  qmp transaction {
 external-snapshot
 bitmap-create-successor
 }

2.  qmp query frozen bitmap, not acquiring aio context.

What do you do with the query response, read the snapshot image with an
external tool?


Yes. Alternatively, image fleecing may be used instead of snapshot.



Fam


3.  do external backup, using snapshot and bitmap


4.  if (success backup)
 qmp bitmap-abdicate
 else
 qmp bitmap-reclaime

5.  qmp merge snapshot


v2: a lot of additions and changes, no sense to compare with v1


Vladimir Sementsov-Ogievskiy (6):
   block dirty bitmap: add next_zero function
   qmp: add query-block-dirty-bitmap-ranges
   iotests: test query-block-dirty-bitmap-ranges
   qapi: add qmp commands for some dirty bitmap functions
   qapi: make block-dirty-bitmap-create-successor transaction-able
   iotests: test external backup api

  block/dirty-bitmap.c |  60 ++
  blockdev.c   | 113 +
  include/block/dirty-bitmap.h |   9 
  include/qemu/hbitmap.h   |   8 +++
  qapi-schema.json |   4 +-
  qapi/block-core.json |  90 +
  qmp-commands.hx  | 118 +++
  tests/qemu-iotests/150   |  88 
  tests/qemu-iotests/150.out   |  21 
  tests/qemu-iotests/151   |  77 
  tests/qemu-iotests/151.out   |   7 +++
  tests/qemu-iotests/group |   2 +
  util/hbitmap.c   |  26 ++
  13 files changed, 622 insertions(+), 1 deletion(-)
  create mode 100755 tests/qemu-iotests/150
  create mode 100644 tests/qemu-iotests/150.out
  create mode 100755 tests/qemu-iotests/151
  create mode 100644 tests/qemu-iotests/151.out

--
1.8.3.1




--
Best regards,
Vladimir




Re: [Qemu-devel] [PATCH v4] blockjob: Fix hang in block_job_finish_sync

2016-02-03 Thread Stefan Hajnoczi
On Tue, Feb 02, 2016 at 10:12:24AM +0800, Fam Zheng wrote:
> With a mirror job running on a virtio-blk dataplane disk, sending "q" to
> HMP will cause a dead loop in block_job_finish_sync.
> 
> This is because the aio_poll() only processes the AIO context of bs
> which has no more work to do, while the main loop BH that is scheduled
> for setting the job->completed flag is never processed.
> 
> Fix this by adding a flag in BlockJob structure, to track which context
> to poll for the block job to make progress. Its value is set to true
> when block_job_coroutine_complete() is called, and is checked in
> block_job_finish_sync to determine which context to poll.
> 
> Suggested-by: Stefan Hajnoczi 
> Signed-off-by: Fam Zheng 
> ---
>  blockjob.c   | 6 +-
>  include/block/blockjob.h | 5 +
>  2 files changed, 10 insertions(+), 1 deletion(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v2 0/6] external backup api

2016-02-03 Thread Fam Zheng
On Wed, 02/03 13:57, Vladimir Sementsov-Ogievskiy wrote:
> On 03.02.2016 11:14, Fam Zheng wrote:
> >On Sat, 01/30 13:56, Vladimir Sementsov-Ogievskiy wrote:
> >>Hi all.
> >>
> >>These series which aims to add external backup api. This is needed to allow
> >>backup software use our dirty bitmaps.
> >>
> >>Vmware and Parallels Cloud Server have this feature.
> >What is the advantage of this appraoch over "drive-backup sync=incremental
> >..."?
> 
> Hmm, you are asking about advantage of external backup over
> internal? It depends on external backup tool.

I'm asking why the tool can't use drive-backup to achieve what you want. They
seem very comparable.

Fam

> 
> >
> >>There are three things are done:
> >>- add query-block-dirty-bitmap-ranges qmp command
> >>- add qmp commands for dirty-bitmap functions: create_successor, abdicate,
> >>   reclaim.
> >>- make create-successor command transaction-able
> >>
> >>Then, external backup should be done like this:
> >>
> >>1.  qmp transaction {
> >> external-snapshot
> >> bitmap-create-successor
> >> }
> >>
> >>2.  qmp query frozen bitmap, not acquiring aio context.
> >What do you do with the query response, read the snapshot image with an
> >external tool?
> 
> Yes. Alternatively, image fleecing may be used instead of snapshot.
> 
> >
> >Fam
> >
> >>3.  do external backup, using snapshot and bitmap
> >>
> >>
> >>4.  if (success backup)
> >> qmp bitmap-abdicate
> >> else
> >> qmp bitmap-reclaime
> >>
> >>5.  qmp merge snapshot
> >>
> >>
> >>v2: a lot of additions and changes, no sense to compare with v1
> >>
> >>
> >>Vladimir Sementsov-Ogievskiy (6):
> >>   block dirty bitmap: add next_zero function
> >>   qmp: add query-block-dirty-bitmap-ranges
> >>   iotests: test query-block-dirty-bitmap-ranges
> >>   qapi: add qmp commands for some dirty bitmap functions
> >>   qapi: make block-dirty-bitmap-create-successor transaction-able
> >>   iotests: test external backup api
> >>
> >>  block/dirty-bitmap.c |  60 ++
> >>  blockdev.c   | 113 
> >> +
> >>  include/block/dirty-bitmap.h |   9 
> >>  include/qemu/hbitmap.h   |   8 +++
> >>  qapi-schema.json |   4 +-
> >>  qapi/block-core.json |  90 +
> >>  qmp-commands.hx  | 118 
> >> +++
> >>  tests/qemu-iotests/150   |  88 
> >>  tests/qemu-iotests/150.out   |  21 
> >>  tests/qemu-iotests/151   |  77 
> >>  tests/qemu-iotests/151.out   |   7 +++
> >>  tests/qemu-iotests/group |   2 +
> >>  util/hbitmap.c   |  26 ++
> >>  13 files changed, 622 insertions(+), 1 deletion(-)
> >>  create mode 100755 tests/qemu-iotests/150
> >>  create mode 100644 tests/qemu-iotests/150.out
> >>  create mode 100755 tests/qemu-iotests/151
> >>  create mode 100644 tests/qemu-iotests/151.out
> >>
> >>-- 
> >>1.8.3.1
> >>
> 
> 
> -- 
> Best regards,
> Vladimir
> 



Re: [Qemu-devel] [PATCH v16 00/14] vfio-pci: pass the aer error to guest

2016-02-03 Thread Chen Fan


On 01/17/2016 02:34 AM, Michael S. Tsirkin wrote:

On Tue, Jan 12, 2016 at 10:43:01AM +0800, Cao jin wrote:

From: Chen Fan 

For now, for vfio pci passthough devices when qemu receives
an error from host aer report, currentlly just terminate the guest,
but usually user want to know what error occurred but stopping the
guest, so this patches add aer capability support for vfio device,
and pass the error to guest, and have guest driver to recover
from the error.

I would like to see a version of this patchset that doesn't
depend on pci core changes.
I think that if you make this simplifying assumption:

- all devices on same bus in guest are on same bus in host

then you can handle both reset and hotplug simply in function 0
since it will belong to vfio.

So we can have a version without pci core changes that simply assumes
this, and things will just work.


Now, if we wanted to enforce this limitation, I think the
cleanest way would be to add a callback in struct PCIDevice:

bool is_valid_function(PCIDevice *newfunction)

and call it as each function is added.
This way aer function can validate that each function
added shares the same bus.
And this way issues will be detected directly and not when
function 0 is added.

I would prefer this validation code to be a patch on top so we can merge
the functionality directly and avoid blocking it while we figure out the
best api to validate things.

I don't see why making guest topology match host would
ever be a problem, but if it's required to support
configurations where these differ, I'd like to see
an attempt to address that be split out, after aer
is supported.

Hi Michael,

Just think about this more,  I think we also should check the vfio
devices whether on the same bus at the time of function 0 is added.
because we don't know the affected devices by a bus reset have
already all been assigned to VM. for example, the multi-function's hotplug.
devices on same bus in host are added to VM one by one. when we
test one device, we haven't yet added the other devices. so I think
the patch should like below. then we could add a vfio_is_valid_function 
in vfio

to test each device whether the affected devices on the same bus.

Thanks,
Chen

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index d940f79..7163b56 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1836,6 +1836,38 @@ PCIDevice *pci_find_device(PCIBus *bus, int 
bus_num, uint8_t devfn)

 return bus->devices[devfn];
 }

+static int pci_bus_check_devices(PCIBus *bus)
+{
+PCIDeviceClass *pc;
+int i, ret = 0;
+
+for (i = 0; i < ARRAY_SIZE(bus->devices); ++i) {
+if (!bus->devices[i]) {
+continue;
+}
+
+pc = PCI_DEVICE_GET_CLASS(bus->devices[i]);
+if (!pc->is_valid_func) {
+continue;
+}
+
+ret = pc->is_valid_func(bus->devices[i], bus);
+if (!ret) {
+return -1;
+}
+}
+return 0;
+}
+
+static bool pci_is_valid_function(PCIDevice *pdev, PCIBus *bus)
+{
+if (pdev->bus == bus) {
+return true;
+}
+
+return false;
+}
+
 static void pci_qdev_realize(DeviceState *qdev, Error **errp)
 {
 PCIDevice *pci_dev = (PCIDevice *)qdev;
@@ -1878,6 +1910,14 @@ static void pci_qdev_realize(DeviceState *qdev, 
Error **errp)

 pci_qdev_unrealize(DEVICE(pci_dev), NULL);
 return;
 }
+
+if (DEVICE(pci_dev)->hotplugged &&
+pci_get_function_0(pci_dev) == pci_dev &&
+pci_bus_check_devices(bus)) {
+error_setg(errp, "failed to hotplug function 0");
+pci_qdev_unrealize(DEVICE(pci_dev), NULL);
+return;
+}
 }

 static void pci_default_realize(PCIDevice *dev, Error **errp)
@@ -2390,6 +2430,7 @@ static void pci_device_class_init(ObjectClass 
*klass, void *data)

 k->bus_type = TYPE_PCI_BUS;
 k->props = pci_props;
 pc->realize = pci_default_realize;
+pc->is_valid_func = pci_is_valid_function;
 }

 AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index dedf277..a89580f 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -191,6 +191,7 @@ typedef struct PCIDeviceClass {

 void (*realize)(PCIDevice *dev, Error **errp);
 int (*init)(PCIDevice *dev);/* TODO convert to realize() and remove */
+bool (*is_valid_func)(PCIDevice *dev, PCIBus *bus);
 PCIUnregisterFunc *exit;
 PCIConfigReadFunc *config_read;
 PCIConfigWriteFunc *config_write;








v15-v16:
10/14, 11/14 are new to introduce a reset sequence id to specify the
vfio devices has been reset for that reset. other patches aren't modified.

v14-v15:
1. add device hot reset callback
2. add bus_in_reset for vfio device to avoid multi do host bus reset

v13-v14:
1. for multifunction device, requiring all functions enable AER.(9/13)
2. due to all affected functions receive error signal, ignore no
   error occurred function. (12/13)

v12-v13:

Re: [Qemu-devel] [SeaBIOS] [RFC PATCH v2] fw/pci: Add support for mapping Intel IGD OpRegion via QEMU

2016-02-03 Thread Gerd Hoffmann
  Hi,

> +static void intel_igd_opregion_setup(struct pci_device *dev, void *arg)
> +{
> +struct romfile_s *file = romfile_find("etc/igd-opregion");

Is it possible to have multiple igd devices in a single machine?
So, should we include the pci address in the file name?

Guess not needed, it's chipset graphics after all ...

> +pci_config_writel(bdf, 0xFC, cpu_to_le32((u32)opregion));

Looks a bit funny in code which is never ever going to run on big endian
machines ;)

Patch looks good to me.

cheers,
  Gerd




Re: [Qemu-devel] [PULL 00/17] Net patches

2016-02-03 Thread Samuel Thibault
Jason Wang, on Wed 03 Feb 2016 12:52:24 +0800, wrote:
> > Hi. I'm afraid this failed to build on w32:
> >
> > declaration specifiers or ‘...’ before ‘sa_family_t’
> 
> - switch to use unsigned short
> - or typedef sa_family_t as unsigned short for windows

Doing the second runs the risk of windows eventually catching up with
what posix requires. Using unsigned short instead should be fine.

Samuel



[Qemu-devel] [PULL 0/6] virtio-gpu: bugfixes and spice support preparation

2016-02-03 Thread Gerd Hoffmann
  Hi,

Minor fixes and some preparing work in console
and virtio-gpu for spice support.

please pull,
  Gerd

The following changes since commit c65db7705b7926f4a084b93778e4bd5dd3990aad:

  Merge remote-tracking branch 
'remotes/maxreitz/tags/pull-block-for-peter-2016-02-02' into staging 
(2016-02-02 18:04:04 +)

are available in the git repository at:


  git://git.kraxel.org/qemu tags/pull-vga-20160203-1

for you to fetch changes up to 321c9adba5a64a1a9de2dd7db5433b62a5433439:

  virtio-gpu: block any rendering until client (ui) is done (2016-02-03 
10:41:36 +0100)


virtio-gpu: bugfixes and spice support preparation


Gerd Hoffmann (6):
  zap qemu_egl_has_ext in include/ui/egl-helpers.h
  console: block rendering until client is done
  virtio-gpu: fix memory leak in error path
  virtio-gpu: maintain command queue
  virtio-gpu: add support to enable/disable command processing
  virtio-gpu: block any rendering until client (ui) is done

 hw/display/virtio-gpu-3d.c | 11 --
 hw/display/virtio-gpu.c| 77 ++
 hw/display/virtio-vga.c| 10 ++
 include/hw/virtio/virtio-gpu.h |  4 +++
 include/ui/console.h   |  2 ++
 include/ui/egl-helpers.h   |  1 -
 ui/console.c   | 10 ++
 7 files changed, 90 insertions(+), 25 deletions(-)



[Qemu-devel] [PULL 1/6] zap qemu_egl_has_ext in include/ui/egl-helpers.h

2016-02-03 Thread Gerd Hoffmann
Drop leftover prototype which sneaked in by mistake

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Marc-André Lureau 
---
 include/ui/egl-helpers.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/ui/egl-helpers.h b/include/ui/egl-helpers.h
index 5ad5dc3..8c84398 100644
--- a/include/ui/egl-helpers.h
+++ b/include/ui/egl-helpers.h
@@ -11,6 +11,5 @@ EGLSurface qemu_egl_init_surface_x11(EGLContext ectx, Window 
win);
 
 int qemu_egl_init_dpy(EGLNativeDisplayType dpy, bool gles, bool debug);
 EGLContext qemu_egl_init_ctx(void);
-bool qemu_egl_has_ext(const char *haystack, const char *needle);
 
 #endif /* EGL_HELPERS_H */
-- 
1.8.3.1




[Qemu-devel] [PULL 4/6] virtio-gpu: maintain command queue

2016-02-03 Thread Gerd Hoffmann
We'll go take out the commands we receive out of the virt queue and put
them into a linked list, to decouple virtio queue handling from actual
command processing.

Also move cmd processing to new virtio_gpu_handle_ctrl func, so we can
easily kick it from different places.

Signed-off-by: Gerd Hoffmann 
---
 hw/display/virtio-gpu.c| 63 +++---
 include/hw/virtio/virtio-gpu.h |  1 +
 2 files changed, 42 insertions(+), 22 deletions(-)

diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 3c96f8e..a2ec7cb 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -755,33 +755,21 @@ static void virtio_gpu_handle_cursor_cb(VirtIODevice 
*vdev, VirtQueue *vq)
 qemu_bh_schedule(g->cursor_bh);
 }
 
-static void virtio_gpu_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
+static void virtio_gpu_process_cmdq(VirtIOGPU *g)
 {
-VirtIOGPU *g = VIRTIO_GPU(vdev);
 struct virtio_gpu_ctrl_command *cmd;
 
-if (!virtio_queue_ready(vq)) {
-return;
-}
-
-#ifdef CONFIG_VIRGL
-if (!g->renderer_inited && g->use_virgl_renderer) {
-virtio_gpu_virgl_init(g);
-g->renderer_inited = true;
-}
-#endif
-
-cmd = g_new(struct virtio_gpu_ctrl_command, 1);
-while (virtqueue_pop(vq, &cmd->elem)) {
-cmd->vq = vq;
-cmd->error = 0;
-cmd->finished = false;
-if (virtio_gpu_stats_enabled(g->conf)) {
-g->stats.requests++;
-}
+while (!QTAILQ_EMPTY(&g->cmdq)) {
+cmd = QTAILQ_FIRST(&g->cmdq);
 
+/* process command */
 VIRGL(g, virtio_gpu_virgl_process_cmd, virtio_gpu_simple_process_cmd,
   g, cmd);
+QTAILQ_REMOVE(&g->cmdq, cmd, next);
+if (virtio_gpu_stats_enabled(g->conf)) {
+g->stats.requests++;
+}
+
 if (!cmd->finished) {
 QTAILQ_INSERT_TAIL(&g->fenceq, cmd, next);
 g->inflight++;
@@ -791,11 +779,41 @@ static void virtio_gpu_handle_ctrl(VirtIODevice *vdev, 
VirtQueue *vq)
 }
 fprintf(stderr, "inflight: %3d (+)\r", g->inflight);
 }
-cmd = g_new(struct virtio_gpu_ctrl_command, 1);
+} else {
+g_free(cmd);
 }
 }
+}
+
+static void virtio_gpu_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
+{
+VirtIOGPU *g = VIRTIO_GPU(vdev);
+struct virtio_gpu_ctrl_command *cmd;
+
+if (!virtio_queue_ready(vq)) {
+return;
+}
+
+#ifdef CONFIG_VIRGL
+if (!g->renderer_inited && g->use_virgl_renderer) {
+virtio_gpu_virgl_init(g);
+g->renderer_inited = true;
+}
+#endif
+
+cmd = g_new(struct virtio_gpu_ctrl_command, 1);
+while (virtqueue_pop(vq, &cmd->elem)) {
+cmd->vq = vq;
+cmd->error = 0;
+cmd->finished = false;
+cmd->waiting = false;
+QTAILQ_INSERT_TAIL(&g->cmdq, cmd, next);
+cmd = g_new(struct virtio_gpu_ctrl_command, 1);
+}
 g_free(cmd);
 
+virtio_gpu_process_cmdq(g);
+
 #ifdef CONFIG_VIRGL
 if (g->use_virgl_renderer) {
 virtio_gpu_virgl_fence_poll(g);
@@ -921,6 +939,7 @@ static void virtio_gpu_device_realize(DeviceState *qdev, 
Error **errp)
 g->ctrl_bh = qemu_bh_new(virtio_gpu_ctrl_bh, g);
 g->cursor_bh = qemu_bh_new(virtio_gpu_cursor_bh, g);
 QTAILQ_INIT(&g->reslist);
+QTAILQ_INIT(&g->cmdq);
 QTAILQ_INIT(&g->fenceq);
 
 g->enabled_output_bitmask = 1;
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index 9b279d7..f7e7a52 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -94,6 +94,7 @@ typedef struct VirtIOGPU {
 DeviceState *qdev;
 
 QTAILQ_HEAD(, virtio_gpu_simple_resource) reslist;
+QTAILQ_HEAD(, virtio_gpu_ctrl_command) cmdq;
 QTAILQ_HEAD(, virtio_gpu_ctrl_command) fenceq;
 
 struct virtio_gpu_scanout scanout[VIRTIO_GPU_MAX_SCANOUT];
-- 
1.8.3.1




[Qemu-devel] [PULL 3/6] virtio-gpu: fix memory leak in error path

2016-02-03 Thread Gerd Hoffmann
Found by Coverity Scan, buf not freed on error.

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Marc-André Lureau 
---
 hw/display/virtio-gpu-3d.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/display/virtio-gpu-3d.c b/hw/display/virtio-gpu-3d.c
index 59581a4..e13122d 100644
--- a/hw/display/virtio-gpu-3d.c
+++ b/hw/display/virtio-gpu-3d.c
@@ -198,7 +198,7 @@ static void virgl_cmd_submit_3d(VirtIOGPU *g,
 qemu_log_mask(LOG_GUEST_ERROR, "%s: size mismatch (%zd/%d)",
   __func__, s, cs.size);
 cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_PARAMETER;
-return;
+goto out;
 }
 
 if (virtio_gpu_stats_enabled(g->conf)) {
@@ -208,6 +208,7 @@ static void virgl_cmd_submit_3d(VirtIOGPU *g,
 
 virgl_renderer_submit_cmd(buf, cs.hdr.ctx_id, cs.size / 4);
 
+out:
 g_free(buf);
 }
 
-- 
1.8.3.1




[Qemu-devel] [PULL 6/6] virtio-gpu: block any rendering until client (ui) is done

2016-02-03 Thread Gerd Hoffmann
Wire up gl_block callback, so ui code can request to stop
virtio-gpu rendering.

Signed-off-by: Gerd Hoffmann 
---
 hw/display/virtio-gpu-3d.c |  5 +
 hw/display/virtio-gpu.c| 11 +++
 hw/display/virtio-vga.c| 10 ++
 include/hw/virtio/virtio-gpu.h |  1 +
 4 files changed, 27 insertions(+)

diff --git a/hw/display/virtio-gpu-3d.c b/hw/display/virtio-gpu-3d.c
index 6f646b1..fa19294 100644
--- a/hw/display/virtio-gpu-3d.c
+++ b/hw/display/virtio-gpu-3d.c
@@ -383,6 +383,11 @@ void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
 {
 VIRTIO_GPU_FILL_CMD(cmd->cmd_hdr);
 
+cmd->waiting = g->renderer_blocked;
+if (cmd->waiting) {
+return;
+}
+
 virgl_renderer_force_ctx_0();
 switch (cmd->cmd_hdr.type) {
 case VIRTIO_GPU_CMD_CTX_CREATE:
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index af9b757..1cb4002 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -897,11 +897,22 @@ static int virtio_gpu_ui_info(void *opaque, uint32_t idx, 
QemuUIInfo *info)
 return 0;
 }
 
+static void virtio_gpu_gl_block(void *opaque, bool block)
+{
+VirtIOGPU *g = opaque;
+
+g->renderer_blocked = block;
+if (!block) {
+virtio_gpu_process_cmdq(g);
+}
+}
+
 const GraphicHwOps virtio_gpu_ops = {
 .invalidate = virtio_gpu_invalidate_display,
 .gfx_update = virtio_gpu_update_display,
 .text_update = virtio_gpu_text_update,
 .ui_info = virtio_gpu_ui_info,
+.gl_block = virtio_gpu_gl_block,
 };
 
 static void virtio_gpu_device_realize(DeviceState *qdev, Error **errp)
diff --git a/hw/display/virtio-vga.c b/hw/display/virtio-vga.c
index 249dbc0..e58b165 100644
--- a/hw/display/virtio-vga.c
+++ b/hw/display/virtio-vga.c
@@ -66,11 +66,21 @@ static int virtio_vga_ui_info(void *opaque, uint32_t idx, 
QemuUIInfo *info)
 return -1;
 }
 
+static void virtio_vga_gl_block(void *opaque, bool block)
+{
+VirtIOVGA *vvga = opaque;
+
+if (virtio_gpu_ops.gl_block) {
+virtio_gpu_ops.gl_block(&vvga->vdev, block);
+}
+}
+
 static const GraphicHwOps virtio_vga_ops = {
 .invalidate = virtio_vga_invalidate_display,
 .gfx_update = virtio_vga_update_display,
 .text_update = virtio_vga_text_update,
 .ui_info = virtio_vga_ui_info,
+.gl_block = virtio_vga_gl_block,
 };
 
 /* VGA device wrapper around PCI device around virtio GPU */
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index f6cae0b..13b0ab0 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -107,6 +107,7 @@ typedef struct VirtIOGPU {
 
 bool use_virgl_renderer;
 bool renderer_inited;
+bool renderer_blocked;
 QEMUTimer *fence_poll;
 QEMUTimer *print_stats;
 
-- 
1.8.3.1




[Qemu-devel] [PULL 2/6] console: block rendering until client is done

2016-02-03 Thread Gerd Hoffmann
Allow gl user interfaces to block display device gl rendering.
The ui code might want to do that in case it takes a little
longer to bring things to screen, for example because we'll
hand over a dma-buf to another process (spice will do that).

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Marc-André Lureau 
---
 include/ui/console.h |  2 ++
 ui/console.c | 10 ++
 2 files changed, 12 insertions(+)

diff --git a/include/ui/console.h b/include/ui/console.h
index adac36d..12ad627 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -362,6 +362,7 @@ typedef struct GraphicHwOps {
 void (*text_update)(void *opaque, console_ch_t *text);
 void (*update_interval)(void *opaque, uint64_t interval);
 int (*ui_info)(void *opaque, uint32_t head, QemuUIInfo *info);
+void (*gl_block)(void *opaque, bool block);
 } GraphicHwOps;
 
 QemuConsole *graphic_console_init(DeviceState *dev, uint32_t head,
@@ -374,6 +375,7 @@ void graphic_console_set_hwops(QemuConsole *con,
 void graphic_hw_update(QemuConsole *con);
 void graphic_hw_invalidate(QemuConsole *con);
 void graphic_hw_text_update(QemuConsole *con, console_ch_t *chardata);
+void graphic_hw_gl_block(QemuConsole *con, bool block);
 
 QemuConsole *qemu_console_lookup_by_index(unsigned int index);
 QemuConsole *qemu_console_lookup_by_device(DeviceState *dev, uint32_t head);
diff --git a/ui/console.c b/ui/console.c
index fe950c6..791b4fc 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -261,6 +261,16 @@ void graphic_hw_update(QemuConsole *con)
 }
 }
 
+void graphic_hw_gl_block(QemuConsole *con, bool block)
+{
+if (!con) {
+con = active_console;
+}
+if (con && con->hw_ops->gl_block) {
+con->hw_ops->gl_block(con->hw, block);
+}
+}
+
 void graphic_hw_invalidate(QemuConsole *con)
 {
 if (!con) {
-- 
1.8.3.1




[Qemu-devel] [PULL 5/6] virtio-gpu: add support to enable/disable command processing

2016-02-03 Thread Gerd Hoffmann
So we can stop rendering for a while in case we have to.

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Marc-André Lureau 
---
 hw/display/virtio-gpu-3d.c | 3 ++-
 hw/display/virtio-gpu.c| 5 -
 include/hw/virtio/virtio-gpu.h | 2 ++
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/hw/display/virtio-gpu-3d.c b/hw/display/virtio-gpu-3d.c
index e13122d..6f646b1 100644
--- a/hw/display/virtio-gpu-3d.c
+++ b/hw/display/virtio-gpu-3d.c
@@ -554,7 +554,8 @@ static void virtio_gpu_fence_poll(void *opaque)
 VirtIOGPU *g = opaque;
 
 virgl_renderer_poll();
-if (g->inflight) {
+virtio_gpu_process_cmdq(g);
+if (!QTAILQ_EMPTY(&g->cmdq) || !QTAILQ_EMPTY(&g->fenceq)) {
 timer_mod(g->fence_poll, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 10);
 }
 }
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index a2ec7cb..af9b757 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -755,7 +755,7 @@ static void virtio_gpu_handle_cursor_cb(VirtIODevice *vdev, 
VirtQueue *vq)
 qemu_bh_schedule(g->cursor_bh);
 }
 
-static void virtio_gpu_process_cmdq(VirtIOGPU *g)
+void virtio_gpu_process_cmdq(VirtIOGPU *g)
 {
 struct virtio_gpu_ctrl_command *cmd;
 
@@ -765,6 +765,9 @@ static void virtio_gpu_process_cmdq(VirtIOGPU *g)
 /* process command */
 VIRGL(g, virtio_gpu_virgl_process_cmd, virtio_gpu_simple_process_cmd,
   g, cmd);
+if (cmd->waiting) {
+break;
+}
 QTAILQ_REMOVE(&g->cmdq, cmd, next);
 if (virtio_gpu_stats_enabled(g->conf)) {
 g->stats.requests++;
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index f7e7a52..f6cae0b 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -76,6 +76,7 @@ struct virtio_gpu_ctrl_command {
 VirtQueue *vq;
 struct virtio_gpu_ctrl_hdr cmd_hdr;
 uint32_t error;
+bool waiting;
 bool finished;
 QTAILQ_ENTRY(virtio_gpu_ctrl_command) next;
 };
@@ -152,6 +153,7 @@ int virtio_gpu_create_mapping_iov(struct 
virtio_gpu_resource_attach_backing *ab,
   struct virtio_gpu_ctrl_command *cmd,
   struct iovec **iov);
 void virtio_gpu_cleanup_mapping_iov(struct iovec *iov, uint32_t count);
+void virtio_gpu_process_cmdq(VirtIOGPU *g);
 
 /* virtio-gpu-3d.c */
 void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v2 0/6] external backup api

2016-02-03 Thread Vladimir Sementsov-Ogievskiy

On 03.02.2016 14:02, Fam Zheng wrote:

On Wed, 02/03 13:57, Vladimir Sementsov-Ogievskiy wrote:

On 03.02.2016 11:14, Fam Zheng wrote:

On Sat, 01/30 13:56, Vladimir Sementsov-Ogievskiy wrote:

Hi all.

These series which aims to add external backup api. This is needed to allow
backup software use our dirty bitmaps.

Vmware and Parallels Cloud Server have this feature.

What is the advantage of this appraoch over "drive-backup sync=incremental
..."?

Hmm, you are asking about advantage of external backup over
internal? It depends on external backup tool.

I'm asking why the tool can't use drive-backup to achieve what you want. They
seem very comparable.


drive-backup has limited range of maintained by qemu targets.. Third 
party tool uses its own internal targets. If we use drvie-backup there 
will be two copies instead of one: from qemu to drive-backup produced 
backup and then to internal target in third-party tool.




Fam


There are three things are done:
- add query-block-dirty-bitmap-ranges qmp command
- add qmp commands for dirty-bitmap functions: create_successor, abdicate,
   reclaim.
- make create-successor command transaction-able

Then, external backup should be done like this:

1.  qmp transaction {
 external-snapshot
 bitmap-create-successor
 }

2.  qmp query frozen bitmap, not acquiring aio context.

What do you do with the query response, read the snapshot image with an
external tool?

Yes. Alternatively, image fleecing may be used instead of snapshot.


Fam


3.  do external backup, using snapshot and bitmap


4.  if (success backup)
 qmp bitmap-abdicate
 else
 qmp bitmap-reclaime

5.  qmp merge snapshot


v2: a lot of additions and changes, no sense to compare with v1


Vladimir Sementsov-Ogievskiy (6):
   block dirty bitmap: add next_zero function
   qmp: add query-block-dirty-bitmap-ranges
   iotests: test query-block-dirty-bitmap-ranges
   qapi: add qmp commands for some dirty bitmap functions
   qapi: make block-dirty-bitmap-create-successor transaction-able
   iotests: test external backup api

  block/dirty-bitmap.c |  60 ++
  blockdev.c   | 113 +
  include/block/dirty-bitmap.h |   9 
  include/qemu/hbitmap.h   |   8 +++
  qapi-schema.json |   4 +-
  qapi/block-core.json |  90 +
  qmp-commands.hx  | 118 +++
  tests/qemu-iotests/150   |  88 
  tests/qemu-iotests/150.out   |  21 
  tests/qemu-iotests/151   |  77 
  tests/qemu-iotests/151.out   |   7 +++
  tests/qemu-iotests/group |   2 +
  util/hbitmap.c   |  26 ++
  13 files changed, 622 insertions(+), 1 deletion(-)
  create mode 100755 tests/qemu-iotests/150
  create mode 100644 tests/qemu-iotests/150.out
  create mode 100755 tests/qemu-iotests/151
  create mode 100644 tests/qemu-iotests/151.out

--
1.8.3.1



--
Best regards,
Vladimir




--
Best regards,
Vladimir




Re: [Qemu-devel] [PATCH v1 09/22] migration: convert unix socket protocol to use QIOChannel

2016-02-03 Thread Daniel P. Berrange
On Tue, Feb 02, 2016 at 06:02:13PM +, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > Convert the unix socket migration protocol driver to use
> > QIOChannel and QEMUFileChannel, instead of plain sockets
> > APIs. It can be unconditionally built, since the socket
> > impl of QIOChannel will report a suitable error on platforms
> > where UNIX sockets are unavailable.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  migration/Makefile.objs |   4 +-
> >  migration/migration.c   |   4 ++
> >  migration/unix.c| 103 
> > +++-
> >  3 files changed, 72 insertions(+), 39 deletions(-)
> > 
> > diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> > index b357e2f..a5f8a03 100644
> > --- a/migration/Makefile.objs
> > +++ b/migration/Makefile.objs
> > @@ -1,11 +1,11 @@
> > -common-obj-y += migration.o tcp.o
> > +common-obj-y += migration.o tcp.o unix.o
> >  common-obj-y += vmstate.o
> >  common-obj-y += qemu-file.o qemu-file-buf.o qemu-file-unix.o 
> > qemu-file-stdio.o
> >  common-obj-y += qemu-file-channel.o
> >  common-obj-y += xbzrle.o postcopy-ram.o
> >  
> >  common-obj-$(CONFIG_RDMA) += rdma.o
> > -common-obj-$(CONFIG_POSIX) += exec.o unix.o fd.o
> > +common-obj-$(CONFIG_POSIX) += exec.o fd.o
> >  
> >  common-obj-y += block.o
> >  
> > diff --git a/migration/migration.c b/migration/migration.c
> > index e921b20..1c5f12e 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -312,8 +312,10 @@ void qemu_start_incoming_migration(const char *uri, 
> > Error **errp)
> >  #if !defined(WIN32)
> >  } else if (strstart(uri, "exec:", &p)) {
> >  exec_start_incoming_migration(p, errp);
> > +#endif
> >  } else if (strstart(uri, "unix:", &p)) {
> >  unix_start_incoming_migration(p, errp);
> > +#if !defined(WIN32)
> >  } else if (strstart(uri, "fd:", &p)) {
> >  fd_start_incoming_migration(p, errp);
> >  #endif
> > @@ -1017,8 +1019,10 @@ void qmp_migrate(const char *uri, bool has_blk, bool 
> > blk,
> >  #if !defined(WIN32)
> >  } else if (strstart(uri, "exec:", &p)) {
> >  exec_start_outgoing_migration(s, p, &local_err);
> > +#endif
> >  } else if (strstart(uri, "unix:", &p)) {
> >  unix_start_outgoing_migration(s, p, &local_err);
> > +#if !defined(WIN32)
> >  } else if (strstart(uri, "fd:", &p)) {
> >  fd_start_outgoing_migration(s, p, &local_err);
> >  #endif
> > diff --git a/migration/unix.c b/migration/unix.c
> > index b591813..4674640 100644
> > --- a/migration/unix.c
> > +++ b/migration/unix.c
> > @@ -1,10 +1,11 @@
> >  /*
> >   * QEMU live migration via Unix Domain Sockets
> >   *
> > - * Copyright Red Hat, Inc. 2009
> > + * Copyright Red Hat, Inc. 2009-2015
> 
> year++ ?
> 
> >   *
> >   * Authors:
> >   *  Chris Lalancette 
> > + *  Daniel P. Berrange 
> >   *
> >   * This work is licensed under the terms of the GNU GPL, version 2.  See
> >   * the COPYING file in the top-level directory.
> > @@ -17,11 +18,9 @@
> >  
> >  #include "qemu-common.h"
> >  #include "qemu/error-report.h"
> > -#include "qemu/sockets.h"
> > -#include "qemu/main-loop.h"
> >  #include "migration/migration.h"
> >  #include "migration/qemu-file.h"
> > -#include "block/block.h"
> > +#include "io/channel-socket.h"
> >  
> >  //#define DEBUG_MIGRATION_UNIX
> >  
> > @@ -33,71 +32,101 @@
> >  do { } while (0)
> >  #endif
> >  
> > -static void unix_wait_for_connect(int fd, Error *err, void *opaque)
> > +
> > +static SocketAddress *unix_build_address(const char *path)
> > +{
> > +SocketAddress *saddr;
> > +
> > +saddr = g_new0(SocketAddress, 1);
> > +saddr->type = SOCKET_ADDRESS_KIND_UNIX;
> > +saddr->u.q_unix = g_new0(UnixSocketAddress, 1);
> > +saddr->u.q_unix->path = g_strdup(path);
> > +
> > +return saddr;
> > +}
> > +
> > +
> > +static void unix_outgoing_migration(Object *src,
> > +Error *err,
> > +gpointer opaque)
> >  {
> >  MigrationState *s = opaque;
> > +QIOChannel *sioc = QIO_CHANNEL(src);
> >  
> > -if (fd < 0) {
> > +if (err) {
> >  DPRINTF("migrate connect error: %s\n", error_get_pretty(err));
> >  s->file = NULL;
> >  migrate_fd_error(s);
> >  } else {
> >  DPRINTF("migrate connect success\n");
> > -s->file = qemu_fopen_socket(fd, "wb");
> > +s->file = qemu_fopen_channel_output(sioc);
> >  migrate_fd_connect(s);
> >  }
> > +object_unref(src);
> >  }
> >  
> > +
> >  void unix_start_outgoing_migration(MigrationState *s, const char *path, 
> > Error **errp)
> >  {
> > -unix_nonblocking_connect(path, unix_wait_for_connect, s, errp);
> > +SocketAddress *saddr = unix_build_address(path);
> > +QIOChannelSocket *sioc;
> > +sioc = qio_channel_socket_new();
> > +qio_channel_socket_connect_async(sioc,
> > + 

Re: [Qemu-devel] [PATCH v14 7/8] Implement new driver for block replication

2016-02-03 Thread Hailiang Zhang

On 2016/2/3 17:55, Wen Congyang wrote:

On 02/03/2016 05:32 PM, Stefan Hajnoczi wrote:

On Wed, Feb 03, 2016 at 09:29:15AM +0800, Wen Congyang wrote:

On 02/02/2016 10:34 PM, Stefan Hajnoczi wrote:

On Mon, Feb 01, 2016 at 09:13:36AM +0800, Wen Congyang wrote:

On 01/29/2016 11:46 PM, Stefan Hajnoczi wrote:

On Fri, Jan 29, 2016 at 11:13:42AM +0800, Changlong Xie wrote:

On 01/28/2016 11:15 PM, Stefan Hajnoczi wrote:

On Thu, Jan 28, 2016 at 09:13:24AM +0800, Wen Congyang wrote:

On 01/27/2016 10:46 PM, Stefan Hajnoczi wrote:

On Wed, Jan 13, 2016 at 05:18:31PM +0800, Changlong Xie wrote:

I'm concerned that the bdrv_drain_all() in vm_stop() can take a long
time if the disk is slow/failing.  bdrv_drain_all() blocks until all
in-flight I/O requests have completed.  What does the Primary do if the
Secondary becomes unresponsive?


Actually, we knew this problem. But currently, there seems no better way to
resolve it. If you have any ideas?


Is it possible to hold the checkpoint information and acknowledge the
checkpoint right away, without waiting for bdrv_drain_all() or any
Secondory guest activity to complete?


There is no way to know that secondary becomes unreponsive.


I meant whether it is necessary for the Secondary to vm_stop() and apply
the checkpoint before acknowledging the checkpoint to the Primary?


I don't understand this.
Here is the COLO checkpoint flow:

 PrimarySecondary
 new checkpoint notice --->
 vm_stop()  vm_stop()
 vm state(device state, memory, cpu)   --->
load state
   <--- done
 vm_start() vm_start()


If the Secondary's vm_stop() call blocks then the Primary is stuck too.

I was wondering whether the Secondary can do:

<---  done
   vm_stop()
   load state

It simply receives the checkpoint data into a buffer and immediately
replies with "done".  vm_stop() and load state is only performed after
sending "done".


Secondary vm is running, so we should also get the pages that are dirtied
by secondary vm, but not dirtied by primary vm.
We have two ways to do it:
1. Cache all original memory in the secondary qemu
2. Send the dirty pfn list to primary qemu, and get it.

If we ack the checkpoint and the call vm_stop(), we only can select 1. It
means that secondary qemu costs more memory.
In COLO mode, we will compare the output socket, and will do checkpoint if
the application level data is different. If we ack the checkpoint and the
call vm_stop(), the client can not get any more data until secondary vm
is running again. So we still 'wait' the secondary vm.




The advantage is that the Primary will not be delayed by the Secondary.
It's an approach that doesn't block.

But perhaps it's a problem if the Secondary is slower than the Primary
since the Secondary still needs to complete vm_stop() and load state
before it can resume execution?


I think this really means falling back to microcheckpointing until the
Secondary guest can checkpoint.  Instead of a blocking vm_stop() we
would prevent vcpus from running and when the last pending I/O finishes
the Secondary could apply the last checkpoint.  This approach does not
block QEMU (the monitor, etc).



If secondary host becomes unresponsive, it means that we cannot do 
mocrocheckpointing.
We should do failover in this case.


This is dangerous because it means that a delay/failure in the Secondary
would cause the Primary to fail over to the broken Secondary.  All the
more reason not to perform blocking operations on the Secondary in the
checkpoint code path.


If the secondary is broken, primary qemu will take over.


Does the Primary use a timeout between "new checkpoint notice" and
Secondary's "done" so it can move on if the Secondary is unresponsive?


To hailiang:
IIRC, we don't use a timeout but I think we can do it. In our design, there is


Yes, we may need a timeout to help detecting the unresponsive case
which can not be caught by the external heartbeat module.
I will investigate it.

Thanks,
Hailiang


an exteranl heartbeat to check primary and secondary status, and decide when
to do checkpoint.

Thanks
Wen Congyang




Stefan






.







Re: [Qemu-devel] [PATCH v1 13/22] migration: convert RDMA to use QIOChannel interface

2016-02-03 Thread Daniel P. Berrange
On Tue, Feb 02, 2016 at 08:01:36PM +, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > This converts the RDMA code to provide a subclass of
> > QIOChannel that uses RDMA for the data transport.
> > 
> > The RDMA code would be much better off it it could
> > be split up in a generic RDMA layer, a QIOChannel
> > impl based on RMDA, and then the RMDA migration
> > glue. This is left as a future exercise for the brave.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  migration/rdma.c | 260 
> > ++-
> >  1 file changed, 161 insertions(+), 99 deletions(-)
> > 

> > +static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
> > +  const struct iovec *iov,
> > +  size_t niov,
> > +  int **fds,
> > +  size_t *nfds,
> > +  Error **errp)
> > +{
> > +QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
> > +RDMAContext *rdma = rioc->rdma;
> >  RDMAControlHeader head;
> >  int ret = 0;
> > +ssize_t i;
> > +size_t done = 0;
> >  
> >  CHECK_ERROR_STATE();
> >  
> > -/*
> > - * First, we hold on to the last SEND message we
> > - * were given and dish out the bytes until we run
> > - * out of bytes.
> > - */
> > -r->len = qemu_rdma_fill(r->rdma, buf, size, 0);
> > -if (r->len) {
> > -return r->len;
> > -}
> > +for (i = 0; i < niov; i++) {
> > +size_t want = iov[i].iov_len;
> > +uint8_t *data = (void *)iov[i].iov_base;
> >  
> > -/*
> > - * Once we run out, we block and wait for another
> > - * SEND message to arrive.
> > - */
> > -ret = qemu_rdma_exchange_recv(rdma, &head, RDMA_CONTROL_QEMU_FILE);
> > +/*
> > + * First, we hold on to the last SEND message we
> > + * were given and dish out the bytes until we run
> > + * out of bytes.
> > + */
> > +ret = qemu_rdma_fill(rioc->rdma, data, want, 0);
> > +if (ret > 0) {
> > +done += ret;
> > +if (ret < want) {
> > +break;
> > +} else {
> > +continue;
> > +}
> 
> > +}
> >  
> > -if (ret < 0) {
> > -rdma->error_state = ret;
> > -return ret;
> > -}
> > +/*
> > + * Once we run out, we block and wait for another
> > + * SEND message to arrive.
> > + */
> > +ret = qemu_rdma_exchange_recv(rdma, &head, RDMA_CONTROL_QEMU_FILE);
> >  
> > -/*
> > - * SEND was received with new bytes, now try again.
> > - */
> > -return qemu_rdma_fill(r->rdma, buf, size, 0);
> > +if (ret < 0) {
> > +rdma->error_state = ret;
> > +return ret;
> > +}
> > +
> > +/*
> > + * SEND was received with new bytes, now try again.
> > + */
> > +ret = qemu_rdma_fill(rioc->rdma, data, want, 0);
> > +if (ret > 0) {
> > +done += ret;
> > +if (ret < want) {
> > +break;
> > +}
> > +}
> 
> I don't quite understand the behaviour of this loop.
> If rdma_fill returns less than you wanted for the first iov we break.
> If it returns 0 then we try and get some more.
> The weird thing to me is if we have two iov entries; if the
> amount returned by the qemu_rdma_fill happens to match the size of
> the 1st iov then I think we end up doing the exchange_recv and
> waiting for more.  Is that what we want? Why?

No, it isn't quite what we want. If we have successfully received
some data in a preceeding iov, then we shouldn't wait for more data
for any following iov. I'll rework this bit of code to work better

In fact technically, we should not block for data at all when the
channel is in non-blocking mode. Fixing that would require some
refactoring of qemu_rdma_block_for_wrid() so that we could tell
it to only look for an already completed work request and not
block

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v4] Add optionrom compatible with fw_cfg DMA version

2016-02-03 Thread Laszlo Ersek
On 02/03/16 09:44, Gerd Hoffmann wrote:
>   Hi,
> 
>> I think the "dma_enabled" property is not exposed to the user.
> 
> It is: "-global fw_cfg.dma_enabled=off" works (as in: doesn't throw an
> error).  Has no effect through as it gets overridden later on.
> 
>> The default value of "dma_enabled" in both fw_cfg_io_properties and
>> fw_cfg_mem_properties is irrelevant; the actual property value is always
>> overwritten in fw_cfg_init_io_dma() and fw_cfg_init_mem_wide(), which
>> all of the init paths go through.
> 
> And IMHO we should not do that, so setting the property actually has an
> effect.

Fair point.

>> I agree that DMA capability should be filtered with machine type.
>> However, that distinction should not be made using the current
>> "dma_enabled" properties (i.e., of "fw_cfg_io_properties" and
>> "fw_cfg_mem_properties". Instead, it should be made in the
>> board-specific callers of fw_cfg_init_(io_dma|mem_wide).
> 
> Why?

That's how "has_reserved_memory" works as well, for example.

But, if the property is made work, I guess PC_COMPAT_2_4 can be used
too. (Or should it be HW_COMPAT_2_4?)

Is that your point?

Thanks
Laszlo



[Qemu-devel] [PATCH] hw/pxb: add pxb devices to the bridge category

2016-02-03 Thread Marcel Apfelbaum
Signed-off-by: Marcel Apfelbaum 
---
 hw/pci-bridge/pci_expander_bridge.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/pci-bridge/pci_expander_bridge.c 
b/hw/pci-bridge/pci_expander_bridge.c
index 62fd29d..d23b8da 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -302,6 +302,7 @@ static void pxb_dev_class_init(ObjectClass *klass, void 
*data)
 
 dc->desc = "PCI Expander Bridge";
 dc->props = pxb_dev_properties;
+set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 }
 
 static const TypeInfo pxb_dev_info = {
@@ -334,6 +335,7 @@ static void pxb_pcie_dev_class_init(ObjectClass *klass, 
void *data)
 
 dc->desc = "PCI Express Expander Bridge";
 dc->props = pxb_dev_properties;
+set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 }
 
 static const TypeInfo pxb_pcie_dev_info = {
-- 
2.4.3




Re: [Qemu-devel] [PATCH 2/3] pcdimm: add 'type' field to PCDIMMDeviceInfo

2016-02-03 Thread Vladimir Sementsov-Ogievskiy

On 03.02.2016 01:12, Eric Blake wrote:

On 01/27/2016 11:51 PM, Vladimir Sementsov-Ogievskiy wrote:

The field is needed to distinguish pc-dimm and nvdimm.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Denis V. Lunev 
CC: Stefan Hajnoczi 
CC: Xiao Guangrong 
CC: "Michael S. Tsirkin" 
CC: Igor Mammedov 
CC: Eric Blake 
CC: Markus Armbruster 
---
+++ b/qapi-schema.json
@@ -3924,6 +3924,8 @@
  #
  # @hotpluggable: true if device if could be added/removed while machine is 
running
  #
+# @type: device type: 'pc-dimm' or 'nvdimm' (since 2.6)
+#
  # Since: 2.1
  ##
  { 'struct': 'PCDIMMDeviceInfo',
@@ -3934,7 +3936,8 @@
  'node': 'int',
  'memdev': 'str',
  'hotplugged': 'bool',
-'hotpluggable': 'bool'
+'hotpluggable': 'bool',
+'type': 'str'

No. Since it is a finite set of values (just two possible), you should
be using an enum here rather than open-coded 'str'. Something like:

{ 'enum': 'DIMMType', 'data': [ 'pc-dimm', 'nvdimm' ] }



Are you sure? This is only output Info, so user will never "set" this 
field. Also, qemu type system (as I understand) is based on string 
names. object_dynamic_cast and other functions uses "const char 
*typename". This enum will be out of qemu type system and we will have 
to sync it.. Is there already some practice of translating string 
typenames to enum values?



--
Best regards,
Vladimir




Re: [Qemu-devel] [PATCH 3/3] balloon: don't use NVDIMM for ballooning

2016-02-03 Thread Vladimir Sementsov-Ogievskiy

On 02.02.2016 18:30, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


NVDIMM for now is planned to use as a backing store for DAX filesystem
in the guest and thus this memory is excluded from guest memory
management and LRUs.

In this case libvirt running QEMU along with configured balloon almost
immediately inflates balloon and effectively kill the guest as
qemu counts nvdimm as part of the ram.

Counting dimm devices as part of the ram for ballooning was started from
commit 463756d03:
  virtio-balloon: Fix balloon not working correctly when hotplug memory

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Denis V. Lunev 
CC: Stefan Hajnoczi 
CC: Xiao Guangrong 
CC: "Michael S. Tsirkin" 
CC: Igor Mammedov 
CC: Eric Blake 
CC: Markus Armbruster 
---
  hw/virtio/virtio-balloon.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 6a4c4d2..749be25 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -26,6 +26,7 @@
  #include "qapi/visitor.h"
  #include "qapi-event.h"
  #include "trace.h"
+#include "hw/mem/nvdimm.h"
  
  #if defined(__linux__)

  #include 
@@ -308,7 +309,9 @@ static ram_addr_t get_current_ram_size(void)
  if (value) {
  switch (value->type) {
  case MEMORY_DEVICE_INFO_KIND_DIMM:
-size += value->u.dimm->size;
+if (strcmp(value->u.dimm->type, TYPE_NVDIMM)) {
+size += value->u.dimm->size;
+}
  break;
  default:
  break;

Should this be a blacklist ("don't count TYPE_NVDIMM") or a whitelist
("count TYPE_PC_DIMM")?  I guess that depends on whether we think future
types are more likely to need counting or not counting.


May be, to avoid such bugs in future, it would be better to make like a 
whitelist.




--
Best regards,
Vladimir




Re: [Qemu-devel] [PATCH] Fix inconsistency between comment and variable name

2016-02-03 Thread Cao jin



On 02/03/2016 05:35 PM, Markus Armbruster wrote:

Michael Tokarev  writes:


03.02.2016 06:19, Cao jin wrote:

Signed-off-by: Cao jin 
---
  include/hw/qdev-core.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index abcdee8..42fa5db 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -221,7 +221,7 @@ typedef struct BusChild {

  /**
   * BusState:
- * @hotplug_device: link to a hotplug device associated with bus.
+ * @hotplug_handler: link to a hotplug device associated with bus.


Hmm.  Now while the field name in comment and in the structure
do match, the comment is still wrong, since it is a linke to
a handler, not a device… :)


Do you mean to suggest the line should be changed to

  * @hotplug_device: link to a hotplug handler associated with bus.

?




And sometimes a hotplug handler could be a bus, see 
qbus_set_bus_hotplug_handler(). on both case(device & bus), the actual 
hotplug handler called at last is a TYPE_HOTPLUG_HANDLER which reside in 
that device/bus, so I guess, the @hotplug_handler maybe fine:)


anyway, these terms could easily confuse newbies

--
Yours Sincerely,

Cao jin





Re: [Qemu-devel] [PATCH v2 00/10] virtio/vring: optimization patches

2016-02-03 Thread Gonglei (Arei)
Hi,

> Subject: [Qemu-devel] [PATCH v2 00/10] virtio/vring: optimization patches
> 
> This includes two optimization of virtio:
> 
> - "slimming down" VirtQueueElements by not including room for
>   1024 buffers.  This makes malloc much faster.
> 
> - optimizations to limit the number of address_space_translate
>   calls in virtio.c, from Vincenzo and myself.
> 

Very nice! After apply those optimizations (patch 8, 9, 10), I got 4MB/sec 
bonus of speed.
Based on my virtio-crypto device benchmark, ase-128-cbc algorithm. I haven't 
rebase
other patches of this patch set yet, maybe I can get other bonus, can I? 

Before applying those three patches:

Testing AES-128-CBC cipher: 
Encrypting in chunks of 256 bytes: done. 246.84 MiB in 5.02 secs: 49.13 
MiB/sec (1011061 packets)
Encrypting in chunks of 256 bytes: done. 247.03 MiB in 5.02 secs: 49.16 
MiB/sec (1011840 packets)
Encrypting in chunks of 256 bytes: done. 246.98 MiB in 5.02 secs: 49.17 
MiB/sec (1011636 packets)
Encrypting in chunks of 256 bytes: done. 247.14 MiB in 5.02 secs: 49.19 
MiB/sec (1012270 packets)
Encrypting in chunks of 256 bytes: done. 246.96 MiB in 5.02 secs: 49.16 
MiB/sec (1011565 packets)
Encrypting in chunks of 256 bytes: done. 246.97 MiB in 5.02 secs: 49.18 
MiB/sec (1011594 packets)
Encrypting in chunks of 256 bytes: done. 246.89 MiB in 5.02 secs: 49.15 
MiB/sec (1011259 packets)
Encrypting in chunks of 256 bytes: done. 246.96 MiB in 5.02 secs: 49.15 
MiB/sec (1011561 packets)

'Perf top' shows:

 23.61%  qemu-kvm [.] address_space_translate
 14.49%  qemu-kvm [.] qemu_get_ram_ptr
  4.65%  qemu-kvm [.] phys_page_find
  4.31%  qemu-kvm [.] address_space_translate_internal
  3.18%  libpthread-2.19.so   [.] __pthread_mutex_unlock_usercnt
  2.83%  qemu-kvm [.] qemu_ram_addr_from_host
  2.40%  qemu-kvm [.] address_space_map
  2.34%  libc-2.19.so [.] _int_malloc
  2.22%  libc-2.19.so [.] _int_free
  1.96%  libc-2.19.so [.] malloc
  1.71%  libpthread-2.19.so   [.] pthread_mutex_lock
  1.40%  qemu-kvm [.] find_next_zero_bit
  1.38%  libc-2.19.so [.] malloc_consolidate
  1.31%  qemu-kvm [.] lduw_le_phys
  1.27%  libc-2.19.so [.] __memcpy_sse2_unaligned
  1.05%  qemu-kvm [.] qemu_get_ram_block
  1.05%  qemu-kvm [.] object_unref
  1.04%  qemu-kvm [.] memory_region_get_ram_addr


After applying those optimizations:
Encrypting in chunks of 256 bytes: done. 267.92 MiB in 5.03 secs: 
53.31 MiB/sec (1097399 packets)
Encrypting in chunks of 256 bytes: done. 268.05 MiB in 5.02 secs: 53.35 
MiB/sec (1097935 packets)
Encrypting in chunks of 256 bytes: done. 265.40 MiB in 5.02 secs: 52.82 
MiB/sec (1087091 packets)
Encrypting in chunks of 256 bytes: done. 263.18 MiB in 5.01 secs: 52.50 
MiB/sec (1077999 packets)
Encrypting in chunks of 256 bytes: done. 266.85 MiB in 5.01 secs: 53.29 
MiB/sec (1093010 packets)
Encrypting in chunks of 256 bytes: done. 267.64 MiB in 5.02 secs: 53.28 
MiB/sec (1096251 packets)
Encrypting in chunks of 256 bytes: done. 267.30 MiB in 5.02 secs: 53.24 
MiB/sec (1094861 packets)
Encrypting in chunks of 256 bytes: done. 267.29 MiB in 5.02 secs: 53.25 
MiB/sec (1094833 packets)

'Perf top' shows:

22.56%  qemu-kvm [.] address_space_translate
 13.29%  qemu-kvm [.] qemu_get_ram_ptr
  4.71%  qemu-kvm [.] phys_page_find
  4.43%  qemu-kvm [.] address_space_translate_internal
  3.47%  libpthread-2.19.so   [.] __pthread_mutex_unlock_usercnt
  3.08%  qemu-kvm [.] qemu_ram_addr_from_host
  2.62%  qemu-kvm [.] address_space_map
  2.61%  libc-2.19.so [.] _int_malloc
  2.58%  libc-2.19.so [.] _int_free
  2.38%  libc-2.19.so [.] malloc
  2.06%  libpthread-2.19.so   [.] pthread_mutex_lock
  1.68%  libc-2.19.so [.] malloc_consolidate
  1.35%  libc-2.19.so [.] __memcpy_sse2_unaligned
  1.23%  qemu-kvm [.] lduw_le_phys
  1.18%  qemu-kvm [.] find_next_zero_bit
  1.02%  qemu-kvm [.] object_unref



Regards,
-Gonglei



Re: [Qemu-devel] [PULL 0/1] Monitor patches for 2016-02-03

2016-02-03 Thread Peter Maydell
On 3 February 2016 at 09:51, Markus Armbruster  wrote:
> The following changes since commit c65db7705b7926f4a084b93778e4bd5dd3990aad:
>
>   Merge remote-tracking branch 
> 'remotes/maxreitz/tags/pull-block-for-peter-2016-02-02' into staging 
> (2016-02-02 18:04:04 +)
>
> are available in the git repository at:
>
>   git://repo.or.cz/qemu/armbru.git tags/pull-monitor-2016-02-03
>
> for you to fetch changes up to 64ffbe04eaafebf4045a3ace52a360c14959d196:
>
>   hmp: fix sendkey out of bounds write (CVE-2015-8619) (2016-02-03 10:13:06 
> +0100)
>
> 
> Monitor patches for 2016-02-03
>
> 

Applied, thanks.

-- PMM



Re: [Qemu-devel] [Qemu-arm] [PATCH] target-arm: Don't report presence of EL2 if it doesn't exist

2016-02-03 Thread Sergey Fedorov
On 02.02.2016 21:20, Peter Maydell wrote:
> We already modify the processor feature bits to not report EL3
> support to the guest if EL3 isn't enabled for the CPU we're emulating.
> Add similar support for not reporting EL2 unless it is enabled.
> This is necessary because real world guest code running at EL3
> (trusted firmware or bootloaders) will query the ID registers to
> determine whether it should start a guest Linux kernel in EL2 or EL3.
>
> Signed-off-by: Peter Maydell 

Reviewed-by: Sergey Fedorov 

> ---
> When full EL2 arrives and we have the CPU property for it then
> this will expand a bit to look like the 'if (!cpu->has_el3)'
> condition just above it.
>
>  target-arm/cpu.c | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/target-arm/cpu.c b/target-arm/cpu.c
> index 6c34476..0cc075d 100644
> --- a/target-arm/cpu.c
> +++ b/target-arm/cpu.c
> @@ -650,6 +650,15 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>  cpu->id_aa64pfr0 &= ~0xf000;
>  }
>  
> +if (!arm_feature(env, ARM_FEATURE_EL2)) {
> +/* Disable the hypervisor feature bits in the processor feature
> + * registers if we don't have EL2. These are id_pfr1[15:12] and
> + * id_aa64pfr0_el1[11:8].
> + */
> +cpu->id_aa64pfr0 &= ~0xf00;
> +cpu->id_pfr1 &= ~0xf000;
> +}
> +
>  if (!cpu->has_mpu) {
>  unset_feature(env, ARM_FEATURE_MPU);
>  }




[Qemu-devel] [PATCH v2] target-mips: implement R6 multi-threading

2016-02-03 Thread Leon Alrae
From: Yongbok Kim 

MIPS Release 6 provides multi-threading features which replace
pre-R6 MT Module. CP0.Config3.MT is always 0 in R6, instead there is new
CP0.Config5.VP (Virtual Processor) bit which indicates presence of
multi-threading support which includes CP0.GlobalNumber register and
DVP/EVP instructions.

Signed-off-by: Yongbok Kim 
Signed-off-by: Leon Alrae 
---
This is an updated version of the patch originally sent by Yongbok some
time ago.

v2:
* rebased
* fix typo: GlobalNumer -> GlobalNumber
* fix minor style issues
* update the patch description
---
 disas/mips.c |  4 +++
 target-mips/cpu.c|  9 +++
 target-mips/cpu.h| 25 +++
 target-mips/helper.h |  4 +++
 target-mips/op_helper.c  | 48 +++
 target-mips/translate.c  | 59 
 target-mips/translate_init.c |  3 ++-
 7 files changed, 151 insertions(+), 1 deletion(-)

diff --git a/disas/mips.c b/disas/mips.c
index 0e488d8..249931b 100644
--- a/disas/mips.c
+++ b/disas/mips.c
@@ -1405,6 +1405,10 @@ const struct mips_opcode mips_builtin_opcodes[] =
 {"cmp.sor.d",  "D,S,T", 0x46a00019, 0xffe0003f, RD_S|RD_T|WR_D|FP_D,  0, 
I32R6},
 {"cmp.sune.d", "D,S,T", 0x46a0001a, 0xffe0003f, RD_S|RD_T|WR_D|FP_D,  0, 
I32R6},
 {"cmp.sne.d",  "D,S,T", 0x46a0001b, 0xffe0003f, RD_S|RD_T|WR_D|FP_D,  0, 
I32R6},
+{"dvp","",  0x41600024, 0x, TRAP, 0, 
I32R6},
+{"dvp","t", 0x41600024, 0xffe0, TRAP|WR_t,0, 
I32R6},
+{"evp","",  0x4164, 0x, TRAP, 0, 
I32R6},
+{"evp","t", 0x4164, 0xffe0, TRAP|WR_t,0, 
I32R6},
 
 /* MSA */
 {"sll.b",   "+d,+e,+f", 0x780d, 0xffe0003f, WR_VD|RD_VS|RD_VT,  0, MSA},
diff --git a/target-mips/cpu.c b/target-mips/cpu.c
index 0b3f130..7dc3a44 100644
--- a/target-mips/cpu.c
+++ b/target-mips/cpu.c
@@ -77,6 +77,15 @@ static bool mips_cpu_has_work(CPUState *cs)
 has_work = false;
 }
 }
+/* MIPS Release 6 has the ability to halt the CPU.  */
+if (env->CP0_Config5 & (1 << CP0C5_VP)) {
+if (cs->interrupt_request & CPU_INTERRUPT_WAKE) {
+has_work = true;
+}
+if (!mips_vp_active(env)) {
+has_work = false;
+}
+}
 return has_work;
 }
 
diff --git a/target-mips/cpu.h b/target-mips/cpu.h
index 86b6333..ae8c575 100644
--- a/target-mips/cpu.h
+++ b/target-mips/cpu.h
@@ -238,6 +238,8 @@ struct CPUMIPSState {
 
 int32_t CP0_Index;
 /* CP0_MVP* are per MVP registers. */
+int32_t CP0_VPControl;
+#define CP0VPCtl_DIS0
 int32_t CP0_Random;
 int32_t CP0_VPEControl;
 #define CP0VPECo_YSI   21
@@ -287,6 +289,8 @@ struct CPUMIPSState {
 # define CP0EnLo_RI 31
 # define CP0EnLo_XI 30
 #endif
+int32_t CP0_GlobalNumber;
+#define CP0GN_VPId 0
 target_ulong CP0_Context;
 target_ulong CP0_KScratch[MIPS_KSCRATCH_NUM];
 int32_t CP0_PageMask;
@@ -472,6 +476,7 @@ struct CPUMIPSState {
 #define CP0C5_XNP13
 #define CP0C5_UFE9
 #define CP0C5_FRE8
+#define CP0C5_VP 7
 #define CP0C5_SBRI   6
 #define CP0C5_MVH5
 #define CP0C5_LLB4
@@ -859,6 +864,26 @@ static inline int mips_vpe_active(CPUMIPSState *env)
 return active;
 }
 
+static inline int mips_vp_active(CPUMIPSState *env)
+{
+CPUState *other_cs = first_cpu;
+
+/* Check if the VP disabled other VPs (which means the VP is enabled) */
+if ((env->CP0_VPControl >> CP0VPCtl_DIS) & 1) {
+return 1;
+}
+
+/* Check if the virtual processor is disabled due to a DVP */
+CPU_FOREACH(other_cs) {
+MIPSCPU *other_cpu = MIPS_CPU(other_cs);
+if ((&other_cpu->env != env) &&
+((other_cpu->env.CP0_VPControl >> CP0VPCtl_DIS) & 1)) {
+return 0;
+}
+}
+return 1;
+}
+
 #include "exec/exec-all.h"
 
 static inline void compute_hflags(CPUMIPSState *env)
diff --git a/target-mips/helper.h b/target-mips/helper.h
index 95b9149..1bc8bb2 100644
--- a/target-mips/helper.h
+++ b/target-mips/helper.h
@@ -176,6 +176,10 @@ DEF_HELPER_0(dmt, tl)
 DEF_HELPER_0(emt, tl)
 DEF_HELPER_1(dvpe, tl, env)
 DEF_HELPER_1(evpe, tl, env)
+
+/* R6 Multi-threading */
+DEF_HELPER_1(dvp, tl, env)
+DEF_HELPER_1(evp, tl, env)
 #endif /* !CONFIG_USER_ONLY */
 
 /* microMIPS functions */
diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 684ec92..7c5669c 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -571,6 +571,14 @@ static bool mips_vpe_is_wfi(MIPSCPU *c)
 return cpu->halted && mips_vpe_active(env);
 }
 
+static bool mips_vp_is_wfi(MIPSCPU *c)
+{
+CPUState *cpu = CPU(c);
+CPUMIPSState *env = &c->env;
+
+return cpu->halted && mips_vp_active(env);
+}
+
 static inline void mips_vpe_wake(MIPSCPU *c)
 {
 /* Dont set ->halted = 0 directly, let it be done via cpu_has_work
@@ -18

Re: [Qemu-devel] [PATCH 07/10] virtio: combine the read of a descriptor

2016-02-03 Thread Gonglei (Arei)
Hi,

> Subject: [Qemu-devel] [PATCH 07/10] virtio: combine the read of a descriptor
> 
> Compared to vring, virtio has a performance penalty of 10%.  Fix it
> by combining all the reads for a descriptor in a single address_space_read
> call.  This also simplifies the code nicely.
> 
> Reviewed-by: Cornelia Huck 
> Signed-off-by: Paolo Bonzini 
> ---
>  hw/virtio/virtio.c | 86 
> ++
>  1 file changed, 35 insertions(+), 51 deletions(-)
> 

Unbelievable! After applying this patch, the virtio-crypto speed can attach 
74MB/sec, host
Cpu overhead is 180% (the main thread 100% and vcpu threads 80%)

Testing AES-128-CBC cipher: 
Encrypting in chunks of 256 bytes: done. 371.94 MiB in 5.02 secs: 74.12 
MiB/sec (1523475 packets)
Encrypting in chunks of 256 bytes: done. 369.85 MiB in 5.01 secs: 73.88 
MiB/sec (1514900 packets)
Encrypting in chunks of 256 bytes: done. 371.07 MiB in 5.02 secs: 73.97 
MiB/sec (1519914 packets)
Encrypting in chunks of 256 bytes: done. 371.66 MiB in 5.02 secs: 74.09 
MiB/sec (1522309 packets)
Encrypting in chunks of 256 bytes: done. 371.79 MiB in 5.02 secs: 74.12 
MiB/sec (1522868 packets)
Encrypting in chunks of 256 bytes: done. 371.94 MiB in 5.02 secs: 74.15 
MiB/sec (1523457 packets)
Encrypting in chunks of 256 bytes: done. 371.90 MiB in 5.02 secs: 74.14 
MiB/sec (1523317 packets)
Encrypting in chunks of 256 bytes: done. 371.71 MiB in 5.02 secs: 74.10 
MiB/sec (1522522 packets)

15.95%  qemu-kvm [.] address_space_translate
  6.98%  qemu-kvm [.] qemu_get_ram_ptr
  4.87%  libpthread-2.19.so   [.] __pthread_mutex_unlock_usercnt
  4.40%  qemu-kvm [.] qemu_ram_addr_from_host
  3.79%  qemu-kvm [.] address_space_map
  3.41%  libc-2.19.so [.] _int_malloc
  3.29%  libc-2.19.so [.] _int_free
  3.07%  libc-2.19.so [.] malloc
  2.95%  libpthread-2.19.so   [.] pthread_mutex_lock
  2.94%  qemu-kvm [.] phys_page_find
  2.73%  qemu-kvm [.] address_space_translate_internal
  2.65%  libc-2.19.so [.] malloc_consolidate
  2.35%  libc-2.19.so [.] __memcpy_sse2_unaligned
  1.72%  qemu-kvm [.] find_next_zero_bit
  1.38%  qemu-kvm [.] address_space_rw
  1.34%  qemu-kvm [.] object_unref
  1.30%  qemu-kvm [.] object_ref
  1.28%  qemu-kvm [.] virtqueue_pop
  1.20%  libc-2.19.so [.] memset
  1.11%  qemu-kvm [.] virtio_notify

Thank you so much!

Regards,
-Gonglei

> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 79a635f..2433866 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -107,35 +107,15 @@ void virtio_queue_update_rings(VirtIODevice *vdev,
> int n)
>vring->align);
>  }
> 
> -static inline uint64_t vring_desc_addr(VirtIODevice *vdev, hwaddr desc_pa,
> -   int i)
> +static void vring_desc_read(VirtIODevice *vdev, VRingDesc *desc,
> +hwaddr desc_pa, int i)
>  {
> -hwaddr pa;
> -pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, addr);
> -return virtio_ldq_phys(vdev, pa);
> -}
> -
> -static inline uint32_t vring_desc_len(VirtIODevice *vdev, hwaddr desc_pa, int
> i)
> -{
> -hwaddr pa;
> -pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, len);
> -return virtio_ldl_phys(vdev, pa);
> -}
> -
> -static inline uint16_t vring_desc_flags(VirtIODevice *vdev, hwaddr desc_pa,
> -int i)
> -{
> -hwaddr pa;
> -pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, flags);
> -return virtio_lduw_phys(vdev, pa);
> -}
> -
> -static inline uint16_t vring_desc_next(VirtIODevice *vdev, hwaddr desc_pa,
> -   int i)
> -{
> -hwaddr pa;
> -pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, next);
> -return virtio_lduw_phys(vdev, pa);
> +address_space_read(&address_space_memory, desc_pa + i *
> sizeof(VRingDesc),
> +   MEMTXATTRS_UNSPECIFIED, (void *)desc,
> sizeof(VRingDesc));
> +virtio_tswap64s(vdev, &desc->addr);
> +virtio_tswap32s(vdev, &desc->len);
> +virtio_tswap16s(vdev, &desc->flags);
> +virtio_tswap16s(vdev, &desc->next);
>  }
> 
>  static inline uint16_t vring_avail_flags(VirtQueue *vq)
> @@ -345,18 +325,18 @@ static unsigned int virtqueue_get_head(VirtQueue
> *vq, unsigned int idx)
>  return head;
>  }
> 
> -static unsigned virtqueue_next_desc(VirtIODevice *vdev, hwaddr desc_pa,
> -unsigned int i, unsigned int max)
> +static unsigned virtqueue_read_next_desc(VirtIODevice *vdev, VRingDesc
> *desc,
> + hwaddr desc_pa, unsigned
> int max)
>  {
>  unsigne

Re: [Qemu-devel] [PATCH] rng-random: implement request queue

2016-02-03 Thread Amit Shah
Hi Ladi,

Adding Pankaj to CC, he too looked at this recently.

On (Fri) 22 Jan 2016 [13:19:58], Ladi Prosek wrote:
> If the guest adds a buffer to the virtio queue while another buffer
> is still pending and hasn't been filled and returned by the rng
> device, rng-random internally discards the pending request, which
> leads to the second buffer getting stuck in the queue. For the guest
> this manifests as delayed completion of reads from virtio-rng, i.e.
> a read is completed only after another read is issued.
> 
> This patch adds an internal queue of requests, analogous to what
> rng-egd uses, to make sure that requests and responses are balanced
> and correctly ordered.

... and this can lead to breaking migration (the queue of requests on
the host needs to be migrated, else the new host will have no idea of
the queue).

I think we should limit the queue size to 1 instead.  Multiple rng
requests should not be common, because if we did have entropy, we'd
just service the guest request and be done with it.  If we haven't
replied to the guest, it just means that the host itself is waiting
for more entropy, or is waiting for the timeout before the guest's
ratelimit is lifted.

So, instead of fixing this using a queue, how about limiting the size
of the vq to have just one element at a time?

Thanks,

Amit



Re: [Qemu-devel] [PATCH v2 00/10] virtio/vring: optimization patches

2016-02-03 Thread Gonglei (Arei)

> -Original Message-
> From: qemu-devel-bounces+arei.gonglei=huawei@nongnu.org
> [mailto:qemu-devel-bounces+arei.gonglei=huawei@nongnu.org] On
> Behalf Of Paolo Bonzini
> Sent: Sunday, January 31, 2016 6:29 PM
> To: qemu-devel@nongnu.org
> Cc: cornelia.h...@de.ibm.com; m...@redhat.com
> Subject: [Qemu-devel] [PATCH v2 00/10] virtio/vring: optimization patches
> 
> This includes two optimization of virtio:
> 
> - "slimming down" VirtQueueElements by not including room for
>   1024 buffers.  This makes malloc much faster.
> 
> - optimizations to limit the number of address_space_translate
>   calls in virtio.c, from Vincenzo and myself.
> 
> Thanks,
> 
> Paolo
> 
> v1->v2: improved commit messages [Conny]
> add assertions on sz [Conny]
> change bools from 1 and 0 to "true" and "false" [Conny]
> update shadow avail_idx in virtio_queue_set_last_avail_idx [Michael]
> collect Reviewed-by
> 
> Paolo Bonzini (7):
>   virtio: move VirtQueueElement at the beginning of the structs
>   virtio: move allocation to virtqueue_pop/vring_pop
>   virtio: introduce qemu_get/put_virtqueue_element
>   virtio: introduce virtqueue_alloc_element
>   virtio: slim down allocation of VirtQueueElements
>   vring: slim down allocation of VirtQueueElements
>   virtio: combine the read of a descriptor
> 
> Vincenzo Maffione (3):
>   virtio: cache used_idx in a VirtQueue field
>   virtio: read avail_idx from VQ only when necessary
>   virtio: combine write of an entry into used ring
> 
>  hw/9pfs/9p.c|   2 +-
>  hw/9pfs/virtio-9p-device.c  |  17 +-
>  hw/9pfs/virtio-9p.h |   2 +-
>  hw/block/dataplane/virtio-blk.c |  11 +-
>  hw/block/virtio-blk.c   |  23 +--
>  hw/char/virtio-serial-bus.c |  78 +
>  hw/display/virtio-gpu.c |  25 ++-
>  hw/input/virtio-input.c |  24 ++-
>  hw/net/virtio-net.c |  69 +---
>  hw/scsi/virtio-scsi-dataplane.c |  15 +-
>  hw/scsi/virtio-scsi.c   |  26 ++-
>  hw/virtio/dataplane/vring.c |  62 ---
>  hw/virtio/virtio-balloon.c  |  22 ++-
>  hw/virtio/virtio-rng.c  |  10 +-
>  hw/virtio/virtio.c  | 340
> +---
>  include/hw/virtio/dataplane/vring.h |   2 +-
>  include/hw/virtio/virtio-balloon.h  |   2 +-
>  include/hw/virtio/virtio-blk.h  |   5 +-
>  include/hw/virtio/virtio-net.h  |   2 +-
>  include/hw/virtio/virtio-scsi.h |  15 +-
>  include/hw/virtio/virtio-serial.h   |   2 +-
>  include/hw/virtio/virtio.h  |  13 +-
>  22 files changed, 486 insertions(+), 281 deletions(-)
> 
> --
> 2.5.0
> 

For patch 7,8,9,10:

 Tested-by: Gonglei 

Regards,
-Gonglei



Re: [Qemu-devel] [PATCH qemu] migration/vmstate: document VMStateFlags

2016-02-03 Thread Amit Shah
On (Thu) 21 Jan 2016 [21:39:27], Sascha Silbe wrote:
> The VMState API is rather sparsely documented. Start by describing the
> meaning of all VMStateFlags.
> 
> Signed-off-by: Sascha Silbe 

Thanks, this is much needed.  I'm just returning from a long trip,
will go through this in a bit.

Amit



Re: [Qemu-devel] [PATCH V2 1/2] ARM: PL061: Clear PL061 device state after reset

2016-02-03 Thread Shannon Zhao
Hi Wei,

This still has a problem as I said before.

If we execute "virsh reboot xxx" during VM booting(i.e. before the PL061
driver loaded), then after VM booting, the VM will not have any reaction
to the "virsh reboot xxx".

On 2016/2/2 4:49, Wei Huang wrote:
> Current QEMU doesn't clear PL061 state after reset. This causes a
> weird issue with guest reboot via GPIO. Here is the device state
> description with two reboot requests:
> 
>   (PL061State fields)   data   old_in_data   istate
> VM boot 0  0 0
> After 1st ACPI reboot request   8  8 8
> After VM PL061 driver ACK   8  8 0
> After VM reboot 8  8 0
> 
> 2nd ACPI reboot request 8
> 
> In the second reboot request above, because old_in_data field is 8,
> QEMU decides that there is a pending edge IRQ already (see
> pl061_update()) in input; so it doesn't raise up IRQ again. As a result
> the second reboot request is lost. The correct way is to clear PL061
> device state after reset.
> 
> NOTE: The reset state is found from the following documentation:
>  - PL061 Technical Reference Manual
>  - Stellaris LM3S8962 Microcontroller Data Sheet
>  - Stellaris LM3S5P31 Microcontroller Data Sheet
> 
> Signed-off-by: Wei Huang 
> ---
>  hw/gpio/pl061.c | 32 ++--
>  1 file changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/gpio/pl061.c b/hw/gpio/pl061.c
> index e5a696e..342a70d 100644
> --- a/hw/gpio/pl061.c
> +++ b/hw/gpio/pl061.c
> @@ -284,8 +284,35 @@ static void pl061_write(void *opaque, hwaddr offset,
>  
>  static void pl061_reset(PL061State *s)
>  {
> -  s->locked = 1;
> -  s->cr = 0xff;
> +/* reset values from PL061 TRM, Stellaris LM3S5P31 & LM3S8962 Data Sheet 
> */
> +s->data = 0;
> +s->old_out_data = 0;
> +s->old_in_data = 0;
> +s->dir = 0;
> +s->isense = 0;
> +s->ibe = 0;
> +s->iev = 0;
> +s->im = 0;
> +s->istate = 0;
> +s->afsel = 0;
> +s->dr2r = 0xff;
> +s->dr4r = 0;
> +s->dr8r = 0;
> +s->odr = 0;
> +s->pur = 0;
> +s->pdr = 0;
> +s->slr = 0;
> +s->den = 0;
> +s->locked = 1;
> +s->cr = 0xff;
> +s->amsel = 0;
> +}
> +
> +static void pl061_state_reset(DeviceState *dev)
> +{
> +PL061State *s = PL061(dev);
> +
> +pl061_reset(s);
>  }
>  
>  static void pl061_set_irq(void * opaque, int irq, int level)
> @@ -343,6 +370,7 @@ static void pl061_class_init(ObjectClass *klass, void 
> *data)
>  
>  k->init = pl061_initfn;
>  dc->vmsd = &vmstate_pl061;
> +dc->reset = &pl061_state_reset;
>  }
>  
>  static const TypeInfo pl061_info = {
> 

-- 
Shannon




Re: [Qemu-devel] [PULL 0/6] virtio-gpu: bugfixes and spice support preparation

2016-02-03 Thread Peter Maydell
On 3 February 2016 at 11:16, Gerd Hoffmann  wrote:
>   Hi,
>
> Minor fixes and some preparing work in console
> and virtio-gpu for spice support.
>
> please pull,
>   Gerd
>
> The following changes since commit c65db7705b7926f4a084b93778e4bd5dd3990aad:
>
>   Merge remote-tracking branch 
> 'remotes/maxreitz/tags/pull-block-for-peter-2016-02-02' into staging 
> (2016-02-02 18:04:04 +)
>
> are available in the git repository at:
>
>
>   git://git.kraxel.org/qemu tags/pull-vga-20160203-1
>
> for you to fetch changes up to 321c9adba5a64a1a9de2dd7db5433b62a5433439:
>
>   virtio-gpu: block any rendering until client (ui) is done (2016-02-03 
> 10:41:36 +0100)
>
> 
> virtio-gpu: bugfixes and spice support preparation
>
> 

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH v4] Add optionrom compatible with fw_cfg DMA version

2016-02-03 Thread Gerd Hoffmann
  Hi,

> >> I agree that DMA capability should be filtered with machine type.
> >> However, that distinction should not be made using the current
> >> "dma_enabled" properties (i.e., of "fw_cfg_io_properties" and
> >> "fw_cfg_mem_properties". Instead, it should be made in the
> >> board-specific callers of fw_cfg_init_(io_dma|mem_wide).
> > 
> > Why?
> 
> That's how "has_reserved_memory" works as well, for example.

Those are machine options, not device options.  If we have a device
property there is no need to create a duplicate a machine option for the
same purpose.

> But, if the property is made work, I guess PC_COMPAT_2_4 can be used
> too. (Or should it be HW_COMPAT_2_4?)

HW_COMPAT_2_4 probably, so it applies to q35 too.

cheers,
  Gerd




Re: [Qemu-devel] [RFC PATCH v1 1/1] vGPU core driver : to provide common interface for vGPU.

2016-02-03 Thread Kirti Wankhede



On 2/3/2016 11:26 AM, Tian, Kevin wrote:
[...]

* @vgpu_create:Called to allocate basic resouces in graphics
*  driver for a particular vgpu.
*  @dev: physical pci device structure on which vgpu
*should be created
*  @uuid: uuid for which VM it is intended to
*  @instance: vgpu instance in that VM
*  @vgpu_id: This represents the type of vgpu to be
*created
*  Returns integer: success (0) or error (< 0)


Specifically for Intel GVT-g we didn't hard partition resource among vGPUs.
Instead we allow user to accurately control how many physical resources
are allocated to a vGPU. So this interface should be extensible to allow
vendor specific resource control.



This interface forwards the create request to vendor/GPU driver
informing about which physical GPU this request is intended for and the
type of vGPU. Then its vendor/GPU driver's responsibility to do
resources allocation and manage resources in their own driver.


However the current parameter definition disallows resource configuration
information passed from user. As I said, Intel GVT-g doesn't do static
allocation based on type. We provide flexibility to user for fine-grained
resource management.



int (*vgpu_create)(struct pci_dev *dev, uuid_le uuid,
-  uint32_t instance, uint32_t vgpu_id);
+  uint32_t instance, char *vgpu_params);

If we change integer vgpu_id parameter to char *vgpu_param then GPU 
driver can have multiple parameters.


Suppose there is a GPU at :85:00.0, then to create vgpu:
# echo "::" > 
/sys/bus/pci/devices/\:85\:00.0/vgpu_create


Common vgpu module will not parse vgpu_params string, it will be 
forwarded to GPU driver, then its GPU driver's responsibility to parse 
the string and act accordingly. This should give flexibility to have 
multiple parameters for GPU driver.



*
* Physical GPU that support vGPU should be register with vgpu module with
* gpu_device_ops structure.
*/



Also it'd be good design to allow extensible usages, such as statistics, and
other vendor specific control knobs (e.g. foreground/background VM switch
in Intel GVT-g, etc.)



Can you elaborate on what other control knobs that would be needed?



Some examples:

- foreground/background VM switch
- resource query
- various statistics info
- virtual monitor configuration
- ...

Since this vgpu core driver will become the central point for all vgpu
management, we need provide an easy way for vendor specific extension,
e.g. exposing above callbacks as sysfs nodes and then vendor can create
its own extensions under subdirectory (/intel, /nvidia, ...).




Ok. Makes sense.

Are these parameters per physical device or per vgpu device?

Adding attribute groups to gpu_device_ops structure would provide a way 
to add vendor specific extensions. I think we need two types of 
attributes here, per physical device and per vgpu device. Right?


 const struct attribute_group **dev_groups;
 const struct attribute_group **vgpu_groups;



Thanks,
Kirti.


Thanks
Kevin





Re: [Qemu-devel] [PATCH v1 13/22] migration: convert RDMA to use QIOChannel interface

2016-02-03 Thread Dr. David Alan Gilbert
* Daniel P. Berrange (berra...@redhat.com) wrote:
> On Tue, Feb 02, 2016 at 08:01:36PM +, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > This converts the RDMA code to provide a subclass of
> > > QIOChannel that uses RDMA for the data transport.
> > > 
> > > The RDMA code would be much better off it it could
> > > be split up in a generic RDMA layer, a QIOChannel
> > > impl based on RMDA, and then the RMDA migration
> > > glue. This is left as a future exercise for the brave.
> > > 
> > > Signed-off-by: Daniel P. Berrange 
> > > ---
> > >  migration/rdma.c | 260 
> > > ++-
> > >  1 file changed, 161 insertions(+), 99 deletions(-)
> > > 
> 
> > > +static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
> > > +  const struct iovec *iov,
> > > +  size_t niov,
> > > +  int **fds,
> > > +  size_t *nfds,
> > > +  Error **errp)
> > > +{
> > > +QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
> > > +RDMAContext *rdma = rioc->rdma;
> > >  RDMAControlHeader head;
> > >  int ret = 0;
> > > +ssize_t i;
> > > +size_t done = 0;
> > >  
> > >  CHECK_ERROR_STATE();
> > >  
> > > -/*
> > > - * First, we hold on to the last SEND message we
> > > - * were given and dish out the bytes until we run
> > > - * out of bytes.
> > > - */
> > > -r->len = qemu_rdma_fill(r->rdma, buf, size, 0);
> > > -if (r->len) {
> > > -return r->len;
> > > -}
> > > +for (i = 0; i < niov; i++) {
> > > +size_t want = iov[i].iov_len;
> > > +uint8_t *data = (void *)iov[i].iov_base;
> > >  
> > > -/*
> > > - * Once we run out, we block and wait for another
> > > - * SEND message to arrive.
> > > - */
> > > -ret = qemu_rdma_exchange_recv(rdma, &head, RDMA_CONTROL_QEMU_FILE);
> > > +/*
> > > + * First, we hold on to the last SEND message we
> > > + * were given and dish out the bytes until we run
> > > + * out of bytes.
> > > + */
> > > +ret = qemu_rdma_fill(rioc->rdma, data, want, 0);
> > > +if (ret > 0) {
> > > +done += ret;
> > > +if (ret < want) {
> > > +break;
> > > +} else {
> > > +continue;
> > > +}
> > 
> > > +}
> > >  
> > > -if (ret < 0) {
> > > -rdma->error_state = ret;
> > > -return ret;
> > > -}
> > > +/*
> > > + * Once we run out, we block and wait for another
> > > + * SEND message to arrive.
> > > + */
> > > +ret = qemu_rdma_exchange_recv(rdma, &head, 
> > > RDMA_CONTROL_QEMU_FILE);
> > >  
> > > -/*
> > > - * SEND was received with new bytes, now try again.
> > > - */
> > > -return qemu_rdma_fill(r->rdma, buf, size, 0);
> > > +if (ret < 0) {
> > > +rdma->error_state = ret;
> > > +return ret;
> > > +}
> > > +
> > > +/*
> > > + * SEND was received with new bytes, now try again.
> > > + */
> > > +ret = qemu_rdma_fill(rioc->rdma, data, want, 0);
> > > +if (ret > 0) {
> > > +done += ret;
> > > +if (ret < want) {
> > > +break;
> > > +}
> > > +}
> > 
> > I don't quite understand the behaviour of this loop.
> > If rdma_fill returns less than you wanted for the first iov we break.
> > If it returns 0 then we try and get some more.
> > The weird thing to me is if we have two iov entries; if the
> > amount returned by the qemu_rdma_fill happens to match the size of
> > the 1st iov then I think we end up doing the exchange_recv and
> > waiting for more.  Is that what we want? Why?
> 
> No, it isn't quite what we want. If we have successfully received
> some data in a preceeding iov, then we shouldn't wait for more data
> for any following iov. I'll rework this bit of code to work better
> 
> In fact technically, we should not block for data at all when the
> channel is in non-blocking mode. Fixing that would require some
> refactoring of qemu_rdma_block_for_wrid() so that we could tell
> it to only look for an already completed work request and not
> block

I wouldn't go changing qemu_rdma_block_for_wrid unless you need
to.

Dave

> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org  -o- http://virt-manager.org :|
> |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v1 13/22] migration: convert RDMA to use QIOChannel interface

2016-02-03 Thread Daniel P. Berrange
On Wed, Feb 03, 2016 at 01:23:04PM +, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > On Tue, Feb 02, 2016 at 08:01:36PM +, Dr. David Alan Gilbert wrote:
> > > * Daniel P. Berrange (berra...@redhat.com) wrote:
> > > > This converts the RDMA code to provide a subclass of
> > > > QIOChannel that uses RDMA for the data transport.
> > > > 
> > > > The RDMA code would be much better off it it could
> > > > be split up in a generic RDMA layer, a QIOChannel
> > > > impl based on RMDA, and then the RMDA migration
> > > > glue. This is left as a future exercise for the brave.
> > > > 
> > > > Signed-off-by: Daniel P. Berrange 
> > > > ---
> > > >  migration/rdma.c | 260 
> > > > ++-
> > > >  1 file changed, 161 insertions(+), 99 deletions(-)
> > > > 

> > > 
> > > I don't quite understand the behaviour of this loop.
> > > If rdma_fill returns less than you wanted for the first iov we break.
> > > If it returns 0 then we try and get some more.
> > > The weird thing to me is if we have two iov entries; if the
> > > amount returned by the qemu_rdma_fill happens to match the size of
> > > the 1st iov then I think we end up doing the exchange_recv and
> > > waiting for more.  Is that what we want? Why?
> > 
> > No, it isn't quite what we want. If we have successfully received
> > some data in a preceeding iov, then we shouldn't wait for more data
> > for any following iov. I'll rework this bit of code to work better
> > 
> > In fact technically, we should not block for data at all when the
> > channel is in non-blocking mode. Fixing that would require some
> > refactoring of qemu_rdma_block_for_wrid() so that we could tell
> > it to only look for an already completed work request and not
> > block
> 
> I wouldn't go changing qemu_rdma_block_for_wrid unless you need
> to.

Yeah, I won't do it now - just something to think about for the future
to properly do non-blocking I/o channels.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[Qemu-devel] Memory on stellaris board

2016-02-03 Thread Aurelio Remonda
Hello, i was trying to understand how does the sram and flash size
works on lm3s6965evb, i found a hardcoded 0x00ff007f as dc0 value and
how flash_size and sram_size are calculated based on that hexadecimal.
I mean this:

flash_size = (((board->dc0 & 0x) + 1) << 1) * 1024;
sram_size = ((board->dc0 >> 18) + 1) * 1024;
On stellaris.c

When i use the -m [size] flag while running qemu how does the flag
affect does flash_size and sram_size values? Or it doesn't? Where can
i see that the memory has indeed change?

When i debug qemu with or whitout the flag i keep getting the same
sram_size and flash_size values (the ones calculated with 0x00ff007f)

Thank you!!

-- 
Aurelio Remonda

Software Engineer

San Lorenzo 47, 3rd Floor, Office 5
Córdoba, Argentina
Phone: +54-351-4217888 / 4218211



Re: [Qemu-devel] [PATCH v1 03/22] migration: ensure qemu_fflush() always writes full data amount

2016-02-03 Thread Daniel P. Berrange
On Thu, Jan 28, 2016 at 05:53:46PM +, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > The QEMUFile writev_buffer / put_buffer functions are expected
> > to write out the full set of requested data, blocking until
> > complete. The qemu_fflush() caller does not expect to deal with
> > partial writes. Clarify the function comments and add a sanity
> > check to the code to catch mistaken implementations.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  include/migration/qemu-file.h |  6 --
> >  migration/qemu-file.c | 16 
> >  2 files changed, 16 insertions(+), 6 deletions(-)
> > 
> > diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
> > index b5d08d2..5debe8c 100644
> > --- a/include/migration/qemu-file.h
> > +++ b/include/migration/qemu-file.h
> > @@ -29,7 +29,8 @@
> >  
> >  /* This function writes a chunk of data to a file at the given position.
> >   * The pos argument can be ignored if the file is only being used for
> > - * streaming.  The handler should try to write all of the data it can.
> > + * streaming.  The handler must write all of the data or return a negative
> > + * errno value.
> >   */
> >  typedef ssize_t (QEMUFilePutBufferFunc)(void *opaque, const uint8_t *buf,
> >  int64_t pos, size_t size);
> > @@ -55,7 +56,8 @@ typedef int (QEMUFileCloseFunc)(void *opaque);
> >  typedef int (QEMUFileGetFD)(void *opaque);
> >  
> >  /*
> > - * This function writes an iovec to file.
> > + * This function writes an iovec to file. The handler must write all
> > + * of the data or return a negative errno value.
> >   */
> >  typedef ssize_t (QEMUFileWritevBufferFunc)(void *opaque, struct iovec *iov,
> > int iovcnt, int64_t pos);
> > diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> > index 0bbd257..f89e64e 100644
> > --- a/migration/qemu-file.c
> > +++ b/migration/qemu-file.c
> > @@ -107,11 +107,13 @@ bool qemu_file_is_writable(QEMUFile *f)
> >   * Flushes QEMUFile buffer
> >   *
> >   * If there is writev_buffer QEMUFileOps it uses it otherwise uses
> > - * put_buffer ops.
> > + * put_buffer ops. This will flush all pending data. If data was
> > + * only partially flushed, it will set an error state.
> >   */
> >  void qemu_fflush(QEMUFile *f)
> >  {
> >  ssize_t ret = 0;
> > +ssize_t expect = 0;
> >  
> >  if (!qemu_file_is_writable(f)) {
> >  return;
> > @@ -119,21 +121,27 @@ void qemu_fflush(QEMUFile *f)
> >  
> >  if (f->ops->writev_buffer) {
> >  if (f->iovcnt > 0) {
> > +expect = iov_size(f->iov, f->iovcnt);
> >  ret = f->ops->writev_buffer(f->opaque, f->iov, f->iovcnt, 
> > f->pos);
> >  }
> >  } else {
> >  if (f->buf_index > 0) {
> > +expect = f->buf_index;
> >  ret = f->ops->put_buffer(f->opaque, f->buf, f->pos, 
> > f->buf_index);
> >  }
> >  }
> > +
> >  if (ret >= 0) {
> >  f->pos += ret;
> >  }
> > -f->buf_index = 0;
> > -f->iovcnt = 0;
> > -if (ret < 0) {
> > +/* We expect the QEMUFile write impl to send the full
> > + * data set we requested, so sanity check that.
> > + */
> > +if (ret < 0 || ret != expect) {
> >  qemu_file_set_error(f, ret);
> 
> You could simplify that to if (ret != expect) couldn't you?
> 
> In the case you're trying to guard against, the value past
> to qemu_file_set_error is potentially truncated; which in the worst
> case could make it appear as success; although I doubt that
> can happen in our uses.

qemu_file_set_error expects an errni, so in fact my code was
buggy if the ret != expect part got evaluated, so I'll change
this to

 if (ret != expect) {
qemu_file_set_error(f, ret < 0 ? ret : -EIO);
 }


> 
> Reviewed-by: Dr. David Alan Gilbert 

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] Memory on stellaris board

2016-02-03 Thread Peter Maydell
On 3 February 2016 at 13:00, Aurelio Remonda
 wrote:
> Hello, i was trying to understand how does the sram and flash size
> works on lm3s6965evb, i found a hardcoded 0x00ff007f as dc0 value and
> how flash_size and sram_size are calculated based on that hexadecimal.
> I mean this:
>
> flash_size = (((board->dc0 & 0x) + 1) << 1) * 1024;
> sram_size = ((board->dc0 >> 18) + 1) * 1024;
> On stellaris.c
>
> When i use the -m [size] flag while running qemu how does the flag
> affect does flash_size and sram_size values? Or it doesn't? Where can
> i see that the memory has indeed change?

This board model ignores -m. We just implement a model of this particular
bit of hardware, which has a fixed amount of RAM in it.

thanks
-- PMM



Re: [Qemu-devel] [PATCH v1 07/22] migration: introduce a new QEMUFile impl based on QIOChannel

2016-02-03 Thread Daniel P. Berrange
On Tue, Feb 02, 2016 at 05:06:24PM +, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > Introduce a new QEMUFile implementation that is based on
> > the QIOChannel objects. This impl is different from existing
> > impls in that there is no file descriptor that can be made
> > available, as some channels may be based on higher level
> > protocols such as TLS.
> > 
> > Although the QIOChannel based implementation can trivially
> > provide a bi-directional stream, initially we have separate
> > functions for opening input & output directions to fit with
> > the expectation of the current QEMUFile interface.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  include/migration/qemu-file.h |   4 +
> >  migration/Makefile.objs   |   1 +
> >  migration/qemu-file-channel.c | 201 
> > ++
> >  3 files changed, 206 insertions(+)
> >  create mode 100644 migration/qemu-file-channel.c

> > +static ssize_t channel_writev_buffer(void *opaque,
> > + struct iovec *iov,
> > + int iovcnt,
> > + int64_t pos)
> > +{
> > +QIOChannel *ioc = QIO_CHANNEL(opaque);
> > +ssize_t done = 0;
> > +ssize_t want = iov_size(iov, iovcnt);
> > +struct iovec oldiov = { NULL, 0 };
> > +
> > +while (done < want) {
> > +ssize_t len;
> > +struct iovec *cur = iov;
> > +int curcnt = iovcnt;
> > +
> > +channel_skip_iov(done, &cur, &curcnt, &oldiov);
> > +
> > +len = qio_channel_writev(ioc, cur, curcnt, NULL);
> > +if (oldiov.iov_base) {
> > +/* Restore original caller's info in @iov */
> > +cur[0].iov_base = oldiov.iov_base;
> > +cur[0].iov_len = oldiov.iov_len;
> > +oldiov.iov_base = NULL;
> > +oldiov.iov_len = 0;
> > +}
> > +if (len == QIO_CHANNEL_ERR_BLOCK) {
> > +qio_channel_wait(ioc, G_IO_OUT);
> > +continue;
> > +}
> > +if (len < 0) {
> > +/* XXX handle Error objects */
> > +return -EIO;
> 
> It's best to return 'len' rather than lose what little
> error information you had (similarly below).

In this case 'len' is in fact just '-1', as all error
info is in the Error ** parameter, but the QEMUFile API
contract requires an errno value. So we don't have much
choice but to return a fixed EIO - returning 'len' would
also be a fixed errno - whichever errno corresponds to
the value -1.

I'd like to switch QEMUFile over to use Error **errp
parameters for error reporting, so that we can make
detailed error info available throughout the migration
I/O code. That ought to wait until after this series
is done though, to avoid complicating it yet more.

> 
> > +}
> > +
> > +done += len;
> > +}
> > +return done;
> > +}
> > +
> > +
> > +static ssize_t channel_get_buffer(void *opaque,
> > +  uint8_t *buf,
> > +  int64_t pos,
> > +  size_t size)
> > +{
> > +QIOChannel *ioc = QIO_CHANNEL(opaque);
> > +ssize_t ret;
> > +
> > + reread:
> > +ret = qio_channel_read(ioc, (char *)buf, size, NULL);
> > +if (ret < 0) {
> > +if (ret == QIO_CHANNEL_ERR_BLOCK) {
> > +qio_channel_yield(ioc, G_IO_IN);
> > +goto reread;
> > +} else {
> > +/* XXX handle Error * object */
> > +return -EIO;
> > +}
> > +}
> > +return ret;
> 
> 
> I'd prefer a loop to a goto; generally the only places we
> use goto is an error exit.
> 
>   do {
>ret = qio_channel_read(ioc, (char *)buf, size, NULL);
>if (ret == QIO_CHANNEL_ERR_BLOCK) {
>qio_channel_yield(ioc, G_IO_IN);
>}
>   } while (ret == QIO_CHANNEL_ERR_BLOCK);
> 
>   return ret;
> 
> and IMHO the loop is clearer.

Ok, will change that.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[Qemu-devel] [PATCH 7/7] target-arm: Enable EL3 for Cortex-A53 and Cortex-A57

2016-02-03 Thread Peter Maydell
Enable EL3 support for our Cortex-A53 and Cortex-A57 CPU models.
We have enough implemented now to be able to run real world code
at least to some extent (I can boot ARM Trusted Firmware to the
point where it pulls in OP-TEE and then falls over because it
doesn't have a UEFI image it can chain to).

Signed-off-by: Peter Maydell 
---
 target-arm/cpu64.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target-arm/cpu64.c b/target-arm/cpu64.c
index cc177bb..073677b5 100644
--- a/target-arm/cpu64.c
+++ b/target-arm/cpu64.c
@@ -109,6 +109,7 @@ static void aarch64_a57_initfn(Object *obj)
 set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
 set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
 set_feature(&cpu->env, ARM_FEATURE_CRC);
+set_feature(&cpu->env, ARM_FEATURE_EL3);
 cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A57;
 cpu->midr = 0x411fd070;
 cpu->revidr = 0x;
@@ -161,6 +162,7 @@ static void aarch64_a53_initfn(Object *obj)
 set_feature(&cpu->env, ARM_FEATURE_V8_SHA256);
 set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
 set_feature(&cpu->env, ARM_FEATURE_CRC);
+set_feature(&cpu->env, ARM_FEATURE_EL3);
 cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A53;
 cpu->midr = 0x410fd034;
 cpu->revidr = 0x;
-- 
1.9.1




[Qemu-devel] [PATCH 2/7] target-arm: Implement MDCR_EL3 and SDCR

2016-02-03 Thread Peter Maydell
Implement the MDCR_EL3 register (which is SDCR for AArch32).
For the moment we implement it as reads-as-written.

Signed-off-by: Peter Maydell 
---
 target-arm/cpu.h|  1 +
 target-arm/helper.c | 24 
 2 files changed, 25 insertions(+)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 52284e9..cf2df50 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -382,6 +382,7 @@ typedef struct CPUARMState {
 uint64_t mdscr_el1;
 uint64_t oslsr_el1; /* OS Lock Status */
 uint64_t mdcr_el2;
+uint64_t mdcr_el3;
 /* If the counter is enabled, this stores the last time the counter
  * was reset. Otherwise it stores the counter value
  */
diff --git a/target-arm/helper.c b/target-arm/helper.c
index b631b83..8b96b80 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -364,6 +364,23 @@ static CPAccessResult 
access_el3_aa32ns_aa64any(CPUARMState *env,
 return CP_ACCESS_OK;
 }
 
+/* Some secure-only AArch32 registers trap to EL3 if used from
+ * Secure EL1 (but are just ordinary UNDEF in other non-EL3 contexts).
+ * We assume that the .access field is set to PL1_RW.
+ */
+static CPAccessResult access_trap_aa32s_el1(CPUARMState *env,
+const ARMCPRegInfo *ri)
+{
+if (arm_current_el(env) == 3) {
+return CP_ACCESS_OK;
+}
+if (arm_is_secure_below_el3(env)) {
+return CP_ACCESS_TRAP_EL3;
+}
+/* This will be EL1 NS and EL2 NS, which just UNDEF */
+return CP_ACCESS_TRAP_UNCATEGORIZED;
+}
+
 static void dacr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t 
value)
 {
 ARMCPU *cpu = arm_env_get_cpu(env);
@@ -3532,6 +3549,13 @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
   .cp = 15, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 0,
   .access = PL3_RW, .fieldoffset = offsetoflow32(CPUARMState, 
cp15.scr_el3),
   .writefn = scr_write },
+{ .name = "MDCR_EL3", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 3, .opc2 = 1,
+  .access = PL3_RW, .fieldoffset = offsetof(CPUARMState, cp15.mdcr_el3) },
+{ .name = "SDCR", .type = ARM_CP_ALIAS,
+  .cp = 15, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 1,
+  .access = PL1_RW, .accessfn = access_trap_aa32s_el1,
+  .fieldoffset = offsetoflow32(CPUARMState, cp15.mdcr_el3) },
 { .name = "SDER32_EL3", .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 6, .crn = 1, .crm = 1, .opc2 = 1,
   .access = PL3_RW, .resetvalue = 0,
-- 
1.9.1




[Qemu-devel] [PATCH 5/7] target-arm: Add isread parameter to CPAccessFns

2016-02-03 Thread Peter Maydell
System registers might have access requirements which need to
be described via a CPAccessFn and which differ for reads and
writes. For this to be possible we need to pass the access
function a parameter to tell it whether the access being checked
is a read or a write.

Signed-off-by: Peter Maydell 
---
 target-arm/cpu.h   |  4 ++-
 target-arm/helper.c| 81 +-
 target-arm/helper.h|  2 +-
 target-arm/op_helper.c |  5 +--
 target-arm/translate-a64.c |  6 ++--
 target-arm/translate.c |  7 ++--
 6 files changed, 68 insertions(+), 37 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 0fb79d0..5137632 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1319,7 +1319,9 @@ typedef uint64_t CPReadFn(CPUARMState *env, const 
ARMCPRegInfo *opaque);
 typedef void CPWriteFn(CPUARMState *env, const ARMCPRegInfo *opaque,
uint64_t value);
 /* Access permission check functions for coprocessor registers. */
-typedef CPAccessResult CPAccessFn(CPUARMState *env, const ARMCPRegInfo 
*opaque);
+typedef CPAccessResult CPAccessFn(CPUARMState *env,
+  const ARMCPRegInfo *opaque,
+  bool isread);
 /* Hook function for register reset */
 typedef void CPResetFn(CPUARMState *env, const ARMCPRegInfo *opaque);
 
diff --git a/target-arm/helper.c b/target-arm/helper.c
index d85b04f..8bc3ea3 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -344,7 +344,8 @@ void init_cpreg_list(ARMCPU *cpu)
  * access_el3_aa32ns_aa64any: Used to check both AArch32/64 register views.
  */
 static CPAccessResult access_el3_aa32ns(CPUARMState *env,
-const ARMCPRegInfo *ri)
+const ARMCPRegInfo *ri,
+bool isread)
 {
 bool secure = arm_is_secure_below_el3(env);
 
@@ -356,10 +357,11 @@ static CPAccessResult access_el3_aa32ns(CPUARMState *env,
 }
 
 static CPAccessResult access_el3_aa32ns_aa64any(CPUARMState *env,
-const ARMCPRegInfo *ri)
+const ARMCPRegInfo *ri,
+bool isread)
 {
 if (!arm_el_is_aa64(env, 3)) {
-return access_el3_aa32ns(env, ri);
+return access_el3_aa32ns(env, ri, isread);
 }
 return CP_ACCESS_OK;
 }
@@ -369,7 +371,8 @@ static CPAccessResult access_el3_aa32ns_aa64any(CPUARMState 
*env,
  * We assume that the .access field is set to PL1_RW.
  */
 static CPAccessResult access_trap_aa32s_el1(CPUARMState *env,
-const ARMCPRegInfo *ri)
+const ARMCPRegInfo *ri,
+bool isread)
 {
 if (arm_current_el(env) == 3) {
 return CP_ACCESS_OK;
@@ -651,7 +654,8 @@ static void cpacr_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 env->cp15.cpacr_el1 = value;
 }
 
-static CPAccessResult cpacr_access(CPUARMState *env, const ARMCPRegInfo *ri)
+static CPAccessResult cpacr_access(CPUARMState *env, const ARMCPRegInfo *ri,
+   bool isread)
 {
 if (arm_feature(env, ARM_FEATURE_V8)) {
 /* Check if CPACR accesses are to be trapped to EL2 */
@@ -668,7 +672,8 @@ static CPAccessResult cpacr_access(CPUARMState *env, const 
ARMCPRegInfo *ri)
 return CP_ACCESS_OK;
 }
 
-static CPAccessResult cptr_access(CPUARMState *env, const ARMCPRegInfo *ri)
+static CPAccessResult cptr_access(CPUARMState *env, const ARMCPRegInfo *ri,
+  bool isread)
 {
 /* Check if CPTR accesses are set to trap to EL3 */
 if (arm_current_el(env) == 2 && (env->cp15.cptr_el[3] & CPTR_TCPAC)) {
@@ -710,7 +715,8 @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
 REGINFO_SENTINEL
 };
 
-static CPAccessResult pmreg_access(CPUARMState *env, const ARMCPRegInfo *ri)
+static CPAccessResult pmreg_access(CPUARMState *env, const ARMCPRegInfo *ri,
+   bool isread)
 {
 /* Performance monitor registers user accessibility is controlled
  * by PMUSERENR.
@@ -1154,7 +1160,8 @@ static void teecr_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 env->teecr = value;
 }
 
-static CPAccessResult teehbr_access(CPUARMState *env, const ARMCPRegInfo *ri)
+static CPAccessResult teehbr_access(CPUARMState *env, const ARMCPRegInfo *ri,
+bool isread)
 {
 if (arm_current_el(env) == 0 && (env->teecr & 1)) {
 return CP_ACCESS_TRAP;
@@ -1207,7 +1214,8 @@ static const ARMCPRegInfo v6k_cp_reginfo[] = {
 
 #ifndef CONFIG_USER_ONLY
 
-static CPAccessResult gt_cntfrq_access(CPUARMState *env, const ARMCPRegInfo 
*ri)
+static CPAccessResult gt_cntfrq_access(CPUARMState *env, const ARMCPRegInfo 
*ri,
+   bool isre

Re: [Qemu-devel] [PATCH 07/10] virtio: combine the read of a descriptor

2016-02-03 Thread Paolo Bonzini


On 03/02/2016 13:34, Gonglei (Arei) wrote:
> Hi,
> 
>> Subject: [Qemu-devel] [PATCH 07/10] virtio: combine the read of a descriptor
>>
>> Compared to vring, virtio has a performance penalty of 10%.  Fix it
>> by combining all the reads for a descriptor in a single address_space_read
>> call.  This also simplifies the code nicely.
>>
>> Reviewed-by: Cornelia Huck 
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  hw/virtio/virtio.c | 86 
>> ++
>>  1 file changed, 35 insertions(+), 51 deletions(-)
>>
> 
> Unbelievable! After applying this patch, the virtio-crypto speed can attach 
> 74MB/sec, host
> Cpu overhead is 180% (the main thread 100% and vcpu threads 80%)

The three patches from Vincenzo will help too.  What was it like before?

Also, are you using ioeventfd or dataplane?  virtio-crypto sounds like
something that could be very easily run outside the "big QEMU lock".

Paolo

> Testing AES-128-CBC cipher: 
> Encrypting in chunks of 256 bytes: done. 371.94 MiB in 5.02 secs: 
> 74.12 MiB/sec (1523475 packets)
> Encrypting in chunks of 256 bytes: done. 369.85 MiB in 5.01 secs: 
> 73.88 MiB/sec (1514900 packets)
> Encrypting in chunks of 256 bytes: done. 371.07 MiB in 5.02 secs: 
> 73.97 MiB/sec (1519914 packets)
> Encrypting in chunks of 256 bytes: done. 371.66 MiB in 5.02 secs: 
> 74.09 MiB/sec (1522309 packets)
> Encrypting in chunks of 256 bytes: done. 371.79 MiB in 5.02 secs: 
> 74.12 MiB/sec (1522868 packets)
> Encrypting in chunks of 256 bytes: done. 371.94 MiB in 5.02 secs: 
> 74.15 MiB/sec (1523457 packets)
> Encrypting in chunks of 256 bytes: done. 371.90 MiB in 5.02 secs: 
> 74.14 MiB/sec (1523317 packets)
> Encrypting in chunks of 256 bytes: done. 371.71 MiB in 5.02 secs: 
> 74.10 MiB/sec (1522522 packets)
> 
> 15.95%  qemu-kvm [.] address_space_translate
>   6.98%  qemu-kvm [.] qemu_get_ram_ptr
>   4.87%  libpthread-2.19.so   [.] __pthread_mutex_unlock_usercnt
>   4.40%  qemu-kvm [.] qemu_ram_addr_from_host
>   3.79%  qemu-kvm [.] address_space_map
>   3.41%  libc-2.19.so [.] _int_malloc
>   3.29%  libc-2.19.so [.] _int_free
>   3.07%  libc-2.19.so [.] malloc
>   2.95%  libpthread-2.19.so   [.] pthread_mutex_lock
>   2.94%  qemu-kvm [.] phys_page_find
>   2.73%  qemu-kvm [.] address_space_translate_internal
>   2.65%  libc-2.19.so [.] malloc_consolidate
>   2.35%  libc-2.19.so [.] __memcpy_sse2_unaligned
>   1.72%  qemu-kvm [.] find_next_zero_bit
>   1.38%  qemu-kvm [.] address_space_rw
>   1.34%  qemu-kvm [.] object_unref
>   1.30%  qemu-kvm [.] object_ref
>   1.28%  qemu-kvm [.] virtqueue_pop
>   1.20%  libc-2.19.so [.] memset
>   1.11%  qemu-kvm [.] virtio_notify
> 
> Thank you so much!
> 
> Regards,
> -Gonglei
> 
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 79a635f..2433866 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
>> @@ -107,35 +107,15 @@ void virtio_queue_update_rings(VirtIODevice *vdev,
>> int n)
>>vring->align);
>>  }
>>
>> -static inline uint64_t vring_desc_addr(VirtIODevice *vdev, hwaddr desc_pa,
>> -   int i)
>> +static void vring_desc_read(VirtIODevice *vdev, VRingDesc *desc,
>> +hwaddr desc_pa, int i)
>>  {
>> -hwaddr pa;
>> -pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, addr);
>> -return virtio_ldq_phys(vdev, pa);
>> -}
>> -
>> -static inline uint32_t vring_desc_len(VirtIODevice *vdev, hwaddr desc_pa, 
>> int
>> i)
>> -{
>> -hwaddr pa;
>> -pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, len);
>> -return virtio_ldl_phys(vdev, pa);
>> -}
>> -
>> -static inline uint16_t vring_desc_flags(VirtIODevice *vdev, hwaddr desc_pa,
>> -int i)
>> -{
>> -hwaddr pa;
>> -pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, flags);
>> -return virtio_lduw_phys(vdev, pa);
>> -}
>> -
>> -static inline uint16_t vring_desc_next(VirtIODevice *vdev, hwaddr desc_pa,
>> -   int i)
>> -{
>> -hwaddr pa;
>> -pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, next);
>> -return virtio_lduw_phys(vdev, pa);
>> +address_space_read(&address_space_memory, desc_pa + i *
>> sizeof(VRingDesc),
>> +   MEMTXATTRS_UNSPECIFIED, (void *)desc,
>> sizeof(VRingDesc));
>> +virtio_tswap64s(vdev, &desc->addr);
>> +virtio_tswap32s(vdev, &desc->len);
>> +virtio_tswap16s(vdev, &desc->flags);
>> +virtio_tswap16s(vdev, &desc->next);
>>  }
>>
>>  static inline uint16_t vring_avail_flags(VirtQueue *vq)
>> @@ -345,18 +325,18 @@ 

Re: [Qemu-devel] [PATCH v6 1/5] util: Introduce error reporting functions with fatal/abort

2016-02-03 Thread Lluís Vilanova
Markus Armbruster writes:

> Thomas Huth  writes:
>> On 03.02.2016 10:48, Markus Armbruster wrote:
>>> David Gibson  writes:
>>> 
 On Tue, Feb 02, 2016 at 10:47:35PM +0100, Thomas Huth wrote:
> On 02.02.2016 19:53, Markus Armbruster wrote:
>> Lluís Vilanova  writes:
> ...
> 
>>> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
>>> index 7ab2355..6c2f142 100644
>>> --- a/include/qemu/error-report.h
>>> +++ b/include/qemu/error-report.h
>>> @@ -43,4 +43,23 @@ void error_report(const char *fmt, ...) 
>>> GCC_FMT_ATTR(1, 2);
>>> const char *error_get_progname(void);
>>> extern bool enable_timestamp_msg;
>>> 
>>> +/* Report message and exit with error */
>>> +void QEMU_NORETURN error_vreport_fatal(const char *fmt, va_list ap) 
>>> GCC_FMT_ATTR(1, 0);
>>> +void QEMU_NORETURN error_report_fatal(const char *fmt, ...) 
>>> GCC_FMT_ATTR(1, 2);
>> 
>> This lets people write things like
>> 
>> error_report_fatal("The sky is falling");
>> 
>> instead of
>> 
>> error_report("The sky is falling");
>> exit(1);
>> 
>> or
>> 
>> fprintf(stderr, "The sky is falling\n");
>> exit(1);
>> 
>> I don't think that's an improvement in clarity.
> 
> The problem is not the existing code, but that in a couple of new
> patches, I've now already seen that people are trying to use
> 
> error_setg(&error_fatal, ... );
 
 So, I don't actually see any real advantage to error_report_fatal(...)
 over error_setg(&error_fatal, ...).
>>> 
>>> I do.  Compare:
>>> 
>>> (a) error_report(...);
>>> exit(1);
>>> 
>>> (b) error_report_fatal(...);
>>> 
>>> (c) error_setg(&error_fatal, ...);
>>> 
>>> In my opinion, (a) is clearest: even a relatively clueless reader will
>>> know what exit(1) does, can guess what error_report() approximately
>>> does, and doesn't need to know what it does exactly.  (b) is slightly
>>> less obvious, and (c) is positively opaque.
>>> 
>>> Let's stick to the obvious (a) and be done with it.
>> 
>> Ok, (a) is fine for me too, as long as we avoid (c). Lluís, could you
>> maybe add that information to your patch that updates the HACKING text?

> I feel such detailed advice belings into error.h.  Sketch appended.

> If that doesn't succeed in keeping (c) out, make checkpatch flag it.

>> (and sorry for the fuzz with error_report_fatal() ... I thought it would
>> be a good solution to avoid (c), but if (a) is preferred instead, then
>> we should go with that solution instead).

I can easily change that, no problem. I'm just happy consensus is landing on
this subject.


>> And, by the way, what about the spots that currently already use
>> error_setg(&error_abort, ) ? Should they be turned into
>> error_report() + abort() instead? Or only abort(), without error
>> message, since abort() is only about programming errors?

> As I wrote in my first reply to this thread, I'd like them to be cleaned
> up to just abort() or assert().

> I like assert(), because it gives me exactly what I can use to debug the
> programming error: a core dump (if enabled) and a source location
> (useful when no core dump).  I never bought the argument that we should
> use abort() instead of assert(0) because "what if NDEBUG?!?".  If you
> define NDEBUG, our 600+ abort()s won't save you from our 4000+
> assert()s.

Sorry, but I don't buy the argument of, "I prefer assert() because there's
already lots of them". To me, there's a semantic difference between debug builds
and regular ones (aka, assert vs abort). Also, I think it adds to the confusion
that assert and abort seem to be used interchangeably in the code.

What about this definition?

* exit(): user-triggered errors
* abort(): general programming errors
* assert(): additional sanity/consistency checks against programming errors

Now, abort & assert have an overlap. Should we discourage one in favour of the
other?

Also:

* error_report_fatal ensures the same exit code is always used (otherwise it can
  fail with inconsistent error codes)
* error_report_abort brings the code information of assert into abort

But of course, I'm happy either way :)


> diff --git a/include/qapi/error.h b/include/qapi/error.h
> index 45d6c72..ea7e74f 100644
> --- a/include/qapi/error.h
> +++ b/include/qapi/error.h
> @@ -162,6 +162,9 @@ ErrorClass error_get_class(const Error *err);
>   * human-readable error message is made from printf-style @fmt, ...
>   * The resulting message should be a single phrase, with no newline or
>   * trailing punctuation.
> + * Please don't error_setg(&error_fatal, ...), use error_report() and
> + * exit(), because that's more obvious.
> + * Likewise, don't error_setg(&error_abort, ...), use assert().
>   */
>  #define error_setg(errp, fmt, ...)  \
>  error_setg_internal((errp), __FILE__, __LINE__, __func__,   \
> @@ -213,6 +216,8 @@ void error_setg_win32_internal(Er

Re: [Qemu-devel] Memory on stellaris board

2016-02-03 Thread Aurelio Remonda
On Wed, Feb 3, 2016 at 10:34 AM, Peter Maydell  wrote:
> On 3 February 2016 at 13:00, Aurelio Remonda
>  wrote:
>> Hello, i was trying to understand how does the sram and flash size
>> works on lm3s6965evb, i found a hardcoded 0x00ff007f as dc0 value and
>> how flash_size and sram_size are calculated based on that hexadecimal.
>> I mean this:
>>
>> flash_size = (((board->dc0 & 0x) + 1) << 1) * 1024;
>> sram_size = ((board->dc0 >> 18) + 1) * 1024;
>> On stellaris.c
>>
>> When i use the -m [size] flag while running qemu how does the flag
>> affect does flash_size and sram_size values? Or it doesn't? Where can
>> i see that the memory has indeed change?
>
> This board model ignores -m. We just implement a model of this particular
> bit of hardware, which has a fixed amount of RAM in it.

Thanks for the quick answer, do you think it is worth to make the m
flag work for this model?
Can you give me a hint on where to look?(another board that use it) so
i can add this feature
for this model.
Thanks!


-- 
Aurelio Remonda

Software Engineer

San Lorenzo 47, 3rd Floor, Office 5
Córdoba, Argentina
Phone: +54-351-4217888 / 4218211



Re: [Qemu-devel] [PATCH v9] spec: add qcow2 bitmaps extension specification

2016-02-03 Thread Vladimir Sementsov-Ogievskiy

On 03.02.2016 11:04, Fam Zheng wrote:

On Tue, 02/02 09:35, Vladimir Sementsov-Ogievskiy wrote:

The new feature for qcow2: storing bitmaps.

This patch adds new header extension to qcow2 - Bitmaps Extension. It
provides an ability to store virtual disk related bitmaps in a qcow2
image. For now there is only one type of such bitmaps: Dirty Tracking
Bitmap, which just tracks virtual disk changes from some moment.

Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
should be stored in this qcow2 file. The size of each bitmap
(considering its granularity) is equal to virtual disk size.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

v9
 - rewordings, thanks to Max

v8
 - rewordings
 - bitmap_directory_size: 4b -> 8b
 - add more descriptive description in == Bitmaps == section
 - add paragraph "Dirty tracking bitmaps"

   Bitmap directory entry:
 - extra data should not allocate additional clusters
 - padding must be all-bytes-zero
 - add extra_data_compatible flag (now behavior in case of unknown
   extra data is defined by this flag)

v7:

- Rewordings, grammar.
   Max, Eric, John, thank you very much.

- add last paragraph: remaining bits in bitmap data clusters must be
   zero.

- s/Bitmap Directory/bitmap directory/ and other names like this at
   the request of Max.

v6:

- reword bitmap_directory_size description
- bitmap type: make 0 reserved
- extra_data_size: resize to 4bytes
   Also, I've marked this field as "must be zero". We can always change
   it, if we decide allowing managing app to specify any extra data, by
   defining some magic value as a top of user extra data.. So, for now
   non zeor extra_data_size should be considered as an error.
- swap name and extra_data to give good alignment to extra_data.


v5:

- 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of
   bitmaps.
- rewordings
- move upper bounds to "Notes about Qemu limits"
- s/should/must somewhere. (but not everywhere)
- move name_size field closer to name itself in bitmap header
- add extra data area to bitmap header
- move bitmap data description to separate section


  docs/specs/qcow2.txt | 223 ++-
  1 file changed, 222 insertions(+), 1 deletion(-)

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index f236d8c..db5e666 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -103,7 +103,18 @@ in the description of a field.
  write to an image with unknown auto-clear features if it
  clears the respective bits from this field first.
  
-Bits 0-63:  Reserved (set to 0)

+Bit 0:  Bitmaps extension bit
+This bit indicates consistency for the bitmaps
+extension data.
+
+It is an error if this bit is set without the
+bitmaps extension present.
+
+If the bitmaps extension is present but this
+bit is unset, the bitmaps extension data must 
be
+considered inconsistent.
+
+Bits 1-63:  Reserved (set to 0)
  
   96 -  99:  refcount_order

  Describes the width of a reference count block entry 
(width
@@ -123,6 +134,7 @@ be stored. Each extension has a structure like the 
following:
  0x - End of the header extension area
  0xE2792ACA - Backing file format name
  0x6803f857 - Feature name table
+0x23852875 - Bitmaps extension
  other  - Unknown header extension, can be safely
   ignored
  
@@ -166,6 +178,36 @@ the header extension data. Each entry look like this:

  terminated if it has full length)
  
  
+== Bitmaps extension ==

+
+The bitmaps extension is an optional header extension. It provides the ability
+to store bitmaps related to a virtual disk. For now, there is only one bitmap
+type: the dirty tracking bitmap, which tracks virtual disk changes from some
+point in time.
+
+The data of the extension should be considered consistent only if the
+corresponding auto-clear feature bit is set, see autoclear_features above.
+
+The fields of the bitmaps extension are:
+
+Byte  0 -  3:  nb_bitmaps
+   The number of bitmaps contained in the image. Must be
+   greater than or equal to 1.
+
+   Note: Qemu currently only supports up to 65535 bitmaps per
+   image.
+
+  4 -  7:  Reserved, must be zero.
+
+  8 - 15:  bitmap_directory_size
+   Size of the bitmap directory in bytes. It is the cumulative
+   size of all (nb_bitmaps) bitmap headers.
+
+  

  1   2   3   4   >