date:20131205

Re: [Qemu-devel] [PATCH 1/3] scsi-disk: close drive on START_STOP

2013-12-05 Thread Markus Armbruster

Alexey Kardashevskiy  writes:

> On 12/05/2013 12:12 AM, Markus Armbruster wrote:
>> Alexey Kardashevskiy  writes:
>> 
>>> On 12/04/2013 08:33 PM, Markus Armbruster wrote:
>>>> Paolo Bonzini writes:
>>>>
>>>>> On 04/12/2013 05:55, Alexey Kardashevskiy wrote:
>>>>>> Normally the user is expected to eject DVD if it is not locked by
>>>>>> the guest. eject_device() makes a few checks and calls bdrv_close()
>>>>>> if DVD is not in use.
>>>>>>
>>>>>> However it is still possible to eject DVD even if it is in use.
>>>>>> For that, QEMU sets "eject requested" flag, the guest reads it, issues
>>>>>> ALLOW_MEDIUM_REMOVAL(enable=1) and START_STOP(start=0). But in this case,
>>>>>> bdrv_close() is not called anywhere so it remains "inserted" in QEMU's
>>>>>> terms.
>>>>>
>>>>> This is expected behavior, and matches what IDE does.
>>>>>
>>>>> Markus, can you confirm?
>>>>
>>>> Confirmed.  See commit 4be9762.
>>>>
>>>> Alexey, monitor command eject does two things: it first opens the tray,
>>>> and if that works, it removes the medium.
>>>>
>>>> If the tray is locked closed, it tells the device model that eject was
>>>> requested.  Works just like the physical eject button.
>>>>
>>>> With -f, it then rips out the medium.  This is similar to opening the
>>>> tray with an unbent paperclip.  Let's ignore this case.
>>>>
>>>> The scsi-cd device model tells the guest about the eject request.  A
>>>> well-behaved guest will then command the device to unlock and open the
>>>> tray.
>>>>
>>>> The guest uses the same commands on behalf of its applications,
>>>> e.g. /usr/bin/eject.
>>>>
>>>> Your patch changes behavior of "eject /dev/sr0 && eject -t /dev/sr0":
>>>> you no longer get the same medium back.  You normally do with real
>>>> hardware.
>>>>
>>>> The somewhat unfortunate consequence is that monitor command eject can
>>>> only remove the medium when the tray is not locked.
>>>
>>>
>>> Oh. Wow. Nice :-/
>>>
>>> Ok. So. It is expected that the real system will close the tray back if it
>>> was mounted, is not it?
>>>
>>> Right now, after "eject" "info block" is like this:
>>>
>>> cd1: virtimg/Fedora-19-ppc64-netinst.iso (raw)
>>> Removable device: locked, tray open
>>>
>>> And the mountpoint does not work in the guest. The state above even
>>> persists after "umount" in the guest. It only becomes correct again
>>> (tray==closed) when I mount DVD again.
>>>
>>> Is it all expected to work like this? Thanks.
>> 
>> Can't reproduce, but can reproduce something similar.  Freshly booted
>> guest running RHEL-7 alpha, with the CD mounted:
>> 
>> (qemu) info block cd
>> 
>> cd: r7.iso (raw, read-only)
>> Removable device: locked, tray closed
>> 
>> Looks good.  Try to eject:
>> 
>> (qemu) eject cd
>> Device 'cd' is locked
>> 
>> Looks good.  This should have signalled the guest "user wants to eject".
>> The guest should either ignore it, or unmount, unlock and eject.
>> Apparently, it does that:
>> 
>> (qemu) info block cd
>> 
>> cd: r7.iso (raw, read-only)
>> Removable device: locked, tray closed
>> (qemu) eject cd
>> Device 'cd' is locked
>> (qemu) info block cd
>> 
>> cd: r7.iso (raw, read-only)
>> Removable device: locked, tray closed
>> (qemu) info block cd
>> 
>> cd: r7.iso (raw, read-only)
>> Removable device: not locked, tray open
>> 
>> Except it forgets to unmount!  dmesg has "VFS: busy inodes on changed
>> media or resized disk sr0".
>> 
>> Need somebody to find out how exactly this fails, and whether it's a
>> guest bug or a QEMU bug.
>
>
> The guest unlocks DVD (by sending PREVENT ALLOW MEDIUM REMOVAL) and stops
> DVD (by sending START_STOP). Is there any other message missing which would
> do real physical eject?

START_STOP has a "load/eject" flag that causes load with start and eject
with stop.
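
The flag can be illustrated with a small decoder. This is only a sketch, assuming the usual START STOP UNIT layout in which byte 4 of the CDB carries START in bit 0 and LOEJ (load/eject) in bit 1; the enum and function names are made up for the example and are not QEMU's, and the power condition field is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* What a START STOP UNIT command asks the tray to do (illustrative names). */
enum tray_action { TRAY_NO_CHANGE, TRAY_OPEN, TRAY_CLOSE };

/*
 * Decode byte 4 of a 6-byte START STOP UNIT CDB.
 * Assumed layout: bit 0 = START, bit 1 = LOEJ (load/eject).
 * Without LOEJ the tray is left alone; with LOEJ, START=1 loads
 * (closes the tray) and START=0 ejects (opens it).
 */
enum tray_action start_stop_tray_action(const uint8_t cdb[6])
{
    assert(cdb != NULL);

    bool start = cdb[4] & 0x01;
    bool loej = cdb[4] & 0x02;

    if (!loej) {
        return TRAY_NO_CHANGE;   /* plain stop: spin down, tray untouched */
    }
    return start ? TRAY_CLOSE : TRAY_OPEN;
}
```

Under that assumption, a guest that only sends START_STOP with start=0 and no load/eject bit stops the drive but does not open the tray.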

> What does it have to do with unmount (which is purely the guest software
> state)?

Not sure I understand you here.

A guest that voluntarily ejects a medium while keeping it mounted gets
what it asked for: breakage.
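
For reference, the semantics described earlier in this thread can be written down as a toy model. This is a sketch under my own naming, not the scsi-disk code: monitor eject on a locked tray only records a request, a guest-initiated eject opens the tray but keeps the medium, and monitor eject on an unlocked tray opens it and removes the medium.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy CD drive state; field names are illustrative, not QEMU's. */
struct toy_cd {
    bool medium_present;
    bool tray_open;
    bool tray_locked;
    bool eject_requested;
};

/* Monitor "eject" without -f. Returns false when the tray is locked. */
bool toy_monitor_eject(struct toy_cd *cd)
{
    assert(cd != NULL);
    if (cd->tray_locked) {
        cd->eject_requested = true;   /* like pressing the eject button */
        return false;
    }
    cd->tray_open = true;
    cd->medium_present = false;       /* tray opened, so remove the medium */
    return true;
}

/* Guest locks or unlocks the tray. */
void toy_guest_set_lock(struct toy_cd *cd, bool locked)
{
    assert(cd != NULL);
    cd->tray_locked = locked;
}

/* Guest START_STOP with load/eject set: start=false ejects, true loads. */
void toy_guest_load_eject(struct toy_cd *cd, bool start)
{
    assert(cd != NULL);
    if (!start && cd->tray_locked) {
        return;                       /* a locked tray stays closed */
    }
    cd->tray_open = !start;           /* the medium itself is untouched */
    if (!start) {
        cd->eject_requested = false;
    }
}
```

In this model "eject /dev/sr0 && eject -t /dev/sr0" in the guest ends with the same medium in a closed tray, which is the behavior the patch would change.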

Re: [Qemu-devel] [RFC v5 2/5] hw/arm/digic: prepare DIGIC-based boards support

2013-12-05 Thread Peter Maydell

On 5 December 2013 00:20, Peter Crosthwaite wrote:
> Is hivecs-on-reset ideally a new ARM_FEATURE or is there a simpler
> conditional we can use at post_init time?

I think we want the property if (!arm_feature(ARM_FEATURE_M)).

-- PMM

Re: [Qemu-devel] [RFC v5 2/5] hw/arm/digic: prepare DIGIC-based boards support

2013-12-05 Thread Peter Maydell

On 5 December 2013 00:25, Peter Crosthwaite wrote:
> But the bootloader does this already. We have support for board
> configurable secondary bootloops. Is this as simple as supporting
> board configurable primary boot fragments?
>
> arm_boot needs to be patched to do its bootstrap magic with no -kernel
> arg I guess.

I'd really rather not extend the arm_boot code to more usage cases
if I can avoid it. It's really intended for loading kernels. In this case
the thing being loaded really is a ROM image, and the correct way
to handle this is to make the board model behave the same way
the hardware does and make the ROM image sit at the same place
in the memory map that the real ROM image does.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-12-05 Thread Matthias Brugger

2013/11/11 Stefan Hajnoczi :
> On Mon, Nov 11, 2013 at 11:00:45AM +0100, Matthias Brugger wrote:
>> 2013/11/5 Stefan Hajnoczi :
>> > I'd also like to see the thread pool implementation you wish to add
>> > before we add a layer of indirection which has no users yet.
>>
>> Fair enough, I will evaluate whether it makes more sense to implement a
>> new AIO infrastructure instead of trying to reuse the thread-pool.
>> Actually my implementation will differ in that we will have
>> several workerthreads, each of them having its own queue. The
>> requests will be distributed between them depending on an identifier.
>> The request function which the worker_thread calls will be the same as
>> when using aio=threads, so I'm not quite sure which is the way to go.
>> Any opinions and hints, like the one you gave, are highly appreciated.
>
> If I understand the slides you linked to correctly, the guest will pass
> an identifier with each request.  The host has worker threads allowing
> each stream of requests to be serviced independently.  The idea is to
> associate guest processes with unique identifiers.
>
> The guest I/O scheduler is supposed to submit requests in a way that
> meets certain policies (e.g. fairness between processes, deadlines,
> etc).
>
> Why is it necessary to push this task down into the host?  I don't
> understand the advantage of this approach except that maybe it works
> around certain misconfigurations, I/O scheduler quirks, or plain old
> bugs - all of which should be investigated and fixed at the source
> instead of adding another layer of code to mask them.

It is about I/O scheduling. CFQ, the state-of-the-art I/O scheduler,
merges adjacent requests from the same PID before dispatching them to
the disk.
If we can distinguish between the different threads of a virtual
machine that read/write a file, the I/O scheduler in the host can
merge requests in an effective way for sequential access. Qemu fails
at this because of its architecture. Apart from the fact that at the
moment there is no way to distinguish the guest threads from each other
(I'm working on some kernel patches), Qemu has one big queue from which
several workerthreads grab requests and dispatch them to the disk.
Even if you have one large read from just one thread in the guest, the
I/O scheduler in the host will get the requests from different PIDs (=
workerthreads) and won't be able to merge them.
In former versions, there was some work done to merge requests in
Qemu, but I don't think it was very useful, because you don't know
what the layout of the image file looks like on the physical disk.
Anyway, I think these code parts have been removed.
The only layer where you really know how the blocks of the virtual
disk image are distributed over the disk is the block layer of the
host. So you have to do the block request merging there. With the new
architecture this would come for free, as you can map every thread
from a guest to one workerthread of Qemu.
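
To make the proposed dispatch concrete, here is a minimal sketch of "one queue per worker, chosen by identifier". It is an assumption about the design, not the actual patches: the names, the fixed queue depth and the modulo mapping are all invented for illustration, and the threads themselves are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_WORKERS 4
#define QUEUE_DEPTH 16

/* One I/O request tagged with the guest-side stream identifier. */
struct request {
    uint32_t stream_id;
    uint64_t sector;
};

/* Each worker owns a private queue instead of sharing one big queue. */
struct worker_queue {
    struct request items[QUEUE_DEPTH];
    size_t count;
};

struct worker_pool {
    struct worker_queue queues[NUM_WORKERS];
};

/* The same identifier always maps to the same worker (and host PID). */
size_t pool_worker_for(uint32_t stream_id)
{
    return stream_id % NUM_WORKERS;
}

/* Queue a request; returns the worker index, or -1 if its queue is full. */
int pool_submit(struct worker_pool *pool, struct request req)
{
    assert(pool != NULL);

    size_t w = pool_worker_for(req.stream_id);
    struct worker_queue *q = &pool->queues[w];

    if (q->count == QUEUE_DEPTH) {
        return -1;
    }
    q->items[q->count++] = req;
    return (int)w;
}
```

With such a mapping, adjacent requests from one guest stream reach the host scheduler from a single worker, which is the property the merging argument relies on.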

Matthias

>
> Stefan

-- 
motzblog.wordpress.com

Re: [Qemu-devel] [PATCH v9 0/5] add allwinner A10 SoC support

2013-12-05 Thread Peter Maydell

On 5 December 2013 02:33, Li Guang  wrote:
> Peter Crosthwaite wrote:
>>
>> Hi Liguang,
>>
>> V9 has some checkpatch errors:
>>
>> [pcrost@xsjandreislx qemu]$ git format-patch HEAD~5
>> 0001-vmstate-add-VMSTATE_PTIMER_ARRAY.patch
>> 0002-hw-timer-add-allwinner-a10-timer.patch
>> 0003-hw-intc-add-allwinner-A10-interrupt-controller.patch
>> 0004-hw-arm-add-allwinner-a10-SoC-support.patch
>> 0005-hw-arm-add-cubieboard-support.patch
>> [pcrost@xsjandreislx qemu]$ ./scripts/checkpatch.pl 00*
>> ERROR: need consistent spacing around '*' (ctx:WxB)
>> #30: FILE: include/migration/vmstate.h:618:
>> +VMSTATE_ARRAY_OF_POINTER(_f, _s, _n, 0, vmstate_info_ptimer,
>> ptimer_state *)

> the error here seems subtle,
> 2 rules are used:
> 1. there should be a space both before and after '*'
> 2. there shouldn't be a space before ')'

Yes, sometimes checkpatch gets confused by macros, especially
ones like this whose arguments might be types. The correct
spacing here I think would be to have no space before or after
the '*'.

thanks
-- PMM

Re: [Qemu-devel] gpu and console chicken and egg

2013-12-05 Thread Gerd Hoffmann

  Hi,

> > Hmm, why does it depend on the UI?  Wasn't the plan to render into a
> > dma-buf no matter what?  Then either read the rendered result from the
> > dmabuf (non-gl UI like vnc) or let the (gl-capable) UI pass the dma-buf
> > to the compositor?
> 
> That would be the hopeful plan, however so far my brief investigation says
> I'm possibly being a bit naive with what EGL can do. I'm still talking to the
> EGL and wayland people about how best to model this, but either way
> this won't work with nvidia drivers which is a case we need to handle, so
> we need to interact between the UI GL usage and the renderer.

Hmm.  That implies we simply can't combine hardware-accelerated 3d
rendering with vnc, correct?

> Also
> non-Linux platforms would want this in some way I'd assume, at least
> so virtio-gpu is usable with qemu on them.

Yes, the non-3d part should have no linux dependency and should be
available on all platforms.

> GL isn't that simple, and I'm not sure I can make it that simple
> unfortunately, the renderer requires certain extensions on top of the
> base GL 2.1 and GL3.0. live migration with none might be the first
> answer, and then we'd have to expend serious effort on making live
> migration work for any sort of different GL drivers.
> Reading everything back while rendering continues could be a lot of
> fun. (or pain).

We probably want to start with gl={none,host} then.  Live migration only
supported with "none".

If we can't combine remote displays with 3d rendering (nvidia issue
above) live migration with 3d makes little sense anyway.

> I don't think this will let me change the feature bits though since the virtio
> PCI layer has already picked them up I think. I just wondered if we have any
> examples of changing features later.

I think you can.  There are no helper functions for it though, you
probably have to walk the data structures and fiddle with the bits
directly.

Maybe it is easier to just have a command line option to enable/disable
3d globally, and a global variable with the 3d status.  Being able to
turn off all 3d is probably useful anyway.  Either as standalone option
or as display option (i.e. -display sdl,3d={on,off,auto}).  Then do a
simple check for 3d availability when *parsing* the options.  That'll
also remove the need for the 3d option for virtio-gpu, it can just check
the global flag instead.
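
Roughly what I have in mind, as a sketch only. The option values come from the -display example above; the enum, the global and the function names are hypothetical, and the availability check is passed in as a parameter instead of probing GL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum opt3d { OPT3D_OFF, OPT3D_ON, OPT3D_AUTO };

/* Hypothetical global 3d status that devices such as virtio-gpu check. */
bool display_3d_enabled;

/* Parse the value of "3d=..."; returns 0 on success, -1 if unknown. */
int parse_3d_option(const char *value, enum opt3d *out)
{
    assert(value != NULL && out != NULL);

    if (strcmp(value, "on") == 0) {
        *out = OPT3D_ON;
    } else if (strcmp(value, "off") == 0) {
        *out = OPT3D_OFF;
    } else if (strcmp(value, "auto") == 0) {
        *out = OPT3D_AUTO;
    } else {
        return -1;
    }
    return 0;
}

/*
 * Resolve the option at parse time against what the host can do.
 * Returns -1 when 3d=on was requested but 3d is not available.
 */
int resolve_3d_option(enum opt3d opt, bool host_has_3d)
{
    switch (opt) {
    case OPT3D_OFF:
        display_3d_enabled = false;
        return 0;
    case OPT3D_AUTO:
        display_3d_enabled = host_has_3d;
        return 0;
    case OPT3D_ON:
        display_3d_enabled = host_has_3d;
        return host_has_3d ? 0 : -1;
    }
    return -1;
}
```

The point of resolving early is that the feature bits can then be computed from the global before the virtio PCI layer picks them up.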

> I should probably resubmit the multi-head changes and SDL2 changes and
> we should look at merging them first.

Yes.

> a) dma-buf/EGL, EGLimage vs EGLstream, nothing exists upstream, so
> unknown timeframe.
> I don't think we should block merging on this, also dma-buf doesn't
> exist on Windows/MacOSX so qemu there should still get virtio-gpu
> available.

Yes.  Merging virtio-gpu with 2d should not wait for 3d being finally
sorted.  3d is too much of a moving target still.

> c) GTK multi-head + GL support - I'd like the GTK UI to be capable of
> multi-head as well. My first attempt moved a lot of code around, I'm
> not really sure what the secondary head windows should contain vs the
> primary head.

Yes, the multihead UI design is the tricky part here.  I'd say don't try
to make the first draft too fancy.  I expect we will have quite some
discussions on that topic.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH] qdev: Keep global allocation counter per bus

2013-12-05 Thread Paolo Bonzini

On 04/12/2013 21:24, Alexander Graf wrote:
> When we have 2 separate qdev devices that both create a qbus of the
> same type without specifying a bus name or device name, we end up
> with two buses of the same name, such as ide.0 on the Mac machines:
> 
>   dev: macio-ide, id ""
>     bus: ide.0
>       type IDE
>   dev: macio-ide, id ""
>     bus: ide.0
>       type IDE
> 
> If we now spawn a device that connects to a ide.0 the last created
> bus gets the device, with the first created bus inaccessible to the
> command line.
> 
> After some discussion on IRC we concluded that the best quick fix way
> forward for this is to make automated bus-class type based allocation
> count a global counter. That's what this patch implements. With this
> we instead get
> 
>   dev: macio-ide, id ""
>     bus: ide.1
>       type IDE
>   dev: macio-ide, id ""
>     bus: ide.0
>       type IDE
> 
> on the example mentioned above.
> 
> CC: Paolo Bonzini 
> CC: Markus Armbruster 
> CC: Anthony Liguori 
> Signed-off-by: Alexander Graf 
> ---
>  hw/core/qdev.c         | 20 +++++++++++++-------
>  include/hw/qdev-core.h |  2 ++
>  2 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index e374a93..959130c 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -409,27 +409,33 @@ DeviceState *qdev_find_recursive(BusState *bus, const char *id)
>  static void qbus_realize(BusState *bus, DeviceState *parent, const char *name)
>  {
>      const char *typename = object_get_typename(OBJECT(bus));
> +    BusClass *bc;
>      char *buf;
> -    int i,len;
> +    int i, len, bus_id;
>  
>      bus->parent = parent;
>  
>      if (name) {
>          bus->name = g_strdup(name);
>      } else if (bus->parent && bus->parent->id) {
> -        /* parent device has id -> use it for bus name */
> +        /* parent device has id -> use it plus parent-bus-id for bus name */
> +        bus_id = bus->parent->num_child_bus;
> +
>          len = strlen(bus->parent->id) + 16;
>          buf = g_malloc(len);
> -        snprintf(buf, len, "%s.%d", bus->parent->id, bus->parent->num_child_bus);
> +        snprintf(buf, len, "%s.%d", bus->parent->id, bus_id);
>          bus->name = buf;
>      } else {
> -        /* no id -> use lowercase bus type for bus name */
> +        /* no id -> use lowercase bus type plus global bus-id for bus name */
> +        bc = BUS_GET_CLASS(bus);
> +        bus_id = bc->automatic_ids++;
> +
>          len = strlen(typename) + 16;
>          buf = g_malloc(len);
> -        len = snprintf(buf, len, "%s.%d", typename,
> -                       bus->parent ? bus->parent->num_child_bus : 0);
> -        for (i = 0; i < len; i++)
> +        len = snprintf(buf, len, "%s.%d", typename, bus_id);
> +        for (i = 0; i < len; i++) {
>              buf[i] = qemu_tolower(buf[i]);
> +        }
>          bus->name = buf;
>      }
>  
> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> index f2043a6..09f8527 100644
> --- a/include/hw/qdev-core.h
> +++ b/include/hw/qdev-core.h
> @@ -161,6 +161,8 @@ struct BusClass {
>      int (*reset)(BusState *bus);
>      /* maximum devices allowed on the bus, 0: no limit. */
>      int max_dev;
> +    /* number of automatically allocated bus ids (e.g. ide.0) */
> +    int automatic_ids;
>  };
>  
>  typedef struct BusChild {
> 

Reviewed-by: Paolo Bonzini
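
For readers skimming the thread, the naming rule in the "no id" branch boils down to the following standalone sketch. It uses plain libc and a stand-in struct with my own names, not the qdev types, so treat it as an illustration of the idea only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for a bus class that carries the shared counter. */
struct toy_bus_class {
    const char *type_name;   /* e.g. "IDE" */
    int automatic_ids;       /* next automatically allocated bus id */
};

/*
 * Write the lowercase name "<type>.<n>" into buf, where n comes from the
 * class-wide counter, so two parents never both produce "ide.0".
 * Returns the value reported by snprintf().
 */
int toy_bus_auto_name(struct toy_bus_class *bc, char *buf, size_t len)
{
    assert(bc != NULL && buf != NULL && len > 0);

    int n = snprintf(buf, len, "%s.%d", bc->type_name, bc->automatic_ids++);

    for (int i = 0; i < n && (size_t)i < len; i++) {
        buf[i] = (char)tolower((unsigned char)buf[i]);
    }
    return n;
}
```

Two consecutive calls on the same class yield "ide.0" and then "ide.1", in creation order.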

Re: [Qemu-devel] [PATCH 3/4] dataplane: change vring API to use VirtQueueElement

2013-12-05 Thread Stefan Hajnoczi

On Wed, Dec 04, 2013 at 06:40:30PM +0100, Paolo Bonzini wrote:
> On 04/12/2013 15:06, Stefan Hajnoczi wrote:
> > On Thu, Oct 10, 2013 at 05:07:18PM +0200, Paolo Bonzini wrote:
> >> @@ -298,30 +278,31 @@ static void handle_notify(EventNotifier *e)
> >>  vring_disable_notification(s->vdev, &s->vring);
> >>  
> >>  for (;;) {
> >> -head = vring_pop(s->vdev, &s->vring, iov, end, &out_num, &in_num);
> >> -if (head < 0) {
> >> +ret = vring_pop(s->vdev, &s->vring, &elem);
> >> +if (ret < 0) {
> >> +assert(elem == NULL);
> >>  break; /* no more requests */
> >>  }
> >>  
> >> -trace_virtio_blk_data_plane_process_request(s, out_num, in_num,
> >> -head);
> >> +trace_virtio_blk_data_plane_process_request(s, elem->out_num,
> >> +elem->in_num, elem->index);
> >>  
> >> -if (process_request(&s->ioqueue, iov, out_num, in_num, head) < 0) {
> >> +if (process_request(&s->ioqueue, elem) < 0) {
> >>  vring_set_broken(&s->vring);
> >> +vring_push(&s->vring, elem, 0);
> > 
> > If we give up on the vring I don't think we should push the element
> > back.  It may cause the guest to panic.
> > 
> > I guess what we really need here is to unmap scatter-gather buffers and
> > delete elem.
> 
> That's what already happens actually.  vring_push has
> 
> 
> +g_slice_free(VirtQueueElement, elem);
> +
>  /* Don't touch vring if a fatal error occurred */
>  if (vring->broken) {
>  return;
> 
> in this patch and
> 
> +for (i = 0; i < elem->out_num; i++) {
> +vring_unmap(elem->out_sg[i].iov_base, false);
> +}
> +
> +for (i = 0; i < elem->in_num; i++) {
> +vring_unmap(elem->in_sg[i].iov_base, true);
> +}
> 
>  g_slice_free(VirtQueueElement, elem);
> 
> in the next one.
> 
> Though I admit vring_push isn't such a great name and API.  I can add
> instead a vring_free_element function.  Do you think vring_push should
> call it, or should the caller do that?

I think vring_push() should free the VirtQueueElement.

We just need to expose vring_free_element() so that handle_notify() can
call it without pushing bogus buffers back to the guest.
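
Something along these lines, as a sketch with toy types rather than the real vring structures (the names and counters are invented so the behavior can be checked in isolation):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for VirtQueueElement and the vring. */
struct toy_elem {
    unsigned out_num;
    unsigned in_num;
};

struct toy_vring {
    bool broken;
    unsigned used_count;   /* elements handed back to the guest */
    unsigned live_elems;   /* elements popped but not yet freed */
};

/* Pop allocates an element; returns NULL if allocation fails. */
struct toy_elem *toy_vring_pop(struct toy_vring *vring)
{
    assert(vring != NULL);

    struct toy_elem *elem = calloc(1, sizeof(*elem));
    if (elem) {
        vring->live_elems++;
    }
    return elem;
}

/* Release an element without touching the used ring. */
void toy_vring_free_element(struct toy_vring *vring, struct toy_elem *elem)
{
    assert(vring != NULL && elem != NULL);
    /* the real code would unmap the scatter-gather buffers here */
    free(elem);
    vring->live_elems--;
}

/* Push always frees the element, then publishes it unless broken. */
void toy_vring_push(struct toy_vring *vring, struct toy_elem *elem)
{
    toy_vring_free_element(vring, elem);
    if (vring->broken) {
        return;            /* don't touch the vring after a fatal error */
    }
    vring->used_count++;
}
```

The error path in handle_notify() would then call the free function directly after marking the vring broken, and nothing bogus reaches the guest.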

Stefan

Re: [Qemu-devel] [PATCH] target-i386: clear guest TSC on reset

2013-12-05 Thread Paolo Bonzini

On 05/12/2013 07:15, Fernando Luis Vázquez Cao wrote:
> VCPU TSC is not cleared by a warm reset (*), which leaves many Linux
> guests vulnerable to the overflow in cyc2ns_offset fixed by upstream
> commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 ("sched/x86: Fix overflow
> in cyc2ns_offset").
> 
> To put it in a nutshell, if a Linux guest without the patch above applied
> has been up more than 208 days and attempts a warm reset chances are that
> the newly booted kernel will panic or hang.
> 
> (*) Intel Xeon E5 processors show the same broken behavior due to
> the errata "TSC is Not Affected by Warm Reset" (Intel® Xeon®
> Processor E5 Family Specification Update - August 2013): "The
> TSC (Time Stamp Counter MSR 10H) should be cleared on
> reset. Due to this erratum the TSC is not affected by warm
> reset."
> 
> Cc: sta...@vger.kernel.org
> Cc: Will Auld 
> Cc: Marcelo Tosatti 
> Signed-off-by: Fernando Luis Vazquez Cao 

I agree that the bug is in QEMU.  One small nit in your patch is that
you should reset env->tsc_adjust and env->tsc in x86_cpu_reset.  This
would already be pretty good.
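
As an aside, the 208-day figure in the commit message is consistent with a 64-bit overflow at 2^54 ns. The sketch below just does that arithmetic. It assumes the guest computes nanoseconds as (cycles * scale) >> 10, so that the intermediate product is about ns << 10; that is my reading of the affected kernel code, not something stated in this thread, and the names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed-point shift of the cycles-to-nanoseconds scale factor. */
#define TOY_CYC2NS_SHIFT 10

/* Smallest ns value for which (ns << shift) no longer fits in 64 bits. */
uint64_t toy_overflow_threshold_ns(unsigned shift)
{
    assert(shift > 0 && shift < 64);
    return UINT64_C(1) << (64 - shift);
}

/* Convert nanoseconds to whole days, rounding down. */
uint64_t toy_ns_to_whole_days(uint64_t ns)
{
    const uint64_t ns_per_day = UINT64_C(1000000000) * 60 * 60 * 24;
    return ns / ns_per_day;
}
```

With a shift of 10 the threshold is 2^54 ns, which is a little over 208 days of accumulated TSC time.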

However, a bigger problem is that env->tsc is a useless duplicate of
"cpu_get_ticks() + env->tsc_adjust".  It would be nice to drop env->tsc
completely except for migration backwards compatibility.  Thus you can:

- fill in env->tsc as mentioned above from target-i386/machine.c's
cpu_pre_save function.  This guarantees backwards compatibility.

- add a function cpu_set_ticks(int64_t ticks) to cpus.c.  The function
does nothing if use_icount is true, otherwise it needs to have (roughly)
the opposite logic compared to cpu_get_ticks.  You then call this
function from x86_cpu_reset instead of setting env->tsc.  You can
similarly call this function from kvm_get_msrs.

- add a function kvm_set_ticks(int64_t ticks) to kvm-all.c and
kvm-stub.c.  For kvm-all.c it calls kvm_arch_set_ticks(CPUState *cpu,
int64_t ticks) in target-*/kvm.c.  The kvm_arch_set_ticks() function has a
dummy implementation for all architectures except x86.  For x86 it calls
KVM_SET_MSRS passing "ticks + env->tsc_offset".

- call kvm_set_ticks() from cpu_set_ticks() and cpu_enable_ticks()
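
The "opposite logic" for cpu_set_ticks can be sketched like this. It is a toy model with invented names: the host counter is passed in as a parameter so the example is deterministic, and the use_icount case and the KVM call are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy tick state: guest ticks = host ticks + offset while enabled. */
struct toy_ticks {
    int64_t offset;
    bool enabled;
};

/* Read the guest-visible counter for a given host counter value. */
int64_t toy_get_ticks(const struct toy_ticks *t, int64_t host_ticks)
{
    assert(t != NULL);
    return t->enabled ? t->offset + host_ticks : t->offset;
}

/* Inverse of the above: pick the offset so the guest reads 'ticks' now. */
void toy_set_ticks(struct toy_ticks *t, int64_t host_ticks, int64_t ticks)
{
    assert(t != NULL);
    t->offset = t->enabled ? ticks - host_ticks : ticks;
}
```

Calling the setter with ticks = 0 on reset makes the guest counter restart from zero while the host counter keeps running.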

Can you do this?

Thanks,

Paolo

> ---
> 
> --- qemu-orig/target-i386/kvm.c   2013-11-28 07:02:45.0 +0900
> +++ qemu/target-i386/kvm.c2013-12-05 14:47:03.085738175 +0900
> @@ -1125,6 +1125,8 @@ static int kvm_put_msrs(X86CPU *cpu, int
>  kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave);
>  }
>  if (has_msr_tsc_adjust) {
> +if (level == KVM_PUT_RESET_STATE)
> +env->tsc_adjust = 0;
>  kvm_msr_entry_set(&msrs[n++], MSR_TSC_ADJUST, env->tsc_adjust);
>  }
>  if (has_msr_misc_enable) {
> @@ -1139,22 +1141,22 @@ static int kvm_put_msrs(X86CPU *cpu, int
>  kvm_msr_entry_set(&msrs[n++], MSR_LSTAR, env->lstar);
>  }
>  #endif
> -if (level == KVM_PUT_FULL_STATE) {
> +/*
> + * The following MSRs have side effects on the guest or are too heavy
> + * for normal writeback. Limit them to reset or full state updates.
> + */
> +if (level >= KVM_PUT_RESET_STATE) {
> +if (level == KVM_PUT_RESET_STATE)
> +env->tsc = 0;
>  /*
>   * KVM is yet unable to synchronize TSC values of multiple VCPUs on
>   * writeback. Until this is fixed, we only write the offset to SMP
>   * guests after migration, desynchronizing the VCPUs, but avoiding
>   * huge jump-backs that would occur without any writeback at all.
>   */
> -if (smp_cpus == 1 || env->tsc != 0) {
> +if (smp_cpus == 1 || env->tsc != 0 || level == KVM_PUT_RESET_STATE) {
>  kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
>  }
> -}
> -/*
> - * The following MSRs have side effects on the guest or are too heavy
> - * for normal writeback. Limit them to reset or full state updates.
> - */
> -if (level >= KVM_PUT_RESET_STATE) {
>  kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME,
>env->system_time_msr);
>  kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr);
> 
> 
>

Re: [Qemu-devel] [PATCH 0/4] spapr-pci: prepare for vfio

2013-12-05 Thread Alexey Kardashevskiy

On 11/21/2013 03:08 PM, Alexey Kardashevskiy wrote:
> Here are few reworks for spapr-pci PHB which I'd like to have to support VFIO.
> QOM, errors printing, traces, nothing really serious. Thanks!
> 
> Alexey Kardashevskiy (4):
>   spapr-pci: convert init() callback to realize()
>   spapr-pci: introduce a finish_realize() callback
>   spapr-pci: add spapr_pci trace
>   spapr-pci: converts fprintf to error_report
> 
>  hw/ppc/spapr_pci.c  | 90 ++---
>  include/hw/pci-host/spapr.h | 18 -
>  trace-events|  1 +
>  3 files changed, 69 insertions(+), 40 deletions(-)


Ping?



-- 
Alexey

Re: [Qemu-devel] [PATCH] spapr-iommu: extend SPAPR_TCE_TABLE class

2013-12-05 Thread Alexey Kardashevskiy

On 11/20/2013 04:39 PM, Alexey Kardashevskiy wrote:
> This adds a put_tce() callback to the SPAPR TCE TABLE device class.
> The new callback allows to have different IOMMU types such as upcoming
> VFIO IOMMU and it will be used more by the upcoming Multi-TCE support.
> 
> This reworks the H_PUT_TCE handler to make use of the new put_tce()
> callback.


Ping?


> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  hw/ppc/spapr_iommu.c   | 21 +
>  include/hw/ppc/spapr.h | 13 +
>  2 files changed, 30 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
> index ef45f4f..0016c13 100644
> --- a/hw/ppc/spapr_iommu.c
> +++ b/hw/ppc/spapr_iommu.c
> @@ -207,7 +207,7 @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba,
>  IOMMUTLBEntry entry;
>  
>  if (ioba >= tcet->window_size) {
> -hcall_dprintf("spapr_vio_put_tce on out-of-bounds IOBA 0x"
> +hcall_dprintf("spapr put_tce_emu on out-of-bounds IOBA 0x"
>TARGET_FMT_lx "\n", ioba);
>  return H_PARAMETER;
>  }
> @@ -232,12 +232,21 @@ static target_ulong h_put_tce(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>  target_ulong tce = args[2];
>  target_ulong ret = H_PARAMETER;
>  sPAPRTCETable *tcet = spapr_tce_find_by_liobn(liobn);
> +sPAPRTCETableClass *info;
> +
> +if (!tcet) {
> +return H_PARAMETER;
> +}
> +
> +info = SPAPR_TCE_TABLE_GET_CLASS(tcet);
> +if (!info || !info->put_tce) {
> +return H_PARAMETER;
> +}
>  
>  ioba &= ~(SPAPR_TCE_PAGE_SIZE - 1);
>  
> -if (tcet) {
> -ret = put_tce_emu(tcet, ioba, tce);
> -}
> +ret = info->put_tce(tcet, ioba, tce);
> +
>  trace_spapr_iommu_put(liobn, ioba, tce, ret);
>  
>  return ret;
> @@ -287,9 +296,12 @@ int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
>  static void spapr_tce_table_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> +sPAPRTCETableClass *stc = SPAPR_TCE_TABLE_CLASS(klass);
> +
>  dc->vmsd = &vmstate_spapr_tce_table;
>  dc->init = spapr_tce_table_realize;
>  dc->reset = spapr_tce_reset;
> +stc->put_tce = put_tce_emu;
>  
>  QLIST_INIT(&spapr_tce_tables);
>  
> @@ -302,6 +314,7 @@ static TypeInfo spapr_tce_table_info = {
>  .parent = TYPE_DEVICE,
>  .instance_size = sizeof(sPAPRTCETable),
>  .class_init = spapr_tce_table_class_init,
> +.class_size = sizeof(sPAPRTCETableClass),
>  .instance_finalize = spapr_tce_table_finalize,
>  };
>  
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index fdaab2d..827cda2 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -367,12 +367,25 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
>  
>  #define RTAS_ERROR_LOG_MAX  2048
>  
> +typedef struct sPAPRTCETableClass sPAPRTCETableClass;
>  typedef struct sPAPRTCETable sPAPRTCETable;
>  
>  #define TYPE_SPAPR_TCE_TABLE "spapr-tce-table"
>  #define SPAPR_TCE_TABLE(obj) \
>  OBJECT_CHECK(sPAPRTCETable, (obj), TYPE_SPAPR_TCE_TABLE)
>  
> +#define SPAPR_TCE_TABLE_CLASS(klass) \
> + OBJECT_CLASS_CHECK(sPAPRTCETableClass, (klass), TYPE_SPAPR_TCE_TABLE)
> +#define SPAPR_TCE_TABLE_GET_CLASS(obj) \
> + OBJECT_GET_CLASS(sPAPRTCETableClass, (obj), TYPE_SPAPR_TCE_TABLE)
> +
> +struct sPAPRTCETableClass {
> +DeviceClass parent_class;
> +
> +target_ulong (*put_tce)(sPAPRTCETable *tcet, target_ulong ioba,
> +target_ulong tce);
> +};
> +
>  struct sPAPRTCETable {
>  DeviceState parent;
>  uint32_t liobn;
> 
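
In case it helps review, the dispatch introduced by the patch reduces to the pattern below. This is a standalone sketch with toy types and names (no QOM macros), and the return codes are placeholders rather than the real hcall constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_H_SUCCESS    0
#define TOY_H_PARAMETER  (-4)   /* placeholder error code */
#define TOY_PAGE_SIZE    4096
#define TOY_ENTRIES      8

struct toy_tce_table;

/* Per-class callback table, standing in for the device class. */
struct toy_tce_class {
    long (*put_tce)(struct toy_tce_table *t, uint64_t ioba, uint64_t tce);
};

struct toy_tce_table {
    const struct toy_tce_class *klass;
    uint64_t window_size;
    uint64_t entries[TOY_ENTRIES];
};

/* Emulated backend: store the TCE in the in-memory table. */
long toy_put_tce_emu(struct toy_tce_table *t, uint64_t ioba, uint64_t tce)
{
    assert(t != NULL);
    if (ioba >= t->window_size || ioba / TOY_PAGE_SIZE >= TOY_ENTRIES) {
        return TOY_H_PARAMETER;
    }
    t->entries[ioba / TOY_PAGE_SIZE] = tce;
    return TOY_H_SUCCESS;
}

const struct toy_tce_class toy_emulated_class = { toy_put_tce_emu };

/* Hypercall handler: validate, page-align, then go through the class. */
long toy_h_put_tce(struct toy_tce_table *t, uint64_t ioba, uint64_t tce)
{
    if (!t || !t->klass || !t->klass->put_tce) {
        return TOY_H_PARAMETER;
    }
    ioba &= ~(uint64_t)(TOY_PAGE_SIZE - 1);
    return t->klass->put_tce(t, ioba, tce);
}
```

A VFIO-backed table would supply a different put_tce in its class and leave the handler untouched.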


-- 
Alexey

Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-12-05 Thread Paolo Bonzini

On 05/12/2013 09:40, Matthias Brugger wrote:
> CFQ, the state-of-the-art I/O scheduler

The deadline scheduler typically provides much better performance for
server usage (including hosting VMs).  It doesn't support some features
such as I/O throttling via cgroups, but QEMU now has a very good
throttling mechanism implemented by Benoit Canet.

I suggest that you repeat your experiments using all six configurations:
- deadline scheduler with aio=native
- deadline scheduler with aio=threads
- deadline scheduler with aio=threads + your patches
- CFQ scheduler with aio=native
- CFQ scheduler with aio=threads
- CFQ scheduler with aio=threads + your patches

> In former versions, there was some work done to merge requests in
> Qemu, but I don't think it was very useful, because you don't know
> what the layout of the image file looks like on the physical disk.
> Anyway, I think these code parts have been removed.

This is still there for writes, in bdrv_aio_multiwrite.  Only
virtio-blk.c uses it, but it's there.

> The only layer where you really know how the blocks of the virtual
> disk image are distributed over the disk is the block layer of the
> host. So you have to do the block request merging there. With the new
> architecture this would come for free, as you can map every thread
> from a guest to one workerthread of Qemu.

This also assumes a relatively "dumb" guest.  If the guest itself uses a
thread pool, you would have exactly the same problem, wouldn't you?

Paolo

Re: [Qemu-devel] [PATCH] qdev: Keep global allocation counter per bus

2013-12-05 Thread Markus Armbruster

Alexander Graf  writes:

> When we have 2 separate qdev devices that both create a qbus of the
> same type without specifying a bus name or device name, we end up
> with two buses of the same name, such as ide.0 on the Mac machines:
>
>   dev: macio-ide, id ""
>     bus: ide.0
>       type IDE
>   dev: macio-ide, id ""
>     bus: ide.0
>       type IDE
>
> If we now spawn a device that connects to a ide.0 the last created
> bus gets the device, with the first created bus inaccessible to the
> command line.

isapc has the same issue: two onboard isa-ide devices, each providing a
bus, both buses named ide.0.

> After some discussion on IRC we concluded that the best quick fix way
> forward for this is to make automated bus-class type based allocation
> count a global counter. That's what this patch implements. With this
> we instead get
>
>   dev: macio-ide, id ""
>     bus: ide.1
>       type IDE
>   dev: macio-ide, id ""
>     bus: ide.0
>       type IDE
>
> on the example mentioned above.

Commit message should explain more clearly how and when this affects bus
names.

Patch breaks isapc:

$ qemu -nodefaults -S -display none -monitor stdio -M isapc -drive if=none,id=drive0 -device ide-cd,drive=drive0
(qemu) Segmentation fault (core dumped)

Debugging a bit:

(gdb) bt
#0  0x5572e745 in ide_get_geometry (bus=0x0, unit=0, cyls=0x7fffdb8a,
    heads=0x7fffdb88 "\210\271qU", secs=0x7fffdb89 "\271qU")
    at /home/armbru/work/qemu/hw/ide/qdev.c:129
#1  0x558f1fed in pc_cmos_init_late (opaque=0x5628b420)
    at /home/armbru/work/qemu/hw/i386/pc.c:336
#2  0x55898abc in qemu_devices_reset ()
    at /home/armbru/work/qemu/vl.c:1836
#3  0x55898b28 in qemu_system_reset (report=false)
    at /home/armbru/work/qemu/vl.c:1845
#4  0x558a0640 in main (argc=13, argv=0x7fffe048, envp=0x7fffe0b8)
    at /home/armbru/work/qemu/vl.c:4344
(gdb) p arg->idebus
$1 = {0x56322e10, 0x0}
(gdb) p i
$2 = 2

Looks like your patch kills the second isa-ide somehow.

Your commit message doesn't state your command line, so I had to figure
out a PPC example myself:

$ qemu-system-ppc -M mac99 -nodefaults -S -display none -monitor stdio -drive if=none,id=drive0 -device ide-cd,drive=drive0,bus=ide.0

"info qtree" before your patch:

  dev: macio-ide, id ""
    irq 2
    mmio /1000
    bus: ide.0
      type IDE
      dev: ide-cd, id ""
        drive = drive0
        logical_block_size = 512
        physical_block_size = 512
        min_io_size = 0
        opt_io_size = 0
        bootindex = -1
        discard_granularity = 512
        ver = "1.7.50"
        wwn = 0x0
        serial = "QM3"
        model = 
        unit = 0
  dev: macio-ide, id ""
    irq 2
    mmio /1000
    bus: ide.0
      type IDE

After:

  dev: macio-ide, id ""
    irq 2
    mmio /1000
    bus: ide.1
      type IDE
  dev: macio-ide, id ""
    irq 2
    mmio /1000
    bus: ide.0
      type IDE
      dev: ide-cd, id ""
        drive = drive0
        logical_block_size = 512
        physical_block_size = 512
        min_io_size = 0
        opt_io_size = 0
        bootindex = -1
        discard_granularity = 512
        ver = "1.7.50"
        wwn = 0x0
        serial = "QM1"
        model = 
        unit = 0

Incompatible change: device ide-cd moved to a different controller.
Great fun when you try to live migrate across your patch.

I'd expect isapc to have the same issue once its crash bug is fixed.

First law of QEMU hacking: if your patch looks simple, it's probably
wrong ;)

Re: [Qemu-devel] [PATCH v2 5/6] qemu-option: Remove qemu_opts_create_nofail

2013-12-05 Thread Markus Armbruster

Peter Crosthwaite  writes:

> This is a boiler-plate _nofail variant of qemu_opts_create. Remove and
> use error_abort in call sites.
>
> null/0 arguments needs to be added for the id and fail_if_exists fields
> in affected callsites due to argument inconsistency between the normal and
> no_fail variants.
>
> Signed-off-by: Peter Crosthwaite 
> ---
> changed since v1:
> Wrap some long lines (Markus review)
> Flip fail_if_exists to 0 in call sites (Markus review)
> Remove spurious whitespace fix (Markus review)
> Mention fail_if_exists in commit msg (Markus review)
>
>  block/blkdebug.c   |  2 +-
>  block/blkverify.c  |  2 +-
>  block/curl.c   |  2 +-
>  block/gluster.c|  2 +-
>  block/iscsi.c  |  2 +-
>  block/nbd.c|  3 ++-
>  block/qcow2.c  |  2 +-
>  block/raw-posix.c  |  2 +-
>  block/raw-win32.c  |  5 +++--
>  block/rbd.c|  2 +-
>  block/sheepdog.c   |  2 +-
>  block/vvfat.c  |  2 +-
>  blockdev.c |  6 --
>  hw/watchdog/watchdog.c |  3 ++-
>  include/qemu/option.h  |  1 -
>  qdev-monitor.c |  2 +-
>  qemu-img.c |  2 +-
>  util/qemu-config.c |  2 +-
>  util/qemu-option.c |  9 -
>  util/qemu-sockets.c| 18 +-
>  vl.c   | 15 +--
>  21 files changed, 42 insertions(+), 44 deletions(-)
>
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index 16d2b91..7f915ea 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -359,7 +359,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
>  const char *filename, *config;
>  int ret;
>  
> -opts = qemu_opts_create_nofail(&runtime_opts);
> +opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
>  qemu_opts_absorb_qdict(opts, options, &local_err);
>  if (error_is_set(&local_err)) {
>  error_propagate(errp, local_err);
> diff --git a/block/blkverify.c b/block/blkverify.c
> index 3c63528..1c1637f 100644
> --- a/block/blkverify.c
> +++ b/block/blkverify.c
> @@ -125,7 +125,7 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
>  const char *filename, *raw;
>  int ret;
>  
> -opts = qemu_opts_create_nofail(&runtime_opts);
> +opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
>  qemu_opts_absorb_qdict(opts, options, &local_err);
>  if (error_is_set(&local_err)) {
>  error_propagate(errp, local_err);
> diff --git a/block/curl.c b/block/curl.c
> index 5a46f97..a603936 100644
> --- a/block/curl.c
> +++ b/block/curl.c
> @@ -413,7 +413,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
>  return -EROFS;
>  }
>  
> -opts = qemu_opts_create_nofail(&runtime_opts);
> +opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
>  qemu_opts_absorb_qdict(opts, options, &local_err);
>  if (error_is_set(&local_err)) {
>  qerror_report_err(local_err);
> diff --git a/block/gluster.c b/block/gluster.c
> index 877686a..563d497 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -298,7 +298,7 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
>  Error *local_err = NULL;
>  const char *filename;
>  
> -opts = qemu_opts_create_nofail(&runtime_opts);
> +opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
>  qemu_opts_absorb_qdict(opts, options, &local_err);
>  if (error_is_set(&local_err)) {
>  qerror_report_err(local_err);
> diff --git a/block/iscsi.c b/block/iscsi.c
> index a2d578c..2cec43a 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -1241,7 +1241,7 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
>  return -EINVAL;
>  }
>  
> -opts = qemu_opts_create_nofail(&runtime_opts);
> +opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
>  qemu_opts_absorb_qdict(opts, options, &local_err);
>  if (error_is_set(&local_err)) {
>  qerror_report_err(local_err);
> diff --git a/block/nbd.c b/block/nbd.c
> index c8d..5bd9359 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -234,7 +234,8 @@ static int nbd_config(BDRVNBDState *s, QDict *options)
>  return -EINVAL;
>  }
>  
> -s->socket_opts = qemu_opts_create_nofail(&socket_optslist);
> +s->socket_opts = qemu_opts_create(&socket_optslist, NULL, 0,
> +  &error_abort);
>  
>  qemu_opts_absorb_qdict(s->socket_opts, options, &local_err);
>  if (error_is_set(&local_err)) {
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 6e5d98d..cf28377 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -669,7 +669,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
>  }
>  
>  /* Enable lazy_refcounts according to image and command line options */
> -opts = qemu_opts_create_nofail(&qcow2_runtime_opts);
> +opts =

Re: [Qemu-devel] [PATCH v2 1/6] error: Add error_abort

2013-12-05 Thread Markus Armbruster

Peter Crosthwaite  writes:

> Add a special Error * that can be passed to error handling APIs to
> signal that any errors are fatal and should abort QEMU. There are two
> advantages to this:
>
> - allows for brevity when wishing to assert success of Error **
>   accepting APIs. No need for this pattern:
> Error * local_err = NULL;
> api_call(foo, bar, &local_err);
> assert_no_error(local_err);
>   This also removes the need for _nofail variants of APIs with
>   asserting call sites now reduced to 1LOC.
> - SIGABRT happens from within the offending API. When a fatal error
>   occurs in an API call (when the caller is asserting success) failure
>   often means the API itself is broken. With the abort happening in the
>   API call now, the stack frames into the call are available at debug
>   time. In the assert_no_error scheme the abort happens after the fact.
>
> The exact semantic is that when an error is raised, if the argument
> Error ** matches &error_abort, then the abort occurs immediately. The
> error message is reported.
>
> For error_propagate, if the destination error is &error_abort, then
> the abort happens at propagation time.
>
> Signed-off-by: Peter Crosthwaite 
> ---
> changed since v1:
> Delayed assertions that *errp == NULL.

Care to explain why you want to delay these assertions?  I'm not sure I
get it...

[...]
> @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...)
>  if (errp == NULL) {
>  return;
>  }
> -assert(*errp == NULL);
>  
>  err = g_malloc0(sizeof(*err));
>  
> @@ -40,6 +41,12 @@ void error_set(Error **errp, ErrorClass err_class, const char *fmt, ...)
>  va_end(ap);
>  err->err_class = err_class;
>  
> +if (errp == &error_abort) {
> +error_report("%s", error_get_pretty(err));
> +abort();
> +}
> +
> +assert(*errp == NULL);
>  *errp = err;
>  }
>  
[...]
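
For readers following the thread, the semantics described in the commit message can be sketched in isolation. This is only a rough model, not QEMU's code: the `Error` struct, `sketch_error_set()` and `sketch_api_call()` are hypothetical stand-ins, and only the `errp == &error_abort` comparison and the delayed assertion mirror the quoted hunk.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for QEMU's Error object. */
typedef struct Error {
    char msg[128];
} Error;

/* Sentinel: only its address matters; it never holds an error. */
static Error *error_abort;

/* Simplified model of error_set(): same ordering as the quoted hunk. */
static void sketch_error_set(Error **errp, const char *msg)
{
    Error *err;

    if (errp == NULL) {
        return;                     /* caller chose to ignore errors */
    }

    err = calloc(1, sizeof(*err));
    if (err == NULL) {
        abort();
    }
    snprintf(err->msg, sizeof(err->msg), "%s", msg);

    if (errp == &error_abort) {
        /* Fatal: report and abort from inside the failing call. */
        fprintf(stderr, "%s\n", err->msg);
        abort();
    }

    assert(*errp == NULL);          /* the assertion the patch moved down */
    *errp = err;
}

/* Hypothetical API that reports failure through errp. */
static int sketch_api_call(int fail, Error **errp)
{
    if (fail) {
        sketch_error_set(errp, "api_call failed");
        return -1;
    }
    return 0;
}
```

A caller asserting success would write `sketch_api_call(0, &error_abort);` on a single line, while a caller that wants to handle the error passes the address of a local `Error *` initialised to NULL.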

Re: [Qemu-devel] [PATCH v2 0/3] Make thread pool implementation modular

2013-12-05 Thread Stefan Hajnoczi

On Thu, Dec 05, 2013 at 09:40:56AM +0100, Matthias Brugger wrote:
> 2013/11/11 Stefan Hajnoczi :
> > On Mon, Nov 11, 2013 at 11:00:45AM +0100, Matthias Brugger wrote:
> >> 2013/11/5 Stefan Hajnoczi :
> >> > I'd also like to see the thread pool implementation you wish to add
> >> > before we add a layer of indirection which has no users yet.
> >>
> >> Fair enough, I will evaluate whether it makes more sense to implement a
> >> new AIO infrastructure instead of trying to reuse the thread-pool.
> >> Actually my implementation will differ in that we will have several
> >> worker threads, each of them having its own queue. The
> >> requests will be distributed between them depending on an identifier.
> >> The request function which the worker_thread calls will be the same as
> >> when using aio=threads, so I'm not quite sure which will be the way to go.
> >> Any opinions and hints, like the one you gave, are highly appreciated.
> >
> > If I understand the slides you linked to correctly, the guest will pass
> > an identifier with each request.  The host has worker threads allowing
> > each stream of requests to be serviced independently.  The idea is to
> > associate guest processes with unique identifiers.
> >
> > The guest I/O scheduler is supposed to submit requests in a way that
> > meets certain policies (e.g. fairness between processes, deadlines,
> > etc).
> >
> > Why is it necessary to push this task down into the host?  I don't
> > understand the advantage of this approach except that maybe it works
> > around certain misconfigurations, I/O scheduler quirks, or plain old
> > bugs - all of which should be investigated and fixed at the source
> > instead of adding another layer of code to mask them.
> 
> It is about I/O scheduling. CFQ, the state-of-the-art I/O scheduler,
> merges adjacent requests from the same PID before dispatching them to
> the disk.
> If we can distinguish between the different threads of a virtual
> machine that read/write a file, the I/O scheduler in the host can
> merge requests in an effective way for sequential access. Qemu fails
> at this because of its architecture. Apart from the fact that at the
> moment there is no way to distinguish the guest threads from each other
> (I'm working on some kernel patches), Qemu has one big queue from which
> several worker threads grab requests and dispatch them to the disk.
> Even if you have one large read from just one thread in the guest, the
> I/O scheduler in the host will get the requests from different PIDs (=
> worker threads) and won't be able to merge them.

From clone(2):

  If several threads are doing I/O on behalf of the same process
  (aio_read(3), for instance), they should employ CLONE_IO to get better
  I/O performance.

I think we should just make sure that worker threads share the same
io_context in QEMU.  That way the host I/O scheduler will treat their
requests as one.

Can you benchmark CLONE_IO and see if it improves the situation?

> In former versions, there was some work done to merge requests in
> Qemu, but I don't think it was very useful, because you don't know
> what the layout of the image file looks like on the physical disk.
> Anyway, I think these code parts have been removed.

If you mean bdrv_aio_multiwrite(), it's still there and used by
virtio-blk emulation.  It's only used for write requests, not reads.

Stefan
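
The per-identifier queueing that Matthias describes in the quoted text can be pictured with a small routing sketch. Everything here is an assumption made for illustration (the `Request` layout, `NUM_WORKERS`, `QUEUE_DEPTH`, the modulo mapping); it only shows the step that picks a queue, not the worker threads themselves, and it is not taken from any posted patch.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_WORKERS 4           /* hypothetical pool size */
#define QUEUE_DEPTH 16          /* hypothetical per-worker queue depth */

/* Hypothetical request carrying the identifier passed by the guest. */
typedef struct Request {
    uint32_t stream_id;
    uint64_t sector;
} Request;

typedef struct WorkerQueue {
    Request slots[QUEUE_DEPTH];
    size_t count;
} WorkerQueue;

static WorkerQueue queues[NUM_WORKERS];

/* Same identifier maps to the same worker, so one stream of requests
 * is always submitted to the host by the same thread. */
static unsigned pick_worker(uint32_t stream_id)
{
    return stream_id % NUM_WORKERS;
}

/* Returns the chosen worker index, or -1 if that worker's queue is full. */
static int enqueue_request(const Request *req)
{
    unsigned w = pick_worker(req->stream_id);
    WorkerQueue *q = &queues[w];

    if (q->count == QUEUE_DEPTH) {
        return -1;
    }
    q->slots[q->count++] = *req;
    return (int)w;
}
```

With a single shared queue, by contrast, consecutive requests of one stream may be picked up by different worker threads, which is the situation the CLONE_IO suggestion tries to address from the host side.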

Re: [Qemu-devel] [PATCH] qdev: Keep global allocation counter per bus

2013-12-05 Thread Markus Armbruster

Alexander Graf  writes:

> When we have 2 separate qdev devices that both create a qbus of the
> same type without specifying a bus name or device name, we end up
> with two buses of the same name, such as ide.0 on the Mac machines:
>
>   dev: macio-ide, id ""
> bus: ide.0
>   type IDE
>   dev: macio-ide, id ""
> bus: ide.0
>   type IDE
>
> If we now spawn a device that connects to ide.0, the last created
> bus gets the device, with the first created bus inaccessible to the
> command line.
>
> After some discussion on IRC we concluded that the best quick fix way
> forward for this is to make automated bus-class type based allocation
> count a global counter. That's what this patch implements. With this
> we instead get
>
>   dev: macio-ide, id ""
> bus: ide.1
>   type IDE
>   dev: macio-ide, id ""
> bus: ide.0
>   type IDE
>
> on the example mentioned above.

What I don't like about the global counter: we define the board's ABI
implicitly by device initialization order.  Bad taste and fragile, but
we do this elsewhere, too, e.g. pci_create_simple(bus, -1, ...).

Wanted: tests to catch accidental ABI changes, covering at least the
parts we define implicitly.
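
To make the ordering dependence concrete, here is a minimal sketch of a global per-type counter, assuming a tiny fixed-size table. The names (`sketch_bus_name`, `BusTypeCounter`) are hypothetical and this is not the qdev code; it only illustrates that the n-th bus of a type gets index n-1, so the resulting names depend purely on creation order.

```c
#include <stdio.h>
#include <string.h>

#define MAX_BUS_TYPES 8         /* hypothetical table size */

typedef struct BusTypeCounter {
    char type[16];
    int next_index;
} BusTypeCounter;

static BusTypeCounter counters[MAX_BUS_TYPES];
static int num_counters;

/* Writes "<type>.<n>" into buf, where n counts buses of this type across
 * the whole machine. Returns 0 on success, -1 if the table is full. */
static int sketch_bus_name(const char *type, char *buf, size_t len)
{
    BusTypeCounter *c = NULL;
    int i;

    for (i = 0; i < num_counters; i++) {
        if (strcmp(counters[i].type, type) == 0) {
            c = &counters[i];
            break;
        }
    }
    if (c == NULL) {
        if (num_counters == MAX_BUS_TYPES) {
            return -1;
        }
        c = &counters[num_counters++];
        snprintf(c->type, sizeof(c->type), "%s", type);
        c->next_index = 0;
    }
    snprintf(buf, len, "%s.%d", type, c->next_index++);
    return 0;
}
```

Swapping the order in which two controllers are initialised swaps which one ends up as "ide.0", which is exactly the implicit ABI a test would need to pin down.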

Re: [Qemu-devel] [PATCH] qdev: Keep global allocation counter per bus

2013-12-05 Thread Paolo Bonzini

On 05/12/2013 10:44, Markus Armbruster wrote:
> Incompatible change: device ide-cd moved to a different controller.

Yes, it should be stated in the commit message but it's expected as
discussed yesterday on IRC.  The solution is not to use "-device" (which
was broken) if you care about backwards compatibility; use "-drive if=ide".

> Great fun when you try to live migrate across your patch.
> 
> I'd expect isapc to have the same issue once its crash bug is fixed.
> 
> First law of QEMU hacking: if your patch looks simple, it's probably
> wrong ;)

Yes, the question is how wrong and how the wrong balances the right.

Paolo

Re: [Qemu-devel] [PATCHv3 1.8 7/9] qemu-img: round down request length to an aligned sector

2013-12-05 Thread Stefan Hajnoczi

On Wed, Dec 04, 2013 at 04:56:19PM +0100, Peter Lieven wrote:
> On 04.12.2013 16:49, Stefan Hajnoczi wrote:
> > On Wed, Nov 27, 2013 at 11:07:07AM +0100, Peter Lieven wrote:
> >> @@ -1397,19 +1396,21 @@ static int img_convert(int argc, char **argv)
> >>  }
> >>  }
> >>  
> >> +cluster_sectors = 0;
> >> +ret = bdrv_get_info(out_bs, &bdi);
> >> +if (ret < 0 && compress) {
> >> +error_report("could not get block driver info");
> >> +goto out;
> >> +} else {
> >> +cluster_sectors = bdi.cluster_size / BDRV_SECTOR_SIZE;
> >> +}
> > Why do we only report error if 'compress' is set?  cluster_sectors must
> > be valid and we cannot guarantee that if bdrv_get_info() failed.
> You mean this should be:
> 
> +if (ret < 0) {
> +if (compress) {
> +error_report("could not get block driver info");
> +goto out;
> +}
> +} else {
> +cluster_sectors = bdi.cluster_size / BDRV_SECTOR_SIZE;
> +}
> 
> 
> if cluster_sectors is 0 the alignment logic is skipped, but we cannot
> guarantee that bdi is zero and stays zero if the call fails.
> 
> can you fix that when you pick up the patch?

Sure.

Stefan
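
The corrected control flow agreed on above can be isolated as a small helper. This is a sketch under assumptions: `SketchInfo` stands in for the block driver info structure, `SKETCH_SECTOR_SIZE` for BDRV_SECTOR_SIZE, and the result of bdrv_get_info() is passed in as a plain integer.

```c
#include <stdbool.h>

#define SKETCH_SECTOR_SIZE 512  /* stands in for BDRV_SECTOR_SIZE */

/* Minimal stand-in for the block driver info structure. */
typedef struct SketchInfo {
    int cluster_size;
} SketchInfo;

/* Returns 0 on success, -1 if conversion cannot proceed.
 * On success *cluster_sectors is 0 when no cluster size is known,
 * which tells the caller to skip the alignment logic. */
static int sketch_cluster_sectors(int get_info_ret, bool compress,
                                  const SketchInfo *bdi,
                                  int *cluster_sectors)
{
    *cluster_sectors = 0;

    if (get_info_ret < 0) {
        if (compress) {
            return -1;          /* compression needs a valid cluster size */
        }
        /* bdi may hold garbage after a failed call, so it is not read */
    } else {
        *cluster_sectors = bdi->cluster_size / SKETCH_SECTOR_SIZE;
    }
    return 0;
}
```

The difference from the original hunk is that a failed query without compression no longer falls through to the branch that reads `bdi`.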

Re: [Qemu-devel] [PATCH 3/4] dataplane: change vring API to use VirtQueueElement

2013-12-05 Thread Paolo Bonzini

On 05/12/2013 10:24, Stefan Hajnoczi wrote:
>> > 
>> > That's what already happens actually.  vring_push has
>> > 
>> > 
>> > +g_slice_free(VirtQueueElement, elem);
>> > +
>> >  /* Don't touch vring if a fatal error occurred */
>> >  if (vring->broken) {
>> >  return;
>> > 
>> > in this patch and
>> > 
>> > +for (i = 0; i < elem->out_num; i++) {
>> > +vring_unmap(elem->out_sg[i].iov_base, false);
>> > +}
>> > +
>> > +for (i = 0; i < elem->in_num; i++) {
>> > +vring_unmap(elem->in_sg[i].iov_base, true);
>> > +}
>> > 
>> >  g_slice_free(VirtQueueElement, elem);
>> > 
>> > in the next one.
>> > 
>> > Though I admit vring_push isn't such a great name and API.  I can add
>> > instead a vring_free_element function.  Do you think vring_push should
>> > call it, or should the caller do that?
> I think vring_push() should free the VirtQueueElement.
> 
> We just need to expose vring_free_element() so that handle_notify() can
> call it without pushing bogus buffers back to the guest.

It's not pushing back a bogus buffer; see the "if (vring->broken)" above.
But if you prefer handle_notify() to call vring_free_element(), I can
of course do that.

Paolo

Re: [Qemu-devel] [PATCHv3 1.8 8/9] qemu-img: increase min_sparse to 128 sectors (64kb)

2013-12-05 Thread Stefan Hajnoczi

On Thu, Dec 05, 2013 at 05:55:22AM +0100, Peter Lieven wrote:
> 
> 
> > On 05.12.2013 at 03:12, Eric Blake wrote:
> > 
> > On 12/04/2013 09:46 AM, Peter Lieven wrote:
> > 
> >>> I guess a sane size would be cluster size.  For a raw file 4 KB is
> >>> reasonable since that's the file system block size.
> >> in case of iscsi the cluster size could be much too high as for example
> >> my storage has a cluster_size of 15MB.
> >>> 
> >>> Is it necessary to increase to 64 KB here?
> >> No, it's independent of the rest. Paolo suggested increasing it, and I can
> >> confirm that for my use case it's faster than 4K.
> > 
> > At least on NTFS file systems, 64k is the minimum size of a hole in a
> > sparse file.  While many file systems support smaller holes, there are
> > definitely systems where trying to detect smaller holes only results in
> > wasted efforts.  Is it worth making the default dynamic based on stat()
> > information regarding optimum IO size for the given destination file system?
> 
> It is definitely worth it, but it would require additional work and testing.
> The current code does not create holes that are aligned to min_sparse, and
> min_sparse has to be limited to a reasonable size. I also wonder whether the
> right value is bs->bl.opt_transfer_length, bs->bl.discard_alignment or
> bdi->cluster_size/9, maybe depending on whether it's a COW image or not.
> 
> I can look at this, but I would leave the patch out for now.

Okay, I'll drop this patch for now.  It can be improved in a separate
series.

Stefan

Re: [Qemu-devel] [RFC PATCH v1 0/5] Add error_abort and associated cleanups

2013-12-05 Thread Paolo Bonzini

On 03/12/2013 21:33, Igor Mammedov wrote:
> I'm sorry for hijacking the thread, but that is actually the issue that
> started the original discussion:
> void-returning QOM API functions are called with a NULL errp, leaving no
> chance to detect that an error happened. So abusing a NULL errp in these
> functions might lead to hard-to-find runtime errors.
> I think Eric's suggestion was to enforce passing a non-NULL errp and let the
> caller deal with the error gracefully, so that the above-mentioned misuse is
> impossible.
> Why is ignoring errors from a "void foo(...)"-like API considered acceptable?

See http://permalink.gmane.org/gmane.comp.emulators.qemu/243779

> * Peter's alternative
>   + self-documenting
>   + consistent
>   + predictable

I'll add another small advantage which is fewer SLOC.

> * make Error* mandatory for all void functions
>   + consistent
>   + almost predictable (because in C you can ignore return values)
>   - does not necessarily do the right thing (e.g. cleanup functions)
>   - requires manual effort to abide by the policy

Better wording of the last: a missing &error_abort is easier to spot
than a missing assert_no_error(errp).

Paolo

Re: [Qemu-devel] [PATCHv3 1.8 1/9] qemu-img: add support for skipping zeroes in input during convert

2013-12-05 Thread Stefan Hajnoczi

On Wed, Dec 04, 2013 at 05:51:20PM +0100, Peter Lieven wrote:
> On 04.12.2013 17:46, Stefan Hajnoczi wrote:
> > On Wed, Nov 27, 2013 at 11:07:01AM +0100, Peter Lieven wrote:
> >> +/* If the output image is being created as a copy on write
> >> + * image, assume that sectors which are unallocated in the
> >> + * input image are present in both the output's and input's
> >> + * base images (no need to copy them). */
> >> +if (out_baseimg) {
> >> +if (!(ret & BDRV_BLOCK_DATA)) {
> >> +sector_num += n1;
> >> +continue;
> >> +}
> >> +/* The next 'n1' sectors are allocated in the input image.
> >> + * Copy only those as they may be followed by unallocated
> >> + * sectors. */
> >> +nb_sectors = n1;
> >> +}
> >> +/* avoid redundant callouts to get_block_status */
> >> +sector_num_next_status = sector_num + n1;
> > Can you explain when we need sector_num_next_status?  It's not clear to
> > me from this patch when we will loop around already knowing that blocks
> > are allocated.
> We call get_block_status with MIN(INT_MAX, nb_sectors). So we might
> receive an allocation status for a huge area. Later we trim the request
> size to MIN(iobuf_size, nb_sectors) and eventually align the request.
> 
> For example, take a fully allocated image on an iSCSI SAN. I can easily get
> that information with the first get_block_status call, but I repeat these
> calls over and over and in case of the iSCSI SAN these calls are quite
> expensive.

Makes sense, thanks!
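
The effect Peter describes can be shown with a toy loop. This is a sketch, not the qemu-img code: `sketch_get_block_status()` is a hypothetical stand-in that reports everything up to the end of the image as allocated, and `status_calls` merely counts how often the (expensive) query is issued.

```c
#include <stdint.h>

static int status_calls;

/* Hypothetical stand-in for an expensive block-status query: reports that
 * all sectors from sector_num to the end of the image are allocated. */
static int64_t sketch_get_block_status(int64_t sector_num, int64_t total)
{
    status_calls++;
    return total - sector_num;
}

/* Walks the image in requests of at most 'chunk' sectors and queries the
 * status again only once the previously reported range is used up. */
static int64_t sketch_convert(int64_t total, int64_t chunk)
{
    int64_t sector_num = 0;
    int64_t next_status = 0;    /* plays the role of sector_num_next_status */
    int64_t copied = 0;

    while (sector_num < total) {
        int64_t n;

        if (sector_num >= next_status) {
            next_status = sector_num +
                          sketch_get_block_status(sector_num, total);
        }
        n = next_status - sector_num;
        if (n > chunk) {
            n = chunk;          /* request trimmed to the I/O buffer size */
        }
        copied += n;
        sector_num += n;
    }
    return copied;
}
```

Without the `next_status` check, the query would be issued once per chunk rather than once per reported range.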

[Qemu-devel] propose gsoc project

2013-12-05 Thread Xin Tong

I am wondering whether it is possible to propose a project for the QEMU Google
Summer of Code as a student. I have some ideas regarding TLB emulation in
system mode and would like to find a mentor to do a GSoC project on it.

Thank you,
Xin

Re: [Qemu-devel] propose gsoc project

2013-12-05 Thread Fam Zheng


On 2013-12-05 18:48, Xin Tong wrote:
> I am wondering whether it is possible to propose project in QEMU google
> summer of code as a student ? I have some ideas regarding TLB emulation
> in system mode and would like to find a mentor to do a gsoc for it.



Sure, it's definitely welcome. You can post your idea to qemu-devel
(just as you do now) and/or ask around on the IRC channel (#qemu on
irc.oftc.net).


Thanks,
Fam

Re: [Qemu-devel] [RFC PATCH v0 2/3] gluster: Implement .bdrv_co_write_zeroes for gluster

2013-12-05 Thread Bharata B Rao

On Wed, Dec 04, 2013 at 02:16:28PM -0500, Jeff Cody wrote:
> On Fri, Nov 22, 2013 at 12:46:17PM +0530, Bharata B Rao wrote:
> > +
> > +ret = glfs_zerofill_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
> > +if (ret < 0) {
> 
> I believe glfs_zerofill_async returns -1 on failure, and sets errno.
> In that case, we should set ret = -errno here.

This needs to be done for other routines too. Will address this and the
other comment you have given in 2/3 thread. Thanks.

Regards,
Bharata.
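
The convention Jeff points out (a call that returns -1 and leaves the cause in errno) is usually wrapped as shown below. `sketch_lib_call()` is a hypothetical stand-in for such a library call; nothing here touches the real GlusterFS API.

```c
#include <errno.h>

/* Hypothetical stand-in for a library call that returns -1 and sets errno
 * on failure, and returns 0 on success. */
static int sketch_lib_call(int fail_with)
{
    if (fail_with) {
        errno = fail_with;
        return -1;
    }
    return 0;
}

/* Converts the "-1 plus errno" convention into a negative errno value. */
static int sketch_wrapper(int fail_with)
{
    int ret = sketch_lib_call(fail_with);

    if (ret < 0) {
        ret = -errno;           /* propagate the real cause, not a bare -1 */
    }
    return ret;
}
```

Returning the bare -1 instead would be read by callers as -EPERM on Linux, hiding the actual failure reason.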

[Qemu-devel] [RFC PATCH v1 3/3] gluster: Add support for creating zero-filled image

2013-12-05 Thread Bharata B Rao

GlusterFS supports creation of a zero-filled file on a GlusterFS volume
by means of an API called glfs_zerofill(). Use this API from QEMU to
create an image that is filled with zeroes by using the preallocation
option of qemu-img.

qemu-img create gluster://server/volume/image -o preallocation=full 10G

The allowed values for preallocation are 'full' and 'off'. By default
preallocation is off and the image is not zero-filled.

glfs_zerofill() offloads the writing of zeroes to the server and if
the storage supports SCSI WRITESAME, the GlusterFS server can issue the
BLKZEROOUT ioctl to achieve the zeroing.

Signed-off-by: Bharata B Rao 
---
 block/gluster.c | 50 +-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/block/gluster.c b/block/gluster.c
index 1390270..c167abe 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -364,6 +364,29 @@ out:
 qemu_aio_release(acb);
 return ret;
 }
+
+static inline int gluster_supports_zerofill(void)
+{
+return 1;
+}
+
+static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
+int64_t size)
+{
+return glfs_zerofill(fd, offset, size);
+}
+
+#else
+static inline int gluster_supports_zerofill(void)
+{
+return 0;
+}
+
+static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
+int64_t size)
+{
+return 0;
+}
 #endif
 
 static int qemu_gluster_create(const char *filename,
@@ -372,6 +395,7 @@ static int qemu_gluster_create(const char *filename,
 struct glfs *glfs;
 struct glfs_fd *fd;
 int ret = 0;
+int prealloc = 0;
 int64_t total_size = 0;
 GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
 
@@ -384,6 +408,19 @@ static int qemu_gluster_create(const char *filename,
 while (options && options->name) {
 if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
 total_size = options->value.n / BDRV_SECTOR_SIZE;
+} else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
+if (!options->value.s || !strcmp(options->value.s, "off")) {
+prealloc = 0;
+} else if (!strcmp(options->value.s, "full") &&
+gluster_supports_zerofill()) {
+prealloc = 1;
+} else {
+error_setg(errp, "Invalid preallocation mode: '%s'"
+" or GlusterFS doesn't support zerofill API",
+   options->value.s);
+ret = -EINVAL;
+goto out;
+}
 }
 options++;
 }
@@ -393,9 +430,15 @@ static int qemu_gluster_create(const char *filename,
 if (!fd) {
 ret = -errno;
 } else {
-if (glfs_ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
+if (!glfs_ftruncate(fd, total_size * BDRV_SECTOR_SIZE)) {
+if (prealloc && qemu_gluster_zerofill(fd, 0,
+total_size * BDRV_SECTOR_SIZE)) {
+ret = -errno;
+}
+} else {
 ret = -errno;
 }
+
 if (glfs_close(fd) != 0) {
 ret = -errno;
 }
@@ -579,6 +622,11 @@ static QEMUOptionParameter qemu_gluster_create_options[] = {
 .type = OPT_SIZE,
 .help = "Virtual disk size"
 },
+{
+.name = BLOCK_OPT_PREALLOC,
+.type = OPT_STRING,
+.help = "Preallocation mode (allowed values: off, full)"
+},
 { NULL }
 };
 
-- 
1.7.11.7

[Qemu-devel] [RFC PATCH v1 2/3] gluster: Implement .bdrv_co_write_zeroes for gluster

2013-12-05 Thread Bharata B Rao

Support .bdrv_co_write_zeroes() from gluster driver by using GlusterFS API
glfs_zerofill() that off-loads the writing of zeroes to GlusterFS server.

Signed-off-by: Bharata B Rao 
---
 block/gluster.c | 77 +
 configure   |  8 ++
 2 files changed, 69 insertions(+), 16 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 88ef48d..1390270 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -245,6 +245,22 @@ static void qemu_gluster_complete_aio(void *opaque)
 qemu_coroutine_enter(acb->coroutine, NULL);
 }
 
+/*
+ * AIO callback routine called from GlusterFS thread.
+ */
+static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
+{
+GlusterAIOCB *acb = (GlusterAIOCB *)arg;
+
+acb->ret = ret;
+acb->bh = qemu_bh_new(qemu_gluster_complete_aio, acb);
+qemu_bh_schedule(acb->bh);
+}
+
+static const AIOCBInfo gluster_aiocb_info = {
+.aiocb_size = sizeof(GlusterAIOCB),
+};
+
 /* TODO Convert to fine grained options */
 static QemuOptsList runtime_opts = {
 .name = "gluster",
@@ -317,6 +333,39 @@ out:
 return ret;
 }
 
+#ifdef CONFIG_GLUSTERFS_ZEROFILL
+static int qemu_gluster_co_write_zeroes(BlockDriverState *bs,
+int64_t sector_num, int nb_sectors)
+{
+int ret;
+GlusterAIOCB *acb;
+BDRVGlusterState *s = bs->opaque;
+off_t size;
+off_t offset;
+
+offset = sector_num * BDRV_SECTOR_SIZE;
+size = nb_sectors * BDRV_SECTOR_SIZE;
+
+acb = qemu_aio_get(&gluster_aiocb_info, bs, NULL, NULL);
+acb->size = size;
+acb->ret = 0;
+acb->coroutine = qemu_coroutine_self();
+
+ret = glfs_zerofill_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
+if (ret < 0) {
+ret = -errno;
+goto out;
+}
+
+qemu_coroutine_yield();
+ret = acb->ret;
+
+out:
+qemu_aio_release(acb);
+return ret;
+}
+#endif
+
 static int qemu_gluster_create(const char *filename,
 QEMUOptionParameter *options, Error **errp)
 {
@@ -359,22 +408,6 @@ out:
 return ret;
 }
 
-static const AIOCBInfo gluster_aiocb_info = {
-.aiocb_size = sizeof(GlusterAIOCB),
-};
-
-/*
- * AIO callback routine called from GlusterFS thread.
- */
-static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
-{
-GlusterAIOCB *acb = (GlusterAIOCB *)arg;
-
-acb->ret = ret;
-acb->bh = qemu_bh_new(qemu_gluster_complete_aio, acb);
-qemu_bh_schedule(acb->bh);
-}
-
 static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
 int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int write)
 {
@@ -567,6 +600,9 @@ static BlockDriver bdrv_gluster = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
 .bdrv_co_discard  = qemu_gluster_co_discard,
 #endif
+#ifdef CONFIG_GLUSTERFS_ZEROFILL
+.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
+#endif
 .create_options   = qemu_gluster_create_options,
 };
 
@@ -588,6 +624,9 @@ static BlockDriver bdrv_gluster_tcp = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
 .bdrv_co_discard  = qemu_gluster_co_discard,
 #endif
+#ifdef CONFIG_GLUSTERFS_ZEROFILL
+.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
+#endif
 .create_options   = qemu_gluster_create_options,
 };
 
@@ -609,6 +648,9 @@ static BlockDriver bdrv_gluster_unix = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
 .bdrv_co_discard  = qemu_gluster_co_discard,
 #endif
+#ifdef CONFIG_GLUSTERFS_ZEROFILL
+.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
+#endif
 .create_options   = qemu_gluster_create_options,
 };
 
@@ -630,6 +672,9 @@ static BlockDriver bdrv_gluster_rdma = {
 #ifdef CONFIG_GLUSTERFS_DISCARD
 .bdrv_co_discard  = qemu_gluster_co_discard,
 #endif
+#ifdef CONFIG_GLUSTERFS_ZEROFILL
+.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
+#endif
 .create_options   = qemu_gluster_create_options,
 };
 
diff --git a/configure b/configure
index 0666228..886d71b 100755
--- a/configure
+++ b/configure
@@ -255,6 +255,7 @@ coroutine_pool=""
 seccomp=""
 glusterfs=""
 glusterfs_discard="no"
+glusterfs_zerofill="no"
 virtio_blk_data_plane=""
 gtk=""
 gtkabi="2.0"
@@ -2673,6 +2674,9 @@ if test "$glusterfs" != "no" ; then
 if $pkg_config --atleast-version=5 glusterfs-api; then
   glusterfs_discard="yes"
 fi
+if $pkg_config --atleast-version=6 glusterfs-api; then
+  glusterfs_zerofill="yes"
+fi
   else
 if test "$glusterfs" = "yes" ; then
   feature_not_found "GlusterFS backend support"
@@ -4175,6 +4179,10 @@ if test "$glusterfs_discard" = "yes" ; then
   echo "CONFIG_GLUSTERFS_DISCARD=y" >> $config_host_mak
 fi
 
+if test "$glusterfs_zerofill" = "yes" ; then
+  echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
+fi
+
 if test "$libssh2" = "yes" ; then
   echo "CONFIG_LIBSSH2=y" >> $config_host_mak
 fi
-- 
1.7.11.7

[Qemu-devel] [RFC PATCH v1 1/3] gluster: Convert aio routines into coroutines

2013-12-05 Thread Bharata B Rao

Convert the read, write, flush and discard implementations from aio-based
ones to coroutine based ones.

Signed-off-by: Bharata B Rao 
---
 block/gluster.c | 184 +++-
 1 file changed, 63 insertions(+), 121 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 877686a..88ef48d 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -24,13 +24,12 @@ typedef struct GlusterAIOCB {
 BlockDriverAIOCB common;
 int64_t size;
 int ret;
-bool *finished;
 QEMUBH *bh;
+Coroutine *coroutine;
 } GlusterAIOCB;
 
 typedef struct BDRVGlusterState {
 struct glfs *glfs;
-int fds[2];
 struct glfs_fd *fd;
 int event_reader_pos;
 GlusterAIOCB *event_acb;
@@ -231,46 +230,19 @@ out:
 return NULL;
 }
 
-static void qemu_gluster_complete_aio(GlusterAIOCB *acb, BDRVGlusterState *s)
+static void qemu_gluster_complete_aio(void *opaque)
 {
-int ret;
-bool *finished = acb->finished;
-BlockDriverCompletionFunc *cb = acb->common.cb;
-void *opaque = acb->common.opaque;
-
-if (!acb->ret || acb->ret == acb->size) {
-ret = 0; /* Success */
-} else if (acb->ret < 0) {
-ret = acb->ret; /* Read/Write failed */
-} else {
-ret = -EIO; /* Partial read/write - fail it */
-}
+GlusterAIOCB *acb = (GlusterAIOCB *)opaque;
 
-qemu_aio_release(acb);
-cb(opaque, ret);
-if (finished) {
-*finished = true;
+if (acb->ret == acb->size) {
+acb->ret = 0;
+} else if (acb->ret > 0) {
+acb->ret = -EIO; /* Partial read/write - fail it */
 }
-}
 
-static void qemu_gluster_aio_event_reader(void *opaque)
-{
-BDRVGlusterState *s = opaque;
-ssize_t ret;
-
-do {
-char *p = (char *)&s->event_acb;
-
-ret = read(s->fds[GLUSTER_FD_READ], p + s->event_reader_pos,
-   sizeof(s->event_acb) - s->event_reader_pos);
-if (ret > 0) {
-s->event_reader_pos += ret;
-if (s->event_reader_pos == sizeof(s->event_acb)) {
-s->event_reader_pos = 0;
-qemu_gluster_complete_aio(s->event_acb, s);
-}
-}
-} while (ret < 0 && errno == EINTR);
+qemu_bh_delete(acb->bh);
+acb->bh = NULL;
+qemu_coroutine_enter(acb->coroutine, NULL);
 }
 
 /* TODO Convert to fine grained options */
@@ -309,7 +281,6 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
 
 filename = qemu_opt_get(opts, "filename");
 
-
 s->glfs = qemu_gluster_init(gconf, filename);
 if (!s->glfs) {
 ret = -errno;
@@ -329,17 +300,7 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
 s->fd = glfs_open(s->glfs, gconf->image, open_flags);
 if (!s->fd) {
 ret = -errno;
-goto out;
-}
-
-ret = qemu_pipe(s->fds);
-if (ret < 0) {
-ret = -errno;
-goto out;
 }
-fcntl(s->fds[GLUSTER_FD_READ], F_SETFL, O_NONBLOCK);
-qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ],
-qemu_gluster_aio_event_reader, NULL, s);
 
 out:
 qemu_opts_del(opts);
@@ -398,44 +359,24 @@ out:
 return ret;
 }
 
-static void qemu_gluster_aio_cancel(BlockDriverAIOCB *blockacb)
-{
-GlusterAIOCB *acb = (GlusterAIOCB *)blockacb;
-bool finished = false;
-
-acb->finished = &finished;
-while (!finished) {
-qemu_aio_wait();
-}
-}
-
 static const AIOCBInfo gluster_aiocb_info = {
 .aiocb_size = sizeof(GlusterAIOCB),
-.cancel = qemu_gluster_aio_cancel,
 };
 
+/*
+ * AIO callback routine called from GlusterFS thread.
+ */
 static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
 {
 GlusterAIOCB *acb = (GlusterAIOCB *)arg;
-BlockDriverState *bs = acb->common.bs;
-BDRVGlusterState *s = bs->opaque;
-int retval;
 
 acb->ret = ret;
-retval = qemu_write_full(s->fds[GLUSTER_FD_WRITE], &acb, sizeof(acb));
-if (retval != sizeof(acb)) {
-/*
- * Gluster AIO callback thread failed to notify the waiting
- * QEMU thread about IO completion.
- */
-error_report("Gluster AIO completion failed: %s", strerror(errno));
-abort();
-}
+acb->bh = qemu_bh_new(qemu_gluster_complete_aio, acb);
+qemu_bh_schedule(acb->bh);
 }
 
-static BlockDriverAIOCB *qemu_gluster_aio_rw(BlockDriverState *bs,
-int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-BlockDriverCompletionFunc *cb, void *opaque, int write)
+static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
+int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int write)
 {
 int ret;
 GlusterAIOCB *acb;
@@ -446,10 +387,10 @@ static BlockDriverAIOCB *qemu_gluster_aio_rw(BlockDriverState *bs,
 offset = sector_num * BDRV_SECTOR_SIZE;
 size = nb_sectors * BDRV_SECTOR_SIZE;
 
-acb = qemu_aio_get(&gluster_aiocb_info, bs, cb, opaque);
+acb = qemu_aio_get(&gluster_aiocb_info, bs, NULL,

[Qemu-devel] [RFC PATCH v1 0/3] gluster: conversion to coroutines and supporting write_zeroes

2013-12-05 Thread Bharata B Rao

Hi,

This series is about converting all the bdrv_aio* implementations in gluster
driver to coroutine based implementations. Read, write, flush and discard
routines are converted.

This also adds support for .bdrv_co_write_zeroes() in gluster and provides
a new preallocation option with qemu-img (-o preallocation=full) that can
be used for raw images on GlusterFS backend to create fully allocated and
zero-filled images.

Changes in v1
-------------
- Removed qemu_gluster_aio_cancel() and associated code.
- Calling qemu_aio_release() from where aiocb is created.
- s/qemu_gluster_aio_rw/qemu_gluster_co_rw.
- Use errno appropriately from read, write, flush, discard and zerofill routines
  in gluster driver.
- Fix a memory leak in qemu_gluster_create().
- Proceed with glfs_zerofill() only if glfs_ftruncate() succeeds in
  qemu_gluster_create().

Bharata B Rao (3):
  gluster: Convert aio routines into coroutines
  gluster: Implement .bdrv_co_write_zeroes for gluster
  gluster: Add support for creating zero-filled image

 block/gluster.c | 297 +++-
 configure   |   8 ++
 2 files changed, 174 insertions(+), 131 deletions(-)

-- 
1.7.11.7

Re: [Qemu-devel] [PATCH] qdev: Keep global allocation counter per bus

2013-12-05 Thread Markus Armbruster

Paolo Bonzini  writes:

> Il 05/12/2013 10:44, Markus Armbruster ha scritto:
>> Incompatible change: device ide-cd moved to a different controller.
>
> Yes, it should be stated in the commit message but it's expected as
> discussed yesterday on IRC.  The solution is not to use "-device" (which
> was broken) if you care about backwards compatibility; use "-drive if=ide".

-device is broken for the *other* controller.  It works just fine for
this one.

>> Great fun when you try to live migrate across your patch.
>> 
>> I'd expect isapc to have the same issue once its crash bug is fixed.
>> 
>> First law of QEMU hacking: if your patch looks simple, it's probably
>> wrong ;)
>
> Yes, the question is how wrong and how the wrong balances the right.

Is it really too much bother to change the ide.0 name for the
controllers that bus=ide.0 doesn't use, and keep it for the one it does
use?

If yes, the incompatible change needs to be documented much more clearly
in the commit message.

Re: [Qemu-devel] [PATCH] qdev: Keep global allocation counter per bus

2013-12-05 Thread Paolo Bonzini

Il 05/12/2013 12:20, Markus Armbruster ha scritto:
> Paolo Bonzini  writes:
> 
>> Il 05/12/2013 10:44, Markus Armbruster ha scritto:
>>> Incompatible change: device ide-cd moved to a different controller.
>>
>> Yes, it should be stated in the commit message but it's expected as
>> discussed yesterday on IRC.  The solution is not to use "-device" (which
>> was broken) if you care about backwards compatibility; use "-drive if=ide".
> 
> -device is broken for the *other* controller.  It works just fine for
> this one.

It is broken in that "-device bus=ide.0" would access the second
controller, the one corresponding to "-drive if=ide,bus=1", or -hdc/-hdd.

>>> Great fun when you try to live migrate across your patch.
>>>
>>> I'd expect isapc to have the same issue once its crash bug is fixed.
>>>
>>> First law of QEMU hacking: if your patch looks simple, it's probably
>>> wrong ;)
>>
>> Yes, the question is how wrong and how the wrong balances the right.
> 
> Is it really too much bother to change the ide.0 name for the
> controllers that bus=ide.0 doesn't use, and keep it for the one it does
> use?

Yes, for two reasons:

(1) practical reason, the one mentioned above: it would mean that
"-device bus=ide.0" corresponds to "-drive if=ide,bus=1" and similarly
for ide.1/bus=0.  So we would make the cure worse than the disease, in
my opinion.  This IMO is a pretty strong sign that the
backwards-compatibility problem doesn't exist and no one ever used
"-device" for built-in devices on anything other than pc IDE and pseries
SCSI.

(2) technical reason: the two are inverted because bus names currently
have a "last wins" policy.  The policy is implemented by using
QTAILQ_INSERT_HEAD in bus_add_child.  So it is not possible to know the
correct bus names unless you know how many buses you will have (e.g. for
3 buses you'd start giving numbers from ide.2 and go down from there).
And implementing this would probably be really really ugly.
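
To picture the "last wins" effect described in (2), here is a minimal sketch. It is not QEMU code: struct bus, bus_list_insert_head() and bus_list_find() are hypothetical stand-ins, and a hand-rolled singly linked list takes the place of QTAILQ. The only assumption carried over from the text is that new entries go in at the head and that lookup by name stops at the first match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a bus entry; this is not QEMU's BusState. */
struct bus {
    const char *name;
    int created_order;      /* 0 = created first */
    struct bus *next;
};

/* Insert at the head of the list, which is what makes the newest entry
 * the first one a lookup will see. */
static void bus_list_insert_head(struct bus **head, struct bus *b)
{
    assert(b != NULL);
    b->next = *head;
    *head = b;
}

/* Walk from the head and return the first entry whose name matches,
 * or NULL if there is none. */
static struct bus *bus_list_find(struct bus *head, const char *name)
{
    for (struct bus *b = head; b != NULL; b = b->next) {
        if (strcmp(b->name, name) == 0) {
            return b;
        }
    }
    return NULL;
}
```

With two entries that share the name "ide.0", inserted in creation order, the lookup returns the one inserted second. Getting names to match creation order under this scheme would mean handing out the highest number first, which requires knowing the total in advance.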

> If yes, the incompatible change needs to be documented much more clearly
> in the commit message.

And in the release notes.

Paolo

[Qemu-devel] [PATCH V7 2/6] qcow2: add error message in qcow2_write_snapshots()

2013-12-05 Thread Wenchao Xia

The function still returns int, since qcow2_snapshot_delete() passes that
return value on to its caller.

Signed-off-by: Wenchao Xia 
Reviewed-by: Max Reitz 
---
 block/qcow2-snapshot.c |   43 +--
 1 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 670a58d..d7ab4ae 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -152,7 +152,7 @@ fail:
 }
 
 /* add at the end of the file a new list of snapshots */
-static int qcow2_write_snapshots(BlockDriverState *bs)
+static int qcow2_write_snapshots(BlockDriverState *bs, Error **errp)
 {
 BDRVQcowState *s = bs->opaque;
 QCowSnapshot *sn;
@@ -183,10 +183,15 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
 offset = snapshots_offset;
 if (offset < 0) {
 ret = offset;
+error_setg_errno(errp, -ret,
+ "Failed in allocation of cluster for snapshot list");
 goto fail;
 }
 ret = bdrv_flush(bs);
 if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed in flush after snapshot list cluster "
+ "allocation");
 goto fail;
 }
 
@@ -194,6 +199,10 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
  * must indeed be completely free */
 ret = qcow2_pre_write_overlap_check(bs, 0, offset, snapshots_size);
 if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed in overlap check for snapshot list cluster "
+ "at %" PRIi64 " with size %d",
+ offset, snapshots_size);
 goto fail;
 }
 
@@ -227,24 +236,40 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
 
 ret = bdrv_pwrite(bs->file, offset, &h, sizeof(h));
 if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed in write of snapshot header at %"
+ PRIi64 " with size %d",
+ offset, (int)sizeof(h));
 goto fail;
 }
 offset += sizeof(h);
 
 ret = bdrv_pwrite(bs->file, offset, &extra, sizeof(extra));
 if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed in write of extra snapshot data at %"
+ PRIi64 " with size %d",
+ offset, (int)sizeof(extra));
 goto fail;
 }
 offset += sizeof(extra);
 
 ret = bdrv_pwrite(bs->file, offset, sn->id_str, id_str_size);
 if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed in write of snapshot id string at %"
+ PRIi64 " with size %d",
+ offset, id_str_size);
 goto fail;
 }
 offset += id_str_size;
 
 ret = bdrv_pwrite(bs->file, offset, sn->name, name_size);
 if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed in write of snapshot name string at %"
+ PRIi64 " with size %d",
+ offset, name_size);
 goto fail;
 }
 offset += name_size;
@@ -256,6 +281,8 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
  */
 ret = bdrv_flush(bs);
 if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed in flush after snapshot list update");
 goto fail;
 }
 
@@ -268,6 +295,10 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
 ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, nb_snapshots),
&header_data, sizeof(header_data));
 if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed in update of image header at %d with size %d",
+ (int)offsetof(QCowHeader, nb_snapshots),
+ (int)sizeof(header_data));
 goto fail;
 }
 
@@ -283,6 +314,9 @@ fail:
 qcow2_free_clusters(bs, snapshots_offset, snapshots_size,
 QCOW2_DISCARD_ALWAYS);
 }
+if (errp) {
+g_assert(error_is_set(errp));
+}
 return ret;
 }
 
@@ -447,10 +481,8 @@ void qcow2_snapshot_create(BlockDriverState *bs,
 s->snapshots = new_snapshot_list;
 s->snapshots[s->nb_snapshots++] = *sn;
 
-ret = qcow2_write_snapshots(bs);
+ret = qcow2_write_snapshots(bs, errp);
 if (ret < 0) {
-/* Following line will be replaced with more detailed error later */
-error_setg(errp, "Failed in write of snapshot");
 g_free(s->snapshots);
 s->snapshots = old_snapshot_list;
 s->nb_snapshots--;
@@ -624,9 +656,8 @@ int qcow2_snapshot_delete(BlockDriverState *bs,
 s->snapshots + snapshot_index + 1,
 (s->nb_snapshots - snapshot_index - 1) * sizeof(sn));
 s->

[Qemu-devel] [PATCH V7 0/6] qcow2: rollback the modification on fail in snapshot creation

2013-12-05 Thread Wenchao Xia

V2:
  1: all fail case will goto fail section.
  2: add the goto code.

v3:
  Address Stefan's comments:
  2: don't goto fail after allocation failure.
  3: use sn->l1size correctly in qcow2_free_cluster().
  4-7: add test case to verify the error paths.
  Other:
  1: new patch fixes an existing bug, which will be exposed in the error path test.

v4:
  General change:
  rebased on upstream, since the error path for qcow2_write_snapshots() already
exists upstream. Removed old patch 1, since Max has fixed it upstream.
  5: moved the snapshot_l1_update event to just before the write operation,
instead of before the overlap check, since that is more straightforward.
  6: removed a duplicated error path test for the flush after the snapshot list
update; added a filter that replaces numbers with X, since the error report now
contains a detailed message including the failing cluster number.
  Address Stefan's comments:
  1, 2, 4: add *errp to store detailed error message, instead of error_report()
and compile time determined debug printf message.
  3: do not free clusters when the header update fails, for safety reasons.
  Address Eric's comments:
  1, 2, 4: add *errp to store detailed error message, instead of error_report()
and compile time determined debug printf message.
  5: squashed patches that add and use debug events.
  6: added comments about test only on Linux.

v5:
  General change:
  6: rebased on upstream, use case number 070, adjust 070.out due to error
message change in this version.

  Address Max's comments:
  1 use error_setg_errno() when possible, remove "ret =" in functions when
possible since the function does not need to return int value, fix 32bit/
64bit issue in printf for "sizeof" and "offse", typo fix.
  2 use error_setg_errno() when possible, fix 32bit/64bit issue in printf
for "sizeof" and "offse", typo fix.
  3 typo fix in comments.
  5 typo fix in commit message.

  Address Eric's comments:
  2 fix 32bit/64bit issue in printf for "sizeof" and "offse".

v6:
  Address Jeff's comments:
  6: add quote for image name in test case.

v7:
  Rebased on Stefan's block tree, since I need to test after Fam's
cache mode series.
  6: changed the case number to 075 to avoid a conflict; added a comment in the
case noting that it covers only the default cache mode, since qemu-img snapshot
is not affected by the case's cache setting.

Wenchao Xia (6):
  1 snapshot: add parameter *errp in snapshot create
  2 qcow2: add error message in qcow2_write_snapshots()
  3 qcow2: do not free clusters when fail in header update in qcow2_write_snapshots
  4 qcow2: cancel the modification on fail in qcow2_snapshot_create()
  5 blkdebug: add debug events for snapshot
  6 qemu-iotests: add test for qcow2 snapshot

 block/blkdebug.c |4 +
 block/qcow2-snapshot.c   |  105 ---
 block/qcow2.h|4 +-
 block/rbd.c  |   19 ++--
 block/sheepdog.c |   28 +++--
 block/snapshot.c |   19 +++-
 blockdev.c   |   10 +-
 include/block/block.h|4 +
 include/block/block_int.h|5 +-
 include/block/snapshot.h |5 +-
 qemu-img.c   |   10 +-
 savevm.c |   12 ++-
 tests/qemu-iotests/075   |  218 ++
 tests/qemu-iotests/075.out   |   35 ++
 tests/qemu-iotests/common.filter |7 ++
 tests/qemu-iotests/group |1 +
 16 files changed, 429 insertions(+), 57 deletions(-)
 create mode 100755 tests/qemu-iotests/075
 create mode 100644 tests/qemu-iotests/075.out

[Qemu-devel] [PATCH V7 4/6] qcow2: cancel the modification on fail in qcow2_snapshot_create()

2013-12-05 Thread Wenchao Xia

Signed-off-by: Wenchao Xia 
Reviewed-by: Max Reitz 
---
 block/qcow2-snapshot.c |   25 +
 1 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 55746c4..5f787bc 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -400,6 +400,7 @@ void qcow2_snapshot_create(BlockDriverState *bs,
 int i, ret;
 uint64_t *l1_table = NULL;
 int64_t l1_table_offset;
+Error *err = NULL;
 
 memset(sn, 0, sizeof(*sn));
 
@@ -448,7 +449,7 @@ void qcow2_snapshot_create(BlockDriverState *bs,
  PRIu64 " with size %" PRIu64,
  sn->l1_table_offset,
  (uint64_t)(s->l1_size * sizeof(uint64_t)));
-goto fail;
+goto dealloc_cluster;
 }
 
 ret = bdrv_pwrite(bs->file, sn->l1_table_offset, l1_table,
@@ -459,7 +460,7 @@ void qcow2_snapshot_create(BlockDriverState *bs,
  PRIu64 " with size %" PRIu64,
  sn->l1_table_offset,
  (uint64_t)(s->l1_size * sizeof(uint64_t)));
-goto fail;
+goto dealloc_cluster;
 }
 
 g_free(l1_table);
@@ -476,7 +477,7 @@ void qcow2_snapshot_create(BlockDriverState *bs,
  "Failed in update of refcount for snapshot at %"
  PRIu64 " with size %d",
  s->l1_table_offset, s->l1_size);
-goto fail;
+goto dealloc_cluster;
 }
 
 /* Append the new snapshot to the snapshot list */
@@ -494,7 +495,7 @@ void qcow2_snapshot_create(BlockDriverState *bs,
 g_free(s->snapshots);
 s->snapshots = old_snapshot_list;
 s->nb_snapshots--;
-goto fail;
+goto restore_refcount;
 }
 
 g_free(old_snapshot_list);
@@ -514,6 +515,22 @@ void qcow2_snapshot_create(BlockDriverState *bs,
 #endif
 return;
 
+restore_refcount:
+if (qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, -1)
+< 0 && errp) {
+/* Nothing can be done now, need image check later */
+error_setg(&err, "%s\nqcow2: Error in restoring refcount in snapshot",
+   error_get_pretty(*errp));
+error_free(*errp);
+*errp = NULL;
+error_propagate(errp, err);
+}
+
+dealloc_cluster:
+qcow2_free_clusters(bs, sn->l1_table_offset,
+sn->l1_size * sizeof(uint64_t),
+QCOW2_DISCARD_ALWAYS);
+
 fail:
 g_free(sn->id_str);
 g_free(sn->name);
-- 
1.7.1
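
The label ordering in the patch above (restore_refcount, then dealloc_cluster, then fail) relies on fall-through, so a later failure undoes everything an earlier failure would. A minimal sketch of that pattern follows. It is not the qcow2 code: the stage names and undo flags are hypothetical, and the undo steps are reduced to setting bits so the order of what runs can be checked.

```c
#include <assert.h>

/* Hypothetical flags recording which undo steps ran. */
enum { UNDO_REFCOUNT = 1, UNDO_CLUSTER = 2, UNDO_NAMES = 4 };

/* Hypothetical points at which the operation can fail. */
enum stage { STAGE_NONE, STAGE_ALLOC, STAGE_L1_WRITE, STAGE_LIST_WRITE };

/* Returns 0 on success, otherwise the set of undo steps that ran. */
static int create_sketch(enum stage fail_at)
{
    int undone = 0;

    assert(fail_at <= STAGE_LIST_WRITE);

    if (fail_at == STAGE_ALLOC) {
        goto fail;              /* nothing allocated yet */
    }
    if (fail_at == STAGE_L1_WRITE) {
        goto dealloc_cluster;   /* clusters exist, refcounts untouched */
    }
    if (fail_at == STAGE_LIST_WRITE) {
        goto restore_refcount;  /* refcounts were already bumped */
    }
    return undone;

restore_refcount:
    undone |= UNDO_REFCOUNT;
    /* fall through */
dealloc_cluster:
    undone |= UNDO_CLUSTER;
    /* fall through */
fail:
    undone |= UNDO_NAMES;
    return undone;
}
```

Each label only undoes its own step and then falls into the next one, so a new failure point only needs to pick the right entry label.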

[Qemu-devel] [PATCH V7 1/6] snapshot: add parameter *errp in snapshot create

2013-12-05 Thread Wenchao Xia

Before this patch the return value was only used for error reporting,
so change the function prototype to return void.

Signed-off-by: Wenchao Xia 
Reviewed-by: Max Reitz 
---
 block/qcow2-snapshot.c|   30 +-
 block/qcow2.h |4 +++-
 block/rbd.c   |   19 ++-
 block/sheepdog.c  |   28 ++--
 block/snapshot.c  |   19 +--
 blockdev.c|   10 --
 include/block/block_int.h |5 +++--
 include/block/snapshot.h  |5 +++--
 qemu-img.c|   10 ++
 savevm.c  |   12 
 10 files changed, 93 insertions(+), 49 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index ad8bf3d..670a58d 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -347,7 +347,9 @@ static int find_snapshot_by_id_or_name(BlockDriverState *bs,
 }
 
 /* if no id is provided, a new one is constructed */
-int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
+void qcow2_snapshot_create(BlockDriverState *bs,
+   QEMUSnapshotInfo *sn_info,
+   Error **errp)
 {
 BDRVQcowState *s = bs->opaque;
 QCowSnapshot *new_snapshot_list = NULL;
@@ -366,7 +368,8 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
 
 /* Check that the ID is unique */
 if (find_snapshot_by_id_and_name(bs, sn_info->id_str, NULL) >= 0) {
-return -EEXIST;
+error_setg(errp, "Snapshot with id %s already exist", sn_info->id_str);
+return;
 }
 
 /* Populate sn with passed data */
@@ -382,7 +385,8 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
 /* Allocate the L1 table of the snapshot and copy the current one there. */
 l1_table_offset = qcow2_alloc_clusters(bs, s->l1_size * sizeof(uint64_t));
 if (l1_table_offset < 0) {
-ret = l1_table_offset;
+error_setg_errno(errp, -l1_table_offset,
+ "Failed in allocation of snapshot L1 table");
 goto fail;
 }
 
@@ -397,12 +401,22 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
 ret = qcow2_pre_write_overlap_check(bs, 0, sn->l1_table_offset,
 s->l1_size * sizeof(uint64_t));
 if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed in overlap check for snapshot L1 table at %"
+ PRIu64 " with size %" PRIu64,
+ sn->l1_table_offset,
+ (uint64_t)(s->l1_size * sizeof(uint64_t)));
 goto fail;
 }
 
 ret = bdrv_pwrite(bs->file, sn->l1_table_offset, l1_table,
   s->l1_size * sizeof(uint64_t));
 if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed in update of snapshot L1 table at %"
+ PRIu64 " with size %" PRIu64,
+ sn->l1_table_offset,
+ (uint64_t)(s->l1_size * sizeof(uint64_t)));
 goto fail;
 }
 
@@ -416,6 +430,10 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
  */
 ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 1);
 if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed in update of refcount for snapshot at %"
+ PRIu64 " with size %d",
+ s->l1_table_offset, s->l1_size);
 goto fail;
 }
 
@@ -431,6 +449,8 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
 
 ret = qcow2_write_snapshots(bs);
 if (ret < 0) {
+/* Following line will be replaced with more detailed error later */
+error_setg(errp, "Failed in write of snapshot");
 g_free(s->snapshots);
 s->snapshots = old_snapshot_list;
 s->nb_snapshots--;
@@ -452,14 +472,14 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
   qcow2_check_refcounts(bs, &result, 0);
 }
 #endif
-return 0;
+return;
 
 fail:
 g_free(sn->id_str);
 g_free(sn->name);
 g_free(l1_table);
 
-return ret;
+return;
 }
 
 /* copy the snapshot 'snapshot_name' into the current disk image */
diff --git a/block/qcow2.h b/block/qcow2.h
index 303eb26..c56a5b6 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -481,7 +481,9 @@ int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors);
 int qcow2_expand_zero_clusters(BlockDriverState *bs);
 
 /* qcow2-snapshot.c functions */
-int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info);
+void qcow2_snapshot_create(BlockDriverState *bs,
+   QEMUSnapshotInfo *sn_info,
+   Error **errp);
 int qcow2_snapshot_goto(BlockDriverState *bs,

[Qemu-devel] [PATCH V7 3/6] qcow2: do not free clusters when fail in header update in qcow2_write_snapshots

2013-12-05 Thread Wenchao Xia

Signed-off-by: Wenchao Xia 
Reviewed-by: Max Reitz 
---
 block/qcow2-snapshot.c |8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index d7ab4ae..55746c4 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -299,6 +299,14 @@ static int qcow2_write_snapshots(BlockDriverState *bs, Error **errp)
  "Failed in update of image header at %d with size %d",
  (int)offsetof(QCowHeader, nb_snapshots),
  (int)sizeof(header_data));
+
+/*
+ * If the snapshot data part has been updated on disk, then the
+ * clusters at snapshot_offset may be used in next snapshot operation.
+ * If we free those clusters in fail path, they may be allocated and
+ * made dirty causing damage, so skip cluster free to be safe.
+ */
+snapshots_offset = 0;
 goto fail;
 }
 
-- 
1.7.1
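
The idea in the comment added by this patch can be sketched in isolation: the fail path only frees the clusters when the offset is still positive, so resetting it to 0 before the jump keeps them. This is a minimal sketch, not the qcow2 code. write_list_sketch(), free_clusters_stub() and frees_done are hypothetical names, and the guard on a positive offset is an assumption about the fail path, which is only partly visible in the hunks quoted in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Counts calls to the hypothetical free helper below. */
static int frees_done;

/* Stand-in for the real cluster free; it only counts invocations. */
static void free_clusters_stub(int64_t offset)
{
    assert(offset > 0);
    frees_done++;
}

/* fail_early:  simulate a failure before the list may have reached disk.
 * fail_header: simulate a failure in the final header update. */
static int write_list_sketch(int fail_early, int fail_header)
{
    int64_t list_offset = 0x10000;  /* pretend this was just allocated */
    int ret = 0;

    if (fail_early) {
        ret = -EIO;
        goto fail;
    }
    if (fail_header) {
        ret = -EIO;
        /* The list data may already be on disk, so keep its clusters. */
        list_offset = 0;
        goto fail;
    }
    return 0;

fail:
    if (list_offset > 0) {
        free_clusters_stub(list_offset);
    }
    return ret;
}
```

The early failure frees the clusters; the header failure returns the same error but leaves them allocated, accepting a leak over the risk of reuse.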

[Qemu-devel] [PATCH V7 5/6] blkdebug: add debug events for snapshot

2013-12-05 Thread Wenchao Xia

Some code in qcow2-snapshot.c directly accesses bs->file, so in those
places errors can't be injected by other events. Since the code in
qcow2-snapshot.c is similar to the other qcow2 internal code (in regards
to e.g. the L1 table), add some debug events.

Signed-off-by: Wenchao Xia 
Reviewed-by: Max Reitz 
---
 block/blkdebug.c   |4 
 block/qcow2-snapshot.c |3 +++
 include/block/block.h  |4 
 3 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 37cf028..891b549 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -186,6 +186,10 @@ static const char *event_names[BLKDBG_EVENT_MAX] = {
 
 [BLKDBG_FLUSH_TO_OS]= "flush_to_os",
 [BLKDBG_FLUSH_TO_DISK]  = "flush_to_disk",
+
+[BLKDBG_SNAPSHOT_L1_UPDATE] = "snapshot_l1_update",
+[BLKDBG_SNAPSHOT_LIST_UPDATE]   = "snapshot_list_update",
+[BLKDBG_SNAPSHOT_HEADER_UPDATE] = "snapshot_header_update",
 };
 
 static int get_event_by_name(const char *name, BlkDebugEvent *event)
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 5f787bc..2548de7 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -207,6 +207,7 @@ static int qcow2_write_snapshots(BlockDriverState *bs, Error **errp)
 }
 
 
+BLKDBG_EVENT(bs->file, BLKDBG_SNAPSHOT_LIST_UPDATE);
 /* Write all snapshots to the new list */
 for(i = 0; i < s->nb_snapshots; i++) {
 sn = s->snapshots + i;
@@ -292,6 +293,7 @@ static int qcow2_write_snapshots(BlockDriverState *bs, Error **errp)
 header_data.nb_snapshots= cpu_to_be32(s->nb_snapshots);
 header_data.snapshots_offset= cpu_to_be64(snapshots_offset);
 
+BLKDBG_EVENT(bs->file, BLKDBG_SNAPSHOT_HEADER_UPDATE);
 ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, nb_snapshots),
&header_data, sizeof(header_data));
 if (ret < 0) {
@@ -452,6 +454,7 @@ void qcow2_snapshot_create(BlockDriverState *bs,
 goto dealloc_cluster;
 }
 
+BLKDBG_EVENT(bs->file, BLKDBG_SNAPSHOT_L1_UPDATE);
 ret = bdrv_pwrite(bs->file, sn->l1_table_offset, l1_table,
   s->l1_size * sizeof(uint64_t));
 if (ret < 0) {
diff --git a/include/block/block.h b/include/block/block.h
index 36efaea..8901683 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -515,6 +515,10 @@ typedef enum {
 BLKDBG_FLUSH_TO_OS,
 BLKDBG_FLUSH_TO_DISK,
 
+BLKDBG_SNAPSHOT_L1_UPDATE,
+BLKDBG_SNAPSHOT_LIST_UPDATE,
+BLKDBG_SNAPSHOT_HEADER_UPDATE,
+
 BLKDBG_EVENT_MAX,
 } BlkDebugEvent;
 
-- 
1.7.1

Re: [Qemu-devel] [RFC PATCH v0 1/3] gluster: Convert aio routines into coroutines

2013-12-05 Thread Bharata B Rao

On Tue, Dec 03, 2013 at 03:04:01PM +0100, Stefan Hajnoczi wrote:
> On Fri, Nov 22, 2013 at 12:46:16PM +0530, Bharata B Rao wrote:
> > +qemu_bh_delete(acb->bh);
> > +acb->bh = NULL;
> > +qemu_coroutine_enter(acb->coroutine, NULL);
> > +if (acb->finished) {
> > +*acb->finished = true;
> > +}
> 
> Now that aio interfaces are gone ->finished and cancellation can be
> removed.
> 
> > +qemu_aio_release(acb);
> 
> Please do this in the functions that called qemu_aio_get().  Coroutines
> may yield so it's a little risky to assume the coroutine has finished
> accessing acb.
> > -static BlockDriverAIOCB *qemu_gluster_aio_rw(BlockDriverState *bs,
> > -int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
> > -BlockDriverCompletionFunc *cb, void *opaque, int write)
> > +static coroutine_fn int qemu_gluster_aio_rw(BlockDriverState *bs,
> > +int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int write)
> 
> Please rename this to qemu_gluster_co_rw() since it isn't aio anymore
> and doesn't return a BlockDriverAIOCB.

Thanks will address these in v1.

Regards,
Bharata.

[Qemu-devel] [PATCH V7 6/6] qemu-iotests: add test for qcow2 snapshot

2013-12-05 Thread Wenchao Xia

This test focuses on the low-level procedure of qcow2 snapshot
operations; for now it covers only the create operation. Overlap error
paths are not checked, since there is no good way to trigger those errors.

Signed-off-by: Wenchao Xia 
---
 tests/qemu-iotests/075   |  218 ++
 tests/qemu-iotests/075.out   |   35 ++
 tests/qemu-iotests/common.filter |7 ++
 tests/qemu-iotests/group |1 +
 4 files changed, 261 insertions(+), 0 deletions(-)
 create mode 100755 tests/qemu-iotests/075
 create mode 100644 tests/qemu-iotests/075.out

diff --git a/tests/qemu-iotests/075 b/tests/qemu-iotests/075
new file mode 100755
index 000..fac50c8
--- /dev/null
+++ b/tests/qemu-iotests/075
@@ -0,0 +1,218 @@
+#!/bin/bash
+#
+# qcow2 internal snapshot test
+#
+# Copyright (C) 2013 IBM, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+# Note: This case uses qemu-img snapshot create, and only the default
+# cache mode is covered now.
+owner=xiaw...@linux.vnet.ibm.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+BLKDEBUG_CONF="$TEST_DIR/blkdebug.conf"
+
+_cleanup()
+{
+_cleanup_test_img
+rm "$BLKDEBUG_CONF"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.pattern
+
+# only test qcow2
+_supported_fmt qcow2
+_supported_proto generic
+# bind the errno correctly and filter the output of image check and qemu-img,
+# if you want to run it on other OS
+_supported_os Linux
+
+
+IMGOPTS="compat=1.1"
+
+CLUSTER_SIZE=65536
+
+SIZE=1G
+
+BLKDBG_TEST_IMG="blkdebug:$BLKDEBUG_CONF:$TEST_IMG"
+
+errno=5
+
+once=on
+
+imm=off
+
+
+# Start test, note that the injected errors are related to qcow2's snapshot
+# logic closely, see qcow2-snapshot.c for more details.
+
+# path 1: fail in L1 table allocation for snapshot
+echo
+echo "Path 1: fail in allocation of L1 table"
+
+_make_test_img $SIZE
+
+cat > "$BLKDEBUG_CONF" <&1
+
+
+# path 2: fail in update new L1 table
+echo
+echo "Path 2: fail in update new L1 table for snapshot"
+
+_make_test_img $SIZE
+
+cat > "$BLKDEBUG_CONF" <&1 | _filter_number
+$QEMU_IMG snapshot -l "$TEST_IMG"
+_check_test_img 2>&1
+
+# path 3: fail in update refcount block before write snapshot list
+echo
+echo "Path 3: fail in update refcount block before write snapshot list"
+
+_make_test_img $SIZE
+
+cat > "$BLKDEBUG_CONF" <&1 | _filter_number
+$QEMU_IMG snapshot -l "$TEST_IMG"
+_check_test_img 2>&1
+
+# path 4: fail in snapshot list allocation or its flush. It is possible that
+# qcow2_alloc_clusters() does not fail immediately because of a cache hit, but
+# in any case, no error should be found in the image check.
+echo
+echo "Path 4: fail in snapshot list allocation or its flush"
+
+_make_test_img $SIZE
+
+cat > "$BLKDEBUG_CONF" <&1 | grep "Failed" | 
grep "allocation" | grep "list"`
+if ! test -z "$err"
+then
+echo "Error happens as expected"
+fi
+$QEMU_IMG snapshot -l "$TEST_IMG"
+_check_test_img 2>&1
+
+
+# path 5: fail in snapshot list update
+echo
+echo "Path 5: fail in snapshot list update"
+
+_make_test_img $SIZE
+
+cat > "$BLKDEBUG_CONF" <&1 | _filter_number
+$QEMU_IMG snapshot -l "$TEST_IMG"
+_check_test_img 2>&1
+
+# path 6: fail in flush after snapshot list update, no good way to trigger it,
+# since the cache is empty and makes flush do nothing in that call, so leave
+# this path not tested
+
+# path 7: fail in update of qcow2 header. This leaks clusters, since the
+# allocated ones are not discarded for safety reasons, see qcow2-snapshot.c.
+echo
+echo "Path 7: fail in update qcow2 header"
+
+_make_test_img $SIZE
+
+cat > "$BLKDEBUG_CONF" <&1 | _filter_number
+$QEMU_IMG snapshot -l "$TEST_IMG"
+_check_test_img 2>&1 | _filter_number
+
+# path 8: fail in overlap check before update L1 table for snapshot
+# path 9: fail in overlap check before update snapshot list
+# Since those clusters are allocated at runtime, there is no good way to
+# make them overlap in this script, so skip those two paths now.
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/075.out b/tests/qemu-iotests/075.out
new file mode 100644
index 000..16eb4fc
--- /dev/null
+++ b/tests/qemu-iotests/075.out
@@ -0,0 +1,35 @@
+QA output created by 075
+
+Path 1: fail in allocation of L1

Re: [Qemu-devel] [PATCHv3 1.8 0/9] qemu-img convert optimizations

2013-12-05 Thread Stefan Hajnoczi

On Wed, Nov 27, 2013 at 11:07:00AM +0100, Peter Lieven wrote:
> this series adds some optimizations for qemu-img during convert which
> have been developed recently:
> - skipping input based on get_block_status
> - variable I/O buffer size
> - align write requests to cluster_size
> 
> v2->v3:
>   - added Paolos comments in Patch 1
>   - changed the comment in patch 7 [Paolo]
>   - remove the patch to add sector progress output
>   - added a new patch to decrease the progress update interval.
> 
> v1->v2:
>   - introduce opt_transfer_length in BlockLimits [Paolo]
>   - remove knobs for iobuffer_size and alignment and
> use them unconditionally [Paolo]
>   - calculate I/O buffer size by BlockLimits information [Paolo]
>   - change the alignment patch to round down to the
> last and not to the next aligned sector [Paolo]
>   - limit updates in the sector progress output
>   - new patch to increase the default for min_sparse [Paolo]
> 
> Peter Lieven (9):
>   qemu-img: add support for skipping zeroes in input during convert
>   qemu-img: fix usage instruction for qemu-img convert
>   block/iscsi: set bdi->cluster_size
>   block: add opt_transfer_length to BlockLimits
>   block/iscsi: set bs->bl.opt_transfer_length
>   qemu-img: dynamically adjust iobuffer size during convert
>   qemu-img: round down request length to an aligned sector
>   qemu-img: increase min_sparse to 128 sectors (64kb)
>   qemu-img: decrease progress update interval on convert
> 
>  block/iscsi.c |   10 
>  include/block/block_int.h |3 ++
>  qemu-img.c|  131 
> +++--
>  qemu-img.texi |2 +-
>  4 files changed, 93 insertions(+), 53 deletions(-)

Merged all except patch 8/9.

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

Re: [Qemu-devel] [PATCH 1/4] memory: cache min/max_access_size

2013-12-05 Thread Uri Lublin


On 12/02/2013 04:40 PM, Paolo Bonzini wrote:

This will simplify the code in the next patch.

Signed-off-by: Paolo Bonzini 
---
  include/exec/memory.h |  2 ++
  memory.c  | 27 +++
  2 files changed, 13 insertions(+), 16 deletions(-)

  
  typedef struct MemoryListener MemoryListener;

diff --git a/memory.c b/memory.c
index 28f6449..56e54aa 100644
--- a/memory.c
+++ b/memory.c
@@ -443,8 +443,6 @@ static void memory_region_write_accessor(MemoryRegion *mr,
  static void access_with_adjusted_size(hwaddr addr,
uint64_t *value,
unsigned size,
-  unsigned access_size_min,
-  unsigned access_size_max,
void (*access)(MemoryRegion *mr,
   hwaddr addr,
   uint64_t *value,
@@ -457,15 +455,8 @@ static void access_with_adjusted_size(hwaddr addr,
  unsigned access_size;
  unsigned i;
  
-if (!access_size_min) {

-access_size_min = 1;
-}
-if (!access_size_max) {
-access_size_max = 4;
-}
-
  /* FIXME: support unaligned access? */
-access_size = MAX(MIN(size, access_size_max), access_size_min);
+access_size = MAX(MIN(size, mr->min_access_size), mr->min_access_size);

Hi Paolo,

Here it should be   mr->max_access_size

[ btw, MAX(MIN(a,b), b) = b ]
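
A small sketch of the point above, outside of memory.c. The MIN/MAX macros here are local stand-ins assumed to behave like QEMU's, and clamp_access_size() and buggy_access_size() are hypothetical helper names used only for illustration.

```c
#include <assert.h>

/* Local stand-ins, assumed to behave like QEMU's MIN/MAX macros. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Intended form: clamp size into [min_size, max_size]. */
static unsigned clamp_access_size(unsigned size, unsigned min_size,
                                  unsigned max_size)
{
    assert(min_size <= max_size);
    return MAX(MIN(size, max_size), min_size);
}

/* Form from the posted hunk: both bounds are min_size, so the inner MIN
 * can never exceed min_size and the result is always min_size. */
static unsigned buggy_access_size(unsigned size, unsigned min_size)
{
    return MAX(MIN(size, min_size), min_size);
}
```

With min 1 and max 4, the intended form maps sizes 0, 2 and 8 to 1, 2 and 4, while the posted form yields 1 for every size.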

Regards,
Uri.

Re: [Qemu-devel] [PATCH 1/4] memory: cache min/max_access_size

2013-12-05 Thread Paolo Bonzini

Il 05/12/2013 13:19, Uri Lublin ha scritto:
>>
>>   /* FIXME: support unaligned access? */
>> -access_size = MAX(MIN(size, access_size_max), access_size_min);
>> +access_size = MAX(MIN(size, mr->min_access_size),
>> mr->min_access_size);
> Hi Paolo,
> 
> Here it should be   mr->max_access_size
> 
> [ btw, MAX(MIN(a,b), b) = b ]

Thanks.

Paolo

Re: [Qemu-devel] [PATCH] qcow2: use start_of_cluster() and offset_into_cluster() everywhere

2013-12-05 Thread Stefan Hajnoczi

On Thu, Dec 05, 2013 at 02:32:34PM +0800, Hu Tao wrote:
> Signed-off-by: Hu Tao 
> ---
>  block/qcow2-cluster.c  |  2 +-
>  block/qcow2-refcount.c | 22 +++---
>  2 files changed, 12 insertions(+), 12 deletions(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

Re: [Qemu-devel] [PATCH 1/3] scsi-disk: close drive on START_STOP

2013-12-05 Thread Markus Armbruster

Markus Armbruster  writes:

> Paolo Bonzini  writes:
>
>> Il 04/12/2013 05:55, Alexey Kardashevskiy ha scritto:
>>> Normally the user is expected to eject DVD if it is not locked by
>>> the guest. eject_device() makes few checks and calls bdrv_close()
>>> if DVD is not in use.
>>> 
>>> However it is still possible to eject DVD even if it is in use.
>>> For that, QEMU sets "eject requested" flag, the guest reads it, issues
>>> ALLOW_MEDIUM_REMOVAL(enable=1) and START_STOP(start=0). But in this case,
>>> bdrv_close() is not called anywhere so it remains "inserted" in QEMU's
>>> terms.
>>
>> This is expected behavior, and matches what IDE does.
>>
>> Markus, can you confirm?
>
> Confirmed.  See commit 4be9762.
>
> Alexey, monitor command eject does two things: it first opens the tray,
> and if that works, it removes the medium.
>
> If the tray is locked closed, it tells the device model that eject was
> requested.  Works just like the physical eject button.
>
> With -f, it then rips out the medium.  This is similar to opening the
> tray with an unbent paperclip.  Let's ignore this case.
>
> The scsi-cd device model tells the guest about the eject request.  A
> well-behaved guest will then command the device to unlock and open the
> tray.
>
> The guest uses the same commands on behalf of its applications,
> e.g. /usr/bin/eject.
>
> Your patch changes behavior of "eject /dev/sr0 && eject -t /dev/sr0":
> you no longer get the same medium back.  You normally do with real
> hardware.

Alexey asked me for details on IRC.

$ qemu -nodefaults -monitor stdio -S -machine accel=kvm -m 512 \
    -display vnc=:0 -device cirrus-vga \
    -drive if=none,id=disk,file=test.qcow2 -device ide-hd,drive=disk,bus=ide.0 \
    -drive if=none,id=cd,file=f16.iso -device ide-cd,drive=cd,bus=ide.1
QEMU 1.7.50 monitor - type 'help' for more information
(qemu) info block cd

cd: f16.iso (raw)
Removable device: not locked, tray closed

Boot the guest (Fedora 16, no X)

(qemu) c

The guest locked the tray:

(qemu) info block cd

cd: f16.iso (raw)
Removable device: locked, tray closed

In the guest, log in as root on the console, and run

# eject /dev/sr0

Makes the guest open the tray:

(qemu) info block cd

cd: f16.iso (raw)
Removable device: locked, tray open

In the guest, run

# eject -t /dev/sr0

Makes the guest close the tray:

(qemu) info block cd

cd: f16.iso (raw)
Removable device: locked, tray closed

Verify the guest can access the medium:

# mount -r /dev/sr0 /mnt

> The somewhat unfortunate consequence is that monitor command eject can
> only remove the medium when the tray is not locked.
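
The semantics described above can be sketched as a toy model. This is a rough illustration only: the struct, its fields and the function names are hypothetical and are not QEMU's data structures, and the lock handling is simplified (a real guest may re-lock the tray on its own).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical drive state; field names are illustrative only. */
struct cd_drive {
    bool medium_present;
    bool tray_open;
    bool tray_locked;
    bool eject_requested;   /* set when eject is asked while locked closed */
};

/* Monitor "eject": returns true if the medium was removed. */
static bool monitor_eject(struct cd_drive *d, bool force)
{
    assert(d);
    if (d->tray_locked && !d->tray_open) {
        /* Locked closed: behave like the physical eject button. */
        d->eject_requested = true;
        if (!force) {
            return false;
        }
        /* -f: rip the medium out anyway (the "paperclip" case). */
        d->medium_present = false;
        return true;
    }
    /* Tray opens, and since that worked, the medium is removed. */
    d->tray_open = true;
    d->medium_present = false;
    return true;
}

/* Guest ALLOW_MEDIUM_REMOVAL: lock or unlock the tray. */
static void guest_set_lock(struct cd_drive *d, bool locked)
{
    d->tray_locked = locked;
}

/*
 * Guest START_STOP with load/eject: moves the tray only.
 * The medium stays with the drive, so closing the tray again
 * gives the guest the same medium back.
 */
static bool guest_move_tray(struct cd_drive *d, bool open)
{
    if (d->tray_locked) {
        return false;
    }
    d->tray_open = open;
    if (open) {
        d->eject_requested = false;
    }
    return true;
}
```

In this model, a guest that opens and then closes the tray still sees the medium, which is the "eject /dev/sr0 && eject -t /dev/sr0" behavior discussed in the thread.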

Re: [Qemu-devel] [PATCH 01/27] acpi: factor out common pm_update_sci() into acpi core

2013-12-05 Thread Michael S. Tsirkin

On Thu, Nov 21, 2013 at 03:38:22AM +0100, Igor Mammedov wrote:
> Signed-off-by: Igor Mammedov 

Sorry, this doesn't apply.
Can you rebase on top of the latest tree, please?

> ---
> perhaps this patch should go before "piix4: add acpi pci hotplug support"
> so that there would be no need for this rename in piix4_acpi_pci_hotplug()
> here.
> 
> s/pm_update_sci/acpi_update_sci/
> ---
>  hw/acpi/core.c |   18 ++
>  hw/acpi/ich9.c |   23 ++-
>  hw/acpi/piix4.c|   34 ++
>  include/hw/acpi/acpi.h |8 
>  4 files changed, 38 insertions(+), 45 deletions(-)
> 
> diff --git a/hw/acpi/core.c b/hw/acpi/core.c
> index 58308a3..8c0d48c 100644
> --- a/hw/acpi/core.c
> +++ b/hw/acpi/core.c
> @@ -662,3 +662,21 @@ uint32_t acpi_gpe_ioport_readb(ACPIREGS *ar, uint32_t addr)
>  
>  return val;
>  }
> +
> +void acpi_update_sci(ACPIREGS *regs, qemu_irq irq, uint32_t gpe0_sts_mask)
> +{
> +int sci_level, pm1a_sts;
> +
> +pm1a_sts = acpi_pm1_evt_get_sts(regs);
> +
> +sci_level = ((pm1a_sts &
> +  regs->pm1.evt.en & ACPI_BITMASK_PM1_COMMON_ENABLED) != 0) ||
> +((regs->gpe.sts[0] & regs->gpe.en[0] & gpe0_sts_mask) != 0);
> +
> +qemu_set_irq(irq, sci_level);
> +
> +/* schedule a timer interruption if needed */
> +acpi_pm_tmr_update(regs,
> +   (regs->pm1.evt.en & ACPI_BITMASK_TIMER_ENABLE) &&
> +   !(pm1a_sts & ACPI_BITMASK_TIMER_STATUS));
> +}
> diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
> index 7e0429e..e59688b 100644
> --- a/hw/acpi/ich9.c
> +++ b/hw/acpi/ich9.c
> @@ -44,29 +44,10 @@ do { printf("%s "fmt, __func__, ## __VA_ARGS__); } while (0)
>  #define ICH9_DEBUG(fmt, ...)do { } while (0)
>  #endif
>  
> -static void pm_update_sci(ICH9LPCPMRegs *pm)
> -{
> -int sci_level, pm1a_sts;
> -
> -pm1a_sts = acpi_pm1_evt_get_sts(&pm->acpi_regs);
> -
> -sci_level = (((pm1a_sts & pm->acpi_regs.pm1.evt.en) &
> -  (ACPI_BITMASK_RT_CLOCK_ENABLE |
> -   ACPI_BITMASK_POWER_BUTTON_ENABLE |
> -   ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
> -   ACPI_BITMASK_TIMER_ENABLE)) != 0);
> -qemu_set_irq(pm->irq, sci_level);
> -
> -/* schedule a timer interruption if needed */
> -acpi_pm_tmr_update(&pm->acpi_regs,
> -   (pm->acpi_regs.pm1.evt.en & ACPI_BITMASK_TIMER_ENABLE) &&
> -   !(pm1a_sts & ACPI_BITMASK_TIMER_STATUS));
> -}
> -
>  static void ich9_pm_update_sci_fn(ACPIREGS *regs)
>  {
>  ICH9LPCPMRegs *pm = container_of(regs, ICH9LPCPMRegs, acpi_regs);
> -pm_update_sci(pm);
> +acpi_update_sci(&pm->acpi_regs, pm->irq, 0);
>  }
>  
>  static uint64_t ich9_gpe_readb(void *opaque, hwaddr addr, unsigned width)
> @@ -193,7 +174,7 @@ static void pm_reset(void *opaque)
>  pm->smi_en |= ICH9_PMIO_SMI_EN_APMC_EN;
>  }
>  
> -pm_update_sci(pm);
> +acpi_update_sci(&pm->acpi_regs, pm->irq, 0);
>  }
>  
>  static void pm_powerdown_req(Notifier *n, void *opaque)
> diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> index 0be385e..b6dfa71 100644
> --- a/hw/acpi/piix4.c
> +++ b/hw/acpi/piix4.c
> @@ -117,29 +117,11 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion *parent,
>  #define ACPI_ENABLE 0xf1
>  #define ACPI_DISABLE 0xf0
>  
> -static void pm_update_sci(PIIX4PMState *s)
> -{
> -int sci_level, pmsts;
> -
> -pmsts = acpi_pm1_evt_get_sts(&s->ar);
> -sci_level = (((pmsts & s->ar.pm1.evt.en) &
> -  (ACPI_BITMASK_RT_CLOCK_ENABLE |
> -   ACPI_BITMASK_POWER_BUTTON_ENABLE |
> -   ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
> -   ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
> -(((s->ar.gpe.sts[0] & s->ar.gpe.en[0]) &
> -  (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0);
> -
> -qemu_set_irq(s->irq, sci_level);
> -/* schedule a timer interruption if needed */
> -acpi_pm_tmr_update(&s->ar, (s->ar.pm1.evt.en & ACPI_BITMASK_TIMER_ENABLE) &&
> -   !(pmsts & ACPI_BITMASK_TIMER_STATUS));
> -}
> -
>  static void pm_tmr_timer(ACPIREGS *ar)
>  {
>  PIIX4PMState *s = container_of(ar, PIIX4PMState, ar);
> -pm_update_sci(s);
> +acpi_update_sci(&s->ar, s->irq, PIIX4_PCI_HOTPLUG_STATUS |
> +PIIX4_CPU_HOTPLUG_STATUS);
>  }
>  
>  static void apm_ctrl_changed(uint32_t val, void *arg)
> @@ -429,7 +411,8 @@ static int piix4_acpi_pci_hotplug(DeviceState *qdev, PCIDevice *dev,
>  }
>  s->ar.gpe.sts[0] |= PIIX4_PCI_HOTPLUG_STATUS;
>  
> -pm_update_sci(s);
> +acpi_update_sci(&s->ar, s->irq, PIIX4_PCI_HOTPLUG_STATUS |
> +PIIX4_CPU_HOTPLUG_STATUS);
>  return 0;
>  }
>  
> @@ -629,7 +612,8 @@ static void gpe_writeb(void *opaque, hwaddr addr, uint64_t val,
>  PIIX4PMState *s = opaque;
>  
>  acpi_gpe_ioport_writeb(&s->ar, addr, val);
>

[Qemu-devel] [PATCH v3 02/12] target-arm: A64: add set_pc cpu method

2013-12-05 Thread Peter Maydell

From: Alexander Graf 

When executing translation blocks we need to be able to recover
our program counter. Add a method to set it for AArch64 CPUs.
This covers user-mode, but for system mode emulation we will
need to check if the CPU is in an AArch32 execution state.

Signed-off-by: Alexander Graf 
Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 target-arm/cpu64.c |   11 +++
 1 file changed, 11 insertions(+)

diff --git a/target-arm/cpu64.c b/target-arm/cpu64.c
index 3e99c21..04ce879 100644
--- a/target-arm/cpu64.c
+++ b/target-arm/cpu64.c
@@ -68,11 +68,22 @@ static void aarch64_cpu_finalizefn(Object *obj)
 {
 }
 
+static void aarch64_cpu_set_pc(CPUState *cs, vaddr value)
+{
+ARMCPU *cpu = ARM_CPU(cs);
+/*
+ * TODO: this will need updating for system emulation,
+ * when the core may be in AArch32 mode.
+ */
+cpu->env.pc = value;
+}
+
 static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
 {
 CPUClass *cc = CPU_CLASS(oc);
 
 cc->dump_state = aarch64_cpu_dump_state;
+cc->set_pc = aarch64_cpu_set_pc;
 cc->gdb_read_register = aarch64_cpu_gdb_read_register;
 cc->gdb_write_register = aarch64_cpu_gdb_write_register;
 cc->gdb_num_core_regs = 34;
-- 
1.7.9.5

[Qemu-devel] [PATCH v3 01/12] target-arm: Split A64 from A32/T32 gen_intermediate_code_internal()

2013-12-05 Thread Peter Maydell

The A32/T32 gen_intermediate_code_internal() is complicated because it
has to deal with:
 * conditionally executed instructions
 * Thumb IT blocks
 * kernel helper page
 * M profile exception-exit special casing

None of these apply to A64, so putting the "this is A64 so
call the A64 decoder" check in the middle of the A32/T32
loop is confusing and means the A64 decoder's handling of
things like conditional jump and singlestepping has to take
account of the conditional-execution jumps the main loop
might emit.

Refactor the code to give A64 its own gen_intermediate_code_internal
function instead.

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 target-arm/translate-a64.c |  209 ++--
 target-arm/translate.c |   62 +
 target-arm/translate.h |   20 -
 3 files changed, 246 insertions(+), 45 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 932b601..a713137 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -28,6 +28,8 @@
 #include "translate.h"
 #include "qemu/host-utils.h"
 
+#include "exec/gen-icount.h"
+
 #include "helper.h"
 #define GEN_HELPER 1
 #include "helper.h"
@@ -106,7 +108,42 @@ static void gen_exception_insn(DisasContext *s, int offset, int excp)
 {
 gen_a64_set_pc_im(s->pc - offset);
 gen_exception(excp);
-s->is_jmp = DISAS_JUMP;
+s->is_jmp = DISAS_EXC;
+}
+
+static inline bool use_goto_tb(DisasContext *s, int n, uint64_t dest)
+{
+/* No direct tb linking with singlestep or deterministic io */
+if (s->singlestep_enabled || (s->tb->cflags & CF_LAST_IO)) {
+return false;
+}
+
+/* Only link tbs from inside the same guest page */
+if ((s->tb->pc & TARGET_PAGE_MASK) != (dest & TARGET_PAGE_MASK)) {
+return false;
+}
+
+return true;
+}
+
+static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
+{
+TranslationBlock *tb;
+
+tb = s->tb;
+if (use_goto_tb(s, n, dest)) {
+tcg_gen_goto_tb(n);
+gen_a64_set_pc_im(dest);
+tcg_gen_exit_tb((tcg_target_long)tb + n);
+s->is_jmp = DISAS_TB_JUMP;
+} else {
+gen_a64_set_pc_im(dest);
+if (s->singlestep_enabled) {
+gen_exception(EXCP_DEBUG);
+}
+tcg_gen_exit_tb(0);
+s->is_jmp = DISAS_JUMP;
+}
 }
 
 static void real_unallocated_encoding(DisasContext *s)
@@ -120,7 +157,7 @@ static void real_unallocated_encoding(DisasContext *s)
 real_unallocated_encoding(s); \
 } while (0)
 
-void disas_a64_insn(CPUARMState *env, DisasContext *s)
+static void disas_a64_insn(CPUARMState *env, DisasContext *s)
 {
 uint32_t insn;
 
@@ -133,9 +170,171 @@ void disas_a64_insn(CPUARMState *env, DisasContext *s)
 unallocated_encoding(s);
 break;
 }
+}
 
-if (unlikely(s->singlestep_enabled) && (s->is_jmp == DISAS_TB_JUMP)) {
-/* go through the main loop for single step */
-s->is_jmp = DISAS_JUMP;
+void gen_intermediate_code_internal_a64(ARMCPU *cpu,
+TranslationBlock *tb,
+bool search_pc)
+{
+CPUState *cs = CPU(cpu);
+CPUARMState *env = &cpu->env;
+DisasContext dc1, *dc = &dc1;
+CPUBreakpoint *bp;
+uint16_t *gen_opc_end;
+int j, lj;
+target_ulong pc_start;
+target_ulong next_page_start;
+int num_insns;
+int max_insns;
+
+pc_start = tb->pc;
+
+dc->tb = tb;
+
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
+
+dc->is_jmp = DISAS_NEXT;
+dc->pc = pc_start;
+dc->singlestep_enabled = cs->singlestep_enabled;
+dc->condjmp = 0;
+
+dc->aarch64 = 1;
+dc->thumb = 0;
+dc->bswap_code = 0;
+dc->condexec_mask = 0;
+dc->condexec_cond = 0;
+#if !defined(CONFIG_USER_ONLY)
+dc->user = 0;
+#endif
+dc->vfp_enabled = 0;
+dc->vec_len = 0;
+dc->vec_stride = 0;
+
+next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
+lj = -1;
+num_insns = 0;
+max_insns = tb->cflags & CF_COUNT_MASK;
+if (max_insns == 0) {
+max_insns = CF_COUNT_MASK;
+}
+
+gen_tb_start();
+
+tcg_clear_temp_count();
+
+do {
+if (unlikely(!QTAILQ_EMPTY(&env->breakpoints))) {
+QTAILQ_FOREACH(bp, &env->breakpoints, entry) {
+if (bp->pc == dc->pc) {
+gen_exception_insn(dc, 0, EXCP_DEBUG);
+/* Advance PC so that clearing the breakpoint will
+   invalidate this TB.  */
+dc->pc += 2;
+goto done_generating;
+}
+}
+}
+
+if (search_pc) {
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+if (lj < j) {
+lj++;
+while (lj < j) {
+tcg_ctx.gen_opc_instr_start[lj++] = 0;
+}
+}
+

[Qemu-devel] [PATCH v3 12/12] target-arm: A64: add support for compare and branch imm

2013-12-05 Thread Peter Maydell

From: Alexander Graf 

This patch adds emulation for the compare and branch insns,
CBZ and CBNZ.

Signed-off-by: Alexander Graf 
[claudio: adapted to new decoder,
  compare with immediate 0,
  introduce read_cpu_reg to get the 0 extension on (!sf)]
Signed-off-by: Claudio Fontana 
Signed-off-by: Peter Maydell 
---
 target-arm/translate-a64.c |   46 ++--
 1 file changed, 44 insertions(+), 2 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 213a98a..fdc3ed8 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -202,6 +202,25 @@ static TCGv_i64 cpu_reg(DisasContext *s, int reg)
 }
 }
 
+/* read a cpu register in 32bit/64bit mode. Returns a TCGv_i64
+ * representing the register contents. This TCGv is an auto-freed
+ * temporary so it need not be explicitly freed, and may be modified.
+ */
+static TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
+{
+TCGv_i64 v = new_tmp_a64(s);
+if (reg != 31) {
+if (sf) {
+tcg_gen_mov_i64(v, cpu_X[reg]);
+} else {
+tcg_gen_ext32u_i64(v, cpu_X[reg]);
+}
+} else {
+tcg_gen_movi_i64(v, 0);
+}
+return v;
+}
+
 /*
  * the instruction disassembly implemented here matches
  * the instruction encoding classifications in chapter 3 (C3)
@@ -227,10 +246,33 @@ static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
 gen_goto_tb(s, 0, addr);
 }
 
-/* Compare & branch (immediate) */
+/* C3.2.1 Compare & branch (immediate)
+ *   31  30         25  24  23                  5 4      0
+ * +----+-------------+----+---------------------+--------+
+ * | sf | 0 1 1 0 1 0 | op |         imm19       |   Rt   |
+ * +----+-------------+----+---------------------+--------+
+ */
 static void disas_comp_b_imm(DisasContext *s, uint32_t insn)
 {
-unsupported_encoding(s, insn);
+unsigned int sf, op, rt;
+uint64_t addr;
+int label_match;
+TCGv_i64 tcg_cmp;
+
+sf = extract32(insn, 31, 1);
+op = extract32(insn, 24, 1); /* 0: CBZ; 1: CBNZ */
+rt = extract32(insn, 0, 5);
+addr = s->pc + sextract32(insn, 5, 19) * 4 - 4;
+
+tcg_cmp = read_cpu_reg(s, rt, sf);
+label_match = gen_new_label();
+
+tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
+tcg_cmp, 0, label_match);
+
+gen_goto_tb(s, 0, s->pc);
+gen_set_label(label_match);
+gen_goto_tb(s, 1, addr);
 }
 
 /* C3.2.5 Test & branch (immediate)
-- 
1.7.9.5
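
As an aside, the field layout in the C3.2.1 comment can be exercised with a small standalone decoder. This is a sketch under stated assumptions: field(), sfield(), struct cb_insn and decode_cb() are made-up names standing in for helpers such as extract32()/sextract32(), and the target is computed from the address of the instruction itself (the patch's "- 4" suggests that s->pc has already been advanced past the insn at that point).

```c
#include <assert.h>
#include <stdint.h>

/* Extract an unsigned bit field (illustrative stand-in helper). */
static uint32_t field(uint32_t insn, int start, int len)
{
    assert(len > 0 && len < 32 && start >= 0 && start + len <= 32);
    return (insn >> start) & ((1u << len) - 1);
}

/* Extract a sign-extended bit field. */
static int32_t sfield(uint32_t insn, int start, int len)
{
    uint32_t v = field(insn, start, len);
    uint32_t sign = 1u << (len - 1);
    return (int32_t)(v ^ sign) - (int32_t)sign;
}

struct cb_insn {
    unsigned sf;      /* 1: 64-bit compare, 0: 32-bit compare */
    unsigned op;      /* 0: CBZ, 1: CBNZ */
    unsigned rt;      /* register to test */
    uint64_t target;  /* branch target address */
};

/* Decode a compare & branch (immediate) insn located at insn_addr. */
static struct cb_insn decode_cb(uint32_t insn, uint64_t insn_addr)
{
    struct cb_insn d;

    d.sf = field(insn, 31, 1);
    d.op = field(insn, 24, 1);
    d.rt = field(insn, 0, 5);
    /* imm19 is a signed word offset relative to this instruction. */
    d.target = insn_addr + (uint64_t)((int64_t)sfield(insn, 5, 19) * 4);
    return d;
}
```

For example, 0xB4000040 decodes as a 64-bit CBZ on register 0 with an offset of +8 bytes, and 0xB5FFFFE0 as a CBNZ with an offset of -4 bytes.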

[Qemu-devel] [PATCH v3 09/12] target-arm: A64: add support for BR, BLR and RET insns

2013-12-05 Thread Peter Maydell

From: Alexander Graf 

Implement BR, BLR and RET. This is all of the 'unconditional
branch (register)' instruction category except for ERET
and DRPS (which are system mode only).

Signed-off-by: Alexander Graf 
[claudio: reimplemented on top of new decoder structure]
Signed-off-by: Claudio Fontana 
Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 target-arm/translate-a64.c |   43 +--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index bab890d..ce2f841 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -384,10 +384,49 @@ static void disas_exc(DisasContext *s, uint32_t insn)
 unsupported_encoding(s, insn);
 }
 
-/* Unconditional branch (register) */
+/* C3.2.7 Unconditional branch (register)
+ *  31           25 24   21 20   16 15   10 9    5 4     0
+ * +---------------+-------+-------+-------+------+-------+
+ * | 1 1 0 1 0 1 1 |  opc  |  op2  |  op3  |  Rn  |  op4  |
+ * +---------------+-------+-------+-------+------+-------+
+ */
 static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
 {
-unsupported_encoding(s, insn);
+unsigned int opc, op2, op3, rn, op4;
+
+opc = extract32(insn, 21, 4);
+op2 = extract32(insn, 16, 5);
+op3 = extract32(insn, 10, 6);
+rn = extract32(insn, 5, 5);
+op4 = extract32(insn, 0, 5);
+
+if (op4 != 0x0 || op3 != 0x0 || op2 != 0x1f) {
+unallocated_encoding(s);
+return;
+}
+
+switch (opc) {
+case 0: /* BR */
+case 2: /* RET */
+break;
+case 1: /* BLR */
+tcg_gen_movi_i64(cpu_reg(s, 30), s->pc);
+break;
+case 4: /* ERET */
+case 5: /* DRPS */
+if (rn != 0x1f) {
+unallocated_encoding(s);
+} else {
+unsupported_encoding(s, insn);
+}
+return;
+default:
+unallocated_encoding(s);
+return;
+}
+
+tcg_gen_mov_i64(cpu_pc, cpu_reg(s, rn));
+s->is_jmp = DISAS_JUMP;
 }
 
 /* C3.2 Branches, exception generating and system instructions */
-- 
1.7.9.5

[Qemu-devel] [PATCH v3 10/12] target-arm: A64: add support for conditional branches

2013-12-05 Thread Peter Maydell

From: Alexander Graf 

This patch adds emulation for the conditional branch (b.cond) instruction.

Signed-off-by: Alexander Graf 
[claudio: adapted to new decoder structure,
  reused arm infrastructure for checking the flags]
Signed-off-by: Claudio Fontana 
Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 target-arm/translate-a64.c |   29 +++--
 target-arm/translate.c |   14 +-
 target-arm/translate.h |2 ++
 3 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index ce2f841..a6a1973 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -239,10 +239,35 @@ static void disas_test_b_imm(DisasContext *s, uint32_t insn)
 unsupported_encoding(s, insn);
 }
 
-/* Conditional branch (immediate) */
+/* C3.2.2 / C5.6.19 Conditional branch (immediate)
+ *  31           25  24  23                  5   4  3    0
+ * +---------------+----+---------------------+----+------+
+ * | 0 1 0 1 0 1 0 | o1 |         imm19       | o0 | cond |
+ * +---------------+----+---------------------+----+------+
+ */
 static void disas_cond_b_imm(DisasContext *s, uint32_t insn)
 {
-unsupported_encoding(s, insn);
+unsigned int cond;
+uint64_t addr;
+
+if ((insn & (1 << 4)) || (insn & (1 << 24))) {
+unallocated_encoding(s);
+return;
+}
+addr = s->pc + sextract32(insn, 5, 19) * 4 - 4;
+cond = extract32(insn, 0, 4);
+
+if (cond < 0x0e) {
+/* genuinely conditional branches */
+int label_match = gen_new_label();
+arm_gen_test_cc(cond, label_match);
+gen_goto_tb(s, 0, s->pc);
+gen_set_label(label_match);
+gen_goto_tb(s, 1, addr);
+} else {
+/* 0xe and 0xf are both "always" conditions */
+gen_goto_tb(s, 0, addr);
+}
 }
 
 /* C5.6.68 HINT */
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 553bded..9e2d1eb 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -671,7 +671,11 @@ static void gen_thumb2_parallel_addsub(int op1, int op2, TCGv_i32 a, TCGv_i32 b)
 }
 #undef PAS_OP
 
-static void gen_test_cc(int cc, int label)
+/*
+ * generate a conditional branch based on ARM condition code cc.
+ * This is common between ARM and Aarch64 targets.
+ */
+void arm_gen_test_cc(int cc, int label)
 {
 TCGv_i32 tmp;
 int inv;
@@ -6903,7 +6907,7 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
 /* if not always execute, we generate a conditional jump to
next instruction */
 s->condlabel = gen_new_label();
-gen_test_cc(cond ^ 1, s->condlabel);
+arm_gen_test_cc(cond ^ 1, s->condlabel);
 s->condjmp = 1;
 }
 if ((insn & 0x0f90) == 0x0300) {
@@ -8910,7 +8914,7 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw
 op = (insn >> 22) & 0xf;
 /* Generate a conditional jump to next instruction.  */
 s->condlabel = gen_new_label();
-gen_test_cc(op ^ 1, s->condlabel);
+arm_gen_test_cc(op ^ 1, s->condlabel);
 s->condjmp = 1;
 
 /* offset[11:1] = insn[10:0] */
@@ -9267,7 +9271,7 @@ static void disas_thumb_insn(CPUARMState *env, DisasContext *s)
 cond = s->condexec_cond;
 if (cond != 0x0e) { /* Skip conditional when condition is AL. */
   s->condlabel = gen_new_label();
-  gen_test_cc(cond ^ 1, s->condlabel);
+  arm_gen_test_cc(cond ^ 1, s->condlabel);
   s->condjmp = 1;
 }
 }
@@ -9940,7 +9944,7 @@ static void disas_thumb_insn(CPUARMState *env, DisasContext *s)
 }
 /* generate a conditional jump to next instruction */
 s->condlabel = gen_new_label();
-gen_test_cc(cond ^ 1, s->condlabel);
+arm_gen_test_cc(cond ^ 1, s->condlabel);
 s->condjmp = 1;
 
 /* jump to the offset */
diff --git a/target-arm/translate.h b/target-arm/translate.h
index 23a45da..a6f6b3e 100644
--- a/target-arm/translate.h
+++ b/target-arm/translate.h
@@ -65,4 +65,6 @@ static inline void gen_a64_set_pc_im(uint64_t val)
 }
 #endif
 
+void arm_gen_test_cc(int cc, int label);
+
 #endif /* TARGET_ARM_TRANSLATE_H */
-- 
1.7.9.5
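
For readers unfamiliar with the condition field, here is a minimal sketch of how a 4-bit ARM condition code maps onto the NZCV flags. It is a plain C evaluator, not the TCG code that arm_gen_test_cc() emits; struct nzcv and cond_holds() are hypothetical names introduced for illustration. As in the patch, 0xe and 0xf are both treated as "always".

```c
#include <assert.h>
#include <stdbool.h>

/* Condition flags: negative, zero, carry, overflow. */
struct nzcv {
    bool n, z, c, v;
};

/* Return true if condition code cond (0..15) holds for flags f. */
static bool cond_holds(unsigned cond, struct nzcv f)
{
    bool r;

    assert(cond < 16);
    switch (cond >> 1) {
    case 0: r = f.z; break;                    /* EQ / NE */
    case 1: r = f.c; break;                    /* CS / CC */
    case 2: r = f.n; break;                    /* MI / PL */
    case 3: r = f.v; break;                    /* VS / VC */
    case 4: r = f.c && !f.z; break;            /* HI / LS */
    case 5: r = (f.n == f.v); break;           /* GE / LT */
    case 6: r = (f.n == f.v) && !f.z; break;   /* GT / LE */
    default: return true;                      /* 0xe and 0xf: always */
    }
    /* The low bit selects the inverted sense of the base condition. */
    return (cond & 1) ? !r : r;
}
```

The "cond ^ 1" seen at the A32/T32 call sites flips that low bit, which inverts the test so the generated branch skips the instruction when the condition fails.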

[Qemu-devel] [PATCH v3 04/12] target-arm: Support fp registers in gdb stub

2013-12-05 Thread Peter Maydell

Register the aarch64-fpu XML and implement the necessary
read/write handlers so we can support reading and writing
of FP registers in the gdb stub.

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 configure   |2 +-
 gdb-xml/aarch64-fpu.xml |   86 +++
 target-arm/helper.c |   48 +-
 3 files changed, 134 insertions(+), 2 deletions(-)
 create mode 100644 gdb-xml/aarch64-fpu.xml

diff --git a/configure b/configure
index 3317013..c9ad1de 100755
--- a/configure
+++ b/configure
@@ -4401,7 +4401,7 @@ case "$target_name" in
   aarch64)
 TARGET_BASE_ARCH=arm
 bflt="yes"
-gdb_xml_files="aarch64-core.xml"
+gdb_xml_files="aarch64-core.xml aarch64-fpu.xml"
   ;;
   cris)
   ;;
diff --git a/gdb-xml/aarch64-fpu.xml b/gdb-xml/aarch64-fpu.xml
new file mode 100644
index 000..997197e
--- /dev/null
+++ b/gdb-xml/aarch64-fpu.xml
@@ -0,0 +1,86 @@
+[XML register descriptions for the aarch64 FPU feature; markup lost in extraction]
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 263dbbf..73c97e8 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -65,6 +65,48 @@ static int vfp_gdb_set_reg(CPUARMState *env, uint8_t *buf, int reg)
 return 0;
 }
 
+static int aarch64_fpu_gdb_get_reg(CPUARMState *env, uint8_t *buf, int reg)
+{
+switch (reg) {
+case 0 ... 31:
+/* 128 bit FP register */
+stfq_le_p(buf, env->vfp.regs[reg * 2]);
+stfq_le_p(buf + 8, env->vfp.regs[reg * 2 + 1]);
+return 16;
+case 32:
+/* FPSR */
+stl_p(buf, vfp_get_fpsr(env));
+return 4;
+case 33:
+/* FPCR */
+stl_p(buf, vfp_get_fpcr(env));
+return 4;
+default:
+return 0;
+}
+}
+
+static int aarch64_fpu_gdb_set_reg(CPUARMState *env, uint8_t *buf, int reg)
+{
+switch (reg) {
+case 0 ... 31:
+/* 128 bit FP register */
+env->vfp.regs[reg * 2] = ldfq_le_p(buf);
+env->vfp.regs[reg * 2 + 1] = ldfq_le_p(buf + 8);
+return 16;
+case 32:
+/* FPSR */
+vfp_set_fpsr(env, ldl_p(buf));
+return 4;
+case 33:
+/* FPCR */
+vfp_set_fpcr(env, ldl_p(buf));
+return 4;
+default:
+return 0;
+}
+}
+
 static int raw_read(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t *value)
 {
@@ -1785,7 +1827,11 @@ void arm_cpu_register_gdb_regs_for_features(ARMCPU *cpu)
 CPUState *cs = CPU(cpu);
 CPUARMState *env = &cpu->env;
 
-if (arm_feature(env, ARM_FEATURE_NEON)) {
+if (arm_feature(env, ARM_FEATURE_AARCH64)) {
+gdb_register_coprocessor(cs, aarch64_fpu_gdb_get_reg,
+ aarch64_fpu_gdb_set_reg,
+ 34, "aarch64-fpu.xml", 0);
+} else if (arm_feature(env, ARM_FEATURE_NEON)) {
 gdb_register_coprocessor(cs, vfp_gdb_get_reg, vfp_gdb_set_reg,
  51, "arm-neon.xml", 0);
 } else if (arm_feature(env, ARM_FEATURE_VFP3)) {
-- 
1.7.9.5
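
To illustrate the buffer layout the read handler produces for registers 0..31, here is a small standalone sketch. It assumes, as the patch's use of regs[reg * 2] and regs[reg * 2 + 1] suggests, that each 128-bit register is kept as a low and a high 64-bit half; put_le64() and put_q_reg() are hypothetical names, not QEMU helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 64-bit value into buf in little-endian byte order. */
static void put_le64(uint8_t *buf, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        buf[i] = (uint8_t)(v >> (8 * i));
    }
}

/*
 * Write one 128-bit register, given as low and high halves, and
 * return the number of bytes written, as the gdb callbacks do.
 */
static int put_q_reg(uint8_t *buf, uint64_t lo, uint64_t hi)
{
    assert(buf);
    put_le64(buf, lo);        /* bytes 0..7: low half */
    put_le64(buf + 8, hi);    /* bytes 8..15: high half */
    return 16;
}
```

The least significant byte of the low half lands in buf[0] and the most significant byte of the high half in buf[15].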

[Qemu-devel] [PATCH v3 00/12] target-arm: A64 decoder, foundation plus branches

2013-12-05 Thread Peter Maydell

Round three of the first-chunk of A64 decoder work, updated
following code review. Only patches 8 and 12 have changed
(and RTH, those are the only two still waiting for your review :-))

Contents:
 * the new decoder skeleton,
 * gdbstub support for FP registers
 * a patch from me which gives the A64 decoder its own
   gen_intermediate_code_internal() loop for simplicity
 * the branch related patches from Alex's series, inserted into
   the new decoder skeleton

These patches sit on top of the v8 KVM control patchset I posted
last week. You can find a git tree with them here:
 git://git.linaro.org/people/pmaydell/qemu-arm.git a64-first-set
web UI:
 https://git.linaro.org/gitweb?p=people/pmaydell/qemu-arm.git;a=shortlog;h=refs/heads/a64-first-set

Changes v1->v2:
 * fixed a non-prettified insn pattern format in a comment
 * flip order of goto_tbs in cond-branch, test&branch, cmp&branch
 * read_cpu_reg() now returns a (trashable) TCGv_i64 rather than
   requiring one to be passed in
Changes v2->v3:
 * provide and use new_tmp_a64() as well as new_tmp_a64_zero()
 * mark the autofreed temp array as invalid if building with TCG debug

thanks
-- PMM

Alexander Graf (7):
  target-arm: A64: add set_pc cpu method
  target-arm: A64: add stubs for a64 specific helpers
  target-arm: A64: add support for B and BL insns
  target-arm: A64: add support for BR, BLR and RET insns
  target-arm: A64: add support for conditional branches
  target-arm: A64: add support for 'test and branch' imm
  target-arm: A64: add support for compare and branch imm

Claudio Fontana (2):
  target-arm: A64: provide skeleton for a64 insn decoding
  target-arm: A64: expand decoding skeleton for system instructions

Peter Maydell (3):
  target-arm: Split A64 from A32/T32 gen_intermediate_code_internal()
  target-arm: A64: provide functions for accessing FPCR and FPSR
  target-arm: Support fp registers in gdb stub

 configure  |2 +-
 gdb-xml/aarch64-fpu.xml|   86 +
 target-arm/Makefile.objs   |2 +-
 target-arm/cpu.h   |   28 ++
 target-arm/cpu64.c |   11 +
 target-arm/helper-a64.c|   25 ++
 target-arm/helper-a64.h|   18 +
 target-arm/helper.c|   48 ++-
 target-arm/helper.h|4 +
 target-arm/translate-a64.c |  895 +++-
 target-arm/translate.c |   76 ++--
 target-arm/translate.h |   25 +-
 12 files changed, 1159 insertions(+), 61 deletions(-)
 create mode 100644 gdb-xml/aarch64-fpu.xml
 create mode 100644 target-arm/helper-a64.c
 create mode 100644 target-arm/helper-a64.h

-- 
1.7.9.5

[Qemu-devel] [PATCH v3 07/12] target-arm: A64: expand decoding skeleton for system instructions

2013-12-05 Thread Peter Maydell

From: Claudio Fontana 

Decode the various kinds of system instructions:
 hints (HINT), which include NOP, YIELD, WFE, WFI, SEV, SEVL
 sync instructions, which include CLREX, DSB, DMB, ISB
 msr_i, which move immediate to processor state field
 sys, which include all SYS and SYSL instructions
 msr, which move from a gp register to a system register
 mrs, which move from a system register to a gp register

Provide implementations where they are trivial nops.

Signed-off-by: Claudio Fontana 
Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 target-arm/translate-a64.c |  131 +++-
 1 file changed, 129 insertions(+), 2 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 8e16cb1..1e2b371 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -190,12 +190,139 @@ static void disas_cond_b_imm(DisasContext *s, uint32_t insn)
 unsupported_encoding(s, insn);
 }
 
-/* System */
-static void disas_system(DisasContext *s, uint32_t insn)
+/* C5.6.68 HINT */
+static void handle_hint(DisasContext *s, uint32_t insn,
+unsigned int op1, unsigned int op2, unsigned int crm)
+{
+unsigned int selector = crm << 3 | op2;
+
+if (op1 != 3) {
+unallocated_encoding(s);
+return;
+}
+
+switch (selector) {
+case 0: /* NOP */
+return;
+case 1: /* YIELD */
+case 2: /* WFE */
+case 3: /* WFI */
+case 4: /* SEV */
+case 5: /* SEVL */
+/* we treat all as NOP at least for now */
+return;
+default:
+/* default specified as NOP equivalent */
+return;
+}
+}
+
+/* CLREX, DSB, DMB, ISB */
+static void handle_sync(DisasContext *s, uint32_t insn,
+unsigned int op1, unsigned int op2, unsigned int crm)
+{
+if (op1 != 3) {
+unallocated_encoding(s);
+return;
+}
+
+switch (op2) {
+case 2: /* CLREX */
+unsupported_encoding(s, insn);
+return;
+case 4: /* DSB */
+case 5: /* DMB */
+case 6: /* ISB */
+/* We don't emulate caches so barriers are no-ops */
+return;
+default:
+unallocated_encoding(s);
+return;
+}
+}
+
+/* C5.6.130 MSR (immediate) - move immediate to processor state field */
+static void handle_msr_i(DisasContext *s, uint32_t insn,
+ unsigned int op1, unsigned int op2, unsigned int crm)
 {
 unsupported_encoding(s, insn);
 }
 
+/* C5.6.204 SYS */
+static void handle_sys(DisasContext *s, uint32_t insn, unsigned int l,
+   unsigned int op1, unsigned int op2,
+   unsigned int crn, unsigned int crm, unsigned int rt)
+{
+unsupported_encoding(s, insn);
+}
+
+/* C5.6.129 MRS - move from system register */
+static void handle_mrs(DisasContext *s, uint32_t insn, unsigned int op0,
+   unsigned int op1, unsigned int op2,
+   unsigned int crn, unsigned int crm, unsigned int rt)
+{
+unsupported_encoding(s, insn);
+}
+
+/* C5.6.131 MSR (register) - move to system register */
+static void handle_msr(DisasContext *s, uint32_t insn, unsigned int op0,
+   unsigned int op1, unsigned int op2,
+   unsigned int crn, unsigned int crm, unsigned int rt)
+{
+unsupported_encoding(s, insn);
+}
+
+/* C3.2.4 System
+ *  31                 22 21  20 19 18 16 15   12 11    8 7   5 4    0
+ * +---------------------+---+-----+-----+-------+-------+-----+------+
+ * | 1 1 0 1 0 1 0 1 0 0 | L | op0 | op1 |  CRn  |  CRm  | op2 |  Rt  |
+ * +---------------------+---+-----+-----+-------+-------+-----+------+
+ */
+static void disas_system(DisasContext *s, uint32_t insn)
+{
+unsigned int l, op0, op1, crn, crm, op2, rt;
+l = extract32(insn, 21, 1);
+op0 = extract32(insn, 19, 2);
+op1 = extract32(insn, 16, 3);
+crn = extract32(insn, 12, 4);
+crm = extract32(insn, 8, 4);
+op2 = extract32(insn, 5, 3);
+rt = extract32(insn, 0, 5);
+
+if (op0 == 0) {
+if (l || rt != 31) {
+unallocated_encoding(s);
+return;
+}
+switch (crn) {
+case 2: /* C5.6.68 HINT */
+handle_hint(s, insn, op1, op2, crm);
+break;
+case 3: /* CLREX, DSB, DMB, ISB */
+handle_sync(s, insn, op1, op2, crm);
+break;
+case 4: /* C5.6.130 MSR (immediate) */
+handle_msr_i(s, insn, op1, op2, crm);
+break;
+default:
+unallocated_encoding(s);
+break;
+}
+return;
+}
+
+if (op0 == 1) {
+/* C5.6.204 SYS */
+handle_sys(s, insn, l, op1, op2, crn, crm, rt);
+} else if (l) { /* op0 > 1 */
+/* C5.6.129 MRS - move from system register */
+handle_mrs(s, insn, op0, op1, op2, crn, crm, rt);
+} else {
+/* C5.6.131 MSR (register) - move to system registe

[Qemu-devel] [PATCH v3 06/12] target-arm: A64: provide skeleton for a64 insn decoding

2013-12-05 Thread Peter Maydell

From: Claudio Fontana 

Provide a skeleton for a64 instruction decoding in translate-a64.c,
by dividing instructions into the classes defined by the
ARM Architecture Reference Manual(DDI0487A_a) section C3.

Signed-off-by: Claudio Fontana 
Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 target-arm/translate-a64.c |  370 +++-
 1 file changed, 362 insertions(+), 8 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index a713137..8e16cb1 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -146,17 +146,348 @@ static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
 }
 }
 
-static void real_unallocated_encoding(DisasContext *s)
+static void unallocated_encoding(DisasContext *s)
 {
-fprintf(stderr, "Unknown instruction: %#x\n", s->insn);
 gen_exception_insn(s, 4, EXCP_UDEF);
 }
 
-#define unallocated_encoding(s) do { \
-fprintf(stderr, "unallocated encoding at line: %d\n", __LINE__); \
-real_unallocated_encoding(s); \
-} while (0)
+#define unsupported_encoding(s, insn)\
+do { \
+qemu_log_mask(LOG_UNIMP, \
+  "%s:%d: unsupported instruction encoding 0x%08x "  \
+  "at pc=%016" PRIx64 "\n",  \
+  __FILE__, __LINE__, insn, s->pc - 4);  \
+unallocated_encoding(s); \
+} while (0);
 
+/*
+ * the instruction disassembly implemented here matches
+ * the instruction encoding classifications in chapter 3 (C3)
+ * of the ARM Architecture Reference Manual (DDI0487A_a)
+ */
+
+/* Unconditional branch (immediate) */
+static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* Compare & branch (immediate) */
+static void disas_comp_b_imm(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* Test & branch (immediate) */
+static void disas_test_b_imm(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* Conditional branch (immediate) */
+static void disas_cond_b_imm(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* System */
+static void disas_system(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* Exception generation */
+static void disas_exc(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* Unconditional branch (register) */
+static void disas_uncond_b_reg(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* C3.2 Branches, exception generating and system instructions */
+static void disas_b_exc_sys(DisasContext *s, uint32_t insn)
+{
+switch (extract32(insn, 25, 7)) {
+case 0x0a: case 0x0b:
+case 0x4a: case 0x4b: /* Unconditional branch (immediate) */
+disas_uncond_b_imm(s, insn);
+break;
+case 0x1a: case 0x5a: /* Compare & branch (immediate) */
+disas_comp_b_imm(s, insn);
+break;
+case 0x1b: case 0x5b: /* Test & branch (immediate) */
+disas_test_b_imm(s, insn);
+break;
+case 0x2a: /* Conditional branch (immediate) */
+disas_cond_b_imm(s, insn);
+break;
+case 0x6a: /* Exception generation / System */
+if (insn & (1 << 24)) {
+disas_system(s, insn);
+} else {
+disas_exc(s, insn);
+}
+break;
+case 0x6b: /* Unconditional branch (register) */
+disas_uncond_b_reg(s, insn);
+break;
+default:
+unallocated_encoding(s);
+break;
+}
+}
+
+/* Load/store exclusive */
+static void disas_ldst_excl(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* Load register (literal) */
+static void disas_ld_lit(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* Load/store pair (all forms) */
+static void disas_ldst_pair(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* Load/store register (all forms) */
+static void disas_ldst_reg(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* AdvSIMD load/store multiple structures */
+static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* AdvSIMD load/store single structure */
+static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
+{
+unsupported_encoding(s, insn);
+}
+
+/* C3.3 Loads and stores */
+static void disas_ldst(DisasContext *s, uint32_t insn)
+{
+switch (extract32(insn, 24, 6)) {
+case 0x08: /* Load/store exclusive */
+disas_ldst_excl(s, insn);
+break;
+case 0x18: case 0x1c: /* Load register (literal) */
+disas_ld_lit(s, i

Re: [Qemu-devel] [PATCH 1/3] scsi-disk: close drive on START_STOP

2013-12-05 Thread Alexey Kardashevskiy

On 12/05/2013 11:29 PM, Markus Armbruster wrote:
> Markus Armbruster  writes:
> 
>> Paolo Bonzini  writes:
>>
>>> Il 04/12/2013 05:55, Alexey Kardashevskiy ha scritto:
>>>> Normally the user is expected to eject DVD if it is not locked by
>>>> the guest. eject_device() makes few checks and calls bdrv_close()
>>>> if DVD is not in use.
>>>>
>>>> However it is still possible to eject DVD even if it is in use.
>>>> For that, QEMU sets "eject requested" flag, the guest reads it, issues
>>>> ALLOW_MEDIUM_REMOVAL(enable=1) and START_STOP(start=0). But in this case,
>>>> bdrv_close() is not called anywhere so it remains "inserted" in QEMU's
>>>> terms.
>>>
>>> This is expected behavior, and matches what IDE does.
>>>
>>> Markus, can you confirm?
>>
>> Confirmed.  See commit 4be9762.
>>
>> Alexey, monitor commands eject does two things: it first opens the tray,
>> and if that works, it removes the medium.
>>
>> If the tray is locked closed, it tells the device model that eject was
>> requested.  Works just like the physical eject button.
>>
>> With -f, it then rips out the medium.  This is similar to opening the
>> tray with a unbent paperclip.  Let's ignore this case.
>>
>> The scsi-cd device model tells the guest about the eject request.  A
>> well-behaved guest will then command the device to unlock and open the
>> tray.
>>
>> The guest uses the same commands on behalf of its applications,
>> e.g. /usr/bin/eject.
>>
>> Your patch changes behavior of "eject /dev/sr0 && eject -t /dev/sr0":
>> you no longer get the same medium back.  You normally do with real
>> hardware.
> 
> Alexey asked me for details on IRC.
> 
> $ qemu -nodefaults -monitor stdio -S -machine accel=kvm -m 512 -display 
> vnc=:0 -device cirrus-vga -drive if=none,id=disk,file=test.qcow2 -device 
> ide-hd,drive=disk,bus=ide.0 -drive if=none,id=cd,file=f16.iso -device 
> ide-cd,drive=cd,bus=ide.1
> QEMU 1.7.50 monitor - type 'help' for more information
> (qemu) info block cd
> 
> cd: f16.iso (raw)
> Removable device: not locked, tray closed
> 
> Boot the guest (Fedora 16, no X)
> 
> (qemu) c
> 
> The guest locked the tray:
> 
> (qemu) info block cd
> 
> cd: f16.iso (raw)
> Removable device: locked, tray closed
> 
> In the guest, log in as root on the console, and run
> 
> # eject /dev/sr0
> 
> Makes the guest open the tray:
> 
> (qemu) info block cd
> 
> cd: f16.iso (raw)
> Removable device: locked, tray open
> 
> In the guest, run
> 
> # eject -t /dev/sr0
> 
> Makes the guest close the tray:
> 
> (qemu) info block cd
> 
> cd: f16.iso (raw)
> Removable device: locked, tray closed
> 
> Verify the guest can access the medium:
> 
> # mount -r /dev/sr0 /mnt
> 
>> The somewhat unfortunate consequence is that monitor command eject can
>> only remove the medium when the tray is not locked.

Thanks!

Just out of curiosity. A lot (in fact, all around me) dvd drives do not
support trayclose as they are in laptops or servers (which use the same
laptop models). I cannot even verify how this "eject -t" exactly works - no
hardware around me. And even if I could find it, I could easily take the
disc off the tray in that short period of time between tray is open and
tray is closed but we still absolutely want "eject" + "eject -t" to work as
you described.

Why exactly? :) Only because change of behavior is bad? Just asking. Thanks.



-- 
Alexey

[Qemu-devel] [PATCH v3 05/12] target-arm: A64: add stubs for a64 specific helpers

2013-12-05 Thread Peter Maydell

From: Alexander Graf 

We will need helpers that only make sense with AArch64. Add
helper-a64.{c,h} files as stubs that we can fill with these
helpers in the following patches.

Signed-off-by: Alexander Graf 
Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 target-arm/Makefile.objs |2 +-
 target-arm/helper-a64.c  |   25 +
 target-arm/helper-a64.h  |   18 ++
 target-arm/helper.h  |4 
 4 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100644 target-arm/helper-a64.c
 create mode 100644 target-arm/helper-a64.h

diff --git a/target-arm/Makefile.objs b/target-arm/Makefile.objs
index 5493a4c..a5914e9 100644
--- a/target-arm/Makefile.objs
+++ b/target-arm/Makefile.objs
@@ -5,6 +5,6 @@ obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
 obj-y += translate.o op_helper.o helper.o cpu.o
 obj-y += neon_helper.o iwmmxt_helper.o
 obj-y += gdbstub.o
-obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o gdbstub64.o
+obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o helper-a64.o gdbstub64.o
 obj-$(call land,$(CONFIG_KVM),$(call lnot,$(TARGET_AARCH64))) += kvm32.o
 obj-$(call land,$(CONFIG_KVM),$(TARGET_AARCH64)) += kvm64.o
diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
new file mode 100644
index 0000000..adb8428
--- /dev/null
+++ b/target-arm/helper-a64.c
@@ -0,0 +1,25 @@
+/*
+ *  AArch64 specific helpers
+ *
+ *  Copyright (c) 2013 Alexander Graf 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "cpu.h"
+#include "exec/gdbstub.h"
+#include "helper.h"
+#include "qemu/host-utils.h"
+#include "sysemu/sysemu.h"
+#include "qemu/bitops.h"
diff --git a/target-arm/helper-a64.h b/target-arm/helper-a64.h
new file mode 100644
index 0000000..dd28306
--- /dev/null
+++ b/target-arm/helper-a64.h
@@ -0,0 +1,18 @@
+/*
+ *  AArch64 specific helper definitions
+ *
+ *  Copyright (c) 2013 Alexander Graf 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
diff --git a/target-arm/helper.h b/target-arm/helper.h
index cac9564..f3727a5 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -458,4 +458,8 @@ DEF_HELPER_3(neon_qzip8, void, env, i32, i32)
 DEF_HELPER_3(neon_qzip16, void, env, i32, i32)
 DEF_HELPER_3(neon_qzip32, void, env, i32, i32)
 
+#ifdef TARGET_AARCH64
+#include "helper-a64.h"
+#endif
+
 #include "exec/def-helper.h"
-- 
1.7.9.5

[Qemu-devel] [PATCH v3 03/12] target-arm: A64: provide functions for accessing FPCR and FPSR

2013-12-05 Thread Peter Maydell

The information which AArch32 holds in the FPSCR is split for
AArch64 into two logically distinct registers, FPSR and FPCR.
Since they are carefully arranged to use non-overlapping bits,
we leave the underlying state in the same place, and provide
accessor functions which just update the appropriate bits
via vfp_get_fpscr() and vfp_set_fpscr().

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 target-arm/cpu.h |   28 
 1 file changed, 28 insertions(+)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index ff7aac5..4807354 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -445,6 +445,34 @@ static inline void xpsr_write(CPUARMState *env, uint32_t 
val, uint32_t mask)
 uint32_t vfp_get_fpscr(CPUARMState *env);
 void vfp_set_fpscr(CPUARMState *env, uint32_t val);
 
+/* For A64 the FPSCR is split into two logically distinct registers,
+ * FPCR and FPSR. However since they still use non-overlapping bits
+ * we store the underlying state in fpscr and just mask on read/write.
+ */
+#define FPSR_MASK 0xf800009f
+#define FPCR_MASK 0x07f79f00
+static inline uint32_t vfp_get_fpsr(CPUARMState *env)
+{
+    return vfp_get_fpscr(env) & FPSR_MASK;
+}
+
+static inline void vfp_set_fpsr(CPUARMState *env, uint32_t val)
+{
+    uint32_t new_fpscr = (vfp_get_fpscr(env) & ~FPSR_MASK) | (val & FPSR_MASK);
+    vfp_set_fpscr(env, new_fpscr);
+}
+
+static inline uint32_t vfp_get_fpcr(CPUARMState *env)
+{
+    return vfp_get_fpscr(env) & FPCR_MASK;
+}
+
+static inline void vfp_set_fpcr(CPUARMState *env, uint32_t val)
+{
+    uint32_t new_fpscr = (vfp_get_fpscr(env) & ~FPCR_MASK) | (val & FPCR_MASK);
+    vfp_set_fpscr(env, new_fpscr);
+}
+
 enum arm_cpu_mode {
   ARM_CPU_MODE_USR = 0x10,
   ARM_CPU_MODE_FIQ = 0x11,
-- 
1.7.9.5
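
As a rough illustration of the masking scheme in the patch above, the following standalone sketch models the shared FPSCR word as a plain uint32_t. The `fpscr` global and the `get_*`/`set_*` names are stand-ins for CPUARMState and vfp_get_fpscr()/vfp_set_fpscr(); they are assumptions made for this example, not QEMU code. The mask values are the ones from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define FPSR_MASK 0xf800009fu   /* status bits: flags and cumulative exceptions */
#define FPCR_MASK 0x07f79f00u   /* control bits: rounding mode, trap enables, ... */

/* Stand-in for the single underlying FPSCR state word (hypothetical). */
static uint32_t fpscr;

/* The scheme only works if the two views never share a bit. */
static void check_masks(void)
{
    assert((FPSR_MASK & FPCR_MASK) == 0);
}

static uint32_t get_fpsr(void)
{
    return fpscr & FPSR_MASK;
}

static void set_fpsr(uint32_t val)
{
    /* Keep every non-FPSR bit, replace only the FPSR bits. */
    fpscr = (fpscr & ~FPSR_MASK) | (val & FPSR_MASK);
}

static uint32_t get_fpcr(void)
{
    return fpscr & FPCR_MASK;
}

static void set_fpcr(uint32_t val)
{
    /* Keep every non-FPCR bit, replace only the FPCR bits. */
    fpscr = (fpscr & ~FPCR_MASK) | (val & FPCR_MASK);
}
```

Because the masks are disjoint, writing one view cannot disturb the other, which is why no separate storage is needed.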

[Qemu-devel] [PATCH v3 11/12] target-arm: A64: add support for 'test and branch' imm

2013-12-05 Thread Peter Maydell

From: Alexander Graf 

This patch adds emulation for the test and branch insns,
TBZ and TBNZ.

Signed-off-by: Alexander Graf 
[claudio:
  adapted for new decoder
  always compare with 0
  remove a TCG temporary
]
Signed-off-by: Claudio Fontana 
Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 target-arm/translate-a64.c |   27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index a6a1973..213a98a 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -233,10 +233,33 @@ static void disas_comp_b_imm(DisasContext *s, uint32_t 
insn)
 unsupported_encoding(s, insn);
 }
 
-/* Test & branch (immediate) */
+/* C3.2.5 Test & branch (immediate)
+ *   31  30         25  24  23   19 18          5 4    0
+ * +----+-------------+----+-------+-------------+------+
+ * | b5 | 0 1 1 0 1 1 | op |  b40  |    imm14    |  Rt  |
+ * +----+-------------+----+-------+-------------+------+
+ */
 static void disas_test_b_imm(DisasContext *s, uint32_t insn)
 {
-unsupported_encoding(s, insn);
+    unsigned int bit_pos, op, rt;
+    uint64_t addr;
+    int label_match;
+    TCGv_i64 tcg_cmp;
+
+    bit_pos = (extract32(insn, 31, 1) << 5) | extract32(insn, 19, 5);
+    op = extract32(insn, 24, 1); /* 0: TBZ; 1: TBNZ */
+    addr = s->pc + sextract32(insn, 5, 14) * 4 - 4;
+    rt = extract32(insn, 0, 5);
+
+    tcg_cmp = tcg_temp_new_i64();
+    tcg_gen_andi_i64(tcg_cmp, cpu_reg(s, rt), (1ULL << bit_pos));
+    label_match = gen_new_label();
+    tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
+                        tcg_cmp, 0, label_match);
+    tcg_temp_free_i64(tcg_cmp);
+    gen_goto_tb(s, 0, s->pc);
+    gen_set_label(label_match);
+    gen_goto_tb(s, 1, addr);
 }
 
 /* C3.2.2 / C5.6.19 Conditional branch (immediate)
-- 
1.7.9.5
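
The field extraction in the TBZ/TBNZ patch above can be tried outside QEMU with the sketch below. It reimplements extract32()/sextract32() locally so the example is self-contained; `struct tb_fields` and `decode_test_branch()` are names invented for this illustration. Unlike the patch, where s->pc has already been advanced past the instruction (hence the "- 4"), the sketch takes `pc` as the address of the instruction itself.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for QEMU's extract32(): 'length' bits starting at 'start'. */
static uint32_t extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0u >> (32 - length));
}

/* Local stand-in for sextract32(): same field, sign-extended (length < 32). */
static int32_t sextract32(uint32_t value, int start, int length)
{
    uint32_t field, sign;

    assert(length < 32);
    field = extract32(value, start, length);
    sign = 1u << (length - 1);
    return (int32_t)(field ^ sign) - (int32_t)sign;
}

/* Hypothetical container for the decoded fields. */
struct tb_fields {
    unsigned bit_pos;   /* b5:b40, the bit number to test (0..63) */
    unsigned op;        /* 0: TBZ, 1: TBNZ */
    unsigned rt;        /* register to test */
    uint64_t target;    /* branch target address */
};

/* 'pc' is the address of the instruction being decoded. */
static struct tb_fields decode_test_branch(uint32_t insn, uint64_t pc)
{
    struct tb_fields f;
    int64_t offset = (int64_t)sextract32(insn, 5, 14) * 4;

    f.bit_pos = (extract32(insn, 31, 1) << 5) | extract32(insn, 19, 5);
    f.op = extract32(insn, 24, 1);
    f.rt = extract32(insn, 0, 5);
    f.target = pc + (uint64_t)offset;   /* unsigned wraparound is well defined */
    return f;
}
```

For example, an encoding with b5=1, b40=1, op=1, imm14=2 and Rt=3 decodes to TBNZ on bit 33 of X3 with a target eight bytes past the instruction.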

Re: [Qemu-devel] [PATCH 1/3] scsi-disk: close drive on START_STOP

2013-12-05 Thread Paolo Bonzini

Il 05/12/2013 13:42, Alexey Kardashevskiy ha scritto:
> Thanks!
> 
> Just out of curiosity. A lot (in fact, all around me) dvd drives do not
> support trayclose as they are in laptops or servers (which use the same
> laptop models). I cannot even verify how this "eject -t" exactly works - no
> hardware around me. And even if I could find it, I could easily take the
> disc off the tray in that short period of time between tray is open and
> tray is closed but we still absolutely want "eject" + "eject -t" to work as
> you described.

Taking the disc off the tray is equivalent to going to the monitor and
doing "eject -f cd".

I.e. programmatic actions leave the disc in (at least if you do not
consider laptops and servers; but the guest can detect whether the tray
can be auto-closed, and we tell it that it can).  Out-of-band user
actions force the disc out.

> Why exactly? :) Only because change of behavior is bad? Just asking. Thanks.

The only practical use of CDs is installation, and it took a long
time to get it to work with all sorts of guests.

Paolo
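
The eject semantics discussed in this thread can be summarised as a small state model. The sketch below is only an illustration of the behaviour described by Markus and Paolo; `struct cd_drive` and the function names are invented here and do not correspond to QEMU's block layer or device model code. Guest-side lock enforcement is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a CD drive as seen from the monitor. */
struct cd_drive {
    bool locked;           /* guest has locked the tray */
    bool tray_open;
    bool has_medium;
    bool eject_requested;  /* "eject button pressed", reported to the guest */
};

/* Monitor command "eject" (force == true models "eject -f"). */
static void monitor_eject(struct cd_drive *d, bool force)
{
    assert(d != NULL);
    if (d->locked && !d->tray_open) {
        /* Tray is locked closed: only tell the device model about it. */
        d->eject_requested = true;
        if (!force) {
            return;
        }
        /* With -f the medium is ripped out anyway (the paperclip case). */
    } else {
        d->tray_open = true;
    }
    d->has_medium = false;
}

/* Guest opening the tray, e.g. "eject /dev/sr0": the medium stays put. */
static void guest_open_tray(struct cd_drive *d)
{
    assert(d != NULL);
    d->tray_open = true;
}

/* Guest closing the tray, e.g. "eject -t /dev/sr0": same medium comes back. */
static void guest_close_tray(struct cd_drive *d)
{
    assert(d != NULL);
    d->tray_open = false;
}
```

In this model only the monitor ever removes the medium, which matches the point that programmatic tray movement by the guest leaves the disc in.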

[Qemu-devel] [PATCH v3 08/12] target-arm: A64: add support for B and BL insns

2013-12-05 Thread Peter Maydell

From: Alexander Graf 

Implement the B and BL instructions (PC relative branches and calls).

For convenience in managing TCG temporaries which might be generated
if a source register is the zero-register XZR, we provide a simple
mechanism for creating a new temp which is automatically freed at the
end of decode of the instruction.

Signed-off-by: Alexander Graf 
[claudio: renamed functions, adapted to new decoder layout]
Signed-off-by: Claudio Fontana 
Signed-off-by: Peter Maydell 
---
 target-arm/translate-a64.c |   64 ++--
 target-arm/translate.h |3 +++
 2 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 1e2b371..bab890d 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -160,16 +160,71 @@ static void unallocated_encoding(DisasContext *s)
 unallocated_encoding(s); \
 } while (0);
 
+static void init_tmp_a64_array(DisasContext *s)
+{
+    int i;
+#ifdef CONFIG_DEBUG_TCG
+    for (i = 0; i < ARRAY_SIZE(s->tmp_a64); i++) {
+        TCGV_UNUSED_I64(s->tmp_a64[i]);
+    }
+#endif
+    s->tmp_a64_count = 0;
+}
+
+static void free_tmp_a64(DisasContext *s)
+{
+    int i;
+    for (i = 0; i < s->tmp_a64_count; i++) {
+        tcg_temp_free_i64(s->tmp_a64[i]);
+    }
+    init_tmp_a64_array(s);
+}
+
+static TCGv_i64 new_tmp_a64(DisasContext *s)
+{
+    assert(s->tmp_a64_count < TMP_A64_MAX);
+    return s->tmp_a64[s->tmp_a64_count++] = tcg_temp_new_i64();
+}
+
+static TCGv_i64 new_tmp_a64_zero(DisasContext *s)
+{
+    TCGv_i64 t = new_tmp_a64(s);
+    tcg_gen_movi_i64(t, 0);
+    return t;
+}
+
+static TCGv_i64 cpu_reg(DisasContext *s, int reg)
+{
+    if (reg == 31) {
+        return new_tmp_a64_zero(s);
+    } else {
+        return cpu_X[reg];
+    }
+}
+
+
 /*
  * the instruction disassembly implemented here matches
  * the instruction encoding classifications in chapter 3 (C3)
  * of the ARM Architecture Reference Manual (DDI0487A_a)
  */
 
-/* Unconditional branch (immediate) */
+/* C3.2.7 Unconditional branch (immediate)
+ *   31  30       26 25                                  0
+ * +----+-----------+-------------------------------------+
+ * | op | 0 0 1 0 1 |                 imm26               |
+ * +----+-----------+-------------------------------------+
+ */
 static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
 {
-unsupported_encoding(s, insn);
+    uint64_t addr = s->pc + sextract32(insn, 0, 26) * 4 - 4;
+
+    if (insn & (1 << 31)) {
+        /* C5.6.26 BL Branch with link */
+        tcg_gen_movi_i64(cpu_reg(s, 30), s->pc);
+    }
+
+    /* C5.6.20 B Branch / C5.6.26 BL Branch with link */
+    gen_goto_tb(s, 0, addr);
 }
 
 /* Compare & branch (immediate) */
@@ -651,6 +706,9 @@ static void disas_a64_insn(CPUARMState *env, DisasContext 
*s)
 assert(FALSE); /* all 15 cases should be handled above */
 break;
 }
+
+/* if we allocated any temporaries, free them here */
+free_tmp_a64(s);
 }
 
 void gen_intermediate_code_internal_a64(ARMCPU *cpu,
@@ -691,6 +749,8 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
 dc->vec_len = 0;
 dc->vec_stride = 0;
 
+init_tmp_a64_array(dc);
+
 next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
 lj = -1;
 num_insns = 0;
diff --git a/target-arm/translate.h b/target-arm/translate.h
index 8789181..23a45da 100644
--- a/target-arm/translate.h
+++ b/target-arm/translate.h
@@ -24,6 +24,9 @@ typedef struct DisasContext {
 int vec_len;
 int vec_stride;
 int aarch64;
+#define TMP_A64_MAX 16
+int tmp_a64_count;
+TCGv_i64 tmp_a64[TMP_A64_MAX];
 } DisasContext;
 
 extern TCGv_ptr cpu_env;
-- 
1.7.9.5
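
The "automatically freed temporary" mechanism from the patch above can be sketched without TCG as follows. Heap-allocated longs stand in for TCG temporaries, and `struct ctx`, `pool_*`, `reg_ref()` and the `live_temps` counter are all names made up for this example; only the shape (bounded array, counter, free-everything at the end of each instruction) follows the patch.

```c
#include <assert.h>
#include <stdlib.h>

#define TMP_MAX 16

/* Hypothetical decode context holding the per-instruction temporaries. */
struct ctx {
    int tmp_count;
    long *tmp[TMP_MAX];
};

static int live_temps;   /* outstanding allocations, for leak checking */
static long regs[31];    /* stand-in for the general purpose registers X0..X30 */

static long *temp_new(void)
{
    long *t = calloc(1, sizeof(*t));

    if (t == NULL) {
        abort();
    }
    live_temps++;
    return t;
}

static void temp_free(long *t)
{
    live_temps--;
    free(t);
}

static void pool_init(struct ctx *s)
{
    s->tmp_count = 0;
}

/* Allocate a temporary and remember it so it can be freed later. */
static long *pool_new(struct ctx *s)
{
    assert(s->tmp_count < TMP_MAX);
    return s->tmp[s->tmp_count++] = temp_new();
}

/* Called once at the end of decoding each instruction. */
static void pool_free_all(struct ctx *s)
{
    int i;

    for (i = 0; i < s->tmp_count; i++) {
        temp_free(s->tmp[i]);
    }
    pool_init(s);
}

/* Register 31 reads as zero, so hand out a fresh zeroed temporary for it. */
static long *reg_ref(struct ctx *s, int reg)
{
    if (reg == 31) {
        long *t = pool_new(s);
        *t = 0;
        return t;
    }
    return &regs[reg];
}
```

The design choice is that callers of reg_ref() never need to know whether they received a real register or a temporary; the single pool_free_all() call at the end of the instruction cleans up either way.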

Re: [Qemu-devel] [PATCH for-1.7] seccomp: setting "-sandbox on" by default

2013-12-05 Thread Stefan Hajnoczi

On Wed, Dec 04, 2013 at 11:21:12AM -0200, Eduardo Otubo wrote:
> On 12/04/2013 07:39 AM, Stefan Hajnoczi wrote:
> >On Fri, Nov 22, 2013 at 11:00:24AM -0500, Paul Moore wrote:
> >>>Developers will only be happy with seccomp if it's easy and rewarding to
> >>>support/debug.
> >>
> >>Agreed.
> >>
> >>As a developer, how do you feel about the audit/syslog based approach I
> >>mentioned earlier?
> >
> >I used the commands you posted (I think that's what you mean).  They
> >produce useful output.
> >
> >The problem is that without an error message on stderr or from the
> >shell, no one will think "QEMU process dead and hung == check seccomp"
> >immediately.  It's frustrating to deal with a "silent" failure.
> 
> The process dies with a SIGKILL, and sig handling in Qemu is hard to
> implement due to dozen of external linked libraries that has their
> own signal masks and conflicts with seccomp. I've already tried this
> approach in the past (you can find in the list by searching for
> debug mode)

I now realize we may be talking past each other.  Dying with
SIGKILL/SIGSYS is perfectly reasonable and I would be happy with that
:-).

But I think there's a bug in seccomp: a multi-threaded process can be
left in a zombie state.  In my case the primary thread was killed by
seccomp but another thread was deadlocked on a futex.

The result is the process isn't quite dead yet.  The shell will not reap
it and we're stuck with a zombie.

I can reproduce it reliably when I run "qemu-system-x86_64 -sandbox on"
on Fedora 20 (qemu-system-x86-1.6.1-2).

Should seccomp use do_group_exit() for SIGKILL?

Stefan

Re: [Qemu-devel] [PATCH] target-i386: clear guest TSC on reset

2013-12-05 Thread Fernando Luis Vazquez Cao


(2013/12/05 18:28), Paolo Bonzini wrote:

> Il 05/12/2013 07:15, Fernando Luis Vázquez Cao ha scritto:
>> VCPU TSC is not cleared by a warm reset (*), which leaves many Linux
>> guests vulnerable to the overflow in cyc2ns_offset fixed by upstream
>> commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 ("sched/x86: Fix overflow
>> in cyc2ns_offset").
>>
>> To put it in a nutshell, if a Linux guest without the patch above applied
>> has been up more than 208 days and attempts a warm reset chances are that
>> the newly booted kernel will panic or hang.
>>
>> (*) Intel Xeon E5 processors show the same broken behavior due to
>>     the errata "TSC is Not Affected by Warm Reset" (Intel® Xeon®
>>     Processor E5 Family Specification Update - August 2013): "The
>>     TSC (Time Stamp Counter MSR 10H) should be cleared on
>>     reset. Due to this erratum the TSC is not affected by warm
>>     reset."
>>
>> Cc: sta...@vger.kernel.org
>> Cc: Will Auld 
>> Cc: Marcelo Tosatti 
>> Signed-off-by: Fernando Luis Vazquez Cao 
>
> I agree that the bug is in QEMU.  One small nit in your patch is that
> you should reset env->tsc_adjust and env->tsc in x86_cpu_reset.  This
> would already be pretty good.


Yes, that is certainly cleaner (I should try not to take shortcuts...).
I am attaching an updated patch (I apologize for not sending it inline - for
reasons better left untold I am writing this on a problematic email client :) ).




> However, a bigger problem is that env->tsc is a useless duplicate of
> "cpu_get_ticks() + env->tsc_adjust".  It would be nice to drop env->tsc
> completely except for migration backwards compatibility.  Thus you can:
>
> - fill in env->tsc as mentioned above from target-i386/machine.c's
> cpu_pre_save function.  This guarantees backwards compatibility.
>
> - add a function cpu_set_ticks(int64_t ticks) to cpus.c.  The function
> does nothing if use_icount is true, otherwise it needs to have (roughly)
> the opposite logic compared to cpu_get_ticks.  You then call this
> function from x86_cpu_reset instead of setting env->tsc.  You can
> similarly call this function from kvm_get_msrs.
>
> - add a function kvm_set_ticks(int64_t ticks) to kvm-all.c and
> kvm-stub.c.  For kvm-all.c it calls kvm_arch_set_ticks(CPUState *cpu,
> int64_t ticks) in target-*/kvm.c.  The kvm_arch_set_tsc() function has a
> dummy implementation for all architectures except x86.  For x86 it calls
> KVM_SET_MSRS passing "ticks + env->tsc_offset".
>
> - call kvm_set_ticks() from cpu_set_ticks() and cpu_enable_ticks()
>
> Can you do this?


Can you pick my original fix first? I can do what you suggest in a follow-up
patch.

Thanks,
Fernando
[PATCH v2] target-i386: clear guest TSC on reset

From: Fernando Luis Vazquez Cao 

VCPU TSC is not cleared by a warm reset (*), which leaves many Linux
guests vulnerable to the overflow in cyc2ns_offset fixed by upstream
commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 ("sched/x86: Fix overflow
in cyc2ns_offset").

To put it in a nutshell, if a Linux guest without the patch above applied
has been up more than 208 days and attempts a warm reset chances are that
the newly booted kernel will panic or hang.

(*) Intel Xeon E5 processors show the same broken behavior due to
the errata "TSC is Not Affected by Warm Reset" (Intel® Xeon®
Processor E5 Family Specification Update - August 2013): "The
TSC (Time Stamp Counter MSR 10H) should be cleared on
reset. Due to this erratum the TSC is not affected by warm
reset."

Cc: Will Auld 
Cc: Marcelo Tosatti 
Signed-off-by: Fernando Luis Vazquez Cao 
---

diff -urNp qemu-orig/target-i386/cpu.c qemu/target-i386/cpu.c
--- qemu-orig/target-i386/cpu.c 2013-11-28 07:02:45.0 +0900
+++ qemu/target-i386/cpu.c  2013-12-05 21:45:19.980156320 +0900
@@ -2446,6 +2446,9 @@ static void x86_cpu_reset(CPUState *s)
 cpu_breakpoint_remove_all(env, BP_CPU);
 cpu_watchpoint_remove_all(env, BP_CPU);
 
+env->tsc_adjust = 0;
+env->tsc = 0;
+
 #if !defined(CONFIG_USER_ONLY)
 /* We hard-wire the BSP to the first CPU. */
 if (s->cpu_index == 0) {
diff -urNp qemu-orig/target-i386/kvm.c qemu/target-i386/kvm.c
--- qemu-orig/target-i386/kvm.c 2013-11-28 07:02:45.0 +0900
+++ qemu/target-i386/kvm.c  2013-12-05 21:45:28.900200552 +0900
@@ -1139,22 +1139,20 @@ static int kvm_put_msrs(X86CPU *cpu, int
 kvm_msr_entry_set(&msrs[n++], MSR_LSTAR, env->lstar);
 }
 #endif
-if (level == KVM_PUT_FULL_STATE) {
+/*
+ * The following MSRs have side effects on the guest or are too heavy
+ * for normal writeback. Limit them to reset or full state updates.
+ */
+if (level >= KVM_PUT_RESET_STATE) {
 /*
  * KVM is yet unable to synchronize TSC values of multiple VCPUs on
  * writeback. Until this is fixed, we only write the offset to SMP
  * guests after migration, desynchronizing the VCPUs, but avoiding
  * huge jump-backs that would occur without any writeback at all.
  */
-if (smp_cpus == 1 || env->tsc != 0) {
+if (smp_cpus == 1 || e

Re: [Qemu-devel] [PATCH 0/9] vmstate code split + unit tests

2013-12-05 Thread Orit Wasserman


On 11/28/2013 04:01 PM, Eduardo Habkost wrote:

This series separates the QEMUFile and VMState code from savevm.c, and adds a
few unit tests to the VMState code.

Eduardo Habkost (9):
   qemu-file: Make a few functions non-static
   migration: Move QEMU_VM_* defines to migration/migration.h
   savevm: Convert all tabs to spaces
   savevm.c: Coding style fixes
   savevm.c: Coding style fix
   vmstate: Move VMState code to vmstate.c
   qemu-file: Move QEMUFile code to qemu-file.c
   savevm: Small comment about why timer QEMUFile/VMState code is in
 savevm.c
   tests: Some unit tests for vmstate.c

  Makefile.objs |2 +
  include/migration/migration.h |   11 +
  include/migration/qemu-file.h |4 +
  qemu-file.c   |  826 +
  savevm.c  | 1590 ++---
  tests/.gitignore  |1 +
  tests/Makefile|4 +
  tests/test-vmstate.c  |  357 +
  vmstate.c |  650 +
  9 files changed, 1921 insertions(+), 1524 deletions(-)
  create mode 100644 qemu-file.c
  create mode 100644 tests/test-vmstate.c
  create mode 100644 vmstate.c



Series Reviewed-by: Orit Wasserman 
(with v2 of patch 8)

Re: [Qemu-devel] [PATCH v2 1/6] error: Add error_abort

2013-12-05 Thread Eric Blake

On 12/05/2013 03:13 AM, Markus Armbruster wrote:

>>
>> For error_propagate, if the destination error is &error_abort, then
>> the abort happens at propagation time.
>>
>> Signed-off-by: Peter Crosthwaite 
>> ---
>> changed since v1:
>> Delayed assertions that *errp == NULL.
> 
> Care to explain why you want to delay these assertions?  I'm not sure I
> get it...

error_abort as a global variable is always NULL.

> 
> [...]
>> @@ -31,7 +33,6 @@ void error_set(Error **errp, ErrorClass err_class, const 
>> char *fmt, ...)
>>  if (errp == NULL) {
>>  return;
>>  }
>> -assert(*errp == NULL);

So *&error_abort is null and this assertion would fire, unless we delay
the check for NULL...

>>  
>>  err = g_malloc0(sizeof(*err));
>>  
>> @@ -40,6 +41,12 @@ void error_set(Error **errp, ErrorClass err_class, const 
>> char *fmt, ...)
>>  va_end(ap);
>>  err->err_class = err_class;
>>  
>> +if (errp == &error_abort) {
>> +error_report("%s", error_get_pretty(err));
>> +abort();
>> +}
>> +
>> +assert(*errp == NULL);

...until after the check for &error_abort.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org
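
The ordering of checks in the quoted error_set() hunk can be reproduced in a standalone sketch. This only mirrors the sequence shown in the diff (NULL check, build the error, abort if the caller passed the sentinel, then assert the destination is empty); the `Error` struct, `error_abort_sketch` and `error_set_sketch()` are simplified stand-ins invented for this example, not QEMU's real error API.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for QEMU's Error object. */
typedef struct Error {
    char msg[64];
} Error;

/* Sentinel: its value is always NULL, only its address is meaningful. */
static Error *error_abort_sketch;

static void error_set_sketch(Error **errp, const char *msg)
{
    Error *err;

    if (errp == NULL) {
        return;                 /* caller is not interested in errors */
    }

    err = calloc(1, sizeof(*err));
    if (err == NULL) {
        abort();
    }
    strncpy(err->msg, msg, sizeof(err->msg) - 1);

    if (errp == &error_abort_sketch) {
        /* Caller asked for any error to be fatal: report and abort. */
        fprintf(stderr, "%s\n", err->msg);
        abort();
    }

    /* Only now check that no earlier error is being overwritten. */
    assert(*errp == NULL);
    *errp = err;
}
```

A caller that wants errors to be fatal would pass `&error_abort_sketch`; any other non-NULL `errp` receives the newly allocated error and owns it.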




Re: [Qemu-devel] [PATCHv3 1.8 6/9] qemu-img: dynamically adjust iobuffer size during convert

2013-12-05 Thread Eric Blake

On 11/27/2013 03:07 AM, Peter Lieven wrote:
> since the convert process is basically a sync operation it might
> be benificial in some case to change the hardcoded I/O buffer

s/benificial/beneficial/

> size to a greater value.
> 
> This patch increases the I/O buffer size if the output
> driver advertises an optimal transfer length or discard alignment
> that is greater than the default buffer size of 2M.
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [PATCH] target-i386: clear guest TSC on reset

2013-12-05 Thread Paolo Bonzini

Il 05/12/2013 14:15, Fernando Luis Vazquez Cao ha scritto:
>  /*
>   * KVM is yet unable to synchronize TSC values of multiple VCPUs on
>   * writeback. Until this is fixed, we only write the offset to SMP
>   * guests after migration, desynchronizing the VCPUs, but avoiding
>   * huge jump-backs that would occur without any writeback at all.
>   */
> -if (smp_cpus == 1 || env->tsc != 0) {
> +if (smp_cpus == 1 || env->tsc != 0 || level == KVM_PUT_RESET_STATE) {
>  kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
>  }

This is still a bit ugly, and desynchronizes the VCPUs on reset.

The main point of my outlined solution is that you only have one value
that is tracked, not one per VCPU (which in the case of migration adds
unpredictable latencies---for example due to emptying the migration
buffers).  We already save that value; all that's left is to use it
instead of env->tsc.

Though you would need one change here:

> - add a function kvm_set_ticks(int64_t ticks) to kvm-all.c and
> kvm-stub.c.  For kvm-all.c it calls kvm_arch_set_ticks(CPUState *cpu,
> int64_t ticks) in target-*/kvm.c.  The kvm_arch_set_tsc() function has a
> dummy implementation for all architectures except x86.  For x86 it calls
> KVM_SET_MSRS passing "ticks + env->tsc_offset". 

Instead you can make kvm_{,arch_}update_ticks() and pass
"cpu_get_ticks() + env->tsc_offset" to KVM_SET_MSRS (looping across all
VCPUs).  Assuming the TSC is synchronized to begin with on host CPUs,
and the latency is similar for all CPUs from the invocation of the ioctl
to the time TSC_OFFSET is written, the synchronization should be decent.

Paolo
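
The suggested cpu_set_ticks() with "the opposite logic compared to cpu_get_ticks" might look roughly like the toy model below. It assumes that the virtual tick count is the host counter plus an offset while ticks are enabled, and just the stored offset while they are stopped; that assumption, along with `struct ticks_state`, `host_ticks`, `use_icount_sketch` and the function names, is made up for this illustration and is not taken from cpus.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tick bookkeeping: one offset shared by all VCPUs. */
struct ticks_state {
    int64_t offset;
    bool enabled;
};

static int64_t host_ticks;        /* stand-in for the host cycle counter */
static bool use_icount_sketch;    /* stand-in for the use_icount setting */

/* Virtual ticks: host counter plus offset while running, frozen otherwise. */
static int64_t get_ticks(const struct ticks_state *ts)
{
    assert(ts != NULL);
    return ts->enabled ? host_ticks + ts->offset : ts->offset;
}

/* Inverse of get_ticks(): choose the offset so that get_ticks() == ticks. */
static void set_ticks(struct ticks_state *ts, int64_t ticks)
{
    assert(ts != NULL);
    if (use_icount_sketch) {
        return;                   /* instruction counting owns the clock */
    }
    ts->offset = ts->enabled ? ticks - host_ticks : ticks;
}
```

In this model a reset handler would call set_ticks(&ts, 0) once, so every VCPU derives its TSC from the same tracked value instead of a per-VCPU copy.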

Re: [Qemu-devel] [RFC V3 4/7] qmp: Allow block_passwd to manipulate bs graph nodes.

2013-12-05 Thread Benoît Canet

Le Wednesday 04 Dec 2013 à 16:56:05 (-0700), Eric Blake a écrit :
> On 12/03/2013 06:26 AM, Benoît Canet wrote:
> > Signed-off-by: Benoit Canet 
> > ---
> >  
> > +BlockDriverState * bdrv_lookup_bs(bool has_device, const char * device,
> > +  bool has_node_name, const char * 
> > node_name,
> 
> Style: no space after * (3 instances)
> 
> > +  Error **errp)
> > +{
> > +BlockDriverState *bs = NULL;
> > +
> > +if ((has_device && has_node_name) ||
> > +(!has_device && !has_node_name)) {
> 
> Could be shortened to:
> 
> if (has_device == has_node_name) {
> 
> > +error_setg(errp, "Use either device or node-name but not both.");
> 
> We tend to avoid trailing '.' on error messages
> 
> >  
> > -void qmp_block_passwd(const char *device, const char *password, Error 
> > **errp)
> > +void qmp_block_passwd(bool has_device, const char * device,
> > +  bool has_node_name, const char * node_name,
> > +  const char * password, Error **errp)
> 
> Again, no space after '*'
> 
> > +++ b/include/block/block.h
> > @@ -371,6 +371,9 @@ void bdrv_eject(BlockDriverState *bs, bool eject_flag);
> >  const char *bdrv_get_format_name(BlockDriverState *bs);
> >  BlockDriverState *bdrv_find(const char *name);
> >  BlockDriverState *bdrv_find_node(const char *node_name);
> > +BlockDriverState * bdrv_lookup_bs(bool has_device, const char * device,
> > +  bool has_node_name, const char * 
> > node_name,
> > +  Error **errp);
> 
> And again
> 
> > +++ b/qapi-schema.json
> > @@ -1675,7 +1675,11 @@
> >  # determine which ones are encrypted, set the passwords with this command, 
> > and
> >  # then start the guest with the @cont command.
> >  #
> > -# @device:   the name of the device to set the password on
> > +# Either @device or @node-name must be set but not both.
> > +#
> > +# @device: #optional the name of the block backend device to set the password on
> > +#
> > +# @node-name: #optional graph node name to set the password on (Since 1.8)
> 
> 2.0
> 
> >  #
> >  # @password: the password to use for the device
> >  #
> > @@ -1689,7 +1693,8 @@
> >  #
> >  # Since: 0.14.0
> >  ##
> > -{ 'command': 'block_passwd', 'data': {'device': 'str', 'password': 'str'} }
> > +{ 'command': 'block_passwd', 'data': {'*device': 'str',
> > +  '*node-name': 'str', 'password': 'str'} }
> 
> Seems like a reasonable addition; shouldn't cause any back-compat
> problems (older management tools will always provide the now-optional
> 'device').
> 
> Is it intentional that you are not exposing this new functionality in HMP?

Yes, I don't foresee any way to print the graph in HMP so I am limiting the
changes to QMP.

Best regards

Benoît

> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
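
For reference, a minimal standalone sketch of the equivalence Eric points out above: the long "both set or neither set" test and the shorter `has_device == has_node_name` test reject exactly the same inputs. The helper names below are made up for illustration and are not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Long form from the patch: invalid when both are set, or neither is set. */
static bool invalid_long(bool has_device, bool has_node_name)
{
    return (has_device && has_node_name) ||
           (!has_device && !has_node_name);
}

/* Short form suggested in review: invalid when the two flags are equal. */
static bool invalid_short(bool has_device, bool has_node_name)
{
    return has_device == has_node_name;
}

/* Returns true if both forms agree on all four input combinations. */
static bool forms_agree(void)
{
    for (int d = 0; d <= 1; d++) {
        for (int n = 0; n <= 1; n++) {
            if (invalid_long(d, n) != invalid_short(d, n)) {
                return false;
            }
        }
    }
    return true;
}
```

Since there are only four input combinations, checking them exhaustively is enough to show the two forms are interchangeable.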

Re: [Qemu-devel] [RFC V3 7/7] qmp: Allow to take external snapshots on bs graphs node.

2013-12-05 Thread Benoît Canet

On Wednesday 04 Dec 2013 at 17:11:26 (-0700), Eric Blake wrote:
> On 12/03/2013 06:26 AM, Benoît Canet wrote:
> > Signed-off-by: Benoit Canet 
> > ---
> >  blockdev.c   | 49 +
> >  hmp.c|  4 +++-
> >  qapi-schema.json | 13 ++---
> >  qmp-commands.hx  | 11 ++-
> >  4 files changed, 64 insertions(+), 13 deletions(-)
> > 
> 
> > 
> > +if (has_node_name && !has_snapshot_node_name) {
> > +error_setg(errp, "New snapshot node name missing");
> > +return;
> > +}
> 
> Why is it okay to omit the node name when passing a device name (which
> creates an anonymous node as the new root of the device tree) but not
> when passing a node name?  Are you trying to guarantee that all
> anonymous nodes can be reached from a device name, and that when taking
> a snapshot from a node name the new node is not necessarily tied to a
> device and must therefore be named?

Yes, a bs living just under the block backend will always be accessible via the
device name, whereas other bs really need a node-name set in order to be
manipulated.
Also, it avoids adding a new mandatory field for the device case, which is good
for compatibility with previous versions.

Best regards

Benoît

> 
> > -# @device:  the name of the device to generate the snapshot from.
> > +# Either @device or @node-name must be set but not both.
> > +#
> > +# @device: #optional the name of the device to generate the snapshot from.
> > +#
> > +# @node-name: #optional graph node name to generate the snapshot from (Since 1.8)
> >  #
> >  # @snapshot-file: the target of the new image. A new file will be created.
> >  #
> > +# @snapshot-node-name: the graph node name of the new image (Since 1.8)
> 
> 2.0, also mark this one #optional
> 
> > +#
> >  # @format: #optional the format of the snapshot image, default is 'qcow2'.
> 
> Unrelated to this patch, but @format is another field worth turning into
> an enum instead of an open-coded string.
> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
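
A small sketch of the argument rule discussed above, under the assumption that the snapshot command also applies the "either device or node-name, but not both" rule from the block_passwd patch. The helper name and its return convention are hypothetical; only the second error string appears in the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a QEMU function.
 * Returns NULL when the argument combination is acceptable,
 * otherwise a message describing the problem.
 */
static const char *check_snapshot_args(bool has_device, bool has_node_name,
                                       bool has_snapshot_node_name)
{
    /* Exactly one of device / node-name must be given (assumed rule). */
    if (has_device == has_node_name) {
        return "Use either device or node-name but not both";
    }
    /* A snapshot taken from a node name must name the new node. */
    if (has_node_name && !has_snapshot_node_name) {
        return "New snapshot node name missing";
    }
    /* The device case may leave the new node anonymous. */
    return NULL;
}
```

Keeping snapshot-node-name optional in the device case is what preserves compatibility for existing callers that only pass a device.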

Re: [Qemu-devel] [RFC V3 3/7] qapi: Add skeletton of command to query a drive bs graph.

2013-12-05 Thread Benoît Canet

On Wednesday 04 Dec 2013 at 16:46:34 (-0700), Eric Blake wrote:
> On 12/03/2013 06:26 AM, Benoît Canet wrote:
> 
> In addition to Fam's review,
> 
> s/skeletton/skeleton/ in subject
> 
> > ---
> >  blockdev.c   |  8 
> >  qapi-schema.json | 32 
> >  2 files changed, 40 insertions(+)
> > 
> > diff --git a/blockdev.c b/blockdev.c
> > index a474bb5..824e718 100644
> > --- a/blockdev.c
> > +++ b/blockdev.c
> > @@ -1940,6 +1940,14 @@ void qmp_drive_backup(const char *device, const char *target,
> >  }
> >  }
> >  
> > +BlockGraphNode * qmp_query_drive_graph(const char *device, Error **errp)
> 
> Style: no space after *
> 
> > +{
> > +/* the implementation of this function would recurse through the
> > + * BlockDriverState graph to build it's result
> > + */
> > +return NULL;
> 
> Shouldn't you set errp when returning failure?
> 
> > +++ b/qapi-schema.json
> > @@ -2008,6 +2008,38 @@
> >  { 'command': 'drive-backup', 'data': 'DriveBackup' }
> >  
> >  ##
> > +# @BlockGraphNode
> > +#
> > +# Information about a node of the block driver state graph
> > +#
> > +# @node-name: the name of the node in the graph
> > +#
> > +# @drv: the name of the block format used by this node as described in
> > +#   @BlockDeviceInfo.
> 
> It would be nice if BlockDeviceInfo and BlockGraphNode used an enum
> rather than an open-coded string for this field.
> 
> > +#
> > +# @children: a list of @BlockGraphNode being the children of this node
> 
> s/being/that are/
> 
> > +##
> > +# @query-drive-graph
> > +#
> > +# Get the block driver states graph for a given drive
> > +#
> > +# @device: the name of the device to get the graph from
> > +#
> > +# Returns: the root @BlockGraphNode
> > +#
> > +# Since 1.8
> > +##
> > +{ 'command': 'query-drive-graph',
> > +  'data': { 'device': 'str' },
> > +  'returns': 'BlockGraphNode' }
> 
> Am I correct that it will be possible to have named nodes that are not
> currently associated with any device?  If so, how do we learn about
> those nodes?  Would it be better to have a command that returns an array
> of structs for all known node roots, with an optional member describing
> which device owns that node root?  Something like:

The code has a list of all named nodes but not a list of named node roots.
Also, it's difficult to get the device name for a named node because the bses
don't have any backward pointers to their parents.
It could be done by recursing into all the blockbackend bs, but it's twisted.

In fact, I am wondering whether we really need something to dump the named-node
topology in QMP, for the simple reason that the node names are given by the
management layer, so management should already know the topology.

Best regards

Benoît

> 
> # Represent a root of a block graph
> # @root: a named node forming a root of a node graph
> # @device: #optional device name that owns this root
> { 'type': 'BlockGraphRoot',
>   'data': { 'root': 'BlockGraphNode',
> '*device': 'str' } }
> 
> # @query_drive-graphs
> # Returns an array of all node graph roots
> { 'command': 'query-drive-graphs',
>   'returns': [ 'BlockGraphRoot' ] }
> 
> possibly with 'data':{'*device':'str'} to allow filtering to just a
> 1-element array based on the device name (although I'm not sure if
> providing the complexity of filtering is worth it).
> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
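
To illustrate the "recursing into all the blockbackend bs" approach mentioned above, here is a rough sketch of finding the owning device when nodes have no pointer back to their parent. The Node and Root types are simplified stand-ins invented for this example; they are not QEMU's BlockDriverState or its lists.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a graph node with downward pointers only. */
typedef struct Node {
    const char *node_name;    /* may be NULL for an anonymous node */
    struct Node **children;   /* array of child pointers, may be NULL */
    size_t n_children;
} Node;

/* Hypothetical record tying a device name to the top node of its tree. */
typedef struct Root {
    const char *device;
    Node *top;
} Root;

/* Depth-first search: does the subtree rooted at n contain target? */
static int subtree_contains(const Node *n, const Node *target)
{
    if (n == NULL) {
        return 0;
    }
    if (n == target) {
        return 1;
    }
    for (size_t i = 0; i < n->n_children; i++) {
        if (subtree_contains(n->children[i], target)) {
            return 1;
        }
    }
    return 0;
}

/* Walk every root until one tree contains target; NULL if none does. */
static const char *owning_device(const Root *roots, size_t n_roots,
                                 const Node *target)
{
    for (size_t i = 0; i < n_roots; i++) {
        if (subtree_contains(roots[i].top, target)) {
            return roots[i].device;
        }
    }
    return NULL;
}
```

The cost is a full walk of every tree per lookup, which is why the lack of backward pointers makes this awkward; a node that is not attached to any device simply yields NULL.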

[Qemu-devel] [Bug 1258168] [NEW] QEMU fails to build on CentOS 5.10 with --disable-pie reporting "/usr/bin/ld: -f may not be used without -shared "

2013-12-05 Thread Don Slutz

Public bug reported:

fails for (7dc65c0 (HEAD, origin/master, origin/HEAD, master) Open 2.0
development tree):

...
libtool  --mode=link --tag=CC cc -m64 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 
-D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef 
-Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing  -Wendif-labels 
-Wmissing-include-dirs -Wnested-externs -Wformat-security -Wformat-y2k 
-Winit-self -Wold-style-definition -fstack-protector-all 
-I/usr/include/libpng12   -I/usr/include/nss3 -I/usr/include/nspr4 -pthread 
-I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/pixman-1   
-I/home/don/qemu/dtc/libfdt -pthread -I/usr/include/glib-2.0 
-I/usr/lib64/glib-2.0/include -I/home/don/qemu/tests -O2 -U_FORTIFY_SOURCE 
-D_FORTIFY_SOURCE=2 -g -Wl,--warn-common -m64 -g  -o vscclient 
libcacard/vscclient.o libcacard.la  -Wc,-fstack-protector-all -lrt -pthread 
-L/lib64 -lgthread-2.0 -lglib-2.0-lz -L/usr/kerberos/lib64 -lcurl -ldl 
-lgssapi_krb5 -lkrb5 -lk5crypto -lcom_err -lidn -lssl -lcrypto -lz -luuid
cc -m64 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE 
-Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings 
-Wmissing-prototypes -fno-strict-aliasing -Wendif-labels -Wmissing-include-dirs 
-Wnested-externs -Wformat-security -Wformat-y2k -Winit-self 
-Wold-style-definition -fstack-protector-all -I/usr/include/libpng12 
-I/usr/include/nss3 -I/usr/include/nspr4 -pthread -I/usr/include/glib-2.0 
-I/usr/lib64/glib-2.0/include -I/usr/include/pixman-1 
-I/home/don/qemu/dtc/libfdt -pthread -I/usr/include/glib-2.0 
-I/usr/lib64/glib-2.0/include -I/home/don/qemu/tests -O2 -U_FORTIFY_SOURCE 
-D_FORTIFY_SOURCE=2 -g -Wl,--warn-common -m64 -g -o .libs/vscclient 
libcacard/vscclient.o -Wl,-fstack-protector-all -pthread  ./.libs/libcacard.so 
-L/lib64 -L/usr/kerberos/lib64 -lssl3 -lsmime3 -lnss3 -lnssutil3 -lplds4 -lplc4 
-lnspr4 -lpthread -lrt -lgthread-2.0 -lglib-2.0 -lcurl -ldl -lgssapi_krb5 
-lkrb5 -lk5crypto -lcom_err -lidn -lssl -lcrypto -lz -luuid  -Wl,--rpath 
-Wl,/usr/local/lib
/usr/bin/ld: -f may not be used without -shared
collect2: ld returned 1 exit status
make: *** [vscclient] Error 1 

rm -rf out/tmp;mkdir out/tmp;pushd out/tmp;../../configure --disable-pie;make 
V=1 1>zz1 2>&1;popd
~/qemu/out/tmp ~/qemu
Install prefix    /usr/local
BIOS directory    /usr/local/share/qemu
binary directory  /usr/local/bin
library directory /usr/local/lib
libexec directory /usr/local/libexec
include directory /usr/local/include
config directory  /usr/local/etc
local state directory   /usr/local/var
Manual directory  /usr/local/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path       /home/don/qemu
C compiler        cc
Host C compiler   cc
C++ compiler      c++
Objective-C compiler cc
ARFLAGS           rv
CFLAGS            -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g
QEMU_CFLAGS       -m64 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE 
-Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings 
-Wmissing-prototypes -fno-strict-aliasing -Wendif-labels -Wmissing-include-dirs 
-Wnested-externs -Wformat-security -Wformat-y2k -Winit-self 
-Wold-style-definition -fstack-protector-all   -I/usr/include/libpng12 
-I/usr/include/nss3 -I/usr/include/nspr4   -pthread -I/usr/include/glib-2.0 
-I/usr/lib64/glib-2.0/include -I/usr/include/pixman-1   -I$(SRC_PATH)/dtc/libfdt
LDFLAGS           -Wl,--warn-common -m64 -g
make              make
install           install
python            python
smbd              /usr/sbin/smbd
host CPU          x86_64
host big endian   no
target list       alpha-softmmu arm-softmmu cris-softmmu i386-softmmu 
lm32-softmmu m68k-softmmu microblaze-softmmu microblazeel-softmmu mips-softmmu 
mips64-softmmu mips64el-softmmu mipsel-softmmu moxie-softmmu or32-softmmu 
ppc-softmmu ppc64-softmmu ppcemb-softmmu s390x-softmmu sh4-softmmu 
sh4eb-softmmu sparc-softmmu sparc64-softmmu unicore32-softmmu x86_64-softmmu 
xtensa-softmmu xtensaeb-softmmu alpha-linux-user arm-linux-user 
armeb-linux-user cris-linux-user i386-linux-user m68k-linux-user 
microblaze-linux-user microblazeel-linux-user mips-linux-user mips64-linux-user 
mips64el-linux-user mipsel-linux-user mipsn32-linux-user mipsn32el-linux-user 
or32-linux-user ppc-linux-user ppc64-linux-user ppc64abi32-linux-user 
s390x-linux-user sh4-linux-user sh4eb-linux-user sparc-linux-user 
sparc32plus-linux-user sparc64-linux-user unicore32-linux-user x86_64-linux-user
tcg debug enabled no
gprof enabled     no
sparse enabled    no
strip binaries    yes
profiler          no
static build      no
-Werror enabled   no
pixman            system
SDL support       yes
GTK support       no
curses support    yes
curl support      yes
mingw32 support   no
Audio drivers     oss
Block whitelist (rw)
Block whitelist (ro)
VirtFS support    yes
VNC support       yes
VNC TLS support   no
VNC SASL support  yes
VNC JPEG support  yes
VNC PNG support   yes
VNC WS support    no
xen support       yes
brlapi

Re: [Qemu-devel] [RFC V3 3/7] qapi: Add skeletton of command to query a drive bs graph.

2013-12-05 Thread Eric Blake

On 12/05/2013 07:24 AM, Benoît Canet wrote:

>>
>> Am I correct that it will be possible to have named nodes that are not
>> currently associated with any device?  If so, how do we learn about
>> those nodes?  Would it be better to have a command that returns an array
>> of structs for all known node roots, with an optional member describing
>> which device owns that node root?  Something like:
> 
> The code has a list of all named nodes but not a list of named node roots.
> Also, it's difficult to get the device name for a named node because the bses
> don't have any backward pointers to their parents.
> It could be done by recursing into all the blockbackend bs, but it's twisted.

Still worth thinking about how to structure things so we could add it in
the future if it turns out to be useful to management, but I can
understand why you aren't providing it right away.

> 
> In fact, I am wondering whether we really need something to dump the named-node
> topology in QMP, for the simple reason that the node names are given by the
> management layer, so management should already know the topology.

There's one case where management might not know - if libvirtd gets
restarted while in the middle of an operation that was attempting to
create a named node, then on restart and reconnection to the monitor,
libvirt would want to query to see if the node actually got created or
if the command needs to be attempted again.  I'm not a fan of write-only
interfaces - and making management responsible to track all named nodes
with no way to query if qemu actually agrees with the topology that
management thinks it has commanded feels like a write-only interface.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


Re: [Qemu-devel] [RFC V3 3/7] qapi: Add skeletton of command to query a drive bs graph.

2013-12-05 Thread Benoît Canet

On Thursday 05 Dec 2013 at 07:38:37 (-0700), Eric Blake wrote:
> On 12/05/2013 07:24 AM, Benoît Canet wrote:
> 
> >>
> >> Am I correct that it will be possible to have named nodes that are not
> >> currently associated with any device?  If so, how do we learn about
> >> those nodes?  Would it be better to have a command that returns an array
> >> of structs for all known node roots, with an optional member describing
> >> which device owns that node root?  Something like:
> > 
> > The code has a list of all named nodes but not a list of named node roots.
> > Also, it's difficult to get the device name for a named node because the
> > bses don't have any backward pointers to their parents.
> > It could be done by recursing into all the blockbackend bs, but it's twisted.
> 
> Still worth thinking about how to structure things so we could add it in
> the future if it turns out to be useful to management, but I can
> understand why you aren't providing it right away.
> 
> > 
> > In fact, I am wondering whether we really need something to dump the
> > named-node topology in QMP, for the simple reason that the node names are
> > given by the management layer, so management should already know the topology.
> 
> There's one case where management might not know - if libvirtd gets
> restarted while in the middle of an operation that was attempting to
> create a named node, then on restart and reconnection to the monitor,
> libvirt would want to query to see if the node actually got created or
> if the command needs to be attempted again.  I'm not a fan of write-only
> interfaces - and making management responsible to track all named nodes
> with no way to query if qemu actually agrees with the topology that
> management thinks it has commanded feels like a write-only interface.

Would a command returning info about a specific named node be sufficient for
libvirt's checks?
It's far less complex to implement than exposing the whole graph.
We could also provide a simple command to list the names of the named nodes.

Best regards

Benoît

> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>

Re: [Qemu-devel] [RFC V3 6/7] block: Create authorizations mechanism for external snapshots.

2013-12-05 Thread Benoît Canet

On Wednesday 04 Dec 2013 at 15:03:42 (+0800), Fam Zheng wrote:
> On 2013-12-04 14:34, Benoît Canet wrote:
> >On Wednesday 04 Dec 2013 at 14:12:19 (+0800), Fam Zheng wrote:
> >>On 2013-12-04 13:20, Benoît Canet wrote:
> >>>On Wednesday 04 Dec 2013 at 11:47:22 (+0800), Fam Zheng wrote:
> On 2013-12-03 21:26, Benoît Canet wrote:
> >---
> >  block.c   | 64 
> > +--
> >  block/blkverify.c |  2 +-
> >  include/block/block.h | 16 +---
> >  include/block/block_int.h |  9 ---
> >  4 files changed, 75 insertions(+), 16 deletions(-)
> >
> >diff --git a/block.c b/block.c
> >index 8016ff2..0569cb2 100644
> >--- a/block.c
> >+++ b/block.c
> >@@ -4945,21 +4945,69 @@ int bdrv_amend_options(BlockDriverState *bs, 
> >QEMUOptionParameter *options)
> >  return bs->drv->bdrv_amend_options(bs, options);
> >  }
> >
> >-ExtSnapshotPerm bdrv_check_ext_snapshot(BlockDriverState *bs)
> >+/* will be used to recurse on single child block filter until first format
> >+ * (single child block filter will store their child in bs->file)
> >+ */
> >+ExtSnapshotPerm bdrv_generic_check_ext_snapshot(BlockDriverState *bs,
> >+BlockDriverState *candidate)
> >  {
> >-if (bs->drv->bdrv_check_ext_snapshot) {
> >-return bs->drv->bdrv_check_ext_snapshot(bs);
> >+if (!bs->drv) {
> >+return EXT_SNAPSHOT_FORBIDDEN;
> >  }
> >
> >-if (bs->file && bs->file->drv && bs->file->drv->bdrv_check_ext_snapshot) {
> >-return bs->file->drv->bdrv_check_ext_snapshot(bs);
> >+if (!bs->drv->authorizations[BS_CANT_SNAPSHOT]) {
> 
> This double negative feels hard to read for me.
> 
> >+if (bs == candidate) {
> >+ return EXT_SNAPSHOT_ALLOWED;
> >+} else {
> >+ return EXT_SNAPSHOT_FORBIDDEN;
> >+}
> >  }
> >
> >-/* external snapshots are allowed by default */
> >-return EXT_SNAPSHOT_ALLOWED;
> >+if (!bs->drv->authorizations[BS_FILTER_PASS_DOWN]) {
> >+return EXT_SNAPSHOT_FORBIDDEN;
> >+}
> >+
> >+if (!bs->file) {
> >+return EXT_SNAPSHOT_FORBIDDEN;
> >+}
> >+
> >+return bdrv_recurse_check_ext_snapshot(bs->file, candidate);
> >  }
> >
> >-ExtSnapshotPerm bdrv_check_ext_snapshot_forbidden(BlockDriverState *bs)
> >+ExtSnapshotPerm bdrv_recurse_check_ext_snapshot(BlockDriverState *bs,
> >+BlockDriverState *candidate)
> >  {
> >+if (bs->drv && bs->drv->bdrv_check_ext_snapshot) {
> >+return bs->drv->bdrv_check_ext_snapshot(bs, candidate);
> >+}
> 
> Maybe I'm missing something, but if a driver always returns a positive
> permit, regardless of what candidate is (or even whether it's relevant to bs),
> then doesn't it also affect other devices? Because...
> 
> >+
> >+return bdrv_generic_check_ext_snapshot(bs, candidate);
> >+}
> >+
> >+/* This function check if the candidate bs has snapshots authorized by 
> >going
> >+ * down the forest of bs, skipping filters and stopping on the the 
> >first bses
> >+ * authorizing snapshots
> >+ */
> >+ExtSnapshotPerm bdrv_check_ext_snapshot(BlockDriverState *candidate)
> >+{
> >+BlockDriverState *bs;
> >+
> >+/* walk down the bs forest recursively */
> >+QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> 
> this iterates through all the known graph trees (device_list),
> instead of limiting to only the device that candidate belongs to.
> >>>
> >>>The recursion terminates successfully when candidate == bs.
> >>>This makes sure that scanning the other trees of the forest will not
> >>>return any spurious success.
> >>>
> >>
> >>But the "candidate == bs" check is in
> >>bdrv_generic_check_ext_snapshot, which gets short-circuited by
> >>driver implementation if the driver implements it, in
> >>bdrv_recurse_check_ext_snapshot.
> >>
> >>So if I have an "always yes" drv->bdrv_check_ext_snapshot and it
> >>happens to be the first one in bdrv_states, I will allow all
> >>snapshot operations.
> >>
> >
> >My bad, I forgot to document drv->bdrv_check_ext_snapshot.
> >It is meant to be recursive and only for twisted block filters like this one
> >(quorum):
> >
> >static ExtSnapshotPerm quorum_check_ext_snapshot(BlockDriverState *bs,
> >  BlockDriverState *candidate)
> >{
> > BDRVQuorumState *s = bs->opaque;
> > int i;
> >
> > for (i = 0; i < s->total; i++) {
> > ExtSnapshotPerm perm = bdrv_recurse_check_ext_snapshot(s->bs[i],
> >
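
(The quoted message is cut off at this point in the archive.) As a rough illustration of the recursion being debated, and of Fam's concern about a driver callback that ignores the candidate, here is a simplified standalone sketch. The Bs struct, its two boolean fields, and the explicit roots array are assumptions standing in for BlockDriverState, the authorizations[] table, and the bdrv_states list; the body of the top-level loop is also an assumption, since the quoted hunk does not show it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    EXT_SNAPSHOT_ALLOWED,
    EXT_SNAPSHOT_FORBIDDEN
} ExtSnapshotPerm;

typedef struct Bs Bs;
typedef ExtSnapshotPerm (*CheckFn)(Bs *bs, Bs *candidate);

/* Simplified stand-in for BlockDriverState plus its driver. */
struct Bs {
    bool cant_snapshot;     /* stands in for authorizations[BS_CANT_SNAPSHOT] */
    bool filter_pass_down;  /* stands in for authorizations[BS_FILTER_PASS_DOWN] */
    CheckFn check;          /* optional driver override, may be NULL */
    Bs *file;               /* single child, may be NULL */
};

static ExtSnapshotPerm recurse_check(Bs *bs, Bs *candidate);

/* Generic path: stop at the first node that can be snapshotted. */
static ExtSnapshotPerm generic_check(Bs *bs, Bs *candidate)
{
    if (!bs->cant_snapshot) {
        return bs == candidate ? EXT_SNAPSHOT_ALLOWED
                               : EXT_SNAPSHOT_FORBIDDEN;
    }
    if (!bs->filter_pass_down || !bs->file) {
        return EXT_SNAPSHOT_FORBIDDEN;
    }
    return recurse_check(bs->file, candidate);
}

/* A driver override, when present, replaces the generic path entirely. */
static ExtSnapshotPerm recurse_check(Bs *bs, Bs *candidate)
{
    if (bs->check) {
        return bs->check(bs, candidate);
    }
    return generic_check(bs, candidate);
}

/* Top level: scan every root; allow if any tree authorizes the candidate. */
static ExtSnapshotPerm check_ext_snapshot(Bs **roots, size_t n_roots,
                                          Bs *candidate)
{
    for (size_t i = 0; i < n_roots; i++) {
        if (recurse_check(roots[i], candidate) == EXT_SNAPSHOT_ALLOWED) {
            return EXT_SNAPSHOT_ALLOWED;
        }
    }
    return EXT_SNAPSHOT_FORBIDDEN;
}

/* A badly behaved override that never looks at candidate. */
static ExtSnapshotPerm always_yes(Bs *bs, Bs *candidate)
{
    (void)bs;
    (void)candidate;
    return EXT_SNAPSHOT_ALLOWED;
}
```

With only generic nodes, the candidate == bs test keeps unrelated trees from reporting success. Once a root carries an override like always_yes, every candidate is allowed, which is the case Fam raises; hence the requirement that driver callbacks recurse and compare against the candidate themselves.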

[Qemu-devel] [PATCH] qemu-img: make progress output more accurate during convert

2013-12-05 Thread Peter Lieven

The progress output is very bumpy if the input images contain
a significant portion of unallocated sectors. This patch
counts how many sectors are allocated a priori if progress
output is selected.

Signed-off-by: Peter Lieven 
---
 qemu-img.c |   26 +-
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index be72274..1ea064e 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1130,8 +1130,7 @@ static int img_convert(int argc, char **argv)
 const char *fmt, *out_fmt, *cache, *out_baseimg, *out_filename;
 BlockDriver *drv, *proto_drv;
 BlockDriverState **bs = NULL, *out_bs = NULL;
-int64_t total_sectors, nb_sectors, sector_num, bs_offset,
-sector_num_next_status = 0;
+int64_t total_sectors, nb_sectors, sector_num, bs_offset;
 uint64_t bs_sectors;
 uint8_t * buf = NULL;
 size_t bufsectors = IO_BUF_SIZE / BDRV_SECTOR_SIZE;
@@ -1476,6 +1475,8 @@ static int img_convert(int argc, char **argv)
 /* signal EOF to align */
 bdrv_write_compressed(out_bs, 0, NULL, 0);
 } else {
+int64_t sectors_to_read, sectors_read, sector_num_next_status;
+bool count_allocated_sectors;
 int has_zero_init = min_sparse ? bdrv_has_zero_init(out_bs) : 0;
 
 if (!has_zero_init && bdrv_can_write_zeroes_with_unmap(out_bs)) {
@@ -1486,12 +1487,21 @@ static int img_convert(int argc, char **argv)
 has_zero_init = 1;
 }
 
+sectors_to_read = total_sectors;
+count_allocated_sectors = progress && (out_baseimg || has_zero_init);
+restart:
 sector_num = 0; // total number of sectors converted so far
-nb_sectors = total_sectors - sector_num;
+sectors_read = 0;
+sector_num_next_status = 0;
 
 for(;;) {
 nb_sectors = total_sectors - sector_num;
 if (nb_sectors <= 0) {
+if (count_allocated_sectors) {
+sectors_to_read = sectors_read;
+count_allocated_sectors = false;
+goto restart;
+}
 ret = 0;
 break;
 }
@@ -1557,8 +1567,14 @@ static int img_convert(int argc, char **argv)
 }
 
 n = MIN(n, bs_sectors - (sector_num - bs_offset));
-n1 = n;
 
+sectors_read += n;
+if (count_allocated_sectors) {
+sector_num += n;
+continue;
+}
+
+n1 = n;
 ret = bdrv_read(bs[bs_i], sector_num - bs_offset, buf, n);
 if (ret < 0) {
 error_report("error while reading sector %" PRId64 ": %s",
@@ -1583,7 +1599,7 @@ static int img_convert(int argc, char **argv)
 n -= n1;
 buf1 += n1 * 512;
 }
-qemu_progress_print(100.0 * sector_num / total_sectors, 0);
+qemu_progress_print(100.0 * sectors_read / sectors_to_read, 0);
 }
 }
 out:
-- 
1.7.9.5
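
The idea behind the patch can be shown with a small standalone sketch: a first pass counts only the sectors that will actually be read, and progress is then reported against that count instead of against the image size. The Extent type and the function names are hypothetical and only model the arithmetic; they are not taken from qemu-img.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical run of sectors that is either allocated or a hole. */
typedef struct Extent {
    int64_t nb_sectors;
    bool allocated;
} Extent;

/* Sum sectors over the first n extents, optionally skipping holes. */
static int64_t count_sectors(const Extent *ext, size_t n, bool only_allocated)
{
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        assert(ext[i].nb_sectors >= 0);
        if (!only_allocated || ext[i].allocated) {
            total += ext[i].nb_sectors;
        }
    }
    return total;
}

/* Old behaviour: percentage of the image position reached. */
static double progress_by_position(const Extent *ext, size_t n, size_t done)
{
    assert(done <= n);
    int64_t total = count_sectors(ext, n, false);
    return total ? 100.0 * count_sectors(ext, done, false) / total : 100.0;
}

/* Patched behaviour: percentage of the allocated sectors read so far. */
static double progress_by_allocated(const Extent *ext, size_t n, size_t done)
{
    assert(done <= n);
    int64_t to_read = count_sectors(ext, n, true);
    return to_read ? 100.0 * count_sectors(ext, done, true) / to_read : 100.0;
}
```

For an image laid out as 100 allocated sectors, an 800-sector hole, and 100 more allocated sectors, the position-based figure jumps from 10% to 90% while the hole is skipped almost instantly, whereas the allocation-based figure moves 50%, 50%, 100% in step with the real I/O.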

Re: [Qemu-devel] [PATCHv3 1.8 0/9] qemu-img convert optimizations

2013-12-05 Thread Peter Lieven

Am 05.12.2013 13:15, schrieb Stefan Hajnoczi:
> On Wed, Nov 27, 2013 at 11:07:00AM +0100, Peter Lieven wrote:
>> this series adds some optimizations for qemu-img during convert which
>> have been developed recently:
>> - skipping input based on get_block_status
>> - variable I/O buffer size
>> - align write requests to cluster_size
>>
>> v2->v3:
>>   - added Paolos comments in Patch 1
>>   - changed the comment in patch 7 [Paolo]
>>   - remove the patch to add sector progress output
>>   - added a new patch to decrease the progress update interval.
>>
>> v1->v2:
>>   - introduce opt_transfer_length in BlockLimits [Paolo]
>>   - remove knobs for iobuffer_size and alignment and
>> use them unconditionally [Paolo]
>>   - calculate I/O buffer size by BlockLimits information [Paolo]
>>   - change the alignment patch to round down to the
>> last and not to the next aligned sector [Paolo]
>>   - limit updates in the sector progress output
>>   - new patch to increase the default for min_sparse [Paolo]
>>
>> Peter Lieven (9):
>>   qemu-img: add support for skipping zeroes in input during convert
>>   qemu-img: fix usage instruction for qemu-img convert
>>   block/iscsi: set bdi->cluster_size
>>   block: add opt_transfer_length to BlockLimits
>>   block/iscsi: set bs->bl.opt_transfer_length
>>   qemu-img: dynamically adjust iobuffer size during convert
>>   qemu-img: round down request length to an aligned sector
>>   qemu-img: increase min_sparse to 128 sectors (64kb)
>>   qemu-img: decrease progress update interval on convert
>>
>>  block/iscsi.c |   10 
>>  include/block/block_int.h |3 ++
>>  qemu-img.c|  131 
>> +++--
>>  qemu-img.texi |2 +-
>>  4 files changed, 93 insertions(+), 53 deletions(-)
> Merged all except patch 8/9.
>
> Thanks, applied to my block tree:
> https://github.com/stefanha/qemu/commits/block
Thank you.

As discussed I've sent the follow-up patch

[PATCH] qemu-img: make progress output more accurate during convert

to the list some minutes ago.

Peter

Re: [Qemu-devel] [RFC V3 3/7] qapi: Add skeletton of command to query a drive bs graph.

2013-12-05 Thread Eric Blake

On 12/05/2013 07:43 AM, Benoît Canet wrote:

>> There's one case where management might not know - if libvirtd gets
>> restarted while in the middle of an operation that was attempting to
>> create a named node, then on restart and reconnection to the monitor,
>> libvirt would want to query to see if the node actually got created or
>> if the command needs to be attempted again.  I'm not a fan of write-only
>> interfaces - and making management responsible to track all named nodes
>> with no way to query if qemu actually agrees with the topology that
>> management thinks it has commanded feels like a write-only interface.
> 
> Would a command returning info about a specific named node be sufficient for
> libvirt checks ?
> It's far less complex to implement than exposing the whole graph.
> We could also provide a simple command to list the names of the named nodes.

Yes, both of those ideas are useful; it still means management must
track the topology between the nodes, but it at least gives management
enough control to know which set of nodes exist to confirm which
operations have occurred.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
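
A sketch of the two simpler queries agreed on above, assuming the flat list of named nodes that Benoît says the code already keeps. After a restart, management could look up the node name it was in the middle of creating and retry only if the lookup fails. The NamedNode type and both function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry in a flat list of named nodes. */
typedef struct NamedNode {
    const char *node_name;
    const char *drv;   /* name of the block format used by this node */
} NamedNode;

/* Query one node by name; NULL means it was never created. */
static const NamedNode *find_named_node(const NamedNode *nodes, size_t n,
                                        const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(nodes[i].node_name, name) == 0) {
            return &nodes[i];
        }
    }
    return NULL;
}

/*
 * List node names into out[] (at most out_cap entries).
 * Returns the total number of named nodes, which may exceed out_cap.
 */
static size_t list_node_names(const NamedNode *nodes, size_t n,
                              const char **out, size_t out_cap)
{
    for (size_t i = 0; i < n && i < out_cap; i++) {
        out[i] = nodes[i].node_name;
    }
    return n;
}
```

Neither query exposes how the nodes are connected, so management still has to track the topology itself, as Eric notes.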




[Qemu-devel] [Bug 1258168] Re: QEMU fails to build on CentOS 5.10 with --disable-pie reporting "/usr/bin/ld: -f may not be used without -shared "

2013-12-05 Thread Don Slutz

Using the hack:

git diff
diff --git a/configure b/configure
index 0666228..cf8123b 100755
--- a/configure
+++ b/configure
@@ -1292,7 +1292,9 @@ done
 
 if compile_prog "-Werror -fstack-protector-all" "" ; then
 QEMU_CFLAGS="$QEMU_CFLAGS -fstack-protector-all"
-LIBTOOLFLAGS="$LIBTOOLFLAGS -Wc,-fstack-protector-all"
+if test "$pie" != "no" ; then
+   LIBTOOLFLAGS="$LIBTOOLFLAGS -Wc,-fstack-protector-all"
+fi
 fi
 
 # Workaround for http://gcc.gnu.org/PR55489.  Happens with -fPIE/-fPIC and

I now get:

/home/don/qemu/libcacard/vscclient.c: In function 'do_socket_read':
/home/don/qemu/libcacard/vscclient.c:410: warning: implicit declaration of function 'g_warn_if_reached'
/home/don/qemu/libcacard/vscclient.c:410: warning: nested extern declaration of 'g_warn_if_reached'
/home/don/qemu/libcacard/vscclient.c: In function 'main':
/home/don/qemu/libcacard/vscclient.c:763: warning: implicit declaration of function 'g_byte_array_unref'
/home/don/qemu/libcacard/vscclient.c:763: warning: nested extern declaration of 'g_byte_array_unref'
...
libcacard/vscclient.o: In function `do_socket_read':
/home/don/qemu/libcacard/vscclient.c:410: undefined reference to `g_warn_if_reached'
libcacard/vscclient.o: In function `main':
/home/don/qemu/libcacard/vscclient.c:763: undefined reference to `g_byte_array_unref'
collect2: ld returned 1 exit status
make: *** [vscclient] Error 1

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1258168

Title:
  QEMU fails to build on CentOS 5.10 with --disable-pie reporting
  "/usr/bin/ld: -f may not be used without -shared "

Status in QEMU:
  New


Re: [Qemu-devel] [PATCH 01/27] acpi: factor out common pm_update_sci() into acpi core

2013-12-05 Thread Igor Mammedov

On Thu, 5 Dec 2013 14:37:01 +0200
"Michael S. Tsirkin"  wrote:

> On Thu, Nov 21, 2013 at 03:38:22AM +0100, Igor Mammedov wrote:
> > Signed-off-by: Igor Mammedov 
> 
> Sorry doesn't apply.
> Can you rebase on top of latest tree please?
I just tried rebasing on top of the updated pci tree.
There were no conflicts in this patch and it applied cleanly :/

> 
> > ---
> > perhaps this patch should go before "piix4: add acpi pci hotplug support"
> > so that there would be no need for this rename in piix4_acpi_pci_hotplug()
> > here.
> > 
> > s/pm_update_sci/acpi_update_sci/
> > ---
> >  hw/acpi/core.c |   18 ++
> >  hw/acpi/ich9.c |   23 ++-
> >  hw/acpi/piix4.c|   34 ++
> >  include/hw/acpi/acpi.h |8 
> >  4 files changed, 38 insertions(+), 45 deletions(-)
> > 
> > diff --git a/hw/acpi/core.c b/hw/acpi/core.c
> > index 58308a3..8c0d48c 100644
> > --- a/hw/acpi/core.c
> > +++ b/hw/acpi/core.c
> > @@ -662,3 +662,21 @@ uint32_t acpi_gpe_ioport_readb(ACPIREGS *ar, uint32_t 
> > addr)
> >  
> >  return val;
> >  }
> > +
> > +void acpi_update_sci(ACPIREGS *regs, qemu_irq irq, uint32_t gpe0_sts_mask)
> > +{
> > +int sci_level, pm1a_sts;
> > +
> > +pm1a_sts = acpi_pm1_evt_get_sts(regs);
> > +
> > +sci_level = ((pm1a_sts &
> > +  regs->pm1.evt.en & ACPI_BITMASK_PM1_COMMON_ENABLED) != 
> > 0) ||
> > +((regs->gpe.sts[0] & regs->gpe.en[0] & gpe0_sts_mask) != 
> > 0);
> > +
> > +qemu_set_irq(irq, sci_level);
> > +
> > +/* schedule a timer interruption if needed */
> > +acpi_pm_tmr_update(regs,
> > +   (regs->pm1.evt.en & ACPI_BITMASK_TIMER_ENABLE) &&
> > +   !(pm1a_sts & ACPI_BITMASK_TIMER_STATUS));
> > +}
> > diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
> > index 7e0429e..e59688b 100644
> > --- a/hw/acpi/ich9.c
> > +++ b/hw/acpi/ich9.c
> > @@ -44,29 +44,10 @@ do { printf("%s "fmt, __func__, ## __VA_ARGS__); } 
> > while (0)
> >  #define ICH9_DEBUG(fmt, ...)do { } while (0)
> >  #endif
> >  
> > -static void pm_update_sci(ICH9LPCPMRegs *pm)
> > -{
> > -int sci_level, pm1a_sts;
> > -
> > -pm1a_sts = acpi_pm1_evt_get_sts(&pm->acpi_regs);
> > -
> > -sci_level = (((pm1a_sts & pm->acpi_regs.pm1.evt.en) &
> > -  (ACPI_BITMASK_RT_CLOCK_ENABLE |
> > -   ACPI_BITMASK_POWER_BUTTON_ENABLE |
> > -   ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
> > -   ACPI_BITMASK_TIMER_ENABLE)) != 0);
> > -qemu_set_irq(pm->irq, sci_level);
> > -
> > -/* schedule a timer interruption if needed */
> > -acpi_pm_tmr_update(&pm->acpi_regs,
> > -   (pm->acpi_regs.pm1.evt.en & 
> > ACPI_BITMASK_TIMER_ENABLE) &&
> > -   !(pm1a_sts & ACPI_BITMASK_TIMER_STATUS));
> > -}
> > -
> >  static void ich9_pm_update_sci_fn(ACPIREGS *regs)
> >  {
> >  ICH9LPCPMRegs *pm = container_of(regs, ICH9LPCPMRegs, acpi_regs);
> > -pm_update_sci(pm);
> > +acpi_update_sci(&pm->acpi_regs, pm->irq, 0);
> >  }
> >  
> >  static uint64_t ich9_gpe_readb(void *opaque, hwaddr addr, unsigned width)
> > @@ -193,7 +174,7 @@ static void pm_reset(void *opaque)
> >  pm->smi_en |= ICH9_PMIO_SMI_EN_APMC_EN;
> >  }
> >  
> > -pm_update_sci(pm);
> > +acpi_update_sci(&pm->acpi_regs, pm->irq, 0);
> >  }
> >  
> >  static void pm_powerdown_req(Notifier *n, void *opaque)
> > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> > index 0be385e..b6dfa71 100644
> > --- a/hw/acpi/piix4.c
> > +++ b/hw/acpi/piix4.c
> > @@ -117,29 +117,11 @@ static void 
> > piix4_acpi_system_hot_add_init(MemoryRegion *parent,
> >  #define ACPI_ENABLE 0xf1
> >  #define ACPI_DISABLE 0xf0
> >  
> > -static void pm_update_sci(PIIX4PMState *s)
> > -{
> > -int sci_level, pmsts;
> > -
> > -pmsts = acpi_pm1_evt_get_sts(&s->ar);
> > -sci_level = (((pmsts & s->ar.pm1.evt.en) &
> > -  (ACPI_BITMASK_RT_CLOCK_ENABLE |
> > -   ACPI_BITMASK_POWER_BUTTON_ENABLE |
> > -   ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
> > -   ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
> > -(((s->ar.gpe.sts[0] & s->ar.gpe.en[0]) &
> > -  (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0);
> > -
> > -qemu_set_irq(s->irq, sci_level);
> > -/* schedule a timer interruption if needed */
> > -acpi_pm_tmr_update(&s->ar, (s->ar.pm1.evt.en & 
> > ACPI_BITMASK_TIMER_ENABLE) &&
> > -   !(pmsts & ACPI_BITMASK_TIMER_STATUS));
> > -}
> > -
> >  static void pm_tmr_timer(ACPIREGS *ar)
> >  {
> >  PIIX4PMState *s = container_of(ar, PIIX4PMState, ar);
> > -pm_update_sci(s);
> > +acpi_update_sci(&s->ar, s->irq, PIIX4_PCI_HOTPLUG_STATUS |
> > +PIIX4_CPU_HOTPLUG_STATUS);
> >  }
> >  
> >  static void apm_ctrl_changed(uint32_t val, void *arg)
> > @@ -429,7 +411,8 @@ static int piix4_acpi_pci_hotplug
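
The consolidated helper in the quoted diff reduces to an OR of two masked event sources: enabled PM1 events and enabled GPE0 events filtered by a caller-supplied mask. A minimal sketch of just that level computation follows. The struct, the function name, and the numeric mask value are assumptions made for illustration; the thread does not show the value of ACPI_BITMASK_PM1_COMMON_ENABLED.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ACPIREGS fields touched by the diff. */
struct sci_regs {
    uint16_t pm1_sts;   /* PM1 event status bits */
    uint16_t pm1_en;    /* PM1 event enable bits */
    uint8_t  gpe0_sts;  /* first GPE status byte */
    uint8_t  gpe0_en;   /* first GPE enable byte */
};

/* Illustrative value only: RTC, power button, global lock and timer bits. */
#define PM1_COMMON_ENABLED 0x0521u

/* SCI is asserted when any enabled PM1 event or any enabled, unmasked
 * GPE0 event is pending. */
static bool sci_level(const struct sci_regs *r, uint8_t gpe0_sts_mask)
{
    bool pm1 = (r->pm1_sts & r->pm1_en & PM1_COMMON_ENABLED) != 0;
    bool gpe = (r->gpe0_sts & r->gpe0_en & gpe0_sts_mask) != 0;

    return pm1 || gpe;
}
```

Passing a mask of 0, as the ich9 call sites do in the diff, makes the GPE term always false, so only PM1 events can raise the interrupt there.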

Re: [Qemu-devel] [Bug 1257099] [NEW] QEMU fails to build on CentOS 5.10 with relocation R_X86_64_PC32 error

2013-12-05 Thread Paolo Bonzini

Il 04/12/2013 02:32, Don Slutz ha scritto:
> Any hints or pointers about the bug in RHEL5 binutils?  I can try and
> make a patch to auto detect this.

Actually it's RHEL5 GCC:

$ cat f.c
void *
f(unsigned char *buf, int len)
{
return (void*)0L;
}


void *
g(unsigned char *buf, int len)
{
return f(buf, len);
}
$ gcc -shared -o f.so f.c -fPIE -fPIC
/usr/bin/ld: /tmp/ccQc9els.o: relocation R_X86_64_PC32 against `f' can not be 
used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
collect2: ld returned 1 exit status


The bug is simply that "-fPIE -fPIC" counts as -fPIE rather than -fPIC:

$ gcc -S -o - f.c -fPIE |grep call
        call    f  # PC32 relocation
$ gcc -S -o - f.c -fPIC |grep call
        call    f@PLT  # PLT32 relocation

On RHEL5:
$ gcc -S -o - f.c -fPIE -fPIC |grep call
        call    f

On RHEL6:
$ gcc -S -o - f.c -fPIE -fPIC |grep call
        call    f@PLT

Paolo

[Qemu-devel] PING Re: [patch] introduce MIG_STATE_CANCELLING state

2013-12-05 Thread Paolo Bonzini

Il 07/11/2013 12:21, Paolo Bonzini ha scritto:
> Il 07/11/2013 12:01, Zhanghaoyu (A) ha scritto:
>> Introduce MIG_STATE_CANCELLING state to avoid starting a new migration task 
>> while the previous one still exists.
>>
>> Signed-off-by: Zeng Junliang 
>> Signed-off-by: Zhang Haoyu 
>> ---
>>  migration.c |   26 --
>>  1 files changed, 16 insertions(+), 10 deletions(-)
>>
>> diff --git a/migration.c b/migration.c
>> index fd73b97..af8a09c 100644
>> --- a/migration.c
>> +++ b/migration.c
>> @@ -40,6 +40,7 @@ enum {
>>  MIG_STATE_ERROR = -1,
>>  MIG_STATE_NONE,
>>  MIG_STATE_SETUP,
>> +MIG_STATE_CANCELLING,
>>  MIG_STATE_CANCELLED,
>>  MIG_STATE_ACTIVE,
>>  MIG_STATE_COMPLETED,
>> @@ -196,6 +197,7 @@ MigrationInfo *qmp_query_migrate(Error **errp)
>>  info->has_total_time = false;
>>  break;
>>  case MIG_STATE_ACTIVE:
>> +case MIG_STATE_CANCELLING:
>>  info->has_status = true;
>>  info->status = g_strdup("active");
>>  info->has_total_time = true;
>> @@ -282,6 +284,13 @@ void 
>> qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
>>  
>>  /* shared migration helpers */
>>  
>> +static void migrate_set_state(MigrationState *s, int old_state, int 
>> new_state)
>> +{
>> +if (atomic_cmpxchg(&s->state, old_state, new_state) == new_state) {
>> +trace_migrate_set_state(new_state);
>> +}
>> +}
>> +
>>  static void migrate_fd_cleanup(void *opaque)
>>  {
>>  MigrationState *s = opaque;
>> @@ -303,18 +312,14 @@ static void migrate_fd_cleanup(void *opaque)
>>  
>>  if (s->state != MIG_STATE_COMPLETED) {
>>  qemu_savevm_state_cancel();
>> +if (s->state == MIG_STATE_CANCELLING) {
>> +migrate_set_state(s, MIG_STATE_CANCELLING, MIG_STATE_CANCELLED);
>> +}
>>  }
>>  
>>  notifier_list_notify(&migration_state_notifiers, s);
>>  }
>>  
>> -static void migrate_set_state(MigrationState *s, int old_state, int 
>> new_state)
>> -{
>> -if (atomic_cmpxchg(&s->state, old_state, new_state) == new_state) {
>> -trace_migrate_set_state(new_state);
>> -}
>> -}
>> -
>>  void migrate_fd_error(MigrationState *s)
>>  {
>>  DPRINTF("setting error state\n");
>> @@ -334,8 +339,8 @@ static void migrate_fd_cancel(MigrationState *s)
>>  if (old_state != MIG_STATE_SETUP && old_state != MIG_STATE_ACTIVE) {
>>  break;
>>  }
>> -migrate_set_state(s, old_state, MIG_STATE_CANCELLED);
>> -} while (s->state != MIG_STATE_CANCELLED);
>> +migrate_set_state(s, old_state, MIG_STATE_CANCELLING);
>> +} while (s->state != MIG_STATE_CANCELLING);
>>  }
>>  
>>  void add_migration_state_change_notifier(Notifier *notify)
>> @@ -412,7 +417,8 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
>>  params.blk = has_blk && blk;
>>  params.shared = has_inc && inc;
>>  
>> -if (s->state == MIG_STATE_ACTIVE || s->state == MIG_STATE_SETUP) {
>> +if (s->state == MIG_STATE_ACTIVE || s->state == MIG_STATE_SETUP ||
>> +s->state == MIG_STATE_CANCELLING) {
>>  error_set(errp, QERR_MIGRATION_ACTIVE);
>>  return;
>>  }
>>
> 
> Reviewed-by: Paolo Bonzini 

Ping.

Juan?

Paolo
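
The intermediate-state idea in the quoted patch can be sketched outside QEMU with C11 atomics: cancel only moves SETUP or ACTIVE to CANCELLING, cleanup later completes the move to CANCELLED, and a new migration is refused while any of the three "busy" states is set. All names below are hypothetical, and the sketch treats a compare-and-swap as successful when the previous value matched the expected one, which is this sketch's choice rather than something taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum mig_state {
    MIG_NONE, MIG_SETUP, MIG_CANCELLING,
    MIG_CANCELLED, MIG_ACTIVE, MIG_COMPLETED
};

struct migration {
    _Atomic int state;
};

/* Move from old_state to new_state; returns true only if the state
 * really was old_state at the time of the swap. */
static bool mig_set_state(struct migration *m, int old_state, int new_state)
{
    return atomic_compare_exchange_strong(&m->state, &old_state, new_state);
}

/* Request cancellation: only a running or starting migration can be
 * cancelled, and it only reaches CANCELLING here. */
static void mig_cancel(struct migration *m)
{
    int old;

    do {
        old = atomic_load(&m->state);
        if (old != MIG_SETUP && old != MIG_ACTIVE) {
            return;
        }
    } while (!mig_set_state(m, old, MIG_CANCELLING));
}

/* Cleanup finishes the cancellation once the old task is torn down. */
static void mig_cleanup(struct migration *m)
{
    mig_set_state(m, MIG_CANCELLING, MIG_CANCELLED);
}

/* A new migration may start only when no earlier one is still around. */
static bool mig_can_start(struct migration *m)
{
    int s = atomic_load(&m->state);

    return s != MIG_ACTIVE && s != MIG_SETUP && s != MIG_CANCELLING;
}
```

The point of the extra state is visible in mig_can_start(): between the cancel request and the cleanup, the migration is neither active nor cancelled, and without CANCELLING a second migration could slip in during that window.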

Re: [Qemu-devel] Patch Round-up for stable 1.6.2, freeze on 2013-12-06

2013-12-05 Thread Paolo Bonzini

Il 04/12/2013 15:34, Michael Roth ha scritto:
> Hi everyone,
> 
> The following new patches are queued for QEMU stable v1.6.2:
> 
> https://github.com/mdroth/qemu/commits/stable-1.6-staging
> 
> The release is planned for 2013-12-09:
> 
> http://wiki.qemu.org/Planning/1.6
> 
> Please respond here or CC qemu-sta...@nongnu.org on any patches you
> think should be included in the release. The cut-off date
> has been extended to 2013-12-06 due to the round-up email going
> out late.
> 
> Testing/feedback is greatly appreciated.
> 
> Thanks!
> 
> Alex Williamson (1):
>   vfio-pci: Fix multifunction=on
> 
> Alexey Kardashevskiy (1):
>   memory: fix 128 arithmetic in info mtree
> 
> Amit Shah (3):
>   char: move backends' io watch tag to CharDriverState
>   char: use common function to disable callbacks on chardev close
>   char: remove watch callback on chardev detach from frontend
> 
> Amos Kong (2):
>   virtio-net: fix the memory leak in rxfilter_notify()
>   rng-egd: offset the point when repeatedly read from the buffer
> 
> Bandan Das (1):
>   pci: unregister vmstate_pcibus on unplug
> 
> Cole Robinson (1):
>   Fix pc migration from qemu <= 1.5
> 
> Fam Zheng (1):
>   vmdk: Fix vmdk_parse_extents
> 
> Hans de Goede (1):
>   audio: honor QEMU_AUDIO_TIMER_PERIOD instead of waking up every *nano* 
> second
> 
> Igor Mammedov (1):
>   qdev-monitor: Fix crash when device_add is called with abstract driver
> 
> Jason Wang (1):
>   virtio-net: only delete bh that existed
> 
> Markus Armbruster (2):
>   tests: Fix schema parser test for in-tree build
>   tests: Update .gitignore for test-int128 and test-bitops
> 
> Matthew Daley (1):
>   xen_disk: mark ioreq as mapped before unmapping in error case
> 
> Max Filippov (1):
>   exec: fix breakpoint_invalidate when pc may not be translated
> 
> Max Reitz (1):
>   qcow2: count_contiguous_clusters and compression
> 
> Mike Frysinger (1):
>   configure: detect endian via compile test
> 
> Paolo Bonzini (1):
>   monitor: eliminate monitor_event_state_lock
> 
> Peter Lieven (1):
>   qcow2: fix possible corruption when reading multiple clusters
> 
> Peter Maydell (1):
>   configure: Explicitly set ARFLAGS so we can build with GNU Make 4.0
> 
> Richard Henderson (1):
>   Adjust qapi-visit for python-2.4.3
> 
> Stefan Hajnoczi (1):
>   qdev-monitor: Unref device when device_add fails
> 
> Stefan Weil (5):
>   tci: Add implementation of rotl_i64, rotr_i64
>   bitops: Add rotate functions (rol8, ror8, ...)
>   misc: Use new rotate functions
>   qemu-char: Fix potential out of bounds access to local arrays
>   linux-user: Fix stat64 syscall for SPARC64
> 
> Vlad Yasevich (1):
>   qom: Fix memory leak in object_property_set_link()
> 
> Wenchao Xia (2):
>   qapi: fix memleak by adding implict struct functions in dealloc visitor
>   tests: fix memleak in error path test for input visitor
> 
>  audio/audio.c  |3 +-
>  backends/rng-egd.c |4 +-
>  block/qcow2-cluster.c  |7 +++-
>  block/vmdk.c   |7 +++-
>  configure  |   45 +
>  exec.c |6 ++-
>  hw/block/xen_disk.c|1 +
>  hw/misc/vfio.c |7 
>  hw/net/virtio-net.c|   10 ++---
>  hw/pci-host/piix.c |9 -
>  hw/pci-host/q35.c  |   10 -
>  hw/pci/pci.c   |8 
>  include/hw/i386/pc.h   |8 
>  include/hw/pci-host/q35.h  |1 +
>  include/qemu/bitops.h  |   80 +
>  include/sysemu/char.h  |1 +
>  linux-user/syscall.c   |6 +--
>  linux-user/syscall_defs.h  |   14 +++
>  memory.c   |4 +-
>  monitor.c  |6 ---
>  qapi/qapi-dealloc-visitor.c|   20 ++
>  qdev-monitor.c |8 
>  qemu-char.c|   86 
> +++-
>  qom/object.c   |5 ++-
>  scripts/qapi-visit.py  |   17 ++--
>  target-arm/iwmmxt_helper.c |2 +-
>  tcg/optimize.c |   12 ++
>  tcg/tci/tcg-target.c   |1 -
>  tci.c  |   14 +--
>  tests/.gitignore   |3 ++
>  tests/Makefile |8 ++--
>  tests/test-qmp-input-visitor.c |1 +
>  32 files changed, 287 insertions(+), 127 deletions(-)
> 
> 
> 

This one is not yet here, but it's close:

http://permalink.gmane.org/gmane.comp.emulators.qemu/244329

It would also be nice to have the first 12 patches of
http://permalink.gmane.org/gmane.comp.emulators.qemu/244052, but perhaps
it's better to wait for 1.7.1.

Paolo

Re: [Qemu-devel] [PATCH 0/7] target-arm: Support AArch64 KVM

2013-12-05 Thread Peter Maydell

Slightly over-eager ping for code review and/or testing, since the A64
patches are going to sit on top of this and they're starting to pile up :-)
(Also noticed I forgot to cc Mian; apologies.)

thanks
-- PMM

On 28 November 2013 13:33, Peter Maydell  wrote:
> This patchset adds support for basic AArch64 KVM VM control.  It sits
> on top of the mach-virt + cpu-host patchset I sent out last week.
> The core of these patches is the work done by Mian M. Hamayun; I've
> just taken that, refactored it a bit to sit on top of the
> mach-virt+cpu-host patchset instead of defining an 'a57' cpu, and
> made some minor bugfixes as part of the code review I did in the
> process.
>
> (Mian: my apologies for not looking at your last patch series sooner.
> This actually ended up in my generating extra work for myself since
> if I'd been a bit quicker about that we could have dealt with more of
> this in code review rather than my fixing things up. I'll try to do
> better next time around.)
>
> This patch series supports:
>  * 64 bit KVM VM control
>  * SMP and UP
>  * PSCI boot of secondary CPUs
> It doesn't support:
>  * migration
>  * reset (partly because there's no way to reset a mach-virt system yet)
>  * anything except "-cpu host"
>  * debugging the VM via qemu gdbstub
>  * running 32 bit VMs on a 64 bit system
>[Mian's patchset includes support for that but I have left it out
>for the moment because it needs more thought about UI and so on]
>
> You can find this patchset plus the mach-virt/cpu-host one at
>  git://git.linaro.org/people/pmaydell/qemu-arm.git mach-virt-64
> https://git.linaro.org/gitweb?p=people/pmaydell/qemu-arm.git;a=shortlog;h=refs/heads/mach-virt-64
>
> thanks
> -- PMM
>
> Mian M. Hamayun (2):
>   target-arm: Add minimal KVM AArch64 support
>   hw/arm/boot: Add boot support for AArch64 processor
>
> Peter Maydell (5):
>   target-arm/kvm: Split 32 bit only code into its own file
>   target-arm: Clean up handling of AArch64 PSTATE
>   configure: Enable KVM for aarch64 host/target combination
>   hw/arm/boot: Allow easier swapping in of different loader code
>   default-configs: Add config for aarch64-softmmu
>
>  configure   |2 +-
>  default-configs/aarch64-softmmu.mak |9 +
>  hw/arm/boot.c   |  190 +
>  linux-user/signal.c |6 +-
>  target-arm/Makefile.objs|2 +
>  target-arm/cpu.c|6 +
>  target-arm/cpu.h|   68 -
>  target-arm/gdbstub64.c  |4 +-
>  target-arm/kvm.c|  495 +
>  target-arm/kvm32.c  |  515 
> +++
>  target-arm/kvm64.c  |  204 ++
>  target-arm/translate-a64.c  |   12 +-
>  12 files changed, 952 insertions(+), 561 deletions(-)
>  create mode 100644 default-configs/aarch64-softmmu.mak
>  create mode 100644 target-arm/kvm32.c
>  create mode 100644 target-arm/kvm64.c

Re: [Qemu-devel] [PATCH v2 2/2] target-i386: Intel MPX

2013-12-05 Thread Liu, Jinsong

Paolo Bonzini wrote:
> Il 04/12/2013 12:30, Liu, Jinsong ha scritto:
 
 Almost there.  Migration (vmstate) is still missing.
 
>> Like this:
>> 
>> ==
>> From faead85c0dbe62da896e0ed9e165d98e10216968 Mon Sep 17 00:00:00 2001
>> From: Liu Jinsong 
>> Date: Wed, 4 Dec 2013 16:56:49 +0800
>> Subject: [PATCH 2/2] target-i386: Intel MPX
>> 
>> Add some MPX-related definitions, and hardcode the sizes and offsets
>> of xsave features 3 and 4. Also add the corresponding parts to
>> kvm_get/put_xsave and vmstate.
>> 
>> Signed-off-by: Liu Jinsong 
>> ---
>>  target-i386/cpu.c |4 
>>  target-i386/cpu.h |   22 +++---
>>  target-i386/kvm.c |   10 ++
>>  target-i386/machine.c |   32 
>>  4 files changed, 65 insertions(+), 3 deletions(-)
>> 
>> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
>> index 544b57f..52ca029 100644
>> --- a/target-i386/cpu.c
>> +++ b/target-i386/cpu.c
>> @@ -336,6 +336,10 @@ typedef struct ExtSaveArea {
>>  static const ExtSaveArea ext_save_areas[] = {
>>  [2] = { .feature = FEAT_1_ECX, .bits = CPUID_EXT_AVX,
>>  .offset = 0x240, .size = 0x100 },
>> +[3] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
>> +.offset = 0x3c0, .size = 0x40  },
>> +[4] = { .feature = FEAT_7_0_EBX, .bits = CPUID_7_0_EBX_MPX,
>> +.offset = 0x400, .size = 0x10  },
>>  };
>> 
>>  const char *get_register_name_32(unsigned int reg)
>> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
>> index ea373e8..5c1dd17 100644
>> --- a/target-i386/cpu.h
>> +++ b/target-i386/cpu.h
>> @@ -380,9 +380,12 @@
>> 
>>  #define MSR_VM_HSAVE_PA 0xc0010117
>> 
>> -#define XSTATE_FP   1
>> -#define XSTATE_SSE  2
>> -#define XSTATE_YMM  4
>> +#define XSTATE_FP   (1ULL << 0)
>> +#define XSTATE_SSE  (1ULL << 1)
>> +#define XSTATE_YMM  (1ULL << 2)
>> +#define XSTATE_BNDREGS  (1ULL << 3)
>> +#define XSTATE_BNDCSR   (1ULL << 4)
>> +
>> 
>>  /* CPUID feature words */
>>  typedef enum FeatureWord {
>> @@ -545,6 +548,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
>>  #define CPUID_7_0_EBX_ERMS (1 << 9)
>>  #define CPUID_7_0_EBX_INVPCID  (1 << 10)
>>  #define CPUID_7_0_EBX_RTM  (1 << 11)
>> +#define CPUID_7_0_EBX_MPX  (1 << 14)
>>  #define CPUID_7_0_EBX_RDSEED   (1 << 18)
>>  #define CPUID_7_0_EBX_ADX  (1 << 19)
>>  #define CPUID_7_0_EBX_SMAP (1 << 20)
>> @@ -695,6 +699,16 @@ typedef union {
>>  uint64_t q;
>>  } MMXReg;
>> 
>> +typedef struct BNDReg {
>> +uint64_t lb;
>> +uint64_t ub;
>> +} BNDReg;
>> +
>> +typedef struct BNDCSReg {
>> +uint64_t cfg;
>> +uint64_t sts;
>> +} BNDCSReg;
>> +
>>  #ifdef HOST_WORDS_BIGENDIAN
>>  #define XMM_B(n) _b[15 - (n)]
>>  #define XMM_W(n) _w[7 - (n)]
>> @@ -912,6 +926,8 @@ typedef struct CPUX86State {
>> 
>>  uint64_t xstate_bv;
>>  XMMReg ymmh_regs[CPU_NB_REGS];
>> +BNDReg bnd_regs[4];
>> +BNDCSReg bndcs_regs;
>> 
>>  uint64_t xcr0;
>> 
>> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
>> index 749aa09..347d3d3 100644
>> --- a/target-i386/kvm.c
>> +++ b/target-i386/kvm.c
>> @@ -980,6 +980,8 @@ static int kvm_put_fpu(X86CPU *cpu)
>>  #define XSAVE_XMM_SPACE   40
>>  #define XSAVE_XSTATE_BV   128
>>  #define XSAVE_YMMH_SPACE  144
>> +#define XSAVE_BNDREGS 240
>> +#define XSAVE_BNDCSR  256
>> 
>>  static int kvm_put_xsave(X86CPU *cpu)
>>  {
>> @@ -1012,6 +1014,10 @@ static int kvm_put_xsave(X86CPU *cpu)
>>  *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV] = env->xstate_bv;
>>  memcpy(&xsave->region[XSAVE_YMMH_SPACE], env->ymmh_regs,
>>  sizeof env->ymmh_regs);
>> +memcpy(&xsave->region[XSAVE_BNDREGS], env->bnd_regs,
>> +sizeof env->bnd_regs);
>> +memcpy(&xsave->region[XSAVE_BNDCSR], &env->bndcs_regs,
>> +sizeof(env->bndcs_regs));
>>  r = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_XSAVE, xsave);
>>  return r;
>>  }
>> @@ -1294,6 +1300,10 @@ static int kvm_get_xsave(X86CPU *cpu)
>>  env->xstate_bv = *(uint64_t *)&xsave->region[XSAVE_XSTATE_BV];
>>  memcpy(env->ymmh_regs, &xsave->region[XSAVE_YMMH_SPACE],
>>  sizeof env->ymmh_regs);
>> +memcpy(env->bnd_regs, &xsave->region[XSAVE_BNDREGS],
>> +sizeof env->bnd_regs);
>> +memcpy(&env->bndcs_regs, &xsave->region[XSAVE_BNDCSR],
>> +sizeof(env->bndcs_regs));
>>  return 0;
>>  }
>> 
>> diff --git a/target-i386/machine.c b/target-i386/machine.c
>> index e568da2..ca8be7d 100644
>> --- a/target-i386/machine.c
>> +++ b/target-i386/machine.c
>> @@ -63,6 +63,36 @@ static const VMStateDescription vmstate_ymmh_reg = {
>>  #define VMSTATE_YMMH_REGS_VARS(_field, _state, _n, _v) \
>>  VMSTATE_STRUCT_ARRAY(_field, _state, _n, _v, vmstate_ymmh_reg, XMMReg)
>> 
>> +static const VMSta

Re: [Qemu-devel] [RFC PATCH v1 0/5] Add error_abort and associated cleanups

2013-12-05 Thread Igor Mammedov

On Thu, 05 Dec 2013 11:37:27 +0100
Paolo Bonzini  wrote:

> Il 03/12/2013 21:33, Igor Mammedov ha scritto:
> > I'm sorry for hijacking thread, but that actually an issue that started an
> > original discussion.
> > Where void returning QOM API functions are used with NULL, without any 
> > chance
> > to detect that error happened. So abusing NULL errp in this functions
> > might lead to hard to find runtime errors.
> > I think Eric's suggestion was to enforce passing non NULL errp and let 
> > caller
> > to deal with error gracefully so that above mentioned misuse was impossible.
> > Why is ignoring errors from "void foo(...)" like API considered acceptable?
> 
> See http://permalink.gmane.org/gmane.comp.emulators.qemu/243779
> 
> > * Peter's alternative
> >   + self-documenting
> >   + consistent
> >   + predictable
> 
> I'll add another small advantage which is fewer SLOC.
There is no argument against Peter's approach at all; the
question is what do we do with NULL errp in void API functions?

> 
> > * make Error* mandatory for all void functions
one more advantage:
+ no need to pepper every property setter/getter with local_error +
  error_propagate(), i.e. reduced code duplication.

> >   + consistent
> >   + almost predictable (because in C you can ignore return values)
there are no return values from void functions.

> >   - not necessarily does the right thing (e.g. cleanup functions)
we can pass &error_abort instead of NULL there if we don't care. If an error
does occur, it means something went horribly wrong, and perhaps the code
should care if an error happens there.

for special cases we could invent &ignore_error if there is a real need for
it.

> >   - requires manual effort to abide to the policy
with an assert inside the API there is no manual effort. But as Markus
noted, these errors will only be detectable at runtime :(

> 
> Better wording of the last: a missing &error_abort is easier to spot
> than a missing assert_no_error(errp).
> 
> Paolo
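
The three caller choices discussed in this thread (pass NULL to ignore, pass a local Error pointer to handle, pass an abort sentinel to treat any error as a bug) can be illustrated with a toy error type. This is a minimal sketch under stated assumptions: the type, the sentinel name err_abort, and both functions are hypothetical and are not QEMU's actual Error API.

```c
#include <stdio.h>
#include <stdlib.h>

/* Toy error object; the real QEMU Error type is richer. */
typedef struct Error {
    const char *msg;
} Error;

/* Sentinel: passing &err_abort says "an error here is a programming bug". */
static Error *err_abort;

/* Report an error according to what the caller passed as errp. */
static void error_set_sketch(Error **errp, const char *msg)
{
    if (errp == NULL) {
        return;                     /* caller chose to ignore errors */
    }
    if (errp == &err_abort) {
        fprintf(stderr, "unexpected error: %s\n", msg);
        abort();                    /* caller declared errors impossible */
    }
    Error *e = malloc(sizeof(*e));
    if (e == NULL) {
        abort();
    }
    e->msg = msg;
    *errp = e;                      /* caller will inspect and free it */
}

/* A void "property setter" that can only signal failure through errp. */
static void set_positive(int *dst, int value, Error **errp)
{
    if (value <= 0) {
        error_set_sketch(errp, "value must be positive");
        return;
    }
    *dst = value;
}
```

With a NULL errp the failed call is indistinguishable from a successful one, which is the concern raised above; the sentinel makes the "this cannot fail" assumption explicit at the call site and checks it at runtime.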

[Qemu-devel] [PATCH] rbd: switch from pipe to QEMUBH completion notification

2013-12-05 Thread Stefan Hajnoczi

rbd callbacks are called from non-QEMU threads.  Up until now a pipe was
used to signal completion back to the QEMU iothread.

The pipe writer code handles EAGAIN using select(2).  The select(2) API
is not scalable since fd_set size is static.  FD_SET() can write beyond
the end of fd_set if the file descriptor number is too high.  (QEMU's
main loop uses poll(2) to avoid this issue with select(2).)

Since the pipe itself is quite clumsy to use and QEMUBH is now
thread-safe, just schedule a BH from the rbd callback function.  This
way we can simplify I/O completion in addition to eliminating the
potential FD_SET() crash when file descriptor numbers become too high.

Crash scenario: QEMU already has 1024 file descriptors open.  Hotplug an
rbd drive and get the pipe writer to take the select(2) code path.

Signed-off-by: Stefan Hajnoczi 
---
Josh: This patch has not been tested.  I have just compiled it.
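
For readers unfamiliar with the select(2) limitation described in the commit message, here is a minimal sketch: a guard that says whether a descriptor can be stored in an fd_set at all, and a poll(2)-based wait that has no such limit. Both helper names are hypothetical and none of this code comes from block/rbd.c.

```c
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <sys/select.h>

/* True when fd can be passed to FD_SET() without writing past the end
 * of the fixed-size fd_set bitmap. */
static bool fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Wait until fd is writable using poll(2), which takes the descriptor
 * number by value and so works for any fd.
 * Returns 1 if writable, 0 on timeout, -1 on error. */
static int wait_writable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int r;

    do {
        r = poll(&pfd, 1, timeout_ms);
    } while (r < 0 && errno == EINTR);

    if (r <= 0) {
        return r;
    }
    return (pfd.revents & POLLOUT) ? 1 : -1;
}
```

The patch itself takes a different route and removes the pipe entirely, so neither select(2) nor poll(2) is needed on the completion path.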

 block/rbd.c | 130 ++--
 1 file changed, 22 insertions(+), 108 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index 4a1ea5b..441e757 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -95,18 +95,13 @@ typedef struct RADOSCB {
 #define RBD_FD_WRITE 1
 
 typedef struct BDRVRBDState {
-int fds[2];
 rados_t cluster;
 rados_ioctx_t io_ctx;
 rbd_image_t image;
 char name[RBD_MAX_IMAGE_NAME_SIZE];
 char *snap;
-int event_reader_pos;
-RADOSCB *event_rcb;
 } BDRVRBDState;
 
-static void rbd_aio_bh_cb(void *opaque);
-
 static int qemu_rbd_next_tok(char *dst, int dst_len,
  char *src, char delim,
  const char *name,
@@ -369,9 +364,8 @@ static int qemu_rbd_create(const char *filename, 
QEMUOptionParameter *options,
 }
 
 /*
- * This aio completion is being called from qemu_rbd_aio_event_reader()
- * and runs in qemu context. It schedules a bh, but just in case the aio
- * was not cancelled before.
+ * This aio completion is being called from rbd_finish_bh() and runs in qemu
+ * BH context.
  */
 static void qemu_rbd_complete_aio(RADOSCB *rcb)
 {
@@ -401,36 +395,19 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
 acb->ret = r;
 }
 }
-/* Note that acb->bh can be NULL in case where the aio was cancelled */
-acb->bh = qemu_bh_new(rbd_aio_bh_cb, acb);
-qemu_bh_schedule(acb->bh);
-g_free(rcb);
-}
 
-/*
- * aio fd read handler. It runs in the qemu context and calls the
- * completion handling of completed rados aio operations.
- */
-static void qemu_rbd_aio_event_reader(void *opaque)
-{
-BDRVRBDState *s = opaque;
+g_free(rcb);
 
-ssize_t ret;
+if (acb->cmd == RBD_AIO_READ) {
+qemu_iovec_from_buf(acb->qiov, 0, acb->bounce, acb->qiov->size);
+}
+qemu_vfree(acb->bounce);
+acb->common.cb(acb->common.opaque, (acb->ret > 0 ? 0 : acb->ret));
+acb->status = 0;
 
-do {
-char *p = (char *)&s->event_rcb;
-
-/* now read the rcb pointer that was sent from a non qemu thread */
-ret = read(s->fds[RBD_FD_READ], p + s->event_reader_pos,
-   sizeof(s->event_rcb) - s->event_reader_pos);
-if (ret > 0) {
-s->event_reader_pos += ret;
-if (s->event_reader_pos == sizeof(s->event_rcb)) {
-s->event_reader_pos = 0;
-qemu_rbd_complete_aio(s->event_rcb);
-}
-}
-} while (ret < 0 && errno == EINTR);
+if (!acb->cancelled) {
+qemu_aio_release(acb);
+}
 }
 
 /* TODO Convert to fine grained options */
@@ -538,23 +515,9 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict 
*options, int flags,
 
 bs->read_only = (s->snap != NULL);
 
-s->event_reader_pos = 0;
-r = qemu_pipe(s->fds);
-if (r < 0) {
-error_report("error opening eventfd");
-goto failed;
-}
-fcntl(s->fds[0], F_SETFL, O_NONBLOCK);
-fcntl(s->fds[1], F_SETFL, O_NONBLOCK);
-qemu_aio_set_fd_handler(s->fds[RBD_FD_READ], qemu_rbd_aio_event_reader,
-NULL, s);
-
-
 qemu_opts_del(opts);
 return 0;
 
-failed:
-rbd_close(s->image);
 failed_open:
 rados_ioctx_destroy(s->io_ctx);
 failed_shutdown:
@@ -569,10 +532,6 @@ static void qemu_rbd_close(BlockDriverState *bs)
 {
 BDRVRBDState *s = bs->opaque;
 
-close(s->fds[0]);
-close(s->fds[1]);
-qemu_aio_set_fd_handler(s->fds[RBD_FD_READ], NULL, NULL, NULL);
-
 rbd_close(s->image);
 rados_ioctx_destroy(s->io_ctx);
 g_free(s->snap);
@@ -600,34 +559,11 @@ static const AIOCBInfo rbd_aiocb_info = {
 .cancel = qemu_rbd_aio_cancel,
 };
 
-static int qemu_rbd_send_pipe(BDRVRBDState *s, RADOSCB *rcb)
+static void rbd_finish_bh(void *opaque)
 {
-int ret = 0;
-while (1) {
-fd_set wfd;
-int fd = s->fds[RBD_FD_WRITE];
-
-/* send the op pointer to the qemu thread that is responsible
-   for the aio/op completion. Must do it in a qemu thread context */

[Qemu-devel] [V4 PATCH 10/14] target-ppc: VSX Stage 4: Add xssqrtsp

2013-12-05 Thread Tom Musta

This patch adds the VSX Scalar Square Root Single Precision (xssqrtsp)
instruction.

The existing VSX_SQRT() macro is modified to support rounding of the
intermediate double-precision result to single-precision.

V2: Updated conversion to single precision range.

Signed-off-by: Tom Musta 
Reviewed-by: Richard Henderson 
---
 target-ppc/fpu_helper.c |   13 +
 target-ppc/helper.h |1 +
 target-ppc/translate.c  |2 ++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index b633547..34a8b66 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1969,7 +1969,7 @@ VSX_RE(xvresp, 4, float32, f32, 0, 0)
  *   fld   - vsr_t field (f32 or f64)
  *   sfprf - set FPRF
  */
-#define VSX_SQRT(op, nels, tp, fld, sfprf)   \
+#define VSX_SQRT(op, nels, tp, fld, sfprf, r2sp) \
 void helper_##op(CPUPPCState *env, uint32_t opcode)  \
 {\
 ppc_vsr_t xt, xb;\
@@ -1993,6 +1993,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)  
\
 }\
 }\
  \
+if (r2sp) {  \
+xt.fld[i] = helper_frsp(env, xt.fld[i]); \
+}\
+ \
 if (sfprf) { \
 helper_compute_fprf(env, xt.fld[i], sfprf);  \
 }\
@@ -2002,9 +2006,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)  
\
 helper_float_check_status(env);  \
 }
 
-VSX_SQRT(xssqrtdp, 1, float64, f64, 1)
-VSX_SQRT(xvsqrtdp, 2, float64, f64, 0)
-VSX_SQRT(xvsqrtsp, 4, float32, f32, 0)
+VSX_SQRT(xssqrtdp, 1, float64, f64, 1, 0)
+VSX_SQRT(xssqrtsp, 1, float64, f64, 1, 1)
+VSX_SQRT(xvsqrtdp, 2, float64, f64, 0, 0)
+VSX_SQRT(xvsqrtsp, 4, float32, f32, 0, 0)
 
 /* VSX_RSQRTE - VSX floating point reciprocal square root estimate
  *   op- instruction mnemonic
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index b1cf3c0..0192043 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -291,6 +291,7 @@ DEF_HELPER_2(xssubsp, void, env, i32)
 DEF_HELPER_2(xsmulsp, void, env, i32)
 DEF_HELPER_2(xsdivsp, void, env, i32)
 DEF_HELPER_2(xsresp, void, env, i32)
+DEF_HELPER_2(xssqrtsp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index c4c57a1..b9cd35b 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7346,6 +7346,7 @@ GEN_VSX_HELPER_2(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsresp, 0x14, 0x01, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xssqrtsp, 0x16, 0x00, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10159,6 +10160,7 @@ GEN_XX3FORM(xssubsp, 0x00, 0x01, PPC2_VSX207),
 GEN_XX3FORM(xsmulsp, 0x00, 0x02, PPC2_VSX207),
 GEN_XX3FORM(xsdivsp, 0x00, 0x03, PPC2_VSX207),
 GEN_XX2FORM(xsresp,  0x14, 0x01, PPC2_VSX207),
+GEN_XX2FORM(xssqrtsp,  0x16, 0x00, PPC2_VSX207),
 
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1

[Qemu-devel] [V4 PATCH 02/14] target-ppc: VSX Stage 4: Refactor lxsdx

2013-12-05 Thread Tom Musta

This patch refactors the lxsdx generator. Reusable code is isolated
into a macro.  The macro will be used in subsequent patches in this
series to implement other scalar load instructions.

Signed-off-by: Tom Musta 
Reviewed-by: Richard Henderson 
---
 target-ppc/translate.c |   31 +--
 1 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 52d7165..2541b5f 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7006,20 +7006,23 @@ static inline TCGv_i64 cpu_vsrl(int n)
 }
 }
 
-static void gen_lxsdx(DisasContext *ctx)
-{
-TCGv EA;
-if (unlikely(!ctx->vsx_enabled)) {
-gen_exception(ctx, POWERPC_EXCP_VSXU);
-return;
-}
-gen_set_access_type(ctx, ACCESS_INT);
-EA = tcg_temp_new();
-gen_addr_reg_index(ctx, EA);
-gen_qemu_ld64(ctx, cpu_vsrh(xT(ctx->opcode)), EA);
-/* NOTE: cpu_vsrl is undefined */
-tcg_temp_free(EA);
-}
+#define VSX_LOAD_SCALAR(name, operation)  \
+static void gen_##name(DisasContext *ctx) \
+{ \
+TCGv EA;  \
+if (unlikely(!ctx->vsx_enabled)) {\
+gen_exception(ctx, POWERPC_EXCP_VSXU);\
+return;   \
+} \
+gen_set_access_type(ctx, ACCESS_INT); \
+EA = tcg_temp_new();  \
+gen_addr_reg_index(ctx, EA);  \
+gen_qemu_##operation(ctx, cpu_vsrh(xT(ctx->opcode)), EA); \
+/* NOTE: cpu_vsrl is undefined */ \
+tcg_temp_free(EA);\
+}
+
+VSX_LOAD_SCALAR(lxsdx, ld64)
 
 static void gen_lxvd2x(DisasContext *ctx)
 {
-- 
1.7.1
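
The refactoring pattern used above, a macro that stamps out one function per instruction and selects the memory operation by token pasting, can be shown in isolation. This is a minimal sketch with hypothetical names; the read_* helpers stand in for the gen_qemu_* routines and operate on a plain byte buffer rather than emitting TCG code.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for the per-width load operations. */
static uint64_t read_ld64(const uint8_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

static uint64_t read_ld32u(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;               /* zero-extended to 64 bits */
}

/* One definition of the shared body; `operation` picks the helper. */
#define DEFINE_LOAD_SCALAR(name, operation)         \
static uint64_t load_##name(const uint8_t *p)       \
{                                                   \
    return read_##operation(p);                     \
}

DEFINE_LOAD_SCALAR(dword, ld64)
DEFINE_LOAD_SCALAR(word_zext, ld32u)
```

Adding another variant is then a one-line macro invocation, which is the property the later patches in the series rely on.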

[Qemu-devel] [V4 PATCH 00/14] target-ppc: VSX Stage 4

2013-12-05 Thread Tom Musta

This is the fourth and final series of patches that add emulation support
to QEMU for the PowerPC Vector Scalar Extension (VSX).

This series adds the instructions that were newly introduced with Power ISA
V2.07.  This includes 3 scalar load instructions, 2 scalar store instructions,
7 standard single precision scalar arithmetic instructions, 8 scalar single
precision fused multiply/add instructions, two integer-to-single-precision
conversion instructions and 3 vector logical instructions.

The single-precision scalar arithmetic instructions all interpret the most
significant 64 bits of a VSR as a single precision floating point number
stored in double precision format (similar to the standard PowerPC floating
point single precision instructions).  Thus a common theme in the supporting
code is rounding of an intermediate double-precision number to single 
precision.
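
The "compute in double, then round to single, keep the result in double format" theme can be sketched in plain C. This is only an illustration under stated assumptions: the series uses helper_frsp() for the rounding step, while the cast below is a simplified stand-in that ignores status flags and NaN handling, and both function names are hypothetical.

```c
/* Round a double-precision value to single precision and return it
 * in double format. The volatile store forces the narrowing to happen
 * even on hosts that keep extra precision in registers. */
static double round_to_single(double x)
{
    volatile float f = (float)x;

    return (double)f;
}

/* A "single-precision scalar add" in the style described above:
 * double-precision intermediate, then a rounding step. */
static double sp_add(double a, double b)
{
    return round_to_single(a + b);
}
```

For example, 1.0 + 1e-10 is distinguishable from 1.0 in double precision, but sp_add(1.0, 1e-10) returns exactly 1.0 because the sum is not representable in single precision.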

V2: (a) Changed the rounding to single precision to reuse the existing
helper_frsp() routine.  (b) Re-implemented the fused multiply/add instructions
to use float32_muladd instead of float64_muladd, which avoids subtle rounding
errors.

V3: Re-implemented fused multiply/add per clarification from Richard Henderson.

V4: Changed fused multiply/add to use helper_frsp (inadvertently re-injected
when I used an earlier patch).  

Tom Musta (14):
  target-ppc: VSX Stage 4: Add VSX 2.07 Flag
  target-ppc: VSX Stage 4: Refactor lxsdx
  target-ppc: VSX Stage 4: Add lxsiwax, lxsiwzx and lxsspx
  target-ppc: VSX Stage 4: Refactor stxsdx
  target-ppc: VSX Stage 4: Add stxsiwx and stxsspx
  target-ppc: VSX Stage 4: Add xsaddsp and xssubsp
  target-ppc: VSX Stage 4: Add xsmulsp
  target-ppc: VSX Stage 4: Add xsdivsp
  target-ppc: VSX Stage 4: Add xsresp
  target-ppc: VSX Stage 4: Add xssqrtsp
  target-ppc: VSX Stage 4: add xsrsqrtesp
  target-ppc: VSX Stage 4: Add Scalar SP Fused Multiply-Adds
  target-ppc: VSX Stage 4: Add xscvsxdsp and xscvuxdsp
  target-ppc: VSX Stage 4: Add xxleqv, xxlnand and xxlorc

 target-ppc/cpu.h|4 +-
 target-ppc/fpu_helper.c |  195 ---
 target-ppc/helper.h |   18 
 target-ppc/translate.c  |  110 ++--
 target-ppc/translate_init.c |2 +-
 5 files changed, 234 insertions(+), 95 deletions(-)

[Qemu-devel] [V4 PATCH 01/14] target-ppc: VSX Stage 4: Add VSX 2.07 Flag

2013-12-05 Thread Tom Musta

This patch adds a flag to identify those VSX instructions that are
new to Power ISA V2.07.  The flag is added to the Power 8 processor
initialization so that the P8 models understand how to decode and
emulate instructions in this category.

Signed-off-by: Tom Musta 
Reviewed-by: Richard Henderson 
---
 target-ppc/cpu.h|4 +++-
 target-ppc/translate_init.c |2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index bb84767..0abc848 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1875,9 +1875,11 @@ enum {
 PPC2_DBRX  = 0x0010ULL,
 /* Book I 2.05 PowerPC specification */
 PPC2_ISA205= 0x0020ULL,
+/* VSX additions in ISA 2.07 */
+PPC2_VSX207= 0x0040ULL,
 
 #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
-  PPC2_ISA205)
+PPC2_ISA205 | PPC2_VSX207)
 };
 
 /*/
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 13457ec..e14ab63 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7270,7 +7270,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
PPC_64B | PPC_ALTIVEC |
PPC_SEGMENT_64B | PPC_SLBI |
PPC_POPCNTB | PPC_POPCNTWD;
-pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX;
+pcc->insns_flags2 = PPC2_VSX | PPC2_VSX207 | PPC2_DFP | PPC2_DBRX;
 pcc->msr_mask = 0x8284FF36ULL;
 pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
-- 
1.7.1

[Qemu-devel] [V4 PATCH 04/14] target-ppc: VSX Stage 4: Refactor stxsdx

2013-12-05 Thread Tom Musta

This patch refactors the stxsdx instruction.  Reusable code is
extracted into a macro which will be used in subsequent patches
in this series.
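
For readers unfamiliar with the pattern, here is a minimal sketch of the token-pasting technique the refactoring relies on. All demo_* names are hypothetical, and plain byte buffers stand in for the TCG code generation that the real macro performs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical store primitives, standing in for gen_qemu_st64 and friends. */
static void demo_st64(uint8_t *dst, uint64_t v)
{
    memcpy(dst, &v, sizeof(v));
}

static void demo_st32(uint8_t *dst, uint64_t v)
{
    uint32_t w = (uint32_t)v;          /* keep only the low word */
    memcpy(dst, &w, sizeof(w));
}

/* One macro body, one generated function per (name, operation) pair. */
#define DEMO_STORE_SCALAR(name, operation)               \
static void demo_gen_##name(uint8_t *dst, uint64_t v)    \
{                                                        \
    assert(dst != NULL);                                 \
    demo_##operation(dst, v);                            \
}

DEMO_STORE_SCALAR(stxsdx, st64)
DEMO_STORE_SCALAR(stxsiwx, st32)
```

Adding another store variant then costs one line, which is the property the later patches in the series depend on.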

Signed-off-by: Tom Musta 
Reviewed-by: Richard Henderson 
---
 target-ppc/translate.c |   27 +++
 1 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index ad40d27..52e487d 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7086,20 +7086,23 @@ static void gen_lxvw4x(DisasContext *ctx)
 tcg_temp_free(tmp);
 }
 
-static void gen_stxsdx(DisasContext *ctx)
-{
-TCGv EA;
-if (unlikely(!ctx->vsx_enabled)) {
-gen_exception(ctx, POWERPC_EXCP_VSXU);
-return;
-}
-gen_set_access_type(ctx, ACCESS_INT);
-EA = tcg_temp_new();
-gen_addr_reg_index(ctx, EA);
-gen_qemu_st64(ctx, cpu_vsrh(xS(ctx->opcode)), EA);
-tcg_temp_free(EA);
+#define VSX_STORE_SCALAR(name, operation) \
+static void gen_##name(DisasContext *ctx) \
+{ \
+TCGv EA;  \
+if (unlikely(!ctx->vsx_enabled)) {\
+gen_exception(ctx, POWERPC_EXCP_VSXU);\
+return;   \
+} \
+gen_set_access_type(ctx, ACCESS_INT); \
+EA = tcg_temp_new();  \
+gen_addr_reg_index(ctx, EA);  \
+gen_qemu_##operation(ctx, cpu_vsrh(xS(ctx->opcode)), EA); \
+tcg_temp_free(EA);\
 }
 
+VSX_STORE_SCALAR(stxsdx, st64)
+
 static void gen_stxvd2x(DisasContext *ctx)
 {
 TCGv EA;
-- 
1.7.1

[Qemu-devel] [V4 PATCH 08/14] target-ppc: VSX Stage 4: Add xsdivsp

2013-12-05 Thread Tom Musta

This patch adds the VSX Scalar Divide Single Precision (xsdivsp)
instruction.

The existing VSX_DIV macro is modified to support rounding of the
intermediate double precision result to single precision.
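
A reduced sketch of the idea follows, assuming hypothetical demo_* names and using a C cast in place of helper_frsp(). Because the extra macro argument is a compile-time constant, the compiler can drop the rounding step entirely from the double-precision variant.

```c
#include <assert.h>

/* r2sp: 1 = round the double result to single precision, 0 = leave it. */
#define DEMO_DIV(name, r2sp)                                    \
static double demo_##name(double a, double b)                   \
{                                                               \
    /* sketch only: exceptional cases are out of scope */       \
    assert(b != 0.0);                                           \
    double t = a / b;                                           \
    if (r2sp) {                                                 \
        volatile float narrowed = (float)t;                     \
        t = (double)narrowed;                                   \
    }                                                           \
    return t;                                                   \
}

DEMO_DIV(xsdivdp, 0)
DEMO_DIV(xsdivsp, 1)
```

Calling demo_xsdivsp(1.0, 3.0) yields the single-precision approximation of one third held in a double, while demo_xsdivdp(1.0, 3.0) keeps the full double-precision quotient.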

V2: Updated conversion to single precision.

Signed-off-by: Tom Musta 
Reviewed-by: Richard Henderson 
---
 target-ppc/fpu_helper.c |   13 +
 target-ppc/helper.h |1 +
 target-ppc/translate.c  |2 ++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index b3a5261..da31964 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1874,7 +1874,7 @@ VSX_MUL(xvmulsp, 4, float32, f32, 0, 0)
  *   fld   - vsr_t field (f32 or f64)
  *   sfprf - set FPRF
  */
-#define VSX_DIV(op, nels, tp, fld, sfprf) \
+#define VSX_DIV(op, nels, tp, fld, sfprf, r2sp)   \
 void helper_##op(CPUPPCState *env, uint32_t opcode)   \
 { \
 ppc_vsr_t xt, xa, xb; \
@@ -1903,6 +1903,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
 } \
 } \
   \
+if (r2sp) {   \
+xt.fld[i] = helper_frsp(env, xt.fld[i]);  \
+} \
+  \
 if (sfprf) {  \
 helper_compute_fprf(env, xt.fld[i], sfprf);   \
 } \
@@ -1912,9 +1916,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
 helper_float_check_status(env);   \
 }
 
-VSX_DIV(xsdivdp, 1, float64, f64, 1)
-VSX_DIV(xvdivdp, 2, float64, f64, 0)
-VSX_DIV(xvdivsp, 4, float32, f32, 0)
+VSX_DIV(xsdivdp, 1, float64, f64, 1, 0)
+VSX_DIV(xsdivsp, 1, float64, f64, 1, 1)
+VSX_DIV(xvdivdp, 2, float64, f64, 0, 0)
+VSX_DIV(xvdivsp, 4, float32, f32, 0, 0)
 
 /* VSX_RE  - VSX floating point reciprocal estimate
  *   op- instruction mnemonic
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 0ccdc96..308f97c 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -289,6 +289,7 @@ DEF_HELPER_2(xsrdpiz, void, env, i32)
 DEF_HELPER_2(xsaddsp, void, env, i32)
 DEF_HELPER_2(xssubsp, void, env, i32)
 DEF_HELPER_2(xsmulsp, void, env, i32)
+DEF_HELPER_2(xsdivsp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 450ab88..896dbc2 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7344,6 +7344,7 @@ GEN_VSX_HELPER_2(xsrdpiz, 0x12, 0x05, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10155,6 +10156,7 @@ GEN_XX2FORM(xsrdpiz, 0x12, 0x05, PPC2_VSX),
 GEN_XX3FORM(xsaddsp, 0x00, 0x00, PPC2_VSX207),
 GEN_XX3FORM(xssubsp, 0x00, 0x01, PPC2_VSX207),
 GEN_XX3FORM(xsmulsp, 0x00, 0x02, PPC2_VSX207),
+GEN_XX3FORM(xsdivsp, 0x00, 0x03, PPC2_VSX207),
 
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1

[Qemu-devel] [V4 PATCH 05/14] target-ppc: VSX Stage 4: Add stxsiwx and stxsspx

2013-12-05 Thread Tom Musta

This patch adds two store scalar instructions:

  - Store VSX Scalar as Integer Word Indexed (stxsiwx)
  - Store VSX Scalar Single-Precision Indexed (stxsspx)

Signed-off-by: Tom Musta 
Reviewed-by: Richard Henderson 
---
 target-ppc/translate.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 52e487d..62604fd 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7102,6 +7102,8 @@ static void gen_##name(DisasContext *ctx) \
 }
 
 VSX_STORE_SCALAR(stxsdx, st64)
+VSX_STORE_SCALAR(stxsiwx, st32)
+VSX_STORE_SCALAR(stxsspx, st32fs)
 
 static void gen_stxvd2x(DisasContext *ctx)
 {
@@ -10050,6 +10052,8 @@ GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvw4x, 0x1F, 0x0C, 0x18, 0, PPC_NONE, PPC2_VSX),
 
 GEN_HANDLER_E(stxsdx, 0x1F, 0xC, 0x16, 0, PPC_NONE, PPC2_VSX),
+GEN_HANDLER_E(stxsiwx, 0x1F, 0xC, 0x04, 0, PPC_NONE, PPC2_VSX207),
+GEN_HANDLER_E(stxsspx, 0x1F, 0xC, 0x14, 0, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(stxvd2x, 0x1F, 0xC, 0x1E, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(stxvw4x, 0x1F, 0xC, 0x1C, 0, PPC_NONE, PPC2_VSX),
 
-- 
1.7.1

[Qemu-devel] [V4 PATCH 12/14] target-ppc: VSX Stage 4: Add Scalar SP Fused Multiply-Adds

2013-12-05 Thread Tom Musta

This patch adds the Single Precision VSX Scalar Fused Multiply-Add
instructions: xsmaddasp, xsmaddmsp, xsmsubasp, xsmsubmsp, xsnmaddasp,
xsnmaddmsp, xsnmsubasp, xsnmsubmsp.

The existing VSX_MADD() macro is modified to support rounding of the
intermediate double precision result to single precision.

V2: Re-implemented per feedback from Richard Henderson.  In order to
avoid double rounding and incorrect results, the operands must be
converted to true single precision values and use the single precision
fused multiply/add routine.

V3: Re-implemented per feedback from Richard Henderson (I did not
fully understand his comment when I implemented V2).

V4: Changed to use helper_frsp (inadvertently re-injected when I used an
earlier patch).  Thanks to Richard Henderson for catching this.
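
To make the double rounding problem concrete, here is a small self-contained sketch. It is not the patch's implementation: the patch rounds toward zero and ORs in softfloat's inexact flag, whereas this sketch gets the same round-to-odd effect from a Fast2Sum error term so that it needs no floating-point environment access. All demo_* names are hypothetical, and the helper only covers the case hi > 0, hi >= |lo| under the default round-to-nearest mode.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint64_t demo_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof(u));
    return u;
}

static double demo_from_bits(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof(d));
    return d;
}

/* hi + lo rounded to odd (sketch: requires hi > 0 and hi >= |lo|). */
static double demo_add_round_to_odd(double hi, double lo)
{
    assert(hi > 0.0 && hi >= (lo < 0.0 ? -lo : lo));
    volatile double s = hi + lo;        /* round-to-nearest sum */
    volatile double d = s - hi;         /* exact under the precondition */
    double err = lo - d;                /* Fast2Sum: s + err == hi + lo */
    uint64_t u = demo_bits(s);
    if (err != 0.0) {
        if (err < 0.0) {
            u -= 1;                     /* s was rounded up: truncate it */
        }
        u |= 1;                         /* sticky bit: force an odd result */
    }
    return demo_from_bits(u);
}

/* Naive scheme: double-precision a*b+c, then round to single. */
static float demo_fma_naive(float a, float b, float c)
{
    double p = (double)a * (double)b;   /* exact: 24x24 bits fit in 53 */
    volatile double s = p + (double)c;
    return (float)s;
}

/* Same computation with the intermediate result rounded to odd. */
static float demo_fma_round_to_odd(float a, float b, float c)
{
    double p = (double)a * (double)b;
    return (float)demo_add_round_to_odd(p, (double)c);
}
```

With a = b = 1 + 2^-12 and c = 2^-60 the exact result lies just above the midpoint between two adjacent single-precision values. The naive scheme first loses the 2^-60 term when rounding to double and then resolves the resulting tie downward, giving 1 + 2^-11; the round-to-odd variant keeps a sticky bit and rounds up to 1 + 2^-11 + 2^-23, which is what a true single-precision fused multiply-add returns.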

Signed-off-by: Tom Musta 
---
 target-ppc/fpu_helper.c |   82 ++
 target-ppc/helper.h |8 
 target-ppc/translate.c  |   16 +
 3 files changed, 77 insertions(+), 29 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 8825db2..7926c71 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2192,7 +2192,7 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, f32, -126, 23)
  *   afrm  - A form (1=A, 0=M)
  *   sfprf - set FPRF
  */
-#define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf)\
+#define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf, r2sp)  \
 void helper_##op(CPUPPCState *env, uint32_t opcode)   \
 { \
 ppc_vsr_t xt_in, xa, xb, xt_out;  \
@@ -2218,8 +2218,18 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
 for (i = 0; i < nels; i++) {  \
 float_status tstat = env->fp_status;  \
 set_float_exception_flags(0, &tstat); \
-xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],  \
- maddflgs, &tstat);   \
+if (r2sp && (tstat.float_rounding_mode == float_round_nearest_even)) {\
+/* Avoid double rounding errors by rounding the intermediate */   \
+/* result to odd.*/   \
+set_float_rounding_mode(float_round_to_zero, &tstat); \
+xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],  \
+   maddflgs, &tstat); \
+xt_out.fld[i] |= (get_float_exception_flags(&tstat) & \
+  float_flag_inexact) != 0;   \
+} else {  \
+xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],  \
+maddflgs, &tstat);\
+} \
 env->fp_status.float_exception_flags |= tstat.float_exception_flags;  \
   \
 if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \
@@ -2242,6 +2252,11 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
 fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, sfprf); \
 } \
 } \
+  \
+if (r2sp) {   \
+xt_out.fld[i] = helper_frsp(env, xt_out.fld[i]);  \
+} \
+  \
 if (sfprf) {  \
 helper_compute_fprf(env, xt_out.fld[i], sfprf);   \
 } \
@@ -2255,32 +2270,41 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
 #define NMADD_FLGS float_muladd_negate_result
 #define NMSUB_FLGS (float_muladd_negate_c | float_muladd_negate_result)
 
-VSX_MADD(xsmaddadp, 1, float64, f64, MADD_FLGS, 1, 1)
-VSX_MADD(xsmaddmdp, 1, float64, f64, MADD_FLGS, 0, 1)
-VSX_MADD(xsmsubadp, 1, float64, f64, MSUB_FLGS, 1, 1)
-VSX_MADD(xsmsubmdp, 1, float64, f64, MSUB_FLGS, 0, 1)
-VSX_MADD(xsnmaddadp, 1, float64, f64, NMADD_FLGS, 1, 1)
-VSX_MADD(xsnmaddmdp, 1, float64, f64, NMADD_FLGS, 0,

Re: [Qemu-devel] [PATCH] target-i386: clear guest TSC on reset

2013-12-05 Thread Fernando Luis Vazquez Cao

(2013/12/05 22:53), Paolo Bonzini wrote:
> Il 05/12/2013 14:15, Fernando Luis Vazquez Cao ha scritto:
>>  /*
>>   * KVM is yet unable to synchronize TSC values of multiple VCPUs on
>>   * writeback. Until this is fixed, we only write the offset to SMP
>>   * guests after migration, desynchronizing the VCPUs, but avoiding
>>   * huge jump-backs that would occur without any writeback at all.
>>   */
>> -if (smp_cpus == 1 || env->tsc != 0) {
>> +if (smp_cpus == 1 || env->tsc != 0 || level == KVM_PUT_RESET_STATE) {
>>  kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc);
>>  }
> This is still a bit ugly, and desynchronizes the VCPUs on reset.

I agree it is a bit ugly, but in my testing QEMU seemed to loop over all
the VCPUs fast enough for the kernel-side kvm_write_tsc() to do a
reasonable job of matching the offsets (the Linux guest did not mark
the TSC unstable due to the TSCs being unsynchronized). Am I missing
something?

> The main point of my outlined solution is that you only have one value
> that is tracked, not one per VCPU (which in the case of migration adds
> unpredictable latencies---for example due to emptying the migration
> buffers).  We already save that value; all that's left is to use it
> instead of env->tsc.

I understand the benefits of what you are proposing but, since it is
wider in scope and would be more difficult to backport, I would
prefer to implement it as a follow-up patch, unless you think that
the current patch as a standalone fix does more harm than good.

- Fernando

[Qemu-devel] [V4 PATCH 09/14] target-ppc: VSX Stage 4: Add xsresp

2013-12-05 Thread Tom Musta

This patch adds the VSX Scalar Reciprocal Estimate Single Precision
(xsresp) instruction.

The existing VSX_RE macro is modified to support rounding of the
intermediate double precision result to single precision.

V2: Updated conversion to single precision range.

Signed-off-by: Tom Musta 
Reviewed-by: Richard Henderson 
---
 target-ppc/fpu_helper.c |   14 ++
 target-ppc/helper.h |1 +
 target-ppc/translate.c  |2 ++
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index da31964..b633547 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1928,7 +1928,7 @@ VSX_DIV(xvdivsp, 4, float32, f32, 0, 0)
  *   fld   - vsr_t field (f32 or f64)
  *   sfprf - set FPRF
  */
-#define VSX_RE(op, nels, tp, fld, sfprf)  \
+#define VSX_RE(op, nels, tp, fld, sfprf, r2sp)\
 void helper_##op(CPUPPCState *env, uint32_t opcode)   \
 { \
 ppc_vsr_t xt, xb; \
@@ -1943,6 +1943,11 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
 fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, sfprf);\
 } \
 xt.fld[i] = tp##_div(tp##_one, xb.fld[i], &env->fp_status);   \
+  \
+if (r2sp) {   \
+xt.fld[i] = helper_frsp(env, xt.fld[i]);  \
+} \
+  \
 if (sfprf) {  \
 helper_compute_fprf(env, xt.fld[0], sfprf);   \
 } \
@@ -1952,9 +1957,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
 helper_float_check_status(env);   \
 }
 
-VSX_RE(xsredp, 1, float64, f64, 1)
-VSX_RE(xvredp, 2, float64, f64, 0)
-VSX_RE(xvresp, 4, float32, f32, 0)
+VSX_RE(xsredp, 1, float64, f64, 1, 0)
+VSX_RE(xsresp, 1, float64, f64, 1, 1)
+VSX_RE(xvredp, 2, float64, f64, 0, 0)
+VSX_RE(xvresp, 4, float32, f32, 0, 0)
 
 /* VSX_SQRT - VSX floating point square root
  *   op- instruction mnemonic
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 308f97c..b1cf3c0 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -290,6 +290,7 @@ DEF_HELPER_2(xsaddsp, void, env, i32)
 DEF_HELPER_2(xssubsp, void, env, i32)
 DEF_HELPER_2(xsmulsp, void, env, i32)
 DEF_HELPER_2(xsdivsp, void, env, i32)
+DEF_HELPER_2(xsresp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 896dbc2..c4c57a1 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7345,6 +7345,7 @@ GEN_VSX_HELPER_2(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsresp, 0x14, 0x01, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10157,6 +10158,7 @@ GEN_XX3FORM(xsaddsp, 0x00, 0x00, PPC2_VSX207),
 GEN_XX3FORM(xssubsp, 0x00, 0x01, PPC2_VSX207),
 GEN_XX3FORM(xsmulsp, 0x00, 0x02, PPC2_VSX207),
 GEN_XX3FORM(xsdivsp, 0x00, 0x03, PPC2_VSX207),
+GEN_XX2FORM(xsresp,  0x14, 0x01, PPC2_VSX207),
 
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1

[Qemu-devel] [V4 PATCH 03/14] target-ppc: VSX Stage 4: Add lxsiwax, lxsiwzx and lxsspx

2013-12-05 Thread Tom Musta

This patch adds the scalar load instructions introduced in ISA
V2.07:

  - Load VSX Scalar as Integer Word Algebraic Indexed (lxsiwax)
  - Load VSX Scalar as Integer Word and Zero Indexed (lxsiwzx)
  - Load VSX Scalar Single-Precision Indexed (lxsspx)
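
The three loads differ only in how the 32-bit word from memory is widened into the 64-bit doubleword of the target register. A minimal sketch, assuming hypothetical demo_* helpers that take the already-loaded word and return the resulting 64-bit pattern (the real code emits TCG loads instead, and byte order is not modelled):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* "Algebraic": sign-extend the word (assumes two's complement). */
static uint64_t demo_ld32s(uint32_t w)
{
    return (uint64_t)(int64_t)(int32_t)w;
}

/* "and Zero": zero-extend the word. */
static uint64_t demo_ld32u(uint32_t w)
{
    return (uint64_t)w;
}

/* Single precision: reinterpret as float, widen to double format. */
static uint64_t demo_ld32fs(uint32_t w)
{
    float f;
    double d;
    uint64_t u;
    assert(sizeof(f) == sizeof(w) && sizeof(d) == sizeof(u));
    memcpy(&f, &w, sizeof(f));
    d = (double)f;
    memcpy(&u, &d, sizeof(u));
    return u;
}
```

For the word 0x80000000 the algebraic form yields 0xFFFFFFFF80000000 and the zero form yields 0x0000000080000000; the single-precision form turns 0x3F800000 (1.0f) into 0x3FF0000000000000 (1.0 as a double).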

Signed-off-by: Tom Musta 
Reviewed-by: Richard Henderson 
---
 target-ppc/translate.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 2541b5f..ad40d27 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7023,6 +7023,9 @@ static void gen_##name(DisasContext *ctx) \
 }
 
 VSX_LOAD_SCALAR(lxsdx, ld64)
+VSX_LOAD_SCALAR(lxsiwax, ld32s)
+VSX_LOAD_SCALAR(lxsiwzx, ld32u)
+VSX_LOAD_SCALAR(lxsspx, ld32fs)
 
 static void gen_lxvd2x(DisasContext *ctx)
 {
@@ -10036,6 +10039,9 @@ GEN_VAFORM_PAIRED(vsel, vperm, 21),
 GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23),
 
 GEN_HANDLER_E(lxsdx, 0x1F, 0x0C, 0x12, 0, PPC_NONE, PPC2_VSX),
+GEN_HANDLER_E(lxsiwax, 0x1F, 0x0C, 0x02, 0, PPC_NONE, PPC2_VSX207),
+GEN_HANDLER_E(lxsiwzx, 0x1F, 0x0C, 0x00, 0, PPC_NONE, PPC2_VSX207),
+GEN_HANDLER_E(lxsspx, 0x1F, 0x0C, 0x10, 0, PPC_NONE, PPC2_VSX207),
 GEN_HANDLER_E(lxvd2x, 0x1F, 0x0C, 0x1A, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX),
 GEN_HANDLER_E(lxvw4x, 0x1F, 0x0C, 0x18, 0, PPC_NONE, PPC2_VSX),
-- 
1.7.1

[Qemu-devel] [V4 PATCH 06/14] target-ppc: VSX Stage 4: Add xsaddsp and xssubsp

2013-12-05 Thread Tom Musta

This patch adds the VSX Scalar Add Single-Precision (xsaddsp) and
VSX Scalar Subtract Single-Precision (xssubsp) instructions.

The existing VSX_ADD_SUB macro is modified to support the rounding
of the (intermediate) result to single-precision.

V2: updated conversion of result to single precision.

Signed-off-by: Tom Musta 
Reviewed-by: Richard Henderson 
---
 target-ppc/fpu_helper.c |   20 +---
 target-ppc/helper.h |3 +++
 target-ppc/translate.c  |6 ++
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index f3d02cc..1256ad0 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1768,7 +1768,7 @@ static void putVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
  *   fld   - vsr_t field (f32 or f64)
  *   sfprf - set FPRF
  */
-#define VSX_ADD_SUB(name, op, nels, tp, fld, sfprf)  \
+#define VSX_ADD_SUB(name, op, nels, tp, fld, sfprf, r2sp)\
 void helper_##name(CPUPPCState *env, uint32_t opcode)\
 {\
 ppc_vsr_t xt, xa, xb;\
@@ -1794,6 +1794,10 @@ void helper_##name(CPUPPCState *env, uint32_t opcode) \
 }\
 }\
  \
+if (r2sp) {  \
+xt.fld[i] = helper_frsp(env, xt.fld[i]); \
+}\
+ \
 if (sfprf) { \
 helper_compute_fprf(env, xt.fld[i], sfprf);  \
 }\
@@ -1802,12 +1806,14 @@ void helper_##name(CPUPPCState *env, uint32_t opcode) \
 helper_float_check_status(env);  \
 }
 
-VSX_ADD_SUB(xsadddp, add, 1, float64, f64, 1)
-VSX_ADD_SUB(xvadddp, add, 2, float64, f64, 0)
-VSX_ADD_SUB(xvaddsp, add, 4, float32, f32, 0)
-VSX_ADD_SUB(xssubdp, sub, 1, float64, f64, 1)
-VSX_ADD_SUB(xvsubdp, sub, 2, float64, f64, 0)
-VSX_ADD_SUB(xvsubsp, sub, 4, float32, f32, 0)
+VSX_ADD_SUB(xsadddp, add, 1, float64, f64, 1, 0)
+VSX_ADD_SUB(xsaddsp, add, 1, float64, f64, 1, 1)
+VSX_ADD_SUB(xvadddp, add, 2, float64, f64, 0, 0)
+VSX_ADD_SUB(xvaddsp, add, 4, float32, f32, 0, 0)
+VSX_ADD_SUB(xssubdp, sub, 1, float64, f64, 1, 0)
+VSX_ADD_SUB(xssubsp, sub, 1, float64, f64, 1, 1)
+VSX_ADD_SUB(xvsubdp, sub, 2, float64, f64, 0, 0)
+VSX_ADD_SUB(xvsubsp, sub, 4, float32, f32, 0, 0)
 
 /* VSX_MUL - VSX floating point multiply
  *   op- instruction mnemonic
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 0276b02..696b9d3 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -286,6 +286,9 @@ DEF_HELPER_2(xsrdpim, void, env, i32)
 DEF_HELPER_2(xsrdpip, void, env, i32)
 DEF_HELPER_2(xsrdpiz, void, env, i32)
 
+DEF_HELPER_2(xsaddsp, void, env, i32)
+DEF_HELPER_2(xssubsp, void, env, i32)
+
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
 DEF_HELPER_2(xvmuldp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 62604fd..bd639cc 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7341,6 +7341,9 @@ GEN_VSX_HELPER_2(xsrdpim, 0x12, 0x07, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsrdpip, 0x12, 0x06, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xsrdpiz, 0x12, 0x05, 0, PPC2_VSX)
 
+GEN_VSX_HELPER_2(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
+
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvmuldp, 0x00, 0x0E, 0, PPC2_VSX)
@@ -10148,6 +10151,9 @@ GEN_XX2FORM(xsrdpim, 0x12, 0x07, PPC2_VSX),
 GEN_XX2FORM(xsrdpip, 0x12, 0x06, PPC2_VSX),
 GEN_XX2FORM(xsrdpiz, 0x12, 0x05, PPC2_VSX),
 
+GEN_XX3FORM(xsaddsp, 0x00, 0x00, PPC2_VSX207),
+GEN_XX3FORM(xssubsp, 0x00, 0x01, PPC2_VSX207),
+
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
 GEN_XX3FORM(xvmuldp, 0x00, 0x0E, PPC2_VSX),
-- 
1.7.1

[Qemu-devel] [V4 PATCH 07/14] target-ppc: VSX Stage 4: Add xsmulsp

2013-12-05 Thread Tom Musta

This patch adds the VSX Scalar Multiply Single-Precision (xsmulsp)
instruction.

The existing VSX_MUL macro is modified to support rounding of the
intermediate result to single precision.

V2: Updated conversion to single precision.

Signed-off-by: Tom Musta 
Reviewed-by: Richard Henderson 
---
 target-ppc/fpu_helper.c |   13 +
 target-ppc/helper.h |1 +
 target-ppc/translate.c  |2 ++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 1256ad0..b3a5261 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -1822,7 +1822,7 @@ VSX_ADD_SUB(xvsubsp, sub, 4, float32, f32, 0, 0)
  *   fld   - vsr_t field (f32 or f64)
  *   sfprf - set FPRF
  */
-#define VSX_MUL(op, nels, tp, fld, sfprf)\
+#define VSX_MUL(op, nels, tp, fld, sfprf, r2sp)  \
 void helper_##op(CPUPPCState *env, uint32_t opcode)  \
 {\
 ppc_vsr_t xt, xa, xb;\
@@ -1849,6 +1849,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
 }\
 }\
  \
+if (r2sp) {  \
+xt.fld[i] = helper_frsp(env, xt.fld[i]); \
+}\
+ \
 if (sfprf) { \
 helper_compute_fprf(env, xt.fld[i], sfprf);  \
 }\
@@ -1858,9 +1862,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
 helper_float_check_status(env);  \
 }
 
-VSX_MUL(xsmuldp, 1, float64, f64, 1)
-VSX_MUL(xvmuldp, 2, float64, f64, 0)
-VSX_MUL(xvmulsp, 4, float32, f32, 0)
+VSX_MUL(xsmuldp, 1, float64, f64, 1, 0)
+VSX_MUL(xsmulsp, 1, float64, f64, 1, 1)
+VSX_MUL(xvmuldp, 2, float64, f64, 0, 0)
+VSX_MUL(xvmulsp, 4, float32, f32, 0, 0)
 
 /* VSX_DIV - VSX floating point divide
  *   op- instruction mnemonic
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 696b9d3..0ccdc96 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -288,6 +288,7 @@ DEF_HELPER_2(xsrdpiz, void, env, i32)
 
 DEF_HELPER_2(xsaddsp, void, env, i32)
 DEF_HELPER_2(xssubsp, void, env, i32)
+DEF_HELPER_2(xsmulsp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index bd639cc..450ab88 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7343,6 +7343,7 @@ GEN_VSX_HELPER_2(xsrdpiz, 0x12, 0x05, 0, PPC2_VSX)
 
 GEN_VSX_HELPER_2(xsaddsp, 0x00, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xssubsp, 0x00, 0x01, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10153,6 +10154,7 @@ GEN_XX2FORM(xsrdpiz, 0x12, 0x05, PPC2_VSX),
 
 GEN_XX3FORM(xsaddsp, 0x00, 0x00, PPC2_VSX207),
 GEN_XX3FORM(xssubsp, 0x00, 0x01, PPC2_VSX207),
+GEN_XX3FORM(xsmulsp, 0x00, 0x02, PPC2_VSX207),
 
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1

[Qemu-devel] [V4 PATCH 11/14] target-ppc: VSX Stage 4: add xsrsqrtesp

2013-12-05 Thread Tom Musta

This patch adds the VSX Scalar Reciprocal Square Root Estimate
Single Precision (xsrsqrtesp) instruction.

The existing VSX_RSQRTE() macro is modified to support rounding
of the intermediate double-precision result to single precision.

V2: Updated conversion to single precision range.

Signed-off-by: Tom Musta 
Reviewed-by: Richard Henderson 
---
 target-ppc/fpu_helper.c |   13 +
 target-ppc/helper.h |1 +
 target-ppc/translate.c  |2 ++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index 34a8b66..8825db2 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -2018,7 +2018,7 @@ VSX_SQRT(xvsqrtsp, 4, float32, f32, 0, 0)
  *   fld   - vsr_t field (f32 or f64)
  *   sfprf - set FPRF
  */
-#define VSX_RSQRTE(op, nels, tp, fld, sfprf) \
+#define VSX_RSQRTE(op, nels, tp, fld, sfprf, r2sp)   \
 void helper_##op(CPUPPCState *env, uint32_t opcode)  \
 {\
 ppc_vsr_t xt, xb;\
@@ -2043,6 +2043,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
 }\
 }\
  \
+if (r2sp) {  \
+xt.fld[i] = helper_frsp(env, xt.fld[i]); \
+}\
+ \
 if (sfprf) { \
 helper_compute_fprf(env, xt.fld[i], sfprf);  \
 }\
@@ -2052,9 +2056,10 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
 helper_float_check_status(env);  \
 }
 
-VSX_RSQRTE(xsrsqrtedp, 1, float64, f64, 1)
-VSX_RSQRTE(xvrsqrtedp, 2, float64, f64, 0)
-VSX_RSQRTE(xvrsqrtesp, 4, float32, f32, 0)
+VSX_RSQRTE(xsrsqrtedp, 1, float64, f64, 1, 0)
+VSX_RSQRTE(xsrsqrtesp, 1, float64, f64, 1, 1)
+VSX_RSQRTE(xvrsqrtedp, 2, float64, f64, 0, 0)
+VSX_RSQRTE(xvrsqrtesp, 4, float32, f32, 0, 0)
 
 static inline int ppc_float32_get_unbiased_exp(float32 f)
 {
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 0192043..84c6ee1 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -292,6 +292,7 @@ DEF_HELPER_2(xsmulsp, void, env, i32)
 DEF_HELPER_2(xsdivsp, void, env, i32)
 DEF_HELPER_2(xsresp, void, env, i32)
 DEF_HELPER_2(xssqrtsp, void, env, i32)
+DEF_HELPER_2(xsrsqrtesp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index b9cd35b..ae80289 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -7347,6 +7347,7 @@ GEN_VSX_HELPER_2(xsmulsp, 0x00, 0x02, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsdivsp, 0x00, 0x03, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsresp, 0x14, 0x01, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xssqrtsp, 0x16, 0x00, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsrsqrtesp, 0x14, 0x00, 0, PPC2_VSX207)
 
 GEN_VSX_HELPER_2(xvadddp, 0x00, 0x0C, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvsubdp, 0x00, 0x0D, 0, PPC2_VSX)
@@ -10161,6 +10162,7 @@ GEN_XX3FORM(xsmulsp, 0x00, 0x02, PPC2_VSX207),
 GEN_XX3FORM(xsdivsp, 0x00, 0x03, PPC2_VSX207),
 GEN_XX2FORM(xsresp,  0x14, 0x01, PPC2_VSX207),
 GEN_XX2FORM(xssqrtsp,  0x16, 0x00, PPC2_VSX207),
+GEN_XX2FORM(xsrsqrtesp,  0x14, 0x00, PPC2_VSX207),
 
 GEN_XX3FORM(xvadddp, 0x00, 0x0C, PPC2_VSX),
 GEN_XX3FORM(xvsubdp, 0x00, 0x0D, PPC2_VSX),
-- 
1.7.1
