Re: [Qemu-devel] [PATCH v5] spapr: Add support for time base offset migration

2014-04-14 Thread Alexander Graf


On 12.04.14 17:52, Alexey Kardashevskiy wrote:

This allows guests to have a different timebase origin from the host.

This is needed for migration, where a guest can migrate from one host
to another and the two hosts might have a different timebase origin.
However, the timebase seen by the guest must not go backwards, and
should go forwards only by a small amount corresponding to the time
taken for the migration.

This is only supported for recent POWER hardware which has the TBU40
(timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
970.

This adds kvm_access_one_reg() to access a special register which is not
in env->spr.

The feature must be present in the host kernel.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v5:
* fixed multiple comments in cpu_ppc_get_adjusted_tb and merged it
into timebase_post_load()
* removed round_up(1<<24) as KVM is expected to do this anyway
* removed @freq from migration stream
* renamed PPCTimebaseOffset to PPCTimebase
* CLOCKS_PER_SEC is used as a constant which 100us/s (man clock)

v4:
* made it per machine timebase offser rather than per CPU

v3:
* kvm_access_one_reg moved out to a separate patch
* tb_offset and host_timebase were replaced with guest_timebase as
the destionation does not really care of offset on the source

v2:
* bumped the vmstate_ppc_cpu version
* defined version for the env.tb_env field
---
  hw/ppc/ppc.c   | 76 ++
  hw/ppc/spapr.c |  4 +--
  include/hw/ppc/spapr.h |  1 +
  target-ppc/cpu-qom.h   | 15 ++
  target-ppc/kvm.c   |  5 
  trace-events   |  3 ++
  6 files changed, 102 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index 71df471..bf61a0a 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -29,9 +29,11 @@
  #include "sysemu/cpus.h"
  #include "hw/timer/m48t59.h"
  #include "qemu/log.h"
+#include "qemu/error-report.h"
  #include "hw/loader.h"
  #include "sysemu/kvm.h"
  #include "kvm_ppc.h"
+#include "trace.h"
  
  //#define PPC_DEBUG_IRQ

  //#define PPC_DEBUG_TB
@@ -829,6 +831,80 @@ static void cpu_ppc_set_tb_clk (void *opaque, uint32_t 
freq)
  cpu_ppc_store_purr(cpu, 0xULL);
  }
  
+static void timebase_pre_save(void *opaque)

+{
+PPCTimebase *tb = opaque;
+uint64_t ticks = cpu_get_real_ticks();
+PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+
+if (!first_ppc_cpu->env.tb_env) {
+error_report("No timebase object");
+return;
+}
+
+tb->time_of_the_day = get_clock_realtime() / 1000;


It'd be good to indicate in the field name that we're in us units. In 
fact it probably makes sense to use ns throughout and not convert :). 
Nanoseconds are QEMU's "native" internal time format.



+/*
+ * tb_offset is only expected to be changed by migration so
+ * there is no need to update it from KVM here
+ */
+tb->guest_timebase = ticks + first_ppc_cpu->env.tb_env->tb_offset;
+}
+
+static int timebase_post_load(void *opaque, int version_id)
+{
+PPCTimebase *tb = opaque;
+CPUState *cpu;
+PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+int64_t tb_off_adj, tb_off;
+int64_t migration_duration_us, migration_duration_tb, guest_tb, host_us;
+unsigned long freq;
+
+if (!first_ppc_cpu->env.tb_env) {
+error_report("No timebase object");
+return -1;
+}
+
+freq = first_ppc_cpu->env.tb_env->tb_freq;
+/*
+ * Calculate timebase on the destination side of migration.
+ * The destination timebase must be not less than the source timebase.
+ * We try to adjust timebase by downtime if host clocks are not
+ * too much out of sync (1 second for now).
+ */
+host_us = get_clock_realtime() / 1000;
+migration_duration_us = MIN(CLOCKS_PER_SEC, host_us - tb->time_of_the_day);


CLOCKS_PER_SEC only make sense with the return value of clock(). I 
wouldn't be surprised if this calculation goes totally bonkers on BSD or 
Windows systems.


I still don't understand why we're doing a MIN() here. I would 
understand a MAX() to guard against unsynchronized times on 
source/destination.



+migration_duration_tb = migration_duration_us * (freq / CLOCKS_PER_SEC);


I think muldiv64() is safer here in case tb_freq is small and the clock 
rate is high, no?



+guest_tb = tb->guest_timebase + MIN(0, migration_duration_tb);
+
+tb_off_adj = guest_tb - cpu_get_real_ticks();
+
+tb_off = first_ppc_cpu->env.tb_env->tb_offset;


Before migration env->tb_env->tb_offset should always be 0, no?


+trace_ppc_tb_adjust(tb_off, tb_off_adj, tb_off_adj - tb_off,
+(tb_off_adj - tb_off) / freq);
+
+/* Set new offset to all CPUs */
+CPU_FOREACH(cpu) {
+PowerPCCPU *pcpu = POWERPC_CPU(cpu);
+pcpu->env.tb_env->tb_offset = tb_off_adj;
+}
+
+return 0;
+}
+
+const VMStateDescription vmstate_ppc_timebase = {
+.name = "timebase",
+.

Re: [Qemu-devel] [QEMU v5 PATCH 00/18] SMBIOS: build full tables in QEMU

2014-04-14 Thread Markus Armbruster
"Gabriel L. Somlo"  writes:

> Kevin,
>
> Thanks for the comments. I'll work your feedback (and any other
> feedback I get by early next week) into another iteration of smbios
> patches for both SeaBIOS and QEMU.
>
> In the mean time, there's one remaining "big picture" design question:
>
> On Sat, Apr 12, 2014 at 11:56:08AM -0400, Kevin O'Connor wrote:
>> QEMU currently has command-line options that can modify the fields of
>> the type0 tables (-smbios type=0,vendor='foo').  To continue to
>> support that, I think QEMU should be able to build the type0 table as
>> it feels fit to, and SeaBIOS should be able to pass it through.  Of
>> course, if there's no specific request from the end user, then I think
>> QEMU can tell SeaBIOS that it may replace the type0 content with its
>> own data (eg, via "etc/update-smbios-type0").
>>
>> [...]
>>
>> As a minor quibble, I think patch 18 should make type0 required
>> instead of optional (once there are the new fw_cfg entries there is no
>> harm in always producing type0).  Also, it would be nice to move up
>> patch 18 to after patch 10 - that way an end-to-end test between
>> old/new smbios with the new interface could be done.  I defer to
>> regular qemu developers on these comments though.
>
> There's three options I can think of:
>
> 1. QEMU always generates its own type 0 table. In this case, SeaBIOS
> can probably just use that, along with the rest of the tables, as
> provided. QEMU would have to "impersonate" or "channel" SeaBIOS when
> generating the type 0 table (or "channel" TianoCore, depending on which
> bios is being used).

Begs the question how QEMU would figure out what exactly the BIOS wants
it to put into the type 0 table.  Unless somebody can come up with a
practical answer, we can ignore this option :)

> 2. QEMU only generates type 0 if explicitly told to do so on the
> command line (i.e., *not* by default). In this case, SeaBIOS (or OVMF,
> or any other BIOS) would have to scan the tables and insert its own
> default type 0 if one was not purposely supplied by QEMU. (I know my
> current SeaBIOS patch always overrides type 0, and agree that's
> inconsistent with this option, and plan on fixing it :)
>
> 3. QEMU never generates a type 0 structure (i.e. we remove that
> command line option), and the BIOS is *always* responsible for
> generating it ("allowing type 0 on the qemu command line was a bad
> idea, nobody uses it, we shouldn't have done it in the first place",
> to paraphrase from an earlier thread).

This is a simplification of 2: the BIOS doesn't have to check whether
QEMU provided type 0 information.  Paid for with an incompatible change
of a somewhat obscure QEMU feature.

> I personally like #2 as it appears simple and flexible, and doesn't
> require any further coordination (beyond qemu providing an entry
> point and a set of tables).
>
> However, I'm not religious about it -- I'm only really after type 2
> and 17, for OS X's sake, as you all may remember... :)
>
> Gerd, Laszlo, what do you guys think ?

My unsolicited advice: if 2. is easy for you, go for it.



Re: [Qemu-devel] [PATCH 4/4] spapr: Add support for time base offset migration

2014-04-14 Thread Alexey Kardashevskiy
On 04/14/2014 04:37 PM, Alexander Graf wrote:
> 
> On 12.04.14 23:44, Benjamin Herrenschmidt wrote:
>> On Sat, 2014-04-12 at 16:31 +0200, Alexander Graf wrote:
>>> Don't we generate PHBs on the fly? How exactly is this going to help
>>> with the problem at hand?
>> We can still assign the interrupts as a fixed function of the PHB
>> number...
> 
> Yes, but we create those depending on the order with which -device gets
> called IIUC. That's really what the underlying issue is. If we had 500
> prepopulated PHBs that PCI devices get assigned to we wouldn't have the
> problem (but different ones thanks to massive waste of memory and other
> resources).
> 
> So we either have to create some way to make interrupt numbering a function
> of something very simple we plug the PHB into, like a virtual pseries slot
> number which we multiply by x to get an irq number range. Or we'd have to
> manually link up PHB IRQ lines to XICS IRQ lines on the command line.


Our additional PHBs require @index property which is used to calculate base
MMIO and IO ranges. That we could use for LSI/MSI too.



-- 
Alexey



Re: [Qemu-devel] '.' IDs and certain names breaks -global and -set

2014-04-14 Thread Markus Armbruster
Alistair Francis  writes:

> On Wed, Apr 9, 2014 at 9:58 PM, Peter Crosthwaite
>  wrote:
>> On Wed, Apr 9, 2014 at 7:56 PM, Markus Armbruster  wrote:
>>> We have a number of device model names containing '.'.  They're unusable
>>> with -global.  That's because "-global A.B.C=foo" gets parsed as
>>>
>>> driver = "A"
>>> property = "B.C"
>>> value = "foo".
>>>
>>> Wrong when the device model name is really A.B and the property is C,
>>> e.g. "-global cfi.pflash01.name".
>>>
>>> Related problem: QemuOpts ID may contain '.'.  Such IDs are in fact
>>> common practice.  Unfortunately, they're unusable with -set.  For
>>> instance, -set A.B.C.D=foo gets parsed as
>>>
>>> group = "A"
>>> id = "B"
>>> arg = "C.D"
>>> value = "foo"
>>>
>>> Wrong when the id is really B.C and the arg is D, e.g. "-device
>>> isa-serial,id=s.0 -set device.s.0.chardev=c0".  This issue isn't limited
>>> to devices.
>>>
>>> Related historical problem: commit b560a9a.
>>>
>>> Possible solutions:
>>>
>>> * Outlaw '.' in QemuOpts parameter names (and thus device model names
>>>   and property names, because QemuOpts is a major way for users to
>>>   specify them).
>>>
>>
>> I like the "."s because they mean "this is where the vendor stops and
>> the IP name starts". Whereas "-" just delimits multiple words. We seem
>> to be running our of characters however with these overloading
>> problems. But AFAICT ":" is largely unused (within this context
>> anyway). So how about:
>>
>> "-" seperates multiple words in a name
>> ":" splits logical groupings of multiple words as appropriate ( e.g.
>> some-vendor:their-device )
>> "," demilits multiple command line arguments
>> "." accesses property within an object
>> "/" walks the canonical path
>>
>> I do also need to confess to my alternate agenda of device tree driven
>> QEMU - I am still parsing device trees and directly mapping the
>> compatible strings to QOM type names for machine creation. Ambiguity
>> of "-" as meaning the "," or the "-" is hard to deal with.
>>
>> Regards,
>> Peter
>>
>>> * Resolve the syntactic ambiguity semantically (ugh)
>>>
>>> * Extend the syntax of -global and -set to let users avoid the issue, or
>>>   replace them outright
>>>
>>> * Use the old ostrich algorithm
>>>
>>> Andreas, what restrictions does QOM place on QOM path component names?
>>> No '/', obviously.  Anything else?
>>>
>>
>
> What is the consensus with this? I like Peter's naming suggestions as
> they are straight
> forward and don't break any compatibility.

I'd like to hear a few more opinions, Andreas's in particular.  Give him
a bit more time to chime in.



Re: [Qemu-devel] [PATCH 4/4] spapr: Add support for time base offset migration

2014-04-14 Thread Alexander Graf


On 14.04.14 09:12, Alexey Kardashevskiy wrote:

On 04/14/2014 04:37 PM, Alexander Graf wrote:

On 12.04.14 23:44, Benjamin Herrenschmidt wrote:

On Sat, 2014-04-12 at 16:31 +0200, Alexander Graf wrote:

Don't we generate PHBs on the fly? How exactly is this going to help
with the problem at hand?

We can still assign the interrupts as a fixed function of the PHB
number...

Yes, but we create those depending on the order with which -device gets
called IIUC. That's really what the underlying issue is. If we had 500
prepopulated PHBs that PCI devices get assigned to we wouldn't have the
problem (but different ones thanks to massive waste of memory and other
resources).

So we either have to create some way to make interrupt numbering a function
of something very simple we plug the PHB into, like a virtual pseries slot
number which we multiply by x to get an irq number range. Or we'd have to
manually link up PHB IRQ lines to XICS IRQ lines on the command line.


Our additional PHBs require @index property which is used to calculate base
MMIO and IO ranges. That we could use for LSI/MSI too.


Works for me :)


Alex




Re: [Qemu-devel] [PATCH 33/35] pc: ACPI BIOS: reserve SRAT entry for hotplug mem hole

2014-04-14 Thread Hu Tao
On Fri, Apr 04, 2014 at 03:36:58PM +0200, Igor Mammedov wrote:
> Needed for Windows to use hotplugged memory device, otherwise
> it complains that server is not configured for memory hotplug.
> Tests shows that aftewards it uses dynamically provided
> proximity value from _PXM() method if available.
> 
> Signed-off-by: Igor Mammedov 
> ---
>  hw/i386/acpi-build.c | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index ef89e99..012b100 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1197,6 +1197,8 @@ build_srat(GArray *table_data, GArray *linker,
>  uint64_t curnode;
>  int srat_start, numa_start, slots;
>  uint64_t mem_len, mem_base, next_base;
> +PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
> +ram_addr_t hotplug_as_size = memory_region_size(&pcms->hotplug_memory);
>  
>  srat_start = table_data->len;
>  
> @@ -1261,6 +1263,18 @@ build_srat(GArray *table_data, GArray *linker,
>  acpi_build_srat_memory(numamem, 0, 0, 0, MEM_AFFINITY_NOFLAGS);
>  }
>  
> +/*
> + * Fake entry required by Windows to enable memory hotplug in OS.
> + * Individual DIMM devices override proximity set here via _PXM method,
> + * which returns associated with it NUMA node id.
> + */
> +if (hotplug_as_size) {
> +numamem = acpi_data_push(table_data, sizeof *numamem);
> +acpi_build_srat_memory(numamem, pcms->hotplug_memory_base,
> +   hotplug_as_size, 0, MEM_AFFINITY_HOTPLUGGABLE 
> |
> +   MEM_AFFINITY_ENABLED);
> +}
> +

Hi Igor,

With the faked entry, memory unplug doesn't work. Entries should be set
up for each node with correct flags(enable, hotpluggable) to make memory
unplug work.

Windows has not been tested yet. I encountered a problem that there is
no SRAT in Windows so even memory hotplug doesn't work. (but there is
in Linux with the same configuration).

Regards,
Hu Tao



Re: [Qemu-devel] [PATCH v5] spapr: Add support for time base offset migration

2014-04-14 Thread Alexey Kardashevskiy
On 04/14/2014 05:08 PM, Alexander Graf wrote:
> 
> On 12.04.14 17:52, Alexey Kardashevskiy wrote:
>> This allows guests to have a different timebase origin from the host.
>>
>> This is needed for migration, where a guest can migrate from one host
>> to another and the two hosts might have a different timebase origin.
>> However, the timebase seen by the guest must not go backwards, and
>> should go forwards only by a small amount corresponding to the time
>> taken for the migration.
>>
>> This is only supported for recent POWER hardware which has the TBU40
>> (timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
>> 970.
>>
>> This adds kvm_access_one_reg() to access a special register which is not
>> in env->spr.
>>
>> The feature must be present in the host kernel.
>>
>> Signed-off-by: Alexey Kardashevskiy 
>> ---
>> Changes:
>> v5:
>> * fixed multiple comments in cpu_ppc_get_adjusted_tb and merged it
>> into timebase_post_load()
>> * removed round_up(1<<24) as KVM is expected to do this anyway
>> * removed @freq from migration stream
>> * renamed PPCTimebaseOffset to PPCTimebase
>> * CLOCKS_PER_SEC is used as a constant which 100us/s (man clock)
>>
>> v4:
>> * made it per machine timebase offser rather than per CPU
>>
>> v3:
>> * kvm_access_one_reg moved out to a separate patch
>> * tb_offset and host_timebase were replaced with guest_timebase as
>> the destionation does not really care of offset on the source
>>
>> v2:
>> * bumped the vmstate_ppc_cpu version
>> * defined version for the env.tb_env field
>> ---
>>   hw/ppc/ppc.c   | 76
>> ++
>>   hw/ppc/spapr.c |  4 +--
>>   include/hw/ppc/spapr.h |  1 +
>>   target-ppc/cpu-qom.h   | 15 ++
>>   target-ppc/kvm.c   |  5 
>>   trace-events   |  3 ++
>>   6 files changed, 102 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
>> index 71df471..bf61a0a 100644
>> --- a/hw/ppc/ppc.c
>> +++ b/hw/ppc/ppc.c
>> @@ -29,9 +29,11 @@
>>   #include "sysemu/cpus.h"
>>   #include "hw/timer/m48t59.h"
>>   #include "qemu/log.h"
>> +#include "qemu/error-report.h"
>>   #include "hw/loader.h"
>>   #include "sysemu/kvm.h"
>>   #include "kvm_ppc.h"
>> +#include "trace.h"
>> //#define PPC_DEBUG_IRQ
>>   //#define PPC_DEBUG_TB
>> @@ -829,6 +831,80 @@ static void cpu_ppc_set_tb_clk (void *opaque,
>> uint32_t freq)
>>   cpu_ppc_store_purr(cpu, 0xULL);
>>   }
>>   +static void timebase_pre_save(void *opaque)
>> +{
>> +PPCTimebase *tb = opaque;
>> +uint64_t ticks = cpu_get_real_ticks();
>> +PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
>> +
>> +if (!first_ppc_cpu->env.tb_env) {
>> +error_report("No timebase object");
>> +return;
>> +}
>> +
>> +tb->time_of_the_day = get_clock_realtime() / 1000;
> 
> It'd be good to indicate in the field name that we're in us units. In fact
> it probably makes sense to use ns throughout and not convert :).
> Nanoseconds are QEMU's "native" internal time format.

Ooookay, I'll make it "ns", then no indicator is needed, right?


>> +/*
>> + * tb_offset is only expected to be changed by migration so
>> + * there is no need to update it from KVM here
>> + */
>> +tb->guest_timebase = ticks + first_ppc_cpu->env.tb_env->tb_offset;
>> +}
>> +
>> +static int timebase_post_load(void *opaque, int version_id)
>> +{
>> +PPCTimebase *tb = opaque;
>> +CPUState *cpu;
>> +PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
>> +int64_t tb_off_adj, tb_off;
>> +int64_t migration_duration_us, migration_duration_tb, guest_tb,
>> host_us;
>> +unsigned long freq;
>> +
>> +if (!first_ppc_cpu->env.tb_env) {
>> +error_report("No timebase object");
>> +return -1;
>> +}
>> +
>> +freq = first_ppc_cpu->env.tb_env->tb_freq;
>> +/*
>> + * Calculate timebase on the destination side of migration.
>> + * The destination timebase must be not less than the source timebase.
>> + * We try to adjust timebase by downtime if host clocks are not
>> + * too much out of sync (1 second for now).
>> + */
>> +host_us = get_clock_realtime() / 1000;
>> +migration_duration_us = MIN(CLOCKS_PER_SEC, host_us -
>> tb->time_of_the_day);
> 
> CLOCKS_PER_SEC only make sense with the return value of clock(). I wouldn't
> be surprised if this calculation goes totally bonkers on BSD or Windows
> systems.


http://linux.die.net/man/3/clock
C89, C99, POSIX.1-2001. POSIX requires that CLOCKS_PER_SEC equals 100
independent of the actual resolution.

This should be valid for BSD too. Ok, this has no effect on Windows.
Replace definition with get_ticks_per_sec()? Defining USEC_PER_SEC (or
NSEC_xxx) does not make much sense.

> 
> I still don't understand why we're doing a MIN() here. I would understand a
> MAX() to guard against unsynchronized times on source/destination.

If I added time difference between out-of

Re: [Qemu-devel] [PATCH 6/8] spapr: move interrupt allocator to xics

2014-04-14 Thread Alexander Graf


On 11.04.14 18:30, Alexey Kardashevskiy wrote:

On 04/12/2014 02:15 AM, Alexander Graf wrote:

On 11.04.14 18:01, Alexey Kardashevskiy wrote:

On 04/12/2014 01:38 AM, Alexander Graf wrote:

On 11.04.14 17:27, Alexey Kardashevskiy wrote:

On 04/12/2014 12:58 AM, Alexander Graf wrote:

On 11.04.14 16:50, Alexey Kardashevskiy wrote:

On 04/11/2014 11:58 PM, Alexander Graf wrote:

On 11.04.2014, at 14:38, Alexey Kardashevskiy  wrote:


On 04/11/2014 07:24 PM, Alexander Graf wrote:

On 10.04.14 16:43, Alexey Kardashevskiy wrote:

On 04/10/2014 11:26 PM, Alexander Graf wrote:

On 10.04.14 15:24, Alexey Kardashevskiy wrote:

On 04/10/2014 10:51 PM, Alexander Graf wrote:

On 14.03.14 05:18, Alexey Kardashevskiy wrote:

The current allocator returns IRQ numbers from a pool and
does not
support IRQs reuse in any form as it did not keep track of
what it
previously returned, it only had the last returned IRQ.
However migration may change interrupts for devices depending on
their order in the command line.

Wtf? Nonono, this sounds very bogus and wrong. Migration
shouldn't
change
anything.

I put wrong commit message. By change I meant that the default
state
before
the destination guest started accepting migration is different
from
what
the destination guest became after migration finished. And
migration
cannot
avoid changing this default state.

Ok, why is the IRQ configuration different?

Because QEMU creates devices in the order as in the command line,
and
libvirt changes this order - the XML used to create the guest and
the
XML
which is sends during migration are different. libvirt thinks it
is ok
while it keeps @reg property for (for example) spapr-vscsi devices
but it
is not because since the order is different, devices call IRQ
allocator in
different order and get different IRQs.

So your patch migrates the current IRQ configuration, but once you
restart
the virtual machine on the destination host it will have different
IRQ
numbering again, right?

No, why? IRQs are assigned at init time from realize() callbacks (and
survive reset) or as a part of ibm,change-msi rtas call which
happens in
the same order as it only depends on pci addresses and we do not
change
this either.

Ok, let me rephrase. If I shut the machine down because I'm doing
on-disk hibernate and then boot it back up, will the guest find the
same
configuration?

I do not understand what you mean by this. Hibernation by the guest OS
itself or by QEMU? If this involves QEMU exit and QEMU start - then yes,

by the guest OS. The host will only see a genuine "shutdown" event. The
guest OS will expect the machine to look *the exact same* as before the
shutdown.

Ok. So. I have to implement "irq" property everywhere (PHB is missing
INTA/B/C/D now) and check if they did not change during migration via
those

Hrm. Not sure. Maybe it'd make sense to join next week's call on platform
device creation. The problem seems pretty closely related.

What are those platform devices and what are you going to discuss exactly?

Devices that don't have a unified interrupt routing scheme like PCI where
you just link lines A/B/C/D to your controller and you're good to go.

Ah. VIO in my case.




VMSTATE.*EQUAL. Correct?

Why would you need this? I think we already said a couple dozen times that
configuration matching is a bigger problem, no?

For debug! It is not needed in general, yes.



If so (more or less), I still would like to keep patches 1..7.
In fact, the first one is independent and we need it anyway.
Yes/no?

Why?

IOMMUs do not migrate correctly - they only have a class have and
instance_id and this instance_it depends on command line arguments order.
The #1 patch makes it classname + liobn.

Why do we need a bus for that?


For BusClass::get_dev_path callback to get an unique name.


Juan, I don't think it makes a lot of sense to require a new fake bus 
just to give us a consistent migration view of things.


Do you have any ideas how to migration busless devices? We could just 
detect that case and give them numbering based on their occurence in the 
global QOM hierarchy, no?


Andreas, maybe you have an idea here as well :).


Alex




Re: [Qemu-devel] [PATCH 6/8] spapr: move interrupt allocator to xics

2014-04-14 Thread Alexey Kardashevskiy
On 04/14/2014 05:34 PM, Alexander Graf wrote:
> 
> On 11.04.14 18:30, Alexey Kardashevskiy wrote:
>> On 04/12/2014 02:15 AM, Alexander Graf wrote:
>>> On 11.04.14 18:01, Alexey Kardashevskiy wrote:
 On 04/12/2014 01:38 AM, Alexander Graf wrote:
> On 11.04.14 17:27, Alexey Kardashevskiy wrote:
>> On 04/12/2014 12:58 AM, Alexander Graf wrote:
>>> On 11.04.14 16:50, Alexey Kardashevskiy wrote:
 On 04/11/2014 11:58 PM, Alexander Graf wrote:
> On 11.04.2014, at 14:38, Alexey Kardashevskiy  wrote:
>
>> On 04/11/2014 07:24 PM, Alexander Graf wrote:
>>> On 10.04.14 16:43, Alexey Kardashevskiy wrote:
 On 04/10/2014 11:26 PM, Alexander Graf wrote:
> On 10.04.14 15:24, Alexey Kardashevskiy wrote:
>> On 04/10/2014 10:51 PM, Alexander Graf wrote:
>>> On 14.03.14 05:18, Alexey Kardashevskiy wrote:
 The current allocator returns IRQ numbers from a pool and
 does not
 support IRQs reuse in any form as it did not keep track of
 what it
 previously returned, it only had the last returned IRQ.
 However migration may change interrupts for devices
 depending on
 their order in the command line.
>>> Wtf? Nonono, this sounds very bogus and wrong. Migration
>>> shouldn't
>>> change
>>> anything.
>> I put wrong commit message. By change I meant that the default
>> state
>> before
>> the destination guest started accepting migration is different
>> from
>> what
>> the destination guest became after migration finished. And
>> migration
>> cannot
>> avoid changing this default state.
> Ok, why is the IRQ configuration different?
 Because QEMU creates devices in the order as in the command line,
 and
 libvirt changes this order - the XML used to create the guest and
 the
 XML
 which is sends during migration are different. libvirt thinks it
 is ok
 while it keeps @reg property for (for example) spapr-vscsi devices
 but it
 is not because since the order is different, devices call IRQ
 allocator in
 different order and get different IRQs.
>>> So your patch migrates the current IRQ configuration, but once you
>>> restart
>>> the virtual machine on the destination host it will have different
>>> IRQ
>>> numbering again, right?
>> No, why? IRQs are assigned at init time from realize() callbacks
>> (and
>> survive reset) or as a part of ibm,change-msi rtas call which
>> happens in
>> the same order as it only depends on pci addresses and we do not
>> change
>> this either.
> Ok, let me rephrase. If I shut the machine down because I'm doing
> on-disk hibernate and then boot it back up, will the guest find the
> same
> configuration?
 I do not understand what you mean by this. Hibernation by the guest OS
 itself or by QEMU? If this involves QEMU exit and QEMU start - then
 yes,
>>> by the guest OS. The host will only see a genuine "shutdown" event. The
>>> guest OS will expect the machine to look *the exact same* as before the
>>> shutdown.
>> Ok. So. I have to implement "irq" property everywhere (PHB is missing
>> INTA/B/C/D now) and check if they did not change during migration via
>> those
> Hrm. Not sure. Maybe it'd make sense to join next week's call on platform
> device creation. The problem seems pretty closely related.
 What are those platform devices and what are you going to discuss exactly?
>>> Devices that don't have a unified interrupt routing scheme like PCI where
>>> you just link lines A/B/C/D to your controller and you're good to go.
>> Ah. VIO in my case.
>>
>>
>>
>> VMSTATE.*EQUAL. Correct?
> Why would you need this? I think we already said a couple dozen times
> that
> configuration matching is a bigger problem, no?
 For debug! It is not needed in general, yes.


>> If so (more or less), I still would like to keep patches 1..7.
>> In fact, the first one is independent and we need it anyway.
>> Yes/no?
> Why?
 IOMMUs do not migrate correctly - they only have a class have and
 instance_id and this instance_it depends on command line arguments order.
 The #1 patch makes it classname + liobn.
>>> Why do we need a bus for that?
>>
>> For BusClass::get_dev_path callback to get an unique name.
> 
> Juan, I don't think it makes a lot of sense to require a new fake bus just
> to give us a consistent migration view of things.
> 
> Do you have any 

Re: [Qemu-devel] [PATCH] spapr: add ibm, chip-id property in device tree

2014-04-14 Thread Alexander Graf


On 11.04.14 18:41, Alexey Kardashevskiy wrote:

On 04/12/2014 02:29 AM, Alexander Graf wrote:

On 13.03.14 07:29, Alexey Kardashevskiy wrote:

This adds a "ibm,chip-id" property for CPU nodes which should be the same
for all cores in the same CPU socket. The recent guest kernels use this
information to associate threads with sockets.

Refer to the kernel commit 256f2d4b463d3030ebc8d2b54f427543814a2bdc
for more details.

Signed-off-by: Alexey Kardashevskiy 
---
   hw/ppc/spapr.c | 9 +
   1 file changed, 9 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bf46c38..6366230 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -308,6 +308,8 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
   uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
   int i, smt = kvmppc_smt_threads();
   unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
+QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
+unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
 fdt = g_malloc0(FDT_MAX_SIZE);
   _FDT((fdt_create(fdt, FDT_MAX_SIZE)));
@@ -465,6 +467,13 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
  page_sizes_prop, page_sizes_prop_size)));
   }
   +if (sockets) {
+int cpus_per_socket = smp_cpus / sockets;
+uint32_t chip_id = cs->cpu_index / cpus_per_socket;
+
+_FDT((fdt_property_cell(fdt, "ibm,chip-id", chip_id)));
+}

Have you verified this works correctly with threads?

Sorry, do not follow you.
-smp X,sockets=Y,threads=Z - what combination of XYZ is suspicious?


No, I think it works. Does it work for non-power-of-2 socket numbers?





Also, I don't see why
we should omit the chip-id when we don't define sockets.

Why should we pollute device tree...


What does pHyp do? If we say we don't want to "pollute the device tree" 
for the sockets=1 case then this should be explicit. Right now we put 
the chip-id in when you say sockets=1, but not if you omit it.



Alex




Re: [Qemu-devel] [PATCH 6/8] spapr: move interrupt allocator to xics

2014-04-14 Thread Alexander Graf


On 14.04.14 09:41, Alexey Kardashevskiy wrote:

On 04/14/2014 05:34 PM, Alexander Graf wrote:

On 11.04.14 18:30, Alexey Kardashevskiy wrote:

On 04/12/2014 02:15 AM, Alexander Graf wrote:

On 11.04.14 18:01, Alexey Kardashevskiy wrote:

On 04/12/2014 01:38 AM, Alexander Graf wrote:

On 11.04.14 17:27, Alexey Kardashevskiy wrote:

On 04/12/2014 12:58 AM, Alexander Graf wrote:

On 11.04.14 16:50, Alexey Kardashevskiy wrote:

On 04/11/2014 11:58 PM, Alexander Graf wrote:

On 11.04.2014, at 14:38, Alexey Kardashevskiy  wrote:


On 04/11/2014 07:24 PM, Alexander Graf wrote:

On 10.04.14 16:43, Alexey Kardashevskiy wrote:

On 04/10/2014 11:26 PM, Alexander Graf wrote:

On 10.04.14 15:24, Alexey Kardashevskiy wrote:

On 04/10/2014 10:51 PM, Alexander Graf wrote:

On 14.03.14 05:18, Alexey Kardashevskiy wrote:

The current allocator returns IRQ numbers from a pool and
does not
support IRQs reuse in any form as it did not keep track of
what it
previously returned, it only had the last returned IRQ.
However migration may change interrupts for devices
depending on
their order in the command line.

Wtf? Nonono, this sounds very bogus and wrong. Migration
shouldn't
change
anything.

I put wrong commit message. By change I meant that the default
state
before
the destination guest started accepting migration is different
from
what
the destination guest became after migration finished. And
migration
cannot
avoid changing this default state.

Ok, why is the IRQ configuration different?

Because QEMU creates devices in the order as in the command line,
and
libvirt changes this order - the XML used to create the guest and
the
XML
which is sends during migration are different. libvirt thinks it
is ok
while it keeps @reg property for (for example) spapr-vscsi devices
but it
is not because since the order is different, devices call IRQ
allocator in
different order and get different IRQs.

So your patch migrates the current IRQ configuration, but once you
restart
the virtual machine on the destination host it will have different
IRQ
numbering again, right?

No, why? IRQs are assigned at init time from realize() callbacks
(and
survive reset) or as a part of ibm,change-msi rtas call which
happens in
the same order as it only depends on pci addresses and we do not
change
this either.

Ok, let me rephrase. If I shut the machine down because I'm doing
on-disk hibernate and then boot it back up, will the guest find the
same
configuration?

I do not understand what you mean by this. Hibernation by the guest OS
itself or by QEMU? If this involves QEMU exit and QEMU start - then
yes,

by the guest OS. The host will only see a genuine "shutdown" event. The
guest OS will expect the machine to look *the exact same* as before the
shutdown.

Ok. So. I have to implement "irq" property everywhere (PHB is missing
INTA/B/C/D now) and check if they did not change during migration via
those

Hrm. Not sure. Maybe it'd make sense to join next week's call on platform
device creation. The problem seems pretty closely related.

What are those platform devices and what are you going to discuss exactly?

Devices that don't have a unified interrupt routing scheme like PCI where
you just link lines A/B/C/D to your controller and you're good to go.

Ah. VIO in my case.




VMSTATE.*EQUAL. Correct?

Why would you need this? I think we already said a couple dozen times
that
configuration matching is a bigger problem, no?

For debug! It is not needed in general, yes.



If so (more or less), I still would like to keep patches 1..7.
In fact, the first one is independent and we need it anyway.
Yes/no?

Why?

IOMMUs do not migrate correctly - they only have a class have and
instance_id and this instance_it depends on command line arguments order.
The #1 patch makes it classname + liobn.

Why do we need a bus for that?

For BusClass::get_dev_path callback to get an unique name.

Juan, I don't think it makes a lot of sense to require a new fake bus just
to give us a consistent migration view of things.

Do you have any ideas how to migration busless devices? We could just
detect that case and give them numbering based on their occurence in the
global QOM hierarchy, no?


The mentioned instance_id is that occurrence number which totally depends
on the device order in the command line. And I have to not to depend on that.


So how would a bus fix that? The bus gets populated based on the command 
line order just as well, no?



Alex




Re: [Qemu-devel] [PATCH v5] spapr: Add support for time base offset migration

2014-04-14 Thread Alexander Graf


On 14.04.14 09:33, Alexey Kardashevskiy wrote:

On 04/14/2014 05:08 PM, Alexander Graf wrote:

On 12.04.14 17:52, Alexey Kardashevskiy wrote:

This allows guests to have a different timebase origin from the host.

This is needed for migration, where a guest can migrate from one host
to another and the two hosts might have a different timebase origin.
However, the timebase seen by the guest must not go backwards, and
should go forwards only by a small amount corresponding to the time
taken for the migration.

This is only supported for recent POWER hardware which has the TBU40
(timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
970.

This adds kvm_access_one_reg() to access a special register which is not
in env->spr.

The feature must be present in the host kernel.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v5:
* fixed multiple comments in cpu_ppc_get_adjusted_tb and merged it
into timebase_post_load()
* removed round_up(1<<24) as KVM is expected to do this anyway
* removed @freq from migration stream
* renamed PPCTimebaseOffset to PPCTimebase
* CLOCKS_PER_SEC is used as a constant which 100us/s (man clock)

v4:
* made it per machine timebase offser rather than per CPU

v3:
* kvm_access_one_reg moved out to a separate patch
* tb_offset and host_timebase were replaced with guest_timebase as
the destionation does not really care of offset on the source

v2:
* bumped the vmstate_ppc_cpu version
* defined version for the env.tb_env field
---
   hw/ppc/ppc.c   | 76
++
   hw/ppc/spapr.c |  4 +--
   include/hw/ppc/spapr.h |  1 +
   target-ppc/cpu-qom.h   | 15 ++
   target-ppc/kvm.c   |  5 
   trace-events   |  3 ++
   6 files changed, 102 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index 71df471..bf61a0a 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -29,9 +29,11 @@
   #include "sysemu/cpus.h"
   #include "hw/timer/m48t59.h"
   #include "qemu/log.h"
+#include "qemu/error-report.h"
   #include "hw/loader.h"
   #include "sysemu/kvm.h"
   #include "kvm_ppc.h"
+#include "trace.h"
 //#define PPC_DEBUG_IRQ
   //#define PPC_DEBUG_TB
@@ -829,6 +831,80 @@ static void cpu_ppc_set_tb_clk (void *opaque,
uint32_t freq)
   cpu_ppc_store_purr(cpu, 0xULL);
   }
   +static void timebase_pre_save(void *opaque)
+{
+PPCTimebase *tb = opaque;
+uint64_t ticks = cpu_get_real_ticks();
+PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+
+if (!first_ppc_cpu->env.tb_env) {
+error_report("No timebase object");
+return;
+}
+
+tb->time_of_the_day = get_clock_realtime() / 1000;

It'd be good to indicate in the field name that we're in us units. In fact
it probably makes sense to use ns throughout and not convert :).
Nanoseconds are QEMU's "native" internal time format.

Ooookay, I'll make it "ns", then no indicator is needed, right?


I still prefer to keep an indicator. Makes it more readable.





+/*
+ * tb_offset is only expected to be changed by migration so
+ * there is no need to update it from KVM here
+ */
+tb->guest_timebase = ticks + first_ppc_cpu->env.tb_env->tb_offset;
+}
+
+static int timebase_post_load(void *opaque, int version_id)
+{
+PPCTimebase *tb = opaque;
+CPUState *cpu;
+PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+int64_t tb_off_adj, tb_off;
+int64_t migration_duration_us, migration_duration_tb, guest_tb,
host_us;
+unsigned long freq;
+
+if (!first_ppc_cpu->env.tb_env) {
+error_report("No timebase object");
+return -1;
+}
+
+freq = first_ppc_cpu->env.tb_env->tb_freq;
+/*
+ * Calculate timebase on the destination side of migration.
+ * The destination timebase must be not less than the source timebase.
+ * We try to adjust timebase by downtime if host clocks are not
+ * too much out of sync (1 second for now).
+ */
+host_us = get_clock_realtime() / 1000;
+migration_duration_us = MIN(CLOCKS_PER_SEC, host_us -
tb->time_of_the_day);

CLOCKS_PER_SEC only make sense with the return value of clock(). I wouldn't
be surprised if this calculation goes totally bonkers on BSD or Windows
systems.


http://linux.die.net/man/3/clock
C89, C99, POSIX.1-2001. POSIX requires that CLOCKS_PER_SEC equals 100
independent of the actual resolution.

This should be valid for BSD too. Ok, this has no effect on Windows.
Replace definition with get_ticks_per_sec()? Defining USEC_PER_SEC (or
NSEC_xxx) does not make much sense.


I'd just define NSEC_PER_SEC.




I still don't understand why we're doing a MIN() here. I would understand a
MAX() to guard against unsynchronized times on source/destination.

If I added time difference between out-of-sync hosts, the delta would be
huge and the destination guest would behave really weird until this time
difference is over. Guests can cope with accidental timebase jump of
seconds bu

Re: [Qemu-devel] [PATCH 6/8] spapr: move interrupt allocator to xics

2014-04-14 Thread Alexey Kardashevskiy
On 04/14/2014 05:42 PM, Alexander Graf wrote:
> 
> On 14.04.14 09:41, Alexey Kardashevskiy wrote:
>> On 04/14/2014 05:34 PM, Alexander Graf wrote:
>>> On 11.04.14 18:30, Alexey Kardashevskiy wrote:
 On 04/12/2014 02:15 AM, Alexander Graf wrote:
> On 11.04.14 18:01, Alexey Kardashevskiy wrote:
>> On 04/12/2014 01:38 AM, Alexander Graf wrote:
>>> On 11.04.14 17:27, Alexey Kardashevskiy wrote:
 On 04/12/2014 12:58 AM, Alexander Graf wrote:
> On 11.04.14 16:50, Alexey Kardashevskiy wrote:
>> On 04/11/2014 11:58 PM, Alexander Graf wrote:
>>> On 11.04.2014, at 14:38, Alexey Kardashevskiy 
>>> wrote:
>>>
 On 04/11/2014 07:24 PM, Alexander Graf wrote:
> On 10.04.14 16:43, Alexey Kardashevskiy wrote:
>> On 04/10/2014 11:26 PM, Alexander Graf wrote:
>>> On 10.04.14 15:24, Alexey Kardashevskiy wrote:
 On 04/10/2014 10:51 PM, Alexander Graf wrote:
> On 14.03.14 05:18, Alexey Kardashevskiy wrote:
>> The current allocator returns IRQ numbers from a pool and
>> does not
>> support IRQs reuse in any form as it did not keep track of
>> what it
>> previously returned, it only had the last returned IRQ.
>> However migration may change interrupts for devices
>> depending on
>> their order in the command line.
> Wtf? Nonono, this sounds very bogus and wrong. Migration
> shouldn't
> change
> anything.
 I put wrong commit message. By change I meant that the default
 state
 before
 the destination guest started accepting migration is different
 from
 what
 the destination guest became after migration finished. And
 migration
 cannot
 avoid changing this default state.
>>> Ok, why is the IRQ configuration different?
>> Because QEMU creates devices in the order as in the command
>> line,
>> and
>> libvirt changes this order - the XML used to create the guest
>> and
>> the
>> XML
>> which is sends during migration are different. libvirt thinks it
>> is ok
>> while it keeps @reg property for (for example) spapr-vscsi
>> devices
>> but it
>> is not because since the order is different, devices call IRQ
>> allocator in
>> different order and get different IRQs.
> So your patch migrates the current IRQ configuration, but once
> you
> restart
> the virtual machine on the destination host it will have
> different
> IRQ
> numbering again, right?
 No, why? IRQs are assigned at init time from realize() callbacks
 (and
 survive reset) or as a part of ibm,change-msi rtas call which
 happens in
 the same order as it only depends on pci addresses and we do not
 change
 this either.
>>> Ok, let me rephrase. If I shut the machine down because I'm doing
>>> on-disk hibernate and then boot it back up, will the guest find the
>>> same
>>> configuration?
>> I do not understand what you mean by this. Hibernation by the
>> guest OS
>> itself or by QEMU? If this involves QEMU exit and QEMU start - then
>> yes,
> by the guest OS. The host will only see a genuine "shutdown"
> event. The
> guest OS will expect the machine to look *the exact same* as
> before the
> shutdown.
 Ok. So. I have to implement "irq" property everywhere (PHB is missing
 INTA/B/C/D now) and check if they did not change during migration via
 those
>>> Hrm. Not sure. Maybe it'd make sense to join next week's call on
>>> platform
>>> device creation. The problem seems pretty closely related.
>> What are those platform devices and what are you going to discuss
>> exactly?
> Devices that don't have a unified interrupt routing scheme like PCI where
> you just link lines A/B/C/D to your controller and you're good to go.
 Ah. VIO in my case.



 VMSTATE.*EQUAL. Correct?
>>> Why would you need this? I think we already said a couple dozen times
>>> that
>>> configuration matching is a bigger problem, no?
>> For debug! It is not needed in general, yes.
>>
>>
 If so (more or less), I still would like to keep patches 1..7.
 In fact, the first one is independent and we need it anyway.
 Yes/no?
>>> Why?
>> IOMMUs do not migrate

Re: [Qemu-devel] Error propagation in generated visitors and command marshallers

2014-04-14 Thread Markus Armbruster
Peter Crosthwaite  writes:

> On Fri, Apr 11, 2014 at 11:41 PM, Markus Armbruster  wrote:
>> Peter Crosthwaite  writes:
>>
>>> On Thu, Apr 10, 2014 at 1:48 AM, Markus Armbruster  
>>> wrote:
 I stumbled over this while trying to purge error_is_set() from the code.


 Here's how we commonly use the Error API:

 Error *err = NULL;

 foo(arg, &err)
 if (err) {
 goto out;
 }
 bar(arg, &err)
 if (err) {
 goto out;
 }

 This ensures that err is null on entry, both for foo() and for bar().
 Many functions rely on that, like this:

 void foo(ArgType arg, Error **errp)
 {
 if (frobnicate(arg) < 0) {
 error_setg(errp, "Can't frobnicate");
 // This asserts errp != NULL
 }
 }


 Here's how some of our visitor code uses the Error API (for real code,
 check out generated qmp-marshal.c):

 Error *err = NULL;
 QmpInputVisitor *mi = qmp_input_visitor_new_strict(QOBJECT(args));
 Visitor *v = qmp_input_get_visitor(mi);
 char *foo = NULL;
 char *bar = NULL;

 visit_type_str(v, &foo, "foo", &err);
 visit_type_str(v, &bar, "bar", &err);
 if (err) {
 goto out;
 }

 Unlike above, this may pass a non-null errp to the second
 visit_type_str(), namely when the first one fails.

 The visitor functions guard against that, like this:

 void visit_type_str(Visitor *v, char **obj, const char *name, Error 
 **errp)
 {
 if (!error_is_set(errp)) {
 v->type_str(v, obj, name, errp);
 }
 }

 As discussed before, error_is_set() is almost almost wrong, fragile or
 unclean.  What if errp is null?  Then we fail to stop visiting after an
 error.
>>>
>>> That should be the callers problem. If you pass a NULL errp then the
>>> intended semantic is to ignore errors.
>>
>> The *caller* isn't interested in an error.  But the callee's behavior
>> should *not* be affected by that at all other than not returning an
>> error.
>>
>> In particular, the callee should never continue after an error just
>> because the caller isn't interested in detailed error information.
>>
>
> But the error is from a previous call not the current call. Callers
> job to inform second call that first one failed (or in current status
> quo - not call the second call at all). But its caller job to know the
> dependancy otherwise calls are not self contained.
>
>> That's why "if (error_is_set(errp)) bail" and similar are always wrong
>> when errp is a parameter that may be null.
>>
>
> Agreed. Don't see the problem here though - it's bad caller code too.
>
 The function could be improved like this:

 void visit_type_str(Visitor *v, char **obj, const char *name, Error 
 **errp)
 {
 assert(errp);
>>>
>>> And this is irregular in that you are now mandating the Error ** and
>>> thus removing the capability to ignore errors.
>>
>> It is irregular.  The irregularity is necessary as long as the function
>> isn't prepared for a null errp.
>>
>
> My understanding is all functions should be prepared for NULL errp.

If you combine "do nothing when called with an error set" semantics with
a "pass null to ignore errors" convention, you set yourself up for
accidental misuse.  Consider

visit_type_str(v, &foo, "foo", errp);
visit_type_str(v, &bar, "bar", errp);

where visit_type_str() does nothing when error_is_set(errp).  This
*breaks* when errp is null, but seeing that takes a non-local argument.

Here's a simple safety rule: if you do anything with an Error * but pass
it on, you must ensure it's not null, and you should make its
non-null-ness locally obvious whenever practical.

This rule should ensure that you can pass null to ignore errors without
changing behavior beyond ignoring the error.

 if (!*errp) {
 v->type_str(v, obj, name, errp);
 }
 }


 But: is it a good idea to have both patterns in the code?  Should we
 perhaps use the common pattern for visiting, too?  Like this:

 visit_type_str(v, &foo, "foo", &err);
 if (err) {
 goto out;
 }
 visit_type_str(v, &bar, "bar", &err);
 if (err) {
 goto out;
 }

 Then we can assume *errp is clear on function entry, like this:

 void visit_type_str(Visitor *v, char **obj, const char *name, Error 
 **errp)
 {
 v->type_str(v, obj, name, errp);
 }

 Should execute roughly the same number of conditional branches.

 Tedious repetition of "if (err) goto out" in the caller, but that's what
 we do elsewhere, and unlike elsewhere, these one's are generated.

 Opini

Re: [Qemu-devel] [PATCH] spapr: add ibm, chip-id property in device tree

2014-04-14 Thread Alexey Kardashevskiy
On 04/14/2014 05:41 PM, Alexander Graf wrote:
> 
> On 11.04.14 18:41, Alexey Kardashevskiy wrote:
>> On 04/12/2014 02:29 AM, Alexander Graf wrote:
>>> On 13.03.14 07:29, Alexey Kardashevskiy wrote:
 This adds a "ibm,chip-id" property for CPU nodes which should be the same
 for all cores in the same CPU socket. The recent guest kernels use this
 information to associate threads with sockets.

 Refer to the kernel commit 256f2d4b463d3030ebc8d2b54f427543814a2bdc
 for more details.

 Signed-off-by: Alexey Kardashevskiy 
 ---
hw/ppc/spapr.c | 9 +
1 file changed, 9 insertions(+)

 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index bf46c38..6366230 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -308,6 +308,8 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
uint32_t interrupt_server_ranges_prop[] = {0,
 cpu_to_be32(smp_cpus)};
int i, smt = kvmppc_smt_threads();
unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
 +QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
 +unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0)
 : 0;
  fdt = g_malloc0(FDT_MAX_SIZE);
_FDT((fdt_create(fdt, FDT_MAX_SIZE)));
 @@ -465,6 +467,13 @@ static void *spapr_create_fdt_skel(hwaddr
 initrd_base,
   page_sizes_prop,
 page_sizes_prop_size)));
}
+if (sockets) {
 +int cpus_per_socket = smp_cpus / sockets;
 +uint32_t chip_id = cs->cpu_index / cpus_per_socket;
 +
 +_FDT((fdt_property_cell(fdt, "ibm,chip-id", chip_id)));
 +}
>>> Have you verified this works correctly with threads?
>> Sorry, do not follow you.
>> -smp X,sockets=Y,threads=Z - what combination of XYZ is suspicious?
> 
> No, I think it works. Does it work for non-power-of-2 socket numbers?

Yes.
-smp 12,sockets=3,threads=4

[root@erif_root cpus]# find -iname ibm,chip-id
./PowerPC,POWER7@8/ibm,chip-id
./PowerPC,POWER7@4/ibm,chip-id
./PowerPC,POWER7@0/ibm,chip-id
[root@erif_root cpus]# find -iname ibm,chip-id -exec hexdump {} \;
000  0002
004
000  0001
004
000  
004

[root@erif_root cpus] cat /proc/cpuinfo
processor   : 0
cpu : POWER7 (raw), altivec supported
clock   : 3720.00MHz
revision: 2.1 (pvr 003f 0201)

processor   : 1
cpu : POWER7 (raw), altivec supported
clock   : 3720.00MHz
revision: 2.1 (pvr 003f 0201)

processor   : 2
cpu : POWER7 (raw), altivec supported
clock   : 3720.00MHz
revision: 2.1 (pvr 003f 0201)

processor   : 3
cpu : POWER7 (raw), altivec supported
clock   : 3720.00MHz
revision: 2.1 (pvr 003f 0201)

processor   : 4
cpu : POWER7 (raw), altivec supported
clock   : 3720.00MHz
revision: 2.1 (pvr 003f 0201)

processor   : 5
cpu : POWER7 (raw), altivec supported
clock   : 3720.00MHz
revision: 2.1 (pvr 003f 0201)

processor   : 6
cpu : POWER7 (raw), altivec supported
clock   : 3720.00MHz
revision: 2.1 (pvr 003f 0201)

processor   : 7
cpu : POWER7 (raw), altivec supported
clock   : 3720.00MHz
revision: 2.1 (pvr 003f 0201)

processor   : 8
cpu : POWER7 (raw), altivec supported
clock   : 3720.00MHz
revision: 2.1 (pvr 003f 0201)

processor   : 9
cpu : POWER7 (raw), altivec supported
clock   : 3720.00MHz
revision: 2.1 (pvr 003f 0201)

processor   : 10
cpu : POWER7 (raw), altivec supported
clock   : 3720.00MHz
revision: 2.1 (pvr 003f 0201)

processor   : 11
cpu : POWER7 (raw), altivec supported
clock   : 3720.00MHz
revision: 2.1 (pvr 003f 0201)

timebase: 51200
platform: pSeries
model   : IBM pSeries (emulated by qemu)
machine : CHRP IBM pSeries (emulated by qemu)


> 
>>
>>
>>> Also, I don't see why
>>> we should omit the chip-id when we don't define sockets.
>> Why should we pollute device tree...
> 
> What does pHyp do? If we say we don't want to "pollute the device tree" for
> the sockets=1 case then this should be explicit. Right now we put the
> chip-id in when you say sockets=1, but not if you omit it.

It is made up thing which is absent in pHyp.
Ok, I'll enable it always, won't hurt.


> 
> 
> Alex
> 


-- 
Alexey



Re: [Qemu-devel] [PATCH v5] spapr: Add support for time base offset migration

2014-04-14 Thread Alexey Kardashevskiy
On 04/14/2014 05:46 PM, Alexander Graf wrote:
> 
> On 14.04.14 09:33, Alexey Kardashevskiy wrote:
>> On 04/14/2014 05:08 PM, Alexander Graf wrote:
>>> On 12.04.14 17:52, Alexey Kardashevskiy wrote:
 This allows guests to have a different timebase origin from the host.

 This is needed for migration, where a guest can migrate from one host
 to another and the two hosts might have a different timebase origin.
 However, the timebase seen by the guest must not go backwards, and
 should go forwards only by a small amount corresponding to the time
 taken for the migration.

 This is only supported for recent POWER hardware which has the TBU40
 (timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
 970.

 This adds kvm_access_one_reg() to access a special register which is not
 in env->spr.

 The feature must be present in the host kernel.

 Signed-off-by: Alexey Kardashevskiy 
 ---
 Changes:
 v5:
 * fixed multiple comments in cpu_ppc_get_adjusted_tb and merged it
 into timebase_post_load()
 * removed round_up(1<<24) as KVM is expected to do this anyway
 * removed @freq from migration stream
 * renamed PPCTimebaseOffset to PPCTimebase
 * CLOCKS_PER_SEC is used as a constant which 100us/s (man clock)

 v4:
 * made it per machine timebase offser rather than per CPU

 v3:
 * kvm_access_one_reg moved out to a separate patch
 * tb_offset and host_timebase were replaced with guest_timebase as
 the destionation does not really care of offset on the source

 v2:
 * bumped the vmstate_ppc_cpu version
 * defined version for the env.tb_env field
 ---
hw/ppc/ppc.c   | 76
 ++
hw/ppc/spapr.c |  4 +--
include/hw/ppc/spapr.h |  1 +
target-ppc/cpu-qom.h   | 15 ++
target-ppc/kvm.c   |  5 
trace-events   |  3 ++
6 files changed, 102 insertions(+), 2 deletions(-)

 diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
 index 71df471..bf61a0a 100644
 --- a/hw/ppc/ppc.c
 +++ b/hw/ppc/ppc.c
 @@ -29,9 +29,11 @@
#include "sysemu/cpus.h"
#include "hw/timer/m48t59.h"
#include "qemu/log.h"
 +#include "qemu/error-report.h"
#include "hw/loader.h"
#include "sysemu/kvm.h"
#include "kvm_ppc.h"
 +#include "trace.h"
  //#define PPC_DEBUG_IRQ
//#define PPC_DEBUG_TB
 @@ -829,6 +831,80 @@ static void cpu_ppc_set_tb_clk (void *opaque,
 uint32_t freq)
cpu_ppc_store_purr(cpu, 0xULL);
}
+static void timebase_pre_save(void *opaque)
 +{
 +PPCTimebase *tb = opaque;
 +uint64_t ticks = cpu_get_real_ticks();
 +PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
 +
 +if (!first_ppc_cpu->env.tb_env) {
 +error_report("No timebase object");
 +return;
 +}
 +
 +tb->time_of_the_day = get_clock_realtime() / 1000;
>>> It'd be good to indicate in the field name that we're in us units. In fact
>>> it probably makes sense to use ns throughout and not convert :).
>>> Nanoseconds are QEMU's "native" internal time format.
>> Ooookay, I'll make it "ns", then no indicator is needed, right?
> 
> I still prefer to keep an indicator. Makes it more readable.
> 
>>
>>
 +/*
 + * tb_offset is only expected to be changed by migration so
 + * there is no need to update it from KVM here
 + */
 +tb->guest_timebase = ticks + first_ppc_cpu->env.tb_env->tb_offset;
 +}
 +
 +static int timebase_post_load(void *opaque, int version_id)
 +{
 +PPCTimebase *tb = opaque;
 +CPUState *cpu;
 +PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
 +int64_t tb_off_adj, tb_off;
 +int64_t migration_duration_us, migration_duration_tb, guest_tb,
 host_us;
 +unsigned long freq;
 +
 +if (!first_ppc_cpu->env.tb_env) {
 +error_report("No timebase object");
 +return -1;
 +}
 +
 +freq = first_ppc_cpu->env.tb_env->tb_freq;
 +/*
 + * Calculate timebase on the destination side of migration.
 + * The destination timebase must be not less than the source
 timebase.
 + * We try to adjust timebase by downtime if host clocks are not
 + * too much out of sync (1 second for now).
 + */
 +host_us = get_clock_realtime() / 1000;
 +migration_duration_us = MIN(CLOCKS_PER_SEC, host_us -
 tb->time_of_the_day);
>>> CLOCKS_PER_SEC only make sense with the return value of clock(). I wouldn't
>>> be surprised if this calculation goes totally bonkers on BSD or Windows
>>> systems.
>>
>> http://linux.die.net/man/3/clock
>> C89, C99, POSIX.1-2001. POSIX requires that CLO

Re: [Qemu-devel] Automatic spice port selection

2014-04-14 Thread Fabio Fantoni

Il 26/03/2014 17:15, Fabio Fantoni ha scritto:
Time ago I have read somewhere that there is an option to 
automatically spice port in qemu as for vnc.

I started to write a libxl patch to add this feature like the vnc one:
https://github.com/Fantu/Xen/commit/63c54f354a34e7eb27b1c69016789a372a75843c 



Testing this patch lead to a missing "to=" parameter and I didn't 
found nothing else.


Is there someone that can tell me that if it is possible to let qemu 
automatically assign a spice port like for vnc?


Thanks for any reply.


Ping



Re: [Qemu-devel] Turning off default storage devices?

2014-04-14 Thread Markus Armbruster
Peter Crosthwaite  writes:

> Hi Andy,
>
> On Thu, Apr 10, 2014 at 5:55 AM, Andy Lutomirski  wrote:
>> Currently, -M q35 boots linux quite a bit slower than the default
>> machine type.  This seems to be because it takes a few hundred ms to
>> determine that there's nothing attached to the AHCI controller.
>>
>> In virtio setups, there will probably never be anything attached to
>> the AHCI controller.  Would it be possible to add something like
>> -machine default_storage=off to turn off default storage devices?
>> This could include the AHCI on q35 and the cdrom and such on pc.
>>
>> There's precedent: -machine usb=off turns off the default USB
>> controllers, which is great for setups that use xhci.
>>
>
> Is there a more generic solution to your problem? Can you implement
> command line device removal in a non specific way and avoid having to
> invent AHCI or even "storage" specific arguments. You could
> considering bringing the xhci use case you mentioned under the same
> umbrella.

USB has always been off by default, at least for the boards I'm familiar
with, due to the USB emulation's non-trivial CPU use.

There's no such thing as a Q35 board without USB in the physical world.
Can't stop us from making a virtual one, of course.

Likewise, there's no such thing as a Q35 board without AHCI in the
physical world, and again that can't stop us from making a virtual one.

The difference to USB is that our q35 machines have always had AHCI even
with -nodefaults.  You seem to propose adding a switch to disable AHCI,
yet leave it enabled with -nodefaults.

-nodefaults should give you a board with all the optional components
suppressed.

On the one hand, I'd rather not add exceptions to -nodefaults "give me
the board with all its optional components suppressed" semantics.

On the other hand, a few hundred ms are a long time.



Re: [Qemu-devel] [PATCH v5] spapr: Add support for time base offset migration

2014-04-14 Thread Alexander Graf


On 14.04.14 10:10, Alexey Kardashevskiy wrote:

On 04/14/2014 05:46 PM, Alexander Graf wrote:

On 14.04.14 09:33, Alexey Kardashevskiy wrote:

On 04/14/2014 05:08 PM, Alexander Graf wrote:

On 12.04.14 17:52, Alexey Kardashevskiy wrote:

This allows guests to have a different timebase origin from the host.

This is needed for migration, where a guest can migrate from one host
to another and the two hosts might have a different timebase origin.
However, the timebase seen by the guest must not go backwards, and
should go forwards only by a small amount corresponding to the time
taken for the migration.

This is only supported for recent POWER hardware which has the TBU40
(timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
970.

This adds kvm_access_one_reg() to access a special register which is not
in env->spr.

The feature must be present in the host kernel.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v5:
* fixed multiple comments in cpu_ppc_get_adjusted_tb and merged it
into timebase_post_load()
* removed round_up(1<<24) as KVM is expected to do this anyway
* removed @freq from migration stream
* renamed PPCTimebaseOffset to PPCTimebase
* CLOCKS_PER_SEC is used as a constant which 100us/s (man clock)

v4:
* made it per machine timebase offser rather than per CPU

v3:
* kvm_access_one_reg moved out to a separate patch
* tb_offset and host_timebase were replaced with guest_timebase as
the destionation does not really care of offset on the source

v2:
* bumped the vmstate_ppc_cpu version
* defined version for the env.tb_env field
---
hw/ppc/ppc.c   | 76
++
hw/ppc/spapr.c |  4 +--
include/hw/ppc/spapr.h |  1 +
target-ppc/cpu-qom.h   | 15 ++
target-ppc/kvm.c   |  5 
trace-events   |  3 ++
6 files changed, 102 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index 71df471..bf61a0a 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -29,9 +29,11 @@
#include "sysemu/cpus.h"
#include "hw/timer/m48t59.h"
#include "qemu/log.h"
+#include "qemu/error-report.h"
#include "hw/loader.h"
#include "sysemu/kvm.h"
#include "kvm_ppc.h"
+#include "trace.h"
  //#define PPC_DEBUG_IRQ
//#define PPC_DEBUG_TB
@@ -829,6 +831,80 @@ static void cpu_ppc_set_tb_clk (void *opaque,
uint32_t freq)
cpu_ppc_store_purr(cpu, 0xULL);
}
+static void timebase_pre_save(void *opaque)
+{
+PPCTimebase *tb = opaque;
+uint64_t ticks = cpu_get_real_ticks();
+PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+
+if (!first_ppc_cpu->env.tb_env) {
+error_report("No timebase object");
+return;
+}
+
+tb->time_of_the_day = get_clock_realtime() / 1000;

It'd be good to indicate in the field name that we're in us units. In fact
it probably makes sense to use ns throughout and not convert :).
Nanoseconds are QEMU's "native" internal time format.

Ooookay, I'll make it "ns", then no indicator is needed, right?

I still prefer to keep an indicator. Makes it more readable.




+/*
+ * tb_offset is only expected to be changed by migration so
+ * there is no need to update it from KVM here
+ */
+tb->guest_timebase = ticks + first_ppc_cpu->env.tb_env->tb_offset;
+}
+
+static int timebase_post_load(void *opaque, int version_id)
+{
+PPCTimebase *tb = opaque;
+CPUState *cpu;
+PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+int64_t tb_off_adj, tb_off;
+int64_t migration_duration_us, migration_duration_tb, guest_tb,
host_us;
+unsigned long freq;
+
+if (!first_ppc_cpu->env.tb_env) {
+error_report("No timebase object");
+return -1;
+}
+
+freq = first_ppc_cpu->env.tb_env->tb_freq;
+/*
+ * Calculate timebase on the destination side of migration.
+ * The destination timebase must be not less than the source
timebase.
+ * We try to adjust timebase by downtime if host clocks are not
+ * too much out of sync (1 second for now).
+ */
+host_us = get_clock_realtime() / 1000;
+migration_duration_us = MIN(CLOCKS_PER_SEC, host_us -
tb->time_of_the_day);

CLOCKS_PER_SEC only make sense with the return value of clock(). I wouldn't
be surprised if this calculation goes totally bonkers on BSD or Windows
systems.

http://linux.die.net/man/3/clock
C89, C99, POSIX.1-2001. POSIX requires that CLOCKS_PER_SEC equals 100
independent of the actual resolution.

This should be valid for BSD too. Ok, this has no effect on Windows.
Replace definition with get_ticks_per_sec()? Defining USEC_PER_SEC (or
NSEC_xxx) does not make much sense.

I'd just define NSEC_PER_SEC.


I still don't understand why we're doing a MIN() here. I would understand a
MAX() to guard against unsynchronized times on source/destination.

If I added time difference between out-of-sync hosts, the delta would be
huge and the destination guest would behave r

Re: [Qemu-devel] iPXE timeout missing

2014-04-14 Thread Gerd Hoffmann
On So, 2014-04-13 at 02:05 +0200, Sebastian wrote:
> Hi,
> 
> when trying current qemu git, I noticed the iPXE timeout has been removed:
> 
> commit 95ca557d5cfc1ef69ba9708ded552f389afe643d
> Author: Gerd Hoffmann 
> Date:   Mon Mar 25 09:07:40 2013 +0100
> 
>  ipxe: disable two second timeout
> 
> By removing the timeout, access to the iPXE command line is now 
> impossible (pressing Ctrl+B in the timeout breaks the script). The 
> command line access is (more or less) required when installing an 
> operating system from local media to a SAN device:
> 
>  > dhcp net0
>  > set keep-san 1 (keep SAN after failure)
>  > sanboot iscsi: (fails, no valid MBR)
>  > exit (boot other devices)
> 
> This way, the operating system installer (e.g. Windows Vista/7) sees the 
> SAN.
> 
> Is there a reason for removing the timeout?

Speed up boot.  Having to wait two seconds even when not booting from
the nic isn't very nice ...

Recently ipxe got two separate Ctrl-B timeouts, one which is applied at
load time, and one which is applied when actually booting from the nic.
See
https://git.ipxe.org/ipxe.git/commitdiff/27d1b40ee961d0910576a094a71dea1211254776

When updating iPXE to a version newer than that we can enable prompt
access without delaying boot for everybody.  /me puts that on the
post-2.0-release todo list.

For the time being you can build your own roms.  Go to roms/, edit
config.ipxe.general.h and set the timeout as you like, then "make pxerom
efirom".

HTH,
  Gerd





Re: [Qemu-devel] [PATCH qemu 5/6] virtio-input: control device

2014-04-14 Thread Gerd Hoffmann
  Hi,

> > What I was thinking about is sending feedback about volume changes to
> > the guest is something completely different, let me explain in detail:
> > 
> > What we have in spice today is that the guests volume change request is
> > passed to the spice client, which in turn forwards it to pulseaudio,
> > which then actually applies the volume setting to the audio stream.  And
> > you can see the guests volume changes in the hosts mixer UI (sound
> > settings -> applications):  change the volume in the guest and watch the
> > slider move.
> > 
> > Today this is a one-way ticket.  If you fiddle with the volume in the
> > hosts mixer ui the guest doesn't notice.  A emulated volume wheel would
> > be a possible way to feedback volume changes on the host to the guest.
> > Note that it isn't that simple to implement as it need a spice protocol
> > extension too (and support in spice server + client of course).

> Or forward volume keys to guest through the virtio keyboard - then we
> don't need to emulate a volume wheel.

Isn't going to fly.  Using volume up/down keys we can't says "volume is
at 42% now" to the guest.

cheers,
  Gerd





Re: [Qemu-devel] [PATCH for 2.0] ide: Correct improper smart self test counter reset in ide core.

2014-04-14 Thread Markus Armbruster
Benoît Canet  writes:

> The counter being reseted to zero make the array index negative.
> Reset it to 1.
>
> Signed-off-by: Benoit Canet 
> ---
>  hw/ide/core.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index e1dfe54..c943a4d 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -1602,7 +1602,7 @@ static bool cmd_smart(IDEState *s, uint8_t cmd)
>  case 2: /* extended self test */
>  s->smart_selftest_count++;
>  if (s->smart_selftest_count > 21) {
> -s->smart_selftest_count = 0;
> +s->smart_selftest_count = 1;
>  }
>  n = 2 + (s->smart_selftest_count - 1) * 24;
>  s->smart_selftest_data[n] = s->sector;

Good catch.

Commit message could use some love, though.  On every 21st SMART EXECUTE
OFFLINE:

* We write before a dynamically allocated buffer

  Your diff's context has one of the writes.

* We forget SMART history

  See the s->smart_selftest_count == 0 special cases in SMART READ DATA
  and SMART READ LOG.



Re: [Qemu-devel] [PATCH] Add remove_boot_device_path() function for hot-unplug device

2014-04-14 Thread Markus Armbruster
Copying Marcel.

Jun Li  writes:

> Add remove_boot_device_path() function to remove bootindex when hot-unplug
> a device. This patch fixed virtio-blk/virtio-net/scsi-disk/scsi-generic 
> device.
> So it has fixed bug1086603, ref:
> https://bugzilla.redhat.com/show_bug.cgi?id=1086603
>
> Signed-off-by: Jun Li 
> ---
>  hw/block/virtio-blk.c   |  1 +
>  hw/net/virtio-net.c |  1 +
>  hw/scsi/scsi-disk.c |  1 +
>  hw/scsi/scsi-generic.c  |  1 +
>  include/sysemu/sysemu.h |  2 ++
>  vl.c| 16 
>  6 files changed, 22 insertions(+)
>
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 8a568e5..ecdd266 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -752,6 +752,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev, 
> Error **errp)
>  unregister_savevm(dev, "virtio-blk", s);
>  blockdev_mark_auto_del(s->bs);
>  virtio_cleanup(vdev);
> +remove_boot_device_path(s->conf->bootindex, dev, "/disk@0,0");
>  }
>  
>  static Property virtio_blk_properties[] = {
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 33bd233..520c029 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -1633,6 +1633,7 @@ static void virtio_net_device_unrealize(DeviceState 
> *dev, Error **errp)
>  g_free(n->vqs);
>  qemu_del_nic(n->nic);
>  virtio_cleanup(vdev);
> +remove_boot_device_path(n->nic_conf.bootindex, dev, "/ethernet-phy@0");
>  }
>  
>  static void virtio_net_instance_init(Object *obj)
> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index 48a28ae..f7303e8 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c
> @@ -2156,6 +2156,7 @@ static void scsi_destroy(SCSIDevice *dev)
>  
>  scsi_device_purge_requests(&s->qdev, SENSE_CODE(NO_SENSE));
>  blockdev_mark_auto_del(s->qdev.conf.bs);
> +remove_boot_device_path(s->qdev.conf.bootindex, &dev->qdev, NULL);
>  }
>  
>  static void scsi_disk_resize_cb(void *opaque)
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 8d92e0d..22b9752 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -390,6 +390,7 @@ static void scsi_destroy(SCSIDevice *s)
>  {
>  scsi_device_purge_requests(s, SENSE_CODE(NO_SENSE));
>  blockdev_mark_auto_del(s->conf.bs);
> +remove_boot_device_path(s->conf.bootindex, &s->qdev, NULL);
>  }
>  
>  static int scsi_generic_initfn(SCSIDevice *s)
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index ba5c7f8..f7ad1e2 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -193,6 +193,8 @@ void rtc_change_mon_event(struct tm *tm);
>  
>  void add_boot_device_path(int32_t bootindex, DeviceState *dev,
>const char *suffix);
> +void remove_boot_device_path(int32_t bootindex, DeviceState *dev,
> + const char *suffix);
>  char *get_boot_devices_list(size_t *size, bool ignore_suffixes);
>  
>  DeviceState *get_boot_device(uint32_t position);
> diff --git a/vl.c b/vl.c
> index 9975e5a..8aaa1d6 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1184,6 +1184,22 @@ void add_boot_device_path(int32_t bootindex, 
> DeviceState *dev,
>  QTAILQ_INSERT_TAIL(&fw_boot_order, node, link);
>  }
>  
> +void remove_boot_device_path(int32_t bootindex, DeviceState *dev,
> + const char *suffix)
> +{
> +FWBootEntry *i;
> +
> +if (bootindex == -1) {
> +return;
> +}
> +
> +QTAILQ_FOREACH(i, &fw_boot_order, link)
> +if (i->bootindex == bootindex) {
> +QTAILQ_REMOVE(&fw_boot_order, i, link);
> +return;
> +}
> +}
> +
>  DeviceState *get_boot_device(uint32_t position)
>  {
>  uint32_t counter = 0;



Re: [Qemu-devel] [QEMU v5 PATCH 00/18] SMBIOS: build full tables in QEMU

2014-04-14 Thread Gerd Hoffmann
  Hi,

> harm in always producing type0).  Also, it would be nice to move up
> patch 18 to after patch 10 - that way an end-to-end test between
> old/new smbios with the new interface could be done.

Agree, that would be nice to have (although I wouldn't make that a hard
requirement for merging).

I also think we should continue providing the tables one-by-one using
the old interface, at least for a transition period, so older seabios
versions continue to work.

cheers,
  Gerd





Re: [Qemu-devel] [QEMU v5 PATCH 00/18] SMBIOS: build full tables in QEMU

2014-04-14 Thread Gerd Hoffmann
  Hi,

> 2. QEMU only generates type 0 if explicitly told to do so on the
> command line (i.e., *not* by default). In this case, SeaBIOS (or OVMF,
> or any other BIOS) would have to scan the tables and insert its own
> default type 0 if one was not purposely supplied by QEMU. (I know my
> current SeaBIOS patch always overrides type 0, and agree that's
> inconsistent with this option, and plan on fixing it :)

This is the best option IMO.

cheers,
  Gerd





Re: [Qemu-devel] [PATCH] arch_init.c: remove duplicate function

2014-04-14 Thread Laszlo Ersek
On 04/14/14 04:27, Amos Kong wrote:
> We already have a function buffer_is_zero() in util/cutils.c
> 
> Signed-off-by: Amos Kong 
> ---
>  arch_init.c | 9 ++---
>  1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/arch_init.c b/arch_init.c
> index 60c975d..342e5dc 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -152,11 +152,6 @@ int qemu_read_default_config_files(bool userconfig)
>  return 0;
>  }
>  
> -static inline bool is_zero_range(uint8_t *p, uint64_t size)
> -{
> -return buffer_find_nonzero_offset(p, size) == size;
> -}
> -
>  /* struct contains XBZRLE cache and a static page
> used by the compression */
>  static struct {
> @@ -587,7 +582,7 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
>  acct_info.dup_pages++;
>  }
>  }
> -} else if (is_zero_range(p, TARGET_PAGE_SIZE)) {
> +} else if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
>  acct_info.dup_pages++;
>  bytes_sent = save_block_hdr(f, block, offset, cont,
>  RAM_SAVE_FLAG_COMPRESS);
> @@ -983,7 +978,7 @@ static inline void *host_from_stream_offset(QEMUFile *f,
>   */
>  void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
>  {
> -if (ch != 0 || !is_zero_range(host, size)) {
> +if (ch != 0 || !buffer_is_zero(host, size)) {
>  memset(host, ch, size);
>  }
>  }
> 

This seems to be correct, functionally -- apparently buffer_is_zero()
has laxer size requirements than buffer_find_nonzero_offset(). However,
I think the latter might be faster.

For ram_save_block() I guess the difference is negligible. But
ram_handle_compressed() is also called from "migration-rdma.c", where I
can't even guess if a little bit of slowdown would count.

I'm OK with the patch if Michael (CC'd) is.

Thanks
Laszlo



Re: [Qemu-devel] [PATCH 1/5] s390x/cpu: Use ioctl to reset state in the kernel

2014-04-14 Thread Alexander Graf


On 26.02.14 12:38, Christian Borntraeger wrote:

From: Thomas Huth 

Some of the state in the kernel can not be reset from QEMU yet.
For this we've got to use the KVM_S390_INITIAL_RESET ioctl to make
sure that the state in the kernel is set to the right values during
initial CPU reset, too.

Signed-off-by: Thomas Huth 
Acked-by: Cornelia Huck 
Signed-off-by: Christian Borntraeger 
---
  target-s390x/cpu.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
index f1319e5..1a8c1cc 100644
--- a/target-s390x/cpu.c
+++ b/target-s390x/cpu.c
@@ -108,6 +108,15 @@ static void s390_cpu_initial_reset(CPUState *s)
  env->cregs[14] = CR14_RESET;
  
  env->pfault_token = -1UL;

+
+#if defined(CONFIG_KVM)
+/* Reset state inside the kernel that we cannot access yet from QEMU. */
+if (kvm_enabled()) {
+if (kvm_vcpu_ioctl(s, KVM_S390_INITIAL_RESET, NULL)) {


Could we put this into the vcpu register sync function? It gets a 
parameter that indicates when we are on RESET level, right?


That way we don't have to put kvm-only code in cpu.c :).


Alex




Re: [Qemu-devel] [PATCH microblaze/ppc v2 0/8] Remove include/hw/xilinx.h

2014-04-14 Thread Alexander Graf


On 25.02.14 09:46, Peter Crosthwaite wrote:

Ping!


Did this get applied meanwhile? Edgar?


Alex



On Thu, Feb 13, 2014 at 9:21 AM, Peter Crosthwaite
 wrote:

Hi Edgar,

This file is written for a very old qdev style of machine description.
Remove by inlining all usages of its functions (which is consistent with
the modern preference that qdev_initery should be inline and not in
cross-board helper functions).

To avoid movement of magic numbers, Define macros for IRQ and memory
maps as appropriate. This cleanup comes first, followed by the
piecewise removal of xilinx.h.

Regards,
Peter


Peter Crosthwaite (8):
   microblaze/s3adsp_1800: Define macros for irq map
   ppc/virtex_ml507: Define macros for irq/memory maps
   microblaze/ml605: Define macros for irq/memory maps
   xilinx: Inline usages of xilinx_intc_create()
   xilinx: Inline usages of xilinx_timer_create()
   xilinx: Inline usage of xilinx_ethlite_create()
   xilinx: Inline usages of xilinx_axi*_init()
   xilinx: Delete hw/include/xilinx.h

  hw/microblaze/petalogix_ml605_mmu.c  | 55 ++-
  hw/microblaze/petalogix_s3adsp1800_mmu.c | 35 ++---
  hw/ppc/virtex_ml507.c| 29 +++---
  include/hw/xilinx.h  | 90 
  4 files changed, 95 insertions(+), 114 deletions(-)
  delete mode 100644 include/hw/xilinx.h

--
1.8.5.4






Re: [Qemu-devel] [PATCH v2 0/3] target-ppc: Relax use of generic CPU name for KVM

2014-04-14 Thread Alexander Graf


On 11.04.14 19:34, Alexey Kardashevskiy wrote:

This enables generic CPU family name use with "-cpu" for all CPUs
across the family and not for the ones already defined in QEMU.

Changes:
v2:
* split one patch into 3
* added assert into ppc_cpu_get_family_class()


Thanks, applied to ppc-next.


Alex




Re: [Qemu-devel] [PATCH 1/2] QEMU: PPC: specify PVRs for all e500 cores

2014-04-14 Thread Alexander Graf


On 14.02.14 20:22, Stuart Yoder wrote:

From: Stuart Yoder 

Signed-off-by: Stuart Yoder 
---
  target-ppc/cpu-models.c |   64 +--
  target-ppc/cpu-models.h |   30 --
  2 files changed, 90 insertions(+), 4 deletions(-)

diff --git a/target-ppc/cpu-models.c b/target-ppc/cpu-models.c
index f6c9b3a..9cf603b 100644
--- a/target-ppc/cpu-models.c
+++ b/target-ppc/cpu-models.c
@@ -677,19 +677,79 @@
  "PowerPC e500 v2.0 core")
  POWERPC_DEF("e500v2_v10",CPU_POWERPC_e500v2_v10, e500v2,
  "PowerPC e500v2 v1.0 core")
+POWERPC_DEF("e500v2_v11",CPU_POWERPC_e500v2_v11, e500v2,
+"PowerPC e500v2 v1.1 core")
  POWERPC_DEF("e500v2_v20",CPU_POWERPC_e500v2_v20, e500v2,
  "PowerPC e500v2 v2.0 core")
  POWERPC_DEF("e500v2_v21",CPU_POWERPC_e500v2_v21, e500v2,
  "PowerPC e500v2 v2.1 core")
  POWERPC_DEF("e500v2_v22",CPU_POWERPC_e500v2_v22, e500v2,
  "PowerPC e500v2 v2.2 core")
+POWERPC_DEF("e500v2_v23",CPU_POWERPC_e500v2_v23, e500v2,
+"PowerPC e500v2 v2.3 core")
  POWERPC_DEF("e500v2_v30",CPU_POWERPC_e500v2_v30, e500v2,
  "PowerPC e500v2 v3.0 core")
+POWERPC_DEF("e500v2_v31",CPU_POWERPC_e500v2_v31, e500v2,
+"PowerPC e500v2 v3.1 core")
+POWERPC_DEF("e500v2_v1040", CPU_POWERPC_e500v2_v1040,   e500v2,
+"PowerPC e500v2 v104.0 core")
+POWERPC_DEF("e500v2_v1050", CPU_POWERPC_e500v2_v1050,   e500v2,
+"PowerPC e500v2 v105.0 core")
+POWERPC_DEF("e500v2_v1051", CPU_POWERPC_e500v2_v1051,   e500v2,
+"PowerPC e500v2 v105.1 core")
+POWERPC_DEF("e500v2_v1151", CPU_POWERPC_e500v2_v1151,   e500v2,
+"PowerPC e500v2 v115.1 core")
+POWERPC_DEF("e500v2_v1152", CPU_POWERPC_e500v2_v1152,   e500v2,
+"PowerPC e500v2 v115.2 core")
+POWERPC_DEF("e500v2_v2050", CPU_POWERPC_e500v2_v2050,   e500v2,
+"PowerPC e500v2 v205.0 core")
+POWERPC_DEF("e500v2_v2051", CPU_POWERPC_e500v2_v2051,   e500v2,
+"PowerPC e500v2 v205.1 core")
+POWERPC_DEF("e500v2_v2151", CPU_POWERPC_e500v2_v2151,   e500v2,
+"PowerPC e500v2 v215.1 core")
+POWERPC_DEF("e500v2_v2152", CPU_POWERPC_e500v2_v2152,   e500v2,
+"PowerPC e500v2 v215.2 core")
+POWERPC_DEF("e500v2_v2251", CPU_POWERPC_e500v2_v2251,   e500v2,
+"PowerPC e500v2 v225.1 core")
+
+/* e500mc family */
  POWERPC_DEF_SVR("e500mc", "e500mc",
-CPU_POWERPC_e500mc,   POWERPC_SVR_E500,  e500mc)
+CPU_POWERPC_e500mc_v20,   POWERPC_SVR_E500,  e500mc)


How about we use aliases instead like the other CPU types? :)


+POWERPC_DEF_SVR("e500mc_v10", "PowerPC e500mc v1.0 core",
+CPU_POWERPC_e500mc_v10,   POWERPC_SVR_E500,  e500mc)
+POWERPC_DEF_SVR("e500mc_v21", "PowerPC e500mc v2.1 core",
+CPU_POWERPC_e500mc_v21,   POWERPC_SVR_E500,  e500mc)
+POWERPC_DEF_SVR("e500mc_v22", "PowerPC e500mc v2.2 core",
+CPU_POWERPC_e500mc_v22,   POWERPC_SVR_E500,  e500mc)
+POWERPC_DEF_SVR("e500mc_v23", "PowerPC e500mc v2.3 core",
+CPU_POWERPC_e500mc_v23,   POWERPC_SVR_E500,  e500mc)
+POWERPC_DEF_SVR("e500mc_v1030", "PowerPC e500mc v103.0 core",
+CPU_POWERPC_e500mc_v1030, POWERPC_SVR_E500,  e500mc)
+POWERPC_DEF_SVR("e500mc_v30", "PowerPC e500mc v3.0 core",
+CPU_POWERPC_e500mc_v30,   POWERPC_SVR_E500,  e500mc)
+POWERPC_DEF_SVR("e500mc_v31", "PowerPC e500mc v3.1 core",
+CPU_POWERPC_e500mc_v31,   POWERPC_SVR_E500,  e500mc)
+POWERPC_DEF_SVR("e500mc_v32", "PowerPC e500mc v3.2 core",
+CPU_POWERPC_e500mc_v32,   POWERPC_SVR_E500,  e500mc)
+
  #ifdef TARGET_PPC64
+/* e5500 family */
  POWERPC_DEF_SVR("e5500", "e5500",
-CPU_POWERPC_e5500,POWERPC_SVR_E500,  e5500)


Same here


Alex


+CPU_POWERPC_e5500_v12,POWERPC_SVR_E500,  e5500)
+POWERPC_DEF_SVR("e5500_v10", "PowerPC e5500 v1.0 core",
+CPU_POWERPC_e5500_v10,POWERPC_SVR_E500,  e5500)
+POWERPC_DEF_SVR("e5500_v11", "PowerPC e5500 v1.1 core",
+CPU_POWERPC_e5500_v11,POWERPC_SVR_E500,  e5500)
+POWERPC_DEF_SVR("e5500_v1020", "PowerPC e5500 v102.0 core",
+CPU_POWERPC_e5500_v1020,POWERPC_SVR_E500,e5500)
+
+/* e6500 family */
+POWERPC_DEF_SVR("e6500", "e6500",
+CPU_POWERPC_e6500_v10,POWERPC_SVR_E500,  e6500)
+POWERPC_D

Re: [Qemu-devel] [PATCH 2/2] QEMU: PPC: set default cpu type to be 'host'

2014-04-14 Thread Alexander Graf


On 14.02.14 20:22, Stuart Yoder wrote:

From: Stuart Yoder 

-for KVM we always want the cpu to be that of the
  host system, so make that the default

-for TGC mode, the emulated cpu type should be explicitly
  set

Signed-off-by: Stuart Yoder 


Could you please indicate in the subject line that we're talking about 
e500? Otherwise this might be confusing for people who expect IBM POWER 
changes.


Otherwise I agree with the patches :).


Alex




Re: [Qemu-devel] [PATCH] spapr-pci: remove io ports workaround

2014-04-14 Thread Alexander Graf


On 10.02.14 06:12, Alexey Kardashevskiy wrote:

On 02/08/2014 01:06 AM, Alexander Graf wrote:

On 07.02.2014, at 14:44, Greg Kurz  wrote:


In the past, IO space could not be mapped into the memory address space
so we introduced a workaround for that. Nowadays it does not look
necessary so we can remove the workaround and make sPAPR PCI
configuration simplier.

Signed-off-by: Greg Kurz 

Alexey, I would like at least your ack for this. This patch would be the 3rd or 
4th attempt to do this shift and every time we saw unexpected breakage :).


Seems to be all right, cannot find any problem this time :) Thanks!

Acked-by: Alexey Kardashevskiy 


Thanks, applied to ppc-next.


Alex




Re: [Qemu-devel] [PATCH 4/4] spapr: Add support for time base offset migration

2014-04-14 Thread Benjamin Herrenschmidt
On Mon, 2014-04-14 at 08:32 +0200, Alexander Graf wrote:
> So for chips that might be configurable, configuration has to happen via 
> some SPR, right? Wouldn't it make sense to transfer that configuration 
> SPR on these and thus the configuration as well?

Well, on a real 970 I can always go poke the clock chip via i2c but
I doubt we'll ever emulate that ;-)

> Really, the main problem I see with adding the tb frequency into the 
> migration stream is that we'd change the migration format. Worst case it 
> would be a few useless bytes on migration - not a big deal. But I don't 
> think we support any case where it would actually be required - it'd 
> only be a safety net.

Indeed, no biggie either way then.

Cheers,
Ben.





Re: [Qemu-devel] [PATCH 4/4] spapr: Add support for time base offset migration

2014-04-14 Thread Benjamin Herrenschmidt
On Mon, 2014-04-14 at 08:37 +0200, Alexander Graf wrote:
> So we either have to create some way to make interrupt numbering a 
> function of something very simple we plug the PHB into, like a virtual 
> pseries slot number which we multiply by x to get an irq number range. 
> Or we'd have to manually link up PHB IRQ lines to XICS IRQ lines on the 
> command line.

We'd have the same problem with the PHB address ranges etc... and the BUID.

So I'd recommend just making it all a function of the BUID which is the
way PAPR identifies a PHB and have it be specified on the command line.

Ben.




Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH] smp: autodetect numbers of threads per core

2014-04-14 Thread Alexander Graf


On 15.11.13 17:58, Alexey Kardashevskiy wrote:

On 16.11.2013 0:15, Alexander Graf wrote:


Am 15.11.2013 um 00:12 schrieb Alexey Kardashevskiy :


At the moment only a whole CPU core can be assigned to a KVM. Since
POWER7/8 support several threads per core, we want all threads of a core
to go to the same KVM so every time we run QEMU with -enable-kvm on
POWER, we have to add -smp X,threads=(4|8)" (4 for POWER7 and
8 for POWER8).

This patch tries to read smp_threads number from an accelerator and
falls back to the default value (1) if the accelerator did not care
to change it.

Signed-off-by: Alexey Kardashevskiy 
---

(!!!)

The usual question - what would be the normal way of doing this?
What does this patch break? I cannot think of anything right now.

Is this really what the user wants? On p7 you can run in no-smt, smt2
and smt4 mode. Today we simply default to no-smt. Changing defaults
is usually a bad thing.


Defaulting to 1 thread on P7 is a bad thing (other threads stay unused -
what is good about this?) and the only reason which I know why it is
still threads=1 is that it is hard to make a patch for upstream to
change this default.


threads=1 improves single-thread performance significantly. The thread 
itself is faster when it runs in SMT1 mode. Also we don't have to kick 
other threads out of the guest context, making every guest/host 
transition faster.


Overall, it's really just a random default. I'm not sure it makes a lot 
of sense to change it.


However, could we be really smart here? How does ppc64_cpu --smt=off 
work? It only turns off the unused vcpus, right? Is there any way we 
could actually not even enter unused vcpus at all? Then we could indeed 
always expose the maximum available number of threads to the guest and 
let that one decide what to do.



Alex




Re: [Qemu-devel] [PATCH 4/4] spapr: Add support for time base offset migration

2014-04-14 Thread Benjamin Herrenschmidt
On Mon, 2014-04-14 at 17:12 +1000, Alexey Kardashevskiy wrote:
> Our additional PHBs require @index property which is used to calculate
> base
> MMIO and IO ranges. That we could use for LSI/MSI too.

That works. We need to decide how much interrupt "space" we give to
each.

The HW on P8 does 2048 MSIs plus the LSIs but we don't have to copy
that. The HW does that many because it can be split into 256 PE's but
our virtual PHBs represent a single PE#.

I think 4 LSIs + 256 MSIs is a reasonable number and should be fairly
future proof.

The numbers can be interleaved. IE the number space can be made to
look like

PHB0 LSIs
PHB1 LSIs
PHB2 LSIs

PHB0 MSIs
PHB1 MSIs
PHB2 MSIs

So the numbering remains compact (I would recommend doing that so the
ranges themselves remains power-of-2 sized).

I don't remember how we divide our interrupts into ranges/BUIDs
internally, so we might need to chose something that matches what we
do there a bit better, we'll see.

Cheers,
Ben.





Re: [Qemu-devel] [PATCH 4/4] spapr: Add support for time base offset migration

2014-04-14 Thread Benjamin Herrenschmidt
On Mon, 2014-04-14 at 21:14 +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2014-04-14 at 17:12 +1000, Alexey Kardashevskiy wrote:
> > Our additional PHBs require @index property which is used to calculate
> > base
> > MMIO and IO ranges. That we could use for LSI/MSI too.
> 
> That works. We need to decide how much interrupt "space" we give to
> each.
> 
> The HW on P8 does 2048 MSIs plus the LSIs but we don't have to copy
> that. The HW does that many because it can be split into 256 PE's but
> our virtual PHBs represent a single PE#.

Note that this value of "256" should itself be an overridable machine
level configuration so we can bump it later if it ever becomes necessary
and still create "backward compatible" VMs.

Cheers,
Ben.





Re: [Qemu-devel] [PATCH v6 19/37] target-arm: Implement AArch64 EL1 exception handling

2014-04-14 Thread Peter Maydell
On 14 April 2014 07:04, Peter Crosthwaite  wrote:
> On Fri, Apr 11, 2014 at 2:15 AM, Peter Maydell  
> wrote:
>> From: Rob Herring 
>> +switch (cs->exception_index) {
>> +case EXCP_PREFETCH_ABORT:
>> +case EXCP_DATA_ABORT:
>> +qemu_log_mask(CPU_LOG_INT, "...with FAR 0x%" PRIx64 "\n",
>> +  env->cp15.far_el1);
>
> need a /* fallthrough */

I think a simple 'break;' would be better, actually.

> otherwise:
>
> Reviewed-by: Peter Crosthwaite 

thanks
-- PMM



Re: [Qemu-devel] [PATCH v3 01/26] tcg-aarch64: Properly detect SIGSEGV writes

2014-04-14 Thread Claudio Fontana
On 07.04.2014 18:33, Richard Henderson wrote:
> On 04/07/2014 12:58 AM, Claudio Fontana wrote:
>>> +|| (insn & 0x3bc0) == 0x2840   /* C3.3.7 */
>>
>> I think the Load (L) bit should be 0 here so
>>
>> == 0x2800
> 
> Oops.  Fixed.
> 
>>
>>> +|| (insn & 0x3be00c00) == 0x38000400   /* C3.3.8 */
>>
>> With V=1, an opc of 0b10 is also a write, I think. It's the 128bit FP/SIMD 
>> STR.
> 
> Exactly, that's why I'm masking it out, to ignore it.
> 
>  insn  =  size 1 1   1 v 0 0 ...
>  mask  =   0 0 1 1   1 0 1 1 ...  = 0x3b...
>  equal =   0 0 1 1   1 0 0 0 ...  = 0x38...
> 
> 
> r~
> 

the problem is not in the two nibbles you show, but in the third nibble:
31 30 29 28  27 26 25 24  23 22 21 20
 size  1  1   1  v  0  0   opc   0  x

the third nibble in your mask is 'E' and the expected result is 0, which forces 
opc to be 0 for writes.
However, for 128bit SIMD STR, the opc is 2 (0b10).

Ciao,

Claudio





Re: [Qemu-devel] [PATCH v5 22/37] hw/arm/virt: Add support for Cortex-A57

2014-04-14 Thread Peter Maydell
On 14 April 2014 07:10, Peter Crosthwaite  wrote:
> On Sat, Mar 29, 2014 at 2:10 AM, Peter Maydell  
> wrote:
>> Support the Cortex-A57 in the virt machine model.
>>
>> Signed-off-by: Peter Maydell 
>> ---
>> This should perhaps not be just stealing the a15mpcore_priv
>> on the basis that it's a GICv2...
>
> Yeh, I still think its a minimal change to just instantiate a standalone gic.

I think a standalone GIC would be really awkward in virt.c,
but I'm inclined towards having an a57mpcore_priv which
provides a 64K-aligned GICv2 (ie, as per the server base
system architecture doc, the 4K GIC register banks get
aliased so they appear 16 times in a 64K space).

thanks
-- PMM



Re: [Qemu-devel] [PATCH v6 00/37] AArch64 system emulation

2014-04-14 Thread Peter Maydell
On 14 April 2014 07:39, Peter Crosthwaite  wrote:
> Im finished with the review. There's a few areas that I'm not very
> familiar with that Ive skipped or others have reviewed. So for all
> un-commented patches:
>
> Acked-by: Peter Crosthwaite 

Thanks! I've made the minor tweaks you suggested. We still
need to sort out how the GIC appears in the virt model, but
everything else looks good to go, so I'll put that all into
a pull request, and then we will have the CPU emulation
itself all in master, and can sort out a57mpcore.c/virt.c
as a separate and much smaller patchset.

thanks
-- PMM



[Qemu-devel] [PATCH v7 0/8] virtio endian-ambivalent target fixes

2014-04-14 Thread Greg Kurz
Hi,

Here's the latest patchset to have virtio working with targets
that can change endianness. It is based on 2.0-rc2.

It is also available at github:

https://github.com/gkurz/qemu/commits/virtio

This serie addresses all the remarks that I got from v6. It
also brings some more fixes: TCP checksums and virtio PCI.
It introduces an alternative way to support migrations
without the need to change the current layout.

Please comment and/or apply. :)

---
Greg

Greg Kurz (1):
  virtio-9p: use virtio wrappers to access headers

Rusty Russell (7):
  virtio: endian-ambivalent targets using legacy virtio
  virtio: allow byte swapping for vring and config access
  virtio-net: use virtio wrappers to access headers
  virtio-balloon: use virtio wrappers to access page frame numbers
  virtio-blk: use virtio wrappers to access headers
  virtio-scsi: use virtio wrappers to access headers
  virtio-serial-bus: use virtio wrappers to access headers


 exec.c|2 -
 hw/9pfs/Makefile.objs |3 -
 hw/9pfs/virtio-9p-device.c|3 +
 hw/block/Makefile.objs|2 -
 hw/block/virtio-blk.c |   38 ++
 hw/char/Makefile.objs |3 -
 hw/char/virtio-serial-bus.c   |   46 +++-
 hw/net/Makefile.objs  |2 -
 hw/net/virtio-net.c   |   34 +++--
 hw/scsi/Makefile.objs |2 -
 hw/scsi/virtio-scsi.c |   38 +-
 hw/virtio/Makefile.objs   |2 -
 hw/virtio/virtio-balloon.c|8 +-
 hw/virtio/virtio-pci.c|   11 +--
 hw/virtio/virtio.c|  123 ++---
 include/hw/virtio/virtio-access.h |  138 +
 include/hw/virtio/virtio.h|2 +
 vl.c  |4 +
 18 files changed, 339 insertions(+), 122 deletions(-)
 create mode 100644 include/hw/virtio/virtio-access.h




Re: [Qemu-devel] [PATCH 1/5] s390x/cpu: Use ioctl to reset state in the kernel

2014-04-14 Thread Thomas Huth
On Mon, 14 Apr 2014 12:14:49 +0200
Alexander Graf  wrote:

> 
> On 26.02.14 12:38, Christian Borntraeger wrote:
> > From: Thomas Huth 
> >
> > Some of the state in the kernel can not be reset from QEMU yet.
> > For this we've got to use the KVM_S390_INITIAL_RESET ioctl to make
> > sure that the state in the kernel is set to the right values during
> > initial CPU reset, too.
> >
> > Signed-off-by: Thomas Huth 
> > Acked-by: Cornelia Huck 
> > Signed-off-by: Christian Borntraeger 
> > ---
> >   target-s390x/cpu.c | 9 +
> >   1 file changed, 9 insertions(+)
> >
> > diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
> > index f1319e5..1a8c1cc 100644
> > --- a/target-s390x/cpu.c
> > +++ b/target-s390x/cpu.c
> > @@ -108,6 +108,15 @@ static void s390_cpu_initial_reset(CPUState *s)
> >   env->cregs[14] = CR14_RESET;
> >   
> >   env->pfault_token = -1UL;
> > +
> > +#if defined(CONFIG_KVM)
> > +/* Reset state inside the kernel that we cannot access yet from QEMU. 
> > */
> > +if (kvm_enabled()) {
> > +if (kvm_vcpu_ioctl(s, KVM_S390_INITIAL_RESET, NULL)) {
> 
> Could we put this into the vcpu register sync function? It gets a 
> parameter that indicates when we are on RESET level, right?

No, sorry, as far as I can see, this is not that easily possible:

On S390, we've got five different levels of reset (CPU reset,
Initial CPU reset, Subsystem reset, Clear reset & Power-on reset).
The ioctl is about initial CPU reset only, while the register sync
function flag is rather only used by system/clear reset instead.

So for example when the guest OS sends a SIGP INITIAL CPU RESET from
one CPU to another, only the s390_cpu_initial_reset() will be called,
but not the register sync with KVM_PUT_RESET_STATE.

 Thomas




[Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Greg Kurz
From: Rusty Russell 

virtio data structures are defined as "target endian", which assumes
that's a fixed value.  In fact, that actually means it's platform-specific.
The OASIS virtio 1.0 spec will fix this, by making it all little endian.

We introduce memory accessors to be used accross the virtio code where
needed. These accessors should support both legacy and 1.0 devices.
A good way to do it is to introduce a per-device property to store the
endianness. We choose to set this flag at device reset time because it
is reasonnable to assume the endianness won't change unless we reboot or
kexec another kernel. And it is also reasonnable to assume the new kernel
will reset the devices before using them (otherwise it will break).

We reuse the virtio_is_big_endian() helper since it provides the right
value for legacy devices with most of the targets, that have fixed
endianness. It can then be overriden to support endian-ambivalent targets.

To support migration, we need to set the flag in virtio_load() as well.

(a) One solution would be to add it to the stream, but it have some
drawbacks:
- since this only affects a few targets, the field should be put into a
  subsection
- virtio migration code should be ported to vmstate to be able to introduce
  such a subsection

(b) If we assume the following to be true:
- target endianness falls under some cpu state
- cpu state is always restored before virtio devices state because they
  get initialized in this order in main().
Then an alternative is to rely on virtio_is_big_endian() again at
load time. No need to mess around with the migration stream in this case.

This patch implements (b).

Note that the tswap helpers are implemented in virtio.c so that
virtio-access.h stays platform independant. Most of the virtio code
will be buildable under common-obj instead of obj then, and spare
some cycles when building for multiple targets.

Signed-off-by: Rusty Russell 
[ ldq_phys() API change,
  relicensed virtio-access.h to GPLv2+ on Rusty's request,
  introduce a per-device is_big_endian flag (supersedes needs_byteswap global)
  add VirtIODevice * arg to virtio helpers,
  use the existing virtio_is_big_endian() helper,
  virtio-pci: use the device is_big_endian flag,
  introduce virtio tswap16 and tswap64 helpers,
  move calls to tswap* out of virtio-access.h to make it platform independant,
  migration support,
  Greg Kurz  ]
Cc: Cédric Le Goater 
Signed-off-by: Greg Kurz 
---

Changes since v6:
- merge the virtio_needs_byteswap() helper from v6 and existing
  virtio_is_big_endian()
- virtio-pci: now supports endianness changes
- virtio-access.h fixes (target independant)

 exec.c|2 -
 hw/virtio/Makefile.objs   |2 -
 hw/virtio/virtio-pci.c|   11 +--
 hw/virtio/virtio.c|   35 +
 include/hw/virtio/virtio-access.h |  138 +
 include/hw/virtio/virtio.h|2 +
 vl.c  |4 +
 7 files changed, 185 insertions(+), 9 deletions(-)
 create mode 100644 include/hw/virtio/virtio-access.h

diff --git a/exec.c b/exec.c
index 91513c6..e6777d0 100644
--- a/exec.c
+++ b/exec.c
@@ -42,6 +42,7 @@
 #else /* !CONFIG_USER_ONLY */
 #include "sysemu/xen-mapcache.h"
 #include "trace.h"
+#include "hw/virtio/virtio.h"
 #endif
 #include "exec/cpu-all.h"
 
@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
  * A helper function for the _utterly broken_ virtio device model to find out 
if
  * it's running on a big endian machine. Don't do this at home kids!
  */
-bool virtio_is_big_endian(void);
 bool virtio_is_big_endian(void)
 {
 #if defined(TARGET_WORDS_BIGENDIAN)
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 1ba53d9..68c3064 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
 common-obj-y += virtio-mmio.o
 common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
 
-obj-y += virtio.o virtio-balloon.o 
+obj-y += virtio.o virtio-balloon.o
 obj-$(CONFIG_LINUX) += vhost.o
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ce97514..82a1689 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -89,9 +89,6 @@
 /* Flags track per-device state like workarounds for quirks in older guests. */
 #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
 
-/* HACK for virtio to determine if it's running a big endian guest */
-bool virtio_is_big_endian(void);
-
 static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
VirtIOPCIProxy *dev);
 
@@ -409,13 +406,13 @@ static uint64_t virtio_pci_config_read(void *opaque, 
hwaddr addr,
 break;
 case 2:
 val = virtio_config_readw(vdev, addr);
-if (virtio_is_big_endian()) {
+if (vdev->is_big_endian) {
 val = bswap16(val);
 }
 break;
 case 4:
 val = virtio_config_readl

[Qemu-devel] [PATCH v7 2/8] virtio: allow byte swapping for vring and config access

2014-04-14 Thread Greg Kurz
From: Rusty Russell 

This is based on a simpler patch by Anthony Liguouri, which only handled
the vring accesses.  We also need some drivers to access these helpers,
eg. for data which contains headers.

Signed-off-by: Rusty Russell 
[ add VirtIODevice * argument to most helpers,
  Greg Kurz  ]
Signed-off-by: Greg Kurz 
Reviewed-by: Thomas Huth 
---

Changes since v6:
 - argument re-ordering

 hw/virtio/virtio.c |   88 
 1 file changed, 47 insertions(+), 41 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index bb646f0..163de1c 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -102,53 +102,56 @@ static void virtqueue_init(VirtQueue *vq)
  vq->vring.align);
 }
 
-static inline uint64_t vring_desc_addr(hwaddr desc_pa, int i)
+static inline uint64_t vring_desc_addr(VirtIODevice *vdev, hwaddr desc_pa,
+   int i)
 {
 hwaddr pa;
 pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, addr);
-return ldq_phys(&address_space_memory, pa);
+return virtio_ldq_phys(vdev, pa);
 }
 
-static inline uint32_t vring_desc_len(hwaddr desc_pa, int i)
+static inline uint32_t vring_desc_len(VirtIODevice *vdev, hwaddr desc_pa, int 
i)
 {
 hwaddr pa;
 pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, len);
-return ldl_phys(&address_space_memory, pa);
+return virtio_ldl_phys(vdev, pa);
 }
 
-static inline uint16_t vring_desc_flags(hwaddr desc_pa, int i)
+static inline uint16_t vring_desc_flags(VirtIODevice *vdev, hwaddr desc_pa,
+int i)
 {
 hwaddr pa;
 pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, flags);
-return lduw_phys(&address_space_memory, pa);
+return virtio_lduw_phys(vdev, pa);
 }
 
-static inline uint16_t vring_desc_next(hwaddr desc_pa, int i)
+static inline uint16_t vring_desc_next(VirtIODevice *vdev, hwaddr desc_pa,
+   int i)
 {
 hwaddr pa;
 pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, next);
-return lduw_phys(&address_space_memory, pa);
+return virtio_lduw_phys(vdev, pa);
 }
 
 static inline uint16_t vring_avail_flags(VirtQueue *vq)
 {
 hwaddr pa;
 pa = vq->vring.avail + offsetof(VRingAvail, flags);
-return lduw_phys(&address_space_memory, pa);
+return virtio_lduw_phys(vq->vdev, pa);
 }
 
 static inline uint16_t vring_avail_idx(VirtQueue *vq)
 {
 hwaddr pa;
 pa = vq->vring.avail + offsetof(VRingAvail, idx);
-return lduw_phys(&address_space_memory, pa);
+return virtio_lduw_phys(vq->vdev, pa);
 }
 
 static inline uint16_t vring_avail_ring(VirtQueue *vq, int i)
 {
 hwaddr pa;
 pa = vq->vring.avail + offsetof(VRingAvail, ring[i]);
-return lduw_phys(&address_space_memory, pa);
+return virtio_lduw_phys(vq->vdev, pa);
 }
 
 static inline uint16_t vring_used_event(VirtQueue *vq)
@@ -160,44 +163,44 @@ static inline void vring_used_ring_id(VirtQueue *vq, int 
i, uint32_t val)
 {
 hwaddr pa;
 pa = vq->vring.used + offsetof(VRingUsed, ring[i].id);
-stl_phys(&address_space_memory, pa, val);
+virtio_stl_phys(vq->vdev, pa, val);
 }
 
 static inline void vring_used_ring_len(VirtQueue *vq, int i, uint32_t val)
 {
 hwaddr pa;
 pa = vq->vring.used + offsetof(VRingUsed, ring[i].len);
-stl_phys(&address_space_memory, pa, val);
+virtio_stl_phys(vq->vdev, pa, val);
 }
 
 static uint16_t vring_used_idx(VirtQueue *vq)
 {
 hwaddr pa;
 pa = vq->vring.used + offsetof(VRingUsed, idx);
-return lduw_phys(&address_space_memory, pa);
+return virtio_lduw_phys(vq->vdev, pa);
 }
 
 static inline void vring_used_idx_set(VirtQueue *vq, uint16_t val)
 {
 hwaddr pa;
 pa = vq->vring.used + offsetof(VRingUsed, idx);
-stw_phys(&address_space_memory, pa, val);
+virtio_stw_phys(vq->vdev, pa, val);
 }
 
 static inline void vring_used_flags_set_bit(VirtQueue *vq, int mask)
 {
+VirtIODevice *vdev = vq->vdev;
 hwaddr pa;
 pa = vq->vring.used + offsetof(VRingUsed, flags);
-stw_phys(&address_space_memory,
- pa, lduw_phys(&address_space_memory, pa) | mask);
+virtio_stw_phys(vdev, pa, virtio_lduw_phys(vdev, pa) | mask);
 }
 
 static inline void vring_used_flags_unset_bit(VirtQueue *vq, int mask)
 {
+VirtIODevice *vdev = vq->vdev;
 hwaddr pa;
 pa = vq->vring.used + offsetof(VRingUsed, flags);
-stw_phys(&address_space_memory,
- pa, lduw_phys(&address_space_memory, pa) & ~mask);
+virtio_stw_phys(vdev, pa, virtio_lduw_phys(vdev, pa) & ~mask);
 }
 
 static inline void vring_avail_event(VirtQueue *vq, uint16_t val)
@@ -207,7 +210,7 @@ static inline void vring_avail_event(VirtQueue *vq, 
uint16_t val)
 return;
 }
 pa = vq->vring.used + offsetof(VRingUsed, ring[vq->vring.num]);
-stw_phys(&address_space_memory, pa, val);
+virtio_stw_phys(vq->vdev, pa, val);
 }
 

[Qemu-devel] [PATCH v7 3/8] virtio-net: use virtio wrappers to access headers

2014-04-14 Thread Greg Kurz
From: Rusty Russell 

virtio-net gets target independant.

Signed-off-by: Rusty Russell 
Reviewed-by: Anthony Liguori 
[ pass VirtIODevice * to memory accessors,
  TCP checksums fix by Cédric Le Goater,
  target independant,
  Greg Kurz  ]
Cc: Cédric Le Goater 
Signed-off-by: Greg Kurz 
---

Changes since v6:
 - target independant
 - TCP checksums fix

 hw/net/Makefile.objs |2 +-
 hw/net/virtio-net.c  |   34 +++---
 2 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/hw/net/Makefile.objs b/hw/net/Makefile.objs
index ea93293..cd7060c 100644
--- a/hw/net/Makefile.objs
+++ b/hw/net/Makefile.objs
@@ -23,6 +23,7 @@ common-obj-$(CONFIG_ALLWINNER_EMAC) += allwinner_emac.o
 common-obj-$(CONFIG_CADENCE) += cadence_gem.o
 common-obj-$(CONFIG_STELLARIS_ENET) += stellaris_enet.o
 common-obj-$(CONFIG_LANCE) += lance.o
+common-obj-$(CONFIG_VIRTIO) += virtio-net.o
 
 obj-$(CONFIG_ETRAXFS) += etraxfs_eth.o
 obj-$(CONFIG_COLDFIRE) += mcf_fec.o
@@ -30,7 +31,6 @@ obj-$(CONFIG_MILKYMIST) += milkymist-minimac2.o
 obj-$(CONFIG_PSERIES) += spapr_llan.o
 obj-$(CONFIG_XILINX_ETHLITE) += xilinx_ethlite.o
 
-obj-$(CONFIG_VIRTIO) += virtio-net.o
 obj-y += vhost_net.o
 
 obj-$(CONFIG_ETSEC) += fsl_etsec/etsec.o fsl_etsec/registers.o \
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 439477b..0192e4c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -23,6 +23,7 @@
 #include "hw/virtio/virtio-bus.h"
 #include "qapi/qmp/qjson.h"
 #include "monitor/monitor.h"
+#include "hw/virtio/virtio-access.h"
 
 #define VIRTIO_NET_VM_VERSION11
 
@@ -72,8 +73,8 @@ static void virtio_net_get_config(VirtIODevice *vdev, uint8_t 
*config)
 VirtIONet *n = VIRTIO_NET(vdev);
 struct virtio_net_config netcfg;
 
-stw_p(&netcfg.status, n->status);
-stw_p(&netcfg.max_virtqueue_pairs, n->max_queues);
+virtio_stw_p(vdev, &netcfg.status, n->status);
+virtio_stw_p(vdev, &netcfg.max_virtqueue_pairs, n->max_queues);
 memcpy(netcfg.mac, n->mac, ETH_ALEN);
 memcpy(config, &netcfg, n->config_size);
 }
@@ -611,6 +612,7 @@ static int virtio_net_handle_offloads(VirtIONet *n, uint8_t 
cmd,
 static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
  struct iovec *iov, unsigned int iov_cnt)
 {
+VirtIODevice *vdev = VIRTIO_DEVICE(n);
 struct virtio_net_ctrl_mac mac_data;
 size_t s;
 NetClientState *nc = qemu_get_queue(n->nic);
@@ -639,7 +641,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
 
 s = iov_to_buf(iov, iov_cnt, 0, &mac_data.entries,
sizeof(mac_data.entries));
-mac_data.entries = ldl_p(&mac_data.entries);
+mac_data.entries = virtio_ldl_p(vdev, &mac_data.entries);
 if (s != sizeof(mac_data.entries)) {
 goto error;
 }
@@ -666,7 +668,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
 
 s = iov_to_buf(iov, iov_cnt, 0, &mac_data.entries,
sizeof(mac_data.entries));
-mac_data.entries = ldl_p(&mac_data.entries);
+mac_data.entries = virtio_ldl_p(vdev, &mac_data.entries);
 if (s != sizeof(mac_data.entries)) {
 goto error;
 }
@@ -706,12 +708,13 @@ error:
 static int virtio_net_handle_vlan_table(VirtIONet *n, uint8_t cmd,
 struct iovec *iov, unsigned int 
iov_cnt)
 {
+VirtIODevice *vdev = VIRTIO_DEVICE(n);
 uint16_t vid;
 size_t s;
 NetClientState *nc = qemu_get_queue(n->nic);
 
 s = iov_to_buf(iov, iov_cnt, 0, &vid, sizeof(vid));
-vid = lduw_p(&vid);
+vid = virtio_lduw_p(vdev, &vid);
 if (s != sizeof(vid)) {
 return VIRTIO_NET_ERR;
 }
@@ -748,7 +751,7 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
 return VIRTIO_NET_ERR;
 }
 
-queues = lduw_p(&mq.virtqueue_pairs);
+queues = virtio_lduw_p(vdev, &mq.virtqueue_pairs);
 
 if (queues < VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN ||
 queues > VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX ||
@@ -863,6 +866,14 @@ static int virtio_net_has_buffers(VirtIONetQueue *q, int 
bufsize)
 return 1;
 }
 
+static void virtio_net_hdr_swap(VirtIODevice *vdev, struct virtio_net_hdr *hdr)
+{
+virtio_tswap16s(vdev, &hdr->hdr_len);
+virtio_tswap16s(vdev, &hdr->gso_size);
+virtio_tswap16s(vdev, &hdr->csum_start);
+virtio_tswap16s(vdev, &hdr->csum_offset);
+}
+
 /* dhclient uses AF_PACKET but doesn't pass auxdata to the kernel so
  * it never finds out that the packets don't have valid checksums.  This
  * causes dhclient to get upset.  Fedora's carried a patch for ages to
@@ -898,6 +909,7 @@ static void receive_header(VirtIONet *n, const struct iovec 
*iov, int iov_cnt,
 void *wbuf = (void *)buf;
 work_around_broken_dhclient(wbuf, wbuf + n->host_hdr_len,
 size - n->host_hdr_len);
+virtio_net_hdr_swap(VIRTIO_DEVICE(n), wbuf);
 iov_from_buf(iov, iov_cnt, 0, buf, sizeof(struct virtio_net_hdr))

Re: [Qemu-devel] [PATCH for-2.0] pc: ACPI BIOS: fix incorrect integer encoding for 0x{F-1}FFFF values

2014-04-14 Thread Michael S. Tsirkin
On Sun, Apr 13, 2014 at 11:55:51PM +0200, Igor Mammedov wrote:
> Fix typo in build_append_int() which causes integer
> truncation when it's in range 0x{F-1} by packing it
> as WordConst instead of required DWordConst.
> 
> Signed-off-by: Igor Mammedov 

Good catch - this fixes hotplug in slots 16,17,18 and 19.

I didn't notice this but we had:

If (And (Arg0, 0x))
{
Notify (S80, Arg1)
}

If (And (Arg0, 0x))
{
Notify (S88, Arg1)
}

If (And (Arg0, 0x))
{
Notify (S90, Arg1)
}

If (And (Arg0, 0x))
{
Notify (S98, Arg1)
}

Reviewed-by: Michael S. Tsirkin 

> ---
>  hw/i386/acpi-build.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 748e866..c3321b5 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -474,7 +474,7 @@ static void build_append_int(GArray *table, uint32_t 
> value)
>  build_append_byte(table, 0x01); /* OneOp */
>  } else if (value <= 0xFF) {
>  build_append_value(table, value, 1);
> -} else if (value <= 0xF) {
> +} else if (value <= 0x) {
>  build_append_value(table, value, 2);
>  } else {
>  build_append_value(table, value, 4);
> -- 
> 1.9.0



[Qemu-devel] [PATCH v7 4/8] virtio-balloon: use virtio wrappers to access page frame numbers

2014-04-14 Thread Greg Kurz
From: Rusty Russell 

This one can't get target independant because of TARGET_PAGE_SIZE.

Signed-off-by: Rusty Russell 
Reviewed-by: Anthony Liguori 
[ pass VirtIODevice * to memory accessors,
  Greg Kurz  ]
Signed-off-by: Greg Kurz 
---
 hw/virtio/virtio-balloon.c |8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index a470a0b..ba2e74e 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -30,6 +30,7 @@
 #endif
 
 #include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
 
 static void balloon_page(void *addr, int deflate)
 {
@@ -191,8 +192,9 @@ static void virtio_balloon_handle_output(VirtIODevice 
*vdev, VirtQueue *vq)
 while (iov_to_buf(elem.out_sg, elem.out_num, offset, &pfn, 4) == 4) {
 ram_addr_t pa;
 ram_addr_t addr;
+int p = virtio_ldl_p(vdev, &pfn);
 
-pa = (ram_addr_t)ldl_p(&pfn) << VIRTIO_BALLOON_PFN_SHIFT;
+pa = (ram_addr_t) p << VIRTIO_BALLOON_PFN_SHIFT;
 offset += 4;
 
 /* FIXME: remove get_system_memory(), but how? */
@@ -233,8 +235,8 @@ static void virtio_balloon_receive_stats(VirtIODevice 
*vdev, VirtQueue *vq)
 
 while (iov_to_buf(elem->out_sg, elem->out_num, offset, &stat, sizeof(stat))
== sizeof(stat)) {
-uint16_t tag = tswap16(stat.tag);
-uint64_t val = tswap64(stat.val);
+uint16_t tag = virtio_tswap16(vdev, stat.tag);
+uint64_t val = virtio_tswap64(vdev, stat.val);
 
 offset += sizeof(stat);
 if (tag < VIRTIO_BALLOON_S_NR)




[Qemu-devel] [PATCH v7 5/8] virtio-blk: use virtio wrappers to access headers

2014-04-14 Thread Greg Kurz
From: Rusty Russell 

Note that st*_raw and ld*_raw are effectively replaced by st*_p and ld*_p.

virtio-blk gets target independant.

Signed-off-by: Rusty Russell 
Reviewed-by: Anthony Liguori 
[ pass VirtIODevice * to memory accessors,
  target independant,
  Greg Kurz  ]
Signed-off-by: Greg Kurz 
---

 Changes since v6:
 - target independant

 hw/block/Makefile.objs |2 +-
 hw/block/virtio-blk.c  |   38 +-
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
index bf46f03..67a64a7 100644
--- a/hw/block/Makefile.objs
+++ b/hw/block/Makefile.objs
@@ -8,8 +8,8 @@ common-obj-$(CONFIG_XEN_BACKEND) += xen_disk.o
 common-obj-$(CONFIG_ECC) += ecc.o
 common-obj-$(CONFIG_ONENAND) += onenand.o
 common-obj-$(CONFIG_NVME_PCI) += nvme.o
+common-obj-$(CONFIG_VIRTIO) += virtio-blk.o
 
 obj-$(CONFIG_SH4) += tc58128.o
 
-obj-$(CONFIG_VIRTIO) += virtio-blk.o
 obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 8a568e5..a7447ff 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -26,6 +26,7 @@
 # include 
 #endif
 #include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
 
 typedef struct VirtIOBlockReq
 {
@@ -77,7 +78,8 @@ static void virtio_blk_rw_complete(void *opaque, int ret)
 trace_virtio_blk_rw_complete(req, ret);
 
 if (ret) {
-bool is_read = !(ldl_p(&req->out->type) & VIRTIO_BLK_T_OUT);
+int p = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out->type);
+bool is_read = !(p & VIRTIO_BLK_T_OUT);
 if (virtio_blk_handle_rw_error(req, -ret, is_read))
 return;
 }
@@ -127,6 +129,7 @@ static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock 
*s)
 
 static void virtio_blk_handle_scsi(VirtIOBlockReq *req)
 {
+VirtIODevice *vdev = VIRTIO_DEVICE(req->dev);
 #ifdef __linux__
 int ret;
 int i;
@@ -224,12 +227,13 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req)
 hdr.status = CHECK_CONDITION;
 }
 
-stl_p(&req->scsi->errors,
-  hdr.status | (hdr.msg_status << 8) |
-  (hdr.host_status << 16) | (hdr.driver_status << 24));
-stl_p(&req->scsi->residual, hdr.resid);
-stl_p(&req->scsi->sense_len, hdr.sb_len_wr);
-stl_p(&req->scsi->data_len, hdr.dxfer_len);
+virtio_stl_p(vdev,
+ &req->scsi->errors,
+ hdr.status | (hdr.msg_status << 8) |
+ (hdr.host_status << 16) | (hdr.driver_status << 24));
+virtio_stl_p(vdev, &req->scsi->residual, hdr.resid);
+virtio_stl_p(vdev, &req->scsi->sense_len, hdr.sb_len_wr);
+virtio_stl_p(vdev, &req->scsi->data_len, hdr.dxfer_len);
 
 virtio_blk_req_complete(req, status);
 g_free(req);
@@ -240,7 +244,7 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req)
 
 fail:
 /* Just put anything nonzero so that the ioctl fails in the guest.  */
-stl_p(&req->scsi->errors, 255);
+virtio_stl_p(vdev, &req->scsi->errors, 255);
 virtio_blk_req_complete(req, status);
 g_free(req);
 }
@@ -286,7 +290,7 @@ static void virtio_blk_handle_write(VirtIOBlockReq *req, 
MultiReqBuffer *mrb)
 BlockRequest *blkreq;
 uint64_t sector;
 
-sector = ldq_p(&req->out->sector);
+sector = virtio_ldq_p(VIRTIO_DEVICE(req->dev), &req->out->sector);
 
 bdrv_acct_start(req->dev->bs, &req->acct, req->qiov.size, BDRV_ACCT_WRITE);
 
@@ -320,7 +324,7 @@ static void virtio_blk_handle_read(VirtIOBlockReq *req)
 {
 uint64_t sector;
 
-sector = ldq_p(&req->out->sector);
+sector = virtio_ldq_p(VIRTIO_DEVICE(req->dev), &req->out->sector);
 
 bdrv_acct_start(req->dev->bs, &req->acct, req->qiov.size, BDRV_ACCT_READ);
 
@@ -358,7 +362,7 @@ static void virtio_blk_handle_request(VirtIOBlockReq *req,
 req->out = (void *)req->elem.out_sg[0].iov_base;
 req->in = (void *)req->elem.in_sg[req->elem.in_num - 1].iov_base;
 
-type = ldl_p(&req->out->type);
+type = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out->type);
 
 if (type & VIRTIO_BLK_T_FLUSH) {
 virtio_blk_handle_flush(req, mrb);
@@ -487,12 +491,12 @@ static void virtio_blk_update_config(VirtIODevice *vdev, 
uint8_t *config)
 
 bdrv_get_geometry(s->bs, &capacity);
 memset(&blkcfg, 0, sizeof(blkcfg));
-stq_raw(&blkcfg.capacity, capacity);
-stl_raw(&blkcfg.seg_max, 128 - 2);
-stw_raw(&blkcfg.cylinders, s->conf->cyls);
-stl_raw(&blkcfg.blk_size, blk_size);
-stw_raw(&blkcfg.min_io_size, s->conf->min_io_size / blk_size);
-stw_raw(&blkcfg.opt_io_size, s->conf->opt_io_size / blk_size);
+virtio_stq_p(vdev, &blkcfg.capacity, capacity);
+virtio_stl_p(vdev, &blkcfg.seg_max, 128 - 2);
+virtio_stw_p(vdev, &blkcfg.cylinders, s->conf->cyls);
+virtio_stl_p(vdev, &blkcfg.blk_size, blk_size);
+virtio_stw_p(vdev, &blkcfg.min_io_size, s->conf->min_io_size / blk_size);
+virtio_stw_p(vdev, &blkcfg.opt_io_size, s->conf->o

Re: [Qemu-devel] [PATCH 1/5] s390x/cpu: Use ioctl to reset state in the kernel

2014-04-14 Thread Alexander Graf


On 14.04.14 13:54, Thomas Huth wrote:

On Mon, 14 Apr 2014 12:14:49 +0200
Alexander Graf  wrote:


On 26.02.14 12:38, Christian Borntraeger wrote:

From: Thomas Huth 

Some of the state in the kernel can not be reset from QEMU yet.
For this we've got to use the KVM_S390_INITIAL_RESET ioctl to make
sure that the state in the kernel is set to the right values during
initial CPU reset, too.

Signed-off-by: Thomas Huth 
Acked-by: Cornelia Huck 
Signed-off-by: Christian Borntraeger 
---
   target-s390x/cpu.c | 9 +
   1 file changed, 9 insertions(+)

diff --git a/target-s390x/cpu.c b/target-s390x/cpu.c
index f1319e5..1a8c1cc 100644
--- a/target-s390x/cpu.c
+++ b/target-s390x/cpu.c
@@ -108,6 +108,15 @@ static void s390_cpu_initial_reset(CPUState *s)
   env->cregs[14] = CR14_RESET;
   
   env->pfault_token = -1UL;

+
+#if defined(CONFIG_KVM)
+/* Reset state inside the kernel that we cannot access yet from QEMU. */
+if (kvm_enabled()) {
+if (kvm_vcpu_ioctl(s, KVM_S390_INITIAL_RESET, NULL)) {

Could we put this into the vcpu register sync function? It gets a
parameter that indicates when we are on RESET level, right?

No, sorry, as far as I can see, this is not that easily possible:

On S390, we've got five different levels of reset (CPU reset,
Initial CPU reset, Subsystem reset, Clear reset & Power-on reset).
The ioctl is about initial CPU reset only, while the register sync
function flag is rather only used by system/clear reset instead.

So for example when the guest OS sends a SIGP INITIAL CPU RESET from
one CPU to another, only the s390_cpu_initial_reset() will be called,
but not the register sync with KVM_PUT_RESET_STATE.


Right, I keep forgetting these details :)


Alex




[Qemu-devel] [PATCH v7 6/8] virtio-scsi: use virtio wrappers to access headers

2014-04-14 Thread Greg Kurz
From: Rusty Russell 

Note that st*_raw and ld*_raw are effectively replaced by st*_p and ld*_p.

virtio-scsi gets target independant.

Signed-off-by: Rusty Russell 
Reviewed-by: Anthony Liguori 
[ pass VirtIODevice * to memory accessors,
  fix missing tswap32 in virtio_scsi_push_event() by Cédric Le Goater,
  target independant,
  Greg Kurz  ]
Cc: Cédric Le Goater 
Signed-off-by: Greg Kurz 
---

 Changes since v6:
 - target independant

 hw/scsi/Makefile.objs |2 +-
 hw/scsi/virtio-scsi.c |   38 --
 2 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/hw/scsi/Makefile.objs b/hw/scsi/Makefile.objs
index 121ddc5..2cc2362 100644
--- a/hw/scsi/Makefile.objs
+++ b/hw/scsi/Makefile.objs
@@ -8,6 +8,6 @@ common-obj-$(CONFIG_ESP_PCI) += esp-pci.o
 obj-$(CONFIG_PSERIES) += spapr_vscsi.o
 
 ifeq ($(CONFIG_VIRTIO),y)
-obj-y += virtio-scsi.o
+common-obj-y += virtio-scsi.o
 obj-$(CONFIG_VHOST_SCSI) += vhost-scsi.o
 endif
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index b0d7517..82dc5b6 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include "hw/virtio/virtio-access.h"
 
 typedef struct VirtIOSCSIReq {
 VirtIOSCSI *dev;
@@ -306,6 +307,7 @@ static void virtio_scsi_command_complete(SCSIRequest *r, 
uint32_t status,
 VirtIOSCSIReq *req = r->hba_private;
 VirtIOSCSI *s = req->dev;
 VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
+VirtIODevice *vdev = VIRTIO_DEVICE(s);
 uint32_t sense_len;
 
 if (r->io_canceled) {
@@ -315,12 +317,12 @@ static void virtio_scsi_command_complete(SCSIRequest *r, 
uint32_t status,
 req->resp.cmd->response = VIRTIO_SCSI_S_OK;
 req->resp.cmd->status = status;
 if (req->resp.cmd->status == GOOD) {
-req->resp.cmd->resid = tswap32(resid);
+req->resp.cmd->resid = virtio_tswap32(vdev, resid);
 } else {
 req->resp.cmd->resid = 0;
 sense_len = scsi_req_get_sense(r, req->resp.cmd->sense,
vs->sense_size);
-req->resp.cmd->sense_len = tswap32(sense_len);
+req->resp.cmd->sense_len = virtio_tswap32(vdev, sense_len);
 }
 virtio_scsi_complete_req(req);
 }
@@ -416,16 +418,16 @@ static void virtio_scsi_get_config(VirtIODevice *vdev,
 VirtIOSCSIConfig *scsiconf = (VirtIOSCSIConfig *)config;
 VirtIOSCSICommon *s = VIRTIO_SCSI_COMMON(vdev);
 
-stl_raw(&scsiconf->num_queues, s->conf.num_queues);
-stl_raw(&scsiconf->seg_max, 128 - 2);
-stl_raw(&scsiconf->max_sectors, s->conf.max_sectors);
-stl_raw(&scsiconf->cmd_per_lun, s->conf.cmd_per_lun);
-stl_raw(&scsiconf->event_info_size, sizeof(VirtIOSCSIEvent));
-stl_raw(&scsiconf->sense_size, s->sense_size);
-stl_raw(&scsiconf->cdb_size, s->cdb_size);
-stw_raw(&scsiconf->max_channel, VIRTIO_SCSI_MAX_CHANNEL);
-stw_raw(&scsiconf->max_target, VIRTIO_SCSI_MAX_TARGET);
-stl_raw(&scsiconf->max_lun, VIRTIO_SCSI_MAX_LUN);
+virtio_stl_p(vdev, &scsiconf->num_queues, s->conf.num_queues);
+virtio_stl_p(vdev, &scsiconf->seg_max, 128 - 2);
+virtio_stl_p(vdev, &scsiconf->max_sectors, s->conf.max_sectors);
+virtio_stl_p(vdev, &scsiconf->cmd_per_lun, s->conf.cmd_per_lun);
+virtio_stl_p(vdev, &scsiconf->event_info_size, sizeof(VirtIOSCSIEvent));
+virtio_stl_p(vdev, &scsiconf->sense_size, s->sense_size);
+virtio_stl_p(vdev, &scsiconf->cdb_size, s->cdb_size);
+virtio_stw_p(vdev, &scsiconf->max_channel, VIRTIO_SCSI_MAX_CHANNEL);
+virtio_stw_p(vdev, &scsiconf->max_target, VIRTIO_SCSI_MAX_TARGET);
+virtio_stl_p(vdev, &scsiconf->max_lun, VIRTIO_SCSI_MAX_LUN);
 }
 
 static void virtio_scsi_set_config(VirtIODevice *vdev,
@@ -434,14 +436,14 @@ static void virtio_scsi_set_config(VirtIODevice *vdev,
 VirtIOSCSIConfig *scsiconf = (VirtIOSCSIConfig *)config;
 VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(vdev);
 
-if ((uint32_t) ldl_raw(&scsiconf->sense_size) >= 65536 ||
-(uint32_t) ldl_raw(&scsiconf->cdb_size) >= 256) {
+if ((uint32_t) virtio_ldl_p(vdev, &scsiconf->sense_size) >= 65536 ||
+(uint32_t) virtio_ldl_p(vdev, &scsiconf->cdb_size) >= 256) {
 error_report("bad data written to virtio-scsi configuration space");
 exit(1);
 }
 
-vs->sense_size = ldl_raw(&scsiconf->sense_size);
-vs->cdb_size = ldl_raw(&scsiconf->cdb_size);
+vs->sense_size = virtio_ldl_p(vdev, &scsiconf->sense_size);
+vs->cdb_size = virtio_ldl_p(vdev, &scsiconf->cdb_size);
 }
 
 static uint32_t virtio_scsi_get_features(VirtIODevice *vdev,
@@ -519,8 +521,8 @@ static void virtio_scsi_push_event(VirtIOSCSI *s, 
SCSIDevice *dev,
 
 evt = req->resp.event;
 memset(evt, 0, sizeof(VirtIOSCSIEvent));
-evt->event = event;
-evt->reason = reason;
+evt->event = virtio_tswap32(vdev, event);
+evt->reason = virtio_tswap32(vdev, reason);
 if (!dev) {
 assert(event == VIRTIO_SCSI_T_EVENTS_MISSED);
   

[Qemu-devel] [PATCH v7 7/8] virtio-serial-bus: use virtio wrappers to access headers

2014-04-14 Thread Greg Kurz
From: Rusty Russell 

virtio-serial-bus gets platform independant.

Signed-off-by: Rusty Russell 
Reviewed-by: Anthony Liguori 
[ pass VirtIODevice * to memory accessors,
  platform independant,
  Greg Kurz  ]
Signed-off-by: Greg Kurz 
---
 Changes since v6:
 - target independant

 hw/char/Makefile.objs   |3 +--
 hw/char/virtio-serial-bus.c |   46 +--
 2 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/hw/char/Makefile.objs b/hw/char/Makefile.objs
index 317385d..d490fc7 100644
--- a/hw/char/Makefile.objs
+++ b/hw/char/Makefile.objs
@@ -24,5 +24,4 @@ common-obj-$(CONFIG_LM32) += lm32_juart.o
 common-obj-$(CONFIG_LM32) += lm32_uart.o
 common-obj-$(CONFIG_MILKYMIST) += milkymist-uart.o
 common-obj-$(CONFIG_SCLPCONSOLE) += sclpconsole.o sclpconsole-lm.o
-
-obj-$(CONFIG_VIRTIO) += virtio-serial-bus.o
+common-obj-$(CONFIG_VIRTIO) += virtio-serial-bus.o
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 2b647b6..0475af4 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -24,6 +24,7 @@
 #include "hw/sysbus.h"
 #include "trace.h"
 #include "hw/virtio/virtio-serial.h"
+#include "hw/virtio/virtio-access.h"
 
 static VirtIOSerialPort *find_port_by_id(VirtIOSerial *vser, uint32_t id)
 {
@@ -183,11 +184,12 @@ static size_t send_control_msg(VirtIOSerial *vser, void 
*buf, size_t len)
 static size_t send_control_event(VirtIOSerial *vser, uint32_t port_id,
  uint16_t event, uint16_t value)
 {
+VirtIODevice *vdev = VIRTIO_DEVICE(vser);
 struct virtio_console_control cpkt;
 
-stl_p(&cpkt.id, port_id);
-stw_p(&cpkt.event, event);
-stw_p(&cpkt.value, value);
+virtio_stl_p(vdev, &cpkt.id, port_id);
+virtio_stw_p(vdev, &cpkt.event, event);
+virtio_stw_p(vdev, &cpkt.value, value);
 
 trace_virtio_serial_send_control_event(port_id, event, value);
 return send_control_msg(vser, &cpkt, sizeof(cpkt));
@@ -278,6 +280,7 @@ void virtio_serial_throttle_port(VirtIOSerialPort *port, 
bool throttle)
 /* Guest wants to notify us of some event */
 static void handle_control_message(VirtIOSerial *vser, void *buf, size_t len)
 {
+VirtIODevice *vdev = VIRTIO_DEVICE(vser);
 struct VirtIOSerialPort *port;
 VirtIOSerialPortClass *vsc;
 struct virtio_console_control cpkt, *gcpkt;
@@ -291,8 +294,8 @@ static void handle_control_message(VirtIOSerial *vser, void 
*buf, size_t len)
 return;
 }
 
-cpkt.event = lduw_p(&gcpkt->event);
-cpkt.value = lduw_p(&gcpkt->value);
+cpkt.event = virtio_lduw_p(vdev, &gcpkt->event);
+cpkt.value = virtio_lduw_p(vdev, &gcpkt->value);
 
 trace_virtio_serial_handle_control_message(cpkt.event, cpkt.value);
 
@@ -312,10 +315,10 @@ static void handle_control_message(VirtIOSerial *vser, 
void *buf, size_t len)
 return;
 }
 
-port = find_port_by_id(vser, ldl_p(&gcpkt->id));
+port = find_port_by_id(vser, virtio_ldl_p(vdev, &gcpkt->id));
 if (!port) {
 error_report("virtio-serial-bus: Unexpected port id %u for device %s",
- ldl_p(&gcpkt->id), vser->bus.qbus.name);
+ virtio_ldl_p(vdev, &gcpkt->id), vser->bus.qbus.name);
 return;
 }
 
@@ -342,9 +345,9 @@ static void handle_control_message(VirtIOSerial *vser, void 
*buf, size_t len)
 }
 
 if (port->name) {
-stl_p(&cpkt.id, port->id);
-stw_p(&cpkt.event, VIRTIO_CONSOLE_PORT_NAME);
-stw_p(&cpkt.value, 1);
+virtio_stl_p(vdev, &cpkt.id, port->id);
+virtio_stw_p(vdev, &cpkt.event, VIRTIO_CONSOLE_PORT_NAME);
+virtio_stw_p(vdev, &cpkt.value, 1);
 
 buffer_len = sizeof(cpkt) + strlen(port->name) + 1;
 buffer = g_malloc(buffer_len);
@@ -522,12 +525,13 @@ static void vser_reset(VirtIODevice *vdev)
 static void virtio_serial_save(QEMUFile *f, void *opaque)
 {
 VirtIOSerial *s = VIRTIO_SERIAL(opaque);
+VirtIODevice *vdev = VIRTIO_DEVICE(s);
 VirtIOSerialPort *port;
 uint32_t nr_active_ports;
 unsigned int i, max_nr_ports;
 
 /* The virtio device */
-virtio_save(VIRTIO_DEVICE(s), f);
+virtio_save(vdev, f);
 
 /* The config space */
 qemu_put_be16s(f, &s->config.cols);
@@ -536,7 +540,7 @@ static void virtio_serial_save(QEMUFile *f, void *opaque)
 qemu_put_be32s(f, &s->config.max_nr_ports);
 
 /* The ports map */
-max_nr_ports = tswap32(s->config.max_nr_ports);
+max_nr_ports = virtio_tswap32(vdev, s->config.max_nr_ports);
 for (i = 0; i < (max_nr_ports + 31) / 32; i++) {
 qemu_put_be32s(f, &s->ports_map[i]);
 }
@@ -667,6 +671,7 @@ static int fetch_active_ports_list(QEMUFile *f, int 
version_id,
 static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
 {
 VirtIOSerial *s = VIRTIO_SERIAL(opaque);
+VirtIODevice *vdev = VIRTIO_DEVICE(s);
 uint32_t max_nr_ports, nr_active_ports, po

[Qemu-devel] [PATCH v7 8/8] virtio-9p: use virtio wrappers to access headers

2014-04-14 Thread Greg Kurz
Note that st*_raw and ld*_raw are effectively replaced by st*_p and ld*_p.

virtio-9p-device gets platform independant.

Signed-off-by: Greg Kurz 
---
 Changes since v6:
 - target independant

 hw/9pfs/Makefile.objs  |3 +--
 hw/9pfs/virtio-9p-device.c |3 ++-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/9pfs/Makefile.objs b/hw/9pfs/Makefile.objs
index 1e9b595..e71f47b 100644
--- a/hw/9pfs/Makefile.objs
+++ b/hw/9pfs/Makefile.objs
@@ -5,5 +5,4 @@ common-obj-y += virtio-9p-coth.o cofs.o codir.o cofile.o
 common-obj-y += coxattr.o virtio-9p-synth.o
 common-obj-$(CONFIG_OPEN_BY_HANDLE) +=  virtio-9p-handle.o
 common-obj-y += virtio-9p-proxy.o
-
-obj-y += virtio-9p-device.o
+common-obj-y += virtio-9p-device.o
diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 15a4983..2572747 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -19,6 +19,7 @@
 #include "fsdev/qemu-fsdev.h"
 #include "virtio-9p-xattr.h"
 #include "virtio-9p-coth.h"
+#include "hw/virtio/virtio-access.h"
 
 static uint32_t virtio_9p_get_features(VirtIODevice *vdev, uint32_t features)
 {
@@ -34,7 +35,7 @@ static void virtio_9p_get_config(VirtIODevice *vdev, uint8_t 
*config)
 
 len = strlen(s->tag);
 cfg = g_malloc0(sizeof(struct virtio_9p_config) + len);
-stw_raw(&cfg->tag_len, len);
+virtio_stw_p(vdev, &cfg->tag_len, len);
 /* We don't copy the terminating null to config space */
 memcpy(cfg->tag, s->tag, len);
 memcpy(config, cfg, s->config_size);




Re: [Qemu-devel] [PATCH v4 21/25] tcg-aarch64: Introduce tcg_out_insn_3507

2014-04-14 Thread Claudio Fontana
On 11.04.2014 17:40, Richard Henderson wrote:
> Cleaning up the implementation of REV and REV16 at the same time.
> 
> Signed-off-by: Richard Henderson 
> ---
>  tcg/aarch64/tcg-target.c | 57 
> 
>  1 file changed, 33 insertions(+), 24 deletions(-)
> 
> diff --git a/tcg/aarch64/tcg-target.c b/tcg/aarch64/tcg-target.c
> index caaf8a2..0846835 100644
> --- a/tcg/aarch64/tcg-target.c
> +++ b/tcg/aarch64/tcg-target.c
> @@ -327,6 +327,11 @@ typedef enum {
>  I3506_CSEL  = 0x1a80,
>  I3506_CSINC = 0x1a800400,
>  
> +/* Data-processing (1 source) instructions.  */
> +I3507_REV16 = 0x5ac00400,
> +I3507_REV32 = 0x5ac00800,
> +I3507_REV64 = 0x5ac00c00,
> +
>  /* Data-processing (2 source) instructions.  */
>  I3508_LSLV  = 0x1ac02000,
>  I3508_LSRV  = 0x1ac02400,
> @@ -545,6 +550,12 @@ static void tcg_out_insn_3506(TCGContext *s, AArch64Insn 
> insn, TCGType ext,
>| tcg_cond_to_aarch64[c] << 12);
>  }
>  
> +static void tcg_out_insn_3507(TCGContext *s, AArch64Insn insn, TCGType ext,
> +  TCGReg rd, TCGReg rn)
> +{
> +tcg_out32(s, insn | ext << 31 | rn << 5 | rd);
> +}
> +
>  static void tcg_out_insn_3509(TCGContext *s, AArch64Insn insn, TCGType ext,
>TCGReg rd, TCGReg rn, TCGReg rm, TCGReg ra)
>  {
> @@ -960,20 +971,19 @@ static void tcg_out_brcond(TCGContext *s, TCGMemOp ext, 
> TCGCond c, TCGArg a,
>  }
>  }
>  
> -static inline void tcg_out_rev(TCGContext *s, TCGType ext,
> -   TCGReg rd, TCGReg rm)
> +static inline void tcg_out_rev64(TCGContext *s, TCGReg rd, TCGReg rn)
>  {
> -/* using REV 0x5ac00800 */
> -unsigned int base = ext ? 0xdac00c00 : 0x5ac00800;
> -tcg_out32(s, base | rm << 5 | rd);
> +tcg_out_insn(s, 3507, REV64, TCG_TYPE_I64, rd, rn);
>  }
>  
> -static inline void tcg_out_rev16(TCGContext *s, TCGType ext,
> - TCGReg rd, TCGReg rm)
> +static inline void tcg_out_rev32(TCGContext *s, TCGReg rd, TCGReg rn)
>  {
> -/* using REV16 0x5ac00400 */
> -unsigned int base = ext ? 0xdac00400 : 0x5ac00400;
> -tcg_out32(s, base | rm << 5 | rd);
> +tcg_out_insn(s, 3507, REV32, TCG_TYPE_I32, rd, rn);
> +}
> +
> +static inline void tcg_out_rev16(TCGContext *s, TCGReg rd, TCGReg rn)
> +{
> +tcg_out_insn(s, 3507, REV16, TCG_TYPE_I32, rd, rn);
>  }
>  
>  static inline void tcg_out_sxt(TCGContext *s, TCGType ext, TCGMemOp s_bits,
> @@ -1205,13 +1215,13 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, 
> TCGMemOp memop,
>  case MO_UW:
>  tcg_out_ldst_r(s, LDST_16, LDST_LD, data_r, addr_r, off_r);
>  if (bswap) {
> -tcg_out_rev16(s, TCG_TYPE_I32, data_r, data_r);
> +tcg_out_rev16(s, data_r, data_r);
>  }
>  break;
>  case MO_SW:
>  if (bswap) {
>  tcg_out_ldst_r(s, LDST_16, LDST_LD, data_r, addr_r, off_r);
> -tcg_out_rev16(s, TCG_TYPE_I32, data_r, data_r);
> +tcg_out_rev16(s, data_r, data_r);
>  tcg_out_sxt(s, TCG_TYPE_I64, MO_16, data_r, data_r);
>  } else {
>  tcg_out_ldst_r(s, LDST_16, LDST_LD_S_X, data_r, addr_r, off_r);
> @@ -1220,13 +1230,13 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, 
> TCGMemOp memop,
>  case MO_UL:
>  tcg_out_ldst_r(s, LDST_32, LDST_LD, data_r, addr_r, off_r);
>  if (bswap) {
> -tcg_out_rev(s, TCG_TYPE_I32, data_r, data_r);
> +tcg_out_rev32(s, data_r, data_r);
>  }
>  break;
>  case MO_SL:
>  if (bswap) {
>  tcg_out_ldst_r(s, LDST_32, LDST_LD, data_r, addr_r, off_r);
> -tcg_out_rev(s, TCG_TYPE_I32, data_r, data_r);
> +tcg_out_rev32(s, data_r, data_r);
>  tcg_out_sxt(s, TCG_TYPE_I64, MO_32, data_r, data_r);
>  } else {
>  tcg_out_ldst_r(s, LDST_32, LDST_LD_S_X, data_r, addr_r, off_r);
> @@ -1235,7 +1245,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, 
> TCGMemOp memop,
>  case MO_Q:
>  tcg_out_ldst_r(s, LDST_64, LDST_LD, data_r, addr_r, off_r);
>  if (bswap) {
> -tcg_out_rev(s, TCG_TYPE_I64, data_r, data_r);
> +tcg_out_rev64(s, data_r, data_r);
>  }
>  break;
>  default:
> @@ -1254,21 +1264,21 @@ static void tcg_out_qemu_st_direct(TCGContext *s, 
> TCGMemOp memop,
>  break;
>  case MO_16:
>  if (bswap && data_r != TCG_REG_XZR) {
> -tcg_out_rev16(s, TCG_TYPE_I32, TCG_REG_TMP, data_r);
> +tcg_out_rev16(s, TCG_REG_TMP, data_r);
>  data_r = TCG_REG_TMP;
>  }
>  tcg_out_ldst_r(s, LDST_16, LDST_ST, data_r, addr_r, off_r);
>  break;
>  case MO_32:
>  if (bswap && data_r != TCG_REG_XZR) {
> -tcg_out_rev(s, TCG_TYPE_I32, TCG_REG_TMP, data_r);
> +  

[Qemu-devel] [PATCH v5] target-ppc: ppc64 target's virtio can be either endian

2014-04-14 Thread Greg Kurz
We turn the virtio_is_big_endian() helper into a target dependant hook
to compute endianness for legacy virtio devices.

We base it on the OS endian, as reflected by the endianness of the
interrupt vectors (handled through the ILE bit in the LPCR register).

Using first_cpu to fetch the registers from KVM may look arbitrary
and awkward, but it is okay because KVM sets/unsets the ILE bit on
all CPUs.

Note that Book3e has no LPCR and worse, has a unrelated spr at the same
index... I did not find better than strcmp() to ensure we are using
the correct register.

Suggested-by: Benjamin Herrenschmidt 
Signed-off-by: Rusty Russell 
[ re-use the existing virito_is_big_endian() helper,
  check we have LPCR,
  Greg Kurz  ]
Signed-off-by: Greg Kurz 
Reviewed-by: Alexander Graf 
---
 Changes since v5:
 - merged with the existing virtio_is_big_endian() helper
 - ensure cpu has LPCR

 exec.c   |2 +-
 target-ppc/misc_helper.c |   18 ++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index e6777d0..cf98d42 100644
--- a/exec.c
+++ b/exec.c
@@ -2740,7 +2740,7 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
 }
 #endif
 
-#if !defined(CONFIG_USER_ONLY)
+#if !defined(CONFIG_USER_ONLY) && !defined(TARGET_PPC)
 
 /*
  * A helper function for the _utterly broken_ virtio device model to find out 
if
diff --git a/target-ppc/misc_helper.c b/target-ppc/misc_helper.c
index 2eb2fa6..971cafe 100644
--- a/target-ppc/misc_helper.c
+++ b/target-ppc/misc_helper.c
@@ -20,6 +20,8 @@
 #include "helper.h"
 
 #include "helper_regs.h"
+#include "hw/virtio/virtio.h"
+#include "sysemu/kvm.h"
 
 /*/
 /* SPR accesses */
@@ -120,3 +122,19 @@ void ppc_store_msr(CPUPPCState *env, target_ulong value)
 {
 hreg_store_msr(env, value, 0);
 }
+
+#if !defined(CONFIG_USER_ONLY)
+bool virtio_is_big_endian(void)
+{
+PowerPCCPU *cp = POWERPC_CPU(first_cpu);
+CPUPPCState *env = &cp->env;
+
+/* NOTE: booke uses the same number for another unrelated spr.
+ */
+if (strcmp(env->spr_cb[SPR_LPCR].name, "LPCR")) {
+return TARGET_WORDS_BIGENDIAN;
+} else {
+return !(env->spr[SPR_LPCR] & LPCR_ILE);
+}
+}
+#endif




[Qemu-devel] [PATCH for-2.0] Revert "fix return check for KVM_GET_DIRTY_LOG ioctl"

2014-04-14 Thread Michael Tokarev
This reverts commit b533f658a98325d0e47b36113bd9f5bcc046fdae.

The original code was wrong, because effectively it ignored errors
from kernel, because kernel does not return -1 on error case but
returns -errno, and does not return -EPERM for this particular ioctl.
But in some cases kernel actually returned unsuccessful result,
namely, when the dirty bitmap in requested slot does not exist
it returns -ENOENT.  With new code this condition becomes an
error when it shouldn't be.

Revert that patch instead of fixing it properly this late in the
release process.  I disagree with this approach, but let's make
things move _somewhere_, instead of arguing endlessly whch of
the 2 proposed fixes is better.

Signed-off-by: Michael Tokarev 
---
 kvm-all.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kvm-all.c b/kvm-all.c
index cd4111d..82a9119 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -441,7 +441,7 @@ static int 
kvm_physical_sync_dirty_bitmap(MemoryRegionSection *section)
 
 d.slot = mem->slot;
 
-if (kvm_vm_ioctl(s, KVM_GET_DIRTY_LOG, &d) < 0) {
+if (kvm_vm_ioctl(s, KVM_GET_DIRTY_LOG, &d) == -1) {
 DPRINTF("ioctl failed %d\n", errno);
 ret = -1;
 break;
-- 
1.7.10.4




Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Alexander Graf


On 14.04.14 13:58, Greg Kurz wrote:

From: Rusty Russell 

virtio data structures are defined as "target endian", which assumes
that's a fixed value.  In fact, that actually means it's platform-specific.
The OASIS virtio 1.0 spec will fix this, by making it all little endian.

We introduce memory accessors to be used accross the virtio code where
needed. These accessors should support both legacy and 1.0 devices.
A good way to do it is to introduce a per-device property to store the
endianness. We choose to set this flag at device reset time because it
is reasonnable to assume the endianness won't change unless we reboot or
kexec another kernel. And it is also reasonnable to assume the new kernel
will reset the devices before using them (otherwise it will break).

We reuse the virtio_is_big_endian() helper since it provides the right
value for legacy devices with most of the targets, that have fixed
endianness. It can then be overriden to support endian-ambivalent targets.

To support migration, we need to set the flag in virtio_load() as well.

(a) One solution would be to add it to the stream, but it have some
 drawbacks:
- since this only affects a few targets, the field should be put into a
   subsection
- virtio migration code should be ported to vmstate to be able to introduce
   such a subsection

(b) If we assume the following to be true:
- target endianness falls under some cpu state
- cpu state is always restored before virtio devices state because they
   get initialized in this order in main().
Then an alternative is to rely on virtio_is_big_endian() again at
load time. No need to mess around with the migration stream in this case.

This patch implements (b).

Note that the tswap helpers are implemented in virtio.c so that
virtio-access.h stays platform independant. Most of the virtio code
will be buildable under common-obj instead of obj then, and spare
some cycles when building for multiple targets.

Signed-off-by: Rusty Russell 
[ ldq_phys() API change,
   relicensed virtio-access.h to GPLv2+ on Rusty's request,
   introduce a per-device is_big_endian flag (supersedes needs_byteswap global)
   add VirtIODevice * arg to virtio helpers,
   use the existing virtio_is_big_endian() helper,
   virtio-pci: use the device is_big_endian flag,
   introduce virtio tswap16 and tswap64 helpers,
   move calls to tswap* out of virtio-access.h to make it platform independant,
   migration support,
   Greg Kurz  ]
Cc: Cédric Le Goater 
Signed-off-by: Greg Kurz 
---

Changes since v6:
- merge the virtio_needs_byteswap() helper from v6 and existing
   virtio_is_big_endian()
- virtio-pci: now supports endianness changes
- virtio-access.h fixes (target independant)

  exec.c|2 -
  hw/virtio/Makefile.objs   |2 -
  hw/virtio/virtio-pci.c|   11 +--
  hw/virtio/virtio.c|   35 +
  include/hw/virtio/virtio-access.h |  138 +
  include/hw/virtio/virtio.h|2 +
  vl.c  |4 +
  7 files changed, 185 insertions(+), 9 deletions(-)
  create mode 100644 include/hw/virtio/virtio-access.h

diff --git a/exec.c b/exec.c
index 91513c6..e6777d0 100644
--- a/exec.c
+++ b/exec.c
@@ -42,6 +42,7 @@
  #else /* !CONFIG_USER_ONLY */
  #include "sysemu/xen-mapcache.h"
  #include "trace.h"
+#include "hw/virtio/virtio.h"
  #endif
  #include "exec/cpu-all.h"
  
@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,

   * A helper function for the _utterly broken_ virtio device model to find out 
if
   * it's running on a big endian machine. Don't do this at home kids!
   */
-bool virtio_is_big_endian(void);
  bool virtio_is_big_endian(void)
  {
  #if defined(TARGET_WORDS_BIGENDIAN)
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 1ba53d9..68c3064 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
  common-obj-y += virtio-mmio.o
  common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
  
-obj-y += virtio.o virtio-balloon.o

+obj-y += virtio.o virtio-balloon.o
  obj-$(CONFIG_LINUX) += vhost.o
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ce97514..82a1689 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -89,9 +89,6 @@
  /* Flags track per-device state like workarounds for quirks in older guests. 
*/
  #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
  
-/* HACK for virtio to determine if it's running a big endian guest */

-bool virtio_is_big_endian(void);
-
  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
 VirtIOPCIProxy *dev);
  
@@ -409,13 +406,13 @@ static uint64_t virtio_pci_config_read(void *opaque, hwaddr addr,

  break;
  case 2:
  val = virtio_config_readw(vdev, addr);
-if (virtio_is_big_endian()) {
+if (vdev->is_big_endian) {
  val

Re: [Qemu-devel] [PATCH for 2.0] ide: Correct improper smart self test counter reset in ide core.

2014-04-14 Thread Markus Armbruster
Markus Armbruster  writes:

> Benoît Canet  writes:
>
>> The counter being reseted to zero make the array index negative.
>> Reset it to 1.
>>
>> Signed-off-by: Benoit Canet 
>> ---
>>  hw/ide/core.c |2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/ide/core.c b/hw/ide/core.c
>> index e1dfe54..c943a4d 100644
>> --- a/hw/ide/core.c
>> +++ b/hw/ide/core.c
>> @@ -1602,7 +1602,7 @@ static bool cmd_smart(IDEState *s, uint8_t cmd)
>>  case 2: /* extended self test */
>>  s->smart_selftest_count++;
>>  if (s->smart_selftest_count > 21) {
>> -s->smart_selftest_count = 0;
>> +s->smart_selftest_count = 1;
>>  }
>>  n = 2 + (s->smart_selftest_count - 1) * 24;
>>  s->smart_selftest_data[n] = s->sector;
>
> Good catch.
>
> Commit message could use some love, though.  On every 21st SMART EXECUTE
> OFFLINE:
>
> * We write before a dynamically allocated buffer
>
>   Your diff's context has one of the writes.
>
> * We forget SMART history
>
>   See the s->smart_selftest_count == 0 special cases in SMART READ DATA
>   and SMART READ LOG.

Peter offered to improve the commit message on merge.  Even without
that, it would be better to get this into 2.0 with a sub-optimal commit
message than not to get it, so:

Reviewed-by: Markus Armbruster 



Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Alexander Graf


On 14.04.14 13:58, Greg Kurz wrote:

From: Rusty Russell 

virtio data structures are defined as "target endian", which assumes
that's a fixed value.  In fact, that actually means it's platform-specific.
The OASIS virtio 1.0 spec will fix this, by making it all little endian.

We introduce memory accessors to be used accross the virtio code where
needed. These accessors should support both legacy and 1.0 devices.
A good way to do it is to introduce a per-device property to store the
endianness. We choose to set this flag at device reset time because it
is reasonnable to assume the endianness won't change unless we reboot or
kexec another kernel. And it is also reasonnable to assume the new kernel
will reset the devices before using them (otherwise it will break).

We reuse the virtio_is_big_endian() helper since it provides the right
value for legacy devices with most of the targets, that have fixed
endianness. It can then be overriden to support endian-ambivalent targets.

To support migration, we need to set the flag in virtio_load() as well.

(a) One solution would be to add it to the stream, but it have some
 drawbacks:
- since this only affects a few targets, the field should be put into a
   subsection
- virtio migration code should be ported to vmstate to be able to introduce
   such a subsection

(b) If we assume the following to be true:
- target endianness falls under some cpu state
- cpu state is always restored before virtio devices state because they
   get initialized in this order in main().
Then an alternative is to rely on virtio_is_big_endian() again at
load time. No need to mess around with the migration stream in this case.

This patch implements (b).

Note that the tswap helpers are implemented in virtio.c so that
virtio-access.h stays platform independant. Most of the virtio code
will be buildable under common-obj instead of obj then, and spare
some cycles when building for multiple targets.

Signed-off-by: Rusty Russell 
[ ldq_phys() API change,
   relicensed virtio-access.h to GPLv2+ on Rusty's request,
   introduce a per-device is_big_endian flag (supersedes needs_byteswap global)
   add VirtIODevice * arg to virtio helpers,
   use the existing virtio_is_big_endian() helper,
   virtio-pci: use the device is_big_endian flag,
   introduce virtio tswap16 and tswap64 helpers,
   move calls to tswap* out of virtio-access.h to make it platform independant,
   migration support,
   Greg Kurz  ]
Cc: Cédric Le Goater 
Signed-off-by: Greg Kurz 
---

Changes since v6:
- merge the virtio_needs_byteswap() helper from v6 and existing
   virtio_is_big_endian()
- virtio-pci: now supports endianness changes
- virtio-access.h fixes (target independant)

  exec.c|2 -
  hw/virtio/Makefile.objs   |2 -
  hw/virtio/virtio-pci.c|   11 +--
  hw/virtio/virtio.c|   35 +
  include/hw/virtio/virtio-access.h |  138 +
  include/hw/virtio/virtio.h|2 +
  vl.c  |4 +
  7 files changed, 185 insertions(+), 9 deletions(-)
  create mode 100644 include/hw/virtio/virtio-access.h

diff --git a/exec.c b/exec.c
index 91513c6..e6777d0 100644
--- a/exec.c
+++ b/exec.c
@@ -42,6 +42,7 @@
  #else /* !CONFIG_USER_ONLY */
  #include "sysemu/xen-mapcache.h"
  #include "trace.h"
+#include "hw/virtio/virtio.h"
  #endif
  #include "exec/cpu-all.h"
  
@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,

   * A helper function for the _utterly broken_ virtio device model to find out 
if
   * it's running on a big endian machine. Don't do this at home kids!
   */
-bool virtio_is_big_endian(void);
  bool virtio_is_big_endian(void)
  {
  #if defined(TARGET_WORDS_BIGENDIAN)
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 1ba53d9..68c3064 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
  common-obj-y += virtio-mmio.o
  common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
  
-obj-y += virtio.o virtio-balloon.o

+obj-y += virtio.o virtio-balloon.o
  obj-$(CONFIG_LINUX) += vhost.o
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ce97514..82a1689 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -89,9 +89,6 @@
  /* Flags track per-device state like workarounds for quirks in older guests. 
*/
  #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
  
-/* HACK for virtio to determine if it's running a big endian guest */

-bool virtio_is_big_endian(void);
-
  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
 VirtIOPCIProxy *dev);
  
@@ -409,13 +406,13 @@ static uint64_t virtio_pci_config_read(void *opaque, hwaddr addr,

  break;
  case 2:
  val = virtio_config_readw(vdev, addr);
-if (virtio_is_big_endian()) {
+if (vdev->is_big_endian) {
  val

Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Michael S. Tsirkin
On Mon, Apr 14, 2014 at 02:16:03PM +0200, Alexander Graf wrote:
> 
> On 14.04.14 13:58, Greg Kurz wrote:
> >From: Rusty Russell 
> >
> >virtio data structures are defined as "target endian", which assumes
> >that's a fixed value.  In fact, that actually means it's platform-specific.
> >The OASIS virtio 1.0 spec will fix this, by making it all little endian.
> >
> >We introduce memory accessors to be used accross the virtio code where
> >needed. These accessors should support both legacy and 1.0 devices.
> >A good way to do it is to introduce a per-device property to store the
> >endianness. We choose to set this flag at device reset time because it
> >is reasonnable to assume the endianness won't change unless we reboot or
> >kexec another kernel. And it is also reasonnable to assume the new kernel
> >will reset the devices before using them (otherwise it will break).
> >
> >We reuse the virtio_is_big_endian() helper since it provides the right
> >value for legacy devices with most of the targets, that have fixed
> >endianness. It can then be overriden to support endian-ambivalent targets.
> >
> >To support migration, we need to set the flag in virtio_load() as well.
> >
> >(a) One solution would be to add it to the stream, but it have some
> > drawbacks:
> >- since this only affects a few targets, the field should be put into a
> >   subsection
> >- virtio migration code should be ported to vmstate to be able to introduce
> >   such a subsection
> >
> >(b) If we assume the following to be true:
> >- target endianness falls under some cpu state
> >- cpu state is always restored before virtio devices state because they
> >   get initialized in this order in main().
> >Then an alternative is to rely on virtio_is_big_endian() again at
> >load time. No need to mess around with the migration stream in this case.
> >
> >This patch implements (b).
> >
> >Note that the tswap helpers are implemented in virtio.c so that
> >virtio-access.h stays platform independant. Most of the virtio code
> >will be buildable under common-obj instead of obj then, and spare
> >some cycles when building for multiple targets.
> >
> >Signed-off-by: Rusty Russell 
> >[ ldq_phys() API change,
> >   relicensed virtio-access.h to GPLv2+ on Rusty's request,
> >   introduce a per-device is_big_endian flag (supersedes needs_byteswap 
> > global)
> >   add VirtIODevice * arg to virtio helpers,
> >   use the existing virtio_is_big_endian() helper,
> >   virtio-pci: use the device is_big_endian flag,
> >   introduce virtio tswap16 and tswap64 helpers,
> >   move calls to tswap* out of virtio-access.h to make it platform 
> > independant,
> >   migration support,
> >   Greg Kurz  ]
> >Cc: Cédric Le Goater 
> >Signed-off-by: Greg Kurz 
> >---
> >
> >Changes since v6:
> >- merge the virtio_needs_byteswap() helper from v6 and existing
> >   virtio_is_big_endian()
> >- virtio-pci: now supports endianness changes
> >- virtio-access.h fixes (target independant)
> >
> >  exec.c|2 -
> >  hw/virtio/Makefile.objs   |2 -
> >  hw/virtio/virtio-pci.c|   11 +--
> >  hw/virtio/virtio.c|   35 +
> >  include/hw/virtio/virtio-access.h |  138 
> > +
> >  include/hw/virtio/virtio.h|2 +
> >  vl.c  |4 +
> >  7 files changed, 185 insertions(+), 9 deletions(-)
> >  create mode 100644 include/hw/virtio/virtio-access.h
> >
> >diff --git a/exec.c b/exec.c
> >index 91513c6..e6777d0 100644
> >--- a/exec.c
> >+++ b/exec.c
> >@@ -42,6 +42,7 @@
> >  #else /* !CONFIG_USER_ONLY */
> >  #include "sysemu/xen-mapcache.h"
> >  #include "trace.h"
> >+#include "hw/virtio/virtio.h"
> >  #endif
> >  #include "exec/cpu-all.h"
> >@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong 
> >addr,
> >   * A helper function for the _utterly broken_ virtio device model to find 
> > out if
> >   * it's running on a big endian machine. Don't do this at home kids!
> >   */
> >-bool virtio_is_big_endian(void);
> >  bool virtio_is_big_endian(void)
> >  {
> >  #if defined(TARGET_WORDS_BIGENDIAN)
> >diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> >index 1ba53d9..68c3064 100644
> >--- a/hw/virtio/Makefile.objs
> >+++ b/hw/virtio/Makefile.objs
> >@@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
> >  common-obj-y += virtio-mmio.o
> >  common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
> >-obj-y += virtio.o virtio-balloon.o
> >+obj-y += virtio.o virtio-balloon.o
> >  obj-$(CONFIG_LINUX) += vhost.o
> >diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> >index ce97514..82a1689 100644
> >--- a/hw/virtio/virtio-pci.c
> >+++ b/hw/virtio/virtio-pci.c
> >@@ -89,9 +89,6 @@
> >  /* Flags track per-device state like workarounds for quirks in older 
> > guests. */
> >  #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
> >-/* HACK for virtio to determine if it's running a big endian guest */
> >-bool virtio_is

Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Michael S. Tsirkin
On Mon, Apr 14, 2014 at 02:22:36PM +0200, Alexander Graf wrote:
> 
> On 14.04.14 13:58, Greg Kurz wrote:
> >From: Rusty Russell 
> >
> >virtio data structures are defined as "target endian", which assumes
> >that's a fixed value.  In fact, that actually means it's platform-specific.
> >The OASIS virtio 1.0 spec will fix this, by making it all little endian.
> >
> >We introduce memory accessors to be used accross the virtio code where
> >needed. These accessors should support both legacy and 1.0 devices.
> >A good way to do it is to introduce a per-device property to store the
> >endianness. We choose to set this flag at device reset time because it
> >is reasonnable to assume the endianness won't change unless we reboot or
> >kexec another kernel. And it is also reasonnable to assume the new kernel
> >will reset the devices before using them (otherwise it will break).
> >
> >We reuse the virtio_is_big_endian() helper since it provides the right
> >value for legacy devices with most of the targets, that have fixed
> >endianness. It can then be overriden to support endian-ambivalent targets.
> >
> >To support migration, we need to set the flag in virtio_load() as well.
> >
> >(a) One solution would be to add it to the stream, but it have some
> > drawbacks:
> >- since this only affects a few targets, the field should be put into a
> >   subsection
> >- virtio migration code should be ported to vmstate to be able to introduce
> >   such a subsection
> >
> >(b) If we assume the following to be true:
> >- target endianness falls under some cpu state
> >- cpu state is always restored before virtio devices state because they
> >   get initialized in this order in main().
> >Then an alternative is to rely on virtio_is_big_endian() again at
> >load time. No need to mess around with the migration stream in this case.
> >
> >This patch implements (b).
> >
> >Note that the tswap helpers are implemented in virtio.c so that
> >virtio-access.h stays platform independant. Most of the virtio code
> >will be buildable under common-obj instead of obj then, and spare
> >some cycles when building for multiple targets.
> >
> >Signed-off-by: Rusty Russell 
> >[ ldq_phys() API change,
> >   relicensed virtio-access.h to GPLv2+ on Rusty's request,
> >   introduce a per-device is_big_endian flag (supersedes needs_byteswap 
> > global)
> >   add VirtIODevice * arg to virtio helpers,
> >   use the existing virtio_is_big_endian() helper,
> >   virtio-pci: use the device is_big_endian flag,
> >   introduce virtio tswap16 and tswap64 helpers,
> >   move calls to tswap* out of virtio-access.h to make it platform 
> > independant,
> >   migration support,
> >   Greg Kurz  ]
> >Cc: Cédric Le Goater 
> >Signed-off-by: Greg Kurz 
> >---
> >
> >Changes since v6:
> >- merge the virtio_needs_byteswap() helper from v6 and existing
> >   virtio_is_big_endian()
> >- virtio-pci: now supports endianness changes
> >- virtio-access.h fixes (target independant)
> >
> >  exec.c|2 -
> >  hw/virtio/Makefile.objs   |2 -
> >  hw/virtio/virtio-pci.c|   11 +--
> >  hw/virtio/virtio.c|   35 +
> >  include/hw/virtio/virtio-access.h |  138 
> > +
> >  include/hw/virtio/virtio.h|2 +
> >  vl.c  |4 +
> >  7 files changed, 185 insertions(+), 9 deletions(-)
> >  create mode 100644 include/hw/virtio/virtio-access.h
> >
> >diff --git a/exec.c b/exec.c
> >index 91513c6..e6777d0 100644
> >--- a/exec.c
> >+++ b/exec.c
> >@@ -42,6 +42,7 @@
> >  #else /* !CONFIG_USER_ONLY */
> >  #include "sysemu/xen-mapcache.h"
> >  #include "trace.h"
> >+#include "hw/virtio/virtio.h"
> >  #endif
> >  #include "exec/cpu-all.h"
> >@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong 
> >addr,
> >   * A helper function for the _utterly broken_ virtio device model to find 
> > out if
> >   * it's running on a big endian machine. Don't do this at home kids!
> >   */
> >-bool virtio_is_big_endian(void);
> >  bool virtio_is_big_endian(void)
> >  {
> >  #if defined(TARGET_WORDS_BIGENDIAN)
> >diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> >index 1ba53d9..68c3064 100644
> >--- a/hw/virtio/Makefile.objs
> >+++ b/hw/virtio/Makefile.objs
> >@@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
> >  common-obj-y += virtio-mmio.o
> >  common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
> >-obj-y += virtio.o virtio-balloon.o
> >+obj-y += virtio.o virtio-balloon.o
> >  obj-$(CONFIG_LINUX) += vhost.o
> >diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> >index ce97514..82a1689 100644
> >--- a/hw/virtio/virtio-pci.c
> >+++ b/hw/virtio/virtio-pci.c
> >@@ -89,9 +89,6 @@
> >  /* Flags track per-device state like workarounds for quirks in older 
> > guests. */
> >  #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
> >-/* HACK for virtio to determine if it's running a big endian guest */
> >-bool virtio_is

Re: [Qemu-devel] [PATCH v5] target-ppc: ppc64 target's virtio can be either endian

2014-04-14 Thread Alexander Graf


On 14.04.14 14:12, Greg Kurz wrote:

We turn the virtio_is_big_endian() helper into a target dependant hook
to compute endianness for legacy virtio devices.

We base it on the OS endian, as reflected by the endianness of the
interrupt vectors (handled through the ILE bit in the LPCR register).

Using first_cpu to fetch the registers from KVM may look arbitrary
and awkward, but it is okay because KVM sets/unsets the ILE bit on
all CPUs.

Note that Book3e has no LPCR and worse, has a unrelated spr at the same
index... I did not find better than strcmp() to ensure we are using
the correct register.

Suggested-by: Benjamin Herrenschmidt 
Signed-off-by: Rusty Russell 
[ re-use the existing virito_is_big_endian() helper,
   check we have LPCR,
   Greg Kurz  ]
Signed-off-by: Greg Kurz 
Reviewed-by: Alexander Graf 


Works for me :).


Alex




[Qemu-devel] [PULL for-2.0 2/2] acpi-test: update expected files

2014-04-14 Thread Michael S. Tsirkin
commit 58b035c7354afc0c5351ea62264c01d74196ec26
acpi: fix incorrect encoding for 0x{F-1}
changes the SSDT, update expected files accordingly.

Signed-off-by: Michael S. Tsirkin 
---
 tests/acpi-test-data/pc/SSDT | Bin 2261 -> 2269 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/tests/acpi-test-data/pc/SSDT b/tests/acpi-test-data/pc/SSDT
index 
6444f60ce786da7000d23b11d796cca579747f0c..c987fb237925aa532afb3d795ac5a787548e333d
 100644
GIT binary patch
delta 99
zcmcaAcvp}sIM^lRE(ZexWBNv}NJhr4$tjE#OpXPc?=h;gGI~sAW8Y!UU!B1NRKUQ{
d7HnYpEHfq|hd
W*uo$_a{+>9fy}dn%T4~u{sI7|85%tR

-- 
MST




Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Alexander Graf


On 14.04.14 14:24, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:16:03PM +0200, Alexander Graf wrote:

On 14.04.14 13:58, Greg Kurz wrote:

From: Rusty Russell 

virtio data structures are defined as "target endian", which assumes
that's a fixed value.  In fact, that actually means it's platform-specific.
The OASIS virtio 1.0 spec will fix this, by making it all little endian.

We introduce memory accessors to be used accross the virtio code where
needed. These accessors should support both legacy and 1.0 devices.
A good way to do it is to introduce a per-device property to store the
endianness. We choose to set this flag at device reset time because it
is reasonnable to assume the endianness won't change unless we reboot or
kexec another kernel. And it is also reasonnable to assume the new kernel
will reset the devices before using them (otherwise it will break).

We reuse the virtio_is_big_endian() helper since it provides the right
value for legacy devices with most of the targets, that have fixed
endianness. It can then be overriden to support endian-ambivalent targets.

To support migration, we need to set the flag in virtio_load() as well.

(a) One solution would be to add it to the stream, but it have some
 drawbacks:
- since this only affects a few targets, the field should be put into a
   subsection
- virtio migration code should be ported to vmstate to be able to introduce
   such a subsection

(b) If we assume the following to be true:
- target endianness falls under some cpu state
- cpu state is always restored before virtio devices state because they
   get initialized in this order in main().
Then an alternative is to rely on virtio_is_big_endian() again at
load time. No need to mess around with the migration stream in this case.

This patch implements (b).

Note that the tswap helpers are implemented in virtio.c so that
virtio-access.h stays platform independant. Most of the virtio code
will be buildable under common-obj instead of obj then, and spare
some cycles when building for multiple targets.

Signed-off-by: Rusty Russell 
[ ldq_phys() API change,
   relicensed virtio-access.h to GPLv2+ on Rusty's request,
   introduce a per-device is_big_endian flag (supersedes needs_byteswap global)
   add VirtIODevice * arg to virtio helpers,
   use the existing virtio_is_big_endian() helper,
   virtio-pci: use the device is_big_endian flag,
   introduce virtio tswap16 and tswap64 helpers,
   move calls to tswap* out of virtio-access.h to make it platform independant,
   migration support,
   Greg Kurz  ]
Cc: Cédric Le Goater 
Signed-off-by: Greg Kurz 
---

Changes since v6:
- merge the virtio_needs_byteswap() helper from v6 and existing
   virtio_is_big_endian()
- virtio-pci: now supports endianness changes
- virtio-access.h fixes (target independant)

  exec.c|2 -
  hw/virtio/Makefile.objs   |2 -
  hw/virtio/virtio-pci.c|   11 +--
  hw/virtio/virtio.c|   35 +
  include/hw/virtio/virtio-access.h |  138 +
  include/hw/virtio/virtio.h|2 +
  vl.c  |4 +
  7 files changed, 185 insertions(+), 9 deletions(-)
  create mode 100644 include/hw/virtio/virtio-access.h

diff --git a/exec.c b/exec.c
index 91513c6..e6777d0 100644
--- a/exec.c
+++ b/exec.c
@@ -42,6 +42,7 @@
  #else /* !CONFIG_USER_ONLY */
  #include "sysemu/xen-mapcache.h"
  #include "trace.h"
+#include "hw/virtio/virtio.h"
  #endif
  #include "exec/cpu-all.h"
@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
   * A helper function for the _utterly broken_ virtio device model to find out 
if
   * it's running on a big endian machine. Don't do this at home kids!
   */
-bool virtio_is_big_endian(void);
  bool virtio_is_big_endian(void)
  {
  #if defined(TARGET_WORDS_BIGENDIAN)
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 1ba53d9..68c3064 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
  common-obj-y += virtio-mmio.o
  common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
-obj-y += virtio.o virtio-balloon.o
+obj-y += virtio.o virtio-balloon.o
  obj-$(CONFIG_LINUX) += vhost.o
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ce97514..82a1689 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -89,9 +89,6 @@
  /* Flags track per-device state like workarounds for quirks in older guests. 
*/
  #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
-/* HACK for virtio to determine if it's running a big endian guest */
-bool virtio_is_big_endian(void);
-
  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
 VirtIOPCIProxy *dev);
@@ -409,13 +406,13 @@ static uint64_t virtio_pci_config_read(void *opaque, 
hwaddr addr,
  break;
  case 2:
  val = virtio_config_readw(vdev, add

[Qemu-devel] [PULL for-2.0 0/2] acpi: SSDT update

2014-04-14 Thread Michael S. Tsirkin
The following changes since commit 590e5dd98fcc926cc3b63aad35aed79235ca4c2a:

  configure: Make stack-protector test check both compile and link (2014-04-14 
12:11:18 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to 8611224a7b21db8fa1b0d12fdd053443543dd708:

  acpi-test: update expected files (2014-04-14 15:13:27 +0300)


acpi: SSDT update

This has a fix by Igor for a regression introduced by
bridge hotplug code.
Expected test files were updated accordingly.

Signed-off-by: Michael S. Tsirkin 


Igor Mammedov (1):
  acpi: fix incorrect encoding for 0x{F-1}

Michael S. Tsirkin (1):
  acpi-test: update expected files

 hw/i386/acpi-build.c |   2 +-
 tests/acpi-test-data/pc/SSDT | Bin 2261 -> 2269 bytes
 2 files changed, 1 insertion(+), 1 deletion(-)




[Qemu-devel] [PULL for-2.0 1/2] acpi: fix incorrect encoding for 0x{F-1}FFFF

2014-04-14 Thread Michael S. Tsirkin
From: Igor Mammedov 

Fix typo in build_append_int() which causes integer
truncation when it's in range 0x{F-1} by packing it
as WordConst instead of required DWordConst.

In partucular this fixes a regression: hotplug in slots 16,17,18 and 19
didn't work, since SSDT had code like this:

If (And (Arg0, 0x))
{
Notify (S80, Arg1)
}

Signed-off-by: Igor Mammedov 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Stefan Weil 
---
 hw/i386/acpi-build.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index a5d3fbf..c98df88 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -391,7 +391,7 @@ static void build_append_int(GArray *table, uint32_t value)
 build_append_byte(table, 0x01); /* OneOp */
 } else if (value <= 0xFF) {
 build_append_value(table, value, 1);
-} else if (value <= 0xF) {
+} else if (value <= 0x) {
 build_append_value(table, value, 2);
 } else {
 build_append_value(table, value, 4);
-- 
MST




Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Alexander Graf


On 14.04.14 14:28, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:22:36PM +0200, Alexander Graf wrote:

On 14.04.14 13:58, Greg Kurz wrote:

From: Rusty Russell 

virtio data structures are defined as "target endian", which assumes
that's a fixed value.  In fact, that actually means it's platform-specific.
The OASIS virtio 1.0 spec will fix this, by making it all little endian.

We introduce memory accessors to be used accross the virtio code where
needed. These accessors should support both legacy and 1.0 devices.
A good way to do it is to introduce a per-device property to store the
endianness. We choose to set this flag at device reset time because it
is reasonnable to assume the endianness won't change unless we reboot or
kexec another kernel. And it is also reasonnable to assume the new kernel
will reset the devices before using them (otherwise it will break).

We reuse the virtio_is_big_endian() helper since it provides the right
value for legacy devices with most of the targets, that have fixed
endianness. It can then be overriden to support endian-ambivalent targets.

To support migration, we need to set the flag in virtio_load() as well.

(a) One solution would be to add it to the stream, but it have some
 drawbacks:
- since this only affects a few targets, the field should be put into a
   subsection
- virtio migration code should be ported to vmstate to be able to introduce
   such a subsection

(b) If we assume the following to be true:
- target endianness falls under some cpu state
- cpu state is always restored before virtio devices state because they
   get initialized in this order in main().
Then an alternative is to rely on virtio_is_big_endian() again at
load time. No need to mess around with the migration stream in this case.

This patch implements (b).

Note that the tswap helpers are implemented in virtio.c so that
virtio-access.h stays platform independant. Most of the virtio code
will be buildable under common-obj instead of obj then, and spare
some cycles when building for multiple targets.

Signed-off-by: Rusty Russell 
[ ldq_phys() API change,
   relicensed virtio-access.h to GPLv2+ on Rusty's request,
   introduce a per-device is_big_endian flag (supersedes needs_byteswap global)
   add VirtIODevice * arg to virtio helpers,
   use the existing virtio_is_big_endian() helper,
   virtio-pci: use the device is_big_endian flag,
   introduce virtio tswap16 and tswap64 helpers,
   move calls to tswap* out of virtio-access.h to make it platform independant,
   migration support,
   Greg Kurz  ]
Cc: Cédric Le Goater 
Signed-off-by: Greg Kurz 
---

Changes since v6:
- merge the virtio_needs_byteswap() helper from v6 and existing
   virtio_is_big_endian()
- virtio-pci: now supports endianness changes
- virtio-access.h fixes (target independant)

  exec.c|2 -
  hw/virtio/Makefile.objs   |2 -
  hw/virtio/virtio-pci.c|   11 +--
  hw/virtio/virtio.c|   35 +
  include/hw/virtio/virtio-access.h |  138 +
  include/hw/virtio/virtio.h|2 +
  vl.c  |4 +
  7 files changed, 185 insertions(+), 9 deletions(-)
  create mode 100644 include/hw/virtio/virtio-access.h

diff --git a/exec.c b/exec.c
index 91513c6..e6777d0 100644
--- a/exec.c
+++ b/exec.c
@@ -42,6 +42,7 @@
  #else /* !CONFIG_USER_ONLY */
  #include "sysemu/xen-mapcache.h"
  #include "trace.h"
+#include "hw/virtio/virtio.h"
  #endif
  #include "exec/cpu-all.h"
@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
   * A helper function for the _utterly broken_ virtio device model to find out 
if
   * it's running on a big endian machine. Don't do this at home kids!
   */
-bool virtio_is_big_endian(void);
  bool virtio_is_big_endian(void)
  {
  #if defined(TARGET_WORDS_BIGENDIAN)
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 1ba53d9..68c3064 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
  common-obj-y += virtio-mmio.o
  common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
-obj-y += virtio.o virtio-balloon.o
+obj-y += virtio.o virtio-balloon.o
  obj-$(CONFIG_LINUX) += vhost.o
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ce97514..82a1689 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -89,9 +89,6 @@
  /* Flags track per-device state like workarounds for quirks in older guests. 
*/
  #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
-/* HACK for virtio to determine if it's running a big endian guest */
-bool virtio_is_big_endian(void);
-
  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
 VirtIOPCIProxy *dev);
@@ -409,13 +406,13 @@ static uint64_t virtio_pci_config_read(void *opaque, 
hwaddr addr,
  break;
  case 2:
  val = virtio_config_readw(vdev, add

Re: [Qemu-devel] [PATCH for 2.0] ide: Correct improper smart self test counter reset in ide core.

2014-04-14 Thread Kevin Wolf
Am 14.04.2014 um 14:10 hat Markus Armbruster geschrieben:
> Markus Armbruster  writes:
> 
> > Benoît Canet  writes:
> >
> >> The counter being reseted to zero make the array index negative.
> >> Reset it to 1.
> >>
> >> Signed-off-by: Benoit Canet 
> >> ---
> >>  hw/ide/core.c |2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/hw/ide/core.c b/hw/ide/core.c
> >> index e1dfe54..c943a4d 100644
> >> --- a/hw/ide/core.c
> >> +++ b/hw/ide/core.c
> >> @@ -1602,7 +1602,7 @@ static bool cmd_smart(IDEState *s, uint8_t cmd)
> >>  case 2: /* extended self test */
> >>  s->smart_selftest_count++;
> >>  if (s->smart_selftest_count > 21) {
> >> -s->smart_selftest_count = 0;
> >> +s->smart_selftest_count = 1;
> >>  }
> >>  n = 2 + (s->smart_selftest_count - 1) * 24;
> >>  s->smart_selftest_data[n] = s->sector;
> >
> > Good catch.
> >
> > Commit message could use some love, though.  On every 21st SMART EXECUTE
> > OFFLINE:
> >
> > * We write before a dynamically allocated buffer
> >
> >   Your diff's context has one of the writes.
> >
> > * We forget SMART history
> >
> >   See the s->smart_selftest_count == 0 special cases in SMART READ DATA
> >   and SMART READ LOG.
> 
> Peter offered to improve the commit message on merge.  Even without
> that, it would be better to get this into 2.0 with a sub-optimal commit
> message than not to get it, so:
> 
> Reviewed-by: Markus Armbruster 

Should also be fixed in the stable branch of earlier releases. The bug
is present since SMART emulation was added in 2009.

Cc: qemu-sta...@nongnu.org
Acked-by: Kevin Wolf 



[Qemu-devel] When QEMU is emulating DMA, can other vcpus read/write the same memory region?

2014-04-14 Thread Le Tan
Hi, I have been confused by this question for a long time, both in
hardware and software.
First, in the real x86 hardware architectures, while the DMA is
operating, can other CPUs access the same memory region? Or the DMA
controller will hold the bus and the memory exclusively and CPUs have
to wait? Of course, we assume that the operating system is too stupid
to provide the mutex in software.
Second, if the CPUs have to wait, then how about in QEMU? When QEMU is
emulating the DMA, can other vcpus access the memory (or exactly, the
same memory region)? If not, how does QEMU achieve this? That is, how
does QEMU prevent other vcpus to access the memory? I have read
through the part of the emulation of DMA, but I didn't see anything
related to this.
Thanks very much!

--Le Tan



Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Michael S. Tsirkin
On Mon, Apr 14, 2014 at 02:31:06PM +0200, Alexander Graf wrote:
> 
> On 14.04.14 14:28, Michael S. Tsirkin wrote:
> >On Mon, Apr 14, 2014 at 02:22:36PM +0200, Alexander Graf wrote:
> >>On 14.04.14 13:58, Greg Kurz wrote:
> >>>From: Rusty Russell 
> >>>
> >>>virtio data structures are defined as "target endian", which assumes
> >>>that's a fixed value.  In fact, that actually means it's platform-specific.
> >>>The OASIS virtio 1.0 spec will fix this, by making it all little endian.
> >>>
> >>>We introduce memory accessors to be used accross the virtio code where
> >>>needed. These accessors should support both legacy and 1.0 devices.
> >>>A good way to do it is to introduce a per-device property to store the
> >>>endianness. We choose to set this flag at device reset time because it
> >>>is reasonnable to assume the endianness won't change unless we reboot or
> >>>kexec another kernel. And it is also reasonnable to assume the new kernel
> >>>will reset the devices before using them (otherwise it will break).
> >>>
> >>>We reuse the virtio_is_big_endian() helper since it provides the right
> >>>value for legacy devices with most of the targets, that have fixed
> >>>endianness. It can then be overriden to support endian-ambivalent targets.
> >>>
> >>>To support migration, we need to set the flag in virtio_load() as well.
> >>>
> >>>(a) One solution would be to add it to the stream, but it have some
> >>> drawbacks:
> >>>- since this only affects a few targets, the field should be put into a
> >>>   subsection
> >>>- virtio migration code should be ported to vmstate to be able to introduce
> >>>   such a subsection
> >>>
> >>>(b) If we assume the following to be true:
> >>>- target endianness falls under some cpu state
> >>>- cpu state is always restored before virtio devices state because they
> >>>   get initialized in this order in main().
> >>>Then an alternative is to rely on virtio_is_big_endian() again at
> >>>load time. No need to mess around with the migration stream in this case.
> >>>
> >>>This patch implements (b).
> >>>
> >>>Note that the tswap helpers are implemented in virtio.c so that
> >>>virtio-access.h stays platform independant. Most of the virtio code
> >>>will be buildable under common-obj instead of obj then, and spare
> >>>some cycles when building for multiple targets.
> >>>
> >>>Signed-off-by: Rusty Russell 
> >>>[ ldq_phys() API change,
> >>>   relicensed virtio-access.h to GPLv2+ on Rusty's request,
> >>>   introduce a per-device is_big_endian flag (supersedes needs_byteswap 
> >>> global)
> >>>   add VirtIODevice * arg to virtio helpers,
> >>>   use the existing virtio_is_big_endian() helper,
> >>>   virtio-pci: use the device is_big_endian flag,
> >>>   introduce virtio tswap16 and tswap64 helpers,
> >>>   move calls to tswap* out of virtio-access.h to make it platform 
> >>> independant,
> >>>   migration support,
> >>>   Greg Kurz  ]
> >>>Cc: Cédric Le Goater 
> >>>Signed-off-by: Greg Kurz 
> >>>---
> >>>
> >>>Changes since v6:
> >>>- merge the virtio_needs_byteswap() helper from v6 and existing
> >>>   virtio_is_big_endian()
> >>>- virtio-pci: now supports endianness changes
> >>>- virtio-access.h fixes (target independant)
> >>>
> >>>  exec.c|2 -
> >>>  hw/virtio/Makefile.objs   |2 -
> >>>  hw/virtio/virtio-pci.c|   11 +--
> >>>  hw/virtio/virtio.c|   35 +
> >>>  include/hw/virtio/virtio-access.h |  138 
> >>> +
> >>>  include/hw/virtio/virtio.h|2 +
> >>>  vl.c  |4 +
> >>>  7 files changed, 185 insertions(+), 9 deletions(-)
> >>>  create mode 100644 include/hw/virtio/virtio-access.h
> >>>
> >>>diff --git a/exec.c b/exec.c
> >>>index 91513c6..e6777d0 100644
> >>>--- a/exec.c
> >>>+++ b/exec.c
> >>>@@ -42,6 +42,7 @@
> >>>  #else /* !CONFIG_USER_ONLY */
> >>>  #include "sysemu/xen-mapcache.h"
> >>>  #include "trace.h"
> >>>+#include "hw/virtio/virtio.h"
> >>>  #endif
> >>>  #include "exec/cpu-all.h"
> >>>@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong 
> >>>addr,
> >>>   * A helper function for the _utterly broken_ virtio device model to 
> >>> find out if
> >>>   * it's running on a big endian machine. Don't do this at home kids!
> >>>   */
> >>>-bool virtio_is_big_endian(void);
> >>>  bool virtio_is_big_endian(void)
> >>>  {
> >>>  #if defined(TARGET_WORDS_BIGENDIAN)
> >>>diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> >>>index 1ba53d9..68c3064 100644
> >>>--- a/hw/virtio/Makefile.objs
> >>>+++ b/hw/virtio/Makefile.objs
> >>>@@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
> >>>  common-obj-y += virtio-mmio.o
> >>>  common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
> >>>-obj-y += virtio.o virtio-balloon.o
> >>>+obj-y += virtio.o virtio-balloon.o
> >>>  obj-$(CONFIG_LINUX) += vhost.o
> >>>diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> >>>index ce97514..82a1

Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Michael S. Tsirkin
On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:
> 
> On 14.04.14 14:24, Michael S. Tsirkin wrote:
> >On Mon, Apr 14, 2014 at 02:16:03PM +0200, Alexander Graf wrote:
> >>On 14.04.14 13:58, Greg Kurz wrote:
> >>>From: Rusty Russell 
> >>>
> >>>virtio data structures are defined as "target endian", which assumes
> >>>that's a fixed value.  In fact, that actually means it's platform-specific.
> >>>The OASIS virtio 1.0 spec will fix this, by making it all little endian.
> >>>
> >>>We introduce memory accessors to be used accross the virtio code where
> >>>needed. These accessors should support both legacy and 1.0 devices.
> >>>A good way to do it is to introduce a per-device property to store the
> >>>endianness. We choose to set this flag at device reset time because it
> >>>is reasonnable to assume the endianness won't change unless we reboot or
> >>>kexec another kernel. And it is also reasonnable to assume the new kernel
> >>>will reset the devices before using them (otherwise it will break).
> >>>
> >>>We reuse the virtio_is_big_endian() helper since it provides the right
> >>>value for legacy devices with most of the targets, that have fixed
> >>>endianness. It can then be overriden to support endian-ambivalent targets.
> >>>
> >>>To support migration, we need to set the flag in virtio_load() as well.
> >>>
> >>>(a) One solution would be to add it to the stream, but it have some
> >>> drawbacks:
> >>>- since this only affects a few targets, the field should be put into a
> >>>   subsection
> >>>- virtio migration code should be ported to vmstate to be able to introduce
> >>>   such a subsection
> >>>
> >>>(b) If we assume the following to be true:
> >>>- target endianness falls under some cpu state
> >>>- cpu state is always restored before virtio devices state because they
> >>>   get initialized in this order in main().
> >>>Then an alternative is to rely on virtio_is_big_endian() again at
> >>>load time. No need to mess around with the migration stream in this case.
> >>>
> >>>This patch implements (b).
> >>>
> >>>Note that the tswap helpers are implemented in virtio.c so that
> >>>virtio-access.h stays platform independant. Most of the virtio code
> >>>will be buildable under common-obj instead of obj then, and spare
> >>>some cycles when building for multiple targets.
> >>>
> >>>Signed-off-by: Rusty Russell 
> >>>[ ldq_phys() API change,
> >>>   relicensed virtio-access.h to GPLv2+ on Rusty's request,
> >>>   introduce a per-device is_big_endian flag (supersedes needs_byteswap 
> >>> global)
> >>>   add VirtIODevice * arg to virtio helpers,
> >>>   use the existing virtio_is_big_endian() helper,
> >>>   virtio-pci: use the device is_big_endian flag,
> >>>   introduce virtio tswap16 and tswap64 helpers,
> >>>   move calls to tswap* out of virtio-access.h to make it platform 
> >>> independant,
> >>>   migration support,
> >>>   Greg Kurz  ]
> >>>Cc: Cédric Le Goater 
> >>>Signed-off-by: Greg Kurz 
> >>>---
> >>>
> >>>Changes since v6:
> >>>- merge the virtio_needs_byteswap() helper from v6 and existing
> >>>   virtio_is_big_endian()
> >>>- virtio-pci: now supports endianness changes
> >>>- virtio-access.h fixes (target independant)
> >>>
> >>>  exec.c|2 -
> >>>  hw/virtio/Makefile.objs   |2 -
> >>>  hw/virtio/virtio-pci.c|   11 +--
> >>>  hw/virtio/virtio.c|   35 +
> >>>  include/hw/virtio/virtio-access.h |  138 
> >>> +
> >>>  include/hw/virtio/virtio.h|2 +
> >>>  vl.c  |4 +
> >>>  7 files changed, 185 insertions(+), 9 deletions(-)
> >>>  create mode 100644 include/hw/virtio/virtio-access.h
> >>>
> >>>diff --git a/exec.c b/exec.c
> >>>index 91513c6..e6777d0 100644
> >>>--- a/exec.c
> >>>+++ b/exec.c
> >>>@@ -42,6 +42,7 @@
> >>>  #else /* !CONFIG_USER_ONLY */
> >>>  #include "sysemu/xen-mapcache.h"
> >>>  #include "trace.h"
> >>>+#include "hw/virtio/virtio.h"
> >>>  #endif
> >>>  #include "exec/cpu-all.h"
> >>>@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong 
> >>>addr,
> >>>   * A helper function for the _utterly broken_ virtio device model to 
> >>> find out if
> >>>   * it's running on a big endian machine. Don't do this at home kids!
> >>>   */
> >>>-bool virtio_is_big_endian(void);
> >>>  bool virtio_is_big_endian(void)
> >>>  {
> >>>  #if defined(TARGET_WORDS_BIGENDIAN)
> >>>diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> >>>index 1ba53d9..68c3064 100644
> >>>--- a/hw/virtio/Makefile.objs
> >>>+++ b/hw/virtio/Makefile.objs
> >>>@@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
> >>>  common-obj-y += virtio-mmio.o
> >>>  common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
> >>>-obj-y += virtio.o virtio-balloon.o
> >>>+obj-y += virtio.o virtio-balloon.o
> >>>  obj-$(CONFIG_LINUX) += vhost.o
> >>>diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> >>>index ce97514..82a1

Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Alexander Graf


On 14.04.14 14:34, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:31:06PM +0200, Alexander Graf wrote:

On 14.04.14 14:28, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:22:36PM +0200, Alexander Graf wrote:

Could we try to poison any non-virtio, non-endian-specific memory
accessors when this file is included? That way we don't fall into
pitfalls where we end up running stl_p in a tiny corner case
somewhere still.


Alex

Not sure about all users but it looks like the only file including this
is virtio.c right?
I'm guessing that's safe there.

any virtio device implementations, since they need to communicate
with the guest :).

In that case how can we poison regular memory accesses?
Devices normally do more than virtio related things.


Device should never do anything based on the guest endianness (because 
they don't know), so stx_p and ldx_p should always be poisoned.



Alex




Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Greg Kurz
On Mon, 14 Apr 2014 14:16:03 +0200
Alexander Graf  wrote:

> 
> On 14.04.14 13:58, Greg Kurz wrote:
> > From: Rusty Russell 
> >
> > virtio data structures are defined as "target endian", which assumes
> > that's a fixed value.  In fact, that actually means it's platform-specific.
> > The OASIS virtio 1.0 spec will fix this, by making it all little endian.
> >
> > We introduce memory accessors to be used accross the virtio code where
> > needed. These accessors should support both legacy and 1.0 devices.
> > A good way to do it is to introduce a per-device property to store the
> > endianness. We choose to set this flag at device reset time because it
> > is reasonnable to assume the endianness won't change unless we reboot or
> > kexec another kernel. And it is also reasonnable to assume the new kernel
> > will reset the devices before using them (otherwise it will break).
> >
> > We reuse the virtio_is_big_endian() helper since it provides the right
> > value for legacy devices with most of the targets, that have fixed
> > endianness. It can then be overriden to support endian-ambivalent targets.
> >
> > To support migration, we need to set the flag in virtio_load() as well.
> >
> > (a) One solution would be to add it to the stream, but it have some
> >  drawbacks:
> > - since this only affects a few targets, the field should be put into a
> >subsection
> > - virtio migration code should be ported to vmstate to be able to introduce
> >such a subsection
> >
> > (b) If we assume the following to be true:
> > - target endianness falls under some cpu state
> > - cpu state is always restored before virtio devices state because they
> >get initialized in this order in main().
> > Then an alternative is to rely on virtio_is_big_endian() again at
> > load time. No need to mess around with the migration stream in this case.
> >
> > This patch implements (b).
> >
> > Note that the tswap helpers are implemented in virtio.c so that
> > virtio-access.h stays platform independant. Most of the virtio code
> > will be buildable under common-obj instead of obj then, and spare
> > some cycles when building for multiple targets.
> >
> > Signed-off-by: Rusty Russell 
> > [ ldq_phys() API change,
> >relicensed virtio-access.h to GPLv2+ on Rusty's request,
> >introduce a per-device is_big_endian flag (supersedes needs_byteswap 
> > global)
> >add VirtIODevice * arg to virtio helpers,
> >use the existing virtio_is_big_endian() helper,
> >virtio-pci: use the device is_big_endian flag,
> >introduce virtio tswap16 and tswap64 helpers,
> >move calls to tswap* out of virtio-access.h to make it platform 
> > independant,
> >migration support,
> >Greg Kurz  ]
> > Cc: Cédric Le Goater 
> > Signed-off-by: Greg Kurz 
> > ---
> >
> > Changes since v6:
> > - merge the virtio_needs_byteswap() helper from v6 and existing
> >virtio_is_big_endian()
> > - virtio-pci: now supports endianness changes
> > - virtio-access.h fixes (target independant)
> >
> >   exec.c|2 -
> >   hw/virtio/Makefile.objs   |2 -
> >   hw/virtio/virtio-pci.c|   11 +--
> >   hw/virtio/virtio.c|   35 +
> >   include/hw/virtio/virtio-access.h |  138 
> > +
> >   include/hw/virtio/virtio.h|2 +
> >   vl.c  |4 +
> >   7 files changed, 185 insertions(+), 9 deletions(-)
> >   create mode 100644 include/hw/virtio/virtio-access.h
> >
> > diff --git a/exec.c b/exec.c
> > index 91513c6..e6777d0 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -42,6 +42,7 @@
> >   #else /* !CONFIG_USER_ONLY */
> >   #include "sysemu/xen-mapcache.h"
> >   #include "trace.h"
> > +#include "hw/virtio/virtio.h"
> >   #endif
> >   #include "exec/cpu-all.h"
> >   
> > @@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong 
> > addr,
> >* A helper function for the _utterly broken_ virtio device model to find 
> > out if
> >* it's running on a big endian machine. Don't do this at home kids!
> >*/
> > -bool virtio_is_big_endian(void);
> >   bool virtio_is_big_endian(void)
> >   {
> >   #if defined(TARGET_WORDS_BIGENDIAN)
> > diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> > index 1ba53d9..68c3064 100644
> > --- a/hw/virtio/Makefile.objs
> > +++ b/hw/virtio/Makefile.objs
> > @@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
> >   common-obj-y += virtio-mmio.o
> >   common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
> >   
> > -obj-y += virtio.o virtio-balloon.o
> > +obj-y += virtio.o virtio-balloon.o
> >   obj-$(CONFIG_LINUX) += vhost.o
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index ce97514..82a1689 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -89,9 +89,6 @@
> >   /* Flags track per-device state like workarounds for quirks in older 
> > guests. */
> >   #define VIRTIO_PCI_FLAG_BUS_MASTER_

Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Alexander Graf


On 14.04.14 14:37, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:

On 14.04.14 14:24, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:16:03PM +0200, Alexander Graf wrote:

On 14.04.14 13:58, Greg Kurz wrote:

From: Rusty Russell 

virtio data structures are defined as "target endian", which assumes
that's a fixed value.  In fact, that actually means it's platform-specific.
The OASIS virtio 1.0 spec will fix this, by making it all little endian.

We introduce memory accessors to be used accross the virtio code where
needed. These accessors should support both legacy and 1.0 devices.
A good way to do it is to introduce a per-device property to store the
endianness. We choose to set this flag at device reset time because it
is reasonnable to assume the endianness won't change unless we reboot or
kexec another kernel. And it is also reasonnable to assume the new kernel
will reset the devices before using them (otherwise it will break).

We reuse the virtio_is_big_endian() helper since it provides the right
value for legacy devices with most of the targets, that have fixed
endianness. It can then be overriden to support endian-ambivalent targets.

To support migration, we need to set the flag in virtio_load() as well.

(a) One solution would be to add it to the stream, but it have some
 drawbacks:
- since this only affects a few targets, the field should be put into a
   subsection
- virtio migration code should be ported to vmstate to be able to introduce
   such a subsection

(b) If we assume the following to be true:
- target endianness falls under some cpu state
- cpu state is always restored before virtio devices state because they
   get initialized in this order in main().
Then an alternative is to rely on virtio_is_big_endian() again at
load time. No need to mess around with the migration stream in this case.

This patch implements (b).

Note that the tswap helpers are implemented in virtio.c so that
virtio-access.h stays platform independant. Most of the virtio code
will be buildable under common-obj instead of obj then, and spare
some cycles when building for multiple targets.

Signed-off-by: Rusty Russell 
[ ldq_phys() API change,
   relicensed virtio-access.h to GPLv2+ on Rusty's request,
   introduce a per-device is_big_endian flag (supersedes needs_byteswap global)
   add VirtIODevice * arg to virtio helpers,
   use the existing virtio_is_big_endian() helper,
   virtio-pci: use the device is_big_endian flag,
   introduce virtio tswap16 and tswap64 helpers,
   move calls to tswap* out of virtio-access.h to make it platform independant,
   migration support,
   Greg Kurz  ]
Cc: Cédric Le Goater 
Signed-off-by: Greg Kurz 
---

Changes since v6:
- merge the virtio_needs_byteswap() helper from v6 and existing
   virtio_is_big_endian()
- virtio-pci: now supports endianness changes
- virtio-access.h fixes (target independant)

  exec.c|2 -
  hw/virtio/Makefile.objs   |2 -
  hw/virtio/virtio-pci.c|   11 +--
  hw/virtio/virtio.c|   35 +
  include/hw/virtio/virtio-access.h |  138 +
  include/hw/virtio/virtio.h|2 +
  vl.c  |4 +
  7 files changed, 185 insertions(+), 9 deletions(-)
  create mode 100644 include/hw/virtio/virtio-access.h

diff --git a/exec.c b/exec.c
index 91513c6..e6777d0 100644
--- a/exec.c
+++ b/exec.c
@@ -42,6 +42,7 @@
  #else /* !CONFIG_USER_ONLY */
  #include "sysemu/xen-mapcache.h"
  #include "trace.h"
+#include "hw/virtio/virtio.h"
  #endif
  #include "exec/cpu-all.h"
@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
   * A helper function for the _utterly broken_ virtio device model to find out 
if
   * it's running on a big endian machine. Don't do this at home kids!
   */
-bool virtio_is_big_endian(void);
  bool virtio_is_big_endian(void)
  {
  #if defined(TARGET_WORDS_BIGENDIAN)
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 1ba53d9..68c3064 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
  common-obj-y += virtio-mmio.o
  common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
-obj-y += virtio.o virtio-balloon.o
+obj-y += virtio.o virtio-balloon.o
  obj-$(CONFIG_LINUX) += vhost.o
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ce97514..82a1689 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -89,9 +89,6 @@
  /* Flags track per-device state like workarounds for quirks in older guests. 
*/
  #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
-/* HACK for virtio to determine if it's running a big endian guest */
-bool virtio_is_big_endian(void);
-
  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
 VirtIOPCIProxy *dev);
@@ -409,13 +406,13 @@ static uint64_t virtio_pci_confi

Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Michael S. Tsirkin
On Mon, Apr 14, 2014 at 02:38:14PM +0200, Alexander Graf wrote:
> 
> On 14.04.14 14:34, Michael S. Tsirkin wrote:
> >On Mon, Apr 14, 2014 at 02:31:06PM +0200, Alexander Graf wrote:
> >>On 14.04.14 14:28, Michael S. Tsirkin wrote:
> >>>On Mon, Apr 14, 2014 at 02:22:36PM +0200, Alexander Graf wrote:
> Could we try to poison any non-virtio, non-endian-specific memory
> accessors when this file is included? That way we don't fall into
> pitfalls where we end up running stl_p in a tiny corner case
> somewhere still.
> 
> 
> Alex
> >>>Not sure about all users but it looks like the only file including this
> >>>is virtio.c right?
> >>>I'm guessing that's safe there.
> >>any virtio device implementations, since they need to communicate
> >>with the guest :).
> >In that case how can we poison regular memory accesses?
> >Devices normally do more than virtio related things.
> 
> Device should never do anything based on the guest endianness
> (because they don't know), so stx_p and ldx_p should always be
> poisoned.
> 
> 
> Alex

Makes sense.



Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Michael S. Tsirkin
On Mon, Apr 14, 2014 at 02:40:04PM +0200, Alexander Graf wrote:
> 
> On 14.04.14 14:37, Michael S. Tsirkin wrote:
> >On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:
> >>On 14.04.14 14:24, Michael S. Tsirkin wrote:
> >>>On Mon, Apr 14, 2014 at 02:16:03PM +0200, Alexander Graf wrote:
> On 14.04.14 13:58, Greg Kurz wrote:
> >From: Rusty Russell 
> >
> >virtio data structures are defined as "target endian", which assumes
> >that's a fixed value.  In fact, that actually means it's 
> >platform-specific.
> >The OASIS virtio 1.0 spec will fix this, by making it all little endian.
> >
> >We introduce memory accessors to be used accross the virtio code where
> >needed. These accessors should support both legacy and 1.0 devices.
> >A good way to do it is to introduce a per-device property to store the
> >endianness. We choose to set this flag at device reset time because it
> >is reasonnable to assume the endianness won't change unless we reboot or
> >kexec another kernel. And it is also reasonnable to assume the new kernel
> >will reset the devices before using them (otherwise it will break).
> >
> >We reuse the virtio_is_big_endian() helper since it provides the right
> >value for legacy devices with most of the targets, that have fixed
> >endianness. It can then be overriden to support endian-ambivalent 
> >targets.
> >
> >To support migration, we need to set the flag in virtio_load() as well.
> >
> >(a) One solution would be to add it to the stream, but it have some
> > drawbacks:
> >- since this only affects a few targets, the field should be put into a
> >   subsection
> >- virtio migration code should be ported to vmstate to be able to 
> >introduce
> >   such a subsection
> >
> >(b) If we assume the following to be true:
> >- target endianness falls under some cpu state
> >- cpu state is always restored before virtio devices state because they
> >   get initialized in this order in main().
> >Then an alternative is to rely on virtio_is_big_endian() again at
> >load time. No need to mess around with the migration stream in this case.
> >
> >This patch implements (b).
> >
> >Note that the tswap helpers are implemented in virtio.c so that
> >virtio-access.h stays platform independant. Most of the virtio code
> >will be buildable under common-obj instead of obj then, and spare
> >some cycles when building for multiple targets.
> >
> >Signed-off-by: Rusty Russell 
> >[ ldq_phys() API change,
> >   relicensed virtio-access.h to GPLv2+ on Rusty's request,
> >   introduce a per-device is_big_endian flag (supersedes needs_byteswap 
> > global)
> >   add VirtIODevice * arg to virtio helpers,
> >   use the existing virtio_is_big_endian() helper,
> >   virtio-pci: use the device is_big_endian flag,
> >   introduce virtio tswap16 and tswap64 helpers,
> >   move calls to tswap* out of virtio-access.h to make it platform 
> > independant,
> >   migration support,
> >   Greg Kurz  ]
> >Cc: Cédric Le Goater 
> >Signed-off-by: Greg Kurz 
> >---
> >
> >Changes since v6:
> >- merge the virtio_needs_byteswap() helper from v6 and existing
> >   virtio_is_big_endian()
> >- virtio-pci: now supports endianness changes
> >- virtio-access.h fixes (target independant)
> >
> >  exec.c|2 -
> >  hw/virtio/Makefile.objs   |2 -
> >  hw/virtio/virtio-pci.c|   11 +--
> >  hw/virtio/virtio.c|   35 +
> >  include/hw/virtio/virtio-access.h |  138 
> > +
> >  include/hw/virtio/virtio.h|2 +
> >  vl.c  |4 +
> >  7 files changed, 185 insertions(+), 9 deletions(-)
> >  create mode 100644 include/hw/virtio/virtio-access.h
> >
> >diff --git a/exec.c b/exec.c
> >index 91513c6..e6777d0 100644
> >--- a/exec.c
> >+++ b/exec.c
> >@@ -42,6 +42,7 @@
> >  #else /* !CONFIG_USER_ONLY */
> >  #include "sysemu/xen-mapcache.h"
> >  #include "trace.h"
> >+#include "hw/virtio/virtio.h"
> >  #endif
> >  #include "exec/cpu-all.h"
> >@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, 
> >target_ulong addr,
> >   * A helper function for the _utterly broken_ virtio device model to 
> > find out if
> >   * it's running on a big endian machine. Don't do this at home kids!
> >   */
> >-bool virtio_is_big_endian(void);
> >  bool virtio_is_big_endian(void)
> >  {
> >  #if defined(TARGET_WORDS_BIGENDIAN)
> >diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> >index 1ba53d9..68c3064 100644
> >--- a/hw/virtio/Makefile.objs
> >+++ b/hw/virtio/Makefile.objs
> >@@ -4,5 +4

Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Alexander Graf


On 14.04.14 14:46, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:40:04PM +0200, Alexander Graf wrote:

On 14.04.14 14:37, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:

On 14.04.14 14:24, Michael S. Tsirkin wrote:


This will have to be measured and proved by whoever's proposing the
patch, not by reviewers.  Platforms such as AMD which don't do
prediction well would be especially interesting to test on.

Sure, Greg, can you do that? I'm sure Michael has test cases
available he can give you to measure performance on this.

Speaking of which, how does all of this work with vhost?


Alex

I think that's missing.
As a first step, we need to disable vhost when
host/guest endian-ness do not match.

We could also add cross-endian support to vhost.

Or just wait a couple more months for virtio 1.0 which is fixed
endian-ness.


That won't help for current ppc64le guests, so I suppose we will need 
cross-endian vhost.



Alex




Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Greg Kurz
On Mon, 14 Apr 2014 14:22:36 +0200
Alexander Graf  wrote:

> 
> On 14.04.14 13:58, Greg Kurz wrote:
> > From: Rusty Russell 
> >
> > virtio data structures are defined as "target endian", which assumes
> > that's a fixed value.  In fact, that actually means it's platform-specific.
> > The OASIS virtio 1.0 spec will fix this, by making it all little endian.
> >
> > We introduce memory accessors to be used accross the virtio code where
> > needed. These accessors should support both legacy and 1.0 devices.
> > A good way to do it is to introduce a per-device property to store the
> > endianness. We choose to set this flag at device reset time because it
> > is reasonnable to assume the endianness won't change unless we reboot or
> > kexec another kernel. And it is also reasonnable to assume the new kernel
> > will reset the devices before using them (otherwise it will break).
> >
> > We reuse the virtio_is_big_endian() helper since it provides the right
> > value for legacy devices with most of the targets, that have fixed
> > endianness. It can then be overriden to support endian-ambivalent targets.
> >
> > To support migration, we need to set the flag in virtio_load() as well.
> >
> > (a) One solution would be to add it to the stream, but it have some
> >  drawbacks:
> > - since this only affects a few targets, the field should be put into a
> >subsection
> > - virtio migration code should be ported to vmstate to be able to introduce
> >such a subsection
> >
> > (b) If we assume the following to be true:
> > - target endianness falls under some cpu state
> > - cpu state is always restored before virtio devices state because they
> >get initialized in this order in main().
> > Then an alternative is to rely on virtio_is_big_endian() again at
> > load time. No need to mess around with the migration stream in this case.
> >
> > This patch implements (b).
> >
> > Note that the tswap helpers are implemented in virtio.c so that
> > virtio-access.h stays platform independant. Most of the virtio code
> > will be buildable under common-obj instead of obj then, and spare
> > some cycles when building for multiple targets.
> >
> > Signed-off-by: Rusty Russell 
> > [ ldq_phys() API change,
> >relicensed virtio-access.h to GPLv2+ on Rusty's request,
> >introduce a per-device is_big_endian flag (supersedes needs_byteswap 
> > global)
> >add VirtIODevice * arg to virtio helpers,
> >use the existing virtio_is_big_endian() helper,
> >virtio-pci: use the device is_big_endian flag,
> >introduce virtio tswap16 and tswap64 helpers,
> >move calls to tswap* out of virtio-access.h to make it platform 
> > independant,
> >migration support,
> >Greg Kurz  ]
> > Cc: Cédric Le Goater 
> > Signed-off-by: Greg Kurz 
> > ---
> >
> > Changes since v6:
> > - merge the virtio_needs_byteswap() helper from v6 and existing
> >virtio_is_big_endian()
> > - virtio-pci: now supports endianness changes
> > - virtio-access.h fixes (target independant)
> >
> >   exec.c|2 -
> >   hw/virtio/Makefile.objs   |2 -
> >   hw/virtio/virtio-pci.c|   11 +--
> >   hw/virtio/virtio.c|   35 +
> >   include/hw/virtio/virtio-access.h |  138 
> > +
> >   include/hw/virtio/virtio.h|2 +
> >   vl.c  |4 +
> >   7 files changed, 185 insertions(+), 9 deletions(-)
> >   create mode 100644 include/hw/virtio/virtio-access.h
> >
> > diff --git a/exec.c b/exec.c
> > index 91513c6..e6777d0 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -42,6 +42,7 @@
> >   #else /* !CONFIG_USER_ONLY */
> >   #include "sysemu/xen-mapcache.h"
> >   #include "trace.h"
> > +#include "hw/virtio/virtio.h"
> >   #endif
> >   #include "exec/cpu-all.h"
> >   
> > @@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong 
> > addr,
> >* A helper function for the _utterly broken_ virtio device model to find 
> > out if
> >* it's running on a big endian machine. Don't do this at home kids!
> >*/
> > -bool virtio_is_big_endian(void);
> >   bool virtio_is_big_endian(void)
> >   {
> >   #if defined(TARGET_WORDS_BIGENDIAN)
> > diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> > index 1ba53d9..68c3064 100644
> > --- a/hw/virtio/Makefile.objs
> > +++ b/hw/virtio/Makefile.objs
> > @@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
> >   common-obj-y += virtio-mmio.o
> >   common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
> >   
> > -obj-y += virtio.o virtio-balloon.o
> > +obj-y += virtio.o virtio-balloon.o
> >   obj-$(CONFIG_LINUX) += vhost.o
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index ce97514..82a1689 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -89,9 +89,6 @@
> >   /* Flags track per-device state like workarounds for quirks in older 
> > guests. */
> >   #define VIRTIO_PCI_FLAG_BUS_MASTER_

Re: [Qemu-devel] [PATCH for 2.0] ide: Correct improper smart self test counter reset in ide core.

2014-04-14 Thread Peter Maydell
On 14 April 2014 13:31, Kevin Wolf  wrote:
> Am 14.04.2014 um 14:10 hat Markus Armbruster geschrieben:
>> Markus Armbruster  writes:
>>
>> > Benoît Canet  writes:
>> >
>> >> The counter being reseted to zero make the array index negative.
>> >> Reset it to 1.
>> >>
>> >> Signed-off-by: Benoit Canet 
>> >> ---
>> >>  hw/ide/core.c |2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/hw/ide/core.c b/hw/ide/core.c
>> >> index e1dfe54..c943a4d 100644
>> >> --- a/hw/ide/core.c
>> >> +++ b/hw/ide/core.c
>> >> @@ -1602,7 +1602,7 @@ static bool cmd_smart(IDEState *s, uint8_t cmd)
>> >>  case 2: /* extended self test */
>> >>  s->smart_selftest_count++;
>> >>  if (s->smart_selftest_count > 21) {
>> >> -s->smart_selftest_count = 0;
>> >> +s->smart_selftest_count = 1;
>> >>  }
>> >>  n = 2 + (s->smart_selftest_count - 1) * 24;
>> >>  s->smart_selftest_data[n] = s->sector;
>> >
>> > Good catch.
>> >
>> > Commit message could use some love, though.  On every 21st SMART EXECUTE
>> > OFFLINE:
>> >
>> > * We write before a dynamically allocated buffer
>> >
>> >   Your diff's context has one of the writes.
>> >
>> > * We forget SMART history
>> >
>> >   See the s->smart_selftest_count == 0 special cases in SMART READ DATA
>> >   and SMART READ LOG.
>>
>> Peter offered to improve the commit message on merge.  Even without
>> that, it would be better to get this into 2.0 with a sub-optimal commit
>> message than not to get it, so:
>>
>> Reviewed-by: Markus Armbruster 
>
> Should also be fixed in the stable branch of earlier releases. The bug
> is present since SMART emulation was added in 2009.
>
> Cc: qemu-sta...@nongnu.org
> Acked-by: Kevin Wolf 

Applied to master with a tweaked commit message and the various
r-b/cc/etc tags.

thanks
-- PMM



Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Alexander Graf


On 14.04.14 14:50, Greg Kurz wrote:

On Mon, 14 Apr 2014 14:22:36 +0200
Alexander Graf  wrote:


On 14.04.14 13:58, Greg Kurz wrote:

From: Rusty Russell 

virtio data structures are defined as "target endian", which assumes
that's a fixed value.  In fact, that actually means it's platform-specific.
The OASIS virtio 1.0 spec will fix this, by making it all little endian.

We introduce memory accessors to be used accross the virtio code where
needed. These accessors should support both legacy and 1.0 devices.
A good way to do it is to introduce a per-device property to store the
endianness. We choose to set this flag at device reset time because it
is reasonnable to assume the endianness won't change unless we reboot or
kexec another kernel. And it is also reasonnable to assume the new kernel
will reset the devices before using them (otherwise it will break).

We reuse the virtio_is_big_endian() helper since it provides the right
value for legacy devices with most of the targets, that have fixed
endianness. It can then be overriden to support endian-ambivalent targets.

To support migration, we need to set the flag in virtio_load() as well.

(a) One solution would be to add it to the stream, but it have some
  drawbacks:
- since this only affects a few targets, the field should be put into a
subsection
- virtio migration code should be ported to vmstate to be able to introduce
such a subsection

(b) If we assume the following to be true:
- target endianness falls under some cpu state
- cpu state is always restored before virtio devices state because they
get initialized in this order in main().
Then an alternative is to rely on virtio_is_big_endian() again at
load time. No need to mess around with the migration stream in this case.

This patch implements (b).

Note that the tswap helpers are implemented in virtio.c so that
virtio-access.h stays platform independant. Most of the virtio code
will be buildable under common-obj instead of obj then, and spare
some cycles when building for multiple targets.

Signed-off-by: Rusty Russell 
[ ldq_phys() API change,
relicensed virtio-access.h to GPLv2+ on Rusty's request,
introduce a per-device is_big_endian flag (supersedes needs_byteswap global)
add VirtIODevice * arg to virtio helpers,
use the existing virtio_is_big_endian() helper,
virtio-pci: use the device is_big_endian flag,
introduce virtio tswap16 and tswap64 helpers,
move calls to tswap* out of virtio-access.h to make it platform independant,
migration support,
Greg Kurz  ]
Cc: Cédric Le Goater 
Signed-off-by: Greg Kurz 
---


[...]


+
+uint64_t virtio_tswap64(VirtIODevice *vdev, uint64_t s);
+
+static inline void virtio_tswap64s(VirtIODevice *vdev, uint64_t *s)
+{
+*s = virtio_tswap64(vdev, *s);
+}

Could we try to poison any non-virtio, non-endian-specific memory
accessors when this file is included? That way we don't fall into
pitfalls where we end up running stl_p in a tiny corner case somewhere
still.


Alex


Hmmm... there are still some stl_p users in virtio.c that I could not
port to the new virtio memory accessors... These are the virtio_config_*
helpers that gets called from virtio-pci and virtio-mmio.


Why couldn't you port them to the new virtio memory accessors? Are you 
missing a vdev parameter? Could you add one?


IIRC config space is a weird mix of target endianness and little endian.


Alex




[Qemu-devel] [PATCH v6] spapr: Add support for time base offset migration

2014-04-14 Thread Alexey Kardashevskiy
This allows guests to have a different timebase origin from the host.

This is needed for migration, where a guest can migrate from one host
to another and the two hosts might have a different timebase origin.
However, the timebase seen by the guest must not go backwards, and
should go forwards only by a small amount corresponding to the time
taken for the migration.

This is only supported for recent POWER hardware which has the TBU40
(timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
970.

This adds kvm_access_one_reg() to access a special register which is not
in env->spr. This requires kvm_set_one_reg/kvm_get_one_reg patch.

The feature must be present in the host kernel.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v6:
* time_of_the_day is now time_of_the_day_ns and measured in nm instead of us
* VMSTATE_PPC_TIMEBASE_V supports versions now

v5:
* fixed multiple comments in cpu_ppc_get_adjusted_tb and merged it
into timebase_post_load()
* removed round_up(1<<24) as KVM is expected to do this anyway
* removed @freq from migration stream
* renamed PPCTimebaseOffset to PPCTimebase
* CLOCKS_PER_SEC is used as a constant which 100us/s (man clock)

v4:
* made it per machine timebase offser rather than per CPU

v3:
* kvm_access_one_reg moved out to a separate patch
* tb_offset and host_timebase were replaced with guest_timebase as
the destionation does not really care of offset on the source

v2:
* bumped the vmstate_ppc_cpu version
* defined version for the env.tb_env field
---
 hw/ppc/ppc.c   | 78 ++
 hw/ppc/spapr.c |  4 +--
 include/hw/ppc/spapr.h |  1 +
 target-ppc/cpu-qom.h   | 16 +++
 target-ppc/kvm.c   |  5 
 trace-events   |  3 ++
 6 files changed, 105 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index 71df471..3be4d8c 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -29,9 +29,11 @@
 #include "sysemu/cpus.h"
 #include "hw/timer/m48t59.h"
 #include "qemu/log.h"
+#include "qemu/error-report.h"
 #include "hw/loader.h"
 #include "sysemu/kvm.h"
 #include "kvm_ppc.h"
+#include "trace.h"
 
 //#define PPC_DEBUG_IRQ
 //#define PPC_DEBUG_TB
@@ -49,6 +51,8 @@
 #  define LOG_TB(...) do { } while (0)
 #endif
 
+#define NSEC_PER_SEC10LL
+
 static void cpu_ppc_tb_stop (CPUPPCState *env);
 static void cpu_ppc_tb_start (CPUPPCState *env);
 
@@ -829,6 +833,80 @@ static void cpu_ppc_set_tb_clk (void *opaque, uint32_t 
freq)
 cpu_ppc_store_purr(cpu, 0xULL);
 }
 
+static void timebase_pre_save(void *opaque)
+{
+PPCTimebase *tb = opaque;
+uint64_t ticks = cpu_get_real_ticks();
+PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+
+if (!first_ppc_cpu->env.tb_env) {
+error_report("No timebase object");
+return;
+}
+
+tb->time_of_the_day_ns = get_clock_realtime();
+/*
+ * tb_offset is only expected to be changed by migration so
+ * there is no need to update it from KVM here
+ */
+tb->guest_timebase = ticks + first_ppc_cpu->env.tb_env->tb_offset;
+}
+
+static int timebase_post_load(void *opaque, int version_id)
+{
+PPCTimebase *tb = opaque;
+CPUState *cpu;
+PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
+int64_t tb_off_adj, tb_off;
+int64_t migration_duration_ns, migration_duration_tb, guest_tb, host_ns;
+unsigned long freq;
+
+if (!first_ppc_cpu->env.tb_env) {
+error_report("No timebase object");
+return -1;
+}
+
+freq = first_ppc_cpu->env.tb_env->tb_freq;
+/*
+ * Calculate timebase on the destination side of migration.
+ * The destination timebase must be not less than the source timebase.
+ * We try to adjust timebase by downtime if host clocks are not
+ * too much out of sync (1 second for now).
+ */
+host_ns = get_clock_realtime();
+migration_duration_ns = MIN(NSEC_PER_SEC, host_ns - 
tb->time_of_the_day_ns);
+migration_duration_tb = muldiv64(migration_duration_ns, freq, 
NSEC_PER_SEC);
+guest_tb = tb->guest_timebase + MIN(0, migration_duration_tb);
+
+tb_off_adj = guest_tb - cpu_get_real_ticks();
+
+tb_off = first_ppc_cpu->env.tb_env->tb_offset;
+trace_ppc_tb_adjust(tb_off, tb_off_adj, tb_off_adj - tb_off,
+(tb_off_adj - tb_off) / freq);
+
+/* Set new offset to all CPUs */
+CPU_FOREACH(cpu) {
+PowerPCCPU *pcpu = POWERPC_CPU(cpu);
+pcpu->env.tb_env->tb_offset = tb_off_adj;
+}
+
+return 0;
+}
+
+const VMStateDescription vmstate_ppc_timebase = {
+.name = "timebase",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.pre_save = timebase_pre_save,
+.post_load = timebase_post_load,
+.fields  = (VMStateField []) {
+VMSTATE_UINT64(guest_timebase, PPCTimebase),
+VMSTATE_UINT64(time_of_the_day_ns, PPCTimebase),
+VMSTATE_END_OF_LIST()
+},
+};
+
 /* Set 

Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Michael S. Tsirkin
On Mon, Apr 14, 2014 at 02:40:04PM +0200, Alexander Graf wrote:
> 
> On 14.04.14 14:37, Michael S. Tsirkin wrote:
> >On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:
> >>On 14.04.14 14:24, Michael S. Tsirkin wrote:
> >>>On Mon, Apr 14, 2014 at 02:16:03PM +0200, Alexander Graf wrote:
> On 14.04.14 13:58, Greg Kurz wrote:
> >From: Rusty Russell 
> >
> >virtio data structures are defined as "target endian", which assumes
> >that's a fixed value.  In fact, that actually means it's 
> >platform-specific.
> >The OASIS virtio 1.0 spec will fix this, by making it all little endian.
> >
> >We introduce memory accessors to be used accross the virtio code where
> >needed. These accessors should support both legacy and 1.0 devices.
> >A good way to do it is to introduce a per-device property to store the
> >endianness. We choose to set this flag at device reset time because it
> >is reasonnable to assume the endianness won't change unless we reboot or
> >kexec another kernel. And it is also reasonnable to assume the new kernel
> >will reset the devices before using them (otherwise it will break).
> >
> >We reuse the virtio_is_big_endian() helper since it provides the right
> >value for legacy devices with most of the targets, that have fixed
> >endianness. It can then be overriden to support endian-ambivalent 
> >targets.
> >
> >To support migration, we need to set the flag in virtio_load() as well.
> >
> >(a) One solution would be to add it to the stream, but it have some
> > drawbacks:
> >- since this only affects a few targets, the field should be put into a
> >   subsection
> >- virtio migration code should be ported to vmstate to be able to 
> >introduce
> >   such a subsection
> >
> >(b) If we assume the following to be true:
> >- target endianness falls under some cpu state
> >- cpu state is always restored before virtio devices state because they
> >   get initialized in this order in main().
> >Then an alternative is to rely on virtio_is_big_endian() again at
> >load time. No need to mess around with the migration stream in this case.
> >
> >This patch implements (b).
> >
> >Note that the tswap helpers are implemented in virtio.c so that
> >virtio-access.h stays platform independant. Most of the virtio code
> >will be buildable under common-obj instead of obj then, and spare
> >some cycles when building for multiple targets.
> >
> >Signed-off-by: Rusty Russell 
> >[ ldq_phys() API change,
> >   relicensed virtio-access.h to GPLv2+ on Rusty's request,
> >   introduce a per-device is_big_endian flag (supersedes needs_byteswap 
> > global)
> >   add VirtIODevice * arg to virtio helpers,
> >   use the existing virtio_is_big_endian() helper,
> >   virtio-pci: use the device is_big_endian flag,
> >   introduce virtio tswap16 and tswap64 helpers,
> >   move calls to tswap* out of virtio-access.h to make it platform 
> > independant,
> >   migration support,
> >   Greg Kurz  ]
> >Cc: Cédric Le Goater 
> >Signed-off-by: Greg Kurz 
> >---
> >
> >Changes since v6:
> >- merge the virtio_needs_byteswap() helper from v6 and existing
> >   virtio_is_big_endian()
> >- virtio-pci: now supports endianness changes
> >- virtio-access.h fixes (target independant)
> >
> >  exec.c|2 -
> >  hw/virtio/Makefile.objs   |2 -
> >  hw/virtio/virtio-pci.c|   11 +--
> >  hw/virtio/virtio.c|   35 +
> >  include/hw/virtio/virtio-access.h |  138 
> > +
> >  include/hw/virtio/virtio.h|2 +
> >  vl.c  |4 +
> >  7 files changed, 185 insertions(+), 9 deletions(-)
> >  create mode 100644 include/hw/virtio/virtio-access.h
> >
> >diff --git a/exec.c b/exec.c
> >index 91513c6..e6777d0 100644
> >--- a/exec.c
> >+++ b/exec.c
> >@@ -42,6 +42,7 @@
> >  #else /* !CONFIG_USER_ONLY */
> >  #include "sysemu/xen-mapcache.h"
> >  #include "trace.h"
> >+#include "hw/virtio/virtio.h"
> >  #endif
> >  #include "exec/cpu-all.h"
> >@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, 
> >target_ulong addr,
> >   * A helper function for the _utterly broken_ virtio device model to 
> > find out if
> >   * it's running on a big endian machine. Don't do this at home kids!
> >   */
> >-bool virtio_is_big_endian(void);
> >  bool virtio_is_big_endian(void)
> >  {
> >  #if defined(TARGET_WORDS_BIGENDIAN)
> >diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> >index 1ba53d9..68c3064 100644
> >--- a/hw/virtio/Makefile.objs
> >+++ b/hw/virtio/Makefile.objs
> >@@ -4,5 +4

Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Michael S. Tsirkin
On Mon, Apr 14, 2014 at 02:49:31PM +0200, Alexander Graf wrote:
> 
> On 14.04.14 14:46, Michael S. Tsirkin wrote:
> >On Mon, Apr 14, 2014 at 02:40:04PM +0200, Alexander Graf wrote:
> >>On 14.04.14 14:37, Michael S. Tsirkin wrote:
> >>>On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:
> On 14.04.14 14:24, Michael S. Tsirkin wrote:
> 
> >>>This will have to be measured and proved by whoever's proposing the
> >>>patch, not by reviewers.  Platforms such as AMD which don't do
> >>>prediction well would be especially interesting to test on.
> >>Sure, Greg, can you do that? I'm sure Michael has test cases
> >>available he can give you to measure performance on this.
> >>
> >>Speaking of which, how does all of this work with vhost?
> >>
> >>
> >>Alex
> >I think that's missing.
> >As a first step, we need to disable vhost when
> >host/guest endian-ness do not match.
> >
> >We could also add cross-endian support to vhost.
> >
> >Or just wait a couple more months for virtio 1.0 which is fixed
> >endian-ness.
> 
> That won't help for current ppc64le guests, so I suppose we will
> need cross-endian vhost.
> 
> 
> Alex

It's kernel level work anyway.
Just backport a new driver. Seems less work for more benefit.

-- 
MST



Re: [Qemu-devel] [PATCH] spapr_pci: Fix number of returned vectors in ibm, change-msi

2014-04-14 Thread Alexey Kardashevskiy
On 04/07/2014 10:53 PM, Alexey Kardashevskiy wrote:
> Current guest kernels try allocating as many vectors as the quota is.
> For example, in the case of virtio-net (which has just 3 vectors)
> the guest requests 4 vectors (that is the quota in the test) and
> the existing ibm,change-msi handler returns 4. But before it returns,
> it calls msix_set_message() in a loop and corrupts memory behind
> the end of msix_table.
> 
> This limits the number of vectors returned by ibm,change-msi to
> the maximum supported by the actual device.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> 
> 
> This might go to 2.0 actually.


Ping? Ok, not 2.0 but ppc-next-2.1 or ppc-next? :)


> 
> ---
>  hw/ppc/spapr_pci.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index cbef095..cdfa369 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -343,6 +343,9 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
> sPAPREnvironment *spapr,
>  
>  /* There is no cached config, allocate MSIs */
>  if (!phb->msi_table[ndev].nvec) {
> +if (req_num > pdev->msix_entries_nr) {
> +req_num = pdev->msix_entries_nr;
> +}
>  irq = spapr_allocate_irq_block(req_num, false,
> ret_intr_type == RTAS_TYPE_MSI);
>  if (irq < 0) {
> 


-- 
Alexey



Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Cedric Le Goater
On 04/14/2014 02:49 PM, Alexander Graf wrote:
> 
> On 14.04.14 14:46, Michael S. Tsirkin wrote:
>> On Mon, Apr 14, 2014 at 02:40:04PM +0200, Alexander Graf wrote:
>>> On 14.04.14 14:37, Michael S. Tsirkin wrote:
 On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:
> On 14.04.14 14:24, Michael S. Tsirkin wrote:
>
 This will have to be measured and proved by whoever's proposing the
 patch, not by reviewers.  Platforms such as AMD which don't do
 prediction well would be especially interesting to test on.
>>> Sure, Greg, can you do that? I'm sure Michael has test cases
>>> available he can give you to measure performance on this.
>>>
>>> Speaking of which, how does all of this work with vhost?
>>>
>>>
>>> Alex
>> I think that's missing.
>> As a first step, we need to disable vhost when
>> host/guest endian-ness do not match.
>>
>> We could also add cross-endian support to vhost.
>>
>> Or just wait a couple more months for virtio 1.0 which is fixed
>> endian-ness.
> 
> That won't help for current ppc64le guests, so I suppose we will 
> need cross-endian vhost.

Yes. For the moment, vhost=off is needed and even with that, qemu needs 
to byteswap a few attributes (see virtio-net header in patch 3). I am 
looking for a simple way to propagate the vring endianness to the host 
without breaking the compatibility with the current virtio drivers. 
Hopefully this is feasible. 


C.







Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Alexander Graf


On 14.04.14 14:56, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:49:31PM +0200, Alexander Graf wrote:

On 14.04.14 14:46, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:40:04PM +0200, Alexander Graf wrote:

On 14.04.14 14:37, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:

On 14.04.14 14:24, Michael S. Tsirkin wrote:


This will have to be measured and proved by whoever's proposing the
patch, not by reviewers.  Platforms such as AMD which don't do
prediction well would be especially interesting to test on.

Sure, Greg, can you do that? I'm sure Michael has test cases
available he can give you to measure performance on this.

Speaking of which, how does all of this work with vhost?


Alex

I think that's missing.
As a first step, we need to disable vhost when
host/guest endian-ness do not match.

We could also add cross-endian support to vhost.

Or just wait a couple more months for virtio 1.0 which is fixed
endian-ness.

That won't help for current ppc64le guests, so I suppose we will
need cross-endian vhost.


It's quite hard to backport a driver onto installation media :). We 
already have openSUSE isos out there that support virtio and ppc64le.



Alex




Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Greg Kurz
On Mon, 14 Apr 2014 15:46:34 +0300
"Michael S. Tsirkin"  wrote:

> On Mon, Apr 14, 2014 at 02:40:04PM +0200, Alexander Graf wrote:
> > 
> > On 14.04.14 14:37, Michael S. Tsirkin wrote:
> > >On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:
> > >>On 14.04.14 14:24, Michael S. Tsirkin wrote:
> > >>>On Mon, Apr 14, 2014 at 02:16:03PM +0200, Alexander Graf wrote:
> > On 14.04.14 13:58, Greg Kurz wrote:
> > >From: Rusty Russell 
> > >
> > >virtio data structures are defined as "target endian", which assumes
> > >that's a fixed value.  In fact, that actually means it's 
> > >platform-specific.
> > >The OASIS virtio 1.0 spec will fix this, by making it all little 
> > >endian.
> > >
> > >We introduce memory accessors to be used accross the virtio code where
> > >needed. These accessors should support both legacy and 1.0 devices.
> > >A good way to do it is to introduce a per-device property to store the
> > >endianness. We choose to set this flag at device reset time because it
> > >is reasonnable to assume the endianness won't change unless we reboot 
> > >or
> > >kexec another kernel. And it is also reasonnable to assume the new 
> > >kernel
> > >will reset the devices before using them (otherwise it will break).
> > >
> > >We reuse the virtio_is_big_endian() helper since it provides the right
> > >value for legacy devices with most of the targets, that have fixed
> > >endianness. It can then be overriden to support endian-ambivalent 
> > >targets.
> > >
> > >To support migration, we need to set the flag in virtio_load() as well.
> > >
> > >(a) One solution would be to add it to the stream, but it have some
> > > drawbacks:
> > >- since this only affects a few targets, the field should be put into a
> > >   subsection
> > >- virtio migration code should be ported to vmstate to be able to 
> > >introduce
> > >   such a subsection
> > >
> > >(b) If we assume the following to be true:
> > >- target endianness falls under some cpu state
> > >- cpu state is always restored before virtio devices state because they
> > >   get initialized in this order in main().
> > >Then an alternative is to rely on virtio_is_big_endian() again at
> > >load time. No need to mess around with the migration stream in this 
> > >case.
> > >
> > >This patch implements (b).
> > >
> > >Note that the tswap helpers are implemented in virtio.c so that
> > >virtio-access.h stays platform independant. Most of the virtio code
> > >will be buildable under common-obj instead of obj then, and spare
> > >some cycles when building for multiple targets.
> > >
> > >Signed-off-by: Rusty Russell 
> > >[ ldq_phys() API change,
> > >   relicensed virtio-access.h to GPLv2+ on Rusty's request,
> > >   introduce a per-device is_big_endian flag (supersedes 
> > > needs_byteswap global)
> > >   add VirtIODevice * arg to virtio helpers,
> > >   use the existing virtio_is_big_endian() helper,
> > >   virtio-pci: use the device is_big_endian flag,
> > >   introduce virtio tswap16 and tswap64 helpers,
> > >   move calls to tswap* out of virtio-access.h to make it platform 
> > > independant,
> > >   migration support,
> > >   Greg Kurz  ]
> > >Cc: Cédric Le Goater 
> > >Signed-off-by: Greg Kurz 
> > >---
> > >
> > >Changes since v6:
> > >- merge the virtio_needs_byteswap() helper from v6 and existing
> > >   virtio_is_big_endian()
> > >- virtio-pci: now supports endianness changes
> > >- virtio-access.h fixes (target independant)
> > >
> > >  exec.c|2 -
> > >  hw/virtio/Makefile.objs   |2 -
> > >  hw/virtio/virtio-pci.c|   11 +--
> > >  hw/virtio/virtio.c|   35 +
> > >  include/hw/virtio/virtio-access.h |  138 
> > > +
> > >  include/hw/virtio/virtio.h|2 +
> > >  vl.c  |4 +
> > >  7 files changed, 185 insertions(+), 9 deletions(-)
> > >  create mode 100644 include/hw/virtio/virtio-access.h
> > >
> > >diff --git a/exec.c b/exec.c
> > >index 91513c6..e6777d0 100644
> > >--- a/exec.c
> > >+++ b/exec.c
> > >@@ -42,6 +42,7 @@
> > >  #else /* !CONFIG_USER_ONLY */
> > >  #include "sysemu/xen-mapcache.h"
> > >  #include "trace.h"
> > >+#include "hw/virtio/virtio.h"
> > >  #endif
> > >  #include "exec/cpu-all.h"
> > >@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, 
> > >target_ulong addr,
> > >   * A helper function for the _utterly broken_ virtio device model to 
> > > find out if
> > >   * it's running on a big endian machine. Don't do this at home kids!
> > >   */
> > >-bool virtio_is_bi

Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Alexander Graf


On 14.04.14 14:55, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:40:04PM +0200, Alexander Graf wrote:

On 14.04.14 14:37, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:

On 14.04.14 14:24, Michael S. Tsirkin wrote:

On Mon, Apr 14, 2014 at 02:16:03PM +0200, Alexander Graf wrote:

On 14.04.14 13:58, Greg Kurz wrote:

From: Rusty Russell 

virtio data structures are defined as "target endian", which assumes
that's a fixed value.  In fact, that actually means it's platform-specific.
The OASIS virtio 1.0 spec will fix this, by making it all little endian.

We introduce memory accessors to be used accross the virtio code where
needed. These accessors should support both legacy and 1.0 devices.
A good way to do it is to introduce a per-device property to store the
endianness. We choose to set this flag at device reset time because it
is reasonnable to assume the endianness won't change unless we reboot or
kexec another kernel. And it is also reasonnable to assume the new kernel
will reset the devices before using them (otherwise it will break).

We reuse the virtio_is_big_endian() helper since it provides the right
value for legacy devices with most of the targets, that have fixed
endianness. It can then be overriden to support endian-ambivalent targets.

To support migration, we need to set the flag in virtio_load() as well.

(a) One solution would be to add it to the stream, but it have some
 drawbacks:
- since this only affects a few targets, the field should be put into a
   subsection
- virtio migration code should be ported to vmstate to be able to introduce
   such a subsection

(b) If we assume the following to be true:
- target endianness falls under some cpu state
- cpu state is always restored before virtio devices state because they
   get initialized in this order in main().
Then an alternative is to rely on virtio_is_big_endian() again at
load time. No need to mess around with the migration stream in this case.

This patch implements (b).

Note that the tswap helpers are implemented in virtio.c so that
virtio-access.h stays platform independant. Most of the virtio code
will be buildable under common-obj instead of obj then, and spare
some cycles when building for multiple targets.

Signed-off-by: Rusty Russell 
[ ldq_phys() API change,
   relicensed virtio-access.h to GPLv2+ on Rusty's request,
   introduce a per-device is_big_endian flag (supersedes needs_byteswap global)
   add VirtIODevice * arg to virtio helpers,
   use the existing virtio_is_big_endian() helper,
   virtio-pci: use the device is_big_endian flag,
   introduce virtio tswap16 and tswap64 helpers,
   move calls to tswap* out of virtio-access.h to make it platform independant,
   migration support,
   Greg Kurz  ]
Cc: Cédric Le Goater 
Signed-off-by: Greg Kurz 
---

Changes since v6:
- merge the virtio_needs_byteswap() helper from v6 and existing
   virtio_is_big_endian()
- virtio-pci: now supports endianness changes
- virtio-access.h fixes (target independant)

  exec.c|2 -
  hw/virtio/Makefile.objs   |2 -
  hw/virtio/virtio-pci.c|   11 +--
  hw/virtio/virtio.c|   35 +
  include/hw/virtio/virtio-access.h |  138 +
  include/hw/virtio/virtio.h|2 +
  vl.c  |4 +
  7 files changed, 185 insertions(+), 9 deletions(-)
  create mode 100644 include/hw/virtio/virtio-access.h

diff --git a/exec.c b/exec.c
index 91513c6..e6777d0 100644
--- a/exec.c
+++ b/exec.c
@@ -42,6 +42,7 @@
  #else /* !CONFIG_USER_ONLY */
  #include "sysemu/xen-mapcache.h"
  #include "trace.h"
+#include "hw/virtio/virtio.h"
  #endif
  #include "exec/cpu-all.h"
@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
   * A helper function for the _utterly broken_ virtio device model to find out 
if
   * it's running on a big endian machine. Don't do this at home kids!
   */
-bool virtio_is_big_endian(void);
  bool virtio_is_big_endian(void)
  {
  #if defined(TARGET_WORDS_BIGENDIAN)
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 1ba53d9..68c3064 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o
  common-obj-y += virtio-mmio.o
  common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
-obj-y += virtio.o virtio-balloon.o
+obj-y += virtio.o virtio-balloon.o
  obj-$(CONFIG_LINUX) += vhost.o
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ce97514..82a1689 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -89,9 +89,6 @@
  /* Flags track per-device state like workarounds for quirks in older guests. 
*/
  #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
-/* HACK for virtio to determine if it's running a big endian guest */
-bool virtio_is_big_endian(void);
-
  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size

Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH] smp: autodetect numbers of threads per core

2014-04-14 Thread Alexey Kardashevskiy
On 04/14/2014 09:15 PM, Alexander Graf wrote:
> 
> On 15.11.13 17:58, Alexey Kardashevskiy wrote:
>> On 16.11.2013 0:15, Alexander Graf wrote:
>>>
>>> Am 15.11.2013 um 00:12 schrieb Alexey Kardashevskiy :
>>>
 At the moment only a whole CPU core can be assigned to a KVM. Since
 POWER7/8 support several threads per core, we want all threads of a core
 to go to the same KVM so every time we run QEMU with -enable-kvm on
 POWER, we have to add -smp X,threads=(4|8)" (4 for POWER7 and
 8 for POWER8).

 This patch tries to read smp_threads number from an accelerator and
 falls back to the default value (1) if the accelerator did not care
 to change it.

 Signed-off-by: Alexey Kardashevskiy 
 ---

 (!!!)

 The usual question - what would be the normal way of doing this?
 What does this patch break? I cannot think of anything right now.
>>> Is this really what the user wants? On p7 you can run in no-smt, smt2
>>> and smt4 mode. Today we simply default to no-smt. Changing defaults
>>> is usually a bad thing.
>>
>> Defaulting to 1 thread on P7 is a bad thing (other threads stay unused -
>> what is good about this?) and the only reason which I know why it is
>> still threads=1 is that it is hard to make a patch for upstream to
>> change this default.
> 
> threads=1 improves single-thread performance significantly. The thread
> itself is faster when it runs in SMT1 mode. Also we don't have to kick
> other threads out of the guest context, making every guest/host transition
> faster.
>
> Overall, it's really just a random default. I'm not sure it makes a lot of
> sense to change it.
> 
> However, could we be really smart here? How does ppc64_cpu --smt=off work?
> It only turns off the unused vcpus, right? Is there any way we could
> actually not even enter unused vcpus at all? Then we could indeed always
> expose the maximum available number of threads to the guest and let that
> one decide what to do.

Now sure I understood the idea.

If the user runs QEMU with the maximum number of threads and then stops the
threads other than the very first in the guest, the effect will be the same
as with threads=1 - POWER7/8 can detect stalled CPU and assign the entire
internal bus thingy to one thread.

On the other hand after all these lessons I received last month about QEMU
parameters and everything, I do not feel I want this feature anymore -
looks like we have to let libvirt deal with it :)


-- 
Alexey



Re: [Qemu-devel] [PATCH] spapr_pci: Fix number of returned vectors in ibm, change-msi

2014-04-14 Thread Alexander Graf


On 14.04.14 14:55, Alexey Kardashevskiy wrote:

On 04/07/2014 10:53 PM, Alexey Kardashevskiy wrote:

Current guest kernels try allocating as many vectors as the quota is.
For example, in the case of virtio-net (which has just 3 vectors)
the guest requests 4 vectors (that is the quota in the test) and
the existing ibm,change-msi handler returns 4. But before it returns,
it calls msix_set_message() in a loop and corrupts memory behind
the end of msix_table.

This limits the number of vectors returned by ibm,change-msi to
the maximum supported by the actual device.

Signed-off-by: Alexey Kardashevskiy 
---


This might go to 2.0 actually.


Ping? Ok, not 2.0 but ppc-next-2.1 or ppc-next? :)


Thanks, applied to ppc-next and tagged as stable.


Alex




Re: [Qemu-devel] Should we have a 2.0-rc3 ?

2014-04-14 Thread Peter Maydell
On 11 April 2014 18:37, Peter Maydell  wrote:
> Status update:
> Applied:
>  * ACPI fixes (both sets)
>  * block queue
>  * SDL2 relative mode fixes
>  * fix for virtio-net CVE
>  * fix for qom-list crash
>  * my patch to stack-protector check
> Patches on list but need review/ack and/or not sure whether to apply:
>  * kvm_physical_sync_dirty_bitmap bug

I propose to apply the "revert" patch MJT posted.

>  * my fix to my stack-protector check patch (oops)

Applied.

>  * vmxnet3 patches

Applied.

> Raised as issues but no patches:
>  * PCI bus naming

Nak from Alex, will not attempt to change for 2.0.

>  * win64 virtio-scsi regression

In the absence of any patches here I think we should deem
win64 virtio-scsi sufficiently minor that it will get
fixed in 2.0.1.

Other:
 * applied patch for a buffer overrun fix in IDE
 * currently processing MST's ACPI pull

Summary: I'm going to apply MST's ACPI pull and the revert
patch for the sync_dirty_bitmap issue, and then tag rc3
today.

thanks
-- PMM



Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH] smp: autodetect numbers of threads per core

2014-04-14 Thread Alexander Graf


On 14.04.14 15:10, Alexey Kardashevskiy wrote:

On 04/14/2014 09:15 PM, Alexander Graf wrote:

On 15.11.13 17:58, Alexey Kardashevskiy wrote:

On 16.11.2013 0:15, Alexander Graf wrote:

Am 15.11.2013 um 00:12 schrieb Alexey Kardashevskiy :


At the moment only a whole CPU core can be assigned to a KVM. Since
POWER7/8 support several threads per core, we want all threads of a core
to go to the same KVM so every time we run QEMU with -enable-kvm on
POWER, we have to add -smp X,threads=(4|8)" (4 for POWER7 and
8 for POWER8).

This patch tries to read smp_threads number from an accelerator and
falls back to the default value (1) if the accelerator did not care
to change it.

Signed-off-by: Alexey Kardashevskiy 
---

(!!!)

The usual question - what would be the normal way of doing this?
What does this patch break? I cannot think of anything right now.

Is this really what the user wants? On p7 you can run in no-smt, smt2
and smt4 mode. Today we simply default to no-smt. Changing defaults
is usually a bad thing.

Defaulting to 1 thread on P7 is a bad thing (other threads stay unused -
what is good about this?) and the only reason which I know why it is
still threads=1 is that it is hard to make a patch for upstream to
change this default.

threads=1 improves single-thread performance significantly. The thread
itself is faster when it runs in SMT1 mode. Also we don't have to kick
other threads out of the guest context, making every guest/host transition
faster.

Overall, it's really just a random default. I'm not sure it makes a lot of
sense to change it.

However, could we be really smart here? How does ppc64_cpu --smt=off work?
It only turns off the unused vcpus, right? Is there any way we could
actually not even enter unused vcpus at all? Then we could indeed always
expose the maximum available number of threads to the guest and let that
one decide what to do.

Now sure I understood the idea.

If the user runs QEMU with the maximum number of threads and then stops the
threads other than the very first in the guest, the effect will be the same
as with threads=1 - POWER7/8 can detect stalled CPU and assign the entire
internal bus thingy to one thread.


We would still have to enter all vcpus when we go into the guest, no? 
And the reverse as well - get out of all vcpus when we go into host 
virtual mode.



On the other hand after all these lessons I received last month about QEMU
parameters and everything, I do not feel I want this feature anymore -
looks like we have to let libvirt deal with it :)

;)


Alex




Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Michael S. Tsirkin
On Mon, Apr 14, 2014 at 03:08:23PM +0200, Alexander Graf wrote:
> 
> On 14.04.14 14:55, Michael S. Tsirkin wrote:
> >On Mon, Apr 14, 2014 at 02:40:04PM +0200, Alexander Graf wrote:
> >>On 14.04.14 14:37, Michael S. Tsirkin wrote:
> >>>On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:
> On 14.04.14 14:24, Michael S. Tsirkin wrote:
> >On Mon, Apr 14, 2014 at 02:16:03PM +0200, Alexander Graf wrote:
> >>On 14.04.14 13:58, Greg Kurz wrote:
> >>>From: Rusty Russell 
> >>>
> >>>virtio data structures are defined as "target endian", which assumes
> >>>that's a fixed value.  In fact, that actually means it's 
> >>>platform-specific.
> >>>The OASIS virtio 1.0 spec will fix this, by making it all little 
> >>>endian.
> >>>
> >>>We introduce memory accessors to be used accross the virtio code where
> >>>needed. These accessors should support both legacy and 1.0 devices.
> >>>A good way to do it is to introduce a per-device property to store the
> >>>endianness. We choose to set this flag at device reset time because it
> >>>is reasonnable to assume the endianness won't change unless we reboot 
> >>>or
> >>>kexec another kernel. And it is also reasonnable to assume the new 
> >>>kernel
> >>>will reset the devices before using them (otherwise it will break).
> >>>
> >>>We reuse the virtio_is_big_endian() helper since it provides the right
> >>>value for legacy devices with most of the targets, that have fixed
> >>>endianness. It can then be overriden to support endian-ambivalent 
> >>>targets.
> >>>
> >>>To support migration, we need to set the flag in virtio_load() as well.
> >>>
> >>>(a) One solution would be to add it to the stream, but it have some
> >>> drawbacks:
> >>>- since this only affects a few targets, the field should be put into a
> >>>   subsection
> >>>- virtio migration code should be ported to vmstate to be able to 
> >>>introduce
> >>>   such a subsection
> >>>
> >>>(b) If we assume the following to be true:
> >>>- target endianness falls under some cpu state
> >>>- cpu state is always restored before virtio devices state because they
> >>>   get initialized in this order in main().
> >>>Then an alternative is to rely on virtio_is_big_endian() again at
> >>>load time. No need to mess around with the migration stream in this 
> >>>case.
> >>>
> >>>This patch implements (b).
> >>>
> >>>Note that the tswap helpers are implemented in virtio.c so that
> >>>virtio-access.h stays platform independant. Most of the virtio code
> >>>will be buildable under common-obj instead of obj then, and spare
> >>>some cycles when building for multiple targets.
> >>>
> >>>Signed-off-by: Rusty Russell 
> >>>[ ldq_phys() API change,
> >>>   relicensed virtio-access.h to GPLv2+ on Rusty's request,
> >>>   introduce a per-device is_big_endian flag (supersedes 
> >>> needs_byteswap global)
> >>>   add VirtIODevice * arg to virtio helpers,
> >>>   use the existing virtio_is_big_endian() helper,
> >>>   virtio-pci: use the device is_big_endian flag,
> >>>   introduce virtio tswap16 and tswap64 helpers,
> >>>   move calls to tswap* out of virtio-access.h to make it platform 
> >>> independant,
> >>>   migration support,
> >>>   Greg Kurz  ]
> >>>Cc: Cédric Le Goater 
> >>>Signed-off-by: Greg Kurz 
> >>>---
> >>>
> >>>Changes since v6:
> >>>- merge the virtio_needs_byteswap() helper from v6 and existing
> >>>   virtio_is_big_endian()
> >>>- virtio-pci: now supports endianness changes
> >>>- virtio-access.h fixes (target independant)
> >>>
> >>>  exec.c|2 -
> >>>  hw/virtio/Makefile.objs   |2 -
> >>>  hw/virtio/virtio-pci.c|   11 +--
> >>>  hw/virtio/virtio.c|   35 +
> >>>  include/hw/virtio/virtio-access.h |  138 
> >>> +
> >>>  include/hw/virtio/virtio.h|2 +
> >>>  vl.c  |4 +
> >>>  7 files changed, 185 insertions(+), 9 deletions(-)
> >>>  create mode 100644 include/hw/virtio/virtio-access.h
> >>>
> >>>diff --git a/exec.c b/exec.c
> >>>index 91513c6..e6777d0 100644
> >>>--- a/exec.c
> >>>+++ b/exec.c
> >>>@@ -42,6 +42,7 @@
> >>>  #else /* !CONFIG_USER_ONLY */
> >>>  #include "sysemu/xen-mapcache.h"
> >>>  #include "trace.h"
> >>>+#include "hw/virtio/virtio.h"
> >>>  #endif
> >>>  #include "exec/cpu-all.h"
> >>>@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, 
> >>>target_ulong addr,
> >>>   * A helper function for the _utterly broken_ virtio device model to 
> >>> find out if
> >>>   * it's running on a big endian machine. Don't do this at home kid

Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Greg Kurz
On Mon, 14 Apr 2014 14:53:03 +0200
Alexander Graf  wrote:
> 
> On 14.04.14 14:50, Greg Kurz wrote:
> > On Mon, 14 Apr 2014 14:22:36 +0200
> > Alexander Graf  wrote:
> >
> >> On 14.04.14 13:58, Greg Kurz wrote:
> >>> From: Rusty Russell 
> >>>
> >>> virtio data structures are defined as "target endian", which assumes
> >>> that's a fixed value.  In fact, that actually means it's 
> >>> platform-specific.
> >>> The OASIS virtio 1.0 spec will fix this, by making it all little endian.
> >>>
> >>> We introduce memory accessors to be used accross the virtio code where
> >>> needed. These accessors should support both legacy and 1.0 devices.
> >>> A good way to do it is to introduce a per-device property to store the
> >>> endianness. We choose to set this flag at device reset time because it
> >>> is reasonnable to assume the endianness won't change unless we reboot or
> >>> kexec another kernel. And it is also reasonnable to assume the new kernel
> >>> will reset the devices before using them (otherwise it will break).
> >>>
> >>> We reuse the virtio_is_big_endian() helper since it provides the right
> >>> value for legacy devices with most of the targets, that have fixed
> >>> endianness. It can then be overriden to support endian-ambivalent targets.
> >>>
> >>> To support migration, we need to set the flag in virtio_load() as well.
> >>>
> >>> (a) One solution would be to add it to the stream, but it have some
> >>>   drawbacks:
> >>> - since this only affects a few targets, the field should be put into a
> >>> subsection
> >>> - virtio migration code should be ported to vmstate to be able to 
> >>> introduce
> >>> such a subsection
> >>>
> >>> (b) If we assume the following to be true:
> >>> - target endianness falls under some cpu state
> >>> - cpu state is always restored before virtio devices state because they
> >>> get initialized in this order in main().
> >>> Then an alternative is to rely on virtio_is_big_endian() again at
> >>> load time. No need to mess around with the migration stream in this case.
> >>>
> >>> This patch implements (b).
> >>>
> >>> Note that the tswap helpers are implemented in virtio.c so that
> >>> virtio-access.h stays platform independant. Most of the virtio code
> >>> will be buildable under common-obj instead of obj then, and spare
> >>> some cycles when building for multiple targets.
> >>>
> >>> Signed-off-by: Rusty Russell 
> >>> [ ldq_phys() API change,
> >>> relicensed virtio-access.h to GPLv2+ on Rusty's request,
> >>> introduce a per-device is_big_endian flag (supersedes needs_byteswap 
> >>> global)
> >>> add VirtIODevice * arg to virtio helpers,
> >>> use the existing virtio_is_big_endian() helper,
> >>> virtio-pci: use the device is_big_endian flag,
> >>> introduce virtio tswap16 and tswap64 helpers,
> >>> move calls to tswap* out of virtio-access.h to make it platform 
> >>> independant,
> >>> migration support,
> >>> Greg Kurz  ]
> >>> Cc: Cédric Le Goater 
> >>> Signed-off-by: Greg Kurz 
> >>> ---
> 
> [...]
> 
> >>> +
> >>> +uint64_t virtio_tswap64(VirtIODevice *vdev, uint64_t s);
> >>> +
> >>> +static inline void virtio_tswap64s(VirtIODevice *vdev, uint64_t *s)
> >>> +{
> >>> +*s = virtio_tswap64(vdev, *s);
> >>> +}
> >> Could we try to poison any non-virtio, non-endian-specific memory
> >> accessors when this file is included? That way we don't fall into
> >> pitfalls where we end up running stl_p in a tiny corner case somewhere
> >> still.
> >>
> >>
> >> Alex
> >>
> > Hmmm... there are still some stl_p users in virtio.c that I could not
> > port to the new virtio memory accessors... These are the virtio_config_*
> > helpers that gets called from virtio-pci and virtio-mmio.
> 
> Why couldn't you port them to the new virtio memory accessors? Are you 
> missing a vdev parameter? Could you add one?
> 
> IIRC config space is a weird mix of target endianness and little endian.
> 

The helpers in virtio do the stl_p, but it is the caller that does the
byteswap (virtio-pci) or not (virtio-mmio)... That's where I got a little
confused and did not dare to touch this code :-\

Of course, I can try to focus harder see what can/should be done here. :)

-- 
Gregory Kurz kurzg...@fr.ibm.com
 gk...@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
Alan Moore.




Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Alexander Graf


On 14.04.14 15:12, Greg Kurz wrote:

On Mon, 14 Apr 2014 14:53:03 +0200
Alexander Graf  wrote:

On 14.04.14 14:50, Greg Kurz wrote:

On Mon, 14 Apr 2014 14:22:36 +0200
Alexander Graf  wrote:


On 14.04.14 13:58, Greg Kurz wrote:

From: Rusty Russell 

virtio data structures are defined as "target endian", which assumes
that's a fixed value.  In fact, that actually means it's platform-specific.
The OASIS virtio 1.0 spec will fix this, by making it all little endian.

We introduce memory accessors to be used accross the virtio code where
needed. These accessors should support both legacy and 1.0 devices.
A good way to do it is to introduce a per-device property to store the
endianness. We choose to set this flag at device reset time because it
is reasonnable to assume the endianness won't change unless we reboot or
kexec another kernel. And it is also reasonnable to assume the new kernel
will reset the devices before using them (otherwise it will break).

We reuse the virtio_is_big_endian() helper since it provides the right
value for legacy devices with most of the targets, that have fixed
endianness. It can then be overriden to support endian-ambivalent targets.

To support migration, we need to set the flag in virtio_load() as well.

(a) One solution would be to add it to the stream, but it have some
   drawbacks:
- since this only affects a few targets, the field should be put into a
 subsection
- virtio migration code should be ported to vmstate to be able to introduce
 such a subsection

(b) If we assume the following to be true:
- target endianness falls under some cpu state
- cpu state is always restored before virtio devices state because they
 get initialized in this order in main().
Then an alternative is to rely on virtio_is_big_endian() again at
load time. No need to mess around with the migration stream in this case.

This patch implements (b).

Note that the tswap helpers are implemented in virtio.c so that
virtio-access.h stays platform independant. Most of the virtio code
will be buildable under common-obj instead of obj then, and spare
some cycles when building for multiple targets.

Signed-off-by: Rusty Russell 
[ ldq_phys() API change,
 relicensed virtio-access.h to GPLv2+ on Rusty's request,
 introduce a per-device is_big_endian flag (supersedes needs_byteswap 
global)
 add VirtIODevice * arg to virtio helpers,
 use the existing virtio_is_big_endian() helper,
 virtio-pci: use the device is_big_endian flag,
 introduce virtio tswap16 and tswap64 helpers,
 move calls to tswap* out of virtio-access.h to make it platform 
independant,
 migration support,
 Greg Kurz  ]
Cc: Cédric Le Goater 
Signed-off-by: Greg Kurz 
---

[...]


+
+uint64_t virtio_tswap64(VirtIODevice *vdev, uint64_t s);
+
+static inline void virtio_tswap64s(VirtIODevice *vdev, uint64_t *s)
+{
+*s = virtio_tswap64(vdev, *s);
+}

Could we try to poison any non-virtio, non-endian-specific memory
accessors when this file is included? That way we don't fall into
pitfalls where we end up running stl_p in a tiny corner case somewhere
still.


Alex


Hmmm... there are still some stl_p users in virtio.c that I could not
port to the new virtio memory accessors... These are the virtio_config_*
helpers that gets called from virtio-pci and virtio-mmio.

Why couldn't you port them to the new virtio memory accessors? Are you
missing a vdev parameter? Could you add one?

IIRC config space is a weird mix of target endianness and little endian.


The helpers in virtio do the stl_p, but it is the caller that does the
byteswap (virtio-pci) or not (virtio-mmio)... That's where I got a little
confused and did not dare to touch this code :-\

Of course, I can try to focus harder see what can/should be done here. :)


Let's make Michael happy with target specific virtio drivers first and 
then look at this mess ;).



Alex





Re: [Qemu-devel] Should we have a 2.0-rc3 ?

2014-04-14 Thread Paolo Bonzini
Il 14/04/2014 09:11, Peter Maydell ha scritto:
> On 11 April 2014 18:37, Peter Maydell  wrote:
>> Status update:
>> Applied:
>>  * ACPI fixes (both sets)
>>  * block queue
>>  * SDL2 relative mode fixes
>>  * fix for virtio-net CVE
>>  * fix for qom-list crash
>>  * my patch to stack-protector check
>> Patches on list but need review/ack and/or not sure whether to apply:
>>  * kvm_physical_sync_dirty_bitmap bug
> 
> I propose to apply the "revert" patch MJT posted.
> 
>>  * my fix to my stack-protector check patch (oops)
> 
> Applied.
> 
>>  * vmxnet3 patches
> 
> Applied.
> 
>> Raised as issues but no patches:
>>  * PCI bus naming
> 
> Nak from Alex, will not attempt to change for 2.0.
> 
>>  * win64 virtio-scsi regression
> 
> In the absence of any patches here I think we should deem
> win64 virtio-scsi sufficiently minor that it will get
> fixed in 2.0.1.

The patch is at
http://article.gmane.org/gmane.comp.emulators.qemu/264111/raw
if you want to apply it.

Paolo

> Other:
>  * applied patch for a buffer overrun fix in IDE
>  * currently processing MST's ACPI pull
> 
> Summary: I'm going to apply MST's ACPI pull and the revert
> patch for the sync_dirty_bitmap issue, and then tag rc3
> today.
> 
> thanks
> -- PMM
> 
> 




Re: [Qemu-devel] [PATCH v7 1/8] virtio: endian-ambivalent targets using legacy virtio

2014-04-14 Thread Greg Kurz
On Mon, 14 Apr 2014 14:40:04 +0200
Alexander Graf  wrote:
> 
> On 14.04.14 14:37, Michael S. Tsirkin wrote:
> > On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote:
> >> On 14.04.14 14:24, Michael S. Tsirkin wrote:
> >>> On Mon, Apr 14, 2014 at 02:16:03PM +0200, Alexander Graf wrote:
>  On 14.04.14 13:58, Greg Kurz wrote:
> > From: Rusty Russell 
> >
> > virtio data structures are defined as "target endian", which assumes
> > that's a fixed value.  In fact, that actually means it's 
> > platform-specific.
> > The OASIS virtio 1.0 spec will fix this, by making it all little endian.
> >
> > We introduce memory accessors to be used accross the virtio code where
> > needed. These accessors should support both legacy and 1.0 devices.
> > A good way to do it is to introduce a per-device property to store the
> > endianness. We choose to set this flag at device reset time because it
> > is reasonnable to assume the endianness won't change unless we reboot or
> > kexec another kernel. And it is also reasonnable to assume the new 
> > kernel
> > will reset the devices before using them (otherwise it will break).
> >
> > We reuse the virtio_is_big_endian() helper since it provides the right
> > value for legacy devices with most of the targets, that have fixed
> > endianness. It can then be overriden to support endian-ambivalent 
> > targets.
> >
> > To support migration, we need to set the flag in virtio_load() as well.
> >
> > (a) One solution would be to add it to the stream, but it have some
> >  drawbacks:
> > - since this only affects a few targets, the field should be put into a
> >subsection
> > - virtio migration code should be ported to vmstate to be able to 
> > introduce
> >such a subsection
> >
> > (b) If we assume the following to be true:
> > - target endianness falls under some cpu state
> > - cpu state is always restored before virtio devices state because they
> >get initialized in this order in main().
> > Then an alternative is to rely on virtio_is_big_endian() again at
> > load time. No need to mess around with the migration stream in this 
> > case.
> >
> > This patch implements (b).
> >
> > Note that the tswap helpers are implemented in virtio.c so that
> > virtio-access.h stays platform independant. Most of the virtio code
> > will be buildable under common-obj instead of obj then, and spare
> > some cycles when building for multiple targets.
> >
> > Signed-off-by: Rusty Russell 
> > [ ldq_phys() API change,
> >relicensed virtio-access.h to GPLv2+ on Rusty's request,
> >introduce a per-device is_big_endian flag (supersedes needs_byteswap 
> > global)
> >add VirtIODevice * arg to virtio helpers,
> >use the existing virtio_is_big_endian() helper,
> >virtio-pci: use the device is_big_endian flag,
> >introduce virtio tswap16 and tswap64 helpers,
> >move calls to tswap* out of virtio-access.h to make it platform 
> > independant,
> >migration support,
> >Greg Kurz  ]
> > Cc: Cédric Le Goater 
> > Signed-off-by: Greg Kurz 
> > ---
> >
> > Changes since v6:
> > - merge the virtio_needs_byteswap() helper from v6 and existing
> >virtio_is_big_endian()
> > - virtio-pci: now supports endianness changes
> > - virtio-access.h fixes (target independant)
> >
> >   exec.c|2 -
> >   hw/virtio/Makefile.objs   |2 -
> >   hw/virtio/virtio-pci.c|   11 +--
> >   hw/virtio/virtio.c|   35 +
> >   include/hw/virtio/virtio-access.h |  138 
> > +
> >   include/hw/virtio/virtio.h|2 +
> >   vl.c  |4 +
> >   7 files changed, 185 insertions(+), 9 deletions(-)
> >   create mode 100644 include/hw/virtio/virtio-access.h
> >
> > diff --git a/exec.c b/exec.c
> > index 91513c6..e6777d0 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -42,6 +42,7 @@
> >   #else /* !CONFIG_USER_ONLY */
> >   #include "sysemu/xen-mapcache.h"
> >   #include "trace.h"
> > +#include "hw/virtio/virtio.h"
> >   #endif
> >   #include "exec/cpu-all.h"
> > @@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, 
> > target_ulong addr,
> >* A helper function for the _utterly broken_ virtio device model to 
> > find out if
> >* it's running on a big endian machine. Don't do this at home kids!
> >*/
> > -bool virtio_is_big_endian(void);
> >   bool virtio_is_big_endian(void)
> >   {
> >   #if defined(TARGET_WORDS_BIGENDIAN)
> > diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> > index 1ba53d9..68c30

Re: [Qemu-devel] Should we have a 2.0-rc3 ?

2014-04-14 Thread Peter Maydell
On 14 April 2014 14:17, Paolo Bonzini  wrote:
> Il 14/04/2014 09:11, Peter Maydell ha scritto:
>>>  * win64 virtio-scsi regression
>>
>> In the absence of any patches here I think we should deem
>> win64 virtio-scsi sufficiently minor that it will get
>> fixed in 2.0.1.
>
> The patch is at
> http://article.gmane.org/gmane.comp.emulators.qemu/264111/raw
> if you want to apply it.

Hmm, it's been around for weeks already and nobody
reviewed or applied it? I'm definitely nervous about
applying exec.c patches at this stage...

(More generally it feels like either the code using this
needs to be able to cope with "might only get TARGET_PAGE_SIZE"
semantics, or we need to fix the Xen code paths so that
they provide the whole requested section as well. The
patch looks more like a bandaid than an actual fix :-()

thanks
-- PMM



Re: [Qemu-devel] [PATCH v4 23/25] tcg-aarch64: Introduce tcg_out_insn_3312, _3310, _3313

2014-04-14 Thread Claudio Fontana
On 11.04.2014 17:40, Richard Henderson wrote:
> Replace aarch64_ldst_op_data with AArch64LdstType, as it wasn't encoded
> for the proper shift for the field and was confusing.
> 
> Merge aarch64_ldst_op_data, AArch64LdstType, and a few stray opcode bits
> into a single I3312_* argument, eliminating some magic numbers from the
> helper functions.
> 
> Signed-off-by: Richard Henderson 
> ---
>  tcg/aarch64/tcg-target.c | 176 
> ---
>  1 file changed, 89 insertions(+), 87 deletions(-)
> 
> diff --git a/tcg/aarch64/tcg-target.c b/tcg/aarch64/tcg-target.c
> index 7f72df5..3235824 100644
> --- a/tcg/aarch64/tcg-target.c
> +++ b/tcg/aarch64/tcg-target.c
> @@ -242,19 +242,12 @@ static const enum aarch64_cond_code 
> tcg_cond_to_aarch64[] = {
>  [TCG_COND_LEU] = COND_LS,
>  };
>  
> -/* opcodes for LDR / STR instructions with base + simm9 addressing */
> -enum aarch64_ldst_op_data { /* size of the data moved */
> -LDST_8 = 0x38,
> -LDST_16 = 0x78,
> -LDST_32 = 0xb8,
> -LDST_64 = 0xf8,
> -};
> -enum aarch64_ldst_op_type { /* type of operation */
> -LDST_ST = 0x0,/* store */
> -LDST_LD = 0x4,/* load */
> -LDST_LD_S_X = 0x8,  /* load and sign-extend into Xt */
> -LDST_LD_S_W = 0xc,  /* load and sign-extend into Wt */
> -};
> +typedef enum {
> +LDST_ST = 0,/* store */
> +LDST_LD = 1,/* load */
> +LDST_LD_S_X = 2,  /* load and sign-extend into Xt */
> +LDST_LD_S_W = 3,  /* load and sign-extend into Wt */
> +} AArch64LdstType;
>  
>  /* We encode the format of the insn into the beginning of the name, so that
> we can have the preprocessor help "typecheck" the insn vs the output
> @@ -278,6 +271,28 @@ typedef enum {
>  I3207_BLR   = 0xd63f,
>  I3207_RET   = 0xd65f,
>  
> +/* Load/store register.  Described here as 3.3.12, but the helper
> +   that emits them can transform to 3.3.10 or 3.3.13.  */
> +I3312_STRB  = 0x3800 | LDST_ST << 22 | MO_8 << 30,
> +I3312_STRH  = 0x3800 | LDST_ST << 22 | MO_16 << 30,
> +I3312_STRW  = 0x3800 | LDST_ST << 22 | MO_32 << 30,
> +I3312_STRX  = 0x3800 | LDST_ST << 22 | MO_64 << 30,
> +
> +I3312_LDRB  = 0x3800 | LDST_LD << 22 | MO_8 << 30,
> +I3312_LDRH  = 0x3800 | LDST_LD << 22 | MO_16 << 30,
> +I3312_LDRW  = 0x3800 | LDST_LD << 22 | MO_32 << 30,
> +I3312_LDRX  = 0x3800 | LDST_LD << 22 | MO_64 << 30,
> +
> +I3312_LDRSBW= 0x3800 | LDST_LD_S_W << 22 | MO_8 << 30,
> +I3312_LDRSHW= 0x3800 | LDST_LD_S_W << 22 | MO_16 << 30,
> +
> +I3312_LDRSBX= 0x3800 | LDST_LD_S_X << 22 | MO_8 << 30,
> +I3312_LDRSHX= 0x3800 | LDST_LD_S_X << 22 | MO_16 << 30,
> +I3312_LDRSWX= 0x3800 | LDST_LD_S_X << 22 | MO_32 << 30,
> +
> +I3312_TO_I3310  = 0x00206800,
> +I3312_TO_I3313  = 0x0100,
> +
>  /* Load/store register pair instructions.  */
>  I3314_LDP   = 0x2840,
>  I3314_STP   = 0x2800,
> @@ -490,26 +505,25 @@ static void tcg_out_insn_3509(TCGContext *s, 
> AArch64Insn insn, TCGType ext,
>  tcg_out32(s, insn | ext << 31 | rm << 16 | ra << 10 | rn << 5 | rd);
>  }
>  
> +static void tcg_out_insn_3310(TCGContext *s, AArch64Insn insn,
> +  TCGReg rd, TCGReg base, TCGReg regoff)
> +{
> +/* Note the AArch64Insn constants above are for C3.3.12.  Adjust.  */
> +tcg_out32(s, insn | I3312_TO_I3310 | regoff << 16 | base << 5 | rd);
> +}
>  
> -static inline void tcg_out_ldst_9(TCGContext *s,
> -  enum aarch64_ldst_op_data op_data,
> -  enum aarch64_ldst_op_type op_type,
> -  TCGReg rd, TCGReg rn, intptr_t offset)
> +
> +static void tcg_out_insn_3312(TCGContext *s, AArch64Insn insn,
> +  TCGReg rd, TCGReg rn, intptr_t offset)
>  {
> -/* use LDUR with BASE register with 9bit signed unscaled offset */
> -tcg_out32(s, op_data << 24 | op_type << 20
> -  | (offset & 0x1ff) << 12 | rn << 5 | rd);
> +tcg_out32(s, insn | (offset & 0x1ff) << 12 | rn << 5 | rd);
>  }
>  
> -/* tcg_out_ldst_12 expects a scaled unsigned immediate offset */
> -static inline void tcg_out_ldst_12(TCGContext *s,
> -   enum aarch64_ldst_op_data op_data,
> -   enum aarch64_ldst_op_type op_type,
> -   TCGReg rd, TCGReg rn,
> -   tcg_target_ulong scaled_uimm)
> +static void tcg_out_insn_3313(TCGContext *s, AArch64Insn insn,
> +  TCGReg rd, TCGReg rn, uintptr_t scaled_uimm)
>  {
> -tcg_out32(s, (op_data | 1) << 24
> -  | op_type << 20 | scaled_uimm << 10 | rn << 5 | rd);
> +/* Note the AArch64Insn constants above are for C3.3.12.  Adjust.  */
> +tcg_out32(s, insn | I3

Re: [Qemu-devel] [PATCH] block: Handle error of bdrv_getlength in bdrv_create_dirty_bitmap

2014-04-14 Thread Kevin Wolf
Am 03.04.2014 um 16:44 hat Max Reitz geschrieben:
> On 02.04.2014 08:04, Fam Zheng wrote:
> >bdrv_getlength could fail, check the return value before using it.
> >
> >Signed-off-by: Fam Zheng 
> >---
> >  block-migration.c | 28 
> >  block.c   | 10 --
> >  block/mirror.c|  5 -
> >  include/block/block.h |  3 ++-
> >  4 files changed, 38 insertions(+), 8 deletions(-)
> >
> >diff --git a/block-migration.c b/block-migration.c
> >index 897fdba..62cd597 100644
> >--- a/block-migration.c
> >+++ b/block-migration.c
> >@@ -310,13 +310,26 @@ static int mig_save_device_bulk(QEMUFile *f, 
> >BlkMigDevState *bmds)
> >  /* Called with iothread lock taken.  */
> >-static void set_dirty_tracking(void)
> >+static int set_dirty_tracking(void)
> >  {
> >  BlkMigDevState *bmds;
> >  QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
> >-bmds->dirty_bitmap = bdrv_create_dirty_bitmap(bmds->bs, BLOCK_SIZE);
> >+bmds->dirty_bitmap = bdrv_create_dirty_bitmap(bmds->bs, BLOCK_SIZE,
> >+  NULL);
> >+if (!bmds->dirty_bitmap) {
> >+goto fail;
> >+}
> >+}
> >+return 0;
> >+
> >+fail:
> >+QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
> >+if (bmds->dirty_bitmap) {
> >+bdrv_release_dirty_bitmap(bmds->bs, bmds->dirty_bitmap);
> >+}
> >  }
> >+return -1;
> 
> Through block_save_setup(), this ends up as f->last_error which is
> at least in qemu_loadvm_state() interpreted as -errno (or rather,
> that function generally returns -errno and in this case, it returns
> f->last_error). I know that it is not easy to find a correct error
> code here, but EPERM seems rather unfitting; even EIO would be
> better, in my opinion.
> 
> Anyway, this is not really bad, so:
> 
> Reviewed-by: Max Reitz 

Let's not add new cases of -1 interpreted as negative errno. Patches
that improve error handling should do it right.

> >  }
> >  static void unset_dirty_tracking(void)
> >@@ -611,10 +624,17 @@ static int block_save_setup(QEMUFile *f, void *opaque)
> >  block_mig_state.submitted, block_mig_state.transferred);
> >  qemu_mutex_lock_iothread();
> >-init_blk_migration(f);
> >  /* start track dirty blocks */
> >-set_dirty_tracking();
> >+ret = set_dirty_tracking();
> >+
> >+if (ret) {
> >+qemu_mutex_unlock_iothread();
> >+return ret;
> >+}
> >+
> >+init_blk_migration(f);
> >+
> >  qemu_mutex_unlock_iothread();
> >  ret = flush_blks(f);
> >diff --git a/block.c b/block.c
> >index acb70fd..93006de 100644
> >--- a/block.c
> >+++ b/block.c
> >@@ -5079,7 +5079,8 @@ bool bdrv_qiov_is_aligned(BlockDriverState *bs, 
> >QEMUIOVector *qiov)
> >  return true;
> >  }
> >-BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs, int 
> >granularity)
> >+BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs, int 
> >granularity,
> >+  Error **errp)
> >  {
> >  int64_t bitmap_size;
> >  BdrvDirtyBitmap *bitmap;
> >@@ -5088,7 +5089,12 @@ BdrvDirtyBitmap 
> >*bdrv_create_dirty_bitmap(BlockDriverState *bs, int granularity)
> >  granularity >>= BDRV_SECTOR_BITS;
> >  assert(granularity);
> >-bitmap_size = (bdrv_getlength(bs) >> BDRV_SECTOR_BITS);
> >+bitmap_size = bdrv_getlength(bs);
> >+if (bitmap_size < 0) {
> >+error_setg(errp, "could not get length of device");

Why not error_setg_errno()?

> >+return NULL;
> >+}
> >+bitmap_size >>= BDRV_SECTOR_BITS;
> >  bitmap = g_malloc0(sizeof(BdrvDirtyBitmap));
> >  bitmap->bitmap = hbitmap_alloc(bitmap_size, ffs(granularity) - 1);
> >  QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);

Kevin



  1   2   3   >