date:20150913

Re: [Qemu-devel] [PATCH v2 0/2] intel_iommu: Add support for translation for devices behind bridges

2015-09-13 Thread Knut Omang

On Sun, 2015-09-13 at 09:12 +1000, Benjamin Herrenschmidt wrote:
> On Sat, 2015-09-12 at 20:37 +0200, Knut Omang wrote:
> > As the thread went silent after our conclusions, I have made a second
> > implementation for the Intel IOMMU according to this alternate scheme.
> > It keeps the current API and handles the bus number resolution lazily
> > within the IOMMU implementation. I will post the (single) patch as v3
> > of this.
> > 
> > Hopefully this is acceptable and can be leveraged to do a similar
> > rework, or be abstracted as generic functionality (?) for the other
> > architectures.
> 
> Ah sorry, I meant to look at your email in more detail and respond but
> it fell through the cracks.

I know how it is.. :-)

> I'm happy to have a look at your work and see how it applies to me, you
> can see my powernv code which also supports translation for devices
> behind bridges here (but doesn't do as much caching as q35 does):
> 
> https://github.com/ozbenh/qemu/commit/4e0ed1002f98fd97aa7ca3a48c74933d0343dd42

Looks conceptually similar. The caching part is probably not that
important; it just came naturally as it evolved from the original
implementation. So far we are probably talking about at most a few buses
and a few devices per bus, I suppose.

I suppose the best approach now would be to stick to your suggestion of
getting the functionality in and letting it mature, then look for any
optimization, whether structural or performance-wise.

> Which depends on:
> 
> https://github.com/ozbenh/qemu/commit/facedeba8811985ca20ac3dbad5d07e1a10ea9b2
> 
> (Which I think Michael merged recently, I haven't checked).

Yes, I noticed while rebasing.

> Cheers,
> Ben.

Cheers,
Knut

Re: [Qemu-devel] [PATCH 15/17] target-openrisc: Fix madd

2015-09-13 Thread Bastian Koppelmann




On 09/03/2015 02:17 AM, Richard Henderson wrote:
> Note that the specification for lf.madd.s is confused.  It's
> the only mention of supposed FPMADDHI/FPMADDLO special registers.
> On the other hand, or1ksim implements a somewhat normal non-fused
> multiply and add.  Mirror that.
>
> Signed-off-by: Richard Henderson
> ---
>   target-openrisc/cpu.h|  3 --
>   target-openrisc/fpu_helper.c | 68 
>   target-openrisc/helper.h |  7 ++---
>   target-openrisc/translate.c  | 13 +++--
>   4 files changed, 30 insertions(+), 61 deletions(-)



Reviewed-by: Bastian Koppelmann 

Cheers,
Bastian

Re: [Qemu-devel] [PATCH 12/17] target-openrisc: Enable m[tf]spr from user mode

2015-09-13 Thread Bastian Koppelmann




On 09/06/2015 10:36 PM, Richard Henderson wrote:
> On Sep 5, 2015 14:35, Bastian Koppelmann wrote:
> > IIRC a lot of the registers are supervisor only, e.g. VR, NPC or SR and
> > the manual is fairly clear about that. User mode cpu ought not to read
> > these registers unconditionally.
>
> When I last discussed this on the openrisc list, back in March, there was no
> real specification for user mode, and what bits are or should be accessible.
>
> Looking at
>    http://opencores.org/or1k/Architecture_Specification
> today, that still seems to be the case.
>
> In the meantime, dropping the privilege check makes linux-user GCC tests work
> better.

Looking at the article, user mode seems to be optional, so I'm not
against it, but it does look weird. How does or1ksim do it?


Cheers,
Bastian

[Qemu-devel] [PATCH v2 4/8] target-arm: Suppress TBI for S2 translations

2015-09-13 Thread Edgar E. Iglesias

From: "Edgar E. Iglesias" 

Stage-2 MMU translations do not have configurable TBI as
the top byte is always 0 (48-bit IPAs).

Signed-off-by: Edgar E. Iglesias 
---
 target-arm/helper.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 81a1850..9977062 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -6370,7 +6370,9 @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
     if (arm_el_is_aa64(env, el)) {
         va_size = 64;
         if (el > 1) {
-            tbi = extract64(tcr->raw_tcr, 20, 1);
+            if (mmu_idx != ARMMMUIdx_S2NS) {
+                tbi = extract64(tcr->raw_tcr, 20, 1);
+            }
         } else {
             if (extract64(address, 55, 1)) {
                 tbi = extract64(tcr->raw_tcr, 38, 1);
-- 
1.9.1
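
The selection rule in the hunk above can be tried out in isolation. What
follows is a minimal sketch under stated assumptions, not the QEMU code:
extract64_sketch() and select_tbi() are hypothetical names, is_stage2 stands
in for the "mmu_idx == ARMMMUIdx_S2NS" test, and bit 37 for the TTBR0 half is
taken from the ARMv8 TCR_EL1 layout rather than from the hunk (the hunk itself
only shows bits 20 and 38).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract 'length' bits starting at bit 'start' (same idea as extract64()). */
static uint64_t extract64_sketch(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/*
 * Hypothetical helper: decide whether top-byte-ignore applies.
 * is_stage2 stands in for (mmu_idx == ARMMMUIdx_S2NS).
 */
static bool select_tbi(uint64_t raw_tcr, int el, bool is_stage2,
                       uint64_t address)
{
    if (el > 1) {
        /* EL2/EL3 regimes have one TBI bit (bit 20); stage 2 has none. */
        if (is_stage2) {
            return false;
        }
        return extract64_sketch(raw_tcr, 20, 1) != 0;
    }
    /* EL0/EL1: address bit 55 selects TBI1 (bit 38) or TBI0 (bit 37). */
    if (extract64_sketch(address, 55, 1)) {
        return extract64_sketch(raw_tcr, 38, 1) != 0;
    }
    return extract64_sketch(raw_tcr, 37, 1) != 0;
}
```

With bit 20 set in the control register, select_tbi() reports TBI for an EL2
stage-1 regime but not for the stage-2 regime, which is the behaviour the
patch introduces.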

[Qemu-devel] [PATCH v2 1/8] hw/cpu/{a15mpcore, a9mpcore}: Handle missing has_el3 CPU props gracefully

2015-09-13 Thread Edgar E. Iglesias

From: "Edgar E. Iglesias" 

Handle missing CPU support for EL3 gracefully.

Signed-off-by: Edgar E. Iglesias 
---
 hw/cpu/a15mpcore.c | 2 +-
 hw/cpu/a9mpcore.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/cpu/a15mpcore.c b/hw/cpu/a15mpcore.c
index 4ef8db1..94e8cc1 100644
--- a/hw/cpu/a15mpcore.c
+++ b/hw/cpu/a15mpcore.c
@@ -64,7 +64,7 @@ static void a15mp_priv_realize(DeviceState *dev, Error **errp)
          * either all the CPUs have TZ, or none do.
          */
         cpuobj = OBJECT(qemu_get_cpu(0));
-        has_el3 = object_property_find(cpuobj, "has_el3", &error_abort) &&
+        has_el3 = object_property_find(cpuobj, "has_el3", NULL) &&
             object_property_get_bool(cpuobj, "has_el3", &error_abort);
         qdev_prop_set_bit(gicdev, "has-security-extensions", has_el3);
     }
diff --git a/hw/cpu/a9mpcore.c b/hw/cpu/a9mpcore.c
index 7046246..869818c 100644
--- a/hw/cpu/a9mpcore.c
+++ b/hw/cpu/a9mpcore.c
@@ -69,7 +69,7 @@ static void a9mp_priv_realize(DeviceState *dev, Error **errp)
      * either all the CPUs have TZ, or none do.
      */
     cpuobj = OBJECT(qemu_get_cpu(0));
-    has_el3 = object_property_find(cpuobj, "has_el3", &error_abort) &&
+    has_el3 = object_property_find(cpuobj, "has_el3", NULL) &&
         object_property_get_bool(cpuobj, "has_el3", &error_abort);
     qdev_prop_set_bit(gicdev, "has-security-extensions", has_el3);
 
-- 
1.9.1
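
The change amounts to treating a missing property as "false" rather than as a
fatal error. A minimal sketch of that pattern, assuming a toy property table
in place of QOM (struct prop, find_prop() and prop_or_false() are hypothetical
and are not QEMU APIs):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a named boolean property on an object. */
struct prop {
    const char *name;
    bool value;
};

/* Look a property up by name; return NULL when absent instead of aborting. */
static const struct prop *find_prop(const struct prop *props, size_t n,
                                    const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(props[i].name, name) == 0) {
            return &props[i];
        }
    }
    return NULL;
}

/* A missing property reads as false; a present one yields its value. */
static bool prop_or_false(const struct prop *props, size_t n, const char *name)
{
    const struct prop *p = find_prop(props, n, name);
    return p != NULL && p->value;
}
```

The short-circuit in prop_or_false() mirrors the "find && get_bool" shape of
the patched expression: the value is only read once the lookup has succeeded.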

[Qemu-devel] [PATCH v2 0/8] arm: Steps towards EL2 support round 4

2015-09-13 Thread Edgar E. Iglesias

From: "Edgar E. Iglesias" 

Hi,

This is another series with small steps towards EL2 emulation.

Patch 1 is a fix to allow easier testing of EL3-less cores.
Patches 2 and on add regs and a few small steps towards 2-stage MMU.

Comments welcome!

Best regards,
Edgar

v1 -> v2:
* Add fix for gracefully handling missing has_el3 CPU props
* Dropped suppress of TTBR1 for S2 (unneeded)
* Comment on vttbr_write TLB flush
* Mark second instance of VTTBR as ALIAS
* Split the active aa32ns_aa64any into separate AA32/AA64 registrations to
  allow the AA64 one to avoid .access checks
* VTCR does not need TLB flushes
* Various CP_CONST/resetvalue=0 instead of writefns/readfns
* Fix VMPIDR el2 vs el1 typo
* Fix VMPIDR reset value
* Fix spelling of suppress in commit message

Edgar E. Iglesias (8):
  hw/cpu/{a15mpcore, a9mpcore}: Handle missing has_el3 CPU props
gracefully
  target-arm: Add VTCR_EL2
  target-arm: Add VTTBR_EL2
  target-arm: Suppress TBI for S2 translations
  target-arm: Suppress EPD for S2, EL2 and EL3 translations
  target-arm: Add VPIDR_EL2
  target-arm: Break out mpidr_read_val()
  target-arm: Add VMPIDR_EL2

 hw/cpu/a15mpcore.c  |   2 +-
 hw/cpu/a9mpcore.c   |   2 +-
 target-arm/cpu.h|   4 ++
 target-arm/helper.c | 158 +---
 4 files changed, 155 insertions(+), 11 deletions(-)

-- 
1.9.1

[Qemu-devel] [PATCH v2 3/8] target-arm: Add VTTBR_EL2

2015-09-13 Thread Edgar E. Iglesias

From: "Edgar E. Iglesias" 

Signed-off-by: Edgar E. Iglesias 
---
 target-arm/cpu.h|  1 +
 target-arm/helper.c | 34 --
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index f45fd05..c10e4ee 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -222,6 +222,7 @@ typedef struct CPUARMState {
             };
             uint64_t ttbr1_el[4];
         };
+        uint64_t vttbr_el2; /* Virtualization Translation Table Base.  */
         /* MMU translation table base control. */
         TCR tcr_el[4];
         TCR vtcr_el2; /* Virtualization Translation Control.  */
diff --git a/target-arm/helper.c b/target-arm/helper.c
index c49b954..81a1850 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2213,6 +2213,20 @@ static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     raw_write(env, ri, value);
 }
 
+static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                        uint64_t value)
+{
+    ARMCPU *cpu = arm_env_get_cpu(env);
+    CPUState *cs = CPU(cpu);
+
+    /* Accesses to VTTBR may change the VMID so we must flush the TLB.  */
+    if (raw_read(env, ri) != value) {
+        tlb_flush_by_mmuidx(cs, ARMMMUIdx_S12NSE1, ARMMMUIdx_S12NSE0,
+                            ARMMMUIdx_S2NS, -1);
+        raw_write(env, ri, value);
+    }
+}
+
 static const ARMCPRegInfo vmsa_pmsa_cp_reginfo[] = {
     { .name = "DFSR", .cp = 15, .crn = 5, .crm = 0, .opc1 = 0, .opc2 = 0,
       .access = PL1_RW, .type = ARM_CP_ALIAS,
@@ -3144,6 +3158,13 @@ static const ARMCPRegInfo el3_no_el2_cp_reginfo[] = {
       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 2,
       .access = PL2_RW, .accessfn = access_el3_aa32ns_aa64any,
       .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "VTTBR", .state = ARM_CP_STATE_AA32,
+      .cp = 15, .opc1 = 6, .crm = 2,
+      .access = PL2_RW, .accessfn = access_el3_aa32ns,
+      .type = ARM_CP_CONST | ARM_CP_64BIT, .resetvalue = 0 },
+    { .name = "VTTBR_EL2", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 0,
+      .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
     { .name = "SCTLR_EL2", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 0,
       .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
@@ -3286,6 +3307,16 @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 2,
       .access = PL2_RW, .type = ARM_CP_ALIAS,
       .fieldoffset = offsetof(CPUARMState, cp15.vtcr_el2) },
+    { .name = "VTTBR", .state = ARM_CP_STATE_AA32,
+      .cp = 15, .opc1 = 6, .crm = 2,
+      .type = ARM_CP_64BIT | ARM_CP_ALIAS,
+      .access = PL2_RW, .accessfn = access_el3_aa32ns,
+      .fieldoffset = offsetof(CPUARMState, cp15.vttbr_el2),
+      .writefn = vttbr_write },
+    { .name = "VTTBR_EL2", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 0,
+      .access = PL2_RW, .writefn = vttbr_write,
+      .fieldoffset = offsetof(CPUARMState, cp15.vttbr_el2) },
     { .name = "SCTLR_EL2", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 0,
       .access = PL2_RW, .raw_writefn = raw_write, .writefn = sctlr_write,
@@ -5791,8 +5822,7 @@ static inline uint64_t regime_ttbr(CPUARMState *env, ARMMMUIdx mmu_idx,
                                    int ttbrn)
 {
     if (mmu_idx == ARMMMUIdx_S2NS) {
-        /* TODO: return VTTBR_EL2 */
-        g_assert_not_reached();
+        return env->cp15.vttbr_el2;
     }
     if (ttbrn == 0) {
         return env->cp15.ttbr0_el[regime_el(env, mmu_idx)];
-- 
1.9.1
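
The write hook in this patch only flushes when the register value actually
changes. A minimal sketch of that flush-on-change pattern, assuming a counter
in place of a real TLB (struct vttbr_state, flush_guest_tlbs() and
vttbr_write_sketch() are hypothetical names, not QEMU functions):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU state touched by the write hook. */
struct vttbr_state {
    uint64_t vttbr;        /* current register value */
    unsigned tlb_flushes;  /* counts flushes, in place of a real TLB */
};

/* Stand-in for tlb_flush_by_mmuidx(); here it only counts calls. */
static void flush_guest_tlbs(struct vttbr_state *s)
{
    s->tlb_flushes++;
}

/* Flush only when the value actually changes, then store the new value. */
static void vttbr_write_sketch(struct vttbr_state *s, uint64_t value)
{
    assert(s);
    if (s->vttbr != value) {
        flush_guest_tlbs(s);
        s->vttbr = value;
    }
}
```

Rewriting the same value leaves the flush counter untouched, so a guest that
reloads an unchanged VTTBR does not pay for a flush in this model.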

[Qemu-devel] [PATCH v2 2/8] target-arm: Add VTCR_EL2

2015-09-13 Thread Edgar E. Iglesias

From: "Edgar E. Iglesias" 

Signed-off-by: Edgar E. Iglesias 
---
 target-arm/cpu.h|  1 +
 target-arm/helper.c | 43 +--
 2 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 5abd8ba..f45fd05 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -224,6 +224,7 @@ typedef struct CPUARMState {
         };
         /* MMU translation table base control. */
         TCR tcr_el[4];
+        TCR vtcr_el2; /* Virtualization Translation Control.  */
         uint32_t c2_data; /* MPU data cacheable bits.  */
         uint32_t c2_insn; /* MPU instruction cacheable bits.  */
         union { /* MMU domain access control register
diff --git a/target-arm/helper.c b/target-arm/helper.c
index d453120..c49b954 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -325,6 +325,34 @@ void init_cpreg_list(ARMCPU *cpu)
     g_list_free(keys);
 }
 
+/*
+ * Some registers are not accessible if EL3.NS=0 and EL3 is using AArch32 but
+ * they are accessible when EL3 is using AArch64 regardless of EL3.NS.
+ *
+ * access_el3_aa32ns: Used to check AArch32 register views.
+ * access_el3_aa32ns_aa64any: Used to check both AArch32/64 register views.
+ */
+static CPAccessResult access_el3_aa32ns(CPUARMState *env,
+                                        const ARMCPRegInfo *ri)
+{
+    bool secure = arm_is_secure_below_el3(env);
+
+    assert(!arm_el_is_aa64(env, 3));
+    if (secure) {
+        return CP_ACCESS_TRAP_UNCATEGORIZED;
+    }
+    return CP_ACCESS_OK;
+}
+
+static CPAccessResult access_el3_aa32ns_aa64any(CPUARMState *env,
+                                                const ARMCPRegInfo *ri)
+{
+    if (!arm_el_is_aa64(env, 3)) {
+        return access_el3_aa32ns(env, ri);
+    }
+    return CP_ACCESS_OK;
+}
+
 static void dacr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 {
     ARMCPU *cpu = arm_env_get_cpu(env);
@@ -3112,6 +3140,10 @@ static const ARMCPRegInfo el3_no_el2_cp_reginfo[] = {
     { .name = "TCR_EL2", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 2,
       .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+    { .name = "VTCR_EL2", .state = ARM_CP_STATE_BOTH,
+      .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 2,
+      .access = PL2_RW, .accessfn = access_el3_aa32ns_aa64any,
+      .type = ARM_CP_CONST, .resetvalue = 0 },
     { .name = "SCTLR_EL2", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 0,
       .access = PL2_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
@@ -3246,6 +3278,14 @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
       .access = PL2_RW, .writefn = vmsa_tcr_el1_write,
       .resetfn = vmsa_ttbcr_reset, .raw_writefn = raw_write,
       .fieldoffset = offsetof(CPUARMState, cp15.tcr_el[2]) },
+    { .name = "VTCR", .state = ARM_CP_STATE_AA32,
+      .cp = 15, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 2,
+      .access = PL2_RW, .accessfn = access_el3_aa32ns,
+      .fieldoffset = offsetof(CPUARMState, cp15.vtcr_el2) },
+    { .name = "VTCR_EL2", .state = ARM_CP_STATE_AA64,
+      .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 2,
+      .access = PL2_RW, .type = ARM_CP_ALIAS,
+      .fieldoffset = offsetof(CPUARMState, cp15.vtcr_el2) },
     { .name = "SCTLR_EL2", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 0, .opc2 = 0,
       .access = PL2_RW, .raw_writefn = raw_write, .writefn = sctlr_write,
@@ -5741,8 +5781,7 @@ static inline bool regime_translation_disabled(CPUARMState *env,
 static inline TCR *regime_tcr(CPUARMState *env, ARMMMUIdx mmu_idx)
 {
     if (mmu_idx == ARMMMUIdx_S2NS) {
-        /* TODO: return VTCR_EL2 */
-        g_assert_not_reached();
+        return &env->cp15.vtcr_el2;
     }
     return &env->cp15.tcr_el[regime_el(env, mmu_idx)];
 }
-- 
1.9.1
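
The two access checks added by this patch form a small decision table. A
minimal sketch of it, assuming a reduced two-flag view of the CPU state
(struct el3_view, check_aa32ns() and check_aa32ns_aa64any() are hypothetical
stand-ins, not the QEMU functions or types):

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { ACCESS_OK, ACCESS_TRAP } access_result;

/* Hypothetical, reduced view of the state the checks consult. */
struct el3_view {
    bool el3_is_aa64;       /* stands in for arm_el_is_aa64(env, 3) */
    bool secure_below_el3;  /* stands in for arm_is_secure_below_el3(env) */
};

/* AArch32-only register views: trap when running secure below EL3. */
static access_result check_aa32ns(const struct el3_view *v)
{
    assert(!v->el3_is_aa64);
    return v->secure_below_el3 ? ACCESS_TRAP : ACCESS_OK;
}

/* Shared AArch32/AArch64 views: always allowed when EL3 is AArch64. */
static access_result check_aa32ns_aa64any(const struct el3_view *v)
{
    if (!v->el3_is_aa64) {
        return check_aa32ns(v);
    }
    return ACCESS_OK;
}
```

The second check defers to the first only when EL3 is AArch32, which keeps the
assertion in check_aa32ns() from firing for AArch64 EL3 configurations.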

[Qemu-devel] [PATCH v2 5/8] target-arm: Suppress EPD for S2, EL2 and EL3 translations

2015-09-13 Thread Edgar E. Iglesias

From: "Edgar E. Iglesias" 

Stage-2 translations, EL2 and EL3 regimes don't have the
EPD control.

Signed-off-by: Edgar E. Iglesias 
---
 target-arm/helper.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 9977062..6c67ce2 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -6344,7 +6344,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
     /* Read an LPAE long-descriptor translation table. */
     MMUFaultType fault_type = translation_fault;
     uint32_t level = 1;
-    uint32_t epd;
+    uint32_t epd = 0;
     int32_t tsz;
     uint32_t tg;
     uint64_t ttbr;
@@ -6438,7 +6438,9 @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
      */
     if (ttbr_select == 0) {
         ttbr = regime_ttbr(env, mmu_idx, 0);
-        epd = extract32(tcr->raw_tcr, 7, 1);
+        if (el < 2) {
+            epd = extract32(tcr->raw_tcr, 7, 1);
+        }
         tsz = t0sz;
 
         tg = extract32(tcr->raw_tcr, 14, 2);
-- 
1.9.1
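
The effect of the two hunks above is that EPD defaults to 0 (walks enabled)
and is only read from the control register for EL0/EL1 regimes. A minimal
sketch of the TTBR0 case shown in the patch, assuming hypothetical helper
names (extract32_sketch() and epd0_for_regime() are not QEMU functions):

```c
#include <assert.h>
#include <stdint.h>

/* Extract 'length' bits from a 32-bit value (same idea as extract32()). */
static uint32_t extract32_sketch(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (UINT32_MAX >> (32 - length));
}

/* Hypothetical helper: EPD0 for the TTBR0 walk; 0 means the walk is allowed. */
static uint32_t epd0_for_regime(uint32_t raw_tcr, int el)
{
    uint32_t epd = 0;   /* default: walks enabled */

    if (el < 2) {
        /* Only EL0/EL1 regimes carry the EPD0 control (bit 7). */
        epd = extract32_sketch(raw_tcr, 7, 1);
    }
    return epd;
}
```

For el >= 2, bit 7 of the control register is ignored, so a stray value there
cannot disable the walk.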

[Qemu-devel] [PATCH v2 6/8] target-arm: Add VPIDR_EL2

2015-09-13 Thread Edgar E. Iglesias

From: "Edgar E. Iglesias" 

Signed-off-by: Edgar E. Iglesias 
---
 target-arm/cpu.h|  1 +
 target-arm/helper.c | 42 +-
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index c10e4ee..bef898f 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -385,6 +385,7 @@ typedef struct CPUARMState {
          */
         uint64_t c15_ccnt;
         uint64_t pmccfiltr_el0; /* Performance Monitor Filter Register */
+        uint64_t vpidr_el2; /* Virtualization Processor ID Register */
     } cp15;
 
     struct {
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 6c67ce2..f151646 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2445,6 +2445,18 @@ static const ARMCPRegInfo strongarm_cp_reginfo[] = {
     REGINFO_SENTINEL
 };
 
+static uint64_t midr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    ARMCPU *cpu = arm_env_get_cpu(env);
+    unsigned int cur_el = arm_current_el(env);
+    bool secure = arm_is_secure(env);
+
+    if (arm_feature(&cpu->env, ARM_FEATURE_EL2) && !secure && cur_el == 1) {
+        return env->cp15.vpidr_el2;
+    }
+    return raw_read(env, ri);
+}
+
 static uint64_t mpidr_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
     ARMCPU *cpu = ARM_CPU(arm_env_get_cpu(env));
@@ -4121,6 +4133,19 @@ void register_cp_regs_for_features(ARMCPU *cpu)
         define_arm_cp_regs(cpu, v8_cp_reginfo);
     }
     if (arm_feature(env, ARM_FEATURE_EL2)) {
+        ARMCPRegInfo vpidr_regs[] = {
+            { .name = "VPIDR", .state = ARM_CP_STATE_AA32,
+              .cp = 15, .opc1 = 4, .crn = 0, .crm = 0, .opc2 = 0,
+              .access = PL2_RW, .accessfn = access_el3_aa32ns,
+              .resetvalue = cpu->midr,
+              .fieldoffset = offsetof(CPUARMState, cp15.vpidr_el2) },
+            { .name = "VPIDR_EL2", .state = ARM_CP_STATE_AA64,
+              .opc0 = 3, .opc1 = 4, .crn = 0, .crm = 0, .opc2 = 0,
+              .access = PL2_RW, .resetvalue = cpu->midr,
+              .fieldoffset = offsetof(CPUARMState, cp15.vpidr_el2) },
+            REGINFO_SENTINEL
+        };
+        define_arm_cp_regs(cpu, vpidr_regs);
         define_arm_cp_regs(cpu, el2_cp_reginfo);
         /* RVBAR_EL2 is only implemented if EL2 is the highest EL */
         if (!arm_feature(env, ARM_FEATURE_EL3)) {
@@ -4136,6 +4161,18 @@ void register_cp_regs_for_features(ARMCPU *cpu)
          * register the no_el2 reginfos.
          */
         if (arm_feature(env, ARM_FEATURE_EL3)) {
+            /* When EL3 exists but not EL2, VPIDR takes the value
+             * of MIDR_EL1.
+             */
+            ARMCPRegInfo vpidr_regs[] = {
+                { .name = "VPIDR_EL2", .state = ARM_CP_STATE_BOTH,
+                  .opc0 = 3, .opc1 = 4, .crn = 0, .crm = 0, .opc2 = 0,
+                  .access = PL2_RW, .accessfn = access_el3_aa32ns_aa64any,
+                  .type = ARM_CP_CONST, .resetvalue = cpu->midr,
+                  .fieldoffset = offsetof(CPUARMState, cp15.vpidr_el2) },
+                REGINFO_SENTINEL
+            };
+            define_arm_cp_regs(cpu, vpidr_regs);
             define_arm_cp_regs(cpu, el3_no_el2_cp_reginfo);
         }
     }
@@ -4213,6 +4250,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .cp = 15, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = CP_ANY,
               .access = PL1_R, .resetvalue = cpu->midr,
               .writefn = arm_cp_write_ignore, .raw_writefn = raw_write,
+              .readfn = midr_read,
               .fieldoffset = offsetof(CPUARMState, cp15.c0_cpuid),
               .type = ARM_CP_OVERRIDE },
             /* crn = 0 op1 = 0 crm = 3..7 : currently unassigned; we RAZ. */
@@ -4236,7 +4274,9 @@ void register_cp_regs_for_features(ARMCPU *cpu)
         ARMCPRegInfo id_v8_midr_cp_reginfo[] = {
             { .name = "MIDR_EL1", .state = ARM_CP_STATE_BOTH,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 0, .opc2 = 0,
-              .access = PL1_R, .type = ARM_CP_CONST, .resetvalue = cpu->midr },
+              .access = PL1_R, .type = ARM_CP_NO_RAW, .resetvalue = cpu->midr,
+              .fieldoffset = offsetof(CPUARMState, cp15.c0_cpuid),
+              .readfn = midr_read },
             /* crn = 0 op1 = 0 crm = 0 op2 = 4,7 : AArch32 aliases of MIDR */
             { .name = "MIDR", .type = ARM_CP_ALIAS | ARM_CP_CONST,
               .cp = 15, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = 4,
-- 
1.9.1
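
The read hook in this patch returns the hypervisor-programmed ID value only
for non-secure EL1 reads on a CPU that has EL2; every other reader sees the
physical value. A minimal sketch of that selection, assuming a flattened
state struct (struct id_view and midr_read_sketch() are hypothetical, and the
register values in any test are placeholders):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, flattened view of the state that midr_read() consults. */
struct id_view {
    bool has_el2;     /* stands in for arm_feature(env, ARM_FEATURE_EL2) */
    bool secure;      /* stands in for arm_is_secure(env) */
    int cur_el;       /* stands in for arm_current_el(env) */
    uint64_t midr;    /* physical MIDR value */
    uint64_t vpidr;   /* VPIDR_EL2, as programmed by the hypervisor */
};

/* Non-secure EL1 reads see VPIDR_EL2 when EL2 exists; others see MIDR. */
static uint64_t midr_read_sketch(const struct id_view *v)
{
    assert(v->cur_el >= 0 && v->cur_el <= 3);
    if (v->has_el2 && !v->secure && v->cur_el == 1) {
        return v->vpidr;
    }
    return v->midr;
}
```

Because VPIDR_EL2 resets to the MIDR value in the patch, a guest sees no
difference until a hypervisor deliberately writes a different ID.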

[Qemu-devel] [PATCH v2 7/8] target-arm: Break out mpidr_read_val()

2015-09-13 Thread Edgar E. Iglesias

From: "Edgar E. Iglesias" 

Break out mpidr_read_val() to allow future sharing of the
code that conditionally sets the M and U bits of MPIDR.

No functional changes.

Signed-off-by: Edgar E. Iglesias 
---
 target-arm/helper.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index f151646..327d2f3 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2457,7 +2457,7 @@ static uint64_t midr_read(CPUARMState *env, const ARMCPRegInfo *ri)
     return raw_read(env, ri);
 }
 
-static uint64_t mpidr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+static uint64_t mpidr_read_val(CPUARMState *env)
 {
     ARMCPU *cpu = ARM_CPU(arm_env_get_cpu(env));
     uint64_t mpidr = cpu->mp_affinity;
@@ -2475,6 +2475,11 @@ static uint64_t mpidr_read(CPUARMState *env, const ARMCPRegInfo *ri)
     return mpidr;
 }
 
+static uint64_t mpidr_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    return mpidr_read_val(env);
+}
+
 static const ARMCPRegInfo mpidr_cp_reginfo[] = {
     { .name = "MPIDR", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 0, .opc2 = 5,
1.9.1

[Qemu-devel] [PATCH v2 8/8] target-arm: Add VMPIDR_EL2

2015-09-13 Thread Edgar E. Iglesias

From: "Edgar E. Iglesias" 

Signed-off-by: Edgar E. Iglesias 
---
 target-arm/cpu.h|  1 +
 target-arm/helper.c | 26 --
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index bef898f..95886ff 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -386,6 +386,7 @@ typedef struct CPUARMState {
         uint64_t c15_ccnt;
         uint64_t pmccfiltr_el0; /* Performance Monitor Filter Register */
         uint64_t vpidr_el2; /* Virtualization Processor ID Register */
+        uint64_t vmpidr_el2; /* Virtualization Multiprocessor ID Register */
     } cp15;
 
     struct {
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 327d2f3..93eda73 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2477,6 +2477,12 @@ static uint64_t mpidr_read_val(CPUARMState *env)
 
 static uint64_t mpidr_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
+    unsigned int cur_el = arm_current_el(env);
+    bool secure = arm_is_secure(env);
+
+    if (arm_feature(env, ARM_FEATURE_EL2) && !secure && cur_el == 1) {
+        return env->cp15.vmpidr_el2;
+    }
     return mpidr_read_val(env);
 }
 
@@ -4138,6 +4144,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
         define_arm_cp_regs(cpu, v8_cp_reginfo);
     }
     if (arm_feature(env, ARM_FEATURE_EL2)) {
+        uint64_t vmpidr_def = mpidr_read_val(env);
         ARMCPRegInfo vpidr_regs[] = {
             { .name = "VPIDR", .state = ARM_CP_STATE_AA32,
               .cp = 15, .opc1 = 4, .crn = 0, .crm = 0, .opc2 = 0,
@@ -4148,6 +4155,16 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .opc0 = 3, .opc1 = 4, .crn = 0, .crm = 0, .opc2 = 0,
               .access = PL2_RW, .resetvalue = cpu->midr,
               .fieldoffset = offsetof(CPUARMState, cp15.vpidr_el2) },
+            { .name = "VMPIDR", .state = ARM_CP_STATE_AA32,
+              .cp = 15, .opc1 = 4, .crn = 0, .crm = 0, .opc2 = 5,
+              .access = PL2_RW, .accessfn = access_el3_aa32ns,
+              .resetvalue = vmpidr_def,
+              .fieldoffset = offsetof(CPUARMState, cp15.vmpidr_el2) },
+            { .name = "VMPIDR_EL2", .state = ARM_CP_STATE_AA64,
+              .opc0 = 3, .opc1 = 4, .crn = 0, .crm = 0, .opc2 = 5,
+              .access = PL2_RW,
+              .resetvalue = vmpidr_def,
+              .fieldoffset = offsetof(CPUARMState, cp15.vmpidr_el2) },
             REGINFO_SENTINEL
         };
         define_arm_cp_regs(cpu, vpidr_regs);
@@ -4166,8 +4183,8 @@ void register_cp_regs_for_features(ARMCPU *cpu)
          * register the no_el2 reginfos.
          */
         if (arm_feature(env, ARM_FEATURE_EL3)) {
-            /* When EL3 exists but not EL2, VPIDR takes the value
-             * of MIDR_EL1.
+            /* When EL3 exists but not EL2, VPIDR and VMPIDR take the value
+             * of MIDR_EL1 and MPIDR_EL1.
              */
             ARMCPRegInfo vpidr_regs[] = {
                 { .name = "VPIDR_EL2", .state = ARM_CP_STATE_BOTH,
@@ -4175,6 +4192,11 @@ void register_cp_regs_for_features(ARMCPU *cpu)
                   .access = PL2_RW, .accessfn = access_el3_aa32ns_aa64any,
                   .type = ARM_CP_CONST, .resetvalue = cpu->midr,
                   .fieldoffset = offsetof(CPUARMState, cp15.vpidr_el2) },
+                { .name = "VMPIDR_EL2", .state = ARM_CP_STATE_BOTH,
+                  .opc0 = 3, .opc1 = 4, .crn = 0, .crm = 0, .opc2 = 5,
+                  .access = PL2_RW, .accessfn = access_el3_aa32ns_aa64any,
+                  .type = ARM_CP_NO_RAW,
+                  .writefn = arm_cp_write_ignore, .readfn = mpidr_read },
                 REGINFO_SENTINEL
             };
             define_arm_cp_regs(cpu, vpidr_regs);
-- 
1.9.1

Re: [Qemu-devel] rfc: vhost user enhancements for vm2vm communication

2015-09-13 Thread Michael S. Tsirkin

On Fri, Sep 11, 2015 at 05:39:07PM +0200, Claudio Fontana wrote:
> On 09.09.2015 09:06, Michael S. Tsirkin wrote:
> > On Mon, Sep 07, 2015 at 02:38:34PM +0200, Claudio Fontana wrote:
> >> Coming late to the party, 
> >>
> >> On 31.08.2015 16:11, Michael S. Tsirkin wrote:
> >>> Hello!
> >>> During the KVM forum, we discussed supporting virtio on top
> >>> of ivshmem. I have considered it, and came up with an alternative
> >>> that has several advantages over that - please see below.
> >>> Comments welcome.
> >>
> >> as Jan mentioned we actually discussed a virtio-shmem device which would 
> >> incorporate the advantages of ivshmem (so no need for a separate ivshmem 
> >> device), which would use the well known virtio interface, taking advantage 
> >> of the new virtio-1 virtqueue layout to split r/w and read-only rings as 
> >> seen from the two sides, and make use also of BAR0 which has been freed up 
> >> for use by the device.
> >>
> >> This way it would be possible to share the rings and the actual memory for 
> >> the buffers in the PCI bars. The guest VMs could decide to use the shared 
> >> memory regions directly as prepared by the hypervisor (in the jailhouse 
> >> case) or QEMU/KVM, or perform their own validation on the input depending 
> >> on the use case.
> >>
> >> Of course the communication between VMs needs in this case to be 
> >> pre-configured and is quite static (which is actually beneficial in our 
> >> use case).
> >>
> >> But still in your proposed solution, each VM needs to be pre-configured to 
> >> communicate with a specific other VM using a separate device right?
> >>
> >> But I wonder if we are addressing the same problem.. in your case you are 
> >> looking at having a shared memory pool for all VMs potentially visible to 
> >> all VMs (the vhost-user case), while in the virtio-shmem proposal we 
> >> discussed we were assuming specific different regions for every channel.
> >>
> >> Ciao,
> >>
> >> Claudio
> > 
> > The problem, as I see it, is to allow inter-vm communication with
> > polling (to get very low latencies) but polling within VMs only, without
> > need to run a host thread (which when polling uses up a host CPU).
> > 
> > What was proposed was to simply change virtio to allow
> > "offset within BAR" instead of PA.
> 
> There are many consequences to this, offset within BAR alone is not enough, 
> there are multiple things at the virtio level that need sorting out.
> Also we need to consider virtio-mmio etc.
> 
> > This would allow VM2VM communication if there are only 2 VMs,
> > but if data needs to be sent to multiple VMs, you
> > must copy it.
> 
> Not necessarily, however getting it to work (sharing the backend window and 
> arbitrating the multicast) is really hard.
> 
> > 
> > Additionally, it's a single-purpose feature: you can use it from
> > a userspace PMD but linux will never use it.
> > 
> > 
> > My proposal is a superset: don't require that BAR memory is
> > used, use IOMMU translation tables.
> > This way, data can be sent to multiple VMs by sharing the same
> > memory with them all.
> 
> Can you describe in detail how your proposal deals with the arbitration 
> necessary for multicast handling?

Basically it falls out naturally. Consider linux guest as an example,
and assume dynamic mappings for simplicity.

Multicast is done by a bridge on the guest side. That code clones the
skb (reference-counting page fragments) and passes it to multiple ports.
Each of these will program the IOMMU to allow read access to the
fragments to the relevant device.
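
As a rough illustration of the paragraph above, here is a toy model of
"clone by reference, map read-only per port". It is a sketch under stated
assumptions only: struct frag, iommu_map_readonly(), multicast() and
port_done() are hypothetical names, a boolean per port stands in for a real
IOMMU mapping, and nothing here is the Linux skb or IOMMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PORTS 4

/* A page fragment shared by every clone of a packet. */
struct frag {
    int refcount;
    bool readable_by[MAX_PORTS];  /* stands in for per-device IOMMU mappings */
};

/* Stand-in for programming one device's IOMMU with a read-only mapping. */
static void iommu_map_readonly(struct frag *f, size_t port)
{
    assert(port < MAX_PORTS);
    f->readable_by[port] = true;
}

/* Send one fragment to the first nports ports: one reference each, no copy. */
static void multicast(struct frag *f, size_t nports)
{
    assert(nports <= MAX_PORTS);
    for (size_t port = 0; port < nports; port++) {
        f->refcount++;
        iommu_map_readonly(f, port);
    }
}

/* A port is done with the fragment: unmap it and drop that port's reference. */
static void port_done(struct frag *f, size_t port)
{
    assert(port < MAX_PORTS && f->readable_by[port]);
    f->readable_by[port] = false;
    f->refcount--;
}
```

In this model the payload is never copied; what multiplies is the number of
references and read-only mappings, which is the property the proposal relies
on for multicast.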



> > 
> > It is still possible to put data in some device BAR if that's
> > what the guest wants to do: just program the IOMMU to limit
> > virtio to the memory range that is within this BAR.
> > 
> > Another advantage here is that the feature is more generally useful.
> > 
> > 
> >>>
> >>> -
> >>>
> >>> Existing solutions to userspace switching between VMs on the
> >>> same host are vhost-user and ivshmem.
> >>>
> >>> vhost-user works by mapping memory of all VMs being bridged into the
> >>> switch memory space.
> >>>
> >>> By comparison, ivshmem works by exposing a shared region of memory to all 
> >>> VMs.
> >>> VMs are required to use this region to store packets. The switch only
> >>> needs access to this region.
> >>>
> >>> Another difference between vhost-user and ivshmem surfaces when polling
> >>> is used. With vhost-user, the switch is required to handle
> >>> data movement between VMs, if using polling, this means that 1 host CPU
> >>> needs to be sacrificed for this task.
> >>>
> >>> This is easiest to understand when one of the VMs is
> >>> used with VF pass-through. This can be schematically shown below:
> >>>
> >>> +-- VM1 --------------+            +---VM2-----------+
> >>> | virtio-pci          +-vhost-user-+ virtio-pci -- VF | -- VFIO -- IOMMU -- NIC
> >>> +---------------------+            +-----------------+
> >>>
> >>> With ivshmem in theory communication can happen dire

Re: [Qemu-devel] [PATCH] q35: Remove old machine versions

2015-09-13 Thread Michael S. Tsirkin

On Fri, Sep 11, 2015 at 03:44:47PM -0300, Eduardo Habkost wrote:
> Ping?
> 
> So, what's the reason we are still keeping those old machines in the
> code?

Victor also wanted to clean out some very old machine types for
the PIIX.

But if someone created a machine with libvirt, these machine types
are now written in the XML. Failing to start guests isn't nice.

Maybe we could drop most of the compat code, but keep the
old machine types around with most visible changes (no_floppy? anything
else?). As we can't live migrate these older machine types,
minor guest visible changes aren't a big deal if they don't
break guest boot.

> 
> On Tue, Aug 18, 2015 at 04:11:42PM -0700, Eduardo Habkost wrote:
> > Migration with q35 was not possible before commit
> > 04329029a8c539eb5f75dcb6d8b016f0c53a031a, because q35 unconditionally
> > creates an ich9-ahci device, that was marked as unmigratable. So all q35
> > machines before pc-q35-2.4 were unmigratable, and there's no point in
> > keeping compatibility code for them.
> > 
> > Remove all old pc-q35 machine classes and keep only pc-q35-2.4.
> > 
> > Signed-off-by: Eduardo Habkost 
> > ---
> >  hw/i386/pc_q35.c | 153 
> > ---
> >  1 file changed, 153 deletions(-)
> > 
> > diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> > index 4ee653e..e482f2f 100644
> > --- a/hw/i386/pc_q35.c
> > +++ b/hw/i386/pc_q35.c
> > @@ -272,60 +272,6 @@ static void pc_q35_init(MachineState *machine)
> >      }
> >  }
> >  
> > -static void pc_compat_2_3(MachineState *machine)
> > -{
> > -    PCMachineState *pcms = PC_MACHINE(machine);
> > -    savevm_skip_section_footers();
> > -    if (kvm_enabled()) {
> > -        pcms->smm = ON_OFF_AUTO_OFF;
> > -    }
> > -    global_state_set_optional();
> > -    savevm_skip_configuration();
> > -}
> > -
> > -static void pc_compat_2_2(MachineState *machine)
> > -{
> > -    pc_compat_2_3(machine);
> > -    machine->suppress_vmdesc = true;
> > -}
> > -
> > -static void pc_compat_2_1(MachineState *machine)
> > -{
> > -    PCMachineState *pcms = PC_MACHINE(machine);
> > -
> > -    pc_compat_2_2(machine);
> > -    pcms->enforce_aligned_dimm = false;
> > -    x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
> > -}
> > -
> > -static void pc_compat_2_0(MachineState *machine)
> > -{
> > -    pc_compat_2_1(machine);
> > -}
> > -
> > -static void pc_compat_1_7(MachineState *machine)
> > -{
> > -    pc_compat_2_0(machine);
> > -    option_rom_has_mr = true;
> > -    x86_cpu_compat_kvm_no_autoenable(FEAT_1_ECX, CPUID_EXT_X2APIC);
> > -}
> > -
> > -static void pc_compat_1_6(MachineState *machine)
> > -{
> > -    pc_compat_1_7(machine);
> > -    rom_file_has_mr = false;
> > -}
> > -
> > -static void pc_compat_1_5(MachineState *machine)
> > -{
> > -    pc_compat_1_6(machine);
> > -}
> > -
> > -static void pc_compat_1_4(MachineState *machine)
> > -{
> > -    pc_compat_1_5(machine);
> > -}
> > -
> >  #define DEFINE_Q35_MACHINE(suffix, name, compatfn, optionfn) \
> >      static void pc_init_##suffix(MachineState *machine) \
> >      { \
> > @@ -358,102 +304,3 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
> >  
> >  DEFINE_Q35_MACHINE(v2_4, "pc-q35-2.4", NULL,
> >                     pc_q35_2_4_machine_options);
> > -
> > -
> > -static void pc_q35_2_3_machine_options(MachineClass *m)
> > -{
> > -    pc_q35_2_4_machine_options(m);
> > -    m->no_floppy = 0;
> > -    m->no_tco = 1;
> > -    m->alias = NULL;
> > -    SET_MACHINE_COMPAT(m, PC_COMPAT_2_3);
> > -}
> > -
> > -DEFINE_Q35_MACHINE(v2_3, "pc-q35-2.3", pc_compat_2_3,
> > -                   pc_q35_2_3_machine_options);
> > -
> > -
> > -static void pc_q35_2_2_machine_options(MachineClass *m)
> > -{
> > -    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
> > -    pc_q35_2_3_machine_options(m);
> > -    SET_MACHINE_COMPAT(m, PC_COMPAT_2_2);
> > -    pcmc->rsdp_in_ram = false;
> > -}
> > -
> > -DEFINE_Q35_MACHINE(v2_2, "pc-q35-2.2", pc_compat_2_2,
> > -                   pc_q35_2_2_machine_options);
> > -
> > -
> > -static void pc_q35_2_1_machine_options(MachineClass *m)
> > -{
> > -    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
> > -    pc_q35_2_2_machine_options(m);
> > -    m->default_display = NULL;
> > -    SET_MACHINE_COMPAT(m, PC_COMPAT_2_1);
> > -    pcmc->smbios_uuid_encoded = false;
> > -}
> > -
> > -DEFINE_Q35_MACHINE(v2_1, "pc-q35-2.1", pc_compat_2_1,
> > -                   pc_q35_2_1_machine_options);
> > -
> > -
> > -static void pc_q35_2_0_machine_options(MachineClass *m)
> > -{
> > -    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
> > -    pc_q35_2_1_machine_options(m);
> > -    SET_MACHINE_COMPAT(m, PC_COMPAT_2_0);
> > -    pcmc->has_reserved_memory = false;
> > -    pcmc->smbios_legacy_mode = true;
> > -    pcmc->acpi_data_size = 0x1;
> > -}
> > -
> > -DEFINE_Q35_MACHINE(v2_0, "pc-q35-2.0", pc_compat_2_0,
> > -                   pc_q35_2_0_machine_options);
> > -
> > -
> > -static void pc_q35_1_7_machin

Re: [Qemu-devel] [RFC PATCH 1/3] pc: fw_cfg: move ioport base constant to pc.h

2015-09-13 Thread Marc Marí

On Sat, 12 Sep 2015 19:30:40 -0400
"Gabriel L. Somlo"  wrote:

> Move BIOS_CFG_IOPORT define from pc.c to pc.h, and rename
> it to FW_CFG_IO_BASE. Also, add FW_CFG_IO_SIZE define (set
> to 0x02, to cover the overlapping 16-bit control and 8-bit
> data ports).
> 
> Signed-off-by: Gabriel Somlo 
> ---
>  hw/i386/pc.c | 5 ++---
>  include/hw/i386/pc.h | 3 +++
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index b5107f7..1a92b4f 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -86,7 +86,6 @@ void pc_set_legacy_acpi_data_size(void)
>  acpi_data_size = 0x1;
>  }
>  
> -#define BIOS_CFG_IOPORT 0x510
>  #define FW_CFG_ACPI_TABLES (FW_CFG_ARCH_LOCAL + 0)
>  #define FW_CFG_SMBIOS_ENTRIES (FW_CFG_ARCH_LOCAL + 1)
>  #define FW_CFG_IRQ0_OVERRIDE (FW_CFG_ARCH_LOCAL + 2)
> @@ -760,7 +759,7 @@ static FWCfgState *bochs_bios_init(void)
>  int i, j;
>  unsigned int apic_id_limit = pc_apic_id_limit(max_cpus);
>  
> -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT);
> +fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE);
>  /* FW_CFG_MAX_CPUS is a bit confusing/problematic on x86:
>   *
>   * SeaBIOS needs FW_CFG_MAX_CPUS for CPU hotplug, but the CPU hotplug
> @@ -1292,7 +1291,7 @@ FWCfgState *xen_load_linux(PCMachineState *pcms,
>  assert(MACHINE(pcms)->kernel_filename != NULL);
>  
> -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT);
> +fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE);
>  rom_set_fw(fw_cfg);
>  
>  load_linux(pcms, fw_cfg);
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 3e002c9..0cab3c5 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -206,6 +206,9 @@ typedef void (*cpu_set_smm_t)(int smm, void *arg);
>  
>  void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name);
>  
> +#define FW_CFG_IO_BASE 0x510
> +#define FW_CFG_IO_SIZE  0x02
> +
>  /* acpi_piix.c */
>  
>  I2CBus *piix4_pm_init(PCIBus *bus, int devfn, uint32_t smb_io_base,

There is already a size defined in hw/nvram/fw_cfg.c (FW_CFG_SIZE). You
could move this definition to the .h and reuse it for ACPI. This way,
it is easier to modify.

Note that this value is used both for the size of the IO port and the
size of the CTL field when using memory regions. You can split it now in
your patches, or it will be split in my patches.

I'm not going to comment on the other patches, because I don't know
ACPI.

Thanks
Marc
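As a rough illustration of the two-byte window discussed above, the following sketch checks whether a port access falls inside it. FW_CFG_IO_BASE and FW_CFG_IO_SIZE are taken from the quoted patch; the helper fw_cfg_io_covers() is hypothetical and exists only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FW_CFG_IO_BASE 0x510   /* value from the quoted patch */
#define FW_CFG_IO_SIZE 0x02    /* value from the quoted patch */

static_assert(FW_CFG_IO_SIZE == 2, "window covers the 16-bit control port");

/*
 * Hypothetical helper (not part of QEMU): returns true if an access of
 * `len` bytes starting at `port` lies entirely inside the fw_cfg window.
 */
static bool fw_cfg_io_covers(uint16_t port, unsigned len)
{
    return len > 0 &&
           port >= FW_CFG_IO_BASE &&
           (uint32_t)port + len <= FW_CFG_IO_BASE + FW_CFG_IO_SIZE;
}
```

A 16-bit access at the base and an 8-bit access at base + 1 both fit; anything that starts before the base or runs past base + 2 does not.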

Re: [Qemu-devel] Python problem

2015-09-13 Thread Peter Maydell

On 13 September 2015 at 03:19, Programmingkid  wrote:
> Excellent. This fixed the problem. Thank you very much. The minimum
> version of python QEMU supports is 2.6?

At the moment it should be 2.4, apart from this bug. However we're about
to raise it to 2.6 (and there's a patch on the list that updates the
version check in configure). We were only retaining Python 2.4 support for
the benefit of RHEL5, and we stopped supporting RHEL5 in QEMU 2.5
when we raised our minimum glib version requirement.

thanks
-- PMM

Re: [Qemu-devel] [RFC PATCH 2/3] acpi: pc: add fw_cfg device node to ssdt

2015-09-13 Thread Michael S. Tsirkin

On Sat, Sep 12, 2015 at 07:30:41PM -0400, Gabriel L. Somlo wrote:
> Add a fw_cfg device node to the ACPI SSDT. While the guest-side
> BIOS can't utilize this information (since it has to access the
> hard-coded fw_cfg device to extract ACPI tables to begin with),
> having fw_cfg listed in ACPI will help the guest kernel keep a
> more accurate inventory of in-use IO port regions.
> 
> Signed-off-by: Gabriel Somlo 
> ---
>  hw/i386/acpi-build.c | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 95e0c65..9d0ec22 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1071,6 +1071,25 @@ build_ssdt(GArray *table_data, GArray *linker,
>  aml_append(scope, aml_name_decl("_S5", pkg));
>  aml_append(ssdt, scope);
>  
> +if (guest_info->fw_cfg) {
> +scope = aml_scope("\\_SB");
> +dev = aml_device("FWCF");
> +
> +aml_append(dev, aml_name_decl("_HID", aml_string("FWCF0001")));

Generally that's an illegal HID. If this device has a driver,
use QEMU as a prefix. Otherwise, use one of the pre-defined ones
with a PNP ISA ID.

> +/* device present, functioning, decoding, not shown in UI */
> +aml_append(dev, aml_name_decl("_STA", aml_int(0xB)));
> +
> +crs = aml_resource_template();
> +aml_append(crs,
> +aml_io(AML_DECODE16, FW_CFG_IO_BASE, FW_CFG_IO_BASE,
> +   0x01, FW_CFG_IO_SIZE)
> +);
> +aml_append(dev, aml_name_decl("_CRS", crs));
> +
> +aml_append(scope, dev);
> +aml_append(ssdt, scope);
> +}
> +
>  if (misc->applesmc_io_base) {
>  scope = aml_scope("\\_SB.PCI0.ISA");
>  dev = aml_device("SMC");
> -- 
> 2.4.3
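For readers decoding the _STA value 0xB used in the quoted patch, here is a minimal sketch of how the bits combine. The bit positions follow the ACPI specification's _STA definition; the enum names and the sta_value() helper are made up for this example and are not QEMU identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* _STA bit positions per the ACPI specification; names are illustrative. */
enum {
    STA_PRESENT     = 1u << 0,  /* device is present */
    STA_DECODING    = 1u << 1,  /* device is enabled and decoding resources */
    STA_SHOW_IN_UI  = 1u << 2,  /* device should be shown in the UI */
    STA_FUNCTIONING = 1u << 3   /* device is functioning properly */
};

/* Present, decoding, functioning, hidden from the UI: the patch's 0xB. */
static_assert((STA_PRESENT | STA_DECODING | STA_FUNCTIONING) == 0xB,
              "matches the value passed to aml_int() in the patch");

/* Hypothetical helper: build a _STA value from individual properties. */
static uint32_t sta_value(bool present, bool decoding,
                          bool show_in_ui, bool functioning)
{
    return (present ? STA_PRESENT : 0) |
           (decoding ? STA_DECODING : 0) |
           (show_in_ui ? STA_SHOW_IN_UI : 0) |
           (functioning ? STA_FUNCTIONING : 0);
}
```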

[Qemu-devel] Windows does not support DataTableRegion at all [was: docs: describe QEMU's VMGenID design]

2015-09-13 Thread Laszlo Ersek

As the subject suggests, I have terrible news.

I'll preserve the full context here, so that it's easy to scroll back to
the ASL for reference.

I'm also CC'ing edk2-devel, because a number of BIOS developers should
be congregating there.

On 08/28/15 22:18, Laszlo Ersek wrote:
> Cc: Paolo Bonzini 
> Cc: Gal Hammer 
> Cc: Igor Mammedov 
> Cc: "Michael S. Tsirkin" 
> Signed-off-by: Laszlo Ersek 
> ---
>
> Notes:
> This is based on the super long private email discussion we had two
> months ago, plus on the IRL discussion between Michael and myself @ the
> KVM Forum 2015.
>
>  docs/specs/vmgenid.txt | 343 
> +
>  1 file changed, 343 insertions(+)
>  create mode 100644 docs/specs/vmgenid.txt
>
> diff --git a/docs/specs/vmgenid.txt b/docs/specs/vmgenid.txt
> new file mode 100644
> index 000..d4bf132
> --- /dev/null
> +++ b/docs/specs/vmgenid.txt
> @@ -0,0 +1,343 @@
> +Virtual Machine Generation ID Device
> +
> +
> +The Microsoft specification entitled "Virtual Machine Generation ID",
> +maintained at , defines an ACPI
> +feature that allows the guest OSPM to recognize when it has been returned "to
> +an earlier point in time", eg. by restoral from snapshot, or by incoming
> +migration. Quoting the spec,
> +
> +The virtual machine generation ID is a feature whereby the virtual machines
> +BIOS will expose a new ID. This is a 128-bit, cryptographically random
> +integer value identifier that will be different every time the virtual
> +machine executes from a different configuration file-such as executing from
> +a recovered snapshot, or executing after restoring from backup. [...]
> +
> +The document you are reading now extracts the requirements set forth by the
> +VMGenID spec for hypervisors that intend to provide the feature, and describes
> +QEMU's implementation. The design below targets both SeaBIOS and OVMF as
> +compatible guest firmwares, without any changes to either of them.
> +
> +Requirements
> +
> +
> +These requirements are extracted from the "How to implement virtual machine
> +generation ID support in a virtualization platform" section of the
> +specification, dated August 1, 2012.
> +
> +R1a. The generation ID shall live in an 8-byte aligned buffer.
> +
> +R1b. The buffer holding the generation ID shall be in guest RAM, ROM, or device
> + MMIO range.
> +
> +R1c. The buffer holding the generation ID shall be kept separate from areas
> + used by the operating system.
> +
> +R1d. The buffer shall not be covered by an AddressRangeMemory or
> + AddressRangeACPI entry in the E820 or UEFI memory map.
> +
> +R1e. The generation ID shall not live in a page frame that could be mapped with
> + caching disabled. (In other words, if the generation ID lives in RAM, then
> + it shall only be mapped as cacheable.)
> +
> +R2 to R5. [These AML requirements are isolated well enough in the Microsoft
> +  specification for us to simply refer to them here.]
> +
> +R6. The hypervisor shall expose a _HID (hardware identifier) object in the
> +VMGenId device's scope that is unique to the hypervisor vendor.
> +
> +Generation ID buffer design
> +---
> +
> +QEMU places the generation ID buffer inside a separate fw_cfg blob that is
> +exposed to the guest OS with the ACPI linker/loader.
> +
> +The structure of the blob is as follows. Offsets, sizes and numeric values are
> +given in decimal; furthermore the latter are encoded in little endian.
> +
> +  Offs  Field                  Size  Value
> +  ----  ---------------------  ----  ------------------------------------
> +     0  System Description       36
> +        Table Header
> +     0    Signature               4  "UEFI"
> +     4    Length                  4  62
> +     8    Revision                1  1
> +     9    Checksum                1  0
> +    10    OEMID                   6  ACPI_BUILD_APPNAME6 ("BOCHS ")
> +    16    OEM Table ID            8  "QEMUPARM"
> +    24    OEM Revision            4  1
> +    28    Creator ID              4  ACPI_BUILD_APPNAME4 ("BXPC")
> +    32    Creator Revision        4  1
> +
> +    36  UEFI Table               18
> +        Sub-Header
> +    36    Identifier             16  417a5dff-bf4b-4abc-a839-6593bb41f452
> +    52    DataOffset              2  54
> +
> +    54  ADDR base pointer         8  62
> +
> +    62  OVMF SDT Header          36  zeroes
> +        probe suppressor
> +    98  VMGenID alignment         6

Re: [Qemu-devel] Windows does not support DataTableRegion at all [was: docs: describe QEMU's VMGenID design]

2015-09-13 Thread Michael S. Tsirkin

On Sun, Sep 13, 2015 at 01:56:44PM +0200, Laszlo Ersek wrote:
> As the subject suggests, I have terrible news.
> 
> I'll preserve the full context here, so that it's easy to scroll back to
> the ASL for reference.
> 
> I'm also CC'ing edk2-devel, because a number of BIOS developers should
> be congregating there.

Wow, bravo! It does look like we need to go back to
the drawing board.
The only crazy thing you didn't try is to use
an XSDT instead of the DSDT.
I find it unlikely that this will help ...

-- 
MST

[Qemu-devel] [PATCH FYI 00/13] ACPI stuff for the DataTableRegion()-based VMGenID

2015-09-13 Thread Laszlo Ersek

So, as I wrote in the parent, this does not actually work in Windows,
because Windows doesn't support the DataTableRegion() operator; not even
modern Windows versions.

I'm nonetheless posting the series for the following purposes:

- Posterity. I think the series is worth preserving in the mailing list
  archive.

- Testing by others, if anyone is so inclined. (Should someone come back
  here later: the series applies to commit
  fc04a730b7e60f4a62d6260d4eb9c537d1d3643f.)

- I think that several patches from the series would be worth merging in
  their own right.

Anatomy of the FYI series:

- Patch 01 is known from the RFC posting; it has seen a number of
  changes. Those are all noted on the patch itself.

- Patch 02 is not related to ACPI, but it was the first one I wrote,
  while trying to attack this series, so I'm including it here anyway.

- Patches 03 to 08 add generic code for, and then introduce, the UEFI
  ACPI Data Table, described in patch 01.

- Patches 09 to 13 add helper functions for, and then generate, the VMGI
  device's AML.

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Cc: Shannon Zhao 

Thanks
Laszlo

Laszlo Ersek (13):
  docs: describe QEMU's VMGenID design
  hw/acpi: add i386 callbacks for injecting GPE 04 when the VMGENID
changes
  hw/acpi: rename "AcpiBuildTables.table_data" to "main_blob"
  hw/acpi: allow RSDT entries to be relocated to various fw_cfg blobs
  hw/acpi: add more flexible acpi_add_table() and build_header()
variants
  hw/acpi: introduce ACPI_BUILD_QEMUPARAM_FILE
  hw/acpi: introduce the AcpiQemuParamTable structure
  hw/i386: build UEFI ACPI Data Table for VMGENID in the dedicated blob
(WIP)
  hw/acpi: expose more parameters for aml_method()
  hw/acpi: add AML generator function for DataTableRegion()
  hw/acpi: add AML generator function for AccessAs()
  hw/acpi: add AML generator function for CreateQWordField()
  hw/i386: generate AML for the VMGENID device (WIP)

 include/hw/acpi/acpi.h   |   1 +
 include/hw/acpi/acpi_dev_interface.h |   4 +
 include/hw/acpi/aml-build.h  |  23 ++-
 include/hw/acpi/ich9.h   |   1 +
 include/hw/acpi/vmgenid.h|  72 
 hw/acpi/aml-build.c  | 135 --
 hw/acpi/ich9.c   |   8 +
 hw/acpi/piix4.c  |   8 +
 hw/arm/virt-acpi-build.c |  11 +-
 hw/i386/acpi-build.c | 159 -
 hw/isa/lpc_ich9.c|   1 +
 docs/vmgenid.txt | 336 +++
 12 files changed, 738 insertions(+), 21 deletions(-)
 create mode 100644 include/hw/acpi/vmgenid.h
 create mode 100644 docs/vmgenid.txt

-- 
1.8.3.1

[Qemu-devel] [PATCH FYI 01/13] docs: describe QEMU's VMGenID design

2015-09-13 Thread Laszlo Ersek

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Laszlo Ersek 
Acked-by: Michael S. Tsirkin 
---

Notes:
fyi:
- move from docs/specs/ to docs/ [Eric, Paolo]
- fix grammar [Eric]
- clarify that requirement R1e covers ROM and MMIO too [Michael]
- replace '"BOCHS"' with '"BOCHS "' in the DataTableRegion operator, so
  that the OEM ID argument matches ACPI_BUILD_APPNAME6 exactly
- remove the _CRS with the IO descriptor in it, because Windows' VMGENID
  driver chokes on that (but is okay with the absence of the _CRS). See
  
  for more.

rfc:
- This is based on the super long private email discussion we had two
  months ago, plus on the IRL discussion between Michael and myself @
  the KVM Forum 2015.

 docs/vmgenid.txt | 336 +++
 1 file changed, 336 insertions(+)
 create mode 100644 docs/vmgenid.txt

diff --git a/docs/vmgenid.txt b/docs/vmgenid.txt
new file mode 100644
index 000..4a9c1d0
--- /dev/null
+++ b/docs/vmgenid.txt
@@ -0,0 +1,336 @@
+Virtual Machine Generation ID Device
+
+
+The Microsoft specification entitled "Virtual Machine Generation ID",
+maintained at , defines an ACPI
+feature that allows the guest OSPM to recognize when it has been returned "to
+an earlier point in time", e.g. by restoring from snapshot, or by incoming
+migration. Quoting the spec,
+
+The virtual machine generation ID is a feature whereby the virtual machines
+BIOS will expose a new ID. This is a 128-bit, cryptographically random
+integer value identifier that will be different every time the virtual
+machine executes from a different configuration file-such as executing from
+a recovered snapshot, or executing after restoring from backup. [...]
+
+The document you are reading now extracts the requirements set forth by the
+VMGenID spec for hypervisors that intend to provide the feature, and describes
+QEMU's implementation. The design below targets both SeaBIOS and OVMF as
+compatible guest firmwares, without any changes to either of them.
+
+Requirements
+
+
+These requirements are extracted from the "How to implement virtual machine
+generation ID support in a virtualization platform" section of the
+specification, dated August 1, 2012.
+
+R1a. The generation ID shall live in an 8-byte aligned buffer.
+
+R1b. The buffer holding the generation ID shall be in guest RAM, ROM, or device
+ MMIO range.
+
+R1c. The buffer holding the generation ID shall be kept separate from areas
+ used by the operating system.
+
+R1d. The buffer shall not be covered by an AddressRangeMemory or
+ AddressRangeACPI entry in the E820 or UEFI memory map.
+
+R1e. The generation ID shall not live in a page frame that could be mapped with
+ caching disabled. (In other words, regardless of whether the generation ID
+ lives in RAM, ROM or MMIO, it shall only be mapped as cacheable.)
+
+R2 to R5. [These AML requirements are isolated well enough in the Microsoft
+  specification for us to simply refer to them here.]
+
+R6. The hypervisor shall expose a _HID (hardware identifier) object in the
+VMGenId device's scope that is unique to the hypervisor vendor.
+
+Generation ID buffer design
+---
+
+QEMU places the generation ID buffer inside a separate fw_cfg blob that is
+exposed to the guest OS with the ACPI linker/loader.
+
+The structure of the blob is as follows. Offsets, sizes and numeric values are
+given in decimal; furthermore the latter are encoded in little endian.
+
+  Offs  Field                  Size  Value
+  ----  ---------------------  ----  ------------------------------------
+     0  System Description       36
+        Table Header
+     0    Signature               4  "UEFI"
+     4    Length                  4  62
+     8    Revision                1  1
+     9    Checksum                1  0
+    10    OEMID                   6  ACPI_BUILD_APPNAME6 ("BOCHS ")
+    16    OEM Table ID            8  "QEMUPARM"
+    24    OEM Revision            4  1
+    28    Creator ID              4  ACPI_BUILD_APPNAME4 ("BXPC")
+    32    Creator Revision        4  1
+
+    36  UEFI Table               18
+        Sub-Header
+    36    Identifier             16  417a5dff-bf4b-4abc-a839-6593bb41f452
+    52    DataOffset              2  54
+
+    54  ADDR base pointer         8  62
+
+    62  OVMF SDT Header          36

[Qemu-devel] [PATCH FYI 07/13] hw/acpi: introduce the AcpiQemuParamTable structure

2015-09-13 Thread Laszlo Ersek

This ACPI table is supposed to carry various parameters for OSPM. We
introduce it with a single parameter field, "vmgenid_addr_base_ptr", which
is described as ADBP / "ADDR base pointer" in "docs/vmgenid.txt" (along
with the general structure of the table).

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Laszlo Ersek 
---
 include/hw/acpi/vmgenid.h | 72 +++
 1 file changed, 72 insertions(+)
 create mode 100644 include/hw/acpi/vmgenid.h

diff --git a/include/hw/acpi/vmgenid.h b/include/hw/acpi/vmgenid.h
new file mode 100644
index 000..07e813a
--- /dev/null
+++ b/include/hw/acpi/vmgenid.h
@@ -0,0 +1,72 @@
+/*
+ * ACPI definitions related to the VMGENID device (see "docs/vmgenid.txt").
+ *
+ * Copyright (C) 2015 Red Hat, Inc.
+ *
+ * Authors:
+ *   Laszlo Ersek 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, see .
+ */
+
+#ifndef HW_ACPI_VMGENID_H
+#define HW_ACPI_VMGENID_H
+
+#include "hw/acpi/acpi-defs.h"
+
+#define ACPI_UEFI_IDENT_SIZE 16
+
+struct AcpiQemuParamTable {
+/* ACPI common table header */
+ACPI_TABLE_HEADER_DEF
+
+/*
+ * UEFI ACPI Data Table Sub-Header.
+ *
+ * The "UEFI" signature is reserved for this table header starting with
+ * ACPI 4.0. The header structure is described in the UEFI Specification,
+ * version 2.3 or later, in Appendix O.
+
+ * These fields are harmless for SeaBIOS, but ensure unicity in OVMF
+ * ("UEFI" is a multi-instance table type).
+ */
+uint8_t identifier[ACPI_UEFI_IDENT_SIZE];
+uint16_t data_offset;
+
+/* QEMU-specific fields start here. */
+
+/* Base pointer for the VMGENID device's ADDR control method. */
+uint64_t vmgenid_addr_base_ptr;
+} QEMU_PACKED;
+typedef struct AcpiQemuParamTable AcpiQemuParamTable;
+
+/* Aggregate initializer for "AcpiQemuParamTable.identifier". */
+#define QEMU_PARAM_TABLE_GUID { 0xFF, 0x5D, 0x7A, 0x41, 0x4B, 0xBF, 0xBC, 0x4A, \
+0xA8, 0x39, 0x65, 0x93, 0xBB, 0x41, 0xF4, 0x52 }
+
+/*
+ * This offset points into the fw_cfg blob that contains both
+ * AcpiQemuParamTable and the "live" generation ID after it. The offset points
+ * at the generation ID field, skipping over the "OVMF SDT Header probe
+ * suppressor" and "VMGenID alignment padding" fields in the blob (which are
+ * located right after AcpiQemuParamTable).
+ *
+ * This is an integer constant expression.
+ */
+#define VM_GENERATION_ID_OFFSET \
+ROUND_UP(sizeof(AcpiQemuParamTable) + sizeof(AcpiTableHeader), 8)
+
+#define VM_GENERATION_ID_SIZE 16
+
+#endif /* HW_ACPI_VMGENID_H */
-- 
1.8.3.1
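The two numeric details in the patch above (the byte order of QEMU_PARAM_TABLE_GUID and the value of VM_GENERATION_ID_OFFSET) can be cross-checked with a small standalone sketch. The sizes are those listed in the docs/vmgenid.txt table (36 + 18 + 8 = 62 bytes for the table, followed by a 36-byte probe suppressor). The helper names, the enum constants and the local ROUND_UP stand-in are assumptions made for this example; they are not QEMU's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for a round-up macro; QEMU's own definition may differ. */
#define ROUND_UP(n, d) ((((n) + (d) - 1) / (d)) * (d))

/* Field sizes from the blob layout table in docs/vmgenid.txt. */
enum {
    SDT_HEADER_SIZE     = 36,  /* System Description Table Header */
    UEFI_SUBHEADER_SIZE = 18,  /* Identifier (16) + DataOffset (2) */
    ADDR_BASE_PTR_SIZE  = 8,   /* ADDR base pointer */
    PARAM_TABLE_SIZE    = SDT_HEADER_SIZE + UEFI_SUBHEADER_SIZE +
                          ADDR_BASE_PTR_SIZE
};

static_assert(PARAM_TABLE_SIZE == 62, "matches the Length field");

/* Table, then the 36-byte probe suppressor, rounded up to 8 bytes. */
static size_t vm_generation_id_offset(void)
{
    return ROUND_UP(PARAM_TABLE_SIZE + SDT_HEADER_SIZE, 8);
}

/*
 * Serialize a GUID the way UEFI stores it: the first three fields are
 * little-endian, the trailing eight bytes are copied verbatim.
 */
static void guid_to_bytes(uint32_t d1, uint16_t d2, uint16_t d3,
                          const uint8_t d4[8], uint8_t out[16])
{
    out[0] = (uint8_t)(d1 & 0xff);
    out[1] = (uint8_t)((d1 >> 8) & 0xff);
    out[2] = (uint8_t)((d1 >> 16) & 0xff);
    out[3] = (uint8_t)((d1 >> 24) & 0xff);
    out[4] = (uint8_t)(d2 & 0xff);
    out[5] = (uint8_t)(d2 >> 8);
    out[6] = (uint8_t)(d3 & 0xff);
    out[7] = (uint8_t)(d3 >> 8);
    memcpy(out + 8, d4, 8);
}
```

Under these assumptions the generation ID lands at offset 104, which leaves 6 bytes of alignment padding after offset 98, and 417a5dff-bf4b-4abc-a839-6593bb41f452 serializes to the byte sequence in the macro.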

[Qemu-devel] [PATCH FYI 04/13] hw/acpi: allow RSDT entries to be relocated to various fw_cfg blobs

2015-09-13 Thread Laszlo Ersek

The build_rsdt() function can relocate RSDT entries only to ACPI tables
that exist inside the same ACPI_BUILD_TABLE_FILE blob.

In order to relax this limitation, change the element type of the
"table_offsets" array from plain offset (always into
ACPI_BUILD_TABLE_FILE) to a (pointed-to-blob, offset) pair.

For compatibility with the current callers of acpi_add_table() and
build_rsdt(), acpi_add_table() will hard-code ACPI_BUILD_TABLE_FILE as
pointed-to-blob. However, the pointed-to-blob can now be determined when
adding an ACPI table, case by case, not when building the RSDT.

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Cc: Shannon Zhao 
Signed-off-by: Laszlo Ersek 
---
 include/hw/acpi/aml-build.h |  6 ++
 hw/acpi/aml-build.c | 16 ++--
 hw/arm/virt-acpi-build.c|  2 +-
 hw/i386/acpi-build.c|  2 +-
 4 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 7d89c40..7518659 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -276,6 +276,12 @@ Aml *aml_varpackage(uint32_t num_elements);
 Aml *aml_touuid(const char *uuid);
 Aml *aml_unicode(const char *str);
 
+struct BlobOffset {
+const char *blob_name; /* no ownership; set from ACPI_BUILD_*_FILE  */
+uint32_t offset;   /* offset into blob named @blob_name */
+};
+typedef struct BlobOffset BlobOffset;
+
 void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev);
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 2c7d59d..60c7c2e 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1171,8 +1171,11 @@ unsigned acpi_data_len(GArray *table)
 
 void acpi_add_table(GArray *table_offsets, GArray *table_data)
 {
-uint32_t offset = cpu_to_le32(table_data->len);
-g_array_append_val(table_offsets, offset);
+BlobOffset blob_offset = {
+.blob_name = ACPI_BUILD_TABLE_FILE,
+.offset = cpu_to_le32(table_data->len)
+};
+g_array_append_val(table_offsets, blob_offset);
 }
 
 void acpi_build_tables_init(AcpiBuildTables *tables)
@@ -1199,16 +1202,17 @@ build_rsdt(GArray *table_data, GArray *linker, GArray *table_offsets)
 AcpiRsdtDescriptorRev1 *rsdt;
 size_t rsdt_len;
 int i;
-const int table_data_len = (sizeof(uint32_t) * table_offsets->len);
 
-rsdt_len = sizeof(*rsdt) + table_data_len;
+rsdt_len = sizeof(*rsdt) + sizeof(uint32_t) * table_offsets->len;
 rsdt = acpi_data_push(table_data, rsdt_len);
-memcpy(rsdt->table_offset_entry, table_offsets->data, table_data_len);
 for (i = 0; i < table_offsets->len; ++i) {
+BlobOffset *blob_offset = (BlobOffset *)table_offsets->data + i;
+
+rsdt->table_offset_entry[i] = blob_offset->offset;
 /* rsdt->table_offset_entry to be filled by Guest linker */
 bios_linker_loader_add_pointer(linker,
ACPI_BUILD_TABLE_FILE,
-   ACPI_BUILD_TABLE_FILE,
+   blob_offset->blob_name,
 table_data, &rsdt->table_offset_entry[i],
sizeof(uint32_t));
 }
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index fcbb2d7..5725994 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -557,7 +557,7 @@ void virt_acpi_build(VirtGuestInfo *guest_info, AcpiBuildTables *tables)
 virt_acpi_get_cpu_info(&cpuinfo);
 
 table_offsets = g_array_new(false, true /* clear */,
-sizeof(uint32_t));
+sizeof(BlobOffset));
 
 bios_linker_loader_alloc(tables->linker, ACPI_BUILD_TABLE_FILE,
  64, false /* high memory */);
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 2cd8891..baebfcc 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1679,7 +1679,7 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
 acpi_get_pci_info(&pci);
 
 table_offsets = g_array_new(false, true /* clear */,
-sizeof(uint32_t));
+sizeof(BlobOffset));
 ACPI_BUILD_DPRINTF("init ACPI tables\n");
 
 bios_linker_loader_alloc(tables->linker, ACPI_BUILD_TABLE_FILE,
-- 
1.8.3.1
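The change from a plain offset to a (pointed-to-blob, offset) pair can be sketched without GLib as follows. struct BlobOffset mirrors the one added by the patch; the fixed-size TableOffsets container, both helper functions and the blob names used when calling them are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same shape as the struct introduced by the patch. */
struct BlobOffset {
    const char *blob_name;  /* not owned */
    uint32_t offset;        /* offset into the blob named blob_name */
};

/* Hypothetical stand-in for the GArray used in QEMU. */
#define MAX_TABLES 8
struct TableOffsets {
    struct BlobOffset entry[MAX_TABLES];
    size_t len;
};

/* Record a table, choosing the blob per table instead of per RSDT. */
static void add_table_in_blob(struct TableOffsets *t,
                              const char *blob_name, uint32_t offset)
{
    assert(t->len < MAX_TABLES);
    t->entry[t->len].blob_name = blob_name;
    t->entry[t->len].offset = offset;
    t->len++;
}

/* Count the RSDT entries that would be relocated against blob_name. */
static size_t count_entries_for_blob(const struct TableOffsets *t,
                                     const char *blob_name)
{
    size_t i, n = 0;

    for (i = 0; i < t->len; i++) {
        if (strcmp(t->entry[i].blob_name, blob_name) == 0) {
            n++;
        }
    }
    return n;
}
```

With the pair stored per entry, the loop that builds the RSDT can pass each entry's own blob name to the linker/loader, which is what the build_rsdt() hunk does with blob_offset->blob_name.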

[Qemu-devel] [PATCH FYI 09/13] hw/acpi: expose more parameters for aml_method()

2015-09-13 Thread Laszlo Ersek

ACPI 1.0b defines the SerializeFlag in MethodFlags. We have not exposed
this until now, but serializing methods that create named objects is
warmly recommended by (recent versions of) the ACPI spec, and recent iasl
actually warns about it. Therefore expose SerializeFlag in a new function.
The old aml_method() function is preserved for old callers' sake.

While at it, expose the SyncLevel bitfield of MethodFlags as well. Because
that was introduced in ACPI 2.0, add a separate function for it. This
allows us to provide one comprehensive DefMethod implementation.

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Laszlo Ersek 
---
 include/hw/acpi/aml-build.h |  3 +++
 hw/acpi/aml-build.c | 26 +-
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index ee54242..f8f96ec 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -266,6 +266,9 @@ Aml *aml_qword_memory(AmlDecode dec, AmlMinFixed min_fixed,
 Aml *aml_scope(const char *name_format, ...) GCC_FMT_ATTR(1, 2);
 Aml *aml_device(const char *name_format, ...) GCC_FMT_ATTR(1, 2);
 Aml *aml_method(const char *name, int arg_count);
+Aml *aml_method_serialized(const char *name, int arg_count, bool serialized);
+Aml *aml_method_flags(const char *name, int arg_count, bool serialized,
+  int sync_level);
 Aml *aml_if(Aml *predicate);
 Aml *aml_else(void);
 Aml *aml_while(Aml *predicate);
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 7d58483..a0f187e 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -698,9 +698,33 @@ Aml *aml_while(Aml *predicate)
 /* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefMethod */
 Aml *aml_method(const char *name, int arg_count)
 {
+return aml_method_serialized(name, arg_count, false);
+}
+
+/* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefMethod */
+Aml *aml_method_serialized(const char *name, int arg_count, bool serialized)
+{
+return aml_method_flags(name, arg_count, serialized, 0);
+}
+
+/* ACPI 2.0: 17.2.4.2 Named Objects Encoding: DefMethod */
+Aml *aml_method_flags(const char *name, int arg_count, bool serialized,
+  int sync_level)
+{
 Aml *var = aml_bundle(0x14 /* MethodOp */, AML_PACKAGE);
 build_append_namestring(var->buf, "%s", name);
-build_append_byte(var->buf, arg_count); /* MethodFlags: ArgCount */
+
+assert(arg_count >= 0);
+assert(arg_count <= 7);
+if (serialized) {
+assert(sync_level >= 0x00);
+assert(sync_level <= 0x0f);
+} else {
+assert(sync_level == 0);
+}
+
+build_append_byte(var->buf,
+  arg_count | (serialized << 3) | (sync_level << 4));
 return var;
 }
 
-- 
1.8.3.1
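The MethodFlags byte assembled by aml_method_flags() in the patch above packs three fields: ArgCount in bits 0 to 2, SerializeFlag in bit 3 and SyncLevel in bits 4 to 7. A minimal standalone sketch of the same packing, with a hypothetical function name, looks like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical standalone version of the flag packing in the patch:
 *   bits 0-2: ArgCount, bit 3: SerializeFlag, bits 4-7: SyncLevel.
 * As in the patch, a non-serialized method must use SyncLevel 0.
 */
static uint8_t method_flags(int arg_count, bool serialized, int sync_level)
{
    assert(arg_count >= 0 && arg_count <= 7);
    assert(sync_level >= 0x00 && sync_level <= 0x0f);
    assert(serialized || sync_level == 0);

    return (uint8_t)(arg_count | (serialized ? 1 << 3 : 0) |
                     (sync_level << 4));
}
```

For example, a serialized two-argument method with SyncLevel 0 yields 0x0A, while the pre-existing aml_method() behaviour (not serialized, SyncLevel 0) reduces to the bare argument count.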

[Qemu-devel] [PATCH FYI 03/13] hw/acpi: rename "AcpiBuildTables.table_data" to "main_blob"

2015-09-13 Thread Laszlo Ersek

The identifier "table_data" is used in wildly different name spaces and
scopes, which makes it practically impossible to grep for uses of
"AcpiBuildTables.table_data" specifically. Rename the field to "main_blob"
(which is a unique identifier across the tree), and update all references
with the help of the compiler.

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Cc: Shannon Zhao 
Signed-off-by: Laszlo Ersek 
---
 include/hw/acpi/aml-build.h | 2 +-
 hw/acpi/aml-build.c | 4 ++--
 hw/arm/virt-acpi-build.c| 6 +++---
 hw/i386/acpi-build.c| 6 +++---
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index e3afa13..7d89c40 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -151,7 +151,7 @@ typedef enum {
 
 typedef
 struct AcpiBuildTables {
-GArray *table_data;
+GArray *main_blob;
 GArray *rsdp;
 GArray *tcpalog;
 GArray *linker;
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 0d4b324..2c7d59d 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1178,7 +1178,7 @@ void acpi_add_table(GArray *table_offsets, GArray *table_data)
 void acpi_build_tables_init(AcpiBuildTables *tables)
 {
 tables->rsdp = g_array_new(false, true /* clear */, 1);
-tables->table_data = g_array_new(false, true /* clear */, 1);
+tables->main_blob = g_array_new(false, true /* clear */, 1);
 tables->tcpalog = g_array_new(false, true /* clear */, 1);
 tables->linker = bios_linker_loader_init();
 }
@@ -1188,7 +1188,7 @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre)
 void *linker_data = bios_linker_loader_cleanup(tables->linker);
 g_free(linker_data);
 g_array_free(tables->rsdp, true);
-g_array_free(tables->table_data, true);
+g_array_free(tables->main_blob, true);
 g_array_free(tables->tcpalog, mfre);
 }
 
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 9088248..fcbb2d7 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -552,7 +552,7 @@ void virt_acpi_build(VirtGuestInfo *guest_info, AcpiBuildTables *tables)
 GArray *table_offsets;
 unsigned dsdt, rsdt;
 VirtAcpiCpuInfo cpuinfo;
-GArray *tables_blob = tables->table_data;
+GArray *tables_blob = tables->main_blob;
 
 virt_acpi_get_cpu_info(&cpuinfo);
 
@@ -631,7 +631,7 @@ static void virt_acpi_build_update(void *build_opaque, uint32_t offset)
 
 virt_acpi_build(build_state->guest_info, &tables);
 
-acpi_ram_update(build_state->table_mr, tables.table_data);
+acpi_ram_update(build_state->table_mr, tables.main_blob);
 acpi_ram_update(build_state->rsdp_mr, tables.rsdp);
 acpi_ram_update(build_state->linker_mr, tables.linker);
 
@@ -685,7 +685,7 @@ void virt_acpi_setup(VirtGuestInfo *guest_info)
 virt_acpi_build(build_state->guest_info, &tables);
 
 /* Now expose it all to Guest */
-build_state->table_mr = acpi_add_rom_blob(build_state, tables.table_data,
+build_state->table_mr = acpi_add_rom_blob(build_state, tables.main_blob,
ACPI_BUILD_TABLE_FILE,
ACPI_BUILD_TABLE_MAX_SIZE);
 assert(build_state->table_mr != NULL);
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 95e0c65..2cd8891 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1670,7 +1670,7 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
 PcPciInfo pci;
 uint8_t *u;
 size_t aml_len = 0;
-GArray *tables_blob = tables->table_data;
+GArray *tables_blob = tables->main_blob;
 
 acpi_get_cpu_info(&cpu);
 acpi_get_pm_info(&pm);
@@ -1833,7 +1833,7 @@ static void acpi_build_update(void *build_opaque, uint32_t offset)
 
 acpi_build(build_state->guest_info, &tables);
 
-acpi_ram_update(build_state->table_mr, tables.table_data);
+acpi_ram_update(build_state->table_mr, tables.main_blob);
 
 if (build_state->rsdp) {
 memcpy(build_state->rsdp, tables.rsdp->data, acpi_data_len(tables.rsdp));
@@ -1899,7 +1899,7 @@ void acpi_setup(PcGuestInfo *guest_info)
 acpi_build(build_state->guest_info, &tables);
 
 /* Now expose it all to Guest */
-build_state->table_mr = acpi_add_rom_blob(build_state, tables.table_data,
+build_state->table_mr = acpi_add_rom_blob(build_state, tables.main_blob,
ACPI_BUILD_TABLE_FILE,
ACPI_BUILD_TABLE_MAX_SIZE);
 assert(build_state->table_mr != NULL);
-- 
1.8.3.1

[Qemu-devel] [PATCH FYI 02/13] hw/acpi: add i386 callbacks for injecting GPE 04 when the VMGENID changes

2015-09-13 Thread Laszlo Ersek

Add a new method called "vm_generation_id_changed" to the
AcpiDeviceIfClass interface. The new method sends an ACPI notification when
the VM generation ID is changed. This contributes to the implementation of
requirement R5, from "docs/vmgenid.txt".

This patch is a slight modification of Gal Hammer's

  [PATCH V15 2/5] acpi: add a vm_generation_id_changed method
  http://thread.gmane.org/gmane.comp.emulators.qemu/332451/focus=332453

(for which reason his S-o-b is preserved in the first position). The
changes are (and should be captured in the commit message):

- There's no need for the helper function acpi_vm_generation_id_changed():
  acpi_send_gpe_event() already does the right thing and is at the right
  abstraction level.

- The next available GPE status bit is bit 4 (value 16); less significant
  bits (bits 1 through 3) are already used for PCI, CPU, and memory
  hotplug.

  Furthermore, bit 0 (value 1) is not available (the _L00 method already
  exists in the DSDTs, with empty body as a precaution); probably because
  the ACPI spec (section "Queuing the Matching Control Method for
  Execution") reserves response code 0 for "no outstanding events". In
  other words, _E00 / _L00 can never be queued.
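
The bit assignment described above can be sketched in standalone C. This is
only an illustration, not QEMU code: the enum values mirror AcpiGPEStatusBits
as extended by this patch, while gpe_method_number() is a hypothetical helper
that maps a single-bit status mask to the N in _E0N / _L0N.

```c
#include <stdint.h>

/* Values mirror AcpiGPEStatusBits as extended by this patch. */
enum {
    PCI_HOTPLUG_STATUS     = 2,   /* bit 1 */
    CPU_HOTPLUG_STATUS     = 4,   /* bit 2 */
    MEMORY_HOTPLUG_STATUS  = 8,   /* bit 3 */
    VMGENID_CHANGED_STATUS = 16,  /* bit 4 */
};

/*
 * Hypothetical helper (not part of QEMU): return the bit index of a
 * single-bit GPE status mask, or -1 if the mask is zero or has more
 * than one bit set.
 */
int gpe_method_number(uint32_t status)
{
    int n;

    if (status == 0 || (status & (status - 1)) != 0) {
        return -1;
    }
    for (n = 0; (status & 1) == 0; n++) {
        status >>= 1;
    }
    return n;
}
```

With these values, the VMGENID status mask (16) maps to method number 4,
which is why the event is serviced by a GPE 04 method.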

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Gal Hammer 
[ler...@redhat.com: see changes above, plus extended commit message]
Signed-off-by: Laszlo Ersek
---

Notes:
fyi:
- This patch is not actually related to the ACPI work, but this was the
  first one I wrote, when I was still trying to figure out the right
  order to go about this series. So I'm including it here.

 include/hw/acpi/acpi.h   | 1 +
 include/hw/acpi/acpi_dev_interface.h | 4 
 include/hw/acpi/ich9.h   | 1 +
 hw/acpi/ich9.c   | 8 
 hw/acpi/piix4.c  | 8 
 hw/isa/lpc_ich9.c| 1 +
 6 files changed, 23 insertions(+)

diff --git a/include/hw/acpi/acpi.h b/include/hw/acpi/acpi.h
index b20bd55..d46095d 100644
--- a/include/hw/acpi/acpi.h
+++ b/include/hw/acpi/acpi.h
@@ -96,6 +96,7 @@ typedef enum {
 ACPI_PCI_HOTPLUG_STATUS = 2,
 ACPI_CPU_HOTPLUG_STATUS = 4,
 ACPI_MEMORY_HOTPLUG_STATUS = 8,
+ACPI_VMGENID_CHANGED_STATUS = 16,
 } AcpiGPEStatusBits;
 
 /* structs */
diff --git a/include/hw/acpi/acpi_dev_interface.h 
b/include/hw/acpi/acpi_dev_interface.h
index f245f8d..d0f210f 100644
--- a/include/hw/acpi/acpi_dev_interface.h
+++ b/include/hw/acpi/acpi_dev_interface.h
@@ -28,6 +28,9 @@ typedef struct AcpiDeviceIf {
  * ospm_status: returns status of ACPI device objects, reported
  *  via _OST method if device supports it.
  *
+ * vm_generation_id_changed: notify the guest that its generation ID has been
+ *   changed.
+ *
  * Interface is designed for providing unified interface
  * to generic ACPI functionality that could be used without
  * knowledge about internals of actual device that implements
@@ -39,5 +42,6 @@ typedef struct AcpiDeviceIfClass {
 
 /*  */
 void (*ospm_status)(AcpiDeviceIf *adev, ACPIOSTInfoList ***list);
+void (*vm_generation_id_changed)(AcpiDeviceIf *adev);
 } AcpiDeviceIfClass;
 #endif
diff --git a/include/hw/acpi/ich9.h b/include/hw/acpi/ich9.h
index 345fd8d..e656f59 100644
--- a/include/hw/acpi/ich9.h
+++ b/include/hw/acpi/ich9.h
@@ -77,4 +77,5 @@ void ich9_pm_device_unplug_cb(ICH9LPCPMRegs *pm, DeviceState 
*dev,
   Error **errp);
 
 void ich9_pm_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list);
+void ich9_vm_generation_id_changed(AcpiDeviceIf *adev);
 #endif /* HW_ACPI_ICH9_H */
diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 1c7fcfa..bd7214e 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -482,3 +482,11 @@ void ich9_pm_ospm_status(AcpiDeviceIf *adev, 
ACPIOSTInfoList ***list)
 
 acpi_memory_ospm_status(&s->pm.acpi_memory_hotplug, list);
 }
+
+void ich9_vm_generation_id_changed(AcpiDeviceIf *adev)
+{
+ICH9LPCState *s = ICH9_LPC_DEVICE(adev);
+ICH9LPCPMRegs *pm = &s->pm;
+
+acpi_send_gpe_event(&pm->acpi_regs, pm->irq, ACPI_VMGENID_CHANGED_STATUS);
+}
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 2cd2fee..d83957c 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -583,6 +583,13 @@ static void piix4_ospm_status(AcpiDeviceIf *adev, 
ACPIOSTInfoList ***list)
 acpi_memory_ospm_status(&s->acpi_memory_hotplug, list);
 }
 
+static void piix4_vm_generation_id_changed(AcpiDeviceIf *adev)
+{
+PIIX4PMState *s = PIIX4_PM(adev);
+
+acpi_send_gpe_event(&s->ar, s->irq, ACPI_VMGENID_CHANGED_STATUS);
+}
+
 static Property piix4_pm_properties[] = {
 DEFINE_PROP_UINT32("smb_io_base", PIIX4PMState, smb_io_base, 0),
 DEFINE_PROP_UINT8(ACPI_PM_PROP_S3_DISABLED, PIIX4PMState, disable_s3, 0),
@@ -621,6 +628,7 @@ static void piix4_pm_class_init(ObjectClass *klass, void 
*data)
 hc->unplug_request =

[Qemu-devel] [PATCH FYI 10/13] hw/acpi: add AML generator function for DataTableRegion()

2015-09-13 Thread Laszlo Ersek

This ASL operator (and the underlying AML) enables named ACPI data tables
to be located from AML code, and to be accessed field-wise, like an
operation region. This is useful for passing down "parameter tables" to
the guest; the ACPI linker/loader can relocate pointers in them, and then
the AML code can read valid pointer values from the fields in the tables.
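
As a rough, standalone illustration of the byte layout that such an operator
produces, the sketch below writes the encoding into a flat buffer. It assumes
a simple four-character region name and plain AML String arguments (prefix
byte 0x0D, ASCII characters, NUL terminator); encode_data_table_region() and
put_string() are hypothetical names, not QEMU functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append an AML String object: StringPrefix (0x0D), ASCII chars, NUL. */
static size_t put_string(uint8_t *buf, size_t pos, const char *s)
{
    size_t len = strlen(s);

    buf[pos++] = 0x0D;
    memcpy(buf + pos, s, len);
    pos += len;
    buf[pos++] = 0x00;
    return pos;
}

/*
 * Hypothetical helper: encode DataTableRegion(name, sig, oem_id,
 * oem_table_id) into buf and return the number of bytes written.
 * The name must be exactly four characters; the caller must provide a
 * buffer that is large enough.
 */
size_t encode_data_table_region(uint8_t *buf, const char *name,
                                const char *sig, const char *oem_id,
                                const char *oem_table_id)
{
    size_t pos = 0;

    buf[pos++] = 0x5B;              /* ExtOpPrefix */
    buf[pos++] = 0x88;              /* DataRegionOp */
    memcpy(buf + pos, name, 4);     /* four-character NameSeg */
    pos += 4;
    pos = put_string(buf, pos, sig);
    pos = put_string(buf, pos, oem_id);
    pos = put_string(buf, pos, oem_table_id);
    return pos;
}
```

For the name "TBLR" with the strings "UEFI", "BOCHS " and "QEMUPARM", this
yields 2 + 4 + 6 + 8 + 10 = 30 bytes.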

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Laszlo Ersek 
---
 include/hw/acpi/aml-build.h |  2 ++
 hw/acpi/aml-build.c | 14 ++
 2 files changed, 16 insertions(+)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index f8f96ec..dc4d215 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -225,6 +225,8 @@ Aml *aml_io(AmlIODecode dec, uint16_t min_base, uint16_t 
max_base,
 uint8_t aln, uint8_t len);
 Aml *aml_operation_region(const char *name, AmlRegionSpace rs,
   uint32_t offset, uint32_t len);
+Aml *aml_data_table_region(const char *name, Aml *sig, Aml *oem_id,
+   Aml *oem_table_id);
 Aml *aml_irq_no_flags(uint8_t irq);
 Aml *aml_named_field(const char *name, unsigned length);
 Aml *aml_reserved_field(unsigned length);
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index a0f187e..2dd2f33 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -788,6 +788,20 @@ Aml *aml_operation_region(const char *name, AmlRegionSpace 
rs,
 return var;
 }
 
+/* ACPI 2.0: 17.2.4.2 Named Objects Encoding: DefDataRegion */
+Aml *aml_data_table_region(const char *name, Aml *sig, Aml *oem_id,
+   Aml *oem_table_id)
+{
+Aml *var = aml_alloc();
+build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
+build_append_byte(var->buf, 0x88); /* DataRegionOp */
+build_append_namestring(var->buf, "%s", name);
+aml_append(var, sig);
+aml_append(var, oem_id);
+aml_append(var, oem_table_id);
+return var;
+}
+
 /* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: NamedField */
 Aml *aml_named_field(const char *name, unsigned length)
 {
-- 
1.8.3.1

[Qemu-devel] [PATCH FYI 08/13] hw/i386: build UEFI ACPI Data Table for VMGENID in the dedicated blob (WIP)

2015-09-13 Thread Laszlo Ersek

Using the tools
- acpi_add_table2(),
- build_header2()

and the blob
- ACPI_BUILD_QEMUPARAM_FILE

that have been added in the previous patches, we can now implement the
UEFI ACPI Data Table (and the related linker/loader commands) that are
specified in "docs/vmgenid.txt".

At this point the UEFI ACPI Data Table becomes visible to the guest, but
it is never used, because we don't reference it yet from the AML in the
SSDT.

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Laszlo Ersek 
---
 hw/i386/acpi-build.c | 69 
 1 file changed, 69 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 045015e..a742f25 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -41,6 +41,7 @@
 #include "hw/acpi/memory_hotplug.h"
 #include "sysemu/tpm.h"
 #include "hw/acpi/tpm.h"
+#include "hw/acpi/vmgenid.h"
 #include "sysemu/tpm_backend.h"
 
 /* Supported chipsets: */
@@ -1658,6 +1659,69 @@ static bool acpi_has_iommu(void)
 return intel_iommu && !ambiguous;
 }
 
+static void
+build_qemuparam(GArray *qemuparam_blob, GArray *linker)
+{
+AcpiQemuParamTable *param_table;
+static const uint8_t ident[ACPI_UEFI_IDENT_SIZE] = QEMU_PARAM_TABLE_GUID;
+
+/*
+ * The generation ID field -- which lives outside of AcpiQemuParamTable --
+ * must fit in the 4KB blob.
+ */
+QEMU_BUILD_BUG_ON(VM_GENERATION_ID_OFFSET + VM_GENERATION_ID_SIZE > 4096);
+
+param_table = acpi_data_push(qemuparam_blob, 4096);
+
+/* set up the UEFI ACPI Data Table Sub-Header */
+memcpy(param_table->identifier, ident, ACPI_UEFI_IDENT_SIZE);
+param_table->data_offset =
+cpu_to_le16(offsetof(AcpiQemuParamTable, data_offset) +
+sizeof param_table->data_offset);
+
+/* set up the QEMU parameters */
+param_table->vmgenid_addr_base_ptr = cpu_to_le64(sizeof *param_table);
+
+/* Prepare linker/loader commands. We handle the allocation of the blob and
+ * the relocation of the "ADDR base pointer" field here. The linking into
+ * the RSDT has already been queued by acpi_add_table2(), whereas
+ * @param_table will be checksummed by build_header2() internally.
+ */
+bios_linker_loader_alloc(linker,
+ ACPI_BUILD_QEMUPARAM_FILE,
+ 4096, /* alloc_align */
+ false /* ie. it can be in high memory */);
+
+bios_linker_loader_add_pointer(linker,
+   /* name of blob containing pointer */
+   ACPI_BUILD_QEMUPARAM_FILE,
+   /* name of blob being pointed to */
+   ACPI_BUILD_QEMUPARAM_FILE,
+   /* blob containing pointer */
+   qemuparam_blob,
+   /* address of pointer */
+   &param_table->vmgenid_addr_base_ptr,
+   /* size of pointer */
+   sizeof param_table->vmgenid_addr_base_ptr);
+
+build_header2(/* @linker receives the ADD_CHECKSUM command */
+  linker,
+  /* the blob in which the ACPI table is embedded */
+  qemuparam_blob,
+  /* fw_cfg name of the same */
+  ACPI_BUILD_QEMUPARAM_FILE,
+  /* SDT header of ACPI table, residing in the blob */
+  (AcpiTableHeader *)param_table,
+  /* ACPI Signature */
+  "UEFI",
+  /* ACPI OEM Table ID */
+  "QEMUPARM",
+  /* size of ACPI table */
+  sizeof *param_table,
+  /* ACPI table revision */
+  1);
+}
+
 static
 void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
 {
@@ -1741,6 +1805,11 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables 
*tables)
 acpi_add_table(table_offsets, tables_blob);
 build_dmar_q35(tables_blob, tables->linker);
 }
+if (true /* lersek: misc.vmgenid_iobase */) {
+acpi_add_table2(table_offsets, tables->qemuparam_blob,
+ACPI_BUILD_QEMUPARAM_FILE);
+build_qemuparam(tables->qemuparam_blob, tables->linker);
+}
 
 /* Add tables supplied by user (if any) */
 for (u = acpi_table_first(); u; u = acpi_table_next(u)) {
-- 
1.8.3.1

[Qemu-devel] [PATCH FYI 13/13] hw/i386: generate AML for the VMGENID device (WIP)

2015-09-13 Thread Laszlo Ersek

This patch implements the "ACPI device, control methods" section of
"docs/vmgenid.txt", with dynamic AML generation.

A small portion of this patch was inspired by Gal Hammer's

  [PATCH V15 4/5] i386: add a Virtual Machine Generation ID device
  http://thread.gmane.org/gmane.comp.emulators.qemu/332451/focus=332454

TODO: the ACPICA instance that is built into the Linux kernel warns about
the fact that both _L04 and _E04 are provided by GPE 04, when the code in
this patch is active. Therefore this patch should also conditionalize _L04
(ATM in "hw/i386/acpi-dsdt.dsl" and "hw/i386/q35-acpi-dsdt.dsl"), so that
they are only generated when VMGENID is *not* active.

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Laszlo Ersek 
---
 hw/i386/acpi-build.c | 79 
 1 file changed, 79 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index a742f25..67be94a 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1126,6 +1126,85 @@ build_ssdt(GArray *table_data, GArray *linker,
 aml_append(ssdt, scope);
 }
 
+if (true /* lersek: misc->vmgenid_iobase */) {
+unsigned adbp_offset;
+size_t genid_incr;
+
+scope = aml_scope("\\_SB");
+dev = aml_device("VMGI");
+
+aml_append(dev, aml_name_decl("_CID", aml_string("VM_Gen_Counter")));
+aml_append(dev, aml_name_decl("_DDN", aml_string("VM_Gen_Counter")));
+aml_append(dev, aml_name_decl("_HID", aml_string("QEMU0002")));
+
+aml_append(dev, aml_name_decl("_STA", aml_int(0xF)));
+
+method = aml_method_serialized("ADDR", 0, true);
+
+aml_append(method,
+   aml_data_table_region("TBLR",
+ aml_string("UEFI"),
+ aml_string("%s", ACPI_BUILD_APPNAME6),
+ aml_string("QEMUPARM")));
+field = aml_field("TBLR", AML_ANY_ACC, AML_PRESERVE);
+adbp_offset = offsetof (AcpiQemuParamTable, vmgenid_addr_base_ptr);
+/* Offset() before ADBP, expressed in bits */
+aml_append(field, aml_reserved_field(adbp_offset * 8));
+aml_append(field, aml_named_field("ADBP", 64));
+aml_append(method, field);
+
+aml_append(method,
+   aml_operation_region("VMGR", AML_SYSTEM_IO,
+0x512 /* lersek: misc->vmgenid_iobase 
*/,
+9));
+field = aml_field("VMGR", AML_DWORD_ACC, AML_PRESERVE);
+aml_append(field, aml_named_field("PTLO", 32));
+aml_append(field, aml_named_field("PTHI", 32));
+aml_append(field, aml_access_field(AML_BYTE_ACC));
+aml_append(field, aml_named_field("DONE", 8));
+aml_append(method, field);
+
+aml_append(method, aml_name_decl("RESU", aml_buffer(8, NULL)));
+aml_append(method,
+  aml_create_qword_field(aml_name("RESU"), aml_int(0),
+ "ADFU"));
+aml_append(method,
+  aml_create_dword_field(aml_name("RESU"), aml_int(0),
+ "ADLO"));
+aml_append(method,
+  aml_create_dword_field(aml_name("RESU"), aml_int(4),
+ "ADHI"));
+
+/*
+ * Offset increment from the end of the parameter table (ie. where ADBP
+ * points to) until the generation ID field, skipping over the "OVMF
+ * SDT Header probe suppressor" and "VMGenID alignment padding" fields
+ * in the blob.
+ */
+genid_incr = VM_GENERATION_ID_OFFSET - sizeof(AcpiQemuParamTable);
+aml_append(method, aml_store(aml_add(aml_name("ADBP"),
+ aml_int(genid_incr)),
+ aml_name("ADFU")));
+aml_append(method, aml_store(aml_name("ADLO"), aml_name("PTLO")));
+aml_append(method, aml_store(aml_name("ADHI"), aml_name("PTHI")));
+aml_append(method, aml_store(aml_int(0), aml_name("DONE")));
+
+pkg = aml_package(2);
+aml_append(pkg, aml_name("ADLO"));
+aml_append(pkg, aml_name("ADHI"));
+aml_append(method, aml_return(pkg));
+
+aml_append(dev, method);
+aml_append(scope, dev);
+aml_append(ssdt, scope);
+
+scope = aml_scope("\\_GPE");
+method = aml_method("_E04", 0);
+aml_append(method, aml_notify(aml_name("\\_SB.VMGI"), aml_int(0x80)));
+aml_append(scope, method);
+aml_append(ssdt, scope);
+}
+
 sb_scope = aml_scope("\\_SB");
 {
 /* create PCI0.PRES device and its _CRS to reserve CPU hotplug MMIO */
-- 
1.8.3.1

[Qemu-devel] [PATCH FYI 12/13] hw/acpi: add AML generator function for CreateQWordField()

2015-09-13 Thread Laszlo Ersek

It follows the pattern of CreateDWordField() / aml_create_dword_field().

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Laszlo Ersek 
---
 include/hw/acpi/aml-build.h |  1 +
 hw/acpi/aml-build.c | 11 +++
 2 files changed, 12 insertions(+)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 32e49b3..4f6a2be 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -280,6 +280,7 @@ Aml *aml_buffer(int buffer_size, uint8_t *byte_list);
 Aml *aml_resource_template(void);
 Aml *aml_field(const char *name, AmlAccessType type, AmlUpdateRule rule);
 Aml *aml_create_dword_field(Aml *srcbuf, Aml *index, const char *name);
+Aml *aml_create_qword_field(Aml *srcbuf, Aml *index, const char *name);
 Aml *aml_varpackage(uint32_t num_elements);
 Aml *aml_touuid(const char *uuid);
 Aml *aml_unicode(const char *str);
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 5aeb289..ca5bcd7 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -854,6 +854,17 @@ Aml *aml_create_dword_field(Aml *srcbuf, Aml *index, const 
char *name)
 return var;
 }
 
+/* ACPI 2.0: 17.2.4.2 Named Objects Encoding: DefCreateQWordField */
+Aml *aml_create_qword_field(Aml *srcbuf, Aml *index, const char *name)
+{
+Aml *var = aml_alloc();
+build_append_byte(var->buf, 0x8F); /* DefCreateQWordField */
+aml_append(var, srcbuf);
+aml_append(var, index);
+build_append_namestring(var->buf, "%s", name);
+return var;
+}
+
 /* ACPI 1.0b: 16.2.3 Data Objects Encoding: String */
 Aml *aml_string(const char *name_format, ...)
 {
-- 
1.8.3.1

[Qemu-devel] [PATCH FYI 05/13] hw/acpi: add more flexible acpi_add_table() and build_header() variants

2015-09-13 Thread Laszlo Ersek

acpi_add_table() and build_header() hardcode a number of traits that we'd
like to pass in later on, on a table-by-table basis. These are:

- The fw_cfg file name of the blob that contains the ACPI table.
  ACPI_BUILD_TABLE_FILE is hard-coded at the moment.

- The OEM Table ID field. Due to the way the DataTableRegion() operator
  works, the OEM Table ID field is our only possibility to ensure a unique
  lookup for DataTableRegion(), since we don't populate (Signature, OEM
  ID) uniquely. However, currently OEM Table ID is directly derived from
  Signature. Uniqueness for DataTableRegion() requires making OEM Table ID
  independent.

Expose the internals of the functions that we have now, so that callers
can control the above traits.
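
To make the current derivation concrete, here is a minimal standalone sketch
of how the default OEM Table ID is composed from a four-byte application name
and the four-byte signature. default_oem_table_id() is a hypothetical helper
used only for illustration; the "BXPC" + "FACP" example matches the
"BXPCFACP" value seen in guest table dumps.

```c
#include <string.h>

/*
 * Hypothetical helper: build the 8-byte OEM Table ID the way the
 * unparameterized header builder does, by concatenating a 4-byte
 * application name and the 4-byte table signature. The result is a
 * fixed-size field, not a NUL-terminated string.
 */
void default_oem_table_id(char out[8], const char *appname4, const char *sig)
{
    memcpy(out, appname4, 4);
    memcpy(out + 4, sig, 4);
}
```

Because the second half is always the signature, two tables with the same
signature end up with the same OEM Table ID, which is the limitation that a
caller-supplied ID removes.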

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Laszlo Ersek 
---
 include/hw/acpi/aml-build.h |  6 +
 hw/acpi/aml-build.c | 53 +
 2 files changed, 55 insertions(+), 4 deletions(-)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 7518659..47d28c9 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -285,9 +285,15 @@ typedef struct BlobOffset BlobOffset;
 void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev);
+void
+build_header2(GArray *linker, GArray *table_data, const char *blob_name,
+  AcpiTableHeader *h, const char *sig, const char *oem_table_id,
+  int len, uint8_t rev);
 void *acpi_data_push(GArray *table_data, unsigned size);
 unsigned acpi_data_len(GArray *table);
 void acpi_add_table(GArray *table_offsets, GArray *table_data);
+void acpi_add_table2(GArray *table_offsets, GArray *table_data,
+ const char *blob_name);
 void acpi_build_tables_init(AcpiBuildTables *tables);
 void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre);
 void
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 60c7c2e..03111a3 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1139,18 +1139,48 @@ void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
 {
+char oem_table_id[8];
+
+memcpy(oem_table_id, ACPI_BUILD_APPNAME4, 4);
+memcpy(oem_table_id + 4, sig, 4);
+build_header2(linker, table_data, ACPI_BUILD_TABLE_FILE, h, sig,
+  oem_table_id, len, rev);
+}
+
+/* Fill in the SDT header of an ACPI table, and add a linker command to update
+ * its checksum.
+ *
+ * @linker: The linker/loader command file. This will receive the checksum
+ *  update command.
+ * @table_data: The blob in which the ACPI table is embedded.
+ * @blob_name: The fw_cfg file name of the blob under which @table_data will be
+ * exposed later.
+ * @h: Pointer to the ACPI table header. Must reside within @table_data.
+ * @sig: Signature to place into the ACPI table header. Exactly four bytes will
+ *   be copied.
+ * @oem_table_id: OEM Table ID to place into the ACPI table header. Exactly
+ *eight bytes will be copied.
+ * @len: Length of the ACPI table, including the SDT header. This determines
+ *   the length field in the header itself, and the number of bytes the
+ *   checksum will cover.
+ * @rev: Revision to place into the ACPI table header.
+ */
+void
+build_header2(GArray *linker, GArray *table_data, const char *blob_name,
+  AcpiTableHeader *h, const char *sig, const char *oem_table_id,
+  int len, uint8_t rev)
+{
 memcpy(&h->signature, sig, 4);
 h->length = cpu_to_le32(len);
 h->revision = rev;
 memcpy(h->oem_id, ACPI_BUILD_APPNAME6, 6);
-memcpy(h->oem_table_id, ACPI_BUILD_APPNAME4, 4);
-memcpy(h->oem_table_id + 4, sig, 4);
+memcpy(h->oem_table_id, oem_table_id, 8);
 h->oem_revision = cpu_to_le32(1);
 memcpy(h->asl_compiler_id, ACPI_BUILD_APPNAME4, 4);
 h->asl_compiler_revision = cpu_to_le32(1);
 h->checksum = 0;
 /* Checksum to be filled in by Guest linker */
-bios_linker_loader_add_checksum(linker, ACPI_BUILD_TABLE_FILE,
+bios_linker_loader_add_checksum(linker, blob_name,
 table_data->data, h, len, &h->checksum);
 }
 
@@ -1171,8 +1201,23 @@ unsigned acpi_data_len(GArray *table)
 
 void acpi_add_table(GArray *table_offsets, GArray *table_data)
 {
+acpi_add_table2(table_offsets, table_data, ACPI_BUILD_TABLE_FILE);
+}
+
+/* Register an ACPI table to be referenced from the RSDT.
+ *
+ * @table_offsets: The array collecting the registrations, with element type
+ * BlobOffset.
+ * @table_data: The blob to which the caller will append the ACPI table, after
+ *  this function returns.
+ * @blob_name: The fw_cfg file name of the blob under which @table_data will be
+ * exposed later.
+ */
+void acpi_add_table2(GArray *

[Qemu-devel] [PATCH FYI 11/13] hw/acpi: add AML generator function for AccessAs()

2015-09-13 Thread Laszlo Ersek

The AccessAs(AccessType) macro can be used inside the Field() operator in
ASL, for diverging from the Field's default access type, for the fields
that follow AccessAs(). The new helper function allows us to generate the
matching AML.

The AccessAttribute parameter of the macro (described in the spec) is not
exposed because it is reserved for all spaces except SMBus device space,
and we don't use that.

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Laszlo Ersek 
---
 include/hw/acpi/aml-build.h |  1 +
 hw/acpi/aml-build.c | 11 +++
 2 files changed, 12 insertions(+)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index dc4d215..32e49b3 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -230,6 +230,7 @@ Aml *aml_data_table_region(const char *name, Aml *sig, Aml 
*oem_id,
 Aml *aml_irq_no_flags(uint8_t irq);
 Aml *aml_named_field(const char *name, unsigned length);
 Aml *aml_reserved_field(unsigned length);
+Aml *aml_access_field(AmlAccessType type);
 Aml *aml_local(int num);
 Aml *aml_string(const char *name_format, ...) GCC_FMT_ATTR(1, 2);
 Aml *aml_lnot(Aml *arg);
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 2dd2f33..5aeb289 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -821,6 +821,17 @@ Aml *aml_reserved_field(unsigned length)
 return var;
 }
 
+/* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: AccessField */
+Aml *aml_access_field(AmlAccessType type)
+{
+Aml *var = aml_alloc();
+/* AccessField := 0x01 AccessType AccessAttrib */
+build_append_byte(var->buf, 0x01);
+build_append_byte(var->buf, type);
+build_append_byte(var->buf, 0x00 /* reserved outside SMBus dev space */);
+return var;
+}
+
 /* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefField */
 Aml *aml_field(const char *name, AmlAccessType type, AmlUpdateRule rule)
 {
-- 
1.8.3.1

[Qemu-devel] [PATCH FYI 06/13] hw/acpi: introduce ACPI_BUILD_QEMUPARAM_FILE

2015-09-13 Thread Laszlo Ersek

We'll build the UEFI ACPI Data Table for the VMGENID device in a separate
fw_cfg blob (see "docs/vmgenid.txt").

When introducing a new fw_cfg blob for ACPI linker/loader purposes, we
have to decide first if the new blob will be subject to patching on first
guest access.

(1) If so, then the blob must be backed by a RAMBlock, so that the patched
contents can be migrated, for the case when migration occurs while the
guest is reading the patched blob.

(2) If the blob is not subject to patching on first guest access, then it
doesn't have to be migrated in a RAMBlock; the target host can rebuild
the blob identically from scratch (assuming the same machine type of
course), and the migrated guest can continue reading that.

In the first case, the new blob has to follow the pattern seen with
"main_blob", "rsdp" and "linker":

- In acpi_build_tables_cleanup(), the direct output of the AML generator
  should be freed unconditionally, because that output has been copied
  into a RAMBlock both at startup (with acpi_add_rom_blob()) and on
  patching too (with acpi_ram_update()).

- Therefore the "mfre" parameter is not considered for such blobs in
  acpi_build_tables_cleanup().

In the second case, the new blob has to follow the pattern seen with
"tcpalog":

- In acpi_build_tables_cleanup(), the direct output of the AML generator
  must not be freed on startup (because that output is linked by fw_cfg).

- However on patching (which affects only *other* blobs), the fresh blob
  contents out of the AML generator are identical to the original contents
  (which are still linked by fw_cfg), thus the fresh output should be
  simply thrown away.

- Hence the "mfre" parameter must be considered for such blobs in
  acpi_build_tables_cleanup().

Now that we're introducing the ACPI_BUILD_QEMUPARAM_FILE blob (without any
contents as yet), only for the purposes of the VMGENID device, we can see
(from "docs/vmgenid.txt") that this blob won't need any patching, in
response to guest actions that occur before the guest first downloads the
blob. For that reason we follow pattern (2).
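
The two cleanup patterns can be summarized in a small standalone sketch. It
assumes, as my reading of the description above, that "mfre" is false at
startup and true when the tables are rebuilt for patching; BlobKind and
should_free_generator_output() are hypothetical names, not QEMU code.

```c
#include <stdbool.h>

/* Hypothetical classification of fw_cfg blobs, following cases (1) and (2). */
typedef enum {
    BLOB_RAMBLOCK_BACKED,  /* (1) contents copied into a RAMBlock */
    BLOB_FW_CFG_LINKED     /* (2) generator output linked directly by fw_cfg */
} BlobKind;

/*
 * Decide whether the direct output of the AML generator should be freed
 * during cleanup. Assumption: mfre is false at startup and true on a
 * rebuild triggered by patching.
 */
bool should_free_generator_output(BlobKind kind, bool mfre)
{
    if (kind == BLOB_RAMBLOCK_BACKED) {
        /* Always a copy; the original can go unconditionally. */
        return true;
    }
    /* Linked by fw_cfg: keep the startup copy, discard later rebuilds. */
    return mfre;
}
```

Under this reading, the new blob behaves like "tcpalog": its startup contents
stay alive because fw_cfg points at them, and later rebuilds are discarded.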

Cc: Paolo Bonzini 
Cc: Gal Hammer 
Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Cc: Shannon Zhao 
Signed-off-by: Laszlo Ersek 
---
 include/hw/acpi/aml-build.h | 2 ++
 hw/acpi/aml-build.c | 2 ++
 hw/arm/virt-acpi-build.c| 3 +++
 hw/i386/acpi-build.c| 3 +++
 4 files changed, 10 insertions(+)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 47d28c9..ee54242 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -16,6 +16,7 @@
 #define ACPI_BUILD_TABLE_FILE "etc/acpi/tables"
 #define ACPI_BUILD_RSDP_FILE "etc/acpi/rsdp"
 #define ACPI_BUILD_TPMLOG_FILE "etc/tpm/log"
+#define ACPI_BUILD_QEMUPARAM_FILE "etc/acpi/qemuparam"
 
 typedef enum {
 AML_NO_OPCODE = 0,/* has only data */
@@ -154,6 +155,7 @@ struct AcpiBuildTables {
 GArray *main_blob;
 GArray *rsdp;
 GArray *tcpalog;
+GArray *qemuparam_blob;
 GArray *linker;
 } AcpiBuildTables;
 
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 03111a3..7d58483 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1228,6 +1228,7 @@ void acpi_build_tables_init(AcpiBuildTables *tables)
 tables->rsdp = g_array_new(false, true /* clear */, 1);
 tables->main_blob = g_array_new(false, true /* clear */, 1);
 tables->tcpalog = g_array_new(false, true /* clear */, 1);
+tables->qemuparam_blob = g_array_new(false, true /* clear */, 1);
 tables->linker = bios_linker_loader_init();
 }
 
@@ -1238,6 +1239,7 @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, 
bool mfre)
 g_array_free(tables->rsdp, true);
 g_array_free(tables->main_blob, true);
 g_array_free(tables->tcpalog, mfre);
+g_array_free(tables->qemuparam_blob, mfre);
 }
 
 /* Build rsdt table */
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 5725994..8b6856e 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -695,6 +695,9 @@ void virt_acpi_setup(VirtGuestInfo *guest_info)
 
 fw_cfg_add_file(guest_info->fw_cfg, ACPI_BUILD_TPMLOG_FILE,
 tables.tcpalog->data, acpi_data_len(tables.tcpalog));
+fw_cfg_add_file(guest_info->fw_cfg, ACPI_BUILD_QEMUPARAM_FILE,
+tables.qemuparam_blob->data,
+acpi_data_len(tables.qemuparam_blob));
 
 build_state->rsdp_mr = acpi_add_rom_blob(build_state, tables.rsdp,
   ACPI_BUILD_RSDP_FILE, 0);
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index baebfcc..045015e 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1909,6 +1909,9 @@ void acpi_setup(PcGuestInfo *guest_info)
 
 fw_cfg_add_file(guest_info->fw_cfg, ACPI_BUILD_TPMLOG_FILE,
 tables.tcpalog->data, acpi_data_len(tables.tcpalog));
+fw_cfg_add_file(guest_info->fw_cfg, ACPI_BUILD_QEMUPARAM_FILE,
+

Re: [Qemu-devel] Windows does not support DataTableRegion at all [was: docs: describe QEMU's VMGenID design]

2015-09-13 Thread Laszlo Ersek

On 09/13/15 14:34, Michael S. Tsirkin wrote:
> On Sun, Sep 13, 2015 at 01:56:44PM +0200, Laszlo Ersek wrote:
>> As the subject suggests, I have terrible news.
>>
>> I'll preserve the full context here, so that it's easy to scroll back to
>> the ASL for reference.
>>
>> I'm also CC'ing edk2-devel, because a number of BIOS developers should
>> be congregating there.
> 
> Wow, bravo! It does look like we need to go back to
> the drawing board.

Thank you. :)

> The only crazy thing you didn't try is to use
> an XSDT instead of the DSDT.
> I find it unlikely that this will help ...
> 

Actually, I forgot to mention it, but I *did* try to use XSDT, sort of 
automatically. I had mentioned earlier that EFI_ACPI_TABLE_PROTOCOL 
automatically links stuff into both RSDT and XSDT, and I verified in this case 
that the UEFI ACPI Data Table *was* linked into the XSDT.

*

[root@ovmf-fedora acpi.6]# dmesg | grep UEFI
[0.00] ACPI: UEFI 0x3E8F1000 3E (v01 BOCHS  QEMUPARM 
0001 BXPC 0001)

*

[root@ovmf-fedora acpi.6]# cat xsdt.dsl 
/*
 * Intel ACPI Component Architecture
 * AML/ASL+ Disassembler version 20150515-64
 * Copyright (c) 2000 - 2015 Intel Corporation
 * 
 * Disassembly of xsdt.dat, Sun Sep 13 14:54:17 2015
 *
 * ACPI Data Table [XSDT]
 *
 * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue
 */

[000h    4]Signature : "XSDT"[Extended System 
Description Table]
[004h 0004   4] Table Length : 004C
[008h 0008   1] Revision : 01
[009h 0009   1] Checksum : 90
[00Ah 0010   6]   Oem ID : "BOCHS "
[010h 0016   8] Oem Table ID : "BXPCFACP"
[018h 0024   4] Oem Revision : 0001
[01Ch 0028   4]  Asl Compiler ID : ""
[020h 0032   4]Asl Compiler Revision : 0113

[024h 0036   8]   ACPI Table Address   0 : 3FEF5000
[02Ch 0044   8]   ACPI Table Address   1 : 3FEF4000
[034h 0052   8]   ACPI Table Address   2 : 3FEF3000
[03Ch 0060   8]   ACPI Table Address   3 : 3FEF2000
[044h 0068   8]   ACPI Table Address   4 : 3E8F1000

Raw Table Data: Length 76 (0x4C)

  : 58 53 44 54 4C 00 00 00 01 90 42 4F 43 48 53 20  // XSDTL.BOCHS 
  0010: 42 58 50 43 46 41 43 50 01 00 00 00 20 20 20 20  // BXPCFACP
  0020: 13 00 00 01 00 50 EF 3F 00 00 00 00 00 40 EF 3F  // .P.?.@.?
  0030: 00 00 00 00 00 30 EF 3F 00 00 00 00 00 20 EF 3F  // .0.?. .?
  0040: 00 00 00 00 00 10 8F 3E 00 00 00 00  // ...>

*

See "ACPI Table Address 4", 3E8F1000.

Thanks!
Laszlo

Re: [Qemu-devel] Python problem

2015-09-13 Thread Programmingkid

On Sep 13, 2015, at 7:28 AM, Peter Maydell wrote:

> On 13 September 2015 at 03:19, Programmingkid  
> wrote:
>> Excellent. This fixed the problem. Thank you very much. The minimum
>> version of python QEMU supports is 2.6?
> 
> At the moment it should be 2.4, apart from this bug. However we're about
> to raise it to 2.6 (and there's a patch on the list that updates the
> version check in configure). We were only retaining Python 2.4 support for
> the benefit of RHEL5, and we stopped supporting RHEL5 in QEMU 2.5
> when we raised our minimum glib version requirement.
> 
> thanks
> -- PMM

It sounds like I need to add this to the system requirements section of the
documentation.

Python 2.6 or higher is needed. Are there any other system requirements you
think should be added to QEMU's documentation?

[Qemu-devel] [PATCH v2 2/2] hw/arm/virt-acpi-build: Add DBG2 table

2015-09-13 Thread Leif Lindholm

Add a DBG2 table, describing the pl011 UART.

Signed-off-by: Leif Lindholm 
---
 hw/arm/virt-acpi-build.c | 60 +++-
 1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 9088248..0ea7023 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -352,6 +352,61 @@ build_rsdp(GArray *rsdp_table, GArray *linker, unsigned rsdt)
 }
 
 static void
+build_dbg2(GArray *table_data, GArray *linker, VirtGuestInfo *guest_info)
+{
+AcpiDebugPort2Header *dbg2;
+AcpiDebugPort2Device *dev;
+struct AcpiGenericAddress *addr;
+uint32_t *addr_size;
+char *name;
+const MemMapEntry *uart_memmap = &guest_info->memmap[VIRT_UART];
+int table_size, dev_size, namepath_length;
+const char namepath[] = ".";
+
+namepath_length = strlen(namepath) + 1;
+dev_size = sizeof(*dev) + sizeof(*addr) * 1 + sizeof(uint32_t) * 1 +
+namepath_length;
+table_size = dev_size + sizeof(AcpiDebugPort2Header);
+
+dbg2 = acpi_data_push(table_data, table_size);
+dev = (void *)dbg2 + sizeof(*dbg2);
+addr = (void *)dev + sizeof(*dev);
+addr_size = (void *)addr + sizeof(*addr);
+name = (void *)addr_size + sizeof(*addr_size);
+
+dbg2->devices_offset = sizeof(*dbg2);
+dbg2->devices_count = 1;
+
+/* First (only) debug device */
+dev->revision = 0;
+dev->length = cpu_to_le16(dev_size);
+dev->address_count = 1;
+dev->namepath_length = cpu_to_le16(namepath_length);
+dev->namepath_offset = cpu_to_le16((void *)name - (void *)dev);
+dev->oem_data_length = 0;
+dev->oem_data_offset = 0;
+dev->port_type = cpu_to_le16(0x8000);/* Serial */
+dev->port_subtype = cpu_to_le16(0x3);/* ARM PL011 UART */
+dev->base_address_offset = cpu_to_le16((void *)addr - (void *)dev);
+dev->address_size_offset = cpu_to_le16((void *)addr_size - (void *)dev);
+
+/* First (only) address */
+addr->space_id = AML_SYSTEM_MEMORY;
+addr->bit_width = 8;
+addr->bit_offset = 0;
+addr->access_width = 1;
+addr->address = cpu_to_le64(uart_memmap->base);
+
+/* Size of first (only) address */
+*addr_size = cpu_to_le32(sizeof(*addr));
+
+/* Namespace String for first (only) device */
+strcpy(name, namepath);
+
+build_header(linker, table_data, (void *)dbg2, "DBG2", table_size, 0);
+}
+
+static void
 build_spcr(GArray *table_data, GArray *linker, VirtGuestInfo *guest_info)
 {
 AcpiSerialPortConsoleRedirection *spcr;
@@ -577,7 +632,7 @@ void virt_acpi_build(VirtGuestInfo *guest_info, AcpiBuildTables *tables)
 dsdt = tables_blob->len;
 build_dsdt(tables_blob, tables->linker, guest_info);
 
-/* FADT MADT GTDT MCFG SPCR pointed to by RSDT */
+/* FADT MADT GTDT MCFG DBG2 SPCR pointed to by RSDT */
 acpi_add_table(table_offsets, tables_blob);
 build_fadt(tables_blob, tables->linker, dsdt);
 
@@ -591,6 +646,9 @@ void virt_acpi_build(VirtGuestInfo *guest_info, AcpiBuildTables *tables)
 build_mcfg(tables_blob, tables->linker, guest_info);
 
 acpi_add_table(table_offsets, tables_blob);
+build_dbg2(tables_blob, tables->linker, guest_info);
+
+acpi_add_table(table_offsets, tables_blob);
 build_spcr(tables_blob, tables->linker, guest_info);
 
 /* RSDT is pointed to by RSDP */
-- 
2.1.4

[Qemu-devel] [PATCH v2 0/2] ACPI/arm-virt: add DBG2

2015-09-13 Thread Leif Lindholm

The Debug Port Table 2 (DBG2) is mandated by the ARM Server Base Boot
Requirements specification. Add the DBG2 table definitions, and set up
an entry in the ARM virt machine for the pl011 UART.

Changes since v1:
- Static structure replaced with separate Header/Device structs.
- Missing cpu_to_le*() transforms added in table construction.
- Added missing setting of address_size_offset.
- Commit message modified to mention SPCR spec version bump.

Not changed since v1:
- It's still statically allocated, although the structure definitions
  would now permit a dynamic creation ... I'm just slightly too
  unfamiliar with both ACPI in general and the QEMU aml_* functions to
  quite wrap my head around how to do this dynamically.

Leif Lindholm (2):
  ACPI: Add definitions for the DBG2 table
  hw/arm/virt-acpi-build: Add DBG2 table

 hw/arm/virt-acpi-build.c| 60 -
 include/hw/acpi/acpi-defs.h | 35 --
 2 files changed, 92 insertions(+), 3 deletions(-)

-- 
2.1.4

[Qemu-devel] [PATCH v2 1/2] ACPI: Add definitions for the DBG2 table

2015-09-13 Thread Leif Lindholm

The DBG2 table can be considered a "companion" to SPCR - it points out
debug consoles available in the system.

Also update SPCR comments to reflect DBG2 is now described in this file,
and update the supported SPCR specification revision (no functional
change).

Signed-off-by: Leif Lindholm 
---
 include/hw/acpi/acpi-defs.h | 35 +--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index 2b431e6..a7bd984 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -197,10 +197,41 @@ enum {
 };
 
 /*
- * Serial Port Console Redirection Table (SPCR), Rev. 1.02
+ * Debug Port Table 2 (DBG2)
  *
  * For .interface_type see Debug Port Table 2 (DBG2) serial port
- * subtypes in Table 3, Rev. May 22, 2012
+ * subtypes in Table 3, Rev. Aug 10, 2015
+ *
+ */
+struct AcpiDebugPort2Header {
+ACPI_TABLE_HEADER_DEF
+uint32_t devices_offset;
+uint32_t devices_count;
+} QEMU_PACKED;
+typedef struct AcpiDebugPort2Header
+   AcpiDebugPort2Header;
+
+struct AcpiDebugPort2Device {
+uint8_t  revision;
+uint16_t length;
+uint8_t  address_count;
+uint16_t namepath_length;
+uint16_t namepath_offset;
+uint16_t oem_data_length;
+uint16_t oem_data_offset;
+uint16_t port_type;
+uint16_t port_subtype;
+uint8_t  reserved1[2];
+uint16_t base_address_offset;
+uint16_t address_size_offset;
+} QEMU_PACKED;
+typedef struct AcpiDebugPort2Device
+   AcpiDebugPort2Device;
+
+/*
+ * Serial Port Console Redirection Table (SPCR), Rev. 1.03
+ *
+ * .interface_type format same as for DBG2.
  */
 struct AcpiSerialPortConsoleRedirection {
 ACPI_TABLE_HEADER_DEF
-- 
2.1.4

Re: [Qemu-devel] [RFC PATCH 2/3] acpi: pc: add fw_cfg device node to ssdt

2015-09-13 Thread Gabriel L. Somlo

On Sun, Sep 13, 2015 at 02:45:23PM +0300, Michael S. Tsirkin wrote:
> On Sat, Sep 12, 2015 at 07:30:41PM -0400, Gabriel L. Somlo wrote:
> > Add a fw_cfg device node to the ACPI SSDT. While the guest-side
> > BIOS can't utilize this information (since it has to access the
> > hard-coded fw_cfg device to extract ACPI tables to begin with),
> > having fw_cfg listed in ACPI will help the guest kernel keep a
> > more accurate inventory of in-use IO port regions.
> > 
> > Signed-off-by: Gabriel Somlo 
> > ---
> >  hw/i386/acpi-build.c | 19 +++
> >  1 file changed, 19 insertions(+)
> > 
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > index 95e0c65..9d0ec22 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -1071,6 +1071,25 @@ build_ssdt(GArray *table_data, GArray *linker,
> >  aml_append(scope, aml_name_decl("_S5", pkg));
> >  aml_append(ssdt, scope);
> >  
> > +if (guest_info->fw_cfg) {
> > +scope = aml_scope("\\_SB");
> > +dev = aml_device("FWCF");
> > +
> > +aml_append(dev, aml_name_decl("_HID", aml_string("FWCF0001")));
> 
> Generally that's an illegal HID. If this device has a driver,
> use QEMU as a prefix. Otherwise, use one of the pre-defined ones
> with a PNP ISA ID.

I'm working on a sysfs driver to allow access to fw_cfg files via the
guest kernel (similar to e.g. /sys/firmware/dmi/entries/...). That
probably means I should go with QEMU0002 (0001 is already assigned to
the pvpanic device).

I'll use that in v2, which I'll send out once I get some feedback from
the arm side as well.

Thanks much,
--Gabriel

> > +/* device present, functioning, decoding, not shown in UI */
> > +aml_append(dev, aml_name_decl("_STA", aml_int(0xB)));
> > +
> > +crs = aml_resource_template();
> > +aml_append(crs,
> > +aml_io(AML_DECODE16, FW_CFG_IO_BASE, FW_CFG_IO_BASE,
> > +   0x01, FW_CFG_IO_SIZE)
> > +);
> > +aml_append(dev, aml_name_decl("_CRS", crs));
> > +
> > +aml_append(scope, dev);
> > +aml_append(ssdt, scope);
> > +}
> > +
> >  if (misc->applesmc_io_base) {
> >  scope = aml_scope("\\_SB.PCI0.ISA");
> >  dev = aml_device("SMC");
> > -- 
> > 2.4.3

Re: [Qemu-devel] [RFC PATCH 1/3] pc: fw_cfg: move ioport base constant to pc.h

2015-09-13 Thread Gabriel L. Somlo

On Sun, Sep 13, 2015 at 12:51:53PM +0200, Marc Marí wrote:
> On Sat, 12 Sep 2015 19:30:40 -0400
> "Gabriel L. Somlo"  wrote:
> 
> > Move BIOS_CFG_IOPORT define from pc.c to pc.h, and rename
> > it to FW_CFG_IO_BASE. Also, add FW_CFG_IO_SIZE define (set
> > to 0x02, to cover the overlapping 16-bit control and 8-bit
> > data ports).
> > 
> > Signed-off-by: Gabriel Somlo 
> > ---
> >  hw/i386/pc.c | 5 ++---
> >  include/hw/i386/pc.h | 3 +++
> >  2 files changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index b5107f7..1a92b4f 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -86,7 +86,6 @@ void pc_set_legacy_acpi_data_size(void)
> >  acpi_data_size = 0x1;
> >  }
> >  
> > -#define BIOS_CFG_IOPORT 0x510
> >  #define FW_CFG_ACPI_TABLES (FW_CFG_ARCH_LOCAL + 0)
> >  #define FW_CFG_SMBIOS_ENTRIES (FW_CFG_ARCH_LOCAL + 1)
> >  #define FW_CFG_IRQ0_OVERRIDE (FW_CFG_ARCH_LOCAL + 2)
> > @@ -760,7 +759,7 @@ static FWCfgState *bochs_bios_init(void)
> >  int i, j;
> >  unsigned int apic_id_limit = pc_apic_id_limit(max_cpus);
> >  
> > -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT);
> > +fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE);
> >  /* FW_CFG_MAX_CPUS is a bit confusing/problematic on x86:
> >   *
> >   * SeaBIOS needs FW_CFG_MAX_CPUS for CPU hotplug, but the CPU
> > hotplug @@ -1292,7 +1291,7 @@ FWCfgState
> > *xen_load_linux(PCMachineState *pcms, 
> >  assert(MACHINE(pcms)->kernel_filename != NULL);
> >  
> > -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT);
> > +fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE);
> >  rom_set_fw(fw_cfg);
> >  
> >  load_linux(pcms, fw_cfg);
> > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > index 3e002c9..0cab3c5 100644
> > --- a/include/hw/i386/pc.h
> > +++ b/include/hw/i386/pc.h
> > @@ -206,6 +206,9 @@ typedef void (*cpu_set_smm_t)(int smm, void *arg);
> >  
> >  void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name);
> >  
> > +#define FW_CFG_IO_BASE 0x510
> > +#define FW_CFG_IO_SIZE  0x02
> > +
> >  /* acpi_piix.c */
> >  
> >  I2CBus *piix4_pm_init(PCIBus *bus, int devfn, uint32_t smb_io_base,
> 
> There is already a size defined in hw/nvram/fw_cfg.c (FW_CFG_SIZE). You
> could move this definition to the .h and reuse it for ACPI. This way,
> it is easier to modify.
> 
> Note that this value is used both for the size of the IO port and the
> size of the CTL field when using memory regions. You can split it now in
> your patches, or it will be split in my patches.

Thanks for the feedback! It looks like FW_CFG_SIZE in fw_cfg.c is
mainly concerned with the width of the control register, which is a
"private" property of fw_cfg.c, rather than the total size of the
fw_cfg ioport region, which is a property of hw/i386/pc.c (just as
a15memmap[VIRT_FW_CFG] holds the (base, size) properties for the
equivalent mmio region on arm).

We could rename FW_CFG_SIZE in fw_cfg.c to FW_CFG_CTL_SIZE for
increased clarity, but the fact that it's equal to FW_CFG_IO_SIZE
on hw/i386/... seems to me like more of a coincidence...

OTOH, i386/acpi_build.c includes both pc.h and fw_cfg.h, so if I have
to, I could use FW_CFG_IO_BASE from the former and FW_CFG_SIZE from
the latter.

It's more of a question of aesthetics at this point, so I'm happy
to do it whichever way I'm told :)

Thanks,
--Gabriel

> 
> I'm not going to comment on the other patches, because I don't know
> ACPI.
> 
> Thanks
> Marc

Re: [Qemu-devel] [PATCH v2] hw/misc: Add support for ADC controller in Xilinx Zynq 7000

2015-09-13 Thread Peter Crosthwaite

On Sat, Sep 12, 2015 at 2:08 PM, Guenter Roeck  wrote:
> Add support for the Xilinx XADC core used in Zynq 7000.
>
> References:
> - Zynq-7000 All Programmable SoC Technical Reference Manual
> - 7 Series FPGAs and Zynq-7000 All Programmable SoC XADC
>   Dual 12-Bit 1 MSPS Analog-to-Digital Converter
>
> Tested with Linux using qemu machine xilinx-zynq-a9 with devicetree
> files zynq-zc702.dtb and zynq-zc706.dtb, and kernel configuration
> multi_v7_defconfig.
>
> Signed-off-by: Guenter Roeck 
>
> ---
> v2: Use extract32()
> Merge zynq_xadc_reset() and _zynq_xadc_reset() into one function
> Use "xlnx,zynq_xadc"
> Move device model to include/hw/misc/zynq_xadc.h
> irq -> qemu_irq
> xadc_dfifo_depth -> xadc_dfifo_entries
> Dropped unnecessary comments
> Merged zynq_xadc_realize() into zynq_xadc_init()
>
>  hw/arm/xilinx_zynq.c|   6 +
>  hw/misc/Makefile.objs   |   1 +
>  hw/misc/zynq_xadc.c | 270 
> 
>  include/hw/misc/zynq_xadc.h |  49 
>  4 files changed, 326 insertions(+)
>  create mode 100644 hw/misc/zynq_xadc.c
>  create mode 100644 include/hw/misc/zynq_xadc.h
>
> diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
> index a4e7b5c..f933f81 100644
> --- a/hw/arm/xilinx_zynq.c
> +++ b/hw/arm/xilinx_zynq.c
> @@ -24,6 +24,7 @@
>  #include "hw/block/flash.h"
>  #include "sysemu/block-backend.h"
>  #include "hw/loader.h"
> +#include "hw/misc/zynq_xadc.h"
>  #include "hw/ssi.h"
>  #include "qemu/error-report.h"
>
> @@ -225,6 +226,11 @@ static void zynq_init(MachineState *machine)
>  sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, 0xE0101000);
>  sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[79-IRQ_OFFSET]);
>
> +dev = qdev_create(NULL, TYPE_ZYNQ_XADC);
> +qdev_init_nofail(dev);
> +sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, 0xF8007100);
> +sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[39-IRQ_OFFSET]);
> +
>  dev = qdev_create(NULL, "pl330");
>  qdev_prop_set_uint8(dev, "num_chnls",  8);
>  qdev_prop_set_uint8(dev, "num_periph_req",  4);
> diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
> index 4aa76ff..5f76f05 100644
> --- a/hw/misc/Makefile.objs
> +++ b/hw/misc/Makefile.objs
> @@ -36,6 +36,7 @@ obj-$(CONFIG_OMAP) += omap_sdrc.o
>  obj-$(CONFIG_OMAP) += omap_tap.o
>  obj-$(CONFIG_SLAVIO) += slavio_misc.o
>  obj-$(CONFIG_ZYNQ) += zynq_slcr.o
> +obj-$(CONFIG_ZYNQ) += zynq_xadc.o
>  obj-$(CONFIG_STM32F2XX_SYSCFG) += stm32f2xx_syscfg.o
>
>  obj-$(CONFIG_PVPANIC) += pvpanic.o
> diff --git a/hw/misc/zynq_xadc.c b/hw/misc/zynq_xadc.c
> new file mode 100644
> index 000..dc67b73
> --- /dev/null
> +++ b/hw/misc/zynq_xadc.c
> @@ -0,0 +1,270 @@
> +/*
> + * ADC registers for Xilinx Zynq Platform
> + *
> + * Copyright (c) 2015 Guenter Roeck
> + * Based on hw/misc/zynq_slcr.c, written by Michal Simek
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version
> + * 2 of the License, or (at your option) any later version.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see .
> + */
> +
> +#include "hw/hw.h"
> +#include "hw/misc/zynq_xadc.h"
> +#include "qemu/timer.h"
> +#include "sysemu/sysemu.h"
> +
> +enum {


These names should match the TRM, which has them as:

> +CFG= 0x000 / 4,

_CFG (ok)

> +INTSTS,

_INT_STS (just needs _)

> +INTMSK,

_INT_MASK

> +STATUS,

_MSTS

> +CFIFO,

_CMDFIFO

> +DFIFO,

_RDFIFO

> +CTL,

_MCTL

I have dropped the XADCIF_ prefix from the names as it is implicit, but the
remainder should be recognisable from the TRM.

> +};
> +
> +#define XADC_ZYNQ_CFG_ENABLEBIT(31)
> +#define XADC_ZYNQ_CFG_CFIFOTH_RD(x) (((x) >> 20) & 0x0f)
> +#define XADC_ZYNQ_CFG_DFIFOTH_RD(x) (((x) >> 16) & 0x0f)
> +#define XADC_ZYNQ_CFG_WEDGE BIT(13)
> +#define XADC_ZYNQ_CFG_REDGE BIT(12)
> +#define XADC_ZYNQ_CFG_TCKRATE_DIV2  (0x0 << 8)
> +#define XADC_ZYNQ_CFG_TCKRATE_DIV4  (0x1 << 8)
> +#define XADC_ZYNQ_CFG_TCKRATE_DIV8  (0x2 << 8)
> +#define XADC_ZYNQ_CFG_TCKRATE_DIV16 (0x3 << 8)
> +#define XADC_ZYNQ_CFG_IGAP_MASK 0x1f
> +#define XADC_ZYNQ_CFG_IGAP(x)   ((x) & XADC_ZYNQ_CFG_IGAP_MASK)
> +

These defs share the same name stem but are inconsistent and cannot be
used interchangeably. Some are extractors (CFIFOTH_RD), some are
boolean masks (WEDGE) and some are depositors of specific values
(TCKRATE_DIV2). This means readers need to refer back to this def table
to figure out what the macro operation is for each field. It probably
needs a suffix system: _MASK, _EXTRACT, _DEPOSIT. Line wrap is a
problem, so maybe truncate to shorthands.

> +#define XADC_ZYNQ_INT_CFIFO_LTH BIT(9)
> +#define XADC_ZYNQ_INT_DFIFO_GTH BIT(8)

Re: [Qemu-devel] [RFC PATCH 1/3] pc: fw_cfg: move ioport base constant to pc.h

2015-09-13 Thread Marc Marí

On Sun, 13 Sep 2015 13:28:24 -0400
"Gabriel L. Somlo"  wrote:

> On Sun, Sep 13, 2015 at 12:51:53PM +0200, Marc Marí wrote:
> > On Sat, 12 Sep 2015 19:30:40 -0400
> > "Gabriel L. Somlo"  wrote:
> > 
> > > Move BIOS_CFG_IOPORT define from pc.c to pc.h, and rename
> > > it to FW_CFG_IO_BASE. Also, add FW_CFG_IO_SIZE define (set
> > > to 0x02, to cover the overlapping 16-bit control and 8-bit
> > > data ports).
> > > 
> > > Signed-off-by: Gabriel Somlo 
> > > ---
> > >  hw/i386/pc.c | 5 ++---
> > >  include/hw/i386/pc.h | 3 +++
> > >  2 files changed, 5 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > index b5107f7..1a92b4f 100644
> > > --- a/hw/i386/pc.c
> > > +++ b/hw/i386/pc.c
> > > @@ -86,7 +86,6 @@ void pc_set_legacy_acpi_data_size(void)
> > >  acpi_data_size = 0x1;
> > >  }
> > >  
> > > -#define BIOS_CFG_IOPORT 0x510
> > >  #define FW_CFG_ACPI_TABLES (FW_CFG_ARCH_LOCAL + 0)
> > >  #define FW_CFG_SMBIOS_ENTRIES (FW_CFG_ARCH_LOCAL + 1)
> > >  #define FW_CFG_IRQ0_OVERRIDE (FW_CFG_ARCH_LOCAL + 2)
> > > @@ -760,7 +759,7 @@ static FWCfgState *bochs_bios_init(void)
> > >  int i, j;
> > >  unsigned int apic_id_limit = pc_apic_id_limit(max_cpus);
> > >  
> > > -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT);
> > > +fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE);
> > >  /* FW_CFG_MAX_CPUS is a bit confusing/problematic on x86:
> > >   *
> > >   * SeaBIOS needs FW_CFG_MAX_CPUS for CPU hotplug, but the CPU
> > > hotplug @@ -1292,7 +1291,7 @@ FWCfgState
> > > *xen_load_linux(PCMachineState *pcms, 
> > >  assert(MACHINE(pcms)->kernel_filename != NULL);
> > >  
> > > -fw_cfg = fw_cfg_init_io(BIOS_CFG_IOPORT);
> > > +fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE);
> > >  rom_set_fw(fw_cfg);
> > >  
> > >  load_linux(pcms, fw_cfg);
> > > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > > index 3e002c9..0cab3c5 100644
> > > --- a/include/hw/i386/pc.h
> > > +++ b/include/hw/i386/pc.h
> > > @@ -206,6 +206,9 @@ typedef void (*cpu_set_smm_t)(int smm, void
> > > *arg); 
> > >  void ioapic_init_gsi(GSIState *gsi_state, const char
> > > *parent_name); 
> > > +#define FW_CFG_IO_BASE 0x510
> > > +#define FW_CFG_IO_SIZE  0x02
> > > +
> > >  /* acpi_piix.c */
> > >  
> > >  I2CBus *piix4_pm_init(PCIBus *bus, int devfn, uint32_t
> > > smb_io_base,
> > 
> > There is already a size defined in hw/nvram/fw_cfg.c (FW_CFG_SIZE).
> > You could move this definition to the .h and reuse it for ACPI.
> > This way, it is easier to modify.
> > 
> > Note that this value is used both for the size of the IO port and
> > the size of the CTL field when using memory regions. You can split
> > it now in your patches, or it will be split in my patches.
> 
> Thanks for the feedback! It does look like FW_CFG_SIZE in fw_cfg.c
> appears to be mainly concerned with the width of the control register,
> which is a "private" property of fw_cfg.c, rather than the total size
> of the fw_cfg ioport region, which is a property of hw/i386/pc.c
> (same as a15memmap[VIRT_FW_CFG] contains the same (base,size)
> properties for the equivalent mmio region on arm).
> 
> We could rename FW_CFG_SIZE in fw_cfg.c to FW_CFG_CTL_SIZE for
> increased clarity, but the fact that it's equal to FW_CFG_IO_SIZE
> on hw/i386/... seems to me like more of a coincidence...

In hw/nvram/fw_cfg.c

L. 675, in fw_cfg_io_realize():
memory_region_init_io(&s->comb_iomem, OBJECT(s), &fw_cfg_comb_mem_ops,
  FW_CFG(s), "fwcfg", FW_CFG_SIZE); 

L. 707, in fw_cfg_mem_realize():
memory_region_init_io(&s->ctl_iomem, OBJECT(s), &fw_cfg_ctl_mem_ops,
  FW_CFG(s), "fwcfg.ctl", FW_CFG_SIZE);

The first one is the size of all the IO region, and the second one is
the size of the memory mapped CTL field.

The value for the ACPI size could be the same value used in
memory_region_init_io. But it needs to be detached from the CTL size.
That's what I meant before. Of course, it can be the same, but it's
better to split it.

Thanks
Marc

> OTOH, i386/acpi_build.c includes both pc.h and fw_cfg.h, so if I have
> to, I could use FW_CFG_IO_BASE from the former and FW_CFG_SIZE from
> the latter.
> 
> It's more of a question of aesthetics at this point, so I'm happy
> to do it whichever way I'm told :)
> 
> > 
> > I'm not going to comment on the other patches, because I don't know
> > ACPI.
> > 
> > Thanks
> > Marc

Re: [Qemu-devel] [PATCH] imx_serial: Generate interrupt on tx empty if enabled

2015-09-13 Thread Peter Crosthwaite

On Fri, Sep 11, 2015 at 12:37 AM, Michael Tokarev  wrote:
>
> Can we please have some r-b or ACK for this? :)
>
> 20.08.2015 18:52, Guenter Roeck wrote:
>> Generate an interrupt if the tx buffer is empty and the tx empty interrupt
>> is enabled. This fixes a problem seen when running a Linux image since
>> Linux commit 55c3cb1358e ("serial: imx: remove unneeded imx_transmit_buffer()
>> from imx_start_tx()"). Linux now waits for the tx empty interrupt before
>> starting to send data, causing transmit stalls until there is an interrupt
>> for another reason.
>>
>> Signed-off-by: Guenter Roeck 


Looks right, Jean-Christophe may know more though.

Reviewed-by: Peter Crosthwaite 

>> ---
>>  hw/char/imx_serial.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/char/imx_serial.c b/hw/char/imx_serial.c
>> index f3fbc77..8dc791d 100644
>> --- a/hw/char/imx_serial.c
>> +++ b/hw/char/imx_serial.c
>> @@ -145,7 +145,9 @@ static void imx_update(IMXSerialState *s)
>>  uint32_t flags;
>>
>>  flags = (s->usr1 & s->ucr1) & (USR1_TRDY|USR1_RRDY);
>> -if (!(s->ucr1 & UCR1_TXMPTYEN)) {
>> +if (s->ucr1 & UCR1_TXMPTYEN) {
>> +flags |= (s->uts1 & UTS1_TXEMPTY);
>> +} else {
>>  flags &= ~USR1_TRDY;

Out of scope, but this conditional looks wrong (in original code too).
Why does TXMPTYEN gate TRDY?

Regards,
Peter

>>  }
>>
>>
>
>

Re: [Qemu-devel] [PATCH v2] hw/misc/zynq_slcr: Change CPU clock rate for Linux boots

2015-09-13 Thread Peter Crosthwaite

On Sat, Sep 12, 2015 at 2:06 PM, Guenter Roeck  wrote:
> The Linux kernel only accepts 34 Khz and 67 Khz clock rates, and
> may crash if the actual clock rate is too low. The clock rate used to be
> (ps-clk-frequency * 26 / 4), which resulted in a CPU frequency of
> 21 Khz if ps-clk-frequency was set to  Hz. Change it to
> (ps-clk-frequency * 20 / 2) = 33 Khz to make Linux happy.
> Limit the change to Linux boots only.
>
> Signed-off-by: Guenter Roeck 
>

Reviewed-by: Peter Crosthwaite 

Can this go via target-arm? (cc PMM).

There may be more changes worth making on is_linux. I don't have the
patch with the full list of FSBL-related SLCR changes handy and can't
seem to find it in any modern Yocto trees. Wondering if Yocto still
supports booting Zynq without FSBL (Nathan/Alistair may know more)?

Regards,
Peter

> ---
> v2: Limit scope of change to Linux boots.
>
>  hw/misc/zynq_slcr.c | 29 +++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/hw/misc/zynq_slcr.c b/hw/misc/zynq_slcr.c
> index 964f253..ed510fb 100644
> --- a/hw/misc/zynq_slcr.c
> +++ b/hw/misc/zynq_slcr.c
> @@ -14,6 +14,7 @@
>   * with this program; if not, see .
>   */
>
> +#include "hw/arm/linux-boot-if.h"
>  #include "hw/hw.h"
>  #include "qemu/timer.h"
>  #include "hw/sysbus.h"
> @@ -177,6 +178,8 @@ typedef struct ZynqSLCRState {
>
>  MemoryRegion iomem;
>
> +bool is_linux;
> +
>  uint32_t regs[ZYNQ_SLCR_NUM_REGS];
>  } ZynqSLCRState;
>
> @@ -189,7 +192,11 @@ static void zynq_slcr_reset(DeviceState *d)
>
>  s->regs[LOCKSTA] = 1;
>  /* 0x100 - 0x11C */
> -s->regs[ARM_PLL_CTRL]   = 0x0001A008;
> +if (!s->is_linux) {
> +s->regs[ARM_PLL_CTRL]   = 0x0001A008;
> +} else {
> +s->regs[ARM_PLL_CTRL]   = 0x00014008;
> +}
>  s->regs[DDR_PLL_CTRL]   = 0x0001A008;
>  s->regs[IO_PLL_CTRL]= 0x0001A008;
>  s->regs[PLL_STATUS] = 0x003F;
> @@ -198,7 +205,11 @@ static void zynq_slcr_reset(DeviceState *d)
>  s->regs[IO_PLL_CFG] = 0x00014000;
>
>  /* 0x120 - 0x16C */
> -s->regs[ARM_CLK_CTRL]   = 0x1F000400;
> +if (!s->is_linux) {
> +s->regs[ARM_CLK_CTRL]   = 0x1F000400;
> +} else {
> +s->regs[ARM_CLK_CTRL]   = 0x1F000200;
> +}
>  s->regs[DDR_CLK_CTRL]   = 0x1843;
>  s->regs[DCI_CLK_CTRL]   = 0x01E03201;
>  s->regs[APER_CLK_CTRL]  = 0x01FFCCCD;
> @@ -429,17 +440,27 @@ static const VMStateDescription vmstate_zynq_slcr = {
>  .version_id = 2,
>  .minimum_version_id = 2,
>  .fields = (VMStateField[]) {
> +VMSTATE_BOOL(is_linux, ZynqSLCRState),
>  VMSTATE_UINT32_ARRAY(regs, ZynqSLCRState, ZYNQ_SLCR_NUM_REGS),
>  VMSTATE_END_OF_LIST()
>  }
>  };
>
> +static void zynq_sclr_linux_init(ARMLinuxBootIf *obj, bool secure_boot)
> +{
> +ZynqSLCRState *s = ZYNQ_SLCR(obj);
> +
> +s->is_linux = true;
> +}
> +
>  static void zynq_slcr_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> +ARMLinuxBootIfClass *albifc = ARM_LINUX_BOOT_IF_CLASS(klass);
>
>  dc->vmsd = &vmstate_zynq_slcr;
>  dc->reset = zynq_slcr_reset;
> +albifc->arm_linux_init = zynq_sclr_linux_init;
>  }
>
>  static const TypeInfo zynq_slcr_info = {
> @@ -448,6 +469,10 @@ static const TypeInfo zynq_slcr_info = {
>  .parent = TYPE_SYS_BUS_DEVICE,
>  .instance_size  = sizeof(ZynqSLCRState),
>  .instance_init = zynq_slcr_init,
> +.interfaces = (InterfaceInfo []) {
> +{ TYPE_ARM_LINUX_BOOT_IF },
> +{ },
> +},
>  };
>
>  static void zynq_slcr_register_types(void)
> --
> 2.1.4
>

Re: [Qemu-devel] [PATCH V1] sdhci: Fix hostctl2 write logic.

2015-09-13 Thread Peter Crosthwaite

On Fri, Sep 11, 2015 at 3:30 AM, Sai Pavan Boddu
 wrote:
> From: Peter Crosthwaite 
>
> This should be a shifted MASKED_WRITE like all other instances of
> non-word aligned registers.
>
> Signed-off-by: Peter Crosthwaite 


As the sender, this requires your signed-off-by line (in addition to
any originals). git commit --amend -s should do it.

Your own RB might help as well (I can't do review as author).

Regards,
Peter

> ---
>  hw/sd/sdhci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
> index 8fd75f7..fd354e3 100644
> --- a/hw/sd/sdhci.c
> +++ b/hw/sd/sdhci.c
> @@ -1059,7 +1059,7 @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, unsigned size)
>  value |= SDHC_CTRL2_SAMPLING_CLKSEL;
>  }
>  s->acmd12errsts = value;
> -s->hostctl2 = value >> 16;
> +MASKED_WRITE(s->hostctl2, mask >> 16, value >> 16);
>  break;
>  case SDHC_CLKCON:
>  if (!(mask & 0xFF00)) {
> --
> 2.1.1
>

Re: [Qemu-devel] [PATCH v2] hw/misc/zynq_slcr: Change CPU clock rate for Linux boots

2015-09-13 Thread Guenter Roeck


On 09/13/2015 01:22 PM, Peter Crosthwaite wrote:

On Sat, Sep 12, 2015 at 2:06 PM, Guenter Roeck  wrote:

The Linux kernel only accepts 34 Khz and 67 Khz clock rates, and
may crash if the actual clock rate is too low. The clock rate used to be
(ps-clk-frequency * 26 / 4), which resulted in a CPU frequency of
21 Khz if ps-clk-frequency was set to  Hz. Change it to
(ps-clk-frequency * 20 / 2) = 33 Khz to make Linux happy.
Limit the change to Linux boots only.

Signed-off-by: Guenter Roeck 



Reviewed-by: Peter Crosthwaite 

Can this go via target-arm? (cc PMM).

There may be more changes worth making on is_linux. I don't have the
patch with the full list of FSBL-related SLCR changes handy and can't
seem to find it in any modern Yocto trees. Wondering if Yocto still
supports booting Zynq without FSBL (Nathan/Alistair may know more)?



Good question. I didn't find any related patches in Yocto 1.8,
but then again I wasn't sure I was looking in the right place.

Guenter


Regards,
Peter


---
v2: Limit scope of change to Linux boots.

  hw/misc/zynq_slcr.c | 29 +++--
  1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/hw/misc/zynq_slcr.c b/hw/misc/zynq_slcr.c
index 964f253..ed510fb 100644
--- a/hw/misc/zynq_slcr.c
+++ b/hw/misc/zynq_slcr.c
@@ -14,6 +14,7 @@
   * with this program; if not, see .
   */

+#include "hw/arm/linux-boot-if.h"
  #include "hw/hw.h"
  #include "qemu/timer.h"
  #include "hw/sysbus.h"
@@ -177,6 +178,8 @@ typedef struct ZynqSLCRState {

  MemoryRegion iomem;

+bool is_linux;
+
  uint32_t regs[ZYNQ_SLCR_NUM_REGS];
  } ZynqSLCRState;

@@ -189,7 +192,11 @@ static void zynq_slcr_reset(DeviceState *d)

  s->regs[LOCKSTA] = 1;
  /* 0x100 - 0x11C */
-s->regs[ARM_PLL_CTRL]   = 0x0001A008;
+if (!s->is_linux) {
+s->regs[ARM_PLL_CTRL]   = 0x0001A008;
+} else {
+s->regs[ARM_PLL_CTRL]   = 0x00014008;
+}
  s->regs[DDR_PLL_CTRL]   = 0x0001A008;
  s->regs[IO_PLL_CTRL]= 0x0001A008;
  s->regs[PLL_STATUS] = 0x003F;
@@ -198,7 +205,11 @@ static void zynq_slcr_reset(DeviceState *d)
  s->regs[IO_PLL_CFG] = 0x00014000;

  /* 0x120 - 0x16C */
-s->regs[ARM_CLK_CTRL]   = 0x1F000400;
+if (!s->is_linux) {
+s->regs[ARM_CLK_CTRL]   = 0x1F000400;
+} else {
+s->regs[ARM_CLK_CTRL]   = 0x1F000200;
+}
  s->regs[DDR_CLK_CTRL]   = 0x1843;
  s->regs[DCI_CLK_CTRL]   = 0x01E03201;
  s->regs[APER_CLK_CTRL]  = 0x01FFCCCD;
@@ -429,17 +440,27 @@ static const VMStateDescription vmstate_zynq_slcr = {
  .version_id = 2,
  .minimum_version_id = 2,
  .fields = (VMStateField[]) {
+VMSTATE_BOOL(is_linux, ZynqSLCRState),
  VMSTATE_UINT32_ARRAY(regs, ZynqSLCRState, ZYNQ_SLCR_NUM_REGS),
  VMSTATE_END_OF_LIST()
  }
  };

+static void zynq_sclr_linux_init(ARMLinuxBootIf *obj, bool secure_boot)
+{
+ZynqSLCRState *s = ZYNQ_SLCR(obj);
+
+s->is_linux = true;
+}
+
  static void zynq_slcr_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);
+ARMLinuxBootIfClass *albifc = ARM_LINUX_BOOT_IF_CLASS(klass);

  dc->vmsd = &vmstate_zynq_slcr;
  dc->reset = zynq_slcr_reset;
+albifc->arm_linux_init = zynq_sclr_linux_init;
  }

  static const TypeInfo zynq_slcr_info = {
@@ -448,6 +469,10 @@ static const TypeInfo zynq_slcr_info = {
  .parent = TYPE_SYS_BUS_DEVICE,
  .instance_size  = sizeof(ZynqSLCRState),
  .instance_init = zynq_slcr_init,
+.interfaces = (InterfaceInfo []) {
+{ TYPE_ARM_LINUX_BOOT_IF },
+{ },
+},
  };

  static void zynq_slcr_register_types(void)
--
2.1.4

Re: [Qemu-devel] [PATCH v2] hw/misc/zynq_slcr: Change CPU clock rate for Linux boots

2015-09-13 Thread Peter Maydell

On 13 September 2015 at 21:22, Peter Crosthwaite
 wrote:
> On Sat, Sep 12, 2015 at 2:06 PM, Guenter Roeck  wrote:
>> The Linux kernel only accepts 34 Khz and 67 Khz clock rates, and
>> may crash if the actual clock rate is too low. The clock rate used to be
>> (ps-clk-frequency * 26 / 4), which resulted in a CPU frequency of
>> 21 Khz if ps-clk-frequency was set to  Hz. Change it to
>> (ps-clk-frequency * 20 / 2) = 33 Khz to make Linux happy.
>> Limit the change to Linux boots only.
>>
>> Signed-off-by: Guenter Roeck 
>>
>
> Reviewed-by: Peter Crosthwaite 
>
> Can this go via target-arm? (cc PMM).
>
> There may be more changes worth making on is_linux. I don't have the
> patch with the full list of FSBL-related SLCR changes handy and can't
> seem to find it in any modern Yocto trees. Wondering if Yocto still
> supports booting Zynq without FSBL (Nathan/Alistair may know more)?

I'd prefer us not to propagate lots of "only if Linux boot"
changes into devices. The GIC *must* have these because the
kernel can't configure it otherwise from non-secure mode.
I'm not sure that applies here.

thanks
-- PMM

Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation

2015-09-13 Thread Aurelien Jarno

On 2015-09-10 19:48, Aurelien Jarno wrote:
> On 2015-09-01 22:51, Richard Henderson wrote:
> > I've been looking at this problem off and on for the last week or so,
> > prompted by the sparc performance work.  Although I haven't been able
> > to get a proper sparc64 guest install working, I see the exact same
> > problem with a mips guest.
> > 
> > On alpha or x86, which seem to perform well, perf numbers for the
> > executable have about 30% of the execution time spent in cpu_exec.
> > For mips, on the other hand, we spend about 30% of the time in
> > routines related to tcg (re-)translation.
> 
> Indeed the problem happens on CPUs which implement the MMU as a 
> "software assisted TLB" (or any other marketing name), as opposed to
> hardware page walk MMU. They can hold a limited number of TLB entries
> at a given time, and require the OS to do the page walk to refill the
> TLB. For that an exception is generated, and the faulting address has
> to be determined. That's where the TB retranslation takes place, and
> that's why it happens a lot more on these CPUs.
> 
> A few years ago, I measured about 45% of the TB translation actually
> being retranslation for mips and 60% for SH4 for a standard workload.
> For comparison, these values are around 1% on i386 and around 5% on ARM.
> 
> That's why each time we add an optimization to the optimizer, we get
> faster code, but we might lose because it takes longer to generate.
> 
> > Aurelien has a patch in his own branches that attempts to mitigate this
> > on mips by shadow caching more tlb entries.  While this does improve
> > performance a bit, it employs a linear search through a large buffer,
> > with the effect of 30-ish % perf numbers for r4k_map_address.
> > (One could probably improve things by hashing the data in that array,
> > rather than a linear search, but...)
> 
> Yes, that is just a workaround and probably highly workload dependent,
> that's why I never submitted it.
> 
> > In the past we've talked about getting rid of retranslation entirely.
> > It's clever, but it certainly has its share of problems.  I gave it
> > a go this weekend.
> 
> Really great that you have been able to implement that.
> 
> > The following isn't quite right.  It fails to boot on sparc even with
> > our tiny test kernel.  It also triggers an abort on mips, eventually.
> > But it's able to get all the way through to a prompt, and in the 
> > process I can see that perf results are quite different -- much more
> > like results I see for alpha.
> > 
> > Thoughts on the approach?
> 
> It looks like the approach we discussed with Paolo back in June:
> 
> http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg04885.html
> 
> To me it looks like the right way to proceed; we just have to take care
> that the information to store does not take too much space compared to
> the actual translated code.
> 
> I'll have a look and test it asap.

I haven't really reviewed the code yet, but I have been able to test
your tcg-search-2 branch.

First of all I have tested half of the targets (alpha, arm, cris, i386,
mips, ppc, s390x, sh4 and sparc), and I haven't noticed any regression.
They now have more than 50 hours of uptime, some of them have been 
building stuff most of the time, so they are quite stable. That said
I have only tested your branch on an x86-64 host, and it might be a 
good idea to test it in one or two different host architectures (I put
that on my todo list, but no promise there).

On the performance side, I have done real measurements only on i386 and
mips. On i386, I haven't seen any measurable difference. On mips, the
boot time is unchanged, but some workloads are noticeably faster. The
best I have measured is on perl code, with a x2.4 improvement, while
on an average workload, the gain is around x1.5.

With all that said, you can get:

  Tested-by: Aurelien Jarno 

I hope to give you the corresponding reviewed-by in the next few days.

Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

[Qemu-devel] [PATCH 0/2] target-ppc: vector instruction fixes

2015-09-13 Thread Aurelien Jarno

This patchset fixes some vector instructions which are incorrectly
decoded or implemented. The first patch is needed to run recent versions
of openssl, as it enables POWER8 instructions when it detects such a
CPU.

Aurelien Jarno (2):
  target-ppc: fix vcipher, vcipherlast, vncipherlast and vpermxor
  target-ppc: fix xscmpodp and xscmpudp decoding

 target-ppc/int_helper.c | 19 ++-
 target-ppc/translate.c  | 11 +--
 2 files changed, 23 insertions(+), 7 deletions(-)

Cc: Tom Musta 
Cc: Alexander Graf 

-- 
2.1.4

[Qemu-devel] [PATCH 2/2] target-ppc: fix xscmpodp and xscmpudp decoding

2015-09-13 Thread Aurelien Jarno

The xscmpodp and xscmpudp instructions only have the AX and BX bits in
their encoding; the lowest bit (usually TX) is marked as an invalid
bit. We therefore can't decode them with GEN_XX2FORM, which decodes
the two lowest bits.

Introduce a new form GEN_XX2IFORM, which decodes AX and BX and marks
the lowest bit as invalid.
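
As a rough illustration of the invalid-bit idea, the sketch below assumes
that the sixth argument of GEN_HANDLER2_E acts as a mask of opcode bits
that must be zero (the diff changes it from 0 to 1). The names
XX2I_INVAL_MASK and insn_has_invalid_bits are hypothetical and are not
part of QEMU; the instruction words in any test are arbitrary values, not
real encodings.

```c
#include <stdint.h>

/* Hypothetical mask: only the lowest opcode bit (the usual TX position)
   is treated as invalid for xscmpodp/xscmpudp. */
#define XX2I_INVAL_MASK 0x1u

/* Returns nonzero when an instruction word sets a bit that the handler
   declares invalid, in which case the decoder should reject it. */
static int insn_has_invalid_bits(uint32_t insn, uint32_t inval_mask)
{
    return (insn & inval_mask) != 0;
}
```

With a mask of 0, as in the old form, a word with the lowest bit set is
accepted; with a mask of 1 it is rejected.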

Cc: Tom Musta 
Cc: Alexander Graf 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Aurelien Jarno 
---
 target-ppc/translate.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 84c5cea..c0eed13 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -10670,6 +10670,13 @@ GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 0, PPC_NONE, fl2), \
 GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 2, opc3, 0, PPC_NONE, fl2), \
 GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 3, opc3, 0, PPC_NONE, fl2)
 
+#undef GEN_XX2IFORM
+#define GEN_XX2IFORM(name, opc2, opc3, fl2)   \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0, opc3, 1, PPC_NONE, fl2), \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 1, PPC_NONE, fl2), \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 2, opc3, 1, PPC_NONE, fl2), \
+GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 3, opc3, 1, PPC_NONE, fl2)
+
 #undef GEN_XX3_RC_FORM
 #define GEN_XX3_RC_FORM(name, opc2, opc3, fl2)  \
 GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0x00, opc3 | 0x00, 0, PPC_NONE, fl2), \
@@ -10731,8 +10738,8 @@ GEN_XX3FORM(xsnmaddadp, 0x04, 0x14, PPC2_VSX),
 GEN_XX3FORM(xsnmaddmdp, 0x04, 0x15, PPC2_VSX),
 GEN_XX3FORM(xsnmsubadp, 0x04, 0x16, PPC2_VSX),
 GEN_XX3FORM(xsnmsubmdp, 0x04, 0x17, PPC2_VSX),
-GEN_XX2FORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
-GEN_XX2FORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
+GEN_XX2IFORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
+GEN_XX2IFORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
 GEN_XX3FORM(xsmaxdp, 0x00, 0x14, PPC2_VSX),
 GEN_XX3FORM(xsmindp, 0x00, 0x15, PPC2_VSX),
 GEN_XX2FORM(xscvdpsp, 0x12, 0x10, PPC2_VSX),
-- 
2.1.4

[Qemu-devel] [PATCH 1/2] target-ppc: fix vcipher, vcipherlast, vncipherlast and vpermxor

2015-09-13 Thread Aurelien Jarno

For vector instructions, the helpers get pointers to the vector registers
as arguments. Some operands might point to the same register, including
the operand holding the result.

When emulating instructions which access the vector elements in a
non-linear way, we need to store the result in a temporary variable.

This fixes openssl when emulating a POWER8 CPU.
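
The aliasing problem can be shown with a minimal sketch. It assumes a
simplified 16-byte vector type (vec128, a hypothetical stand-in for
ppc_avr_t) and a byte-reversal operation chosen only because it reads
elements in a non-linear order; neither function exists in QEMU.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector register. */
typedef struct {
    uint8_t u8[16];
} vec128;

/* Writes directly into *r. If r and a point to the same register, later
   iterations read bytes that earlier iterations already overwrote. */
static void reverse_unsafe(vec128 *r, const vec128 *a)
{
    for (int i = 0; i < 16; i++) {
        r->u8[i] = a->u8[15 - i];
    }
}

/* Builds the result in a temporary and copies it out at the end, so the
   source is fully read before the destination is modified. */
static void reverse_safe(vec128 *r, const vec128 *a)
{
    vec128 result;

    for (int i = 0; i < 16; i++) {
        result.u8[i] = a->u8[15 - i];
    }
    *r = result;
}
```

Calling either function with distinct registers gives the same answer;
only the temporary version stays correct when the destination is also a
source.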

Cc: Tom Musta 
Cc: Alexander Graf 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Aurelien Jarno 
---
 target-ppc/int_helper.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 0a55d5e..b122868 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -2327,24 +2327,28 @@ void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
 
 void helper_vcipher(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
+ppc_avr_t result;
 int i;
 
 VECTOR_FOR_INORDER_I(i, u32) {
-r->AVRW(i) = b->AVRW(i) ^
+result.AVRW(i) = b->AVRW(i) ^
 (AES_Te0[a->AVRB(AES_shifts[4*i + 0])] ^
  AES_Te1[a->AVRB(AES_shifts[4*i + 1])] ^
  AES_Te2[a->AVRB(AES_shifts[4*i + 2])] ^
  AES_Te3[a->AVRB(AES_shifts[4*i + 3])]);
 }
+*r = result;
 }
 
 void helper_vcipherlast(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
+ppc_avr_t result;
 int i;
 
 VECTOR_FOR_INORDER_I(i, u8) {
-r->AVRB(i) = b->AVRB(i) ^ (AES_sbox[a->AVRB(AES_shifts[i])]);
+result.AVRB(i) = b->AVRB(i) ^ (AES_sbox[a->AVRB(AES_shifts[i])]);
 }
+*r = result;
 }
 
 void helper_vncipher(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
@@ -2369,11 +2373,13 @@ void helper_vncipher(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 
 void helper_vncipherlast(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
+ppc_avr_t result;
 int i;
 
 VECTOR_FOR_INORDER_I(i, u8) {
-r->AVRB(i) = b->AVRB(i) ^ (AES_isbox[a->AVRB(AES_ishifts[i])]);
+result.AVRB(i) = b->AVRB(i) ^ (AES_isbox[a->AVRB(AES_ishifts[i])]);
 }
+*r = result;
 }
 
 #define ROTRu32(v, n) (((v) >> (n)) | ((v) << (32-n)))
@@ -2460,16 +2466,19 @@ void helper_vshasigmad(ppc_avr_t *r,  ppc_avr_t *a, uint32_t st_six)
 
 void helper_vpermxor(ppc_avr_t *r,  ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
 {
+ppc_avr_t result;
 int i;
+
 VECTOR_FOR_INORDER_I(i, u8) {
 int indexA = c->u8[i] >> 4;
 int indexB = c->u8[i] & 0xF;
 #if defined(HOST_WORDS_BIGENDIAN)
-r->u8[i] = a->u8[indexA] ^ b->u8[indexB];
+result.u8[i] = a->u8[indexA] ^ b->u8[indexB];
 #else
-r->u8[i] = a->u8[15-indexA] ^ b->u8[15-indexB];
+result.u8[i] = a->u8[15-indexA] ^ b->u8[15-indexB];
 #endif
 }
+*r = result;
 }
 
 #undef VECTOR_FOR_INORDER_I
-- 
2.1.4

[Qemu-devel] [PATCH 0/2] target-mips: get rid of old debugging code

2015-09-13 Thread Aurelien Jarno

This patchset gets rid of old debugging code in translate.c that has
been superseded by other debugging methods (e.g. -d in_asm,op). It comes
from the discussion there:

  https://lists.gnu.org/archive/html/qemu-devel/2015-07/msg03162.html

I have had it ready for some time; now that 2.4 has been released, it's
probably time to apply it.

Aurelien Jarno (2):
  target-mips: get rid of MIPS_DEBUG
  target-mips: get rid of MIPS_DEBUG_SIGN_EXTENSIONS

 target-mips/translate.c | 663 ++--
 1 file changed, 19 insertions(+), 644 deletions(-)

-- 
2.1.4

[Qemu-devel] [PATCH 2/2] target-mips: get rid of MIPS_DEBUG_SIGN_EXTENSIONS

2015-09-13 Thread Aurelien Jarno

MIPS_DEBUG_SIGN_EXTENSIONS was used some time ago to verify that 32-bit
instructions correctly sign extend their results. It is no longer
needed, so remove it.
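
For readers unfamiliar with the check, here is a minimal sketch of what
"correctly sign-extended" means for a 32-bit result held in a 64-bit
register. The helper name is_sign_extended32 is hypothetical; this is an
illustration of the property, not the code being removed.

```c
#include <stdint.h>

/* Returns 1 when a 64-bit register value is a properly sign-extended
   32-bit value, i.e. bits 63..31 are either all zero or all one. */
static int is_sign_extended32(uint64_t val)
{
    const uint64_t upper = ~UINT64_C(0x7fffffff);   /* bits 63..31 */
    uint64_t hi = val & upper;

    return hi == 0 || hi == upper;
}
```

A value such as 0x80000000 with a zero upper half fails the check,
because bit 31 is set but is not replicated into bits 63..32.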

Cc: Leon Alrae 
Signed-off-by: Aurelien Jarno 
---
 target-mips/translate.c | 39 ---
 1 file changed, 39 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 36bc25d..d865a83 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -33,9 +33,7 @@
 
 #include "trace-tcg.h"
 
-
 #define MIPS_DEBUG_DISAS 0
-//#define MIPS_DEBUG_SIGN_EXTENSIONS
 
 /* MIPS major opcodes */
 #define MASK_OP_MAJOR(op)  (op & (0x3F << 26))
@@ -19806,40 +19804,6 @@ static void fpu_dump_state(CPUMIPSState *env, FILE *f, fprintf_function fpu_fpri
 #undef printfpr
 }
 
-#if defined(TARGET_MIPS64) && defined(MIPS_DEBUG_SIGN_EXTENSIONS)
-/* Debug help: The architecture requires 32bit code to maintain proper
-   sign-extended values on 64bit machines.  */
-
-#define SIGN_EXT_P(val) val) & ~0x7fff) == 0) || (((val) & 
~0x7fff) == ~0x7fff))
-
-static void
-cpu_mips_check_sign_extensions (CPUMIPSState *env, FILE *f,
-fprintf_function cpu_fprintf,
-int flags)
-{
-int i;
-
-if (!SIGN_EXT_P(env->active_tc.PC))
-cpu_fprintf(f, "BROKEN: pc=0x" TARGET_FMT_lx "\n", env->active_tc.PC);
-if (!SIGN_EXT_P(env->active_tc.HI[0]))
-cpu_fprintf(f, "BROKEN: HI=0x" TARGET_FMT_lx "\n", env->active_tc.HI[0]);
-if (!SIGN_EXT_P(env->active_tc.LO[0]))
-cpu_fprintf(f, "BROKEN: LO=0x" TARGET_FMT_lx "\n", env->active_tc.LO[0]);
-if (!SIGN_EXT_P(env->btarget))
-cpu_fprintf(f, "BROKEN: btarget=0x" TARGET_FMT_lx "\n", env->btarget);
-
-for (i = 0; i < 32; i++) {
-if (!SIGN_EXT_P(env->active_tc.gpr[i]))
-cpu_fprintf(f, "BROKEN: %s=0x" TARGET_FMT_lx "\n", regnames[i], env->active_tc.gpr[i]);
-}
-
-if (!SIGN_EXT_P(env->CP0_EPC))
-cpu_fprintf(f, "BROKEN: EPC=0x" TARGET_FMT_lx "\n", env->CP0_EPC);
-if (!SIGN_EXT_P(env->lladdr))
-cpu_fprintf(f, "BROKEN: LLAddr=0x" TARGET_FMT_lx "\n", env->lladdr);
-}
-#endif
-
 void mips_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
  int flags)
 {
@@ -19871,9 +19835,6 @@ void mips_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
 env->CP0_Config4, env->CP0_Config5);
 if (env->hflags & MIPS_HFLAG_FPU)
 fpu_dump_state(env, f, cpu_fprintf, flags);
-#if defined(TARGET_MIPS64) && defined(MIPS_DEBUG_SIGN_EXTENSIONS)
-cpu_mips_check_sign_extensions(env, f, cpu_fprintf, flags);
-#endif
 }
 
 void mips_tcg_init(void)
-- 
2.1.4

[Qemu-devel] [PATCH 1/2] target-mips: get rid of MIPS_DEBUG

2015-09-13 Thread Aurelien Jarno

MIPS_DEBUG is a define used to dump the instruction disassembly. It
has to be defined at compile time. In practice I believe it's more
efficient to just look at the instruction disassembly and op dump using
-d in_asm,op. This patch therefore removes the corresponding code, which
clutters translate.c.

Cc: Leon Alrae 
Signed-off-by: Aurelien Jarno 
---
 target-mips/translate.c | 624 ++--
 1 file changed, 19 insertions(+), 605 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 93cb4f2..36bc25d 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -1482,26 +1482,23 @@ static const char * const msaregnames[] = {
 "w30.d0", "w30.d1", "w31.d0", "w31.d1",
 };
 
-#define MIPS_DEBUG(fmt, ...)  \
+#define LOG_DISAS(...)\
 do {  \
 if (MIPS_DEBUG_DISAS) {   \
-qemu_log_mask(CPU_LOG_TB_IN_ASM,  \
-  TARGET_FMT_lx ": %08x " fmt "\n",   \
-  ctx->pc, ctx->opcode , ## __VA_ARGS__); \
+qemu_log_mask(CPU_LOG_TB_IN_ASM, ## __VA_ARGS__); \
 } \
 } while (0)
 
-#define LOG_DISAS(...)\
+#define MIPS_INVAL(op)\
 do {  \
 if (MIPS_DEBUG_DISAS) {   \
-qemu_log_mask(CPU_LOG_TB_IN_ASM, ## __VA_ARGS__); \
+qemu_log_mask(CPU_LOG_TB_IN_ASM,  \
+  TARGET_FMT_lx ": %08x Invalid %s %03x %03x %03x\n", \
+  ctx->pc, ctx->opcode, op, ctx->opcode >> 26,\
+  ctx->opcode & 0x3F, ((ctx->opcode >> 16) & 0x1F));  \
 } \
 } while (0)
 
-#define MIPS_INVAL(op)\
-MIPS_DEBUG("Invalid %s %03x %03x %03x", op, ctx->opcode >> 26,\
-   ctx->opcode & 0x3F, ((ctx->opcode >> 16) & 0x1F))
-
 /* General purpose registers moves. */
 static inline void gen_load_gpr (TCGv t, int reg)
 {
@@ -2105,14 +2102,12 @@ static target_ulong pc_relative_pc (DisasContext *ctx)
 static void gen_ld(DisasContext *ctx, uint32_t opc,
int rt, int base, int16_t offset)
 {
-const char *opn = "ld";
 TCGv t0, t1, t2;
 
 if (rt == 0 && ctx->insn_flags & (INSN_LOONGSON2E | INSN_LOONGSON2F)) {
 /* Loongson CPU uses a load to zero register for prefetch.
We emulate it as a NOP. On other CPU we must perform the
actual memory access. */
-MIPS_DEBUG("NOP");
 return;
 }
 
@@ -2125,20 +2120,17 @@ static void gen_ld(DisasContext *ctx, uint32_t opc,
 tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx, MO_TEUL |
ctx->default_tcg_memop_mask);
 gen_store_gpr(t0, rt);
-opn = "lwu";
 break;
 case OPC_LD:
 tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx, MO_TEQ |
ctx->default_tcg_memop_mask);
 gen_store_gpr(t0, rt);
-opn = "ld";
 break;
 case OPC_LLD:
 case R6_OPC_LLD:
 save_cpu_state(ctx, 1);
 op_ld_lld(t0, t0, ctx);
 gen_store_gpr(t0, rt);
-opn = "lld";
 break;
 case OPC_LDL:
 t1 = tcg_temp_new();
@@ -2161,7 +2153,6 @@ static void gen_ld(DisasContext *ctx, uint32_t opc,
 tcg_gen_or_tl(t0, t0, t1);
 tcg_temp_free(t1);
 gen_store_gpr(t0, rt);
-opn = "ldl";
 break;
 case OPC_LDR:
 t1 = tcg_temp_new();
@@ -2185,7 +2176,6 @@ static void gen_ld(DisasContext *ctx, uint32_t opc,
 tcg_gen_or_tl(t0, t0, t1);
 tcg_temp_free(t1);
 gen_store_gpr(t0, rt);
-opn = "ldr";
 break;
 case OPC_LDPC:
 t1 = tcg_const_tl(pc_relative_pc(ctx));
@@ -2193,7 +2183,6 @@ static void gen_ld(DisasContext *ctx, uint32_t opc,
 tcg_temp_free(t1);
 tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx, MO_TEQ);
 gen_store_gpr(t0, rt);
-opn = "ldpc";
 break;
 #endif
 case OPC_LWPC:
@@ -2202,35 +2191,29 @@ static void gen_ld(DisasContext *ctx, uint32_t opc,
 tcg_temp_free(t1);
 tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx, MO_TESL);
 gen_store_gpr(t0, rt);
-opn = "lwpc";
 break;
 case OPC_LW:
 tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx,

[Qemu-devel] [PULL 0/6] sh4-next queue

2015-09-13 Thread Aurelien Jarno

The following changes since commit 7b9c09f7d486647784c605739d69b708a7249c9b:

  Merge remote-tracking branch 'remotes/sstabellini/tags/xen-2015-09-10-tag' into staging (2015-09-10 18:25:52 +0100)

are available in the git repository at:

  git://git.aurel32.net/qemu.git tags/pull-sh4-next-20150913

for you to fetch changes up to cdd14a8cf25c34ff8d0777530e8d16565f6bf7a1:

  sh4: Fix initramfs initialization for endiannes-mismatched targets (2015-09-13 23:08:51 +0200)


sh4-next:

- TCG optimizations
- fix initramfs endianness issue


Aurelien Jarno (5):
  target-sh4: add flags markups for FP helpers
  target-sh4: use deposit in swap.b instruction
  target-sh4: improve cmp/str instruction
  target-sh4: improve shld instruction
  target-sh4: improve shad instruction

Guenter Roeck (1):
  sh4: Fix initramfs initialization for endiannes-mismatched targets

 hw/sh4/r2d.c   |   6 +--
 target-sh4/helper.h|  34 ++---
 target-sh4/translate.c | 126 -
 3 files changed, 71 insertions(+), 95 deletions(-)

-- 
2.1.4

[Qemu-devel] [PULL 5/6] target-sh4: improve shad instruction

2015-09-13 Thread Aurelien Jarno

The SH4 shad instruction can shift in both directions, depending on the
sign of the shift. This is currently implemented using branches, which
is not really efficient and prevents the optimizer from doing its job. In
practice it is often used with a constant loaded in a register just
before.

Simplify the implementation by computing both the value shifted to the
left and to the right, and then selecting the correct one with a
movcond. Because a negative value can yield a shift amount of up to 32,
which is undefined for a single shift, we shift the value in two steps.
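
The two-step arithmetic shift can be modelled in plain C. This is only a
sketch of the semantics described above: registers are treated as raw
32-bit patterns, the arithmetic right shift is emulated on unsigned
values so the code does not depend on implementation-defined behaviour,
and the names sar32 and shad_model are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right on a 32-bit pattern; n must be in 0..31.
   For a negative pattern, shifting the complement and complementing
   again fills the vacated high bits with ones. */
static uint32_t sar32(uint32_t x, unsigned n)
{
    assert(n < 32);
    return (x & 0x80000000u) ? ~(~x >> n) : (x >> n);
}

/* Branch-free model of shad Rm,Rn: compute both candidates, then select. */
static uint32_t shad_model(uint32_t rn, uint32_t rm)
{
    uint32_t k = rm & 0x1f;

    /* positive case: shift to the left by k */
    uint32_t left = rn << k;

    /* negative case: shift right by (31 - k), then by 1, for a total of
       32 - k, which reaches 32 when k == 0 without a single 32-bit shift */
    uint32_t right = sar32(sar32(rn, k ^ 0x1f), 1);

    /* select on the sign of Rm */
    return (rm & 0x80000000u) ? right : left;
}
```

For Rm = -32 the low five bits are zero, so the right-shift path shifts
by 31 and then by 1, leaving only copies of the sign bit of Rn.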

Reviewed-by: Richard Henderson 
Signed-off-by: Aurelien Jarno 
---
 target-sh4/translate.c | 53 +-
 1 file changed, 22 insertions(+), 31 deletions(-)

diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index c8dd3a7..724c0e7 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -832,37 +832,28 @@ static void _decode_opc(DisasContext * ctx)
return;
 case 0x400c:   /* shad Rm,Rn */
{
-TCGLabel *label1 = gen_new_label();
-TCGLabel *label2 = gen_new_label();
-TCGLabel *label3 = gen_new_label();
-TCGLabel *label4 = gen_new_label();
-   TCGv shift;
-   tcg_gen_brcondi_i32(TCG_COND_LT, REG(B7_4), 0, label1);
-   /* Rm positive, shift to the left */
-shift = tcg_temp_new();
-   tcg_gen_andi_i32(shift, REG(B7_4), 0x1f);
-   tcg_gen_shl_i32(REG(B11_8), REG(B11_8), shift);
-   tcg_temp_free(shift);
-   tcg_gen_br(label4);
-   /* Rm negative, shift to the right */
-   gen_set_label(label1);
-shift = tcg_temp_new();
-   tcg_gen_andi_i32(shift, REG(B7_4), 0x1f);
-   tcg_gen_brcondi_i32(TCG_COND_EQ, shift, 0, label2);
-   tcg_gen_not_i32(shift, REG(B7_4));
-   tcg_gen_andi_i32(shift, shift, 0x1f);
-   tcg_gen_addi_i32(shift, shift, 1);
-   tcg_gen_sar_i32(REG(B11_8), REG(B11_8), shift);
-   tcg_temp_free(shift);
-   tcg_gen_br(label4);
-   /* Rm = -32 */
-   gen_set_label(label2);
-   tcg_gen_brcondi_i32(TCG_COND_LT, REG(B11_8), 0, label3);
-   tcg_gen_movi_i32(REG(B11_8), 0);
-   tcg_gen_br(label4);
-   gen_set_label(label3);
-   tcg_gen_movi_i32(REG(B11_8), 0x);
-   gen_set_label(label4);
+TCGv t0 = tcg_temp_new();
+TCGv t1 = tcg_temp_new();
+TCGv t2 = tcg_temp_new();
+
+tcg_gen_andi_i32(t0, REG(B7_4), 0x1f);
+
+/* positive case: shift to the left */
+tcg_gen_shl_i32(t1, REG(B11_8), t0);
+
+/* negative case: shift to the right in two steps to
+   correctly handle the -32 case */
+tcg_gen_xori_i32(t0, t0, 0x1f);
+tcg_gen_sar_i32(t2, REG(B11_8), t0);
+tcg_gen_sari_i32(t2, t2, 1);
+
+/* select between the two cases */
+tcg_gen_movi_i32(t0, 0);
+tcg_gen_movcond_i32(TCG_COND_GE, REG(B11_8), REG(B7_4), t0, t1, t2);
+
+tcg_temp_free(t0);
+tcg_temp_free(t1);
+tcg_temp_free(t2);
}
return;
 case 0x400d:   /* shld Rm,Rn */
-- 
2.1.4

[Qemu-devel] [PULL 4/6] target-sh4: improve shld instruction

2015-09-13 Thread Aurelien Jarno

The SH4 shld instruction can shift in both directions, depending on the
sign of the shift. This is currently implemented using branches, which
is not really efficient and prevents the optimizer from doing its job. In
practice it is often used with a constant loaded in a register just
before.

Simplify the implementation by computing both the value shifted to the
left and to the right, and then selecting the correct one with a
movcond. Because a negative value can yield a shift amount of up to 32,
which is undefined for a single shift, we shift the value in two steps.
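
For the logical variant, the same idea reduces to a few lines of C. The
function name shld_model is hypothetical and the code is a sketch of the
behaviour described in this message, not the TCG implementation.

```c
#include <stdint.h>

/* Branch-free model of shld Rm,Rn on raw 32-bit register patterns. */
static uint32_t shld_model(uint32_t rn, uint32_t rm)
{
    uint32_t k = rm & 0x1f;

    /* positive case: logical shift left by k */
    uint32_t left = rn << k;

    /* negative case: logical shift right by (31 - k) and then by 1,
       so a total shift of 32 (k == 0) yields 0 without undefined
       behaviour */
    uint32_t right = (rn >> (k ^ 0x1f)) >> 1;

    return (rm & 0x80000000u) ? right : left;
}
```

Unlike the arithmetic case, the vacated bits are always zero, so
Rm = -32 clears the register regardless of the sign of Rn.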

Reviewed-by: Richard Henderson 
Signed-off-by: Aurelien Jarno 
---
 target-sh4/translate.c | 48 ++--
 1 file changed, 22 insertions(+), 26 deletions(-)

diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index ca6ef5a..c8dd3a7 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -867,32 +867,28 @@ static void _decode_opc(DisasContext * ctx)
return;
 case 0x400d:   /* shld Rm,Rn */
{
-TCGLabel *label1 = gen_new_label();
-TCGLabel *label2 = gen_new_label();
-TCGLabel *label3 = gen_new_label();
-   TCGv shift;
-   tcg_gen_brcondi_i32(TCG_COND_LT, REG(B7_4), 0, label1);
-   /* Rm positive, shift to the left */
-shift = tcg_temp_new();
-   tcg_gen_andi_i32(shift, REG(B7_4), 0x1f);
-   tcg_gen_shl_i32(REG(B11_8), REG(B11_8), shift);
-   tcg_temp_free(shift);
-   tcg_gen_br(label3);
-   /* Rm negative, shift to the right */
-   gen_set_label(label1);
-shift = tcg_temp_new();
-   tcg_gen_andi_i32(shift, REG(B7_4), 0x1f);
-   tcg_gen_brcondi_i32(TCG_COND_EQ, shift, 0, label2);
-   tcg_gen_not_i32(shift, REG(B7_4));
-   tcg_gen_andi_i32(shift, shift, 0x1f);
-   tcg_gen_addi_i32(shift, shift, 1);
-   tcg_gen_shr_i32(REG(B11_8), REG(B11_8), shift);
-   tcg_temp_free(shift);
-   tcg_gen_br(label3);
-   /* Rm = -32 */
-   gen_set_label(label2);
-   tcg_gen_movi_i32(REG(B11_8), 0);
-   gen_set_label(label3);
+TCGv t0 = tcg_temp_new();
+TCGv t1 = tcg_temp_new();
+TCGv t2 = tcg_temp_new();
+
+tcg_gen_andi_i32(t0, REG(B7_4), 0x1f);
+
+/* positive case: shift to the left */
+tcg_gen_shl_i32(t1, REG(B11_8), t0);
+
+/* negative case: shift to the right in two steps to
+   correctly handle the -32 case */
+tcg_gen_xori_i32(t0, t0, 0x1f);
+tcg_gen_shr_i32(t2, REG(B11_8), t0);
+tcg_gen_shri_i32(t2, t2, 1);
+
+/* select between the two cases */
+tcg_gen_movi_i32(t0, 0);
+tcg_gen_movcond_i32(TCG_COND_GE, REG(B11_8), REG(B7_4), t0, t1, t2);
+
+tcg_temp_free(t0);
+tcg_temp_free(t1);
+tcg_temp_free(t2);
}
return;
 case 0x3008:   /* sub Rm,Rn */
-- 
2.1.4

[Qemu-devel] [PULL 1/6] target-sh4: add flags markups for FP helpers

2015-09-13 Thread Aurelien Jarno

Most floating point helpers can trigger an exception, but don't change
the globals. Mark these helpers as TCG_CALL_NO_WG.

Reviewed-by: Richard Henderson 
Signed-off-by: Aurelien Jarno 
---
 target-sh4/helper.h | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/target-sh4/helper.h b/target-sh4/helper.h
index c9bc407..dce859c 100644
--- a/target-sh4/helper.h
+++ b/target-sh4/helper.h
@@ -18,28 +18,28 @@ DEF_HELPER_2(ld_fpscr, void, env, i32)
 
 DEF_HELPER_FLAGS_1(fabs_FT, TCG_CALL_NO_RWG_SE, f32, f32)
 DEF_HELPER_FLAGS_1(fabs_DT, TCG_CALL_NO_RWG_SE, f64, f64)
-DEF_HELPER_3(fadd_FT, f32, env, f32, f32)
-DEF_HELPER_3(fadd_DT, f64, env, f64, f64)
-DEF_HELPER_2(fcnvsd_FT_DT, f64, env, f32)
-DEF_HELPER_2(fcnvds_DT_FT, f32, env, f64)
+DEF_HELPER_FLAGS_3(fadd_FT, TCG_CALL_NO_WG, f32, env, f32, f32)
+DEF_HELPER_FLAGS_3(fadd_DT, TCG_CALL_NO_WG, f64, env, f64, f64)
+DEF_HELPER_FLAGS_2(fcnvsd_FT_DT, TCG_CALL_NO_WG, f64, env, f32)
+DEF_HELPER_FLAGS_2(fcnvds_DT_FT, TCG_CALL_NO_WG, f32, env, f64)
 
 DEF_HELPER_3(fcmp_eq_FT, void, env, f32, f32)
 DEF_HELPER_3(fcmp_eq_DT, void, env, f64, f64)
 DEF_HELPER_3(fcmp_gt_FT, void, env, f32, f32)
 DEF_HELPER_3(fcmp_gt_DT, void, env, f64, f64)
-DEF_HELPER_3(fdiv_FT, f32, env, f32, f32)
-DEF_HELPER_3(fdiv_DT, f64, env, f64, f64)
-DEF_HELPER_2(float_FT, f32, env, i32)
-DEF_HELPER_2(float_DT, f64, env, i32)
-DEF_HELPER_4(fmac_FT, f32, env, f32, f32, f32)
-DEF_HELPER_3(fmul_FT, f32, env, f32, f32)
-DEF_HELPER_3(fmul_DT, f64, env, f64, f64)
+DEF_HELPER_FLAGS_3(fdiv_FT, TCG_CALL_NO_WG, f32, env, f32, f32)
+DEF_HELPER_FLAGS_3(fdiv_DT, TCG_CALL_NO_WG, f64, env, f64, f64)
+DEF_HELPER_FLAGS_2(float_FT, TCG_CALL_NO_WG, f32, env, i32)
+DEF_HELPER_FLAGS_2(float_DT, TCG_CALL_NO_WG, f64, env, i32)
+DEF_HELPER_FLAGS_4(fmac_FT, TCG_CALL_NO_WG, f32, env, f32, f32, f32)
+DEF_HELPER_FLAGS_3(fmul_FT, TCG_CALL_NO_WG, f32, env, f32, f32)
+DEF_HELPER_FLAGS_3(fmul_DT, TCG_CALL_NO_WG, f64, env, f64, f64)
 DEF_HELPER_FLAGS_1(fneg_T, TCG_CALL_NO_RWG_SE, f32, f32)
-DEF_HELPER_3(fsub_FT, f32, env, f32, f32)
-DEF_HELPER_3(fsub_DT, f64, env, f64, f64)
-DEF_HELPER_2(fsqrt_FT, f32, env, f32)
-DEF_HELPER_2(fsqrt_DT, f64, env, f64)
-DEF_HELPER_2(ftrc_FT, i32, env, f32)
-DEF_HELPER_2(ftrc_DT, i32, env, f64)
+DEF_HELPER_FLAGS_3(fsub_FT, TCG_CALL_NO_WG, f32, env, f32, f32)
+DEF_HELPER_FLAGS_3(fsub_DT, TCG_CALL_NO_WG, f64, env, f64, f64)
+DEF_HELPER_FLAGS_2(fsqrt_FT, TCG_CALL_NO_WG, f32, env, f32)
+DEF_HELPER_FLAGS_2(fsqrt_DT, TCG_CALL_NO_WG, f64, env, f64)
+DEF_HELPER_FLAGS_2(ftrc_FT, TCG_CALL_NO_WG, i32, env, f32)
+DEF_HELPER_FLAGS_2(ftrc_DT, TCG_CALL_NO_WG, i32, env, f64)
 DEF_HELPER_3(fipr, void, env, i32, i32)
 DEF_HELPER_2(ftrv, void, env, i32)
-- 
2.1.4

[Qemu-devel] [PULL 6/6] sh4: Fix initramfs initialization for endiannes-mismatched targets

2015-09-13 Thread Aurelien Jarno

From: Guenter Roeck 

If host and target endianness do not match, loading an initramfs does not work.
Fix by writing boot parameters with appropriate endianness conversion.
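
The conversion can be pictured with the following self-contained sketch.
It does not use QEMU's tswap32; swap32_sketch, host_is_big_endian and
to_target32 are hypothetical helpers, and the target endianness is
passed as a runtime flag purely for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Reverses the byte order of a 32-bit value. */
static uint32_t swap32_sketch(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Detects host byte order by looking at the first byte of the value 1. */
static int host_is_big_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Returns v laid out so that its in-memory bytes match the target's
   byte order; swaps only when host and target disagree. */
static uint32_t to_target32(uint32_t v, int target_big_endian)
{
    int mismatch = host_is_big_endian() != (target_big_endian != 0);

    return mismatch ? swap32_sketch(v) : v;
}
```

When the two byte orders agree the value is stored unchanged, which is
why the bug only showed up on mismatched host/target combinations.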

Signed-off-by: Guenter Roeck 
Signed-off-by: Aurelien Jarno 
---
 hw/sh4/r2d.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/sh4/r2d.c b/hw/sh4/r2d.c
index 5e22ed7..3b0b2ec 100644
--- a/hw/sh4/r2d.c
+++ b/hw/sh4/r2d.c
@@ -338,9 +338,9 @@ static void r2d_init(MachineState *machine)
 }
 
 /* initialization which should be done by firmware */
-boot_params.loader_type = 1;
-boot_params.initrd_start = INITRD_LOAD_OFFSET;
-boot_params.initrd_size = initrd_size;
+boot_params.loader_type = tswap32(1);
+boot_params.initrd_start = tswap32(INITRD_LOAD_OFFSET);
+boot_params.initrd_size = tswap32(initrd_size);
 }
 
 if (kernel_cmdline) {
-- 
2.1.4

[Qemu-devel] [PULL 2/6] target-sh4: use deposit in swap.b instruction

2015-09-13 Thread Aurelien Jarno

Reviewed-by: Richard Henderson 
Signed-off-by: Aurelien Jarno 
---
 target-sh4/translate.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index be0cb32..50043cf 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -612,15 +612,11 @@ static void _decode_opc(DisasContext * ctx)
return;
 case 0x6008:   /* swap.b Rm,Rn */
{
-   TCGv high, low;
-   high = tcg_temp_new();
-   tcg_gen_andi_i32(high, REG(B7_4), 0x);
-   low = tcg_temp_new();
+TCGv low = tcg_temp_new();;
tcg_gen_ext16u_i32(low, REG(B7_4));
tcg_gen_bswap16_i32(low, low);
-   tcg_gen_or_i32(REG(B11_8), high, low);
+tcg_gen_deposit_i32(REG(B11_8), REG(B7_4), low, 0, 16);
tcg_temp_free(low);
-   tcg_temp_free(high);
}
return;
 case 0x6009:   /* swap.w Rm,Rn */
-- 
2.1.4

[Qemu-devel] [PULL 3/6] target-sh4: improve cmp/str instruction

2015-09-13 Thread Aurelien Jarno

Instead of testing bytes one by one, we can use the following trick
from https://graphics.stanford.edu/~seander/bithacks.html:

  haszero(v) = (v - 0x01010101) & ~v & 0x80808080

The sub-expression v - 0x01010101 evaluates to a high bit set in any
byte whenever the corresponding byte in v is zero or greater than 0x80.
The sub-expression ~v & 0x80808080 evaluates to high bits set in bytes
where the byte of v doesn't have its high bit set (so the byte was less
than 0x80). Finally, by ANDing these two sub-expressions the result is
the high bits set where the bytes in v were zero, since the high bits
set due to a value greater than 0x80 in the first sub-expression are
masked off by the second.
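
Applied to cmp/str, the trick is used on the XOR of the two registers,
since a zero byte in the XOR means the corresponding bytes are equal.
The plain C sketch below models that; the function name cmp_str_model is
hypothetical and the return value stands in for the T bit.

```c
#include <stdint.h>

/* Returns 1 when at least one of the four bytes of rm equals the byte
   at the same position in rn, and 0 otherwise. */
static int cmp_str_model(uint32_t rm, uint32_t rn)
{
    uint32_t v = rm ^ rn;   /* equal bytes become zero bytes */

    /* haszero(v): nonzero if and only if some byte of v is zero */
    return ((v - 0x01010101u) & ~v & 0x80808080u) != 0;
}
```

The borrow from the subtraction can flag a byte above a zero byte, but
it can never produce a flag when no byte is zero, so the "any byte
matches" answer is exact.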

Reviewed-by: Richard Henderson 
Signed-off-by: Aurelien Jarno 
---
 target-sh4/translate.c | 17 +
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index 50043cf..ca6ef5a 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -688,18 +688,11 @@ static void _decode_opc(DisasContext * ctx)
{
TCGv cmp1 = tcg_temp_new();
TCGv cmp2 = tcg_temp_new();
-   tcg_gen_xor_i32(cmp1, REG(B7_4), REG(B11_8));
-   tcg_gen_andi_i32(cmp2, cmp1, 0xff00);
-tcg_gen_setcondi_i32(TCG_COND_EQ, cpu_sr_t, cmp2, 0);
-   tcg_gen_andi_i32(cmp2, cmp1, 0x00ff);
-   tcg_gen_setcondi_i32(TCG_COND_EQ, cmp2, cmp2, 0);
-tcg_gen_or_i32(cpu_sr_t, cpu_sr_t, cmp2);
-   tcg_gen_andi_i32(cmp2, cmp1, 0xff00);
-   tcg_gen_setcondi_i32(TCG_COND_EQ, cmp2, cmp2, 0);
-tcg_gen_or_i32(cpu_sr_t, cpu_sr_t, cmp2);
-   tcg_gen_andi_i32(cmp2, cmp1, 0x00ff);
-   tcg_gen_setcondi_i32(TCG_COND_EQ, cmp2, cmp2, 0);
-tcg_gen_or_i32(cpu_sr_t, cpu_sr_t, cmp2);
+tcg_gen_xor_i32(cmp2, REG(B7_4), REG(B11_8));
+tcg_gen_subi_i32(cmp1, cmp2, 0x01010101);
+tcg_gen_andc_i32(cmp1, cmp1, cmp2);
+tcg_gen_andi_i32(cmp1, cmp1, 0x80808080);
+tcg_gen_setcondi_i32(TCG_COND_NE, cpu_sr_t, cmp1, 0);
tcg_temp_free(cmp2);
tcg_temp_free(cmp1);
}
-- 
2.1.4

Re: [Qemu-devel] [PATCH v2] hw/misc/zynq_slcr: Change CPU clock rate for Linux boots

2015-09-13 Thread Guenter Roeck


Peter,

On 09/13/2015 01:47 PM, Peter Maydell wrote:
> On 13 September 2015 at 21:22, Peter Crosthwaite
>  wrote:
>> On Sat, Sep 12, 2015 at 2:06 PM, Guenter Roeck  wrote:
>>> The Linux kernel only accepts 34 Khz and 67 Khz clock rates, and
>>> may crash if the actual clock rate is too low. The clock rate used to be
>>> (ps-clk-frequency * 26 / 4), which resulted in a CPU frequency of
>>> 21 Khz if ps-clk-frequency was set to  Hz. Change it to
>>> (ps-clk-frequency * 20 / 2) = 33 Khz for to make Linux happy.
>>> Limit the change to Linux boots only.
>>>
>>> Signed-off-by: Guenter Roeck 
>>>
>>
>> Reviewed-by: Peter Crosthwaite 
>>
>> Can this go via target-arm? (cc PMM).
>>
>> There may be more changes worth making on is_linux. I don't have the
>> patch with the full list of FSBL-related SLCR changes handy and can't
>> seem to find it in any modern Yocto trees. Wondering if Yocto still
>> supports booting Zynq without FSBL (Nathan/Alistair may know more)?
>
> I'd prefer us not to propagate lots of "only if Linux boot"
> changes into devices. The GIC *must* have these because the
> kernel can't configure it otherwise from non-secure mode.
> I'm not sure that applies here.
>

Not sure I understand. Is this a NACK ?

Thanks,
Guenter

Re: [Qemu-devel] [PATCH v2] hw/misc/zynq_slcr: Change CPU clock rate for Linux boots

2015-09-13 Thread Peter Crosthwaite

On Sun, Sep 13, 2015 at 1:47 PM, Peter Maydell  wrote:
> On 13 September 2015 at 21:22, Peter Crosthwaite
>  wrote:
>> On Sat, Sep 12, 2015 at 2:06 PM, Guenter Roeck  wrote:
>>> The Linux kernel only accepts 34 Khz and 67 Khz clock rates, and
>>> may crash if the actual clock rate is too low. The clock rate used to be
>>> (ps-clk-frequency * 26 / 4), which resulted in a CPU frequency of
>>> 21 Khz if ps-clk-frequency was set to  Hz. Change it to
>>> (ps-clk-frequency * 20 / 2) = 33 Khz for to make Linux happy.
>>> Limit the change to Linux boots only.
>>>
>>> Signed-off-by: Guenter Roeck 
>>>
>>
>> Reviewed-by: Peter Crosthwaite 
>>
>> Can this go via target-arm? (cc PMM).
>>
>> There may be more changes worth making on is_linux. I don't have the
>> patch with the full list of FSBL-related SLCR changes handy and can't
>> seem to find it in any modern Yocto trees. Wondering if Yocto still
>> supports booting Zynq without FSBL (Nathan/Alistair may know more)?
>
> I'd prefer us not to propagate lots of "only if Linux boot"
> changes into devices. The GIC *must* have these because the
> kernel can't configure it otherwise from non-secure mode.
> I'm not sure that applies here.
>

At least this change is a must. I have had this discussion with kernel
people before, and they insist that initing the PLLs and clocks to
desired values is the job of the bootloader and that the kernel reads
back the values from this core. It is the same philosophy as the GIC
init, which is, at the end of the day, done by some pre-boot software.
The same bootloader (FSBL) makes other changes that kernels past, present
and future may rely on, and it would be good to have those.

Regards,
Peter

> thanks
> -- PMM

Re: [Qemu-devel] [opnfv-tech-discuss] rfc: vhost user enhancements for vm2vm communication

2015-09-13 Thread Zhang, Yang Z

Michael S. Tsirkin wrote on 2015-09-13:
> On Fri, Sep 11, 2015 at 05:39:07PM +0200, Claudio Fontana wrote:
>> On 09.09.2015 09:06, Michael S. Tsirkin wrote:
>> 
>> There are many consequences to this, offset within BAR alone is not
>> enough, there are multiple things at the virtio level that need sorting
>> out. Also we need to consider virtio-mmio etc.
>> 
>>> This would allow VM2VM communication if there are only 2 VMs, but
>>> if data needs to be sent to multiple VMs, you must copy it.
>> 
>> Not necessarily, however getting it to work (sharing the backend window
>> and arbitrating the multicast) is really hard.
>> 
>>> 
>>> Additionally, it's a single-purpose feature: you can use it from a
>>> userspace PMD but linux will never use it.
>>> 
>>> 
>>> My proposal is a superset: don't require that BAR memory is used,
>>> use IOMMU translation tables.
>>> This way, data can be sent to multiple VMs by sharing the same
>>> memory with them all.
>> 
>> Can you describe in detail how your proposal deals with the
>> arbitration necessary for multicast handling?
> 
> Basically it falls out naturally. Consider linux guest as an example,
> and assume dynamic mappings for simplicity.
> 
> Multicast is done by a bridge on the guest side. That code clones the
> skb (reference-counting page fragments) and passes it to multiple ports.
> Each of these will program the IOMMU to allow read access to the
> fragments to the relevant device.

How would this work with a host-side vswitch such as OVS? The flow table
lives inside the host, and the guest cannot see it.

Best regards,
Yang

[Qemu-devel] [PATCH v8 3/7] scripts: Submit changes while updating linux headers

2015-09-13 Thread Gavin Shan

This submits changes with formatted commit log while updating Linux
headers using scripts/update-linux-headers.sh.

Signed-off-by: Gavin Shan 
---
 scripts/update-linux-headers.sh | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index 18daabe..a345632 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -63,6 +63,34 @@ cp_virtio() {
 fi
 }
 
+submit_change() {
+from=$1
+to=$2
+if ! [ -e "$to/include/qemu-common.h" ]; then
+echo "$to not QEMU source directory, skip submitting changes"
+exit 3
+fi
+
+version=$(make -C "$from" -s kernelversion)
+commit=$(git -C "$from" rev-parse --short HEAD)
+message=$(cat <$output/include/standard-headers/linux/if_ether.h
 EOF
 
 rm -rf "$tmpdir"
+
+submit_change "$linux" "$output"
-- 
2.1.0

[Qemu-devel] [PATCH v8 2/7] scripts: Include arch/powerpc/include/uapi/asm/eeh.h

2015-09-13 Thread Gavin Shan

This includes linux/arch/powerpc/include/uapi/asm/eeh.h while
updating Linux header files. The header file was introduced by the
following Linux upstream commits for EEH on the sPAPR platform:

  ed3e81f ("powerpc/eeh: Move PE state constants around")
  ec33d36 ("powerpc/eeh: Introduce eeh_pe_inject_err()")

Signed-off-by: Gavin Shan 
Reviewed-by: David Gibson 
---
 scripts/update-linux-headers.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index 2fddf2e..18daabe 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -90,6 +90,7 @@ for arch in $ARCHLIST; do
 cp "$tmpdir/include/asm/hyperv.h" "$output/linux-headers/asm-x86"
 fi
 if [ $arch = powerpc ]; then
+cp "$tmpdir/include/asm/eeh.h" "$output/linux-headers/asm-powerpc/"
 cp "$tmpdir/include/asm/epapr_hcalls.h" "$output/linux-headers/asm-powerpc/"
 fi
 
-- 
2.1.0

[Qemu-devel] [PATCH v8 4/7] Synchronize Linux headers from kernel 4.3.0-rc1

2015-09-13 Thread Gavin Shan

Synchronize the Linux headers from kernel version 4.3.0-rc1
(commit 6ff33f3)

This commit was created automatically by update-linux-headers.sh.

Signed-off-by: Gavin Shan 
---
 include/standard-headers/linux/pci_regs.h| 381 ---
 include/standard-headers/linux/virtio_ring.h |   3 +-
 linux-headers/asm-arm64/kvm.h|  37 ++-
 linux-headers/asm-powerpc/eeh.h  |  56 
 linux-headers/asm-x86/hyperv.h   |   4 +
 linux-headers/asm-x86/kvm.h  |   4 +-
 linux-headers/linux/kvm.h|   7 +
 7 files changed, 391 insertions(+), 101 deletions(-)
 create mode 100644 linux-headers/asm-powerpc/eeh.h

diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
index 57e8c80..413417f 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -13,10 +13,10 @@
  * PCI to PCI Bridge Specification
  * PCI System Design Guide
  *
- * For hypertransport information, please consult the following manuals
- * from http://www.hypertransport.org
+ * For HyperTransport information, please consult the following manuals
+ * from http://www.hypertransport.org
  *
- * The Hypertransport I/O Link Specification
+ * The HyperTransport I/O Link Specification
  */
 
 #ifndef LINUX_PCI_REGS_H
@@ -26,6 +26,7 @@
  * Under PCI, each device has 256 bytes of configuration address space,
  * of which the first 64 bytes are standardized as follows:
  */
+#define PCI_STD_HEADER_SIZEOF  64
 #define PCI_VENDOR_ID          0x00    /* 16 bits */
 #define PCI_DEVICE_ID          0x02    /* 16 bits */
 #define PCI_COMMAND            0x04    /* 16 bits */
@@ -36,7 +37,7 @@
 #define  PCI_COMMAND_INVALIDATE 0x10   /* Use memory write and invalidate */
 #define  PCI_COMMAND_VGA_PALETTE 0x20  /* Enable palette snooping */
 #define  PCI_COMMAND_PARITY    0x40    /* Enable parity checking */
-#define  PCI_COMMAND_WAIT      0x80    /* Enable address/data stepping */
+#define  PCI_COMMAND_WAIT      0x80    /* Enable address/data stepping */
 #define  PCI_COMMAND_SERR  0x100   /* Enable SERR */
 #define  PCI_COMMAND_FAST_BACK 0x200   /* Enable back-to-back writes */
 #define  PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */
@@ -44,7 +45,7 @@
 #define PCI_STATUS             0x06    /* 16 bits */
 #define  PCI_STATUS_INTERRUPT  0x08    /* Interrupt status */
 #define  PCI_STATUS_CAP_LIST   0x10    /* Support Capability List */
-#define  PCI_STATUS_66MHZ      0x20    /* Support 66 Mhz PCI 2.1 bus */
+#define  PCI_STATUS_66MHZ      0x20    /* Support 66 MHz PCI 2.1 bus */
 #define  PCI_STATUS_UDF        0x40    /* Support User Definable Features [obsolete] */
 #define  PCI_STATUS_FAST_BACK  0x80    /* Accept fast-back to back */
 #define  PCI_STATUS_PARITY     0x100   /* Detected parity error */
@@ -125,7 +126,8 @@
 #define  PCI_IO_RANGE_TYPE_MASK 0x0fUL /* I/O bridging type */
 #define  PCI_IO_RANGE_TYPE_16  0x00
 #define  PCI_IO_RANGE_TYPE_32  0x01
-#define  PCI_IO_RANGE_MASK     (~0x0fUL)
+#define  PCI_IO_RANGE_MASK     (~0x0fUL) /* Standard 4K I/O windows */
+#define  PCI_IO_1K_RANGE_MASK  (~0x03UL) /* Intel 1K I/O windows */
 #define PCI_SEC_STATUS         0x1e    /* Secondary status register, only bit 14 used */
 #define PCI_MEMORY_BASE        0x20    /* Memory range behind */
 #define PCI_MEMORY_LIMIT       0x22
@@ -203,16 +205,18 @@
 #define  PCI_CAP_ID_CHSWP      0x06    /* CompactPCI HotSwap */
 #define  PCI_CAP_ID_PCIX       0x07    /* PCI-X */
 #define  PCI_CAP_ID_HT         0x08    /* HyperTransport */
-#define  PCI_CAP_ID_VNDR       0x09    /* Vendor specific */
+#define  PCI_CAP_ID_VNDR       0x09    /* Vendor-Specific */
 #define  PCI_CAP_ID_DBG        0x0A    /* Debug port */
 #define  PCI_CAP_ID_CCRC       0x0B    /* CompactPCI Central Resource Control */
-#define  PCI_CAP_ID_SHPC       0x0C    /* PCI Standard Hot-Plug Controller */
+#define  PCI_CAP_ID_SHPC       0x0C    /* PCI Standard Hot-Plug Controller */
 #define  PCI_CAP_ID_SSVID      0x0D    /* Bridge subsystem vendor/device ID */
 #define  PCI_CAP_ID_AGP3       0x0E    /* AGP Target PCI-PCI bridge */
-#define  PCI_CAP_ID_EXP        0x10    /* PCI Express */
+#define  PCI_CAP_ID_SECDEV     0x0F    /* Secure Device */
+#define  PCI_CAP_ID_EXP        0x10    /* PCI Express */
 #define  PCI_CAP_ID_MSIX       0x11    /* MSI-X */
-#define  PCI_CAP_ID_SATA       0x12    /* Serial ATA */
+#define  PCI_CAP_ID_SATA       0x12    /* SATA Data/Index Conf. */
 #define  PCI_CAP_ID_AF         0x13    /* PCI Advanced Features */
+#define  PCI_CAP_ID_MAX        PCI_CAP_ID_AF
 #define PCI_CAP_LIST_NEXT      1       /* Next capability in the list */
 #define PCI_CAP_FLAGS          2       /* Capability defined flags (16 bits) */
 #define PCI_CAP_SIZEOF         4
@@ -264,8 +268,8 @@
 #define  PCI_AGP_COMMAND_RQ_MASK 0xff00

[Qemu-devel] [PATCH v8 0/7] sPAPR: Support EEH Error Injection

2015-09-13 Thread Gavin Shan

The patchset depends on below Linux upstream commits:

  commit ed3e81f ("powerpc/eeh: Move PE state constants around")
  commit ec33d36 ("powerpc/eeh: Introduce eeh_pe_inject_err()")

According to PAPR specification 2.7, there are 3 RTAS calls relevant to error
injection: "ibm,open-errinjct", "ibm,close-errinjct", "ibm,errinjct". The
userland utility "errinjct" running on the guest uses those 3 RTAS calls as
follows: call "ibm,open-errinjct", which returns an open-token; pass that token
to "ibm,errinjct" together with error-specific arguments to do the error
injection; finally, return the open-token by calling "ibm,close-errinjct".

"ibm,errinjct" can be used to inject various errors, not limited to EEH errors.
However, this patchset is going to support injecting EEH errors only for VFIO
PCI devices.

=
Changelog
=
v8:
   * Rebased to git://github.com/dgibson/qemu.git (branch: spapr-next)
   * Apply "git -C $to commit" to update-linux-headers.sh.
   * Use "git rev-parse --short HEAD" to retrieve top commit
   * Use "EOF" to construct the commit message
   * Drop sPAPRPHBClass::eeh_inject_error().
v7:
   * Cover comments from Peter Maydell in scripts/update-linux-headers.sh.
   * Reset spapr->errinjct_token when rebooting guest.
v6:
   * Improved scripts/update-linux-headers.sh to format commit log with
 last commit ID and Linux kernel version. Also, "stdint.h" is allowed
 to be included in virtio headers.
   * #include "asm-powerpc/eeh.h".
   * Incremental spapr->errinjct_token so that the condition (0x1 &
 spapr->errinjct_token) can be used to check if the token is valid.
   * Big-endian tokens in /rtas/ibm,errinjct-tokens.
   * Pick rtas_ldq() to load 64-bits value from RTAS call buffer, which
 was dropped in v2.
   * Use EEH_ERR_FUNC_MAX to validate EEH error function.
   * Removed unnecessary parentheses.
v5:
   * Put "errinjct_token" in the migration stream regardless of whether it
 is opened or not. Also, it is supported starting from v4 of vmstate_spapr.
   * Include powerpc/include/uapi/asm/eeh.h in scripts/update-linux-headers.sh
v4:
   * Record the currently opened token, not the next one, as suggested by Alexey.
v3:
   * Replace random token number with incremental counter. Another boolean
 variable to track if it's opened. Both of them are added to migration
 stream.
   * The return value from sPAPRPHBClass::eeh_inject_error() can be passed
 to user directly. No need to do conversion.
   * Corrected error code to RTAS_OUT_CLOSE_ERROR in rtas_ibm_errinjct().
   * Don't expose error injection tokens for unsupported types.
v2:
   * Rebased to git://github.com/dgibson/qemu.git (branch: spapr-next)
   * Remove specific PCI error types in hw/ppc/spapr.h. Use the macros from
 asm-powerpc/eeh.h instead.

Gavin Shan (7):
  scripts: Allow include "stdint.h" in virtio headers
  scripts: Include arch/powerpc/include/uapi/asm/eeh.h
  scripts: Submit changes while updating linux headers
  Synchronize Linux headers from kernel 4.3.0-rc1
  Obsolete PCI_MSIX_FLAGS_BIRMASK
  sPAPR: Support RTAS call ibm, {open, close}-errinjct
  sPAPR: Support RTAS call ibm,errinjct

 hw/i386/kvm/pci-assign.c |   4 +-
 hw/pci/msix.c|   2 +-
 hw/pci/pcie_aer.c|   2 +-
 hw/ppc/spapr.c   |   9 +-
 hw/ppc/spapr_pci.c   |  30 +++
 hw/ppc/spapr_pci_vfio.c  |  32 +++
 hw/ppc/spapr_rtas.c  | 137 ++
 hw/s390x/s390-pci-bus.c  |   8 +-
 hw/vfio/pci.c|   8 +-
 hw/xen/xen_pt_msi.c  |   4 +-
 include/hw/pci-host/spapr.h  |   3 +
 include/hw/ppc/spapr.h   |  16 +-
 include/standard-headers/linux/pci_regs.h| 381 ---
 include/standard-headers/linux/virtio_ring.h |   3 +-
 linux-headers/asm-arm64/kvm.h|  37 ++-
 linux-headers/asm-powerpc/eeh.h  |  56 
 linux-headers/asm-x86/hyperv.h   |   4 +
 linux-headers/asm-x86/kvm.h  |   4 +-
 linux-headers/linux/kvm.h|   7 +
 scripts/update-linux-headers.sh  |  34 ++-
 tests/libqos/pci.c   |   8 +-
 21 files changed, 667 insertions(+), 122 deletions(-)
 create mode 100644 linux-headers/asm-powerpc/eeh.h

-- 
2.1.0

[Qemu-devel] [PATCH v8 6/7] sPAPR: Support RTAS call ibm, {open, close}-errinjct

2015-09-13 Thread Gavin Shan

This supports RTAS calls "ibm,{open,close}-errinjct" to manipulate
the token, which is passed to RTAS call "ibm,errinjct" to indicate
the valid context for error injection. Each VM is permitted to have
only one token at once, and we simply use a sequential number for it.
The token is reset in ppc_spapr_reset() when rebooting the guest. Note
that the least significant bit of the token indicates whether the
token has been opened, meaning a valid token is always odd.
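
The odd/even token scheme described above can be sketched as a small
standalone model. This is only an illustration, assuming a toy TokenState
struct and shortened status names (both hypothetical); the values -4 mirror
RTAS_OUT_TOKEN_OPENED and RTAS_OUT_CLOSE_ERROR from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical short names; values follow the RTAS_OUT_* codes in the patch. */
enum { OUT_SUCCESS = 0, OUT_TOKEN_OPENED = -4, OUT_CLOSE_ERROR = -4 };

/* Toy stand-in for the per-VM state: even = closed, odd = opened. */
typedef struct {
    uint32_t errinjct_token;
} TokenState;

static bool token_is_open(const TokenState *s)
{
    return (s->errinjct_token & 1) != 0;
}

/* Open: refuse if a token is already held, else bump the counter to odd. */
static int token_open(TokenState *s, uint32_t *out_token)
{
    if (token_is_open(s)) {
        return OUT_TOKEN_OPENED;
    }
    *out_token = ++s->errinjct_token;
    assert(token_is_open(s)); /* a freshly opened token is always odd */
    return OUT_SUCCESS;
}

/* Close: only the currently opened token is accepted; counter goes even. */
static int token_close(TokenState *s, uint32_t token)
{
    if (!token_is_open(s) || s->errinjct_token != token) {
        return OUT_CLOSE_ERROR;
    }
    s->errinjct_token++;
    return OUT_SUCCESS;
}
```

Because the counter only ever increments, a stale token from an earlier
open/close cycle never matches the current one.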

Signed-off-by: Gavin Shan 
Reviewed-by: David Gibson 
---
 hw/ppc/spapr.c |  9 +++-
 hw/ppc/spapr_rtas.c| 60 ++
 include/hw/ppc/spapr.h |  9 +++-
 3 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index f22db12..51dc9cf 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1062,6 +1062,9 @@ static void ppc_spapr_reset(void)
 
 qemu_devices_reset();
 
+/* Reset error injection token */
+spapr->errinjct_token = 0;
+
 /*
  * We place the device tree and RTAS just below either the top of the RMA,
  * or just below 2GB, whichever is lowere, so that it can be
@@ -1189,7 +1192,7 @@ static bool version_before_3(void *opaque, int version_id)
 
 static const VMStateDescription vmstate_spapr = {
 .name = "spapr",
-.version_id = 3,
+.version_id = 4,
 .minimum_version_id = 1,
 .post_load = spapr_post_load,
 .fields = (VMStateField[]) {
@@ -1200,6 +1203,10 @@ static const VMStateDescription vmstate_spapr = {
 VMSTATE_UINT64_TEST(rtc_offset, sPAPRMachineState, version_before_3),
 
 VMSTATE_PPC_TIMEBASE_V(tb, sPAPRMachineState, 2),
+
+/* Error injection token */
+VMSTATE_UINT32_V(errinjct_token, sPAPRMachineState, 4),
+
 VMSTATE_END_OF_LIST()
 },
 };
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 3b7b20b..5520fd2 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -610,6 +610,62 @@ out:
 rtas_st(rets, 0, rc);
 }
 
+static void rtas_ibm_open_errinjct(PowerPCCPU *cpu,
+   sPAPRMachineState *spapr,
+   uint32_t token, uint32_t nargs,
+   target_ulong args, uint32_t nret,
+   target_ulong rets)
+{
+int32_t ret;
+
+/* Sanity check on number of arguments */
+if (nargs != 0 || nret != 2) {
+ret = RTAS_OUT_PARAM_ERROR;
+goto out;
+}
+
+/* Check if we already had token */
+if (spapr->errinjct_token & 1) {
+ret = RTAS_OUT_TOKEN_OPENED;
+goto out;
+}
+
+/* Grab the token */
+rtas_st(rets, 0, ++spapr->errinjct_token);
+ret = RTAS_OUT_SUCCESS;
+out:
+rtas_st(rets, 1, ret);
+}
+
+static void rtas_ibm_close_errinjct(PowerPCCPU *cpu,
+sPAPRMachineState *spapr,
+uint32_t token, uint32_t nargs,
+target_ulong args, uint32_t nret,
+target_ulong rets)
+{
+uint32_t open_token;
+int32_t ret;
+
+/* Sanity check on number of arguments */
+if (nargs != 1 || nret != 1) {
+ret = RTAS_OUT_PARAM_ERROR;
+goto out;
+}
+
+/* Match with the passed token */
+open_token = rtas_ld(args, 0);
+if (!(spapr->errinjct_token & 1) ||
+spapr->errinjct_token != open_token) {
+ret = RTAS_OUT_CLOSE_ERROR;
+goto out;
+}
+
+spapr->errinjct_token++;
+ret = RTAS_OUT_SUCCESS;
+out:
+rtas_st(rets, 0, ret);
+}
+
 static struct rtas_call {
 const char *name;
 spapr_rtas_fn fn;
@@ -760,6 +816,10 @@ static void core_rtas_register_types(void)
 rtas_get_sensor_state);
 spapr_rtas_register(RTAS_IBM_CONFIGURE_CONNECTOR, "ibm,configure-connector",
 rtas_ibm_configure_connector);
+spapr_rtas_register(RTAS_IBM_OPEN_ERRINJCT, "ibm,open-errinjct",
+rtas_ibm_open_errinjct);
+spapr_rtas_register(RTAS_IBM_CLOSE_ERRINJCT, "ibm,close-errinjct",
+rtas_ibm_close_errinjct);
 }
 
 type_init(core_rtas_register_types)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index c75cc5e..7931e18 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -73,6 +73,9 @@ struct sPAPRMachineState {
 int htab_fd;
 bool htab_fd_stale;
 
+/* Error injection token */
+uint32_t errinjct_token;
+
 /* RTAS state */
 QTAILQ_HEAD(, sPAPRConfigureConnectorState) ccs_list;
 
@@ -412,6 +415,8 @@ int spapr_allocate_irq_block(int num, bool lsi, bool msi);
 #define RTAS_OUT_BUSY   -2
 #define RTAS_OUT_PARAM_ERROR-3
 #define RTAS_OUT_NOT_SUPPORTED  -3
+#define RTAS_OUT_TOKEN_OPENED   -4
+#define RTAS_OUT_CLOSE_ERROR-4
 #define RTAS_OUT_NOT_AUTHORIZED -9002
 
 /* RTAS tokens */
@@ -455,8 +460,10 @@

[Qemu-devel] [PATCH v8 1/7] scripts: Allow include "stdint.h" in virtio headers

2015-09-13 Thread Gavin Shan

This allows "stdint.h" to be included in virtio header files. Otherwise,
scripts/update-linux-headers.sh fails when updating headers from the
Linux 4.2-rc8 kernel. include/uapi/linux/virtio_ring.h has included
"stdint.h" since commit d768f32a ("virtio: Fix typecast of
pointer in vring_init()").

Signed-off-by: Gavin Shan 
Reviewed-by: Thomas Huth 
Reviewed-by: David Gibson 
---
 scripts/update-linux-headers.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index f0e830c..2fddf2e 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -37,7 +37,8 @@ cp_virtio() {
 mkdir -p "$to"
 for f in $virtio; do
 if
-grep '#include' "$f" | grep -v -e 'linux/virtio' \
+grep '#include' "$f" | grep -v -e 'stdint' \
+ -e 'linux/virtio' \
  -e 'linux/types' \
  -e 'linux/if_ether' \
  -e 'sys/' \
-- 
2.1.0

[Qemu-devel] [PATCH v8 5/7] Obsolete PCI_MSIX_FLAGS_BIRMASK

2015-09-13 Thread Gavin Shan

This replaces PCI_MSIX_FLAGS_BIRMASK with PCI_MSIX_TABLE_BIR. Also,
3 more macros for the MSI-X table offset, MSI-X PBA BAR index and
MSI-X PBA offset are now available, and this patch uses them. Besides,
PCI_ERR_UNC_TRAIN is replaced with PCI_ERR_UNC_UND. The changes were
introduced by the Linux upstream commits below:

  commit 24bc69da ("PCI: Clean up MSI/MSI-X capability #defines")
  commit 846fc709 ("PCI/AER: Rename PCI_ERR_UNC_TRAIN to PCI_ERR_UNC_UND")
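
As a rough illustration of what the new macros express, the sketch below
splits a 32-bit MSI-X Table register value into its BAR index and offset.
The two mask values are the ones defined in Linux's pci_regs.h; the
MsixLocation struct and msix_split() helper are hypothetical names used
only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Mask values as defined in Linux include/uapi/linux/pci_regs.h. */
#define PCI_MSIX_TABLE_BIR    0x00000007u /* low 3 bits: BAR index */
#define PCI_MSIX_TABLE_OFFSET 0xfffffff8u /* remaining bits: offset in BAR */

/* Hypothetical holder for the decoded register. */
typedef struct {
    uint8_t bar;
    uint32_t offset;
} MsixLocation;

/* Split the raw register into BAR index and 8-byte-aligned offset. */
static MsixLocation msix_split(uint32_t reg)
{
    MsixLocation loc;

    loc.bar = (uint8_t)(reg & PCI_MSIX_TABLE_BIR);
    loc.offset = reg & PCI_MSIX_TABLE_OFFSET;
    assert((loc.offset & PCI_MSIX_TABLE_BIR) == 0);
    return loc;
}
```

Using a dedicated offset mask instead of negating the BIR mask keeps each
field's extraction self-describing, which is what the hunks below do.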

Signed-off-by: Gavin Shan 
Reviewed-by: David Gibson 
---
 hw/i386/kvm/pci-assign.c | 4 ++--
 hw/pci/msix.c| 2 +-
 hw/pci/pcie_aer.c| 2 +-
 hw/s390x/s390-pci-bus.c  | 8 
 hw/vfio/pci.c| 8 
 hw/xen/xen_pt_msi.c  | 4 ++--
 tests/libqos/pci.c   | 8 
 7 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index b1beaa6..46d2749 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -1310,8 +1310,8 @@ static int assigned_device_pci_cap_init(PCIDevice *pci_dev, Error **errp)
  PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL);
 
 msix_table_entry = pci_get_long(pci_dev->config + pos + PCI_MSIX_TABLE);
-bar_nr = msix_table_entry & PCI_MSIX_FLAGS_BIRMASK;
-msix_table_entry &= ~PCI_MSIX_FLAGS_BIRMASK;
+bar_nr = msix_table_entry & PCI_MSIX_TABLE_BIR;
+msix_table_entry &= PCI_MSIX_TABLE_OFFSET;
 dev->msix_table_addr = pci_region[bar_nr].base_addr + msix_table_entry;
 dev->msix_max = msix_max;
 }
diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index 2fdada4..11beee5 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -250,7 +250,7 @@ int msix_init(struct PCIDevice *dev, unsigned short nentries,
  ranges_overlap(table_offset, table_size, pba_offset, pba_size)) ||
 table_offset + table_size > memory_region_size(table_bar) ||
 pba_offset + pba_size > memory_region_size(pba_bar) ||
-(table_offset | pba_offset) & PCI_MSIX_FLAGS_BIRMASK) {
+(table_offset | pba_offset) & PCI_MSIX_TABLE_BIR) {
 return -EINVAL;
 }
 
diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index f1847ac..1def4a4 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -828,7 +828,7 @@ typedef struct PCIEAERErrorName {
 static const struct PCIEAERErrorName pcie_aer_error_list[] = {
 {
 .name = "TRAIN",
-.val = PCI_ERR_UNC_TRAIN,
+.val = PCI_ERR_UNC_UND,
 .correctable = false,
 }, {
 .name = "DLP",
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 560b66a..7dac2c0 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -507,10 +507,10 @@ static int s390_pcihost_setup_msix(S390PCIBusDevice *pbdev)
 pba = pci_host_config_read_common(pbdev->pdev, pos + PCI_MSIX_PBA,
  pci_config_size(pbdev->pdev), sizeof(pba));
 
-pbdev->msix.table_bar = table & PCI_MSIX_FLAGS_BIRMASK;
-pbdev->msix.table_offset = table & ~PCI_MSIX_FLAGS_BIRMASK;
-pbdev->msix.pba_bar = pba & PCI_MSIX_FLAGS_BIRMASK;
-pbdev->msix.pba_offset = pba & ~PCI_MSIX_FLAGS_BIRMASK;
+pbdev->msix.table_bar = table & PCI_MSIX_TABLE_BIR;
+pbdev->msix.table_offset = table & PCI_MSIX_TABLE_OFFSET;
+pbdev->msix.pba_bar = pba & PCI_MSIX_PBA_BIR;
+pbdev->msix.pba_offset = pba & PCI_MSIX_PBA_OFFSET;
 pbdev->msix.entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1;
 pbdev->msix.available = true;
 return 0;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 4023d8e..0481d05 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2244,10 +2244,10 @@ static int vfio_early_setup_msix(VFIOPCIDevice *vdev)
 pba = le32_to_cpu(pba);
 
 vdev->msix = g_malloc0(sizeof(*(vdev->msix)));
-vdev->msix->table_bar = table & PCI_MSIX_FLAGS_BIRMASK;
-vdev->msix->table_offset = table & ~PCI_MSIX_FLAGS_BIRMASK;
-vdev->msix->pba_bar = pba & PCI_MSIX_FLAGS_BIRMASK;
-vdev->msix->pba_offset = pba & ~PCI_MSIX_FLAGS_BIRMASK;
+vdev->msix->table_bar = table & PCI_MSIX_TABLE_BIR;
+vdev->msix->table_offset = table & PCI_MSIX_TABLE_OFFSET;
+vdev->msix->pba_bar = pba & PCI_MSIX_PBA_BIR;
+vdev->msix->pba_offset = pba & PCI_MSIX_PBA_OFFSET;
 vdev->msix->entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1;
 
 /*
diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
index e3d7194..61efbe2 100644
--- a/hw/xen/xen_pt_msi.c
+++ b/hw/xen/xen_pt_msi.c
@@ -565,8 +565,8 @@ int xen_pt_msix_init(XenPCIPassthroughState *s, uint32_t base)
   & XC_PAGE_MASK);
 
 xen_host_pci_get_long(hd, base + PCI_MSIX_TABLE, &table_off);
-bar_index = msix->bar_index = table_off & PCI_MSIX_FLAGS_BIRMASK;
-table_off = table_off & ~PCI_MSIX_FLAGS_BIRMASK;
+bar_index = msix->bar_index = table_off & PCI_MSIX_TABLE_BIR;
+table_off = table_off & PCI_MSIX_TABLE_OFFSET;
 msix->table_base = s->real_device.io_regions[bar_index].base_addr;
 XEN_PT_LOG(d, "get MSI-X table BAR bas

[Qemu-devel] [RFCv2 1/2] spapr: Remove unnecessary owner field from sPAPRDRConnector

2015-09-13 Thread David Gibson

The sPAPRDRConnector pseudo-device contains an owner field which is
set in spapr_dr_connector_new().  However, that function also calls
object_property_add_child() to set the DRConnector as the QOM child of
the owner object.  That means that owner is always the same as the QOM
parent, and so redundant.
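
A toy model of the argument, assuming simplified stand-ins for the QOM
types (Obj, Connector and add_child() are hypothetical and much smaller
than the real Object and object_property_add_child()): once linking a child
records the parent, a separate owner pointer carries no extra information.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a QOM object: only the parent link matters here. */
typedef struct Obj {
    struct Obj *parent;
} Obj;

/* Toy counterpart of adding a child property: linking sets the parent. */
static void add_child(Obj *owner, Obj *child)
{
    assert(child->parent == NULL); /* a child has at most one parent */
    child->parent = owner;
}

/* Connector without a dedicated owner field. */
typedef struct {
    Obj obj;
    unsigned id;
} Connector;

/* The owner is recovered from the parent link instead of a stored copy. */
static Obj *connector_owner(Connector *c)
{
    return c->obj.parent;
}
```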

Signed-off-by: David Gibson 
---
 hw/ppc/spapr_drc.c | 5 ++---
 include/hw/ppc/spapr_drc.h | 1 -
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 9ce844a..68e0c3e 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -416,7 +416,7 @@ static void realize(DeviceState *d, Error **errp)
 child_name = object_get_canonical_path_component(OBJECT(drc));
 DPRINTFN("drc child name: %s", child_name);
 object_property_add_alias(root_container, link_name,
-  drc->owner, child_name, &err);
+  OBJECT(drc)->parent, child_name, &err);
 if (err) {
 error_report("%s", error_get_pretty(err));
 error_free(err);
@@ -456,7 +456,6 @@ sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
 
 drc->type = type;
 drc->id = id;
-drc->owner = owner;
 object_property_add_child(owner, "dr-connector[*]", OBJECT(drc), NULL);
 object_property_set_bool(OBJECT(drc), true, "realized", NULL);
 
@@ -669,7 +668,7 @@ int spapr_drc_populate_dt(void *fdt, int fdt_offset, Object *owner,
 drc = SPAPR_DR_CONNECTOR(obj);
 drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
 
-if (owner && (drc->owner != owner)) {
+if (owner && (OBJECT(drc)->parent != owner)) {
 continue;
 }
 
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index 28ffeae..16e2d4b 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -137,7 +137,6 @@ typedef struct sPAPRDRConnector {
 
 sPAPRDRConnectorType type;
 uint32_t id;
-Object *owner;
 const char *name;
 
 /* sensor/indicator states */
-- 
2.4.3

Re: [Qemu-devel] [RFC PATCH] spapr: Reduce creation of LMB DR connectors from O(n^3) to O(n^2)

2015-09-13 Thread David Gibson

On Fri, Sep 11, 2015 at 09:42:06PM +0530, Bharata B Rao wrote:
> On Thu, Sep 10, 2015 at 04:28:25PM +1000, David Gibson wrote:
> > The dynamic reconfiguration (hotplug) code for the pseries machine type
> > uses a "DR connector" QOM object for each resource it will be possible
> > to hotplug.  Each of these is added to its owner using
> > object_property_add_child(owner, "dr-connector[*]", ...);
> > 
> > This works ok for most cases, but gets ugly when allowing large amounts of
> > hotplugged RAM.  For RAM, there's a DR connector object for every 256MB of
> > potential memory.  So if maxmem=2T, for example, there are >250,000 objects
> > under the same parent.
> 
> There is one LMB DRC object for every 256MB, so with 2T maxmem, there will be
> max 8192 LMB DRC objects.

Oops, that's embarrassing, I messed up my arithmetic.  You're right,
only 8192 objects for a 2T guest.  Still rather a lot.
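
For reference, the corrected count is simply maxmem divided by the 256MB
block size. A minimal sketch of that arithmetic, with LMB_SIZE and
lmb_drc_count() as hypothetical names chosen for this example:

```c
#include <assert.h>
#include <stdint.h>

/* One LMB DR connector per 256MB of potential memory (hypothetical name). */
#define LMB_SIZE (256ULL << 20)

/* Number of LMB connectors needed to cover maxmem_bytes. */
static uint64_t lmb_drc_count(uint64_t maxmem_bytes)
{
    assert(maxmem_bytes % LMB_SIZE == 0); /* sketch assumes whole blocks */
    return maxmem_bytes / LMB_SIZE;
}
```

With maxmem=2T this is 2^41 / 2^28 = 2^13 = 8192 connectors.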

> > The QOM interfaces aren't really designed for this.  In particular
> > object_property_add() has O(n^2) time complexity (in the number of existing
> > children) for the [*] case.  First it has a linear search through array
> > indices to find a free slot, each of which is attempted to a recursive call
> > to object_property_add() with a specific [N].  Those calls are O(n) because
> > there's a linear search through all properties to check for duplicates.
> > 
> > For the specific case of DR connectors, we already have a sufficiently
> > unique index, so we don't need to use the [*] special behaviour.  That lets
> > us reduce the total time for creating the DR objects from O(n^3) to O(n^2).
> > 
> > O(n^2) is still kind of crappy, but it's enough to reduce the startup time
> > of qemu with maxmem=2T from ~20 minutes to ~4 seconds.
> > 
> > Signed-off-by: David Gibson 
> > Cc: Bharata B Rao 
> > ---
> >  hw/ppc/spapr_drc.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> > index c1f664f..4cf3a9b 100644
> > --- a/hw/ppc/spapr_drc.c
> > +++ b/hw/ppc/spapr_drc.c
> > @@ -463,14 +463,16 @@ sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
> >  {
> >  sPAPRDRConnector *drc =
> >  SPAPR_DR_CONNECTOR(object_new(TYPE_SPAPR_DR_CONNECTOR));
> > +char *prop_name = g_strdup_printf("dr-connector[%"PRIu32"]", id);
> 
> This works only if memory hotplug alone is present. If CPU hotplug is also
> present, the lookup of DRC object for LMB DRC fails from ibm,cas call when
> the guest is booting.

Bother.

> I don't fully understand why it fails, but the object lookup doesn't seem to
> like duplicate names that we end up having here. With the above change, we
> can have duplicate prop_name under the same owner object (spapr machine
> object) due to both CPU and LMB DRC objects coming under the same parent.

So.. arguably having both types of connector under the same parent is
a mistake.

But in the short term, we should be able to fix that by using the DRC
index, instead of just the id as the property array index.

It means the indices won't be contiguous, but having something
meaningful in there is probably still better than the arbitrary index
that [*] will give us.  Especially since, confusingly, they will look
like they're the LMB ID *until* you add CPU hotplug, and then they'll
get offset, maybe, depending on whether CPU or memory gets constructed
first.
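
To make the idea concrete, here is a rough sketch of naming each connector
by an index that also encodes its type, so a CPU connector and an LMB
connector with the same id get different property names. The bit layout
(type tag in the top 4 bits) and the tag values are assumptions made for
this example, not something stated in this thread.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical type tags; real values may differ. */
enum { TAG_CPU = 1, TAG_LMB = 8 };

/* Assumed layout: 4-bit type tag above a 28-bit per-type id. */
static uint32_t drc_index(uint32_t type_tag, uint32_t id)
{
    assert(type_tag < 16 && id < (1u << 28));
    return (type_tag << 28) | id;
}

/* Build an explicit "dr-connector[N]" name; returns the name length. */
static int drc_prop_name(char *buf, size_t len, uint32_t index)
{
    return snprintf(buf, len, "dr-connector[%" PRIu32 "]", index);
}
```

Since the name is fully determined by the index, no search for a free
array slot is needed, and the indices stay meaningful regardless of the
order in which CPU and memory connectors are constructed.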

Revised patch coming shortly.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson



[Qemu-devel] [RFCv2 0/2] spapr: Cleanups to dynamic reconfiguration mechanism

2015-09-13 Thread David Gibson

Here are some cleanups and improvements to the "dynamic
reconfiguration" (hotplug) infrastructure for the "pseries" machine
type.

There's an improved version of my patch to mitigate the O(n^3) time
for large maxmem values, and another small cleanup to remove a
redundant field in the structure.

David Gibson (2):
  spapr: Remove unnecessary owner field from sPAPRDRConnector
  spapr: Don't use QOM [*] syntax for DR connectors.

 hw/ppc/spapr_drc.c | 10 ++
 include/hw/ppc/spapr_drc.h |  1 -
 2 files changed, 6 insertions(+), 5 deletions(-)

-- 
2.4.3

[Qemu-devel] [PATCH v8 7/7] sPAPR: Support RTAS call ibm,errinjct

2015-09-13 Thread Gavin Shan

The patch supports RTAS call "ibm,errinjct" to allow injecting
EEH errors into VFIO PCI devices. The implementation is similar
to EEH support for VFIO PCI devices: the RTAS request is captured
by QEMU and routed to spapr_phb_vfio_eeh_inject_error(), where the
request is translated into a VFIO container ioctl command to be handled
by the host.
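
Before a request is routed, the handler in this patch checks the token and
the parameter buffer alignment. Those checks can be sketched in isolation
as below; errinjct_precheck() and the short status names are hypothetical,
while the values (-3, -4) and the 1KB alignment rule follow the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical short names; values follow the RTAS_OUT_* codes in the patch. */
enum { OUT_SUCCESS = 0, OUT_PARAM_ERROR = -3, OUT_CLOSE_ERROR = -4 };

/* Validate the caller's token and parameter buffer before dispatching. */
static int errinjct_precheck(uint32_t held_token, uint32_t passed_token,
                             uint64_t param_buf)
{
    bool opened = (held_token & 1) != 0;

    /* The caller must present the currently opened token. */
    if (!opened || held_token != passed_token) {
        return OUT_CLOSE_ERROR;
    }
    assert((passed_token & 1) == 1); /* a matching token is always odd */

    /* The parameter buffer must be 1KB aligned. */
    if (param_buf & 0x3ff) {
        return OUT_PARAM_ERROR;
    }
    return OUT_SUCCESS;
}
```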

Signed-off-by: Gavin Shan 
Reviewed-by: David Gibson 
---
 hw/ppc/spapr_pci.c  | 30 ++
 hw/ppc/spapr_pci_vfio.c | 32 +++
 hw/ppc/spapr_rtas.c | 77 +
 include/hw/pci-host/spapr.h |  3 ++
 include/hw/ppc/spapr.h  |  9 +-
 5 files changed, 150 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 1b7559d..93d6d1b 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -646,6 +646,36 @@ param_error_exit:
 rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
 }
 
+int spapr_rtas_errinjct_ioa(sPAPRMachineState *spapr,
+target_ulong param_buf,
+bool is_64bits)
+{
+sPAPRPHBState *sphb;
+uint64_t buid, addr, mask;
+uint32_t func;
+
+if (is_64bits) {
+addr = rtas_ldq(param_buf, 0);
+mask = rtas_ldq(param_buf, 2);
+buid = rtas_ldq(param_buf, 5);
+func = rtas_ld(param_buf, 7);
+} else {
+addr = rtas_ld(param_buf, 0);
+mask = rtas_ld(param_buf, 1);
+buid = rtas_ldq(param_buf, 3);
+func = rtas_ld(param_buf, 5);
+}
+
+/* Find PHB */
+sphb = spapr_pci_find_phb(spapr, buid);
+if (!sphb || sphb->vfio_num == 0) {
+return RTAS_OUT_PARAM_ERROR;
+}
+
+/* Handle the request */
+return spapr_phb_vfio_eeh_inject_error(sphb, func, addr, mask, is_64bits);
+}
+
 static int pci_spapr_swizzle(int slot, int pin)
 {
 return (slot + pin) % PCI_NUM_PINS;
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index 48137d5..8949398 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -17,6 +17,8 @@
  *  along with this program; if not, see .
  */
 
+#include "asm-powerpc/eeh.h"
+
 #include "hw/ppc/spapr.h"
 #include "hw/pci-host/spapr.h"
 #include "hw/pci/msix.h"
@@ -189,6 +191,36 @@ int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb)
 return RTAS_OUT_SUCCESS;
 }
 
+int spapr_phb_vfio_eeh_inject_error(sPAPRPHBState *sphb,
+uint32_t func, uint64_t addr,
+uint64_t mask, bool is_64bits)
+{
+struct vfio_eeh_pe_op op = {
+.op = VFIO_EEH_PE_INJECT_ERR,
+.argsz = sizeof(op)
+};
+int ret = RTAS_OUT_SUCCESS;
+
+op.err.type = is_64bits ? EEH_ERR_TYPE_64 : EEH_ERR_TYPE_32;
+op.err.addr = addr;
+op.err.mask = mask;
+if (func <= EEH_ERR_FUNC_MAX) {
+op.err.func = func;
+} else {
+ret = RTAS_OUT_PARAM_ERROR;
+goto out;
+}
+
+if (vfio_container_ioctl(&sphb->iommu_as, VFIO_EEH_PE_OP, &op) < 0) {
+ret = RTAS_OUT_HW_ERROR;
+goto out;
+}
+
+ret = RTAS_OUT_SUCCESS;
+out:
+return ret;
+}
+
 static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 5520fd2..684cd7a 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -637,6 +637,54 @@ out:
     rtas_st(rets, 1, ret);
 }
 
+static void rtas_ibm_errinjct(PowerPCCPU *cpu,
+                              sPAPRMachineState *spapr,
+                              uint32_t token, uint32_t nargs,
+                              target_ulong args, uint32_t nret,
+                              target_ulong rets)
+{
+    target_ulong param_buf;
+    uint32_t type, open_token;
+    int32_t ret;
+
+    /* Sanity check on number of arguments */
+    if (nargs != 3 || nret != 1) {
+        ret = RTAS_OUT_PARAM_ERROR;
+        goto out;
+    }
+
+    /* Check if we have opened token */
+    open_token = rtas_ld(args, 1);
+    if (!(spapr->errinjct_token & 1) ||
+        spapr->errinjct_token != open_token) {
+        ret = RTAS_OUT_CLOSE_ERROR;
+        goto out;
+    }
+
+    /* The parameter buffer should be 1KB aligned */
+    param_buf = rtas_ld(args, 2);
+    if (param_buf & 0x3ff) {
+        ret = RTAS_OUT_PARAM_ERROR;
+        goto out;
+    }
+
+    /* Check the error type */
+    type = rtas_ld(args, 0);
+    switch (type) {
+    case RTAS_ERRINJCT_TYPE_IOA_BUS_ERROR:
+        ret = spapr_rtas_errinjct_ioa(spapr, param_buf, false);
+        break;
+    case RTAS_ERRINJCT_TYPE_IOA_BUS_ERROR64:
+        ret = spapr_rtas_errinjct_ioa(spapr, param_buf, true);
+        break;
+    default:
+        ret = RTAS_OUT_PARAM_ERROR;
+    }
+
+out:
+    rtas_st(rets, 0, ret);
+}
+
 static void rtas_ibm_close_errinjct(PowerPCCPU *cpu,
                                     sPAPRMachineState *spapr,
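
For illustration, here is a minimal standalone sketch of the parameter
buffer handling seen in the hunks above: the 1KB alignment test and the
two cell layouts (32-bit and 64-bit) read by spapr_rtas_errinjct_ioa().
The helpers buf_ld() and buf_ldq() are hypothetical stand-ins for
rtas_ld() and rtas_ldq(), operating on a plain byte array instead of
guest memory, and assuming big-endian 32-bit cells with the high word
first for 64-bit values. The struct and function names are likewise
made up for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rtas_ld(): read the n-th big-endian 32-bit cell. */
static uint32_t buf_ld(const uint8_t *buf, size_t n)
{
    const uint8_t *p = buf + 4 * n;

    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical stand-in for rtas_ldq(): cells n and n+1, high word first. */
static uint64_t buf_ldq(const uint8_t *buf, size_t n)
{
    return ((uint64_t)buf_ld(buf, n) << 32) | buf_ld(buf, n + 1);
}

/* Same test as "param_buf & 0x3ff" in the patch: the low 10 bits must be 0. */
static bool is_1k_aligned(uint64_t addr)
{
    return (addr & 0x3ff) == 0;
}

/* Illustrative container for the decoded fields (not a QEMU type). */
struct ioa_err_params {
    uint64_t addr;
    uint64_t mask;
    uint64_t buid;
    uint32_t func;
};

/* Decode the buffer using the cell offsets from the patch. */
static struct ioa_err_params parse_ioa_params(const uint8_t *buf,
                                              bool is_64bits)
{
    struct ioa_err_params p;

    if (is_64bits) {
        p.addr = buf_ldq(buf, 0);
        p.mask = buf_ldq(buf, 2);
        p.buid = buf_ldq(buf, 5);
        p.func = buf_ld(buf, 7);
    } else {
        p.addr = buf_ld(buf, 0);
        p.mask = buf_ld(buf, 1);
        p.buid = buf_ldq(buf, 3);
        p.func = buf_ld(buf, 5);
    }
    return p;
}
```

In both layouts one cell between the mask and the BUID is skipped (cell 4
for 64-bit, cell 2 for 32-bit); the sketch mirrors the offsets in the
patch without interpreting that cell.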

[Qemu-devel] [RFCv2 2/2] spapr: Don't use QOM [*] syntax for DR connectors.

2015-09-13 Thread David Gibson

The dynamic reconfiguration (hotplug) code for the pseries machine type
uses a "DR connector" QOM object for each resource it will be possible
to hotplug.  Each of these is added to its owner using
object_property_add_child(owner, "dr-connector[*]", ...);

That works ok, mostly, but it means that the property indices are
arbitrary, depending on the order in which the connectors are constructed.
When we have both memory and cpu hotplug, the connectors will be under the
same parent (at least in the current drafts), meaning the indices don't
correspond to any meaningful ID.

It gets worse when large amounts of hotpluggable RAM are configured.  For
RAM, there's a DR connector object for every 256MB of potential memory.  So
if maxmem=2T, for example, there are 8192 objects under the same parent.

The QOM interfaces aren't really designed for this.  In particular
object_property_add() with [*] has O(n^2) time complexity (in the number of
existing children): first it has a linear search through array indices to
find a free slot, each of which is attempted with a recursive call to
object_property_add() with a specific [N].  Those calls are O(n) because
there's a linear search through all properties to check for duplicates.

By using a meaningful index value, which we already know is unique, we can
avoid the [*] special behaviour.  That lets us reduce the total time for
creating the DR objects from O(n^3) to O(n^2).

O(n^2) is still kind of crappy, but it's enough to reduce the startup time
of qemu with maxmem=2T from ~20 minutes to ~4 seconds.

Signed-off-by: David Gibson 
Cc: Bharata B Rao 
---
 hw/ppc/spapr_drc.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 68e0c3e..2f95259 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -451,13 +451,16 @@ sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
 {
     sPAPRDRConnector *drc =
         SPAPR_DR_CONNECTOR(object_new(TYPE_SPAPR_DR_CONNECTOR));
+    char *prop_name;
 
     g_assert(type);
 
     drc->type = type;
     drc->id = id;
-    object_property_add_child(owner, "dr-connector[*]", OBJECT(drc), NULL);
+    prop_name = g_strdup_printf("dr-connector[%"PRIu32"]", get_index(drc));
+    object_property_add_child(owner, prop_name, OBJECT(drc), NULL);
     object_property_set_bool(OBJECT(drc), true, "realized", NULL);
+    g_free(prop_name);
 
     /* human-readable name for a DRC to encode into the DT
      * description. this is mainly only used within a guest in place
-- 
2.4.3
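
The complexity argument in the commit message above can be checked with a
toy model. The sketch below is not QEMU code: it assumes a property list
that is a plain array with a linear duplicate check, and a "[*]" add that
tries indices 0, 1, 2, ... until one succeeds. All names (struct props,
add_indexed, add_wildcard, cost) are made up for the illustration. It
counts comparisons rather than measuring time.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy property list: the array indices already taken, in insertion order. */
struct props {
    uint32_t *idx;
    size_t len;
    uint64_t compares;   /* number of duplicate-check comparisons so far */
};

/*
 * Add a property with a specific index. The duplicate check is a linear
 * scan, so one call costs O(len). Returns 0 on success, -1 on duplicate.
 */
static int add_indexed(struct props *p, uint32_t index)
{
    for (size_t i = 0; i < p->len; i++) {
        p->compares++;
        if (p->idx[i] == index) {
            return -1;
        }
    }
    p->idx[p->len++] = index;
    return 0;
}

/* "[*]" behaviour in this model: try 0, 1, 2, ... until an add succeeds. */
static void add_wildcard(struct props *p)
{
    for (uint32_t index = 0; add_indexed(p, index) != 0; index++) {
        /* keep probing the next index */
    }
}

/* Total comparisons needed to create n children with either scheme. */
static uint64_t cost(size_t n, int wildcard)
{
    struct props p = { malloc(n * sizeof(uint32_t)), 0, 0 };

    if (!p.idx) {
        abort();
    }
    for (size_t k = 0; k < n; k++) {
        if (wildcard) {
            add_wildcard(&p);
        } else {
            add_indexed(&p, (uint32_t)k);
        }
    }
    free(p.idx);
    return p.compares;
}
```

In this model, creating n children with explicit indices costs
n(n-1)/2 comparisons, while the wildcard scheme adds a further
(n-1)n(n+1)/6 on top, i.e. quadratic versus cubic growth. For n = 64
that is 2016 versus 45696 comparisons.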

Re: [Qemu-devel] [RFC PATCH] spapr: Reduce creation of LMB DR connectors from O(n^3) to O(n^2)

2015-09-13 Thread David Gibson

On Fri, Sep 11, 2015 at 02:43:43PM +0200, Paolo Bonzini wrote:
> 
> 
> On 10/09/2015 08:28, David Gibson wrote:
> > The dynamic reconfiguration (hotplug) code for the pseries machine type
> > uses a "DR connector" QOM object for each resource it will be possible
> > to hotplug.  Each of these is added to its owner using
> > object_property_add_child(owner, "dr-connector[*]", ...);
> > 
> > This works ok for most cases, but gets ugly when allowing large amounts of
> > hotplugged RAM.  For RAM, there's a DR connector object for every 256MB of
> > potential memory.  So if maxmem=2T, for example, there are >250,000 objects
> > under the same parent.
> 
> That must consume quite some memory... I would guess 1K per object.

So, Bharata was right, it's only ~32k objects even for maxmem=4T.  I'm
not quite sure how I messed up my arithmetic there.

But still, yeah, it's a lot of objects :/.  Medium term I think we
should avoid creating so many objects for the connectors.  I'm
thinking a "connector array" object that handles a whole range of
connector indices with a single QOM object should be possible.

> > The QOM interfaces aren't really designed for this.  In particular
> > object_property_add() has O(n^2) time complexity (in the number of existing
> > children) for the [*] case.  First it has a linear search through array
> > indices to find a free slot, each of which is attempted with a recursive call
> > to object_property_add() with a specific [N].  Those calls are O(n) because
> > there's a linear search through all properties to check for duplicates.
> > 
> > For the specific case of DR connectors, we already have a sufficiently
> > unique index, so we don't need to use the [*] special behaviour.  That lets
> > us reduce the total time for creating the DR objects from O(n^3) to O(n^2).
> > 
> > O(n^2) is still kind of crappy, but it's enough to reduce the startup time
> > of qemu with maxmem=2T from ~20 minutes to ~4 seconds.
> 
> Thanks, I agree that even O(n^2) is crappy.  We need to add a hash table
> for properties, so that [*] is O(n^2) and the optimized case is
> O(n).

Right, so I had the impression that QOM isn't really built for
handling thousands of objects under one parent.  I don't have a wide
enough view to know if that's a reasonable goal for it ever to handle
well.  Using a hash table at every node might be pretty expensive in
memory for nodes with only a handful of children.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v2 1/8] hw/cpu/{a15mpcore, a9mpcore}: Handle missing has_el3 CPU props gracefully

2015-09-13 Thread Peter Crosthwaite

On Sun, Sep 13, 2015 at 2:07 AM, Edgar E. Iglesias
 wrote:
> From: "Edgar E. Iglesias" 
>
> Handle missing CPU support for EL3 gracefully.
>

What is the use case here? A9 and A15 should be able to not have EL3,
but in this case the property should still exist but be set false. No
prop should only be the case with a CPU that can't ever support EL3.

Regards,
Peter

> Signed-off-by: Edgar E. Iglesias 
> ---
>  hw/cpu/a15mpcore.c | 2 +-
>  hw/cpu/a9mpcore.c  | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/cpu/a15mpcore.c b/hw/cpu/a15mpcore.c
> index 4ef8db1..94e8cc1 100644
> --- a/hw/cpu/a15mpcore.c
> +++ b/hw/cpu/a15mpcore.c
> @@ -64,7 +64,7 @@ static void a15mp_priv_realize(DeviceState *dev, Error 
> **errp)
>   * either all the CPUs have TZ, or none do.
>   */
>  cpuobj = OBJECT(qemu_get_cpu(0));
> -has_el3 = object_property_find(cpuobj, "has_el3", &error_abort) &&
> +has_el3 = object_property_find(cpuobj, "has_el3", NULL) &&
>  object_property_get_bool(cpuobj, "has_el3", &error_abort);
>  qdev_prop_set_bit(gicdev, "has-security-extensions", has_el3);
>  }
> diff --git a/hw/cpu/a9mpcore.c b/hw/cpu/a9mpcore.c
> index 7046246..869818c 100644
> --- a/hw/cpu/a9mpcore.c
> +++ b/hw/cpu/a9mpcore.c
> @@ -69,7 +69,7 @@ static void a9mp_priv_realize(DeviceState *dev, Error 
> **errp)
>   * either all the CPUs have TZ, or none do.
>   */
>  cpuobj = OBJECT(qemu_get_cpu(0));
> -has_el3 = object_property_find(cpuobj, "has_el3", &error_abort) &&
> +has_el3 = object_property_find(cpuobj, "has_el3", NULL) &&
>  object_property_get_bool(cpuobj, "has_el3", &error_abort);
>  qdev_prop_set_bit(gicdev, "has-security-extensions", has_el3);
>
> --
> 1.9.1
>
>
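
To make the three cases in this discussion concrete (property missing,
property present but false, property present and true), here is a small
standalone sketch. It does not use the QOM API; struct bool_prop,
find_prop() and cpu_has_el3() are hypothetical stand-ins that only mirror
the "find, then read" short-circuit pattern in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a boolean object property. */
struct bool_prop {
    const char *name;
    bool value;
};

/* Return the named property, or NULL if the object does not have it. */
static const struct bool_prop *find_prop(const struct bool_prop *props,
                                         size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(props[i].name, name) == 0) {
            return &props[i];
        }
    }
    return NULL;
}

/*
 * Mirrors "find && get_bool": a missing property is treated the same as
 * a property that exists and is false, instead of being a fatal error.
 */
static bool cpu_has_el3(const struct bool_prop *props, size_t n)
{
    const struct bool_prop *p = find_prop(props, n, "has_el3");

    return p != NULL && p->value;
}
```

With the lookup tolerant of a missing property, a CPU model that never
registers "has_el3" simply yields false, which is the behaviour the patch
is after; Peter's question is whether that case should arise for A9 and
A15 at all.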

Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 1/2] spapr: Add support for hwrng when available

2015-09-13 Thread David Gibson

On Fri, Sep 11, 2015 at 09:30:28AM +0200, Thomas Huth wrote:
> On 11/09/15 02:45, David Gibson wrote:
> > On Thu, Sep 10, 2015 at 02:03:39PM +0200, Thomas Huth wrote:
> >> On 10/09/15 12:40, David Gibson wrote:
> >>> On Thu, Sep 10, 2015 at 09:33:21AM +0200, Thomas Huth wrote:
>  On 09/09/15 23:10, Thomas Huth wrote:
> > On 08/09/15 07:15, David Gibson wrote:
>  ...
> >> At this point rather than just implementing them as discrete machine
> >> options, I suspect it will be more maintainable to split out the
> >> h-random implementation as a pseudo-device with its own qdev and so
> >> forth.  We already do similarly for the RTAS time of day functions
> >> (spapr-rtc).
> >
> > > I gave that a try, but it does not work as expected. To be able to
> > specify the options, I'd need to instantiate this device with the
> > "-device" option, right? Something like:
> >
> > -device spapr-rng,backend=rng0,usekvm=0
> >
> > Now this does not work when I use TYPE_SYS_BUS_DEVICE as parent class
> > like it is done for spapr-rtc, since the user apparently can not plug
> > device to this bus on machine spapr (you can also not plug an spapr-rtc
> > device this way!).
> >
> > The spapr-vlan, spapr-vty, etc. devices are TYPE_VIO_SPAPR_DEVICE, so I
> > also tried that instead, but then the rng device suddenly shows up under
> > /vdevice in the device tree - that's also not what we want, I guess.
> 
>  I did some more tests, and I think I can get this working with one small
>  modification to spapr_vio.c
> >> ...
>  i.e. when the dt_name has not been set, the device won't be added to the
>  /vdevice device tree node. If that's acceptable, I'll continue with this
>  approach.
> >>>
> >>> A bit hacky.
> >>>
> >>> I think it would be preferable to build it under SysBus by default,
> >>> like spapr-rtc.  Properties can be set on the device using -global (or
> >>> -set, but -global is easier).
> >>
> >> If anyhow possible, I'd prefer to use "-device" for this instead, because
> >>
> >> a) it's easier to use for the user, for example you can simply use
> >>"-device spapr-rng,?" to get the list of properties - this
> >>does not seem to work with spapr-rtc (it has a "date" property
> >>which does not show up in the help text?)
> > 
> > Actually, I don't think that's got anything to do with -device versus
> > otherwise.  "date" doesn't appear because it's an "object" property
> > rather than a "qdev" property - that distinction is subtle and
> > confusing, yes.
> 
> At least it is not very friendly for the user ... if a configuration
> property does not show up in the help text, you've got to document it
> somewhere else or nobody will be aware of it.

Not arguing with that.

In this case it happened because I just copied the setup code from
mc146818rtc which also doesn't set a description.

> >> b) unlike the rtc device which is always instantiated, the rng
> >>device is rather optional, so it is IMHO more intuitive if
> >>created via the -device option.
> > 
> > Hrm, that's true though.  And.. we're back at the perennial question
> > of what "standard" devices should be constructed by default.  And what
> > "default" means.
> > 
> > It seems to me that while the random device is optional, it should be
> > created by default.  But with -device there's not really a way to do
> > that.  But then again if it's constructed internally there's not
> > really a way to turn it off short of hacky machine options.  Ugh.
> > 
> >> So I'd like to give it a try with the TYPE_VIO_SPAPR_DEVICE first ... if
> >> you then still don't like the patches at all, I can still rework them to
> >> use TYPE_SYS_BUS_DEVICE instead.
> > 
> > I still dislike putting it on the VIO "bus", since PAPR doesn't
> > consider it a VIO device.
> 
> Hmm, that's also a valid point.
> 
> After doing some more research, I think I've found yet another
> possibility (why isn't there a proper documentation/howto for all this
> QOM stuff ... or did I just miss it?) :

Tell me about it.  The fact that there are apparently a whole bunch of
conventions about how QOM things should be done that are neither
obvious nor documented is starting to really irritate me.

> Instead of using a bus, simply set parent = TYPE_DEVICE, so that it is a
> "bus-less" device. Seems to work fine at a first glance, so unless
> somebody tells me that this is a very bad idea, I'll try to rework my
> patches accordingly...

From agraf's comment, this seems like the way to go.

I'm still pretty confused about where such a device sits in the
composition tree.  I had thought that SysBus was the root of the qdev
tree, but apparently not.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 1/2] spapr: Add support for hwrng when available

2015-09-13 Thread David Gibson

On Fri, Sep 11, 2015 at 11:43:02AM +0200, Alexander Graf wrote:
> 
> 
> On 11.09.15 02:46, David Gibson wrote:
> > On Thu, Sep 10, 2015 at 02:13:26PM +0200, Alexander Graf wrote:
> >>
> >>
> >>> Am 10.09.2015 um 14:03 schrieb Thomas Huth :
> >>>
>  On 10/09/15 12:40, David Gibson wrote:
> > On Thu, Sep 10, 2015 at 09:33:21AM +0200, Thomas Huth wrote:
> >> On 09/09/15 23:10, Thomas Huth wrote:
> >> On 08/09/15 07:15, David Gibson wrote:
> > ...
> >>> At this point rather than just implementing them as discrete machine
> >>> options, I suspect it will be more maintainable to split out the
> >>> h-random implementation as a pseudo-device with its own qdev and so
> >>> forth.  We already do similarly for the RTAS time of day functions
> >>> (spapr-rtc).
> >>
> > >> I gave that a try, but it does not work as expected. To be able to
> >> specify the options, I'd need to instantiate this device with the
> >> "-device" option, right? Something like:
> >>
> >>-device spapr-rng,backend=rng0,usekvm=0
> >>
> >> Now this does not work when I use TYPE_SYS_BUS_DEVICE as parent class
> >> like it is done for spapr-rtc, since the user apparently can not plug
> >> device to this bus on machine spapr (you can also not plug an spapr-rtc
> >> device this way!).
> >>
> >> The spapr-vlan, spapr-vty, etc. devices are TYPE_VIO_SPAPR_DEVICE, so I
> >> also tried that instead, but then the rng device suddenly shows up 
> >> under
> >> /vdevice in the device tree - that's also not what we want, I guess.
> >
> > I did some more tests, and I think I can get this working with one small
> > modification to spapr_vio.c
> >>> ...
> > i.e. when the dt_name has not been set, the device won't be added to the
> > /vdevice device tree node. If that's acceptable, I'll continue with this
> > approach.
> 
>  A bit hacky.
> 
>  I think it would be preferable to build it under SysBus by default,
>  like spapr-rtc.  Properties can be set on the device using -global (or
>  -set, but -global is easier).
> >>>
> > >>> If anyhow possible, I'd prefer to use "-device" for this instead, because
> >>>
> >>> a) it's easier to use for the user, for example you can simply use
> >>>   "-device spapr-rng,?" to get the list of properties - this
> >>>   does not seem to work with spapr-rtc (it has a "date" property
> >>>   which does not show up in the help text?)
> >>>
> >>> b) unlike the rtc device which is always instantiated, the rng
> >>>   device is rather optional, so it is IMHO more intuitive if
> >>>   created via the -device option.
> >>>
> >>> So I'd like to give it a try with the TYPE_VIO_SPAPR_DEVICE first ... if
> >>> you then still don't like the patches at all, I can still rework them to
> >>> use TYPE_SYS_BUS_DEVICE instead.
> >>
> >> Please don't use sysbus. If the vio device approach turns ugly,
> >> create a new spapr hcall bus instead. We should have had that from
> >> the beginning really.
> > 
> > Ok.. why?
> > 
> > It's a system (pseudo-)device that doesn't have any common bus
> > infrastructure with anything else.  Isn't that what SysBus is for?
> 
> No, sysbus means "A device that has MMIO and/or PIO connected via a bus
> I'm too lazy to model" really. These devices have neither.

Oh.

So.. where is one supposed to find that out?

> Back in the days before QOM, sysbus was our lowest common denominator,
> but now that we have TYPE_DEVICE and can branch off of that, we really
> shouldn't abuse sysbus devices for things they aren't.

So what actually is the root of the qdev tree then?

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v3] ppc/spapr: Implement H_RANDOM hypercall in QEMU

2015-09-13 Thread David Gibson

On Fri, Sep 11, 2015 at 11:17:01AM +0200, Thomas Huth wrote:
> The PAPR interface defines a hypercall to pass high-quality
> hardware generated random numbers to guests. Recent kernels can
> already provide this hypercall to the guest if the right hardware
> random number generator is available. But in case the user wants
> to use another source like EGD, or QEMU is running with an older
> kernel, we should also have this call in QEMU, so that guests that
> do not support virtio-rng yet can get good random numbers, too.
> 
> This patch now adds a new pseudo-device to QEMU that either
> directly provides this hypercall to the guest or is able to
> enable the in-kernel hypercall if available. The in-kernel
> hypercall can be enabled with the use-kvm property, e.g.:
> 
>  qemu-system-ppc64 -device spapr-rng,use-kvm=true
> 
> For handling the hypercall in QEMU instead, a RngBackend is required
> since the hypercall should provide "good" random data instead of
> pseudo-random (like from a "simple" library function like rand()
> or g_random_int()). Since there are multiple RngBackends available,
> the user must select an appropriate backend via the "backend"
> property of the device, e.g.:
> 
>  qemu-system-ppc64 -object rng-random,filename=/dev/hwrng,id=rng0 \
>-device spapr-rng,backend=rng0 ...
> 
> See http://wiki.qemu-project.org/Features-Done/VirtIORNG for
> other example of specifying RngBackends.
> 
> Signed-off-by: Thomas Huth 
> ---
>  v3:
>  - Completely reworked the patch set according to the discussion
>on the mailing list, so that the code is now encapsulated
>as a QEMU device in a separate file.

Looking good..

> 
>  hw/ppc/Makefile.objs   |   2 +-
>  hw/ppc/spapr.c |   8 +++
>  hw/ppc/spapr_rng.c | 178 
> +
>  include/hw/ppc/spapr.h |   4 ++
>  target-ppc/kvm.c   |   9 +++
>  target-ppc/kvm_ppc.h   |   5 ++
>  6 files changed, 205 insertions(+), 1 deletion(-)
>  create mode 100644 hw/ppc/spapr_rng.c
> 
> diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
> index c8ab06e..c1ffc77 100644
> --- a/hw/ppc/Makefile.objs
> +++ b/hw/ppc/Makefile.objs
> @@ -3,7 +3,7 @@ obj-y += ppc.o ppc_booke.o
>  # IBM pSeries (sPAPR)
>  obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o
>  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
> -obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o
> +obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
>  ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
>  obj-y += spapr_pci_vfio.o
>  endif
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index bf0c64f..34e7d24 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -768,6 +768,14 @@ static void spapr_finalize_fdt(sPAPRMachineState *spapr,
>  exit(1);
>  }
>  
> +if (object_resolve_path_type("", TYPE_SPAPR_RNG, NULL)) {
> +ret = spapr_rng_populate_dt(fdt);
> +if (ret < 0) {
> +fprintf(stderr, "couldn't setup rng device in fdt\n");
> +exit(1);
> +}
> +}
> +
>  QLIST_FOREACH(phb, &spapr->phbs, list) {
>  ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
>  }
> diff --git a/hw/ppc/spapr_rng.c b/hw/ppc/spapr_rng.c
> new file mode 100644
> index 000..d4923bc
> --- /dev/null
> +++ b/hw/ppc/spapr_rng.c
> @@ -0,0 +1,178 @@
> +/*
> + * QEMU sPAPR random number generator "device" for H_RANDOM hypercall
> + *
> + * Copyright 2015 Thomas Huth, Red Hat Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License,
> + * or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see .
> + */
> +
> +#include "qemu/error-report.h"
> +#include "sysemu/sysemu.h"
> +#include "sysemu/device_tree.h"
> +#include "sysemu/rng.h"
> +#include "hw/ppc/spapr.h"
> +#include "kvm_ppc.h"
> +
> +#define SPAPR_RNG(obj) \
> +OBJECT_CHECK(sPAPRRngState, (obj), TYPE_SPAPR_RNG)
> +
> +typedef struct sPAPRRngState {
> +/*< private >*/
> +DeviceState ds;
> +RngBackend *backend;
> +bool use_kvm;
> +} sPAPRRngState;
> +
> +typedef struct HRandomData {
> +QemuSemaphore sem;
> +union {
> +uint64_t v64;
> +uint8_t v8[8];
> +} val;
> +int received;
> +} HRandomData;
> +
> +/* Callback function for the RngBackend */
> +static void random_recv(void *dest, const void *src, size_t size)
> +{
> +HRandomData *hrdp = dest;
>

Re: [Qemu-devel] [PATCH qemu v2 2/2] spapr_pci: Remove constraints about VFIO-PCI devices

2015-09-13 Thread David Gibson

On Fri, Sep 11, 2015 at 02:03:38PM -0600, Alex Williamson wrote:
> On Wed, 2015-09-09 at 20:43 -0600, Alex Williamson wrote:
> > On Thu, 2015-09-03 at 14:40 +1000, Alexey Kardashevskiy wrote:
> > > So far there were 2 limitations enforced on an emulated PHB
> > > regarding VFIO:
> > > 1) only one IOMMU group per IOMMU container was allowed and
> > > the spapr-pci-vfio-host-bridge device has an IOMMU ID property for this;
> > > 2) only one IOMMU container per PHB was allowed as a group
> > > can only be attached to one container.
> > > 
> > > However these are not really necessary so we are removing them here.
> > > 
> > > This deprecates IOMMU group ID and changes vfio_container_do_ioctl()
> > > not to receive it. Instead of getting a container from a group ID,
> > > this calls ioctl() for every container attached to the PHB address space.
> > > This allows multiple containers on the same PHB which means multiple
> > > groups per PHB. Note that this will lead to every H_PUT_TCE&etc call
> > > to be passed to every container separately which will affect
> > > the performance. For 32bit devices it is still recommended to put
> > > every group to a separate PHB.
> > > 
> > > Since the existing VFIO code is already trying to share a container for
> > > multiple groups, just removing a group id from
> > > the vfio_container_do_ioctl() allows the existing code to share
> > > a container if it is supported by the host kernel.
> > > 
> > > This replaces the check for a group id to be set correctly with
> > > the check that it is not set.
> > > 
> > > This removes casts to sPAPRPHBVFIOState as none of sPAPRPHBVFIOState
> > > members is accessed here.
> > > 
> > > Signed-off-by: Alexey Kardashevskiy 
> > > ---
> > >  hw/ppc/spapr_pci.c  | 10 +-
> > >  hw/ppc/spapr_pci_vfio.c | 17 ++---
> > >  hw/vfio/common.c| 22 ++
> > >  include/hw/vfio/vfio.h  |  3 +--
> > >  4 files changed, 18 insertions(+), 34 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > > index 4e289cb..1b7559d 100644
> > > --- a/hw/ppc/spapr_pci.c
> > > +++ b/hw/ppc/spapr_pci.c
> > > @@ -810,11 +810,6 @@ static int 
> > > spapr_phb_dma_capabilities_update(sPAPRPHBState *sphb)
> > >  pci_for_each_device(bus, pci_bus_num(bus), spapr_phb_walk_devices, 
> > > sphb);
> > >  
> > >  if (sphb->vfio_num) {
> > > -if (sphb->iommugroupid == -1) {
> > > -error_report("Wrong IOMMU group ID %d", sphb->iommugroupid);
> > > -return -1;
> > > -}
> > > -
> > >  ret = spapr_phb_vfio_dma_capabilities_update(sphb);
> > >  if (ret) {
> > >  error_report("Unable to get DMA32 info from VFIO");
> > > @@ -1282,6 +1277,11 @@ static void spapr_phb_realize(DeviceState *dev, 
> > > Error **errp)
> > >  PCIBus *bus;
> > >  uint64_t msi_window_size = 4096;
> > >  
> > > +if ((sphb->iommugroupid != -1) &&
> > > +object_dynamic_cast(OBJECT(sphb), 
> > > TYPE_SPAPR_PCI_VFIO_HOST_BRIDGE)) {
> > > +error_report("Warning: iommugroupid is deprecated and will be 
> > > ignored");
> > > +}
> > > +
> > >  if (sphb->index != (uint32_t)-1) {
> > >  hwaddr windows_base;
> > >  
> > > diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
> > > index f94d8a4..48137d5 100644
> > > --- a/hw/ppc/spapr_pci_vfio.c
> > > +++ b/hw/ppc/spapr_pci_vfio.c
> > > @@ -35,7 +35,7 @@ int 
> > > spapr_phb_vfio_dma_capabilities_update(sPAPRPHBState *sphb)
> > >  struct vfio_iommu_spapr_tce_info info = { .argsz = sizeof(info) };
> > >  int ret;
> > >  
> > > -ret = vfio_container_ioctl(&sphb->iommu_as, sphb->iommugroupid,
> > > +ret = vfio_container_ioctl(&sphb->iommu_as,
> > > VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
> > >  if (ret) {
> > >  return ret;
> > > @@ -54,8 +54,7 @@ void spapr_phb_vfio_eeh_reenable(sPAPRPHBState *sphb)
> > >  .op= VFIO_EEH_PE_ENABLE
> > >  };
> > >  
> > > -vfio_container_ioctl(&sphb->iommu_as,
> > > - sphb->iommugroupid, VFIO_EEH_PE_OP, &op);
> > > +vfio_container_ioctl(&sphb->iommu_as, VFIO_EEH_PE_OP, &op);
> > >  }
> > >  
> > >  int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
> > > @@ -81,8 +80,7 @@ int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
> > >  return RTAS_OUT_PARAM_ERROR;
> > >  }
> > >  
> > > -ret = vfio_container_ioctl(&sphb->iommu_as, sphb->iommugroupid,
> > > -   VFIO_EEH_PE_OP, &op);
> > > +ret = vfio_container_ioctl(&sphb->iommu_as, VFIO_EEH_PE_OP, &op);
> > >  if (ret < 0) {
> > >  return RTAS_OUT_HW_ERROR;
> > >  }
> > > @@ -96,8 +94,7 @@ int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, 
> > > int *state)
> > >  int ret;
> > >  
> > >  op.op = VFIO_EEH_PE_GET_STATE;
> > > -ret = vfio_container_ioctl(&sphb->iommu_as, sphb->iommugroupid,
> > > -

Re: [Qemu-devel] [RFCv2 2/2] spapr: Don't use QOM [*] syntax for DR connectors.

2015-09-13 Thread Bharata B Rao

On Mon, Sep 14, 2015 at 11:41:53AM +1000, David Gibson wrote:
> The dynamic reconfiguration (hotplug) code for the pseries machine type
> uses a "DR connector" QOM object for each resource it will be possible
> to hotplug.  Each of these is added to its owner using
> object_property_add_child(owner, "dr-connector[*]", ...);
> 
> That works ok, mostly, but it means that the property indices are
> arbitrary, depending on the order in which the connectors are constructed.
> When we have both memory and cpu hotplug, the connectors will be under the
> same parent (at least in the current drafts), meaning the indices don't
> correspond to any meaningful ID.
> 
> It gets worse when large amounts of hotpluggable RAM are configured.  For
> RAM, there's a DR connector object for every 256MB of potential memory.  So
> if maxmem=2T, for example, there are 8192 objects under the same parent.
> 
> The QOM interfaces aren't really designed for this.  In particular
> object_property_add() with [*] has O(n^2) time complexity (in the number of
> existing children): first it has a linear search through array indices to
> find a free slot, each of which is attempted with a recursive call to
> object_property_add() with a specific [N].  Those calls are O(n) because
> there's a linear search through all properties to check for duplicates.
> 
> By using a meaningful index value, which we already know is unique, we can
> avoid the [*] special behaviour.  That lets us reduce the total time for
> creating the DR objects from O(n^3) to O(n^2).
> 
> O(n^2) is still kind of crappy, but it's enough to reduce the startup time
> of qemu with maxmem=2T from ~20 minutes to ~4 seconds.
> 
> Signed-off-by: David Gibson 
> Cc: Bharata B Rao 

This patch works correctly with both CPU and memory hotplug.

Regards,
Bharata.

Re: [Qemu-devel] [PATCH v8 0/7] sPAPR: Support EEH Error Injection

2015-09-13 Thread David Gibson

On Mon, Sep 14, 2015 at 11:36:08AM +1000, Gavin Shan wrote:
> The patchset depends on below Linux upstream commits:
> 
>   commit ed3e81f ("powerpc/eeh: Move PE state constants around")
>   commit ec33d36 ("powerpc/eeh: Introduce eeh_pe_inject_err()")
> 
> According to PAPR specification 2.7, there are 3 RTAS calls relevant to error
> injection: "ibm,open-errinjct", "ibm,close-errinjct", "ibm,errinjct". The
> userland utility "errinjct" running in the guest uses those 3 RTAS calls as
> follows: it calls "ibm,open-errinjct", which returns an open-token; the token
> is passed to "ibm,errinjct" together with error-specific arguments to inject
> the error; finally, it returns the open-token by calling "ibm,close-errinjct".
> 
> "ibm,errinjct" can be used to inject various errors, not limited to EEH 
> errors.
> However, this patchset is going to support injecting EEH errors only for VFIO
> PCI devices.

I'm happy to merge 6..7/7 once 1..5/7 are in - I'm not sure what tree
they should be going through.

> =
> Changelog
> =
> v8:
>* Rebased to git://github.com/dgibson/qemu.git (branch: spapr-next)
>* Apply "git -C $to commit" to update-linux-headers.sh.
>* Use "git rev-parse --short HEAD" to retrieve top commit
>* Use "EOF" to construct the commit message
>* Drop sPAPRPHBClass::eeh_inject_error().
> v7:
>* Cover comments from Peter Maydell in scripts/update-linux-headers.sh.
>* Reset spapr->errinjct_token when rebooting guest.
> v6:
>* Improved scripts/update-linux-headers.sh to format commit log with
>  last commit ID and Linux kernel version. Also, "stdint.h" is allowed
>  to be included in virtio headers.
>* #include "asm-powerpc/eeh.h".
>* Incremental spapr->errinjct_token so that the condition (0x1 &
>  spapr->errinjct_token) can be used to check if the token is valid.
>* Big-endian tokens in /rtas/ibm,errinjct-tokens.
>* Pick rtas_ldq() to load 64-bits value from RTAS call buffer, which
>  was dropped in v2.
>* Use EEH_ERR_FUNC_MAX to validate EEH error function.
>* Removed unnecessary parentheses.
> v5:
>* Put "errinjct_token" to migration stream disregarding it's opened or
>  not. Also, it starts to be supported from v4 vmstate_spapr.
>* Include powerpc/include/uapi/asm/eeh.h in scripts/update_linux_headers.sh
> v4:
>* To record currently opened token, not next one as suggested by Alexey.
> v3:
>* Replace random token number with incremental counter. Another boolean
>  variable to track if it's opened. Both of them are added to migration
>  stream.
>* The return value from sPAPRPHBClass::eeh_inject_error() can be passed
>  to user directly. No need to do conversion.
>* Corrected error code to RTAS_OUT_CLOSE_ERROR in rtas_ibm_errinjct().
>* Don't expose error injection tokens for unsupported types.
> v2:
>* Rebased to git://github.com/dgibson/qemu.git (branch: spapr-next)
>* Remove specific PCI error types in hw/ppc/spapr.h. Use those macros
>  asm-powerpc/eeh.h instead.
> 
> Gavin Shan (7):
>   scripts: Allow include "stdint.h" in virtio headers
>   scripts: Include arch/powerpc/include/uapi/asm/eeh.h
>   scripts: Submit changes while updating linux headers
>   Synchronize Linux headers from kernel 4.3.0-rc1
>   Obsolete PCI_MSIX_FLAGS_BIRMASK
>   sPAPR: Support RTAS call ibm, {open, close}-errinjct
>   sPAPR: Support RTAS call ibm,errinjct
> 
>  hw/i386/kvm/pci-assign.c |   4 +-
>  hw/pci/msix.c|   2 +-
>  hw/pci/pcie_aer.c|   2 +-
>  hw/ppc/spapr.c   |   9 +-
>  hw/ppc/spapr_pci.c   |  30 +++
>  hw/ppc/spapr_pci_vfio.c  |  32 +++
>  hw/ppc/spapr_rtas.c  | 137 ++
>  hw/s390x/s390-pci-bus.c  |   8 +-
>  hw/vfio/pci.c|   8 +-
>  hw/xen/xen_pt_msi.c  |   4 +-
>  include/hw/pci-host/spapr.h  |   3 +
>  include/hw/ppc/spapr.h   |  16 +-
>  include/standard-headers/linux/pci_regs.h| 381 
> ---
>  include/standard-headers/linux/virtio_ring.h |   3 +-
>  linux-headers/asm-arm64/kvm.h|  37 ++-
>  linux-headers/asm-powerpc/eeh.h  |  56 
>  linux-headers/asm-x86/hyperv.h   |   4 +
>  linux-headers/asm-x86/kvm.h  |   4 +-
>  linux-headers/linux/kvm.h|   7 +
>  scripts/update-linux-headers.sh  |  34 ++-
>  tests/libqos/pci.c   |   8 +-
>  21 files changed, 667 insertions(+), 122 deletions(-)
>  create mode 100644 linux-headers/asm-powerpc/eeh.h
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
h

Re: [Qemu-devel] [PATCH v8 3/7] scripts: Submit changes while updating linux headers

2015-09-13 Thread David Gibson

On Mon, Sep 14, 2015 at 11:36:11AM +1000, Gavin Shan wrote:
> This submits changes with formatted commit log while updating Linux
> headers using scripts/update-linux-headers.sh.
> 
> Signed-off-by: Gavin Shan 

Reviewed-by: David Gibson 


> ---
>  scripts/update-linux-headers.sh | 30 ++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
> index 18daabe..a345632 100755
> --- a/scripts/update-linux-headers.sh
> +++ b/scripts/update-linux-headers.sh
> @@ -63,6 +63,34 @@ cp_virtio() {
>  fi
>  }
>  
> +submit_change() {
> +from=$1
> +to=$2
> +if ! [ -e "$to/include/qemu-common.h" ]; then
> +echo "$to not QEMU source directory, skip submitting changes"
> +exit 3
> +fi
> +
> +version=$(make -C "$from" -s kernelversion)
> +commit=$(git -C "$from" rev-parse --short HEAD)
> +message=$(cat <<EOF
> +Synchronize Linux headers from kernel $version
> +
> +Synchronize the Linux headers from kernel version $version
> +(commit $commit)
> +
> +This commit was created automatically by update-linux-headers.sh.
> +EOF
> +)
> +
> +if git -C "$to" commit -qa -m "$message" -s ; then
> +echo "Changes submitted successfully"
> +else
> +echo "Failure submitting changes"
> +exit 4
> +fi
> +}
> +
>  # This will pick up non-directories too (eg "Kconfig") but we will
>  # ignore them in the next loop.
>  ARCHLIST=$(cd "$linux/arch" && echo *)
> @@ -132,3 +160,5 @@ cat <<EOF >$output/include/standard-headers/linux/if_ether.h
>  EOF
>  
>  rm -rf "$tmpdir"
> +
> +submit_change "$linux" "$output"
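
For readers who want to see what the heredoc in submit_change() expands to, here is a minimal Python sketch of the same message layout. The helper name format_commit_message and the sample commit ID are assumptions made for illustration only; the patch itself obtains the version with "make -s kernelversion" and the commit with "git rev-parse --short HEAD".

```python
def format_commit_message(version, commit):
    """Lay out the commit log the same way the heredoc in the patch does.

    format_commit_message is a hypothetical helper used only in this sketch.
    """
    lines = [
        f"Synchronize Linux headers from kernel {version}",
        "",
        f"Synchronize the Linux headers from kernel version {version}",
        f"(commit {commit})",
        "",
        "This commit was created automatically by update-linux-headers.sh.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # "abc1234" is a made-up short commit ID, not a real kernel commit.
    print(format_commit_message("4.3.0-rc1", "abc1234"))
```

The first line doubles as the commit subject, which is why the version is repeated in the body below it.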

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [RFCv2 2/2] spapr: Don't use QOM [*] syntax for DR connectors.

2015-09-13 Thread David Gibson

On Mon, Sep 14, 2015 at 09:37:16AM +0530, Bharata B Rao wrote:
> On Mon, Sep 14, 2015 at 11:41:53AM +1000, David Gibson wrote:
> > The dynamic reconfiguration (hotplug) code for the pseries machine type
> > uses a "DR connector" QOM object for each resource it will be possible
> > to hotplug.  Each of these is added to its owner using
> > object_property_add_child(owner, "dr-connector[*]", ...);
> > 
> > That works ok, mostly, but it means that the property indices are
> > arbitrary, depending on the order in which the connectors are constructed.
> > When we have both memory and cpu hotplug, the connectors will be under the
> > same parent (at least in the current drafts), meaning the indices don't
> > correspond to any meaningful ID.
> > 
> > It gets worse when large amounts of hotpluggable RAM are configured.  For
> > RAM, there's a DR connector object for every 256MB of potential memory.  So
> > if maxmem=2T, for example, there are 8192 objects under the same parent.
> > 
> > The QOM interfaces aren't really designed for this.  In particular
> > object_property_add() with [*] has O(n^2) time complexity (in the number of
> > existing children): first it has a linear search through array indices to
> > find a free slot, each of which is attempted via a recursive call to
> > object_property_add() with a specific [N].  Those calls are O(n) because
> > there's a linear search through all properties to check for duplicates.
> > 
> > By using a meaningful index value, which we already know is unique we can
> > avoid the [*] special behaviour.  That lets us reduce the total time for
> > creating the DR objects from O(n^3) to O(n^2).
> > 
> > O(n^2) is still kind of crappy, but it's enough to reduce the startup time
> > of qemu with maxmem=2T from ~20 minutes to ~4 seconds.
> > 
> > Signed-off-by: David Gibson 
> > Cc: Bharata B Rao 
> 
> This patch works correctly with both CPU and memory hotplug.

Care to send a Reviewed-by and/or Tested-by in that case?
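
The complexity argument in the quoted commit message can be illustrated with a toy model. The sketch below is not QEMU code: it assumes a property list that is scanned linearly for duplicates, and a "[*]" insertion that simply retries [0], [1], ... until one succeeds. All function names are hypothetical, and the real object_property_add() may differ in detail (for example in whether its scans exit early).

```python
def connector_count(maxmem_bytes, granule_bytes=256 * 2**20):
    """Number of DR connectors if there is one per 256MB of potential memory."""
    return maxmem_bytes // granule_bytes


def add_exact(props, name, counter):
    """Append name after a linear duplicate check; return False on a clash."""
    for existing in props:
        counter[0] += 1          # one string comparison
        if existing == name:
            return False
    props.append(name)
    return True


def add_wildcard(props, base, counter):
    """Model of "[*]": try base[0], base[1], ... until an index is free."""
    index = 0
    while not add_exact(props, f"{base}[{index}]", counter):
        index += 1
    return index


def count_comparisons(n, use_wildcard):
    """Total comparisons needed to create n children under one parent."""
    props, counter = [], [0]
    for i in range(n):
        if use_wildcard:
            add_wildcard(props, "dr-connector", counter)
        else:
            add_exact(props, f"dr-connector[{i}]", counter)
    return counter[0]


if __name__ == "__main__":
    print("connectors for maxmem=2T:", connector_count(2 * 2**40))
    for n in (16, 64):
        print(n, count_comparisons(n, True), count_comparisons(n, False))
```

In this model the explicit-index path costs n(n-1)/2 comparisons in total, while the wildcard path grows roughly with n^3/6, which matches the O(n^3) versus O(n^2) claim above.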

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [RFCv2 2/2] spapr: Don't use QOM [*] syntax for DR connectors.

2015-09-13 Thread Bharata B Rao

On Mon, Sep 14, 2015 at 02:14:59PM +1000, David Gibson wrote:
> On Mon, Sep 14, 2015 at 09:37:16AM +0530, Bharata B Rao wrote:
> > On Mon, Sep 14, 2015 at 11:41:53AM +1000, David Gibson wrote:
> > > The dynamic reconfiguration (hotplug) code for the pseries machine type
> > > uses a "DR connector" QOM object for each resource it will be possible
> > > to hotplug.  Each of these is added to its owner using
> > > object_property_add_child(owner, "dr-connector[*]", ...);
> > > 
> > > That works ok, mostly, but it means that the property indices are
> > > arbitrary, depending on the order in which the connectors are constructed.
> > > When we have both memory and cpu hotplug, the connectors will be under the
> > > same parent (at least in the current drafts), meaning the indices don't
> > > correspond to any meaningful ID.
> > > 
> > > It gets worse when large amounts of hotpluggable RAM are configured.  For
> > > RAM, there's a DR connector object for every 256MB of potential memory.  So
> > > if maxmem=2T, for example, there are 8192 objects under the same parent.
> > > 
> > > The QOM interfaces aren't really designed for this.  In particular
> > > object_property_add() with [*] has O(n^2) time complexity (in the number of
> > > existing children): first it has a linear search through array indices to
> > > find a free slot, each of which is attempted via a recursive call to
> > > object_property_add() with a specific [N].  Those calls are O(n) because
> > > there's a linear search through all properties to check for duplicates.
> > > 
> > > By using a meaningful index value, which we already know is unique we can
> > > avoid the [*] special behaviour.  That lets us reduce the total time for
> > > creating the DR objects from O(n^3) to O(n^2).
> > > 
> > > O(n^2) is still kind of crappy, but it's enough to reduce the startup time
> > > of qemu with maxmem=2T from ~20 minutes to ~4 seconds.
> > > 
> > > Signed-off-by: David Gibson 
> > > Cc: Bharata B Rao 
> > 
> > This patch works correctly with both CPU and memory hotplug.
> 
> Care to send a Reviewed-by and/or Tested-by in that case?

Sorry,

Tested-by: Bharata B Rao

Re: [Qemu-devel] [PATCH v10 00/10] Add a netfilter object and netbuffer filter

2015-09-13 Thread Yang Hongyang


Hi Stefan, Jason,

I've converted this series to be based on QOM, and introduced NetQueue APIs
instead of using NetQueue internals, as Stefan suggested. Could you please take a
look at it?
Most of the details have been reviewed by Jason, and the overall filter logic
hasn't changed.
One feature missing compared to previous versions is multiqueue support.
I've already implemented it, but before sending it out I'd like to get as
many review comments as possible on this series and address them, in order to
reduce the number of iterations. Multiqueue support can be sent later as a
separate series if the base can go in first. If there have to be another few
rounds, I will include the multiqueue patches.

Thanks in advance.

On 09/09/2015 03:24 PM, Yang Hongyang wrote:

This patch adds a netfilter abstract object that captures all network packets
on the associated netdev. It also implements a concrete buffer filter based on
this abstract object. The "buffer" netfilter could be used by VM FT solutions
like MicroCheckpointing to buffer/release packets, or to simulate
packet delay.

You can also get the series from:
https://github.com/macrosheep/qemu/tree/netfilter-v10

Usage:
  -netdev tap,id=bn0
  -device e1000,netdev=bn0
  -object filter-buffer,id=f0,netdev=bn0,chain=in,interval=1000

dynamically add/remove netfilters:
  object_add filter-buffer,id=f0,netdev=bn0,chain=in,interval=1000
  object_del f0

NOTE:
  interval's scale is microsecond.
  chain is optional, and is one of in|out|all, default is "all".
"in" means this filter will receive packets sent to the @netdev
"out" means this filter will receive packets sent from the @netdev
"all" means this filter will receive packets both sent to/from
  the @netdev
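
The option semantics in the NOTE above can be sketched in a few lines of Python. This is only an illustration of the described defaults, not QEMU's option parser: parse_filter_options and sees_packet are hypothetical names, and the only facts taken from the cover letter are the option string format, the microsecond unit of interval, and the in|out|all meaning of chain with "all" as default.

```python
VALID_CHAINS = ("in", "out", "all")


def parse_filter_options(spec):
    """Split 'type,key=value,...' into (type_name, options).

    Hypothetical helper: applies the documented default chain="all" and
    converts interval (given in microseconds) to an integer.
    """
    type_name, *pairs = spec.split(",")
    options = dict(pair.split("=", 1) for pair in pairs)
    options.setdefault("chain", "all")
    if options["chain"] not in VALID_CHAINS:
        raise ValueError(f"chain must be one of {VALID_CHAINS}")
    if "interval" in options:
        options["interval"] = int(options["interval"])  # microseconds
    return type_name, options


def sees_packet(chain, direction):
    """direction is 'in' (sent to the netdev) or 'out' (sent from it)."""
    return chain == "all" or chain == direction


if __name__ == "__main__":
    name, opts = parse_filter_options(
        "filter-buffer,id=f0,netdev=bn0,chain=in,interval=1000")
    print(name, opts)
    print(sees_packet(opts["chain"], "out"))
```

With chain=in, as in the usage example, only packets sent to the netdev reach the filter; leaving chain out makes the filter see both directions.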

TODO:
  - multiqueue

v10:
  - Reimplemented using QOM (suggested by stefan)
  - Do not export NetQueue internals (suggested by stefan)
  - see individual patch for detail

v9:
  - squash command description and help to patch 1&3
  - qapi changes according to Markus&Eric's comments
  - see individual patch for detail

v8:
  - some minor fixes according to Thomas's comments
  - rebased to the latest master branch

v7:
  - print filter info when execute 'info network'
  - addressed Jason's comments

v6:
  - add multiqueue support, please see individual patch for detail

v5:
  - add a sent_cb param to filter receive_iov api
  - squash the 4th patch into patch 3
  - remove dummy sent_cb (buffer filter)
  - addressed Jason's other comments, see individual patches for detail

v4:
  - get rid of struct Filter
  - squash the 4th patch into patch 2
  - fix qemu_netfilter_pass_to_next_iov
  - get rid of bh (buffer filter)
  - release the packet to next filter instead of to receiver (buffer filter)

v3:
  - add an api to pass the packet to next filter
  - remove netfilters when delete netdev
  - add qtest testcases for netfilter
  - addressed comments from Jason

v2:
  - add a chain option to netfilter object
  - move the hook place earlier, before net_queue_send
  - drop the unused api in buffer filter
  - squash buffer filter patches into one
  - remove receive() api from netfilter, only receive_iov() is enough
  - addressed comments from Jason&Thomas

v1:
  initial patch.

Yang Hongyang (10):
   qmp: delete qemu opts when delete an object
   init/cleanup of netfilter object
   netfilter: hook packets before net queue send
   net: merge qemu_deliver_packet and qemu_deliver_packet_iov
   net/queue: introduce NetQueueDeliverFunc
   netfilter: add an API to pass the packet to next filter
   netfilter: print filter info associate with the netdev
   net/queue: export qemu_net_queue_append_iov
   netfilter: add a netbuffer filter
   tests: add test cases for netfilter object

  include/net/filter.h|  68 
  include/net/net.h   |   6 +-
  include/net/queue.h |  20 -
  include/qemu/typedefs.h |   1 +
  net/Makefile.objs   |   2 +
  net/filter-buffer.c | 169 ++
  net/filter.c| 213 
  net/net.c   | 116 --
  net/queue.c |  24 --
  qapi-schema.json|  18 
  qemu-options.hx |  18 
  qmp.c   |   4 +
  tests/.gitignore|   1 +
  tests/Makefile  |   2 +
  tests/test-netfilter.c  | 200 +
  vl.c|  18 ++--
  16 files changed, 833 insertions(+), 47 deletions(-)
  create mode 100644 include/net/filter.h
  create mode 100644 net/filter-buffer.c
  create mode 100644 net/filter.c
  create mode 100644 tests/test-netfilter.c



--
Thanks,
Yang.

Re: [Qemu-devel] [PATCH 2/4] Fix bad error handling after memory_region_init_ram()

2015-09-13 Thread Peter Crosthwaite

On Fri, Sep 11, 2015 at 7:51 AM, Markus Armbruster  wrote:
> Symptom:
>
> $ qemu-system-x86_64 -m 1000
> Unexpected error in ram_block_add() at /work/armbru/qemu/exec.c:1456:
> upstream-qemu: cannot set up guest memory 'pc.ram': Cannot allocate memory
> Aborted (core dumped)
>
> Root cause: commit ef701d7 screwed up handling of out-of-memory
> conditions.  Before the commit, we report the error and exit(1), in
> one place, ram_block_add().  The commit lifts the error handling up
> the call chain some, to three places.  Fine.  Except it uses
> &error_abort in these places, changing the behavior from exit(1) to
> abort(), and thus undoing the work of commit 3922825 "exec: Don't
> abort when we can't allocate guest memory".
>
> The three places are:
>
> * memory_region_init_ram()
>
>   Commit 4994653 (right after commit ef701d7) lifted the error
>   handling further, through memory_region_init_ram(), multiplying the
>   incorrect use of &error_abort.  Later on, imitation of existing
>   (bad) code may have created more.
>
> * memory_region_init_ram_ptr()
>
>   The &error_abort is still there.
>
> * memory_region_init_rom_device()
>
>   Doesn't need fixing, because commit 33e0eb5 (soon after commit
>   ef701d7) lifted the error handling further, and in the process
>   changed it from &error_abort to passing it up the call chain.
>   Correct, because the callers are realize() methods.
>
> Fix the error handling after memory_region_init_ram() with a
> Coccinelle semantic patch:
>
> @r@
> expression mr, owner, name, size, err;
> position p;
> @@
> memory_region_init_ram(mr, owner, name, size,
> (
> -  &error_abort
> +  &error_fatal
> |
>err@p
> )
>   );
> @script:python@
> p << r.p;
> @@
> print "%s:%s:%s" % (p[0].file, p[0].line, p[0].column)
>
> When the last argument is &error_abort, it gets replaced by
> &error_fatal.  This is the fix.
>
> If the last argument is anything else, its position is reported.  This
> lets us check the fix is complete.  Four positions get reported:
>
> * ram_backend_memory_alloc()
>
>   Error is passed up the call chain, ultimately through
>   user_creatable_complete().  As far as I can tell, its callers all
>   handle the error sanely.
>
> * fsl_imx25_realize(), fsl_imx31_realize(), dp8393x_realize()
>

This is super modern code that is the exception to the rule doing it right.

>   DeviceClass.realize() methods, errors handled sanely further up the
>   call chain.
>
> We're good.  Test case again behaves:
>
> $ qemu-system-x86_64 -m 1000
> qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate memory
> [Exit 1 ]
>
> The next commits will repair the rest of commit ef701d7's damage.
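
To make the two branches of the quoted semantic patch concrete, here is a rough Python approximation using a regular expression. It is a sketch only: Coccinelle parses C properly, whereas this regex assumes each call ends in ");" and that the last argument is a plain identifier optionally prefixed with "&". The names CALL and fix_calls are hypothetical, and unlike the semantic patch this sketch reports the argument text rather than its file position.

```python
import re

# Matches memory_region_init_ram(...) but not memory_region_init_ram_ptr(...).
# Group 2 is the last argument of the call.
CALL = re.compile(
    r"(\bmemory_region_init_ram\s*\([^;]*?,\s*)(&?\w+)(\s*\)\s*;)"
)


def fix_calls(source):
    """Replace &error_abort with &error_fatal; collect every other last arg."""
    others = []

    def repl(match):
        if match.group(2) == "&error_abort":
            return match.group(1) + "&error_fatal" + match.group(3)
        others.append(match.group(2))   # left untouched, reported for review
        return match.group(0)

    return CALL.sub(repl, source), others


if __name__ == "__main__":
    sample = (
        'memory_region_init_ram(ram, NULL, "pc.ram", size, &error_abort);\n'
        'memory_region_init_ram(&s->iram, NULL, "imx25.iram", 0x20000, &err);\n'
    )
    fixed, others = fix_calls(sample)
    print(fixed)
    print(others)
```

The reported leftovers play the same role as the positions printed by the Python script rule in the semantic patch: they are the call sites a human still has to check.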
>
> Signed-off-by: Markus Armbruster 
> ---
>  hw/arm/armv7m.c  |  2 +-
>  hw/arm/exynos4210.c  |  8 
>  hw/arm/highbank.c|  2 +-
>  hw/arm/integratorcp.c|  2 +-
>  hw/arm/mainstone.c   |  2 +-
>  hw/arm/musicpal.c|  2 +-
>  hw/arm/omap1.c   |  2 +-
>  hw/arm/omap2.c   |  2 +-
>  hw/arm/omap_sx1.c|  4 ++--
>  hw/arm/palm.c|  2 +-
>  hw/arm/pxa2xx.c  |  8 
>  hw/arm/realview.c|  6 +++---
>  hw/arm/spitz.c   |  2 +-
>  hw/arm/stellaris.c   |  4 ++--
>  hw/arm/stm32f205_soc.c   |  4 ++--
>  hw/arm/tosa.c|  2 +-
>  hw/arm/vexpress.c|  6 +++---
>  hw/arm/xilinx_zynq.c |  2 +-
>  hw/arm/xlnx-zynqmp.c |  2 +-
>  hw/block/onenand.c   |  2 +-
>  hw/cris/axis_dev88.c |  2 +-
>  hw/display/cg3.c |  4 ++--
>  hw/display/qxl.c |  6 +++---
>  hw/display/sm501.c   |  2 +-
>  hw/display/tc6393xb.c|  2 +-
>  hw/display/tcx.c |  4 ++--
>  hw/display/vga.c |  2 +-
>  hw/display/vmware_vga.c  |  2 +-
>  hw/i386/pc.c |  2 +-
>  hw/i386/pc_sysfw.c   |  4 ++--
>  hw/input/milkymist-softusb.c |  4 ++--
>  hw/m68k/an5206.c |  2 +-
>  hw/m68k/mcf5208.c|  2 +-
>  hw/microblaze/petalogix_ml605_mmu.c  |  4 ++--
>  hw/microblaze/petalogix_s3adsp1800_mmu.c |  4 ++--
>  hw/mips/mips_fulong2e.c  |  2 +-
>  hw/mips/mips_jazz.c  |  4 ++--
>  hw/mips/mips_malta.c |  2 +-
>  hw/mips/mips_mipssim.c

Re: [Qemu-devel] [PATCH 0/4] Don't abort when we can't allocate guest memory (again)

2015-09-13 Thread Peter Crosthwaite

On Fri, Sep 11, 2015 at 7:51 AM, Markus Armbruster  wrote:
> Not nice:
>
> $ qemu-system-x86_64 -m 1000
> Unexpected error in ram_block_add() at /work/armbru/qemu/exec.c:1456:
> upstream-qemu: cannot set up guest memory 'pc.ram': Cannot allocate memory
> Aborted (core dumped)
>
> I fixed this in commit 3922825 for v1.7, but commit ef701d7 regressed
> it for v2.2, and now I'm fixing it again, only this time the fix is
> fifteen times bigger.
>
> Folks involved in the flawed commit cc'ed, so they can do penance by
> reviewing my fix ;-P
>
> PATCH 1/4's error_fatal obviously enables further simplifications.  I
> got some in my local tree, but they're not ready, yet.
>

This is better behaviour than before so

Reviewed-by: Peter Crosthwaite 

But many of these call sites are from modular code that shouldn't have
the authority to fatal QEMU. The SoCs and devices with their own RAMs
in particular, as it would be nice to one day hotplug all this stuff.
Many of them should be converted to propagations. I have made more
notes on P2.

Regards,
Peter

> Markus Armbruster (4):
>   error: New error_fatal
>   Fix bad error handling after memory_region_init_ram()
>   loader: Fix memory_region_init_resizeable_ram() error handling
>   memory: Fix bad error handling in memory_region_init_ram_ptr()
>
>  hw/arm/armv7m.c  |  2 +-
>  hw/arm/exynos4210.c  |  8 
>  hw/arm/highbank.c|  2 +-
>  hw/arm/integratorcp.c|  2 +-
>  hw/arm/mainstone.c   |  2 +-
>  hw/arm/musicpal.c|  2 +-
>  hw/arm/omap1.c   |  2 +-
>  hw/arm/omap2.c   |  2 +-
>  hw/arm/omap_sx1.c|  4 ++--
>  hw/arm/palm.c|  2 +-
>  hw/arm/pxa2xx.c  |  8 
>  hw/arm/realview.c|  6 +++---
>  hw/arm/spitz.c   |  2 +-
>  hw/arm/stellaris.c   |  4 ++--
>  hw/arm/stm32f205_soc.c   |  4 ++--
>  hw/arm/tosa.c|  2 +-
>  hw/arm/vexpress.c|  6 +++---
>  hw/arm/xilinx_zynq.c |  2 +-
>  hw/arm/xlnx-zynqmp.c |  2 +-
>  hw/block/onenand.c   |  2 +-
>  hw/core/loader.c |  2 +-
>  hw/cris/axis_dev88.c |  2 +-
>  hw/display/cg3.c |  4 ++--
>  hw/display/qxl.c |  6 +++---
>  hw/display/sm501.c   |  2 +-
>  hw/display/tc6393xb.c|  2 +-
>  hw/display/tcx.c |  4 ++--
>  hw/display/vga.c |  2 +-
>  hw/display/vmware_vga.c  |  2 +-
>  hw/i386/pc.c |  2 +-
>  hw/i386/pc_sysfw.c   |  4 ++--
>  hw/input/milkymist-softusb.c |  4 ++--
>  hw/m68k/an5206.c |  2 +-
>  hw/m68k/mcf5208.c|  2 +-
>  hw/microblaze/petalogix_ml605_mmu.c  |  4 ++--
>  hw/microblaze/petalogix_s3adsp1800_mmu.c |  4 ++--
>  hw/mips/mips_fulong2e.c  |  2 +-
>  hw/mips/mips_jazz.c  |  4 ++--
>  hw/mips/mips_malta.c |  2 +-
>  hw/mips/mips_mipssim.c   |  2 +-
>  hw/mips/mips_r4k.c   |  2 +-
>  hw/moxie/moxiesim.c  |  4 ++--
>  hw/net/milkymist-minimac2.c  |  2 +-
>  hw/openrisc/openrisc_sim.c   |  2 +-
>  hw/pci-host/prep.c   |  2 +-
>  hw/pci/pci.c |  2 +-
>  hw/ppc/mac_newworld.c|  2 +-
>  hw/ppc/mac_oldworld.c|  2 +-
>  hw/ppc/ppc405_boards.c   |  7 ---
>  hw/ppc/ppc405_uc.c   |  2 +-
>  hw/s390x/s390-virtio-ccw.c   |  2 +-
>  hw/s390x/sclp.c  |  3 ++-
>  hw/sh4/r2d.c |  2 +-
>  hw/sh4/shix.c|  6 +++---
>  hw/sparc/leon3.c |  2 +-
>  hw/sparc/sun4m.c |  6 +++---
>  hw/sparc64/sun4u.c   |  4 ++--
>  hw/tricore/tricore_testboard.c   | 18 +++--
>  hw/unicore32/puv3.c  |  2 +-
>  hw/xtensa/sim.c  |  4 ++--
>  hw/xtensa/xtfpga.c   |  7 ---
>  include/qapi/error.h | 11 +++
>  memory.c |  2 +-
>  numa.c   |  4 ++--
>  util/error.c | 34 
> 
>  xen-hvm.c|  2 +-
>  66 files changed, 144 insertions(+), 116 deletions(-)
>
> --
> 2.4.3
>
>

Re: [Qemu-devel] [RFCv2 2/2] spapr: Don't use QOM [*] syntax for DR connectors.

2015-09-13 Thread David Gibson

On Mon, Sep 14, 2015 at 10:11:50AM +0530, Bharata B Rao wrote:
> On Mon, Sep 14, 2015 at 02:14:59PM +1000, David Gibson wrote:
> > On Mon, Sep 14, 2015 at 09:37:16AM +0530, Bharata B Rao wrote:
> > > On Mon, Sep 14, 2015 at 11:41:53AM +1000, David Gibson wrote:
> > > > The dynamic reconfiguration (hotplug) code for the pseries machine type
> > > > uses a "DR connector" QOM object for each resource it will be possible
> > > > to hotplug.  Each of these is added to its owner using
> > > > object_property_add_child(owner, "dr-connector[*]", ...);
> > > > 
> > > > That works ok, mostly, but it means that the property indices are
> > > > arbitrary, depending on the order in which the connectors are constructed.
> > > > When we have both memory and cpu hotplug, the connectors will be under the
> > > > same parent (at least in the current drafts), meaning the indices don't
> > > > correspond to any meaningful ID.
> > > > 
> > > > It gets worse when large amounts of hotpluggable RAM are configured.  For
> > > > RAM, there's a DR connector object for every 256MB of potential memory.  So
> > > > if maxmem=2T, for example, there are 8192 objects under the same parent.
> > > > 
> > > > The QOM interfaces aren't really designed for this.  In particular
> > > > object_property_add() with [*] has O(n^2) time complexity (in the number of
> > > > existing children): first it has a linear search through array indices to
> > > > find a free slot, each of which is attempted via a recursive call to
> > > > object_property_add() with a specific [N].  Those calls are O(n) because
> > > > there's a linear search through all properties to check for duplicates.
> > > > 
> > > > By using a meaningful index value, which we already know is unique we can
> > > > avoid the [*] special behaviour.  That lets us reduce the total time for
> > > > creating the DR objects from O(n^3) to O(n^2).
> > > > 
> > > > O(n^2) is still kind of crappy, but it's enough to reduce the startup time
> > > > of qemu with maxmem=2T from ~20 minutes to ~4 seconds.
> > > > 
> > > > Signed-off-by: David Gibson 
> > > > Cc: Bharata B Rao 
> > > 
> > > This patch works correctly with both CPU and memory hotplug.
> > 
> > Care to send a Reviewed-by and/or Tested-by in that case?
> 
> Sorry,
> 
> Tested-by: Bharata B Rao 

If you could send one for the cleanup in 1/2 as well, that would be nice.


-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v10 00/10] Add a netfilter object and netbuffer filter

2015-09-13 Thread Jason Wang



On 09/14/2015 01:09 PM, Yang Hongyang wrote:
> Hi Stefan, Jason,
>
> I've converted this series to be based on QOM, and introduced NetQueue APIs
> instead of using NetQueue internals, as Stefan suggested. Could you
> please take a look at it?

Will go through this in next few days.

> Most of the details have been reviewed by Jason, and the overall filter
> logic hasn't changed.
> One feature missing compared to previous versions is multiqueue support.
> I've already implemented it, but before sending it out I'd like to get as
> many review comments as possible on this series and address them, in
> order to reduce the number of iterations. Multiqueue support can be sent
> later as a separate series if the base can go in first. If there have to
> be another few rounds, I will include the multiqueue patches.

Sounds good. There's one more thing which may be implemented. I'd expect
a unit test for buffer filter. (This could be done by a separate patch
on top).

Thanks

>
> Thanks in advance.
>
> On 09/09/2015 03:24 PM, Yang Hongyang wrote:
>> This patch adds a netfilter abstract object that captures all network
>> packets on the associated netdev. It also implements a concrete buffer
>> filter based on this abstract object. The "buffer" netfilter could be
>> used by VM FT solutions like MicroCheckpointing to buffer/release
>> packets, or to simulate packet delay.
>>
>> You can also get the series from:
>> https://github.com/macrosheep/qemu/tree/netfilter-v10
>>
>> Usage:
>>   -netdev tap,id=bn0
>>   -device e1000,netdev=bn0
>>   -object filter-buffer,id=f0,netdev=bn0,chain=in,interval=1000
>>
>> dynamically add/remove netfilters:
>>   object_add filter-buffer,id=f0,netdev=bn0,chain=in,interval=1000
>>   object_del f0
>>
>> NOTE:
>>   interval's scale is microsecond.
>>   chain is optional, and is one of in|out|all, default is "all".
>> "in" means this filter will receive packets sent to the @netdev
>> "out" means this filter will receive packets sent from the
>> @netdev
>> "all" means this filter will receive packets both sent to/from
>>   the @netdev
>>
>> TODO:
>>   - multiqueue
>>
>> v10:
>>   - Reimplemented using QOM (suggested by stefan)
>>   - Do not export NetQueue internals (suggested by stefan)
>>   - see individual patch for detail
>>
>> v9:
>>   - squash command description and help to patch 1&3
>>   - qapi changes according to Markus&Eric's comments
>>   - see individual patch for detail
>>
>> v8:
>>   - some minor fixes according to Thomas's comments
>>   - rebased to the latest master branch
>>
>> v7:
>>   - print filter info when execute 'info network'
>>   - addressed Jason's comments
>>
>> v6:
>>   - add multiqueue support, please see individual patch for detail
>>
>> v5:
>>   - add a sent_cb param to filter receive_iov api
>>   - squash the 4th patch into patch 3
>>   - remove dummy sent_cb (buffer filter)
>>   - addressed Jason's other comments, see individual patches for detail
>>
>> v4:
>>   - get rid of struct Filter
>>   - squash the 4th patch into patch 2
>>   - fix qemu_netfilter_pass_to_next_iov
>>   - get rid of bh (buffer filter)
>>   - release the packet to next filter instead of to receiver (buffer
>> filter)
>>
>> v3:
>>   - add an api to pass the packet to next filter
>>   - remove netfilters when delete netdev
>>   - add qtest testcases for netfilter
>>   - addressed comments from Jason
>>
>> v2:
>>   - add a chain option to netfilter object
>>   - move the hook place earlier, before net_queue_send
>>   - drop the unused api in buffer filter
>>   - squash buffer filter patches into one
>>   - remove receive() api from netfilter, only receive_iov() is enough
>>   - addressed comments from Jason&Thomas
>>
>> v1:
>>   initial patch.
>>
>> Yang Hongyang (10):
>>qmp: delete qemu opts when delete an object
>>init/cleanup of netfilter object
>>netfilter: hook packets before net queue send
>>net: merge qemu_deliver_packet and qemu_deliver_packet_iov
>>net/queue: introduce NetQueueDeliverFunc
>>netfilter: add an API to pass the packet to next filter
>>netfilter: print filter info associate with the netdev
>>net/queue: export qemu_net_queue_append_iov
>>netfilter: add a netbuffer filter
>>tests: add test cases for netfilter object
>>
>>   include/net/filter.h|  68 
>>   include/net/net.h   |   6 +-
>>   include/net/queue.h |  20 -
>>   include/qemu/typedefs.h |   1 +
>>   net/Makefile.objs   |   2 +
>>   net/filter-buffer.c | 169 ++
>>   net/filter.c| 213
>> 
>>   net/net.c   | 116 --
>>   net/queue.c |  24 --
>>   qapi-schema.json|  18 
>>   qemu-options.hx |  18 
>>   qmp.c   |   4 +
>>   tests/.gitignore|   1 +
>>   tests/Makefile  |   2 +
>>   tests/test-netfilter.c

Re: [Qemu-devel] [PATCH v10 00/10] Add a netfilter object and netbuffer filter

2015-09-13 Thread Yang Hongyang




On 09/14/2015 01:22 PM, Jason Wang wrote:
> On 09/14/2015 01:09 PM, Yang Hongyang wrote:
>> Hi Stefan, Jason,
>>
>> I've converted this series to be based on QOM, and introduced NetQueue APIs
>> instead of using NetQueue internals, as Stefan suggested. Could you
>> please take a look at it?
>
> Will go through this in next few days.

Thanks a lot.

>> Most of the details have been reviewed by Jason, and the overall filter
>> logic hasn't changed.
>> One feature missing compared to previous versions is multiqueue support.
>> I've already implemented it, but before sending it out I'd like to get as
>> many review comments as possible on this series and address them, in
>> order to reduce the number of iterations. Multiqueue support can be sent
>> later as a separate series if the base can go in first. If there have to
>> be another few rounds, I will include the multiqueue patches.
>
> Sounds good. There's one more thing which may be implemented. I'd expect
> a unit test for buffer filter. (This could be done by a separate patch
> on top).

The 10th patch, "tests: add test cases for netfilter object", might contain
basic tests of the buffer filter using QMP. The netfilter object is an abstract
object, so in order to test it we must use a concrete filter such as the buffer
filter.

> Thanks
>
>> Thanks in advance.

On 09/09/2015 03:24 PM, Yang Hongyang wrote:

This patch adds a netfilter abstract object that captures all network
packets on the associated netdev. It also implements a concrete buffer
filter based on this abstract object. The "buffer" netfilter could be
used by VM FT solutions like MicroCheckpointing to buffer/release
packets, or to simulate packet delay.

You can also get the series from:
https://github.com/macrosheep/qemu/tree/netfilter-v10

Usage:
   -netdev tap,id=bn0
   -device e1000,netdev=bn0
   -object filter-buffer,id=f0,netdev=bn0,chain=in,interval=1000

dynamically add/remove netfilters:
   object_add filter-buffer,id=f0,netdev=bn0,chain=in,interval=1000
   object_del f0

NOTE:
   interval's scale is microsecond.
   chain is optional, and is one of in|out|all, default is "all".
 "in" means this filter will receive packets sent to the @netdev
 "out" means this filter will receive packets sent from the
@netdev
 "all" means this filter will receive packets both sent to/from
   the @netdev

TODO:
   - multiqueue

v10:
   - Reimplemented using QOM (suggested by stefan)
   - Do not export NetQueue internals (suggested by stefan)
   - see individual patch for detail

v9:
   - squash command description and help to patch 1&3
   - qapi changes according to Markus&Eric's comments
   - see individual patch for detail

v8:
   - some minor fixes according to Thomas's comments
   - rebased to the latest master branch

v7:
   - print filter info when execute 'info network'
   - addressed Jason's comments

v6:
   - add multiqueue support, please see individual patch for detail

v5:
   - add a sent_cb param to filter receive_iov api
   - squash the 4th patch into patch 3
   - remove dummy sent_cb (buffer filter)
   - addressed Jason's other comments, see individual patches for detail

v4:
   - get rid of struct Filter
   - squash the 4th patch into patch 2
   - fix qemu_netfilter_pass_to_next_iov
   - get rid of bh (buffer filter)
   - release the packet to next filter instead of to receiver (buffer
filter)

v3:
   - add an api to pass the packet to next filter
   - remove netfilters when delete netdev
   - add qtest testcases for netfilter
   - addressed comments from Jason

v2:
   - add a chain option to netfilter object
   - move the hook place earlier, before net_queue_send
   - drop the unused api in buffer filter
   - squash buffer filter patches into one
   - remove receive() api from netfilter, only receive_iov() is enough
   - addressed comments from Jason&Thomas

v1:
   initial patch.

Yang Hongyang (10):
qmp: delete qemu opts when delete an object
init/cleanup of netfilter object
netfilter: hook packets before net queue send
net: merge qemu_deliver_packet and qemu_deliver_packet_iov
net/queue: introduce NetQueueDeliverFunc
netfilter: add an API to pass the packet to next filter
netfilter: print filter info associate with the netdev
net/queue: export qemu_net_queue_append_iov
netfilter: add a netbuffer filter
tests: add test cases for netfilter object

   include/net/filter.h    |  68
   include/net/net.h       |   6 +-
   include/net/queue.h     |  20 -
   include/qemu/typedefs.h |   1 +
   net/Makefile.objs       |   2 +
   net/filter-buffer.c     | 169 ++
   net/filter.c            | 213
   net/net.c               | 116 --
   net/queue.c             |  24 --
   qapi-schema.json        |  18
   qemu-options.hx         |  18
   qmp.c                   |   4 +
   tests/.gitignore        |   1 +
   tests/Makefile          |   2 +
   tests/test

Re: [Qemu-devel] [PATCH 5/7] vhost_net: move vhost_net_set_vq_index ahead at vhost_net_init

2015-09-13 Thread Yuanhan Liu

On Thu, Sep 10, 2015 at 02:54:02PM +0800, Jason Wang wrote:
> 
> 
> On 09/10/2015 02:18 PM, Yuanhan Liu wrote:
> > On Thu, Sep 10, 2015 at 01:52:30PM +0800, Jason Wang wrote:
> >>
> >> On 09/10/2015 01:17 PM, Yuanhan Liu wrote:
> >>> On Thu, Sep 10, 2015 at 12:46:00PM +0800, Jason Wang wrote:
> >
> > On 09/10/2015 11:57 AM, Yuanhan Liu wrote:
> >>> On Thu, Sep 10, 2015 at 11:14:27AM +0800, Jason Wang wrote:
> > On 09/08/2015 03:38 PM, Yuanhan Liu wrote:
> >>> So that we could use the `vq_index' as well in the vhost_net_init
> >>> stage, which is required when adding vhost-user multiple-queue 
> >>> support,
> >>> where we need the vq_index to indicate which queue pair we are 
> >>> gonna
> >>> initiate.
> >>>
> >>> vhost-user has no multiple queue support yet, hence no 
> >>> queue_index set
> >>> before. Here is a quick set to 0 at net_vhost_user_init() stage, 
> >>> and it
> >>> will be set properly soon in the next patch.
> >>>
> >>> Signed-off-by: Yuanhan Liu 
> >>> ---
> >>>  hw/net/vhost_net.c | 16 +++-
> >>>  net/vhost-user.c   |  1 +
> >>>  2 files changed, 8 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> >>> index f9441e9..141b557 100644
> >>> --- a/hw/net/vhost_net.c
> >>> +++ b/hw/net/vhost_net.c
> >>> @@ -138,6 +138,11 @@ static int vhost_net_get_fd(NetClientState 
> >>> *backend)
> >>>  }
> >>>  }
> >>>  
> >>> +static void vhost_net_set_vq_index(struct vhost_net *net, int 
> >>> vq_index)
> >>> +{
> >>> +net->dev.vq_index = vq_index;
> >>> +}
> >>> +
> >>>  struct vhost_net *vhost_net_init(VhostNetOptions *options)
> >>>  {
> >>>  int r;
> >>> @@ -167,6 +172,8 @@ struct vhost_net 
> >>> *vhost_net_init(VhostNetOptions *options)
> >>>  }
> >>>  net->nc = options->net_backend;
> >>>  
> >>> +vhost_net_set_vq_index(net, net->nc->queue_index * 2);
> >>> +
> > This breaks vhost kernel multiqueue since queue_index was not
> > initialized at this time.
> >>> Right, thanks for pointing it out.
> >>>
> > We do this in set_netdev() instead of setting
> > it in each kind of netdev.
> >>> Can we move it to net_init_tap() for setting the right queue_index
> >>> for each nc?
> >>>
> >>> Or, can we call vhost_net_set_vq_index twice, one at 
> >>> vhost_net_init(for
> >>> vhost-user mq support), another one at vhost_net_start(for vhost 
> >>> kernel
> >>> mq support)?
> >>>
> >>> Or, do you have better ideas?
> > I think setting queue_index in net_init_tap() looks ok.
> >>> Good to know.
> >>>
> > But a question
> > is that why need we do this at so early stage? ( Even before its peers
> > is connected.)
> >>> For vhost-user multiple queues support, we will invoke vhost_net_init()
> >>> N times for each queue pair, and hence we need to distinguish which
> >>> queue it is while sending messages like VHOST_SET_VRING_CALL for
> >>> initializing corresponding queue pair.
> >>>
> >>> Does that make sense to you?
> >>>
> >> Not sure. Since current codes works for vhost-kernel. (vhost_net_init()
> >> was also called N times). We don't want to break existed vhost-kernel
> >> API when developing multiqueue. For each virtqueue TX/RX pair, we have
> >> one vhost net device and it has no knowledge for the others (which was
> >> hide by qemu). So VHOST_SET_VRING_CALL works without any change here.
> >>
> >> For the case here, since you still have multiple instances of vhost_net
> >> structure. Maybe the vhost-user backend can distinguish form this?
> > Yeah, I guess that's the difference between vhost-user and vhost-kernel.
> > Vhost-kernel opens a char device(/dev/vhost-net) for each vhost_dev,
> > hence it's distinguishable. But for vhost-user, all vhost_dev share one
> > char device(a socket) for communication, hence, it's not distinguishable.
> 
> How about using individual socket in this case? This seems can also
> minimize the changes of backend.
> 
> >
> > I was thinking maybe we could export vhost_net_set_vq_index() and invoke
> > it at net/vhost-user.c, so that we break nothing, and in the meantime,
> > it keeps the logic inside vhost-user.
> >
> > What do you think?
> >
> > --yliu
> >
> 
> Sounds work. Then I believe you will need to set queue_index in
> vhost_user initialization code?

Nah, it will not work, as vhost_net_set_vq_index() needs a vhost_net as
its parameter: you can't call it before vhost_net_init(), and it's
useless to call it after vhost_net_init(). My bad for not noticing that
the first time.

How about following then?

Thanks.

--yli
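
For readers following the thread, here is a minimal standalone sketch of the index arithmetic under discussion. Each queue pair owns two virtqueues, so pair N starts at virtqueue index N * 2, matching the `net->nc->queue_index * 2` expression in the quoted patch. The struct and function names are hypothetical stand-ins rather than QEMU's actual types, and the RX-before-TX ordering is an assumption based on the usual virtio-net layout.

```c
/* Hypothetical stand-in for the per-queue-pair vhost device state. */
struct toy_vhost_dev {
    int vq_index;   /* index of the first virtqueue owned by this pair */
    int nvqs;       /* virtqueues per pair: one RX and one TX */
};

/* Mirrors the arithmetic in the quoted patch: pair N starts at vq N * 2. */
static void toy_set_vq_index(struct toy_vhost_dev *dev, int queue_index)
{
    dev->vq_index = queue_index * 2;
    dev->nvqs = 2;
}

/* Assumed layout: RX is the first virtqueue of the pair, TX the second. */
static int toy_rx_vq(const struct toy_vhost_dev *dev)
{
    return dev->vq_index;
}

static int toy_tx_vq(const struct toy_vhost_dev *dev)
{
    return dev->vq_index + 1;
}
```

With a single shared socket, a backend would need an index like this in each message to tell the queue pairs apart, which is the motivation given earlier in the thread.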

Re: [Qemu-devel] [PATCH v3 2/4] block: Add 'ignore-backing' field to BlockdevOptionsGenericCOWFormat

2015-09-13 Thread Alberto Garcia

On Fri 11 Sep 2015 07:33:41 PM CEST, Max Reitz  wrote:

>>> So why do we need the new flag? Because "backing: ''" is ugly?
>> 
>> I guess it's just because you're the only one who actually reads the
>> documentation. When discussing this, I didn't remember that we
>> already had a way to express this (an additional bool wouldn't have
>> been my favourite solution anyway). Thanks for catching this.
>
> I read the patch, it was part of the context. ;-)

Oh, that was embarrassing :-) Yes, it was the discussion from two weeks
ago about passing empty strings as BlockdevRef that made me think that
this would be ugly.

Anyway, was this ever implemented? It seems that passing a string to the
'backing' parameter is only specified in the JSON schema, but no one
actually uses that.

So I'll implement that for the next version of my series.

Berto

Re: [Qemu-devel] [RFCv2 1/2] spapr: Remove unnecessary owner field from sPAPRDRConnector

2015-09-13 Thread Bharata B Rao

On Mon, Sep 14, 2015 at 11:41:52AM +1000, David Gibson wrote:
> The sPAPRDRConnector pseudo-device contains an owner field which is
> set in spapr_dr_connector_new().  However, that function also calls
> object_property_add_child() to set the DRConnector as the QOM child of
> the owner object.  That means that owner is always the same as the QOM
> parent, and so redundant.
> 
> Signed-off-by: David Gibson 

Tested CPU and memory hotplug with reboot and migration.

Tested-by: Bharata B Rao

Re: [Qemu-devel] [PATCH v3] ppc/spapr: Implement H_RANDOM hypercall in QEMU

2015-09-13 Thread Thomas Huth

On 14/09/15 04:15, David Gibson wrote:
> On Fri, Sep 11, 2015 at 11:17:01AM +0200, Thomas Huth wrote:
>> The PAPR interface defines a hypercall to pass high-quality
>> hardware generated random numbers to guests. Recent kernels can
>> already provide this hypercall to the guest if the right hardware
>> random number generator is available. But in case the user wants
>> to use another source like EGD, or QEMU is running with an older
>> kernel, we should also have this call in QEMU, so that guests that
>> do not support virtio-rng yet can get good random numbers, too.
>>
>> This patch now adds a new pseude-device to QEMU that either
>> directly provides this hypercall to the guest or is able to
>> enable the in-kernel hypercall if available. The in-kernel
>> hypercall can be enabled with the use-kvm property, e.g.:
>>
>>  qemu-system-ppc64 -device spapr-rng,use-kvm=true
>>
>> For handling the hypercall in QEMU instead, a RngBackend is required
>> since the hypercall should provide "good" random data instead of
>> pseudo-random (like from a "simple" library function like rand()
>> or g_random_int()). Since there are multiple RngBackends available,
>> the user must select an appropriate backend via the "backend"
>> property of the device, e.g.:
>>
>>  qemu-system-ppc64 -object rng-random,filename=/dev/hwrng,id=rng0 \
>>-device spapr-rng,backend=rng0 ...
>>
>> See http://wiki.qemu-project.org/Features-Done/VirtIORNG for
>> other example of specifying RngBackends.
...
>> +
>> +#include "qemu/error-report.h"
>> +#include "sysemu/sysemu.h"
>> +#include "sysemu/device_tree.h"
>> +#include "sysemu/rng.h"
>> +#include "hw/ppc/spapr.h"
>> +#include "kvm_ppc.h"
>> +
>> +#define SPAPR_RNG(obj) \
>> +OBJECT_CHECK(sPAPRRngState, (obj), TYPE_SPAPR_RNG)
>> +
>> +typedef struct sPAPRRngState {
>> +/*< private >*/
>> +DeviceState ds;
>> +RngBackend *backend;
>> +bool use_kvm;
>> +} sPAPRRngState;
>> +
>> +typedef struct HRandomData {
>> +QemuSemaphore sem;
>> +union {
>> +uint64_t v64;
>> +uint8_t v8[8];
>> +} val;
>> +int received;
>> +} HRandomData;
>> +
>> +/* Callback function for the RngBackend */
>> +static void random_recv(void *dest, const void *src, size_t size)
>> +{
>> +HRandomData *hrdp = dest;
>> +
>> +if (src && size > 0) {
>> +assert(size + hrdp->received <= sizeof(hrdp->val.v8));
>> +memcpy(&hrdp->val.v8[hrdp->received], src, size);
>> +hrdp->received += size;
>> +}
>> +
>> +qemu_sem_post(&hrdp->sem);
> 
> I'm assuming qemu_sem_post() includes the necessary memory barrier to
> make sure the requesting thread actually sees the data.

Not sure whether I fully got your point here... both callback function
and main thread are calling an extern C-function, so the compiler should
not assume that the memory stays the same in the main thread...?

Anyway, I've tested the hypercall by implementing it in SLOF and calling
it a couple of times there to see that all bits in the result behave
randomly, so for me this is working fine.
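
On the barrier question, a minimal sketch with plain POSIX primitives may help. POSIX lists sem_post() and sem_wait() among the functions that synchronize memory, so data stored before the post is visible to the thread that returns from the wait. Whether qemu_sem_post() maps onto these primitives on a given host is an assumption here, and the names below (demo_request, demo_producer, demo_fetch) are hypothetical analogues of HRandomData, random_recv() and h_random(), not QEMU code.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical analogue of HRandomData from the quoted patch. */
struct demo_request {
    sem_t sem;
    uint8_t buf[8];
    size_t received;
};

/* Producer: plays the role of the RngBackend callback. */
static void *demo_producer(void *opaque)
{
    struct demo_request *req = opaque;
    static const uint8_t fake_entropy[8] = {1, 2, 3, 4, 5, 6, 7, 8};

    memcpy(req->buf, fake_entropy, sizeof(fake_entropy));
    req->received = sizeof(fake_entropy);
    /* The stores above are published to whoever returns from sem_wait(). */
    sem_post(&req->sem);
    return NULL;
}

/* Consumer: plays the role of the hypercall handler.
 * Returns the number of bytes received, or 0 if setup failed. */
static size_t demo_fetch(uint8_t out[8])
{
    struct demo_request req;
    pthread_t tid;
    size_t n;

    memset(&req, 0, sizeof(req));
    if (sem_init(&req.sem, 0, 0) != 0) {
        return 0;
    }
    if (pthread_create(&tid, NULL, demo_producer, &req) != 0) {
        sem_destroy(&req.sem);
        return 0;
    }
    /* Retry only when the wait was interrupted by a signal. */
    while (sem_wait(&req.sem) == -1 && errno == EINTR) {
    }
    memcpy(out, req.buf, sizeof(req.buf));
    n = req.received;
    pthread_join(tid, NULL);
    sem_destroy(&req.sem);
    return n;
}
```

Calling an external function only stops the compiler from caching values across the call; the visibility guarantee between threads comes from the synchronization primitive itself.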

>> +}
>> +
>> +/* Handler for the H_RANDOM hypercall */
>> +static target_ulong h_random(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>> + target_ulong opcode, target_ulong *args)
>> +{
>> +sPAPRRngState *rngstate;
>> +HRandomData hrdata;
>> +
>> +rngstate = SPAPR_RNG(object_resolve_path_type("", TYPE_SPAPR_RNG, 
>> NULL));
>> +
>> +if (!rngstate || !rngstate->backend) {
>> +return H_HARDWARE;
>> +}
>> +
>> +qemu_sem_init(&hrdata.sem, 0);
>> +hrdata.val.v64 = 0;
>> +hrdata.received = 0;
>> +
>> +qemu_mutex_unlock_iothread();
>> +while (hrdata.received < 8) {
>> +rng_backend_request_entropy(rngstate->backend, 8 - hrdata.received,
>> +random_recv, &hrdata);
>> +qemu_sem_wait(&hrdata.sem);
>> +}
>> +qemu_mutex_lock_iothread();
>> +
>> +qemu_sem_destroy(&hrdata.sem);
>> +args[0] = hrdata.val.v64;
>> +
>> +return H_SUCCESS;
>> +}
>> +
>> +static void spapr_rng_instance_init(Object *obj)
>> +{
>> +sPAPRRngState *rngstate = SPAPR_RNG(obj);
>> +
>> +if (object_resolve_path_type("", TYPE_SPAPR_RNG, NULL) != NULL) {
>> +error_report("spapr-rng can not be instantiated twice!");
>> +return;
>> +}
>> +
>> +object_property_add_link(obj, "backend", TYPE_RNG_BACKEND,
>> + (Object **)&rngstate->backend,
>> + object_property_allow_set_link,
>> + OBJ_PROP_LINK_UNREF_ON_RELEASE, NULL);
>> +object_property_set_description(obj, "backend",
>> +"ID of the random number generator 
>> backend",
>> +NULL);
> 
> Since virtio-rng does it the same way, I'm assuming there's a reason
> this is constructed with object_propery_add() rather than listing it
> in spapr_rng_properties, but

Re: [Qemu-devel] [PATCH] iscsi: Add chap and "initiator-name" etc as per drive options

2015-09-13 Thread Fam Zheng

On Fri, 09/11 08:27, ronnie sahlberg wrote:
> On Fri, Sep 11, 2015 at 8:20 AM, Eric Blake  wrote:
> > On 09/11/2015 12:00 AM, Fam Zheng wrote:
> >> Previously we use "-iscsi id=target-iqn,user=foo,password=bar,..." to
> >> specify iscsi connection parameters, unfortunately it doesn't work with
> >> qemu-img.
> >>
> >> This patch adds per drive options to iscsi driver so that at least
> >> qemu-img can use the "json:{...}" filename magic.
> >>
> >> Signed-off-by: Fam Zheng 
> >> ---
> >>  block/iscsi.c | 83 
> >> +--
> >>  1 file changed, 64 insertions(+), 19 deletions(-)
> >
> > It would be nice to also add a matching BlockdevOptionsIscsi to
> > qapi/block-core.json, to allow setting these structured options from
> > QMP.  Separate patch is fine, but we need to do the work for ALL of the
> > remaining block devices eventually, and now that you are structuring the
> > command line is a good time to think about it.
> >
> >
> >>  static void iscsi_nop_timed_event(void *opaque)
> >> @@ -1229,6 +1253,27 @@ static QemuOptsList runtime_opts = {
> >>  .name = "filename",
> >>  .type = QEMU_OPT_STRING,
> >>  .help = "URL to the iscsi image",
> >> +},{
> >> +.name = "user",
> >> +.type = QEMU_OPT_STRING,
> >> +.help = "username for CHAP authentication to target",
> >> +},{
> >> +.name = "password",
> >> +.type = QEMU_OPT_STRING,
> >> +.help = "password for CHAP authentication to target",
> >> +},{
> >
> > Also, this requires passing the password in the command line. We
> > _really_ need to solve the problem of allowing the password to be passed
> > via a fd or other QMP command, rather than on the command line.
> 
> 
> Passing via command line is evil. It should still be possible to pass
> all this via a config file to qemu :
> 
> """
> ...
> Howto use a configuration file to set iSCSI configuration options:
> @example
> cat >iscsi.conf <<EOF
> [iscsi "iqn.target.name"]
>   user = "me"
>   password = "my password"
>   initiator-name = "iqn.qemu.test:my-initiator"
>   header-digest = "CRC32C"
> EOF
> 
> qemu-system-i386 -drive file=iscsi://127.0.0.1/iqn.qemu.test/1 \
> -readconfig iscsi.conf
> @end example
> ...
> """

I agree passing password with clear text command line is bad, but -readconfig
doesn't work for qemu-img and qemu-io.  Any idea how to make that work?

Fam

Re: [Qemu-devel] [PATCH] iscsi: Add chap and "initiator-name" etc as per drive options

2015-09-13 Thread Peter Lieven



> Am 14.09.2015 um 08:38 schrieb Fam Zheng :
> 
>> On Fri, 09/11 08:27, ronnie sahlberg wrote:
>>> On Fri, Sep 11, 2015 at 8:20 AM, Eric Blake  wrote:
>>>> On 09/11/2015 12:00 AM, Fam Zheng wrote:
>>>> Previously we use "-iscsi id=target-iqn,user=foo,password=bar,..." to
>>>> specify iscsi connection parameters, unfortunately it doesn't work with
>>>> qemu-img.
>>>>
>>>> This patch adds per drive options to iscsi driver so that at least
>>>> qemu-img can use the "json:{...}" filename magic.
>>>>
>>>> Signed-off-by: Fam Zheng
>>>> ---
>>>> block/iscsi.c | 83
>>>> +--
>>>> 1 file changed, 64 insertions(+), 19 deletions(-)
>>> 
>>> It would be nice to also add a matching BlockdevOptionsIscsi to
>>> qapi/block-core.json, to allow setting these structured options from
>>> QMP.  Separate patch is fine, but we need to do the work for ALL of the
>>> remaining block devices eventually, and now that you are structuring the
>>> command line is a good time to think about it.
>>> 
>>> 
>>>> static void iscsi_nop_timed_event(void *opaque)
>>>> @@ -1229,6 +1253,27 @@ static QemuOptsList runtime_opts = {
>>>> .name = "filename",
>>>> .type = QEMU_OPT_STRING,
>>>> .help = "URL to the iscsi image",
>>>> +},{
>>>> +.name = "user",
>>>> +.type = QEMU_OPT_STRING,
>>>> +.help = "username for CHAP authentication to target",
>>>> +},{
>>>> +.name = "password",
>>>> +.type = QEMU_OPT_STRING,
>>>> +.help = "password for CHAP authentication to target",
>>>> +},{
>>> 
>>> Also, this requires passing the password in the command line. We
>>> _really_ need to solve the problem of allowing the password to be passed
>>> via a fd or other QMP command, rather than on the command line.
>> 
>> 
>> Passing via command line is evil. It should still be possible to pass
>> all this via a config file to qemu :
>> 
>> """
>> ...
>> Howto use a configuration file to set iSCSI configuration options:
>> @example
>> cat >iscsi.conf <<EOF
>> [iscsi "iqn.target.name"]
>>  user = "me"
>>  password = "my password"
>>  initiator-name = "iqn.qemu.test:my-initiator"
>>  header-digest = "CRC32C"
>> EOF
>> 
>> qemu-system-i386 -drive file=iscsi://127.0.0.1/iqn.qemu.test/1 \
>>-readconfig iscsi.conf
>> @end example
>> ...
>> """
> 
> I agree passing password with clear text command line is bad, but -readconfig
> doesn't work for qemu-img and qemu-io.  Any idea how to make that work?

You can pass the secrets via environment variables (see the libiscsi README).

Peter
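
As a rough illustration of the environment-variable approach, the sketch below reads CHAP credentials with getenv() so that the password never appears in the process arguments. The variable names LIBISCSI_CHAP_USERNAME and LIBISCSI_CHAP_PASSWORD are my recollection of the libiscsi README and should be checked against it. libiscsi performs this lookup itself, so the helper and struct here are hypothetical and only show the pattern.

```c
#include <stdlib.h>

/* Hypothetical holder for CHAP credentials. */
struct chap_creds {
    const char *user;
    const char *password;
};

/* Fills creds from the environment.
 * Returns 0 if both variables are set, -1 otherwise.
 * Variable names are assumed from the libiscsi README. */
static int chap_creds_from_env(struct chap_creds *creds)
{
    const char *user = getenv("LIBISCSI_CHAP_USERNAME");
    const char *password = getenv("LIBISCSI_CHAP_PASSWORD");

    if (user == NULL || password == NULL) {
        return -1;
    }
    creds->user = user;
    creds->password = password;
    return 0;
}
```

Because the lookup happens inside the library, the same variables would apply to qemu-img and qemu-io without any -readconfig support, provided they are exported in the invoking shell.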
