Re: [PATCH 06/30] bsd-user/arm/target_arch_cpu.h: Correct code pointer

2022-01-14 Thread Peter Maydell
On Fri, 14 Jan 2022 at 06:38, Warner Losh  wrote:
>
>
>
> On Thu, Jan 13, 2022 at 10:15 AM Peter Maydell wrote:
>>
>> On Sun, 9 Jan 2022 at 16:26, Warner Losh  wrote:
>> >
>> > The code has moved in FreeBSD since the emulator was started, update the
>> > comment to reflect that change. Remove now-redundant comment saying the
>> > same thing (but incorrectly).
>> >
>> > Signed-off-by: Warner Losh 
>> > ---
>> >  bsd-user/arm/target_arch_cpu.h | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/bsd-user/arm/target_arch_cpu.h 
>> > b/bsd-user/arm/target_arch_cpu.h
>> > index 05b19ce6119..905f13aa1b9 100644
>> > --- a/bsd-user/arm/target_arch_cpu.h
>> > +++ b/bsd-user/arm/target_arch_cpu.h
>> > @@ -73,7 +73,7 @@ static inline void target_cpu_loop(CPUARMState *env)
>> >  int32_t syscall_nr = n;
>> >  int32_t arg1, arg2, arg3, arg4, arg5, arg6, arg7, 
>> > arg8;
>> >
>> > -/* See arm/arm/trap.c cpu_fetch_syscall_args() */
>> > +/* See arm/arm/syscall.c cpu_fetch_syscall_args() */
>> >  if (syscall_nr == TARGET_FREEBSD_NR_syscall) {
>> >  syscall_nr = env->regs[0];
>> >  arg1 = env->regs[1];
>>
>> Commit message says we're updating one comment and deleting a
>> second one; code only does an update, no delete ?
>
>
> Commit is right, commit message is wrong. I'll fix the commit message. I got
> this confused with part 8 where I kinda sorta did something similar (but not
> that similar).

(Maybe you had in mind the similar comment that used to be a few lines
above this one and which you removed in patch 5?)

With a fixed commit message:
Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PULL V3 00/13] Net patches

2022-01-14 Thread Peter Maydell
On Fri, 14 Jan 2022 at 09:19, Jason Wang  wrote:
>
>
> On 2022/1/14 at 1:08 PM, Jason Wang wrote:
> > The following changes since commit f8d75e10d3e0033a0a29a7a7e4777a4fbc17a016:
> >
> >Merge remote-tracking branch 'remotes/legoater/tags/pull-ppc-20220112' 
> > into staging (2022-01-13 11:18:24 +)
> >
> > are available in the git repository at:
> >
> >https://github.com/jasowang/qemu.git tags/net-pull-request
> >
> > for you to fetch changes up to 818692f0a01587d02220916b31d5bb8e7dced611:
> >
> >net/vmnet: update MAINTAINERS list (2022-01-14 12:58:19 +0800)
> >
> > 
> >
> > Changes since V2:
> >
> > - Try to make vmnet work on some old mac version
>
>
> I tend to hold this pull request since new issues were spotted in the
> vmnet series.

OK; I'll drop this one from my queue.

thanks
-- PMM



Re max ISA serial ports

2022-01-14 Thread Ani Sinha
I have a question re the following commit :

commit def337ffda34d331404bd7f1a42726b71500df22
Author: Peter Maydell 
Date:   Fri Apr 20 15:52:46 2018 +0100

serial-isa: Use MAX_ISA_SERIAL_PORTS instead of MAX_SERIAL_PORTS


Does this mean that this limit of 4 slots is QEMU/hypervisor specific, or is
it a general hardware limit that applies across all hypervisors? Can you
please clarify?


Re: [PATCH 01/17] ppc/pnv: use PHB4 obj in pnv_pec_stk_pci_xscom_ops

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

The current relationship between PnvPhb4PecStack and PnvPHB4 objects is
overly complex. Recent work done in pnv_phb4.c and pnv_phb4_pec.c shows
that the stack obj role in the overall design is more of a placeholder for
its 'phb' object, having no attributes that stand on their own. This became
clearer after pnv-phb4 user creatable devices were implemented.

What remains now are a lot of stack->phb and phb->stack pointers
throughout .read and .write callbacks of MemoryRegionOps that are being
initialized at phb4_realize() time. stk_realize() is a no-op if the
machine is being run with -nodefaults.

The first step of trying to decouple the stack and phb relationship is
to move the MemoryRegionOps that belong to PnvPhb4PecStack to PnvPHB4.
Unfortunately this can't be done without some preliminary steps to
change the usage of 'stack' and replace it with 'phb' in these
read/write callbacks.

This patch starts this process by using a PnvPHB4 opaque in
pnv_pec_stk_pci_xscom_ops instead of PnvPhb4PecStack.
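
[Editorial illustration] The change described above can be sketched in
isolation. This is a minimal sketch under stated assumptions, not the QEMU
code: SketchPHB, SketchRegionOps and the sketch_* functions are hypothetical
stand-ins for PnvPHB4 and MemoryRegionOps, and the bounds check is an addition
that the quoted callbacks do not have. It only shows the callbacks receiving
the PHB directly as their opaque pointer, so no stack lookup is needed.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the QEMU definitions. */
#define SKETCH_PCI_REGS_COUNT 0xf

typedef struct SketchPHB {
    uint64_t pci_regs[SKETCH_PCI_REGS_COUNT];
} SketchPHB;

typedef struct SketchRegionOps {
    uint64_t (*read)(void *opaque, uint64_t addr, unsigned size);
    void (*write)(void *opaque, uint64_t addr, uint64_t val, unsigned size);
} SketchRegionOps;

/* The opaque pointer is the PHB itself, so the stack is never consulted. */
static uint64_t sketch_pci_read(void *opaque, uint64_t addr, unsigned size)
{
    SketchPHB *phb = opaque;
    uint64_t reg = addr >> 3;   /* registers are 8 bytes apart */

    (void)size;
    /* Bounds check added for the sketch; out-of-range reads return 0. */
    return reg < SKETCH_PCI_REGS_COUNT ? phb->pci_regs[reg] : 0;
}

static void sketch_pci_write(void *opaque, uint64_t addr, uint64_t val,
                             unsigned size)
{
    SketchPHB *phb = opaque;
    uint64_t reg = addr >> 3;

    (void)size;
    /* Out-of-range writes are silently ignored in this sketch. */
    if (reg < SKETCH_PCI_REGS_COUNT) {
        phb->pci_regs[reg] = val;
    }
}

/* The ops table is registered together with the PHB as its opaque value. */
static const SketchRegionOps sketch_pci_ops = {
    .read = sketch_pci_read,
    .write = sketch_pci_write,
};
```

Because the ops table and the opaque value are registered together, changing
the opaque type means updating both callbacks in the same step, which is
presumably why the series converts them one ops table at a time.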

Signed-off-by: Daniel Henrique Barboza 



Reviewed-by: Cédric Le Goater 

Thanks,

C.



Re: [PATCH 00/17] remove PnvPhb4PecStack from Powernv9

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

Hi,

After all the work done enabling pnv-phb4 user devices, it became clear that
the stack object is more a container for the PHB and its resources than
something that needs to be maintained on its own. Removing the
PnvPhb4PecStack object promotes simpler code where we're dealing only
with PECs and PHB4s.

One thing that isn't handled in this series is the nested regs names.
There are 30+ nested per-stack registers with names such as
'PEC_NEST_STK*' or 'PEC_PCI_STK*' that are left as is. Renaming them to
remove the 'STK' reference can be done in a follow up when we're
satisfied with what is presented here.


I think that's fine. The name identifies the sub-unit logic to which
the register belongs.

Thanks,

C.
 


No functional change is intended with this series. The series is based
on top of current master (at f8d75e10d3).

Daniel Henrique Barboza (17):
   ppc/pnv: use PHB4 obj in pnv_pec_stk_pci_xscom_ops
   ppc/pnv: move PCI registers to PnvPHB4
   ppc/pnv: move phbbar to PnvPHB4
   ppc/pnv: move intbar to PnvPHB4
   ppc/pnv: change pnv_phb4_update_regions() to use PnvPHB4
   ppc/pnv: move mmbar0/mmbar1 and friends to PnvPHB4
   ppc/pnv: move nest_regs[] to PnvPHB4
   ppc/pnv: change pnv_pec_stk_update_map() to use PnvPHB4
   ppc/pnv: move nest_regs_mr to PnvPHB4
   ppc/pnv: move phb_regs_mr to PnvPHB4
   ppc/pnv: introduce PnvPHB4 'phb_number' property
   ppc/pnv: introduce PnvPHB4 'pec' property
   ppc/pnv: remove stack pointer from PnvPHB4
   ppc/pnv: move default_phb_realize() to pec_realize()
   ppc/pnv: convert pec->stacks[] into pec->phbs[]
   ppc/pnv: remove PnvPhb4PecStack object
   ppc/pnv: rename pnv_pec_stk_update_map()

  hw/pci-host/pnv_phb4.c | 271 -
  hw/pci-host/pnv_phb4_pec.c | 122 ---
  include/hw/pci-host/pnv_phb4.h |  84 +-
  3 files changed, 200 insertions(+), 277 deletions(-)






Re: [PATCH 02/17] ppc/pnv: move PCI registers to PnvPHB4

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

The previous patch changed pnv_pec_stk_pci_xscom_read() and
pnv_pec_stk_pci_xscom_write() to use a PnvPHB4 opaque, making it easier
to move both pci_regs[] and the pci_regs_mr MemoryRegion to the PnvPHB4
object.
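
[Editorial illustration] The write callback in the diff below treats the FIR
register and its CLR/SET aliases as three views of one stored value. A minimal
sketch of that pattern follows, assuming hypothetical register indices and
names (SKETCH_FIR and friends are not the PEC_PCI_STK_* values). It mirrors the
quoted code: a plain write stores the value, the CLR alias ANDs it in, and the
SET alias ORs it in.

```c
#include <stdint.h>

/* Hypothetical register indices, for illustration only. */
enum {
    SKETCH_FIR = 0,
    SKETCH_FIR_CLR = 1,
    SKETCH_FIR_SET = 2,
    SKETCH_REGS_COUNT = 3,
};

typedef struct SketchRegs {
    uint64_t regs[SKETCH_REGS_COUNT];
} SketchRegs;

static void sketch_fir_write(SketchRegs *r, uint32_t reg, uint64_t val)
{
    switch (reg) {
    case SKETCH_FIR:
        r->regs[SKETCH_FIR] = val;      /* direct write */
        break;
    case SKETCH_FIR_CLR:
        r->regs[SKETCH_FIR] &= val;     /* AND alias: keeps bits set in val */
        break;
    case SKETCH_FIR_SET:
        r->regs[SKETCH_FIR] |= val;     /* OR alias: sets bits set in val */
        break;
    default:
        break;                          /* unknown register: ignored here */
    }
}
```

Only the SKETCH_FIR slot ever holds state, so moving the backing array from
one object to another, as this patch does, leaves the alias semantics
untouched.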

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  hw/pci-host/pnv_phb4.c | 30 +++---
  include/hw/pci-host/pnv_phb4.h | 10 +-
  2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index e010572376..fd9f6af4b3 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -1071,54 +1071,54 @@ static const MemoryRegionOps pnv_pec_stk_nest_xscom_ops 
= {
  static uint64_t pnv_pec_stk_pci_xscom_read(void *opaque, hwaddr addr,
 unsigned size)
  {
-PnvPhb4PecStack *stack = PNV_PHB4(opaque)->stack;
+PnvPHB4 *phb = PNV_PHB4(opaque);
  uint32_t reg = addr >> 3;
  
  /* TODO: add list of allowed registers and error out if not */

-return stack->pci_regs[reg];
+return phb->pci_regs[reg];
  }
  
  static void pnv_pec_stk_pci_xscom_write(void *opaque, hwaddr addr,

  uint64_t val, unsigned size)
  {
-PnvPhb4PecStack *stack = PNV_PHB4(opaque)->stack;
+PnvPHB4 *phb = PNV_PHB4(opaque);
  uint32_t reg = addr >> 3;
  
  switch (reg) {

  case PEC_PCI_STK_PCI_FIR:
-stack->pci_regs[reg] = val;
+phb->pci_regs[reg] = val;
  break;
  case PEC_PCI_STK_PCI_FIR_CLR:
-stack->pci_regs[PEC_PCI_STK_PCI_FIR] &= val;
+phb->pci_regs[PEC_PCI_STK_PCI_FIR] &= val;
  break;
  case PEC_PCI_STK_PCI_FIR_SET:
-stack->pci_regs[PEC_PCI_STK_PCI_FIR] |= val;
+phb->pci_regs[PEC_PCI_STK_PCI_FIR] |= val;
  break;
  case PEC_PCI_STK_PCI_FIR_MSK:
-stack->pci_regs[reg] = val;
+phb->pci_regs[reg] = val;
  break;
  case PEC_PCI_STK_PCI_FIR_MSKC:
-stack->pci_regs[PEC_PCI_STK_PCI_FIR_MSK] &= val;
+phb->pci_regs[PEC_PCI_STK_PCI_FIR_MSK] &= val;
  break;
  case PEC_PCI_STK_PCI_FIR_MSKS:
-stack->pci_regs[PEC_PCI_STK_PCI_FIR_MSK] |= val;
+phb->pci_regs[PEC_PCI_STK_PCI_FIR_MSK] |= val;
  break;
  case PEC_PCI_STK_PCI_FIR_ACT0:
  case PEC_PCI_STK_PCI_FIR_ACT1:
-stack->pci_regs[reg] = val;
+phb->pci_regs[reg] = val;
  break;
  case PEC_PCI_STK_PCI_FIR_WOF:
-stack->pci_regs[reg] = 0;
+phb->pci_regs[reg] = 0;
  break;
  case PEC_PCI_STK_ETU_RESET:
-stack->pci_regs[reg] = val & 0x8000ull;
+phb->pci_regs[reg] = val & 0x8000ull;
  /* TODO: Implement reset */
  break;
  case PEC_PCI_STK_PBAIB_ERR_REPORT:
  break;
  case PEC_PCI_STK_PBAIB_TX_CMD_CRED:
  case PEC_PCI_STK_PBAIB_TX_DAT_CRED:
-stack->pci_regs[reg] = val;
+phb->pci_regs[reg] = val;
  break;
  default:
  qemu_log_mask(LOG_UNIMP, "phb4_pec_stk: pci_xscom_write 
0x%"HWADDR_PRIx
@@ -1477,7 +1477,7 @@ static void pnv_phb4_xscom_realize(PnvPHB4 *phb)
  
  snprintf(name, sizeof(name), "xscom-pec-%d.%d-pci-phb-%d",

   pec->chip_id, pec->index, stack->stack_no);
-pnv_xscom_region_init(&stack->pci_regs_mr, OBJECT(phb),
+pnv_xscom_region_init(&phb->pci_regs_mr, OBJECT(phb),
&pnv_pec_stk_pci_xscom_ops, phb, name,
PHB4_PEC_PCI_STK_REGS_COUNT);
  
@@ -1496,7 +1496,7 @@ static void pnv_phb4_xscom_realize(PnvPHB4 *phb)

  &stack->nest_regs_mr);
  pnv_xscom_add_subregion(pec->chip,
  pec_pci_base + 0x40 * (stack->stack_no + 1),
-&stack->pci_regs_mr);
+&phb->pci_regs_mr);
  pnv_xscom_add_subregion(pec->chip,
  pec_pci_base + PNV9_XSCOM_PEC_PCI_STK0 +
  0x40 * stack->stack_no,
diff --git a/include/hw/pci-host/pnv_phb4.h b/include/hw/pci-host/pnv_phb4.h
index 4b7ce8a723..4487c3a6e2 100644
--- a/include/hw/pci-host/pnv_phb4.h
+++ b/include/hw/pci-host/pnv_phb4.h
@@ -107,6 +107,11 @@ struct PnvPHB4 {
  MemoryRegion pci_mmio;
  MemoryRegion pci_io;
  
+/* PCI registers (excluding pass-through) */

+#define PHB4_PEC_PCI_STK_REGS_COUNT  0xf
+uint64_t pci_regs[PHB4_PEC_PCI_STK_REGS_COUNT];
+MemoryRegion pci_regs_mr;
+
  /* On-chip IODA tables */
  uint64_t ioda_LIST[PNV_PHB4_MAX_LSIs];
  uint64_t ioda_MIST[PNV_PHB4_MAX_MIST];
@@ -155,11 +160,6 @@ struct PnvPhb4PecStack {
  uint64_t nest_regs[PHB4_PEC_NEST_STK_REGS_COUNT];
  MemoryRegion nest_regs_mr;
  
-/* PCI registers (excluding pass-through) */

-#define PHB4_PEC_PCI_STK_REGS_COUNT  0xf
-uint64_t pci_regs[PHB4_PEC_PCI_STK_R

Re: [PATCH 06/17] ppc/pnv: move mmbar0/mmbar1 and friends to PnvPHB4

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

These 2 MemoryRegions, together with mmio(0|1)_base and mmio(0|1)_size
variables, are used together in the same functions. We're better off
moving them all in a single step.
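
[Editorial illustration] The reason these fields travel together is visible in
the bounds check in pnv_phb4_check_mbt(): each window's mapped state, base and
size are consulted in one expression. A minimal sketch of that check, assuming
a hypothetical SketchWindow type in place of the MemoryRegion plus base/size
fields. Like the quoted code, it does not guard against base + size
overflowing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one MMIO window (mmbarN + mmioN_base/size). */
typedef struct SketchWindow {
    bool mapped;
    uint64_t base;
    uint64_t size;
} SketchWindow;

/* True if [base, base + size) lies entirely inside a mapped window. */
static bool sketch_window_contains(const SketchWindow *w,
                                   uint64_t base, uint64_t size)
{
    return w->mapped &&
           base >= w->base &&
           (base + size) <= (w->base + w->size);
}

/*
 * Pick the enclosing window and rebase *base to be window-relative.
 * Returns 0 or 1 for the window index, or -1 if neither window fits
 * (in which case *base is left unchanged).
 */
static int sketch_pick_window(const SketchWindow *w0, const SketchWindow *w1,
                              uint64_t *base, uint64_t size)
{
    if (sketch_window_contains(w0, *base, size)) {
        *base -= w0->base;
        return 0;
    }
    if (sketch_window_contains(w1, *base, size)) {
        *base -= w1->base;
        return 1;
    }
    return -1;
}
```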

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  hw/pci-host/pnv_phb4.c | 52 +-
  include/hw/pci-host/pnv_phb4.h | 14 -
  2 files changed, 32 insertions(+), 34 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 034721f159..dc4db091e4 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -228,16 +228,16 @@ static void pnv_phb4_check_mbt(PnvPHB4 *phb, uint32_t 
index)
  /* TODO: Figure out how to implemet/decode AOMASK */
  
  /* Check if it matches an enabled MMIO region in the PEC stack */

-if (memory_region_is_mapped(&phb->stack->mmbar0) &&
-base >= phb->stack->mmio0_base &&
-(base + size) <= (phb->stack->mmio0_base + phb->stack->mmio0_size)) {
-parent = &phb->stack->mmbar0;
-base -= phb->stack->mmio0_base;
-} else if (memory_region_is_mapped(&phb->stack->mmbar1) &&
-base >= phb->stack->mmio1_base &&
-(base + size) <= (phb->stack->mmio1_base + phb->stack->mmio1_size)) {
-parent = &phb->stack->mmbar1;
-base -= phb->stack->mmio1_base;
+if (memory_region_is_mapped(&phb->mmbar0) &&
+base >= phb->mmio0_base &&
+(base + size) <= (phb->mmio0_base + phb->mmio0_size)) {
+parent = &phb->mmbar0;
+base -= phb->mmio0_base;
+} else if (memory_region_is_mapped(&phb->mmbar1) &&
+base >= phb->mmio1_base &&
+(base + size) <= (phb->mmio1_base + phb->mmio1_size)) {
+parent = &phb->mmbar1;
+base -= phb->mmio1_base;
  } else {
  phb_error(phb, "PHB MBAR %d out of parent bounds", index);
  return;
@@ -910,13 +910,13 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
   */
  
  /* Handle unmaps */

-if (memory_region_is_mapped(&stack->mmbar0) &&
+if (memory_region_is_mapped(&phb->mmbar0) &&
  !(bar_en & PEC_NEST_STK_BAR_EN_MMIO0)) {
-memory_region_del_subregion(sysmem, &stack->mmbar0);
+memory_region_del_subregion(sysmem, &phb->mmbar0);
  }
-if (memory_region_is_mapped(&stack->mmbar1) &&
+if (memory_region_is_mapped(&phb->mmbar1) &&
  !(bar_en & PEC_NEST_STK_BAR_EN_MMIO1)) {
-memory_region_del_subregion(sysmem, &stack->mmbar1);
+memory_region_del_subregion(sysmem, &phb->mmbar1);
  }
  if (memory_region_is_mapped(&phb->phbbar) &&
  !(bar_en & PEC_NEST_STK_BAR_EN_PHB)) {
@@ -931,29 +931,29 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
  pnv_phb4_update_regions(phb);
  
  /* Handle maps */

-if (!memory_region_is_mapped(&stack->mmbar0) &&
+if (!memory_region_is_mapped(&phb->mmbar0) &&
  (bar_en & PEC_NEST_STK_BAR_EN_MMIO0)) {
  bar = stack->nest_regs[PEC_NEST_STK_MMIO_BAR0] >> 8;
  mask = stack->nest_regs[PEC_NEST_STK_MMIO_BAR0_MASK];
  size = ((~mask) >> 8) + 1;
-snprintf(name, sizeof(name), "pec-%d.%d-stack-%d-mmio0",
+snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-mmio0",
   pec->chip_id, pec->index, stack->stack_no);
-memory_region_init(&stack->mmbar0, OBJECT(stack), name, size);
-memory_region_add_subregion(sysmem, bar, &stack->mmbar0);
-stack->mmio0_base = bar;
-stack->mmio0_size = size;
+memory_region_init(&phb->mmbar0, OBJECT(phb), name, size);
+memory_region_add_subregion(sysmem, bar, &phb->mmbar0);
+phb->mmio0_base = bar;
+phb->mmio0_size = size;
  }
-if (!memory_region_is_mapped(&stack->mmbar1) &&
+if (!memory_region_is_mapped(&phb->mmbar1) &&
  (bar_en & PEC_NEST_STK_BAR_EN_MMIO1)) {
  bar = stack->nest_regs[PEC_NEST_STK_MMIO_BAR1] >> 8;
  mask = stack->nest_regs[PEC_NEST_STK_MMIO_BAR1_MASK];
  size = ((~mask) >> 8) + 1;
-snprintf(name, sizeof(name), "pec-%d.%d-stack-%d-mmio1",
+snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-mmio1",
   pec->chip_id, pec->index, stack->stack_no);
-memory_region_init(&stack->mmbar1, OBJECT(stack), name, size);
-memory_region_add_subregion(sysmem, bar, &stack->mmbar1);
-stack->mmio1_base = bar;
-stack->mmio1_size = size;
+memory_region_init(&phb->mmbar1, OBJECT(phb), name, size);
+memory_region_add_subregion(sysmem, bar, &phb->mmbar1);
+phb->mmio1_base = bar;
+phb->mmio1_size = size;
  }
  if (!memory_region_is_mapped(&phb->phbbar) &&
  (bar_en & PEC_NEST_STK_BAR_EN_PHB)) {
diff --git a/include/hw/pci-host/pnv_phb4.h b/include/hw/pci-host/pnv_phb4.h
index cf5dd4009c..4a8f510f6d 100644
--- a/include/hw/pci-host/pnv_phb4.h
+++ b/include/hw/pci-host/pnv_ph

Re: [PATCH 04/17] ppc/pnv: move intbar to PnvPHB4

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

This MemoryRegion can also be moved in a single step.

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  hw/pci-host/pnv_phb4.c | 18 +-
  include/hw/pci-host/pnv_phb4.h |  2 +-
  2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 00eaf91fca..fbc475f27a 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -877,7 +877,7 @@ static void pnv_phb4_update_regions(PnvPhb4PecStack *stack)
  memory_region_del_subregion(&phb->phbbar, &phb->mr_regs);
  }
  if (memory_region_is_mapped(&phb->xsrc.esb_mmio)) {
-memory_region_del_subregion(&stack->intbar, &phb->xsrc.esb_mmio);
+memory_region_del_subregion(&phb->intbar, &phb->xsrc.esb_mmio);
  }
  
  /* Map registers if enabled */

@@ -886,8 +886,8 @@ static void pnv_phb4_update_regions(PnvPhb4PecStack *stack)
  }
  
  /* Map ESB if enabled */

-if (memory_region_is_mapped(&stack->intbar)) {
-memory_region_add_subregion(&stack->intbar, 0, &phb->xsrc.esb_mmio);
+if (memory_region_is_mapped(&phb->intbar)) {
+memory_region_add_subregion(&phb->intbar, 0, &phb->xsrc.esb_mmio);
  }
  
  /* Check/update m32 */

@@ -924,9 +924,9 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
  !(bar_en & PEC_NEST_STK_BAR_EN_PHB)) {
  memory_region_del_subregion(sysmem, &phb->phbbar);
  }
-if (memory_region_is_mapped(&stack->intbar) &&
+if (memory_region_is_mapped(&phb->intbar) &&
  !(bar_en & PEC_NEST_STK_BAR_EN_INT)) {
-memory_region_del_subregion(sysmem, &stack->intbar);
+memory_region_del_subregion(sysmem, &phb->intbar);
  }
  
  /* Update PHB */

@@ -966,14 +966,14 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
  memory_region_init(&phb->phbbar, OBJECT(phb), name, size);
  memory_region_add_subregion(sysmem, bar, &phb->phbbar);
  }
-if (!memory_region_is_mapped(&stack->intbar) &&
+if (!memory_region_is_mapped(&phb->intbar) &&
  (bar_en & PEC_NEST_STK_BAR_EN_INT)) {
  bar = stack->nest_regs[PEC_NEST_STK_INT_BAR] >> 8;
  size = PNV_PHB4_MAX_INTs << 16;
-snprintf(name, sizeof(name), "pec-%d.%d-stack-%d-int",
+snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-int",
   stack->pec->chip_id, stack->pec->index, stack->stack_no);
-memory_region_init(&stack->intbar, OBJECT(stack), name, size);
-memory_region_add_subregion(sysmem, bar, &stack->intbar);
+memory_region_init(&phb->intbar, OBJECT(phb), name, size);
+memory_region_add_subregion(sysmem, bar, &phb->intbar);
  }
  
  /* Update PHB */

diff --git a/include/hw/pci-host/pnv_phb4.h b/include/hw/pci-host/pnv_phb4.h
index b11fa80e81..cf5dd4009c 100644
--- a/include/hw/pci-host/pnv_phb4.h
+++ b/include/hw/pci-host/pnv_phb4.h
@@ -114,6 +114,7 @@ struct PnvPHB4 {
  
  /* Memory windows from PowerBus to PHB */

  MemoryRegion phbbar;
+MemoryRegion intbar;
  
  /* On-chip IODA tables */

  uint64_t ioda_LIST[PNV_PHB4_MAX_LSIs];
@@ -169,7 +170,6 @@ struct PnvPhb4PecStack {
  /* Memory windows from PowerBus to PHB */
  MemoryRegion mmbar0;
  MemoryRegion mmbar1;
-MemoryRegion intbar;
  uint64_t mmio0_base;
  uint64_t mmio0_size;
  uint64_t mmio1_base;






Re: [PATCH 03/17] ppc/pnv: move phbbar to PnvPHB4

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

This MemoryRegion is simple enough to be moved in a single step.

A 'stack->phb' pointer had to be introduced in pnv_pec_stk_update_map()
because this function isn't ready to be fully converted to use a PnvPHB4
pointer instead. This will be dealt with in the following patches.

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  hw/pci-host/pnv_phb4.c | 19 ++-
  include/hw/pci-host/pnv_phb4.h |  4 +++-
  2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index fd9f6af4b3..00eaf91fca 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -874,15 +874,15 @@ static void pnv_phb4_update_regions(PnvPhb4PecStack 
*stack)
  
  /* Unmap first always */

  if (memory_region_is_mapped(&phb->mr_regs)) {
-memory_region_del_subregion(&stack->phbbar, &phb->mr_regs);
+memory_region_del_subregion(&phb->phbbar, &phb->mr_regs);
  }
  if (memory_region_is_mapped(&phb->xsrc.esb_mmio)) {
  memory_region_del_subregion(&stack->intbar, &phb->xsrc.esb_mmio);
  }
  
  /* Map registers if enabled */

-if (memory_region_is_mapped(&stack->phbbar)) {
-memory_region_add_subregion(&stack->phbbar, 0, &phb->mr_regs);
+if (memory_region_is_mapped(&phb->phbbar)) {
+memory_region_add_subregion(&phb->phbbar, 0, &phb->mr_regs);
  }
  
  /* Map ESB if enabled */

@@ -897,6 +897,7 @@ static void pnv_phb4_update_regions(PnvPhb4PecStack *stack)
  static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
  {
  PnvPhb4PecState *pec = stack->pec;
+PnvPHB4 *phb = stack->phb;
  MemoryRegion *sysmem = get_system_memory();
  uint64_t bar_en = stack->nest_regs[PEC_NEST_STK_BAR_EN];
  uint64_t bar, mask, size;
@@ -919,9 +920,9 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
  !(bar_en & PEC_NEST_STK_BAR_EN_MMIO1)) {
  memory_region_del_subregion(sysmem, &stack->mmbar1);
  }
-if (memory_region_is_mapped(&stack->phbbar) &&
+if (memory_region_is_mapped(&phb->phbbar) &&
  !(bar_en & PEC_NEST_STK_BAR_EN_PHB)) {
-memory_region_del_subregion(sysmem, &stack->phbbar);
+memory_region_del_subregion(sysmem, &phb->phbbar);
  }
  if (memory_region_is_mapped(&stack->intbar) &&
  !(bar_en & PEC_NEST_STK_BAR_EN_INT)) {
@@ -956,14 +957,14 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
  stack->mmio1_base = bar;
  stack->mmio1_size = size;
  }
-if (!memory_region_is_mapped(&stack->phbbar) &&
+if (!memory_region_is_mapped(&phb->phbbar) &&
  (bar_en & PEC_NEST_STK_BAR_EN_PHB)) {
  bar = stack->nest_regs[PEC_NEST_STK_PHB_REGS_BAR] >> 8;
  size = PNV_PHB4_NUM_REGS << 3;
-snprintf(name, sizeof(name), "pec-%d.%d-stack-%d-phb",
+snprintf(name, sizeof(name), "pec-%d.%d-phb-%d",
   pec->chip_id, pec->index, stack->stack_no);
-memory_region_init(&stack->phbbar, OBJECT(stack), name, size);
-memory_region_add_subregion(sysmem, bar, &stack->phbbar);
+memory_region_init(&phb->phbbar, OBJECT(phb), name, size);
+memory_region_add_subregion(sysmem, bar, &phb->phbbar);
  }
  if (!memory_region_is_mapped(&stack->intbar) &&
  (bar_en & PEC_NEST_STK_BAR_EN_INT)) {
diff --git a/include/hw/pci-host/pnv_phb4.h b/include/hw/pci-host/pnv_phb4.h
index 4487c3a6e2..b11fa80e81 100644
--- a/include/hw/pci-host/pnv_phb4.h
+++ b/include/hw/pci-host/pnv_phb4.h
@@ -112,6 +112,9 @@ struct PnvPHB4 {
  uint64_t pci_regs[PHB4_PEC_PCI_STK_REGS_COUNT];
  MemoryRegion pci_regs_mr;
  
+/* Memory windows from PowerBus to PHB */

+MemoryRegion phbbar;
+
  /* On-chip IODA tables */
  uint64_t ioda_LIST[PNV_PHB4_MAX_LSIs];
  uint64_t ioda_MIST[PNV_PHB4_MAX_MIST];
@@ -166,7 +169,6 @@ struct PnvPhb4PecStack {
  /* Memory windows from PowerBus to PHB */
  MemoryRegion mmbar0;
  MemoryRegion mmbar1;
-MemoryRegion phbbar;
  MemoryRegion intbar;
  uint64_t mmio0_base;
  uint64_t mmio0_size;






Re: [PATCH 07/17] ppc/pnv: move nest_regs[] to PnvPHB4

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

stack->nest_regs[] is used in several XSCOM functions and it's one of
the main culprits of having to deal with stack->phb pointers around the
code.

Granted, we have to add 2 extra stack->phb pointers to ease the
nest_regs[] migration to PnvPHB4. They'll be dealt with shortly.
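
[Editorial illustration] Most nest_regs[] reads in pnv_pec_stk_update_map()
feed the same two computations: the BAR value shifted right by 8 gives the
base, and the inverted mask shifted right by 8, plus one, gives the size. A
minimal sketch of that arithmetic, taken from the expressions in the quoted
diff. SketchBar and sketch_decode_bar() are hypothetical names, and the sample
register values in the usage note are made up, not real hardware values.

```c
#include <stdint.h>

/* Hypothetical result type for a decoded BAR/mask register pair. */
typedef struct SketchBar {
    uint64_t base;
    uint64_t size;
} SketchBar;

static SketchBar sketch_decode_bar(uint64_t bar_reg, uint64_t mask_reg)
{
    SketchBar b;

    b.base = bar_reg >> 8;              /* same shift as the quoted code */
    b.size = ((~mask_reg) >> 8) + 1;    /* inverted mask gives size - 1 */
    return b;
}
```

For example, a mask register of 0xFFFFFFFFFF000000 inverts to 0xFFFFFF, which
shifts down to 0xFFFF and yields a size of 0x10000.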

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  hw/pci-host/pnv_phb4.c | 52 ++
  include/hw/pci-host/pnv_phb4.h |  7 +++--
  2 files changed, 31 insertions(+), 28 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index dc4db091e4..916a7a3cf0 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -862,10 +862,11 @@ static uint64_t pnv_pec_stk_nest_xscom_read(void *opaque, 
hwaddr addr,
  unsigned size)
  {
  PnvPhb4PecStack *stack = PNV_PHB4_PEC_STACK(opaque);
+PnvPHB4 *phb = stack->phb;
  uint32_t reg = addr >> 3;
  
  /* TODO: add list of allowed registers and error out if not */

-return stack->nest_regs[reg];
+return phb->nest_regs[reg];
  }
  
  static void pnv_phb4_update_regions(PnvPHB4 *phb)

@@ -897,7 +898,7 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
  PnvPhb4PecState *pec = stack->pec;
  PnvPHB4 *phb = stack->phb;
  MemoryRegion *sysmem = get_system_memory();
-uint64_t bar_en = stack->nest_regs[PEC_NEST_STK_BAR_EN];
+uint64_t bar_en = phb->nest_regs[PEC_NEST_STK_BAR_EN];
  uint64_t bar, mask, size;
  char name[64];
  
@@ -933,8 +934,8 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)

  /* Handle maps */
  if (!memory_region_is_mapped(&phb->mmbar0) &&
  (bar_en & PEC_NEST_STK_BAR_EN_MMIO0)) {
-bar = stack->nest_regs[PEC_NEST_STK_MMIO_BAR0] >> 8;
-mask = stack->nest_regs[PEC_NEST_STK_MMIO_BAR0_MASK];
+bar = phb->nest_regs[PEC_NEST_STK_MMIO_BAR0] >> 8;
+mask = phb->nest_regs[PEC_NEST_STK_MMIO_BAR0_MASK];
  size = ((~mask) >> 8) + 1;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-mmio0",
   pec->chip_id, pec->index, stack->stack_no);
@@ -945,8 +946,8 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
  }
  if (!memory_region_is_mapped(&phb->mmbar1) &&
  (bar_en & PEC_NEST_STK_BAR_EN_MMIO1)) {
-bar = stack->nest_regs[PEC_NEST_STK_MMIO_BAR1] >> 8;
-mask = stack->nest_regs[PEC_NEST_STK_MMIO_BAR1_MASK];
+bar = phb->nest_regs[PEC_NEST_STK_MMIO_BAR1] >> 8;
+mask = phb->nest_regs[PEC_NEST_STK_MMIO_BAR1_MASK];
  size = ((~mask) >> 8) + 1;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-mmio1",
   pec->chip_id, pec->index, stack->stack_no);
@@ -957,7 +958,7 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
  }
  if (!memory_region_is_mapped(&phb->phbbar) &&
  (bar_en & PEC_NEST_STK_BAR_EN_PHB)) {
-bar = stack->nest_regs[PEC_NEST_STK_PHB_REGS_BAR] >> 8;
+bar = phb->nest_regs[PEC_NEST_STK_PHB_REGS_BAR] >> 8;
  size = PNV_PHB4_NUM_REGS << 3;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d",
   pec->chip_id, pec->index, stack->stack_no);
@@ -966,7 +967,7 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
  }
  if (!memory_region_is_mapped(&phb->intbar) &&
  (bar_en & PEC_NEST_STK_BAR_EN_INT)) {
-bar = stack->nest_regs[PEC_NEST_STK_INT_BAR] >> 8;
+bar = phb->nest_regs[PEC_NEST_STK_INT_BAR] >> 8;
  size = PNV_PHB4_MAX_INTs << 16;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-int",
   stack->pec->chip_id, stack->pec->index, stack->stack_no);
@@ -982,34 +983,35 @@ static void pnv_pec_stk_nest_xscom_write(void *opaque, 
hwaddr addr,
   uint64_t val, unsigned size)
  {
  PnvPhb4PecStack *stack = PNV_PHB4_PEC_STACK(opaque);
+PnvPHB4 *phb = stack->phb;
  PnvPhb4PecState *pec = stack->pec;
  uint32_t reg = addr >> 3;
  
  switch (reg) {

  case PEC_NEST_STK_PCI_NEST_FIR:
-stack->nest_regs[PEC_NEST_STK_PCI_NEST_FIR] = val;
+phb->nest_regs[PEC_NEST_STK_PCI_NEST_FIR] = val;
  break;
  case PEC_NEST_STK_PCI_NEST_FIR_CLR:
-stack->nest_regs[PEC_NEST_STK_PCI_NEST_FIR] &= val;
+phb->nest_regs[PEC_NEST_STK_PCI_NEST_FIR] &= val;
  break;
  case PEC_NEST_STK_PCI_NEST_FIR_SET:
-stack->nest_regs[PEC_NEST_STK_PCI_NEST_FIR] |= val;
+phb->nest_regs[PEC_NEST_STK_PCI_NEST_FIR] |= val;
  break;
  case PEC_NEST_STK_PCI_NEST_FIR_MSK:
-stack->nest_regs[PEC_NEST_STK_PCI_NEST_FIR_MSK] = val;
+phb->nest_regs[PEC_NEST_STK_PCI_NEST_FIR_MSK] = val;
  break;
  case PEC_NEST_STK_PCI_NEST_FIR_MSKC:
-stack->nest_regs[PEC_NEST_STK_PCI_NEST_FIR_MSK] &= val;
+phb->nest_regs[PE

Re: [PATCH 08/17] ppc/pnv: change pnv_pec_stk_update_map() to use PnvPHB4

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

stack->nest_regs_mr wasn't migrated to PnvPHB4 together with phb->nest_regs[] in
the previous patch. We were unable to cleanly convert its write MemoryRegionOps,
pnv_pec_stk_nest_xscom_write(), to use PnvPHB4 instead of PnvPhb4PecStack due to
pnv_pec_stk_update_map() using a stack. Thing is, we're now able to convert
pnv_pec_stk_update_map() because of what we did in the previous patch.

The need for this intermediate step is a good example of the interconnected
relationship between stack and phb that we aim to clean up.

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Cédric Le Goater 

Thanks,

C.


---
  hw/pci-host/pnv_phb4.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 916a7a3cf0..0f4464ec67 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -893,10 +893,10 @@ static void pnv_phb4_update_regions(PnvPHB4 *phb)
  pnv_phb4_check_all_mbt(phb);
  }
  
-static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)

+static void pnv_pec_stk_update_map(PnvPHB4 *phb)
  {
+PnvPhb4PecStack *stack = phb->stack;
  PnvPhb4PecState *pec = stack->pec;
-PnvPHB4 *phb = stack->phb;
  MemoryRegion *sysmem = get_system_memory();
  uint64_t bar_en = phb->nest_regs[PEC_NEST_STK_BAR_EN];
  uint64_t bar, mask, size;
@@ -1046,7 +1046,7 @@ static void pnv_pec_stk_nest_xscom_write(void *opaque, 
hwaddr addr,
  break;
  case PEC_NEST_STK_BAR_EN:
  phb->nest_regs[reg] = val & 0xf000ull;
-pnv_pec_stk_update_map(stack);
+pnv_pec_stk_update_map(phb);
  break;
  case PEC_NEST_STK_DATA_FRZ_TYPE:
  case PEC_NEST_STK_PBCQ_TUN_BAR:






Re: [PATCH 05/17] ppc/pnv: change pnv_phb4_update_regions() to use PnvPHB4

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

The function does not rely on stack for anything it does anymore. This
is also one less instance of 'stack->phb' that we need to worry about.

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  hw/pci-host/pnv_phb4.c | 8 +++-
  1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index fbc475f27a..034721f159 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -868,10 +868,8 @@ static uint64_t pnv_pec_stk_nest_xscom_read(void *opaque, 
hwaddr addr,
  return stack->nest_regs[reg];
  }
  
-static void pnv_phb4_update_regions(PnvPhb4PecStack *stack)

+static void pnv_phb4_update_regions(PnvPHB4 *phb)
  {
-PnvPHB4 *phb = stack->phb;
-
  /* Unmap first always */
  if (memory_region_is_mapped(&phb->mr_regs)) {
  memory_region_del_subregion(&phb->phbbar, &phb->mr_regs);
@@ -930,7 +928,7 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
  }
  
  /* Update PHB */

-pnv_phb4_update_regions(stack);
+pnv_phb4_update_regions(phb);
  
  /* Handle maps */

  if (!memory_region_is_mapped(&stack->mmbar0) &&
@@ -977,7 +975,7 @@ static void pnv_pec_stk_update_map(PnvPhb4PecStack *stack)
  }
  
  /* Update PHB */

-pnv_phb4_update_regions(stack);
+pnv_phb4_update_regions(phb);
  }
  
  static void pnv_pec_stk_nest_xscom_write(void *opaque, hwaddr addr,







Re: [PATCH 10/17] ppc/pnv: move phb_regs_mr to PnvPHB4

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

After recent changes, this MemoryRegion can be migrated to PnvPHB4
without too much trouble.

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  hw/pci-host/pnv_phb4.c | 6 +++---
  include/hw/pci-host/pnv_phb4.h | 6 +++---
  2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 37bab10fcb..b5045fca64 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -1481,9 +1481,9 @@ static void pnv_phb4_xscom_realize(PnvPHB4 *phb)
PHB4_PEC_PCI_STK_REGS_COUNT);
  
  /* PHB pass-through */

-snprintf(name, sizeof(name), "xscom-pec-%d.%d-pci-stack-%d-phb",
+snprintf(name, sizeof(name), "xscom-pec-%d.%d-pci-phb-%d",
   pec->chip_id, pec->index, stack->stack_no);
-pnv_xscom_region_init(&stack->phb_regs_mr, OBJECT(phb),
+pnv_xscom_region_init(&phb->phb_regs_mr, OBJECT(phb),
&pnv_phb4_xscom_ops, phb, name, 0x40);
  
  pec_nest_base = pecc->xscom_nest_base(pec);

@@ -1499,7 +1499,7 @@ static void pnv_phb4_xscom_realize(PnvPHB4 *phb)
  pnv_xscom_add_subregion(pec->chip,
  pec_pci_base + PNV9_XSCOM_PEC_PCI_STK0 +
  0x40 * stack->stack_no,
-&stack->phb_regs_mr);
+&phb->phb_regs_mr);
  }
  
  static void pnv_phb4_instance_init(Object *obj)

diff --git a/include/hw/pci-host/pnv_phb4.h b/include/hw/pci-host/pnv_phb4.h
index 1d53dda0ed..6968efaba8 100644
--- a/include/hw/pci-host/pnv_phb4.h
+++ b/include/hw/pci-host/pnv_phb4.h
@@ -117,6 +117,9 @@ struct PnvPHB4 {
  uint64_t nest_regs[PHB4_PEC_NEST_STK_REGS_COUNT];
  MemoryRegion nest_regs_mr;
  
+/* PHB pass-through XSCOM */

+MemoryRegion phb_regs_mr;
+
  /* Memory windows from PowerBus to PHB */
  MemoryRegion phbbar;
  MemoryRegion intbar;
@@ -170,9 +173,6 @@ struct PnvPhb4PecStack {
  /* My own stack number */
  uint32_t stack_no;
  
-/* PHB pass-through XSCOM */

-MemoryRegion phb_regs_mr;
-
  /* The owner PEC */
  PnvPhb4PecState *pec;
  






Re: [PATCH 09/17] ppc/pnv: move nest_regs_mr to PnvPHB4

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

We're now able to cleanly move nest_regs_mr to the PnvPHB4 device.

One thing of note here is the need to use a phb->stack->pec pointer
because pnv_pec_stk_nest_xscom_write requires a PEC object. Another
thing worth noticing is the use of 'stack->stack_no', which still
remains throughout the XSCOM code.

With all MemoryRegions moved to the PnvPHB4 object, the remaining role
of the stack becomes clear: provide a PEC pointer and the
'stack_no' information. If we can provide these in the PnvPHB4 object
instead (spoiler: we can, and we will), the PnvPhb4PecStack device will
be deprecated and can be removed.
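
To illustrate the change described above, here is a minimal sketch of an
XSCOM-style callback whose opaque pointer is the PHB rather than the stack.
The struct layouts and the names Pec, Stack, Phb, nest_read and nest_write
are hypothetical simplifications, not the real QEMU types; only the
'addr >> 3' register indexing and the phb->stack->pec access are taken from
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-ins for the QEMU types. */
typedef struct Pec { uint32_t chip_id; uint32_t index; } Pec;
typedef struct Stack Stack;
typedef struct Phb {
    Stack *stack;           /* still needed to reach the PEC and stack_no */
    uint64_t nest_regs[4];  /* register image now owned by the PHB */
} Phb;
struct Stack { uint32_t stack_no; Pec *pec; Phb *phb; };

/* Read path: the opaque pointer is the PHB itself, no stack detour. */
static uint64_t nest_read(void *opaque, uint64_t addr)
{
    Phb *phb = opaque;
    uint32_t reg = addr >> 3;   /* 8-byte registers, as in the patch */

    return phb->nest_regs[reg];
}

/* Write path: a PEC is still required, reached via phb->stack->pec. */
static Pec *nest_write(void *opaque, uint64_t addr, uint64_t val)
{
    Phb *phb = opaque;
    Pec *pec = phb->stack->pec;

    phb->nest_regs[addr >> 3] = val;
    return pec;
}
```

In this sketch the only things the stack still contributes are the PEC
pointer and stack_no, which is the point the commit message makes.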

Signed-off-by: Daniel Henrique Barboza 



Reviewed-by: Cédric Le Goater 

Thanks,

C.


---
  hw/pci-host/pnv_phb4.c | 16 +++-
  include/hw/pci-host/pnv_phb4.h |  3 +--
  2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 0f4464ec67..37bab10fcb 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -861,8 +861,7 @@ const MemoryRegionOps pnv_phb4_xscom_ops = {
  static uint64_t pnv_pec_stk_nest_xscom_read(void *opaque, hwaddr addr,
  unsigned size)
  {
-PnvPhb4PecStack *stack = PNV_PHB4_PEC_STACK(opaque);
-PnvPHB4 *phb = stack->phb;
+PnvPHB4 *phb = PNV_PHB4(opaque);
  uint32_t reg = addr >> 3;
  
  /* TODO: add list of allowed registers and error out if not */

@@ -982,9 +981,8 @@ static void pnv_pec_stk_update_map(PnvPHB4 *phb)
  static void pnv_pec_stk_nest_xscom_write(void *opaque, hwaddr addr,
   uint64_t val, unsigned size)
  {
-PnvPhb4PecStack *stack = PNV_PHB4_PEC_STACK(opaque);
-PnvPHB4 *phb = stack->phb;
-PnvPhb4PecState *pec = stack->pec;
+PnvPHB4 *phb = PNV_PHB4(opaque);
+PnvPhb4PecState *pec = phb->stack->pec;
  uint32_t reg = addr >> 3;
  
  switch (reg) {

@@ -1470,10 +1468,10 @@ static void pnv_phb4_xscom_realize(PnvPHB4 *phb)
  assert(pec);
  
  /* Initialize the XSCOM regions for the stack registers */

-snprintf(name, sizeof(name), "xscom-pec-%d.%d-nest-stack-%d",
+snprintf(name, sizeof(name), "xscom-pec-%d.%d-nest-phb-%d",
   pec->chip_id, pec->index, stack->stack_no);
-pnv_xscom_region_init(&stack->nest_regs_mr, OBJECT(stack),
-  &pnv_pec_stk_nest_xscom_ops, stack, name,
+pnv_xscom_region_init(&phb->nest_regs_mr, OBJECT(phb),
+  &pnv_pec_stk_nest_xscom_ops, phb, name,
PHB4_PEC_NEST_STK_REGS_COUNT);
  
  snprintf(name, sizeof(name), "xscom-pec-%d.%d-pci-phb-%d",

@@ -1494,7 +1492,7 @@ static void pnv_phb4_xscom_realize(PnvPHB4 *phb)
  /* Populate the XSCOM address space. */
  pnv_xscom_add_subregion(pec->chip,
  pec_nest_base + 0x40 * (stack->stack_no + 1),
-&stack->nest_regs_mr);
+&phb->nest_regs_mr);
  pnv_xscom_add_subregion(pec->chip,
  pec_pci_base + 0x40 * (stack->stack_no + 1),
  &phb->pci_regs_mr);
diff --git a/include/hw/pci-host/pnv_phb4.h b/include/hw/pci-host/pnv_phb4.h
index a7e08772c1..1d53dda0ed 100644
--- a/include/hw/pci-host/pnv_phb4.h
+++ b/include/hw/pci-host/pnv_phb4.h
@@ -115,6 +115,7 @@ struct PnvPHB4 {
  /* Nest registers */
  #define PHB4_PEC_NEST_STK_REGS_COUNT  0x17
  uint64_t nest_regs[PHB4_PEC_NEST_STK_REGS_COUNT];
+MemoryRegion nest_regs_mr;
  
  /* Memory windows from PowerBus to PHB */

  MemoryRegion phbbar;
@@ -169,8 +170,6 @@ struct PnvPhb4PecStack {
  /* My own stack number */
  uint32_t stack_no;
  
-MemoryRegion nest_regs_mr;

-
  /* PHB pass-through XSCOM */
  MemoryRegion phb_regs_mr;
  






Re: Re max ISA serial ports

2022-01-14 Thread Peter Maydell
On Fri, 14 Jan 2022 at 10:31, Ani Sinha  wrote:
>
> I have a question re the following commit :
>
> commit def337ffda34d331404bd7f1a42726b71500df22
> Author: Peter Maydell 
> Date:   Fri Apr 20 15:52:46 2018 +0100
>
> serial-isa: Use MAX_ISA_SERIAL_PORTS instead of MAX_SERIAL_PORTS
>
>
> Does this mean that this limit of 4 slots is qemu / hypervisor specific
> or is it limited in general by hardware across all hypervisors?
> Can you please clarify?

This commit was part of a series which removed the previous
compile time limit on the number of serial ports. (The later
6af2692e86f9fdfb3 and b8846a4d6352b2 remove that limit.)
For some hardware, like the ISA serial port, there is still a
compile time limit because we are emulating real hardware
which had a fixed limit, so there's no point in making QEMU's
code for that device capable of handling any number of ports.
(As the commit message says, the limit in this case is
imposed because there are fixed IO port and IRQ settings for
ISA serial ports.) Commit def337ffda3 is just disentangling
the old generic compile-time limit MAX_SERIAL_PORTS from the
new specific-to-this-device compile-time limit MAX_ISA_SERIAL_PORTS
so that the later commit 6af2692e86f9fdfb3 can delete
MAX_SERIAL_PORTS entirely.

Summary: QEMU (and KVM etc) have no limit on the number
of serial ports. Some specific device emulation does,
usually where the real device it's emulating is similarly
limited.
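
A minimal sketch of why a fixed table implies a compile-time limit, assuming
the conventional PC COM1..COM4 I/O port and IRQ assignments. The names
MAX_ISA_PORTS, isa_io_base, isa_irq and isa_port_resources are hypothetical
and are not the identifiers used in QEMU's source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; values are the conventional COM1..COM4 settings. */
#define MAX_ISA_PORTS 4

static const int isa_io_base[MAX_ISA_PORTS] = { 0x3f8, 0x2f8, 0x3e8, 0x2e8 };
static const int isa_irq[MAX_ISA_PORTS]     = { 4, 3, 4, 3 };

/*
 * Look up the fixed resources for port 'index'.
 * Returns 0 on success, -1 if the index is beyond what the tables define.
 */
static int isa_port_resources(int index, int *io_base, int *irq)
{
    if (index < 0 || index >= MAX_ISA_PORTS) {
        return -1;
    }
    *io_base = isa_io_base[index];
    *irq = isa_irq[index];
    return 0;
}
```

A fifth port has no defined I/O base or IRQ in such a table, so there is
nothing to gain from making this device's limit dynamic.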

-- PMM



Re: [PULL v5 00/18] Build system and KVM changes for 2021-12-23

2022-01-14 Thread Peter Maydell
On Wed, 12 Jan 2022 at 15:23, Paolo Bonzini  wrote:
>
> The following changes since commit b37778b840f6dc6d1bbaf0e8e0641b3d48ad77c5:
>
>   linux-user: Fix clang warning for nios2-linux-user code (2022-01-12 09:22:01 +)
>
> are available in the Git repository at:
>
>   https://gitlab.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 9d30c78c7d3b994825cbe63fa277279ae3ef4248:
>
>   meson: reenable filemonitor-inotify compilation (2022-01-12 14:09:06 +0100)
>
> 
> * configure and meson cleanups
> * KVM_GET/SET_SREGS2 support for x86
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/7.0
for any user-visible changes.

-- PMM



Re: [PATCH 12/17] ppc/pnv: introduce PnvPHB4 'pec' property

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

This property will track the owner PEC of this PHB. For now it's
redundant since we can retrieve the PEC via phb->stack->pec but it
will not be redundant when we get rid of the stack device.

Signed-off-by: Daniel Henrique Barboza 



Reviewed-by: Cédric Le Goater 

Thanks,

C.


---
  hw/pci-host/pnv_phb4.c | 20 +++-
  hw/pci-host/pnv_phb4_pec.c |  2 ++
  include/hw/pci-host/pnv_phb4.h |  3 +++
  3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 44f3087913..c9117221b2 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -894,8 +894,7 @@ static void pnv_phb4_update_regions(PnvPHB4 *phb)
  
  static void pnv_pec_stk_update_map(PnvPHB4 *phb)

  {
-PnvPhb4PecStack *stack = phb->stack;
-PnvPhb4PecState *pec = stack->pec;
+PnvPhb4PecState *pec = phb->pec;
  MemoryRegion *sysmem = get_system_memory();
  uint64_t bar_en = phb->nest_regs[PEC_NEST_STK_BAR_EN];
  uint64_t bar, mask, size;
@@ -969,7 +968,7 @@ static void pnv_pec_stk_update_map(PnvPHB4 *phb)
  bar = phb->nest_regs[PEC_NEST_STK_INT_BAR] >> 8;
  size = PNV_PHB4_MAX_INTs << 16;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-int",
- stack->pec->chip_id, stack->pec->index, phb->phb_number);
+ phb->pec->chip_id, phb->pec->index, phb->phb_number);
  memory_region_init(&phb->intbar, OBJECT(phb), name, size);
  memory_region_add_subregion(sysmem, bar, &phb->intbar);
  }
@@ -982,7 +981,7 @@ static void pnv_pec_stk_nest_xscom_write(void *opaque, hwaddr addr,
   uint64_t val, unsigned size)
  {
  PnvPHB4 *phb = PNV_PHB4(opaque);
-PnvPhb4PecState *pec = phb->stack->pec;
+PnvPhb4PecState *pec = phb->pec;
  uint32_t reg = addr >> 3;
  
  switch (reg) {

@@ -1458,8 +1457,7 @@ static AddressSpace *pnv_phb4_dma_iommu(PCIBus *bus, void *opaque, int devfn)
  
  static void pnv_phb4_xscom_realize(PnvPHB4 *phb)

  {
-PnvPhb4PecStack *stack = phb->stack;
-PnvPhb4PecState *pec = stack->pec;
+PnvPhb4PecState *pec = phb->pec;
  PnvPhb4PecClass *pecc = PNV_PHB4_PEC_GET_CLASS(pec);
  uint32_t pec_nest_base;
  uint32_t pec_pci_base;
@@ -1569,10 +1567,12 @@ static void pnv_phb4_realize(DeviceState *dev, Error **errp)
  }
  
  /*

- * All other phb properties but 'version' and 'phb-number'
- * are already set.
+ * All other phb properties but 'pec', 'version' and
+ * 'phb-number' are already set.
   */
-pecc = PNV_PHB4_PEC_GET_CLASS(phb->stack->pec);
+object_property_set_link(OBJECT(phb), "pec", OBJECT(phb->stack->pec),
+ &error_abort);
+pecc = PNV_PHB4_PEC_GET_CLASS(phb->pec);
  object_property_set_int(OBJECT(phb), "version", pecc->version,
  &error_fatal);
  object_property_set_int(OBJECT(phb), "phb-number",
@@ -1688,6 +1688,8 @@ static Property pnv_phb4_properties[] = {
  DEFINE_PROP_UINT64("version", PnvPHB4, version, 0),
  DEFINE_PROP_LINK("stack", PnvPHB4, stack, TYPE_PNV_PHB4_PEC_STACK,
   PnvPhb4PecStack *),
+DEFINE_PROP_LINK("pec", PnvPHB4, pec, TYPE_PNV_PHB4_PEC,
+ PnvPhb4PecState *),
  DEFINE_PROP_END_OF_LIST(),
  };
  
diff --git a/hw/pci-host/pnv_phb4_pec.c b/hw/pci-host/pnv_phb4_pec.c

index 7c4b4023df..36cc4ffe7c 100644
--- a/hw/pci-host/pnv_phb4_pec.c
+++ b/hw/pci-host/pnv_phb4_pec.c
@@ -287,6 +287,8 @@ static void pnv_pec_stk_default_phb_realize(PnvPhb4PecStack *stack,
  
  object_property_set_int(OBJECT(stack->phb), "phb-number", stack->stack_no,

  &error_abort);
+object_property_set_link(OBJECT(stack->phb), "pec", OBJECT(pec),
+ &error_abort);
  object_property_set_int(OBJECT(stack->phb), "chip-id", pec->chip_id,
  &error_fatal);
  object_property_set_int(OBJECT(stack->phb), "index", phb_id,
diff --git a/include/hw/pci-host/pnv_phb4.h b/include/hw/pci-host/pnv_phb4.h
index fc7807be1c..f66bc76b78 100644
--- a/include/hw/pci-host/pnv_phb4.h
+++ b/include/hw/pci-host/pnv_phb4.h
@@ -87,6 +87,9 @@ struct PnvPHB4 {
  /* My own PHB number */
  uint32_t phb_number;
  
+/* The owner PEC */

+PnvPhb4PecState *pec;
+
  char bus_path[8];
  
  /* Main register images */







Re: [PATCH 13/17] ppc/pnv: remove stack pointer from PnvPHB4

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

This pointer was being used for two reasons: pnv_phb4_update_regions()
was using it to access the PHB and phb4_realize() was using it as a way
to determine if the PHB was user created.

We can determine if the PHB is user created via phb->pec, introduced in
the previous patch, and pnv_phb4_update_regions() is no longer using
stack->phb.

Remove the pointer from the PnvPHB4 device.


Reviewed-by: Cédric Le Goater 

Thanks,

C.





Signed-off-by: Daniel Henrique Barboza 
---
  hw/pci-host/pnv_phb4.c | 17 +
  hw/pci-host/pnv_phb4_pec.c |  2 --
  include/hw/pci-host/pnv_phb4.h |  2 --
  3 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index c9117221b2..25b4248776 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -1549,9 +1549,10 @@ static void pnv_phb4_realize(DeviceState *dev, Error **errp)
  char name[32];
  
  /* User created PHB */

-if (!phb->stack) {
+if (!phb->pec) {
  PnvMachineState *pnv = PNV_MACHINE(qdev_get_machine());
  PnvChip *chip = pnv_get_chip(pnv, phb->chip_id);
+PnvPhb4PecStack *stack;
  PnvPhb4PecClass *pecc;
  BusState *s;
  
@@ -1560,7 +1561,7 @@ static void pnv_phb4_realize(DeviceState *dev, Error **errp)

  return;
  }
  
-phb->stack = pnv_phb4_get_stack(chip, phb, &local_err);

+stack = pnv_phb4_get_stack(chip, phb, &local_err);
  if (local_err) {
  error_propagate(errp, local_err);
  return;
@@ -1570,19 +1571,13 @@ static void pnv_phb4_realize(DeviceState *dev, Error **errp)
   * All other phb properties but 'pec', 'version' and
   * 'phb-number' are already set.
   */
-object_property_set_link(OBJECT(phb), "pec", OBJECT(phb->stack->pec),
+object_property_set_link(OBJECT(phb), "pec", OBJECT(stack->pec),
   &error_abort);
  pecc = PNV_PHB4_PEC_GET_CLASS(phb->pec);
  object_property_set_int(OBJECT(phb), "version", pecc->version,
  &error_fatal);
  object_property_set_int(OBJECT(phb), "phb-number",
-phb->stack->stack_no, &error_abort);
-
-/*
- * Assign stack->phb since pnv_phb4_update_regions() uses it
- * to access the phb.
- */
-phb->stack->phb = phb;
+stack->stack_no, &error_abort);
  
  /*

   * Reparent user created devices to the chip to build
@@ -1686,8 +1681,6 @@ static Property pnv_phb4_properties[] = {
  DEFINE_PROP_UINT32("index", PnvPHB4, phb_id, 0),
  DEFINE_PROP_UINT32("chip-id", PnvPHB4, chip_id, 0),
  DEFINE_PROP_UINT64("version", PnvPHB4, version, 0),
-DEFINE_PROP_LINK("stack", PnvPHB4, stack, TYPE_PNV_PHB4_PEC_STACK,
- PnvPhb4PecStack *),
  DEFINE_PROP_LINK("pec", PnvPHB4, pec, TYPE_PNV_PHB4_PEC,
   PnvPhb4PecState *),
  DEFINE_PROP_END_OF_LIST(),
diff --git a/hw/pci-host/pnv_phb4_pec.c b/hw/pci-host/pnv_phb4_pec.c
index 36cc4ffe7c..1de0eb9adc 100644
--- a/hw/pci-host/pnv_phb4_pec.c
+++ b/hw/pci-host/pnv_phb4_pec.c
@@ -295,8 +295,6 @@ static void pnv_pec_stk_default_phb_realize(PnvPhb4PecStack *stack,
  &error_fatal);
  object_property_set_int(OBJECT(stack->phb), "version", pecc->version,
  &error_fatal);
-object_property_set_link(OBJECT(stack->phb), "stack", OBJECT(stack),
- &error_abort);
  
  if (!sysbus_realize(SYS_BUS_DEVICE(stack->phb), errp)) {

  return;
diff --git a/include/hw/pci-host/pnv_phb4.h b/include/hw/pci-host/pnv_phb4.h
index f66bc76b78..90eb4575f8 100644
--- a/include/hw/pci-host/pnv_phb4.h
+++ b/include/hw/pci-host/pnv_phb4.h
@@ -154,8 +154,6 @@ struct PnvPHB4 {
  XiveSource xsrc;
  qemu_irq *qirqs;
  
-PnvPhb4PecStack *stack;

-
  QLIST_HEAD(, PnvPhb4DMASpace) dma_spaces;
  };
  






Re: [PATCH 17/17] ppc/pnv: rename pnv_pec_stk_update_map()

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

This function does not use 'stack' anymore. Rename it to
pnv_pec_phb_update_map().

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  hw/pci-host/pnv_phb4.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index a9ec42ce2c..d27b62a50a 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -892,7 +892,7 @@ static void pnv_phb4_update_regions(PnvPHB4 *phb)
  pnv_phb4_check_all_mbt(phb);
  }
  
-static void pnv_pec_stk_update_map(PnvPHB4 *phb)

+static void pnv_pec_phb_update_map(PnvPHB4 *phb)
  {
  PnvPhb4PecState *pec = phb->pec;
  MemoryRegion *sysmem = get_system_memory();
@@ -1043,7 +1043,7 @@ static void pnv_pec_stk_nest_xscom_write(void *opaque, hwaddr addr,
  break;
  case PEC_NEST_STK_BAR_EN:
  phb->nest_regs[reg] = val & 0xf000ull;
-pnv_pec_stk_update_map(phb);
+pnv_pec_phb_update_map(phb);
  break;
  case PEC_NEST_STK_DATA_FRZ_TYPE:
  case PEC_NEST_STK_PBCQ_TUN_BAR:






Re: [PATCH 11/17] ppc/pnv: introduce PnvPHB4 'phb_number' property

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

One of the remaining dependencies we have on the PnvPhb4PecStack object
is the stack->stack_no property. This is set as the position the stack
occupies in the pec->stacks[] array.

We need a way to report this same value in the PnvPHB4. This patch
creates a new property called 'phb_number' to be used in existing code
wherever stack->stack_no is currently used.

The 'phb_number' name is an indication of our future intention to convert
the pec->stacks[] array into a pec->phbs[] array, when the PEC object will
deal directly with phb4 objects.



So the PHB would have a 'phb_number' and an 'index' property? That's
confusing. Can we simplify? Compute one from the other?

or keep 'stack_no' to make it clear this belongs to the stack subunit
logic.
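
For reference, a minimal sketch of how a chip-wide PHB index could be derived
from a PEC index and a per-PEC PHB number, following the loop in
pnv_phb4_pec_get_phb_id() quoted elsewhere in this series. The num_phbs[]
values and the names phb_index and phb_locate are assumptions made for
illustration; the inverse lookup is not something the patches contain.

```c
#include <assert.h>

/* Assumed per-PEC PHB counts, for illustration only. */
#define NUM_PECS 3
static const int num_phbs[NUM_PECS] = { 1, 2, 3 };

/* Chip-wide index = PHBs in all lower-numbered PECs + position in this PEC. */
static int phb_index(int pec_index, int phb_number)
{
    int offset = 0;

    while (pec_index--) {
        offset += num_phbs[pec_index];
    }
    return offset + phb_number;
}

/*
 * Inverse direction: recover (pec_index, phb_number) from a chip-wide
 * index. Returns 0 on success, -1 if the index is out of range.
 */
static int phb_locate(int index, int *pec_index, int *phb_number)
{
    if (index < 0) {
        return -1;
    }
    for (int i = 0; i < NUM_PECS; i++) {
        if (index < num_phbs[i]) {
            *pec_index = i;
            *phb_number = index;
            return 0;
        }
        index -= num_phbs[i];
    }
    return -1;
}
```

Under these assumptions either value can be computed from the other once the
owning PEC is known, which is what the question above is asking about.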

Thanks,

C.



Signed-off-by: Daniel Henrique Barboza 
---
  hw/pci-host/pnv_phb4.c | 28 +---
  hw/pci-host/pnv_phb4_pec.c |  2 ++
  include/hw/pci-host/pnv_phb4.h |  3 +++
  3 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index b5045fca64..44f3087913 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -937,7 +937,7 @@ static void pnv_pec_stk_update_map(PnvPHB4 *phb)
  mask = phb->nest_regs[PEC_NEST_STK_MMIO_BAR0_MASK];
  size = ((~mask) >> 8) + 1;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-mmio0",
- pec->chip_id, pec->index, stack->stack_no);
+ pec->chip_id, pec->index, phb->phb_number);
  memory_region_init(&phb->mmbar0, OBJECT(phb), name, size);
  memory_region_add_subregion(sysmem, bar, &phb->mmbar0);
  phb->mmio0_base = bar;
@@ -949,7 +949,7 @@ static void pnv_pec_stk_update_map(PnvPHB4 *phb)
  mask = phb->nest_regs[PEC_NEST_STK_MMIO_BAR1_MASK];
  size = ((~mask) >> 8) + 1;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-mmio1",
- pec->chip_id, pec->index, stack->stack_no);
+ pec->chip_id, pec->index, phb->phb_number);
  memory_region_init(&phb->mmbar1, OBJECT(phb), name, size);
  memory_region_add_subregion(sysmem, bar, &phb->mmbar1);
  phb->mmio1_base = bar;
@@ -960,7 +960,7 @@ static void pnv_pec_stk_update_map(PnvPHB4 *phb)
  bar = phb->nest_regs[PEC_NEST_STK_PHB_REGS_BAR] >> 8;
  size = PNV_PHB4_NUM_REGS << 3;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d",
- pec->chip_id, pec->index, stack->stack_no);
+ pec->chip_id, pec->index, phb->phb_number);
  memory_region_init(&phb->phbbar, OBJECT(phb), name, size);
  memory_region_add_subregion(sysmem, bar, &phb->phbbar);
  }
@@ -969,7 +969,7 @@ static void pnv_pec_stk_update_map(PnvPHB4 *phb)
  bar = phb->nest_regs[PEC_NEST_STK_INT_BAR] >> 8;
  size = PNV_PHB4_MAX_INTs << 16;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-int",
- stack->pec->chip_id, stack->pec->index, stack->stack_no);
+ stack->pec->chip_id, stack->pec->index, phb->phb_number);
  memory_region_init(&phb->intbar, OBJECT(phb), name, size);
  memory_region_add_subregion(sysmem, bar, &phb->intbar);
  }
@@ -1469,20 +1469,20 @@ static void pnv_phb4_xscom_realize(PnvPHB4 *phb)
  
  /* Initialize the XSCOM regions for the stack registers */

  snprintf(name, sizeof(name), "xscom-pec-%d.%d-nest-phb-%d",
- pec->chip_id, pec->index, stack->stack_no);
+ pec->chip_id, pec->index, phb->phb_number);
  pnv_xscom_region_init(&phb->nest_regs_mr, OBJECT(phb),
&pnv_pec_stk_nest_xscom_ops, phb, name,
PHB4_PEC_NEST_STK_REGS_COUNT);
  
  snprintf(name, sizeof(name), "xscom-pec-%d.%d-pci-phb-%d",

- pec->chip_id, pec->index, stack->stack_no);
+ pec->chip_id, pec->index, phb->phb_number);
  pnv_xscom_region_init(&phb->pci_regs_mr, OBJECT(phb),
&pnv_pec_stk_pci_xscom_ops, phb, name,
PHB4_PEC_PCI_STK_REGS_COUNT);
  
  /* PHB pass-through */

  snprintf(name, sizeof(name), "xscom-pec-%d.%d-pci-phb-%d",
- pec->chip_id, pec->index, stack->stack_no);
+ pec->chip_id, pec->index, phb->phb_number);
  pnv_xscom_region_init(&phb->phb_regs_mr, OBJECT(phb),
&pnv_phb4_xscom_ops, phb, name, 0x40);
  
@@ -1491,14 +1491,14 @@ static void pnv_phb4_xscom_realize(PnvPHB4 *phb)
  
  /* Populate the XSCOM address space. */

  pnv_xscom_add_subregion(pec->chip,
-pec_nest_base + 0x40 * (stack->stack_no + 1),
+pec_nest_base + 0x40 * (phb->phb_number + 1),
  &phb->nest_regs_mr);
  pnv_xscom_add_subregion(pec->chip,
-pec_pci_base + 0x40 * (s

Re: [PATCH 16/17] ppc/pnv: remove PnvPhb4PecStack object

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

All the complexity that was scattered between PnvPhb4PecStack and
PnvPHB4 is now centered in the PnvPHB4 device. PnvPhb4PecStack does not
serve any purpose in the current code base.

Signed-off-by: Daniel Henrique Barboza 



Reviewed-by: Cédric Le Goater 

Thanks,

C.


---
  hw/pci-host/pnv_phb4_pec.c | 34 --
  include/hw/pci-host/pnv_phb4.h | 20 
  2 files changed, 54 deletions(-)

diff --git a/hw/pci-host/pnv_phb4_pec.c b/hw/pci-host/pnv_phb4_pec.c
index 61d7add25a..02e7689372 100644
--- a/hw/pci-host/pnv_phb4_pec.c
+++ b/hw/pci-host/pnv_phb4_pec.c
@@ -282,43 +282,9 @@ static const TypeInfo pnv_pec_type_info = {
  }
  };
  
-static void pnv_pec_stk_realize(DeviceState *dev, Error **errp)

-{
-}
-
-static Property pnv_pec_stk_properties[] = {
-DEFINE_PROP_UINT32("stack-no", PnvPhb4PecStack, stack_no, 0),
-DEFINE_PROP_LINK("pec", PnvPhb4PecStack, pec, TYPE_PNV_PHB4_PEC,
- PnvPhb4PecState *),
-DEFINE_PROP_END_OF_LIST(),
-};
-
-static void pnv_pec_stk_class_init(ObjectClass *klass, void *data)
-{
-DeviceClass *dc = DEVICE_CLASS(klass);
-
-device_class_set_props(dc, pnv_pec_stk_properties);
-dc->realize = pnv_pec_stk_realize;
-dc->user_creatable = false;
-
-/* TODO: reset regs ? */
-}
-
-static const TypeInfo pnv_pec_stk_type_info = {
-.name  = TYPE_PNV_PHB4_PEC_STACK,
-.parent= TYPE_DEVICE,
-.instance_size = sizeof(PnvPhb4PecStack),
-.class_init= pnv_pec_stk_class_init,
-.interfaces= (InterfaceInfo[]) {
-{ TYPE_PNV_XSCOM_INTERFACE },
-{ }
-}
-};
-
  static void pnv_pec_register_types(void)
  {
  type_register_static(&pnv_pec_type_info);
-type_register_static(&pnv_pec_stk_type_info);
  }
  
  type_init(pnv_pec_register_types);

diff --git a/include/hw/pci-host/pnv_phb4.h b/include/hw/pci-host/pnv_phb4.h
index 170de2e752..96e8583e48 100644
--- a/include/hw/pci-host/pnv_phb4.h
+++ b/include/hw/pci-host/pnv_phb4.h
@@ -167,26 +167,6 @@ extern const MemoryRegionOps pnv_phb4_xscom_ops;
  #define TYPE_PNV_PHB4_PEC "pnv-phb4-pec"
  OBJECT_DECLARE_TYPE(PnvPhb4PecState, PnvPhb4PecClass, PNV_PHB4_PEC)
  
-#define TYPE_PNV_PHB4_PEC_STACK "pnv-phb4-pec-stack"

-OBJECT_DECLARE_SIMPLE_TYPE(PnvPhb4PecStack, PNV_PHB4_PEC_STACK)
-
-/* Per-stack data */
-struct PnvPhb4PecStack {
-DeviceState parent;
-
-/* My own stack number */
-uint32_t stack_no;
-
-/* The owner PEC */
-PnvPhb4PecState *pec;
-
-/*
- * PHB4 pointer. pnv_phb4_update_regions() needs to access
- * the PHB4 via a PnvPhb4PecStack pointer.
- */
-PnvPHB4 *phb;
-};
-
  struct PnvPhb4PecState {
  DeviceState parent;
  






Re: [PATCH 15/17] ppc/pnv: convert pec->stacks[] into pec->phbs[]

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

This patch changes the design of the PEC device to use PHB4s instead of
PecStacks. After all the recent changes, PHB4s now contain all the
information needed for their proper functioning, not relying on PecStack
in any capacity.

All changes are being made in a single patch to avoid renaming parts of
the PecState and leaving the code in an inconsistent state, e.g. renaming
PecClass->num_stacks to num_phbs, which would then read a
pnv_pec_num_stacks[] array. To avoid mixing the old and new design more
than necessary it's clearer to do these changes in a single step.

The name changes made are:

- in PnvPhb4PecState, rename PHB4_PEC_MAX_STACKS to PHB4_PEC_MAX_PHBS,
'num_stacks' to 'num_phbs' and convert "PnvPhb4PecStack
stacks[PHB4_PEC_MAX_STACKS]" to "PnvPHB4 *phbs[PHB4_PEC_MAX_PHBS]";

- in PnvPhb4PecClass, rename *num_stacks to *num_phbs;

- pnv_pec_num_stacks[] is renamed to pnv_pec_num_phbs[].

The logical changes:

- pnv_pec_default_phb_realize():
   * init the PnvPHB4 qdev and assign it to the corresponding
pec->phbs[phb_number];
   * do not use stack->phb anymore;

- pnv_pec_realize():
   * use the new default_phb_realize() to init/realize each PHB if
running with defaults;

- pnv_pec_instance_init(): removed since we're creating the PHBs during
pec_realize();

- pnv_phb4_get_stack():
   * renamed to pnv_phb4_get_pec() and returns a PnvPhb4PecState*;
   * assign the right pec->phbs[] pointer to the phb;
   * set 'phb_number' of the PHB given that the information is already
available;

- pnv_phb4_realize(): use 'phb->pec' instead of 'stack'.

This design change shouldn't cause any behavioral change in the runtime
of the machine.

Signed-off-by: Daniel Henrique Barboza 




Reviewed-by: Cédric Le Goater 

Thanks,

C.


---
  hw/pci-host/pnv_phb4.c | 31 +++
  hw/pci-host/pnv_phb4_pec.c | 71 ++
  include/hw/pci-host/pnv_phb4.h | 10 ++---
  3 files changed, 40 insertions(+), 72 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 25b4248776..a9ec42ce2c 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -1360,7 +1360,7 @@ int pnv_phb4_pec_get_phb_id(PnvPhb4PecState *pec, int stack_index)
  int offset = 0;
  
  while (index--) {

-offset += pecc->num_stacks[index];
+offset += pecc->num_phbs[index];
  }
  
  return offset + stack_index;

@@ -1510,8 +1510,8 @@ static void pnv_phb4_instance_init(Object *obj)
  object_initialize_child(obj, "source", &phb->xsrc, TYPE_XIVE_SOURCE);
  }
  
-static PnvPhb4PecStack *pnv_phb4_get_stack(PnvChip *chip, PnvPHB4 *phb,

-   Error **errp)
+static PnvPhb4PecState *pnv_phb4_get_pec(PnvChip *chip, PnvPHB4 *phb,
+ Error **errp)
  {
  Pnv9Chip *chip9 = PNV9_CHIP(chip);
  int chip_id = phb->chip_id;
@@ -1520,14 +1520,19 @@ static PnvPhb4PecStack *pnv_phb4_get_stack(PnvChip *chip, PnvPHB4 *phb,
  
  for (i = 0; i < chip->num_pecs; i++) {

  /*
- * For each PEC, check the amount of stacks it supports
- * and see if the given phb4 index matches a stack.
+ * For each PEC, check the amount of phbs it supports
+ * and see if the given phb4 index matches an index.
   */
  PnvPhb4PecState *pec = &chip9->pecs[i];
  
-for (j = 0; j < pec->num_stacks; j++) {

+for (j = 0; j < pec->num_phbs; j++) {
  if (index == pnv_phb4_pec_get_phb_id(pec, j)) {
-return &pec->stacks[j];
+pec->phbs[j] = phb;
+
+/* Set phb-number now since we already have it */
+object_property_set_int(OBJECT(phb), "phb-number",
+   j, &error_abort);
+return pec;
  }
  }
  }
@@ -1552,7 +1557,6 @@ static void pnv_phb4_realize(DeviceState *dev, Error **errp)
  if (!phb->pec) {
  PnvMachineState *pnv = PNV_MACHINE(qdev_get_machine());
  PnvChip *chip = pnv_get_chip(pnv, phb->chip_id);
-PnvPhb4PecStack *stack;
  PnvPhb4PecClass *pecc;
  BusState *s;
  
@@ -1561,23 +1565,16 @@ static void pnv_phb4_realize(DeviceState *dev, Error **errp)

  return;
  }
  
-stack = pnv_phb4_get_stack(chip, phb, &local_err);

+phb->pec = pnv_phb4_get_pec(chip, phb, &local_err);
  if (local_err) {
  error_propagate(errp, local_err);
  return;
  }
  
-/*

- * All other phb properties but 'pec', 'version' and
- * 'phb-number' are already set.
- */
-object_property_set_link(OBJECT(phb), "pec", OBJECT(stack->pec),
- &error_abort);
+/* All other phb properties are already set */
  pecc = PNV_PHB4_PEC_GET_CLASS(phb->pec);
  object_property_set_int

Re: [PATCH 14/17] ppc/pnv: move default_phb_realize() to pec_realize()

2022-01-14 Thread Cédric Le Goater

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

This is the last step before making the PEC device use PHB4s directly.
Move the current pnv_pec_stk_default_phb_realize() call to
pec_realize(), renaming the function to pnv_pec_default_phb_realize(),
and set the PHB attributes using the PEC object directly.

Signed-off-by: Daniel Henrique Barboza 



Reviewed-by: Cédric Le Goater 

Thanks,

C.


---
  hw/pci-host/pnv_phb4_pec.c | 67 --
  1 file changed, 35 insertions(+), 32 deletions(-)

diff --git a/hw/pci-host/pnv_phb4_pec.c b/hw/pci-host/pnv_phb4_pec.c
index 1de0eb9adc..3339e0ea3d 100644
--- a/hw/pci-host/pnv_phb4_pec.c
+++ b/hw/pci-host/pnv_phb4_pec.c
@@ -112,6 +112,32 @@ static const MemoryRegionOps pnv_pec_pci_xscom_ops = {
  .endianness = DEVICE_BIG_ENDIAN,
  };
  
+static void pnv_pec_default_phb_realize(PnvPhb4PecStack *stack,

+int phb_number,
+Error **errp)
+{
+PnvPhb4PecState *pec = stack->pec;
+PnvPhb4PecClass *pecc = PNV_PHB4_PEC_GET_CLASS(pec);
+int phb_id = pnv_phb4_pec_get_phb_id(pec, phb_number);
+
+stack->phb = PNV_PHB4(qdev_new(TYPE_PNV_PHB4));
+
+object_property_set_int(OBJECT(stack->phb), "phb-number", phb_number,
+&error_abort);
+object_property_set_link(OBJECT(stack->phb), "pec", OBJECT(pec),
+ &error_abort);
+object_property_set_int(OBJECT(stack->phb), "chip-id", pec->chip_id,
+&error_fatal);
+object_property_set_int(OBJECT(stack->phb), "index", phb_id,
+&error_fatal);
+object_property_set_int(OBJECT(stack->phb), "version", pecc->version,
+&error_fatal);
+
+if (!sysbus_realize(SYS_BUS_DEVICE(stack->phb), errp)) {
+return;
+}
+}
+
  static void pnv_pec_instance_init(Object *obj)
  {
  PnvPhb4PecState *pec = PNV_PHB4_PEC(obj);
@@ -144,6 +170,15 @@ static void pnv_pec_realize(DeviceState *dev, Error **errp)
  
  object_property_set_int(stk_obj, "stack-no", i, &error_abort);

  object_property_set_link(stk_obj, "pec", OBJECT(pec), &error_abort);
+
+if (defaults_enabled()) {
+pnv_pec_default_phb_realize(stack, i, errp);
+}
+
+/*
+ * qdev gets angry if we don't realize 'stack' here, even
+ * if stk_realize() is now empty.
+ */
  if (!qdev_realize(DEVICE(stk_obj), NULL, errp)) {
  return;
  }
@@ -276,40 +311,8 @@ static const TypeInfo pnv_pec_type_info = {
  }
  };
  
-static void pnv_pec_stk_default_phb_realize(PnvPhb4PecStack *stack,

-Error **errp)
-{
-PnvPhb4PecState *pec = stack->pec;
-PnvPhb4PecClass *pecc = PNV_PHB4_PEC_GET_CLASS(pec);
-int phb_id = pnv_phb4_pec_get_phb_id(pec, stack->stack_no);
-
-stack->phb = PNV_PHB4(qdev_new(TYPE_PNV_PHB4));
-
-object_property_set_int(OBJECT(stack->phb), "phb-number", stack->stack_no,
-&error_abort);
-object_property_set_link(OBJECT(stack->phb), "pec", OBJECT(pec),
- &error_abort);
-object_property_set_int(OBJECT(stack->phb), "chip-id", pec->chip_id,
-&error_fatal);
-object_property_set_int(OBJECT(stack->phb), "index", phb_id,
-&error_fatal);
-object_property_set_int(OBJECT(stack->phb), "version", pecc->version,
-&error_fatal);
-
-if (!sysbus_realize(SYS_BUS_DEVICE(stack->phb), errp)) {
-return;
-}
-}
-
  static void pnv_pec_stk_realize(DeviceState *dev, Error **errp)
  {
-PnvPhb4PecStack *stack = PNV_PHB4_PEC_STACK(dev);
-
-if (!defaults_enabled()) {
-return;
-}
-
-pnv_pec_stk_default_phb_realize(stack, errp);
  }
  
  static Property pnv_pec_stk_properties[] = {







Re: [PATCH v2 2/2] qapi/block: Restrict vhost-user-blk to CONFIG_VHOST_USER_BLK_SERVER

2022-01-14 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> When building QEMU with --disable-vhost-user and using introspection,
> query-qmp-schema lists vhost-user-blk even though it's not actually
> available:
>
>   { "execute": "query-qmp-schema" }
>   {
>   "return": [
>   ...
>   {
>   "name": "312",
>   "members": [
>   {
>   "name": "nbd"
>   },
>   {
>   "name": "vhost-user-blk"
>   }
>   ],
>   "meta-type": "enum",
>   "values": [
>   "nbd",
>   "vhost-user-blk"
>   ]
>   },
>
> Restrict vhost-user-blk in BlockExportType when
> CONFIG_VHOST_USER_BLK_SERVER is disabled, so it
> doesn't end up listed by query-qmp-schema.
>
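
As a rough illustration of what a conditional enum member means on the C
side, here is a minimal sketch of an enum whose members are guarded by
configuration macros. The ExportType enum, export_type_names and
export_type_parse are hypothetical; the code the QAPI generator actually
emits may be named and laid out differently.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enum; the real generated names and layout may differ. */
typedef enum ExportType {
    EXPORT_TYPE_NBD,
#if defined(CONFIG_VHOST_USER_BLK_SERVER)
    EXPORT_TYPE_VHOST_USER_BLK,
#endif
#if defined(CONFIG_FUSE)
    EXPORT_TYPE_FUSE,
#endif
    EXPORT_TYPE_MAX,
} ExportType;

/* Name table guarded by the same conditions as the enum members. */
static const char *const export_type_names[EXPORT_TYPE_MAX] = {
    [EXPORT_TYPE_NBD] = "nbd",
#if defined(CONFIG_VHOST_USER_BLK_SERVER)
    [EXPORT_TYPE_VHOST_USER_BLK] = "vhost-user-blk",
#endif
#if defined(CONFIG_FUSE)
    [EXPORT_TYPE_FUSE] = "fuse",
#endif
};

/* Returns the enum value for 'name', or -1 if it is not compiled in. */
static int export_type_parse(const char *name)
{
    for (int i = 0; i < EXPORT_TYPE_MAX; i++) {
        if (strcmp(export_type_names[i], name) == 0) {
            return i;
        }
    }
    return -1;
}
```

When the guarding macro is undefined the member does not exist at all, so
anything that enumerates the type cannot report it.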
> Fixes: 90fc91d50b7 ("convert vhost-user-blk server to block export API")
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> v2: Reword + restrict BlockExportOptions union (armbru)
> ---
>  qapi/block-export.json | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/qapi/block-export.json b/qapi/block-export.json
> index c1b92ce1c1c..f9ce79a974b 100644
> --- a/qapi/block-export.json
> +++ b/qapi/block-export.json
> @@ -277,7 +277,8 @@
>  # Since: 4.2
>  ##
>  { 'enum': 'BlockExportType',
> -  'data': [ 'nbd', 'vhost-user-blk',
> +  'data': [ 'nbd',
> +{ 'name': 'vhost-user-blk', 'if': 'CONFIG_VHOST_USER_BLK_SERVER' },

Please break this line like

   { 'name': 'vhost-user-blk',
 'if': 'CONFIG_VHOST_USER_BLK_SERVER' },

>  { 'name': 'fuse', 'if': 'CONFIG_FUSE' } ] }
>  
>  ##
> @@ -319,7 +320,8 @@
>'discriminator': 'type',
>'data': {
>'nbd': 'BlockExportOptionsNbd',
> -  'vhost-user-blk': 'BlockExportOptionsVhostUserBlk',
> +  'vhost-user-blk': { 'type': 'BlockExportOptionsVhostUserBlk',
> +  'if': 'CONFIG_VHOST_USER_BLK_SERVER' },
>'fuse': { 'type': 'BlockExportOptionsFuse',
>  'if': 'CONFIG_FUSE' }
> } }

Acked-by: Markus Armbruster 




Re: [PATCH V2 for-6.2 2/2] block/rbd: workaround for ceph issue #53784

2022-01-14 Thread Ilya Dryomov
On Thu, Jan 13, 2022 at 3:44 PM Peter Lieven  wrote:
>
> librbd had a bug until early 2022 that affected all versions of ceph that
> supported fast-diff. This bug results in reporting of incorrect offsets
> if the offset parameter to rbd_diff_iterate2 is not object aligned.
>
> This patch works around this bug for pre Quincy versions of librbd.
>
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Peter Lieven 
> ---
>  block/rbd.c | 42 --
>  1 file changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/block/rbd.c b/block/rbd.c
> index 20bb896c4a..d174d51659 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -1320,6 +1320,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
>  int status, r;
>  RBDDiffIterateReq req = { .offs = offset };
>  uint64_t features, flags;
> +uint64_t head = 0;
>
>  assert(offset + bytes <= s->image_size);
>
> @@ -1347,7 +1348,43 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
>  return status;
>  }
>
> -r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true,
> +#if LIBRBD_VERSION_CODE < LIBRBD_VERSION(1, 17, 0)
> +/*
> + * librbd had a bug until early 2022 that affected all versions of ceph that
> + * supported fast-diff. This bug results in reporting of incorrect offsets
> + * if the offset parameter to rbd_diff_iterate2 is not object aligned.
> + * Work around this bug by rounding down the offset to object boundaries.
> + * This is OK because we call rbd_diff_iterate2 with whole_object = true.
> + * However, this workaround only works for non cloned images with default
> + * striping.
> + *
> + * See: https://tracker.ceph.com/issues/53784
> + */
> +
> +/*  check if RBD image has non-default striping enabled */

Nit: extra space

Thanks,

Ilya



Re: [PATCH] Mark remaining global TypeInfo instances as const

2022-01-14 Thread Philippe Mathieu-Daudé via

On 13/1/22 18:10, Bernhard Beschow wrote:

More than 1k of TypeInfo instances are already marked as const. Mark the
remaining ones, too.

Signed-off-by: Bernhard Beschow 
---
  hw/core/generic-loader.c   | 2 +-
  hw/core/guest-loader.c | 2 +-
  hw/display/bcm2835_fb.c| 2 +-
  hw/display/i2c-ddc.c   | 2 +-
  hw/display/macfb.c | 4 ++--
  hw/display/virtio-vga.c| 2 +-
  hw/dma/bcm2835_dma.c   | 2 +-
  hw/i386/pc_piix.c  | 2 +-
  hw/i386/sgx-epc.c  | 2 +-
  hw/intc/bcm2835_ic.c   | 2 +-
  hw/intc/bcm2836_control.c  | 2 +-
  hw/ipmi/ipmi.c | 4 ++--
  hw/mem/nvdimm.c| 2 +-
  hw/mem/pc-dimm.c   | 2 +-
  hw/misc/bcm2835_mbox.c | 2 +-
  hw/misc/bcm2835_powermgt.c | 2 +-
  hw/misc/bcm2835_property.c | 2 +-
  hw/misc/bcm2835_rng.c  | 2 +-
  hw/misc/pvpanic-isa.c  | 2 +-
  hw/misc/pvpanic-pci.c  | 2 +-
  hw/net/fsl_etsec/etsec.c   | 2 +-
  hw/ppc/prep_systemio.c | 2 +-
  hw/ppc/spapr_iommu.c   | 2 +-
  hw/s390x/s390-pci-bus.c| 2 +-
  hw/s390x/sclp.c| 2 +-
  hw/s390x/tod-kvm.c | 2 +-
  hw/s390x/tod-tcg.c | 2 +-
  hw/s390x/tod.c | 2 +-
  hw/scsi/lsi53c895a.c   | 2 +-
  hw/sd/allwinner-sdhost.c   | 2 +-
  hw/sd/aspeed_sdhci.c   | 2 +-
  hw/sd/bcm2835_sdhost.c | 2 +-
  hw/sd/cadence_sdhci.c  | 2 +-
  hw/sd/npcm7xx_sdhci.c  | 2 +-
  hw/usb/dev-mtp.c   | 2 +-
  hw/usb/host-libusb.c   | 2 +-
  hw/vfio/igd.c  | 2 +-
  hw/virtio/virtio-pmem.c| 2 +-
  qom/object.c   | 4 ++--
  39 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/hw/core/generic-loader.c b/hw/core/generic-loader.c
index 9a24ffb880..eaafc416f4 100644
--- a/hw/core/generic-loader.c
+++ b/hw/core/generic-loader.c
@@ -207,7 +207,7 @@ static void generic_loader_class_init(ObjectClass *klass, void *data)
  set_bit(DEVICE_CATEGORY_MISC, dc->categories);
  }
  
-static TypeInfo generic_loader_info = {

+static const TypeInfo generic_loader_info = {


Good cleanup. If you used a sed expression to do the replacement
automatically, it would be nice to mention it (see e.g. commit f548f20176cb5f4406).

Reviewed-by: Philippe Mathieu-Daudé 

To avoid further non-const static TypeInfo instances being introduced,
we should add a check in scripts/checkpatch.pl. Maybe a simple
"static TypeInfo" line comparison is enough. Would you mind looking
into a patch?

Thanks,

Phil.



Re: [PATCH V2 for-6.2 0/2] fixes for bdrv_co_block_status

2022-01-14 Thread Ilya Dryomov
On Thu, Jan 13, 2022 at 3:44 PM Peter Lieven  wrote:
>
> V1->V2:
>  Patch 1: Treat a hole just like an unallocated area. [Ilya]
>  Patch 2: Apply workaround only for pre-Quincy librbd versions and
>   ensure default striping and non child images. [Ilya]
>
> Peter Lieven (2):
>   block/rbd: fix handling of holes in .bdrv_co_block_status
>   block/rbd: workaround for ceph issue #53784
>
>  block/rbd.c | 52 +---
>  1 file changed, 45 insertions(+), 7 deletions(-)
>
> --
> 2.25.1
>
>

These patches have both "for-6.2" in the subject and
Cc: qemu-sta...@nongnu.org in the description, which is a little
confusing.  Just want to clarify that they should go into master
and be backported to 6.2.

Reviewed-by: Ilya Dryomov 

Thanks,

Ilya



Re: [PATCH 2/2] docker: add msitools to Fedora/mingw cross

2022-01-14 Thread Philippe Mathieu-Daudé via

On 14/1/22 09:43, marcandre.lur...@redhat.com wrote:

From: Marc-André Lureau 

That should help catch build issues/regressions with wixl.

Signed-off-by: Marc-André Lureau 
---
  tests/docker/dockerfiles/fedora-win32-cross.docker | 1 +
  tests/docker/dockerfiles/fedora-win64-cross.docker | 1 +
  2 files changed, 2 insertions(+)

diff --git a/tests/docker/dockerfiles/fedora-win32-cross.docker 
b/tests/docker/dockerfiles/fedora-win32-cross.docker
index aad39dd97ff4..d80e66c6517d 100644
--- a/tests/docker/dockerfiles/fedora-win32-cross.docker
+++ b/tests/docker/dockerfiles/fedora-win32-cross.docker
@@ -29,6 +29,7 @@ ENV PACKAGES \
  mingw32-pixman \
  mingw32-pkg-config \
  mingw32-SDL2 \
+msitools \
  perl \
  perl-Test-Harness \
  python3 \
diff --git a/tests/docker/dockerfiles/fedora-win64-cross.docker 
b/tests/docker/dockerfiles/fedora-win64-cross.docker
index 9a224a619bd4..2b12b94ccfb4 100644
--- a/tests/docker/dockerfiles/fedora-win64-cross.docker
+++ b/tests/docker/dockerfiles/fedora-win64-cross.docker
@@ -26,6 +26,7 @@ ENV PACKAGES \
  mingw64-libusbx \
  mingw64-pixman \
  mingw64-pkg-config \
+msitools \
  perl \
  perl-Test-Harness \
  python3 \


This clashes with testing/next pull request:
https://lore.kernel.org/qemu-devel/20220112112722.3641051-10-alex.ben...@linaro.org/



Re: [PATCH 1/2] build-sys: fix undefined ARCH error

2022-01-14 Thread Philippe Mathieu-Daudé via

On 14/1/22 09:43, marcandre.lur...@redhat.com wrote:

From: Marc-André Lureau 

../qga/meson.build:76:4: ERROR: Key ARCH is not in the dictionary.

Fixes commit 823eb013 ("configure, meson: move ARCH to meson.build")

Signed-off-by: Marc-André Lureau 
---
  qga/meson.build | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH 23/30] bsd-user/signal.c: sigset manipulation routines.

2022-01-14 Thread Peter Maydell
On Sun, 9 Jan 2022 at 16:53, Warner Losh  wrote:
>
> target_sigemptyset: resets a set to having no bits set
> qemu_sigorset:  computes the or of two sets
> target_sigaddset:   adds a signal to a set
> target_sigismember: returns true when signal is a member
> host_to_target_sigset_internal: convert host sigset to target
> host_to_target_sigset: convert host sigset to target
> target_to_host_sigset_internal: convert target sigset to host
> target_to_host_sigset: convert target sigset to host
>
> Signed-off-by: Stacey Son 
> Signed-off-by: Kyle Evans 
> Signed-off-by: Warner Losh 
> ---
>  bsd-user/qemu.h   |  3 ++
>  bsd-user/signal.c | 89 +++
>  2 files changed, 92 insertions(+)
>
> diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
> index e12617f5d69..e8c417c7c33 100644
> --- a/bsd-user/qemu.h
> +++ b/bsd-user/qemu.h
> @@ -223,7 +223,10 @@ void queue_signal(CPUArchState *env, int sig, target_siginfo_t *info);
>  abi_long do_sigaltstack(abi_ulong uss_addr, abi_ulong uoss_addr, abi_ulong sp);
>  int target_to_host_signal(int sig);
>  int host_to_target_signal(int sig);
> +void host_to_target_sigset(target_sigset_t *d, const sigset_t *s);
> +void target_to_host_sigset(sigset_t *d, const target_sigset_t *s);
>  void QEMU_NORETURN force_sig(int target_sig);
> +int qemu_sigorset(sigset_t *dest, const sigset_t *left, const sigset_t *right);
>
>  /* mmap.c */
>  int target_mprotect(abi_ulong start, abi_ulong len, int prot);
> diff --git a/bsd-user/signal.c b/bsd-user/signal.c
> index 93c3b3c5033..8dadc9a39a7 100644
> --- a/bsd-user/signal.c
> +++ b/bsd-user/signal.c
> @@ -32,6 +32,9 @@
>
>  static struct target_sigaction sigact_table[TARGET_NSIG];
>  static void host_signal_handler(int host_sig, siginfo_t *info, void *puc);
> +static void target_to_host_sigset_internal(sigset_t *d,
> +const target_sigset_t *s);
> +
>
>  int host_to_target_signal(int sig)
>  {
> @@ -43,6 +46,44 @@ int target_to_host_signal(int sig)
>  return sig;
>  }
>
> +static inline void target_sigemptyset(target_sigset_t *set)
> +{
> +memset(set, 0, sizeof(*set));
> +}
> +
> +#include 

Don't include system headers halfway through the file like this,
please: put the #include at the top of the file with the others.

> +
> +int
> +qemu_sigorset(sigset_t *dest, const sigset_t *left, const sigset_t *right)
> +{
> +sigset_t work;
> +int i;
> +
> +sigemptyset(&work);
> +for (i = 1; i < NSIG; ++i) {
> +if (sigismember(left, i) || sigismember(right, i)) {
> +sigaddset(&work, i);
> +}
> +}
> +
> +*dest = work;
> +return 0;
> +}

FreeBSD's manpage says it has a native sigorset() --
https://www.freebsd.org/cgi/man.cgi?query=sigemptyset&sektion=3&apropos=0&manpath=freebsd
can you just use that ?

> +
> +static inline void target_sigaddset(target_sigset_t *set, int signum)
> +{
> +signum--;
> +uint32_t mask = (uint32_t)1 << (signum % TARGET_NSIG_BPW);
> +set->__bits[signum / TARGET_NSIG_BPW] |= mask;
> +}
> +
> +static inline int target_sigismember(const target_sigset_t *set, int signum)
> +{
> +signum--;
> +abi_ulong mask = (abi_ulong)1 << (signum % TARGET_NSIG_BPW);
> +return (set->__bits[signum / TARGET_NSIG_BPW] & mask) != 0;
> +}
> +
>  /* Adjust the signal context to rewind out of safe-syscall if we're in it */
>  static inline void rewind_if_in_safe_syscall(void *puc)
>  {
> @@ -55,6 +96,54 @@ static inline void rewind_if_in_safe_syscall(void *puc)
>  }
>  }
>
> +static void host_to_target_sigset_internal(target_sigset_t *d,
> +const sigset_t *s)
> +{
> +int i;
> +
> +target_sigemptyset(d);
> +for (i = 1; i <= TARGET_NSIG; i++) {

i here is iterating through host signal numbers, not target
numbers, so TARGET_NSIG isn't the right upper bound.
On Linux we iterate from 1 to _NSIG-1; on BSD I think
you may want (i = 0; i < NSIG; i++), but you should check that.

> +if (sigismember(s, i)) {
> +target_sigaddset(d, host_to_target_signal(i));
> +}
> +}
> +}

These functions are a little odd when you compare them to their
linux-user equivalents, because they're both written
with a sort of abstraction between host and target signal
numbers (they call host_to_target_signal() and
target_to_host_signal()) but also written with baked-in
assumptions that the mapping is basically 1:1 (they don't
have the code that handles the possibility that the
target signal isn't representable as a host signal or
vice-versa). But assuming the BSDs don't change their
signal numbering across architectures, this is fine.

thanks
-- PMM



Re: [PATCH 4/6] migration: Add ram-only capability

2022-01-14 Thread Markus Armbruster
Nikita Lapshin  writes:

> If this capability is enabled migration stream
> will have RAM section only.
>
> Signed-off-by: Nikita Lapshin 

[...]

> diff --git a/qapi/migration.json b/qapi/migration.json
> index d53956852c..626fc59d14 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -454,6 +454,8 @@
>  #
>  # @no-ram: If enabled, migration stream won't contain any ram in it. (since 7.0)
>  #
> +# @ram-only: If enabled, only RAM sections will be sent. (since 7.0)
> +#

What happens when I ask for 'no-ram': true, 'ram-only': true?

>  # Features:
>  # @unstable: Members @x-colo and @x-ignore-shared are experimental.
>  #
> @@ -467,7 +469,7 @@
> 'block', 'return-path', 'pause-before-switchover', 'multifd',
> 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
> { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
> -   'validate-uuid', 'background-snapshot', 'no-ram'] }
> +   'validate-uuid', 'background-snapshot', 'no-ram', 'ram-only'] }
>  ##
>  # @MigrationCapabilityStatus:
>  #
> @@ -521,7 +523,8 @@
>  #   {"state": true, "capability": "events"},
>  #   {"state": false, "capability": "postcopy-ram"},
>  #   {"state": false, "capability": "x-colo"},
> -#   {"state": false, "capability": "no-ram"}
> +#   {"state": false, "capability": "no-ram"},
> +#   {"state": false, "capability": "ram-only"}
>  #]}
>  #
>  ##




Re: [PATCH v5 02/14] hw/core/machine: Introduce CPU cluster topology support

2022-01-14 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> Hi,
>
> On 12/28/21 10:22, Yanan Wang wrote:

[...]

>> diff --git a/qapi/machine.json b/qapi/machine.json
>> index edeab6084b..ff0ab4ca20 100644
>> --- a/qapi/machine.json
>> +++ b/qapi/machine.json
>> @@ -1404,7 +1404,9 @@
>>  #
>>  # @dies: number of dies per socket in the CPU topology
>>  #
>> -# @cores: number of cores per die in the CPU topology
>> +# @clusters: number of clusters per die in the CPU topology
>
> Missing:
>
>#(since 7.0)
>
>> +#
>> +# @cores: number of cores per cluster in the CPU topology
>>  #
>>  # @threads: number of threads per core in the CPU topology
>>  #
>> @@ -1416,6 +1418,7 @@
>>   '*cpus': 'int',
>>   '*sockets': 'int',
>>   '*dies': 'int',
>> + '*clusters': 'int',
>>   '*cores': 'int',
>>   '*threads': 'int',
>>   '*maxcpus': 'int' } }
>
> If you want I can update the doc when applying.

With the update, QAPI schema
Acked-by: Markus Armbruster 




Re: [PATCH 11/17] ppc/pnv: introduce PnvPHB4 'phb_number' property

2022-01-14 Thread Daniel Henrique Barboza




On 1/14/22 07:46, Cédric Le Goater wrote:

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

One of the remaining dependencies we have on the PnvPhb4PecStack object
is the stack->stack_no property. This is set as the position the stack
occupies in the pec->stacks[] array.

We need a way to report this same value in the PnvPHB4. This patch
creates a new property called 'phb_number' to be used in existing code
in all instances stack->stack_no is currently being used.

The 'phb_number' name is an indication of our future intention to convert
the pec->stacks[] array into a pec->phbs[] array, when the PEC object will
deal directly with phb4 objects.



So the PHB would have a 'phb_number' and an 'index' property? That's
confusing. Can we simplify? Compute one from the other?

or keep 'stack_no' to make it clear this belongs to the stack subunit
logic.



I guess for now we can keep it as phb->stack_no. We can think about reworking
the logic (my initial reaction is to keep 'index' and then derive the
'stack_no' from it when needed) in a follow up.



Thanks,


Daniel



Thanks,

C.



Signed-off-by: Daniel Henrique Barboza 
---
  hw/pci-host/pnv_phb4.c | 28 +---
  hw/pci-host/pnv_phb4_pec.c |  2 ++
  include/hw/pci-host/pnv_phb4.h |  3 +++
  3 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index b5045fca64..44f3087913 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -937,7 +937,7 @@ static void pnv_pec_stk_update_map(PnvPHB4 *phb)
  mask = phb->nest_regs[PEC_NEST_STK_MMIO_BAR0_MASK];
  size = ((~mask) >> 8) + 1;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-mmio0",
- pec->chip_id, pec->index, stack->stack_no);
+ pec->chip_id, pec->index, phb->phb_number);
  memory_region_init(&phb->mmbar0, OBJECT(phb), name, size);
  memory_region_add_subregion(sysmem, bar, &phb->mmbar0);
  phb->mmio0_base = bar;
@@ -949,7 +949,7 @@ static void pnv_pec_stk_update_map(PnvPHB4 *phb)
  mask = phb->nest_regs[PEC_NEST_STK_MMIO_BAR1_MASK];
  size = ((~mask) >> 8) + 1;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-mmio1",
- pec->chip_id, pec->index, stack->stack_no);
+ pec->chip_id, pec->index, phb->phb_number);
  memory_region_init(&phb->mmbar1, OBJECT(phb), name, size);
  memory_region_add_subregion(sysmem, bar, &phb->mmbar1);
  phb->mmio1_base = bar;
@@ -960,7 +960,7 @@ static void pnv_pec_stk_update_map(PnvPHB4 *phb)
  bar = phb->nest_regs[PEC_NEST_STK_PHB_REGS_BAR] >> 8;
  size = PNV_PHB4_NUM_REGS << 3;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d",
- pec->chip_id, pec->index, stack->stack_no);
+ pec->chip_id, pec->index, phb->phb_number);
  memory_region_init(&phb->phbbar, OBJECT(phb), name, size);
  memory_region_add_subregion(sysmem, bar, &phb->phbbar);
  }
@@ -969,7 +969,7 @@ static void pnv_pec_stk_update_map(PnvPHB4 *phb)
  bar = phb->nest_regs[PEC_NEST_STK_INT_BAR] >> 8;
  size = PNV_PHB4_MAX_INTs << 16;
  snprintf(name, sizeof(name), "pec-%d.%d-phb-%d-int",
- stack->pec->chip_id, stack->pec->index, stack->stack_no);
+ stack->pec->chip_id, stack->pec->index, phb->phb_number);
  memory_region_init(&phb->intbar, OBJECT(phb), name, size);
  memory_region_add_subregion(sysmem, bar, &phb->intbar);
  }
@@ -1469,20 +1469,20 @@ static void pnv_phb4_xscom_realize(PnvPHB4 *phb)
  /* Initialize the XSCOM regions for the stack registers */
  snprintf(name, sizeof(name), "xscom-pec-%d.%d-nest-phb-%d",
- pec->chip_id, pec->index, stack->stack_no);
+ pec->chip_id, pec->index, phb->phb_number);
  pnv_xscom_region_init(&phb->nest_regs_mr, OBJECT(phb),
    &pnv_pec_stk_nest_xscom_ops, phb, name,
    PHB4_PEC_NEST_STK_REGS_COUNT);
  snprintf(name, sizeof(name), "xscom-pec-%d.%d-pci-phb-%d",
- pec->chip_id, pec->index, stack->stack_no);
+ pec->chip_id, pec->index, phb->phb_number);
  pnv_xscom_region_init(&phb->pci_regs_mr, OBJECT(phb),
    &pnv_pec_stk_pci_xscom_ops, phb, name,
    PHB4_PEC_PCI_STK_REGS_COUNT);
  /* PHB pass-through */
  snprintf(name, sizeof(name), "xscom-pec-%d.%d-pci-phb-%d",
- pec->chip_id, pec->index, stack->stack_no);
+ pec->chip_id, pec->index, phb->phb_number);
  pnv_xscom_region_init(&phb->phb_regs_mr, OBJECT(phb),
    &pnv_phb4_xscom_ops, phb, name, 0x40);
@@ -1491,14 +1491,14 @@ static void pnv_phb4_xscom_realize(PnvPHB4 *phb)
  /* Populate the XSCOM address space. */
  pnv_xscom_add_subregion(pec->chip,
-    pec_nest_base + 

Re: [PATCH 4/6] migration: Add ram-only capability

2022-01-14 Thread Daniel P . Berrangé
On Fri, Jan 14, 2022 at 12:22:13PM +0100, Markus Armbruster wrote:
> Nikita Lapshin  writes:
> 
> > If this capability is enabled migration stream
> > will have RAM section only.
> >
> > Signed-off-by: Nikita Lapshin 
> 
> [...]
> 
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index d53956852c..626fc59d14 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -454,6 +454,8 @@
> >  #
> >  # @no-ram: If enabled, migration stream won't contain any ram in it. (since 7.0)
> >  #
> > +# @ram-only: If enabled, only RAM sections will be sent. (since 7.0)
> > +#
> 
> What happens when I ask for 'no-ram': true, 'ram-only': true?

So IIUC

  no-ram=false, ram-only=false =>  RAM + vmstate
  no-ram=true, ram-only=false => vmstate
  no-ram=false, ram-only=true =>  RAM
  no-ram=true, ram-only=true => nothing to send ?

I find it confusing that one flag is a negative request and
the other flag is a positive request.

If we must have two flags then could we at least use the same
style for both. ie either

  @no-ram
  @no-vmstate

Or

  @ram-only
  @vmstate-only

Since the code enforces that these flags are mutually exclusive,
though, it might point towards this being handled by an enum

  { 'enum': 'MigrationStreamContent',
'data': ['both', 'ram', 'vmstate'] }

None of these approaches is especially future-proof if we ever
need fine-grained control over sending a subset of the non-RAM
vmstate. Not sure if that matters in the end.
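As a rough sketch of how the table above collapses into a single value (all names here are hypothetical, nothing below exists in the series):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical C mirror of the enum idea; not taken from the series. */
typedef enum {
    STREAM_CONTENT_BOTH,     /* RAM + vmstate */
    STREAM_CONTENT_RAM,      /* RAM only */
    STREAM_CONTENT_VMSTATE,  /* vmstate only */
    STREAM_CONTENT_INVALID,  /* both flags set: nothing left to send */
} StreamContent;

/* Map the two boolean capabilities onto one value. */
static StreamContent stream_content_from_caps(bool no_ram, bool ram_only)
{
    if (no_ram && ram_only) {
        return STREAM_CONTENT_INVALID;
    }
    if (no_ram) {
        return STREAM_CONTENT_VMSTATE;
    }
    if (ram_only) {
        return STREAM_CONTENT_RAM;
    }
    return STREAM_CONTENT_BOTH;
}

/* True if a stream with this content carries RAM sections. */
static bool stream_content_has_ram(StreamContent c)
{
    assert(c != STREAM_CONTENT_INVALID);
    return c == STREAM_CONTENT_BOTH || c == STREAM_CONTENT_RAM;
}
```

With an enum in the schema the invalid combination cannot be expressed at all, so the INVALID case only exists here to cover the two-flag input.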


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 2/3] migration: Add canary to VMSTATE_END_OF_LIST

2022-01-14 Thread Philippe Mathieu-Daudé via

On 13/1/22 20:44, Dr. David Alan Gilbert (git) wrote:

From: "Dr. David Alan Gilbert" 

We fairly regularly forget VMSTATE_END_OF_LIST markers off descriptions;
given that the current check is only for ->name being NULL, sometimes
we get unlucky and the code apparently works and no one spots the error.

Explicitly add a flag, VMS_END that should be set, and assert it is
set during the traversal.
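A minimal standalone sketch of the idea, with hypothetical names (SkField, SK_VMS_END and the macro are illustrative only, not the vmstate.h definitions):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag value and field type, for illustration only. */
#define SK_VMS_END 0x1000u

typedef struct {
    const char *name;
    unsigned flags;
} SkField;

/* The terminator carries an explicit flag, not just a NULL name. */
#define SK_END_OF_LIST() { .name = NULL, .flags = SK_VMS_END }

/* Count the fields in a list, checking the canary on the terminator. */
static size_t sk_count_fields(const SkField *f)
{
    size_t n = 0;

    while (f->name != NULL) {
        n++;
        f++;
    }
    /*
     * A NULL name alone might just be zeroed memory that happens to
     * follow the array; the flag shows it is the real end marker.
     */
    assert(f->flags == SK_VMS_END);
    return n;
}
```

A list that lost its terminator but happens to be followed by zero bytes would stop the loop and then trip the assertion instead of silently passing.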

Note: This can't go in until we update the copy of vmstate.h in slirp.


Do we need a libslirp buildsys version check to get this patch merged?

Reviewed-by: Philippe Mathieu-Daudé 


Suggested-by: Peter Maydell 
Signed-off-by: Dr. David Alan Gilbert 
---
  include/migration/vmstate.h | 7 ++-
  migration/savevm.c  | 1 +
  migration/vmstate.c | 2 ++
  3 files changed, 9 insertions(+), 1 deletion(-)




Re: [PATCH 24/30] bsd-user/signal.c: setup_frame

2022-01-14 Thread Peter Maydell
On Sun, 9 Jan 2022 at 16:36, Warner Losh  wrote:
>
> setup_frame sets up a signalled stack frame. Associated routines to
> extract the pointer to the stack frame and to support alternate stacks.
>
> Signed-off-by: Stacey Son 
> Signed-off-by: Kyle Evans 
> Signed-off-by: Warner Losh 
> ---
>  bsd-user/signal.c | 166 --
>  1 file changed, 144 insertions(+), 22 deletions(-)
>
> diff --git a/bsd-user/signal.c b/bsd-user/signal.c
> index 8dadc9a39a7..8e1427553da 100644
> --- a/bsd-user/signal.c
> +++ b/bsd-user/signal.c
> @@ -30,11 +30,27 @@
>   * fork.
>   */
>
> +static target_stack_t target_sigaltstack_used = {
> +.ss_sp = 0,
> +.ss_size = 0,
> +.ss_flags = TARGET_SS_DISABLE,
> +};

sigaltstacks are per-thread, so this needs to be in the TaskState,
not global. (We fixed this in linux-user in commit 5bfce0b74fbd5d5
in 2019: the change is relatively small.)
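Roughly, the shape of that change is sketched below. This is a standalone illustration, not the linux-user code: the Sk*/sk_* names are hypothetical and the flag values are picked for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values; the real ones come from the target headers. */
#define SK_SS_ONSTACK 0x1
#define SK_SS_DISABLE 0x4

typedef struct {
    uint64_t ss_sp;
    uint64_t ss_size;
    int ss_flags;
} sk_stack_t;

/* Hypothetical per-thread state: each thread owns its alternate stack. */
typedef struct {
    sk_stack_t sigaltstack_used;
} SkTaskState;

static void sk_task_init(SkTaskState *ts)
{
    assert(ts != NULL);
    ts->sigaltstack_used.ss_sp = 0;
    ts->sigaltstack_used.ss_size = 0;
    ts->sigaltstack_used.ss_flags = SK_SS_DISABLE;
}

static int sk_on_sig_stack(const SkTaskState *ts, uint64_t sp)
{
    /*
     * Unsigned subtraction wraps when sp < ss_sp, so a single compare
     * rejects addresses both below and above the stack.
     */
    return sp - ts->sigaltstack_used.ss_sp < ts->sigaltstack_used.ss_size;
}

static int sk_sas_ss_flags(const SkTaskState *ts, uint64_t sp)
{
    if (ts->sigaltstack_used.ss_size == 0) {
        return SK_SS_DISABLE;
    }
    return sk_on_sig_stack(ts, sp) ? SK_SS_ONSTACK : 0;
}
```

The helpers take the task state as a parameter, so one thread installing an alternate stack no longer affects the others.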

> +
>  static struct target_sigaction sigact_table[TARGET_NSIG];
>  static void host_signal_handler(int host_sig, siginfo_t *info, void *puc);
>  static void target_to_host_sigset_internal(sigset_t *d,
>  const target_sigset_t *s);
>
> +static inline int on_sig_stack(unsigned long sp)
> +{
> +return sp - target_sigaltstack_used.ss_sp < target_sigaltstack_used.ss_size;
> +}
> +
> +static inline int sas_ss_flags(unsigned long sp)
> +{
> +return target_sigaltstack_used.ss_size == 0 ? SS_DISABLE : on_sig_stack(sp)
> +? SS_ONSTACK : 0;
> +}
>
>  int host_to_target_signal(int sig)
>  {
> @@ -336,28 +352,6 @@ void queue_signal(CPUArchState *env, int sig, target_siginfo_t *info)
>  return;
>  }
>
> -static int fatal_signal(int sig)
> -{
> -
> -switch (sig) {
> -case TARGET_SIGCHLD:
> -case TARGET_SIGURG:
> -case TARGET_SIGWINCH:
> -case TARGET_SIGINFO:
> -/* Ignored by default. */
> -return 0;
> -case TARGET_SIGCONT:
> -case TARGET_SIGSTOP:
> -case TARGET_SIGTSTP:
> -case TARGET_SIGTTIN:
> -case TARGET_SIGTTOU:
> -/* Job control signals.  */
> -return 0;
> -default:
> -return 1;
> -}
> -}

There wasn't any need to move this function, I think ?

> -
>  /*
>   * Force a synchronously taken QEMU_SI_FAULT signal. For QEMU the
>   * 'force' part is handled in process_pending_signals().
> @@ -484,6 +478,134 @@ static void host_signal_handler(int host_sig, siginfo_t *info, void *puc)
>  cpu_exit(thread_cpu);
>  }
>
> +static int fatal_signal(int sig)
> +{
> +
> +switch (sig) {
> +case TARGET_SIGCHLD:
> +case TARGET_SIGURG:
> +case TARGET_SIGWINCH:
> +case TARGET_SIGINFO:
> +/* Ignored by default. */
> +return 0;
> +case TARGET_SIGCONT:
> +case TARGET_SIGSTOP:
> +case TARGET_SIGTSTP:
> +case TARGET_SIGTTIN:
> +case TARGET_SIGTTOU:
> +/* Job control signals.  */
> +return 0;
> +default:
> +return 1;
> +}
> +}
> +
> +static inline abi_ulong get_sigframe(struct target_sigaction *ka,
> +CPUArchState *regs, size_t frame_size)
> +{
> +abi_ulong sp;
> +
> +/* Use default user stack */
> +sp = get_sp_from_cpustate(regs);
> +
> +if ((ka->sa_flags & TARGET_SA_ONSTACK) && (sas_ss_flags(sp) == 0)) {
> +sp = target_sigaltstack_used.ss_sp +
> +target_sigaltstack_used.ss_size;
> +}
> +
> +#if defined(TARGET_MIPS) || defined(TARGET_ARM)
> +return (sp - frame_size) & ~7;
> +#elif defined(TARGET_AARCH64)
> +return (sp - frame_size) & ~15;
> +#else
> +return sp - frame_size;
> +#endif

We don't need to do it in this patchseries, but you should strongly
consider pulling the architecture-specifics out in a way that
avoids this kind of ifdef ladder.
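One possible shape, purely as a sketch: each target's header supplies a single alignment constant and the common code stays ifdef-free. SK_TARGET_SIGFRAME_ALIGN and sk_align_sigframe() are hypothetical names, not existing QEMU definitions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical per-architecture constant: each target header would
 * define its own value (e.g. 8 or 16), with a fallback for the rest.
 */
#ifndef SK_TARGET_SIGFRAME_ALIGN
#define SK_TARGET_SIGFRAME_ALIGN 1
#endif

/* Place a frame of frame_size below sp, rounded down to 'align'. */
static uint64_t sk_align_sigframe(uint64_t sp, uint64_t frame_size,
                                  uint64_t align)
{
    /* The mask trick below only works for powers of two. */
    assert(align != 0 && (align & (align - 1)) == 0);
    return (sp - frame_size) & ~(align - 1);
}

/* Common code: no knowledge of individual architectures needed. */
static uint64_t sk_get_sigframe(uint64_t sp, uint64_t frame_size)
{
    return sk_align_sigframe(sp, frame_size, SK_TARGET_SIGFRAME_ALIGN);
}
```

An alignment of 1 leaves the address unchanged, which matches the plain "sp - frame_size" fallback in the quoted code.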

> +}
> +
> +/* compare to mips/mips/pm_machdep.c and sparc64/sparc64/machdep.c sendsig() */
> +static void setup_frame(int sig, int code, struct target_sigaction *ka,
> +target_sigset_t *set, target_siginfo_t *tinfo, CPUArchState *regs)
> +{
> +struct target_sigframe *frame;
> +abi_ulong frame_addr;
> +int i;
> +
> +frame_addr = get_sigframe(ka, regs, sizeof(*frame));
> +trace_user_setup_frame(regs, frame_addr);
> +if (!lock_user_struct(VERIFY_WRITE, frame, frame_addr, 0)) {
> +goto give_sigsegv;

FreeBSD for Arm (haven't checked other BSDs or other archs)
gives a SIGILL for the "can't write signal frame to stack"
case, I think:
https://github.com/freebsd/freebsd-src/blob/main/sys/arm/arm/exec_machdep.c#L316
I don't understand why they picked SIGILL, SIGSEGV seems much more
logical to me, but we should follow the kernel behaviour.

> +}
> +
> +memset(frame, 0, sizeof(*frame));
> +#if defined(TARGET_MIPS)
> +int mflags = on_sig_stack(frame_addr) ? TARGET_MC_ADD_MAGIC :
> +TARGET_MC_SET_ONSTACK | TARGET_MC_ADD_MAGIC;
> +#else
> +int mflags = 0;
> +#endif
> +if (get_mcontext(regs, &frame->sf_uc.uc_mcontext, mflags)) {
> +goto give_sigsegv;

The F

Re: [PATCH 11/17] ppc/pnv: introduce PnvPHB4 'phb_number' property

2022-01-14 Thread Cédric Le Goater

On 1/14/22 12:29, Daniel Henrique Barboza wrote:



On 1/14/22 07:46, Cédric Le Goater wrote:

On 1/13/22 20:29, Daniel Henrique Barboza wrote:

One of the remaining dependencies we have on the PnvPhb4PecStack object
is the stack->stack_no property. This is set as the position the stack
occupies in the pec->stacks[] array.

We need a way to report this same value in the PnvPHB4. This patch
creates a new property called 'phb_number' to be used in existing code
in all instances stack->stack_no is currently being used.

The 'phb_number' name is an indication of our future intention to convert
the pec->stacks[] array into a pec->phbs[] array, when the PEC object will
deal directly with phb4 objects.



So the PHB would have a 'phb_number' and a 'index' property ? That's
confusing. Can we simplify ? compute one from another ?

or keep 'stack_no' to make it clear this belongs to the stack subunit
logic.



I guess for now we can keep it as phb->stack_no. We can think about reworking
the logic (my initial reaction is to keep 'index' and then derive the
'stack_no' from it when needed) in a follow up.


OK. Patches 1-10 are fine; no need to resend.

Thanks,

C.



Re: [PATCH 1/4] tests: acpi: manually pad OEM_ID/OEM_TABLE_ID for test_oem_fields() test

2022-01-14 Thread Igor Mammedov
On Wed, 12 Jan 2022 08:44:19 -0500
"Michael S. Tsirkin"  wrote:

> On Wed, Jan 12, 2022 at 08:03:29AM -0500, Igor Mammedov wrote:
> > The next commit will revert OEM fields padding with whitespace to
> > padding with '\0' as it was before [1]. As a result, test_oem_fields() will
> > fail due to unexpectedly smaller ID sizes read from QEMU ACPI tables.
> > 
> > Pad OEM_ID/OEM_TABLE_ID manually with spaces so that values the test
> > puts on QEMU CLI and expected values match.
> > 
> > 1) 602b458201 ("acpi: Permit OEM ID and OEM table ID fields to be changed")
> > Signed-off-by: Igor Mammedov   
> 
> That's kind of ugly in that we do not test
> shorter names then.  How about we pad with \0 instead?


test_acpi_q35_slic() should cover short OEM_TABLE_ID.

Also, the padding in this patch makes test_oem_fields() cleaner
and simplifies 3/4; switching to \0 here would require
merging this patch with the fix itself to avoid breaking
bisection.

If you still prefer to have test_oem_fields() test short
names, I can post the following on top, which can do it without
breaking bisection:

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index 90c9f6a0a2..0fd7cf1f89 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -71,8 +71,8 @@
 
 #define ACPI_REBUILD_EXPECTED_AML "TEST_ACPI_REBUILD_AML"
 
-#define OEM_ID "TEST  "
-#define OEM_TABLE_ID   "OEM "
+#define OEM_ID "TEST"
+#define OEM_TABLE_ID   "OEM"
 #define OEM_TEST_ARGS  "-machine x-oem-id='" OEM_ID "',x-oem-table-id='" \
OEM_TABLE_ID "'"
 
@@ -1530,8 +1530,8 @@ static void test_oem_fields(test_data *data)
 continue;
 }
 
-g_assert(memcmp(sdt->aml + 10, OEM_ID, 6) == 0);
-g_assert(memcmp(sdt->aml + 16, OEM_TABLE_ID, 8) == 0);
+g_assert(strncmp((char *)sdt->aml + 10, OEM_ID, 6) == 0);
+g_assert(strncmp((char *)sdt->aml + 16, OEM_TABLE_ID, 8) == 0);
 }
 }
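To spell out why strncmp() fits here, a small standalone sketch. The sk_* helpers are hypothetical and assume the table field is NUL-padded, which is what the next commit in the series is meant to restore.

```c
#include <assert.h>
#include <string.h>

#define SK_OEM_ID_LEN 6

/* Fill a fixed-width field from a short string, padding with NULs. */
static void sk_fill_oem_id(char field[SK_OEM_ID_LEN], const char *id)
{
    size_t len = strlen(id);

    assert(len <= SK_OEM_ID_LEN);
    memset(field, 0, SK_OEM_ID_LEN);
    memcpy(field, id, len);
}

/*
 * strncmp() stops at the first NUL or after SK_OEM_ID_LEN bytes,
 * whichever comes first, so it never reads past a short literal and
 * also works for a full-width field that has no terminator.
 */
static int sk_oem_id_matches(const char field[SK_OEM_ID_LEN],
                             const char *expected)
{
    return strncmp(field, expected, SK_OEM_ID_LEN) == 0;
}
```

By contrast, memcmp() with a length of 6 against the 5-byte literal "TEST" would read one byte past the end of the literal, which is why the unpadded macros need the strncmp() form.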
 


> And add a comment explaining why it's done.
> 
> > ---
> >  tests/qtest/bios-tables-test.c | 15 ++-
> >  1 file changed, 6 insertions(+), 9 deletions(-)
> > 
> > diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
> > index e6b72d9026..90c9f6a0a2 100644
> > --- a/tests/qtest/bios-tables-test.c
> > +++ b/tests/qtest/bios-tables-test.c
> > @@ -71,9 +71,10 @@
> >  
> >  #define ACPI_REBUILD_EXPECTED_AML "TEST_ACPI_REBUILD_AML"
> >  
> > -#define OEM_ID "TEST"
> > -#define OEM_TABLE_ID   "OEM"
> > -#define OEM_TEST_ARGS  "-machine x-oem-id="OEM_ID",x-oem-table-id="OEM_TABLE_ID
> > +#define OEM_ID "TEST  "
> > +#define OEM_TABLE_ID   "OEM "
> > +#define OEM_TEST_ARGS  "-machine x-oem-id='" OEM_ID "',x-oem-table-id='" \
> > +   OEM_TABLE_ID "'"
> >  
> >  typedef struct {
> >  bool tcg_only;
> > @@ -1519,11 +1520,7 @@ static void test_acpi_q35_slic(void)
> >  static void test_oem_fields(test_data *data)
> >  {
> >  int i;
> > -char oem_id[6];
> > -char oem_table_id[8];
> >  
> > -strpadcpy(oem_id, sizeof oem_id, OEM_ID, ' ');
> > -strpadcpy(oem_table_id, sizeof oem_table_id, OEM_TABLE_ID, ' ');
> >  for (i = 0; i < data->tables->len; ++i) {
> >  AcpiSdtTable *sdt;
> >  
> > @@ -1533,8 +1530,8 @@ static void test_oem_fields(test_data *data)
> >  continue;
> >  }
> >  
> > -g_assert(memcmp(sdt->aml + 10, oem_id, 6) == 0);
> > -g_assert(memcmp(sdt->aml + 16, oem_table_id, 8) == 0);
> > +g_assert(memcmp(sdt->aml + 10, OEM_ID, 6) == 0);
> > +g_assert(memcmp(sdt->aml + 16, OEM_TABLE_ID, 8) == 0);
> >  }
> >  }
> >  
> > -- 
> > 2.31.1  
> 




Re: [PATCH 25/30] bsd-user/signal.c: handle_pending_signal

2022-01-14 Thread Peter Maydell
On Sun, 9 Jan 2022 at 16:47, Warner Losh  wrote:
>
> Handle a queued signal.
>
> Signed-off-by: Stacey Son 
> Signed-off-by: Kyle Evans 
> Signed-off-by: Warner Losh 


> +static void handle_pending_signal(CPUArchState *cpu_env, int sig,
> +  struct emulated_sigtable *k)
> +{
> +CPUState *cpu = env_cpu(cpu_env);
> +TaskState *ts = cpu->opaque;
> +struct qemu_sigqueue *q;
> +struct target_sigaction *sa;
> +int code;
> +sigset_t set;
> +abi_ulong handler;
> +target_siginfo_t tinfo;
> +target_sigset_t target_old_set;
> +
> +trace_user_handle_signal(cpu_env, sig);
> +
> +/* Dequeue signal. */
> +q = k->first;
> +k->first = q->next;
> +if (!k->first) {
> +k->pending = 0;
> +}

(The dequeue simplifies if you follow linux-user's way of handling
"queueing" a signal.)

> +
> +sig = gdb_handlesig(cpu, sig);
> +if (!sig) {
> +sa = NULL;
> +handler = TARGET_SIG_IGN;
> +} else {
> +sa = &sigact_table[sig - 1];
> +handler = sa->_sa_handler;
> +}
> +
> +if (do_strace) {
> +print_taken_signal(sig, &q->info);
> +}
> +
> +if (handler == TARGET_SIG_DFL) {
> +/*
> + * default handler : ignore some signal. The other are job
> + * control or fatal.
> + */
> +if (TARGET_SIGTSTP == sig || TARGET_SIGTTIN == sig ||
> +TARGET_SIGTTOU == sig) {
> +kill(getpid(), SIGSTOP);
> +} else if (TARGET_SIGCHLD != sig && TARGET_SIGURG != sig &&
> +TARGET_SIGINFO != sig &&
> +TARGET_SIGWINCH != sig && TARGET_SIGCONT != sig) {
> +force_sig(sig);
> +}
> +} else if (TARGET_SIG_IGN == handler) {

Avoid the yoda-conditionals, please.
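
For illustration, the same dispatch reads more naturally with the variable on the left. This is only a sketch: the SKETCH_* constants and sketch_classify() are hypothetical stand-ins for the TARGET_SIG_* values and the if-chain in the patch, not code from the series.

```c
#include <assert.h>

/* Hypothetical stand-ins for TARGET_SIG_DFL/IGN/ERR; values are illustrative. */
#define SKETCH_SIG_DFL 0L
#define SKETCH_SIG_IGN 1L
#define SKETCH_SIG_ERR (-1L)

enum sketch_action { ACT_DEFAULT, ACT_IGNORE, ACT_ERROR, ACT_DELIVER };

/* Classify a handler value, comparing variable-first rather than yoda-style. */
static enum sketch_action sketch_classify(long handler)
{
    if (handler == SKETCH_SIG_DFL) {
        return ACT_DEFAULT;
    } else if (handler == SKETCH_SIG_IGN) {
        return ACT_IGNORE;
    } else if (handler == SKETCH_SIG_ERR) {
        return ACT_ERROR;
    }
    return ACT_DELIVER;
}
```

Compilers already warn about an accidental "=" in a condition, so the constant-first form buys little and costs readability.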

> +/* ignore sig */
> +} else if (TARGET_SIG_ERR == handler) {
> +force_sig(sig);

Note that if you follow linux-user and my suggestion on
patch 21, these force_sig() calls become calls to
dump_core_and_abort(), unlike the one in setup_frame(),
which should be a "queue this signal". (The difference is
that here we know the process is definitely going to die,
because it has no valid handler for a fatal signal. In
setup_frame() the process might be able to continue, if it
has a signal handler for the SIGILL or SIGSEGV or whatever.)

> +} else {
> +/* compute the blocked signals during the handler execution */
> +sigset_t *blocked_set;
> +
> +target_to_host_sigset(&set, &sa->sa_mask);
> +/*
> + * SA_NODEFER indicates that the current signal should not be
> + * blocked during the handler.
> + */
> +if (!(sa->sa_flags & TARGET_SA_NODEFER)) {
> +sigaddset(&set, target_to_host_signal(sig));
> +}
> +
> +/*
> + * Save the previous blocked signal state to restore it at the
> + * end of the signal execution (see do_sigreturn).
> + */
> +host_to_target_sigset_internal(&target_old_set, &ts->signal_mask);
> +
> +blocked_set = ts->in_sigsuspend ?
> +&ts->sigsuspend_mask : &ts->signal_mask;
> +qemu_sigorset(&ts->signal_mask, blocked_set, &set);
> +ts->in_sigsuspend = false;
> +sigprocmask(SIG_SETMASK, &ts->signal_mask, NULL);
> +
> +/* XXX VM86 on x86 ??? */
> +
> +code = q->info.si_code;
> +/* prepare the stack frame of the virtual CPU */
> +if (sa->sa_flags & TARGET_SA_SIGINFO) {
> +tswap_siginfo(&tinfo, &q->info);

Oh, you're doing the tswap_siginfo() here. If you really want to
do that, then the setup_frame() should be able to do a simple
structure-copy I think and doesn't need the logic to figure out
which union fields are relevant. But putting the tswap_siginfo()
inside setup_frame() would match where linux-user does it.

> +setup_frame(sig, code, sa, &target_old_set, &tinfo, cpu_env);
> +} else {
> +setup_frame(sig, code, sa, &target_old_set, NULL, cpu_env);
> +}
> +if (sa->sa_flags & TARGET_SA_RESETHAND) {
> +sa->_sa_handler = TARGET_SIG_DFL;
> +}
> +}
> +if (q != &k->info) {
> +free_sigqueue(cpu_env, q);
> +}
> +}
> +
>  void process_pending_signals(CPUArchState *cpu_env)
>  {
>  }
> --
> 2.33.1

thanks
-- PMM



Re: [PATCH 26/30] bsd-user/signal.c: tswap_siginfo

2022-01-14 Thread Peter Maydell
On Sun, 9 Jan 2022 at 16:56, Warner Losh  wrote:
>
> Convert siginfo from target to host.
>
> Signed-off-by: Stacey Son 
> Signed-off-by: Kyle Evans 
> Signed-off-by: Warner Losh 
> ---
>  bsd-user/signal.c | 34 ++
>  1 file changed, 34 insertions(+)
>
> diff --git a/bsd-user/signal.c b/bsd-user/signal.c
> index 934528d5fb0..c954d0f4f37 100644
> --- a/bsd-user/signal.c
> +++ b/bsd-user/signal.c
> @@ -197,6 +197,40 @@ static inline void 
> host_to_target_siginfo_noswap(target_siginfo_t *tinfo,
>  }
>  }
>
> +static void tswap_siginfo(target_siginfo_t *tinfo, const target_siginfo_t 
> *info)
> +{
> +int sig, code;
> +
> +sig = info->si_signo;
> +code = info->si_code;
> +tinfo->si_signo = tswap32(sig);
> +tinfo->si_errno = tswap32(info->si_errno);
> +tinfo->si_code = tswap32(info->si_code);
> +tinfo->si_pid = tswap32(info->si_pid);
> +tinfo->si_uid = tswap32(info->si_uid);
> +tinfo->si_status = tswap32(info->si_status);
> +tinfo->si_addr = tswapal(info->si_addr);

This implicitly assumes FreeBSD target_siginfo_t, because
(a) all the field names are different on eg NetBSD
(b) if, like NetBSD, the pid and uid fields are inside the same
union as the addr, you can't just swap all of them unconditionally
but need different logic to handle them as part of the "which bit
of the union is valid" code.

FreeBSD-only is fine for now, but you might want to add a comment.
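
To make point (b) concrete, here is a minimal sketch of a siginfo whose pid/uid fields share a union with the fault address. The struct layout and all sketch_* names are hypothetical; it is neither the FreeBSD nor the NetBSD definition. With such a layout, only the member that is valid for the signal can be swapped.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-swap helpers, written out so the sketch has no dependencies. */
static uint32_t sketch_bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

static uint64_t sketch_bswap64(uint64_t v)
{
    return ((uint64_t)sketch_bswap32((uint32_t)v) << 32) |
           sketch_bswap32((uint32_t)(v >> 32));
}

/* Hypothetical layout where pid/uid share storage with the fault address. */
struct sketch_siginfo {
    uint32_t signo;
    union {
        struct { uint32_t pid; uint32_t uid; } proc;
        uint64_t addr;
    } u;
};

/*
 * Swap only the union member that is valid for this signal. Swapping
 * proc and addr unconditionally would scramble whichever one is live.
 */
static void sketch_swap_siginfo(struct sketch_siginfo *dst,
                                const struct sketch_siginfo *src,
                                int is_fault)
{
    dst->signo = sketch_bswap32(src->signo);
    if (is_fault) {
        dst->u.addr = sketch_bswap64(src->u.addr);
    } else {
        dst->u.proc.pid = sketch_bswap32(src->u.proc.pid);
        dst->u.proc.uid = sketch_bswap32(src->u.proc.uid);
    }
}
```

With separate, non-overlapping fields (as the patch assumes) every field can be swapped unconditionally, which is why the code is simpler there.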

> +/*
> + * Unswapped, because we passed it through mostly untouched.  si_value is
> + * opaque to the kernel, so we didn't bother with potentially wasting 
> cycles
> + * to swap it into host byte order.
> + */
> +tinfo->si_value.sival_ptr = info->si_value.sival_ptr;
> +if (SIGILL == sig || SIGFPE == sig || SIGSEGV == sig || SIGBUS == sig ||
> +SIGTRAP == sig) {
> +tinfo->_reason._fault._trapno = 
> tswap32(info->_reason._fault._trapno);
> +}
> +#ifdef SIGPOLL
> +if (SIGPOLL == sig) {
> +tinfo->_reason._poll._band = tswap32(info->_reason._poll._band);
> +}
> +#endif
> +if (SI_TIMER == code) {
> +tinfo->_reason._timer._timerid = 
> tswap32(info->_reason._timer._timerid);
> +tinfo->_reason._timer._overrun = 
> tswap32(info->_reason._timer._overrun);
> +}
> +}

You had a call to this already in the previous patch, which presumably
means that it didn't compile at that point in the series, so this
patch should be moved earlier.

My reply to patch 2 has the higher-level commentary about handling
of target_siginfo_t.

thanks
-- PMM



Re: [PATCH 27/30] bsd-user/signal.c: process_pending_signals

2022-01-14 Thread Peter Maydell
On Sun, 9 Jan 2022 at 16:57, Warner Losh  wrote:
>
> Process the currently queued signals.
>
> Signed-off-by: Stacey Son 
> Signed-off-by: Kyle Evans 
> Signed-off-by: Warner Losh 
> ---
>  bsd-user/signal.c | 34 ++
>  1 file changed, 34 insertions(+)
>
> diff --git a/bsd-user/signal.c b/bsd-user/signal.c
> index c954d0f4f37..1dd6dbb4ee1 100644
> --- a/bsd-user/signal.c
> +++ b/bsd-user/signal.c
> @@ -781,6 +781,40 @@ static void handle_pending_signal(CPUArchState *cpu_env, 
> int sig,
>
>  void process_pending_signals(CPUArchState *cpu_env)

I won't review this, because I favour using the logic that
linux-user does here.

-- PMM



Re: [PATCH v6 18/23] hw/intc: Add RISC-V AIA APLIC device emulation

2022-01-14 Thread Frank Chang
Anup Patel wrote on Thu, 30 Dec 2021 at 8:55 PM:

> From: Anup Patel 
>
> The RISC-V AIA (Advanced Interrupt Architecture) defines a new
> interrupt controller for wired interrupts called APLIC (Advanced
> Platform Level Interrupt Controller). The APLIC is capable of
> forwarding wired interrupts to RISC-V HARTs directly or as MSIs
> (Message Signaled Interrupts).
>
> This patch adds device emulation for RISC-V AIA APLIC.
>
> Signed-off-by: Anup Patel 
> Signed-off-by: Anup Patel 
> ---
>  hw/intc/Kconfig   |   3 +
>  hw/intc/meson.build   |   1 +
>  hw/intc/riscv_aplic.c | 970 ++
>  include/hw/intc/riscv_aplic.h |  79 +++
>  4 files changed, 1053 insertions(+)
>  create mode 100644 hw/intc/riscv_aplic.c
>  create mode 100644 include/hw/intc/riscv_aplic.h
>
> diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig
> index 010ded7eae..528e77b4a6 100644
> --- a/hw/intc/Kconfig
> +++ b/hw/intc/Kconfig
> @@ -70,6 +70,9 @@ config LOONGSON_LIOINTC
>  config RISCV_ACLINT
>  bool
>
> +config RISCV_APLIC
> +bool
> +
>  config SIFIVE_PLIC
>  bool
>
> diff --git a/hw/intc/meson.build b/hw/intc/meson.build
> index 70080bc161..7466024402 100644
> --- a/hw/intc/meson.build
> +++ b/hw/intc/meson.build
> @@ -50,6 +50,7 @@ specific_ss.add(when: 'CONFIG_S390_FLIC', if_true:
> files('s390_flic.c'))
>  specific_ss.add(when: 'CONFIG_S390_FLIC_KVM', if_true:
> files('s390_flic_kvm.c'))
>  specific_ss.add(when: 'CONFIG_SH_INTC', if_true: files('sh_intc.c'))
>  specific_ss.add(when: 'CONFIG_RISCV_ACLINT', if_true:
> files('riscv_aclint.c'))
> +specific_ss.add(when: 'CONFIG_RISCV_APLIC', if_true:
> files('riscv_aplic.c'))
>  specific_ss.add(when: 'CONFIG_SIFIVE_PLIC', if_true:
> files('sifive_plic.c'))
>  specific_ss.add(when: 'CONFIG_XICS', if_true: files('xics.c'))
>  specific_ss.add(when: ['CONFIG_KVM', 'CONFIG_XICS'],
> diff --git a/hw/intc/riscv_aplic.c b/hw/intc/riscv_aplic.c
> new file mode 100644
> index 00..f4b8828dac
> --- /dev/null
> +++ b/hw/intc/riscv_aplic.c
> @@ -0,0 +1,970 @@
> +/*
> + * RISC-V APLIC (Advanced Platform Level Interrupt Controller)
> + *
> + * Copyright (c) 2021 Western Digital Corporation or its affiliates.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License
> along with
> + * this program.  If not, see .
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/log.h"
> +#include "qemu/module.h"
> +#include "qemu/error-report.h"
> +#include "qemu/bswap.h"
> +#include "exec/address-spaces.h"
> +#include "hw/sysbus.h"
> +#include "hw/pci/msi.h"
> +#include "hw/boards.h"
> +#include "hw/qdev-properties.h"
> +#include "hw/intc/riscv_aplic.h"
> +#include "hw/irq.h"
> +#include "target/riscv/cpu.h"
> +#include "sysemu/sysemu.h"
> +#include "migration/vmstate.h"
> +
> +#define APLIC_MAX_IDC  (1UL << 14)
> +#define APLIC_MAX_SOURCE   1024
> +#define APLIC_MIN_IPRIO_BITS   1
> +#define APLIC_MAX_IPRIO_BITS   8
> +#define APLIC_MAX_CHILDREN 1024
> +
> +#define APLIC_DOMAINCFG0x
> +#define APLIC_DOMAINCFG_IE (1 << 8)
> +#define APLIC_DOMAINCFG_DM (1 << 2)
> +#define APLIC_DOMAINCFG_BE (1 << 0)
> +
> +#define APLIC_SOURCECFG_BASE   0x0004
> +#define APLIC_SOURCECFG_D  (1 << 10)
> +#define APLIC_SOURCECFG_CHILDIDX_MASK  0x03ff
> +#define APLIC_SOURCECFG_SM_MASK0x0007
> +#define APLIC_SOURCECFG_SM_INACTIVE0x0
> +#define APLIC_SOURCECFG_SM_DETACH  0x1
> +#define APLIC_SOURCECFG_SM_EDGE_RISE   0x4
> +#define APLIC_SOURCECFG_SM_EDGE_FALL   0x5
> +#define APLIC_SOURCECFG_SM_LEVEL_HIGH  0x6
> +#define APLIC_SOURCECFG_SM_LEVEL_LOW   0x7
> +
> +#define APLIC_MMSICFGADDR  0x1bc0
> +#define APLIC_MMSICFGADDRH 0x1bc4
> +#define APLIC_SMSICFGADDR  0x1bc8
> +#define APLIC_SMSICFGADDRH 0x1bcc
> +
> +#define APLIC_xMSICFGADDRH_L   (1UL << 31)
> +#define APLIC_xMSICFGADDRH_HHXS_MASK   0x1f
> +#define APLIC_xMSICFGADDRH_HHXS_SHIFT  24
> +#define APLIC_xMSICFGADDRH_LHXS_MASK   0x7
> +#define APLIC_xMSICFGADDRH_LHXS_SHIFT  20
> +#define APLIC_xMSICFGADDRH_HHXW_MASK   0x7
> +#define APLIC_xMSICFGADDRH_HHXW_SHIFT  16
> +#define APLIC_xMSICFGADDRH_LHXW_MASK   0xf
> +#define APLIC_xMSICFGADDRH_LHXW_SHIFT  12
> +#define APLIC_xMSICFGADDRH_BAPPN_MASK  0xfff
> +
> +#define APLIC_xMSICFGADDR_PPN_SHIFT12
> +

Re: [PATCH 28/30] bsd-user/signal.c: implement do_sigreturn

2022-01-14 Thread Peter Maydell
On Sun, 9 Jan 2022 at 17:00, Warner Losh  wrote:
>
> Implements the meat of a sigreturn(2) system call via do_sigreturn, and
> helper reset_signal_mask. Fix the prototype of do_sigreturn in qemu.h
> and remove do_rt_sigreturn since it's linux only.
>
> Signed-off-by: Stacey Son 
> Signed-off-by: Kyle Evans 
> Signed-off-by: Warner Losh 
> ---
>  bsd-user/qemu.h   |  3 +--
>  bsd-user/signal.c | 56 +++
>  2 files changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
> index 011fdfebbaa..b8c64ca0e5b 100644
> --- a/bsd-user/qemu.h
> +++ b/bsd-user/qemu.h
> @@ -219,14 +219,13 @@ extern int do_strace;
>  /* signal.c */
>  void process_pending_signals(CPUArchState *cpu_env);
>  void signal_init(void);
> -long do_sigreturn(CPUArchState *env);
> -long do_rt_sigreturn(CPUArchState *env);
>  void queue_signal(CPUArchState *env, int sig, target_siginfo_t *info);
>  abi_long do_sigaltstack(abi_ulong uss_addr, abi_ulong uoss_addr, abi_ulong 
> sp);
>  int target_to_host_signal(int sig);
>  int host_to_target_signal(int sig);
>  void host_to_target_sigset(target_sigset_t *d, const sigset_t *s);
>  void target_to_host_sigset(sigset_t *d, const target_sigset_t *s);
> +long do_sigreturn(CPUArchState *regs, abi_ulong addr);

Please always call CPUArchState* arguments 'env'.

>  void QEMU_NORETURN force_sig(int target_sig);
>  int qemu_sigorset(sigset_t *dest, const sigset_t *left, const sigset_t 
> *right);
>
> diff --git a/bsd-user/signal.c b/bsd-user/signal.c
> index 1dd6dbb4ee1..d11f5eddd7e 100644
> --- a/bsd-user/signal.c
> +++ b/bsd-user/signal.c
> @@ -640,6 +640,62 @@ give_sigsegv:
>  force_sig(TARGET_SIGSEGV);
>  }
>
> +static int reset_signal_mask(target_ucontext_t *ucontext)
> +{
> +int i;
> +sigset_t blocked;
> +target_sigset_t target_set;
> +TaskState *ts = (TaskState *)thread_cpu->opaque;
> +
> +for (i = 0; i < TARGET_NSIG_WORDS; i++)
> +if (__get_user(target_set.__bits[i],
> +&ucontext->uc_sigmask.__bits[i])) {
> +return -TARGET_EFAULT;
> +}
> +target_to_host_sigset_internal(&blocked, &target_set);
> +ts->signal_mask = blocked;
> +sigprocmask(SIG_SETMASK, &ts->signal_mask, NULL);

do_sigreturn() itself shouldn't be setting the active signal
mask, at least if you follow the linux-user design. It just
sets the thread's signal_mask field in the TaskState by
calling set_sigmask(), and then on our way out in the
main cpu loop we'll call process_pending_signals() which
sets the real thread signal mask to that value. (This, together
with do_sigreturn() calling block_signals() before it starts
work, avoids some race conditions where a host signal is delivered
as soon as we unblock, I think.)

-- PMM



Re: [PATCH 1/1] softmmu: fix device deletion events with -device JSON syntax

2022-01-14 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> The -device JSON syntax impl leaks a reference on the created
> DeviceState instance. As a result when you hot-unplug the
> device, the device_finalize method won't be called and thus
> it will fail to emit the required DEVICE_DELETED event.
>
> A 'json-cli' feature was previously added against the
> 'device_add' QMP command QAPI schema to indicate to mgmt
> apps that -device supported JSON syntax. Given the hotplug
> bug that feature flag is no unusable for its purpose, so

As Laurent and Thomas pointed out, this should be "is not usable" or "is
unusable".

> we add a new 'json-cli-hotplug' feature to indicate the
> -device supports JSON without breaking hotplug.
>
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/802
> Signed-off-by: Daniel P. Berrangé 
> ---
>  qapi/qdev.json |  5 -
>  softmmu/vl.c   |  4 +++-
>  tests/qtest/device-plug-test.c | 19 +++
>  3 files changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/qapi/qdev.json b/qapi/qdev.json
> index 69656b14df..26cd10106b 100644
> --- a/qapi/qdev.json
> +++ b/qapi/qdev.json
> @@ -44,6 +44,9 @@
>  # @json-cli: If present, the "-device" command line option supports JSON
>  #syntax with a structure identical to the arguments of this
>  #command.
> +# @json-cli-hotplug: If present, the "-device" command line option supports 
> JSON
> +#syntax without the reference counting leak that broke
> +#hot-unplug

For local consistency, please end the sentence with a period and wrap
lines like so:

   # @json-cli-hotplug: If present, the "-device" command line option supports
   #JSON syntax without the reference counting leak that
   #broke hot-unplug.

>  #
>  # Notes:
>  #
> @@ -74,7 +77,7 @@
>  { 'command': 'device_add',
>'data': {'driver': 'str', '*bus': 'str', '*id': 'str'},
>'gen': false, # so we can get the additional arguments
> -  'features': ['json-cli'] }
> +  'features': ['json-cli', 'json-cli-hotplug'] }
>  
>  ##
>  # @device_del:

Kevin, I hope you can apply these touch-ups in your tree.  Then, QAPI
schema
Acked-by: Markus Armbruster 

[...]




Re: [PATCH 0/2] block-backend: Retain permissions after migration

2022-01-14 Thread Hanna Reitz

On 25.11.21 14:53, Hanna Reitz wrote:

Hi,

Peng Liang has reported an issue regarding migration of raw images here:
https://lists.nongnu.org/archive/html/qemu-block/2021-11/msg00673.html

It turns out that after migrating, all permissions are shared when they
weren’t before.  The cause of the problem is that we deliberately delay
restricting the shared permissions until migration is really done (until
the runstate is no longer INMIGRATE) and first share all permissions;
but this causes us to lose the original shared permission mask and
overwrites it with BLK_PERM_ALL, so once we do try to restrict the
shared permissions, we only again share them all.

Fix this by saving the set of shared permissions through the first
blk_perm_set() call that shares all; and add a regression test.
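
The idea can be modelled in a few lines. This is a sketch under stated assumptions, not the patch itself (which is not shown in this thread): struct sketch_blk, SKETCH_PERM_ALL and the sketch_* functions are hypothetical, and the real block layer is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PERM_ALL 0x1fULL   /* hypothetical "share everything" mask */

/* Hypothetical model of a block backend's permission state. */
struct sketch_blk {
    uint64_t shared_perm;       /* what the user asked to share */
    uint64_t active_shared;     /* what is currently in effect */
    bool     in_migrate;        /* true while the runstate is INMIGRATE */
};

/* Remember the requested mask even when its application is delayed. */
static void sketch_set_shared(struct sketch_blk *blk, uint64_t shared)
{
    blk->shared_perm = shared;
    blk->active_shared = blk->in_migrate ? SKETCH_PERM_ALL : shared;
}

/* Migration is done: restrict sharing to the mask that was saved. */
static void sketch_migration_done(struct sketch_blk *blk)
{
    blk->in_migrate = false;
    blk->active_shared = blk->shared_perm;
}
```

The bug described above corresponds to storing SKETCH_PERM_ALL into shared_perm during migration, after which sketch_migration_done() has nothing narrower to restore.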


I don’t believe we have to fix this in 6.2, because I think this bug has
existed for four years now.  (I.e. it isn’t critical, and it’s no
regression.)


Thanks for the reviews, applied to my block branch:

https://gitlab.com/hreitz/qemu/-/commits/block

Hanna




Re: [PATCH 0/2] block-backend: Retain permissions after migration

2022-01-14 Thread Hanna Reitz

On 10.01.22 12:51, Peng Liang wrote:

On 11/25/2021 9:53 PM, Hanna Reitz wrote:

Hi,

Peng Liang has reported an issue regarding migration of raw images here:
https://lists.nongnu.org/archive/html/qemu-block/2021-11/msg00673.html

It turns out that after migrating, all permissions are shared when they
weren’t before.  The cause of the problem is that we deliberately delay
restricting the shared permissions until migration is really done (until
the runstate is no longer INMIGRATE) and first share all permissions;
but this causes us to lose the original shared permission mask and
overwrites it with BLK_PERM_ALL, so once we do try to restrict the
shared permissions, we only again share them all.

Fix this by saving the set of shared permissions through the first
blk_perm_set() call that shares all; and add a regression test.


I don’t believe we have to fix this in 6.2, because I think this bug has
existed for four years now.  (I.e. it isn’t critical, and it’s no
regression.)


Hanna Reitz (2):
   block-backend: Retain permissions after migration
   iotests/migration-permissions: New test

  block/block-backend.c |  11 ++
  .../qemu-iotests/tests/migration-permissions  | 101 ++
  .../tests/migration-permissions.out   |   5 +
  3 files changed, 117 insertions(+)
  create mode 100755 tests/qemu-iotests/tests/migration-permissions
  create mode 100644 tests/qemu-iotests/tests/migration-permissions.out


Hi Hanna,
QEMU 6.3 development tree has been opened.  Will this fix be merged in 6.3?


Oh, yes, right.  Thanks for the reminder! :)

Hanna




Re: [PATCH 3/3] intel-iommu: PASID support

2022-01-14 Thread Liu Yi L

On 2022/1/14 15:22, Jason Wang wrote:

On Fri, Jan 14, 2022 at 3:13 PM Peter Xu  wrote:


On Fri, Jan 14, 2022 at 01:58:07PM +0800, Jason Wang wrote:

Right, but I think you meant to do this only when scalable mode is disabled.


Yes, IMHO it will definitely suit the !scalable case, since that's exactly what
we did before.  What I'm also wondering about is the case where scalable is
enabled but no "real" pasid is used: if all the translations go through the
default pasid that is stored in the device context entry, then maybe we can
ignore checking it.  The latter is the "hacky" part mentioned above.


The problem I see is that we can't know what PASID is used as default
without reading the context entry?


Can the default NO_PASID be used in a mixture with !NO_PASID use cases on the
same device?  If that's possible, then I agree..


My understanding is that it is possible.



My previous idea should be based on the fact that if NO_PASID is used on one
device, then all translations will be based on NO_PASID, but now I'm not sure
of it.


Actually, what I meant is:

device 1 using transactions without PASID with RID2PASID 1
device 2 using transactions without PASID with RID2PASID 2


Interesting series, Jason.

I haven't read through all your code yet, just a quick comment. The
RID2PASID 1 and RID2PASID 2 values may be the same one. The VT-d spec defines
an RPS bit in the ecap register. If it is reported as 0, the RID_PASID
(previously called RID2PASID :-)) field of the scalable mode context
entry is not supported, and a PASID value of 0 will be used for transactions
without PASID. So in the code, you may check the RPS bit to see whether the
RID_PASID value is the same for all devices.
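
A minimal sketch of such a check follows. The sketch_* names are hypothetical, and the bit position used for RPS (bit 49 of the extended capability register) is an assumption that should be verified against the VT-d spec before use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of RPS in the extended capability register. */
#define SKETCH_ECAP_RPS (1ULL << 49)

static bool sketch_ecap_has_rps(uint64_t ecap)
{
    return (ecap & SKETCH_ECAP_RPS) != 0;
}

/*
 * PASID to use for a request that carries none: the context entry's
 * RID_PASID field if RPS is reported, otherwise PASID 0 for every device.
 */
static uint32_t sketch_rid_pasid(uint64_t ecap, uint32_t ctx_rid_pasid)
{
    return sketch_ecap_has_rps(ecap) ? ctx_rid_pasid : 0;
}
```

When RPS is clear, every device resolves to PASID 0, so a single default can be assumed; when it is set, the context entry has to be read first.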


Regards,
Yi Liu


Then we can't assume a default pasid here.







The other thing to mention is, if we postpone the iotlb lookup to be after
context entry, then logically we can have per-device iotlb, that means we can
replace IntelIOMMUState.iotlb with VTDAddressSpace.iotlb in the future, too,
which can also be more efficient.


Right but we still need to limit the total slots and ATS is a better
way to deal with the IOTLB bottleneck actually.


I think it depends on how the iotlb ghash is implemented.  Logically I think if
we can split the cache to per-device it'll be slightly better because we don't
need to iterate over iotlbs of other devices when doing a lookup anymore; meanwhile
each iotlb takes less space too (no devfn needed anymore).


So we've already used sid in the IOTLB hash, I wonder how much we can
gain from this.

Thanks



Thanks,

--
Peter Xu





--
Regards,
Yi Liu



Re: [PATCH v6 18/23] hw/intc: Add RISC-V AIA APLIC device emulation

2022-01-14 Thread Anup Patel
On Fri, Jan 14, 2022 at 5:33 PM Frank Chang  wrote:
>
> Anup Patel wrote on Thu, 30 Dec 2021 at 8:55 PM:
>>
>> From: Anup Patel 
>>
>> The RISC-V AIA (Advanced Interrupt Architecture) defines a new
>> interrupt controller for wired interrupts called APLIC (Advanced
>> Platform Level Interrupt Controller). The APLIC is capable of
>> forwarding wired interrupts to RISC-V HARTs directly or as MSIs
>> (Message Signaled Interrupts).
>>
>> This patch adds device emulation for RISC-V AIA APLIC.
>>
>> Signed-off-by: Anup Patel 
>> Signed-off-by: Anup Patel 
>> ---
>>  hw/intc/Kconfig   |   3 +
>>  hw/intc/meson.build   |   1 +
>>  hw/intc/riscv_aplic.c | 970 ++
>>  include/hw/intc/riscv_aplic.h |  79 +++
>>  4 files changed, 1053 insertions(+)
>>  create mode 100644 hw/intc/riscv_aplic.c
>>  create mode 100644 include/hw/intc/riscv_aplic.h
>>
>> diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig
>> index 010ded7eae..528e77b4a6 100644
>> --- a/hw/intc/Kconfig
>> +++ b/hw/intc/Kconfig
>> @@ -70,6 +70,9 @@ config LOONGSON_LIOINTC
>>  config RISCV_ACLINT
>>  bool
>>
>> +config RISCV_APLIC
>> +bool
>> +
>>  config SIFIVE_PLIC
>>  bool
>>
>> diff --git a/hw/intc/meson.build b/hw/intc/meson.build
>> index 70080bc161..7466024402 100644
>> --- a/hw/intc/meson.build
>> +++ b/hw/intc/meson.build
>> @@ -50,6 +50,7 @@ specific_ss.add(when: 'CONFIG_S390_FLIC', if_true: 
>> files('s390_flic.c'))
>>  specific_ss.add(when: 'CONFIG_S390_FLIC_KVM', if_true: 
>> files('s390_flic_kvm.c'))
>>  specific_ss.add(when: 'CONFIG_SH_INTC', if_true: files('sh_intc.c'))
>>  specific_ss.add(when: 'CONFIG_RISCV_ACLINT', if_true: 
>> files('riscv_aclint.c'))
>> +specific_ss.add(when: 'CONFIG_RISCV_APLIC', if_true: files('riscv_aplic.c'))
>>  specific_ss.add(when: 'CONFIG_SIFIVE_PLIC', if_true: files('sifive_plic.c'))
>>  specific_ss.add(when: 'CONFIG_XICS', if_true: files('xics.c'))
>>  specific_ss.add(when: ['CONFIG_KVM', 'CONFIG_XICS'],
>> diff --git a/hw/intc/riscv_aplic.c b/hw/intc/riscv_aplic.c
>> new file mode 100644
>> index 00..f4b8828dac
>> --- /dev/null
>> +++ b/hw/intc/riscv_aplic.c
>> @@ -0,0 +1,970 @@
>> +/*
>> + * RISC-V APLIC (Advanced Platform Level Interrupt Controller)
>> + *
>> + * Copyright (c) 2021 Western Digital Corporation or its affiliates.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2 or later, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along 
>> with
>> + * this program.  If not, see .
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qapi/error.h"
>> +#include "qemu/log.h"
>> +#include "qemu/module.h"
>> +#include "qemu/error-report.h"
>> +#include "qemu/bswap.h"
>> +#include "exec/address-spaces.h"
>> +#include "hw/sysbus.h"
>> +#include "hw/pci/msi.h"
>> +#include "hw/boards.h"
>> +#include "hw/qdev-properties.h"
>> +#include "hw/intc/riscv_aplic.h"
>> +#include "hw/irq.h"
>> +#include "target/riscv/cpu.h"
>> +#include "sysemu/sysemu.h"
>> +#include "migration/vmstate.h"
>> +
>> +#define APLIC_MAX_IDC  (1UL << 14)
>> +#define APLIC_MAX_SOURCE   1024
>> +#define APLIC_MIN_IPRIO_BITS   1
>> +#define APLIC_MAX_IPRIO_BITS   8
>> +#define APLIC_MAX_CHILDREN 1024
>> +
>> +#define APLIC_DOMAINCFG0x
>> +#define APLIC_DOMAINCFG_IE (1 << 8)
>> +#define APLIC_DOMAINCFG_DM (1 << 2)
>> +#define APLIC_DOMAINCFG_BE (1 << 0)
>> +
>> +#define APLIC_SOURCECFG_BASE   0x0004
>> +#define APLIC_SOURCECFG_D  (1 << 10)
>> +#define APLIC_SOURCECFG_CHILDIDX_MASK  0x03ff
>> +#define APLIC_SOURCECFG_SM_MASK0x0007
>> +#define APLIC_SOURCECFG_SM_INACTIVE0x0
>> +#define APLIC_SOURCECFG_SM_DETACH  0x1
>> +#define APLIC_SOURCECFG_SM_EDGE_RISE   0x4
>> +#define APLIC_SOURCECFG_SM_EDGE_FALL   0x5
>> +#define APLIC_SOURCECFG_SM_LEVEL_HIGH  0x6
>> +#define APLIC_SOURCECFG_SM_LEVEL_LOW   0x7
>> +
>> +#define APLIC_MMSICFGADDR  0x1bc0
>> +#define APLIC_MMSICFGADDRH 0x1bc4
>> +#define APLIC_SMSICFGADDR  0x1bc8
>> +#define APLIC_SMSICFGADDRH 0x1bcc
>> +
>> +#define APLIC_xMSICFGADDRH_L   (1UL << 31)
>> +#define APLIC_xMSICFGADDRH_HHXS_MASK   0x1f
>> +#define APLIC_xMSICFGADDRH_HHXS_SHIFT  24
>> +#define APLIC_xMSICFGADDRH_LHXS_MASK   0x7
>> +#define APLIC_xMSICFGADDRH_LHXS_SHIFT  20
>> +#define APLIC_xMSICFGADDRH_HHXW_MASK   0x7
>> +#define APLIC_xMSICFGADDRH_HHXW_SHIFT  16
>> +#defi

Re: [PATCH 1/4] tests: acpi: manually pad OEM_ID/OEM_TABLE_ID for test_oem_fields() test

2022-01-14 Thread Michael S. Tsirkin
On Fri, Jan 14, 2022 at 12:48:20PM +0100, Igor Mammedov wrote:
> On Wed, 12 Jan 2022 08:44:19 -0500
> "Michael S. Tsirkin"  wrote:
> 
> > On Wed, Jan 12, 2022 at 08:03:29AM -0500, Igor Mammedov wrote:
> > > The next commit will revert OEM fields padding with whitespace to
> > > padding with '\0' as it was before [1]. As result test_oem_fields() will
> > > fail due to unexpectedly smaller ID sizes read from QEMU ACPI tables.
> > > 
> > > Pad OEM_ID/OEM_TABLE_ID manually with spaces so that values the test
> > > puts on QEMU CLI and expected values match.
> > > 
> > > 1) 602b458201 ("acpi: Permit OEM ID and OEM table ID fields to be 
> > > changed")
> > > Signed-off-by: Igor Mammedov   
> > 
> > That's kind of ugly in that we do not test
> > shorter names then.  How about we pad with \0 instead?
> 
> 
> test_acpi_q35_slic() should cover short OEM_TABLE_ID.
> 
> also padding in this patch makes test_oem_fields() cleaner
> and simplifies 3/4, switching to \0 here would require
> merging this patch with the fix itself to avoid breaking
> bisection.
> 
> If you still prefer to have test_oem_fields() test short
> names, I can post the following on top, which can do it without
> breaking bisection:
> 
> diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
> index 90c9f6a0a2..0fd7cf1f89 100644
> --- a/tests/qtest/bios-tables-test.c
> +++ b/tests/qtest/bios-tables-test.c
> @@ -71,8 +71,8 @@
>  
>  #define ACPI_REBUILD_EXPECTED_AML "TEST_ACPI_REBUILD_AML"
>  
> -#define OEM_ID "TEST  "
> -#define OEM_TABLE_ID   "OEM "
> +#define OEM_ID "TEST"
> +#define OEM_TABLE_ID   "OEM"
>  #define OEM_TEST_ARGS  "-machine x-oem-id='" OEM_ID "',x-oem-table-id='" 
> \
> OEM_TABLE_ID "'"

Don't we want to revert ARGS change too then?


> @@ -1530,8 +1530,8 @@ static void test_oem_fields(test_data *data)
>  continue;
>  }
>  
> -g_assert(memcmp(sdt->aml + 10, OEM_ID, 6) == 0);
> -g_assert(memcmp(sdt->aml + 16, OEM_TABLE_ID, 8) == 0);
> +g_assert(strncmp((char *)sdt->aml + 10, OEM_ID, 6) == 0);
> +g_assert(strncmp((char *)sdt->aml + 16, OEM_TABLE_ID, 8) == 0);
>  }
>  }
>  

Looks much cleaner to me. OK as a patch on top.
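
As a small illustration of why strncmp() fits here: with a short literal such as "TEST", a memcmp() over the full 6-byte field would read past the end of the literal, whereas strncmp() stops at the terminating NUL. The helper name below is hypothetical and the sketch is independent of the test file.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 if a fixed-width, possibly NUL-padded table field matches
 * `expected`. strncmp() stops at the first NUL or after `width` bytes,
 * so a short `expected` is never read beyond its terminator.
 */
static int sketch_oem_field_matches(const char *field, const char *expected,
                                    size_t width)
{
    assert(width > 0);
    return strncmp(field, expected, width) == 0;
}
```

A NUL-padded field {'T','E','S','T',0,0} matches "TEST", while a space-padded one matches only the explicitly padded literal "TEST  ".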


> 
> > And add a comment explaining why it's done.
> > 
> > > ---
> > >  tests/qtest/bios-tables-test.c | 15 ++-
> > >  1 file changed, 6 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/tests/qtest/bios-tables-test.c 
> > > b/tests/qtest/bios-tables-test.c
> > > index e6b72d9026..90c9f6a0a2 100644
> > > --- a/tests/qtest/bios-tables-test.c
> > > +++ b/tests/qtest/bios-tables-test.c
> > > @@ -71,9 +71,10 @@
> > >  
> > >  #define ACPI_REBUILD_EXPECTED_AML "TEST_ACPI_REBUILD_AML"
> > >  
> > > -#define OEM_ID "TEST"
> > > -#define OEM_TABLE_ID   "OEM"
> > > -#define OEM_TEST_ARGS  "-machine 
> > > x-oem-id="OEM_ID",x-oem-table-id="OEM_TABLE_ID
> > > +#define OEM_ID "TEST  "
> > > +#define OEM_TABLE_ID   "OEM "
> > > +#define OEM_TEST_ARGS  "-machine x-oem-id='" OEM_ID 
> > > "',x-oem-table-id='" \
> > > +   OEM_TABLE_ID "'"
> > >  
> > >  typedef struct {
> > >  bool tcg_only;
> > > @@ -1519,11 +1520,7 @@ static void test_acpi_q35_slic(void)
> > >  static void test_oem_fields(test_data *data)
> > >  {
> > >  int i;
> > > -char oem_id[6];
> > > -char oem_table_id[8];
> > >  
> > > -strpadcpy(oem_id, sizeof oem_id, OEM_ID, ' ');
> > > -strpadcpy(oem_table_id, sizeof oem_table_id, OEM_TABLE_ID, ' ');
> > >  for (i = 0; i < data->tables->len; ++i) {
> > >  AcpiSdtTable *sdt;
> > >  
> > > @@ -1533,8 +1530,8 @@ static void test_oem_fields(test_data *data)
> > >  continue;
> > >  }
> > >  
> > > -g_assert(memcmp(sdt->aml + 10, oem_id, 6) == 0);
> > > -g_assert(memcmp(sdt->aml + 16, oem_table_id, 8) == 0);
> > > +g_assert(memcmp(sdt->aml + 10, OEM_ID, 6) == 0);
> > > +g_assert(memcmp(sdt->aml + 16, OEM_TABLE_ID, 8) == 0);
> > >  }
> > >  }
> > >  
> > > -- 
> > > 2.31.1  
> > 




Re: [PATCH 29/30] bsd-user/signal.c: implement do_sigaction

2022-01-14 Thread Peter Maydell
On Sun, 9 Jan 2022 at 16:32, Warner Losh  wrote:
>
> Implement the meat of the sigaction(2) system call with do_sigaction and
> helper routine block_signals (which is also used to implement signal
> masking so it's global).
>
> Signed-off-by: Stacey Son 
> Signed-off-by: Kyle Evans 
> Signed-off-by: Warner Losh 
> ---
>  bsd-user/qemu.h   | 21 +
>  bsd-user/signal.c | 76 +++
>  2 files changed, 97 insertions(+)
>
> diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
> index b8c64ca0e5b..c643d6ba246 100644
> --- a/bsd-user/qemu.h
> +++ b/bsd-user/qemu.h
> @@ -226,8 +226,29 @@ int host_to_target_signal(int sig);
>  void host_to_target_sigset(target_sigset_t *d, const sigset_t *s);
>  void target_to_host_sigset(sigset_t *d, const target_sigset_t *s);
>  long do_sigreturn(CPUArchState *regs, abi_ulong addr);
> +int do_sigaction(int sig, const struct target_sigaction *act,
> +struct target_sigaction *oact);
>  void QEMU_NORETURN force_sig(int target_sig);
>  int qemu_sigorset(sigset_t *dest, const sigset_t *left, const sigset_t 
> *right);
> +/**
> + * block_signals: block all signals while handling this guest syscall
> + *
> + * Block all signals, and arrange that the signal mask is returned to
> + * its correct value for the guest before we resume execution of guest code.
> + * If this function returns non-zero, then the caller should immediately
> + * return -TARGET_ERESTARTSYS to the main loop, which will take the pending
> + * signal and restart execution of the syscall.
> + * If block_signals() returns zero, then the caller can continue with
> + * emulation of the system call knowing that no signals can be taken
> + * (and therefore that no race conditions will result).
> + * This should only be called once, because if it is called a second time
> + * it will always return non-zero. (Think of it like a mutex that can't
> + * be recursively locked.)
> + * Signals will be unblocked again by process_pending_signals().
> + *
> + * Return value: non-zero if there was a pending signal, zero if not.
> + */
> +int block_signals(void); /* Returns non zero if signal pending */
>
>  /* mmap.c */
>  int target_mprotect(abi_ulong start, abi_ulong len, int prot);
> diff --git a/bsd-user/signal.c b/bsd-user/signal.c
> index d11f5eddd7e..f055d1db407 100644
> --- a/bsd-user/signal.c
> +++ b/bsd-user/signal.c
> @@ -231,6 +231,22 @@ static void tswap_siginfo(target_siginfo_t *tinfo, const 
> target_siginfo_t *info)
>  }
>  }
>
> +int block_signals(void)
> +{
> +TaskState *ts = (TaskState *)thread_cpu->opaque;
> +sigset_t set;
> +
> +/*
> + * It's OK to block everything including SIGSEGV, because we won't run 
> any
> + * further guest code before unblocking signals in
> + * process_pending_signals().
> + */
> +sigfillset(&set);
> +sigprocmask(SIG_SETMASK, &set, 0);

For linux-user we rely on sigprocmask() in a multithreaded
program setting the signal mask for only the calling thread,
which isn't POSIX-mandated. (Arguably we should use
pthread_sigmask() instead, but we don't for basically
historical reasons since linux-user is host-OS-specific anyway.)
Does BSD have the same "this changes this thread's signal mask"
semantics for sigprocmask()?
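
For comparison, a minimal sketch of the pthread_sigmask() variant mentioned above. It does not answer the question about BSD's sigprocmask(); it only shows the call that POSIX does specify as per-thread. The helper names are hypothetical and are not part of bsd-user or linux-user.

```c
#define _POSIX_C_SOURCE 200809L /* for pthread_sigmask() under strict -std= modes */
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/*
 * Hypothetical helper: block every signal for the calling thread only
 * and hand back the previous mask. Returns 0 on success or an errno
 * value (pthread convention).
 */
static int block_all_signals_this_thread(sigset_t *old)
{
    sigset_t all;

    assert(old != NULL);
    sigfillset(&all);
    return pthread_sigmask(SIG_SETMASK, &all, old);
}

/* Hypothetical helper: restore a mask saved by the function above. */
static int restore_signals_this_thread(const sigset_t *old)
{
    return pthread_sigmask(SIG_SETMASK, old, NULL);
}

/* Returns 1 if sig is blocked in the calling thread, 0 if not, -1 on error. */
static int signal_is_blocked(int sig)
{
    sigset_t cur;

    /* A NULL new-set pointer only queries the current mask. */
    if (pthread_sigmask(SIG_SETMASK, NULL, &cur) != 0) {
        return -1;
    }
    return sigismember(&cur, sig);
}
```

The kernel silently ignores attempts to block SIGKILL and SIGSTOP, so "block everything" never covers those two.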

> +
> +return qatomic_xchg(&ts->signal_pending, 1);
> +}
> +
>  /* Returns 1 if given signal should dump core if not handled. */
>  static int core_dump_signal(int sig)
>  {
> @@ -534,6 +550,66 @@ static int fatal_signal(int sig)
>  }
>  }
>
> +/* do_sigaction() return host values and errnos */
> +int do_sigaction(int sig, const struct target_sigaction *act,
> +struct target_sigaction *oact)
> +{
> +struct target_sigaction *k;
> +struct sigaction act1;
> +int host_sig;
> +int ret = 0;
> +
> +if (sig < 1 || sig > TARGET_NSIG || TARGET_SIGKILL == sig ||
> +TARGET_SIGSTOP == sig) {

Kernel seems to allow SIGKILL and SIGSTOP unless act is
non-NULL and act->sa_handler is not SIG_DFL ?
https://github.com/freebsd/freebsd-src/blob/main/sys/kern/kern_sig.c#L747
(Compare linux-user commit ee3500d33a7431, a recent bugfix.)
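
A sketch of the argument check as I read the linked kern_sig.c, under the assumption that the rule is "SIGKILL and SIGSTOP are refused only when a new action is supplied and its handler is something other than SIG_DFL". The function name and the nsig parameter are hypothetical, and it uses host types and errnos purely for illustration.

```c
#define _POSIX_C_SOURCE 200809L /* for struct sigaction under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/*
 * Hypothetical helper, not QEMU or FreeBSD code.
 * Returns 0 if the request is acceptable, or a positive errno value.
 */
static int check_sigaction_args(int sig, int nsig, const struct sigaction *act)
{
    assert(nsig >= 1);

    if (sig < 1 || sig > nsig) {
        return EINVAL;
    }
    /* Querying (act == NULL) or re-installing SIG_DFL is allowed. */
    if ((sig == SIGKILL || sig == SIGSTOP) &&
        act != NULL && act->sa_handler != SIG_DFL) {
        return EINVAL;
    }
    return 0;
}
```

With this shape a guest can still read back the current disposition of SIGKILL via oact, which the unconditional -EINVAL in the patch would prevent.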

> +return -EINVAL;
> +}
> +
> +if (block_signals()) {
> +return -TARGET_ERESTART;

Are we returning host errnos, or target errnos ?
(The linux-user version of this function has been a bit
confused about this in the past; I suspect you've picked up
fragments of it from different points in time.)

> +}
> +
> +k = &sigact_table[sig - 1];
> +if (oact) {
> +oact->_sa_handler = tswapal(k->_sa_handler);
> +oact->sa_flags = tswap32(k->sa_flags);
> +oact->sa_mask = k->sa_mask;
> +}
> +if (act) {
> +/* XXX: this is most likely not threadsafe. */
> +k->_sa_handler = tswapal(act->_sa_handler);
> +k->sa_flags = tswap32(act->sa_flags);
> +k->sa_mask = act->sa_mask;
> +
> +/* Update the host signal state. */
> +host_sig = tar

Re: [PATCH 30/30] bsd-user/signal.c: do_sigaltstack

2022-01-14 Thread Peter Maydell
On Sun, 9 Jan 2022 at 17:08, Warner Losh  wrote:
>
> Implement the meat of the sigaltstack(2) system call with do_sigaltstack.
>
> Signed-off-by: Stacey Son 
> Signed-off-by: Kyle Evans 
> Signed-off-by: Warner Losh 
> ---
>  bsd-user/qemu.h   |  1 +
>  bsd-user/signal.c | 66 +++
>  2 files changed, 67 insertions(+)
>
> diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
> index c643d6ba246..fcdea460ed2 100644
> --- a/bsd-user/qemu.h
> +++ b/bsd-user/qemu.h
> @@ -226,6 +226,7 @@ int host_to_target_signal(int sig);
>  void host_to_target_sigset(target_sigset_t *d, const sigset_t *s);
>  void target_to_host_sigset(sigset_t *d, const target_sigset_t *s);
>  long do_sigreturn(CPUArchState *regs, abi_ulong addr);
> +abi_long do_sigaltstack(abi_ulong uss_addr, abi_ulong uoss_addr, abi_ulong 
> sp);
>  int do_sigaction(int sig, const struct target_sigaction *act,
>  struct target_sigaction *oact);
>  void QEMU_NORETURN force_sig(int target_sig);
> diff --git a/bsd-user/signal.c b/bsd-user/signal.c
> index f055d1db407..e5e5e28c60c 100644
> --- a/bsd-user/signal.c
> +++ b/bsd-user/signal.c
> @@ -528,6 +528,72 @@ static void host_signal_handler(int host_sig, siginfo_t 
> *info, void *puc)
>  cpu_exit(thread_cpu);
>  }
>
> +/* do_sigaltstack() returns target values and errnos. */
> +/* compare to kern/kern_sig.c sys_sigaltstack() and kern_sigaltstack() */
> +abi_long do_sigaltstack(abi_ulong uss_addr, abi_ulong uoss_addr, abi_ulong 
> sp)
> +{
> +int ret;
> +target_stack_t oss;
> +
> +if (uoss_addr) {
> +/* Save current signal stack params */
> +oss.ss_sp = tswapl(target_sigaltstack_used.ss_sp);
> +oss.ss_size = tswapl(target_sigaltstack_used.ss_size);
> +oss.ss_flags = tswapl(sas_ss_flags(sp));
> +}

This will need some minor changes to work with the sigaltstack
info being per-thread and in the TaskState struct.

-- PMM



Re: [PATCH v2 0/2] hw/intc/arm_gic: Allow reset of the running priority

2022-01-14 Thread Peter Maydell
On Thu, 13 Jan 2022 at 15:19, Petr Pavlu  wrote:
>
> Thank you Peter for review of the first version of the patch. v2 splits
> the changes into two commits and updates the code as suggested.
>


Applied to target-arm.next, thanks.

-- PMM



Re: [PULL 0/6] Block patches

2022-01-14 Thread Peter Maydell
On Wed, 12 Jan 2022 at 17:14, Stefan Hajnoczi  wrote:
>
> The following changes since commit 91f5f7a5df1fda8c34677a7c49ee8a4bb5b56a36:
>
>   Merge remote-tracking branch 
> 'remotes/lvivier-gitlab/tags/linux-user-for-7.0-pull-request' into staging 
> (2022-01-12 11:51:47 +)
>
> are available in the Git repository at:
>
>   https://gitlab.com/stefanha/qemu.git tags/block-pull-request
>
> for you to fetch changes up to db608fb78444c58896db69495729e4458eeaace1:
>
>   virtio: unify dataplane and non-dataplane ->handle_output() (2022-01-12 
> 17:09:39 +)
>
> 
> Pull request
>

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 3/3] isa/piix4: Resolve global variables

2022-01-14 Thread Peter Maydell
On Wed, 12 Jan 2022 at 22:02, Bernhard Beschow  wrote:
>
> Now that piix4_set_irq's opaque parameter references own PIIX4State,
> piix4_dev becomes redundant and pci_irq_levels can be moved into PIIX4State.
>
> Signed-off-by: Bernhard Beschow 
> ---
>  hw/isa/piix4.c| 22 +-
>  include/hw/southbridge/piix.h |  2 --
>  2 files changed, 9 insertions(+), 15 deletions(-)
>
> diff --git a/hw/isa/piix4.c b/hw/isa/piix4.c
> index a31e9714cf..964e09cf7f 100644
> --- a/hw/isa/piix4.c
> +++ b/hw/isa/piix4.c
> @@ -39,14 +39,14 @@
>  #include "sysemu/runstate.h"
>  #include "qom/object.h"
>
> -PCIDevice *piix4_dev;
> -
>  struct PIIX4State {
>  PCIDevice dev;
>  qemu_irq cpu_intr;
>  qemu_irq *isa;
>  qemu_irq i8259[ISA_NUM_IRQS];
>
> +int pci_irq_levels[PIIX_NUM_PIRQS];
> +

I wondered how we were migrating this state, and the answer
seems to be that we aren't (and weren't before, when it was
a global variable, so this is a pre-existing bug).

Does the malta platform support migration save/load?
We should probably add this field to the vmstate struct
(which will be a migration compatibility break, which is OK
as the malta board isn't versioned.)

-- PMM



Re: [PATCH 2/3] pci: Always pass own DeviceState to pci_map_irq_fn's

2022-01-14 Thread Peter Maydell
On Wed, 12 Jan 2022 at 21:36, Bernhard Beschow  wrote:
>
> Passing own DeviceState rather than just the IRQs allows for resolving
> global variables.
>
> Signed-off-by: Bernhard Beschow 

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH 15/17] ppc/pnv: convert pec->stacks[] into pec->phbs[]

2022-01-14 Thread Cédric Le Goater

@@ -1520,14 +1520,19 @@ static PnvPhb4PecStack *pnv_phb4_get_stack(PnvChip 
*chip, PnvPHB4 *phb,
  
  for (i = 0; i < chip->num_pecs; i++) {

  /*
- * For each PEC, check the amount of stacks it supports
- * and see if the given phb4 index matches a stack.
+ * For each PEC, check the amount of phbs it supports
+ * and see if the given phb4 index matches an index.
   */
  PnvPhb4PecState *pec = &chip9->pecs[i];
  
-for (j = 0; j < pec->num_stacks; j++) {

+for (j = 0; j < pec->num_phbs; j++) {
  if (index == pnv_phb4_pec_get_phb_id(pec, j)) {
-return &pec->stacks[j];
+pec->phbs[j] = phb;


Why do we need this array ?


+
+/* Set phb-number now since we already have it */
+object_property_set_int(OBJECT(phb), "phb-number",
+   j, &error_abort);


that's ugly :/

C.


+return pec;
  }
  }
  }




Re: [PATCH 0/2] Introduce printer subsystem and USB printer device

2022-01-14 Thread Ruien Zhang

On 1/14/22 5:32 PM, Gerd Hoffmann wrote:

   Hi,


This patchset introduces:

1) Skeleton of QEMU printer subsystem with a dummy builtin driver.

2) USB printer device emulation, with definitions in the extension of IPP-over-
USB [3].

WIP:

1) QEMU printer subsystem interfaces, which will be finalized with a concrete
backend driver.

2) IPP-over-USB implementation.


Hmm, I'm wondering what use cases you have in mind and whether
it makes sense to introduce a printer subsystem?



Simply for the potential diversity of backends. I have to admit that I 
haven't come up with another commonly seen backend either, which is 
part of the reason why the interfaces have not firmed up yet.



Having an ipp-over-usb device looks useful, but the only use case I can
see is to allow guests to access a network printer.  I can't see the
benefits of a printer subsystem, especially in a world where non-ipp
printers are going extinct.  We would most likely have just a single
kind of printer backend, where the only job qemu will have is
forwarding requests and replies, maybe with some http header rewriting.

Likewise usb would be the one and only device (parallel ports are long
gone in printers).  So the indirection added by a printer subsystem
doesn't buy us anything because we just don't need that flexibility.
I'd suggest to pass the url directly to the device instead:

qemu -device usb-ipp-printer,url=ipp://hostname/ipp/printer

take care,
   Gerd



Indeed, the subsystem is an over-abstraction. The forwarding approach is 
much neater, considering how things really work nowadays.


Anyway, thanks for the practical suggestion. I will revise the series 
accordingly, along with the other designs I'm currently working on.


Regards,
Ruien





Re: [PATCH v3 2/3] target/riscv: add support for svinval extension

2022-01-14 Thread Anup Patel
On Fri, Jan 14, 2022 at 7:11 AM Weiwei Li  wrote:
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 
> ---
>  target/riscv/cpu.c  |  1 +
>  target/riscv/cpu.h  |  1 +
>  target/riscv/insn32.decode  |  7 ++
>  target/riscv/insn_trans/trans_svinval.c.inc | 75 +
>  target/riscv/translate.c|  1 +
>  5 files changed, 85 insertions(+)
>  create mode 100644 target/riscv/insn_trans/trans_svinval.c.inc
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index ff6c86c85b..45ac98e06b 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -668,6 +668,7 @@ static Property riscv_cpu_properties[] = {
>  DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128),
>  DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64),
>
> +DEFINE_PROP_BOOL("svinval", RISCVCPU, cfg.ext_svinval, false),
>  DEFINE_PROP_BOOL("svnapot", RISCVCPU, cfg.ext_svnapot, false),
>
>  DEFINE_PROP_BOOL("zba", RISCVCPU, cfg.ext_zba, true),
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index d3d17cde82..c3d1845ca1 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -327,6 +327,7 @@ struct RISCVCPU {
>  bool ext_counters;
>  bool ext_ifencei;
>  bool ext_icsr;
> +bool ext_svinval;
>  bool ext_svnapot;
>  bool ext_zfh;
>  bool ext_zfhmin;
> diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
> index 5bbedc254c..7a0351fde2 100644
> --- a/target/riscv/insn32.decode
> +++ b/target/riscv/insn32.decode
> @@ -809,3 +809,10 @@ fcvt_l_h   1100010  00010 . ... . 1010011 @r2_rm
>  fcvt_lu_h  1100010  00011 . ... . 1010011 @r2_rm
>  fcvt_h_l   1101010  00010 . ... . 1010011 @r2_rm
>  fcvt_h_lu  1101010  00011 . ... . 1010011 @r2_rm
> +
> +# *** Svinval Standard Extension ***
> +sinval_vma0001011 . . 000 0 1110011 @sfence_vma
> +sfence_w_inval0001100 0 0 000 0 1110011
> +sfence_inval_ir   0001100 1 0 000 0 1110011
> +hinval_vvma   0011011 . . 000 0 1110011 @hfence_vvma

s/0011011/0010011/

> +hinval_gvma   0111011 . . 000 0 1110011 @hfence_gvma

s/0111011/0110011/
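
To make the two corrections concrete, here is a small sketch that packs and classifies the three invalidate encodings by their funct7 field. It assumes the usual R-type layout (funct7 in bits 31:25, rs2 in 24:20, rs1 in 19:15) with funct3 and rd fixed at zero and the SYSTEM opcode 1110011, as in the decode lines quoted above. The enum and function names are made up for illustration and are not decodetree output.

```c
#include <assert.h>
#include <stdint.h>

#define OPC_SYSTEM 0x73u /* 1110011 */

/* Hypothetical classification result, for illustration only. */
enum svinval_op {
    SVINVAL_NONE = 0,
    SVINVAL_SINVAL_VMA,  /* funct7 0001011 */
    SVINVAL_HINVAL_VVMA, /* funct7 0010011 (corrected from 0011011) */
    SVINVAL_HINVAL_GVMA, /* funct7 0110011 (corrected from 0111011) */
};

/* Pack funct7 | rs2 | rs1 | funct3=0 | rd=0 | SYSTEM into one word. */
static uint32_t encode_r_type(uint32_t funct7, uint32_t rs2, uint32_t rs1)
{
    assert(funct7 < 128 && rs2 < 32 && rs1 < 32);
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | OPC_SYSTEM;
}

static enum svinval_op classify_svinval(uint32_t insn)
{
    /* Opcode must be SYSTEM; funct3 (14:12) and rd (11:7) must be zero. */
    if ((insn & 0x7fu) != OPC_SYSTEM || (insn & 0x00007f80u) != 0) {
        return SVINVAL_NONE;
    }
    switch (insn >> 25) {
    case 0x0b:
        return SVINVAL_SINVAL_VMA;
    case 0x13:
        return SVINVAL_HINVAL_VVMA;
    case 0x33:
        return SVINVAL_HINVAL_GVMA;
    default:
        return SVINVAL_NONE;
    }
}
```

The values from the patch as posted (0011011 = 0x1b and 0111011 = 0x3b) fall through to the default case here, which is the mismatch the substitutions fix.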

> diff --git a/target/riscv/insn_trans/trans_svinval.c.inc 
> b/target/riscv/insn_trans/trans_svinval.c.inc
> new file mode 100644
> index 00..1dde665661
> --- /dev/null
> +++ b/target/riscv/insn_trans/trans_svinval.c.inc
> @@ -0,0 +1,75 @@
> +/*
> + * RISC-V translation routines for the Svinval Standard Instruction Set.
> + *
> + * Copyright (c) 2020-2021 PLCT lab
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along 
> with
> + * this program.  If not, see .
> + */
> +
> +#define REQUIRE_SVINVAL(ctx) do {\
> +if (!RISCV_CPU(ctx->cs)->cfg.ext_svinval) {  \
> +return false;\
> +}\
> +} while (0)
> +
> +static bool trans_sinval_vma(DisasContext *ctx, arg_sinval_vma *a)
> +{
> +REQUIRE_SVINVAL(ctx);
> +/* Do the same as sfence.vma currently */
> +REQUIRE_EXT(ctx, RVS);
> +#ifndef CONFIG_USER_ONLY
> +gen_helper_tlb_flush(cpu_env);
> +return true;
> +#endif
> +return false;
> +}
> +
> +static bool trans_sfence_w_inval(DisasContext *ctx, arg_sfence_w_inval *a)
> +{
> +REQUIRE_SVINVAL(ctx);
> +REQUIRE_EXT(ctx, RVS);
> +/* Do nothing currently */
> +return true;
> +}
> +
> +static bool trans_sfence_inval_ir(DisasContext *ctx, arg_sfence_inval_ir *a)
> +{
> +REQUIRE_SVINVAL(ctx);
> +REQUIRE_EXT(ctx, RVS);
> +/* Do nothing currently */
> +return true;
> +}
> +
> +static bool trans_hinval_vvma(DisasContext *ctx, arg_hinval_vvma *a)
> +{
> +REQUIRE_SVINVAL(ctx);
> +/* Do the same as hfence.vvma currently */
> +REQUIRE_EXT(ctx, RVH);
> +#ifndef CONFIG_USER_ONLY
> +gen_helper_hyp_tlb_flush(cpu_env);
> +return true;
> +#endif
> +return false;
> +}
> +
> +static bool trans_hinval_gvma(DisasContext *ctx, arg_hinval_gvma *a)
> +{
> +REQUIRE_SVINVAL(ctx);
> +/* Do the same as hfence.gvma currently */
> +REQUIRE_EXT(ctx, RVH);
> +#ifndef CONFIG_USER_ONLY
> +gen_helper_hyp_gvma_tlb_flush(cpu_env);
> +return true;
> +#endif
> +return false;
> +}
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> ind

Re: [PATCH 15/17] ppc/pnv: convert pec->stacks[] into pec->phbs[]

2022-01-14 Thread Daniel Henrique Barboza




On 1/14/22 10:33, Cédric Le Goater wrote:

@@ -1520,14 +1520,19 @@ static PnvPhb4PecStack *pnv_phb4_get_stack(PnvChip 
*chip, PnvPHB4 *phb,
  for (i = 0; i < chip->num_pecs; i++) {
  /*
- * For each PEC, check the amount of stacks it supports
- * and see if the given phb4 index matches a stack.
+ * For each PEC, check the amount of phbs it supports
+ * and see if the given phb4 index matches an index.
   */
  PnvPhb4PecState *pec = &chip9->pecs[i];
-    for (j = 0; j < pec->num_stacks; j++) {
+    for (j = 0; j < pec->num_phbs; j++) {
  if (index == pnv_phb4_pec_get_phb_id(pec, j)) {
-    return &pec->stacks[j];
+    pec->phbs[j] = phb;


Why do we need this array ?



Actually we don't. While making these patches I forgot to assign this pointer
back to the array and everything worked. We don't search the PHB back from the
PEC at any point.

This is being kept because I refrain from making too many design changes at
once. We can drop it though - either in this patch or in a follow up.




+
+    /* Set phb-number now since we already have it */
+    object_property_set_int(OBJECT(phb), "phb-number",
+   j, &error_abort);


that's ugly :/


Not my proudest line of code indeed.

Perhaps we're better off trying to get rid of stack->stack_no altogether before
even converting it to phb->stack_no. I'll see how that goes.



Daniel




C.


+    return pec;
  }
  }
  }




[PULL 05/16] docs: Correct 'vhost-user-blk' spelling

2022-01-14 Thread Kevin Wolf
From: Philippe Mathieu-Daudé 

Reported-by: Eric Blake 
Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20220107105420.395011-2-f4...@amsat.org>
Signed-off-by: Kevin Wolf 
---
 docs/tools/qemu-storage-daemon.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/tools/qemu-storage-daemon.rst 
b/docs/tools/qemu-storage-daemon.rst
index 3e5a9dc032..9b0eaba6e5 100644
--- a/docs/tools/qemu-storage-daemon.rst
+++ b/docs/tools/qemu-storage-daemon.rst
@@ -201,7 +201,7 @@ Export raw image file ``disk.img`` over NBD UNIX domain 
socket ``nbd.sock``::
   --nbd-server addr.type=unix,addr.path=nbd.sock \
   --export type=nbd,id=export,node-name=disk,writable=on
 
-Export a qcow2 image file ``disk.qcow2`` as a vhosts-user-blk device over UNIX
+Export a qcow2 image file ``disk.qcow2`` as a vhost-user-blk device over UNIX
 domain socket ``vhost-user-blk.sock``::
 
   $ qemu-storage-daemon \
-- 
2.31.1




[PULL 03/16] include/sysemu/blockdev.h: remove drive_get_max_devs

2022-01-14 Thread Kevin Wolf
From: Emanuele Giuseppe Esposito 

Remove drive_get_max_devs, as it is not used by anyone.

Last use was removed in commit 8f2d75e81d5
("hw: Drop superfluous special checks for orphaned -drive").

Signed-off-by: Emanuele Giuseppe Esposito 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Stefan Hajnoczi 
Message-Id: <20211215121140.456939-4-eespo...@redhat.com>
Signed-off-by: Kevin Wolf 
---
 include/sysemu/blockdev.h |  1 -
 blockdev.c| 17 -
 2 files changed, 18 deletions(-)

diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
index ea35c42f5c..f9fb54d437 100644
--- a/include/sysemu/blockdev.h
+++ b/include/sysemu/blockdev.h
@@ -48,7 +48,6 @@ DriveInfo *drive_get(BlockInterfaceType type, int bus, int 
unit);
 void drive_check_orphaned(void);
 DriveInfo *drive_get_by_index(BlockInterfaceType type, int index);
 int drive_get_max_bus(BlockInterfaceType type);
-int drive_get_max_devs(BlockInterfaceType type);
 
 QemuOpts *drive_add(BlockInterfaceType type, int index, const char *file,
 const char *optstr);
diff --git a/blockdev.c b/blockdev.c
index 25b3b202e7..8197165bb5 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -168,23 +168,6 @@ void blockdev_auto_del(BlockBackend *blk)
 }
 }
 
-/**
- * Returns the current mapping of how many units per bus
- * a particular interface can support.
- *
- *  A positive integer indicates n units per bus.
- *  0 implies the mapping has not been established.
- * -1 indicates an invalid BlockInterfaceType was given.
- */
-int drive_get_max_devs(BlockInterfaceType type)
-{
-if (type >= IF_IDE && type < IF_COUNT) {
-return if_max_devs[type];
-}
-
-return -1;
-}
-
 static int drive_index_to_bus_id(BlockInterfaceType type, int index)
 {
 int max_devs = if_max_devs[type];
-- 
2.31.1




[PULL 01/16] block_int: make bdrv_backing_overridden static

2022-01-14 Thread Kevin Wolf
From: Emanuele Giuseppe Esposito 

bdrv_backing_overridden is only used in block.c, so there is
no need to leave it in block_int.h

Signed-off-by: Emanuele Giuseppe Esposito 
Reviewed-by: Stefan Hajnoczi 
Message-Id: <20211215121140.456939-2-eespo...@redhat.com>
Signed-off-by: Kevin Wolf 
---
 include/block/block_int.h | 3 ---
 block.c   | 4 +++-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index f4c75e8ba9..27008cfb22 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1122,9 +1122,6 @@ BlockDriver *bdrv_probe_all(const uint8_t *buf, int 
buf_size,
 void bdrv_parse_filename_strip_prefix(const char *filename, const char *prefix,
   QDict *options);
 
-bool bdrv_backing_overridden(BlockDriverState *bs);
-
-
 /**
  * bdrv_add_aio_context_notifier:
  *
diff --git a/block.c b/block.c
index 0ac5b163d2..10346b5011 100644
--- a/block.c
+++ b/block.c
@@ -103,6 +103,8 @@ static int bdrv_reopen_prepare(BDRVReopenState 
*reopen_state,
 static void bdrv_reopen_commit(BDRVReopenState *reopen_state);
 static void bdrv_reopen_abort(BDRVReopenState *reopen_state);
 
+static bool bdrv_backing_overridden(BlockDriverState *bs);
+
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
@@ -7475,7 +7477,7 @@ static bool append_strong_runtime_options(QDict *d, 
BlockDriverState *bs)
 /* Note: This function may return false positives; it may return true
  * even if opening the backing file specified by bs's image header
  * would result in exactly bs->backing. */
-bool bdrv_backing_overridden(BlockDriverState *bs)
+static bool bdrv_backing_overridden(BlockDriverState *bs)
 {
 if (bs->backing) {
 return strcmp(bs->auto_backing_file,
-- 
2.31.1




Re: [PATCH qemu] spapr: Force 32bit when resetting a core

2022-01-14 Thread Cédric Le Goater

On 1/7/22 14:39, Greg Kurz wrote:

On Fri, 7 Jan 2022 23:19:03 +1100
David Gibson  wrote:


On Fri, Jan 07, 2022 at 12:57:47PM +0100, Greg Kurz wrote:

On Fri, 7 Jan 2022 18:24:23 +1100
Alexey Kardashevskiy  wrote:


"PowerPC Processor binding to IEEE 1275" says in
"8.2.1. Initial Register Values" that the initial state is defined as
32bit so do it for both SLOF and VOF.

This should not cause behavioral change as SLOF switches to 64bit very
early anyway.


Only one CPU goes through SLOF. What about the other ones, including
hot plugged CPUs ?


Those will be started by the start-cpu RTAS call which has its own
semantics.



Ah indeed, there's code in linux/arch/powerpc/kernel/head_64.S to switch
secondaries to 64bit... but then, as noted by Cedric, ppc_cpu_reset(),
which is called earlier, sets MSR_SF but the changelog of commit 8b9f2118ca40
doesn't provide much detail on the motivation. Any idea ?


I found some reference to the commit here but it doesn't seem
to be the root cause :

 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1723914

Thanks,

C.



[PULL 04/16] softmmu: fix device deletion events with -device JSON syntax

2022-01-14 Thread Kevin Wolf
From: Daniel P. Berrangé 

The -device JSON syntax impl leaks a reference on the created
DeviceState instance. As a result when you hot-unplug the
device, the device_finalize method won't be called and thus
it will fail to emit the required DEVICE_DELETED event.

A 'json-cli' feature was previously added against the
'device_add' QMP command QAPI schema to indicate to mgmt
apps that -device supported JSON syntax. Given the hotplug
bug, that feature flag is not usable for its purpose, so
we add a new 'json-cli-hotplug' feature to indicate that
-device supports JSON without breaking hotplug.
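
The effect of the leaked reference can be modelled with a toy reference count. This is only a sketch under the assumption that creation hands one reference to the caller and one to the parent, and that hot-unplug drops the parent's; Obj and the obj_* functions are hypothetical stand-ins, not the QOM API.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy refcounted object; *finalized is set to 1 when the finalizer runs. */
typedef struct Obj {
    int refcnt;
    int *finalized;
} Obj;

static void obj_ref(Obj *o)
{
    o->refcnt++;
}

static void obj_unref(Obj *o)
{
    assert(o->refcnt > 0);
    if (--o->refcnt == 0) {
        *o->finalized = 1; /* stand-in for emitting DEVICE_DELETED */
        free(o);
    }
}

/* Create an object that is attached to a parent: two references exist. */
static Obj *obj_create_attached(int *finalized)
{
    Obj *o = calloc(1, sizeof(*o));

    if (o == NULL) {
        abort();
    }
    o->refcnt = 1;          /* reference returned to the caller */
    o->finalized = finalized;
    obj_ref(o);             /* reference held by the parent */
    return o;
}

/* Hot-unplug: the parent lets go of its reference. */
static void obj_unplug(Obj *o)
{
    obj_unref(o);
}
```

If the caller never drops the reference it was handed, obj_unplug() leaves the count at one and the finalizer never runs; dropping it right after creation, as the patch does with object_unref(), lets the unplug finalize the object.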

Fixes: 5dacda5167560b3af8eadbce5814f60ba44b467e
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/802
Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220105123847.4047954-2-berra...@redhat.com>
Reviewed-by: Laurent Vivier 
Tested-by: Ján Tomko 
Reviewed-by: Thomas Huth 
Signed-off-by: Kevin Wolf 
---
 qapi/qdev.json |  5 -
 softmmu/vl.c   |  4 +++-
 tests/qtest/device-plug-test.c | 19 +++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/qapi/qdev.json b/qapi/qdev.json
index 69656b14df..26cd10106b 100644
--- a/qapi/qdev.json
+++ b/qapi/qdev.json
@@ -44,6 +44,9 @@
 # @json-cli: If present, the "-device" command line option supports JSON
 #syntax with a structure identical to the arguments of this
 #command.
+# @json-cli-hotplug: If present, the "-device" command line option supports 
JSON
+#syntax without the reference counting leak that broke
+#hot-unplug
 #
 # Notes:
 #
@@ -74,7 +77,7 @@
 { 'command': 'device_add',
   'data': {'driver': 'str', '*bus': 'str', '*id': 'str'},
   'gen': false, # so we can get the additional arguments
-  'features': ['json-cli'] }
+  'features': ['json-cli', 'json-cli-hotplug'] }
 
 ##
 # @device_del:
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 207a9eb8be..5e1b35ba48 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2684,6 +2684,7 @@ static void qemu_create_cli_devices(void)
 qemu_opts_foreach(qemu_find_opts("device"),
   device_init_func, NULL, &error_fatal);
 QTAILQ_FOREACH(opt, &device_opts, next) {
+DeviceState *dev;
 loc_push_restore(&opt->loc);
 /*
  * TODO Eventually we should call qmp_device_add() here to make sure it
@@ -2692,7 +2693,8 @@ static void qemu_create_cli_devices(void)
  * from the start, so call qdev_device_add_from_qdict() directly for
  * now.
  */
-qdev_device_add_from_qdict(opt->opts, true, &error_fatal);
+dev = qdev_device_add_from_qdict(opt->opts, true, &error_fatal);
+object_unref(OBJECT(dev));
 loc_pop(&opt->loc);
 }
 rom_reset_order_override();
diff --git a/tests/qtest/device-plug-test.c b/tests/qtest/device-plug-test.c
index 559d47727a..ad79bd4c14 100644
--- a/tests/qtest/device-plug-test.c
+++ b/tests/qtest/device-plug-test.c
@@ -77,6 +77,23 @@ static void test_pci_unplug_request(void)
 qtest_quit(qtest);
 }
 
+static void test_pci_unplug_json_request(void)
+{
+QTestState *qtest = qtest_initf(
+"-device '{\"driver\": \"virtio-mouse-pci\", \"id\": \"dev0\"}'");
+
+/*
+ * Request device removal. As the guest is not running, the request won't
+ * be processed. However during system reset, the removal will be
+ * handled, removing the device.
+ */
+device_del(qtest, "dev0");
+system_reset(qtest);
+wait_device_deleted_event(qtest, "dev0");
+
+qtest_quit(qtest);
+}
+
 static void test_ccw_unplug(void)
 {
 QTestState *qtest = qtest_initf("-device virtio-balloon-ccw,id=dev0");
@@ -145,6 +162,8 @@ int main(int argc, char **argv)
  */
 qtest_add_func("/device-plug/pci-unplug-request",
test_pci_unplug_request);
+qtest_add_func("/device-plug/pci-unplug-json-request",
+   test_pci_unplug_json_request);
 
 if (!strcmp(arch, "s390x")) {
 qtest_add_func("/device-plug/ccw-unplug",
-- 
2.31.1




[PULL 08/16] block-backend: prevent dangling BDS pointers across aio_poll()

2022-01-14 Thread Kevin Wolf
From: Stefan Hajnoczi 

The BlockBackend root child can change when aio_poll() is invoked. This
happens when a temporary filter node is removed upon blockjob
completion, for example.

Functions in block/block-backend.c must be aware of this when using a
blk_bs() pointer across aio_poll() because the BlockDriverState refcnt
may reach 0, resulting in a stale pointer.

One example is scsi_device_purge_requests(), which calls blk_drain() to
wait for in-flight requests to cancel. If the backup blockjob is active,
then the BlockBackend root child is a temporary filter BDS owned by the
blockjob. The blockjob can complete during bdrv_drained_begin() and the
last reference to the BDS is released when the temporary filter node is
removed. This results in a use-after-free when blk_drain() calls
bdrv_drained_end(bs) on the dangling pointer.

Explicitly hold a reference to bs across block APIs that invoke
aio_poll().
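
The pattern can be sketched with a toy model in which a poll step swaps out the backend's root and drops the backend's reference to the old one. Node, Backend and every function below are hypothetical stand-ins rather than the block layer API; the only point is the temporary reference taken around the call that may release the last owner.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int refcnt;
    int quiesce; /* nesting depth of drained sections */
} Node;

typedef struct Backend {
    Node *root; /* the backend owns one reference to root */
} Backend;

static int nodes_freed; /* lets callers observe when a node goes away */

static Node *node_new(void)
{
    Node *n = calloc(1, sizeof(*n));

    if (n == NULL) {
        abort();
    }
    n->refcnt = 1;
    return n;
}

static void node_ref(Node *n)
{
    n->refcnt++;
}

static void node_unref(Node *n)
{
    assert(n->refcnt > 0);
    if (--n->refcnt == 0) {
        nodes_freed++;
        free(n);
    }
}

/* Stand-in for a poll in which a job completes and replaces the root. */
static void poll_replaces_root(Backend *b, Node *replacement)
{
    Node *old = b->root;

    b->root = replacement;
    node_unref(old); /* may be the last reference unless the caller holds one */
}

static void backend_drain(Backend *b, Node *replacement)
{
    Node *n = b->root;

    node_ref(n);                        /* keep n alive across the poll */
    n->quiesce++;                       /* "drained begin" */
    poll_replaces_root(b, replacement);
    n->quiesce--;                       /* "drained end": n is still valid */
    node_unref(n);                      /* now n may really be freed */
}
```

Without the node_ref()/node_unref() pair in backend_drain(), the decrement of n->quiesce would touch memory already freed by poll_replaces_root().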

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2021778
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2036178
Signed-off-by: Stefan Hajnoczi 
Message-Id: <2022053613.25453-2-stefa...@redhat.com>
Signed-off-by: Kevin Wolf 
---
 block/block-backend.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 12ef80ea17..23e727199b 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -822,16 +822,22 @@ BlockBackend *blk_by_public(BlockBackendPublic *public)
 void blk_remove_bs(BlockBackend *blk)
 {
 ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
-BlockDriverState *bs;
 BdrvChild *root;
 
 notifier_list_notify(&blk->remove_bs_notifiers, blk);
 if (tgm->throttle_state) {
-bs = blk_bs(blk);
+BlockDriverState *bs = blk_bs(blk);
+
+/*
+ * Take a ref in case blk_bs() changes across bdrv_drained_begin(), for
+ * example, if a temporary filter node is removed by a blockjob.
+ */
+bdrv_ref(bs);
 bdrv_drained_begin(bs);
 throttle_group_detach_aio_context(tgm);
 throttle_group_attach_aio_context(tgm, qemu_get_aio_context());
 bdrv_drained_end(bs);
+bdrv_unref(bs);
 }
 
 blk_update_root_state(blk);
@@ -1705,6 +1711,7 @@ void blk_drain(BlockBackend *blk)
 BlockDriverState *bs = blk_bs(blk);
 
 if (bs) {
+bdrv_ref(bs);
 bdrv_drained_begin(bs);
 }
 
@@ -1714,6 +1721,7 @@ void blk_drain(BlockBackend *blk)
 
 if (bs) {
 bdrv_drained_end(bs);
+bdrv_unref(bs);
 }
 }
 
@@ -2044,10 +2052,13 @@ static int blk_do_set_aio_context(BlockBackend *blk, 
AioContext *new_context,
 int ret;
 
 if (bs) {
+bdrv_ref(bs);
+
 if (update_root_node) {
 ret = bdrv_child_try_set_aio_context(bs, new_context, blk->root,
  errp);
 if (ret < 0) {
+bdrv_unref(bs);
 return ret;
 }
 }
@@ -2057,6 +2068,8 @@ static int blk_do_set_aio_context(BlockBackend *blk, 
AioContext *new_context,
 throttle_group_attach_aio_context(tgm, new_context);
 bdrv_drained_end(bs);
 }
+
+bdrv_unref(bs);
 }
 
 blk->ctx = new_context;
@@ -2326,11 +2339,13 @@ void blk_io_limits_disable(BlockBackend *blk)
 ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
 assert(tgm->throttle_state);
 if (bs) {
+bdrv_ref(bs);
 bdrv_drained_begin(bs);
 }
 throttle_group_unregister_tgm(tgm);
 if (bs) {
 bdrv_drained_end(bs);
+bdrv_unref(bs);
 }
 }
 
-- 
2.31.1




[PULL 00/16] Block layer patches

2022-01-14 Thread Kevin Wolf
The following changes since commit 67b6526cf042f22521feff5ea521a05d3dd2bf8f:

  Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into 
staging (2022-01-13 13:59:56 +)

are available in the Git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to e5e748739562268ef4063ee77bf53ad7040b25c7:

  iotests/testrunner.py: refactor test_field_width (2022-01-14 12:03:16 +0100)


Block layer patches

- qemu-storage-daemon: Add vhost-user-blk help
- block-backend: Fix use-after-free for BDS pointers after aio_poll()
- qemu-img: Fix sparseness of output image with unaligned ranges
- vvfat: Fix crashes in read-write mode
- Fix device deletion events with -device JSON syntax
- Code cleanups


Daniel P. Berrangé (1):
  softmmu: fix device deletion events with -device JSON syntax

Emanuele Giuseppe Esposito (3):
  block_int: make bdrv_backing_overridden static
  include/sysemu/blockdev.h: remove drive_mark_claimed_by_board and inline 
drive_def
  include/sysemu/blockdev.h: remove drive_get_max_devs

Hanna Reitz (2):
  iotests/stream-error-on-reset: New test
  iotests/308: Fix for CAP_DAC_OVERRIDE

Kevin Wolf (3):
  vvfat: Fix size of temporary qcow file
  vvfat: Fix vvfat_write() for writes before the root directory
  iotests: Test qemu-img convert of zeroed data cluster

Philippe Mathieu-Daudé (3):
  docs: Correct 'vhost-user-blk' spelling
  qemu-storage-daemon: Add vhost-user-blk help
  qapi/block: Restrict vhost-user-blk to CONFIG_VHOST_USER_BLK_SERVER

Stefan Hajnoczi (1):
  block-backend: prevent dangling BDS pointers across aio_poll()

Vladimir Sementsov-Ogievskiy (3):
  qemu-img: make is_allocated_sectors() more efficient
  block: drop BLK_PERM_GRAPH_MOD
  iotests/testrunner.py: refactor test_field_width

 qapi/block-core.json   |   7 +-
 qapi/block-export.json |   6 +-
 qapi/qdev.json |   5 +-
 docs/tools/qemu-storage-daemon.rst |   2 +-
 include/block/block.h  |   9 +-
 include/block/block_int.h  |   3 -
 include/sysemu/blockdev.h  |   3 -
 block.c|  11 +-
 block/block-backend.c  |  19 ++-
 block/commit.c |   1 -
 block/mirror.c |  15 +--
 block/monitor/block-hmp-cmds.c |   2 +-
 block/vvfat.c  |  37 --
 blockdev.c |  24 +---
 hw/block/block.c   |   3 +-
 qemu-img.c |  23 +++-
 softmmu/vl.c   |   8 +-
 storage-daemon/qemu-storage-daemon.c   |  13 ++
 tests/qtest/device-plug-test.c |  19 +++
 scripts/render_block_graph.py  |   1 -
 tests/qemu-iotests/testrunner.py   |  21 ++--
 tests/qemu-iotests/122 |   1 +
 tests/qemu-iotests/122.out |   2 +
 tests/qemu-iotests/273.out |   4 -
 tests/qemu-iotests/308 |  25 +++-
 tests/qemu-iotests/308.out |   2 +-
 tests/qemu-iotests/tests/stream-error-on-reset | 140 +
 tests/qemu-iotests/tests/stream-error-on-reset.out |   5 +
 28 files changed, 307 insertions(+), 104 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/stream-error-on-reset
 create mode 100644 tests/qemu-iotests/tests/stream-error-on-reset.out




[PULL 07/16] qapi/block: Restrict vhost-user-blk to CONFIG_VHOST_USER_BLK_SERVER

2022-01-14 Thread Kevin Wolf
From: Philippe Mathieu-Daudé 

When building QEMU with --disable-vhost-user and using introspection,
query-qmp-schema lists vhost-user-blk even though it's not actually
available:

  { "execute": "query-qmp-schema" }
  {
  "return": [
  ...
  {
  "name": "312",
  "members": [
  {
  "name": "nbd"
  },
  {
  "name": "vhost-user-blk"
  }
  ],
  "meta-type": "enum",
  "values": [
  "nbd",
  "vhost-user-blk"
  ]
  },

Restrict vhost-user-blk in BlockExportType when
CONFIG_VHOST_USER_BLK_SERVER is disabled, so it
doesn't end up listed by query-qmp-schema.
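
The effect of the 'if' condition can be modelled with a short sketch. This is not QEMU's QAPI generator, which resolves such conditions at build time; `visible_members` and the `enabled` set are hypothetical names, used only to show how a conditional enum member drops out of the visible values when its CONFIG symbol is not set.

```python
# Illustrative model only: the real filtering is done by QEMU's QAPI code
# generator at build time, not by a runtime helper like this one.

# Mirrors the 'data' list of BlockExportType after the patch.
BLOCK_EXPORT_TYPE = [
    'nbd',
    {'name': 'vhost-user-blk', 'if': 'CONFIG_VHOST_USER_BLK_SERVER'},
    {'name': 'fuse', 'if': 'CONFIG_FUSE'},
]


def visible_members(data, enabled):
    """Return the member names whose 'if' condition (if any) is enabled."""
    names = []
    for member in data:
        if isinstance(member, str):
            # Unconditional member: always present.
            names.append(member)
        elif member.get('if') is None or member['if'] in enabled:
            names.append(member['name'])
    return names


# A build with FUSE but without the vhost-user-blk server.
print(visible_members(BLOCK_EXPORT_TYPE, {'CONFIG_FUSE'}))
```

With only CONFIG_FUSE in the set, 'vhost-user-blk' is no longer among the returned names, which is the behaviour the patch aims for in the schema.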

Fixes: 90fc91d50b7 ("convert vhost-user-blk server to block export API")
Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20220107105420.395011-4-f4...@amsat.org>
Signed-off-by: Kevin Wolf 
---
 qapi/block-export.json | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/qapi/block-export.json b/qapi/block-export.json
index c1b92ce1c1..f9ce79a974 100644
--- a/qapi/block-export.json
+++ b/qapi/block-export.json
@@ -277,7 +277,8 @@
 # Since: 4.2
 ##
 { 'enum': 'BlockExportType',
-  'data': [ 'nbd', 'vhost-user-blk',
+  'data': [ 'nbd',
+{ 'name': 'vhost-user-blk', 'if': 'CONFIG_VHOST_USER_BLK_SERVER' },
 { 'name': 'fuse', 'if': 'CONFIG_FUSE' } ] }
 
 ##
@@ -319,7 +320,8 @@
   'discriminator': 'type',
   'data': {
   'nbd': 'BlockExportOptionsNbd',
-  'vhost-user-blk': 'BlockExportOptionsVhostUserBlk',
+  'vhost-user-blk': { 'type': 'BlockExportOptionsVhostUserBlk',
+  'if': 'CONFIG_VHOST_USER_BLK_SERVER' },
   'fuse': { 'type': 'BlockExportOptionsFuse',
 'if': 'CONFIG_FUSE' }
} }
-- 
2.31.1




[PULL 09/16] iotests/stream-error-on-reset: New test

2022-01-14 Thread Kevin Wolf
From: Hanna Reitz 

Test the following scenario:
- Simple stream block in two-layer backing chain (base and top)
- The job is drained via blk_drain(), then an error occurs while the job
  settles the ongoing request
- And so the job completes while in blk_drain()

This was reported as a segfault, but is fixed by "block-backend: prevent
dangling BDS pointers across aio_poll()".

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2036178
Signed-off-by: Hanna Reitz 
Signed-off-by: Stefan Hajnoczi 
Message-Id: <2022053613.25453-3-stefa...@redhat.com>
Signed-off-by: Kevin Wolf 
---
 .../qemu-iotests/tests/stream-error-on-reset  | 140 ++
 .../tests/stream-error-on-reset.out   |   5 +
 2 files changed, 145 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/stream-error-on-reset
 create mode 100644 tests/qemu-iotests/tests/stream-error-on-reset.out

diff --git a/tests/qemu-iotests/tests/stream-error-on-reset b/tests/qemu-iotests/tests/stream-error-on-reset
new file mode 100755
index 0000000000..7eaedb24d7
--- /dev/null
+++ b/tests/qemu-iotests/tests/stream-error-on-reset
@@ -0,0 +1,140 @@
+#!/usr/bin/env python3
+# group: rw quick
+#
+# Test what happens when a stream job completes in a blk_drain().
+#
+# Copyright (C) 2022 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os
+import iotests
+from iotests import imgfmt, qemu_img_create, qemu_io_silent, QMPTestCase
+
+
+image_size = 1 * 1024 * 1024
+data_size = 64 * 1024
+base = os.path.join(iotests.test_dir, 'base.img')
+top = os.path.join(iotests.test_dir, 'top.img')
+
+
+# We want to test completing a stream job in a blk_drain().
+#
+# The blk_drain() we are going to use is a virtio-scsi device resetting,
+# which we can trigger by resetting the system.
+#
+# In order to have the block job complete on drain, we (1) throttle its
+# base image so we can start the drain after it has begun, but before it
+# completes, and (2) make it encounter an I/O error on the ensuing write.
+# (If it completes regularly, the completion happens after the drain for
+# some reason.)
+
+class TestStreamErrorOnReset(QMPTestCase):
+    def setUp(self) -> None:
+        """
+        Create two images:
+        - base image {base} with {data_size} bytes allocated
+        - top image {top} without any data allocated
+
+        And the following VM configuration:
+        - base image throttled to {data_size}
+        - top image with a blkdebug configuration so the first write access
+          to it will result in an error
+        - top image is attached to a virtio-scsi device
+        """
+        assert qemu_img_create('-f', imgfmt, base, str(image_size)) == 0
+        assert qemu_io_silent('-c', f'write 0 {data_size}', base) == 0
+        assert qemu_img_create('-f', imgfmt, top, str(image_size)) == 0
+
+        self.vm = iotests.VM()
+        self.vm.add_args('-accel', 'tcg') # Make throttling work properly
+        self.vm.add_object(self.vm.qmp_to_opts({
+            'qom-type': 'throttle-group',
+            'id': 'thrgr',
+            'x-bps-total': str(data_size)
+        }))
+        self.vm.add_blockdev(self.vm.qmp_to_opts({
+            'driver': imgfmt,
+            'node-name': 'base',
+            'file': {
+                'driver': 'throttle',
+                'throttle-group': 'thrgr',
+                'file': {
+                    'driver': 'file',
+                    'filename': base
+                }
+            }
+        }))
+        self.vm.add_blockdev(self.vm.qmp_to_opts({
+            'driver': imgfmt,
+            'node-name': 'top',
+            'file': {
+                'driver': 'blkdebug',
+                'node-name': 'top-blkdebug',
+                'inject-error': [{
+                    'event': 'pwritev',
+                    'immediately': 'true',
+                    'once': 'true'
+                }],
+                'image': {
+                    'driver': 'file',
+                    'filename': top
+                }
+            },
+            'backing': 'base'
+        }))
+        self.vm.add_device(self.vm.qmp_to_opts({
+            'driver': 'virtio-scsi',
+            'id': 'vscsi'
+        }))
+        self.vm.add_device(self.vm.qmp_to_opts({
+            'driver': 'scsi-hd',
+            'bus': 'vscsi.0',
+            'drive': 'top'
+        }))
+        self.vm.launch()
+

[PULL 06/16] qemu-storage-daemon: Add vhost-user-blk help

2022-01-14 Thread Kevin Wolf
From: Philippe Mathieu-Daudé 

Add missing vhost-user-blk help:

  $ qemu-storage-daemon -h
  ...
--export [type=]vhost-user-blk,id=,node-name=,
 addr.type=unix,addr.path=[,writable=on|off]
 [,logical-block-size=][,num-queues=]
   export the specified block node as a
   vhost-user-blk device over UNIX domain socket
--export [type=]vhost-user-blk,id=,node-name=,
 fd,addr.str=[,writable=on|off]
 [,logical-block-size=][,num-queues=]
   export the specified block node as a
   vhost-user-blk device over file descriptor
  ...

Fixes: 90fc91d50b7 ("convert vhost-user-blk server to block export API")
Reported-by: Qing Wang 
Reviewed-by: Eric Blake 
Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20220107105420.395011-3-f4...@amsat.org>
Signed-off-by: Kevin Wolf 
---
 storage-daemon/qemu-storage-daemon.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/storage-daemon/qemu-storage-daemon.c b/storage-daemon/qemu-storage-daemon.c
index 52cf17e8ac..9d76d1114d 100644
--- a/storage-daemon/qemu-storage-daemon.c
+++ b/storage-daemon/qemu-storage-daemon.c
@@ -104,6 +104,19 @@ static void help(void)
 " export the specified block node over FUSE\n"
 "\n"
 #endif /* CONFIG_FUSE */
+#ifdef CONFIG_VHOST_USER_BLK_SERVER
+"  --export [type=]vhost-user-blk,id=,node-name=,\n"
+"   addr.type=unix,addr.path=[,writable=on|off]\n"
+"   [,logical-block-size=][,num-queues=]\n"
+" export the specified block node as a\n"
+" vhost-user-blk device over UNIX domain socket\n"
+"  --export [type=]vhost-user-blk,id=,node-name=,\n"
+"   fd,addr.str=[,writable=on|off]\n"
+"   [,logical-block-size=][,num-queues=]\n"
+" export the specified block node as a\n"
+" vhost-user-blk device over file descriptor\n"
+"\n"
+#endif /* CONFIG_VHOST_USER_BLK_SERVER */
 "  --monitor [chardev=]name[,mode=control][,pretty[=on|off]]\n"
 " configure a QMP monitor\n"
 "\n"
-- 
2.31.1




[PULL 02/16] include/sysemu/blockdev.h: remove drive_mark_claimed_by_board and inline drive_def

2022-01-14 Thread Kevin Wolf
From: Emanuele Giuseppe Esposito 

drive_def is only a particular use case of
qemu_opts_parse_noisily, so it can be inlined.

Also remove drive_mark_claimed_by_board, as it is only declared
but not implemented (nor used) anywhere.

Signed-off-by: Emanuele Giuseppe Esposito 
Message-Id: <20211215121140.456939-3-eespo...@redhat.com>
Signed-off-by: Kevin Wolf 
---
 include/sysemu/blockdev.h  | 2 --
 block/monitor/block-hmp-cmds.c | 2 +-
 blockdev.c | 7 +--
 softmmu/vl.c   | 4 +++-
 4 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
index a750f99b79..ea35c42f5c 100644
--- a/include/sysemu/blockdev.h
+++ b/include/sysemu/blockdev.h
@@ -45,13 +45,11 @@ BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo);
 void override_max_devs(BlockInterfaceType type, int max_devs);
 
 DriveInfo *drive_get(BlockInterfaceType type, int bus, int unit);
-void drive_mark_claimed_by_board(void);
 void drive_check_orphaned(void);
 DriveInfo *drive_get_by_index(BlockInterfaceType type, int index);
 int drive_get_max_bus(BlockInterfaceType type);
 int drive_get_max_devs(BlockInterfaceType type);
 
-QemuOpts *drive_def(const char *optstr);
 QemuOpts *drive_add(BlockInterfaceType type, int index, const char *file,
 const char *optstr);
 DriveInfo *drive_new(QemuOpts *arg, BlockInterfaceType block_default_type,
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 2ac4aedfff..bfb3c043a0 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -101,7 +101,7 @@ void hmp_drive_add(Monitor *mon, const QDict *qdict)
 return;
 }
 
-opts = drive_def(optstr);
+opts = qemu_opts_parse_noisily(qemu_find_opts("drive"), optstr, false);
 if (!opts)
 return;
 
diff --git a/blockdev.c b/blockdev.c
index b5ff9b854e..25b3b202e7 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -197,17 +197,12 @@ static int drive_index_to_unit_id(BlockInterfaceType type, int index)
 return max_devs ? index % max_devs : index;
 }
 
-QemuOpts *drive_def(const char *optstr)
-{
-return qemu_opts_parse_noisily(qemu_find_opts("drive"), optstr, false);
-}
-
 QemuOpts *drive_add(BlockInterfaceType type, int index, const char *file,
 const char *optstr)
 {
 QemuOpts *opts;
 
-opts = drive_def(optstr);
+opts = qemu_opts_parse_noisily(qemu_find_opts("drive"), optstr, false);
 if (!opts) {
 return NULL;
 }
diff --git a/softmmu/vl.c b/softmmu/vl.c
index a8cad43691..207a9eb8be 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2887,7 +2887,9 @@ void qemu_init(int argc, char **argv, char **envp)
 break;
 }
 case QEMU_OPTION_drive:
-if (drive_def(optarg) == NULL) {
+opts = qemu_opts_parse_noisily(qemu_find_opts("drive"),
+   optarg, false);
+if (opts == NULL) {
 exit(1);
 }
 break;
-- 
2.31.1




[PULL 16/16] iotests/testrunner.py: refactor test_field_width

2022-01-14 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

A lot of Optional[] types don't make the code beautiful.
test_field_width defaults to 8, but that is never used in the code.

Moreover, if we want some default behavior for a single call of
test_run(), it should just print the whole test name, not limiting or
expanding its width, so 8 is a bad default.

So, just drop the default as unused for now.
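
As a rough illustration of why a fixed default of 8 is unhelpful, the sketch below pads test names to a caller-supplied width. `format_test_name` and `column_width` are hypothetical helpers, not functions from testrunner.py; they only show the idea of making the width a required argument computed from the actual test names.

```python
def format_test_name(test: str, test_field_width: int) -> str:
    """Left-align the test name in a column of the given width."""
    return f'{test:{test_field_width}}'


def column_width(tests: list) -> int:
    """Pick a width that fits the longest name, plus some padding."""
    return max(len(t) for t in tests) + 2


tests = ['122', '308', 'stream-error-on-reset']
width = column_width(tests)
for name in tests:
    # Brackets make the padding visible in the output.
    print(f'[{format_test_name(name, width)}]')
```

A width of 8 would leave 'stream-error-on-reset' overflowing its column, whereas a required parameter forces every caller to make an explicit choice.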

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20211210201450.101576-1-vsement...@virtuozzo.com>
Reviewed-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/testrunner.py | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index 0feaa396d0..15788f919e 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -174,19 +174,17 @@ def __enter__(self) -> 'TestRunner':
     def __exit__(self, exc_type: Any, exc_value: Any, traceback: Any) -> None:
         self._stack.close()
 
-    def test_print_one_line(self, test: str, starttime: str,
+    def test_print_one_line(self, test: str,
+                            test_field_width: int,
+                            starttime: str,
                             endtime: Optional[str] = None, status: str = '...',
                             lasttime: Optional[float] = None,
                             thistime: Optional[float] = None,
                             description: str = '',
-                            test_field_width: Optional[int] = None,
                             end: str = '\n') -> None:
         """ Print short test info before/after test run """
         test = os.path.basename(test)
 
-        if test_field_width is None:
-            test_field_width = 8
-
         if self.makecheck and status != '...':
             if status and status != 'pass':
                 status = f' [{status}]'
@@ -328,7 +326,7 @@ def do_run_test(self, test: str, mp: bool) -> TestResult:
                           casenotrun=casenotrun)
 
     def run_test(self, test: str,
-                 test_field_width: Optional[int] = None,
+                 test_field_width: int,
                  mp: bool = False) -> TestResult:
         """
         Run one test and print short status
@@ -347,20 +345,21 @@ def run_test(self, test: str,
 
         if not self.makecheck:
             self.test_print_one_line(test=test,
+                                     test_field_width=test_field_width,
                                      status = 'started' if mp else '...',
                                      starttime=start,
                                      lasttime=last_el,
-                                     end = '\n' if mp else '\r',
-                                     test_field_width=test_field_width)
+                                     end = '\n' if mp else '\r')
 
         res = self.do_run_test(test, mp)
 
         end = datetime.datetime.now().strftime('%H:%M:%S')
-        self.test_print_one_line(test=test, status=res.status,
+        self.test_print_one_line(test=test,
+                                 test_field_width=test_field_width,
+                                 status=res.status,
                                  starttime=start, endtime=end,
                                  lasttime=last_el, thistime=res.elapsed,
-                                 description=res.description,
-                                 test_field_width=test_field_width)
+                                 description=res.description)
 
         if res.casenotrun:
             print(res.casenotrun)
-- 
2.31.1




[PULL 10/16] iotests/308: Fix for CAP_DAC_OVERRIDE

2022-01-14 Thread Kevin Wolf
From: Hanna Reitz 

With CAP_DAC_OVERRIDE (which e.g. root generally has), permission checks
will be bypassed when opening files.

308 in one instance tries to open a read-only file (FUSE export) with
qemu-io as read/write, and expects this to fail.  However, when running
it as root, opening will succeed (thanks to CAP_DAC_OVERRIDE) and only
the actual write operation will fail.

Note this as "Case not run", but have the test pass in either case.
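
The three-way decision described here can be modelled as follows. This is a Python sketch of the logic, not the shell code the test actually uses; `classify` is a hypothetical name, and plain substring matching stands in for the test's grep calls.

```python
# Strings taken from the test: the expected failure at open time, and the
# failure seen when the open succeeds but the write is rejected.
REFERENCE = "Could not open 'TEST_DIR/t.IMGFMT': Permission denied"
WRITE_FAILED = "write failed: Permission denied"

OK_LINE = "Writing to read-only export failed: OK"
ERROR_LINE = "Writing to read-only export failed: ERROR"


def classify(output: str):
    """Return (line for the reference output, whether to note 'case not run')."""
    if REFERENCE in output:
        # Normal case: opening the read-only export as R/W was refused.
        return OK_LINE, False
    if WRITE_FAILED in output:
        # Open succeeded (e.g. CAP_DAC_OVERRIDE) and only the write failed:
        # still print OK so the test passes, but flag it as not run.
        return OK_LINE, True
    return ERROR_LINE, False


print(classify("write failed: Permission denied"))
```

Both acceptable outcomes produce the same reference line, so 308.out no longer depends on whether the test runs as root.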

Reported-by: Vladimir Sementsov-Ogievskiy 
Fixes: 2c7dd057aa7bd7a875e9b1a53975c220d6380bc4
   ("export/fuse: Pass default_permissions for mount")
Signed-off-by: Hanna Reitz 
Message-Id: <20220103120014.13061-1-hre...@redhat.com>
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/308 | 25 +++--
 tests/qemu-iotests/308.out |  2 +-
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/308 b/tests/qemu-iotests/308
index 2e3f8f4282..bde4aac2fa 100755
--- a/tests/qemu-iotests/308
+++ b/tests/qemu-iotests/308
@@ -230,8 +230,29 @@ echo '=== Writable export ==='
 fuse_export_add 'export-mp' "'mountpoint': '$EXT_MP', 'writable': true"
 
 # Check that writing to the read-only export fails
-$QEMU_IO -f raw -c 'write -P 42 1M 64k' "$TEST_IMG" 2>&1 \
-| _filter_qemu_io | _filter_testdir | _filter_imgfmt
+output=$($QEMU_IO -f raw -c 'write -P 42 1M 64k' "$TEST_IMG" 2>&1 \
+ | _filter_qemu_io | _filter_testdir | _filter_imgfmt)
+
+# Expected reference output: Opening the file fails because it has no
+# write permission
+reference="Could not open 'TEST_DIR/t.IMGFMT': Permission denied"
+
+if echo "$output" | grep -q "$reference"; then
+echo "Writing to read-only export failed: OK"
+elif echo "$output" | grep -q "write failed: Permission denied"; then
+# With CAP_DAC_OVERRIDE (e.g. when running this test as root), the export
+# can be opened regardless of its file permissions, but writing will then
+# fail.  This is not the result for which we want to test, so count this as
+# a SKIP.
+_casenotrun "Opening RO export as R/W succeeded, perhaps because of" \
+"CAP_DAC_OVERRIDE"
+
+# Still, write this to the reference output to make the test pass
+echo "Writing to read-only export failed: OK"
+else
+echo "Writing to read-only export failed: ERROR"
+echo "$output"
+fi
 
 # But here it should work
 $QEMU_IO -f raw -c 'write -P 42 1M 64k' "$EXT_MP" | _filter_qemu_io
diff --git a/tests/qemu-iotests/308.out b/tests/qemu-iotests/308.out
index fc47bb11a2..e4467a10cf 100644
--- a/tests/qemu-iotests/308.out
+++ b/tests/qemu-iotests/308.out
@@ -95,7 +95,7 @@ virtual size: 0 B (0 bytes)
   'mountpoint': 'TEST_DIR/t.IMGFMT.fuse', 'writable': true
   } }
 {"return": {}}
-qemu-io: can't open device TEST_DIR/t.IMGFMT: Could not open 'TEST_DIR/t.IMGFMT': Permission denied
+Writing to read-only export failed: OK
 wrote 65536/65536 bytes at offset 1048576
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 wrote 65536/65536 bytes at offset 1048576
-- 
2.31.1




[PATCH v5 5/6] hw/arm/virt: Disable highmem devices that don't fit in the PA range

2022-01-14 Thread Marc Zyngier
In order to only keep the highmem devices that actually fit in
the PA range, check their location against the range and update
highest_gpa if they fit. If they don't, mark them as disabled.
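
The placement loop can be sketched in Python as below. It is a simplified model of the C code in the diff, assuming every highmem device starts out enabled (the patch itself ANDs `fits` into the existing flags); the region sizes and the 256 GiB starting base are illustrative values, and `place_highmem` is a hypothetical name.

```python
GiB = 1 << 30
MiB = 1 << 20


def round_up(value: int, align: int) -> int:
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def place_highmem(base: int, memtop: int, pa_bits: int, regions):
    """Place regions in order; return (highest_gpa, {name: fits})."""
    limit = 1 << pa_bits
    # RAM is known to fit, so it is the starting point for highest_gpa.
    highest_gpa = memtop - 1
    enabled = {}
    for name, size in regions:
        base = round_up(base, size)
        fits = base + size <= limit
        if fits:
            highest_gpa = base + size - 1
        enabled[name] = fits
        # The base advances even when the region does not fit.
        base += size
    return highest_gpa, enabled


# Illustrative layout: three highmem regions placed above 256 GiB.
regions = [('redist2', 64 * MiB), ('ecam', 256 * MiB), ('mmio', 512 * GiB)]
print(place_highmem(256 * GiB, 256 * GiB, 39, regions))
```

With 39 PA bits in this example, the two small regions fit while the 512 GiB MMIO window does not, so it is reported as disabled and highest_gpa stays at the end of the ECAM region.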

Signed-off-by: Marc Zyngier 
---
 hw/arm/virt.c | 34 --
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a427676b50..053791cc44 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1712,21 +1712,43 @@ static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
 base = vms->memmap[VIRT_MEM].base + LEGACY_RAMLIMIT_BYTES;
 }
 
+/* We know for sure that at least the memory fits in the PA space */
+vms->highest_gpa = memtop - 1;
+
 for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
 hwaddr size = extended_memmap[i].size;
+bool fits;
 
 base = ROUND_UP(base, size);
 vms->memmap[i].base = base;
 vms->memmap[i].size = size;
+
+/*
+ * Check each device to see if they fit in the PA space,
+ * moving highest_gpa as we go.
+ *
+ * For each device that doesn't fit, disable it.
+ */
+fits = (base + size) <= BIT_ULL(pa_bits);
+if (fits) {
+vms->highest_gpa = base + size - 1;
+}
+
+switch (i) {
+case VIRT_HIGH_GIC_REDIST2:
+vms->highmem_redists &= fits;
+break;
+case VIRT_HIGH_PCIE_ECAM:
+vms->highmem_ecam &= fits;
+break;
+case VIRT_HIGH_PCIE_MMIO:
+vms->highmem_mmio &= fits;
+break;
+}
+
 base += size;
 }
 
-/*
- * If base fits within pa_bits, all good. If it doesn't, limit it
- * to the end of RAM, which is guaranteed to fit within pa_bits.
- */
-vms->highest_gpa = (base <= BIT_ULL(pa_bits) ? base : memtop) - 1;
-
 if (device_memory_size > 0) {
 ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
 ms->device_memory->base = device_memory_base;
-- 
2.30.2




[PULL 11/16] vvfat: Fix size of temporary qcow file

2022-01-14 Thread Kevin Wolf
The size of the qcow file was calculated so that only the FAT partition
would fit on it, but not the whole disk. However, offsets relative to
the whole disk are used to access it, so increase its size to be large
enough for that.
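
A small worked calculation shows the size gap. The geometry (1024 cylinders, 16 heads, 63 sectors per track) and the 63-sector offset to the boot sector are illustrative values chosen for the example, not numbers taken from the patch; `qcow_sizes` is a hypothetical helper.

```python
BDRV_SECTOR_SIZE = 512


def qcow_sizes(cyls: int, heads: int, secs: int, offset_to_bootsector: int):
    """Return (old_size, new_size) in bytes for the temporary qcow file."""
    total_sectors = cyls * heads * secs
    # Old behaviour: only the FAT partition was covered.
    sector_count = total_sectors - offset_to_bootsector
    old_size = sector_count * 512
    # New behaviour: the whole disk is covered.
    new_size = total_sectors * BDRV_SECTOR_SIZE
    return old_size, new_size


old, new = qcow_sizes(1024, 16, 63, 63)
print(old, new, new - old)
```

In this example the old file is 32256 bytes (63 sectors) too small, so an access at a disk-relative offset near the end of the disk would fall outside it.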

Signed-off-by: Kevin Wolf 
Message-Id: <20211209151815.23495-1-kw...@redhat.com>
Signed-off-by: Kevin Wolf 
---
 block/vvfat.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index 5dacc6cfac..36e73d4c64 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1230,6 +1230,7 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
  dirname, cyls, heads, secs));
 
 s->sector_count = cyls * heads * secs - s->offset_to_bootsector;
+bs->total_sectors = cyls * heads * secs;
 
 if (qemu_opt_get_bool(opts, "rw", false)) {
 if (!bdrv_is_read_only(bs)) {
@@ -1250,8 +1251,6 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
 }
 }
 
-bs->total_sectors = cyls * heads * secs;
-
 if (init_directories(s, dirname, heads, secs, errp)) {
 ret = -EIO;
 goto fail;
@@ -3147,8 +3146,8 @@ static int enable_write_target(BlockDriverState *bs, Error **errp)
 }
 
 opts = qemu_opts_create(bdrv_qcow->create_opts, NULL, 0, &error_abort);
-qemu_opt_set_number(opts, BLOCK_OPT_SIZE, s->sector_count * 512,
-&error_abort);
+qemu_opt_set_number(opts, BLOCK_OPT_SIZE,
+bs->total_sectors * BDRV_SECTOR_SIZE, &error_abort);
 qemu_opt_set(opts, BLOCK_OPT_BACKING_FILE, "fat:", &error_abort);
 
 ret = bdrv_create(bdrv_qcow, s->qcow_filename, opts, errp);
-- 
2.31.1




[PULL 14/16] qemu-img: make is_allocated_sectors() more efficient

2022-01-14 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Consider the case when the whole buffer is zero and end is unaligned.

If i <= tail, we return 1 and do one unaligned WRITE, so RMW happens.

If i > tail, we do one aligned WRITE_ZERO (or skip if the target is zeroed)
and again one unaligned WRITE, so RMW happens.

Let's do better: don't fragment the whole-zero buffer and report it as
ZERO: in case of a zeroed target we just do nothing and avoid RMW. If the
target is not zeroed, one unaligned WRITE_ZERO should not be much worse
than one unaligned WRITE.
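
The behaviour can be modelled in Python as below. This is a sketch of the C function, not a drop-in replacement: the buffer is represented as a list of booleans (True meaning the sector is all zeroes), the scan for the leading run is reconstructed from memory of qemu-img.c rather than from this diff, and the function returns a tuple instead of using an output parameter.

```python
def is_allocated_sectors(zero_map, sector_num: int, alignment: int):
    """Return (allocated, pnum) for the leading run of zero_map.

    zero_map[k] is True when sector k of the buffer is all zeroes.
    alignment is in sectors and must be a power of two.
    """
    n = len(zero_map)
    if n <= 0:
        return False, 0

    # Find the length of the leading run of equal-state sectors.
    is_zero = zero_map[0]
    i = 1
    while i < n and zero_map[i] == is_zero:
        i += 1

    if i == n:
        # New early return: the whole buffer is uniform, do not split it.
        return (not is_zero), i

    tail = (sector_num + i) & (alignment - 1)
    if tail:
        if is_zero and i <= tail:
            # A short zero run before data: treat it as data.
            is_zero = False
        if not is_zero:
            # Align the end of a data run up, capped at the buffer size.
            i = min(i + alignment - tail, n)
        else:
            # Align the end of a zero run down.
            i -= tail
    return (not is_zero), i


# Ten zero sectors with an unaligned end, alignment of 8 sectors.
print(is_allocated_sectors([True] * 10, 0, 8))
```

For the all-zero buffer the sketch reports the full ten sectors as zero, whereas without the early return the same input would be cut back to eight sectors by the align-down branch.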

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20211217164654.1184218-3-vsement...@virtuozzo.com>
Tested-by: Peter Lieven 
Signed-off-by: Kevin Wolf 
---
 qemu-img.c | 23 +++
 tests/qemu-iotests/122.out |  8 ++--
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 21ba1e6800..6fe2466032 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1171,19 +1171,34 @@ static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum,
 }
 }
 
+if (i == n) {
+/*
+ * The whole buf is the same.
+ * No reason to split it into chunks, so return now.
+ */
+*pnum = i;
+return !is_zero;
+}
+
 tail = (sector_num + i) & (alignment - 1);
 if (tail) {
 if (is_zero && i <= tail) {
-/* treat unallocated areas which only consist
- * of a small tail as allocated. */
+/*
+ * For sure next sector after i is data, and it will rewrite this
+ * tail anyway due to RMW. So, let's just write data now.
+ */
 is_zero = false;
 }
 if (!is_zero) {
-/* align up end offset of allocated areas. */
+/* If possible, align up end offset of allocated areas. */
 i += alignment - tail;
 i = MIN(i, n);
 } else {
-/* align down end offset of zero areas. */
+/*
+ * For sure next sector after i is data, and it will rewrite this
+ * tail anyway due to RMW. Better is avoid RMW and write zeroes up
+ * to aligned bound.
+ */
 i -= tail;
 }
 }
diff --git a/tests/qemu-iotests/122.out b/tests/qemu-iotests/122.out
index 69b8e8b803..e18766e167 100644
--- a/tests/qemu-iotests/122.out
+++ b/tests/qemu-iotests/122.out
@@ -201,9 +201,7 @@ convert -S 4k
 { "start": 8192, "length": 4096, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
 { "start": 12288, "length": 4096, "depth": 0, "present": false, "zero": true, "data": false},
 { "start": 16384, "length": 4096, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
-{ "start": 20480, "length": 46080, "depth": 0, "present": false, "zero": true, "data": false},
-{ "start": 66560, "length": 1024, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
-{ "start": 67584, "length": 67041280, "depth": 0, "present": false, "zero": true, "data": false}]
+{ "start": 20480, "length": 67088384, "depth": 0, "present": false, "zero": true, "data": false}]
 
 convert -c -S 4k
 [{ "start": 0, "length": 1024, "depth": 0, "present": true, "zero": false, "data": true},
@@ -215,9 +213,7 @@ convert -c -S 4k
 
 convert -S 8k
 [{ "start": 0, "length": 24576, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
-{ "start": 24576, "length": 41984, "depth": 0, "present": false, "zero": true, "data": false},
-{ "start": 66560, "length": 1024, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
-{ "start": 67584, "length": 67041280, "depth": 0, "present": false, "zero": true, "data": false}]
+{ "start": 24576, "length": 67084288, "depth": 0, "present": false, "zero": true, "data": false}]
 
 convert -c -S 8k
 [{ "start": 0, "length": 1024, "depth": 0, "present": true, "zero": false, "data": true},
-- 
2.31.1




[PULL 15/16] block: drop BLK_PERM_GRAPH_MOD

2022-01-14 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

First, this permission never protected a node from being changed, as
generic child-replacing functions don't check it.

Second, it's a strange thing: it represents a permission of the parent node
to change its child. But generally, children are replaced by different
mechanisms, like jobs or qmp commands, not by nodes.

Graph-mod permission is hard to understand. All other permissions
describe operations which are done by the parent node on its child: read,
write, resize. Graph modification operations are something completely
different.

The only place where BLK_PERM_GRAPH_MOD is used as "perm" (not shared
perm) is mirror_start_job, for s->target. Still, modern code should use
bdrv_freeze_backing_chain() to protect from graph modification; if we
don't do it somewhere, it may be considered a bug. So, it's a bit
risky to drop GRAPH_MOD, and analyzing the possible loss of protection
is hard. But one day we should do it, so let's do it now.

One more bit of information is that locking the corresponding byte in
file-posix doesn't make sense at all.
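
The resulting bit layout can be sketched with a Python flag enum. The RESIZE value, the retired 0x10 bit and the new BLK_PERM_ALL come from the diff; the values of the first three members are taken from memory of include/block/block.h and should be read as assumptions, and `is_safe_new_bit` is a hypothetical helper illustrating the caution about reusing the old bit.

```python
from enum import IntFlag


class BlkPerm(IntFlag):
    # The first three values are assumed, not shown in this diff.
    CONSISTENT_READ = 0x01
    WRITE = 0x02
    WRITE_UNCHANGED = 0x04
    RESIZE = 0x08


BLK_PERM_ALL = 0x0f
# Formerly BLK_PERM_GRAPH_MOD; older QEMU may still lock the matching byte.
RETIRED_GRAPH_MOD_BIT = 0x10


def is_safe_new_bit(bit: int) -> bool:
    """A new permission bit must avoid live bits and the retired one."""
    return bit & (BLK_PERM_ALL | RETIRED_GRAPH_MOD_BIT) == 0


all_perms = 0
for perm in BlkPerm:
    all_perms |= int(perm)

print(hex(all_perms), is_safe_new_bit(0x10), is_safe_new_bit(0x20))
```

The mask of the four remaining permissions is 0x0f, matching the new BLK_PERM_ALL, and a future permission would have to skip 0x10 to avoid clashing with locks taken by older QEMU versions.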

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20210902093754.2352-1-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json  |  7 ++-
 include/block/block.h |  9 +
 block.c   |  7 +--
 block/commit.c|  1 -
 block/mirror.c| 15 +++
 hw/block/block.c  |  3 +--
 scripts/render_block_graph.py |  1 -
 tests/qemu-iotests/273.out|  4 
 8 files changed, 12 insertions(+), 35 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index bd0b285245..9a5a3641d0 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1878,14 +1878,11 @@
 #
 # @resize: This permission is required to change the size of a block node.
 #
-# @graph-mod: This permission is required to change the node that this
-# BdrvChild points to.
-#
 # Since: 4.0
 ##
 { 'enum': 'BlockPermission',
-  'data': [ 'consistent-read', 'write', 'write-unchanged', 'resize',
-'graph-mod' ] }
+  'data': [ 'consistent-read', 'write', 'write-unchanged', 'resize' ] }
+
 ##
 # @XDbgBlockGraphEdge:
 #
diff --git a/include/block/block.h b/include/block/block.h
index e5dd22b034..9d4050220b 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -269,12 +269,13 @@ enum {
 BLK_PERM_RESIZE = 0x08,
 
 /**
- * This permission is required to change the node that this BdrvChild
- * points to.
+ * There was a now-removed bit BLK_PERM_GRAPH_MOD, with value of 0x10. QEMU
+ * 6.1 and earlier may still lock the corresponding byte in block/file-posix
+ * locking.  So, implementing some new permission should be very careful to
+ * not interfere with this old unused thing.
  */
-BLK_PERM_GRAPH_MOD  = 0x10,
 
-BLK_PERM_ALL= 0x1f,
+BLK_PERM_ALL= 0x0f,
 
 DEFAULT_PERM_PASSTHROUGH= BLK_PERM_CONSISTENT_READ
  | BLK_PERM_WRITE
diff --git a/block.c b/block.c
index 10346b5011..7b3ce415d8 100644
--- a/block.c
+++ b/block.c
@@ -2485,7 +2485,6 @@ char *bdrv_perm_names(uint64_t perm)
 { BLK_PERM_WRITE,   "write" },
 { BLK_PERM_WRITE_UNCHANGED, "write unchanged" },
 { BLK_PERM_RESIZE,  "resize" },
-{ BLK_PERM_GRAPH_MOD,   "change children" },
 { 0, NULL }
 };
 
@@ -2601,8 +2600,7 @@ static void bdrv_default_perms_for_cow(BlockDriverState *bs, BdrvChild *c,
 shared = 0;
 }
 
-shared |= BLK_PERM_CONSISTENT_READ | BLK_PERM_GRAPH_MOD |
-  BLK_PERM_WRITE_UNCHANGED;
+shared |= BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED;
 
 if (bs->open_flags & BDRV_O_INACTIVE) {
 shared |= BLK_PERM_WRITE | BLK_PERM_RESIZE;
@@ -2720,7 +2718,6 @@ uint64_t bdrv_qapi_perm_to_blk_perm(BlockPermission qapi_perm)
 [BLOCK_PERMISSION_WRITE]= BLK_PERM_WRITE,
 [BLOCK_PERMISSION_WRITE_UNCHANGED]  = BLK_PERM_WRITE_UNCHANGED,
 [BLOCK_PERMISSION_RESIZE]   = BLK_PERM_RESIZE,
-[BLOCK_PERMISSION_GRAPH_MOD]= BLK_PERM_GRAPH_MOD,
 };
 
 QEMU_BUILD_BUG_ON(ARRAY_SIZE(permissions) != BLOCK_PERMISSION__MAX);
@@ -5546,8 +5543,6 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
 update_inherits_from = bdrv_inherits_from_recursive(base, explicit_top);
 
 /* success - we can delete the intermediate states, and link top->base */
-/* TODO Check graph modification op blockers (BLK_PERM_GRAPH_MOD) once
- * we've figured out how they should work. */
 if (!backing_file_str) {
 bdrv_refresh_filename(base);
 backing_file_str = base->filename;
diff --git a/block/commit.c b/block/commit.c
index 10cc5ff451..b1fc7b908b 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -370,7 +370,6 @@ void commit_start(const char

[PULL 13/16] iotests: Test qemu-img convert of zeroed data cluster

2022-01-14 Thread Kevin Wolf
This demonstrates what happens when the block status changes in
sub-min_sparse granularity, but all of the parts are zeroed out. The
alignment logic in is_allocated_sectors() prevents the target image
from remaining fully sparse as expected, and instead turns it into a
data cluster of explicit zeros.

Signed-off-by: Kevin Wolf 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20211217164654.1184218-2-vsement...@virtuozzo.com>
Tested-by: Peter Lieven 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/122 |  1 +
 tests/qemu-iotests/122.out | 10 --
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/122 b/tests/qemu-iotests/122
index efb260d822..be0f6b79e5 100755
--- a/tests/qemu-iotests/122
+++ b/tests/qemu-iotests/122
@@ -251,6 +251,7 @@ $QEMU_IO -c "write -P 0 0 64k" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_test
 $QEMU_IO -c "write 0 1k" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
 $QEMU_IO -c "write 8k 1k" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
 $QEMU_IO -c "write 17k 1k" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
+$QEMU_IO -c "write -P 0 65k 1k" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
 
 for min_sparse in 4k 8k; do
 echo
diff --git a/tests/qemu-iotests/122.out b/tests/qemu-iotests/122.out
index 8fbdac2b39..69b8e8b803 100644
--- a/tests/qemu-iotests/122.out
+++ b/tests/qemu-iotests/122.out
@@ -192,6 +192,8 @@ wrote 1024/1024 bytes at offset 8192
 1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 wrote 1024/1024 bytes at offset 17408
 1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1024/1024 bytes at offset 66560
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
 convert -S 4k
 [{ "start": 0, "length": 4096, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
@@ -199,7 +201,9 @@ convert -S 4k
 { "start": 8192, "length": 4096, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
 { "start": 12288, "length": 4096, "depth": 0, "present": false, "zero": true, "data": false},
 { "start": 16384, "length": 4096, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
-{ "start": 20480, "length": 67088384, "depth": 0, "present": false, "zero": true, "data": false}]
+{ "start": 20480, "length": 46080, "depth": 0, "present": false, "zero": true, "data": false},
+{ "start": 66560, "length": 1024, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 67584, "length": 67041280, "depth": 0, "present": false, "zero": true, "data": false}]
 
 convert -c -S 4k
 [{ "start": 0, "length": 1024, "depth": 0, "present": true, "zero": false, "data": true},
@@ -211,7 +215,9 @@ convert -c -S 4k
 
 convert -S 8k
 [{ "start": 0, "length": 24576, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
-{ "start": 24576, "length": 67084288, "depth": 0, "present": false, "zero": true, "data": false}]
+{ "start": 24576, "length": 41984, "depth": 0, "present": false, "zero": true, "data": false},
+{ "start": 66560, "length": 1024, "depth": 0, "present": true, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 67584, "length": 67041280, "depth": 0, "present": false, "zero": true, "data": false}]
 
 convert -c -S 8k
 [{ "start": 0, "length": 1024, "depth": 0, "present": true, "zero": false, "data": true},
-- 
2.31.1




Re: [PATCH v3 2/3] target/riscv: add support for svinval extension

2022-01-14 Thread Weiwei Li

Thanks for your comments.

On 2022/1/14 at 9:40 PM, Anup Patel wrote:

On Fri, Jan 14, 2022 at 7:11 AM Weiwei Li  wrote:

Signed-off-by: Weiwei Li 
Signed-off-by: Junqiang Wang 
---
  target/riscv/cpu.c  |  1 +
  target/riscv/cpu.h  |  1 +
  target/riscv/insn32.decode  |  7 ++
  target/riscv/insn_trans/trans_svinval.c.inc | 75 +
  target/riscv/translate.c|  1 +
  5 files changed, 85 insertions(+)
  create mode 100644 target/riscv/insn_trans/trans_svinval.c.inc

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index ff6c86c85b..45ac98e06b 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -668,6 +668,7 @@ static Property riscv_cpu_properties[] = {
  DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128),
  DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64),

+DEFINE_PROP_BOOL("svinval", RISCVCPU, cfg.ext_svinval, false),
  DEFINE_PROP_BOOL("svnapot", RISCVCPU, cfg.ext_svnapot, false),

  DEFINE_PROP_BOOL("zba", RISCVCPU, cfg.ext_zba, true),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index d3d17cde82..c3d1845ca1 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -327,6 +327,7 @@ struct RISCVCPU {
  bool ext_counters;
  bool ext_ifencei;
  bool ext_icsr;
+bool ext_svinval;
  bool ext_svnapot;
  bool ext_zfh;
  bool ext_zfhmin;
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 5bbedc254c..7a0351fde2 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -809,3 +809,10 @@ fcvt_l_h   1100010  00010 . ... . 1010011 @r2_rm
  fcvt_lu_h  1100010  00011 . ... . 1010011 @r2_rm
  fcvt_h_l   1101010  00010 . ... . 1010011 @r2_rm
  fcvt_h_lu  1101010  00011 . ... . 1010011 @r2_rm
+
+# *** Svinval Standard Extension ***
+sinval_vma0001011 . . 000 0 1110011 @sfence_vma
+sfence_w_inval0001100 0 0 000 0 1110011
+sfence_inval_ir   0001100 1 0 000 0 1110011
+hinval_vvma   0011011 . . 000 0 1110011 @hfence_vvma

s/0011011/0010011/


+hinval_gvma   0111011 . . 000 0 1110011 @hfence_gvma

s/0111011/0110011/


Sorry, I couldn't find the encodings for the svinval instructions in the 
spec, so I took them from spike 
(https://github.com/riscv-software-src/riscv-isa-sim/blob/master/riscv/encoding.h), 
where they are defined as follows:


#define MATCH_HINVAL_VVMA 0x3673
#define MASK_HINVAL_VVMA 0xfe007fff
#define MATCH_HINVAL_GVMA 0x7673
#define MASK_HINVAL_GVMA 0xfe007fff
Are they not the latest encodings?

diff --git a/target/riscv/insn_trans/trans_svinval.c.inc 
b/target/riscv/insn_trans/trans_svinval.c.inc
new file mode 100644
index 00..1dde665661
--- /dev/null
+++ b/target/riscv/insn_trans/trans_svinval.c.inc
@@ -0,0 +1,75 @@
+/*
+ * RISC-V translation routines for the Svinval Standard Instruction Set.
+ *
+ * Copyright (c) 2020-2021 PLCT lab
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+
+#define REQUIRE_SVINVAL(ctx) do {\
+if (!RISCV_CPU(ctx->cs)->cfg.ext_svinval) {  \
+return false;\
+}\
+} while (0)
+
+static bool trans_sinval_vma(DisasContext *ctx, arg_sinval_vma *a)
+{
+REQUIRE_SVINVAL(ctx);
+/* Do the same as sfence.vma currently */
+REQUIRE_EXT(ctx, RVS);
+#ifndef CONFIG_USER_ONLY
+gen_helper_tlb_flush(cpu_env);
+return true;
+#endif
+return false;
+}
+
+static bool trans_sfence_w_inval(DisasContext *ctx, arg_sfence_w_inval *a)
+{
+REQUIRE_SVINVAL(ctx);
+REQUIRE_EXT(ctx, RVS);
+/* Do nothing currently */
+return true;
+}
+
+static bool trans_sfence_inval_ir(DisasContext *ctx, arg_sfence_inval_ir *a)
+{
+REQUIRE_SVINVAL(ctx);
+REQUIRE_EXT(ctx, RVS);
+/* Do nothing currently */
+return true;
+}
+
+static bool trans_hinval_vvma(DisasContext *ctx, arg_hinval_vvma *a)
+{
+REQUIRE_SVINVAL(ctx);
+/* Do the same as hfence.vvma currently */
+REQUIRE_EXT(ctx, RVH);
+#ifndef CONFIG_USER_ONLY
+gen_helper_hyp_tlb_flush(cpu_env);
+return true;
+#endif
+return false;
+}
+
+static bool trans_hinval_gvma(DisasContext *ctx, arg_hinval_gvma *a)
+{
+REQUIRE_SVINVAL(ctx);
+/* Do the same as hfence.gvma currently */
+REQUIRE_

Re: [PATCH v3 3/3] target/riscv: add support for svpbmt extension

2022-01-14 Thread Anup Patel
On Fri, Jan 14, 2022 at 7:11 AM Weiwei Li  wrote:
>
> It uses two PTE bits, but otherwise has no effect on QEMU, since QEMU is 
> sequentially consistent and doesn't model PMAs currently
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 
> Tested-by: Heiko Stuebner 
> ---
>  target/riscv/cpu.c| 1 +
>  target/riscv/cpu.h| 1 +
>  target/riscv/cpu_bits.h   | 3 +++
>  target/riscv/cpu_helper.c | 9 -
>  4 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 45ac98e06b..4f82bd00a3 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -670,6 +670,7 @@ static Property riscv_cpu_properties[] = {
>
>  DEFINE_PROP_BOOL("svinval", RISCVCPU, cfg.ext_svinval, false),
>  DEFINE_PROP_BOOL("svnapot", RISCVCPU, cfg.ext_svnapot, false),
> +DEFINE_PROP_BOOL("svpbmt", RISCVCPU, cfg.ext_svpbmt, false),
>
>  DEFINE_PROP_BOOL("zba", RISCVCPU, cfg.ext_zba, true),
>  DEFINE_PROP_BOOL("zbb", RISCVCPU, cfg.ext_zbb, true),
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index c3d1845ca1..53f314c752 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -329,6 +329,7 @@ struct RISCVCPU {
>  bool ext_icsr;
>  bool ext_svinval;
>  bool ext_svnapot;
> +bool ext_svpbmt;
>  bool ext_zfh;
>  bool ext_zfhmin;
>
> diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
> index bc23e3b523..ee294c1d0b 100644
> --- a/target/riscv/cpu_bits.h
> +++ b/target/riscv/cpu_bits.h
> @@ -486,7 +486,10 @@ typedef enum {
>  #define PTE_A   0x040 /* Accessed */
>  #define PTE_D   0x080 /* Dirty */
>  #define PTE_SOFT0x300 /* Reserved for Software */
> +#define PTE_RSVD0x1FC0 /* Reserved for future use */
> +#define PTE_PBMT0x6000 /* Page-based memory types */
>  #define PTE_N   0x8000 /* NAPOT translation */
> +#define PTE_ATTR0xFFC0 /* All attributes bits */
>
>  /* Page table PPN shift amount */
>  #define PTE_PPN_SHIFT   10
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 58ab85bca3..f90766e026 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -619,16 +619,23 @@ restart:
>  return TRANSLATE_FAIL;
>  }
>
> -hwaddr ppn = (pte & ~(target_ulong)PTE_N) >> PTE_PPN_SHIFT;
> +hwaddr ppn = (pte & ~(target_ulong)PTE_ATTR) >> PTE_PPN_SHIFT;
>
>  RISCVCPU *cpu = env_archcpu(env);
>  if (!cpu->cfg.ext_svnapot && (pte & PTE_N)) {
>  return TRANSLATE_FAIL;
> +} else if (!cpu->cfg.ext_svpbmt && (pte & PTE_PBMT)) {
> +return TRANSLATE_FAIL;
> +} else if (pte & PTE_RSVD) {
> +return TRANSLATE_FAIL;
>  } else if (!(pte & PTE_V)) {
>  /* Invalid PTE */
>  return TRANSLATE_FAIL;
>  } else if (!(pte & (PTE_R | PTE_W | PTE_X))) {
>  /* Inner PTE, continue walking */
> +if (pte & (PTE_D | PTE_A | PTE_U | PTE_N | PTE_PBMT)) {
> +return TRANSLATE_FAIL;
> +}

I think you should add a patch before PATCH1 to add following:

if (pte & (PTE_D | PTE_A | PTE_U)) {
return TRANSLATE_FAIL;
}

The current PATCH1 should add PTE_N to the comparison and
this patch can add PTE_PBMT to the comparison.

>  base = ppn << PGSHIFT;
>  } else if ((pte & (PTE_R | PTE_W | PTE_X)) == PTE_W) {
>  /* Reserved leaf PTE flags: PTE_W */
> --
> 2.17.1
>

Apart from the minor comment above, it looks good to me.

Reviewed-by: Anup Patel 

Regards,
Anup
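
The page-table-walk checks discussed above can be sketched in isolation as follows. This is only an illustration, not the QEMU code: the helper names are hypothetical, and the full 64-bit mask values are an assumption based on the Sv39/Sv48 PTE layout (N in bit 63, PBMT in bits 62:61, reserved bits 60:54), since the constants quoted in the patch appear shortened in this archive.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low PTE flag bits. */
#define PTE_V 0x001ULL /* Valid */
#define PTE_R 0x002ULL /* Read */
#define PTE_W 0x004ULL /* Write */
#define PTE_X 0x008ULL /* Execute */
#define PTE_U 0x010ULL /* User */
#define PTE_A 0x040ULL /* Accessed */
#define PTE_D 0x080ULL /* Dirty */

/* Assumed 64-bit layout of the upper attribute bits. */
#define PTE_RSVD 0x1FC0000000000000ULL /* bits 60:54, reserved */
#define PTE_PBMT 0x6000000000000000ULL /* bits 62:61, page-based memory types */
#define PTE_N    0x8000000000000000ULL /* bit 63, NAPOT translation */
#define PTE_ATTR (PTE_N | PTE_PBMT | PTE_RSVD)

#define PTE_PPN_SHIFT 10

/* Hypothetical helper: strip all attribute bits before extracting the PPN. */
static uint64_t pte_to_ppn(uint64_t pte)
{
    return (pte & ~PTE_ATTR) >> PTE_PPN_SHIFT;
}

/* Hypothetical helper: a valid PTE with R=W=X=0 points to the next level. */
static bool pte_is_inner(uint64_t pte)
{
    return (pte & PTE_V) && !(pte & (PTE_R | PTE_W | PTE_X));
}

/* Hypothetical helper: bits that must be clear in a non-leaf PTE. */
static bool inner_pte_is_legal(uint64_t pte)
{
    return !(pte & (PTE_D | PTE_A | PTE_U | PTE_N | PTE_PBMT));
}
```

Splitting the check this way mirrors the review suggestion: the D/A/U test can land first, with PTE_N and PTE_PBMT added to the same mask by the later patches.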



[PULL 12/16] vvfat: Fix vvfat_write() for writes before the root directory

2022-01-14 Thread Kevin Wolf
The calculation in sector2cluster() is done relative to the offset of
the root directory. Any writes to blocks before the start of the root
directory (in particular, writes to the FAT) result in negative values,
which are not handled correctly in vvfat_write().

This changes sector2cluster() to return a signed value, and makes sure
that vvfat_write() doesn't try to find mappings for negative cluster
numbers. It clarifies the code in vvfat_write() to make it more obvious
that the cluster numbers can be negative.

Signed-off-by: Kevin Wolf 
Message-Id: <20211209152231.23756-1-kw...@redhat.com>
Signed-off-by: Kevin Wolf 
---
 block/vvfat.c | 30 ++
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index 36e73d4c64..b2b58d93b8 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -882,7 +882,7 @@ static int read_directory(BDRVVVFATState* s, int 
mapping_index)
 return 0;
 }
 
-static inline uint32_t sector2cluster(BDRVVVFATState* s,off_t sector_num)
+static inline int32_t sector2cluster(BDRVVVFATState* s,off_t sector_num)
 {
 return (sector_num - s->offset_to_root_dir) / s->sectors_per_cluster;
 }
@@ -2981,6 +2981,7 @@ static int vvfat_write(BlockDriverState *bs, int64_t 
sector_num,
 {
 BDRVVVFATState *s = bs->opaque;
 int i, ret;
+int first_cluster, last_cluster;
 
 DLOG(checkpoint());
 
@@ -2999,9 +3000,20 @@ DLOG(checkpoint());
 if (sector_num < s->offset_to_fat)
 return -1;
 
-for (i = sector2cluster(s, sector_num);
-i <= sector2cluster(s, sector_num + nb_sectors - 1);) {
-mapping_t* mapping = find_mapping_for_cluster(s, i);
+/*
+ * Values will be negative for writes to the FAT, which is located before
+ * the root directory.
+ */
+first_cluster = sector2cluster(s, sector_num);
+last_cluster = sector2cluster(s, sector_num + nb_sectors - 1);
+
+for (i = first_cluster; i <= last_cluster;) {
+mapping_t *mapping = NULL;
+
+if (i >= 0) {
+mapping = find_mapping_for_cluster(s, i);
+}
+
 if (mapping) {
 if (mapping->read_only) {
 fprintf(stderr, "Tried to write to write-protected file %s\n",
@@ -3041,8 +3053,9 @@ DLOG(checkpoint());
 }
 }
 i = mapping->end;
-} else
+} else {
 i++;
+}
 }
 
 /*
@@ -3056,10 +3069,11 @@ DLOG(fprintf(stderr, "Write to qcow backend: %d + 
%d\n", (int)sector_num, nb_sec
 return ret;
 }
 
-for (i = sector2cluster(s, sector_num);
-i <= sector2cluster(s, sector_num + nb_sectors - 1); i++)
-if (i >= 0)
+for (i = first_cluster; i <= last_cluster; i++) {
+if (i >= 0) {
 s->used_clusters[i] |= USED_ALLOCATED;
+}
+}
 
 DLOG(checkpoint());
 /* TODO: add timeout */
-- 
2.31.1
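
As a standalone illustration of the commit message above, the following sketch shows how the return type of the cluster calculation changes what a caller sees for sectors located before the root directory. The two functions are hypothetical stand-ins for sector2cluster(); their extra parameters replace the BDRVVVFATState fields used by the real function.

```c
#include <stdint.h>

/* Old behaviour (sketch): the negative quotient wraps to a large value. */
static uint32_t sector2cluster_u32(int64_t sector_num,
                                   int64_t offset_to_root_dir,
                                   int64_t sectors_per_cluster)
{
    return (uint32_t)((sector_num - offset_to_root_dir) / sectors_per_cluster);
}

/* New behaviour (sketch): the negative quotient is preserved. */
static int32_t sector2cluster_i32(int64_t sector_num,
                                  int64_t offset_to_root_dir,
                                  int64_t sectors_per_cluster)
{
    return (int32_t)((sector_num - offset_to_root_dir) / sectors_per_cluster);
}
```

With a root directory at sector 100 and 8 sectors per cluster, sector 10 yields -11 from the signed variant and 4294967285 from the unsigned one, so a `>= 0` test applied directly to the unsigned result can never reject it.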




Re: [PATCH v3 2/3] target/riscv: add support for svinval extension

2022-01-14 Thread Anup Patel
On Fri, Jan 14, 2022 at 7:24 PM Weiwei Li  wrote:
>
> Thanks for your comments.
>
> On 2022/1/14 at 9:40 PM, Anup Patel wrote:
>
> On Fri, Jan 14, 2022 at 7:11 AM Weiwei Li  wrote:
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 
> ---
>  target/riscv/cpu.c  |  1 +
>  target/riscv/cpu.h  |  1 +
>  target/riscv/insn32.decode  |  7 ++
>  target/riscv/insn_trans/trans_svinval.c.inc | 75 +
>  target/riscv/translate.c|  1 +
>  5 files changed, 85 insertions(+)
>  create mode 100644 target/riscv/insn_trans/trans_svinval.c.inc
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index ff6c86c85b..45ac98e06b 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -668,6 +668,7 @@ static Property riscv_cpu_properties[] = {
>  DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128),
>  DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64),
>
> +DEFINE_PROP_BOOL("svinval", RISCVCPU, cfg.ext_svinval, false),
>  DEFINE_PROP_BOOL("svnapot", RISCVCPU, cfg.ext_svnapot, false),
>
>  DEFINE_PROP_BOOL("zba", RISCVCPU, cfg.ext_zba, true),
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index d3d17cde82..c3d1845ca1 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -327,6 +327,7 @@ struct RISCVCPU {
>  bool ext_counters;
>  bool ext_ifencei;
>  bool ext_icsr;
> +bool ext_svinval;
>  bool ext_svnapot;
>  bool ext_zfh;
>  bool ext_zfhmin;
> diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
> index 5bbedc254c..7a0351fde2 100644
> --- a/target/riscv/insn32.decode
> +++ b/target/riscv/insn32.decode
> @@ -809,3 +809,10 @@ fcvt_l_h   1100010  00010 . ... . 1010011 @r2_rm
>  fcvt_lu_h  1100010  00011 . ... . 1010011 @r2_rm
>  fcvt_h_l   1101010  00010 . ... . 1010011 @r2_rm
>  fcvt_h_lu  1101010  00011 . ... . 1010011 @r2_rm
> +
> +# *** Svinval Standard Extension ***
> +sinval_vma0001011 . . 000 0 1110011 @sfence_vma
> +sfence_w_inval0001100 0 0 000 0 1110011
> +sfence_inval_ir   0001100 1 0 000 0 1110011
> +hinval_vvma   0011011 . . 000 0 1110011 @hfence_vvma
>
> s/0011011/0010011/
>
> +hinval_gvma   0111011 . . 000 0 1110011 @hfence_gvma
>
> s/0111011/0110011/
>
> Sorry, I couldn't find the encodings for the svinval instructions in the spec,
> so I took them from spike 
> (https://github.com/riscv-software-src/riscv-isa-sim/blob/master/riscv/encoding.h),
>  where they are defined as follows:
>
> #define MATCH_HINVAL_VVMA 0x3673
> #define MASK_HINVAL_VVMA 0xfe007fff
> #define MATCH_HINVAL_GVMA 0x7673
> #define MASK_HINVAL_GVMA 0xfe007fff
> Are they not the latest encodings?

The code in Spike seems to be buggy but that's a separate issue.

Refer to page 138 of
https://github.com/riscv/riscv-isa-manual/releases/download/draft-20220110-eae4f00/riscv-privileged.pdf

Regards,
Anup
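
For readers comparing the decode patterns with encoding.h-style MATCH constants, here is a small sketch of how a funct7 field maps to a 32-bit match value for SYSTEM-opcode instructions. The helper names are hypothetical, and the sketch does not say which funct7 the specification mandates; the referenced manual remains the authority for that.

```c
#include <stdint.h>

/* SYSTEM major opcode (0b1110011), as in the decode lines quoted above. */
#define OPCODE_SYSTEM 0x73u

/*
 * Hypothetical helper: place a 7-bit funct7 in bits 31:25, leaving
 * rs2, rs1, funct3 and rd as zero.
 */
static uint32_t match_from_funct7(uint32_t funct7)
{
    return ((funct7 & 0x7fu) << 25) | OPCODE_SYSTEM;
}

/* Hypothetical helper: recover funct7 from a 32-bit instruction word. */
static uint32_t funct7_of(uint32_t insn)
{
    return insn >> 25;
}
```

Under this mapping, funct7 0010011 gives 0x26000073 and 0011011 gives 0x36000073; likewise 0110011 gives 0x66000073 and 0111011 gives 0x76000073. The mask 0xfe007fff covers funct7, funct3, rd and the opcode, leaving rs1 and rs2 free.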

>
> diff --git a/target/riscv/insn_trans/trans_svinval.c.inc 
> b/target/riscv/insn_trans/trans_svinval.c.inc
> new file mode 100644
> index 00..1dde665661
> --- /dev/null
> +++ b/target/riscv/insn_trans/trans_svinval.c.inc
> @@ -0,0 +1,75 @@
> +/*
> + * RISC-V translation routines for the Svinval Standard Instruction Set.
> + *
> + * Copyright (c) 2020-2021 PLCT lab
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along 
> with
> + * this program.  If not, see .
> + */
> +
> +#define REQUIRE_SVINVAL(ctx) do {\
> +if (!RISCV_CPU(ctx->cs)->cfg.ext_svinval) {  \
> +return false;\
> +}\
> +} while (0)
> +
> +static bool trans_sinval_vma(DisasContext *ctx, arg_sinval_vma *a)
> +{
> +REQUIRE_SVINVAL(ctx);
> +/* Do the same as sfence.vma currently */
> +REQUIRE_EXT(ctx, RVS);
> +#ifndef CONFIG_USER_ONLY
> +gen_helper_tlb_flush(cpu_env);
> +return true;
> +#endif
> +return false;
> +}
> +
> +static bool trans_sfence_w_inval(DisasContext *ctx, arg_sfence_w_inval *a)
> +{
> +REQUIRE_SVINVAL(ctx);
> +REQUIRE_EXT(ctx, RVS);
> +/* Do nothing currently */
> +return true;
> +}
> +
> +static bool trans_sfence_inval_ir(DisasContext *ctx, arg_sfence_inval_ir *a)
> +{
> +REQUIRE_SVINVAL(ctx);
> +REQUIRE_EXT(ctx, RVS);
> +/* Do nothin

Re: [PATCH 1/2] hw/virtio: add boilerplate for vhost-user-gpio device

2022-01-14 Thread Alex Bennée


Viresh Kumar  writes:

> This creates the QEMU side of the vhost-user-gpio device which connects
> to the remote daemon. It is based on the vhost-user-i2c code.
>
> Signed-off-by: Viresh Kumar 

Reviewed-by: Alex Bennée 

-- 
Alex Bennée



[PATCH v5 0/6] target/arm: Reduced-IPA space and highmem fixes

2022-01-14 Thread Marc Zyngier
Here's yet another stab at enabling QEMU on systems with
pathologically reduced IPA ranges such as the Apple M1 (previous
version at [1]). Eventually, we're able to run a KVM guest with more
than just 3GB of RAM on a system with a 36bit IPA space, and at most
123 vCPUs.

This also addresses some pathological QEMU behaviours, where the
highmem property is used as a flag allowing exposure of devices that
can't possibly fit in the PA space of the VM, resulting in a guest
failure.

In the end, we generalise the notion of PA space when exposing
individual devices in the expanded memory map, and treat highmem as
another flavour of PA space restriction.

This series does a few things:

- introduce new attributes to control the enabling of the highmem
  GICv3 redistributors and the highmem PCIe MMIO range

- correctly cap the PA range when highmem is off

- generalise the highmem behaviour to any PA range

- disable each highmem device region that doesn't fit in the PA range

- cleanup uses of highmem outside of virt_set_memmap()

This has been tested on an M1-based Mac-mini running Linux v5.16-rc6
with both KVM and TCG.

* From v4: [1]

  - Moved cpu_type_valid() check before we compute the memory map
  - Drop useless MAX() when computing highest_gpa
  - Fixed more deviations from the QEMU coding style
  - Collected Eric's RBs, with thanks

[1]: https://lore.kernel.org/r/20220107163324.2491209-1-...@kernel.org

Marc Zyngier (6):
  hw/arm/virt: Add a control for the highmem PCIe MMIO
  hw/arm/virt: Add a control for the highmem redistributors
  hw/arm/virt: Honor highmem setting when computing the memory map
  hw/arm/virt: Use the PA range to compute the memory map
  hw/arm/virt: Disable highmem devices that don't fit in the PA range
  hw/arm/virt: Drop superfluous checks against highmem

 hw/arm/virt-acpi-build.c | 10 ++--
 hw/arm/virt.c| 98 ++--
 include/hw/arm/virt.h|  5 +-
 3 files changed, 91 insertions(+), 22 deletions(-)

-- 
2.30.2
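
The PA-range handling outlined in this cover letter can be sketched on its own as below. This is a simplified illustration rather than the code from the series: the helper names are hypothetical, and pa_bits is assumed to be smaller than 64 so that the shift is well defined.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(n) (1ULL << (n))

/* Hypothetical helper: highmem=off behaves like a 32-bit PA space. */
static int effective_pa_bits(bool highmem, int pa_bits)
{
    return highmem ? pa_bits : 32;
}

/* Hypothetical helper: RAM must end at or below the top of the PA space. */
static bool ram_fits(uint64_t memtop, int pa_bits)
{
    return memtop <= BIT_ULL(pa_bits);
}

/*
 * Hypothetical helper: if the extended map (base) fits, use it; otherwise
 * fall back to the end of RAM (memtop), which was already checked to fit.
 */
static uint64_t highest_gpa(uint64_t base, uint64_t memtop, int pa_bits)
{
    return (base <= BIT_ULL(pa_bits) ? base : memtop) - 1;
}
```

For a 36-bit IPA space, an extended map ending exactly at 64 GiB is kept, while one that ends a single byte higher falls back to the end of RAM.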




[PATCH 5/4] tests: acpi: test short OEM_ID/OEM_TABLE_ID values in test_oem_fields()

2022-01-14 Thread Igor Mammedov
Previous patch [1] added explicit whitespace padding to OEM_ID/OEM_TABLE_ID
values used in test_oem_fields() testcase to avoid false positives and
bisection issues when QEMU is switched to '\0' padding. As a result, the
testcase ceased to test values that were shorter than the maximum possible
length.

Update testcase to make sure that it's testing shorter IDs like it
used to before [2].

1) "tests: acpi: manually pad OEM_ID/OEM_TABLE_ID for test_oem_fields() test"
2) 602b458201 ("acpi: Permit OEM ID and OEM table ID fields to be changed")

Signed-off-by: Igor Mammedov 
---
 tests/qtest/bios-tables-test.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index 90c9f6a0a2..ad536fd7b1 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -71,10 +71,10 @@
 
 #define ACPI_REBUILD_EXPECTED_AML "TEST_ACPI_REBUILD_AML"
 
-#define OEM_ID "TEST  "
-#define OEM_TABLE_ID   "OEM "
-#define OEM_TEST_ARGS  "-machine x-oem-id='" OEM_ID "',x-oem-table-id='" \
-   OEM_TABLE_ID "'"
+#define OEM_ID "TEST"
+#define OEM_TABLE_ID   "OEM"
+#define OEM_TEST_ARGS  "-machine x-oem-id=" OEM_ID ",x-oem-table-id=" \
+   OEM_TABLE_ID
 
 typedef struct {
 bool tcg_only;
@@ -1530,8 +1530,8 @@ static void test_oem_fields(test_data *data)
 continue;
 }
 
-g_assert(memcmp(sdt->aml + 10, OEM_ID, 6) == 0);
-g_assert(memcmp(sdt->aml + 16, OEM_TABLE_ID, 8) == 0);
+g_assert(strncmp((char *)sdt->aml + 10, OEM_ID, 6) == 0);
+g_assert(strncmp((char *)sdt->aml + 16, OEM_TABLE_ID, 8) == 0);
 }
 }
 
-- 
2.31.1
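
To illustrate why the comparison changes from memcmp() to strncmp() for short IDs, here is a minimal sketch outside the test harness. The helper name and the sample buffers are hypothetical; only the fixed 6-byte OEM ID width comes from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: compare a fixed-width table field against a
 * possibly shorter expected string. strncmp() stops at the terminating
 * NUL of the expected string, so it never reads past a short literal.
 */
static bool oem_field_matches(const unsigned char *field,
                              const char *expected, size_t field_len)
{
    return strncmp((const char *)field, expected, field_len) == 0;
}
```

A field holding "TEST" followed by two NUL bytes matches the short string "TEST", whereas a space-padded field only matches the explicitly padded "TEST  ".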




[PATCH v5 4/6] hw/arm/virt: Use the PA range to compute the memory map

2022-01-14 Thread Marc Zyngier
The highmem attribute is nothing but another way to express the
PA range of a VM. To support HW that has a smaller PA range than
what QEMU assumes, pass this PA range to the virt_set_memmap()
function, allowing it to correctly exclude highmem devices
if they are outside of the PA range.

Signed-off-by: Marc Zyngier 
---
 hw/arm/virt.c | 64 +--
 1 file changed, 52 insertions(+), 12 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ecc3e3e5b0..a427676b50 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1660,7 +1660,7 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState 
*vms, int idx)
 return arm_cpu_mp_affinity(idx, clustersz);
 }
 
-static void virt_set_memmap(VirtMachineState *vms)
+static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
 {
 MachineState *ms = MACHINE(vms);
 hwaddr base, device_memory_base, device_memory_size, memtop;
@@ -1678,6 +1678,14 @@ static void virt_set_memmap(VirtMachineState *vms)
 exit(EXIT_FAILURE);
 }
 
+/*
+ * !highmem is exactly the same as limiting the PA space to 32bit,
+ * irrespective of the underlying capabilities of the HW.
+ */
+if (!vms->highmem) {
+pa_bits = 32;
+}
+
 /*
  * We compute the base of the high IO region depending on the
  * amount of initial and device memory. The device memory start/size
@@ -1691,8 +1699,9 @@ static void virt_set_memmap(VirtMachineState *vms)
 
 /* Base address of the high IO region */
 memtop = base = device_memory_base + ROUND_UP(device_memory_size, GiB);
-if (!vms->highmem && memtop > 4 * GiB) {
-error_report("highmem=off, but memory crosses the 4GiB limit\n");
+if (memtop > BIT_ULL(pa_bits)) {
+   error_report("Addressing limited to %d bits, but memory exceeds it 
by %llu bytes\n",
+pa_bits, memtop - BIT_ULL(pa_bits));
 exit(EXIT_FAILURE);
 }
 if (base < device_memory_base) {
@@ -1711,7 +1720,13 @@ static void virt_set_memmap(VirtMachineState *vms)
 vms->memmap[i].size = size;
 base += size;
 }
-vms->highest_gpa = (vms->highmem ? base : memtop) - 1;
+
+/*
+ * If base fits within pa_bits, all good. If it doesn't, limit it
+ * to the end of RAM, which is guaranteed to fit within pa_bits.
+ */
+vms->highest_gpa = (base <= BIT_ULL(pa_bits) ? base : memtop) - 1;
+
 if (device_memory_size > 0) {
 ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
 ms->device_memory->base = device_memory_base;
@@ -1902,12 +1917,43 @@ static void machvirt_init(MachineState *machine)
 unsigned int smp_cpus = machine->smp.cpus;
 unsigned int max_cpus = machine->smp.max_cpus;
 
+if (!cpu_type_valid(machine->cpu_type)) {
+error_report("mach-virt: CPU type %s not supported", 
machine->cpu_type);
+exit(1);
+}
+
+possible_cpus = mc->possible_cpu_arch_ids(machine);
+
 /*
  * In accelerated mode, the memory map is computed earlier in kvm_type()
  * to create a VM with the right number of IPA bits.
  */
 if (!vms->memmap) {
-virt_set_memmap(vms);
+Object *cpuobj;
+ARMCPU *armcpu;
+int pa_bits;
+
+/*
+ * Instanciate a temporary CPU object to find out about what
+ * we are about to deal with. Once this is done, get rid of
+ * the object.
+ */
+cpuobj = object_new(possible_cpus->cpus[0].type);
+armcpu = ARM_CPU(cpuobj);
+
+if (object_property_get_bool(cpuobj, "aarch64", NULL)) {
+pa_bits = arm_pamax(armcpu);
+} else if (arm_feature(&armcpu->env, ARM_FEATURE_LPAE)) {
+/* v7 with LPAE */
+pa_bits = 40;
+} else {
+/* Anything else */
+pa_bits = 32;
+}
+
+object_unref(cpuobj);
+
+virt_set_memmap(vms, pa_bits);
 }
 
 /* We can probe only here because during property set
@@ -1915,11 +1961,6 @@ static void machvirt_init(MachineState *machine)
  */
 finalize_gic_version(vms);
 
-if (!cpu_type_valid(machine->cpu_type)) {
-error_report("mach-virt: CPU type %s not supported", 
machine->cpu_type);
-exit(1);
-}
-
 if (vms->secure) {
 /*
  * The Secure view of the world is the same as the NonSecure,
@@ -1989,7 +2030,6 @@ static void machvirt_init(MachineState *machine)
 
 create_fdt(vms);
 
-possible_cpus = mc->possible_cpu_arch_ids(machine);
 assert(possible_cpus->len == max_cpus);
 for (n = 0; n < possible_cpus->len; n++) {
 Object *cpuobj;
@@ -2646,7 +2686,7 @@ static int virt_kvm_type(MachineState *ms, const char 
*type_str)
 max_vm_pa_size = kvm_arm_get_max_vm_ipa_size(ms, &fixed_ipa);
 
 /* we freeze the memory map to compute the highest gpa */
-virt_set_memmap(vms);
+virt_set_memmap(vms, max_vm_pa_size);
 
 requested_pa_size = 64 - clz64(vms->h

[PATCH v5 3/6] hw/arm/virt: Honor highmem setting when computing the memory map

2022-01-14 Thread Marc Zyngier
Even when the VM is configured with highmem=off, the highest_gpa
field includes devices that are above the 4GiB limit.
Similarly, nothing seems to check that the memory is within
the limit set by the highmem=off option.

This leads to failures in virt_kvm_type() on systems that have
a crippled IPA range, as the reported IPA space is larger than
what it should be.

Instead, honor the user-specified limit to only use the devices
at the lowest end of the spectrum, and fail if we have memory
crossing the 4GiB limit.

Reviewed-by: Andrew Jones 
Reviewed-by: Eric Auger 
Signed-off-by: Marc Zyngier 
---
 hw/arm/virt.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index e734a75850..ecc3e3e5b0 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1663,7 +1663,7 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState 
*vms, int idx)
 static void virt_set_memmap(VirtMachineState *vms)
 {
 MachineState *ms = MACHINE(vms);
-hwaddr base, device_memory_base, device_memory_size;
+hwaddr base, device_memory_base, device_memory_size, memtop;
 int i;
 
 vms->memmap = extended_memmap;
@@ -1690,7 +1690,11 @@ static void virt_set_memmap(VirtMachineState *vms)
 device_memory_size = ms->maxram_size - ms->ram_size + ms->ram_slots * GiB;
 
 /* Base address of the high IO region */
-base = device_memory_base + ROUND_UP(device_memory_size, GiB);
+memtop = base = device_memory_base + ROUND_UP(device_memory_size, GiB);
+if (!vms->highmem && memtop > 4 * GiB) {
+error_report("highmem=off, but memory crosses the 4GiB limit\n");
+exit(EXIT_FAILURE);
+}
 if (base < device_memory_base) {
 error_report("maxmem/slots too huge");
 exit(EXIT_FAILURE);
@@ -1707,7 +1711,7 @@ static void virt_set_memmap(VirtMachineState *vms)
 vms->memmap[i].size = size;
 base += size;
 }
-vms->highest_gpa = base - 1;
+vms->highest_gpa = (vms->highmem ? base : memtop) - 1;
 if (device_memory_size > 0) {
 ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
 ms->device_memory->base = device_memory_base;
-- 
2.30.2




[PATCH v5 2/6] hw/arm/virt: Add a control for the highmem redistributors

2022-01-14 Thread Marc Zyngier
Just like we can control the enablement of the highmem PCIe region
using highmem_ecam, let's add a control for the highmem GICv3
redistributor region.

Similarly to highmem_ecam, these redistributors are disabled when
highmem is off.

Reviewed-by: Andrew Jones 
Signed-off-by: Marc Zyngier 
---
 hw/arm/virt-acpi-build.c | 2 ++
 hw/arm/virt.c| 2 ++
 include/hw/arm/virt.h| 4 +++-
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 449fab0080..0757c28f69 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -947,6 +947,8 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables 
*tables)
 acpi_add_table(table_offsets, tables_blob);
 build_fadt_rev5(tables_blob, tables->linker, vms, dsdt);
 
+vms->highmem_redists &= vms->highmem;
+
 acpi_add_table(table_offsets, tables_blob);
 build_madt(tables_blob, tables->linker, vms);
 
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ed8ea96acc..e734a75850 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2106,6 +2106,7 @@ static void machvirt_init(MachineState *machine)
 virt_flash_fdt(vms, sysmem, secure_sysmem ?: sysmem);
 
 vms->highmem_mmio &= vms->highmem;
+vms->highmem_redists &= vms->highmem;
 
 create_gic(vms, sysmem);
 
@@ -2805,6 +2806,7 @@ static void virt_instance_init(Object *obj)
 
 vms->highmem_ecam = !vmc->no_highmem_ecam;
 vms->highmem_mmio = true;
+vms->highmem_redists = true;
 
 if (vmc->no_its) {
 vms->its = false;
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 9c54acd10d..dc9fa26faa 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -144,6 +144,7 @@ struct VirtMachineState {
 bool highmem;
 bool highmem_ecam;
 bool highmem_mmio;
+bool highmem_redists;
 bool its;
 bool tcg_its;
 bool virt;
@@ -190,7 +191,8 @@ static inline int 
virt_gicv3_redist_region_count(VirtMachineState *vms)
 
 assert(vms->gic_version == VIRT_GIC_VERSION_3);
 
-return MACHINE(vms)->smp.cpus > redist0_capacity ? 2 : 1;
+return (MACHINE(vms)->smp.cpus > redist0_capacity &&
+vms->highmem_redists) ? 2 : 1;
 }
 
 #endif /* QEMU_ARM_VIRT_H */
-- 
2.30.2
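
The effect of the new flag on the redistributor region count can be sketched separately from VirtMachineState. The function below is a hypothetical reduction of virt_gicv3_redist_region_count() to plain parameters; the capacity value used in the note afterwards is borrowed from the vCPU limit mentioned in the cover letter and serves only as an illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical reduction: a second (high) redistributor region is only
 * counted when the first one is full AND the high region is enabled.
 */
static int redist_region_count(unsigned int cpus,
                               unsigned int redist0_capacity,
                               bool highmem_redists)
{
    return (cpus > redist0_capacity && highmem_redists) ? 2 : 1;
}
```

With a first-region capacity of 123, a 200-vCPU machine gets two regions when the high region is enabled and a single region otherwise.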




[PATCH v5 1/6] hw/arm/virt: Add a control for the highmem PCIe MMIO

2022-01-14 Thread Marc Zyngier
Just like we can control the enablement of the highmem PCIe ECAM
region using highmem_ecam, let's add a control for the highmem
PCIe MMIO region.

Similarly to highmem_ecam, this region is disabled when highmem
is off.

Signed-off-by: Marc Zyngier 
---
 hw/arm/virt-acpi-build.c | 10 --
 hw/arm/virt.c|  7 +--
 include/hw/arm/virt.h|  1 +
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index f2514ce77c..449fab0080 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -158,10 +158,9 @@ static void acpi_dsdt_add_virtio(Aml *scope,
 }
 
 static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
-  uint32_t irq, bool use_highmem, bool 
highmem_ecam,
-  VirtMachineState *vms)
+  uint32_t irq, VirtMachineState *vms)
 {
-int ecam_id = VIRT_ECAM_ID(highmem_ecam);
+int ecam_id = VIRT_ECAM_ID(vms->highmem_ecam);
 struct GPEXConfig cfg = {
 .mmio32 = memmap[VIRT_PCIE_MMIO],
 .pio= memmap[VIRT_PCIE_PIO],
@@ -170,7 +169,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry 
*memmap,
 .bus= vms->bus,
 };
 
-if (use_highmem) {
+if (vms->highmem_mmio) {
 cfg.mmio64 = memmap[VIRT_HIGH_PCIE_MMIO];
 }
 
@@ -869,8 +868,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 acpi_dsdt_add_fw_cfg(scope, &memmap[VIRT_FW_CFG]);
 acpi_dsdt_add_virtio(scope, &memmap[VIRT_MMIO],
 (irqmap[VIRT_MMIO] + ARM_SPI_BASE), NUM_VIRTIO_TRANSPORTS);
-acpi_dsdt_add_pci(scope, memmap, (irqmap[VIRT_PCIE] + ARM_SPI_BASE),
-  vms->highmem, vms->highmem_ecam, vms);
+acpi_dsdt_add_pci(scope, memmap, irqmap[VIRT_PCIE] + ARM_SPI_BASE, vms);
 if (vms->acpi_dev) {
 build_ged_aml(scope, "\\_SB."GED_DEVICE,
   HOTPLUG_HANDLER(vms->acpi_dev),
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b45b52c90e..ed8ea96acc 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1412,7 +1412,7 @@ static void create_pcie(VirtMachineState *vms)
  mmio_reg, base_mmio, size_mmio);
 memory_region_add_subregion(get_system_memory(), base_mmio, mmio_alias);
 
-if (vms->highmem) {
+if (vms->highmem_mmio) {
 /* Map high MMIO space */
 MemoryRegion *high_mmio_alias = g_new0(MemoryRegion, 1);
 
@@ -1466,7 +1466,7 @@ static void create_pcie(VirtMachineState *vms)
 qemu_fdt_setprop_sized_cells(ms->fdt, nodename, "reg",
  2, base_ecam, 2, size_ecam);
 
-if (vms->highmem) {
+if (vms->highmem_mmio) {
 qemu_fdt_setprop_sized_cells(ms->fdt, nodename, "ranges",
  1, FDT_PCI_RANGE_IOPORT, 2, 0,
  2, base_pio, 2, size_pio,
@@ -2105,6 +2105,8 @@ static void machvirt_init(MachineState *machine)
 
 virt_flash_fdt(vms, sysmem, secure_sysmem ?: sysmem);
 
+vms->highmem_mmio &= vms->highmem;
+
 create_gic(vms, sysmem);
 
 virt_cpu_post_init(vms, sysmem);
@@ -2802,6 +2804,7 @@ static void virt_instance_init(Object *obj)
 vms->gic_version = VIRT_GIC_VERSION_NOSEL;
 
 vms->highmem_ecam = !vmc->no_highmem_ecam;
+vms->highmem_mmio = true;
 
 if (vmc->no_its) {
 vms->its = false;
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index dc6b66ffc8..9c54acd10d 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -143,6 +143,7 @@ struct VirtMachineState {
 bool secure;
 bool highmem;
 bool highmem_ecam;
+bool highmem_mmio;
 bool its;
 bool tcg_its;
 bool virt;
-- 
2.30.2




[PATCH v5 6/6] hw/arm/virt: Drop superfluous checks against highmem

2022-01-14 Thread Marc Zyngier
Now that the devices present in the extended memory map are checked
against the available PA space and disabled when they don't fit,
there is no need to keep the same checks against highmem, as
highmem really is a shortcut for the PA space being 32bit.

Reviewed-by: Eric Auger 
Signed-off-by: Marc Zyngier 
---
 hw/arm/virt-acpi-build.c | 2 --
 hw/arm/virt.c| 5 +
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 0757c28f69..449fab0080 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -947,8 +947,6 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables 
*tables)
 acpi_add_table(table_offsets, tables_blob);
 build_fadt_rev5(tables_blob, tables->linker, vms, dsdt);
 
-vms->highmem_redists &= vms->highmem;
-
 acpi_add_table(table_offsets, tables_blob);
 build_madt(tables_blob, tables->linker, vms);
 
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 053791cc44..4524f3807d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2171,9 +2171,6 @@ static void machvirt_init(MachineState *machine)
 
 virt_flash_fdt(vms, sysmem, secure_sysmem ?: sysmem);
 
-vms->highmem_mmio &= vms->highmem;
-vms->highmem_redists &= vms->highmem;
-
 create_gic(vms, sysmem);
 
 virt_cpu_post_init(vms, sysmem);
@@ -2192,7 +2189,7 @@ static void machvirt_init(MachineState *machine)
machine->ram_size, "mach-virt.tag");
 }
 
-vms->highmem_ecam &= vms->highmem && (!firmware_loaded || aarch64);
+vms->highmem_ecam &= (!firmware_loaded || aarch64);
 
 create_rtc(vms);
 
-- 
2.30.2
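
The commit message above relies on high device regions being disabled when they do not fit in the available PA space. A minimal sketch of what such a fit check could look like follows; the helper name `region_fits` and its parameters are hypothetical and are not taken from this series.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): report whether the region
 * [base, base + size) lies entirely inside a physical address space that
 * is pa_bits wide. Assumes 0 < pa_bits <= 64.
 */
static bool region_fits(uint64_t base, uint64_t size, unsigned pa_bits)
{
    /* Highest valid address; computed this way to avoid shifting by 64 */
    uint64_t max_addr = (pa_bits >= 64) ? UINT64_MAX
                                        : ((UINT64_C(1) << pa_bits) - 1);

    if (size == 0) {
        return base <= max_addr;
    }
    /* Reject sizes larger than the whole address space */
    if (size - 1 > max_addr) {
        return false;
    }
    /* Equivalent to base + size - 1 <= max_addr, without overflow */
    return base <= max_addr - (size - 1);
}
```

With a check of this shape, a 32-bit PA space rejects every region placed above 4GiB, which is why a separate test against highmem becomes redundant.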




Re: [PATCH v3 2/3] target/riscv: add support for svinval extension

2022-01-14 Thread Weiwei Li



On 2022/1/14 10:01 PM, Anup Patel wrote:

On Fri, Jan 14, 2022 at 7:24 PM Weiwei Li  wrote:

Thanks for your comments.

On 2022/1/14 9:40 PM, Anup Patel wrote:

On Fri, Jan 14, 2022 at 7:11 AM Weiwei Li  wrote:

Signed-off-by: Weiwei Li 
Signed-off-by: Junqiang Wang 
---
  target/riscv/cpu.c  |  1 +
  target/riscv/cpu.h  |  1 +
  target/riscv/insn32.decode  |  7 ++
  target/riscv/insn_trans/trans_svinval.c.inc | 75 +
  target/riscv/translate.c|  1 +
  5 files changed, 85 insertions(+)
  create mode 100644 target/riscv/insn_trans/trans_svinval.c.inc

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index ff6c86c85b..45ac98e06b 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -668,6 +668,7 @@ static Property riscv_cpu_properties[] = {
  DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128),
  DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64),

+DEFINE_PROP_BOOL("svinval", RISCVCPU, cfg.ext_svinval, false),
  DEFINE_PROP_BOOL("svnapot", RISCVCPU, cfg.ext_svnapot, false),

  DEFINE_PROP_BOOL("zba", RISCVCPU, cfg.ext_zba, true),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index d3d17cde82..c3d1845ca1 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -327,6 +327,7 @@ struct RISCVCPU {
  bool ext_counters;
  bool ext_ifencei;
  bool ext_icsr;
+bool ext_svinval;
  bool ext_svnapot;
  bool ext_zfh;
  bool ext_zfhmin;
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 5bbedc254c..7a0351fde2 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -809,3 +809,10 @@ fcvt_l_h   1100010  00010 . ... . 1010011 @r2_rm
  fcvt_lu_h  1100010  00011 . ... . 1010011 @r2_rm
  fcvt_h_l   1101010  00010 . ... . 1010011 @r2_rm
  fcvt_h_lu  1101010  00011 . ... . 1010011 @r2_rm
+
+# *** Svinval Standard Extension ***
+sinval_vma0001011 . . 000 0 1110011 @sfence_vma
+sfence_w_inval0001100 0 0 000 0 1110011
+sfence_inval_ir   0001100 1 0 000 0 1110011
+hinval_vvma   0011011 . . 000 0 1110011 @hfence_vvma

s/0011011/0010011/

+hinval_gvma   0111011 . . 000 0 1110011 @hfence_gvma

s/0111011/0110011/

Sorry, I couldn't find the encodings for the svinval instructions in the spec,
so I took them from Spike
(https://github.com/riscv-software-src/riscv-isa-sim/blob/master/riscv/encoding.h),
which defines them as follows:

#define MATCH_HINVAL_VVMA 0x3673
#define MASK_HINVAL_VVMA 0xfe007fff
#define MATCH_HINVAL_GVMA 0x7673
#define MASK_HINVAL_GVMA 0xfe007fff
Are they not the latest encodings?

The code in Spike seems to be buggy but that's a separate issue.

Refer to page 138 of
https://github.com/riscv/riscv-isa-manual/releases/download/draft-20220110-eae4f00/riscv-privileged.pdf

Regards,
Anup


OK. Thanks a lot. I'll fix this.

Regards,

Weiwei Li
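
The funct7 corrections discussed above can be cross-checked against a MATCH-style constant with a little arithmetic. The sketch below assumes the usual 32-bit R-type layout (funct7 in bits 31:25, opcode in bits 6:0) with rs2, rs1, funct3 and rd all zero; the helper names are made up for illustration.

```c
#include <stdint.h>

/* SYSTEM major opcode (binary 1110011) used by these instructions */
#define OPCODE_SYSTEM 0x73u

/*
 * Hypothetical helper: build the fixed bits of a SYSTEM instruction from
 * its funct7 field, leaving rs2, rs1, funct3 and rd as zero.
 */
static uint32_t match_from_funct7(uint32_t funct7)
{
    return ((funct7 & 0x7fu) << 25) | OPCODE_SYSTEM;
}

/* Hypothetical helper: recover funct7 (bits 31:25) from an instruction word */
static uint32_t funct7_of(uint32_t insn)
{
    return insn >> 25;
}
```

For example, funct7 0010011 (0x13) gives 0x26000073 and funct7 0110011 (0x33) gives 0x66000073, while 0011011 (0x1b) and 0111011 (0x3b) give 0x36000073 and 0x76000073.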


diff --git a/target/riscv/insn_trans/trans_svinval.c.inc 
b/target/riscv/insn_trans/trans_svinval.c.inc
new file mode 100644
index 00..1dde665661
--- /dev/null
+++ b/target/riscv/insn_trans/trans_svinval.c.inc
@@ -0,0 +1,75 @@
+/*
+ * RISC-V translation routines for the Svinval Standard Instruction Set.
+ *
+ * Copyright (c) 2020-2021 PLCT lab
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+
+#define REQUIRE_SVINVAL(ctx) do {\
+if (!RISCV_CPU(ctx->cs)->cfg.ext_svinval) {  \
+return false;\
+}\
+} while (0)
+
+static bool trans_sinval_vma(DisasContext *ctx, arg_sinval_vma *a)
+{
+REQUIRE_SVINVAL(ctx);
+/* Do the same as sfence.vma currently */
+REQUIRE_EXT(ctx, RVS);
+#ifndef CONFIG_USER_ONLY
+gen_helper_tlb_flush(cpu_env);
+return true;
+#endif
+return false;
+}
+
+static bool trans_sfence_w_inval(DisasContext *ctx, arg_sfence_w_inval *a)
+{
+REQUIRE_SVINVAL(ctx);
+REQUIRE_EXT(ctx, RVS);
+/* Do nothing currently */
+return true;
+}
+
+static bool trans_sfence_inval_ir(DisasContext *ctx, arg_sfence_inval_ir *a)
+{
+REQUIRE_SVINVAL(ctx);
+REQUIRE_EXT(ctx, RVS);
+/* Do nothing currently */
+return true;
+}
+
+static bool trans_hinval_vvma(DisasContext *ctx, arg_hinval_vvma *a)
+{
+REQUIRE_SVINVAL(ct

Re: [RFC PATCH] block/file-posix: Remove a deprecation warning on macOS 12

2022-01-14 Thread Philippe Mathieu-Daudé via

On 14/1/22 15:09, Hanna Reitz wrote:

On 06.01.22 00:56, Philippe Mathieu-Daudé wrote:

When building on macOS 12 we get:

   ../block/file-posix.c:3335:18: warning: 'IOMasterPort' is 
deprecated: first deprecated in macOS 12.0 [-Wdeprecated-declarations]

   kernResult = IOMasterPort( MACH_PORT_NULL, &masterPort );
    ^~~~
    IOMainPort

Use IOMainPort (define it to IOMasterPort on macOS < 12),
and replace 'master' by 'main' in a variable name.

Signed-off-by: Philippe Mathieu-Daudé 
---
  block/file-posix.c | 13 +
  1 file changed, 9 insertions(+), 4 deletions(-)


I hope the [RFC] tag isn’t directed at me.

Still, I can give my comment, of course.


diff --git a/block/file-posix.c b/block/file-posix.c
index b283093e5b..0dcfce1856 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3324,17 +3324,22 @@ BlockDriver bdrv_file = {
  #if defined(__APPLE__) && defined(__MACH__)
  static kern_return_t GetBSDPath(io_iterator_t mediaIterator, char 
*bsdPath,

  CFIndex maxPathSize, int flags);
+
+#if !defined(MAC_OS_VERSION_12_0)


So AFAIU from my quick rather fruit-less googling, this macro is defined 
(to some version-defining integer) on every macOS version starting from 
12.0?  (Just confirming because the name could also mean it’d be defined 
only on 12.0.)


Thanks, I posted up to v3 and macOS users helped me, I will post a v4 soon.

v3: 
https://lore.kernel.org/qemu-devel/20220110131001.614319-1-f4...@amsat.org/



+#define IOMainPort IOMasterPort
+#endif
+
  static char *FindEjectableOpticalMedia(io_iterator_t *mediaIterator)
  {
  kern_return_t kernResult = KERN_FAILURE;
-    mach_port_t masterPort;
+    mach_port_t mainPort;
  CFMutableDictionaryRef  classesToMatch;
  const char *matching_array[] = {kIODVDMediaClass, kIOCDMediaClass};
  char *mediaType = NULL;
-    kernResult = IOMasterPort( MACH_PORT_NULL, &masterPort );
+    kernResult = IOMainPort(MACH_PORT_NULL, &mainPort);
  if ( KERN_SUCCESS != kernResult ) {
-    printf( "IOMasterPort returned %d\n", kernResult );
+    printf("IOMainPort returned %d\n", kernResult);
  }
  int index;
@@ -3347,7 +3352,7 @@ static char 
*FindEjectableOpticalMedia(io_iterator_t *mediaIterator)

  }
  CFDictionarySetValue(classesToMatch, 
CFSTR(kIOMediaEjectableKey),

   kCFBooleanTrue);
-    kernResult = IOServiceGetMatchingServices(masterPort, 
classesToMatch,
+    kernResult = IOServiceGetMatchingServices(mainPort, 
classesToMatch,

    mediaIterator);
  if (kernResult != KERN_SUCCESS) {
  error_report("Note: IOServiceGetMatchingServices 
returned %d",


“Looks good to me” ← here’s the comment you requested O:)


Thanks :)




Re: [PATCH 1/2] hw/virtio: add boilerplate for vhost-user-gpio device

2022-01-14 Thread Alex Bennée


Viresh Kumar  writes:

> This creates the QEMU side of the vhost-user-gpio device which connects
> to the remote daemon. It is based on vhost-user-i2c code.
>
> Signed-off-by: Viresh Kumar 

> +++ b/include/hw/virtio/vhost-user-gpio.h
> @@ -0,0 +1,35 @@
> +/*
> + * Vhost-user GPIO virtio device
> + *
> + * Copyright (c) 2021 Viresh Kumar 
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#ifndef _QEMU_VHOST_USER_GPIO_H
> +#define _QEMU_VHOST_USER_GPIO_H
> +
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/vhost.h"
> +#include "hw/virtio/vhost-user.h"
> +#include "standard-headers/linux/virtio_gpio.h"

Hmm this fails:

  In file included from ../../hw/virtio/vhost-user-gpio.c:13:
  /home/alex/lsrc/qemu.git/include/hw/virtio/vhost-user-gpio.h:15:10: fatal 
error: standard-headers/linux/virtio_gpio.h: No such file or directory
 15 | #include "standard-headers/linux/virtio_gpio.h"
|  ^~
  compilation terminated.

The usual solution is to create a patch that imports the headers using:

  ./scripts/update-linux-headers.sh

either from the current mainline (or your own tree if the feature is in
flight) and mark the patch clearly as not for merging.

-- 
Alex Bennée



Re: [PATCH v3 3/3] target/riscv: add support for svpbmt extension

2022-01-14 Thread Weiwei Li



On 2022/1/14 9:59 PM, Anup Patel wrote:

On Fri, Jan 14, 2022 at 7:11 AM Weiwei Li  wrote:

It uses two PTE bits, but otherwise has no effect on QEMU, since QEMU is 
sequentially consistent and doesn't model PMAs currently

Signed-off-by: Weiwei Li 
Signed-off-by: Junqiang Wang 
Tested-by: Heiko Stuebner 
---
  target/riscv/cpu.c| 1 +
  target/riscv/cpu.h| 1 +
  target/riscv/cpu_bits.h   | 3 +++
  target/riscv/cpu_helper.c | 9 -
  4 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 45ac98e06b..4f82bd00a3 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -670,6 +670,7 @@ static Property riscv_cpu_properties[] = {

  DEFINE_PROP_BOOL("svinval", RISCVCPU, cfg.ext_svinval, false),
  DEFINE_PROP_BOOL("svnapot", RISCVCPU, cfg.ext_svnapot, false),
+DEFINE_PROP_BOOL("svpbmt", RISCVCPU, cfg.ext_svpbmt, false),

  DEFINE_PROP_BOOL("zba", RISCVCPU, cfg.ext_zba, true),
  DEFINE_PROP_BOOL("zbb", RISCVCPU, cfg.ext_zbb, true),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index c3d1845ca1..53f314c752 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -329,6 +329,7 @@ struct RISCVCPU {
  bool ext_icsr;
  bool ext_svinval;
  bool ext_svnapot;
+bool ext_svpbmt;
  bool ext_zfh;
  bool ext_zfhmin;

diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index bc23e3b523..ee294c1d0b 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -486,7 +486,10 @@ typedef enum {
  #define PTE_A   0x040 /* Accessed */
  #define PTE_D   0x080 /* Dirty */
  #define PTE_SOFT0x300 /* Reserved for Software */
+#define PTE_RSVD0x1FC0 /* Reserved for future use */
+#define PTE_PBMT0x6000 /* Page-based memory types */
  #define PTE_N   0x8000 /* NAPOT translation */
+#define PTE_ATTR0xFFC0 /* All attributes bits */

  /* Page table PPN shift amount */
  #define PTE_PPN_SHIFT   10
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 58ab85bca3..f90766e026 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -619,16 +619,23 @@ restart:
  return TRANSLATE_FAIL;
  }

-hwaddr ppn = (pte & ~(target_ulong)PTE_N) >> PTE_PPN_SHIFT;
+hwaddr ppn = (pte & ~(target_ulong)PTE_ATTR) >> PTE_PPN_SHIFT;

  RISCVCPU *cpu = env_archcpu(env);
  if (!cpu->cfg.ext_svnapot && (pte & PTE_N)) {
  return TRANSLATE_FAIL;
+} else if (!cpu->cfg.ext_svpbmt && (pte & PTE_PBMT)) {
+return TRANSLATE_FAIL;
+} else if (pte & PTE_RSVD) {
+return TRANSLATE_FAIL;
  } else if (!(pte & PTE_V)) {
  /* Invalid PTE */
  return TRANSLATE_FAIL;
  } else if (!(pte & (PTE_R | PTE_W | PTE_X))) {
  /* Inner PTE, continue walking */
+if (pte & (PTE_D | PTE_A | PTE_U | PTE_N | PTE_PBMT)) {
+return TRANSLATE_FAIL;
+}

I think you should add a patch before PATCH1 to add following:

if (pte & (PTE_D | PTE_A | PTE_U)) {
 return TRANSLATE_FAIL;
}

The current PATCH1 should add PTE_N to the comparison and
this patch can add PTE_PBMT to the comparison.

OK. I'll update this.

  base = ppn << PGSHIFT;
  } else if ((pte & (PTE_R | PTE_W | PTE_X)) == PTE_W) {
  /* Reserved leaf PTE flags: PTE_W */
--
2.17.1


Apart from the minor comment above, it looks good to me.

Reviewed-by: Anup Patel 

Regards,
Anup


Regards,

Weiwei Li
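
To illustrate why the PPN extraction in the patch masks all attribute bits rather than only the N bit, here is a standalone sketch. It assumes the RV64 PTE layout (N in bit 63, PBMT in bits 62:61, reserved bits 60:54, PPN starting at bit 10); the `DEMO_` names are hypothetical and only mirror the macros in the patch.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the patch's macros, assuming an RV64 PTE */
#define DEMO_PTE_N         (UINT64_C(1) << 63)     /* NAPOT translation */
#define DEMO_PTE_PBMT      (UINT64_C(3) << 61)     /* Page-based memory types */
#define DEMO_PTE_RSVD      (UINT64_C(0x7f) << 54)  /* Reserved for future use */
#define DEMO_PTE_ATTR      (DEMO_PTE_N | DEMO_PTE_PBMT | DEMO_PTE_RSVD)
#define DEMO_PTE_PPN_SHIFT 10

/* Strip every attribute bit above the PPN before shifting it down */
static uint64_t demo_pte_ppn(uint64_t pte)
{
    return (pte & ~DEMO_PTE_ATTR) >> DEMO_PTE_PPN_SHIFT;
}
```

If only the N bit were cleared, a PTE with PBMT bits set would yield a PPN with stray high bits, so the computed physical address would be wrong.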




Re: [RFC PATCH] block/file-posix: Remove a deprecation warning on macOS 12

2022-01-14 Thread Hanna Reitz

On 06.01.22 00:56, Philippe Mathieu-Daudé wrote:

When building on macOS 12 we get:

   ../block/file-posix.c:3335:18: warning: 'IOMasterPort' is deprecated: first 
deprecated in macOS 12.0 [-Wdeprecated-declarations]
   kernResult = IOMasterPort( MACH_PORT_NULL, &masterPort );
^~~~
IOMainPort

Use IOMainPort (define it to IOMasterPort on macOS < 12),
and replace 'master' by 'main' in a variable name.

Signed-off-by: Philippe Mathieu-Daudé 
---
  block/file-posix.c | 13 +
  1 file changed, 9 insertions(+), 4 deletions(-)


I hope the [RFC] tag isn’t directed at me.

Still, I can give my comment, of course.


diff --git a/block/file-posix.c b/block/file-posix.c
index b283093e5b..0dcfce1856 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3324,17 +3324,22 @@ BlockDriver bdrv_file = {
  #if defined(__APPLE__) && defined(__MACH__)
  static kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
  CFIndex maxPathSize, int flags);
+
+#if !defined(MAC_OS_VERSION_12_0)


So AFAIU from my quick rather fruit-less googling, this macro is defined 
(to some version-defining integer) on every macOS version starting from 
12.0?  (Just confirming because the name could also mean it’d be defined 
only on 12.0.)



+#define IOMainPort IOMasterPort
+#endif
+
  static char *FindEjectableOpticalMedia(io_iterator_t *mediaIterator)
  {
  kern_return_t kernResult = KERN_FAILURE;
-mach_port_t masterPort;
+mach_port_t mainPort;
  CFMutableDictionaryRef  classesToMatch;
  const char *matching_array[] = {kIODVDMediaClass, kIOCDMediaClass};
  char *mediaType = NULL;
  
-kernResult = IOMasterPort( MACH_PORT_NULL, &masterPort );

+kernResult = IOMainPort(MACH_PORT_NULL, &mainPort);
  if ( KERN_SUCCESS != kernResult ) {
-printf( "IOMasterPort returned %d\n", kernResult );
+printf("IOMainPort returned %d\n", kernResult);
  }
  
  int index;

@@ -3347,7 +3352,7 @@ static char *FindEjectableOpticalMedia(io_iterator_t 
*mediaIterator)
  }
  CFDictionarySetValue(classesToMatch, CFSTR(kIOMediaEjectableKey),
   kCFBooleanTrue);
-kernResult = IOServiceGetMatchingServices(masterPort, classesToMatch,
+kernResult = IOServiceGetMatchingServices(mainPort, classesToMatch,
mediaIterator);
  if (kernResult != KERN_SUCCESS) {
  error_report("Note: IOServiceGetMatchingServices returned %d",


“Looks good to me” ← here’s the comment you requested O:)

Hanna




Re: [PATCH 2/2] hw/virtio: add vhost-user-gpio-pci boilerplate

2022-01-14 Thread Alex Bennée


Viresh Kumar  writes:

> This allows us to instantiate a vhost-user-gpio device as part of a PCI
> bus. It is mostly boilerplate which looks pretty similar to the
> vhost-user-fs-pci device.
>
> Signed-off-by: Viresh Kumar 

Reviewed-by: Alex Bennée 

-- 
Alex Bennée



Re: [PATCH qemu] spapr: Force 32bit when resetting a core

2022-01-14 Thread Cédric Le Goater

On 1/10/22 03:52, Alexey Kardashevskiy wrote:



On 08/01/2022 00:39, Greg Kurz wrote:

On Fri, 7 Jan 2022 23:19:03 +1100
David Gibson  wrote:


On Fri, Jan 07, 2022 at 12:57:47PM +0100, Greg Kurz wrote:

On Fri, 7 Jan 2022 18:24:23 +1100
Alexey Kardashevskiy  wrote:


"PowerPC Processor binding to IEEE 1275" says in
"8.2.1. Initial Register Values" that the initial state is defined as
32bit so do it for both SLOF and VOF.

This should not cause behavioral change as SLOF switches to 64bit very
early anyway.


Only one CPU goes through SLOF. What about the other ones, including
hot plugged CPUs ?


Those will be started by the start-cpu RTAS call which has its own
semantics.



Ah indeed, there's code in linux/arch/powerpc/kernel/head_64.S to switch
secondaries to 64bit... but then, as noted by Cedric, ppc_cpu_reset(),
which is called earlier sets MSR_SF but the changelog of commit 8b9f2118ca40
doesn't provide much details on the motivation. Any idea ?


https://patchwork.kernel.org/project/qemu-devel/patch/1458121432-2855-1-git-send-email-lviv...@redhat.com/

this is probably it:

===
Reset is properly defined as an exception (0x100). For exceptions, the
970MP user manual for example says:

4.5 Exception Definitions
When an exception/interrupt is taken, all bits in the MSR are set to
‘0’, with the following exceptions:
• Exceptions always set MSR[SF] to ‘1’.
===

but it looks like the above is about emulating a bare-metal 970 rather than a
pseries VCPU, so that quote does not apply to spapr.


Yes, more info here :

  
https://patchwork.kernel.org/project/qemu-devel/patch/1458121432-2855-1-git-send-email-lviv...@redhat.com/

mac99+970 only boots with a 64bit kernel. 32bit kernels are not supported because
of the use of the rfi instruction which was removed in v2.01. 32bit user
space is supported though.

However I was not able to build a disk with a compatible boot partition
for OpenBIOS. The above support only applies to a kernel loaded in memory.
Maybe Mark knows how to do this?

Anyhow, I didn't see any regression on PAPR with this patch, TCG or KVM.

Thanks,

C.
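
For readers following the MSR[SF] discussion, a small standalone sketch of what "forcing 32bit" means at the bit level. It assumes SF is the most significant bit of the 64-bit MSR; the `demo_` helpers are hypothetical and are not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR[SF] (sixty-four-bit mode) is assumed to be the top bit of the MSR */
#define DEMO_MSR_SF_BIT 63

/* Hypothetical helper: clear SF so the CPU starts in 32bit mode */
static uint64_t demo_msr_force_32bit(uint64_t msr)
{
    return msr & ~(UINT64_C(1) << DEMO_MSR_SF_BIT);
}

/* Hypothetical helper: report whether SF is currently set */
static bool demo_msr_is_64bit(uint64_t msr)
{
    return (msr >> DEMO_MSR_SF_BIT) & 1;
}
```

All other MSR bits are left untouched, so a reset path that sets SF by default can be followed by a machine-specific step that clears it again.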



Re: [PATCH 5/4] tests: acpi: test short OEM_ID/OEM_TABLE_ID values in test_oem_fields()

2022-01-14 Thread Ani Sinha
On Fri, Jan 14, 2022 at 7:57 PM Igor Mammedov  wrote:

> Previous patch [1] added explicit whitespace padding to OEM_ID/OEM_TABLE_ID
> values used in test_oem_fields() testcase to avoid false positive and
> bisection issues when QEMU is switched to '\0' padding. As a result
> testcase ceased to test values that were shorter than max possible
> length values.
>
> Update testcase to make sure that it's testing shorter IDs like it
> used to before [2].
>
> 1) "tests: acpi: manually pad OEM_ID/OEM_TABLE_ID for  test_oem_fields()
> test"
> 2) 602b458201 ("acpi: Permit OEM ID and OEM table ID fields to be changed")
>
> Signed-off-by: Igor Mammedov 


Reviewed-by: Ani Sinha 



>


Re: [PULL 07/16] qapi/block: Restrict vhost-user-blk to CONFIG_VHOST_USER_BLK_SERVER

2022-01-14 Thread Philippe Mathieu-Daudé via

On 14/1/22 14:52, Kevin Wolf wrote:

From: Philippe Mathieu-Daudé 

When building QEMU with --disable-vhost-user and using introspection,
query-qmp-schema lists vhost-user-blk even though it's not actually
available:

   { "execute": "query-qmp-schema" }
   {
   "return": [
   ...
   {
   "name": "312",
   "members": [
   {
   "name": "nbd"
   },
   {
   "name": "vhost-user-blk"
   }
   ],
   "meta-type": "enum",
   "values": [
   "nbd",
   "vhost-user-blk"
   ]
   },

Restrict vhost-user-blk in BlockExportType when
CONFIG_VHOST_USER_BLK_SERVER is disabled, so it
doesn't end up listed by query-qmp-schema.

Fixes: 90fc91d50b7 ("convert vhost-user-blk server to block export API")
Signed-off-by: Philippe Mathieu-Daudé 
Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20220107105420.395011-4-f4...@amsat.org>
Signed-off-by: Kevin Wolf 
---
  qapi/block-export.json | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/qapi/block-export.json b/qapi/block-export.json
index c1b92ce1c1..f9ce79a974 100644
--- a/qapi/block-export.json
+++ b/qapi/block-export.json
@@ -277,7 +277,8 @@
  # Since: 4.2
  ##
  { 'enum': 'BlockExportType',
-  'data': [ 'nbd', 'vhost-user-blk',
+  'data': [ 'nbd',
+{ 'name': 'vhost-user-blk', 'if': 'CONFIG_VHOST_USER_BLK_SERVER' },
  { 'name': 'fuse', 'if': 'CONFIG_FUSE' } ] }


Markus asked to split this line:
https://lore.kernel.org/qemu-devel/87zgny37s8@dusky.pond.sub.org/
I will add a cleanup patch, no need to cancel this PR for that ;)



Re: [RFC PATCH] block/file-posix: Remove a deprecation warning on macOS 12

2022-01-14 Thread Hanna Reitz

On 14.01.22 15:15, Philippe Mathieu-Daudé wrote:

On 14/1/22 15:09, Hanna Reitz wrote:

On 06.01.22 00:56, Philippe Mathieu-Daudé wrote:

When building on macOS 12 we get:

   ../block/file-posix.c:3335:18: warning: 'IOMasterPort' is 
deprecated: first deprecated in macOS 12.0 [-Wdeprecated-declarations]

   kernResult = IOMasterPort( MACH_PORT_NULL, &masterPort );
    ^~~~
    IOMainPort

Use IOMainPort (define it to IOMasterPort on macOS < 12),
and replace 'master' by 'main' in a variable name.

Signed-off-by: Philippe Mathieu-Daudé 
---
  block/file-posix.c | 13 +
  1 file changed, 9 insertions(+), 4 deletions(-)


I hope the [RFC] tag isn’t directed at me.

Still, I can give my comment, of course.


diff --git a/block/file-posix.c b/block/file-posix.c
index b283093e5b..0dcfce1856 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3324,17 +3324,22 @@ BlockDriver bdrv_file = {
  #if defined(__APPLE__) && defined(__MACH__)
  static kern_return_t GetBSDPath(io_iterator_t mediaIterator, char 
*bsdPath,

  CFIndex maxPathSize, int flags);
+
+#if !defined(MAC_OS_VERSION_12_0)


So AFAIU from my quick rather fruit-less googling, this macro is 
defined (to some version-defining integer) on every macOS version 
starting from 12.0?  (Just confirming because the name could also 
mean it’d be defined only on 12.0.)


Thanks, I posted up to v3 and macOS users helped me, I will post a v4 
soon.


v3: 
https://lore.kernel.org/qemu-devel/20220110131001.614319-1-f4...@amsat.org/


I see.  The MAC_OS_X_VERSION_M{IN,AX}_REQUIRED thing was exactly what I 
didn’t really understand from said googling, but the important thing is 
that you do.  (Something to do with what runtime is actually in use 
rather than what the system can provide?  Well, I’ll just stop asking.)  O:)


Hanna
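
The difference between "the version macro is defined" and "the minimum required version is at least 12.0" can be shown with a portable sketch. Everything below uses hypothetical stand-in names (`DEMO_VERSION_12_0`, `DEMO_MIN_REQUIRED`, `demo_main_port`, `demo_master_port`) instead of the real macOS headers, and the numeric values are assumptions chosen for illustration.

```c
/* Hypothetical stand-ins for the SDK's availability macros */
#define DEMO_VERSION_12_0 120000  /* defined by any SDK that knows about 12.0 */
#define DEMO_MIN_REQUIRED 110000  /* oldest system the build must still run on */

/* Stand-ins for the deprecated and the replacement API */
const char *demo_master_port(void) { return "old API"; }
const char *demo_main_port(void)   { return "new API"; }

/*
 * Fall back to the old name when the SDK predates 12.0 (macro undefined)
 * or when the build still targets systems older than 12.0.
 */
#if !defined(DEMO_VERSION_12_0) || (DEMO_MIN_REQUIRED < DEMO_VERSION_12_0)
#define demo_main_port demo_master_port
#endif

/* Callers always use the new name; the macro decides what it resolves to */
const char *demo_get_port(void)
{
    return demo_main_port();
}
```

With the values above the fallback is taken even though the version macro is defined, because the minimum required version is lower than 12.0; testing only `defined(...)` would miss that case.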




Re: [PATCH v5 12/12] docs/devel: Add documentation for the DMA control interface

2022-01-14 Thread Francisco Iglesias
On [2022 Jan 07] Fri 16:07:17, Peter Maydell wrote:
> On Tue, 14 Dec 2021 at 11:04, Francisco Iglesias
>  wrote:
> >
> > Also, since being the author, list myself as maintainer for the file.
> >
> > Signed-off-by: Francisco Iglesias 
> 
> 
> > +DmaCtrlIfClass
> > +--
> > +
> > +The ``DmaCtrlIfClass`` contains the interface methods that can be
> > +implemented by a DMA engine.
> > +
> > +.. code-block:: c
> > +
> > +typedef struct DmaCtrlIfClass {
> > +InterfaceClass parent;
> > +
> > +/*
> > + * read: Start a read transfer on the DMA engine implementing the 
> > DMA
> > + * control interface
> > + *
> > + * @dma_ctrl: the DMA engine implementing this interface
> > + * @addr: the address to read
> > + * @len: the number of bytes to read at 'addr'
> > + */
> 
> The prototype seems to be missing here.
> 
> > +} DmaCtrlIfClass;
> > +
> > +
> > +dma_ctrl_if_read
> > +
> > +
> > +The ``dma_ctrl_if_read`` function is used from a model embedding the DMA 
> > engine
> > +for starting DMA read transfers.
> > +
> > +.. code-block:: c
> > +
> > +/*
> > + * Start a read transfer on a DMA engine implementing the DMA control
> > + * interface.
> > + *
> > + * @dma_ctrl: the DMA engine implementing this interface
> > + * @addr: the address to read
> > + * @len: the number of bytes to read at 'addr'
> > + */
> > +void dma_ctrl_if_read(DmaCtrlIf *dma, hwaddr addr, uint32_t len);

Hi Peter,

> 
> The method says it "starts" the transfer. How does the thing on the
> end of the DMA control interface find out when the transfer completes,
> or if there were any errors ?

Yes, I can see that the above is not clear enough at the moment. I'll attempt to
improve and fix this in v6! I'll also correct the other issues you found in the
series!

Thank you very much for reviewing again!

Best regards,
Francisco

> 
> thanks
> -- PMM


