date:20210820

Re: [PATCH v3 00/16] tcg/mips: Unaligned access and other cleanup

2021-08-20 Thread Huacai Chen

Hi, Jiaxun,

I'm not familiar with TCG, please review, thanks.

Huacai

On Thu, Aug 19, 2021 at 6:09 AM Philippe Mathieu-Daudé  wrote:
>
> Sorry, use Huacai's newer email .
>
> On Thu, Aug 19, 2021 at 12:07 AM Philippe Mathieu-Daudé  
> wrote:
> >
> > Cc'ing Jiaxun & Huacai.
> >
> > On 8/18/21 10:19 PM, Richard Henderson wrote:
> > > Based-on: <20210818191920.390759-1-richard.hender...@linaro.org>
> > > ("[PATCH v3 00/66] Unaligned access for user-only")
> > >
> > > Important points:
> > >   * Support unaligned accesses.
> > >   * Drop requirement for 256MB alignment of code_gen_buffer.
> > >   * Improvements to tcg_out_movi:
> > > - Have a tb-relative register for mips64, reducing the
> > >   code size for most pointers,
> > > - Try a few 3-insn sequences,
> > > - Drop everything else into a constant pool.
> > >
> > >
> > > r~
> > >
> > >
> > > Richard Henderson (16):
> > >   tcg/mips: Support unaligned access for user-only
> > >   tcg/mips: Support unaligned access for softmmu
> > >   tcg/mips: Drop inline markers
> > >   tcg/mips: Move TCG_AREG0 to S8
> > >   tcg/mips: Move TCG_GUEST_BASE_REG to S7
> > >   tcg/mips: Unify TCG_GUEST_BASE_REG tests
> > >   tcg/mips: Allow JAL to be out of range in tcg_out_bswap_subr
> > >   tcg/mips: Unset TCG_TARGET_HAS_direct_jump
> > >   tcg/mips: Drop special alignment for code_gen_buffer
> > >   tcg/mips: Create and use TCG_REG_TB
> > >   tcg/mips: Split out tcg_out_movi_one
> > >   tcg/mips: Split out tcg_out_movi_two
> > >   tcg/mips: Use the constant pool for 64-bit constants
> > >   tcg/mips: Aggressively use the constant pool for n64 calls
> > >   tcg/mips: Try tb-relative addresses in tcg_out_movi
> > >   tcg/mips: Try three insns with shift and add in tcg_out_movi
> > >
> > >  tcg/mips/tcg-target.h |  17 +-
> > >  tcg/region.c  |  91 -
> > >  tcg/mips/tcg-target.c.inc | 730 +++---
> > >  3 files changed, 604 insertions(+), 234 deletions(-)
> > >
> >

qemu-devel@nongnu.org

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/20/21 8:07 AM, Gerd Hoffmann wrote:
>   Hi,
> 
>> This also seems to me to be the tail wagging the dog. If we think
>> 'info mtree' has too much duplicate information (which it certainly
>> does) then we should make mtree_info() smarter about reducing that
>> duplication. Off the top of my head, we could change the code that
>> prints ASes to do something like:
> 
>>qemu_printf("...same as address-space %s\n", name);
> 
> Neat idea.
> 
> Having 'info mtree' accept an (optional) 'name' parameter to pick an
> address space to be printed would be useful too.

Yeah, for now I am thinking of a match string (for my use cases):

  (qemu) info mtree -a dma # all address spaces matching *dma*

Re: [PATCH 3/4] target/arm/cpu64: Replace kvm_supported with sve_vq_supported

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/19/21 9:37 PM, Andrew Jones wrote:
> Now that we have an ARMCPU member sve_vq_supported we no longer
> need the local kvm_supported bitmap for KVM's supported vector
> lengths.
> 
> Signed-off-by: Andrew Jones 
> ---
>  target/arm/cpu64.c | 19 +++
>  1 file changed, 11 insertions(+), 8 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH 4/7] hw/adc: Make adci[*] R/W in NPCM7XX ADC

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/14/21 1:33 AM, Hao Wu wrote:
> Our sensor test requires both reading and writing from a sensor's
> QOM property. So we need to make the input of ADC module R/W instead
> of read only for that to work.
> 
> Signed-off-by: Hao Wu 
> Reviewed-by: Titus Rwantare 
> ---
>  hw/adc/npcm7xx_adc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/adc/npcm7xx_adc.c b/hw/adc/npcm7xx_adc.c
> index 47fb9e5f74..bc6f3f55e6 100644
> --- a/hw/adc/npcm7xx_adc.c
> +++ b/hw/adc/npcm7xx_adc.c
> @@ -242,7 +242,7 @@ static void npcm7xx_adc_init(Object *obj)
>  
>  for (i = 0; i < NPCM7XX_ADC_NUM_INPUTS; ++i) {
>  object_property_add_uint32_ptr(obj, "adci[*]",
> -&s->adci[i], OBJ_PROP_FLAG_WRITE);
> +&s->adci[i], OBJ_PROP_FLAG_READWRITE);

What about:

  qtest_enabled() ? OBJ_PROP_FLAG_READWRITE : OBJ_PROP_FLAG_WRITE

>  }
>  object_property_add_uint32_ptr(obj, "vref",
>  &s->vref, OBJ_PROP_FLAG_WRITE);
>

Re: [PATCH] hw/arm/smmuv3: Support non-PCI/PCIe devices connection

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/20/21 4:36 AM, Li, Chunming wrote:
> The current SMMU V3 device model only support PCI/PCIe devices,
> so we update it to support non-PCI/PCIe devices.
> 
> hw/arm/smmuv3:
> . Create IOMMU memory regions for non-PCI/PCIe devices based on their 
> SID
> . Add sid-map property to store non-PCI/PCIe devices SID
> . Update implementation of CFGI commands based on device SID
> hw/arm/smmu-common:
> . Differentiate PCI/PCIe and non-PCI/PCIe devices SID getting strategy
> hw/arm/virt:
> . Add PL330 DMA controller and connect with SMMUv3 for testing
> . Add smmuv3_sidmap for non-PCI/PCIe devices SID setting
> 
> Signed-off-by: Chunming Li 
> Signed-off-by: Renwei Liu 
> ---
> This patch depends on PL330 memory region connection patch:
> https://patchew.org/QEMU/4c23c17b8e87e74e906a25a3254a03f4fa1fe...@shasxm03.verisilicon.com/
> 
>  hw/arm/smmuv3.c  |  75 ++--
>  hw/arm/virt.c| 110 ++-
>  include/hw/arm/smmu-common.h |  12 +++-
>  include/hw/arm/smmuv3.h  |   2 +
>  include/hw/arm/virt.h|   3 +
>  5 files changed, 181 insertions(+), 21 deletions(-)
> 
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 01b60bee4..c4da05d8b 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -32,6 +32,7 @@
>  #include "hw/arm/smmuv3.h"
>  #include "smmuv3-internal.h"
>  #include "smmu-internal.h"
> +#include "hw/qdev-properties.h"
>  
>  /**
>   * smmuv3_trigger_irq - pulse @irq if enabled and update
> @@ -612,7 +613,7 @@ static SMMUTransCfg *smmuv3_get_config(SMMUDevice *sdev, 
> SMMUEventInfo *event)
>  return cfg;
>  }
>  
> -static void smmuv3_flush_config(SMMUDevice *sdev)
> +static void __attribute__((unused)) smmuv3_flush_config(SMMUDevice *sdev)
>  {

Why keep this function if unused?

RE: [PATCH] hw/arm/smmuv3: Support non-PCI/PCIe devices connection

2021-08-20 Thread Li, Chunming


> On 8/20/21 4:36 AM, Li, Chunming wrote:
> > The current SMMU V3 device model only support PCI/PCIe devices,
> > so we update it to support non-PCI/PCIe devices.
> >
> > hw/arm/smmuv3:
> > . Create IOMMU memory regions for non-PCI/PCIe devices based
> on their SID
> > . Add sid-map property to store non-PCI/PCIe devices SID
> > . Update implementation of CFGI commands based on device SID
> > hw/arm/smmu-common:
> > . Differentiate PCI/PCIe and non-PCI/PCIe devices SID getting
> strategy
> > hw/arm/virt:
> > . Add PL330 DMA controller and connect with SMMUv3 for
> testing
> > . Add smmuv3_sidmap for non-PCI/PCIe devices SID setting
> >
> > Signed-off-by: Chunming Li 
> > Signed-off-by: Renwei Liu 
> > ---
> > This patch depends on PL330 memory region connection patch:
> >
> https://patchew.org/QEMU/4C23C17B8E87E74E906A25A3254A03F4FA1FEC31@SHASX
> M03.verisilicon.com/
> >
> >  hw/arm/smmuv3.c  |  75 ++--
> >  hw/arm/virt.c| 110
> ++-
> >  include/hw/arm/smmu-common.h |  12 +++-
> >  include/hw/arm/smmuv3.h  |   2 +
> >  include/hw/arm/virt.h|   3 +
> >  5 files changed, 181 insertions(+), 21 deletions(-)
> >
> > diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> > index 01b60bee4..c4da05d8b 100644
> > --- a/hw/arm/smmuv3.c
> > +++ b/hw/arm/smmuv3.c
> > @@ -32,6 +32,7 @@
> >  #include "hw/arm/smmuv3.h"
> >  #include "smmuv3-internal.h"
> >  #include "smmu-internal.h"
> > +#include "hw/qdev-properties.h"
> >
> >  /**
> >   * smmuv3_trigger_irq - pulse @irq if enabled and update
> > @@ -612,7 +613,7 @@ static SMMUTransCfg *smmuv3_get_config(SMMUDevice
> *sdev, SMMUEventInfo *event)
> >  return cfg;
> >  }
> >
> > -static void smmuv3_flush_config(SMMUDevice *sdev)
> > +static void __attribute__((unused)) smmuv3_flush_config(SMMUDevice
> *sdev)
> >  {
> 
> Why keep this function if unused?
"smmuv3_flush_config" is useless after our modification. The modification is 
verified by PL330 DMA and PCIe network.
But we are not sure if it has potential risk. So we hope maintainer can help to 
check, then we can remove it.

Re: [qemu-web PATCH] Add a blog post about FUSE block exports

2021-08-20 Thread Hanna Reitz


On 19.08.21 18:23, Stefan Hajnoczi wrote:

On Thu, Aug 19, 2021 at 12:25:01PM +0200, Hanna Reitz wrote:

This post explains when FUSE block exports are useful, how they work,
and that it is fun to export an image file on its own path so it looks
like your image file (in whatever format it was) is a raw image now.

Signed-off-by: Hanna Reitz 
---
You can also find this patch here:
https://gitlab.com/hreitz/qemu-web fuse-blkexport-v1

My first patch to qemu-web, so I hope I am not doing anything overly
stupid here (adding SVGs with extremely long lines comes to mind)...
---
  _posts/2021-08-18-fuse-blkexport.md   | 488 ++
  screenshots/2021-08-18-block-graph-a.svg  |   2 +
  screenshots/2021-08-18-block-graph-b.svg  |   2 +
  screenshots/2021-08-18-block-graph-c.svg  |   2 +
  screenshots/2021-08-18-block-graph-d.svg  |   2 +
  screenshots/2021-08-18-block-graph-e.svg  |   2 +
  screenshots/2021-08-18-root-directory.svg |   2 +
  screenshots/2021-08-18-root-file.svg  |   2 +
  8 files changed, 502 insertions(+)
  create mode 100644 _posts/2021-08-18-fuse-blkexport.md
  create mode 100644 screenshots/2021-08-18-block-graph-a.svg
  create mode 100644 screenshots/2021-08-18-block-graph-b.svg
  create mode 100644 screenshots/2021-08-18-block-graph-c.svg
  create mode 100644 screenshots/2021-08-18-block-graph-d.svg
  create mode 100644 screenshots/2021-08-18-block-graph-e.svg
  create mode 100644 screenshots/2021-08-18-root-directory.svg
  create mode 100644 screenshots/2021-08-18-root-file.svg

Great! Two ideas:

It would be nice to include a shoutout to libguestfs and mention that
libguestfs avoids exposing the host kernel's file systems and partion
code to untrusted disk images. If you don't mount the image then the
FUSE export has similar security properties.


Oh, right!  Absolutely.

Though now I do wonder why one would actually want to use QEMU’s FUSE 
exports then...


Looks like the performance isn’t as bad as I claimed (for me around 
1.5G/s for reading/writing from/to a raw image on tmpfs), so perhaps 
that’s one point.  Another is probably that FUSE exports are better 
suited when you actually want access to the whole image.  I guess.



This is a long blog post. One idea is to show a quickstart
qemu-storage-daemon FUSE export command-line in the beginning before
explaining all the details. That way people who just want to see what
this is about can get an idea without learning all the background first.


Sounds good, will do.  Thanks!

Hanna

Re: [PATCH] hw/arm/smmuv3: Support non-PCI/PCIe devices connection

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/20/21 9:42 AM, Li, Chunming wrote:
> 
>> On 8/20/21 4:36 AM, Li, Chunming wrote:
>>> The current SMMU V3 device model only support PCI/PCIe devices,
>>> so we update it to support non-PCI/PCIe devices.
>>>
>>> hw/arm/smmuv3:
>>> . Create IOMMU memory regions for non-PCI/PCIe devices based
>> on their SID
>>> . Add sid-map property to store non-PCI/PCIe devices SID
>>> . Update implementation of CFGI commands based on device SID
>>> hw/arm/smmu-common:
>>> . Differentiate PCI/PCIe and non-PCI/PCIe devices SID getting
>> strategy
>>> hw/arm/virt:
>>> . Add PL330 DMA controller and connect with SMMUv3 for
>> testing
>>> . Add smmuv3_sidmap for non-PCI/PCIe devices SID setting
>>>
>>> Signed-off-by: Chunming Li 
>>> Signed-off-by: Renwei Liu 
>>> ---
>>> This patch depends on PL330 memory region connection patch:
>>>
>> https://patchew.org/QEMU/4C23C17B8E87E74E906A25A3254A03F4FA1FEC31@SHASX
>> M03.verisilicon.com/
>>>
>>>  hw/arm/smmuv3.c  |  75 ++--
>>>  hw/arm/virt.c| 110
>> ++-
>>>  include/hw/arm/smmu-common.h |  12 +++-
>>>  include/hw/arm/smmuv3.h  |   2 +
>>>  include/hw/arm/virt.h|   3 +
>>>  5 files changed, 181 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>>> index 01b60bee4..c4da05d8b 100644
>>> --- a/hw/arm/smmuv3.c
>>> +++ b/hw/arm/smmuv3.c
>>> @@ -32,6 +32,7 @@
>>>  #include "hw/arm/smmuv3.h"
>>>  #include "smmuv3-internal.h"
>>>  #include "smmu-internal.h"
>>> +#include "hw/qdev-properties.h"
>>>
>>>  /**
>>>   * smmuv3_trigger_irq - pulse @irq if enabled and update
>>> @@ -612,7 +613,7 @@ static SMMUTransCfg *smmuv3_get_config(SMMUDevice
>> *sdev, SMMUEventInfo *event)
>>>  return cfg;
>>>  }
>>>
>>> -static void smmuv3_flush_config(SMMUDevice *sdev)
>>> +static void __attribute__((unused)) smmuv3_flush_config(SMMUDevice
>> *sdev)
>>>  {
>>
>> Why keep this function if unused?
> "smmuv3_flush_config" is useless after our modification. The modification is 
> verified by PL330 DMA and PCIe network.
> But we are not sure if it has potential risk. So we hope maintainer can help 
> to check, then we can remove it.

Either remove it in your patch, or your patch is incorrect and we
still need it. But don't let it as dead code and don't postpone the
removal please.

[PATCH] .mailmap: Fix more contributor entries

2021-08-20 Thread Philippe Mathieu-Daudé

These authors have some incorrect author email field.
For each of them, there is one commit with the replaced
entry.

Cc: Alex Chen 
Cc: Bibo Mao 
Cc: Guoyi Tu 
Cc: Haibin Zhang 
Cc: Hyman Huang 
Cc: Lichang Zhao 
Cc: Yuanjun Gong 
Signed-off-by: Philippe Mathieu-Daudé 
---
If you are Cc'ed and agree with this change, please reply with a
"Reviewed-by: Name " line. I'll wait 2 weeks and consider
no reply as a disagreement and will remove your entry from this
patch.
---
 .mailmap | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/.mailmap b/.mailmap
index f029d1c21fe..11b259e7d07 100644
--- a/.mailmap
+++ b/.mailmap
@@ -69,6 +69,7 @@ Yongbok Kim  
 # git author config, or had utf8/latin1 encoding issues.
 Aaron Lindsay 
 Alexey Gerasimenko 
+Alex Chen 
 Alex Ivanov 
 Andreas Färber 
 Bandan Das 
@@ -76,6 +77,7 @@ Benjamin MARSILI 
 Benoît Canet 
 Benoît Canet 
 Benoît Canet 
+Bibo Mao 
 Boqun Feng 
 Boqun Feng 
 Brad Smith 
@@ -99,9 +101,12 @@ Gautham R. Shenoy 
 Gautham R. Shenoy 
 Gonglei (Arei) 
 Guang Wang 
+Guoyi Tu 
+Haibin Zhang 
 Hailiang Zhang 
 Hanna Reitz  
 Hervé Poussineau 
+Hyman Huang 
 Jakub Jermář 
 Jakub Jermář 
 Jean-Christophe Dubois 
@@ -113,6 +118,7 @@ Jun Li 
 Laurent Vivier 
 Leandro Lupori 
 Li Guang 
+Lichang Zhao 
 Liming Wang 
 linzhecheng 
 Liran Schour 
@@ -171,6 +177,7 @@ Xiong Zhang 
 Yin Yin 
 Yu-Chen Lin 
 Yu-Chen Lin  
+Yuanjun Gong 
 YunQiang Su 
 YunQiang Su 
 Yuri Pudgorodskiy 
-- 
2.31.1

[PATCH v2] hw/arm/smmuv3: Support non-PCI/PCIe devices connection

2021-08-20 Thread Li, Chunming

The current SMMU V3 device model only support PCI/PCIe devices,
so we update it to support non-PCI/PCIe devices.

hw/arm/smmuv3:
. Create IOMMU memory regions for non-PCI/PCIe devices based on their 
SID
. Add sid-map property to store non-PCI/PCIe devices SID
. Update implementation of CFGI commands based on device SID
hw/arm/smmu-common:
. Differentiate PCI/PCIe and non-PCI/PCIe devices SID getting strategy
hw/arm/virt:
. Add PL330 DMA controller and connect with SMMUv3 for testing
. Add smmuv3_sidmap for non-PCI/PCIe devices SID setting

Signed-off-by: Chunming Li 
Signed-off-by: Renwei Liu 
---
v2 (after reviewed by Philippe Mathieu-Daudé)
 - Remove unused function "smmuv3_flush_config"

This patch depends on PL330 memory region connection patch:
https://patchew.org/QEMU/4c23c17b8e87e74e906a25a3254a03f4fa1fe...@shasxm03.verisilicon.com/

 hw/arm/smmuv3.c  |  82 +-
 hw/arm/virt.c| 110 ++-
 include/hw/arm/smmu-common.h |  12 +++-
 include/hw/arm/smmuv3.h  |   2 +
 include/hw/arm/virt.h|   3 +
 5 files changed, 180 insertions(+), 29 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 01b60bee4..a0ca6235b 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -32,6 +32,7 @@
 #include "hw/arm/smmuv3.h"
 #include "smmuv3-internal.h"
 #include "smmu-internal.h"
+#include "hw/qdev-properties.h"
 
 /**
  * smmuv3_trigger_irq - pulse @irq if enabled and update
@@ -612,15 +613,6 @@ static SMMUTransCfg *smmuv3_get_config(SMMUDevice *sdev, 
SMMUEventInfo *event)
 return cfg;
 }
 
-static void smmuv3_flush_config(SMMUDevice *sdev)
-{
-SMMUv3State *s = sdev->smmu;
-SMMUState *bc = &s->smmu_state;
-
-trace_smmuv3_config_cache_inv(smmu_get_sid(sdev));
-g_hash_table_remove(bc->configs, sdev);
-}
-
 static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr,
   IOMMUAccessFlags flag, int iommu_idx)
 {
@@ -963,22 +955,18 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
 case SMMU_CMD_CFGI_STE:
 {
 uint32_t sid = CMD_SID(&cmd);
-IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid);
-SMMUDevice *sdev;
+SMMUSIDRange sid_range;
 
 if (CMD_SSEC(&cmd)) {
 cmd_error = SMMU_CERROR_ILL;
 break;
 }
 
-if (!mr) {
-break;
-}
-
+sid_range.start = sid;
+sid_range.end = sid;
 trace_smmuv3_cmdq_cfgi_ste(sid);
-sdev = container_of(mr, SMMUDevice, iommu);
-smmuv3_flush_config(sdev);
-
+g_hash_table_foreach_remove(bs->configs, smmuv3_invalidate_ste,
+&sid_range);
 break;
 }
 case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
@@ -1005,21 +993,18 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
 case SMMU_CMD_CFGI_CD_ALL:
 {
 uint32_t sid = CMD_SID(&cmd);
-IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid);
-SMMUDevice *sdev;
+SMMUSIDRange sid_range;
 
 if (CMD_SSEC(&cmd)) {
 cmd_error = SMMU_CERROR_ILL;
 break;
 }
 
-if (!mr) {
-break;
-}
-
+sid_range.start = sid;
+sid_range.end = sid;
 trace_smmuv3_cmdq_cfgi_cd(sid);
-sdev = container_of(mr, SMMUDevice, iommu);
-smmuv3_flush_config(sdev);
+g_hash_table_foreach_remove(bs->configs, smmuv3_invalidate_ste,
+&sid_range);
 break;
 }
 case SMMU_CMD_TLBI_NH_ASID:
@@ -1430,6 +1415,19 @@ static void smmu_reset(DeviceState *dev)
 smmuv3_init_regs(s);
 }
 
+static SMMUDevice *smmu_find_peri_sdev(SMMUState *s, uint16_t sid)
+{
+SMMUDevice *sdev;
+
+QLIST_FOREACH(sdev, &s->peri_sdev_list, next) {
+if (smmu_get_sid(sdev) == sid) {
+return sdev;
+}
+}
+
+return NULL;
+}
+
 static void smmu_realize(DeviceState *d, Error **errp)
 {
 SMMUState *sys = ARM_SMMU(d);
@@ -1437,6 +1435,9 @@ static void smmu_realize(DeviceState *d, Error **errp)
 SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s);
 SysBusDevice *dev = SYS_BUS_DEVICE(d);
 Error *local_err = NULL;
+SMMUDevice *sdev;
+char *name = NULL;
+uint16_t sid = 0;
 
 c->parent_realize(d, &local_err);
 if (local_err) {
@@ -1454,6 +1455,28 @@ static void smmu_realize(DeviceState *d, Error **errp)
 sysbus_init_mmio(dev, &sys->iomem);
 
 smmu_init_irq(s, dev);
+
+/* Create IOMMU memory region for peripheral devices based on their SID */
+for (int i = 0; i < s->num_sid; i++) {
+sid = s->sid_map[i];
+sdev = smmu_find_peri_sdev(sys, sid)

Re: [PATCH] .mailmap: Fix more contributor entries

2021-08-20 Thread Alex Chen

On 2021/8/20 16:04, Philippe Mathieu-Daudé wrote:
> These authors have some incorrect author email field.
> For each of them, there is one commit with the replaced
> entry.
> 
> Cc: Alex Chen 
> Cc: Bibo Mao 
> Cc: Guoyi Tu 
> Cc: Haibin Zhang 
> Cc: Hyman Huang 
> Cc: Lichang Zhao 
> Cc: Yuanjun Gong 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> If you are Cc'ed and agree with this change, please reply with a
> "Reviewed-by: Name " line. I'll wait 2 weeks and consider
> no reply as a disagreement and will remove your entry from this
> patch.

Reviewed-by: Alex Chen 

> ---
>  .mailmap | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/.mailmap b/.mailmap
> index f029d1c21fe..11b259e7d07 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -69,6 +69,7 @@ Yongbok Kim  
>  # git author config, or had utf8/latin1 encoding issues.
>  Aaron Lindsay 
>  Alexey Gerasimenko 
> +Alex Chen 
>  Alex Ivanov 
>  Andreas Färber 
>  Bandan Das 
> @@ -76,6 +77,7 @@ Benjamin MARSILI 
>  Benoît Canet 
>  Benoît Canet 
>  Benoît Canet 
> +Bibo Mao 
>  Boqun Feng 
>  Boqun Feng 
>  Brad Smith 
> @@ -99,9 +101,12 @@ Gautham R. Shenoy 
>  Gautham R. Shenoy 
>  Gonglei (Arei) 
>  Guang Wang 
> +Guoyi Tu 
> +Haibin Zhang 
>  Hailiang Zhang 
>  Hanna Reitz  
>  Hervé Poussineau 
> +Hyman Huang 
>  Jakub Jermář 
>  Jakub Jermář 
>  Jean-Christophe Dubois 
> @@ -113,6 +118,7 @@ Jun Li 
>  Laurent Vivier 
>  Leandro Lupori 
>  Li Guang 
> +Lichang Zhao 
>  Liming Wang 
>  linzhecheng 
>  Liran Schour 
> @@ -171,6 +177,7 @@ Xiong Zhang 
>  Yin Yin 
>  Yu-Chen Lin 
>  Yu-Chen Lin  
> +Yuanjun Gong 
>  YunQiang Su 
>  YunQiang Su 
>  Yuri Pudgorodskiy 
>

Re: [PATCH] hw/arm/smmuv3: Support non-PCI/PCIe devices connection

2021-08-20 Thread Peter Maydell

On Fri, 20 Aug 2021 at 03:36, Li, Chunming  wrote:
>
> The current SMMU V3 device model only support PCI/PCIe devices,
> so we update it to support non-PCI/PCIe devices.
>
> hw/arm/smmuv3:
> . Create IOMMU memory regions for non-PCI/PCIe devices based on their 
> SID
> . Add sid-map property to store non-PCI/PCIe devices SID
> . Update implementation of CFGI commands based on device SID
> hw/arm/smmu-common:
> . Differentiate PCI/PCIe and non-PCI/PCIe devices SID getting strategy
> hw/arm/virt:
> . Add PL330 DMA controller and connect with SMMUv3 for testing
> . Add smmuv3_sidmap for non-PCI/PCIe devices SID setting

Please don't try to do all these things in one big patch --
put together a patchseries with several smaller patches,
each of which does one self-contained thing.


> Signed-off-by: Chunming Li 
> Signed-off-by: Renwei Liu 
> ---
> This patch depends on PL330 memory region connection patch:
> https://patchew.org/QEMU/4c23c17b8e87e74e906a25a3254a03f4fa1fe...@shasxm03.verisilicon.com/

If you have a patch that depends on another, it's usually better to
send them as a patchseries. I was wondering what the reason for
that PL330 patch was...

thanks
-- PMM

Re: [PATCH] .mailmap: Fix more contributor entries

2021-08-20 Thread Philippe Mathieu-Daudé

On Fri, Aug 20, 2021 at 10:25 AM Hyman Huang  wrote:
> 在 2021/8/20 16:04, Philippe Mathieu-Daudé 写道:
> > These authors have some incorrect author email field.
> > For each of them, there is one commit with the replaced
> > entry.
> >
> > Cc: Alex Chen 
> > Cc: Bibo Mao 
> > Cc: Guoyi Tu 
> > Cc: Haibin Zhang 
> > Cc: Hyman Huang 
> > Cc: Lichang Zhao 
> > Cc: Yuanjun Gong 
> > Signed-off-by: Philippe Mathieu-Daudé 
> > ---
> > If you are Cc'ed and agree with this change, please reply with a
> > "Reviewed-by: Name " line. I'll wait 2 weeks and consider
> > no reply as a disagreement and will remove your entry from this
> > patch.
> > ---
> Reviewed-by: Hyman Huang 

Thanks!

> >   .mailmap | 7 +++
> >   1 file changed, 7 insertions(+)
> >
> > diff --git a/.mailmap b/.mailmap
> > index f029d1c21fe..11b259e7d07 100644
> > --- a/.mailmap
> > +++ b/.mailmap
> > @@ -69,6 +69,7 @@ Yongbok Kim  
> > 
> >   # git author config, or had utf8/latin1 encoding issues.
> >   Aaron Lindsay 
> >   Alexey Gerasimenko 
> > +Alex Chen 
> >   Alex Ivanov 
> >   Andreas Färber 
> >   Bandan Das 
> > @@ -76,6 +77,7 @@ Benjamin MARSILI 
> >   Benoît Canet 
> >   Benoît Canet 
> >   Benoît Canet 
> > +Bibo Mao 
> >   Boqun Feng 
> >   Boqun Feng 
> >   Brad Smith 
> > @@ -99,9 +101,12 @@ Gautham R. Shenoy 
> >   Gautham R. Shenoy 
> >   Gonglei (Arei) 
> >   Guang Wang 
> > +Guoyi Tu 
> > +Haibin Zhang 
> >   Hailiang Zhang 
> >   Hanna Reitz  
> >   Hervé Poussineau 
> > +Hyman Huang 
> >   Jakub Jermář 
> >   Jakub Jermář 
> >   Jean-Christophe Dubois 
> > @@ -113,6 +118,7 @@ Jun Li 
> >   Laurent Vivier 
> >   Leandro Lupori 
> >   Li Guang 
> > +Lichang Zhao 
> >   Liming Wang 
> >   linzhecheng 
> >   Liran Schour 
> > @@ -171,6 +177,7 @@ Xiong Zhang 
> >   Yin Yin 
> >   Yu-Chen Lin 
> >   Yu-Chen Lin  
> > +Yuanjun Gong 
> >   YunQiang Su 
> >   YunQiang Su 
> >   Yuri Pudgorodskiy 
> >
>
> --

Re: [PATCH v2 3/4] hw/dma/xlnx_csu_dma: Always expect 'dma' link property to be set

2021-08-20 Thread Peter Maydell

On Thu, 19 Aug 2021 at 17:34, Philippe Mathieu-Daudé  wrote:
>
> Simplify by always passing a MemoryRegion property to the device.
> Doing so we can move the AddressSpace field to the device struct,
> removing need for heap allocation.
>
> Update the Xilinx ZynqMP SoC model to pass the default system
> memory instead of a NULL value.
>
> Suggested-by: Peter Maydell 
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v2 4/4] hw/dma/xlnx-zdma Always expect 'dma' link property to be set

2021-08-20 Thread Peter Maydell

On Thu, 19 Aug 2021 at 17:34, Philippe Mathieu-Daudé  wrote:
>
> Simplify by always passing a MemoryRegion property to the device.
> Doing so we can move the AddressSpace field to the device struct,
> removing need for heap allocation.
>
> Update the Xilinx ZynqMP / Versal SoC models to pass the default
> system memory instead of a NULL value.
>
> Suggested-by: Peter Maydell 
> Signed-off-by: Philippe Mathieu-Daudé 


Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [qemu-web PATCH] Add a blog post about FUSE block exports

2021-08-20 Thread Hanna Reitz


On 19.08.21 20:22, Klaus Kiwi wrote:



On Thu, Aug 19, 2021 at 7:27 AM Hanna Reitz > wrote:


This post explains when FUSE block exports are useful, how they work,
and that it is fun to export an image file on its own path so it looks
like your image file (in whatever format it was) is a raw image now.


Thanks Hanna, great work. Even if you explained this to me multiple times,
thanks to this I think I now finally understand *how* it works.


Oops, sorry for forgetting to CC you...


Signed-off-by: Hanna Reitz mailto:hre...@redhat.com>>
---
You can also find this patch here:
https://gitlab.com/hreitz/qemu-web
 fuse-blkexport-v1

My first patch to qemu-web, so I hope I am not doing anything overly
stupid here (adding SVGs with extremely long lines comes to mind)...
---
 _posts/2021-08-18-fuse-blkexport.md       | 488
++
 screenshots/2021-08-18-block-graph-a.svg  |   2 +
 screenshots/2021-08-18-block-graph-b.svg  |   2 +
 screenshots/2021-08-18-block-graph-c.svg  |   2 +
 screenshots/2021-08-18-block-graph-d.svg  |   2 +
 screenshots/2021-08-18-block-graph-e.svg  |   2 +
 screenshots/2021-08-18-root-directory.svg |   2 +
 screenshots/2021-08-18-root-file.svg      |   2 +
 8 files changed, 502 insertions(+)
 create mode 100644 _posts/2021-08-18-fuse-blkexport.md
 create mode 100644 screenshots/2021-08-18-block-graph-a.svg
 create mode 100644 screenshots/2021-08-18-block-graph-b.svg
 create mode 100644 screenshots/2021-08-18-block-graph-c.svg
 create mode 100644 screenshots/2021-08-18-block-graph-d.svg
 create mode 100644 screenshots/2021-08-18-block-graph-e.svg
 create mode 100644 screenshots/2021-08-18-root-directory.svg
 create mode 100644 screenshots/2021-08-18-root-file.svg

diff --git a/_posts/2021-08-18-fuse-blkexport.md
b/_posts/2021-08-18-fuse-blkexport.md
new file mode 100644
index 000..e6a55d0
--- /dev/null
+++ b/_posts/2021-08-18-fuse-blkexport.md
@@ -0,0 +1,488 @@
+---
+layout: post
+title:  "Exporting block devices as raw image files with FUSE"
+date:   2021-08-18 18:00:00 +0200
+author: Hanna Reitz
+categories: [storage, features, tutorials]


Non-fatal, but I feel that the title doesn't summarize all that this' 
blog posts is about.
An alternate suggestion might be in the lines of "A look into QEMU's 
FUSE export

feature, and how to use it to manipulate guest images".


Hmm, I don’t know.  The feature itself doesn’t really allow you to 
manipulate guest images, it only provides a translation layer so that 
other tools can do it.  I can definitely replace “Exporting block 
devices” by “Presenting guest images”, but I’m not sure I want to go 
much further, actually.



+---
+Sometimes, there is a VM disk image whose contents you want to
manipulate
+without booting the VM.  For raw images, that process is usually
fairly simple,
+because most Linux systems bring tools for the job, e.g.:
+* *dd* to just copy data to and from given offsets,
+* *parted* to manipulate the partition table,
+* *kpartx* to present all partitions as block devices,
+* *mount* to access filesystems’ contents.
+
+Sadly, but naturally, such tools only work for raw images, and
not for images
+e.g. in QEMU’s qcow2 format.  To access such an image’s content,
the format has
+to be translated to create a raw image, for example by:
+* Exporting the image file with `qemu-nbd -c` as an NBD block
device file,
+* Converting between image formats using `qemu-img convert`,
+* Accessing the image from a guest, where it appears as a normal
block device.
+

Guessing that this would be the best place to mention 
guestmount/libguestfs, as Stefan

mentioned in another reply to this thread?


Yes, probably replacing the “Accessing the image from a guest” point.

Bonus points if you can identify (dis)advantages, similarly to that 
you did below

with the other methods.

+Unfortunately, none of these methods is perfect: `qemu-nbd -c`
generally
+requires root rights, converting to a temporary raw copy requires
additional
+disk space and the conversion process takes time, and accessing
the image from a
+guest is just quite cumbersome in general (and also specifically
something that
+we set out to avoid in the first sentence of this blog post).
+
+As of QEMU 6.0, there is another method, namely FUSE block exports.
+Conceptually, these are rather similar to using `qemu-nbd -c`,
but they do not
+require root rights.
+
+**Note**: FUSE block exports are a feature that can be enabled or
disabled
+during the build process with `--enable-fuse` or
`--disable-fuse`, respectively;
+omitting either configure option will enable the feature if an

RE: [PATCH] hw/arm/smmuv3: Support non-PCI/PCIe devices connection

2021-08-20 Thread Li, Chunming

> On Fri, 20 Aug 2021 at 03:36, Li, Chunming
>  wrote:
> >
> > The current SMMU V3 device model only support PCI/PCIe devices,
> > so we update it to support non-PCI/PCIe devices.
> >
> > hw/arm/smmuv3:
> > . Create IOMMU memory regions for non-PCI/PCIe devices based
> on their SID
> > . Add sid-map property to store non-PCI/PCIe devices SID
> > . Update implementation of CFGI commands based on device SID
> > hw/arm/smmu-common:
> > . Differentiate PCI/PCIe and non-PCI/PCIe devices SID getting
> strategy
> > hw/arm/virt:
> > . Add PL330 DMA controller and connect with SMMUv3 for
> testing
> > . Add smmuv3_sidmap for non-PCI/PCIe devices SID setting
> 
> Please don't try to do all these things in one big patch --
> put together a patchseries with several smaller patches,
> each of which does one self-contained thing.
> 
  Got it. Thanks for your comments.
  Let me try to split them into several smaller patches.
  But all these updates work for 1 thing. They are depend on each other.
> 
> > Signed-off-by: Chunming Li 
> > Signed-off-by: Renwei Liu 
> > ---
> > This patch depends on PL330 memory region connection patch:
> >
> https://patchew.org/QEMU/4C23C17B8E87E74E906A25A3254A03F4FA1FEC31@SHASX
> M03.verisilicon.com/
> 
> If you have a patch that depends on another, it's usually better to
> send them as a patchseries. I was wondering what the reason for
> that PL330 patch was...
  I need a non-PCI/PCIe device to do the verification. The old PL330 DMA 
controller
  cannot configure its memory region manually. So we update it and provide path.
  PL330 patch was reviewed and will be merged in target-arm.next for 6.2.
  The virt.c compile will fail if there is no PL330 device patch.
> 
> thanks
> -- PMM

Thanks.

Chunming Li

Re: [PATCH] hw/arm/smmuv3: Support non-PCI/PCIe devices connection

2021-08-20 Thread Peter Maydell

On Fri, 20 Aug 2021 at 10:04, Li, Chunming  wrote:
>
> > On Fri, 20 Aug 2021 at 03:36, Li, Chunming
> >  wrote:
> > >
> > > The current SMMU V3 device model only support PCI/PCIe devices,
> > > so we update it to support non-PCI/PCIe devices.
> > >
> > > hw/arm/smmuv3:
> > > . Create IOMMU memory regions for non-PCI/PCIe devices based
> > on their SID
> > > . Add sid-map property to store non-PCI/PCIe devices SID
> > > . Update implementation of CFGI commands based on device SID
> > > hw/arm/smmu-common:
> > > . Differentiate PCI/PCIe and non-PCI/PCIe devices SID getting
> > strategy
> > > hw/arm/virt:
> > > . Add PL330 DMA controller and connect with SMMUv3 for
> > testing
> > > . Add smmuv3_sidmap for non-PCI/PCIe devices SID setting
> >
> > Please don't try to do all these things in one big patch --
> > put together a patchseries with several smaller patches,
> > each of which does one self-contained thing.
> >
>   Got it. Thanks for your comments.
>   Let me try to split them into several smaller patches.
>   But all these updates work for 1 thing. They are depend on each other.

Yes, that's why you send a single patch *series*, which has
a cover letter that explains the overall purpose. Then each
individual patch in the series has a commit message that
describes what that specific patch is doing. As emails, the
patches are all sent as follow-ups to the coverletter.
"git format-patch" and "git send-email" should handle this for you.

https://wiki.qemu.org/Contribute/SubmitAPatch#Split_up_long_patches
has a little more discussion of this.

> > If you have a patch that depends on another, it's usually better to
> > send them as a patchseries. I was wondering what the reason for
> > that PL330 patch was...

>   I need a non-PCI/PCIe device to do the verification. The old PL330 DMA 
> controller
>   cannot configure its memory region manually. So we update it and provide 
> path.
>   PL330 patch was reviewed and will be merged in target-arm.next for 6.2.
>   The virt.c compile will fail if there is no PL330 device patch.

Yeah, I accepted it because it is a sensible cleanup on its own;
but it would have been a bit easier for me to understand why you
were doing that if I'd had the wider context.

thanks
-- PMM

Re: [qemu-web PATCH] Add a blog post about FUSE block exports

2021-08-20 Thread Daniel P . Berrangé

On Fri, Aug 20, 2021 at 09:56:54AM +0200, Hanna Reitz wrote:
> On 19.08.21 18:23, Stefan Hajnoczi wrote:
> > On Thu, Aug 19, 2021 at 12:25:01PM +0200, Hanna Reitz wrote:
> > > This post explains when FUSE block exports are useful, how they work,
> > > and that it is fun to export an image file on its own path so it looks
> > > like your image file (in whatever format it was) is a raw image now.
> > > 
> > > Signed-off-by: Hanna Reitz 
> > > ---
> > > You can also find this patch here:
> > > https://gitlab.com/hreitz/qemu-web fuse-blkexport-v1
> > > 
> > > My first patch to qemu-web, so I hope I am not doing anything overly
> > > stupid here (adding SVGs with extremely long lines comes to mind)...
> > > ---
> > >   _posts/2021-08-18-fuse-blkexport.md   | 488 ++
> > >   screenshots/2021-08-18-block-graph-a.svg  |   2 +
> > >   screenshots/2021-08-18-block-graph-b.svg  |   2 +
> > >   screenshots/2021-08-18-block-graph-c.svg  |   2 +
> > >   screenshots/2021-08-18-block-graph-d.svg  |   2 +
> > >   screenshots/2021-08-18-block-graph-e.svg  |   2 +
> > >   screenshots/2021-08-18-root-directory.svg |   2 +
> > >   screenshots/2021-08-18-root-file.svg  |   2 +
> > >   8 files changed, 502 insertions(+)
> > >   create mode 100644 _posts/2021-08-18-fuse-blkexport.md
> > >   create mode 100644 screenshots/2021-08-18-block-graph-a.svg
> > >   create mode 100644 screenshots/2021-08-18-block-graph-b.svg
> > >   create mode 100644 screenshots/2021-08-18-block-graph-c.svg
> > >   create mode 100644 screenshots/2021-08-18-block-graph-d.svg
> > >   create mode 100644 screenshots/2021-08-18-block-graph-e.svg
> > >   create mode 100644 screenshots/2021-08-18-root-directory.svg
> > >   create mode 100644 screenshots/2021-08-18-root-file.svg
> > Great! Two ideas:
> > 
> > It would be nice to include a shoutout to libguestfs and mention that
> > libguestfs avoids exposing the host kernel's file systems and partion
> > code to untrusted disk images. If you don't mount the image then the
> > FUSE export has similar security properties.
> 
> Oh, right!  Absolutely.
> 
> Though now I do wonder why one would actually want to use QEMU’s FUSE
> exports then...

If you can't use KVM for libguestfs, then performance likely to
suffer significantly. This could be either if your disk image
is non-native architecture, or if your running inside a VM and
don't have nested KVM avaialble. You do still need to bear in
mind the security implications of course, but QEMU FUSE might
be your only viable option to achieve the performance needed.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

RE: [PATCH RFC v6 08/12] target/riscv: Handle KVM_EXIT_RISCV_SBI exit

2021-08-20 Thread Jiangyifei


> -Original Message-
> From: Qemu-riscv
> [mailto:qemu-riscv-bounces+jiangyifei=huawei@nongnu.org] On Behalf Of
> Alistair Francis
> Sent: Thursday, August 19, 2021 2:14 PM
> To: Jiangyifei 
> Cc: Anup Patel ; open list:RISC-V
> ; open list:Overall ;
> limingwang (A) ; libvir-l...@redhat.com; Bin Meng
> ; qemu-devel@nongnu.org Developers
> ; Alistair Francis ;
> kvm-ri...@lists.infradead.org; Wanghaibin (D)
> ; Fanliang (EulerOS) ;
> Palmer Dabbelt ; Wubin (H) 
> Subject: Re: [PATCH RFC v6 08/12] target/riscv: Handle KVM_EXIT_RISCV_SBI
> exit
> 
> On Tue, Aug 17, 2021 at 1:25 PM Yifei Jiang  wrote:
> >
> > Use char-fe to handle console sbi call, which implement early console
> > io while apply 'earlycon=sbi' into kernel parameters.
> >
> > Signed-off-by: Yifei Jiang 
> > Signed-off-by: Mingwang Li 
> > ---
> >  target/riscv/kvm.c | 42 -
> >  target/riscv/sbi_ecall_interface.h | 72
> > ++
> >  2 files changed, 113 insertions(+), 1 deletion(-)  create mode 100644
> > target/riscv/sbi_ecall_interface.h
> >
> > diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c index
> > bc9cb5d8f9..a68f31c2f3 100644
> > --- a/target/riscv/kvm.c
> > +++ b/target/riscv/kvm.c
> > @@ -38,6 +38,8 @@
> >  #include "qemu/log.h"
> >  #include "hw/loader.h"
> >  #include "kvm_riscv.h"
> > +#include "sbi_ecall_interface.h"
> > +#include "chardev/char-fe.h"
> >
> >  static uint64_t kvm_riscv_reg_id(CPURISCVState *env, uint64_t type,
> > uint64_t idx)  { @@ -435,9 +437,47 @@ bool
> > kvm_arch_stop_on_emulation_error(CPUState *cs)
> >  return true;
> >  }
> >
> > +static int kvm_riscv_handle_sbi(struct kvm_run *run) {
> > +int ret = 0;
> > +unsigned char ch;
> > +switch (run->riscv_sbi.extension_id) {
> > +case SBI_EXT_0_1_CONSOLE_PUTCHAR:
> > +ch = run->riscv_sbi.args[0];
> > +qemu_chr_fe_write(serial_hd(0)->be, &ch, sizeof(ch));
> > +break;
> > +case SBI_EXT_0_1_CONSOLE_GETCHAR:
> > +ret = qemu_chr_fe_read_all(serial_hd(0)->be, &ch, sizeof(ch));
> > +if (ret == sizeof(ch)) {
> > +run->riscv_sbi.args[0] = ch;
> > +} else {
> > +run->riscv_sbi.args[0] = -1;
> > +}
> > +break;
> 
> These have been deprecated (see
> https://github.com/riscv/riscv-sbi-doc/blob/master/riscv-sbi.adoc#4-legacy-ext
> ensions-eids-0x00---0x0f),
> is it even worth supporting them?
> 

Yes. The legacy console SBI functions (sbi_console_getchar() and 
sbi_console_putchar())
are expected to be deprecated.

However, the linux kernel still uses these sbi call interfaces, so they are 
retained here.
If the linux kernel doesn't use these interfaces, we will remove them.

Yifei

> > +default:
> > +qemu_log_mask(LOG_UNIMP,
> > +  "%s: un-handled SBI EXIT, specific reasons
> is %lu\n",
> > +  __func__, run->riscv_sbi.extension_id);
> > +ret = -1;
> > +break;
> > +}
> > +return ret;
> > +}
> > +
> >  int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)  {
> > -return 0;
> > +int ret = 0;
> > +switch (run->exit_reason) {
> > +case KVM_EXIT_RISCV_SBI:
> > +ret = kvm_riscv_handle_sbi(run);
> > +break;
> > +default:
> > +qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
> > +  __func__, run->exit_reason);
> > +ret = -1;
> > +break;
> > +}
> > +return ret;
> >  }
> >
> >  void kvm_riscv_reset_vcpu(RISCVCPU *cpu) diff --git
> > a/target/riscv/sbi_ecall_interface.h
> > b/target/riscv/sbi_ecall_interface.h
> > new file mode 100644
> > index 00..fb1a3fa8f2
> > --- /dev/null
> > +++ b/target/riscv/sbi_ecall_interface.h
> > @@ -0,0 +1,72 @@
> > +/*
> > + * SPDX-License-Identifier: BSD-2-Clause
> > + *
> > + * Copyright (c) 2019 Western Digital Corporation or its affiliates.
> > + *
> > + * Authors:
> > + *   Anup Patel 
> > + */
> > +
> > +#ifndef __SBI_ECALL_INTERFACE_H__
> > +#define __SBI_ECALL_INTERFACE_H__
> > +
> > +/* clang-format off */
> > +
> > +/* SBI Extension IDs */
> > +#define SBI_EXT_0_1_SET_TIMER   0x0
> > +#define SBI_EXT_0_1_CONSOLE_PUTCHAR 0x1
> > +#define SBI_EXT_0_1_CONSOLE_GETCHAR 0x2
> > +#define SBI_EXT_0_1_CLEAR_IPI   0x3
> > +#define SBI_EXT_0_1_SEND_IPI0x4
> > +#define SBI_EXT_0_1_REMOTE_FENCE_I  0x5
> > +#define SBI_EXT_0_1_REMOTE_SFENCE_VMA   0x6
> > +#define SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID 0x7
> > +#define SBI_EXT_0_1_SHUTDOWN0x8
> > +#define SBI_EXT_BASE0x10
> > +#define SBI_EXT_TIME0x54494D45
> > +#define SBI_EXT_IPI 0x735049
> > +#define SBI_EXT_RFENCE  0x52464E43
> > +#define SBI_EXT_HSM 0x48534D
> > +
> > +/* SBI function IDs for BASE extension*/
> > +#define SBI_EXT_BASE_GET_SPEC_VERSION   0x0
> > +#define SBI_EXT_BASE_GET_IMP_ID

Re: [PATCH v3 53/66] target/alpha: Reorg integer memory operations

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 21:11, Richard Henderson
 wrote:
>
> Pass in the MemOp instead of a callback.
> Drop the fp argument; add a locked argument.
>
> Signed-off-by: Richard Henderson 
> ---
>  target/alpha/translate.c | 104 +++
>  1 file changed, 40 insertions(+), 64 deletions(-)

Reviewed-by: Peter Maydell 

(One of those patches where diff has made an incomprehensible mess
of the change and it's easier to look separately at the "before"
and "after" files, at least for the first half of the patch...)

thanks
-- PMM

Re: [PATCH v3 59/66] accel/tcg: Handle SIGBUS in handle_cpu_signal

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 21:13, Richard Henderson
 wrote:
>
> We've been registering host SIGBUS, but then treating it
> exactly like SIGSEGV.
>
> Handle BUS_ADRALN via cpu_unaligned_access, but allow other
> SIGBUS si_codes to continue into the host-to-guest signal
> coversion code in host_signal_handler.  Unwind the guest
> state so that we report the correct guest PC for the fault.

You can't rely on alignment faults being marked by BUS_ADRALN:
eg MIPS doesn't give you that si_code. How much does that matter
for our use of it here ?

-- PMM

Re: [PATCH v3 60/66] tcg/aarch64: Support raising sigbus for user-only

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 20:56, Richard Henderson
 wrote:
>
> Use load-acquire / store-release for the normal case of
> alignment matching the access size.  Otherwise, emit a
> test + branch sequence invoking helper_unaligned_mmu.

I don't think this trick will work for a CPU that
implements the FEAT_LSE2 feature -- section B2.5.2 in
the Arm ARM DDI0487G.b says that if LSE2 is implemented
then the CPU either must or can make them just work and do
the unaligned access.

thanks
-- PMM

Re: [PATCH v3 61/66] tcg/ppc: Support raising sigbus for user-only

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 20:46, Richard Henderson
 wrote:
>
> Signed-off-by: Richard Henderson 
> ---
>  tcg/ppc/tcg-target.h |   2 -
>  tcg/ppc/tcg-target.c.inc | 102 ---
>  2 files changed, 94 insertions(+), 10 deletions(-)
>

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v3 62/66] tcg/s390: Support raising sigbus for user-only

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 20:49, Richard Henderson
 wrote:
>
> Signed-off-by: Richard Henderson 
> ---
>  tcg/s390/tcg-target.h |  2 --
>  tcg/s390/tcg-target.c.inc | 63 +--
>  2 files changed, 61 insertions(+), 4 deletions(-)

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v3 63/66] tcg/tci: Support raising sigbus for user-only

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 21:15, Richard Henderson
 wrote:
>
> Signed-off-by: Richard Henderson 
> ---
>  tcg/tci.c | 18 +-
>  1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/tcg/tci.c b/tcg/tci.c
> index e76087ccac..985c8a91cb 100644
> --- a/tcg/tci.c
> +++ b/tcg/tci.c
> @@ -296,7 +296,7 @@ static uint64_t tci_qemu_ld(CPUArchState *env, 
> target_ulong taddr,
>  uintptr_t ra = (uintptr_t)tb_ptr;
>
>  #ifdef CONFIG_SOFTMMU
> -switch (mop) {
> +switch (mop & (MO_BSWAP | MO_SSIZE)) {
>  case MO_UB:
>  return helper_ret_ldub_mmu(env, taddr, oi, ra);
>  case MO_SB:
> @@ -326,10 +326,14 @@ static uint64_t tci_qemu_ld(CPUArchState *env, 
> target_ulong taddr,
>  }
>  #else
>  void *haddr = g2h(env_cpu(env), taddr);
> +unsigned a_mask = (1u << get_alignment_bits(mop)) - 1;
>  uint64_t ret;
>
>  set_helper_retaddr(ra);
> -switch (mop) {
> +if (taddr & a_mask) {
> +helper_unaligned_ld(env, taddr);
> +}
> +switch (mop & (MO_BSWAP | MO_SSIZE)) {
>  case MO_UB:
>  ret = ldub_p(haddr);
>  break;
> @@ -377,11 +381,11 @@ static uint64_t tci_qemu_ld(CPUArchState *env, 
> target_ulong taddr,
>  static void tci_qemu_st(CPUArchState *env, target_ulong taddr, uint64_t val,
>  MemOpIdx oi, const void *tb_ptr)
>  {
> -MemOp mop = get_memop(oi) & (MO_BSWAP | MO_SSIZE);
> +MemOp mop = get_memop(oi);
>  uintptr_t ra = (uintptr_t)tb_ptr;

Don't you need this bit in tci_qemu_st() as well ?


-- PMM

Re: [PATCH 2/2] docs/about: Unify the subject format

2021-08-20 Thread Cornelia Huck

On Fri, Aug 20 2021, Yanan Wang  wrote:

> Unify the subject format in deprecated.rst to "since X.Y".
> Unify the subject format in removed-features.rst to "removed in X.Y".

It seems unlikely that we will ever deprecate something in a stable
release, and even more unlikely that we'll remove something in one, so
the short versions look like the thing we want to standardize on.

>
> Signed-off-by: Yanan Wang 
> ---
>  docs/about/deprecated.rst   | 56 -
>  docs/about/removed-features.rst | 28 -
>  2 files changed, 42 insertions(+), 42 deletions(-)

Unrelated to your patch, line 143 in removed-features.rst reads

``-vnc ...,tls=...``, ``-vnc ...,x509=...`` & ``-vnc ...,x509verify=...``

and is missing the release it was removed in (presumably 3.1?)

Anyway,

Reviewed-by: Cornelia Huck

Re: [PATCH 1/2] hw/9pfs: avoid 'path' copy in v9fs_walk()

2021-08-20 Thread Greg Kurz

On Tue, 17 Aug 2021 14:38:24 +0200
Christian Schoenebeck  wrote:

> The v9fs_walk() function resolves all client submitted path nodes to the
> local 'pathes' array. Using a separate string scalar variable 'path'
> inside the background worker thread loop and copying that local 'path'
> string scalar variable subsequently to the 'pathes' array (at the end of
> each loop iteration) is not necessary.
> 
> Instead simply resolve each path directly to the 'pathes' array and
> don't use the string scalar variable 'path' inside the fs worker thread
> loop at all.
> 
> The only advantage of the 'path' scalar was that in case of an error
> the respective 'pathes' element would not be filled. Right now this is
> not an issue as the v9fs_walk() function returns as soon as any error
> occurs.
> 
> Suggested-by: Greg Kurz 
> Signed-off-by: Christian Schoenebeck 
> ---

Reviewed-by: Greg Kurz 

With this change, the path variable is no longer used at all in the
first loop. I see at least an extra possible cleanup : don't set
path before the first loop since it gets reset before the second
one. Maybe we can even get rid of path all the way ? I'll have
a look.

>  hw/9pfs/9p.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index 2815257f42..4d642ab12a 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -1787,7 +1787,8 @@ static void coroutine_fn v9fs_walk(void *opaque)
>  strcmp("..", wnames[name_idx].data))
>  {
>  err = s->ops->name_to_path(&s->ctx, &dpath,
> -wnames[name_idx].data, &path);
> +   wnames[name_idx].data,
> +   &pathes[name_idx]);
>  if (err < 0) {
>  err = -errno;
>  break;
> @@ -1796,14 +1797,13 @@ static void coroutine_fn v9fs_walk(void *opaque)
>  err = -EINTR;
>  break;
>  }
> -err = s->ops->lstat(&s->ctx, &path, &stbuf);
> +err = s->ops->lstat(&s->ctx, &pathes[name_idx], &stbuf);
>  if (err < 0) {
>  err = -errno;
>  break;
>  }
>  stbufs[name_idx] = stbuf;
> -v9fs_path_copy(&dpath, &path);
> -v9fs_path_copy(&pathes[name_idx], &path);
> +v9fs_path_copy(&dpath, &pathes[name_idx]);
>  }
>  }
>  });

Re: [PATCH v3 01/14] tcg/arm: Remove fallback definition of __ARM_ARCH

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 22:32, Richard Henderson
 wrote:
>
> GCC since 4.8 provides the definition and we now require 7.5.
>
> Signed-off-by: Richard Henderson 

I assume clang provides this too...

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH 2/2] hw/9pfs: use g_autofree in v9fs_walk() where possible

2021-08-20 Thread Greg Kurz

On Tue, 17 Aug 2021 15:46:50 +0200
Christian Schoenebeck  wrote:

> Suggested-by: Greg Kurz 
> Signed-off-by: Christian Schoenebeck 
> ---
>  hw/9pfs/9p.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index 4d642ab12a..c857b31321 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -1703,11 +1703,12 @@ static bool same_stat_id(const struct stat *a, const 
> struct stat *b)
>  static void coroutine_fn v9fs_walk(void *opaque)
>  {
>  int name_idx;
> -V9fsQID *qids = NULL;
> +g_autofree V9fsQID *qids = NULL;
>  int i, err = 0;
>  V9fsPath dpath, path, *pathes = NULL;
>  uint16_t nwnames;
> -struct stat stbuf, fidst, *stbufs = NULL;
> +struct stat stbuf, fidst;
> +g_autofree struct stat *stbufs = NULL;
>  size_t offset = 7;
>  int32_t fid, newfid;
>  V9fsString *wnames = NULL;
> @@ -1872,8 +1873,6 @@ out_nofid:
>  v9fs_path_free(&pathes[name_idx]);
>  }
>  g_free(wnames);
> -g_free(qids);
> -g_free(stbufs);
>  g_free(pathes);

It seems that wnames and pathes could be converted to
g_autofree as well or I'm missing something ?

Feel free to add my R-b with or without that extra change.

Reviewed-by: Greg Kurz 

>  }
>  }

Re: [PATCH v3 02/14] tcg/arm: Standardize on tcg_out__{reg, imm}

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 22:41, Richard Henderson
 wrote:
>
> Some of the functions specified _reg, some _imm, and some
> left it blank.  Make it clearer to which we are referring.
>
> Split tcg_out_b_reg from tcg_out_bx_reg, to indicate when
> we do not actually require BX semantics.
>
> Signed-off-by: Richard Henderson 

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v3 04/14] tcg/arm: Support armv4t in tcg_out_goto and tcg_out_call

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 22:38, Richard Henderson
 wrote:
>
> ARMv4T has BX as its only interworking instruction.  In order
> to support testing of different architecture revisions with a
> qemu binary that may have been built for, say ARMv6T2, fill in
> the blank required to make calls to helpers in thumb mode.
>
> Signed-off-by: Richard Henderson 

Reviewed-by: Peter Maydell 

thanks
-- PMM

[PATCH] memory: Have 'info mtree' remove duplicated Address Space information

2021-08-20 Thread Philippe Mathieu-Daudé

Per Peter Maydell [*]:

  'info mtree' monitor command was designed on the assumption that
  there's really only one or two interesting address spaces, and
  with more recent developments that's just not the case any more.

Similarly about how the FlatView are sorted using a GHashTable,
sort the AddressSpace objects to remove the duplications (AS
using the same root MemoryRegion).

This drastically reduce 'info mtree' on some boards.

Before:

  $ (echo info mtree; echo q) \
| qemu-system-aarch64 -S -monitor stdio -M raspi3b \
| wc -l
  423

After:

  $ (echo info mtree; echo q) \
| qemu-system-aarch64 -S -monitor stdio -M raspi3b \
| wc -l
  108

  (qemu) info mtree
  address-space: I/O
- (prio 0, i/o): io

  address-space shared 9 times:
- cpu-memory-0
- cpu-memory-1
- cpu-memory-2
- cpu-memory-3
- cpu-secure-memory-0
- cpu-secure-memory-1
- cpu-secure-memory-2
- cpu-secure-memory-3
- memory
- (prio 0, i/o): system
  -3fff (prio 0, ram): ram
  3f00-3fff (prio 1, i/o): bcm2835-peripherals
3f003000-3f00301f (prio 0, i/o): bcm2835-sys-timer
3f004000-3f004fff (prio -1000, i/o): bcm2835-txp
3f006000-3f006fff (prio 0, i/o): mphi
3f007000-3f007fff (prio 0, i/o): bcm2835-dma
3f00b200-3f00b3ff (prio 0, i/o): bcm2835-ic
3f00b400-3f00b43f (prio -1000, i/o): bcm2835-sp804
3f00b800-3f00bbff (prio 0, i/o): bcm2835-mbox
3f10-3f1001ff (prio 0, i/o): bcm2835-powermgt
3f101000-3f102fff (prio 0, i/o): bcm2835-cprman
3f104000-3f10400f (prio 0, i/o): bcm2835-rng
3f20-3f200fff (prio 0, i/o): bcm2835_gpio
3f201000-3f201fff (prio 0, i/o): pl011
3f202000-3f202fff (prio 0, i/o): bcm2835-sdhost
3f203000-3f2030ff (prio -1000, i/o): bcm2835-i2s
3f204000-3f20401f (prio -1000, i/o): bcm2835-spi0
3f205000-3f20501f (prio -1000, i/o): bcm2835-i2c0
3f20f000-3f20f07f (prio -1000, i/o): bcm2835-otp
3f212000-3f212007 (prio 0, i/o): bcm2835-thermal
3f214000-3f2140ff (prio -1000, i/o): bcm2835-spis
3f215000-3f2150ff (prio 0, i/o): bcm2835-aux
3f30-3f3000ff (prio 0, i/o): sdhci
3f60-3f6000ff (prio -1000, i/o): bcm2835-smi
3f804000-3f80401f (prio -1000, i/o): bcm2835-i2c1
3f805000-3f80501f (prio -1000, i/o): bcm2835-i2c2
3f90-3f907fff (prio -1000, i/o): bcm2835-dbus
3f91-3f917fff (prio -1000, i/o): bcm2835-ave0
3f98-3f990fff (prio 0, i/o): dwc2
  3f98-3f980fff (prio 0, i/o): dwc2-io
  3f981000-3f990fff (prio 0, i/o): dwc2-fifo
3fc0-3fc00fff (prio -1000, i/o): bcm2835-v3d
3fe0-3fe000ff (prio -1000, i/o): bcm2835-sdramc
3fe05000-3fe050ff (prio 0, i/o): bcm2835-dma-chan15
  4000-40ff (prio 0, i/o): bcm2836-control

  address-space shared 4 times:
- bcm2835-dma-memory
- bcm2835-fb-memory
- bcm2835-property-memory
- dwc2
- (prio 0, i/o): bcm2835-gpu
  -3fff (prio 0, ram): alias 
bcm2835-gpu-ram-alias[*] @ram -3fff
  4000-7fff (prio 0, ram): alias 
bcm2835-gpu-ram-alias[*] @ram -3fff
  7e00-7eff (prio 1, i/o): alias 
bcm2835-peripherals @bcm2835-peripherals -00ff
  8000-bfff (prio 0, ram): alias 
bcm2835-gpu-ram-alias[*] @ram -3fff
  c000- (prio 0, ram): alias 
bcm2835-gpu-ram-alias[*] @ram -3fff

  address-space: bcm2835-mbox-memory
-008f (prio 0, i/o): bcm2835-mbox
  0010-001f (prio 0, i/o): bcm2835-fb
  0080-008f (prio 0, i/o): bcm2835-property

  memory-region: ram
-3fff (prio 0, ram): ram

  memory-region: bcm2835-peripherals
3f00-3fff (prio 1, i/o): bcm2835-peripherals
  3f003000-3f00301f (prio 0, i/o): bcm2835-sys-timer
  3f004000-3f004fff (prio -1000, i/o): bcm2835-txp
  3f006000-3f006fff (prio 0, i/o): mp

Re: [PATCH] memory: Have 'info mtree' remove duplicated Address Space information

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/20/21 12:54 PM, Philippe Mathieu-Daudé wrote:
> Per Peter Maydell [*]:
> 
>   'info mtree' monitor command was designed on the assumption that
>   there's really only one or two interesting address spaces, and
>   with more recent developments that's just not the case any more.
> 
> Similarly about how the FlatView are sorted using a GHashTable,
> sort the AddressSpace objects to remove the duplications (AS
> using the same root MemoryRegion).
> 
> This drastically reduce 'info mtree' on some boards.
> 
> Before:
> 
>   $ (echo info mtree; echo q) \
> | qemu-system-aarch64 -S -monitor stdio -M raspi3b \
> | wc -l
>   423
> 
> After:
> 
>   $ (echo info mtree; echo q) \
> | qemu-system-aarch64 -S -monitor stdio -M raspi3b \
> | wc -l
>   108
> 
>   (qemu) info mtree
>   address-space: I/O
> - (prio 0, i/o): io
> 
>   address-space shared 9 times:
> - cpu-memory-0
> - cpu-memory-1
> - cpu-memory-2
> - cpu-memory-3
> - cpu-secure-memory-0
> - cpu-secure-memory-1
> - cpu-secure-memory-2
> - cpu-secure-memory-3
> - memory
> - (prio 0, i/o): system
>   -3fff (prio 0, ram): ram
>   3f00-3fff (prio 1, i/o): bcm2835-peripherals
> 3f003000-3f00301f (prio 0, i/o): bcm2835-sys-timer
> 3f004000-3f004fff (prio -1000, i/o): bcm2835-txp
> 3f006000-3f006fff (prio 0, i/o): mphi
> 3f007000-3f007fff (prio 0, i/o): bcm2835-dma
> 3f00b200-3f00b3ff (prio 0, i/o): bcm2835-ic
> 3f00b400-3f00b43f (prio -1000, i/o): bcm2835-sp804
> 3f00b800-3f00bbff (prio 0, i/o): bcm2835-mbox
> 3f10-3f1001ff (prio 0, i/o): bcm2835-powermgt
> 3f101000-3f102fff (prio 0, i/o): bcm2835-cprman
> 3f104000-3f10400f (prio 0, i/o): bcm2835-rng
> 3f20-3f200fff (prio 0, i/o): bcm2835_gpio
> 3f201000-3f201fff (prio 0, i/o): pl011
> 3f202000-3f202fff (prio 0, i/o): bcm2835-sdhost
> 3f203000-3f2030ff (prio -1000, i/o): bcm2835-i2s
> 3f204000-3f20401f (prio -1000, i/o): bcm2835-spi0
> 3f205000-3f20501f (prio -1000, i/o): bcm2835-i2c0
> 3f20f000-3f20f07f (prio -1000, i/o): bcm2835-otp
> 3f212000-3f212007 (prio 0, i/o): bcm2835-thermal
> 3f214000-3f2140ff (prio -1000, i/o): bcm2835-spis
> 3f215000-3f2150ff (prio 0, i/o): bcm2835-aux
> 3f30-3f3000ff (prio 0, i/o): sdhci
> 3f60-3f6000ff (prio -1000, i/o): bcm2835-smi
> 3f804000-3f80401f (prio -1000, i/o): bcm2835-i2c1
> 3f805000-3f80501f (prio -1000, i/o): bcm2835-i2c2
> 3f90-3f907fff (prio -1000, i/o): bcm2835-dbus
> 3f91-3f917fff (prio -1000, i/o): bcm2835-ave0
> 3f98-3f990fff (prio 0, i/o): dwc2
>   3f98-3f980fff (prio 0, i/o): dwc2-io
>   3f981000-3f990fff (prio 0, i/o): dwc2-fifo
> 3fc0-3fc00fff (prio -1000, i/o): bcm2835-v3d
> 3fe0-3fe000ff (prio -1000, i/o): bcm2835-sdramc
> 3fe05000-3fe050ff (prio 0, i/o): bcm2835-dma-chan15
>   4000-40ff (prio 0, i/o): bcm2836-control
> 
>   address-space shared 4 times:
> - bcm2835-dma-memory
> - bcm2835-fb-memory
> - bcm2835-property-memory
> - dwc2
> - (prio 0, i/o): bcm2835-gpu
>   -3fff (prio 0, ram): alias 
> bcm2835-gpu-ram-alias[*] @ram -3fff
>   4000-7fff (prio 0, ram): alias 
> bcm2835-gpu-ram-alias[*] @ram -3fff
>   7e00-7eff (prio 1, i/o): alias 
> bcm2835-peripherals @bcm2835-peripherals -00ff
>   8000-bfff (prio 0, ram): alias 
> bcm2835-gpu-ram-alias[*] @ram -3fff
>   c000- (prio 0, ram): alias 
> bcm2835-gpu-ram-alias[*] @ram -3fff
> 
>   address-space: bcm2835-mbox-memory
> -008f (prio 0, i/o): bcm2835-mbox
>   0010-001f (prio 0, i/o): bcm2835-fb
>   0080-008f (prio 0, i/o): bcm2835-property
> 
>   memory-region: ram
> -3fff (prio 0, ram): ram
> 
>   memory-region: bcm2835-peripherals
> 3f0

Re: [PATCH v3 03/14] tcg/arm: Simplify use_armvt5_instructions

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 22:32, Richard Henderson
 wrote:
>
> According to the Arm ARM DDI 0406C, section A1.3, the valid variants
> are ARMv5T, ARMv5TE, ARMv5TEJ -- there is no ARMv5 without Thumb.
> Therefore simplify the test from preprocessor ifdefs to base
> architecture revision.  Retain the "t" in the name to minimize churn.
>
> Signed-off-by: Richard Henderson 
> ---
>  tcg/arm/tcg-target.h | 8 +---
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
> index 18bb16c784..f41b809554 100644
> --- a/tcg/arm/tcg-target.h
> +++ b/tcg/arm/tcg-target.h
> @@ -28,13 +28,7 @@
>
>  extern int arm_arch;
>
> -#if defined(__ARM_ARCH_5T__) \
> -|| defined(__ARM_ARCH_5TE__) || defined(__ARM_ARCH_5TEJ__)
> -# define use_armv5t_instructions 1
> -#else
> -# define use_armv5t_instructions use_armv6_instructions
> -#endif
> -
> +#define use_armv5t_instructions (__ARM_ARCH >= 5 || arm_arch >= 5)
>  #define use_armv6_instructions  (__ARM_ARCH >= 6 || arm_arch >= 6)
>  #define use_armv7_instructions  (__ARM_ARCH >= 7 || arm_arch >= 7)

Typo in subject: should be "armv5t". Otherwise
Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v3 05/14] tcg/arm: Examine QEMU_TCG_DEBUG environment variable

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 22:42, Richard Henderson
 wrote:
>
> Use the environment variable to test an older ISA from
> the one supported by the host.
>
> Signed-off-by: Richard Henderson 

I think we should document this environment variable somewhere...



> +/*
> + * For debugging/testing purposes, allow the ISA to be reduced
> + * (but not extended) from the set detected above.
> + */
> +#ifdef CONFIG_DEBUG_TCG
> +{
> +char *opt = g_strdup(getenv("QEMU_TCG_DEBUG"));

Consider g_autofree ?

> +if (opt) {
> +for (char *o = strtok(opt, ","); o ; o = strtok(NULL, ",")) {
> +if (o[0] == 'v' &&
> +o[1] >= '4' &&
> +o[1] <= '0' + arm_arch &&
> +o[2] == 0) {
> +arm_arch = o[1] - '0';
> +continue;
> +}
> +if (strcmp(o, "!neon") == 0) {
> +use_neon_instructions = false;
> +continue;
> +}
> +if (strcmp(o, "help") == 0) {
> +printf("QEMU_TCG_DEBUG={,} where  is\n"
> +   "  v   select ARMv\n"
> +   "  !neon  disable ARM NEON\n");
> +exit(0);
> +}

We should complain about something in the variable we don't understand,
rather than just ignoring it.

> +}
> +g_free(opt);
> +}
> +}
> +#endif
> +

-- PMM

Re: [PATCH 2/3] qcow2: refactor handle_dependencies() loop body

2021-08-20 Thread Hanna Reitz


On 24.07.21 15:38, Vladimir Sementsov-Ogievskiy wrote:

No logic change, just prepare for the following commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/qcow2-cluster.c | 49 ---
  1 file changed, 28 insertions(+), 21 deletions(-)


Reviewed-by: Hanna Reitz

Re: [PATCH 0/1] uas: add stream number sanity checks (maybe 6.1)

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/18/21 2:05 PM, Gerd Hoffmann wrote:
> Security fix.  Sorry for the last-minute patch, I had completely
> forgotten this one until the CVE number for it arrived today.
> 
> Given that the classic usb storage device is way more popular than
> the uas (usb attached scsi) device the impact should be pretty low
> and we might consider to not screw up our release schedule for this.

So 6.1-rc5 or stable?

Re: [PATCH v3 07/14] tcg/arm: Split out tcg_out_ldstm

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 22:36, Richard Henderson
 wrote:
>
> Expand these hard-coded instructions symbolically.
>
> Signed-off-by: Richard Henderson 
> ---
>  tcg/arm/tcg-target.c.inc | 19 +--
>  1 file changed, 17 insertions(+), 2 deletions(-)

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v3 08/14] tcg/arm: Simplify usage of encode_imm

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 22:38, Richard Henderson
 wrote:
>
> We have already computed the rotated value of the imm8
> portion of the complete imm12 encoding.  No sense leaving
> the combination of rot + rotation to the caller.
>
> Create an encode_imm12_nofail helper that performs an assert.
>
> This removes the final use of the local "rotl" function,
> which duplicated our generic "rol32" function.
>
> Signed-off-by: Richard Henderson 
> ---
>  tcg/arm/tcg-target.c.inc | 141 +--
>  1 file changed, 77 insertions(+), 64 deletions(-)

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v3 13/14] tcg/arm: Reserve a register for guest_base

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 22:33, Richard Henderson
 wrote:
>
> Reserve a register for the guest_base using aarch64 for reference.
> By doing so, we do not have to recompute it for every memory load.
>
> Signed-off-by: Richard Henderson 
> ---
>  tcg/arm/tcg-target.c.inc | 39 ---
>  1 file changed, 28 insertions(+), 11 deletions(-)
>
> diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
> index 35bd4c68d6..2728035177 100644
> --- a/tcg/arm/tcg-target.c.inc
> +++ b/tcg/arm/tcg-target.c.inc
> @@ -84,6 +84,9 @@ static const int tcg_target_call_oarg_regs[2] = {
>
>  #define TCG_REG_TMP  TCG_REG_R12
>  #define TCG_VEC_TMP  TCG_REG_Q15
> +#ifndef CONFIG_SOFTMMU
> +#define TCG_REG_GUEST_BASE  TCG_REG_R11
> +#endif
>
>  typedef enum {
>  COND_EQ = 0x0,
> @@ -1763,7 +1766,8 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, 
> TCGLabelQemuLdst *lb)
>
>  static void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
>TCGReg datalo, TCGReg datahi,
> -  TCGReg addrlo, TCGReg addend)
> +  TCGReg addrlo, TCGReg addend,
> +  bool scratch_addend)
>  {
>  /* Byte swapping is left to middle-end expansion. */
>  tcg_debug_assert((opc & MO_BSWAP) == 0);
> @@ -1790,7 +1794,7 @@ static void tcg_out_qemu_ld_index(TCGContext *s, MemOp 
> opc,
>  && get_alignment_bits(opc) >= MO_64
>  && (datalo & 1) == 0 && datahi == datalo + 1) {
>  tcg_out_ldrd_r(s, COND_AL, datalo, addrlo, addend);
> -} else if (datalo != addend) {
> +} else if (scratch_addend) {
>  tcg_out_ld32_rwb(s, COND_AL, datalo, addend, addrlo);
>  tcg_out_ld32_12(s, COND_AL, datahi, addend, 4);
>  } else {

I don't understand this change. Yes, we can trash the addend
register, but if it's the same as 'datalo' then the second load
is not going to DTRT... Shouldn't this be
  if (scratch_addend && datalo != addend)
?

-- PMM

Re: [Patch 1/2] hw/arm/xlnx-versal: Add unimplemented APU mmio

2021-08-20 Thread Edgar E. Iglesias

On Wed, Aug 18, 2021 at 08:15:24PM -0700, Tong Ho wrote:
> Add unimplemented APU mmio region to xlnx-versal for booting
> bare-metal guests built with standalone bsp published at:
>   
> https://github.com/Xilinx/embeddedsw/tree/master/lib/bsp/standalone/src/arm/ARMv8/64bit


Reviewed-by: Edgar E. Iglesias 



> 
> Signed-off-by: Tong Ho 
> ---
>  hw/arm/xlnx-versal.c | 2 ++
>  include/hw/arm/xlnx-versal.h | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
> index fb776834f7..cb6ec0a4a0 100644
> --- a/hw/arm/xlnx-versal.c
> +++ b/hw/arm/xlnx-versal.c
> @@ -376,6 +376,8 @@ static void versal_unimp(Versal *s)
>  MM_CRL, MM_CRL_SIZE);
>  versal_unimp_area(s, "crf", &s->mr_ps,
>  MM_FPD_CRF, MM_FPD_CRF_SIZE);
> +versal_unimp_area(s, "apu", &s->mr_ps,
> +MM_FPD_FPD_APU, MM_FPD_FPD_APU_SIZE);
>  versal_unimp_area(s, "crp", &s->mr_ps,
>  MM_PMC_CRP, MM_PMC_CRP_SIZE);
>  versal_unimp_area(s, "iou-scntr", &s->mr_ps,
> diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
> index 22a8fa5d11..9b79051747 100644
> --- a/include/hw/arm/xlnx-versal.h
> +++ b/include/hw/arm/xlnx-versal.h
> @@ -167,6 +167,8 @@ struct Versal {
>  #define MM_IOU_SCNTRS_SIZE  0x1
>  #define MM_FPD_CRF  0xfd1aU
>  #define MM_FPD_CRF_SIZE 0x14
> +#define MM_FPD_FPD_APU  0xfd5c
> +#define MM_FPD_FPD_APU_SIZE 0x100
>  
>  #define MM_PMC_SD0  0xf104U
>  #define MM_PMC_SD0_SIZE 0x1
> -- 
> 2.25.1
>

Re: [PATCH 0/1] uas: add stream number sanity checks (maybe 6.1)

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 13:10, Gerd Hoffmann  wrote:
>
> Security fix.  Sorry for the last-minute patch, I had completely
> forgotten this one until the CVE number for it arrived today.
>
> Given that the classic usb storage device is way more popular than
> the uas (usb attached scsi) device the impact should be pretty low
> and we might consider to not screw up our release schedule for this.

What's the impact if the bug is exploited ?

-- PMM

Re: [PATCH 1/2] hw/9pfs: avoid 'path' copy in v9fs_walk()

2021-08-20 Thread Christian Schoenebeck

On Freitag, 20. August 2021 12:35:49 CEST Greg Kurz wrote:
> On Tue, 17 Aug 2021 14:38:24 +0200
> 
> Christian Schoenebeck  wrote:
> > The v9fs_walk() function resolves all client submitted path nodes to the
> > local 'pathes' array. Using a separate string scalar variable 'path'
> > inside the background worker thread loop and copying that local 'path'
> > string scalar variable subsequently to the 'pathes' array (at the end of
> > each loop iteration) is not necessary.
> > 
> > Instead simply resolve each path directly to the 'pathes' array and
> > don't use the string scalar variable 'path' inside the fs worker thread
> > loop at all.
> > 
> > The only advantage of the 'path' scalar was that in case of an error
> > the respective 'pathes' element would not be filled. Right now this is
> > not an issue as the v9fs_walk() function returns as soon as any error
> > occurs.
> > 
> > Suggested-by: Greg Kurz 
> > Signed-off-by: Christian Schoenebeck 
> > ---
> 
> Reviewed-by: Greg Kurz 
> 
> With this change, the path variable is no longer used at all in the
> first loop. 

Correct.

> I see at least an extra possible cleanup : don't set
> path before the first loop since it gets reset before the second
> one. 

Also correct.

> Maybe we can even get rid of path all the way ? I'll have
> a look.

Yes, that's the plan.

There is still quite a bunch that can be cleaned up in that function. I just 
didn't want to start with a two-digit patch set right after the long summer 
break. ;-)

If you want to send some cleanup patches, always appreciated.

Best regards,
Christian Schoenebeck

Re: [Patch 2/2] hw/arm/xlnx-zynqmp: Add unimplemented APU mmio

2021-08-20 Thread Edgar E. Iglesias

On Wed, Aug 18, 2021 at 08:15:25PM -0700, Tong Ho wrote:
> Add unimplemented APU mmio region to xlnx-zynqmp for booting
> bare-metal guests built with standalone bsp published at:
>   
> https://github.com/Xilinx/embeddedsw/tree/master/lib/bsp/standalone/src/arm/ARMv8/64bit
> 
> Signed-off-by: Tong Ho 
> ---
>  hw/arm/xlnx-zynqmp.c | 32 
>  include/hw/arm/xlnx-zynqmp.h |  7 +++
>  2 files changed, 39 insertions(+)
> 
> diff --git a/hw/arm/xlnx-zynqmp.c b/hw/arm/xlnx-zynqmp.c
> index 3597e8db4d..790df2b6f1 100644
> --- a/hw/arm/xlnx-zynqmp.c
> +++ b/hw/arm/xlnx-zynqmp.c
> @@ -20,6 +20,7 @@
>  #include "qemu/module.h"
>  #include "hw/arm/xlnx-zynqmp.h"
>  #include "hw/intc/arm_gic_common.h"
> +#include "hw/misc/unimp.h"
>  #include "hw/boards.h"
>  #include "sysemu/kvm.h"
>  #include "sysemu/sysemu.h"
> @@ -56,6 +57,9 @@
>  #define DPDMA_ADDR  0xfd4c
>  #define DPDMA_IRQ   116
>  
> +#define APU_ADDR0xfd5c
> +#define APU_SIZE0x100
> +
>  #define IPI_ADDR0xFF30
>  #define IPI_IRQ 64
>  
> @@ -222,6 +226,32 @@ static void xlnx_zynqmp_create_rpu(MachineState *ms, 
> XlnxZynqMPState *s,
>  qdev_realize(DEVICE(&s->rpu_cluster), NULL, &error_fatal);
>  }
>  
> +static void xlnx_zynqmp_create_unimp_mmio(XlnxZynqMPState *s)
> +{
> +static const struct UnimpInfo {
> +const char *name;
> +hwaddr base;
> +hwaddr size;
> +} unimp_areas[ARRAY_SIZE(s->mr_unimp)] = {
> +{ .name = "apu", APU_ADDR, APU_SIZE },
> +};
> +

Nitpick, I'd probably drop this blank line and spell out the unsigned int below.

Also, we might want consider asserting that every array member has been
initialized. Eg assert(info->name) in the loop below. Unless perhaps something
already bails out in those cases (e.g sysbus_realize_and_unref()).

In either case:
Reviewed-by: Edgar E. Iglesias 


> +unsigned nr;
> +
> +for (nr = 0; nr < ARRAY_SIZE(unimp_areas); nr++) {
> +const struct UnimpInfo *info = &unimp_areas[nr];
> +DeviceState *dev = qdev_new(TYPE_UNIMPLEMENTED_DEVICE);
> +SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
> +
> +qdev_prop_set_string(dev, "name", info->name);
> +qdev_prop_set_uint64(dev, "size", info->size);
> +object_property_add_child(OBJECT(s), info->name, OBJECT(dev));
> +
> +sysbus_realize_and_unref(sbd, &error_fatal);
> +sysbus_mmio_map(sbd, 0, info->base);
> +}
> +}
> +
>  static void xlnx_zynqmp_init(Object *obj)
>  {
>  MachineState *ms = MACHINE(qdev_get_machine());
> @@ -616,6 +646,8 @@ static void xlnx_zynqmp_realize(DeviceState *dev, Error 
> **errp)
>  sysbus_mmio_map(SYS_BUS_DEVICE(&s->rtc), 0, RTC_ADDR);
>  sysbus_connect_irq(SYS_BUS_DEVICE(&s->rtc), 0, gic_spi[RTC_IRQ]);
>  
> +xlnx_zynqmp_create_unimp_mmio(s);
> +
>  for (i = 0; i < XLNX_ZYNQMP_NUM_GDMA_CH; i++) {
>  if (!object_property_set_uint(OBJECT(&s->gdma[i]), "bus-width", 128,
>errp)) {
> diff --git a/include/hw/arm/xlnx-zynqmp.h b/include/hw/arm/xlnx-zynqmp.h
> index d3e2ef97f6..c84fe15996 100644
> --- a/include/hw/arm/xlnx-zynqmp.h
> +++ b/include/hw/arm/xlnx-zynqmp.h
> @@ -79,6 +79,11 @@ OBJECT_DECLARE_SIMPLE_TYPE(XlnxZynqMPState, XLNX_ZYNQMP)
>  #define XLNX_ZYNQMP_MAX_RAM_SIZE (XLNX_ZYNQMP_MAX_LOW_RAM_SIZE + \
>XLNX_ZYNQMP_MAX_HIGH_RAM_SIZE)
>  
> +/*
> + * Unimplemented mmio regions needed to boot some images.
> + */
> +#define XLNX_ZYNQMP_NUM_UNIMP_AREAS 1
> +
>  struct XlnxZynqMPState {
>  /*< private >*/
>  DeviceState parent_obj;
> @@ -96,6 +101,8 @@ struct XlnxZynqMPState {
>  MemoryRegion *ddr_ram;
>  MemoryRegion ddr_ram_low, ddr_ram_high;
>  
> +MemoryRegion mr_unimp[XLNX_ZYNQMP_NUM_UNIMP_AREAS];
> +
>  CadenceGEMState gem[XLNX_ZYNQMP_NUM_GEMS];
>  CadenceUARTState uart[XLNX_ZYNQMP_NUM_UARTS];
>  XlnxZynqMPCANState can[XLNX_ZYNQMP_NUM_CAN];
> -- 
> 2.25.1
>

Re: [PATCH 2/2] hw/9pfs: use g_autofree in v9fs_walk() where possible

2021-08-20 Thread Christian Schoenebeck

On Freitag, 20. August 2021 12:40:31 CEST Greg Kurz wrote:
> On Tue, 17 Aug 2021 15:46:50 +0200
> 
> Christian Schoenebeck  wrote:
> > Suggested-by: Greg Kurz 
> > Signed-off-by: Christian Schoenebeck 
> > ---
> > 
> >  hw/9pfs/9p.c | 7 +++
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> > 
> > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> > index 4d642ab12a..c857b31321 100644
> > --- a/hw/9pfs/9p.c
> > +++ b/hw/9pfs/9p.c
> > @@ -1703,11 +1703,12 @@ static bool same_stat_id(const struct stat *a,
> > const struct stat *b)> 
> >  static void coroutine_fn v9fs_walk(void *opaque)
> >  {
> >  
> >  int name_idx;
> > 
> > -V9fsQID *qids = NULL;
> > +g_autofree V9fsQID *qids = NULL;
> > 
> >  int i, err = 0;
> >  V9fsPath dpath, path, *pathes = NULL;
> >  uint16_t nwnames;
> > 
> > -struct stat stbuf, fidst, *stbufs = NULL;
> > +struct stat stbuf, fidst;
> > +g_autofree struct stat *stbufs = NULL;
> > 
> >  size_t offset = 7;
> >  int32_t fid, newfid;
> >  V9fsString *wnames = NULL;
> > 
> > @@ -1872,8 +1873,6 @@ out_nofid:
> >  v9fs_path_free(&pathes[name_idx]);
> >  
> >  }
> >  g_free(wnames);
> > 
> > -g_free(qids);
> > -g_free(stbufs);
> > 
> >  g_free(pathes);
> 
> It seems that wnames and pathes could be converted to
> g_autofree as well or I'm missing something ?

Yeah, I mentioned that in the cover letter. Those two are omitted in this 
patch because they contain dynamically allocated memory per array element 
which need to be freed individually before freeing the array.

So those two would probably require custom cleanup handlers.

> 
> Feel free to add my R-b with or without that extra change.
> 
> Reviewed-by: Greg Kurz 

Thanks!

> 
> >  }
> >  
> >  }

Best regards,
Christian Schoenebeck

Re: [PATCH 2/2] hw/9pfs: use g_autofree in v9fs_walk() where possible

2021-08-20 Thread Greg Kurz

On Fri, 20 Aug 2021 14:23:26 +0200
Christian Schoenebeck  wrote:

> On Freitag, 20. August 2021 12:40:31 CEST Greg Kurz wrote:
> > On Tue, 17 Aug 2021 15:46:50 +0200
> > 
> > Christian Schoenebeck  wrote:
> > > Suggested-by: Greg Kurz 
> > > Signed-off-by: Christian Schoenebeck 
> > > ---
> > > 
> > >  hw/9pfs/9p.c | 7 +++
> > >  1 file changed, 3 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> > > index 4d642ab12a..c857b31321 100644
> > > --- a/hw/9pfs/9p.c
> > > +++ b/hw/9pfs/9p.c
> > > @@ -1703,11 +1703,12 @@ static bool same_stat_id(const struct stat *a,
> > > const struct stat *b)> 
> > >  static void coroutine_fn v9fs_walk(void *opaque)
> > >  {
> > >  
> > >  int name_idx;
> > > 
> > > -V9fsQID *qids = NULL;
> > > +g_autofree V9fsQID *qids = NULL;
> > > 
> > >  int i, err = 0;
> > >  V9fsPath dpath, path, *pathes = NULL;
> > >  uint16_t nwnames;
> > > 
> > > -struct stat stbuf, fidst, *stbufs = NULL;
> > > +struct stat stbuf, fidst;
> > > +g_autofree struct stat *stbufs = NULL;
> > > 
> > >  size_t offset = 7;
> > >  int32_t fid, newfid;
> > >  V9fsString *wnames = NULL;
> > > 
> > > @@ -1872,8 +1873,6 @@ out_nofid:
> > >  v9fs_path_free(&pathes[name_idx]);
> > >  
> > >  }
> > >  g_free(wnames);
> > > 
> > > -g_free(qids);
> > > -g_free(stbufs);
> > > 
> > >  g_free(pathes);
> > 
> > It seems that wnames and pathes could be converted to
> > g_autofree as well or I'm missing something ?
> 
> Yeah, I mentioned that in the cover letter. Those two are omitted in this 
> patch because they contain dynamically allocated memory per array element 
> which need to be freed individually before freeing the array.
> 
> So those two would probably require custom cleanup handlers.
> 

The freeing of the individual elements is already handled in the loop above
the g_free() calls. The wnames and pathes pointers can thus be treated like
the other ones.

> > 
> > Feel free to add my R-b with or without that extra change.
> > 
> > Reviewed-by: Greg Kurz 
> 
> Thanks!
> 
> > 
> > >  }
> > >  
> > >  }
> 
> Best regards,
> Christian Schoenebeck
> 
>

Re: [PATCH 2/2] hw/9pfs: use g_autofree in v9fs_walk() where possible

2021-08-20 Thread Christian Schoenebeck

On Freitag, 20. August 2021 14:34:11 CEST Greg Kurz wrote:
> On Fri, 20 Aug 2021 14:23:26 +0200
> 
> Christian Schoenebeck  wrote:
> > On Freitag, 20. August 2021 12:40:31 CEST Greg Kurz wrote:
> > > On Tue, 17 Aug 2021 15:46:50 +0200
> > > 
> > > Christian Schoenebeck  wrote:
> > > > Suggested-by: Greg Kurz 
> > > > Signed-off-by: Christian Schoenebeck 
> > > > ---
> > > > 
> > > >  hw/9pfs/9p.c | 7 +++
> > > >  1 file changed, 3 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> > > > index 4d642ab12a..c857b31321 100644
> > > > --- a/hw/9pfs/9p.c
> > > > +++ b/hw/9pfs/9p.c
> > > > @@ -1703,11 +1703,12 @@ static bool same_stat_id(const struct stat *a,
> > > > const struct stat *b)>
> > > > 
> > > >  static void coroutine_fn v9fs_walk(void *opaque)
> > > >  {
> > > >  
> > > >  int name_idx;
> > > > 
> > > > -V9fsQID *qids = NULL;
> > > > +g_autofree V9fsQID *qids = NULL;
> > > > 
> > > >  int i, err = 0;
> > > >  V9fsPath dpath, path, *pathes = NULL;
> > > >  uint16_t nwnames;
> > > > 
> > > > -struct stat stbuf, fidst, *stbufs = NULL;
> > > > +struct stat stbuf, fidst;
> > > > +g_autofree struct stat *stbufs = NULL;
> > > > 
> > > >  size_t offset = 7;
> > > >  int32_t fid, newfid;
> > > >  V9fsString *wnames = NULL;
> > > > 
> > > > @@ -1872,8 +1873,6 @@ out_nofid:
> > > >  v9fs_path_free(&pathes[name_idx]);
> > > >  
> > > >  }
> > > >  g_free(wnames);
> > > > 
> > > > -g_free(qids);
> > > > -g_free(stbufs);
> > > > 
> > > >  g_free(pathes);
> > > 
> > > It seems that wnames and pathes could be converted to
> > > g_autofree as well or I'm missing something ?
> > 
> > Yeah, I mentioned that in the cover letter. Those two are omitted in this
> > patch because they contain dynamically allocated memory per array element
> > which need to be freed individually before freeing the array.
> > 
> > So those two would probably require custom cleanup handlers.
> 
> The freeing of the individual elements is already handled in the loop above
> the g_free() calls. The wnames and pathes pointers can thus be treated like
> the other ones.

Yes I know. I was just considering to make that in a safer way that would 
allow simple returns in future without goto out_something; branches. But yes, 
as it is right now they could be converted in the exact same way yet.

Best regards,
Christian Schoenebeck

Re: [PATCH 0/1] uas: add stream number sanity checks (maybe 6.1)

2021-08-20 Thread Philippe Mathieu-Daudé

Cc'ing Mauro to double-check.

On 8/20/21 2:12 PM, Peter Maydell wrote:
> On Wed, 18 Aug 2021 at 13:10, Gerd Hoffmann  wrote:
>>
>> Security fix.  Sorry for the last-minute patch, I had completely
>> forgotten this one until the CVE number for it arrived today.
>>
>> Given that the classic usb storage device is way more popular than
>> the uas (usb attached scsi) device the impact should be pretty low
>> and we might consider to not screw up our release schedule for this.
> 
> What's the impact if the bug is exploited ?

Bug class: "guest-triggered user-after-free".

Being privileged (root) in the guest, you can leak some data from
the host process then DoS the host or potentially exploit the
use-after-free to execute code on the host.

Re: [Patch 1/2] hw/arm/xlnx-versal: Add unimplemented APU mmio

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/19/21 5:15 AM, Tong Ho wrote:
> Add unimplemented APU mmio region to xlnx-versal for booting
> bare-metal guests built with standalone bsp published at:
>   
> https://github.com/Xilinx/embeddedsw/tree/master/lib/bsp/standalone/src/arm/ARMv8/64bit

This link is not very useful.

This one is:
https://github.com/Xilinx/embeddedsw/blob/master-rel-2020.2/lib/sw_apps/versal_psmfw/src/fpd_apu.h

> 
> Signed-off-by: Tong Ho 
> ---
>  hw/arm/xlnx-versal.c | 2 ++
>  include/hw/arm/xlnx-versal.h | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
> index fb776834f7..cb6ec0a4a0 100644
> --- a/hw/arm/xlnx-versal.c
> +++ b/hw/arm/xlnx-versal.c
> @@ -376,6 +376,8 @@ static void versal_unimp(Versal *s)
>  MM_CRL, MM_CRL_SIZE);
>  versal_unimp_area(s, "crf", &s->mr_ps,
>  MM_FPD_CRF, MM_FPD_CRF_SIZE);
> +versal_unimp_area(s, "apu", &s->mr_ps,
> +MM_FPD_FPD_APU, MM_FPD_FPD_APU_SIZE);
>  versal_unimp_area(s, "crp", &s->mr_ps,
>  MM_PMC_CRP, MM_PMC_CRP_SIZE);
>  versal_unimp_area(s, "iou-scntr", &s->mr_ps,
> diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
> index 22a8fa5d11..9b79051747 100644
> --- a/include/hw/arm/xlnx-versal.h
> +++ b/include/hw/arm/xlnx-versal.h
> @@ -167,6 +167,8 @@ struct Versal {
>  #define MM_IOU_SCNTRS_SIZE  0x1
>  #define MM_FPD_CRF  0xfd1aU
>  #define MM_FPD_CRF_SIZE 0x14
> +#define MM_FPD_FPD_APU  0xfd5c
> +#define MM_FPD_FPD_APU_SIZE 0x100
>  
>  #define MM_PMC_SD0  0xf104U
>  #define MM_PMC_SD0_SIZE 0x1
>

Re: [Patch 2/2] hw/arm/xlnx-zynqmp: Add unimplemented APU mmio

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/19/21 5:15 AM, Tong Ho wrote:
> Add unimplemented APU mmio region to xlnx-zynqmp for booting
> bare-metal guests built with standalone bsp published at:
>   
> https://github.com/Xilinx/embeddedsw/tree/master/lib/bsp/standalone/src/arm/ARMv8/64bit

Again, please point to something more useful:

https://github.com/Xilinx/embeddedsw/blob/master-rel-2020.2/lib/sw_apps/zynqmp_fsbl/src/xfsbl_hw.h#L271

> Signed-off-by: Tong Ho 
> ---
>  hw/arm/xlnx-zynqmp.c | 32 
>  include/hw/arm/xlnx-zynqmp.h |  7 +++
>  2 files changed, 39 insertions(+)

Re: [PATCH 3/3] qcow2: handle_dependencies(): relax conflict detection

2021-08-20 Thread Hanna Reitz


On 24.07.21 15:38, Vladimir Sementsov-Ogievskiy wrote:

There is no conflict and no dependency if we have parallel writes to
different subclusters of one cluster when cluster itself is already
allocated. So, relax extra dependency.

Measure performance:
First, prepare build/qemu-img-old and build/qemu-img-new images.

cd scripts/simplebench
./img_bench_templater.py

Paste the following to stdin of running script:

qemu_img=../../build/qemu-img-{old|new}
$qemu_img create -f qcow2 -o extended_l2=on /ssd/x.qcow2 1G
$qemu_img bench -c 10 -d 8 [-s 2K|-s 2K -o 512|-s $((1024*2+512))] \
 -w -t none -n /ssd/x.qcow2

The result:

All results are in seconds

--  -  -
 oldnew
-s 2K   6.7 ± 15%  6.2 ± 12%
  -7%
-s 2K -o 51213 ± 3%11 ± 5%
  -16%
-s $((1024*2+512))  9.5 ± 4%   8.4
  -12%
--  -  -

So small writes are more independent now and that helps to keep deeper
io queue which improves performance.

271 iotest output becomes racy for three allocation in one cluster.
Second and third writes may finish in different order. Second and
third requests don't depend on each other any more. Still they both
depend on first request anyway. Keep only one for consistent output.


I mean, we could also just filter the result 
(`s/\(20480\|40960\)/FILTERED/` or something).  Perhaps there was some 
idea behind doing three writes, I don’t know exactly.


I think I’d prefer a filter, because I guess this is the only test that 
actually will do two subcluster requests in parallel...?


Hanna


Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/qcow2-cluster.c  | 11 +++
  tests/qemu-iotests/271 |  4 +---
  tests/qemu-iotests/271.out |  2 --
  3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 967121c7e6..8f56de5516 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1403,6 +1403,17 @@ static int handle_dependencies(BlockDriverState *bs, 
uint64_t guest_offset,
  continue;
  }
  
+if (old_alloc->keep_old_clusters &&

+(end <= l2meta_cow_start(old_alloc) ||
+ start >= l2meta_cow_end(old_alloc)))
+{
+/*
+ * Clusters intersect but COW areas don't. And cluster itself is
+ * already allocated. So, there is no actual conflict.
+ */
+continue;
+}
+
  /* Conflict */
  
  if (start < old_start) {

diff --git a/tests/qemu-iotests/271 b/tests/qemu-iotests/271
index 599b849cc6..939e88ee88 100755
--- a/tests/qemu-iotests/271
+++ b/tests/qemu-iotests/271
@@ -866,7 +866,7 @@ echo
  
  _concurrent_io()

  {
-# Allocate three subclusters in the same cluster.
+# Allocate two subclusters in the same cluster.
  # This works because handle_dependencies() checks whether the requests
  # allocate the same cluster, even if the COW regions don't overlap (in
  # this case they don't).
@@ -876,7 +876,6 @@ break write_aio A
  aio_write -P 10 30k 2k
  wait_break A
  aio_write -P 11 20k 2k
-aio_write -P 12 40k 2k
  resume A
  aio_flush
  EOF
@@ -888,7 +887,6 @@ cat <  
diff --git a/tests/qemu-iotests/271.out b/tests/qemu-iotests/271.out

index 81043ba4d7..d94c8fe061 100644
--- a/tests/qemu-iotests/271.out
+++ b/tests/qemu-iotests/271.out
@@ -721,6 +721,4 @@ wrote 2048/2048 bytes at offset 30720
  2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  wrote 2048/2048 bytes at offset 20480
  2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 2048/2048 bytes at offset 40960
-2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  *** done

Re: [PATCH v3 06/14] tcg/arm: Support unaligned access for softmmu

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 22:32, Richard Henderson
 wrote:
>
> From armv6, the architecture supports unaligned accesses.
> All we need to do is perform the correct alignment check
> in tcg_out_tlb_read and not use LDRD/STRD when the access
> is not aligned.
>
> Signed-off-by: Richard Henderson 
> @@ -1578,27 +1576,32 @@ static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg 
> addrlo, TCGReg addrhi,
>
>  /*
>   * Check alignment, check comparators.
> - * Do this in no more than 3 insns.  Use MOVW for v7, if possible,
> + * Do this in 2-4 insns.  Use MOVW for v7, if possible,
>   * to reduce the number of sequential conditional instructions.
>   * Almost all guests have at least 4k pages, which means that we need
>   * to clear at least 9 bits even for an 8-byte memory, which means it
>   * isn't worth checking for an immediate operand for BIC.
>   */
> +/* For unaligned accesses, test the page of the last byte. */
> +t_addr = addrlo;
> +if (a_mask < s_mask) {
> +t_addr = TCG_REG_R0;
> +tcg_out_dat_imm(s, COND_AL, ARITH_ADD, t_addr,
> +addrlo, s_mask - a_mask);
> +}

I don't understand what this comment means or why we're doing the
addition. If we know we need to check eg whether the address is 2-aligned,
why aren't we just checking whether it's 2-aligned ? Could you
expand on the explanation a bit?


thanks
-- PMM

Re: [PATCH 01/26] ppc: Add a POWER10 DD2 CPU

2021-08-20 Thread Greg Kurz

On Mon, 9 Aug 2021 15:45:22 +0200
Cédric Le Goater  wrote:

> The POWER10 DD2 CPU adds an extra LPCR[HAIL] bit. DD1 doesn't have
> HAIL but since it does not break the modeling and that we don't plan
> to support DD1, modify the LPCR mask of all the POWER10 family.
> 

Maybe consider dropping DD1 at some point then ?

Anyway,

Reviewed-by: Greg Kurz 

> Setting the HAIL bit is a requirement to support the scv instruction
> on PowerNV POWER10 platforms since glibc-2.33.
> 
> Signed-off-by: Cédric Le Goater 
> ---
>  target/ppc/cpu-models.h | 1 +
>  target/ppc/cpu-models.c | 4 +++-
>  target/ppc/cpu_init.c   | 3 +++
>  3 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/target/ppc/cpu-models.h b/target/ppc/cpu-models.h
> index fc5e21728d7e..095259275941 100644
> --- a/target/ppc/cpu-models.h
> +++ b/target/ppc/cpu-models.h
> @@ -375,6 +375,7 @@ enum {
>  CPU_POWERPC_POWER9_DD20= 0x004E1200,
>  CPU_POWERPC_POWER10_BASE   = 0x0080,
>  CPU_POWERPC_POWER10_DD1= 0x00800100,
> +CPU_POWERPC_POWER10_DD20   = 0x00800200,
>  CPU_POWERPC_970_v22= 0x00390202,
>  CPU_POWERPC_970FX_v10  = 0x00391100,
>  CPU_POWERPC_970FX_v20  = 0x003C0200,
> diff --git a/target/ppc/cpu-models.c b/target/ppc/cpu-models.c
> index 87e4228614b0..4baa111713b0 100644
> --- a/target/ppc/cpu-models.c
> +++ b/target/ppc/cpu-models.c
> @@ -776,6 +776,8 @@
>  "POWER9 v2.0")
>  POWERPC_DEF("power10_v1.0",  CPU_POWERPC_POWER10_DD1,POWER10,
>  "POWER10 v1.0")
> +POWERPC_DEF("power10_v2.0",  CPU_POWERPC_POWER10_DD20,   POWER10,
> +"POWER10 v2.0")
>  #endif /* defined (TARGET_PPC64) */
>  
>  /***/
> @@ -952,7 +954,7 @@ PowerPCCPUAlias ppc_cpu_aliases[] = {
>  { "power8", "power8_v2.0" },
>  { "power8nvl", "power8nvl_v1.0" },
>  { "power9", "power9_v2.0" },
> -{ "power10", "power10_v1.0" },
> +{ "power10", "power10_v2.0" },
>  #endif
>  
>  /* Generic PowerPCs */
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index 505a0ed6ac09..66deb18a6b65 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -8270,6 +8270,9 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
>   LPCR_DEE | LPCR_OEE))
>  | LPCR_MER | LPCR_GTSE | LPCR_TC |
>  LPCR_HEIC | LPCR_LPES0 | LPCR_HVICE | LPCR_HDICE;
> +/* DD2 adds an extra HAIL bit */
> +pcc->lpcr_mask |= LPCR_HAIL;
> +
>  pcc->lpcr_pm = LPCR_PDEE | LPCR_HDEE | LPCR_EEE | LPCR_DEE | LPCR_OEE;
>  pcc->mmu_model = POWERPC_MMU_3_00;
>  #if defined(CONFIG_SOFTMMU)

Re: [PATCH 02/26] ppc/pnv: Change the POWER10 machine to support DD2 only

2021-08-20 Thread Greg Kurz

On Mon, 9 Aug 2021 15:45:23 +0200
Cédric Le Goater  wrote:

> There is no need to keep the DD1 chip model as it will never be
> publicly available.
> 
> Signed-off-by: Cédric Le Goater 
> ---

Reviewed-by: Greg Kurz 

>  include/hw/ppc/pnv.h | 2 +-
>  hw/ppc/pnv.c | 2 +-
>  hw/ppc/pnv_core.c| 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
> index d69cee17b232..3fec7c87d82d 100644
> --- a/include/hw/ppc/pnv.h
> +++ b/include/hw/ppc/pnv.h
> @@ -170,7 +170,7 @@ DECLARE_INSTANCE_CHECKER(PnvChip, PNV_CHIP_POWER8NVL,
>  DECLARE_INSTANCE_CHECKER(PnvChip, PNV_CHIP_POWER9,
>   TYPE_PNV_CHIP_POWER9)
>  
> -#define TYPE_PNV_CHIP_POWER10 PNV_CHIP_TYPE_NAME("power10_v1.0")
> +#define TYPE_PNV_CHIP_POWER10 PNV_CHIP_TYPE_NAME("power10_v2.0")
>  DECLARE_INSTANCE_CHECKER(PnvChip, PNV_CHIP_POWER10,
>   TYPE_PNV_CHIP_POWER10)
>  
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index d16dd2d0801d..b122251d1a5d 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -1916,7 +1916,7 @@ static void pnv_machine_power10_class_init(ObjectClass 
> *oc, void *data)
>  static const char compat[] = "qemu,powernv10\0ibm,powernv";
>  
>  mc->desc = "IBM PowerNV (Non-Virtualized) POWER10";
> -mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power10_v1.0");
> +mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power10_v2.0");
>  
>  pmc->compat = compat;
>  pmc->compat_size = sizeof(compat);
> diff --git a/hw/ppc/pnv_core.c b/hw/ppc/pnv_core.c
> index 8c2a15a0fb2f..4de8414df212 100644
> --- a/hw/ppc/pnv_core.c
> +++ b/hw/ppc/pnv_core.c
> @@ -347,7 +347,7 @@ static const TypeInfo pnv_core_infos[] = {
>  DEFINE_PNV_CORE_TYPE(power8, "power8_v2.0"),
>  DEFINE_PNV_CORE_TYPE(power8, "power8nvl_v1.0"),
>  DEFINE_PNV_CORE_TYPE(power9, "power9_v2.0"),
> -DEFINE_PNV_CORE_TYPE(power10, "power10_v1.0"),
> +DEFINE_PNV_CORE_TYPE(power10, "power10_v2.0"),
>  };
>  
>  DEFINE_TYPES(pnv_core_infos)

Re: [PATCH 04/26] ppc/pnv: Use a simple incrementing index for the chip-id

2021-08-20 Thread Greg Kurz

On Mon, 9 Aug 2021 15:45:25 +0200
Cédric Le Goater  wrote:

> When the QEMU PowerNV machine was introduced, multi chip support
> modeled a two socket system with dual chip modules as found on some P8
> Tuleta systems (8286-42A). But this is hardly used and not relevant
> for QEMU. Use a simple index instead.
> 

Makes sense.

> With this change, we can now increase the max socket number to 16 as
> found on high end systems.
> 
> Signed-off-by: Cédric Le Goater 
> ---
>  include/hw/ppc/pnv.h | 33 +++--
>  hw/ppc/pnv.c | 11 ++-
>  2 files changed, 13 insertions(+), 31 deletions(-)
> 
> diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
> index 3fec7c87d82d..aa08d79d24de 100644
> --- a/include/hw/ppc/pnv.h
> +++ b/include/hw/ppc/pnv.h
> @@ -174,25 +174,6 @@ DECLARE_INSTANCE_CHECKER(PnvChip, PNV_CHIP_POWER9,
>  DECLARE_INSTANCE_CHECKER(PnvChip, PNV_CHIP_POWER10,
>   TYPE_PNV_CHIP_POWER10)
>  
> -/*
> - * This generates a HW chip id depending on an index, as found on a
> - * two socket system with dual chip modules :
> - *
> - *0x0, 0x1, 0x10, 0x11
> - *
> - * 4 chips should be the maximum
> - *
> - * TODO: use a machine property to define the chip ids
> - */
> -#define PNV_CHIP_HWID(i) i) & 0x3e) << 3) | ((i) & 0x1))
> -
> -/*
> - * Converts back a HW chip id to an index. This is useful to calculate
> - * the MMIO addresses of some controllers which depend on the chip id.
> - */
> -#define PNV_CHIP_INDEX(chip)\
> -(((chip)->chip_id >> 2) * 2 + ((chip)->chip_id & 0x3))
> -
>  PowerPCCPU *pnv_chip_find_cpu(PnvChip *chip, uint32_t pir);
>  
>  #define TYPE_PNV_MACHINE   MACHINE_TYPE_NAME("powernv")
> @@ -256,11 +237,11 @@ void pnv_bmc_set_pnor(IPMIBmc *bmc, PnvPnor *pnor);
>  #define PNV_OCC_COMMON_AREA_SIZE0x0080ull
>  #define PNV_OCC_COMMON_AREA_BASE0x7fff80ull
>  #define PNV_OCC_SENSOR_BASE(chip)   (PNV_OCC_COMMON_AREA_BASE + \
> -PNV_OCC_SENSOR_DATA_BLOCK_BASE(PNV_CHIP_INDEX(chip)))
> +PNV_OCC_SENSOR_DATA_BLOCK_BASE((chip)->chip_id))
>  
>  #define PNV_HOMER_SIZE  0x0040ull
>  #define PNV_HOMER_BASE(chip)\
> -(0x7ffd80ull + ((uint64_t)PNV_CHIP_INDEX(chip)) * PNV_HOMER_SIZE)
> +(0x7ffd80ull + ((uint64_t)(chip)->chip_id) * PNV_HOMER_SIZE)
>  
>  
>  /*
> @@ -279,16 +260,16 @@ void pnv_bmc_set_pnor(IPMIBmc *bmc, PnvPnor *pnor);
>   */
>  #define PNV_ICP_SIZE 0x0010ull
>  #define PNV_ICP_BASE(chip)  \
> -(0x00038000ull + (uint64_t) PNV_CHIP_INDEX(chip) * PNV_ICP_SIZE)
> +(0x00038000ull + (uint64_t) (chip)->chip_id * PNV_ICP_SIZE)
>  
>  
>  #define PNV_PSIHB_SIZE   0x0010ull
>  #define PNV_PSIHB_BASE(chip) \
> -(0x0003fffe8000ull + (uint64_t)PNV_CHIP_INDEX(chip) * PNV_PSIHB_SIZE)
> +(0x0003fffe8000ull + (uint64_t)(chip)->chip_id * PNV_PSIHB_SIZE)
>  
>  #define PNV_PSIHB_FSP_SIZE   0x0001ull
>  #define PNV_PSIHB_FSP_BASE(chip) \
> -(0x0003ffe0ull + (uint64_t)PNV_CHIP_INDEX(chip) * \
> +(0x0003ffe0ull + (uint64_t)(chip)->chip_id * \
>   PNV_PSIHB_FSP_SIZE)
>  
>  /*
> @@ -324,11 +305,11 @@ void pnv_bmc_set_pnor(IPMIBmc *bmc, PnvPnor *pnor);
>  #define PNV9_OCC_COMMON_AREA_SIZE0x0080ull
>  #define PNV9_OCC_COMMON_AREA_BASE0x203fff80ull
>  #define PNV9_OCC_SENSOR_BASE(chip)   (PNV9_OCC_COMMON_AREA_BASE +   \
> -PNV_OCC_SENSOR_DATA_BLOCK_BASE(PNV_CHIP_INDEX(chip)))
> +PNV_OCC_SENSOR_DATA_BLOCK_BASE((chip)->chip_id))
>  
>  #define PNV9_HOMER_SIZE  0x0040ull
>  #define PNV9_HOMER_BASE(chip)   \
> -(0x203ffd80ull + ((uint64_t)PNV_CHIP_INDEX(chip)) * PNV9_HOMER_SIZE)
> +(0x203ffd80ull + ((uint64_t)(chip)->chip_id) * PNV9_HOMER_SIZE)
>  
>  /*
>   * POWER10 MMIO base addresses - 16TB stride per chip
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index b122251d1a5d..025f01c55744 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -809,9 +809,10 @@ static void pnv_init(MachineState *machine)
>   * TODO: should we decide on how many chips we can create based
>   * on #cores and Venice vs. Murano vs. Naples chip type etc...,
>   */
> -if (!is_power_of_2(pnv->num_chips) || pnv->num_chips > 4) {
> +if (!is_power_of_2(pnv->num_chips) || pnv->num_chips > 16) {
>  error_report("invalid number of chips: '%d'", pnv->num_chips);
> -error_printf("Try '-smp sockets=N'. Valid values are : 1, 2 or 
> 4.\n");
> +error_printf(
> +"Try '-smp sockets=N'. Valid values are : 1, 2, 4, 8 and 16.\n");
>  exit(1);
>  }
>  
> @@ -819,6 +820,7 @@ static void pnv_init(MachineState *machine)
>  for (i = 0; i < pnv->num_chips; i++) {
>  char chip_name[32];
>

Re: [PATCH v3 14/14] tcg/arm: Support raising sigbus for user-only

2021-08-20 Thread Peter Maydell

On Wed, 18 Aug 2021 at 22:43, Richard Henderson
 wrote:
>
> For v6+, use ldm/stm, ldrd/strd for the normal case of alignment
> matching the access size.  Otherwise, emit a test + branch sequence
> invoking helper_unaligned_{ld,st}.
>
> For v4+v5, use piecewise load and stores to implement misalignment.
>
> Signed-off-by: Richard Henderson 

Reviewed-by: Peter Maydell 

thanks
-- PMM

xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread Bin Meng

Hi,

The following command used to work on QEMU 4.2.0, but is now broken
with QEMU head.

$ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
-nographic -serial /dev/null -serial mon:stdio -monitor null -device
loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
allocate memory

Any ideas?

Regards,
Bin

Re: [PATCH 05/26] ppc/pnv: Distribute RAM among the chips

2021-08-20 Thread Greg Kurz

On Mon, 9 Aug 2021 15:45:26 +0200
Cédric Le Goater  wrote:

> But always give the first 1GB to chip 0 as skiboot requires it.
> 
> Signed-off-by: Cédric Le Goater 
> ---
>  hw/ppc/pnv.c | 33 +
>  1 file changed, 25 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index 025f01c55744..2f5358b70c95 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -710,6 +710,23 @@ static void pnv_chip_power10_pic_print_info(PnvChip 
> *chip, Monitor *mon)
>  pnv_psi_pic_print_info(&chip10->psi, mon);
>  }
>  
> +/* Always give the first 1GB to chip 0 else we won't boot */
> +static uint64_t pnv_chip_get_ram_size(PnvMachineState *pnv, int chip_id)
> +{
> +MachineState *machine = MACHINE(pnv);
> +uint64_t ram_per_chip;
> +
> +assert(machine->ram_size >= 1 * GiB);
> +
> +ram_per_chip = machine->ram_size / pnv->num_chips;
> +if (ram_per_chip >= 1 * GiB) {
> +return QEMU_ALIGN_DOWN(ram_per_chip, 1 * MiB);
> +}
> +

So this is only reached if pnv->num_chips is >= 2, since
a single chip would have ram_per_chip == machine->ram_size
and thus take the return branch above.

Maybe worth making it clear with an assert() ?

> +ram_per_chip = (machine->ram_size - 1 * GiB) / (pnv->num_chips - 1);

Suggesting that because I was looking for a potential divide by zero ^^

> +return chip_id == 0 ? 1 * GiB : QEMU_ALIGN_DOWN(ram_per_chip, 1 * MiB);
> +}
> +
>  static void pnv_init(MachineState *machine)
>  {
>  const char *bios_name = machine->firmware ?: FW_FILE_NAME;
> @@ -717,6 +734,7 @@ static void pnv_init(MachineState *machine)
>  MachineClass *mc = MACHINE_GET_CLASS(machine);
>  char *fw_filename;
>  long fw_size;
> +uint64_t chip_ram_start = 0;
>  int i;
>  char *chip_typename;
>  DriveInfo *pnor = drive_get(IF_MTD, 0, 0);
> @@ -821,17 +839,16 @@ static void pnv_init(MachineState *machine)
>  char chip_name[32];
>  Object *chip = OBJECT(qdev_new(chip_typename));
>  int chip_id = i;
> +uint64_t chip_ram_size =  pnv_chip_get_ram_size(pnv, chip_id);
>  
>  pnv->chips[i] = PNV_CHIP(chip);
>  
> -/*
> - * TODO: put all the memory in one node on chip 0 until we find a
> - * way to specify different ranges for each chip
> - */
> -if (i == 0) {
> -object_property_set_int(chip, "ram-size", machine->ram_size,
> -&error_fatal);
> -}
> +/* Distribute RAM among the chips  */
> +object_property_set_int(chip, "ram-start", chip_ram_start,
> +&error_fatal);
> +object_property_set_int(chip, "ram-size", chip_ram_size,
> +&error_fatal);

Not really related but failing to set either of these looks
like it should never happen so I'd rather pass &error_abort
for debugging purpose.

Anyway,

Reviewed-by: Greg Kurz 

> +chip_ram_start += chip_ram_size;
>  
>  snprintf(chip_name, sizeof(chip_name), "chip[%d]", chip_id);
>  object_property_add_child(OBJECT(pnv), chip_name, chip);

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread Philippe Mathieu-Daudé

Hi Bin,

On 8/20/21 4:04 PM, Bin Meng wrote:
> Hi,
> 
> The following command used to work on QEMU 4.2.0, but is now broken
> with QEMU head.
> 
> $ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
> -nographic -serial /dev/null -serial mon:stdio -monitor null -device
> loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
> qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
> allocate memory
> 
> Any ideas?

Richard hit that recently too.

Can you provide:

cat /proc/sys/vm/overcommit_kbytes
cat /proc/sys/vm/overcommit_memory
cat /proc/sys/vm/overcommit_ratio

and

cat /proc/meminfo

(CommitLimit, Committed_AS)

and OOM messages.

Per David, 'you can trick QEMU in trying to work around
that issue, specifying a memory-backend-ram with "reserve=off"
as guest RAM.'

Re: [PATCH 06/26] ppc/pnv: add a chip topology index for POWER10

2021-08-20 Thread Greg Kurz

On Mon, 9 Aug 2021 15:45:27 +0200
Cédric Le Goater  wrote:

> Signed-off-by: Cédric Le Goater 

Maybe add a short description of its purpose in the changelog
for the records ? What's the difference with "ibm,chip-id" ?

> ---
>  hw/ppc/pnv_xscom.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
> index be7018e8ac59..faa488e3117a 100644
> --- a/hw/ppc/pnv_xscom.c
> +++ b/hw/ppc/pnv_xscom.c
> @@ -284,6 +284,8 @@ int pnv_dt_xscom(PnvChip *chip, void *fdt, int 
> root_offset,
>  _FDT(xscom_offset);
>  g_free(name);
>  _FDT((fdt_setprop_cell(fdt, xscom_offset, "ibm,chip-id", 
> chip->chip_id)));
> +_FDT((fdt_setprop_cell(fdt, xscom_offset, "ibm,primary-topology-index",
> +   chip->chip_id)));
>  _FDT((fdt_setprop_cell(fdt, xscom_offset, "#address-cells", 1)));
>  _FDT((fdt_setprop_cell(fdt, xscom_offset, "#size-cells", 1)));
>  _FDT((fdt_setprop(fdt, xscom_offset, "reg", reg, sizeof(reg;

[PATCH 3/4] target/i386: Added ignore TPR check in ctl_has_irq

2021-08-20 Thread Lara Lazier

The APM2 states that if V_IGN_TPR is nonzero, the current
virtual interrupt ignores the (virtual) TPR.

Signed-off-by: Lara Lazier 
---
 target/i386/tcg/sysemu/svm_helper.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/target/i386/tcg/sysemu/svm_helper.c 
b/target/i386/tcg/sysemu/svm_helper.c
index 2c44bdb243..cbd3f086c4 100644
--- a/target/i386/tcg/sysemu/svm_helper.c
+++ b/target/i386/tcg/sysemu/svm_helper.c
@@ -83,6 +83,11 @@ static inline bool ctl_has_irq(CPUX86State *env)
 
 int_prio = (env->int_ctl & V_INTR_PRIO_MASK) >> V_INTR_PRIO_SHIFT;
 tpr = env->int_ctl & V_TPR_MASK;
+
+if (env->int_ctl & V_IGN_TPR_MASK) {
+return env->int_ctl & V_IRQ_MASK;
+}
+
 return (env->int_ctl & V_IRQ_MASK) && (int_prio >= tpr);
 }
 
-- 
2.25.1

[PATCH 2/4] target/i386: Added VGIF V_IRQ masking capability

2021-08-20 Thread Lara Lazier

VGIF provides masking capability for when virtual interrupts
are taken. (APM2)

Signed-off-by: Lara Lazier 
---
 target/i386/cpu.c   |  7 +--
 target/i386/cpu.h   |  2 ++
 target/i386/tcg/sysemu/svm_helper.c | 12 
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 5dcdab3b80..b2094175d9 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5649,6 +5649,7 @@ static void x86_cpu_reset(DeviceState *dev)
 /* init to reset state */
 env->int_ctl = 0;
 env->hflags2 |= HF2_GIF_MASK;
+env->hflags2 |= HF2_VGIF_MASK;
 env->hflags &= ~HF_GUEST_MASK;
 
 cpu_x86_update_cr0(env, 0x6010);
@@ -6532,10 +6533,12 @@ int x86_cpu_pending_interrupt(CPUState *cs, int 
interrupt_request)
   !(env->hflags & HF_INHIBIT_IRQ_MASK) {
 return CPU_INTERRUPT_HARD;
 #if !defined(CONFIG_USER_ONLY)
-} else if ((interrupt_request & CPU_INTERRUPT_VIRQ) &&
+} else if (env->hflags2 & HF2_VGIF_MASK) {
+if((interrupt_request & CPU_INTERRUPT_VIRQ) &&
(env->eflags & IF_MASK) &&
!(env->hflags & HF_INHIBIT_IRQ_MASK)) {
-return CPU_INTERRUPT_VIRQ;
+return CPU_INTERRUPT_VIRQ;
+}
 #endif
 }
 }
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index e27a1aab99..d26df6de6b 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -203,6 +203,7 @@ typedef enum X86Seg {
 #define HF2_MPX_PR_SHIFT 5 /* BNDCFGx.BNDPRESERVE */
 #define HF2_NPT_SHIFT6 /* Nested Paging enabled */
 #define HF2_IGNNE_SHIFT  7 /* Ignore CR0.NE=0 */
+#define HF2_VGIF_SHIFT   8 /* Can take VIRQ*/
 
 #define HF2_GIF_MASK(1 << HF2_GIF_SHIFT)
 #define HF2_HIF_MASK(1 << HF2_HIF_SHIFT)
@@ -212,6 +213,7 @@ typedef enum X86Seg {
 #define HF2_MPX_PR_MASK (1 << HF2_MPX_PR_SHIFT)
 #define HF2_NPT_MASK(1 << HF2_NPT_SHIFT)
 #define HF2_IGNNE_MASK  (1 << HF2_IGNNE_SHIFT)
+#define HF2_VGIF_MASK   (1 << HF2_VGIF_SHIFT)
 
 #define CR0_PE_SHIFT 0
 #define CR0_MP_SHIFT 1
diff --git a/target/i386/tcg/sysemu/svm_helper.c 
b/target/i386/tcg/sysemu/svm_helper.c
index 9ef2454779..2c44bdb243 100644
--- a/target/i386/tcg/sysemu/svm_helper.c
+++ b/target/i386/tcg/sysemu/svm_helper.c
@@ -130,6 +130,11 @@ static inline bool virtual_gif_enabled(CPUX86State *env)
 return false;
 }
 
+static inline bool virtual_gif_set(CPUX86State *env)
+{
+return !virtual_gif_enabled(env) || (env->int_ctl & V_GIF_MASK);
+}
+
 void helper_vmrun(CPUX86State *env, int aflag, int next_eip_addend)
 {
 CPUState *cs = env_cpu(env);
@@ -363,6 +368,10 @@ void helper_vmrun(CPUX86State *env, int aflag, int 
next_eip_addend)
 cs->interrupt_request |= CPU_INTERRUPT_VIRQ;
 }
 
+if (virtual_gif_set(env)) {
+env->hflags2 |= HF2_VGIF_MASK;
+}
+
 /* maybe we need to inject an event */
 event_inj = x86_ldl_phys(cs, env->vm_vmcb + offsetof(struct vmcb,
  control.event_inj));
@@ -519,6 +528,7 @@ void helper_stgi(CPUX86State *env)
 
 if (virtual_gif_enabled(env)) {
 env->int_ctl |= V_GIF_MASK;
+env->hflags2 |= HF2_VGIF_MASK;
 } else {
 env->hflags2 |= HF2_GIF_MASK;
 }
@@ -530,6 +540,7 @@ void helper_clgi(CPUX86State *env)
 
 if (virtual_gif_enabled(env)) {
 env->int_ctl &= ~V_GIF_MASK;
+env->hflags2 &= ~HF2_VGIF_MASK;
 } else {
 env->hflags2 &= ~HF2_GIF_MASK;
 }
@@ -811,6 +822,7 @@ void do_vmexit(CPUX86State *env)
  env->vm_vmcb + offsetof(struct vmcb, control.event_inj), 0);
 
 env->hflags2 &= ~HF2_GIF_MASK;
+env->hflags2 &= ~HF2_VGIF_MASK;
 /* FIXME: Resets the current ASID register to zero (host ASID). */
 
 /* Clears the V_IRQ and V_INTR_MASKING bits inside the processor. */
-- 
2.25.1

[PATCH 0/4] target/i386: V_IRQ masking and V_TPR fixes

2021-08-20 Thread Lara Lazier

Patch 2 adds VGIF capability to mask virtual interrupts.
Patches 3 and 4 fix bugs related to vTPR, while patch 1 refactors
int_ctl into the state structure to simplify the fixes in the
following patches.

Lara Lazier (4):
  target/i386: Moved int_ctl into CPUX86State structure
  target/i386: Added VGIF V_IRQ masking capability
  target/i386: Added ignore TPR check in ctl_has_irq
  target/i386: Added changed priority check for VIRQ

 slirp|  2 +-
 target/i386/cpu.c|  9 ++--
 target/i386/cpu.h| 18 
 target/i386/machine.c| 22 +-
 target/i386/tcg/seg_helper.c |  2 +-
 target/i386/tcg/sysemu/misc_helper.c | 11 -
 target/i386/tcg/sysemu/svm_helper.c  | 62 +++-
 7 files changed, 79 insertions(+), 47 deletions(-)

-- 
2.25.1

[PATCH 1/4] target/i386: Moved int_ctl into CPUX86State structure

2021-08-20 Thread Lara Lazier

Moved int_ctl into the CPUX86State structure to remove some
unnecessary stores and loads.

Signed-off-by: Lara Lazier 
---
 slirp|  2 +-
 target/i386/cpu.c|  2 +-
 target/i386/cpu.h|  1 +
 target/i386/machine.c| 22 -
 target/i386/tcg/seg_helper.c |  2 +-
 target/i386/tcg/sysemu/misc_helper.c |  4 +--
 target/i386/tcg/sysemu/svm_helper.c  | 48 +---
 7 files changed, 42 insertions(+), 39 deletions(-)

diff --git a/slirp b/slirp
index a88d9ace23..8f43a99191 16
--- a/slirp
+++ b/slirp
@@ -1 +1 @@
-Subproject commit a88d9ace234a24ce1c17189642ef9104799425e0
+Subproject commit 8f43a99191afb47ca3f3c6972f6306209f367ece
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ada7b49d8e..5dcdab3b80 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5647,7 +5647,7 @@ static void x86_cpu_reset(DeviceState *dev)
 env->old_exception = -1;
 
 /* init to reset state */
-
+env->int_ctl = 0;
 env->hflags2 |= HF2_GIF_MASK;
 env->hflags &= ~HF_GUEST_MASK;
 
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index c9c7350c76..e27a1aab99 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1577,6 +1577,7 @@ typedef struct CPUX86State {
 uint64_t nested_cr3;
 uint32_t nested_pg_mode;
 uint8_t v_tpr;
+uint32_t int_ctl;
 
 /* KVM states, automatically cleared on reset */
 uint8_t nmi_injected;
diff --git a/target/i386/machine.c b/target/i386/machine.c
index f6f094f1c9..013ca6837f 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -203,7 +203,7 @@ static int cpu_pre_save(void *opaque)
 X86CPU *cpu = opaque;
 CPUX86State *env = &cpu->env;
 int i;
-
+env->v_tpr = env->int_ctl & V_TPR_MASK;
 /* FPU */
 env->fpus_vmstate = (env->fpus & ~0x3800) | (env->fpstt & 0x7) << 11;
 env->fptag_vmstate = 0;
@@ -1356,6 +1356,25 @@ static const VMStateDescription vmstate_svm_npt = {
 }
 };
 
+static bool svm_guest_needed(void *opaque)
+{
+X86CPU *cpu = opaque;
+CPUX86State *env = &cpu->env;
+
+return !env->int_ctl;
+}
+
+static const VMStateDescription vmstate_svm_guest = {
+.name = "cpu/svn_guest",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = svm_guest_needed,
+.fields = (VMStateField[]){
+VMSTATE_UINT32(env.int_ctl, X86CPU),
+VMSTATE_END_OF_LIST()
+}
+};
+
 #ifndef TARGET_X86_64
 static bool intel_efer32_needed(void *opaque)
 {
@@ -1524,6 +1543,7 @@ const VMStateDescription vmstate_x86_cpu = {
 &vmstate_msr_intel_pt,
 &vmstate_msr_virt_ssbd,
 &vmstate_svm_npt,
+&vmstate_svm_guest,
 #ifndef TARGET_X86_64
 &vmstate_efer32,
 #endif
diff --git a/target/i386/tcg/seg_helper.c b/target/i386/tcg/seg_helper.c
index 3ed20ca31d..cef68b610a 100644
--- a/target/i386/tcg/seg_helper.c
+++ b/target/i386/tcg/seg_helper.c
@@ -1166,7 +1166,6 @@ bool x86_cpu_exec_interrupt(CPUState *cs, int 
interrupt_request)
 break;
 #if !defined(CONFIG_USER_ONLY)
 case CPU_INTERRUPT_VIRQ:
-/* FIXME: this should respect TPR */
 cpu_svm_check_intercept_param(env, SVM_EXIT_VINTR, 0, 0);
 intno = x86_ldl_phys(cs, env->vm_vmcb
  + offsetof(struct vmcb, control.int_vector));
@@ -1174,6 +1173,7 @@ bool x86_cpu_exec_interrupt(CPUState *cs, int 
interrupt_request)
   "Servicing virtual hardware INT=0x%02x\n", intno);
 do_interrupt_x86_hardirq(env, intno, 1);
 cs->interrupt_request &= ~CPU_INTERRUPT_VIRQ;
+env->int_ctl &= ~V_IRQ_MASK;
 break;
 #endif
 }
diff --git a/target/i386/tcg/sysemu/misc_helper.c 
b/target/i386/tcg/sysemu/misc_helper.c
index e7a2ebde81..91b0fc916b 100644
--- a/target/i386/tcg/sysemu/misc_helper.c
+++ b/target/i386/tcg/sysemu/misc_helper.c
@@ -73,7 +73,7 @@ target_ulong helper_read_crN(CPUX86State *env, int reg)
 if (!(env->hflags2 & HF2_VINTR_MASK)) {
 val = cpu_get_apic_tpr(env_archcpu(env)->apic_state);
 } else {
-val = env->v_tpr;
+val = env->int_ctl & V_TPR_MASK;
 }
 break;
 }
@@ -121,7 +121,7 @@ void helper_write_crN(CPUX86State *env, int reg, 
target_ulong t0)
 cpu_set_apic_tpr(env_archcpu(env)->apic_state, t0);
 qemu_mutex_unlock_iothread();
 }
-env->v_tpr = t0 & 0x0f;
+env->int_ctl = (env->int_ctl & ~V_TPR_MASK) | (t0 & V_TPR_MASK);
 break;
 default:
 env->cr[reg] = t0;
diff --git a/target/i386/tcg/sysemu/svm_helper.c 
b/target/i386/tcg/sysemu/svm_helper.c
index 989af1b7f2..9ef2454779 100644
--- a/target/i386/tcg/sysemu/svm_helper.c
+++ b/target/i386/tcg/sysemu/svm_helper.c
@@ -76,14 +76,14 @@ static inline void svm_load_seg_cache(CPUX86State *env, 
hwaddr addr,
sc->base, sc->limit, sc->flags);
 }
 
-static inline bool ctl_has_irq(uint32_t

[RFC PATCH v2 7/8] pci: automatically unplug a PCI card before migration

2021-08-20 Thread Laurent Vivier

We have moved all the functions needed by failover to unplug a card to the
PCI subsystem.

A side effect of this change is we can implement automatic hotplug/unplug
of any PCI card during migration without using a failover virtio-net card.
For that, we need to introduce a new PCI device property,
"unplug-on-migration", we can set to "true" or "on" if we want QEMU unplugs
the card before the migration and plugs it back on the destination side
after the migration.

We modify the pci_dev_hide_device() function to check for the
"unplug-on-migration" property on the command line.
If it is present, the device is hidden on startup only on the destination
side and it will be unplugged before the migration.

To implement the "unplug-on-migration" property, we add a post_load
function in vmstate_pcibus to hotplug the card after the migration
(bus_post_load() and pci_dev_replug_on_migration()). This is not
needed with virtio-net failover because the device is plugged back
by the virtio-net device during the features migration
(VIRTIO_NET_F_STANDBY)

Signed-off-by: Laurent Vivier 
---
 include/hw/pci/pci.h |  1 +
 hw/pci/pci.c | 76 ++--
 hw/vfio/pci.c|  2 +-
 3 files changed, 69 insertions(+), 10 deletions(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index d35214144d1b..e02d965c064f 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -362,6 +362,7 @@ struct PCIDevice {
 
 /* ID of standby device in net_failover pair */
 char *failover_pair_id;
+bool unplug_on_migration;
 Notifier migration_state;
 uint32_t acpi_index;
 };
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 10b875fdfad1..31ab80b81b6b 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -82,6 +82,8 @@ static Property pci_props[] = {
 QEMU_PCIE_LNKSTA_DLLLA_BITNR, true),
 DEFINE_PROP_BIT("x-pcie-extcap-init", PCIDevice, cap_present,
 QEMU_PCIE_EXTCAP_INIT_BITNR, true),
+DEFINE_PROP_BOOL("unplug-on-migration", PCIDevice,
+ unplug_on_migration, false),
 DEFINE_PROP_STRING("failover_pair_id", PCIDevice,
failover_pair_id),
 DEFINE_PROP_UINT32("acpi-index",  PCIDevice, acpi_index, 0),
@@ -110,6 +112,45 @@ static bool bus_unplug_pending(void *opaque)
 return false;
 }
 
+static int pci_dev_replug_on_migration(void *opaque, QemuOpts *opts,
+   Error **errp)
+{
+Error *err = NULL;
+const char *bus_name = opaque;
+const char *opt;
+DeviceState *dev;
+
+if (g_strcmp0(qemu_opt_get(opts, "bus"), bus_name)) {
+return 0;
+}
+
+opt = qemu_opt_get(opts, "unplug-on-migration");
+if (g_strcmp0(opt, "on") && g_strcmp0(opt, "true")) {
+return 0;
+}
+dev = qdev_device_add(opts, &err);
+if (err) {
+error_propagate(errp, err);
+return 1;
+}
+object_unref(OBJECT(dev));
+return 0;
+}
+
+static int bus_post_load(void *opaque, int version_id)
+{
+Error *err = NULL;
+PCIBus *bus = opaque;
+
+if (qemu_opts_foreach(qemu_find_opts("device"),
+  pci_dev_replug_on_migration, bus->qbus.name, &err)) {
+error_report_err(err);
+return -EINVAL;
+}
+
+return 0;
+}
+
 static const VMStateDescription vmstate_pcibus = {
 .name = "PCIBUS",
 .version_id = 1,
@@ -122,6 +163,7 @@ static const VMStateDescription vmstate_pcibus = {
 VMSTATE_END_OF_LIST()
 },
 .dev_unplug_pending = bus_unplug_pending,
+.post_load = bus_post_load,
 };
 
 static void pci_init_bus_master(PCIDevice *pci_dev)
@@ -1200,7 +1242,7 @@ static void pci_qdev_unrealize(DeviceState *dev)
 PCIDevice *pci_dev = PCI_DEVICE(dev);
 PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(pci_dev);
 
-if (pci_dev->failover_pair_id) {
+if (pci_dev->unplug_on_migration) {
 remove_migration_state_change_notifier(&pci_dev->migration_state);
 }
 
@@ -2265,6 +2307,15 @@ static bool pci_dev_hide_device(DeviceListener *listener,
 return false;
 }
 
+opt = qemu_opt_get(device_opts, "unplug-on-migration");
+if (g_strcmp0(opt, "on") == 0 || g_strcmp0(opt, "true") == 0) {
+if (runstate_check(RUN_STATE_INMIGRATE)) {
+return migration_incoming_get_current()->state !=
+   MIGRATION_STATUS_ACTIVE;
+}
+return false;
+}
+
 return false;
 }
 
@@ -2290,6 +2341,10 @@ static void pci_qdev_realize(DeviceState *qdev, Error 
**errp)
 pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS;
 }
 
+if (pci_dev->failover_pair_id) {
+pci_dev->unplug_on_migration = true;
+}
+
 pci_dev = do_pci_register_device(pci_dev,
  object_get_typename(OBJECT(qdev)),
  pci_dev->devfn, errp);
@@ -2306,12 +2361,6 @@ static void pci_qdev_realize(DeviceState *qdev, Error 
**errp)
 }
 
 if (pci

[RFC PATCH v2 1/8] qdev: add an Error parameter to the DeviceListener hide_device() function

2021-08-20 Thread Laurent Vivier

This allows an error to be reported to the caller of qdev_device_add()

Signed-off-by: Laurent Vivier 
---
 include/hw/qdev-core.h | 6 --
 hw/core/qdev.c | 4 ++--
 hw/net/virtio-net.c| 2 +-
 softmmu/qdev-monitor.c | 4 ++--
 4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index bafc311bfa1b..e23b23a2f8d6 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -202,7 +202,8 @@ struct DeviceListener {
  * hide a failover device depending for example on the device
  * opts.
  */
-bool (*hide_device)(DeviceListener *listener, QemuOpts *device_opts);
+bool (*hide_device)(DeviceListener *listener, QemuOpts *device_opts,
+Error **errp);
 QTAILQ_ENTRY(DeviceListener) link;
 };
 
@@ -804,12 +805,13 @@ void device_listener_unregister(DeviceListener *listener);
 /**
  * @qdev_should_hide_device:
  * @opts: QemuOpts as passed on cmdline.
+ * @errp: pointer to error object
  *
  * Check if a device should be added.
  * When a device is added via qdev_device_add() this will be called,
  * and return if the device should be added now or not.
  */
-bool qdev_should_hide_device(QemuOpts *opts);
+bool qdev_should_hide_device(QemuOpts *opts, Error **errp);
 
 typedef enum MachineInitPhase {
 /* current_machine is NULL.  */
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index cefc5eaa0a92..13f4c1e696bf 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -211,13 +211,13 @@ void device_listener_unregister(DeviceListener *listener)
 QTAILQ_REMOVE(&device_listeners, listener, link);
 }
 
-bool qdev_should_hide_device(QemuOpts *opts)
+bool qdev_should_hide_device(QemuOpts *opts, Error **errp)
 {
 DeviceListener *listener;
 
 QTAILQ_FOREACH(listener, &device_listeners, link) {
 if (listener->hide_device) {
-if (listener->hide_device(listener, opts)) {
+if (listener->hide_device(listener, opts, errp)) {
 return true;
 }
 }
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 16d20cdee52a..542f9e167eb4 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3279,7 +3279,7 @@ static void virtio_net_migration_state_notifier(Notifier 
*notifier, void *data)
 }
 
 static bool failover_hide_primary_device(DeviceListener *listener,
- QemuOpts *device_opts)
+ QemuOpts *device_opts, Error **errp)
 {
 VirtIONet *n = container_of(listener, VirtIONet, primary_listener);
 const char *standby_id;
diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index 721dec2d8200..7adf0d22beb1 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -627,8 +627,8 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error **errp)
 error_setg(errp, "Device with failover_pair_id don't have id");
 return NULL;
 }
-if (qdev_should_hide_device(opts)) {
-if (bus && !qbus_is_hotpluggable(bus)) {
+if (qdev_should_hide_device(opts, errp)) {
+if (errp && !*errp && bus && !qbus_is_hotpluggable(bus)) {
 error_setg(errp, QERR_BUS_NO_HOTPLUG, bus->name);
 }
 return NULL;
-- 
2.31.1

[RFC PATCH v2 4/8] failover: pci: move failover hotplug/unplug code into pci subsystem

2021-08-20 Thread Laurent Vivier

failover allows a VFIO device to be unplugged on migration to switch
to the standby device, a virtio-net device.

Failover relies on PCI ability to hotplug/unplug a card (the VFIO
one) but all the code is implemented in virtio-net device that
is not a PCI device (even not in virtio-net-pci that is the bridge
between the PCI bus and the virtio-net device)

This patch moves all the failover PCI related code to the PCI
sub-system and keeps only in virtio-net the code related to the
failover (that is implemented in the virtio-net driver in the kernel)

There are several parts involved in the PCI device hotplug/unplug:

- the migration state notifier

  originally virtio_net_handle_migration_primary(), moved to hw/pci/pci.c
  as pci_dev_handle_migration(). On source, this function unplugs the card
  on migration setup and plugs it back in case of failure.
  (helpers are pci_dev_migration_unplug() and pci_dev_migration_replug(),
   previously failover_unplug_primary() and failover_replug_primary()).

- the hide device listener

  originally failover_hide_primary_device(), moved to hw/pci/pci.c
  as pci_dev_hide_device().
  This function is called by qdev_device_add() to know if a device
  must be hidden or not. To do that, the function checks the command line,
  if the device doesn't have a "failover_pair_id" it is not hidden. Otherwise
  it is hidden at startup on source and destination and it is only plugged by
  a compatible virtio-net device).

- the unplug pending checking functions

  originally primary_unplug_pending() and dev_unplug_pending(),
  moved to hw/pci/pci.c as bus_unplug_pending() and
  pci_device_unplug_pending()

  bus_unplug_pending() is called during the migration to check
  if the PCI device has been unplugged before starting to migrate
  data. It has been moved from vmstate_virtio_net to vmstate_pcibus.
  We can't move it to vmstate_pci_device because the device vmstate is
  removed before to start the migration (we don't migrate the device we
  unplug).

  pci_device_unplug_pending() is called by bus_unplug_pending() to
  check if an unplug is pending for a given device (this is a class
  function) of the bus.

  For a given PCI bus, bus_unplug_pending() will call
  pci_device_unplug_pending() for each available bus slot.

Signed-off-by: Laurent Vivier 
---
 include/hw/pci/pci.h   |   4 +
 include/hw/virtio/virtio-net.h |   2 -
 include/hw/virtio/virtio.h |   1 -
 hw/net/virtio-net.c| 156 ---
 hw/pci/pci.c   | 166 -
 5 files changed, 169 insertions(+), 160 deletions(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index d0f4266e3725..d35214144d1b 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -241,6 +241,9 @@ struct PCIDeviceClass {
 
 /* rom bar */
 const char *romfile;
+
+DeviceListener listener;
+bool (*dev_unplug_pending)(void *opaque);
 };
 
 typedef void (*PCIINTxRoutingNotifier)(PCIDevice *dev);
@@ -359,6 +362,7 @@ struct PCIDevice {
 
 /* ID of standby device in net_failover pair */
 char *failover_pair_id;
+Notifier migration_state;
 uint32_t acpi_index;
 };
 
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 21d8c3aa5f3a..d8161b61cc9e 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -206,8 +206,6 @@ struct VirtIONet {
 bool needs_vnet_hdr_swap;
 bool mtu_bypass_backend;
 bool failover;
-DeviceListener primary_listener;
-Notifier migration_state;
 VirtioNetRssData rss_data;
 struct NetRxPkt *rx_pkt;
 struct EBPFRSSContext ebpf_rss;
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 8bab9cfb7507..c9d26d5daa43 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -159,7 +159,6 @@ struct VirtioDeviceClass {
  */
 int (*post_load)(VirtIODevice *vdev);
 const VMStateDescription *vmsd;
-bool (*primary_unplug_pending)(void *opaque);
 };
 
 void virtio_instance_init_common(Object *proxy_obj, void *data,
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index c81466f6efb7..35e3d024f8d6 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -36,10 +36,8 @@
 #include "qapi/qapi-types-migration.h"
 #include "qapi/qapi-events-migration.h"
 #include "hw/virtio/virtio-access.h"
-#include "migration/migration.h"
 #include "migration/misc.h"
 #include "standard-headers/linux/ethtool.h"
-#include "sysemu/runstate.h"
 #include "sysemu/sysemu.h"
 #include "trace.h"
 #include "monitor/qdev.h"
@@ -3188,127 +3186,6 @@ void virtio_net_set_netclient_name(VirtIONet *n, const 
char *name,
 n->netclient_type = g_strdup(type);
 }
 
-static bool failover_unplug_primary(VirtIONet *n, DeviceState *dev)
-{
-HotplugHandler *hotplug_ctrl;
-PCIDevice *pci_dev;
-Error *err = NULL;
-
-hotplug_ctrl = qdev_get_hotplug_handler(dev);
-if (hotplug_ctrl) {
-pci_

[PATCH 4/4] target/i386: Added changed priority check for VIRQ

2021-08-20 Thread Lara Lazier

Writes to cr8 affect v_tpr. This could set or unset an interrupt
request as the priority might have changed.

Signed-off-by: Lara Lazier 
---
 target/i386/cpu.h| 15 +++
 target/i386/tcg/sysemu/misc_helper.c |  7 +++
 target/i386/tcg/sysemu/svm_helper.c  | 15 ---
 3 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d26df6de6b..69e722253d 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2245,6 +2245,21 @@ static inline uint64_t cr4_reserved_bits(CPUX86State 
*env)
 return reserved_bits;
 }
 
+static inline bool ctl_has_irq(CPUX86State *env)
+{
+uint32_t int_prio;
+uint32_t tpr;
+
+int_prio = (env->int_ctl & V_INTR_PRIO_MASK) >> V_INTR_PRIO_SHIFT;
+tpr = env->int_ctl & V_TPR_MASK;
+
+if (env->int_ctl & V_IGN_TPR_MASK) {
+return (env->int_ctl & V_IRQ_MASK);
+}
+
+return (env->int_ctl & V_IRQ_MASK) && (int_prio >= tpr);
+}
+
 #if defined(TARGET_X86_64) && \
 defined(CONFIG_USER_ONLY) && \
 defined(CONFIG_LINUX)
diff --git a/target/i386/tcg/sysemu/misc_helper.c 
b/target/i386/tcg/sysemu/misc_helper.c
index 91b0fc916b..9ccaa054c4 100644
--- a/target/i386/tcg/sysemu/misc_helper.c
+++ b/target/i386/tcg/sysemu/misc_helper.c
@@ -122,6 +122,13 @@ void helper_write_crN(CPUX86State *env, int reg, 
target_ulong t0)
 qemu_mutex_unlock_iothread();
 }
 env->int_ctl = (env->int_ctl & ~V_TPR_MASK) | (t0 & V_TPR_MASK);
+
+CPUState *cs = env_cpu(env);
+if (ctl_has_irq(env)) {
+cpu_interrupt(cs, CPU_INTERRUPT_VIRQ);
+} else {
+cpu_reset_interrupt(cs, CPU_INTERRUPT_VIRQ);
+}
 break;
 default:
 env->cr[reg] = t0;
diff --git a/target/i386/tcg/sysemu/svm_helper.c 
b/target/i386/tcg/sysemu/svm_helper.c
index cbd3f086c4..312f10f1e4 100644
--- a/target/i386/tcg/sysemu/svm_helper.c
+++ b/target/i386/tcg/sysemu/svm_helper.c
@@ -76,21 +76,6 @@ static inline void svm_load_seg_cache(CPUX86State *env, 
hwaddr addr,
sc->base, sc->limit, sc->flags);
 }
 
-static inline bool ctl_has_irq(CPUX86State *env)
-{
-uint32_t int_prio;
-uint32_t tpr;
-
-int_prio = (env->int_ctl & V_INTR_PRIO_MASK) >> V_INTR_PRIO_SHIFT;
-tpr = env->int_ctl & V_TPR_MASK;
-
-if (env->int_ctl & V_IGN_TPR_MASK) {
-return env->int_ctl & V_IRQ_MASK;
-}
-
-return (env->int_ctl & V_IRQ_MASK) && (int_prio >= tpr);
-}
-
 static inline bool is_efer_invalid_state (CPUX86State *env)
 {
 if (!(env->efer & MSR_EFER_SVME)) {
-- 
2.25.1

[RFC PATCH v2 5/8] failover: hide the PCI device if the virtio-net device is not present

2021-08-20 Thread Laurent Vivier

We can plug either the virtio-net device first or the PCI device
first.

If we plug the PCI device first, QEMU checks the failover_pair_id
but doesn't hide the device if the virtio-net device is not present.

This is a problem because if the virtio-net device is not plugged
and the machine is migrated the VFIO device doesn't prevent anymore
the migration and the machine is migrated with the VFIO device.

And of course, on destination the guest reboots because VFIO device
cannot be migrated.

We can fix the problem by hiding the PCI device with a failover_pair_id
device that is not present. In this case, if the machine is migrated the
VFIO device is not migrated, and if the virtio-net device is plugged, the
PCI device is plugged and can be migrated using the failover mechanism.

Signed-off-by: Laurent Vivier 
---
 hw/pci/pci.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index f849945e2819..dd5e95216abf 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2244,10 +2244,11 @@ static bool pci_dev_hide_device(DeviceListener 
*listener,
 d =  qdev_find_recursive(sysbus_get_default(), opt);
 if (d == NULL) {
 /*
- * if the the virtio-net device is not plugged it can be
- * plugged later, and this device will be added to the failover
+ * PCI device has a pair id, but the virtio-net device is not
+ * plugged: hide it, it will be plugged later when the virtio-net
+ * device will be plugged
  */
-return false;
+return true;
 }
 
 if (runstate_check(RUN_STATE_PRELAUNCH)) {
-- 
2.31.1

[RFC PATCH v2 8/8] failover: qemu-opts: manage hidden device list

2021-08-20 Thread Laurent Vivier

failover relies on the command line parameters to store and detect
failover devices because while the device is hidden it doesn't appears
in qdev objects and so we don't have the list anywhere else.

But this doesn't work if the the device is hotplugged because it is
not added to the qemu opts list (and the opts list memory is released
after the qdev_device_add() if the device object is not created as it is
the case when it is hidden).

It seems cleaner to manage our own list of hidden devices.

To do that, this patch adds some qemu_opts functions to store the opts
list of the device when it is hidden. This list will be used to actually
plug the device when it will be enabled for the guest.

Signed-off-by: Laurent Vivier 
---
 include/qemu/option.h |  4 +++
 hw/core/qdev.c|  1 +
 hw/net/virtio-net.c   |  5 ++-
 hw/pci/pci.c  |  4 +--
 util/qemu-option.c| 82 +++
 5 files changed, 91 insertions(+), 5 deletions(-)

diff --git a/include/qemu/option.h b/include/qemu/option.h
index 306bf0757509..d44550f02542 100644
--- a/include/qemu/option.h
+++ b/include/qemu/option.h
@@ -150,4 +150,8 @@ QDict *keyval_parse(const char *params, const char 
*implied_key,
 bool *help, Error **errp);
 void keyval_merge(QDict *old, const QDict *new, Error **errp);
 
+int qemu_opts_hidden_device_foreach(qemu_opts_loopfunc func,
+void *opaque, Error **errp);
+QemuOpts *qemu_opts_hidden_device_find(const char *id);
+void qemu_opts_store_hidden_device(QemuOpts *opts);
 #endif
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 13f4c1e696bf..f402309a3af9 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -218,6 +218,7 @@ bool qdev_should_hide_device(QemuOpts *opts, Error **errp)
 QTAILQ_FOREACH(listener, &device_listeners, link) {
 if (listener->hide_device) {
 if (listener->hide_device(listener, opts, errp)) {
+qemu_opts_store_hidden_device(opts);
 return true;
 }
 }
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 35e3d024f8d6..dc971bc9885b 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -831,8 +831,7 @@ static char *failover_find_primary_device_id(VirtIONet *n)
 FailoverId fid;
 
 fid.n = n;
-if (!qemu_opts_foreach(qemu_find_opts("device"),
-   failover_set_primary, &fid, &err)) {
+if (!qemu_opts_hidden_device_foreach(failover_set_primary, &fid, &err)) {
 return NULL;
 }
 return fid.id;
@@ -874,7 +873,7 @@ static void failover_add_primary(VirtIONet *n, Error **errp)
   " failover_pair_id=%s\n", n->netclient_name);
 return;
 }
-opts = qemu_opts_find(qemu_find_opts("device"), id);
+opts = qemu_opts_hidden_device_find(id);
 g_assert(opts); /* cannot be NULL because id was found using opts list */
 dev = qdev_device_add(opts, &err);
 if (err) {
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 31ab80b81b6b..d222623a68b2 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -142,8 +142,8 @@ static int bus_post_load(void *opaque, int version_id)
 Error *err = NULL;
 PCIBus *bus = opaque;
 
-if (qemu_opts_foreach(qemu_find_opts("device"),
-  pci_dev_replug_on_migration, bus->qbus.name, &err)) {
+if (qemu_opts_hidden_device_foreach(pci_dev_replug_on_migration,
+bus->qbus.name, &err)) {
 error_report_err(err);
 return -EINVAL;
 }
diff --git a/util/qemu-option.c b/util/qemu-option.c
index 61cb4a97bdb6..90bdec030624 100644
--- a/util/qemu-option.c
+++ b/util/qemu-option.c
@@ -37,6 +37,8 @@
 #include "qemu/id.h"
 #include "qemu/help_option.h"
 
+static QTAILQ_HEAD(, QemuOpts) hidden_devices =
+   QTAILQ_HEAD_INITIALIZER(hidden_devices);
 /*
  * Extracts the name of an option from the parameter string (@p points at the
  * first byte of the option name)
@@ -1224,3 +1226,83 @@ QemuOptsList *qemu_opts_append(QemuOptsList *dst,
 
 return dst;
 }
+
+/* create a copy of an opts */
+static QemuOpts *qemu_opts_duplicate(QemuOpts *opts)
+{
+QemuOpts *new;
+QemuOpt *opt;
+
+new = g_malloc0(sizeof(*new));
+new->id = g_strdup(opts->id);
+new->list = opts->list;
+
+QTAILQ_INIT(&new->head);
+
+QTAILQ_FOREACH_REVERSE(opt, &opts->head, next) {
+QemuOpt *new_opt = g_malloc0(sizeof(*new_opt));
+
+new_opt->name = g_strdup(opt->name);
+new_opt->str = g_strdup(opt->str);
+new_opt->opts = new;
+QTAILQ_INSERT_TAIL(&new->head, new_opt, next);
+}
+
+QTAILQ_INSERT_TAIL(&new->list->head, new, next);
+
+return new;
+}
+
+/*
+ * For each member of the hidded device list,
+ * call @func(@opaque, name, value, @errp).
+ * @func() may store an Error through @errp, but must return non-zero then.
+ * When @func() returns non-zero, break the loop and return

[RFC PATCH v2 0/8] virtio-net failover cleanup and new features

2021-08-20 Thread Laurent Vivier

v2: add helpers to manage the list of hidden devices rather
than relying on the command line parameters
Hide VFIO device if it is plugged before the virtio-net one

This series moves the code used by virtio-net failover from the
virtio-net device to the PCI subsystem.

Doing that, we can use failover with a regular QEMU PCI device
(we can add the function call to unregister the ROM vmstate) and we
can also use this code to unplug a PCI card before migration
and plug it back after migration without using a failover
device (of course, connectivity is lost during all the migration).
In contrary of failover, this does not need support from the
guest system to work.

Laurent Vivier (8):
  qdev: add an Error parameter to the DeviceListener hide_device()
function
  qdev/qbus: remove failover specific code
  failover: virtio-net: remove failover_primary_hidden flag
  failover: pci: move failover hotplug/unplug code into pci subsystem
  failover: hide the PCI device if the virtio-net device is not present
  failover: pci: unregister ROM on unplug
  pci: automatically unplug a PCI card before migration
  failover: qemu-opts: manage hidden device list

 include/hw/pci/pci.h   |   5 +
 include/hw/qdev-core.h |   6 +-
 include/hw/virtio/virtio-net.h |   4 -
 include/hw/virtio/virtio.h |   1 -
 include/qemu/option.h  |   4 +
 hw/core/qdev.c |   5 +-
 hw/net/virtio-net.c| 149 +---
 hw/pci/pci.c   | 242 +++--
 hw/vfio/pci.c  |   2 +-
 softmmu/qdev-monitor.c |  14 +-
 util/qemu-option.c |  82 +++
 11 files changed, 338 insertions(+), 176 deletions(-)

-- 
2.31.1

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread Bin Meng

Hi Philippe,

On Fri, Aug 20, 2021 at 10:10 PM Philippe Mathieu-Daudé
 wrote:
>
> Hi Bin,
>
> On 8/20/21 4:04 PM, Bin Meng wrote:
> > Hi,
> >
> > The following command used to work on QEMU 4.2.0, but is now broken
> > with QEMU head.
> >
> > $ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
> > -nographic -serial /dev/null -serial mon:stdio -monitor null -device
> > loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
> > qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
> > allocate memory
> >
> > Any ideas?
>
> Richard hit that recently too.

I hit this when in the VM on Azure pipelines, but I was able to
reproduce this issue on my local machine.

>
> Can you provide:
>
> cat /proc/sys/vm/overcommit_kbytes
> cat /proc/sys/vm/overcommit_memory
> cat /proc/sys/vm/overcommit_ratio

$ cat /proc/sys/vm/overcommit_kbytes
0
$ cat /proc/sys/vm/overcommit_memory
0
$ cat /proc/sys/vm/overcommit_ratio
50

>
> and
>
> cat /proc/meminfo
>
> (CommitLimit, Committed_AS)

$ cat /proc/meminfo

CommitLimit:12388820 kB
Committed_AS:5019088 kB

> and OOM messages.

I did not see any OOM messages.

>
> Per David, 'you can trick QEMU in trying to work around
> that issue, specifying a memory-backend-ram with "reserve=off"
> as guest RAM.'

Regards,
Bin

[RFC PATCH v2 2/8] qdev/qbus: remove failover specific code

2021-08-20 Thread Laurent Vivier

Commit f3a850565693 ("qdev/qbus: add hidden device support") has
introduced a generic way to hide a device but it has modified
qdev_device_add() to check a specific option of the failover device,
"failover_pair_id", before calling the generic mechanism.

It's not needed (and not generic) to do that in qdev_device_add() because
this is also checked by the failover_hide_primary_device() function that
uses the generic mechanism to hide the device.

Cc: Jens Freimann 
Signed-off-by: Laurent Vivier 
---
 hw/net/virtio-net.c|  7 +++
 softmmu/qdev-monitor.c | 14 --
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 542f9e167eb4..0c5ec930356b 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3288,6 +3288,13 @@ static bool failover_hide_primary_device(DeviceListener 
*listener,
 return false;
 }
 standby_id = qemu_opt_get(device_opts, "failover_pair_id");
+if (standby_id == NULL) {
+return false;
+}
+if (device_opts->id == NULL) {
+error_setg(errp, "Device with failover_pair_id don't have id");
+return true;
+}
 if (g_strcmp0(standby_id, n->netclient_name) != 0) {
 return false;
 }
diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index 7adf0d22beb1..5c92dbed3139 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -622,17 +622,11 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error **errp)
 }
 }
 
-if (qemu_opt_get(opts, "failover_pair_id")) {
-if (!opts->id) {
-error_setg(errp, "Device with failover_pair_id don't have id");
-return NULL;
-}
-if (qdev_should_hide_device(opts, errp)) {
-if (errp && !*errp && bus && !qbus_is_hotpluggable(bus)) {
-error_setg(errp, QERR_BUS_NO_HOTPLUG, bus->name);
-}
-return NULL;
+if (qdev_should_hide_device(opts, errp)) {
+if (errp && !*errp && bus && !qbus_is_hotpluggable(bus)) {
+error_setg(errp, QERR_BUS_NO_HOTPLUG, bus->name);
 }
+return NULL;
 }
 
 if (phase_check(PHASE_MACHINE_READY) && bus && !qbus_is_hotpluggable(bus)) 
{
-- 
2.31.1

Re: [qemu-web PATCH] Add a blog post about FUSE block exports

2021-08-20 Thread Stefan Hajnoczi

On Fri, Aug 20, 2021 at 09:56:54AM +0200, Hanna Reitz wrote:
> On 19.08.21 18:23, Stefan Hajnoczi wrote:
> > On Thu, Aug 19, 2021 at 12:25:01PM +0200, Hanna Reitz wrote:
> > > This post explains when FUSE block exports are useful, how they work,
> > > and that it is fun to export an image file on its own path so it looks
> > > like your image file (in whatever format it was) is a raw image now.
> > > 
> > > Signed-off-by: Hanna Reitz 
> > > ---
> > > You can also find this patch here:
> > > https://gitlab.com/hreitz/qemu-web fuse-blkexport-v1
> > > 
> > > My first patch to qemu-web, so I hope I am not doing anything overly
> > > stupid here (adding SVGs with extremely long lines comes to mind)...
> > > ---
> > >   _posts/2021-08-18-fuse-blkexport.md   | 488 ++
> > >   screenshots/2021-08-18-block-graph-a.svg  |   2 +
> > >   screenshots/2021-08-18-block-graph-b.svg  |   2 +
> > >   screenshots/2021-08-18-block-graph-c.svg  |   2 +
> > >   screenshots/2021-08-18-block-graph-d.svg  |   2 +
> > >   screenshots/2021-08-18-block-graph-e.svg  |   2 +
> > >   screenshots/2021-08-18-root-directory.svg |   2 +
> > >   screenshots/2021-08-18-root-file.svg  |   2 +
> > >   8 files changed, 502 insertions(+)
> > >   create mode 100644 _posts/2021-08-18-fuse-blkexport.md
> > >   create mode 100644 screenshots/2021-08-18-block-graph-a.svg
> > >   create mode 100644 screenshots/2021-08-18-block-graph-b.svg
> > >   create mode 100644 screenshots/2021-08-18-block-graph-c.svg
> > >   create mode 100644 screenshots/2021-08-18-block-graph-d.svg
> > >   create mode 100644 screenshots/2021-08-18-block-graph-e.svg
> > >   create mode 100644 screenshots/2021-08-18-root-directory.svg
> > >   create mode 100644 screenshots/2021-08-18-root-file.svg
> > Great! Two ideas:
> > 
> > It would be nice to include a shoutout to libguestfs and mention that
> > libguestfs avoids exposing the host kernel's file systems and partion
> > code to untrusted disk images. If you don't mount the image then the
> > FUSE export has similar security properties.
> 
> Oh, right!  Absolutely.
> 
> Though now I do wonder why one would actually want to use QEMU’s FUSE
> exports then...
> 
> Looks like the performance isn’t as bad as I claimed (for me around 1.5G/s
> for reading/writing from/to a raw image on tmpfs), so perhaps that’s one
> point.  Another is probably that FUSE exports are better suited when you
> actually want access to the whole image.  I guess.

I see a use case for applications that want to do something with the
disk image data themselves, e.g. backup, entropy, data recovery, etc.
They could use NBD but opening a regular file on a FUSE file system is
even easier.

The host kernel won't be exposed, so it's reason as long as the
application itself isn't doing anything risky (e.g. no parsing or parser
written in a memory-safe language).

Stefan


signature.asc
Description: PGP signature

[RFC PATCH v2 3/8] failover: virtio-net: remove failover_primary_hidden flag

2021-08-20 Thread Laurent Vivier

We dont't need a flag to know if the primary device must be hidden, we
can rely on the machine state:
Device is hidden if the machine is in prelaunch state (src) or
in inmigrate state with migration status set to none (dst).
We don't need to check the flag in virtio_net_handle_migration_primary()
because this function is only registered if the failover is enabled
and the flag also set to false. This function also sets it back to
true after a successful unplug during the migration but we don't
need to know it is hidden because nothing will try to plug it back
after the migration (except in case of error but the flag is set
back to false).

Signed-off-by: Laurent Vivier 
---
 include/hw/virtio/virtio-net.h |  2 --
 hw/net/virtio-net.c| 25 +++--
 2 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 824a69c23f06..21d8c3aa5f3a 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -205,8 +205,6 @@ struct VirtIONet {
 AnnounceTimer announce_timer;
 bool needs_vnet_hdr_swap;
 bool mtu_bypass_backend;
-/* primary failover device is hidden*/
-bool failover_primary_hidden;
 bool failover;
 DeviceListener primary_listener;
 Notifier migration_state;
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 0c5ec930356b..c81466f6efb7 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -36,8 +36,10 @@
 #include "qapi/qapi-types-migration.h"
 #include "qapi/qapi-events-migration.h"
 #include "hw/virtio/virtio-access.h"
+#include "migration/migration.h"
 #include "migration/misc.h"
 #include "standard-headers/linux/ethtool.h"
+#include "sysemu/runstate.h"
 #include "sysemu/sysemu.h"
 #include "trace.h"
 #include "monitor/qdev.h"
@@ -937,7 +939,6 @@ static void virtio_net_set_features(VirtIODevice *vdev, 
uint64_t features)
 
 if (virtio_has_feature(features, VIRTIO_NET_F_STANDBY)) {
 qapi_event_send_failover_negotiated(n->netclient_name);
-qatomic_set(&n->failover_primary_hidden, false);
 failover_add_primary(n, &err);
 if (err) {
 warn_report_err(err);
@@ -3225,7 +3226,6 @@ static bool failover_replug_primary(VirtIONet *n, 
DeviceState *dev,
 return false;
 }
 qdev_set_parent_bus(dev, primary_bus, &error_abort);
-qatomic_set(&n->failover_primary_hidden, false);
 hotplug_ctrl = qdev_get_hotplug_handler(dev);
 if (hotplug_ctrl) {
 hotplug_handler_pre_plug(hotplug_ctrl, dev, &err);
@@ -3243,7 +3243,6 @@ out:
 
 static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState 
*s)
 {
-bool should_be_hidden;
 Error *err = NULL;
 DeviceState *dev = failover_find_primary_device(n);
 
@@ -3251,13 +3250,10 @@ static void 
virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
 return;
 }
 
-should_be_hidden = qatomic_read(&n->failover_primary_hidden);
-
-if (migration_in_setup(s) && !should_be_hidden) {
+if (migration_in_setup(s)) {
 if (failover_unplug_primary(n, dev)) {
 vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
 qapi_event_send_unplug_primary(dev->id);
-qatomic_set(&n->failover_primary_hidden, true);
 } else {
 warn_report("couldn't unplug primary device");
 }
@@ -3299,8 +3295,18 @@ static bool failover_hide_primary_device(DeviceListener 
*listener,
 return false;
 }
 
-/* failover_primary_hidden is set during feature negotiation */
-return qatomic_read(&n->failover_primary_hidden);
+if (runstate_check(RUN_STATE_PRELAUNCH)) {
+/* hide the failover primary on src on startup */
+return true;
+}
+
+if (runstate_check(RUN_STATE_INMIGRATE) &&
+migration_incoming_get_current()->state == MIGRATION_STATUS_NONE) {
+/* hide the failover primary on dest on startup */
+return true;
+}
+
+return false;
 }
 
 static void virtio_net_device_realize(DeviceState *dev, Error **errp)
@@ -3338,7 +3344,6 @@ static void virtio_net_device_realize(DeviceState *dev, 
Error **errp)
 
 if (n->failover) {
 n->primary_listener.hide_device = failover_hide_primary_device;
-qatomic_set(&n->failover_primary_hidden, true);
 device_listener_register(&n->primary_listener);
 n->migration_state.notify = virtio_net_migration_state_notifier;
 add_migration_state_change_notifier(&n->migration_state);
-- 
2.31.1

Re: [PATCH 0/2] 9pfs: v9fs_walk() cleanup

2021-08-20 Thread Christian Schoenebeck

On Dienstag, 17. August 2021 15:52:39 CEST Christian Schoenebeck wrote:
> Few cleanup patches for function v9fs_walk() as discussed last month.
> 
> In patch 2 array variables 'wnames' and 'pathes' are omitted because they
> contain dynamically allocated memory per array element which need to be
> freed individually before freeing the array.
> 
> Christian Schoenebeck (2):
>   hw/9pfs: avoid 'path' copy in v9fs_walk()
>   hw/9pfs: use g_autofree in v9fs_walk() where possible
> 
>  hw/9pfs/9p.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)

Queued on 9p.next:
https://github.com/cschoenebeck/qemu/commits/9p.next

Thanks!

Best regards,
Christian Schoenebeck

[RFC PATCH v2 6/8] failover: pci: unregister ROM on unplug

2021-08-20 Thread Laurent Vivier

The intend of failover is to allow a VM with a VFIO networking card to
be migrated without disrupting the network operation by switching
to a virtio-net device during the migration.

This simple change allows a simulated device like e1000e to be tested
rather than a vfio device, even if it's useless in real life it can help
to debug failover.

This is interesting to developers that want to test failover on
a system with no vfio device. Moreover it simplifies host networking
configuration as we can use the same bridge for virtio-net and
the other failover networking device.

Without this change the migration of a system configured with failover
fails with:

  ...
  -device virtio-net-pci,id=virtionet0,failover=on,...  \
  -device e1000,failover_pair_id=virtionet0,... \
  ...

  (qemu) migrate ...

  Unknown ramblock ":00:01.1:00.0/e1000e.rom", cannot accept migration
  error while loading state for instance 0x0 of device 'ram'
  load of migration failed: Invalid argument

This happens because QEMU correctly unregisters the interface vmstate but
not the ROM one. This patch fixes that.

Signed-off-by: Laurent Vivier 
---
 hw/pci/pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index dd5e95216abf..10b875fdfad1 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2208,6 +2208,7 @@ static void pci_dev_handle_migration(PCIDevice *pci_dev, 
MigrationState *s)
 if (migration_in_setup(s)) {
 if (pci_dev_migration_unplug(pci_dev)) {
 vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
+pci_del_option_rom(pci_dev);
 qapi_event_send_unplug_primary(dev->id);
 } else {
 warn_report("couldn't unplug primary device");
-- 
2.31.1

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread David Hildenbrand


On 20.08.21 16:22, Bin Meng wrote:

Hi Philippe,

On Fri, Aug 20, 2021 at 10:10 PM Philippe Mathieu-Daudé
 wrote:


Hi Bin,

On 8/20/21 4:04 PM, Bin Meng wrote:

Hi,

The following command used to work on QEMU 4.2.0, but is now broken
with QEMU head.

$ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
-nographic -serial /dev/null -serial mon:stdio -monitor null -device
loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
allocate memory

Any ideas?


Richard hit that recently too.


I hit this when in the VM on Azure pipelines, but I was able to
reproduce this issue on my local machine.



Can you provide:

cat /proc/sys/vm/overcommit_kbytes
cat /proc/sys/vm/overcommit_memory
cat /proc/sys/vm/overcommit_ratio


$ cat /proc/sys/vm/overcommit_kbytes
0
$ cat /proc/sys/vm/overcommit_memory
0
$ cat /proc/sys/vm/overcommit_ratio
50



and

cat /proc/meminfo

(CommitLimit, Committed_AS)


$ cat /proc/meminfo

CommitLimit:12388820 kB
Committed_AS:5019088 kB


and OOM messages.


I did not see any OOM messages.


-m 4000

corresponds to 38 TB if I am not wrong. Is that really what you want?

How much main memory does your system have?

--
Thanks,

David / dhildenb

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread Peter Maydell

On Fri, 20 Aug 2021 at 15:34, David Hildenbrand  wrote:
>
> On 20.08.21 16:22, Bin Meng wrote:
> > Hi Philippe,
> >
> > On Fri, Aug 20, 2021 at 10:10 PM Philippe Mathieu-Daudé
> >  wrote:
> >>
> >> Hi Bin,
> >>
> >> On 8/20/21 4:04 PM, Bin Meng wrote:
> >>> Hi,
> >>>
> >>> The following command used to work on QEMU 4.2.0, but is now broken
> >>> with QEMU head.
> >>>
> >>> $ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
> >>> -nographic -serial /dev/null -serial mon:stdio -monitor null -device
> >>> loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
> >>> qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
> >>> allocate memory

> -m 4000
>
> corresponds to 38 TB if I am not wrong. Is that really what you want?

Probably not, because the zynq board's init function does:

if (machine->ram_size > 2 * GiB) {
error_report("RAM size more than 2 GiB is not supported");
exit(EXIT_FAILURE);
}

It seems a bit daft that we allocate the memory before we do
the size check. This didn't use to be this way around...

Anyway, I think the cause of this change is commit c9800965c1be6c39
from Igor. We used to silently cap the RAM size to 2GB; now we
complain. Or at least we would complain if we hadn't already
tried to allocate the memory and fallen over...

-- PMM

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/20/21 4:39 PM, Peter Maydell wrote:
> On Fri, 20 Aug 2021 at 15:34, David Hildenbrand  wrote:
>>
>> On 20.08.21 16:22, Bin Meng wrote:
>>> Hi Philippe,
>>>
>>> On Fri, Aug 20, 2021 at 10:10 PM Philippe Mathieu-Daudé
>>>  wrote:

 Hi Bin,

 On 8/20/21 4:04 PM, Bin Meng wrote:
> Hi,
>
> The following command used to work on QEMU 4.2.0, but is now broken
> with QEMU head.
>
> $ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
> -nographic -serial /dev/null -serial mon:stdio -monitor null -device
> loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
> qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
> allocate memory
> 
>> -m 4000
>>
>> corresponds to 38 TB if I am not wrong. Is that really what you want?
> 
> Probably not, because the zynq board's init function does:
> 
> if (machine->ram_size > 2 * GiB) {
> error_report("RAM size more than 2 GiB is not supported");
> exit(EXIT_FAILURE);
> }
> 
> It seems a bit daft that we allocate the memory before we do
> the size check. This didn't use to be this way around...
> 
> Anyway, I think the cause of this change is commit c9800965c1be6c39
> from Igor. We used to silently cap the RAM size to 2GB; now we
> complain. Or at least we would complain if we hadn't already
> tried to allocate the memory and fallen over...

Ouch... I remember having tested -M raspi2 -m 8G etc... to verify
the error messages, but didn't noticed the memory was allocated.

static void qemu_init_board(void)
{
MachineClass *machine_class = MACHINE_GET_CLASS(current_machine);

if (machine_class->default_ram_id && current_machine->ram_size &&
numa_uses_legacy_mem() && !current_machine->ram_memdev_id) {
create_default_memdev(current_machine, mem_path); // <- alloc
}

/* process plugin before CPUs are created ... */
qemu_plugin_load_list(&plugin_list, &error_fatal);

/* From here on we enter MACHINE_PHASE_INITIALIZED.  */
machine_run_board_init(current_machine); // <- Machine::init()
 //checks RAM size

...

$ qemu-system-x86_64 -m 1T
qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate
memory

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread Igor Mammedov

On Fri, 20 Aug 2021 15:39:27 +0100
Peter Maydell  wrote:

> On Fri, 20 Aug 2021 at 15:34, David Hildenbrand  wrote:
> >
> > On 20.08.21 16:22, Bin Meng wrote:  
> > > Hi Philippe,
> > >
> > > On Fri, Aug 20, 2021 at 10:10 PM Philippe Mathieu-Daudé
> > >  wrote:  
> > >>
> > >> Hi Bin,
> > >>
> > >> On 8/20/21 4:04 PM, Bin Meng wrote:  
> > >>> Hi,
> > >>>
> > >>> The following command used to work on QEMU 4.2.0, but is now broken
> > >>> with QEMU head.
> > >>>
> > >>> $ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
> > >>> -nographic -serial /dev/null -serial mon:stdio -monitor null -device
> > >>> loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
> > >>> qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
> > >>> allocate memory  
> 
> > -m 4000
> >
> > corresponds to 38 TB if I am not wrong. Is that really what you want?  
> 
> Probably not, because the zynq board's init function does:
> 
> if (machine->ram_size > 2 * GiB) {
> error_report("RAM size more than 2 GiB is not supported");
> exit(EXIT_FAILURE);
> }
> 
> It seems a bit daft that we allocate the memory before we do
> the size check. This didn't use to be this way around...
> 
> Anyway, I think the cause of this change is commit c9800965c1be6c39
> from Igor. We used to silently cap the RAM size to 2GB; now we
> complain. Or at least we would complain if we hadn't already
> tried to allocate the memory and fallen over...

That's because RAM (as host resource) is now separated
from device model (machine limits) and is allocated as
part of memory backend initialization (in this case
'create_default_memdev') before machine_run_board_init()
is run.

Maybe we can consolidate max limit checks in
create_default_memdev() by adding MachineClass::max_ram_size
but that can work only in default usecase (only '-m' is used).

However if user creates backend explicitly, there aren't any
clue about machine limits. We basically don't know what
backend is created for at the time it's initialized
(which includes RAM allocation, it might be created for VM's
RAM or ram/storage for some other device).

> 
> -- PMM
>

Re: [PATCH] hw/acpi: refactor acpi hp modules so that targets can just use what they need

2021-08-20 Thread Ani Sinha

On Thu, 19 Aug 2021, Philippe Mathieu-Daudé wrote:

> On 8/12/21 9:14 AM, Ani Sinha wrote:

> > +return;
>
> I suppose if you replace all 'return' by 'g_assert_not_reached()'
> both issues reproducers crash?
>
> Your patch is not incorrect, and indeed fixes the issues, but
> I feel we are going backward now allowing call which should
> never be there in the first place.
>

Linux kernel does something like this all over the place. They simply
replace functions with NOOPS when they are not allowed for a
configuration. They do this relying on preprocessor macros ofcourse!

It will be hard to do anythiung better without rearchitecting the modules.
That would have significant impact particularly on x86.

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread David Hildenbrand


On 20.08.21 17:44, Igor Mammedov wrote:

On Fri, 20 Aug 2021 15:39:27 +0100
Peter Maydell  wrote:


On Fri, 20 Aug 2021 at 15:34, David Hildenbrand  wrote:


On 20.08.21 16:22, Bin Meng wrote:

Hi Philippe,

On Fri, Aug 20, 2021 at 10:10 PM Philippe Mathieu-Daudé
 wrote:


Hi Bin,

On 8/20/21 4:04 PM, Bin Meng wrote:

Hi,

The following command used to work on QEMU 4.2.0, but is now broken
with QEMU head.

$ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
-nographic -serial /dev/null -serial mon:stdio -monitor null -device
loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
allocate memory



-m 4000

corresponds to 38 TB if I am not wrong. Is that really what you want?


Probably not, because the zynq board's init function does:

 if (machine->ram_size > 2 * GiB) {
 error_report("RAM size more than 2 GiB is not supported");
 exit(EXIT_FAILURE);
 }

It seems a bit daft that we allocate the memory before we do
the size check. This didn't use to be this way around...

Anyway, I think the cause of this change is commit c9800965c1be6c39
from Igor. We used to silently cap the RAM size to 2GB; now we
complain. Or at least we would complain if we hadn't already
tried to allocate the memory and fallen over...


That's because RAM (as host resource) is now separated
from device model (machine limits) and is allocated as
part of memory backend initialization (in this case
'create_default_memdev') before machine_run_board_init()
is run.

Maybe we can consolidate max limit checks in
create_default_memdev() by adding MachineClass::max_ram_size
but that can work only in default usecase (only '-m' is used).


We do have a workaround for s390x already: mc->fixup_ram_size

That should be called before the memory backend is created and seems to 
do just what we want, no?


--
Thanks,

David / dhildenb

[PATCH] softmmu/physmem: Improve guest memory allocation failure error message

2021-08-20 Thread Philippe Mathieu-Daudé

When Linux refuses to overcommit a seriously wild allocation we get:

  $ qemu-system-i386 -m 4000
  qemu-system-i386: cannot set up guest memory 'pc.ram': Cannot allocate memory

Slighly improve the error message, displaying the memory size
requested (in case the user didn't expect unspecified memory size
unit is in MiB):

  $ qemu-system-i386 -m 4000
  qemu-system-i386: Cannot set up 38.1 TiB of guest memory 'pc.ram': Cannot 
allocate memory

Reported-by: Bin Meng 
Signed-off-by: Philippe Mathieu-Daudé 
---
 softmmu/physmem.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 2e18947598e..2f300a9e79b 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1982,8 +1982,10 @@ static void ram_block_add(RAMBlock *new_block, Error 
**errp)
   &new_block->mr->align,
   shared, noreserve);
 if (!new_block->host) {
+g_autofree char *size_s = size_to_str(new_block->max_length);
 error_setg_errno(errp, errno,
- "cannot set up guest memory '%s'",
+ "Cannot set up %s of guest memory '%s'",
+ size_s,
  memory_region_name(new_block->mr));
 qemu_mutex_unlock_ramlist();
 return;
-- 
2.31.1

Re: [PATCH] softmmu/physmem: Improve guest memory allocation failure error message

2021-08-20 Thread David Hildenbrand


On 20.08.21 17:52, Philippe Mathieu-Daudé wrote:

When Linux refuses to overcommit a seriously wild allocation we get:

   $ qemu-system-i386 -m 4000
   qemu-system-i386: cannot set up guest memory 'pc.ram': Cannot allocate memory

Slighly improve the error message, displaying the memory size
requested (in case the user didn't expect unspecified memory size
unit is in MiB):

   $ qemu-system-i386 -m 4000
   qemu-system-i386: Cannot set up 38.1 TiB of guest memory 'pc.ram': Cannot 
allocate memory

Reported-by: Bin Meng 
Signed-off-by: Philippe Mathieu-Daudé 
---
  softmmu/physmem.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 2e18947598e..2f300a9e79b 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1982,8 +1982,10 @@ static void ram_block_add(RAMBlock *new_block, Error 
**errp)
&new_block->mr->align,
shared, noreserve);
  if (!new_block->host) {
+g_autofree char *size_s = size_to_str(new_block->max_length);
  error_setg_errno(errp, errno,
- "cannot set up guest memory '%s'",
+ "Cannot set up %s of guest memory '%s'",
+ size_s,
   memory_region_name(new_block->mr));
  qemu_mutex_unlock_ramlist();
  return;



IIRC, ram blocks might not necessarily be used for guest memory ... or 
is my memory wrong?


--
Thanks,

David / dhildenb

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/20/21 5:47 PM, David Hildenbrand wrote:
> On 20.08.21 17:44, Igor Mammedov wrote:
>> On Fri, 20 Aug 2021 15:39:27 +0100
>> Peter Maydell  wrote:
>>
>>> On Fri, 20 Aug 2021 at 15:34, David Hildenbrand 
>>> wrote:

 On 20.08.21 16:22, Bin Meng wrote:
> Hi Philippe,
>
> On Fri, Aug 20, 2021 at 10:10 PM Philippe Mathieu-Daudé
>  wrote:
>>
>> Hi Bin,
>>
>> On 8/20/21 4:04 PM, Bin Meng wrote:
>>> Hi,
>>>
>>> The following command used to work on QEMU 4.2.0, but is now broken
>>> with QEMU head.
>>>
>>> $ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
>>> -nographic -serial /dev/null -serial mon:stdio -monitor null -device
>>> loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
>>> qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
>>> allocate memory
>>>
 -m 4000

 corresponds to 38 TB if I am not wrong. Is that really what you want?
>>>
>>> Probably not, because the zynq board's init function does:
>>>
>>>  if (machine->ram_size > 2 * GiB) {
>>>  error_report("RAM size more than 2 GiB is not supported");
>>>  exit(EXIT_FAILURE);
>>>  }
>>>
>>> It seems a bit daft that we allocate the memory before we do
>>> the size check. This didn't use to be this way around...
>>>
>>> Anyway, I think the cause of this change is commit c9800965c1be6c39
>>> from Igor. We used to silently cap the RAM size to 2GB; now we
>>> complain. Or at least we would complain if we hadn't already
>>> tried to allocate the memory and fallen over...
>>
>> That's because RAM (as host resource) is now separated
>> from device model (machine limits) and is allocated as
>> part of memory backend initialization (in this case
>> 'create_default_memdev') before machine_run_board_init()
>> is run.
>>
>> Maybe we can consolidate max limit checks in
>> create_default_memdev() by adding MachineClass::max_ram_size
>> but that can work only in default usecase (only '-m' is used).
> 
> We do have a workaround for s390x already: mc->fixup_ram_size
> 
> That should be called before the memory backend is created and seems to
> do just what we want, no?

Or maybe more explicit adding a MachineClass::check_ram_size() handler?

Re: [PATCH] hw/acpi: refactor acpi hp modules so that targets can just use what they need

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/20/21 5:43 PM, Ani Sinha wrote:
> On Thu, 19 Aug 2021, Philippe Mathieu-Daudé wrote:
>> On 8/12/21 9:14 AM, Ani Sinha wrote:
> 
>>> +return;
>>
>> I suppose if you replace all 'return' by 'g_assert_not_reached()'
>> both issues reproducers crash?
>>
>> Your patch is not incorrect, and indeed fixes the issues, but
>> I feel we are going backward now allowing call which should
>> never be there in the first place.
>>
> 
> Linux kernel does something like this all over the place. They simply
> replace functions with NOOPS when they are not allowed for a
> configuration. They do this relying on preprocessor macros ofcourse!
> 
> It will be hard to do anythiung better without rearchitecting the modules.

Which is why this situation is unsolved since various years <:)

> That would have significant impact particularly on x86.

I don't believe so, the rework has to be done at the machine creation,
no change during runtime.

Re: [PATCH] softmmu/physmem: Improve guest memory allocation failure error message

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/20/21 5:53 PM, David Hildenbrand wrote:
> On 20.08.21 17:52, Philippe Mathieu-Daudé wrote:
>> When Linux refuses to overcommit a seriously wild allocation we get:
>>
>>    $ qemu-system-i386 -m 4000
>>    qemu-system-i386: cannot set up guest memory 'pc.ram': Cannot
>> allocate memory
>>
>> Slighly improve the error message, displaying the memory size
>> requested (in case the user didn't expect unspecified memory size
>> unit is in MiB):
>>
>>    $ qemu-system-i386 -m 4000
>>    qemu-system-i386: Cannot set up 38.1 TiB of guest memory 'pc.ram':
>> Cannot allocate memory
>>
>> Reported-by: Bin Meng 
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>   softmmu/physmem.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
>> index 2e18947598e..2f300a9e79b 100644
>> --- a/softmmu/physmem.c
>> +++ b/softmmu/physmem.c
>> @@ -1982,8 +1982,10 @@ static void ram_block_add(RAMBlock *new_block,
>> Error **errp)
>>    
>> &new_block->mr->align,
>>     shared, noreserve);
>>   if (!new_block->host) {
>> +    g_autofree char *size_s =
>> size_to_str(new_block->max_length);
>>   error_setg_errno(errp, errno,
>> - "cannot set up guest memory '%s'",
>> + "Cannot set up %s of guest memory
>> '%s'",
>> + size_s,
>>    memory_region_name(new_block->mr));
>>   qemu_mutex_unlock_ramlist();
>>   return;
>>
> 
> IIRC, ram blocks might not necessarily be used for guest memory ... or
> is my memory wrong?

No clue, this error message was already here.

No problem to change s/guest/block/ although.

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread Igor Mammedov

On Fri, 20 Aug 2021 17:47:01 +0200
David Hildenbrand  wrote:

> On 20.08.21 17:44, Igor Mammedov wrote:
> > On Fri, 20 Aug 2021 15:39:27 +0100
> > Peter Maydell  wrote:
> >   
> >> On Fri, 20 Aug 2021 at 15:34, David Hildenbrand  wrote:  
> >>>
> >>> On 20.08.21 16:22, Bin Meng wrote:  
>  Hi Philippe,
> 
>  On Fri, Aug 20, 2021 at 10:10 PM Philippe Mathieu-Daudé
>   wrote:  
> >
> > Hi Bin,
> >
> > On 8/20/21 4:04 PM, Bin Meng wrote:  
> >> Hi,
> >>
> >> The following command used to work on QEMU 4.2.0, but is now broken
> >> with QEMU head.
> >>
> >> $ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
> >> -nographic -serial /dev/null -serial mon:stdio -monitor null -device
> >> loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
> >> qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
> >> allocate memory  
> >>  
> >>> -m 4000
> >>>
> >>> corresponds to 38 TB if I am not wrong. Is that really what you want?  
> >>
> >> Probably not, because the zynq board's init function does:
> >>
> >>  if (machine->ram_size > 2 * GiB) {
> >>  error_report("RAM size more than 2 GiB is not supported");
> >>  exit(EXIT_FAILURE);
> >>  }
> >>
> >> It seems a bit daft that we allocate the memory before we do
> >> the size check. This didn't use to be this way around...
> >>
> >> Anyway, I think the cause of this change is commit c9800965c1be6c39
> >> from Igor. We used to silently cap the RAM size to 2GB; now we
> >> complain. Or at least we would complain if we hadn't already
> >> tried to allocate the memory and fallen over...  
> > 
> > That's because RAM (as host resource) is now separated
> > from device model (machine limits) and is allocated as
> > part of memory backend initialization (in this case
> > 'create_default_memdev') before machine_run_board_init()
> > is run.
> > 
> > Maybe we can consolidate max limit checks in
> > create_default_memdev() by adding MachineClass::max_ram_size
> > but that can work only in default usecase (only '-m' is used).  
> 
> We do have a workaround for s390x already: mc->fixup_ram_size
> 
> That should be called before the memory backend is created and seems to 
> do just what we want, no?

it's there for compat sake only if I recall correctly,
there should be no fixups ever.
If user asks for nonsence, QEMU should error out and force
user to correct CLI (fixups were one of items that were in
the way of splitting guest RAM into backend/frontend model)

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/20/21 6:03 PM, Igor Mammedov wrote:
> On Fri, 20 Aug 2021 17:47:01 +0200
> David Hildenbrand  wrote:
> 
>> On 20.08.21 17:44, Igor Mammedov wrote:
>>> On Fri, 20 Aug 2021 15:39:27 +0100
>>> Peter Maydell  wrote:
>>>   
 On Fri, 20 Aug 2021 at 15:34, David Hildenbrand  wrote:  
>
> On 20.08.21 16:22, Bin Meng wrote:  
>> Hi Philippe,
>>
>> On Fri, Aug 20, 2021 at 10:10 PM Philippe Mathieu-Daudé
>>  wrote:  
>>>
>>> Hi Bin,
>>>
>>> On 8/20/21 4:04 PM, Bin Meng wrote:  
 Hi,

 The following command used to work on QEMU 4.2.0, but is now broken
 with QEMU head.

 $ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
 -nographic -serial /dev/null -serial mon:stdio -monitor null -device
 loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
 qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
 allocate memory  
  
> -m 4000
>
> corresponds to 38 TB if I am not wrong. Is that really what you want?  

 Probably not, because the zynq board's init function does:

  if (machine->ram_size > 2 * GiB) {
  error_report("RAM size more than 2 GiB is not supported");
  exit(EXIT_FAILURE);
  }

 It seems a bit daft that we allocate the memory before we do
 the size check. This didn't use to be this way around...

 Anyway, I think the cause of this change is commit c9800965c1be6c39
 from Igor. We used to silently cap the RAM size to 2GB; now we
 complain. Or at least we would complain if we hadn't already
 tried to allocate the memory and fallen over...  
>>>
>>> That's because RAM (as host resource) is now separated
>>> from device model (machine limits) and is allocated as
>>> part of memory backend initialization (in this case
>>> 'create_default_memdev') before machine_run_board_init()
>>> is run.
>>>
>>> Maybe we can consolidate max limit checks in
>>> create_default_memdev() by adding MachineClass::max_ram_size
>>> but that can work only in default usecase (only '-m' is used).  
>>
>> We do have a workaround for s390x already: mc->fixup_ram_size
>>
>> That should be called before the memory backend is created and seems to 
>> do just what we want, no?
> 
> it's there for compat sake only if I recall correctly,
> there should be no fixups ever.
> If user asks for nonsence, QEMU should error out and force
> user to correct CLI

Agreed, but this would be cheaper to run the checks *before*
allocating the resources ;)

> (fixups were one of items that were in
> the way of splitting guest RAM into backend/frontend model)
>

[PATCH] migration: RDMA registrations interval optimization

2021-08-20 Thread Zhiwei Jiang

RDMA migration very hard to complete when VM run mysql
benchmark on 1G host hugepage.I think the time between
ram_control_before_iterate(f, RAM_CONTROL_ROUND) and
after_iterate is too large when 1G host pagesize,so 1M
buffer size match with mlx driver that will be good.
after this patch,it will work as normal on my situation.

Signed-off-by: Zhiwei Jiang 
---
 migration/migration.c | 13 +
 migration/migration.h |  6 ++
 migration/ram.c   |  6 +-
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 041b8451a6..934916b161 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -457,6 +457,8 @@ void migrate_add_address(SocketAddress *address)
 static void qemu_start_incoming_migration(const char *uri, Error **errp)
 {
 const char *p = NULL;
+MigrationState *s = migrate_get_current();
+s->enabled_rdma_migration = false;
 
 qapi_event_send_migration(MIGRATION_STATUS_SETUP);
 if (strstart(uri, "tcp:", &p) ||
@@ -465,6 +467,7 @@ static void qemu_start_incoming_migration(const char *uri, 
Error **errp)
 socket_start_incoming_migration(p ? p : uri, errp);
 #ifdef CONFIG_RDMA
 } else if (strstart(uri, "rdma:", &p)) {
+s->enabled_rdma_migration = true;
 rdma_start_incoming_migration(p, errp);
 #endif
 } else if (strstart(uri, "exec:", &p)) {
@@ -2040,6 +2043,7 @@ void migrate_init(MigrationState *s)
 s->start_postcopy = false;
 s->postcopy_after_devices = false;
 s->migration_thread_running = false;
+s->enabled_rdma_migration = false;
 error_free(s->error);
 s->error = NULL;
 s->hostname = NULL;
@@ -2300,6 +2304,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 socket_start_outgoing_migration(s, p ? p : uri, &local_err);
 #ifdef CONFIG_RDMA
 } else if (strstart(uri, "rdma:", &p)) {
+s->enabled_rdma_migration = true;
 rdma_start_outgoing_migration(s, p, &local_err);
 #endif
 } else if (strstart(uri, "exec:", &p)) {
@@ -2475,6 +2480,14 @@ bool migrate_use_events(void)
 return s->enabled_capabilities[MIGRATION_CAPABILITY_EVENTS];
 }
 
+bool migrate_use_rdma(void)
+{
+MigrationState *s;
+s = migrate_get_current();
+
+return s->enabled_rdma_migration;
+}
+
 bool migrate_use_multifd(void)
 {
 MigrationState *s;
diff --git a/migration/migration.h b/migration/migration.h
index 7a5aa8c2fd..860dc93df1 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -296,6 +296,11 @@ struct MigrationState {
  * This save hostname when out-going migration starts
  */
 char *hostname;
+
+/*
+ * Enable RDMA migration
+ */
+bool enabled_rdma_migration;
 };
 
 void migrate_set_state(int *state, int old_state, int new_state);
@@ -332,6 +337,7 @@ bool migrate_ignore_shared(void);
 bool migrate_validate_uuid(void);
 
 bool migrate_auto_converge(void);
+bool migrate_use_rdma(void);
 bool migrate_use_multifd(void);
 bool migrate_pause_before_switchover(void);
 int migrate_multifd_channels(void);
diff --git a/migration/ram.c b/migration/ram.c
index 7a43bfd7af..dc0c0e2565 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2043,7 +2043,11 @@ static int ram_save_host_page(RAMState *rs, 
PageSearchStatus *pss,
 qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
 unsigned long hostpage_boundary =
 QEMU_ALIGN_UP(pss->page + 1, pagesize_bits);
+/* Set RDMA boundary default 256*4K=1M that driver delivery more 
effective*/
+unsigned long rdma_boundary =
+QEMU_ALIGN_UP(pss->page + 1, 256);
 unsigned long start_page = pss->page;
+bool use_rdma = migrate_use_rdma();
 int res;
 
 if (ramblock_is_ignored(pss->block)) {
@@ -2069,7 +2073,7 @@ static int ram_save_host_page(RAMState *rs, 
PageSearchStatus *pss,
 }
 }
 pss->page = migration_bitmap_find_dirty(rs, pss->block, pss->page);
-} while ((pss->page < hostpage_boundary) &&
+} while ((pss->page < (use_rdma ? rdma_boundary : hostpage_boundary)) &&
  offset_in_ramblock(pss->block,
 ((ram_addr_t)pss->page) << TARGET_PAGE_BITS));
 /* The offset we leave with is the min boundary of host page and block */
-- 
2.25.1

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread Igor Mammedov

On Fri, 20 Aug 2021 17:53:41 +0200
Philippe Mathieu-Daudé  wrote:

> On 8/20/21 5:47 PM, David Hildenbrand wrote:
> > On 20.08.21 17:44, Igor Mammedov wrote:  
> >> On Fri, 20 Aug 2021 15:39:27 +0100
> >> Peter Maydell  wrote:
> >>  
> >>> On Fri, 20 Aug 2021 at 15:34, David Hildenbrand 
> >>> wrote:  
> 
>  On 20.08.21 16:22, Bin Meng wrote:  
> > Hi Philippe,
> >
> > On Fri, Aug 20, 2021 at 10:10 PM Philippe Mathieu-Daudé
> >  wrote:  
> >>
> >> Hi Bin,
> >>
> >> On 8/20/21 4:04 PM, Bin Meng wrote:  
> >>> Hi,
> >>>
> >>> The following command used to work on QEMU 4.2.0, but is now broken
> >>> with QEMU head.
> >>>
> >>> $ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
> >>> -nographic -serial /dev/null -serial mon:stdio -monitor null -device
> >>> loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
> >>> qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
> >>> allocate memory  
> >>>  
>  -m 4000
> 
>  corresponds to 38 TB if I am not wrong. Is that really what you want?  
> >>>
> >>> Probably not, because the zynq board's init function does:
> >>>
> >>>  if (machine->ram_size > 2 * GiB) {
> >>>  error_report("RAM size more than 2 GiB is not supported");
> >>>  exit(EXIT_FAILURE);
> >>>  }
> >>>
> >>> It seems a bit daft that we allocate the memory before we do
> >>> the size check. This didn't use to be this way around...
> >>>
> >>> Anyway, I think the cause of this change is commit c9800965c1be6c39
> >>> from Igor. We used to silently cap the RAM size to 2GB; now we
> >>> complain. Or at least we would complain if we hadn't already
> >>> tried to allocate the memory and fallen over...  
> >>
> >> That's because RAM (as host resource) is now separated
> >> from device model (machine limits) and is allocated as
> >> part of memory backend initialization (in this case
> >> 'create_default_memdev') before machine_run_board_init()
> >> is run.
> >>
> >> Maybe we can consolidate max limit checks in
> >> create_default_memdev() by adding MachineClass::max_ram_size
> >> but that can work only in default usecase (only '-m' is used).  
> > 
> > We do have a workaround for s390x already: mc->fixup_ram_size
> > 
> > That should be called before the memory backend is created and seems to
> > do just what we want, no?  
> 
> Or maybe more explicit adding a MachineClass::check_ram_size() handler?

On the first glance, just max_size field should be sufficient
with checking code being generic, which should remove code duplication
such checks introduce across tree. Is there a specific board for
which call back is 'must to have'?

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/20/21 6:08 PM, Igor Mammedov wrote:
> On Fri, 20 Aug 2021 17:53:41 +0200
> Philippe Mathieu-Daudé  wrote:
> 
>> On 8/20/21 5:47 PM, David Hildenbrand wrote:
>>> On 20.08.21 17:44, Igor Mammedov wrote:  
 On Fri, 20 Aug 2021 15:39:27 +0100
 Peter Maydell  wrote:
  
> On Fri, 20 Aug 2021 at 15:34, David Hildenbrand 
> wrote:  
>>
>> On 20.08.21 16:22, Bin Meng wrote:  
>>> Hi Philippe,
>>>
>>> On Fri, Aug 20, 2021 at 10:10 PM Philippe Mathieu-Daudé
>>>  wrote:  

 Hi Bin,

 On 8/20/21 4:04 PM, Bin Meng wrote:  
> Hi,
>
> The following command used to work on QEMU 4.2.0, but is now broken
> with QEMU head.
>
> $ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
> -nographic -serial /dev/null -serial mon:stdio -monitor null -device
> loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
> qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
> allocate memory  
>  
>> -m 4000
>>
>> corresponds to 38 TB if I am not wrong. Is that really what you want?  
>
> Probably not, because the zynq board's init function does:
>
>  if (machine->ram_size > 2 * GiB) {
>  error_report("RAM size more than 2 GiB is not supported");
>  exit(EXIT_FAILURE);
>  }
>
> It seems a bit daft that we allocate the memory before we do
> the size check. This didn't use to be this way around...
>
> Anyway, I think the cause of this change is commit c9800965c1be6c39
> from Igor. We used to silently cap the RAM size to 2GB; now we
> complain. Or at least we would complain if we hadn't already
> tried to allocate the memory and fallen over...  

 That's because RAM (as host resource) is now separated
 from device model (machine limits) and is allocated as
 part of memory backend initialization (in this case
 'create_default_memdev') before machine_run_board_init()
 is run.

 Maybe we can consolidate max limit checks in
 create_default_memdev() by adding MachineClass::max_ram_size
 but that can work only in default usecase (only '-m' is used).  
>>>
>>> We do have a workaround for s390x already: mc->fixup_ram_size
>>>
>>> That should be called before the memory backend is created and seems to
>>> do just what we want, no?  
>>
>> Or maybe more explicit adding a MachineClass::check_ram_size() handler?
> 
> On the first glance, just max_size field should be sufficient
> with checking code being generic, which should remove code duplication
> such checks introduce across tree. Is there a specific board for
> which call back is 'must to have'?

Some boards have minimum or set of possible values (i.e. 2
or 4 SIMMs, each a pow2 between 8M-64M).

We could have few generic helpers and reuse them in each
machine, instead of open-coding each machine:
  machine_check_max_ram_size(),
  machine_check_ram_size_in_range(),
  ...

Re: xilinx-zynq-a9: cannot set up guest memory 'zynq.ext_ram'

2021-08-20 Thread Igor Mammedov

On Fri, 20 Aug 2021 18:06:30 +0200
Philippe Mathieu-Daudé  wrote:

> On 8/20/21 6:03 PM, Igor Mammedov wrote:
> > On Fri, 20 Aug 2021 17:47:01 +0200
> > David Hildenbrand  wrote:
> >   
> >> On 20.08.21 17:44, Igor Mammedov wrote:  
> >>> On Fri, 20 Aug 2021 15:39:27 +0100
> >>> Peter Maydell  wrote:
> >>> 
>  On Fri, 20 Aug 2021 at 15:34, David Hildenbrand  
>  wrote:
> >
> > On 20.08.21 16:22, Bin Meng wrote:
> >> Hi Philippe,
> >>
> >> On Fri, Aug 20, 2021 at 10:10 PM Philippe Mathieu-Daudé
> >>  wrote:
> >>>
> >>> Hi Bin,
> >>>
> >>> On 8/20/21 4:04 PM, Bin Meng wrote:
>  Hi,
> 
>  The following command used to work on QEMU 4.2.0, but is now broken
>  with QEMU head.
> 
>  $ qemu-system-arm -M xilinx-zynq-a9 -display none -m 4000
>  -nographic -serial /dev/null -serial mon:stdio -monitor null -device
>  loader,file=u-boot-dtb.bin,addr=0x400,cpu-num=0
>  qemu-system-arm: cannot set up guest memory 'zynq.ext_ram': Cannot
>  allocate memory
> 
> > -m 4000
> >
> > corresponds to 38 TB if I am not wrong. Is that really what you want?   
> >  
> 
>  Probably not, because the zynq board's init function does:
> 
>   if (machine->ram_size > 2 * GiB) {
>   error_report("RAM size more than 2 GiB is not supported");
>   exit(EXIT_FAILURE);
>   }
> 
>  It seems a bit daft that we allocate the memory before we do
>  the size check. This didn't use to be this way around...
> 
>  Anyway, I think the cause of this change is commit c9800965c1be6c39
>  from Igor. We used to silently cap the RAM size to 2GB; now we
>  complain. Or at least we would complain if we hadn't already
>  tried to allocate the memory and fallen over...
> >>>
> >>> That's because RAM (as host resource) is now separated
> >>> from device model (machine limits) and is allocated as
> >>> part of memory backend initialization (in this case
> >>> 'create_default_memdev') before machine_run_board_init()
> >>> is run.
> >>>
> >>> Maybe we can consolidate max limit checks in
> >>> create_default_memdev() by adding MachineClass::max_ram_size
> >>> but that can work only in default usecase (only '-m' is used).
> >>
> >> We do have a workaround for s390x already: mc->fixup_ram_size
> >>
> >> That should be called before the memory backend is created and seems to 
> >> do just what we want, no?  
> > 
> > it's there for compat sake only if I recall correctly,
> > there should be no fixups ever.
> > If user asks for nonsence, QEMU should error out and force
> > user to correct CLI  
> 
> Agreed, but this would be cheaper to run the checks *before*
> allocating the resources ;)
Agreed,
Only it will work for default usecase only as I described above.

> 
> > (fixups were one of items that were in
> > the way of splitting guest RAM into backend/frontend model)
> >   
>

[PATCH] libqtest: check for g_setenv() failure

2021-08-20 Thread Peter Maydell

g_setenv() can fail; check for it when starting a QEMU process
when we set the QEMU_AUDIO_DRV environment variable.

Because this happens after fork() reporting an exact message
via printf() is a bad idea; just exit(1), as we already do
for the case of execlp() failure.

Fixes: Coverity CID 1460117
Signed-off-by: Peter Maydell 
---
 tests/qtest/libqtest.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
index 825b13a44c7..73f6b977a66 100644
--- a/tests/qtest/libqtest.c
+++ b/tests/qtest/libqtest.c
@@ -301,7 +301,9 @@ QTestState *qtest_init_without_qmp_handshake(const char 
*extra_args)
 s->expected_status = 0;
 s->qemu_pid = fork();
 if (s->qemu_pid == 0) {
-g_setenv("QEMU_AUDIO_DRV", "none", true);
+if (!g_setenv("QEMU_AUDIO_DRV", "none", true)) {
+exit(1);
+}
 execlp("/bin/sh", "sh", "-c", command, NULL);
 exit(1);
 }
-- 
2.20.1

Re: [PATCH] softmmu/physmem: Improve guest memory allocation failure error message

2021-08-20 Thread Igor Mammedov

On Fri, 20 Aug 2021 18:00:26 +0200
Philippe Mathieu-Daudé  wrote:

> On 8/20/21 5:53 PM, David Hildenbrand wrote:
> > On 20.08.21 17:52, Philippe Mathieu-Daudé wrote:  
> >> When Linux refuses to overcommit a seriously wild allocation we get:
> >>
> >>    $ qemu-system-i386 -m 4000
> >>    qemu-system-i386: cannot set up guest memory 'pc.ram': Cannot
> >> allocate memory
> >>
> >> Slighly improve the error message, displaying the memory size
> >> requested (in case the user didn't expect unspecified memory size
> >> unit is in MiB):
> >>
> >>    $ qemu-system-i386 -m 4000
> >>    qemu-system-i386: Cannot set up 38.1 TiB of guest memory 'pc.ram':
> >> Cannot allocate memory
> >>
> >> Reported-by: Bin Meng 
> >> Signed-off-by: Philippe Mathieu-Daudé 
> >> ---
> >>   softmmu/physmem.c | 4 +++-
> >>   1 file changed, 3 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
> >> index 2e18947598e..2f300a9e79b 100644
> >> --- a/softmmu/physmem.c
> >> +++ b/softmmu/physmem.c
> >> @@ -1982,8 +1982,10 @@ static void ram_block_add(RAMBlock *new_block,
> >> Error **errp)
> >>    
> >> &new_block->mr->align,
> >>     shared, noreserve);
> >>   if (!new_block->host) {
> >> +    g_autofree char *size_s =
> >> size_to_str(new_block->max_length);
> >>   error_setg_errno(errp, errno,
> >> - "cannot set up guest memory '%s'",
> >> + "Cannot set up %s of guest memory
> >> '%s'",
> >> + size_s,
> >>    memory_region_name(new_block->mr));
> >>   qemu_mutex_unlock_ramlist();
> >>   return;
> >>  
> > 
> > IIRC, ram blocks might not necessarily be used for guest memory ... or
> > is my memory wrong?  
> 
> No clue, this error message was already here.

it's not only guest RAM, adding size here is marginal improvement,
(it won't help much since it's not exact match to CLI which may use suffixes 
for sizes)

> 
> No problem to change s/guest/block/ although.
>

Re: [PATCH] libqtest: check for g_setenv() failure

2021-08-20 Thread Philippe Mathieu-Daudé

On 8/20/21 6:37 PM, Peter Maydell wrote:
> g_setenv() can fail; check for it when starting a QEMU process
> when we set the QEMU_AUDIO_DRV environment variable.
> 
> Because this happens after fork() reporting an exact message
> via printf() is a bad idea; just exit(1), as we already do
> for the case of execlp() failure.
> 
> Fixes: Coverity CID 1460117
> Signed-off-by: Peter Maydell 
> ---
>  tests/qtest/libqtest.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé

Re: [PULL 03/33] i386: split cpu accelerators from cpu.c, using AccelCPUClass

2021-08-20 Thread Peter Maydell

On Tue, 11 May 2021 at 09:22, Paolo Bonzini  wrote:
>
> From: Claudio Fontana 
>
> i386 is the first user of AccelCPUClass, allowing to split
> cpu.c into:
>
> cpu.ccpuid and common x86 cpu functionality
> host-cpu.c   host x86 cpu functions and "host" cpu type
> kvm/kvm-cpu.cKVM x86 AccelCPUClass
> hvf/hvf-cpu.cHVF x86 AccelCPUClass
> tcg/tcg-cpu.cTCG x86 AccelCPUClass
>
> Signed-off-by: Claudio Fontana 
> Reviewed-by: Alex Bennée 
> Reviewed-by: Richard Henderson 
>
> [claudio]:
> Rebased on commit b8184135 ("target/i386: allow modifying TCG phys-addr-bits")
>
> Signed-off-by: Claudio Fontana 
> Message-Id: <20210322132800.7470-5-cfont...@suse.de>
> Signed-off-by: Paolo Bonzini 
> ---

> diff --git a/MAINTAINERS b/MAINTAINERS
> index b692c8fbee..c2723b32cb 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -351,7 +351,7 @@ M: Paolo Bonzini 
>  M: Richard Henderson 
>  M: Eduardo Habkost 
>  S: Maintained
> -F: target/i386/
> +F: target/i386/tcg/
>  F: tests/tcg/i386/
>  F: tests/tcg/x86_64/
>  F: hw/i386/

This change to MAINTAINERS has left all the .c files
in target/i386 that are not in one of the tcg, hvf, whpx,
kvm, hax, nvmm subdirectories orphaned -- they are no
longer covered by any MAINTAINERS section.

Where should those files be listed ?

(I just discovered this when get_maintainers.pl said it couldn't
find a maintainer for a change I made to target/i386/sev.c.)

thanks
-- PMM

[PATCH] target/i386: Fix memory leak in sev_read_file_base64()

2021-08-20 Thread Peter Maydell

In sev_read_file_base64() we call g_file_get_contents(), which
allocates memory for the file contents.  We then base64-decode the
contents (which allocates another buffer for the decoded data), but
forgot to free the memory for the original file data.

Use g_autofree to ensure that the file data is freed.

Fixes: Coverity CID 1459997
Signed-off-by: Peter Maydell 
---
Tested with 'make/make check' only...

 target/i386/sev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index 83df8c09f6a..1e7833da1ab 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -565,7 +565,7 @@ static int
 sev_read_file_base64(const char *filename, guchar **data, gsize *len)
 {
 gsize sz;
-gchar *base64;
+g_autofree gchar *base64 = NULL;
 GError *error = NULL;
 
 if (!g_file_get_contents(filename, &base64, &sz, &error)) {
-- 
2.20.1

1 2 >

1 - 100 of 187 matches

Mail list logo