date:20221108

[PATCH] libvhost-user: Handling potential memory allocation failures

2022-11-08 Thread jianchunfu

Add malloc check of virtqueue element in libvhost-user.

Signed-off-by: jianchunfu 
---
 subprojects/libvhost-user/libvhost-user.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index 47d2efc60f..901cd7a2c0 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -2551,6 +2551,10 @@ virtqueue_alloc_element(size_t sz,
 
 assert(sz >= sizeof(VuVirtqElement));
 elem = malloc(out_sg_end);
+if (!elem) {
+DPRINT("%s: failed to malloc virtqelement\n", __func__);
+exit(0);
+}
 elem->out_num = out_num;
 elem->in_num = in_num;
 elem->in_sg = (void *)elem + in_sg_ofs;
-- 
2.18.4

Re: [PATCH] LOCK GUARDS: replace manual lock()/unlock() calls to QEMU_LOCK_GUARD()

2022-11-08 Thread Philippe Mathieu-Daudé


Hi,

On 8/11/22 06:11, nyoro.ga...@gmail.com wrote:

From: Samker 

This is patch replaces WITH_QEMU_LOCK_GUARD() call with the
QEMU_LOCK_GUARD()

Signed-off-by: Samker 
---
  softmmu/physmem.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index fb00596777..907491ae17 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -3115,7 +3115,7 @@ void cpu_register_map_client(QEMUBH *bh)
  {
  MapClient *client = g_malloc(sizeof(*client));
  
-WITH_QEMU_LOCK_GUARD(&map_client_list_lock);

+QEMU_LOCK_GUARD(&map_client_list_lock);
  client->bh = bh;
  QLIST_INSERT_HEAD(&map_client_list, client, link);
  if (!qatomic_read(&bounce.in_use)) {
@@ -3143,7 +3143,7 @@ void cpu_unregister_map_client(QEMUBH *bh)
  {
  MapClient *client;
  
-WITH_QEMU_LOCK_GUARD(&map_client_list_lock);

+QEMU_LOCK_GUARD(&map_client_list_lock);
  QLIST_FOREACH(client, &map_client_list, link) {
  if (client->bh == bh) {
  cpu_unregister_map_client_do(client);


This patch doesn't apply...

$ git grep QEMU_LOCK_GUARD origin/master -- softmmu/physmem.c
$

On what is it based?

Re: [PATCH v9 5/8] KVM: Register/unregister the guest private memory regions

2022-11-08 Thread Chao Peng

On Fri, Nov 04, 2022 at 09:19:31PM +, Sean Christopherson wrote:
> Paolo, any thoughts before I lead things further astray?
> 
> On Fri, Nov 04, 2022, Chao Peng wrote:
> > On Thu, Nov 03, 2022 at 11:04:53PM +, Sean Christopherson wrote:
> > > On Tue, Oct 25, 2022, Chao Peng wrote:
> > > > @@ -4708,6 +4802,24 @@ static long kvm_vm_ioctl(struct file *filp,
> > > > r = kvm_vm_ioctl_set_memory_region(kvm, &mem);
> > > > break;
> > > > }
> > > > +#ifdef CONFIG_KVM_GENERIC_PRIVATE_MEM
> > > > +   case KVM_MEMORY_ENCRYPT_REG_REGION:
> > > > +   case KVM_MEMORY_ENCRYPT_UNREG_REGION: {
> > > 
> > > I'm having second thoughts about usurping 
> > > KVM_MEMORY_ENCRYPT_(UN)REG_REGION.  Aside
> > > from the fact that restricted/protected memory may not be encrypted, 
> > > there are
> > > other potential use cases for per-page memory attributes[*], e.g. to make 
> > > memory
> > > read-only (or no-exec, or exec-only, etc...) without having to modify 
> > > memslots.
> > > 
> > > Any paravirt use case where the attributes of a page are effectively 
> > > dictated by
> > > the guest is going to run into the exact same performance problems with 
> > > memslots,
> > > which isn't suprising in hindsight since shared vs. private is really 
> > > just an
> > > attribute, albeit with extra special semantics.
> > > 
> > > And if we go with a brand new ioctl(), maybe someday in the very distant 
> > > future
> > > we can deprecate and delete KVM_MEMORY_ENCRYPT_(UN)REG_REGION.
> > > 
> > > Switching to a new ioctl() should be a minor change, i.e. shouldn't throw 
> > > too big
> > > of a wrench into things.
> > > 
> > > Something like:
> > > 
> > >   KVM_SET_MEMORY_ATTRIBUTES
> > > 
> > >   struct kvm_memory_attributes {
> > >   __u64 address;
> > >   __u64 size;
> > >   __u64 flags;
> 
> Oh, this is half-baked.  I lost track of which flags were which.  What I 
> intended
> was a separate, initially-unused flags, e.g.

That makes sense.

> 
>  struct kvm_memory_attributes {
>   __u64 address;
>   __u64 size;
>   __u64 attributes;
>   __u64 flags;
>   }
> 
> so that KVM can tweak behavior and/or extend the effective size of the struct.
> 
> > I like the idea of adding a new ioctl(). But putting all attributes into
> > a flags in uAPI sounds not good to me, e.g. forcing userspace to set all
> > attributes in one call can cause pain for userspace, probably for KVM
> > implementation as well. For private<->shared memory conversion, we
> > actually only care the KVM_MEM_ATTR_SHARED or KVM_MEM_ATTR_PRIVATE bit,
> 
> Not necessarily, e.g. I can see pKVM wanting to convert from RW+PRIVATE => 
> RO+SHARED
> or even RW+PRIVATE => NONE+SHARED so that the guest can't write/access the 
> memory
> while it's accessible from the host.
> 
> And if this does extend beyond shared/private, dropping from RWX=>R, i.e. 
> dropping
> WX permissions, would also be a common operation.
> 
> Hmm, typing that out makes me think that if we do end up supporting other 
> "attributes",
> i.e. protections, we should go straight to full RWX protections instead of 
> doing
> things piecemeal, i.e. add individual protections instead of combinations like
> NO_EXEC and READ_ONLY.  The protections would have to be inverted for 
> backwards
> compatibility, but that's easy enough to handle.  The semantics could be like
> protection keys, which also have inverted persmissions, where the final 
> protections
> are the combination of memslot+attributes, i.e. a read-only memslot couldn't 
> be made
> writable via attributes.
> 
> E.g. userspace could do "NO_READ | NO_WRITE | NO_EXEC" to temporarily block 
> access
> to memory without needing to delete the memslot.  KVM would need to disallow
> unsupported combinations, e.g. disallowed effective protections would be:
> 
>   - W or WX [unless there's an arch that supports write-only memory]
>   - R or RW [until KVM plumbs through support for no-exec, or it's 
> unsupported in hardware]
>   - X   [until KVM plumbs through support for exec-only, or it's 
> unsupported in hardware]
> 
> Anyways, that's all future work...
> 
> > but we force userspace to set other irrelevant bits as well if use this
> > API.
> 
> They aren't irrelevant though, as the memory attributes are all describing the
> allowed protections for a given page.

The 'allowed' protections seems answer my concern. But after we enabled
"NO_READ | NO_WRITE | NO_EXEC", are we going to check "NO_READ |
NO_WRITE | NO_EXEC" are also set together with the PRIVATE bit? I just
can't imagine what the semantic would be if we have the PRIVATE bit set
but other bits indicate it's actually can READ/WRITE/EXEC from usrspace.

> If there's a use case where userspace "can't"
> keep track of the attributes for whatever reason, then userspace could do a 
> RMW
> to set/clear attributes.  Alternatively, the ioctl() could take an 
> "operation" and
> support WRITE/OR/AND to allow setting/clearing individual f

Re: [PATCH v8 5/5] docs: Add generic vhost-vdpa device documentation

2022-11-08 Thread Stefano Garzarella


On Tue, Nov 08, 2022 at 11:30:53AM +0800, Longpeng (Mike, Cloud Infrastructure 
Service Product Dept.) wrote:



在 2022/11/8 10:42, Jason Wang 写道:

On Tue, Nov 8, 2022 at 8:42 AM Longpeng(Mike)  wrote:


From: Longpeng 

Signed-off-by: Longpeng 
---
 docs/system/devices/vhost-vdpa-device.rst | 43 +++
 1 file changed, 43 insertions(+)
 create mode 100644 docs/system/devices/vhost-vdpa-device.rst

diff --git a/docs/system/devices/vhost-vdpa-device.rst 
b/docs/system/devices/vhost-vdpa-device.rst
new file mode 100644
index 00..b758c4fce6
--- /dev/null
+++ b/docs/system/devices/vhost-vdpa-device.rst


If the doc is for a general vhost-vDPA device, we'd better have a better name?



How about general-vhost-vdpa-device.rst?



I would leave vhost-vdpa as the prefix, how about 
vhost-vdpa-generic-device.rst?


Thanks,
Stefano

Re: [PATCH 2/3] net: Restore printing of the help text with "-nic help"

2022-11-08 Thread Thomas Huth


On 07/11/2022 13.27, Claudio Fontana wrote:

should -net and -netdev be adapted too?


"-netdev help" already works just fine ... and "-net" should IMHO rather be 
removed than improved ;-)


 Thomas

Re: [PATCH v8 4/5] vdpa-dev: mark the device as unmigratable

2022-11-08 Thread Stefano Garzarella


On Tue, Nov 08, 2022 at 08:41:56AM +0800, Longpeng(Mike) wrote:

From: Longpeng 

The generic vDPA device doesn't support migration currently, so
mark it as unmigratable temporarily.

Signed-off-by: Longpeng 
---
hw/virtio/vdpa-dev.c | 1 +
1 file changed, 1 insertion(+)


Is there a particular reason why we don't squash this change in the 
second patch of this series where we add the hw/virtio/vdpa-dev.c file?


Anyway, this change LGTM:

Reviewed-by: Stefano Garzarella 



diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 2885d06cbe..62d83d3423 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -327,6 +327,7 @@ static Property vhost_vdpa_device_properties[] = {

static const VMStateDescription vmstate_vhost_vdpa_device = {
.name = "vhost-vdpa-device",
+.unmigratable = 1,
.minimum_version_id = 1,
.version_id = 1,
.fields = (VMStateField[]) {
--
2.23.0

Re: [PATCH 2/3] net: Restore printing of the help text with "-nic help"

2022-11-08 Thread Claudio Fontana

On 11/8/22 09:42, Thomas Huth wrote:
> On 07/11/2022 13.27, Claudio Fontana wrote:
>> should -net and -netdev be adapted too?
> 
> "-netdev help" already works just fine ... and "-net" should IMHO rather be 
> removed than improved ;-)
> 
>   Thomas
> 

I wonder if it could be done once for all, in net_init_clients,
instead of repeating the is_help_option in net_init_netdev and net_param_nic
(and that would take care of net_init_client too, so we'd get "net" for free)..

Ciao,

C

Re: [PATCH] libvhost-user: Handling potential memory allocation failures

2022-11-08 Thread Michael S. Tsirkin

On Tue, Nov 08, 2022 at 04:01:41PM +0800, jianchunfu wrote:
> Add malloc check of virtqueue element in libvhost-user.
> 
> Signed-off-by: jianchunfu 
> ---
>  subprojects/libvhost-user/libvhost-user.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/subprojects/libvhost-user/libvhost-user.c 
> b/subprojects/libvhost-user/libvhost-user.c
> index 47d2efc60f..901cd7a2c0 100644
> --- a/subprojects/libvhost-user/libvhost-user.c
> +++ b/subprojects/libvhost-user/libvhost-user.c
> @@ -2551,6 +2551,10 @@ virtqueue_alloc_element(size_t sz,
>  
>  assert(sz >= sizeof(VuVirtqElement));
>  elem = malloc(out_sg_end);
> +if (!elem) {
> +DPRINT("%s: failed to malloc virtqelement\n", __func__);
> +exit(0);
> +}

exit is certainly not the right way to handle such errors.


>  elem->out_num = out_num;
>  elem->in_num = in_num;
>  elem->in_sg = (void *)elem + in_sg_ofs;
> -- 
> 2.18.4
> 
>

Re: [PATCH 2/3] net: Restore printing of the help text with "-nic help"

2022-11-08 Thread Thomas Huth


On 08/11/2022 09.52, Claudio Fontana wrote:

On 11/8/22 09:42, Thomas Huth wrote:

On 07/11/2022 13.27, Claudio Fontana wrote:

should -net and -netdev be adapted too?


"-netdev help" already works just fine ... and "-net" should IMHO rather be
removed than improved ;-)

   Thomas



I wonder if it could be done once for all, in net_init_clients,
instead of repeating the is_help_option in net_init_netdev and net_param_nic
(and that would take care of net_init_client too, so we'd get "net" for free)..


I don't think that it makes too much sense to have one option for all - 
since all three CLI options are slightly different anyway. E.g. "-net nic" 
only exists for "-net", "hubport" cannot be used with "-net", "-nic" can 
also be used to configure the NIC model, etc.


 Thomas

Re: [PATCH] LOCK GUARDS: replace manual lock()/unlock() calls to QEMU_LOCK_GUARD()

2022-11-08 Thread Martin Gachu

Hi Philippe,

> This patch doesn't apply...

> $ git grep QEMU_LOCK_GUARD origin/master -- softmmu/physmem.c

> $

> On what is it based?

I initially replaced  lock()/unlock() calls in the softmmu/physmem.c with
the more preferred WITH_QEMU_LOCK_GUARD() call.

However Stefan pointed out that instead I should have used QEMU_LOCK_GUARD
when I don't want to use block scope.

Re: [PATCH] hw/sd/sdhci: reset data count in sdhci_buff_access_is_sequential()

2022-11-08 Thread Mauro Matteo Cascella

On Mon, Nov 7, 2022 at 8:12 PM Philippe Mathieu-Daudé  wrote:
>
> On 7/11/22 11:35, Mauro Matteo Cascella wrote:
> > Make sure to reset data_count if it's equal to (or exceeds) block_size.
> > This prevents an off-by-one read / write when accessing s->fifo_buffer
> > in sdhci_read_dataport / sdhci_write_dataport, both called right after
> > sdhci_buff_access_is_sequential.
> >
> > Fixes: CVE-2022-3872
> > Reported-by: RivenDell 
> > Reported-by: Siqi Chen 
> > Reported-by: ningqiang 
> > Signed-off-by: Mauro Matteo Cascella 
> > ---
> >   hw/sd/sdhci.c | 4 
> >   1 file changed, 4 insertions(+)
> >
> > diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
> > index 306070c872..aa2fd79df2 100644
> > --- a/hw/sd/sdhci.c
> > +++ b/hw/sd/sdhci.c
> > @@ -978,6 +978,10 @@ static bool sdhci_can_issue_command(SDHCIState *s)
> >   static inline bool
> >   sdhci_buff_access_is_sequential(SDHCIState *s, unsigned byte_num)
> >   {
> > +if (s->data_count >= (s->blksize & BLOCK_SIZE_MASK)) {
> > +s->data_count = 0;
> > +}
>
> You avoid an off-by-one but now the model doesn't work per the spec.

Note that a similar thing is done in sdhci_{read,write}_dataport. But
it's true that this is probably not enough (e.g. no update of
s->prnsts). Thank you for sending
https://lists.nongnu.org/archive/html/qemu-devel/2022-11/msg01161.html.

> Not really what the best fix IMHO.
>
> >   if ((s->data_count & 0x3) != byte_num) {
> >   trace_sdhci_error("Non-sequential access to Buffer Data Port 
> > register"
> > "is prohibited\n");
>
> I wonder why sdhci_data_transfer() indiscriminately sets
> SDHC_SPACE_AVAILABLE in the write path (at least without
> clearing the FIFO first).
>
> The fix could be:
>
> -- >8 --
> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
> @@ -955,5 +955,5 @@ static void sdhci_data_transfer(void *opaque)
>   } else {
>   s->prnsts |= SDHC_DOING_WRITE | SDHC_DAT_LINE_ACTIVE |
> -SDHC_SPACE_AVAILABLE | SDHC_DATA_INHIBIT;
> +   SDHC_DATA_INHIBIT;
>   sdhci_write_block_to_card(s);
>   }
> ---
>
> Bin, what do you think?
>
> Regards,
>
> Phil.
>

--
Mauro Matteo Cascella
Red Hat Product Security
PGP-Key ID: BB3410B0

Re: [PATCH-for-7.2 1/2] hw/sd/sdhci: Do not set Buf Wr Ena before writing block (CVE-2022-3872)

2022-11-08 Thread Mauro Matteo Cascella

On Mon, Nov 7, 2022 at 11:12 PM Philippe Mathieu-Daudé
 wrote:
>
> When sdhci_write_block_to_card() is called to transfer data from
> the FIFO to the SD bus, the data is already present in the buffer
> and we have to consume it directly.
>
> See the description of the 'Buffer Write Enable' bit from the
> 'Present State' register (prnsts::SDHC_SPACE_AVAILABLE) in Table
> 2.14 from the SDHCI spec v2:
>
>   Buffer Write Enable
>
>   This status is used for non-DMA write transfers.
>
>   The Host Controller can implement multiple buffers to transfer
>   data efficiently. This read only flag indicates if space is
>   available for write data. If this bit is 1, data can be written
>   to the buffer. A change of this bit from 1 to 0 occurs when all
>   the block data is written to the buffer. A change of this bit
>   from 0 to 1 occurs when top of block data can be written to the
>   buffer and generates the Buffer Write Ready interrupt.
>
> In our case, we do not want to overwrite the buffer, so we want
> this bit to be 0, then set it to 1 once the data is written onto
> the bus.
>
> This is probably a copy/paste error from commit d7dfca0807
> ("hw/sdhci: introduce standard SD host controller").
>
> Reproducer:
> https://lore.kernel.org/qemu-devel/caa8xkjxrms0fkr28akvnnpyatm0y0b+5fichpsrhd+mugnu...@mail.gmail.com/
>
> Fixes: CVE-2022-3872
> Reported-by: RivenDell 
> Reported-by: Siqi Chen 
> Reported-by: ningqiang 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/sd/sdhci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
> index 306070c872..f230e7475f 100644
> --- a/hw/sd/sdhci.c
> +++ b/hw/sd/sdhci.c
> @@ -954,7 +954,7 @@ static void sdhci_data_transfer(void *opaque)
>  sdhci_read_block_from_card(s);
>  } else {
>  s->prnsts |= SDHC_DOING_WRITE | SDHC_DAT_LINE_ACTIVE |
> -SDHC_SPACE_AVAILABLE | SDHC_DATA_INHIBIT;
> +   SDHC_DATA_INHIBIT;
>  sdhci_write_block_to_card(s);
>  }
>  }
> --
> 2.38.1
>

Tested-by: Mauro Matteo Cascella 

Thank you,

--
Mauro Matteo Cascella
Red Hat Product Security
PGP-Key ID: BB3410B0

Re: [PATCH] hw/arm/boot: set initrd parameters to 64bit in fdt

2022-11-08 Thread Alex Bennée

Schspa Shi  writes:

> We use 32bit value for linux,initrd-[start/end], when we have
> loader_start > 4GB, there will be a wrong initrd_start passed
> to the kernel, and the kernel will report the following warning.
>
> [0.00] [ cut here ]
> [0.00] initrd not fully accessible via the linear mapping -- please 
> check your bootloader ...
> [0.00] WARNING: CPU: 0 PID: 0 at arch/arm64/mm/init.c:355 
> arm64_memblock_init+0x158/0x244
> [0.00] Modules linked in:
> [0.00] CPU: 0 PID: 0 Comm: swapper Tainted: GW  
> 6.1.0-rc3-13250-g30a0b95b1335-dirty #28
> [0.00] Hardware name: Horizon Sigi Virtual development board
> (DT)

Is this an out-of-tree board model?

> [0.00] pstate: 60c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [0.00] pc : arm64_memblock_init+0x158/0x244
> [0.00] lr : arm64_memblock_init+0x158/0x244
> [0.00] sp : 89273df0
> [0.00] x29: 89273df0 x28: 001000cc0010 x27: 
> 8000
> [0.00] x26: 0050a3e2 x25: 88b46000 x24: 
> 88b46000
> [0.00] x23: 88a53000 x22: 8942 x21: 
> 88a53000
> [0.00] x20: 0400 x19: 0400 x18: 
> 1020
> [0.00] x17: 6568632065736165 x16: 6c70202d2d20676e x15: 
> 697070616d207261
> [0.00] x14: 656e696c20656874 x13: 0a2e2e2e20726564 x12: 
> 
> [0.00] x11:  x10:  x9 : 
> 
> [0.00] x8 :  x7 : 796c6c756620746f x6 : 
> 6e20647274696e69
> [0.00] x5 : 893c7c47 x4 : 88a2102f x3 : 
> 89273a88
> [0.00] x2 : 8000f038 x1 : 00c0 x0 : 
> 0056
> [0.00] Call trace:
> [0.00]  arm64_memblock_init+0x158/0x244
> [0.00]  setup_arch+0x164/0x1cc
> [0.00]  start_kernel+0x94/0x4ac
> [0.00]  __primary_switched+0xb4/0xbc
> [0.00] ---[ end trace  ]---
> [0.00] Zone ranges:
> [0.00]   DMA  [mem 0x0010-0x001007ff]
>
> To fix it, we can change it to u64 type.
>
> Signed-off-by: Schspa Shi 
> ---
>  hw/arm/boot.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index 57efb61ee419..da719a4f8874 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -638,14 +638,14 @@ int arm_load_dtb(hwaddr addr, const struct 
> arm_boot_info *binfo,
>  }
>  
>  if (binfo->initrd_size) {
> -rc = qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-start",
> +rc = qemu_fdt_setprop_u64(fdt, "/chosen", "linux,initrd-start",
> binfo->initrd_start);
>  if (rc < 0) {
>  fprintf(stderr, "couldn't set /chosen/linux,initrd-start\n");
>  goto fail;
>  }
>  
> -rc = qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-end",
> +rc = qemu_fdt_setprop_u64(fdt, "/chosen", "linux,initrd-end",
> binfo->initrd_start + binfo->initrd_size);
>  if (rc < 0) {
>  fprintf(stderr, "couldn't set /chosen/linux,initrd-end\n");

On the face of things this seems fine because unlike the other linux
properties they are not specified to be "expressed in #address-cells and
#size-cells" but I do wonder how we got into the situation where the
kernel and initrd ended up so high in the physical address space.

There is a whole comment in boot.c talking about keeping initrd within
lowmem:

/*
 * We want to put the initrd far enough into RAM that when the
 * kernel is uncompressed it will not clobber the initrd. However
 * on boards without much RAM we must ensure that we still leave
 * enough room for a decent sized initrd, and on boards with large
 * amounts of RAM we must avoid the initrd being so far up in RAM
 * that it is outside lowmem and inaccessible to the kernel.
 * So for boards with less  than 256MB of RAM we put the initrd
 * halfway into RAM, and for boards with 256MB of RAM or more we put
 * the initrd at 128MB.
 * We also refuse to put the initrd somewhere that will definitely
 * overlay the kernel we just loaded, though for kernel formats which
 * don't tell us their exact size (eg self-decompressing 32-bit kernels)
 * we might still make a bad choice here.
 */

Is this just because the base RAM address of the board is outside of the
32 bit address range?

-- 
Alex Bennée

Re: [PATCH] LOCK GUARDS: replace manual lock()/unlock() calls to QEMU_LOCK_GUARD()

2022-11-08 Thread Alex Bennée



nyoro.ga...@gmail.com writes:

> From: Samker 
>
> This is patch replaces WITH_QEMU_LOCK_GUARD() call with the
> QEMU_LOCK_GUARD()

You could extend this description as to why ".. because it extends the
scope of the lock to the whole function rather than the non-existent
block".

Otherwise:

Reviewed-by: Alex Bennée 


>
> Signed-off-by: Samker 
> ---
>  softmmu/physmem.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
> index fb00596777..907491ae17 100644
> --- a/softmmu/physmem.c
> +++ b/softmmu/physmem.c
> @@ -3115,7 +3115,7 @@ void cpu_register_map_client(QEMUBH *bh)
>  {
>  MapClient *client = g_malloc(sizeof(*client));
>  
> -WITH_QEMU_LOCK_GUARD(&map_client_list_lock);
> +QEMU_LOCK_GUARD(&map_client_list_lock);
>  client->bh = bh;
>  QLIST_INSERT_HEAD(&map_client_list, client, link);
>  if (!qatomic_read(&bounce.in_use)) {
> @@ -3143,7 +3143,7 @@ void cpu_unregister_map_client(QEMUBH *bh)
>  {
>  MapClient *client;
>  
> -WITH_QEMU_LOCK_GUARD(&map_client_list_lock);
> +QEMU_LOCK_GUARD(&map_client_list_lock);
>  QLIST_FOREACH(client, &map_client_list, link) {
>  if (client->bh == bh) {
>  cpu_unregister_map_client_do(client);


-- 
Alex Bennée

[PATCH v1 2/9] tests/avocado: improve behaviour waiting for login prompts

2022-11-08 Thread Alex Bennée

This attempts to deal with the problem of login prompts not being
guaranteed to be terminated with a newline. The solution to this is to
peek at the incoming data looking to see if we see an up-coming match
before we fall back to the old readline() logic. The reason to mostly
rely on readline is because I am occasionally seeing the peek stalling
despite data being there.

This seems kinda hacky and gross so I'm open to alternative approaches
and cleaner python code.

Signed-off-by: Alex Bennée 
---
 tests/avocado/avocado_qemu/__init__.py | 89 +-
 1 file changed, 88 insertions(+), 1 deletion(-)

diff --git a/tests/avocado/avocado_qemu/__init__.py 
b/tests/avocado/avocado_qemu/__init__.py
index 910f3ba1ea..d6ff68e171 100644
--- a/tests/avocado/avocado_qemu/__init__.py
+++ b/tests/avocado/avocado_qemu/__init__.py
@@ -131,6 +131,58 @@ def pick_default_qemu_bin(bin_prefix='qemu-system-', 
arch=None):
 return path
 return None
 
+def _peek_ahead(console, min_match, success_message):
+"""
+peek ahead in the console stream keeping an eye out for the
+success message.
+
+Returns some message to process or None, indicating the normal
+readline should occur.
+"""
+console_logger = logging.getLogger('console')
+peek_len = 0
+retries = 0
+
+while True:
+try:
+old_peek_len = peek_len
+peek_ahead = console.peek(min_match).decode()
+peek_len = len(peek_ahead)
+
+# if we get stuck too long lets just fallback to readline
+if peek_len <= old_peek_len:
+retries += 1
+if retries > 10:
+return None
+
+# if we see a newline in the peek we can let safely bail
+# and let the normal readline() deal with it
+if peek_ahead.endswith(('\n', '\r', '\r\n')):
+return None
+
+# if we haven't seen enough for the whole message but the
+# first part matches lets just loop again
+if len(peek_ahead) < min_match and \
+   success_message[:peek_len] in peek_ahead:
+continue
+
+# if we see the whole success_message we are done, consume
+# it and pass back so we can exit to the user
+if success_message in peek_ahead:
+return console.read(peek_len).decode()
+
+# of course if we've seen enough then this line probably
+# doesn't contain what we are looking for, fallback
+if peek_len > min_match:
+return None
+
+except UnicodeDecodeError:
+console_logger.log("error in decode of peek")
+return None
+
+# we should never get here
+return None
+
 
 def _console_interaction(test, success_message, failure_message,
  send_string, keep_sending=False, vm=None):
@@ -139,17 +191,52 @@ def _console_interaction(test, success_message, 
failure_message,
 vm = test.vm
 console = vm.console_socket.makefile(mode='rb', encoding='utf-8')
 console_logger = logging.getLogger('console')
+
+# Establish the minimum number of bytes we would need to
+# potentially match against success_message
+if success_message is not None:
+min_match = len(success_message)
+else:
+min_match = 0
+
+console_logger.debug("looking for %d:(%s), sending %s (always=%s)",
+ min_match, success_message, send_string, keep_sending)
+
 while True:
+msg = None
+
+# First send our string, optionally repeating the send next
+# cycle.
 if send_string:
 vm.console_socket.sendall(send_string.encode())
 if not keep_sending:
 send_string = None # send only once
+
+# If the console has no data to read we briefly
+# sleep before continuing.
+if not console.readable():
+time.sleep(0.1)
+continue
+
 try:
-msg = console.readline().decode().strip()
+
+# First we shall peek ahead for a potential match to cover waiting
+# for lines without any newlines.
+if min_match > 0:
+msg = _peek_ahead(console, min_match, success_message)
+
+# otherwise we block here for a full line
+if not msg:
+msg = console.readline().decode().strip()
+
 except UnicodeDecodeError:
+console_logger.debug("skipped unicode error")
 msg = None
+
+# if nothing came out we continue and try again
 if not msg:
 continue
+
 console_logger.debug(msg)
 if success_message is None or success_message in msg:
 break
-- 
2.34.1

[PATCH v1 1/9] Run docker probe only if docker or podman are available

2022-11-08 Thread Alex Bennée

From: Stefan Weil 

The docker probe uses "sudo -n" which can cause an e-mail with a security 
warning
each time when configure is run. Therefore run docker probe only if either 
docker
or podman are available.

That avoids the problematic "sudo -n" on build environments which have neither
docker nor podman installed.

Fixes: c4575b59155e2e00 ("configure: store container engine in config-host.mak")
Signed-off-by: Stefan Weil 
Message-Id: <20221030083510.310584-1...@weilnetz.de>
Signed-off-by: Alex Bennée 
---
 configure | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure b/configure
index 66928692b0..26c7bc5154 100755
--- a/configure
+++ b/configure
@@ -1780,7 +1780,7 @@ fi
 # functions to probe cross compilers
 
 container="no"
-if test $use_containers = "yes"; then
+if test $use_containers = "yes" && (has "docker" || has "podman"); then
 case $($python "$source_path"/tests/docker/docker.py probe) in
 *docker) container=docker ;;
 podman) container=podman ;;
-- 
2.34.1

[PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start

2022-11-08 Thread Alex Bennée

The previous fix to virtio_device_started revealed a problem in its
use by both the core and the device code. The core code should be able
to handle the device "starting" while the VM isn't running to handle
the restoration of migration state. To solve this dual use introduce a
new helper for use by the vhost-user backends who all use it to feed a
should_start variable.

We can also pick up a change vhost_user_blk_set_status while we are at
it which follows the same pattern.

Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started)
Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device)
Signed-off-by: Alex Bennée 
Cc: "Michael S. Tsirkin" 
---
 include/hw/virtio/virtio.h   | 18 ++
 hw/block/vhost-user-blk.c|  6 +-
 hw/virtio/vhost-user-fs.c|  2 +-
 hw/virtio/vhost-user-gpio.c  |  2 +-
 hw/virtio/vhost-user-i2c.c   |  2 +-
 hw/virtio/vhost-user-rng.c   |  2 +-
 hw/virtio/vhost-user-vsock.c |  2 +-
 hw/virtio/vhost-vsock.c  |  2 +-
 8 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index f41b4a7e64..3191c618f3 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -389,6 +389,24 @@ static inline bool virtio_device_started(VirtIODevice 
*vdev, uint8_t status)
 return vdev->started;
 }
 
+return status & VIRTIO_CONFIG_S_DRIVER_OK;
+}
+
+/**
+ * virtio_device_should_start() - check if device startable
+ * @vdev - the VirtIO device
+ * @status - the devices status bits
+ *
+ * This is similar to virtio_device_started() but also encapsulates a
+ * check on the VM status which would prevent a device starting
+ * anyway.
+ */
+static inline bool virtio_device_should_start(VirtIODevice *vdev, uint8_t 
status)
+{
+if (vdev->use_started) {
+return vdev->started;
+}
+
 if (!vdev->vm_running) {
 return false;
 }
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 13bf5cc47a..8feaf12e4e 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -222,14 +222,10 @@ static void vhost_user_blk_stop(VirtIODevice *vdev)
 static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
 {
 VHostUserBlk *s = VHOST_USER_BLK(vdev);
-bool should_start = virtio_device_started(vdev, status);
+bool should_start = virtio_device_should_start(vdev, status);
 Error *local_err = NULL;
 int ret;
 
-if (!vdev->vm_running) {
-should_start = false;
-}
-
 if (!s->connected) {
 return;
 }
diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index ad0f91c607..1c40f42045 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -123,7 +123,7 @@ static void vuf_stop(VirtIODevice *vdev)
 static void vuf_set_status(VirtIODevice *vdev, uint8_t status)
 {
 VHostUserFS *fs = VHOST_USER_FS(vdev);
-bool should_start = virtio_device_started(vdev, status);
+bool should_start = virtio_device_should_start(vdev, status);
 
 if (vhost_dev_is_started(&fs->vhost_dev) == should_start) {
 return;
diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
index 8b40fe450c..677d1c7730 100644
--- a/hw/virtio/vhost-user-gpio.c
+++ b/hw/virtio/vhost-user-gpio.c
@@ -152,7 +152,7 @@ static void vu_gpio_stop(VirtIODevice *vdev)
 static void vu_gpio_set_status(VirtIODevice *vdev, uint8_t status)
 {
 VHostUserGPIO *gpio = VHOST_USER_GPIO(vdev);
-bool should_start = virtio_device_started(vdev, status);
+bool should_start = virtio_device_should_start(vdev, status);
 
 trace_virtio_gpio_set_status(status);
 
diff --git a/hw/virtio/vhost-user-i2c.c b/hw/virtio/vhost-user-i2c.c
index bc58b6c0d1..864eba695e 100644
--- a/hw/virtio/vhost-user-i2c.c
+++ b/hw/virtio/vhost-user-i2c.c
@@ -93,7 +93,7 @@ static void vu_i2c_stop(VirtIODevice *vdev)
 static void vu_i2c_set_status(VirtIODevice *vdev, uint8_t status)
 {
 VHostUserI2C *i2c = VHOST_USER_I2C(vdev);
-bool should_start = virtio_device_started(vdev, status);
+bool should_start = virtio_device_should_start(vdev, status);
 
 if (vhost_dev_is_started(&i2c->vhost_dev) == should_start) {
 return;
diff --git a/hw/virtio/vhost-user-rng.c b/hw/virtio/vhost-user-rng.c
index bc1f36c5ac..8b47287875 100644
--- a/hw/virtio/vhost-user-rng.c
+++ b/hw/virtio/vhost-user-rng.c
@@ -90,7 +90,7 @@ static void vu_rng_stop(VirtIODevice *vdev)
 static void vu_rng_set_status(VirtIODevice *vdev, uint8_t status)
 {
 VHostUserRNG *rng = VHOST_USER_RNG(vdev);
-bool should_start = virtio_device_started(vdev, status);
+bool should_start = virtio_device_should_start(vdev, status);
 
 if (vhost_dev_is_started(&rng->vhost_dev) == should_start) {
 return;
diff --git a/hw/virtio/vhost-user-vsock.c b/hw/virtio/vhost-user-vsock.c
index 7b67e29d83..9431b9792c 100644
--- a/hw/virtio/vhost-user-vsock.c
+++ b/hw/virtio/vhost-user-vsock.c
@@ -5

[PATCH v1 for 7.2 0/9] test and doc updates

2022-11-08 Thread Alex Bennée

Hi,

Its freeze o'clock so I'm reverting to collecting fixes for the tree.
Most of these patches have been posted before as single patch RFCs. A
couple are already scheduled through other trees so will drop out in
due course. The docs update has included some feedback for the review
and should be good for an rc1.

The following have not been reviewed:

 - hw/virtio: introduce virtio_device_should_start
 - tests/docker: allow user to override check target
 - tests/avocado: improve behaviour waiting for login prompts

Alex Bennée (7):
  tests/avocado: improve behaviour waiting for login prompts
  tests/docker: allow user to override check target
  hw/virtio: introduce virtio_device_should_start
  docs/devel: add a maintainers section to development process
  docs/devel: make language a little less code centric
  docs/devel: simplify the minimal checklist
  docs/devel: try and improve the language around patch review

Cédric Le Goater (1):
  tests/avocado/machine_aspeed.py: Reduce noise on the console for SDK
tests

Stefan Weil (1):
  Run docker probe only if docker or podman are available

 docs/devel/code-of-conduct.rst   |   2 +
 docs/devel/index-process.rst |   1 +
 docs/devel/maintainers.rst   | 106 +++
 docs/devel/submitting-a-patch.rst| 101 +
 docs/devel/submitting-a-pull-request.rst |  12 +--
 configure|   2 +-
 include/hw/virtio/virtio.h   |  18 
 hw/block/vhost-user-blk.c|   6 +-
 hw/virtio/vhost-user-fs.c|   2 +-
 hw/virtio/vhost-user-gpio.c  |   2 +-
 hw/virtio/vhost-user-i2c.c   |   2 +-
 hw/virtio/vhost-user-rng.c   |   2 +-
 hw/virtio/vhost-user-vsock.c |   2 +-
 hw/virtio/vhost-vsock.c  |   2 +-
 tests/avocado/avocado_qemu/__init__.py   |  89 ++-
 tests/avocado/machine_aspeed.py  |  17 ++--
 tests/docker/Makefile.include|   2 +
 tests/docker/common.rc   |   6 +-
 18 files changed, 309 insertions(+), 65 deletions(-)
 create mode 100644 docs/devel/maintainers.rst

-- 
2.34.1

[PATCH v1 8/9] docs/devel: simplify the minimal checklist

2022-11-08 Thread Alex Bennée

The bullet points are quite long and contain process tips. Move those
bits of the bullet to the relevant sections and link to them. Use a
table for nicer formatting of the checklist.

Signed-off-by: Alex Bennée 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221012121152.1179051-4-alex.ben...@linaro.org>

---
v2
  - emphasise a Real Name should be used
  - s/therefor/therefore/
---
 docs/devel/submitting-a-patch.rst | 75 ---
 1 file changed, 49 insertions(+), 26 deletions(-)

diff --git a/docs/devel/submitting-a-patch.rst 
b/docs/devel/submitting-a-patch.rst
index 9c7c4331f3..1f2bde0625 100644
--- a/docs/devel/submitting-a-patch.rst
+++ b/docs/devel/submitting-a-patch.rst
@@ -12,25 +12,18 @@ be committed faster.
 This page seems very long, so if you are only trying to post a quick
 one-shot fix, the bare minimum we ask is that:
 
--  You **must** provide a Signed-off-by: line (this is a hard
-   requirement because it's how you say "I'm legally okay to contribute
-   this and happy for it to go into QEMU", modeled after the `Linux kernel
-   
`__
-   policy.) ``git commit -s`` or ``git format-patch -s`` will add one.
--  All contributions to QEMU must be **sent as patches** to the
-   qemu-devel `mailing list `__.
-   Patch contributions should not be posted on the bug tracker, posted on
-   forums, or externally hosted and linked to. (We have other mailing lists 
too,
-   but all patches must go to qemu-devel, possibly with a Cc: to another
-   list.) ``git send-email`` (`step-by-step setup
-   guide `__ and `hints and
-   tips 
`__)
-   works best for delivering the patch without mangling it, but
-   attachments can be used as a last resort on a first-time submission.
--  You must read replies to your message, and be willing to act on them.
-   Note, however, that maintainers are often willing to manually fix up
-   first-time contributions, since there is a learning curve involved in
-   making an ideal patch submission.
+.. list-table:: Minimal Checklist for Patches
+   :widths: 35 65
+   :header-rows: 1
+
+   * - Check
+ - Reason
+   * - Patches contain Signed-off-by: Real Name 
+ - States you are legally able to contribute the code. See 
:ref:`patch_emails_must_include_a_signed_off_by_line`
+   * - Sent as patch emails to ``qemu-devel@nongnu.org``
+ - The project uses an email list based workflow. See 
:ref:`submitting_your_patches`
+   * - Be prepared to respond to review comments
+ - Code that doesn't pass review will not get merged. See 
:ref:`participating_in_code_review`
 
 You do not have to subscribe to post (list policy is to reply-to-all to
 preserve CCs and keep non-subscribers in the loop on the threads they
@@ -229,6 +222,19 @@ bisection doesn't land on a known-broken state.
 Submitting your Patches
 ---
 
+The QEMU project uses a public email based workflow for reviewing and
+merging patches. As a result all contributions to QEMU must be **sent
+as patches** to the qemu-devel `mailing list
+`__. Patch
+contributions should not be posted on the bug tracker, posted on
+forums, or externally hosted and linked to. (We have other mailing
+lists too, but all patches must go to qemu-devel, possibly with a Cc:
+to another list.) ``git send-email`` (`step-by-step setup guide
+`__ and `hints and tips
+`__)
+works best for delivering the patch without mangling it, but
+attachments can be used as a last resort on a first-time submission.
+
 .. _if_you_cannot_send_patch_emails:
 
 If you cannot send patch emails
@@ -314,10 +320,12 @@ git repository to fetch the original commit.
 Patch emails must include a ``Signed-off-by:`` line
 ~~~
 
-For more information see `SubmittingPatches 1.12
-`__.
-This is vital or we will not be able to apply your patch! Please use
-your real name to sign a patch (not an alias or acronym).
+Your patches **must** include a Signed-off-by: line. This is a hard
+requirement because it's how you say "I'm legally okay to contribute
+this and happy for it to go into QEMU". The process is modelled after
+the `Linux kernel
+`__
+policy.
 
 If you wrote the patch, make sure your "From:

[PATCH v1 7/9] docs/devel: make language a little less code centric

2022-11-08 Thread Alex Bennée

We welcome all sorts of patches.

Signed-off-by: Alex Bennée 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221012121152.1179051-3-alex.ben...@linaro.org>
---
 docs/devel/submitting-a-patch.rst | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/devel/submitting-a-patch.rst 
b/docs/devel/submitting-a-patch.rst
index fec33ce148..9c7c4331f3 100644
--- a/docs/devel/submitting-a-patch.rst
+++ b/docs/devel/submitting-a-patch.rst
@@ -3,11 +3,11 @@
 Submitting a Patch
 ==
 
-QEMU welcomes contributions of code (either fixing bugs or adding new
-functionality). However, we get a lot of patches, and so we have some
-guidelines about submitting patches. If you follow these, you'll help
-make our task of code review easier and your patch is likely to be
-committed faster.
+QEMU welcomes contributions to fix bugs, add functionality or improve
+the documentation. However, we get a lot of patches, and so we have
+some guidelines about submitting them. If you follow these, you'll
+help make our task of code review easier and your patch is likely to
+be committed faster.
 
 This page seems very long, so if you are only trying to post a quick
 one-shot fix, the bare minimum we ask is that:
-- 
2.34.1

[PATCH v1 3/9] tests/avocado/machine_aspeed.py: Reduce noise on the console for SDK tests

2022-11-08 Thread Alex Bennée

From: Cédric Le Goater 

The Aspeed SDK images are based on OpenBMC which starts a lot of
services. The output noise on the console can break from time to time
the test waiting for the logging prompt.

Change the U-Boot bootargs variable to add "quiet" to the kernel
command line and reduce the output volume. This also drops the test on
the CPU id which was nice to have but not essential.

Signed-off-by: Cédric Le Goater 
Message-Id: <20221104075347.370503-1-...@kaod.org>
Signed-off-by: Alex Bennée 
---
 tests/avocado/machine_aspeed.py | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/tests/avocado/machine_aspeed.py b/tests/avocado/machine_aspeed.py
index fba6527026..1fc385e1c8 100644
--- a/tests/avocado/machine_aspeed.py
+++ b/tests/avocado/machine_aspeed.py
@@ -12,6 +12,7 @@
 from avocado_qemu import wait_for_console_pattern
 from avocado_qemu import exec_command
 from avocado_qemu import exec_command_and_wait_for_pattern
+from avocado_qemu import interrupt_interactive_console_until_pattern
 from avocado.utils import archive
 from avocado import skipIf
 
@@ -182,6 +183,8 @@ def test_arm_ast2600_evb_buildroot(self):
 
 class AST2x00MachineSDK(QemuSystemTest):
 
+EXTRA_BOOTARGS = ' quiet'
+
 # FIXME: Although these tests boot a whole distro they are still
 # slower than comparable machine models. There may be some
 # optimisations which bring down the runtime. In the meantime they
@@ -194,7 +197,7 @@ def wait_for_console_pattern(self, success_message, 
vm=None):
  failure_message='Kernel panic - not syncing',
  vm=vm)
 
-def do_test_arm_aspeed_sdk_start(self, image, cpu_id):
+def do_test_arm_aspeed_sdk_start(self, image):
 self.require_netdev('user')
 self.vm.set_console()
 self.vm.add_args('-drive', 'file=' + image + ',if=mtd,format=raw',
@@ -202,9 +205,13 @@ def do_test_arm_aspeed_sdk_start(self, image, cpu_id):
 self.vm.launch()
 
 self.wait_for_console_pattern('U-Boot 2019.04')
-self.wait_for_console_pattern('## Loading kernel from FIT Image')
+interrupt_interactive_console_until_pattern(
+self, 'Hit any key to stop autoboot:', 'ast#')
+exec_command_and_wait_for_pattern(
+self, 'setenv bootargs ${bootargs}' + self.EXTRA_BOOTARGS, 'ast#')
+exec_command_and_wait_for_pattern(
+self, 'boot', '## Loading kernel from FIT Image')
 self.wait_for_console_pattern('Starting kernel ...')
-self.wait_for_console_pattern('Booting Linux on physical CPU ' + 
cpu_id)
 
 @skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
 def test_arm_ast2500_evb_sdk(self):
@@ -221,7 +228,7 @@ def test_arm_ast2500_evb_sdk(self):
 archive.extract(image_path, self.workdir)
 
 self.do_test_arm_aspeed_sdk_start(
-self.workdir + '/ast2500-default/image-bmc', '0x0')
+self.workdir + '/ast2500-default/image-bmc')
 self.wait_for_console_pattern('ast2500-default login:')
 
 @skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
@@ -243,7 +250,7 @@ def test_arm_ast2600_evb_sdk(self):
 self.vm.add_args('-device',
  'ds1338,bus=aspeed.i2c.bus.5,address=0x32');
 self.do_test_arm_aspeed_sdk_start(
-self.workdir + '/ast2600-default/image-bmc', '0xf00')
+self.workdir + '/ast2600-default/image-bmc')
 self.wait_for_console_pattern('ast2600-default login:')
 exec_command_and_wait_for_pattern(self, 'root', 'Password:')
 exec_command_and_wait_for_pattern(self, '0penBmc', 
'root@ast2600-default:~#')
-- 
2.34.1

[PATCH v1 6/9] docs/devel: add a maintainers section to development process

2022-11-08 Thread Alex Bennée

We don't currently have a clear place in the documentation to describe
the roles and responsibilities of a maintainer. Lets create one so we
can. I've moved a few small bits out of other files to try and keep
everything in one place.

Signed-off-by: Alex Bennée 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221012121152.1179051-2-alex.ben...@linaro.org>

---
v2
  - s/roll/role
  - s/projects/project's
  - mention looking after the long term health of area
  - add a section on becoming a reviewer
  - expand becoming section
  - add footnote about key signing
---
 docs/devel/code-of-conduct.rst   |   2 +
 docs/devel/index-process.rst |   1 +
 docs/devel/maintainers.rst   | 106 +++
 docs/devel/submitting-a-pull-request.rst |  12 +--
 4 files changed, 113 insertions(+), 8 deletions(-)
 create mode 100644 docs/devel/maintainers.rst

diff --git a/docs/devel/code-of-conduct.rst b/docs/devel/code-of-conduct.rst
index 195444d1b4..f734ed0317 100644
--- a/docs/devel/code-of-conduct.rst
+++ b/docs/devel/code-of-conduct.rst
@@ -1,3 +1,5 @@
+.. _code_of_conduct:
+
 Code of Conduct
 ===
 
diff --git a/docs/devel/index-process.rst b/docs/devel/index-process.rst
index d0d7a200fd..d50dd74c3e 100644
--- a/docs/devel/index-process.rst
+++ b/docs/devel/index-process.rst
@@ -8,6 +8,7 @@ Notes about how to interact with the community and how and 
where to submit patch
 
code-of-conduct
conflict-resolution
+   maintainers
style
submitting-a-patch
trivial-patches
diff --git a/docs/devel/maintainers.rst b/docs/devel/maintainers.rst
new file mode 100644
index 00..05110909d1
--- /dev/null
+++ b/docs/devel/maintainers.rst
@@ -0,0 +1,106 @@
+.. _maintainers:
+
+The Role of Maintainers
+===
+
+Maintainers are a critical part of the project's contributor ecosystem.
+They come from a wide range of backgrounds from unpaid hobbyists
+working in their spare time to employees who work on the project as
+part of their job. Maintainer activities include:
+
+  - reviewing patches and suggesting changes
+  - collecting patches and preparing pull requests
+  - tending to the long term health of their area
+  - participating in other project activities
+
+They are also human and subject to the same pressures as everyone else
+including overload and burnout. Like everyone else they are subject
+to project's :ref:`code_of_conduct` and should also be exemplars of
+excellent community collaborators.
+
+The MAINTAINERS file
+
+
+The `MAINTAINERS
+`__
+file contains the canonical list of who is a maintainer. The file
+is machine readable so an appropriately configured git (see
+:ref:`cc_the_relevant_maintainer`) can automatically Cc them on
+patches that touch their area of code.
+
+The file also describes the status of the area of code to give an idea
+of how actively that section is maintained.
+
+.. list-table:: Meaning of support status in MAINTAINERS
+   :widths: 25 75
+   :header-rows: 1
+
+   * - Status
+ - Meaning
+   * - Supported
+ - Someone is actually paid to look after this.
+   * - Maintained
+ - Someone actually looks after it.
+   * - Odd Fixes
+ - It has a maintainer but they don't have time to do
+   much other than throw the odd patch in.
+   * - Orphan
+ - No current maintainer.
+   * - Obsolete
+ - Old obsolete code, should use something else.
+
+Please bear in mind that even if someone is paid to support something
+it does not mean they are paid to support you. This is open source and
+the code comes with no warranty and the project makes no guarantees
+about dealing with bugs or features requests.
+
+
+
+Becoming a reviewer
+---
+
+Most maintainers start by becoming subsystem reviewers. While anyone
+is welcome to review code on the mailing list getting added to the
+MAINTAINERS file with a line like::
+
+  R: Random Hacker 
+
+will ensure that patches touching a given subsystem will automatically
+be CC'd to you.
+
+Becoming a maintainer
+-
+
+Maintainers are volunteers who put themselves forward or have been
+asked by others to keep an eye on an area of code. They have generally
+demonstrated to the community, usually via contributions and code
+reviews, that they have a good understanding of the subsystem. They
+are also trusted to make a positive contribution to the project and
+work well with the other contributors.
+
+The process is simple - simply send a patch to the list that updates
+the ``MAINTAINERS`` file. Sometimes this is done as part of a larger
+series when a new sub-system is being added to the code base. This can
+also be done by a retiring maintainer who nominates their replacement
+after discussion with other contributors.
+
+Once the patch is reviewed and merged the only other step is to make
+sure your GPG key is signed.
+
+.. _

[PATCH v1 9/9] docs/devel: try and improve the language around patch review

2022-11-08 Thread Alex Bennée

It is important that contributors take the review process seriously
and we collaborate in a respectful way while avoiding personal
attacks. Try and make this clear in the language.

Signed-off-by: Alex Bennée 
Reviewed-by: Markus Armbruster 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221012121152.1179051-5-alex.ben...@linaro.org>
---
 docs/devel/submitting-a-patch.rst | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/docs/devel/submitting-a-patch.rst 
b/docs/devel/submitting-a-patch.rst
index 1f2bde0625..80e8693bb6 100644
--- a/docs/devel/submitting-a-patch.rst
+++ b/docs/devel/submitting-a-patch.rst
@@ -434,14 +434,20 @@ developers will identify bugs, or suggest a cleaner 
approach, or even
 just point out code style issues or commit message typos. You'll need to
 respond to these, and then send a second version of your patches with
 the issues fixed. This takes a little time and effort on your part, but
-if you don't do it then your changes will never get into QEMU. It's also
-just polite -- it is quite disheartening for a developer to spend time
-reviewing your code and suggesting improvements, only to find that
-you're not going to do anything further and it was all wasted effort.
+if you don't do it then your changes will never get into QEMU.
+
+Remember that a maintainer is under no obligation to take your
+patches. If someone has spent the time reviewing your code and
+suggesting improvements and you simply re-post without either
+addressing the comment directly or providing additional justification
+for the change then it becomes wasted effort. You cannot demand others
+merge and then fix up your code after the fact.
 
 When replying to comments on your patches **reply to all and not just
 the sender** -- keeping discussion on the mailing list means everybody
-can follow it.
+can follow it. Remember the spirit of the :ref:`code_of_conduct` and
+keep discussions respectful and collaborative and avoid making
+personal comments.
 
 .. _pay_attention_to_review_comments:
 
-- 
2.34.1

[PATCH v1 4/9] tests/docker: allow user to override check target

2022-11-08 Thread Alex Bennée

This is useful when trying to bisect a particular failing test behind
a docker run. For example:

  make docker-test-clang@fedora \
TARGET_LIST=arm-softmmu \
TEST_COMMAND="meson test qtest-arm/qos-test" \
J=9 V=1

Signed-off-by: Alex Bennée 

---
v1
 - fix s/target /target./
 - CHECK_TARGET -> TEST_COMMAND
---
 tests/docker/Makefile.include | 2 ++
 tests/docker/common.rc| 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
index c87f14477a..fc7a3b7e71 100644
--- a/tests/docker/Makefile.include
+++ b/tests/docker/Makefile.include
@@ -184,6 +184,7 @@ docker:
@echo 'TARGET_LIST=a,b,cOverride target list in builds.'
@echo 'EXTRA_CONFIGURE_OPTS="..."'
@echo ' Extra configure options.'
+   @echo 'TEST_COMMAND="..."   Override the default `make check` 
target.'
@echo 'IMAGES="a b c ..":   Restrict available images to subset.'
@echo 'TESTS="x y z .." Restrict available tests to subset.'
@echo 'J=[0..9]*Overrides the -jN parameter for make 
commands'
@@ -230,6 +231,7 @@ docker-run: docker-qemu-src
$(if $(NETWORK),$(if $(subst 
$(NETWORK),,1),--net=$(NETWORK)),--net=none) \
-e TARGET_LIST=$(subst 
$(SPACE),$(COMMA),$(TARGET_LIST))\
-e EXTRA_CONFIGURE_OPTS="$(EXTRA_CONFIGURE_OPTS)" \
+   -e TEST_COMMAND="$(TEST_COMMAND)"   \
-e V=$V -e J=$J -e DEBUG=$(DEBUG)   \
-e SHOW_ENV=$(SHOW_ENV) \
$(if $(NOUSER),,\
diff --git a/tests/docker/common.rc b/tests/docker/common.rc
index e6f8cee0d6..9a33df2832 100755
--- a/tests/docker/common.rc
+++ b/tests/docker/common.rc
@@ -63,12 +63,12 @@ check_qemu()
 {
 # default to make check unless the caller specifies
 if [ $# = 0 ]; then
-INVOCATION="check"
+INVOCATION="${TEST_COMMAND:-make $MAKEFLAGS check}"
 else
-INVOCATION="$@"
+INVOCATION="make $MAKEFLAGS $@"
 fi
 
-make $MAKEFLAGS $INVOCATION
+$INVOCATION
 }
 
 test_fail()
-- 
2.34.1

Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start

2022-11-08 Thread Michael S. Tsirkin

On Tue, Nov 08, 2022 at 09:23:04AM +, Alex Bennée wrote:
> The previous fix to virtio_device_started revealed a problem in its
> use by both the core and the device code. The core code should be able
> to handle the device "starting" while the VM isn't running to handle
> the restoration of migration state. To solve this dual use introduce a
> new helper for use by the vhost-user backends who all use it to feed a
> should_start variable.
> 
> We can also pick up a change vhost_user_blk_set_status while we are at
> it which follows the same pattern.
> 
> Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started)
> Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device)
> Signed-off-by: Alex Bennée 
> Cc: "Michael S. Tsirkin" 

why is this in this patchset?

> ---
>  include/hw/virtio/virtio.h   | 18 ++
>  hw/block/vhost-user-blk.c|  6 +-
>  hw/virtio/vhost-user-fs.c|  2 +-
>  hw/virtio/vhost-user-gpio.c  |  2 +-
>  hw/virtio/vhost-user-i2c.c   |  2 +-
>  hw/virtio/vhost-user-rng.c   |  2 +-
>  hw/virtio/vhost-user-vsock.c |  2 +-
>  hw/virtio/vhost-vsock.c  |  2 +-
>  8 files changed, 25 insertions(+), 11 deletions(-)
> 
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index f41b4a7e64..3191c618f3 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -389,6 +389,24 @@ static inline bool virtio_device_started(VirtIODevice 
> *vdev, uint8_t status)
>  return vdev->started;
>  }
>  
> +return status & VIRTIO_CONFIG_S_DRIVER_OK;
> +}
> +
> +/**
> + * virtio_device_should_start() - check if device startable
> + * @vdev - the VirtIO device
> + * @status - the devices status bits
> + *
> + * This is similar to virtio_device_started() but also encapsulates a
> + * check on the VM status which would prevent a device starting
> + * anyway.
> + */
> +static inline bool virtio_device_should_start(VirtIODevice *vdev, uint8_t 
> status)
> +{
> +if (vdev->use_started) {
> +return vdev->started;
> +}
> +
>  if (!vdev->vm_running) {
>  return false;
>  }
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index 13bf5cc47a..8feaf12e4e 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -222,14 +222,10 @@ static void vhost_user_blk_stop(VirtIODevice *vdev)
>  static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
>  {
>  VHostUserBlk *s = VHOST_USER_BLK(vdev);
> -bool should_start = virtio_device_started(vdev, status);
> +bool should_start = virtio_device_should_start(vdev, status);
>  Error *local_err = NULL;
>  int ret;
>  
> -if (!vdev->vm_running) {
> -should_start = false;
> -}
> -
>  if (!s->connected) {
>  return;
>  }
> diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> index ad0f91c607..1c40f42045 100644
> --- a/hw/virtio/vhost-user-fs.c
> +++ b/hw/virtio/vhost-user-fs.c
> @@ -123,7 +123,7 @@ static void vuf_stop(VirtIODevice *vdev)
>  static void vuf_set_status(VirtIODevice *vdev, uint8_t status)
>  {
>  VHostUserFS *fs = VHOST_USER_FS(vdev);
> -bool should_start = virtio_device_started(vdev, status);
> +bool should_start = virtio_device_should_start(vdev, status);
>  
>  if (vhost_dev_is_started(&fs->vhost_dev) == should_start) {
>  return;
> diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
> index 8b40fe450c..677d1c7730 100644
> --- a/hw/virtio/vhost-user-gpio.c
> +++ b/hw/virtio/vhost-user-gpio.c
> @@ -152,7 +152,7 @@ static void vu_gpio_stop(VirtIODevice *vdev)
>  static void vu_gpio_set_status(VirtIODevice *vdev, uint8_t status)
>  {
>  VHostUserGPIO *gpio = VHOST_USER_GPIO(vdev);
> -bool should_start = virtio_device_started(vdev, status);
> +bool should_start = virtio_device_should_start(vdev, status);
>  
>  trace_virtio_gpio_set_status(status);
>  
> diff --git a/hw/virtio/vhost-user-i2c.c b/hw/virtio/vhost-user-i2c.c
> index bc58b6c0d1..864eba695e 100644
> --- a/hw/virtio/vhost-user-i2c.c
> +++ b/hw/virtio/vhost-user-i2c.c
> @@ -93,7 +93,7 @@ static void vu_i2c_stop(VirtIODevice *vdev)
>  static void vu_i2c_set_status(VirtIODevice *vdev, uint8_t status)
>  {
>  VHostUserI2C *i2c = VHOST_USER_I2C(vdev);
> -bool should_start = virtio_device_started(vdev, status);
> +bool should_start = virtio_device_should_start(vdev, status);
>  
>  if (vhost_dev_is_started(&i2c->vhost_dev) == should_start) {
>  return;
> diff --git a/hw/virtio/vhost-user-rng.c b/hw/virtio/vhost-user-rng.c
> index bc1f36c5ac..8b47287875 100644
> --- a/hw/virtio/vhost-user-rng.c
> +++ b/hw/virtio/vhost-user-rng.c
> @@ -90,7 +90,7 @@ static void vu_rng_stop(VirtIODevice *vdev)
>  static void vu_rng_set_status(VirtIODevice *vdev, uint8_t status)
>  {
>  VHostUserRNG *rng = VHOST_USER_RNG(vdev);
> -bool should_start = virtio_device_started(vdev, stat

Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start

2022-11-08 Thread Michael S. Tsirkin

On Tue, Nov 08, 2022 at 09:23:04AM +, Alex Bennée wrote:
> The previous fix to virtio_device_started revealed a problem in its
> use by both the core and the device code. The core code should be able
> to handle the device "starting" while the VM isn't running to handle
> the restoration of migration state. To solve this dual use introduce a
> new helper for use by the vhost-user backends who all use it to feed a
> should_start variable.
> 
> We can also pick up a change vhost_user_blk_set_status while we are at
> it which follows the same pattern.
> 
> Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started)
> Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device)
> Signed-off-by: Alex Bennée 
> Cc: "Michael S. Tsirkin" 

is this the same as the RFC?

> ---
>  include/hw/virtio/virtio.h   | 18 ++
>  hw/block/vhost-user-blk.c|  6 +-
>  hw/virtio/vhost-user-fs.c|  2 +-
>  hw/virtio/vhost-user-gpio.c  |  2 +-
>  hw/virtio/vhost-user-i2c.c   |  2 +-
>  hw/virtio/vhost-user-rng.c   |  2 +-
>  hw/virtio/vhost-user-vsock.c |  2 +-
>  hw/virtio/vhost-vsock.c  |  2 +-
>  8 files changed, 25 insertions(+), 11 deletions(-)
> 
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index f41b4a7e64..3191c618f3 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -389,6 +389,24 @@ static inline bool virtio_device_started(VirtIODevice 
> *vdev, uint8_t status)
>  return vdev->started;
>  }
>  
> +return status & VIRTIO_CONFIG_S_DRIVER_OK;
> +}
> +
> +/**
> + * virtio_device_should_start() - check if device startable
> + * @vdev - the VirtIO device
> + * @status - the devices status bits
> + *
> + * This is similar to virtio_device_started() but also encapsulates a
> + * check on the VM status which would prevent a device starting
> + * anyway.
> + */
> +static inline bool virtio_device_should_start(VirtIODevice *vdev, uint8_t 
> status)
> +{
> +if (vdev->use_started) {
> +return vdev->started;
> +}
> +
>  if (!vdev->vm_running) {
>  return false;
>  }
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index 13bf5cc47a..8feaf12e4e 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -222,14 +222,10 @@ static void vhost_user_blk_stop(VirtIODevice *vdev)
>  static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
>  {
>  VHostUserBlk *s = VHOST_USER_BLK(vdev);
> -bool should_start = virtio_device_started(vdev, status);
> +bool should_start = virtio_device_should_start(vdev, status);
>  Error *local_err = NULL;
>  int ret;
>  
> -if (!vdev->vm_running) {
> -should_start = false;
> -}
> -
>  if (!s->connected) {
>  return;
>  }
> diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> index ad0f91c607..1c40f42045 100644
> --- a/hw/virtio/vhost-user-fs.c
> +++ b/hw/virtio/vhost-user-fs.c
> @@ -123,7 +123,7 @@ static void vuf_stop(VirtIODevice *vdev)
>  static void vuf_set_status(VirtIODevice *vdev, uint8_t status)
>  {
>  VHostUserFS *fs = VHOST_USER_FS(vdev);
> -bool should_start = virtio_device_started(vdev, status);
> +bool should_start = virtio_device_should_start(vdev, status);
>  
>  if (vhost_dev_is_started(&fs->vhost_dev) == should_start) {
>  return;
> diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
> index 8b40fe450c..677d1c7730 100644
> --- a/hw/virtio/vhost-user-gpio.c
> +++ b/hw/virtio/vhost-user-gpio.c
> @@ -152,7 +152,7 @@ static void vu_gpio_stop(VirtIODevice *vdev)
>  static void vu_gpio_set_status(VirtIODevice *vdev, uint8_t status)
>  {
>  VHostUserGPIO *gpio = VHOST_USER_GPIO(vdev);
> -bool should_start = virtio_device_started(vdev, status);
> +bool should_start = virtio_device_should_start(vdev, status);
>  
>  trace_virtio_gpio_set_status(status);
>  
> diff --git a/hw/virtio/vhost-user-i2c.c b/hw/virtio/vhost-user-i2c.c
> index bc58b6c0d1..864eba695e 100644
> --- a/hw/virtio/vhost-user-i2c.c
> +++ b/hw/virtio/vhost-user-i2c.c
> @@ -93,7 +93,7 @@ static void vu_i2c_stop(VirtIODevice *vdev)
>  static void vu_i2c_set_status(VirtIODevice *vdev, uint8_t status)
>  {
>  VHostUserI2C *i2c = VHOST_USER_I2C(vdev);
> -bool should_start = virtio_device_started(vdev, status);
> +bool should_start = virtio_device_should_start(vdev, status);
>  
>  if (vhost_dev_is_started(&i2c->vhost_dev) == should_start) {
>  return;
> diff --git a/hw/virtio/vhost-user-rng.c b/hw/virtio/vhost-user-rng.c
> index bc1f36c5ac..8b47287875 100644
> --- a/hw/virtio/vhost-user-rng.c
> +++ b/hw/virtio/vhost-user-rng.c
> @@ -90,7 +90,7 @@ static void vu_rng_stop(VirtIODevice *vdev)
>  static void vu_rng_set_status(VirtIODevice *vdev, uint8_t status)
>  {
>  VHostUserRNG *rng = VHOST_USER_RNG(vdev);
> -bool should_start = virtio_device_started(vdev, statu

Re: [PATCH 2/3] net: Restore printing of the help text with "-nic help"

2022-11-08 Thread Claudio Fontana

On 11/8/22 09:59, Thomas Huth wrote:
> On 08/11/2022 09.52, Claudio Fontana wrote:
>> On 11/8/22 09:42, Thomas Huth wrote:
>>> On 07/11/2022 13.27, Claudio Fontana wrote:
 should -net and -netdev be adapted too?
>>>
>>> "-netdev help" already works just fine ... and "-net" should IMHO rather be
>>> removed than improved ;-)
>>>
>>>Thomas
>>>
>>
>> I wonder if it could be done once for all, in net_init_clients,
>> instead of repeating the is_help_option in net_init_netdev and net_param_nic
>> (and that would take care of net_init_client too, so we'd get "net" for 
>> free)..
> 
> I don't think that it makes too much sense to have one option for all - 
> since all three CLI options are slightly different anyway. E.g. "-net nic" 
> only exists for "-net", "hubport" cannot be used with "-net", "-nic" can 
> also be used to configure the NIC model, etc.
> 
>   Thomas
> 

Hi Thomas, I would not suggest to merge everything together,

I was considering whether it would make sense to just check the "type" id for 
is_help_option
once, since all the options "net", "netdev", "nic" have a "type" 
implied_opt_name,

and so it should be possible to make a list of structs that signify what to do 
for "net", "netdev", "nic", and 
loop on that and check for that help string once,

and then split off the codepath into the "net", "netdev", "nic" - specific code 
as it is now,
either manually or by storing the function that is now in the foreach as a 
function pointer in the struct, ie moving the is_help_option check one level up.

However, it might not be worth it since it seems that for "nic" the nic models 
need to also be printed, so it might make things needlessly verbose.

Not sure, have not tried to write the code for it.

Ciao,

Claudio

Re: [PATCH v9 5/8] KVM: Register/unregister the guest private memory regions

2022-11-08 Thread Chao Peng

On Tue, Nov 08, 2022 at 09:35:06AM +0800, Yuan Yao wrote:
> On Tue, Oct 25, 2022 at 11:13:41PM +0800, Chao Peng wrote:
> > Introduce generic private memory register/unregister by reusing existing
> > SEV ioctls KVM_MEMORY_ENCRYPT_{UN,}REG_REGION. It differs from SEV case
> > by treating address in the region as gpa instead of hva. Which cases
> > should these ioctls go is determined by the kvm_arch_has_private_mem().
> > Architecture which supports KVM_PRIVATE_MEM should override this function.
> >
> > KVM internally defaults all guest memory as private memory and maintain
> > the shared memory in 'mem_attr_array'. The above ioctls operate on this
> > field and unmap existing mappings if any.
> >
> > Signed-off-by: Chao Peng 
> > ---
> >  Documentation/virt/kvm/api.rst |  17 ++-
> >  arch/x86/kvm/Kconfig   |   1 +
> >  include/linux/kvm_host.h   |  10 +-
> >  virt/kvm/Kconfig   |   4 +
> >  virt/kvm/kvm_main.c| 227 +
> >  5 files changed, 198 insertions(+), 61 deletions(-)
> >
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 975688912b8c..08253cf498d1 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -4717,10 +4717,19 @@ 
> > Documentation/virt/kvm/x86/amd-memory-encryption.rst.
> >  This ioctl can be used to register a guest memory region which may
> >  contain encrypted data (e.g. guest RAM, SMRAM etc).
> >
> > -It is used in the SEV-enabled guest. When encryption is enabled, a guest
> > -memory region may contain encrypted data. The SEV memory encryption
> > -engine uses a tweak such that two identical plaintext pages, each at
> > -different locations will have differing ciphertexts. So swapping or
> > +Currently this ioctl supports registering memory regions for two usages:
> > +private memory and SEV-encrypted memory.
> > +
> > +When private memory is enabled, this ioctl is used to register guest 
> > private
> > +memory region and the addr/size of kvm_enc_region represents guest physical
> > +address (GPA). In this usage, this ioctl zaps the existing guest memory
> > +mappings in KVM that fallen into the region.
> > +
> > +When SEV-encrypted memory is enabled, this ioctl is used to register guest
> > +memory region which may contain encrypted data for a SEV-enabled guest. The
> > +addr/size of kvm_enc_region represents userspace address (HVA). The SEV
> > +memory encryption engine uses a tweak such that two identical plaintext 
> > pages,
> > +each at different locations will have differing ciphertexts. So swapping or
> >  moving ciphertext of those pages will not result in plaintext being
> >  swapped. So relocating (or migrating) physical backing pages for the SEV
> >  guest will require some additional steps.
> > diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> > index 8d2bd455c0cd..73fdfa429b20 100644
> > --- a/arch/x86/kvm/Kconfig
> > +++ b/arch/x86/kvm/Kconfig
> > @@ -51,6 +51,7 @@ config KVM
> > select HAVE_KVM_PM_NOTIFIER if PM
> > select HAVE_KVM_RESTRICTED_MEM if X86_64
> > select RESTRICTEDMEM if HAVE_KVM_RESTRICTED_MEM
> > +   select KVM_GENERIC_PRIVATE_MEM if HAVE_KVM_RESTRICTED_MEM
> > help
> >   Support hosting fully virtualized guest machines using hardware
> >   virtualization extensions.  You will need a fairly recent
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 79e5cbc35fcf..4ce98fa0153c 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -245,7 +245,8 @@ bool kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t 
> > cr2_or_gpa,
> >  int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu);
> >  #endif
> >
> > -#ifdef KVM_ARCH_WANT_MMU_NOTIFIER
> > +
> > +#if defined(KVM_ARCH_WANT_MMU_NOTIFIER) || 
> > defined(CONFIG_KVM_GENERIC_PRIVATE_MEM)
> >  struct kvm_gfn_range {
> > struct kvm_memory_slot *slot;
> > gfn_t start;
> > @@ -254,6 +255,9 @@ struct kvm_gfn_range {
> > bool may_block;
> >  };
> >  bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range);
> > +#endif
> > +
> > +#ifdef KVM_ARCH_WANT_MMU_NOTIFIER
> >  bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
> >  bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
> >  bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
> > @@ -794,6 +798,9 @@ struct kvm {
> > struct notifier_block pm_notifier;
> >  #endif
> > char stats_id[KVM_STATS_NAME_SIZE];
> > +#ifdef CONFIG_KVM_GENERIC_PRIVATE_MEM
> > +   struct xarray mem_attr_array;
> > +#endif
> >  };
> >
> >  #define kvm_err(fmt, ...) \
> > @@ -1453,6 +1460,7 @@ bool kvm_arch_dy_has_pending_interrupt(struct 
> > kvm_vcpu *vcpu);
> >  int kvm_arch_post_init_vm(struct kvm *kvm);
> >  void kvm_arch_pre_destroy_vm(struct kvm *kvm);
> >  int kvm_arch_create_vm_debugfs(struct kvm *kvm);
> > +bool kvm_arch_has_private_mem(struct kvm *kvm);
> >
> >  #ifndef __KVM_HAVE_

Re: [PATCH 2/3] net: Restore printing of the help text with "-nic help"

2022-11-08 Thread Claudio Fontana

On 11/4/22 13:57, Thomas Huth wrote:
> Running QEMU with "-nic help" used to work in QEMU 5.2 and earlier versions
> (it showed the available netdev backends), but this feature got broken during
> some refactoring in version 6.0. Let's restore the old behavior, and while
> we're at it, let's also print the available NIC models here now since this
> option can be used to configure both, netdev backend and model in one go.
> 
> Fixes: ad6f932fe8 ("net: do not exit on "netdev_add help" monitor command")
> Signed-off-by: Thomas Huth 
> ---
>  net/net.c | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/net/net.c b/net/net.c
> index c0516a8067..b4b8f2a9cc 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -1571,8 +1571,18 @@ static int net_param_nic(void *dummy, QemuOpts *opts, 
> Error **errp)
>  const char *type;
>  
>  type = qemu_opt_get(opts, "type");
> -if (type && g_str_equal(type, "none")) {
> -return 0;/* Nothing to do, default_net is cleared in vl.c */
> +if (type) {
> +if (g_str_equal(type, "none")) {
> +return 0;/* Nothing to do, default_net is cleared in vl.c */
> +}
> +if (is_help_option(type)) {
> +GPtrArray *nic_models = qemu_get_nic_models(TYPE_DEVICE);
> +show_netdevs();
> +printf("\n");
> +qemu_show_nic_models(type, (const char **)nic_models->pdata);
> +g_ptr_array_free(nic_models, true);

nit: would not the order:

> +GPtrArray *nic_models;
> +show_netdevs();
> +printf("\n");
> +nic_models = qemu_get_nic_models(TYPE_DEVICE);
> +qemu_show_nic_models(type, (const char **)nic_models->pdata);
> +g_ptr_array_free(nic_models, true);

flow more logically?

Ciao

C

> +exit(0);
> +}
>  }
>  
>  idx = nic_get_free_idx();

Re: [PATCH 2/3] hvf: implement guest debugging on Apple Silicon hosts

2022-11-08 Thread Francesco Cagnin

Mads, thanks for the review and feedbacks. I worked on this a few months
ago but only managed to have it published now, I'm sorry for work being
done twice.

I'll add support for SSTEP_NOIRQ as you suggested.

For multi-core systems there's no particular issue, just I'm not
familiar with all of these topics and I'm asking for good practices. KVM
saves breakpoints information in struct 'KVMState', and following the
same approach I've updated struct 'HVFState' to store equivalent data.
To keep separate information for each cpu, KVM uses field 'kvm_state' in
struct 'CPUState' (see 'include/hw/core/cpu.h'), but at the moment
there's no analogous field for the HVF state (which is instead defined
as a single global variable in 'accel/hvf/hvf-accel-ops.c'). Should we
add a new field to 'CPUState' for the HVF state, or take a different
approach?

For your last question regarding `hv_vcpu_set_trap_debug_exceptions()`,
I think it's possible to move it to `hvf_arch_update_guest_debug()` but
I'm not confident about `hvf_arch_init_vcpu()`, I will experiment again.

Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start

2022-11-08 Thread Alex Bennée



"Michael S. Tsirkin"  writes:

> On Tue, Nov 08, 2022 at 09:23:04AM +, Alex Bennée wrote:
>> The previous fix to virtio_device_started revealed a problem in its
>> use by both the core and the device code. The core code should be able
>> to handle the device "starting" while the VM isn't running to handle
>> the restoration of migration state. To solve this dual use introduce a
>> new helper for use by the vhost-user backends who all use it to feed a
>> should_start variable.
>> 
>> We can also pick up a change vhost_user_blk_set_status while we are at
>> it which follows the same pattern.
>> 
>> Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started)
>> Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device)
>> Signed-off-by: Alex Bennée 
>> Cc: "Michael S. Tsirkin" 
>
> why is this in this patchset?

As per my cover letter:

  Most of these patches have been posted before as single patch RFCs. A
  couple are already scheduled through other trees so will drop out in
  due course

but I keep them in my tree until they are merged so I can continue to
soak test them (and have a stable base for my other WIP trees).

-- 
Alex Bennée

Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start

2022-11-08 Thread Michael S. Tsirkin

On Tue, Nov 08, 2022 at 10:23:15AM +, Alex Bennée wrote:
> 
> "Michael S. Tsirkin"  writes:
> 
> > On Tue, Nov 08, 2022 at 09:23:04AM +, Alex Bennée wrote:
> >> The previous fix to virtio_device_started revealed a problem in its
> >> use by both the core and the device code. The core code should be able
> >> to handle the device "starting" while the VM isn't running to handle
> >> the restoration of migration state. To solve this dual use introduce a
> >> new helper for use by the vhost-user backends who all use it to feed a
> >> should_start variable.
> >> 
> >> We can also pick up a change vhost_user_blk_set_status while we are at
> >> it which follows the same pattern.
> >> 
> >> Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to 
> >> virtio_device_started)
> >> Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device)
> >> Signed-off-by: Alex Bennée 
> >> Cc: "Michael S. Tsirkin" 
> >
> > why is this in this patchset?
> 
> As per my cover letter:
> 
>   Most of these patches have been posted before as single patch RFCs. A
>   couple are already scheduled through other trees so will drop out in
>   due course
> 
> but I keep them in my tree until they are merged so I can continue to
> soak test them (and have a stable base for my other WIP trees).

That's fine just pls don't double-post them on list, certainly
not as part of a patchset.

> -- 
> Alex Bennée

Re: [PATCH v10 1/9] s390x/cpu topology: core_id sets s390x CPU topology

2022-11-08 Thread Pierre Morel





On 11/7/22 19:04, Janis Schoetterl-Glausch wrote:

On Fri, 2022-10-28 at 11:30 +0200, Pierre Morel wrote:


On 10/27/22 22:20, Janis Schoetterl-Glausch wrote:

On Wed, 2022-10-26 at 10:34 +0200, Pierre Morel wrote:


On 10/25/22 21:58, Janis Schoetterl-Glausch wrote:

On Wed, 2022-10-12 at 18:20 +0200, Pierre Morel wrote:

In the S390x CPU topology the core_id specifies the CPU address
and the position of the core withing the topology.

Let's build the topology based on the core_id.
s390x/cpu topology: core_id sets s390x CPU topology

In the S390x CPU topology the core_id specifies the CPU address
and the position of the cpu withing the topology.

Let's build the topology based on the core_id.

Signed-off-by: Pierre Morel 
---
include/hw/s390x/cpu-topology.h |  45 +++
hw/s390x/cpu-topology.c | 132 
hw/s390x/s390-virtio-ccw.c  |  21 +
hw/s390x/meson.build|   1 +
4 files changed, 199 insertions(+)
create mode 100644 include/hw/s390x/cpu-topology.h
create mode 100644 hw/s390x/cpu-topology.c


[...]


+/**
+ * s390_topology_realize:
+ * @dev: the device state
+ * @errp: the error pointer (not used)
+ *
+ * During realize the machine CPU topology is initialized with the
+ * QEMU -smp parameters.
+ * The maximum count of CPU TLE in the all Topology can not be greater
+ * than the maximum CPUs.
+ */
+static void s390_topology_realize(DeviceState *dev, Error **errp)
+{
+MachineState *ms = MACHINE(qdev_get_machine());
+S390Topology *topo = S390_CPU_TOPOLOGY(dev);
+
+topo->cpus = ms->smp.cores * ms->smp.threads;


Currently threads are not supported, effectively increasing the number of cpus,
so this is currently correct. Once the machine version limits the threads to 1,
it is also correct. However, once we support multiple threads, this becomes 
incorrect.
I wonder if it's ok from a backward compatibility point of view to modify the 
smp values
by doing cores *= threads, threads = 1 for old machines.


Right, this will become incorrect with thread support.
What about having a dedicated function:

topo->cpus = s390_get_cpus(ms);

This function will use the S390CcwMachineClass->max_thread introduced
later to report the correct number of CPUs.


I don't think max_threads is exactly what matters here, it's if
threads are supported or not or, if max_threads == 1 it doesn't matter.
The question is how best to do the check. You could check the machine version.
I wonder if you could add a feature bit for the multithreading facility that is
always false and use that.

I don't know if using a function makes a difference, that is if it is obvious on
introduction of multithreading support that the function needs to be updated.
(If it is implemented in a way that requires updating, if you check the machine
version it doesn't)
In any case, the name you suggested isn't very descriptive.


I think we care about this machine and olders.
Olders do not support topology so this, Multithreading (MT) does not mater.
This machine support topology, if I follow Cedric advise, the
"max_thread" will/may be introduce before the topology.

This in fact is not an implementation for MT or does not allow the
implementation of MT it is only a way to get rid of the false
information given to the user that we accept MT.

So I think that when we introduce MT we will take care of making things
right at this place as in other places of the code.

What about we keep the original:

  topo->cpus = ms->smp.cores * ms->smp.threads;


If topology is only supported for new machines and not the old machines
for which you set max_threads to a compatibility value (max cpus), then
you should just ignore the threads, cpus == cores.
(There might not be any point in keeping a topo->cpus member in this case, I 
haven't checked)


Right but, I need the nr_cpus in the topology so I prefer to keep it.

However, smp.threads has nothing to do there anymore as you pointed.
I think that nr_cpus should may be named nr_cores and should be set to 
smp.cores.


Thanks,
Regards,

Pierre


--
Pierre Morel
IBM Lab Boeblingen

Re: [PULL v3 3/7] hw/loongarch: Load FDT table into dram memory space

2022-11-08 Thread Richard Henderson


On 11/5/22 14:28, Song Gao wrote:

From: Xiaojuan Yang 

Load FDT table into dram memory space, and the addr is 2 MiB.
Since lowmem region starts from 0, FDT base address is located
at 2 MiB to avoid NULL pointer access.

Signed-off-by: Xiaojuan Yang 
Acked-by: Song Gao 
Message-Id: <20221028014007.2718352-2-yangxiaoj...@loongson.cn>
Signed-off-by: Song Gao 
---
  hw/loongarch/virt.c | 18 +++---
  include/hw/loongarch/virt.h |  3 ---
  2 files changed, 11 insertions(+), 10 deletions(-)


This breaks make check-tcg:

  TESThello on loongarch64
qemu-system-loongarch64: Some ROM regions are overlapping
These ROM regions might have been loaded by direct user request or by default.
They could be BIOS/firmware images, a guest kernel, initrd or some other file loaded into 
guest memory.
Check whether you intended to load all this guest code, and whether it has been built to 
load to the correct addresses.


The following two regions overlap (in the memory address space):
  hello ELF program header segment 0 (addresses 0x0020 - 
0x00242000)
  fdt (addresses 0x0020 - 0x0030)
make[1]: *** [Makefile:177: run-hello] Error 1


r~




diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 4b595a9ea4..50e9829a94 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -159,7 +159,6 @@ static void fdt_add_pcie_node(const LoongArchMachineState 
*lams)
   1, FDT_PCI_RANGE_MMIO, 2, base_mmio,
   2, base_mmio, 2, size_mmio);
  g_free(nodename);
-qemu_fdt_dumpdtb(ms->fdt, lams->fdt_size);
  }
  
  static void fdt_add_irqchip_node(LoongArchMachineState *lams)

@@ -656,6 +655,7 @@ static void loongarch_init(MachineState *machine)
  MemoryRegion *address_space_mem = get_system_memory();
  LoongArchMachineState *lams = LOONGARCH_MACHINE(machine);
  int i;
+hwaddr fdt_base;
  
  if (!cpu_model) {

  cpu_model = LOONGARCH_CPU_TYPE_NAME("la464");
@@ -760,12 +760,16 @@ static void loongarch_init(MachineState *machine)
  lams->machine_done.notify = virt_machine_done;
  qemu_add_machine_init_done_notifier(&lams->machine_done);
  fdt_add_pcie_node(lams);
-
-/* load fdt */
-MemoryRegion *fdt_rom = g_new(MemoryRegion, 1);
-memory_region_init_rom(fdt_rom, NULL, "fdt", VIRT_FDT_SIZE, &error_fatal);
-memory_region_add_subregion(get_system_memory(), VIRT_FDT_BASE, fdt_rom);
-rom_add_blob_fixed("fdt", machine->fdt, lams->fdt_size, VIRT_FDT_BASE);
+/*
+ * Since lowmem region starts from 0, FDT base address is located
+ * at 2 MiB to avoid NULL pointer access.
+ *
+ * Put the FDT into the memory map as a ROM image: this will ensure
+ * the FDT is copied again upon reset, even if addr points into RAM.
+ */
+fdt_base = 2 * MiB;
+qemu_fdt_dumpdtb(machine->fdt, lams->fdt_size);
+rom_add_blob_fixed("fdt", machine->fdt, lams->fdt_size, fdt_base);
  }
  
  bool loongarch_is_acpi_enabled(LoongArchMachineState *lams)

diff --git a/include/hw/loongarch/virt.h b/include/hw/loongarch/virt.h
index 09f1c88ee5..45c383f5a7 100644
--- a/include/hw/loongarch/virt.h
+++ b/include/hw/loongarch/virt.h
@@ -28,9 +28,6 @@
  #define VIRT_GED_MEM_ADDR   (VIRT_GED_EVT_ADDR + ACPI_GED_EVT_SEL_LEN)
  #define VIRT_GED_REG_ADDR   (VIRT_GED_MEM_ADDR + MEMORY_HOTPLUG_IO_LEN)
  
-#define VIRT_FDT_BASE   0x1c40

-#define VIRT_FDT_SIZE   0x10
-
  struct LoongArchMachineState {
  /*< private >*/
  MachineState parent_obj;

Re: [PATCH v1 4/9] tests/docker: allow user to override check target

2022-11-08 Thread Philippe Mathieu-Daudé


On 8/11/22 10:23, Alex Bennée wrote:

This is useful when trying to bisect a particular failing test behind
a docker run. For example:

   make docker-test-clang@fedora \
 TARGET_LIST=arm-softmmu \
 TEST_COMMAND="meson test qtest-arm/qos-test" \
 J=9 V=1

Signed-off-by: Alex Bennée 

---
v1
  - fix s/target /target./
  - CHECK_TARGET -> TEST_COMMAND
---
  tests/docker/Makefile.include | 2 ++
  tests/docker/common.rc| 6 +++---
  2 files changed, 5 insertions(+), 3 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH v2] migration: check magic value for deciding the mapping of channels

2022-11-08 Thread manish.mishra




On 07/11/22 10:21 pm, manish.mishra wrote:

Current logic assumes that channel connections on the destination side are
always established in the same order as the source and the first one will
always be the main channel followed by the multifid or post-copy
preemption channel. This may not be always true, as even if a channel has a
connection established on the source side it can be in the pending state on
the destination side and a newer connection can be established first.
Basically causing out of order mapping of channels on the destination side.
Currently, all channels except post-copy preempt send a magic number, this
patch uses that magic number to decide the type of channel. This logic is
applicable only for precopy(multifd) live migration, as mentioned, the
post-copy preempt channel does not send any magic number. Also, tls live
migrations already does tls handshake before creating other channels, so
this issue is not possible with tls, hence this logic is avoided for tls
live migrations. This patch uses MSG_PEEK to check the magic number of
channels so that current data/control stream management remains
un-effected.

Suggested-by: Daniel P. Berrangé 
Signed-off-by: manish.mishra 

v2:
   TLS does not support MSG_PEEK, so V1 was broken for tls live
   migrations. For tls live migration, while initializing main channel
   tls handshake is done before we can create other channels, so this
   issue is not possible for tls live migrations. In V2 added a check
   to avoid checking magic number for tls live migration and fallback
   to older method to decide mapping of channels on destination side.


Hi Daniel, This is what i concluded from discussion on V1, if this is not what 
you expected, please let me know, we can continue discussion on V1 thread.

Thanks

Manish Mishra



---
  include/io/channel.h | 25 +++
  io/channel-socket.c  | 27 
  io/channel.c | 39 +++
  migration/migration.c| 44 +---
  migration/multifd.c  | 12 ---
  migration/multifd.h  |  2 +-
  migration/postcopy-ram.c |  5 +
  migration/postcopy-ram.h |  2 +-
  8 files changed, 130 insertions(+), 26 deletions(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index c680ee7480..74177aeeea 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -115,6 +115,10 @@ struct QIOChannelClass {
  int **fds,
  size_t *nfds,
  Error **errp);
+ssize_t (*io_read_peek)(QIOChannel *ioc,
+void *buf,
+size_t nbytes,
+Error **errp);
  int (*io_close)(QIOChannel *ioc,
  Error **errp);
  GSource * (*io_create_watch)(QIOChannel *ioc,
@@ -475,6 +479,27 @@ int qio_channel_write_all(QIOChannel *ioc,
size_t buflen,
Error **errp);
  
+/**

+ * qio_channel_read_peek_all:
+ * @ioc: the channel object
+ * @buf: the memory region to read in data
+ * @nbytes: the number of bytes to read
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Read given @nbytes data from peek of channel into
+ * memory region @buf.
+ *
+ * The function will be blocked until read size is
+ * equal to requested size.
+ *
+ * Returns: 1 if all bytes were read, 0 if end-of-file
+ *  occurs without data, or -1 on error
+ */
+int qio_channel_read_peek_all(QIOChannel *ioc,
+  void* buf,
+  size_t nbytes,
+  Error **errp);
+
  /**
   * qio_channel_set_blocking:
   * @ioc: the channel object
diff --git a/io/channel-socket.c b/io/channel-socket.c
index b76dca9cc1..b99f5dfda6 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -705,6 +705,32 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
  }
  #endif /* WIN32 */
  
+static ssize_t qio_channel_socket_read_peek(QIOChannel *ioc,

+void *buf,
+size_t nbytes,
+Error **errp)
+{
+QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
+ssize_t bytes = 0;
+
+retry:
+bytes = recv(sioc->fd, buf, nbytes, MSG_PEEK);
+
+if (bytes < 0) {
+if (errno == EINTR) {
+goto retry;
+}
+if (errno == EAGAIN) {
+return QIO_CHANNEL_ERR_BLOCK;
+}
+
+error_setg_errno(errp, errno,
+ "Unable to read from peek of socket");
+return -1;
+}
+
+return bytes;
+}
  
  #ifdef QEMU_MSG_ZEROCOPY

  static int qio_channel_socket_flush(QIOChannel *ioc,
@@ -902,6 +928,7 @@ static void qio_channel_socket_class_init(ObjectClass 
*klass,
  
  ioc_klass->io_writev = qio_channel_socket_writev;

  ioc_klass->io_re

Re: [BUG] hw/i386/pc.c: CXL Fixed Memory Window should not reserve e820 in bios

2022-11-08 Thread Gerd Hoffmann

> >> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> >> > index 566accf7e6..5bf5465a21 100644
> >> > --- a/hw/i386/pc.c
> >> > +++ b/hw/i386/pc.c
> >> > @@ -1061,7 +1061,6 @@ void pc_memory_init(PCMachineState *pcms,
> >> >  hwaddr cxl_size = MiB;
> >> >
> >> >  cxl_base = pc_get_cxl_range_start(pcms);
> >> > -e820_add_entry(cxl_base, cxl_size, E820_RESERVED);

Just dropping it doesn't look like a good plan to me.

You can try set etc/reserved-memory-end fw_cfg file instead.  Firmware
(both seabios and ovmf) read it and will make sure the 64bit pci mmio
window is placed above that address, i.e. this effectively reserves
address space.  Right now used by memory hotplug code, but should work
for cxl too I think (disclaimer: don't know much about cxl ...).

take care & HTH,
  Gerd

Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start

2022-11-08 Thread Alex Bennée



"Michael S. Tsirkin"  writes:

> On Tue, Nov 08, 2022 at 10:23:15AM +, Alex Bennée wrote:
>> 
>> "Michael S. Tsirkin"  writes:
>> 
>> > On Tue, Nov 08, 2022 at 09:23:04AM +, Alex Bennée wrote:
>> >> The previous fix to virtio_device_started revealed a problem in its
>> >> use by both the core and the device code. The core code should be able
>> >> to handle the device "starting" while the VM isn't running to handle
>> >> the restoration of migration state. To solve this dual use introduce a
>> >> new helper for use by the vhost-user backends who all use it to feed a
>> >> should_start variable.
>> >> 
>> >> We can also pick up a change vhost_user_blk_set_status while we are at
>> >> it which follows the same pattern.
>> >> 
>> >> Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to 
>> >> virtio_device_started)
>> >> Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device)
>> >> Signed-off-by: Alex Bennée 
>> >> Cc: "Michael S. Tsirkin" 
>> >
>> > why is this in this patchset?
>> 
>> As per my cover letter:
>> 
>>   Most of these patches have been posted before as single patch RFCs. A
>>   couple are already scheduled through other trees so will drop out in
>>   due course
>> 
>> but I keep them in my tree until they are merged so I can continue to
>> soak test them (and have a stable base for my other WIP trees).
>
> That's fine just pls don't double-post them on list, certainly
> not as part of a patchset.

Why not? Is this breaking some tooling?

-- 
Alex Bennée

Re: [PATCH v1 2/9] tests/avocado: improve behaviour waiting for login prompts

2022-11-08 Thread Philippe Mathieu-Daudé


On 8/11/22 10:23, Alex Bennée wrote:

This attempts to deal with the problem of login prompts not being
guaranteed to be terminated with a newline. The solution to this is to
peek at the incoming data looking to see if we see an up-coming match
before we fall back to the old readline() logic. The reason to mostly
rely on readline is because I am occasionally seeing the peek stalling
despite data being there.

This seems kinda hacky and gross so I'm open to alternative approaches
and cleaner python code.


I'd have done it the same way...


Signed-off-by: Alex Bennée 
---
  tests/avocado/avocado_qemu/__init__.py | 89 +-
  1 file changed, 88 insertions(+), 1 deletion(-)

diff --git a/tests/avocado/avocado_qemu/__init__.py 
b/tests/avocado/avocado_qemu/__init__.py
index 910f3ba1ea..d6ff68e171 100644
--- a/tests/avocado/avocado_qemu/__init__.py
+++ b/tests/avocado/avocado_qemu/__init__.py
@@ -131,6 +131,58 @@ def pick_default_qemu_bin(bin_prefix='qemu-system-', 
arch=None):
  return path
  return None
  
+def _peek_ahead(console, min_match, success_message):

+"""
+peek ahead in the console stream keeping an eye out for the
+success message.
+
+Returns some message to process or None, indicating the normal
+readline should occur.
+"""
+console_logger = logging.getLogger('console')
+peek_len = 0
+retries = 0
+
+while True:
+try:
+old_peek_len = peek_len
+peek_ahead = console.peek(min_match).decode()
+peek_len = len(peek_ahead)
+
+# if we get stuck too long lets just fallback to readline
+if peek_len <= old_peek_len:
+retries += 1
+if retries > 10:
+return None
+
+# if we see a newline in the peek we can let safely bail
+# and let the normal readline() deal with it
+if peek_ahead.endswith(('\n', '\r', '\r\n')):


'\r\n' superfluous.


+return None
+
+# if we haven't seen enough for the whole message but the
+# first part matches lets just loop again
+if len(peek_ahead) < min_match and \
+   success_message[:peek_len] in peek_ahead:
+continue
+
+# if we see the whole success_message we are done, consume
+# it and pass back so we can exit to the user
+if success_message in peek_ahead:
+return console.read(peek_len).decode()
+
+# of course if we've seen enough then this line probably
+# doesn't contain what we are looking for, fallback
+if peek_len > min_match:
+return None
+
+except UnicodeDecodeError:
+console_logger.log("error in decode of peek")
+return None
+
+# we should never get here
+return None


Reviewed-by: Philippe Mathieu-Daudé

[PULL 1/5] Revert "s390x/s390-virtio-ccw: add zpcii-disable machine property"

2022-11-08 Thread Thomas Huth

From: Cédric Le Goater 

This reverts commit 59d1ce44396e3ad2330dc3261ff3da7ad3a16184.

The "zpcii-disable" machine property is redundant with the "interpret"
zPCI device property. Remove it for clarification.

Signed-off-by: Cédric Le Goater 
Message-Id: <20221107161349.1032730-2-...@kaod.org>
Reviewed-by: Matthew Rosato 
Signed-off-by: Thomas Huth 
---
 include/hw/s390x/s390-virtio-ccw.h |  1 -
 hw/s390x/s390-pci-kvm.c|  4 +---
 hw/s390x/s390-virtio-ccw.c | 24 
 util/qemu-config.c |  4 
 qemu-options.hx|  8 +---
 5 files changed, 2 insertions(+), 39 deletions(-)

diff --git a/include/hw/s390x/s390-virtio-ccw.h 
b/include/hw/s390x/s390-virtio-ccw.h
index 4f8a39abda..9bba21a916 100644
--- a/include/hw/s390x/s390-virtio-ccw.h
+++ b/include/hw/s390x/s390-virtio-ccw.h
@@ -27,7 +27,6 @@ struct S390CcwMachineState {
 bool aes_key_wrap;
 bool dea_key_wrap;
 bool pv;
-bool zpcii_disable;
 uint8_t loadparm[8];
 };
 
diff --git a/hw/s390x/s390-pci-kvm.c b/hw/s390x/s390-pci-kvm.c
index 5eb7fd12e2..9134fe185f 100644
--- a/hw/s390x/s390-pci-kvm.c
+++ b/hw/s390x/s390-pci-kvm.c
@@ -22,9 +22,7 @@
 
 bool s390_pci_kvm_interp_allowed(void)
 {
-return (kvm_s390_get_zpci_op() && !s390_is_pv() &&
-!object_property_get_bool(OBJECT(qdev_get_machine()),
-  "zpcii-disable", NULL));
+return kvm_s390_get_zpci_op() && !s390_is_pv();
 }
 
 int s390_pci_kvm_aif_enable(S390PCIBusDevice *pbdev, ZpciFib *fib, bool assist)
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 560ddbb6fb..bb98d40792 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -627,21 +627,6 @@ static inline void machine_set_dea_key_wrap(Object *obj, 
bool value,
 ms->dea_key_wrap = value;
 }
 
-static inline bool machine_get_zpcii_disable(Object *obj, Error **errp)
-{
-S390CcwMachineState *ms = S390_CCW_MACHINE(obj);
-
-return ms->zpcii_disable;
-}
-
-static inline void machine_set_zpcii_disable(Object *obj, bool value,
- Error **errp)
-{
-S390CcwMachineState *ms = S390_CCW_MACHINE(obj);
-
-ms->zpcii_disable = value;
-}
-
 static S390CcwMachineClass *current_mc;
 
 /*
@@ -778,12 +763,6 @@ static void ccw_machine_class_init(ObjectClass *oc, void 
*data)
 "Up to 8 chars in set of [A-Za-z0-9. ] (lower case chars converted"
 " to upper case) to pass to machine loader, boot manager,"
 " and guest kernel");
-
-object_class_property_add_bool(oc, "zpcii-disable",
-   machine_get_zpcii_disable,
-   machine_set_zpcii_disable);
-object_class_property_set_description(oc, "zpcii-disable",
-"disable zPCI interpretation facilties");
 }
 
 static inline void s390_machine_initfn(Object *obj)
@@ -792,7 +771,6 @@ static inline void s390_machine_initfn(Object *obj)
 
 ms->aes_key_wrap = true;
 ms->dea_key_wrap = true;
-ms->zpcii_disable = false;
 }
 
 static const TypeInfo ccw_machine_info = {
@@ -857,12 +835,10 @@ DEFINE_CCW_MACHINE(7_2, "7.2", true);
 static void ccw_machine_7_1_instance_options(MachineState *machine)
 {
 static const S390FeatInit qemu_cpu_feat = { S390_FEAT_LIST_QEMU_V7_1 };
-S390CcwMachineState *ms = S390_CCW_MACHINE(machine);
 
 ccw_machine_7_2_instance_options(machine);
 s390_cpudef_featoff_greater(16, 1, S390_FEAT_PAIE);
 s390_set_qemu_cpu_model(0x8561, 15, 1, qemu_cpu_feat);
-ms->zpcii_disable = true;
 }
 
 static void ccw_machine_7_1_class_options(MachineClass *mc)
diff --git a/util/qemu-config.c b/util/qemu-config.c
index 5325f6bf80..433488aa56 100644
--- a/util/qemu-config.c
+++ b/util/qemu-config.c
@@ -236,10 +236,6 @@ static QemuOptsList machine_opts = {
 .help = "Up to 8 chars in set of [A-Za-z0-9. ](lower case chars"
 " converted to upper case) to pass to machine"
 " loader, boot manager, and guest kernel",
-},{
-.name = "zpcii-disable",
-.type = QEMU_OPT_BOOL,
-.help = "disable zPCI interpretation facilities",
 },
 { /* End of list */ }
 }
diff --git a/qemu-options.hx b/qemu-options.hx
index dbdf9c301b..8b8a4a5d01 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -37,8 +37,7 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
 "memory-encryption=@var{} memory encryption object to use 
(default=none)\n"
 "hmat=on|off controls ACPI HMAT support (default=off)\n"
 "memory-backend='backend-id' specifies explicitly provided 
backend for main RAM (default=none)\n"
-"
cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]\n"
-"zpcii-disable=on|off di

[PULL 2/5] s390x/s390-virtio-ccw: Switch off zPCI enhancements on older machines

2022-11-08 Thread Thomas Huth

From: Cédric Le Goater 

zPCI enhancement features (interpretation and forward assist) were
recently introduced to improve performance on PCI passthrough devices.
To maintain the same behaviour on older Z machines, deactivate the
features with the associated properties.

Signed-off-by: Cédric Le Goater 
Message-Id: <20221107161349.1032730-3-...@kaod.org>
Reviewed-by: Matthew Rosato 
Signed-off-by: Thomas Huth 
---
 hw/s390x/s390-virtio-ccw.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index bb98d40792..7d80bc1837 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -844,9 +844,14 @@ static void ccw_machine_7_1_instance_options(MachineState 
*machine)
 static void ccw_machine_7_1_class_options(MachineClass *mc)
 {
 S390CcwMachineClass *s390mc = S390_CCW_MACHINE_CLASS(mc);
+static GlobalProperty compat[] = {
+{ TYPE_S390_PCI_DEVICE, "interpret", "off", },
+{ TYPE_S390_PCI_DEVICE, "forwarding-assist", "off", },
+};
 
 ccw_machine_7_2_class_options(mc);
 compat_props_add(mc->compat_props, hw_compat_7_1, hw_compat_7_1_len);
+compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
 s390mc->max_threads = S390_MAX_CPUS;
 }
 DEFINE_CCW_MACHINE(7_1, "7.1", false);
-- 
2.31.1

[PULL 0/5] s390x fix and white space cleanup

2022-11-08 Thread Thomas Huth

The following changes since commit 524fc737431d240f9d9f10aaf381003092868bac:

  util/log: Ignore per-thread flag if global file already there (2022-11-07 
16:00:02 -0500)

are available in the Git repository at:

  https://gitlab.com/thuth/qemu.git tags/pull-request-2022-11-08

for you to fetch changes up to 6c10e08a4f131431dbb50a11912cb2a726879df3:

  hw/usb: fix tab indentation (2022-11-08 11:13:48 +0100)


* Last minute s390x fixes before the hard freeze
* Whiste space clean-up in ui/, display/ and hw/usb/
  (Gerd seems to be away from keyboard, so I threw those in)


Amarjargal Gundjalam (3):
  ui: fix tab indentation
  hw/display: fix tab indentation
  hw/usb: fix tab indentation

Cédric Le Goater (2):
  Revert "s390x/s390-virtio-ccw: add zpcii-disable machine property"
  s390x/s390-virtio-ccw: Switch off zPCI enhancements on older machines

 hw/display/vga_regs.h  |6 +-
 hw/usb/quirks-pl2303-ids.h |  180 +-
 include/hw/s390x/s390-virtio-ccw.h |1 -
 include/hw/usb.h   |  100 +-
 include/hw/usb/dwc2-regs.h | 1608 +++
 ui/vgafont.h   | 9214 ++--
 ui/vnc-enc-zywrle.h|   16 +-
 ui/vnc_keysym.h|2 +-
 hw/display/blizzard.c  |  352 +-
 hw/display/cirrus_vga.c| 1606 +++
 hw/display/omap_dss.c  |  598 +--
 hw/display/pxa2xx_lcd.c|  196 +-
 hw/display/xenfb.c |  260 +-
 hw/s390x/s390-pci-kvm.c|4 +-
 hw/s390x/s390-virtio-ccw.c |   29 +-
 hw/usb/dev-hub.c   |   82 +-
 hw/usb/dev-network.c   |  286 +-
 hw/usb/dev-wacom.c |4 +-
 hw/usb/hcd-musb.c  |  328 +-
 ui/vnc-enc-zywrle-template.c   |   20 +-
 util/qemu-config.c |4 -
 qemu-options.hx|8 +-
 22 files changed, 7436 insertions(+), 7468 deletions(-)

[PULL 5/5] hw/usb: fix tab indentation

2022-11-08 Thread Thomas Huth

From: Amarjargal Gundjalam 

The TABs should be replaced with spaces, to make sure that we have a
consistent coding style with an indentation of 4 spaces everywhere.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/370
Signed-off-by: Amarjargal Gundjalam 
Message-Id: 
<6c993f57800f8fef7a910074620f6e80e077a3d1.1666707782.git.amarjarga...@gmail.com>
Signed-off-by: Thomas Huth 
---
 hw/usb/quirks-pl2303-ids.h |  180 ++--
 include/hw/usb.h   |  100 +--
 include/hw/usb/dwc2-regs.h | 1608 ++--
 hw/usb/dev-hub.c   |   82 +-
 hw/usb/dev-network.c   |  286 +++
 hw/usb/dev-wacom.c |4 +-
 hw/usb/hcd-musb.c  |  328 
 7 files changed, 1294 insertions(+), 1294 deletions(-)

diff --git a/hw/usb/quirks-pl2303-ids.h b/hw/usb/quirks-pl2303-ids.h
index 8dbdb46ffe..28dd8da61c 100644
--- a/hw/usb/quirks-pl2303-ids.h
+++ b/hw/usb/quirks-pl2303-ids.h
@@ -1,150 +1,150 @@
 /*
  * Prolific PL2303 USB to serial adaptor driver header file
  *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
  *
  */
 
-#define BENQ_VENDOR_ID 0x04a5
-#define BENQ_PRODUCT_ID_S810x4027
+#define BENQ_VENDOR_ID  0x04a5
+#define BENQ_PRODUCT_ID_S81 0x4027
 
-#define PL2303_VENDOR_ID   0x067b
-#define PL2303_PRODUCT_ID  0x2303
-#define PL2303_PRODUCT_ID_RSAQ20x04bb
-#define PL2303_PRODUCT_ID_DCU110x1234
-#define PL2303_PRODUCT_ID_PHAROS   0xaaa0
-#define PL2303_PRODUCT_ID_RSAQ30xaaa2
-#define PL2303_PRODUCT_ID_ALDIGA   0x0611
-#define PL2303_PRODUCT_ID_MMX  0x0612
-#define PL2303_PRODUCT_ID_GPRS 0x0609
-#define PL2303_PRODUCT_ID_HCR331   0x331a
-#define PL2303_PRODUCT_ID_MOTOROLA 0x0307
+#define PL2303_VENDOR_ID0x067b
+#define PL2303_PRODUCT_ID   0x2303
+#define PL2303_PRODUCT_ID_RSAQ2 0x04bb
+#define PL2303_PRODUCT_ID_DCU11 0x1234
+#define PL2303_PRODUCT_ID_PHAROS0xaaa0
+#define PL2303_PRODUCT_ID_RSAQ3 0xaaa2
+#define PL2303_PRODUCT_ID_ALDIGA0x0611
+#define PL2303_PRODUCT_ID_MMX   0x0612
+#define PL2303_PRODUCT_ID_GPRS  0x0609
+#define PL2303_PRODUCT_ID_HCR3310x331a
+#define PL2303_PRODUCT_ID_MOTOROLA  0x0307
 
-#define ATEN_VENDOR_ID 0x0557
-#define ATEN_VENDOR_ID20x0547
-#define ATEN_PRODUCT_ID0x2008
+#define ATEN_VENDOR_ID  0x0557
+#define ATEN_VENDOR_ID2 0x0547
+#define ATEN_PRODUCT_ID 0x2008
 
-#define IODATA_VENDOR_ID   0x04bb
-#define IODATA_PRODUCT_ID  0x0a03
-#define IODATA_PRODUCT_ID_RSAQ50x0a0e
+#define IODATA_VENDOR_ID0x04bb
+#define IODATA_PRODUCT_ID   0x0a03
+#define IODATA_PRODUCT_ID_RSAQ5 0x0a0e
 
-#define ELCOM_VENDOR_ID0x056e
-#define ELCOM_PRODUCT_ID   0x5003
-#define ELCOM_PRODUCT_ID_UCSGT 0x5004
+#define ELCOM_VENDOR_ID 0x056e
+#define ELCOM_PRODUCT_ID0x5003
+#define ELCOM_PRODUCT_ID_UCSGT  0x5004
 
-#define ITEGNO_VENDOR_ID   0x0eba
-#define ITEGNO_PRODUCT_ID  0x1080
-#define ITEGNO_PRODUCT_ID_2080 0x2080
+#define ITEGNO_VENDOR_ID0x0eba
+#define ITEGNO_PRODUCT_ID   0x1080
+#define ITEGNO_PRODUCT_ID_2080  0x2080
 
-#define MA620_VENDOR_ID0x0df7
-#define MA620_PRODUCT_ID   0x0620
+#define MA620_VENDOR_ID 0x0df7
+#define MA620_PRODUCT_ID0x0620
 
-#define RATOC_VENDOR_ID0x0584
-#define RATOC_PRODUCT_ID   0xb000
+#define RATOC_VENDOR_ID 0x0584
+#define RATOC_PRODUCT_ID0xb000
 
-#define TRIPP_VENDOR_ID0x2478
-#define TRIPP_PRODUCT_ID   0x2008
+#define TRIPP_VENDOR_ID 0x2478
+#define TRIPP_PRODUCT_ID0x2008
 
-#define RADIOSHACK_VENDOR_ID   0x1453
-#define RADIOSHACK_PRODUCT_ID  0x4026
+#define RADIOSHACK_VENDOR_ID0x1453
+#define RADIOSHACK_PRODUCT_ID   0x4026
 
-#define DCU10_VENDOR_ID0x0731
-#define DCU10_PRODUCT_ID   0x0528
+#define DCU10_VENDOR_ID 0x0731
+#define DCU10_PRODUCT_ID0x0528
 
-#define SITECOM_VENDOR_ID  0x6189
-#define SITECOM_PRODUCT_ID 0x2068
+#define SITECOM_VENDOR_ID   0x6189
+#define SITECOM_PRODUCT_ID  0x2068
 
 /* Alcatel OT535/735 USB cable */
-#define ALCATEL_VENDOR_ID  0x11f7
-#define ALCATEL_PRODUCT_ID 0x02df
+#define ALCATEL_VENDOR_ID   0x11f7
+#define ALCATEL_PRODUCT_ID  0x02df
 
 /* Samsung I

Re: [PATCH v4 4/4] hw/usb: fix tab indentation

2022-11-08 Thread Thomas Huth


On 02/11/2022 07.51, Thomas Huth wrote:

On 31/10/2022 20.10, Amarjargal Gundjalam wrote:


On 26/10/22 00:22, Thomas Huth wrote:

On 25/10/2022 16.28, Amarjargal Gundjalam wrote:

The TABs should be replaced with spaces, to make sure that we have a
consistent coding style with an indentation of 4 spaces everywhere.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/370
Reviewed-by: Daniel P. Berrangé 

Signed-off-by: Amarjargal Gundjalam 
---
  hw/usb/dev-hub.c   |   82 +-
  hw/usb/dev-network.c   |  286 +++
  hw/usb/dev-wacom.c |    4 +-
  hw/usb/hcd-musb.c  |  328 
  hw/usb/quirks-pl2303-ids.h |  180 ++--
  include/hw/usb.h   |  100 +--
  include/hw/usb/dwc2-regs.h | 1608 ++--
  7 files changed, 1294 insertions(+), 1294 deletions(-)

...

  diff --git a/hw/usb/dev-wacom.c b/hw/usb/dev-wacom.c
index 8323650c6a..7177c17f03 100644
--- a/hw/usb/dev-wacom.c
+++ b/hw/usb/dev-wacom.c
@@ -36,8 +36,8 @@
  #include "qom/object.h"
    /* Interface requests */
-#define WACOM_GET_REPORT    0x2101
-#define WACOM_SET_REPORT    0x2109
+#define WACOM_GET_REPORT    0x2101
+#define WACOM_SET_REPORT    0x2109
    struct USBWacomState {
  USBDevice dev;
diff --git a/hw/usb/hcd-musb.c b/hw/usb/hcd-musb.c
index 85f5ff5bd4..6929b026b1 100644
--- a/hw/usb/hcd-musb.c
+++ b/hw/usb/hcd-musb.c
@@ -28,227 +28,227 @@
  #include "hw/hw.h"
    /* Common USB registers */
-#define MUSB_HDRC_FADDR    0x00    /* 8-bit */
-#define MUSB_HDRC_POWER    0x01    /* 8-bit */
-
-#define MUSB_HDRC_INTRTX    0x02    /* 16-bit */
-#define MUSB_HDRC_INTRRX    0x04
-#define MUSB_HDRC_INTRTXE    0x06
-#define MUSB_HDRC_INTRRXE    0x08
-#define MUSB_HDRC_INTRUSB    0x0a    /* 8 bit */
-#define MUSB_HDRC_INTRUSBE    0x0b    /* 8 bit */
-#define MUSB_HDRC_FRAME    0x0c    /* 16-bit */
-#define MUSB_HDRC_INDEX    0x0e    /* 8 bit */
-#define MUSB_HDRC_TESTMODE    0x0f    /* 8 bit */
+#define MUSB_HDRC_FADDR 0x00    /* 8-bit */
+#define MUSB_HDRC_POWER 0x01    /* 8-bit */
+
+#define MUSB_HDRC_INTRTX    0x02    /* 16-bit */
+#define MUSB_HDRC_INTRRX    0x04
+#define MUSB_HDRC_INTRTXE   0x06


Sorry for not noticing it earlier, and the problem is pre-existing and 
not related to your patches, but in case you respinning again, my git is 
complaining here about the spaces at the end of the line:


.git/rebase-apply/patch:524: trailing whitespace.
#define MUSB_HDRC_INTRTXE   0x06

(maybe this could also be fixed by the maintainer when picking up the patch)

...



Should I fix them and submit a new version?


No, let's wait for the maintainer (Gerd) first to decide whether this should 
be fixed in this patch or not.


Seems like Gerd is currently away from keyboard, so I dared to pick up patch 
1, 3 and 4 for my current pull request (and fixed the white spaces at the 
line endings in patch 4). For patch 2, I think I'd prefer to wait for Gerd's 
feedback first what to do best with the fmopl.c file.


 Thomas

Re: [PATCH 2/2] target/mips: Correct check for CABS instructions

2022-11-08 Thread Jiaxun Yang





在 2022/11/7 22:35, Philippe Mathieu-Daudé 写道:

On 2/11/22 17:57, Jiaxun Yang wrote:

Accroading to "MIPS Architecture for Programmers Volume IV-c:
The MIPS-3D Application-Specific Extension to the MIPS64 Architecture"
(MD00099). CABS.cond.fmt belongs to MIPS-3D ASE, and it has nothing 
to do

with COP1X opcode.

Remove all unnecessary COP1X checks and check for MIPS3D availability
in decoding code path.

Signed-off-by: Jiaxun Yang 
---
  target/mips/tcg/translate.c | 9 +
  1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
index e49d2a25a8..23e575ad95 100644
--- a/target/mips/tcg/translate.c
+++ b/target/mips/tcg/translate.c
@@ -1788,16 +1788,8 @@ static inline void gen_cmp ## type ## _ ## 
fmt(DisasContext *ctx, int n,  \

check_ps(ctx); \
break; \
  case FMT_D: \
-    if (abs) 
{    \

- check_cop1x(ctx); \
- } \
  check_cp1_registers(ctx, fs | 
ft);    \

break; \
-    case FMT_S: \
-    if (abs) 
{    \

- check_cop1x(ctx); \
- } \
- break; \


I'm not sure we want to remove this check on all opcodes handled by
the FOP_CONDS() macro, and for all architecture variants. Maybe we
need to special-case CABS.cond.fmt?


Hmm if I read the code correctly COP1X check is only ran when abs is
set, and the only case abs is set is for CABS.cond.fmt.




} \
  gen_ldcmp_fpr##bits(ctx, fp0, 
fs);    \
  gen_ldcmp_fpr##bits(ctx, fp1, 
ft);    \
@@ -10424,6 +10416,7 @@ static void gen_farith(DisasContext *ctx, 
enum fopcode op1,

  case OPC_CMP_NGT_S:
  check_insn_opc_removed(ctx, ISA_MIPS_R6);
  if (ctx->opcode & (1 << 6)) {
+    check_insn(ctx, ASE_MIPS3D);


You somehow revert commit b8aa4598e2 ("MIPS COP1X (and related)
instructions") which is in use since 15 years.


Still no idea about why it is here in first place
CABS.cond.fmt is even not mentioned in MIPS IV manual. So it's unlikely 
to have

anything to do with COP1X.

Thanks
- Jiaxun




  gen_cmpabs_s(ctx, func - 48, ft, fs, cc);
  } else {
  gen_cmp_s(ctx, func - 48, ft, fs, cc);

Re: [PATCH 1/2] target/mips: Don't check COP1X for 64 bit FP mode

2022-11-08 Thread Jiaxun Yang





在 2022/11/7 23:29, Philippe Mathieu-Daudé 写道:

On 7/11/22 23:47, Philippe Mathieu-Daudé wrote:

On 2/11/22 17:57, Jiaxun Yang wrote:
Some implementations (i.e. Loongson-2F) may decide to implement a 64 
bit

FPU without implmenting COP1X instructions.

As the eligibility of 64 bit FP instructions is already determined by
CP0St_FR, there is no need to check for COP1X again.

Signed-off-by: Jiaxun Yang 
---
  target/mips/tcg/translate.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
index 2f2d707a12..e49d2a25a8 100644
--- a/target/mips/tcg/translate.c
+++ b/target/mips/tcg/translate.c
@@ -1545,7 +1545,7 @@ void check_cop1x(DisasContext *ctx)
   */
  void check_cp1_64bitmode(DisasContext *ctx)
  {
-    if (unlikely(~ctx->hflags & (MIPS_HFLAG_F64 | 
MIPS_HFLAG_COP1X))) {

+    if (unlikely(~ctx->hflags & MIPS_HFLAG_F64) {
  gen_reserved_instruction(ctx);
  }
  }


Did you test your patch? I'm getting:

../../target/mips/tcg/translate.c:1548:49: error: expected ')'
    if (unlikely(~ctx->hflags & MIPS_HFLAG_F64) {
    ^
../../target/mips/tcg/translate.c:1548:8: note: to match this '('
    if (unlikely(~ctx->hflags & MIPS_HFLAG_F64) {
   ^
../../target/mips/tcg/translate.c:1551:1: error: expected statement
}


Woah, typo when copy changes back from test machine...
Will use git publish next time.

Thanks
- Jiaxun



^

Re: GTK clipboard implementation causing regression, falling through the cracks?

2022-11-08 Thread Gerd Hoffmann

On Mon, Oct 24, 2022 at 12:49:19PM +0200, Claudio Fontana wrote:
> Hi all,
> 
> the GTK clipboard implementation seems to be causing some stability issues 
> (guest CPUs stuck),
> 
> Gerd can you take a look?

Sorry, no.  Just back online from sick leave, no bandwidth for that
right now.

Easy way out is adding a config option for clipboard support.  The real
solution is probably to rewrite the clipboard support to avoid calling
the blocking clipboard functions as they apparently (from looking at
reports & stack traces) can block forever at times.

take care,
  Gerd

Re: [PATCH 2/3] hvf: implement guest debugging on Apple Silicon hosts

2022-11-08 Thread Mads Ynddal

> On 8 Nov 2022, at 11.09, Francesco Cagnin  wrote:
> 
> Mads, thanks for the review and feedbacks. I worked on this a few months
> ago but only managed to have it published now, I'm sorry for work being
> done twice.
> 
> I'll add support for SSTEP_NOIRQ as you suggested.
> 
> For multi-core systems there's no particular issue, just I'm not
> familiar with all of these topics and I'm asking for good practices. KVM
> saves breakpoints information in struct 'KVMState', and following the
> same approach I've updated struct 'HVFState' to store equivalent data.
> To keep separate information for each cpu, KVM uses field 'kvm_state' in
> struct 'CPUState' (see 'include/hw/core/cpu.h'), but at the moment
> there's no analogous field for the HVF state (which is instead defined
> as a single global variable in 'accel/hvf/hvf-accel-ops.c'). Should we
> add a new field to 'CPUState' for the HVF state, or take a different
> approach?
> 
> For your last question regarding `hv_vcpu_set_trap_debug_exceptions()`,
> I think it's possible to move it to `hvf_arch_update_guest_debug()` but
> I'm not confident about `hvf_arch_init_vcpu()`, I will experiment again.

You're welcome. And that's ok. It's just unlucky timing.

In my version, I added the sw breakpoints to `hvf_vcpu_state` so they would
follow each CPU, but it's effectively a copy of the global set of breakpoints. 
I'm no expert on QEMU, as this would have been my first contribution, but AFAIK,
KVM will also add the breakpoints to all CPUs as we can't know which CPU will
take a certain process.

Moving it to `hvf_arch_update_guest_debug` would probably be best. Having it in
`hvf_arch_init_vcpu` would mean that it's always enabled. In some corner-cases,
that might have an adverse effect.

I also noticed you are adding 1 to the WRPs and BRPs. As I interpret the
documentation, you should subtract 1 instead, given the value 0 is reserved:

diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index dbc3605f6d..80a583cbd1 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -39,11 +39,11 @@ static void hvf_arm_init_debug(CPUState *cpu)
 {
 ARMCPU *arm_cpu = ARM_CPU(cpu);

-max_hw_bps = 1 + extract64(arm_cpu->isar.id_aa64dfr0, 12, 4);
+max_hw_bps = extract64(arm_cpu->isar.id_aa64dfr0, 12, 4) - 1;
 hw_breakpoints =
 g_array_sized_new(true, true, sizeof(HWBreakpoint), max_hw_bps);

-max_hw_wps = 1 + extract64(arm_cpu->isar.id_aa64dfr0, 20, 4);
+max_hw_wps = extract64(arm_cpu->isar.id_aa64dfr0, 20, 4) - 1;
 hw_watchpoints =
 g_array_sized_new(true, true, sizeof(HWWatchpoint), max_hw_wps);
 return;

But the documentation is a bit ambiguous on that. Maybe we can test it?

Re: [PATCH v9 6/8] KVM: Update lpage info when private/shared memory are mixed

2022-11-08 Thread Yuan Yao

On Tue, Oct 25, 2022 at 11:13:42PM +0800, Chao Peng wrote:
> When private/shared memory are mixed in a large page, the lpage_info may
> not be accurate and should be updated with this mixed info. A large page
> has mixed pages can't be really mapped as large page since its
> private/shared pages are from different physical memory.
>
> Update lpage_info when private/shared memory attribute is changed. If
> both private and shared pages are within a large page region, it can't
> be mapped as large page. It's a bit challenge to track the mixed
> info in a 'count' like variable, this patch instead reserves a bit in
> 'disallow_lpage' to indicate a large page has mixed private/share pages.
>
> Signed-off-by: Chao Peng 
> ---
>  arch/x86/include/asm/kvm_host.h |   8 +++
>  arch/x86/kvm/mmu/mmu.c  | 112 +++-
>  arch/x86/kvm/x86.c  |   2 +
>  include/linux/kvm_host.h|  19 ++
>  virt/kvm/kvm_main.c |  16 +++--
>  5 files changed, 152 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 7551b6f9c31c..db811a54e3fd 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -37,6 +37,7 @@
>  #include 
>
>  #define __KVM_HAVE_ARCH_VCPU_DEBUGFS
> +#define __KVM_HAVE_ARCH_UPDATE_MEM_ATTR
>
>  #define KVM_MAX_VCPUS 1024
>
> @@ -952,6 +953,13 @@ struct kvm_vcpu_arch {
>  #endif
>  };
>
> +/*
> + * Use a bit in disallow_lpage to indicate private/shared pages mixed at the
> + * level. The remaining bits are used as a reference count.
> + */
> +#define KVM_LPAGE_PRIVATE_SHARED_MIXED   (1U << 31)
> +#define KVM_LPAGE_COUNT_MAX  ((1U << 31) - 1)
> +
>  struct kvm_lpage_info {
>   int disallow_lpage;
>  };
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 33b1aec44fb8..67a9823a8c35 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -762,11 +762,16 @@ static void update_gfn_disallow_lpage_count(const 
> struct kvm_memory_slot *slot,
>  {
>   struct kvm_lpage_info *linfo;
>   int i;
> + int disallow_count;
>
>   for (i = PG_LEVEL_2M; i <= KVM_MAX_HUGEPAGE_LEVEL; ++i) {
>   linfo = lpage_info_slot(gfn, slot, i);
> +
> + disallow_count = linfo->disallow_lpage & KVM_LPAGE_COUNT_MAX;
> + WARN_ON(disallow_count + count < 0 ||
> + disallow_count > KVM_LPAGE_COUNT_MAX - count);
> +
>   linfo->disallow_lpage += count;
> - WARN_ON(linfo->disallow_lpage < 0);
>   }
>  }
>
> @@ -6910,3 +6915,108 @@ void kvm_mmu_pre_destroy_vm(struct kvm *kvm)
>   if (kvm->arch.nx_lpage_recovery_thread)
>   kthread_stop(kvm->arch.nx_lpage_recovery_thread);
>  }
> +
> +static inline bool linfo_is_mixed(struct kvm_lpage_info *linfo)
> +{
> + return linfo->disallow_lpage & KVM_LPAGE_PRIVATE_SHARED_MIXED;
> +}
> +
> +static inline void linfo_update_mixed(struct kvm_lpage_info *linfo, bool 
> mixed)
> +{
> + if (mixed)
> + linfo->disallow_lpage |= KVM_LPAGE_PRIVATE_SHARED_MIXED;
> + else
> + linfo->disallow_lpage &= ~KVM_LPAGE_PRIVATE_SHARED_MIXED;
> +}
> +
> +static bool mem_attr_is_mixed_2m(struct kvm *kvm, unsigned int attr,
> +  gfn_t start, gfn_t end)
> +{
> + XA_STATE(xas, &kvm->mem_attr_array, start);
> + gfn_t gfn = start;
> + void *entry;
> + bool shared = attr == KVM_MEM_ATTR_SHARED;
> + bool mixed = false;
> +
> + rcu_read_lock();
> + entry = xas_load(&xas);
> + while (gfn < end) {
> + if (xas_retry(&xas, entry))
> + continue;
> +
> + KVM_BUG_ON(gfn != xas.xa_index, kvm);
> +
> + if ((entry && !shared) || (!entry && shared)) {
> + mixed = true;
> + goto out;
> + }
> +
> + entry = xas_next(&xas);
> + gfn++;
> + }
> +out:
> + rcu_read_unlock();
> + return mixed;
> +}
> +
> +static bool mem_attr_is_mixed(struct kvm *kvm, struct kvm_memory_slot *slot,
> +   int level, unsigned int attr,
> +   gfn_t start, gfn_t end)
> +{
> + unsigned long gfn;
> + void *entry;
> +
> + if (level == PG_LEVEL_2M)
> + return mem_attr_is_mixed_2m(kvm, attr, start, end);
> +
> + entry = xa_load(&kvm->mem_attr_array, start);
> + for (gfn = start; gfn < end; gfn += KVM_PAGES_PER_HPAGE(level - 1)) {
> + if (linfo_is_mixed(lpage_info_slot(gfn, slot, level - 1)))
> + return true;
> + if (xa_load(&kvm->mem_attr_array, gfn) != entry)
> + return true;
> + }
> + return false;
> +}
> +
> +void kvm_arch_update_mem_attr(struct kvm *kvm, struct kvm_memory_slot *slot,
> +   unsigned int attr, gfn_t start, gfn_t end)
> +{
> +
> +

[PATCH 04/13] block: Remove drained_end_counter

2022-11-08 Thread Kevin Wolf

drained_end_counter is unused now, nobody changes its value any more. It
can be removed.

In cases where we had two almost identical functions that only differed
in whether the caller passes drained_end_counter, or whether they would
poll for a local drained_end_counter to reach 0, these become a single
function.

Signed-off-by: Kevin Wolf 
---
 include/block/block-io.h | 15 -
 include/block/block_int-common.h |  6 +-
 block.c  |  5 +-
 block/block-backend.c|  4 +-
 block/io.c   | 97 
 blockjob.c   |  2 +-
 6 files changed, 30 insertions(+), 99 deletions(-)

diff --git a/include/block/block-io.h b/include/block/block-io.h
index 770ddeb7c8..97e9ae8bee 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -235,21 +235,6 @@ int coroutine_fn bdrv_co_copy_range(BdrvChild *src, 
int64_t src_offset,
 int64_t bytes, BdrvRequestFlags read_flags,
 BdrvRequestFlags write_flags);
 
-/**
- * bdrv_drained_end_no_poll:
- *
- * Same as bdrv_drained_end(), but do not poll for the subgraph to
- * actually become unquiesced.  Therefore, no graph changes will occur
- * with this function.
- *
- * *drained_end_counter is incremented for every background operation
- * that is scheduled, and will be decremented for every operation once
- * it settles.  The caller must poll until it reaches 0.  The counter
- * should be accessed using atomic operations only.
- */
-void bdrv_drained_end_no_poll(BlockDriverState *bs, int *drained_end_counter);
-
-
 /*
  * "I/O or GS" API functions. These functions can run without
  * the BQL, but only in one specific iothread/main loop.
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 0956acbb60..6504db4fd9 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -939,15 +939,11 @@ struct BdrvChildClass {
  * These functions must not change the graph (and therefore also must not
  * call aio_poll(), which could change the graph indirectly).
  *
- * If drained_end() schedules background operations, it must atomically
- * increment *drained_end_counter for each such operation and atomically
- * decrement it once the operation has settled.
- *
  * Note that this can be nested. If drained_begin() was called twice, new
  * I/O is allowed only after drained_end() was called twice, too.
  */
 void (*drained_begin)(BdrvChild *child);
-void (*drained_end)(BdrvChild *child, int *drained_end_counter);
+void (*drained_end)(BdrvChild *child);
 
 /*
  * Returns whether the parent has pending requests for the child. This
diff --git a/block.c b/block.c
index fed8077993..7a24bd4c36 100644
--- a/block.c
+++ b/block.c
@@ -1227,11 +1227,10 @@ static bool bdrv_child_cb_drained_poll(BdrvChild *child)
 return bdrv_drain_poll(bs, false, NULL, false);
 }
 
-static void bdrv_child_cb_drained_end(BdrvChild *child,
-  int *drained_end_counter)
+static void bdrv_child_cb_drained_end(BdrvChild *child)
 {
 BlockDriverState *bs = child->opaque;
-bdrv_drained_end_no_poll(bs, drained_end_counter);
+bdrv_drained_end(bs);
 }
 
 static int bdrv_child_cb_inactivate(BdrvChild *child)
diff --git a/block/block-backend.c b/block/block-backend.c
index c0c7d56c8d..ecdfeb49bb 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -129,7 +129,7 @@ static void blk_root_inherit_options(BdrvChildRole role, 
bool parent_is_format,
 }
 static void blk_root_drained_begin(BdrvChild *child);
 static bool blk_root_drained_poll(BdrvChild *child);
-static void blk_root_drained_end(BdrvChild *child, int *drained_end_counter);
+static void blk_root_drained_end(BdrvChild *child);
 
 static void blk_root_change_media(BdrvChild *child, bool load);
 static void blk_root_resize(BdrvChild *child);
@@ -2549,7 +2549,7 @@ static bool blk_root_drained_poll(BdrvChild *child)
 return busy || !!blk->in_flight;
 }
 
-static void blk_root_drained_end(BdrvChild *child, int *drained_end_counter)
+static void blk_root_drained_end(BdrvChild *child)
 {
 BlockBackend *blk = child->opaque;
 assert(blk->quiesce_counter);
diff --git a/block/io.c b/block/io.c
index 183b407f5b..41e6121c31 100644
--- a/block/io.c
+++ b/block/io.c
@@ -58,27 +58,19 @@ static void bdrv_parent_drained_begin(BlockDriverState *bs, 
BdrvChild *ignore,
 }
 }
 
-static void bdrv_parent_drained_end_single_no_poll(BdrvChild *c,
-   int *drained_end_counter)
+void bdrv_parent_drained_end_single(BdrvChild *c)
 {
+IO_OR_GS_CODE();
+
 assert(c->parent_quiesce_counter > 0);
 c->parent_quiesce_counter--;
 if (c->klass->drained_end) {
-c->klass->drained_end(c, drained_end_counter);
+c->klass->drained_end(c);
 }
 }
 
-void bd

[PATCH 00/13] block: Simplify drain

2022-11-08 Thread Kevin Wolf

I'm aware that exactly nobody has been looking forward to a series with
this title, but it has to be. The way drain works means that we need to
poll in bdrv_replace_child_noperm() and that makes things rather messy
with Emanuele's multiqueue work because you must not poll while you hold
the graph lock.

The other reason why it has to be is that drain is way too complex and
there are too many different cases. Some simplification like this will
hopefully make it considerably more maintainable. The diffstat probably
tells something, too.

There are roughly speaking three parts in this series:

1. Make BlockDriver.bdrv_drained_begin/end() non-coroutine_fn again,
   which allows us to not poll on bdrv_drained_end() any more.

2. Remove subtree drains. They are a considerable complication in the
   whole drain machinery (in particular, they require polling in the
   BdrvChildClass.attach/detach() callbacks that are called during
   bdrv_replace_child_noperm()) and none of their users actually has a
   good reason to use them.

3. Finally get rid of polling in bdrv_replace_child_noperm() by
   requiring that the child is already drained by the caller and calling
   callbacks only once and not again for every nested drain section.

If necessary, a prefix of this series can be merged that covers only the
first or the first two parts and it would still make sense.

Kevin Wolf (13):
  qed: Don't yield in bdrv_qed_co_drain_begin()
  test-bdrv-drain: Don't yield in .bdrv_co_drained_begin/end()
  block: Revert .bdrv_drained_begin/end to non-coroutine_fn
  block: Remove drained_end_counter
  block: Inline bdrv_drain_invoke()
  block: Drain invidual nodes during reopen
  block: Don't use subtree drains in bdrv_drop_intermediate()
  stream: Replace subtree drain with a single node drain
  block: Remove subtree drains
  block: Call drain callbacks only once
  block: Remove ignore_bds_parents parameter from drain functions
  block: Don't poll in bdrv_replace_child_noperm()
  block: Remove poll parameter from bdrv_parent_drained_begin_single()

 include/block/block-global-state.h |   3 +
 include/block/block-io.h   |  52 +---
 include/block/block_int-common.h   |  17 +-
 include/block/block_int-io.h   |  12 -
 block.c| 132 ++-
 block/block-backend.c  |   4 +-
 block/io.c | 281 --
 block/qed.c|  24 +-
 block/replication.c|   6 -
 block/stream.c |  20 +-
 block/throttle.c   |   6 +-
 blockdev.c |  13 -
 blockjob.c |   2 +-
 tests/unit/test-bdrv-drain.c   | 369 +++--
 14 files changed, 270 insertions(+), 671 deletions(-)

-- 
2.38.1

[PATCH 03/13] block: Revert .bdrv_drained_begin/end to non-coroutine_fn

2022-11-08 Thread Kevin Wolf

Polling during bdrv_drained_end() can be problematic (and in the future,
we may get cases for bdrv_drained_begin() where polling is forbidden,
and we don't care about already in-flight requests, but just want to
prevent new requests from arriving).

The .bdrv_drained_begin/end callbacks running in a coroutine is the only
reason why we have to do this polling, so make them non-coroutine
callbacks again. None of the callers actually yield any more.

This means that bdrv_drained_end() effectively doesn't poll any more,
even if AIO_WAIT_WHILE() loops are still there (their condition is false
from the beginning). This is generally not a problem, but in
test-bdrv-drain, some additional explicit aio_poll() calls need to be
added because the test case wants to verify the final state after BHs
have executed.

Signed-off-by: Kevin Wolf 
---
 include/block/block_int-common.h | 10 ---
 block.c  |  4 +--
 block/io.c   | 49 +---
 block/qed.c  |  4 +--
 block/throttle.c |  6 ++--
 tests/unit/test-bdrv-drain.c | 18 ++--
 6 files changed, 30 insertions(+), 61 deletions(-)

diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 5a2cc077a0..0956acbb60 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -735,17 +735,19 @@ struct BlockDriver {
 void (*bdrv_io_unplug)(BlockDriverState *bs);
 
 /**
- * bdrv_co_drain_begin is called if implemented in the beginning of a
+ * bdrv_drain_begin is called if implemented in the beginning of a
  * drain operation to drain and stop any internal sources of requests in
  * the driver.
- * bdrv_co_drain_end is called if implemented at the end of the drain.
+ * bdrv_drain_end is called if implemented at the end of the drain.
  *
  * They should be used by the driver to e.g. manage scheduled I/O
  * requests, or toggle an internal state. After the end of the drain new
  * requests will continue normally.
+ *
+ * Implementations of both functions must not call aio_poll().
  */
-void coroutine_fn (*bdrv_co_drain_begin)(BlockDriverState *bs);
-void coroutine_fn (*bdrv_co_drain_end)(BlockDriverState *bs);
+void (*bdrv_drain_begin)(BlockDriverState *bs);
+void (*bdrv_drain_end)(BlockDriverState *bs);
 
 bool (*bdrv_supports_persistent_dirty_bitmap)(BlockDriverState *bs);
 bool coroutine_fn (*bdrv_co_can_store_new_dirty_bitmap)(
diff --git a/block.c b/block.c
index 3bd594eb2a..fed8077993 100644
--- a/block.c
+++ b/block.c
@@ -1705,8 +1705,8 @@ static int bdrv_open_driver(BlockDriverState *bs, 
BlockDriver *drv,
 assert(is_power_of_2(bs->bl.request_alignment));
 
 for (i = 0; i < bs->quiesce_counter; i++) {
-if (drv->bdrv_co_drain_begin) {
-drv->bdrv_co_drain_begin(bs);
+if (drv->bdrv_drain_begin) {
+drv->bdrv_drain_begin(bs);
 }
 }
 
diff --git a/block/io.c b/block/io.c
index 34b30e304e..183b407f5b 100644
--- a/block/io.c
+++ b/block/io.c
@@ -250,55 +250,20 @@ typedef struct {
 int *drained_end_counter;
 } BdrvCoDrainData;
 
-static void coroutine_fn bdrv_drain_invoke_entry(void *opaque)
-{
-BdrvCoDrainData *data = opaque;
-BlockDriverState *bs = data->bs;
-
-if (data->begin) {
-bs->drv->bdrv_co_drain_begin(bs);
-} else {
-bs->drv->bdrv_co_drain_end(bs);
-}
-
-/* Set data->done and decrement drained_end_counter before bdrv_wakeup() */
-qatomic_mb_set(&data->done, true);
-if (!data->begin) {
-qatomic_dec(data->drained_end_counter);
-}
-bdrv_dec_in_flight(bs);
-
-g_free(data);
-}
-
-/* Recursively call BlockDriver.bdrv_co_drain_begin/end callbacks */
+/* Recursively call BlockDriver.bdrv_drain_begin/end callbacks */
 static void bdrv_drain_invoke(BlockDriverState *bs, bool begin,
   int *drained_end_counter)
 {
-BdrvCoDrainData *data;
-
-if (!bs->drv || (begin && !bs->drv->bdrv_co_drain_begin) ||
-(!begin && !bs->drv->bdrv_co_drain_end)) {
+if (!bs->drv || (begin && !bs->drv->bdrv_drain_begin) ||
+(!begin && !bs->drv->bdrv_drain_end)) {
 return;
 }
 
-data = g_new(BdrvCoDrainData, 1);
-*data = (BdrvCoDrainData) {
-.bs = bs,
-.done = false,
-.begin = begin,
-.drained_end_counter = drained_end_counter,
-};
-
-if (!begin) {
-qatomic_inc(drained_end_counter);
+if (begin) {
+bs->drv->bdrv_drain_begin(bs);
+} else {
+bs->drv->bdrv_drain_end(bs);
 }
-
-/* Make sure the driver callback completes during the polling phase for
- * drain_begin. */
-bdrv_inc_in_flight(bs);
-data->co = qemu_coroutine_create(bdrv_drain_invoke_entry, data);
-aio_co_schedule(bdrv_get_aio_context(bs), data->co);
 }
 
 /* Returns true if BDRV_POLL_WHILE() shou

[PATCH 12/13] block: Don't poll in bdrv_replace_child_noperm()

2022-11-08 Thread Kevin Wolf

In order to make sure that bdrv_replace_child_noperm() doesn't have to
poll any more, get rid of the bdrv_parent_drained_begin_single() call.

This is possible now because we can require that the child is already
drained when the function is called (it better be, having in-flight
requests while modifying the graph isn't going to end well!) and we
don't call the parent drain callbacks more than once.

The additional drain calls needed in callers cause the test case to run
its code in the drain handler too early (bdrv_attach_child() drains
now), so modify it to only enable the code after the test setup has
completed.

Signed-off-by: Kevin Wolf 
---
 include/block/block-io.h |  8 
 block.c  | 72 +---
 block/io.c   |  2 +-
 tests/unit/test-bdrv-drain.c | 10 +
 4 files changed, 70 insertions(+), 22 deletions(-)

diff --git a/include/block/block-io.h b/include/block/block-io.h
index 5b54ed4672..ddce8550a9 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -290,6 +290,14 @@ bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector 
*qiov, int64_t pos);
  */
 void bdrv_parent_drained_begin_single(BdrvChild *c, bool poll);
 
+/**
+ * bdrv_parent_drained_poll_single:
+ *
+ * Returns true if there is any pending activity to cease before @c can be
+ * called quiesced, false otherwise.
+ */
+bool bdrv_parent_drained_poll_single(BdrvChild *c);
+
 /**
  * bdrv_parent_drained_end_single:
  *
diff --git a/block.c b/block.c
index 5f5f79cd16..12039e9b8a 100644
--- a/block.c
+++ b/block.c
@@ -2399,6 +2399,20 @@ static void bdrv_replace_child_abort(void *opaque)
 
 GLOBAL_STATE_CODE();
 /* old_bs reference is transparently moved from @s to @s->child */
+if (!s->child->bs) {
+/*
+ * The parents were undrained when removing old_bs from the child. New
+ * requests can't have been made, though, because the child was empty.
+ *
+ * TODO Make bdrv_replace_child_noperm() transactionable to avoid
+ * undraining the parent in the first place. Once this is done, having
+ * new_bs drained when calling bdrv_replace_child_tran() is not a
+ * requirement any more.
+ */
+bdrv_parent_drained_begin_single(s->child, false);
+assert(!bdrv_parent_drained_poll_single(s->child));
+}
+assert(s->child->parent_quiesce_counter);
 bdrv_replace_child_noperm(s->child, s->old_bs);
 bdrv_unref(new_bs);
 }
@@ -2414,12 +2428,20 @@ static TransactionActionDrv bdrv_replace_child_drv = {
  *
  * Note: real unref of old_bs is done only on commit.
  *
+ * Both child and new_bs (if non-NULL) must be drained. new_bs must be kept
+ * drained until the transaction is completed (this automatically implies that
+ * child remains drained, too).
+ *
  * The function doesn't update permissions, caller is responsible for this.
  */
 static void bdrv_replace_child_tran(BdrvChild *child, BlockDriverState *new_bs,
 Transaction *tran)
 {
 BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
+
+assert(child->parent_quiesce_counter);
+assert(!new_bs || new_bs->quiesce_counter);
+
 *s = (BdrvReplaceChildState) {
 .child = child,
 .old_bs = child->bs,
@@ -2818,6 +2840,12 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 int new_bs_quiesce_counter;
 
 assert(!child->frozen);
+/*
+ * When removing the child, it's the callers responsibility to make sure
+ * that no requests are in flight any more. Usually the parent is drained,
+ * but not through child->parent_quiesce_counter.
+ */
+assert(!new_bs || child->parent_quiesce_counter);
 assert(old_bs != new_bs);
 GLOBAL_STATE_CODE();
 
@@ -2825,16 +2853,6 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 assert(bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs));
 }
 
-new_bs_quiesce_counter = (new_bs ? new_bs->quiesce_counter : 0);
-
-/*
- * If the new child node is drained but the old one was not, flush
- * all outstanding requests to the old child node.
- */
-if (new_bs_quiesce_counter && !child->parent_quiesce_counter) {
-bdrv_parent_drained_begin_single(child, true);
-}
-
 if (old_bs) {
 if (child->klass->detach) {
 child->klass->detach(child);
@@ -2849,14 +2867,6 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 assert_bdrv_graph_writable(new_bs);
 QLIST_INSERT_HEAD(&new_bs->parents, child, next_parent);
 
-/*
- * Polling in bdrv_parent_drained_begin_single() may have led to the 
new
- * node's quiesce_counter having been decreased.  Not a problem, we 
just
- * need to recognize this here and then invoke drained_end 
appropriately
- * more often.
- */
-assert(new_bs->quiesce_counter <= new_bs_quiesce_counter);
-
 if (chi

[PATCH 11/13] block: Remove ignore_bds_parents parameter from drain functions

2022-11-08 Thread Kevin Wolf

ignore_bds_parents is now ignored, so we can just remove it.

Signed-off-by: Kevin Wolf 
---
 include/block/block-io.h | 10 ++
 block.c  |  4 +--
 block/io.c   | 78 +++-
 3 files changed, 32 insertions(+), 60 deletions(-)

diff --git a/include/block/block-io.h b/include/block/block-io.h
index c35cb1e53f..5b54ed4672 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -305,14 +305,9 @@ void bdrv_parent_drained_end_single(BdrvChild *c);
  *
  * Poll for pending requests in @bs and its parents (except for 
@ignore_parent).
  *
- * If @ignore_bds_parents is true, parents that are BlockDriverStates must
- * ignore the drain request because they will be drained separately (used for
- * drain_all).
- *
  * This is part of bdrv_drained_begin.
  */
-bool bdrv_drain_poll(BlockDriverState *bs, BdrvChild *ignore_parent,
- bool ignore_bds_parents);
+bool bdrv_drain_poll(BlockDriverState *bs, BdrvChild *ignore_parent);
 
 /**
  * bdrv_drained_begin:
@@ -330,8 +325,7 @@ void bdrv_drained_begin(BlockDriverState *bs);
  * Quiesces a BDS like bdrv_drained_begin(), but does not wait for already
  * running requests to complete.
  */
-void bdrv_do_drained_begin_quiesce(BlockDriverState *bs,
-   BdrvChild *parent, bool ignore_bds_parents);
+void bdrv_do_drained_begin_quiesce(BlockDriverState *bs, BdrvChild *parent);
 
 /**
  * bdrv_drained_end:
diff --git a/block.c b/block.c
index 8878586f6e..5f5f79cd16 100644
--- a/block.c
+++ b/block.c
@@ -1218,13 +1218,13 @@ static char *bdrv_child_get_parent_desc(BdrvChild *c)
 static void bdrv_child_cb_drained_begin(BdrvChild *child)
 {
 BlockDriverState *bs = child->opaque;
-bdrv_do_drained_begin_quiesce(bs, NULL, false);
+bdrv_do_drained_begin_quiesce(bs, NULL);
 }
 
 static bool bdrv_child_cb_drained_poll(BdrvChild *child)
 {
 BlockDriverState *bs = child->opaque;
-return bdrv_drain_poll(bs, NULL, false);
+return bdrv_drain_poll(bs, NULL);
 }
 
 static void bdrv_child_cb_drained_end(BdrvChild *child)
diff --git a/block/io.c b/block/io.c
index 87c7a92f15..4a83359a8f 100644
--- a/block/io.c
+++ b/block/io.c
@@ -45,13 +45,12 @@ static void bdrv_parent_cb_resize(BlockDriverState *bs);
 static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
 int64_t offset, int64_t bytes, BdrvRequestFlags flags);
 
-static void bdrv_parent_drained_begin(BlockDriverState *bs, BdrvChild *ignore,
-  bool ignore_bds_parents)
+static void bdrv_parent_drained_begin(BlockDriverState *bs, BdrvChild *ignore)
 {
 BdrvChild *c, *next;
 
 QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
-if (c == ignore || (ignore_bds_parents && c->klass->parent_is_bds)) {
+if (c == ignore) {
 continue;
 }
 bdrv_parent_drained_begin_single(c, false);
@@ -69,13 +68,12 @@ void bdrv_parent_drained_end_single(BdrvChild *c)
 }
 }
 
-static void bdrv_parent_drained_end(BlockDriverState *bs, BdrvChild *ignore,
-bool ignore_bds_parents)
+static void bdrv_parent_drained_end(BlockDriverState *bs, BdrvChild *ignore)
 {
 BdrvChild *c;
 
 QLIST_FOREACH(c, &bs->parents, next_parent) {
-if (c == ignore || (ignore_bds_parents && c->klass->parent_is_bds)) {
+if (c == ignore) {
 continue;
 }
 bdrv_parent_drained_end_single(c);
@@ -90,14 +88,13 @@ static bool bdrv_parent_drained_poll_single(BdrvChild *c)
 return false;
 }
 
-static bool bdrv_parent_drained_poll(BlockDriverState *bs, BdrvChild *ignore,
- bool ignore_bds_parents)
+static bool bdrv_parent_drained_poll(BlockDriverState *bs, BdrvChild *ignore)
 {
 BdrvChild *c, *next;
 bool busy = false;
 
 QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
-if (c == ignore || (ignore_bds_parents && c->klass->parent_is_bds)) {
+if (c == ignore) {
 continue;
 }
 busy |= bdrv_parent_drained_poll_single(c);
@@ -238,16 +235,14 @@ typedef struct {
 bool begin;
 bool poll;
 BdrvChild *parent;
-bool ignore_bds_parents;
 } BdrvCoDrainData;
 
 /* Returns true if BDRV_POLL_WHILE() should go into a blocking aio_poll() */
-bool bdrv_drain_poll(BlockDriverState *bs, BdrvChild *ignore_parent,
- bool ignore_bds_parents)
+bool bdrv_drain_poll(BlockDriverState *bs, BdrvChild *ignore_parent)
 {
 IO_OR_GS_CODE();
 
-if (bdrv_parent_drained_poll(bs, ignore_parent, ignore_bds_parents)) {
+if (bdrv_parent_drained_poll(bs, ignore_parent)) {
 return true;
 }
 
@@ -258,16 +253,9 @@ bool bdrv_drain_poll(BlockDriverState *bs, BdrvChild 
*ignore_parent,
 return false;
 }
 
-static bool bdrv_drain_poll_top_level(BlockDriverState *bs,
-  BdrvChild *ignore_parent)
-{
-retur

[PATCH 13/13] block: Remove poll parameter from bdrv_parent_drained_begin_single()

2022-11-08 Thread Kevin Wolf

All callers of bdrv_parent_drained_begin_single() pass poll=false now,
so we don't need the parameter any more.

Signed-off-by: Kevin Wolf 
---
 include/block/block-io.h | 5 ++---
 block.c  | 4 ++--
 block/io.c   | 7 ++-
 3 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/include/block/block-io.h b/include/block/block-io.h
index ddce8550a9..35669f0e62 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -285,10 +285,9 @@ bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector 
*qiov, int64_t pos);
 /**
  * bdrv_parent_drained_begin_single:
  *
- * Begin a quiesced section for the parent of @c. If @poll is true, wait for
- * any pending activity to cease.
+ * Begin a quiesced section for the parent of @c.
  */
-void bdrv_parent_drained_begin_single(BdrvChild *c, bool poll);
+void bdrv_parent_drained_begin_single(BdrvChild *c);
 
 /**
  * bdrv_parent_drained_poll_single:
diff --git a/block.c b/block.c
index 12039e9b8a..c200f7afa0 100644
--- a/block.c
+++ b/block.c
@@ -2409,7 +2409,7 @@ static void bdrv_replace_child_abort(void *opaque)
  * new_bs drained when calling bdrv_replace_child_tran() is not a
  * requirement any more.
  */
-bdrv_parent_drained_begin_single(s->child, false);
+bdrv_parent_drained_begin_single(s->child);
 assert(!bdrv_parent_drained_poll_single(s->child));
 }
 assert(s->child->parent_quiesce_counter);
@@ -3016,7 +3016,7 @@ static BdrvChild 
*bdrv_attach_child_common(BlockDriverState *child_bs,
  * bdrv_replace_child_noperm() will undrain it if the child node is not
  * drained. The child was only just created, so polling is not necessary.
  */
-bdrv_parent_drained_begin_single(new_child, false);
+bdrv_parent_drained_begin_single(new_child);
 bdrv_replace_child_noperm(new_child, child_bs);
 
 BdrvAttachChildCommonState *s = g_new(BdrvAttachChildCommonState, 1);
diff --git a/block/io.c b/block/io.c
index d0f641926f..9bcb19e5ee 100644
--- a/block/io.c
+++ b/block/io.c
@@ -53,7 +53,7 @@ static void bdrv_parent_drained_begin(BlockDriverState *bs, 
BdrvChild *ignore)
 if (c == ignore) {
 continue;
 }
-bdrv_parent_drained_begin_single(c, false);
+bdrv_parent_drained_begin_single(c);
 }
 }
 
@@ -103,7 +103,7 @@ static bool bdrv_parent_drained_poll(BlockDriverState *bs, 
BdrvChild *ignore)
 return busy;
 }
 
-void bdrv_parent_drained_begin_single(BdrvChild *c, bool poll)
+void bdrv_parent_drained_begin_single(BdrvChild *c)
 {
 IO_OR_GS_CODE();
 assert(c->parent_quiesce_counter == 0);
@@ -111,9 +111,6 @@ void bdrv_parent_drained_begin_single(BdrvChild *c, bool 
poll)
 if (c->klass->drained_begin) {
 c->klass->drained_begin(c);
 }
-if (poll) {
-BDRV_POLL_WHILE(c->bs, bdrv_parent_drained_poll_single(c));
-}
 }
 
 static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
-- 
2.38.1

[PATCH 05/13] block: Inline bdrv_drain_invoke()

2022-11-08 Thread Kevin Wolf

bdrv_drain_invoke() has now two entirely separate cases that share no
code any more and are selected depending on a bool parameter. Each case
has only one caller. Just inline the function.

Signed-off-by: Kevin Wolf 
---
 block/io.c | 23 ++-
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/block/io.c b/block/io.c
index 41e6121c31..c520183fb7 100644
--- a/block/io.c
+++ b/block/io.c
@@ -241,21 +241,6 @@ typedef struct {
 bool ignore_bds_parents;
 } BdrvCoDrainData;
 
-/* Recursively call BlockDriver.bdrv_drain_begin/end callbacks */
-static void bdrv_drain_invoke(BlockDriverState *bs, bool begin)
-{
-if (!bs->drv || (begin && !bs->drv->bdrv_drain_begin) ||
-(!begin && !bs->drv->bdrv_drain_end)) {
-return;
-}
-
-if (begin) {
-bs->drv->bdrv_drain_begin(bs);
-} else {
-bs->drv->bdrv_drain_end(bs);
-}
-}
-
 /* Returns true if BDRV_POLL_WHILE() should go into a blocking aio_poll() */
 bool bdrv_drain_poll(BlockDriverState *bs, bool recursive,
  BdrvChild *ignore_parent, bool ignore_bds_parents)
@@ -389,7 +374,9 @@ void bdrv_do_drained_begin_quiesce(BlockDriverState *bs,
 }
 
 bdrv_parent_drained_begin(bs, parent, ignore_bds_parents);
-bdrv_drain_invoke(bs, true);
+if (bs->drv && bs->drv->bdrv_drain_begin) {
+bs->drv->bdrv_drain_begin(bs);
+}
 }
 
 static void bdrv_do_drained_begin(BlockDriverState *bs, bool recursive,
@@ -460,7 +447,9 @@ static void bdrv_do_drained_end(BlockDriverState *bs, bool 
recursive,
 assert(bs->quiesce_counter > 0);
 
 /* Re-enable things in child-to-parent order */
-bdrv_drain_invoke(bs, false);
+if (bs->drv && bs->drv->bdrv_drain_end) {
+bs->drv->bdrv_drain_end(bs);
+}
 bdrv_parent_drained_end(bs, parent, ignore_bds_parents);
 
 old_quiesce_counter = qatomic_fetch_dec(&bs->quiesce_counter);
-- 
2.38.1

[PATCH 10/13] block: Call drain callbacks only once

2022-11-08 Thread Kevin Wolf

We only need to call both the BlockDriver's callback and the parent
callbacks when going from undrained to drained or vice versa. A second
drain section doesn't make a difference for the driver or the parent,
they weren't supposed to send new requests before and after the second
drain.

One thing that gets in the way is the 'ignore_bds_parents' parameter in
bdrv_do_drained_begin_quiesce() and bdrv_do_drained_end(): If it is true
for the first drain, bs->quiesce_counter will be non-zero, but the
parent callbacks still haven't been called, so a second drain where it
is false would still have to call them.

Instead of keeping track of this, let's just get rid of the parameter.
It was introduced in commit 6cd5c9d7b2d as an optimisation so that
during bdrv_drain_all(), we wouldn't recursively drain all parents up to
the root for each node, resulting in quadratic complexity. As it happens,
calling the callbacks only once solves the same problem, so as of this
patch, we'll still have O(n) complexity and ignore_bds_parents is not
needed any more.

This patch only ignores the 'ignore_bds_parents' parameter. It will be
removed in a separate patch.

Signed-off-by: Kevin Wolf 
---
 block.c  | 13 ++---
 block/io.c   | 24 +---
 tests/unit/test-bdrv-drain.c | 16 ++--
 3 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/block.c b/block.c
index 9d082631d9..8878586f6e 100644
--- a/block.c
+++ b/block.c
@@ -2816,7 +2816,6 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 {
 BlockDriverState *old_bs = child->bs;
 int new_bs_quiesce_counter;
-int drain_saldo;
 
 assert(!child->frozen);
 assert(old_bs != new_bs);
@@ -2827,15 +2826,13 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 }
 
 new_bs_quiesce_counter = (new_bs ? new_bs->quiesce_counter : 0);
-drain_saldo = new_bs_quiesce_counter - child->parent_quiesce_counter;
 
 /*
  * If the new child node is drained but the old one was not, flush
  * all outstanding requests to the old child node.
  */
-while (drain_saldo > 0 && child->klass->drained_begin) {
+if (new_bs_quiesce_counter && !child->parent_quiesce_counter) {
 bdrv_parent_drained_begin_single(child, true);
-drain_saldo--;
 }
 
 if (old_bs) {
@@ -2859,7 +2856,6 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
  * more often.
  */
 assert(new_bs->quiesce_counter <= new_bs_quiesce_counter);
-drain_saldo += new_bs->quiesce_counter - new_bs_quiesce_counter;
 
 if (child->klass->attach) {
 child->klass->attach(child);
@@ -2869,10 +2865,13 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 /*
  * If the old child node was drained but the new one is not, allow
  * requests to come in only after the new node has been attached.
+ *
+ * Update new_bs_quiesce_counter because bdrv_parent_drained_begin_single()
+ * polls, which could have changed the value.
  */
-while (drain_saldo < 0 && child->klass->drained_end) {
+new_bs_quiesce_counter = (new_bs ? new_bs->quiesce_counter : 0);
+if (!new_bs_quiesce_counter && child->parent_quiesce_counter) {
 bdrv_parent_drained_end_single(child);
-drain_saldo++;
 }
 }
 
diff --git a/block/io.c b/block/io.c
index 870a25d7a5..87c7a92f15 100644
--- a/block/io.c
+++ b/block/io.c
@@ -62,7 +62,7 @@ void bdrv_parent_drained_end_single(BdrvChild *c)
 {
 IO_OR_GS_CODE();
 
-assert(c->parent_quiesce_counter > 0);
+assert(c->parent_quiesce_counter == 1);
 c->parent_quiesce_counter--;
 if (c->klass->drained_end) {
 c->klass->drained_end(c);
@@ -109,6 +109,7 @@ static bool bdrv_parent_drained_poll(BlockDriverState *bs, 
BdrvChild *ignore,
 void bdrv_parent_drained_begin_single(BdrvChild *c, bool poll)
 {
 IO_OR_GS_CODE();
+assert(c->parent_quiesce_counter == 0);
 c->parent_quiesce_counter++;
 if (c->klass->drained_begin) {
 c->klass->drained_begin(c);
@@ -352,16 +353,16 @@ void bdrv_do_drained_begin_quiesce(BlockDriverState *bs,
BdrvChild *parent, bool ignore_bds_parents)
 {
 IO_OR_GS_CODE();
-assert(!qemu_in_coroutine());
 
 /* Stop things in parent-to-child order */
 if (qatomic_fetch_inc(&bs->quiesce_counter) == 0) {
 aio_disable_external(bdrv_get_aio_context(bs));
-}
 
-bdrv_parent_drained_begin(bs, parent, ignore_bds_parents);
-if (bs->drv && bs->drv->bdrv_drain_begin) {
-bs->drv->bdrv_drain_begin(bs);
+/* TODO Remove ignore_bds_parents, we don't consider it any more */
+bdrv_parent_drained_begin(bs, parent, false);
+if (bs->drv && bs->drv->bdrv_drain_begin) {
+bs->drv->bdrv_drain_begin(bs);
+}
 }
 }
 
@@ -412,13 +413,14 @@ static void bdrv_do_drained_end(BlockDriverState *bs, 
BdrvChild *parent,
 assert(b

[PATCH 01/13] qed: Don't yield in bdrv_qed_co_drain_begin()

2022-11-08 Thread Kevin Wolf

We want to change .bdrv_co_drained_begin() back to be a non-coroutine
callback, so in preparation, avoid yielding in its implementation.

Because we increase bs->in_flight and bdrv_drained_begin() polls, the
behaviour is unchanged.

Signed-off-by: Kevin Wolf 
---
 block/qed.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 2f36ad342c..013f826c44 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -282,9 +282,8 @@ static void coroutine_fn 
qed_unplug_allocating_write_reqs(BDRVQEDState *s)
 qemu_co_mutex_unlock(&s->table_lock);
 }
 
-static void coroutine_fn qed_need_check_timer_entry(void *opaque)
+static void coroutine_fn qed_need_check_timer(BDRVQEDState *s)
 {
-BDRVQEDState *s = opaque;
 int ret;
 
 trace_qed_need_check_timer_cb(s);
@@ -310,9 +309,20 @@ static void coroutine_fn qed_need_check_timer_entry(void 
*opaque)
 (void) ret;
 }
 
+static void coroutine_fn qed_need_check_timer_entry(void *opaque)
+{
+BDRVQEDState *s = opaque;
+
+qed_need_check_timer(opaque);
+bdrv_dec_in_flight(s->bs);
+}
+
 static void qed_need_check_timer_cb(void *opaque)
 {
+BDRVQEDState *s = opaque;
 Coroutine *co = qemu_coroutine_create(qed_need_check_timer_entry, opaque);
+
+bdrv_inc_in_flight(s->bs);
 qemu_coroutine_enter(co);
 }
 
@@ -363,8 +373,12 @@ static void coroutine_fn 
bdrv_qed_co_drain_begin(BlockDriverState *bs)
  * header is flushed.
  */
 if (s->need_check_timer && timer_pending(s->need_check_timer)) {
+Coroutine *co;
+
 qed_cancel_need_check_timer(s);
-qed_need_check_timer_entry(s);
+co = qemu_coroutine_create(qed_need_check_timer_entry, s);
+bdrv_inc_in_flight(bs);
+aio_co_enter(bdrv_get_aio_context(bs), co);
 }
 }
 
-- 
2.38.1

[PATCH 09/13] block: Remove subtree drains

2022-11-08 Thread Kevin Wolf

Subtree drains are not used any more. Remove them.

After this, BdrvChildClass.attach/detach() don't poll any more.

Signed-off-by: Kevin Wolf 
---
 include/block/block-io.h |  18 +--
 include/block/block_int-common.h |   1 -
 include/block/block_int-io.h |  12 --
 block.c  |  20 +--
 block/io.c   | 121 +++---
 tests/unit/test-bdrv-drain.c | 261 ++-
 6 files changed, 44 insertions(+), 389 deletions(-)

diff --git a/include/block/block-io.h b/include/block/block-io.h
index 97e9ae8bee..c35cb1e53f 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -303,8 +303,7 @@ void bdrv_parent_drained_end_single(BdrvChild *c);
 /**
  * bdrv_drain_poll:
  *
- * Poll for pending requests in @bs, its parents (except for @ignore_parent),
- * and if @recursive is true its children as well (used for subtree drain).
+ * Poll for pending requests in @bs and its parents (except for 
@ignore_parent).
  *
  * If @ignore_bds_parents is true, parents that are BlockDriverStates must
  * ignore the drain request because they will be drained separately (used for
@@ -312,8 +311,8 @@ void bdrv_parent_drained_end_single(BdrvChild *c);
  *
  * This is part of bdrv_drained_begin.
  */
-bool bdrv_drain_poll(BlockDriverState *bs, bool recursive,
- BdrvChild *ignore_parent, bool ignore_bds_parents);
+bool bdrv_drain_poll(BlockDriverState *bs, BdrvChild *ignore_parent,
+ bool ignore_bds_parents);
 
 /**
  * bdrv_drained_begin:
@@ -334,12 +333,6 @@ void bdrv_drained_begin(BlockDriverState *bs);
 void bdrv_do_drained_begin_quiesce(BlockDriverState *bs,
BdrvChild *parent, bool ignore_bds_parents);
 
-/**
- * Like bdrv_drained_begin, but recursively begins a quiesced section for
- * exclusive access to all child nodes as well.
- */
-void bdrv_subtree_drained_begin(BlockDriverState *bs);
-
 /**
  * bdrv_drained_end:
  *
@@ -353,9 +346,4 @@ void bdrv_subtree_drained_begin(BlockDriverState *bs);
  */
 void bdrv_drained_end(BlockDriverState *bs);
 
-/**
- * End a quiescent section started by bdrv_subtree_drained_begin().
- */
-void bdrv_subtree_drained_end(BlockDriverState *bs);
-
 #endif /* BLOCK_IO_H */
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 6504db4fd9..65ee5fcbec 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -1184,7 +1184,6 @@ struct BlockDriverState {
 
 /* Accessed with atomic ops.  */
 int quiesce_counter;
-int recursive_quiesce_counter;
 
 unsigned int write_gen;   /* Current data generation */
 
diff --git a/include/block/block_int-io.h b/include/block/block_int-io.h
index 4b0b3e17ef..8bc061ebb8 100644
--- a/include/block/block_int-io.h
+++ b/include/block/block_int-io.h
@@ -179,16 +179,4 @@ void bdrv_bsc_invalidate_range(BlockDriverState *bs,
  */
 void bdrv_bsc_fill(BlockDriverState *bs, int64_t offset, int64_t bytes);
 
-
-/*
- * "I/O or GS" API functions. These functions can run without
- * the BQL, but only in one specific iothread/main loop.
- *
- * See include/block/block-io.h for more information about
- * the "I/O or GS" API.
- */
-
-void bdrv_apply_subtree_drain(BdrvChild *child, BlockDriverState *new_parent);
-void bdrv_unapply_subtree_drain(BdrvChild *child, BlockDriverState 
*old_parent);
-
 #endif /* BLOCK_INT_IO_H */
diff --git a/block.c b/block.c
index 43b893dd6c..9d082631d9 100644
--- a/block.c
+++ b/block.c
@@ -1224,7 +1224,7 @@ static void bdrv_child_cb_drained_begin(BdrvChild *child)
 static bool bdrv_child_cb_drained_poll(BdrvChild *child)
 {
 BlockDriverState *bs = child->opaque;
-return bdrv_drain_poll(bs, false, NULL, false);
+return bdrv_drain_poll(bs, NULL, false);
 }
 
 static void bdrv_child_cb_drained_end(BdrvChild *child)
@@ -1474,8 +1474,6 @@ static void bdrv_child_cb_attach(BdrvChild *child)
 assert(!bs->file);
 bs->file = child;
 }
-
-bdrv_apply_subtree_drain(child, bs);
 }
 
 static void bdrv_child_cb_detach(BdrvChild *child)
@@ -1486,8 +1484,6 @@ static void bdrv_child_cb_detach(BdrvChild *child)
 bdrv_backing_detach(child);
 }
 
-bdrv_unapply_subtree_drain(child, bs);
-
 assert_bdrv_graph_writable(bs);
 QLIST_REMOVE(child, next);
 if (child == bs->backing) {
@@ -2843,9 +2839,6 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 }
 
 if (old_bs) {
-/* Detach first so that the recursive drain sections coming from @child
- * are already gone and we only end the drain sections that came from
- * elsewhere. */
 if (child->klass->detach) {
 child->klass->detach(child);
 }
@@ -2860,17 +2853,14 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
 QLIST_INSERT_HEAD(&new_bs->parents, child, next_parent);
 
 /*
- * Detaching the old node may have led to the

Re: [PATCH v3 4/4] hw/nvme: add polling support

2022-11-08 Thread John Levon

On Fri, Nov 04, 2022 at 07:32:12AM +0100, Klaus Jensen wrote:

> On Nov  3 21:19, Jinhao Fan wrote:
> > On 11/3/2022 8:10 PM, Klaus Jensen wrote:
> > > I agree that the spec is a little unclear on this point. In any case, in
> > > Linux, when the driver has decided that the sq tail must be updated,
> > > it will use this check:
> > > 
> > >(new_idx - event_idx - 1) < (new_idx - old)
> > 
> > When eventidx is already behind, it's like:
> > 
> >  0
> >  1 <- event_idx
> >  2 <- old
> >  3 <- new_idx
> >  4
> >  .
> >  .
> >  .
> > 
> > In this case, (new_idx - event_idx - 1) = 3-1-1 = 1 >= (new_idx - old) =
> > 3-2=1, so the host won't update sq tail. Where am I wrong in this example?
> 
> That becomes 1 >= 1, i.e. "true". So this will result in the driver
> doing an mmio doorbell write.

The code is:

static inline int nvme_dbbuf_need_event(u16 event_idx, u16 new_idx, u16 old)
 
{   
 
return (u16)(new_idx - event_idx - 1) < (u16)(new_idx - old);   
 
}   
 

which per the above is "return 1 < 1;", or false. So the above case does *not*
do an mmio write. No?

regards
john

[PATCH 02/13] test-bdrv-drain: Don't yield in .bdrv_co_drained_begin/end()

2022-11-08 Thread Kevin Wolf

We want to change .bdrv_co_drained_begin/end() back to be non-coroutine
callbacks, so in preparation, avoid yielding in their implementation.

This does almost the same as the existing logic in bdrv_drain_invoke(),
by creating and entering coroutines internally. However, since the test
case is by far the heaviest user of coroutine code in drain callbacks,
it is preferable to have the complexity in the test case rather than the
drain core, which is already complicated enough without this.

The behaviour for bdrv_drain_begin() is unchanged because we increase
bs->in_flight and this is still polled. However, bdrv_drain_end()
doesn't wait for the spawned coroutine to complete any more. This is
fine, we don't rely on bdrv_drain_end() restarting all operations
immediately before the next aio_poll().

Signed-off-by: Kevin Wolf 
---
 tests/unit/test-bdrv-drain.c | 64 ++--
 1 file changed, 46 insertions(+), 18 deletions(-)

diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
index 09dc4a4891..24f34e24ad 100644
--- a/tests/unit/test-bdrv-drain.c
+++ b/tests/unit/test-bdrv-drain.c
@@ -38,12 +38,22 @@ typedef struct BDRVTestState {
 bool sleep_in_drain_begin;
 } BDRVTestState;
 
+static void coroutine_fn sleep_in_drain_begin(void *opaque)
+{
+BlockDriverState *bs = opaque;
+
+qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 10);
+bdrv_dec_in_flight(bs);
+}
+
 static void coroutine_fn bdrv_test_co_drain_begin(BlockDriverState *bs)
 {
 BDRVTestState *s = bs->opaque;
 s->drain_count++;
 if (s->sleep_in_drain_begin) {
-qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 10);
+Coroutine *co = qemu_coroutine_create(sleep_in_drain_begin, bs);
+bdrv_inc_in_flight(bs);
+aio_co_enter(bdrv_get_aio_context(bs), co);
 }
 }
 
@@ -1916,6 +1926,21 @@ static int coroutine_fn 
bdrv_replace_test_co_preadv(BlockDriverState *bs,
 return 0;
 }
 
+static void coroutine_fn bdrv_replace_test_drain_co(void *opaque)
+{
+BlockDriverState *bs = opaque;
+BDRVReplaceTestState *s = bs->opaque;
+
+/* Keep waking io_co up until it is done */
+while (s->io_co) {
+aio_co_wake(s->io_co);
+s->io_co = NULL;
+qemu_coroutine_yield();
+}
+s->drain_co = NULL;
+bdrv_dec_in_flight(bs);
+}
+
 /**
  * If .drain_count is 0, wake up .io_co if there is one; and set
  * .was_drained.
@@ -1926,20 +1951,27 @@ static void coroutine_fn 
bdrv_replace_test_co_drain_begin(BlockDriverState *bs)
 BDRVReplaceTestState *s = bs->opaque;
 
 if (!s->drain_count) {
-/* Keep waking io_co up until it is done */
-s->drain_co = qemu_coroutine_self();
-while (s->io_co) {
-aio_co_wake(s->io_co);
-s->io_co = NULL;
-qemu_coroutine_yield();
-}
-s->drain_co = NULL;
-
+s->drain_co = qemu_coroutine_create(bdrv_replace_test_drain_co, bs);
+bdrv_inc_in_flight(bs);
+aio_co_enter(bdrv_get_aio_context(bs), s->drain_co);
 s->was_drained = true;
 }
 s->drain_count++;
 }
 
+static void coroutine_fn bdrv_replace_test_read_entry(void *opaque)
+{
+BlockDriverState *bs = opaque;
+char data;
+QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, &data, 1);
+int ret;
+
+/* Queue a read request post-drain */
+ret = bdrv_replace_test_co_preadv(bs, 0, 1, &qiov, 0);
+g_assert(ret >= 0);
+bdrv_dec_in_flight(bs);
+}
+
 /**
  * Reduce .drain_count, set .was_undrained once it reaches 0.
  * If .drain_count reaches 0 and the node has a backing file, issue a
@@ -1951,17 +1983,13 @@ static void coroutine_fn 
bdrv_replace_test_co_drain_end(BlockDriverState *bs)
 
 g_assert(s->drain_count > 0);
 if (!--s->drain_count) {
-int ret;
-
 s->was_undrained = true;
 
 if (bs->backing) {
-char data;
-QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, &data, 1);
-
-/* Queue a read request post-drain */
-ret = bdrv_replace_test_co_preadv(bs, 0, 1, &qiov, 0);
-g_assert(ret >= 0);
+Coroutine *co = qemu_coroutine_create(bdrv_replace_test_read_entry,
+  bs);
+bdrv_inc_in_flight(bs);
+aio_co_enter(bdrv_get_aio_context(bs), co);
 }
 }
 }
-- 
2.38.1

[PATCH 08/13] stream: Replace subtree drain with a single node drain

2022-11-08 Thread Kevin Wolf

The subtree drain was introduced in commit b1e1af394d9 as a way to avoid
graph changes between finding the base node and changing the block graph
as necessary on completion of the image streaming job.

The block graph could change between these two points because
bdrv_set_backing_hd() first drains the parent node, which involved
polling and can do anything.

Subtree draining was an imperfect way to make this less likely (because
with it, fewer callbacks are called during this window). Everyone agreed
that it's not really the right solution, and it was only committed as a
stopgap solution.

This replaces the subtree drain with a solution that simply drains the
parent node before we try to find the base node, and then call a version
of bdrv_set_backing_hd() that doesn't drain, but just asserts that the
parent node is already drained.

This way, any graph changes caused by draining happen before we start
looking at the graph and things stay consistent between finding the base
node and changing the graph.

Signed-off-by: Kevin Wolf 
---
 include/block/block-global-state.h |  3 +++
 block.c| 17 ++---
 block/stream.c | 20 ++--
 3 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/include/block/block-global-state.h 
b/include/block/block-global-state.h
index bb42ed9559..7923415d4e 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -82,6 +82,9 @@ int bdrv_open_file_child(const char *filename,
 BlockDriverState *bdrv_open_blockdev_ref(BlockdevRef *ref, Error **errp);
 int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
 Error **errp);
+int bdrv_set_backing_hd_drained(BlockDriverState *bs,
+BlockDriverState *backing_hd,
+Error **errp);
 int bdrv_open_backing_file(BlockDriverState *bs, QDict *parent_options,
const char *bdref_key, Error **errp);
 BlockDriverState *bdrv_open(const char *filename, const char *reference,
diff --git a/block.c b/block.c
index 2f6b25875f..43b893dd6c 100644
--- a/block.c
+++ b/block.c
@@ -3395,14 +3395,15 @@ static int bdrv_set_backing_noperm(BlockDriverState *bs,
 return bdrv_set_file_or_backing_noperm(bs, backing_hd, true, tran, errp);
 }
 
-int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
-Error **errp)
+int bdrv_set_backing_hd_drained(BlockDriverState *bs,
+BlockDriverState *backing_hd,
+Error **errp)
 {
 int ret;
 Transaction *tran = tran_new();
 
 GLOBAL_STATE_CODE();
-bdrv_drained_begin(bs);
+assert(bs->quiesce_counter > 0);
 
 ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
 if (ret < 0) {
@@ -3412,7 +3413,17 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
 ret = bdrv_refresh_perms(bs, errp);
 out:
 tran_finalize(tran, ret);
+return ret;
+}
 
+int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
+Error **errp)
+{
+int ret;
+GLOBAL_STATE_CODE();
+
+bdrv_drained_begin(bs);
+ret = bdrv_set_backing_hd_drained(bs, backing_hd, errp);
 bdrv_drained_end(bs);
 
 return ret;
diff --git a/block/stream.c b/block/stream.c
index 694709bd25..81dcf5a417 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -64,13 +64,16 @@ static int stream_prepare(Job *job)
 bdrv_cor_filter_drop(s->cor_filter_bs);
 s->cor_filter_bs = NULL;
 
-bdrv_subtree_drained_begin(s->above_base);
+/*
+ * bdrv_set_backing_hd() requires that unfiltered_bs is drained. Drain
+ * already here and use bdrv_set_backing_hd_drained() instead because
+ * the polling during drained_begin() might change the graph, and if we do
+ * this only later, we may end up working with the wrong base node (or it
+ * might even have gone away by the time we want to use it).
+ */
+bdrv_drained_begin(unfiltered_bs);
 
 base = bdrv_filter_or_cow_bs(s->above_base);
-if (base) {
-bdrv_ref(base);
-}
-
 unfiltered_base = bdrv_skip_filters(base);
 
 if (bdrv_cow_child(unfiltered_bs)) {
@@ -82,7 +85,7 @@ static int stream_prepare(Job *job)
 }
 }
 
-bdrv_set_backing_hd(unfiltered_bs, base, &local_err);
+bdrv_set_backing_hd_drained(unfiltered_bs, base, &local_err);
 ret = bdrv_change_backing_file(unfiltered_bs, base_id, base_fmt, 
false);
 if (local_err) {
 error_report_err(local_err);
@@ -92,10 +95,7 @@ static int stream_prepare(Job *job)
 }
 
 out:
-if (base) {
-bdrv_unref(base);
-}
-bdrv_subtree_drained_end(s->above_base);
+bdrv_drained_end(unfiltered_bs);
 return ret;
 }
 
-- 
2.38.1

[PATCH 07/13] block: Don't use subtree drains in bdrv_drop_intermediate()

2022-11-08 Thread Kevin Wolf

Instead of using a subtree drain from the top node (which also drains
child nodes of base that we're not even interested in), use a normal
drain for base, which automatically drains all of the parents, too.

Signed-off-by: Kevin Wolf 
---
 block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 5828b970e4..2f6b25875f 100644
--- a/block.c
+++ b/block.c
@@ -5581,7 +5581,7 @@ int bdrv_drop_intermediate(BlockDriverState *top, 
BlockDriverState *base,
 GLOBAL_STATE_CODE();
 
 bdrv_ref(top);
-bdrv_subtree_drained_begin(top);
+bdrv_drained_begin(base);
 
 if (!top->drv || !base->drv) {
 goto exit;
@@ -5654,7 +5654,7 @@ int bdrv_drop_intermediate(BlockDriverState *top, 
BlockDriverState *base,
 
 ret = 0;
 exit:
-bdrv_subtree_drained_end(top);
+bdrv_drained_end(base);
 bdrv_unref(top);
 return ret;
 }
-- 
2.38.1

[PATCH 06/13] block: Drain invidual nodes during reopen

2022-11-08 Thread Kevin Wolf

bdrv_reopen() and friends use subtree drains as a lazy way of covering
all the nodes they touch. Turns out that this lazy way is a lot more
complicated than just draining the nodes individually, even not
accounting for the additional complexity in the drain mechanism itself.

Simplify the code by switching to draining the individual nodes that are
already managed in the BlockReopenQueue anyway.

Signed-off-by: Kevin Wolf 
---
 block.c | 11 ---
 block/replication.c |  6 --
 blockdev.c  | 13 -
 3 files changed, 4 insertions(+), 26 deletions(-)

diff --git a/block.c b/block.c
index 7a24bd4c36..5828b970e4 100644
--- a/block.c
+++ b/block.c
@@ -4142,7 +4142,7 @@ static bool bdrv_recurse_has_child(BlockDriverState *bs,
  * returns a pointer to bs_queue, which is either the newly allocated
  * bs_queue, or the existing bs_queue being used.
  *
- * bs must be drained between bdrv_reopen_queue() and bdrv_reopen_multiple().
+ * bs is drained here and undrained by bdrv_reopen_queue_free().
  */
 static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
  BlockDriverState *bs,
@@ -4162,12 +4162,10 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 int flags;
 QemuOpts *opts;
 
-/* Make sure that the caller remembered to use a drained section. This is
- * important to avoid graph changes between the recursive queuing here and
- * bdrv_reopen_multiple(). */
-assert(bs->quiesce_counter > 0);
 GLOBAL_STATE_CODE();
 
+bdrv_drained_begin(bs);
+
 if (bs_queue == NULL) {
 bs_queue = g_new0(BlockReopenQueue, 1);
 QTAILQ_INIT(bs_queue);
@@ -4317,6 +4315,7 @@ void bdrv_reopen_queue_free(BlockReopenQueue *bs_queue)
 if (bs_queue) {
 BlockReopenQueueEntry *bs_entry, *next;
 QTAILQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) {
+bdrv_drained_end(bs_entry->state.bs);
 qobject_unref(bs_entry->state.explicit_options);
 qobject_unref(bs_entry->state.options);
 g_free(bs_entry);
@@ -4464,7 +4463,6 @@ int bdrv_reopen(BlockDriverState *bs, QDict *opts, bool 
keep_old_opts,
 
 GLOBAL_STATE_CODE();
 
-bdrv_subtree_drained_begin(bs);
 if (ctx != qemu_get_aio_context()) {
 aio_context_release(ctx);
 }
@@ -4475,7 +4473,6 @@ int bdrv_reopen(BlockDriverState *bs, QDict *opts, bool 
keep_old_opts,
 if (ctx != qemu_get_aio_context()) {
 aio_context_acquire(ctx);
 }
-bdrv_subtree_drained_end(bs);
 
 return ret;
 }
diff --git a/block/replication.c b/block/replication.c
index f1eed25e43..c62f48a874 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -374,9 +374,6 @@ static void reopen_backing_file(BlockDriverState *bs, bool 
writable,
 s->orig_secondary_read_only = bdrv_is_read_only(secondary_disk->bs);
 }
 
-bdrv_subtree_drained_begin(hidden_disk->bs);
-bdrv_subtree_drained_begin(secondary_disk->bs);
-
 if (s->orig_hidden_read_only) {
 QDict *opts = qdict_new();
 qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
@@ -401,9 +398,6 @@ static void reopen_backing_file(BlockDriverState *bs, bool 
writable,
 aio_context_acquire(ctx);
 }
 }
-
-bdrv_subtree_drained_end(hidden_disk->bs);
-bdrv_subtree_drained_end(secondary_disk->bs);
 }
 
 static void backup_job_cleanup(BlockDriverState *bs)
diff --git a/blockdev.c b/blockdev.c
index 3f1dec6242..8ffb3d9537 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3547,8 +3547,6 @@ fail:
 void qmp_blockdev_reopen(BlockdevOptionsList *reopen_list, Error **errp)
 {
 BlockReopenQueue *queue = NULL;
-GSList *drained = NULL;
-GSList *p;
 
 /* Add each one of the BDS that we want to reopen to the queue */
 for (; reopen_list != NULL; reopen_list = reopen_list->next) {
@@ -3585,9 +3583,7 @@ void qmp_blockdev_reopen(BlockdevOptionsList 
*reopen_list, Error **errp)
 ctx = bdrv_get_aio_context(bs);
 aio_context_acquire(ctx);
 
-bdrv_subtree_drained_begin(bs);
 queue = bdrv_reopen_queue(queue, bs, qdict, false);
-drained = g_slist_prepend(drained, bs);
 
 aio_context_release(ctx);
 }
@@ -3598,15 +3594,6 @@ void qmp_blockdev_reopen(BlockdevOptionsList 
*reopen_list, Error **errp)
 
 fail:
 bdrv_reopen_queue_free(queue);
-for (p = drained; p; p = p->next) {
-BlockDriverState *bs = p->data;
-AioContext *ctx = bdrv_get_aio_context(bs);
-
-aio_context_acquire(ctx);
-bdrv_subtree_drained_end(bs);
-aio_context_release(ctx);
-}
-g_slist_free(drained);
 }
 
 void qmp_blockdev_del(const char *node_name, Error **errp)
-- 
2.38.1

Re: [PATCH] hw/arm/boot: set initrd parameters to 64bit in fdt

2022-11-08 Thread Schspa Shi

Alex Bennée  writes:

> Schspa Shi  writes:
>
>> We use 32bit value for linux,initrd-[start/end], when we have
>> loader_start > 4GB, there will be a wrong initrd_start passed
>> to the kernel, and the kernel will report the following warning.
>>
>> [0.00] [ cut here ]
>> [0.00] initrd not fully accessible via the linear mapping -- please 
>> check your bootloader ...
>> [0.00] WARNING: CPU: 0 PID: 0 at arch/arm64/mm/init.c:355 
>> arm64_memblock_init+0x158/0x244
>> [0.00] Modules linked in:
>> [0.00] CPU: 0 PID: 0 Comm: swapper Tainted: GW  
>> 6.1.0-rc3-13250-g30a0b95b1335-dirty #28
>> [0.00] Hardware name: Horizon Sigi Virtual development board
>> (DT)
>
> Is this an out-of-tree board model?
>

Yes, this is a virtual board created by myself.

>> [0.00] pstate: 60c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS 
>> BTYPE=--)
>> [0.00] pc : arm64_memblock_init+0x158/0x244
>> [0.00] lr : arm64_memblock_init+0x158/0x244
>> [0.00] sp : 89273df0
>> [0.00] x29: 89273df0 x28: 001000cc0010 x27: 
>> 8000
>> [0.00] x26: 0050a3e2 x25: 88b46000 x24: 
>> 88b46000
>> [0.00] x23: 88a53000 x22: 8942 x21: 
>> 88a53000
>> [0.00] x20: 0400 x19: 0400 x18: 
>> 1020
>> [0.00] x17: 6568632065736165 x16: 6c70202d2d20676e x15: 
>> 697070616d207261
>> [0.00] x14: 656e696c20656874 x13: 0a2e2e2e20726564 x12: 
>> 
>> [0.00] x11:  x10:  x9 : 
>> 
>> [0.00] x8 :  x7 : 796c6c756620746f x6 : 
>> 6e20647274696e69
>> [0.00] x5 : 893c7c47 x4 : 88a2102f x3 : 
>> 89273a88
>> [0.00] x2 : 8000f038 x1 : 00c0 x0 : 
>> 0056
>> [0.00] Call trace:
>> [0.00]  arm64_memblock_init+0x158/0x244
>> [0.00]  setup_arch+0x164/0x1cc
>> [0.00]  start_kernel+0x94/0x4ac
>> [0.00]  __primary_switched+0xb4/0xbc
>> [0.00] ---[ end trace  ]---
>> [0.00] Zone ranges:
>> [0.00]   DMA  [mem 0x0010-0x001007ff]
>>
>> To fix it, we can change it to u64 type.
>>
>> Signed-off-by: Schspa Shi 
>> ---
>>  hw/arm/boot.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
>> index 57efb61ee419..da719a4f8874 100644
>> --- a/hw/arm/boot.c
>> +++ b/hw/arm/boot.c
>> @@ -638,14 +638,14 @@ int arm_load_dtb(hwaddr addr, const struct 
>> arm_boot_info *binfo,
>>  }
>>
>>  if (binfo->initrd_size) {
>> -rc = qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-start",
>> +rc = qemu_fdt_setprop_u64(fdt, "/chosen", "linux,initrd-start",
>> binfo->initrd_start);
>>  if (rc < 0) {
>>  fprintf(stderr, "couldn't set /chosen/linux,initrd-start\n");
>>  goto fail;
>>  }
>>
>> -rc = qemu_fdt_setprop_cell(fdt, "/chosen", "linux,initrd-end",
>> +rc = qemu_fdt_setprop_u64(fdt, "/chosen", "linux,initrd-end",
>> binfo->initrd_start + 
>> binfo->initrd_size);
>>  if (rc < 0) {
>>  fprintf(stderr, "couldn't set /chosen/linux,initrd-end\n");
>
> On the face of things this seems fine because unlike the other linux
> properties they are not specified to be "expressed in #address-cells and
> #size-cells" but I do wonder how we got into the situation where the
> kernel and initrd ended up so high in the physical address space.
>

The reason why I faced this problem is there is no DRAM region below
4GB on the hardware. And I make this virtual board to have the same
memory layout as the hardware.

Although there is no something like #address-cells and #size-cells,
but Linux will handle the size correctly by calling the following code:

prop = of_get_flat_dt_prop(node, "linux,initrd-start", &len);

Please check the link at
Link: https://lore.kernel.org/all/20091105074724.10460.4083.stgit@angua/
Every Linux version support fdt on arm[64] platform should work without
problems.

> There is a whole comment in boot.c talking about keeping initrd within
> lowmem:
>
> /*
>  * We want to put the initrd far enough into RAM that when the
>  * kernel is uncompressed it will not clobber the initrd. However
>  * on boards without much RAM we must ensure that we still leave
>  * enough room for a decent sized initrd, and on boards with large
>  * amounts of RAM we must avoid the initrd being so far up in RAM
>  * that it is outside lowmem and inaccessible to the kernel.
>  * So for boards with less  than 256MB of RAM we put the initrd
>  * halfway into RAM, and for boards with 256MB of RAM or more we put
>  * the in

Re: [PATCH] KVM: Add system call KVM_VERIFY_MSI to verify MSI vector

2022-11-08 Thread Marc Zyngier

On Tue, 08 Nov 2022 08:08:57 +,
chenxiang  wrote:
> 
> From: Xiang Chen 
> 
> Currently the numbers of MSI vectors come from register PCI_MSI_FLAGS
> which should be power-of-2, but in some scenaries it is not the same as
> the number that driver requires in guest, for example, a PCI driver wants
> to allocate 6 MSI vecotrs in guest, but as the limitation, it will allocate
> 8 MSI vectors. So it requires 8 MSI vectors in qemu while the driver in
> guest only wants to allocate 6 MSI vectors.
>
> When GICv4.1 is enabled, we can see some exception print as following for
> above scenaro:
> vfio-pci :3a:00.1: irq bypass producer (token 8f08224d) 
> registration fails:66311
> 
> In order to verify whether a MSI vector is valid, add KVM_VERIFY_MSI to do
> that. If there is a mapping, return 0, otherwise return negative value.
> 
> This is the kernel part of adding system call KVM_VERIFY_MSI.

Exposing something that is an internal implementation detail to
userspace feels like the absolute wrong way to solve this issue.

Can you please characterise the issue you're having? Is it that vfio
tries to enable an interrupt for which there is no virtual ITS
mapping? Shouldn't we instead try and manage this in the kernel?

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.

[PATCH v2 2/5] target/riscv: Update VS timer whenever htimedelta changes

2022-11-08 Thread Anup Patel

The htimedelta[h] CSR has impact on the VS timer comparison so we
should call riscv_timer_write_timecmp() whenever htimedelta changes.

Fixes: 3ec0fe18a31f ("target/riscv: Add vstimecmp suppor")
Signed-off-by: Anup Patel 
Reviewed-by: Alistair Francis 
---
 target/riscv/csr.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 716f9d960e..4b1a608260 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -2722,6 +2722,8 @@ static RISCVException read_htimedelta(CPURISCVState *env, 
int csrno,
 static RISCVException write_htimedelta(CPURISCVState *env, int csrno,
target_ulong val)
 {
+RISCVCPU *cpu = env_archcpu(env);
+
 if (!env->rdtime_fn) {
 return RISCV_EXCP_ILLEGAL_INST;
 }
@@ -2731,6 +2733,12 @@ static RISCVException write_htimedelta(CPURISCVState 
*env, int csrno,
 } else {
 env->htimedelta = val;
 }
+
+if (cpu->cfg.ext_sstc && env->rdtime_fn) {
+riscv_timer_write_timecmp(cpu, env->vstimer, env->vstimecmp,
+  env->htimedelta, MIP_VSTIP);
+}
+
 return RISCV_EXCP_NONE;
 }
 
@@ -2748,11 +2756,19 @@ static RISCVException read_htimedeltah(CPURISCVState 
*env, int csrno,
 static RISCVException write_htimedeltah(CPURISCVState *env, int csrno,
 target_ulong val)
 {
+RISCVCPU *cpu = env_archcpu(env);
+
 if (!env->rdtime_fn) {
 return RISCV_EXCP_ILLEGAL_INST;
 }
 
 env->htimedelta = deposit64(env->htimedelta, 32, 32, (uint64_t)val);
+
+if (cpu->cfg.ext_sstc && env->rdtime_fn) {
+riscv_timer_write_timecmp(cpu, env->vstimer, env->vstimecmp,
+  env->htimedelta, MIP_VSTIP);
+}
+
 return RISCV_EXCP_NONE;
 }
 
-- 
2.34.1

[PATCH v2 0/5] Nested virtualization fixes for QEMU

2022-11-08 Thread Anup Patel

This series mainly includes fixes discovered while developing nested
virtualization running on QEMU.

These patches can also be found in the riscv_nested_fixes_v2 branch at:
https://github.com/avpatel/qemu.git

Changes since v1:
 - Added Alistair's Reviewed-by tags to appropriate patches
 - Added detailed comment block in PATCH4

Anup Patel (5):
  target/riscv: Typo fix in sstc() predicate
  target/riscv: Update VS timer whenever htimedelta changes
  target/riscv: Don't clear mask in riscv_cpu_update_mip() for VSTIP
  target/riscv: No need to re-start QEMU timer when timecmp ==
UINT64_MAX
  target/riscv: Ensure opcode is saved for all relevant instructions

 target/riscv/cpu_helper.c   |  2 --
 target/riscv/csr.c  | 18 ++-
 target/riscv/insn_trans/trans_rva.c.inc | 10 --
 target/riscv/insn_trans/trans_rvd.c.inc |  2 ++
 target/riscv/insn_trans/trans_rvf.c.inc |  2 ++
 target/riscv/insn_trans/trans_rvh.c.inc |  3 ++
 target/riscv/insn_trans/trans_rvi.c.inc |  2 ++
 target/riscv/insn_trans/trans_rvzfh.c.inc   |  2 ++
 target/riscv/insn_trans/trans_svinval.c.inc |  3 ++
 target/riscv/time_helper.c  | 36 ++---
 10 files changed, 70 insertions(+), 10 deletions(-)

-- 
2.34.1

[PATCH v2 1/5] target/riscv: Typo fix in sstc() predicate

2022-11-08 Thread Anup Patel

We should use "&&" instead of "&" when checking hcounteren.TM and
henvcfg.STCE bits.

Fixes: 3ec0fe18a31f ("target/riscv: Add vstimecmp suppor")
Signed-off-by: Anup Patel 
Reviewed-by: Alistair Francis 
---
 target/riscv/csr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 5c9a7ee287..716f9d960e 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -838,7 +838,7 @@ static RISCVException sstc(CPURISCVState *env, int csrno)
 }
 
 if (riscv_cpu_virt_enabled(env)) {
-if (!(get_field(env->hcounteren, COUNTEREN_TM) &
+if (!(get_field(env->hcounteren, COUNTEREN_TM) &&
   get_field(env->henvcfg, HENVCFG_STCE))) {
 return RISCV_EXCP_VIRT_INSTRUCTION_FAULT;
 }
-- 
2.34.1

[PATCH v2 3/5] target/riscv: Don't clear mask in riscv_cpu_update_mip() for VSTIP

2022-11-08 Thread Anup Patel

Instead of clearing mask in riscv_cpu_update_mip() for VSTIP, we
should call riscv_cpu_update_mip() with mask == 0 from timer_helper.c
for VSTIP.

Fixes: 3ec0fe18a31f ("target/riscv: Add vstimecmp suppor")
Signed-off-by: Anup Patel 
Reviewed-by: Alistair Francis 
---
 target/riscv/cpu_helper.c  |  2 --
 target/riscv/time_helper.c | 12 
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 5d66246c2c..a403825e49 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -617,8 +617,6 @@ uint64_t riscv_cpu_update_mip(RISCVCPU *cpu, uint64_t mask, 
uint64_t value)
 vsgein = (env->hgeip & (1ULL << gein)) ? MIP_VSEIP : 0;
 }
 
-/* No need to update mip for VSTIP */
-mask = ((mask == MIP_VSTIP) && env->vstime_irq) ? 0 : mask;
 vstip = env->vstime_irq ? MIP_VSTIP : 0;
 
 if (!qemu_mutex_iothread_locked()) {
diff --git a/target/riscv/time_helper.c b/target/riscv/time_helper.c
index 8cce667dfd..4fb2a471a9 100644
--- a/target/riscv/time_helper.c
+++ b/target/riscv/time_helper.c
@@ -27,7 +27,7 @@ static void riscv_vstimer_cb(void *opaque)
 RISCVCPU *cpu = opaque;
 CPURISCVState *env = &cpu->env;
 env->vstime_irq = 1;
-riscv_cpu_update_mip(cpu, MIP_VSTIP, BOOL_TO_MASK(1));
+riscv_cpu_update_mip(cpu, 0, BOOL_TO_MASK(1));
 }
 
 static void riscv_stimer_cb(void *opaque)
@@ -57,16 +57,20 @@ void riscv_timer_write_timecmp(RISCVCPU *cpu, QEMUTimer 
*timer,
  */
 if (timer_irq == MIP_VSTIP) {
 env->vstime_irq = 1;
+riscv_cpu_update_mip(cpu, 0, BOOL_TO_MASK(1));
+} else {
+riscv_cpu_update_mip(cpu, MIP_STIP, BOOL_TO_MASK(1));
 }
-riscv_cpu_update_mip(cpu, timer_irq, BOOL_TO_MASK(1));
 return;
 }
 
+/* Clear the [VS|S]TIP bit in mip */
 if (timer_irq == MIP_VSTIP) {
 env->vstime_irq = 0;
+riscv_cpu_update_mip(cpu, 0, BOOL_TO_MASK(0));
+} else {
+riscv_cpu_update_mip(cpu, timer_irq, BOOL_TO_MASK(0));
 }
-/* Clear the [V]STIP bit in mip */
-riscv_cpu_update_mip(cpu, timer_irq, BOOL_TO_MASK(0));
 
 /* otherwise, set up the future timer interrupt */
 diff = timecmp - rtc_r;
-- 
2.34.1

[PATCH v2 5/5] target/riscv: Ensure opcode is saved for all relevant instructions

2022-11-08 Thread Anup Patel

We should call decode_save_opc() for all relevant instructions which
can potentially generate a virtual instruction fault or a guest page
fault because generating transformed instruction upon guest page fault
expects opcode to be available. Without this, hypervisor will see
transformed instruction as zero in htinst CSR for guest MMIO emulation
which makes MMIO emulation in hypervisor slow and also breaks nested
virtualization.

Fixes: a9814e3e08d2 ("target/riscv: Minimize the calls to decode_save_opc")
Signed-off-by: Anup Patel 
---
 target/riscv/insn_trans/trans_rva.c.inc | 10 +++---
 target/riscv/insn_trans/trans_rvd.c.inc |  2 ++
 target/riscv/insn_trans/trans_rvf.c.inc |  2 ++
 target/riscv/insn_trans/trans_rvh.c.inc |  3 +++
 target/riscv/insn_trans/trans_rvi.c.inc |  2 ++
 target/riscv/insn_trans/trans_rvzfh.c.inc   |  2 ++
 target/riscv/insn_trans/trans_svinval.c.inc |  3 +++
 7 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rva.c.inc 
b/target/riscv/insn_trans/trans_rva.c.inc
index 45db82c9be..5f194a447b 100644
--- a/target/riscv/insn_trans/trans_rva.c.inc
+++ b/target/riscv/insn_trans/trans_rva.c.inc
@@ -20,8 +20,10 @@
 
 static bool gen_lr(DisasContext *ctx, arg_atomic *a, MemOp mop)
 {
-TCGv src1 = get_address(ctx, a->rs1, 0);
+TCGv src1;
 
+decode_save_opc(ctx);
+src1 = get_address(ctx, a->rs1, 0);
 if (a->rl) {
 tcg_gen_mb(TCG_MO_ALL | TCG_BAR_STRL);
 }
@@ -43,6 +45,7 @@ static bool gen_sc(DisasContext *ctx, arg_atomic *a, MemOp 
mop)
 TCGLabel *l1 = gen_new_label();
 TCGLabel *l2 = gen_new_label();
 
+decode_save_opc(ctx);
 src1 = get_address(ctx, a->rs1, 0);
 tcg_gen_brcond_tl(TCG_COND_NE, load_res, src1, l1);
 
@@ -81,9 +84,10 @@ static bool gen_amo(DisasContext *ctx, arg_atomic *a,
 MemOp mop)
 {
 TCGv dest = dest_gpr(ctx, a->rd);
-TCGv src1 = get_address(ctx, a->rs1, 0);
-TCGv src2 = get_gpr(ctx, a->rs2, EXT_NONE);
+TCGv src1, src2 = get_gpr(ctx, a->rs2, EXT_NONE);
 
+decode_save_opc(ctx);
+src1 = get_address(ctx, a->rs1, 0);
 func(dest, src1, src2, ctx->mem_idx, mop);
 
 gen_set_gpr(ctx, a->rd, dest);
diff --git a/target/riscv/insn_trans/trans_rvd.c.inc 
b/target/riscv/insn_trans/trans_rvd.c.inc
index 1397c1ce1c..6e3159b797 100644
--- a/target/riscv/insn_trans/trans_rvd.c.inc
+++ b/target/riscv/insn_trans/trans_rvd.c.inc
@@ -38,6 +38,7 @@ static bool trans_fld(DisasContext *ctx, arg_fld *a)
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVD);
 
+decode_save_opc(ctx);
 addr = get_address(ctx, a->rs1, a->imm);
 tcg_gen_qemu_ld_i64(cpu_fpr[a->rd], addr, ctx->mem_idx, MO_TEUQ);
 
@@ -52,6 +53,7 @@ static bool trans_fsd(DisasContext *ctx, arg_fsd *a)
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVD);
 
+decode_save_opc(ctx);
 addr = get_address(ctx, a->rs1, a->imm);
 tcg_gen_qemu_st_i64(cpu_fpr[a->rs2], addr, ctx->mem_idx, MO_TEUQ);
 return true;
diff --git a/target/riscv/insn_trans/trans_rvf.c.inc 
b/target/riscv/insn_trans/trans_rvf.c.inc
index a1d3eb52ad..965e1f8d11 100644
--- a/target/riscv/insn_trans/trans_rvf.c.inc
+++ b/target/riscv/insn_trans/trans_rvf.c.inc
@@ -38,6 +38,7 @@ static bool trans_flw(DisasContext *ctx, arg_flw *a)
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVF);
 
+decode_save_opc(ctx);
 addr = get_address(ctx, a->rs1, a->imm);
 dest = cpu_fpr[a->rd];
 tcg_gen_qemu_ld_i64(dest, addr, ctx->mem_idx, MO_TEUL);
@@ -54,6 +55,7 @@ static bool trans_fsw(DisasContext *ctx, arg_fsw *a)
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVF);
 
+decode_save_opc(ctx);
 addr = get_address(ctx, a->rs1, a->imm);
 tcg_gen_qemu_st_i64(cpu_fpr[a->rs2], addr, ctx->mem_idx, MO_TEUL);
 return true;
diff --git a/target/riscv/insn_trans/trans_rvh.c.inc 
b/target/riscv/insn_trans/trans_rvh.c.inc
index 4f8aecddc7..9248b48c36 100644
--- a/target/riscv/insn_trans/trans_rvh.c.inc
+++ b/target/riscv/insn_trans/trans_rvh.c.inc
@@ -36,6 +36,7 @@ static bool do_hlv(DisasContext *ctx, arg_r2 *a, MemOp mop)
 #ifdef CONFIG_USER_ONLY
 return false;
 #else
+decode_save_opc(ctx);
 if (check_access(ctx)) {
 TCGv dest = dest_gpr(ctx, a->rd);
 TCGv addr = get_gpr(ctx, a->rs1, EXT_NONE);
@@ -82,6 +83,7 @@ static bool do_hsv(DisasContext *ctx, arg_r2_s *a, MemOp mop)
 #ifdef CONFIG_USER_ONLY
 return false;
 #else
+decode_save_opc(ctx);
 if (check_access(ctx)) {
 TCGv addr = get_gpr(ctx, a->rs1, EXT_NONE);
 TCGv data = get_gpr(ctx, a->rs2, EXT_NONE);
@@ -135,6 +137,7 @@ static bool trans_hsv_d(DisasContext *ctx, arg_hsv_d *a)
 static bool do_hlvx(DisasContext *ctx, arg_r2 *a,
 void (*func)(TCGv, TCGv_env, TCGv))
 {
+decode_save_opc(ctx);
 if (check_access(ctx)) {
 TCGv dest = dest_gpr(ctx, a->rd);
 TCGv addr = get_gpr(ctx, a->rs1, EXT_NONE);
diff --git a/target/riscv/insn_trans/trans_rvi.c.inc 
b/target/riscv/in

[PATCH v2 4/5] target/riscv: No need to re-start QEMU timer when timecmp == UINT64_MAX

2022-11-08 Thread Anup Patel

The time CSR will wrap-around immediately after reaching UINT64_MAX
so we don't need to re-start QEMU timer when timecmp == UINT64_MAX
in riscv_timer_write_timecmp().

Signed-off-by: Anup Patel 
Reviewed-by: Alistair Francis 
---
 target/riscv/time_helper.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/target/riscv/time_helper.c b/target/riscv/time_helper.c
index 4fb2a471a9..b654f91af9 100644
--- a/target/riscv/time_helper.c
+++ b/target/riscv/time_helper.c
@@ -72,6 +72,30 @@ void riscv_timer_write_timecmp(RISCVCPU *cpu, QEMUTimer 
*timer,
 riscv_cpu_update_mip(cpu, timer_irq, BOOL_TO_MASK(0));
 }
 
+/*
+ * Sstc specification says the following about timer interrupt:
+ * "A supervisor timer interrupt becomes pending - as reflected in
+ * the STIP bit in the mip and sip registers - whenever time contains
+ * a value greater than or equal to stimecmp, treating the values
+ * as unsigned integers. Writes to stimecmp are guaranteed to be
+ * reflected in STIP eventually, but not necessarily immediately.
+ * The interrupt remains posted until stimecmp becomes greater
+ * than time - typically as a result of writing stimecmp."
+ *
+ * When timecmp = UINT64_MAX, the time CSR will eventually reach
+ * timecmp value but on next timer tick the time CSR will wrap-around
+ * and become zero which is less than UINT64_MAX. Now, the timer
+ * interrupt behaves like a level triggered interrupt so it will
+ * become 1 when time = timecmp = UINT64_MAX and next timer tick
+ * it will become 0 again because time = 0 < timecmp = UINT64_MAX.
+ *
+ * Based on above, we don't re-start the QEMU timer when timecmp
+ * equals UINT64_MAX.
+ */
+if (timecmp == UINT64_MAX) {
+return;
+}
+
 /* otherwise, set up the future timer interrupt */
 diff = timecmp - rtc_r;
 /* back to ns (note args switched in muldiv64) */
-- 
2.34.1

Re: [PULL v3 3/7] hw/loongarch: Load FDT table into dram memory space

2022-11-08 Thread gaosong




在 2022/11/8 下午6:41, Richard Henderson 写道:

On 11/5/22 14:28, Song Gao wrote:

From: Xiaojuan Yang 

Load FDT table into dram memory space, and the addr is 2 MiB.
Since lowmem region starts from 0, FDT base address is located
at 2 MiB to avoid NULL pointer access.

Signed-off-by: Xiaojuan Yang 
Acked-by: Song Gao 
Message-Id: <20221028014007.2718352-2-yangxiaoj...@loongson.cn>
Signed-off-by: Song Gao 
---
  hw/loongarch/virt.c | 18 +++---
  include/hw/loongarch/virt.h |  3 ---
  2 files changed, 11 insertions(+), 10 deletions(-)


This breaks make check-tcg:

  TEST    hello on loongarch64
qemu-system-loongarch64: Some ROM regions are overlapping
These ROM regions might have been loaded by direct user request or by 
default.
They could be BIOS/firmware images, a guest kernel, initrd or some 
other file loaded into guest memory.
Check whether you intended to load all this guest code, and whether it 
has been built to load to the correct addresses.


The following two regions overlap (in the memory address space):
  hello ELF program header segment 0 (addresses 0x0020 - 
0x00242000)

  fdt (addresses 0x0020 - 0x0030)
make[1]: *** [Makefile:177: run-hello] Error 1



Thank you,  I had send a patch to fix this.

Thanks.
Song Gao




diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 4b595a9ea4..50e9829a94 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -159,7 +159,6 @@ static void fdt_add_pcie_node(const 
LoongArchMachineState *lams)

   1, FDT_PCI_RANGE_MMIO, 2, base_mmio,
   2, base_mmio, 2, size_mmio);
  g_free(nodename);
-    qemu_fdt_dumpdtb(ms->fdt, lams->fdt_size);
  }
    static void fdt_add_irqchip_node(LoongArchMachineState *lams)
@@ -656,6 +655,7 @@ static void loongarch_init(MachineState *machine)
  MemoryRegion *address_space_mem = get_system_memory();
  LoongArchMachineState *lams = LOONGARCH_MACHINE(machine);
  int i;
+    hwaddr fdt_base;
    if (!cpu_model) {
  cpu_model = LOONGARCH_CPU_TYPE_NAME("la464");
@@ -760,12 +760,16 @@ static void loongarch_init(MachineState *machine)
  lams->machine_done.notify = virt_machine_done;
qemu_add_machine_init_done_notifier(&lams->machine_done);
  fdt_add_pcie_node(lams);
-
-    /* load fdt */
-    MemoryRegion *fdt_rom = g_new(MemoryRegion, 1);
-    memory_region_init_rom(fdt_rom, NULL, "fdt", VIRT_FDT_SIZE, 
&error_fatal);
-    memory_region_add_subregion(get_system_memory(), VIRT_FDT_BASE, 
fdt_rom);
-    rom_add_blob_fixed("fdt", machine->fdt, lams->fdt_size, 
VIRT_FDT_BASE);

+    /*
+ * Since lowmem region starts from 0, FDT base address is located
+ * at 2 MiB to avoid NULL pointer access.
+ *
+ * Put the FDT into the memory map as a ROM image: this will ensure
+ * the FDT is copied again upon reset, even if addr points into 
RAM.

+ */
+    fdt_base = 2 * MiB;
+    qemu_fdt_dumpdtb(machine->fdt, lams->fdt_size);
+    rom_add_blob_fixed("fdt", machine->fdt, lams->fdt_size, fdt_base);
  }
    bool loongarch_is_acpi_enabled(LoongArchMachineState *lams)
diff --git a/include/hw/loongarch/virt.h b/include/hw/loongarch/virt.h
index 09f1c88ee5..45c383f5a7 100644
--- a/include/hw/loongarch/virt.h
+++ b/include/hw/loongarch/virt.h
@@ -28,9 +28,6 @@
  #define VIRT_GED_MEM_ADDR   (VIRT_GED_EVT_ADDR + 
ACPI_GED_EVT_SEL_LEN)
  #define VIRT_GED_REG_ADDR   (VIRT_GED_MEM_ADDR + 
MEMORY_HOTPLUG_IO_LEN)

  -#define VIRT_FDT_BASE   0x1c40
-#define VIRT_FDT_SIZE   0x10
-
  struct LoongArchMachineState {
  /*< private >*/
  MachineState parent_obj;

[PATCH 1/1] target/loongarch: Fix loongarch fdt addr confict

2022-11-08 Thread Song Gao

Fix LoongArch check-tcg error:
  TESThello on loongarch64
qemu-system-loongarch64: Some ROM regions are overlapping
These ROM regions might have been loaded by direct user request or by default.
They could be BIOS/firmware images, a guest kernel, initrd or some other file 
loaded into guest memory.
Check whether you intended to load all this guest code, and whether it has been 
built to load to the correct addresses.

The following two regions overlap (in the memory address space):
  hello ELF program header segment 0 (addresses 0x0020 - 
0x00242000)
  fdt (addresses 0x0020 - 0x0030)
make[1]: *** [Makefile:177: run-hello] Error 1

Reported-by: Richard Henderson 
Signed-off-by: Song Gao 
---
 tests/tcg/loongarch64/system/kernel.ld |  7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/tests/tcg/loongarch64/system/kernel.ld 
b/tests/tcg/loongarch64/system/kernel.ld
index f1a7c0168c..2110cfe8be 100644
--- a/tests/tcg/loongarch64/system/kernel.ld
+++ b/tests/tcg/loongarch64/system/kernel.ld
@@ -2,8 +2,11 @@ ENTRY(_start)
 
 SECTIONS
 {
-/* Linux kernel legacy start address.  */
-. = 0x9020;
+/*
+ * Linux kernel legacy start address.
+ * FDT is load at 0x20, kernel image size must be smaller than 1M
+ */
+. = 0x10;
 _text = .;
 .text : {
 *(.text)
-- 
2.31.1

Re: [PATCH 1/1] target/loongarch: Fix loongarch fdt addr confict

2022-11-08 Thread Richard Henderson


On 11/9/22 00:02, Song Gao wrote:

Fix LoongArch check-tcg error:
   TESThello on loongarch64
qemu-system-loongarch64: Some ROM regions are overlapping
These ROM regions might have been loaded by direct user request or by default.
They could be BIOS/firmware images, a guest kernel, initrd or some other file 
loaded into guest memory.
Check whether you intended to load all this guest code, and whether it has been 
built to load to the correct addresses.

The following two regions overlap (in the memory address space):
   hello ELF program header segment 0 (addresses 0x0020 - 
0x00242000)
   fdt (addresses 0x0020 - 0x0030)
make[1]: *** [Makefile:177: run-hello] Error 1

Reported-by: Richard Henderson 
Signed-off-by: Song Gao 
---
  tests/tcg/loongarch64/system/kernel.ld |  7 +--
  1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/tests/tcg/loongarch64/system/kernel.ld 
b/tests/tcg/loongarch64/system/kernel.ld
index f1a7c0168c..2110cfe8be 100644
--- a/tests/tcg/loongarch64/system/kernel.ld
+++ b/tests/tcg/loongarch64/system/kernel.ld
@@ -2,8 +2,11 @@ ENTRY(_start)
  
  SECTIONS

  {
-/* Linux kernel legacy start address.  */
-. = 0x9020;
+/*
+ * Linux kernel legacy start address.
+ * FDT is load at 0x20, kernel image size must be smaller than 1M
+ */
+. = 0x10;


Or start above the fdt at 3M, to avoid that limitation?
The comment about the Linux kernel start address no longer applies.

Either way,
Reviewed-by: Richard Henderson 

r~

Re: [PULL v3 00/81] pci,pc,virtio: features, tests, fixes, cleanups

2022-11-08 Thread Igor Mammedov

On Mon, 7 Nov 2022 07:30:03 -0500
"Michael S. Tsirkin"  wrote:

> On Mon, Nov 07, 2022 at 05:43:44AM -0500, Stefan Hajnoczi wrote:
> > Hi Michael and Igor,
> > Looks like the ACPI commits broken the virtio-vga module:
> >   
> > >>> QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=60 
> > >>> G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh
> > >>>  QTEST_QEMU_BINARY=./qemu-system-or1k 
> > >>> QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon 
> > >>> /builds/qemu-project/qemu/build/tests/qtest/device-introspect-test 
> > >>> --tap -k  
> > ― ✀ 
> > ―
> > stderr:
> > failed to open module:
> > /builds/qemu-project/qemu/build/qemu-bundle/usr/local/lib64/qemu/hw-display-virtio-vga.so:
> > undefined symbol: aml_return
> > qemu-system-or1k: ../util/error.c:59: error_setv: Assertion `*errp ==
> > NULL' failed.
> > Broken pipe
> > ../tests/qtest/libqtest.c:188: kill_qemu() detected QEMU death from
> > signal 6 (Aborted) (core dumped)
> > TAP parsing error: Too few tests run (expected 6, got 0)
> > (test program exited with status code -6)
> > ――
> > 154/274 qemu:qtest+qtest-or1k / qtest-or1k/machine-none-test OK 0.05s
> > 1 subtests passed
> > 155/274 qemu:qtest+qtest-or1k / qtest-or1k/qmp-test OK 0.19s 4 subtests 
> > passed
> > 156/274 qemu:qtest+qtest-or1k / qtest-or1k/qmp-cmd-test ERROR 1.72s
> > killed by signal 6 SIGABRT  
> > >>> QTEST_QEMU_IMG=./qemu-img 
> > >>> G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh
> > >>>  QTEST_QEMU_BINARY=./qemu-system-or1k 
> > >>> QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon 
> > >>> MALLOC_PERTURB_=53 
> > >>> /builds/qemu-project/qemu/build/tests/qtest/qmp-cmd-test --tap -k  
> > ― ✀ 
> > ―
> > stderr:
> > failed to open module:
> > /builds/qemu-project/qemu/build/qemu-bundle/usr/local/lib64/qemu/hw-display-virtio-vga.so:
> > undefined symbol: aml_return
> > qemu-system-or1k: ../util/error.c:59: error_setv: Assertion `*errp ==
> > NULL' failed.
> > Broken pipe
> > ../tests/qtest/libqtest.c:188: kill_qemu() detected QEMU death from
> > signal 6 (Aborted) (core dumped)
> > TAP parsing error: Too few tests run (expected 62, got 31)
> > (test program exited with status code -6)
> > ――
> > 
> > https://gitlab.com/qemu-project/qemu/-/jobs/3281425457
> > 
> > Stefan  
> 
> 
> Hmm it passed for me:
> 
> https://gitlab.com/mstredhat/qemu/-/jobs/3279401710

I'm sorry, I was 'out of service' yesterday.
I see that Ani already fixed the issue, thanks!

> 
> Igor you did make a change around VGA:
> 
> commit 03d525c27ab0b268cf375d8823f05e91509222b8
> Author: Igor Mammedov 
> Date:   Mon Oct 17 12:21:36 2022 +0200
> 
> acpi: pc: vga: use AcpiDevAmlIf interface to build VGA device descriptors
> 
> Signed-off-by: Igor Mammedov 
> Message-Id: <20221017102146.2254096-2-imamm...@redhat.com>
> Reviewed-by: Michael S. Tsirkin 
> Signed-off-by: Michael S. Tsirkin 
> NB: we do not expect any functional change in
> any ACPI tables with this change. It's only a refactoring.
> 
> Reviewed-by: Ani Sinha 
> 
> 
> can you take a look pls?
> How bad is it if I drop that patch?

It can't be dropped without dropping whole series
which heavily depends on it.

Re: [PULL v4 45/83] tests: acpi: whitelist DSDT before generating PCI-ISA bridge AML automatically

2022-11-08 Thread Igor Mammedov

On Mon, 7 Nov 2022 17:51:11 -0500
"Michael S. Tsirkin"  wrote:

> From: Igor Mammedov 
> 
> Signed-off-by: Igor Mammedov 
> Message-Id: <20221017102146.2254096-3-imamm...@redhat.com>
> Reviewed-by: Michael S. Tsirkin 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  tests/qtest/bios-tables-test-allowed-diff.h | 34 +
>  1 file changed, 34 insertions(+)
> 
> diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
> b/tests/qtest/bios-tables-test-allowed-diff.h
> index dfb8523c8b..570b17478e 100644
> --- a/tests/qtest/bios-tables-test-allowed-diff.h
> +++ b/tests/qtest/bios-tables-test-allowed-diff.h
> @@ -1 +1,35 @@
>  /* List of comma-separated changed AML files to ignore */
> +"tests/data/acpi/pc/DSDT",
> +"tests/data/acpi/pc/DSDT.acpierst",
> +"tests/data/acpi/pc/DSDT.acpihmat",
> +"tests/data/acpi/pc/DSDT.bridge",
> +"tests/data/acpi/pc/DSDT.cphp",
> +"tests/data/acpi/pc/DSDT.dimmpxm",
> +"tests/data/acpi/pc/DSDT.hpbridge",
> +"tests/data/acpi/pc/DSDT.hpbrroot",
> +"tests/data/acpi/pc/DSDT.ipmikcs",
> +"tests/data/acpi/pc/DSDT.memhp",
> +"tests/data/acpi/pc/DSDT.nohpet",
> +"tests/data/acpi/pc/DSDT.numamem",
> +"tests/data/acpi/pc/DSDT.roothp",
> +"tests/data/acpi/q35/DSDT",
> +"tests/data/acpi/q35/DSDT.acpierst",
> +"tests/data/acpi/q35/DSDT.acpihmat",
> +"tests/data/acpi/q35/DSDT.applesmc",
> +"tests/data/acpi/q35/DSDT.bridge",
> +"tests/data/acpi/q35/DSDT.cphp",
> +"tests/data/acpi/q35/DSDT.cxl",
> +"tests/data/acpi/q35/DSDT.dimmpxm",
> +"tests/data/acpi/q35/DSDT.ipmibt",
> +"tests/data/acpi/q35/DSDT.ipmismbus",
> +"tests/data/acpi/q35/DSDT.ivrs",
> +"tests/data/acpi/q35/DSDT.memhp",
> +"tests/data/acpi/q35/DSDT.mmio64",
> +"tests/data/acpi/q35/DSDT.multi-bridge",
> +"tests/data/acpi/q35/DSDT.nohpet",
> +"tests/data/acpi/q35/DSDT.numamem",
> +"tests/data/acpi/q35/DSDT.pvpanic-isa",
> +"tests/data/acpi/q35/DSDT.tis.tpm12",
> +"tests/data/acpi/q35/DSDT.tis.tpm2",
> +"tests/data/acpi/q35/DSDT.viot",
> +"tests/data/acpi/q35/DSDT.xapic",

still missing DSDT.count2 table, likely in other updates (as well)
which should break bisection if not whole pull request.

I'll prep a tree based on your pull req, with fixups
for you to pull from.

Re: [PATCH] hw/arm/boot: set initrd parameters to 64bit in fdt

2022-11-08 Thread Peter Maydell

On Tue, 8 Nov 2022 at 12:52, Schspa Shi  wrote:
> Alex Bennée  writes:
> > There is a whole comment in boot.c talking about keeping initrd within
> > lowmem:
> >
> > /*
> >  * We want to put the initrd far enough into RAM that when the
> >  * kernel is uncompressed it will not clobber the initrd. However
> >  * on boards without much RAM we must ensure that we still leave
> >  * enough room for a decent sized initrd, and on boards with large
> >  * amounts of RAM we must avoid the initrd being so far up in RAM
> >  * that it is outside lowmem and inaccessible to the kernel.
> >  * So for boards with less  than 256MB of RAM we put the initrd
> >  * halfway into RAM, and for boards with 256MB of RAM or more we put
> >  * the initrd at 128MB.
> >  * We also refuse to put the initrd somewhere that will definitely
> >  * overlay the kernel we just loaded, though for kernel formats which
> >  * don't tell us their exact size (eg self-decompressing 32-bit kernels)
> >  * we might still make a bad choice here.
> >  */
> >
>
> I think this lowmem does not mean below 4GB. and it is to make sure
> the initrd_start > memblock_start_of_DRAM for Linux address range check.

The wording of this comment pre-dates 64-bit CPU support: it
is talking about the requirement in the 32-bit booting doc
https://www.kernel.org/doc/Documentation/arm/Booting
that says
"If an initramfs is in use then, as with the dtb, it must be placed in
a region of memory where the kernel decompressor will not overwrite it
while also with the region which will be covered by the kernel's
low-memory mapping."

So it does mean "below 4GB", because you can't boot a 32-bit kernel
if you don't put the kernel, initrd, etc below 4GB.

thanks
-- PMM

Re: [QEMU][PATCH v2 5/5] MAINTAINERS: Include canfd tests under Xilinx CAN

2022-11-08 Thread Francisco Iglesias

On [2022 Oct 21] Fri 22:47:46, Vikram Garhwal wrote:
> Signed-off-by: Vikram Garhwal 
> Reviewed-by: Peter Maydell 

Reviewed-by: Francisco Iglesias 

> ---
>  MAINTAINERS | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 538af2885c..a642026361 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1756,7 +1756,7 @@ M: Francisco Iglesias 
>  S: Maintained
>  F: hw/net/can/xlnx-*
>  F: include/hw/net/xlnx-*
> -F: tests/qtest/xlnx-can-test*
> +F: tests/qtest/xlnx-can*-test*
>  
>  EDU
>  M: Jiri Slaby 
> -- 
> 2.17.1
> 
>

Re: [PULL 00/55] MIPS patches for 2022-10-30

2022-11-08 Thread Peter Maydell

On Sun, 30 Oct 2022 at 22:29, Philippe Mathieu-Daudé  wrote:
>
> The following changes since commit 344744e148e6e865f5a57e745b02a87e5ea534ad:
>
>   Merge tag 'dump-pull-request' of https://gitlab.com/marcandre.lureau/qemu 
> into staging (2022-10-26 10:53:49 -0400)
>
> are available in the Git repository at:
>
>   https://github.com/philmd/qemu.git tags/mips-20221030
>
> for you to fetch changes up to 487099aee951e4966936acd3e9afd24c69de85ea:
>
>   hw/mips/malta: Use bootloader helper to set BAR registers (2022-10-30 
> 23:08:10 +0100)
>
> 
> MIPS patches queue
>
> - Convert nanoMIPS disassembler from C++ to C (Milica Lazarevic)

Was this the last use of C++ in the tree, or am I forgetting
some other part that still needs the C++ compiler?

If it is the last thing, we should put in the "Build Dependencies"
part of the release notes that a C++ compiler is no longer required
and mention that the configure options to specify it will go away in
a future release.

thanks
-- PMM

[PATCH v3 0/2] util/log: Always send errors to logfile when daemonized

2022-11-08 Thread Greg Kurz

When QEMU is started with `--daemonize -D ${logfile} -d ${some_log_item}`,
error logs from error_report() and friends go to ${logfile}, but if QEMU
is started with `-daemonize -D ${logfile}` and no `-d`, the file isn't
even created and all logs go to /dev/null.

This inconsistency is quite confusing for users and gives the impression
that QEMU doesn't log errors at all. It seems much saner to always create
the log file when `-D` was passed and to be able to report errors.

It was spotted by the kata-containers project, which happens to do just
that `--daemonize -D` without `-d` trick.

v3:
- drop log_append (Paolo's patch)
- new approach : call qemu_log_trylock() from qemu_set_log_internal() in
  the per-thread case, instead of trying to special case the main thread

v2:
- new log_thread_id() implementation for hosts without gettid() syscall
- avoid conflict between global log file and per-thread logfile
- style improvements

Greg Kurz (1):
  util/log: Always send errors to logfile when daemonized

Paolo Bonzini (1):
  util/log: do not close and reopen log files when flags are turned off

 util/log.c | 84 +-
 1 file changed, 58 insertions(+), 26 deletions(-)

-- 
2.38.1

[PATCH v3 1/2] util/log: do not close and reopen log files when flags are turned off

2022-11-08 Thread Greg Kurz

From: Paolo Bonzini 

log_append makes sure that if you turn off the logging (which clears
log_flags and makes need_to_open_file false) the old log is not
overwritten.  The usecase is that if you remove or move the file
QEMU will not keep writing to the old file.  However, this is
not always the desited behavior, in particular having log_append==1
after changing the file name makes little sense.

When qemu_set_log_internal is called from the logfile monitor
command, filename must be non-NULL and therefore changed_name must
be true.  Therefore, the only case where the file is closed and
need_to_open_file == false is indeed when log_flags becomes
zero.  In this case, just flush the file and do not bother
closing it, thus faking the same append behavior as previously.

The behavioral change is that changing the logfile twice, for
example log1 -> log2 -> log1, will cause log1 to be overwritten.
This can simply be documented, since it is not a particularly
surprising behavior.

Suggested-by: Alex Bennée 
Signed-off-by: Paolo Bonzini 
Reviewed-by: Richard Henderson 
Reviewed-by: Greg Kurz 
Message-Id: <20221025092119.236224-1-pbonz...@redhat.com>
[groug: nullify global_file before actually closing the file]
Signed-off-by: Greg Kurz 
---
 util/log.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/util/log.c b/util/log.c
index c2198badf240..fb843453dd49 100644
--- a/util/log.c
+++ b/util/log.c
@@ -45,7 +45,6 @@ static __thread FILE *thread_file;
 static __thread Notifier qemu_log_thread_cleanup_notifier;
 
 int qemu_loglevel;
-static bool log_append;
 static bool log_per_thread;
 static GArray *debug_regions;
 
@@ -277,19 +276,20 @@ static bool qemu_set_log_internal(const char *filename, 
bool changed_name,
 daemonized = is_daemonized();
 need_to_open_file = log_flags && !per_thread && (!daemonized || filename);
 
-if (logfile && (!need_to_open_file || changed_name)) {
-qatomic_rcu_set(&global_file, NULL);
-if (logfile != stderr) {
+if (logfile) {
+fflush(logfile);
+if (changed_name && logfile != stderr) {
 RCUCloseFILE *r = g_new0(RCUCloseFILE, 1);
 r->fd = logfile;
+qatomic_rcu_set(&global_file, NULL);
 call_rcu(r, rcu_close_file, rcu);
+logfile = NULL;
 }
-logfile = NULL;
 }
 
 if (!logfile && need_to_open_file) {
 if (filename) {
-logfile = fopen(filename, log_append ? "a" : "w");
+logfile = fopen(filename, "w");
 if (!logfile) {
 error_setg_errno(errp, errno, "Error opening logfile %s",
  filename);
@@ -308,8 +308,6 @@ static bool qemu_set_log_internal(const char *filename, 
bool changed_name,
 logfile = stderr;
 }
 
-log_append = 1;
-
 qatomic_rcu_set(&global_file, logfile);
 }
 return true;
-- 
2.38.1

[PATCH v3 2/2] util/log: Always send errors to logfile when daemonized

2022-11-08 Thread Greg Kurz

When QEMU is started with `-daemonize`, all stdio descriptors get
redirected to `/dev/null`. This basically means that anything
printed with error_report() and friends is lost.

Current logging code allows to redirect to a file with `-D` but
this requires to enable some logging item with `-d` as well to
be functional.

Relax the check on the log flags when QEMU is daemonized, so that
other users of stderr can benefit from the redirection, without the
need to enable unwanted debug logs. Previous behaviour is retained
for the non-daemonized case. The logic is unrolled as an `if` for
better readability. The qemu_log_level and log_per_thread globals
reflect the state we want to transition to at this point : use
them instead of the intermediary locals for correctness.

qemu_set_log_internal() is adapted to open a per-thread log file
when '-d tid' is passed. This is done by hijacking qemu_try_lock()
which seems simpler that refactoring the code.

Signed-off-by: Greg Kurz 
---
 util/log.c | 72 --
 1 file changed, 53 insertions(+), 19 deletions(-)

diff --git a/util/log.c b/util/log.c
index fb843453dd49..7837ff991769 100644
--- a/util/log.c
+++ b/util/log.c
@@ -79,13 +79,15 @@ static int log_thread_id(void)
 
 static void qemu_log_thread_cleanup(Notifier *n, void *unused)
 {
-fclose(thread_file);
-thread_file = NULL;
+if (thread_file != stderr) {
+fclose(thread_file);
+thread_file = NULL;
+}
 }
 
 /* Lock/unlock output. */
 
-FILE *qemu_log_trylock(void)
+static FILE *qemu_log_trylock_with_err(Error **errp)
 {
 FILE *logfile;
 
@@ -96,6 +98,9 @@ FILE *qemu_log_trylock(void)
 = g_strdup_printf(global_filename, log_thread_id());
 logfile = fopen(filename, "w");
 if (!logfile) {
+error_setg_errno(errp, errno,
+ "Error opening logfile %s for thread %d",
+ filename, log_thread_id());
 return NULL;
 }
 thread_file = logfile;
@@ -122,6 +127,11 @@ FILE *qemu_log_trylock(void)
 return logfile;
 }
 
+FILE *qemu_log_trylock(void)
+{
+return qemu_log_trylock_with_err(NULL);
+}
+
 void qemu_log_unlock(FILE *logfile)
 {
 if (logfile) {
@@ -265,16 +275,21 @@ static bool qemu_set_log_internal(const char *filename, 
bool changed_name,
 #endif
 qemu_loglevel = log_flags;
 
-/*
- * In all cases we only log if qemu_loglevel is set.
- * Also:
- *   If per-thread, open the file for each thread in qemu_log_lock.
- *   If not daemonized we will always log either to stderr
- * or to a file (if there is a filename).
- *   If we are daemonized, we will only log if there is a filename.
- */
 daemonized = is_daemonized();
-need_to_open_file = log_flags && !per_thread && (!daemonized || filename);
+need_to_open_file = false;
+if (!daemonized) {
+/*
+ * If not daemonized we only log if qemu_loglevel is set, either to
+ * stderr or to a file (if there is a filename).
+ * If per-thread, open the file for each thread in qemu_log_trylock().
+ */
+need_to_open_file = qemu_loglevel && !log_per_thread;
+} else {
+/*
+ * If we are daemonized, we will only log if there is a filename.
+ */
+need_to_open_file = filename != NULL;
+}
 
 if (logfile) {
 fflush(logfile);
@@ -287,19 +302,34 @@ static bool qemu_set_log_internal(const char *filename, 
bool changed_name,
 }
 }
 
+if (log_per_thread && daemonized) {
+logfile = thread_file;
+}
+
 if (!logfile && need_to_open_file) {
 if (filename) {
-logfile = fopen(filename, "w");
-if (!logfile) {
-error_setg_errno(errp, errno, "Error opening logfile %s",
- filename);
-return false;
+if (log_per_thread) {
+logfile = qemu_log_trylock_with_err(errp);
+if (!logfile) {
+return false;
+}
+qemu_log_unlock(logfile);
+} else {
+logfile = fopen(filename, "w");
+if (!logfile) {
+error_setg_errno(errp, errno, "Error opening logfile %s",
+ filename);
+return false;
+}
 }
 /* In case we are a daemon redirect stderr to logfile */
 if (daemonized) {
 dup2(fileno(logfile), STDERR_FILENO);
 fclose(logfile);
-/* This will skip closing logfile in rcu_close_file. */
+/*
+ * This will skip closing logfile in rcu_close_file()
+ * or qemu_log_thread_cleanup().
+ */
 logfile = stderr;
 }
 } else

Re: [PULL v4 45/83] tests: acpi: whitelist DSDT before generating PCI-ISA bridge AML automatically

2022-11-08 Thread Ani Sinha

On Tue, Nov 8, 2022 at 7:09 PM Ani Sinha  wrote:
>
> On Tue, Nov 8, 2022 at 7:06 PM Igor Mammedov  wrote:
> >
> > On Mon, 7 Nov 2022 17:51:11 -0500
> > "Michael S. Tsirkin"  wrote:
> >
> > > From: Igor Mammedov 
> > >
> > > Signed-off-by: Igor Mammedov 
> > > Message-Id: <20221017102146.2254096-3-imamm...@redhat.com>
> > > Reviewed-by: Michael S. Tsirkin 
> > > Signed-off-by: Michael S. Tsirkin 
> > > ---
> > >  tests/qtest/bios-tables-test-allowed-diff.h | 34 +
> > >  1 file changed, 34 insertions(+)
> > >
> > > diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
> > > b/tests/qtest/bios-tables-test-allowed-diff.h
> > > index dfb8523c8b..570b17478e 100644
> > > --- a/tests/qtest/bios-tables-test-allowed-diff.h
> > > +++ b/tests/qtest/bios-tables-test-allowed-diff.h
> > > @@ -1 +1,35 @@
> > >  /* List of comma-separated changed AML files to ignore */
> > > +"tests/data/acpi/pc/DSDT",
> > > +"tests/data/acpi/pc/DSDT.acpierst",
> > > +"tests/data/acpi/pc/DSDT.acpihmat",
> > > +"tests/data/acpi/pc/DSDT.bridge",
> > > +"tests/data/acpi/pc/DSDT.cphp",
> > > +"tests/data/acpi/pc/DSDT.dimmpxm",
> > > +"tests/data/acpi/pc/DSDT.hpbridge",
> > > +"tests/data/acpi/pc/DSDT.hpbrroot",
> > > +"tests/data/acpi/pc/DSDT.ipmikcs",
> > > +"tests/data/acpi/pc/DSDT.memhp",
> > > +"tests/data/acpi/pc/DSDT.nohpet",
> > > +"tests/data/acpi/pc/DSDT.numamem",
> > > +"tests/data/acpi/pc/DSDT.roothp",
> > > +"tests/data/acpi/q35/DSDT",
> > > +"tests/data/acpi/q35/DSDT.acpierst",
> > > +"tests/data/acpi/q35/DSDT.acpihmat",
> > > +"tests/data/acpi/q35/DSDT.applesmc",
> > > +"tests/data/acpi/q35/DSDT.bridge",
> > > +"tests/data/acpi/q35/DSDT.cphp",
> > > +"tests/data/acpi/q35/DSDT.cxl",
> > > +"tests/data/acpi/q35/DSDT.dimmpxm",
> > > +"tests/data/acpi/q35/DSDT.ipmibt",
> > > +"tests/data/acpi/q35/DSDT.ipmismbus",
> > > +"tests/data/acpi/q35/DSDT.ivrs",
> > > +"tests/data/acpi/q35/DSDT.memhp",
> > > +"tests/data/acpi/q35/DSDT.mmio64",
> > > +"tests/data/acpi/q35/DSDT.multi-bridge",
> > > +"tests/data/acpi/q35/DSDT.nohpet",
> > > +"tests/data/acpi/q35/DSDT.numamem",
> > > +"tests/data/acpi/q35/DSDT.pvpanic-isa",
> > > +"tests/data/acpi/q35/DSDT.tis.tpm12",
> > > +"tests/data/acpi/q35/DSDT.tis.tpm2",
> > > +"tests/data/acpi/q35/DSDT.viot",
> > > +"tests/data/acpi/q35/DSDT.xapic",
> >
> > still missing DSDT.count2 table, likely in other updates (as well)
> > which should break bisection if not whole pull request.
> >
> > I'll prep a tree based on your pull req, with fixups
> > for you to pull from.
>
> Does this mean there will be a v5 for the PR?
> V4 looks good in the CI so far ...

Never mind. It has merged to QEMU master.

Re: [PATCH v3 4/4] hw/nvme: add polling support

2022-11-08 Thread Klaus Jensen

On Nov  8 12:39, John Levon wrote:
> On Fri, Nov 04, 2022 at 07:32:12AM +0100, Klaus Jensen wrote:
> 
> > On Nov  3 21:19, Jinhao Fan wrote:
> > > On 11/3/2022 8:10 PM, Klaus Jensen wrote:
> > > > I agree that the spec is a little unclear on this point. In any case, in
> > > > Linux, when the driver has decided that the sq tail must be updated,
> > > > it will use this check:
> > > > 
> > > >(new_idx - event_idx - 1) < (new_idx - old)
> > > 
> > > When eventidx is already behind, it's like:
> > > 
> > >  0
> > >  1 <- event_idx
> > >  2 <- old
> > >  3 <- new_idx
> > >  4
> > >  .
> > >  .
> > >  .
> > > 
> > > In this case, (new_idx - event_idx - 1) = 3-1-1 = 1 >= (new_idx - old) =
> > > 3-2=1, so the host won't update sq tail. Where am I wrong in this example?
> > 
> > That becomes 1 >= 1, i.e. "true". So this will result in the driver
> > doing an mmio doorbell write.
> 
> The code is:
> 
> static inline int nvme_dbbuf_need_event(u16 event_idx, u16 new_idx, u16 old)  
>
> { 
>
> return (u16)(new_idx - event_idx - 1) < (u16)(new_idx - old); 
>
> } 
>
> 
> which per the above is "return 1 < 1;", or false. So the above case does *not*
> do an mmio write. No?
> 

Whelp.

Looks like I'm in the wrong here, apologies!


signature.asc
Description: PGP signature

Re: [PATCH V2] hw/riscv: virt: Remove size restriction for pflash

2022-11-08 Thread Philippe Mathieu-Daudé

On 7/11/22 18:34, Daniel P. Berrangé wrote:

On Mon, Nov 07, 2022 at 06:32:01PM +0100, Andrew Jones wrote:

On Mon, Nov 07, 2022 at 04:19:10PM +, Daniel P. Berrangé wrote:

On Mon, Nov 07, 2022 at 03:50:44PM +, Alex Bennée wrote:

Sunil V L  writes:

On Mon, Nov 07, 2022 at 01:06:38PM +, Peter Maydell wrote:

On Mon, 7 Nov 2022 at 13:03, Sunil V L  wrote:

The pflash implementation currently assumes fixed size of the
backend storage. Due to this, the backend storage file needs to be
exactly of size 32M. Otherwise, there will be an error like below.

"device requires 33554432 bytes, block backend provides 4194304 bytes"

Fix this issue by using the actual size of the backing store.

Signed-off-by: Sunil V L 
---

Do you really want the flash device size presented to the guest
to be variable depending on what the user passed as a block backend?
I don't think this is how we handle flash devices on other boards...

Hi Peter,

x86 appears to support variable flash but arm doesn't. What is
the reason for not supporting variable size flash in arm?

If I recall from the last time we went around this is was the question
of what you should pad it with.

Padding is a very good thing from the POV of upgrades. Firmware has shown
a tendancy to change (grow) over time, and the size has an impact of the
guest ABI/live migration state.

To be able to live migrate, or save/restore to/from files, then the machine
firmware size needs to be sufficient to cope with future size changes of
the firmware. The best way to deal with this is to not use the firmware
binaries' minimum compiled size, but instead to pad it upto a higher
boundary.

Enforcing such padding is a decent way to prevent users from inadvertantly
painting themselves into a corner with a very specific firmware binary
size at initial boot.

Padding is a good idea, but too much causes other problems. When building
lightweight VMs which may pull the firmware image from a network,
AArch64 VMs require 64MB of mostly zeros to be transferred first, which
can become a substantial amount of the overall boot time[*]. Being able to
create images smaller than the total flash device size, but still add some
pad for later growth, seems like the happy-medium to shoot for.

QEMU configures the firmware using -blockdev, so can use any file
format that QEMU supports at the block layer.  IOW, you can store
the firmware in a qcow2 file and thus you will never fetch any
of the padding zeros to be transferred.  That said I'm not sure
that libvirt supports anything other than a raw file today.

Drew might be referring to:

https://lore.kernel.org/qemu-devel/20210810134050.396747-1-david.edmond...@oracle.com/

 > Currently ARM UEFI images are typically built as 2MB/768kB flash
 > images for code and variables respectively. These images are both
 > then padded out to 64MB before being loaded by QEMU.
 >
 > Because the images are 64MB each, QEMU allocates 128MB of memory to
 > read them, and then proceeds to read all 128MB from disk (dirtying
 > the memory). Of this 128MB less than 3MB is useful - the rest is
 > zero padding.
 >
 > On a machine with 100 VMs this wastes over 12GB of memory.

See previous attempts:
- Huawei
https://www.mail-archive.com/qemu-devel@nongnu.org/msg607292.html
- Tencent
https://www.mail-archive.com/qemu-devel@nongnu.org/msg742066.html
- Oracle
https://www.mail-archive.com/qemu-devel@nongnu.org/msg760065.html
- Red Hat
https://www.mail-archive.com/qemu-block@nongnu.org/msg81714.html

[PATCH] checkpatch: typo fix

2022-11-08 Thread Michael S. Tsirkin

remove inline #inline - it's an obvious typo. Should just be remove
inline.

Signed-off-by: Michael S. Tsirkin 
---
 scripts/checkpatch.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index bc7d4780ec..6ecabfb2b5 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1682,7 +1682,7 @@ sub process {
 
# Block comments use /* on a line of its own
my $commentline = $rawline;
-   while ($commentline =~ s@^(\+.*)/\*.*\*/@$1@o) { # remove 
inline #inline /*...*/
+   while ($commentline =~ s@^(\+.*)/\*.*\*/@$1@o) { # remove 
inline /*...*/
}
if ($commentline =~ m@^\+.*/\*\*?+[ \t]*[^ \t]@) { # /* or /** 
non-blank
WARN("Block comments use a leading /* on a separate 
line\n" . $herecurr);
-- 
MST

Re: [PATCH v2 0/3] block: Start/end drain on correct AioContext

2022-11-08 Thread Kevin Wolf

Am 07.11.2022 um 16:13 hat Hanna Reitz geschrieben:
> Hi,
> 
> v1 cover letter:
> https://lists.nongnu.org/archive/html/qemu-block/2022-09/msg00389.html
> 
> bdrv_replace_child_noperm() drains the child via
> bdrv_parent_drained_{begin,end}_single().  When it removes a child, the
> bdrv_parent_drained_end_single() at its end will be called on an empty
> child, making the BDRV_POLL_WHILE() in it poll the main AioContext
> (because c->bs is NULL).
> 
> That’s wrong, though, because it’s supposed to operate on the parent.
> bdrv_parent_drained_end_single_no_poll() will have scheduled any BHs in
> the parents’ AioContext, which may be anything, not necessarily the main
> context.  Therefore, we must poll the parent’s context.
> 
> Patch 3 does this for both bdrv_parent_drained_{begin,end}_single().
> Patch 1 ensures that we can legally call
> bdrv_child_get_parent_aio_context() from those I/O context functions,
> and patch 2 fixes blk_do_set_aio_context() to not cause an assertion
> failure if it beginning a drain can end up in blk_get_aio_context()
> before blk->ctx has been updated.

Hmm, I may have unintentionally made this series obsolete with the drain
series I sent today. The poll instances that you're fixing simply don't
exist any more after it.

Can you check if the bug is gone with my series? I would hope so, but if
not, we would probably need to apply a fix in a different place.

Kevin

Re: [PATCH] hw/arm/boot: set initrd parameters to 64bit in fdt

2022-11-08 Thread Peter Maydell

On Tue, 8 Nov 2022 at 13:54, Peter Maydell  wrote:
>
> On Tue, 8 Nov 2022 at 12:52, Schspa Shi  wrote:
> > Alex Bennée  writes:
> > > There is a whole comment in boot.c talking about keeping initrd within
> > > lowmem:
> > >
> > > /*
> > >  * We want to put the initrd far enough into RAM that when the
> > >  * kernel is uncompressed it will not clobber the initrd. However
> > >  * on boards without much RAM we must ensure that we still leave
> > >  * enough room for a decent sized initrd, and on boards with large
> > >  * amounts of RAM we must avoid the initrd being so far up in RAM
> > >  * that it is outside lowmem and inaccessible to the kernel.
> > >  * So for boards with less  than 256MB of RAM we put the initrd
> > >  * halfway into RAM, and for boards with 256MB of RAM or more we put
> > >  * the initrd at 128MB.
> > >  * We also refuse to put the initrd somewhere that will definitely
> > >  * overlay the kernel we just loaded, though for kernel formats which
> > >  * don't tell us their exact size (eg self-decompressing 32-bit 
> > > kernels)
> > >  * we might still make a bad choice here.
> > >  */
> > >
> >
> > I think this lowmem does not mean below 4GB. and it is to make sure
> > the initrd_start > memblock_start_of_DRAM for Linux address range check.
>
> The wording of this comment pre-dates 64-bit CPU support: it
> is talking about the requirement in the 32-bit booting doc
> https://www.kernel.org/doc/Documentation/arm/Booting
> that says
> "If an initramfs is in use then, as with the dtb, it must be placed in
> a region of memory where the kernel decompressor will not overwrite it
> while also with the region which will be covered by the kernel's
> low-memory mapping."
>
> So it does mean "below 4GB", because you can't boot a 32-bit kernel
> if you don't put the kernel, initrd, etc below 4GB.

A kernel person corrects me on the meaning of "lowmem" here -- the
kernel means by it "within the first 768MB of RAM". There is also
an implicit requirement that everything be within the bottom 32-bits
of the physical address space.

-- PMM

Re: [PATCH v3 09/30] nbd/server: Clean up abuse of BlockExportOptionsNbd member @arg

2022-11-08 Thread Vladimir Sementsov-Ogievskiy


On 11/4/22 19:06, Markus Armbruster wrote:

block-export-add argument @name defaults to the value of argument
@node-name.

nbd_export_create() implements this by copying @node_name to @name.
It leaves @has_node_name false, violating the "has_node_name ==
!!node_name" invariant.  Unclean.  Falls apart when we elide
@has_node_name (next commit): then QAPI frees the same value twice,
once for @node_name and once @name.  iotest 307 duly explodes.

Goes back to commit c62d24e906 "blockdev-nbd: Boxed argument type for
nbd-server-add" (v5.0.0).  Got moved from qmp_nbd_server_add() to
nbd_export_create() (commit 56ee86261e), then copied back (commit
b6076afcab).  Commit 8675cbd68b "nbd: Utilize QAPI_CLONE for type
conversion" (v5.2.0) cleaned up the copy in qmp_nbd_server_add()
noting

 Second, our assignment to arg->name is fishy: the generated QAPI code
 for qapi_free_NbdServerAddOptions does not visit arg->name if
 arg->has_name is false, but if it DID visit it, we would have
 introduced a double-free situation when arg is finally freed.

Exactly.  However, the copy in nbd_export_create() remained dirty.

Clean it up.  Since the value stored in member @name is not actually
used outside this function, use a local variable instead of modifying
the QAPI object.

Signed-off-by: Markus Armbruster
Cc: Eric Blake
Cc: Vladimir Sementsov-Ogievskiy
Cc:qemu-bl...@nongnu.org



Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir

Re: [PULL 00/55] MIPS patches for 2022-10-30

2022-11-08 Thread Philippe Mathieu-Daudé


On 8/11/22 14:59, Peter Maydell wrote:

On Sun, 30 Oct 2022 at 22:29, Philippe Mathieu-Daudé  wrote:


The following changes since commit 344744e148e6e865f5a57e745b02a87e5ea534ad:

   Merge tag 'dump-pull-request' of https://gitlab.com/marcandre.lureau/qemu 
into staging (2022-10-26 10:53:49 -0400)

are available in the Git repository at:

   https://github.com/philmd/qemu.git tags/mips-20221030

for you to fetch changes up to 487099aee951e4966936acd3e9afd24c69de85ea:

   hw/mips/malta: Use bootloader helper to set BAR registers (2022-10-30 
23:08:10 +0100)


MIPS patches queue

- Convert nanoMIPS disassembler from C++ to C (Milica Lazarevic)


Was this the last use of C++ in the tree, or am I forgetting
some other part that still needs the C++ compiler?

If it is the last thing, we should put in the "Build Dependencies"
part of the release notes that a C++ compiler is no longer required
and mention that the configure options to specify it will go away in
a future release.


I guess the last use is from the Guest Agent on Windows...

$ git ls-files | fgrep .cpp
qga/vss-win32/install.cpp
qga/vss-win32/provider.cpp
qga/vss-win32/requester.cpp

Cc'ing QGA & Windows teams.

Re: [PATCH v2 0/3] block: Start/end drain on correct AioContext

2022-11-08 Thread Kevin Wolf

Am 08.11.2022 um 15:13 hat Kevin Wolf geschrieben:
> Am 07.11.2022 um 16:13 hat Hanna Reitz geschrieben:
> > Hi,
> > 
> > v1 cover letter:
> > https://lists.nongnu.org/archive/html/qemu-block/2022-09/msg00389.html
> > 
> > bdrv_replace_child_noperm() drains the child via
> > bdrv_parent_drained_{begin,end}_single().  When it removes a child, the
> > bdrv_parent_drained_end_single() at its end will be called on an empty
> > child, making the BDRV_POLL_WHILE() in it poll the main AioContext
> > (because c->bs is NULL).
> > 
> > That’s wrong, though, because it’s supposed to operate on the parent.
> > bdrv_parent_drained_end_single_no_poll() will have scheduled any BHs in
> > the parents’ AioContext, which may be anything, not necessarily the main
> > context.  Therefore, we must poll the parent’s context.
> > 
> > Patch 3 does this for both bdrv_parent_drained_{begin,end}_single().
> > Patch 1 ensures that we can legally call
> > bdrv_child_get_parent_aio_context() from those I/O context functions,
> > and patch 2 fixes blk_do_set_aio_context() to not cause an assertion
> > failure if it beginning a drain can end up in blk_get_aio_context()
> > before blk->ctx has been updated.
> 
> Hmm, I may have unintentionally made this series obsolete with the drain
> series I sent today. The poll instances that you're fixing simply don't
> exist any more after it.
> 
> Can you check if the bug is gone with my series? I would hope so, but if
> not, we would probably need to apply a fix in a different place.

Actually, on second thoughts, we'd probably still apply your patches as
a fix for 7.2 and then have my patches which would get rid of the
polling only in block-next. Patch 1 and 2 of this series would stay in
the tree, and that seems to make sense to me, too. So obsolete was not
the right word.

But it would still be interesting to see if my series fixes the bug,
too, because otherwise it might introduce a regression later.

By the way, is the bug hard to test in a test case? If this series had
one, I could just have run it myself against my tree.

Kevin

Re: [QEMU][PATCH v2 4/5] tests/qtest: Introduce tests for Xilinx VERSAL CANFD controller

2022-11-08 Thread Francisco Iglesias

On [2022 Oct 21] Fri 22:47:45, Vikram Garhwal wrote:
> The QTests perform three tests on the Xilinx VERSAL CANFD controller:
> Tests the CANFD controllers in loopback.
> Tests the CANFD controllers in normal mode with CAN frame.
> Tests the CANFD controllers in normal mode with CANFD frame.
> 
> Signed-off-by: Vikram Garhwal 
> Acked-by: Thomas Huth 

Reviewed-by: Francisco Iglesias 

> ---
>  tests/qtest/meson.build   |   1 +
>  tests/qtest/xlnx-canfd-test.c | 422 ++
>  2 files changed, 423 insertions(+)
>  create mode 100644 tests/qtest/xlnx-canfd-test.c
> 
> diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
> index c07a5b1a5f..9486ebee24 100644
> --- a/tests/qtest/meson.build
> +++ b/tests/qtest/meson.build
> @@ -213,6 +213,7 @@ qtests_aarch64 = \
>(config_all_devices.has_key('CONFIG_TPM_TIS_SYSBUS') ? 
> ['tpm-tis-device-test'] : []) +\
>(config_all_devices.has_key('CONFIG_TPM_TIS_SYSBUS') ? 
> ['tpm-tis-device-swtpm-test'] : []) +  \
>(config_all_devices.has_key('CONFIG_XLNX_ZYNQMP_ARM') ? ['xlnx-can-test', 
> 'fuzz-xlnx-dp-test'] : []) + \
> +  (config_all_devices.has_key('CONFIG_XLNX_VERSAL') ? ['xlnx-canfd-test'] : 
> []) + \
>['arm-cpu-features',
> 'numa-test',
> 'boot-serial-test',
> diff --git a/tests/qtest/xlnx-canfd-test.c b/tests/qtest/xlnx-canfd-test.c
> new file mode 100644
> index 00..d0e3e43b78
> --- /dev/null
> +++ b/tests/qtest/xlnx-canfd-test.c
> @@ -0,0 +1,422 @@
> +/* SPDX-License-Identifier: MIT
> + *
> + * QTests for the Xilinx Versal CANFD controller.
> + *
> + * Copyright (c) 2022 AMD Inc.
> + *
> + * Written-by: Vikram Garhwal
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> + * of this software and associated documentation files (the "Software"), to 
> deal
> + * in the Software without restriction, including without limitation the 
> rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "libqtest.h"
> +
> +/* Base address. */
> +#define CANFD0_BASE_ADDR0xFF06
> +#define CANFD1_BASE_ADDR0xFF07
> +
> +/* Register addresses. */
> +#define R_SRR_OFFSET0x00
> +#define R_MSR_OFFSET0x04
> +#define R_FILTER_CONTROL_REGISTER   0xe0
> +#define R_SR_OFFSET 0x18
> +#define R_ISR_OFFSET0x1C
> +#define R_IER_OFFSET0x20
> +#define R_ICR_OFFSET0x24
> +#define R_TX_READY_REQ_REGISTER 0x90
> +#define RX_FIFO_STATUS_REGISTER 0xE8
> +#define R_TXID_OFFSET   0x100
> +#define R_TXDLC_OFFSET  0x104
> +#define R_TXDATA1_OFFSET0x108
> +#define R_TXDATA2_OFFSET0x10C
> +#define R_AFMR_REGISTER00xa00
> +#define R_AFIR_REGISTER00xa04
> +#define R_RX0_ID_OFFSET 0x2100
> +#define R_RX0_DLC_OFFSET0x2104
> +#define R_RX0_DATA1_OFFSET  0x2108
> +#define R_RX0_DATA2_OFFSET  0x210C
> +
> +/* CANFD modes. */
> +#define SRR_CONFIG_MODE 0x00
> +#define MSR_NORMAL_MODE 0x00
> +#define MSR_LOOPBACK_MODE   (1 << 1)
> +#define ENABLE_CANFD(1 << 1)
> +
> +/* CANFD status. */
> +#define STATUS_CONFIG_MODE  (1 << 0)
> +#define STATUS_NORMAL_MODE  (1 << 3)
> +#define STATUS_LOOPBACK_MODE(1 << 1)
> +#define ISR_TXOK(1 << 1)
> +#define ISR_RXOK(1 << 4)
> +
> +#define ENABLE_ALL_FILTERS  0x
> +#define ENABLE_ALL_INTERRUPTS   0x
> +
> +/* We are sending one canfd message. */
> +#define TX_READY_REG_VAL0x1
> +
> +#define FIRST_RX_STORE_INDEX0x1
> +#define STATUS_REG_MASK 0xF
> +#define DLC_FD_BIT_SHIFT0x1B
> +#define DLC_FD_BIT_MASK 0xF800
> +#define FIFO_STATUS_READ_INDEX_MASK 0x3F
> +#define FIFO_STATUS_FILL_LEVEL_MASK 0x7F00
>

[PATCH] qapi/block-core: Fix BlockdevOptionsNvmeIoUring @path description

2022-11-08 Thread Alberto Faria

The nvme-io_uring BlockDriver's path option must point at the character
device of an NVMe namespace, not at an image file.

Fixes: fd66dbd424f5 ("blkio: add libblkio block driver")
Suggested-by: Stefano Garzarella 
Signed-off-by: Alberto Faria 
---
 qapi/block-core.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6d904004f8..95ac4fa634 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3704,7 +3704,7 @@
 #
 # Driver specific block device options for the nvme-io_uring backend.
 #
-# @path: path to the image file
+# @path: path to the NVMe namespace's character device (e.g. /dev/ng0n1).
 #
 # Since: 7.2
 ##
-- 
2.38.1

Re: [PULL v4 45/83] tests: acpi: whitelist DSDT before generating PCI-ISA bridge AML automatically

2022-11-08 Thread Igor Mammedov

On Tue, 8 Nov 2022 08:49:01 -0500
"Michael S. Tsirkin"  wrote:

> On Tue, Nov 08, 2022 at 02:36:41PM +0100, Igor Mammedov wrote:
> > On Mon, 7 Nov 2022 17:51:11 -0500
> > "Michael S. Tsirkin"  wrote:
> >   
> > > From: Igor Mammedov 
> > > 
> > > Signed-off-by: Igor Mammedov 
> > > Message-Id: <20221017102146.2254096-3-imamm...@redhat.com>
> > > Reviewed-by: Michael S. Tsirkin 
> > > Signed-off-by: Michael S. Tsirkin 
> > > ---
> > >  tests/qtest/bios-tables-test-allowed-diff.h | 34 +
> > >  1 file changed, 34 insertions(+)
> > > 
> > > diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
> > > b/tests/qtest/bios-tables-test-allowed-diff.h
> > > index dfb8523c8b..570b17478e 100644
> > > --- a/tests/qtest/bios-tables-test-allowed-diff.h
> > > +++ b/tests/qtest/bios-tables-test-allowed-diff.h
> > > @@ -1 +1,35 @@
> > >  /* List of comma-separated changed AML files to ignore */
> > > +"tests/data/acpi/pc/DSDT",
> > > +"tests/data/acpi/pc/DSDT.acpierst",
> > > +"tests/data/acpi/pc/DSDT.acpihmat",
> > > +"tests/data/acpi/pc/DSDT.bridge",
> > > +"tests/data/acpi/pc/DSDT.cphp",
> > > +"tests/data/acpi/pc/DSDT.dimmpxm",
> > > +"tests/data/acpi/pc/DSDT.hpbridge",
> > > +"tests/data/acpi/pc/DSDT.hpbrroot",
> > > +"tests/data/acpi/pc/DSDT.ipmikcs",
> > > +"tests/data/acpi/pc/DSDT.memhp",
> > > +"tests/data/acpi/pc/DSDT.nohpet",
> > > +"tests/data/acpi/pc/DSDT.numamem",
> > > +"tests/data/acpi/pc/DSDT.roothp",
> > > +"tests/data/acpi/q35/DSDT",
> > > +"tests/data/acpi/q35/DSDT.acpierst",
> > > +"tests/data/acpi/q35/DSDT.acpihmat",
> > > +"tests/data/acpi/q35/DSDT.applesmc",
> > > +"tests/data/acpi/q35/DSDT.bridge",
> > > +"tests/data/acpi/q35/DSDT.cphp",
> > > +"tests/data/acpi/q35/DSDT.cxl",
> > > +"tests/data/acpi/q35/DSDT.dimmpxm",
> > > +"tests/data/acpi/q35/DSDT.ipmibt",
> > > +"tests/data/acpi/q35/DSDT.ipmismbus",
> > > +"tests/data/acpi/q35/DSDT.ivrs",
> > > +"tests/data/acpi/q35/DSDT.memhp",
> > > +"tests/data/acpi/q35/DSDT.mmio64",
> > > +"tests/data/acpi/q35/DSDT.multi-bridge",
> > > +"tests/data/acpi/q35/DSDT.nohpet",
> > > +"tests/data/acpi/q35/DSDT.numamem",
> > > +"tests/data/acpi/q35/DSDT.pvpanic-isa",
> > > +"tests/data/acpi/q35/DSDT.tis.tpm12",
> > > +"tests/data/acpi/q35/DSDT.tis.tpm2",
> > > +"tests/data/acpi/q35/DSDT.viot",
> > > +"tests/data/acpi/q35/DSDT.xapic",  
> > 
> > still missing DSDT.count2 table, likely in other updates (as well)
> > which should break bisection if not whole pull request.  
> 
> 
> That's because I reordered count2 patches to be after these.

Aha,
that should fix the issue.

> 
> > I'll prep a tree based on your pull req, with fixups
> > for you to pull from.  
>

Re: [PULL v4 45/83] tests: acpi: whitelist DSDT before generating PCI-ISA bridge AML automatically

2022-11-08 Thread Michael S. Tsirkin

On Tue, Nov 08, 2022 at 02:36:41PM +0100, Igor Mammedov wrote:
> On Mon, 7 Nov 2022 17:51:11 -0500
> "Michael S. Tsirkin"  wrote:
> 
> > From: Igor Mammedov 
> > 
> > Signed-off-by: Igor Mammedov 
> > Message-Id: <20221017102146.2254096-3-imamm...@redhat.com>
> > Reviewed-by: Michael S. Tsirkin 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >  tests/qtest/bios-tables-test-allowed-diff.h | 34 +
> >  1 file changed, 34 insertions(+)
> > 
> > diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
> > b/tests/qtest/bios-tables-test-allowed-diff.h
> > index dfb8523c8b..570b17478e 100644
> > --- a/tests/qtest/bios-tables-test-allowed-diff.h
> > +++ b/tests/qtest/bios-tables-test-allowed-diff.h
> > @@ -1 +1,35 @@
> >  /* List of comma-separated changed AML files to ignore */
> > +"tests/data/acpi/pc/DSDT",
> > +"tests/data/acpi/pc/DSDT.acpierst",
> > +"tests/data/acpi/pc/DSDT.acpihmat",
> > +"tests/data/acpi/pc/DSDT.bridge",
> > +"tests/data/acpi/pc/DSDT.cphp",
> > +"tests/data/acpi/pc/DSDT.dimmpxm",
> > +"tests/data/acpi/pc/DSDT.hpbridge",
> > +"tests/data/acpi/pc/DSDT.hpbrroot",
> > +"tests/data/acpi/pc/DSDT.ipmikcs",
> > +"tests/data/acpi/pc/DSDT.memhp",
> > +"tests/data/acpi/pc/DSDT.nohpet",
> > +"tests/data/acpi/pc/DSDT.numamem",
> > +"tests/data/acpi/pc/DSDT.roothp",
> > +"tests/data/acpi/q35/DSDT",
> > +"tests/data/acpi/q35/DSDT.acpierst",
> > +"tests/data/acpi/q35/DSDT.acpihmat",
> > +"tests/data/acpi/q35/DSDT.applesmc",
> > +"tests/data/acpi/q35/DSDT.bridge",
> > +"tests/data/acpi/q35/DSDT.cphp",
> > +"tests/data/acpi/q35/DSDT.cxl",
> > +"tests/data/acpi/q35/DSDT.dimmpxm",
> > +"tests/data/acpi/q35/DSDT.ipmibt",
> > +"tests/data/acpi/q35/DSDT.ipmismbus",
> > +"tests/data/acpi/q35/DSDT.ivrs",
> > +"tests/data/acpi/q35/DSDT.memhp",
> > +"tests/data/acpi/q35/DSDT.mmio64",
> > +"tests/data/acpi/q35/DSDT.multi-bridge",
> > +"tests/data/acpi/q35/DSDT.nohpet",
> > +"tests/data/acpi/q35/DSDT.numamem",
> > +"tests/data/acpi/q35/DSDT.pvpanic-isa",
> > +"tests/data/acpi/q35/DSDT.tis.tpm12",
> > +"tests/data/acpi/q35/DSDT.tis.tpm2",
> > +"tests/data/acpi/q35/DSDT.viot",
> > +"tests/data/acpi/q35/DSDT.xapic",
> 
> still missing DSDT.count2 table, likely in other updates (as well)
> which should break bisection if not whole pull request.


That's because I reordered count2 patches to be after these.

> I'll prep a tree based on your pull req, with fixups
> for you to pull from.

Re: [PATCH v2 1/9] block: call bdrv_co_drain_begin in a coroutine

2022-11-08 Thread Vladimir Sementsov-Ogievskiy


On 11/4/22 12:56, Emanuele Giuseppe Esposito wrote:

It seems that bdrv_open_driver() forgot to create a coroutine
where to call bs->drv->bdrv_co_drain_begin(), a callback
marked as coroutine_fn.

Because there is no active I/O at this point, the coroutine
should end right after entering, so the caller does not need
to poll until it is finished.


Hmm. I see your point. But isn't it better to go the generic way and use a 
generated coroutine wrapper? Nothing guarantees that .bdrv_co_drain_begin() 
handlers will never do any yield point even on driver open...

Look for example at bdrv_co_check(). It has a generated wrapper bdrv_check(), 
declared in include/block/block-io.h

So you just need to declare the wrapper, and use it in bdrv_open_driver(), the 
code would be clearer too.



Signed-off-by: Emanuele Giuseppe Esposito 
Reviewed-by: Paolo Bonzini 
---
  block.c | 36 +++-
  1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index 5311b21f8e..d2b2800039 100644
--- a/block.c
+++ b/block.c
@@ -1637,12 +1637,34 @@ out:
  g_free(gen_node_name);
  }
  
+typedef struct DrainCo {

+BlockDriverState *bs;
+int ret;
+} DrainCo;
+
+static void coroutine_fn bdrv_co_drain_begin(void *opaque)
+{
+int i;
+DrainCo *co = opaque;
+BlockDriverState *bs = co->bs;
+
+for (i = 0; i < bs->quiesce_counter; i++) {
+bs->drv->bdrv_co_drain_begin(bs);
+}
+co->ret = 0;
+}
+
  static int bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv,
  const char *node_name, QDict *options,
  int open_flags, Error **errp)
  {
  Error *local_err = NULL;
-int i, ret;
+int ret;
+Coroutine *co;
+DrainCo dco = {
+.bs = bs,
+.ret = NOT_DONE,
+};
  GLOBAL_STATE_CODE();
  
  bdrv_assign_node_name(bs, node_name, &local_err);

@@ -1690,10 +1712,14 @@ static int bdrv_open_driver(BlockDriverState *bs, 
BlockDriver *drv,
  assert(bdrv_min_mem_align(bs) != 0);
  assert(is_power_of_2(bs->bl.request_alignment));
  
-for (i = 0; i < bs->quiesce_counter; i++) {

-if (drv->bdrv_co_drain_begin) {
-drv->bdrv_co_drain_begin(bs);
-}
+if (drv->bdrv_co_drain_begin) {
+co = qemu_coroutine_create(bdrv_co_drain_begin, &dco);
+qemu_coroutine_enter(co);
+/*
+ * There should be no reason for drv->bdrv_co_drain_begin to wait at
+ * this point, because the device does not have any active I/O.
+ */
+assert(dco.ret != NOT_DONE);
  }
  
  return 0;


--
Best regards,
Vladimir

Re: [PULL v4 45/83] tests: acpi: whitelist DSDT before generating PCI-ISA bridge AML automatically

2022-11-08 Thread Ani Sinha

On Tue, Nov 8, 2022 at 7:06 PM Igor Mammedov  wrote:
>
> On Mon, 7 Nov 2022 17:51:11 -0500
> "Michael S. Tsirkin"  wrote:
>
> > From: Igor Mammedov 
> >
> > Signed-off-by: Igor Mammedov 
> > Message-Id: <20221017102146.2254096-3-imamm...@redhat.com>
> > Reviewed-by: Michael S. Tsirkin 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >  tests/qtest/bios-tables-test-allowed-diff.h | 34 +
> >  1 file changed, 34 insertions(+)
> >
> > diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
> > b/tests/qtest/bios-tables-test-allowed-diff.h
> > index dfb8523c8b..570b17478e 100644
> > --- a/tests/qtest/bios-tables-test-allowed-diff.h
> > +++ b/tests/qtest/bios-tables-test-allowed-diff.h
> > @@ -1 +1,35 @@
> >  /* List of comma-separated changed AML files to ignore */
> > +"tests/data/acpi/pc/DSDT",
> > +"tests/data/acpi/pc/DSDT.acpierst",
> > +"tests/data/acpi/pc/DSDT.acpihmat",
> > +"tests/data/acpi/pc/DSDT.bridge",
> > +"tests/data/acpi/pc/DSDT.cphp",
> > +"tests/data/acpi/pc/DSDT.dimmpxm",
> > +"tests/data/acpi/pc/DSDT.hpbridge",
> > +"tests/data/acpi/pc/DSDT.hpbrroot",
> > +"tests/data/acpi/pc/DSDT.ipmikcs",
> > +"tests/data/acpi/pc/DSDT.memhp",
> > +"tests/data/acpi/pc/DSDT.nohpet",
> > +"tests/data/acpi/pc/DSDT.numamem",
> > +"tests/data/acpi/pc/DSDT.roothp",
> > +"tests/data/acpi/q35/DSDT",
> > +"tests/data/acpi/q35/DSDT.acpierst",
> > +"tests/data/acpi/q35/DSDT.acpihmat",
> > +"tests/data/acpi/q35/DSDT.applesmc",
> > +"tests/data/acpi/q35/DSDT.bridge",
> > +"tests/data/acpi/q35/DSDT.cphp",
> > +"tests/data/acpi/q35/DSDT.cxl",
> > +"tests/data/acpi/q35/DSDT.dimmpxm",
> > +"tests/data/acpi/q35/DSDT.ipmibt",
> > +"tests/data/acpi/q35/DSDT.ipmismbus",
> > +"tests/data/acpi/q35/DSDT.ivrs",
> > +"tests/data/acpi/q35/DSDT.memhp",
> > +"tests/data/acpi/q35/DSDT.mmio64",
> > +"tests/data/acpi/q35/DSDT.multi-bridge",
> > +"tests/data/acpi/q35/DSDT.nohpet",
> > +"tests/data/acpi/q35/DSDT.numamem",
> > +"tests/data/acpi/q35/DSDT.pvpanic-isa",
> > +"tests/data/acpi/q35/DSDT.tis.tpm12",
> > +"tests/data/acpi/q35/DSDT.tis.tpm2",
> > +"tests/data/acpi/q35/DSDT.viot",
> > +"tests/data/acpi/q35/DSDT.xapic",
>
> still missing DSDT.count2 table, likely in other updates (as well)
> which should break bisection if not whole pull request.
>
> I'll prep a tree based on your pull req, with fixups
> for you to pull from.

Does this mean there will be a v5 for the PR?
V4 looks good in the CI so far ...

[PATCH for-7.2] block/blkio: Set BlockDriver::has_variable_length to false

2022-11-08 Thread Alberto Faria

Setting it to true can cause the device size to be queried from libblkio
in otherwise fast paths, degrading performance. Set it to false and
require users to refresh the device size explicitly instead.

Fixes: 4c8f4fda0504 ("block/blkio: Tolerate device size changes")
Suggested-by: Kevin Wolf 
Signed-off-by: Alberto Faria 
---
 block/blkio.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/block/blkio.c b/block/blkio.c
index 620fab28a7..5eae3adfaf 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -993,7 +993,6 @@ static void blkio_refresh_limits(BlockDriverState *bs, 
Error **errp)
 { \
 .format_name = name, \
 .protocol_name   = name, \
-.has_variable_length = true, \
 .instance_size   = sizeof(BDRVBlkioState), \
 .bdrv_file_open  = blkio_file_open, \
 .bdrv_close  = blkio_close, \
-- 
2.38.1

Re: [PATCH v2 2/9] block-copy: add missing coroutine_fn annotations

2022-11-08 Thread Vladimir Sementsov-Ogievskiy


On 11/4/22 12:56, Emanuele Giuseppe Esposito wrote:

These functions end up calling bdrv_common_block_status_above(), a
generated_co_wrapper function.


generated_co_wrapper is not a coroutine_fn. Сonversely it's a function that do 
a class coroutine wrapping - start a coroutine and do POLL to wait for the 
coroutine to finish.


In addition, they also happen to be always called in coroutine context,
meaning all callers are coroutine_fn.


That's also not a reason for marking them coroutine_fn. "coroutine_fn" means 
that the function can be called only from coroutine context.


This means that the g_c_w function will enter the qemu_in_coroutine()
case and eventually suspend (or in other words call qemu_coroutine_yield()).
Therefore we need to mark such functions coroutine_fn too.


I don't think so. Moreover, this breaks the concept, as your new coroutine_fn 
functions will call generated_co_wrapper functions which are not marked 
coroutine_fn and never was.



Signed-off-by: Emanuele Giuseppe Esposito 
---
  block/block-copy.c | 15 +--
  1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index bb947afdda..f33ab1d0b6 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -577,8 +577,9 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
  return ret;
  }
  
-static int block_copy_block_status(BlockCopyState *s, int64_t offset,

-   int64_t bytes, int64_t *pnum)
+static coroutine_fn int block_copy_block_status(BlockCopyState *s,
+int64_t offset,
+int64_t bytes, int64_t *pnum)
  {
  int64_t num;
  BlockDriverState *base;
@@ -613,8 +614,9 @@ static int block_copy_block_status(BlockCopyState *s, 
int64_t offset,
   * Check if the cluster starting at offset is allocated or not.
   * return via pnum the number of contiguous clusters sharing this allocation.
   */
-static int block_copy_is_cluster_allocated(BlockCopyState *s, int64_t offset,
-   int64_t *pnum)
+static int coroutine_fn block_copy_is_cluster_allocated(BlockCopyState *s,
+int64_t offset,
+int64_t *pnum)
  {
  BlockDriverState *bs = s->source->bs;
  int64_t count, total_count = 0;
@@ -669,8 +671,9 @@ void block_copy_reset(BlockCopyState *s, int64_t offset, 
int64_t bytes)
   * @return 0 when the cluster at @offset was unallocated,
   * 1 otherwise, and -ret on error.
   */
-int64_t block_copy_reset_unallocated(BlockCopyState *s,
- int64_t offset, int64_t *count)
+int64_t coroutine_fn block_copy_reset_unallocated(BlockCopyState *s,
+  int64_t offset,
+  int64_t *count)
  {
  int ret;
  int64_t clusters, bytes;


--
Best regards,
Vladimir

1 2 3 >

1 - 100 of 254 matches

Mail list logo