date:20200820

On 8/20/20 3:33 AM, David Gibson wrote:
> On Wed, Aug 19, 2020 at 03:08:38PM +0200, Cédric Le Goater wrote:
>> When running a guest with a kernel IRQ chip enabled, the XIVE
>> characteristics of the interrupts are advertised to the guest in the
>> H_INT_GET_SOURCE_INFO hcall. These characteristics depend on the
>> underlying HW interrupts but today, QEMU simply advertises its own
>> without checking what the host supports. It is not a problem for the
>> moment, but POWER10 will (re)add support for StoreEOI and we need a
>> way to in sync with the host.
>>
>> The KVM_DEV_XIVE_GRP_SOURCE_INFO command lets QEMU query the XIVE
>> characteristics of the underlying HW interrupts and override any
>> previous setting done by QEMU. This allows the fallback mode, when the
>> XIVE device is emulated by QEMU, to use its own custom settings on the
>> sources but makes sure that we don't let a guest run with features
>> incompatible with KVM.
>>
>> It only applies to the StoreEOI feature for the moment.
> 
> Urgh.  This means that the source characteristics can change across a
> migration, that's kind of a problem.

yes.

The official plan is : 

  * P9 compat : no StoreEOI 
  * P10 : StoreEOI 

and the obvious solution I started with was to change the global source 
flags at CAS time. 

But, we have experimental P9 firmwares in which StoreEOI is activated
and migration of a P9 compat guest from a P10 host to another P10 host 
should work. I am looking for a smarter solution.

Forbidding StoreEOI on P9 compat guests is ok, since this is already 
the default, but an option to change it would be nice to have. 

As for P10, all HW sources and IC sources now have StoreEOI but we
might want to disable it for some reason. This is already in place 
at the firmware and PowerNV level, QEMU is the next layer to address.

Thanks,

C. 




> 
>> Signed-off-by: Cédric Le Goater 
>> ---
>>  include/hw/ppc/spapr_xive.h |  2 ++
>>  hw/intc/spapr_xive.c| 20 
>>  hw/intc/spapr_xive_kvm.c| 26 ++
>>  3 files changed, 48 insertions(+)
>>
>> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
>> index 1dddcbcb9cdd..3f325723ea74 100644
>> --- a/include/hw/ppc/spapr_xive.h
>> +++ b/include/hw/ppc/spapr_xive.h
>> @@ -84,6 +84,8 @@ void kvmppc_xive_disconnect(SpaprInterruptController 
>> *intc);
>>  void kvmppc_xive_reset(SpaprXive *xive, Error **errp);
>>  int kvmppc_xive_set_source_config(SpaprXive *xive, uint32_t lisn, XiveEAS 
>> *eas,
>>Error **errp);
>> +int kvmppc_xive_get_source_info(SpaprXive *xive, uint32_t lisn, uint64_t 
>> *flags,
>> + Error **errp);
>>  void kvmppc_xive_sync_source(SpaprXive *xive, uint32_t lisn, Error **errp);
>>  uint64_t kvmppc_xive_esb_rw(XiveSource *xsrc, int srcno, uint32_t offset,
>>  uint64_t data, bool write);
>> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
>> index 1fa09f287ac0..943b9958a68b 100644
>> --- a/hw/intc/spapr_xive.c
>> +++ b/hw/intc/spapr_xive.c
>> @@ -932,6 +932,26 @@ static target_ulong h_int_get_source_info(PowerPCCPU 
>> *cpu,
>>  args[0] |= SPAPR_XIVE_SRC_STORE_EOI;
>>  }
>>  
>> +if (kvm_irqchip_in_kernel()) {
>> +Error *local_err = NULL;
>> +uint64_t flags = 0;
>> +
>> +kvmppc_xive_get_source_info(xive, lisn, &flags, &local_err);
>> +if (local_err) {
>> +error_report_err(local_err);
>> +return H_HARDWARE;
>> +}
>> +
>> +/*
>> + * Override QEMU settings with KVM values
>> + */
>> +if (flags & XIVE_SRC_STORE_EOI) {
>> +args[0] |= SPAPR_XIVE_SRC_STORE_EOI;
>> +} else {
>> +args[0] &= ~SPAPR_XIVE_SRC_STORE_EOI;
>> +}
>> +}
>> +
>>  /*
>>   * Force the use of the H_INT_ESB hcall in case of an LSI
>>   * interrupt. This is necessary under KVM to re-trigger the
>> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
>> index e8667ce5f621..90f4509e6959 100644
>> --- a/hw/intc/spapr_xive_kvm.c
>> +++ b/hw/intc/spapr_xive_kvm.c
>> @@ -217,6 +217,32 @@ int kvmppc_xive_set_source_config(SpaprXive *xive, 
>> uint32_t lisn, XiveEAS *eas,
>>   &kvm_src, true, errp);
>>  }
>>  
>> +int kvmppc_xive_get_source_info(SpaprXive *xive, uint32_t lisn, uint64_t 
>> *flags,
>> + Error **errp)
>> +{
>> +struct kvm_ppc_xive_src kvm_src = { 0 };
>> +int ret;
>> +
>> +/*
>> + * Check that KVM supports the new attribute to query source
>> + * characteristics.
>> + */
>> +if (!kvm_device_check_attr(xive->fd, KVM_DEV_XIVE_GRP_SOURCE_INFO, 0)) {
>> +return 0;
>> +}
>> +
>> +ret = kvm_device_access(xive->fd, KVM_DEV_XIVE_GRP_SOURCE_INFO, lisn,
>> +&kvm_src, false, errp);
>> +if (ret < 0) {
>> +return ret;
>> +}
>> +
>>

Re: [PATCH 7/8] spapr/xive: Use the xics flag to check for XIVE-only IRQ backends

On 8/20/20 3:36 AM, David Gibson wrote:
> On Wed, Aug 19, 2020 at 03:08:42PM +0200, Cédric Le Goater wrote:
> 
> I can see why this is a good idea, but it really needs a rationale in
> the comment for posterity.

yes. I can send this one independently.

Thanks,

C.


> 
>> Signed-off-by: Cédric Le Goater 
>> ---
>>  hw/ppc/spapr_irq.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
>> index 80cf1c3d6bb2..d036c8fef519 100644
>> --- a/hw/ppc/spapr_irq.c
>> +++ b/hw/ppc/spapr_irq.c
>> @@ -172,7 +172,7 @@ static int spapr_irq_check(SpaprMachineState *spapr, 
>> Error **errp)
>>   * To cover both and not confuse the OS, add an early failure in
>>   * QEMU.
>>   */
>> -if (spapr->irq == &spapr_irq_xive) {
>> +if (!spapr->irq->xics) {
>>  error_setg(errp, "XIVE-only machines require a POWER9 CPU");
>>  return -1;
>>  }
>

Re: [PATCH v2 28/58] s390x: Move typedef SCLPEventFacility to event-facility.h

2020-08-20 Thread Cornelia Huck

On Wed, 19 Aug 2020 20:12:06 -0400
Eduardo Habkost  wrote:

> This will make future conversion to OBJECT_DECLARE* easier.
> 
> In sclp.h, use "struct SCLPEventFacility" to avoid introducing
> unnecessary header dependencies.
> 
> Signed-off-by: Eduardo Habkost 
> ---
> Changes series v1 -> v2: new patch in series v2
> 
> Cc: Cornelia Huck 
> Cc: Halil Pasic 
> Cc: Christian Borntraeger 
> Cc: Thomas Huth 
> Cc: qemu-s3...@nongnu.org
> Cc: qemu-devel@nongnu.org
> ---
>  include/hw/s390x/event-facility.h | 1 +
>  include/hw/s390x/sclp.h   | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)

Acked-by: Cornelia Huck

[PATCH v2] ide:do nothing for identify cmd if no any device attached

2020-08-20 Thread zhaoxin\RockCuioc

v1 -> v2:
fix codestyle checked by checkpatch.pl

This patch is for avoiding win7 IDE driver polling 0x1f7 when
no any device attached. During Win7 VM boot procedure, if use virtio for
disk and there is no any device be attached on hda & hdb, the win7 IDE driver
would poll 0x1f7 for a while. This action may be stop windows LOGO atomic for
a while too on a poor performance CPU.

Signed-off-by: zhaoxin\RockCuioc 
---
 hw/ide/core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index d997a78e47..26d86f4b40 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -2073,8 +2073,9 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
 s = idebus_active_if(bus);
 trace_ide_exec_cmd(bus, s, val);
 
-/* ignore commands to non existent slave */
-if (s != bus->ifs && !s->blk) {
+/* ignore commands if no any device exist or non existent slave */
+if ((!bus->ifs[0].blk && !bus->ifs[1].blk) ||
+(s != bus->ifs && !s->blk)) {
 return;
 }
 
-- 
2.17.1

[PATCH v2] softfloat: add alternative sNaN propagation for fmax/fmin

2020-08-20 Thread Chih-Min Chao

For "fmax/fmin ft0, ft1, ft2" and if one of the inputs is sNaN,
  The original logic
return NaN and set invalid flag if ft1 == sNaN || ft2 == sNan

  The alternative path
set invalid flag and return the non-NaN value if ft1 == sNaN || ft2 == sNaN
return NaN if ft1 == sNaN && ft2 == sNaN

   The ieee754 spec allows both implementation and some architecture such
   as riscv choose differenct defintion in two spec versions.
   (riscv-spec-v2.2 use original version, riscv-spec-20191213 changes to
alternative)

Signed-off-by: Chih-Min Chao 
---
 fpu/softfloat.c | 75 +++--
 include/fpu/softfloat.h |  6 
 2 files changed, 54 insertions(+), 27 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 79be4f5..4466ece 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -870,11 +870,16 @@ static FloatParts return_nan(FloatParts a, float_status 
*s)
 return a;
 }
 
-static FloatParts pick_nan(FloatParts a, FloatParts b, float_status *s)
+static void set_snan_flag(FloatParts a, FloatParts b, float_status *s)
 {
 if (is_snan(a.cls) || is_snan(b.cls)) {
 s->float_exception_flags |= float_flag_invalid;
 }
+}
+
+static FloatParts pick_nan(FloatParts a, FloatParts b, float_status *s)
+{
+set_snan_flag(a, b, s);
 
 if (s->default_nan_mode) {
 return parts_default_nan(s);
@@ -2743,23 +2748,32 @@ float64 uint16_to_float64(uint16_t a, float_status 
*status)
  * and minNumMag() from the IEEE-754 2008.
  */
 static FloatParts minmax_floats(FloatParts a, FloatParts b, bool ismin,
-bool ieee, bool ismag, float_status *s)
+bool ieee, bool ismag, bool issnan_prop,
+float_status *s)
 {
 if (unlikely(is_nan(a.cls) || is_nan(b.cls))) {
 if (ieee) {
 /* Takes two floating-point values `a' and `b', one of
  * which is a NaN, and returns the appropriate NaN
  * result. If either `a' or `b' is a signaling NaN,
- * the invalid exception is raised.
+ * the invalid exception is raised but the NaN
+ * propagation is 'shall'.
  */
 if (is_snan(a.cls) || is_snan(b.cls)) {
-return pick_nan(a, b, s);
-} else if (is_nan(a.cls) && !is_nan(b.cls)) {
+if (issnan_prop) {
+return pick_nan(a, b, s);
+} else {
+set_snan_flag(a, b, s);
+}
+}
+
+if (is_nan(a.cls) && !is_nan(b.cls)) {
 return b;
 } else if (is_nan(b.cls) && !is_nan(a.cls)) {
 return a;
 }
 }
+
 return pick_nan(a, b, s);
 } else {
 int a_exp, b_exp;
@@ -2813,37 +2827,44 @@ static FloatParts minmax_floats(FloatParts a, 
FloatParts b, bool ismin,
 }
 }
 
-#define MINMAX(sz, name, ismin, isiee, ismag)   \
+#define MINMAX(sz, name, ismin, isiee, ismag, issnan_prop)  \
 float ## sz float ## sz ## _ ## name(float ## sz a, float ## sz b,  \
  float_status *s)   \
 {   \
 FloatParts pa = float ## sz ## _unpack_canonical(a, s); \
 FloatParts pb = float ## sz ## _unpack_canonical(b, s); \
-FloatParts pr = minmax_floats(pa, pb, ismin, isiee, ismag, s);  \
+FloatParts pr = minmax_floats(pa, pb, ismin, isiee, ismag,  \
+  issnan_prop, s);  \
 \
 return float ## sz ## _round_pack_canonical(pr, s); \
 }
 
-MINMAX(16, min, true, false, false)
-MINMAX(16, minnum, true, true, false)
-MINMAX(16, minnummag, true, true, true)
-MINMAX(16, max, false, false, false)
-MINMAX(16, maxnum, false, true, false)
-MINMAX(16, maxnummag, false, true, true)
-
-MINMAX(32, min, true, false, false)
-MINMAX(32, minnum, true, true, false)
-MINMAX(32, minnummag, true, true, true)
-MINMAX(32, max, false, false, false)
-MINMAX(32, maxnum, false, true, false)
-MINMAX(32, maxnummag, false, true, true)
-
-MINMAX(64, min, true, false, false)
-MINMAX(64, minnum, true, true, false)
-MINMAX(64, minnummag, true, true, true)
-MINMAX(64, max, false, false, false)
-MINMAX(64, maxnum, false, true, false)
-MINMAX(64, maxnummag, false, true, true)
+MINMAX(16, min, true, false, false, true)
+MINMAX(16, minnum, true, true, false, true)
+MINMAX(16, minnum_noprop, true, true, false, false)
+MINMAX(16, minnummag, true, true, true, true)
+MINMAX(16, max, false, false, false, true)
+MINMAX(16, maxnum, false, true, false, true)
+MINMAX(16, maxnum_noprop, false, true, false, false)
+MINMAX(16, maxnummag, false, true, true, true)
+
+MINMAX(32, min, true, false, false,

Re: [PATCH v7 0/7] coroutines: generate wrapper code

On 20/08/20 03:33, Eric Blake wrote:
>>>
>>
>> OK, will do. Thanks for taking a look!
> 
> As this series touched Makefile to add a generated .c, you'll also need
> to rebase that part to apply on top of Paolo's meson conversion (cc'ing
> him if you need help figuring it out)

It should be trivial to do so using custom_target.

Paolo

[PATCH] ppc/pnv: Add a HIOMAP erase command

The OPAL test suite runs a read-erase-write test on the PNOR :

  https://github.com/open-power/op-test/blob/master/testcases/OpTestPNOR.py

which revealed that the IPMI HIOMAP handlers didn't support
HIOMAP_C_ERASE. Implement the sector erase command by writing 0xFF in
the PNOR memory region.

Reported-by: Klaus Heinrich Kiwi 
Signed-off-by: Cédric Le Goater 
---
 hw/ppc/pnv_bmc.c | 31 ++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/pnv_bmc.c b/hw/ppc/pnv_bmc.c
index 2e1a03daa45a..0fb082fcb8ee 100644
--- a/hw/ppc/pnv_bmc.c
+++ b/hw/ppc/pnv_bmc.c
@@ -140,6 +140,29 @@ static uint16_t bytes_to_blocks(uint32_t bytes)
 return bytes >> BLOCK_SHIFT;
 }
 
+static uint32_t blocks_to_bytes(uint16_t blocks)
+{
+return blocks << BLOCK_SHIFT;
+}
+
+#define IPMI_ERR_UNSPECIFIED0xff
+
+static int hiomap_erase(PnvPnor *pnor, uint32_t offset, uint32_t size)
+{
+MemTxResult result;
+int i;
+
+for (i = 0; i < size / 4; i++) {
+result = memory_region_dispatch_write(&pnor->mmio, offset + i * 4,
+  0x, MO_32,
+  MEMTXATTRS_UNSPECIFIED);
+if (result != MEMTX_OK) {
+return -1;
+}
+}
+return 0;
+}
+
 static void hiomap_cmd(IPMIBmcSim *ibs, uint8_t *cmd, unsigned int cmd_len,
RspBuffer *rsp)
 {
@@ -155,10 +178,16 @@ static void hiomap_cmd(IPMIBmcSim *ibs, uint8_t *cmd, 
unsigned int cmd_len,
 switch (cmd[2]) {
 case HIOMAP_C_MARK_DIRTY:
 case HIOMAP_C_FLUSH:
-case HIOMAP_C_ERASE:
 case HIOMAP_C_ACK:
 break;
 
+case HIOMAP_C_ERASE:
+if (hiomap_erase(pnor, blocks_to_bytes(cmd[5] << 8 | cmd[4]),
+blocks_to_bytes(cmd[7] << 8 | cmd[6]))) {
+rsp_buffer_set_error(rsp, IPMI_ERR_UNSPECIFIED);
+}
+break;
+
 case HIOMAP_C_GET_INFO:
 rsp_buffer_push(rsp, 2);  /* Version 2 */
 rsp_buffer_push(rsp, BLOCK_SHIFT); /* block size */
-- 
2.25.4

Re: [PATCH v2 for 5.2 0/3] block: add logging facility for long standing IO requests

2020-08-20 Thread Denis V. Lunev

On 8/12/20 5:00 PM, Stefan Hajnoczi wrote:
> On Mon, Aug 10, 2020 at 01:14:44PM +0300, Denis V. Lunev wrote:
>> There are severe delays with IO requests processing if QEMU is running in
>> virtual machine or over software defined storage. Such delays potentially
>> results in unpredictable guest behavior. For example, guests over IDE or
>> SATA drive could remount filesystem read-only if write is performed
>> longer than 10 seconds.
>>
>> Such reports are very complex to process. Some good starting point for this
>> seems quite reasonable. This patch provides one. It adds logging of such
>> potentially dangerous long IO operations.
>>
>> Changed from v2:
>> - removed accidentally added slirp subproject ID
>> - added comment describing timeout selection to patch 3
>>
>> Changes from v1:
>> - fixed conversions using macros suggested by Stefan
>> - fixed option declaration
>> - enabled by default with patch 3
>>
>> Signed-off-by: Denis V. Lunev 
>> CC: Vladimir Sementsov-Ogievskiy 
>> CC: Kevin Wolf 
>> CC: Max Reitz 
>> CC: Stefan Hajnoczi 
> Reviewed-by: Stefan Hajnoczi 
ping

Re: [PATCH v7 4/7] scripts: add coroutine-wrapper.py

On 10/06/20 12:03, Vladimir Sementsov-Ogievskiy wrote:
> +{struct_name} s = {{
> +.poll_state.bs = {bs},
> +.poll_state.in_progress = true,
> +
> +{ func.gen_block('.{name} = {name},') }
> +}};
> +
> +s.poll_state.co = qemu_coroutine_create({name}_entry, &s);
> +
> +return bdrv_poll_co(&s.poll_state);
> +}}
> +}}"""
> +
> +
> +def gen_wrappers_file(input_code: str) -> str:
> +res = gen_header()
> +for func in func_decl_iter(input_code):
> +res += '\n\n\n'
> +res += gen_wrapper(func)
> +
> +return prettify(res)  # prettify to wrap long lines
> +
> +
> +if __name__ == '__main__':
> +print(gen_wrappers_file(sys.stdin.read()))
> -- 

For Meson support, you'll have to move the "cat" inside the script.  But
since func_decl_iter can work separately on each file, it's enough to do
something like

for filename in sys.argv:
with open(filename, 'r') as f:
   print(gen_wrappers_file(f.read()))

Paolo

Re: [PATCH v8 0/6] block: seriously improve savevm/loadvm performance

2020-08-20 Thread Denis V. Lunev

On 7/9/20 4:26 PM, Denis V. Lunev wrote:
> This series do standard basic things:
> - it creates intermediate buffer for all writes from QEMU migration code
>   to QCOW2 image,
> - this buffer is sent to disk asynchronously, allowing several writes to
>   run in parallel.
>
> In general, migration code is fantastically inefficent (by observation),
> buffers are not aligned and sent with arbitrary pieces, a lot of time
> less than 100 bytes at a chunk, which results in read-modify-write
> operations with non-cached operations. It should also be noted that all
> operations are performed into unallocated image blocks, which also suffer
> due to partial writes to such new clusters.
>
> This patch series is an implementation of idea discussed in the RFC
> posted by Denis Plotnikov
> https://lists.gnu.org/archive/html/qemu-devel/2020-04/msg01925.html
> Results with this series over NVME are better than original code
> original rfcthis
> cached:  1.79s  2.38s   1.27s
> non-cached:  3.29s  1.31s   0.81s
>
> Changes from v7:
> - dropped lock from LoadVMState
> - fixed assert in last patch
> - dropped patch 1 as queued
>
> Changes from v6:
> - blk_load_vmstate kludges added (patchew problem fixed)
>
> Changes from v5:
> - loadvm optimizations added with Vladimir comments included
>
> Changes from v4:
> - added patch 4 with blk_save_vmstate() cleanup
> - added R-By
> - bdrv_flush_vmstate -> bdrv_finalize_vmstate
> - fixed return code of bdrv_co_do_save_vmstate
> - fixed typos in comments (Eric, thanks!)
> - fixed patchew warnings
>
> Changes from v3:
> - rebased to master
> - added patch 3 which removes aio_task_pool_wait_one()
> - added R-By to patch 1
> - patch 4 is rewritten via bdrv_run_co
> - error path in blk_save_vmstate() is rewritten to call bdrv_flush_vmstate
>   unconditionally
> - added some comments
> - fixes initialization in bdrv_co_vmstate_save_task_entry as suggested
>
> Changes from v2:
> - code moved from QCOW2 level to generic block level
> - created bdrv_flush_vmstate helper to fix 022, 029 tests
> - added recursive for bs->file in bdrv_co_flush_vmstate (fix 267)
> - fixed blk_save_vmstate helper
> - fixed coroutine wait as Vladimir suggested with waiting fixes from me
>
> Changes from v1:
> - patchew warning fixed
> - fixed validation that only 1 waiter is allowed in patch 1
>
> Signed-off-by: Denis V. Lunev 
> CC: Kevin Wolf 
> CC: Max Reitz 
> CC: Stefan Hajnoczi 
> CC: Fam Zheng 
> CC: Juan Quintela 
> CC: "Dr. David Alan Gilbert" 
> CC: Vladimir Sementsov-Ogievskiy 
> CC: Denis Plotnikov 
>
>
ping

Re: [PATCH 2/2] target/mips: Add definition of Loongson-3A3000 CPU

2020-08-20 Thread Kaige Li





On 08/14/2020 10:48 AM, Jiaxun Yang wrote:



在 2020/8/14 上午10:43, Kaige Li 写道:

On 08/13/2020 06:37 PM, Jiaxun Yang wrote:




在 2020/8/13 下午5:41, Kaige Li 写道:

Add definition of the Loongson-3A3000 processor in QEMU.


Hi Kaige,

We're not defining Loongson-3A3000 in QEMU because we have
some features like Loongson-EXT2, VTLB not available currently, I'd
prefer define it after we add these features to TCG.

Loongson-3A4000's define is a exception to support KVM.

Ok, I see. This will be defined later, right?

Yes..

If you're willing to help I'd suggest you to take a look at how to 
implement

Loongson SPW (LDPTE LDDIR etc) in QEMU.

Ok, I'll analyze it carefully.

thanks.
-Kaige



Thanks.

- Jiaxun



Thanks.
Kaige.


Thanks.

- Jiaxun

Re: [PATCH 2/3] block: add logging facility for long standing IO requests

2020-08-20 Thread David Edmondson

On Monday, 2020-08-10 at 13:14:46 +03, Denis V. Lunev wrote:

> +strftime(buf, sizeof(buf), "%m-%d %H:%M:%S",

"%F %T" would include the year, which can be useful.

> + localtime_r(&start_time_host_s, &t));
> +
> +bs = blk_bs(blk_stats2blk(stats));
> +qemu_log("long %s[%ld] IO request: %d.03%d since %s.%03d bs: %s(%s, 
> %s)\n",
> + block_account_type(cookie->type), cookie->bytes,
> + (int)(latency_ms / 1000), (int)(latency_ms % 1000), buf,
> + (int)((cookie->start_time_ns / 100) % 1000),

Is there a reason not to use %u rather than casting?

dme.
-- 
We wanna wait, but here we go again.

Re: Difficulty making usb-serial device visible within the guest

2020-08-20 Thread Gerd Hoffmann

  Hi,

> With auto-attach disabled this effectively prevents the device from
> attaching to the USB unless the chardev is opened which feels odd.

Why?  It's not needed to connect at boot btw, the device will be
hotplugged if the chardev is opened.

> I would expect that if I add a device to QEMU using -device then it is
> immediately visible in the guest, and if the chardev isn't connected
> then the device should report its status as disconnected as you would
> expect with a real USB to RS232 adapter.

Huh?  How would that work?  You can do some guesswork using modem line
status etc, but as far I know _reliable_ connected/disconnected
reporting simply doesn't exist in the world of serial lines ...

I'm open to add a device property to turn off the hotplug behavior.
Patches are welcome.

take care,
  Gerd

Re: [RFC v3 2/8] cpus: prepare new CpusAccel cpu accelerator interface

2020-08-20 Thread Claudio Fontana

Hi Paolo and all,

back in RFC v3 I introduced cpus_get_virtual_clock in this patch.

I observed an issue when adding the get_virtual_clock to the CpusAccel 
interface, ie
it seems that qemu_clock_get_ns() is called in some io-tests before the 
accelerator is initialized,
which seems to collide with the idea to make it part of the CpusAccel interface:

(gdb) bt
#0  0x558e6af0 in cpus_get_virtual_clock () at 
/home/claudio/git/qemu-pristine/qemu/softmmu/cpus.c:219
#1  0x55c5099c in qemu_clock_get_ns (type=type@entry=QEMU_CLOCK_VIRTUAL)
at /home/claudio/(gdb) bt
#0  0x558e6af0 in cpus_get_virtual_clock () at 
/home/claudio/git/qemu-pristine/qemu/softmmu/cpus.c:219
#1  0x55c5099c in qemu_clock_get_ns (type=type@entry=QEMU_CLOCK_VIRTUAL)
at /home/claudio/git/qemu-pristine/qemu/util/qemu-timer.c:638
#2  0x55b6077a in qemu_clock_get_ms (type=QEMU_CLOCK_VIRTUAL) at 
/home/claudio/git/qemu-pristine/qemu/include/qemu/timer.h:118
#3  0x55b6077a in cache_clean_timer_init (bs=bs@entry=0x568381a0, 
context=0x56821930)
at /home/claudio/git/qemu-pristine/qemu/block/qcow2.c:846
#4  0x55b63012 in qcow2_update_options_commit 
(bs=bs@entry=0x568381a0, r=r@entry=0x7fffd6a45e10)
at /home/claudio/git/qemu-pristine/qemu/block/qcow2.c:1221
#5  0x55b657ea in qcow2_update_options
(bs=bs@entry=0x568381a0, options=options@entry=0x5683d600, 
flags=flags@entry=139266, errp=errp@entry=0x7fffd580)
at /home/claudio/git/qemu-pristine/qemu/block/qcow2.c:1248
#6  0x55b671a2 in qcow2_do_open (bs=0x568381a0, 
options=0x5683d600, flags=139266, errp=0x7fffd580)
at /home/claudio/git/qemu-pristine/qemu/block/qcow2.c:1579
#7  0x55b67e62 in qcow2_open_entry (opaque=0x7fffd520) at 
/home/claudio/git/qemu-pristine/qemu/block/qcow2.c:1867
#8  0x55c4854c in coroutine_trampoline (i0=, 
i1=)
at /home/claudio/git/qemu-pristine/qemu/util/coroutine-ucontext.c:173
#9  0x7fffed3779c0 in __start_context () at /lib64/libc.so.6
#10 0x7fffcd90 in  ()
#11 0x in  ()

(gdb) p *current_machine
$3 = {parent_obj = {class = 0x567a2090, free = 0x772d9840 , 
Python Exception  There is no member named keys.: 
properties = 0x5681c580, ref = 2, 
parent = 0x5682aa90}, sysbus_notifier = {notify = 0x55990130 
, node = {le_next = 
0x564e1130 , le_prev = 0x565079f0 
}}, dtb = 0x0, dumpdtb = 0x0, 
  phandle_start = 0, dt_compatible = 0x0, dump_guest_core = true, mem_merge = 
true, usb = false, usb_disabled = false, firmware = 0x0, 
  iommu = false, suppress_vmdesc = false, enforce_config_section = false, 
enable_graphics = true, memory_encryption = 0x0, 
  ram_memdev_id = 0x0, ram = 0x0, device_memory = 0x0, ram_size = 0, 
maxram_size = 0, ram_slots = 0, boot_order = 0x0, 
  kernel_filename = 0x0, kernel_cmdline = 0x0, initrd_filename = 0x0, cpu_type 
= 0x0, accelerator = 0x0, possible_cpus = 0x0, smp = {
cpus = 1, cores = 1, threads = 1, sockets = 1, max_cpus = 1}, nvdimms_state 
= 0x56822850, numa_state = 0x56822be0}


The affected tests are:

Failures: 030 040 041 060 099 120 127 140 156 161 172 181 191 192 195 203 229 
249 256 267

Are the tests wrong here, to trigger this call stack before the accel is set,
or should the get virtual clock functionality be taken out of the interface, or 
...?

Thanks for any advice,

Ciao,

Claudio  


On 8/3/20 11:05 AM, Claudio Fontana wrote:
> The new interface starts unused, will start being used by the
> next patches.
> 
> It provides methods for each accelerator to start a vcpu, kick a vcpu,
> synchronize state, get cpu virtual clock and elapsed ticks.
> 
> Signed-off-by: Claudio Fontana 
> ---
>  hw/core/cpu.c  |   1 +
>  hw/i386/x86.c  |   2 +-
>  include/sysemu/cpu-timers.h|   9 +-
>  include/sysemu/cpus.h  |  36 
>  include/sysemu/hw_accel.h  |  69 ++-
>  softmmu/cpu-timers.c   |   9 +-
>  softmmu/cpus.c | 194 
> -
>  stubs/Makefile.objs|   2 +
>  stubs/cpu-synchronize-state.c  |  15 
>  stubs/cpus-get-virtual-clock.c |   8 ++
>  util/qemu-timer.c  |   8 +-
>  11 files changed, 231 insertions(+), 122 deletions(-)
>  create mode 100644 stubs/cpu-synchronize-state.c
>  create mode 100644 stubs/cpus-get-virtual-clock.c
> 
> diff --git a/hw/core/cpu.c b/hw/core/cpu.c
> index 594441a150..b389a312df 100644
> --- a/hw/core/cpu.c
> +++ b/hw/core/cpu.c
> @@ -33,6 +33,7 @@
>  #include "hw/qdev-properties.h"
>  #include "trace-root.h"
>  #include "qemu/plugin.h"
> +#include "sysemu/hw_accel.h"
>  
>  CPUInterruptHandler cpu_interrupt_handler;
>  
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 58cf2229d5..00c35bad7e 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -264,7 +264,7 @@ static long get_file_size(FILE *f)
>  /* TSC handling */
>  uint6

[PATCH V5] Introduce a new flag for i440fx to disable PCI hotplug on the root bus

2020-08-20 Thread Ani Sinha

We introduce a new global flag 'acpi-root-pci-hotplug' for i440fx with which
we can turn on or off PCI device hotplug on the root bus. This flag can be
used to prevent all PCI devices from getting hotplugged or unplugged from the
root PCI bus.
This feature is targetted mostly towards Windows VMs. It is useful in cases
where some hypervisor admins want to deploy guest VMs in a way so that the
users of the guest OSes are not able to hot-eject certain PCI devices from
the Windows system tray. Laine has explained the use case here in detail:
https://www.redhat.com/archives/libvir-list/2020-February/msg00110.html

Julia has resolved this issue for PCIE buses with the following commit:
530a0963184e57e71a5b538 ("pcie_root_port: Add hotplug disabling option")

This commit attempts to introduce similar behavior for PCI root buses used in
i440fx machine types (although in this case, we do not have a per-slot
capability to turn hotplug on or off).

Usage:
   -global PIIX4_PM.acpi-root-pci-hotplug=off

By default, this option is enabled which means that hotplug is turned on for
the PCI root bus.

The previously existing flag 'acpi-pci-hotplug-with-bridge-support' for PCI-PCI
bridges remain as is and can be used along with this new flag to control PCI
hotplug on PCI bridges.

This change has been tested using a Windows 2012R2 server guest image and also
with a Windows 2019 server guest image on a Ubuntu 18.04 host using the latest
master qemu from upstream.

Signed-off-by: Ani Sinha 
---
 hw/acpi/piix4.c  |  8 ++--
 hw/i386/acpi-build.c | 26 +++---
 2 files changed, 25 insertions(+), 9 deletions(-)

Change Log:
V4..V5: Updated commit message to reflect the fact that testing was also
performed on a Windows 2019 server guest image. Minor commit log formatting
to make sure 80 col rule is enforced.

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 26bac4f16c..4f436e5bf3 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -78,6 +78,7 @@ typedef struct PIIX4PMState {
 
 AcpiPciHpState acpi_pci_hotplug;
 bool use_acpi_hotplug_bridge;
+bool use_acpi_root_pci_hotplug;
 
 uint8_t disable_s3;
 uint8_t disable_s4;
@@ -595,8 +596,9 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion 
*parent,
   "acpi-gpe0", GPE_LEN);
 memory_region_add_subregion(parent, GPE_BASE, &s->io_gpe);
 
-acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
-s->use_acpi_hotplug_bridge);
+if (s->use_acpi_hotplug_bridge || s->use_acpi_root_pci_hotplug)
+acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
+s->use_acpi_hotplug_bridge);
 
 s->cpu_hotplug_legacy = true;
 object_property_add_bool(OBJECT(s), "cpu-hotplug-legacy",
@@ -635,6 +637,8 @@ static Property piix4_pm_properties[] = {
 DEFINE_PROP_UINT8(ACPI_PM_PROP_S4_VAL, PIIX4PMState, s4_val, 2),
 DEFINE_PROP_BOOL("acpi-pci-hotplug-with-bridge-support", PIIX4PMState,
  use_acpi_hotplug_bridge, true),
+DEFINE_PROP_BOOL("acpi-root-pci-hotplug", PIIX4PMState,
+ use_acpi_root_pci_hotplug, true),
 DEFINE_PROP_BOOL("memory-hotplug-support", PIIX4PMState,
  acpi_memory_hotplug.is_enabled, true),
 DEFINE_PROP_END_OF_LIST(),
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index b7bc2a..19a1702ad1 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -95,6 +95,7 @@ typedef struct AcpiPmInfo {
 bool s3_disabled;
 bool s4_disabled;
 bool pcihp_bridge_en;
+bool pcihp_root_en;
 uint8_t s4_val;
 AcpiFadtData fadt;
 uint16_t cpu_hp_io_base;
@@ -245,6 +246,9 @@ static void acpi_get_pm_info(MachineState *machine, 
AcpiPmInfo *pm)
 pm->pcihp_bridge_en =
 object_property_get_bool(obj, "acpi-pci-hotplug-with-bridge-support",
  NULL);
+pm->pcihp_root_en =
+object_property_get_bool(obj, "acpi-root-pci-hotplug", NULL);
+
 }
 
 static void acpi_get_misc_info(AcpiMiscInfo *info)
@@ -337,15 +341,18 @@ static void build_append_pcihp_notify_entry(Aml *method, 
int slot)
 }
 
 static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus,
- bool pcihp_bridge_en)
+ bool pcihp_bridge_en,
+ bool pcihp_root_en)
 {
 Aml *dev, *notify_method = NULL, *method;
 QObject *bsel;
 PCIBus *sec;
 int i;
+bool root_bus = pci_bus_is_root(bus);
+bool root_pcihp_disabled = (root_bus && !pcihp_root_en);
 
 bsel = object_property_get_qobject(OBJECT(bus), ACPI_PCIHP_PROP_BSEL, 
NULL);
-if (bsel) {
+if (bsel && !root_pcihp_disabled) {
 uint64_t bsel_val = qnum_get_uint(qobject_to(QNum, bsel));
 
 aml_append(parent_scope, aml_name_decl("BSEL", aml_int(bsel_val)));
@@ -361,6 +368,9 @@ static void build_append_pci_bus_devices

Re: hw-display-qxl.so: undefined symbol: qemu_qxl_io_log_semaphore

2020-08-20 Thread Gerd Hoffmann

  Hi,

> If systemtap won't change, then to fix this, for any foo.c
> that will be in a module, we need a separate 'foo.trace'
> file that generates a .o that is directly linked to the
> foo.so, not the qemu-system-x86_64 binary.

I think that is the plan anyway.

take care,
  Gerd

Re: [PATCH v7 14/47] stream: Deal with filters

On 19.08.20 17:16, Kevin Wolf wrote:
> Am 19.08.2020 um 16:47 hat Max Reitz geschrieben:
>> On 18.08.20 16:28, Kevin Wolf wrote:
>>> Am 25.06.2020 um 17:21 hat Max Reitz geschrieben:
 Because of the (not so recent anymore) changes that make the stream job
 independent of the base node and instead track the node above it, we
 have to split that "bottom" node into two cases: The bottom COW node,
 and the node directly above the base node (which may be an R/W filter
 or the bottom COW node).

 Signed-off-by: Max Reitz 
 ---
  qapi/block-core.json |  4 +++
  block/stream.c   | 63 
  blockdev.c   |  4 ++-
  3 files changed, 53 insertions(+), 18 deletions(-)

 diff --git a/qapi/block-core.json b/qapi/block-core.json
 index b20332e592..df87855429 100644
 --- a/qapi/block-core.json
 +++ b/qapi/block-core.json
 @@ -2486,6 +2486,10 @@
  # On successful completion the image file is updated to drop the backing 
 file
  # and the BLOCK_JOB_COMPLETED event is emitted.
  #
 +# In case @device is a filter node, block-stream modifies the first 
 non-filter
 +# overlay node below it to point to base's backing node (or NULL if @base 
 was
 +# not specified) instead of modifying @device itself.
>>>
>>> Not to @base's backing node, but to @base itself (or actually, to
>>> above_base's backing node, which is initially @base, but may have
>>> changed when the job is completed).
>>
>> Oh, yes.
>>
>> (I thought I had noticed that already at some point and fixed it
>> locally...  But apparently not.)
>>
>>> Should we also document what using a filter node for @base means?
>>
>> Hm.  What does it mean?  I think the more interesting case is what it
>> means if above_base is a filter, right?
>>
>> Maybe we can put in somewhere in the “If a base file is specified then
>> sectors are not copied from that base file and its backing chain.”  But
>> the more I think about it, the less I know what we could add to it.
>> What happens if there are filters above @base is that their data isn’t
>> copied, because that’s exactly the data in @base.
> 
> The interesting part is probably the graph reconfiguration at the end of
> the job. Which is actually already documented:
> 
> # When streaming completes the image file will have the base
> # file as its backing file.
> 
> Of course, this is not entirely correct any more (because the base may
> have changed).
> 
> If @base is a filter, what backing file path do we write into the top
> layer? A json: filename including the filter?

Yes.

Or, actually.  Now that I read the code...  It takes @base’s filename
before the stream job and then uses that.  So if @base has changed
during the job, then it still uses the old filename.

But that’s not really due to this series.

> Is this worth mentioning
> or do you consider it obvious?

Hm.  I consider it obvious, yes.  @base becomes @top’s backing file (at
least without any graph changes while the job is running), so naturally
what’s written into the image header is @base’s filename – which is a
json:{} filename.

On second thought, @backing-file’s description mysteriously states that
“QEMU will automatically determine the backing file string to use”.
Which makes sense because it would clearly not make sense to describe
what’s actually happening, which is to use @base’s filename at job start
regardless of whether it’s still there at the end of the job.

So I suppose I have the choice of either documenting exactly what’s
happening, even though it doesn’t make much sense, or just not, keeping
it mysterious.

So all in all, I believe the biggest surprise about what’s written into
the top layer isn’t that it may be a json:{} filename, but the filename
of a node that maybe doesn’t even exist anymore?  (Oh, no, please don’t
tell me you can delete it and get an invalid pointer read...)

The more I think about it, the more I think there are problems beyond
the scope of this series here. :/

Max

signature.asc
Description: OpenPGP digital signature

Re: [PULL 0/2] Block patches for 5.1.0-rc4

On Tue, 11 Aug 2020 at 10:35, Max Reitz  wrote:
>
> Hi,
>
> There is a bug in the backup job that breaks backups from images whose
> size is not aligned to the job's cluster size (i.e., qemu crashes
> because of a failed assertion).  If this bug makes it into the release,
> it would be a regression from 5.0.
>
> On one hand, this is probably a rare configuration that should not
> happen in practice.  On the other, it is a regression, and the fix
> (patch 1) is simple.  So I think it would be good to have this in 5.1.
>
>
> The following changes since commit e1d322c40524d2c544d1fcd37b267d106d16d328:
>
>   Update version for v5.1.0-rc3 release (2020-08-05 17:37:17 +0100)
>
> are available in the Git repository at:
>
>   https://github.com/XanClic/qemu.git tags/pull-block-2020-08-11
>
> for you to fetch changes up to 1f3765b652930a3b485f1a67542c2410c3774abe:
>
>   iotests: add test for unaligned granularity bitmap backup (2020-08-11 
> 09:29:31 +0200)
>
> 
> Block patches for 5.1.0-rc4:
> - Fix abort when running a backup job on an image whose size is not
>   aligned to the backup job's cluster size
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.2
for any user-visible changes.

-- PMM

Re: hw-display-qxl.so: undefined symbol: qemu_qxl_io_log_semaphore

2020-08-20 Thread Daniel P . Berrangé

On Thu, Aug 20, 2020 at 10:29:43AM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > If systemtap won't change, then to fix this, for any foo.c
> > that will be in a module, we need a separate 'foo.trace'
> > file that generates a .o that is directly linked to the
> > foo.so, not the qemu-system-x86_64 binary.
> 
> I think that is the plan anyway.

It looks like we have no choice. The systemtap maintainers don't want to
change to make the semaphore symbols in the binary visible to modules.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH V5] Introduce a new flag for i440fx to disable PCI hotplug on the root bus

On 8/20/20 10:16 AM, Ani Sinha wrote:
> We introduce a new global flag 'acpi-root-pci-hotplug' for i440fx with which
> we can turn on or off PCI device hotplug on the root bus. This flag can be
> used to prevent all PCI devices from getting hotplugged or unplugged from the
> root PCI bus.
> This feature is targetted mostly towards Windows VMs. It is useful in cases
> where some hypervisor admins want to deploy guest VMs in a way so that the
> users of the guest OSes are not able to hot-eject certain PCI devices from
> the Windows system tray. Laine has explained the use case here in detail:
> https://www.redhat.com/archives/libvir-list/2020-February/msg00110.html
> 
> Julia has resolved this issue for PCIE buses with the following commit:
> 530a0963184e57e71a5b538 ("pcie_root_port: Add hotplug disabling option")
> 
> This commit attempts to introduce similar behavior for PCI root buses used in
> i440fx machine types (although in this case, we do not have a per-slot
> capability to turn hotplug on or off).
> 
> Usage:
>-global PIIX4_PM.acpi-root-pci-hotplug=off
> 
> By default, this option is enabled which means that hotplug is turned on for
> the PCI root bus.
> 
> The previously existing flag 'acpi-pci-hotplug-with-bridge-support' for 
> PCI-PCI
> bridges remain as is and can be used along with this new flag to control PCI
> hotplug on PCI bridges.
> 
> This change has been tested using a Windows 2012R2 server guest image and also
> with a Windows 2019 server guest image on a Ubuntu 18.04 host using the latest
> master qemu from upstream.

"latest master qemu from upstream" -> "v5.1.0 tag"

> 
> Signed-off-by: Ani Sinha 
> ---
>  hw/acpi/piix4.c  |  8 ++--
>  hw/i386/acpi-build.c | 26 +++---
>  2 files changed, 25 insertions(+), 9 deletions(-)
> 
> Change Log:
> V4..V5: Updated commit message to reflect the fact that testing was also
> performed on a Windows 2019 server guest image. Minor commit log formatting
> to make sure 80 col rule is enforced.
> 
> diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> index 26bac4f16c..4f436e5bf3 100644
> --- a/hw/acpi/piix4.c
> +++ b/hw/acpi/piix4.c
> @@ -78,6 +78,7 @@ typedef struct PIIX4PMState {
>  
>  AcpiPciHpState acpi_pci_hotplug;
>  bool use_acpi_hotplug_bridge;
> +bool use_acpi_root_pci_hotplug;
>  
>  uint8_t disable_s3;
>  uint8_t disable_s4;
> @@ -595,8 +596,9 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion 
> *parent,
>"acpi-gpe0", GPE_LEN);
>  memory_region_add_subregion(parent, GPE_BASE, &s->io_gpe);
>  
> -acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
> -s->use_acpi_hotplug_bridge);
> +if (s->use_acpi_hotplug_bridge || s->use_acpi_root_pci_hotplug)
> +acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
> +s->use_acpi_hotplug_bridge);
>  
>  s->cpu_hotplug_legacy = true;
>  object_property_add_bool(OBJECT(s), "cpu-hotplug-legacy",
> @@ -635,6 +637,8 @@ static Property piix4_pm_properties[] = {
>  DEFINE_PROP_UINT8(ACPI_PM_PROP_S4_VAL, PIIX4PMState, s4_val, 2),
>  DEFINE_PROP_BOOL("acpi-pci-hotplug-with-bridge-support", PIIX4PMState,
>   use_acpi_hotplug_bridge, true),
> +DEFINE_PROP_BOOL("acpi-root-pci-hotplug", PIIX4PMState,
> + use_acpi_root_pci_hotplug, true),
>  DEFINE_PROP_BOOL("memory-hotplug-support", PIIX4PMState,
>   acpi_memory_hotplug.is_enabled, true),
>  DEFINE_PROP_END_OF_LIST(),
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index b7bc2a..19a1702ad1 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -95,6 +95,7 @@ typedef struct AcpiPmInfo {
>  bool s3_disabled;
>  bool s4_disabled;
>  bool pcihp_bridge_en;
> +bool pcihp_root_en;
>  uint8_t s4_val;
>  AcpiFadtData fadt;
>  uint16_t cpu_hp_io_base;
> @@ -245,6 +246,9 @@ static void acpi_get_pm_info(MachineState *machine, 
> AcpiPmInfo *pm)
>  pm->pcihp_bridge_en =
>  object_property_get_bool(obj, "acpi-pci-hotplug-with-bridge-support",
>   NULL);
> +pm->pcihp_root_en =
> +object_property_get_bool(obj, "acpi-root-pci-hotplug", NULL);
> +
>  }
>  
>  static void acpi_get_misc_info(AcpiMiscInfo *info)
> @@ -337,15 +341,18 @@ static void build_append_pcihp_notify_entry(Aml 
> *method, int slot)
>  }
>  
>  static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus,
> - bool pcihp_bridge_en)
> + bool pcihp_bridge_en,
> + bool pcihp_root_en)
>  {
>  Aml *dev, *notify_method = NULL, *method;
>  QObject *bsel;
>  PCIBus *sec;
>  int i;
> +bool root_bus = pci_bus_is_root(bus);
> +bool root_pcihp_disabled = (root_bus && !pcihp_root_en);
>  
>  bsel = object_pro

Re: [PULL v6 000/150] Meson-based build system

On Thu, 20 Aug 2020 at 00:20, Paolo Bonzini  wrote:
> My complaint was only that, until last week nobody even tried to apply
> and make the patches, and therefore some of the input I got surprised
> me.

I think from my point of view I've been surprised because my
impression as somebody not following it closely was that the effort
apparently went very rapidly from "experimental effort" to
"this is ready to apply". Perhaps it caught others by surprise
in a similar way.

thanks
-- PMM

Re: [PATCH] hw/arm/virt: Default to architecture appropriate CPU

On Thu, 20 Aug 2020 at 05:55, Punit Agrawal
 wrote:
>
> The default_cpu_type for the Virt machine is set to "cortex-a15" for
> both the arm (qemu-system-arm) and aarch64 (qemu-system-aarch64)
> targets.
>
> As a result, starting the aarch64 target with "-machine virt" defaults
> to booting with a Arm v7 cpu which is counter to
> expectation. Debugging the issue is further hampered by lack of any
> output from a supplied arm64 firmware as it is now running on the
> wrong CPU type.
>
> Fix this by defaulting to the "max" capability CPU for the target
> architecture. After the patch both the arm and aarch64 qemu default to
> the equivalent of passing "-cpu max".

Hi; this kind of thing has been suggested in the past several times.
Generally we've taken the view that we prefer:
 * not to make changes that would break pre-existing command lines
 * to maintain the general behaviour that a command line used
   with qemu-system-arm will also work with qemu-system-aarch64

It is certainly true that these days the default CPU type for
virt is not what most users want, though.

thanks
-- PMM

Re: Difficulty making usb-serial device visible within the guest

2020-08-20 Thread Mark Cave-Ayland

On 20/08/2020 09:15, Gerd Hoffmann wrote:

>> With auto-attach disabled this effectively prevents the device from
>> attaching to the USB unless the chardev is opened which feels odd.
> 
> Why?  It's not needed to connect at boot btw, the device will be
> hotplugged if the chardev is opened.

Yes, I see that - however...

>> I would expect that if I add a device to QEMU using -device then it is
>> immediately visible in the guest, and if the chardev isn't connected
>> then the device should report its status as disconnected as you would
>> expect with a real USB to RS232 adapter.
> 
> Huh?  How would that work?  You can do some guesswork using modem line
> status etc, but as far I know _reliable_ connected/disconnected
> reporting simply doesn't exist in the world of serial lines ...

I'm just comparing the behaviour with a real FTDI adapter: if I plug it in, it's
always visible on the guest USB regardless as to whether there is a cable 
attached -
the line status is returned as it is seen on the wire. Whether the guest OS has 
a
concept as to whether a cable is attached is something that is internal to the
driver, and completely separate from QEMU's notion as to whether a chardev is 
attached.

For someone who has a reasonable amount of QEMU development experience I was 
just
surprised that I ended up having to going through the source code to understand 
why
my usb-serial device wasn't appearing in the guest. Certainly there is no 
mention of
this requirement in the QEMU documentation.

Something else I noticed was that this behaviour is unique to USB: in-built 
serial
ports behave exactly as I described above which matches the real FTDI USB 
hardware.
And another question: if this behaviour is acceptable for USB, should it not be 
the
same for other hotplug buses, e.g. should a PCIe serial card plugged into a 
PCIe bus
plug and unplug itself as the chardev is attached/detached in a similar way?

ATB,

Mark.

[PATCH V6] Introduce a new flag for i440fx to disable PCI hotplug on the root bus

2020-08-20 Thread Ani Sinha

We introduce a new global flag 'acpi-root-pci-hotplug' for i440fx with which
we can turn on or off PCI device hotplug on the root bus. This flag can be
used to prevent all PCI devices from getting hotplugged or unplugged from the
root PCI bus.
This feature is targetted mostly towards Windows VMs. It is useful in cases
where some hypervisor admins want to deploy guest VMs in a way so that the
users of the guest OSes are not able to hot-eject certain PCI devices from
the Windows system tray. Laine has explained the use case here in detail:
https://www.redhat.com/archives/libvir-list/2020-February/msg00110.html

Julia has resolved this issue for PCIE buses with the following commit:
530a0963184e57e71a5b538 ("pcie_root_port: Add hotplug disabling option")

This commit attempts to introduce similar behavior for PCI root buses used in
i440fx machine types (although in this case, we do not have a per-slot
capability to turn hotplug on or off).

Usage:
   -global PIIX4_PM.acpi-root-pci-hotplug=off

By default, this option is enabled which means that hotplug is turned on for
the PCI root bus.

The previously existing flag 'acpi-pci-hotplug-with-bridge-support' for PCI-PCI
bridges remain as is and can be used along with this new flag to control PCI
hotplug on PCI bridges.

This change has been tested using a Windows 2012R2 server guest image and also
with a Windows 2019 server guest image on a Ubuntu 18.04 host using the latest
master qemu from upstream (v5.1.0 tag).

Signed-off-by: Ani Sinha 
---
 hw/acpi/piix4.c  |  8 ++--
 hw/i386/acpi-build.c | 26 +++---
 2 files changed, 25 insertions(+), 9 deletions(-)

Change Log:
V5..V6: specified upstream master tag information off which this patch is
based off of.

diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 26bac4f16c..4f436e5bf3 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -78,6 +78,7 @@ typedef struct PIIX4PMState {
 
 AcpiPciHpState acpi_pci_hotplug;
 bool use_acpi_hotplug_bridge;
+bool use_acpi_root_pci_hotplug;
 
 uint8_t disable_s3;
 uint8_t disable_s4;
@@ -595,8 +596,9 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion 
*parent,
   "acpi-gpe0", GPE_LEN);
 memory_region_add_subregion(parent, GPE_BASE, &s->io_gpe);
 
-acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
-s->use_acpi_hotplug_bridge);
+if (s->use_acpi_hotplug_bridge || s->use_acpi_root_pci_hotplug)
+acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
+s->use_acpi_hotplug_bridge);
 
 s->cpu_hotplug_legacy = true;
 object_property_add_bool(OBJECT(s), "cpu-hotplug-legacy",
@@ -635,6 +637,8 @@ static Property piix4_pm_properties[] = {
 DEFINE_PROP_UINT8(ACPI_PM_PROP_S4_VAL, PIIX4PMState, s4_val, 2),
 DEFINE_PROP_BOOL("acpi-pci-hotplug-with-bridge-support", PIIX4PMState,
  use_acpi_hotplug_bridge, true),
+DEFINE_PROP_BOOL("acpi-root-pci-hotplug", PIIX4PMState,
+ use_acpi_root_pci_hotplug, true),
 DEFINE_PROP_BOOL("memory-hotplug-support", PIIX4PMState,
  acpi_memory_hotplug.is_enabled, true),
 DEFINE_PROP_END_OF_LIST(),
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index b7bc2a..19a1702ad1 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -95,6 +95,7 @@ typedef struct AcpiPmInfo {
 bool s3_disabled;
 bool s4_disabled;
 bool pcihp_bridge_en;
+bool pcihp_root_en;
 uint8_t s4_val;
 AcpiFadtData fadt;
 uint16_t cpu_hp_io_base;
@@ -245,6 +246,9 @@ static void acpi_get_pm_info(MachineState *machine, 
AcpiPmInfo *pm)
 pm->pcihp_bridge_en =
 object_property_get_bool(obj, "acpi-pci-hotplug-with-bridge-support",
  NULL);
+pm->pcihp_root_en =
+object_property_get_bool(obj, "acpi-root-pci-hotplug", NULL);
+
 }
 
 static void acpi_get_misc_info(AcpiMiscInfo *info)
@@ -337,15 +341,18 @@ static void build_append_pcihp_notify_entry(Aml *method, 
int slot)
 }
 
 static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus,
- bool pcihp_bridge_en)
+ bool pcihp_bridge_en,
+ bool pcihp_root_en)
 {
 Aml *dev, *notify_method = NULL, *method;
 QObject *bsel;
 PCIBus *sec;
 int i;
+bool root_bus = pci_bus_is_root(bus);
+bool root_pcihp_disabled = (root_bus && !pcihp_root_en);
 
 bsel = object_property_get_qobject(OBJECT(bus), ACPI_PCIHP_PROP_BSEL, 
NULL);
-if (bsel) {
+if (bsel && !root_pcihp_disabled) {
 uint64_t bsel_val = qnum_get_uint(qobject_to(QNum, bsel));
 
 aml_append(parent_scope, aml_name_decl("BSEL", aml_int(bsel_val)));
@@ -361,6 +368,9 @@ static void build_append_pci_bus_devices(Aml *parent_scope, 
PCIBus *bus,
 bool bridge_in_acpi;
 
 if (!pdev)

Re: [PATCH v7 14/47] stream: Deal with filters

On 20.08.20 10:31, Max Reitz wrote:

[...]

> So all in all, I believe the biggest surprise about what’s written into
> the top layer isn’t that it may be a json:{} filename, but the filename
> of a node that maybe doesn’t even exist anymore?  (Oh, no, please don’t
> tell me you can delete it and get an invalid pointer read...)

(I tried triggering that, but, oh, it’s strdup’ed() in stream_start().
I’m a bit daft.)



signature.asc
Description: OpenPGP digital signature

Re: [PATCH V5] Introduce a new flag for i440fx to disable PCI hotplug on the root bus

2020-08-20 Thread Ani Sinha

On Thu, Aug 20, 2020 at 2:21 PM Philippe Mathieu-Daudé
 wrote:
>
> On 8/20/20 10:16 AM, Ani Sinha wrote:
> > We introduce a new global flag 'acpi-root-pci-hotplug' for i440fx with which
> > we can turn on or off PCI device hotplug on the root bus. This flag can be
> > used to prevent all PCI devices from getting hotplugged or unplugged from 
> > the
> > root PCI bus.
> > This feature is targetted mostly towards Windows VMs. It is useful in cases
> > where some hypervisor admins want to deploy guest VMs in a way so that the
> > users of the guest OSes are not able to hot-eject certain PCI devices from
> > the Windows system tray. Laine has explained the use case here in detail:
> > https://www.redhat.com/archives/libvir-list/2020-February/msg00110.html
> >
> > Julia has resolved this issue for PCIE buses with the following commit:
> > 530a0963184e57e71a5b538 ("pcie_root_port: Add hotplug disabling option")
> >
> > This commit attempts to introduce similar behavior for PCI root buses used 
> > in
> > i440fx machine types (although in this case, we do not have a per-slot
> > capability to turn hotplug on or off).
> >
> > Usage:
> >-global PIIX4_PM.acpi-root-pci-hotplug=off
> >
> > By default, this option is enabled which means that hotplug is turned on for
> > the PCI root bus.
> >
> > The previously existing flag 'acpi-pci-hotplug-with-bridge-support' for 
> > PCI-PCI
> > bridges remain as is and can be used along with this new flag to control PCI
> > hotplug on PCI bridges.
> >
> > This change has been tested using a Windows 2012R2 server guest image and 
> > also
> > with a Windows 2019 server guest image on a Ubuntu 18.04 host using the 
> > latest
> > master qemu from upstream.
>
> "latest master qemu from upstream" -> "v5.1.0 tag"

Addressed in V6.

>
> >
> > Signed-off-by: Ani Sinha 
> > ---
> >  hw/acpi/piix4.c  |  8 ++--
> >  hw/i386/acpi-build.c | 26 +++---
> >  2 files changed, 25 insertions(+), 9 deletions(-)
> >
> > Change Log:
> > V4..V5: Updated commit message to reflect the fact that testing was also
> > performed on a Windows 2019 server guest image. Minor commit log formatting
> > to make sure 80 col rule is enforced.
> >
> > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> > index 26bac4f16c..4f436e5bf3 100644
> > --- a/hw/acpi/piix4.c
> > +++ b/hw/acpi/piix4.c
> > @@ -78,6 +78,7 @@ typedef struct PIIX4PMState {
> >
> >  AcpiPciHpState acpi_pci_hotplug;
> >  bool use_acpi_hotplug_bridge;
> > +bool use_acpi_root_pci_hotplug;
> >
> >  uint8_t disable_s3;
> >  uint8_t disable_s4;
> > @@ -595,8 +596,9 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion 
> > *parent,
> >"acpi-gpe0", GPE_LEN);
> >  memory_region_add_subregion(parent, GPE_BASE, &s->io_gpe);
> >
> > -acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
> > -s->use_acpi_hotplug_bridge);
> > +if (s->use_acpi_hotplug_bridge || s->use_acpi_root_pci_hotplug)
> > +acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
> > +s->use_acpi_hotplug_bridge);
> >
> >  s->cpu_hotplug_legacy = true;
> >  object_property_add_bool(OBJECT(s), "cpu-hotplug-legacy",
> > @@ -635,6 +637,8 @@ static Property piix4_pm_properties[] = {
> >  DEFINE_PROP_UINT8(ACPI_PM_PROP_S4_VAL, PIIX4PMState, s4_val, 2),
> >  DEFINE_PROP_BOOL("acpi-pci-hotplug-with-bridge-support", PIIX4PMState,
> >   use_acpi_hotplug_bridge, true),
> > +DEFINE_PROP_BOOL("acpi-root-pci-hotplug", PIIX4PMState,
> > + use_acpi_root_pci_hotplug, true),
> >  DEFINE_PROP_BOOL("memory-hotplug-support", PIIX4PMState,
> >   acpi_memory_hotplug.is_enabled, true),
> >  DEFINE_PROP_END_OF_LIST(),
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > index b7bc2a..19a1702ad1 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -95,6 +95,7 @@ typedef struct AcpiPmInfo {
> >  bool s3_disabled;
> >  bool s4_disabled;
> >  bool pcihp_bridge_en;
> > +bool pcihp_root_en;
> >  uint8_t s4_val;
> >  AcpiFadtData fadt;
> >  uint16_t cpu_hp_io_base;
> > @@ -245,6 +246,9 @@ static void acpi_get_pm_info(MachineState *machine, 
> > AcpiPmInfo *pm)
> >  pm->pcihp_bridge_en =
> >  object_property_get_bool(obj, 
> > "acpi-pci-hotplug-with-bridge-support",
> >   NULL);
> > +pm->pcihp_root_en =
> > +object_property_get_bool(obj, "acpi-root-pci-hotplug", NULL);
> > +
> >  }
> >
> >  static void acpi_get_misc_info(AcpiMiscInfo *info)
> > @@ -337,15 +341,18 @@ static void build_append_pcihp_notify_entry(Aml 
> > *method, int slot)
> >  }
> >
> >  static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus,
> > - bool pcihp_bridge_en)
> > + bool pcihp_bridge_en,
> >

Re: [PATCH v3 12/12] block/qcow2: automatically insert preallocate filter when on FUSE


19.08.2020 18:15, Stefan Hajnoczi wrote:

On Mon, Aug 17, 2020 at 12:15:53PM +0300, Vladimir Sementsov-Ogievskiy wrote:

vstorage has slow allocation, so this patch detect vstorage
(I hope, we don't use other FUSE filesystems) and inserts preallocate
filter between qcow2 node and its file child.

The following test executes more than 10 times faster
(43.2s -> 3.9s for me) with this patch. (/newssd3 is mount point of
vstorage, both cs and mds are on same ssd local ssd disk)

 IMG=/newssd3/z
 FILE_OPTS=file.filename=$IMG
 COUNT=15000
 CHUNK=64K
 CLUSTER=1M
 rm -f $IMG
 ./qemu-img create -f qcow2 -o cluster_size=$CLUSTER $IMG 1G
 ./qemu-img bench -c $COUNT -d 1 -s $CHUNK -w -t none -f qcow2 $IMG


Kevin's input is needed here. I think the philosophy is that nodes are
not automatically inserted. The user should define the graph explicitly.
The management tool would be responsible for configuring a preallocate
filter node.



This patch originally is not intended to be merged upstream, only for 
downstream.
I post it just to possibly get some ideas, could it be somehow useful for 
others.
(I'm not sure, that all FUSE filesystems needs this filter. But vstorage needs)

Hmm, about automatically inserted nodes: why not? block jobs insert their 
filters
automatically.. Anyway, for our downstream process the simplest thing is to
insert it automatically in Qemu.



--
Best regards,
Vladimir

[PATCH v1] pc: fix auto_enable_numa_with_memhp/auto_enable_numa_with_memdev for the 5.0 machine

2020-08-20 Thread David Hildenbrand

Unfortunately, a typo sneeked in: we want to set
auto_enable_numa_with_memdev to false, not auto_enable_numa_with_memhp.

Cc: qemu-sta...@nongnu.org # v5.1
Fixes: 195784a0cfad (numa: Auto-enable NUMA when any memory devices are 
possible)
Reported-by: Dr. David Alan Gilbert 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: "Michael S. Tsirkin" 
Signed-off-by: David Hildenbrand 
---
 hw/i386/pc_q35.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index a3e607a544..e94f45779b 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -371,7 +371,7 @@ static void pc_q35_5_0_machine_options(MachineClass *m)
 m->numa_mem_supported = true;
 compat_props_add(m->compat_props, hw_compat_5_0, hw_compat_5_0_len);
 compat_props_add(m->compat_props, pc_compat_5_0, pc_compat_5_0_len);
-m->auto_enable_numa_with_memhp = false;
+m->auto_enable_numa_with_memdev = false;
 }
 
 DEFINE_Q35_MACHINE(v5_0, "pc-q35-5.0", NULL,
-- 
2.26.2

Re: [PATCH v1] pc: fix auto_enable_numa_with_memhp/auto_enable_numa_with_memdev for the 5.0 machine

2020-08-20 Thread Dr. David Alan Gilbert

* David Hildenbrand (da...@redhat.com) wrote:
> Unfortunately, a typo sneeked in: we want to set
> auto_enable_numa_with_memdev to false, not auto_enable_numa_with_memhp.
> 
> Cc: qemu-sta...@nongnu.org # v5.1
> Fixes: 195784a0cfad (numa: Auto-enable NUMA when any memory devices are 
> possible)
> Reported-by: Dr. David Alan Gilbert 
> Cc: Paolo Bonzini 
> Cc: Richard Henderson 
> Cc: Eduardo Habkost 
> Cc: "Michael S. Tsirkin" 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  hw/i386/pc_q35.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index a3e607a544..e94f45779b 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -371,7 +371,7 @@ static void pc_q35_5_0_machine_options(MachineClass *m)
>  m->numa_mem_supported = true;
>  compat_props_add(m->compat_props, hw_compat_5_0, hw_compat_5_0_len);
>  compat_props_add(m->compat_props, pc_compat_5_0, pc_compat_5_0_len);
> -m->auto_enable_numa_with_memhp = false;
> +m->auto_enable_numa_with_memdev = false;
>  }
>  
>  DEFINE_Q35_MACHINE(v5_0, "pc-q35-5.0", NULL,
> -- 
> 2.26.2
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [PATCH v5 1/1] audio/jack: fix use after free segfault

2020-08-20 Thread Christian Schoenebeck

On Donnerstag, 20. August 2020 07:37:28 CEST Gerd Hoffmann wrote:
>   Hi,
> 
> > > +qemu_bh_cancel(c->shutdown_bh);
> > 
> > Looks like a potential race. Quote from the API doc of qemu_bh_cancel():
> > "While cancellation itself is also wait-free and thread-safe, it can of
> > course race with the loop that executes bottom halves unless you are
> > holding the iothread mutex.  This makes it mostly useless if you are not
> > holding the mutex."
> 
> Should not be a problem, all auto backend code should only be called
> while qemu holds the iothread mutex.  With the exception of the shutdown
> handler which jack might call from signal context (which is why we need
> the BH in the first place).

Hmmm, as Geoffrey already added a lock today, I noticed that QEMU's main IO 
thread mutex is not initialized as 'recursive' lock type. Does that make 
sense? I.e. shouldn't there be a

qemu_rec_mutex_init(&qemu_global_mutex);

in softmmu/cpus.c for safety reasons to prevent nested locks from same thread 
causing misbehaviour?

CCing Paolo to clarify.

Best regards,
Christian Schoenebeck

Re: [PATCH v3] block/nbd: use non-blocking connect: fix vm hang on connect()


19.08.2020 17:46, Eric Blake wrote:

On 8/12/20 9:52 AM, Vladimir Sementsov-Ogievskiy wrote:

This make nbd connection_co to yield during reconnects, so that


s/make nbd connection_co to/makes nbd's connection_co/


reconnect doesn't hang up the main thread. This is very important in


s/hang up/block/


case of unavailable nbd server host: connect() call may take a long


s/of/of an/


time, blocking the main thread (and due to reconnect, it will hang
again and again with small gaps of working time during pauses between
connection attempts).

Realization notes:

  - We don't want to implement non-blocking connect() over non-blocking
  socket, because getaddrinfo() doesn't have portable non-blocking
  realization anyway, so let's just use a thread for both getaddrinfo()
  and connect().

  - We can't use qio_channel_socket_connect_async (which behave


behaves


  similarly and start a thread to execute connect() call), as it's rely


starts
relying


  on someone iterating main loop (g_main_loop_run() or something like
  this), which is not always the case.

  - We can't use thread_pool_submit_co API, as thread pool waits for all
  threads to finish (but we don't want to wait for blocking reconnect
  attempt on shutdown.

  So, we just create the thread by hand. Some additional difficulties
  are:

  - We want our connect don't block drained sections and aio context


s/don't block/to avoid blocking/


  switches. To achieve this, we make it possible to "cancel" synchronous
  wait for the connect (which is an coroutine yield actually), still,


s/an/a/


  the thread continues in background, and it successful result may be


s/it successful/if successful, its/


  reused on next reconnect attempt.

  - We don't want to wait for reconnect on shutdown, so there is
  CONNECT_THREAD_RUNNING_DETACHED thread state, which means that block
  layer not more interested in a result, and thread should close new


which means that the block layer is no longer interested


  connected socket on finish and free the state.

How to reproduce the bug, fixed with this commit:

1. Create an image on node1:
    qemu-img create -f qcow2 xx 100M

2. Start NBD server on node1:
    qemu-nbd xx

3. Start vm with second nbd disk on node2, like this:

   ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -drive \
 file=/work/images/cent7.qcow2 -drive file=nbd+tcp://192.168.100.2 \
 -vnc :0 -qmp stdio -m 2G -enable-kvm -vga std

4. Access the vm through vnc (or some other way?), and check that NBD
    drive works:

    dd if=/dev/sdb of=/dev/null bs=1M count=10

    - the command should succeed.

5. Now, let's trigger nbd-reconnect loop in Qemu process. For this:

5.1 Kill NBD server on node1

5.2 run "dd if=/dev/sdb of=/dev/null bs=1M count=10" in the guest
 again. The command should fail and a lot of error messages about
 failing disk may appear as well.

 Now NBD client driver in Qemu tries to reconnect.
 Still, VM works well.

6. Make node1 unavailable on NBD port, so connect() from node2 will
    last for a long time:

    On node1 (Note, that 10809 is just a default NBD port):

    sudo iptables -A INPUT -p tcp --dport 10809 -j DROP

    After some time the guest hangs, and you may check in gdb that Qemu
    hangs in connect() call, issued from the main thread. This is the
    BUG.

7. Don't forget to drop iptables rule from your node1:

    sudo iptables -D INPUT -p tcp --dport 10809 -j DROP

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi!

This a continuation of "[PATCH v2 for-5.1? 0/5] Fix nbd reconnect dead-locks",
which was mostly merged to 5.1. The only last patch was not merged, and
here is a no-change resend for convenience.


  block/nbd.c | 266 +++-
  1 file changed, 265 insertions(+), 1 deletion(-)


Looks big, but the commit message goes into good detail about what the problem 
is, why the solution takes the approach it does, and a good formula for 
reproduction.



diff --git a/block/nbd.c b/block/nbd.c
index 7bb881fef4..919ec5e573 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -38,6 +38,7 @@
  #include "qapi/qapi-visit-sockets.h"
  #include "qapi/qmp/qstring.h"
+#include "qapi/clone-visitor.h"
  #include "block/qdict.h"
  #include "block/nbd.h"
@@ -62,6 +63,47 @@ typedef enum NBDClientState {
  NBD_CLIENT_QUIT
  } NBDClientState;
+typedef enum NBDConnectThreadState {
+/* No thread, no pending results */
+    CONNECT_THREAD_NONE,


I'd indent the comments by four spaces, to line up with the enumeration values 
they describe.


+
+/* Thread is running, no results for now */
+    CONNECT_THREAD_RUNNING,
+
+/*
+ * Thread is running, but requestor exited. Thread should close the new socket
+ * and free the connect state on exit.
+ */
+    CONNECT_THREAD_RUNNING_DETACHED,
+
+/* Thread finished, results are stored in a state */
+    CONNECT_THREAD_FAIL,
+    CONNECT_THREAD_SUCCESS
+} NBDConnectThreadState;
+
+typedef struct NBDConnectThread {

Re: [PATCH v7 33/47] mirror: Deal with filters

On 19.08.20 18:50, Kevin Wolf wrote:
> Am 25.06.2020 um 17:22 hat Max Reitz geschrieben:
>> This includes some permission limiting (for example, we only need to
>> take the RESIZE permission for active commits where the base is smaller
>> than the top).
>>
>> Use this opportunity to rename qmp_drive_mirror()'s "source" BDS to
>> "target_backing_bs", because that is what it really refers to.
>>
>> Signed-off-by: Max Reitz 
> 
>> @@ -1682,6 +1721,7 @@ static BlockJob *mirror_start_job(
>>  s->zero_target = zero_target;
>>  s->copy_mode = copy_mode;
>>  s->base = base;
>> +s->base_overlay = bdrv_find_overlay(bs, base);
>>  s->granularity = granularity;
>>  s->buf_size = ROUND_UP(buf_size, granularity);
>>  s->unmap = unmap;
> 
> Is this valid without freezing the links between base_overlay and base?

Er...

> Actually, I guess we should freeze everything between bs and base (for
> base != NULL) and it's a preexisting problem that just happens to affect
> this code, too.

Yes, that’s how it looks to me, too.  I don’t think that has anything to
do with this patch.

> Or maybe freezing everything is too much. We only want to make sure that
> no non-filter is inserted between base and base_overlay and that base
> (and now base_overlay) always stay in the backing chain of bs. But what
> options apart from freezing do we have to achieve this?

I don’t know of any, and I don’t know whether anyone would actually care
if we were to just freeze everything.

> Why is using base_overlay even better than using base? Assuming there is
> a good reason, maybe the commit message could spell it out.

The problem is that querying the block status for a filter node falls
through to the underlying data-carrying node.  So if there’s a filter on
top of @base, and we query for is_allocated_above above @base, then
we’ll include @base, which we do not want.

Max



signature.asc
Description: OpenPGP digital signature

Re: [PATCH v3] block/nbd: use non-blocking connect: fix vm hang on connect()

19.08.2020 20:52, Eric Blake wrote:

On 8/12/20 9:52 AM, Vladimir Sementsov-Ogievskiy wrote:

This make nbd connection_co to yield during reconnects, so that
reconnect doesn't hang up the main thread. This is very important in
case of unavailable nbd server host: connect() call may take a long
time, blocking the main thread (and due to reconnect, it will hang
again and again with small gaps of working time during pauses between
connection attempts).

How to reproduce the bug, fixed with this commit:

1. Create an image on node1:
qemu-img create -f qcow2 xx 100M

2. Start NBD server on node1:
qemu-nbd xx

3. Start vm with second nbd disk on node2, like this:

./x86_64-softmmu/qemu-system-x86_64 -nodefaults -drive \
file=/work/images/cent7.qcow2 -drive file=nbd+tcp://192.168.100.2 \
-vnc :0 -qmp stdio -m 2G -enable-kvm -vga std

Where is the configuration to set up retry on the nbd connection? I wonder if
you have a non-upstream patch that turns it on by default in your builds; for
upstream, I would have expected something more along the lines of -blockdev
driver=nbd,reconnect-delay=20,server.type=inet,server.data.hostname=192.168.100.2,server.data.port=10809
(typing off the top of my head, rather than actually tested).

No, it's not necessary: reconnect is enabled always. reconnect-delay just says
what to do with guest requests when connection is down. By default, they just
fails immediately. But even with reconnect-delay=0 reconnect code works and
tries to reestablish the connection.

4. Access the vm through vnc (or some other way?), and check that NBD
drive works:

dd if=/dev/sdb of=/dev/null bs=1M count=10

- the command should succeed.

5. Now, let's trigger nbd-reconnect loop in Qemu process. For this:

5.1 Kill NBD server on node1

5.2 run "dd if=/dev/sdb of=/dev/null bs=1M count=10" in the guest
again. The command should fail and a lot of error messages about
failing disk may appear as well.

Why does the guest access fail when the server goes away? Shouldn't the
pending guest requests merely be queued for retry (where the guest has not seen
a failure yet, but may do so if timeouts are reached), rather than being
instant errors?

And that's exactly how it should work when reconnect-delay is 0. If you set
reconnect-delay to be >0, then in this period of time after detection of
connection failure all the requests will be queued.

Now NBD client driver in Qemu tries to reconnect.
Still, VM works well.

6. Make node1 unavailable on NBD port, so connect() from node2 will
last for a long time:

On node1 (Note, that 10809 is just a default NBD port):

sudo iptables -A INPUT -p tcp --dport 10809 -j DROP

After some time the guest hangs, and you may check in gdb that Qemu
hangs in connect() call, issued from the main thread. This is the
BUG.

7. Don't forget to drop iptables rule from your node1:

sudo iptables -D INPUT -p tcp --dport 10809 -j DROP

--
Best regards,
Vladimir

Re: [PATCH 02/10] numa: introduce MachineClass::forbid_asymmetrical_numa

2020-08-20 Thread Daniel Henrique Barboza





On 8/20/20 1:15 AM, David Gibson wrote:

On Wed, Aug 19, 2020 at 10:11:28PM -0400, Eduardo Habkost wrote:

On Thu, Aug 20, 2020 at 11:17:26AM +1000, David Gibson wrote:

On Fri, Aug 14, 2020 at 05:54:16PM -0300, Daniel Henrique Barboza wrote:

The pSeries machine does not support asymmetrical NUMA
configurations.


This seems a bit oddly specific to have as a global machine class
property.

Would it make more sense for machines with specific NUMA constraints
to just verify those during their initialization?


This would be much simpler.  However, I like the idea of
representing machine-specific configuration validation rules as
data that can eventually be exported to management software.


Ah, ok, so basically the usual tradeoff between flexibility and
advertisability.




To provide context, what I did here was inspired by this commit:

commit 0533ef5f2089f4f12a0ec5c8035e5e15ba0b5556
Author: Tao Xu 
Date:   Thu Sep 5 16:32:38 2019 +0800

numa: Introduce MachineClass::auto_enable_numa for implicit NUMA node


In this commit, exclusive NUMA code from spapr.c was taken and put it
into numa.c, with a flag being set in spapr machine_init.


Thanks,


DHB




So, in that case, I guess the question is whether we envisage "no
assymmetry" as a constraint common enough that it's worth creating an
advertisable rule or not.  If we only ever have one user, then we
haven't really done any better than hard coding the constraint in the
manageent software.

Of course to complicate matters, in the longer term we're looking at
removing that constraint from pseries - but doing so will be dependent
on the guest kernel understanding a new format for the NUMA
information in the device tree.  So qemu alone won't have enough
information to tell if such a configuration is possible or not.


(CCing John Snow, who had spent some time thinking about
configuration validation recently.)




CC: Eduardo Habkost 
CC: Marcel Apfelbaum 
Signed-off-by: Daniel Henrique Barboza 
---
  hw/core/numa.c  | 7 +++
  hw/ppc/spapr.c  | 1 +
  include/hw/boards.h | 1 +
  3 files changed, 9 insertions(+)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index d1a94a14f8..1e81233c1d 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -547,6 +547,7 @@ static int parse_numa(void *opaque, QemuOpts *opts, Error 
**errp)
   */
  static void validate_numa_distance(MachineState *ms)
  {
+MachineClass *mc = MACHINE_GET_CLASS(ms);
  int src, dst;
  bool is_asymmetrical = false;
  int nb_numa_nodes = ms->numa_state->num_nodes;
@@ -575,6 +576,12 @@ static void validate_numa_distance(MachineState *ms)
  }
  
  if (is_asymmetrical) {

+if (mc->forbid_asymmetrical_numa) {
+error_report("This machine type does not support "
+ "asymmetrical numa distances.");
+exit(EXIT_FAILURE);
+}
+
  for (src = 0; src < nb_numa_nodes; src++) {
  for (dst = 0; dst < nb_numa_nodes; dst++) {
  if (src != dst && numa_info[src].distance[dst] == 0) {
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index dd2fa4826b..3b16edaf4c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4512,6 +4512,7 @@ static void spapr_machine_class_init(ObjectClass *oc, 
void *data)
   */
  mc->numa_mem_align_shift = 28;
  mc->auto_enable_numa = true;
+mc->forbid_asymmetrical_numa = true;
  
  smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;

  smc->default_caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_ON;
diff --git a/include/hw/boards.h b/include/hw/boards.h
index bc5b82ad20..dc6cdd1c53 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -215,6 +215,7 @@ struct MachineClass {
  bool nvdimm_supported;
  bool numa_mem_supported;
  bool auto_enable_numa;
+bool forbid_asymmetrical_numa;
  const char *default_ram_id;
  
  HotplugHandler *(*get_hotplug_handler)(MachineState *machine,

Re: [PULL v7 000/151] Meson-based build system

On Wed, 19 Aug 2020 at 22:32, Paolo Bonzini  wrote:
>
> The following changes since commit d0ed6a69d399ae193959225cdeaa9382746c91cc:
>
>   Update version for v5.1.0 release (2020-08-11 17:07:03 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 2eddb3c65821dce76433d5da6f3e6419349d1b77:
>
>   docs: convert build system documentation to rST (2020-08-19 16:13:30 -0400)
>
> v6->v7:
> * new patch to preserve compatibility symlinks from previous binary locations
> * fixed cut-and-paste error in linux-user/mips/meson.build
> * preserve compatibility check-block target even if no block tests are defined

'make check' still fails for the all-linux-static config, this
time for a different reason:

make: *** No rule to make target 'check-qtest', needed by 'check'. Stop.

thanks
-- PMM

Re: [PATCH V1 30/32] vfio-pci: save and restore

2020-08-20 Thread Jason Zeng

On Wed, Aug 19, 2020 at 05:15:11PM -0400, Steven Sistare wrote:
> On 8/9/2020 11:50 PM, Jason Zeng wrote:
> > On Fri, Aug 07, 2020 at 04:38:12PM -0400, Steven Sistare wrote:
> >> On 8/6/2020 6:22 AM, Jason Zeng wrote:
> >>> Hi Steve,
> >>>
> >>> On Thu, Jul 30, 2020 at 08:14:34AM -0700, Steve Sistare wrote:
>  @@ -3182,6 +3207,51 @@ static Property vfio_pci_dev_properties[] = {
>   DEFINE_PROP_END_OF_LIST(),
>   };
>   
>  +static int vfio_pci_post_load(void *opaque, int version_id)
>  +{
>  +int vector;
>  +MSIMessage msg;
>  +Error *err = 0;
>  +VFIOPCIDevice *vdev = opaque;
>  +PCIDevice *pdev = &vdev->pdev;
>  +
>  +if (msix_enabled(pdev)) {
>  +vfio_msix_enable(vdev);
>  +pdev->msix_function_masked = false;
>  +
>  +for (vector = 0; vector < vdev->pdev.msix_entries_nr; vector++) 
>  {
>  +if (!msix_is_masked(pdev, vector)) {
>  +msg = msix_get_message(pdev, vector);
>  +vfio_msix_vector_use(pdev, vector, msg);
>  +}
>  +}
> >>>
> >>> It looks to me MSIX re-init here may lose device IRQs and impact
> >>> device hardware state?
> >>>
> >>> The re-init will cause the kernel vfio driver to connect the device
> >>> MSIX vectors to new eventfds and KVM instance. But before that, device
> >>> IRQs will be routed to previous eventfd. Looks these IRQs will be lost.
> >>
> >> Thanks Jason, that sounds like a problem.  I could try reading and saving 
> >> an 
> >> event from eventfd before shutdown, and injecting it into the eventfd after
> >> restart, but that would be racy unless I disable interrupts.  Or, 
> >> unconditionally
> >> inject a spurious interrupt after restart to kick it, in case an interrupt 
> >> was lost.
> >>
> >> Do you have any other ideas?
> > 
> > Maybe we can consider to also hand over the eventfd file descriptor, or
> 
> I believe preserving this descriptor in isolation is not sufficient.  We would
> also need to preserve the KVM instance which it is linked to.
> 
> > or even the KVM fds to the new Qemu?
> > 
> > If the KVM fds can be preserved, we will just need to restore Qemu KVM
> > side states. But not sure how complicated the implementation would be.
> 
> That should work, but I fear it would require many code changes in QEMU
> to re-use descriptors at object creation time and suppress the initial 
> configuration ioctl's, so it's not my first choice for a solution.
> 
> > If we only preserve the eventfd fd, we can attach the old eventfd to
> > vfio devices. But looks it may turn out we always inject an interrupt
> > unconditionally, because kernel KVM irqfd eventfd handling is a bit
> > different than normal user land eventfd read/write. It doesn't decrease
> > the counter in the eventfd context. So if we read the eventfd from new
> > Qemu, it looks will always have a non-zero counter, which requires an
> > interrupt injection.
> 
> Good to know, thanks.
> 
> I will try creating a new eventfd and injecting an interrupt unconditionally.
> I need a test case to demonstrate losing an interrupt, and fixing it with
> injection.  Any advice?  My stress tests with a virtual function nic and a
> directly assigned nvme block device have never failed across live update.
> 

I am not familiar with nvme devices. For nic device, to my understanding,
stress nic testing will not have many IRQs, because nic driver usually
enables NAPI, which only take the first interrupt, then disable interrupt
and start polling. It will only re-enable interrupt after some packet
quota reached or the traffic quiesces for a while. But anyway, if the
test goes enough long time, the number of IRQs should also be big, not
sure why it doesn't trigger any issue. Maybe we can have some study on
the IRQ pattern for the testing and see how we can design a test case?
or see if our assumption is wrong?


> >>> And the re-init will make the device go through the procedure of
> >>> disabling MSIX, enabling INTX, and re-enabling MSIX and vectors.
> >>> So if the device is active, its hardware state will be impacted?
> >>
> >> Again thanks.  vfio_msix_enable() does indeed call 
> >> vfio_disable_interrupts().
> >> For a quick experiment, I deleted that call in for the post_load code 
> >> path, and 
> >> it seems to work fine, but I need to study it more.
> > 
> > vfio_msix_vector_use() will also trigger this procedure in the kernel.
> 
> Because that code path calls VFIO_DEVICE_SET_IRQS? Or something else?
> Can you point to what it triggers in the kernel?


In vfio_msix_vector_use(), I see vfio_disable_irqindex() will be invoked
if "vdev->nr_vectors < nr + 1" is true. Since the 'vdev' is re-inited,
so this condition should be true, and vfio_disable_irqindex() will
trigger VFIO_DEVICE_SET_IRQS with VFIO_IRQ_SET_DATA_NONE, which will
cause kernel to disable MSIX.

> 
> > Looks we shouldn't trigger any kernel vfio actions h

Re: [PATCH 07/10] spapr: create helper to set ibm,associativity

2020-08-20 Thread Daniel Henrique Barboza





On 8/20/20 12:00 AM, David Gibson wrote:

On Fri, Aug 14, 2020 at 05:54:21PM -0300, Daniel Henrique Barboza wrote:

We have several places around hw/ppc files where we use the
same code to set the ibm,associativity array. This patch
creates a helper called spapr_set_associativity() to do
that in a single place. It'll also make it saner to change
the value of ibm,associativity in the next patches.

After this patch, only 2 places are left with open code
ibm,associativity assignment:

- spapr_dt_dynamic_reconfiguration_memory()
- h_home_node_associativity() in spapr_hcall.c

The update of associativity values will be made in these places
manually later on.

Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: David Gibson 

I like this - any chance you could move this to the front of the
series so that we can make this code easier to follow while we're
still discussing the more meaningful changes?



No problem. I'll move this up front in v2.


DHB




---
  hw/ppc/spapr.c | 32 +---
  hw/ppc/spapr_nvdimm.c  |  8 +++-
  hw/ppc/spapr_pci.c |  8 +++-
  include/hw/ppc/spapr.h |  1 +
  4 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bc51d2db90..b80a6f6936 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -201,15 +201,27 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, 
PowerPCCPU *cpu,
  return ret;
  }
  
+void spapr_set_associativity(uint32_t *assoc, int node_id, int cpu_index)

+{
+uint8_t assoc_size = 0x4;
+
+if (cpu_index >= 0) {
+assoc_size = 0x5;
+assoc[5] = cpu_to_be32(cpu_index);
+}
+
+assoc[0] = cpu_to_be32(assoc_size);
+assoc[1] = cpu_to_be32(0x0);
+assoc[2] = cpu_to_be32(0x0);
+assoc[3] = cpu_to_be32(0x0);
+assoc[4] = cpu_to_be32(node_id);
+}
+
  static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, PowerPCCPU *cpu)
  {
  int index = spapr_get_vcpu_id(cpu);
-uint32_t associativity[] = {cpu_to_be32(0x5),
-cpu_to_be32(0x0),
-cpu_to_be32(0x0),
-cpu_to_be32(0x0),
-cpu_to_be32(cpu->node_id),
-cpu_to_be32(index)};
+uint32_t associativity[6];
+spapr_set_associativity(associativity, cpu->node_id, index);
  
  /* Advertise NUMA via ibm,associativity */

  return fdt_setprop(fdt, offset, "ibm,associativity", associativity,
@@ -325,15 +337,13 @@ static void add_str(GString *s, const gchar *s1)
  static int spapr_dt_memory_node(void *fdt, int nodeid, hwaddr start,
  hwaddr size)
  {
-uint32_t associativity[] = {
-cpu_to_be32(0x4), /* length */
-cpu_to_be32(0x0), cpu_to_be32(0x0),
-cpu_to_be32(0x0), cpu_to_be32(nodeid)
-};
+uint32_t associativity[5];
  char mem_name[32];
  uint64_t mem_reg_property[2];
  int off;
  
+spapr_set_associativity(associativity, nodeid, -1);

+
  mem_reg_property[0] = cpu_to_be64(start);
  mem_reg_property[1] = cpu_to_be64(size);
  
diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c

index 81410aa63f..bd109bfc00 100644
--- a/hw/ppc/spapr_nvdimm.c
+++ b/hw/ppc/spapr_nvdimm.c
@@ -115,15 +115,13 @@ int spapr_dt_nvdimm(void *fdt, int parent_offset,
   &error_abort);
  uint64_t slot = object_property_get_uint(OBJECT(nvdimm), 
PC_DIMM_SLOT_PROP,
   &error_abort);
-uint32_t associativity[] = {
-cpu_to_be32(0x4), /* length */
-cpu_to_be32(0x0), cpu_to_be32(0x0),
-cpu_to_be32(0x0), cpu_to_be32(node)
-};
+uint32_t associativity[5];
  uint64_t lsize = nvdimm->label_size;
  uint64_t size = object_property_get_int(OBJECT(nvdimm), PC_DIMM_SIZE_PROP,
  NULL);
  
+spapr_set_associativity(associativity, node, -1);

+
  drc = spapr_drc_by_id(TYPE_SPAPR_DRC_PMEM, slot);
  g_assert(drc);
  
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c

index 09ac58fd7f..c02ace226c 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -2321,11 +2321,8 @@ int spapr_dt_phb(SpaprMachineState *spapr, SpaprPhbState 
*phb,
  cpu_to_be32(1),
  cpu_to_be32(RTAS_IBM_RESET_PE_DMA_WINDOW)
  };
-uint32_t associativity[] = {cpu_to_be32(0x4),
-cpu_to_be32(0x0),
-cpu_to_be32(0x0),
-cpu_to_be32(0x0),
-cpu_to_be32(phb->numa_node)};
+uint32_t associativity[5];
+
  SpaprTceTable *tcet;
  SpaprDrc *drc;
  Error *err = NULL;
@@ -2358,6 +2355,7 @@ int spapr_dt_phb(SpaprMachineState *spapr, SpaprPhbState 
*phb,
  
  /* Advertise NUMA via ibm,associativity */

  if (phb->numa_node != -1) {
+spapr_set_associativit

Re: [PATCH v7 14/47] stream: Deal with filters


20.08.2020 12:22, Max Reitz wrote:

On 20.08.20 10:31, Max Reitz wrote:

[...]


So all in all, I believe the biggest surprise about what’s written into
the top layer isn’t that it may be a json:{} filename, but the filename
of a node that maybe doesn’t even exist anymore?  (Oh, no, please don’t
tell me you can delete it and get an invalid pointer read...)


(I tried triggering that, but, oh, it’s strdup’ed() in stream_start().
I’m a bit daft.)




If it's broken anyway, probably we can just revert c624b015bf and start to 
freeze base again?


--
Best regards,
Vladimir

Re: deprecation of in-tree builds

Am 18.08.2020 um 21:14 hat Peter Maydell geschrieben:
> On Mon, 23 Mar 2020 at 13:32, Stefan Hajnoczi  wrote:
> > On Sat, Mar 21, 2020 at 11:50:23PM +0100, BALATON Zoltan wrote:
> > > This was discussed before. I think instead of annoying people with a
> > > warning, rather configure should be changed to create a build dir if run
> > > from source and have a Makefile in top dir that runs make -C builddir so
> > > people don't have to care about this or change their ways and can continue
> > > to run configure && make from source dir but you don't have to support
> > > in-tree build. Then you can deprecate in-tree builds but supporting only
> > > out-of-tree without this convenience would not just unnecessarily annoy
> > > those who prefer working in a single tree but people (and apparently some
> > > tools) expect sources to build with usual configure; make; make install so
> > > that should be the minimum to support.
> >
> > Yes, please!  I use in-tree builds and find it tedious to cd into a
> > build dir manually.
> >
> > Also, many build scripts (packaging, etc) we'll break if we simply
> > remove in-tree builds.  I think make && make install should continue to
> > work.
> 
> Paolo's conversion-to-Meson patchseries is about to land, so now
> is the time for people who would like this "automatically create
> a build directory and use it" behaviour to write the necessary
> patches. Any volunteers ?
> 
> My current plan is to land the Meson series first, because it is
> really painful for Paolo to try to keep rebasing it as other
> changes to the old build system occur. This would break
> in-tree builds temporarily until the "automatic creation and
> use of a builddir" patches go in on top of it.

Usually, our requirement is that patch series don't break anything. And
if something slips through, whoever broke it is supposed to fix it, not
whoever is affected.

Kevin

Re: [PATCH v5 1/1] audio/jack: fix use after free segfault

On 20/08/20 12:06, Christian Schoenebeck wrote:
> Hmmm, as Geoffrey already added a lock today, I noticed that QEMU's main IO 
> thread mutex is not initialized as 'recursive' lock type. Does that make 
> sense? I.e. shouldn't there be a
> 
>   qemu_rec_mutex_init(&qemu_global_mutex);
> 
> in softmmu/cpus.c for safety reasons to prevent nested locks from same thread 
> causing misbehaviour?
> 
> CCing Paolo to clarify.

atexit functions (in this case
audio_cleanup->free_audio_state->qjack_fini_out) might be called both
with or without the BQL taken, so v7 of this series is likely wrong as
you point out.

However, recursive locks are pure evil. :)

More seriously: programming with concurrency is first and foremost about
understanding invariants[1].  Locks are relatively simple to reason
about because they enforce invariants at the points where they are
acquired or released.

If you have something like

static void do_it_locked(struct foo *foo)
{
/* ... */
}

static void do_it(struct foo *foo)
{
mutex_lock(&foo->lock);
do_it_locked(foo);
mutex_unlock(&foo->lock);
}

then you can immediately know that foo_locked() calls requires more
care, because the invariants around "foo" have to be provided by the
caller of foo_locked().  Instead, on a call to foo() the invariants are
guaranteed just because the caller must not have locked foo().

Instead, recursive locks allow you to slip into a different mindset
where locks protect the _code_ instead of the _data_ (and the invariants
of that data).  Things look simpler because you can just have

static void do_it(struct foo *foo)
{
rec_mutex_lock(&foo->lock);
/* ... */
rec_mutex_unlock(&foo->lock);
}

but callers have no clue about what invariants are provided around calls
to do_it().  So, understanding complex code that uses recursive mutexes
is effectively impossible.

More on the practical side, recursive mutex are an easy way to get a
deadlock.  It's a common idiom to do

/* Need to take foo->lock outside bar->lock.  */
mutex_unlock(&bar->lock);
mutex_lock(&foo->lock);
mutex_lock(&bar->lock);

With a recursive mutex, there's no guarantee that foo->lock is *really*
taken outside bar->lock, because the first unlock could have done
nothing except decreasing the lock-nesting count.  This is also why QEMU
uses differently-named functions to lock/unlock recursive mutexes,
instead of having a flag like pthread mutexes do; it makes code like
this stand out as wrong:

/* Meant to take foo->lock outside bar->lock, but really doesn't  */
rec_mutex_unlock(&bar->lock);
mutex_lock(&foo->lock);
rec_mutex_lock(&bar->lock);

A variant of this applies to callbacks: the golden rule is that
callbacks (e.g. from a function that iterates over a data structure)
should be called without any lock taken in order to avoid deadlocks.
However, this rule is most often ignored in code that uses a recursive
mutex, because that code is written around the (incorrect) paradigm that
mutual exclusion makes code safe.  Which technically is true in this
case, as deadlocks are not a safety problem, but that's not a great
consolation.

My suggestion is to work towards protecting the audio code with its own
mutex(es) and ignore the existence of the BQL for subsystems that can do
so (audio is a prime candidate).  Also please add comments to
audio_int.h about which functions are called from other threads than the
QEMU main thread.

Thanks,

Paolo

[1] https://lamport.azurewebsites.net/pubs/teaching-concurrency.pdf

Re: [RFC PATCH 07/22] block/export: Remove magic from block-export-add

Am 19.08.2020 um 21:50 hat Eric Blake geschrieben:
> cc: Peter Krempa
> 
> On 8/13/20 11:29 AM, Kevin Wolf wrote:
> > nbd-server-add tries to be convenient and adds two questionable
> > features that we don't want to share in block-export-add, even for NBD
> > exports:
> > 
> > 1. When requesting a writable export of a read-only device, the export
> > is silently downgraded to read-only. This should be an error in the
> > context of block-export-add.
> 
> I'd be happy for this to be an error even with nbd-export-add; I don't think
> it would harm any of libvirt's existing usage (either for storage migration,
> or for incremental backups).
> 
> Side note: In the past, I had a proposal to enhance the NBD Protocol to
> allow a client to advertise to the server its intent on being a read-only or
> read-write client.  Not relevant to this patch, but this part of the commit
> message reminds me that I should revisit that topic (Rich and I recently hit
> another case in nbdkit where such an extension would be nice, when it comes
> to using NBD's multi-conn for better performance on a read-only connection,
> but only if the server knows the client intends to be read-only)
> 
> > 
> > 2. When using a BlockBackend name, unplugging the device from the guest
> > will automatically stop the NBD server, too. This may sometimes be
> > what you want, but it could also be very surprising. Let's keep
> > things explicit with block-export-add. If the user wants to stop the
> > export, they should tell us so.
> 
> Here, keeping the nbd command different from the block-export command seems
> tolerable.  On the other hand, I wonder if Peter needs to change anything in
> libvirt's incremental backup code to handle this sudden disappearance of an
> NBD device during a disk hot-unplug (that is, either the presence of an
> ongoing pull-mode backup should block disk unplug, or libvirt needs a way to
> guarantee that an ongoing backup NBD device remains in spite of subsequent
> disk actions on the guest).  Depending on libvirt's needs, we may want to
> revisit the nbd command to have the same policy as block-export-add, plus an
> introspectible feature notation.

As long as we can keep the compatibility code local to qmp_nbd_*(), I
don't think it's too bad. In particular because it's already written.

Instead of adjusting libvirt to changes in the nbd-* commands, I'd
rather have it change over to block-export-*. I would like to see the
nbd-server-add/remove commands deprecated soon after we have the
replacements.

> > 
> > Move these things into the nbd-server-add QMP command handler so that
> > they apply only there.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >   include/block/nbd.h   |  3 ++-
> 
> > +void qmp_block_export_add(BlockExportOptions *export, Error **errp)
> > +{
> > +blk_exp_add(export, errp);
> >   }
> >   void qmp_nbd_server_add(BlockExportOptionsNbd *arg, Error **errp)
> >   {
> > -BlockExportOptions export = {
> > +BlockExport *export;
> > +BlockDriverState *bs;
> > +BlockBackend *on_eject_blk;
> > +
> > +BlockExportOptions export_opts = {
> >   .type = BLOCK_EXPORT_TYPE_NBD,
> >   .u.nbd = *arg,
> >   };
> > -qmp_block_export_add(&export, errp);
> > +
> > +/*
> > + * nbd-server-add doesn't complain when a read-only device should be
> > + * exported as writable, but simply downgrades it. This is an error 
> > with
> > + * block-export-add.
> 
> I'd be happy with either marking this deprecated now (and fixing it in two
> releases), or declaring it a bug in nbd-server-add now (and fixing it
> outright).

How about deprecating nbd-server-add completely?

> > + */
> > +bs = bdrv_lookup_bs(arg->device, arg->device, NULL);
> > +if (bs && bdrv_is_read_only(bs)) {
> > +arg->writable = false;
> > +}
> > +
> > +export = blk_exp_add(&export_opts, errp);
> > +if (!export) {
> > +return;
> > +}
> > +
> > +/*
> > + * nbd-server-add removes the export when the named BlockBackend used 
> > for
> > + * @device goes away.
> > + */
> > +on_eject_blk = blk_by_name(arg->device);
> > +if (on_eject_blk) {
> > +nbd_export_set_on_eject_blk(export, on_eject_blk);
> > +}
> 
> Wait - is the magic export removal tied only to exporting a drive name, and
> not a node name?  So as long as libvirt is using only node names whwen
> adding exports, a drive being unplugged won't interfere?

Yes, seems so. It's the existing behaviour, I'm only moving the code
around.

> Overall, the change makes sense to me, although I'd love to see if we could
> go further on the writable vs. read-only issue.

If nbd-server-add will be going away relatively soon, it's probably not
worth the trouble. But if you have reasons to keep it, maybe we should
consider it.

Kevin

Re: [RFC PATCH 08/22] nbd: Add max-connections to nbd-server-start

Am 19.08.2020 um 22:00 hat Eric Blake geschrieben:
> On 8/13/20 11:29 AM, Kevin Wolf wrote:
> > This is a QMP equivalent of qemu-nbd's --share option, limiting the
> > maximum number of clients that can attach at the same time.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >   qapi/block-export.json | 10 --
> >   include/block/nbd.h|  3 ++-
> >   block/monitor/block-hmp-cmds.c |  2 +-
> >   blockdev-nbd.c | 33 ++---
> >   qemu-storage-daemon.c  |  4 ++--
> >   5 files changed, 39 insertions(+), 13 deletions(-)
> > 
> > diff --git a/qapi/block-export.json b/qapi/block-export.json
> > index 40369814b4..1fdc55c53a 100644
> > --- a/qapi/block-export.json
> > +++ b/qapi/block-export.json
> > @@ -14,6 +14,8 @@
> >   # is only resolved at time of use, so can be deleted and
> >   # recreated on the fly while the NBD server is active.
> >   # If missing, it will default to denying access (since 4.0).
> > +# @max-connections: The maximum number of connections to allow at the same
> > +#   time, 0 for unlimited. (since 5.2; default: 0)
> 
> Nice way to add feature parity.
> 
> Limiting the number of connections (particularly for a writable client,
> where we cannot guarantee cache consistency between the connections), seems
> like a worthwhile feature to have; I've always found it odd that qemu-nbd
> and QMP nbd-server-add defaulted to different limits (1 vs. unlimited).  For
> reference, nbdkit defaults to unlimited, and I'm happy if
> qemu-storage-daemon does likewise; but changing qemu-nbd's default of 1
> would be backwards incompatible and may cause surprises (there's always
> 'qemu-nbd -e' when needed).  I also wonder if we should change 'qemu-nbd -e
> 0' to mean unlimited rather than an error (right now, qemu-iotests/common.rc
> uses -e 42 for all nbd-based tests for a saner limit than just 1, but it
> smells of being arbitrary compared to unlimited).

I think eventually the actual NBD server implementation in qemu-nbd
should go away and it should just reuse the QMP one. Changing the
default would remove one more difference between them, which can only be
helpful in this process. (Though of course having a different default is
still simple enough for a simple wrapper.)

Kevin

Re: deprecation of in-tree builds

2020-08-20 Thread Michael Tokarev

20.08.2020 13:54, Kevin Wolf wrote:
[]
>> My current plan is to land the Meson series first, because it is
>> really painful for Paolo to try to keep rebasing it as other
>> changes to the old build system occur. This would break
>> in-tree builds temporarily until the "automatic creation and
>> use of a builddir" patches go in on top of it.
> 
> Usually, our requirement is that patch series don't break anything. And
> if something slips through, whoever broke it is supposed to fix it, not
> whoever is affected.

In this case it isn't really broken per se - there's a simple
workaround (build in a subdir). This rule exist to keep the
bisectability, and this one is trivial by invoking this very
workaround.

/mjt

Re: [PATCH] ppc/pnv: Add a HIOMAP erase command

2020-08-20 Thread David Gibson

On Thu, Aug 20, 2020 at 09:36:50AM +0200, Cédric Le Goater wrote:
> The OPAL test suite runs a read-erase-write test on the PNOR :
> 
>   https://github.com/open-power/op-test/blob/master/testcases/OpTestPNOR.py
> 
> which revealed that the IPMI HIOMAP handlers didn't support
> HIOMAP_C_ERASE. Implement the sector erase command by writing 0xFF in
> the PNOR memory region.
> 
> Reported-by: Klaus Heinrich Kiwi 
> Signed-off-by: Cédric Le Goater 

Applied to ppc-for-5.2.

> ---
>  hw/ppc/pnv_bmc.c | 31 ++-
>  1 file changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/pnv_bmc.c b/hw/ppc/pnv_bmc.c
> index 2e1a03daa45a..0fb082fcb8ee 100644
> --- a/hw/ppc/pnv_bmc.c
> +++ b/hw/ppc/pnv_bmc.c
> @@ -140,6 +140,29 @@ static uint16_t bytes_to_blocks(uint32_t bytes)
>  return bytes >> BLOCK_SHIFT;
>  }
>  
> +static uint32_t blocks_to_bytes(uint16_t blocks)
> +{
> +return blocks << BLOCK_SHIFT;
> +}
> +
> +#define IPMI_ERR_UNSPECIFIED0xff
> +
> +static int hiomap_erase(PnvPnor *pnor, uint32_t offset, uint32_t size)
> +{
> +MemTxResult result;
> +int i;
> +
> +for (i = 0; i < size / 4; i++) {
> +result = memory_region_dispatch_write(&pnor->mmio, offset + i * 4,
> +  0x, MO_32,
> +  MEMTXATTRS_UNSPECIFIED);
> +if (result != MEMTX_OK) {
> +return -1;
> +}
> +}
> +return 0;
> +}
> +
>  static void hiomap_cmd(IPMIBmcSim *ibs, uint8_t *cmd, unsigned int cmd_len,
> RspBuffer *rsp)
>  {
> @@ -155,10 +178,16 @@ static void hiomap_cmd(IPMIBmcSim *ibs, uint8_t *cmd, 
> unsigned int cmd_len,
>  switch (cmd[2]) {
>  case HIOMAP_C_MARK_DIRTY:
>  case HIOMAP_C_FLUSH:
> -case HIOMAP_C_ERASE:
>  case HIOMAP_C_ACK:
>  break;
>  
> +case HIOMAP_C_ERASE:
> +if (hiomap_erase(pnor, blocks_to_bytes(cmd[5] << 8 | cmd[4]),
> +blocks_to_bytes(cmd[7] << 8 | cmd[6]))) {
> +rsp_buffer_set_error(rsp, IPMI_ERR_UNSPECIFIED);
> +}
> +break;
> +
>  case HIOMAP_C_GET_INFO:
>  rsp_buffer_push(rsp, 2);  /* Version 2 */
>  rsp_buffer_push(rsp, BLOCK_SHIFT); /* block size */

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [PATCH v7 35/47] commit: Deal with filters

On 19.08.20 19:58, Kevin Wolf wrote:
> Am 25.06.2020 um 17:22 hat Max Reitz geschrieben:
>> This includes some permission limiting (for example, we only need to
>> take the RESIZE permission if the base is smaller than the top).
>>
>> Signed-off-by: Max Reitz 
>> ---
>>  block/block-backend.c  |  9 +++-
>>  block/commit.c | 96 +-
>>  block/monitor/block-hmp-cmds.c |  2 +-
>>  blockdev.c |  4 +-
>>  4 files changed, 81 insertions(+), 30 deletions(-)
>>
>> diff --git a/block/block-backend.c b/block/block-backend.c
>> index 6936b25c83..7f2c7dbccc 100644
>> --- a/block/block-backend.c
>> +++ b/block/block-backend.c
>> @@ -2271,8 +2271,13 @@ int blk_commit_all(void)
>>  AioContext *aio_context = blk_get_aio_context(blk);
>>  
>>  aio_context_acquire(aio_context);
>> -if (blk_is_inserted(blk) && blk->root->bs->backing) {
>> -int ret = bdrv_commit(blk->root->bs);
> 
> The old code didn't try to commit nodes that don't have a backing file.
> 
>> +if (blk_is_inserted(blk)) {
>> +BlockDriverState *non_filter;
>> +int ret;
>> +
>> +/* Legacy function, so skip implicit filters */
>> +non_filter = bdrv_skip_implicit_filters(blk->root->bs);
>> +ret = bdrv_commit(non_filter);
> 
> The new one tries unconditionally. For nodes without a backing file,
> bdrv_commit() will return -ENOTSUP, so the whole function fails.

:(

Hm.  Should I fix it by checking for
bdrv_cow_bs(bdrv_skip_implicit_filters())?  Or bdrv_backing_chain_next()
and change the bdrv_skip_implicit_filters() to a bdrv_skip_filters()?  I
feel like that would make even more sense.

> (First real bug at patch 35. I almost thought I wouldn't find any!)

:)

>>  if (ret < 0) {
>>  aio_context_release(aio_context);
>>  return ret;
>> diff --git a/block/commit.c b/block/commit.c
>> index 7732d02dfe..4122b6736d 100644
>> --- a/block/commit.c
>> +++ b/block/commit.c
>> @@ -37,6 +37,7 @@ typedef struct CommitBlockJob {
>>  BlockBackend *top;
>>  BlockBackend *base;
>>  BlockDriverState *base_bs;
>> +BlockDriverState *base_overlay;
>>  BlockdevOnError on_error;
>>  bool base_read_only;
>>  bool chain_frozen;
> 
> Hm, again this mysterious base_overlay. I know that stream introduced it
> to avoid freezing the link to base, but commit doesn't seem to do that.
> 
> Is it to avoid using the block status of filter drivers between
> base_overlay and base?

Yes.

> If so, I guess that goes back to the question I
> raised earlier in this series: What is the block status supposed to tell
> for filter nodes?

Honestly, I would really like to get away without having to answer that
question in this series.  Intuitively, I feel like falling through to
the next data-bearing layer is not something most callers want.  But I’d
rather investigate that question separately from this series (even
though that likely means we’ll never do it), and just treat it as it is
in this series.

> But anyway, in contrast to mirror, commit actually freezes the chain
> between commit_top_bs and base, so it should be safe at least.
> 
>> @@ -89,7 +90,7 @@ static void commit_abort(Job *job)
>>   * XXX Can (or should) we somehow keep 'consistent read' blocked even
>>   * after the failed/cancelled commit job is gone? If we already wrote
>>   * something to base, the intermediate images aren't valid any more. */
>> -bdrv_replace_node(s->commit_top_bs, backing_bs(s->commit_top_bs),
>> +bdrv_replace_node(s->commit_top_bs, s->commit_top_bs->backing->bs,
>>&error_abort);
>>  
>>  bdrv_unref(s->commit_top_bs);
>> @@ -153,7 +154,7 @@ static int coroutine_fn commit_run(Job *job, Error 
>> **errp)
>>  break;
>>  }
>>  /* Copy if allocated above the base */
>> -ret = bdrv_is_allocated_above(blk_bs(s->top), blk_bs(s->base), 
>> false,
>> +ret = bdrv_is_allocated_above(blk_bs(s->top), s->base_overlay, true,
>>offset, COMMIT_BUFFER_SIZE, &n);
>>  copy = (ret == 1);
>>  trace_commit_one_iteration(s, offset, n, ret);
>> @@ -253,15 +254,35 @@ void commit_start(const char *job_id, BlockDriverState 
>> *bs,
>>  CommitBlockJob *s;
>>  BlockDriverState *iter;
>>  BlockDriverState *commit_top_bs = NULL;
>> +BlockDriverState *filtered_base;
>>  Error *local_err = NULL;
>> +int64_t base_size, top_size;
>> +uint64_t perms, iter_shared_perms;
>>  int ret;
>>  
>>  assert(top != bs);
>> -if (top == base) {
>> +if (bdrv_skip_filters(top) == bdrv_skip_filters(base)) {
>>  error_setg(errp, "Invalid files for merge: top and base are the 
>> same");
>>  return;
>>  }
>>  
>> +base_size = bdrv_getlength(base);
>> +if (base_size < 0) {
>> +error_setg_errno(errp, -base_

Re: [PATCH v7 14/47] stream: Deal with filters

On 20.08.20 12:49, Vladimir Sementsov-Ogievskiy wrote:
> 20.08.2020 12:22, Max Reitz wrote:
>> On 20.08.20 10:31, Max Reitz wrote:
>>
>> [...]
>>
>>> So all in all, I believe the biggest surprise about what’s written into
>>> the top layer isn’t that it may be a json:{} filename, but the filename
>>> of a node that maybe doesn’t even exist anymore?  (Oh, no, please don’t
>>> tell me you can delete it and get an invalid pointer read...)
>>
>> (I tried triggering that, but, oh, it’s strdup’ed() in stream_start().
>> I’m a bit daft.)
>>
> 
> 
> If it's broken anyway, probably we can just revert c624b015bf and start
> to freeze base again?

Well, it’s only broken if you care about the backing filename string
that’s written to @top.  So it isn’t broken altogether.

Though, well.  If we all agree to just revert it and maybe add a @bottom
parameter instead, then I suppose we could do it.

(Maybe in a follow-up, though.)

Max



signature.asc
Description: OpenPGP digital signature

Re: [PULL v7 000/151] Meson-based build system

On 20/08/20 12:33, Peter Maydell wrote:
> 'make check' still fails for the all-linux-static config, this
> time for a different reason:
> 
> make: *** No rule to make target 'check-qtest', needed by 'check'. Stop.

Oh, there are two "check:" rules.

I pushed again the tag with just this delta:

diff --git a/tests/Makefile.include b/tests/Makefile.include
index ae4315e00e..9ac8f5b86a 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -526,13 +526,12 @@ check-acceptance: check-venv $(TESTS_RESULTS_DIR) 
get-vm-images
 check-block:
 check-build: build-unit
 
-check: check-unit
 check-clean:
rm -rf $(check-unit-y) tests/*.o tests/*/*.o $(QEMU_IOTESTS_HELPERS-y)
rm -f tests/test-qapi-gen-timestamp
rm -rf $(TESTS_VENV_DIR) $(TESTS_RESULTS_DIR)
 
-check: check-block check-qapi-schema check-unit check-qtest check-decodetree
+check: check-unit
 
 clean: check-clean
 

If any more reasonable expectations from the older build system turn out to
be broken, like in the binaries case, I will be happy to accomodate them

Paolo

Re: deprecation of in-tree builds

On 20/08/20 12:54, Kevin Wolf wrote:
>> Paolo's conversion-to-Meson patchseries is about to land, so now
>> is the time for people who would like this "automatically create
>> a build directory and use it" behaviour to write the necessary
>> patches. Any volunteers ?
>>
>> My current plan is to land the Meson series first, because it is
>> really painful for Paolo to try to keep rebasing it as other
>> changes to the old build system occur. This would break
>> in-tree builds temporarily until the "automatic creation and
>> use of a builddir" patches go in on top of it.
>
> Usually, our requirement is that patch series don't break anything. And
> if something slips through, whoever broke it is supposed to fix it, not
> whoever is affected.

The Meson conversion was announced in October 2019 as breaking in-tree
builds, and the deprecation request is from March 2020.  So I don't
think this is a breakage but rather a widely-announced change.

Paolo

Re: [PATCH v5 1/1] audio/jack: fix use after free segfault

2020-08-20 Thread Christian Schoenebeck

On Donnerstag, 20. August 2020 12:54:49 CEST Paolo Bonzini wrote:
> More on the practical side, recursive mutex are an easy way to get a
> deadlock.  It's a common idiom to do
> 
> /* Need to take foo->lock outside bar->lock.  */
> mutex_unlock(&bar->lock);
> mutex_lock(&foo->lock);
> mutex_lock(&bar->lock);

The general theoretical implications about recursive locks was clear to me. 
AFAICS your point is that a recursive lock could mislead poeple taking things 
easy and running into a deadlock scenario like outlined by you.

My point was if it happens for whatever reason that a main IO mutex lock was 
accidentally introduced, i.e. without knowing it was already locked on a 
higher level, wouldn't it make sense to deal with this in some kind of 
defensive way?

One way would be a recursive type and logging a warning, which you obviously 
don't like; another option would be an assertion fault instead to make 
developers immediately aware about the double lock on early testing. Because 
on a large scale project like this, it is almost impossible for all developers 
to be aware about all implied locks. Don't you think so?

At least IMO the worst case would be a double unlock on a non-recursive main 
thread mutex and running silently into undefined behaviour.

> My suggestion is to work towards protecting the audio code with its own
> mutex(es) and ignore the existence of the BQL for subsystems that can do
> so (audio is a prime candidate).  Also please add comments to
> audio_int.h about which functions are called from other threads than the
> QEMU main thread.

That main thread lock came up here because I noticed this API comment on 
qemu_bh_cancel():

  "While cancellation itself is also wait-free and thread-safe, it can of   

   course race with the loop that executes bottom halves unless you are 
   holding the iothread mutex.  This makes it mostly useless if you are not 
   holding the mutex."

So this lock was not about driver internal data protection, but rather about 
dealing with the BH API correctly.

Best regards,
Christian Schoenebeck

Re: deprecation of in-tree builds

On Thu, 20 Aug 2020 at 12:56, Paolo Bonzini  wrote:
>
> On 20/08/20 12:54, Kevin Wolf wrote:
> >> Paolo's conversion-to-Meson patchseries is about to land, so now
> >> is the time for people who would like this "automatically create
> >> a build directory and use it" behaviour to write the necessary
> >> patches. Any volunteers ?
> >>
> >> My current plan is to land the Meson series first, because it is
> >> really painful for Paolo to try to keep rebasing it as other
> >> changes to the old build system occur. This would break
> >> in-tree builds temporarily until the "automatic creation and
> >> use of a builddir" patches go in on top of it.
> >
> > Usually, our requirement is that patch series don't break anything. And
> > if something slips through, whoever broke it is supposed to fix it, not
> > whoever is affected.
>
> The Meson conversion was announced in October 2019 as breaking in-tree
> builds, and the deprecation request is from March 2020.  So I don't
> think this is a breakage but rather a widely-announced change.

No, we need to put in the back-compat for making basic in-source-tree
configure/make/make install work. There was a fair amount of discussion
of this on-list and the clear consensus was that it was widely
enough used, that it wouldn't be a huge amount of work to make it
do a behind-the-scenes creation of the buildir, and that we should
retain it.

thanks
-- PMM

Re: [ovirt-devel] Re: device compatibility interface for live migration with assigned devices

2020-08-20 Thread Cornelia Huck

On Wed, 19 Aug 2020 17:28:38 +0800
Jason Wang  wrote:

> On 2020/8/19 下午4:13, Yan Zhao wrote:
> > On Wed, Aug 19, 2020 at 03:39:50PM +0800, Jason Wang wrote:  
> >> On 2020/8/19 下午2:59, Yan Zhao wrote:  
> >>> On Wed, Aug 19, 2020 at 02:57:34PM +0800, Jason Wang wrote:  
>  On 2020/8/19 上午11:30, Yan Zhao wrote:  
> > hi All,
> > could we decide that sysfs is the interface that every VFIO vendor 
> > driver
> > needs to provide in order to support vfio live migration, otherwise the
> > userspace management tool would not list the device into the compatible
> > list?
> >
> > if that's true, let's move to the standardizing of the sysfs interface.
> > (1) content
> > common part: (must)
> >   - software_version: (in major.minor.bugfix scheme)  
>  This can not work for devices whose features can be negotiated/advertised
>  independently. (E.g virtio devices)

I thought the 'software_version' was supposed to describe kind of a
'protocol version' for the data we transmit? I.e., you add a new field,
you bump the version number.

>   
> >>> sorry, I don't understand here, why virtio devices need to use vfio 
> >>> interface?  
> >>
> >> I don't see any reason that virtio devices can't be used by VFIO. Do you?
> >>
> >> Actually, virtio devices have been used by VFIO for many years:
> >>
> >> - passthrough a hardware virtio devices to userspace(VM) drivers
> >> - using virtio PMD inside guest
> >>  
> > So, what's different for it vs passing through a physical hardware via 
> > VFIO?  
> 
> 
> The difference is in the guest, the device could be either real hardware 
> or emulated ones.
> 
> 
> > even though the features are negotiated dynamically, could you explain
> > why it would cause software_version not work?  
> 
> 
> Virtio device 1 supports feature A, B, C
> Virtio device 2 supports feature B, C, D
> 
> So you can't migrate a guest from device 1 to device 2. And it's 
> impossible to model the features with versions.

We're talking about the features offered by the device, right? Would it
be sufficient to mandate that the target device supports the same
features or a superset of the features supported by the source device?

> 
> 
> >
> >  
> >>> I think this thread is discussing about vfio related devices.
> >>>  
> >   - device_api: vfio-pci or vfio-ccw ...
> >   - type: mdev type for mdev device or
> >   a signature for physical device which is a counterpart for
> >mdev type.
> >
> > device api specific part: (must)
> >  - pci id: pci id of mdev parent device or pci id of physical pci
> >device (device_api is vfio-pci)API here.  
>  So this assumes a PCI device which is probably not true.
>   
> >>> for device_api of vfio-pci, why it's not true?
> >>>
> >>> for vfio-ccw, it's subchannel_type.  
> >>
> >> Ok but having two different attributes for the same file is not good idea.
> >> How mgmt know there will be a 3rd type?  
> > that's why some attributes need to be common. e.g.
> > device_api: it's common because mgmt need to know it's a pci device or a
> >  ccw device. and the api type is already defined vfio.h.
> > (The field is agreed by and actually suggested by Alex in previous 
> > mail)
> > type: mdev_type for mdev. if mgmt does not understand it, it would not
> >be able to create one compatible mdev device.
> > software_version: mgmt can compare the major and minor if it understands
> >this fields.  
> 
> 
> I think it would be helpful if you can describe how mgmt is expected to 
> work step by step with the proposed sysfs API. This can help people to 
> understand.

My proposal would be:
- check that device_api matches
- check possible device_api specific attributes
- check that type matches [I don't think the combination of mdev types
  and another attribute to determine compatibility is a good idea;
  actually, the current proposal confuses me every time I look at it]
- check that software_version is compatible, assuming semantic
  versioning
- check possible type-specific attributes

> 
> Thanks for the patience. Since sysfs is uABI, when accepted, we need 
> support it forever. That's why we need to be careful.

Nod.

(...)

[PATCH] hw/arm/virt: Default to architecture appropriate CPU

2020-08-20 Thread Punit Agrawal

The default_cpu_type for the Virt machine is set to "cortex-a15" for
both the arm (qemu-system-arm) and aarch64 (qemu-system-aarch64)
targets.

As a result, starting the aarch64 target with "-machine virt" defaults
to booting with a Arm v7 cpu which is counter to
expectation. Debugging the issue is further hampered by lack of any
output from a supplied arm64 firmware as it is now running on the
wrong CPU type.

Fix this by defaulting to the "max" capability CPU for the target
architecture. After the patch both the arm and aarch64 qemu default to
the equivalent of passing "-cpu max".

Signed-off-by: Punit Agrawal 
---
Hi Peter,

I spent inordinately long time scratching my head trying to figure out
why my arm64 VM wasn't booting in TCG mode.

Please consider merging if you don't see any issues with the patch.

Thanks,
Punit

 hw/arm/virt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ecfee362a1..7c5cbdba87 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2424,7 +2424,7 @@ static void virt_machine_class_init(ObjectClass *oc, void 
*data)
 mc->minimum_page_bits = 12;
 mc->possible_cpu_arch_ids = virt_possible_cpu_arch_ids;
 mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
-mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a15");
+mc->default_cpu_type = ARM_CPU_TYPE_NAME("max");
 mc->get_default_cpu_node_id = virt_get_default_cpu_node_id;
 mc->kvm_type = virt_kvm_type;
 assert(!mc->get_hotplug_handler);
-- 
2.28.0

[PATCH] ui: Add more mouse buttons to SPICE

2020-08-20 Thread Frediano Ziglio

Add support for SIDE and EXTRA buttons.

Signed-off-by: Frediano Ziglio 
---
 ui/spice-input.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ui/spice-input.c b/ui/spice-input.c
index cd4bb0043f..d5bba231c9 100644
--- a/ui/spice-input.c
+++ b/ui/spice-input.c
@@ -123,6 +123,8 @@ static void spice_update_buttons(QemuSpicePointer *pointer,
 [INPUT_BUTTON_RIGHT]   = 0x02,
 [INPUT_BUTTON_WHEEL_UP]= 0x10,
 [INPUT_BUTTON_WHEEL_DOWN]  = 0x20,
+[INPUT_BUTTON_SIDE]= 0x40,
+[INPUT_BUTTON_EXTRA]   = 0x80,
 };
 
 if (wheel < 0) {
-- 
2.25.4

Re: [PATCH v4 1/4] migration: Add block-bitmap-mapping parameter

On 20.08.20 03:17, Eric Blake wrote:
> On 8/18/20 8:32 AM, Max Reitz wrote:
>> This migration parameter allows mapping block node names and bitmap
>> names to aliases for the purpose of block dirty bitmap migration.
>>
>> This way, management tools can use different node and bitmap names on
>> the source and destination and pass the mapping of how bitmaps are to be
>> transferred to qemu (on the source, the destination, or even both with
>> arbitrary aliases in the migration stream).
>>
>> While touching this code, fix a bug where bitmap names longer than 255
>> bytes would fail an assertion in qemu_put_counted_string().
>>
>> Suggested-by: Vladimir Sementsov-Ogievskiy 
>> Signed-off-by: Max Reitz 
>> ---
> 
>> +##
>> +# @BitmapMigrationNodeAlias:
>> +#
>> +# Maps a block node name and the bitmaps it has to aliases for dirty
>> +# bitmap migration.
>> +#
>> +# @node-name: A block node name.
>> +#
>> +# @alias: An alias block node name for migration (for example the
>> +# node name on the opposite site).
>> +#
>> +# @bitmaps: Mappings for the bitmaps on this node.
>> +#
>> +# Since: 5.2
>> +##
>> +{ 'struct': 'BitmapMigrationNodeAlias',
>> +  'data': {
>> +  'node-name': 'str',
>> +  'alias': 'str',
>> +  'bitmaps': [ 'BitmapMigrationBitmapAlias' ]
>> +  } }
> 
> Possible change: should 'alias' be optional (if absent, it defaults to
> 'node-name')?  But that can be done on top, if we like it.

I don’t know whether anyone would make use of that, though, so I’d defer
it until someone wants it.

>> +static GHashTable *construct_alias_map(const
>> BitmapMigrationNodeAliasList *bbm,
>> +   bool name_to_alias,
>> +   Error **errp)
>> +{
>> +    GHashTable *alias_map;
>> +    size_t max_node_name_len =
>> +    sizeof(((BlockDriverState *)NULL)->node_name) - 1;
> 
> Looks a bit nicer as = sizeof_field(BlockDriverState, node_name) - 1.

Oh, I didn’t know we had that.

>> +
>> +    alias_map = g_hash_table_new_full(g_str_hash, g_str_equal,
>> +  g_free,
>> free_alias_map_inner_node);
>> +
>> +    for (; bbm; bbm = bbm->next) {
>> +    const BitmapMigrationNodeAlias *bmna = bbm->value;
>> +    const BitmapMigrationBitmapAliasList *bmbal;
>> +    AliasMapInnerNode *amin;
>> +    GHashTable *bitmaps_map;
>> +    const char *node_map_from, *node_map_to;
>> +
>> +    if (!id_wellformed(bmna->alias)) {
>> +    error_setg(errp, "The node alias '%s' is not well-formed",
>> +   bmna->alias);
>> +    goto fail;
>> +    }
>> +
>> +    if (strlen(bmna->alias) > 255) {
> 
> Magic number.  UINT8_MAX seems better (since the limit really is due to
> our migration format limiting to one byte).

Well, yes, but qemu_put_counted_string() uses that magic number, too.
*shrug*

>> +    g_hash_table_insert(alias_map, g_strdup(node_map_from), amin);
>> +
>> +    for (bmbal = bmna->bitmaps; bmbal; bmbal = bmbal->next) {
>> +    const BitmapMigrationBitmapAlias *bmba = bmbal->value;
>> +    const char *bmap_map_from, *bmap_map_to;
>> +
>> +    if (strlen(bmba->alias) > 255) {
> 
> and again
> 
>> +    error_setg(errp,
>> +   "The bitmap alias '%s' is longer than 255
>> bytes",
>> +   bmba->alias);
>> +    goto fail;
>> +    }
> 
> Thanks for adding in the length checking since last revision!
> 
> 
>> @@ -326,12 +538,29 @@ static int add_bitmaps_to_list(DBMSaveState *s,
>> BlockDriverState *bs,
>>   return -1;
>>   }
>>   +    if (bitmap_aliases) {
>> +    bitmap_alias = g_hash_table_lookup(bitmap_aliases,
>> bitmap_name);
>> +    if (!bitmap_alias) {
>> +    /* Skip bitmaps with no alias */
>> +    continue;
>> +    }
>> +    } else {
>> +    if (strlen(bitmap_name) > 255) {
>> +    error_report("Cannot migrate bitmap '%s' on node '%s': "
>> + "Name is longer than 255 bytes",
>> + bitmap_name, bs_name);
>> +    return -1;
> 
> Another one.
> 
> 
> Reviewed-by: Eric Blake 
> 
> I'm happy to make those touchups, and put this on my bitmaps queue for a
> pull request as soon as Paolo's meson stuff stabilizes.

Sounds good to me, thanks!

Max



signature.asc
Description: OpenPGP digital signature

Re: [PATCH v4 3/3] hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology

On Wed, 19 Aug 2020 17:42:58 -0500
Babu Moger  wrote:

> On 8/19/20 7:18 AM, Igor Mammedov wrote:
> > On Fri, 14 Aug 2020 16:39:40 -0500
> > Babu Moger  wrote:
> >   
> >> Remove node_id, nr_nodes and nodes_per_pkg from topology. Use
> >> die_id, nr_dies and dies_per_pkg which is already available.
> >> Removes the confusion over two variables.
> >>
> >> With node_id removed in topology the uninitialized memory issue
> >> with -device and CPU hotplug will be fixed.
> >>
> >> Link: 
> >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.redhat.com%2Fshow_bug.cgi%3Fid%3D1828750&data=02%7C01%7Cbabu.moger%40amd.com%7C466254703c904bd4c86c08d8443a0e91%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637334363427680286&sdata=%2Fr7Mucca8Pr%2BrjvwJ6S9zxiIg3HCKAiPp4PvGP3nvZI%3D&reserved=0
> >> Signed-off-by: Babu Moger 
> >> ---
> >>  hw/i386/pc.c   |1 -
> >>  hw/i386/x86.c  |1 -
> >>  include/hw/i386/topology.h |   40 +---
> >>  target/i386/cpu.c  |   11 +++
> >>  target/i386/cpu.h  |1 -
> >>  5 files changed, 12 insertions(+), 42 deletions(-)
> >>
> >> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> >> index 47c5ca3e34..0ae5cb3af4 100644
> >> --- a/hw/i386/pc.c
> >> +++ b/hw/i386/pc.c
> >> @@ -1498,7 +1498,6 @@ static void pc_cpu_pre_plug(HotplugHandler 
> >> *hotplug_dev,
> >>  init_topo_info(&topo_info, x86ms);
> >>  
> >>  env->nr_dies = x86ms->smp_dies;
> >> -env->nr_nodes = topo_info.nodes_per_pkg;
> >>  env->pkg_offset = x86ms->apicid_pkg_offset(&topo_info);
> >>  
> >>  /*
> >> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> >> index e90c42d2fc..4efa1f8b87 100644
> >> --- a/hw/i386/x86.c
> >> +++ b/hw/i386/x86.c
> >> @@ -62,7 +62,6 @@ inline void init_topo_info(X86CPUTopoInfo *topo_info,
> >>  {
> >>  MachineState *ms = MACHINE(x86ms);
> >>  
> >> -topo_info->nodes_per_pkg = ms->numa_state->num_nodes / 
> >> ms->smp.sockets;
> >>  topo_info->dies_per_pkg = x86ms->smp_dies;
> >>  topo_info->cores_per_die = ms->smp.cores;
> >>  topo_info->threads_per_core = ms->smp.threads;
> >> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> >> index 07239f95f4..05ddde7ba0 100644
> >> --- a/include/hw/i386/topology.h
> >> +++ b/include/hw/i386/topology.h
> >> @@ -47,14 +47,12 @@ typedef uint32_t apic_id_t;
> >>  
> >>  typedef struct X86CPUTopoIDs {
> >>  unsigned pkg_id;
> >> -unsigned node_id;
> >>  unsigned die_id;
> >>  unsigned core_id;
> >>  unsigned smt_id;
> >>  } X86CPUTopoIDs;
> >>  
> >>  typedef struct X86CPUTopoInfo {
> >> -unsigned nodes_per_pkg;
> >>  unsigned dies_per_pkg;
> >>  unsigned cores_per_die;
> >>  unsigned threads_per_core;
> >> @@ -89,11 +87,6 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo 
> >> *topo_info)
> >>  return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
> >>  }
> >>  
> >> -/* Bit width of the node_id field per socket */
> >> -static inline unsigned apicid_node_width_epyc(X86CPUTopoInfo *topo_info)
> >> -{
> >> -return apicid_bitwidth_for_count(MAX(topo_info->nodes_per_pkg, 1));
> >> -}
> >>  /* Bit offset of the Core_ID field
> >>   */
> >>  static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
> >> @@ -114,30 +107,23 @@ static inline unsigned 
> >> apicid_pkg_offset(X86CPUTopoInfo *topo_info)
> >>  return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
> >>  }
> >>  
> >> -#define NODE_ID_OFFSET 3 /* Minimum node_id offset if numa configured */
> >> +#define EPYC_DIE_OFFSET 3 /* Minimum die_id offset if numa configured */  
> >   ^^
> > 
> > from EPYC's pov NUMA is always configured, there no 'if'
> > 
> > but I have a question, is it possible to drop EPYC_DIE_OFFSET
> > and reuse apicid_die_offset(), will it break something?  
> 
> Yes. I am also thinking about it. We can go back to existing
> apicid_die_width().

Given we are using die_id now, we probably can get rid of all
topo hooks to generate apicid.

> Looking back again, I think all these code changes related to node_id is
> causing more issues than fixes. We have added all these code to handle
> some non- SPECed configurations like
>  https://bugzilla.redhat.com/show_bug.cgi?id=1728166.
> 
> Going forward it might create even more issues. Now, I think we should go
> back and remove all these changes and just use the default decoding. Most
> of the SPECed configuration would work just fine. With some non-SPECed
> user inputs, it will create some sub-optimal configuration. If we can live
> with that we will be just fine. I did not anticipate all these problems
> when I originally started this work. Sorry, my bad.

Topology code is complex and sometimes it's easier to add a new thing,
it's just that not every time it turns out as expected.
We overlooked posiblilty to reuse die-id, which has lead to more
complex and fr

Re: [PATCH v4 1/4] migration: Add block-bitmap-mapping parameter


18.08.2020 16:32, Max Reitz wrote:

This migration parameter allows mapping block node names and bitmap
names to aliases for the purpose of block dirty bitmap migration.

This way, management tools can use different node and bitmap names on
the source and destination and pass the mapping of how bitmaps are to be
transferred to qemu (on the source, the destination, or even both with
arbitrary aliases in the migration stream).

While touching this code, fix a bug where bitmap names longer than 255
bytes would fail an assertion in qemu_put_counted_string().

Suggested-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Max Reitz 
---
  qapi/migration.json| 101 +++-
  migration/migration.h  |   3 +
  migration/block-dirty-bitmap.c | 409 -
  migration/migration.c  |  30 +++
  monitor/hmp-cmds.c |  30 +++
  5 files changed, 517 insertions(+), 56 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index ea53b23dca..0c4ae102b1 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json


[..]


  #
+# @block-bitmap-mapping: Maps block nodes and bitmaps on them to
+#  aliases for the purpose of dirty bitmap migration.  Such
+#  aliases may for example be the corresponding names on the
+#  opposite site.
+#  The mapping must be one-to-one, but not necessarily
+#  complete: On the source, unmapped bitmaps and all bitmaps
+#  on unmapped nodes will be ignored.  On the destination,
+#  all unmapped aliases in the incoming migration stream will
+#  be reported, but they will not result in failure.

Actually, on unknown alias we cancel incoming bitmap migration, which means 
that destination vm continues to run, other (non-bitmap) migration states 
continue to migrate but all further chunks of bitmap migration will be ignored. 
(I'm not sure it worth be mentioned)


+#  Note that the destination does not know about bitmaps it
+#  does not receive, so there is no limitation or requirement
+#  regarding the number of bitmaps received, or how they are
+#  named, or on which nodes they are placed.
+#  By default (when this parameter has never been set), bitmap
+#  names are mapped to themselves.  Nodes are mapped to their
+#  block device name if there is one, and to their node name
+#  otherwise. (Since 5.2)
+#
  # Since: 2.4
  ##
  { 'enum': 'MigrationParameter',
@@ -656,7 +712,8 @@
 'multifd-channels',
 'xbzrle-cache-size', 'max-postcopy-bandwidth',
 'max-cpu-throttle', 'multifd-compression',
-   'multifd-zlib-level' ,'multifd-zstd-level' ] }
+   'multifd-zlib-level' ,'multifd-zstd-level',
+   'block-bitmap-mapping' ] }
  
  ##


[..]


@@ -303,21 +497,39 @@ static int add_bitmaps_to_list(DBMSaveState *s, 
BlockDriverState *bs,
  return 0;
  }
  
+bitmap_name = bdrv_dirty_bitmap_name(bitmap);

+
  if (!bs_name || strcmp(bs_name, "") == 0) {
  error_report("Bitmap '%s' in unnamed node can't be migrated",
- bdrv_dirty_bitmap_name(bitmap));
+ bitmap_name);
  return -1;
  }
  
-if (bs_name[0] == '#') {

+if (alias_map) {
+const AliasMapInnerNode *amin = g_hash_table_lookup(alias_map, 
bs_name);
+
+if (!amin) {
+/* Skip bitmaps on nodes with no alias */
+return 0;
+}
+
+node_alias = amin->string;
+bitmap_aliases = amin->subtree;
+} else {
+node_alias = bs_name;
+bitmap_aliases = NULL;
+}
+
+if (node_alias[0] == '#') {
  error_report("Bitmap '%s' in a node with auto-generated "
   "name '%s' can't be migrated",
- bdrv_dirty_bitmap_name(bitmap), bs_name);
+ bitmap_name, node_alias);
  return -1;
  }


This check is related only to pre-alias_map behavior, so it's probably better 
to keep it inside else{} branch above. Still, aliases already checked to be 
wellformed, so this check will be always false anyway for aliases and will not 
hurt.

  
  FOR_EACH_DIRTY_BITMAP(bs, bitmap) {

-if (!bdrv_dirty_bitmap_name(bitmap)) {
+bitmap_name = bdrv_dirty_bitmap_name(bitmap);
+if (!bitmap_name) {
  continue;
  }
  


[..]


@@ -770,8 +1014,10 @@ static int dirty_bitmap_load_bits(QEMUFile *f, 
DBMLoadState *s)
  return 0;
  }
  
-static int dirty_bitmap_load_header(QEMUFile *f, DBMLoadState *s)

+static int dirty_bitmap_load_header(QEMUFile *f, DBMLoadState *s,
+GHashTable *alias_map)
  {
+GHashTable *bitmap_alias_map = NULL;
  Error *local_err = NULL;
  bool nothing;
  s->flags = qemu_get_bitmap_flags(f);
@@ -780,28 +1026,73 @@ static int dirty_bitmap_load_header(QEMUFile *f, 
DBMLoadState *s)
  nothing =

[PATCH] ui: Add more mouse buttons to SPICE

2020-08-20 Thread Frediano Ziglio

From: Frediano Ziglio 

Add support for SIDE and EXTRA buttons.

Signed-off-by: Frediano Ziglio 
---
 ui/spice-input.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ui/spice-input.c b/ui/spice-input.c
index cd4bb0043f..d5bba231c9 100644
--- a/ui/spice-input.c
+++ b/ui/spice-input.c
@@ -123,6 +123,8 @@ static void spice_update_buttons(QemuSpicePointer *pointer,
 [INPUT_BUTTON_RIGHT]   = 0x02,
 [INPUT_BUTTON_WHEEL_UP]= 0x10,
 [INPUT_BUTTON_WHEEL_DOWN]  = 0x20,
+[INPUT_BUTTON_SIDE]= 0x40,
+[INPUT_BUTTON_EXTRA]   = 0x80,
 };
 
 if (wheel < 0) {
-- 
2.25.4

Re: [PATCH v2 04/58] pl110: Rename pl110_version enum values

On 8/20/20 2:11 AM, Eduardo Habkost wrote:
> The PL110 enum value name will conflict with the PL110 type cast
> checker, when we replace the existing macro with an inline
> function.  Add a VERSION_ prefix to all pl110_version enum
> values, to avoid conflicts.
> 
> Signed-off-by: Eduardo Habkost 
> ---
> Changes v1 -> v2:
> * Fixed typo on commit message
> * Rename all enum values to VERSION_* (Philippe Mathieu-Daudé)
> 
> ---
> Cc: Peter Maydell 
> Cc: qemu-...@nongnu.org
> Cc: qemu-devel@nongnu.org
> ---
>  hw/display/pl110.c | 22 +++---
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH v2 36/58] piix: Move QOM macros to header

On 8/20/20 2:12 AM, Eduardo Habkost wrote:
> This will make future conversion to OBJECT_DECLARE* easier.
> 
> Signed-off-by: Eduardo Habkost 
> ---
> Changes series v1 -> v2: new patch in series v2
> 
> Cc: "Michael S. Tsirkin" 
> Cc: Marcel Apfelbaum 
> Cc: "Hervé Poussineau" 
> Cc: "Philippe Mathieu-Daudé" 
> Cc: Aleksandar Markovic 
> Cc: Aurelien Jarno 
> Cc: qemu-devel@nongnu.org
> ---
>  include/hw/southbridge/piix.h | 4 
>  hw/isa/piix3.c| 4 
>  2 files changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH v2 42/58] vfio/pci: Move QOM macros to header

On 8/20/20 2:12 AM, Eduardo Habkost wrote:
> This will make future conversion to OBJECT_DECLARE* easier.
> 
> Signed-off-by: Eduardo Habkost 
> ---
> Changes series v1 -> v2: new patch in series v2
> 
> Cc: Alex Williamson 
> Cc: qemu-devel@nongnu.org
> ---
>  hw/vfio/pci.h | 3 +++
>  hw/vfio/pci.c | 3 ---
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH v2 42/58] vfio/pci: Move QOM macros to header

On 8/20/20 2:12 AM, Eduardo Habkost wrote:
> This will make future conversion to OBJECT_DECLARE* easier.
> 
> Signed-off-by: Eduardo Habkost 
> ---
> Changes series v1 -> v2: new patch in series v2
> 
> Cc: Alex Williamson 
> Cc: qemu-devel@nongnu.org
> ---
>  hw/vfio/pci.h | 3 +++
>  hw/vfio/pci.c | 3 ---
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH] audio/jack: fix use after free segfault

2020-08-20 Thread Christian Schoenebeck

On Mittwoch, 19. August 2020 17:57:35 CEST Geoffrey McRae wrote:
> > The ringbuffer implementation looks a bit wild:
> > 
> > /* read PCM interleaved */
> > static int qjack_buffer_read(QJackBuffer *buffer, float *dest, int
> > size)
> > {
> > 
> > assert(buffer->data);
> > const int samples = size / sizeof(float);
> > int frames= samples / buffer->channels;
> > const int avail   = atomic_load_acquire(&buffer->used);
> > 
> > if (frames > avail) {
> > 
> > frames = avail;
> > 
> > }
> > 
> > int copy = frames;
> > int rptr = buffer->rptr;
> > 
> > while (copy) {
> > 
> > for (int c = 0; c < buffer->channels; ++c) {
> > 
> > *dest++ = buffer->data[c][rptr];
> > 
> > }
> > 
> > if (++rptr == buffer->frames) {
> > 
> > rptr = 0;
> > 
> > }
> > 
> > --copy;
> > 
> > }
> > 
> > buffer->rptr = rptr;
> > 
> > atomic_sub(&buffer->used, frames);
> > return frames * buffer->channels * sizeof(float);
> > 
> > }
> > 
> > On both sides there is no check whether one side is over/underrunning
> > the
> > other side (rptr vs. wptr). I would really recommend using an existing
> > ringbuffer implementation instead of writing one by yourself.
> 
> `buffer->used` ensures there is no overwrite unless I have missed
> something?

Right, I missed that you are using that separate variable for that. OK then!
Typical ringbuffer implementations only have a rptr and wptr. That's why I 
missed it.

> 
> > And question:
> > 
> > static size_t qjack_write(HWVoiceOut *hw, void *buf, size_t len)
> > {
> > 
> > QJackOut *jo = (QJackOut *)hw;
> > ++jo->c.packets;
> > 
> > if (jo->c.state != QJACK_STATE_RUNNING) {
> > 
> > qjack_client_recover(&jo->c);
> > return len;
> > 
> > }
> > 
> > qjack_client_connect_ports(&jo->c);
> > return qjack_buffer_write(&jo->c.fifo, buf, len);
> > 
> > }
> > 
> > So you are ensuring to reconnect the JACK ports in every cycle. Isn't
> > that a
> > bit often?
> 
> No, please see the implementation of qjack_client_connect_ports.

Ah, you mean this entry check:

static void qjack_client_connect_ports(QJackClient *c)
{
if (!c->connect_ports || !c->opt->connect_ports) {
return;
}
 ...

It's okay. However on the long-term I would consider moving that away from the 
audio thread as most JACK API functions are not deterministic, i.e. they could 
lead to audio dropouts if executed on audio thread.

Best regards,
Christian Schoenebeck

Re: [PATCH v4 4/4] iotests: Test node/bitmap aliases during migration

On 20.08.20 03:58, Eric Blake wrote:
> On 8/18/20 8:32 AM, Max Reitz wrote:
>> Signed-off-by: Max Reitz 
>> ---
>>   tests/qemu-iotests/300 | 595 +
>>   tests/qemu-iotests/300.out |   5 +
> 
> Rather sparse output (I hate debugging those sorts of outputs when the
> test is failing).

Hm.  I don’t know, the stack trace usually gives a good idea and
./check -d gives QMP context.

The advantage of a sparse output is that we don’t need to adjust the
reference output every time some optional field is added somewhere.

>>   tests/qemu-iotests/group   |   1 +
>>   3 files changed, 601 insertions(+)
>>   create mode 100755 tests/qemu-iotests/300
>>   create mode 100644 tests/qemu-iotests/300.out
>>
> 
>> +    # Dirty some random megabytes
>> +    for _ in range(9):
>> +    mb_ofs = random.randrange(1024)
>> +    self.vm_a.hmp_qemu_io(self.src_node_name, f'write
>> {mb_ofs}M 1M')
> 
> It turns out that the discard operation likewise dirties the bitmap, but
> slightly faster (see edb90bbd).  We could optimize it on top, but I'm
> not going to require a micro-optimizing to get it in.  The test takes
> about 12 seconds to run for me, but you didn't mark it as such in
> 'group', so that's good; but it turns up a problem:
> 
> 300  fail   [20:55:54] [20:56:06]    output
> mismatch (see 300.out.bad)
> --- /home/eblake/qemu-tmp2/tests/qemu-iotests/300.out    2020-08-19
> 20:53:11.087791988 -0500
> +++ /home/eblake/qemu-tmp2/tests/qemu-iotests/300.out.bad    2020-08-19
> 20:56:06.092428756 -0500
> @@ -1,5 +1,41 @@
> -.
> +WARNING:qemu.machine:qemu received signal 11; command:
> "/home/eblake/qemu-tmp2/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
> -display none -vga none -chardev
> socket,id=mon,path=/tmp/tmp.qT831UThme/qemu-b-798452-monitor.sock -mon
> chardev=mon,mode=control -qtest
> unix:path=/tmp/tmp.qT831UThme/qemu-b-798452-qtest.sock -accel qtest
> -nodefaults -display none -accel qtest -blockdev
> node-name=node0,driver=null-co -incoming unix:/tmp/tmp.qT831UThme/mig_sock"
> +.FE...
> +==
> +ERROR: test_migratee_bitmap_is_not_mapped_on_dst
> (__main__.TestBlockBitmapMappingErrors)
> +--
> +Traceback (most recent call last):
> +  File
> "/home/eblake/qemu-tmp2/tests/qemu-iotests/../../python/qemu/machine.py", line
> 435, in _do_shutdown
> +    self._soft_shutdown(timeout, has_quit)
> +  File
> "/home/eblake/qemu-tmp2/tests/qemu-iotests/../../python/qemu/machine.py", line
> 415, in _soft_shutdown
> +    self._qmp.cmd('quit')
> +  File
> "/home/eblake/qemu-tmp2/tests/qemu-iotests/../../python/qemu/qmp.py",
> line 266, in cmd
> +    return self.cmd_obj(qmp_cmd)
> +  File
> "/home/eblake/qemu-tmp2/tests/qemu-iotests/../../python/qemu/qmp.py",
> line 246, in cmd_obj
> +    self.__sock.sendall(json.dumps(qmp_cmd).encode('utf-8'))
> +BrokenPipeError: [Errno 32] Broken pipe
> +
> +The above exception was the direct cause of the following exception:
> +
> +Traceback (most recent call last):
> +  File "300", line 76, in tearDown
> +    self.vm_b.shutdown()
> +  File
> "/home/eblake/qemu-tmp2/tests/qemu-iotests/../../python/qemu/machine.py", line
> 465, in shutdown
> +    self._do_shutdown(timeout, has_quit)
> +  File
> "/home/eblake/qemu-tmp2/tests/qemu-iotests/../../python/qemu/machine.py", line
> 438, in _do_shutdown
> +    raise AbnormalShutdown("Could not perform graceful shutdown") \
> +qemu.machine.AbnormalShutdown: Could not perform graceful shutdown
> +
> +==
> +FAIL: test_migratee_bitmap_is_not_mapped_on_dst
> (__main__.TestBlockBitmapMappingErrors)
> +--
> +Traceback (most recent call last):
> +  File "300", line 384, in test_migratee_bitmap_is_not_mapped_on_dst
> +    self.migrate(False)
> +  File "300", line 99, in migrate
> +    self.assertEqual(self.vm_a.wait_migration('postmigrate'),
> +AssertionError: False != True
> +
>  --
>  Ran 37 tests
> 
> -OK
> +FAILED (failures=1, errors=1)
> 
> I'm not sure why I'm seeing that, but it looks like you've got a bad
> deref somewhere in the alias code.

Ah, crap.

This should fix it:

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 89cb16b12c..d407dfefea 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -1091,7 +1091,9 @@ static int dirty_bitmap_load_header(QEMUFile *f,
DBMLoadState *s,
 } else {
 bitmap_name = s->bitmap_alias;
 }
+}

+if (!s->cancelled) {
 g_strlcpy(s->bitmap_name, bitmap_name, sizeof(s->bitmap_name));
 s->bi

Re: device compatibility interface for live migration with assigned devices

2020-08-20 Thread Sean Mooney

On Thu, 2020-08-20 at 14:27 +0800, Yan Zhao wrote:
> On Thu, Aug 20, 2020 at 06:16:28AM +0100, Sean Mooney wrote:
> > On Thu, 2020-08-20 at 12:01 +0800, Yan Zhao wrote:
> > > On Thu, Aug 20, 2020 at 02:29:07AM +0100, Sean Mooney wrote:
> > > > On Thu, 2020-08-20 at 08:39 +0800, Yan Zhao wrote:
> > > > > On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote:
> > > > > > On Tue, 18 Aug 2020 10:16:28 +0100
> > > > > > Daniel P. Berrangé  wrote:
> > > > > > 
> > > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote:
> > > > > > > >On 2020/8/18 下午4:55, Daniel P. Berrangé wrote:
> > > > > > > > 
> > > > > > > >  On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote:
> > > > > > > > 
> > > > > > > >  On 2020/8/14 下午1:16, Yan Zhao wrote:
> > > > > > > > 
> > > > > > > >  On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote:
> > > > > > > > 
> > > > > > > >  On 2020/8/10 下午3:46, Yan Zhao wrote:  
> > > > > > > >  we actually can also retrieve the same information through 
> > > > > > > > sysfs, .e.g
> > > > > > > > 
> > > > > > > >  |- [path to device]
> > > > > > > > |--- migration
> > > > > > > > | |--- self
> > > > > > > > | |   |---device_api
> > > > > > > > ||   |---mdev_type
> > > > > > > > ||   |---software_version
> > > > > > > > ||   |---device_id
> > > > > > > > ||   |---aggregator
> > > > > > > > | |--- compatible
> > > > > > > > | |   |---device_api
> > > > > > > > ||   |---mdev_type
> > > > > > > > ||   |---software_version
> > > > > > > > ||   |---device_id
> > > > > > > > ||   |---aggregator
> > > > > > > > 
> > > > > > > > 
> > > > > > > >  Yes but:
> > > > > > > > 
> > > > > > > >  - You need one file per attribute (one syscall for one 
> > > > > > > > attribute)
> > > > > > > >  - Attribute is coupled with kobject
> > > > > > 
> > > > > > Is that really that bad? You have the device with an embedded 
> > > > > > kobject
> > > > > > anyway, and you can just put things into an attribute group?
> > > > > > 
> > > > > > [Also, I think that self/compatible split in the example makes 
> > > > > > things
> > > > > > needlessly complex. Shouldn't semantic versioning and matching 
> > > > > > already
> > > > > > cover nearly everything? I would expect very few cases that are more
> > > > > > complex than that. Maybe the aggregation stuff, but I don't think we
> > > > > > need that self/compatible split for that, either.]
> > > > > 
> > > > > Hi Cornelia,
> > > > > 
> > > > > The reason I want to declare compatible list of attributes is that
> > > > > sometimes it's not a simple 1:1 matching of source attributes and 
> > > > > target attributes
> > > > > as I demonstrated below,
> > > > > source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is 
> > > > > compatible to
> > > > > target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2),
> > > > >(mdev_type i915-GVTg_V5_8 + aggregator 4)
> > > > 
> > > > the way you are doing the nameing is till really confusing by the way
> > > > if this has not already been merged in the kernel can you chagne the 
> > > > mdev
> > > > so that mdev_type i915-GVTg_V5_2 is 2 of mdev_type i915-GVTg_V5_1 
> > > > instead of half the device
> > > > 
> > > > currently you need to deived the aggratod by the number at the end of 
> > > > the mdev type to figure out
> > > > how much of the phsicial device is being used with is a very unfridly 
> > > > api convention
> > > > 
> > > > the way aggrator are being proposed in general is not really someting i 
> > > > like but i thin this at least
> > > > is something that should be able to correct.
> > > > 
> > > > with the complexity in the mdev type name + aggrator i suspect that 
> > > > this will never be support
> > > > in openstack nova directly requireing integration via cyborg unless we 
> > > > can pre partion the
> > > > device in to mdevs staicaly and just ignore this.
> > > > 
> > > > this is way to vendor sepecif to integrate into something like 
> > > > openstack in nova unless we can guarentee
> > > > taht how aggreator work will be portable across vendors genericly.
> > > > 
> > > > > 
> > > > > and aggragator may be just one of such examples that 1:1 matching 
> > > > > does not
> > > > > fit.
> > > > 
> > > > for openstack nova i dont see us support anything beyond the 1:1 case 
> > > > where the mdev type does not change.
> > > > 
> > > 
> > > hi Sean,
> > > I understand it's hard for openstack. but 1:N is always meaningful.
> > > e.g.
> > > if source device 1 has cap A, it is compatible to
> > > device 2: cap A,
> > > device 3: cap A+B,
> > > device 4: cap A+B+C
> > > 
> > > to allow openstack to detect it correctly, in compatible list of
> > > device 2, we would say compatible cap is A;
> > > device 3, compatible cap is A or A+B;
> > > device 4, compatible cap is A or A+B, or A+B+C;
> > > 
> > > then if openstack finds device A's self cap A is contained in com

Re: [PATCH] ui: Add more mouse buttons to SPICE

2020-08-20 Thread Marc-André Lureau

Hi

On Thu, Aug 20, 2020 at 5:01 PM Frediano Ziglio  wrote:

> From: Frediano Ziglio 
>
> Add support for SIDE and EXTRA buttons.
>
> Signed-off-by: Frediano Ziglio 
> ---
>  ui/spice-input.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/ui/spice-input.c b/ui/spice-input.c
> index cd4bb0043f..d5bba231c9 100644
> --- a/ui/spice-input.c
> +++ b/ui/spice-input.c
> @@ -123,6 +123,8 @@ static void spice_update_buttons(QemuSpicePointer
> *pointer,
>  [INPUT_BUTTON_RIGHT]   = 0x02,
>  [INPUT_BUTTON_WHEEL_UP]= 0x10,
>  [INPUT_BUTTON_WHEEL_DOWN]  = 0x20,
> +[INPUT_BUTTON_SIDE]= 0x40,
> +[INPUT_BUTTON_EXTRA]   = 0x80,
>  };
>
>  if (wheel < 0) {
> --
> 2.25.4
>
>
I don't see where those values are defined, can you describe it a bit?

thanks


-- 
Marc-André Lureau

Re: deprecation of in-tree builds

Am 20.08.2020 um 13:56 hat Paolo Bonzini geschrieben:
> On 20/08/20 12:54, Kevin Wolf wrote:
> >> Paolo's conversion-to-Meson patchseries is about to land, so now
> >> is the time for people who would like this "automatically create
> >> a build directory and use it" behaviour to write the necessary
> >> patches. Any volunteers ?
> >>
> >> My current plan is to land the Meson series first, because it is
> >> really painful for Paolo to try to keep rebasing it as other
> >> changes to the old build system occur. This would break
> >> in-tree builds temporarily until the "automatic creation and
> >> use of a builddir" patches go in on top of it.
> >
> > Usually, our requirement is that patch series don't break anything. And
> > if something slips through, whoever broke it is supposed to fix it, not
> > whoever is affected.
> 
> The Meson conversion was announced in October 2019 as breaking in-tree
> builds, and the deprecation request is from March 2020.  So I don't
> think this is a breakage but rather a widely-announced change.

Wasn't the decision after that discussion that we do _not_ want to
deprecate './configure; make' from the source directory?

I seem to remember that we wanted to merge a message to make a
recommendation for out-of-tree builds, but looking at configure, I can't
find even that.

So without that, and also without a mention in deprecated.rst, I don't
think having mentioned a wish to break things a while ago means that we
should just follow through with that despite the objections.

At least this isn't how it has worked for other patch series. If it is
the new standard, I can remove -drive tomorrow. I've been mentioning for
years that I don't like it and want to remove it, so people can just
deal with it.

Kevin

Re: [PATCH v4 1/4] migration: Add block-bitmap-mapping parameter

On 20.08.20 14:58, Vladimir Sementsov-Ogievskiy wrote:
> 18.08.2020 16:32, Max Reitz wrote:
>> This migration parameter allows mapping block node names and bitmap
>> names to aliases for the purpose of block dirty bitmap migration.
>>
>> This way, management tools can use different node and bitmap names on
>> the source and destination and pass the mapping of how bitmaps are to be
>> transferred to qemu (on the source, the destination, or even both with
>> arbitrary aliases in the migration stream).
>>
>> While touching this code, fix a bug where bitmap names longer than 255
>> bytes would fail an assertion in qemu_put_counted_string().
>>
>> Suggested-by: Vladimir Sementsov-Ogievskiy 
>> Signed-off-by: Max Reitz 
>> ---
>>   qapi/migration.json    | 101 +++-
>>   migration/migration.h  |   3 +
>>   migration/block-dirty-bitmap.c | 409 -
>>   migration/migration.c  |  30 +++
>>   monitor/hmp-cmds.c |  30 +++
>>   5 files changed, 517 insertions(+), 56 deletions(-)
>>
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index ea53b23dca..0c4ae102b1 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
> 
> [..]
> 
>>   #
>> +# @block-bitmap-mapping: Maps block nodes and bitmaps on them to
>> +#  aliases for the purpose of dirty bitmap migration.  Such
>> +#  aliases may for example be the corresponding names on the
>> +#  opposite site.
>> +#  The mapping must be one-to-one, but not necessarily
>> +#  complete: On the source, unmapped bitmaps and all bitmaps
>> +#  on unmapped nodes will be ignored.  On the destination,
>> +#  all unmapped aliases in the incoming migration stream will
>> +#  be reported, but they will not result in failure.
> Actually, on unknown alias we cancel incoming bitmap migration, which
> means that destination vm continues to run, other (non-bitmap) migration
> states continue to migrate but all further chunks of bitmap migration
> will be ignored. (I'm not sure it worth be mentioned)

Ah, yeah.

[...]

>> @@ -303,21 +497,39 @@ static int add_bitmaps_to_list(DBMSaveState *s,
>> BlockDriverState *bs,
>>   return 0;
>>   }
>>   +    bitmap_name = bdrv_dirty_bitmap_name(bitmap);
>> +
>>   if (!bs_name || strcmp(bs_name, "") == 0) {
>>   error_report("Bitmap '%s' in unnamed node can't be migrated",
>> - bdrv_dirty_bitmap_name(bitmap));
>> + bitmap_name);
>>   return -1;
>>   }
>>   -    if (bs_name[0] == '#') {
>> +    if (alias_map) {
>> +    const AliasMapInnerNode *amin =
>> g_hash_table_lookup(alias_map, bs_name);
>> +
>> +    if (!amin) {
>> +    /* Skip bitmaps on nodes with no alias */
>> +    return 0;
>> +    }
>> +
>> +    node_alias = amin->string;
>> +    bitmap_aliases = amin->subtree;
>> +    } else {
>> +    node_alias = bs_name;
>> +    bitmap_aliases = NULL;
>> +    }
>> +
>> +    if (node_alias[0] == '#') {
>>   error_report("Bitmap '%s' in a node with auto-generated "
>>    "name '%s' can't be migrated",
>> - bdrv_dirty_bitmap_name(bitmap), bs_name);
>> + bitmap_name, node_alias);
>>   return -1;
>>   }
> 
> This check is related only to pre-alias_map behavior, so it's probably
> better to keep it inside else{} branch above. Still, aliases already
> checked to be wellformed, so this check will be always false anyway for
> aliases and will not hurt.

Hm, it’s a trade off.  It does look a bit weird, because how can aliases
be auto-generated?  But OTOH it makes it clearer that we’ll never allow
non-wellformed aliases through.

[...]

>>   if (s->flags & DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME) {
>> -    if (!qemu_get_counted_string(f, s->bitmap_name)) {
>> +    const char *bitmap_name;
>> +
>> +    if (!qemu_get_counted_string(f, s->bitmap_alias)) {
>>   error_report("Unable to read bitmap name string");
> 
> Probably s/name/alias/ like for node error message.

Why not.

[...]

>> --- a/monitor/hmp-cmds.c
>> +++ b/monitor/hmp-cmds.c
>> @@ -469,6 +469,32 @@ void hmp_info_migrate_parameters(Monitor *mon,
>> const QDict *qdict)
>>   monitor_printf(mon, "%s: '%s'\n",
>>   MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
>>   params->tls_authz);
>> +
>> +    if (params->has_block_bitmap_mapping) {
>> +    const BitmapMigrationNodeAliasList *bmnal;
>> +
>> +    monitor_printf(mon, "%s:\n",
>> +   MigrationParameter_str(
>> +  
>> MIGRATION_PARAMETER_BLOCK_BITMAP_MAPPING));
>> +
>> +    for (bmnal = params->block_bitmap_mapping;
>> + bmnal;
>> + bmnal = bmnal->next)
>> +    {
>> +    const BitmapMigrationNodeAlias *bmna = bmnal->value;
>> +

Re: [PATCH v4 2/4] iotests.py: Add wait_for_runstate()


18.08.2020 16:32, Max Reitz wrote:

Signed-off-by: Max Reitz


Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir

Re: [PATCH v4 3/4] iotests.py: Let wait_migration() return on failure


18.08.2020 16:32, Max Reitz wrote:

Let wait_migration() return on failure (with the return value indicating
whether the migration was completed or has failed), so we can use it for
migrations that are expected to fail, too.

Signed-off-by: Max Reitz 
---
  tests/qemu-iotests/iotests.py | 18 --
  1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index ee93cf22db..f39fd580a6 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -729,16 +729,22 @@ class VM(qtest.QEMUQtestMachine):
  }
  ]))
  
-def wait_migration(self, expect_runstate):

+def wait_migration(self, expect_runstate: Optional[str]) -> bool:
  while True:
  event = self.event_wait('MIGRATION')
  log(event, filters=[filter_qmp_event])
-if event['data']['status'] == 'completed':
+if event['data']['status'] in ('completed', 'failed'):
  break
-# The event may occur in finish-migrate, so wait for the expected
-# post-migration runstate
-while self.qmp('query-status')['return']['status'] != expect_runstate:
-pass
+
+if event['data']['status'] == 'completed':
+# The event may occur in finish-migrate, so wait for the expected
+# post-migration runstate
+runstate = None
+while runstate != expect_runstate:
+runstate = self.qmp('query-status')['return']['status']


Would be good to use the helper from the previous patch. With it or not:
Reviewed-by: Vladimir Sementsov-Ogievskiy 


+return True
+else:
+return False
  
  def node_info(self, node_name):

  nodes = self.qmp('query-named-block-nodes')




--
Best regards,
Vladimir

[PATCH v2 2/4] spapr/xive: Use kvmppc_xive_source_reset() in post_load

This is doing an extra loop but should be equivalent.

It also differentiate the reset of the sources from the restore of the
sources configuration. This will help in allocating the vCPU IPIs
independently.

Signed-off-by: Cédric Le Goater 
---
 hw/intc/spapr_xive_kvm.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
index d5fb5b260d5e..4ea687c93ecd 100644
--- a/hw/intc/spapr_xive_kvm.c
+++ b/hw/intc/spapr_xive_kvm.c
@@ -684,22 +684,22 @@ int kvmppc_xive_post_load(SpaprXive *xive, int version_id)
 }
 }
 
+/*
+ * We can only restore the source config if the source has been
+ * previously set in KVM. Since we don't do that at reset time
+ * when restoring a VM, let's do it now.
+ */
+ret = kvmppc_xive_source_reset(&xive->source, &local_err);
+if (ret < 0) {
+goto fail;
+}
+
 /* Restore the EAT */
 for (i = 0; i < xive->nr_irqs; i++) {
 if (!xive_eas_is_valid(&xive->eat[i])) {
 continue;
 }
 
-/*
- * We can only restore the source config if the source has been
- * previously set in KVM. Since we don't do that for all interrupts
- * at reset time anymore, let's do it now.
- */
-ret = kvmppc_xive_source_reset_one(&xive->source, i, &local_err);
-if (ret < 0) {
-goto fail;
-}
-
 ret = kvmppc_xive_set_source_config(xive, i, &xive->eat[i], 
&local_err);
 if (ret < 0) {
 goto fail;
-- 
2.25.4

[PATCH v2 0/4] spapr/xive: Allocate vCPU IPIs from the vCPU contexts

Hello,


When QEMU switches to the XIVE interrupt mode, it creates all the
guest interrupts at the level of the KVM device. These interrupts are
backed by real HW interrupts from the IPI interrupt pool of the XIVE
controller.

Currently, this is done from the QEMU main thread, which results in
allocating all interrupts from the chip on which QEMU is running. IPIs
are not distributed across the system and the load is not well
balanced across the interrupt controllers.

Change the vCPU IPI allocation to run from the vCPU context. The
associated XIVE IPI interrupt will be allocated on the chip on which
the vCPU is running and improve distribution of the IPIs in the system.
When the vCPUs are pinned, this will make the IPI local to the chip of
the vCPU. It will reduce rerouting between interrupt controllers and
gives better performance.


I did some basic migration testing with the 'dual' and 'xive' modes,
also tried CPU hoplug. I haven't tried migration with older pseries
machine. 

Thanks,

C.

Cédric Le Goater (4):
  spapr/xive: Modify kvm_cpu_is_enabled() interface
  spapr/xive: Use kvmppc_xive_source_reset() in post_load
  spapr/xive: Allocate IPIs independently from the other sources
  spapr/xive: Allocate vCPU IPIs from the vCPU contexts

 hw/intc/spapr_xive_kvm.c | 102 ---
 1 file changed, 84 insertions(+), 18 deletions(-)

-- 
2.25.4

[PATCH v2 3/4] spapr/xive: Allocate IPIs independently from the other sources

The vCPU IPIs are now allocated in kvmppc_xive_cpu_connect() when the
vCPU connects to the KVM device and not when all the sources are reset
in kvmppc_xive_source_reset()

This requires extra care for hotplug vCPUs and VM restore.

Signed-off-by: Cédric Le Goater 
---
 hw/intc/spapr_xive_kvm.c | 47 +++-
 1 file changed, 42 insertions(+), 5 deletions(-)

diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
index 4ea687c93ecd..f838c208b559 100644
--- a/hw/intc/spapr_xive_kvm.c
+++ b/hw/intc/spapr_xive_kvm.c
@@ -146,6 +146,15 @@ int kvmppc_xive_cpu_synchronize_state(XiveTCTX *tctx, 
Error **errp)
 return s.ret;
 }
 
+static int kvmppc_xive_reset_ipi(SpaprXive *xive, CPUState *cs, Error **errp)
+{
+unsigned long ipi = kvm_arch_vcpu_id(cs);
+uint64_t state = 0;
+
+return kvm_device_access(xive->fd, KVM_DEV_XIVE_GRP_SOURCE, ipi,
+  &state, true, errp);
+}
+
 int kvmppc_xive_cpu_connect(XiveTCTX *tctx, Error **errp)
 {
 ERRP_GUARD();
@@ -175,6 +184,12 @@ int kvmppc_xive_cpu_connect(XiveTCTX *tctx, Error **errp)
 return ret;
 }
 
+/* Create/reset the vCPU IPI */
+ret = kvmppc_xive_reset_ipi(xive, tctx->cs, errp);
+if (ret < 0) {
+return ret;
+}
+
 kvm_cpu_enable(tctx->cs);
 return 0;
 }
@@ -260,6 +275,12 @@ int kvmppc_xive_source_reset_one(XiveSource *xsrc, int 
srcno, Error **errp)
 
 assert(xive->fd != -1);
 
+/*
+ * The vCPU IPIs are now allocated in kvmppc_xive_cpu_connect()
+ * and not with all sources in kvmppc_xive_source_reset()
+ */
+assert(srcno >= SPAPR_XIRQ_BASE);
+
 if (xive_source_irq_is_lsi(xsrc, srcno)) {
 state |= KVM_XIVE_LEVEL_SENSITIVE;
 if (xsrc->status[srcno] & XIVE_STATUS_ASSERTED) {
@@ -271,12 +292,28 @@ int kvmppc_xive_source_reset_one(XiveSource *xsrc, int 
srcno, Error **errp)
  true, errp);
 }
 
+/*
+ * To be valid, a source must have been claimed by the machine (valid
+ * entry in the EAS table) and if it is a vCPU IPI, the vCPU should
+ * have been enabled, which means the IPI has been allocated in
+ * kvmppc_xive_cpu_connect().
+ */
+static bool xive_source_is_valid(SpaprXive *xive, int i)
+{
+return xive_eas_is_valid(&xive->eat[i]) &&
+(i >= SPAPR_XIRQ_BASE || kvm_cpu_is_enabled(i));
+}
+
 static int kvmppc_xive_source_reset(XiveSource *xsrc, Error **errp)
 {
 SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
 int i;
 
-for (i = 0; i < xsrc->nr_irqs; i++) {
+/*
+ * Skip the vCPU IPIs. These are created/reset when the vCPUs are
+ * connected in kvmppc_xive_cpu_connect()
+ */
+for (i = SPAPR_XIRQ_BASE; i < xsrc->nr_irqs; i++) {
 int ret;
 
 if (!xive_eas_is_valid(&xive->eat[i])) {
@@ -370,7 +407,7 @@ static void kvmppc_xive_source_get_state(XiveSource *xsrc)
 for (i = 0; i < xsrc->nr_irqs; i++) {
 uint8_t pq;
 
-if (!xive_eas_is_valid(&xive->eat[i])) {
+if (!xive_source_is_valid(xive, i)) {
 continue;
 }
 
@@ -553,7 +590,7 @@ static void kvmppc_xive_change_state_handler(void *opaque, 
int running,
 uint8_t pq;
 uint8_t old_pq;
 
-if (!xive_eas_is_valid(&xive->eat[i])) {
+if (!xive_source_is_valid(xive, i)) {
 continue;
 }
 
@@ -581,7 +618,7 @@ static void kvmppc_xive_change_state_handler(void *opaque, 
int running,
 for (i = 0; i < xsrc->nr_irqs; i++) {
 uint8_t pq;
 
-if (!xive_eas_is_valid(&xive->eat[i])) {
+if (!xive_source_is_valid(xive, i)) {
 continue;
 }
 
@@ -696,7 +733,7 @@ int kvmppc_xive_post_load(SpaprXive *xive, int version_id)
 
 /* Restore the EAT */
 for (i = 0; i < xive->nr_irqs; i++) {
-if (!xive_eas_is_valid(&xive->eat[i])) {
+if (!xive_source_is_valid(xive, i)) {
 continue;
 }
 
-- 
2.25.4

[PATCH v2 4/4] spapr/xive: Allocate vCPU IPIs from the vCPU contexts

When QEMU switches to the XIVE interrupt mode, it creates all the
guest interrupts at the level of the KVM device. These interrupts are
backed by real HW interrupts from the IPI interrupt pool of the XIVE
controller.

Currently, this is done from the QEMU main thread, which results in
allocating all interrupts from the chip on which QEMU is running. IPIs
are not distributed across the system and the load is not well
balanced across the interrupt controllers.

Change the vCPU IPI allocation to run from the vCPU context. The
associated XIVE IPI interrupt will be allocated on the chip on which
the vCPU is running and improve distribution of the IPIs in the system.
When the vCPUs are pinned, this will make the IPI local to the chip of
the vCPU. It will reduce rerouting between interrupt controllers and
gives better performance.

Device interrupts are still treated the same. To improve placement, we
would need some information on the chip owning the virtual source or
the HW source in case of a passthrough device but this reuires
changes in PAPR.

Signed-off-by: Cédric Le Goater 
---
 hw/intc/spapr_xive_kvm.c | 36 +---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
index f838c208b559..7d7f6fdbe25a 100644
--- a/hw/intc/spapr_xive_kvm.c
+++ b/hw/intc/spapr_xive_kvm.c
@@ -146,13 +146,43 @@ int kvmppc_xive_cpu_synchronize_state(XiveTCTX *tctx, 
Error **errp)
 return s.ret;
 }
 
-static int kvmppc_xive_reset_ipi(SpaprXive *xive, CPUState *cs, Error **errp)
+/*
+ * Allocate the vCPU IPIs from the vCPU context. This will allocate
+ * the XIVE IPI interrupt on the chip on which the vCPU is running.
+ * This gives a better distribution of IPIs when the guest has a lot
+ * of vCPUs. When the vCPUs are pinned, this will make the IPI local
+ * to the chip of the vCPU. It will reduce rerouting between interrupt
+ * controllers and gives better performance.
+ */
+typedef struct {
+SpaprXive *xive;
+Error *err;
+int rc;
+} XiveInitIPI;
+
+static void kvmppc_xive_reset_ipi_on_cpu(CPUState *cs, run_on_cpu_data arg)
 {
 unsigned long ipi = kvm_arch_vcpu_id(cs);
+XiveInitIPI *s = arg.host_ptr;
 uint64_t state = 0;
 
-return kvm_device_access(xive->fd, KVM_DEV_XIVE_GRP_SOURCE, ipi,
-  &state, true, errp);
+s->rc = kvm_device_access(s->xive->fd, KVM_DEV_XIVE_GRP_SOURCE, ipi,
+  &state, true, &s->err);
+}
+
+static int kvmppc_xive_reset_ipi(SpaprXive *xive, CPUState *cs, Error **errp)
+{
+XiveInitIPI s = {
+.xive = xive,
+.err  = NULL,
+.rc   = 0,
+};
+
+run_on_cpu(cs, kvmppc_xive_reset_ipi_on_cpu, RUN_ON_CPU_HOST_PTR(&s));
+if (s.err) {
+error_propagate(errp, s.err);
+}
+return s.rc;
 }
 
 int kvmppc_xive_cpu_connect(XiveTCTX *tctx, Error **errp)
-- 
2.25.4

[PATCH v2 1/4] spapr/xive: Modify kvm_cpu_is_enabled() interface

We will use to check if a vCPU IPI has been created.

Signed-off-by: Cédric Le Goater 
---
 hw/intc/spapr_xive_kvm.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
index 3eea4cb1c49f..d5fb5b260d5e 100644
--- a/hw/intc/spapr_xive_kvm.c
+++ b/hw/intc/spapr_xive_kvm.c
@@ -36,10 +36,9 @@ typedef struct KVMEnabledCPU {
 static QLIST_HEAD(, KVMEnabledCPU)
 kvm_enabled_cpus = QLIST_HEAD_INITIALIZER(&kvm_enabled_cpus);
 
-static bool kvm_cpu_is_enabled(CPUState *cs)
+static bool kvm_cpu_is_enabled(unsigned long vcpu_id)
 {
 KVMEnabledCPU *enabled_cpu;
-unsigned long vcpu_id = kvm_arch_vcpu_id(cs);
 
 QLIST_FOREACH(enabled_cpu, &kvm_enabled_cpus, node) {
 if (enabled_cpu->vcpu_id == vcpu_id) {
@@ -157,7 +156,7 @@ int kvmppc_xive_cpu_connect(XiveTCTX *tctx, Error **errp)
 assert(xive->fd != -1);
 
 /* Check if CPU was hot unplugged and replugged. */
-if (kvm_cpu_is_enabled(tctx->cs)) {
+if (kvm_cpu_is_enabled(kvm_arch_vcpu_id(tctx->cs))) {
 return 0;
 }
 
-- 
2.25.4

Re: [PATCH v7 35/47] commit: Deal with filters

Am 20.08.2020 um 13:27 hat Max Reitz geschrieben:
> On 19.08.20 19:58, Kevin Wolf wrote:
> > Am 25.06.2020 um 17:22 hat Max Reitz geschrieben:
> >> This includes some permission limiting (for example, we only need to
> >> take the RESIZE permission if the base is smaller than the top).
> >>
> >> Signed-off-by: Max Reitz 
> >> ---
> >>  block/block-backend.c  |  9 +++-
> >>  block/commit.c | 96 +-
> >>  block/monitor/block-hmp-cmds.c |  2 +-
> >>  blockdev.c |  4 +-
> >>  4 files changed, 81 insertions(+), 30 deletions(-)
> >>
> >> diff --git a/block/block-backend.c b/block/block-backend.c
> >> index 6936b25c83..7f2c7dbccc 100644
> >> --- a/block/block-backend.c
> >> +++ b/block/block-backend.c
> >> @@ -2271,8 +2271,13 @@ int blk_commit_all(void)
> >>  AioContext *aio_context = blk_get_aio_context(blk);
> >>  
> >>  aio_context_acquire(aio_context);
> >> -if (blk_is_inserted(blk) && blk->root->bs->backing) {
> >> -int ret = bdrv_commit(blk->root->bs);
> > 
> > The old code didn't try to commit nodes that don't have a backing file.
> > 
> >> +if (blk_is_inserted(blk)) {
> >> +BlockDriverState *non_filter;
> >> +int ret;
> >> +
> >> +/* Legacy function, so skip implicit filters */
> >> +non_filter = bdrv_skip_implicit_filters(blk->root->bs);
> >> +ret = bdrv_commit(non_filter);
> > 
> > The new one tries unconditionally. For nodes without a backing file,
> > bdrv_commit() will return -ENOTSUP, so the whole function fails.
> 
> :(
> 
> Hm.  Should I fix it by checking for
> bdrv_cow_bs(bdrv_skip_implicit_filters())?  Or bdrv_backing_chain_next()
> and change the bdrv_skip_implicit_filters() to a bdrv_skip_filters()?  I
> feel like that would make even more sense.

I agree that bdrv_skip_filters() makes more sense. If I have a qcow2
image and an explicit throttle filter on top, there is no reason to skip
this image.

bdrv_backing_chain_next() or bdrv_cow_bs() should be the same in a
boolean context, so I'd vote for bdrv_cow_bs() because it has less work
to do to get the same result.

> > (First real bug at patch 35. I almost thought I wouldn't find any!)
> 
> :)
> 
> >>  if (ret < 0) {
> >>  aio_context_release(aio_context);
> >>  return ret;
> >> diff --git a/block/commit.c b/block/commit.c
> >> index 7732d02dfe..4122b6736d 100644
> >> --- a/block/commit.c
> >> +++ b/block/commit.c
> >> @@ -37,6 +37,7 @@ typedef struct CommitBlockJob {
> >>  BlockBackend *top;
> >>  BlockBackend *base;
> >>  BlockDriverState *base_bs;
> >> +BlockDriverState *base_overlay;
> >>  BlockdevOnError on_error;
> >>  bool base_read_only;
> >>  bool chain_frozen;
> > 
> > Hm, again this mysterious base_overlay. I know that stream introduced it
> > to avoid freezing the link to base, but commit doesn't seem to do that.
> > 
> > Is it to avoid using the block status of filter drivers between
> > base_overlay and base?
> 
> Yes.
> 
> > If so, I guess that goes back to the question I
> > raised earlier in this series: What is the block status supposed to tell
> > for filter nodes?
> 
> Honestly, I would really like to get away without having to answer that
> question in this series.  Intuitively, I feel like falling through to
> the next data-bearing layer is not something most callers want.  But I’d
> rather investigate that question separately from this series (even
> though that likely means we’ll never do it), and just treat it as it is
> in this series.

Well, I'm asking the question because not having the answer makes us
jump through hoops in this series to accomodate a behaviour it probably
shouldn't even have. (Because I agree that filters should probably keep
DATA clear, i.e. they are never the layer that defines the content.)

Additional node references (i.e. references that are not edges in the
graph) always make the design more complicated and require us to
consider more things like what happens on graph changes. So it's a
question of maintainability.

> > But anyway, in contrast to mirror, commit actually freezes the chain
> > between commit_top_bs and base, so it should be safe at least.
> > 
> >> @@ -89,7 +90,7 @@ static void commit_abort(Job *job)
> >>   * XXX Can (or should) we somehow keep 'consistent read' blocked even
> >>   * after the failed/cancelled commit job is gone? If we already wrote
> >>   * something to base, the intermediate images aren't valid any more. 
> >> */
> >> -bdrv_replace_node(s->commit_top_bs, backing_bs(s->commit_top_bs),
> >> +bdrv_replace_node(s->commit_top_bs, s->commit_top_bs->backing->bs,
> >>&error_abort);
> >>  
> >>  bdrv_unref(s->commit_top_bs);
> >> @@ -153,7 +154,7 @@ static int coroutine_fn commit_run(Job *job, Error 
> >> **errp)
> >>  break;
> >>  }
> >>  /* Co

Re: [PATCH] hw/arm/sbsa-ref.c : Add a fake embedded controller

2020-08-20 Thread Graeme Gregory

On Thu, Aug 20, 2020 at 02:32:01PM +0100, Graeme Gregory wrote:
> A difference between sbsa platform and the virt platform is PSCI is
> handled by ARM-TF in the sbsa platform. This means that the PSCI code
> there needs to communicate some of the platform power changes down
> to the qemu code for things like shutdown/reset control.
> 
> Space has been left to extend the EC if we find other use cases in
> future where ARM-TF and qemu need to communicate.
> 
> Signed-off-by: Graeme Gregory 
> ---
>  hw/arm/sbsa-ref.c | 95 +++
>  1 file changed, 95 insertions(+)
> 
> diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
> index f030a416fd..c8743fc1d0 100644
> --- a/hw/arm/sbsa-ref.c
> +++ b/hw/arm/sbsa-ref.c
> @@ -41,6 +41,7 @@
>  #include "hw/usb.h"
>  #include "hw/char/pl011.h"
>  #include "net/net.h"
> +#include "migration/vmstate.h"
>  
>  #define RAMLIMIT_GB 8192
>  #define RAMLIMIT_BYTES (RAMLIMIT_GB * GiB)
> @@ -62,6 +63,7 @@ enum {
>  SBSA_CPUPERIPHS,
>  SBSA_GIC_DIST,
>  SBSA_GIC_REDIST,
> +SBSA_SECURE_EC,
>  SBSA_SMMU,
>  SBSA_UART,
>  SBSA_RTC,
> @@ -98,6 +100,14 @@ typedef struct {
>  #define SBSA_MACHINE(obj) \
>  OBJECT_CHECK(SBSAMachineState, (obj), TYPE_SBSA_MACHINE)
>  
> +typedef struct {
> +SysBusDevice parent_obj;
> +MemoryRegion iomem;
> +} SECUREECState;
> +
> +#define TYPE_SECURE_EC  "sbsa-secure-ec"
> +#define SECURE_EC(obj) OBJECT_CHECK(SECUREECState, (obj), TYPE_SECURE_EC)
> +
>  static const MemMapEntry sbsa_ref_memmap[] = {
>  /* 512M boot ROM */
>  [SBSA_FLASH] =  {  0, 0x2000 },
> @@ -107,6 +117,7 @@ static const MemMapEntry sbsa_ref_memmap[] = {
>  [SBSA_CPUPERIPHS] = { 0x4000, 0x0004 },
>  [SBSA_GIC_DIST] =   { 0x4006, 0x0001 },
>  [SBSA_GIC_REDIST] = { 0x4008, 0x0400 },
> +[SBSA_SECURE_EC] =  { 0x5000, 0x1000 },
>  [SBSA_UART] =   { 0x6000, 0x1000 },
>  [SBSA_RTC] ={ 0x6001, 0x1000 },
>  [SBSA_GPIO] =   { 0x6002, 0x1000 },
> @@ -585,6 +596,65 @@ static void *sbsa_ref_dtb(const struct arm_boot_info 
> *binfo, int *fdt_size)
>  return board->fdt;
>  }
>  
> +enum sbsa_secure_ec_powerstates {
> +SBSA_SECURE_EC_CMD_NULL,
> +SBSA_SECURE_EC_CMD_POWEROFF,
> +SBSA_SECURE_EC_CMD_REBOOT,
> +};
> +
> +static uint64_t secure_ec_read(void *opaque, hwaddr offset, unsigned size)
> +{
> +/* No use for this currently */
> +return 0;
> +}
> +
> +static void secure_ec_write(void *opaque, hwaddr offset,
> + uint64_t value, unsigned size)
> +{
> +if (offset == 0) { /* PSCI machine power command register */
> +switch (value) {
> +case SBSA_SECURE_EC_CMD_NULL:
> +break;
> +case SBSA_SECURE_EC_CMD_POWEROFF:
> +qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
> +break;
> +case SBSA_SECURE_EC_CMD_REBOOT:
> +qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
> +break;
> +default:
> +error_report("sbsa-ref: ERROR Unkown power command");
> +}
> +} else {
> +error_report("sbsa-ref: unknown EC register");
> +}
> +}
> +
> +static const MemoryRegionOps secure_ec_ops = {
> +.read = secure_ec_read,
> +.write = secure_ec_write,
> +.endianness = DEVICE_NATIVE_ENDIAN,
> +};
> +
> +static void secure_ec_init(Object *obj)
> +{
> +SECUREECState *s = SECURE_EC(obj);
> +SysBusDevice *dev = SYS_BUS_DEVICE(obj);
> +
> +memory_region_init_io(&s->iomem, obj, &secure_ec_ops, s, "secure-ec",
> +0x1000);
> +sysbus_init_mmio(dev, &s->iomem);
> +}
> +
> +static void create_secure_ec(MemoryRegion *mem)
> +{
> +hwaddr base = sbsa_ref_memmap[SBSA_SECURE_EC].base;
> +DeviceState *dev = qdev_create(NULL, TYPE_SECURE_EC);

I've just discovered the API change here, I tested in an older tree, but
Ill wait for other reviews before re-issuing.

Graeme

> +SysBusDevice *s = SYS_BUS_DEVICE(dev);
> +
> +memory_region_add_subregion(mem, base,
> +sysbus_mmio_get_region(s, 0));
> +}
> +
>  static void sbsa_ref_init(MachineState *machine)
>  {
>  unsigned int smp_cpus = machine->smp.cpus;
> @@ -708,6 +778,8 @@ static void sbsa_ref_init(MachineState *machine)
>  
>  create_pcie(sms);
>  
> +create_secure_ec(secure_sysmem);
> +
>  sms->bootinfo.ram_size = machine->ram_size;
>  sms->bootinfo.nb_cpus = smp_cpus;
>  sms->bootinfo.board_id = -1;
> @@ -798,8 +870,31 @@ static const TypeInfo sbsa_ref_info = {
>  .instance_size = sizeof(SBSAMachineState),
>  };
>  
> +static const VMStateDescription vmstate_secure_ec_info = {
> +.name = "sbsa-secure-ec",
> +.version_id = 0,
> +.minimum_version_id = 0,
> +};
> +
> +static void secure_ec_class_init(O

Re: [PATCH v4 4/4] iotests: Test node/bitmap aliases during migration


20.08.2020 16:17, Max Reitz wrote:

On 20.08.20 03:58, Eric Blake wrote:

On 8/18/20 8:32 AM, Max Reitz wrote:

Signed-off-by: Max Reitz 
---
   tests/qemu-iotests/300 | 595 +
   tests/qemu-iotests/300.out |   5 +


Rather sparse output (I hate debugging those sorts of outputs when the
test is failing).


Hm.  I don’t know, the stack trace usually gives a good idea and
./check -d gives QMP context.

The advantage of a sparse output is that we don’t need to adjust the
reference output every time some optional field is added somewhere.


   tests/qemu-iotests/group   |   1 +
   3 files changed, 601 insertions(+)
   create mode 100755 tests/qemu-iotests/300
   create mode 100644 tests/qemu-iotests/300.out




+    # Dirty some random megabytes
+    for _ in range(9):
+    mb_ofs = random.randrange(1024)
+    self.vm_a.hmp_qemu_io(self.src_node_name, f'write
{mb_ofs}M 1M')


It turns out that the discard operation likewise dirties the bitmap, but
slightly faster (see edb90bbd).  We could optimize it on top, but I'm
not going to require a micro-optimizing to get it in.  The test takes
about 12 seconds to run for me, but you didn't mark it as such in
'group', so that's good; but it turns up a problem:

300  fail   [20:55:54] [20:56:06]    output
mismatch (see 300.out.bad)
--- /home/eblake/qemu-tmp2/tests/qemu-iotests/300.out    2020-08-19
20:53:11.087791988 -0500
+++ /home/eblake/qemu-tmp2/tests/qemu-iotests/300.out.bad    2020-08-19
20:56:06.092428756 -0500
@@ -1,5 +1,41 @@
-.
+WARNING:qemu.machine:qemu received signal 11; command:
"/home/eblake/qemu-tmp2/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
-display none -vga none -chardev
socket,id=mon,path=/tmp/tmp.qT831UThme/qemu-b-798452-monitor.sock -mon
chardev=mon,mode=control -qtest
unix:path=/tmp/tmp.qT831UThme/qemu-b-798452-qtest.sock -accel qtest
-nodefaults -display none -accel qtest -blockdev
node-name=node0,driver=null-co -incoming unix:/tmp/tmp.qT831UThme/mig_sock"
+.FE...
+==
+ERROR: test_migratee_bitmap_is_not_mapped_on_dst
(__main__.TestBlockBitmapMappingErrors)
+--
+Traceback (most recent call last):
+  File
"/home/eblake/qemu-tmp2/tests/qemu-iotests/../../python/qemu/machine.py", line
435, in _do_shutdown
+    self._soft_shutdown(timeout, has_quit)
+  File
"/home/eblake/qemu-tmp2/tests/qemu-iotests/../../python/qemu/machine.py", line
415, in _soft_shutdown
+    self._qmp.cmd('quit')
+  File
"/home/eblake/qemu-tmp2/tests/qemu-iotests/../../python/qemu/qmp.py",
line 266, in cmd
+    return self.cmd_obj(qmp_cmd)
+  File
"/home/eblake/qemu-tmp2/tests/qemu-iotests/../../python/qemu/qmp.py",
line 246, in cmd_obj
+    self.__sock.sendall(json.dumps(qmp_cmd).encode('utf-8'))
+BrokenPipeError: [Errno 32] Broken pipe
+
+The above exception was the direct cause of the following exception:
+
+Traceback (most recent call last):
+  File "300", line 76, in tearDown
+    self.vm_b.shutdown()
+  File
"/home/eblake/qemu-tmp2/tests/qemu-iotests/../../python/qemu/machine.py", line
465, in shutdown
+    self._do_shutdown(timeout, has_quit)
+  File
"/home/eblake/qemu-tmp2/tests/qemu-iotests/../../python/qemu/machine.py", line
438, in _do_shutdown
+    raise AbnormalShutdown("Could not perform graceful shutdown") \
+qemu.machine.AbnormalShutdown: Could not perform graceful shutdown
+
+==
+FAIL: test_migratee_bitmap_is_not_mapped_on_dst
(__main__.TestBlockBitmapMappingErrors)
+--
+Traceback (most recent call last):
+  File "300", line 384, in test_migratee_bitmap_is_not_mapped_on_dst
+    self.migrate(False)
+  File "300", line 99, in migrate
+    self.assertEqual(self.vm_a.wait_migration('postmigrate'),
+AssertionError: False != True
+
  --
  Ran 37 tests

-OK
+FAILED (failures=1, errors=1)

I'm not sure why I'm seeing that, but it looks like you've got a bad
deref somewhere in the alias code.


Ah, crap.

This should fix it:

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 89cb16b12c..d407dfefea 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -1091,7 +1091,9 @@ static int dirty_bitmap_load_header(QEMUFile *f,
DBMLoadState *s,
  } else {
  bitmap_name = s->bitmap_alias;
  }
+}

+if (!s->cancelled) {
  g_strlcpy(s->bitmap_name, bitmap_name, sizeof(s->bitmap_name));
  s->bitmap = bdrv_find_dirty_bitmap(s->bs, s->bitmap_name);




That's correct thing to do


I had this originally, and then I decided to drop that hunk just be

[PATCH v2] spapr/xive: Use the xics flag to check for XIVE-only IRQ backends

The sPAPR machine has four different IRQ backends, each implementing
the XICS or XIVE interrupt mode or both in the case of the 'dual'
backend.

If a machine is started in P8 compat mode, QEMU should necessarily
support the XICS interrupt mode and in that case, the XIVE-only IRQ
backend is invalid. Currently, spapr_irq_check() tests the pointer
value to the IRQ backend to check for this condition, instead use the
'xics' flag. It's equivalent and it will ease the introduction of new
XIVE-only IRQ backends if needed.

Signed-off-by: Cédric Le Goater 
---
 hw/ppc/spapr_irq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index 80cf1c3d6bb2..d036c8fef519 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -172,7 +172,7 @@ static int spapr_irq_check(SpaprMachineState *spapr, Error 
**errp)
  * To cover both and not confuse the OS, add an early failure in
  * QEMU.
  */
-if (spapr->irq == &spapr_irq_xive) {
+if (!spapr->irq->xics) {
 error_setg(errp, "XIVE-only machines require a POWER9 CPU");
 return -1;
 }
-- 
2.25.4

Re: [PATCH V6] Introduce a new flag for i440fx to disable PCI hotplug on the root bus

On Thu, 20 Aug 2020 14:51:56 +0530
Ani Sinha  wrote:

> We introduce a new global flag 'acpi-root-pci-hotplug' for i440fx with which
> we can turn on or off PCI device hotplug on the root bus. This flag can be
> used to prevent all PCI devices from getting hotplugged or unplugged from the
> root PCI bus.
> This feature is targetted mostly towards Windows VMs. It is useful in cases
> where some hypervisor admins want to deploy guest VMs in a way so that the
> users of the guest OSes are not able to hot-eject certain PCI devices from
> the Windows system tray. Laine has explained the use case here in detail:
> https://www.redhat.com/archives/libvir-list/2020-February/msg00110.html
> 
> Julia has resolved this issue for PCIE buses with the following commit:
> 530a0963184e57e71a5b538 ("pcie_root_port: Add hotplug disabling option")
> 
> This commit attempts to introduce similar behavior for PCI root buses used in
> i440fx machine types (although in this case, we do not have a per-slot
> capability to turn hotplug on or off).
> 
> Usage:
>-global PIIX4_PM.acpi-root-pci-hotplug=off
> 
> By default, this option is enabled which means that hotplug is turned on for
> the PCI root bus.
> 
> The previously existing flag 'acpi-pci-hotplug-with-bridge-support' for 
> PCI-PCI
> bridges remain as is and can be used along with this new flag to control PCI
> hotplug on PCI bridges.
> 
> This change has been tested using a Windows 2012R2 server guest image and also
> with a Windows 2019 server guest image on a Ubuntu 18.04 host using the latest
> master qemu from upstream (v5.1.0 tag).
> 
> Signed-off-by: Ani Sinha 
> ---
>  hw/acpi/piix4.c  |  8 ++--
>  hw/i386/acpi-build.c | 26 +++---
>  2 files changed, 25 insertions(+), 9 deletions(-)
> 
> Change Log:
> V5..V6: specified upstream master tag information off which this patch is
> based off of.
> 
> diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> index 26bac4f16c..4f436e5bf3 100644
> --- a/hw/acpi/piix4.c
> +++ b/hw/acpi/piix4.c
> @@ -78,6 +78,7 @@ typedef struct PIIX4PMState {
>  
>  AcpiPciHpState acpi_pci_hotplug;
>  bool use_acpi_hotplug_bridge;
> +bool use_acpi_root_pci_hotplug;
>  
>  uint8_t disable_s3;
>  uint8_t disable_s4;

> @@ -595,8 +596,9 @@ static void piix4_acpi_system_hot_add_init(MemoryRegion 
> *parent,
>"acpi-gpe0", GPE_LEN);
>  memory_region_add_subregion(parent, GPE_BASE, &s->io_gpe);
>  
> -acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
> -s->use_acpi_hotplug_bridge);
> +if (s->use_acpi_hotplug_bridge || s->use_acpi_root_pci_hotplug)
> +acpi_pcihp_init(OBJECT(s), &s->acpi_pci_hotplug, bus, parent,
> +s->use_acpi_hotplug_bridge);
If intent was to disable hardware part of ACPI hotplug,
then this hunk is not enough. I'd say it introduces bug since you are leaving
device_add/del route open and "_E01" AML code around trying to access no longer
described/present io ports.

Without this hunk patch is fine, as a means to hide hotplug from Windows.

If you'd like to disable hw part, you will need to consider case where hotplug 
is
disabled completly and block all related AML and block device_add|del.
So it would be a bit more than above hunk.


>  s->cpu_hotplug_legacy = true;
>  object_property_add_bool(OBJECT(s), "cpu-hotplug-legacy",
> @@ -635,6 +637,8 @@ static Property piix4_pm_properties[] = {
>  DEFINE_PROP_UINT8(ACPI_PM_PROP_S4_VAL, PIIX4PMState, s4_val, 2),
>  DEFINE_PROP_BOOL("acpi-pci-hotplug-with-bridge-support", PIIX4PMState,
>   use_acpi_hotplug_bridge, true),
> +DEFINE_PROP_BOOL("acpi-root-pci-hotplug", PIIX4PMState,
> + use_acpi_root_pci_hotplug, true),
>  DEFINE_PROP_BOOL("memory-hotplug-support", PIIX4PMState,
>   acpi_memory_hotplug.is_enabled, true),
>  DEFINE_PROP_END_OF_LIST(),
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index b7bc2a..19a1702ad1 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -95,6 +95,7 @@ typedef struct AcpiPmInfo {
>  bool s3_disabled;
>  bool s4_disabled;
>  bool pcihp_bridge_en;
> +bool pcihp_root_en;
>  uint8_t s4_val;
>  AcpiFadtData fadt;
>  uint16_t cpu_hp_io_base;
> @@ -245,6 +246,9 @@ static void acpi_get_pm_info(MachineState *machine, 
> AcpiPmInfo *pm)
>  pm->pcihp_bridge_en =
>  object_property_get_bool(obj, "acpi-pci-hotplug-with-bridge-support",
>   NULL);
> +pm->pcihp_root_en =
> +object_property_get_bool(obj, "acpi-root-pci-hotplug", NULL);
> +
>  }
>  
>  static void acpi_get_misc_info(AcpiMiscInfo *info)
> @@ -337,15 +341,18 @@ static void build_append_pcihp_notify_entry(Aml 
> *method, int slot)
>  }
>  
>  static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus,
> - bool pcihp

Re: [PATCH v1] pc: fix auto_enable_numa_with_memhp/auto_enable_numa_with_memdev for the 5.0 machine

On Thu, 20 Aug 2020 11:48:28 +0200
David Hildenbrand  wrote:

> Unfortunately, a typo sneeked in: we want to set
> auto_enable_numa_with_memdev to false, not auto_enable_numa_with_memhp.
> 
> Cc: qemu-sta...@nongnu.org # v5.1
> Fixes: 195784a0cfad (numa: Auto-enable NUMA when any memory devices are 
> possible)
> Reported-by: Dr. David Alan Gilbert 
> Cc: Paolo Bonzini 
> Cc: Richard Henderson 
> Cc: Eduardo Habkost 
> Cc: "Michael S. Tsirkin" 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Igor Mammedov 

> ---
>  hw/i386/pc_q35.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index a3e607a544..e94f45779b 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -371,7 +371,7 @@ static void pc_q35_5_0_machine_options(MachineClass *m)
>  m->numa_mem_supported = true;
>  compat_props_add(m->compat_props, hw_compat_5_0, hw_compat_5_0_len);
>  compat_props_add(m->compat_props, pc_compat_5_0, pc_compat_5_0_len);
> -m->auto_enable_numa_with_memhp = false;
> +m->auto_enable_numa_with_memdev = false;
>  }
>  
>  DEFINE_Q35_MACHINE(v5_0, "pc-q35-5.0", NULL,

Re: deprecation of in-tree builds

On Thu, 20 Aug 2020 at 14:30, Kevin Wolf  wrote:
>
> Am 20.08.2020 um 13:56 hat Paolo Bonzini geschrieben:
> > The Meson conversion was announced in October 2019 as breaking in-tree
> > builds, and the deprecation request is from March 2020.  So I don't
> > think this is a breakage but rather a widely-announced change.
>
> Wasn't the decision after that discussion that we do _not_ want to
> deprecate './configure; make' from the source directory?
>
> I seem to remember that we wanted to merge a message to make a
> recommendation for out-of-tree builds, but looking at configure, I can't
> find even that.

We opted not to do that or to mark out-of-tree builds as deprecated
in the release notes, because the outcome of the discussion was
that we wanted to retain the support for 'run configure and make
from the source tree'.

thanks
-- PMM

Re: [PATCH v13 00/11] iotests: Dump QCOW2 dirty bitmaps metadata

2020-08-20 Thread Andrey Shinkevich


On 20.08.2020 03:49, Eric Blake wrote:

On 8/14/20 6:56 AM, Andrey Shinkevich wrote:

Dear Eric!

Vladimir has compeated reviewing this series. I have not received any 
other responses to it so far.


So, is it good for pull request now? Would you please consider taking 
this series as you did it with the Vladimir's related one?


I've spent some time playing with this; I have now queued it on my 
bitmaps tree, and will be posting a pull request as soon as Paolo's 
meson changes settle.  I also made the tweaks suggested on 9/11.




Sounds good, thank you so much, Eric. I appreciate.

Andrey

Re: [RFC PATCH 13/22] block/export: Move refcount from NBDExport to BlockExport

Am 19.08.2020 um 22:58 hat Eric Blake geschrieben:
> On 8/13/20 11:29 AM, Kevin Wolf wrote:
> > Having a refcount makes sense for all types of block exports. It is also
> > a prerequisite for keeping a list of all exports at the BlockExport
> > level.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> 
> > +++ b/include/block/export.h
> > @@ -21,14 +21,24 @@ typedef struct BlockExport BlockExport;
> >   typedef struct BlockExportDriver {
> >   BlockExportType type;
> >   BlockExport *(*create)(BlockExportOptions *, Error **);
> > +void (*delete)(BlockExport *);
> >   } BlockExportDriver;
> >   struct BlockExport {
> >   const BlockExportDriver *drv;
> > +
> > +/*
> > + * Reference count for this block export. This includes strong 
> > references
> > + * both from the owner (qemu-nbd or the monitor) and clients connected 
> > to
> > + * the export.
> 
> I guess 'the monitor' includes qemu-storage-daemon.

Yes, qemu-storage-daemon has a QMP monitor, so I would count it there.
Even the --export command line option only calls the QMP command
internally.

> > + */
> > +int refcount;
> >   };
> >   extern const BlockExportDriver blk_exp_nbd;
> >   BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp);
> > +void blk_exp_ref(BlockExport *exp);
> > +void blk_exp_unref(BlockExport *exp);
> 
> Yay, I think this naming is more consistent with the rest of qemu...
> 
> >   #endif
> > diff --git a/include/block/nbd.h b/include/block/nbd.h
> > index 23030db3f1..af8509ab70 100644
> > --- a/include/block/nbd.h
> > +++ b/include/block/nbd.h
> > @@ -336,8 +336,6 @@ NBDExport *nbd_export_new(BlockDriverState *bs,
> >   void nbd_export_set_on_eject_blk(BlockExport *exp, BlockBackend *blk);
> >   void nbd_export_close(NBDExport *exp);
> >   void nbd_export_remove(NBDExport *exp, NbdServerRemoveMode mode, Error 
> > **errp);
> > -void nbd_export_get(NBDExport *exp);
> > -void nbd_export_put(NBDExport *exp);
> 
> ...as opposed to this which is common in kernel but less so in this project.
> No hard feelings from me :)
> 
> > +++ b/blockdev-nbd.c
> > @@ -232,7 +232,7 @@ BlockExport *nbd_export_create(BlockExportOptions 
> > *exp_args, Error **errp)
> >   /* The list of named exports has a strong reference to this export 
> > now and
> >* our only way of accessing it is through nbd_export_find(), so we 
> > can drop
> >* the strong reference that is @exp. */
> > -nbd_export_put(exp);
> > +blk_exp_unref((BlockExport*) exp);
> 
> Even a helper function that converts NBDBlockExport* to BlockExport* rather
> than a cast might be nicer, but then again, I see from Max's review that
> this may be a temporary state of things.
> (The QAPI contains such type-safe container casts, such as
> qapi_DriveBackup_base(), if that helps...)

Yes, this goes away before the end of the series.

> > @@ -1537,7 +1536,8 @@ NBDExport *nbd_export_new(BlockDriverState *bs,
> >   exp = g_new0(NBDExport, 1);
> >   exp->common = (BlockExport) {
> > -.drv = &blk_exp_nbd,
> > +.drv= &blk_exp_nbd,
> > +.refcount   = 1,
> 
> I'm not sure whether trying to align the '=' is good, because the moment you
> add a longer field name, every other line has to be touched.  I'm fine with
> just one space on both side of =, even if it is more ragged to read.  But
> you're the author, so you get to pick.

I generally prefer aligned '=' because the code is read much more often
than it is written or modified, so being friendly for readers is
important.

> > @@ -1626,8 +1625,9 @@ NBDExport *nbd_export_new(BlockDriverState *bs,
> >   exp->ctx = ctx;
> >   blk_add_aio_context_notifier(blk, blk_aio_attached, blk_aio_detach, 
> > exp);
> > +blk_exp_ref(&exp->common);
> >   QTAILQ_INSERT_TAIL(&exports, exp, next);
> > -nbd_export_get(exp);
> > +
> 
> Is there any consequence to this changed ordering in grabbing the reference
> vs. updating the list?

No intended consequences, but if Max is right that the code (before and
after this series) lacks some locking, it might make a theoretical
difference. If it does, the new code is safer than the old one. If it
doesn't, it's just more consistent with the order I'm used to see in
other places: First take the reference, then use it.

Kevin

Re: [PATCH v4 2/4] iotests.py: Add wait_for_runstate()

2020-08-20 Thread Dr. David Alan Gilbert

* Eric Blake (ebl...@redhat.com) wrote:
> On 8/18/20 8:32 AM, Max Reitz wrote:
> > Signed-off-by: Max Reitz 
> > ---
> >   tests/qemu-iotests/iotests.py | 4 
> >   1 file changed, 4 insertions(+)
> 
> Reviewed-by: Eric Blake 
> 
> > 
> > diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> > index 717b5b652c..ee93cf22db 100644
> > --- a/tests/qemu-iotests/iotests.py
> > +++ b/tests/qemu-iotests/iotests.py
> > @@ -833,6 +833,10 @@ class VM(qtest.QEMUQtestMachine):
> >  'Found node %s under %s (but expected %s)' % \
> >  (node['name'], path, expected_node)
> > +def wait_for_runstate(self, runstate: str) -> None:
> > +while self.qmp('query-status')['return']['status'] != runstate:
> > +time.sleep(0.2)
> 
> This looks like it could inf-loop if we have a bug where the status never
> changes as expected; but I guess CI bots have timeouts at higher layers that
> would detect if such a bug sneaks in.

Although it might be useful to make sure when such a timeout lands, you
know which state you thought you were waiting for.

Dave

> > +
> >   index_re = re.compile(r'([^\[]+)\[([^\]]+)\]')
> >   class QMPTestCase(unittest.TestCase):
> > 
> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [PATCH] qtest: add fuzz test case

On 19/08/2020 16.15, Li Qiang wrote:
> Currently the device fuzzer find a more and more issues.
> For every fuzz case, we need not only the fixes but also
> the coressponding test case. We can analysis the reproducer
> for every case and find what happened in where and write
> a beautiful test case. However the raw data of reproducer is not
> friendly to analysis. It will take a very long time, even far more
> than the fixes itself. So let's create a new file to hold all of
> the fuzz test cases and just use the raw data to act as the test
> case. This way nobody will be afraid of writing a test case for
> the fuzz reproducer.
> 
> This patch adds the issue LP#1878263 test case.
> 
> Signed-off-by: Li Qiang 
> ---
>  tests/qtest/Makefile.include |  2 ++
>  tests/qtest/fuzz-test.c  | 45 
>  2 files changed, 47 insertions(+)
>  create mode 100644 tests/qtest/fuzz-test.c
> 
> diff --git a/tests/qtest/Makefile.include b/tests/qtest/Makefile.include
> index b0204e44f2..ff460179c5 100644
> --- a/tests/qtest/Makefile.include
> +++ b/tests/qtest/Makefile.include
> @@ -7,6 +7,7 @@ check-qtest-generic-y += machine-none-test
>  check-qtest-generic-y += qmp-test
>  check-qtest-generic-y += qmp-cmd-test
>  check-qtest-generic-y += qom-test
> +check-qtest-generic-y += fuzz-test

I think this should go into check-qtest-i386-y instead ...

> diff --git a/tests/qtest/fuzz-test.c b/tests/qtest/fuzz-test.c
> new file mode 100644
> index 00..695c6dffb9
> --- /dev/null
> +++ b/tests/qtest/fuzz-test.c
> @@ -0,0 +1,45 @@
> +/*
> + * QTest testcase for fuzz case
> + *
> + * Copyright (c) 2020 Li Qiang 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +
> +#include "qemu/osdep.h"
> +
> +#include "libqtest.h"
> +
> +/*
> + * This used to trigger the assert in scsi_dma_complete
> + * https://bugs.launchpad.net/qemu/+bug/1878263
> + */
> +static void test_megasas_zero_iov_cnt(void)
> +{
> +QTestState *s;
> +
> +s = qtest_init("-nographic -monitor none -serial none "
> +   "-M q35 -device megasas -device scsi-cd,drive=null0 "
> +   "-blockdev 
> driver=null-co,read-zeroes=on,node-name=null0");

... since you hard-coded -M q35 here.

Alternatively, you need to check qtest_get_arch() for "i386" /  "x86_64"
in the main() function.

 Thomas

Re: [PATCH 02/10] numa: introduce MachineClass::forbid_asymmetrical_numa

On Thu, 20 Aug 2020 07:33:00 -0300
Daniel Henrique Barboza  wrote:

> On 8/20/20 1:15 AM, David Gibson wrote:
> > On Wed, Aug 19, 2020 at 10:11:28PM -0400, Eduardo Habkost wrote:  
> >> On Thu, Aug 20, 2020 at 11:17:26AM +1000, David Gibson wrote:  
> >>> On Fri, Aug 14, 2020 at 05:54:16PM -0300, Daniel Henrique Barboza wrote:  
>  The pSeries machine does not support asymmetrical NUMA
>  configurations.  
> >>>
> >>> This seems a bit oddly specific to have as a global machine class
> >>> property.
> >>>
> >>> Would it make more sense for machines with specific NUMA constraints
> >>> to just verify those during their initialization?  
> >>
> >> This would be much simpler.  However, I like the idea of
> >> representing machine-specific configuration validation rules as
> >> data that can eventually be exported to management software.  
> > 
> > Ah, ok, so basically the usual tradeoff between flexibility and
> > advertisability.  
> 
> 
> 
> To provide context, what I did here was inspired by this commit:
> 
> commit 0533ef5f2089f4f12a0ec5c8035e5e15ba0b5556
> Author: Tao Xu 
> Date:   Thu Sep 5 16:32:38 2019 +0800
> 
>  numa: Introduce MachineClass::auto_enable_numa for implicit NUMA node
> 
> 
> In this commit, exclusive NUMA code from spapr.c was taken and put it
> into numa.c, with a flag being set in spapr machine_init.

We have too many auto_enable_numa*, we probably should merge them into
just one auto_enable_numa if it's possible.

> 
> Thanks,
> 
> 
> DHB
> 
> 
> > 
> > So, in that case, I guess the question is whether we envisage "no
> > assymmetry" as a constraint common enough that it's worth creating an
> > advertisable rule or not.  If we only ever have one user, then we
> > haven't really done any better than hard coding the constraint in the
> > manageent software.
> > 
> > Of course to complicate matters, in the longer term we're looking at
> > removing that constraint from pseries - but doing so will be dependent
> > on the guest kernel understanding a new format for the NUMA
> > information in the device tree.  So qemu alone won't have enough
> > information to tell if such a configuration is possible or not.

If it's guest limitiation, then it doesn't sound like QEMU material to me.
It would be more appropriate to validate at mgmt level which knows what
guest is being used.

> >   
> >> (CCing John Snow, who had spent some time thinking about
> >> configuration validation recently.)
> >>
> >>  
> 
>  CC: Eduardo Habkost 
>  CC: Marcel Apfelbaum 
>  Signed-off-by: Daniel Henrique Barboza 
>  ---
>    hw/core/numa.c  | 7 +++
>    hw/ppc/spapr.c  | 1 +
>    include/hw/boards.h | 1 +
>    3 files changed, 9 insertions(+)
> 
>  diff --git a/hw/core/numa.c b/hw/core/numa.c
>  index d1a94a14f8..1e81233c1d 100644
>  --- a/hw/core/numa.c
>  +++ b/hw/core/numa.c
>  @@ -547,6 +547,7 @@ static int parse_numa(void *opaque, QemuOpts *opts, 
>  Error **errp)
> */
>    static void validate_numa_distance(MachineState *ms)
>    {
>  +MachineClass *mc = MACHINE_GET_CLASS(ms);
>    int src, dst;
>    bool is_asymmetrical = false;
>    int nb_numa_nodes = ms->numa_state->num_nodes;
>  @@ -575,6 +576,12 @@ static void validate_numa_distance(MachineState *ms)
>    }
>    
>    if (is_asymmetrical) {
>  +if (mc->forbid_asymmetrical_numa) {
>  +error_report("This machine type does not support "
>  + "asymmetrical numa distances.");
>  +exit(EXIT_FAILURE);
>  +}
>  +
>    for (src = 0; src < nb_numa_nodes; src++) {
>    for (dst = 0; dst < nb_numa_nodes; dst++) {
>    if (src != dst && numa_info[src].distance[dst] == 0) {
>  diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>  index dd2fa4826b..3b16edaf4c 100644
>  --- a/hw/ppc/spapr.c
>  +++ b/hw/ppc/spapr.c
>  @@ -4512,6 +4512,7 @@ static void spapr_machine_class_init(ObjectClass 
>  *oc, void *data)
> */
>    mc->numa_mem_align_shift = 28;
>    mc->auto_enable_numa = true;
>  +mc->forbid_asymmetrical_numa = true;
>    
>    smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
>    smc->default_caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_ON;
>  diff --git a/include/hw/boards.h b/include/hw/boards.h
>  index bc5b82ad20..dc6cdd1c53 100644
>  --- a/include/hw/boards.h
>  +++ b/include/hw/boards.h
>  @@ -215,6 +215,7 @@ struct MachineClass {
>    bool nvdimm_supported;
>    bool numa_mem_supported;
>    bool auto_enable_numa;
>  +bool forbid_asymmetrical_numa;
>    const char *default_ram_id;
>    
>    HotplugHandler *(*get_hotplug_handler)(MachineState *machine,  
> >>>  
> >>
> >>
> >>  
> >   
>

Re: [PATCH v4 2/4] iotests.py: Add wait_for_runstate()


20.08.2020 17:23, Dr. David Alan Gilbert wrote:

* Eric Blake (ebl...@redhat.com) wrote:

On 8/18/20 8:32 AM, Max Reitz wrote:

Signed-off-by: Max Reitz 
---
   tests/qemu-iotests/iotests.py | 4 
   1 file changed, 4 insertions(+)


Reviewed-by: Eric Blake 



diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 717b5b652c..ee93cf22db 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -833,6 +833,10 @@ class VM(qtest.QEMUQtestMachine):
  'Found node %s under %s (but expected %s)' % \
  (node['name'], path, expected_node)
+def wait_for_runstate(self, runstate: str) -> None:
+while self.qmp('query-status')['return']['status'] != runstate:
+time.sleep(0.2)


This looks like it could inf-loop if we have a bug where the status never
changes as expected; but I guess CI bots have timeouts at higher layers that
would detect if such a bug sneaks in.


Although it might be useful to make sure when such a timeout lands, you
know which state you thought you were waiting for.

Dave



Timeout class is defined in iotests.py, so we can simply insert something like

 ... , timeout=60) -> None:
  with Timeout(timeout, f"Timeout waiting for '{runstate}' runstate"):
 ...


--
Best regards,
Vladimir

Re: [PULL 3/3] hw: add compat machines for 5.2

On Wed, 19 Aug 2020 11:22:58 -0400
Eduardo Habkost  wrote:

> From: Cornelia Huck 
> 
> Add 5.2 machine types for arm/i440fx/q35/s390x/spapr.
> 
> Reviewed-by: Andrew Jones 
> Reviewed-by: Michael S. Tsirkin 
> Reviewed-by: Greg Kurz 
> Acked-by: Christian Borntraeger 
> Acked-by: David Gibson 
> Acked-by: Thomas Huth 
> Signed-off-by: Cornelia Huck 
> Message-Id: <20200819144016.281156-1-coh...@redhat.com>
> Signed-off-by: Eduardo Habkost 

Is this the latest version of the patch?
It doesn't apply cleanly on top of the current master.

> ---
>  include/hw/boards.h|  3 +++
>  include/hw/i386/pc.h   |  3 +++
>  hw/arm/virt.c  |  9 -
>  hw/core/machine.c  |  3 +++
>  hw/i386/pc.c   |  3 +++
>  hw/i386/pc_piix.c  | 14 +-
>  hw/i386/pc_q35.c   | 13 -
>  hw/ppc/spapr.c | 15 +--
>  hw/s390x/s390-virtio-ccw.c | 14 +-
>  9 files changed, 71 insertions(+), 6 deletions(-)
> 
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 426ce5f625..bc5b82ad20 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -319,6 +319,9 @@ struct MachineState {
>  } \
>  type_init(machine_initfn##_register_types)
>  
> +extern GlobalProperty hw_compat_5_1[];
> +extern const size_t hw_compat_5_1_len;
> +
>  extern GlobalProperty hw_compat_5_0[];
>  extern const size_t hw_compat_5_0_len;
>  
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 3d7ed3a55e..fe52e165b2 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -193,6 +193,9 @@ void pc_system_firmware_init(PCMachineState *pcms, 
> MemoryRegion *rom_memory);
>  void pc_madt_cpu_entry(AcpiDeviceIf *adev, int uid,
> const CPUArchIdList *apic_ids, GArray *entry);
>  
> +extern GlobalProperty pc_compat_5_1[];
> +extern const size_t pc_compat_5_1_len;
> +
>  extern GlobalProperty pc_compat_5_0[];
>  extern const size_t pc_compat_5_0_len;
>  
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index ecfee362a1..acf9bfbece 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2546,10 +2546,17 @@ static void machvirt_machine_init(void)
>  }
>  type_init(machvirt_machine_init);
>  
> +static void virt_machine_5_2_options(MachineClass *mc)
> +{
> +}
> +DEFINE_VIRT_MACHINE_AS_LATEST(5, 2)
> +
>  static void virt_machine_5_1_options(MachineClass *mc)
>  {
> +virt_machine_5_2_options(mc);
> +compat_props_add(mc->compat_props, hw_compat_5_1, hw_compat_5_1_len);
>  }
> -DEFINE_VIRT_MACHINE_AS_LATEST(5, 1)
> +DEFINE_VIRT_MACHINE(5, 1)
>  
>  static void virt_machine_5_0_options(MachineClass *mc)
>  {
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 8d1a90c6cf..cf5f2dfaeb 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -28,6 +28,9 @@
>  #include "hw/mem/nvdimm.h"
>  #include "migration/vmstate.h"
>  
> +GlobalProperty hw_compat_5_1[] = {};
> +const size_t hw_compat_5_1_len = G_N_ELEMENTS(hw_compat_5_1);
> +
>  GlobalProperty hw_compat_5_0[] = {
>  { "pci-host-bridge", "x-config-reg-migration-enabled", "off" },
>  { "virtio-balloon-device", "page-poison", "false" },
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 47c5ca3e34..9aa813949c 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -97,6 +97,9 @@
>  #include "fw_cfg.h"
>  #include "trace.h"
>  
> +GlobalProperty pc_compat_5_1[] = {};
> +const size_t pc_compat_5_1_len = G_N_ELEMENTS(pc_compat_5_1);
> +
>  GlobalProperty pc_compat_5_0[] = {
>  };
>  const size_t pc_compat_5_0_len = G_N_ELEMENTS(pc_compat_5_0);
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index b789e83f9a..c5ba70ca17 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -426,7 +426,7 @@ static void pc_i440fx_machine_options(MachineClass *m)
>  machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
>  }
>  
> -static void pc_i440fx_5_1_machine_options(MachineClass *m)
> +static void pc_i440fx_5_2_machine_options(MachineClass *m)
>  {
>  PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
>  pc_i440fx_machine_options(m);
> @@ -435,6 +435,18 @@ static void pc_i440fx_5_1_machine_options(MachineClass 
> *m)
>  pcmc->default_cpu_version = 1;
>  }
>  
> +DEFINE_I440FX_MACHINE(v5_2, "pc-i440fx-5.2", NULL,
> +  pc_i440fx_5_2_machine_options);
> +
> +static void pc_i440fx_5_1_machine_options(MachineClass *m)
> +{
> +pc_i440fx_5_2_machine_options(m);
> +m->alias = NULL;
> +m->is_default = false;
> +compat_props_add(m->compat_props, hw_compat_5_1, hw_compat_5_1_len);
> +compat_props_add(m->compat_props, pc_compat_5_1, pc_compat_5_1_len);
> +}
> +
>  DEFINE_I440FX_MACHINE(v5_1, "pc-i440fx-5.1", NULL,
>pc_i440fx_5_1_machine_options);
>  
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index a3e607a544..0cb9c18cd4 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -353,7 +353,7 @@ static void pc_q35_machine_options(MachineCl

Re: [PULL 3/3] hw: add compat machines for 5.2

2020-08-20 Thread Cornelia Huck

On Thu, 20 Aug 2020 16:35:08 +0200
Igor Mammedov  wrote:

> On Wed, 19 Aug 2020 11:22:58 -0400
> Eduardo Habkost  wrote:
> 
> > From: Cornelia Huck 
> > 
> > Add 5.2 machine types for arm/i440fx/q35/s390x/spapr.
> > 
> > Reviewed-by: Andrew Jones 
> > Reviewed-by: Michael S. Tsirkin 
> > Reviewed-by: Greg Kurz 
> > Acked-by: Christian Borntraeger 
> > Acked-by: David Gibson 
> > Acked-by: Thomas Huth 
> > Signed-off-by: Cornelia Huck 
> > Message-Id: <20200819144016.281156-1-coh...@redhat.com>
 ^^

> > Signed-off-by: Eduardo Habkost   
> 
> Is this the latest version of the patch?
> It doesn't apply cleanly on top of the current master.

Again? I re-spun it yesterday against then-current master.

Re: [RFC PATCH 07/22] block/export: Remove magic from block-export-add

2020-08-20 Thread Eric Blake


On 8/20/20 6:05 AM, Kevin Wolf wrote:


As long as we can keep the compatibility code local to qmp_nbd_*(), I
don't think it's too bad. In particular because it's already written.

Instead of adjusting libvirt to changes in the nbd-* commands, I'd
rather have it change over to block-export-*. I would like to see the
nbd-server-add/remove commands deprecated soon after we have the
replacements.


Makes sense to me. So deprecate nbd-server-add in 5.2, and require 
block-export in 6.1.




+/*
+ * nbd-server-add doesn't complain when a read-only device should be
+ * exported as writable, but simply downgrades it. This is an error with
+ * block-export-add.


I'd be happy with either marking this deprecated now (and fixing it in two
releases), or declaring it a bug in nbd-server-add now (and fixing it
outright).


How about deprecating nbd-server-add completely?


Works for me. Keeping the warts backwards-compatible in nbd-server-add 
is more palatable if we know we are going to drop it wholesale down the 
road.



+/*
+ * nbd-server-add removes the export when the named BlockBackend used for
+ * @device goes away.
+ */
+on_eject_blk = blk_by_name(arg->device);
+if (on_eject_blk) {
+nbd_export_set_on_eject_blk(export, on_eject_blk);
+}


Wait - is the magic export removal tied only to exporting a drive name, and
not a node name?  So as long as libvirt is using only node names whwen
adding exports, a drive being unplugged won't interfere?


Yes, seems so. It's the existing behaviour, I'm only moving the code
around.


Overall, the change makes sense to me, although I'd love to see if we could
go further on the writable vs. read-only issue.


If nbd-server-add will be going away relatively soon, it's probably not
worth the trouble. But if you have reasons to keep it, maybe we should
consider it.


No, I'm fine with the idea of getting rid of nbd-server-add, at which 
point changing it before removal is pointless.



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PULL 3/3] hw: add compat machines for 5.2

On Thu, 20 Aug 2020 16:35:08 +0200
Igor Mammedov  wrote:

> On Wed, 19 Aug 2020 11:22:58 -0400
> Eduardo Habkost  wrote:
> 
> > From: Cornelia Huck 
> > 
> > Add 5.2 machine types for arm/i440fx/q35/s390x/spapr.
> > 
> > Reviewed-by: Andrew Jones 
> > Reviewed-by: Michael S. Tsirkin 
> > Reviewed-by: Greg Kurz 
> > Acked-by: Christian Borntraeger 
> > Acked-by: David Gibson 
> > Acked-by: Thomas Huth 
> > Signed-off-by: Cornelia Huck 
> > Message-Id: <20200819144016.281156-1-coh...@redhat.com>
> > Signed-off-by: Eduardo Habkost   
> 
> Is this the latest version of the patch?
> It doesn't apply cleanly on top of the current master.
Sorry for noise,
it was messed up tree on my side,
after clean checkout it applies fine.

> 
> > ---
> >  include/hw/boards.h|  3 +++
> >  include/hw/i386/pc.h   |  3 +++
> >  hw/arm/virt.c  |  9 -
> >  hw/core/machine.c  |  3 +++
> >  hw/i386/pc.c   |  3 +++
> >  hw/i386/pc_piix.c  | 14 +-
> >  hw/i386/pc_q35.c   | 13 -
> >  hw/ppc/spapr.c | 15 +--
> >  hw/s390x/s390-virtio-ccw.c | 14 +-
> >  9 files changed, 71 insertions(+), 6 deletions(-)
> > 
> > diff --git a/include/hw/boards.h b/include/hw/boards.h
> > index 426ce5f625..bc5b82ad20 100644
> > --- a/include/hw/boards.h
> > +++ b/include/hw/boards.h
> > @@ -319,6 +319,9 @@ struct MachineState {
> >  } \
> >  type_init(machine_initfn##_register_types)
> >  
> > +extern GlobalProperty hw_compat_5_1[];
> > +extern const size_t hw_compat_5_1_len;
> > +
> >  extern GlobalProperty hw_compat_5_0[];
> >  extern const size_t hw_compat_5_0_len;
> >  
> > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > index 3d7ed3a55e..fe52e165b2 100644
> > --- a/include/hw/i386/pc.h
> > +++ b/include/hw/i386/pc.h
> > @@ -193,6 +193,9 @@ void pc_system_firmware_init(PCMachineState *pcms, 
> > MemoryRegion *rom_memory);
> >  void pc_madt_cpu_entry(AcpiDeviceIf *adev, int uid,
> > const CPUArchIdList *apic_ids, GArray *entry);
> >  
> > +extern GlobalProperty pc_compat_5_1[];
> > +extern const size_t pc_compat_5_1_len;
> > +
> >  extern GlobalProperty pc_compat_5_0[];
> >  extern const size_t pc_compat_5_0_len;
> >  
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index ecfee362a1..acf9bfbece 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -2546,10 +2546,17 @@ static void machvirt_machine_init(void)
> >  }
> >  type_init(machvirt_machine_init);
> >  
> > +static void virt_machine_5_2_options(MachineClass *mc)
> > +{
> > +}
> > +DEFINE_VIRT_MACHINE_AS_LATEST(5, 2)
> > +
> >  static void virt_machine_5_1_options(MachineClass *mc)
> >  {
> > +virt_machine_5_2_options(mc);
> > +compat_props_add(mc->compat_props, hw_compat_5_1, hw_compat_5_1_len);
> >  }
> > -DEFINE_VIRT_MACHINE_AS_LATEST(5, 1)
> > +DEFINE_VIRT_MACHINE(5, 1)
> >  
> >  static void virt_machine_5_0_options(MachineClass *mc)
> >  {
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index 8d1a90c6cf..cf5f2dfaeb 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -28,6 +28,9 @@
> >  #include "hw/mem/nvdimm.h"
> >  #include "migration/vmstate.h"
> >  
> > +GlobalProperty hw_compat_5_1[] = {};
> > +const size_t hw_compat_5_1_len = G_N_ELEMENTS(hw_compat_5_1);
> > +
> >  GlobalProperty hw_compat_5_0[] = {
> >  { "pci-host-bridge", "x-config-reg-migration-enabled", "off" },
> >  { "virtio-balloon-device", "page-poison", "false" },
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 47c5ca3e34..9aa813949c 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -97,6 +97,9 @@
> >  #include "fw_cfg.h"
> >  #include "trace.h"
> >  
> > +GlobalProperty pc_compat_5_1[] = {};
> > +const size_t pc_compat_5_1_len = G_N_ELEMENTS(pc_compat_5_1);
> > +
> >  GlobalProperty pc_compat_5_0[] = {
> >  };
> >  const size_t pc_compat_5_0_len = G_N_ELEMENTS(pc_compat_5_0);
> > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > index b789e83f9a..c5ba70ca17 100644
> > --- a/hw/i386/pc_piix.c
> > +++ b/hw/i386/pc_piix.c
> > @@ -426,7 +426,7 @@ static void pc_i440fx_machine_options(MachineClass *m)
> >  machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
> >  }
> >  
> > -static void pc_i440fx_5_1_machine_options(MachineClass *m)
> > +static void pc_i440fx_5_2_machine_options(MachineClass *m)
> >  {
> >  PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
> >  pc_i440fx_machine_options(m);
> > @@ -435,6 +435,18 @@ static void pc_i440fx_5_1_machine_options(MachineClass 
> > *m)
> >  pcmc->default_cpu_version = 1;
> >  }
> >  
> > +DEFINE_I440FX_MACHINE(v5_2, "pc-i440fx-5.2", NULL,
> > +  pc_i440fx_5_2_machine_options);
> > +
> > +static void pc_i440fx_5_1_machine_options(MachineClass *m)
> > +{
> > +pc_i440fx_5_2_machine_options(m);
> > +m->alias = NULL;
> > +m->is_default = false;
> > +compat_props_add(m->compat_props, hw_compat_5_

[Bug 1883984] Re: QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-s390x

** Changed in: qemu
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1883984

Title:
  QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-
  s390x

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Triaged

Bug description:
  [Impact]

   * An instruction was described wrong so that on usage the program would 
 crash.

  [Test Case]

   * Run s390x in emulation and there use this program:
 For simplicity and speed you can use KVM guest as usual on s390x, that 
 after prep&install&compile of the test you run in qemu-tcg like:

 $ sudo qemu-system-s390x -machine s390-ccw-virtio,accel=tcg -cpu 
max,zpci=on -serial mon:stdio -display none -m 4096 -nic 
user,model=virtio,hostfwd=tcp::-:22 -drive 
file=/var/lib/uvtool/libvirt/images/focal-sqxbr.qcow,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
 -device 
virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off
 Obviously is you have no s390x access you need to use emulation right 
 away.

   * Build and run failing program
 $ sudo apt install clang
 $ cat > bug-sqrtl-one-line.c << EOF
  int main(void) { volatile long double x, r; x = 4.0L; __asm__ 
  __volatile__("sqxbr %0, %1" : "=f" (r) : "f" (x)); return (0);}
  EOF
 $ cc bug-sqrtl-one-line.c
 $ ./a.out
 Segmentation fault (core dumped)

 qemu is dead by now as long as the bug is present

  [Regression Potential]

   * The change only modifies 128 bit square root on s390x so regressions
 should be limited to exactly that - which formerly before this fix was 
 a broken instruction.

  [Other Info]
   
   * n/a

  ---

  In porting software to guest Ubuntu 18.04 and 20.04 VMs for S/390x, I 
discovered
  that some of my own numerical programs, and also a GNU configure script for at
  least one package with CC=clang, would cause an instant crash of the VM, 
sometimes
  also destroying recently opened files, and producing long strings of NUL 
characters
  in /var/log/syslog in the S/390 guest O/S.

  Further detective work narrowed the cause of the crash down to a single IBM 
S/390
  instruction: sqxbr (128-bit IEEE 754 square root).  Here is a one-line program
  that when compiled and run on a VM hosted on QEMUcc emulator version 4.2.0
  (Debian 1:4.2-3ubuntu6.1) [hosted on Ubuntu 20.04 on a Dell Precision 7920
  workstation with an Intel Xeon Platinum 8253 CPU],  and also on QEMU emulator
  version 5.0.0, reproducibly produces a VM crash under qemu-system-s390x.

  % cat bug-sqrtl-one-line.c
  int main(void) { volatile long double x, r; x = 4.0L; __asm__ 
__volatile__("sqxbr %0, %1" : "=f" (r) : "f" (x)); return (0);}

  % cc bug-sqrtl-one-line.c && ./a.out
  Segmentation fault (core dumped)

  The problem code may be the function float128_sqrt() defined in 
qemu-5.0.0/fpu/softfloat.c
  starting at line 7619.  I have NOT attempted to run the qemu-system-s390x 
executable
  under a debugger.  However, I observe that S/390 is the only CPU family that 
I know of,
  except possibly for a Fujitsu SPARC-64, that has a 128-bit square root in 
hardware.
  Thus, this instruction bug may not have been seen before.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1883984/+subscriptions

[Bug 1885720] Re: qemu/migration/postcopy-ram.c:387: bad return expression ?

** Changed in: qemu
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1885720

Title:
  qemu/migration/postcopy-ram.c:387: bad return expression ?

Status in QEMU:
  Fix Released

Bug description:
  qemu/migration/postcopy-ram.c:387:9: style: Non-boolean value returned
  from function returning bool [returnNonBoolInBooleanFunction]

  Source code is

 return -1;

  but

  bool postcopy_ram_supported_by_host(

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1885720/+subscriptions

[Bug 1886318] Re: Qemu after v5.0.0 breaks macos guests

Patch has been included here:
https://git.qemu.org/?p=qemu.git;a=commitdiff;h=dba04c3488c4699f5

** Changed in: qemu
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1886318

Title:
  Qemu after v5.0.0 breaks macos guests

Status in QEMU:
  Fix Released

Bug description:
  The Debian Sid 5.0-6 qemu-kvm package can no longer get further than
  the Clover bootloader whereas 5.0-6 and earlier worked fine.

  So I built qemu master from github and it has the same problem,
  whereas git tag v5.0.0 (or 4.2.1) does not, so something between
  v5.0.0 release and the last few days has caused the problem.

  Here's my qemu script, pretty standard macOS-Simple-KVM setup on a
  Xeon host:

  qemu-system-x86_64 \
  -enable-kvm \
  -m 4G \
  -machine q35,accel=kvm \
  -smp 4,sockets=1,cores=2,threads=2 \
  -cpu 
  
Penryn,vendor=GenuineIntel,kvm=on,+sse3,+sse4.2,+aes,+xsave,+avx,+xsaveopt,+xsavec,+xgetbv1,+avx2,+bmi2,+smep,+bmi1,+fma,+movbe,+invtsc
 
  \
  -device 
  
isa-applesmc,osk="ourhardworkbythesewordsguardedpleasedontsteal(c)AppleComputerInc"
 
  \
  -smbios type=2 \
  -drive if=pflash,format=raw,readonly,file="/tmp/OVMF_CODE.fd" \
  -drive if=pflash,format=raw,file="/tmp/macos_catalina_VARS.fd" \
  -vga qxl \
  -device ich9-ahci,id=sata \
  -drive id=ESP,if=none,format=raw,file=/tmp/ESP.img \
  -device ide-hd,bus=sata.2,drive=ESP \
  -drive id=InstallMedia,format=raw,if=none,file=/tmp/BaseSystem.img \
  -device ide-hd,bus=sata.3,drive=InstallMedia \
  -drive id=SystemDisk,if=none,format=raw,file=/tmp/macos_catalina.img \
  -device ide-hd,bus=sata.4,drive=SystemDisk \
  -usb -device usb-kbd -device usb-mouse

  Perhaps something has changed in Penryn support recently, as that's
  required for macos?

  See also https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=964247

  Also on a related note, kernel 5.6/5.7 (on Debian) hard crashes the
  host when I try GPU passthrough on macos, whereas Ubuntu20/Win10 work
  fine - as does 5.5 kernel.

  See also https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=961676

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1886318/+subscriptions

[Bug 1874073] Re: openrisc_sim.c:87:42: error: 'cpu_irqs[0]' may be used uninitialized in this function [-Werror=maybe-uninitialized]

** Changed in: qemu
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1874073

Title:
  openrisc_sim.c:87:42: error: 'cpu_irqs[0]' may be used uninitialized
  in this function [-Werror=maybe-uninitialized]

Status in QEMU:
  Fix Released

Bug description:
  I see the warning since gcc10:

  static void openrisc_sim_init(MachineState *machine):
  ...
  qemu_irq *cpu_irqs[2];
  ...

  
  serial_mm_init(get_system_memory(), 0x9000, 0, serial_irq,
 115200, serial_hd(0), DEVICE_NATIVE_ENDIAN);

  I would initialize cpu_irqs[2] with {}.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1874073/+subscriptions

[Bug 1856706] Re: target/mips/op_helper.c:971:duplicated branches ?

Patch has been included here:
https://git.qemu.org/?p=qemu.git;a=commitdiff;h=9788e8c9b64e4cebb01

** Changed in: qemu
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1856706

Title:
  target/mips/op_helper.c:971:duplicated branches ?

Status in QEMU:
  Fix Released

Bug description:
  qemu-4.2.0/target/mips/op_helper.c:971:8: warning: this condition has
  identical branches [-Wduplicated-branches]

  Source code is

 if (other_tc == other->current_tc) {
  tccause = other->CP0_Cause;
  } else {
  tccause = other->CP0_Cause;
  }

  Possible cut'n'paste error ?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1856706/+subscriptions

[Bug 1880822] Re: CVE-2020-13253 QEMU: sd: OOB access could crash the guest resulting in DoS

** Changed in: qemu
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1880822

Title:
  CVE-2020-13253 QEMU: sd: OOB access could crash the guest resulting in
  DoS

Status in QEMU:
  Fix Released

Bug description:
  An out-of-bounds read access issue was found in the SD Memory Card
  emulator of the QEMU. It occurs while performing block write commands
  via sdhci_write(), if a guest user has sent 'address' which is OOB of
  's->wp_groups'. A guest user/process may use this flaw to crash the
  QEMU process resulting in DoS.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1880822/+subscriptions

Re: [PULL v7 000/151] Meson-based build system

On Thu, 20 Aug 2020 at 12:52, Paolo Bonzini  wrote:
>
> On 20/08/20 12:33, Peter Maydell wrote:
> > 'make check' still fails for the all-linux-static config, this
> > time for a different reason:
> >
> > make: *** No rule to make target 'check-qtest', needed by 'check'. Stop.
>
> Oh, there are two "check:" rules.
>
> I pushed again the tag

For the all-linux-static build, Meson emits these WARNING lines:

Configuring ninjatool using configuration
Library m found: YES
Library util found: YES
Library aio found: YES
Library rt found: YES
Found pkg-config: /usr/bin/pkg-config (0.29.1)
WARNING: Static library 'asound' not found for dependency 'sdl2', may
not be statically linked
WARNING: Static library 'pulse-simple' not found for dependency
'sdl2', may not be statically linked
WARNING: Static library 'pulse' not found for dependency 'sdl2', may
not be statically linked
WARNING: Static library 'sndio' not found for dependency 'sdl2', may
not be statically linked
WARNING: Static library 'wayland-egl' not found for dependency 'sdl2',
may not be statically linked
WARNING: Static library 'wayland-client' not found for dependency
'sdl2', may not be statically linked
WARNING: Static library 'wayland-cursor' not found for dependency
'sdl2', may not be statically linked
Run-time dependency sdl2 found: YES 2.0.8
Found CMake: /usr/bin/cmake (3.10.2)
Run-time dependency sdl-image found: NO (tried pkgconfig and cmake)
Run-time dependency libpng found: YES 1.6.34


What are they for, and can they be suppressed ?

thanks
-- PMM

[PATCH v3 1/7] x86: lpc9: let firmware negotiate 'CPU hotplug with SMI' features

It will allow firmware to notify QEMU that firmware requires SMI
being triggered on CPU hot[un]plug, so that it would be able to account
for hotplugged CPU and relocate it to new SMM base and/or safely remove
CPU on unplug.

Using negotiated features, follow up patches will insert SMI upcall
into AML code, to make sure that firmware processes hotplug before
guest OS would attempt to use new CPU.

Signed-off-by: Igor Mammedov 
---
v3:
  - rebase on top of "[PATCH v2] hw: add compat machines for 5.2"
so apply that before this patch
v2:
  - rebase on top of 5.1 (move compat values to 5.1 machine)
  - make "x-smi-cpu-hotunplug" false by default (Laszlo Ersek 
)
---
 include/hw/i386/ich9.h |  2 ++
 hw/i386/pc.c   |  4 +++-
 hw/i386/pc_piix.c  |  1 +
 hw/i386/pc_q35.c   |  1 +
 hw/isa/lpc_ich9.c  | 13 +
 5 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h
index a98d10b252..d1bb3f7bf0 100644
--- a/include/hw/i386/ich9.h
+++ b/include/hw/i386/ich9.h
@@ -247,5 +247,7 @@ typedef struct ICH9LPCState {
 
 /* bit positions used in fw_cfg SMI feature negotiation */
 #define ICH9_LPC_SMI_F_BROADCAST_BIT0
+#define ICH9_LPC_SMI_F_CPU_HOTPLUG_BIT  1
+#define ICH9_LPC_SMI_F_CPU_HOT_UNPLUG_BIT   2
 
 #endif /* HW_ICH9_H */
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 9aa813949c..583db11d28 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -97,7 +97,9 @@
 #include "fw_cfg.h"
 #include "trace.h"
 
-GlobalProperty pc_compat_5_1[] = {};
+GlobalProperty pc_compat_5_1[] = {
+{ "ICH9-LPC", "x-smi-cpu-hotplug", "off" },
+};
 const size_t pc_compat_5_1_len = G_N_ELEMENTS(pc_compat_5_1);
 
 GlobalProperty pc_compat_5_0[] = {
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index c5ba70ca17..68f8ba1bf9 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -433,6 +433,7 @@ static void pc_i440fx_5_2_machine_options(MachineClass *m)
 m->alias = "pc";
 m->is_default = true;
 pcmc->default_cpu_version = 1;
+compat_props_add(m->compat_props, pc_compat_5_1, pc_compat_5_1_len);
 }
 
 DEFINE_I440FX_MACHINE(v5_2, "pc-i440fx-5.2", NULL,
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 0cb9c18cd4..b729cf9a58 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -359,6 +359,7 @@ static void pc_q35_5_2_machine_options(MachineClass *m)
 pc_q35_machine_options(m);
 m->alias = "q35";
 pcmc->default_cpu_version = 1;
+compat_props_add(m->compat_props, pc_compat_5_1, pc_compat_5_1_len);
 }
 
 DEFINE_Q35_MACHINE(v5_2, "pc-q35-5.2", NULL,
diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
index cd6e169d47..19f32bed3e 100644
--- a/hw/isa/lpc_ich9.c
+++ b/hw/isa/lpc_ich9.c
@@ -373,6 +373,15 @@ static void smi_features_ok_callback(void *opaque)
 /* guest requests invalid features, leave @features_ok at zero */
 return;
 }
+if (!(guest_features & BIT_ULL(ICH9_LPC_SMI_F_BROADCAST_BIT)) &&
+guest_features & (BIT_ULL(ICH9_LPC_SMI_F_CPU_HOTPLUG_BIT) |
+  BIT_ULL(ICH9_LPC_SMI_F_CPU_HOT_UNPLUG_BIT))) {
+/*
+ * cpu hot-[un]plug with SMI requires SMI broadcast,
+ * leave @features_ok at zero
+ */
+return;
+}
 
 /* valid feature subset requested, lock it down, report success */
 lpc->smi_negotiated_features = guest_features;
@@ -747,6 +756,10 @@ static Property ich9_lpc_properties[] = {
 DEFINE_PROP_BOOL("noreboot", ICH9LPCState, pin_strap.spkr_hi, true),
 DEFINE_PROP_BIT64("x-smi-broadcast", ICH9LPCState, smi_host_features,
   ICH9_LPC_SMI_F_BROADCAST_BIT, true),
+DEFINE_PROP_BIT64("x-smi-cpu-hotplug", ICH9LPCState, smi_host_features,
+  ICH9_LPC_SMI_F_CPU_HOTPLUG_BIT, true),
+DEFINE_PROP_BIT64("x-smi-cpu-hotunplug", ICH9LPCState, smi_host_features,
+  ICH9_LPC_SMI_F_CPU_HOT_UNPLUG_BIT, false),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.26.2

Re: [PATCH v4 2/4] iotests.py: Add wait_for_runstate()