Re: [PATCH v2 13/22] qemu-iotests/199: drop extra constraints

2020-02-19 Thread Andrey Shinkevich




On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

We don't need any specific format constraints here. Still keep qcow2
for two reasons:
1. No extra calls of format-unrelated test
2. Add some check around persistent bitmap in future (require qcow2)

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/199 | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index de9ba8d94c..dda918450a 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -116,5 +116,4 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  
  
  if __name__ == '__main__':

-iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],
- supported_protocols=['file'])
+iotests.main(supported_fmts=['qcow2'])



Reviewed-by: Andrey Shinkevich 
--
With the best regards,
Andrey Shinkevich



Re: [PATCH v4 1/3] target/arm: Support SError injection

2020-02-19 Thread Marc Zyngier
On Wed, 19 Feb 2020 10:09:39 +1100
Gavin Shan  wrote:

> Hi Marc,
> 
> On 2/19/20 3:28 AM, Marc Zyngier wrote:
> > On 2020-02-18 02:04, Gavin Shan wrote:  
> >> This supports SError injection, which will be used by "virt" board to
> >> simulating the behavior of NMI injection in next patch. As Peter Maydell
> >> suggested, this adds a new interrupt (ARM_CPU_SERROR), which is parallel
> >> to CPU_INTERRUPT_HARD. The backend depends on if kvm is enabled or not.
> >> kvm_vcpu_ioctl(cpu, KVM_SET_VCPU_EVENTS) is leveraged to inject SError
> >> or data abort to guest. When TCG is enabled, the behavior is simulated
> >> by injecting SError and data abort to guest.  
> > 
> > s/and/or/ (you can't inject both at the same time).
> >   
> 
> Absolutely, will be corrected in v5, which will be hold. I hope to receive
> comments from Peter and Richard before going to do another respin :)

Sure, there is no hurry at all.

> 
> >>
> >> Signed-off-by: Gavin Shan 
> >> ---
> >>  target/arm/cpu.c  | 69   
> +++
> >>  target/arm/cpu.h  | 20 -
> >>  target/arm/helper.c   | 12 
> >>  target/arm/m_helper.c |  8 +
> >>  target/arm/machine.c  |  3 +-
> >>  5 files changed, 91 insertions(+), 21 deletions(-)
> >>
> >> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> >> index de733aceeb..e5750080bc 100644
> >> --- a/target/arm/cpu.c
> >> +++ b/target/arm/cpu.c
> >> @@ -78,7 +78,7 @@ static bool arm_cpu_has_work(CPUState *cs)
> >>  && cs->interrupt_request &
> >>  (CPU_INTERRUPT_FIQ | CPU_INTERRUPT_HARD
> >>   | CPU_INTERRUPT_VFIQ | CPU_INTERRUPT_VIRQ
> >> - | CPU_INTERRUPT_EXITTB);
> >> + | CPU_INTERRUPT_SERROR | CPU_INTERRUPT_EXITTB)  
> ;
> >>  }
> >>
> >>  void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *  
> hook,
> >> @@ -449,6 +449,9 @@ static inline bool arm_excp_unmasked(CPUState *cs,
> >> unsigned int excp_idx,
> >>  return false;
> >>  }
> >>  return !(env->daif & PSTATE_I);
> >> +    case EXCP_SERROR:
> >> +   pstate_unmasked = !(env->daif & PSTATE_A);
> >> +   break;  
> > 
> > nit: Consider keeping the physical interrupts together, as they are close  
> ly
> > related.
> >   
> 
> Sorry, I didn't get the point. Maybe you're suggesting something like below
> ?
> If yes, I'm not sure if it's necessary.
> 
>  pstate_unmasked = !(env->daif & (PSTATE_A | PSTATE_I));
> 
> I think PSTATE_A is enough to mask out SError according to ARMv8 architectu
> re
> reference manual (D1.7), as below:
> 
> A, I, F Asynchronous exception mask bits:
> A
>SError interrupt mask bit.
> I
>IRQ interrupt mask bit.
> F
>FIQ interrupt mask bit.

No, all I'm suggesting is that you keep the cases for IRQ, FIQ and
SError close together in the switch statement, instead of placing
SError after the virtual interrupts. Given that they use the same code
pattern, it makes sense to order them this way. But as I said, this is
a nit, and not something that affects the outcome of this code.

> >>  default:
> >>  g_assert_not_reached();
> >>  }
> >> @@ -538,6 +541,15 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int
> >> interrupt_request)
> >>
> >>  /* The prioritization of interrupts is IMPLEMENTATION DEFIN  
> ED. */
> >>
> >> +    if (interrupt_request & CPU_INTERRUPT_SERROR) {
> >> +    excp_idx = EXCP_SERROR;
> >> +    target_el = arm_phys_excp_target_el(cs, excp_id  
> x, cur_el, secure);
> >> +    if (arm_excp_unmasked(cs, excp_idx, target_el,
> >> +     
>    cur_el, secure, hcr_el2)) {
> >> +    goto found;
> >> +    }
> >> +    }
> >> +
> >>  if (interrupt_request & CPU_INTERRUPT_FIQ) {
> >>  excp_idx = EXCP_FIQ;
> >>  target_el = arm_phys_excp_target_el(cs, excp_  
> idx, cur_el, secure);
> >> @@ -570,6 +582,7 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int
> >> interrupt_request)
> >>  goto found;
> >>  }
> >>  }
> >> +
> >>  return false;
> >>
> >>   found:
> >> @@ -585,7 +598,7 @@ static bool arm_v7m_cpu_exec_interrupt(CPUState
> >> *cs, int interrupt_request)
> >>  CPUClass *cc = CPU_GET_CLASS(cs);
> >>  ARMCPU *cpu = ARM_CPU(cs);
> >>  CPUARMState *env = &cpu->env;
> >> -    bool ret = false;
> >> +    uint32_t excp_idx;
> >>
> >>  /* ARMv7-M interrupt masking works differently than -A or -  
> R.
> >>   * There is no FIQ/IRQ distinction. Instead of I and F bi  
> ts
> >> @@ -594,13 +607,26 @@ static bool arm_v7m_cpu_exec_interrupt(CPUState
> >> *cs, int interrupt_request)
> >>   * (which depends on state like BASEPRI, FAULTMASK and th  
> e
> >>   * currently active exception).
> >>   */
> >> -    if (interrupt_request & CPU_INTERRUPT_HARD
> >> -    && (armv7m_nvic_can_take_pending_exception(env->n  
> vic))) {
> >> -    cs->exception_index = EXCP_IRQ;
> >> -    cc->do_interrupt

Re: [PATCH v2] hw/i386: disable smbus migration for xenfv

2020-02-19 Thread Paolo Bonzini
On 18/02/20 20:44, Olaf Hering wrote:
> Am Tue, 18 Feb 2020 18:37:09 +0100
> schrieb Paolo Bonzini :
> 
>> On 18/02/20 18:27, Olaf Hering wrote:
>>> The approach below (making 'xenfv' an alias of 'pc') does not work:
>>> xen_enabled() is false when pc_i440fx_3_1_machine_options runs.  
>> Don't use an alias, copy the 3.1 code into the xenfv machine type and/or
>> call the 3.1 functions from the xenfv machine type.
> 
> In the end it may look like this.
> 
> Let me know about any preferences regarding the naming of configure options 
> and variables.

Has any version of Xen been released with a QEMU version above 3.1?

Paolo

> 
> Olaf
> 
> diff --git a/configure b/configure
> index 6f5d850949..65ca345fd6 100755
> --- a/configure
> +++ b/configure
> @@ -368,6 +368,7 @@ vnc_jpeg=""
>  vnc_png=""
>  xkbcommon=""
>  xen=""
> +xen_hvm_pc_i440fx_version_3_1=""
>  xen_ctrl_version=""
>  xen_pci_passthrough=""
>  linux_aio=""
> @@ -1162,6 +1163,10 @@ for opt do
>;;
>--enable-xen-pci-passthrough) xen_pci_passthrough="yes"
>;;
> +  --disable-xenfv-i440fx-version-3_1) xen_hvm_pc_i440fx_version_3_1="no"
> +  ;;
> +  --enable-xenfv-i440fx-version-3_1) xen_hvm_pc_i440fx_version_3_1="yes"
> +  ;;
>--disable-brlapi) brlapi="no"
>;;
>--enable-brlapi) brlapi="yes"
> @@ -7836,6 +7841,9 @@ if supported_xen_target $target; then
>  if test "$xen_pci_passthrough" = yes; then
>  echo "CONFIG_XEN_PCI_PASSTHROUGH=y" >> "$config_target_mak"
>  fi
> +if test "$xen_hvm_pc_i440fx_version_3_1" = yes; then
> +echo "CONFIG_XEN_HVM_PC_I440FX_VERSION_3_1=y" >> "$config_target_mak"
> +fi
>  else
>  echo "$target/config-devices.mak: CONFIG_XEN=n" >> $config_host_mak
>  fi
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index fa12203079..83d1fcc0ba 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -949,6 +949,11 @@ DEFINE_PC_MACHINE(isapc, "isapc", pc_init_isa,
>  #ifdef CONFIG_XEN
>  static void xenfv_machine_options(MachineClass *m)
>  {
> +#ifdef CONFIG_XEN_HVM_PC_I440FX_VERSION_3_1
> +pc_i440fx_3_1_machine_options(m);
> +#else
> +pc_i440fx_4_2_machine_options(m);
> +#endif
>  m->desc = "Xen Fully-virtualized PC";
>  m->max_cpus = HVM_MAX_VCPUS;
>  m->default_machine_opts = "accel=xen";
> 




signature.asc
Description: OpenPGP digital signature


Re: [PATCH v2] hw/i386: disable smbus migration for xenfv

2020-02-19 Thread Olaf Hering
Am Wed, 19 Feb 2020 09:05:49 +0100
schrieb Paolo Bonzini :

> Has any version of Xen been released with a QEMU version above 3.1?

Xen 4.13 has a copy of qemu4. But, Xen can use an external qemu. It is unknown 
how many supposed-to-be-migrated domUs with qemu4+ are out there. But there is 
a six digit number of running domUs with qemu3 out there.

Olaf


pgpZcol0bYXqj.pgp
Description: Digitale Signatur von OpenPGP


Re: [PATCH v4 00/14] Fixes for DP8393X SONIC device emulation

2020-02-19 Thread Philippe Mathieu-Daudé

On 2/19/20 8:55 AM, Laurent Vivier wrote:

Le 19/02/2020 à 02:57, Aleksandar Markovic a écrit :

2:54 AM Sre, 19.02.2020. Aleksandar Markovic
mailto:aleksandar.m.m...@gmail.com>> је
написао/ла:


2:06 AM Sre, 19.02.2020. Finn Thain 
> је написао/ла:


On Tue, 18 Feb 2020, Aleksandar Markovic wrote:


On Wednesday, January 29, 2020, Finn Thain

mailto:fth...@telegraphics.com.au>>

wrote:


Hi All,

There are bugs in the emulated dp8393x device that can stop packet
reception in a Linux/m68k guest (q800 machine).

With a Linux/m68k v5.5 guest (q800), it's possible to remotely

trigger

an Oops by sending ping floods.

With a Linux/mips guest (magnum machine), the driver fails to probe
the dp8393x device.

With a NetBSD/arc 5.1 guest (magnum), the bugs in the device can be
fatal to the guest kernel.

Whilst debugging the device, I found that the receiver algorithm
differs from the one described in the National Semiconductor
datasheet.

This patch series resolves these bugs.

AFAIK, all bugs in the Linux sonic driver were fixed in Linux v5.5.
---



Herve,

Do your Jazz tests pass with these changes?



AFAIK those tests did not expose the NetBSD panic that is caused by
mainline QEMU (mentioned above).

I have actually run the tests you requested (Hervé described them in an
earlier thread). There was no regression. Quite the reverse -- it's no
longer possible to remotely crash the NetBSD kernel.

Apparently my testing was also the first time that the jazzsonic driver
(from the Linux/mips Magnum port) was tested successfully with QEMU. It
doesn't work in mainline QEMU.



Well, I appologize if I missed all these facts. I just did not notice

them, at least not in this form. And, yes, some "Tested-by:" by Herve
would be desirable and nice.


FWIW I tested Finn kernel and QEMU part following:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg667432.html
to check than its series doesn't introduce regressions, booting Linux 
and NetBSD.
I haven't run the thorough networking tests Laurent do with the q800 
(scp of big files, fetch over http in loop).


Hervé testing is welcomed, but as a hobbyist he doesn't spend more than 
1h every 2 months on this, so I don't think his approval is a blocker.
We are not talking about business critical emulation, so we can fix 
regressions on top.


--

That said, I'm spending some hobby time on a Magnum boot code to be able 
to test the board upstream (without depending on proprietary BIOS and 
painful graphical setup).


Currently I get NetBSD to kdb and Linux get stuck there:
https://paste.debian.net/plain/1129965





Or, perhaps, even "Reviewed-by:".



It would be nice to have this merged before next release because q800
machine networking is not reliable without them.

And thank you to Finn for all his hard work on this device emulation.

Laurent






Re: [PATCH 00/22] linux-user: generate syscall_nr.sh

2020-02-19 Thread Laurent Vivier
Le 18/02/2020 à 23:48, Alistair Francis a écrit :
> On Mon, Feb 17, 2020 at 2:36 PM Laurent Vivier  wrote:
>>
>> This series copies the files syscall.tbl from linux v5.5 and generates
>> the file syscall_nr.h from them.
>>
>> This is done for all the QEMU targets that have a syscall.tbl
>> in the linux source tree: mips, mips64, i386, x86_64, sparc, s390x,
>> ppc, arm, microblaze, sh4, xtensa, m68k, hppa and alpha.
>>
>> tilegx and cris are depecrated in linux (tilegx has no maintainer in QEMU)
>>
>> aarch64, nios2, openrisc and riscv have no syscall.tbl in linux.
> 
> What's the plan with these other architectures?
> 
> RISC-V uses asm-generic, is there some way to generate syscall_nr.h from that?

Automatically no. The best to do should be to convert linux parts using
 asm-generic to syscall.tbl and then after that using it in QEMU.


Thanks,
Laurent




Re: [PATCH] WHPX: Assigning maintainer for Windows Hypervisor Platform

2020-02-19 Thread Philippe Mathieu-Daudé

Thank you Sunil!

On 2/18/20 9:51 PM, Justin Terry (SF) wrote:

Looks good to me! Thanks Sunil.

Signed-off-by: Justin Terry (VM) 


Justin, I suppose you meant:
Reviewed-by: Justin Terry (VM) 




-Original Message-
From: Sunil Muthuswamy 
Sent: Tuesday, February 18, 2020 12:39 PM
To: Eduardo Habkost ; Paolo Bonzini
; Richard Henderson 
Cc: Stefan Weil ; qemu-devel@nongnu.org; Justin Terry
(SF) 
Subject: [PATCH] WHPX: Assigning maintainer for Windows Hypervisor
Platform

Signed-off-by: Sunil Muthuswamy 
---
  MAINTAINERS | 8 
  1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 1740a4fddc..9b3ba4e1b5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -404,6 +404,14 @@ S: Supported
  F: target/i386/kvm.c
  F: scripts/kvm/vmxcap

+WHPX CPUs


Using "X86 WHPX CPUs" instead:

Reviewed-by: Philippe Mathieu-Daudé 


+M: Sunil Muthuswamy 
+S: Supported
+F: target/i386/whpx-all.c
+F: target/i386/whp-dispatch.h
+F: accel/stubs/whpx-stub.c
+F: include/sysemu/whpx.h
+
  Guest CPU Cores (Xen)
  -
  X86 Xen CPUs
--
2.17.1







Re: [PATCH v2 fixed 01/16] util: vfio-helpers: Factor out and fix processing of existing ram blocks

2020-02-19 Thread David Hildenbrand
On 18.02.20 23:00, Peter Xu wrote:
> On Wed, Feb 12, 2020 at 02:42:39PM +0100, David Hildenbrand wrote:
>> Factor it out into common code when a new notifier is registered, just
>> as done with the memory region notifier. This allows us to have the
>> logic about how to process existing ram blocks at a central place (which
>> will be extended soon).
>>
>> Just like when adding a new ram block, we have to register the max_length
>> for now. We don't have a way to get notified about resizes yet, and some
>> memory would not be mapped when growing the ram block.
>>
>> Note: Currently, ram blocks are only "fake resized". All memory
>> (max_length) is accessible.
>>
>> We can get rid of a bunch of functions in stubs/ram-block.c . Print the
>> warning from inside qemu_vfio_ram_block_added().

[...]

>>  #include "exec/ramlist.h"
>>  #include "exec/cpu-common.h"
>>  
>> -void *qemu_ram_get_host_addr(RAMBlock *rb)
>> -{
>> -return 0;
>> -}
>> -
>> -ram_addr_t qemu_ram_get_offset(RAMBlock *rb)
>> -{
>> -return 0;
>> -}
>> -
>> -ram_addr_t qemu_ram_get_used_length(RAMBlock *rb)
>> -{
>> -return 0;
>> -}
> 
> Maybe put into another patch?
> 
> Actually I'm thinking whether it would worth to do...  They're still
> declared in include/exec/cpu-common.h, so logically who includes the
> header but linked against stubs can still call this function.  So
> keeping them there still make sense to me.

Why keep dead code around? If you look closely, the stubs really only
contain what's strictly necessary to make current code compile, not any
available ramblock related function.

I don't see a good reason for a separate patch either (after all, we're
removing the last users in this patch), but if more people agree, I can
move it to a separate patch.
[...]

>> diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
>> index 813f7ec564..71e02e7f35 100644
>> --- a/util/vfio-helpers.c
>> +++ b/util/vfio-helpers.c
>> @@ -376,8 +376,13 @@ static void qemu_vfio_ram_block_added(RAMBlockNotifier 
>> *n,
>>void *host, size_t size)
>>  {
>>  QEMUVFIOState *s = container_of(n, QEMUVFIOState, ram_notifier);
>> +int ret;
>> +
>>  trace_qemu_vfio_ram_block_added(s, host, size);
>> -qemu_vfio_dma_map(s, host, size, false, NULL);
>> +ret = qemu_vfio_dma_map(s, host, size, false, NULL);
>> +if (ret) {
>> +error_report("qemu_vfio_dma_map(%p, %zu) failed: %d", host, size, 
>> ret);
>> +}
> 
> Irrelevant change (another patch)?

This is the error that was printed in qemu_vfio_init_ramblock(). Not
moving it in this patch would mean we would stop printing the error.
[...]

>> -
>>  static void qemu_vfio_open_common(QEMUVFIOState *s)
>>  {
>>  qemu_mutex_init(&s->lock);
>>  s->ram_notifier.ram_block_added = qemu_vfio_ram_block_added;
>>  s->ram_notifier.ram_block_removed = qemu_vfio_ram_block_removed;
>> -ram_block_notifier_add(&s->ram_notifier);
>>  s->low_water_mark = QEMU_VFIO_IOVA_MIN;
>>  s->high_water_mark = QEMU_VFIO_IOVA_MAX;
>> -qemu_ram_foreach_block(qemu_vfio_init_ramblock, s);
>> +ram_block_notifier_add(&s->ram_notifier);
> 
> Pure question: this looks like a good improvement, however do you know
> why HAX and SEV do not need to init ramblock?

They register very early (e.g., at accel init time), before any ram
blocks are added.

Thanks for your review!

-- 
Thanks,

David / dhildenb




Re: [PATCH v2 fixed 03/16] util: vfio-helpers: Remove Error parameter from qemu_vfio_undo_mapping()

2020-02-19 Thread David Hildenbrand
On 18.02.20 23:07, Peter Xu wrote:
> On Wed, Feb 12, 2020 at 02:42:41PM +0100, David Hildenbrand wrote:
>> Everybody discards the error. Let's error_report() instead so this error
>> doesn't get lost.
>>
>> Cc: Richard Henderson 
>> Cc: Paolo Bonzini 
>> Cc: Eduardo Habkost 
>> Cc: Marcel Apfelbaum 
>> Cc: Alex Williamson 
>> Cc: Stefan Hajnoczi 
>> Signed-off-by: David Hildenbrand 
> 
> IMHO error_setg() should be preferred comparing to error_report()
> because it has a context to be delivered to the caller, so the error
> has a better chance to be used in a better way (e.g., QMP only
> supports error_setg()).

Please note that I decided to go for error_report() because that's also
what's being done in the matching opposite way: qemu_vfio_do_mapping().

> 
> A better solution is that we deliver the error upper.  For example,
> qemu_vfio_dma_map() is one caller of qemu_vfio_undo_mapping, if you
> see the callers of qemu_vfio_dma_map() you'll notice most of them has
> Error** defined (e.g., nvme_init_queue).  Then we can link all of them
> up.

Propagating errors is helpful if the caller can actually do something
with the error. If it's a function that's supposed to never fail (which
is the case here obviously), then there is not much benefit in doing so.

> 
> Another lazy solution (and especially if vfio-helpers are still mostly
> used only by advanced users) is we can simply pass in &error_abort for
> the three callers then they won't be missed...

I prefer this variant, as it matches qemu_vfio_do_mapping() - except
that we don't report -errno - which is fine, because the function is
expected to not fail in any sane use case.

Thanks!

-- 
Thanks,

David / dhildenb




Re: [PATCH v3 1/5] target/riscv: add vector unit stride load and store instructions

2020-02-19 Thread LIU Zhiwei

Hi, Richard
Thanks for your informative comments. I'm addressing these comments.
And a little confused in some comments.
On 2020/2/12 14:38, Richard Henderson wrote:

On 2/9/20 11:42 PM, LIU Zhiwei wrote:

+/*
+ * As simd_desc supports at most 256 bytes, and in this implementation,
+ * the max vector group length is 2048 bytes. So split it into two parts.
+ *
+ * The first part is floor(maxsz, 64), encoded in maxsz of simd_desc.
+ * The second part is (maxsz % 64) >> 3, encoded in data of simd_desc.
+ */
+static uint32_t maxsz_part1(uint32_t maxsz)
+{
+return ((maxsz & ~(0x3f)) >> 3) + 0x8; /* add offset 8 to avoid return 0 */
+}
+
+static uint32_t maxsz_part2(uint32_t maxsz)
+{
+return (maxsz & 0x3f) >> 3;
+}

I would much rather adjust simd_desc to support 2048 bytes.

I've just posted a patch set that removes an assert in target/arm that would
trigger if SIMD_DATA_SHIFT was increased to make room for a larger oprsz.

Or, since we're not going through tcg_gen_gvec_* for ldst, don't bother with
simd_desc at all, and just pass vlen, unencoded.


+/* define check conditions data structure */
+struct vext_check_ctx {
+
+struct vext_reg {
+uint8_t reg;
+bool widen;
+bool need_check;
+} check_reg[6];
+
+struct vext_overlap_mask {
+uint8_t reg;
+uint8_t vm;
+bool need_check;
+} check_overlap_mask;
+
+struct vext_nf {
+uint8_t nf;
+bool need_check;
+} check_nf;
+target_ulong check_misa;
+
+} vchkctx;

You cannot use a global variable.  The data must be thread-safe.

If we're going to do the checks this way, with a structure, it needs to be on
the stack or within DisasContext.


+#define GEN_VEXT_LD_US_TRANS(NAME, DO_OP, SEQ)\
+static bool trans_##NAME(DisasContext *s, arg_r2nfvm* a)  \
+{ \
+vchkctx.check_misa = RVV; \
+vchkctx.check_overlap_mask.need_check = true; \
+vchkctx.check_overlap_mask.reg = a->rd;   \
+vchkctx.check_overlap_mask.vm = a->vm;\
+vchkctx.check_reg[0].need_check = true;   \
+vchkctx.check_reg[0].reg = a->rd; \
+vchkctx.check_reg[0].widen = false;   \
+vchkctx.check_nf.need_check = true;   \
+vchkctx.check_nf.nf = a->nf;  \
+  \
+if (!vext_check(s)) { \
+return false; \
+} \
+return DO_OP(s, a, SEQ);  \
+}

I don't see the improvement from a pointer.  Something like

 if (vext_check_isa_ill(s) &&
 vext_check_overlap(s, a->rd, a->rm) &&
 vext_check_reg(s, a->rd, false) &&
 vext_check_nf(s, a->nf)) {
 return DO_OP(s, a, SEQ);
 }
 return false;

seems just as clear without the extra data.


+#ifdef CONFIG_USER_ONLY
+#define MO_SB 0
+#define MO_LESW 0
+#define MO_LESL 0
+#define MO_LEQ 0
+#define MO_UB 0
+#define MO_LEUW 0
+#define MO_LEUL 0
+#endif

What is this for?  We already define these unconditionally.



+static inline int vext_elem_mask(void *v0, int mlen, int index)
+{
+int idx = (index * mlen) / 8;
+int pos = (index * mlen) % 8;
+
+return (*((uint8_t *)v0 + idx) >> pos) & 0x1;
+}

This is a little-endian indexing of the mask.  Just above we talk about using a
host-endian ordering of uint64_t.

Thus this must be based on uint64_t instead of uint8_t.


+/*
+ * This function checks watchpoint before really load operation.
+ *
+ * In softmmu mode, the TLB API probe_access is enough for watchpoint check.
+ * In user mode, there is no watchpoint support now.
+ *
+ * It will triggle an exception if there is no mapping in TLB
+ * and page table walk can't fill the TLB entry. Then the guest
+ * software can return here after process the exception or never return.
+ */
+static void probe_read_access(CPURISCVState *env, target_ulong addr,
+target_ulong len, uintptr_t ra)
+{
+while (len) {
+const target_ulong pagelen = -(addr | TARGET_PAGE_MASK);
+const target_ulong curlen = MIN(pagelen, len);
+
+probe_read(env, addr, curlen, cpu_mmu_index(env, false), ra);

The return value here is non-null when we can read directly from host memory.
It would be a shame to throw that work away.



+/* data structure and common functions for load and store */
+typedef void vext_ld_elem_fn(CPURISCVState *env, target_ulong addr,
+uint32_t idx, void *vd, uintptr_t retaddr);
+typede

Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine

2020-02-19 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 18.02.2020 um 15:12 hat Markus Armbruster geschrieben:
>> >> Regarding calling monitor_qmp_requests_pop_any_with_lock(): it needs
>> >> @monitor_lock and @mon_list to be valid.  We just initialized
>> >> @monitor_lock, and @mon_list is empty.
>> >> monitor_qmp_requests_pop_any_with_lock() should be safe and return null.
>> >> monitor_qmp_dispatcher_co() should also be safe and yield without doing
>> >> work.
>> >> 
>> >> Can we exploit that to make things a bit simpler?  Separate patch would
>> >> be fine with me.
>> >
>> > If this is true, we could replace this line:
>> >
>> > aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
>> >
>> > with the following one:
>> >
>> > qemu_aio_coroutine_enter(iohandler_get_aio_context(), 
>> > qmp_dispatcher_co);
>> >
>> > I'm not sure that this is any simpler.
>> 
>> Naive question: what's the difference between "scheduling", "entering",
>> and "waking up" a coroutine?
>> 
>> qemu_coroutine_enter() and qemu_aio_coroutine_enter() are in
>> coroutine.h.
>
> These are the low-level functions that just enter the coroutine (i.e. do
> a stack switch and jump to coroutine code) immediately when they are
> called. They must be called in the right thread with the right
> AioContext locks held. (What "right" means depends on the code run in
> the coroutine.)

I see; low level building blocks.

>> aio_co_schedule(), aio_co_wake() and aio_co_enter() are in aio.h.
>
> aio_co_schedule() never enters the coroutine directly. It only adds it
> to the list of scheduled coroutines and schedules a BH in the target
> AioContext. This BH in turn will actually enter the coroutine.

Makes sense.

> aio_co_enter() enters the coroutine immediately if it's being called
> outside of coroutine context and in the right thread for the given
> AioContext. If it's in the right thread, but in coroutine context then
> it will queue the given coroutine so that it runs whenever the current
> coroutine yields. If it's in the wrong thread, it uses aio_co_schedule()
> to get it run in the right thread.

Uff, sounds complicated.  Lots of magic.

> aio_co_wake() takes just the coroutine as a parameter and calls
> aio_co_enter() with the context in which the coroutine was last run.

Complicated by extension.

> All of these functions make sure that the AioContext lock is taken when
> the coroutine is entered and that they run in the right thread.
>
>> qemu_coroutine_enter() calls qemu_aio_coroutine_enter().
>> 
>> aio_co_wake() calls aio_co_enter() calls qemu_aio_coroutine_enter().
>> 
>> aio_co_enter() seems to be independent.
>
> It's not. It calls either aio_co_schedule() or
> qemu_aio_coroutine_enter(), or inserts the coroutine ina queue that is
> processed in qemu_aio_coroutine_enter().

I was blind.

>> aio.h seems to be layered on top of coroutine.h.  Should I prefer using
>> aio.h to coroutine.h?
>
> In the common case, using the aio.h functions is preferable because they
> just do "the right thing" without requiring as much thinking as the
> low-level functions.

Got it.

>> >> >  }
>> >> >  
>> >> >  QemuOptsList qemu_mon_opts = {
>> >> > diff --git a/monitor/qmp.c b/monitor/qmp.c
>> >> > index 54c06ba824..9444de9fcf 100644
>> >> > --- a/monitor/qmp.c
>> >> > +++ b/monitor/qmp.c
>> >> > @@ -133,6 +133,10 @@ static void monitor_qmp_respond(MonitorQMP *mon, 
>> >> > QDict *rsp)
>> >> >  }
>> >> >  }
>> >> >  
>> >> > +/*
>> >> > + * Runs outside of coroutine context for OOB commands, but in 
>> >> > coroutine context
>> >> > + * for everything else.
>> >> > + */
>> >> 
>> >> Nitpick: wrap around column 70, please.
>> >> 
>> >> Note to self: the precondition is asserted in do_qmp_dispatch() below.
>> >> Asserting here is impractical, because we don't know whether this is an
>> >> OOB command.
>> >> 
>> >> >  static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
>> >> >  {
>> >> >  Monitor *old_mon;
>> >> > @@ -211,43 +215,87 @@ static QMPRequest 
>> >> > *monitor_qmp_requests_pop_any_with_lock(void)
>> >> >  return req_obj;
>> >> >  }
>> >> >  
>> >> > -void monitor_qmp_bh_dispatcher(void *data)
>> >> > +void coroutine_fn monitor_qmp_dispatcher_co(void *data)
>> >> >  {
>> >> > -QMPRequest *req_obj = monitor_qmp_requests_pop_any_with_lock();
>> >> > +QMPRequest *req_obj = NULL;
>> >> >  QDict *rsp;
>> >> >  bool need_resume;
>> >> >  MonitorQMP *mon;
>> >> >  
>> >> > -if (!req_obj) {
>> >> > -return;
>> >> > -}
>> >> > +while (true) {
>> >> > +assert(atomic_mb_read(&qmp_dispatcher_co_busy) == true);
>> >> 
>> >> Read and assert, then ...
>> >> 
>> >> > +
>> >> > +/* Mark the dispatcher as not busy already here so that we 
>> >> > don't miss
>> >> > + * any new requests coming in the middle of our processing. */
>> >> > +atomic_mb_set(&qmp_dispatcher_co_busy, false);
>> >> 
>> >> ... set.  Would exchange, then assert be cleaner?
>> >
>> > Then you would ask me why 

Re: [PATCH v2 0/3] vhost-user: Lift Max Ram Slots Limitation

2020-02-19 Thread Raphael Norwitz
On Mon, Feb 10, 2020 at 11:04:28AM -0500, Michael S. Tsirkin wrote:
> 
> On Sun, Feb 09, 2020 at 12:14:42PM -0500, Raphael Norwitz wrote:
> > On Thu, Feb 06, 2020 at 03:33:13AM -0500, Michael S. Tsirkin wrote:
> > > 
> > > On Wed, Jan 15, 2020 at 09:57:03PM -0500, Raphael Norwitz wrote:
> > > > 
> > > > Changes since V1:
> > > > * Kept the assert in vhost_user_set_mem_table_postcopy, but moved it
> > > >   to prevent corruption
> > > > * Made QEMU send a single VHOST_USER_GET_MAX_MEMSLOTS message at
> > > >   startup and cache the returned value so that QEMU does not need to
> > > >   query the backend every time vhost_backend_memslots_limit is 
> > > > called.
> > > 
> > > I'm a bit confused about what happens on reconnect.
> > > Can you clarify pls?
> > > 
> > >From what I can see, backends which support reconnect call vhost_dev_init,
> > which then calls vhost_user_backend_init(), as vhost-user-blk does here:
> > https://github.com/qemu/qemu/blob/master/hw/block/vhost-user-blk.c#L315. The
> > ram slots limit is fetched in vhost_user_backend_init() so every time the
> > device reconnects the limit should be refetched. 
> 
> Right. Point is, we might have validated using an old limit.
> Reconnect needs to verify limit did not change or at least
> did not decrease.
> 
> -- 
> MST
Good point - I did not consider this case. Could we keep the slots limit in
the VhostUserState instead?

Say vhost_user_init() initializes the limit inside the VhostUserState to 0. 
Then,
vhost_user_backend_init() checks if this limit is 0. If so, this is the initial
connection and qemu fetches the limit from the backend, ensures the returned
value is nonzero, and sets the limit the VhostUserState. If not, qemu knows this
is a reconnect and queries the backend slots limit. If the returned value does
not equal the limit in the VhostUserState, vhost_user_backend_init() returns an
error.

Thoughts?



Re: [PATCH] memory: batch allocate ioeventfds[] in address_space_update_ioeventfds()

2020-02-19 Thread Stefan Hajnoczi
On Tue, Feb 18, 2020 at 9:50 PM Peter Xu  wrote:
>
> On Tue, Feb 18, 2020 at 06:22:26PM +, Stefan Hajnoczi wrote:
> > Reallocing the ioeventfds[] array each time an element is added is very
> > expensive as the number of ioeventfds increases.  Batch allocate instead
> > to amortize the cost of realloc.
> >
> > This patch reduces Linux guest boot times from 362s to 140s when there
> > are 2 virtio-blk devices with 1 virtqueue and 99 virtio-blk devices with
> > 32 virtqueues.
> >
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> >  memory.c | 17 ++---
> >  1 file changed, 14 insertions(+), 3 deletions(-)
> >
> > diff --git a/memory.c b/memory.c
> > index aeaa8dcc9e..2d6f931f8c 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -794,10 +794,18 @@ static void 
> > address_space_update_ioeventfds(AddressSpace *as)
> >  FlatView *view;
> >  FlatRange *fr;
> >  unsigned ioeventfd_nb = 0;
> > -MemoryRegionIoeventfd *ioeventfds = NULL;
> > +unsigned ioeventfd_max;
> > +MemoryRegionIoeventfd *ioeventfds;
> >  AddrRange tmp;
> >  unsigned i;
> >
> > +/*
> > + * It is likely that the number of ioeventfds hasn't changed much, so 
> > use
> > + * the previous size as the starting value.
> > + */
> > +ioeventfd_max = as->ioeventfd_nb;
> > +ioeventfds = g_new(MemoryRegionIoeventfd, ioeventfd_max);
>
> Would the ioeventfd_max being cached and never goes down but it can
> only keep or increase?

No, it will decrease but only the next time
address_space_update_ioeventfds() is called.  That's when we'll use
the next ioeventfds_nb as the starting point.

> I'm not sure if that's a big problem, but
> considering the commit message mentioned 99 virtio-blk with 32 queues
> each, I'm not sure... :)
>
> I'm thinking maybe start with a relative big number but always under
> control (e.g., 64), then...

I also considered doing a final g_realloc() at the end of the function
to get rid of the excess allocation but I'm not sure it's worth it...

> > +
> >  view = address_space_get_flatview(as);
> >  FOR_EACH_FLAT_RANGE(fr, view) {
> >  for (i = 0; i < fr->mr->ioeventfd_nb; ++i) {
> > @@ -806,8 +814,11 @@ static void 
> > address_space_update_ioeventfds(AddressSpace *as)
> >   
> > int128_make64(fr->offset_in_region)));
> >  if (addrrange_intersects(fr->addr, tmp)) {
> >  ++ioeventfd_nb;
> > -ioeventfds = g_realloc(ioeventfds,
> > -  ioeventfd_nb * 
> > sizeof(*ioeventfds));
> > +if (ioeventfd_nb > ioeventfd_max) {
> > +ioeventfd_max += 64;
>
> ... do exponential increase here (max*=2) instead so still easy to
> converge?

I'm happy to tweak the policy.  Let's see what Paolo thinks.

Stefan



Re: [PATCH 3/3] hw/xtensa/xtfpga:fix leak of fdevice tree blob

2020-02-19 Thread Laurent Vivier
Le 18/02/2020 à 10:11, kuhn.chen...@huawei.com a écrit :
> From: Chen Qun 
> 
> The device tree blob returned by load_device_tree is malloced.
> We should free it after cpu_physical_memory_write().
> 
> Reported-by: Euler Robot 
> Signed-off-by: Chen Qun 
> ---
>  hw/xtensa/xtfpga.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/xtensa/xtfpga.c b/hw/xtensa/xtfpga.c
> index 8e2dd1327a..60ccc74f5f 100644
> --- a/hw/xtensa/xtfpga.c
> +++ b/hw/xtensa/xtfpga.c
> @@ -380,6 +380,7 @@ static void xtfpga_init(const XtfpgaBoardDesc *board, 
> MachineState *machine)
>  cur_tagptr = put_tag(cur_tagptr, BP_TAG_FDT,
>   sizeof(dtb_addr), &dtb_addr);
>  cur_lowmem = QEMU_ALIGN_UP(cur_lowmem + fdt_size, 4 * KiB);
> +g_free(fdt);
>  }
>  #else
>  if (dtb_filename) {
> 

Reviewed-by: Laurent Vivier 



[PATCH] migration/savevm: release gslist after dump_vmstate_json

2020-02-19 Thread pannengyuan
From: Pan Nengyuan 

'list' forgot to free at the end of dump_vmstate_json_to_file(), although it's 
called only once, but seems like a clean code.

Fix the leak as follow:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x7fb946abd768 in __interceptor_malloc (/lib64/libasan.so.5+0xef768)
#1 0x7fb945eca445 in g_malloc (/lib64/libglib-2.0.so.0+0x52445)
#2 0x7fb945ee2066 in g_slice_alloc (/lib64/libglib-2.0.so.0+0x6a066)
#3 0x7fb945ee3139 in g_slist_prepend (/lib64/libglib-2.0.so.0+0x6b139)
#4 0x5585db591581 in object_class_get_list_tramp 
/mnt/sdb/qemu-new/qemu/qom/object.c:1084
#5 0x5585db590f66 in object_class_foreach_tramp 
/mnt/sdb/qemu-new/qemu/qom/object.c:1028
#6 0x7fb945eb35f7 in g_hash_table_foreach (/lib64/libglib-2.0.so.0+0x3b5f7)
#7 0x5585db59110c in object_class_foreach 
/mnt/sdb/qemu-new/qemu/qom/object.c:1038
#8 0x5585db5916b6 in object_class_get_list 
/mnt/sdb/qemu-new/qemu/qom/object.c:1092
#9 0x5585db335ca0 in dump_vmstate_json_to_file 
/mnt/sdb/qemu-new/qemu/migration/savevm.c:638
#10 0x5585daa5bcbf in main /mnt/sdb/qemu-new/qemu/vl.c:4420
#11 0x7fb941204812 in __libc_start_main ../csu/libc-start.c:308
#12 0x5585da29420d in _start 
(/mnt/sdb/qemu-new/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x27f020d)

Indirect leak of 7472 byte(s) in 467 object(s) allocated from:
#0 0x7fb946abd768 in __interceptor_malloc (/lib64/libasan.so.5+0xef768)
#1 0x7fb945eca445 in g_malloc (/lib64/libglib-2.0.so.0+0x52445)
#2 0x7fb945ee2066 in g_slice_alloc (/lib64/libglib-2.0.so.0+0x6a066)
#3 0x7fb945ee3139 in g_slist_prepend (/lib64/libglib-2.0.so.0+0x6b139)
#4 0x5585db591581 in object_class_get_list_tramp 
/mnt/sdb/qemu-new/qemu/qom/object.c:1084
#5 0x5585db590f66 in object_class_foreach_tramp 
/mnt/sdb/qemu-new/qemu/qom/object.c:1028
#6 0x7fb945eb35f7 in g_hash_table_foreach (/lib64/libglib-2.0.so.0+0x3b5f7)
#7 0x5585db59110c in object_class_foreach 
/mnt/sdb/qemu-new/qemu/qom/object.c:1038
#8 0x5585db5916b6 in object_class_get_list 
/mnt/sdb/qemu-new/qemu/qom/object.c:1092
#9 0x5585db335ca0 in dump_vmstate_json_to_file 
/mnt/sdb/qemu-new/qemu/migration/savevm.c:638
#10 0x5585daa5bcbf in main /mnt/sdb/qemu-new/qemu/vl.c:4420
#11 0x7fb941204812 in __libc_start_main ../csu/libc-start.c:308
#12 0x5585da29420d in _start 
(/mnt/sdb/qemu-new/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x27f020d)

Reported-by: Euler Robot 
Signed-off-by: Pan Nengyuan 
---
 migration/savevm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/savevm.c b/migration/savevm.c
index f19cb9ec7a..60e6ea8a8d 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -665,6 +665,7 @@ void dump_vmstate_json_to_file(FILE *out_file)
 }
 fprintf(out_file, "\n}\n");
 fclose(out_file);
+g_slist_free(list);
 }
 
 static uint32_t calculate_new_instance_id(const char *idstr)
-- 
2.18.2




Re: [PATCH 1/3] hw/nios2:fix leak of fdevice tree blob

2020-02-19 Thread Laurent Vivier
Le 18/02/2020 à 10:11, kuhn.chen...@huawei.com a écrit :
> From: Chen Qun 
> 
> The device tree blob returned by load_device_tree is malloced.
> We should free it after cpu_physical_memory_write().
> 
> Reported-by: Euler Robot 
> Signed-off-by: Chen Qun 
> ---
>  hw/nios2/boot.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/nios2/boot.c b/hw/nios2/boot.c
> index 46b8349876..88224aa84c 100644
> --- a/hw/nios2/boot.c
> +++ b/hw/nios2/boot.c
> @@ -109,6 +109,7 @@ static int nios2_load_dtb(struct nios2_boot_info bi, 
> const uint32_t ramsize,
>  }
>  
>  cpu_physical_memory_write(bi.fdt, fdt, fdt_size);
> +g_free(fdt);
>  return fdt_size;
>  }
>  
> 

Applied to my trivial-patches branch.

Thanks,
Laurent




Re: [PATCH RFC] target/i386: filter out VMX_PIN_BASED_POSTED_INTR when enabling SynIC

2020-02-19 Thread Vitaly Kuznetsov
Paolo Bonzini  writes:

> On 18/02/20 18:08, Vitaly Kuznetsov wrote:
>> Paolo Bonzini  writes:
>> 
>>> On 18/02/20 15:44, Vitaly Kuznetsov wrote:
 Signed-off-by: Vitaly Kuznetsov 
 ---
 RFC: This is somewhat similar to eVMCS breakage and it is likely possible
 to fix this in KVM. I decided to try QEMU first as this is a single
 control and unlike eVMCS we don't need to keep a list of things to disable.
>>>
>>> I think you should disable "virtual-interrupt delivery" instead (which
>>> in turn requires "process posted interrupts" to be zero).  That is the
>>> one that is incompatible with AutoEOI interrupts.
>> 
>> I'm fighting the symptoms, not the cause :-) My understanding is that
>> when SynIC is enabled for CPU0 KVM does
>> 
>> kvm_vcpu_update_apicv()
>>  vmx_refresh_apicv_exec_ctrl()
>>  pin_controls_set()
>> 
>> for *all* vCPUs (KVM_REQ_APICV_UPDATE). I'm not sure why
>> SECONDARY_EXEC_APIC_REGISTER_VIRT/SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY
>> are not causing problems and only PIN_BASED_POSTED_INTR does as we clear
>> them all (not very important atm).
>
> Let's take a step back, what is the symptom, i.e. how does it fail?

I just do

~/qemu/x86_64-softmmu/qemu-system-x86_64 -machine q35,accel=kvm -cpu 
host,hv_vpindex,hv_synic -smp 2 -m 16384 -vnc :0
and get
qemu-system-x86_64: error: failed to set MSR 0x48d to 0xff0016
qemu-system-x86_64: /root/qemu/target/i386/kvm.c:2684: kvm_buf_set_msrs: 
Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
Aborted

(it works with '-smp 1' or without 'hv_synic')

> Because thinking more about it, since we have separate VMCS we can set
> PIN_BASED_POSTED_INTR and SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY just fine
> in the vmcs02.
> The important part is to unconditionally call
> vmx_deliver_nested_posted_interrupt.
>
> Something like
>
>   if (kvm_x86_ops->deliver_posted_interrupt(vcpu, vector)) {
> kvm_lapic_set_irr(vector, apic);
> kvm_make_request(KVM_REQ_EVENT, vcpu);
> kvm_vcpu_kick(vcpu);
> }
>
> and in vmx_deliver_posted_interrupt
>
> r = vmx_deliver_nested_posted_interrupt(vcpu, vector);
> if (!r)
> return 0;
>
>   if (!vcpu->arch.apicv_active)
> return -1;
> ...
> return 0;

Sound like a plan, let me try playing with it.

-- 
Vitaly




Re: [PATCH] migration/savevm: release gslist after dump_vmstate_json

2020-02-19 Thread Dr. David Alan Gilbert
* pannengy...@huawei.com (pannengy...@huawei.com) wrote:
> From: Pan Nengyuan 
> 
> 'list' forgot to free at the end of dump_vmstate_json_to_file(), although 
> it's called only once, but seems like a clean code.
> 
> Fix the leak as follow:
> Direct leak of 16 byte(s) in 1 object(s) allocated from:
> #0 0x7fb946abd768 in __interceptor_malloc (/lib64/libasan.so.5+0xef768)
> #1 0x7fb945eca445 in g_malloc (/lib64/libglib-2.0.so.0+0x52445)
> #2 0x7fb945ee2066 in g_slice_alloc (/lib64/libglib-2.0.so.0+0x6a066)
> #3 0x7fb945ee3139 in g_slist_prepend (/lib64/libglib-2.0.so.0+0x6b139)
> #4 0x5585db591581 in object_class_get_list_tramp 
> /mnt/sdb/qemu-new/qemu/qom/object.c:1084
> #5 0x5585db590f66 in object_class_foreach_tramp 
> /mnt/sdb/qemu-new/qemu/qom/object.c:1028
> #6 0x7fb945eb35f7 in g_hash_table_foreach 
> (/lib64/libglib-2.0.so.0+0x3b5f7)
> #7 0x5585db59110c in object_class_foreach 
> /mnt/sdb/qemu-new/qemu/qom/object.c:1038
> #8 0x5585db5916b6 in object_class_get_list 
> /mnt/sdb/qemu-new/qemu/qom/object.c:1092
> #9 0x5585db335ca0 in dump_vmstate_json_to_file 
> /mnt/sdb/qemu-new/qemu/migration/savevm.c:638
> #10 0x5585daa5bcbf in main /mnt/sdb/qemu-new/qemu/vl.c:4420
> #11 0x7fb941204812 in __libc_start_main ../csu/libc-start.c:308
> #12 0x5585da29420d in _start 
> (/mnt/sdb/qemu-new/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x27f020d)
> 
> Indirect leak of 7472 byte(s) in 467 object(s) allocated from:
> #0 0x7fb946abd768 in __interceptor_malloc (/lib64/libasan.so.5+0xef768)
> #1 0x7fb945eca445 in g_malloc (/lib64/libglib-2.0.so.0+0x52445)
> #2 0x7fb945ee2066 in g_slice_alloc (/lib64/libglib-2.0.so.0+0x6a066)
> #3 0x7fb945ee3139 in g_slist_prepend (/lib64/libglib-2.0.so.0+0x6b139)
> #4 0x5585db591581 in object_class_get_list_tramp 
> /mnt/sdb/qemu-new/qemu/qom/object.c:1084
> #5 0x5585db590f66 in object_class_foreach_tramp 
> /mnt/sdb/qemu-new/qemu/qom/object.c:1028
> #6 0x7fb945eb35f7 in g_hash_table_foreach 
> (/lib64/libglib-2.0.so.0+0x3b5f7)
> #7 0x5585db59110c in object_class_foreach 
> /mnt/sdb/qemu-new/qemu/qom/object.c:1038
> #8 0x5585db5916b6 in object_class_get_list 
> /mnt/sdb/qemu-new/qemu/qom/object.c:1092
> #9 0x5585db335ca0 in dump_vmstate_json_to_file 
> /mnt/sdb/qemu-new/qemu/migration/savevm.c:638
> #10 0x5585daa5bcbf in main /mnt/sdb/qemu-new/qemu/vl.c:4420
> #11 0x7fb941204812 in __libc_start_main ../csu/libc-start.c:308
> #12 0x5585da29420d in _start 
> (/mnt/sdb/qemu-new/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x27f020d)
> 
> Reported-by: Euler Robot 

Good robot!

> Signed-off-by: Pan Nengyuan 
> ---
>  migration/savevm.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index f19cb9ec7a..60e6ea8a8d 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -665,6 +665,7 @@ void dump_vmstate_json_to_file(FILE *out_file)
>  }
>  fprintf(out_file, "\n}\n");
>  fclose(out_file);
> +g_slist_free(list);

Reviewed-by: Dr. David Alan Gilbert 

>  }
>  
>  static uint32_t calculate_new_instance_id(const char *idstr)
> -- 
> 2.18.2
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




[PATCH] util/async: make bh_aio_poll() O(1)

2020-02-19 Thread Stefan Hajnoczi
The ctx->first_bh list contains all created BHs, including those that
are not scheduled.  The list is iterated by the event loop and therefore
has O(n) time complexity with respected to the number of created BHs.

Rewrite BHs so that only scheduled or deleted BHs are enqueued.
Only BHs that actually require action will be iterated.

One semantic change is required: qemu_bh_delete() enqueues the BH and
therefore invokes aio_notify().  The
tests/test-aio.c:test_source_bh_delete_from_cb() test case assumed that
g_main_context_iteration(NULL, false) returns false after
qemu_bh_delete() but it now returns true for one iteration.  Fix up the
test case.

This patch makes aio_compute_timeout() and aio_bh_poll() drop from a CPU
profile reported by perf-top(1).  Previously they combined to 9% CPU
utilization when AioContext polling is commented out and the guest has 2
virtio-blk,num-queues=1 and 99 virtio-blk,num-queues=32 devices.

Signed-off-by: Stefan Hajnoczi 
---
 include/block/aio.h |  17 +++-
 tests/test-aio.c|   3 +-
 util/async.c| 242 ++--
 3 files changed, 160 insertions(+), 102 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index 7ba9bd7874..51f4aa09cb 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -51,6 +51,18 @@ struct ThreadPool;
 struct LinuxAioState;
 struct LuringState;
 
+/*
+ * Each aio_bh_poll() call carves off a slice of the BH list.  This way newly
+ * scheduled BHs are not processed until the next aio_bh_poll() call.  This
+ * concept extends to nested aio_bh_poll() calls because slices are chained
+ * together.
+ */
+typedef struct BHListSlice BHListSlice;
+struct BHListSlice {
+QEMUBH *first_bh;
+BHListSlice *next;
+};
+
 struct AioContext {
 GSource source;
 
@@ -91,8 +103,9 @@ struct AioContext {
  */
 QemuLockCnt list_lock;
 
-/* Anchor of the list of Bottom Halves belonging to the context */
-struct QEMUBH *first_bh;
+/* Bottom Halves pending aio_bh_poll() processing */
+BHListSlice bh_list;
+BHListSlice **bh_list_tail;
 
 /* Used by aio_notify.
  *
diff --git a/tests/test-aio.c b/tests/test-aio.c
index 86fb73b3d5..8a46078463 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -615,7 +615,8 @@ static void test_source_bh_delete_from_cb(void)
 g_assert_cmpint(data1.n, ==, data1.max);
 g_assert(data1.bh == NULL);
 
-g_assert(!g_main_context_iteration(NULL, false));
+assert(g_main_context_iteration(NULL, false));
+assert(!g_main_context_iteration(NULL, false));
 }
 
 static void test_source_bh_delete_from_cb_many(void)
diff --git a/util/async.c b/util/async.c
index c192a24a61..a7d334efc0 100644
--- a/util/async.c
+++ b/util/async.c
@@ -36,16 +36,83 @@
 /***/
 /* bottom halves (can be seen as timers which expire ASAP) */
 
+/* QEMUBH::flags values */
+enum {
+/* Already enqueued and waiting for aio_bh_poll() */
+BH_PENDING   = (1 << 0),
+
+/* Invoke the callback */
+BH_SCHEDULED = (1 << 1),
+
+/* Delete without invoking callback */
+BH_DELETED   = (1 << 2),
+
+/* Delete after invoking callback */
+BH_ONESHOT   = (1 << 3),
+
+/* Schedule periodically when the event loop is idle */
+BH_IDLE  = (1 << 4),
+};
+
 struct QEMUBH {
 AioContext *ctx;
 QEMUBHFunc *cb;
 void *opaque;
 QEMUBH *next;
-bool scheduled;
-bool idle;
-bool deleted;
+unsigned flags;
 };
 
+/* Called concurrently from any thread */
+static void aio_bh_enqueue(QEMUBH *bh, unsigned new_flags)
+{
+AioContext *ctx = bh->ctx;
+unsigned old_flags;
+
+/*
+ * The memory barrier implicit in atomic_fetch_or makes sure that:
+ * 1. idle & any writes needed by the callback are done before the
+ *locations are read in the aio_bh_poll.
+ * 2. ctx is loaded before the callback has a chance to execute and bh
+ *could be freed.
+ */
+old_flags = atomic_fetch_or(&bh->flags, BH_PENDING | new_flags);
+if (!(old_flags & BH_PENDING)) {
+QEMUBH *old;
+
+do {
+old = ctx->bh_list.first_bh;
+bh->next = old;
+} while (atomic_cmpxchg(&ctx->bh_list.first_bh, old, bh) != old);
+}
+
+aio_notify(ctx);
+}
+
+/* Only called from aio_bh_poll() and aio_ctx_finalize() */
+static QEMUBH *aio_bh_dequeue(QEMUBH **first_bh, unsigned *flags)
+{
+QEMUBH *bh;
+
+bh = *first_bh;
+if (!bh) {
+return NULL;
+}
+
+*first_bh = bh->next;
+bh->next = NULL;
+
+/*
+ * The atomic_and is paired with aio_bh_enqueue().  The implicit memory
+ * barrier ensures that the callback sees all writes done by the scheduling
+ * thread.  It also ensures that the scheduling thread sees the cleared
+ * flag before bh->cb has run, and thus will call aio_notify again if
+ * necessary.
+ */
+*flags = atomic_fetch_and(&bh->flags,
+  

Re: [PATCH v2 0/3] vhost-user: Lift Max Ram Slots Limitation

2020-02-19 Thread Michael S. Tsirkin
On Wed, Feb 19, 2020 at 12:33:24AM -0500, Raphael Norwitz wrote:
> On Mon, Feb 10, 2020 at 11:04:28AM -0500, Michael S. Tsirkin wrote:
> > 
> > On Sun, Feb 09, 2020 at 12:14:42PM -0500, Raphael Norwitz wrote:
> > > On Thu, Feb 06, 2020 at 03:33:13AM -0500, Michael S. Tsirkin wrote:
> > > > 
> > > > On Wed, Jan 15, 2020 at 09:57:03PM -0500, Raphael Norwitz wrote:
> > > > > 
> > > > > Changes since V1:
> > > > > * Kept the assert in vhost_user_set_mem_table_postcopy, but moved 
> > > > > it
> > > > >   to prevent corruption
> > > > > * Made QEMU send a single VHOST_USER_GET_MAX_MEMSLOTS message at
> > > > >   startup and cache the returned value so that QEMU does not need 
> > > > > to
> > > > >   query the backend every time vhost_backend_memslots_limit is 
> > > > > called.
> > > > 
> > > > I'm a bit confused about what happens on reconnect.
> > > > Can you clarify pls?
> > > > 
> > > >From what I can see, backends which support reconnect call 
> > > >vhost_dev_init,
> > > which then calls vhost_user_backend_init(), as vhost-user-blk does here:
> > > https://github.com/qemu/qemu/blob/master/hw/block/vhost-user-blk.c#L315. 
> > > The
> > > ram slots limit is fetched in vhost_user_backend_init() so every time the
> > > device reconnects the limit should be refetched. 
> > 
> > Right. Point is, we might have validated using an old limit.
> > Reconnect needs to verify limit did not change or at least
> > did not decrease.
> > 
> > -- 
> > MST
> Good point - I did not consider this case. Could we keep the slots limit in
> the VhostUserState instead?
> 
> Say vhost_user_init() initializes the limit inside the VhostUserState to 0. 
> Then,
> vhost_user_backend_init() checks if this limit is 0. If so, this is the 
> initial
> connection and qemu fetches the limit from the backend, ensures the returned
> value is nonzero, and sets the limit the VhostUserState. If not, qemu knows 
> this
> is a reconnect and queries the backend slots limit. If the returned value does
> not equal the limit in the VhostUserState, vhost_user_backend_init() returns 
> an
> error.
> 
> Thoughts?

Right.
Or if we really want to, check backend value is >= the saved one.
Basically same thing we do with features.

-- 
MST




Re: [PATCH 2/5] aio-posix: don't pass ns timeout to epoll_wait()

2020-02-19 Thread Sergio Lopez
On Fri, Feb 14, 2020 at 05:17:09PM +, Stefan Hajnoczi wrote:
> Don't pass the nanosecond timeout into epoll_wait(), which expects
> milliseconds.
> 
> The epoll_wait() timeout value does not matter if qemu_poll_ns()
> determined that the poll fd is ready, but passing a value in the wrong
> units is still ugly.  Pass a 0 timeout to epoll_wait() instead.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  util/aio-posix.c | 3 +++
>  1 file changed, 3 insertions(+)

Reviewed-by: Sergio Lopez 


signature.asc
Description: PGP signature


[PULL 06/17] hw/m68k/next-cube: Remove superfluous semicolon

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Fixes: 956a78118bf
Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Paolo Bonzini 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Message-Id: <20200218094402.26625-7-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 hw/m68k/next-cube.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/m68k/next-cube.c b/hw/m68k/next-cube.c
index e5343348d076..350c6fec78f9 100644
--- a/hw/m68k/next-cube.c
+++ b/hw/m68k/next-cube.c
@@ -734,7 +734,7 @@ void next_irq(void *opaque, int number, int level)
 switch (number) {
 /* level 3 - floppy, kbd/mouse, power, ether rx/tx, scsi, clock */
 case NEXT_FD_I:
-shift = 7;;
+shift = 7;
 break;
 case NEXT_KBD_I:
 shift = 3;
-- 
2.24.1




[PULL 07/17] hw/scsi/esp: Remove superfluous semicolon

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Fixes: 74d71ea16bc
Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Paolo Bonzini 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Message-Id: <20200218094402.26625-8-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 hw/scsi/esp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index f8fc30cccbd4..405f8b7cbcf7 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -293,7 +293,7 @@ static void handle_satn_stop(ESPState *s)
 s->dma_cb = handle_satn_stop;
 return;
 }
-s->pdma_cb = satn_stop_pdma_cb;;
+s->pdma_cb = satn_stop_pdma_cb;
 s->cmdlen = get_cmd(s, s->cmdbuf, sizeof(s->cmdbuf));
 if (s->cmdlen) {
 trace_esp_handle_satn_stop(s->cmdlen);
-- 
2.24.1




[PULL 02/17] Report stringified errno in VFIO related errors

2020-02-19 Thread Laurent Vivier
From: Michal Privoznik 

In a few places we report errno formatted as a negative integer.
This is not as user friendly as it can be. Use strerror() and/or
error_setg_errno() instead.

Signed-off-by: Michal Privoznik 
Reviewed-by: Ján Tomko 
Reviewed-by: Cornelia Huck 
Reviewed-by: Eric Auger 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Alex Williamson 
Message-Id: 
<4949c3ecf1a32189b8a4b5eb4b0fd04c1122501d.1581674006.git.mpriv...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 hw/vfio/common.c| 4 ++--
 util/vfio-helpers.c | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 5ca11488d676..0b3593b3c0c4 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -319,7 +319,7 @@ static int vfio_dma_unmap(VFIOContainer *container,
 unmap.size -= 1ULL << ctz64(container->pgsizes);
 continue;
 }
-error_report("VFIO_UNMAP_DMA: %d", -errno);
+error_report("VFIO_UNMAP_DMA failed: %s", strerror(errno));
 return -errno;
 }
 
@@ -352,7 +352,7 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr 
iova,
 return 0;
 }
 
-error_report("VFIO_MAP_DMA: %d", -errno);
+error_report("VFIO_MAP_DMA failed: %s", strerror(errno));
 return -errno;
 }
 
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 813f7ec56427..ddd9a96e7672 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -545,7 +545,7 @@ static int qemu_vfio_do_mapping(QEMUVFIOState *s, void 
*host, size_t size,
 trace_qemu_vfio_do_mapping(s, host, size, iova);
 
 if (ioctl(s->container, VFIO_IOMMU_MAP_DMA, &dma_map)) {
-error_report("VFIO_MAP_DMA: %d", -errno);
+error_report("VFIO_MAP_DMA failed: %s", strerror(errno));
 return -errno;
 }
 return 0;
@@ -570,7 +570,7 @@ static void qemu_vfio_undo_mapping(QEMUVFIOState *s, 
IOVAMapping *mapping,
 assert(QEMU_IS_ALIGNED(mapping->size, qemu_real_host_page_size));
 assert(index >= 0 && index < s->nr_mappings);
 if (ioctl(s->container, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
-error_setg(errp, "VFIO_UNMAP_DMA failed: %d", -errno);
+error_setg_errno(errp, errno, "VFIO_UNMAP_DMA failed");
 }
 memmove(mapping, &s->mappings[index + 1],
 sizeof(s->mappings[0]) * (s->nr_mappings - index - 1));
@@ -669,7 +669,7 @@ int qemu_vfio_dma_reset_temporary(QEMUVFIOState *s)
 trace_qemu_vfio_dma_reset_temporary(s);
 qemu_mutex_lock(&s->lock);
 if (ioctl(s->container, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
-error_report("VFIO_UNMAP_DMA: %d", -errno);
+error_report("VFIO_UNMAP_DMA failed: %s", strerror(errno));
 qemu_mutex_unlock(&s->lock);
 return -errno;
 }
-- 
2.24.1




[PULL 04/17] audio/alsaaudio: Remove superfluous semicolons

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Fixes: 286a5d201e4
Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Paolo Bonzini 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Message-Id: <20200218094402.26625-3-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 audio/alsaaudio.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/audio/alsaaudio.c b/audio/alsaaudio.c
index a23a5a0b60a1..a8e62542f97e 100644
--- a/audio/alsaaudio.c
+++ b/audio/alsaaudio.c
@@ -819,7 +819,7 @@ static size_t alsa_read(HWVoiceIn *hw, void *buf, size_t 
len)
 switch (nread) {
 case 0:
 trace_alsa_read_zero(len);
-return pos;;
+return pos;
 
 case -EPIPE:
 if (alsa_recover(alsa->handle)) {
@@ -835,7 +835,7 @@ static size_t alsa_read(HWVoiceIn *hw, void *buf, size_t 
len)
 default:
 alsa_logerr(nread, "Failed to read %zu frames to %p\n",
 len, dst);
-return pos;;
+return pos;
 }
 }
 
-- 
2.24.1




[PULL 05/17] hw/arm/xlnx-versal: Remove superfluous semicolon

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Fixes: 6f16da53ffe
Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Paolo Bonzini 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Message-Id: <20200218094402.26625-6-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 hw/arm/xlnx-versal-virt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
index 462493c46711..0d2e3bdda169 100644
--- a/hw/arm/xlnx-versal-virt.c
+++ b/hw/arm/xlnx-versal-virt.c
@@ -350,7 +350,7 @@ static void create_virtio_regions(VersalVirt *s)
 int i;
 
 for (i = 0; i < NUM_VIRTIO_TRANSPORT; i++) {
-char *name = g_strdup_printf("virtio%d", i);;
+char *name = g_strdup_printf("virtio%d", i);
 hwaddr base = MM_TOP_RSVD + i * virtio_mmio_size;
 int irq = VERSAL_RSVD_IRQ_FIRST + i;
 MemoryRegion *mr;
-- 
2.24.1




[PULL 03/17] scripts/checkpatch.pl: Detect superfluous semicolon in C code

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Display error when a commit contains superfluous semicolon:

  $ git show 6663a0a3376 | scripts/checkpatch.pl -q -
  ERROR: superfluous trailing semicolon
  #276: FILE: block/io_uring.c:186:
  +ret = -ENOSPC;;
  total: 1 errors, 1 warnings, 485 lines checked

Reported-by: Luc Michel 
Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Paolo Bonzini 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Reviewed-by: Luc Michel 
Message-Id: <20200218094402.26625-2-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 scripts/checkpatch.pl | 5 +
 1 file changed, 5 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index ce43a306f867..11512a8a09b8 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1830,6 +1830,11 @@ sub process {
ERROR("suspicious ; after while (0)\n" . $herecurr);
}
 
+# Check superfluous trailing ';'
+   if ($line =~ /;;$/) {
+   ERROR("superfluous trailing semicolon\n" . $herecurr);
+   }
+
 # Check relative indent for conditionals and blocks.
if ($line =~ /\b(?:(?:if|while|for)\s*\(|do\b)/ && $line !~ 
/^.\s*#/ && $line !~ /\}\s*while\s*/) {
my ($s, $c) = ($stat, $cond);
-- 
2.24.1




[PULL 10/17] target/i386/whpx: Remove superfluous semicolon

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Fixes: 812d49f2a3e
Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Paolo Bonzini 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Message-Id: <20200218094402.26625-12-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 target/i386/whpx-all.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c
index 3ed2aa1892a1..35601b81766f 100644
--- a/target/i386/whpx-all.c
+++ b/target/i386/whpx-all.c
@@ -511,7 +511,7 @@ static void whpx_get_registers(CPUState *cpu)
 /* WHvX64RegisterPat - Skipped */
 
 assert(whpx_register_names[idx] == WHvX64RegisterSysenterCs);
-env->sysenter_cs = vcxt.values[idx++].Reg64;;
+env->sysenter_cs = vcxt.values[idx++].Reg64;
 assert(whpx_register_names[idx] == WHvX64RegisterSysenterEip);
 env->sysenter_eip = vcxt.values[idx++].Reg64;
 assert(whpx_register_names[idx] == WHvX64RegisterSysenterEsp);
-- 
2.24.1




[PULL 00/17] Trivial branch patches

2020-02-19 Thread Laurent Vivier
The following changes since commit 6c599282f8ab382fe59f03a6cae755b89561a7b3:

  Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2020-02-15-v2' 
into staging (2020-02-17 13:32:25 +)

are available in the Git repository at:

  git://github.com/vivier/qemu.git tags/trivial-branch-pull-request

for you to fetch changes up to d1cb67841ca213802ee789957188ec87e8b7996d:

  hw/xtensa/xtfpga:fix leak of fdevice tree blob (2020-02-19 10:33:38 +0100)


Fix memory leak with fdt
cosmetic change in code and logs
update mailmap



Chen Qun (2):
  hw/nios2:fix leak of fdevice tree blob
  hw/xtensa/xtfpga:fix leak of fdevice tree blob

Michal Privoznik (1):
  Report stringified errno in VFIO related errors

Philippe Mathieu-Daudé (13):
  scripts/checkpatch.pl: Detect superfluous semicolon in C code
  audio/alsaaudio: Remove superfluous semicolons
  hw/arm/xlnx-versal: Remove superfluous semicolon
  hw/m68k/next-cube: Remove superfluous semicolon
  hw/scsi/esp: Remove superfluous semicolon
  hw/vfio/display: Remove superfluous semicolon
  ui/input-barrier: Remove superfluous semicolon
  target/i386/whpx: Remove superfluous semicolon
  tests/qtest/libqos/qgraph: Remove superfluous semicolons
  contrib/rdmacm-mux: Remove superfluous semicolon
  hw/display/qxl: Remove unneeded variable assignment
  hw/block/pflash_cfi02: Remove unneeded variable assignment
  hw/net/rocker: Report unimplemented feature with qemu_log_mask(UNIMP)

Yu-Chen Lin (1):
  mailmap: Add entry for Yu-Chen Lin

 .mailmap|  3 ++-
 audio/alsaaudio.c   |  4 ++--
 contrib/rdmacm-mux/main.c   |  2 +-
 hw/arm/xlnx-versal-virt.c   |  2 +-
 hw/block/pflash_cfi02.c |  1 -
 hw/display/qxl.c|  2 +-
 hw/m68k/next-cube.c |  2 +-
 hw/net/rocker/rocker.c  | 15 +--
 hw/nios2/boot.c |  1 +
 hw/scsi/esp.c   |  2 +-
 hw/vfio/common.c|  4 ++--
 hw/vfio/display.c   |  2 +-
 hw/xtensa/xtfpga.c  |  1 +
 scripts/checkpatch.pl   |  5 +
 target/i386/whpx-all.c  |  2 +-
 tests/qtest/libqos/qgraph.c |  4 ++--
 ui/input-barrier.c  |  2 +-
 util/vfio-helpers.c |  6 +++---
 18 files changed, 35 insertions(+), 25 deletions(-)

-- 
2.24.1




Re: [PATCH v5 2/5] gpiolib: Add support for GPIO line table lookup

2020-02-19 Thread Geert Uytterhoeven
On Tue, Feb 18, 2020 at 4:18 PM Geert Uytterhoeven
 wrote:
> Currently GPIOs can only be referred to by GPIO controller and offset in
> GPIO lookup tables.
>
> Add support for looking them up by line name.
> Rename gpiod_lookup.chip_label to gpiod_lookup.key, to make it clear
> that this field can have two meanings, and update the kerneldoc and
> GPIO_LOOKUP*() macros.
>
> Signed-off-by: Geert Uytterhoeven 
> Reviewed-by: Ulrich Hecht 
> Reviewed-by: Eugeniu Rosca 
> Tested-by: Eugeniu Rosca 

> --- a/include/linux/gpio/machine.h
> +++ b/include/linux/gpio/machine.h
> @@ -20,8 +20,9 @@ enum gpio_lookup_flags {
>
>  /**
>   * struct gpiod_lookup - lookup table
> - * @chip_label: name of the chip the GPIO belongs to
> - * @chip_hwnum: hardware number (i.e. relative to the chip) of the GPIO
> + * @key: either the name of the chip the GPIO belongs to, or the GPIO line 
> name
> + * @chip_hwnum: hardware number (i.e. relative to the chip) of the GPIO, or
> + *  U16_MAX to indicate that @key is a GPIO line name
>   * @con_id: name of the GPIO from the device's point of view
>   * @idx: index of the GPIO in case several GPIOs share the same name
>   * @flags: bitmask of gpio_lookup_flags GPIO_* values
> @@ -30,7 +31,7 @@ enum gpio_lookup_flags {
>   * functions using platform data.
>   */
>  struct gpiod_lookup {
> -   const char *chip_label;
> +   const char *key;
> u16 chip_hwnum;
> const char *con_id;
> unsigned int idx;

This needs an update in the documentation:

--- a/Documentation/driver-api/gpio/board.rst
+++ b/Documentation/driver-api/gpio/board.rst
@@ -113,13 +113,15 @@ files that desire to do so need to include the
following header::
 GPIOs are mapped by the means of tables of lookups, containing instances of the
 gpiod_lookup structure. Two macros are defined to help declaring such
mappings::

-   GPIO_LOOKUP(chip_label, chip_hwnum, con_id, flags)
-   GPIO_LOOKUP_IDX(chip_label, chip_hwnum, con_id, idx, flags)
+   GPIO_LOOKUP(key, chip_hwnum, con_id, flags)
+   GPIO_LOOKUP_IDX(key, chip_hwnum, con_id, idx, flags)

 where

-  - chip_label is the label of the gpiod_chip instance providing the GPIO
-  - chip_hwnum is the hardware number of the GPIO within the chip
+  - key is either the label of the gpiod_chip instance providing the GPIO, or
+the GPIO line name
+  - chip_hwnum is the hardware number of the GPIO within the chip, or U16_MAX
+to indicate that key is a GPIO line name
   - con_id is the name of the GPIO function from the device point of view. It
can be NULL, in which case it will match any function.
   - idx is the index of the GPIO within the function.


Furthermore, a few drivers populate the gpiod_lookup members directly,
instead of using the convenience macros:

arch/arm/mach-integrator/impd1.c
drivers/i2c/busses/i2c-i801.c
drivers/mfd/sm501.c

Either they have to be updated s/chip_label/key/, or start using the macros,
e.g.

--- a/drivers/i2c/busses/i2c-i801.c
+++ b/drivers/i2c/busses/i2c-i801.c
@@ -1444,9 +1444,9 @@ static int i801_add_mux(struct i801_priv *priv)
return -ENOMEM;
lookup->dev_id = "i2c-mux-gpio";
for (i = 0; i < mux_config->n_gpios; i++) {
-   lookup->table[i].chip_label = mux_config->gpio_chip;
-   lookup->table[i].chip_hwnum = mux_config->gpios[i];
-   lookup->table[i].con_id = "mux";
+   lookup->table[i] = (struct gpiod_lookup)
+   GPIO_LOOKUP(mux_config->gpio_chip,
+   mux_config->gpios[i], "mux", 0);
}
gpiod_add_lookup_table(lookup);
priv->lookup = lookup;

Do you have any preference?
Thanks!

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds



[PULL 11/17] tests/qtest/libqos/qgraph: Remove superfluous semicolons

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Fixes: fc281c80202
Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Paolo Bonzini 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Message-Id: <20200218094402.26625-13-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 tests/qtest/libqos/qgraph.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/qtest/libqos/qgraph.c b/tests/qtest/libqos/qgraph.c
index 7a7ae2a19edc..ca01de07437b 100644
--- a/tests/qtest/libqos/qgraph.c
+++ b/tests/qtest/libqos/qgraph.c
@@ -474,7 +474,7 @@ QOSEdgeType qos_graph_edge_get_type(QOSGraphEdge *edge)
 if (!edge) {
 return -1;
 }
-return edge->type;;
+return edge->type;
 }
 
 char *qos_graph_edge_get_dest(QOSGraphEdge *edge)
@@ -590,7 +590,7 @@ void qos_add_test(const char *name, const char *interface,
   QOSTestFunc test_func, QOSGraphTestOptions *opts)
 {
 QOSGraphNode *node;
-char *test_name = g_strdup_printf("%s-tests/%s", interface, name);;
+char *test_name = g_strdup_printf("%s-tests/%s", interface, name);
 QOSGraphTestOptions def_opts = { };
 
 if (!opts) {
-- 
2.24.1




[PULL 09/17] ui/input-barrier: Remove superfluous semicolon

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Fixes: 6105683da35
Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Paolo Bonzini 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Message-Id: <20200218094402.26625-11-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 ui/input-barrier.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ui/input-barrier.c b/ui/input-barrier.c
index fe35049b83a2..527c75e130ee 100644
--- a/ui/input-barrier.c
+++ b/ui/input-barrier.c
@@ -455,7 +455,7 @@ static gboolean writecmd(InputBarrier *ib, struct 
barrierMsg *msg)
 break;
 default:
 write_cmd(p, barrierCmdEUnknown, avail);
-break;;
+break;
 }
 
 len = MAX_HELLO_LENGTH - avail - sizeof(int);
-- 
2.24.1




[PULL 01/17] mailmap: Add entry for Yu-Chen Lin

2020-02-19 Thread Laurent Vivier
From: Yu-Chen Lin 

I have two mail address, add entries for
showing author and email correctly.

Signed-off-by: Yu-Chen Lin 
Reviewed-by: Philippe Mathieu-Daudé 
Acked-by: Yu-Chen Lin 
Message-Id: <20200206125504.7150-1-npes87...@gmail.com>
Signed-off-by: Laurent Vivier 
---
 .mailmap | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/.mailmap b/.mailmap
index a521c17b4430..76154c7d8abc 100644
--- a/.mailmap
+++ b/.mailmap
@@ -152,7 +152,8 @@ Xiaoqiang Zhao 
 Xinhua Cao 
 Xiong Zhang 
 Yin Yin 
-yuchenlin 
+Yu-Chen Lin 
+Yu-Chen Lin  
 YunQiang Su 
 YunQiang Su 
 Yuri Pudgorodskiy 
-- 
2.24.1




[PULL 17/17] hw/xtensa/xtfpga:fix leak of fdevice tree blob

2020-02-19 Thread Laurent Vivier
From: Chen Qun 

The device tree blob returned by load_device_tree is malloced.
We should free it after cpu_physical_memory_write().

Reported-by: Euler Robot 
Signed-off-by: Chen Qun 
Acked-by: Max Filippov 
Reviewed-by: Laurent Vivier 
Message-Id: <20200218091154.21696-4-kuhn.chen...@huawei.com>
Signed-off-by: Laurent Vivier 
---
 hw/xtensa/xtfpga.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/xtensa/xtfpga.c b/hw/xtensa/xtfpga.c
index 8e2dd1327a46..60ccc74f5f1d 100644
--- a/hw/xtensa/xtfpga.c
+++ b/hw/xtensa/xtfpga.c
@@ -380,6 +380,7 @@ static void xtfpga_init(const XtfpgaBoardDesc *board, 
MachineState *machine)
 cur_tagptr = put_tag(cur_tagptr, BP_TAG_FDT,
  sizeof(dtb_addr), &dtb_addr);
 cur_lowmem = QEMU_ALIGN_UP(cur_lowmem + fdt_size, 4 * KiB);
+g_free(fdt);
 }
 #else
 if (dtb_filename) {
-- 
2.24.1




[PULL 15/17] hw/net/rocker: Report unimplemented feature with qemu_log_mask(UNIMP)

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Fix warnings reported by Clang static code analyzer:

CC  hw/net/rocker/rocker.o
  hw/net/rocker/rocker.c:213:9: warning: Value stored to 'tx_tso_mss' is never 
read
  tx_tso_mss = rocker_tlv_get_le16(tlvs[ROCKER_TLV_TX_TSO_MSS]);
  ^
  hw/net/rocker/rocker.c:217:9: warning: Value stored to 'tx_tso_hdr_len' is 
never read
  tx_tso_hdr_len = rocker_tlv_get_le16(tlvs[ROCKER_TLV_TX_TSO_HDR_LEN]);
  ^
  hw/net/rocker/rocker.c:255:9: warning: Value stored to 'tx_l3_csum_off' is 
never read
  tx_l3_csum_off += tx_tso_mss = tx_tso_hdr_len = 0;
  ^ ~~~

Fixes: dc488f888
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Stefan Hajnoczi 
Message-Id: <20200217101637.27558-1-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 hw/net/rocker/rocker.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/hw/net/rocker/rocker.c b/hw/net/rocker/rocker.c
index 81dd3b5f141d..15d66f6cbcf0 100644
--- a/hw/net/rocker/rocker.c
+++ b/hw/net/rocker/rocker.c
@@ -27,6 +27,7 @@
 #include "qemu/iov.h"
 #include "qemu/module.h"
 #include "qemu/bitops.h"
+#include "qemu/log.h"
 
 #include "rocker.h"
 #include "rocker_hw.h"
@@ -207,14 +208,22 @@ static int tx_consume(Rocker *r, DescInfo *info)
 
 if (tlvs[ROCKER_TLV_TX_L3_CSUM_OFF]) {
 tx_l3_csum_off = rocker_tlv_get_le16(tlvs[ROCKER_TLV_TX_L3_CSUM_OFF]);
+qemu_log_mask(LOG_UNIMP, "rocker %s: L3 not implemented"
+ " (cksum off: %u)\n",
+  __func__, tx_l3_csum_off);
 }
 
 if (tlvs[ROCKER_TLV_TX_TSO_MSS]) {
 tx_tso_mss = rocker_tlv_get_le16(tlvs[ROCKER_TLV_TX_TSO_MSS]);
+qemu_log_mask(LOG_UNIMP, "rocker %s: TSO not implemented (MSS: %u)\n",
+  __func__, tx_tso_mss);
 }
 
 if (tlvs[ROCKER_TLV_TX_TSO_HDR_LEN]) {
 tx_tso_hdr_len = rocker_tlv_get_le16(tlvs[ROCKER_TLV_TX_TSO_HDR_LEN]);
+qemu_log_mask(LOG_UNIMP, "rocker %s: TSO not implemented"
+ " (hdr length: %u)\n",
+  __func__, tx_tso_hdr_len);
 }
 
 rocker_tlv_for_each_nested(tlv_frag, tlvs[ROCKER_TLV_TX_FRAGS], rem) {
@@ -249,12 +258,6 @@ static int tx_consume(Rocker *r, DescInfo *info)
 iovcnt++;
 }
 
-if (iovcnt) {
-/* XXX perform Tx offloads */
-/* XXX   silence compiler for now */
-tx_l3_csum_off += tx_tso_mss = tx_tso_hdr_len = 0;
-}
-
 err = fp_port_eg(r->fp_port[port], iov, iovcnt);
 
 err_too_many_frags:
-- 
2.24.1




[PULL 08/17] hw/vfio/display: Remove superfluous semicolon

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Fixes: 8b818e059bf
Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Paolo Bonzini 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Message-Id: <20200218094402.26625-9-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 hw/vfio/display.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/vfio/display.c b/hw/vfio/display.c
index a5a608c5b226..f4977c66e1b5 100644
--- a/hw/vfio/display.c
+++ b/hw/vfio/display.c
@@ -287,7 +287,7 @@ static void vfio_display_dmabuf_update(void *opaque)
 VFIOPCIDevice *vdev = opaque;
 VFIODisplay *dpy = vdev->dpy;
 VFIODMABuf *primary, *cursor;
-bool free_bufs = false, new_cursor = false;;
+bool free_bufs = false, new_cursor = false;
 
 primary = vfio_display_get_dmabuf(vdev, DRM_PLANE_TYPE_PRIMARY);
 if (primary == NULL) {
-- 
2.24.1




[PULL 12/17] contrib/rdmacm-mux: Remove superfluous semicolon

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Fixes: a5d2f6f8773
Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Paolo Bonzini 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Juan Quintela 
Message-Id: <20200218094402.26625-14-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 contrib/rdmacm-mux/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/rdmacm-mux/main.c b/contrib/rdmacm-mux/main.c
index de53048f0619..bd82abbad35e 100644
--- a/contrib/rdmacm-mux/main.c
+++ b/contrib/rdmacm-mux/main.c
@@ -490,7 +490,7 @@ static int read_and_process(int fd)
 
 static int accept_all(void)
 {
-int fd, rc = 0;;
+int fd, rc = 0;
 
 pthread_rwlock_wrlock(&server.lock);
 
-- 
2.24.1




[PULL 14/17] hw/block/pflash_cfi02: Remove unneeded variable assignment

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Fix warning reported by Clang static code analyzer:

CC  hw/block/pflash_cfi02.o
  hw/block/pflash_cfi02.c:311:5: warning: Value stored to 'ret' is never read
  ret = -1;
  ^ ~~

Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Message-Id: <20200215161557.4077-4-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 hw/block/pflash_cfi02.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/block/pflash_cfi02.c b/hw/block/pflash_cfi02.c
index 7c4744c020c1..12f18d401a85 100644
--- a/hw/block/pflash_cfi02.c
+++ b/hw/block/pflash_cfi02.c
@@ -308,7 +308,6 @@ static uint64_t pflash_read(void *opaque, hwaddr offset, 
unsigned int width)
 hwaddr boff;
 uint64_t ret;
 
-ret = -1;
 /* Lazy reset to ROMD mode after a certain amount of read accesses */
 if (!pfl->rom_mode && pfl->wcycle == 0 &&
 ++pfl->read_counter > PFLASH_LAZY_ROMD_THRESHOLD) {
-- 
2.24.1




[PULL 13/17] hw/display/qxl: Remove unneeded variable assignment

2020-02-19 Thread Laurent Vivier
From: Philippe Mathieu-Daudé 

Fix warning reported by Clang static code analyzer:

  hw/display/qxl.c:1634:14: warning: Value stored to 'orig_io_port' during its 
initialization is never read
  uint32_t orig_io_port = io_port;
   ^~~~   ~~~

Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Message-Id: <20200215161557.4077-3-phi...@redhat.com>
Signed-off-by: Laurent Vivier 
---
 hw/display/qxl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/display/qxl.c b/hw/display/qxl.c
index 64884da70857..21a43a1d5ec2 100644
--- a/hw/display/qxl.c
+++ b/hw/display/qxl.c
@@ -1631,7 +1631,7 @@ static void ioport_write(void *opaque, hwaddr addr,
 PCIQXLDevice *d = opaque;
 uint32_t io_port = addr;
 qxl_async_io async = QXL_SYNC;
-uint32_t orig_io_port = io_port;
+uint32_t orig_io_port;
 
 if (d->guest_bug && io_port != QXL_IO_RESET) {
 return;
-- 
2.24.1




[PULL 16/17] hw/nios2:fix leak of fdevice tree blob

2020-02-19 Thread Laurent Vivier
From: Chen Qun 

The device tree blob returned by load_device_tree is malloced.
We should free it after cpu_physical_memory_write().

Reported-by: Euler Robot 
Signed-off-by: Chen Qun 
Reviewed-by: Laurent Vivier 
Message-Id: <20200218091154.21696-2-kuhn.chen...@huawei.com>
Signed-off-by: Laurent Vivier 
---
 hw/nios2/boot.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/nios2/boot.c b/hw/nios2/boot.c
index 46b834987691..88224aa84c8b 100644
--- a/hw/nios2/boot.c
+++ b/hw/nios2/boot.c
@@ -109,6 +109,7 @@ static int nios2_load_dtb(struct nios2_boot_info bi, const 
uint32_t ramsize,
 }
 
 cpu_physical_memory_write(bi.fdt, fdt, fdt_size);
+g_free(fdt);
 return fdt_size;
 }
 
-- 
2.24.1




Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine

2020-02-19 Thread Kevin Wolf
Am 19.02.2020 um 10:03 hat Markus Armbruster geschrieben:
> >> >>   @@ -246,8 +287,15 @@ void monitor_qmp_bh_dispatcher(void *data)
> >> >>}
> >> >>qmp_request_free(req_obj);
> >> >> 
> >> >>   -/* Reschedule instead of looping so the main loop stays 
> >> >> responsive */
> >> >>   -qemu_bh_schedule(qmp_dispatcher_bh);
> >> >>   +/*
> >> >>   + * Yield and reschedule so the main loop stays responsive.
> >> >>   + *
> >> >>   + * Move back to iohandler_ctx so that nested event loops for
> >> >>   + * qemu_aio_context don't start new monitor commands.
> >> >> 
> >> >> Can you explain this sentence for dummies?
> >> >
> >> > Nested event loops (i.e. AIO_WAIT_WHILE) poll qemu_aio_context, so if we
> >> > are scheduled there, the next iteration of the monitor dispatcher loop
> >> > could start from a nested event loop. If we are scheduled in
> >> > iohandler_ctx, then only the actual main loop will reenter the coroutine
> >> > and nested event loops ignore it.
> >> >
> >> > I'm not sure if or why this is actually important, but this matches
> >> > scheduling the dispatcher BH in iohandler_ctx in the code before this
> >> > patch.
> >> >
> >> > If we didn't do this, we could end up starting monitor requests in more
> >> > places than before, and who knows what that would mean.
> >> 
> >> Let me say it in my own words, to make sure I got it.  I'm going to
> >> ignore special cases like "not using I/O thread" and exec-oob.
> >> 
> >> QMP monitor I/O happens in mon_iothread, in iohandler_ctx (I think).
> >> This pushes requests onto the monitor's qmp_requests queue.
> >
> > mon_iothread (like all iothreads) has a separate AioContext, which
> > doesn't have a name, but can be accessed with
> > iothread_get_aio_context(mon_iothread).
> 
> Got it.
> 
> @iohandler_ctx and @qemu_aio_context both belong to the main loop.
> 
> @qemu_aio_context is for general "run in the main loop" use.  It is
> polled both in the actual main loop and event loops nested in it.  I
> figure "both" to keep things responsive.
> 
> @iohandler_ctx is for "I/O handlers" (whatever those are).  It is polled
> only in the actual main loop.  I figure this means "I/O handlers" may
> get delayed while a nested event loop runs.  But better starve a bit
> than run in unexpected places.
> 
> >> Before this patch, the dispatcher runs in a bottom half in the main
> >> thread, in qemu_aio_context.
> >
> > The dispatcher BH is what is scheduled in iohandler_ctx. This means that
> > the BH is run from the main loop, but not from nested event loops.
> 
> Got it.
> 
> >> The patch moves it to a coroutine running in the main thread.  It runs
> >> in iohandler_ctx, but switches to qemu_aio_context for executing command
> >> handlers.
> >> 
> >> We want to keep command handlers running in qemu_aio_context, as before
> >> this patch.
> >
> > "Running in AioContext X" is kind of a sloppy term, and I'm afraid it
> > might be more confusing than helpful here. What this means is that the
> > code is run while holding the lock of the AioContext, and it registers
> > its BHs, callbacks etc. in that AioContext so it will be called from the
> > event loop of the respective thread.
> >
> > Before this patch, command handlers can't really use callbacks without a
> > nested event loop because they are synchronous.
> 
> I figure an example for such a nested event loop is BDRV_POLL_WHILE() in
> bdrv_truncate(), which runs in the command handler for block_resize.
> True?

Yes. I think most nested event loops are in the block layer, and they
use the BDRV_POLL_WHILE() or AIO_WAIT_WHILE() macros today.

> > The BQL is held, which
> > is equivalent to holding the qemu_aio_context lock.
> 
> Why is it equivalent?  Are they always taken together?

I guess ideally they would be. In practice, we neglect the
qemu_aio_context lock and rely on the BQL for everything in the main
thread. So maybe I should say the BQL replaces the AioContext lock for
the main context rather than being equivalent.

> > But other than that,
> > all of the definition of "running in an AioContext" doesn't really apply.
> >
> > Now with coroutines, you assign them an AioContext, which is the context
> > in which BHs reentering the coroutine will be scheduled, e.g. from
> > explicit aio_co_schedule(), after qemu_co_sleep_ns() or after waiting
> > for a lock like a CoMutex.
> >
> > qemu_aio_context is what most (if not all) of the existing QMP handlers
> > use when they run a nested event loop,
> 
> bdrv_truncate() appears to use bdrv_get_aio_context(), which is the
> BlockDriverState's AioContext if set, else @qemu_aio_context.

Right, the BDRV_POLL_WHILE() macro allows waiting for something in a
different AioContext from the main context. This works because of the
aio_wait_kick() in bdrv_truncate_co_entry(), which schedules a BH in the
main con

Re: [PATCH] iotests/279: Fix for non-qcow2 formats

2020-02-19 Thread Max Reitz
On 19.12.19 15:42, Max Reitz wrote:
> First, driver=qcow2 will not work so well with non-qcow2 formats (and
> this test claims to support qcow, qed, and vmdk).
> 
> Second, vmdk will always report the backing file format to be vmdk.
> Filter that out so the output looks like for all other formats.
> 
> Third, the flat vmdk subformats do not support backing files, so they
> will not work with this test.
> 
> Signed-off-by: Max Reitz 
> ---
>  tests/qemu-iotests/279 | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)

Applied to my block branch.

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 3/5] qemu/queue.h: add QLIST_SAFE_REMOVE()

2020-02-19 Thread Sergio Lopez
On Fri, Feb 14, 2020 at 05:17:10PM +, Stefan Hajnoczi wrote:
> QLIST_REMOVE() assumes the element is in a list.  It also leaves the
> element's linked list pointers dangling.
> 
> Introduce a safe version of QLIST_REMOVE() and convert open-coded
> instances of this pattern.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  block.c  |  5 +
>  chardev/spice.c  |  4 +---
>  include/qemu/queue.h | 14 ++
>  3 files changed, 16 insertions(+), 7 deletions(-)

Reviewed-by: Sergio Lopez 


signature.asc
Description: PGP signature


Re: [PATCH v2 0/5] block: Generic file creation fallback

2020-02-19 Thread Max Reitz
On 22.01.20 17:45, Max Reitz wrote:
> Hi,
> 
> As version 1, this series adds a fallback path for creating files (on
> the protocol layer) if the protocol driver does not support file
> creation, but the file already exists.
> 
> 
> Branch: https://github.com/XanClic/qemu.git skip-proto-create-v2
> Branch: https://git.xanclic.moe/XanClic/qemu.git skip-proto-create-v2
> 
> 
> v2:
> - Drop blk_truncate_for_formatting(): It doesn’t make sense to introduce
>   this function any more after 26536c7fc25917d1bd13781f81fe3ab039643bff
>   (“block: Do not truncate file node when formatting”), because we’d
>   only use it in bdrv_create_file_fallback().
>   Thus, it makes more sense to create special helper functions
>   specifically for bdrv_create_file_fallback().
> 
> - Thus, dropped patches 2 and 3.
> 
> - And changed patch 4 to include those helper functions.
> 
> - Rebased, which was a bit of a pain.

Thanks for the reviews, added a note to the new test why the second case
is expected to fail (as requested by Maxim), and applied the series to
my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 0/2] qemu-img: Fix convert -n -B for backing-less targets

2020-02-19 Thread Max Reitz
On 21.01.20 16:59, Max Reitz wrote:
> Hi,
> 
> When reviewing David’s series to add --target-is-zero convert, I looked
> for a case to show that the current implementation will crash if
> -n --target-is-zero is used together with -B.  It then turned out that
> -B will always crash when combined with -n and the target image does not
> have a backing file set in its image header.
> 
> This series fixes that.

Thanks for the review, applied to my block branch.

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 4/5] aio-posix: make AioHandler deletion O(1)

2020-02-19 Thread Sergio Lopez
On Fri, Feb 14, 2020 at 05:17:11PM +, Stefan Hajnoczi wrote:
> It is not necessary to scan all AioHandlers for deletion.  Keep a list
> of deleted handlers instead of scanning the full list of all handlers.
> 
> The AioHandler->deleted field can be dropped.  Let's check if the
> handler has been inserted into the deleted list instead.  Add a new
> QLIST_IS_INSERTED() API for this check.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  include/block/aio.h  |  6 -
>  include/qemu/queue.h |  3 +++
>  util/aio-posix.c | 53 +---
>  3 files changed, 43 insertions(+), 19 deletions(-)

Reviewed-by: Sergio Lopez 


signature.asc
Description: PGP signature


Re: [PATCH 0/2] block: Fix VM size field width in snapshot dump

2020-02-19 Thread Max Reitz
On 17.01.20 11:58, Max Reitz wrote:
> Hi,
> 
> https://bugs.launchpad.net/qemu/+bug/1859989 reports that fields in
> "qemu-img snapshot -l"s output are not always separated by spaces in
> 4.1.1.  Fix that.
> 
> 
> Branch: https://github.com/XanClic/qemu.git lp-1859989-v1
> Branch: https://git.xanclic.moe/XanClic/qemu.git lp-1859989-v1

Thanks for the review, I renamed the test from 284 to 286, and applied
the series to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v2 0/5] block: Generic file creation fallback

2020-02-19 Thread Maxim Levitsky
On Wed, 2020-02-19 at 11:38 +0100, Max Reitz wrote:
> On 22.01.20 17:45, Max Reitz wrote:
> > Hi,
> > 
> > As version 1, this series adds a fallback path for creating files (on
> > the protocol layer) if the protocol driver does not support file
> > creation, but the file already exists.
> > 
> > 
> > Branch: https://github.com/XanClic/qemu.git skip-proto-create-v2
> > Branch: https://git.xanclic.moe/XanClic/qemu.git skip-proto-create-v2
> > 
> > 
> > v2:
> > - Drop blk_truncate_for_formatting(): It doesn’t make sense to introduce
> >   this function any more after 26536c7fc25917d1bd13781f81fe3ab039643bff
> >   (“block: Do not truncate file node when formatting”), because we’d
> >   only use it in bdrv_create_file_fallback().
> >   Thus, it makes more sense to create special helper functions
> >   specifically for bdrv_create_file_fallback().
> > 
> > - Thus, dropped patches 2 and 3.
> > 
> > - And changed patch 4 to include those helper functions.
> > 
> > - Rebased, which was a bit of a pain.
> 
> Thanks for the reviews, added a note to the new test why the second case
> is expected to fail (as requested by Maxim), and applied the series to
> my block branch:
> 
> https://git.xanclic.moe/XanClic/qemu/commits/branch/block
> 
> Max
> 
Thank you too!
Best regards,
Maxim Levitsky




Re: [PATCH v2 0/2] finish qemu-nbd --partition deprecation

2020-02-19 Thread Max Reitz
On 31.01.20 18:11, Eric Blake wrote:
> ping

Do you want further review or is Ján’s sufficient for you?

Also, I wonder whether it would make a good GSoC/Outreachy/... project
to add partition reading support to the raw block driver, or whether
that’s a bad idea. O:-)

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll

2020-02-19 Thread Sergio Lopez
On Fri, Feb 14, 2020 at 05:17:12PM +, Stefan Hajnoczi wrote:
> File descriptor monitoring is O(1) with epoll(7), but
> aio_dispatch_handlers() still scans all AioHandlers instead of
> dispatching just those that are ready.  This makes aio_poll() O(n) with
> respect to the total number of registered handlers.
> 
> Add a local ready_list to aio_poll() so that each nested aio_poll()
> builds a list of handlers ready to be dispatched.  Since file descriptor
> polling is level-triggered, nested aio_poll() calls also see fds that
> were ready in the parent but not yet dispatched.  This guarantees that
> nested aio_poll() invocations will dispatch all fds, even those that
> became ready before the nested invocation.
> 
> Since only handlers ready to be dispatched are placed onto the
> ready_list, the new aio_dispatch_ready_handlers() function provides O(1)
> dispatch.
> 
> Note that AioContext polling is still O(n) and currently cannot be fully
> disabled.  This still needs to be fixed before aio_poll() is fully O(1).
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  util/aio-posix.c | 106 +--
>  1 file changed, 76 insertions(+), 30 deletions(-)

Reviewed-by: Sergio Lopez 


signature.asc
Description: PGP signature


Re: [PATCH] util/async: make bh_aio_poll() O(1)

2020-02-19 Thread Paolo Bonzini
Really a great idea, though I have some remarks on the implementation below.

On 19/02/20 11:00, Stefan Hajnoczi wrote:
> + * Each aio_bh_poll() call carves off a slice of the BH list.  This way newly
> + * scheduled BHs are not processed until the next aio_bh_poll() call.  This
> + * concept extends to nested aio_bh_poll() calls because slices are chained
> + * together.

This is the tricky part so I would expand a bit on why it's needed:

/*
 * Each aio_bh_poll() call carves off a slice of the BH list, so that
 * newly scheduled BHs are not processed until the next aio_bh_poll()
 * call.  All active aio_bh_poll() calls chained their slices together
 * in a list, so that nested aio_bh_poll() calls process all scheduled
 * bottom halves.
 */

> +struct BHListSlice {
> +QEMUBH *first_bh;
> +BHListSlice *next;
> +};
> +

Using QLIST and QSLIST removes the need to create your own lists, since
you can use QSLIST_MOVE_ATOMIC and QSLIST_INSERT_HEAD_ATOMIC.  For example:

struct BHListSlice {
QSLIST_HEAD(, QEMUBH) first_bh;
QLIST_ENTRY(BHListSlice) next;
};

...

QSLIST_HEAD(, QEMUBH) active_bh;
QLIST_HEAD(, BHListSlice) bh_list;

Related to this, in the aio_bh_poll() loop:

for (s = ctx->bh_list.next; s; s = s->next) {
}

You can actually do the removal inside the loop.  This is slightly more
efficient since you can remove slices early from the nested
aio_bh_poll().  Not that it's relevant for performance, but I think the
FIFO order for slices is also more intuitive than LIFO.

Putting this idea together with the QLIST one, you would get:

/*
 * If a bottom half causes a recursive call, this slice will be
 * removed by the nested aio_bh_poll().
 */
QSLIST_MOVE_ATOMIC(&slice.first_bh, ctx->active_bh);
QLIST_INSERT_TAIL(&ctx->bh_list, slice);
while ((s = QLIST_FIRST(&ctx->bh_list)) {
while ((bh = aio_bh_dequeue(&s, &flags))) {
}
QLIST_REMOVE_HEAD(s, next);
}

>  /* Multiple occurrences of aio_bh_poll cannot be called concurrently.
>   * The count in ctx->list_lock is incremented before the call, and is
>   * not affected by the call.

The second sentence is now stale.

Paolo

>   */
>  int aio_bh_poll(AioContext *ctx)
>  {
> -QEMUBH *bh, **bhp, *next;
> +BHListSlice slice;
> +BHListSlice **old_tail;
> +BHListSlice *s;
> +QEMUBH *bh;
> +unsigned flags;
>  int ret;
> -bool deleted = false;
> +
> +old_tail = push_bh_list_slice(ctx, &slice);
>  




Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE

2020-02-19 Thread Max Reitz
On 18.02.20 21:55, Eric Blake wrote:
> On 2/17/20 9:13 AM, Max Reitz wrote:
>> Hi,
>>
>> It’s my understanding that without some is_zero infrastructure for QEMU,
>> it’s impossible to implement this flag in qemu’s NBD server.
> 
> You're right that we may need some more infrastructure before being able
> to decide when to report this bit in all cases.  But for raw files, that
> infrastructure already exists: does block_status at offset 0 and the
> entire image as length return status that the entire file is a hole.

Hm, except that only works if the file is just completely unallocated.
Calling that existing infrastructure is a bit of a stretch, I think.

Or are you saying that bdrv_co_block_status(..., 0, bdrv_getlength(),
...) is our existing infrastructure?  Actually, why not.  Can we make
block drivers catch that special case?  (Or might the generic block code
somehow truncate such requests?)

> And
> for qcow2 files, it would not be that hard to teach a similar
> block_status request to report the entire image as a hole based on my
> proposed qcow2 autoclear bit tracking that the image still reads as zero.
> 
>>
>> At the same time, I still haven’t understood what we need the flag for.
>>
>> As far as I understood in our discussion on your qemu series, there is
>> no case where anyone would need to know whether an image is zero.  All
>> > practical cases involve someone having to ensure that some image is
>> zero.  Knowing whether an image is zero can help with that, but that can
>> be an implementation detail.
>>
>> For qcow2, the idea would be that there is some flag that remains true
>> as long as the image is guaranteed to be zero.  Then we’d have some
>> bdrv_make_zero function, and qcow2’s implementation would use this
>> information to gauge whether there’s something to do as all.
>>
>> For NBD, we cannot use this idea directly because to implement such a
>> flag (as you’re describing in this mail), we’d need separate is_zero
>> infrastructure, and that kind of makes the point of “drivers’
>> bdrv_make_zero() implementations do the right thing by themselves” moot.
> 
> We don't necessarily need a separate is_zero infrastructure if we can
> instead teach the existing block_status infrastructure to report that
> the entire image reads as zero.  You're right that clients that need to
> force an entire image to be zero won't need to directly call
> block_status (they can just call bdrv_make_zero, and let that worry
> about whether a block status call makes sense among its list of steps to
> try).  But since block_status can report all-zero status for some cases,
> it's not hard to use that for feeding the NBD bit.

OK.  I’m not 100% sure there’s nothing that would bite us in the butt
here, but I seem to remember you made all block_status things 64-bit, so
I suppose you know. :)

> However, there's a difference between qemu's block status (which is
> already typed correctly to return a 64-bit answer, even if it may need a
> few tweaks for clients that currently don't expect it to request more
> than 32 bits) and NBD's block status (which can only report 32 bits
> barring a new extension to the protocol), and where a single all-zero
> bit at NBD_OPT_GO is just as easy of an extension as a way to report a
> 64-bit all-zero response to NBD_CMD_BLOCK_STATUS.

Agreed.

>> OTOH, we wouldn’t need such a flag for the implementation, because we
>> could just send a 64-bit discard/make_zero over the whole block device
>> length to the NBD server, and then the server internally does the right
>> thing(TM).  AFAIU discard and write_zeroes currently have only 32 bit
>> length fields, but there were plans for adding support for 64 bit
>> versions anyway.  From my naïve outsider perspective, doing that doesn’t
>> seem a more complicated protocol addition than adding some way to tell
>> whether an NBD export is zero.
> 
> Adding 64-bit commands to NBD is more invasive than adding a single
> startup status bit.

True.  But if we/you want 64-bit commands anyway, then it doesn’t really
matter what’s more invasive.

> Both ideas can be done - doing one does not
> preclude the other.

Absolutely.  It’s just that if you do one anyway and it supersedes the
other, than we don’t have to do both.  Hence me wondering whether one
does supersede the other.

> But at the same time, not all servers will
> implement both ideas - if one is easy to implement while the other is
> hard, it is not unlikely that qemu will still encounter NBD servers that
> advertise startup state but not support 64-bit make_zero (even if qemu
> as NBD server starts supporting 64-bit make zero) or even 64-bit block
> status results.

Hm.  You know better than me whether that’s a good argument, because it
mostly depends on how many NBD server implementations there are;
specifically whether there are any that are decidedly not feature-complete.

> Another thing to think about here is timing.  With the proposed NBD
> addition, it is the server telling the cl

Re: [PATCH 5/5] aio-posix: make AioHandler dispatch O(1) with epoll

2020-02-19 Thread Paolo Bonzini
On 14/02/20 18:17, Stefan Hajnoczi wrote:
> +while ((node = QLIST_FIRST(ready_list))) {
> +QLIST_SAFE_REMOVE(node, node_ready);

Why does this need safe remove?

Paolo

> +progress = aio_dispatch_handler(ctx, node) || progress;
> +}




Re: [PATCH v3 02/12] ppc: Remove stub of PPC970 HID4 implementation

2020-02-19 Thread BALATON Zoltan

On Wed, 19 Feb 2020, David Gibson wrote:

The PowerPC 970 CPU was a cut-down POWER4, which had hypervisor capability.
However, it can be (and often was) strapped into "Apple mode", where the
hypervisor capabilities were disabled (essentially putting it always in
hypervisor mode).

That's actually the only mode of the 970 we support in qemu, and we're
unlikely to change that any time soon.  However, we do have a partial
implementation of the 970's HID4 register which affects things only
relevant for hypervisor mode.

That stub is also really ugly, since it attempts to duplicate the effects
of HID4 by re-encoding it into the LPCR register used in newer CPUs, but
in a really confusing way.

Just get rid of it.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Greg Kurz 
---
target/ppc/mmu-hash64.c | 28 +---
target/ppc/translate_init.inc.c | 17 ++---
2 files changed, 7 insertions(+), 38 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index da8966ccf5..a881876647 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1091,33 +1091,6 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)

/* Filter out bits */
switch (env->mmu_model) {
-case POWERPC_MMU_64B: /* 970 */
-if (val & 0x40) {
-lpcr |= LPCR_LPES0;
-}
-if (val & 0x8000ull) {
-lpcr |= LPCR_LPES1;
-}
-if (val & 0x20) {
-lpcr |= (0x4ull << LPCR_RMLS_SHIFT);
-}
-if (val & 0x4000ull) {
-lpcr |= (0x2ull << LPCR_RMLS_SHIFT);
-}
-if (val & 0x2000ull) {
-lpcr |= (0x1ull << LPCR_RMLS_SHIFT);
-}
-env->spr[SPR_RMOR] = ((lpcr >> 41) & 0xull) << 26;
-
-/*
- * XXX We could also write LPID from HID4 here
- * but since we don't tag any translation on it
- * it doesn't actually matter
- *
- * XXX For proper emulation of 970 we also need
- * to dig HRMOR out of HID5
- */
-break;
case POWERPC_MMU_2_03: /* P5p */
lpcr = val & (LPCR_RMLS | LPCR_ILE |
  LPCR_LPES0 | LPCR_LPES1 |
@@ -1154,6 +1127,7 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
}
break;
default:
+g_assert_not_reached();
;


Is this empty statement (lone semicolon) still needed now that you've 
added something to this case? Thought it was only there to be able to add 
a label to it so it could be removed now. (Does this count as a double ; 
that a recent patch was trying to fix?)


Regards,
BALATON Zoltan


}
env->spr[SPR_LPCR] = lpcr;
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index a0d0eaabf2..d7d4f012b8 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -7895,25 +7895,20 @@ static void spr_write_lpcr(DisasContext *ctx, int sprn, 
int gprn)
{
gen_helper_store_lpcr(cpu_env, cpu_gpr[gprn]);
}
-
-static void spr_write_970_hid4(DisasContext *ctx, int sprn, int gprn)
-{
-#if defined(TARGET_PPC64)
-spr_write_generic(ctx, sprn, gprn);
-gen_helper_store_lpcr(cpu_env, cpu_gpr[gprn]);
-#endif
-}
-
#endif /* !defined(CONFIG_USER_ONLY) */

static void gen_spr_970_lpar(CPUPPCState *env)
{
#if !defined(CONFIG_USER_ONLY)
/* Logical partitionning */
-/* PPC970: HID4 is effectively the LPCR */
+/* PPC970: HID4 covers things later controlled by the LPCR and
+ * RMOR in later CPUs, but with a different encoding.  We only
+ * support the 970 in "Apple mode" which has all hypervisor
+ * facilities disabled by strapping, so we can basically just
+ * ignore it */
spr_register(env, SPR_970_HID4, "HID4",
 SPR_NOACCESS, SPR_NOACCESS,
- &spr_read_generic, &spr_write_970_hid4,
+ &spr_read_generic, &spr_write_generic,
 0x);
#endif
}


Re: [PATCH v2 fixed 01/16] util: vfio-helpers: Factor out and fix processing of existing ram blocks

2020-02-19 Thread David Hildenbrand
On 19.02.20 09:43, David Hildenbrand wrote:
> On 18.02.20 23:00, Peter Xu wrote:
>> On Wed, Feb 12, 2020 at 02:42:39PM +0100, David Hildenbrand wrote:
>>> Factor it out into common code when a new notifier is registered, just
>>> as done with the memory region notifier. This allows us to have the
>>> logic about how to process existing ram blocks at a central place (which
>>> will be extended soon).
>>>
>>> Just like when adding a new ram block, we have to register the max_length
>>> for now. We don't have a way to get notified about resizes yet, and some
>>> memory would not be mapped when growing the ram block.
>>>
>>> Note: Currently, ram blocks are only "fake resized". All memory
>>> (max_length) is accessible.
>>>
>>> We can get rid of a bunch of functions in stubs/ram-block.c . Print the
>>> warning from inside qemu_vfio_ram_block_added().
> 
> [...]
> 
>>>  #include "exec/ramlist.h"
>>>  #include "exec/cpu-common.h"
>>>  
>>> -void *qemu_ram_get_host_addr(RAMBlock *rb)
>>> -{
>>> -return 0;
>>> -}
>>> -
>>> -ram_addr_t qemu_ram_get_offset(RAMBlock *rb)
>>> -{
>>> -return 0;
>>> -}
>>> -
>>> -ram_addr_t qemu_ram_get_used_length(RAMBlock *rb)
>>> -{
>>> -return 0;
>>> -}
>>
>> Maybe put into another patch?
>>
>> Actually I'm thinking whether it would worth to do...  They're still
>> declared in include/exec/cpu-common.h, so logically who includes the
>> header but linked against stubs can still call this function.  So
>> keeping them there still make sense to me.
> 
> Why keep dead code around? If you look closely, the stubs really only
> contain what's strictly necessary to make current code compile, not any
> available ramblock related function.
> 
> I don't see a good reason for a separate patch either (after all, we're
> removing the last users in this patch), but if more people agree, I can
> move it to a separate patch.

FWIW, moved it to a separate patch :)


-- 
Thanks,

David / dhildenb




Re: [PATCH v2] hw/i386: disable smbus migration for xenfv

2020-02-19 Thread Olaf Hering
Am Thu, 16 Jan 2020 19:26:39 +0100
schrieb Paolo Bonzini :

> On 16/01/20 19:03, Olaf Hering wrote:
>  [...]  
> 
> This patch is wrong; xenfv does not support cross-version migration
> compatibility.  Even if the migration stream does not change, the
> hardware exposed to the guest will.
> 
> My understanding is that Xen is able to use "-M
> pc-i440fx-VERSION,accel=xen".  The presence of the version in the
> machine type guarantees that the migration stream is compatible and that
> the hardware exposed to the guest is the same on the source and destination.

The current idea is to make 'xenfv' a copy of 'pc-i440fx-3.1'.
But is this actually the desired behavior?

Lets assume xenfv_machine_options calls pc_i440fx_5_0_machine_options.
What impact that that have on the result of pc_init1()?
Is any of the things done by pc_i440fx_5_0_machine_options and
pc_i440fx_machine_options a desired, or even breaking, change for the
current result of pc_xen_hvm_init?

Also, do the calls to compat_props_add have negative impact on compatibility
for running domUs?

Olaf


pgp9q3w2gqRaR.pgp
Description: Digitale Signatur von OpenPGP


Re: [PATCH] memory: batch allocate ioeventfds[] in address_space_update_ioeventfds()

2020-02-19 Thread Paolo Bonzini
On 18/02/20 19:22, Stefan Hajnoczi wrote:
> + * It is likely that the number of ioeventfds hasn't changed much, so use
> + * the previous size as the starting value.
> + */
> +ioeventfd_max = as->ioeventfd_nb;
> +ioeventfds = g_new(MemoryRegionIoeventfd, ioeventfd_max);

This would be a bit space-inefficient if we are adding just one ioeventfd,
because it would waste 64 entries right below.  I would like to squash this
if it's okay with you:

diff --git a/memory.c b/memory.c
index 2d6f931f8c..09be40edd2 100644
--- a/memory.c
+++ b/memory.c
@@ -801,9 +801,10 @@ static void address_space_update_ioeventfds(AddressSpace 
*as)
 
 /*
  * It is likely that the number of ioeventfds hasn't changed much, so use
- * the previous size as the starting value.
+ * the previous size as the starting value, with some headroom to avoid
+ * gratuitous reallocations.
  */
-ioeventfd_max = as->ioeventfd_nb;
+ioeventfd_max = QEMU_ALIGN_UP(as->ioeventfd_nb, 4);
 ioeventfds = g_new(MemoryRegionIoeventfd, ioeventfd_max);
 
 view = address_space_get_flatview(as);
@@ -815,7 +816,7 @@ static void address_space_update_ioeventfds(AddressSpace 
*as)
 if (addrrange_intersects(fr->addr, tmp)) {
 ++ioeventfd_nb;
 if (ioeventfd_nb > ioeventfd_max) {
-ioeventfd_max += 64;
+ioeventfd_max = MAX(ioeventfd_max * 2, 4);
 ioeventfds = g_realloc(ioeventfds,
 ioeventfd_max * sizeof(*ioeventfds));
 }

Thanks,

Paolo

>  view = address_space_get_flatview(as);
>  FOR_EACH_FLAT_RANGE(fr, view) {
>  for (i = 0; i < fr->mr->ioeventfd_nb; ++i) {
> @@ -806,8 +814,11 @@ static void address_space_update_ioeventfds(AddressSpace 
> *as)
>   
> int128_make64(fr->offset_in_region)));
>  if (addrrange_intersects(fr->addr, tmp)) {
>  ++ioeventfd_nb;
> -ioeventfds = g_realloc(ioeventfds,
> -  ioeventfd_nb * 
> sizeof(*ioeventfds));
> +if (ioeventfd_nb > ioeventfd_max) {
> +ioeventfd_max += 64;
> +ioeventfds = g_realloc(ioeventfds,
> +ioeventfd_max * sizeof(*ioeventfds));
> +}




Re: [PATCH] memory: batch allocate ioeventfds[] in address_space_update_ioeventfds()

2020-02-19 Thread Paolo Bonzini
On 19/02/20 10:18, Stefan Hajnoczi wrote:
>> ... do exponential increase here (max*=2) instead so still easy to
>> converge?
> I'm happy to tweak the policy.  Let's see what Paolo thinks.

I included Peter's suggestion in my own tweak.  Thanks to both of you!

Paolo




[PATCH v2 0/2] This small series does two things:

2020-02-19 Thread Kashyap Chamarthy
(1) Convert the original qemu-cpu-models.texi to rST

(2) In a separate patch, incorporate the additional new content from
this:
https://lists.nongnu.org/archive/html/qemu-devel/2020-01/msg06455.html
([PATCH v3] qemu-cpu-models: Document -noTSX, mds-no, taa-no, and
tsx-ctrl)

A Sphinx rendering of the converted doc is here:


https://kashyapc.fedorapeople.org/qemu_v4.2.0-1300-g5da20ed7e3_docs/system/qemu-cpu-models.html

Kashyap Chamarthy (2):
  docs: Convert qemu-cpu-models.texi to rST
  qemu-cpu-models.rst: Document -noTSX, mds-no, taa-no, and tsx-ctrl

 docs/qemu-cpu-models.texi   | 677 
 docs/system/index.rst   |   1 +
 docs/system/qemu-cpu-models.rst | 547 ++
 3 files changed, 548 insertions(+), 677 deletions(-)
 delete mode 100644 docs/qemu-cpu-models.texi
 create mode 100644 docs/system/qemu-cpu-models.rst

-- 
2.21.0




[PATCH v2 2/2] qemu-cpu-models.rst: Document -noTSX, mds-no, taa-no, and tsx-ctrl

2020-02-19 Thread Kashyap Chamarthy
- Add the '-noTSX' variants for CascadeLake and SkyLake.

- Document the three MSR bits: 'mds-no', 'taa-no', and 'tsx-ctrl'

  Two confusing things about 'mds-no' (and the first point applies to
  the other two MSRs too):

  (1) The 'mds-no' bit will _not_ show up in the guest's /proc/cpuinfo.
  Rather it is used to fill in the guest's sysfs:

/sys/devices/system/cpu/vulnerabilities/mds:Not affected

  Paolo confirmed on IRC as such.

  (2) There are _three_ variants[+] of CascadeLake CPUs, with different
  stepping levels: 5, 6, and 7.  To quote wikichip.org[*]:

"note that while steppings 6 & 7 are fully mitigated, earlier
stepping 5 is not protected against MSBDS, MLPDS, nor MDSUM"

  The above is also indicated in the Intel's document[+], as
  indicated by "No" under the three columns of MFBDS, MSBDS, and
  MLPDS.

  I've expressed this in the docs without belabouring the details.

  [+] 
https://software.intel.com/security-software-guidance/insights/processors-affected-microarchitectural-data-sampling
  [*] 
https://en.wikichip.org/wiki/intel/microarchitectures/cascade_lake#Key_changes_from_Skylake

Signed-off-by: Kashyap Chamarthy 
---
[Retaining the change-log from the Texinfo-variant of this patch, which
is now no-longer needed:
https://lists.nongnu.org/archive/html/qemu-devel/2020-01/msg06455.html]

v3:
 - Address feedback from Paolo
 - Add URL to the deep-dive on Intel's MDS
v2:
 - Address feedback from DanPB
 - Add sections on 'taa-no' and 'tsx-ctrl'
Signed-off-by: Kashyap Chamarthy 
---
 docs/system/qemu-cpu-models.rst | 61 ++---
 1 file changed, 56 insertions(+), 5 deletions(-)

diff --git a/docs/system/qemu-cpu-models.rst b/docs/system/qemu-cpu-models.rst
index 2332fba1f9..5030dbef64 100644
--- a/docs/system/qemu-cpu-models.rst
+++ b/docs/system/qemu-cpu-models.rst
@@ -51,15 +51,18 @@ mixture of host CPU models between machines, if live 
migration
 compatibility is required, use the newest CPU model that is compatible
 across all desired hosts.
 
+* Intel Xeon Processor (Cascade Lake, 2019), with "stepping" levels 6 or
+  7 only.  (The Cascade Lake Xeon processor with *stepping 5 is
+  vulnerable to MDS variants*.)
+
+  * ``Cascadelake-Server``
+  * ``Cascadelake-Server-noTSX``
+
 * Intel Xeon Processor (Skylake, 2016)
 
   * ``Skylake-Server``
   * ``Skylake-Server-IBRS``
-
-* Intel Core Processor (Skylake, 2015)
-
-  * ``Skylake-Client``
-  * ``Skylake-Client-IBRS``
+  * ``Skylake-Server-IBRS-noTSX``
 
 * Intel Core Processor (Broadwell, 2014)
 
@@ -172,6 +175,54 @@ features are included if using "Host passthrough" or "Host 
model".
   Requires the host CPU microcode to support this feature before it
   can be used for guest CPUs.
 
+``mds-no``
+  Recommended to inform the guest OS that the host is *not* vulnerable
+  to any of the MDS variants ([MFBDS] CVE-2018-12130, [MLPDS]
+  CVE-2018-12127, [MSBDS] CVE-2018-12126).
+
+  This is an MSR (Model-Specific Register) feature rather than a CPUID feature,
+  so it will not appear in the Linux ``/proc/cpuinfo`` in the host or
+  guest.  Instead, the host kernel uses it to populate the MDS
+  vulnerability file in ``sysfs``.
+
+  So it should only be enabled for VMs if the host reports @code{Not
+  affected} in the ``/sys/devices/system/cpu/vulnerabilities/mds`` file.
+
+``taa-no``
+  Recommended to inform that the guest that the host is ``not``
+  vulnerable to CVE-2019-11135, TSX Asynchronous Abort (TAA).
+
+  This too is an MSR feature, so it does not show up in the Linux
+  ``/proc/cpuinfo`` in the host or guest.
+
+  It should only be enabled for VMs if the host reports ``Not affected``
+  in the ``/sys/devices/system/cpu/vulnerabilities/tsx_async_abort``
+  file.
+
+``tsx-ctrl``
+  Recommended to inform the guest that it can disable the Intel TSX
+  (Transactional Synchronization Extensions) feature; or, if the
+  processor is vulnerable, use the Intel VERW instruction (a
+  processor-level instruction that performs checks on memory access) as
+  a mitigation for the TAA vulnerability.  (For details, refer to this
+  `Intel's deep-dive into
+  MDS 
`_.)
+
+  Expose this to the guest OS if and only if: (a) the host has TSX
+  enabled; *and* (b) the guest has ``rtm`` CPU flag enabled.
+
+  By disabling TSX, KVM-based guests can avoid paying the price of
+  mitigting TSX-based attacks.
+
+  Note that ``tsx-ctrl`` too is an MSR feature, so it does not show
+  up in the Linux ``/proc/cpuinfo`` in the host or guest.
+
+  To validate that Intel TSX is indeed disabled for the guest, there are
+  two ways: (a) check for the *absence* of ``rtm`` in the guest's
+  ``/proc/cpuinfo``; or (b) the
+  ``/sys/devices/system/cpu/vulnerabilities/tsx_async_abort`` file in
+  the guest should report ``Mitigation: TSX disabled``.
+
 
 Preferred CPU models for 

[PATCH v2 1/2] docs: Convert qemu-cpu-models.texi to rST

2020-02-19 Thread Kashyap Chamarthy
This doc was originally written by Daniel P. Berrangé
, introduced via commit[1]: 2544e9e4aa (docs: add
guidance on configuring CPU models for x86, 2018-06-27).

This is a 1-1 conversion of Texinfo to rST, besides a couple of minor
non-content tweaks that are too trivial to spell out.  Further
modifications will be done via a separate patch.

(Thanks to Stephen Finucane on IRC for the suggestion to use rST
"definition lists" instead of bullets in some places.)

[1] https://git.qemu.org/?p=qemu.git;a=commit;h=2544e9e4aa

Signed-off-by: Kashyap Chamarthy 
---
 docs/qemu-cpu-models.texi   | 677 
 docs/system/index.rst   |   1 +
 docs/system/qemu-cpu-models.rst | 496 +++
 3 files changed, 497 insertions(+), 677 deletions(-)
 delete mode 100644 docs/qemu-cpu-models.texi
 create mode 100644 docs/system/qemu-cpu-models.rst

diff --git a/docs/qemu-cpu-models.texi b/docs/qemu-cpu-models.texi
deleted file mode 100644
index f88a1def0d..00
--- a/docs/qemu-cpu-models.texi
+++ /dev/null
@@ -1,677 +0,0 @@
-@c man begin SYNOPSIS
-QEMU / KVM CPU model configuration
-@c man end
-
-@set qemu_system_x86 qemu-system-x86_64
-
-@c man begin DESCRIPTION
-
-@menu
-* recommendations_cpu_models_x86::  Recommendations for KVM CPU model 
configuration on x86 hosts
-* recommendations_cpu_models_MIPS:: Supported CPU model configurations on MIPS 
hosts
-* cpu_model_syntax_apps::   Syntax for configuring CPU models
-@end menu
-
-QEMU / KVM virtualization supports two ways to configure CPU models
-
-@table @option
-
-@item Host passthrough
-
-This passes the host CPU model features, model, stepping, exactly to the
-guest. Note that KVM may filter out some host CPU model features if they
-cannot be supported with virtualization. Live migration is unsafe when
-this mode is used as libvirt / QEMU cannot guarantee a stable CPU is
-exposed to the guest across hosts. This is the recommended CPU to use,
-provided live migration is not required.
-
-@item Named model
-
-QEMU comes with a number of predefined named CPU models, that typically
-refer to specific generations of hardware released by Intel and AMD.
-These allow the guest VMs to have a degree of isolation from the host CPU,
-allowing greater flexibility in live migrating between hosts with differing
-hardware.
-@end table
-
-In both cases, it is possible to optionally add or remove individual CPU
-features, to alter what is presented to the guest by default.
-
-Libvirt supports a third way to configure CPU models known as "Host model".
-This uses the QEMU "Named model" feature, automatically picking a CPU model
-that is similar the host CPU, and then adding extra features to approximate
-the host model as closely as possible. This does not guarantee the CPU family,
-stepping, etc will precisely match the host CPU, as they would with "Host
-passthrough", but gives much of the benefit of passthrough, while making
-live migration safe.
-
-@node recommendations_cpu_models_x86
-@subsection Recommendations for KVM CPU model configuration on x86 hosts
-
-The information that follows provides recommendations for configuring
-CPU models on x86 hosts. The goals are to maximise performance, while
-protecting guest OS against various CPU hardware flaws, and optionally
-enabling live migration between hosts with heterogeneous CPU models.
-
-@menu
-* preferred_cpu_models_intel_x86::   Preferred CPU models for Intel x86 
hosts
-* important_cpu_features_intel_x86:: Important CPU features for Intel x86 
hosts
-* preferred_cpu_models_amd_x86:: Preferred CPU models for AMD x86 hosts
-* important_cpu_features_amd_x86::   Important CPU features for AMD x86 
hosts
-* default_cpu_models_x86::   Default x86 CPU models
-* other_non_recommended_cpu_models_x86:: Other non-recommended x86 CPUs
-@end menu
-
-@node preferred_cpu_models_intel_x86
-@subsubsection Preferred CPU models for Intel x86 hosts
-
-The following CPU models are preferred for use on Intel hosts. Administrators /
-applications are recommended to use the CPU model that matches the generation
-of the host CPUs in use. In a deployment with a mixture of host CPU models
-between machines, if live migration compatibility is required, use the newest
-CPU model that is compatible across all desired hosts.
-
-@table @option
-@item @code{Skylake-Server}
-@item @code{Skylake-Server-IBRS}
-
-Intel Xeon Processor (Skylake, 2016)
-
-
-@item @code{Skylake-Client}
-@item @code{Skylake-Client-IBRS}
-
-Intel Core Processor (Skylake, 2015)
-
-
-@item @code{Broadwell}
-@item @code{Broadwell-IBRS}
-@item @code{Broadwell-noTSX}
-@item @code{Broadwell-noTSX-IBRS}
-
-Intel Core Processor (Broadwell, 2014)
-
-
-@item @code{Haswell}
-@item @code{Haswell-IBRS}
-@item @code{Haswell-noTSX}
-@item @code{Haswell-noTSX-IBRS}
-
-Intel Core Processor (Haswell, 2013)
-
-
-@item @code{IvyBridge}
-@item @code{IvyBridge-IBRS}
-
-Intel Xeon E3-12xx v2 (Ivy Bridge, 2012)
-
-
-@

Re: [PATCH v2 0/2] This small series does two things:

2020-02-19 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20200219114607.1855-1-kcham...@redhat.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=a83564e4ec67469a8e4f46fecf45c31f', '-u', 
'1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-7boyfjhq/src/docker-src.2020-02-19-07.03.50.909:/var/tmp/qemu:z,ro',
 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=a83564e4ec67469a8e4f46fecf45c31f
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-7boyfjhq/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real2m4.583s
user0m7.607s


The full log is available at
http://patchew.org/logs/20200219114607.1855-1-kcham...@redhat.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

qemu-cpu-models: Convert to rST; document other MSR bits [Was: Re: [PATCH v2 0/2] This small series does two things:]

2020-02-19 Thread Kashyap Chamarthy
[Fix the subject line of the cover letter; sorry, it was my first time
(mis-)use of `git-publish`.]

On Wed, Feb 19, 2020 at 12:46:05PM +0100, Kashyap Chamarthy wrote:
> (1) Convert the original qemu-cpu-models.texi to rST
> 
> (2) In a separate patch, incorporate the additional new content from
> this:
> https://lists.nongnu.org/archive/html/qemu-devel/2020-01/msg06455.html
> ([PATCH v3] qemu-cpu-models: Document -noTSX, mds-no, taa-no, and
> tsx-ctrl)
> 
> A Sphinx rendering of the converted doc is here:
> 
> 
> https://kashyapc.fedorapeople.org/qemu_v4.2.0-1300-g5da20ed7e3_docs/system/qemu-cpu-models.html
> 
> Kashyap Chamarthy (2):
>   docs: Convert qemu-cpu-models.texi to rST
>   qemu-cpu-models.rst: Document -noTSX, mds-no, taa-no, and tsx-ctrl
> 
>  docs/qemu-cpu-models.texi   | 677 
>  docs/system/index.rst   |   1 +
>  docs/system/qemu-cpu-models.rst | 547 ++
>  3 files changed, 548 insertions(+), 677 deletions(-)
>  delete mode 100644 docs/qemu-cpu-models.texi
>  create mode 100644 docs/system/qemu-cpu-models.rst
> 
> -- 
> 2.21.0
> 

-- 
/kashyap




Re: [PATCH] pcie_root_port: Add disable_hotplug option

2020-02-19 Thread Julia Suvorova
On Wed, Feb 19, 2020 at 4:47 AM Michael S. Tsirkin  wrote:
>
> On Tue, Feb 18, 2020 at 10:02:19PM -0500, Laine Stump wrote:
> > Also, is there a rhyme/reason for some options having true/false, and some
> > being off/on? disable-acs seems to be true/false, but disable-modern is
> > on/off. Doesn't make any difference to me in the end, but just thought I'd
> > bring it up in case there might be a reason to use on/off instead of
> > true/false for this one.
>
> Some places accept on/off, some true/false, some on/off/true/false
> others on/off/yes/no and others on/off/true/false/yes/no.
>
> In this case both user visitor machinery. Which I *think*
> means on/off is the safe choice and true/false can be
> broken in some places.
>
> We really should clean up this mess ... Julia, what do you think?
> Let's make them all support all options?

Options already support all of on/off/true/false/yes/no as long as
they are defined as boolean (look at parse_type_bool()). That is, you
can use disable-modern with yes/no/true/false too.
The only problem is with types OnOffSplit and OnOffAuto (as in disable-legacy).

Best regards, Julia Suvorova.




Re: [PATCH v5 79/79] tests:numa-test: use explicit memdev to specify node RAM

2020-02-19 Thread Igor Mammedov
On Tue, 18 Feb 2020 18:51:34 +0100
Philippe Mathieu-Daudé  wrote:

> On 2/17/20 6:34 PM, Igor Mammedov wrote:
> > Follow up patches will remove automatic RAM distribution
> > between nodes and will make default machine types require
> > "memdev" option instead of legacy "mem" option.  
> 
> Can we keep this patch for the follow up?
memdev for numa was there for along time, just untested.
With this all numa tests switch to it instead of using
legacy option (+ a test for legacy option).
I don't think the patch should delayed along with numa
cleanups.

It of-cause could be posted as standalone patch as well,
I'll leave it upto Paolo whether to merge it or not.
 
> > 
> > Make tests to follow new rules and add an additional test
> > for legacy "mem" option on old machine type, to make sure
> > it won't regress in the future.
> > 
> > Signed-off-by: Igor Mammedov 
> > Acked-by: Thomas Huth 
> > ---  
> 




Re: [PATCH v3 1/4] luks: extract qcrypto_block_calculate_payload_offset()

2020-02-19 Thread Max Reitz
On 11.02.20 17:03, Stefan Hajnoczi wrote:
> The qcow2 .bdrv_measure() code calculates the crypto payload offset.
> This logic really belongs in crypto/block.c where it can be reused by
> other image formats.
> 
> The "luks" block driver will need this same logic in order to implement
> .bdrv_measure(), so extract the qcrypto_block_calculate_payload_offset()
> function now.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  block/qcow2.c  | 74 +++---
>  crypto/block.c | 40 +++
>  include/crypto/block.h | 22 +
>  3 files changed, 81 insertions(+), 55 deletions(-)

[...]

> diff --git a/crypto/block.c b/crypto/block.c
> index 325752871c..a9e1b8cc36 100644
> --- a/crypto/block.c
> +++ b/crypto/block.c
> @@ -115,6 +115,46 @@ QCryptoBlock 
> *qcrypto_block_create(QCryptoBlockCreateOptions *options,

[...]

> +bool
> +qcrypto_block_calculate_payload_offset(QCryptoBlockCreateOptions 
> *create_opts,
> +   const char *optprefix,
> +   size_t *len,
> +   Error **errp)
> +{
> +QCryptoBlock *crypto;
> +bool ok;
> +
> +/* Fake LUKS creation in order to determine the payload size */
> +crypto = qcrypto_block_create(create_opts, optprefix,
> +  qcrypto_block_headerlen_hdr_init_func,
> +  qcrypto_block_headerlen_hdr_write_func,
> +  len, errp);
> +ok = crypto != NULL;
> +qcrypto_block_free(crypto);
> +return ok;

Speaking of g_autoptr...  Would g_autoptr(QCryptoBlock) crypto; suffice
to contract these three lines into “return crypto != NULL;”?

Either way:

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v2 1/2] docs: Convert qemu-cpu-models.texi to rST

2020-02-19 Thread Peter Maydell
On Wed, 19 Feb 2020 at 11:46, Kashyap Chamarthy  wrote:
>
> This doc was originally written by Daniel P. Berrangé
> , introduced via commit[1]: 2544e9e4aa (docs: add
> guidance on configuring CPU models for x86, 2018-06-27).
>
> This is a 1-1 conversion of Texinfo to rST, besides a couple of minor
> non-content tweaks that are too trivial to spell out.  Further
> modifications will be done via a separate patch.
>
> (Thanks to Stephen Finucane on IRC for the suggestion to use rST
> "definition lists" instead of bullets in some places.)
>
> [1] https://git.qemu.org/?p=qemu.git;a=commit;h=2544e9e4aa
>
> Signed-off-by: Kashyap Chamarthy 
> ---
>  docs/qemu-cpu-models.texi   | 677 
>  docs/system/index.rst   |   1 +
>  docs/system/qemu-cpu-models.rst | 496 +++
>  3 files changed, 497 insertions(+), 677 deletions(-)
>  delete mode 100644 docs/qemu-cpu-models.texi
>  create mode 100644 docs/system/qemu-cpu-models.rst

Hi; I haven't looked in detail at the actual conversion
parts, but from the diffstat you're missing some parts:

1) qemu-doc.texi has an "@include docs/qemu-cpu-models.texi"
that needs to be removed. That then means that the
"CPU models" section in qemu-doc.texi is empty, so we
can just delete it (the @node and @section directives,
and the reference to it in the earlier @menu)
(I'm surprised this didn't cause 'make' to fail with
an error trying to build the texi docs.)

2) The bit of Makefile which lists the dependencies of
qemu-doc.html needs to be updated to remove
docs/qemu-cpu-models.texi.

3) we create a qemu-cpu-models.7 manpage, so the parts
of Makefile that currently handle creating it from the
texi need to be changed to do it from the rST instead.
You can look at how we handle qemu-block-drivers.7
for an example. Don't forget that you'll need to add
a line to docs/system/conf.py to get Sphinx to build
the manpage, as well as the makefile changes.
You can check how the manpage renders with
'man -l /path/to/builddir/docs/interop/qemu-cpu-models.7'

4) The qemu-cpu-models.texi uses a substitution
"@set qemu_system_x86 qemu-system-x86-64" so that
downstream RedHat can easily update the examples text
to refer to whatever they rename the binary to. The
equivalent of this in rST you can see in qemu-block-drivers.rst:
at the top of the file we have

.. |qemu_system| replace:: qemu-system-x86_64

and then an example block using it is like:

.. parsed-literal::

  |qemu_system| linux.img -hdb fat:/my_directory

(you have to use a parsed-literal block and not the
usual :: so that the expansion is done.)

thanks
-- PMM



Re: [PATCH v2 14/22] qemu-iotests/199: better catch postcopy time

2020-02-19 Thread Andrey Shinkevich

On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

The test aims to test _postcopy_ migration, and wants to do some write
operations during postcopy time.

Test considers migrate status=complete event on source as start of
postcopy. This is completely wrong, completion is completion of the
whole migration process. Let's instead consider destination start as
start of postcopy, and use RESUME event for it.

Next, as migration finish, let's use migration status=complete event on
target, as such method is closer to what libvirt or another user will
do, than tracking number of dirty-bitmaps.

Finally, add a possibility to dump events for debug. And if
set debug to True, we see, that actual postcopy period is very small
relatively to the whole test duration time (~0.2 seconds to >40 seconds
for me). This means, that test is very inefficient in what it supposed
to do. Let's improve it in following commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/199 | 72 +-
  1 file changed, 57 insertions(+), 15 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index dda918450a..6599fc6fb4 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -20,17 +20,43 @@
  
  import os

  import iotests
-import time
  from iotests import qemu_img
  
+debug = False

+
  disk_a = os.path.join(iotests.test_dir, 'disk_a')
  disk_b = os.path.join(iotests.test_dir, 'disk_b')
  size = '256G'
  fifo = os.path.join(iotests.test_dir, 'mig_fifo')
  
  
+def event_seconds(event):

+return event['timestamp']['seconds'] + \
+event['timestamp']['microseconds'] / 100.0
+
+
+def event_dist(e1, e2):
+return event_seconds(e2) - event_seconds(e1)
+
+
  class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  def tearDown(self):

It's common to put the definition of setUp() ahead


+if debug:
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_b_events += self.vm_b.get_qmp_events()
+for e in self.vm_a_events:
+e['vm'] = 'SRC'
+for e in self.vm_b_events:
+e['vm'] = 'DST'
+events = (self.vm_a_events + self.vm_b_events)
+events = [(e['timestamp']['seconds'],
+   e['timestamp']['microseconds'],
+   e['vm'],
+   e['event'],
+   e.get('data', '')) for e in events]
+for e in sorted(events):
+print('{}.{:06} {} {} {}'.format(*e))
+
  self.vm_a.shutdown()
  self.vm_b.shutdown()
  os.remove(disk_a)
@@ -47,6 +73,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  self.vm_a.launch()
  self.vm_b.launch()
  
+# collect received events for debug

+self.vm_a_events = []
+self.vm_b_events = []
+
  def test_postcopy(self):
  write_size = 0x4000
  granularity = 512
@@ -77,15 +107,13 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
  s += 0x1
  
-bitmaps_cap = {'capability': 'dirty-bitmaps', 'state': True}

-events_cap = {'capability': 'events', 'state': True}
+caps = [{'capability': 'dirty-bitmaps', 'state': True},

The name "capabilities" would be an appropriate identifier.


+{'capability': 'events', 'state': True}]
  
-result = self.vm_a.qmp('migrate-set-capabilities',

-   capabilities=[bitmaps_cap, events_cap])
+result = self.vm_a.qmp('migrate-set-capabilities', capabilities=caps)
  self.assert_qmp(result, 'return', {})
  
-result = self.vm_b.qmp('migrate-set-capabilities',

-   capabilities=[bitmaps_cap])
+result = self.vm_b.qmp('migrate-set-capabilities', capabilities=caps)
  self.assert_qmp(result, 'return', {})
  
  result = self.vm_a.qmp('migrate', uri='exec:cat>' + fifo)

@@ -94,24 +122,38 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  result = self.vm_a.qmp('migrate-start-postcopy')
  self.assert_qmp(result, 'return', {})
  
-while True:

-event = self.vm_a.event_wait('MIGRATION')
-if event['data']['status'] == 'completed':
-break
+e_resume = self.vm_b.event_wait('RESUME')

"event_resume" gives a faster understanding


+self.vm_b_events.append(e_resume)
  
  s = 0x8000

  while s < write_size:
  self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
  s += 0x1
  
+match = {'data': {'status': 'completed'}}

+e_complete = self.vm_b.event_wait('MIGRATION', match=match)

"event_complete" also


+self.vm_b_events.append(e_complete)
+
+# take queued event, should already been

RE: RFC: Split EPT huge pages in advance of dirty logging

2020-02-19 Thread Zhoujian (jay)
Hi Peter,

> -Original Message-
> From: Peter Xu [mailto:pet...@redhat.com]
> Sent: Wednesday, February 19, 2020 1:43 AM
> To: Zhoujian (jay) 
> Cc: k...@vger.kernel.org; qemu-devel@nongnu.org; pbonz...@redhat.com;
> dgilb...@redhat.com; quint...@redhat.com; Liujinsong (Paul)
> ; linfeng (M) ; wangxin (U)
> ; Huangweidong (C)
> 
> Subject: Re: RFC: Split EPT huge pages in advance of dirty logging
> 
> On Tue, Feb 18, 2020 at 01:13:47PM +, Zhoujian (jay) wrote:
> > Hi all,
> >
> > We found that the guest will be soft-lockup occasionally when live
> > migrating a 60 vCPU, 512GiB huge page and memory sensitive VM. The
> > reason is clear, almost all of the vCPUs are waiting for the KVM MMU
> > spin-lock to create 4K SPTEs when the huge pages are write protected. This
> phenomenon is also described in this patch set:
> > https://patchwork.kernel.org/cover/11163459/
> > which aims to handle page faults in parallel more efficiently.
> >
> > Our idea is to use the migration thread to touch all of the guest
> > memory in the granularity of 4K before enabling dirty logging. To be
> > more specific, we split all the PDPE_LEVEL SPTEs into DIRECTORY_LEVEL
> > SPTEs as the first step, and then split all the DIRECTORY_LEVEL SPTEs into
> PAGE_TABLE_LEVEL SPTEs as the following step.
> 
> IIUC, QEMU will prefer to use huge pages for all the anonymous ramblocks
> (please refer to ram_block_add):
> 
> qemu_madvise(new_block->host, new_block->max_length,
> QEMU_MADV_HUGEPAGE);

Yes, you're right

> 
> Another alternative I can think of is to add an extra parameter to QEMU to
> explicitly disable huge pages (so that can even be MADV_NOHUGEPAGE
> instead of MADV_HUGEPAGE).  However that should also drag down the
> performance for the whole lifecycle of the VM.  

From the performance point of view, it is better to keep the huge pages
when the VM is not in the live migration state.

> A 3rd option is to make a QMP
> command to dynamically turn huge pages on/off for ramblocks globally.

We're searching a dynamic method too.
We plan to add two new flags for each memory slot, say
KVM_MEM_FORCE_PT_DIRECTORY_PAGES and
KVM_MEM_FORCE_PT_PAGE_TABLE_PAGES. These flags can be set
through KVM_SET_USER_MEMORY_REGION ioctl.

The mapping_level which is called by tdp_page_fault in the kernel side
will return PT_DIRECTORY_LEVEL if the
KVM_MEM_FORCE_PT_DIRECTORY_PAGES flag of the memory slot is
set, and return PT_PAGE_TABLE_LEVEL if the
KVM_MEM_FORCE_PT_PAGE_TABLE_PAGES flag is set.
 
The key steps to split the huge pages in advance of enabling dirty log is
as follows:
1. The migration thread in user space uses
KVM_SET_USER_MEMORY_REGION ioctl to set the
KVM_MEM_FORCE_PT_DIRECTORY_PAGES flag for each memory slot.
2. The migration thread continues to use the KVM_SPLIT_HUGE_PAGES
ioctl (which is newly added) to do the splitting of large pages in the
kernel side.
3. A new vCPU is created temporally(do some initialization but will not
run) to help to do the work, i.e. as the parameter of the tdp_page_fault.
4. Collect the GPA ranges of all the memory slots with the
KVM_MEM_FORCE_PT_DIRECTORY_PAGES flag set.
5. Split the 1G huge pages(collected in step 4) into 2M by calling
tdp_page_fault, since the mapping_level will return
PT_DIRECTORY_LEVEL. Here is the main difference from the usual
path which is caused by the Guest side(EPT violation/misconfig etc),
we call it directly in the hypervisor side.
6. Do some cleanups, i.e. free the vCPU related resources
7. The KVM_SPLIT_HUGE_PAGES ioctl returned to the user space side.
8. Using KVM_MEM_FORCE_PT_PAGE_TABLE_PAGES instread of
KVM_MEM_FORCE_PT_DIRECTORY_PAGES to repeat step 1 ~ step 7,
in step 5 the 2M huge pages will be splitted into 4K pages.
9. Clear the KVM_MEM_FORCE_PT_DIRECTORY_PAGES and
KVM_MEM_FORCE_PT_PAGE_TABLE_PAGES flags for each memory slot.
10. Then the migration thread calls the log_start ioctl to enable the dirty
logging, and the remaining thing is the same.

What's your take on this, thanks.

Regards,
Jay Zhou

> Haven't thought deep into any of them, but seems doable.
> 
> Thanks,
> 
> --
> Peter Xu



Re: [PATCH v2 00/22] Fix error handling during bitmap postcopy

2020-02-19 Thread Andrey Shinkevich




On 18/02/2020 23:57, Eric Blake wrote:

On 2/18/20 2:02 PM, Andrey Shinkevich wrote:

qemu-iotests:$ ./check -qcow2
PASSED
(except always failed 261 and 272)


Have you reported those failures on the threads that introduced those 
tests?




Not yet unfortunately. I have not investigated the case.
"$ ./check -qcow2 261" dumps

+od: unrecognized option '--endian=big'
+Try 'od --help' for more information.
+od: invalid -N argument '--endian=big'
+qemu-img: Could not open 'TEST_DIR/t.IMGFMT': IMGFMT header exceeds 
cluster size


and "$ ./check -qcow2 272" dumps

+od: unrecognized option '--endian=big'
+Try 'od --help' for more information.
+od: invalid -N argument '--endian=big'
+qemu-io: can't open device .../qemu/tests/qemu-iotests/scratch/t.qcow2: 
Image is not in qcow2 format


--
With the best regards,
Andrey Shinkevich



[PATCH 0/2] block/curl: Improve HTTP header parsing

2020-02-19 Thread David Edmondson
An HTTP object store of my acquaintance returns "accept-ranges: bytes"
(all lower case) as a header, causing the QEMU curl backend to refuse
to talk to it. RFC 7230 says that HTTP headers are case insensitive,
so update the curl backend accordingly.

At the same time, allow for arbitrary white space around the HTTP
header field value, as required by the RFC.

David Edmondson (2):
  block/curl: HTTP header fields allow whitespace around values
  block/curl: HTTP header field names are case insensitive

 block/curl.c | 31 +++
 1 file changed, 27 insertions(+), 4 deletions(-)

-- 
2.24.1




[PATCH 2/2] block/curl: HTTP header field names are case insensitive

2020-02-19 Thread David Edmondson
RFC 7230 section 3.2 indicates that HTTP header field names are case
insensitive.

Signed-off-by: David Edmondson 
---
 block/curl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 0cf99a4b31b8..4256659cd85b 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -216,11 +216,11 @@ static size_t curl_header_cb(void *ptr, size_t size, 
size_t nmemb, void *opaque)
 size_t realsize = size * nmemb;
 const char *header = (char *)ptr;
 const char *end = header + realsize;
-const char *accept_ranges = "Accept-Ranges:";
+const char *accept_ranges = "accept-ranges:";
 const char *bytes = "bytes";
 
 if (realsize >= strlen(accept_ranges)
-&& strncmp(header, accept_ranges, strlen(accept_ranges)) == 0) {
+&& strncasecmp(header, accept_ranges, strlen(accept_ranges)) == 0) {
 
 char *p = strchr(header, ':') + 1;
 
-- 
2.24.1




[PATCH 1/2] block/curl: HTTP header fields allow whitespace around values

2020-02-19 Thread David Edmondson
RFC 7230 section 3.2 indicates that whitespace is permitted between
the field name and field value and after the field value.

Signed-off-by: David Edmondson 
---
 block/curl.c | 31 +++
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index f86299378e38..0cf99a4b31b8 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -214,11 +214,34 @@ static size_t curl_header_cb(void *ptr, size_t size, 
size_t nmemb, void *opaque)
 {
 BDRVCURLState *s = opaque;
 size_t realsize = size * nmemb;
-const char *accept_line = "Accept-Ranges: bytes";
+const char *header = (char *)ptr;
+const char *end = header + realsize;
+const char *accept_ranges = "Accept-Ranges:";
+const char *bytes = "bytes";
 
-if (realsize >= strlen(accept_line)
-&& strncmp((char *)ptr, accept_line, strlen(accept_line)) == 0) {
-s->accept_range = true;
+if (realsize >= strlen(accept_ranges)
+&& strncmp(header, accept_ranges, strlen(accept_ranges)) == 0) {
+
+char *p = strchr(header, ':') + 1;
+
+/* Skip whitespace between the header name and value. */
+while (p < end && *p && isspace(*p)) {
+p++;
+}
+
+if (end - p >= strlen(bytes)
+&& strncmp(p, bytes, strlen(bytes)) == 0) {
+
+/* Check that there is nothing but whitespace after the value. */
+p += strlen(bytes);
+while (p < end && *p && isspace(*p)) {
+p++;
+}
+
+if (p == end || !*p) {
+s->accept_range = true;
+}
+}
 }
 
 return realsize;
-- 
2.24.1




Re: [PATCH v2 0/2] finish qemu-nbd --partition deprecation

2020-02-19 Thread Eric Blake

On 2/19/20 4:53 AM, Max Reitz wrote:

On 31.01.20 18:11, Eric Blake wrote:

ping


Do you want further review or is Ján’s sufficient for you?


Commit 0bc16997 has already landed, so no further review will show in 
git history. But you're always welcome to raise issues that might result 
in follow-up patches.




Also, I wonder whether it would make a good GSoC/Outreachy/... project
to add partition reading support to the raw block driver, or whether
that’s a bad idea. O:-)


Personally, I think that since nbdkit already provides a good partition 
filter [1], any other implementation of a partition filter is duplicated 
effort.  The only reason to teach qemu how to access an arbitrary 
partition is if it makes serving that arbitrary partition to a guest 
more efficient than what is currently possible by pointing qemu at an 
NBD server run by nbdkit with its partition filter.  But in general, you 
are more likely to want qemu to serve an entire disk image file to the 
guest (and let the guest interpret partitions embedded within that file) 
than to have qemu serve a single partition to a guest (and have the 
guest treat that partition as a bare disk), so I don't see it as 
something that would be frequently used or needed in qemu.


[1] http://libguestfs.org/nbdkit-partition-filter.1.html

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v2 00/22] Fix error handling during bitmap postcopy

2020-02-19 Thread Eric Blake

On 2/19/20 7:25 AM, Andrey Shinkevich wrote:



On 18/02/2020 23:57, Eric Blake wrote:

On 2/18/20 2:02 PM, Andrey Shinkevich wrote:

qemu-iotests:$ ./check -qcow2
PASSED
(except always failed 261 and 272)


Have you reported those failures on the threads that introduced those 
tests?




Not yet unfortunately. I have not investigated the case.
"$ ./check -qcow2 261" dumps

+od: unrecognized option '--endian=big'
+Try 'od --help' for more information.
+od: invalid -N argument '--endian=big'
+qemu-img: Could not open 'TEST_DIR/t.IMGFMT': IMGFMT header exceeds 
cluster size


Which version of od are you using?  I do recall wondering whether 
reliance on the GNU coreutils extension --endian=big was going to cause 
problems later - well, here we are, it's later :)


https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg06781.html



and "$ ./check -qcow2 272" dumps

+od: unrecognized option '--endian=big'
+Try 'od --help' for more information.
+od: invalid -N argument '--endian=big'


Yay, same problem for both tests.  Fix common.rc once, and both tests 
should start working for you.


+qemu-io: can't open device .../qemu/tests/qemu-iotests/scratch/t.qcow2: 
Image is not in qcow2 format




--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v5 27/79] arm/palm: use memdev for RAM

2020-02-19 Thread Igor Mammedov
On Tue, 18 Feb 2020 18:22:06 +0100
Philippe Mathieu-Daudé  wrote:

> On 2/17/20 6:34 PM, Igor Mammedov wrote:
> > memory_region_allocate_system_memory() API is going away, so
> > replace it with memdev allocated MemoryRegion. The later is
> > initialized by generic code, so board only needs to opt in
> > to memdev scheme by providing
> >MachineClass::default_ram_id
> > and using MachineState::ram instead of manually initializing
> > RAM memory region.
> > 
> > PS:
> >   while at it add check for user supplied RAM size and error
> >   out if it mismatches board expected value.
> > 
> > Signed-off-by: Igor Mammedov 
> > Reviewed-by: Andrew Jones 
> > ---
> > v2:
> >* fix format string causing build failure on 32-bit host
> >  (Philippe Mathieu-Daudé )
> > 
> > CC: balr...@gmail.com
> > ---
> >   hw/arm/palm.c | 20 ++--
> >   1 file changed, 14 insertions(+), 6 deletions(-)
> > 
> > diff --git a/hw/arm/palm.c b/hw/arm/palm.c
> > index 72eca8cc55..388b2629a5 100644
> > --- a/hw/arm/palm.c
> > +++ b/hw/arm/palm.c
> > @@ -31,6 +31,7 @@
> >   #include "hw/loader.h"
> >   #include "exec/address-spaces.h"
> >   #include "cpu.h"
> > +#include "qemu/cutils.h"
> >   
> >   static uint64_t static_read(void *opaque, hwaddr offset, unsigned size)
> >   {
> > @@ -181,7 +182,6 @@ static void palmte_gpio_setup(struct omap_mpu_state_s 
> > *cpu)
> >   
> >   static struct arm_boot_info palmte_binfo = {
> >   .loader_start = OMAP_EMIFF_BASE,
> > -.ram_size = 0x0200,  
> 
> Again, this is incorrect. Used by hw/arm/boot::do_cpu_reset().
Thanks,
fixed in v6

> 
> >   .board_id = 0x331,
> >   };
> >   
> > @@ -195,15 +195,21 @@ static void palmte_init(MachineState *machine)
> >   static uint32_t cs2val = 0xe1a0;
> >   static uint32_t cs3val = 0xe1a0e1a0;
> >   int rom_size, rom_loaded = 0;
> > -MemoryRegion *dram = g_new(MemoryRegion, 1);
> > +MachineClass *mc = MACHINE_GET_CLASS(machine);
> >   MemoryRegion *flash = g_new(MemoryRegion, 1);
> >   MemoryRegion *cs = g_new(MemoryRegion, 4);
> >   
> > -memory_region_allocate_system_memory(dram, NULL, "omap1.dram",
> > - palmte_binfo.ram_size);
> > -memory_region_add_subregion(address_space_mem, OMAP_EMIFF_BASE, dram);
> > +if (machine->ram_size != mc->default_ram_size) {
> > +char *sz = size_to_str(mc->default_ram_size);
> > +error_report("Invalid RAM size, should be %s", sz);
> > +g_free(sz);
> > +exit(EXIT_FAILURE);
> > +}
> > +
> > +memory_region_add_subregion(address_space_mem, OMAP_EMIFF_BASE,
> > +machine->ram);
> >   
> > -mpu = omap310_mpu_init(dram, machine->cpu_type);
> > +mpu = omap310_mpu_init(machine->ram, machine->cpu_type);
> >   
> >   /* External Flash (EMIFS) */
> >   memory_region_init_ram(flash, NULL, "palmte.flash", flash_size,
> > @@ -265,6 +271,8 @@ static void palmte_machine_init(MachineClass *mc)
> >   mc->init = palmte_init;
> >   mc->ignore_memory_transaction_failures = true;
> >   mc->default_cpu_type = ARM_CPU_TYPE_NAME("ti925t");
> > +mc->default_ram_size = 0x0200;
> > +mc->default_ram_id = "omap1.dram";
> >   }
> >   
> >   DEFINE_MACHINE("cheetah", palmte_machine_init)
> >   
> 




Re: [PATCH v5 11/79] arm/collie: use memdev for RAM

2020-02-19 Thread Igor Mammedov
On Tue, 18 Feb 2020 18:16:14 +0100
Philippe Mathieu-Daudé  wrote:

> Hi Igor,
> 
> On 2/17/20 6:33 PM, Igor Mammedov wrote:
> > memory_region_allocate_system_memory() API is going away, so
> > replace it with memdev allocated MemoryRegion. The later is
> > initialized by generic code, so board only needs to opt in
> > to memdev scheme by providing
> >MachineClass::default_ram_id
> > and using MachineState::ram instead of manually initializing
> > RAM memory region.
> > 
> > PS:
> >   - while at it add check for user supplied RAM size and error
> > out if it mismatches board expected value.
> >   - introduce RAM_ADDR_UFMT to avoid build errors on 32-bit hosts
> > when specifying format string for ram_addr_t type  
> 
> Since v3 the 2nd comment is not valid anymore (also in other patches 
> from this series).

I looks like the last remnant of RAM_ADDR_UFMT
I'll remove it in v6

> > 
> > Signed-off-by: Igor Mammedov 
> > Reviewed-by: Andrew Jones 
> > ---
> > v2:
> >* fix format string causing build failure on 32-bit host
> >  (Philippe Mathieu-Daudé )
> > v3:
> >* instead of RAM_ADDR_UFMT adding use size_to_str()
> >   Philippe Mathieu-Daudé 
> > ---
> >   hw/arm/collie.c | 17 -
> >   1 file changed, 12 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/arm/collie.c b/hw/arm/collie.c
> > index 970a4405cc..024893fc9e 100644
> > --- a/hw/arm/collie.c
> > +++ b/hw/arm/collie.c
> > @@ -10,6 +10,7 @@
> >*/
> >   #include "qemu/osdep.h"
> >   #include "qemu/units.h"
> > +#include "qemu/cutils.h"
> >   #include "hw/sysbus.h"
> >   #include "hw/boards.h"
> >   #include "strongarm.h"
> > @@ -20,20 +21,24 @@
> >   
> >   static struct arm_boot_info collie_binfo = {
> >   .loader_start = SA_SDCS0,
> > -.ram_size = 0x2000,  
> 
> This doesn't seem correct, this field is used do_cpu_reset() -> 
> set_kernel_args() (see hw/arm/boot.c).

Thanks,
fixed in v6.

I'll respin series as already several amended patches accumulated by this time.

> 
> >   };
> >   
> >   static void collie_init(MachineState *machine)
> >   {
> >   StrongARMState *s;
> >   DriveInfo *dinfo;
> > -MemoryRegion *sdram = g_new(MemoryRegion, 1);
> > +MachineClass *mc = MACHINE_GET_CLASS(machine);
> > +
> > +if (machine->ram_size != mc->default_ram_size) {
> > +char *sz = size_to_str(mc->default_ram_size);
> > +error_report("Invalid RAM size, should be %s", sz);
> > +g_free(sz);
> > +exit(EXIT_FAILURE);
> > +}
> >   
> >   s = sa1110_init(machine->cpu_type);
> >   
> > -memory_region_allocate_system_memory(sdram, NULL, "strongarm.sdram",
> > - collie_binfo.ram_size);
> > -memory_region_add_subregion(get_system_memory(), SA_SDCS0, sdram);
> > +memory_region_add_subregion(get_system_memory(), SA_SDCS0, 
> > machine->ram);
> >   
> >   dinfo = drive_get(IF_PFLASH, 0, 0);
> >   pflash_cfi01_register(SA_CS0, "collie.fl1", 0x0200,
> > @@ -57,6 +62,8 @@ static void collie_machine_init(MachineClass *mc)
> >   mc->init = collie_init;
> >   mc->ignore_memory_transaction_failures = true;
> >   mc->default_cpu_type = ARM_CPU_TYPE_NAME("sa1110");
> > +mc->default_ram_size = 0x2000;
> > +mc->default_ram_id = "strongarm.sdram";
> >   }
> >   
> >   DEFINE_MACHINE("collie", collie_machine_init)
> >   
> 
> 




Re: [PATCH v5 34/79] arm/xilinx_zynq: drop RAM size fixup

2020-02-19 Thread Igor Mammedov
On Tue, 18 Feb 2020 18:23:47 +0100
Philippe Mathieu-Daudé  wrote:

> On 2/17/20 6:34 PM, Igor Mammedov wrote:
> > If user provided non-sense RAM size, board will complain and
> > continue running with max RAM size supported.
> > Also RAM is going to be allocated by generic code, so it won't be
> > possible for board to fix things up for user.
> > 
> > Make it error message and exit to force user fix CLI,
> > instead of accepting non-sense CLI values.
> > 
> > Signed-off-by: Igor Mammedov 
> > Reviewed-by: Alistair Francis 
> > ---
> >   hw/arm/xilinx_zynq.c | 16 
> >   1 file changed, 8 insertions(+), 8 deletions(-)
> > 
> > diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
> > index 3a0fa5b23f..df950fc400 100644
> > --- a/hw/arm/xilinx_zynq.c
> > +++ b/hw/arm/xilinx_zynq.c
> > @@ -158,7 +158,6 @@ static inline void zynq_init_spi_flashes(uint32_t 
> > base_addr, qemu_irq irq,
> >   
> >   static void zynq_init(MachineState *machine)
> >   {
> > -ram_addr_t ram_size = machine->ram_size;
> >   ARMCPU *cpu;
> >   MemoryRegion *address_space_mem = get_system_memory();
> >   MemoryRegion *ext_ram = g_new(MemoryRegion, 1);
> > @@ -168,6 +167,12 @@ static void zynq_init(MachineState *machine)
> >   qemu_irq pic[64];
> >   int n;
> >   
> > +/* max 2GB ram */
> > +if (machine->ram_size > 0x8000) {
> > +error_report("RAM size more than %d is not supported", 
> > 0x8000);  
> 
> Eh? Maybe:
> 
> if (machine->ram_size > 2 * GiB) {
> error_report("RAM size more than 2 GiB is not supported");

amended in v6

> > +exit(EXIT_FAILURE);
> > +}
> > +
> >   cpu = ARM_CPU(object_new(machine->cpu_type));
> >   
> >   /* By default A9 CPUs have EL3 enabled.  This board does not
> > @@ -184,14 +189,9 @@ static void zynq_init(MachineState *machine)
> >   &error_fatal);
> >   object_property_set_bool(OBJECT(cpu), true, "realized", &error_fatal);
> >   
> > -/* max 2GB ram */
> > -if (ram_size > 0x8000) {
> > -ram_size = 0x8000;
> > -}
> > -
> >   /* DDR remapped to address zero.  */
> >   memory_region_allocate_system_memory(ext_ram, NULL, "zynq.ext_ram",
> > - ram_size);
> > + machine->ram_size);
> >   memory_region_add_subregion(address_space_mem, 0, ext_ram);
> >   
> >   /* 256K of on-chip memory */
> > @@ -300,7 +300,7 @@ static void zynq_init(MachineState *machine)
> >   sysbus_connect_irq(busdev, 0, pic[40 - IRQ_OFFSET]);
> >   sysbus_mmio_map(busdev, 0, 0xF8007000);
> >   
> > -zynq_binfo.ram_size = ram_size;
> > +zynq_binfo.ram_size = machine->ram_size;
> >   zynq_binfo.nb_cpus = 1;
> >   zynq_binfo.board_id = 0xd32;
> >   zynq_binfo.loader_start = 0;
> >   
> 




Re: [PATCH v2 00/22] Fix error handling during bitmap postcopy

2020-02-19 Thread Andrey Shinkevich




On 19/02/2020 16:36, Eric Blake wrote:

On 2/19/20 7:25 AM, Andrey Shinkevich wrote:



On 18/02/2020 23:57, Eric Blake wrote:

On 2/18/20 2:02 PM, Andrey Shinkevich wrote:

qemu-iotests:$ ./check -qcow2
PASSED
(except always failed 261 and 272)


Have you reported those failures on the threads that introduced those 
tests?




Not yet unfortunately. I have not investigated the case.
"$ ./check -qcow2 261" dumps

+od: unrecognized option '--endian=big'
+Try 'od --help' for more information.
+od: invalid -N argument '--endian=big'
+qemu-img: Could not open 'TEST_DIR/t.IMGFMT': IMGFMT header exceeds 
cluster size


Which version of od are you using?  I do recall wondering whether 


$ od --version
od (GNU coreutils) 8.22
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later...

reliance on the GNU coreutils extension --endian=big was going to cause 
problems later - well, here we are, it's later :)


https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg06781.html



and "$ ./check -qcow2 272" dumps

+od: unrecognized option '--endian=big'
+Try 'od --help' for more information.
+od: invalid -N argument '--endian=big'


Yay, same problem for both tests.  Fix common.rc once, and both tests 
should start working for you.


Thank you Eric! I want to sort it out later...



+qemu-io: can't open device 
.../qemu/tests/qemu-iotests/scratch/t.qcow2: Image is not in qcow2 format






--
With the best regards,
Andrey Shinkevich




RE: [PATCH 1/3] arm_gic: Mask the un-supported priority bits

2020-02-19 Thread Sai Pavan Boddu
Hi Peter,

All your suggestions look good, I will send at V2. But  I think I have done a 
mistake in V1, More comments inline below.

> -Original Message-
> From: Peter Maydell 
> Sent: Tuesday, February 18, 2020 11:40 PM
> To: Sai Pavan Boddu 
> Cc: Edgar E . Iglesias ; Alistair Francis
> ; Anthony Liguori ;
> Andreas Färber ; qemu-arm ;
> QEMU Developers 
> Subject: Re: [PATCH 1/3] arm_gic: Mask the un-supported priority bits
> 
> On Fri, 14 Feb 2020 at 13:21, Sai Pavan Boddu
>  wrote:
> >
> > Priority bits implemented in arm-gic can 8 to 4, un-implemented bits
> > are read as zeros(RAZ).
> 
> This is nice to see -- I've known our GIC was a bit out-of-spec in this area 
> but
> it's good to see it's less painful to retrofit than I thought it might be.
> 
> > Signed-off-by: Sai Pavan Boddu 
> > ---
> >  hw/intc/arm_gic.c| 9 ++---
> >  hw/intc/arm_gic_common.c | 1 +
> >  include/hw/intc/arm_gic_common.h | 1 +
> >  3 files changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c index
> > 1d7da7b..8875330 100644
> > --- a/hw/intc/arm_gic.c
> > +++ b/hw/intc/arm_gic.c
> > @@ -43,6 +43,9 @@
> >  }   \
> >  } while (0)
> >
> > +#define UMASK(n) \
> > +1 << n) - 1) << (8 - n)) & 0xFF)
> 
> This is a bit confusingly named (usually 'umask' is the file-permission mask 
> on
> unix). I think it's worth following the pattern used in
> hw/intc/arm_gicv3_cpuif.c for icv_fullprio_mask(), and using a function with
> a comment describing it. Also, you've not considered the virtualization parts
> of the GIC, which also use these codepaths. In those cases, the value of the
> mask is always based on GIC_VIRT_MAX_GROUP_PRIO_BITS of priority (a
> GICv2 has 5 bits of priority in the VGIC, always). So:
> 
> static uint32_t gic_fullprio_mask(GICState *s, int cpu) {
> /*
>  * Return a mask word which clears the unimplemented priority
>  * bits from a priority value for an interrupt. (Not to be
>  * confused with the group priority, whose mask depends on BPR.)
>  */
> int pribits;
> 
> if (gic_is_vcpu(cpu)) {
> pribits = GIC_VIRT_MAX_GROUP_PRIO_BITS;
> } else {
> pribits = s->n_prio_bits;
> }
> return ~0U << (8 - s->n_prio_bits);
> }
> 
> > +
> >  static const uint8_t gic_id_11mpcore[] = {
> >  0x00, 0x00, 0x00, 0x00, 0x90, 0x13, 0x04, 0x00, 0x0d, 0xf0, 0x05,
> > 0xb1  }; @@ -652,9 +655,9 @@ void gic_dist_set_priority(GICState *s,
> > int cpu, int irq, uint8_t val,
> >  }
> >
> >  if (irq < GIC_INTERNAL) {
> > -s->priority1[irq][cpu] = val;
> > +s->priority1[irq][cpu] = val & UMASK(s->n_prio_bits) ;
> >  } else {
> > -s->priority2[(irq) - GIC_INTERNAL] = val;
> > +s->priority2[(irq) - GIC_INTERNAL] = val &
> > + UMASK(s->n_prio_bits);
> >  }
> >  }
> 
> Slightly cleaner to just put
>val &= gic_fullprio_mask(s);
> before the if() rather than doing the same thing in both branches.
> 
> >
> > @@ -684,7 +687,7 @@ static void gic_set_priority_mask(GICState *s, int
> cpu, uint8_t pmask,
> >  return;
> >  }
> >  }
> > -s->priority_mask[cpu] = pmask;
> > +s->priority_mask[cpu] = pmask & UMASK(s->n_prio_bits);
[Sai Pavan Boddu] mask should be applied in " gic_dist_get_priority ",  as we 
miss group priority bits here.
Let me know, if my understanding is correct.

Thanks for the review.

Regards,
Sai Pavan
> >  }
> >
> >  static uint32_t gic_get_priority_mask(GICState *s, int cpu,
> > MemTxAttrs attrs) diff --git a/hw/intc/arm_gic_common.c
> > b/hw/intc/arm_gic_common.c index e6c4fe7..e4668c7 100644
> > --- a/hw/intc/arm_gic_common.c
> > +++ b/hw/intc/arm_gic_common.c
> > @@ -357,6 +357,7 @@ static Property arm_gic_common_properties[] = {
> >  DEFINE_PROP_BOOL("has-security-extensions", GICState, security_extn,
> 0),
> >  /* True if the GIC should implement the virtualization extensions */
> >  DEFINE_PROP_BOOL("has-virtualization-extensions", GICState,
> > virt_extn, 0),
> > +DEFINE_PROP_UINT32("num-prio-bits", GICState, n_prio_bits, 8),
> 
> In patch 2 you use "num-priority-bits" for the proprety name on the
> a9mpcore object. I like that better, and I think we should name the property
> the same thing on both devices.
> 
> You should have some code in the realize method which sanity checks the
> n_prio_bits value we are passed. It can't be more than 8, and I'm not sure
> what the lowest valid value is. Your commit message says 4. I'm pretty sure
> that if the GIC has the virt extensions then it can't be less than
> GIC_VIRT_MAX_GROUP_PRIO_BITS (ie 5).
> 
> thanks
> -- PMM


Re: [PATCH v2 00/22] Fix error handling during bitmap postcopy

2020-02-19 Thread Eric Blake

On 2/19/20 7:36 AM, Eric Blake wrote:


+od: unrecognized option '--endian=big'
+Try 'od --help' for more information.
+od: invalid -N argument '--endian=big'
+qemu-img: Could not open 'TEST_DIR/t.IMGFMT': IMGFMT header exceeds 
cluster size


Which version of od are you using?  I do recall wondering whether 
reliance on the GNU coreutils extension --endian=big was going to cause 
problems later - well, here we are, it's later :)


https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg06781.html


coreutils documents that od --endian was added in 8.23, released in 
2014-07-18.  Per https://wiki.qemu.org/Supported_Build_Platforms, we 
still have support for RHEL 7 through 2022, and it was first released 
2014-06-09 (all other supported distros have newer releases, but I 
didn't check what coreutils version they included, or even if the BSD 
builds which don't use coreutils would also be impacted by this 
problem).  Still, I'd like to know your specific setup, and why the CI 
tools have not flagged it.


But even one counterexample within the bounds of our supported distro 
page is a good argument that use of od --endian is not yet portable. 
Or, if your setup is not on the supported page, it becomes a question of 
whether it should be added or whether you should upgrade to something 
that is supported.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 2/2] block/curl: HTTP header field names are case insensitive

2020-02-19 Thread Philippe Mathieu-Daudé

Hi David,

On 2/19/20 2:27 PM, David Edmondson wrote:

RFC 7230 section 3.2 indicates that HTTP header field names are case
insensitive.

Signed-off-by: David Edmondson 
---
  block/curl.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 0cf99a4b31b8..4256659cd85b 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -216,11 +216,11 @@ static size_t curl_header_cb(void *ptr, size_t size, 
size_t nmemb, void *opaque)
  size_t realsize = size * nmemb;
  const char *header = (char *)ptr;
  const char *end = header + realsize;
-const char *accept_ranges = "Accept-Ranges:";
+const char *accept_ranges = "accept-ranges:";
  const char *bytes = "bytes";
  
  if (realsize >= strlen(accept_ranges)

-&& strncmp(header, accept_ranges, strlen(accept_ranges)) == 0) {
+&& strncasecmp(header, accept_ranges, strlen(accept_ranges)) == 0) {


Can you use g_ascii_strncasecmp() instead?

  
  char *p = strchr(header, ':') + 1;
  






Re: [PATCH v5 11/79] arm/collie: use memdev for RAM

2020-02-19 Thread Philippe Mathieu-Daudé

On 2/19/20 2:44 PM, Igor Mammedov wrote:

On Tue, 18 Feb 2020 18:16:14 +0100
Philippe Mathieu-Daudé  wrote:


Hi Igor,

On 2/17/20 6:33 PM, Igor Mammedov wrote:

memory_region_allocate_system_memory() API is going away, so
replace it with memdev allocated MemoryRegion. The later is
initialized by generic code, so board only needs to opt in
to memdev scheme by providing
MachineClass::default_ram_id
and using MachineState::ram instead of manually initializing
RAM memory region.

PS:
   - while at it add check for user supplied RAM size and error
 out if it mismatches board expected value.
   - introduce RAM_ADDR_UFMT to avoid build errors on 32-bit hosts
 when specifying format string for ram_addr_t type


Since v3 the 2nd comment is not valid anymore (also in other patches
from this series).


I looks like the last remnant of RAM_ADDR_UFMT
I'll remove it in v6


Thanks.





Signed-off-by: Igor Mammedov 
Reviewed-by: Andrew Jones 
---
v2:
* fix format string causing build failure on 32-bit host
  (Philippe Mathieu-Daudé )
v3:
* instead of RAM_ADDR_UFMT adding use size_to_str()
   Philippe Mathieu-Daudé 
---
   hw/arm/collie.c | 17 -
   1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/hw/arm/collie.c b/hw/arm/collie.c
index 970a4405cc..024893fc9e 100644
--- a/hw/arm/collie.c
+++ b/hw/arm/collie.c
@@ -10,6 +10,7 @@
*/
   #include "qemu/osdep.h"
   #include "qemu/units.h"
+#include "qemu/cutils.h"
   #include "hw/sysbus.h"
   #include "hw/boards.h"
   #include "strongarm.h"
@@ -20,20 +21,24 @@
   
   static struct arm_boot_info collie_binfo = {

   .loader_start = SA_SDCS0,
-.ram_size = 0x2000,


This doesn't seem correct, this field is used do_cpu_reset() ->
set_kernel_args() (see hw/arm/boot.c).


Thanks,
fixed in v6.

I'll respin series as already several amended patches accumulated by this time.


With the arm_boot_info ram_size unmodified, please add
Reviewed-by: Philippe Mathieu-Daudé 






   };
   
   static void collie_init(MachineState *machine)

   {
   StrongARMState *s;
   DriveInfo *dinfo;
-MemoryRegion *sdram = g_new(MemoryRegion, 1);
+MachineClass *mc = MACHINE_GET_CLASS(machine);
+
+if (machine->ram_size != mc->default_ram_size) {
+char *sz = size_to_str(mc->default_ram_size);
+error_report("Invalid RAM size, should be %s", sz);
+g_free(sz);
+exit(EXIT_FAILURE);
+}
   
   s = sa1110_init(machine->cpu_type);
   
-memory_region_allocate_system_memory(sdram, NULL, "strongarm.sdram",

- collie_binfo.ram_size);
-memory_region_add_subregion(get_system_memory(), SA_SDCS0, sdram);
+memory_region_add_subregion(get_system_memory(), SA_SDCS0, machine->ram);
   
   dinfo = drive_get(IF_PFLASH, 0, 0);

   pflash_cfi01_register(SA_CS0, "collie.fl1", 0x0200,
@@ -57,6 +62,8 @@ static void collie_machine_init(MachineClass *mc)
   mc->init = collie_init;
   mc->ignore_memory_transaction_failures = true;
   mc->default_cpu_type = ARM_CPU_TYPE_NAME("sa1110");
+mc->default_ram_size = 0x2000;
+mc->default_ram_id = "strongarm.sdram";
   }
   
   DEFINE_MACHINE("collie", collie_machine_init)
   










Re: [PATCH v5 79/79] tests:numa-test: use explicit memdev to specify node RAM

2020-02-19 Thread Philippe Mathieu-Daudé

On 2/19/20 2:00 PM, Igor Mammedov wrote:

On Tue, 18 Feb 2020 18:51:34 +0100
Philippe Mathieu-Daudé  wrote:


On 2/17/20 6:34 PM, Igor Mammedov wrote:

Follow up patches will remove automatic RAM distribution
between nodes and will make default machine types require
"memdev" option instead of legacy "mem" option.


Can we keep this patch for the follow up?

memdev for numa was there for along time, just untested.
With this all numa tests switch to it instead of using
legacy option (+ a test for legacy option).
I don't think the patch should delayed along with numa
cleanups.


I guess what confuses me is "Follow up patches *will* remove..."



It of-cause could be posted as standalone patch as well,
I'll leave it upto Paolo whether to merge it or not.
  


Make tests to follow new rules and add an additional test
for legacy "mem" option on old machine type, to make sure
it won't regress in the future.

Signed-off-by: Igor Mammedov 
Acked-by: Thomas Huth 
---









Re: [PATCH v5 27/79] arm/palm: use memdev for RAM

2020-02-19 Thread Philippe Mathieu-Daudé

On 2/19/20 2:44 PM, Igor Mammedov wrote:

On Tue, 18 Feb 2020 18:22:06 +0100
Philippe Mathieu-Daudé  wrote:


On 2/17/20 6:34 PM, Igor Mammedov wrote:

memory_region_allocate_system_memory() API is going away, so
replace it with memdev allocated MemoryRegion. The later is
initialized by generic code, so board only needs to opt in
to memdev scheme by providing
MachineClass::default_ram_id
and using MachineState::ram instead of manually initializing
RAM memory region.

PS:
   while at it add check for user supplied RAM size and error
   out if it mismatches board expected value.

Signed-off-by: Igor Mammedov 
Reviewed-by: Andrew Jones 
---
v2:
* fix format string causing build failure on 32-bit host
  (Philippe Mathieu-Daudé )

CC: balr...@gmail.com
---
   hw/arm/palm.c | 20 ++--
   1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/hw/arm/palm.c b/hw/arm/palm.c
index 72eca8cc55..388b2629a5 100644
--- a/hw/arm/palm.c
+++ b/hw/arm/palm.c
@@ -31,6 +31,7 @@
   #include "hw/loader.h"
   #include "exec/address-spaces.h"
   #include "cpu.h"
+#include "qemu/cutils.h"
   
   static uint64_t static_read(void *opaque, hwaddr offset, unsigned size)

   {
@@ -181,7 +182,6 @@ static void palmte_gpio_setup(struct omap_mpu_state_s *cpu)
   
   static struct arm_boot_info palmte_binfo = {

   .loader_start = OMAP_EMIFF_BASE,
-.ram_size = 0x0200,


Again, this is incorrect. Used by hw/arm/boot::do_cpu_reset().

Thanks,
fixed in v6


With the arm_boot_info ram_size unmodified, please add
Reviewed-by: Philippe Mathieu-Daudé 






   .board_id = 0x331,
   };
   
@@ -195,15 +195,21 @@ static void palmte_init(MachineState *machine)

   static uint32_t cs2val = 0xe1a0;
   static uint32_t cs3val = 0xe1a0e1a0;
   int rom_size, rom_loaded = 0;
-MemoryRegion *dram = g_new(MemoryRegion, 1);
+MachineClass *mc = MACHINE_GET_CLASS(machine);
   MemoryRegion *flash = g_new(MemoryRegion, 1);
   MemoryRegion *cs = g_new(MemoryRegion, 4);
   
-memory_region_allocate_system_memory(dram, NULL, "omap1.dram",

- palmte_binfo.ram_size);
-memory_region_add_subregion(address_space_mem, OMAP_EMIFF_BASE, dram);
+if (machine->ram_size != mc->default_ram_size) {
+char *sz = size_to_str(mc->default_ram_size);
+error_report("Invalid RAM size, should be %s", sz);
+g_free(sz);
+exit(EXIT_FAILURE);
+}
+
+memory_region_add_subregion(address_space_mem, OMAP_EMIFF_BASE,
+machine->ram);
   
-mpu = omap310_mpu_init(dram, machine->cpu_type);

+mpu = omap310_mpu_init(machine->ram, machine->cpu_type);
   
   /* External Flash (EMIFS) */

   memory_region_init_ram(flash, NULL, "palmte.flash", flash_size,
@@ -265,6 +271,8 @@ static void palmte_machine_init(MachineClass *mc)
   mc->init = palmte_init;
   mc->ignore_memory_transaction_failures = true;
   mc->default_cpu_type = ARM_CPU_TYPE_NAME("ti925t");
+mc->default_ram_size = 0x0200;
+mc->default_ram_id = "omap1.dram";
   }
   
   DEFINE_MACHINE("cheetah", palmte_machine_init)
   









Re: [PATCH v3 04/12] target/ppc: Introduce ppc_hash64_use_vrma() helper

2020-02-19 Thread Fabiano Rosas
David Gibson  writes:

> When running guests under a hypervisor, the hypervisor obviously needs to
> be protected from guest accesses even if those are in what the guest
> considers real mode (translation off).  The POWER hardware provides two
> ways of doing that: The old way has guest real mode accesses simply offset
> and bounds checked into host addresses.  It works, but requires that a
> significant chunk of the guest's memory - the RMA - be physically
> contiguous in the host, which is pretty inconvenient.  The new way, known
> as VRMA, has guest real mode accesses translated in roughly the normal way
> but with some special parameters.
>
> In POWER7 and POWER8 the LPCR[VPM0] bit selected between the two modes, but
> in POWER9 only VRMA mode is supported

... when translation is off, right? Because I see in the 3.0 ISA that
LPCR[VPM1] is still there.

> and LPCR[VPM0] no longer exists.  We
> handle that difference in behaviour in ppc_hash64_set_isi().. but not in
> other places that we blindly check LPCR[VPM0].
>
> Correct those instances with a new helper to tell if we should be in VRMA
> mode.
>
> Signed-off-by: David Gibson 
> Reviewed-by: Cédric Le Goater 
> ---
>  target/ppc/mmu-hash64.c | 41 +++--
>  1 file changed, 19 insertions(+), 22 deletions(-)
>
> diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> index 5fabd93c92..d878180df5 100644
> --- a/target/ppc/mmu-hash64.c
> +++ b/target/ppc/mmu-hash64.c
> @@ -668,6 +668,19 @@ unsigned ppc_hash64_hpte_page_shift_noslb(PowerPCCPU 
> *cpu,
>  return 0;
>  }
>  
> +static bool ppc_hash64_use_vrma(CPUPPCState *env)
> +{
> +switch (env->mmu_model) {
> +case POWERPC_MMU_3_00:
> +/* ISAv3.0 (POWER9) always uses VRMA, the VPM0 field and RMOR
> + * register no longer exist */
> +return true;
> +
> +default:
> +return !!(env->spr[SPR_LPCR] & LPCR_VPM0);
> +}
> +}
> +
>  static void ppc_hash64_set_isi(CPUState *cs, uint64_t error_code)
>  {
>  CPUPPCState *env = &POWERPC_CPU(cs)->env;
> @@ -676,15 +689,7 @@ static void ppc_hash64_set_isi(CPUState *cs, uint64_t 
> error_code)
>  if (msr_ir) {
>  vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
>  } else {
> -switch (env->mmu_model) {
> -case POWERPC_MMU_3_00:
> -/* Field deprecated in ISAv3.00 - interrupts always go to hyperv 
> */
> -vpm = true;
> -break;
> -default:
> -vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
> -break;
> -}
> +vpm = ppc_hash64_use_vrma(env);
>  }
>  if (vpm && !msr_hv) {
>  cs->exception_index = POWERPC_EXCP_HISI;
> @@ -702,15 +707,7 @@ static void ppc_hash64_set_dsi(CPUState *cs, uint64_t 
> dar, uint64_t dsisr)
>  if (msr_dr) {
>  vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
>  } else {
> -switch (env->mmu_model) {
> -case POWERPC_MMU_3_00:
> -/* Field deprecated in ISAv3.00 - interrupts always go to hyperv 
> */
> -vpm = true;
> -break;
> -default:
> -vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
> -break;
> -}
> +vpm = ppc_hash64_use_vrma(env);
>  }
>  if (vpm && !msr_hv) {
>  cs->exception_index = POWERPC_EXCP_HDSI;
> @@ -799,7 +796,7 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
> eaddr,
>  if (!(eaddr >> 63)) {
>  raddr |= env->spr[SPR_HRMOR];
>  }
> -} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
> +} else if (ppc_hash64_use_vrma(env)) {
>  /* Emulated VRMA mode */
>  slb = &env->vrma_slb;
>  if (!slb->sps) {
> @@ -967,7 +964,7 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, 
> target_ulong addr)
>  } else if ((msr_hv || !env->has_hv_mode) && !(addr >> 63)) {
>  /* In HV mode, add HRMOR if top EA bit is clear */
>  return raddr | env->spr[SPR_HRMOR];
> -} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
> +} else if (ppc_hash64_use_vrma(env)) {
>  /* Emulated VRMA mode */
>  slb = &env->vrma_slb;
>  if (!slb->sps) {
> @@ -1056,8 +1053,7 @@ static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
>  slb->sps = NULL;
>  
>  /* Is VRMA enabled ? */
> -lpcr = env->spr[SPR_LPCR];
> -if (!(lpcr & LPCR_VPM0)) {
> +if (ppc_hash64_use_vrma(env)) {

Shouldn't this be !ppc_hash64_use_vrma(env)?

And a comment about the original code: all other places that check
LPCR_VPM0 do it after verifying that translation is off, except here
(ppc_hash64_update_vrma). Could that be an issue?

>  return;
>  }
>  
> @@ -1065,6 +1061,7 @@ static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
>   * Make one up. Mostly ignore the ESID which will not be needed
>   * for translation
>   */
> +lpcr = env->spr[SPR_LPCR];
>  vsid = SLB

Re: [PATCH v2 0/2] This small series does two things:

2020-02-19 Thread Kashyap Chamarthy
On Wed, Feb 19, 2020 at 04:05:56AM -0800, no-re...@patchew.org wrote:

[...]

> The full log is available at
> http://patchew.org/logs/20200219114607.1855-1-kcham...@redhat.com/testing.docker-mingw@fedora/?type=message.

Patchew is right -- I forgot the Makefile changes.  Hence the failure:

make: *** No rule to make target 'docs/qemu-cpu-models.texi', needed
by 'qemu-doc.html'.  Stop.

v2 upcoming.

[...]

-- 
/kashyap




Re: [PATCH v5 34/79] arm/xilinx_zynq: drop RAM size fixup

2020-02-19 Thread Philippe Mathieu-Daudé

On 2/19/20 2:44 PM, Igor Mammedov wrote:

On Tue, 18 Feb 2020 18:23:47 +0100
Philippe Mathieu-Daudé  wrote:


On 2/17/20 6:34 PM, Igor Mammedov wrote:

If user provided non-sense RAM size, board will complain and
continue running with max RAM size supported.
Also RAM is going to be allocated by generic code, so it won't be
possible for board to fix things up for user.

Make it error message and exit to force user fix CLI,
instead of accepting non-sense CLI values.

Signed-off-by: Igor Mammedov 
Reviewed-by: Alistair Francis 
---
   hw/arm/xilinx_zynq.c | 16 
   1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index 3a0fa5b23f..df950fc400 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -158,7 +158,6 @@ static inline void zynq_init_spi_flashes(uint32_t 
base_addr, qemu_irq irq,
   
   static void zynq_init(MachineState *machine)

   {
-ram_addr_t ram_size = machine->ram_size;
   ARMCPU *cpu;
   MemoryRegion *address_space_mem = get_system_memory();
   MemoryRegion *ext_ram = g_new(MemoryRegion, 1);
@@ -168,6 +167,12 @@ static void zynq_init(MachineState *machine)
   qemu_irq pic[64];
   int n;
   
+/* max 2GB ram */

+if (machine->ram_size > 0x8000) {
+error_report("RAM size more than %d is not supported", 0x8000);


Eh? Maybe:

 if (machine->ram_size > 2 * GiB) {
 error_report("RAM size more than 2 GiB is not supported");


amended in v6


Then please add to v6:
Reviewed-by: Philippe Mathieu-Daudé 

Thanks :)




+exit(EXIT_FAILURE);
+}
+
   cpu = ARM_CPU(object_new(machine->cpu_type));
   
   /* By default A9 CPUs have EL3 enabled.  This board does not

@@ -184,14 +189,9 @@ static void zynq_init(MachineState *machine)
   &error_fatal);
   object_property_set_bool(OBJECT(cpu), true, "realized", &error_fatal);
   
-/* max 2GB ram */

-if (ram_size > 0x8000) {
-ram_size = 0x8000;
-}
-
   /* DDR remapped to address zero.  */
   memory_region_allocate_system_memory(ext_ram, NULL, "zynq.ext_ram",
- ram_size);
+ machine->ram_size);
   memory_region_add_subregion(address_space_mem, 0, ext_ram);
   
   /* 256K of on-chip memory */

@@ -300,7 +300,7 @@ static void zynq_init(MachineState *machine)
   sysbus_connect_irq(busdev, 0, pic[40 - IRQ_OFFSET]);
   sysbus_mmio_map(busdev, 0, 0xF8007000);
   
-zynq_binfo.ram_size = ram_size;

+zynq_binfo.ram_size = machine->ram_size;
   zynq_binfo.nb_cpus = 1;
   zynq_binfo.board_id = 0xd32;
   zynq_binfo.loader_start = 0;
   









Re: [PATCH v3 01/12] ppc: Remove stub support for 32-bit hypervisor mode

2020-02-19 Thread Fabiano Rosas
David Gibson  writes:

> a4f30719a8cd, way back in 2007 noted that "PowerPC hypervisor mode is not
> fundamentally available only for PowerPC 64" and added a 32-bit version
> of the MSR[HV] bit.
>
> But nothing was ever really done with that; there is no meaningful support
> for 32-bit hypervisor mode 13 years later.  Let's stop pretending and just
> remove the stubs.
>
> Signed-off-by: David Gibson 

Reviewed-by: Fabiano Rosas 

> ---
>  target/ppc/cpu.h| 21 +++--
>  target/ppc/translate_init.inc.c |  6 +++---
>  2 files changed, 10 insertions(+), 17 deletions(-)
>
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index b283042515..8077fdb068 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -24,8 +24,6 @@
>  #include "exec/cpu-defs.h"
>  #include "cpu-qom.h"
>  
> -/* #define PPC_EMULATE_32BITS_HYPV */
> -
>  #define TCG_GUEST_DEFAULT_MO 0
>  
>  #define TARGET_PAGE_BITS_64K 16
> @@ -300,13 +298,12 @@ typedef struct ppc_v3_pate_t {
>  #define MSR_SF   63 /* Sixty-four-bit modehflags 
> */
>  #define MSR_TAG  62 /* Tag-active mode (POWERx ?)
> */
>  #define MSR_ISF  61 /* Sixty-four-bit interrupt mode on 630  
> */
> -#define MSR_SHV  60 /* hypervisor state   hflags 
> */
> +#define MSR_HV   60 /* hypervisor state   hflags 
> */
>  #define MSR_TS0  34 /* Transactional state, 2 bits (Book3s)  
> */
>  #define MSR_TS1  33
>  #define MSR_TM   32 /* Transactional Memory Available (Book3s)   
> */
>  #define MSR_CM   31 /* Computation mode for BookE hflags 
> */
>  #define MSR_ICM  30 /* Interrupt computation mode for BookE  
> */
> -#define MSR_THV  29 /* hypervisor state for 32 bits PowerPC   hflags 
> */
>  #define MSR_GS   28 /* guest state for BookE 
> */
>  #define MSR_UCLE 26 /* User-mode cache lock enable for BookE 
> */
>  #define MSR_VR   25 /* altivec availablex hflags 
> */
> @@ -401,10 +398,13 @@ typedef struct ppc_v3_pate_t {
>  
>  #define msr_sf   ((env->msr >> MSR_SF)   & 1)
>  #define msr_isf  ((env->msr >> MSR_ISF)  & 1)
> -#define msr_shv  ((env->msr >> MSR_SHV)  & 1)
> +#if defined(TARGET_PPC64)
> +#define msr_hv   ((env->msr >> MSR_HV)   & 1)
> +#else
> +#define msr_hv   (0)
> +#endif
>  #define msr_cm   ((env->msr >> MSR_CM)   & 1)
>  #define msr_icm  ((env->msr >> MSR_ICM)  & 1)
> -#define msr_thv  ((env->msr >> MSR_THV)  & 1)
>  #define msr_gs   ((env->msr >> MSR_GS)   & 1)
>  #define msr_ucle ((env->msr >> MSR_UCLE) & 1)
>  #define msr_vr   ((env->msr >> MSR_VR)   & 1)
> @@ -449,16 +449,9 @@ typedef struct ppc_v3_pate_t {
>  
>  /* Hypervisor bit is more specific */
>  #if defined(TARGET_PPC64)
> -#define MSR_HVB (1ULL << MSR_SHV)
> -#define msr_hv  msr_shv
> -#else
> -#if defined(PPC_EMULATE_32BITS_HYPV)
> -#define MSR_HVB (1ULL << MSR_THV)
> -#define msr_hv  msr_thv
> +#define MSR_HVB (1ULL << MSR_HV)
>  #else
>  #define MSR_HVB (0ULL)
> -#define msr_hv  (0)
> -#endif
>  #endif
>  
>  /* DSISR */
> diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
> index 53995f62ea..a0d0eaabf2 100644
> --- a/target/ppc/translate_init.inc.c
> +++ b/target/ppc/translate_init.inc.c
> @@ -8804,7 +8804,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
>  PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
>  PPC2_TM | PPC2_PM_ISA206;
>  pcc->msr_mask = (1ull << MSR_SF) |
> -(1ull << MSR_SHV) |
> +(1ull << MSR_HV) |
>  (1ull << MSR_TM) |
>  (1ull << MSR_VR) |
>  (1ull << MSR_VSX) |
> @@ -9017,7 +9017,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
>  PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
>  PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
>  pcc->msr_mask = (1ull << MSR_SF) |
> -(1ull << MSR_SHV) |
> +(1ull << MSR_HV) |
>  (1ull << MSR_TM) |
>  (1ull << MSR_VR) |
>  (1ull << MSR_VSX) |
> @@ -9228,7 +9228,7 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
>  PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
>  PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
>  pcc->msr_mask = (1ull << MSR_SF) |
> -(1ull << MSR_SHV) |
> +(1ull << MSR_HV) |
>  (1ull << MSR_TM) |
>  (1ull << MSR_VR) |
>  (1ull << MSR_VSX) |



Re: [PATCH v5 2/4] vl: Initialise main loop earlier

2020-02-19 Thread Wolfgang Bumiller
On Tue, Feb 18, 2020 at 04:40:34PM +0100, Kevin Wolf wrote:
> We want to be able to use qemu_aio_context in the monitor
> initialisation.
> 
> Signed-off-by: Kevin Wolf 
> Reviewed-by: Marc-André Lureau 
> Reviewed-by: Markus Armbruster 
> ---
>  vl.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/vl.c b/vl.c
> index 794f2e5733..98bc51e089 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2894,6 +2894,11 @@ int main(int argc, char **argv, char **envp)
>  runstate_init();
>  precopy_infrastructure_init();
>  postcopy_infrastructure_init();
> +
> +if (qemu_init_main_loop(&main_loop_err)) {
> +error_report_err(main_loop_err);
> +exit(1);
> +}
>  monitor_init_globals();

This is a tiny bit scary, as we now have around 1kloc of code between
here and os_daemonize() where in the future we may accidentally cause
the aio context's on-demand thread pool to spawn before fork()ing
(silently losing the threads again - we did have such an issue right
there in monitor_init_globals() in the past)

For *now* it should be fine since currently that wouldn't have worked,
but we may need to keep an eye out for that in future patches.
Basically, not everything we do between here and os_daemonize() way down
below is actually allowed to freely use the main loop's aio context even
if now it already exists from this point onward.

I wonder if aio_get_thread_pool() should check the daemonization state
(maybe with some debug #ifdef)?

>  
>  if (qcrypto_init(&err) < 0) {
> @@ -3811,11 +3816,6 @@ int main(int argc, char **argv, char **envp)
>  qemu_unlink_pidfile_notifier.notify = qemu_unlink_pidfile;
>  qemu_add_exit_notifier(&qemu_unlink_pidfile_notifier);
>  
> -if (qemu_init_main_loop(&main_loop_err)) {
> -error_report_err(main_loop_err);
> -exit(1);
> -}
> -
>  #ifdef CONFIG_SECCOMP
>  olist = qemu_find_opts_err("sandbox", NULL);
>  if (olist) {
> -- 
> 2.20.1




Re: [PATCH v2] hw/i386: disable smbus migration for xenfv

2020-02-19 Thread Olaf Hering
Am Wed, 19 Feb 2020 12:35:30 +0100
schrieb Olaf Hering :

> Is any of the things done by pc_i440fx_5_0_machine_options and
> pc_i440fx_machine_options a desired, or even breaking, change for the
> current result of pc_xen_hvm_init?

I tried to follow a few of the initialized members:

default_nic_model, perhaps the involved code paths are not called, so the NULL 
pointer does not matter.

pvh_enabled, does this mean the PVH domU type? If yes, this would be lost when 
xenfv is locked at v3.1.


Olaf


pgpFyuzy9M65y.pgp
Description: Digitale Signatur von OpenPGP


Re: [PATCH v2 15/22] qemu-iotests/199: improve performance: set bitmap by discard

2020-02-19 Thread Andrey Shinkevich

On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

Discard dirties dirty-bitmap as well as write, but works faster. Let's
use it instead.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/199 | 31 ---
  1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 6599fc6fb4..d78f81b71c 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -67,8 +67,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  os.mkfifo(fifo)
  qemu_img('create', '-f', iotests.imgfmt, disk_a, size)
  qemu_img('create', '-f', iotests.imgfmt, disk_b, size)
-self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a)
-self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a,
+  'discard=unmap')
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b,
+  'discard=unmap')
  self.vm_b.add_incoming("exec: cat '" + fifo + "'")
  self.vm_a.launch()
  self.vm_b.launch()
@@ -78,7 +80,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  self.vm_b_events = []
  
  def test_postcopy(self):

-write_size = 0x4000
+discard_size = 0x4000
  granularity = 512
  chunk = 4096
  
@@ -86,25 +88,32 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):

 name='bitmap', granularity=granularity)
  self.assert_qmp(result, 'return', {})
  
+result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',

+   node='drive0', name='bitmap')
+empty_sha256 = result['return']['sha256']
+
  s = 0
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
  s += 0x1
  s = 0x8000
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
  s += 0x1
  
  result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',

 node='drive0', name='bitmap')
  sha256 = result['return']['sha256']
  
+# Check, that updating the bitmap by discards works

+assert sha256 != empty_sha256
+
  result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
 name='bitmap')
  self.assert_qmp(result, 'return', {})
  s = 0
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
  s += 0x1
  
  caps = [{'capability': 'dirty-bitmaps', 'state': True},

@@ -126,8 +135,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  self.vm_b_events.append(e_resume)
  
  s = 0x8000

-while s < write_size:
-self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_b.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
  s += 0x1
  
  match = {'data': {'status': 'completed'}}




Reviewed-by: Andrey Shinkevich 
--
With the best regards,
Andrey Shinkevich



Re: [PATCH v2 03/22] migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup

2020-02-19 Thread Vladimir Sementsov-Ogievskiy

18.02.2020 14:00, Andrey Shinkevich wrote:

On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

Rename dirty_bitmap_mig_cleanup to dirty_bitmap_do_save_cleanup, to
stress that it is on save part.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  migration/block-dirty-bitmap.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 73792ab005..4e8959ae52 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -259,7 +259,7 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState 
*dbms,
  }
  /* Called with iothread lock taken.  */
-static void dirty_bitmap_mig_cleanup(void)
+static void dirty_bitmap_do_save_cleanup(void)
  {
  SaveBitmapState *dbms;
@@ -338,7 +338,7 @@ static int init_dirty_bitmap_migration(void)
  return 0;
  fail:
-    dirty_bitmap_mig_cleanup();
+    dirty_bitmap_do_save_cleanup();
  return -1;
  }
@@ -377,7 +377,7 @@ static void bulk_phase(QEMUFile *f, bool limit)
  /* for SaveVMHandlers */
  static void dirty_bitmap_save_cleanup(void *opaque)
  {
-    dirty_bitmap_mig_cleanup();
+    dirty_bitmap_do_save_cleanup();
  }
  static int dirty_bitmap_save_iterate(QEMUFile *f, void *opaque)
@@ -412,7 +412,7 @@ static int dirty_bitmap_save_complete(QEMUFile *f, void 
*opaque)
  trace_dirty_bitmap_save_complete_finish();
-    dirty_bitmap_mig_cleanup();
+    dirty_bitmap_do_save_cleanup();
  return 0;
  }



At the next opportunity, I would suggest the name like
"dirty_bitmap_do_clean_after_saving()"
and similar for dirty_bitmap_save_cleanup()
"dirty_bitmap_clean_after_saving()".


I'd keep my naming, it corresponds to .save_cleanup handler name.



Reviewed-by: Andrey Shinkevich 



--
Best regards,
Vladimir



Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine

2020-02-19 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 19.02.2020 um 10:03 hat Markus Armbruster geschrieben:
>> >> >>   @@ -246,8 +287,15 @@ void monitor_qmp_bh_dispatcher(void *data)
>> >> >>}
>> >> >>qmp_request_free(req_obj);
>> >> >> 
>> >> >>   -/* Reschedule instead of looping so the main loop stays 
>> >> >> responsive */
>> >> >>   -qemu_bh_schedule(qmp_dispatcher_bh);
>> >> >>   +/*
>> >> >>   + * Yield and reschedule so the main loop stays responsive.
>> >> >>   + *
>> >> >>   + * Move back to iohandler_ctx so that nested event loops for
>> >> >>   + * qemu_aio_context don't start new monitor commands.
>> >> >> 
>> >> >> Can you explain this sentence for dummies?
>> >> >
>> >> > Nested event loops (i.e. AIO_WAIT_WHILE) poll qemu_aio_context, so if we
>> >> > are scheduled there, the next iteration of the monitor dispatcher loop
>> >> > could start from a nested event loop. If we are scheduled in
>> >> > iohandler_ctx, then only the actual main loop will reenter the coroutine
>> >> > and nested event loops ignore it.
>> >> >
>> >> > I'm not sure if or why this is actually important, but this matches
>> >> > scheduling the dispatcher BH in iohandler_ctx in the code before this
>> >> > patch.
>> >> >
>> >> > If we didn't do this, we could end up starting monitor requests in more
>> >> > places than before, and who knows what that would mean.
>> >> 
>> >> Let me say it in my own words, to make sure I got it.  I'm going to
>> >> ignore special cases like "not using I/O thread" and exec-oob.
>> >> 
>> >> QMP monitor I/O happens in mon_iothread, in iohandler_ctx (I think).
>> >> This pushes requests onto the monitor's qmp_requests queue.
>> >
>> > mon_iothread (like all iothreads) has a separate AioContext, which
>> > doesn't have a name, but can be accessed with
>> > iothread_get_aio_context(mon_iothread).
>> 
>> Got it.
>> 
>> @iohandler_ctx and @qemu_aio_context both belong to the main loop.
>> 
>> @qemu_aio_context is for general "run in the main loop" use.  It is
>> polled both in the actual main loop and event loops nested in it.  I
>> figure "both" to keep things responsive.
>> 
>> @iohandler_ctx is for "I/O handlers" (whatever those are).  It is polled
>> only in the actual main loop.  I figure this means "I/O handlers" may
>> get delayed while a nested event loop runs.  But better starve a bit
>> than run in unexpected places.
>> 
>> >> Before this patch, the dispatcher runs in a bottom half in the main
>> >> thread, in qemu_aio_context.
>> >
>> > The dispatcher BH is what is scheduled in iohandler_ctx. This means that
>> > the BH is run from the main loop, but not from nested event loops.
>> 
>> Got it.
>> 
>> >> The patch moves it to a coroutine running in the main thread.  It runs
>> >> in iohandler_ctx, but switches to qemu_aio_context for executing command
>> >> handlers.
>> >> 
>> >> We want to keep command handlers running in qemu_aio_context, as before
>> >> this patch.
>> >
>> > "Running in AioContext X" is kind of a sloppy term, and I'm afraid it
>> > might be more confusing than helpful here. What this means is that the
>> > code is run while holding the lock of the AioContext, and it registers
>> > its BHs, callbacks etc. in that AioContext so it will be called from the
>> > event loop of the respective thread.
>> >
>> > Before this patch, command handlers can't really use callbacks without a
>> > nested event loop because they are synchronous.
>> 
>> I figure an example for such a nested event loop is BDRV_POLL_WHILE() in
>> bdrv_truncate(), which runs in the command handler for block_resize.
>> True?
>
> Yes. I think most nested event loops are in the block layer, and they
> use the BDRV_POLL_WHILE() or AIO_WAIT_WHILE() macros today.
>
>> > The BQL is held, which
>> > is equivalent to holding the qemu_aio_context lock.
>> 
>> Why is it equivalent?  Are they always taken together?
>
> I guess ideally they would be. In practice, we neglect the
> qemu_aio_context lock and rely on the BQL for everything in the main
> thread. So maybe I should say the BQL replaces the AioContext lock for
> the main context rather than being equivalent.

Sounds a bit sloppy.  It works...

>> > But other than that,
>> > all of the definition of "running in an AioContext" doesn't really apply.
>> >
>> > Now with coroutines, you assign them an AioContext, which is the context
>> > in which BHs reentering the coroutine will be scheduled, e.g. from
>> > explicit aio_co_schedule(), after qemu_co_sleep_ns() or after waiting
>> > for a lock like a CoMutex.
>> >
>> > qemu_aio_context is what most (if not all) of the existing QMP handlers
>> > use when they run a nested event loop,
>> 
>> bdrv_truncate() appears to use bdrv_get_aio_context(), which is the
>> BlockDriverState's AioContext if set, else @qemu_aio_context.
>
> Right, the BDRV_POLL_WHILE() macro allows waiting

Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine

2020-02-19 Thread Markus Armbruster
Markus Armbruster  writes:

[...]
> If you agree with my proposed tweaks, and nothing else comes up, I can
> try to do them in my tree.

I'll tweak your v5, of course.




Re: [PATCH v5 79/79] tests:numa-test: use explicit memdev to specify node RAM

2020-02-19 Thread Igor Mammedov
On Wed, 19 Feb 2020 15:06:24 +0100
Philippe Mathieu-Daudé  wrote:

> On 2/19/20 2:00 PM, Igor Mammedov wrote:
> > On Tue, 18 Feb 2020 18:51:34 +0100
> > Philippe Mathieu-Daudé  wrote:
> >   
> >> On 2/17/20 6:34 PM, Igor Mammedov wrote:  
> >>> Follow up patches will remove automatic RAM distribution
> >>> between nodes and will make default machine types require
> >>> "memdev" option instead of legacy "mem" option.  
> >>
> >> Can we keep this patch for the follow up?  
> > memdev for numa was there for along time, just untested.
> > With this all numa tests switch to it instead of using
> > legacy option (+ a test for legacy option).
> > I don't think the patch should delayed along with numa
> > cleanups.  
> 
> I guess what confuses me is "Follow up patches *will* remove..."
I'll drop this frase since there aren't immediate "Follow up patches"
to avoid confusion

> > 
> > It of-cause could be posted as standalone patch as well,
> > I'll leave it upto Paolo whether to merge it or not.
> > 
> >>>
> >>> Make tests to follow new rules and add an additional test
> >>> for legacy "mem" option on old machine type, to make sure
> >>> it won't regress in the future.
> >>>
> >>> Signed-off-by: Igor Mammedov 
> >>> Acked-by: Thomas Huth 
> >>> ---  
> >>  
> >   
> 




Race condition in overlayed qcow2?

2020-02-19 Thread dovgaluk

Hi!

I encountered a problem with record/replay of QEMU execution and figured 
out the following, when
QEMU is started with one virtual disk connected to the qcow2 image with 
applied 'snapshot' option.


The patch d710cf575ad5fb3ab329204620de45bfe50caa53 "block/qcow2: 
introduce parallel subrequest handling in read and write"
introduces some kind of race condition, which causes difference in the 
data read from the disk.


I detected this by adding the following code, which logs IO operation 
checksum. And this checksum may be different in different runs of the 
same recorded execution.


logging in blk_aio_complete function:
qemu_log("%"PRId64": blk_aio_complete\n", 
replay_get_current_icount());

QEMUIOVector *qiov = acb->rwco.iobuf;
if (qiov && qiov->iov) {
size_t i, j;
uint64_t sum = 0;
int count = 0;
for (i = 0 ; i < qiov->niov ; ++i) {
for (j = 0 ; j < qiov->iov[i].iov_len ; ++j) {
sum += ((uint8_t*)qiov->iov[i].iov_base)[j];
++count;
}
}
qemu_log("--- iobuf offset %"PRIx64" len %x sum: 
%"PRIx64"\n", acb->rwco.offset, count, sum);

}

I tried to get rid of aio task by patching qcow2_co_preadv_part:
ret = qcow2_co_preadv_task(bs, ret, cluster_offset, offset, cur_bytes, 
qiov, qiov_offset);


That change fixed a bug, but I have no idea what to debug next to figure 
out the exact reason of the failure.


Do you have any ideas or hints?

Pavel Dovgalyuk



Re: [PATCH v2 16/22] qemu-iotests/199: change discard patterns

2020-02-19 Thread Andrey Shinkevich

On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

iotest 40 works too long because of many discard opertion. On the same


operations
At the same time


time, postcopy period is very short, in spite of all these efforts.

So, let's use less discards (and with more interesting patterns) to
reduce test timing. In the next commit we'll increase postcopy period.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/199 | 44 +-
  1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index d78f81b71c..7914fd0b2b 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -30,6 +30,28 @@ size = '256G'
  fifo = os.path.join(iotests.test_dir, 'mig_fifo')
  
  
+GiB = 1024 * 1024 * 1024

+
+discards1 = (
+(0, GiB),
+(2 * GiB + 512 * 5, 512),
+(3 * GiB + 512 * 5, 512),
+(100 * GiB, GiB)
+)
+
+discards2 = (
+(3 * GiB + 512 * 8, 512),
+(4 * GiB + 512 * 8, 512),
+(50 * GiB, GiB),
+(100 * GiB + GiB // 2, GiB)
+)
+
+
+def apply_discards(vm, discards):
+for d in discards:


If we run qemu-io only once, it will update the bitmap state and will 
speed the test performance up. Is that wrong idea?



+vm.hmp_qemu_io('drive0', 'discard {} {}'.format(*d))
+
+
  def event_seconds(event):
  return event['timestamp']['seconds'] + \
  event['timestamp']['microseconds'] / 100.0
@@ -80,9 +102,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  self.vm_b_events = []
  
  def test_postcopy(self):

-discard_size = 0x4000
  granularity = 512
-chunk = 4096
  
  result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',

 name='bitmap', granularity=granularity)
@@ -92,14 +112,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 node='drive0', name='bitmap')
  empty_sha256 = result['return']['sha256']
  
-s = 0

-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x1
-s = 0x8000
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x1
+apply_discards(self.vm_a, discards1 + discards2)
  
  result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',

 node='drive0', name='bitmap')
@@ -111,10 +124,8 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
 name='bitmap')
  self.assert_qmp(result, 'return', {})
-s = 0
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x1
+
+apply_discards(self.vm_a, discards1)
  
  caps = [{'capability': 'dirty-bitmaps', 'state': True},

  {'capability': 'events', 'state': True}]
@@ -134,10 +145,7 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  e_resume = self.vm_b.event_wait('RESUME')
  self.vm_b_events.append(e_resume)
  
-s = 0x8000

-while s < discard_size:
-self.vm_b.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x1
+apply_discards(self.vm_b, discards2)
  
  match = {'data': {'status': 'completed'}}

  e_complete = self.vm_b.event_wait('MIGRATION', match=match)



Reviewed-by: Andrey Shinkevich 
--
With the best regards,
Andrey Shinkevich



Re: [PATCH v3 11/12] target/ppc: Streamline construction of VRMA SLB entry

2020-02-19 Thread Fabiano Rosas
David Gibson  writes:


Hi, just a nitpick, feel free to ignore.

> When in VRMA mode (i.e. a guest thinks it has the MMU off, but the
> hypervisor is still applying translation) we use a special SLB entry,
> rather than looking up an SLBE by address as we do when guest translation
> is on.
>
> We build that special entry in ppc_hash64_update_vrma() along with some
> logic for handling some non-VRMA cases.  Split the actual build of the
> VRMA SLBE into a separate helper and streamline it a bit.
>
> Signed-off-by: David Gibson 
> ---
>  target/ppc/mmu-hash64.c | 79 -
>  1 file changed, 38 insertions(+), 41 deletions(-)
>
> diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> index 170a78bd2e..06cfff9860 100644
> --- a/target/ppc/mmu-hash64.c
> +++ b/target/ppc/mmu-hash64.c
> @@ -789,6 +789,39 @@ static target_ulong rmls_limit(PowerPCCPU *cpu)
>  }
>  }
>  
> +static int build_vrma_slbe(PowerPCCPU *cpu, ppc_slb_t *slb)
> +{
> +CPUPPCState *env = &cpu->env;
> +target_ulong lpcr = env->spr[SPR_LPCR];
> +uint32_t vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
> +target_ulong vsid = SLB_VSID_VRMA | ((vrmasd << 4) & SLB_VSID_LLP_MASK);
> +int i;
> +
> +/*
> + * Make one up. Mostly ignore the ESID which will not be needed
> + * for translation
> + */

I find this comment a bit vague. I suggest we either leave it behind or
make it more precise. The ISA says:

"translation of effective addresses to virtual addresses use the SLBE
values in Figure 18 instead of the entry in the SLB corresponding to the
ESID"




[PATCH] iotests: Fix nonportable use of od --endian

2020-02-19 Thread Eric Blake
Tests 261 and 272 fail on RHEL 7 with coreutils 8.22, since od
--endian was not added until coreutils 8.23.  Fix this by manually
constructing the final value one byte at a time.

Fixes: fc8ba423
Reported-by: Andrey Shinkevich 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/common.rc | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 8a6366c09daf..b77ef3d22cd1 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -56,6 +56,12 @@ poke_file()
 # peek_file_le 'test.img' 512 2 => 65534
 peek_file_le()
 {
-# Wrap in echo $() to strip spaces
-echo $(od -j"$2" -N"$3" --endian=little -An -vtu"$3" "$1")
+local val=0 shift=0 i
+
+# coreutils' od --endian is not portable, so manually assemble bytes.
+for i in $(od -j"$2" -N"$3" -An -v -tu1 "$1"); do
+val=$(( val | (i << shift) ))
+shift=$((shift + 8))
+done
+echo $val
 }

 # peek_file_be 'test.img' 512 2 => 65279
 peek_file_be()
 {
-# Wrap in echo $() to strip spaces
-echo $(od -j"$2" -N"$3" --endian=big -An -vtu"$3" "$1")
+local val=0 i
+
+# coreutils' od --endian is not portable, so manually assemble bytes.
+for i in $(od -j"$2" -N"$3" -An -v -tu1 "$1"); do
+val=$(( (val << 8) | i ))
+done
+echo $val
 }

-# peek_file_raw 'test.img' 512 2 => '\xff\xfe'
+# peek_file_raw 'test.img' 512 2 => '\xff\xfe'. Do not use if the raw data
+# is likely to contain \0 or trailing \n.
 peek_file_raw()
 {
 dd if="$1" bs=1 skip="$2" count="$3" status=none
-- 
2.24.1




  1   2   3   4   >