date:20200703

Re: [PATCH v2 35/44] error: Eliminate error_propagate() with Coccinelle, part 2

2020-07-03 Thread Markus Armbruster

Eric Blake  writes:

> On 7/2/20 10:49 AM, Markus Armbruster wrote:
>> When all we do with an Error we receive into a local variable is
>> propagating to somewhere else, we can just as well receive it there
>> right away.  The previous commit did that with a Coccinelle script I
>> consider fairly trustworthy.  This commit uses the same script with
>> the matching of return taken out, i.e. we convert
>>
>>  if (!foo(..., &err)) {
>>  ...
>>  error_propagate(errp, err);
>>  ...
>>  }
>>
>> to
>>
>>  if (!foo(..., errp)) {
>>  ...
>>  ...
>>  }
>>
>> This is unsound: @err could still be read afterwards.  I don't
>> know how to express "no read of @err without an intervening write" in
>> Coccinelle.  Instead, I manually double-checked for uses of @err.
>>
>> Suboptimal line breaks tweaked manually.  qdev_realize() simplified
>> further to placate scripts/checkpatch.pl.
>>
>> Signed-off-by: Markus Armbruster 
>> ---
>>   block.c  |  6 ++
>>   block/blkdebug.c |  7 ++-
>>   block/blklogwrites.c |  3 +--
>>   block/blkverify.c|  3 +--
>>   block/crypto.c   |  4 +---
>>   block/file-posix.c   |  6 ++
>>   block/file-win32.c   |  6 ++
>>   block/gluster.c  |  4 +---
>>   block/iscsi.c|  3 +--
>>   block/nbd.c  |  8 ++--
>>   block/qcow2.c| 13 -
>>   block/raw-format.c   |  4 +---
>>   block/sheepdog.c |  8 ++--
>>   block/ssh.c  |  4 +---
>>   block/throttle.c |  4 +---
>>   block/vmdk.c |  4 +---
>>   block/vpc.c  |  3 +--
>>   block/vvfat.c|  3 +--
>>   blockdev.c   |  3 +--
>>   hw/intc/xics.c   |  4 +---
>>   hw/vfio/pci.c|  3 +--
>>   net/tap.c|  3 +--
>>   qom/object.c |  4 +---
>>   23 files changed, 32 insertions(+), 78 deletions(-)
>
> Small enough to review each instance.
>
> Reviewed-by: Eric Blake 

I tried *really* hard to make part 1's script powerful and safe, to give
the unsafe / manual parts (this commit and next) a chance of meaningful
review.  Thanks for providing it!
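
The conversion discussed above can be illustrated with a minimal, self-contained sketch. Everything here is a stand-in: the `Error` struct, `set_error()`, `propagate_error()`, `foo()` and the two callers are hypothetical simplifications, not QEMU's real error API (which, among other things, treats &error_abort and &error_fatal specially). The sketch only shows that receiving the error directly into errp yields the same caller-visible result as receiving it into a local variable and propagating it, provided nothing reads the local variable afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for QEMU's Error type (hypothetical, heavily simplified). */
typedef struct Error {
    char msg[64];
} Error;

/* Simplified analogue of error_setg(): store an error if the caller wants one. */
static void set_error(Error **errp, const char *msg)
{
    if (errp) {
        Error *err = calloc(1, sizeof(*err));
        assert(err);
        strncpy(err->msg, msg, sizeof(err->msg) - 1);
        *errp = err;
    }
}

/* Simplified analogue of error_propagate(): hand a local error to the caller. */
static void propagate_error(Error **dst_errp, Error *local_err)
{
    if (!local_err) {
        return;
    }
    if (dst_errp && !*dst_errp) {
        *dst_errp = local_err;
    } else {
        free(local_err);
    }
}

/* Hypothetical callee: fails (returns false, sets an error) on negative input. */
static bool foo(int value, Error **errp)
{
    if (value < 0) {
        set_error(errp, "negative value");
        return false;
    }
    return true;
}

/* Before: receive into a local variable, then propagate. */
static int caller_before(int value, Error **errp)
{
    Error *err = NULL;

    if (!foo(value, &err)) {
        propagate_error(errp, err);
        return -1;
    }
    return 0;
}

/* After: receive directly into the caller's errp. */
static int caller_after(int value, Error **errp)
{
    if (!foo(value, errp)) {
        return -1;
    }
    return 0;
}
```

If code between the failing call and the propagation inspected the local variable, the "after" form would no longer be equivalent, which is the unsoundness the commit message describes.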

Re: [PATCH v2 36/44] error: Eliminate error_propagate() manually

2020-07-03 Thread Markus Armbruster

Eric Blake  writes:

> On 7/2/20 10:49 AM, Markus Armbruster wrote:
>> When all we do with an Error we receive into a local variable is
>> propagating to somewhere else, we can just as well receive it there
>> right away.  The previous two commits did that for sufficiently simple
>> cases with Coccinelle.  Do it for several more manually.
>>
>> Signed-off-by: Markus Armbruster 
>> ---
>
>> +++ b/qdev-monitor.c
>> @@ -597,7 +597,6 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error 
>> **errp)
>>   const char *driver, *path;
>>   DeviceState *dev = NULL;
>>   BusState *bus = NULL;
>> -Error *err = NULL;
>>   bool hide;
>> driver = qemu_opt_get(opts, "driver");
>> @@ -652,15 +651,14 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error 
>> **errp)
>>   dev = qdev_new(driver);
>> /* Check whether the hotplug is allowed by the machine */
>> -if (qdev_hotplug && !qdev_hotplug_allowed(dev, &err)) {
>> +if (qdev_hotplug && !qdev_hotplug_allowed(dev, errp)) {
>>   /* Error must be set in the machine hook */
>> -assert(err);
>
> That comment could be deleted now.

Yes.

> Either way,
> Reviewed-by: Eric Blake 

Thanks!

Re: [PATCH v2 37/44] error: Reduce unnecessary error propagation

2020-07-03 Thread Markus Armbruster

Eric Blake  writes:

> On 7/2/20 10:49 AM, Markus Armbruster wrote:
>> When all we do with an Error we receive into a local variable is
>> propagating to somewhere else, we can just as well receive it there
>> right away, even when we need to keep error_propagate() for other
>> error paths.
>>
>> Signed-off-by: Markus Armbruster 
>> ---
>
>> +++ b/block/replication.c
>> @@ -85,7 +85,6 @@ static int replication_open(BlockDriverState *bs, QDict 
>> *options,
>>   {
>>   int ret;
>>   BDRVReplicationState *s = bs->opaque;
>> -Error *local_err = NULL;
>>   QemuOpts *opts = NULL;
>>   const char *mode;
>>   const char *top_id;
>> @@ -99,7 +98,7 @@ static int replication_open(BlockDriverState *bs, QDict 
>> *options,
>> ret = -EINVAL;
>>   opts = qemu_opts_create(&replication_runtime_opts, NULL, 0, 
>> &error_abort);
>> -if (!qemu_opts_absorb_qdict(opts, options, &local_err)) {
>> +if (!qemu_opts_absorb_qdict(opts, options, errp)) {
>>   goto fail;
>>   }
>
> Does this one belong in 36/44, given that removal of 'local_err' is
> evidence that no other error path needed it?
>
> Either way, it belongs in the series, and the result of the two
> patches together is fine.
>
> Reviewed-by: Eric Blake 

Actually, this hunk needs to go before PATCH 33 to keep it correct.
I'll find out how to best reshuffle hunks.  The end result will be the
same.

Thanks!

Re: [PATCH v9 14/34] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type()

2020-07-03 Thread Max Reitz

On 03.07.20 00:00, Alberto Garcia wrote:
> On Thu 02 Jul 2020 11:57:46 AM CEST, Max Reitz wrote:
>>> The reason why we would want to check it is, of course, because that
>>> bit does have a meaning in regular L2 entries.
>>>
>>> But that bit is ignored in images with subclusters so the only reason
>>> why we would check it is to report corruption, not because we need to
>>> know its value.
>>
>> Sure.  But isn’t that the whole point of having
>> QCOW2_SUBCLUSTER_INVALID in the first place?
> 
> At the moment we're only returning QCOW2_SUBCLUSTER_INVALID in cases
> where there is no way to interpret the entry correctly: a) the
> allocation and zero bits are set for the same subcluster, and b) the
> allocation bit is set but the entry has no valid offset.
> 
> It doesn't mean that we cannot use _SUBCLUSTER_INVALID for cases like
> the one we're discussing, but this one is different from the other two.

OK, that makes sense.

Max




Re: [PATCH v9 28/34] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()

2020-07-03 Thread Max Reitz

On 03.07.20 00:40, Alberto Garcia wrote:
> On Thu 02 Jul 2020 04:28:57 PM CEST, Max Reitz wrote:
>>> +/* For full clusters use zero_in_l2_slice() instead */
>>> +assert(nb_subclusters > 0 && nb_subclusters < 
>>> s->subclusters_per_cluster);
>>> +assert(sc + nb_subclusters <= s->subclusters_per_cluster);
>>
>> Maybe we should also assert that @offset is aligned to the subcluster
>> size.
> 
> It doesn't hurt, but the only caller already guarantees that ...

Sure, but it also guarantees the rest of these conditions, doesn’t it? :)

Max




Re: [PATCH v9 30/34] qcow2: Add prealloc field to QCowL2Meta

2020-07-03 Thread Max Reitz

On 03.07.20 01:05, Alberto Garcia wrote:
> On Thu 02 Jul 2020 05:09:47 PM CEST, Max Reitz wrote:
>>> Without a backing file, there is no read required - writing to an
>>> unallocated subcluster within a preallocated cluster merely has to
>>> provide zeros to the rest of the write.  And depending on whether we
>>> can intelligently guarantee that the underlying protocol already
>>> reads as zeroes when preallocated, we even have an optimization where
>>> even that is not necessary.  We can still lump it in the "COW"
>>> terminology, in that our write is more complex than merely writing in
>>> place, but it isn't a true copy-on-write operation as there is
>>> nothing to be copied.
>>
>> The term “COW” specifically in the qcow2 driver also refers to having
>> to write zeroes to an area that isn’t written to by the guest as part
>> of the process of having to allocate a (sub)cluster.
> 
> The question is valid: if the space for the clusters is allocated but
> the subclusters are not marked as such then any partial write request
> will need to fill the rest with zeroes (in practice handle_alloc_space()
> can do that efficiently but that's another question).
> 
> If there is a backing file then there's no other alternative because we
> do need to copy the data from the backing file.
> 
> If there is no backing file perhaps we could allocate all subclusters as
> well. I suppose we can detect that scenario at that point in the code (I
> haven't checked) and I don't know what would happen if one later
> attaches a backing file at runtime using the command-line options.
> 
> But what I would argue is that I don't see the benefit of using extended
> L2 entries on a preallocated image with no backing file: other than
> having twice as much L2 metadata what would be the use? The point of
> subclusters is that they make allocation more efficient, but if the
> image is already fully allocated then they give you nothing.

That’s true.  I didn’t think about it this way.

Then indeed it doesn’t make sense to potentially break cases of later
adding a backing file:

Reviewed-by: Max Reitz 




Re: [RFC v2 1/1] memory: Delete assertion in memory_region_unregister_iommu_notifier

2020-07-03 Thread Jason Wang




On 2020/7/2 11:45 PM, Peter Xu wrote:

> On Thu, Jul 02, 2020 at 11:01:54AM +0800, Jason Wang wrote:
>
>> So I think we agree that a new notifier is needed?
>
> Good to me, or a new flag should be easier (IOMMU_NOTIFIER_DEV_IOTLB)?



That should work, but I wonder whether something like the following
would be better.

Instead of introducing new flags, how about carrying the type of event in
the notifier, so that the device (vhost) can choose the messages it wants
to process, like:


static void vhost_iommu_event(IOMMUNotifier *n, IOMMUTLBEvent *event)
{
    switch (event->type) {
    case IOMMU_MAP:
    case IOMMU_UNMAP:
    case IOMMU_DEV_IOTLB_UNMAP:
        ...
    }
}

Thanks
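
A slightly fuller sketch of the event-type idea proposed above, written as a self-contained example. All types here are hypothetical stand-ins that merely reuse the names from the proposal; they are not existing QEMU definitions. The handler reacts only to device-IOTLB invalidations and ignores the other event types, which is the kind of filtering the proposal would let a device such as vhost do for itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical event types, named after the proposal above. */
typedef enum {
    IOMMU_MAP,
    IOMMU_UNMAP,
    IOMMU_DEV_IOTLB_UNMAP,
} IOMMUTLBEventType;

/* Hypothetical event: the type travels with the range being changed. */
typedef struct {
    IOMMUTLBEventType type;
    uint64_t iova;
    uint64_t addr_mask;
} IOMMUTLBEvent;

/* Hypothetical notifier state: just counts handled invalidations. */
typedef struct {
    unsigned invalidations;
} IOMMUNotifier;

/* Returns 1 if the event was processed, 0 if it was ignored. */
static int vhost_iommu_event(IOMMUNotifier *n, const IOMMUTLBEvent *event)
{
    assert(n && event);

    switch (event->type) {
    case IOMMU_DEV_IOTLB_UNMAP:
        /* The only message this device cares about in this sketch. */
        n->invalidations++;
        return 1;
    case IOMMU_MAP:
    case IOMMU_UNMAP:
    default:
        return 0;
    }
}
```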

Re: [PATCH] softmmu/vl: Remove the check for colons in -accel parameters

2020-07-03 Thread Claudio Fontana

On 7/3/20 7:34 AM, Thomas Huth wrote:
> On 18/06/2020 09.40, Thomas Huth wrote:
>> The new -accel option does not accept colons in the parameters anymore
>> (since it does not convert the parameters to -machine accel=... parameters
>> anymore). Thus we can now remove the check for colons in -accel:
>>
>> $ qemu-system-x86_64 -accel kvm:tcg
>> qemu-system-x86_64: -accel kvm:tcg: invalid accelerator kvm:tcg
>>
>> Signed-off-by: Thomas Huth 
>> ---
>>   softmmu/vl.c | 5 -
>>   1 file changed, 5 deletions(-)
>>
>> diff --git a/softmmu/vl.c b/softmmu/vl.c
>> index f669c06ede..273acfcf6b 100644
>> --- a/softmmu/vl.c
>> +++ b/softmmu/vl.c
>> @@ -3485,11 +3485,6 @@ void qemu_init(int argc, char **argv, char **envp)
>>   g_slist_free(accel_list);
>>   exit(0);
>>   }
>> -if (optarg && strchr(optarg, ':')) {
>> -error_report("Don't use ':' with -accel, "
>> - "use -M accel=... for now instead");
>> -exit(1);
>> -}
>>   break;
>>   case QEMU_OPTION_usb:
>>   olist = qemu_find_opts("machine");
>>
> 
> Ping?
> 
>   Thomas
> 
> 

Reviewed-by: Claudio Fontana

Re: [PATCH v4 00/14]vDPA support in qemu

2020-07-03 Thread Jason Wang




On 2020/7/1 10:55 PM, Cindy Lu wrote:

vDPA device is a device that uses a datapath which complies with the
virtio specifications with a vendor-specific control path. vDPA devices
can be either physically located on the hardware or emulated by software.
This patch series introduces vDPA support in QEMU.
TODO
1) vIOMMU support
2) live migration support
3) docs for vhost-vdpa
4) config interrupt support

Change from v1
separate the patch of vhost_vq_get_addr
separate the patch of vhost_dev_start
introduce the documentation for vhost-vdpa.rst
other comments from last version
github address
https://github.com/lulu-github-name/qemutmp.git PATCHV2

Change from v3
fix the compile problem
separate the patch of vhost_force_iommu
other comments from last version
github address
https://github.com/lulu-github-name/qemutmp.git PATCHV3

Change from v3
fix the centos7 test problem
other comments from last version
github address
https://github.com/lulu-github-name/qemutmp.git PATCHV4



Thanks a lot for the patches.

I think we could have more generic handling for forcing IOMMU_PLATFORM,
but that can be done on top.


So

Acked-by: Jason Wang 




Cindy Lu (11):
   net: introduce qemu_get_peer
   vhost_net: use the function qemu_get_peer
   vhost: introduce new VhostOps vhost_dev_start
   vhost: implement vhost_dev_start method
   vhost: introduce new VhostOps vhost_vq_get_addr
   vhost: implement vhost_vq_get_addr method
   vhost: introduce new VhostOps vhost_force_iommu
   vhost: implement vhost_force_iommu method
   vhost_net: introduce set_config & get_config
   vhost-vdpa: introduce vhost-vdpa backend
   vhost-vdpa: introduce vhost-vdpa net client

Jason Wang (3):
   virtio-bus: introduce queue_enabled method
   virtio-pci: implement queue_enabled method
   vhost: check the existence of vhost_set_iotlb_callback

  configure |  21 ++
  docs/interop/index.rst|   1 +
  docs/interop/vhost-vdpa.rst   |  17 ++
  hw/net/vhost_net-stub.c   |  11 +
  hw/net/vhost_net.c|  45 ++-
  hw/net/virtio-net.c   |  19 ++
  hw/virtio/Makefile.objs   |   1 +
  hw/virtio/vhost-backend.c |   6 +
  hw/virtio/vhost-vdpa.c| 475 ++
  hw/virtio/vhost.c |  52 +++-
  hw/virtio/virtio-pci.c|  13 +
  hw/virtio/virtio.c|   6 +
  include/hw/virtio/vhost-backend.h |  19 +-
  include/hw/virtio/vhost-vdpa.h|  26 ++
  include/hw/virtio/vhost.h |   7 +
  include/hw/virtio/virtio-bus.h|   4 +
  include/net/net.h |   1 +
  include/net/vhost-vdpa.h  |  22 ++
  include/net/vhost_net.h   |   5 +
  net/Makefile.objs |   2 +-
  net/clients.h |   2 +
  net/net.c |  10 +
  net/vhost-vdpa.c  | 228 ++
  qapi/net.json |  28 +-
  qemu-options.hx   |  12 +
  25 files changed, 1004 insertions(+), 29 deletions(-)
  create mode 100644 docs/interop/vhost-vdpa.rst
  create mode 100644 hw/virtio/vhost-vdpa.c
  create mode 100644 include/hw/virtio/vhost-vdpa.h
  create mode 100644 include/net/vhost-vdpa.h
  create mode 100644 net/vhost-vdpa.c

Re: Questionable aspects of QEMU Error's design

2020-07-03 Thread Markus Armbruster

Markus Armbruster  writes:

> Vladimir Sementsov-Ogievskiy  writes:
>
>> 28.04.2020 08:20, Vladimir Sementsov-Ogievskiy wrote:
>>> 27.04.2020 18:36, Markus Armbruster wrote:
>>>> FYI, I'm working on converting QemuOpts, QAPI visitors and QOM.  I keep
>>>> running into bugs.  So far:
[...]
>>>> I got another one coming for QOM and qdev before I can post the
>>>> conversion.
>>>>
>>>> Vladimir, since the conversion will mess with error_propagate(), I'd
>>>> like to get it in before your auto-propagation work.
>>>>
>>>
>>> OK, just let me know when to regenerate the series, it's not hard.
>>>
>>
>> Hi! Is all that merged? Should I resend now?
>
> I ran into many bugs and fell into a few rabbit holes.  I'm busy
> finishing and flushing the patches.

All merged except for the final series "[PATCH v2 00/44] Less clumsy
error checking".  v2 has a lot of change within the series, but in
aggregate it's really close to v1.  This makes me optimistic it can
serve as a base for your auto-propagation work.  To get it into 5.1, we
need a respin, a re-review, and a pull request.  Time is awfully short.
Sorry for taking so long!  If you want to try, I can give it priority on
my side.

Re: [RFC v2 0/1] memory: Delete assertion in memory_region_unregister_iommu_notifier

2020-07-03 Thread Eugenio Perez Martin

On Mon, Jun 29, 2020 at 5:05 PM Paolo Bonzini  wrote:
>
> On 26/06/20 08:41, Eugenio Pérez wrote:
> > If we examine *entry in frame 4 of the backtrace:
> > *entry = {target_as = 0x56f6c050, iova = 0x0, translated_addr = 0x0,
> > addr_mask = 0x, perm = 0x0}
> >
> > Which (I think) tries to invalidate all the TLB registers of the device.
> >
> > Just deleting that assert is enough for the VM to start and communicate
> > using IOMMU, but maybe a better alternative is possible. We could move
> > it to the caller functions in other cases than IOMMU invalidation, or
> > make it conditional only if not invalidating.
>
> Yes, I think moving it up in the call stack is better. I cannot say
> where because the backtrace was destroyed by git (due to lines starting
> with "#").
>

Ouch, what a failure!

Pasting it here for completeness, sorry!

(gdb) bt
#0  0x7521370f in raise () at /lib64/libc.so.6
#1  0x751fdb25 in abort () at /lib64/libc.so.6
#2  0x751fd9f9 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3  0x7520bcc6 in .annobin_assert.c_end () at /lib64/libc.so.6
#4  0x55888171 in memory_region_notify_one
(notifier=0x7ffde0487fa8, entry=0x7ffde5dfe200) at
/home/qemu/memory.c:1918
#5  0x55888247 in memory_region_notify_iommu
(iommu_mr=0x56f6c0b0, iommu_idx=0, entry=...) at
/home/qemu/memory.c:1941
#6  0x55951c8d in vtd_process_device_iotlb_desc
(s=0x57609000, inv_desc=0x7ffde5dfe2d0)
at /home/qemu/hw/i386/intel_iommu.c:2468
#7  0x55951e6a in vtd_process_inv_desc (s=0x57609000) at
/home/qemu/hw/i386/intel_iommu.c:2531
#8  0x55951fa5 in vtd_fetch_inv_desc (s=0x57609000) at
/home/qemu/hw/i386/intel_iommu.c:2563
#9  0x559520e5 in vtd_handle_iqt_write (s=0x57609000) at
/home/qemu/hw/i386/intel_iommu.c:2590
#10 0x55952b45 in vtd_mem_write (opaque=0x57609000,
addr=136, val=2688, size=4) at /home/qemu/hw/i386/intel_iommu.c:2837
#11 0x55883e17 in memory_region_write_accessor
(mr=0x57609330, addr=136, value=0x7ffde5dfe478, size=4,
shift=0, mask=4294967295, attrs=...) at /home/qemu/memory.c:483
#12 0x5588401d in access_with_adjusted_size
(addr=136, value=0x7ffde5dfe478, size=4, access_size_min=4,
access_size_max=8, access_fn=
0x55883d38 , mr=0x57609330,
attrs=...) at /home/qemu/memory.c:544
#13 0x55886f37 in memory_region_dispatch_write
(mr=0x57609330, addr=136, data=2688, op=MO_32, attrs=...)
at /home/qemu/memory.c:1476
#14 0x55827a03 in flatview_write_continue
(fv=0x7ffdd8503150, addr=4275634312, attrs=...,
ptr=0x77ff0028, len=4, addr1=136, l=4, mr=0x57609330) at
/home/qemu/exec.c:3146
#15 0x55827b48 in flatview_write (fv=0x7ffdd8503150,
addr=4275634312, attrs=..., buf=0x77ff0028, len=4)
at /home/qemu/exec.c:3186
#16 0x55827e9d in address_space_write
(as=0x567ca640 , addr=4275634312,
attrs=..., buf=0x77ff0028, len=4) at /home/qemu/exec.c:3277
#17 0x55827f0a in address_space_rw
(as=0x567ca640 , addr=4275634312,
attrs=..., buf=0x77ff0028, len=4, is_write=true)
at /home/qemu/exec.c:3287
#18 0x5589b633 in kvm_cpu_exec (cpu=0x56b65640) at
/home/qemu/accel/kvm/kvm-all.c:2511
#19 0x55876ba8 in qemu_kvm_cpu_thread_fn (arg=0x56b65640)
at /home/qemu/cpus.c:1284
#20 0x55dafff1 in qemu_thread_start (args=0x56b8c3b0) at
util/qemu-thread-posix.c:521
#21 0x755a62de in start_thread () at /lib64/libpthread.so.0
#22 0x752d7e83 in clone () at /lib64/libc.so.6

(gdb) frame 4
#4  0x55888171 in memory_region_notify_one
(notifier=0x7ffde0487fa8, entry=0x7ffde5dfe200) at
/home/qemu/memory.c:1918
1918        assert(entry->iova >= notifier->start && entry_end <= notifier->end);
(gdb) p *entry
$1 = {target_as = 0x56f6c050, iova = 0, translated_addr = 0,
addr_mask = 18446744073709551615, perm = IOMMU_NONE}

Thanks!
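
To make the failure concrete, here is a small sketch of the range check from frame 4, applied to the values printed by gdb. The struct names are hypothetical stand-ins, and computing entry_end as iova + addr_mask is an assumption, since that line is not part of the paste above. With iova = 0 and an all-ones addr_mask, the entry spans the whole address space, so the condition can only hold for a notifier that also covers everything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fields used by the assertion. */
typedef struct {
    uint64_t iova;
    uint64_t addr_mask;
} Entry;

typedef struct {
    uint64_t start;
    uint64_t end;
} NotifierRange;

/* Same condition as the assert in frame 4, returned instead of asserted. */
static bool entry_within_notifier(const Entry *entry,
                                  const NotifierRange *notifier)
{
    /* Assumption: the end of the entry is iova + addr_mask. */
    uint64_t entry_end = entry->iova + entry->addr_mask;

    return entry->iova >= notifier->start && entry_end <= notifier->end;
}
```

An entry like the one in the backtrace passes this check against a notifier spanning 0 to UINT64_MAX but fails against any smaller range, for example one ending at 0xffffffff.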

> Paolo
>
> > Any comment would be appreciated. Thanks!
> >
> > Guest kernel version: kernel-3.10.0-1151.el7.x86_64
> >
> > Bug reference: https://bugs.launchpad.net/qemu/+bug/1885175
> >
> > v2: Actually delete assertion instead of just commenting out using C99
> >
> > Eugenio Pérez (1):
> >   memory: Delete assertion in memory_region_unregister_iommu_notifier
> >
> >  memory.c | 2 --
> >  1 file changed, 2 deletions(-)
> >
>

Re: [PATCH v1 1/3] hw/char: Convert the Ibex UART to use the qdev Clock model

2020-07-03 Thread Philippe Mathieu-Daudé

+Damien

On 6/30/20 10:12 PM, Alistair Francis wrote:
> Convert the Ibex UART to use the recently added qdev-clock functions.

Yeah! This is our first user \o/

> 
> Signed-off-by: Alistair Francis 
> ---
>  include/hw/char/ibex_uart.h |  2 ++
>  hw/char/ibex_uart.c | 19 ++-
>  2 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/include/hw/char/ibex_uart.h b/include/hw/char/ibex_uart.h
> index 2bec772615..322bfffd8b 100644
> --- a/include/hw/char/ibex_uart.h
> +++ b/include/hw/char/ibex_uart.h
> @@ -101,6 +101,8 @@ typedef struct {
>  uint32_t uart_val;
>  uint32_t uart_timeout_ctrl;
>  
> +Clock *f_clk;
> +
>  CharBackend chr;
>  qemu_irq tx_watermark;
>  qemu_irq rx_watermark;
> diff --git a/hw/char/ibex_uart.c b/hw/char/ibex_uart.c
> index 45cd724998..f967e6919a 100644
> --- a/hw/char/ibex_uart.c
> +++ b/hw/char/ibex_uart.c
> @@ -28,6 +28,7 @@
>  #include "qemu/osdep.h"
>  #include "hw/char/ibex_uart.h"
>  #include "hw/irq.h"
> +#include "hw/qdev-clock.h"
>  #include "hw/qdev-properties.h"
>  #include "migration/vmstate.h"
>  #include "qemu/log.h"
> @@ -330,7 +331,7 @@ static void ibex_uart_write(void *opaque, hwaddr addr,
>  }
>  if (value & UART_CTRL_NCO) {
>  uint64_t baud = ((value & UART_CTRL_NCO) >> 16);

UART_CTRL_NCO is defined as:

  #define UART_CTRL_NCO   (0x << 16)

Note for later, convert to the clearer registerfields API?

> -baud *= 1000;
> +baud *= clock_get_hz(s->f_clk);
>  baud >>= 20;
>  
>  s->char_tx_time = (NANOSECONDS_PER_SECOND / baud) * 10;
> @@ -385,6 +386,18 @@ static void ibex_uart_write(void *opaque, hwaddr addr,
>  }
>  }
>  
> +static void ibex_uart_clk_update(void *opaque)
> +{
> +IbexUartState *s = opaque;
> +
> +/* recompute uart's speed on clock change */
> +uint64_t baud = ((s->uart_ctrl & UART_CTRL_NCO) >> 16);
> +baud *= clock_get_hz(s->f_clk);
> +baud >>= 20;

Maybe worth extracting into a helper:

  uint64_t ibex_uart_get_baud(IbexUartState *s)
  {
   uint64_t baud;

   baud = ((s->uart_ctrl & UART_CTRL_NCO) >> 16);
   baud *= clock_get_hz(s->f_clk);
   baud >>= 20;

   return baud;
  }

> +
> +s->char_tx_time = (NANOSECONDS_PER_SECOND / baud) * 10;
> +}
> +
>  static void fifo_trigger_update(void *opaque)
>  {
>  IbexUartState *s = opaque;
> @@ -444,6 +457,10 @@ static void ibex_uart_init(Object *obj)
>  {
>  IbexUartState *s = IBEX_UART(obj);
>  
> +s->f_clk = qdev_init_clock_in(DEVICE(obj), "f_clock",
> +  ibex_uart_clk_update, s);
> +clock_set_hz(s->f_clk, 5000);

Can you add a definition for this 50 MHz value?

Otherwise:
Reviewed-by: Philippe Mathieu-Daudé 

> +
>  sysbus_init_irq(SYS_BUS_DEVICE(obj), &s->tx_watermark);
>  sysbus_init_irq(SYS_BUS_DEVICE(obj), &s->rx_watermark);
>  sysbus_init_irq(SYS_BUS_DEVICE(obj), &s->tx_empty);
>
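
For reference, the arithmetic in the patch can be tried out in isolation with the sketch below. The function names are hypothetical, and `nco` is assumed to be the NCO field already shifted down from the control register. The guard against a zero baud rate is an addition of this sketch, not something the patch does; without it the division would trap when the NCO field or the clock is zero.

```c
#include <assert.h>
#include <stdint.h>

#define NANOSECONDS_PER_SECOND 1000000000ULL

/* baud = (nco * clock frequency) >> 20, as in the patch. */
static uint64_t uart_baud(uint32_t nco, uint64_t clk_hz)
{
    return ((uint64_t)nco * clk_hz) >> 20;
}

/* Time to transmit one character, using the patch's factor of 10. */
static uint64_t uart_char_tx_time_ns(uint64_t baud)
{
    assert(baud != 0); /* sketch-only guard against division by zero */
    return (NANOSECONDS_PER_SECOND / baud) * 10;
}
```

With the 50 MHz clock mentioned above and an NCO value of 2416, this gives a baud rate of 115203 and a character time of 86800 ns.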

Re: [PATCH v1 2/3] hw/riscv: Allow 64 bit access to SiFive CLINT

2020-07-03 Thread Philippe Mathieu-Daudé

On 6/30/20 10:12 PM, Alistair Francis wrote:
> Commit 5d971f9e672507210e77d020d89e0e89165c8fc9
> "memory: Revert "memory: accept mismatching sizes in
> memory_region_access_valid"" broke most RISC-V boards as they do 64 bit
> accesses to the CLINT and QEMU would trigger a fault. Fix this failure
> by allowing 8 byte accesses.
> 

Fixes: 5d971f9e67 (Revert "accept mismatching sizes in access_valid")
Reviewed-by: Philippe Mathieu-Daudé 

> Signed-off-by: Alistair Francis 
> ---
>  hw/riscv/sifive_clint.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/riscv/sifive_clint.c b/hw/riscv/sifive_clint.c
> index b11ffa0edc..669c21adc2 100644
> --- a/hw/riscv/sifive_clint.c
> +++ b/hw/riscv/sifive_clint.c
> @@ -181,7 +181,7 @@ static const MemoryRegionOps sifive_clint_ops = {
>  .endianness = DEVICE_LITTLE_ENDIAN,
>  .valid = {
>  .min_access_size = 4,
> -.max_access_size = 4
> +.max_access_size = 8
>  }
>  };
>  
>
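
The effect of the one-line change can be sketched as a plain size-bound check. This is a simplified, hypothetical model: the struct and function names are made up for illustration, and the real validity check in QEMU's memory core considers more than the size bounds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the .valid size limits of a memory region. */
typedef struct {
    unsigned min_access_size;
    unsigned max_access_size;
} AccessLimits;

/* Size bound only: an access is accepted if it lies within [min, max]. */
static bool access_size_valid(const AccessLimits *valid, unsigned size)
{
    assert(valid->min_access_size <= valid->max_access_size);
    return size >= valid->min_access_size && size <= valid->max_access_size;
}
```

With limits of 4..4 an 8-byte access is rejected, while with 4..8 it is accepted, which matches the behaviour change the commit message describes.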

Re: [PATCH v9 32/34] qcow2: Allow preallocation and backing files if extended_l2 is set

2020-07-03 Thread Max Reitz

On 28.06.20 13:02, Alberto Garcia wrote:
> Traditional qcow2 images don't allow preallocation if a backing file
> is set. This is because once a cluster is allocated there is no way to
> tell that its data should be read from the backing file.
> 
> Extended L2 entries have individual allocation bits for each
> subcluster, and therefore it is perfectly possible to have an
> allocated cluster with all its subclusters unallocated.
> 
> Signed-off-by: Alberto Garcia 
> Reviewed-by: Eric Blake 
> ---
>  block/qcow2.c  | 7 ---
>  tests/qemu-iotests/206.out | 2 +-
>  2 files changed, 5 insertions(+), 4 deletions(-)

Reviewed-by: Max Reitz 




Re: [PATCH v9 33/34] qcow2: Assert that expand_zero_clusters_in_l1() does not support subclusters

2020-07-03 Thread Max Reitz

On 28.06.20 13:02, Alberto Garcia wrote:
> This function is only used by qcow2_expand_zero_clusters() to
> downgrade a qcow2 image to a previous version. This would require
> transforming all extended L2 entries into normal L2 entries but this
> is not a simple task and there are no plans to implement this at the
> moment.
> 
> Signed-off-by: Alberto Garcia 
> Reviewed-by: Eric Blake 
> ---
>  block/qcow2-cluster.c  | 8 +++-
>  tests/qemu-iotests/061 | 6 ++
>  tests/qemu-iotests/061.out | 5 +
>  3 files changed, 18 insertions(+), 1 deletion(-)

Reviewed-by: Max Reitz 




Re: [PATCH 0/2] hw/block/nvme: handle transient dma errors

2020-07-03 Thread Kevin Wolf

On 01.07.2020 at 14:58, Philippe Mathieu-Daudé wrote:
> On 6/29/20 11:34 PM, Klaus Jensen wrote:
> > On Jun 29 14:07, no-re...@patchew.org wrote:
> >> Patchew URL: 
> >> https://patchew.org/QEMU/20200629202053.1223342-1-...@irrelevant.dk/
> 
> >> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out   2020-06-29 
> >> 20:12:10.0 +
> >> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 
> >> 20:58:48.288790818 +
> >> @@ -1,3 +1,5 @@
> >> +WARNING:qemu.machine:qemu received signal 9: 
> >> /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
> >>  -display none -vga none -chardev 
> >> socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon 
> >> chardev=mon,mode=control -qtest 
> >> unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest 
> >> -nodefaults -display none -accel qtest
> >> +WARNING:qemu.machine:qemu received signal 9: 
> >> /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
> >>  -display none -vga none -chardev 
> >> socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon 
> >> chardev=mon,mode=control -qtest 
> >> unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest 
> >> -nodefaults -display none -accel qtest
> 
> Kevin, Max, can iotests/040 be affected by this change?

The diffstat of this series looks like it doesn't touch anything outside
of the nvme emulation, which isn't used by this test, so at least I'd say
it's not the fault of the patch series.

I think test cases use SIGKILL primarily in timeout handlers, so maybe
the test host was overloaded and didn't shut down QEMU in time so it was
killed. There is no actually failing test case:

 ...
 --
 Ran 59 tests

You would have 'F' or 'E' for fail/error instead of '.' otherwise.

Kevin

> > 
> > 
> > Hmm, I can't seem to reproduce this locally and the test succeeded on
> > the next series[1] that is based on this.
> > 
> > Is this a flaky test? Or a bad test runner? I'm of course worried when
> > a qcow2 test fails and I touch something else than the nvme device ;)
> > 
> > 
> >   [1]: https://patchew.org/QEMU/20200629203155.1236860-1-...@irrelevant.dk/
> > 
>

Re: [PATCH 0/2] hw/block/nvme: handle transient dma errors

2020-07-03 Thread Philippe Mathieu-Daudé

On 7/3/20 9:50 AM, Kevin Wolf wrote:
> On 01.07.2020 at 14:58, Philippe Mathieu-Daudé wrote:
>> On 6/29/20 11:34 PM, Klaus Jensen wrote:
>>> On Jun 29 14:07, no-re...@patchew.org wrote:
>>>> Patchew URL:
>>>> https://patchew.org/QEMU/20200629202053.1223342-1-...@irrelevant.dk/
>>
>>>> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out   2020-06-29
>>>> 20:12:10.0 +
>>>> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29
>>>> 20:58:48.288790818 +
>>>> @@ -1,3 +1,5 @@
>>>> +WARNING:qemu.machine:qemu received signal 9:
>>>> /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
>>>>  -display none -vga none -chardev
>>>> socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon
>>>> chardev=mon,mode=control -qtest
>>>> unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest
>>>> -nodefaults -display none -accel qtest
>>>> +WARNING:qemu.machine:qemu received signal 9:
>>>> /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
>>>>  -display none -vga none -chardev
>>>> socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon
>>>> chardev=mon,mode=control -qtest
>>>> unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest
>>>> -nodefaults -display none -accel qtest
>>
>> Kevin, Max, can iotests/040 be affected by this change?
> 
> The diffstat of this series looks like it doesn't touch anything outside
> of the nvme emulation, which isn't used by this test, so at least I'd say
> it's not the fault of the patch series.
> 
> I think test cases use SIGKILL primarily in timeout handlers, so maybe
> the test host was overloaded and didn't shut down QEMU in time so it was
> killed. There is no actually failing test case:
> 
>  ...
>  --
>  Ran 59 tests
> 
> You would have 'F' or 'E' for fail/error instead of '.' otherwise.

TIL how to read that line :)

Thanks for your analysis Kevin!

> 
> Kevin
> 
>>>
>>>
>>> Hmm, I can't seem to reproduce this locally and the test succeeded on
>>> the next series[1] that is based on this.
>>>
>>> Is this a flaky test? Or a bad test runner? I'm of course worried when
>>> a qcow2 test fails and I touch something else than the nvme device ;)
>>>
>>>
>>>   [1]: https://patchew.org/QEMU/20200629203155.1236860-1-...@irrelevant.dk/
>>>
>>
>

Re: [PATCH] iotests.py: Do not wait() before communicate()

2020-07-03 Thread Kevin Wolf

On 30.06.2020 at 10:37, Max Reitz wrote:
> Waiting on a process for which we have a pipe will stall if the process
> outputs more data than fits into the OS-provided buffer.  We must use
> communicate() before wait(), and in fact, communicate() perfectly
> replaces wait() already.
> 
> We have to drop the stderr=subprocess.STDOUT parameter from
> subprocess.Popen() in qemu_nbd_early_pipe(), because stderr is passed on
> to the child process, so if we do not drop this parameter, communicate()
> will hang (because the pipe is not closed).
> 
> Signed-off-by: Max Reitz 

Thanks, applied to the block branch.

Kevin
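
The stall described in the commit message is a general property of pipes and is not specific to Python. The sketch below is a C analogue under stated assumptions, not the iotests.py code: `drain_then_wait()` is a hypothetical helper that forks a child writing more data than a pipe buffer typically holds, reads the pipe to EOF, and only then reaps the child. Calling waitpid() before draining would block forever once the pipe fills up, and forgetting to close the parent's copy of the write end would keep read() from ever seeing EOF, much like the unclosed pipe mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that writes `total` bytes into a pipe, drain the pipe to
 * EOF, then reap the child.  Returns the number of bytes read, or -1.
 */
static long drain_then_wait(size_t total)
{
    int fds[2];
    char buf[4096];
    long got = 0;
    ssize_t n;
    int status;
    pid_t pid;

    if (pipe(fds) != 0) {
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: produce output; write() blocks while the pipe is full. */
        size_t left = total;

        close(fds[0]);
        memset(buf, 'x', sizeof(buf));
        while (left > 0) {
            size_t want = left < sizeof(buf) ? left : sizeof(buf);
            ssize_t w = write(fds[1], buf, want);
            if (w <= 0) {
                _exit(1);
            }
            left -= (size_t)w;
        }
        close(fds[1]);
        _exit(0);
    }

    /* Parent: close our write end so that read() can observe EOF. */
    close(fds[1]);
    while ((n = read(fds[0], buf, sizeof(buf))) > 0) {
        got += n;
    }
    close(fds[0]);

    /* Only now is it safe to wait: the child can no longer block on us. */
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    if (n < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        return -1;
    }
    return got;
}
```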

Re: [PATCH 1/1] disas: mips: Add Loongson 2F disassembler

2020-07-03 Thread Thomas Huth


On 02/07/2020 21.42, Stefan Brankovic wrote:

Add disassembler for Loongson 2F instruction set.

Testing is done by comparing qemu disassembly output, obtained by
using -d in_asm command line option, with appropriate objdump output.

Signed-off-by: Stefan Brankovic 
---
  MAINTAINERS |1 +
  configure   |1 +
  disas/Makefile.objs |1 +
  disas/loongson2f.cpp| 8134 +++
  disas/loongson2f.h  | 2542 
  include/disas/dis-asm.h |1 +
  include/exec/poison.h   |1 +
  target/mips/cpu.c   |4 +
  8 files changed, 10685 insertions(+)
  create mode 100644 disas/loongson2f.cpp
  create mode 100644 disas/loongson2f.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 3abe3faa4e..913ed2a6d3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -219,6 +219,7 @@ S: Maintained
  F: target/mips/
  F: default-configs/*mips*
  F: disas/*mips*
+F: disas/loongson*
  F: docs/system/cpu-models-mips.rst.inc
  F: hw/intc/mips_gic.c
  F: hw/mips/
diff --git a/configure b/configure
index 597e909b53..e163dac53e 100755
--- a/configure
+++ b/configure
@@ -8102,6 +8102,7 @@ for i in $ARCH $TARGET_BASE_ARCH ; do
  disas_config "MIPS"
  if test -n "${cxx}"; then
disas_config "NANOMIPS"
+  disas_config "LOONGSON2F"
  fi
;;
moxie*)
diff --git a/disas/Makefile.objs b/disas/Makefile.objs
index 3c1cdce026..0d5ee1e038 100644
--- a/disas/Makefile.objs
+++ b/disas/Makefile.objs
@@ -14,6 +14,7 @@ common-obj-$(CONFIG_I386_DIS) += i386.o
  common-obj-$(CONFIG_M68K_DIS) += m68k.o
  common-obj-$(CONFIG_MICROBLAZE_DIS) += microblaze.o
  common-obj-$(CONFIG_MIPS_DIS) += mips.o
+common-obj-$(CONFIG_LOONGSON2F_DIS) += loongson2f.o
  common-obj-$(CONFIG_NANOMIPS_DIS) += nanomips.o
  common-obj-$(CONFIG_NIOS2_DIS) += nios2.o
  common-obj-$(CONFIG_MOXIE_DIS) += moxie.o
diff --git a/disas/loongson2f.cpp b/disas/loongson2f.cpp
new file mode 100644
index 00..a2f32dcf93
--- /dev/null
+++ b/disas/loongson2f.cpp
@@ -0,0 +1,8134 @@


This file (and the header) lack a proper header comment. Which license 
do you want to use for this code? Who wrote the initial implementation?


Also, unless you've copied the code from another project that uses C++, 
why did you use C++ here? QEMU is C by default, we only allow C++ for 
some files that have been taken from other C++ projects and need to be 
kept in sync from time to time. So if you wrote this code from scratch, 
please use C instead.


 Thanks,
  Thomas



+extern "C" {
+#include "qemu/osdep.h"
+#include "qemu/bitops.h"
+#include "disas/dis-asm.h"
+}
+
+#include "loongson2f.h"
+
+int print_insn_loongson2f(bfd_vma addr, disassemble_info *info)
+{
+bfd_byte buffer[4];
+uint32_t insn32;
+int status;
+Decoder *decoder = new Decoder();
+
+status = info->read_memory_func(addr, buffer, 4, info);
+if (status != 0) {
+info->memory_error_func(status, addr, info);
+return -1;
+}
+if (info->endian == BFD_ENDIAN_BIG) {
+insn32 = bfd_getb32(buffer);
+} else {
+insn32 = bfd_getl32(buffer);
+}
+
+status = decoder->decode32(info, insn32);
+
+delete decoder;
+
+return status == 0 ? -1 : 4;
+}
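
For comparison, the entry point quoted above does not depend on any C++ feature. The following is a minimal C sketch of the same control flow, not the author's code. It assumes a hypothetical `sketch_decode32()` in place of `Decoder::decode32()` and a plain byte buffer in place of `disassemble_info`, and it assembles the word by hand instead of calling `bfd_getb32()`/`bfd_getl32()`. All `sketch_*` names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for bfd_getb32()/bfd_getl32(): assemble a 32-bit word by hand. */
uint32_t sketch_get_be32(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

uint32_t sketch_get_le32(const uint8_t *buf)
{
    return ((uint32_t)buf[3] << 24) | ((uint32_t)buf[2] << 16) |
           ((uint32_t)buf[1] << 8) | (uint32_t)buf[0];
}

/*
 * Hypothetical decoder: returns nonzero when the word is recognized.
 * For the sake of the sketch it only accepts the all-zero MIPS "nop".
 */
static int sketch_decode32(uint32_t insn)
{
    return insn == 0;
}

/*
 * Mirrors the return convention of print_insn_loongson2f(): the
 * instruction length (4) on success, -1 when decoding fails.
 * No heap allocation is needed because there is no decoder object.
 */
int sketch_print_insn(const uint8_t *buf, bool big_endian)
{
    uint32_t insn = big_endian ? sketch_get_be32(buf) : sketch_get_le32(buf);

    return sketch_decode32(insn) ? 4 : -1;
}
```

In a real C port the decoder state, if any is needed, could live in a stack variable, which also removes the `new`/`delete` pair on every call.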

Re: [PATCH v2 02/18] hw/block/nvme: additional tracing

2020-07-03 Thread Philippe Mathieu-Daudé

On 7/3/20 8:34 AM, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> Add various additional tracing and streamline nvme_identify_ns and
> nvme_identify_nslist (they do not need to repeat the command, it is
> already in the trace name).
> 
> Signed-off-by: Klaus Jensen 
> Reviewed-by: Dmitry Fomichev 
> ---
>  hw/block/nvme.c   | 19 +++
>  hw/block/nvme.h   | 14 ++
>  hw/block/trace-events | 13 +++--
>  3 files changed, 44 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 71b388aa0e20..f5d9148f0936 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -331,6 +331,8 @@ static void nvme_post_cqes(void *opaque)
>  static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
>  {
>  assert(cq->cqid == req->sq->cqid);
> +trace_pci_nvme_enqueue_req_completion(nvme_cid(req), cq->cqid,
> +  req->status);
>  QTAILQ_REMOVE(&req->sq->out_req_list, req, entry);
>  QTAILQ_INSERT_TAIL(&cq->req_list, req, entry);
>  timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
> @@ -343,6 +345,8 @@ static void nvme_rw_cb(void *opaque, int ret)
>  NvmeCtrl *n = sq->ctrl;
>  NvmeCQueue *cq = n->cq[sq->cqid];
>  
> +trace_pci_nvme_rw_cb(nvme_cid(req));
> +
>  if (!ret) {
>  block_acct_done(blk_get_stats(n->conf.blk), &req->acct);
>  req->status = NVME_SUCCESS;
> @@ -378,6 +382,8 @@ static uint16_t nvme_write_zeros(NvmeCtrl *n, 
> NvmeNamespace *ns, NvmeCmd *cmd,
>  uint64_t offset = slba << data_shift;
>  uint32_t count = nlb << data_shift;
>  
> +trace_pci_nvme_write_zeroes(nvme_cid(req), slba, nlb);
> +
>  if (unlikely(slba + nlb > ns->id_ns.nsze)) {
>  trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
>  return NVME_LBA_RANGE | NVME_DNR;
> @@ -445,6 +451,8 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, 
> NvmeRequest *req)
>  NvmeNamespace *ns;
>  uint32_t nsid = le32_to_cpu(cmd->nsid);
>  
> +trace_pci_nvme_io_cmd(nvme_cid(req), nsid, nvme_sqid(req), cmd->opcode);
> +
>  if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
>  trace_pci_nvme_err_invalid_ns(nsid, n->num_namespaces);
>  return NVME_INVALID_NSID | NVME_DNR;
> @@ -876,6 +884,8 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd 
> *cmd, NvmeRequest *req)
>  
>  static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  {
> +trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), cmd->opcode);
> +
>  switch (cmd->opcode) {
>  case NVME_ADM_CMD_DELETE_SQ:
>  return nvme_del_sq(n, cmd);
> @@ -1204,6 +1214,8 @@ static uint64_t nvme_mmio_read(void *opaque, hwaddr 
> addr, unsigned size)
>  uint8_t *ptr = (uint8_t *)&n->bar;
>  uint64_t val = 0;
>  
> +trace_pci_nvme_mmio_read(addr);
> +
>  if (unlikely(addr & (sizeof(uint32_t) - 1))) {
>  NVME_GUEST_ERR(pci_nvme_ub_mmiord_misaligned32,
> "MMIO read not 32-bit aligned,"
> @@ -1273,6 +1285,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, 
> int val)
>  return;
>  }
>  
> +trace_pci_nvme_mmio_doorbell_cq(cq->cqid, new_head);
> +
>  start_sqs = nvme_cq_full(cq) ? 1 : 0;
>  cq->head = new_head;
>  if (start_sqs) {
> @@ -1311,6 +1325,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, 
> int val)
>  return;
>  }
>  
> +trace_pci_nvme_mmio_doorbell_sq(sq->sqid, new_tail);
> +
>  sq->tail = new_tail;
>  timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
>  }
> @@ -1320,6 +1336,9 @@ static void nvme_mmio_write(void *opaque, hwaddr addr, 
> uint64_t data,
>  unsigned size)
>  {
>  NvmeCtrl *n = (NvmeCtrl *)opaque;
> +
> +trace_pci_nvme_mmio_write(addr, data);
> +
>  if (addr < sizeof(n->bar)) {
>  nvme_write_bar(n, addr, data, size);
>  } else if (addr >= 0x1000) {
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index 1d30c0bca283..1bf5c80ed843 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -115,4 +115,18 @@ static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, 
> NvmeNamespace *ns)
>  return n->ns_size >> nvme_ns_lbads(ns);
>  }
>  
> +static inline uint16_t nvme_cid(NvmeRequest *req)
> +{
> +if (req) {
> +return le16_to_cpu(req->cqe.cid);
> +}
> +
> +return 0x;

In this case I find the inverted logic easier. Matter
of taste :)
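
One reading of "inverted logic" is to return early for the missing request and keep the normal path last. A minimal sketch under that assumption follows. `SketchRequest` is a reduced stand-in for `NvmeRequest`, the `le16_to_cpu()` conversion is left out, and the sentinel value is an assumption because the constant in the quoted patch is truncated in this archive.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative sentinel for "no request". The constant used by the patch
 * is truncated in this archive, so this value is an assumption.
 */
#define SKETCH_CID_NONE UINT16_MAX

/* Reduced stand-in for NvmeRequest: only the field the helper reads. */
typedef struct SketchRequest {
    uint16_t cid;
} SketchRequest;

/* Exceptional case first, normal path at the end of the function. */
uint16_t sketch_cid(const SketchRequest *req)
{
    if (!req) {
        return SKETCH_CID_NONE;
    }

    return req->cid;
}
```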

> +}
> +
> +static inline uint16_t nvme_sqid(NvmeRequest *req)
> +{
> +return le16_to_cpu(req->sq->sqid);
> +}

Later I'd prefer we move these out of the header, and remove
the inline attribute.
I.e. nvme_ns_nlbas() is only used once.

OK for now.
Reviewed-by: Philippe Mathieu-Daudé 

> +
>  #endif /* HW_NVME_H */
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 958fcc5508d1..c40c0d2e4b28 100644
> --- a

Re: [PATCH v2 01/25] iotests: Fix 051 output after qdev_init_nofail() removal

2020-07-03 Thread Kevin Wolf

On 24.06.2020 at 16:04, Alex Bennée wrote:
> From: Philippe Mathieu-Daudé 
> 
> Commit 96927c744 replaced qdev_init_nofail() call by
> isa_realize_and_unref() which has a different error
> message. Update the test output accordingly.
> 
> Gitlab CI error after merging b77b5b3dc7:
> https://gitlab.com/qemu-project/qemu/-/jobs/597414772#L4375
> 
> Reported-by: Thomas Huth 
> Signed-off-by: Philippe Mathieu-Daudé 
> Signed-off-by: Alex Bennée 
> Reviewed-by: John Snow 
> Reviewed-by: Thomas Huth 
> Message-Id: <20200616154949.6586-1-phi...@redhat.com>

Thanks, applied (this individual patch) to the block branch.

Kevin

Re: [PATCH v6 4/5] 9pfs: T_readdir latency optimization

2020-07-03 Thread Christian Schoenebeck

On Thursday, 2 July 2020 19:23:35 CEST Christian Schoenebeck wrote:
> > > Back to the actual topic: so what do we do about the mutex then? CoMutex
> > > for 9p2000.u and Mutex for 9p2000.L? I know you find that ugly, but it
> > > would just be a transitional measure.
> > 
> > That would ruin my day...
> > 
> > More seriously, the recent fix for a deadlock condition that was present
> > for years proves that nobody seems to be using silly clients with QEMU.
> > So I think we should just dump the lock and add a one-time warning in
> > the top level handlers when we detect a duplicate readdir request on
> > the same fid. This should be a very simple patch (I can take care of
> > it right away).
> 
> Looks like we have a consensus! Then I wait for your patch removing the
> lock, and for your feedback whether or not you see anything else in this
> patch set?

Please wait before you work on this patch. I really do think your patch should 
be based on/after this optimization patch for one reason: if (and even though 
it's a big if) somebody comes along with a silly client as you named it, then 
your patch can simply be reverted, which would not be possible if it's next.

So I would really suggest that I change this patch to go the ugly way with 2 
mutex types for readdir (9p2000.u vs. 9p2000.L), and your patch would then get 
rid of that mess by removing the lock entirely, okay?
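
The one-time warning proposed in the quoted text could look roughly like the sketch below. It is only an illustration of the idea: `SketchFid` and both functions are hypothetical, QEMU's actual fid structure and handlers are not shown in this thread, and a plain counter stands in for whatever bookkeeping the real patch would use.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-fid state: number of readdir requests in flight. */
typedef struct SketchFid {
    unsigned readdir_in_flight;
} SketchFid;

/* Set once the warning has been printed, so it is never repeated. */
static bool sketch_warned;

/*
 * Called on entry to a readdir handler. Returns true when another
 * readdir on the same fid is still in flight, and prints the warning
 * only the first time such a duplicate is detected.
 */
bool sketch_readdir_begin(SketchFid *fid)
{
    bool duplicate = fid->readdir_in_flight > 0;

    if (duplicate && !sketch_warned) {
        sketch_warned = true;
        fprintf(stderr, "9p: concurrent readdir requests on the same fid\n");
    }
    fid->readdir_in_flight++;
    return duplicate;
}

/* Called when a readdir request on this fid completes. */
void sketch_readdir_end(SketchFid *fid)
{
    fid->readdir_in_flight--;
}
```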

Best regards,
Christian Schoenebeck

Re: [PATCH v2 04/18] hw/block/nvme: add temperature threshold feature

2020-07-03 Thread Philippe Mathieu-Daudé

On 7/3/20 8:34 AM, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> It might seem weird to implement this feature for an emulated device,
> but it is mandatory to support and the feature is useful for testing
> asynchronous event request support, which will be added in a later
> patch.

It might be interesting to plug that into the "temperature sensor
interface" I suggested here (I'll rework it during 5.2):
https://www.mail-archive.com/qemu-block@nongnu.org/msg65192.html

> 
> Signed-off-by: Klaus Jensen 
> Acked-by: Keith Busch 
> Reviewed-by: Maxim Levitsky 
> ---
>  hw/block/nvme.c  | 48 
>  hw/block/nvme.h  |  1 +
>  include/block/nvme.h |  5 -
>  3 files changed, 53 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index b7037a7d3504..5ca50646369e 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -59,6 +59,9 @@
>  #define NVME_DB_SIZE  4
>  #define NVME_CMB_BIR 2
>  #define NVME_PMR_BIR 2
> +#define NVME_TEMPERATURE 0x143
> +#define NVME_TEMPERATURE_WARNING 0x157
> +#define NVME_TEMPERATURE_CRITICAL 0x175
>  
>  #define NVME_GUEST_ERR(trace, fmt, ...) \
>  do { \
> @@ -827,9 +830,31 @@ static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, 
> NvmeCmd *cmd)
>  static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
>  {
>  uint32_t dw10 = le32_to_cpu(cmd->cdw10);
> +uint32_t dw11 = le32_to_cpu(cmd->cdw11);
>  uint32_t result;
>  
>  switch (dw10) {
> +case NVME_TEMPERATURE_THRESHOLD:
> +result = 0;
> +
> +/*
> + * The controller only implements the Composite Temperature sensor, 
> so
> + * return 0 for all other sensors.
> + */
> +if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> +break;
> +}
> +
> +switch (NVME_TEMP_THSEL(dw11)) {
> +case NVME_TEMP_THSEL_OVER:
> +result = cpu_to_le16(n->features.temp_thresh_hi);
> +break;
> +case NVME_TEMP_THSEL_UNDER:
> +result = cpu_to_le16(n->features.temp_thresh_low);
> +break;
> +}
> +
> +break;
>  case NVME_VOLATILE_WRITE_CACHE:
>  result = blk_enable_write_cache(n->conf.blk);
>  trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
> @@ -874,6 +899,23 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd 
> *cmd, NvmeRequest *req)
>  uint32_t dw11 = le32_to_cpu(cmd->cdw11);
>  
>  switch (dw10) {
> +case NVME_TEMPERATURE_THRESHOLD:
> +if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> +break;
> +}
> +
> +switch (NVME_TEMP_THSEL(dw11)) {
> +case NVME_TEMP_THSEL_OVER:
> +n->features.temp_thresh_hi = NVME_TEMP_TMPTH(dw11);
> +break;
> +case NVME_TEMP_THSEL_UNDER:
> +n->features.temp_thresh_low = NVME_TEMP_TMPTH(dw11);
> +break;
> +default:
> +return NVME_INVALID_FIELD | NVME_DNR;
> +}
> +
> +break;
>  case NVME_VOLATILE_WRITE_CACHE:
>  blk_set_enable_write_cache(n->conf.blk, dw11 & 1);
>  break;
> @@ -1454,6 +1496,7 @@ static void nvme_init_state(NvmeCtrl *n)
>  n->namespaces = g_new0(NvmeNamespace, n->num_namespaces);
>  n->sq = g_new0(NvmeSQueue *, n->params.max_ioqpairs + 1);
>  n->cq = g_new0(NvmeCQueue *, n->params.max_ioqpairs + 1);
> +n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
>  }
>  
>  static void nvme_init_blk(NvmeCtrl *n, Error **errp)
> @@ -1611,6 +1654,11 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice 
> *pci_dev)
>  id->acl = 3;
>  id->frmw = 7 << 1;
>  id->lpa = 1 << 0;
> +
> +/* recommended default value (~70 C) */
> +id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
> +id->cctemp = cpu_to_le16(NVME_TEMPERATURE_CRITICAL);
> +
>  id->sqes = (0x6 << 4) | 0x6;
>  id->cqes = (0x4 << 4) | 0x4;
>  id->nn = cpu_to_le32(n->num_namespaces);
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index 1bf5c80ed843..3acde10e1d2a 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -107,6 +107,7 @@ typedef struct NvmeCtrl {
>  NvmeSQueue  admin_sq;
>  NvmeCQueue  admin_cq;
>  NvmeIdCtrl  id_ctrl;
> +NvmeFeatureVal  features;
>  } NvmeCtrl;
>  
>  /* calculate the number of LBAs that the namespace can accomodate */
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index 2a80d2a7ed89..d2c457695b38 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -860,7 +860,10 @@ enum NvmeIdCtrlOncs {
>  typedef struct NvmeFeatureVal {
>  uint32_tarbitration;
>  uint32_tpower_mgmt;
> -uint32_ttemp_thresh;
> +struct {
> +uint16_t temp_thresh_hi;
> +uint16_t temp_thresh_low;
> +};
>  uint32_terr_rec;
>  uint32_tvolatile_wc;
>  uint32_tnum_queues;
>

Re: [PATCH v2 02/18] hw/block/nvme: additional tracing

2020-07-03 Thread Klaus Jensen

On Jul  3 10:03, Philippe Mathieu-Daudé wrote:
> On 7/3/20 8:34 AM, Klaus Jensen wrote:
> > From: Klaus Jensen 
> > 
> > Add various additional tracing and streamline nvme_identify_ns and
> > nvme_identify_nslist (they do not need to repeat the command, it is
> > already in the trace name).
> > 
> > Signed-off-by: Klaus Jensen 
> > Reviewed-by: Dmitry Fomichev 
> > ---
> >  hw/block/nvme.c   | 19 +++
> >  hw/block/nvme.h   | 14 ++
> >  hw/block/trace-events | 13 +++--
> >  3 files changed, 44 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index 71b388aa0e20..f5d9148f0936 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -331,6 +331,8 @@ static void nvme_post_cqes(void *opaque)
> >  static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
> >  {
> >  assert(cq->cqid == req->sq->cqid);
> > +trace_pci_nvme_enqueue_req_completion(nvme_cid(req), cq->cqid,
> > +  req->status);
> >  QTAILQ_REMOVE(&req->sq->out_req_list, req, entry);
> >  QTAILQ_INSERT_TAIL(&cq->req_list, req, entry);
> >  timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
> > @@ -343,6 +345,8 @@ static void nvme_rw_cb(void *opaque, int ret)
> >  NvmeCtrl *n = sq->ctrl;
> >  NvmeCQueue *cq = n->cq[sq->cqid];
> >  
> > +trace_pci_nvme_rw_cb(nvme_cid(req));
> > +
> >  if (!ret) {
> >  block_acct_done(blk_get_stats(n->conf.blk), &req->acct);
> >  req->status = NVME_SUCCESS;
> > @@ -378,6 +382,8 @@ static uint16_t nvme_write_zeros(NvmeCtrl *n, 
> > NvmeNamespace *ns, NvmeCmd *cmd,
> >  uint64_t offset = slba << data_shift;
> >  uint32_t count = nlb << data_shift;
> >  
> > +trace_pci_nvme_write_zeroes(nvme_cid(req), slba, nlb);
> > +
> >  if (unlikely(slba + nlb > ns->id_ns.nsze)) {
> >  trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
> >  return NVME_LBA_RANGE | NVME_DNR;
> > @@ -445,6 +451,8 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeCmd *cmd, 
> > NvmeRequest *req)
> >  NvmeNamespace *ns;
> >  uint32_t nsid = le32_to_cpu(cmd->nsid);
> >  
> > +trace_pci_nvme_io_cmd(nvme_cid(req), nsid, nvme_sqid(req), 
> > cmd->opcode);
> > +
> >  if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
> >  trace_pci_nvme_err_invalid_ns(nsid, n->num_namespaces);
> >  return NVME_INVALID_NSID | NVME_DNR;
> > @@ -876,6 +884,8 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd 
> > *cmd, NvmeRequest *req)
> >  
> >  static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
> >  {
> > +trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), cmd->opcode);
> > +
> >  switch (cmd->opcode) {
> >  case NVME_ADM_CMD_DELETE_SQ:
> >  return nvme_del_sq(n, cmd);
> > @@ -1204,6 +1214,8 @@ static uint64_t nvme_mmio_read(void *opaque, hwaddr 
> > addr, unsigned size)
> >  uint8_t *ptr = (uint8_t *)&n->bar;
> >  uint64_t val = 0;
> >  
> > +trace_pci_nvme_mmio_read(addr);
> > +
> >  if (unlikely(addr & (sizeof(uint32_t) - 1))) {
> >  NVME_GUEST_ERR(pci_nvme_ub_mmiord_misaligned32,
> > "MMIO read not 32-bit aligned,"
> > @@ -1273,6 +1285,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, 
> > int val)
> >  return;
> >  }
> >  
> > +trace_pci_nvme_mmio_doorbell_cq(cq->cqid, new_head);
> > +
> >  start_sqs = nvme_cq_full(cq) ? 1 : 0;
> >  cq->head = new_head;
> >  if (start_sqs) {
> > @@ -1311,6 +1325,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, 
> > int val)
> >  return;
> >  }
> >  
> > +trace_pci_nvme_mmio_doorbell_sq(sq->sqid, new_tail);
> > +
> >  sq->tail = new_tail;
> >  timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
> >  }
> > @@ -1320,6 +1336,9 @@ static void nvme_mmio_write(void *opaque, hwaddr 
> > addr, uint64_t data,
> >  unsigned size)
> >  {
> >  NvmeCtrl *n = (NvmeCtrl *)opaque;
> > +
> > +trace_pci_nvme_mmio_write(addr, data);
> > +
> >  if (addr < sizeof(n->bar)) {
> >  nvme_write_bar(n, addr, data, size);
> >  } else if (addr >= 0x1000) {
> > diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> > index 1d30c0bca283..1bf5c80ed843 100644
> > --- a/hw/block/nvme.h
> > +++ b/hw/block/nvme.h
> > @@ -115,4 +115,18 @@ static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, 
> > NvmeNamespace *ns)
> >  return n->ns_size >> nvme_ns_lbads(ns);
> >  }
> >  
> > +static inline uint16_t nvme_cid(NvmeRequest *req)
> > +{
> > +if (req) {
> > +return le16_to_cpu(req->cqe.cid);
> > +}
> > +
> > +return 0x;
> 
> In this case I find the inverted logic easier. Matter
> of taste :)
> 

Your taste is definitely better than mine. I'll queue it up for a style
fix ;)

> > +}
> > +
> > +static inline uint16_t

[Bug 1886155] [NEW] error: argument 2 of ‘__atomic_load’ discards ‘const’ qualifier

2020-07-03 Thread Martin Liska

Public bug reported:

GCC 11 reports the following errors:

[  125s] In file included from 
/home/abuild/rpmbuild/BUILD/qemu-5.0.0/include/qemu/seqlock.h:17,
[  125s]  from 
/home/abuild/rpmbuild/BUILD/qemu-5.0.0/include/qemu/qht.h:10,
[  125s]  from 
/home/abuild/rpmbuild/BUILD/qemu-5.0.0/util/qht.c:69:
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/util/qht.c: In function 
'qht_do_lookup':
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/include/qemu/atomic.h:153:5: 
error: argument 2 of '__atomic_load' discards 'const' qualifier 
[-Werror=incompatible-pointer-types]
[  125s]   153 | __atomic_load(ptr, valptr, __ATOMIC_RELAXED);   \
[  125s]   | ^
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/include/qemu/atomic.h:161:5: 
note: in expansion of macro 'atomic_rcu_read__nocheck'
[  125s]   161 | atomic_rcu_read__nocheck(ptr, &_val); \
[  125s]   | ^~~~
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/util/qht.c:499:27: note: in 
expansion of macro 'atomic_rcu_read'
[  125s]   499 | void *p = atomic_rcu_read(&b->pointers[i]);
[  125s]   |   ^~~
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/include/qemu/atomic.h:153:5: 
error: argument 2 of '__atomic_load' discards 'const' qualifier 
[-Werror=incompatible-pointer-types]
[  125s]   153 | __atomic_load(ptr, valptr, __ATOMIC_RELAXED);   \
[  125s]   | ^
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/include/qemu/atomic.h:161:5: 
note: in expansion of macro 'atomic_rcu_read__nocheck'
[  125s]   161 | atomic_rcu_read__nocheck(ptr, &_val); \
[  125s]   | ^~~~
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/util/qht.c:506:13: note: in 
expansion of macro 'atomic_rcu_read'
[  125s]   506 | b = atomic_rcu_read(&b->next);
[  125s]   | ^~~
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/util/qht.c: In function 
'qht_lookup_custom':
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/include/qemu/atomic.h:153:5: 
error: argument 2 of '__atomic_load' discards 'const' qualifier 
[-Werror=incompatible-pointer-types]
[  125s]   153 | __atomic_load(ptr, valptr, __ATOMIC_RELAXED);   \
[  125s]   | ^
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/include/qemu/atomic.h:161:5: 
note: in expansion of macro 'atomic_rcu_read__nocheck'
[  125s]   161 | atomic_rcu_read__nocheck(ptr, &_val); \
[  125s]   | ^~~~
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/util/qht.c:534:11: note: in 
expansion of macro 'atomic_rcu_read'
[  125s]   534 | map = atomic_rcu_read(&ht->map);
[  125s]   |   ^~~
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/util/qht.c: In function 
'qht_statistics_init':
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/include/qemu/atomic.h:153:5: 
error: argument 2 of '__atomic_load' discards 'const' qualifier 
[-Werror=incompatible-pointer-types]
[  125s]   153 | __atomic_load(ptr, valptr, __ATOMIC_RELAXED);   \
[  125s]   | ^
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/include/qemu/atomic.h:161:5: 
note: in expansion of macro 'atomic_rcu_read__nocheck'
[  125s]   161 | atomic_rcu_read__nocheck(ptr, &_val); \
[  125s]   | ^~~~
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/util/qht.c:907:11: note: in 
expansion of macro 'atomic_rcu_read'
[  125s]   907 | map = atomic_rcu_read(&ht->map);
[  125s]   |   ^~~
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/include/qemu/atomic.h:153:5: 
error: argument 2 of '__atomic_load' discards 'const' qualifier 
[-Werror=incompatible-pointer-types]
[  125s]   153 | __atomic_load(ptr, valptr, __ATOMIC_RELAXED);   \
[  125s]   | ^
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/include/qemu/atomic.h:161:5: 
note: in expansion of macro 'atomic_rcu_read__nocheck'
[  125s]   161 | atomic_rcu_read__nocheck(ptr, &_val); \
[  125s]   | ^~~~
[  125s] /home/abuild/rpmbuild/BUILD/qemu-5.0.0/util/qht.c:941:21: note: in 
expansion of macro 'atomic_rcu_read'
[  125s]   941 | b = atomic_rcu_read(&b->next);
[  125s]   | ^~~
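
The log suggests that the temporary passed as the second argument of `__atomic_load()` inherits a `const` qualifier from the object being read, presumably because it is declared with the type of `*ptr` and the bucket is reached through a pointer-to-const. The sketch below shows the same access pattern written with `__atomic_load_n()`, which returns the value and needs no temporary. It is an illustration only, not QEMU's fix: `SketchBucket` is a reduced stand-in for the real bucket type, and the relaxed load does not reproduce the full ordering of `atomic_rcu_read()`.

```c
#include <stddef.h>

#define SKETCH_ENTRIES 4

/* Reduced stand-in for a hash-table bucket. */
typedef struct SketchBucket {
    void *pointers[SKETCH_ENTRIES];
    struct SketchBucket *next;
} SketchBucket;

/*
 * Walks a bucket chain through a pointer-to-const and returns the first
 * non-NULL entry. &b->pointers[i] has type "void *const *", so a
 * temporary declared with the pointee type would be const-qualified.
 * __atomic_load_n() only reads through its argument and returns the
 * loaded value, so the qualifier is harmless here.
 */
void *sketch_first_entry(const SketchBucket *b)
{
    while (b) {
        for (size_t i = 0; i < SKETCH_ENTRIES; i++) {
            void *p = __atomic_load_n(&b->pointers[i], __ATOMIC_RELAXED);

            if (p) {
                return p;
            }
        }
        b = __atomic_load_n(&b->next, __ATOMIC_RELAXED);
    }
    return NULL;
}
```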

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1886155


Re: [PATCH v2 17/18] hw/block/nvme: provide the mandatory subnqn field

2020-07-03 Thread Philippe Mathieu-Daudé

On 7/3/20 8:34 AM, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> The SUBNQN field is mandatory in NVM Express 1.3.
> 
> Signed-off-by: Klaus Jensen 
> Reviewed-by: Maxim Levitsky 
> Reviewed-by: Dmitry Fomichev 
> ---
>  hw/block/nvme.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 8138baa6fbd8..5bbb6aa0efc3 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -2134,6 +2134,9 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice 
> *pci_dev)
>  id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP |
> NVME_ONCS_FEATURES);
>  
> +pstrcpy((char *) id->subnqn, sizeof(id->subnqn), 
> "nqn.2019-08.org.qemu:");
> +pstrcat((char *) id->subnqn, sizeof(id->subnqn), n->params.serial);

What about using strpadcpy()?

  char *subnqn = g_strdup_printf("nqn.2019-08.org.qemu:%s",
 n->params.serial);
  strpadcpy((char *)id->subnqn, sizeof(id->subnqn), subnqn, '\0');
  g_free(subnqn);
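
To make the suggestion concrete outside of QEMU, here is a self-contained sketch of the same idea. `sketch_strpadcpy()` is a stand-in written for this example with the copy-then-pad behaviour the call above relies on, and a fixed stack buffer with `snprintf()` replaces `g_strdup_printf()`. The buffer size of 512 is an arbitrary assumption.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copies at most buf_size bytes of str into buf and fills the remainder
 * with pad. The result is not NUL-terminated when str is at least
 * buf_size bytes long.
 */
void sketch_strpadcpy(char *buf, size_t buf_size, const char *str, char pad)
{
    size_t len = strlen(str);

    if (len > buf_size) {
        len = buf_size;
    }
    memcpy(buf, str, len);
    memset(buf + len, pad, buf_size - len);
}

/* Builds "nqn.2019-08.org.qemu:<serial>" into a fixed-size field. */
void sketch_fill_subnqn(char *field, size_t field_size, const char *serial)
{
    char tmp[512]; /* arbitrary scratch size, an assumption of this sketch */

    snprintf(tmp, sizeof(tmp), "nqn.2019-08.org.qemu:%s", serial);
    sketch_strpadcpy(field, field_size, tmp, '\0');
}
```

Compared with the `pstrcpy()`/`pstrcat()` pair, this writes every byte of the field in one pass, so the padding does not depend on the field having been zeroed beforehand.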

> +
>  id->psd[0].mp = cpu_to_le16(0x9c4);
>  id->psd[0].enlat = cpu_to_le32(0x10);
>  id->psd[0].exlat = cpu_to_le32(0x4);
>

Re: [PATCH v2 04/18] hw/block/nvme: add temperature threshold feature

2020-07-03 Thread Klaus Jensen

On Jul  3 10:08, Philippe Mathieu-Daudé wrote:
> On 7/3/20 8:34 AM, Klaus Jensen wrote:
> > From: Klaus Jensen 
> > 
> > It might seem weird to implement this feature for an emulated device,
> > but it is mandatory to support and the feature is useful for testing
> > asynchronous event request support, which will be added in a later
> > patch.
> 
> It might be interesting to plug that into the "temperature sensor
> interface" I suggested here (I'll rework it during 5.2):
> https://www.mail-archive.com/qemu-block@nongnu.org/msg65192.html
> 

That would be pretty cool, since currently only the thresholds can be
changed to cause a reaction.

Re: [PATCH v2 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list

2020-07-03 Thread Philippe Mathieu-Daudé

On 7/3/20 8:34 AM, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> Reject the nsid broadcast value (0x) and 0xfffe in the
> Active Namespace ID list.

Can we have a definition instead of this 0xfffe magic value please?

> 
> Signed-off-by: Klaus Jensen 
> ---
>  hw/block/nvme.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 65c2fa3ac1f4..0dac7a41ddae 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -956,6 +956,10 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, 
> NvmeIdentify *c)
>  
>  trace_pci_nvme_identify_nslist(min_nsid);
>  
> +if (min_nsid == 0xfffe || min_nsid == NVME_NSID_BROADCAST) {
> +return NVME_INVALID_NSID | NVME_DNR;
> +}
> +
>  list = g_malloc0(data_len);
>  for (i = 0; i < n->num_namespaces; i++) {
>  if (i < min_nsid) {
>

Re: [PATCH v2 17/18] hw/block/nvme: provide the mandatory subnqn field

2020-07-03 Thread Klaus Jensen

On Jul  3 10:18, Philippe Mathieu-Daudé wrote:
> On 7/3/20 8:34 AM, Klaus Jensen wrote:
> > From: Klaus Jensen 
> > 
> > The SUBNQN field is mandatory in NVM Express 1.3.
> > 
> > Signed-off-by: Klaus Jensen 
> > Reviewed-by: Maxim Levitsky 
> > Reviewed-by: Dmitry Fomichev 
> > ---
> >  hw/block/nvme.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index 8138baa6fbd8..5bbb6aa0efc3 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -2134,6 +2134,9 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice 
> > *pci_dev)
> >  id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROS | NVME_ONCS_TIMESTAMP |
> > NVME_ONCS_FEATURES);
> >  
> > +pstrcpy((char *) id->subnqn, sizeof(id->subnqn), 
> > "nqn.2019-08.org.qemu:");
> > +pstrcat((char *) id->subnqn, sizeof(id->subnqn), n->params.serial);
> 
> What about using strpadcpy()?
> 
>   char *subnqn = g_strdup_printf("nqn.2019-08.org.qemu:%s",
>  n->params.serial);
>   strpadcpy((char *)id->subnqn, sizeof(id->subnqn), subnqn, '\0');
>   g_free(subnqn);
> 

Thanks, that's better. Fixed!

Re: [PATCH v2 14/18] hw/block/nvme: support identify namespace descriptor list

2020-07-03 Thread Philippe Mathieu-Daudé

On 7/3/20 8:34 AM, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> Since we are not providing the NGUID or EUI64 fields, we must support
> the Namespace UUID. We do not have any way of storing a persistent
> unique identifier, so conjure up a UUID that is just the namespace id.
> 
> Signed-off-by: Klaus Jensen 
> Reviewed-by: Dmitry Fomichev 
> ---
>  hw/block/nvme.c   | 41 +
>  hw/block/trace-events |  1 +
>  2 files changed, 42 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 8230e0e3826b..65c2fa3ac1f4 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -971,6 +971,45 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, 
> NvmeIdentify *c)
>  return ret;
>  }
>  
> +static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeIdentify *c)
> +{
> +uint32_t nsid = le32_to_cpu(c->nsid);
> +uint64_t prp1 = le64_to_cpu(c->prp1);
> +uint64_t prp2 = le64_to_cpu(c->prp2);
> +
> +uint8_t list[NVME_IDENTIFY_DATA_SIZE];
> +
> +struct data {
> +struct {
> +NvmeIdNsDescr hdr;
> +uint8_t v[16];

You might consider using QemuUUID from "qemu/uuid.h". The benefits
are that you can use qemu_uuid_parse() and qemu_uuid_unparse*() for
tracing, and DEFINE_PROP_UUID() if you want to set a particular UUID
from the command line, in case it is important to the guest.
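
As an illustration of what a dedicated UUID type buys for tracing, the sketch below stores the namespace id the way the patch does (big-endian in the first four bytes) and formats the result as text. `SketchUUID` and both functions are stand-ins invented for this example, not the "qemu/uuid.h" API.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for a UUID type: 16 raw bytes. */
typedef struct SketchUUID {
    uint8_t data[16];
} SketchUUID;

/*
 * Same layout as stl_be_p(&uuid.v, nsid) on a zeroed field: the
 * namespace id occupies the first four bytes, most significant first.
 */
void sketch_uuid_from_nsid(SketchUUID *uuid, uint32_t nsid)
{
    memset(uuid, 0, sizeof(*uuid));
    uuid->data[0] = (uint8_t)(nsid >> 24);
    uuid->data[1] = (uint8_t)(nsid >> 16);
    uuid->data[2] = (uint8_t)(nsid >> 8);
    uuid->data[3] = (uint8_t)nsid;
}

/* Writes the 8-4-4-4-12 text form; out must hold 37 bytes. */
void sketch_uuid_unparse(const SketchUUID *uuid, char *out)
{
    char *p = out;

    for (int i = 0; i < 16; i++) {
        if (i == 4 || i == 6 || i == 8 || i == 10) {
            *p++ = '-';
        }
        /* Two hex digits plus the terminating NUL, hence size 3. */
        p += snprintf(p, 3, "%02x", (unsigned)uuid->data[i]);
    }
}
```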

> +} uuid;
> +};
> +
> +struct data *ns_descrs = (struct data *)list;
> +
> +trace_pci_nvme_identify_ns_descr_list(nsid);
> +
> +if (unlikely(nsid == 0 || nsid > n->num_namespaces)) {
> +trace_pci_nvme_err_invalid_ns(nsid, n->num_namespaces);
> +return NVME_INVALID_NSID | NVME_DNR;
> +}
> +
> +memset(list, 0x0, sizeof(list));
> +
> +/*
> + * Because the NGUID and EUI64 fields are 0 in the Identify Namespace 
> data
> + * structure, a Namespace UUID (nidt = 0x3) must be reported in the
> + * Namespace Identification Descriptor. Add a very basic Namespace UUID
> + * here.
> + */
> +ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;
> +ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN;
> +stl_be_p(&ns_descrs->uuid.v, nsid);
> +
> +return nvme_dma_read_prp(n, list, NVME_IDENTIFY_DATA_SIZE, prp1, prp2);
> +}
> +
>  static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
>  {
>  NvmeIdentify *c = (NvmeIdentify *)cmd;
> @@ -982,6 +1021,8 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeCmd *cmd)
>  return nvme_identify_ctrl(n, c);
>  case NVME_ID_CNS_NS_ACTIVE_LIST:
>  return nvme_identify_nslist(n, c);
> +case NVME_ID_CNS_NS_DESCR_LIST:
> +return nvme_identify_ns_descr_list(n, c);
>  default:
>  trace_pci_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns));
>  return NVME_INVALID_FIELD | NVME_DNR;
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 4a4ef34071df..7b7303cab1dd 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -45,6 +45,7 @@ pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, 
> cqid=%"PRIu16""
>  pci_nvme_identify_ctrl(void) "identify controller"
>  pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
> +pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, 
> uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 
> 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
>  pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint8_t sel, uint32_t cdw11) 
> "cid %"PRIu16" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
>  pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint8_t save, uint32_t cdw11) 
> "cid %"PRIu16" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32""
>

Re: [PATCH v2 11/18] hw/block/nvme: add remaining mandatory controller parameters

2020-07-03 Thread Philippe Mathieu-Daudé

On 7/3/20 8:34 AM, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> Add support for any remaining mandatory controller operating parameters
> (features).
> 
> Signed-off-by: Klaus Jensen 
> Reviewed-by: Dmitry Fomichev 
> ---
>  hw/block/nvme.c   | 39 +--
>  hw/block/nvme.h   | 18 ++
>  hw/block/trace-events |  2 ++
>  include/block/nvme.h  |  7 +++
>  4 files changed, 60 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index ba523f6768bf..affb9a967534 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1056,8 +1056,16 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd 
> *cmd, NvmeRequest *req)
>  uint32_t dw10 = le32_to_cpu(cmd->cdw10);
>  uint32_t dw11 = le32_to_cpu(cmd->cdw11);
>  uint32_t result;
> +uint8_t fid = NVME_GETSETFEAT_FID(dw10);
> +uint16_t iv;
>  
> -switch (dw10) {
> +trace_pci_nvme_getfeat(nvme_cid(req), fid, dw11);
> +
> +if (!nvme_feature_support[fid]) {
> +return NVME_INVALID_FIELD | NVME_DNR;
> +}
> +
> +switch (fid) {
>  case NVME_TEMPERATURE_THRESHOLD:
>  result = 0;
>  
> @@ -1088,14 +1096,27 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd 
> *cmd, NvmeRequest *req)
>   ((n->params.max_ioqpairs - 1) << 16));
>  trace_pci_nvme_getfeat_numq(result);
>  break;
> +case NVME_INTERRUPT_VECTOR_CONF:
> +iv = dw11 & 0x;
> +if (iv >= n->params.max_ioqpairs + 1) {
> +return NVME_INVALID_FIELD | NVME_DNR;
> +}
> +
> +result = iv;
> +if (iv == n->admin_cq.vector) {
> +result |= NVME_INTVC_NOCOALESCING;
> +}
> +
> +result = cpu_to_le32(result);
> +break;
>  case NVME_ASYNCHRONOUS_EVENT_CONF:
>  result = cpu_to_le32(n->features.async_config);
>  break;
>  case NVME_TIMESTAMP:
>  return nvme_get_feature_timestamp(n, cmd);
>  default:
> -trace_pci_nvme_err_invalid_getfeat(dw10);
> -return NVME_INVALID_FIELD | NVME_DNR;
> +result = cpu_to_le32(nvme_feature_default[fid]);

So here we expect uninitialized fid entries to return 0, right?
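
As far as the C language is concerned, that expectation holds: elements of an array that are not named in a designated initializer list are zero-initialized. A small sketch with hypothetical feature identifiers and a made-up default value:

```c
#include <stdint.h>

/* Hypothetical feature identifiers and default, for illustration only. */
enum {
    SKETCH_FID_ARBITRATION = 0x01,
    SKETCH_FID_OTHER       = 0x02,
};
#define SKETCH_ARB_DEFAULT 0x7u

/* Every element without a designated initializer starts out as zero. */
static const uint32_t sketch_feature_default[0x100] = {
    [SKETCH_FID_ARBITRATION] = SKETCH_ARB_DEFAULT,
};

/* fid is a uint8_t, so it can never index outside the 0x100 entries. */
uint32_t sketch_get_default(uint8_t fid)
{
    return sketch_feature_default[fid];
}
```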

> +break;
>  }
>  
>  req->cqe.result = result;
> @@ -1124,8 +1145,15 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd 
> *cmd, NvmeRequest *req)
>  {
>  uint32_t dw10 = le32_to_cpu(cmd->cdw10);
>  uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> +uint8_t fid = NVME_GETSETFEAT_FID(dw10);
>  
> -switch (dw10) {
> +trace_pci_nvme_setfeat(nvme_cid(req), fid, dw11);
> +
> +if (!nvme_feature_support[fid]) {
> +return NVME_INVALID_FIELD | NVME_DNR;
> +}
> +
> +switch (fid) {
>  case NVME_TEMPERATURE_THRESHOLD:
>  if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
>  break;
> @@ -1172,8 +1200,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd 
> *cmd, NvmeRequest *req)
>  case NVME_TIMESTAMP:
>  return nvme_set_feature_timestamp(n, cmd);
>  default:
> -trace_pci_nvme_err_invalid_setfeat(dw10);
> -return NVME_INVALID_FIELD | NVME_DNR;
> +return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
>  }
>  return NVME_SUCCESS;
>  }
> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> index f8940435f9ef..8ad1e3c89cee 100644
> --- a/hw/block/nvme.h
> +++ b/hw/block/nvme.h
> @@ -87,6 +87,24 @@ typedef struct NvmeFeatureVal {
>  uint32_t async_config;
>  } NvmeFeatureVal;
>  
> +static const uint32_t nvme_feature_default[0x100] = {
> +[NVME_ARBITRATION]   = NVME_ARB_AB_NOLIMIT,
> +};
> +
> +static const bool nvme_feature_support[0x100] = {
> +[NVME_ARBITRATION]  = true,
> +[NVME_POWER_MANAGEMENT] = true,
> +[NVME_TEMPERATURE_THRESHOLD]= true,
> +[NVME_ERROR_RECOVERY]   = true,
> +[NVME_VOLATILE_WRITE_CACHE] = true,
> +[NVME_NUMBER_OF_QUEUES] = true,
> +[NVME_INTERRUPT_COALESCING] = true,
> +[NVME_INTERRUPT_VECTOR_CONF]= true,
> +[NVME_WRITE_ATOMICITY]  = true,
> +[NVME_ASYNCHRONOUS_EVENT_CONF]  = true,
> +[NVME_TIMESTAMP]= true,
> +};

Nack. No variable assignment in header please.
Move that to the source file.
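
One way to follow this request, sketched under assumptions: the header keeps only an `extern` declaration and the initialized table lives in the source file. Both halves are shown in one translation unit, separated by comments. The `demo_` names and the feature IDs are hypothetical stand-ins, not QEMU's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ---- would live in the header: declarations only ---- */
enum {
    DEMO_ARBITRATION = 0x1,   /* hypothetical feature IDs for illustration */
    DEMO_TIMESTAMP   = 0xe,
};

extern const bool demo_feature_support[0x100];

/* ---- would live in the source file: the single definition ---- */
const bool demo_feature_support[0x100] = {
    [DEMO_ARBITRATION] = true,
    [DEMO_TIMESTAMP]   = true,
};

/* Lookup helper; a uint8_t index can never run past the 256-entry table. */
bool demo_feature_is_supported(uint8_t fid)
{
    return demo_feature_support[fid];
}
```

With a `static const` table in a header, every file that includes the header would get its own copy; a single definition in the .c file avoids that.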

> +
>  typedef struct NvmeCtrl {
>  PCIDevice parent_obj;
>  MemoryRegion iomem;
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 091af16ca7d7..42e62f4649f8 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -46,6 +46,8 @@ pci_nvme_identify_ctrl(void) "identify controller"
>  pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
>  pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, 
> uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 
> 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
> +pci_nv

Re: [PATCH v2 10/18] hw/block/nvme: fix missing endian conversion

2020-07-03 Thread Philippe Mathieu-Daudé

On 7/3/20 8:34 AM, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> Fix a missing cpu_to conversion.
> 
> Signed-off-by: Klaus Jensen 
> Reviewed-by: Dmitry Fomichev 
> ---
>  hw/block/nvme.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index f3a5b857bc92..ba523f6768bf 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1080,7 +1080,7 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd 
> *cmd, NvmeRequest *req)
>  
>  break;
>  case NVME_VOLATILE_WRITE_CACHE:
> -result = blk_enable_write_cache(n->conf.blk);
> +result = cpu_to_le32(blk_enable_write_cache(n->conf.blk));
>  trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
>  break;
>  case NVME_NUMBER_OF_QUEUES:
> 

This doesn't look correct. What you probably want:

-- >8 --

--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -815,8 +815,8 @@ static uint16_t nvme_get_feature(NvmeCtrl *n,
NvmeCmd *cmd, NvmeRequest *req)
 trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled");
 break;
 case NVME_NUMBER_OF_QUEUES:
-result = cpu_to_le32((n->params.max_ioqpairs - 1) |
- ((n->params.max_ioqpairs - 1) << 16));
+result = (n->params.max_ioqpairs - 1)
+  | ((n->params.max_ioqpairs - 1) << 16);
 trace_pci_nvme_getfeat_numq(result);
 break;
 case NVME_TIMESTAMP:
@@ -825,8 +825,8 @@ static uint16_t nvme_get_feature(NvmeCtrl *n,
NvmeCmd *cmd, NvmeRequest *req)
 trace_pci_nvme_err_invalid_getfeat(dw10);
 return NVME_INVALID_FIELD | NVME_DNR;
 }
+req->cqe.result = cpu_to_le32(result);

-req->cqe.result = result;
 return NVME_SUCCESS;
 }
---
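
The idea behind the suggestion above can be sketched as follows: build the value in host byte order and convert exactly once, where it is stored into the completion entry. This is only an illustration; `demo_cpu_to_le32()` is a hypothetical portable stand-in for QEMU's `cpu_to_le32()`, and the other `demo_` names are made up as well.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for cpu_to_le32(): returns a value whose in-memory
 * bytes are the little-endian encoding of v, whatever the host byte order.
 */
static uint32_t demo_cpu_to_le32(uint32_t v)
{
    uint8_t bytes[4] = {
        (uint8_t)(v & 0xff),
        (uint8_t)((v >> 8) & 0xff),
        (uint8_t)((v >> 16) & 0xff),
        (uint8_t)((v >> 24) & 0xff),
    };
    uint32_t out;

    memcpy(&out, bytes, sizeof(out));
    return out;
}

/* Build the "number of queues" value in host byte order. */
uint32_t demo_numq_host(uint32_t max_ioqpairs)
{
    return (max_ioqpairs - 1) | ((max_ioqpairs - 1) << 16);
}

/* Convert once, at the point where the result is stored. */
void demo_store_result(uint32_t *cqe_result, uint32_t host_value)
{
    *cqe_result = demo_cpu_to_le32(host_value);
}
```

Keeping `result` in host order until the final store also means that trace calls and `result ? ... : ...` tests in between operate on a normal host-order value.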

Re: [PATCH v2 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list

2020-07-03 Thread Klaus Jensen

On Jul  3 10:20, Philippe Mathieu-Daudé wrote:
> On 7/3/20 8:34 AM, Klaus Jensen wrote:
> > From: Klaus Jensen 
> > 
> > Reject the nsid broadcast value (0xffffffff) and 0xfffffffe in the
> > Active Namespace ID list.
> 
> Can we have a definition instead of this 0xfffffffe magic value please?
> 

Hmm, not really actually. It's not a magic value, it's just because the
logic in Active Namespace ID list requires that it report only
namespaces with ids *higher* than the one specified, so since
0xffffffff (NVME_NSID_BROADCAST) is invalid, NVME_NSID_BROADCAST - 1
needs to be as well.

What do you say I change it to `min_nsid >= NVME_NSID_BROADCAST - 1`?
The original condition just reads well if you are sitting with the spec
on the side.
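
A minimal sketch of the proposed condition, assuming the broadcast NSID is 0xffffffff as in the NVMe specification. The `DEMO_`/`demo_` names are hypothetical and only mirror the check discussed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NSID_BROADCAST 0xffffffffu  /* assumed broadcast NSID value */

/*
 * The list reports namespace IDs strictly greater than min_nsid.
 * The broadcast value is not a valid ID, so asking for IDs above
 * DEMO_NSID_BROADCAST - 1 could never match anything valid either.
 */
bool demo_min_nsid_is_valid(uint32_t min_nsid)
{
    return min_nsid < DEMO_NSID_BROADCAST - 1;
}
```

Written this way, the comparison covers both rejected values without a second literal in the code.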

> > 
> > Signed-off-by: Klaus Jensen 
> > ---
> >  hw/block/nvme.c | 4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index 65c2fa3ac1f4..0dac7a41ddae 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -956,6 +956,10 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, 
> > NvmeIdentify *c)
> >  
> >  trace_pci_nvme_identify_nslist(min_nsid);
> >  
> > +if (min_nsid == 0xfffffffe || min_nsid == NVME_NSID_BROADCAST) {
> > +return NVME_INVALID_NSID | NVME_DNR;
> > +}
> > +
> >  list = g_malloc0(data_len);
> >  for (i = 0; i < n->num_namespaces; i++) {
> >  if (i < min_nsid) {
> > 
>

Re: [PATCH 6/6] migration: support picking vmstate disk in QMP snapshot commands

2020-07-03 Thread Daniel P . Berrangé

On Thu, Jul 02, 2020 at 01:19:43PM -0500, Eric Blake wrote:
> On 7/2/20 12:57 PM, Daniel P. Berrangé wrote:
> > This wires up support for a new "vmstate" parameter to the QMP commands
> > for snapshots (savevm, loadvm). This parameter accepts block driver
> > state node name.
> > 
> > One use case for this would be a VM using OVMF firmware where the
> > variables store is the first snapshottable disk image. The vmstate
> > snapshot usually wants to be stored in the primary root disk of the
> > VM, not the firmeware varstore. Thus there needs to be a mechanism
> 
> firmware
> 
> > to override the default choice of disk.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> 
> > +++ b/qapi/migration.json
> > @@ -1630,6 +1630,7 @@
> >   # @tag: name of the snapshot to create. If it already
> >   # exists it will be replaced.
> >   # @exclude: list of block device node names to exclude
> > +# @vmstate: block device node name to save vmstate to
> >   #
> >   # Note that execution of the VM will be paused during the time
> >   # it takes to save the snapshot
> > @@ -1641,6 +1642,7 @@
> >   # -> { "execute": "savevm",
> >   #  "data": {
> >   # "tag": "my-snap",
> > +# "vmstate": "disk0",
> >   # "exclude": ["pflash0-vars"]
> >   #  }
> >   #}
> > @@ -1650,6 +1652,7 @@
> >   ##
> >   { 'command': 'savevm',
> > 'data': { 'tag': 'str',
> > +'*vmstate': 'str',
> >   '*exclude': ['str'] } }
> 
> During save, the list of block devices is obvious: everything that is not
> excluded.  But,
> 
> >   ##
> > @@ -1659,6 +1662,7 @@
> >   #
> >   # @tag: name of the snapshot to load.
> >   # @exclude: list of block device node names to exclude
> > +# @vmstate: block device node name to load vmstate from
> >   #
> >   # Returns: nothing
> >   #
> > @@ -1667,6 +1671,7 @@
> >   # -> { "execute": "loadvm",
> >   #  "data": {
> >   # "tag": "my-snap",
> > +# "vmstate": "disk0",
> >   # "exclude": ["pflash0-vars"]
> >   #  }
> >   #}
> > @@ -1676,6 +1681,7 @@
> >   ##
> >   { 'command': 'loadvm',
> > 'data': { 'tag': 'str',
> > +'*vmstate': 'str',
> >   '*exclude': ['str'] } }
> 
> ...now that we support exclusion during saving, or even without exclusion
> but when the user has performed hotplug/unplug operations in the meantime
> from when the snapshot was created, isn't load better off listing all
> devices which SHOULD be restored, rather than excluding devices that should
> NOT be restored?  (After all, libvirt knows which disks existed at the time
> of the snapshot, which may be different than the set of disks that exist now
> even though we are throwing out the state now to go back to the state at the
> time of the snapshot)

If the user has hotplugged / unplugged any devices, then I expect the
snapshot load to fail, because the vmstate will be referencing devices
that don't exist, or will be missing devices. Same way migration will
fail unless target QEMU has exact same device setup that was first
serialized into the vmstate.

In theory I guess you could have kept device ABI the same, but switched
out disk backends, but I question the sanity of doing that while you have
saved snapshots, unless you're preserving those snapshots in the new
images in which case it will just work.

> Then there's the question of symmetry: if load needs an explicit list of
> blocks to load from (rather than the set of blocks that are currently
> associated with the machine), should save also take its list by positive
> inclusion rather than negative exclusion?

I chose exclusion because the normal case is that you want to snapshot
everything. You sometimes have a small number of exceptions, most notably
the OVMF varstore. IOW if you're currently relying on default behaviour
of snapshotting everything, it is much easier to just exclude one image
than to switch to explicitly including everything. Essentially I can
just pass a static string associated with the varstore to be excluded,
instead of having to dynamically build up a list of everything.

I wouldn't mind supporting inclusion *and* exclusion, so users have the
choice of which approach to take.
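
To make the exclusion approach concrete, here is a rough sketch of selecting snapshot targets from the full set of nodes minus an exclude list. It is not the code from the patch; the `demo_` helpers are hypothetical and node names are plain strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if name appears in the list of n strings. */
static bool demo_in_list(const char *name, const char *const *list, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, list[i]) == 0) {
            return true;
        }
    }
    return false;
}

/*
 * Copy into out[] every node from all[] that is not excluded.
 * Returns the number of nodes selected; out needs room for n_all entries.
 */
size_t demo_select_nodes(const char *const *all, size_t n_all,
                         const char *const *exclude, size_t n_exclude,
                         const char **out)
{
    size_t n_out = 0;

    for (size_t i = 0; i < n_all; i++) {
        if (!demo_in_list(all[i], exclude, n_exclude)) {
            out[n_out++] = all[i];
        }
    }
    return n_out;
}
```

With exclusion the caller only has to know the one static name (for example the varstore node); an inclusion variant would simply invert the `demo_in_list()` test.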

> And why does delvm not need to specify which block is the vmstate? delvm is
> in the same boat as loadvm - the set of blocks involved at the time of the
> snapshot creation may be different than the set of blocks currently
> associated with the guest at the time you run load/delvm.

There's no code in delvm that needs to take any special action wrt
vmstate. It simply deletes snapshots from all the disks present.

> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://w

Re: [PATCH v2 11/18] hw/block/nvme: add remaining mandatory controller parameters

2020-07-03 Thread Klaus Jensen

On Jul  3 10:31, Philippe Mathieu-Daudé wrote:
> On 7/3/20 8:34 AM, Klaus Jensen wrote:
> > From: Klaus Jensen 
> > 
> > Add support for any remaining mandatory controller operating parameters
> > (features).
> > 
> > Signed-off-by: Klaus Jensen 
> > Reviewed-by: Dmitry Fomichev 
> > ---
> >  hw/block/nvme.c   | 39 +--
> >  hw/block/nvme.h   | 18 ++
> >  hw/block/trace-events |  2 ++
> >  include/block/nvme.h  |  7 +++
> >  4 files changed, 60 insertions(+), 6 deletions(-)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index ba523f6768bf..affb9a967534 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -1056,8 +1056,16 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, 
> > NvmeCmd *cmd, NvmeRequest *req)
> >  uint32_t dw10 = le32_to_cpu(cmd->cdw10);
> >  uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> >  uint32_t result;
> > +uint8_t fid = NVME_GETSETFEAT_FID(dw10);
> > +uint16_t iv;
> >  
> > -switch (dw10) {
> > +trace_pci_nvme_getfeat(nvme_cid(req), fid, dw11);
> > +
> > +if (!nvme_feature_support[fid]) {
> > +return NVME_INVALID_FIELD | NVME_DNR;
> > +}
> > +
> > +switch (fid) {
> >  case NVME_TEMPERATURE_THRESHOLD:
> >  result = 0;
> >  
> > @@ -1088,14 +1096,27 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, 
> > NvmeCmd *cmd, NvmeRequest *req)
> >   ((n->params.max_ioqpairs - 1) << 16));
> >  trace_pci_nvme_getfeat_numq(result);
> >  break;
> > +case NVME_INTERRUPT_VECTOR_CONF:
> > +iv = dw11 & 0xffff;
> > +if (iv >= n->params.max_ioqpairs + 1) {
> > +return NVME_INVALID_FIELD | NVME_DNR;
> > +}
> > +
> > +result = iv;
> > +if (iv == n->admin_cq.vector) {
> > +result |= NVME_INTVC_NOCOALESCING;
> > +}
> > +
> > +result = cpu_to_le32(result);
> > +break;
> >  case NVME_ASYNCHRONOUS_EVENT_CONF:
> >  result = cpu_to_le32(n->features.async_config);
> >  break;
> >  case NVME_TIMESTAMP:
> >  return nvme_get_feature_timestamp(n, cmd);
> >  default:
> > -trace_pci_nvme_err_invalid_getfeat(dw10);
> > -return NVME_INVALID_FIELD | NVME_DNR;
> > +result = cpu_to_le32(nvme_feature_default[fid]);
> 
> So here we expect uninitialized fid entries to return 0, right?
> 

Yes, if defaults are not 0 (like NVME_ARBITRATION), it is explicitly set.
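
For reference, a small sketch of the C behaviour this relies on: in an array with designated initializers, every element that is not named is zero-initialized. The names and the non-zero default below are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { DEMO_ARBITRATION = 0x1 };     /* hypothetical feature ID */
#define DEMO_ARB_AB_NOLIMIT 0x7u     /* hypothetical non-zero default */

/* Only one slot is named; C zero-initializes every other element. */
static const uint32_t demo_feature_default[0x100] = {
    [DEMO_ARBITRATION] = DEMO_ARB_AB_NOLIMIT,
};

/* A uint8_t index always stays inside the 256-entry table. */
uint32_t demo_get_default(uint8_t fid)
{
    return demo_feature_default[fid];
}
```

So a feature without an explicit entry reads back as 0 from the default table.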

> > +break;
> >  }
> >  
> >  req->cqe.result = result;
> > @@ -1124,8 +1145,15 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, 
> > NvmeCmd *cmd, NvmeRequest *req)
> >  {
> >  uint32_t dw10 = le32_to_cpu(cmd->cdw10);
> >  uint32_t dw11 = le32_to_cpu(cmd->cdw11);
> > +uint8_t fid = NVME_GETSETFEAT_FID(dw10);
> >  
> > -switch (dw10) {
> > +trace_pci_nvme_setfeat(nvme_cid(req), fid, dw11);
> > +
> > +if (!nvme_feature_support[fid]) {
> > +return NVME_INVALID_FIELD | NVME_DNR;
> > +}
> > +
> > +switch (fid) {
> >  case NVME_TEMPERATURE_THRESHOLD:
> >  if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
> >  break;
> > @@ -1172,8 +1200,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd 
> > *cmd, NvmeRequest *req)
> >  case NVME_TIMESTAMP:
> >  return nvme_set_feature_timestamp(n, cmd);
> >  default:
> > -trace_pci_nvme_err_invalid_setfeat(dw10);
> > -return NVME_INVALID_FIELD | NVME_DNR;
> > +return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
> >  }
> >  return NVME_SUCCESS;
> >  }
> > diff --git a/hw/block/nvme.h b/hw/block/nvme.h
> > index f8940435f9ef..8ad1e3c89cee 100644
> > --- a/hw/block/nvme.h
> > +++ b/hw/block/nvme.h
> > @@ -87,6 +87,24 @@ typedef struct NvmeFeatureVal {
> >  uint32_t async_config;
> >  } NvmeFeatureVal;
> >  
> > +static const uint32_t nvme_feature_default[0x100] = {
> > +[NVME_ARBITRATION]   = NVME_ARB_AB_NOLIMIT,
> > +};
> > +
> > +static const bool nvme_feature_support[0x100] = {
> > +[NVME_ARBITRATION]  = true,
> > +[NVME_POWER_MANAGEMENT] = true,
> > +[NVME_TEMPERATURE_THRESHOLD]= true,
> > +[NVME_ERROR_RECOVERY]   = true,
> > +[NVME_VOLATILE_WRITE_CACHE] = true,
> > +[NVME_NUMBER_OF_QUEUES] = true,
> > +[NVME_INTERRUPT_COALESCING] = true,
> > +[NVME_INTERRUPT_VECTOR_CONF]= true,
> > +[NVME_WRITE_ATOMICITY]  = true,
> > +[NVME_ASYNCHRONOUS_EVENT_CONF]  = true,
> > +[NVME_TIMESTAMP]= true,
> > +};
> 
> Nack. No variable assignment in header please.
> Move that to the source file.
> 

Understood :)

> > +
> >  typedef struct NvmeCtrl {
> >  PCIDevice parent_obj;
> >  MemoryRegion iomem;
> > diff --git a/hw/block/trace-events b/hw/block/trace-events
> > index 091af16ca7d7..42e62f4649f8 100644
> > --- a/hw/block/trace-events
> > +

Re: [PULL 00/10] Modules 20200702 patches

2020-07-03 Thread Peter Maydell

On Thu, 2 Jul 2020 at 13:23, Gerd Hoffmann  wrote:
>
> The following changes since commit fc1bff958998910ec8d25db86cd2f53ff125f7ab:
>
>   hw/misc/pca9552: Add missing TypeInfo::class_size field (2020-06-29 
> 21:16:10 +0100)
>
> are available in the Git repository at:
>
>   git://git.kraxel.org/qemu tags/modules-20200702-pull-request
>
> for you to fetch changes up to 474a5d66036d18ee5ccaa88364660d05bf32127b:
>
>   chardev: enable modules, use for braille (2020-07-01 21:08:11 +0200)
>
> 
> qom: add support for qom objects in modules.
> build some devices (qxl, virtio-gpu, ccid, usb-redir) as modules.
> build braille chardev as module.
>
> note: qemu doesn't rebuild objects on cflags changes (specifically
>   -fPIC being added when code is switched from builtin to module).
>   Workaround for resulting build errors: "make clean", rebuild.
>
> 
>
> Gerd Hoffmann (10):
>   module: qom module support
>   object: qom module support
>   qdev: device module support
>   build: fix device module builds
>   ccid: build smartcard as module
>   usb: build usb-redir as module
>   vga: build qxl as module
>   vga: build virtio-gpu only once
>   vga: build virtio-gpu as module
>   chardev: enable modules, use for braille

No code review at all? :-(   In particular the "build: fix device module
builds" commit (as you note in your commit message) does not look at
all right. I would much prefer if we could get some code review
for these changes before applying them.

thanks
-- PMM

Re: [PULL 04/10] build: fix device module builds

2020-07-03 Thread Claudio Fontana

On 7/2/20 2:20 PM, Gerd Hoffmann wrote:
> See comment.  Feels quite hackish.  Better ideas anyone?
> 


A better idea could be to investigate what gets into the variable, and why.
I guess at this point we will need to revisit this later on.

Claudio


> Signed-off-by: Gerd Hoffmann 
> Message-id: 20200624131045.14512-5-kra...@redhat.com
> ---
>  Makefile.target | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/Makefile.target b/Makefile.target
> index 8ed1eba95b9c..c70325df5796 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -179,6 +179,13 @@ endif # CONFIG_SOFTMMU
>  dummy := $(call unnest-vars,,obj-y)
>  all-obj-y := $(obj-y)
>  
> +#
> +# common-obj-m has some crap here, probably as side effect from
> +# filling obj-y.  Clear it.  Fixes suspious dependency errors when
> +# building devices as modules.
> +#
> +common-obj-m :=
> +
>  include $(SRC_PATH)/Makefile.objs
>  dummy := $(call unnest-vars,.., \
> authz-obj-y \
>

[PULL 00/41] virtio,acpi: features, fixes, cleanups.

2020-07-03 Thread Michael S. Tsirkin

The following changes since commit fc1bff958998910ec8d25db86cd2f53ff125f7ab:

  hw/misc/pca9552: Add missing TypeInfo::class_size field (2020-06-29 21:16:10 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to 900ed7043750ae3cdf35c05da66e150a8821c3a0:

  vhost-vdpa: introduce vhost-vdpa net client (2020-07-03 04:59:13 -0400)


virtio,acpi: features, fixes, cleanups.

vdpa support
virtio-mem support
a handy script for disassembling acpi tables
misc fixes and cleanups

Signed-off-by: Michael S. Tsirkin 


Andrew Jones (1):
  tests/acpi: remove stale allowed tables

Cindy Lu (11):
  net: introduce qemu_get_peer
  vhost_net: use the function qemu_get_peer
  vhost: introduce new VhostOps vhost_dev_start
  vhost: implement vhost_dev_start method
  vhost: introduce new VhostOps vhost_vq_get_addr
  vhost: implement vhost_vq_get_addr method
  vhost: introduce new VhostOps vhost_force_iommu
  vhost: implement vhost_force_iommu method
  vhost_net: introduce set_config & get_config
  vhost-vdpa: introduce vhost-vdpa backend
  vhost-vdpa: introduce vhost-vdpa net client

David Hildenbrand (22):
  virtio-balloon: always indicate S_DONE when migration fails
  pc: Support coldplugging of virtio-pmem-pci devices on all buses
  exec: Introduce ram_block_discard_(disable|require)()
  vfio: Convert to ram_block_discard_disable()
  accel/kvm: Convert to ram_block_discard_disable()
  s390x/pv: Convert to ram_block_discard_disable()
  virtio-balloon: Rip out qemu_balloon_inhibit()
  target/i386: sev: Use ram_block_discard_disable()
  migration/rdma: Use ram_block_discard_disable()
  migration/colo: Use ram_block_discard_disable()
  virtio-mem: Paravirtualized memory hot(un)plug
  virtio-pci: Proxy for virtio-mem
  MAINTAINERS: Add myself as virtio-mem maintainer
  hmp: Handle virtio-mem when printing memory device info
  numa: Handle virtio-mem in NUMA stats
  pc: Support for virtio-mem-pci
  virtio-mem: Allow notifiers for size changes
  virtio-pci: Send qapi events when the virtio-mem size changes
  virtio-mem: Migration sanity checks
  virtio-mem: Add trace events
  virtio-mem: Exclude unplugged memory during migration
  numa: Auto-enable NUMA when any memory devices are possible

Jason Wang (3):
  virtio-bus: introduce queue_enabled method
  virtio-pci: implement queue_enabled method
  vhost: check the existence of vhost_set_iotlb_callback

Maxime Coquelin (1):
  docs: vhost-user: add Virtio status protocol feature

Michael S. Tsirkin (2):
  tests: disassemble-aml.sh: generate AML in readable format
  Revert "tests/migration: Reduce autoconverge initial bandwidth"

Peter Xu (1):
  MAINTAINERS: add VT-d entry

 configure   |  21 +
 qapi/misc.json  |  64 +-
 qapi/net.json   |  28 +-
 hw/virtio/virtio-mem-pci.h  |  34 ++
 include/exec/memory.h   |  41 ++
 include/hw/boards.h |   1 +
 include/hw/pci/pci.h|   1 +
 include/hw/vfio/vfio-common.h   |   4 +-
 include/hw/virtio/vhost-backend.h   |  19 +-
 include/hw/virtio/vhost-vdpa.h  |  26 +
 include/hw/virtio/vhost.h   |   7 +
 include/hw/virtio/virtio-bus.h  |   4 +
 include/hw/virtio/virtio-mem.h  |  86 +++
 include/migration/colo.h|   2 +-
 include/migration/misc.h|   2 +
 include/net/net.h   |   1 +
 include/net/vhost-vdpa.h|  22 +
 include/net/vhost_net.h |   5 +
 include/sysemu/balloon.h|   2 -
 net/clients.h   |   2 +
 tests/qtest/bios-tables-test-allowed-diff.h |  18 -
 accel/kvm/kvm-all.c |   4 +-
 balloon.c   |  17 -
 exec.c  |  52 ++
 hw/arm/virt.c   |   2 +
 hw/core/numa.c  |  17 +-
 hw/i386/microvm.c   |   1 +
 hw/i386/pc.c|  66 ++-
 hw/i386/pc_piix.c   |   1 +
 hw/i386/pc_q35.c|   1 +
 hw/net/vhost_net-stub.c |  11 +
 hw/net/vhost_net.c  |  45 +-
 hw/net/virtio-net.c |  19 +
 hw/s390x/s390-virtio-ccw.c  |  22 +-
 hw/vfio/ap.c|   8 +-
 hw/vfio/ccw.c   |  11 +-
 hw/vfio/common.c

[PULL 03/41] virtio-balloon: always indicate S_DONE when migration fails

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

If something goes wrong during precopy, before stopping the VM, we will
never send a S_DONE indication to the VM, resulting in the hinted pages
not getting released to be used by the guest OS (e.g., Linux).

Easy to reproduce:
1. Start migration (e.g., HMP "migrate -d 'exec:gzip -c > STATEFILE.gz'")
2. Cancel migration (e.g., HMP "migrate_cancel")
3. Observe in the guest (e.g., cat /proc/meminfo) that there is basically
   no free memory left.

While at it, add similar locking to virtio_balloon_free_page_done() as
done in virtio_balloon_free_page_stop. Locking is still weird, but that
has to be sorted out separately.

There is nothing to do in the PRECOPY_NOTIFY_COMPLETE case. Add some
comments regarding S_DONE handling.

Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Reviewed-by: Alexander Duyck 
Cc: Wei Wang 
Cc: Alexander Duyck 
Signed-off-by: David Hildenbrand 
Message-Id: <20200629080615.26022-1-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-balloon.c | 26 --
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 10507b2a43..8a84718490 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -628,8 +628,13 @@ static void virtio_balloon_free_page_done(VirtIOBalloon *s)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
-s->free_page_report_status = FREE_PAGE_REPORT_S_DONE;
-virtio_notify_config(vdev);
+if (s->free_page_report_status != FREE_PAGE_REPORT_S_DONE) {
+/* See virtio_balloon_free_page_stop() */
+qemu_mutex_lock(&s->free_page_lock);
+s->free_page_report_status = FREE_PAGE_REPORT_S_DONE;
+qemu_mutex_unlock(&s->free_page_lock);
+virtio_notify_config(vdev);
+}
 }
 
 static int
@@ -653,17 +658,26 @@ virtio_balloon_free_page_report_notify(NotifierWithReturn 
*n, void *data)
 case PRECOPY_NOTIFY_SETUP:
 precopy_enable_free_page_optimization();
 break;
-case PRECOPY_NOTIFY_COMPLETE:
-case PRECOPY_NOTIFY_CLEANUP:
 case PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC:
 virtio_balloon_free_page_stop(dev);
 break;
 case PRECOPY_NOTIFY_AFTER_BITMAP_SYNC:
 if (vdev->vm_running) {
 virtio_balloon_free_page_start(dev);
-} else {
-virtio_balloon_free_page_done(dev);
+break;
 }
+/*
+ * Set S_DONE before migrating the vmstate, so the guest will reuse
+ * all hinted pages once running on the destination. Fall through.
+ */
+case PRECOPY_NOTIFY_CLEANUP:
+/*
+ * Especially, if something goes wrong during precopy or if migration
+ * is canceled, we have to properly communicate S_DONE to the VM.
+ */
+virtio_balloon_free_page_done(dev);
+break;
+case PRECOPY_NOTIFY_COMPLETE:
 break;
 default:
 virtio_error(vdev, "%s: %d reason unknown", __func__, pnd->reason);
-- 
MST
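
The guard added to virtio_balloon_free_page_done() in the patch above makes the S_DONE transition idempotent, which matters if the function can be reached from more than one notification (for example AFTER_BITMAP_SYNC with the VM stopped and later CLEANUP). A simplified sketch of that idea, with locking left out and all `demo_` names hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

enum demo_status { DEMO_S_REQUESTED, DEMO_S_START, DEMO_S_STOP, DEMO_S_DONE };

struct demo_balloon {
    enum demo_status status;
    unsigned notifications;   /* counts config-change notifications sent */
};

/*
 * Move to S_DONE and notify the guest, but only on an actual transition.
 * Returns true if a notification was sent.
 */
bool demo_free_page_done(struct demo_balloon *b)
{
    if (b->status == DEMO_S_DONE) {
        return false;         /* already done: nothing to signal again */
    }
    b->status = DEMO_S_DONE;
    b->notifications++;
    return true;
}
```

Calling it twice in a row sends a single notification, so the guest is told about S_DONE exactly once.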

[PULL 02/41] Revert "tests/migration: Reduce autoconverge initial bandwidth"

2020-07-03 Thread Michael S. Tsirkin

This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge
initial bandwidth") since that change makes unit tests much slower for
all developers, while it's not a robust way to fix migration tests.
Migration tests need to find a more robust way to discover a reasonable
bandwidth without slowing things down for everyone.

Fixes: 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
Signed-off-by: Michael S. Tsirkin 
Acked-by: Dr. David Alan Gilbert 
Acked-by: Philippe Mathieu-Daudé 
Acked-by: Thomas Huth 
---
 tests/qtest/migration-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index dc3490c9fa..21ea5ba1d2 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
  * without throttling.
  */
 migrate_set_parameter_int(from, "downtime-limit", 1);
-migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
+migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
 
 /* To check remaining size after precopy */
 migrate_set_capability(from, "pause-before-switchover", true);
-- 
MST

[PULL 01/41] tests: disassemble-aml.sh: generate AML in readable format

2020-07-03 Thread Michael S. Tsirkin

On systems where the IASL tool exists, we can convert
expected ACPI tables to ASL format, which is useful
for debugging and documentation purposes.
This script does this for all ACPI tables under tests/data/acpi/.

Signed-off-by: Michael S. Tsirkin 
---
 tests/data/acpi/disassemle-aml.sh   | 52 +
 tests/data/acpi/rebuild-expected-aml.sh |  1 +
 2 files changed, 53 insertions(+)
 create mode 100755 tests/data/acpi/disassemle-aml.sh

diff --git a/tests/data/acpi/disassemle-aml.sh 
b/tests/data/acpi/disassemle-aml.sh
new file mode 100755
index 0000000000..1d8a4d0301
--- /dev/null
+++ b/tests/data/acpi/disassemle-aml.sh
@@ -0,0 +1,52 @@
+#!/usr/bin/bash
+
+outdir=
+while getopts "o:" arg; do
+  case ${arg} in
+o )
+outdir=$OPTARG
+;;
+\? )
+echo "Usage: ./tests/data/acpi/disassemle-aml.sh [-o 
]"
+exit 1
+;;
+
+  esac
+done
+
+for machine in tests/data/acpi/*
+do
+if [[ ! -d "$machine" ]];
+then
+continue
+fi
+
+if [[ "${outdir}" ]];
+then
+mkdir -p "${outdir}"/${machine} || exit $?
+fi
+for aml in $machine/*
+do
+if [[ "$aml" == $machine/*.dsl ]];
+then
+continue
+fi
+if [[ "$aml" == $machine/SSDT*.* ]];
+then
+dsdt=${aml/SSDT*./DSDT.}
+extra="-e ${dsdt}"
+elif [[ "$aml" == $machine/SSDT* ]];
+then
+dsdt=${aml/SSDT*/DSDT};
+extra="-e ${dsdt}"
+else
+extra=""
+fi
+asl=${aml}.dsl
+if [[ "${outdir}" ]];
+then
+asl="${outdir}"/${machine}/${asl}
+fi
+iasl -d -p ${asl} ${extra} ${aml}
+done
+done
diff --git a/tests/data/acpi/rebuild-expected-aml.sh 
b/tests/data/acpi/rebuild-expected-aml.sh
index 9cbaab1a4d..76cd797d1e 100755
--- a/tests/data/acpi/rebuild-expected-aml.sh
+++ b/tests/data/acpi/rebuild-expected-aml.sh
@@ -36,6 +36,7 @@ old_allowed_dif=`grep -v -e 'List of comma-separated changed 
AML files to ignore
 echo '/* List of comma-separated changed AML files to ignore */' > 
${SRC_PATH}/tests/qtest/bios-tables-test-allowed-diff.h
 
 echo "The files were rebuilt and can be added to git."
+echo "You can use ${SRC_PATH}/tests/data/acpi/disassemle-aml.sh to disassemble 
them to ASL."
 
 if [ -z "$old_allowed_dif" ]; then
 echo "Note! Please do not commit expected files with source changes"
-- 
MST

[PULL 11/41] migration/rdma: Use ram_block_discard_disable()

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

RDMA will pin all guest memory (as documented in docs/rdma.txt). We want
to disable RAM block discards - however, to keep it simple use
ram_block_discard_is_required() instead of inhibiting.

Note: It is not sufficient to limit disabling to pin_all. Even when only
conditionally pinning 1 MB chunks, as soon as one page within such a
chunk was discarded and one page not, the discarded pages will be pinned
as well.

Reviewed-by: Dr. David Alan Gilbert 
Cc: "Michael S. Tsirkin" 
Cc: Juan Quintela 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-9-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 migration/rdma.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index ec45d33ba3..bbe6f36627 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -29,6 +29,7 @@
 #include "qemu/sockets.h"
 #include "qemu/bitmap.h"
 #include "qemu/coroutine.h"
+#include "exec/memory.h"
 #include 
 #include 
 #include 
@@ -4017,8 +4018,14 @@ void rdma_start_incoming_migration(const char 
*host_port, Error **errp)
 Error *local_err = NULL;
 
 trace_rdma_start_incoming_migration();
-rdma = qemu_rdma_data_init(host_port, &local_err);
 
+/* Avoid ram_block_discard_disable(), cannot change during migration. */
+if (ram_block_discard_is_required()) {
+error_setg(errp, "RDMA: cannot disable RAM discard");
+return;
+}
+
+rdma = qemu_rdma_data_init(host_port, &local_err);
 if (rdma == NULL) {
 goto err;
 }
@@ -4067,10 +4074,17 @@ void rdma_start_outgoing_migration(void *opaque,
 const char *host_port, Error **errp)
 {
 MigrationState *s = opaque;
-RDMAContext *rdma = qemu_rdma_data_init(host_port, errp);
 RDMAContext *rdma_return_path = NULL;
+RDMAContext *rdma;
 int ret = 0;
 
+/* Avoid ram_block_discard_disable(), cannot change during migration. */
+if (ram_block_discard_is_required()) {
+error_setg(errp, "RDMA: cannot disable RAM discard");
+return;
+}
+
+rdma = qemu_rdma_data_init(host_port, errp);
 if (rdma == NULL) {
 goto err;
 }
-- 
MST
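
The note about 1 MB chunks in the commit message above can be illustrated with a toy model. It assumes 4 KiB pages and that a chunk is pinned as a whole once at least one of its pages is populated; all names are hypothetical and this is not the RDMA code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE       4096u             /* assumed page size */
#define DEMO_CHUNK_SIZE      (1024u * 1024u)   /* 1 MB chunk, as in the note */
#define DEMO_PAGES_PER_CHUNK (DEMO_CHUNK_SIZE / DEMO_PAGE_SIZE)

/*
 * discarded[i] is true if page i of the chunk was discarded.
 * Returns how many discarded pages end up pinned when the chunk is pinned.
 * Model assumption: a chunk with no populated page is never pinned.
 */
size_t demo_discarded_pages_pinned(const bool discarded[DEMO_PAGES_PER_CHUNK])
{
    size_t n_discarded = 0;

    for (size_t i = 0; i < DEMO_PAGES_PER_CHUNK; i++) {
        if (discarded[i]) {
            n_discarded++;
        }
    }
    if (n_discarded == DEMO_PAGES_PER_CHUNK) {
        return 0;             /* fully discarded chunk: not pinned here */
    }
    return n_discarded;       /* mixed chunk: discarded pages get pinned too */
}
```

In this model a single populated page is enough to pull every discarded page of the same chunk back in, which is why limiting the check to pin_all would not be sufficient.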

[PULL 04/41] pc: Support coldplugging of virtio-pmem-pci devices on all buses

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

E.g., with "pc-q35-4.2", trying to coldplug a virtio-pmem-pci device
results in
"virtio-pmem-pci not supported on this bus"

The reason is that the bus does not support hotplug and, therefore, does
not have a hotplug handler. Let's allow coldplugging virtio-pmem devices
on such buses. The hotplug order is only relevant for virtio-pmem-pci
when the guest is already alive and the device is visible before
memory_device_plug() wired up the memory device bits.

Hotplug attempts will still fail with:
"Error: Bus 'pcie.0' does not support hotplugging"

Hotunplug attempts will still fail with:
"Error: Bus 'pcie.0' does not support hotplugging"

Reported-by: Vivek Goyal 
Reviewed-by: Pankaj Gupta 
Cc: Pankaj Gupta 
Cc: Igor Mammedov 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: "Michael S. Tsirkin" 
Cc: Marcel Apfelbaum 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-2-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/pc.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 4af9679d03..58b1425c17 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1643,13 +1643,13 @@ static void pc_virtio_pmem_pci_pre_plug(HotplugHandler 
*hotplug_dev,
 HotplugHandler *hotplug_dev2 = qdev_get_bus_hotplug_handler(dev);
 Error *local_err = NULL;
 
-if (!hotplug_dev2) {
+if (!hotplug_dev2 && dev->hotplugged) {
 /*
  * Without a bus hotplug handler, we cannot control the plug/unplug
- * order. This should never be the case on x86, however better add
- * a safety net.
+ * order. We should never reach this point when hotplugging on x86,
+ * however, better add a safety net.
  */
-error_setg(errp, "virtio-pmem-pci not supported on this bus.");
+error_setg(errp, "virtio-pmem-pci hotplug not supported on this bus.");
 return;
 }
 /*
@@ -1658,7 +1658,7 @@ static void pc_virtio_pmem_pci_pre_plug(HotplugHandler 
*hotplug_dev,
  */
 memory_device_pre_plug(MEMORY_DEVICE(dev), MACHINE(hotplug_dev), NULL,
&local_err);
-if (!local_err) {
+if (!local_err && hotplug_dev2) {
 hotplug_handler_pre_plug(hotplug_dev2, dev, &local_err);
 }
 error_propagate(errp, local_err);
@@ -1676,9 +1676,11 @@ static void pc_virtio_pmem_pci_plug(HotplugHandler 
*hotplug_dev,
  * device bits.
  */
 memory_device_plug(MEMORY_DEVICE(dev), MACHINE(hotplug_dev));
-hotplug_handler_plug(hotplug_dev2, dev, &local_err);
-if (local_err) {
-memory_device_unplug(MEMORY_DEVICE(dev), MACHINE(hotplug_dev));
+if (hotplug_dev2) {
+hotplug_handler_plug(hotplug_dev2, dev, &local_err);
+if (local_err) {
+memory_device_unplug(MEMORY_DEVICE(dev), MACHINE(hotplug_dev));
+}
 }
 error_propagate(errp, local_err);
 }
-- 
MST

[PULL 07/41] accel/kvm: Convert to ram_block_discard_disable()

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

Discarding memory does not work as expected. At the time this is called,
we cannot have anyone active that relies on discards to work properly.

Reviewed-by: Dr. David Alan Gilbert 
Cc: Paolo Bonzini 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-5-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 accel/kvm/kvm-all.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index d54a8701d8..ab36fbfa0c 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -40,7 +40,6 @@
 #include "trace.h"
 #include "hw/irq.h"
 #include "sysemu/sev.h"
-#include "sysemu/balloon.h"
 #include "qapi/visitor.h"
 #include "qapi/qapi-types-common.h"
 #include "qapi/qapi-visit-common.h"
@@ -2229,7 +2228,8 @@ static int kvm_init(MachineState *ms)
 
 s->sync_mmu = !!kvm_vm_check_extension(kvm_state, KVM_CAP_SYNC_MMU);
 if (!s->sync_mmu) {
-qemu_balloon_inhibit(true);
+ret = ram_block_discard_disable(true);
+assert(!ret);
 }
 
 return 0;
-- 
MST

[PULL 06/41] vfio: Convert to ram_block_discard_disable()

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

VFIO is (except devices without a physical IOMMU or some mediated devices)
incompatible with discarding of RAM. The kernel will pin basically all VM
memory. Let's convert to ram_block_discard_disable(), which can now
fail, in contrast to qemu_balloon_inhibit().

Leave "x-balloon-allowed" named as it is for now.

Reviewed-by: Tony Krowiak 
Acked-by: Cornelia Huck 
Cc: Cornelia Huck 
Cc: Alex Williamson 
Cc: Christian Borntraeger 
Cc: Tony Krowiak 
Cc: Halil Pasic 
Cc: Pierre Morel 
Cc: Eric Farman 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-4-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/vfio/vfio-common.h |  4 +--
 hw/vfio/ap.c  |  8 +++---
 hw/vfio/ccw.c | 11 
 hw/vfio/common.c  | 53 +++
 hw/vfio/pci.c |  6 ++--
 5 files changed, 44 insertions(+), 38 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index fd564209ac..c78f3ff559 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -108,7 +108,7 @@ typedef struct VFIODevice {
 bool reset_works;
 bool needs_reset;
 bool no_mmap;
-bool balloon_allowed;
+bool ram_block_discard_allowed;
 VFIODeviceOps *ops;
 unsigned int num_irqs;
 unsigned int num_regions;
@@ -128,7 +128,7 @@ typedef struct VFIOGroup {
 QLIST_HEAD(, VFIODevice) device_list;
 QLIST_ENTRY(VFIOGroup) next;
 QLIST_ENTRY(VFIOGroup) container_next;
-bool balloon_allowed;
+bool ram_block_discard_allowed;
 } VFIOGroup;
 
 typedef struct VFIODMABuf {
diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index 95564c17ed..b9330a8e6f 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -105,12 +105,12 @@ static void vfio_ap_realize(DeviceState *dev, Error 
**errp)
 vapdev->vdev.dev = dev;
 
 /*
- * vfio-ap devices operate in a way compatible with
- * memory ballooning, as no pages are pinned in the host.
+ * vfio-ap devices operate in a way compatible with discarding of
+ * memory in RAM blocks, as no pages are pinned in the host.
  * This needs to be set before vfio_get_device() for vfio common to
- * handle the balloon inhibitor.
+ * handle ram_block_discard_disable().
  */
-vapdev->vdev.balloon_allowed = true;
+vapdev->vdev.ram_block_discard_allowed = true;
 
 ret = vfio_get_device(vfio_group, mdevid, &vapdev->vdev, errp);
 if (ret) {
diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 06e69d7066..ff7f369779 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -574,12 +574,13 @@ static void vfio_ccw_get_device(VFIOGroup *group, 
VFIOCCWDevice *vcdev,
 
 /*
  * All vfio-ccw devices are believed to operate in a way compatible with
- * memory ballooning, ie. pages pinned in the host are in the current
- * working set of the guest driver and therefore never overlap with pages
- * available to the guest balloon driver.  This needs to be set before
- * vfio_get_device() for vfio common to handle the balloon inhibitor.
+ * discarding of memory in RAM blocks, ie. pages pinned in the host are
+ * in the current working set of the guest driver and therefore never
+ * overlap e.g., with pages available to the guest balloon driver.  This
+ * needs to be set before vfio_get_device() for vfio common to handle
+ * ram_block_discard_disable().
  */
-vcdev->vdev.balloon_allowed = true;
+vcdev->vdev.ram_block_discard_allowed = true;
 
 if (vfio_get_device(group, vcdev->cdev.mdevid, &vcdev->vdev, errp)) {
 goto out_err;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 0b3593b3c0..33357140b8 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -33,7 +33,6 @@
 #include "qemu/error-report.h"
 #include "qemu/main-loop.h"
 #include "qemu/range.h"
-#include "sysemu/balloon.h"
 #include "sysemu/kvm.h"
 #include "sysemu/reset.h"
 #include "trace.h"
@@ -1215,31 +1214,36 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as,
 space = vfio_get_address_space(as);
 
 /*
- * VFIO is currently incompatible with memory ballooning insofar as the
+ * VFIO is currently incompatible with discarding of RAM insofar as the
  * madvise to purge (zap) the page from QEMU's address space does not
  * interact with the memory API and therefore leaves stale virtual to
  * physical mappings in the IOMMU if the page was previously pinned.  We
- * therefore add a balloon inhibit for each group added to a container,
+ * therefore set discarding broken for each group added to a container,
  * whether the container is used individually or shared.  This provides
  * us with options to allow devices within a group to opt-in and allow
- * ballooning, so long as it is done consistently for a group (for instance
+ * discarding, so long as it is done consistently

[PULL 05/41] exec: Introduce ram_block_discard_(disable|require)()

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

We want to replace qemu_balloon_inhibit() by something more generic.
In particular, we want to make sure that technologies that really rely on
RAM block discards to work reliably run mutually exclusive with
technologies that effectively break them.

E.g., vfio will usually pin all guest memory, rendering virtio-balloon
basically useless and making the VM consume more memory than reported via
the balloon. While the balloon is special already (=> no guarantees, same
behavior possible after reboots and with huge pages), this will be
different, especially with virtio-mem.

Let's implement a way such that we can make both types of technology run
mutually exclusive. We'll convert existing balloon inhibitors in successive
patches and add some new ones. Add the check to
qemu_balloon_is_inhibited() for now. We might want to make
virtio-balloon an actual inhibitor in the future - however, that
requires more thought to not break existing setups.
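
The mutual exclusion described above can be tried out standalone with a
single signed counter. The following is a sketch under assumptions: it
uses C11 <stdatomic.h> rather than QEMU's atomic helpers, and the names
discard_state, discard_disable() and discard_require() are made up for
illustration. A positive count means "discards are broken", a negative
count means "discards are required".

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* > 0: discarding is disabled; < 0: discarding is required to work. */
static atomic_int discard_state;

/* Take (state == true) or drop (state == false) a "disable" reference. */
static int discard_disable(bool state)
{
    int old = atomic_load(&discard_state);

    if (!state) {
        atomic_fetch_sub(&discard_state, 1);
        return 0;
    }
    do {
        if (old < 0) {
            return -EBUSY; /* somebody requires discards to work */
        }
        /* On failure, 'old' is reloaded with the current value. */
    } while (!atomic_compare_exchange_weak(&discard_state, &old, old + 1));
    return 0;
}

/* Take (state == true) or drop (state == false) a "require" reference. */
static int discard_require(bool state)
{
    int old = atomic_load(&discard_state);

    if (!state) {
        atomic_fetch_add(&discard_state, 1);
        return 0;
    }
    do {
        if (old > 0) {
            return -EBUSY; /* somebody already disabled discards */
        }
    } while (!atomic_compare_exchange_weak(&discard_state, &old, old - 1));
    return 0;
}
```

Sharing one counter for both directions means the "is the other side
active?" test and the reference update happen in one compare-and-swap,
so the two kinds of users cannot both succeed in a race.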

Reviewed-by: Dr. David Alan Gilbert 
Cc: "Michael S. Tsirkin" 
Cc: Richard Henderson 
Cc: Paolo Bonzini 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-3-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/exec/memory.h | 41 ++
 balloon.c |  3 ++-
 exec.c| 52 +++
 3 files changed, 95 insertions(+), 1 deletion(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 7207025bd4..38ec38b9a8 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2472,6 +2472,47 @@ static inline MemOp devend_memop(enum device_endian end)
 }
 #endif
 
+/*
+ * Inhibit technologies that require discarding of pages in RAM blocks, e.g.,
+ * to manage the actual amount of memory consumed by the VM (then, the memory
+ * provided by RAM blocks might be bigger than the desired memory consumption).
+ * This *must* be set if:
+ * - Discarding parts of a RAM blocks does not result in the change being
+ *   reflected in the VM and the pages getting freed.
+ * - All memory in RAM blocks is pinned or duplicated, invaldiating any 
previous
+ *   discards blindly.
+ * - Discarding parts of a RAM blocks will result in integrity issues (e.g.,
+ *   encrypted VMs).
+ * Technologies that only temporarily pin the current working set of a
+ * driver are fine, because we don't expect such pages to be discarded
+ * (esp. based on guest action like balloon inflation).
+ *
+ * This is *not* to be used to protect from concurrent discards (esp.,
+ * postcopy).
+ *
+ * Returns 0 if successful. Returns -EBUSY if a technology that relies on
+ * discards to work reliably is active.
+ */
+int ram_block_discard_disable(bool state);
+
+/*
+ * Inhibit technologies that disable discarding of pages in RAM blocks.
+ *
+ * Returns 0 if successful. Returns -EBUSY if discards are already set to
+ * broken.
+ */
+int ram_block_discard_require(bool state);
+
+/*
+ * Test if discarding of memory in ram blocks is disabled.
+ */
+bool ram_block_discard_is_disabled(void);
+
+/*
+ * Test if discarding of memory in ram blocks is required to work reliably.
+ */
+bool ram_block_discard_is_required(void);
+
 #endif
 
 #endif
diff --git a/balloon.c b/balloon.c
index f104b42961..5fff79523a 100644
--- a/balloon.c
+++ b/balloon.c
@@ -40,7 +40,8 @@ static int balloon_inhibit_count;
 
 bool qemu_balloon_is_inhibited(void)
 {
-return atomic_read(&balloon_inhibit_count) > 0;
+return atomic_read(&balloon_inhibit_count) > 0 ||
+   ram_block_discard_is_disabled();
 }
 
 void qemu_balloon_inhibit(bool state)
diff --git a/exec.c b/exec.c
index 21926dc9c7..893636176e 100644
--- a/exec.c
+++ b/exec.c
@@ -4115,4 +4115,56 @@ void mtree_print_dispatch(AddressSpaceDispatch *d, 
MemoryRegion *root)
 }
 }
 
+/*
+ * If positive, discarding RAM is disabled. If negative, discarding RAM is
+ * required to work and cannot be disabled.
+ */
+static int ram_block_discard_disabled;
+
+int ram_block_discard_disable(bool state)
+{
+int old;
+
+if (!state) {
+atomic_dec(&ram_block_discard_disabled);
+return 0;
+}
+
+do {
+old = atomic_read(&ram_block_discard_disabled);
+if (old < 0) {
+return -EBUSY;
+}
+} while (atomic_cmpxchg(&ram_block_discard_disabled, old, old + 1) != old);
+return 0;
+}
+
+int ram_block_discard_require(bool state)
+{
+int old;
+
+if (!state) {
+atomic_inc(&ram_block_discard_disabled);
+return 0;
+}
+
+do {
+old = atomic_read(&ram_block_discard_disabled);
+if (old > 0) {
+return -EBUSY;
+}
+} while (atomic_cmpxchg(&ram_block_discard_disabled, old, old - 1) != old);
+return 0;
+}
+
+bool ram_block_discard_is_disabled(void)
+{
+return atomic_read(&ram_block_discard_disabled) > 0;
+}
+
+bool ram_block_discard_is_required(void)
+{
+return atomic_read(&ram_block_discard_disable

[PULL 09/41] virtio-balloon: Rip out qemu_balloon_inhibit()

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

The only remaining special case is postcopy. It cannot handle
concurrent discards yet, which would result in requesting already sent
pages from the source. Special-case it in virtio-balloon instead.

Introduce migration_in_incoming_postcopy(), to find out if incoming
postcopy is active.
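
As a rough, standalone illustration of the resulting check: the enum
below assumes that the incoming postcopy states are ordered NONE, ADVISE,
DISCARD, LISTENING, RUNNING, END (only DISCARD and END are named in this
patch; the rest of the ordering is an assumption), and the helper names
are hypothetical.

```c
#include <stdbool.h>

/* Assumed ordering of incoming postcopy states, for illustration only. */
enum postcopy_state {
    POSTCOPY_INCOMING_NONE,
    POSTCOPY_INCOMING_ADVISE,
    POSTCOPY_INCOMING_DISCARD,
    POSTCOPY_INCOMING_LISTENING,
    POSTCOPY_INCOMING_RUNNING,
    POSTCOPY_INCOMING_END,
};

/* True from the discard phase until (not including) the end state. */
static bool in_incoming_postcopy(enum postcopy_state ps)
{
    return ps >= POSTCOPY_INCOMING_DISCARD && ps < POSTCOPY_INCOMING_END;
}

/* The balloon skips discards if they are broken or postcopy is incoming. */
static bool balloon_inhibited(bool discard_disabled, enum postcopy_state ps)
{
    return discard_disabled || in_incoming_postcopy(ps);
}
```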

Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Michael S. Tsirkin 
Cc: "Michael S. Tsirkin" 
Cc: Juan Quintela 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-7-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/migration/misc.h   |  2 ++
 include/sysemu/balloon.h   |  2 --
 balloon.c  | 18 --
 hw/virtio/virtio-balloon.c | 10 --
 migration/migration.c  |  7 +++
 migration/postcopy-ram.c   | 23 ---
 6 files changed, 17 insertions(+), 45 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index d2762257aa..34e7d75713 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -69,6 +69,8 @@ bool migration_has_failed(MigrationState *);
 /* ...and after the device transmission */
 bool migration_in_postcopy_after_devices(MigrationState *);
 void migration_global_dump(Monitor *mon);
+/* True if incomming migration entered POSTCOPY_INCOMING_DISCARD */
+bool migration_in_incoming_postcopy(void);
 
 /* migration/block-dirty-bitmap.c */
 void dirty_bitmap_mig_init(void);
diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
index aea0c44985..20a2defe3a 100644
--- a/include/sysemu/balloon.h
+++ b/include/sysemu/balloon.h
@@ -23,7 +23,5 @@ typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo 
*info);
 int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
  QEMUBalloonStatus *stat_func, void *opaque);
 void qemu_remove_balloon_handler(void *opaque);
-bool qemu_balloon_is_inhibited(void);
-void qemu_balloon_inhibit(bool state);
 
 #endif
diff --git a/balloon.c b/balloon.c
index 5fff79523a..354408c6ea 100644
--- a/balloon.c
+++ b/balloon.c
@@ -36,24 +36,6 @@
 static QEMUBalloonEvent *balloon_event_fn;
 static QEMUBalloonStatus *balloon_stat_fn;
 static void *balloon_opaque;
-static int balloon_inhibit_count;
-
-bool qemu_balloon_is_inhibited(void)
-{
-return atomic_read(&balloon_inhibit_count) > 0 ||
-   ram_block_discard_is_disabled();
-}
-
-void qemu_balloon_inhibit(bool state)
-{
-if (state) {
-atomic_inc(&balloon_inhibit_count);
-} else {
-atomic_dec(&balloon_inhibit_count);
-}
-
-assert(atomic_read(&balloon_inhibit_count) >= 0);
-}
 
 static bool have_balloon(Error **errp)
 {
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 8a84718490..ae31f0817a 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -63,6 +63,12 @@ static bool 
virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
 return pbp->base_gpa == base_gpa;
 }
 
+static bool virtio_balloon_inhibited(void)
+{
+/* Postcopy cannot deal with concurrent discards, so it's special. */
+return ram_block_discard_is_disabled() || migration_in_incoming_postcopy();
+}
+
 static void balloon_inflate_page(VirtIOBalloon *balloon,
  MemoryRegion *mr, hwaddr mr_offset,
  PartiallyBalloonedPage *pbp)
@@ -336,7 +342,7 @@ static void virtio_balloon_handle_report(VirtIODevice 
*vdev, VirtQueue *vq)
  * accessible by another device or process, or if the guest is
  * expecting it to retain a non-zero value.
  */
-if (qemu_balloon_is_inhibited() || dev->poison_val) {
+if (virtio_balloon_inhibited() || dev->poison_val) {
 goto skip_element;
 }
 
@@ -421,7 +427,7 @@ static void virtio_balloon_handle_output(VirtIODevice 
*vdev, VirtQueue *vq)
 
 trace_virtio_balloon_handle_output(memory_region_name(section.mr),
pa);
-if (!qemu_balloon_is_inhibited()) {
+if (!virtio_balloon_inhibited()) {
 if (vq == s->ivq) {
 balloon_inflate_page(s, section.mr,
  section.offset_within_region, &pbp);
diff --git a/migration/migration.c b/migration/migration.c
index 481a590f72..d365d82209 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1772,6 +1772,13 @@ bool migration_in_postcopy_after_devices(MigrationState 
*s)
 return migration_in_postcopy() && s->postcopy_after_devices;
 }
 
+bool migration_in_incoming_postcopy(void)
+{
+PostcopyState ps = postcopy_state_get();
+
+return ps >= POSTCOPY_INCOMING_DISCARD && ps < POSTCOPY_INCOMING_END;
+}
+
 bool migration_is_idle(void)
 {
 MigrationState *s = current_migration;
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index a36402722b..b41a9fe2fd 100644
--- a/migr

[PULL 08/41] s390x/pv: Convert to ram_block_discard_disable()

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

Discarding RAM does not work as expected with protected VMs. Let's
switch to ram_block_discard_disable() for now, as we want to get rid
of qemu_balloon_inhibit(). Note that it will currently never fail, but
might fail in the future with new technologies (e.g., virtio-mem).

Acked-by: Cornelia Huck 
Cc: Richard Henderson 
Cc: Cornelia Huck 
Cc: Halil Pasic 
Cc: Christian Borntraeger 
Cc: Janosch Frank 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-6-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/s390x/s390-virtio-ccw.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index b111406d56..023fd25f2b 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -43,7 +43,6 @@
 #include "hw/qdev-properties.h"
 #include "hw/s390x/tod.h"
 #include "sysemu/sysemu.h"
-#include "sysemu/balloon.h"
 #include "hw/s390x/pv.h"
 #include "migration/blocker.h"
 
@@ -329,7 +328,7 @@ static void s390_machine_unprotect(S390CcwMachineState *ms)
 ms->pv = false;
 migrate_del_blocker(pv_mig_blocker);
 error_free_or_abort(&pv_mig_blocker);
-qemu_balloon_inhibit(false);
+ram_block_discard_disable(false);
 }
 
 static int s390_machine_protect(S390CcwMachineState *ms)
@@ -338,17 +337,22 @@ static int s390_machine_protect(S390CcwMachineState *ms)
 int rc;
 
/*
-* Ballooning on protected VMs needs support in the guest for
-* sharing and unsharing balloon pages. Block ballooning for
-* now, until we have a solution to make at least Linux guests
-* either support it or fail gracefully.
+* Discarding of memory in RAM blocks does not work as expected with
+* protected VMs. Sharing and unsharing pages would be required. Disable
+* it for now, until until we have a solution to make at least Linux
+* guests either support it (e.g., virtio-balloon) or fail gracefully.
 */
-qemu_balloon_inhibit(true);
+rc = ram_block_discard_disable(true);
+if (rc) {
+error_report("protected VMs: cannot disable RAM discard");
+return rc;
+}
+
 error_setg(&pv_mig_blocker,
"protected VMs are currently not migrateable.");
 rc = migrate_add_blocker(pv_mig_blocker, &local_err);
 if (rc) {
-qemu_balloon_inhibit(false);
+ram_block_discard_disable(false);
 error_report_err(local_err);
 error_free_or_abort(&pv_mig_blocker);
 return rc;
@@ -357,7 +361,7 @@ static int s390_machine_protect(S390CcwMachineState *ms)
 /* Create SE VM */
 rc = s390_pv_vm_enable();
 if (rc) {
-qemu_balloon_inhibit(false);
+ram_block_discard_disable(false);
 migrate_del_blocker(pv_mig_blocker);
 error_free_or_abort(&pv_mig_blocker);
 return rc;
-- 
MST

[PULL 17/41] numa: Handle virtio-mem in NUMA stats

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

Account the memory to the configured nid.

Reviewed-by: Pankaj Gupta 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: "Michael S. Tsirkin" 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-15-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/core/numa.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 2725886d06..e9aec69afd 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -824,6 +824,7 @@ static void numa_stat_memory_devices(NumaNodeMem node_mem[])
 MemoryDeviceInfoList *info;
 PCDIMMDeviceInfo *pcdimm_info;
 VirtioPMEMDeviceInfo *vpi;
+VirtioMEMDeviceInfo *vmi;
 
 for (info = info_list; info; info = info->next) {
 MemoryDeviceInfo *value = info->value;
@@ -844,6 +845,11 @@ static void numa_stat_memory_devices(NumaNodeMem 
node_mem[])
 node_mem[0].node_mem += vpi->size;
 node_mem[0].node_plugged_mem += vpi->size;
 break;
+case MEMORY_DEVICE_INFO_KIND_VIRTIO_MEM:
+vmi = value->u.virtio_mem.data;
+node_mem[vmi->node].node_mem += vmi->size;
+node_mem[vmi->node].node_plugged_mem += vmi->size;
+break;
 default:
 g_assert_not_reached();
 }
-- 
MST

[PULL 10/41] target/i386: sev: Use ram_block_discard_disable()

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

AMD SEV will pin all guest memory, so mark discarding of RAM as broken. At the
time this is called, we cannot have anyone active that relies on discards
to work properly - let's still implement error handling.

Reviewed-by: Dr. David Alan Gilbert 
Cc: "Michael S. Tsirkin" 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-8-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 target/i386/sev.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index d273174ad3..f100a53231 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -680,6 +680,12 @@ sev_guest_init(const char *id)
 uint32_t host_cbitpos;
 struct sev_user_data_status status = {};
 
+ret = ram_block_discard_disable(true);
+if (ret) {
+error_report("%s: cannot disable RAM discard", __func__);
+return NULL;
+}
+
 sev = lookup_sev_guest_info(id);
 if (!sev) {
 error_report("%s: '%s' is not a valid '%s' object",
@@ -751,6 +757,7 @@ sev_guest_init(const char *id)
 return sev;
 err:
 sev_guest = NULL;
+ram_block_discard_disable(false);
 return NULL;
 }
 
-- 
MST

[PULL 16/41] hmp: Handle virtio-mem when printing memory device info

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

Print the memory device info just like for other memory devices.

Reviewed-by: Dr. David Alan Gilbert 
Cc: "Dr. David Alan Gilbert" 
Cc: "Michael S. Tsirkin" 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-14-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 monitor/hmp-cmds.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 2b0b58a336..2ec13e4cc3 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1821,6 +1821,7 @@ void hmp_info_memory_devices(Monitor *mon, const QDict 
*qdict)
 MemoryDeviceInfoList *info_list = qmp_query_memory_devices(&err);
 MemoryDeviceInfoList *info;
 VirtioPMEMDeviceInfo *vpi;
+VirtioMEMDeviceInfo *vmi;
 MemoryDeviceInfo *value;
 PCDIMMDeviceInfo *di;
 
@@ -1855,6 +1856,21 @@ void hmp_info_memory_devices(Monitor *mon, const QDict 
*qdict)
 monitor_printf(mon, "  size: %" PRIu64 "\n", vpi->size);
 monitor_printf(mon, "  memdev: %s\n", vpi->memdev);
 break;
+case MEMORY_DEVICE_INFO_KIND_VIRTIO_MEM:
+vmi = value->u.virtio_mem.data;
+monitor_printf(mon, "Memory device [%s]: \"%s\"\n",
+   MemoryDeviceInfoKind_str(value->type),
+   vmi->id ? vmi->id : "");
+monitor_printf(mon, "  memaddr: 0x%" PRIx64 "\n", 
vmi->memaddr);
+monitor_printf(mon, "  node: %" PRId64 "\n", vmi->node);
+monitor_printf(mon, "  requested-size: %" PRIu64 "\n",
+   vmi->requested_size);
+monitor_printf(mon, "  size: %" PRIu64 "\n", vmi->size);
+monitor_printf(mon, "  max-size: %" PRIu64 "\n", 
vmi->max_size);
+monitor_printf(mon, "  block-size: %" PRIu64 "\n",
+   vmi->block_size);
+monitor_printf(mon, "  memdev: %s\n", vmi->memdev);
+break;
 default:
 g_assert_not_reached();
 }
-- 
MST

[PULL 14/41] virtio-pci: Proxy for virtio-mem

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

Let's add a proxy for virtio-mem, make it a memory device, and
pass-through the properties.

Reviewed-by: Pankaj Gupta 
Cc: "Michael S. Tsirkin" 
Cc: Marcel Apfelbaum 
Cc: "Dr. David Alan Gilbert" 
Cc: Igor Mammedov 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-12-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-mem-pci.h |  33 ++
 include/hw/pci/pci.h   |   1 +
 hw/virtio/virtio-mem-pci.c | 129 +
 hw/virtio/Makefile.objs|   1 +
 4 files changed, 164 insertions(+)
 create mode 100644 hw/virtio/virtio-mem-pci.h
 create mode 100644 hw/virtio/virtio-mem-pci.c

diff --git a/hw/virtio/virtio-mem-pci.h b/hw/virtio/virtio-mem-pci.h
new file mode 100644
index 00..8820cd6628
--- /dev/null
+++ b/hw/virtio/virtio-mem-pci.h
@@ -0,0 +1,33 @@
+/*
+ * Virtio MEM PCI device
+ *
+ * Copyright (C) 2020 Red Hat, Inc.
+ *
+ * Authors:
+ *  David Hildenbrand 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_VIRTIO_MEM_PCI_H
+#define QEMU_VIRTIO_MEM_PCI_H
+
+#include "hw/virtio/virtio-pci.h"
+#include "hw/virtio/virtio-mem.h"
+
+typedef struct VirtIOMEMPCI VirtIOMEMPCI;
+
+/*
+ * virtio-mem-pci: This extends VirtioPCIProxy.
+ */
+#define TYPE_VIRTIO_MEM_PCI "virtio-mem-pci-base"
+#define VIRTIO_MEM_PCI(obj) \
+OBJECT_CHECK(VirtIOMEMPCI, (obj), TYPE_VIRTIO_MEM_PCI)
+
+struct VirtIOMEMPCI {
+VirtIOPCIProxy parent_obj;
+VirtIOMEM vdev;
+};
+
+#endif /* QEMU_VIRTIO_MEM_PCI_H */
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index a4e9c33416..c1bf7d5356 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -87,6 +87,7 @@ extern bool pci_available;
 #define PCI_DEVICE_ID_VIRTIO_VSOCK   0x1012
 #define PCI_DEVICE_ID_VIRTIO_PMEM0x1013
 #define PCI_DEVICE_ID_VIRTIO_IOMMU   0x1014
+#define PCI_DEVICE_ID_VIRTIO_MEM 0x1015
 
 #define PCI_VENDOR_ID_REDHAT 0x1b36
 #define PCI_DEVICE_ID_REDHAT_BRIDGE  0x0001
diff --git a/hw/virtio/virtio-mem-pci.c b/hw/virtio/virtio-mem-pci.c
new file mode 100644
index 00..b325303b32
--- /dev/null
+++ b/hw/virtio/virtio-mem-pci.c
@@ -0,0 +1,129 @@
+/*
+ * Virtio MEM PCI device
+ *
+ * Copyright (C) 2020 Red Hat, Inc.
+ *
+ * Authors:
+ *  David Hildenbrand 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "virtio-mem-pci.h"
+#include "hw/mem/memory-device.h"
+#include "qapi/error.h"
+
+static void virtio_mem_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+VirtIOMEMPCI *mem_pci = VIRTIO_MEM_PCI(vpci_dev);
+DeviceState *vdev = DEVICE(&mem_pci->vdev);
+
+qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
+object_property_set_bool(OBJECT(vdev), true, "realized", errp);
+}
+
+static void virtio_mem_pci_set_addr(MemoryDeviceState *md, uint64_t addr,
+Error **errp)
+{
+object_property_set_uint(OBJECT(md), addr, VIRTIO_MEM_ADDR_PROP, errp);
+}
+
+static uint64_t virtio_mem_pci_get_addr(const MemoryDeviceState *md)
+{
+return object_property_get_uint(OBJECT(md), VIRTIO_MEM_ADDR_PROP,
+&error_abort);
+}
+
+static MemoryRegion *virtio_mem_pci_get_memory_region(MemoryDeviceState *md,
+  Error **errp)
+{
+VirtIOMEMPCI *pci_mem = VIRTIO_MEM_PCI(md);
+VirtIOMEM *vmem = VIRTIO_MEM(&pci_mem->vdev);
+VirtIOMEMClass *vmc = VIRTIO_MEM_GET_CLASS(vmem);
+
+return vmc->get_memory_region(vmem, errp);
+}
+
+static uint64_t virtio_mem_pci_get_plugged_size(const MemoryDeviceState *md,
+Error **errp)
+{
+return object_property_get_uint(OBJECT(md), VIRTIO_MEM_SIZE_PROP,
+errp);
+}
+
+static void virtio_mem_pci_fill_device_info(const MemoryDeviceState *md,
+MemoryDeviceInfo *info)
+{
+VirtioMEMDeviceInfo *vi = g_new0(VirtioMEMDeviceInfo, 1);
+VirtIOMEMPCI *pci_mem = VIRTIO_MEM_PCI(md);
+VirtIOMEM *vmem = VIRTIO_MEM(&pci_mem->vdev);
+VirtIOMEMClass *vpc = VIRTIO_MEM_GET_CLASS(vmem);
+DeviceState *dev = DEVICE(md);
+
+if (dev->id) {
+vi->has_id = true;
+vi->id = g_strdup(dev->id);
+}
+
+/* let the real device handle everything else */
+vpc->fill_device_info(vmem, vi);
+
+info->u.virtio_mem.data = vi;
+info->type = MEMORY_DEVICE_INFO_KIND_VIRTIO_MEM;
+}
+
+static void virtio_mem_pci_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
+MemoryDeviceClass *mdc = MEMORY_DEVICE_CLASS(kl

[PULL 18/41] pc: Support for virtio-mem-pci

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

Let's wire it up similar to virtio-pmem. Also disallow unplug, so it's
harder for users to shoot themselves in the foot.

Reviewed-by: Pankaj Gupta 
Cc: "Michael S. Tsirkin" 
Cc: Marcel Apfelbaum 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Eric Blake 
Cc: Markus Armbruster 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-16-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/pc.c| 49 -
 hw/i386/Kconfig |  1 +
 2 files changed, 29 insertions(+), 21 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 58b1425c17..576f2502f9 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -88,6 +88,7 @@
 #include "hw/net/ne2000-isa.h"
 #include "standard-headers/asm-x86/bootparam.h"
 #include "hw/virtio/virtio-pmem-pci.h"
+#include "hw/virtio/virtio-mem-pci.h"
 #include "hw/mem/memory-device.h"
 #include "sysemu/replay.h"
 #include "qapi/qmp/qerror.h"
@@ -1637,8 +1638,8 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 numa_cpu_pre_plug(cpu_slot, dev, errp);
 }
 
-static void pc_virtio_pmem_pci_pre_plug(HotplugHandler *hotplug_dev,
-DeviceState *dev, Error **errp)
+static void pc_virtio_md_pci_pre_plug(HotplugHandler *hotplug_dev,
+  DeviceState *dev, Error **errp)
 {
 HotplugHandler *hotplug_dev2 = qdev_get_bus_hotplug_handler(dev);
 Error *local_err = NULL;
@@ -1649,7 +1650,8 @@ static void pc_virtio_pmem_pci_pre_plug(HotplugHandler 
*hotplug_dev,
  * order. We should never reach this point when hotplugging on x86,
  * however, better add a safety net.
  */
-error_setg(errp, "virtio-pmem-pci hotplug not supported on this bus.");
+error_setg(errp, "hotplug of virtio based memory devices not supported"
+   " on this bus.");
 return;
 }
 /*
@@ -1664,8 +1666,8 @@ static void pc_virtio_pmem_pci_pre_plug(HotplugHandler 
*hotplug_dev,
 error_propagate(errp, local_err);
 }
 
-static void pc_virtio_pmem_pci_plug(HotplugHandler *hotplug_dev,
-DeviceState *dev, Error **errp)
+static void pc_virtio_md_pci_plug(HotplugHandler *hotplug_dev,
+  DeviceState *dev, Error **errp)
 {
 HotplugHandler *hotplug_dev2 = qdev_get_bus_hotplug_handler(dev);
 Error *local_err = NULL;
@@ -1685,17 +1687,17 @@ static void pc_virtio_pmem_pci_plug(HotplugHandler 
*hotplug_dev,
 error_propagate(errp, local_err);
 }
 
-static void pc_virtio_pmem_pci_unplug_request(HotplugHandler *hotplug_dev,
-  DeviceState *dev, Error **errp)
+static void pc_virtio_md_pci_unplug_request(HotplugHandler *hotplug_dev,
+DeviceState *dev, Error **errp)
 {
-/* We don't support virtio pmem hot unplug */
-error_setg(errp, "virtio pmem device unplug not supported.");
+/* We don't support hot unplug of virtio based memory devices */
+error_setg(errp, "virtio based memory devices cannot be unplugged.");
 }
 
-static void pc_virtio_pmem_pci_unplug(HotplugHandler *hotplug_dev,
-  DeviceState *dev, Error **errp)
+static void pc_virtio_md_pci_unplug(HotplugHandler *hotplug_dev,
+DeviceState *dev, Error **errp)
 {
-/* We don't support virtio pmem hot unplug */
+/* We don't support hot unplug of virtio based memory devices */
 }
 
 static void pc_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
@@ -1705,8 +1707,9 @@ static void pc_machine_device_pre_plug_cb(HotplugHandler 
*hotplug_dev,
 pc_memory_pre_plug(hotplug_dev, dev, errp);
 } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
 pc_cpu_pre_plug(hotplug_dev, dev, errp);
-} else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI)) {
-pc_virtio_pmem_pci_pre_plug(hotplug_dev, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI) ||
+   object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) {
+pc_virtio_md_pci_pre_plug(hotplug_dev, dev, errp);
 }
 }
 
@@ -1717,8 +1720,9 @@ static void pc_machine_device_plug_cb(HotplugHandler 
*hotplug_dev,
 pc_memory_plug(hotplug_dev, dev, errp);
 } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
 pc_cpu_plug(hotplug_dev, dev, errp);
-} else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI)) {
-pc_virtio_pmem_pci_plug(hotplug_dev, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI) ||
+   object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) {
+pc_virtio_md_pci_plug(hotplug_dev, dev, errp);
 }
 }
 
@@ -1729,8 +1733,9 @@ static void pc_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,

[PULL 13/41] virtio-mem: Paravirtualized memory hot(un)plug

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

This is the very basic/initial version of virtio-mem. An introduction to
virtio-mem can be found in the Linux kernel driver [1]. While it can be
used in the current state for hotplug of a smaller amount of memory, it
will heavily benefit from resizeable memory regions in the future.

Each virtio-mem device manages a memory region (provided via a memory
backend). Once the hypervisor requests a new size ("requested-size"), the
guest can try to plug/unplug blocks of memory within that region in order
to reach the requested size. Initially, and after a reboot, all memory is
unplugged (except in special cases - reboot during postcopy).

The guest may only try to plug/unplug blocks of memory within the usable
region size. The usable region size is a little bigger than the
requested size, to give the device driver some flexibility. The usable
region size will only grow, except on reboots or when all memory is
requested to get unplugged. The guest can never plug more memory than
requested. Unplugged memory will get zapped/discarded, similar to a
balloon device.

The block size is variable; however, it is always chosen such that
THP splits are avoided (e.g., 2MB). The state of each block
(plugged/unplugged) is tracked in a bitmap.
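
To illustrate the per-block state tracking described above, here is a
minimal, self-contained sketch. It is not the code from this patch: the
BlockTracker type and the tracker_*() helpers are hypothetical names made
up for the example, and it assumes one bit per block with "set" meaning
"plugged".

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical tracker: one bit per block, bit set == block plugged. */
typedef struct BlockTracker {
    uint64_t block_size;    /* bytes per block */
    size_t nb_blocks;       /* number of blocks in the region */
    unsigned long *bitmap;  /* nb_blocks bits, all clear initially */
} BlockTracker;

/* Allocate an all-unplugged bitmap; returns false if allocation fails. */
bool tracker_init(BlockTracker *t, uint64_t region_size, uint64_t block_size)
{
    assert(block_size != 0 && region_size % block_size == 0);
    t->block_size = block_size;
    t->nb_blocks = region_size / block_size;
    t->bitmap = calloc((t->nb_blocks + BITS_PER_WORD - 1) / BITS_PER_WORD,
                       sizeof(unsigned long));
    return t->bitmap != NULL;
}

/* Mark the block containing byte offset `offset` as plugged/unplugged. */
void tracker_set(BlockTracker *t, uint64_t offset, bool plugged)
{
    size_t bit = offset / t->block_size;

    assert(bit < t->nb_blocks);
    if (plugged) {
        t->bitmap[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
    } else {
        t->bitmap[bit / BITS_PER_WORD] &= ~(1UL << (bit % BITS_PER_WORD));
    }
}

/* Is the block containing byte offset `offset` currently plugged? */
bool tracker_test(const BlockTracker *t, uint64_t offset)
{
    size_t bit = offset / t->block_size;

    assert(bit < t->nb_blocks);
    return (t->bitmap[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

/* Total plugged size in bytes (what "size" would report in this sketch). */
uint64_t tracker_plugged_size(const BlockTracker *t)
{
    uint64_t plugged = 0;
    size_t bit;

    for (bit = 0; bit < t->nb_blocks; bit++) {
        if ((t->bitmap[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL) {
            plugged += t->block_size;
        }
    }
    return plugged;
}

void tracker_destroy(BlockTracker *t)
{
    free(t->bitmap);
    t->bitmap = NULL;
}
```

With an 8G region and 2MB blocks this tracks 4096 blocks in 512 bytes of
bitmap, which is why a bitmap is a cheap fit for this kind of state.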

As virtio-mem devices (e.g., virtio-mem-pci) will be memory devices, we now
expose "VirtioMEMDeviceInfo" via "query-memory-devices".

--

There are two important follow-up items that are in the works:
1. Resizeable memory regions: Use resizeable allocations/RAM blocks to
   grow/shrink along with the usable region size. This avoids creating
   initially very big VMAs, RAM blocks, and KVM slots.
2. Protection of unplugged memory: Make sure the guest cannot actually
   make use of unplugged memory.

Other follow-up items that are in the works:
1. Exclude unplugged memory during migration (via precopy notifier).
2. Handle remapping of memory.
3. Support for other architectures.

--

Example usage (virtio-mem-pci is introduced in follow-up patches):

Start QEMU with two virtio-mem devices (one per NUMA node):
 $ qemu-system-x86_64 -m 4G,maxmem=20G \
  -smp sockets=2,cores=2 \
  -numa node,nodeid=0,cpus=0-1 -numa node,nodeid=1,cpus=2-3 \
  [...]
  -object memory-backend-ram,id=mem0,size=8G \
  -device virtio-mem-pci,id=vm0,memdev=mem0,node=0,requested-size=0M \
  -object memory-backend-ram,id=mem1,size=8G \
  -device virtio-mem-pci,id=vm1,memdev=mem1,node=1,requested-size=1G

Query the configuration:
 (qemu) info memory-devices
 Memory device [virtio-mem]: "vm0"
   memaddr: 0x14000
   node: 0
   requested-size: 0
   size: 0
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem0
 Memory device [virtio-mem]: "vm1"
   memaddr: 0x34000
   node: 1
   requested-size: 1073741824
   size: 1073741824
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem1

Add some memory to node 0:
 (qemu) qom-set vm0 requested-size 500M

Remove some memory from node 1:
 (qemu) qom-set vm1 requested-size 200M

Query the configuration again:
 (qemu) info memory-devices
 Memory device [virtio-mem]: "vm0"
   memaddr: 0x14000
   node: 0
   requested-size: 524288000
   size: 524288000
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem0
 Memory device [virtio-mem]: "vm1"
   memaddr: 0x34000
   node: 1
   requested-size: 209715200
   size: 209715200
   max-size: 8589934592
   block-size: 2097152
   memdev: /objects/mem1
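
The byte values in the output above follow directly from the requested
sizes and the 2MB block size. The following sketch only redoes that
arithmetic; size_to_blocks() and blocks_to_change() are hypothetical
helpers written for this example and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Block size reported as "block-size" in the example output. */
#define EXAMPLE_BLOCK_SIZE (2 * MiB)

/* Number of blocks covering `size` bytes; size must be block-aligned. */
uint64_t size_to_blocks(uint64_t size, uint64_t block_size)
{
    assert(block_size != 0 && size % block_size == 0);
    return size / block_size;
}

/*
 * Signed number of blocks the guest would have to plug (> 0) or
 * unplug (< 0) to get from `current` to `requested` bytes.
 */
int64_t blocks_to_change(uint64_t current, uint64_t requested,
                         uint64_t block_size)
{
    return (int64_t)size_to_blocks(requested, block_size) -
           (int64_t)size_to_blocks(current, block_size);
}
```

For vm0, going from 0 to 500M (524288000 bytes) means plugging 250 blocks;
for vm1, going from 1G (1073741824 bytes) to 200M (209715200 bytes) means
unplugging 412 blocks.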

[1] https://lkml.kernel.org/r/20200311171422.10484-1-da...@redhat.com

Cc: "Michael S. Tsirkin" 
Cc: Eric Blake 
Cc: Markus Armbruster 
Cc: "Dr. David Alan Gilbert" 
Cc: Igor Mammedov 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-11-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 qapi/misc.json |  39 +-
 include/hw/virtio/virtio-mem.h |  78 
 hw/virtio/virtio-mem.c | 724 +
 hw/virtio/Kconfig  |  11 +
 hw/virtio/Makefile.objs|   1 +
 5 files changed, 852 insertions(+), 1 deletion(-)
 create mode 100644 include/hw/virtio/virtio-mem.h
 create mode 100644 hw/virtio/virtio-mem.c

diff --git a/qapi/misc.json b/qapi/misc.json
index a5a0beb902..65ca3edf32 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -1356,19 +1356,56 @@
   }
 }
 
+##
+# @VirtioMEMDeviceInfo:
+#
+# VirtioMEMDevice state information
+#
+# @id: device's ID
+#
+# @memaddr: physical address in memory, where device is mapped
+#
+# @requested-size: the user requested size of the device
+#
+# @size: the (current) size of memory that the device provides
+#
+# @max-size: the maximum size of memory that the device can provide
+#
+# @block-size: the block size of memory that the device provides
+#
+# @node: NUMA node number where device is assigned to
+#
+# @mem

[PULL 15/41] MAINTAINERS: Add myself as virtio-mem maintainer

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

Let's make sure patches/bug reports find the right person.

Reviewed-by: Dr. David Alan Gilbert 
Cc: "Michael S. Tsirkin" 
Cc: Peter Maydell 
Cc: Markus Armbruster 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-13-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 MAINTAINERS | 9 +
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index dec252f38b..5f02160436 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1790,6 +1790,15 @@ F: hw/virtio/virtio-crypto.c
 F: hw/virtio/virtio-crypto-pci.c
 F: include/hw/virtio/virtio-crypto.h
 
+virtio-mem
+M: David Hildenbrand 
+S: Supported
+W: https://virtio-mem.gitlab.io/
+F: hw/virtio/virtio-mem.c
+F: hw/virtio/virtio-mem-pci.h
+F: hw/virtio/virtio-mem-pci.c
+F: include/hw/virtio/virtio-mem.h
+
 nvme
 M: Keith Busch 
 L: qemu-bl...@nongnu.org
-- 
MST

[PULL 25/41] tests/acpi: remove stale allowed tables

2020-07-03 Thread Michael S. Tsirkin

From: Andrew Jones 

Fixes: 93dd625f8bf7 ("tests/acpi: update expected data files")
Signed-off-by: Andrew Jones 
Message-Id: <20200629140938.17566-2-drjo...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 18 --
 1 file changed, 18 deletions(-)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index 8992f1f12b..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,19 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/pc/DSDT",
-"tests/data/acpi/pc/DSDT.acpihmat",
-"tests/data/acpi/pc/DSDT.bridge",
-"tests/data/acpi/pc/DSDT.cphp",
-"tests/data/acpi/pc/DSDT.dimmpxm",
-"tests/data/acpi/pc/DSDT.ipmikcs",
-"tests/data/acpi/pc/DSDT.memhp",
-"tests/data/acpi/pc/DSDT.numamem",
-"tests/data/acpi/q35/DSDT",
-"tests/data/acpi/q35/DSDT.acpihmat",
-"tests/data/acpi/q35/DSDT.bridge",
-"tests/data/acpi/q35/DSDT.cphp",
-"tests/data/acpi/q35/DSDT.dimmpxm",
-"tests/data/acpi/q35/DSDT.ipmibt",
-"tests/data/acpi/q35/DSDT.memhp",
-"tests/data/acpi/q35/DSDT.mmio64",
-"tests/data/acpi/q35/DSDT.numamem",
-"tests/data/acpi/q35/DSDT.tis",
-- 
MST

[PULL 20/41] virtio-pci: Send qapi events when the virtio-mem size changes

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

Let's register the notifier and trigger the qapi event with the right
device id.

MEMORY_DEVICE_SIZE_CHANGE is similar to BALLOON_CHANGE, however on a
memory device level.

Don't unregister the notifier (we neither have finalize() nor unrealize()
for VirtIOPCIProxy, so it's not that simple to do it) - both devices are
expected to vanish at the same time.

Cc: "Michael S. Tsirkin" 
Cc: Markus Armbruster 
Cc: "Dr. David Alan Gilbert" 
Cc: Eric Blake 
Cc: Igor Mammedov 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-18-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 qapi/misc.json | 25 +
 hw/virtio/virtio-mem-pci.h |  1 +
 hw/virtio/virtio-mem-pci.c | 28 
 monitor/monitor.c  |  1 +
 4 files changed, 55 insertions(+)

diff --git a/qapi/misc.json b/qapi/misc.json
index 65ca3edf32..149c925246 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -1434,6 +1434,31 @@
 ##
 { 'command': 'query-memory-devices', 'returns': ['MemoryDeviceInfo'] }
 
+##
+# @MEMORY_DEVICE_SIZE_CHANGE:
+#
+# Emitted when the size of a memory device changes. Only emitted for memory
+# devices that can actually change the size (e.g., virtio-mem due to guest
+# action).
+#
+# @id: device's ID
+# @size: the new size of memory that the device provides
+#
+# Note: this event is rate-limited.
+#
+# Since: 5.1
+#
+# Example:
+#
+# <- { "event": "MEMORY_DEVICE_SIZE_CHANGE",
+#  "data": { "id": "vm0", "size": 1073741824},
+#  "timestamp": { "seconds": 1588168529, "microseconds": 201316 } }
+#
+##
+{ 'event': 'MEMORY_DEVICE_SIZE_CHANGE',
+  'data': { '*id': 'str', 'size': 'size' } }
+
+
 ##
 # @MEM_UNPLUG_ERROR:
 #
diff --git a/hw/virtio/virtio-mem-pci.h b/hw/virtio/virtio-mem-pci.h
index 8820cd6628..b51a28b275 100644
--- a/hw/virtio/virtio-mem-pci.h
+++ b/hw/virtio/virtio-mem-pci.h
@@ -28,6 +28,7 @@ typedef struct VirtIOMEMPCI VirtIOMEMPCI;
 struct VirtIOMEMPCI {
 VirtIOPCIProxy parent_obj;
 VirtIOMEM vdev;
+Notifier size_change_notifier;
 };
 
 #endif /* QEMU_VIRTIO_MEM_PCI_H */
diff --git a/hw/virtio/virtio-mem-pci.c b/hw/virtio/virtio-mem-pci.c
index b325303b32..1a8e854123 100644
--- a/hw/virtio/virtio-mem-pci.c
+++ b/hw/virtio/virtio-mem-pci.c
@@ -14,6 +14,7 @@
 #include "virtio-mem-pci.h"
 #include "hw/mem/memory-device.h"
 #include "qapi/error.h"
+#include "qapi/qapi-events-misc.h"
 
 static void virtio_mem_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
 {
@@ -74,6 +75,21 @@ static void virtio_mem_pci_fill_device_info(const MemoryDeviceState *md,
 info->type = MEMORY_DEVICE_INFO_KIND_VIRTIO_MEM;
 }
 
+static void virtio_mem_pci_size_change_notify(Notifier *notifier, void *data)
+{
+VirtIOMEMPCI *pci_mem = container_of(notifier, VirtIOMEMPCI,
+ size_change_notifier);
+DeviceState *dev = DEVICE(pci_mem);
+const uint64_t * const size_p = data;
+const char *id = NULL;
+
+if (dev->id) {
+id = g_strdup(dev->id);
+}
+
+qapi_event_send_memory_device_size_change(!!id, id, *size_p);
+}
+
 static void virtio_mem_pci_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -98,9 +114,21 @@ static void virtio_mem_pci_class_init(ObjectClass *klass, void *data)
 static void virtio_mem_pci_instance_init(Object *obj)
 {
 VirtIOMEMPCI *dev = VIRTIO_MEM_PCI(obj);
+VirtIOMEMClass *vmc;
+VirtIOMEM *vmem;
 
 virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
 TYPE_VIRTIO_MEM);
+
+dev->size_change_notifier.notify = virtio_mem_pci_size_change_notify;
+vmem = VIRTIO_MEM(&dev->vdev);
+vmc = VIRTIO_MEM_GET_CLASS(vmem);
+/*
+ * We never remove the notifier again, as we expect both devices to
+ * disappear at the same time.
+ */
+vmc->add_size_change_notifier(vmem, &dev->size_change_notifier);
+
 object_property_add_alias(obj, VIRTIO_MEM_BLOCK_SIZE_PROP,
   OBJECT(&dev->vdev), VIRTIO_MEM_BLOCK_SIZE_PROP);
 object_property_add_alias(obj, VIRTIO_MEM_SIZE_PROP, OBJECT(&dev->vdev),
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 125494410a..19dcb8fbe3 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -235,6 +235,7 @@ static MonitorQAPIEventConf monitor_qapi_event_conf[QAPI_EVENT__MAX] = {
 [QAPI_EVENT_QUORUM_REPORT_BAD] = { 1000 * SCALE_MS },
 [QAPI_EVENT_QUORUM_FAILURE]= { 1000 * SCALE_MS },
 [QAPI_EVENT_VSERPORT_CHANGE]   = { 1000 * SCALE_MS },
+[QAPI_EVENT_MEMORY_DEVICE_SIZE_CHANGE] = { 1000 * SCALE_MS },
 };
 
 /*
-- 
MST

[PULL 24/41] numa: Auto-enable NUMA when any memory devices are possible

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

Let's auto-enable it also when maxmem is specified but no slots are
defined. This will result in us properly creating ACPI srat tables,
indicating the maximum possible PFN to the guest OS. Based on this, e.g.,
Linux will enable the swiotlb properly.

This avoids having to manually force the swiotlb on (swiotlb=force) in
Linux in case we're booting only using DMA memory (e.g., 2GB on x86-64),
and virtio-mem adds memory later on that really needs the swiotlb to be
used for DMA.

Let's take care of backwards compatibility if somebody has a setup that
specifies "maxmem" without "slots".

Reported-by: Alex Shi 
Cc: Peter Maydell 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: Sergio Lopez 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: qemu-...@nongnu.org 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-22-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/boards.h |  1 +
 hw/arm/virt.c   |  2 ++
 hw/core/numa.c  | 11 ++-
 hw/i386/microvm.c   |  1 +
 hw/i386/pc.c|  1 +
 hw/i386/pc_piix.c   |  1 +
 hw/i386/pc_q35.c|  1 +
 7 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 18815d9be2..426ce5f625 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -207,6 +207,7 @@ struct MachineClass {
 const char **valid_cpu_types;
 strList *allowed_dynamic_sysbus_devices;
 bool auto_enable_numa_with_memhp;
+bool auto_enable_numa_with_memdev;
 void (*numa_auto_assign_ram)(MachineClass *mc, NodeInfo *nodes,
  int nb_nodes, ram_addr_t size);
 bool ignore_boot_device_suffixes;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index cd0834ce7f..f97be80a86 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2373,6 +2373,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
 hc->unplug = virt_machine_device_unplug_cb;
 mc->nvdimm_supported = true;
 mc->auto_enable_numa_with_memhp = true;
+mc->auto_enable_numa_with_memdev = true;
 mc->default_ram_id = "mach-virt.ram";
 
 object_class_property_add(oc, "acpi", "OnOffAuto",
@@ -2485,6 +2486,7 @@ static void virt_machine_5_0_options(MachineClass *mc)
 virt_machine_5_1_options(mc);
 compat_props_add(mc->compat_props, hw_compat_5_0, hw_compat_5_0_len);
 mc->numa_mem_supported = true;
+mc->auto_enable_numa_with_memdev = false;
 }
 DEFINE_VIRT_MACHINE(5, 0)
 
diff --git a/hw/core/numa.c b/hw/core/numa.c
index e9aec69afd..6a20ce7cf1 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -688,8 +688,9 @@ void numa_complete_configuration(MachineState *ms)
 NodeInfo *numa_info = ms->numa_state->nodes;
 
 /*
- * If memory hotplug is enabled (slots > 0) but without '-numa'
- * options explicitly on CLI, guestes will break.
+ * If memory hotplug is enabled (slot > 0) or memory devices are enabled
+ * (ms->maxram_size > ram_size) but without '-numa' options explicitly on
+ * CLI, guests will break.
  *
  *   Windows: won't enable memory hotplug without SRAT table at all
  *
@@ -704,9 +705,9 @@ void numa_complete_configuration(MachineState *ms)
  * assume there is just one node with whole RAM.
  */
 if (ms->numa_state->num_nodes == 0 &&
-((ms->ram_slots > 0 &&
-mc->auto_enable_numa_with_memhp) ||
-mc->auto_enable_numa)) {
+((ms->ram_slots && mc->auto_enable_numa_with_memhp) ||
+ (ms->maxram_size > ms->ram_size && mc->auto_enable_numa_with_memdev) ||
+ mc->auto_enable_numa)) {
 NumaNodeOptions node = { };
 parse_numa_node(ms, &node, &error_abort);
 numa_info[0].node_mem = ram_size;
diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index 5e931975a0..81d0888930 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -464,6 +464,7 @@ static void microvm_class_init(ObjectClass *oc, void *data)
 mc->max_cpus = 288;
 mc->has_hotpluggable_cpus = false;
 mc->auto_enable_numa_with_memhp = false;
+mc->auto_enable_numa_with_memdev = false;
 mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
 mc->nvdimm_supported = false;
 mc->default_ram_id = "microvm.ram";
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 576f2502f9..61acc9e530 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1975,6 +1975,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
 mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
 mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
 mc->auto_enable_numa_with_memhp = true;
+mc->auto_enable_numa_with_memdev = true;
 mc->has_hotpluggable_cpus = true;
 mc->default_boot_order = "cad";
 mc->hot_add_cpu = pc_hot_add_cpu;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 1d832b2878..fae487f57d 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -444

[PULL 23/41] virtio-mem: Exclude unplugged memory during migration

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

The content of unplugged memory is undefined and should not be migrated,
ever. Exclude all unplugged memory during precopy using the precopy notifier
infrastructure introduced for free page hinting in virtio-balloon.

Unplugged memory is marked as "not dirty", meaning it won't be
considered for migration.
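
The core of this is walking the block bitmap and turning each run of
consecutive unplugged blocks into one (start, length) range. The sketch
below shows that walk in isolation. It does not use QEMU's bitmap helpers:
Run, next_bit_with_value() and collect_unplugged_runs() are hypothetical
names for this example, and the ranges are collected into an array instead
of being handed to the migration code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* One run of consecutive unplugged blocks. */
typedef struct Run {
    size_t first;   /* index of the first unplugged block */
    size_t count;   /* number of blocks in the run */
} Run;

/* First index >= start whose bit equals `value`, or nbits if none. */
size_t next_bit_with_value(const unsigned long *bitmap, size_t nbits,
                           size_t start, bool value)
{
    size_t i;

    for (i = start; i < nbits; i++) {
        bool set = (bitmap[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL;

        if (set == value) {
            return i;
        }
    }
    return nbits;
}

/*
 * Store up to max_runs runs of clear (unplugged) bits in `runs` and
 * return how many were stored.
 */
size_t collect_unplugged_runs(const unsigned long *bitmap, size_t nbits,
                              Run *runs, size_t max_runs)
{
    size_t n = 0;
    size_t first = next_bit_with_value(bitmap, nbits, 0, false);

    assert(runs != NULL);
    while (first < nbits && n < max_runs) {
        /* `end` is the first plugged block after the run, or nbits. */
        size_t end = next_bit_with_value(bitmap, nbits, first + 1, true);

        runs[n].first = first;
        runs[n].count = end - first;
        n++;
        /* Block `end` is plugged, so resume the search right after it. */
        first = next_bit_with_value(bitmap, nbits, end + 1, false);
    }
    return n;
}
```

Multiplying first and count by the block size gives the offset and length
of each range to exclude.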

Cc: "Michael S. Tsirkin" 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-21-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/virtio-mem.h |  3 ++
 hw/virtio/virtio-mem.c | 54 +-
 2 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h
index b74c77cd42..0778224964 100644
--- a/include/hw/virtio/virtio-mem.h
+++ b/include/hw/virtio/virtio-mem.h
@@ -67,6 +67,9 @@ typedef struct VirtIOMEM {
 
 /* notifiers to notify when "size" changes */
 NotifierList size_change_notifiers;
+
+/* don't migrate unplugged memory */
+NotifierWithReturn precopy_notifier;
 } VirtIOMEM;
 
 typedef struct VirtIOMEMClass {
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index fdd4dbb42c..bf9b414522 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -62,8 +62,14 @@ static bool virtio_mem_is_busy(void)
 /*
  * Postcopy cannot handle concurrent discards and we don't want to migrate
  * pages on-demand with stale content when plugging new blocks.
+ *
+ * For precopy, we don't want unplugged blocks in our migration stream, and
+ * when plugging new blocks, the page content might differ between source
+ * and destination (observable by the guest when not initializing pages
+ * after plugging them) until we're running on the destination (as we didn't
+ * migrate these blocks when they were unplugged).
  */
-return migration_in_incoming_postcopy();
+return migration_in_incoming_postcopy() || !migration_is_idle();
 }
 
 static bool virtio_mem_test_bitmap(VirtIOMEM *vmem, uint64_t start_gpa,
@@ -475,6 +481,7 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
 host_memory_backend_set_mapped(vmem->memdev, true);
 vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem));
 qemu_register_reset(virtio_mem_system_reset, vmem);
+precopy_add_notifier(&vmem->precopy_notifier);
 }
 
 static void virtio_mem_device_unrealize(DeviceState *dev)
@@ -482,6 +489,7 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
 VirtIOMEM *vmem = VIRTIO_MEM(dev);
 
+precopy_remove_notifier(&vmem->precopy_notifier);
 qemu_unregister_reset(virtio_mem_system_reset, vmem);
 vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem));
 host_memory_backend_set_mapped(vmem->memdev, false);
@@ -757,12 +765,56 @@ static void virtio_mem_set_block_size(Object *obj, Visitor *v, const char *name,
 vmem->block_size = value;
 }
 
+static void virtio_mem_precopy_exclude_unplugged(VirtIOMEM *vmem)
+{
+void * const host = qemu_ram_get_host_addr(vmem->memdev->mr.ram_block);
+unsigned long first_zero_bit, last_zero_bit;
+uint64_t offset, length;
+
+/*
+ * Find consecutive unplugged blocks and exclude them from migration.
+ *
+ * Note: Blocks cannot get (un)plugged during precopy, no locking needed.
+ */
+first_zero_bit = find_first_zero_bit(vmem->bitmap, vmem->bitmap_size);
+while (first_zero_bit < vmem->bitmap_size) {
+offset = first_zero_bit * vmem->block_size;
+last_zero_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size,
+  first_zero_bit + 1) - 1;
+length = (last_zero_bit - first_zero_bit + 1) * vmem->block_size;
+
+qemu_guest_free_page_hint(host + offset, length);
+first_zero_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size,
+last_zero_bit + 2);
+}
+}
+
+static int virtio_mem_precopy_notify(NotifierWithReturn *n, void *data)
+{
+VirtIOMEM *vmem = container_of(n, VirtIOMEM, precopy_notifier);
+PrecopyNotifyData *pnd = data;
+
+switch (pnd->reason) {
+case PRECOPY_NOTIFY_SETUP:
+precopy_enable_free_page_optimization();
+break;
+case PRECOPY_NOTIFY_AFTER_BITMAP_SYNC:
+virtio_mem_precopy_exclude_unplugged(vmem);
+break;
+default:
+break;
+}
+
+return 0;
+}
+
 static void virtio_mem_instance_init(Object *obj)
 {
 VirtIOMEM *vmem = VIRTIO_MEM(obj);
 
 vmem->block_size = VIRTIO_MEM_MIN_BLOCK_SIZE;
 notifier_list_init(&vmem->size_change_notifiers);
+vmem->precopy_notifier.notify = virtio_mem_precopy_notify;
 
 object_property_add(obj, VIRTIO_MEM_SIZE_PROP, "size", virtio_mem_get_size,
 NULL, NULL, NULL);
-- 
MST

[PULL 27/41] MAINTAINERS: add VT-d entry

2020-07-03 Thread Michael S. Tsirkin

From: Peter Xu 

Add this entry as suggested by Jason and Michael.

CC: Jason Wang 
CC: Michael S. Tsirkin 
CC: Paolo Bonzini 
Signed-off-by: Peter Xu 
Message-Id: <20200701124418.63060-1-pet...@redhat.com>
Acked-by: Jason Wang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 MAINTAINERS | 9 +
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5f02160436..49a0d837d7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2624,6 +2624,15 @@ F: tests/uefi-test-tools/
 F: .gitlab-ci.d/edk2.yml
 F: .gitlab-ci.d/edk2/
 
+VT-d Emulation
+M: Michael S. Tsirkin 
+M: Peter Xu 
+R: Jason Wang 
+S: Supported
+F: hw/i386/intel_iommu.c
+F: hw/i386/intel_iommu_internal.h
+F: include/hw/i386/intel_iommu.h
+
 Usermode Emulation
 --
 Overall usermode emulation
-- 
MST

[PULL 21/41] virtio-mem: Migration sanity checks

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

We want to make sure that certain properties don't change during
migration, especially to catch user errors in a nice way. Let's migrate
a temporary structure and validate that the properties didn't change.
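
The idea can be shown without the vmstate machinery: snapshot the
properties on the source, then compare that snapshot against the
destination's configuration on load. The sketch below is only an
illustration under that assumption. MigProps, first_changed_prop() and
check_mig_props() are hypothetical names, and the property strings are
placeholders rather than the real property names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Properties that must be identical on source and destination. */
typedef struct MigProps {
    uint64_t addr;
    uint64_t region_size;
    uint64_t block_size;
    uint32_t node;
} MigProps;

/* Name of the first differing property, or NULL if all match. */
const char *first_changed_prop(const MigProps *saved, const MigProps *cur)
{
    assert(saved != NULL && cur != NULL);
    if (saved->addr != cur->addr) {
        return "addr";
    }
    if (saved->region_size != cur->region_size) {
        return "region-size";
    }
    if (saved->block_size != cur->block_size) {
        return "block-size";
    }
    if (saved->node != cur->node) {
        return "node";
    }
    return NULL;
}

/* Returns 0 if nothing changed, -EINVAL (after reporting) otherwise. */
int check_mig_props(const MigProps *saved, const MigProps *cur)
{
    const char *name = first_changed_prop(saved, cur);

    if (name) {
        fprintf(stderr, "Property '%s' changed during migration\n", name);
        return -EINVAL;
    }
    return 0;
}
```

Failing the load with a message that names the property is what turns a
misconfigured destination into a readable error instead of silent guest
memory corruption.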

Reviewed-by: Dr. David Alan Gilbert 
Cc: "Michael S. Tsirkin" 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-19-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-mem.c | 70 ++
 1 file changed, 70 insertions(+)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 2df33f9125..6ed5409669 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -519,12 +519,82 @@ static int virtio_mem_post_load(void *opaque, int version_id)
 return virtio_mem_restore_unplugged(VIRTIO_MEM(opaque));
 }
 
+typedef struct VirtIOMEMMigSanityChecks {
+VirtIOMEM *parent;
+uint64_t addr;
+uint64_t region_size;
+uint64_t block_size;
+uint32_t node;
+} VirtIOMEMMigSanityChecks;
+
+static int virtio_mem_mig_sanity_checks_pre_save(void *opaque)
+{
+VirtIOMEMMigSanityChecks *tmp = opaque;
+VirtIOMEM *vmem = tmp->parent;
+
+tmp->addr = vmem->addr;
+tmp->region_size = memory_region_size(&vmem->memdev->mr);
+tmp->block_size = vmem->block_size;
+tmp->node = vmem->node;
+return 0;
+}
+
+static int virtio_mem_mig_sanity_checks_post_load(void *opaque, int version_id)
+{
+VirtIOMEMMigSanityChecks *tmp = opaque;
+VirtIOMEM *vmem = tmp->parent;
+const uint64_t new_region_size = memory_region_size(&vmem->memdev->mr);
+
+if (tmp->addr != vmem->addr) {
+error_report("Property '%s' changed from 0x%" PRIx64 " to 0x%" PRIx64,
+ VIRTIO_MEM_ADDR_PROP, tmp->addr, vmem->addr);
+return -EINVAL;
+}
+/*
+ * Note: Preparation for resizeable memory regions. The maximum size
+ * of the memory region must not change during migration.
+ */
+if (tmp->region_size != new_region_size) {
+error_report("Property '%s' size changed from 0x%" PRIx64 " to 0x%"
+ PRIx64, VIRTIO_MEM_MEMDEV_PROP, tmp->region_size,
+ new_region_size);
+return -EINVAL;
+}
+if (tmp->block_size != vmem->block_size) {
+error_report("Property '%s' changed from 0x%" PRIx64 " to 0x%" PRIx64,
+ VIRTIO_MEM_BLOCK_SIZE_PROP, tmp->block_size,
+ vmem->block_size);
+return -EINVAL;
+}
+if (tmp->node != vmem->node) {
+error_report("Property '%s' changed from %" PRIu32 " to %" PRIu32,
+ VIRTIO_MEM_NODE_PROP, tmp->node, vmem->node);
+return -EINVAL;
+}
+return 0;
+}
+
+static const VMStateDescription vmstate_virtio_mem_sanity_checks = {
+.name = "virtio-mem-device/sanity-checks",
+.pre_save = virtio_mem_mig_sanity_checks_pre_save,
+.post_load = virtio_mem_mig_sanity_checks_post_load,
+.fields = (VMStateField[]) {
+VMSTATE_UINT64(addr, VirtIOMEMMigSanityChecks),
+VMSTATE_UINT64(region_size, VirtIOMEMMigSanityChecks),
+VMSTATE_UINT64(block_size, VirtIOMEMMigSanityChecks),
+VMSTATE_UINT32(node, VirtIOMEMMigSanityChecks),
+VMSTATE_END_OF_LIST(),
+},
+};
+
 static const VMStateDescription vmstate_virtio_mem_device = {
 .name = "virtio-mem-device",
 .minimum_version_id = 1,
 .version_id = 1,
 .post_load = virtio_mem_post_load,
 .fields = (VMStateField[]) {
+VMSTATE_WITH_TMP(VirtIOMEM, VirtIOMEMMigSanityChecks,
+ vmstate_virtio_mem_sanity_checks),
 VMSTATE_UINT64(usable_region_size, VirtIOMEM),
 VMSTATE_UINT64(size, VirtIOMEM),
 VMSTATE_UINT64(requested_size, VirtIOMEM),
-- 
MST

Re: Questionable aspects of QEMU Error's design

2020-07-03 Thread Vladimir Sementsov-Ogievskiy


03.07.2020 10:38, Markus Armbruster wrote:

Markus Armbruster  writes:


Vladimir Sementsov-Ogievskiy  writes:


28.04.2020 08:20, Vladimir Sementsov-Ogievskiy wrote:

27.04.2020 18:36, Markus Armbruster wrote:

FYI, I'm working on converting QemuOpts, QAPI visitors and QOM.  I keep
running into bugs.  So far:

[...]

I got another one coming for QOM and qdev before I can post the
conversion.

Vladimir, since the conversion will mess with error_propagate(), I'd
like to get it in before your auto-propagation work.



OK, just let me know when to regenerate the series, it's not hard.



Hi! Is all that merged? Should I resend now?


I ran into many bugs and fell into a few rabbit holes.  I'm busy
finishing and flushing the patches.


All merged except for the final series "[PATCH v2 00/44] Less clumsy
error checking".  v2 has a lot of changes within the series, but in
aggregate it's really close to v1.  This makes me optimistic it can
serve as a base for your auto-propagation work.  To get it into 5.1, we
need a respin, a re-review, and a pull request.  Time is awfully short.
Sorry for taking so long!  If you want to try, I can give it priority on
my side.



Of course let's try)


--
Best regards,
Vladimir

[PULL 26/41] docs: vhost-user: add Virtio status protocol feature

2020-07-03 Thread Michael S. Tsirkin

From: Maxime Coquelin 

This patch specifies the VHOST_USER_SET_STATUS and
VHOST_USER_GET_STATUS requests, which are sent by
the master to update and query the Virtio status
in the backend.

Signed-off-by: Maxime Coquelin 
Message-Id: <20200618134501.145747-1-maxime.coque...@redhat.com>
Acked-by: Jason Wang 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 docs/interop/vhost-user.rst | 24 
 1 file changed, 24 insertions(+)

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 688b7c6900..10e3e3475e 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -816,6 +816,7 @@ Protocol features
   #define VHOST_USER_PROTOCOL_F_RESET_DEVICE 13
   #define VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS 14
   #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
+  #define VHOST_USER_PROTOCOL_F_STATUS   16
 
 Master message types
 
@@ -1307,6 +1308,29 @@ Master message types
   ``VHOST_USER_ADD_MEM_REG`` message, this message is used to set and
   update the memory tables of the slave device.
 
+``VHOST_USER_SET_STATUS``
+  :id: 39
+  :equivalent ioctl: VHOST_VDPA_SET_STATUS
+  :slave payload: N/A
+  :master payload: ``u64``
+
+  When the ``VHOST_USER_PROTOCOL_F_STATUS`` protocol feature has been
+  successfully negotiated, this message is submitted by the master to
+  notify the backend with updated device status as defined in the Virtio
+  specification.
+
+``VHOST_USER_GET_STATUS``
+  :id: 40
+  :equivalent ioctl: VHOST_VDPA_GET_STATUS
+  :slave payload: ``u64``
+  :master payload: N/A
+
+  When the ``VHOST_USER_PROTOCOL_F_STATUS`` protocol feature has been
+  successfully negotiated, this message is submitted by the master to
+  query the backend for its device status as defined in the Virtio
+  specification.
+
+
 Slave message types
 ---
 
-- 
MST

[PULL 30/41] virtio-bus: introduce queue_enabled method

2020-07-03 Thread Michael S. Tsirkin

From: Jason Wang 

This patch introduces queue_enabled() method which allows the
transport to implement its own way to report whether or not a queue is
enabled.
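
The pattern here is an optional method with a generic fallback. The sketch
below shows it on simplified stand-in types. Transport, Device,
example_queue_enabled() and device_queue_enabled() are hypothetical names
for this example and do not match the real VirtioBusClass or VirtIODevice
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_QUEUES 4

/* Stand-in for a transport class with an optional method. */
typedef struct Transport {
    /* May be NULL: the transport has no opinion of its own. */
    bool (*queue_enabled)(void *opaque, int n);
    void *opaque;
} Transport;

/* Stand-in for a device sitting on a transport. */
typedef struct Device {
    const Transport *transport;
    uint64_t desc_addr[NUM_QUEUES];  /* 0 == queue not set up */
} Device;

/* Example transport method: reads a per-queue flag array. */
bool example_queue_enabled(void *opaque, int n)
{
    const bool *flags = opaque;

    return flags[n];
}

/* Ask the transport if it implements the method, else fall back. */
bool device_queue_enabled(const Device *dev, int n)
{
    assert(n >= 0 && n < NUM_QUEUES);
    if (dev->transport && dev->transport->queue_enabled) {
        return dev->transport->queue_enabled(dev->transport->opaque, n);
    }
    /* Generic fallback: a queue with a descriptor address is enabled. */
    return dev->desc_addr[n] != 0;
}
```

Leaving the method NULL keeps the existing behavior for transports that do
not need anything special.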

Signed-off-by: Jason Wang 
Signed-off-by: Cindy Lu 
Message-Id: <20200701145538.22333-4-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 include/hw/virtio/virtio-bus.h | 4 
 hw/virtio/virtio.c | 6 ++
 2 files changed, 10 insertions(+)

diff --git a/include/hw/virtio/virtio-bus.h b/include/hw/virtio/virtio-bus.h
index 38c9399cd4..0f6f215925 100644
--- a/include/hw/virtio/virtio-bus.h
+++ b/include/hw/virtio/virtio-bus.h
@@ -83,6 +83,10 @@ typedef struct VirtioBusClass {
  */
 int (*ioeventfd_assign)(DeviceState *d, EventNotifier *notifier,
 int n, bool assign);
+/*
+ * Whether queue number n is enabled.
+ */
+bool (*queue_enabled)(DeviceState *d, int n);
 /*
  * Does the transport have variable vring alignment?
  * (ie can it ever call virtio_queue_set_align()?)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index cc9c9dc162..5bd2a2f621 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3286,6 +3286,12 @@ hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n)
 
 bool virtio_queue_enabled(VirtIODevice *vdev, int n)
 {
+BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
+VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+
+if (k->queue_enabled) {
+return k->queue_enabled(qbus->parent, n);
+}
 return virtio_queue_get_desc_addr(vdev, n) != 0;
 }
 
-- 
MST

[PULL 22/41] virtio-mem: Add trace events

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

Let's add some trace events that might come in handy later.

Cc: "Michael S. Tsirkin" 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-20-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-mem.c | 10 +-
 hw/virtio/trace-events | 10 ++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 6ed5409669..fdd4dbb42c 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -30,6 +30,7 @@
 #include "hw/boards.h"
 #include "hw/qdev-properties.h"
 #include "config-devices.h"
+#include "trace.h"
 
 /*
  * Use QEMU_VMALLOC_ALIGN, so no THP will have to be split when unplugging
@@ -100,6 +101,7 @@ static void virtio_mem_send_response(VirtIOMEM *vmem, VirtQueueElement *elem,
 VirtIODevice *vdev = VIRTIO_DEVICE(vmem);
 VirtQueue *vq = vmem->vq;
 
+trace_virtio_mem_send_response(le16_to_cpu(resp->type));
 iov_from_buf(elem->in_sg, elem->in_num, 0, resp, sizeof(*resp));
 
 virtqueue_push(vq, elem, sizeof(*resp));
@@ -195,6 +197,7 @@ static void virtio_mem_plug_request(VirtIOMEM *vmem, 
VirtQueueElement *elem,
 const uint16_t nb_blocks = le16_to_cpu(req->u.plug.nb_blocks);
 uint16_t type;
 
+trace_virtio_mem_plug_request(gpa, nb_blocks);
 type = virtio_mem_state_change_request(vmem, gpa, nb_blocks, true);
 virtio_mem_send_response_simple(vmem, elem, type);
 }
@@ -206,6 +209,7 @@ static void virtio_mem_unplug_request(VirtIOMEM *vmem, 
VirtQueueElement *elem,
 const uint16_t nb_blocks = le16_to_cpu(req->u.unplug.nb_blocks);
 uint16_t type;
 
+trace_virtio_mem_unplug_request(gpa, nb_blocks);
 type = virtio_mem_state_change_request(vmem, gpa, nb_blocks, false);
 virtio_mem_send_response_simple(vmem, elem, type);
 }
@@ -225,6 +229,7 @@ static void virtio_mem_resize_usable_region(VirtIOMEM *vmem,
 return;
 }
 
+trace_virtio_mem_resized_usable_region(vmem->usable_region_size, newsize);
 vmem->usable_region_size = newsize;
 }
 
@@ -247,7 +252,7 @@ static int virtio_mem_unplug_all(VirtIOMEM *vmem)
 vmem->size = 0;
 notifier_list_notify(&vmem->size_change_notifiers, &vmem->size);
 }
-
+trace_virtio_mem_unplugged_all();
 virtio_mem_resize_usable_region(vmem, vmem->requested_size, true);
 return 0;
 }
@@ -255,6 +260,7 @@ static int virtio_mem_unplug_all(VirtIOMEM *vmem)
 static void virtio_mem_unplug_all_request(VirtIOMEM *vmem,
   VirtQueueElement *elem)
 {
+trace_virtio_mem_unplug_all_request();
 if (virtio_mem_unplug_all(vmem)) {
 virtio_mem_send_response_simple(vmem, elem, VIRTIO_MEM_RESP_BUSY);
 } else {
@@ -272,6 +278,7 @@ static void virtio_mem_state_request(VirtIOMEM *vmem, 
VirtQueueElement *elem,
 .type = cpu_to_le16(VIRTIO_MEM_RESP_ACK),
 };
 
+trace_virtio_mem_state_request(gpa, nb_blocks);
 if (!virtio_mem_valid_range(vmem, gpa, size)) {
 virtio_mem_send_response_simple(vmem, elem, VIRTIO_MEM_RESP_ERROR);
 return;
@@ -284,6 +291,7 @@ static void virtio_mem_state_request(VirtIOMEM *vmem, 
VirtQueueElement *elem,
 } else {
 resp.u.state.state = cpu_to_le16(VIRTIO_MEM_STATE_MIXED);
 }
+trace_virtio_mem_state_response(le16_to_cpu(resp.u.state.state));
 virtio_mem_send_response(vmem, elem, &resp);
 }
 
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 6427a0047d..292fc15e29 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -74,3 +74,13 @@ virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d"
 virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d"
 virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t 
sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d"
 virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, 
uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64
+
+# virtio-mem.c
+virtio_mem_send_response(uint16_t type) "type=%" PRIu16
+virtio_mem_plug_request(uint64_t addr, uint16_t nb_blocks) "addr=0x%" PRIx64 " 
nb_blocks=%" PRIu16
+virtio_mem_unplug_request(uint64_t addr, uint16_t nb_blocks) "addr=0x%" PRIx64 
" nb_blocks=%" PRIu16
+virtio_mem_unplugged_all(void) ""
+virtio_mem_unplug_all_request(void) ""
+virtio_mem_resized_usable_region(uint64_t old_size, uint64_t new_size) 
"old_size=0x%" PRIx64 " new_size=0x%" PRIx64
+virtio_mem_state_request(uint64_t addr, uint16_t nb_blocks) "addr=0x%" PRIx64 
" nb_blocks=%" PRIu16
+virtio_mem_state_response(uint16_t state) "state=%" PRIu16
-- 
MST
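
Note on the trace-events format strings above: they rely on the compiler concatenating adjacent string literals around the PRIx64/PRIu16 macros, so any separator between two fields has to be written inside one of the literals. A small sketch of how such a format expands, using plain snprintf(); format_resize_event() is a hypothetical helper, not part of QEMU's tracing code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: renders one trace-style line into buf. */
static int format_resize_event(char *buf, size_t len,
                               uint64_t old_size, uint64_t new_size)
{
    /*
     * The three literals and two PRIx64 macros are glued into a single
     * format string at compile time. The space before "new_size" must
     * therefore be part of a literal, or the two fields run together.
     */
    return snprintf(buf, len,
                    "old_size=0x%" PRIx64 " new_size=0x%" PRIx64,
                    old_size, new_size);
}
```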

[PULL 32/41] vhost: check the existence of vhost_set_iotlb_callback

2020-07-03 Thread Michael S. Tsirkin

From: Jason Wang 

Check that the vhost_set_iotlb_callback op exists before calling it.

Signed-off-by: Jason Wang 
Signed-off-by: Cindy Lu 
Message-Id: <20200701145538.22333-6-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 hw/virtio/vhost.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 5fd25fe520..10304b583e 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1686,8 +1686,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice 
*vdev)
 }
 }
 
-if (vhost_dev_has_iommu(hdev)) {
-hdev->vhost_ops->vhost_set_iotlb_callback(hdev, true);
+if (vhost_dev_has_iommu(hdev) &&
+hdev->vhost_ops->vhost_set_iotlb_callback) {
+hdev->vhost_ops->vhost_set_iotlb_callback(hdev, true);
 
 /* Update used ring information for IOTLB to work correctly,
  * vhost-kernel code requires for this.*/
@@ -1730,7 +1731,9 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice 
*vdev)
 }
 
 if (vhost_dev_has_iommu(hdev)) {
-hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false);
+if (hdev->vhost_ops->vhost_set_iotlb_callback) {
+hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false);
+}
 memory_listener_unregister(&hdev->iommu_listener);
 }
 vhost_log_put(hdev, true);
-- 
MST

[PULL 28/41] net: introduce qemu_get_peer

2020-07-03 Thread Michael S. Tsirkin

From: Cindy Lu 

Add a small helper that returns the peer of the NetClientState
selected by the given queue_index.

Signed-off-by: Cindy Lu 
Message-Id: <20200701145538.22333-2-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 include/net/net.h | 1 +
 net/net.c | 7 +++
 2 files changed, 8 insertions(+)

diff --git a/include/net/net.h b/include/net/net.h
index 39085d9444..e7ef42d62b 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -176,6 +176,7 @@ void hmp_info_network(Monitor *mon, const QDict *qdict);
 void net_socket_rs_init(SocketReadState *rs,
 SocketReadStateFinalize *finalize,
 bool vnet_hdr);
+NetClientState *qemu_get_peer(NetClientState *nc, int queue_index);
 
 /* NIC info */
 
diff --git a/net/net.c b/net/net.c
index d1130296e1..9099a327dd 100644
--- a/net/net.c
+++ b/net/net.c
@@ -325,6 +325,13 @@ void *qemu_get_nic_opaque(NetClientState *nc)
 return nic->opaque;
 }
 
+NetClientState *qemu_get_peer(NetClientState *nc, int queue_index)
+{
+assert(nc != NULL);
+NetClientState *ncs = nc + queue_index;
+return ncs->peer;
+}
+
 static void qemu_cleanup_net_client(NetClientState *nc)
 {
 QTAILQ_REMOVE(&net_clients, nc, next);
-- 
MST

[PULL 29/41] vhost_net: use the function qemu_get_peer

2020-07-03 Thread Michael S. Tsirkin

From: Cindy Lu 

Use qemu_get_peer() to replace the open-coded ncs[i].peer lookups.

Signed-off-by: Cindy Lu 
Reviewed-by: Laurent Vivier 
Message-Id: <20200701145538.22333-3-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 hw/net/vhost_net.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 6b82803fa7..4096d64aaf 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -306,7 +306,9 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(dev)));
 VirtioBusState *vbus = VIRTIO_BUS(qbus);
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
+struct vhost_net *net;
 int r, e, i;
+NetClientState *peer;
 
 if (!k->set_guest_notifiers) {
 error_report("binding does not support guest notifiers");
@@ -314,9 +316,9 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
 }
 
 for (i = 0; i < total_queues; i++) {
-struct vhost_net *net;
 
-net = get_vhost_net(ncs[i].peer);
+peer = qemu_get_peer(ncs, i);
+net = get_vhost_net(peer);
 vhost_net_set_vq_index(net, i * 2);
 
 /* Suppress the masking guest notifiers on vhost user
@@ -335,15 +337,16 @@ int vhost_net_start(VirtIODevice *dev, NetClientState 
*ncs,
 }
 
 for (i = 0; i < total_queues; i++) {
-r = vhost_net_start_one(get_vhost_net(ncs[i].peer), dev);
+peer = qemu_get_peer(ncs, i);
+r = vhost_net_start_one(get_vhost_net(peer), dev);
 
 if (r < 0) {
 goto err_start;
 }
 
-if (ncs[i].peer->vring_enable) {
+if (peer->vring_enable) {
 /* restore vring enable state */
-r = vhost_set_vring_enable(ncs[i].peer, ncs[i].peer->vring_enable);
+r = vhost_set_vring_enable(peer, peer->vring_enable);
 
 if (r < 0) {
 goto err_start;
@@ -355,7 +358,8 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
 
 err_start:
 while (--i >= 0) {
-vhost_net_stop_one(get_vhost_net(ncs[i].peer), dev);
+peer = qemu_get_peer(ncs, i);
+vhost_net_stop_one(get_vhost_net(peer), dev);
 }
 e = k->set_guest_notifiers(qbus->parent, total_queues * 2, false);
 if (e < 0) {
-- 
MST

[PATCH v11 8/8] xen: introduce ERRP_AUTO_PROPAGATE

2020-07-03 Thread Vladimir Sementsov-Ogievskiy

If we want to add some info to errp (by error_prepend() or
error_append_hint()), we must use the ERRP_AUTO_PROPAGATE macro.
Otherwise, this info will not be added when errp == &error_fatal
(the program will exit prior to the error_append_hint() or
error_prepend() call).  Fix such cases.

If we want to check for an error after calling a function that takes
errp, we need to introduce local_err and then propagate it to errp.
Instead, use the ERRP_AUTO_PROPAGATE macro. The benefits are:
1. No need of explicit error_propagate call
2. No need of explicit local_err variable: use errp directly
3. ERRP_AUTO_PROPAGATE leaves errp as is if it's not NULL or
   &error_fatal, this means that we don't break error_abort
   (we'll abort on error_set, not on error_propagate)
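
For readers unfamiliar with the idiom, here is a toy model of the local_err plus propagate pattern that the macro is meant to automate. It is a sketch only: Error, toy_error_set() and toy_error_propagate() are simplified inventions, not QEMU's error API, and error_fatal / error_abort handling is not modelled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy error object; QEMU's real Error carries more than a message. */
typedef struct Error {
    char msg[128];
} Error;

/* Toy setter: a NULL errp means the caller does not want the error. */
static void toy_error_set(Error **errp, const char *msg)
{
    if (!errp) {
        return;
    }
    *errp = calloc(1, sizeof(**errp));
    assert(*errp != NULL);
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
}

/* Toy propagate: hand the error to the caller, or drop it. */
static void toy_error_propagate(Error **dst, Error *local)
{
    if (!local) {
        return;
    }
    if (dst) {
        *dst = local;
    } else {
        free(local);
    }
}

static void lower_layer(int fail, Error **errp)
{
    if (fail) {
        toy_error_set(errp, "open failed");
    }
}

/*
 * The manual pattern: receive into local_err so the error can be
 * checked and extended even when the caller passed errp == NULL.
 */
static void upper_layer(int fail, Error **errp)
{
    Error *local_err = NULL;

    lower_layer(fail, &local_err);
    if (local_err) {
        strncat(local_err->msg, ": check the path",
                sizeof(local_err->msg) - strlen(local_err->msg) - 1);
        toy_error_propagate(errp, local_err);
        return;
    }
}
```

The macro lets a function body keep using errp directly while still getting the "safe to dereference, safe to extend" behaviour of the manual version.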

This commit is generated by command

sed -n '/^X86 Xen CPUs$/,/^$/{s/^F: //p}' MAINTAINERS | \
xargs git ls-files | grep '\.[hc]$' | \
xargs spatch \
--sp-file scripts/coccinelle/auto-propagated-errp.cocci \
--macro-file scripts/cocci-macro-file.h \
--in-place --no-show-diff --max-width 80

Reported-by: Kevin Wolf 
Reported-by: Greg Kurz 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 hw/block/dataplane/xen-block.c |  17 +++---
 hw/block/xen-block.c   | 102 ++---
 hw/pci-host/xen_igd_pt.c   |   7 +--
 hw/xen/xen-backend.c   |   7 +--
 hw/xen/xen-bus.c   |  92 +
 hw/xen/xen-host-pci-device.c   |  27 +
 hw/xen/xen_pt.c|  25 
 hw/xen/xen_pt_config_init.c|  17 +++---
 8 files changed, 128 insertions(+), 166 deletions(-)

diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 5f8f15778b..1a077cc05f 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -723,8 +723,8 @@ void xen_block_dataplane_start(XenBlockDataPlane *dataplane,
unsigned int protocol,
Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 XenDevice *xendev = dataplane->xendev;
-Error *local_err = NULL;
 unsigned int ring_size;
 unsigned int i;
 
@@ -760,9 +760,8 @@ void xen_block_dataplane_start(XenBlockDataPlane *dataplane,
 }
 
 xen_device_set_max_grant_refs(xendev, dataplane->nr_ring_ref,
-  &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+  errp);
+if (*errp) {
 goto stop;
 }
 
@@ -770,9 +769,8 @@ void xen_block_dataplane_start(XenBlockDataPlane *dataplane,
   dataplane->ring_ref,
   dataplane->nr_ring_ref,
   PROT_READ | PROT_WRITE,
-  &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+  errp);
+if (*errp) {
 goto stop;
 }
 
@@ -805,9 +803,8 @@ void xen_block_dataplane_start(XenBlockDataPlane *dataplane,
 dataplane->event_channel =
 xen_device_bind_event_channel(xendev, event_channel,
   xen_block_dataplane_event, dataplane,
-  &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+  errp);
+if (*errp) {
 goto stop;
 }
 
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index a775fba7c0..623ae5b8e0 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -195,6 +195,7 @@ static const BlockDevOps xen_block_dev_ops = {
 
 static void xen_block_realize(XenDevice *xendev, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev);
 XenBlockDeviceClass *blockdev_class =
 XEN_BLOCK_DEVICE_GET_CLASS(xendev);
@@ -202,7 +203,6 @@ static void xen_block_realize(XenDevice *xendev, Error 
**errp)
 XenBlockVdev *vdev = &blockdev->props.vdev;
 BlockConf *conf = &blockdev->props.conf;
 BlockBackend *blk = conf->blk;
-Error *local_err = NULL;
 
 if (vdev->type == XEN_BLOCK_VDEV_TYPE_INVALID) {
 error_setg(errp, "vdev property not set");
@@ -212,9 +212,8 @@ static void xen_block_realize(XenDevice *xendev, Error 
**errp)
 trace_xen_block_realize(type, vdev->disk, vdev->partition);
 
 if (blockdev_class->realize) {
-blockdev_class->realize(blockdev, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+blockdev_class->realize(blockdev, errp);
+if (*errp) {
 return;
 }
 }
@@ -280,8 +279,8 @@ static void xen_block_frontend_changed(XenDevice *xendev,
enum xenbus_state frontend_state,
Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 enum xenbus_state backend_st

[PULL 34/41] vhost: implement vhost_dev_start method

2020-07-03 Thread Michael S. Tsirkin

From: Cindy Lu 

Use the vhost_dev_start callback to send the start/stop status to the backend.

Signed-off-by: Cindy Lu 
Message-Id: <20200701145538.22333-8-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 hw/virtio/vhost.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 10304b583e..32809e54b5 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1685,7 +1685,12 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice 
*vdev)
 goto fail_log;
 }
 }
-
+if (hdev->vhost_ops->vhost_dev_start) {
+r = hdev->vhost_ops->vhost_dev_start(hdev, true);
+if (r) {
+goto fail_log;
+}
+}
 if (vhost_dev_has_iommu(hdev) &&
 hdev->vhost_ops->vhost_set_iotlb_callback) {
 hdev->vhost_ops->vhost_set_iotlb_callback(hdev, true);
@@ -1723,6 +1728,9 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice 
*vdev)
 /* should only be called after backend is connected */
 assert(hdev->vhost_ops);
 
+if (hdev->vhost_ops->vhost_dev_start) {
+hdev->vhost_ops->vhost_dev_start(hdev, false);
+}
 for (i = 0; i < hdev->nvqs; ++i) {
 vhost_virtqueue_stop(hdev,
  vdev,
-- 
MST

[PULL 33/41] vhost: introduce new VhostOps vhost_dev_start

2020-07-03 Thread Michael S. Tsirkin

From: Cindy Lu 

This patch introduces a new VhostOps callback, vhost_dev_start, which
allows vhost_net to set the start/stop status in the backend.

Signed-off-by: Cindy Lu 
Message-Id: <20200701145538.22333-7-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 include/hw/virtio/vhost-backend.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index 6f6670783f..b80f344cd6 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -112,6 +112,7 @@ typedef int (*vhost_get_inflight_fd_op)(struct vhost_dev 
*dev,
 typedef int (*vhost_set_inflight_fd_op)(struct vhost_dev *dev,
 struct vhost_inflight *inflight);
 
+typedef int (*vhost_dev_start_op)(struct vhost_dev *dev, bool started);
 typedef struct VhostOps {
 VhostBackendType backend_type;
 vhost_backend_init vhost_backend_init;
@@ -152,6 +153,7 @@ typedef struct VhostOps {
 vhost_backend_mem_section_filter_op vhost_backend_mem_section_filter;
 vhost_get_inflight_fd_op vhost_get_inflight_fd;
 vhost_set_inflight_fd_op vhost_set_inflight_fd;
+vhost_dev_start_op vhost_dev_start;
 } VhostOps;
 
 extern const VhostOps user_ops;
-- 
MST

[PULL 31/41] virtio-pci: implement queue_enabled method

2020-07-03 Thread Michael S. Tsirkin

From: Jason Wang 

With version 1, we can detect whether a queue is enabled via
queue_enabled.

Signed-off-by: Jason Wang 
Signed-off-by: Cindy Lu 
Message-Id: <20200701145538.22333-5-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 hw/virtio/virtio-pci.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 7bc8c1c056..8554cf2a03 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1107,6 +1107,18 @@ static AddressSpace *virtio_pci_get_dma_as(DeviceState 
*d)
 return pci_get_address_space(dev);
 }
 
+static bool virtio_pci_queue_enabled(DeviceState *d, int n)
+{
+VirtIOPCIProxy *proxy = VIRTIO_PCI(d);
+VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+
+if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
+return proxy->vqs[n].enabled;
+}
+
+return virtio_queue_enabled(vdev, n);
+}
+
 static int virtio_pci_add_mem_cap(VirtIOPCIProxy *proxy,
struct virtio_pci_cap *cap)
 {
@@ -2064,6 +2076,7 @@ static void virtio_pci_bus_class_init(ObjectClass *klass, 
void *data)
 k->ioeventfd_enabled = virtio_pci_ioeventfd_enabled;
 k->ioeventfd_assign = virtio_pci_ioeventfd_assign;
 k->get_dma_as = virtio_pci_get_dma_as;
+k->queue_enabled = virtio_pci_queue_enabled;
 }
 
 static const TypeInfo virtio_pci_bus_info = {
-- 
MST

[PULL 35/41] vhost: introduce new VhostOps vhost_vq_get_addr

2020-07-03 Thread Michael S. Tsirkin

From: Cindy Lu 

This patch introduces a new VhostOps callback, vhost_vq_get_addr, to get
the vring address from the backend.

Signed-off-by: Cindy Lu 
Message-Id: <20200701145538.22333-9-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 include/hw/virtio/vhost-backend.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index b80f344cd6..fa84abac97 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -34,6 +34,7 @@ struct vhost_vring_state;
 struct vhost_vring_addr;
 struct vhost_scsi_target;
 struct vhost_iotlb_msg;
+struct vhost_virtqueue;
 
 typedef int (*vhost_backend_init)(struct vhost_dev *dev, void *opaque);
 typedef int (*vhost_backend_cleanup)(struct vhost_dev *dev);
@@ -113,6 +114,10 @@ typedef int (*vhost_set_inflight_fd_op)(struct vhost_dev 
*dev,
 struct vhost_inflight *inflight);
 
 typedef int (*vhost_dev_start_op)(struct vhost_dev *dev, bool started);
+
+typedef int (*vhost_vq_get_addr_op)(struct vhost_dev *dev,
+struct vhost_vring_addr *addr,
+struct vhost_virtqueue *vq);
 typedef struct VhostOps {
 VhostBackendType backend_type;
 vhost_backend_init vhost_backend_init;
@@ -154,6 +159,7 @@ typedef struct VhostOps {
 vhost_get_inflight_fd_op vhost_get_inflight_fd;
 vhost_set_inflight_fd_op vhost_set_inflight_fd;
 vhost_dev_start_op vhost_dev_start;
+vhost_vq_get_addr_op  vhost_vq_get_addr;
 } VhostOps;
 
 extern const VhostOps user_ops;
-- 
MST

[PULL 36/41] vhost: implement vhost_vq_get_addr method

2020-07-03 Thread Michael S. Tsirkin

From: Cindy Lu 

Use the vhost_vq_get_addr callback to get the vq address from the backend.

Signed-off-by: Cindy Lu 
Message-Id: <20200701145538.22333-10-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 include/hw/virtio/vhost-backend.h |  4 
 hw/virtio/vhost.c | 28 +++-
 2 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index fa84abac97..bfc24207e2 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -118,6 +118,9 @@ typedef int (*vhost_dev_start_op)(struct vhost_dev *dev, 
bool started);
 typedef int (*vhost_vq_get_addr_op)(struct vhost_dev *dev,
 struct vhost_vring_addr *addr,
 struct vhost_virtqueue *vq);
+
+typedef int (*vhost_get_device_id_op)(struct vhost_dev *dev, uint32_t *dev_id);
+
 typedef struct VhostOps {
 VhostBackendType backend_type;
 vhost_backend_init vhost_backend_init;
@@ -160,6 +163,7 @@ typedef struct VhostOps {
 vhost_set_inflight_fd_op vhost_set_inflight_fd;
 vhost_dev_start_op vhost_dev_start;
 vhost_vq_get_addr_op  vhost_vq_get_addr;
+vhost_get_device_id_op vhost_get_device_id;
 } VhostOps;
 
 extern const VhostOps user_ops;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 32809e54b5..1e083a8976 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -773,15 +773,25 @@ static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
 struct vhost_virtqueue *vq,
 unsigned idx, bool enable_log)
 {
-struct vhost_vring_addr addr = {
-.index = idx,
-.desc_user_addr = (uint64_t)(unsigned long)vq->desc,
-.avail_user_addr = (uint64_t)(unsigned long)vq->avail,
-.used_user_addr = (uint64_t)(unsigned long)vq->used,
-.log_guest_addr = vq->used_phys,
-.flags = enable_log ? (1 << VHOST_VRING_F_LOG) : 0,
-};
-int r = dev->vhost_ops->vhost_set_vring_addr(dev, &addr);
+struct vhost_vring_addr addr;
+int r;
+memset(&addr, 0, sizeof(struct vhost_vring_addr));
+
+if (dev->vhost_ops->vhost_vq_get_addr) {
+r = dev->vhost_ops->vhost_vq_get_addr(dev, &addr, vq);
+if (r < 0) {
+VHOST_OPS_DEBUG("vhost_vq_get_addr failed");
+return -errno;
+}
+} else {
+addr.desc_user_addr = (uint64_t)(unsigned long)vq->desc;
+addr.avail_user_addr = (uint64_t)(unsigned long)vq->avail;
+addr.used_user_addr = (uint64_t)(unsigned long)vq->used;
+}
+addr.index = idx;
+addr.log_guest_addr = vq->used_phys;
+addr.flags = enable_log ? (1 << VHOST_VRING_F_LOG) : 0;
+r = dev->vhost_ops->vhost_set_vring_addr(dev, &addr);
 if (r < 0) {
 VHOST_OPS_DEBUG("vhost_set_vring_addr failed");
 return -errno;
-- 
MST

[PULL 19/41] virtio-mem: Allow notifiers for size changes

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

We want to send qapi events in case the size of a virtio-mem device
changes. This allows upper layers to always know how much memory is
actually currently consumed via a virtio-mem device.

Unfortunately, we have to report the id of our proxy device. Let's provide
an easy way for our proxy device to register, so it can send the qapi
events. Piggy-backing on the notifier infrastructure (although we'll
only ever have one notifier registered) seems to be an easy way.

Reviewed-by: Dr. David Alan Gilbert 
Cc: "Michael S. Tsirkin" 
Cc: "Dr. David Alan Gilbert" 
Cc: Igor Mammedov 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-17-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/virtio-mem.h |  5 +
 hw/virtio/virtio-mem.c | 21 -
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h
index 6981096f7c..b74c77cd42 100644
--- a/include/hw/virtio/virtio-mem.h
+++ b/include/hw/virtio/virtio-mem.h
@@ -64,6 +64,9 @@ typedef struct VirtIOMEM {
 
 /* block size and alignment */
 uint64_t block_size;
+
+/* notifiers to notify when "size" changes */
+NotifierList size_change_notifiers;
 } VirtIOMEM;
 
 typedef struct VirtIOMEMClass {
@@ -73,6 +76,8 @@ typedef struct VirtIOMEMClass {
 /* public */
 void (*fill_device_info)(const VirtIOMEM *vmen, VirtioMEMDeviceInfo *vi);
 MemoryRegion *(*get_memory_region)(VirtIOMEM *vmem, Error **errp);
+void (*add_size_change_notifier)(VirtIOMEM *vmem, Notifier *notifier);
+void (*remove_size_change_notifier)(VirtIOMEM *vmem, Notifier *notifier);
 } VirtIOMEMClass;
 
 #endif
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index d8a0c974d3..2df33f9125 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -184,6 +184,7 @@ static int virtio_mem_state_change_request(VirtIOMEM *vmem, 
uint64_t gpa,
 } else {
 vmem->size -= size;
 }
+notifier_list_notify(&vmem->size_change_notifiers, &vmem->size);
 return VIRTIO_MEM_RESP_ACK;
 }
 
@@ -242,7 +243,10 @@ static int virtio_mem_unplug_all(VirtIOMEM *vmem)
 return -EBUSY;
 }
 bitmap_clear(vmem->bitmap, 0, vmem->bitmap_size);
-vmem->size = 0;
+if (vmem->size) {
+vmem->size = 0;
+notifier_list_notify(&vmem->size_change_notifiers, &vmem->size);
+}
 
 virtio_mem_resize_usable_region(vmem, vmem->requested_size, true);
 return 0;
@@ -561,6 +565,18 @@ static MemoryRegion 
*virtio_mem_get_memory_region(VirtIOMEM *vmem, Error **errp)
 return &vmem->memdev->mr;
 }
 
+static void virtio_mem_add_size_change_notifier(VirtIOMEM *vmem,
+Notifier *notifier)
+{
+notifier_list_add(&vmem->size_change_notifiers, notifier);
+}
+
+static void virtio_mem_remove_size_change_notifier(VirtIOMEM *vmem,
+   Notifier *notifier)
+{
+notifier_remove(notifier);
+}
+
 static void virtio_mem_get_size(Object *obj, Visitor *v, const char *name,
 void *opaque, Error **errp)
 {
@@ -668,6 +684,7 @@ static void virtio_mem_instance_init(Object *obj)
 VirtIOMEM *vmem = VIRTIO_MEM(obj);
 
 vmem->block_size = VIRTIO_MEM_MIN_BLOCK_SIZE;
+notifier_list_init(&vmem->size_change_notifiers);
 
 object_property_add(obj, VIRTIO_MEM_SIZE_PROP, "size", virtio_mem_get_size,
 NULL, NULL, NULL);
@@ -705,6 +722,8 @@ static void virtio_mem_class_init(ObjectClass *klass, void 
*data)
 
 vmc->fill_device_info = virtio_mem_fill_device_info;
 vmc->get_memory_region = virtio_mem_get_memory_region;
+vmc->add_size_change_notifier = virtio_mem_add_size_change_notifier;
+vmc->remove_size_change_notifier = virtio_mem_remove_size_change_notifier;
 }
 
 static const TypeInfo virtio_mem_info = {
-- 
MST

[PULL 37/41] vhost: introduce new VhostOps vhost_force_iommu

2020-07-03 Thread Michael S. Tsirkin

From: Cindy Lu 

This patch introduces a new VhostOps callback, vhost_force_iommu,
to force-enable the feature bit VIRTIO_F_IOMMU_PLATFORM.

Signed-off-by: Cindy Lu 
Message-Id: <20200701145538.22333-11-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 include/hw/virtio/vhost-backend.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index bfc24207e2..e7cb8d028c 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -121,6 +121,8 @@ typedef int (*vhost_vq_get_addr_op)(struct vhost_dev *dev,
 
 typedef int (*vhost_get_device_id_op)(struct vhost_dev *dev, uint32_t *dev_id);
 
+typedef bool (*vhost_force_iommu_op)(struct vhost_dev *dev);
+
 typedef struct VhostOps {
 VhostBackendType backend_type;
 vhost_backend_init vhost_backend_init;
@@ -164,6 +166,7 @@ typedef struct VhostOps {
 vhost_dev_start_op vhost_dev_start;
 vhost_vq_get_addr_op  vhost_vq_get_addr;
 vhost_get_device_id_op vhost_get_device_id;
+vhost_force_iommu_op vhost_force_iommu;
 } VhostOps;
 
 extern const VhostOps user_ops;
-- 
MST

[PULL 39/41] vhost_net: introduce set_config & get_config

2020-07-03 Thread Michael S. Tsirkin

From: Cindy Lu 

This patch introduces the set_config and get_config methods, which allow
vhost_net to set/get the config in the backend.

Signed-off-by: Cindy Lu 
Message-Id: <20200701145538.22333-13-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 include/net/vhost_net.h |  5 +
 hw/net/vhost_net-stub.c | 11 +++
 hw/net/vhost_net.c  | 10 ++
 3 files changed, 26 insertions(+)

diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
index 77e47398c4..172b0051d8 100644
--- a/include/net/vhost_net.h
+++ b/include/net/vhost_net.h
@@ -28,6 +28,11 @@ void vhost_net_cleanup(VHostNetState *net);
 uint64_t vhost_net_get_features(VHostNetState *net, uint64_t features);
 void vhost_net_ack_features(VHostNetState *net, uint64_t features);
 
+int vhost_net_get_config(struct vhost_net *net,  uint8_t *config,
+ uint32_t config_len);
+
+int vhost_net_set_config(struct vhost_net *net, const uint8_t *data,
+ uint32_t offset, uint32_t size, uint32_t flags);
 bool vhost_net_virtqueue_pending(VHostNetState *net, int n);
 void vhost_net_virtqueue_mask(VHostNetState *net, VirtIODevice *dev,
   int idx, bool mask);
diff --git a/hw/net/vhost_net-stub.c b/hw/net/vhost_net-stub.c
index aac0e98228..a7f4252630 100644
--- a/hw/net/vhost_net-stub.c
+++ b/hw/net/vhost_net-stub.c
@@ -52,6 +52,17 @@ uint64_t vhost_net_get_features(struct vhost_net *net, 
uint64_t features)
 return features;
 }
 
+int vhost_net_get_config(struct vhost_net *net,  uint8_t *config,
+ uint32_t config_len)
+{
+return 0;
+}
+int vhost_net_set_config(struct vhost_net *net, const uint8_t *data,
+ uint32_t offset, uint32_t size, uint32_t flags)
+{
+return 0;
+}
+
 void vhost_net_ack_features(struct vhost_net *net, uint64_t features)
 {
 }
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 4096d64aaf..4561665f6b 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -110,6 +110,16 @@ uint64_t vhost_net_get_features(struct vhost_net *net, 
uint64_t features)
 return vhost_get_features(&net->dev, vhost_net_get_feature_bits(net),
 features);
 }
+int vhost_net_get_config(struct vhost_net *net,  uint8_t *config,
+ uint32_t config_len)
+{
+return vhost_dev_get_config(&net->dev, config, config_len);
+}
+int vhost_net_set_config(struct vhost_net *net, const uint8_t *data,
+ uint32_t offset, uint32_t size, uint32_t flags)
+{
+return vhost_dev_set_config(&net->dev, data, offset, size, flags);
+}
 
 void vhost_net_ack_features(struct vhost_net *net, uint64_t features)
 {
-- 
MST

Re: [PATCH v2 11/18] hw/block/nvme: add remaining mandatory controller parameters

2020-07-03 Thread Philippe Mathieu-Daudé

On 7/3/20 10:46 AM, Klaus Jensen wrote:
> On Jul  3 10:31, Philippe Mathieu-Daudé wrote:
>> On 7/3/20 8:34 AM, Klaus Jensen wrote:
>>> From: Klaus Jensen 
>>>
>>> Add support for any remaining mandatory controller operating parameters
>>> (features).
>>>
>>> Signed-off-by: Klaus Jensen 
>>> Reviewed-by: Dmitry Fomichev 
>>> ---
>>>  hw/block/nvme.c   | 39 +--
>>>  hw/block/nvme.h   | 18 ++
>>>  hw/block/trace-events |  2 ++
>>>  include/block/nvme.h  |  7 +++
>>>  4 files changed, 60 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
>>> index ba523f6768bf..affb9a967534 100644
>>> --- a/hw/block/nvme.c
>>> +++ b/hw/block/nvme.c
>>> @@ -1056,8 +1056,16 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, 
>>> NvmeCmd *cmd, NvmeRequest *req)
>>>  uint32_t dw10 = le32_to_cpu(cmd->cdw10);
>>>  uint32_t dw11 = le32_to_cpu(cmd->cdw11);
>>>  uint32_t result;
>>> +uint8_t fid = NVME_GETSETFEAT_FID(dw10);
>>> +uint16_t iv;
>>>  
>>> -switch (dw10) {
>>> +trace_pci_nvme_getfeat(nvme_cid(req), fid, dw11);
>>> +
>>> +if (!nvme_feature_support[fid]) {
>>> +return NVME_INVALID_FIELD | NVME_DNR;
>>> +}
>>> +
>>> +switch (fid) {
>>>  case NVME_TEMPERATURE_THRESHOLD:
>>>  result = 0;
>>>  
>>> @@ -1088,14 +1096,27 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, 
>>> NvmeCmd *cmd, NvmeRequest *req)
>>>   ((n->params.max_ioqpairs - 1) << 16));
>>>  trace_pci_nvme_getfeat_numq(result);
>>>  break;
>>> +case NVME_INTERRUPT_VECTOR_CONF:
>>> +iv = dw11 & 0xffff;
>>> +if (iv >= n->params.max_ioqpairs + 1) {
>>> +return NVME_INVALID_FIELD | NVME_DNR;
>>> +}
>>> +
>>> +result = iv;
>>> +if (iv == n->admin_cq.vector) {
>>> +result |= NVME_INTVC_NOCOALESCING;
>>> +}
>>> +
>>> +result = cpu_to_le32(result);
>>> +break;
>>>  case NVME_ASYNCHRONOUS_EVENT_CONF:
>>>  result = cpu_to_le32(n->features.async_config);
>>>  break;
>>>  case NVME_TIMESTAMP:
>>>  return nvme_get_feature_timestamp(n, cmd);
>>>  default:
>>> -trace_pci_nvme_err_invalid_getfeat(dw10);
>>> -return NVME_INVALID_FIELD | NVME_DNR;
>>> +result = cpu_to_le32(nvme_feature_default[fid]);
>>
>> So here we expect uninitialized fid entries to return 0, right?
>>
> 
> Yes, if defaults are not 0 (like NVME_ARBITRATION), it is explicitly set.
> 
>>> +break;
>>>  }
>>>  
>>>  req->cqe.result = result;
>>> @@ -1124,8 +1145,15 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, 
>>> NvmeCmd *cmd, NvmeRequest *req)
>>>  {
>>>  uint32_t dw10 = le32_to_cpu(cmd->cdw10);
>>>  uint32_t dw11 = le32_to_cpu(cmd->cdw11);
>>> +uint8_t fid = NVME_GETSETFEAT_FID(dw10);
>>>  
>>> -switch (dw10) {
>>> +trace_pci_nvme_setfeat(nvme_cid(req), fid, dw11);
>>> +
>>> +if (!nvme_feature_support[fid]) {
>>> +return NVME_INVALID_FIELD | NVME_DNR;
>>> +}
>>> +
>>> +switch (fid) {
>>>  case NVME_TEMPERATURE_THRESHOLD:
>>>  if (NVME_TEMP_TMPSEL(dw11) != NVME_TEMP_TMPSEL_COMPOSITE) {
>>>  break;
>>> @@ -1172,8 +1200,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeCmd 
>>> *cmd, NvmeRequest *req)
>>>  case NVME_TIMESTAMP:
>>>  return nvme_set_feature_timestamp(n, cmd);
>>>  default:
>>> -trace_pci_nvme_err_invalid_setfeat(dw10);
>>> -return NVME_INVALID_FIELD | NVME_DNR;
>>> +return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
>>>  }
>>>  return NVME_SUCCESS;
>>>  }
>>> diff --git a/hw/block/nvme.h b/hw/block/nvme.h
>>> index f8940435f9ef..8ad1e3c89cee 100644
>>> --- a/hw/block/nvme.h
>>> +++ b/hw/block/nvme.h
>>> @@ -87,6 +87,24 @@ typedef struct NvmeFeatureVal {
>>>  uint32_t async_config;
>>>  } NvmeFeatureVal;

What do you think about adding:

--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -804,7 +804,8 @@ enum NvmeFeatureIds {
 NVME_WRITE_ATOMICITY= 0xa,
 NVME_ASYNCHRONOUS_EVENT_CONF= 0xb,
 NVME_TIMESTAMP  = 0xe,
-NVME_SOFTWARE_PROGRESS_MARKER   = 0x80
+NVME_SOFTWARE_PROGRESS_MARKER   = 0x80,
+NVME_FEATURE_ID_COUNT   = 0x100
 };

>>>  
>>> +static const uint32_t nvme_feature_default[0x100] = {

Why uint32_t and not uint16_t?

With the previously suggested enum you can now replace 0x100
by NVME_FEATURE_ID_COUNT.

>>> +[NVME_ARBITRATION]   = NVME_ARB_AB_NOLIMIT,
>>> +};
>>> +
>>> +static const bool nvme_feature_support[0x100] = {

Ditto NVME_FEATURE_ID_COUNT.

>>> +[NVME_ARBITRATION]  = true,
>>> +[NVME_POWER_MANAGEMENT] = true,
>>> +[NVME_TEMPERATURE_THRESHOLD]= true,
>>> +[NVME_ERROR_RECOVERY]   = true,
>>> +[NVME_VOLATILE_WRITE_CACHE] = true,
>>> +[NVME_NUMBER_OF_QUEUES] = true,
>
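
The review points above (size both lookup tables by an enum count, and rely on unlisted entries being zero) can be sketched in isolation. This is only an illustration: the toy_ names, the 0x7 arbitration default and the "fid is the low byte of dw10" helper are assumptions made for the example, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of feature identifiers, for illustration only. */
enum ToyFeatureIds {
    TOY_ARBITRATION              = 0x1,
    TOY_POWER_MANAGEMENT         = 0x2,
    TOY_TEMPERATURE_THRESHOLD    = 0x4,
    TOY_SOFTWARE_PROGRESS_MARKER = 0x80,
    TOY_FEATURE_ID_COUNT         = 0x100  /* one past the largest 8-bit id */
};

/* Placeholder default value; the real constant is not shown in this thread. */
#define TOY_ARB_DEFAULT 0x7u

/* Entries without a designated initializer are zero-initialized by C. */
static const uint32_t toy_feature_default[TOY_FEATURE_ID_COUNT] = {
    [TOY_ARBITRATION] = TOY_ARB_DEFAULT,
};

static const bool toy_feature_support[TOY_FEATURE_ID_COUNT] = {
    [TOY_ARBITRATION]           = true,
    [TOY_POWER_MANAGEMENT]      = true,
    [TOY_TEMPERATURE_THRESHOLD] = true,
};

/* Assumption: the feature id is the low byte of command dword 10. */
static uint8_t toy_fid(uint32_t dw10)
{
    return dw10 & 0xff;
}

static bool toy_feature_supported(uint32_t dw10)
{
    return toy_feature_support[toy_fid(dw10)];
}

static uint32_t toy_feature_get_default(uint32_t dw10)
{
    return toy_feature_default[toy_fid(dw10)];
}
```

Because the index is an 8-bit value and the tables have 0x100 entries, no lookup can go out of bounds, and an id that was never listed simply reads back as unsupported with a default of 0.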

[PULL 38/41] vhost: implement vhost_force_iommu method

2020-07-03 Thread Michael S. Tsirkin

From: Cindy Lu 

Use the vhost_force_iommu callback to force-enable the feature bit
VIRTIO_F_IOMMU_PLATFORM.

Signed-off-by: Cindy Lu 
Message-Id: <20200701145538.22333-12-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 hw/virtio/vhost.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 1e083a8976..1a1384e7a6 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -810,6 +810,11 @@ static int vhost_dev_set_features(struct vhost_dev *dev,
 if (!vhost_dev_has_iommu(dev)) {
 features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM);
 }
+if (dev->vhost_ops->vhost_force_iommu) {
+if (dev->vhost_ops->vhost_force_iommu(dev) == true) {
+features |= 0x1ULL << VIRTIO_F_IOMMU_PLATFORM;
+   }
+}
 r = dev->vhost_ops->vhost_set_features(dev, features);
 if (r < 0) {
 VHOST_OPS_DEBUG("vhost_set_features failed");
-- 
MST
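
The optional-callback pattern in this hunk can be tried outside QEMU with a small sketch. The toy_ types below are stand-ins invented for the example (they are not the real struct vhost_dev or VhostOps); only the bit number 33 for VIRTIO_F_IOMMU_PLATFORM comes from the virtio specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit number of VIRTIO_F_IOMMU_PLATFORM in the virtio specification. */
#define VIRTIO_F_IOMMU_PLATFORM 33

struct toy_dev;

/* Toy stand-in for the backend ops table: the callback may be NULL. */
struct toy_ops {
    bool (*force_iommu)(struct toy_dev *dev);
};

struct toy_dev {
    const struct toy_ops *ops;
    bool has_iommu;
};

/* Same order as the hunk: first clear the bit, then let the backend force it. */
static uint64_t toy_adjust_features(struct toy_dev *dev, uint64_t features)
{
    if (!dev->has_iommu) {
        features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM);
    }
    /* A bool needs no comparison against true. */
    if (dev->ops->force_iommu && dev->ops->force_iommu(dev)) {
        features |= 0x1ULL << VIRTIO_F_IOMMU_PLATFORM;
    }
    return features;
}

/* Example backend that always asks for the bit. */
static bool toy_always_force(struct toy_dev *dev)
{
    (void)dev;
    return true;
}
```

A backend that leaves the callback NULL keeps the previous behaviour, so existing backends need no change.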

[PULL 40/41] vhost-vdpa: introduce vhost-vdpa backend

2020-07-03 Thread Michael S. Tsirkin

From: Cindy Lu 

Currently we have two types of vhost backends in QEMU: vhost kernel and
vhost-user. The above patch provides a generic device for vDPA purposes;
this vDPA device exposes to user space a non-vendor-specific configuration
interface for setting up a vhost HW accelerator. This patch set introduces
a third vhost backend, called vhost-vdpa, based on the vDPA interface.

Vhost-vdpa usage:

qemu-system-x86_64 -cpu host -enable-kvm \
..
-netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
-device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \

Signed-off-by: Lingshan zhu 
Signed-off-by: Tiwei Bie 
Signed-off-by: Cindy Lu 
Signed-off-by: Jason Wang 
Message-Id: <20200701145538.22333-14-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 configure |  21 ++
 include/hw/virtio/vhost-backend.h |   4 +-
 include/hw/virtio/vhost-vdpa.h|  26 ++
 include/hw/virtio/vhost.h |   7 +
 hw/net/vhost_net.c|  19 +-
 hw/net/virtio-net.c   |  19 ++
 hw/virtio/vhost-backend.c |   6 +
 hw/virtio/vhost-vdpa.c| 475 ++
 docs/interop/index.rst|   1 +
 docs/interop/vhost-vdpa.rst   |  17 ++
 hw/virtio/Makefile.objs   |   1 +
 qemu-options.hx   |  12 +
 12 files changed, 601 insertions(+), 7 deletions(-)
 create mode 100644 include/hw/virtio/vhost-vdpa.h
 create mode 100644 hw/virtio/vhost-vdpa.c
 create mode 100644 docs/interop/vhost-vdpa.rst

diff --git a/configure b/configure
index 4a22dcd563..3db7f20185 100755
--- a/configure
+++ b/configure
@@ -1575,6 +1575,10 @@ for opt do
   ;;
   --enable-vhost-user) vhost_user="yes"
   ;;
+  --disable-vhost-vdpa) vhost_vdpa="no"
+  ;;
+  --enable-vhost-vdpa) vhost_vdpa="yes"
+  ;;
   --disable-vhost-kernel) vhost_kernel="no"
   ;;
   --enable-vhost-kernel) vhost_kernel="yes"
@@ -1883,6 +1887,7 @@ disabled with --disable-FEATURE, default is enabled if 
available:
   vhost-crypto    vhost-user-crypto backend support
   vhost-kernel    vhost kernel backend support
   vhost-user      vhost-user backend support
+  vhost-vdpa      vhost-vdpa kernel backend support
   spice           spice
   rbd             rados block device (rbd)
   libiscsi        iscsi support
@@ -2394,6 +2399,10 @@ test "$vhost_user" = "" && vhost_user=yes
 if test "$vhost_user" = "yes" && test "$mingw32" = "yes"; then
   error_exit "vhost-user isn't available on win32"
 fi
+test "$vhost_vdpa" = "" && vhost_vdpa=$linux
+if test "$vhost_vdpa" = "yes" && test "$linux" != "yes"; then
+  error_exit "vhost-vdpa is only available on Linux"
+fi
 test "$vhost_kernel" = "" && vhost_kernel=$linux
 if test "$vhost_kernel" = "yes" && test "$linux" != "yes"; then
   error_exit "vhost-kernel is only available on Linux"
@@ -2422,6 +2431,11 @@ test "$vhost_user_fs" = "" && vhost_user_fs=$vhost_user
 if test "$vhost_user_fs" = "yes" && test "$vhost_user" = "no"; then
   error_exit "--enable-vhost-user-fs requires --enable-vhost-user"
 fi
+#vhost-vdpa backends
+test "$vhost_net_vdpa" = "" && vhost_net_vdpa=$vhost_vdpa
+if test "$vhost_net_vdpa" = "yes" && test "$vhost_vdpa" = "no"; then
+  error_exit "--enable-vhost-net-vdpa requires --enable-vhost-vdpa"
+fi
 
 # OR the vhost-kernel and vhost-user values for simplicity
 if test "$vhost_net" = ""; then
@@ -6936,6 +6950,7 @@ echo "vhost-scsi support $vhost_scsi"
 echo "vhost-vsock support $vhost_vsock"
 echo "vhost-user support $vhost_user"
 echo "vhost-user-fs support $vhost_user_fs"
+echo "vhost-vdpa support $vhost_vdpa"
 echo "Trace backends    $trace_backends"
 if have_backend "simple"; then
 echo "Trace output file $trace_file-<pid>"
@@ -7437,6 +7452,9 @@ fi
 if test "$vhost_net_user" = "yes" ; then
   echo "CONFIG_VHOST_NET_USER=y" >> $config_host_mak
 fi
+if test "$vhost_net_vdpa" = "yes" ; then
+  echo "CONFIG_VHOST_NET_VDPA=y" >> $config_host_mak
+fi
 if test "$vhost_crypto" = "yes" ; then
   echo "CONFIG_VHOST_CRYPTO=y" >> $config_host_mak
 fi
@@ -7452,6 +7470,9 @@ fi
 if test "$vhost_user" = "yes" ; then
   echo "CONFIG_VHOST_USER=y" >> $config_host_mak
 fi
+if test "$vhost_vdpa" = "yes" ; then
+  echo "CONFIG_VHOST_VDPA=y" >> $config_host_mak
+fi
 if test "$vhost_user_fs" = "yes" ; then
   echo "CONFIG_VHOST_USER_FS=y" >> $config_host_mak
 fi
diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index e7cb8d028c..8825bd278f 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -17,7 +17,8 @@ typedef enum VhostBackendType {
 VHOST_BACKEND_TYPE_NONE = 0,
 VHOST_BACKEND_TYPE_KERNEL = 1,
 VHOST_BACKEND_TYPE_USER = 2,
-VHOST_BACKEND_TYPE_MAX = 3,
+VHOST_BACKEND_TYPE_VDPA = 3,
+VHOST_BACKEND_TYPE_MAX = 4,
 } VhostBackendType;
 
 typedef enum VhostSetConfigType {
@@ -170,6 +171,7 @@ typedef struct VhostOps {
 } VhostOps;
 
 extern const VhostOps user_ops;
+ext

[PULL 12/41] migration/colo: Use ram_block_discard_disable()

2020-07-03 Thread Michael S. Tsirkin

From: David Hildenbrand 

COLO will copy all memory in a RAM block, so disable discarding of RAM.

Reviewed-by: Dr. David Alan Gilbert 
Tested-by: Lukas Straub 
Cc: "Michael S. Tsirkin" 
Cc: Hailiang Zhang 
Cc: Juan Quintela 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
Message-Id: <20200626072248.78761-10-da...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/migration/colo.h |  2 +-
 migration/migration.c|  8 +++-
 migration/savevm.c   | 11 +--
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 1636e6f907..768e1f04c3 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -25,7 +25,7 @@ void migrate_start_colo_process(MigrationState *s);
 bool migration_in_colo_state(void);
 
 /* loadvm */
-void migration_incoming_enable_colo(void);
+int migration_incoming_enable_colo(void);
 void migration_incoming_disable_colo(void);
 bool migration_incoming_colo_enabled(void);
 void *colo_process_incoming_thread(void *opaque);
diff --git a/migration/migration.c b/migration/migration.c
index d365d82209..92e44e021e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -338,12 +338,18 @@ bool migration_incoming_colo_enabled(void)
 
 void migration_incoming_disable_colo(void)
 {
+ram_block_discard_disable(false);
 migration_colo_enabled = false;
 }
 
-void migration_incoming_enable_colo(void)
+int migration_incoming_enable_colo(void)
 {
+if (ram_block_discard_disable(true)) {
+error_report("COLO: cannot disable RAM discard");
+return -EBUSY;
+}
 migration_colo_enabled = true;
+return 0;
 }
 
 void migrate_add_address(SocketAddress *address)
diff --git a/migration/savevm.c b/migration/savevm.c
index b979ea6e7f..6e01724605 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2111,8 +2111,15 @@ static int 
loadvm_handle_recv_bitmap(MigrationIncomingState *mis,
 
 static int loadvm_process_enable_colo(MigrationIncomingState *mis)
 {
-migration_incoming_enable_colo();
-return colo_init_ram_cache();
+int ret = migration_incoming_enable_colo();
+
+if (!ret) {
+ret = colo_init_ram_cache();
+if (ret) {
+migration_incoming_disable_colo();
+}
+}
+return ret;
 }
 
 /*
-- 
MST
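
The enable / roll-back ordering in this patch can be exercised with a toy model. Everything below is a stand-in written for illustration: the counters only approximate how a "discard disabled" request might be refused, and none of the toy_ functions are QEMU APIs.

```c
#include <errno.h>
#include <stdbool.h>

static int toy_discard_required;   /* users that need RAM discard to work */
static int toy_discard_disabled;   /* users that need RAM discard switched off */
static bool toy_colo_enabled;
static bool toy_cache_init_fails;  /* knob to simulate a failing cache setup */

/* Toy model: disabling discard is refused while someone requires it. */
static int toy_ram_block_discard_disable(bool state)
{
    if (!state) {
        toy_discard_disabled--;
        return 0;
    }
    if (toy_discard_required > 0) {
        return -EBUSY;
    }
    toy_discard_disabled++;
    return 0;
}

static int toy_enable_colo(void)
{
    if (toy_ram_block_discard_disable(true)) {
        return -EBUSY;
    }
    toy_colo_enabled = true;
    return 0;
}

static void toy_disable_colo(void)
{
    toy_ram_block_discard_disable(false);
    toy_colo_enabled = false;
}

static int toy_init_ram_cache(void)
{
    return toy_cache_init_fails ? -ENOMEM : 0;
}

/* Same shape as loadvm_process_enable_colo() after the patch. */
static int toy_process_enable_colo(void)
{
    int ret = toy_enable_colo();

    if (!ret) {
        ret = toy_init_ram_cache();
        if (ret) {
            /* Undo the enable so the disable count stays balanced. */
            toy_disable_colo();
        }
    }
    return ret;
}
```

The point of the roll-back is that every successful "disable discard" request is paired with exactly one release, even when a later setup step fails.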

[PULL 5/7] vvfat: Fix array_remove_slice()

2020-07-03 Thread Kevin Wolf

array_remove_slice() calls array_roll() with array->next - 1 as the
destination index. This is only correct for count == 1, otherwise we're
writing past the end of the array. array->next - count would be correct.

However, this is the only place ever calling array_roll(), so this
rather complicated operation isn't even necessary.

Fix the problem and simplify the code by replacing it with a single
memmove() call. array_roll() can now be removed.

Reported-by: Nathan Huckleberry 
Signed-off-by: Kevin Wolf 
Message-Id: <20200623175534.38286-3-kw...@redhat.com>
Reviewed-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 block/vvfat.c | 42 +-
 1 file changed, 5 insertions(+), 37 deletions(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index 62230542e5..2eb8cbb19f 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -140,48 +140,16 @@ static inline void* array_insert(array_t* array,unsigned 
int index,unsigned int
 return array->pointer+index*array->item_size;
 }
 
-/* this performs a "roll", so that the element which was at index_from becomes
- * index_to, but the order of all other elements is preserved. */
-static inline int array_roll(array_t* array,int index_to,int index_from,int 
count)
-{
-char* buf;
-char* from;
-char* to;
-int is;
-
-if(!array ||
-index_to<0 || index_to>=array->next ||
-index_from<0 || index_from>=array->next)
-return -1;
-
-if(index_to==index_from)
-return 0;
-
-is=array->item_size;
-from=array->pointer+index_from*is;
-to=array->pointer+index_to*is;
-buf=g_malloc(is*count);
-memcpy(buf,from,is*count);
-
-if(index_to=0);
 assert(count > 0);
 assert(index + count <= array->next);
-if(array_roll(array,array->next-1,index,count))
-return -1;
+
+memmove(array->pointer + index * array->item_size,
+array->pointer + (index + count) * array->item_size,
+(array->next - index - count) * array->item_size);
+
 array->next -= count;
 return 0;
 }
-- 
2.25.4
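
The memmove()-based removal described in the commit message can be tried standalone. The array_t below is a simplified stand-in that keeps only the three fields the hunk uses; the rest of vvfat's definition is omitted, so treat this as a sketch rather than the driver's code.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for vvfat's array_t; field names follow the patch. */
typedef struct {
    char *pointer;           /* backing storage */
    unsigned int next;       /* number of items currently in use */
    unsigned int item_size;  /* size of one item in bytes */
} array_t;

/* Remove count items starting at index by shifting the tail down. */
static int array_remove_slice(array_t *array, int index, int count)
{
    assert(index >= 0);
    assert(count > 0);
    assert((unsigned int)(index + count) <= array->next);

    /* memmove() is required because source and destination may overlap. */
    memmove(array->pointer + index * array->item_size,
            array->pointer + (index + count) * array->item_size,
            (array->next - index - count) * array->item_size);

    array->next -= count;
    return 0;
}
```

Removing two items at index 1 from {10, 11, 12, 13, 14, 15} moves the three trailing items down and leaves four items in use: 10, 13, 14, 15. Nothing is written past the end of the storage, whatever the value of count.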

[PATCH v11 0/8] error: auto propagated local_err part I

2020-07-03 Thread Vladimir Sementsov-Ogievskiy

Based-on: <20200702155000.3455325-1-arm...@redhat.com>

v11: (based-on "[PATCH v2 00/44] Less clumsy error checking")
01: minor rebase of documentation, keep r-bs
02: - minor comment tweaks [Markus]
- use explicit file name in MAINTAINERS instead of pattern
- add Markus's r-b
03,07,08: rebase changes, drop r-bs


v11 is available at
 https://src.openvz.org/scm/~vsementsov/qemu.git #tag 
up-auto-local-err-partI-v11
v10 is available at
 https://src.openvz.org/scm/~vsementsov/qemu.git #tag 
up-auto-local-err-partI-v10

In this series, there is no commit-per-subsystem script; each generated
commit is generated separately.

Still, the generating commands are very similar and look like

sed -n '/^$/,/^$/{s/^F: //p}' MAINTAINERS | \
xargs git ls-files | grep '\.[hc]$' | \
xargs spatch \
--sp-file scripts/coccinelle/auto-propagated-errp.cocci \
--macro-file scripts/cocci-macro-file.h \
--in-place --no-show-diff --max-width 80

Note that in each generated commit, the generation command is the only
text indented by 8 spaces in the 'git log -1' output. So, to regenerate all
commits (for example, after a rebase or a change in the Coccinelle script),
you may use the following command:

git rebase -x "sh -c \"git show --pretty= --name-only | xargs git checkout 
HEAD^ -- ; git reset; git log -1 | grep '^' | sh\"" HEAD~6

This starts an automated interactive rebase over the generated patches,
which stops if a generated patch has changed
(you may then run git commit --amend to apply the updated generated changes).

Note:
  git show --pretty= --name-only   - lists files, changed in HEAD
  git log -1 | grep '^' | sh   - rerun generation command of HEAD


Check that the changed .c files compile:
git rebase -x "sh -c \"git show --pretty= --name-only | sed -n 's/\.c$/.o/p' | 
xargs make -j9\"" HEAD~6

Vladimir Sementsov-Ogievskiy (8):
  error: auto propagated local_err
  scripts: Coccinelle script to use ERRP_AUTO_PROPAGATE()
  SD (Secure Card): introduce ERRP_AUTO_PROPAGATE
  pflash: introduce ERRP_AUTO_PROPAGATE
  fw_cfg: introduce ERRP_AUTO_PROPAGATE
  virtio-9p: introduce ERRP_AUTO_PROPAGATE
  nbd: introduce ERRP_AUTO_PROPAGATE
  xen: introduce ERRP_AUTO_PROPAGATE

 scripts/coccinelle/auto-propagated-errp.cocci | 337 ++
 include/block/nbd.h   |   1 +
 include/qapi/error.h  | 208 +--
 block/nbd.c   |   7 +-
 hw/9pfs/9p-local.c|  12 +-
 hw/9pfs/9p.c  |   1 +
 hw/block/dataplane/xen-block.c|  17 +-
 hw/block/pflash_cfi01.c   |   7 +-
 hw/block/pflash_cfi02.c   |   7 +-
 hw/block/xen-block.c  | 102 +++---
 hw/nvram/fw_cfg.c |  14 +-
 hw/pci-host/xen_igd_pt.c  |   7 +-
 hw/sd/sdhci-pci.c |   7 +-
 hw/sd/sdhci.c |  21 +-
 hw/sd/ssi-sd.c|  10 +-
 hw/xen/xen-backend.c  |   7 +-
 hw/xen/xen-bus.c  |  92 ++---
 hw/xen/xen-host-pci-device.c  |  27 +-
 hw/xen/xen_pt.c   |  25 +-
 hw/xen/xen_pt_config_init.c   |  17 +-
 nbd/client.c  |   5 +
 nbd/server.c  |   5 +
 MAINTAINERS   |   1 +
 23 files changed, 690 insertions(+), 247 deletions(-)
 create mode 100644 scripts/coccinelle/auto-propagated-errp.cocci

Cc: Eric Blake 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Greg Kurz 
Cc: Christian Schoenebeck 
Cc: Stefan Hajnoczi 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Paul Durrant 
Cc: "Philippe Mathieu-Daudé" 
Cc: Laszlo Ersek 
Cc: Gerd Hoffmann 
Cc: Markus Armbruster 
Cc: Michael Roth 
Cc: qemu-devel@nongnu.org
Cc: qemu-bl...@nongnu.org
Cc: xen-de...@lists.xenproject.org

-- 
2.21.0

[PULL 41/41] vhost-vdpa: introduce vhost-vdpa net client

2020-07-03 Thread Michael S. Tsirkin

From: Cindy Lu 

This patch set introduces a new net client type: vhost-vdpa.
vhost-vdpa net client will set up a vDPA device which is specified
by a "vhostdev" parameter.

Signed-off-by: Lingshan Zhu 
Signed-off-by: Tiwei Bie 
Signed-off-by: Cindy Lu 
Signed-off-by: Jason Wang 
Message-Id: <20200701145538.22333-15-l...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Jason Wang 
---
 qapi/net.json|  28 -
 include/net/vhost-vdpa.h |  22 
 net/clients.h|   2 +
 net/net.c|   3 +
 net/vhost-vdpa.c | 228 +++
 net/Makefile.objs|   2 +-
 6 files changed, 282 insertions(+), 3 deletions(-)
 create mode 100644 include/net/vhost-vdpa.h
 create mode 100644 net/vhost-vdpa.c

diff --git a/qapi/net.json b/qapi/net.json
index 9244c9af56..558d520a2f 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -428,16 +428,39 @@
 '*vhostforce':'bool',
 '*queues':'int' } }
 
+##
+# @NetdevVhostVDPAOptions:
+#
+# Vhost-vdpa network backend
+#
+# vDPA device is a device that uses a datapath which complies with the virtio
+# specifications with a vendor specific control path.
+#
+# @vhostdev: path of vhost-vdpa device
+#(default:'/dev/vhost-vdpa-0')
+#
+# @queues: number of queues to be created for multiqueue vhost-vdpa
+#  (default: 1)
+#
+# Since: 5.1
+##
+{ 'struct': 'NetdevVhostVDPAOptions',
+  'data': {
+'*vhostdev': 'str',
+'*queues':   'int' } }
+
 ##
 # @NetClientDriver:
 #
 # Available netdev drivers.
 #
 # Since: 2.7
+#
+# @vhost-vdpa since 5.1
 ##
 { 'enum': 'NetClientDriver',
   'data': [ 'none', 'nic', 'user', 'tap', 'l2tpv3', 'socket', 'vde',
-'bridge', 'hubport', 'netmap', 'vhost-user' ] }
+'bridge', 'hubport', 'netmap', 'vhost-user', 'vhost-vdpa' ] }
 
 ##
 # @Netdev:
@@ -465,7 +488,8 @@
 'bridge':   'NetdevBridgeOptions',
 'hubport':  'NetdevHubPortOptions',
 'netmap':   'NetdevNetmapOptions',
-'vhost-user': 'NetdevVhostUserOptions' } }
+'vhost-user': 'NetdevVhostUserOptions',
+'vhost-vdpa': 'NetdevVhostVDPAOptions' } }
 
 ##
 # @NetFilterDirection:
diff --git a/include/net/vhost-vdpa.h b/include/net/vhost-vdpa.h
new file mode 100644
index 00..45e34b7cfc
--- /dev/null
+++ b/include/net/vhost-vdpa.h
@@ -0,0 +1,22 @@
+/*
+ * vhost-vdpa.h
+ *
+ * Copyright(c) 2017-2018 Intel Corporation.
+ * Copyright(c) 2020 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef VHOST_VDPA_H
+#define VHOST_VDPA_H
+
+#define TYPE_VHOST_VDPA "vhost-vdpa"
+
+struct vhost_net *vhost_vdpa_get_vhost_net(NetClientState *nc);
+uint64_t vhost_vdpa_get_acked_features(NetClientState *nc);
+
+extern const int vdpa_feature_bits[];
+
+#endif /* VHOST_VDPA_H */
diff --git a/net/clients.h b/net/clients.h
index a6ef267e19..92f9b59aed 100644
--- a/net/clients.h
+++ b/net/clients.h
@@ -61,4 +61,6 @@ int net_init_netmap(const Netdev *netdev, const char *name,
 int net_init_vhost_user(const Netdev *netdev, const char *name,
 NetClientState *peer, Error **errp);
 
+int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
+NetClientState *peer, Error **errp);
 #endif /* QEMU_NET_CLIENTS_H */
diff --git a/net/net.c b/net/net.c
index 9099a327dd..94dc546fb2 100644
--- a/net/net.c
+++ b/net/net.c
@@ -966,6 +966,9 @@ static int (* const 
net_client_init_fun[NET_CLIENT_DRIVER__MAX])(
 #ifdef CONFIG_VHOST_NET_USER
 [NET_CLIENT_DRIVER_VHOST_USER] = net_init_vhost_user,
 #endif
+#ifdef CONFIG_VHOST_NET_VDPA
+[NET_CLIENT_DRIVER_VHOST_VDPA] = net_init_vhost_vdpa,
+#endif
 #ifdef CONFIG_L2TPV3
 [NET_CLIENT_DRIVER_L2TPV3]= net_init_l2tpv3,
 #endif
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
new file mode 100644
index 00..bc0e0d2d35
--- /dev/null
+++ b/net/vhost-vdpa.c
@@ -0,0 +1,228 @@
+/*
+ * vhost-vdpa.c
+ *
+ * Copyright(c) 2017-2018 Intel Corporation.
+ * Copyright(c) 2020 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "clients.h"
+#include "net/vhost_net.h"
+#include "net/vhost-vdpa.h"
+#include "hw/virtio/vhost-vdpa.h"
+#include "qemu/config-file.h"
+#include "qemu/error-report.h"
+#include "qemu/option.h"
+#include "qapi/error.h"
+#include 
+#include 
+#include "standard-headers/linux/virtio_net.h"
+#include "monitor/monitor.h"
+#include "hw/virtio/vhost.h"
+
+/* Todo:need to add the multiqueue support here */
+typedef struct VhostVDPAState {
+NetClientState nc;
+struct vhost_vdpa vhost_vdpa;
+VHostNetState *vhost_net;
+uint64_t acked_features;
+bool started;
+} VhostVDPAState;
+
+const int vdpa_feature_bits[] = {
+VIRTIO_F_NOTIFY_ON_EMPTY,
+

[PATCH v11 1/8] error: auto propagated local_err

2020-07-03 Thread Vladimir Sementsov-Ogievskiy

Introduce a new ERRP_AUTO_PROPAGATE macro, to be used at start of
functions with an errp OUT parameter.

It has three goals:

1. Fix issue with error_fatal and error_prepend/error_append_hint: user
can't see this additional information, because exit() happens in
error_setg earlier than information is added. [Reported by Greg Kurz]

2. Fix issue with error_abort and error_propagate: when we wrap
error_abort by local_err+error_propagate, the resulting coredump will
refer to error_propagate and not to the place where error happened.
(the macro itself doesn't fix the issue, but it allows us to [3.] drop
the local_err+error_propagate pattern, which will definitely fix the
issue) [Reported by Kevin Wolf]

3. Drop local_err+error_propagate pattern, which is used to work around
void functions with errp parameter, when caller wants to know resulting
status. (Note: actually these functions could be merely updated to
return int error code).

To achieve these goals, later patches will add invocations
of this macro at the start of functions which either use
error_prepend/error_append_hint (solving 1) or which use
local_err+error_propagate to check errors, switching those
functions to use *errp instead (solving 2 and 3).
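
Goal 1 can be illustrated with a toy model. The code below is NOT QEMU's Error API: every toy_ name is invented for this sketch, and instead of calling exit() the "fatal" path records the message it would have reported, so that the difference is observable.

```c
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-ins only; these are not QEMU's real types or functions. */
typedef struct Error {
    char msg[128];
} Error;

static Error *toy_error_fatal;      /* sentinel, playing the role of error_fatal */
static char toy_fatal_report[128];  /* what the fatal path would have printed */

static void toy_error_setg(Error **errp, const char *msg)
{
    if (!errp) {
        return;
    }
    if (errp == &toy_error_fatal) {
        /* The real code would report and exit() here. */
        snprintf(toy_fatal_report, sizeof(toy_fatal_report), "%s", msg);
        return;
    }
    *errp = calloc(1, sizeof(Error));
    if (*errp) {
        snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
    }
}

static void toy_error_prepend(Error **errp, const char *prefix)
{
    char tmp[256];

    if (!errp || !*errp) {
        return;  /* nothing to prepend to */
    }
    snprintf(tmp, sizeof(tmp), "%s%s", prefix, (*errp)->msg);
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", tmp);
}

static void toy_error_propagate(Error **dst, Error *local)
{
    if (!local) {
        return;
    }
    if (dst == &toy_error_fatal) {
        snprintf(toy_fatal_report, sizeof(toy_fatal_report), "%s", local->msg);
        free(local);
    } else if (dst && !*dst) {
        *dst = local;
    } else {
        free(local);
    }
}

/* Prepending directly on errp: too late when errp is the fatal sentinel. */
static void toy_frob_direct(Error **errp)
{
    toy_error_setg(errp, "no such device");
    toy_error_prepend(errp, "Could not frobnicate: ");
}

/* Working on a local error first, then propagating: the prefix survives. */
static void toy_frob_wrapped(Error **errp)
{
    Error *local = NULL;

    toy_error_setg(&local, "no such device");
    toy_error_prepend(&local, "Could not frobnicate: ");
    toy_error_propagate(errp, local);
}
```

With the fatal sentinel, toy_frob_direct() reports only "no such device", while toy_frob_wrapped() reports the prefixed message. As described above, the macro is meant to arrange this kind of wrapping automatically so that functions can keep using errp directly.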

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Paul Durrant 
Reviewed-by: Greg Kurz 
Reviewed-by: Eric Blake 
---

Cc: Eric Blake 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Greg Kurz 
Cc: Christian Schoenebeck 
Cc: Stefan Hajnoczi 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Paul Durrant 
Cc: "Philippe Mathieu-Daudé" 
Cc: Laszlo Ersek 
Cc: Gerd Hoffmann 
Cc: Markus Armbruster 
Cc: Michael Roth 
Cc: qemu-devel@nongnu.org
Cc: qemu-bl...@nongnu.org
Cc: xen-de...@lists.xenproject.org

 include/qapi/error.h | 205 ---
 1 file changed, 172 insertions(+), 33 deletions(-)

diff --git a/include/qapi/error.h b/include/qapi/error.h
index 5ceb3ace06..b54aedbfd7 100644
--- a/include/qapi/error.h
+++ b/include/qapi/error.h
@@ -39,7 +39,7 @@
  *   • pointer-valued functions return non-null / null pointer, and
  *   • integer-valued functions return non-negative / negative.
  *
- * How to:
+ * = Deal with Error object =
  *
  * Create an error:
  * error_setg(errp, "situation normal, all fouled up");
@@ -73,28 +73,91 @@
  * reporting it (primarily useful in testsuites):
  * error_free_or_abort(&err);
  *
- * Pass an existing error to the caller:
- * error_propagate(errp, err);
- * where Error **errp is a parameter, by convention the last one.
+ * = Deal with Error ** function parameter =
  *
- * Pass an existing error to the caller with the message modified:
- * error_propagate_prepend(errp, err);
+ * A function may use the error system to return errors. In this case, the
+ * function defines an Error **errp parameter, by convention the last one (with
+ * exceptions for functions using ... or va_list).
  *
- * Avoid
- * error_propagate(errp, err);
- * error_prepend(errp, "Could not frobnicate '%s': ", name);
- * because this fails to prepend when @errp is &error_fatal.
+ * The caller may then pass in the following errp values:
+ *
+ * 1. &error_abort
+ *Any error will result in abort().
+ * 2. &error_fatal
+ *Any error will result in exit() with a non-zero status.
+ * 3. NULL
+ *No error reporting through errp parameter.
+ * 4. The address of a NULL-initialized Error *err
+ *Any error will populate errp with an error object.
  *
- * Create a new error and pass it to the caller:
+ * The following rules then implement the correct semantics desired by the
+ * caller.
+ *
+ * Create a new error to pass to the caller:
  * error_setg(errp, "situation normal, all fouled up");
  *
- * Call a function and receive an error from it:
+ * Calling another errp-based function:
+ * f(..., errp);
+ *
+ * == Checking success of subcall ==
+ *
+ * If a function returns a value indicating an error in addition to setting
+ * errp (which is recommended), then you don't need any additional code, just
+ * do:
+ *
+ * int ret = f(..., errp);
+ * if (ret < 0) {
+ * ... handle error ...
+ * return ret;
+ * }
+ *
+ * If a function returns nothing (not recommended for new code), the only way
+ * to check success is by consulting errp; doing this safely requires the use
+ * of the ERRP_AUTO_PROPAGATE macro, like this:
+ *
+ * int our_func(..., Error **errp) {
+ * ERRP_AUTO_PROPAGATE();
+ * ...
+ * subcall(..., errp);
+ * if (*errp) {
+ * ...
+ * return -EINVAL;
+ * }
+ * ...
+ * }
+ *
+ * ERRP_AUTO_PROPAGATE takes care of wrapping the original errp as needed, so
+ * that the rest of the function can directly use errp (including
+ * dereferencing), where any errors will then be propagated on to the original
+ * errp when leaving the function.
+ *
+ * In some cases, we need to check result of subcall, but do not want to
+ * propagate the Error object t

[PULL 4/7] vvfat: Check that updated filenames are valid

2020-07-03 Thread Kevin Wolf

FAT allows only a restricted set of characters in file names, and for
some of the illegal characters, it's actually important that we catch
them: If filenames can contain '/', the guest can construct filenames
containing "../" and escape from the assigned vvfat directory. The same
problem could arise if ".." was ever accepted as a literal filename.

Fix this by adding a check that all filenames are valid in
check_directory_consistency().

Reported-by: Nathan Huckleberry 
Signed-off-by: Kevin Wolf 
Message-Id: <20200623175534.38286-2-kw...@redhat.com>
Reviewed-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 block/vvfat.c | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index c65a98e3ee..62230542e5 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -520,12 +520,31 @@ static void set_begin_of_direntry(direntry_t* direntry, 
uint32_t begin)
 direntry->begin_hi = cpu_to_le16((begin >> 16) & 0xffff);
 }
 
+static bool valid_filename(const unsigned char *name)
+{
+unsigned char c;
+if (!strcmp((const char*)name, ".") || !strcmp((const char*)name, "..")) {
+return false;
+}
+for (; (c = *name); name++) {
+if (!((c >= '0' && c <= '9') ||
+  (c >= 'A' && c <= 'Z') ||
+  (c >= 'a' && c <= 'z') ||
+  c > 127 ||
+  strchr("$%'-_@~`!(){}^#&.+,;=[]", c) != NULL))
+{
+return false;
+}
+}
+return true;
+}
+
 static uint8_t to_valid_short_char(gunichar c)
 {
 c = g_unichar_toupper(c);
 if ((c >= '0' && c <= '9') ||
 (c >= 'A' && c <= 'Z') ||
-strchr("$%'-_@~`!(){}^#&", c) != 0) {
+strchr("$%'-_@~`!(){}^#&", c) != NULL) {
 return c;
 } else {
 return 0;
@@ -2098,6 +2117,10 @@ DLOG(fprintf(stderr, "check direntry %d:\n", i); 
print_direntry(direntries + i))
 }
 lfn.checksum = 0x100; /* cannot use long name twice */
 
+if (!valid_filename(lfn.name)) {
+fprintf(stderr, "Invalid file name\n");
+goto fail;
+}
 if (path_len + 1 + lfn.len >= PATH_MAX) {
 fprintf(stderr, "Name too long: %s/%s\n", path, lfn.name);
 goto fail;
-- 
2.25.4
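
The filename check can be exercised on its own. The function body below mirrors the one added by the patch; only the standalone framing (includes, no vvfat context) is new, and the sample names are made up for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Same logic as the valid_filename() added by the patch. */
static bool valid_filename(const unsigned char *name)
{
    unsigned char c;

    /* "." and ".." are never acceptable as literal names. */
    if (!strcmp((const char *)name, ".") || !strcmp((const char *)name, "..")) {
        return false;
    }
    for (; (c = *name); name++) {
        if (!((c >= '0' && c <= '9') ||
              (c >= 'A' && c <= 'Z') ||
              (c >= 'a' && c <= 'z') ||
              c > 127 ||
              strchr("$%'-_@~`!(){}^#&.+,;=[]", c) != NULL)) {
            return false;  /* includes '/', '\\' and space */
        }
    }
    return true;
}
```

A name such as "README.TXT" passes, while "..", "." and anything containing '/' (for example "../etc") is rejected, which is what blocks the directory escape described in the commit message.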

Re: [PATCH v2 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list

2020-07-03 Thread Philippe Mathieu-Daudé

On 7/3/20 10:37 AM, Klaus Jensen wrote:
> On Jul  3 10:20, Philippe Mathieu-Daudé wrote:
>> On 7/3/20 8:34 AM, Klaus Jensen wrote:
>>> From: Klaus Jensen 
>>>
>>> Reject the nsid broadcast value (0xffffffff) and 0xfffffffe in the
>>> Active Namespace ID list.
>>
>> Can we have a definition instead of this 0xfffffffe magic value please?
>>
> 
> Hmm, not really actually. It's not a magic value, it's just because the
> logic in Active Namespace ID list would require that it should report
> any namespaces with ids *higher* than the one specified, so since
> 0xffffffff (NVME_NSID_BROADCAST) is invalid, NVME_NSID_BROADCAST - 1
> needs to be as well.

OK.

> 
> What do you say I change it to `min_nsid >= NVME_NSID_BROADCAST - 1`?
> The original condition just reads well if you are sitting with the spec
> on the side.

IMO this is clearer:

  if (min_nsid + 1 >= NVME_NSID_BROADCAST) {
  return NVME_INVALID_NSID | NVME_DNR;
  }

Whichever form you prefer you can amend to the respin patch:
Reviewed-by: Philippe Mathieu-Daudé 

> 
>>>
>>> Signed-off-by: Klaus Jensen 
>>> ---
>>>  hw/block/nvme.c | 4 
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
>>> index 65c2fa3ac1f4..0dac7a41ddae 100644
>>> --- a/hw/block/nvme.c
>>> +++ b/hw/block/nvme.c
>>> @@ -956,6 +956,10 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, 
>>> NvmeIdentify *c)
>>>  
>>>  trace_pci_nvme_identify_nslist(min_nsid);
>>>  
>>> +if (min_nsid == 0xfffffffe || min_nsid == NVME_NSID_BROADCAST) {
>>> +return NVME_INVALID_NSID | NVME_DNR;
>>> +}
>>> +
>>>  list = g_malloc0(data_len);
>>>  for (i = 0; i < n->num_namespaces; i++) {
>>>  if (i < min_nsid) {
>>>
>>
>
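
The two condition forms proposed in this thread can be compared in a small sketch. It assumes that min_nsid is a uint32_t and that NVME_NSID_BROADCAST is 0xffffffff; both are assumptions of the example, as are the toy_ function names. Under those assumptions the "+ 1" form wraps around for the broadcast value itself, so the two forms are not equivalent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the broadcast namespace id for this sketch. */
#define NVME_NSID_BROADCAST 0xffffffffu

/* Form: min_nsid >= NVME_NSID_BROADCAST - 1 (no arithmetic on the input). */
static bool toy_reject_minus_one(uint32_t min_nsid)
{
    return min_nsid >= NVME_NSID_BROADCAST - 1;
}

/*
 * Form: min_nsid + 1 >= NVME_NSID_BROADCAST.
 * With 32-bit unsigned arithmetic, 0xffffffff + 1 wraps to 0, so the
 * broadcast value itself is not rejected by this form.
 */
static bool toy_reject_plus_one(uint32_t min_nsid)
{
    return min_nsid + 1 >= NVME_NSID_BROADCAST;
}
```

Both forms reject NVME_NSID_BROADCAST - 1 and accept ordinary ids, but only the "- 1" form also rejects NVME_NSID_BROADCAST when the comparison is done in 32-bit unsigned arithmetic.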

[PATCH v11 2/8] scripts: Coccinelle script to use ERRP_AUTO_PROPAGATE()

2020-07-03 Thread Vladimir Sementsov-Ogievskiy

The script adds ERRP_AUTO_PROPAGATE macro invocations where appropriate and
makes the corresponding changes in the code (see include/qapi/error.h for
details).

Usage example:
spatch --sp-file scripts/coccinelle/auto-propagated-errp.cocci \
 --macro-file scripts/cocci-macro-file.h --in-place --no-show-diff \
 --max-width 80 FILES...

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Markus Armbruster 
---

Cc: Eric Blake 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Greg Kurz 
Cc: Christian Schoenebeck 
Cc: Stefan Hajnoczi 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Paul Durrant 
Cc: "Philippe Mathieu-Daudé" 
Cc: Laszlo Ersek 
Cc: Gerd Hoffmann 
Cc: Markus Armbruster 
Cc: Michael Roth 
Cc: qemu-devel@nongnu.org
Cc: qemu-bl...@nongnu.org
Cc: xen-de...@lists.xenproject.org

 scripts/coccinelle/auto-propagated-errp.cocci | 337 ++
 include/qapi/error.h  |   3 +
 MAINTAINERS   |   1 +
 3 files changed, 341 insertions(+)
 create mode 100644 scripts/coccinelle/auto-propagated-errp.cocci

diff --git a/scripts/coccinelle/auto-propagated-errp.cocci 
b/scripts/coccinelle/auto-propagated-errp.cocci
new file mode 100644
index 00..c29f695adf
--- /dev/null
+++ b/scripts/coccinelle/auto-propagated-errp.cocci
@@ -0,0 +1,337 @@
+// Use ERRP_AUTO_PROPAGATE (see include/qapi/error.h)
+//
+// Copyright (c) 2020 Virtuozzo International GmbH.
+//
+// This program is free software; you can redistribute it and/or
+// modify it under the terms of the GNU General Public License as
+// published by the Free Software Foundation; either version 2 of the
+// License, or (at your option) any later version.
+//
+// This program is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with this program.  If not, see
+// <http://www.gnu.org/licenses/>.
+//
+// Usage example:
+// spatch --sp-file scripts/coccinelle/auto-propagated-errp.cocci \
+//  --macro-file scripts/cocci-macro-file.h --in-place \
+//  --no-show-diff --max-width 80 FILES...
+//
+// Note: --max-width 80 is needed because coccinelle default is less
+// than 80, and without this parameter coccinelle may reindent some
+// lines which fit into 80 characters but not to coccinelle default,
+// which in turn produces extra patch hunks for no reason.
+
+// Switch unusual Error ** parameter names to errp
+// (this is necessary to use ERRP_AUTO_PROPAGATE).
+//
+// Disable optional_qualifier to skip functions with
+// "Error *const *errp" parameter.
+//
+// Skip functions with "assert(_errp && *_errp)" statement, because
+// that signals unusual semantics, and the parameter name may well
+// serve a purpose. (like nbd_iter_channel_error()).
+//
+// Skip util/error.c to not touch, for example, error_propagate() and
+// error_propagate_prepend().
+@ depends on !(file in "util/error.c") disable optional_qualifier@
+identifier fn;
+identifier _errp != errp;
+@@
+
+ fn(...,
+-   Error **_errp
++   Error **errp
+,...)
+ {
+(
+ ... when != assert(_errp && *_errp)
+&
+ <...
+-_errp
++errp
+ ...>
+)
+ }
+
+// Add invocation of ERRP_AUTO_PROPAGATE to errp-functions where
+// necessary
+//
+// Note that without "when any" the final "..." does not match
+// something matched by the previous pattern, i.e. the rule will not
+// match a double error_prepend in control flow like in
+// vfio_set_irq_signaling().
+//
+// Note that "exists" says we want to apply the rule even if it does
+// not match on all possible control flows (otherwise, it will not match
+// the standard pattern where the error_propagate() call is in an if branch).
+@ disable optional_qualifier exists@
+identifier fn, local_err;
+symbol errp;
+@@
+
+ fn(..., Error **errp, ...)
+ {
++   ERRP_AUTO_PROPAGATE();
+...  when != ERRP_AUTO_PROPAGATE();
+(
+(
+error_append_hint(errp, ...);
+|
+error_prepend(errp, ...);
+|
+error_vprepend(errp, ...);
+)
+... when any
+|
+Error *local_err = NULL;
+...
+(
+error_propagate_prepend(errp, local_err, ...);
+|
+error_propagate(errp, local_err);
+)
+...
+)
+ }
+
+// Warn when several Error * definitions are in the control flow.
+// This rule is not chained to rule1 and less restrictive, to cover more
+// functions to warn (even those we are not going to convert).
+//
+// Note, that even with one (or zero) Error * definition in each
+// control flow we may have several (in total) Error * definitions in
+// the function. This case deserves attention too, but I don't see a
+// simple way to match it with the help of coccinelle.
+@check1 disable optional_qualifier exists@
+identifier fn, _errp, local_err, local_err2;
+position p1, p2;
+@@
+
+ fn(..., Error **_errp, ...)
+ {
+ ...
+ Error *local_err = NULL;@p1
+ ... when any
+

[PATCH v11 5/8] fw_cfg: introduce ERRP_AUTO_PROPAGATE

2020-07-03 Thread Vladimir Sementsov-Ogievskiy

If we want to add some info to errp (by error_prepend() or
error_append_hint()), we must use the ERRP_AUTO_PROPAGATE macro.
Otherwise, this info will not be added when errp == &error_fatal
(the program will exit prior to the error_append_hint() or
error_prepend() call).  Fix such cases.

If we want to check for an error after calling a function that takes
errp, we need to introduce local_err and then propagate it to errp.
Using the ERRP_AUTO_PROPAGATE macro instead has these benefits:
1. No need for an explicit error_propagate() call
2. No need for an explicit local_err variable: use errp directly
3. ERRP_AUTO_PROPAGATE leaves errp as is if it's not NULL or
   &error_fatal, which means that we don't break error_abort
   (we'll abort on error_set, not on error_propagate)
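
The two patterns described above can be sketched roughly as follows.
This is only a toy model, not QEMU code: Error, toy_error_set(),
toy_error_propagate(), toy_allocate() and the realize_*() functions
are made-up stand-ins, and the effect of the macro is written out by
hand for the NULL case only (the real macro also covers &error_fatal
and does its propagation automatically when the function returns).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for QEMU's Error object; NOT the real API. */
typedef struct Error {
    char msg[64];
} Error;

/* Store a new error in *errp; a NULL errp means the caller ignores errors. */
static void toy_error_set(Error **errp, const char *msg)
{
    if (!errp) {
        return;
    }
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
}

/* Hand src over to *dst, or free it if dst is NULL or already holds an error. */
static void toy_error_propagate(Error **dst, Error *src)
{
    if (!src) {
        return;
    }
    if (dst && !*dst) {
        *dst = src;
    } else {
        free(src);
    }
}

/* Hypothetical callee that may fail. */
static void toy_allocate(int fail, Error **errp)
{
    if (fail) {
        toy_error_set(errp, "allocation failed");
    }
}

/* Old pattern: local_err plus an explicit propagate call. */
static void realize_old(int fail, Error **errp)
{
    Error *local_err = NULL;

    toy_allocate(fail, &local_err);
    if (local_err) {
        toy_error_propagate(errp, local_err);
        return;
    }
}

/* New pattern, with the macro's effect for a NULL errp spelled out by hand. */
static void realize_new(int fail, Error **errp)
{
    Error *auto_err = NULL;
    Error **caller_errp = errp;

    if (!errp) {
        errp = &auto_err;       /* makes *errp safe to read below */
    }

    toy_allocate(fail, errp);
    if (*errp) {
        /* error-path cleanup would go here */
    }

    /* Only needed when errp was redirected to the local variable. */
    if (errp == &auto_err) {
        toy_error_propagate(caller_errp, auto_err);
    }
}
```

Both realize_old() and realize_new() leave the same error in the
caller's Error pointer; the second form simply lets the body test
*errp directly.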

This commit is generated by command

sed -n '/^Firmware configuration (fw_cfg)$/,/^$/{s/^F: //p}' \
MAINTAINERS | \
xargs git ls-files | grep '\.[hc]$' | \
xargs spatch \
--sp-file scripts/coccinelle/auto-propagated-errp.cocci \
--macro-file scripts/cocci-macro-file.h \
--in-place --no-show-diff --max-width 80

Reported-by: Kevin Wolf 
Reported-by: Greg Kurz 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/nvram/fw_cfg.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 0408a31f8e..d5386c3235 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -1231,12 +1231,11 @@ static Property fw_cfg_io_properties[] = {
 
 static void fw_cfg_io_realize(DeviceState *dev, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 FWCfgIoState *s = FW_CFG_IO(dev);
-Error *local_err = NULL;
 
-fw_cfg_file_slots_allocate(FW_CFG(s), &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+fw_cfg_file_slots_allocate(FW_CFG(s), errp);
+if (*errp) {
 return;
 }
 
@@ -1282,14 +1281,13 @@ static Property fw_cfg_mem_properties[] = {
 
 static void fw_cfg_mem_realize(DeviceState *dev, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 FWCfgMemState *s = FW_CFG_MEM(dev);
 SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
 const MemoryRegionOps *data_ops = &fw_cfg_data_mem_ops;
-Error *local_err = NULL;
 
-fw_cfg_file_slots_allocate(FW_CFG(s), &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+fw_cfg_file_slots_allocate(FW_CFG(s), errp);
+if (*errp) {
 return;
 }
 
-- 
2.21.0

Re: [PATCH v2 15/18] hw/block/nvme: reject invalid nsid values in active namespace id list

2020-07-03 Thread Klaus Jensen

On Jul  3 11:14, Philippe Mathieu-Daudé wrote:
> On 7/3/20 10:37 AM, Klaus Jensen wrote:
> > On Jul  3 10:20, Philippe Mathieu-Daudé wrote:
> >> On 7/3/20 8:34 AM, Klaus Jensen wrote:
> >>> From: Klaus Jensen 
> >>>
> >>> Reject the nsid broadcast value (0xffffffff) and 0xfffffffe in the
> >>> Active Namespace ID list.
> >>
> >> Can we have a definition instead of this 0xfffffffe magic value please?
> >>
> > 
> > Hmm, not really actually. It's not a magic value, it's just because the
> > logic in Active Namespace ID list would require that it should report
> > any namespaces with ids *higher* than the one specified, so since
> > 0xffffffff (NVME_NSID_BROADCAST) is invalid, NVME_NSID_BROADCAST - 1
> > needs to be as well.
> 
> OK.
> 
> > 
> > What do you say I change it to `min_nsid >= NVME_NSID_BROADCAST - 1`?
> > The original condition just reads well if you are sitting with the spec
> > on the side.
> 
> IMO this is clearer:
> 
>   if (min_nsid + 1 >= NVME_NSID_BROADCAST) {
>   return NVME_INVALID_NSID | NVME_DNR;
>   }
> 

But since min_nsid is uint32_t that would not be wise ;)

I'll go with the - 1 and add a comment!
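
To spell out the wraparound, here is a minimal sketch. It assumes
NVME_NSID_BROADCAST is 0xffffffff and that unsigned int is 32 bits
wide; the two helper names are made up for illustration and are not
part of hw/block/nvme.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the broadcast nsid, for illustration only. */
#define NVME_NSID_BROADCAST 0xffffffffu

/*
 * "+ 1" form: for min_nsid == 0xffffffff the addition wraps to 0,
 * so the broadcast value itself slips through the check.
 */
static bool rejects_with_plus_one(uint32_t min_nsid)
{
    return min_nsid + 1 >= NVME_NSID_BROADCAST;
}

/*
 * "- 1" form: the arithmetic is done on the constant, so nothing
 * wraps and both 0xfffffffe and 0xffffffff are rejected.
 */
static bool rejects_with_minus_one(uint32_t min_nsid)
{
    return min_nsid >= NVME_NSID_BROADCAST - 1;
}
```

With these helpers, rejects_with_plus_one(0xffffffff) is false while
rejects_with_minus_one(0xffffffff) is true; both return true for
0xfffffffe.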

> Whichever form you prefer you can amend to the respin patch:
> Reviewed-by: Philippe Mathieu-Daudé 
> 
> > 
> >>>
> >>> Signed-off-by: Klaus Jensen 
> >>> ---
> >>>  hw/block/nvme.c | 4 
> >>>  1 file changed, 4 insertions(+)
> >>>
> >>> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> >>> index 65c2fa3ac1f4..0dac7a41ddae 100644
> >>> --- a/hw/block/nvme.c
> >>> +++ b/hw/block/nvme.c
> >>> @@ -956,6 +956,10 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, 
> >>> NvmeIdentify *c)
> >>>  
> >>>  trace_pci_nvme_identify_nslist(min_nsid);
> >>>  
> >>> +if (min_nsid == 0xfffffffe || min_nsid == NVME_NSID_BROADCAST) {
> >>> +return NVME_INVALID_NSID | NVME_DNR;
> >>> +}
> >>> +
> >>>  list = g_malloc0(data_len);
> >>>  for (i = 0; i < n->num_namespaces; i++) {
> >>>  if (i < min_nsid) {
> >>>
> >>
> > 
>

[PATCH v11 4/8] pflash: introduce ERRP_AUTO_PROPAGATE

2020-07-03 Thread Vladimir Sementsov-Ogievskiy

If we want to add some info to errp (by error_prepend() or
error_append_hint()), we must use the ERRP_AUTO_PROPAGATE macro.
Otherwise, this info will not be added when errp == &error_fatal
(the program will exit prior to the error_append_hint() or
error_prepend() call).  Fix such cases.

If we want to check for an error after calling a function that takes
errp, we need to introduce local_err and then propagate it to errp.
Using the ERRP_AUTO_PROPAGATE macro instead has these benefits:
1. No need for an explicit error_propagate() call
2. No need for an explicit local_err variable: use errp directly
3. ERRP_AUTO_PROPAGATE leaves errp as is if it's not NULL or
   &error_fatal, which means that we don't break error_abort
   (we'll abort on error_set, not on error_propagate)

This commit is generated by command

sed -n '/^Parallel NOR Flash devices$/,/^$/{s/^F: //p}' \
MAINTAINERS | \
xargs git ls-files | grep '\.[hc]$' | \
xargs spatch \
--sp-file scripts/coccinelle/auto-propagated-errp.cocci \
--macro-file scripts/cocci-macro-file.h \
--in-place --no-show-diff --max-width 80

Reported-by: Kevin Wolf 
Reported-by: Greg Kurz 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/block/pflash_cfi01.c | 7 +++
 hw/block/pflash_cfi02.c | 7 +++
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
index cddc3a5a0c..859cfeae14 100644
--- a/hw/block/pflash_cfi01.c
+++ b/hw/block/pflash_cfi01.c
@@ -696,12 +696,12 @@ static const MemoryRegionOps pflash_cfi01_ops = {
 
 static void pflash_cfi01_realize(DeviceState *dev, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 PFlashCFI01 *pfl = PFLASH_CFI01(dev);
 uint64_t total_len;
 int ret;
 uint64_t blocks_per_device, sector_len_per_device, device_len;
 int num_devices;
-Error *local_err = NULL;
 
 if (pfl->sector_len == 0) {
 error_setg(errp, "attribute \"sector-length\" not specified or zero.");
@@ -735,9 +735,8 @@ static void pflash_cfi01_realize(DeviceState *dev, Error 
**errp)
 &pfl->mem, OBJECT(dev),
 &pflash_cfi01_ops,
 pfl,
-pfl->name, total_len, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+pfl->name, total_len, errp);
+if (*errp) {
 return;
 }
 
diff --git a/hw/block/pflash_cfi02.c b/hw/block/pflash_cfi02.c
index b40ce2335a..15035ee5ef 100644
--- a/hw/block/pflash_cfi02.c
+++ b/hw/block/pflash_cfi02.c
@@ -724,9 +724,9 @@ static const MemoryRegionOps pflash_cfi02_ops = {
 
 static void pflash_cfi02_realize(DeviceState *dev, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 PFlashCFI02 *pfl = PFLASH_CFI02(dev);
 int ret;
-Error *local_err = NULL;
 
 if (pfl->uniform_sector_len == 0 && pfl->sector_len[0] == 0) {
 error_setg(errp, "attribute \"sector-length\" not specified or zero.");
@@ -792,9 +792,8 @@ static void pflash_cfi02_realize(DeviceState *dev, Error 
**errp)
 
 memory_region_init_rom_device(&pfl->orig_mem, OBJECT(pfl),
   &pflash_cfi02_ops, pfl, pfl->name,
-  pfl->chip_len, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+  pfl->chip_len, errp);
+if (*errp) {
 return;
 }
 
-- 
2.21.0

[PULL 0/7] Block layer patches

2020-07-03 Thread Kevin Wolf

The following changes since commit 64f0ad8ad8e13257e7c912df470d46784b55c3fd:

  Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2020-07-02' into 
staging (2020-07-02 15:54:09 +0100)

are available in the Git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to 4f071a9460886667fde061c05b79dc786cc22e3c:

  iotests: Fix 051 output after qdev_init_nofail() removal (2020-07-03 10:06:29 
+0200)


Block layer patches:

- qemu-img convert: Don't pre-zero images (removes nowadays
  counterproductive optimisation)
- qemu-storage-daemon: Fix object-del, cleaner shutdown
- vvfat: Check that the guest doesn't escape the given host directory
  with read-write vvfat drives
- vvfat: Fix crash by out-of-bounds array writes for read-write drives
- iotests fixes


Kevin Wolf (3):
  qemu-img convert: Don't pre-zero images
  vvfat: Check that updated filenames are valid
  vvfat: Fix array_remove_slice()

Max Reitz (1):
  iotests.py: Do not wait() before communicate()

Philippe Mathieu-Daudé (1):
  iotests: Fix 051 output after qdev_init_nofail() removal

Stefan Hajnoczi (2):
  qemu-storage-daemon: remember to add qemu_object_opts
  qemu-storage-daemon: add missing cleanup calls

 block/vvfat.c | 67 +++
 qemu-img.c|  9 --
 qemu-storage-daemon.c |  5 
 tests/qemu-iotests/iotests.py | 34 +++---
 tests/qemu-iotests/051.pc.out |  4 +--
 5 files changed, 53 insertions(+), 66 deletions(-)

[PATCH v11 3/8] SD (Secure Card): introduce ERRP_AUTO_PROPAGATE

2020-07-03 Thread Vladimir Sementsov-Ogievskiy

If we want to add some info to errp (by error_prepend() or
error_append_hint()), we must use the ERRP_AUTO_PROPAGATE macro.
Otherwise, this info will not be added when errp == &error_fatal
(the program will exit prior to the error_append_hint() or
error_prepend() call).  Fix such cases.

If we want to check for an error after calling a function that takes
errp, we need to introduce local_err and then propagate it to errp.
Using the ERRP_AUTO_PROPAGATE macro instead has these benefits:
1. No need for an explicit error_propagate() call
2. No need for an explicit local_err variable: use errp directly
3. ERRP_AUTO_PROPAGATE leaves errp as is if it's not NULL or
   &error_fatal, which means that we don't break error_abort
   (we'll abort on error_set, not on error_propagate)

This commit is generated by command

sed -n '/^SD (Secure Card)$/,/^$/{s/^F: //p}' \
MAINTAINERS | \
xargs git ls-files | grep '\.[hc]$' | \
xargs spatch \
--sp-file scripts/coccinelle/auto-propagated-errp.cocci \
--macro-file scripts/cocci-macro-file.h \
--in-place --no-show-diff --max-width 80

Reported-by: Kevin Wolf 
Reported-by: Greg Kurz 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 hw/sd/sdhci-pci.c |  7 +++
 hw/sd/sdhci.c | 21 +
 hw/sd/ssi-sd.c| 10 +-
 3 files changed, 17 insertions(+), 21 deletions(-)

diff --git a/hw/sd/sdhci-pci.c b/hw/sd/sdhci-pci.c
index 4f5977d487..38ec572fc6 100644
--- a/hw/sd/sdhci-pci.c
+++ b/hw/sd/sdhci-pci.c
@@ -29,13 +29,12 @@ static Property sdhci_pci_properties[] = {
 
 static void sdhci_pci_realize(PCIDevice *dev, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 SDHCIState *s = PCI_SDHCI(dev);
-Error *local_err = NULL;
 
 sdhci_initfn(s);
-sdhci_common_realize(s, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+sdhci_common_realize(s, errp);
+if (*errp) {
 return;
 }
 
diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index eb2be6529e..be1928784d 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -1288,7 +1288,7 @@ static const MemoryRegionOps sdhci_mmio_ops = {
 
 static void sdhci_init_readonly_registers(SDHCIState *s, Error **errp)
 {
-Error *local_err = NULL;
+ERRP_AUTO_PROPAGATE();
 
 switch (s->sd_spec_version) {
 case 2 ... 3:
@@ -1299,9 +1299,8 @@ static void sdhci_init_readonly_registers(SDHCIState *s, 
Error **errp)
 }
 s->version = (SDHC_HCVER_VENDOR << 8) | (s->sd_spec_version - 1);
 
-sdhci_check_capareg(s, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+sdhci_check_capareg(s, errp);
+if (*errp) {
 return;
 }
 }
@@ -1332,11 +1331,10 @@ void sdhci_uninitfn(SDHCIState *s)
 
 void sdhci_common_realize(SDHCIState *s, Error **errp)
 {
-Error *local_err = NULL;
+ERRP_AUTO_PROPAGATE();
 
-sdhci_init_readonly_registers(s, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+sdhci_init_readonly_registers(s, errp);
+if (*errp) {
 return;
 }
 s->buf_maxsz = sdhci_get_fifolen(s);
@@ -1456,13 +1454,12 @@ static void sdhci_sysbus_finalize(Object *obj)
 
 static void sdhci_sysbus_realize(DeviceState *dev, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 SDHCIState *s = SYSBUS_SDHCI(dev);
 SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
-Error *local_err = NULL;
 
-sdhci_common_realize(s, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+sdhci_common_realize(s, errp);
+if (*errp) {
 return;
 }
 
diff --git a/hw/sd/ssi-sd.c b/hw/sd/ssi-sd.c
index e0fb9f3093..43e5730b00 100644
--- a/hw/sd/ssi-sd.c
+++ b/hw/sd/ssi-sd.c
@@ -241,10 +241,10 @@ static const VMStateDescription vmstate_ssi_sd = {
 
 static void ssi_sd_realize(SSISlave *d, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 ssi_sd_state *s = FROM_SSI_SLAVE(ssi_sd_state, d);
 DeviceState *carddev;
 DriveInfo *dinfo;
-Error *err = NULL;
 
 qbus_create_inplace(&s->sdbus, sizeof(s->sdbus), TYPE_SD_BUS,
 DEVICE(d), "sd-bus");
@@ -255,23 +255,23 @@ static void ssi_sd_realize(SSISlave *d, Error **errp)
 carddev = qdev_new(TYPE_SD_CARD);
 if (dinfo) {
 if (!qdev_prop_set_drive_err(carddev, "drive",
- blk_by_legacy_dinfo(dinfo), &err)) {
+ blk_by_legacy_dinfo(dinfo), errp)) {
 goto fail;
 }
 }
 
-if (!object_property_set_bool(OBJECT(carddev), "spi", true, &err)) {
+if (!object_property_set_bool(OBJECT(carddev), "spi", true, errp)) {
 goto fail;
 }
 
-if (!qdev_realize_and_unref(carddev, BUS(&s->sdbus), &err)) {
+if (!qdev_realize_and_unref(carddev, BUS(&s->sdbus), errp)) {
 goto fail;
 }
 
 return;
 
 fail:
-error_propagate_prepend(errp, err, "failed to init SD card: ");
+error_prepend(errp, "failed to init SD card: ");
 }
 
 static void ssi_sd_reset(DeviceState *dev

[PATCH v11 7/8] nbd: introduce ERRP_AUTO_PROPAGATE

2020-07-03 Thread Vladimir Sementsov-Ogievskiy

If we want to add some info to errp (by error_prepend() or
error_append_hint()), we must use the ERRP_AUTO_PROPAGATE macro.
Otherwise, this info will not be added when errp == &error_fatal
(the program will exit prior to the error_append_hint() or
error_prepend() call).  Fix such cases.

If we want to check for an error after calling a function that takes
errp, we need to introduce local_err and then propagate it to errp.
Using the ERRP_AUTO_PROPAGATE macro instead has these benefits:
1. No need for an explicit error_propagate() call
2. No need for an explicit local_err variable: use errp directly
3. ERRP_AUTO_PROPAGATE leaves errp as is if it's not NULL or
   &error_fatal, which means that we don't break error_abort
   (we'll abort on error_set, not on error_propagate)

This commit is generated by command

sed -n '/^Network Block Device (NBD)$/,/^$/{s/^F: //p}' \
MAINTAINERS | \
xargs git ls-files | grep '\.[hc]$' | \
xargs spatch \
--sp-file scripts/coccinelle/auto-propagated-errp.cocci \
--macro-file scripts/cocci-macro-file.h \
--in-place --no-show-diff --max-width 80

Reported-by: Kevin Wolf 
Reported-by: Greg Kurz 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/nbd.h | 1 +
 block/nbd.c | 7 +++
 nbd/client.c| 5 +
 nbd/server.c| 5 +
 4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 20363280ae..f7d87636d3 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -361,6 +361,7 @@ void nbd_server_start_options(NbdServerOptions *arg, Error 
**errp);
 static inline int nbd_read(QIOChannel *ioc, void *buffer, size_t size,
const char *desc, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 int ret = qio_channel_read_all(ioc, buffer, size, errp) < 0 ? -EIO : 0;
 
 if (ret < 0) {
diff --git a/block/nbd.c b/block/nbd.c
index 6876da04a7..b7cea0f650 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1408,16 +1408,15 @@ static void nbd_client_close(BlockDriverState *bs)
 static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
   Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 QIOChannelSocket *sioc;
-Error *local_err = NULL;
 
 sioc = qio_channel_socket_new();
 qio_channel_set_name(QIO_CHANNEL(sioc), "nbd-client");
 
-qio_channel_socket_connect_sync(sioc, saddr, &local_err);
-if (local_err) {
+qio_channel_socket_connect_sync(sioc, saddr, errp);
+if (*errp) {
 object_unref(OBJECT(sioc));
-error_propagate(errp, local_err);
 return NULL;
 }
 
diff --git a/nbd/client.c b/nbd/client.c
index ba173108ba..e258ef3f7e 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -68,6 +68,7 @@ static int nbd_send_option_request(QIOChannel *ioc, uint32_t 
opt,
uint32_t len, const char *data,
Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 NBDOption req;
 QEMU_BUILD_BUG_ON(sizeof(req) != 16);
 
@@ -153,6 +154,7 @@ static int nbd_receive_option_reply(QIOChannel *ioc, 
uint32_t opt,
 static int nbd_handle_reply_err(QIOChannel *ioc, NBDOptionReply *reply,
 bool strict, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 g_autofree char *msg = NULL;
 
 if (!(reply->type & (1 << 31))) {
@@ -337,6 +339,7 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, 
char **description,
 static int nbd_opt_info_or_go(QIOChannel *ioc, uint32_t opt,
   NBDExportInfo *info, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 NBDOptionReply reply;
 uint32_t len = strlen(info->name);
 uint16_t type;
@@ -882,6 +885,7 @@ static int nbd_start_negotiate(AioContext *aio_context, 
QIOChannel *ioc,
bool structured_reply, bool *zeroes,
Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 uint64_t magic;
 
 trace_nbd_start_negotiate(tlscreds, hostname ? hostname : "");
@@ -1017,6 +1021,7 @@ int nbd_receive_negotiate(AioContext *aio_context, 
QIOChannel *ioc,
   const char *hostname, QIOChannel **outioc,
   NBDExportInfo *info, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 int result;
 bool zeroes;
 bool base_allocation = info->base_allocation;
diff --git a/nbd/server.c b/nbd/server.c
index 20754e9ebc..8a12e586d7 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -211,6 +211,7 @@ static int GCC_FMT_ATTR(4, 0)
 nbd_negotiate_send_rep_verr(NBDClient *client, uint32_t type,
 Error **errp, const char *fmt, va_list va)
 {
+ERRP_AUTO_PROPAGATE();
 g_autofree char *msg = NULL;
 int ret;
 size_t len;
@@ -382,6 +383,7 @@ static int nbd_opt_read_name(NBDClient *client, char 
**name, uint32_t *length,
 static int nbd_negotiate_send_rep_list(NBDClient *client, NBDExport *exp,

[PULL 2/7] qemu-storage-daemon: remember to add qemu_object_opts

2020-07-03 Thread Kevin Wolf

From: Stefan Hajnoczi 

The --object option is supported by qemu-storage-daemon but the
qemu_object_opts QemuOptsList wasn't being added. As a result calls to
qemu_find_opts("object") failed with "There is no option group
'object'".

This patch fixes the object-del QMP command.

Signed-off-by: Stefan Hajnoczi 
Message-Id: <20200619101132.2401756-2-stefa...@redhat.com>
Reviewed-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 qemu-storage-daemon.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/qemu-storage-daemon.c b/qemu-storage-daemon.c
index 9e7adfe3a6..a01cbd6371 100644
--- a/qemu-storage-daemon.c
+++ b/qemu-storage-daemon.c
@@ -316,6 +316,7 @@ int main(int argc, char *argv[])
 
 module_call_init(MODULE_INIT_QOM);
 module_call_init(MODULE_INIT_TRACE);
+qemu_add_opts(&qemu_object_opts);
 qemu_add_opts(&qemu_trace_opts);
 qcrypto_init(&error_fatal);
 bdrv_init();
-- 
2.25.4

[PATCH v11 6/8] virtio-9p: introduce ERRP_AUTO_PROPAGATE

2020-07-03 Thread Vladimir Sementsov-Ogievskiy

If we want to add some info to errp (by error_prepend() or
error_append_hint()), we must use the ERRP_AUTO_PROPAGATE macro.
Otherwise, this info will not be added when errp == &error_fatal
(the program will exit prior to the error_append_hint() or
error_prepend() call).  Fix such cases.

If we want to check for an error after calling a function that takes
errp, we need to introduce local_err and then propagate it to errp.
Using the ERRP_AUTO_PROPAGATE macro instead has these benefits:
1. No need for an explicit error_propagate() call
2. No need for an explicit local_err variable: use errp directly
3. ERRP_AUTO_PROPAGATE leaves errp as is if it's not NULL or
   &error_fatal, which means that we don't break error_abort
   (we'll abort on error_set, not on error_propagate)

This commit is generated by command

sed -n '/^virtio-9p$/,/^$/{s/^F: //p}' MAINTAINERS | \
xargs git ls-files | grep '\.[hc]$' | \
xargs spatch \
--sp-file scripts/coccinelle/auto-propagated-errp.cocci \
--macro-file scripts/cocci-macro-file.h \
--in-place --no-show-diff --max-width 80

Reported-by: Kevin Wolf 
Reported-by: Greg Kurz 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Acked-by: Greg Kurz 
Reviewed-by: Christian Schoenebeck 
---
 hw/9pfs/9p-local.c | 12 +---
 hw/9pfs/9p.c   |  1 +
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index 54e012e5b4..0361e0c0b4 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -1479,10 +1479,10 @@ static void error_append_security_model_hint(Error 
*const *errp)
 
 static int local_parse_opts(QemuOpts *opts, FsDriverEntry *fse, Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 const char *sec_model = qemu_opt_get(opts, "security_model");
 const char *path = qemu_opt_get(opts, "path");
 const char *multidevs = qemu_opt_get(opts, "multidevs");
-Error *local_err = NULL;
 
 if (!sec_model) {
 error_setg(errp, "security_model property not set");
@@ -1516,11 +1516,10 @@ static int local_parse_opts(QemuOpts *opts, 
FsDriverEntry *fse, Error **errp)
 fse->export_flags &= ~V9FS_FORBID_MULTIDEVS;
 fse->export_flags &= ~V9FS_REMAP_INODES;
 } else {
-error_setg(&local_err, "invalid multidevs property '%s'",
+error_setg(errp, "invalid multidevs property '%s'",
multidevs);
-error_append_hint(&local_err, "Valid options are: multidevs="
+error_append_hint(errp, "Valid options are: multidevs="
   "[remap|forbid|warn]\n");
-error_propagate(errp, local_err);
 return -1;
 }
 }
@@ -1530,9 +1529,8 @@ static int local_parse_opts(QemuOpts *opts, FsDriverEntry 
*fse, Error **errp)
 return -1;
 }
 
-if (fsdev_throttle_parse_opts(opts, &fse->fst, &local_err)) {
-error_propagate_prepend(errp, local_err,
-"invalid throttle configuration: ");
+if (fsdev_throttle_parse_opts(opts, &fse->fst, errp)) {
+error_prepend(errp, "invalid throttle configuration: ");
 return -1;
 }
 
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 9755fba9a9..bdb1360482 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -4011,6 +4011,7 @@ void pdu_submit(V9fsPDU *pdu, P9MsgHeader *hdr)
 int v9fs_device_realize_common(V9fsState *s, const V9fsTransport *t,
Error **errp)
 {
+ERRP_AUTO_PROPAGATE();
 int i, len;
 struct stat stat;
 FsDriverEntry *fse;
-- 
2.21.0

Re: [PATCH V2 0/2] net/colo-compare.c: Expose "max_queue_size" to users and clean up

2020-07-03 Thread Jason Wang




On 2020/7/3 下午5:27, Zhang, Chen wrote:

Hi Jason,

Maybe missed this updated series?

Thanks
Zhang Chen



Nope :)

It's in my queue. Since I only have those two patches, I don't plan to
send a pull request this week.


(Anyway it's not a feature, so we don't need to worry about soft freeze).

Thanks





-Original Message-
From: Zhang, Chen 
Sent: Wednesday, June 24, 2020 9:21 AM
To: Jason Wang 
Cc: Zhang Chen ; qemu-dev ; Zhang, Chen 
Subject: [PATCH V2 0/2] net/colo-compare.c: Expose "max_queue_size" to
users and clean up

From: Zhang Chen 

This series makes it possible to configure the COLO "max_queue_size"
parameter according to the user's scenarios and environments, and
cleans up some descriptions.

V2:
  - Rebase on upstream code.

Zhang Chen (2):
   net/colo-compare.c: Expose compare "max_queue_size" to users
   qemu-options.hx: Clean up and fix typo for colo-compare

  net/colo-compare.c | 43
++-
  qemu-options.hx| 33 +
  2 files changed, 59 insertions(+), 17 deletions(-)

--
2.17.1

[PULL 6/7] iotests.py: Do not wait() before communicate()

2020-07-03 Thread Kevin Wolf

From: Max Reitz 

Waiting on a process for which we have a pipe will stall if the process
outputs more data than fits into the OS-provided buffer.  We must use
communicate() before wait(), and in fact, communicate() perfectly
replaces wait() already.

We have to drop the stderr=subprocess.STDOUT parameter from
subprocess.Popen() in qemu_nbd_early_pipe(), because stderr is passed on
to the child process, so if we do not drop this parameter, communicate()
will hang (because the pipe is not closed).

Signed-off-by: Max Reitz 
Message-Id: <20200630083711.40567-1-mre...@redhat.com>
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/iotests.py | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 5ea4c4df8b..ef739dd1e3 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -146,11 +146,12 @@ def qemu_img_pipe(*args):
 stdout=subprocess.PIPE,
 stderr=subprocess.STDOUT,
 universal_newlines=True)
-exitcode = subp.wait()
-if exitcode < 0:
+output = subp.communicate()[0]
+if subp.returncode < 0:
 sys.stderr.write('qemu-img received signal %i: %s\n'
- % (-exitcode, ' '.join(qemu_img_args + list(args))))
-return subp.communicate()[0]
+ % (-subp.returncode,
+' '.join(qemu_img_args + list(args))))
+return output
 
 def qemu_img_log(*args):
 result = qemu_img_pipe(*args)
@@ -177,11 +178,11 @@ def qemu_io(*args):
 subp = subprocess.Popen(args, stdout=subprocess.PIPE,
 stderr=subprocess.STDOUT,
 universal_newlines=True)
-exitcode = subp.wait()
-if exitcode < 0:
+output = subp.communicate()[0]
+if subp.returncode < 0:
 sys.stderr.write('qemu-io received signal %i: %s\n'
- % (-exitcode, ' '.join(args)))
-return subp.communicate()[0]
+ % (-subp.returncode, ' '.join(args)))
+return output
 
 def qemu_io_log(*args):
 result = qemu_io(*args)
@@ -257,15 +258,14 @@ def qemu_nbd_early_pipe(*args):
and its output in case of an error'''
 subp = subprocess.Popen(qemu_nbd_args + ['--fork'] + list(args),
 stdout=subprocess.PIPE,
-stderr=subprocess.STDOUT,
 universal_newlines=True)
-exitcode = subp.wait()
-if exitcode < 0:
+output = subp.communicate()[0]
+if subp.returncode < 0:
 sys.stderr.write('qemu-nbd received signal %i: %s\n' %
- (-exitcode,
+ (-subp.returncode,
   ' '.join(qemu_nbd_args + ['--fork'] + list(args))))
 
-return exitcode, subp.communicate()[0] if exitcode else ''
+return subp.returncode, output if subp.returncode else ''
 
 def qemu_nbd_popen(*args):
 '''Run qemu-nbd in daemon mode and return the parent's exit code'''
@@ -1062,11 +1062,11 @@ def qemu_pipe(*args):
 subp = subprocess.Popen(args, stdout=subprocess.PIPE,
 stderr=subprocess.STDOUT,
 universal_newlines=True)
-exitcode = subp.wait()
-if exitcode < 0:
+output = subp.communicate()[0]
+if subp.returncode < 0:
 sys.stderr.write('qemu received signal %i: %s\n' %
- (-exitcode, ' '.join(args)))
-return subp.communicate()[0]
+ (-subp.returncode, ' '.join(args)))
+return output
 
 def supported_formats(read_only=False):
 '''Set 'read_only' to True to check ro-whitelist
-- 
2.25.4

Re: [PULL 00/10] Modules 20200702 patches

2020-07-03 Thread Claudio Fontana

Hello Gerd,

I think the general idea of making it easier to modularize things is great.
Is this intended for devices only, or could I rework my changes to support
modularizing per-target AccelClass types and all the related code on top of
your design?

Thanks,

Claudio

On 7/2/20 2:20 PM, Gerd Hoffmann wrote:
> The following changes since commit fc1bff958998910ec8d25db86cd2f53ff125f7ab:
> 
>   hw/misc/pca9552: Add missing TypeInfo::class_size field (2020-06-29 
> 21:16:10 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.kraxel.org/qemu tags/modules-20200702-pull-request
> 
> for you to fetch changes up to 474a5d66036d18ee5ccaa88364660d05bf32127b:
> 
>   chardev: enable modules, use for braille (2020-07-01 21:08:11 +0200)
> 
> 
> qom: add support for qom objects in modules.
> build some devices (qxl, virtio-gpu, ccid, usb-redir) as modules.
> build braille chardev as module.
> 
> note: qemu doesn't rebuild objects on cflags changes (specifically
>   -fPIC being added when code is switched from builtin to module).
>   Workaround for resulting build errors: "make clean", rebuild.
> 
> 
> 
> Gerd Hoffmann (10):
>   module: qom module support
>   object: qom module support
>   qdev: device module support
>   build: fix device module builds
>   ccid: build smartcard as module
>   usb: build usb-redir as module
>   vga: build qxl as module
>   vga: build virtio-gpu only once
>   vga: build virtio-gpu as module
>   chardev: enable modules, use for braille
> 
>  Makefile.objs|  2 ++
>  Makefile.target  |  7 +
>  include/qemu/module.h|  2 ++
>  include/qom/object.h | 12 +++
>  chardev/char.c   |  2 +-
>  hw/core/qdev.c   |  6 ++--
>  qdev-monitor.c   |  5 +--
>  qom/object.c | 14 +
>  qom/qom-qmp-cmds.c   |  3 +-
>  softmmu/vl.c |  4 +--
>  util/module.c| 67 
>  chardev/Makefile.objs|  5 ++-
>  hw/Makefile.objs |  2 ++
>  hw/display/Makefile.objs | 28 ++---
>  hw/usb/Makefile.objs | 13 +---
>  15 files changed, 148 insertions(+), 24 deletions(-)
>

Re: [PULL 00/41] virtio,acpi: features, fixes, cleanups.

2020-07-03 Thread no-reply

Patchew URL: https://patchew.org/QEMU/20200703090252.368694-1-...@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PULL 00/41] virtio,acpi: features, fixes, cleanups.
Type: series
Message-id: 20200703090252.368694-1-...@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

From https://github.com/patchew-project/qemu
 * [new tag] patchew/20200703090252.368694-1-...@redhat.com -> 
patchew/20200703090252.368694-1-...@redhat.com
Switched to a new branch 'test'
d326f4c vhost-vdpa: introduce vhost-vdpa net client
5b79506 vhost-vdpa: introduce vhost-vdpa backend
34e3d32 vhost_net: introduce set_config & get_config
45e1755 vhost: implement vhost_force_iommu method
521dda5 vhost: introduce new VhostOps vhost_force_iommu
cde3f4d vhost: implement vhost_vq_get_addr method
9ca vhost: introduce new VhostOps vhost_vq_get_addr
46365ef vhost: implement vhost_dev_start method
258ef40 vhost: introduce new VhostOps vhost_dev_start
595fee8 vhost: check the existence of vhost_set_iotlb_callback
6d31611 virtio-pci: implement queue_enabled method
9d7cc7e virtio-bus: introduce queue_enabled method
63e1852 vhost_net: use the function qemu_get_peer
c14a63b0 net: introduce qemu_get_peer
ae02598 MAINTAINERS: add VT-d entry
e3da26d docs: vhost-user: add Virtio status protocol feature
11e69be tests/acpi: remove stale allowed tables
5495dd2 numa: Auto-enable NUMA when any memory devices are possible
03bbf1c virtio-mem: Exclude unplugged memory during migration
d06c835 virtio-mem: Add trace events
8e27aef virtio-mem: Migration sanity checks
c96ea1a virtio-pci: Send qapi events when the virtio-mem size changes
b36b37d virtio-mem: Allow notifiers for size changes
da08f51 pc: Support for virtio-mem-pci
118ff96 numa: Handle virtio-mem in NUMA stats
1bbdf6d hmp: Handle virtio-mem when printing memory device info
8e46b61 MAINTAINERS: Add myself as virtio-mem maintainer
2b74d63 virtio-pci: Proxy for virtio-mem
961edc2 virtio-mem: Paravirtualized memory hot(un)plug
4f15a29 migration/colo: Use ram_block_discard_disable()
bedd15d migration/rdma: Use ram_block_discard_disable()
170b21f target/i386: sev: Use ram_block_discard_disable()
04b1297 virtio-balloon: Rip out qemu_balloon_inhibit()
016195a s390x/pv: Convert to ram_block_discard_disable()
117712d accel/kvm: Convert to ram_block_discard_disable()
7a7ff33 vfio: Convert to ram_block_discard_disable()
e41bedd exec: Introduce ram_block_discard_(disable|require)()
7bf2f8b pc: Support coldplugging of virtio-pmem-pci devices on all buses
abff164 virtio-balloon: always indicate S_DONE when migration fails
cbb5a2c Revert "tests/migration: Reduce autoconverge initial bandwidth"
2f9d6c5 tests: disassemble-aml.sh: generate AML in readable format

=== OUTPUT BEGIN ===
1/41 Checking commit 2f9d6c5c8d4f (tests: disassemble-aml.sh: generate AML in readable format)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#16: 
new file mode 100755

WARNING: line over 80 characters
#30: FILE: tests/data/acpi/disassemle-aml.sh:10:
+echo "Usage: ./tests/data/acpi/disassemle-aml.sh [-o 
]"

ERROR: line over 90 characters
#81: FILE: tests/data/acpi/rebuild-expected-aml.sh:39:
+echo "You can use ${SRC_PATH}/tests/data/acpi/disassemle-aml.sh to disassemble them to ASL."

total: 1 errors, 2 warnings, 59 lines checked

Patch 1/41 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
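
The two length findings above point to a two-tier rule: a warning for lines over 80 characters and an error for lines over 90. The following is a minimal sketch of such a check, not the logic of scripts/checkpatch.pl itself. The function name `classify_line_length` and its parameters are hypothetical, and the thresholds are taken only from the messages in this report.

```python
def classify_line_length(line, warn_at=80, error_at=90):
    """Return a checkpatch-style message for an over-long line, or None.

    Assumption: thresholds mirror the messages in the report above
    (warning over 80, error over 90); the real script may count
    characters differently (e.g. tabs or the leading '+' of a diff).
    """
    length = len(line.rstrip("\n"))
    if length > error_at:
        return "ERROR: line over %d characters" % error_at
    if length > warn_at:
        return "WARNING: line over %d characters" % warn_at
    return None


if __name__ == "__main__":
    # Hypothetical sample lines of 80, 85 and 95 characters.
    for sample in ("x" * 80, "x" * 85, "x" * 95):
        print(len(sample), classify_line_length(sample))
```

Under these assumptions, a line of exactly 80 characters passes, one of 81 to 90 characters draws the warning, and anything longer draws the error.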

2/41 Checking commit cbb5a2c19835 (Revert "tests/migration: Reduce autoconverge initial bandwidth")
3/41 Checking commit abff1647fa36 (virtio-balloon: always indicate S_DONE when migration fails)
4/41 Checking commit 7bf2f8bc2dbe (pc: Support coldplugging of virtio-pmem-pci devices on all buses)
5/41 Checking commit e41bedd7874b (exec: Introduce ram_block_discard_(disable|require)())
6/41 Checking commit 7a7ff3317cec (vfio: Convert to ram_block_discard_disable())
7/41 Checking commit 117712d3f611 (accel/kvm: Convert to ram_block_discard_disable())
8/41 Checking commit 016195ab9045 (s390x/pv: Convert to ram_block_discard_disable())
9/41 Checking commit 04b12978b0c8 (virtio-balloon: Rip out qemu_balloon_inhibit())
10/41 Checking commit 170b21f22848 (target/i386: sev: Use ram_block_discard_disable())
11/41 Checking commit bedd15d37028 (migration/rdma: Use ram_block_discard_disable())
12/41 Checking commit 4f15a2930cd4 (migration/colo: Use ram_block_discard_disable())
13/41 Checking commit 961edc25d06d (virtio-mem: Paravirtualized memory hot(un)plug)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#150: 
new file mode 100644

WARNING: architecture specific defines should be avoided
#207: FIL
