Re: [Qemu-devel] [PATCH] msix: fix interrupt aggregation problem at the passthrough of NVMe SSD

2019-04-10 Thread Zhuangyanying



> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Tuesday, April 09, 2019 11:04 PM
> To: Zhuangyanying 
> Cc: marcel.apfelb...@gmail.com; qemu-devel@nongnu.org; Gonglei (Arei)
> 
> Subject: Re: [PATCH] msix: fix interrupt aggregation problem at the
> passthrough of NVMe SSD
> 
> On Tue, Apr 09, 2019 at 02:14:56PM +, Zhuangyanying wrote:
> > From: Zhuang Yanying 
> >
> > Recently I tested the performance of NVMe SSD passthrough and found
> > that interrupts were aggregated on vcpu0(or the first vcpu of each
> > numa) by /proc/interrupts,when GuestOS was upgraded to sles12sp3 (or
> > redhat7.6). But /proc/irq/X/smp_affinity_list shows that the interrupt is
> spread out, such as 0-10, 11-21, and so on.
> > This problem cannot be resolved by "echo X >
> > /proc/irq/X/smp_affinity_list", because the NVMe SSD interrupt is
> > requested by the API pci_alloc_irq_vectors(), so the interrupt has the
> IRQD_AFFINITY_MANAGED flag.
> >
> > GuestOS sles12sp3 backport "automatic interrupt affinity for MSI/MSI-X
> > capable devices", but the implementation of __setup_irq has no
> > corresponding modification. It is still irq_startup(), then
> > setup_affinity(), that is sending an affinity message when the interrupt is
> unmasked.
> 
> So does latest upstream still change data/address of an unmasked vector?
> 
The latest upstream works fine. Affinity configuration is successful.

The original order at my understanding is
nvme_setup_io_queues()
 \ \
  \ --->pci_alloc_irq_vectors_affinity()
   \\
\-> msi_domain_alloc_irqs()
 \ \ /* if IRQD_AFFINITY_MANAGED, then "mask = affinity " */
  \ -> ...-> __irq_alloc_descs()
   \ \  /* cpumask_copy(desc->irq_common_data.affinity, 
affinity); */
\ -> ...-> desc_smp_init()
 ->request_threaded_irq()
\
 ->__setup_irq()
   \  \
\  ->irq_startup()->msi_domain_activate()
 \  \
  \  ->irq_enable()->pci_msi_unmask_irq()
   \
-->setup_affinity()
   \\
\-->if (irqd_affinity_is_managed(&desc->irq_data)) 
 \   set = desc->irq_common_data.affinity;
  \  cpumask_and(mask, cpu_online_mask, set);
   \
-->irq_do_set_affinity()
   \
-->msi_domain_set_affinity()
  \   /* Actual setting affinity*/
   -->__pci_write_msi_msg()

Afetr commit e43b3b58:" genirq/cpuhotplug: Enforce affinity setting on startup 
of managed irqs "
@@ -265,8 +265,8 @@ int irq_startup(struct irq_desc *desc, bool resend, bool 
force)
irq_setup_affinity(desc);
break;
case IRQ_STARTUP_MANAGED:
+   irq_do_set_affinity(d, aff, false);
ret = __irq_startup (desc);
-   irq_set_affinity_locked(d, aff, false);
break;
First irq_do_set_affinity(), then __irq_startup(),that is unmask irq.

> > The bare metal configuration is successful, but qemu will not trigger
> > the msix update, and the affinity configuration fails.
> > The affinity is configured by /proc/irq/X/smp_affinity_list,
> > implemented at apic_ack_edge(), the bitmap is stored in pending_mask,
> > mask->__pci_write_msi_msg()->unmask,
> > and the timing is guaranteed, and the configuration takes effect.
> >
> > The GuestOS linux-master incorporates the "genirq/cpuhotplug: Enforce
> > affinity setting on startup of managed irqs" to ensure that the
> > affinity is first issued and then __irq_startup(), for the managerred
> > interrupt. So configuration is successful.
> >
> > It now looks like sles12sp3 (up to sles15sp1, linux-4.12.x), redhat7.6
> > (3.10.0-957.10.1) does not have backport the patch yet.
> 
> Sorry - which patch?
genirq/cpuhotplug: Enforce affinity setting on startup of managed irqs
commit e43b3b58
> 
> > "if (is_masked == was_masked) return;" can it be removed at qemu?
> > What is the reason for this check?
> >
> > Signed-off-by: Zhuang Yanying 
> > ---
> >  hw/pci/msix.c | 4 
> >  1 file changed, 4 deletions(-)
> >
> > diff --git a/hw/pci/msix.c b/hw/pci/msix.c index 4e33641..e1ff533
> > 100644
> > --- a/hw/pci/msix.c
> > +++ b/hw/pci/msix.c
> > @@ -119,10 +119,6 @@ static void msix_handle_mask_update(PCIDevice
> > *dev, int vector, bool was_masked)  {
> >  bool is_masked = msix_is_masked(dev, vector);
> >
> > -if (is_masked == was_masked) {
> > -return;
> > -}
> > -
> >  msix_fire_vector_notifier(dev, vector, is_masked);
> >
> >  if (!is_masked && msix_is_pending(dev, vector)) {
> 
> 
> To add to that,

[Qemu-devel] [Bug 1823831] Re: BSD bootloader halts with hypervisor.framework

2019-04-10 Thread Chen Zhang
Git bisect shows that 92d5f1a4147c3722b5e9a8bcfb7dc261b7a8b855 is the
first bad commit.

Author: Paolo Bonzini 
Date:   Tue Aug 21 15:31:24 2018 +0200

target/i386: unify masking of interrupts

Interrupt handling depends on various flags in env->hflags or env->hflags2,
and the exact detail were not exactly replicated between x86_cpu_has_work
and x86_cpu_exec_interrupt.  Create a new function that extracts the
highest-priority non-masked interrupt, and use it in both functions.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1823831

Title:
  BSD bootloader halts with hypervisor.framework

Status in QEMU:
  New

Bug description:
  Guest: FreeBSD 12.0 Install CD
  Host: MacOS 11.14.3 qemu master at 90fb864a7df0a9af677352e94f8225f7b03de922

  Command arguments:

  qemu-system-x86_64 -m 4000m -cdrom Downloads/FreeBSD-12.0-RELEASE-
  amd64-bootonly.iso

  When qemu was run with -accel hvf, the bootloader would halt after
  showing the menu. The bootloader would not respond to any keyboard
  events.

  Without acceleration option, the bootloader would count down to zero
  and proceed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1823831/+subscriptions



Re: [Qemu-devel] [PATCH] docs: replace min-glib with fedora

2019-04-10 Thread Stefano Garzarella
On Tue, Apr 09, 2019 at 05:44:35PM +0200, Marc-André Lureau wrote:
> min-glib.docker was removed in commit
> e7b3af81597db1a6b55f2c15d030d703c6b2c6ac ("glib: bump min required
> glib library version to 2.40").
> 
> Cc: Daniel P. Berrangé 
> Signed-off-by: Marc-André Lureau 
> ---
>  docs/devel/testing.rst | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Stefano Garzarella 

Thanks,
Stefano



Re: [Qemu-devel] [PATCH] configure: Automatically fall back to TCI on non-release architectures

2019-04-10 Thread Thomas Huth
On 09/04/2019 21.46, Stefan Weil wrote:
[...]
> The known problems with the current implementation are
[...]
> * Calling conventions. The current implementation works on many hosts,
> but for example not on Sparc
Is there also a problem with the sparc *target* (i.e. not only sparc
hosts)? TCI does not seem to work here:

 https://gitlab.com/huth/qemu/-/jobs/193861498

 Thomas



Re: [Qemu-devel] [PATCH v2 for-4.0?] aio-posix: ensure poll mode is left when aio_notify is called

2019-04-10 Thread Sergio Lopez

Paolo Bonzini  writes:

> With aio=thread, adaptive polling makes latency worse rather than
> better, because it delays the execution of the ThreadPool's
> completion bottom half.
>
> event_notifier_poll() does run while polling, detecting that
> a bottom half was scheduled by a worker thread, but because
> ctx->notifier is explicitly ignored in run_poll_handlers_once(),
> scheduling the BH does not count as making progress and
> run_poll_handlers() keeps running.  Fix this by recomputing
> the deadline after *timeout could have changed.
>
> With this change, ThreadPool still cannot participate in polling
> but at least it does not suffer from extra latency.
>
> Reported-by: Sergio Lopez 
> Cc: Stefan Hajnoczi 
> Cc: Kevin Wolf 
> Cc: qemu-bl...@nongnu.org
> Signed-off-by: Paolo Bonzini 
> Message-Id: <1553692145-86728-1-git-send-email-pbonz...@redhat.com>
> Signed-off-by: Paolo Bonzini 
> ---
> v1->v2: use qemu_soonest_timeout to handle timeout == -1
>  util/aio-posix.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/util/aio-posix.c b/util/aio-posix.c
> index 6fbfa7924f..db11021287 100644
> --- a/util/aio-posix.c
> +++ b/util/aio-posix.c
> @@ -519,6 +519,10 @@ static bool run_poll_handlers_once(AioContext *ctx, 
> int64_t *timeout)
>  if (!node->deleted && node->io_poll &&
>  aio_node_check(ctx, node->is_external) &&
>  node->io_poll(node->opaque)) {
> +/*
> + * Polling was successful, exit try_poll_mode immediately
> + * to adjust the next polling time.
> + */
>  *timeout = 0;
>  if (node->opaque != &ctx->notifier) {
>  progress = true;
> @@ -558,8 +562,9 @@ static bool run_poll_handlers(AioContext *ctx, int64_t 
> max_ns, int64_t *timeout)
>  do {
>  progress = run_poll_handlers_once(ctx, timeout);
>  elapsed_time = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start_time;
> -} while (!progress && elapsed_time < max_ns
> - && !atomic_read(&ctx->poll_disable_cnt));
> +max_ns = qemu_soonest_timeout(*timeout, max_ns);
> +assert(!(max_ns && progress));
> +} while (elapsed_time < max_ns && !atomic_read(&ctx->poll_disable_cnt));
>  
>  /* If time has passed with no successful polling, adjust *timeout to
>   * keep the same ending time.
> @@ -585,8 +590,7 @@ static bool run_poll_handlers(AioContext *ctx, int64_t 
> max_ns, int64_t *timeout)
>   */
>  static bool try_poll_mode(AioContext *ctx, int64_t *timeout)
>  {
> -/* See qemu_soonest_timeout() uint64_t hack */
> -int64_t max_ns = MIN((uint64_t)*timeout, (uint64_t)ctx->poll_ns);
> +int64_t max_ns = qemu_soonest_timeout(*timeout, ctx->poll_ns);
>  
>  if (max_ns && !atomic_read(&ctx->poll_disable_cnt)) {
>  poll_set_started(ctx, true);

Thanks, this one does the trick.

Reviewed-by: Sergio Lopez 


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] configure: Automatically fall back to TCI on non-release architectures

2019-04-10 Thread Thomas Huth
On 10/04/2019 08.07, Thomas Huth wrote:
> On 09/04/2019 21.46, Stefan Weil wrote:
>> On 05.04.19 11:16, Philippe Mathieu-Daudé wrote:
>>> On 4/5/19 11:02 AM, Daniel P. Berrangé wrote:
 On Fri, Apr 05, 2019 at 10:47:54AM +0200, Philippe Mathieu-Daudé wrote:
 Do the various crashes that you illustrate in that cover letter
 still exist today ? If so, 2 years of continued brokenness with no
 fixes would reinforce the the view that it is time to remove TCI
 from the codebase.
>>>
>>> Or find a maintainer and add tests...
>>
>> Thank you for CC'ing me. I could not spend much of my free time for QEMU
>> last year and typically will miss important messages on the list unless
>> my address is explicitly given. Nevertheless I still feel responsible
>> for TCI, and I am also listed as maintainer in MAINTAINERS.
> 
> That's great, good to know that you're still interested in TCI! ... but
> I think one of the main problems is still that we completely lack test
> coverage for TCI - the code always is in danger to bit-rot if it is not
> tested by default.
> 
> I could maybe have a try to add some test to our .gitlab-ci.yml file ...
> if you or somebody else could add one to .travis.yml, that would be
> great. Something like:
> 
>  - ./configure --enable-tcg-interpreter 
> --target-list=alpha-softmmu,ppc64-softmmu,i386-softmmu,sparc-softmmu,microblaze-softmmu,moxie-softmmu,arm-softmmu,hppa-softmmu
>  - make
>  - make tests/boot-serial-test
>  - QTEST_QEMU_BINARY="alpha-softmmu/qemu-system-alpha"  
> ./tests/boot-serial-test
>  - QTEST_QEMU_BINARY="ppc64-softmmu/qemu-system-ppc64"  
> ./tests/boot-serial-test
>  - ...

Additionally, I think it should be possible to compile with the
x86_64-linux-user target and then to run "make check-tcg" ... however,
that currently crashes with:

TODO qemu/tcg/tci.c:859: tcg_qemu_tb_exec()
qemu/tcg/tci.c:859: tcg fatal error
qemu: uncaught target signal 11 (Segmentation fault) - core dumped

 Thomas



Re: [Qemu-devel] [PATCH v5 4/6] dax: check synchronous mapping is supported

2019-04-10 Thread Jan Kara
On Wed 10-04-19 09:38:24, Pankaj Gupta wrote:
> This patch introduces 'daxdev_mapping_supported' helper
> which checks if 'MAP_SYNC' is supported with filesystem
> mapping. It also checks if corresponding dax_device is
> synchronous. Virtio pmem device is asynchronous and
> does not not support VM_SYNC. 
> 
> Suggested-by: Jan Kara 
> Signed-off-by: Pankaj Gupta 
> ---
>  include/linux/dax.h | 23 +++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index b896706a5ee9..4a2a60ffec86 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -38,6 +38,24 @@ void kill_dax(struct dax_device *dax_dev);
>  void dax_write_cache(struct dax_device *dax_dev, bool wc);
>  bool dax_write_cache_enabled(struct dax_device *dax_dev);
>  bool dax_synchronous(struct dax_device *dax_dev);
> +
> +/*
> + * Callers check if synchronous mapping is enabled for DAX file
> + * and attached dax device is also synchronous.
> + *
> + * dax_synchronous function verifies if dax device is synchronous.
> + * Currently, only virtio pmem device supports asynchronous device
> + * flush.
> + */

Thanks for the patch! I'd restructure this comment like:

/*
 * Check if given mapping is supported by the file / underlying device.
 */
> +static inline bool daxdev_mapping_supported(struct vm_area_struct *vma,
> + struct dax_device *dax_dev)
> +{

/* Everyone supports non-sync mappings */
> + if (!(vma->vm_flags & VM_SYNC))
> + return true;
/* Sync mappings are supported only for files using DAX */
> + if (!IS_DAX(file_inode(vma->vm_file)))
> + return false;
/* Underlying device must support persisting through CPU instructions */
> + return dax_synchronous(dax_dev);
> +}
>  #else
>  static inline struct dax_device *dax_get_by_host(const char *host)
>  {
> @@ -69,6 +87,11 @@ static inline bool dax_synchronous(struct dax_device 
> *dax_dev)
>  {
>   return true;
>  }
> +static inline bool daxdev_mapping_supported(struct vm_area_struct *vma,
> + struct dax_device *dax_dev)
> +{
> + return true;

This looks wrong. Shouldn't it rather be:

return !(vma->flags & VM_SYNC);

?

Honza
-- 
Jan Kara 
SUSE Labs, CR



Re: [Qemu-devel] [PATCH v5 3/6] libnvdimm: add dax_dev sync flag

2019-04-10 Thread Jan Kara
On Wed 10-04-19 09:38:23, Pankaj Gupta wrote:
> @@ -64,6 +65,10 @@ static inline bool dax_write_cache_enabled(struct 
> dax_device *dax_dev)
>  {
>   return false;
>  }
> +static inline bool dax_synchronous(struct dax_device *dax_dev)
> +{
> + return true;
> +}

Is there a need to define dax_synchronous() for !CONFIG_DAX? Because that
property of dax device is pretty much undefined and I don't see anything
needing to call it for !CONFIG_DAX...

Honza
-- 
Jan Kara 
SUSE Labs, CR



Re: [Qemu-devel] [PATCH v5 4/6] dax: check synchronous mapping is supported

2019-04-10 Thread Pankaj Gupta


> > This patch introduces 'daxdev_mapping_supported' helper
> > which checks if 'MAP_SYNC' is supported with filesystem
> > mapping. It also checks if corresponding dax_device is
> > synchronous. Virtio pmem device is asynchronous and
> > does not not support VM_SYNC.
> > 
> > Suggested-by: Jan Kara 
> > Signed-off-by: Pankaj Gupta 
> > ---
> >  include/linux/dax.h | 23 +++
> >  1 file changed, 23 insertions(+)
> > 
> > diff --git a/include/linux/dax.h b/include/linux/dax.h
> > index b896706a5ee9..4a2a60ffec86 100644
> > --- a/include/linux/dax.h
> > +++ b/include/linux/dax.h
> > @@ -38,6 +38,24 @@ void kill_dax(struct dax_device *dax_dev);
> >  void dax_write_cache(struct dax_device *dax_dev, bool wc);
> >  bool dax_write_cache_enabled(struct dax_device *dax_dev);
> >  bool dax_synchronous(struct dax_device *dax_dev);
> > +
> > +/*
> > + * Callers check if synchronous mapping is enabled for DAX file
> > + * and attached dax device is also synchronous.
> > + *
> > + * dax_synchronous function verifies if dax device is synchronous.
> > + * Currently, only virtio pmem device supports asynchronous device
> > + * flush.
> > + */
> 
> Thanks for the patch! I'd restructure this comment like:
> 
> /*
>  * Check if given mapping is supported by the file / underlying device.
>  */

Sure.

> > +static inline bool daxdev_mapping_supported(struct vm_area_struct *vma,
> > +   struct dax_device *dax_dev)
> > +{
> 
>   /* Everyone supports non-sync mappings */
> > +   if (!(vma->vm_flags & VM_SYNC))
> > +   return true;
>   /* Sync mappings are supported only for files using DAX */
> > +   if (!IS_DAX(file_inode(vma->vm_file)))
> > +   return false;
>   /* Underlying device must support persisting through CPU instructions */
> > +   return dax_synchronous(dax_dev);
> > +}
> >  #else
> >  static inline struct dax_device *dax_get_by_host(const char *host)
> >  {
> > @@ -69,6 +87,11 @@ static inline bool dax_synchronous(struct dax_device
> > *dax_dev)
> >  {
> > return true;
> >  }
> > +static inline bool daxdev_mapping_supported(struct vm_area_struct *vma,
> > +   struct dax_device *dax_dev)
> > +{
> > +   return true;
> 
> This looks wrong. Shouldn't it rather be:
> 
>   return !(vma->flags & VM_SYNC);
> 
> ?

Right. I will correct this.

Thanks,
Pankaj

> 
>   Honza
> --
> Jan Kara 
> SUSE Labs, CR
> 



Re: [Qemu-devel] [PATCH] migration: savevm: fix error code with migration blockers

2019-04-10 Thread Dr. David Alan Gilbert
* Cole Robinson (crobi...@redhat.com) wrote:
> The only caller that checks the error code is looking for != 0,
> so returning false is incorrect.
> 
> Fixes: 5aaac467938 "migration: savevm: consult migration blockers"
> 
> Signed-off-by: Cole Robinson 

Thanks, this was in 3.1.0 so isn't a 4.0 regression, so it'll have
to wait for 4.1.


Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/savevm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 34bcad3807..a3dae4cf02 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2544,7 +2544,7 @@ int save_snapshot(const char *name, Error **errp)
>  AioContext *aio_context;
>  
>  if (migration_is_blocked(errp)) {
> -return false;
> +return ret;
>  }
>  
>  if (!replay_can_snapshot()) {
> -- 
> 2.21.0
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v5 3/6] libnvdimm: add dax_dev sync flag

2019-04-10 Thread Pankaj Gupta


> 
> On Wed 10-04-19 09:38:23, Pankaj Gupta wrote:
> > @@ -64,6 +65,10 @@ static inline bool dax_write_cache_enabled(struct
> > dax_device *dax_dev)
> >  {
> > return false;
> >  }
> > +static inline bool dax_synchronous(struct dax_device *dax_dev)
> > +{
> > +   return true;
> > +}
> 
> Is there a need to define dax_synchronous() for !CONFIG_DAX? Because that
> property of dax device is pretty much undefined and I don't see anything
> needing to call it for !CONFIG_DAX...

You are right. Will remove this.

Thanks for the review.

Best regards,
Pankaj

> 
>   Honza
> --
> Jan Kara 
> SUSE Labs, CR
> 



Re: [Qemu-devel] [RFC PATCH] hw/arm/virt: use variable size of flash device to save memory

2019-04-10 Thread Xiang Zheng
Hi Kevin,

On 2019/4/9 16:28, Kevin Wolf wrote:
> Am 09.04.2019 um 08:01 hat Markus Armbruster geschrieben:
>> László's last sentence below is "This really needs the attention of the
>> block people."  Cc'ing some.
>>
>> Laszlo Ersek  writes:
>>
>>> On 04/08/19 15:43, Xiang Zheng wrote:

 On 2019/4/3 23:35, Laszlo Ersek wrote:
>> I thought about your comments and wrote the following patch (just for 
>> test)
>> which uses a file mapping to replace the anonymous mapping. UEFI seems 
>> to work
>> fine. So why not use a file mapping to read or write a pflash device?
> Honestly I can't answer "why not do this". Maybe we should.
>
> Regarding "why not do this *exactly as shown below*" -- probably because
> then we'd have updates to the same underlying regular file via both
> mmap() and write(), and the interactions between those are really tricky
> (= best avoided).
>
> One of my favorite questions over the years used to be posting a matrix
> of possible mmap() and file descriptor r/w/sync interactions, with the
> outcomes as they were "implied" by POSIX, and then asking at the bottom,
> "is my understanding correct?" I've never ever received an answer to
> that. :)

 In my opinion, it's feasible to r/w/sync the memory devices which use a 
 block
 backend via mmap() and write().
>>>
>>> Maybe. I think that would be a first in QEMU, and you'd likely have to
>>> describe and follow a semi-formal model between fd actions and mmap()
>>> actions.
>>>
>>> Here's the (unconfirmed) table I referred to earlier.
>>>
>>> +-+-+
>>> | change made | change visible via  |
>>> | through +-+-+-+
>>> | | MAP_SHARED  | MAP_PRIVATE | read()  |
>>> +-+-+-+-+
>>> | MAP_SHARED  | yes | unspecified | depends on MS_SYNC, |
>>> | | | | MS_ASYNC, or normal |
>>> | | | | system activity |
>>> +-+-+-+-+
>>> | MAP_PRIVATE | no  | no  | no  |
>>> +-+-+-+-+
>>> | write() | depends on  | unspecified | yes |
>>> | | MS_INVALIDATE,  | | |
>>> | | or the system's | | |
>>> | | read/write  | | |
>>> | | consistency | | |
>>> +-+-+-+-+
>>>
>>> Presumably:
>>>
>>> - a write() to some offset range of a regular file should be visible in
>>> a MAP_SHARED mapping of that range after a matching msync(...,
>>> MS_INVALIDATE) call;
>>>
>>> - changes through a MAP_SHARED mapping to a regular file should be
>>> visible via read() after a matching msync(..., MS_SYNC) call returns.
>>>
>>> I sent this table first in 2010 to the Austin Group mailing list, and
>>> received no comments. Then another person (on the same list) asked
>>> basically the same questions in 2013, to which I responded with the
>>> above assumptions / interpretations -- and received no comments from
>>> third parties again.
>>>
>>> But, I'm really out of my depth here -- we're not even discussing
>>> generic read()/write()/mmap()/munmap()/msync() interactions, but how
>>> they would fit into the qemu block layer. And I have no idea.
> 
> There is no infrastructure in the block layer for mmapping image files.
> 
> Also... although we don't really use them in practice, some QCOW2
> features for pflash block backends are -- or would be -- nice. (Not the
> internal snapshots or the live RAM dumps, of course.)

 Regarding sparse files, can we read actual backend size data for the 
 read-only
 pflash memory as the following steps shown?

 1) Create a sparse file -- QEMU_EFI-test.raw:
dd if=/dev/zero of=QEMU_EFI-test.raw bs=1M seek=64 count=0

 2) Read from QEMU_EFI.fd and write to QEMU_EFI-test.raw:
dd of="QEMU_EFI-test.raw" if="QEMU_EFI.fd" conv=notrunc

 3) Use QEMU_EFI-test.raw as the read-only pflash and start qemu with the 
 below
 patch applied:

 ---8>---

 diff --git a/hw/block/block.c b/hw/block/block.c
 index bf56c76..ad18d5e 100644
 --- a/hw/block/block.c
 +++ b/hw/block/block.c
 @@ -30,7 +30,7 @@
  bool blk_check_size_and_read_all(BlockBackend *blk, void *buf, hwaddr 
 size,
   Error **errp)
  {
 -int64_t blk_len;
 +int64_t blk_len, actual_len;
  int ret;

  blk_

Re: [Qemu-devel] [PATCH v4 17/20] console: make screendump asynchronous

2019-04-10 Thread Gerd Hoffmann
> +static void qmp_screendump_finish(QemuConsole *con, struct qmp_screendump 
> *dump)
> +{
> +Error *err = NULL;
> +DisplaySurface *surface;
> +Monitor *prev_mon = cur_mon;

Why this is needed?

> +/*
> + * FIXME: async save with coroutine? it would have to copy or
> + * lock the surface.
> + */
> +ppm_save(dump->filename, surface, &err);

DisplaySurface is just a thin layer above pixman images these days.
Pixman images are reference counted, so you can
pixman_image_ref(surface->image) to make sure it doesn't disappear
underneath you, then pass the pixman image to ppm_save.

cheers,
  Gerd




Re: [Qemu-devel] [PATCH v4 16/20] console: add graphic_hw_update_done()

2019-04-10 Thread Gerd Hoffmann
On Tue, Apr 09, 2019 at 06:10:05PM +0200, Marc-André Lureau wrote:
> Add a function to be called when a graphic update is done.
> 
> Declare the QXL renderer as async: render_update_cookie_num counts the
> number of outstanding updates, and graphic_hw_update_done() is called
> when it reaches none.

Reviewed-by: Gerd Hoffmann 




Re: [Qemu-devel] [PATCH v4 8/8] hw/arm/boot: Expose the PC-DIMM nodes in the DT

2019-04-10 Thread Shameerali Kolothum Thodi

> -Original Message-
> From: Laszlo Ersek [mailto:ler...@redhat.com]
> Sent: 09 April 2019 16:09
> To: Shameerali Kolothum Thodi ;
> qemu-devel@nongnu.org; qemu-...@nongnu.org; eric.au...@redhat.com;
> imamm...@redhat.com
> Cc: peter.mayd...@linaro.org; shannon.zha...@gmail.com;
> sa...@linux.intel.com; sebastien.bo...@intel.com; xuwei (O)
> ; ard.biesheu...@linaro.org; Linuxarm
> 
> Subject: Re: [PATCH v4 8/8] hw/arm/boot: Expose the PC-DIMM nodes in the
> DT
> 
> On 04/09/19 12:29, Shameer Kolothum wrote:
> > This patch adds memory nodes corresponding to PC-DIMM regions.
> > This will enable support for cold plugged device memory for Guests
> > with DT boot.
> >
> > Signed-off-by: Shameer Kolothum 
> > Signed-off-by: Eric Auger 
> > ---
> >  hw/arm/boot.c | 42 ++
> >  1 file changed, 42 insertions(+)
> >
> > diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> > index 8c840ba..150e1ed 100644
> > --- a/hw/arm/boot.c
> > +++ b/hw/arm/boot.c
> > @@ -19,6 +19,7 @@
> >  #include "sysemu/numa.h"
> >  #include "hw/boards.h"
> >  #include "hw/loader.h"
> > +#include "hw/mem/memory-device.h"
> >  #include "elf.h"
> >  #include "sysemu/device_tree.h"
> >  #include "qemu/config-file.h"
> > @@ -538,6 +539,41 @@ static void fdt_add_psci_node(void *fdt)
> >  qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
> >  }
> >
> > +static int fdt_add_hotpluggable_memory_nodes(void *fdt,
> > + uint32_t acells,
> uint32_t scells) {
> > +MemoryDeviceInfoList *info, *info_list = qmp_memory_device_list();
> > +MemoryDeviceInfo *mi;
> > +int ret = 0;
> > +
> > +for (info = info_list; info != NULL; info = info->next) {
> > +mi = info->value;
> > +switch (mi->type) {
> > +case MEMORY_DEVICE_INFO_KIND_DIMM:
> > +{
> > +PCDIMMDeviceInfo *di = mi->u.dimm.data;
> > +
> > +ret = fdt_add_memory_node(fdt, acells, di->addr, scells,
> > +  di->size, di->node, true);
> > +if (ret) {
> > +fprintf(stderr,
> > +"couldn't add PCDIMM /memory@%"PRIx64"
> node\n",
> > +di->addr);
> > +goto out;
> > +}
> > +break;
> > +}
> > +default:
> > +fprintf(stderr, "%s memory nodes are not yet supported\n",
> > +MemoryDeviceInfoKind_str(mi->type));
> > +ret = -ENOENT;
> > +goto out;
> > +}
> > +}
> > +out:
> > +qapi_free_MemoryDeviceInfoList(info_list);
> > +return ret;
> > +}
> > +
> >  int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
> >   hwaddr addr_limit, AddressSpace *as)
> >  {
> > @@ -637,6 +673,12 @@ int arm_load_dtb(hwaddr addr, const struct
> arm_boot_info *binfo,
> >  }
> >  }
> >
> > +rc = fdt_add_hotpluggable_memory_nodes(fdt, acells, scells);
> > +if (rc < 0) {
> > +fprintf(stderr, "couldn't add hotpluggable memory nodes\n");
> > +goto fail;
> > +}
> > +
> >  rc = fdt_path_offset(fdt, "/chosen");
> >  if (rc < 0) {
> >  qemu_fdt_add_subnode(fdt, "/chosen");
> >
> 
> 
> Given patches #7 and #8, as I understand them, the firmware cannot
> distinguish hotpluggable & present, from hotpluggable & absent. The firmware
> can only skip both hotpluggable cases. That's fine in that the firmware will 
> hog
> neither type -- but is that OK for the OS as well, for both ACPI boot and DT
> boot?

Right. This only handles the hotpluggable-and-present condition.

> Consider in particular the "hotpluggable & present, ACPI boot" case. Assuming
> we modify the firmware to skip "hotpluggable" altogether, the UEFI memmap
> will not include the range despite it being present at boot. Presumably, ACPI
> will refer to the range somehow, however. Will that not confuse the OS?

From my testing so far, without patches #7 and #8(ie, no UEFI memmap entry),
ACPI boots fine. I think ACPI only relies on aml and SRAT. 
 
> When Igor raised this earlier, I suggested that hotpluggable-and-present
> should be added by the firmware, but also allocated immediately, as
> EfiBootServicesData type memory. This will prevent other drivers in the
> firmware from allocating AcpiNVS or Reserved chunks from the same memory
> range, the UEFI memmap will contain the range as EfiBootServicesData, and
> then the OS can release that allocation in one go early during boot.

Ok. Agree that hotpluggable-and-present case it may make sense to make use of
that memory rather than just hiding it altogether.

On another note, Does hotpluggable-and-absent case make any valid use case for
a DT boot? I am not sure how we can hot-add memory in the case of DT boot later.
I have not verified the sysfs probe interface yet and there are discussions of 
dropping
that interface altogether.

> But this really has to be clari

Re: [Qemu-devel] [PATCH 1/2] qemu-img convert: ignore read errors

2019-04-10 Thread Vladimir Sementsov-Ogievskiy
09.04.2019 17:56, Andrey Shinkevich wrote:
> The 'qemu-img convert' new command option 'force read' with the key '-R'
> allows converting a damaged image to get all the available information
> in case of the read errors. The program reports read errors and continue
> the image conversion. The users should keep in their minds that the
> resulting image is inconsistent.
> 
> Suggested-by: Denis V. Lunev 
> Signed-off-by: Andrey Shinkevich 
> ---
>   qemu-img-cmds.hx |  4 ++--
>   qemu-img.c   | 18 --
>   qemu-img.texi|  2 +-
>   3 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
> index 1526f32..09af9cb 100644
> --- a/qemu-img-cmds.hx
> +++ b/qemu-img-cmds.hx
> @@ -44,9 +44,9 @@ STEXI
>   ETEXI
>   
>   DEF("convert", img_convert,
> -"convert [--object objectdef] [--image-opts] [--target-image-opts] [-U] 
> [-C] [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] 
> [-B backing_file] [-o options] [-l snapshot_param] [-S sparse_size] [-m 
> num_coroutines] [-W] filename [filename2 [...]] output_filename")
> +"convert [--object objectdef] [--image-opts] [--target-image-opts] [-U] 
> [-R] [-C] [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O 
> output_fmt] [-B backing_file] [-o options] [-l snapshot_param] [-S 
> sparse_size] [-m num_coroutines] [-W] filename [filename2 [...]] 
> output_filename")
>   STEXI
> -@item convert [--object @var{objectdef}] [--image-opts] 
> [--target-image-opts] [-U] [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t 
> @var{cache}] [-T @var{src_cache}] [-O @var{output_fmt}] [-B 
> @var{backing_file}] [-o @var{options}] [-l @var{snapshot_param}] [-S 
> @var{sparse_size}] [-m @var{num_coroutines}] [-W] @var{filename} 
> [@var{filename2} [...]] @var{output_filename}
> +@item convert [--object @var{objectdef}] [--image-opts] 
> [--target-image-opts] [-U] [-R] [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t 
> @var{cache}] [-T @var{src_cache}] [-O @var{output_fmt}] [-B 
> @var{backing_file}] [-o @var{options}] [-l @var{snapshot_param}] [-S 
> @var{sparse_size}] [-m @var{num_coroutines}] [-W] @var{filename} 
> [@var{filename2} [...]] @var{output_filename}
>   ETEXI
>   
>   DEF("create", img_create,
> diff --git a/qemu-img.c b/qemu-img.c
> index aa6f81f..494e880 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -175,6 +175,7 @@ static void QEMU_NORETURN help(void)
>  "Parameters to convert subcommand:\n"
>  "  '-m' specifies how many coroutines work in parallel during 
> the convert\n"
>  "   process (defaults to 8)\n"
> +   "  '-R' continue the image conversion in case of the sector read 
> error\n"
>  "  '-W' allow to write to the target out of order rather than 
> sequential\n"
>  "\n"
>  "Parameters to snapshot subcommand:\n"
> @@ -1579,6 +1580,7 @@ typedef struct ImgConvertState {
>   int64_t wait_sector_num[MAX_COROUTINES];
>   CoMutex lock;
>   int ret;
> +bool force_read;
>   } ImgConvertState;
>   
>   static void convert_select_part(ImgConvertState *s, int64_t sector_num,
> @@ -1693,7 +1695,15 @@ static int coroutine_fn 
> convert_co_read(ImgConvertState *s, int64_t sector_num,
>   blk, (sector_num - src_cur_offset) << BDRV_SECTOR_BITS,
>   n << BDRV_SECTOR_BITS, &qiov, 0);
>   if (ret < 0) {
> -return ret;
> +if (!s->force_read) {
> +return ret;
> +}
> +int64_t offset = (sector_num - src_cur_offset) << 
> BDRV_SECTOR_BITS;
> +unsigned int bytes = n << BDRV_SECTOR_BITS;

declarations should be at start of block and better separated by one empty line.
On the other hand, if we add these variables, I think better to declare them in 
while
block and reuse for blk_co_preadv too.

with at least this fixed (but note, also, that I'm not good in cmd line option 
defining. Looks like we
have too many places where something should be added):
Reviewed-by: Vladimir Sementsov-Ogievskiy 

> +error_report("failed to read %d bytes at offset %" PRId64 ": %s",
> + bytes, offset, strerror(-ret));
> +memset(buf, 0, n * BDRV_SECTOR_SIZE);

hmm, may be some non-zero pattern would be better.

> +/* TODO: retry to read smaller chunks */
>   }
>   
>   sector_num += n;
> @@ -2018,6 +2028,7 @@ static int img_convert(int argc, char **argv)
>   .buf_sectors= IO_BUF_SIZE / BDRV_SECTOR_SIZE,
>   .wr_in_order= true,
>   .num_coroutines = 8,
> +.force_read = false,

It is initialized by zero by default, as all other fields. ".copy_range = 
false" is
bad example here, I think.

>   };
>   
>   for(;;) {
> @@ -2029,7 +2040,7 @@ static int img_convert(int argc, char **argv)
>   {"target-image-opts", no_argument, 0, OPTION_TARGET_IMAGE_OPTS},
>   

Re: [Qemu-devel] [PATCH 2/2] iotests: new test 253 check qemu-img convert force read

2019-04-10 Thread Vladimir Sementsov-Ogievskiy
09.04.2019 17:56, Andrey Shinkevich wrote:
> A new test for the patch 'qemu-img convert: ignore read errors'
> 
> Signed-off-by: Andrey Shinkevich 
> ---
>   tests/qemu-iotests/253 | 69 
> ++
>   tests/qemu-iotests/253.out |  4 +++
>   tests/qemu-iotests/group   |  1 +
>   3 files changed, 74 insertions(+)
>   create mode 100755 tests/qemu-iotests/253
>   create mode 100644 tests/qemu-iotests/253.out
> 
> diff --git a/tests/qemu-iotests/253 b/tests/qemu-iotests/253
> new file mode 100755
> index 000..671a3c4
> --- /dev/null
> +++ b/tests/qemu-iotests/253
> @@ -0,0 +1,69 @@
> +#!/usr/bin/env python
> +#
> +# Test of the 'qemu-img convert' for a damaged image.
> +#
> +# Copyright (c) 2019 Virtuozzo International GmbH
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see .
> +#
> +
> +import iotests
> +import os
> +from iotests import qemu_img_create, qemu_io, qemu_img_pipe, \
> +file_path, log
> +
> +iotests.verify_image_format(supported_fmts=['qcow2'])
> +
> +disk = file_path('disk')
> +chunk = 256 * 1024
> +bad_sector = 768
> +blkdebug_file = os.path.join(iotests.test_dir, 'blkdebug.conf')
> +
> +
> +def create_blkdebug_file():
> +file = open(blkdebug_file, 'w')
> +file.write('''
> +[inject-error]
> +event = "read_aio"
> +errno = "5"
> +once ="off"
> +sector = "%d"
> +''' % (bad_sector))
> +file.close()
> +
> +
> +def write_to_disk(offset, size):
> +write = 'write -P 7 {} {}'.format(offset, size)
> +qemu_io('-c', write, disk)
> +
> +
> +qemu_img_create('-f', iotests.imgfmt, disk, '1M')
> +
> +for num in range(1, 4):
> +write_to_disk((num-1) * chunk, chunk)
> +
> +create_blkdebug_file()
> +disk_converted = disk + '_converted'
> +log(qemu_img_pipe('convert', '-O', iotests.imgfmt, 'blkdebug:%s:%s'
> +  % (blkdebug_file, disk), disk_converted))
> +log(qemu_img_pipe('convert', '-O', iotests.imgfmt, '-R', 'blkdebug:%s:%s'
> +  % (blkdebug_file, disk), disk_converted))
> +
> +# TODO: with a class, self.assertNotEqual(
> +#   qemu_io('-f', iotests.imgfmt, '-rU', '-c', 'map', disk),
> +#   qemu_io('-f', iotests.imgfmt, '-rU', '-c', 'map', disk_converted),
> +#   'image file map matches the copied file after conversion')

You may use just assert. But maps will not be different. I think better is to
check resulting image by qemu-io read -P.

> +
> +os.remove(disk_converted)
> +os.remove(blkdebug_file)
> diff --git a/tests/qemu-iotests/253.out b/tests/qemu-iotests/253.out
> new file mode 100644
> index 000..b972541
> --- /dev/null
> +++ b/tests/qemu-iotests/253.out
> @@ -0,0 +1,4 @@
> +qemu-img: error while reading sector 0: Input/output error
> +
> +qemu-img: failed to read 786432 bytes at offset 0: Input/output error
> +
> diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
> index bae7718..4cd2fe8 100644
> --- a/tests/qemu-iotests/group
> +++ b/tests/qemu-iotests/group
> @@ -248,3 +248,4 @@
>   246 rw auto quick
>   247 rw auto quick
>   248 rw auto quick
> +253 rw auto quick
> 


-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH] migration: savevm: fix error code with migration blockers

2019-04-10 Thread Juan Quintela
Cole Robinson  wrote:
> The only caller that checks the error code is looking for != 0,
> so returning false is incorrect.
>
> Fixes: 5aaac467938 "migration: savevm: consult migration blockers"
>
> Signed-off-by: Cole Robinson 

Reviewed-by: Juan Quintela 



Re: [Qemu-devel] [PATCH for-4.1] q35: acpi: do not create dummy MCFG table

2019-04-10 Thread Igor Mammedov
On Wed, 10 Apr 2019 09:12:31 +0800
Wei Yang  wrote:

> On Tue, Apr 09, 2019 at 05:00:37PM +0200, Igor Mammedov wrote:
> >Dummy table (with signature "QEMU") creation came from original SeaBIOS
> >codebase. And QEMU would have to keep it around if there were Q35 machine
> >that depended on keeping ACPI tables blob constant size. Luckily there
> >were no versioned Q35 machine types before commit:
> >  (since 2.3) a1666142db acpi-build: make ROMs RAM blocks resizeable
> >which obsoleted need to keep ACPI tables blob the same size on 
> >source/destination.
> >  
> 
> I am not sure getting the "versioned Q35" correctly. Seems originally there is
> q35-1.4?
> 
> commit bf3caa3dc17552b323cec6831301a22cfc98ecd5
> Author: Paolo Bonzini 
> Date:   Fri Feb 8 14:06:15 2013 +0100
> 
> pc: add compatibility machine types for 1.4
> 
> Adds both pc-i440fx-1.4 and pc-q35-1.4.

current upstream has only pc-q35-2.4 as earliest versioned Q35
machine type (try to run: qemu-system-x86_64 -M help)



Re: [Qemu-devel] [PATCH for-4.1 v4 00/12] bundle edk2 platform firmware with QEMU

2019-04-10 Thread Igor Mammedov
On Wed, 10 Apr 2019 01:00:10 +0200
Laszlo Ersek  wrote:

> Repo:   https://github.com/lersek/qemu.git
> Branch: edk2_build_v4
> 
> Posting a v4 is necessary because patch #6 needs
> - manual conflict resolution against some commits between v4.0.0-rc2 and
>   v4.0.0-rc3,
> - and corresponding commit message updates.
> 
> These are noted in more detail on the patch itself:
> 
>   [PATCH for-4.1 v4 06/12]
>   roms/Makefile: replace the $(EDK2_EFIROM) target with "edk2-basetools"
> 
> Due to this update, I had to drop all previous feedback tags on this
> patch. I plan to send a pullreq as soon as 4.0 is released, so please
> re-review / retest the above patch (and *only* the above patch) soon.
> (The notes on the patch list the tags individually that I've now
> dropped.)
> 
> Version 3, that is:
> 
>   [Qemu-devel] [PATCH for-4.1 v3 00/12] bundle edk2 platform firmware with 
> QEMU

For whole series:

Reviewed-by: Igor Mammedov 

> 
> was posted at:
> 
>   https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg06148.html
>   http://mid.mail-archive.com/20190321113408.19929-1-lersek@redhat.com
> 
> Thanks,
> Laszlo
> 
> Laszlo Ersek (12):
>   roms: lift "edk2-funcs.sh" from "tests/uefi-test-tools/build.sh"
>   roms/edk2-funcs.sh: require gcc-4.8+ for building i386 and x86_64
>   tests/uefi-test-tools/build.sh: work around TianoCore#1607
>   roms/edk2: advance to tag edk2-stable201903
>   roms/edk2-funcs.sh: add the qemu_edk2_get_thread_count() function
>   roms/Makefile: replace the $(EDK2_EFIROM) target with "edk2-basetools"
>   roms: build edk2 firmware binaries and variable store templates
>   pc-bios: add edk2 firmware binaries and variable store templates
>   pc-bios: document the edk2 firmware images; add firmware descriptors
>   tests: add missing dependency to build QTEST_QEMU_BINARY, round 2
>   Makefile: install the edk2 firmware images and their descriptors
>   MAINTAINERS: add the "EDK2 Firmware" subsystem
> 
>  .gitignore |   1 +
>  MAINTAINERS|  12 +
>  Makefile   |  29 ++-
>  configure  |   1 +
>  pc-bios/README |  11 +
>  pc-bios/descriptors/50-edk2-i386-secure.json   |  34 +++
>  pc-bios/descriptors/50-edk2-x86_64-secure.json |  35 +++
>  pc-bios/descriptors/60-edk2-aarch64.json   |  31 +++
>  pc-bios/descriptors/60-edk2-arm.json   |  31 +++
>  pc-bios/descriptors/60-edk2-i386.json  |  33 +++
>  pc-bios/descriptors/60-edk2-x86_64.json|  34 +++
>  pc-bios/edk2-aarch64-code.fd.bz2   | Bin 0 -> 1177603 bytes
>  pc-bios/edk2-arm-code.fd.bz2   | Bin 0 -> 1173662 bytes
>  pc-bios/edk2-arm-vars.fd.bz2   | Bin 0 -> 263 bytes
>  pc-bios/edk2-i386-code.fd.bz2  | Bin 0 -> 1688659 bytes
>  pc-bios/edk2-i386-secure-code.fd.bz2   | Bin 0 -> 1881979 bytes
>  pc-bios/edk2-i386-vars.fd.bz2  | Bin 0 -> 190 bytes
>  pc-bios/edk2-licenses.txt  | 209 
>  pc-bios/edk2-x86_64-code.fd.bz2| Bin 0 -> 1669280 bytes
>  pc-bios/edk2-x86_64-secure-code.fd.bz2 | Bin 0 -> 1901210 bytes
>  roms/Makefile  |   9 +-
>  roms/Makefile.edk2 | 148 
>  roms/edk2  |   2 +-
>  roms/edk2-build.sh |  55 +
>  roms/edk2-funcs.sh | 253 
>  tests/Makefile.include |   2 +-
>  tests/uefi-test-tools/build.sh | 100 +---
>  27 files changed, 934 insertions(+), 96 deletions(-)
>  create mode 100644 pc-bios/descriptors/50-edk2-i386-secure.json
>  create mode 100644 pc-bios/descriptors/50-edk2-x86_64-secure.json
>  create mode 100644 pc-bios/descriptors/60-edk2-aarch64.json
>  create mode 100644 pc-bios/descriptors/60-edk2-arm.json
>  create mode 100644 pc-bios/descriptors/60-edk2-i386.json
>  create mode 100644 pc-bios/descriptors/60-edk2-x86_64.json
>  create mode 100644 pc-bios/edk2-aarch64-code.fd.bz2
>  create mode 100644 pc-bios/edk2-arm-code.fd.bz2
>  create mode 100644 pc-bios/edk2-arm-vars.fd.bz2
>  create mode 100644 pc-bios/edk2-i386-code.fd.bz2
>  create mode 100644 pc-bios/edk2-i386-secure-code.fd.bz2
>  create mode 100644 pc-bios/edk2-i386-vars.fd.bz2
>  create mode 100644 pc-bios/edk2-licenses.txt
>  create mode 100644 pc-bios/edk2-x86_64-code.fd.bz2
>  create mode 100644 pc-bios/edk2-x86_64-secure-code.fd.bz2
>  create mode 100644 roms/Makefile.edk2
>  create mode 100755 roms/edk2-build.sh
>  create mode 100644 roms/edk2-funcs.sh
> 




[Qemu-devel] [PATCH] migration: Fix handling fd protocol

2019-04-10 Thread Yury Kotov
Currently such case is possible for incoming:
QMP: add-fd (fdset = 0, fd of some file):
adds fd to fdset 0 and returns QEMU's fd (e.g. 33)
QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc
...
Incoming migration completes and unrefs ioc -> close(33)
QMP: remove-fd (fdset = 0, fd = 33):
removes fd from fdset 0 and qemu_close() -> close(33) => double close

For outgoing migration the case is the same but getfd instead of add-fd.
Fix it by duping client's fd.

Signed-off-by: Yury Kotov 
---
 migration/fd.c | 39 +++
 migration/trace-events |  4 ++--
 2 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/migration/fd.c b/migration/fd.c
index a7c13df4ad..c9ff07ac41 100644
--- a/migration/fd.c
+++ b/migration/fd.c
@@ -15,6 +15,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qapi/error.h"
 #include "channel.h"
 #include "fd.h"
 #include "migration.h"
@@ -26,15 +27,27 @@
 void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error 
**errp)
 {
 QIOChannel *ioc;
-int fd = monitor_get_fd(cur_mon, fdname, errp);
+int fd, dup_fd;
+
+fd = monitor_get_fd(cur_mon, fdname, errp);
 if (fd == -1) {
 return;
 }
 
-trace_migration_fd_outgoing(fd);
-ioc = qio_channel_new_fd(fd, errp);
+/* fd is previously created by qmp command 'getfd',
+ * so client is responsible to close it. Dup it to save original value from
+ * QIOChannel's destructor */
+dup_fd = qemu_dup(fd);
+if (dup_fd == -1) {
+error_setg(errp, "Cannot dup fd %s: %s (%d)", fdname, strerror(errno),
+   errno);
+return;
+}
+
+trace_migration_fd_outgoing(fd, dup_fd);
+ioc = qio_channel_new_fd(dup_fd, errp);
 if (!ioc) {
-close(fd);
+close(dup_fd);
 return;
 }
 
@@ -55,14 +68,24 @@ static gboolean fd_accept_incoming_migration(QIOChannel 
*ioc,
 void fd_start_incoming_migration(const char *infd, Error **errp)
 {
 QIOChannel *ioc;
-int fd;
+int fd, dup_fd;
 
 fd = strtol(infd, NULL, 0);
-trace_migration_fd_incoming(fd);
 
-ioc = qio_channel_new_fd(fd, errp);
+/* fd is previously created by qmp command 'add-fd' or something else,
+ * so client is responsible to close it. Dup it to save original value from
+ * QIOChannel's destructor */
+dup_fd = qemu_dup(fd);
+if (dup_fd == -1) {
+error_setg(errp, "Cannot dup fd %d: %s (%d)", fd, strerror(errno),
+   errno);
+return;
+}
+
+trace_migration_fd_incoming(fd, dup_fd);
+ioc = qio_channel_new_fd(dup_fd, errp);
 if (!ioc) {
-close(fd);
+close(dup_fd);
 return;
 }
 
diff --git a/migration/trace-events b/migration/trace-events
index de2e136e57..d2d30a6b3c 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -258,8 +258,8 @@ migration_exec_outgoing(const char *cmd) "cmd=%s"
 migration_exec_incoming(const char *cmd) "cmd=%s"
 
 # fd.c
-migration_fd_outgoing(int fd) "fd=%d"
-migration_fd_incoming(int fd) "fd=%d"
+migration_fd_outgoing(int fd, int dup_fd) "fd=%d dup_fd=%d"
+migration_fd_incoming(int fd, int dup_fd) "fd=%d dup_fd=%d"
 
 # socket.c
 migration_socket_incoming_accepted(void) ""
-- 
2.21.0




Re: [Qemu-devel] [PULL 0/1] device-tree queue

2019-04-10 Thread Peter Maydell
On Wed, 10 Apr 2019 at 00:55, Alistair Francis  wrote:
>
> The following changes since commit f151f8aca5cf5da24f6eb743a55a2233091ae532:
>
>   migration/ram.c: Fix use-after-free in multifd_recv_unfill_packet() 
> (2019-04-09 20:46:34 +0100)
>
> are available in the Git repository at:
>
>   g...@github.com:alistair23/qemu.git tags/pull-device-tree-20190409-1
>
> for you to fetch changes up to 065e6298a75164b4347682b63381dbe752c2b156:
>
>   device_tree: Fix integer overflowing in load_device_tree() (2019-04-09 
> 16:35:40 -0700)
>
> 
> Single device tree fix for 4.0
>
> A single patch to avoid an overflow when loading device trees.
>
> 
> Markus Armbruster (1):
>   device_tree: Fix integer overflowing in load_device_tree()

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.0
for any user-visible changes.

-- PMM



Re: [Qemu-devel] [PATCH] migration: Fix handling fd protocol

2019-04-10 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20190410092652.22616-1-yury-ko...@yandex-team.ru/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
time make docker-test-mingw@fedora SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  migration/xbzrle.o
  CC  migration/postcopy-ram.o
/tmp/qemu-test/src/migration/fd.c: In function 'fd_start_outgoing_migration':
/tmp/qemu-test/src/migration/fd.c:40:14: error: implicit declaration of 
function 'qemu_dup'; did you mean 'qemu_hexdump'? 
[-Werror=implicit-function-declaration]
 dup_fd = qemu_dup(fd);
  ^~~~
  qemu_hexdump
/tmp/qemu-test/src/migration/fd.c:40:14: error: nested extern declaration of 
'qemu_dup' [-Werror=nested-externs]
cc1: all warnings being treated as errors
make: *** [/tmp/qemu-test/src/rules.mak:69: migration/fd.o] Error 1
make: *** Waiting for unfinished jobs


The full log is available at
http://patchew.org/logs/20190410092652.22616-1-yury-ko...@yandex-team.ru/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-devel] [PATCH for-4.1 v4 11/12] Makefile: install the edk2 firmware images and their descriptors

2019-04-10 Thread Philippe Mathieu-Daudé
On 4/10/19 1:00 AM, Laszlo Ersek wrote:
> Decompress and install the edk2 firmware blobs as part of "make install",
> unless blob installation was disabled with configure's "--disable-blobs"
> option.
> 
> Additionally, decompress the blobs as a pre-requisite for building softmmu
> binaries -- this is helpful for both "make check" and other ad-hoc tests
> one might want to run in the build directory.
> 
> Signed-off-by: Laszlo Ersek 
> Reviewed-by: Michal Privoznik 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Michael S. Tsirkin 

Tested-by: Philippe Mathieu-Daudé 

Note that this job (with install) on Travis takes <30min:
https://travis-ci.org/philmd/qemu/jobs/518117217

> ---
> 
> Notes:
> v4:
> 
> - no change
> 
> v3:
> 
> - pick up Michal's R-b
> 
> - pick up Phil's R-b
> 
> - pick up Michael's R-b
> 
> - Decompress fd.bz2 files with bzip2 rather than fd.xz files with xz, so
>   that decompression at "make install" time succeed on older build OSes
>   too [Peter]. Note that "BUNZIP2" matches the name of the actual
>   command "bunzip2" (i.e. it is sensible), and that it consists of 7
>   characters, satisfying the quiet-command limit.
> 
> - do not pick up Phil's T-b, consequently
> 
> - do not pick up Igor's T-b for the same reason
> 
> v2:
> 
> - adapt to tracking the edk2 flash device files in compressed form [Dan,
>   Michael, Phil]
> 
> - do not pick up Michal's and Michael's R-b's due to the above change
> 
>  configure  |  1 +
>  Makefile   | 29 +++-
>  .gitignore |  1 +
>  3 files changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/configure b/configure
> index 1c563a70276a..e1ad87b697e0 100755
> --- a/configure
> +++ b/configure
> @@ -7892,6 +7892,7 @@ for bios_file in \
>  $source_path/pc-bios/*.img \
>  $source_path/pc-bios/openbios-* \
>  $source_path/pc-bios/u-boot.* \
> +$source_path/pc-bios/edk2-*.fd.bz2 \
>  $source_path/pc-bios/palcode-*
>  do
>  LINKS="$LINKS pc-bios/$(basename $bios_file)"
> diff --git a/Makefile b/Makefile
> index 04a0d4505085..626a04d305cc 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -296,6 +296,10 @@ ui/input-keymap-%.c: $(KEYCODEMAP_GEN) $(KEYCODEMAP_CSV) 
> $(SRC_PATH)/ui/Makefile
>  $(KEYCODEMAP_GEN): .git-submodule-status
>  $(KEYCODEMAP_CSV): .git-submodule-status
>  
> +edk2-decompressed = $(basename $(wildcard pc-bios/edk2-*.fd.bz2))
> +pc-bios/edk2-%.fd: pc-bios/edk2-%.fd.bz2
> + $(call quiet-command,bzip2 -d -c $< > $@,"BUNZIP2",$<)
> +
>  # Don't try to regenerate Makefile or configure
>  # We don't generate any of them
>  Makefile: ;
> @@ -445,6 +449,7 @@ $(SOFTMMU_SUBDIR_RULES): $(block-obj-y)
>  $(SOFTMMU_SUBDIR_RULES): $(crypto-obj-y)
>  $(SOFTMMU_SUBDIR_RULES): $(io-obj-y)
>  $(SOFTMMU_SUBDIR_RULES): config-all-devices.mak
> +$(SOFTMMU_SUBDIR_RULES): $(edk2-decompressed)
>  
>  subdir-%:
>   $(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C $* V="$(V)" 
> TARGET_DIR="$*/" all,)
> @@ -633,6 +638,7 @@ clean:
>   ! -path ./roms/edk2/ArmPkg/Library/GccLto/liblto-arm.a \
>   ! -path ./roms/edk2/BaseTools/Source/Python/UPT/Dll/sqlite3.dll 
> \
>   -exec rm {} +
> + rm -f $(edk2-decompressed)
>   rm -f $(filter-out %.tlb,$(TOOLS)) $(HELPERS-y) qemu-ga TAGS cscope.* 
> *.pod *~ */*~
>   rm -f fsdev/*.pod scsi/*.pod
>   rm -f qemu-img-cmds.h
> @@ -723,9 +729,14 @@ spapr-rtas.bin slof.bin skiboot.lid \
>  palcode-clipper \
>  u-boot.e500 u-boot-sam460-20100605.bin \
>  qemu_vga.ndrv \
> +edk2-licenses.txt \
>  hppa-firmware.img
> +
> +DESCS=50-edk2-i386-secure.json 50-edk2-x86_64-secure.json \
> +60-edk2-aarch64.json 60-edk2-arm.json 60-edk2-i386.json 60-edk2-x86_64.json
>  else
>  BLOBS=
> +DESCS=
>  endif
>  
>  # Note that we manually filter-out the non-Sphinx documentation which
> @@ -786,7 +797,8 @@ endif
>  
>  ICON_SIZES=16x16 24x24 32x32 48x48 64x64 128x128 256x256 512x512
>  
> -install: all $(if $(BUILD_DOCS),install-doc) install-datadir 
> install-localstatedir
> +install: all $(if $(BUILD_DOCS),install-doc) install-datadir 
> install-localstatedir \
> + $(if $(INSTALL_BLOBS),$(edk2-decompressed))
>  ifneq ($(TOOLS),)
>   $(call install-prog,$(subst 
> qemu-ga,qemu-ga$(EXESUF),$(TOOLS)),$(DESTDIR)$(bindir))
>  endif
> @@ -808,6 +820,21 @@ ifneq ($(BLOBS),)
>   set -e; for x in $(BLOBS); do \
>   $(INSTALL_DATA) $(SRC_PATH)/pc-bios/$$x 
> "$(DESTDIR)$(qemu_datadir)"; \
>   done
> +endif
> +ifdef INSTALL_BLOBS
> + set -e; for x in $(edk2-decompressed); do \
> + $(INSTALL_DATA) $$x "$(DESTDIR)$(qemu_datadir)"; \
> + done
> +endif
> +ifneq ($(DESCS),)
> + $(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)/firmware"
> + set -e; tmpf=$$(mktemp); trap 'rm -f -- "$$tmpf"' EXIT; \
> + for x in $(DESCS); do \
> + sed -e 's,@DATADIR@,$(DESTDIR)$(qemu_datadir),' \
> +

Re: [Qemu-devel] [PATCH for-4.1 v4 08/12] pc-bios: add edk2 firmware binaries and variable store templates

2019-04-10 Thread Philippe Mathieu-Daudé
On 4/10/19 1:00 AM, Laszlo Ersek wrote:
> Add the files built by the last patch: (compressed) binaries, and the
> cumulative license text that covers them.
> 
> Signed-off-by: Laszlo Ersek 
> Reviewed-by: Michal Privoznik 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Michael S. Tsirkin 

Tested-by: Philippe Mathieu-Daudé 

> ---
> 
> Notes:
> v4:
> 
> - no change
> 
> v3:
> 
> - pick up Michal's R-b
> 
> - pick up Phil's R-b
> 
> - pick up Michael's R-b
> 
> - Compress FD files with bzip2 rather than xz, so that decompression at
>   "make install" time succeed on older build OSes too. The total size
>   grows from 9,394,072 bytes to 9,492,846. [Peter]
> 
> - do not pick up Igor's T-b therefore
> 
> v2:
> 
> - capture the compressed build outputs of the last patch; slightly
>   update the commit message [Dan, Michael, Phil]
> 
> - consequently, do not pick up Michal's and Michael's R-b's
> 
>  pc-bios/edk2-licenses.txt  | 209 
>  pc-bios/edk2-aarch64-code.fd.bz2   | Bin 0 -> 1177603 bytes
>  pc-bios/edk2-arm-code.fd.bz2   | Bin 0 -> 1173662 bytes
>  pc-bios/edk2-arm-vars.fd.bz2   | Bin 0 -> 263 bytes
>  pc-bios/edk2-i386-code.fd.bz2  | Bin 0 -> 1688659 bytes
>  pc-bios/edk2-i386-secure-code.fd.bz2   | Bin 0 -> 1881979 bytes
>  pc-bios/edk2-i386-vars.fd.bz2  | Bin 0 -> 190 bytes
>  pc-bios/edk2-x86_64-code.fd.bz2| Bin 0 -> 1669280 bytes
>  pc-bios/edk2-x86_64-secure-code.fd.bz2 | Bin 0 -> 1901210 bytes
>  9 files changed, 209 insertions(+)
> 
> diff --git a/pc-bios/edk2-licenses.txt b/pc-bios/edk2-licenses.txt
> new file mode 100644
> index ..8bdb1abc993e
> --- /dev/null
> +++ b/pc-bios/edk2-licenses.txt
> @@ -0,0 +1,209 @@
> +==> edk2/License.txt <==
> +Copyright (c) 2004 - 2016, Intel Corporation. All rights reserved.
> +Copyright (c) 2008 - 2010, Apple Inc. All rights reserved.
> +Copyright (c) 2011 - 2015, ARM Limited. All rights reserved.
> +Copyright (c) 2014 - 2015, Linaro Limited. All rights reserved.
> +Copyright (c) 2013 - 2015, Red Hat, Inc.
> +
> +Redistribution and use in source and binary forms, with or without
> +modification, are permitted provided that the following conditions
> +are met:
> +
> +* Redistributions of source code must retain the above copyright
> +  notice, this list of conditions and the following disclaimer.
> +* Redistributions in binary form must reproduce the above copyright
> +  notice, this list of conditions and the following disclaimer in
> +  the documentation and/or other materials provided with the
> +  distribution.
> +
> +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
> +FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> +COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
> +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
> +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
> +ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> +POSSIBILITY OF SUCH DAMAGE.
> +
> +==> edk2/OvmfPkg/License.txt <==
> +Copyright (c) 2012, Intel Corporation. All rights reserved.
> +
> +Redistribution and use in source and binary forms, with or without
> +modification, are permitted provided that the following conditions
> +are met:
> +
> +* Redistributions of source code must retain the above copyright
> +  notice, this list of conditions and the following disclaimer.
> +* Redistributions in binary form must reproduce the above copyright
> +  notice, this list of conditions and the following disclaimer in
> +  the documentation and/or other materials provided with the
> +  distribution.
> +
> +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
> +FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> +COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
> +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
> +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
> +ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> +POSSIBILITY OF SUCH DAMAGE.
> +
> +
> +Some files are subject to the following license, the MIT license. Those files
>

Re: [Qemu-devel] [PATCH for-4.1 v4 10/12] tests: add missing dependency to build QTEST_QEMU_BINARY, round 2

2019-04-10 Thread Philippe Mathieu-Daudé
On 4/10/19 1:00 AM, Laszlo Ersek wrote:
> In commit b94b330e2333 ("tests: add missing dependency to build
> QTEST_QEMU_BINARY", 2017-07-31), Phil fixed the dependency list of make
> target "check-qtest-%". Namely, the recipe would set QTEST_QEMU_BINARY to
> the softmmu emulator for the emulation target, but the prerequisites
> didn't include the emulator.
> 
> The same issue affects the "check-report-qtest-%.tap" make target, which
> is the other make target whose recipe sets QTEST_QEMU_BINARY:
> 
>> $ make -j4 check-report-qtest-aarch64.tap
>>   TAP check-report-qtest-aarch64.tap
>> sh: /.../aarch64-softmmu/qemu-system-aarch64: No such file or directory
> 
> Apply Phil's fix to this make target too.
> 
> Signed-off-by: Laszlo Ersek 
> Reviewed-by: Michal Privoznik 
> Tested-by: Igor Mammedov 
> Reviewed-by: Michael S. Tsirkin 

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 

> ---
> 
> Notes:
> v4:
> 
> - no change
> 
> v3:
> 
> - pick up Michal's R-b
> 
> - pick up Igor's T-b
> 
> - pick up Michael's R-b
> 
> v2:
> 
> - new patch, relied upon by the next patch
> 
>  tests/Makefile.include | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index 36fc73fef55a..e2432d5e7712 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -912,7 +912,7 @@ check-speed: $(check-speed-y)
>  
>  # gtester tests with TAP output
>  
> -$(patsubst %, check-report-qtest-%.tap, $(QTEST_TARGETS)): 
> check-report-qtest-%.tap: $(check-qtest-y)
> +$(patsubst %, check-report-qtest-%.tap, $(QTEST_TARGETS)): 
> check-report-qtest-%.tap: subdir-%-softmmu $(check-qtest-y)
>   $(call do_test_tap, $(check-qtest-$*-y) $(check-qtest-generic-y), \
> QTEST_QEMU_BINARY=$*-softmmu/qemu-system-$* \
> QTEST_QEMU_IMG=qemu-img$(EXESUF))
> 



[Qemu-devel] [Bug 1823831] Re: BSD bootloader halts with hypervisor.framework

2019-04-10 Thread Chen Zhang
In good versions (27e18b8952f8b7a1e26350846f8a0d5a9b33bfb8), calls to
x86_cpu_has_work(), likely due to IRQ 0, returned interchanging true or
false.

In bad versions (92d5f1a4147c3722b5e9a8bcfb7dc261b7a8b855), all calls
returned false.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1823831

Title:
  BSD bootloader halts with hypervisor.framework

Status in QEMU:
  New

Bug description:
  Guest: FreeBSD 12.0 Install CD
  Host: MacOS 11.14.3 qemu master at 90fb864a7df0a9af677352e94f8225f7b03de922

  Command arguments:

  qemu-system-x86_64 -m 4000m -cdrom Downloads/FreeBSD-12.0-RELEASE-
  amd64-bootonly.iso

  When qemu was run with -accel hvf, the bootloader would halt after
  showing the menu. The bootloader would not respond to any keyboard
  events.

  Without acceleration option, the bootloader would count down to zero
  and proceed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1823831/+subscriptions



Re: [Qemu-devel] [PATCH 1/3] xen-bus: use a separate fd for each event channel

2019-04-10 Thread Anthony PERARD
On Mon, Apr 08, 2019 at 04:16:15PM +0100, Paul Durrant wrote:
> To better support use of IOThread-s it will be necessary to be able to set
> the AioContext for each XenEventChannel and hence it is necessary to open a
> separate handle to libxenevtchan for each channel.
> 
> This patch stops using NotifierList for event channel callbacks, replacing
> that construct by a list of complete XenEventChannel structures. Each of
> these now has a xenevtchn_handle pointer in place of the single pointer
> previously held in the XenDevice structure. The individual handles are
> opened/closed in xen_device_bind/unbind_event_channel(), replacing the
> single open/close in xen_device_realize/unrealize().
> 
> NOTE: This patch does not add an AioContext parameter to
>   xen_device_bind_event_channel(). That will be done in a subsequent
>   patch.
> 
> Signed-off-by: Paul Durrant 

Reviewed-by: Anthony PERARD 

There are a few places were I would have like to add an assert, but they
can't be compiled-out in QEMU :-(.

Thanks,

-- 
Anthony PERARD



Re: [Qemu-devel] [PATCH RESEND v2 1/2] hvf: declare hvf_handle_io if NEED_CPU_H is defined

2019-04-10 Thread Roman Bolshakov
On Sun, Apr 07, 2019 at 05:28:38PM +0530, Sukrit Bhatnagar wrote:
> hvf_handle_io needs the poisoned type CPUArchState as its argument.
> Declaring it if NEED_CPU_H is defined enables include/sysemu/hvf.h
> to be included for common object compilation as well.
> 

Reviewed-by: Roman Bolshakov 
Tested-by: Roman Bolshakov 

Thanks,
Roman



Re: [Qemu-devel] [PATCH RESEND v2 2/2] cpus: move hvf_cpu_synchronize* calls to cpu_synchronize* functions

2019-04-10 Thread Roman Bolshakov
On Sun, Apr 07, 2019 at 05:28:39PM +0530, Sukrit Bhatnagar wrote:
> Keep the calls made to synchronize cpu by all hypervisors in one place
> inside cpu_synchronize_* functions in include/sysemu/hw_accel.h
> 
> Cc: Richard Henderson 
> Cc: Paolo Bonzini 
> Signed-off-by: Sukrit Bhatnagar 
> ---
>  cpus.c| 12 
>  include/sysemu/hw_accel.h | 10 ++
>  2 files changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/cpus.c b/cpus.c
> index e83f72b48b..026df0dc5f 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1021,10 +1021,6 @@ void cpu_synchronize_all_states(void)
>  
>  CPU_FOREACH(cpu) {
>  cpu_synchronize_state(cpu);
> -/* TODO: move to cpu_synchronize_state() */
> -if (hvf_enabled()) {
> -hvf_cpu_synchronize_state(cpu);
> -}
>  }
>  }
>  
> @@ -1034,10 +1030,6 @@ void cpu_synchronize_all_post_reset(void)
>  
>  CPU_FOREACH(cpu) {
>  cpu_synchronize_post_reset(cpu);
> -/* TODO: move to cpu_synchronize_post_reset() */
> -if (hvf_enabled()) {
> -hvf_cpu_synchronize_post_reset(cpu);
> -}
>  }
>  }
>  
> @@ -1047,10 +1039,6 @@ void cpu_synchronize_all_post_init(void)
>  
>  CPU_FOREACH(cpu) {
>  cpu_synchronize_post_init(cpu);
> -/* TODO: move to cpu_synchronize_post_init() */
> -if (hvf_enabled()) {
> -hvf_cpu_synchronize_post_init(cpu);
> -}
>  }
>  }
>  
> diff --git a/include/sysemu/hw_accel.h b/include/sysemu/hw_accel.h
> index d2ddfb5ad0..bf081b4026 100644
> --- a/include/sysemu/hw_accel.h
> +++ b/include/sysemu/hw_accel.h
> @@ -15,6 +15,7 @@
>  #include "sysemu/hax.h"
>  #include "sysemu/kvm.h"
>  #include "sysemu/whpx.h"
> +#include "sysemu/hvf.h"
>  
>  static inline void cpu_synchronize_state(CPUState *cpu)
>  {
> @@ -27,6 +28,9 @@ static inline void cpu_synchronize_state(CPUState *cpu)
>  if (whpx_enabled()) {
>  whpx_cpu_synchronize_state(cpu);
>  }
> +if (hvf_enabled()) {
> +hvf_cpu_synchronize_state(cpu);
> +}
>  }
>  
>  static inline void cpu_synchronize_post_reset(CPUState *cpu)
> @@ -40,6 +44,9 @@ static inline void cpu_synchronize_post_reset(CPUState *cpu)
>  if (whpx_enabled()) {
>  whpx_cpu_synchronize_post_reset(cpu);
>  }
> +if (hvf_enabled()) {
> +hvf_cpu_synchronize_post_reset(cpu);
> +}
>  }
>  
>  static inline void cpu_synchronize_post_init(CPUState *cpu)
> @@ -53,6 +60,9 @@ static inline void cpu_synchronize_post_init(CPUState *cpu)
>  if (whpx_enabled()) {
>  whpx_cpu_synchronize_post_init(cpu);
>  }
> +if (hvf_enabled()) {
> +hvf_cpu_synchronize_post_init(cpu);
> +}
>  }
>  
>  static inline void cpu_synchronize_pre_loadvm(CPUState *cpu)
> -- 
> 2.20.1
> 
> 

Hi Sukrit,

The commit itself looks sane but qemu starts to hang during BIOS boot
after I've applied it if I run it with hvf accel.

Thanks,
Roman



Re: [Qemu-devel] [PATCH RESEND v2 2/2] cpus: move hvf_cpu_synchronize* calls to cpu_synchronize* functions

2019-04-10 Thread Sukrit Bhatnagar
On Wed, 10 Apr 2019 at 17:20, Roman Bolshakov  wrote:
>
> On Sun, Apr 07, 2019 at 05:28:39PM +0530, Sukrit Bhatnagar wrote:
> > Keep the calls made to synchronize cpu by all hypervisors in one place
> > inside cpu_synchronize_* functions in include/sysemu/hw_accel.h
> >
> > Cc: Richard Henderson 
> > Cc: Paolo Bonzini 
> > Signed-off-by: Sukrit Bhatnagar 
> > ---
> >  cpus.c| 12 
> >  include/sysemu/hw_accel.h | 10 ++
> >  2 files changed, 10 insertions(+), 12 deletions(-)
> >
> > diff --git a/cpus.c b/cpus.c
> > index e83f72b48b..026df0dc5f 100644
> > --- a/cpus.c
> > +++ b/cpus.c
> > @@ -1021,10 +1021,6 @@ void cpu_synchronize_all_states(void)
> >
> >  CPU_FOREACH(cpu) {
> >  cpu_synchronize_state(cpu);
> > -/* TODO: move to cpu_synchronize_state() */
> > -if (hvf_enabled()) {
> > -hvf_cpu_synchronize_state(cpu);
> > -}
> >  }
> >  }
> >
> > @@ -1034,10 +1030,6 @@ void cpu_synchronize_all_post_reset(void)
> >
> >  CPU_FOREACH(cpu) {
> >  cpu_synchronize_post_reset(cpu);
> > -/* TODO: move to cpu_synchronize_post_reset() */
> > -if (hvf_enabled()) {
> > -hvf_cpu_synchronize_post_reset(cpu);
> > -}
> >  }
> >  }
> >
> > @@ -1047,10 +1039,6 @@ void cpu_synchronize_all_post_init(void)
> >
> >  CPU_FOREACH(cpu) {
> >  cpu_synchronize_post_init(cpu);
> > -/* TODO: move to cpu_synchronize_post_init() */
> > -if (hvf_enabled()) {
> > -hvf_cpu_synchronize_post_init(cpu);
> > -}
> >  }
> >  }
> >
> > diff --git a/include/sysemu/hw_accel.h b/include/sysemu/hw_accel.h
> > index d2ddfb5ad0..bf081b4026 100644
> > --- a/include/sysemu/hw_accel.h
> > +++ b/include/sysemu/hw_accel.h
> > @@ -15,6 +15,7 @@
> >  #include "sysemu/hax.h"
> >  #include "sysemu/kvm.h"
> >  #include "sysemu/whpx.h"
> > +#include "sysemu/hvf.h"
> >
> >  static inline void cpu_synchronize_state(CPUState *cpu)
> >  {
> > @@ -27,6 +28,9 @@ static inline void cpu_synchronize_state(CPUState *cpu)
> >  if (whpx_enabled()) {
> >  whpx_cpu_synchronize_state(cpu);
> >  }
> > +if (hvf_enabled()) {
> > +hvf_cpu_synchronize_state(cpu);
> > +}
> >  }
> >
> >  static inline void cpu_synchronize_post_reset(CPUState *cpu)
> > @@ -40,6 +44,9 @@ static inline void cpu_synchronize_post_reset(CPUState 
> > *cpu)
> >  if (whpx_enabled()) {
> >  whpx_cpu_synchronize_post_reset(cpu);
> >  }
> > +if (hvf_enabled()) {
> > +hvf_cpu_synchronize_post_reset(cpu);
> > +}
> >  }
> >
> >  static inline void cpu_synchronize_post_init(CPUState *cpu)
> > @@ -53,6 +60,9 @@ static inline void cpu_synchronize_post_init(CPUState 
> > *cpu)
> >  if (whpx_enabled()) {
> >  whpx_cpu_synchronize_post_init(cpu);
> >  }
> > +if (hvf_enabled()) {
> > +hvf_cpu_synchronize_post_init(cpu);
> > +}
> >  }
> >
> >  static inline void cpu_synchronize_pre_loadvm(CPUState *cpu)
> > --
> > 2.20.1
> >
> >
>
> Hi Sukrit,
>
> The commit itself looks sane but qemu starts to hang during BIOS boot
> after I've applied it if I run it with hvf accel.

Hi Roman,
Thanks for reviewing the patches.
If this is happening, then I assume patches are not ready
to be merged into qemu-stable.
Do you have any idea why the problem is not detected
during tests, but rather at runtime?
I am not really sure if this patch (2/2) has anything to do with
the problem but the previous patch (1/2) might.

> Thanks,
> Roman



Re: [Qemu-devel] [PATCH 03/13] libqos: move common i2c code to libqos

2019-04-10 Thread Paolo Bonzini
On 09/04/19 11:42, Thomas Huth wrote:
>> -static void tmp105_set16(I2CAdapter *i2c, uint8_t addr, uint8_t reg,
>> - uint16_t value)
>> -{
>> -uint8_t cmd[3];
>> -uint8_t resp[2];
>> -
>> -cmd[0] = reg;
>> -cmd[1] = value >> 8;
>> -cmd[2] = value & 255;
>> -i2c_send(i2c, addr, cmd, 3);
>> -i2c_recv(i2c, addr, resp, 2);
>> -g_assert_cmphex(resp[0], ==, cmd[1]);
>> -g_assert_cmphex(resp[1], ==, cmd[2]);
>> -}
> ... but the old set8/16 functions also used i2c_recv() and
> g_assert_cmphex() ... shouldn't this be added to the new functions, too?

That is dependent on the actual device you are working with.  I probably
should add some code to the tmp105 test that tests this behavior too.

Paolo



Re: [Qemu-devel] [PATCH v5 2/5] virtio-pmem: Add virtio pmem driver

2019-04-10 Thread Cornelia Huck
On Wed, 10 Apr 2019 09:38:22 +0530
Pankaj Gupta  wrote:

> This patch adds virtio-pmem driver for KVM guest.
> 
> Guest reads the persistent memory range information from
> Qemu over VIRTIO and registers it on nvdimm_bus. It also
> creates a nd_region object with the persistent memory
> range information so that existing 'nvdimm/pmem' driver
> can reserve this into system memory map. This way
> 'virtio-pmem' driver uses existing functionality of pmem
> driver to register persistent memory compatible for DAX
> capable filesystems.
> 
> This also provides function to perform guest flush over
> VIRTIO from 'pmem' driver when userspace performs flush
> on DAX memory range.
> 
> Signed-off-by: Pankaj Gupta 
> ---
>  drivers/nvdimm/virtio_pmem.c |  88 ++
>  drivers/virtio/Kconfig   |  10 +++
>  drivers/virtio/Makefile  |   1 +
>  drivers/virtio/pmem.c| 124 +++
>  include/linux/virtio_pmem.h  |  60 +++
>  include/uapi/linux/virtio_ids.h  |   1 +
>  include/uapi/linux/virtio_pmem.h |  10 +++
>  7 files changed, 294 insertions(+)
>  create mode 100644 drivers/nvdimm/virtio_pmem.c
>  create mode 100644 drivers/virtio/pmem.c
>  create mode 100644 include/linux/virtio_pmem.h
>  create mode 100644 include/uapi/linux/virtio_pmem.h
> 
(...)
> diff --git a/drivers/virtio/pmem.c b/drivers/virtio/pmem.c
> new file mode 100644
> index ..cc9de9589d56
> --- /dev/null
> +++ b/drivers/virtio/pmem.c
> @@ -0,0 +1,124 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * virtio_pmem.c: Virtio pmem Driver
> + *
> + * Discovers persistent memory range information
> + * from host and registers the virtual pmem device
> + * with libnvdimm core.
> + */
> +#include 
> +#include <../../drivers/nvdimm/nd.h>
> +
> +static struct virtio_device_id id_table[] = {
> + { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
> + { 0 },
> +};
> +
> + /* Initialize virt queue */
> +static int init_vq(struct virtio_pmem *vpmem)

IMHO, you don't gain much by splitting off this function...

> +{
> + struct virtqueue *vq;
> +
> + /* single vq */
> + vpmem->req_vq = vq = virtio_find_single_vq(vpmem->vdev,
> + host_ack, "flush_queue");
> + if (IS_ERR(vq))
> + return PTR_ERR(vq);

I'm personally not a fan of chained assignments... I think I'd just
drop the 'vq' variable and operate on vpmem->req_vq directly.

> +
> + spin_lock_init(&vpmem->pmem_lock);
> + INIT_LIST_HEAD(&vpmem->req_list);
> +
> + return 0;
> +};
> +
> +static int virtio_pmem_probe(struct virtio_device *vdev)
> +{
> + int err = 0;
> + struct resource res;
> + struct virtio_pmem *vpmem;
> + struct nvdimm_bus *nvdimm_bus;
> + struct nd_region_desc ndr_desc = {};
> + int nid = dev_to_node(&vdev->dev);
> + struct nd_region *nd_region;
> +
> + if (!vdev->config->get) {
> + dev_err(&vdev->dev, "%s failure: config disabled\n",

Maybe s/config disabled/config access disabled/ ? That seems to be the
more common message.

> + __func__);
> + return -EINVAL;
> + }
> +
> + vdev->priv = vpmem = devm_kzalloc(&vdev->dev, sizeof(*vpmem),
> + GFP_KERNEL);

Here, the vpmem variable makes sense for convenience, but I'm again not
a fan of the chaining :)

> + if (!vpmem) {
> + err = -ENOMEM;
> + goto out_err;
> + }
> +
> + vpmem->vdev = vdev;
> + err = init_vq(vpmem);
> + if (err)
> + goto out_err;
> +
> + virtio_cread(vpmem->vdev, struct virtio_pmem_config,
> + start, &vpmem->start);
> + virtio_cread(vpmem->vdev, struct virtio_pmem_config,
> + size, &vpmem->size);
> +
> + res.start = vpmem->start;
> + res.end   = vpmem->start + vpmem->size-1;
> + vpmem->nd_desc.provider_name = "virtio-pmem";
> + vpmem->nd_desc.module = THIS_MODULE;
> +
> + vpmem->nvdimm_bus = nvdimm_bus = nvdimm_bus_register(&vdev->dev,
> + &vpmem->nd_desc);

And here :)

> + if (!nvdimm_bus)
> + goto out_vq;
> +
> + dev_set_drvdata(&vdev->dev, nvdimm_bus);
> +
> + ndr_desc.res = &res;
> + ndr_desc.numa_node = nid;
> + ndr_desc.flush = virtio_pmem_flush;
> + set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> + set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> + nd_region = nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc);
> + nd_region->provider_data =  dev_to_virtio
> + (nd_region->dev.parent->parent);

Isn't it clear that this parent chain will always end up at &vdev->dev?
Maybe simply set ->provider_data to vdev directly? (Does it need to
grab a reference count of the device, BTW?)

> +
> + if (!nd_region)
> + goto out_nd;

Probably better to do this check before you access nd_region's
members :)

> +
> 

[Qemu-devel] [Bug 1572959] Re: bcm2835_property: inconsistent values when both setting and querying the framebuffer

2019-04-10 Thread Thomas Huth
The patch had been merged here:
https://git.qemu.org/?p=qemu.git;a=commitdiff;h=27a5dc7be6a55b60039e3920

** Changed in: qemu
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1572959

Title:
  bcm2835_property: inconsistent values when both setting and querying
  the framebuffer

Status in QEMU:
  Fix Released

Bug description:
  As the framebuffer settings are copied into the result message before
  it is reconfigured, inconsistent behavior can happen when, for
  instance, you set with a single message the width, height, and depth,
  and ask at the same time to allocate the buffer and get the pitch and
  the size.

  In this case, the reported pitch and size would be incorrect as they
  were computed with the initial values of width, height and depth, not
  the ones the client requested.

  Attached is a patch also sent to the qemu-devel mailing list.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1572959/+subscriptions



[Qemu-devel] [Bug 1525676] Re: qemu runas and sandbox option incompatible, process will hang in futex after setgid

2019-04-10 Thread Thomas Huth
I haven't tried, but I think this should be fixed now with the new
elevateprivileges parameter of the -sandbox option. Have you tried to
reproduce the problem with the latest version of QEMU already?

** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1525676

Title:
  qemu runas and sandbox option incompatible, process will hang in futex
  after setgid

Status in QEMU:
  Incomplete
Status in qemu package in Ubuntu:
  Incomplete

Bug description:
  With -runas [user] and -sandbox on, qemu process will fail in the
  process of dropping privileges. While setgid() is done (see below),
  setuid() is not attempted. Instead process blocks waiting for a futex
  never to come.

  [pid 21769] +++ killed by SIGSYS +++
  [pid 21767] <... tgkill resumed> )  = 0
  [pid 21767] tgkill(21767, 21768, SIGRT_1 
  [pid 21768] <... nanosleep resumed> {0, 7284187}) = ? ERESTART_RESTARTBLOCK 
(Interrupted by signal)
  [pid 21768] --- SIGRT_1 {si_signo=SIGRT_1, si_code=SI_TKILL, si_pid=21767, 
si_uid=0} ---
  [pid 21768] setgid(100) = 0
  [pid 21768] futex(0x7f7f0f53fd1c, FUTEX_WAKE_PRIVATE, 1) = 0
  [pid 21768] rt_sigreturn()  = -1 EINTR (Interrupted system call)
  [pid 21768] nanosleep({0, 7284187},  
  [pid 21767] <... tgkill resumed> )  = 0
  [pid 21767] futex(0x7ffd5a9b6890, FUTEX_WAIT_PRIVATE, 3, NULL 
  [pid 21768] <... nanosleep resumed> 0x7f7f0f53eb00) = 0
  [pid 21768] futex(0x55f52acd1f44, FUTEX_WAIT, 4294967295, NULL

  This bug might be addresses in the discussion/patchset
  http://qemu.11.n7.nabble.com/PATCH-Add-syscalls-for-runas-and-chroot-
  to-the-seccomp-sandbox-td359588.html

  # lsb_release -rd
  Description:Ubuntu 15.10
  Release:15.10

  # apt-cache policy qemu-system-x86
  qemu-system-x86:
Installed: 1:2.3+dfsg-5ubuntu9.1
Candidate: 1:2.3+dfsg-5ubuntu9.1
Version table:
   *** 1:2.3+dfsg-5ubuntu9.1 0
  500 http://archive.ubuntu.com/ubuntu/ wily-updates/main amd64 Packages
  500 http://archive.ubuntu.com/ubuntu/ wily-security/main amd64 
Packages
  100 /var/lib/dpkg/status
   1:2.3+dfsg-5ubuntu9 0
  500 http://archive.ubuntu.com/ubuntu/ wily/main amd64 Packages

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1525676/+subscriptions



[Qemu-devel] [PATCH for-4.1] gitlab-ci.yml: Test the TCG interpreter in a CI pipeline

2019-04-10 Thread Thomas Huth
So far we do not have any test coverage for TCI (the TCG interpreter) yet.
Thus let's add a CI pipeline that runs at least some basic TCG tests with
a TCI build, to make sure that there are no further regressions.

Signed-off-by: Thomas Huth 
---
 NB: I did not include the sparc and sparc64 targets here since they are
 currently broken with TCI. Also "make check-tcg" does not work with
 x86_64-linux-user yet ... in case someone wants to fix that...

 .gitlab-ci.yml | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 79d02cf740..c63bf2f822 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -71,3 +71,18 @@ build-clang:
  ppc-softmmu s390x-softmmu x86_64-softmmu arm-linux-user"
  - make -j2
  - make -j2 check
+
+build-tci:
+ script:
+ - TARGETS="aarch64 alpha arm hppa m68k microblaze moxie ppc64 s390x x86_64"
+ - ./configure --enable-tcg-interpreter
+  --target-list="$(for tg in $TARGETS; do echo -n ${tg}'-softmmu '; done)"
+ - make -j2
+ - make tests/boot-serial-test tests/cdrom-test tests/pxe-test
+ - for tg in $TARGETS ; do
+ export QTEST_QEMU_BINARY="${tg}-softmmu/qemu-system-${tg}" ;
+ ./tests/boot-serial-test || exit 1 ;
+ ./tests/cdrom-test || exit 1 ;
+   done
+ - QTEST_QEMU_BINARY="x86_64-softmmu/qemu-system-x86_64" ./tests/pxe-test
+ - QTEST_QEMU_BINARY="s390x-softmmu/qemu-system-s390x" ./tests/pxe-test -m slow
-- 
2.21.0




Re: [Qemu-devel] [PATCH v5 0/6] virtio pmem driver

2019-04-10 Thread Arkadiusz Miśkiewicz
On 10/04/2019 06:08, Pankaj Gupta wrote:
>  This patch series has implementation for "virtio pmem". 
>  "virtio pmem" is fake persistent memory(nvdimm) in guest 
>  which allows to bypass the guest page cache. This also

Will kernel pstore be able to use this persistent memory for storing
crash dumps?

-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )



Re: [Qemu-devel] [PATCH 2/3] xen-bus: allow AioContext to be specified for each event channel

2019-04-10 Thread Anthony PERARD
On Mon, Apr 08, 2019 at 04:16:16PM +0100, Paul Durrant wrote:
> This patch adds an AioContext parameter to xen_device_bind_event_channel()
> and then uses aio_set_fd_handler() to set the callback rather than
> qemu_set_fd_handler().
> 
> Signed-off-by: Paul Durrant 
> ---
> @@ -943,6 +944,7 @@ static void xen_device_event(void *opaque)
>  }
>  
>  XenEventChannel *xen_device_bind_event_channel(XenDevice *xendev,
> +   AioContext *ctx,
> unsigned int port,
> XenEventHandler handler,
> void *opaque, Error **errp)
> @@ -968,8 +970,9 @@ XenEventChannel *xen_device_bind_event_channel(XenDevice 
> *xendev,
>  channel->handler = handler;
>  channel->opaque = opaque;
>  
> -qemu_set_fd_handler(xenevtchn_fd(channel->xeh), xen_device_event, NULL,
> -channel);
> +channel->ctx = ctx;
> +aio_set_fd_handler(channel->ctx, xenevtchn_fd(channel->xeh), false,
> +   xen_device_event, NULL, NULL, channel);

I wonder if the `'is_external' parameter of aio_set_fd_handler shoud be
`true' here, instead. That flag seems to be used when making a snapshot
of a blockdev, for example.

That was introduced by:
dca21ef23ba48f6f1428c59f295a857e5dc203c8^..c07bc2c1658fffeee08eb46402b2f66d55b07586

What do you think?


-- 
Anthony PERARD



Re: [Qemu-devel] [PATCH v5 2/5] virtio-pmem: Add virtio pmem driver

2019-04-10 Thread Michael S. Tsirkin
On Wed, Apr 10, 2019 at 09:38:22AM +0530, Pankaj Gupta wrote:
> This patch adds virtio-pmem driver for KVM guest.
> 
> Guest reads the persistent memory range information from
> Qemu over VIRTIO and registers it on nvdimm_bus. It also
> creates a nd_region object with the persistent memory
> range information so that existing 'nvdimm/pmem' driver
> can reserve this into system memory map. This way
> 'virtio-pmem' driver uses existing functionality of pmem
> driver to register persistent memory compatible for DAX
> capable filesystems.
> 
> This also provides function to perform guest flush over
> VIRTIO from 'pmem' driver when userspace performs flush
> on DAX memory range.
> 
> Signed-off-by: Pankaj Gupta 
> ---
>  drivers/nvdimm/virtio_pmem.c |  88 ++
>  drivers/virtio/Kconfig   |  10 +++
>  drivers/virtio/Makefile  |   1 +
>  drivers/virtio/pmem.c| 124 +++
>  include/linux/virtio_pmem.h  |  60 +++
>  include/uapi/linux/virtio_ids.h  |   1 +
>  include/uapi/linux/virtio_pmem.h |  10 +++
>  7 files changed, 294 insertions(+)
>  create mode 100644 drivers/nvdimm/virtio_pmem.c
>  create mode 100644 drivers/virtio/pmem.c
>  create mode 100644 include/linux/virtio_pmem.h
>  create mode 100644 include/uapi/linux/virtio_pmem.h
> 
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> new file mode 100644
> index ..01044cc2f3a3
> --- /dev/null
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -0,0 +1,88 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * virtio_pmem.c: Virtio pmem Driver
> + *
> + * Discovers persistent memory range information
> + * from host and provides a virtio based flushing
> + * interface.
> + */
> +#include 
> +#include "nd.h"
> +
> + /* The interrupt handler */
> +void host_ack(struct virtqueue *vq)
> +{
> + unsigned int len;
> + unsigned long flags;
> + struct virtio_pmem_request *req, *req_buf;
> + struct virtio_pmem *vpmem = vq->vdev->priv;
> +
> + spin_lock_irqsave(&vpmem->pmem_lock, flags);
> + while ((req = virtqueue_get_buf(vq, &len)) != NULL) {
> + req->done = true;
> + wake_up(&req->host_acked);
> +
> + if (!list_empty(&vpmem->req_list)) {
> + req_buf = list_first_entry(&vpmem->req_list,
> + struct virtio_pmem_request, list);
> + list_del(&vpmem->req_list);
> + req_buf->wq_buf_avail = true;
> + wake_up(&req_buf->wq_buf);
> + }
> + }
> + spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> +}
> +EXPORT_SYMBOL_GPL(host_ack);
> +
> + /* The request submission function */
> +int virtio_pmem_flush(struct nd_region *nd_region)
> +{
> + int err;
> + unsigned long flags;
> + struct scatterlist *sgs[2], sg, ret;
> + struct virtio_device *vdev = nd_region->provider_data;
> + struct virtio_pmem *vpmem = vdev->priv;
> + struct virtio_pmem_request *req;
> +
> + might_sleep();
> + req = kmalloc(sizeof(*req), GFP_KERNEL);
> + if (!req)
> + return -ENOMEM;
> +
> + req->done = req->wq_buf_avail = false;
> + strcpy(req->name, "FLUSH");
> + init_waitqueue_head(&req->host_acked);
> + init_waitqueue_head(&req->wq_buf);
> + sg_init_one(&sg, req->name, strlen(req->name));
> + sgs[0] = &sg;
> + sg_init_one(&ret, &req->ret, sizeof(req->ret));
> + sgs[1] = &ret;
> +
> + spin_lock_irqsave(&vpmem->pmem_lock, flags);
> + err = virtqueue_add_sgs(vpmem->req_vq, sgs, 1, 1, req, GFP_ATOMIC);
> + if (err) {
> + dev_err(&vdev->dev, "failed to send command to virtio pmem 
> device\n");
> +
> + list_add_tail(&vpmem->req_list, &req->list);
> + spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> +
> + /* When host has read buffer, this completes via host_ack */
> + wait_event(req->wq_buf, req->wq_buf_avail);
> + spin_lock_irqsave(&vpmem->pmem_lock, flags);
> + }
> + err = virtqueue_kick(vpmem->req_vq);
> + spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> +
> + if (!err) {
> + err = -EIO;
> + goto ret;
> + }
> + /* When host has read buffer, this completes via host_ack */
> + wait_event(req->host_acked, req->done);
> + err = req->ret;
> +ret:
> + kfree(req);
> + return err;
> +};
> +EXPORT_SYMBOL_GPL(virtio_pmem_flush);
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> index 35897649c24f..9f634a2ed638 100644
> --- a/drivers/virtio/Kconfig
> +++ b/drivers/virtio/Kconfig
> @@ -42,6 +42,16 @@ config VIRTIO_PCI_LEGACY
>  
> If unsure, say Y.
>  
> +config VIRTIO_PMEM
> + tristate "Support for virtio pmem driver"
> + depends on VIRTIO
> + depends on LIBNVDIMM
> + help
> + This driver provides support for virtio bas

Re: [Qemu-devel] [PATCH] migration: Fix handling fd protocol

2019-04-10 Thread Dr. David Alan Gilbert
* Yury Kotov (yury-ko...@yandex-team.ru) wrote:
> Currently such case is possible for incoming:
> QMP: add-fd (fdset = 0, fd of some file):
> adds fd to fdset 0 and returns QEMU's fd (e.g. 33)
> QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc
> ...
> Incoming migration completes and unrefs ioc -> close(33)
> QMP: remove-fd (fdset = 0, fd = 33):
> removes fd from fdset 0 and qemu_close() -> close(33) => double close

Well spotted! That would very rarely cause a problem, but is a race.

> For outgoing migration the case is the same but getfd instead of add-fd.
> Fix it by duping client's fd.
> 
> Signed-off-by: Yury Kotov 

Note your patch hit a problem building on windows; I don't think we have
a qemu_dup for windows.

However, I think this problem is wider than just migration.
For example, I see that dump.c also uses monitor_get_fd, and it's
dump_cleanup also does a close on the fd. So I guess it hits the same
problem?
Also, qmp.c in qmp_add_client does a close on the fd in some error cases
(I've not followed the normal case).

So perhaps all the users of monitor_get_fd are hitting this problem.

Should we be doing the dup in monitor_get_fd?

Dave


> ---
>  migration/fd.c | 39 +++
>  migration/trace-events |  4 ++--
>  2 files changed, 33 insertions(+), 10 deletions(-)
> 
> diff --git a/migration/fd.c b/migration/fd.c
> index a7c13df4ad..c9ff07ac41 100644
> --- a/migration/fd.c
> +++ b/migration/fd.c
> @@ -15,6 +15,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qapi/error.h"
>  #include "channel.h"
>  #include "fd.h"
>  #include "migration.h"
> @@ -26,15 +27,27 @@
>  void fd_start_outgoing_migration(MigrationState *s, const char *fdname, 
> Error **errp)
>  {
>  QIOChannel *ioc;
> -int fd = monitor_get_fd(cur_mon, fdname, errp);
> +int fd, dup_fd;
> +
> +fd = monitor_get_fd(cur_mon, fdname, errp);
>  if (fd == -1) {
>  return;
>  }
>  
> -trace_migration_fd_outgoing(fd);
> -ioc = qio_channel_new_fd(fd, errp);
> +/* fd is previously created by qmp command 'getfd',
> + * so client is responsible to close it. Dup it to save original value 
> from
> + * QIOChannel's destructor */
> +dup_fd = qemu_dup(fd);
> +if (dup_fd == -1) {
> +error_setg(errp, "Cannot dup fd %s: %s (%d)", fdname, 
> strerror(errno),
> +   errno);
> +return;
> +}
> +
> +trace_migration_fd_outgoing(fd, dup_fd);
> +ioc = qio_channel_new_fd(dup_fd, errp);
>  if (!ioc) {
> -close(fd);
> +close(dup_fd);
>  return;
>  }
>  
> @@ -55,14 +68,24 @@ static gboolean fd_accept_incoming_migration(QIOChannel 
> *ioc,
>  void fd_start_incoming_migration(const char *infd, Error **errp)
>  {
>  QIOChannel *ioc;
> -int fd;
> +int fd, dup_fd;
>  
>  fd = strtol(infd, NULL, 0);
> -trace_migration_fd_incoming(fd);
>  
> -ioc = qio_channel_new_fd(fd, errp);
> +/* fd is previously created by qmp command 'add-fd' or something else,
> + * so client is responsible to close it. Dup it to save original value 
> from
> + * QIOChannel's destructor */
> +dup_fd = qemu_dup(fd);
> +if (dup_fd == -1) {
> +error_setg(errp, "Cannot dup fd %d: %s (%d)", fd, strerror(errno),
> +   errno);
> +return;
> +}
> +
> +trace_migration_fd_incoming(fd, dup_fd);
> +ioc = qio_channel_new_fd(dup_fd, errp);
>  if (!ioc) {
> -close(fd);
> +close(dup_fd);
>  return;
>  }
>  
> diff --git a/migration/trace-events b/migration/trace-events
> index de2e136e57..d2d30a6b3c 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -258,8 +258,8 @@ migration_exec_outgoing(const char *cmd) "cmd=%s"
>  migration_exec_incoming(const char *cmd) "cmd=%s"
>  
>  # fd.c
> -migration_fd_outgoing(int fd) "fd=%d"
> -migration_fd_incoming(int fd) "fd=%d"
> +migration_fd_outgoing(int fd, int dup_fd) "fd=%d dup_fd=%d"
> +migration_fd_incoming(int fd, int dup_fd) "fd=%d dup_fd=%d"
>  
>  # socket.c
>  migration_socket_incoming_accepted(void) ""
> -- 
> 2.21.0
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH for-4.1] q35: acpi: do not create dummy MCFG table

2019-04-10 Thread Wei Yang
On Wed, Apr 10, 2019 at 11:08:33AM +0200, Igor Mammedov wrote:
>On Wed, 10 Apr 2019 09:12:31 +0800
>Wei Yang  wrote:
>
>> On Tue, Apr 09, 2019 at 05:00:37PM +0200, Igor Mammedov wrote:
>> >Dummy table (with signature "QEMU") creation came from original SeaBIOS
>> >codebase. And QEMU would have to keep it around if there were Q35 machine
>> >that depended on keeping ACPI tables blob constant size. Luckily there
>> >were no versioned Q35 machine types before commit:
>> >  (since 2.3) a1666142db acpi-build: make ROMs RAM blocks resizeable
>> >which obsoleted need to keep ACPI tables blob the same size on 
>> >source/destination.
>> >  
>> 
>> I am not sure getting the "versioned Q35" correctly. Seems originally there 
>> is
>> q35-1.4?
>> 
>> commit bf3caa3dc17552b323cec6831301a22cfc98ecd5
>> Author: Paolo Bonzini 
>> Date:   Fri Feb 8 14:06:15 2013 +0100
>> 
>> pc: add compatibility machine types for 1.4
>> 
>> Adds both pc-i440fx-1.4 and pc-q35-1.4.
>
>current upstream has only pc-q35-2.4 as earliest versioned Q35
>machine type (try to run: qemu-system-x86_64 -M help)

Yes, I see those old type are removed. This means we don't want to support
those old types, right? Because those old types can't be migrated to latest
upstream qemu.

-- 
Wei Yang
Help you, Help me



Re: [Qemu-devel] [PATCH v5 2/5] virtio-pmem: Add virtio pmem driver

2019-04-10 Thread Pankaj Gupta


> 
> On Wed, Apr 10, 2019 at 09:38:22AM +0530, Pankaj Gupta wrote:
> > This patch adds virtio-pmem driver for KVM guest.
> > 
> > Guest reads the persistent memory range information from
> > Qemu over VIRTIO and registers it on nvdimm_bus. It also
> > creates a nd_region object with the persistent memory
> > range information so that existing 'nvdimm/pmem' driver
> > can reserve this into system memory map. This way
> > 'virtio-pmem' driver uses existing functionality of pmem
> > driver to register persistent memory compatible for DAX
> > capable filesystems.
> > 
> > This also provides function to perform guest flush over
> > VIRTIO from 'pmem' driver when userspace performs flush
> > on DAX memory range.
> > 
> > Signed-off-by: Pankaj Gupta 
> > ---
> >  drivers/nvdimm/virtio_pmem.c |  88 ++
> >  drivers/virtio/Kconfig   |  10 +++
> >  drivers/virtio/Makefile  |   1 +
> >  drivers/virtio/pmem.c| 124 +++
> >  include/linux/virtio_pmem.h  |  60 +++
> >  include/uapi/linux/virtio_ids.h  |   1 +
> >  include/uapi/linux/virtio_pmem.h |  10 +++
> >  7 files changed, 294 insertions(+)
> >  create mode 100644 drivers/nvdimm/virtio_pmem.c
> >  create mode 100644 drivers/virtio/pmem.c
> >  create mode 100644 include/linux/virtio_pmem.h
> >  create mode 100644 include/uapi/linux/virtio_pmem.h
> > 
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > new file mode 100644
> > index ..01044cc2f3a3
> > --- /dev/null
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -0,0 +1,88 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * virtio_pmem.c: Virtio pmem Driver
> > + *
> > + * Discovers persistent memory range information
> > + * from host and provides a virtio based flushing
> > + * interface.
> > + */
> > +#include 
> > +#include "nd.h"
> > +
> > + /* The interrupt handler */
> > +void host_ack(struct virtqueue *vq)
> > +{
> > +   unsigned int len;
> > +   unsigned long flags;
> > +   struct virtio_pmem_request *req, *req_buf;
> > +   struct virtio_pmem *vpmem = vq->vdev->priv;
> > +
> > +   spin_lock_irqsave(&vpmem->pmem_lock, flags);
> > +   while ((req = virtqueue_get_buf(vq, &len)) != NULL) {
> > +   req->done = true;
> > +   wake_up(&req->host_acked);
> > +
> > +   if (!list_empty(&vpmem->req_list)) {
> > +   req_buf = list_first_entry(&vpmem->req_list,
> > +   struct virtio_pmem_request, list);
> > +   list_del(&vpmem->req_list);
> > +   req_buf->wq_buf_avail = true;
> > +   wake_up(&req_buf->wq_buf);
> > +   }
> > +   }
> > +   spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> > +}
> > +EXPORT_SYMBOL_GPL(host_ack);
> > +
> > + /* The request submission function */
> > +int virtio_pmem_flush(struct nd_region *nd_region)
> > +{
> > +   int err;
> > +   unsigned long flags;
> > +   struct scatterlist *sgs[2], sg, ret;
> > +   struct virtio_device *vdev = nd_region->provider_data;
> > +   struct virtio_pmem *vpmem = vdev->priv;
> > +   struct virtio_pmem_request *req;
> > +
> > +   might_sleep();
> > +   req = kmalloc(sizeof(*req), GFP_KERNEL);
> > +   if (!req)
> > +   return -ENOMEM;
> > +
> > +   req->done = req->wq_buf_avail = false;
> > +   strcpy(req->name, "FLUSH");
> > +   init_waitqueue_head(&req->host_acked);
> > +   init_waitqueue_head(&req->wq_buf);
> > +   sg_init_one(&sg, req->name, strlen(req->name));
> > +   sgs[0] = &sg;
> > +   sg_init_one(&ret, &req->ret, sizeof(req->ret));
> > +   sgs[1] = &ret;
> > +
> > +   spin_lock_irqsave(&vpmem->pmem_lock, flags);
> > +   err = virtqueue_add_sgs(vpmem->req_vq, sgs, 1, 1, req, GFP_ATOMIC);
> > +   if (err) {
> > +   dev_err(&vdev->dev, "failed to send command to virtio pmem 
> > device\n");
> > +
> > +   list_add_tail(&vpmem->req_list, &req->list);
> > +   spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> > +
> > +   /* When host has read buffer, this completes via host_ack */
> > +   wait_event(req->wq_buf, req->wq_buf_avail);
> > +   spin_lock_irqsave(&vpmem->pmem_lock, flags);
> > +   }
> > +   err = virtqueue_kick(vpmem->req_vq);
> > +   spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> > +
> > +   if (!err) {
> > +   err = -EIO;
> > +   goto ret;
> > +   }
> > +   /* When host has read buffer, this completes via host_ack */
> > +   wait_event(req->host_acked, req->done);
> > +   err = req->ret;
> > +ret:
> > +   kfree(req);
> > +   return err;
> > +};
> > +EXPORT_SYMBOL_GPL(virtio_pmem_flush);
> > +MODULE_LICENSE("GPL");
> > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> > index 35897649c24f..9f634a2ed638 100644
> > --- a/drivers/virtio/Kconfig
> > +++ b/drivers/virtio/Kconfig
> > @@ -42,6 +42,16 @@ config VIRTIO_PCI_LEGACY
> >  
> >   If unsure, say Y.
> >  
> > +config VIRTIO_PMEM
> > +   

Re: [Qemu-devel] [PATCH RESEND v2 2/2] cpus: move hvf_cpu_synchronize* calls to cpu_synchronize* functions

2019-04-10 Thread Roman Bolshakov
On Wed, Apr 10, 2019 at 05:35:23PM +0530, Sukrit Bhatnagar wrote:
> On Wed, 10 Apr 2019 at 17:20, Roman Bolshakov  wrote:
> >
> > On Sun, Apr 07, 2019 at 05:28:39PM +0530, Sukrit Bhatnagar wrote:
> > > Keep the calls made to synchronize cpu by all hypervisors in one place
> > > inside cpu_synchronize_* functions in include/sysemu/hw_accel.h
> > >
> >
> > The commit itself looks sane but qemu starts to hang during BIOS boot
> > after I've applied it if I run it with hvf accel.
> 
> Hi Roman,
> Thanks for reviewing the patches.
> If this is happening, then I assume patches are not ready
> to be merged into qemu-stable.
> Do you have any idea why the problem is not detected
> during tests, but rather at runtime?
> I am not really sure if this patch (2/2) has anything to do with
> the problem but the previous patch (1/2) might.
> 

I've applied, built and tested both sequentially. Applying and running
with patch 1/2 alone doesn't result in the behavior I mentioned. I also
tried to apply only the first hunk that moves hvf_cpu_synchronize_state
into cpu_synchronize_state and it also causes the issue with BIOS.

Honestly, I don't know if the tests run qemu with hvf accel. AFAIK
kvm-unit-tests can be used to test an accel itself.

Thanks,
Roman



Re: [Qemu-devel] [PATCH for-4.1] q35: acpi: do not create dummy MCFG table

2019-04-10 Thread Igor Mammedov
On Wed, 10 Apr 2019 22:01:53 +0800
Wei Yang  wrote:

> On Wed, Apr 10, 2019 at 11:08:33AM +0200, Igor Mammedov wrote:
> >On Wed, 10 Apr 2019 09:12:31 +0800
> >Wei Yang  wrote:
> >  
> >> On Tue, Apr 09, 2019 at 05:00:37PM +0200, Igor Mammedov wrote:  
> >> >Dummy table (with signature "QEMU") creation came from original SeaBIOS
> >> >codebase. And QEMU would have to keep it around if there were Q35 machine
> >> >that depended on keeping ACPI tables blob constant size. Luckily there
> >> >were no versioned Q35 machine types before commit:
> >> >  (since 2.3) a1666142db acpi-build: make ROMs RAM blocks resizeable
> >> >which obsoleted need to keep ACPI tables blob the same size on 
> >> >source/destination.
> >> >
> >> 
> >> I am not sure getting the "versioned Q35" correctly. Seems originally 
> >> there is
> >> q35-1.4?
> >> 
> >> commit bf3caa3dc17552b323cec6831301a22cfc98ecd5
> >> Author: Paolo Bonzini 
> >> Date:   Fri Feb 8 14:06:15 2013 +0100
> >> 
> >> pc: add compatibility machine types for 1.4
> >> 
> >> Adds both pc-i440fx-1.4 and pc-q35-1.4.  
> >
> >current upstream has only pc-q35-2.4 as earliest versioned Q35
> >machine type (try to run: qemu-system-x86_64 -M help)  
> 
> Yes, I see those old type are removed. This means we don't want to support
> those old types, right? Because those old types can't be migrated to latest
> upstream qemu.

Exactly




Re: [Qemu-devel] [PATCH] migration: Fix handling fd protocol

2019-04-10 Thread Yury Kotov
Hi,

10.04.2019, 16:58, "Dr. David Alan Gilbert" :
> * Yury Kotov (yury-ko...@yandex-team.ru) wrote:
>>  Currently such case is possible for incoming:
>>  QMP: add-fd (fdset = 0, fd of some file):
>>  adds fd to fdset 0 and returns QEMU's fd (e.g. 33)
>>  QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc
>>  ...
>>  Incoming migration completes and unrefs ioc -> close(33)
>>  QMP: remove-fd (fdset = 0, fd = 33):
>>  removes fd from fdset 0 and qemu_close() -> close(33) => double close
>
> Well spotted! That would very rarely cause a problem, but is a race.
>

Actually, it hits for incoming case on 2 of 50 VMs on our production...

>>  For outgoing migration the case is the same but getfd instead of add-fd.
>>  Fix it by duping client's fd.
>>
>>  Signed-off-by: Yury Kotov 
>
> Note your patch hit a problem building on windows; I don't think we have
> a qemu_dup for windows.
>

Oh, I see... I'll add an ifdef for this in v2.

> However, I think this problem is wider than just migration.
> For example, I see that dump.c also uses monitor_get_fd, and it's
> dump_cleanup also does a close on the fd. So I guess it hits the same
> problem?
> Also, qmp.c in qmp_add_client does a close on the fd in some error cases
> (I've not followed the normal case).
>
> So perhaps all the users of monitor_get_fd are hitting this problem.
>
> Should we be doing the dup in monitor_get_fd?
>

Hmm, it sounds reasonable but Windows's case will remain broken.
And also using fd from fdset without qemu_open will remain broken.

Another way to fix them:
1) Add searching of monitor's fds and not duped fdset's fds to qemu_close
2) Replace broken closes to qemu_close

Regards,
Yury Kotov

> Dave
>
>>  ---
>>   migration/fd.c | 39 +++
>>   migration/trace-events | 4 ++--
>>   2 files changed, 33 insertions(+), 10 deletions(-)
>>
>>  diff --git a/migration/fd.c b/migration/fd.c
>>  index a7c13df4ad..c9ff07ac41 100644
>>  --- a/migration/fd.c
>>  +++ b/migration/fd.c
>>  @@ -15,6 +15,7 @@
>>    */
>>
>>   #include "qemu/osdep.h"
>>  +#include "qapi/error.h"
>>   #include "channel.h"
>>   #include "fd.h"
>>   #include "migration.h"
>>  @@ -26,15 +27,27 @@
>>   void fd_start_outgoing_migration(MigrationState *s, const char *fdname, 
>> Error **errp)
>>   {
>>   QIOChannel *ioc;
>>  - int fd = monitor_get_fd(cur_mon, fdname, errp);
>>  + int fd, dup_fd;
>>  +
>>  + fd = monitor_get_fd(cur_mon, fdname, errp);
>>   if (fd == -1) {
>>   return;
>>   }
>>
>>  - trace_migration_fd_outgoing(fd);
>>  - ioc = qio_channel_new_fd(fd, errp);
>>  + /* fd is previously created by qmp command 'getfd',
>>  + * so client is responsible to close it. Dup it to save original value from
>>  + * QIOChannel's destructor */
>>  + dup_fd = qemu_dup(fd);
>>  + if (dup_fd == -1) {
>>  + error_setg(errp, "Cannot dup fd %s: %s (%d)", fdname, strerror(errno),
>>  + errno);
>>  + return;
>>  + }
>>  +
>>  + trace_migration_fd_outgoing(fd, dup_fd);
>>  + ioc = qio_channel_new_fd(dup_fd, errp);
>>   if (!ioc) {
>>  - close(fd);
>>  + close(dup_fd);
>>   return;
>>   }
>>
>>  @@ -55,14 +68,24 @@ static gboolean fd_accept_incoming_migration(QIOChannel 
>> *ioc,
>>   void fd_start_incoming_migration(const char *infd, Error **errp)
>>   {
>>   QIOChannel *ioc;
>>  - int fd;
>>  + int fd, dup_fd;
>>
>>   fd = strtol(infd, NULL, 0);
>>  - trace_migration_fd_incoming(fd);
>>
>>  - ioc = qio_channel_new_fd(fd, errp);
>>  + /* fd is previously created by qmp command 'add-fd' or something else,
>>  + * so client is responsible to close it. Dup it to save original value from
>>  + * QIOChannel's destructor */
>>  + dup_fd = qemu_dup(fd);
>>  + if (dup_fd == -1) {
>>  + error_setg(errp, "Cannot dup fd %d: %s (%d)", fd, strerror(errno),
>>  + errno);
>>  + return;
>>  + }
>>  +
>>  + trace_migration_fd_incoming(fd, dup_fd);
>>  + ioc = qio_channel_new_fd(dup_fd, errp);
>>   if (!ioc) {
>>  - close(fd);
>>  + close(dup_fd);
>>   return;
>>   }
>>
>>  diff --git a/migration/trace-events b/migration/trace-events
>>  index de2e136e57..d2d30a6b3c 100644
>>  --- a/migration/trace-events
>>  +++ b/migration/trace-events
>>  @@ -258,8 +258,8 @@ migration_exec_outgoing(const char *cmd) "cmd=%s"
>>   migration_exec_incoming(const char *cmd) "cmd=%s"
>>
>>   # fd.c
>>  -migration_fd_outgoing(int fd) "fd=%d"
>>  -migration_fd_incoming(int fd) "fd=%d"
>>  +migration_fd_outgoing(int fd, int dup_fd) "fd=%d dup_fd=%d"
>>  +migration_fd_incoming(int fd, int dup_fd) "fd=%d dup_fd=%d"
>>
>>   # socket.c
>>   migration_socket_incoming_accepted(void) ""
>>  --
>>  2.21.0
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH for-4.1] q35: acpi: do not create dummy MCFG table

2019-04-10 Thread Wei Yang
On Tue, Apr 09, 2019 at 05:00:37PM +0200, Igor Mammedov wrote:
>Dummy table (with signature "QEMU") creation came from original SeaBIOS
>codebase. And QEMU would have to keep it around if there were Q35 machine
>that depended on keeping ACPI tables blob constant size. Luckily there
>were no versioned Q35 machine types before commit:
>  (since 2.3) a1666142db acpi-build: make ROMs RAM blocks resizeable
>which obsoleted need to keep ACPI tables blob the same size on 
>source/destination.

I think we should keep table blob the same size on source/destination.

But with resizable MemoryRegion, we don't need to reserve some dummy space.
Because we can expand it later.

>
>Considering the 1st versioned machine is pc-q35-2.4, the dummy table
>is not really necessary and it's safe to drop it without breaking
>cross version migration in both directions unconditionally.
>
>Signed-off-by: Igor Mammedov 
>---
> hw/i386/acpi-build.c | 18 --
> 1 file changed, 4 insertions(+), 14 deletions(-)
>
>diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>index 416da31..8671e25 100644
>--- a/hw/i386/acpi-build.c
>+++ b/hw/i386/acpi-build.c
>@@ -2401,7 +2401,6 @@ static void
> build_mcfg_q35(GArray *table_data, BIOSLinker *linker, AcpiMcfgInfo *info)
> {
> AcpiTableMcfg *mcfg;
>-const char *sig;
> int len = sizeof(*mcfg) + 1 * sizeof(mcfg->allocation[0]);
> 
> mcfg = acpi_data_push(table_data, len);
>@@ -2411,19 +2410,7 @@ build_mcfg_q35(GArray *table_data, BIOSLinker *linker, 
>AcpiMcfgInfo *info)
> mcfg->allocation[0].start_bus_number = 0;
> mcfg->allocation[0].end_bus_number = PCIE_MMCFG_BUS(info->mcfg_size - 1);
> 
>-/* MCFG is used for ECAM which can be enabled or disabled by guest.

I want to cnfirm what is "enabled or disabled by guest" here.

If we don't reserve mcfg and "guest" enable mcfg during running, the ACPI
table size changed. But the destination still has the original table size,
since destination "guest" keep sleep during this period.

Now the migration would face table size difference and break migration?

>- * To avoid table size changes (which create migration issues),
>- * always create the table even if there are no allocations,
>- * but set the signature to a reserved value in this case.
>- * ACPI spec requires OSPMs to ignore such tables.
>- */
>-if (info->mcfg_base == PCIE_BASE_ADDR_UNMAPPED) {
>-/* Reserved signature: ignored by OSPM */
>-sig = "QEMU";
>-} else {
>-sig = "MCFG";
>-}
>-build_header(linker, table_data, (void *)mcfg, sig, len, 1, NULL, NULL);
>+build_header(linker, table_data, (void *)mcfg, "MCFG", len, 1, NULL, 
>NULL);
> }
> 
> /*
>@@ -2592,6 +2579,9 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
> }
> mcfg->mcfg_base = qnum_get_uint(qobject_to(QNum, o));
> qobject_unref(o);
>+if (mcfg->mcfg_base == PCIE_BASE_ADDR_UNMAPPED) {
>+return false;
>+}
> 
> o = object_property_get_qobject(pci_host, PCIE_HOST_MCFG_SIZE, NULL);
> assert(o);
>-- 
>2.7.4

-- 
Wei Yang
Help you, Help me



Re: [Qemu-devel] [PATCH v5 2/5] virtio-pmem: Add virtio pmem driver

2019-04-10 Thread Cornelia Huck
On Wed, 10 Apr 2019 10:03:01 -0400 (EDT)
Pankaj Gupta  wrote:

> > 
> > On Wed, Apr 10, 2019 at 09:38:22AM +0530, Pankaj Gupta wrote:  
> > > This patch adds virtio-pmem driver for KVM guest.
> > > 
> > > Guest reads the persistent memory range information from
> > > Qemu over VIRTIO and registers it on nvdimm_bus. It also
> > > creates a nd_region object with the persistent memory
> > > range information so that existing 'nvdimm/pmem' driver
> > > can reserve this into system memory map. This way
> > > 'virtio-pmem' driver uses existing functionality of pmem
> > > driver to register persistent memory compatible for DAX
> > > capable filesystems.
> > > 
> > > This also provides function to perform guest flush over
> > > VIRTIO from 'pmem' driver when userspace performs flush
> > > on DAX memory range.
> > > 
> > > Signed-off-by: Pankaj Gupta 

> > > diff --git a/include/uapi/linux/virtio_ids.h
> > > b/include/uapi/linux/virtio_ids.h
> > > index 6d5c3b2d4f4d..346389565ac1 100644
> > > --- a/include/uapi/linux/virtio_ids.h
> > > +++ b/include/uapi/linux/virtio_ids.h
> > > @@ -43,5 +43,6 @@
> > >  #define VIRTIO_ID_INPUT18 /* virtio input */
> > >  #define VIRTIO_ID_VSOCK19 /* virtio vsock transport */
> > >  #define VIRTIO_ID_CRYPTO   20 /* virtio crypto */
> > > +#define VIRTIO_ID_PMEM 25 /* virtio pmem */
> > >  
> > >  #endif /* _LINUX_VIRTIO_IDS_H */  
> > 
> > Didn't Paolo point out someone is using 25 for audio?  
> 
> 
> Sorry! I did not notice this.
> 
> > 
> > Please, reserve an ID with the Virtio TC before using it.  
> 
> I thought I requested[1] ID 25.
> 
> [1] https://github.com/oasis-tcs/virtio-spec/issues/38
> [2] https://lists.oasis-open.org/archives/virtio-dev/201805/threads.html
> 
> In that case I will send request for next available ID i.e 26?

Have the folks using this for audio sent a reservation request as well?
If not, I'd say the first requester should get the id...

(And yes, we need to be quicker with voting on/applying id
reservations :/)



[Qemu-devel] Questions about acpi interrupt link device's ‘_PRS' field

2019-04-10 Thread Li Qiang
Hi all,

I see the link device ‘_PRS’  uses irq line 5, 10, 11 in ‘build_link_dev’ 
function.
But I never see the 5 lines uses in the guest, just uses 10 and 11. 
Why this happen?  Maybe related with the guest?

Thanks,
Li Qiang





Re: [Qemu-devel] [PATCH 01/10] block/pflash_cfi02: Add test for supported commands

2019-04-10 Thread Thomas Huth
On 09/04/2019 17.37, Stephen Checkoway wrote:
> 
> 
> On Apr 9, 2019, at 02:13, Thomas Huth  wrote:
> 
>> We'd like to get rid of global_qtest in the long run (since it is
>> causing trouble for tests that run multiple instances of QEMU in
>> parallel, e.g. migration tests)... so if it is feasible, please don't
>> use it in new code anymore. Try to use a local variable in the function
>> that call qtest_initf() and pass the test state around via a parameter
>> to the functions that need it.
> 
> One of the patches introduces a FlashConfig structure which gets passed 
> around to all the functions. I could stuff the local qtest in there. The 
> earlier patches would still use global_qtest. Would that be acceptable or 
> would you prefer that global_qtest not appear in any of the patches?

Using FlashConfig sounds fine to me. If you need to use global_qtest in
some of the earlier patches, that's ok, too (especially if these go away
again in the later patches ;-)).

 Thomas



Re: [Qemu-devel] [PATCH v5 2/5] virtio-pmem: Add virtio pmem driver

2019-04-10 Thread Yuval Shaia
On Wed, Apr 10, 2019 at 02:24:26PM +0200, Cornelia Huck wrote:
> On Wed, 10 Apr 2019 09:38:22 +0530
> Pankaj Gupta  wrote:
> 
> > This patch adds virtio-pmem driver for KVM guest.
> > 
> > Guest reads the persistent memory range information from
> > Qemu over VIRTIO and registers it on nvdimm_bus. It also
> > creates a nd_region object with the persistent memory
> > range information so that existing 'nvdimm/pmem' driver
> > can reserve this into system memory map. This way
> > 'virtio-pmem' driver uses existing functionality of pmem
> > driver to register persistent memory compatible for DAX
> > capable filesystems.
> > 
> > This also provides function to perform guest flush over
> > VIRTIO from 'pmem' driver when userspace performs flush
> > on DAX memory range.
> > 
> > Signed-off-by: Pankaj Gupta 
> > ---
> >  drivers/nvdimm/virtio_pmem.c |  88 ++
> >  drivers/virtio/Kconfig   |  10 +++
> >  drivers/virtio/Makefile  |   1 +
> >  drivers/virtio/pmem.c| 124 +++
> >  include/linux/virtio_pmem.h  |  60 +++
> >  include/uapi/linux/virtio_ids.h  |   1 +
> >  include/uapi/linux/virtio_pmem.h |  10 +++
> >  7 files changed, 294 insertions(+)
> >  create mode 100644 drivers/nvdimm/virtio_pmem.c
> >  create mode 100644 drivers/virtio/pmem.c
> >  create mode 100644 include/linux/virtio_pmem.h
> >  create mode 100644 include/uapi/linux/virtio_pmem.h
> > 
> (...)
> > diff --git a/drivers/virtio/pmem.c b/drivers/virtio/pmem.c
> > new file mode 100644
> > index ..cc9de9589d56
> > --- /dev/null
> > +++ b/drivers/virtio/pmem.c
> > @@ -0,0 +1,124 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * virtio_pmem.c: Virtio pmem Driver
> > + *
> > + * Discovers persistent memory range information
> > + * from host and registers the virtual pmem device
> > + * with libnvdimm core.
> > + */
> > +#include 
> > +#include <../../drivers/nvdimm/nd.h>
> > +
> > +static struct virtio_device_id id_table[] = {
> > +   { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
> > +   { 0 },
> > +};
> > +
> > + /* Initialize virt queue */
> > +static int init_vq(struct virtio_pmem *vpmem)
> 
> IMHO, you don't gain much by splitting off this function...

It make sense to have all the vq-init-related stuff in one function, so
here pmem_lock and req_list are used only for the vq.
Saying that - maybe it would be better to have the 3 in one struct.

> 
> > +{
> > +   struct virtqueue *vq;
> > +
> > +   /* single vq */
> > +   vpmem->req_vq = vq = virtio_find_single_vq(vpmem->vdev,
> > +   host_ack, "flush_queue");
> > +   if (IS_ERR(vq))
> > +   return PTR_ERR(vq);
> 
> I'm personally not a fan of chained assignments... I think I'd just
> drop the 'vq' variable and operate on vpmem->req_vq directly.

+1

> 
> > +
> > +   spin_lock_init(&vpmem->pmem_lock);
> > +   INIT_LIST_HEAD(&vpmem->req_list);
> > +
> > +   return 0;
> > +};
> > +
> > +static int virtio_pmem_probe(struct virtio_device *vdev)
> > +{
> > +   int err = 0;
> > +   struct resource res;
> > +   struct virtio_pmem *vpmem;
> > +   struct nvdimm_bus *nvdimm_bus;
> > +   struct nd_region_desc ndr_desc = {};
> > +   int nid = dev_to_node(&vdev->dev);
> > +   struct nd_region *nd_region;
> > +
> > +   if (!vdev->config->get) {
> > +   dev_err(&vdev->dev, "%s failure: config disabled\n",
> 
> Maybe s/config disabled/config access disabled/ ? That seems to be the
> more common message.
> 
> > +   __func__);
> > +   return -EINVAL;
> > +   }
> > +
> > +   vdev->priv = vpmem = devm_kzalloc(&vdev->dev, sizeof(*vpmem),
> > +   GFP_KERNEL);
> 
> Here, the vpmem variable makes sense for convenience, but I'm again not
> a fan of the chaining :)

+1

> 
> > +   if (!vpmem) {
> > +   err = -ENOMEM;
> > +   goto out_err;
> > +   }
> > +
> > +   vpmem->vdev = vdev;
> > +   err = init_vq(vpmem);
> > +   if (err)
> > +   goto out_err;
> > +
> > +   virtio_cread(vpmem->vdev, struct virtio_pmem_config,
> > +   start, &vpmem->start);
> > +   virtio_cread(vpmem->vdev, struct virtio_pmem_config,
> > +   size, &vpmem->size);
> > +
> > +   res.start = vpmem->start;
> > +   res.end   = vpmem->start + vpmem->size-1;
> > +   vpmem->nd_desc.provider_name = "virtio-pmem";
> > +   vpmem->nd_desc.module = THIS_MODULE;
> > +
> > +   vpmem->nvdimm_bus = nvdimm_bus = nvdimm_bus_register(&vdev->dev,
> > +   &vpmem->nd_desc);
> 
> And here :)
> 
> > +   if (!nvdimm_bus)
> > +   goto out_vq;
> > +
> > +   dev_set_drvdata(&vdev->dev, nvdimm_bus);
> > +
> > +   ndr_desc.res = &res;
> > +   ndr_desc.numa_node = nid;
> > +   ndr_desc.flush = virtio_pmem_flush;
> > +   set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > +   set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > +   nd_region = nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc);

Re: [Qemu-devel] [PATCH v5 2/5] virtio-pmem: Add virtio pmem driver

2019-04-10 Thread Yuval Shaia
> +
> +static int virtio_pmem_probe(struct virtio_device *vdev)
> +{
> + int err = 0;
> + struct resource res;
> + struct virtio_pmem *vpmem;
> + struct nvdimm_bus *nvdimm_bus;
> + struct nd_region_desc ndr_desc = {};
> + int nid = dev_to_node(&vdev->dev);
> + struct nd_region *nd_region;
> +
> + if (!vdev->config->get) {
> + dev_err(&vdev->dev, "%s failure: config disabled\n",
> + __func__);
> + return -EINVAL;
> + }
> +
> + vdev->priv = vpmem = devm_kzalloc(&vdev->dev, sizeof(*vpmem),
> + GFP_KERNEL);
> + if (!vpmem) {
> + err = -ENOMEM;
> + goto out_err;
> + }
> +
> + vpmem->vdev = vdev;
> + err = init_vq(vpmem);
> + if (err)
> + goto out_err;
> +
> + virtio_cread(vpmem->vdev, struct virtio_pmem_config,
> + start, &vpmem->start);
> + virtio_cread(vpmem->vdev, struct virtio_pmem_config,
> + size, &vpmem->size);
> +
> + res.start = vpmem->start;
> + res.end   = vpmem->start + vpmem->size-1;
> + vpmem->nd_desc.provider_name = "virtio-pmem";
> + vpmem->nd_desc.module = THIS_MODULE;
> +
> + vpmem->nvdimm_bus = nvdimm_bus = nvdimm_bus_register(&vdev->dev,
> + &vpmem->nd_desc);
> + if (!nvdimm_bus)
> + goto out_vq;
> +
> + dev_set_drvdata(&vdev->dev, nvdimm_bus);
> +
> + ndr_desc.res = &res;
> + ndr_desc.numa_node = nid;
> + ndr_desc.flush = virtio_pmem_flush;
> + set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> + set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> + nd_region = nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc);
> + nd_region->provider_data =  dev_to_virtio

Pleas delete the extra space.

> + (nd_region->dev.parent->parent);
> +
> + if (!nd_region)
> + goto out_nd;
> +
> + return 0;
> +out_nd:
> + err = -ENXIO;
> + nvdimm_bus_unregister(nvdimm_bus);
> +out_vq:
> + vdev->config->del_vqs(vdev);
> +out_err:
> + dev_err(&vdev->dev, "failed to register virtio pmem memory\n");
> + return err;
> +}



Re: [Qemu-devel] [Qemu-block] [PATCH] block/io.c: fix for the allocation failure

2019-04-10 Thread Stefan Hajnoczi
On Mon, Apr 08, 2019 at 12:14:49PM +0200, Kevin Wolf wrote:
> Am 08.04.2019 um 12:04 hat Kevin Wolf geschrieben:
> > Am 08.04.2019 um 11:44 hat Andrey Shinkevich geschrieben:
> > > 
> > > 
> > > On 06/04/2019 01:50, John Snow wrote:
> > > > 
> > > > 
> > > > On 4/5/19 10:24 AM, Andrey Shinkevich wrote:
> > > >> On a file system used by the customer, fallocate() returns an error
> > > >> if the block is not properly aligned. So, bdrv_co_pwrite_zeroes()
> > > >> fails. We can handle that case the same way as it is done for the
> > > >> unsupported cases, namely, call to bdrv_driver_pwritev() that writes
> > > >> zeroes to an image for the unaligned chunk of the block.
> > > >>
> > > >> Suggested-by: Denis V. Lunev 
> > > >> Signed-off-by: Andrey Shinkevich 
> > > >> ---
> > > >>   block/io.c | 2 +-
> > > >>   1 file changed, 1 insertion(+), 1 deletion(-)
> > > >>
> > > >> diff --git a/block/io.c b/block/io.c
> > > >> index dfc153b..0412a51 100644
> > > >> --- a/block/io.c
> > > >> +++ b/block/io.c
> > > >> @@ -1516,7 +1516,7 @@ static int coroutine_fn 
> > > >> bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
> > > >>   assert(!bs->supported_zero_flags);
> > > >>   }
> > > >>   
> > > >> -if (ret == -ENOTSUP && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > > >> +if (ret < 0 && !(flags & BDRV_REQ_NO_FALLBACK)) {
> > > >>   /* Fall back to bounce buffer if write zeroes is 
> > > >> unsupported */
> > > >>   BdrvRequestFlags write_flags = flags & 
> > > >> ~BDRV_REQ_ZERO_WRITE;
> > > >>   
> > > >>
> > > > 
> > > > I suppose that if fallocate fails for any reason and we're allowing
> > > > fallback, we're either going to succeed ... or fail again very soon
> > > > thereafter.
> > > > 
> > > > Are there any cases where it is vital to not ignore the first fallocate
> > > > failure? I'm a little wary of ignoring the return code from
> > > > bdrv_co_pwrite_zeroes, but I am assuming that if there is a "real"
> > > > failure here that the following bounce writes will also fail "safely."
> > > > 
> > > > I'm not completely confident, but I have no tangible objections:
> > > > Reviewed-by: John Snow 
> > > > 
> > > 
> > > Thank you for your review, John!
> > > 
> > > Let me clarify the circumstances and quote the bug report:
> > > "Customer had Win-2012 VM with 50GB system disk which was later resized 
> > > to 256GB without resizing the partition inside VM.
> > > Now, while trying to resize to 50G, the following error will appear
> > > 'Failed to reduce the number of L2 tables: Invalid argument'
> > > It was found that it is possible to shrink the disk to 128G and any size 
> > > above that number, but size below 128G will bring the mentioned error."
> > > 
> > > The fallocate() returns no error on that file system if the offset and
> > > the (offset + bytes) parameters of the bdrv_co_do_pwrite_zeroes() both
> > > are aligned to 4K.
> > 
> > What is the return value you get from this file system?
> > 
> > Maybe turning that into ENOTSUP in file-posix would be less invasive.
> > Just falling back for any error gives me the vague feeling that it could
> > cause problems sooner or later.
> 
> Also, which file system is this? This seems to be a file system bug.
> fallocate() isn't documented to possibly have alignment restrictions for
> FALLOC_FL_ZERO_RANGE (if this is the operation you're talking about).
> FALLOC_FL_PUNCH_HOLE even explicitly mentions the behaviour for partial
> blocks, so there is no doubt that operations for partial blocks are
> considered valid. Operations that may impose restrictions are explicitly
> documented as such.
> 
> So in any case, this would only be a workaround for a buggy file system.
> The real bug needs to be fixed there.

I agree regarding the root cause of the bug, but the fallback behavior
is reasonable IMO.

Andrey: If you update the patch with a more specific errno I'll queue
that patch instead.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH for-4.0 v2 2/2] roms: Allow the EDK2_EFIROM variable to be overridden

2019-04-10 Thread Laszlo Ersek
On 04/10/19 08:25, Olaf Hering wrote:
> Am Mon, 8 Apr 2019 13:05:07 +0200
> schrieb Laszlo Ersek :
>
>> Then build scripts could be updated to invoke:
>>
>>   make -C roms \
>> EDK2_BASETOOLS_OPTFLAGS='...' \
>> EDK2_BASETOOLS_LDFLAGS='...' \
>> efirom
>
> The question remains: 'But why?'.

Because it lets you pass "-fno-pie" & friends.

> Why can edk2 not be built with '-fno-pie' right away?

Those flags are not universal across gcc/binutils versions.

Some gcc/binutils versions don't enable PIE to begin with.

Some gcc/binutils versions take different PIE-disablement flags
(considering both the compiler and the linker) from other gcc/binutils
versions.

For example, please refer to the following *iPXE* commit:

> commit 7c395b0e21806b946fe944a27fc273407f357ea1
> Author: Michael Brown 
> Date:   Wed Jun 14 12:33:16 2017 +0100
>
> [build] Use -no-pie on newer versions of gcc
>
> Some distributions patch gcc to generate position independent
> executables by default.  We currently include a workaround to check
> for this and to add -fno-PIE -nopie to CFLAGS if required.
>
> Newer patched versions of gcc require -fno-PIE -no-pie instead.  Check
> for both variants.
>
> Reported-by: Nathan Rennie-Waldock 
> Originally-fixed-by: Markos Chandras 
> Signed-off-by: Michael Brown 

(I remember this commit because the logic it had added actually failed
on a system that I used once for building iPXE -- it was an x86_64
system with both a native gcc, and a *different version*
x86_64-to-x86_64 *cross* gcc. The selection logic determined the flags
for the one compiler toolchain, but then passed the flags to the other
compiler toolchain. The build broke. I narrowed it down to the above
commit, and then shrugged it off; it wasn't important enough to spend
more time on it.)

I also refer you to the following *edk2* commit:

> commit 11d0cd23dd1bc15a6e6a1598250ea2e0c4c36e9a
> Author: Ard Biesheuvel 
> Date:   Mon Jun 18 10:23:49 2018 +0200
>
> BaseTools/tools_def IA32: drop -no-pie linker option for GCC49
>
> As reported by Liming, GCC 4.9.2 does not support the -no-pie
> linker option that we added to the GCC49 and GCC5 toolchain
> profiles in commit c25d3905523a ("BaseTools/tools_def IA32:
> disable PIE code generation explicitly") to work around issues
> with recent distro toolchains that enable PIE code generation
> by default.
>
> So rollback the changes for GCC49 but preserve them for GCC5
>
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Ard Biesheuvel 
> Acked-by: Laszlo Ersek 

The above facility will let you stuff the options into the edk2
BaseTools build, in your downstream qemu.spec file, that match your
downstream gcc/binutils version.

Most build hosts will need no flags.

> Did noone approach the edk2 developers yet that their buildsystem is
> broken in that regard?

Well, have you?

I don't remember anyone else reporting such an issue yet, on edk2-devel
or elsewhere, with building BaseTools. I certainly remember
collaborating with Gary Lin from SUSE on various stuff, but not this.

You are welcome to file a bug at .
Please pick "EDK2" for "Product", "Code" for "Component", and
"BaseTools" for "Package".

... Please don't try to force a "zero build-interface changes" policy on
upstream, just so you can avoid a small tweak in a downstream build
script when you rebase. Not even the gcc/binutils command lines conform
to that. We all hope downstream rebases to be painless, and upstreams do
strive to help with that, but the downstream rebase experience is never
*entirely* painless (or automated).

Thanks,
Laszlo



Re: [Qemu-devel] [PATCH v2 for-4.0?] aio-posix: ensure poll mode is left when aio_notify is called

2019-04-10 Thread Stefan Hajnoczi
On Tue, Apr 09, 2019 at 02:28:23PM +0200, Paolo Bonzini wrote:

Why is this 4.0 material?  It's not a 4.0 regression and tweaking the
event loop is risky.  I suggest waiting for 4.1.

> With aio=thread, adaptive polling makes latency worse rather than
> better, because it delays the execution of the ThreadPool's
> completion bottom half.
> 
> event_notifier_poll() does run while polling, detecting that
> a bottom half was scheduled by a worker thread, but because
> ctx->notifier is explicitly ignored in run_poll_handlers_once(),
> scheduling the BH does not count as making progress and
> run_poll_handlers() keeps running.  Fix this by recomputing
> the deadline after *timeout could have changed.
> 
> With this change, ThreadPool still cannot participate in polling
> but at least it does not suffer from extra latency.
> 
> Reported-by: Sergio Lopez 
> Cc: Stefan Hajnoczi 
> Cc: Kevin Wolf 
> Cc: qemu-bl...@nongnu.org
> Signed-off-by: Paolo Bonzini 
> Message-Id: <1553692145-86728-1-git-send-email-pbonz...@redhat.com>
> Signed-off-by: Paolo Bonzini 
> ---
> v1->v2: use qemu_soonest_timeout to handle timeout == -1
>  util/aio-posix.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH for-4.1] roms: Correct the EDK2_BASETOOLS_OPTFLAGS variable description

2019-04-10 Thread Laszlo Ersek
On 04/10/19 06:57, Philippe Mathieu-Daudé wrote:
> In commit 1cab464136b4 we incorrectly described the
> EDK2_BASETOOLS_OPTFLAGS can pass CPPFLAGS and CFLAGS
> options to the EDK2 build tools, but it only expands
> the CFLAGS (not to the CPPFLAGS).
> Update the description to be more accurate.
> 
> Reported-by: Laszlo Ersek 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  roms/Makefile | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/roms/Makefile b/roms/Makefile
> index 1ff78b63bb3..d7fd6025e7c 100644
> --- a/roms/Makefile
> +++ b/roms/Makefile
> @@ -120,8 +120,8 @@ build-efi-roms: build-pxe-roms
>   $(patsubst %,bin-i386-efi/%.efidrv,$(pxerom_targets)) \
>   $(patsubst %,bin-x86_64-efi/%.efidrv,$(pxerom_targets))
>  
> -# Build scripts can pass compiler/linker flags to the EDK2 build tools
> -# via the EDK2_BASETOOLS_OPTFLAGS (CPPFLAGS and CFLAGS) and
> +# Build scripts can pass compiler/linker flags to the EDK2
> +# build  tools via the EDK2_BASETOOLS_OPTFLAGS (CFLAGS) and
>  # EDK2_BASETOOLS_LDFLAGS (LDFLAGS) environment variables.
>  #
>  # Example:
> 

Reviewed-by: Laszlo Ersek 

Please noone submit a pull request with this patch (for 4.1) until my
edk2 bundling series is merged; as this one will likely introduce
another contextual conflict, and I'd *really* like to avoid another
rebase on my end, due to such an issue.

Thanks!
Laszlo



Re: [Qemu-devel] [PATCH] hostmem-file: warn when memory-backend-file, share=on and in incoming migration

2019-04-10 Thread Dr. David Alan Gilbert
* Catherine Ho (catherine.h...@gmail.com) wrote:
> Hi Igor
> 
> 
> On Mon, 8 Apr 2019 at 18:35, Igor Mammedov  wrote:
> 
> > On Sun,  7 Apr 2019 22:19:05 -0400
> > Catherine Ho  wrote:
> >
> > > Currently it is not forbidden to use "-object
> > memory-backend-file,share=on"
> > > and together with "-incoming". But after incoming migration is finished,
> > > the memory-backend-file will be definitely written if share=on. So the
> > > memory-backend-file can only be used once, but failed in the 2nd time
> > > incoming.
> > >
> > > Thus it gives a warning and the users can run the qemu if they really
> > > want to do it.
> >
> > Shouldn't we add a migration blocker in such a case instead of warning
> > and letting qemu run wild?
> >
> 
> IMO, it doesn't need to block this. With share=on and -incoming, the user
> can
> still save the device memory state into memory-backend file again if
> ignore-shared
> capability is on.
> 
> If we block this, the user can't use the ignore-shared capability in
> incoming
> migration.

-incomign with share=on is a perfectly normal thing to do - it just
depends who you are sharing the file with and the lifetime of that
shared file.

For example; if you're just running a qemu with vhost-user then you
use share=on - however wyou typically select the backend file as
a new file from /dev/shm - it's not a file that you previously migrated
to.

QEMU has no way to know about the provenance of the shared file it's
been given, so we can't really warn people about it.

Dave

> B.R.
> Catherine
> 
> >
> >
> > > Signed-off-by: Catherine Ho 
> > > ---
> > >  backends/hostmem-file.c | 11 +++
> > >  1 file changed, 11 insertions(+)
> > >
> > > diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
> > > index 37ac6445d2..59429ee0b4 100644
> > > --- a/backends/hostmem-file.c
> > > +++ b/backends/hostmem-file.c
> > > @@ -16,6 +16,7 @@
> > >  #include "sysemu/hostmem.h"
> > >  #include "sysemu/sysemu.h"
> > >  #include "qom/object_interfaces.h"
> > > +#include "migration/migration.h"
> > >
> > >  /* hostmem-file.c */
> > >  /**
> > > @@ -79,6 +80,16 @@ file_backend_memory_alloc(HostMemoryBackend *backend,
> > Error **errp)
> > >  }
> > >  }
> > >
> > > +/*
> > > + * In ignore shared case, if share=on for host memory backend file,
> > > + * the ram might be written after incoming process is finished. Thus
> > > + * the memory backend can't be reused for 2nd/3rd... incoming
> > > + */
> > > +if (backend->share && migrate_ignore_shared()
> > > +   && runstate_check(RUN_STATE_INMIGRATE))
> > > +warn_report("share=on for memory backend file might be "
> > > +"conflicted with incoming in ignore shared
> > case");
> > > +
> > >  backend->force_prealloc = mem_prealloc;
> > >  name = host_memory_backend_get_name(backend);
> > >  memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
> >
> >
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH 3/3] xen-bus / xen-block: add support for event channel polling

2019-04-10 Thread Anthony PERARD
On Mon, Apr 08, 2019 at 04:16:17PM +0100, Paul Durrant wrote:
> This patch introduces a poll callback for event channel fd-s and uses
> this to invoke the channel callback function.
> 
> To properly support polling, it is necessary for the event channel callback
> function to return a boolean saying whether it has done any useful work or
> not. Thus xen_block_dataplane_event() is modified to directly invoke
> xen_block_handle_requests() and the latter only returns true if it actually
> processes any requests. This also means that the call to qemu_bh_schedule()
> is moved into xen_block_complete_aio(), which is more intuitive since the
> only reason for doing a deferred poll of the shared ring should be because
> there were previously insufficient resources to fully complete a previous
> poll.
> 
> Signed-off-by: Paul Durrant 
> ---

Reviewed-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [Qemu-devel] [PATCH for-4.1] q35: acpi: do not create dummy MCFG table

2019-04-10 Thread Igor Mammedov
On Wed, 10 Apr 2019 22:27:56 +0800
Wei Yang  wrote:

[...]
> >@@ -2411,19 +2410,7 @@ build_mcfg_q35(GArray *table_data, BIOSLinker 
> >*linker, AcpiMcfgInfo *info)
> > mcfg->allocation[0].start_bus_number = 0;
> > mcfg->allocation[0].end_bus_number = PCIE_MMCFG_BUS(info->mcfg_size - 
> > 1);
> > 
> >-/* MCFG is used for ECAM which can be enabled or disabled by guest.  
> 
> I want to cnfirm what is "enabled or disabled by guest" here.

Firmware theoretically during PCI initialization may disable ECAM support
and that's when we do no need MCFG. In practice that's not happening
(SeaBIOS or UEFI) but we in case there is out there a firmware that does
disable ECAM we do not generate MCFG.

Note:
ACPI tables generated twice, 1st when QEMU starts and the second time
when firmware accesses fwcfg to read blobs for the 1st time.
The later happens after PCI subsystem was initialized by firmware.
At that time we know if ECAM was enabled or not.

> If we don't reserve mcfg and "guest" enable mcfg during running, the ACPI
> table size changed. But the destination still has the original table size,
> since destination "guest" keep sleep during this period.
> 
> Now the migration would face table size difference

with commit a1666142db we do not care as all the tables created on
source will be migrated to destination as is overwriting whatever blobs
destination created on startup.

> and break migration?
nope,

to help you figure out why it works
look at what following git commits did:
  git log c8d6f66ae7..a1666142db
and pay attention to 'used_length'

> 
> >- * To avoid table size changes (which create migration issues),
> >- * always create the table even if there are no allocations,
> >- * but set the signature to a reserved value in this case.
> >- * ACPI spec requires OSPMs to ignore such tables.
> >- */
> >-if (info->mcfg_base == PCIE_BASE_ADDR_UNMAPPED) {
> >-/* Reserved signature: ignored by OSPM */
> >-sig = "QEMU";
> >-} else {
> >-sig = "MCFG";
> >-}
> >-build_header(linker, table_data, (void *)mcfg, sig, len, 1, NULL, NULL);
> >+build_header(linker, table_data, (void *)mcfg, "MCFG", len, 1, NULL, 
> >NULL);
> > }
> > 
> > /*
> >@@ -2592,6 +2579,9 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
> > }
> > mcfg->mcfg_base = qnum_get_uint(qobject_to(QNum, o));
> > qobject_unref(o);
> >+if (mcfg->mcfg_base == PCIE_BASE_ADDR_UNMAPPED) {
> >+return false;
> >+}
> > 
> > o = object_property_get_qobject(pci_host, PCIE_HOST_MCFG_SIZE, NULL);
> > assert(o);
> >-- 
> >2.7.4  
> 




Re: [Qemu-devel] [PATCH for-4.1] roms: Correct the EDK2_BASETOOLS_OPTFLAGS variable description

2019-04-10 Thread Philippe Mathieu-Daudé
On 4/10/19 4:58 PM, Laszlo Ersek wrote:
> On 04/10/19 06:57, Philippe Mathieu-Daudé wrote:
>> In commit 1cab464136b4 we incorrectly described the
>> EDK2_BASETOOLS_OPTFLAGS can pass CPPFLAGS and CFLAGS
>> options to the EDK2 build tools, but it only expands
>> the CFLAGS (not to the CPPFLAGS).
>> Update the description to be more accurate.
>>
>> Reported-by: Laszlo Ersek 
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>  roms/Makefile | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/roms/Makefile b/roms/Makefile
>> index 1ff78b63bb3..d7fd6025e7c 100644
>> --- a/roms/Makefile
>> +++ b/roms/Makefile
>> @@ -120,8 +120,8 @@ build-efi-roms: build-pxe-roms
>>  $(patsubst %,bin-i386-efi/%.efidrv,$(pxerom_targets)) \
>>  $(patsubst %,bin-x86_64-efi/%.efidrv,$(pxerom_targets))
>>  
>> -# Build scripts can pass compiler/linker flags to the EDK2 build tools
>> -# via the EDK2_BASETOOLS_OPTFLAGS (CPPFLAGS and CFLAGS) and
>> +# Build scripts can pass compiler/linker flags to the EDK2
>> +# build  tools via the EDK2_BASETOOLS_OPTFLAGS (CFLAGS) and
>>  # EDK2_BASETOOLS_LDFLAGS (LDFLAGS) environment variables.
>>  #
>>  # Example:
>>
> 
> Reviewed-by: Laszlo Ersek 
> 
> Please noone submit a pull request with this patch (for 4.1) until my
> edk2 bundling series is merged; as this one will likely introduce
> another contextual conflict, and I'd *really* like to avoid another
> rebase on my end, due to such an issue.

Thanks for the review. I'll rebase it on top of your 'bundle edk2
platform firmware with QEMU' v4 and resend (later). Feel free to include
it on top of your pullreq.

Regards,

Phil.



Re: [Qemu-devel] [PATCH v2 for-4.0?] aio-posix: ensure poll mode is left when aio_notify is called

2019-04-10 Thread Stefan Hajnoczi
On Tue, Apr 09, 2019 at 02:28:23PM +0200, Paolo Bonzini wrote:
> With aio=thread, adaptive polling makes latency worse rather than
> better, because it delays the execution of the ThreadPool's
> completion bottom half.
> 
> event_notifier_poll() does run while polling, detecting that
> a bottom half was scheduled by a worker thread, but because
> ctx->notifier is explicitly ignored in run_poll_handlers_once(),
> scheduling the BH does not count as making progress and
> run_poll_handlers() keeps running.  Fix this by recomputing
> the deadline after *timeout could have changed.
> 
> With this change, ThreadPool still cannot participate in polling
> but at least it does not suffer from extra latency.
> 
> Reported-by: Sergio Lopez 
> Cc: Stefan Hajnoczi 
> Cc: Kevin Wolf 
> Cc: qemu-bl...@nongnu.org
> Signed-off-by: Paolo Bonzini 
> Message-Id: <1553692145-86728-1-git-send-email-pbonz...@redhat.com>
> Signed-off-by: Paolo Bonzini 
> ---
> v1->v2: use qemu_soonest_timeout to handle timeout == -1
>  util/aio-posix.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)

Thanks, applied to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 2/3] xen-bus: allow AioContext to be specified for each event channel

2019-04-10 Thread Anthony PERARD
On Wed, Apr 10, 2019 at 04:20:05PM +0100, Paul Durrant wrote:
> > -Original Message-
> > From: Anthony PERARD [mailto:anthony.per...@citrix.com]
> > Sent: 10 April 2019 13:57
> > To: Paul Durrant 
> > Cc: qemu-devel@nongnu.org; qemu-bl...@nongnu.org; 
> > xen-de...@lists.xenproject.org; Stefano Stabellini
> > ; Stefan Hajnoczi ; Kevin Wolf 
> > ; Max
> > Reitz 
> > Subject: Re: [PATCH 2/3] xen-bus: allow AioContext to be specified for each 
> > event channel
> > 
> > On Mon, Apr 08, 2019 at 04:16:16PM +0100, Paul Durrant wrote:
> > > This patch adds an AioContext parameter to xen_device_bind_event_channel()
> > > and then uses aio_set_fd_handler() to set the callback rather than
> > > qemu_set_fd_handler().
> > >
> > > Signed-off-by: Paul Durrant 
> > > ---
> > > @@ -943,6 +944,7 @@ static void xen_device_event(void *opaque)
> > >  }
> > >
> > >  XenEventChannel *xen_device_bind_event_channel(XenDevice *xendev,
> > > +   AioContext *ctx,
> > > unsigned int port,
> > > XenEventHandler handler,
> > > void *opaque, Error 
> > > **errp)
> > > @@ -968,8 +970,9 @@ XenEventChannel 
> > > *xen_device_bind_event_channel(XenDevice *xendev,
> > >  channel->handler = handler;
> > >  channel->opaque = opaque;
> > >
> > > -qemu_set_fd_handler(xenevtchn_fd(channel->xeh), xen_device_event, 
> > > NULL,
> > > -channel);
> > > +channel->ctx = ctx;
> > > +aio_set_fd_handler(channel->ctx, xenevtchn_fd(channel->xeh), false,
> > > +   xen_device_event, NULL, NULL, channel);
> > 
> > I wonder if the `'is_external' parameter of aio_set_fd_handler shoud be
> > `true' here, instead. That flag seems to be used when making a snapshot
> > of a blockdev, for example.
> > 
> > That was introduced by:
> > dca21ef23ba48f6f1428c59f295a857e5dc203c8^..c07bc2c1658fffeee08eb46402b2f66d55b07586
> > 
> > What do you think?
> 
> Interesting. I admit I was merely transcribing what qemu_set_fd_handler() 
> passes without really looking into the values. Looking at the arguments that 
> virtio-blk passes to aio_set_event_notifier() though, and what 'is_external' 
> means, it would appear that setting it to true is probably the right thing to 
> do. Do you want me to send a v2 of the series or can you fix it up?

I'll fix that up.

Reviewed-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



[Qemu-devel] [PATCH for-4.1] commit: Make base read-only if there is an early failure

2019-04-10 Thread Alberto Garcia
You can reproduce this by passing an invalid filter-node-name (like
"1234") to block-commit. In this case the base image is put in
read-write mode but is never reset back to read-only.

Signed-off-by: Alberto Garcia 
---
 block/commit.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/commit.c b/block/commit.c
index ba60fef58a..698eda1dfe 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -384,6 +384,9 @@ fail:
 if (s->top) {
 blk_unref(s->top);
 }
+if (s->base_read_only) {
+bdrv_reopen_set_read_only(base, true, NULL);
+}
 job_early_fail(&s->common.job);
 /* commit_top_bs has to be replaced after deleting the block job,
  * otherwise this would fail because of lack of permissions. */
-- 
2.11.0




Re: [Qemu-devel] [PATCH 0/2] qemu-img convert: ignore read errors

2019-04-10 Thread Max Reitz
On 09.04.19 16:56, Andrey Shinkevich wrote:
> The 'qemu-img convert' new command option 'force read' with the key '-R'
> allows converting a damaged image to get all the available information
> in case of the read errors. The program reports read errors and continue
> the image conversion. The users should keep in their minds that the
> resulting image is inconsistent.

I’ve already sent a series “Add salvaging mode to convert” back in December.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 2/3] xen-bus: allow AioContext to be specified for each event channel

2019-04-10 Thread Paul Durrant
> -Original Message-
> From: Anthony PERARD [mailto:anthony.per...@citrix.com]
> Sent: 10 April 2019 13:57
> To: Paul Durrant 
> Cc: qemu-devel@nongnu.org; qemu-bl...@nongnu.org; 
> xen-de...@lists.xenproject.org; Stefano Stabellini
> ; Stefan Hajnoczi ; Kevin Wolf 
> ; Max
> Reitz 
> Subject: Re: [PATCH 2/3] xen-bus: allow AioContext to be specified for each 
> event channel
> 
> On Mon, Apr 08, 2019 at 04:16:16PM +0100, Paul Durrant wrote:
> > This patch adds an AioContext parameter to xen_device_bind_event_channel()
> > and then uses aio_set_fd_handler() to set the callback rather than
> > qemu_set_fd_handler().
> >
> > Signed-off-by: Paul Durrant 
> > ---
> > @@ -943,6 +944,7 @@ static void xen_device_event(void *opaque)
> >  }
> >
> >  XenEventChannel *xen_device_bind_event_channel(XenDevice *xendev,
> > +   AioContext *ctx,
> > unsigned int port,
> > XenEventHandler handler,
> > void *opaque, Error **errp)
> > @@ -968,8 +970,9 @@ XenEventChannel 
> > *xen_device_bind_event_channel(XenDevice *xendev,
> >  channel->handler = handler;
> >  channel->opaque = opaque;
> >
> > -qemu_set_fd_handler(xenevtchn_fd(channel->xeh), xen_device_event, NULL,
> > -channel);
> > +channel->ctx = ctx;
> > +aio_set_fd_handler(channel->ctx, xenevtchn_fd(channel->xeh), false,
> > +   xen_device_event, NULL, NULL, channel);
> 
> I wonder if the `'is_external' parameter of aio_set_fd_handler shoud be
> `true' here, instead. That flag seems to be used when making a snapshot
> of a blockdev, for example.
> 
> That was introduced by:
> dca21ef23ba48f6f1428c59f295a857e5dc203c8^..c07bc2c1658fffeee08eb46402b2f66d55b07586
> 
> What do you think?

Interesting. I admit I was merely transcribing what qemu_set_fd_handler() 
passes without really looking into the values. Looking at the arguments that 
virtio-blk passes to aio_set_event_notifier() though, and what 'is_external' 
means, it would appear that setting it to true is probably the right thing to 
do. Do you want me to send a v2 of the series or can you fix it up?

  Cheers,

Paul

> 
> 
> --
> Anthony PERARD



Re: [Qemu-devel] [PATCH for-4.1] commit: Make base read-only if there is an early failure

2019-04-10 Thread Eric Blake
On 4/10/19 10:24 AM, Alberto Garcia wrote:
> You can reproduce this by passing an invalid filter-node-name (like
> "1234") to block-commit. In this case the base image is put in
> read-write mode but is never reset back to read-only.
> 

Is it worth iotest coverage?

> Signed-off-by: Alberto Garcia 
> ---
>  block/commit.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/block/commit.c b/block/commit.c
> index ba60fef58a..698eda1dfe 100644
> --- a/block/commit.c
> +++ b/block/commit.c
> @@ -384,6 +384,9 @@ fail:
>  if (s->top) {
>  blk_unref(s->top);
>  }
> +if (s->base_read_only) {
> +bdrv_reopen_set_read_only(base, true, NULL);
> +}
>  job_early_fail(&s->common.job);
>  /* commit_top_bs has to be replaced after deleting the block job,
>   * otherwise this would fail because of lack of permissions. */
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] hostmem-file: warn when memory-backend-file, share=on and in incoming migration

2019-04-10 Thread Catherine Ho
Hi Dr. David

On Wed, 10 Apr 2019 at 22:59, Dr. David Alan Gilbert 
wrote:

> * Catherine Ho (catherine.h...@gmail.com) wrote:
> > Hi Igor
> >
> >
> > On Mon, 8 Apr 2019 at 18:35, Igor Mammedov  wrote:
> >
> > > On Sun,  7 Apr 2019 22:19:05 -0400
> > > Catherine Ho  wrote:
> > >
> > > > Currently it is not forbidden to use "-object
> > > memory-backend-file,share=on"
> > > > and together with "-incoming". But after incoming migration is
> finished,
> > > > the memory-backend-file will be definitely written if share=on. So
> the
> > > > memory-backend-file can only be used once, but failed in the 2nd time
> > > > incoming.
> > > >
> > > > Thus it gives a warning and the users can run the qemu if they really
> > > > want to do it.
> > >
> > > Shouldn't we add a migration blocker in such a case instead of warning
> > > and letting qemu run wild?
> > >
> >
> > IMO, it doesn't need to block this. With share=on and -incoming, the user
> > can
> > still save the device memory state into memory-backend file again if
> > ignore-shared
> > capability is on.
> >
> > If we block this, the user can't use the ignore-shared capability in
> > incoming
> > migration.
>
> -incomign with share=on is a perfectly normal thing to do - it just
> depends who you are sharing the file with and the lifetime of that
> shared file.
>
> For example; if you're just running a qemu with vhost-user then you
> use share=on - however wyou typically select the backend file as
> a new file from /dev/shm - it's not a file that you previously migrated
> to.
>
Thanks,
but using a new file from /dev/shm means kernel will start from
start_kernel or early? Is it different from the x-ignore-shared case?
If we remove the share=on in incoming migration, all the writting
of ram will not be flush into the memory backend file. Thus we
can use the base memory backend file for ever.
e.g.
1) save the vm like a snapshot, current ram state is "kernel
has been started, systemd has been started"
2) restore it with -incoming and *no* share=on flag
3) restore it with -incoming and *no* share=on again...
In contrary, if we use share=on, the base backend file will
be written at once after 1st time incoming.

So, IMO, no "share=on" is the proper usage of incoming migration
when ignore-shared is on.
Please correct me if sth is wrong, thanks:)

B.R.
Catherine


> QEMU has no way to know about the provenance of the shared file it's
> been given, so we can't really warn people about it.
>
> Dave
>
> > B.R.
> > Catherine
> >
> > >
> > >
> > > > Signed-off-by: Catherine Ho 
> > > > ---
> > > >  backends/hostmem-file.c | 11 +++
> > > >  1 file changed, 11 insertions(+)
> > > >
> > > > diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
> > > > index 37ac6445d2..59429ee0b4 100644
> > > > --- a/backends/hostmem-file.c
> > > > +++ b/backends/hostmem-file.c
> > > > @@ -16,6 +16,7 @@
> > > >  #include "sysemu/hostmem.h"
> > > >  #include "sysemu/sysemu.h"
> > > >  #include "qom/object_interfaces.h"
> > > > +#include "migration/migration.h"
> > > >
> > > >  /* hostmem-file.c */
> > > >  /**
> > > > @@ -79,6 +80,16 @@ file_backend_memory_alloc(HostMemoryBackend
> *backend,
> > > Error **errp)
> > > >  }
> > > >  }
> > > >
> > > > +/*
> > > > + * In ignore shared case, if share=on for host memory backend
> file,
> > > > + * the ram might be written after incoming process is finished.
> Thus
> > > > + * the memory backend can't be reused for 2nd/3rd... incoming
> > > > + */
> > > > +if (backend->share && migrate_ignore_shared()
> > > > +   && runstate_check(RUN_STATE_INMIGRATE))
> > > > +warn_report("share=on for memory backend file might be "
> > > > +"conflicted with incoming in ignore shared
> > > case");
> > > > +
> > > >  backend->force_prealloc = mem_prealloc;
> > > >  name = host_memory_backend_get_name(backend);
> > > >  memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
> > >
> > >
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>


Re: [Qemu-devel] [PATCH v5 2/5] virtio-pmem: Add virtio pmem driver

2019-04-10 Thread Pankaj Gupta


> 
> > This patch adds virtio-pmem driver for KVM guest.
> > 
> > Guest reads the persistent memory range information from
> > Qemu over VIRTIO and registers it on nvdimm_bus. It also
> > creates a nd_region object with the persistent memory
> > range information so that existing 'nvdimm/pmem' driver
> > can reserve this into system memory map. This way
> > 'virtio-pmem' driver uses existing functionality of pmem
> > driver to register persistent memory compatible for DAX
> > capable filesystems.
> > 
> > This also provides function to perform guest flush over
> > VIRTIO from 'pmem' driver when userspace performs flush
> > on DAX memory range.
> > 
> > Signed-off-by: Pankaj Gupta 
> > ---
> >  drivers/nvdimm/virtio_pmem.c |  88 ++
> >  drivers/virtio/Kconfig   |  10 +++
> >  drivers/virtio/Makefile  |   1 +
> >  drivers/virtio/pmem.c| 124 +++
> >  include/linux/virtio_pmem.h  |  60 +++
> >  include/uapi/linux/virtio_ids.h  |   1 +
> >  include/uapi/linux/virtio_pmem.h |  10 +++
> >  7 files changed, 294 insertions(+)
> >  create mode 100644 drivers/nvdimm/virtio_pmem.c
> >  create mode 100644 drivers/virtio/pmem.c
> >  create mode 100644 include/linux/virtio_pmem.h
> >  create mode 100644 include/uapi/linux/virtio_pmem.h
> > 
> (...)
> > diff --git a/drivers/virtio/pmem.c b/drivers/virtio/pmem.c
> > new file mode 100644
> > index ..cc9de9589d56
> > --- /dev/null
> > +++ b/drivers/virtio/pmem.c
> > @@ -0,0 +1,124 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * virtio_pmem.c: Virtio pmem Driver
> > + *
> > + * Discovers persistent memory range information
> > + * from host and registers the virtual pmem device
> > + * with libnvdimm core.
> > + */
> > +#include 
> > +#include <../../drivers/nvdimm/nd.h>
> > +
> > +static struct virtio_device_id id_table[] = {
> > +   { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
> > +   { 0 },
> > +};
> > +
> > + /* Initialize virt queue */
> > +static int init_vq(struct virtio_pmem *vpmem)
> 
> IMHO, you don't gain much by splitting off this function...

o.k. I kept this for better code structure.

> 
> > +{
> > +   struct virtqueue *vq;
> > +
> > +   /* single vq */
> > +   vpmem->req_vq = vq = virtio_find_single_vq(vpmem->vdev,
> > +   host_ack, "flush_queue");
> > +   if (IS_ERR(vq))
> > +   return PTR_ERR(vq);
> 
> I'm personally not a fan of chained assignments... I think I'd just
> drop the 'vq' variable and operate on vpmem->req_vq directly.

Sure.

> 
> > +
> > +   spin_lock_init(&vpmem->pmem_lock);
> > +   INIT_LIST_HEAD(&vpmem->req_list);
> > +
> > +   return 0;
> > +};
> > +
> > +static int virtio_pmem_probe(struct virtio_device *vdev)
> > +{
> > +   int err = 0;
> > +   struct resource res;
> > +   struct virtio_pmem *vpmem;
> > +   struct nvdimm_bus *nvdimm_bus;
> > +   struct nd_region_desc ndr_desc = {};
> > +   int nid = dev_to_node(&vdev->dev);
> > +   struct nd_region *nd_region;
> > +
> > +   if (!vdev->config->get) {
> > +   dev_err(&vdev->dev, "%s failure: config disabled\n",
> 
> Maybe s/config disabled/config access disabled/ ? That seems to be the
> more common message.

This is better.

> 
> > +   __func__);
> > +   return -EINVAL;
> > +   }
> > +
> > +   vdev->priv = vpmem = devm_kzalloc(&vdev->dev, sizeof(*vpmem),
> > +   GFP_KERNEL);
> 
> Here, the vpmem variable makes sense for convenience, but I'm again not
> a fan of the chaining :)

Sure will change :)

> 
> > +   if (!vpmem) {
> > +   err = -ENOMEM;
> > +   goto out_err;
> > +   }
> > +
> > +   vpmem->vdev = vdev;
> > +   err = init_vq(vpmem);
> > +   if (err)
> > +   goto out_err;
> > +
> > +   virtio_cread(vpmem->vdev, struct virtio_pmem_config,
> > +   start, &vpmem->start);
> > +   virtio_cread(vpmem->vdev, struct virtio_pmem_config,
> > +   size, &vpmem->size);
> > +
> > +   res.start = vpmem->start;
> > +   res.end   = vpmem->start + vpmem->size-1;
> > +   vpmem->nd_desc.provider_name = "virtio-pmem";
> > +   vpmem->nd_desc.module = THIS_MODULE;
> > +
> > +   vpmem->nvdimm_bus = nvdimm_bus = nvdimm_bus_register(&vdev->dev,
> > +   &vpmem->nd_desc);
> 
> And here :)

Sure.

> 
> > +   if (!nvdimm_bus)
> > +   goto out_vq;
> > +
> > +   dev_set_drvdata(&vdev->dev, nvdimm_bus);
> > +
> > +   ndr_desc.res = &res;
> > +   ndr_desc.numa_node = nid;
> > +   ndr_desc.flush = virtio_pmem_flush;
> > +   set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > +   set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > +   nd_region = nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc);
> > +   nd_region->provider_data =  dev_to_virtio
> > +   (nd_region->dev.parent->parent);
> 
> Isn't it clear that this parent chain will always end up at &vdev->dev?

Yes, It will resolve to

Re: [Qemu-devel] [PATCH v5 2/5] virtio-pmem: Add virtio pmem driver

2019-04-10 Thread Pankaj Gupta


> > 
> > > This patch adds virtio-pmem driver for KVM guest.
> > > 
> > > Guest reads the persistent memory range information from
> > > Qemu over VIRTIO and registers it on nvdimm_bus. It also
> > > creates a nd_region object with the persistent memory
> > > range information so that existing 'nvdimm/pmem' driver
> > > can reserve this into system memory map. This way
> > > 'virtio-pmem' driver uses existing functionality of pmem
> > > driver to register persistent memory compatible for DAX
> > > capable filesystems.
> > > 
> > > This also provides function to perform guest flush over
> > > VIRTIO from 'pmem' driver when userspace performs flush
> > > on DAX memory range.
> > > 
> > > Signed-off-by: Pankaj Gupta 
> > > ---
> > >  drivers/nvdimm/virtio_pmem.c |  88 ++
> > >  drivers/virtio/Kconfig   |  10 +++
> > >  drivers/virtio/Makefile  |   1 +
> > >  drivers/virtio/pmem.c| 124 +++
> > >  include/linux/virtio_pmem.h  |  60 +++
> > >  include/uapi/linux/virtio_ids.h  |   1 +
> > >  include/uapi/linux/virtio_pmem.h |  10 +++
> > >  7 files changed, 294 insertions(+)
> > >  create mode 100644 drivers/nvdimm/virtio_pmem.c
> > >  create mode 100644 drivers/virtio/pmem.c
> > >  create mode 100644 include/linux/virtio_pmem.h
> > >  create mode 100644 include/uapi/linux/virtio_pmem.h
> > > 
> > (...)
> > > diff --git a/drivers/virtio/pmem.c b/drivers/virtio/pmem.c
> > > new file mode 100644
> > > index ..cc9de9589d56
> > > --- /dev/null
> > > +++ b/drivers/virtio/pmem.c
> > > @@ -0,0 +1,124 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * virtio_pmem.c: Virtio pmem Driver
> > > + *
> > > + * Discovers persistent memory range information
> > > + * from host and registers the virtual pmem device
> > > + * with libnvdimm core.
> > > + */
> > > +#include 
> > > +#include <../../drivers/nvdimm/nd.h>
> > > +
> > > +static struct virtio_device_id id_table[] = {
> > > + { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
> > > + { 0 },
> > > +};
> > > +
> > > + /* Initialize virt queue */
> > > +static int init_vq(struct virtio_pmem *vpmem)
> > 
> > IMHO, you don't gain much by splitting off this function...
> 
> It make sense to have all the vq-init-related stuff in one function, so
> here pmem_lock and req_list are used only for the vq.

Yes.

> Saying that - maybe it would be better to have the 3 in one struct.
> 
> > 
> > > +{
> > > + struct virtqueue *vq;
> > > +
> > > + /* single vq */
> > > + vpmem->req_vq = vq = virtio_find_single_vq(vpmem->vdev,
> > > + host_ack, "flush_queue");
> > > + if (IS_ERR(vq))
> > > + return PTR_ERR(vq);
> > 
> > I'm personally not a fan of chained assignments... I think I'd just
> > drop the 'vq' variable and operate on vpmem->req_vq directly.
> 
> +1

Will drop extra vq.

> 
> > 
> > > +
> > > + spin_lock_init(&vpmem->pmem_lock);
> > > + INIT_LIST_HEAD(&vpmem->req_list);
> > > +
> > > + return 0;
> > > +};
> > > +
> > > +static int virtio_pmem_probe(struct virtio_device *vdev)
> > > +{
> > > + int err = 0;
> > > + struct resource res;
> > > + struct virtio_pmem *vpmem;
> > > + struct nvdimm_bus *nvdimm_bus;
> > > + struct nd_region_desc ndr_desc = {};
> > > + int nid = dev_to_node(&vdev->dev);
> > > + struct nd_region *nd_region;
> > > +
> > > + if (!vdev->config->get) {
> > > + dev_err(&vdev->dev, "%s failure: config disabled\n",
> > 
> > Maybe s/config disabled/config access disabled/ ? That seems to be the
> > more common message.
> > 
> > > + __func__);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + vdev->priv = vpmem = devm_kzalloc(&vdev->dev, sizeof(*vpmem),
> > > + GFP_KERNEL);
> > 
> > Here, the vpmem variable makes sense for convenience, but I'm again not
> > a fan of the chaining :)
> 
> +1

here as well.

> 
> > 
> > > + if (!vpmem) {
> > > + err = -ENOMEM;
> > > + goto out_err;
> > > + }
> > > +
> > > + vpmem->vdev = vdev;
> > > + err = init_vq(vpmem);
> > > + if (err)
> > > + goto out_err;
> > > +
> > > + virtio_cread(vpmem->vdev, struct virtio_pmem_config,
> > > + start, &vpmem->start);
> > > + virtio_cread(vpmem->vdev, struct virtio_pmem_config,
> > > + size, &vpmem->size);
> > > +
> > > + res.start = vpmem->start;
> > > + res.end   = vpmem->start + vpmem->size-1;
> > > + vpmem->nd_desc.provider_name = "virtio-pmem";
> > > + vpmem->nd_desc.module = THIS_MODULE;
> > > +
> > > + vpmem->nvdimm_bus = nvdimm_bus = nvdimm_bus_register(&vdev->dev,
> > > + &vpmem->nd_desc);
> > 
> > And here :)
> > 
> > > + if (!nvdimm_bus)
> > > + goto out_vq;
> > > +
> > > + dev_set_drvdata(&vdev->dev, nvdimm_bus);
> > > +
> > > + ndr_desc.res = &res;
> > > + ndr_desc.numa_node = nid;
> > > + ndr_desc.flush = virtio_pmem_flush;
> > > + set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags)

Re: [Qemu-devel] [RFC-PATCH] Introducing virtio-example device.

2019-04-10 Thread Yoni Bettan



On 4/9/19 4:17 PM, Stefan Hajnoczi wrote:

On Mon, Apr 01, 2019 at 02:18:43PM +0300, Yoni Bettan wrote:

The main goal is to add an example device to Qemu to be used as template or
guideline for contributors when they wish to create a new virtio device.

Another reason for this device is to document "the right way" to write
a new virtio device in Qemu.

This device is a simple device and its functionality is to increase its input
by 1.

The device driver is located at:
https://github.com/ybettan/QemuDeviceDrivers.git

In addition I am writing a blog to give a logical overview of the virtio
protocol and a step-by-step guide to write a new virtio device.
This blog can be found at https://howtovms.wordpress.com.

scripts/checkpatch.pl have some errors do to "//" (old style one line comment),
those lines contains FIXMEs for the next version and will be removed.

Signed-off-by: Yoni Bettan 



Hi Stefan and thank you for your review.


Where is the specification for this device?  The lack of specification
is already not "the right way".




Another step in this process is to write a specification for this 
device. Since I am learning the virtio protocol while implementing this 
example device it was easier for me to start with the device and from 
there write its specification but take into consideration that the 
specification will be written soon.





There are multiple problems with the code, but the larger issue is that
this example device is just helping people shoot themselves in the foot
more easily.



If you can point me to those problem I will be glad so I can update the 
code and understand those problems you are talking about.





The difficulty with VIRTIO is not how to implement devices/drivers, it's
that people don't read the specification and therefore do not understand
the device model properly.  The spec is dry and missing information in
some places.  I think more accessible documentation, like your blog, can
help here.



As I said, the blog will be updated with explanations on each step of 
the communication between the device and the driver and the reason it is 
not there yet is because I preferred getting some reviews, make the code 
better and only then documenting "the write way" to not mislead peoples.





If you would like to educate people about VIRTIO, then explaining the
device model, lifecycle, how to change a device in a
backwards-compatible way, virtqueue semantics, etc are the topics that
deserve attention.

For someone who has learnt these topics, implementing the device/driver
is not hard.  For someone who doesn't understand these topics, no
example device will help.  At best they will copy-paste together
something that sort of works but has issues.



For me, reading the specification and even reading some code examples 
over the internet didn't made me understand it until I saw the related 
code so I agree with you it is not enough yet but all the other parts 
are on there way.



The final purpose is to have:

1. device specification

2. device implementation

3. device driver

4. blog

maybe I should have written it at the beginning, this is not the entire 
project but it is its start.





Stefan



Thanks again for your review.




[Qemu-devel] [PATCH for-4.0-maybe] device_tree: Fix integer overflowing in load_device_tree()

2019-04-10 Thread Philippe Mathieu-Daudé
On 4/10/19 8:34 AM, Alistair Francis wrote:
> On Tue, Apr 9, 2019 at 10:59 PM Markus Armbruster  wrote:
>> Philippe Mathieu-Daudé  writes:
>>> On 4/10/19 7:28 AM, Markus Armbruster wrote:
 Philippe Mathieu-Daudé  writes:
> On 4/9/19 7:40 PM, Markus Armbruster wrote:
>> If the value of get_image_size() exceeds INT_MAX / 2 - 1, the
>> computation of @dt_size overflows to a negative number, which then
>> gets converted to a very large size_t for g_malloc0() and
>> load_image_size().  In the (fortunately improbable) case g_malloc0()
>> succeeds and load_image_size() survives, we'd assign the negative
>> number to *sizep.  What that would do to the callers I can't say, but
>> it's unlikely to be good.
>>
>> Fix by rejecting images whose size would overflow.
>>
>> Signed-off-by: Markus Armbruster 
>> ---
>>  device_tree.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/device_tree.c b/device_tree.c
>> index 296278e12a..f8b46b3c73 100644
>> --- a/device_tree.c
>> +++ b/device_tree.c
>> @@ -84,6 +84,10 @@ void *load_device_tree(const char *filename_path, int 
>> *sizep)
>>   filename_path);
>>  goto fail;
>>  }
>> +if (dt_size > INT_MAX / 2 - 1) {
>
> We should avoid magic number duplication.
> That said, this patch looks safe.
>
> Reviewed-by: Philippe Mathieu-Daudé 

 Thanks!

> BTW how did you figure that out?

 Downstream handling of upstream commit da885fe1ee8 led me to the
 function.  I spotted dt_size = get_image_size(filename_path).
 Experience has taught me to check the left hand side's type.  Bad.  Then
 I saw how dt_size gets increased.  Worse.
>>>
>>> So you genuinely neglected to mention Kurtis Miller then :)
>>
>> Explanation, not excuse: the only occurence of the name in my downstream
>> reading was a two-liner BZ comment, which I totally missed in my haste
>> to give the fix a chance to make 4.0.  I certainly didn't mean to
>> deprive him of credit!

My English teacher explained me neglected sounds like a
reprimand/reproach (as in negligence), sorry I didn't mean to be rude
here, I wanted to say something like "Peter remarked you did gave
credits to the first one who reported this issue, but since you figured
this bug via your own way, the omission was not intentional then."

> No worries. I have sent the pull request and it includes the reported by.
> 
> Alistair
> 
>>
>> [...]
>>



Re: [Qemu-devel] [PATCH] xen-block: support feature-large-sector-size

2019-04-10 Thread Anthony PERARD
On Tue, Apr 09, 2019 at 05:40:38PM +0100, Paul Durrant wrote:
> A recent Xen commit [1] clarified the semantics of sector based quantities
> used in the blkif protocol such that it is now safe to create a xen-block
> device with a logical_block_size != 512, as long as the device only
> connects to a frontend advertizing 'feature-large-block-size'.
> 
> This patch modifies xen-block accordingly. It also uses a stack variable
> for the BlockBackend in xen_block_realize() to avoid repeated dereferencing
> of the BlockConf pointer, and changes the parameters of
> xen_block_dataplane_create() so that the BlockBackend pointer and sector
> size are passed expicitly rather than implicitly via the BlockConf.
> 
> These modifications have been tested against a recent Windows PV XENVBD
> driver [2] using a xen-disk device with a 4kB logical block size.
> 
> [1] 
> http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=67e1c050e36b2c9900cca83618e56189effbad98
> [2] https://winpvdrvbuild.xenproject.org:8080/job/XENVBD-master/126
> 
> Signed-off-by: Paul Durrant 
> ---
> diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
> index ef635be4c2..05e890ad78 100644
> --- a/hw/block/xen-block.c
> +++ b/hw/block/xen-block.c
> @@ -51,11 +51,25 @@ static void xen_block_connect(XenDevice *xendev, Error 
> **errp)
[...]
> +if (xen_device_frontend_scanf(xendev, "feature-large-sector-size", "%u",
> +  &feature_large_sector_size) != 1) {
> +feature_large_sector_size = 0;
> +}
> +
> +if (feature_large_sector_size != 1 &&
> +conf->logical_block_size != XEN_BLKIF_SECTOR_SIZE) {
> +error_setg(errp, "logical_block_size != %u not supported",

Maybe add "by frontend" to the error message?

> +   XEN_BLKIF_SECTOR_SIZE);
> +return;
> +}
> +

With the question answered:
Reviewed-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [Qemu-devel] [PATCH] xen-block: support feature-large-sector-size

2019-04-10 Thread Paul Durrant
> -Original Message-
> From: Anthony PERARD [mailto:anthony.per...@citrix.com]
> Sent: 10 April 2019 16:52
> To: Paul Durrant 
> Cc: qemu-devel@nongnu.org; xen-de...@lists.xenproject.org; 
> qemu-bl...@nongnu.org; Stefano Stabellini
> ; Stefan Hajnoczi ; Kevin Wolf 
> ; Max
> Reitz 
> Subject: Re: [PATCH] xen-block: support feature-large-sector-size
> 
> On Tue, Apr 09, 2019 at 05:40:38PM +0100, Paul Durrant wrote:
> > A recent Xen commit [1] clarified the semantics of sector based quantities
> > used in the blkif protocol such that it is now safe to create a xen-block
> > device with a logical_block_size != 512, as long as the device only
> > connects to a frontend advertizing 'feature-large-block-size'.
> >
> > This patch modifies xen-block accordingly. It also uses a stack variable
> > for the BlockBackend in xen_block_realize() to avoid repeated dereferencing
> > of the BlockConf pointer, and changes the parameters of
> > xen_block_dataplane_create() so that the BlockBackend pointer and sector
> > size are passed expicitly rather than implicitly via the BlockConf.
> >
> > These modifications have been tested against a recent Windows PV XENVBD
> > driver [2] using a xen-disk device with a 4kB logical block size.
> >
> > [1] 
> > http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=67e1c050e36b2c9900cca83618e56189effbad98
> > [2] https://winpvdrvbuild.xenproject.org:8080/job/XENVBD-master/126
> >
> > Signed-off-by: Paul Durrant 
> > ---
> > diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
> > index ef635be4c2..05e890ad78 100644
> > --- a/hw/block/xen-block.c
> > +++ b/hw/block/xen-block.c
> > @@ -51,11 +51,25 @@ static void xen_block_connect(XenDevice *xendev, Error 
> > **errp)
> [...]
> > +if (xen_device_frontend_scanf(xendev, "feature-large-sector-size", 
> > "%u",
> > +  &feature_large_sector_size) != 1) {
> > +feature_large_sector_size = 0;
> > +}
> > +
> > +if (feature_large_sector_size != 1 &&
> > +conf->logical_block_size != XEN_BLKIF_SECTOR_SIZE) {
> > +error_setg(errp, "logical_block_size != %u not supported",
> 
> Maybe add "by frontend" to the error message?

Yes, I'm fine with that addition.

> 
> > +   XEN_BLKIF_SECTOR_SIZE);
> > +return;
> > +}
> > +
> 
> With the question answered:
> Reviewed-by: Anthony PERARD 
> 

Thanks,

  Paul

> Thanks,
> 
> --
> Anthony PERARD



Re: [Qemu-devel] Questions about acpi interrupt link device's ‘_PRS' field

2019-04-10 Thread Paolo Bonzini
On 10/04/19 16:33, Li Qiang wrote:
> Hi all,
> 
>  
> 
> I see the link device ‘_PRS’  uses irq line 5, 10, 11 in
> ‘build_link_dev’ function.
> 
> But I never see the 5 lines uses in the guest, just uses 10 and 11.
> 
> Why this happen?  Maybe related with the guest?

Because the MADT table tells the guest to only use lines 10 and 11.  The
BIOS configures the chipset that way.

Paolo




Re: [Qemu-devel] [PATCH 2/3] xen-bus: allow AioContext to be specified for each event channel

2019-04-10 Thread Paul Durrant
> -Original Message-
> From: Anthony PERARD [mailto:anthony.per...@citrix.com]
> Sent: 10 April 2019 16:23
> To: Paul Durrant 
> Cc: qemu-devel@nongnu.org; qemu-bl...@nongnu.org; 
> xen-de...@lists.xenproject.org; Stefano Stabellini
> ; Stefan Hajnoczi ; Kevin Wolf 
> ; Max
> Reitz 
> Subject: Re: [PATCH 2/3] xen-bus: allow AioContext to be specified for each 
> event channel
> 
> On Wed, Apr 10, 2019 at 04:20:05PM +0100, Paul Durrant wrote:
> > > -Original Message-
> > > From: Anthony PERARD [mailto:anthony.per...@citrix.com]
> > > Sent: 10 April 2019 13:57
> > > To: Paul Durrant 
> > > Cc: qemu-devel@nongnu.org; qemu-bl...@nongnu.org; 
> > > xen-de...@lists.xenproject.org; Stefano
> Stabellini
> > > ; Stefan Hajnoczi ; Kevin 
> > > Wolf ;
> Max
> > > Reitz 
> > > Subject: Re: [PATCH 2/3] xen-bus: allow AioContext to be specified for 
> > > each event channel
> > >
> > > On Mon, Apr 08, 2019 at 04:16:16PM +0100, Paul Durrant wrote:
> > > > This patch adds an AioContext parameter to 
> > > > xen_device_bind_event_channel()
> > > > and then uses aio_set_fd_handler() to set the callback rather than
> > > > qemu_set_fd_handler().
> > > >
> > > > Signed-off-by: Paul Durrant 
> > > > ---
> > > > @@ -943,6 +944,7 @@ static void xen_device_event(void *opaque)
> > > >  }
> > > >
> > > >  XenEventChannel *xen_device_bind_event_channel(XenDevice *xendev,
> > > > +   AioContext *ctx,
> > > > unsigned int port,
> > > > XenEventHandler handler,
> > > > void *opaque, Error 
> > > > **errp)
> > > > @@ -968,8 +970,9 @@ XenEventChannel 
> > > > *xen_device_bind_event_channel(XenDevice *xendev,
> > > >  channel->handler = handler;
> > > >  channel->opaque = opaque;
> > > >
> > > > -qemu_set_fd_handler(xenevtchn_fd(channel->xeh), xen_device_event, 
> > > > NULL,
> > > > -channel);
> > > > +channel->ctx = ctx;
> > > > +aio_set_fd_handler(channel->ctx, xenevtchn_fd(channel->xeh), false,
> > > > +   xen_device_event, NULL, NULL, channel);
> > >
> > > I wonder if the `'is_external' parameter of aio_set_fd_handler shoud be
> > > `true' here, instead. That flag seems to be used when making a snapshot
> > > of a blockdev, for example.
> > >
> > > That was introduced by:
> > > dca21ef23ba48f6f1428c59f295a857e5dc203c8^..c07bc2c1658fffeee08eb46402b2f66d55b07586
> > >
> > > What do you think?
> >
> > Interesting. I admit I was merely transcribing what qemu_set_fd_handler() 
> > passes without really
> looking into the values. Looking at the arguments that virtio-blk passes to 
> aio_set_event_notifier()
> though, and what 'is_external' means, it would appear that setting it to true 
> is probably the right
> thing to do. Do you want me to send a v2 of the series or can you fix it up?
> 
> I'll fix that up.

Thanks,

  Paul

> 
> Reviewed-by: Anthony PERARD 
> 
> Thanks,
> 
> --
> Anthony PERARD



[Qemu-devel] [PATCH] iotests: Let 245 pass on tmpfs

2019-04-10 Thread Max Reitz
tmpfs does not support O_DIRECT.  Detect this case, and skip flipping
@direct if the filesystem does not support it.

Fixes: bf3e50f6239090e63a8ffaaec971671e66d88e07
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/245 | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/245 b/tests/qemu-iotests/245
index 7891a210c1..a04c6235c1 100644
--- a/tests/qemu-iotests/245
+++ b/tests/qemu-iotests/245
@@ -209,6 +209,12 @@ class TestBlockdevReopen(iotests.QMPTestCase):
 
 # Reopen an image several times changing some of its options
 def test_reopen(self):
+# Check whether the filesystem supports O_DIRECT
+if 'O_DIRECT' in qemu_io('-f', 'raw', '-t', 'none', '-c', 'quit', 
hd_path[0]):
+supports_direct = False
+else:
+supports_direct = True
+
 # Open the hd1 image passing all backing options
 opts = hd_opts(1)
 opts['backing'] = hd_opts(0)
@@ -231,9 +237,9 @@ class TestBlockdevReopen(iotests.QMPTestCase):
 self.assert_qmp(self.get_node('hd1'), 'cache/writeback', True)
 self.assert_qmp(self.get_node('hd1'), 'cache/direct', False)
 self.assert_qmp(self.get_node('hd1'), 'cache/no-flush', False)
-self.reopen(opts, {'cache': { 'direct': True, 'no-flush': True }})
+self.reopen(opts, {'cache': { 'direct': supports_direct, 'no-flush': 
True }})
 self.assert_qmp(self.get_node('hd1'), 'cache/writeback', True)
-self.assert_qmp(self.get_node('hd1'), 'cache/direct', True)
+self.assert_qmp(self.get_node('hd1'), 'cache/direct', supports_direct)
 self.assert_qmp(self.get_node('hd1'), 'cache/no-flush', True)
 
 # Reopen again with the original options
-- 
2.20.1




[Qemu-devel] [PATCH] target/riscv: Remove unused include of riscv_htif.h for virt board

2019-04-10 Thread Jonathan Behrens
Unless I'm missing something, the virt board doesn't support HTIF and
should not be including this header.

Jonathan

Signed-off-by: Jonathan Behrens 
---
 hw/riscv/virt.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index fc4c6b306e..3526463034 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -29,7 +29,6 @@
 #include "hw/sysbus.h"
 #include "hw/char/serial.h"
 #include "target/riscv/cpu.h"
-#include "hw/riscv/riscv_htif.h"
 #include "hw/riscv/riscv_hart.h"
 #include "hw/riscv/sifive_plic.h"
 #include "hw/riscv/sifive_clint.h"
-- 
2.20.1


Re: [Qemu-devel] [PATCH for-4.0?] iotests: Let 245 pass on tmpfs

2019-04-10 Thread Eric Blake
On 4/10/19 11:29 AM, Max Reitz wrote:
> tmpfs does not support O_DIRECT.  Detect this case, and skip flipping
> @direct if the filesystem does not support it.
> 
> Fixes: bf3e50f6239090e63a8ffaaec971671e66d88e07
> Signed-off-by: Max Reitz 
> ---
>  tests/qemu-iotests/245 | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)

Are you trying to get this in 4.0-rc3? (As a test, it has no bearing on
the actual binaries; fewer testsuite failures are nice if we squeeze it
in, but at the same time it is not enough to delay the release if it
does not get fixed until 4.1).

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH for-4.0?] iotests: Let 245 pass on tmpfs

2019-04-10 Thread Max Reitz
On 10.04.19 18:39, Eric Blake wrote:
> On 4/10/19 11:29 AM, Max Reitz wrote:
>> tmpfs does not support O_DIRECT.  Detect this case, and skip flipping
>> @direct if the filesystem does not support it.
>>
>> Fixes: bf3e50f6239090e63a8ffaaec971671e66d88e07
>> Signed-off-by: Max Reitz 
>> ---
>>  tests/qemu-iotests/245 | 10 --
>>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> Are you trying to get this in 4.0-rc3? (As a test, it has no bearing on
> the actual binaries; fewer testsuite failures are nice if we squeeze it
> in, but at the same time it is not enough to delay the release if it
> does not get fixed until 4.1).
> 
> Reviewed-by: Eric Blake 

If there are other patches for rc3, we could take this as well.  But
there is no point in making a pull request just for this.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v5 2/5] virtio-pmem: Add virtio pmem driver

2019-04-10 Thread Michael S. Tsirkin
On Wed, Apr 10, 2019 at 04:31:39PM +0200, Cornelia Huck wrote:
> On Wed, 10 Apr 2019 10:03:01 -0400 (EDT)
> Pankaj Gupta  wrote:
> 
> > > 
> > > On Wed, Apr 10, 2019 at 09:38:22AM +0530, Pankaj Gupta wrote:  
> > > > This patch adds virtio-pmem driver for KVM guest.
> > > > 
> > > > Guest reads the persistent memory range information from
> > > > Qemu over VIRTIO and registers it on nvdimm_bus. It also
> > > > creates a nd_region object with the persistent memory
> > > > range information so that existing 'nvdimm/pmem' driver
> > > > can reserve this into system memory map. This way
> > > > 'virtio-pmem' driver uses existing functionality of pmem
> > > > driver to register persistent memory compatible for DAX
> > > > capable filesystems.
> > > > 
> > > > This also provides function to perform guest flush over
> > > > VIRTIO from 'pmem' driver when userspace performs flush
> > > > on DAX memory range.
> > > > 
> > > > Signed-off-by: Pankaj Gupta 
> 
> > > > diff --git a/include/uapi/linux/virtio_ids.h
> > > > b/include/uapi/linux/virtio_ids.h
> > > > index 6d5c3b2d4f4d..346389565ac1 100644
> > > > --- a/include/uapi/linux/virtio_ids.h
> > > > +++ b/include/uapi/linux/virtio_ids.h
> > > > @@ -43,5 +43,6 @@
> > > >  #define VIRTIO_ID_INPUT18 /* virtio input */
> > > >  #define VIRTIO_ID_VSOCK19 /* virtio vsock transport */
> > > >  #define VIRTIO_ID_CRYPTO   20 /* virtio crypto */
> > > > +#define VIRTIO_ID_PMEM 25 /* virtio pmem */
> > > >  
> > > >  #endif /* _LINUX_VIRTIO_IDS_H */  
> > > 
> > > Didn't Paolo point out someone is using 25 for audio?  
> > 
> > 
> > Sorry! I did not notice this.
> > 
> > > 
> > > Please, reserve an ID with the Virtio TC before using it.  
> > 
> > I thought I requested[1] ID 25.
> > 
> > [1] https://github.com/oasis-tcs/virtio-spec/issues/38
> > [2] https://lists.oasis-open.org/archives/virtio-dev/201805/threads.html
> > 
> > In that case I will send request for next available ID i.e 26?
> 
> Have the folks using this for audio sent a reservation request as well?
> If not, I'd say the first requester should get the id...

No but I think they ship their's already. No one ships pmem
so it's less pain for everyone if we just skip 25.

> (And yes, we need to be quicker with voting on/applying id
> reservations :/)

We can't vote on what was not proposed for a vote.
At the moment that responsibility is with the commented
once discussion on the comment has taken place.

I think what's missing is a description of the process
in the README. Want to write it up and post it?

-- 
MST



Re: [Qemu-devel] [Qemu-block] [RFC PATCH] aio: Add a knob to always poll if there are in-flight requests

2019-04-10 Thread Sergio Lopez

Stefan Hajnoczi  writes:

> On Tue, Apr 02, 2019 at 02:19:08PM +0200, Sergio Lopez wrote:
>> The polling mode in aio_poll is able to trim down ~20us on the average
>> request latency, but it needs manual fine tuning to adjust it to the
>> characteristics of the storage.
>> 
>> Here we add a new knob to the IOThread object, "poll-inflight". When
>> this knob is enabled, aio_poll will always use polling if there are
>> in-flight requests, ignoring the rest of poll-* parameters. If there
>> aren't any in-flight requests, the usual polling rules apply, which is
>> useful given that the default poll-max-ns value of 32us is usually
>> enough to catch a new request in the VQ when the Guest is putting
>> pressure on us.
>> 
>> To keep track of the number of in-flight requests, AioContext has a
>> new counter which is increased/decreased by thread-pool.c and
>> linux-aio.c on request submission/completion.
>> 
>> With poll-inflight, users willing to spend more Host CPU resources in
>> exchange for a lower latency just need to enable a single knob.
>> 
>> This is just an initial version of this feature and I'm just sharing
>> it to get some early feedback. As such, managing this property through
>> QAPI is not yet implemented.
>> 
>> Signed-off-by: Sergio Lopez 
>> ---
>>  block/linux-aio.c |  7 +++
>>  include/block/aio.h   |  9 -
>>  include/sysemu/iothread.h |  1 +
>>  iothread.c| 33 +
>>  util/aio-posix.c  | 32 +++-
>>  util/thread-pool.c|  3 +++
>>  6 files changed, 83 insertions(+), 2 deletions(-)
>
> Hi Sergio,
> More polling modes are useful for benchmarking and performance analysis.
> From this perspective I think poll-inflight is worthwhile.
>
> Like most performance optimizations, the effectiveness of this new
> polling mode depends on the workload.  It could waste CPU, especially on
> a queue depth 1 workload with a slow disk.
>
> Do you think better self-tuning is possible?  Then users don't need to
> set tunables like this one.

Probably only if we aim for some more complex, which will have its own
inherent costs. We could take inspiration from Linux's io_poll hybrid
mode, which maintains per-device statistics to calculate the average
latency, to take a nap for half that time and free the CPU a bit.

Of course, our case is significantly harder. The kernel only deals with
the HW, and only a few devices do have support for io_poll. In our case,
the IOThread may be shared among various devices with radically
different backends, which may also have a wide range of latencies
(depending on the underlying storage, file format, cache mode...). But
perhaps we can try to be clever and calculate the standard deviation
of the collected data to (in)validate the stats.

There are also some implementation challenges, as deciding where to
store those stats and designing an interface for aio_poll to access
that information, preferably in a lockless fashion.

If we can figure those out, we should be able to iterate over all the
BDSs sharing the AioContext, using the average latency (if valid),
combined with a timestamp from when the first in-flight request was
issued, to calculate a deadline and, with it, decide if we should either
take a nap using ppoll() with a timeout calculated to be wake up early
enough to catch the completion while polling, or just enter polling mode
for a while.

Perhaps it'd be worth doing a simple PoC outside QEMU, using the
vhost-user-blk example server to avoid the block layer complexity and
evaluate the raw benefits with different kinds of backends and
workloads.

Thanks,
Sergio.


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v5 2/5] virtio-pmem: Add virtio pmem driver

2019-04-10 Thread Cornelia Huck
On Wed, 10 Apr 2019 12:46:12 -0400
"Michael S. Tsirkin"  wrote:

> On Wed, Apr 10, 2019 at 04:31:39PM +0200, Cornelia Huck wrote:
> > On Wed, 10 Apr 2019 10:03:01 -0400 (EDT)
> > Pankaj Gupta  wrote:
> >   
> > > > 
> > > > On Wed, Apr 10, 2019 at 09:38:22AM +0530, Pankaj Gupta wrote:
> > > > > This patch adds virtio-pmem driver for KVM guest.
> > > > > 
> > > > > Guest reads the persistent memory range information from
> > > > > Qemu over VIRTIO and registers it on nvdimm_bus. It also
> > > > > creates a nd_region object with the persistent memory
> > > > > range information so that existing 'nvdimm/pmem' driver
> > > > > can reserve this into system memory map. This way
> > > > > 'virtio-pmem' driver uses existing functionality of pmem
> > > > > driver to register persistent memory compatible for DAX
> > > > > capable filesystems.
> > > > > 
> > > > > This also provides function to perform guest flush over
> > > > > VIRTIO from 'pmem' driver when userspace performs flush
> > > > > on DAX memory range.
> > > > > 
> > > > > Signed-off-by: Pankaj Gupta   
> >   
> > > > > diff --git a/include/uapi/linux/virtio_ids.h
> > > > > b/include/uapi/linux/virtio_ids.h
> > > > > index 6d5c3b2d4f4d..346389565ac1 100644
> > > > > --- a/include/uapi/linux/virtio_ids.h
> > > > > +++ b/include/uapi/linux/virtio_ids.h
> > > > > @@ -43,5 +43,6 @@
> > > > >  #define VIRTIO_ID_INPUT18 /* virtio input */
> > > > >  #define VIRTIO_ID_VSOCK19 /* virtio vsock transport */
> > > > >  #define VIRTIO_ID_CRYPTO   20 /* virtio crypto */
> > > > > +#define VIRTIO_ID_PMEM 25 /* virtio pmem */
> > > > >  
> > > > >  #endif /* _LINUX_VIRTIO_IDS_H */
> > > > 
> > > > Didn't Paolo point out someone is using 25 for audio?
> > > 
> > > 
> > > Sorry! I did not notice this.
> > >   
> > > > 
> > > > Please, reserve an ID with the Virtio TC before using it.
> > > 
> > > I thought I requested[1] ID 25.
> > > 
> > > [1] https://github.com/oasis-tcs/virtio-spec/issues/38
> > > [2] https://lists.oasis-open.org/archives/virtio-dev/201805/threads.html
> > > 
> > > In that case I will send request for next available ID i.e 26?  
> > 
> > Have the folks using this for audio sent a reservation request as well?
> > If not, I'd say the first requester should get the id...  
> 
> No but I think they ship their's already. No one ships pmem
> so it's less pain for everyone if we just skip 25.

Ugh. Ok, then we should change pmem...

> 
> > (And yes, we need to be quicker with voting on/applying id
> > reservations :/)  
> 
> We can't vote on what was not proposed for a vote.
> At the moment that responsibility is with the commented
> once discussion on the comment has taken place.
> 
> I think what's missing is a description of the process
> in the README. Want to write it up and post it?

I can add that to my TODO list, can't promise speedy resolution, though.



Re: [Qemu-devel] [PATCH for-4.0?] iotests: Let 245 pass on tmpfs

2019-04-10 Thread Peter Maydell
On Wed, 10 Apr 2019 at 17:40, Max Reitz  wrote:
>
> On 10.04.19 18:39, Eric Blake wrote:
> > On 4/10/19 11:29 AM, Max Reitz wrote:
> >> tmpfs does not support O_DIRECT.  Detect this case, and skip flipping
> >> @direct if the filesystem does not support it.
> >>
> >> Fixes: bf3e50f6239090e63a8ffaaec971671e66d88e07
> >> Signed-off-by: Max Reitz 
> >> ---
> >>  tests/qemu-iotests/245 | 10 --
> >>  1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > Are you trying to get this in 4.0-rc3? (As a test, it has no bearing on
> > the actual binaries; fewer testsuite failures are nice if we squeeze it
> > in, but at the same time it is not enough to delay the release if it
> > does not get fixed until 4.1).
> >
> > Reviewed-by: Eric Blake 
>
> If there are other patches for rc3, we could take this as well.  But
> there is no point in making a pull request just for this.

rc3 has been tagged already, I'm afraid. rc4 will be only
if something release-critical crops up.

thanks
-- PMM



Re: [Qemu-devel] [PATCH] hostmem-file: warn when memory-backend-file, share=on and in incoming migration

2019-04-10 Thread Dr. David Alan Gilbert
* Catherine Ho (catherine.h...@gmail.com) wrote:
> Hi Dr. David
> 
> On Wed, 10 Apr 2019 at 22:59, Dr. David Alan Gilbert 
> wrote:
> 
> > * Catherine Ho (catherine.h...@gmail.com) wrote:
> > > Hi Igor
> > >
> > >
> > > On Mon, 8 Apr 2019 at 18:35, Igor Mammedov  wrote:
> > >
> > > > On Sun,  7 Apr 2019 22:19:05 -0400
> > > > Catherine Ho  wrote:
> > > >
> > > > > Currently it is not forbidden to use "-object
> > > > memory-backend-file,share=on"
> > > > > and together with "-incoming". But after incoming migration is
> > finished,
> > > > > the memory-backend-file will be definitely written if share=on. So
> > the
> > > > > memory-backend-file can only be used once, but failed in the 2nd time
> > > > > incoming.
> > > > >
> > > > > Thus it gives a warning and the users can run the qemu if they really
> > > > > want to do it.
> > > >
> > > > Shouldn't we add a migration blocker in such a case instead of warning
> > > > and letting qemu run wild?
> > > >
> > >
> > > IMO, it doesn't need to block this. With share=on and -incoming, the user
> > > can
> > > still save the device memory state into memory-backend file again if
> > > ignore-shared
> > > capability is on.
> > >
> > > If we block this, the user can't use the ignore-shared capability in
> > > incoming
> > > migration.
> >
> > -incomign with share=on is a perfectly normal thing to do - it just
> > depends who you are sharing the file with and the lifetime of that
> > shared file.
> >
> > For example; if you're just running a qemu with vhost-user then you
> > use share=on - however wyou typically select the backend file as
> > a new file from /dev/shm - it's not a file that you previously migrated
> > to.
> >
> Thanks,
> but using a new file from /dev/shm means kernel will start from
> start_kernel or early? Is it different from the x-ignore-shared case?
> If we remove the share=on in incoming migration, all the writting
> of ram will not be flush into the memory backend file. Thus we
> can use the base memory backend file for ever.
> e.g.
> 1) save the vm like a snapshot, current ram state is "kernel
> has been started, systemd has been started"
> 2) restore it with -incoming and *no* share=on flag
> 3) restore it with -incoming and *no* share=on again...
> In contrary, if we use share=on, the base backend file will
> be written at once after 1st time incoming.
> 
> So, IMO, no "share=on" is the proper usage of incoming migration
> when ignore-shared is on.
> Please correct me if sth is wrong, thanks:)

OK, I see what you're trying to do - you mean for the 'snapshotting'
case;  but that's not the only use.  Another use is for being able to
do a very quick upgrade of the running qemu to a new qemu binary
version; and in that case you want to be able to write to the shared
file so that you can repeatedly do the quick migrate.

Dave

> 
> B.R.
> Catherine
> 
> 
> > QEMU has no way to know about the provenance of the shared file it's
> > been given, so we can't really warn people about it.
> >
> > Dave
> >
> > > B.R.
> > > Catherine
> > >
> > > >
> > > >
> > > > > Signed-off-by: Catherine Ho 
> > > > > ---
> > > > >  backends/hostmem-file.c | 11 +++
> > > > >  1 file changed, 11 insertions(+)
> > > > >
> > > > > diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
> > > > > index 37ac6445d2..59429ee0b4 100644
> > > > > --- a/backends/hostmem-file.c
> > > > > +++ b/backends/hostmem-file.c
> > > > > @@ -16,6 +16,7 @@
> > > > >  #include "sysemu/hostmem.h"
> > > > >  #include "sysemu/sysemu.h"
> > > > >  #include "qom/object_interfaces.h"
> > > > > +#include "migration/migration.h"
> > > > >
> > > > >  /* hostmem-file.c */
> > > > >  /**
> > > > > @@ -79,6 +80,16 @@ file_backend_memory_alloc(HostMemoryBackend
> > *backend,
> > > > Error **errp)
> > > > >  }
> > > > >  }
> > > > >
> > > > > +/*
> > > > > + * In ignore shared case, if share=on for host memory backend
> > file,
> > > > > + * the ram might be written after incoming process is finished.
> > Thus
> > > > > + * the memory backend can't be reused for 2nd/3rd... incoming
> > > > > + */
> > > > > +if (backend->share && migrate_ignore_shared()
> > > > > +   && runstate_check(RUN_STATE_INMIGRATE))
> > > > > +warn_report("share=on for memory backend file might be "
> > > > > +"conflicted with incoming in ignore shared
> > > > case");
> > > > > +
> > > > >  backend->force_prealloc = mem_prealloc;
> > > > >  name = host_memory_backend_get_name(backend);
> > > > >  memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
> > > >
> > > >
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> >
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v2 13/13] qemu-iotests: Test the x-blockdev-reopen QMP command

2019-04-10 Thread Max Reitz
On 06.03.19 19:11, Alberto Garcia wrote:
> This patch adds several tests for the x-blockdev-reopen QMP command.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  tests/qemu-iotests/243 | 991 
> +
>  tests/qemu-iotests/243.out |   5 +
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 997 insertions(+)
>  create mode 100644 tests/qemu-iotests/243
>  create mode 100644 tests/qemu-iotests/243.out
> 
> diff --git a/tests/qemu-iotests/243 b/tests/qemu-iotests/243
> new file mode 100644
> index 00..35b30092ca
> --- /dev/null
> +++ b/tests/qemu-iotests/243
> @@ -0,0 +1,991 @@

[...]

> +# If an image has a backing file then the 'backing' option must be
> +# passed on reopen. We don't allow leaving the option out in this
> +# case because it's unclear what the correct semantics would be.
> +def test_missing_backing_options_1(self):

[...]

> +# hd0 has no backing file: we can omit the 'backing' option
> +self.reopen(opts)

[...]

> +# Detach hd2 from hd0.
> +self.reopen(opts, {'backing': None})
> +self.reopen(opts, {}, "backing is missing for 'hd0'")

I don’t understand the second test.  hd0 has no default backing file and
it currently has no backing child attached to it.  Why would this call
fail now?

I’m asking because this no longer throws an error after “block: Leave
BDS.backing_file constant” of my “block: Deal with filters” series.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH RESEND v2 2/2] cpus: move hvf_cpu_synchronize* calls to cpu_synchronize* functions

2019-04-10 Thread Paolo Bonzini
On 10/04/19 16:02, Roman Bolshakov wrote:
> I've applied, built and tested both sequentially. Applying and running
> with patch 1/2 alone doesn't result in the behavior I mentioned. I also
> tried to apply only the first hunk that moves hvf_cpu_synchronize_state
> into cpu_synchronize_state and it also causes the issue with BIOS.
> 
> Honestly, I don't know if the tests run qemu with hvf accel. AFAIK
> kvm-unit-tests can be used to test an accel itself.

No, tests do not cover HVF.

Paolo



Re: [Qemu-devel] [PATCH] qemu-img: fix .hx and .texi disparity

2019-04-10 Thread John Snow



On 4/10/19 1:39 AM, Markus Armbruster wrote:
> John Snow  writes:
> 
>> It turns out that having options listed in three places continues to be
>> a bad idea. I'm still toying with the idea of an improved infrastructure
>> here, but in the meantime, another bandaid.
>>
>> There are three locations:
>> (1) .hx file, formatted as texi
>> (2) .hx file, formatted as human readable.
>> (3) .texi file, as section headers, formatted as texi.
>>
>> You can compare the two summaries within the .hx file like so:
>>
>> Human-readable command summaries:
>> `./qemu-img --help | grep 'Command syntax' -A14`
>> Detokenized texi command summaries:
>> `grep "@item" qemu-img-cmds.hx | sed -E 's|@var\{([^\}]*?)\}|\1|g'`
>>
>> You can compare the two separate texi summaries like so:
>>
>> Texi command summaries:
>> `grep "@item" qemu-img-cmds.hx"`
>> Texi command headers:
>> grep -E "@item.*@var" qemu-img.texi | tail -14
>>
>> Signed-off-by: John Snow 
> 
> Extra points for explaining how you found the issues in detail
> sufficient for reproducing.
> 
> Nice to have in 4.0.
> 
> Reviewed-by: Markus Armbruster 
> 

If it can go in for 4.0, who stages it? Kevin?

--js



Re: [Qemu-devel] [Qemu-block] [PULL 07/14] iotests: Add 241 to test NBD on unaligned images

2019-04-10 Thread Max Reitz
On 01.04.19 16:08, Eric Blake wrote:
> Add a test for the NBD client workaround in the previous patch.  It's
> not really feasible for an iotest to assume a specific tracing engine,
> so we can't really probe trace_nbd_parse_blockstatus_compliance to see
> if the server was fixed vs. whether the client just worked around the
> server (other than by rearranging order between code patches and this
> test). But having a successful exchange sure beats the previous state
> of an error message. Since format probing can change alignment, we can
> use that as an easy way to test several configurations.
> 
> Not tested yet, but worth adding to this test in future patches: an
> NBD server that can advertise a non-sector-aligned size (such as
> nbdkit) causes qemu as the NBD client to misbehave when it rounds the
> size up and accesses beyond the advertised size. Qemu as NBD server
> never advertises a non-sector-aligned size (since bdrv_getlength()
> currently rounds up to sector boundaries); until qemu can act as such
> a server, testing that flaw will have to rely on external binaries.
> 
> Signed-off-by: Eric Blake 
> Message-Id: <20190329042750.14704-2-ebl...@redhat.com>
> Tested-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> [eblake: add forced-512 alignment, and nbdkit reproducer comment]
> ---
>  tests/qemu-iotests/241 | 100 +
>  tests/qemu-iotests/241.out |  26 ++
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 127 insertions(+)
>  create mode 100755 tests/qemu-iotests/241
>  create mode 100644 tests/qemu-iotests/241.out
> 
> diff --git a/tests/qemu-iotests/241 b/tests/qemu-iotests/241
> new file mode 100755
> index 000..4b196857387
> --- /dev/null
> +++ b/tests/qemu-iotests/241
> @@ -0,0 +1,100 @@
> +#!/bin/bash
> +#
> +# Test qemu-nbd vs. unaligned images
> +#
> +# Copyright (C) 2018-2019 Red Hat, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see .
> +#
> +
> +seq="$(basename $0)"
> +echo "QA output created by $seq"
> +
> +status=1 # failure is the default!
> +
> +nbd_unix_socket=$TEST_DIR/test_qemu_nbd_socket
> +
> +_cleanup()
> +{
> +_cleanup_test_img
> +nbd_server_stop
> +}
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +. ./common.rc
> +. ./common.filter
> +. ./common.nbd
> +
> +_supported_fmt raw
> +_supported_proto nbd
> +_supported_os Linux
> +_require_command QEMU_NBD
> +
> +# can't use _make_test_img, because qemu-img rounds image size up,
> +# and because we want to use Unix socket rather than TCP port. Likewise,
> +# we have to redirect TEST_IMG to our server.
> +# This tests that we can deal with the hole at the end of an unaligned
> +# raw file (either because the server doesn't advertise alignment too
> +# large, or because the client ignores the server's noncompliance - even
> +# though we can't actually wire iotests into checking trace messages).
> +printf %01000d 0 > "$TEST_IMG_FILE"
> +TEST_IMG="nbd:unix:$nbd_unix_socket"
> +
> +echo
> +echo "=== Exporting unaligned raw image, natural alignment ==="
> +echo
> +
> +nbd_server_start_unix_socket -f $IMGFMT "$TEST_IMG_FILE"
> +
> +$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
> +$QEMU_IMG map -f raw --output=json "$TEST_IMG" | _filter_qemu_img_map
> +$QEMU_IO -f raw -c map "$TEST_IMG"
> +nbd_server_stop
> +
> +echo
> +echo "=== Exporting unaligned raw image, forced server sector alignment ==="
> +echo
> +
> +# Intentionally omit '-f' to force image probing, which in turn forces
> +# sector alignment, here at the server.
> +nbd_server_start_unix_socket "$TEST_IMG_FILE"
> +
> +$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
> +$QEMU_IMG map -f raw --output=json "$TEST_IMG" | _filter_qemu_img_map
> +$QEMU_IO -f raw -c map "$TEST_IMG"
> +nbd_server_stop
> +
> +echo
> +echo "=== Exporting unaligned raw image, forced client sector alignment ==="
> +echo
> +
> +# Now force sector alignment at the client.
> +nbd_server_start_unix_socket -f $IMGFMT "$TEST_IMG_FILE"
> +
> +$QEMU_NBD_PROG --list -k $nbd_unix_socket | grep '\(size\|min\)'
> +$QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
> +$QEMU_IO -c map "$TEST_IMG"
> +nbd_server_stop
> +
> +# Not tested yet: we also want to ensure that qemu as NBD client does
> +# not access 

Re: [Qemu-devel] [PATCH] roms: List and describe the Makefile 'clean' rule

2019-04-10 Thread Laszlo Ersek
On 04/10/19 07:36, Philippe Mathieu-Daudé wrote:
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  roms/Makefile | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/roms/Makefile b/roms/Makefile
> index 1ff78b63bb3..f55c4a2d3bb 100644
> --- a/roms/Makefile
> +++ b/roms/Makefile
> @@ -61,6 +61,8 @@ default:
>   @echo "  skiboot-- update skiboot.lid"
>   @echo "  u-boot.e500-- update u-boot.e500"
>   @echo "  u-boot.sam460  -- update u-boot.sam460"
> + @echo "  clean  -- delete the files generated by the previous" \
> +   "build targets"
>  
>  bios: build-seabios-config-seabios-128k build-seabios-config-seabios-256k
>   cp seabios/builds/seabios-128k/bios.bin ../pc-bios/bios.bin

This hunk makes sense in general, but it will conflict with my

  [Qemu-devel] [PATCH for-4.1 v4 07/12]
  roms: build edk2 firmware binaries and variable store templates

Please let's delay this hunk until after my pull...

> @@ -102,7 +104,7 @@ pxe-rom-%: build-pxe-roms
>  
>  efirom: $(patsubst %,efi-rom-%,$(pxerom_variants))
>  
> -efi-rom-%: build-pxe-roms build-efi-roms $(EDK2_EFIROM)
> +efi-rom-%: build-pxe-roms build-efi-roms edk2-basetools
>   $(EDK2_EFIROM) -f "0x$(VID)" -i "0x$(DID)" -l 0x02 \
>   -b ipxe/src/bin/$(VID)$(DID).rom \
>   -ec ipxe/src/bin-i386-efi/$(VID)$(DID).efidrv \

This hunk doesn't belong in this patch at all -- it's from my patch

  [Qemu-devel] [PATCH for-4.1 v4 06/12]
  roms/Makefile: replace the $(EDK2_EFIROM) target with "edk2-basetools"

(The hunk also conflicts with the purpose stated in $SUBJECT.)

> @@ -131,7 +133,7 @@ build-efi-roms: build-pxe-roms
>  #EDK2_BASETOOLS_LDFLAGS='...' \
>  #efirom
>  #
> -$(EDK2_EFIROM):
> +edk2-basetools:
>   $(MAKE) -C edk2/BaseTools \
>   EXTRA_OPTFLAGS='$(EDK2_BASETOOLS_OPTFLAGS)' \
>   EXTRA_LDFLAGS='$(EDK2_BASETOOLS_LDFLAGS)'

Ditto.

> @@ -156,6 +158,9 @@ skiboot:
>   $(MAKE) -C skiboot CROSS=$(powerpc64_cross_prefix)
>   cp skiboot/skiboot.lid ../pc-bios/skiboot.lid
>  
> +efi: edk2-basetools
> + $(MAKE) -f Makefile.edk2
> +
>  clean:
>   rm -rf seabios/.config seabios/out seabios/builds
>   $(MAKE) -C sgabios clean

And this seems to belong to:

  [Qemu-devel] [PATCH for-4.1 v4 07/12]
  roms: build edk2 firmware binaries and variable store templates

> @@ -166,3 +171,4 @@ clean:
>   rm -rf u-boot/build.e500
>   $(MAKE) -C u-boot-sam460ex distclean
>   $(MAKE) -C skiboot clean
> + $(MAKE) -f Makefile.edk2 clean
> 

Ditto.

So, for qemu-trivial's sake:

Nacked-by: Laszlo Ersek 

Thanks
Laszlo



Re: [Qemu-devel] [PATCH v2 6/8] migration: Add multifd-compress parameter

2019-04-10 Thread Dr. David Alan Gilbert
* Juan Quintela (quint...@redhat.com) wrote:
> Signed-off-by: Juan Quintela 
> 
> ---
> Rename it to NONE
> ---
>  hmp.c| 17 +
>  hw/core/qdev-properties.c| 11 +++
>  include/hw/qdev-properties.h |  1 +
>  migration/migration.c| 16 
>  qapi/common.json | 13 +
>  qapi/migration.json  | 19 +++
>  tests/migration-test.c   | 13 ++---
>  7 files changed, 83 insertions(+), 7 deletions(-)
> 
> diff --git a/hmp.c b/hmp.c
> index 8eec768088..02fbe27757 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -435,6 +435,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const 
> QDict *qdict)
>  monitor_printf(mon, "%s: %u\n",
>  MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_CHANNELS),
>  params->multifd_channels);
> +monitor_printf(mon, "%s: %s\n",
> +MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_COMPRESS),
> +MultifdCompress_str(params->multifd_compress));
>  monitor_printf(mon, "%s: %" PRIu64 "\n",
>  MigrationParameter_str(MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE),
>  params->xbzrle_cache_size);
> @@ -1737,6 +1740,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const 
> QDict *qdict)
>  MigrateSetParameters *p = g_new0(MigrateSetParameters, 1);
>  uint64_t valuebw = 0;
>  uint64_t cache_size;
> +int compress_type;
>  Error *err = NULL;
>  int val, ret;
>  
> @@ -1822,6 +1826,19 @@ void hmp_migrate_set_parameter(Monitor *mon, const 
> QDict *qdict)
>  p->has_multifd_channels = true;
>  visit_type_int(v, param, &p->multifd_channels, &err);
>  break;
> +case MIGRATION_PARAMETER_MULTIFD_COMPRESS:
> +p->has_multifd_compress = true;
> +visit_type_enum(v, param, &compress_type,
> +&MultifdCompress_lookup, &err);
> +if (err) {
> +break;
> +}
> +if (compress_type < 0 || compress_type > MULTIFD_COMPRESS__MAX) {

I still think that needs to be >= rather than > ?

Dave

> +error_setg(&err, "Invalid multifd_compress option %s", valuestr);
> +break;
> +}
> +p->multifd_compress = compress_type;
> +break;
>  case MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE:
>  p->has_xbzrle_cache_size = true;
>  visit_type_size(v, param, &cache_size, &err);
> diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
> index 5da1439a8b..7c8e71532f 100644
> --- a/hw/core/qdev-properties.c
> +++ b/hw/core/qdev-properties.c
> @@ -645,6 +645,17 @@ const PropertyInfo qdev_prop_fdc_drive_type = {
>  .set_default_value = set_default_value_enum,
>  };
>  
> +/* --- MultifdCompress --- */
> +
> +const PropertyInfo qdev_prop_multifd_compress = {
> +.name = "MultifdCompress",
> +.description = "multifd_compress values",
> +.enum_table = &MultifdCompress_lookup,
> +.get = get_enum,
> +.set = set_enum,
> +.set_default_value = set_default_value_enum,
> +};
> +
>  /* --- pci address --- */
>  
>  /*
> diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
> index b6758c852e..ac452d8f2c 100644
> --- a/include/hw/qdev-properties.h
> +++ b/include/hw/qdev-properties.h
> @@ -23,6 +23,7 @@ extern const PropertyInfo qdev_prop_tpm;
>  extern const PropertyInfo qdev_prop_ptr;
>  extern const PropertyInfo qdev_prop_macaddr;
>  extern const PropertyInfo qdev_prop_on_off_auto;
> +extern const PropertyInfo qdev_prop_multifd_compress;
>  extern const PropertyInfo qdev_prop_losttickpolicy;
>  extern const PropertyInfo qdev_prop_blockdev_on_error;
>  extern const PropertyInfo qdev_prop_bios_chs_trans;
> diff --git a/migration/migration.c b/migration/migration.c
> index 609e0df5d0..d6f8ef342a 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -82,6 +82,7 @@
>  /* The delay time (in ms) between two COLO checkpoints */
>  #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
>  #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
> +#define DEFAULT_MIGRATE_MULTIFD_COMPRESS MULTIFD_COMPRESS_NONE
>  
>  /* Background transfer rate for postcopy, 0 means unlimited, note
>   * that page requests can still exceed this limit.
> @@ -769,6 +770,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error 
> **errp)
>  params->block_incremental = s->parameters.block_incremental;
>  params->has_multifd_channels = true;
>  params->multifd_channels = s->parameters.multifd_channels;
> +params->has_multifd_compress = true;
> +params->multifd_compress = s->parameters.multifd_compress;
>  params->has_xbzrle_cache_size = true;
>  params->xbzrle_cache_size = s->parameters.xbzrle_cache_size;
>  params->has_max_postcopy_bandwidth = true;
> @@ -1268,6 +1271,9 @@ static void 
> migrate_params_test_apply(MigrateSetParameters *params,
>  if (params->has_multifd_channels) {
>  dest->mul

Re: [Qemu-devel] [PATCH RFC 2/3] pxtool: Add new qemu-img command info generation tool

2019-04-10 Thread John Snow



On 4/10/19 1:54 AM, Markus Armbruster wrote:
> John Snow  writes:
> 
>> Presently we use hxtool and a .hx format to generate a few things like
>> the qemu_img subcommand dispatch table, the qemu_img help() display output,
>> and a help output in qemu-img.texi.
>>
>> Unfortunately, this means that this information is duplicated in at least
>> three places:
>>
>> (1) in qemu-img-cmds.hx as a human readable string
>> (2) in qemu-img-cmds.hx as a texi string
>> (3) in qemu-img.texi again, as a texi string
>>
>> We can do a little better, though: all of these sources can be generated
>> from a single json file. Add a new small tool ("pxtool") that can do this.
>>
>> This tool can at least handle generating (1) and (2) from the same source
>> without needing to reduplicate that information. Deduplicating (3) happens
>> in the next patch.
>>
>> Notes:
>>  - The json format was chosen to be very "stupid", and the command line
>>documentation is being kept one-per-line to make future diffs easier
>>to read.
>>  - It's easy enough to generate the human-readable output from the texi
>>output by removing '@var{foo}' with 'foo'.
>>  - The qemu-img command dispatch always calls img_cmdname, so we don't
>>need to store this information separately, either.
>>  - The need for DEF() macros could be removed as well, but I left it in
>>to create a minimally disruptive patch.
>> Signed-off-by: John Snow 
> 
> We got just five .hx:
> 
> qemu-img.cmds.hxkilled off by this patch
> qemu-options.hx CLI QAPIfication should kill this off
> hw/audio/pl041.hx   misnamed, not actually food for hxtool
> hmp-commands.hx no exit strategy
> hmp-commands-info.hxfor these two, yet
> 
> CLI QAPIfication got stuck in the back-burner, and as long as that's the
> case, it's not an alternative to your patches.
> 

Good to know. I figure that at least while we wait on something more
comprehensive there's no real short term harm in tidying up.

Something I'd really like to do is define a python/json-esque
configuration file that allows you to specify some "common options" that
are shared between all of the sub-commands, and then define each command
in terms of both which common options it possesses, and then any local
command-specific options it has.

(Why the common options design? So that argument names and command
options can be encouraged to be identical and identically documented
between all subcommands that use them.)

Then, from the configuration file, generate all of the necessary C
parsers (I have a protoype for this, it's not difficult to do in at
least the basic case) that can return boxed command arguments, and then
also generate the help strings from that metadata.

I suspect that work *IS* something that might brush up against / should
use (or extend) QAPI code.

Then, I'd like to find a way to split qemu-img.texi into sub-command
files and find a way to source the same "informative text" for both:

(1) The texi output, as per usual, and
(2) qemu-img subcommand --help

such that --help, when passed as an argument to the subcommand, prints
out help only relevant to the subcommand instead, e.g.

`qemu-img check --help`

might print the "common options" section only as it relates to commands
actually used by the check command, then prints the check section of the
texi as formatted for terminals.

This way, we can have manpages, html pages, and interactive help text
all derived from the same semi-automated sources, always up to date and
much easier to read/discover/parse by human eyeballs.

That's what I'd like to accomplish, ultimately.

For now, I think this RFC set is not harmful and won't bother anybody.
It's definitely not worse than what we have now, fragility of removing
@var{} tokens and all.

--js




Re: [Qemu-devel] [Qemu-block] [PULL 07/14] iotests: Add 241 to test NBD on unaligned images

2019-04-10 Thread Eric Blake
On 4/10/19 12:45 PM, Max Reitz wrote:
> On 01.04.19 16:08, Eric Blake wrote:
>> Add a test for the NBD client workaround in the previous patch.  It's
>> not really feasible for an iotest to assume a specific tracing engine,
>> so we can't really probe trace_nbd_parse_blockstatus_compliance to see
>> if the server was fixed vs. whether the client just worked around the
>> server (other than by rearranging order between code patches and this
>> test). But having a successful exchange sure beats the previous state
>> of an error message. Since format probing can change alignment, we can
>> use that as an easy way to test several configurations.
>>
>> Not tested yet, but worth adding to this test in future patches: an
>> NBD server that can advertise a non-sector-aligned size (such as
>> nbdkit) causes qemu as the NBD client to misbehave when it rounds the
>> size up and accesses beyond the advertised size. Qemu as NBD server
>> never advertises a non-sector-aligned size (since bdrv_getlength()
>> currently rounds up to sector boundaries); until qemu can act as such
>> a server, testing that flaw will have to rely on external binaries.
>>

>> +++ b/tests/qemu-iotests/241.out

>> +=== Exporting unaligned raw image, forced server sector alignment ===
>> +
>> +WARNING: Image format was not specified for 
>> '/home/eblake/qemu/tests/qemu-iotests/scratch/t.raw' and probing guessed raw.
>> + Automatically detecting the format is dangerous for raw images, 
>> write operations on block 0 will be restricted.
>> + Specify the 'raw' format explicitly to remove the restrictions.
> 
> Is there a patch for this already?

Oh shoot. I saw the complaint about the hard-coded path needing a
filter, and still failed to fix it in time for 4.0-rc3. Not a driver for
-rc4 on its own, but if we have something else pop up that mandates
-rc4, I'll get that fix in.

Will post shortly...

> 
> Also, I planned to reply to the original patch – but where is it?  The
> latest version I can see on the list is v3, and that only has seven
> lines in the reference output.  Well, the difference would explain why
> Vladimir would give a Tested-by to a test that clearly will not run on
> his machine.

Yeah, this line in the commit message is evidence of my botching things:

> [eblake: add forced-512 alignment, and nbdkit reproducer comment]

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH] hmp: delvm: use hmp_handle_error

2019-04-10 Thread Cole Robinson
This gives us the consistent 'Error:' prefix added in 66363e9a43f,
which helps users like libvirt who still need to scrape hmp error
messages to detect failure.

Signed-off-by: Cole Robinson 
---
 hmp.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hmp.c b/hmp.c
index 8eec768088..74a4bfc1f9 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1481,10 +1481,11 @@ void hmp_delvm(Monitor *mon, const QDict *qdict)
 const char *name = qdict_get_str(qdict, "name");
 
 if (bdrv_all_delete_snapshot(name, &bs, &err) < 0) {
-error_reportf_err(err,
-  "Error while deleting snapshot on device '%s': ",
-  bdrv_get_device_name(bs));
+error_prepend(&err,
+  "Error while deleting snapshot on device '%s': ",
+  bdrv_get_device_name(bs));
 }
+hmp_handle_error(mon, &err);
 }
 
 void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
-- 
2.21.0




Re: [Qemu-devel] [PATCH for-4.1] roms: Correct the EDK2_BASETOOLS_OPTFLAGS variable description

2019-04-10 Thread Laszlo Ersek
On 04/10/19 17:10, Philippe Mathieu-Daudé wrote:
> On 4/10/19 4:58 PM, Laszlo Ersek wrote:
>> On 04/10/19 06:57, Philippe Mathieu-Daudé wrote:
>>> In commit 1cab464136b4 we incorrectly described the
>>> EDK2_BASETOOLS_OPTFLAGS can pass CPPFLAGS and CFLAGS
>>> options to the EDK2 build tools, but it only expands
>>> the CFLAGS (not to the CPPFLAGS).
>>> Update the description to be more accurate.
>>>
>>> Reported-by: Laszlo Ersek 
>>> Signed-off-by: Philippe Mathieu-Daudé 
>>> ---
>>>  roms/Makefile | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/roms/Makefile b/roms/Makefile
>>> index 1ff78b63bb3..d7fd6025e7c 100644
>>> --- a/roms/Makefile
>>> +++ b/roms/Makefile
>>> @@ -120,8 +120,8 @@ build-efi-roms: build-pxe-roms
>>> $(patsubst %,bin-i386-efi/%.efidrv,$(pxerom_targets)) \
>>> $(patsubst %,bin-x86_64-efi/%.efidrv,$(pxerom_targets))
>>>  
>>> -# Build scripts can pass compiler/linker flags to the EDK2 build tools
>>> -# via the EDK2_BASETOOLS_OPTFLAGS (CPPFLAGS and CFLAGS) and
>>> +# Build scripts can pass compiler/linker flags to the EDK2
>>> +# build  tools via the EDK2_BASETOOLS_OPTFLAGS (CFLAGS) and
>>>  # EDK2_BASETOOLS_LDFLAGS (LDFLAGS) environment variables.
>>>  #
>>>  # Example:
>>>
>>
>> Reviewed-by: Laszlo Ersek 
>>
>> Please noone submit a pull request with this patch (for 4.1) until my
>> edk2 bundling series is merged; as this one will likely introduce
>> another contextual conflict, and I'd *really* like to avoid another
>> rebase on my end, due to such an issue.
> 
> Thanks for the review. I'll rebase it on top of your 'bundle edk2
> platform firmware with QEMU' v4 and resend (later).

Thank you!

> Feel free to include it on top of your pullreq.

... I think I'll keep that pullreq as minimal as it can be. :)

Thanks
Laszlo

> 
> Regards,
> 
> Phil.
> 




Re: [Qemu-devel] [PATCH] qemu-img: fix .hx and .texi disparity

2019-04-10 Thread Eric Blake
On 4/10/19 12:37 PM, John Snow wrote:
> 
> 
> On 4/10/19 1:39 AM, Markus Armbruster wrote:
>> John Snow  writes:
>>
>>> It turns out that having options listed in three places continues to be
>>> a bad idea. I'm still toying with the idea of an improved infrastructure
>>> here, but in the meantime, another bandaid.
>>>

>>
>> Nice to have in 4.0.
>>
>> Reviewed-by: Markus Armbruster 
>>
> 
> If it can go in for 4.0, who stages it? Kevin?

At this point, we've missed -rc3; the fix on its own is not a
release-blocker, but if something else comes up that requires -rc4 for
an actual release blocker, we can sort out someone sending a pull
request for this one at the same time. (I'm aware of a couple of iotest
fixups in the same boat... Guess it's time to start updating the wiki to
point to email threads where we document 'wish list' patches)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v2] target/i386: kvm: add VMX migration blocker

2019-04-10 Thread Cole Robinson
On 11/20/18 6:44 AM, Dr. David Alan Gilbert wrote:
> * Paolo Bonzini (pbonz...@redhat.com) wrote:
>> Nested VMX does not support live migration yet.  Add a blocker
>> until that is worked out.
>>
>> Nested SVM only does not support it, but unfortunately it is
>> enabled by default for -cpu host so we cannot really disable it.
>>
>> Signed-off-by: Paolo Bonzini 
> 
> So I'm OK with this, but it does need a release note warning whenever it
> goes in, because it'll surprise those who've already enabled nesting
> but don't use it on all their VMs.
> 

We are hitting this in Fedora 30. Now that nested VMX is enabled by
default at the kernel level, and virt-manager/boxes will use the
equivalent of -cpu host by default, libvirt managedsave (migrate to
file) and virt-manager snapshots (savevm) are rejected for default
created VMs on intel. That's quite unfortunate.

Any ideas on how to resolve this?

FYI it's the root of this bug though there's libvirt issues mixed in:
https://bugzilla.redhat.com/show_bug.cgi?id=1697997

Thanks,
Cole



Re: [Qemu-devel] [PATCH for-4.0?] hmp: delvm: use hmp_handle_error

2019-04-10 Thread Eric Blake
On 4/10/19 1:03 PM, Cole Robinson wrote:
> This gives us the consistent 'Error:' prefix added in 66363e9a43f,
> which helps users like libvirt who still need to scrape hmp error
> messages to detect failure.
> 
> Signed-off-by: Cole Robinson 
> ---
>  hmp.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)

Not enough to drive -rc4 on its own, but worth adding to our wishlist of
potential easy patches if we do have a release blocker surface.

Reviewed-by: Eric Blake 

> 
> diff --git a/hmp.c b/hmp.c
> index 8eec768088..74a4bfc1f9 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -1481,10 +1481,11 @@ void hmp_delvm(Monitor *mon, const QDict *qdict)
>  const char *name = qdict_get_str(qdict, "name");
>  
>  if (bdrv_all_delete_snapshot(name, &bs, &err) < 0) {
> -error_reportf_err(err,
> -  "Error while deleting snapshot on device '%s': ",
> -  bdrv_get_device_name(bs));
> +error_prepend(&err,
> +  "Error while deleting snapshot on device '%s': ",

Do we want to reword the message (maybe s/Error while //) to avoid the
word "Error" twice in the same line?

> +  bdrv_get_device_name(bs));
>  }
> +hmp_handle_error(mon, &err);
>  }
>  
>  void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



[Qemu-devel] QEMU Logo question

2019-04-10 Thread Barajas, Felipe
Hi,

I am trying to understand who in the QEMU community can give me permission to 
use the QEMU logo in a whitepaper.
I am an application engineer at Intel and I have prepared some benchmarks using 
QEMU as a solution accelerated by Intel hardware (CPUs, SSDs, etc)
I would like to use the QEMU logo in this white paper.

Do you know who in the community would have authority to give me such 
permission ?

Thank you

Felipe Barajas




Re: [Qemu-devel] [PATCH for-4.0?] hmp: delvm: use hmp_handle_error

2019-04-10 Thread Cole Robinson
On 4/10/19 2:27 PM, Eric Blake wrote:
> On 4/10/19 1:03 PM, Cole Robinson wrote:
>> This gives us the consistent 'Error:' prefix added in 66363e9a43f,
>> which helps users like libvirt who still need to scrape hmp error
>> messages to detect failure.
>>
>> Signed-off-by: Cole Robinson 
>> ---
>>  hmp.c | 7 ---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> Not enough to drive -rc4 on its own, but worth adding to our wishlist of
> potential easy patches if we do have a release blocker surface.
> 
> Reviewed-by: Eric Blake 
> 
>>
>> diff --git a/hmp.c b/hmp.c
>> index 8eec768088..74a4bfc1f9 100644
>> --- a/hmp.c
>> +++ b/hmp.c
>> @@ -1481,10 +1481,11 @@ void hmp_delvm(Monitor *mon, const QDict *qdict)
>>  const char *name = qdict_get_str(qdict, "name");
>>  
>>  if (bdrv_all_delete_snapshot(name, &bs, &err) < 0) {
>> -error_reportf_err(err,
>> -  "Error while deleting snapshot on device '%s': ",
>> -  bdrv_get_device_name(bs));
>> +error_prepend(&err,
>> +  "Error while deleting snapshot on device '%s': ",
> 
> Do we want to reword the message (maybe s/Error while //) to avoid the
> word "Error" twice in the same line?
> 

I'm fine with that, and I checked it won't affect libvirt's error
scraping in this area

Thanks,
Cole



Re: [Qemu-devel] [PATCH for-4.1 v2] qemu-img: Saner printing of large file sizes

2019-04-10 Thread Max Reitz
On 01.04.19 16:57, Eric Blake wrote:
> Disk sizes close to INT64_MAX cause overflow, for some pretty
> ridiculous output:
> 
>   $ ./nbdkit -U - memory size=$((2**63 - 512)) --run 'qemu-img info $nbd'
>   image: nbd+unix://?socket=/tmp/nbdkitHSAzNz/socket
>   file format: raw
>   virtual size: -8388607T (9223372036854775296 bytes)
>   disk size: unavailable
> 
> But there's no reason to have two separate implementations of integer
> to human-readable abbreviation, where one has overflow and stops at
> 'T', while the other avoids overflow and goes all the way to 'E'. With
> this patch, the output now claims 8EiB instead of -8388607T, which
> really is the correct rounding of largest file size supported by qemu
> (we could go 511 bytes larger if we used byte-accurate sizing instead
> of rounding up to the next sector boundary, but that wouldn't change
> the human-readable result).
> 
> Quite a few iotests need updates to expected output to match.
> 
> Reported-by: Richard W.M. Jones 
> Signed-off-by: Eric Blake 
> Tested-by: Richard W.M. Jones 
> Reviewed-by: Alberto Garcia 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> ---
> 
> v2 - actually update iotests to match; no change to block/ code so R-b added

There are more iotests this breaks.  First, there is 059 for vmdk, which
looks just like the rest.

But for -m32, it gets a bit more difficult.  Every size above 999 GB
(1000 GB gets rounded to 1 TB, which is 2^31 * 512) gets printed as
"inf [unit]":

$ ./qemu-img create -f qcow2 /tmp/foo.qcow2 1T
$ ./qemu-img info /tmp/foo.qcow2
[...]
virtual size: inf TiB (1099511627776 bytes)
[...]

$ ./qemu-img create -f qcow2 /tmp/foo.qcow2 1P
$ ./qemu-img info /tmp/foo.qcow2
[...]
virtual size: inf PiB (1125899906842624 bytes)
[...]

This breaks the iotests that test the maximum disk size of more exotic
formats.  I can see failures for vdi, vhdx, and parallels.

But regardless of the iotests, we shouldn’t show the size as infinite
just because of -m32.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] configure: Automatically fall back to TCI on non-release architectures

2019-04-10 Thread Stefan Weil
On 10.04.19 10:22, Thomas Huth wrote:
> Additionally, I think it should be possible to compile with the
> x86_64-linux-user target and then to run "make check-tcg" ... however,
> that currently crashes with:
> 
> TODO qemu/tcg/tci.c:859: tcg_qemu_tb_exec()
> qemu/tcg/tci.c:859: tcg fatal error
> qemu: uncaught target signal 11 (Segmentation fault) - core dumped

That's easy to fix. I implemented only those TCG opcodes which really
were executed during testing. This test triggers an assertion because it
needs INDEX_op_ld16u_i64 which was missing up to now.

I'll send a patch which adds support for INDEX_op_ld16u_i64.

Thank you for your tests.

Stefan




  1   2   >