Re: [Qemu-devel] [PATCH] vl.c: print error message if load fw_cfg file failed

2018-10-07 Thread Philippe Mathieu-Daudé
On 10/7/18 6:33 AM, Li Qiang wrote:
> It makes sense to print the error message while reading
> file failed.
> 
> Signed-off-by: Li Qiang 
> ---
>  vl.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/vl.c b/vl.c
> index cc55fe04a2..3db410e771 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2207,8 +2207,9 @@ static int parse_fw_cfg(void *opaque, QemuOpts *opts, 
> Error **errp)
>  size = strlen(str); /* NUL terminator NOT included in fw_cfg blob */
>  buf = g_memdup(str, size);
>  } else {
> -if (!g_file_get_contents(file, &buf, &size, NULL)) {
> -error_report("can't load %s", file);
> +GError *error = NULL;
> +if (!g_file_get_contents(file, &buf, &size, &error)) {
> +error_report("can't load %s, %s", file, error->message);

Reviewed-by: Philippe Mathieu-Daudé 

>  return -1;
>  }
>  }
>



Re: [Qemu-devel] [PATCH] slirp: Propagate host TCP RST packet to the guest after socket disconnected

2018-10-07 Thread Samuel Thibault
Hello,

Gavin Grant, le jeu. 30 août 2018 16:57:57 +0100, a ecrit:
> Commit 27d92ebc5ed1bb0b518d0ebc4c609182ad20a799 handled the case where the TCP
> connection is abruptly closed via a RST packet, by checking for the ECONNRESET
> errno. However it does not consider the case where the connection has been
> half-closed by the host (FIN/ACK), then the host socket is disconnected. For
> example, if the host application calls close() on the socket, then the
> application exits.
> 
> In this case, the socket still exists due to the file descriptor in SLIRP, but
> it is disconnected. recv() does not indicate an error since an orderly socket
> close has previously occurred. The socket will then be stuck in FIN_WAIT_2,
> until the peer sends FIN/ACK or a timeout occurs. Instead we can send a RST
> to the peer and transition to the CLOSED state.
> 
> Signed-off-by: Gavin Grant 

I have fixed it a bit and pushed to my tree, thanks!

Samuel



[Qemu-devel] [PULL 3/3] slirp: Propagate host TCP RST packet to the guest after socket disconnected

2018-10-07 Thread Samuel Thibault
From: Gavin Grant 

Commit 27d92ebc5ed1bb0b518d0ebc4c609182ad20a799 handled the case where the TCP
connection is abruptly closed via a RST packet, by checking for the ECONNRESET
errno. However it does not consider the case where the connection has been
half-closed by the host (FIN/ACK), then the host socket is disconnected. For
example, if the host application calls close() on the socket, then the
application exits.

In this case, the socket still exists due to the file descriptor in SLIRP, but
it is disconnected. recv() does not indicate an error since an orderly socket
close has previously occurred. The socket will then be stuck in FIN_WAIT_2,
until the peer sends FIN/ACK or a timeout occurs. Instead we can send a RST
to the peer and transition to the CLOSED state.

Signed-off-by: Gavin Grant 
Signed-off-by: Samuel Thibault 
---
 slirp/socket.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/slirp/socket.c b/slirp/socket.c
index 08fe98907d..322383a1f9 100644
--- a/slirp/socket.c
+++ b/slirp/socket.c
@@ -204,12 +204,19 @@ soread(struct socket *so)
return 0;
else {
int err;
-   socklen_t slen = sizeof err;
+   socklen_t elen = sizeof err;
+   struct sockaddr_storage addr;
+   struct sockaddr *paddr = (struct sockaddr *) &addr;
+   socklen_t alen = sizeof addr;
 
err = errno;
if (nn == 0) {
-   getsockopt(so->s, SOL_SOCKET, SO_ERROR,
-  &err, &slen);
+   if (getpeername(so->s, paddr, &alen) < 0) {
+   err = errno;
+   } else {
+   getsockopt(so->s, SOL_SOCKET, SO_ERROR,
+   &err, &elen);
+   }
}
 
DEBUG_MISC((dfd, " --- soread() disconnected, nn = %d, 
errno = %d-%s\n", nn, errno,strerror(errno)));
-- 
2.19.0




[Qemu-devel] [PULL 1/3] slirp: document mbuf pointers and sizes

2018-10-07 Thread Samuel Thibault
From: Peter Maydell 

and fix confusing datasize name into gapsize in m_inc.

Signed-off-by: Peter Maydell 
Signed-off-by: Samuel Thibault 
---
 slirp/mbuf.c | 14 +++---
 slirp/mbuf.h | 13 +
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/slirp/mbuf.c b/slirp/mbuf.c
index 1b7868355a..aa1f28afb1 100644
--- a/slirp/mbuf.c
+++ b/slirp/mbuf.c
@@ -151,7 +151,7 @@ m_cat(struct mbuf *m, struct mbuf *n)
 void
 m_inc(struct mbuf *m, int size)
 {
-int datasize;
+int gapsize;
 
 /* some compilers throw up on gotos.  This one we can fake. */
 if (M_ROOM(m) > size) {
@@ -159,17 +159,17 @@ m_inc(struct mbuf *m, int size)
 }
 
 if (m->m_flags & M_EXT) {
-datasize = m->m_data - m->m_ext;
-m->m_ext = g_realloc(m->m_ext, size + datasize);
+gapsize = m->m_data - m->m_ext;
+m->m_ext = g_realloc(m->m_ext, size + gapsize);
 } else {
-datasize = m->m_data - m->m_dat;
-m->m_ext = g_malloc(size + datasize);
+gapsize = m->m_data - m->m_dat;
+m->m_ext = g_malloc(size + gapsize);
 memcpy(m->m_ext, m->m_dat, m->m_size);
 m->m_flags |= M_EXT;
 }
 
-m->m_data = m->m_ext + datasize;
-m->m_size = size + datasize;
+m->m_data = m->m_ext + gapsize;
+m->m_size = size + gapsize;
 }
 
 
diff --git a/slirp/mbuf.h b/slirp/mbuf.h
index 33b84485d6..bfdf8c4577 100644
--- a/slirp/mbuf.h
+++ b/slirp/mbuf.h
@@ -47,6 +47,19 @@
  * free the m_ext.  This is inefficient memory-wise, but who cares.
  */
 
+/*
+ * mbufs allow to have a gap between the start of the allocated buffer (m_ext 
if
+ * M_EXT is set, m_dat otherwise) and the in-use data:
+ *
+ *  |--gapsize->|---m_len--->
+ *  |--m_size-->
+ *  |M_ROOM>
+ *   |-M_FREEROOM-->
+ *
+ *  ^   ^   ^
+ *  m_dat/m_ext m_data  end of buffer
+ */
+
 /*
  * How much room is in the mbuf, from m_data to the end of the mbuf
  */
-- 
2.19.0




[Qemu-devel] [PULL 2/3] slirp: fix ICMP handling on macOS hosts

2018-10-07 Thread Samuel Thibault
From: Andrew Oates 

On Linux, SOCK_DGRAM+IPPROTO_ICMP sockets give only the ICMP packet when
read from.  On macOS, however, the socket acts like a SOCK_RAW socket
and includes the IP header as well.

This change strips the extra IP header from the received packet on macOS
before sending it to the guest.  SOCK_DGRAM ICMP sockets aren't
supported on other BSDs, but we enable this behavior for them as well to
treat the sockets the same as raw sockets.

Signed-off-by: Andrew Oates 
Signed-off-by: Samuel Thibault 
---
 slirp/ip_icmp.c | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/slirp/ip_icmp.c b/slirp/ip_icmp.c
index 0b667a429a..da100d1f55 100644
--- a/slirp/ip_icmp.c
+++ b/slirp/ip_icmp.c
@@ -420,7 +420,32 @@ void icmp_receive(struct socket *so)
 icp = mtod(m, struct icmp *);
 
 id = icp->icmp_id;
-len = qemu_recv(so->s, icp, m->m_len, 0);
+len = qemu_recv(so->s, icp, M_ROOM(m), 0);
+/*
+ * The behavior of reading SOCK_DGRAM+IPPROTO_ICMP sockets is inconsistent
+ * between host OSes.  On Linux, only the ICMP header and payload is
+ * included.  On macOS/Darwin, the socket acts like a raw socket and
+ * includes the IP header as well.  On other BSDs, SOCK_DGRAM+IPPROTO_ICMP
+ * sockets aren't supported at all, so we treat them like raw sockets.  It
+ * isn't possible to detect this difference at runtime, so we must use an
+ * #ifdef to determine if we need to remove the IP header.
+ */
+#ifdef CONFIG_BSD
+if (len >= sizeof(struct ip)) {
+struct ip *inner_ip = mtod(m, struct ip *);
+int inner_hlen = inner_ip->ip_hl << 2;
+if (inner_hlen > len) {
+len = -1;
+errno = -EINVAL;
+} else {
+len -= inner_hlen;
+memmove(icp, (unsigned char *)icp + inner_hlen, len);
+}
+} else {
+  len = -1;
+  errno = -EINVAL;
+}
+#endif
 icp->icmp_id = id;
 
 m->m_data -= hlen;
-- 
2.19.0




[Qemu-devel] [PULL 0/3] Slirp updates

2018-10-07 Thread Samuel Thibault
The following changes since commit 3c2d3042849686969add641bd38b08b9877b9e8f:

  Merge remote-tracking branch 
'remotes/mcayland/tags/qemu-openbios.for-upstream-20181005' into staging 
(2018-10-05 17:55:22 +0100)

are available in the Git repository at:

  https://people.debian.org/~sthibault/qemu.git tags/samuel-thibault

for you to fetch changes up to 93a972f8548571d35c718ca3a94d5ab1507b2443:

  slirp: Propagate host TCP RST packet to the guest after socket disconnected 
(2018-10-07 19:50:48 +0200)


slirp updates

Andrew Oates (1):
  slirp: fix ICMP handling on macOS hosts

Gavin Grant (1):
  slirp: Propagate host TCP RST packet to the guest after socket
disconnected

Peter Maydell (1):
  slirp: document mbuf pointers and sizes


Andrew Oates (1):
  slirp: fix ICMP handling on macOS hosts

Gavin Grant (1):
  slirp: Propagate host TCP RST packet to the guest after socket 
disconnected

Peter Maydell (1):
  slirp: document mbuf pointers and sizes

 slirp/ip_icmp.c | 27 ++-
 slirp/mbuf.c| 14 +++---
 slirp/mbuf.h| 13 +
 slirp/socket.c  | 13 ++---
 4 files changed, 56 insertions(+), 11 deletions(-)



[Qemu-devel] [PATCH] tcg: Implement CPU_LOG_TB_NOCHAIN during expansion

2018-10-07 Thread Richard Henderson
Rather than test NOCHAIN before linking, do not emit the
goto_tb opcode at all.  We already do this for goto_ptr.

Previously, nochain seemed more or less a hack to debug code gen.
But there's a real use case when it comes to tracing, so spend a
little more effort on what we produce.

Signed-off-by: Richard Henderson 
---
 accel/tcg/cpu-exec.c | 2 +-
 tcg/tcg-op.c | 9 -
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 6bcb6d99bd..870027d435 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -416,7 +416,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
 }
 #endif
 /* See if we can patch the calling TB. */
-if (last_tb && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
+if (last_tb) {
 tb_add_jump(last_tb, tb_exit, tb);
 }
 return tb;
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index daa416a143..7a8015c5a9 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2586,6 +2586,10 @@ void tcg_gen_exit_tb(TranslationBlock *tb, unsigned idx)
seen this numbered exit before, via tcg_gen_goto_tb.  */
 tcg_debug_assert(tcg_ctx->goto_tb_issue_mask & (1 << idx));
 #endif
+/* When not chaining, exit without indicating a link.  */
+if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
+val = 0;
+}
 } else {
 /* This is an exit via the exitreq label.  */
 tcg_debug_assert(idx == TB_EXIT_REQUESTED);
@@ -2603,7 +2607,10 @@ void tcg_gen_goto_tb(unsigned idx)
 tcg_debug_assert((tcg_ctx->goto_tb_issue_mask & (1 << idx)) == 0);
 tcg_ctx->goto_tb_issue_mask |= 1 << idx;
 #endif
-tcg_gen_op1i(INDEX_op_goto_tb, idx);
+/* When not chaining, we simply fall through to the "fallback" exit.  */
+if (!qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
+tcg_gen_op1i(INDEX_op_goto_tb, idx);
+}
 }
 
 void tcg_gen_lookup_and_goto_ptr(void)
-- 
2.17.1




[Qemu-devel] [PATCH] qemu-system-hppa: Raise exception 26 on emulated hardware

2018-10-07 Thread Helge Deller
On PCXS chips (PA7000, pa 1.1a), trap #18 is raised on memory faults,
while all later chips (>= PA7100) generate either trap #26, #27 or #28
(depending on the fault type).

Since the current qemu emulation emulates a B160L machine (with a
PA7300LC PCX-L2 chip, we should raise trap #26 (EXCP_DMAR) instead of
#18 (EXCP_DMP) on access faults by the Linux kernel to page zero.

With the patch we now get the correct output (I tested against real
hardware):
 Kernel Fault: Code=26 (Data memory access rights trap) (Addr=0004)
instead of:
 Kernel Fault: Code=18 (Data memory protection/unaligned access trap) 
(Addr=0004)

Signed-off-by: Helge Deller 

diff --git a/target/hppa/mem_helper.c b/target/hppa/mem_helper.c
index ab160c2a74..aecf3075f6 100644
--- a/target/hppa/mem_helper.c
+++ b/target/hppa/mem_helper.c
@@ -137,7 +137,8 @@ int hppa_get_physical_address(CPUHPPAState *env, vaddr 
addr, int mmu_idx,
 
 if (unlikely(!(prot & type))) {
 /* The access isn't allowed -- Inst/Data Memory Protection Fault.  */
-ret = (type & PAGE_EXEC ? EXCP_IMP : EXCP_DMP);
+ret = (type & PAGE_EXEC ? EXCP_IMP :
+   prot & PAGE_READ ? EXCP_DMP : EXCP_DMAR);
 goto egress;
 }
 



Re: [Qemu-devel] [PATCH v2 2/2] MAINTAINERS: Remove myself as block maintainer

2018-10-07 Thread Max Reitz
On 26.09.18 20:05, Jeff Cody wrote:
> I'll not be involved in day-to-day qemu development.  Remove myself as
> maintainer from the remainder of the network block drivers, and revert
> them to the general block layer maintainership.
> 
> Move 'sheepdog' to the 'Odd Fixes' support level.
> 
> For VHDX, added my personal email address as a maintainer, as I can
> answer questions or send the occassional bug fix.  Leaving it as
> 'Supported', instead of 'Odd Fixes', because I think the rest of the
> block layer maintainers and developers will upkeep it as well, if
> needed.
> 
> Signed-off-by: Jeff Cody 
> ---
>  MAINTAINERS | 17 ++---
>  1 file changed, 2 insertions(+), 15 deletions(-)

Not sure who's going to merge this (maybe me, but then I'd like an ACK
from John on patch 1), but:

Acked-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 01/14] block: Don't call update_flags_from_options() if the options are wrong

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> If qemu_opts_absorb_qdict() fails and we carry on and call
> update_flags_from_options() then that can result on a failed
> assertion:
> 
>$ qemu-io -c 'reopen -o read-only=foo' hd.qcow2
>block.c:1101: update_flags_from_options: Assertion `qemu_opt_find(opts, 
> BDRV_OPT_CACHE_DIRECT)' failed.
>Aborted
> 
> This only happens in bdrv_reopen_queue_child(). Although this function
> doesn't return errors, we can simply keep the wrong options in the
> reopen queue and later bdrv_reopen_prepare() will fail.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block.c| 11 +--
>  tests/qemu-iotests/133 |  6 ++
>  tests/qemu-iotests/133.out |  4 
>  3 files changed, 19 insertions(+), 2 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 03/14] block: Use bdrv_reopen_set_read_only() in bdrv_backing_update_filename()

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> This patch replaces the bdrv_reopen() calls that set and remove the
> BDRV_O_RDWR flag with the new bdrv_reopen_set_read_only() function.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH] tcg: Add tlb_index and tlb_entry helpers

2018-10-07 Thread Richard Henderson
Isolate the computation of an index from an address into a
helper before we change that function.

Signed-off-by: Richard Henderson 
---

Emilio, this should make your dynamic tlb sizing patch 1/6
significantly smaller.


r~

---
 accel/tcg/softmmu_template.h | 68 +---
 include/exec/cpu_ldst.h  | 14 +++
 include/exec/cpu_ldst_template.h | 25 ++--
 accel/tcg/cputlb.c   | 60 +---
 4 files changed, 90 insertions(+), 77 deletions(-)

diff --git a/accel/tcg/softmmu_template.h b/accel/tcg/softmmu_template.h
index f060a693d4..fae901f00d 100644
--- a/accel/tcg/softmmu_template.h
+++ b/accel/tcg/softmmu_template.h
@@ -111,9 +111,10 @@ static inline DATA_TYPE glue(io_read, SUFFIX)(CPUArchState 
*env,
 WORD_TYPE helper_le_ld_name(CPUArchState *env, target_ulong addr,
 TCGMemOpIdx oi, uintptr_t retaddr)
 {
-unsigned mmu_idx = get_mmuidx(oi);
-int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
-target_ulong tlb_addr = env->tlb_table[mmu_idx][index].ADDR_READ;
+uintptr_t mmu_idx = get_mmuidx(oi);
+uintptr_t index = tlb_index(env, mmu_idx, addr);
+CPUTLBEntry *entry = tlb_entry(env, mmu_idx, index);
+target_ulong tlb_addr = entry->ADDR_READ;
 unsigned a_bits = get_alignment_bits(get_memop(oi));
 uintptr_t haddr;
 DATA_TYPE res;
@@ -129,7 +130,7 @@ WORD_TYPE helper_le_ld_name(CPUArchState *env, target_ulong 
addr,
 tlb_fill(ENV_GET_CPU(env), addr, DATA_SIZE, READ_ACCESS_TYPE,
  mmu_idx, retaddr);
 }
-tlb_addr = env->tlb_table[mmu_idx][index].ADDR_READ;
+tlb_addr = entry->ADDR_READ;
 }
 
 /* Handle an IO access.  */
@@ -166,7 +167,7 @@ WORD_TYPE helper_le_ld_name(CPUArchState *env, target_ulong 
addr,
 return res;
 }
 
-haddr = addr + env->tlb_table[mmu_idx][index].addend;
+haddr = addr + entry->addend;
 #if DATA_SIZE == 1
 res = glue(glue(ld, LSUFFIX), _p)((uint8_t *)haddr);
 #else
@@ -179,9 +180,10 @@ WORD_TYPE helper_le_ld_name(CPUArchState *env, 
target_ulong addr,
 WORD_TYPE helper_be_ld_name(CPUArchState *env, target_ulong addr,
 TCGMemOpIdx oi, uintptr_t retaddr)
 {
-unsigned mmu_idx = get_mmuidx(oi);
-int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
-target_ulong tlb_addr = env->tlb_table[mmu_idx][index].ADDR_READ;
+uintptr_t mmu_idx = get_mmuidx(oi);
+uintptr_t index = tlb_index(env, mmu_idx, addr);
+CPUTLBEntry *entry = tlb_entry(env, mmu_idx, index);
+target_ulong tlb_addr = entry->ADDR_READ;
 unsigned a_bits = get_alignment_bits(get_memop(oi));
 uintptr_t haddr;
 DATA_TYPE res;
@@ -197,7 +199,7 @@ WORD_TYPE helper_be_ld_name(CPUArchState *env, target_ulong 
addr,
 tlb_fill(ENV_GET_CPU(env), addr, DATA_SIZE, READ_ACCESS_TYPE,
  mmu_idx, retaddr);
 }
-tlb_addr = env->tlb_table[mmu_idx][index].ADDR_READ;
+tlb_addr = entry->ADDR_READ;
 }
 
 /* Handle an IO access.  */
@@ -234,7 +236,7 @@ WORD_TYPE helper_be_ld_name(CPUArchState *env, target_ulong 
addr,
 return res;
 }
 
-haddr = addr + env->tlb_table[mmu_idx][index].addend;
+haddr = addr + entry->addend;
 res = glue(glue(ld, LSUFFIX), _be_p)((uint8_t *)haddr);
 return res;
 }
@@ -275,9 +277,10 @@ static inline void glue(io_write, SUFFIX)(CPUArchState 
*env,
 void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
TCGMemOpIdx oi, uintptr_t retaddr)
 {
-unsigned mmu_idx = get_mmuidx(oi);
-int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
-target_ulong tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
+uintptr_t mmu_idx = get_mmuidx(oi);
+uintptr_t index = tlb_index(env, mmu_idx, addr);
+CPUTLBEntry *entry = tlb_entry(env, mmu_idx, index);
+target_ulong tlb_addr = entry->addr_write;
 unsigned a_bits = get_alignment_bits(get_memop(oi));
 uintptr_t haddr;
 
@@ -292,7 +295,7 @@ void helper_le_st_name(CPUArchState *env, target_ulong 
addr, DATA_TYPE val,
 tlb_fill(ENV_GET_CPU(env), addr, DATA_SIZE, MMU_DATA_STORE,
  mmu_idx, retaddr);
 }
-tlb_addr = env->tlb_table[mmu_idx][index].addr_write & 
~TLB_INVALID_MASK;
+tlb_addr = entry->addr_write & ~TLB_INVALID_MASK;
 }
 
 /* Handle an IO access.  */
@@ -313,16 +316,16 @@ void helper_le_st_name(CPUArchState *env, target_ulong 
addr, DATA_TYPE val,
 if (DATA_SIZE > 1
 && unlikely((addr & ~TARGET_PAGE_MASK) + DATA_SIZE - 1
  >= TARGET_PAGE_SIZE)) {
-int i, index2;
-target_ulong page2, tlb_addr2;
+int i;
+target_ulong page2;
+CPUTLBEntry *entry2;
 do_unaligned_access:
 /* Ensure the second page is in the TLB.  Note that the first page
is already guaranteed to b

Re: [Qemu-devel] [PATCH] hw/ppc/spapr_rng: Introduce CONFIG_SPAPR_RNG switch for spapr_rng.c

2018-10-07 Thread David Gibson
On Fri, Oct 05, 2018 at 08:12:12AM +0200, Thomas Huth wrote:
> On 2018-10-05 06:25, David Gibson wrote:
> > On Thu, Oct 04, 2018 at 12:07:01PM +0200, Thomas Huth wrote:
> >> The spapr-rng device is suboptimal when compared to virtio-rng, so
> >> users might want to disable it in their builds. Thus let's introduce
> >> a proper CONFIG switch to allow us to compile QEMU without this device.
> >>
> >> Signed-off-by: Thomas Huth 
> > 
> > Uh, sure, I guess so.
> 
> Yes, we definitely want this for the QEMU in RHEL :-)
> 
> >> diff --git a/hw/ppc/spapr_rng.c b/hw/ppc/spapr_rng.c
> >> index d2acd61..dec8434 100644
> >> --- a/hw/ppc/spapr_rng.c
> >> +++ b/hw/ppc/spapr_rng.c
> >> @@ -27,6 +27,7 @@
> >>  #include "sysemu/rng.h"
> >>  #include "hw/ppc/spapr.h"
> >>  #include "kvm_ppc.h"
> >> +#include "spapr_rng.h"
> >>  
> >>  #define SPAPR_RNG(obj) \
> >>  OBJECT_CHECK(sPAPRRngState, (obj), TYPE_SPAPR_RNG)
> >> @@ -132,29 +133,6 @@ static void spapr_rng_realize(DeviceState *dev, Error 
> >> **errp)
> >>  }
> >>  }
> >>  
> >> -int spapr_rng_populate_dt(void *fdt)
> >> -{
> >> -int node;
> >> -int ret;
> >> -
> >> -node = qemu_fdt_add_subnode(fdt, "/ibm,platform-facilities");
> >> -if (node <= 0) {
> >> -return -1;
> >> -}
> >> -ret = fdt_setprop_string(fdt, node, "device_type",
> >> - "ibm,platform-facilities");
> >> -ret |= fdt_setprop_cell(fdt, node, "#address-cells", 0x1);
> >> -ret |= fdt_setprop_cell(fdt, node, "#size-cells", 0x0);
> >> -
> >> -node = fdt_add_subnode(fdt, node, "ibm,random-v1");
> >> -if (node <= 0) {
> >> -return -1;
> >> -}
> >> -ret |= fdt_setprop_string(fdt, node, "compatible", "ibm,random");
> >> -
> >> -return ret ? -1 : 0;
> >> -}
> >> -
> > 
> > Moving this to an inline doesn't seem right to me though - it's a more
> > complex function that we usually want in a .h inline, and I don't
> > really see a good reason for it to be there (rather than an #ifdeffed
> > stub).
> 
> An #ifdef is not possible here - the CONFIG switches for the targets are
> *not* turned into pre-processor macros, only the CONFIG switches for the
> host settings.

Ah, right.

> So putting this function as static inline into a separate
> header seems to be the best option to me right now. Alternatively, I
> could also put it directly into spapr.c directly, but that file is
> already very big... well, I don't mind, let me know what you prefer.

I'd prefer spapr.c to the inline.

But.. couldn't you put a stub version in stubs?  That would make a
weak symbol that would be overridden when SPAPR_RNG is compiled in.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 02/14] block: Add bdrv_reopen_set_read_only()

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> Most callers of bdrv_reopen() only use it to switch a BlockDriverState
> between read-only and read-write, so this patch adds a new function
> that does just that.
> 
> We also want to get rid of the flags parameter in the bdrv_reopen()
> API, so this function sets the "read-only" option and passes the
> original flags (which will then be updated in bdrv_reopen_prepare()).
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block.c   | 17 +
>  include/block/block.h |  2 ++
>  2 files changed, 19 insertions(+)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 04/14] block: Use bdrv_reopen_set_read_only() in commit_start/complete()

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> This patch replaces the bdrv_reopen() calls that set and remove the
> BDRV_O_RDWR flag with the new bdrv_reopen_set_read_only() function.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block/commit.c | 16 ++--
>  1 file changed, 6 insertions(+), 10 deletions(-)

Will need a rebase on 22dffcbec62ba918db690ed44beba4bd4e970bb9, but that
looks like a rather simple job, so:

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 05/14] block: Use bdrv_reopen_set_read_only() in bdrv_commit()

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> This patch replaces the bdrv_reopen() calls that set and remove the
> BDRV_O_RDWR flag with the new bdrv_reopen_set_read_only() function.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block/commit.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 06/14] block: Use bdrv_reopen_set_read_only() in stream_start/complete()

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> This patch replaces the bdrv_reopen() calls that set and remove the
> BDRV_O_RDWR flag with the new bdrv_reopen_set_read_only() function.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block/stream.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)

In anticipation of a rebase on 1b57488acf1:

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 07/14] block: Use bdrv_reopen_set_read_only() in qmp_change_backing_file()

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> This patch replaces the bdrv_reopen() calls that set and remove the
> BDRV_O_RDWR flag with the new bdrv_reopen_set_read_only() function.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  blockdev.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 08/14] block: Use bdrv_reopen_set_read_only() in external_snapshot_commit()

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> This patch replaces the bdrv_reopen() calls that set and remove the
> BDRV_O_RDWR flag with the new bdrv_reopen_set_read_only() function.

Ha!  Got you!  It's just one call this time, not "calls"! :-)

> Signed-off-by: Alberto Garcia 
> ---
>  blockdev.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 09/14] block: Use bdrv_reopen_set_read_only() in the mirror driver

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> The 'block-commit' QMP command is implemented internally using two
> different drivers. If the source image is the active layer then the
> mirror driver is used (commit_active_start()), otherwise the commit
> driver is used (commit_start()).
> 
> In both cases the destination image must be put temporarily in
> read-write mode. This is done correctly in the latter case, but what
> commit_active_start() does is copy all flags instead.

Well, not only commit_active_start().  mirror_exit() does exactly the
same.  It does seem on purpose to let the target have exactly the same
flags as the source eventually.

Then again, none of the other flags seem to make any sense to me here.
Therefore, I tend to agree that it is a bug fix, even though I wouldn't
say it was an oversight.

Reviewed-by: Max Reitz 

...eagerly awaiting rebase on 737efc1eda2.

> This patch replaces the bdrv_reopen() calls in that function with
> bdrv_reopen_set_read_only() so that only the read-only status is
> changed.
> 
> A similar change is made in mirror_exit(), which is also used by the
> 'drive-mirror' and 'blockdev-mirror' commands.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block/mirror.c | 20 +---
>  1 file changed, 13 insertions(+), 7 deletions(-)



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [RFC 1/6] (XXX) cputlb: separate MMU allocation + run-time sizing

2018-10-07 Thread Richard Henderson
On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> -#define CPU_COMMON_TLB \
> +typedef struct CPUTLBDesc {
> +size_t size;
> +size_t mask; /* (.size - 1) << CPU_TLB_ENTRY_BITS for TLB fast path */
> +} CPUTLBDesc;

I think you don't need both size and mask.  Size is (or ought to be) used
infrequently enough that I think it can be computed on demand.  But on a
related note, if you remember the out-of-line tlb changes, it'll be easier for
x86 if you index an array of scalars.


> -int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
> +int index = (addr >> TARGET_PAGE_BITS) & (env->tlb_desc[mmu_idx].size - 
> 1);

I just posted a patch to functionalize this.  But I imagine

static inline uintptr_t tlb_index(CPUArchState *env, uintptr_t mmu_idx,
  target_ulong addr)
{
uintptr_t ofs = (addr >> TARGET_PAGE_BITS) & env->tlb_mask[mmu_idx];
return ofs >> CPU_TLB_BITS;
}

static inline CPUTLBEntry *tlb_entry(CPUArchState *env, uintptr_t mmu_idx,
 target_ulong addr)
{
uintptr_t ofs = (addr >> TARGET_PAGE_BITS) & env->tlb_mask[mmu_idx];
return (void *)&env->tlb_table[mmu_idx][0] + ofs;
}

In the few places where we use both of these functions, the compiler ought to
be able to pull out the common sub-expression.


> @@ -139,7 +149,10 @@ static void tlb_flush_nocheck(CPUState *cpu)
>   * that do not hold the lock are performed by the same owner thread.
>   */
>  qemu_spin_lock(&env->tlb_lock);
> -memset(env->tlb_table, -1, sizeof(env->tlb_table));
> +for (i = 0; i < NB_MMU_MODES; i++) {
> +memset(env->tlb_table[i], -1,
> +   env->tlb_desc[i].size * sizeof(CPUTLBEntry));

Here we can use something like

static inline uintptr_t sizeof_tlb(CPUArchState *env, uintptr_t mmu_idx)
{
return env->tlb_mask[mmu_idx] + (1 << CPU_TLB_BITS);
}

memset(env->tlb_table[i], -1, sizeof_tlb(env, i));

> -for (i = 0; i < CPU_TLB_SIZE; i++) {
> +for (i = 0; i < env->tlb_desc[mmu_idx].size; i++) {

This seems to be one of the few places where you need the number of entries
rather than the size.  Perhaps

   for (i = sizeof_tlb(env, mmu_idx) >> CPU_TLB_BITS; i-- > 0; ) {

Or if you can think of a name for the function of the same effect...


r~



Re: [Qemu-devel] [PATCH 10/14] block: Drop bdrv_reopen()

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> No one is using this function anymore, so we can safely remove it

Is this such a minor action to you that you think it doesn't need a full
stop? ;-)

> Signed-off-by: Alberto Garcia 
> ---
>  block.c   | 21 -
>  include/block/block.h |  1 -
>  2 files changed, 22 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [RFC 6/6] cputlb: dynamically resize TLBs based on use rate

2018-10-07 Thread Emilio G. Cota
On Sun, Oct 07, 2018 at 19:37:50 +0200, Philippe Mathieu-Daudé wrote:
> On 10/6/18 11:45 PM, Emilio G. Cota wrote:
> > 2. System boot + shutdown, ubuntu 18.04 x86_64:
> 
> You can also run the VM tests to build QEMU:
> 
> $ make vm-test

Thanks, will give that a look.

> > +if (rate == 100) {
> > +new_size = MIN(desc->size << 2, 1 << 
> > TCG_TARGET_TLB_MAX_INDEX_BITS);
> > +} else if (rate > 70) {
> > +new_size = MIN(desc->size << 1, 1 << 
> > TCG_TARGET_TLB_MAX_INDEX_BITS);
> > +} else if (rate < 30) {
> 
> I wonder if those thresholds might be per TCG_TARGET.

Do you mean to tune the growth rate based on each TCG target?
(max and min are already determined by the TCG target).
The optimal growth rate is mostly dependent on the guest
workload, so I wouldn't expect the TCG target to matter
much.

That said, we could spend quite some time tweaking the
TLB sizing algorithm. But with this RFC I wanted to see
(a) whether this approach is a good idea at all, and (b) show
what 'easy' speedups might look like (because converting
all TCG targets is a pain, so it better be justified).

> Btw the paper used 40% here, did you tried it too?

Yes, I tried several alternatives including what the
paper describes, i.e. (skipping the min/max checks
for simplicity):

if (rate > 70) {
new_size = 2 * old_size;
} else if (rate < 40) {
new_size = old_size / 2;
}

But that didn't give great speedups (see "resizing-paper"
set):
  https://imgur.com/a/w3AqHP7

A few points stand out to me:

- We get very different speedups even if we implement
  the algorithm they describe (not sure that's exactly
  what they implemented though). But there are many
  variables that could explain that, e.g. different guest
  images (and therefore different TLB flush rates) and
  different QEMU baselines (ours is faster than the paper's,
  so getting speedups is harder).

- 70/40% use rate for growing/shrinking the TLB does not
  seem a great choice, if one wants to avoid a pathological
  case that can induce constant resizing. Imagine we got
  exactly 70% use rate, and all TLB misses were compulsory
  (i.e. a direct-mapped TLB would have not prevented a
  single miss). We'd then double the TLB size:
size_new = 2*size_old
  But then the use rate will halve:
use_new = 0.7/2 = 0.35
  So we'd then end up in a grow-shrink loop!
  Picking a "shrink threshold" below 0.70/2=0.35 avoids this.

- Aggressively increasing the TLB size when usage is high
  makes sense. However, reducing the size at the same
  rate does not make much sense. Imagine the following
  scenario with two processes being scheduled: one process
  uses a lot of memory, and the other one uses little, but
  both are CPU-intensive and therefore being assigned similar
  time slices by the scheduler. Ideally you'd resize the TLB
  to meet each process' memory demands. However, at flush
  time we don't even know what process is running or about
  to run, so we have to size the TLB exclusively based on
  recent use rates. In this scenario you're probably close
  to optimal if you size the TLB to meet the demands of the
  most memory-hungry process. You'll lose some extra time
  flushing the (now larger) TLB, but your net gain is likely
  to be positive given the TLB fills you won't have to do
  when the memory-heavy process is scheduled in.

So to me it's quite likely that in the paper they
could have gotten even better results by reducing the
shrink rate, like we did.

Thanks,

Emilio



Re: [Qemu-devel] [PATCH 11/14] qemu-io: Put flag changes in the options QDict in reopen_f()

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> When reopen_f() puts a block device in the reopen queue, some of the
> new options are passed using a QDict, but others ("read-only" and the
> cache options) are passed as flags.
> 
> This patch puts those flags in the QDict. This way the flags parameter
> becomes redundant and we'll be able to get rid of it in a subsequent
> patch.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  qemu-io-cmds.c | 27 ++-
>  tests/qemu-iotests/133 |  9 +
>  tests/qemu-iotests/133.out |  8 
>  3 files changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
> index db0b3ee5ef..4ad5269abc 100644
> --- a/qemu-io-cmds.c
> +++ b/qemu-io-cmds.c
> @@ -10,6 +10,7 @@
>  
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
> +#include "qapi/qmp/qdict.h"
>  #include "qemu-io.h"
>  #include "sysemu/block-backend.h"
>  #include "block/block.h"
> @@ -1978,6 +1979,7 @@ static int reopen_f(BlockBackend *blk, int argc, char 
> **argv)
>  int flags = bs->open_flags;
>  bool writethrough = !blk_enable_write_cache(blk);
>  bool has_rw_option = false;
> +bool has_cache_option = false;
>  
>  BlockReopenQueue *brq;
>  Error *local_err = NULL;
> @@ -1989,6 +1991,7 @@ static int reopen_f(BlockBackend *blk, int argc, char 
> **argv)
>  error_report("Invalid cache option: %s", optarg);
>  return -EINVAL;
>  }
> +has_cache_option = true;
>  break;
>  case 'o':
>  if (!qemu_opts_parse_noisily(&reopen_opts, optarg, 0)) {
> @@ -2046,9 +2049,31 @@ static int reopen_f(BlockBackend *blk, int argc, char 
> **argv)
>  }
>  
>  qopts = qemu_opts_find(&reopen_opts, NULL);
> -opts = qopts ? qemu_opts_to_qdict(qopts, NULL) : NULL;
> +opts = qopts ? qemu_opts_to_qdict(qopts, NULL) : qdict_new();
>  qemu_opts_reset(&reopen_opts);
>  
> +if (qdict_haskey(opts, BDRV_OPT_READ_ONLY)) {
> +if (has_rw_option) {
> +error_report("Can't set both -r/-w and '" BDRV_OPT_READ_ONLY 
> "'");

I'm not a fan of elision ("can't") in user interfaces.  (In fact, I'm
not a fan of elision anywhere in the code base or commit logs, but I
don't put up a struggle there.)

> +qobject_unref(opts);
> +return -EINVAL;
> +}
> +} else {
> +qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !(flags & BDRV_O_RDWR));
> +}
> +
> +if (qdict_haskey(opts, BDRV_OPT_CACHE_DIRECT) ||
> +qdict_haskey(opts, BDRV_OPT_CACHE_NO_FLUSH)) {
> +if (has_cache_option) {
> +error_report("Can't set both -c and the cache options");
> +qobject_unref(opts);
> +return -EINVAL;
> +}
> +} else {
> +qdict_put_bool(opts, BDRV_OPT_CACHE_DIRECT, flags & BDRV_O_NOCACHE);
> +qdict_put_bool(opts, BDRV_OPT_CACHE_NO_FLUSH, flags & 
> BDRV_O_NO_FLUSH);
> +}
> +
>  bdrv_subtree_drained_begin(bs);
>  brq = bdrv_reopen_queue(NULL, bs, opts, flags);
>  bdrv_reopen_multiple(bdrv_get_aio_context(bs), brq, &local_err);
> diff --git a/tests/qemu-iotests/133 b/tests/qemu-iotests/133
> index b9f17c996f..a58130c5b4 100755
> --- a/tests/qemu-iotests/133
> +++ b/tests/qemu-iotests/133
> @@ -98,6 +98,15 @@ echo
>  
>  $QEMU_IO -c 'reopen -o read-only=foo' $TEST_IMG
>  
> +echo
> +echo "=== Check that mixing -c/-r/-w and their equivalent options is 
> forbidden ==="
> +echo
> +
> +$QEMU_IO -c 'reopen -r -o read-only=on' $TEST_IMG
> +$QEMU_IO -c 'reopen -w -o read-only=on' $TEST_IMG

The equivalent option however would be read-only=off.

> +$QEMU_IO -c 'reopen -c none -o cache.direct=on' $TEST_IMG
> +$QEMU_IO -c 'reopen -c writeback -o cache.direct=on' $TEST_IMG

But the equivalent option would be cache.direct=off.

> +$QEMU_IO -c 'reopen -c directsync -o cache.no-flush=on' $TEST_IMG

And cache.direct=on here.  Or rather, the equivalent -c value for
cache.no-flush=on would be "unsafe".

It doesn't really matter, but if you don't care, it shouldn't say
"equivalent options" but maybe "corresponding options".

Overall, the patch looks good.

Max

>  # success, all done
>  echo "*** done"
>  rm -f $seq.full
> diff --git a/tests/qemu-iotests/133.out b/tests/qemu-iotests/133.out
> index 570f623b6b..4d84401688 100644
> --- a/tests/qemu-iotests/133.out
> +++ b/tests/qemu-iotests/133.out
> @@ -28,4 +28,12 @@ format name: null-co
>  === Check that invalid options are handled correctly ===
>  
>  Parameter 'read-only' expects 'on' or 'off'
> +
> +=== Check that mixing -c/-r/-w and their equivalent options is forbidden ===
> +
> +Can't set both -r/-w and 'read-only'
> +Can't set both -r/-w and 'read-only'
> +Can't set both -c and the cache options
> +Can't set both -c and the cache options
> +Can't set both -c and the cache options
>  *** done
> 




signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb

2018-10-07 Thread Richard Henderson
On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> Currently we evict an entry to the victim TLB when it doesn't match
> the current address. But it could be that there's no match because
> the current entry is invalid. Do not evict the entry to the vtlb
> in that case.
> 
> This change will help us keep track of the TLB's use rate.
> 
> Signed-off-by: Emilio G. Cota 
> ---
>  include/exec/cpu-all.h | 14 ++
>  accel/tcg/cputlb.c |  2 +-
>  2 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> index 117d2fbbca..d938dedafc 100644
> --- a/include/exec/cpu-all.h
> +++ b/include/exec/cpu-all.h
> @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, 
> target_ulong addr)
>  return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
>  }
>  
> +/**
> + * tlb_is_valid - return true if at least one of the addresses is valid
> + * @te: pointer to CPUTLBEntry
> + *
> + * This is useful when we don't have a particular address to compare against,
> + * and we just want to know whether any entry holds valid data.
> + */
> +static inline bool tlb_is_valid(const CPUTLBEntry *te)
> +{
> +return !(te->addr_read & TLB_INVALID_MASK) ||
> +   !(te->addr_write & TLB_INVALID_MASK) ||
> +   !(te->addr_code & TLB_INVALID_MASK);
> +}

No, I think you misunderstand.

First, TLB_INVALID_MASK is only ever set for addr_write, in response to
PAGE_WRITE_INV.  Second, an entry that is invalid for write is still valid for
read+exec.  So there is benefit to swapping it out to the victim cache.

This is used by the s390x target to make the "lowpage" read-only, which is a
special architected 512 byte range within pages 0 and 1.  This is done by
forcing writes, but not reads, back through tlb_fill.


r~



[Qemu-devel] [PATCH] util: aio-posix: fix a typo

2018-10-07 Thread Li Qiang
Cc: qemu-triv...@nongnu.org
Signed-off-by: Li Qiang 
---
 util/aio-posix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/aio-posix.c b/util/aio-posix.c
index 621b302..51c41ed 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -40,7 +40,7 @@ struct AioHandler
 
 #ifdef CONFIG_EPOLL_CREATE1
 
-/* The fd number threashold to switch to epoll */
+/* The fd number threshold to switch to epoll */
 #define EPOLL_ENABLE_THRESHOLD 64
 
 static void aio_epoll_disable(AioContext *ctx)
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH 12/14] block: Clean up reopen_backing_file() in block/replication.c

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> This function is used to put the hidden and secondary disks in
> read-write mode before launching the backup job, and back in read-only
> mode afterwards.
> 
> This patch does the following changes:
> 
>   - Use an options QDict with the "read-only" option instead of
> passing the changes as flags only.
> 
>   - Simplify the code (it was unnecessarily complicated and verbose).
> 
>   - Fix a bug due to which the secondary disk was not being put back
> in read-only mode when writable=false (because in this case
> orig_secondary_flags always had the BDRV_O_RDWR flag set).
> 
>   - Stop clearing the BDRV_O_INACTIVE flag.

Why?  Sorry, but I don't know much about replication.  Do you know this
change is OK?  (Since the code deliberately clears it right now.)

(Well, it makes sense not to re-set it afterwards...)

> The flags parameter to bdrv_reopen_queue() becomes redundant and we'll
> be able to get rid of it in a subsequent patch.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block/replication.c | 42 ++
>  1 file changed, 18 insertions(+), 24 deletions(-)
> 
> diff --git a/block/replication.c b/block/replication.c
> index 6349d6958e..1ad0ef2ef8 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -20,6 +20,7 @@
>  #include "block/block_backup.h"
>  #include "sysemu/block-backend.h"
>  #include "qapi/error.h"
> +#include "qapi/qmp/qdict.h"
>  #include "replication.h"
>  
>  typedef enum {
> @@ -39,8 +40,8 @@ typedef struct BDRVReplicationState {
>  char *top_id;
>  ReplicationState *rs;
>  Error *blocker;
> -int orig_hidden_flags;
> -int orig_secondary_flags;
> +bool orig_hidden_read_only;
> +bool orig_secondary_read_only;
>  int error;
>  } BDRVReplicationState;
>  
> @@ -376,39 +377,32 @@ static void reopen_backing_file(BlockDriverState *bs, 
> bool writable,
>  {
>  BDRVReplicationState *s = bs->opaque;
>  BlockReopenQueue *reopen_queue = NULL;
> -int orig_hidden_flags, orig_secondary_flags;
> -int new_hidden_flags, new_secondary_flags;
>  Error *local_err = NULL;
>  
>  if (writable) {

I'd like to request a comment that "writable" is true the first time
this function is called, and false the second time.  That'd make it
clear why this rewrite makes sense.

(And that writable=false means reverting to the original state.)

((And no, it doesn't make more sense before this patch.))

> -orig_hidden_flags = s->orig_hidden_flags =
> -bdrv_get_flags(s->hidden_disk->bs);
> -new_hidden_flags = (orig_hidden_flags | BDRV_O_RDWR) &
> -~BDRV_O_INACTIVE;
> -orig_secondary_flags = s->orig_secondary_flags =
> -bdrv_get_flags(s->secondary_disk->bs);
> -new_secondary_flags = (orig_secondary_flags | BDRV_O_RDWR) &
> - ~BDRV_O_INACTIVE;
> -} else {
> -orig_hidden_flags = (s->orig_hidden_flags | BDRV_O_RDWR) &
> -~BDRV_O_INACTIVE;
> -new_hidden_flags = s->orig_hidden_flags;
> -orig_secondary_flags = (s->orig_secondary_flags | BDRV_O_RDWR) &
> -~BDRV_O_INACTIVE;
> -new_secondary_flags = s->orig_secondary_flags;
> +s->orig_hidden_read_only =
> +!(bdrv_get_flags(s->hidden_disk->bs) & BDRV_O_RDWR);
> +s->orig_secondary_read_only =
> +!(bdrv_get_flags(s->secondary_disk->bs) & BDRV_O_RDWR);

Why not use bdrv_is_read_only()?

Max

>  }
>  
>  bdrv_subtree_drained_begin(s->hidden_disk->bs);
>  bdrv_subtree_drained_begin(s->secondary_disk->bs);
>  
> -if (orig_hidden_flags != new_hidden_flags) {
> -reopen_queue = bdrv_reopen_queue(reopen_queue, s->hidden_disk->bs, 
> NULL,
> - new_hidden_flags);
> +if (s->orig_hidden_read_only) {
> +int flags = bdrv_get_flags(s->hidden_disk->bs);
> +QDict *opts = qdict_new();
> +qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
> +reopen_queue = bdrv_reopen_queue(reopen_queue, s->hidden_disk->bs,
> + opts, flags);
>  }
>  
> -if (!(orig_secondary_flags & BDRV_O_RDWR)) {
> +if (s->orig_secondary_read_only) {
> +int flags = bdrv_get_flags(s->secondary_disk->bs);
> +QDict *opts = qdict_new();
> +qdict_put_bool(opts, BDRV_OPT_READ_ONLY, !writable);
>  reopen_queue = bdrv_reopen_queue(reopen_queue, s->secondary_disk->bs,
> - NULL, new_secondary_flags);
> + opts, flags);
>  }
>  
>  if (reopen_queue) {
> 




signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v3] vmdk: align end of file to a sector boundary

2018-10-07 Thread Fam Zheng
On Fri, 10/05 10:00, yuchenlin wrote:
> Ping?

Hi,

This was merged as 51b3c6b73acae1e3fd3c7d441fc86dd17356695f.

Fam

> 
> On 2018-09-13 16:34, Fam Zheng wrote:
> > On Thu, 09/13 16:29, yuchen...@synology.com wrote:
> > > From: yuchenlin 
> > > 
> > > There is a rare case which the size of last compressed cluster
> > > is larger than the cluster size, which will cause the file is
> > > not aligned at the sector boundary.
> > > 
> > > There are three reasons to do it. First, if vmdk doesn't align at
> > > the sector boundary, there may be many undefined behaviors,
> > > such as, in vbox it will show VMDK: Compressed image is corrupted
> > > 'syno-vm-disk1.vmdk' (VERR_ZIP_CORRUPTED) when we try to import an
> > > ova with unaligned vmdk. Second, all the cluster_sector is aligned
> > > to sector, the last one should be like this, too. Third, it ease
> > > reading with sector based I/Os.
> > > 
> > > Signed-off-by: yuchenlin 
> > 
> > Reviewed-by: Fam Zheng 
> 



Re: [Qemu-devel] [PATCH 13/14] block: Remove flags parameter from bdrv_reopen_queue()

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> Now that all callers are passing all flag changes as QDict options,
> the flags parameter is no longer necessary, so we can get rid of it.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block.c   | 5 +++--
>  block/replication.c   | 6 ++
>  include/block/block.h | 3 +--
>  qemu-io-cmds.c| 2 +-
>  4 files changed, 7 insertions(+), 9 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 14/14] block: Stop passing flags to bdrv_reopen_queue_child()

2018-10-07 Thread Max Reitz
On 19.09.18 16:47, Alberto Garcia wrote:
> Now that all callers are passing the new options using the QDict we no
> longer need the 'flags' parameter.
> 
> This patch makes the following changes:
> 
>1) The update_options_from_flags() call is no longer necessary
>   so it can be removed.
> 
>2) The update_flags_from_options() call is now used in all cases,
>   and is moved down a few lines so it happens after the options
>   QDict contains the final set of values.
> 
>3) The flags parameter is removed. Now the flags are initialized
>   using the current value (for the top-level node) or the parent
>   flags (after inherit_options()). In both cases the initial
>   values are updated to reflect the new options in the QDict. This
>   happens in bdrv_reopen_queue_child() (as explained above) and in
>   bdrv_reopen_prepare().
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block.c | 50 +-
>  1 file changed, 21 insertions(+), 29 deletions(-)
> 
> diff --git a/block.c b/block.c
> index fd27b204d9..26c6a4e58a 100644
> --- a/block.c
> +++ b/block.c
> @@ -2884,7 +2884,6 @@ BlockDriverState *bdrv_open(const char *filename, const 
> char *reference,
>  static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
>   BlockDriverState *bs,
>   QDict *options,
> - int flags,
>   const BdrvChildRole *role,
>   QDict *parent_options,
>   int parent_flags)
> @@ -2894,6 +2893,7 @@ static BlockReopenQueue 
> *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
>  BlockReopenQueueEntry *bs_entry;
>  BdrvChild *child;
>  QDict *old_options, *explicit_options;
> +int flags;
>  
>  /* Make sure that the caller remembered to use a drained section. This is
>   * important to avoid graph changes between the recursive queuing here 
> and
> @@ -2919,22 +2919,11 @@ static BlockReopenQueue 
> *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
>  /*
>   * Precedence of options:
>   * 1. Explicitly passed in options (highest)
> - * 2. Set in flags (only for top level)
> - * 3. Retained from explicitly set options of bs
> - * 4. Inherited from parent node
> - * 5. Retained from effective options of bs
> + * 2. Retained from explicitly set options of bs
> + * 3. Inherited from parent node
> + * 4. Retained from effective options of bs
>   */
>  
> -if (!parent_options) {
> -/*
> - * Any setting represented by flags is always updated. If the
> - * corresponding QDict option is set, it takes precedence. Otherwise
> - * the flag is translated into a QDict option. The old setting of bs 
> is
> - * not considered.
> - */
> -update_options_from_flags(options, flags);
> -}
> -
>  /* Old explicitly set values (don't overwrite by inherited value) */
>  if (bs_entry) {
>  old_options = qdict_clone_shallow(bs_entry->state.explicit_options);
> @@ -2948,12 +2937,22 @@ static BlockReopenQueue 
> *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
>  
>  /* Inherit from parent node */
>  if (parent_options) {
> -QemuOpts *opts;
> -QDict *options_copy;
> -Error *local_err = NULL;
> -assert(!flags);
> +flags = 0;
>  role->inherit_options(&flags, options, parent_flags, parent_options);
> -options_copy = qdict_clone_shallow(options);
> +} else {
> +flags = bdrv_get_flags(bs);
> +}
> +
> +/* Old values are used for options that aren't set yet */
> +old_options = qdict_clone_shallow(bs->options);
> +bdrv_join_options(bs, options, old_options);
> +qobject_unref(old_options);
> +
> +/* We have the final set of options so let's update the flags */
> +{
> +Error *local_err = NULL;
> +QemuOpts *opts;
> +QDict *options_copy = qdict_clone_shallow(options);

I-I'm not sure this conforms to our coding style.

While I applaud your effort to keep the patch size small, I know for
sure I don't want to read code like this.  (Unless it's necessary
because of some variables' lifetimes.)

I agree with the changes, however.

Max

>  opts = qemu_opts_create(&bdrv_runtime_opts, NULL, 0, &error_abort);
>  qemu_opts_absorb_qdict(opts, options_copy, &local_err);
>  /* Don't call update_flags_from_options() if the options are wrong.
> @@ -2967,11 +2966,6 @@ static BlockReopenQueue 
> *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
>  qobject_unref(options_copy);
>  }
>  
> -/* Old values are used for options that aren't set yet */
> -old_options = qdict_clone_shallow(bs->option

Re: [Qemu-devel] [PATCH] intel_iommu: handle invalid ce for shadow sync

2018-10-07 Thread Jason Wang




On 2018年09月13日 15:55, Peter Xu wrote:

There are two callers for vtd_sync_shadow_page_table_range(), one
provided a valid context entry and one not.  Move that fetching
operation into the caller vtd_sync_shadow_page_table() where we need to
fetch the context entry.

Meanwhile, we should handle VTD_FR_CONTEXT_ENTRY_P properly when
synchronizing shadow page tables.  Having invalid context entry there is
perfectly valid when we move a device out of an existing domain.  When
that happens, instead of posting an error we invalidate the whole region.

Without this patch, QEMU will crash if we do these steps:

(1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe)
(2) bind the NICs with vfio-pci in the guest
(3) start testpmd with the NICs applied
(4) stop testpmd
(5) rebind the NIC back to ixgbe kernel driver

The patch should fix it.

Reported-by: Pei Zhang 
Tested-by: Pei Zhang 
CC: Pei Zhang 
CC: Alex Williamson 
CC: Jason Wang 
CC: Maxime Coquelin 
CC: Michael S. Tsirkin 
CC: QEMU Stable 
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272
Signed-off-by: Peter Xu 
---
  hw/i386/intel_iommu.c | 54 ++-
  1 file changed, 33 insertions(+), 21 deletions(-)


Reviewed-by: Jason Wang 

Some nits, see below.



diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 3dfada19a6..2509520d6f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -37,6 +37,8 @@
  #include "kvm_i386.h"
  #include "trace.h"
  
+static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);

+
  static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
  uint64_t wmask, uint64_t w1cmask)
  {
@@ -1047,39 +1049,49 @@ static int 
vtd_sync_shadow_page_table_range(VTDAddressSpace *vtd_as,
  .notify_unmap = true,
  .aw = s->aw_bits,
  .as = vtd_as,
+.domain_id = VTD_CONTEXT_ENTRY_DID(ce->hi),
  };
-VTDContextEntry ce_cache;
+
+return vtd_page_walk(ce, addr, addr + size, &info);
+}
+
+static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
+{
  int ret;
+VTDContextEntry ce;
+IOMMUNotifier *n;
  
-if (ce) {

-/* If the caller provided context entry, use it */
-ce_cache = *ce;
-} else {
-/* If the caller didn't provide ce, try to fetch */
-ret = vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
-   vtd_as->devfn, &ce_cache);
-if (ret) {
+ret = vtd_dev_to_context_entry(vtd_as->iommu_state,
+   pci_bus_num(vtd_as->bus),
+   vtd_as->devfn, &ce);
+if (ret) {
+if (ret == -VTD_FR_CONTEXT_ENTRY_P) {
+/*
+ * It's a valid scenario to have a context entry that is
+ * not present.  For example, when a device is removed
+ * from an existing domain then the context entry will be
+ * zeroed by the guest before it was put into another
+ * domain.  When this happens, instead of synchronizing
+ * the shadow pages we should invalidate all existing
+ * mappings and notify the backends.
+ */
+IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) {
+vtd_address_space_unmap(vtd_as, n);
+}
+} else {
  /*
   * This should not really happen, but in case it happens,
   * we just skip the sync for this time.  After all we even
   * don't have the root table pointer!
   */


It looks to me the comment is not accurate, no root pointer is not the 
only reason for the failure of vtd_dev_to_context_entry().



  error_report_once("%s: invalid context entry for bus 0x%x"
-  " devfn 0x%x",
-  __func__, pci_bus_num(vtd_as->bus),
-  vtd_as->devfn);
-return 0;


I'm not quite sure error_report_once() is really needed here since all 
failures has been traced.



+  " devfn 0x%x", __func__,
+  pci_bus_num(vtd_as->bus), vtd_as->devfn);
  }
+return 0;
  }
  
-info.domain_id = VTD_CONTEXT_ENTRY_DID(ce_cache.hi);

-
-return vtd_page_walk(&ce_cache, addr, addr + size, &info);
-}
-
-static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
-{
-return vtd_sync_shadow_page_table_range(vtd_as, NULL, 0, UINT64_MAX);
+return vtd_sync_shadow_page_table_range(vtd_as, &ce, 0, UINT64_MAX);
  }


As has been discussed, this will left addr UINT64_MAX, it's better to 
have [start, end] instead of (start, range).


Thanks

  
  /*





Re: [Qemu-devel] [RFC 4/6] tcg: define TCG_TARGET_TLB_MAX_INDEX_BITS

2018-10-07 Thread Richard Henderson
On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> From: Pranith Kumar 
> 
> This paves the way for implementing a dynamically-sized softmmu.
> 
> Signed-off-by: Pranith Kumar 
> Signed-off-by: Emilio G. Cota 
> ---

There's no point in this, since the original constraint was due to encoding
immediates in the and instruction.  Now we're using tlb_mask.


r~



Re: [Qemu-devel] [RFC 5/6] cpu-defs: define MIN_CPU_TLB_SIZE

2018-10-07 Thread Richard Henderson
On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> @@ -89,7 +89,7 @@ typedef uint64_t target_ulong;
>   * 0x18 (the offset of the addend field in each TLB entry) plus the offset
>   * of tlb_table inside env (which is non-trivial but not huge).
>   */
> -#define CPU_TLB_BITS \
> +#define MIN_CPU_TLB_BITS \
>  MIN(8,   \
>  TCG_TARGET_TLB_DISPLACEMENT_BITS - CPU_TLB_ENTRY_BITS -  \
>  (NB_MMU_MODES <= 1 ? 0 : \

There's no point in this either, since the original constraint was due to the
immediate offset into an add instruction.  Now we're loading the base address
from an array.  The actual size of the tlb is immaterial now, since the size of
the tlb does not affect the size of CPUArchState.


r~



Re: [Qemu-devel] [PATCH v5 03/16] memory-device: improve "range conflicts" error message

2018-10-07 Thread David Gibson
On Fri, Oct 05, 2018 at 11:20:11AM +0200, David Hildenbrand wrote:
> Handle id==NULL better and indicate that we are dealing with memory
> devices.
> 
> Signed-off-by: David Hildenbrand 

Reviewed-by: David Gibson 

> ---
>  hw/mem/memory-device.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
> index 7c706fadfc..0624184c40 100644
> --- a/hw/mem/memory-device.c
> +++ b/hw/mem/memory-device.c
> @@ -175,7 +175,8 @@ uint64_t memory_device_get_free_addr(MachineState *ms, 
> const uint64_t *hint,
>  if (ranges_overlap(md_addr, md_size, new_addr, size)) {
>  if (hint) {
>  const DeviceState *d = DEVICE(md);
> -error_setg(errp, "address range conflicts with '%s'", d->id);
> +error_setg(errp, "address range conflicts with memory device"
> +   " id='%s'", d->id ? d->id : "(unnamed)");
>  goto out;
>  }
>  new_addr = QEMU_ALIGN_UP(md_addr + md_size, align);

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [RFC 3/6] cputlb: track TLB use rates

2018-10-07 Thread Richard Henderson
On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> @@ -753,6 +763,7 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong 
> vaddr,
>  }
>  
>  copy_tlb_helper_locked(te, &tn);
> +env->tlb_desc[mmu_idx].used++;
>  qemu_spin_unlock(&env->tlb_lock);
>  }

So.. you're computing what?  Total TLB occupancy?
Perhaps a better name than "used"?

Otherwise the patch looks plausible.


r~



Re: [Qemu-devel] [PATCH v3 17/22] virtio-pmem: prototype

2018-10-07 Thread David Gibson
On Mon, Sep 24, 2018 at 05:48:14AM -0400, Pankaj Gupta wrote:
> 
> > On 24/09/2018 07:45, David Gibson wrote:
> > > On Thu, Sep 20, 2018 at 12:32:38PM +0200, David Hildenbrand wrote:
> > >> From: Pankaj Gupta 
> > >>
> > >> This is the current protoype of virtio-pmem. Support will require
> > >> machine changes for the architectures that will support it, so it will
> > >> not yet be compiled.
> > >>
> > >> Signed-off-by: Pankaj Gupta 
> > >> [ MemoryDevice/MemoryRegion changes, cleanups, addr property "memaddr",
> > >>   split up patches ]
> > >> Signed-off-by: David Hildenbrand 
> > > 
> > > Reviewed-by: David Gibson 
> > > 
> > > Seems generally sane.  Is there a (craft?) virtio-pmem spec around to
> > > see what this is actually trying to implement?
> > 
> > I am not aware of any. The first goal is to get something implemented
> > that works and doesn't break important concepts. E.g. hotplugging memory
> > devices was one of these concepts that Igor saw as a potential problem
> > for getting virtio-pmem upstream.
> > 
> > But maybe Pankaj already has some draft version of a spec lying around.
> 
> I have created document but no official spec. Project idea is shared here [1] 
> and kernel part of series [2]. I will create a spec and share.
> 
> [1] https://www.spinics.net/lists/kvm/msg153095.html
> [2] https://marc.info/?l=kvm&m=153572224719212&w=2

Ok.  Because this creates a new interface we could be working with for
a while, I think we do need at least a semi-formal spec - and to get
some review of that spec - before we merge.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v4 0/2] intel_iommu: better handling of dmar state switch

2018-10-07 Thread Jason Wang




On 2018年09月29日 11:36, Peter Xu wrote:

v4:
- add a patch to introduce vtd_reset_caches()
- reset the caches in the two places where GCMD update happens [Eric]

Please review, thanks.

Peter Xu (2):
   intel_iommu: introduce vtd_reset_caches()
   intel_iommu: better handling of dmar state switch

  hw/i386/intel_iommu.c | 34 +++---
  1 file changed, 23 insertions(+), 11 deletions(-)



Reviewed-by: Jason Wang 




Re: [Qemu-devel] [PATCH v3] vmdk: align end of file to a sector boundary

2018-10-07 Thread yuchenlin via Qemu-devel

On 2018-10-08 10:38, Fam Zheng wrote:

On Fri, 10/05 10:00, yuchenlin wrote:

Ping?


Hi,

This was merged as 51b3c6b73acae1e3fd3c7d441fc86dd17356695f.

Fam



Hi,

Thank you for your information.

yuchenlin



On 2018-09-13 16:34, Fam Zheng wrote:
> On Thu, 09/13 16:29, yuchen...@synology.com wrote:
> > From: yuchenlin 
> >
> > There is a rare case which the size of last compressed cluster
> > is larger than the cluster size, which will cause the file is
> > not aligned at the sector boundary.
> >
> > There are three reasons to do it. First, if vmdk doesn't align at
> > the sector boundary, there may be many undefined behaviors,
> > such as, in vbox it will show VMDK: Compressed image is corrupted
> > 'syno-vm-disk1.vmdk' (VERR_ZIP_CORRUPTED) when we try to import an
> > ova with unaligned vmdk. Second, all the cluster_sector is aligned
> > to sector, the last one should be like this, too. Third, it ease
> > reading with sector based I/Os.
> >
> > Signed-off-by: yuchenlin 
>
> Reviewed-by: Fam Zheng 






Re: [Qemu-devel] [PATCH v2 0/3] add exit-script option to qemu

2018-10-07 Thread Jason Wang




On 2018年10月04日 19:43, Dominik Csapak wrote:

this patch series aims to execute a script when qemu exits
so that one can do cleanups when using --daemonize without
having to use the qmp monitor


Hi:

Can you give a example of why it must be done through this way? It looks 
to me that we can do this by monitor the pid and behave accordingly 
through bash.


Thanks



changes since v1:

* refactored as qemu_launch_script, only for non-windows platforms
* updated net/tap.c to use qemu_launch_script instead of launch_script
* fixed a small error in the option description

Dominik Csapak (3):
   osdep: add qemu_launch_script for executing scripts
   tap: use qemu_launch_script instead of launch_script
   vl.c: call optional script when exiting

  include/qemu/osdep.h | 12 +++
  net/tap.c| 56 ++--
  qemu-options.hx  | 20 +++
  util/oslib-posix.c   | 34 +++
  util/oslib-win32.c   |  8 
  vl.c | 29 +++
  6 files changed, 113 insertions(+), 46 deletions(-)






[Qemu-devel] [QEMU-PPC] [PATCH V2 2/3] target/ppc: Add one reg id for ptcr

2018-10-07 Thread Suraj Jitindar Singh
The ptcr (partition table control register) is used to store the address
and size of the partition table. For nested kvm-hv we have a level 1
guest register the location of it's partition table with the hypervisor.
Thus to support migration we need to be able to read this out of kvm
and restore it post migration.

Add the one reg id for the ptcr.

Signed-off-by: Suraj Jitindar Singh 
---
 target/ppc/translate_init.inc.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index 263e63cb03..487196800b 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -8197,11 +8197,11 @@ static void gen_spr_power9_mmu(CPUPPCState *env)
 {
 #if !defined(CONFIG_USER_ONLY)
 /* Partition Table Control */
-spr_register_hv(env, SPR_PTCR, "PTCR",
-SPR_NOACCESS, SPR_NOACCESS,
-SPR_NOACCESS, SPR_NOACCESS,
-&spr_read_generic, &spr_write_ptcr,
-0x);
+spr_register_kvm_hv(env, SPR_PTCR, "PTCR",
+SPR_NOACCESS, SPR_NOACCESS,
+SPR_NOACCESS, SPR_NOACCESS,
+&spr_read_generic, &spr_write_ptcr,
+KVM_REG_PPC_PTCR, 0x);
 #endif
 }
 
-- 
2.13.6




[Qemu-devel] [QEMU-PPC] [PATCH V2 3/3] ppc/spapr_caps: Add SPAPR_CAP_NESTED_KVM_HV

2018-10-07 Thread Suraj Jitindar Singh
Add the spapr cap SPAPR_CAP_NESTED_KVM_HV to be used to control the
availability of nested kvm-hv to the level 1 (L1) guest.

Assuming a hypervisor with support enabled an L1 guest can be allowed to
use the kvm-hv module (and thus run it's own kvm-hv guests) by setting:
-machine pseries,cap-nested-hv=true
or disabled with:
-machine pseries,cap-nested-hv=false

Signed-off-by: Suraj Jitindar Singh 
---
 hw/ppc/spapr.c |  2 ++
 hw/ppc/spapr_caps.c| 32 
 include/hw/ppc/spapr.h |  5 -
 target/ppc/kvm.c   | 12 
 target/ppc/kvm_ppc.h   | 12 
 5 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 98868d893a..8ce97900e9 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1915,6 +1915,7 @@ static const VMStateDescription vmstate_spapr = {
 &vmstate_spapr_cap_sbbc,
 &vmstate_spapr_cap_ibs,
 &vmstate_spapr_irq_map,
+&vmstate_spapr_cap_nested_kvm_hv,
 NULL
 }
 };
@@ -3886,6 +3887,7 @@ static void spapr_machine_class_init(ObjectClass *oc, 
void *data)
 smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
 smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
 smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
+smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
 spapr_caps_add_properties(smc, &error_abort);
 smc->irq = &spapr_irq_xics;
 }
diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index aa605cea91..64f98ae68d 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -368,6 +368,28 @@ static void 
cap_hpt_maxpagesize_cpu_apply(sPAPRMachineState *spapr,
 ppc_hash64_filter_pagesizes(cpu, spapr_pagesize_cb, &maxshift);
 }
 
+static void cap_nested_kvm_hv_apply(sPAPRMachineState *spapr,
+uint8_t val, Error **errp)
+{
+if (!val) {
+/* capability disabled by default */
+return;
+}
+
+if (tcg_enabled()) {
+error_setg(errp,
+   "No Nested KVM-HV support in tcg, try cap-nested-hv=off");
+} else if (kvm_enabled()) {
+if (!kvmppc_has_cap_nested_kvm_hv()) {
+error_setg(errp,
+"KVM implementation does not support Nested KVM-HV, try cap-nested-hv=off");
+} else if (kvmppc_set_cap_nested_kvm_hv(val) < 0) {
+error_setg(errp,
+"Error enabling cap-nested-hv with KVM, try cap-nested-hv=off");
+}
+}
+}
+
 sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
 [SPAPR_CAP_HTM] = {
 .name = "htm",
@@ -437,6 +459,15 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
 .apply = cap_hpt_maxpagesize_apply,
 .cpu_apply = cap_hpt_maxpagesize_cpu_apply,
 },
+[SPAPR_CAP_NESTED_KVM_HV] = {
+.name = "nested-hv",
+.description = "Allow Nested KVM-HV",
+.index = SPAPR_CAP_NESTED_KVM_HV,
+.get = spapr_cap_get_bool,
+.set = spapr_cap_set_bool,
+.type = "bool",
+.apply = cap_nested_kvm_hv_apply,
+},
 };
 
 static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
@@ -564,6 +595,7 @@ SPAPR_CAP_MIG_STATE(dfp, SPAPR_CAP_DFP);
 SPAPR_CAP_MIG_STATE(cfpc, SPAPR_CAP_CFPC);
 SPAPR_CAP_MIG_STATE(sbbc, SPAPR_CAP_SBBC);
 SPAPR_CAP_MIG_STATE(ibs, SPAPR_CAP_IBS);
+SPAPR_CAP_MIG_STATE(nested_kvm_hv, SPAPR_CAP_NESTED_KVM_HV);
 
 void spapr_caps_init(sPAPRMachineState *spapr)
 {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index ad4d7cfd97..bced85dd92 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -70,8 +70,10 @@ typedef enum {
 #define SPAPR_CAP_IBS   0x05
 /* HPT Maximum Page Size (encoded as a shift) */
 #define SPAPR_CAP_HPT_MAXPAGESIZE   0x06
+/* Nested KVM-HV */
+#define SPAPR_CAP_NESTED_KVM_HV 0x07
 /* Num Caps */
-#define SPAPR_CAP_NUM   (SPAPR_CAP_HPT_MAXPAGESIZE + 1)
+#define SPAPR_CAP_NUM   (SPAPR_CAP_NESTED_KVM_HV + 1)
 
 /*
  * Capability Values
@@ -793,6 +795,7 @@ extern const VMStateDescription vmstate_spapr_cap_dfp;
 extern const VMStateDescription vmstate_spapr_cap_cfpc;
 extern const VMStateDescription vmstate_spapr_cap_sbbc;
 extern const VMStateDescription vmstate_spapr_cap_ibs;
+extern const VMStateDescription vmstate_spapr_cap_nested_kvm_hv;
 
 static inline uint8_t spapr_get_cap(sPAPRMachineState *spapr, int cap)
 {
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 30aeafa7de..f81327d6cd 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -91,6 +91,7 @@ static int cap_ppc_pvr_compat;
 static int cap_ppc_safe_cache;
 static int cap_ppc_safe_bounds_check;
 static int cap_ppc_safe_indirect_branch;
+static int cap_ppc_nested_kvm_hv;
 
 static uint32_t debug_inst_opcode;
 
@@ -150,6 +151,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 cap_mmu_hash_v3 = kvm_vm_check_extension(s, KVM_CAP_PPC_MMU_HASH_V3);
 cap_resize_hpt = kvm_vm

[Qemu-devel] [QEMU-PPC] [PATCH V2 0/3] ppc/spapr: Add support for nested kvm-hv

2018-10-07 Thread Suraj Jitindar Singh
This patch series adds the qemu support for running nested kvm-hv on a
POWER9 platform with appropriate hypervisor support and migration of
these guests.
That is, the ability to run kvm-hv guests as guests of an operating system
which is itself a kvm-hv guest.

The host (L0 hypervisor) and level 1 guest (L1 guest hypervisor) require
the following patch series:
KVM: PPC: Book3S HV: Nested HV virtualization

And this patch series in the qemu instance running on the L0 hypervisor.

Patch series based on: ppc-for-3.1

Still waiting for the cap number and one reg id to be confirmed in linux.

Changes RFC -> V2:
- Split header changes into separate patch
- Reword error messages when enabling nested-hv with kvm

Suraj Jitindar Singh (3):
  target/ppc: Update linux-headers for v4.19-rc7
  target/ppc: Add one reg id for ptcr
  ppc/spapr_caps: Add SPAPR_CAP_NESTED_KVM_HV

 hw/ppc/spapr.c  |  2 ++
 hw/ppc/spapr_caps.c | 32 
 include/hw/ppc/spapr.h  |  5 -
 linux-headers/asm-powerpc/kvm.h |  1 +
 linux-headers/linux/kvm.h   |  1 +
 target/ppc/kvm.c| 12 
 target/ppc/kvm_ppc.h| 12 
 target/ppc/translate_init.inc.c | 10 +-
 8 files changed, 69 insertions(+), 6 deletions(-)

-- 
2.13.6




[Qemu-devel] [QEMU-PPC] [PATCH V2 1/3] target/ppc: Update linux-headers for v4.19-rc7

2018-10-07 Thread Suraj Jitindar Singh
Signed-off-by: Suraj Jitindar Singh 
---
 linux-headers/asm-powerpc/kvm.h | 1 +
 linux-headers/linux/kvm.h   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index 1b32b56a03..8c876c166e 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -634,6 +634,7 @@ struct kvm_ppc_cpu_char {
 
 #define KVM_REG_PPC_DEC_EXPIRY (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xbe)
 #define KVM_REG_PPC_ONLINE (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xbf)
+#define KVM_REG_PPC_PTCR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc0)
 
 /* Transactional Memory checkpointed state:
  * This is all GPRs, all VSX regs and a subset of SPRs
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 66790724f1..d49767ad25 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -951,6 +951,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_HYPERV_TLBFLUSH 155
 #define KVM_CAP_S390_HPAGE_1M 156
 #define KVM_CAP_NESTED_STATE 157
+#define KVM_CAP_PPC_NESTED_HV 160
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.13.6




Re: [Qemu-devel] [RFC 6/6] cputlb: dynamically resize TLBs based on use rate

2018-10-07 Thread Richard Henderson
On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> @@ -122,6 +123,39 @@ size_t tlb_flush_count(void)
>  return count;
>  }
>  
> +/* Call with tlb_lock held */
> +static void tlb_mmu_resize_locked(CPUArchState *env, int mmu_idx)
> +{
> +CPUTLBDesc *desc = &env->tlb_desc[mmu_idx];
> +size_t rate = desc->used * 100 / desc->size;
> +size_t new_size = desc->size;
> +
> +if (rate == 100) {
> +new_size = MIN(desc->size << 2, 1 << TCG_TARGET_TLB_MAX_INDEX_BITS);
> +} else if (rate > 70) {
> +new_size = MIN(desc->size << 1, 1 << TCG_TARGET_TLB_MAX_INDEX_BITS);
> +} else if (rate < 30) {
> +desc->n_flushes_low_rate++;
> +if (desc->n_flushes_low_rate == 100) {
> +new_size = MAX(desc->size >> 1, 1 << MIN_CPU_TLB_BITS);
> +desc->n_flushes_low_rate = 0;
> +}
> +}
> +
> +if (new_size == desc->size) {

s/desc->size/old_size/g
Otherwise it looks plausible as a first cut.


r~



Re: [Qemu-devel] [PATCH v3 17/22] virtio-pmem: prototype

2018-10-07 Thread Pankaj Gupta
Hello David,

Thanks for the review.

> > 
> > > On 24/09/2018 07:45, David Gibson wrote:
> > > > On Thu, Sep 20, 2018 at 12:32:38PM +0200, David Hildenbrand wrote:
> > > >> From: Pankaj Gupta 
> > > >>
> > > >> This is the current protoype of virtio-pmem. Support will require
> > > >> machine changes for the architectures that will support it, so it will
> > > >> not yet be compiled.
> > > >>
> > > >> Signed-off-by: Pankaj Gupta 
> > > >> [ MemoryDevice/MemoryRegion changes, cleanups, addr property
> > > >> "memaddr",
> > > >>   split up patches ]
> > > >> Signed-off-by: David Hildenbrand 
> > > > 
> > > > Reviewed-by: David Gibson 
> > > > 
> > > > Seems generally sane.  Is there a (craft?) virtio-pmem spec around to
> > > > see what this is actually trying to implement?
> > > 
> > > I am not aware of any. The first goal is to get something implemented
> > > that works and doesn't break important concepts. E.g. hotplugging memory
> > > devices was one of these concepts that Igor saw as a potential problem
> > > for getting virtio-pmem upstream.
> > > 
> > > But maybe Pankaj already has some draft version of a spec lying around.
> > 
> > I have created document but no official spec. Project idea is shared here
> > [1]
> > and kernel part of series [2]. I will create a spec and share.
> > 
> > [1] https://www.spinics.net/lists/kvm/msg153095.html
> > [2] https://marc.info/?l=kvm&m=153572224719212&w=2
> 
> Ok.  Because this creates a new interface we could be working with for
> a while, I think we do need at least a semi-formal spec - and to get
> some review of that spec - before we merge.

Sure, Will create one and try to send for review this week.

Thanks,
Pankaj

> 
> --
> David Gibson  | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au| minimalist, thank you.  NOT _the_ 
> _other_
>   | _way_ _around_!
> http://www.ozlabs.org/~dgibson
> 



[Qemu-devel] [PATCH 0/2] linux-user: usbfs improvements

2018-10-07 Thread Cortland Tölva
From: Cortland Setlow Tölva 

This patch series enables programs running under QEMU Linux user mode
emulation to implement user-space USB drivers via the USBFS ioctl()s.
Support is limited to control, bulk, and possibly interrupt transfers.

Usbfs ioctl codes were incorrect whenever host and target disagreed on
struct size.  The submit, discard, and reap usbfs ioctls require special
memory buffer handling and the second commit implements this but not for
USB3 streams or isochronous transfers.

Cortland Tölva (2):
  linux-user: Use calculated sizes for usbfs ioctls.
  linux-user: Implement special usbfs ioctls.

 linux-user/ioctls.h|   8 ++
 linux-user/syscall.c   | 177 +
 linux-user/syscall_defs.h  |  42 -
 linux-user/syscall_types.h |  20 +
 4 files changed, 227 insertions(+), 20 deletions(-)

-- 
2.17.1



[Qemu-devel] [PATCH 2/2] linux-user: Implement usbfs submit and reap ioctls.

2018-10-07 Thread Cortland Tölva
Userspace submits a USB Request Buffer to the kernel, optionally
discards it, and finally reaps the URB.  Thunk buffers from target
to host and back.

Tested by running an i386 scanner driver on ARMv7, by running
the PowerPC lsusb utility on x86_64, and MIPS lsusb on ARMv7.  
The discard urb ioctl is implemented but has not been tested.

Signed-off-by: Cortland Tölva 
---
lsusb was run with -vv so as to exercise the submit and reap ioctls.

 linux-user/ioctls.h|   8 ++
 linux-user/syscall.c   | 177 +
 linux-user/syscall_defs.h  |   4 +
 linux-user/syscall_types.h |  20 +
 4 files changed, 209 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index 92f6177f1d..ae8951625f 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -143,6 +143,14 @@
   IOCTL(USBDEVFS_SETCONFIGURATION, IOC_W, MK_PTR(TYPE_INT))
   IOCTL(USBDEVFS_GETDRIVER, IOC_R,
 MK_PTR(MK_STRUCT(STRUCT_usbdevfs_getdriver)))
+  IOCTL_SPECIAL(USBDEVFS_SUBMITURB, IOC_W, do_ioctl_usbdevfs_submiturb,
+  MK_PTR(MK_STRUCT(STRUCT_usbdevfs_urb)))
+  IOCTL_SPECIAL(USBDEVFS_DISCARDURB, IOC_RW, do_ioctl_usbdevfs_discardurb,
+  MK_PTR(MK_STRUCT(STRUCT_usbdevfs_urb)))
+  IOCTL_SPECIAL(USBDEVFS_REAPURB, IOC_R, do_ioctl_usbdevfs_reapurb,
+  MK_PTR(TYPE_PTRVOID))
+  IOCTL_SPECIAL(USBDEVFS_REAPURBNDELAY, IOC_R, do_ioctl_usbdevfs_reapurb,
+  MK_PTR(TYPE_PTRVOID))
   IOCTL(USBDEVFS_DISCSIGNAL, IOC_W,
 MK_PTR(MK_STRUCT(STRUCT_usbdevfs_disconnectsignal)))
   IOCTL(USBDEVFS_CLAIMINTERFACE, IOC_W, MK_PTR(TYPE_INT))
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 2641260186..9b7ea96cfb 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -96,6 +96,7 @@
 #include 
 #if defined(CONFIG_USBFS)
 #include 
+#include 
 #endif
 #include 
 #include 
@@ -4199,6 +4200,182 @@ static abi_long do_ioctl_ifconf(const IOCTLEntry *ie, 
uint8_t *buf_temp,
 return ret;
 }
 
+#if defined(CONFIG_USBFS)
+#if HOST_LONG_BITS > 64
+#error USBDEVFS thunks do not support >64 bit hosts yet.
+#endif
+struct live_urb {
+uint64_t target_urb_adr;
+uint64_t target_buf_adr;
+char *target_buf_ptr;
+struct usbdevfs_urb host_urb;
+};
+
+static GHashTable *usbdevfs_urb_hashtable(void)
+{
+static GHashTable *urb_hashtable;
+
+if (!urb_hashtable) {
+urb_hashtable = g_hash_table_new(g_int64_hash, g_int64_equal);
+}
+return urb_hashtable;
+}
+
+static void urb_hashtable_insert(struct live_urb *urb)
+{
+GHashTable *urb_hashtable = usbdevfs_urb_hashtable();
+g_hash_table_insert(urb_hashtable, urb, urb);
+}
+
+static struct live_urb *urb_hashtable_lookup(uint64_t target_urb_adr)
+{
+GHashTable *urb_hashtable = usbdevfs_urb_hashtable();
+return g_hash_table_lookup(urb_hashtable, &target_urb_adr);
+}
+
+static void urb_hashtable_remove(struct live_urb *urb)
+{
+GHashTable *urb_hashtable = usbdevfs_urb_hashtable();
+g_hash_table_remove(urb_hashtable, urb);
+}
+
+static abi_long
+do_ioctl_usbdevfs_reapurb(const IOCTLEntry *ie, uint8_t *buf_temp,
+  int fd, int cmd, abi_long arg)
+{
+const argtype usbfsurb_arg_type[] = { MK_STRUCT(STRUCT_usbdevfs_urb) };
+const argtype ptrvoid_arg_type[] = { TYPE_PTRVOID, 0, 0 };
+struct live_urb *lurb;
+void *argptr;
+uint64_t hurb;
+int target_size;
+uintptr_t target_urb_adr;
+abi_long ret;
+
+target_size = thunk_type_size(usbfsurb_arg_type, THUNK_TARGET);
+
+memset(buf_temp, 0, sizeof(uint64_t));
+ret = get_errno(safe_ioctl(fd, ie->host_cmd, buf_temp));
+if (is_error(ret)) {
+return ret;
+}
+
+memcpy(&hurb, buf_temp, sizeof(uint64_t));
+lurb = (void *)((uintptr_t)hurb - offsetof(struct live_urb, host_urb));
+if (!lurb->target_urb_adr) {
+return -TARGET_EFAULT;
+}
+urb_hashtable_remove(lurb);
+unlock_user(lurb->target_buf_ptr, lurb->target_buf_adr,
+lurb->host_urb.buffer_length);
+lurb->target_buf_ptr = NULL;
+
+/* restore the guest buffer pointer */
+lurb->host_urb.buffer = (void *)(uintptr_t)lurb->target_buf_adr;
+
+/* update the guest urb struct */
+argptr = lock_user(VERIFY_WRITE, lurb->target_urb_adr, target_size, 0);
+if (!argptr) {
+g_free(lurb);
+return -TARGET_EFAULT;
+}
+thunk_convert(argptr, &lurb->host_urb, usbfsurb_arg_type, THUNK_TARGET);
+unlock_user(argptr, lurb->target_urb_adr, target_size);
+
+target_size = thunk_type_size(ptrvoid_arg_type, THUNK_TARGET);
+/* write back the urb handle */
+argptr = lock_user(VERIFY_WRITE, arg, target_size, 0);
+if (!argptr) {
+g_free(lurb);
+return -TARGET_EFAULT;
+}
+
+/* GHashTable uses 64-bit keys but thunk_convert expects uintptr_t */
+target_urb_adr = lurb->target_urb_adr;
+thunk_convert(argptr, &target_urb_adr, ptrvoid_arg_type, THUNK_TARGET);
+unlock_user(argptr, arg, target_size);
+
+g_free(lurb);
+ 

[Qemu-devel] [PATCH 1/2] linux-user: Use calculated sizes for usbfs ioctls.

2018-10-07 Thread Cortland Tölva
Size calculation should have used the target struct.  Fix the error by
marking these ioctls as needing runtime size calcuation.

Signed-off-by: Cortland Tölva 
---
 linux-user/syscall_defs.h | 38 ++
 1 file changed, 18 insertions(+), 20 deletions(-)

diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index af91f9582d..2daa5ebdcc 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -863,31 +863,29 @@ struct target_pollfd {
 
 #define TARGET_FS_IOC_FIEMAP TARGET_IOWR('f',11,struct fiemap)
 
-#if defined(CONFIG_USBFS)
 /* usb ioctls */
-#define TARGET_USBDEVFS_CONTROL TARGET_IOWR('U', 0, struct 
usbdevfs_ctrltransfer)
-#define TARGET_USBDEVFS_BULK TARGET_IOWR('U', 2, struct usbdevfs_bulktransfer)
-#define TARGET_USBDEVFS_RESETEP TARGET_IOR('U', 3, int)
-#define TARGET_USBDEVFS_SETINTERFACE TARGET_IOR('U', 4, struct 
usbdevfs_setinterface)
-#define TARGET_USBDEVFS_SETCONFIGURATION TARGET_IOR('U',  5, int)
-#define TARGET_USBDEVFS_GETDRIVER TARGET_IOW('U', 8, struct usbdevfs_getdriver)
-#define TARGET_USBDEVFS_DISCSIGNAL TARGET_IOR('U', 14, struct 
usbdevfs_disconnectsignal)
-#define TARGET_USBDEVFS_CLAIMINTERFACE TARGET_IOR('U', 15, int)
-#define TARGET_USBDEVFS_RELEASEINTERFACE TARGET_IOR('U', 16, int)
-#define TARGET_USBDEVFS_CONNECTINFO TARGET_IOW('U', 17, struct 
usbdevfs_connectinfo)
-#define TARGET_USBDEVFS_IOCTL TARGET_IOWR('U', 18, struct usbdevfs_ioctl)
-#define TARGET_USBDEVFS_HUB_PORTINFO TARGET_IOR('U', 19, struct 
usbdevfs_hub_portinfo)
+#define TARGET_USBDEVFS_CONTROL TARGET_IOWRU('U', 0)
+#define TARGET_USBDEVFS_BULK TARGET_IOWRU('U', 2)
+#define TARGET_USBDEVFS_RESETEP TARGET_IORU('U', 3)
+#define TARGET_USBDEVFS_SETINTERFACE TARGET_IORU('U', 4)
+#define TARGET_USBDEVFS_SETCONFIGURATION TARGET_IORU('U',  5)
+#define TARGET_USBDEVFS_GETDRIVER TARGET_IOWU('U', 8)
+#define TARGET_USBDEVFS_DISCSIGNAL TARGET_IORU('U', 14)
+#define TARGET_USBDEVFS_CLAIMINTERFACE TARGET_IORU('U', 15)
+#define TARGET_USBDEVFS_RELEASEINTERFACE TARGET_IORU('U', 16)
+#define TARGET_USBDEVFS_CONNECTINFO TARGET_IOWU('U', 17)
+#define TARGET_USBDEVFS_IOCTL TARGET_IOWRU('U', 18)
+#define TARGET_USBDEVFS_HUB_PORTINFO TARGET_IORU('U', 19)
 #define TARGET_USBDEVFS_RESET TARGET_IO('U', 20)
-#define TARGET_USBDEVFS_CLEAR_HALT TARGET_IOR('U', 21, int)
+#define TARGET_USBDEVFS_CLEAR_HALT TARGET_IORU('U', 21)
 #define TARGET_USBDEVFS_DISCONNECT TARGET_IO('U', 22)
 #define TARGET_USBDEVFS_CONNECT TARGET_IO('U', 23)
-#define TARGET_USBDEVFS_CLAIM_PORT TARGET_IOR('U', 24, int)
-#define TARGET_USBDEVFS_RELEASE_PORT TARGET_IOR('U', 25, int)
-#define TARGET_USBDEVFS_GET_CAPABILITIES TARGET_IOR('U', 26, int)
-#define TARGET_USBDEVFS_DISCONNECT_CLAIM TARGET_IOR('U', 27, struct 
usbdevfs_disconnect_claim)
-#define TARGET_USBDEVFS_DROP_PRIVILEGES TARGET_IOW('U', 30, int)
+#define TARGET_USBDEVFS_CLAIM_PORT TARGET_IORU('U', 24)
+#define TARGET_USBDEVFS_RELEASE_PORT TARGET_IORU('U', 25)
+#define TARGET_USBDEVFS_GET_CAPABILITIES TARGET_IORU('U', 26)
+#define TARGET_USBDEVFS_DISCONNECT_CLAIM TARGET_IORU('U', 27)
+#define TARGET_USBDEVFS_DROP_PRIVILEGES TARGET_IOWU('U', 30)
 #define TARGET_USBDEVFS_GET_SPEED TARGET_IO('U', 31)
-#endif /* CONFIG_USBFS */
 
 /* cdrom commands */
 #define TARGET_CDROMPAUSE  0x5301 /* Pause Audio Operation */
-- 
2.17.1



Re: [Qemu-devel] [PATCH v5 5/9] x86_iommu/amd: Prepare for interrupt remap support

2018-10-07 Thread Peter Xu
On Mon, Oct 01, 2018 at 07:44:37PM +, Singh, Brijesh wrote:
> Register the interrupt remapping callback and read/write ops for the
> amd-iommu-ir memory region.
> 
> amd-iommu-ir is set to higher priority to ensure that this region won't
> be masked out by other memory regions.
> 
> Signed-off-by: Brijesh Singh 
> Cc: Peter Xu 
> Cc: "Michael S. Tsirkin" 
> Cc: Paolo Bonzini 
> Cc: Richard Henderson 
> Cc: Eduardo Habkost 
> Cc: Marcel Apfelbaum 
> Cc: Tom Lendacky 
> Cc: Suravee Suthikulpanit 

Reviewed-by: Peter Xu 

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH] intel_iommu: handle invalid ce for shadow sync

2018-10-07 Thread Peter Xu
On Mon, Oct 08, 2018 at 11:08:31AM +0800, Jason Wang wrote:

[...]

> > +static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
> > +{
> >   int ret;
> > +VTDContextEntry ce;
> > +IOMMUNotifier *n;
> > -if (ce) {
> > -/* If the caller provided context entry, use it */
> > -ce_cache = *ce;
> > -} else {
> > -/* If the caller didn't provide ce, try to fetch */
> > -ret = vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
> > -   vtd_as->devfn, &ce_cache);
> > -if (ret) {
> > +ret = vtd_dev_to_context_entry(vtd_as->iommu_state,
> > +   pci_bus_num(vtd_as->bus),
> > +   vtd_as->devfn, &ce);
> > +if (ret) {
> > +if (ret == -VTD_FR_CONTEXT_ENTRY_P) {
> > +/*
> > + * It's a valid scenario to have a context entry that is
> > + * not present.  For example, when a device is removed
> > + * from an existing domain then the context entry will be
> > + * zeroed by the guest before it was put into another
> > + * domain.  When this happens, instead of synchronizing
> > + * the shadow pages we should invalidate all existing
> > + * mappings and notify the backends.
> > + */
> > +IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) {
> > +vtd_address_space_unmap(vtd_as, n);
> > +}
> > +} else {
> >   /*
> >* This should not really happen, but in case it happens,
> >* we just skip the sync for this time.  After all we even
> >* don't have the root table pointer!
> >*/
> 
> It looks to me the comment is not accurate, no root pointer is not the only
> reason for the failure of vtd_dev_to_context_entry().
> 
> >   error_report_once("%s: invalid context entry for bus 0x%x"
> > -  " devfn 0x%x",
> > -  __func__, pci_bus_num(vtd_as->bus),
> > -  vtd_as->devfn);
> > -return 0;
> 
> I'm not quite sure error_report_once() is really needed here since all
> failures has been traced.

True; I'll then consider have all of them to be error_report_once()
and drop the one here.

> 
> > +  " devfn 0x%x", __func__,
> > +  pci_bus_num(vtd_as->bus), vtd_as->devfn);
> >   }
> > +return 0;
> >   }
> > -info.domain_id = VTD_CONTEXT_ENTRY_DID(ce_cache.hi);
> > -
> > -return vtd_page_walk(&ce_cache, addr, addr + size, &info);
> > -}
> > -
> > -static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
> > -{
> > -return vtd_sync_shadow_page_table_range(vtd_as, NULL, 0, UINT64_MAX);
> > +return vtd_sync_shadow_page_table_range(vtd_as, &ce, 0, UINT64_MAX);
> >   }
> 
> As has been discussed, this will left addr UINT64_MAX, it's better to have
> [start, end] instead of (start, range).

Hmm, this size is inclusive, so we should be fine.  Though I'll take
your advise to use start/end pair to be clearer.

Thanks!

-- 
Peter Xu



Re: [Qemu-devel] [PATCH v5 6/9] x86_iommu/amd: Add interrupt remap support when VAPIC is not enabled

2018-10-07 Thread Peter Xu
On Mon, Oct 01, 2018 at 07:44:39PM +, Singh, Brijesh wrote:
> Emulate the interrupt remapping support when guest virtual APIC is
> not enabled.
> 
> For more info Refer: AMD IOMMU spec Rev 3.0 - section 2.2.5.1
> 
> When VAPIC is not enabled, it uses interrupt remapping as defined in
> Table 20 and Figure 15 from IOMMU spec.
> 
> Signed-off-by: Brijesh Singh 
> Cc: Peter Xu 
> Cc: "Michael S. Tsirkin" 
> Cc: Paolo Bonzini 
> Cc: Richard Henderson 
> Cc: Eduardo Habkost 
> Cc: Marcel Apfelbaum 
> Cc: Tom Lendacky 
> Cc: Suravee Suthikulpanit 

Reviewed-by: Peter Xu 

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH] intel_iommu: handle invalid ce for shadow sync

2018-10-07 Thread Peter Xu
On Mon, Oct 01, 2018 at 01:36:50PM +0200, Auger Eric wrote:
> Hi Peter,
> On 9/13/18 9:55 AM, Peter Xu wrote:
> > There are two callers for vtd_sync_shadow_page_table_range(), one
> > provided a valid context entry and one not.  Move that fetching
> > operation into the caller vtd_sync_shadow_page_table() where we need to
> > fetch the context entry.
> > 
> > Meanwhile, we should handle VTD_FR_CONTEXT_ENTRY_P properly when
> > synchronizing shadow page tables.  Having invalid context entry there is
> > perfectly valid when we move a device out of an existing domain.  When
> > that happens, instead of posting an error we invalidate the whole region.
> > 
> > Without this patch, QEMU will crash if we do these steps:
> > 
> > (1) start QEMU with VT-d IOMMU and two 10G NICs (ixgbe)
> > (2) bind the NICs with vfio-pci in the guest
> > (3) start testpmd with the NICs applied
> > (4) stop testpmd
> > (5) rebind the NIC back to ixgbe kernel driver
> > 
> > The patch should fix it.
> > 
> > Reported-by: Pei Zhang 
> > Tested-by: Pei Zhang 
> > CC: Pei Zhang 
> > CC: Alex Williamson 
> > CC: Jason Wang 
> > CC: Maxime Coquelin 
> > CC: Michael S. Tsirkin 
> > CC: QEMU Stable 
> > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1627272
> > Signed-off-by: Peter Xu 
> > ---
> >  hw/i386/intel_iommu.c | 54 ++-
> >  1 file changed, 33 insertions(+), 21 deletions(-)
> > 
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 3dfada19a6..2509520d6f 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -37,6 +37,8 @@
> >  #include "kvm_i386.h"
> >  #include "trace.h"
> >  
> > +static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n);
> > +
> >  static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
> >  uint64_t wmask, uint64_t w1cmask)
> >  {
> Comment above is outdated:
> /* If context entry is NULL, we'll try to fetch it on our own. */

Indeed.

> > @@ -1047,39 +1049,49 @@ static int 
> > vtd_sync_shadow_page_table_range(VTDAddressSpace *vtd_as,
> >  .notify_unmap = true,
> >  .aw = s->aw_bits,
> >  .as = vtd_as,
> > +.domain_id = VTD_CONTEXT_ENTRY_DID(ce->hi),
> >  };
> > -VTDContextEntry ce_cache;
> > +
> > +return vtd_page_walk(ce, addr, addr + size, &info);
> > +}
> Maybe change would gain in clarity if split into 2 patches, code
> reorganization and fix on the side.

Sure.  Thanks for reviewing!

-- 
Peter Xu



Re: [Qemu-devel] [PATCH] hw/core/generic-loader: Set a category for the generic-loader device

2018-10-07 Thread Thomas Huth
On 2018-10-05 18:32, Peter Maydell wrote:
> On 5 October 2018 at 10:42, Thomas Huth  wrote:
>> Each device that is instantiatable by the users should be marked with
>> a category.
> 
> Presumably we could assert() this somewhere (at which
> point we'd find that we have dozens of devices that
> fail to set a category bit, I imagine) ?

You bet ... alone in qemu-system-aarch64, "-device help" shows more than
120 devices without a category...

So if we'd want to enforce this, there is a lot of clean-up work needed
first. Once we're in a good shape, I think a "make check" test would be
the right thing to enforce that future devices are always provided with
a category.

 Thomas



Re: [Qemu-devel] [PATCH] intel_iommu: handle invalid ce for shadow sync

2018-10-07 Thread Peter Xu
On Mon, Oct 08, 2018 at 02:06:20PM +0800, Peter Xu wrote:
> > > -static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
> > > -{
> > > -return vtd_sync_shadow_page_table_range(vtd_as, NULL, 0, UINT64_MAX);
> > > +return vtd_sync_shadow_page_table_range(vtd_as, &ce, 0, UINT64_MAX);
> > >   }
> > 
> > As has been discussed, this will left addr UINT64_MAX, it's better to have
> > [start, end] instead of (start, range).
> 
> Hmm, this size is inclusive, so we should be fine.  Though I'll take
> your advise to use start/end pair to be clearer.

Sorry it's not...

Actually vtd_page_walk() itself is not inclusive, so we will need to
touch that up if we want to let the whole stack use the [start, end]
inclusive way.  However I would try not to bother with it since after
all page sync operation is per-small-page granularity, so imho missing
the last addr UINT64_MAX is ok (as long as we're covering the last
page, which is UINT64_MAX - PAGE_SIZE + 1).  I'm not sure whether
it'll worth it then to change the whole stack into inclusive way... so
I'll temporarily keep it as is.

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH] tests: remove gcov-files- variables

2018-10-07 Thread Thomas Huth
On 2018-10-05 18:17, Paolo Bonzini wrote:
> Commit 31d2dda ("build-system: remove per-test GCOV reporting", 2018-06-20)
> removed users of the variables, since those uses can be replaced by a simple
> overall report produced by gcovr.  However, the variables were never removed.
> Do it now.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  docs/devel/testing.rst |   4 +-
>  tests/Makefile.include | 118 
> -
>  2 files changed, 1 insertion(+), 121 deletions(-)

Good idea, I think these variables were pretty much out of sync anyway.

Acked-by: Thomas Huth 



Re: [Qemu-devel] [PATCH v2 0/2] intel_iommu: handle invalid ce for shadow sync

2018-10-07 Thread Peter Xu
On Mon, Oct 08, 2018 at 02:47:11PM +0800, Peter Xu wrote:
> v2:
> - split patch into more, remove useless comment [Eric]
> - remove one error_report_once() when rework the code [Jason]
> 
> This series fixes a QEMU crash reported here:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1627272
> 
> Please review, thanks.

Sorry when splitting the patch I forgot to:

CC: QEMU Stable 

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH v4 03/11] build: rename CONFIG_QGA_NTDDDISK to CONFIG_QGA_NTDDSCSI

2018-10-07 Thread Sameeh Jubran
Reviewed-by: Sameeh Jubran 
On Thu, Oct 4, 2018 at 4:23 PM Marc-André Lureau
 wrote:
>
> Hi
>
> On Thu, Oct 4, 2018 at 3:22 PM Tomáš Golembiovský  wrote:
> >
> > There was inconsistency between commits:
> >
> >   50cbebb9a3 configure: add configure check for ntdddisk.h
> >   a3ef3b2272 qga: added bus type and disk location path
> >
> > The first commit added #define CONFIG_QGA_NTDDDISK but the second commit
> > expected the name to be CONFIG_QGA_NTDDSCSI. As a result the code in
> > second patch was never used.
> >
> > Renaming the option to CONFIG_QGA_NTDDSCSI to match the name of header
> > file that is being checked for.
> >
> > Signed-off-by: Tomáš Golembiovský 
>
> Reviewed-by: Marc-André Lureau 
>
> > ---
> >  configure | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/configure b/configure
> > index 58862d2ae8..0f168607e8 100755
> > --- a/configure
> > +++ b/configure
> > @@ -6201,7 +6201,7 @@ if test "$mingw32" = "yes" ; then
> >  echo "WIN_SDK=\"$win_sdk\"" >> $config_host_mak
> >fi
> >if test "$guest_agent_ntddscsi" = "yes" ; then
> > -echo "CONFIG_QGA_NTDDDISK=y" >> $config_host_mak
> > +echo "CONFIG_QGA_NTDDSCSI=y" >> $config_host_mak
> >fi
> >if test "$guest_agent_msi" = "yes"; then
> >  echo "QEMU_GA_MSI_ENABLED=yes" >> $config_host_mak
> > --
> > 2.19.0
> >
>


-- 
Respectfully,
Sameeh Jubran
Linkedin
Software Engineer @ Daynix.



Re: [Qemu-devel] [PATCH 7/7] qga-win: changing --retry-path option behavior

2018-10-07 Thread Bishara AbuHattoum
Your advice was taken and implemented, sending changes in the next version.

On Thu, Sep 27, 2018 at 2:48 PM Marc-André Lureau <
marcandre.lur...@gmail.com> wrote:

> Hi
>
> On Thu, Sep 27, 2018 at 11:41 AM Bishara AbuHattoum 
> wrote:
> >
> > Currently whenever the qemu-ga's service doesn't find the virtio-serial
> > the run_agent() loops in a QGA_RETRY_INTERVAL (default 5 seconds)
> > intervals and try to restart the qemu-ga which causes a synchronous loop.
> > Changed to wait and listen for the serial events by registering for
> > notifications a proper serial event handler that deals with events:
> >   DBT_DEVICEARRIVALindicates that the device has been inserted
> and
> >is available
> >   DBT_DEVICEREMOVECOMPLETE indicates that the devive has been removed
> > Which allow us to determine when the channel path is available for the
> > qemu-ga to restart.
> >
> > Signed-off-by: Bishara AbuHattoum 
> > Signed-off-by: Sameeh Jubran 
>
> looks good overall,
>
> > ---
> >  qga/main.c  | 101 +++-
> >  qga/service-win32.h |   4 ++
> >  2 files changed, 104 insertions(+), 1 deletion(-)
> >
> > diff --git a/qga/main.c b/qga/main.c
> > index 6a152eb3a7..215dcb36f1 100644
> > --- a/qga/main.c
> > +++ b/qga/main.c
> > @@ -34,6 +34,7 @@
> >  #include "qemu/systemd.h"
> >  #include "qemu-version.h"
> >  #ifdef _WIN32
> > +#include 
> >  #include "qga/service-win32.h"
> >  #include "qga/vss-win32.h"
> >  #endif
> > @@ -101,6 +102,8 @@ struct GAState {
> >  bool logging_enabled;
> >  #ifdef _WIN32
> >  GAService service;
> > +HANDLE channel_available_event;
> > +HANDLE stop_event;
> >  #endif
> >  bool delimit_response;
> >  bool frozen;
> > @@ -137,6 +140,7 @@ static const char *ga_freeze_whitelist[] = {
> >  #ifdef _WIN32
> >  DWORD WINAPI service_ctrl_handler(DWORD ctrl, DWORD type, LPVOID data,
> >LPVOID ctx);
> > +DWORD WINAPI handle_serial_device_events(DWORD type, LPVOID data);
> >  VOID WINAPI service_main(DWORD argc, TCHAR *argv[]);
> >  #endif
> >  static int run_agent(GAState *s);
> > @@ -729,6 +733,36 @@ static gboolean channel_init(GAState *s, const
> gchar *method, const gchar *path,
> >  }
> >
> >  #ifdef _WIN32
> > +DWORD WINAPI handle_serial_device_events(DWORD type, LPVOID data)
> > +{
> > +DWORD ret = NO_ERROR;
> > +PDEV_BROADCAST_HDR broadcast_header = (PDEV_BROADCAST_HDR)data;
> > +
> > +if (broadcast_header->dbch_devicetype ==
> DBT_DEVTYP_DEVICEINTERFACE) {
> > +switch (type) {
> > +/* Device inserted */
> > +case DBT_DEVICEARRIVAL:
> > +/* Start QEMU-ga's service */
> > +if (!SetEvent(ga_state->channel_available_event)) {
> > +ret = GetLastError();
> > +}
> > +break;
> > +/* Device removed */
> > +case DBT_DEVICEQUERYREMOVE:
> > +case DBT_DEVICEREMOVEPENDING:
> > +case DBT_DEVICEREMOVECOMPLETE:
> > +/* Stop QEMU-ga's service */
> > +if (!ResetEvent(ga_state->channel_available_event)) {
> > +ret = GetLastError();
> > +}
> > +break;
> > +default:
> > +ret = ERROR_CALL_NOT_IMPLEMENTED;
> > +}
> > +}
> > +return ret;
> > +}
> > +
> >  DWORD WINAPI service_ctrl_handler(DWORD ctrl, DWORD type, LPVOID data,
> >LPVOID ctx)
> >  {
> > @@ -740,9 +774,13 @@ DWORD WINAPI service_ctrl_handler(DWORD ctrl, DWORD
> type, LPVOID data,
> >  case SERVICE_CONTROL_STOP:
> >  case SERVICE_CONTROL_SHUTDOWN:
> >  quit_handler(SIGTERM);
> > +SetEvent(ga_state->stop_event);
> >  service->status.dwCurrentState = SERVICE_STOP_PENDING;
> >  SetServiceStatus(service->status_handle, &service->status);
> >  break;
> > +case SERVICE_CONTROL_DEVICEEVENT:
> > +handle_serial_device_events(type, data);
> > +break;
> >
> >  default:
> >  ret = ERROR_CALL_NOT_IMPLEMENTED;
> > @@ -769,10 +807,25 @@ VOID WINAPI service_main(DWORD argc, TCHAR *argv[])
> >  service->status.dwServiceSpecificExitCode = NO_ERROR;
> >  service->status.dwCheckPoint = 0;
> >  service->status.dwWaitHint = 0;
> > +DEV_BROADCAST_DEVICEINTERFACE notification_filter;
> > +ZeroMemory(¬ification_filter, sizeof(notification_filter));
> > +notification_filter.dbcc_devicetype = DBT_DEVTYP_DEVICEINTERFACE;
> > +notification_filter.dbcc_size =
> sizeof(DEV_BROADCAST_DEVICEINTERFACE);
> > +notification_filter.dbcc_classguid = GUID_VIOSERIAL_PORT;
> > +
> > +service->device_notification_handle =
> > +RegisterDeviceNotification(service->status_handle,
> > +¬ification_filter, DEVICE_NOTIFY_SERVICE_HANDLE);
> > +if (!service->device_notification_handle) {
> > +g_critical("

Re: [Qemu-devel] [PATCH 3/7] qga: move w32 service handling out of run_agent()

2018-10-07 Thread Bishara AbuHattoum
As I already clarified in the cover letter, I will fix the typo in the
commit message and drop the message ID from all the patches, changes will
be in the next version.

On Thu, Sep 27, 2018 at 2:45 PM Marc-André Lureau <
marcandre.lur...@gmail.com> wrote:

> Hi
>
> On Thu, Sep 27, 2018 at 11:39 AM Bishara AbuHattoum 
> wrote:
> >
> > From: Michael Roth 
> >
> > Eventually we want a w32 service to be able to restart the qga main
> > loop from within service_main(). To allow for this we move service
> > handling out of run_agent() such that service_main() calls
> > run_agent() instead of the reverse.
> >
> > Signed-off-by: Michael Roth 
> > Message-Id: <20171026233054.21133-4-mdr...@linux.vnet.ibm.com>
>
> Same remark about message-id
>
> > ---
> >  qga/main.c | 25 ++---
> >  1 file changed, 14 insertions(+), 11 deletions(-)
> >
> > diff --git a/qga/main.c b/qga/main.c
> > index 7381053a0f..d9888bff5f 100644
> > --- a/qga/main.c
> > +++ b/qga/main.c
> > @@ -136,6 +136,7 @@ DWORD WINAPI service_ctrl_handler(DWORD ctrl, DWORD
> type, LPVOID data,
> >LPVOID ctx);
> >  VOID WINAPI service_main(DWORD argc, TCHAR *argv[]);
> >  #endif
> > +static int run_agent(GAState *s);
> >
> >  static void
> >  init_dfl_pathnames(void)
> > @@ -763,7 +764,7 @@ VOID WINAPI service_main(DWORD argc, TCHAR *argv[])
> >  service->status.dwWaitHint = 0;
> >  SetServiceStatus(service->status_handle, &service->status);
> >
> > -g_main_loop_run(ga_state->main_loop);
> > +run_agent(ga_state);
> >
> >  service->status.dwCurrentState = SERVICE_STOPPED;
> >  SetServiceStatus(service->status_handle, &service->status);
> > @@ -1372,17 +1373,8 @@ static int run_agent(GAState *s)
> >  g_critical("failed to initialize guest agent channel");
> >  return EXIT_FAILURE;
> >  }
> > -#ifndef _WIN32
> > +
> >  g_main_loop_run(ga_state->main_loop);
> > -#else
> > -if (config->daemonize) {
> > -SERVICE_TABLE_ENTRY service_table[] = {
> > -{ (char *)QGA_SERVICE_NAME, service_main }, { NULL, NULL }
> };
> > -StartServiceCtrlDispatcher(service_table);
> > -} else {
> > -g_main_loop_run(ga_state->main_loop);
> > -}
> > -#endif
> >
> >  return EXIT_SUCCESS;
> >  }
> > @@ -1468,7 +1460,18 @@ int main(int argc, char **argv)
> >  g_critical("error initializing guest agent");
> >  goto end;
> >  }
> > +
> > +#ifdef _WIN32
> > +if (config->daemonize) {
> > +SERVICE_TABLE_ENTRY service_table[] = {
> > +{ (char *)QGA_SERVICE_NAME, service_main }, { NULL, NULL }
> };
> > +StartServiceCtrlDispatcher(service_table);
> > +} else {
> > +ret = run_agent(s);
> > +}
> > +#endif
> >  ret = run_agent(s);
>
> run_agent() is called twice.
>
> > +
> >  cleanup_agent(s);
> >
> >  end:
> > --
> > 2.17.0
> >
> >
>
>
> --
> Marc-André Lureau
>


Re: [Qemu-devel] [PATCH 1/7] qga: group agent init/cleanup init separate routines

2018-10-07 Thread Bishara AbuHattoum
As I already clarified in the cover letter, I will fix the typo in the
commit message and drop the message ID from all the patches, changes will
be in the next version.

On Thu, Sep 27, 2018 at 2:44 PM Marc-André Lureau <
marcandre.lur...@gmail.com> wrote:

> Hi
>
> On Thu, Sep 27, 2018 at 11:37 AM Bishara AbuHattoum 
> wrote:
> >
> > From: Michael Roth 
> >
> > This patch better separates the init/cleanup routines out into
> > separate functions to make make the start-up procedure a bit easier
>
> "make make"
>
> > to follow. This will be useful when we eventually break out the
> > actual start/stop of the agent's main loop into separates routines
> > that can be called multiple times after the init phase.
> >
> > Signed-off-by: Michael Roth 
> > Message-Id: <20171026233054.21133-2-mdr...@linux.vnet.ibm.com>
>
> I don't know if it make sense to retain the original Message-Id.
>
> it is not as good as it used to wrt to cleaning up on error, but that
> may be acceptable..
> Reviewed-by: Marc-André Lureau 
>
> > ---
> >  qga/main.c | 82 +-
> >  1 file changed, 50 insertions(+), 32 deletions(-)
> >
> > diff --git a/qga/main.c b/qga/main.c
> > index 6d70242d05..7b7b837a14 100644
> > --- a/qga/main.c
> > +++ b/qga/main.c
> > @@ -1245,9 +1245,21 @@ static bool check_is_frozen(GAState *s)
> >  return false;
> >  }
> >
> > -static int run_agent(GAState *s, GAConfig *config, int
> socket_activation)
> > +static GAState *initialize_agent(GAConfig *config)
> >  {
> > -ga_state = s;
> > +GAState *s = g_new0(GAState, 1);
> > +
> > +g_assert(ga_state == NULL);
> > +
> > +s->log_level = config->log_level;
> > +s->log_file = stderr;
> > +#ifdef CONFIG_FSFREEZE
> > +s->fsfreeze_hook = config->fsfreeze_hook;
> > +#endif
> > +s->pstate_filepath = g_strdup_printf("%s/qga.state",
> config->state_dir);
> > +s->state_filepath_isfrozen =
> g_strdup_printf("%s/qga.state.isfrozen",
> > + config->state_dir);
> > +s->frozen = check_is_frozen(s);
> >
> >  g_log_set_default_handler(ga_log, s);
> >  g_log_set_fatal_mask(NULL, G_LOG_LEVEL_ERROR);
> > @@ -1263,7 +1275,7 @@ static int run_agent(GAState *s, GAConfig *config,
> int socket_activation)
> >  if (g_mkdir_with_parents(config->state_dir, S_IRWXU) == -1) {
> >  g_critical("unable to create (an ancestor of) the state
> directory"
> > " '%s': %s", config->state_dir, strerror(errno));
> > -return EXIT_FAILURE;
> > +return NULL;
> >  }
> >  #endif
> >
> > @@ -1288,7 +1300,7 @@ static int run_agent(GAState *s, GAConfig *config,
> int socket_activation)
> >  if (!log_file) {
> >  g_critical("unable to open specified log file: %s",
> > strerror(errno));
> > -return EXIT_FAILURE;
> > +return NULL;
> >  }
> >  s->log_file = log_file;
> >  }
> > @@ -1299,7 +1311,7 @@ static int run_agent(GAState *s, GAConfig *config,
> int socket_activation)
> > s->pstate_filepath,
> > ga_is_frozen(s))) {
> >  g_critical("failed to load persistent state");
> > -return EXIT_FAILURE;
> > +return NULL;
> >  }
> >
> >  config->blacklist = ga_command_blacklist_init(config->blacklist);
> > @@ -1320,12 +1332,37 @@ static int run_agent(GAState *s, GAConfig
> *config, int socket_activation)
> >  #ifndef _WIN32
> >  if (!register_signal_handlers()) {
> >  g_critical("failed to register signal handlers");
> > -return EXIT_FAILURE;
> > +return NULL;
> >  }
> >  #endif
> >
> >  s->main_loop = g_main_loop_new(NULL, false);
> >
> > +ga_state = s;
> > +return s;
> > +}
> > +
> > +static void cleanup_agent(GAState *s)
> > +{
> > +if (s->command_state) {
> > +ga_command_state_cleanup_all(s->command_state);
> > +ga_command_state_free(s->command_state);
> > +json_message_parser_destroy(&s->parser);
> > +}
> > +if (s->channel) {
> > +ga_channel_free(s->channel);
> > +}
> > +g_free(s->pstate_filepath);
> > +g_free(s->state_filepath_isfrozen);
> > +if (s->main_loop) {
> > +g_main_loop_unref(s->main_loop);
> > +}
> > +g_free(s);
> > +ga_state = NULL;
> > +}
> > +
> > +static int run_agent(GAState *s, GAConfig *config, int
> socket_activation)
> > +{
> >  if (!channel_init(ga_state, config->method, config->channel_path,
> >socket_activation ? FIRST_SOCKET_ACTIVATION_FD :
> -1)) {
> >  g_critical("failed to initialize guest agent channel");
> > @@ -1349,7 +1386,7 @@ static int run_agent(GAState *s, GAConfig *config,
> int socket_activation)
> >  int main(int argc, char **argv)
> >  {
> >  int ret = EXIT_SUCCESS;
> > -GAState *s = g_new0(GAState, 1);
> > +GA

Re: [Qemu-devel] Hotplug handler

2018-10-07 Thread Sameeh Jubran
This is the command line. All of the devices are wired to pci.0, there is
no pci bridge.

According to this,  Integrated Endpoints are not hot-pluggable. However I
can still use device_del to delete a device and device_add to add e1000
with no issues.
https://github.com/qemu/qemu/blob/master/docs/pcie.txt#L37

qemu-system-x86_64 \
-device
e1000,netdev=hostnet0,mac=56:cc:c1:01:cc:21,id=cc1_71,primary=cc1_72 \
-netdev
tap,vhost=on,id=hostnet1,script=world_bridge_standalone.sh,downscript=no,ifname=cc1_72,queues=4
\
-device
virtio-net,host_mtu=1500,netdev=hostnet1,mac=56:cc:c1:04:2c:21,id=cc1_72,vectors=10,mq=on,standby=cc1_71
\
-netdev
tap,id=hostnet0,script=world_bridge_standalone.sh,downscript=no,ifname=cc1_71
\
-enable-kvm \
-name netkvm \
-m 1000M \
-snapshot \
-smp 4 \
-drive file=windows_10_enterprise_x64_netkvm_dev.qcow2,if=ide,id=drivex \
-global PIIX4_PM.disable_s3=0 \
-global PIIX4_PM.disable_s4=0 \
-usbdevice tablet \
-vga qxl \
-spice port=6110,disable-ticketing \
-device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x7
\
-chardev spicevmc,name=vdagent,id=vdagent \
-device
virtserialport,nr=1,bus=virtio-serial0.0,chardev=vdagent,name=com.redhat.spice.0
\
-chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 \
-device virtio-serial \
-device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0 \
-monitor stdio


On Thu, Oct 4, 2018 at 2:20 PM Igor Mammedov  wrote:

> On Wed, 3 Oct 2018 19:50:58 +0300
> Sameeh Jubran  wrote:
>
> > Hi all,
> >
> > I am trying to get the hotplug handler of a pci device in Qemu using
> > "qdev_get_hotplug_handler" function. This function simply tries to get
> > the hotplug handler from the parent bus. For some reason it's always
> > null. Why it is not initialized?
> >
> > Thanks!
> >
>
> what's used qemu command line?
>


-- 
Respectfully,
*Sameeh Jubran*
*Linkedin *
*Software Engineer @ Daynix .*


[Qemu-devel] [PATCH v2 0/7] qga: add support for re-opening channel on error

2018-10-07 Thread Bishara AbuHattoum
Changes from v1:
 [1/7] qga: group agent init/cleanup init separate routines
 Fixed typo in the commit message and dropped the unnecessary message ID
 [2/7] qga: hang GAConfig/socket_activation off of GAState global
 Dropped the unnecessary message ID
 [3/7] qga: move w32 service handling out of run_agent()
 Fixed run_agent() extra call and dropped the unnecessary message ID 
 [4/7] qga: add --retry-path option for re-initializing channel on failure
 Dropped the unnecessary message ID
 [5/7] qga-win: install service with --retry-path set by default
 Dropped the unnecessary message ID
 [6/7] qga-win: report specific error when failing to open channel
 Dropped the unnecessary message ID
 [7/7] qga-win: changing --retry-path option behavior
 Dropped the use of two events and WaitForMultipleObjects, and used
 one 'wakeup_event'.

A patch series was firstly introduced about a year ago which resolves an issue
with qemu-ga and virtio-serial functionality. The issue was discussed in the
following thread:
  https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg06256.html

In short, Sameeh sent an implementation which uses udev on Linux and device
events API on Windows. Since udev API is only supported for kernels 2.6+ it is
not a good approach and Michael suggested his alternative series which uses a
loop and checks if the serial is present or not. Since the Windows API is
supported and has backward compatibility up until Windows xp, there is no reason
for not using the Windows API. It was finally agreed by Michael and Sameeh that
a combination of both approaches should be used.

This series does just that, it rebases Michael's patches on top of upstream and
merges Sameeh's patches for Windows API into Michael's implementation.

Michael's series:
  https://github.com/mdroth/qemu/commits/qga-retry-path
Sameeh's series:
  https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02400.html
  https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02399.html
  https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02401.html

Bishara AbuHattoum (1):
  qga-win: changing --retry-path option behavior

Michael Roth (6):
  qga: group agent init/cleanup init separate routines
  qga: hang GAConfig/socket_activation off of GAState global
  qga: move w32 service handling out of run_agent()
  qga: add --retry-path option for re-initializing channel on failure
  qga-win: install service with --retry-path set by default
  qga-win: report specific error when failing to open channel

 qga/channel-win32.c   |   3 +-
 qga/installer/qemu-ga.wxs |   2 +-
 qga/main.c| 288 +-
 qga/service-win32.h   |   4 +
 4 files changed, 229 insertions(+), 68 deletions(-)

-- 
2.17.0




[Qemu-devel] [PATCH v2 2/7] qga: hang GAConfig/socket_activation off of GAState global

2018-10-07 Thread Bishara AbuHattoum
From: Michael Roth 

For w32 services we rely on the global GAState to access resources
associated with the agent within service_main(). Currently this is
sufficient for starting the agent since we open the channel once prior
to calling service_main(), and simply start the GMainLoop to start the
agent from within service_main().

Eventually we want to be able to also [re-]open the communication
channel from within service_main(), which requires access to
config/socket_activation variables, so we hang them off GAState in
preparation for that.

Signed-off-by: Michael Roth 
Signed-off-by: Sameeh Jubran 
---
 qga/main.c | 54 +-
 1 file changed, 29 insertions(+), 25 deletions(-)

diff --git a/qga/main.c b/qga/main.c
index fd02086bfc..9f4dc0b2c5 100644
--- a/qga/main.c
+++ b/qga/main.c
@@ -69,6 +69,25 @@ typedef struct GAPersistentState {
 int64_t fd_counter;
 } GAPersistentState;
 
+typedef struct GAConfig {
+char *channel_path;
+char *method;
+char *log_filepath;
+char *pid_filepath;
+#ifdef CONFIG_FSFREEZE
+char *fsfreeze_hook;
+#endif
+char *state_dir;
+#ifdef _WIN32
+const char *service;
+#endif
+gchar *bliststr; /* blacklist may point to this string */
+GList *blacklist;
+int daemonize;
+GLogLevelFlags log_level;
+int dumpconf;
+} GAConfig;
+
 struct GAState {
 JSONMessageParser parser;
 GMainLoop *main_loop;
@@ -94,6 +113,8 @@ struct GAState {
 #endif
 gchar *pstate_filepath;
 GAPersistentState pstate;
+GAConfig *config;
+int socket_activation;
 };
 
 struct GAState *ga_state;
@@ -905,25 +926,6 @@ static GList *split_list(const gchar *str, const gchar 
*delim)
 return list;
 }
 
-typedef struct GAConfig {
-char *channel_path;
-char *method;
-char *log_filepath;
-char *pid_filepath;
-#ifdef CONFIG_FSFREEZE
-char *fsfreeze_hook;
-#endif
-char *state_dir;
-#ifdef _WIN32
-const char *service;
-#endif
-gchar *bliststr; /* blacklist may point to this string */
-GList *blacklist;
-int daemonize;
-GLogLevelFlags log_level;
-int dumpconf;
-} GAConfig;
-
 static void config_load(GAConfig *config)
 {
 GError *gerr = NULL;
@@ -1211,7 +1213,7 @@ static bool check_is_frozen(GAState *s)
 return false;
 }
 
-static GAState *initialize_agent(GAConfig *config)
+static GAState *initialize_agent(GAConfig *config, int socket_activation)
 {
 GAState *s = g_new0(GAState, 1);
 
@@ -1304,6 +1306,8 @@ static GAState *initialize_agent(GAConfig *config)
 
 s->main_loop = g_main_loop_new(NULL, false);
 
+s->config = config;
+s->socket_activation = socket_activation;
 ga_state = s;
 return s;
 }
@@ -1327,10 +1331,10 @@ static void cleanup_agent(GAState *s)
 ga_state = NULL;
 }
 
-static int run_agent(GAState *s, GAConfig *config, int socket_activation)
+static int run_agent(GAState *s)
 {
-if (!channel_init(ga_state, config->method, config->channel_path,
-  socket_activation ? FIRST_SOCKET_ACTIVATION_FD : -1)) {
+if (!channel_init(s, s->config->method, s->config->channel_path,
+  s->socket_activation ? FIRST_SOCKET_ACTIVATION_FD : -1)) 
{
 g_critical("failed to initialize guest agent channel");
 return EXIT_FAILURE;
 }
@@ -1425,12 +1429,12 @@ int main(int argc, char **argv)
 goto end;
 }
 
-s = initialize_agent(config);
+s = initialize_agent(config, socket_activation);
 if (!s) {
 g_critical("error initializing guest agent");
 goto end;
 }
-ret = run_agent(s, config, socket_activation);
+ret = run_agent(s);
 cleanup_agent(s);
 
 end:
-- 
2.17.0




[Qemu-devel] [PATCH v2 1/7] qga: group agent init/cleanup init separate routines

2018-10-07 Thread Bishara AbuHattoum
From: Michael Roth 

This patch better separates the init/cleanup routines out into
separate functions to make the start-up procedure a bit easier to
follow. This will be useful when we eventually break out the actual
start/stop of the agent's main loop into separates routines that
can be called multiple times after the init phase.

Signed-off-by: Michael Roth 
---
 qga/main.c | 82 +-
 1 file changed, 50 insertions(+), 32 deletions(-)

diff --git a/qga/main.c b/qga/main.c
index c399320d3c..fd02086bfc 100644
--- a/qga/main.c
+++ b/qga/main.c
@@ -1211,9 +1211,21 @@ static bool check_is_frozen(GAState *s)
 return false;
 }
 
-static int run_agent(GAState *s, GAConfig *config, int socket_activation)
+static GAState *initialize_agent(GAConfig *config)
 {
-ga_state = s;
+GAState *s = g_new0(GAState, 1);
+
+g_assert(ga_state == NULL);
+
+s->log_level = config->log_level;
+s->log_file = stderr;
+#ifdef CONFIG_FSFREEZE
+s->fsfreeze_hook = config->fsfreeze_hook;
+#endif
+s->pstate_filepath = g_strdup_printf("%s/qga.state", config->state_dir);
+s->state_filepath_isfrozen = g_strdup_printf("%s/qga.state.isfrozen",
+ config->state_dir);
+s->frozen = check_is_frozen(s);
 
 g_log_set_default_handler(ga_log, s);
 g_log_set_fatal_mask(NULL, G_LOG_LEVEL_ERROR);
@@ -1229,7 +1241,7 @@ static int run_agent(GAState *s, GAConfig *config, int 
socket_activation)
 if (g_mkdir_with_parents(config->state_dir, S_IRWXU) == -1) {
 g_critical("unable to create (an ancestor of) the state directory"
" '%s': %s", config->state_dir, strerror(errno));
-return EXIT_FAILURE;
+return NULL;
 }
 #endif
 
@@ -1254,7 +1266,7 @@ static int run_agent(GAState *s, GAConfig *config, int 
socket_activation)
 if (!log_file) {
 g_critical("unable to open specified log file: %s",
strerror(errno));
-return EXIT_FAILURE;
+return NULL;
 }
 s->log_file = log_file;
 }
@@ -1265,7 +1277,7 @@ static int run_agent(GAState *s, GAConfig *config, int 
socket_activation)
s->pstate_filepath,
ga_is_frozen(s))) {
 g_critical("failed to load persistent state");
-return EXIT_FAILURE;
+return NULL;
 }
 
 config->blacklist = ga_command_blacklist_init(config->blacklist);
@@ -1286,12 +1298,37 @@ static int run_agent(GAState *s, GAConfig *config, int 
socket_activation)
 #ifndef _WIN32
 if (!register_signal_handlers()) {
 g_critical("failed to register signal handlers");
-return EXIT_FAILURE;
+return NULL;
 }
 #endif
 
 s->main_loop = g_main_loop_new(NULL, false);
 
+ga_state = s;
+return s;
+}
+
+static void cleanup_agent(GAState *s)
+{
+if (s->command_state) {
+ga_command_state_cleanup_all(s->command_state);
+ga_command_state_free(s->command_state);
+json_message_parser_destroy(&s->parser);
+}
+if (s->channel) {
+ga_channel_free(s->channel);
+}
+g_free(s->pstate_filepath);
+g_free(s->state_filepath_isfrozen);
+if (s->main_loop) {
+g_main_loop_unref(s->main_loop);
+}
+g_free(s);
+ga_state = NULL;
+}
+
+static int run_agent(GAState *s, GAConfig *config, int socket_activation)
+{
 if (!channel_init(ga_state, config->method, config->channel_path,
   socket_activation ? FIRST_SOCKET_ACTIVATION_FD : -1)) {
 g_critical("failed to initialize guest agent channel");
@@ -1315,7 +1352,7 @@ static int run_agent(GAState *s, GAConfig *config, int 
socket_activation)
 int main(int argc, char **argv)
 {
 int ret = EXIT_SUCCESS;
-GAState *s = g_new0(GAState, 1);
+GAState *s;
 GAConfig *config = g_new0(GAConfig, 1);
 int socket_activation;
 
@@ -1383,44 +1420,25 @@ int main(int argc, char **argv)
 }
 }
 
-s->log_level = config->log_level;
-s->log_file = stderr;
-#ifdef CONFIG_FSFREEZE
-s->fsfreeze_hook = config->fsfreeze_hook;
-#endif
-s->pstate_filepath = g_strdup_printf("%s/qga.state", config->state_dir);
-s->state_filepath_isfrozen = g_strdup_printf("%s/qga.state.isfrozen",
- config->state_dir);
-s->frozen = check_is_frozen(s);
-
 if (config->dumpconf) {
 config_dump(config);
 goto end;
 }
 
+s = initialize_agent(config);
+if (!s) {
+g_critical("error initializing guest agent");
+goto end;
+}
 ret = run_agent(s, config, socket_activation);
+cleanup_agent(s);
 
 end:
-if (s->command_state) {
-ga_command_state_cleanup_all(s->command_state);
-ga_command_state_free(s->command_state);
-json_message_parser_destroy(&s->parser);
-}
-if (s->chan

Re: [Qemu-devel] [PATCH v4 11/11] qga-win: demystify namespace striping

2018-10-07 Thread Sameeh Jubran
+1, this is much clearer.
On Thu, Oct 4, 2018 at 4:34 PM Marc-André Lureau
 wrote:
>
> Hi
>
> On Thu, Oct 4, 2018 at 3:22 PM Tomáš Golembiovský  wrote:
> >
> > It was not obvious what exactly the cryptic string copying does to the
> > GUID. This change makes the intent clearer.
> >
> > Signed-off-by: Tomáš Golembiovský 
>
> Reviewed-by: Marc-André Lureau 
>
> > ---
> >  qga/commands-win32.c | 9 -
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/qga/commands-win32.c b/qga/commands-win32.c
> > index d0d969d0ce..82881aa749 100644
> > --- a/qga/commands-win32.c
> > +++ b/qga/commands-win32.c
> > @@ -507,7 +507,14 @@ static GuestPCIAddress *get_pci_info(char *guid, Error 
> > **errp)
> >  char dev_name[MAX_PATH];
> >  char *buffer = NULL;
> >  GuestPCIAddress *pci = NULL;
> > -char *name = g_strdup(&guid[4]);
> > +char *name = NULL;
> > +
> > +if ((g_str_has_prefix(guid, ".\\") == TRUE) ||
> > +(g_str_has_prefix(guid, "?\\") == TRUE)) {
> > +name = g_strdup(&guid[4]);
>
> I find "guid + 4" easier to read though
>
> > +} else {
> > +name = g_strdup(guid);
> > +}
> >
> >  if (!QueryDosDevice(name, dev_name, ARRAY_SIZE(dev_name))) {
> >  error_setg_win32(errp, GetLastError(), "failed to get dos device 
> > name");
> > --
> > 2.19.0
> >
>


-- 
Respectfully,
Sameeh Jubran
Linkedin
Software Engineer @ Daynix.



[Qemu-devel] [PATCH v2 5/7] qga-win: install service with --retry-path set by default

2018-10-07 Thread Bishara AbuHattoum
From: Michael Roth 

It's nicer from a management perspective that the agent can survive
hotplug/unplug of the channel device, or be started prior to the
installation of the channel device's driver without and still be able
to resume normal function afterward. On linux there are alternatives
like systemd to support this, but on w32 --retry-path is the only
option so it makes sense to set it by default when installed as a
w32 service.

Signed-off-by: Michael Roth 
---
 qga/installer/qemu-ga.wxs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qga/installer/qemu-ga.wxs b/qga/installer/qemu-ga.wxs
index f751a7e9f7..64bf90bd85 100644
--- a/qga/installer/qemu-ga.wxs
+++ b/qga/installer/qemu-ga.wxs
@@ -78,7 +78,7 @@
   Account="LocalSystem"
   ErrorControl="ignore"
   Interactive="no"
-  Arguments="-d"
+  Arguments="-d --retry-path"
   >
 
 
-- 
2.17.0




[Qemu-devel] [PATCH v2 3/7] qga: move w32 service handling out of run_agent()

2018-10-07 Thread Bishara AbuHattoum
From: Michael Roth 

Eventually we want a w32 service to be able to restart the qga main
loop from within service_main(). To allow for this we move service
handling out of run_agent() such that service_main() calls
run_agent() instead of the reverse.

Signed-off-by: Michael Roth 
Signed-off-by: Bishara AbuHattoum 
---
 qga/main.c | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/qga/main.c b/qga/main.c
index 9f4dc0b2c5..23a0a46b84 100644
--- a/qga/main.c
+++ b/qga/main.c
@@ -136,6 +136,7 @@ DWORD WINAPI service_ctrl_handler(DWORD ctrl, DWORD type, 
LPVOID data,
   LPVOID ctx);
 VOID WINAPI service_main(DWORD argc, TCHAR *argv[]);
 #endif
+static int run_agent(GAState *s);
 
 static void
 init_dfl_pathnames(void)
@@ -729,7 +730,7 @@ VOID WINAPI service_main(DWORD argc, TCHAR *argv[])
 service->status.dwWaitHint = 0;
 SetServiceStatus(service->status_handle, &service->status);
 
-g_main_loop_run(ga_state->main_loop);
+run_agent(ga_state);
 
 service->status.dwCurrentState = SERVICE_STOPPED;
 SetServiceStatus(service->status_handle, &service->status);
@@ -1338,17 +1339,8 @@ static int run_agent(GAState *s)
 g_critical("failed to initialize guest agent channel");
 return EXIT_FAILURE;
 }
-#ifndef _WIN32
+
 g_main_loop_run(ga_state->main_loop);
-#else
-if (config->daemonize) {
-SERVICE_TABLE_ENTRY service_table[] = {
-{ (char *)QGA_SERVICE_NAME, service_main }, { NULL, NULL } };
-StartServiceCtrlDispatcher(service_table);
-} else {
-g_main_loop_run(ga_state->main_loop);
-}
-#endif
 
 return EXIT_SUCCESS;
 }
@@ -1434,7 +1426,19 @@ int main(int argc, char **argv)
 g_critical("error initializing guest agent");
 goto end;
 }
+
+#ifdef _WIN32
+if (config->daemonize) {
+SERVICE_TABLE_ENTRY service_table[] = {
+{ (char *)QGA_SERVICE_NAME, service_main }, { NULL, NULL } };
+StartServiceCtrlDispatcher(service_table);
+} else {
+ret = run_agent(s);
+}
+#else
 ret = run_agent(s);
+#endif
+
 cleanup_agent(s);
 
 end:
-- 
2.17.0




[Qemu-devel] [PATCH v2 4/7] qga: add --retry-path option for re-initializing channel on failure

2018-10-07 Thread Bishara AbuHattoum
From: Michael Roth 

This adds an option to instruct the agent to periodically attempt
re-opening the communication channel after a channel error has
occurred. The main use-case for this is providing an OS-independent
way of allowing the agent to survive situations like hotplug/unplug of
the communication channel, or initial guest set up where the agent may
be installed/started prior to the installation of the channel device's
driver.

There are nicer ways of implementing this functionality via things
like systemd services, but this option is useful for platforms like
*BSD/w32.

Currently a channel error will result in the GSource for that channel
being removed from the GMainLoop, but the main loop continuing to run.
That behavior results in a dead loop when --retry-path isn't set, and
prevents us from knowing when to attempt re-opening the channel when
it is set, so we also force the loop to exit as part of this patch.

Signed-off-by: Michael Roth 
---
 qga/main.c | 62 +++---
 1 file changed, 54 insertions(+), 8 deletions(-)

diff --git a/qga/main.c b/qga/main.c
index 23a0a46b84..f359499aaa 100644
--- a/qga/main.c
+++ b/qga/main.c
@@ -58,6 +58,7 @@
 #endif
 #define QGA_SENTINEL_BYTE 0xFF
 #define QGA_CONF_DEFAULT CONFIG_QEMU_CONFDIR G_DIR_SEPARATOR_S "qemu-ga.conf"
+#define QGA_RETRY_INTERVAL 5
 
 static struct {
 const char *state_dir;
@@ -86,6 +87,7 @@ typedef struct GAConfig {
 int daemonize;
 GLogLevelFlags log_level;
 int dumpconf;
+bool retry_path;
 } GAConfig;
 
 struct GAState {
@@ -115,6 +117,7 @@ struct GAState {
 GAPersistentState pstate;
 GAConfig *config;
 int socket_activation;
+bool force_exit;
 };
 
 struct GAState *ga_state;
@@ -137,6 +140,7 @@ DWORD WINAPI service_ctrl_handler(DWORD ctrl, DWORD type, 
LPVOID data,
 VOID WINAPI service_main(DWORD argc, TCHAR *argv[]);
 #endif
 static int run_agent(GAState *s);
+static void stop_agent(GAState *s, bool requested);
 
 static void
 init_dfl_pathnames(void)
@@ -185,9 +189,7 @@ static void quit_handler(int sig)
 }
 g_debug("received signal num %d, quitting", sig);
 
-if (g_main_loop_is_running(ga_state->main_loop)) {
-g_main_loop_quit(ga_state->main_loop);
-}
+stop_agent(ga_state, true);
 }
 
 #ifndef _WIN32
@@ -272,6 +274,10 @@ QEMU_COPYRIGHT "\n"
 "to list available RPCs)\n"
 "  -D, --dump-conf   dump a qemu-ga config file based on current config\n"
 "options / command-line parameters to stdout\n"
+"  -r, --retry-path  attempt re-opening path if it's unavailable or closed\n"
+"due to an error which may be recoverable in the future\n"
+"(virtio-serial driver re-install, serial device hot\n"
+"plug/unplug, etc.)\n"
 "  -h, --helpdisplay this help and exit\n"
 "\n"
 QEMU_HELP_BOTTOM "\n"
@@ -631,6 +637,7 @@ static gboolean channel_event_cb(GIOCondition condition, 
gpointer data)
 switch (status) {
 case G_IO_STATUS_ERROR:
 g_warning("error reading channel");
+stop_agent(s, false);
 return false;
 case G_IO_STATUS_NORMAL:
 buf[count] = 0;
@@ -974,6 +981,10 @@ static void config_load(GAConfig *config)
 /* enable all log levels */
 config->log_level = G_LOG_LEVEL_MASK;
 }
+if (g_key_file_has_key(keyfile, "general", "retry-path", NULL)) {
+config->retry_path =
+g_key_file_get_boolean(keyfile, "general", "retry-path", &gerr);
+}
 if (g_key_file_has_key(keyfile, "general", "blacklist", NULL)) {
 config->bliststr =
 g_key_file_get_string(keyfile, "general", "blacklist", &gerr);
@@ -1035,6 +1046,8 @@ static void config_dump(GAConfig *config)
 g_key_file_set_string(keyfile, "general", "statedir", config->state_dir);
 g_key_file_set_boolean(keyfile, "general", "verbose",
config->log_level == G_LOG_LEVEL_MASK);
+g_key_file_set_boolean(keyfile, "general", "retry-path",
+   config->retry_path);
 tmp = list_join(config->blacklist, ',');
 g_key_file_set_string(keyfile, "general", "blacklist", tmp);
 g_free(tmp);
@@ -1053,7 +1066,7 @@ static void config_dump(GAConfig *config)
 
 static void config_parse(GAConfig *config, int argc, char **argv)
 {
-const char *sopt = "hVvdm:p:l:f:F::b:s:t:D";
+const char *sopt = "hVvdm:p:l:f:F::b:s:t:Dr";
 int opt_ind = 0, ch;
 const struct option lopt[] = {
 { "help", 0, NULL, 'h' },
@@ -1073,6 +1086,7 @@ static void config_parse(GAConfig *config, int argc, char 
**argv)
 { "service", 1, NULL, 's' },
 #endif
 { "statedir", 1, NULL, 't' },
+{ "retry-path", 0, NULL, 'r' },
 { NULL, 0, NULL, 0 }
 };
 
@@ -1117,6 +1131,9 @@ static void config_parse(GAConfig *config, int argc, char 
**argv)
 case 'D':
 config->dumpconf = 1;
 break;
+ 

[Qemu-devel] [PATCH v2 6/7] qga-win: report specific error when failing to open channel

2018-10-07 Thread Bishara AbuHattoum
From: Michael Roth 

Useful in general, but especially now that errors might occur more
frequently with --retry-path set.

Signed-off-by: Michael Roth 
---
 qga/channel-win32.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/qga/channel-win32.c b/qga/channel-win32.c
index b3597a8a0f..c86f4388db 100644
--- a/qga/channel-win32.c
+++ b/qga/channel-win32.c
@@ -302,7 +302,8 @@ static gboolean ga_channel_open(GAChannel *c, 
GAChannelMethod method,
OPEN_EXISTING,
FILE_FLAG_NO_BUFFERING | FILE_FLAG_OVERLAPPED, 
NULL);
 if (c->handle == INVALID_HANDLE_VALUE) {
-g_critical("error opening path %s", newpath);
+g_critical("error opening path %s: %s", newpath,
+   g_win32_error_message(GetLastError()));
 return false;
 }
 
-- 
2.17.0




[Qemu-devel] [PATCH v2 7/7] qga-win: changing --retry-path option behavior

2018-10-07 Thread Bishara AbuHattoum
Currently whenever the qemu-ga's service doesn't find the virtio-serial
the run_agent() loops in a QGA_RETRY_INTERVAL (default 5 seconds)
intervals and try to restart the qemu-ga which causes a synchronous loop.
Changed to wait and listen for the serial events by registering for
notifications a proper serial event handler that deals with events:
  DBT_DEVICEARRIVALindicates that the device has been inserted and
   is available
  DBT_DEVICEREMOVECOMPLETE indicates that the devive has been removed
Which allow us to determine when the channel path is available for the
qemu-ga to restart.

Signed-off-by: Bishara AbuHattoum 
Signed-off-by: Sameeh Jubran 
---
 qga/main.c  | 86 -
 qga/service-win32.h |  4 +++
 2 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/qga/main.c b/qga/main.c
index f359499aaa..c5bb063b1c 100644
--- a/qga/main.c
+++ b/qga/main.c
@@ -34,6 +34,7 @@
 #include "qemu/systemd.h"
 #include "qemu-version.h"
 #ifdef _WIN32
+#include 
 #include "qga/service-win32.h"
 #include "qga/vss-win32.h"
 #endif
@@ -101,6 +102,7 @@ struct GAState {
 bool logging_enabled;
 #ifdef _WIN32
 GAService service;
+HANDLE wakeup_event;
 #endif
 bool delimit_response;
 bool frozen;
@@ -137,6 +139,7 @@ static const char *ga_freeze_whitelist[] = {
 #ifdef _WIN32
 DWORD WINAPI service_ctrl_handler(DWORD ctrl, DWORD type, LPVOID data,
   LPVOID ctx);
+DWORD WINAPI handle_serial_device_events(DWORD type, LPVOID data);
 VOID WINAPI service_main(DWORD argc, TCHAR *argv[]);
 #endif
 static int run_agent(GAState *s);
@@ -695,6 +698,36 @@ static gboolean channel_init(GAState *s, const gchar 
*method, const gchar *path,
 }
 
 #ifdef _WIN32
+DWORD WINAPI handle_serial_device_events(DWORD type, LPVOID data)
+{
+DWORD ret = NO_ERROR;
+PDEV_BROADCAST_HDR broadcast_header = (PDEV_BROADCAST_HDR)data;
+
+if (broadcast_header->dbch_devicetype == DBT_DEVTYP_DEVICEINTERFACE) {
+switch (type) {
+/* Device inserted */
+case DBT_DEVICEARRIVAL:
+/* Start QEMU-ga's service */
+if (!SetEvent(ga_state->wakeup_event)) {
+ret = GetLastError();
+}
+break;
+/* Device removed */
+case DBT_DEVICEQUERYREMOVE:
+case DBT_DEVICEREMOVEPENDING:
+case DBT_DEVICEREMOVECOMPLETE:
+/* Stop QEMU-ga's service */
+if (!ResetEvent(ga_state->wakeup_event)) {
+ret = GetLastError();
+}
+break;
+default:
+ret = ERROR_CALL_NOT_IMPLEMENTED;
+}
+}
+return ret;
+}
+
 DWORD WINAPI service_ctrl_handler(DWORD ctrl, DWORD type, LPVOID data,
   LPVOID ctx)
 {
@@ -706,9 +739,13 @@ DWORD WINAPI service_ctrl_handler(DWORD ctrl, DWORD type, 
LPVOID data,
 case SERVICE_CONTROL_STOP:
 case SERVICE_CONTROL_SHUTDOWN:
 quit_handler(SIGTERM);
+SetEvent(ga_state->wakeup_event);
 service->status.dwCurrentState = SERVICE_STOP_PENDING;
 SetServiceStatus(service->status_handle, &service->status);
 break;
+case SERVICE_CONTROL_DEVICEEVENT:
+handle_serial_device_events(type, data);
+break;
 
 default:
 ret = ERROR_CALL_NOT_IMPLEMENTED;
@@ -735,10 +772,24 @@ VOID WINAPI service_main(DWORD argc, TCHAR *argv[])
 service->status.dwServiceSpecificExitCode = NO_ERROR;
 service->status.dwCheckPoint = 0;
 service->status.dwWaitHint = 0;
+DEV_BROADCAST_DEVICEINTERFACE notification_filter;
+ZeroMemory(¬ification_filter, sizeof(notification_filter));
+notification_filter.dbcc_devicetype = DBT_DEVTYP_DEVICEINTERFACE;
+notification_filter.dbcc_size = sizeof(DEV_BROADCAST_DEVICEINTERFACE);
+notification_filter.dbcc_classguid = GUID_VIOSERIAL_PORT;
+
+service->device_notification_handle =
+RegisterDeviceNotification(service->status_handle,
+¬ification_filter, DEVICE_NOTIFY_SERVICE_HANDLE);
+if (!service->device_notification_handle) {
+g_critical("Failed to register device notification handle!\n");
+return;
+}
 SetServiceStatus(service->status_handle, &service->status);
 
 run_agent(ga_state);
 
+UnregisterDeviceNotification(service->device_notification_handle);
 service->status.dwCurrentState = SERVICE_STOPPED;
 SetServiceStatus(service->status_handle, &service->status);
 }
@@ -1326,12 +1377,24 @@ static GAState *initialize_agent(GAConfig *config, int 
socket_activation)
 
 s->config = config;
 s->socket_activation = socket_activation;
+
+#ifdef _WIN32
+s->wakeup_event = CreateEvent(NULL, TRUE, FALSE, TEXT("WakeUp"));
+if (s->wakeup_event == NULL) {
+g_critical("CreateEvent failed");
+return NULL;
+}
+#endif
+
 ga_state = s;
 return 

Re: [Qemu-devel] [PATCH v4 09/11] qga-win: handle multi-disk volumes

2018-10-07 Thread Sameeh Jubran
I did a quick scan for the documentation and the code and it seems that the
name format that you're looking for is provided by the "QueryDosDeviceW"
function. The function returns multiple names and in the current code we
only check the first one. I believe that one of these names provided should
be the win32 device name (dos name).

Plus I did some googling and found out this similar question:
https://stackoverflow.com/questions/36199097/list-the-content-of-the-win32-device-namespace

Which suggested using the following tool:
https://docs.microsoft.com/en-us/sysinternals/downloads/winobj


On Thu, Oct 4, 2018 at 2:43 PM Tomáš Golembiovský 
wrote:

> Probe the volume for disk extents and return list of all disks.
> Originally only first disk of composite volume was returned.
>
> Note that the patch changes get_pci_info() from one state of brokenness
> into a different state of brokenness. In other words it still does not do
> what it's supposed to do (see comment in code). If anyone knows how to
> fix it, please step in.
>
> Signed-off-by: Tomáš Golembiovský 
> ---
>  qga/commands-win32.c | 126 ---
>  1 file changed, 108 insertions(+), 18 deletions(-)
>
> diff --git a/qga/commands-win32.c b/qga/commands-win32.c
> index 1e64642b8a..a591d8221d 100644
> --- a/qga/commands-win32.c
> +++ b/qga/commands-win32.c
> @@ -477,9 +477,26 @@ static GuestDiskBusType
> find_bus_type(STORAGE_BUS_TYPE bus)
>  return win2qemu[(int)bus];
>  }
>
> +/* XXX: The following function is BROKEN!
> + *
> + * It does not work and probably has never worked. When we query for list
> of
> + * disks we get cryptic names like "\Device\001d" instead of
> + * "\PhysicalDriveX" or "\HarddiskX". Whether the names can be translated
> one
> + * way or the other for comparison is an open question.
> + *
> + * When we query volume names (the original version) we are able to match
> those
> + * but then the property queries report error "Invalid function". (duh!)
> + */
> +
> +/*
>  DEFINE_GUID(GUID_DEVINTERFACE_VOLUME,
>  0x53f5630dL, 0xb6bf, 0x11d0, 0x94, 0xf2,
>  0x00, 0xa0, 0xc9, 0x1e, 0xfb, 0x8b);
> +*/
> +DEFINE_GUID(GUID_DEVINTERFACE_DISK,
> +0x53f56307L, 0xb6bf, 0x11d0, 0x94, 0xf2,
> +0x00, 0xa0, 0xc9, 0x1e, 0xfb, 0x8b);
> +
>
>  static GuestPCIAddress *get_pci_info(char *guid, Error **errp)
>  {
> @@ -497,7 +514,7 @@ static GuestPCIAddress *get_pci_info(char *guid, Error
> **errp)
>  goto out;
>  }
>
> -dev_info = SetupDiGetClassDevs(&GUID_DEVINTERFACE_VOLUME, 0, 0,
> +dev_info = SetupDiGetClassDevs(&GUID_DEVINTERFACE_DISK, 0, 0,
> DIGCF_PRESENT | DIGCF_DEVICEINTERFACE);
>  if (dev_info == INVALID_HANDLE_VALUE) {
>  error_setg_win32(errp, GetLastError(), "failed to get devices
> tree");
> @@ -637,20 +654,20 @@ static void get_single_disk_info(char *name,
> GuestDiskAddress *disk,
>  {
>  SCSI_ADDRESS addr, *scsi_ad;
>  DWORD len;
> -HANDLE vol_h;
> +HANDLE disk_h;
>  Error *local_err = NULL;
>
>  scsi_ad = &addr;
>
>  g_debug("getting disk info for: %s", name);
> -vol_h = CreateFile(name, 0, FILE_SHARE_READ, NULL, OPEN_EXISTING,
> +disk_h = CreateFile(name, 0, FILE_SHARE_READ, NULL, OPEN_EXISTING,
> 0, NULL);
> -if (vol_h == INVALID_HANDLE_VALUE) {
> -error_setg_win32(errp, GetLastError(), "failed to open volume");
> -goto err;
> +if (disk_h == INVALID_HANDLE_VALUE) {
> +error_setg_win32(errp, GetLastError(), "failed to open disk");
> +return;
>  }
>
> -get_disk_properties(vol_h, disk, &local_err);
> +get_disk_properties(disk_h, disk, &local_err);
>  if (local_err) {
>  error_propagate(errp, local_err);
>  goto err_close;
> @@ -668,7 +685,7 @@ static void get_single_disk_info(char *name,
> GuestDiskAddress *disk,
>   * according to Microsoft docs
>   *
> https://technet.microsoft.com/en-us/library/ee851589(v=ws.10).aspx */
>  g_debug("getting pci-controller info");
> -if (DeviceIoControl(vol_h, IOCTL_SCSI_GET_ADDRESS, NULL, 0,
> scsi_ad,
> +if (DeviceIoControl(disk_h, IOCTL_SCSI_GET_ADDRESS, NULL, 0,
> scsi_ad,
>  sizeof(SCSI_ADDRESS), &len, NULL)) {
>  disk->unit = addr.Lun;
>  disk->target = addr.TargetId;
> @@ -699,8 +716,7 @@ static void get_single_disk_info(char *name,
> GuestDiskAddress *disk,
>  }
>
>  err_close:
> -CloseHandle(vol_h);
> -err:
> +CloseHandle(disk_h);
>  return;
>  }
>
> @@ -712,6 +728,10 @@ static GuestDiskAddressList
> *build_guest_disk_info(char *guid, Error **errp)
>  Error *local_err = NULL;
>  GuestDiskAddressList *list = NULL, *cur_item = NULL;
>  GuestDiskAddress *disk = NULL;
> +int i;
> +HANDLE vol_h;
> +DWORD size;
> +PVOLUME_DISK_EXTENTS extents = NULL;
>
>  /* strip final backslash */
>  

[Qemu-devel] [PATCH] oslib-posix: Use MAP_STACK in qemu_alloc_stack() on OpenBSD

2018-10-07 Thread Brad Smith
Use MAP_STACK in qemu_alloc_stack() on OpenBSD.

Added to -current and will be in our soon to be 6.4 release.

MAP_STACK  Indicate that the mapping is used as a stack.  This
   flag must be used in combination with MAP_ANON and
   MAP_PRIVATE.

Implement MAP_STACK option for mmap().  Synchronous faults (pagefault and
syscall) confirm the stack register points at MAP_STACK memory, otherwise
SIGSEGV is delivered. sigaltstack() and pthread_attr_setstack() are modified
to create a MAP_STACK sub-region which satisfies alignment requirements.
Observe that MAP_STACK can only be set/cleared by mmap(), which zeroes the
contents of the region -- there is no mprotect() equivalent operation, so
there is no MAP_STACK-adding gadget.


Signed-off-by: Brad Smith 


diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index fbd0dc8c57..51e9a012c2 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -611,7 +611,11 @@ void *qemu_alloc_stack(size_t *sz)
 *sz += pagesz;
 
 ptr = mmap(NULL, *sz, PROT_READ | PROT_WRITE,
-   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+   MAP_PRIVATE | MAP_ANONYMOUS
+#ifdef MAP_STACK
+   | MAP_STACK
+#endif
+   , -1, 0);
 if (ptr == MAP_FAILED) {
 perror("failed to allocate memory for stack");
 abort();



Re: [Qemu-devel] [Qemu-trivial] [PATCH] qemu-io-cmds: Fix two format strings

2018-10-07 Thread Philippe Mathieu-Daudé
On 10/6/18 8:38 PM, Stefan Weil wrote:
> Use %zu instead of %zd for unsigned numbers.
> 
> This fixes two error messages from the LSTM static code analyzer:
> 
> This argument should be of type 'ssize_t' but is of type 'unsigned long'

Eventually prepend some spaces to have the difference between your
statement and the analyzer output.

> 
> Signed-off-by: Stefan Weil 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  qemu-io-cmds.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
> index db0b3ee5ef..5363482213 100644
> --- a/qemu-io-cmds.c
> +++ b/qemu-io-cmds.c
> @@ -907,7 +907,7 @@ static int readv_f(BlockBackend *blk, int argc, char 
> **argv)
>  memset(cmp_buf, pattern, qiov.size);
>  if (memcmp(buf, cmp_buf, qiov.size)) {
>  printf("Pattern verification failed at offset %"
> -   PRId64 ", %zd bytes\n", offset, qiov.size);
> +   PRId64 ", %zu bytes\n", offset, qiov.size);
>  ret = -EINVAL;
>  }
>  g_free(cmp_buf);
> @@ -1294,7 +1294,7 @@ static void aio_read_done(void *opaque, int ret)
>  memset(cmp_buf, ctx->pattern, ctx->qiov.size);
>  if (memcmp(ctx->buf, cmp_buf, ctx->qiov.size)) {
>  printf("Pattern verification failed at offset %"
> -   PRId64 ", %zd bytes\n", ctx->offset, ctx->qiov.size);
> +   PRId64 ", %zu bytes\n", ctx->offset, ctx->qiov.size);
>  }
>  g_free(cmp_buf);
>  }
> 



Re: [Qemu-devel] [RFC 6/6] cputlb: dynamically resize TLBs based on use rate

2018-10-07 Thread Philippe Mathieu-Daudé
Hi Emilio,

On 10/6/18 11:45 PM, Emilio G. Cota wrote:
> Perform the resizing only on flushes, otherwise we'd
> have to take a perf hit by either rehashing the array
> or unnecessarily flushing it.
> 
> We grow the array aggressively, and reduce the size more
> slowly. This accommodates mixed workloads, where some
> processes might be memory-heavy while others are not.
> 
> As the following experiments show, this a net perf gain,
> particularly for memory-heavy workloads. Experiments
> are run on an Intel i7-6700K CPU @ 4.00GHz.
> 
> 1. System boot + shudown, debian aarch64:
> 
> - Before (tb-lock-v3):
>  Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs):
> 
>7469.363393  task-clock (msec) #0.998 CPUs utilized
> ( +-  0.07% )
> 31,507,707,190  cycles#4.218 GHz  
> ( +-  0.07% )
> 57,101,577,452  instructions  #1.81  insns per cycle  
> ( +-  0.08% )
> 10,265,531,804  branches  # 1374.352 M/sec
> ( +-  0.07% )
>173,020,681  branch-misses #1.69% of all branches  
> ( +-  0.10% )
> 
>7.483359063 seconds time elapsed   
>( +-  0.08% )
> 
> - After:
>  Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs):
> 
>7185.036730  task-clock (msec) #0.999 CPUs utilized
> ( +-  0.11% )
> 30,303,501,143  cycles#4.218 GHz  
> ( +-  0.11% )
> 54,198,386,487  instructions  #1.79  insns per cycle  
> ( +-  0.08% )
>  9,726,518,945  branches  # 1353.719 M/sec
> ( +-  0.08% )
>167,082,307  branch-misses #1.72% of all branches  
> ( +-  0.08% )
> 
>7.195597842 seconds time elapsed   
>( +-  0.11% )
> 
> That is, a 3.8% improvement.
> 
> 2. System boot + shutdown, ubuntu 18.04 x86_64:

You can also run the VM tests to build QEMU:

$ make vm-test
vm-test: Test QEMU in preconfigured virtual machines

  vm-build-ubuntu.i386- Build QEMU in ubuntu i386 VM
  vm-build-freebsd- Build QEMU in FreeBSD VM
  vm-build-netbsd - Build QEMU in NetBSD VM
  vm-build-openbsd- Build QEMU in OpenBSD VM
  vm-build-centos - Build QEMU in CentOS VM, with Docker

> 
> - Before (tb-lock-v3):
> Performance counter stats for 'taskset -c 0 ../img/x86_64/ubuntu-die.sh 
> -nographic' (2 runs):
> 
>   49971.036482  task-clock (msec) #0.999 CPUs utilized
> ( +-  1.62% )
>210,766,077,140  cycles#4.218 GHz  
> ( +-  1.63% )
>428,829,830,790  instructions  #2.03  insns per cycle  
> ( +-  0.75% )
> 77,313,384,038  branches  # 1547.164 M/sec
> ( +-  0.54% )
>835,610,706  branch-misses #1.08% of all branches  
> ( +-  2.97% )
> 
>   50.003855102 seconds time elapsed   
>( +-  1.61% )
> 
> - After:
>  Performance counter stats for 'taskset -c 0 ../img/x86_64/ubuntu-die.sh 
> -nographic' (2 runs):
> 
>   50118.124477  task-clock (msec) #0.999 CPUs utilized
> ( +-  4.30% )
>132,396  context-switches  #0.003 M/sec
> ( +-  1.20% )
>  0  cpu-migrations#0.000 K/sec
> ( +-100.00% )
>167,754  page-faults   #0.003 M/sec
> ( +-  0.06% )
>211,414,701,601  cycles#4.218 GHz  
> ( +-  4.30% )
>  stalled-cycles-frontend
>  stalled-cycles-backend
>431,618,818,597  instructions  #2.04  insns per cycle  
> ( +-  6.40% )
> 80,197,256,524  branches  # 1600.165 M/sec
> ( +-  8.59% )
>794,830,352  branch-misses #0.99% of all branches  
> ( +-  2.05% )
> 
>   50.177077175 seconds time elapsed   
>( +-  4.23% )
> 
> No improvement (within noise range).
> 
> 3. x86_64 SPEC06int:
>   SPEC06int (test set)
>  [ Y axis: speedup over master ]
>   8 +-+--+++-+++++++-+++--+-+
> |   |
> |   tlb-lock-v3 |
>   7 +-+..$$$...+indirection   +-+
> |$ $