date:20170118

Re: [Qemu-devel] [PATCH v5 0/3] Add litmus tests for MTTCG consistency tests

2017-01-18 Thread no-reply

Hi,

Your series seems to have some coding style problems. See output below for
more information:

Message-id: 20161201052844.31819-1-bobby.pr...@gmail.com
Subject: [Qemu-devel] [PATCH v5 0/3] Add litmus tests for MTTCG consistency 
tests
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
fatal: unable to access 'https://github.com/patchew-project/qemu/': Failed 
connect to github.com:443; Operation now in progress
error: Could not fetch 3c8cf5a9c21ff8782164d1def7f44bd888713384
Traceback (most recent call last):
  File "/usr/bin/patchew", line 406, in test_one
git_clone_repo(clone, r["repo"], r["head"], logf)
  File "/usr/bin/patchew", line 47, in git_clone_repo
stdout=logf, stderr=logf)
  File "/usr/lib64/python3.4/subprocess.py", line 561, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['git', 'remote', 'add', '-f', 
'--mirror=fetch', '3c8cf5a9c21ff8782164d1def7f44bd888713384', 
'https://github.com/patchew-project/qemu']' returned non-zero exit status 1



---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [Nbd] [Qemu-block] How to online resize qemu disk with nbd protocol?

2017-01-18 Thread Wouter Verhelst

On Mon, Jan 16, 2017 at 01:36:21PM -0600, Eric Blake wrote:
> Maybe the structured reply proposal can be extended into this (reserve a
> "reply" header that can be issued as many times as desired by the server
> without the client ever having issued the request first, and where the
> reply never uses the end-of-reply marker), but I'm not sure I want to go
> that direction just yet.

It's not necessarily a bad idea, which could also be used for:
- keepalive probes from server to client
- unsolicited ESHUTDOWN messages

both of which are currently not possible and might be useful for the
protocol to have.

-- 
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
   people in the world who think they really understand all of its rules,
   and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12

[Qemu-devel] [PATCH v5 1/2] gdbstub.c: fix GDB connection segfault caused by empty machines

2017-01-18 Thread Ziyue Yang

From: Ziyue Yang 

This patch is to fix the segmentation fault caused by attaching
GDB to a QEMU instance initialized with "-M none" option.

The bug can be reproduced by

> ./qemu-system-x86_64 -M none -nographic -S -s

and attach a GDB to it by

> gdb -ex 'target remote :1234

The segmentation fault was originally caused by trying to read
the information about CPU when communicating with GDB. However,
it's impossible for any control flow to exist on an empty machine,
nor can CPU's be hot plugged to an empty machine later by QOM
commands. So I think simply disabling GDB connections on empty
machines makes sense.

Signed-off-by: Ziyue Yang 
---
 gdbstub.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gdbstub.c b/gdbstub.c
index de62d26..426d55e 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -18,6 +18,7 @@
  */
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qemu/error-report.h"
 #include "qemu/cutils.h"
 #include "cpu.h"
 #ifdef CONFIG_USER_ONLY
@@ -1731,6 +1732,12 @@ int gdbserver_start(const char *device)
 CharDriverState *mon_chr;
 ChardevCommon common = { 0 };

+if (!first_cpu) {
+error_report("gdbstub: meaningless to attach gdb to a "
+ "machine without any CPU.");
+return -1;
+}
+
 if (!device)
 return -1;
 if (strcmp(device, "none") != 0) {
--
2.7.4

[Qemu-devel] [PATCH v5 2/2] gdbstub.c: update old error report statements

2017-01-18 Thread Ziyue Yang

From: Ziyue Yang 

Some updates from fprintf(stderr, ...) to error_report.

Signed-off-by: Ziyue Yang 
---
 gdbstub.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index 426d55e..959f6dc 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -637,8 +637,8 @@ void gdb_register_coprocessor(CPUState *cpu,
 *p = s;
 if (g_pos) {
 if (g_pos != s->base_reg) {
-fprintf(stderr, "Error: Bad gdb register numbering for '%s'\n"
-"Expected %d got %d\n", xml, g_pos, s->base_reg);
+error_report("Error: Bad gdb register numbering for '%s', "
+ "expected %d got %d", xml, g_pos, s->base_reg);
 } else {
 cpu->gdb_num_g_regs = cpu->gdb_num_regs;
 }
@@ -890,7 +890,7 @@ static int gdb_handle_packet(GDBState *s, const char 
*line_buf)
 }
 case 'k':
 /* Kill the target */
-fprintf(stderr, "\nQEMU: Terminated via GDBstub\n");
+error_report("QEMU: Terminated via GDBstub");
 exit(0);
 case 'D':
 /* Detach packet */
@@ -1358,8 +1358,8 @@ void gdb_do_syscallv(gdb_syscall_complete_cb cb, const 
char *fmt, va_list va)
 break;
 default:
 bad_format:
-fprintf(stderr, "gdbstub: Bad syscall format string '%s'\n",
-fmt - 1);
+error_report("gdbstub: Bad syscall format string '%s'",
+ fmt - 1);
 break;
 }
 } else {
--
2.7.4

Re: [Qemu-devel] [PATCH v5 1/2] gdbstub.c: fix GDB connection segfault caused by empty machines

2017-01-18 Thread Thomas Huth

On 18.01.2017 09:02, Ziyue Yang wrote:
> From: Ziyue Yang 
> 
> This patch is to fix the segmentation fault caused by attaching
> GDB to a QEMU instance initialized with "-M none" option.
> 
> The bug can be reproduced by
> 
>> ./qemu-system-x86_64 -M none -nographic -S -s
> 
> and attach a GDB to it by
> 
>> gdb -ex 'target remote :1234
> 
> The segmentation fault was originally caused by trying to read
> the information about CPU when communicating with GDB. However,
> it's impossible for any control flow to exist on an empty machine,
> nor can CPU's be hot plugged to an empty machine later by QOM
> commands. So I think simply disabling GDB connections on empty
> machines makes sense.
> 
> Signed-off-by: Ziyue Yang 
> ---
>  gdbstub.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/gdbstub.c b/gdbstub.c
> index de62d26..426d55e 100644
> --- a/gdbstub.c
> +++ b/gdbstub.c
> @@ -18,6 +18,7 @@
>   */
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
> +#include "qemu/error-report.h"
>  #include "qemu/cutils.h"
>  #include "cpu.h"
>  #ifdef CONFIG_USER_ONLY
> @@ -1731,6 +1732,12 @@ int gdbserver_start(const char *device)
>  CharDriverState *mon_chr;
>  ChardevCommon common = { 0 };
> 
> +if (!first_cpu) {
> +error_report("gdbstub: meaningless to attach gdb to a "
> + "machine without any CPU.");
> +return -1;
> +}
> +
>  if (!device)
>  return -1;
>  if (strcmp(device, "none") != 0) {
> --
> 2.7.4

Reviewed-by: Thomas Huth

Re: [Qemu-devel] [PATCH v5 2/2] gdbstub.c: update old error report statements

2017-01-18 Thread Thomas Huth

On 18.01.2017 09:03, Ziyue Yang wrote:
> From: Ziyue Yang 
> 
> Some updates from fprintf(stderr, ...) to error_report.
> 
> Signed-off-by: Ziyue Yang 
> ---
>  gdbstub.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/gdbstub.c b/gdbstub.c
> index 426d55e..959f6dc 100644
> --- a/gdbstub.c
> +++ b/gdbstub.c
> @@ -637,8 +637,8 @@ void gdb_register_coprocessor(CPUState *cpu,
>  *p = s;
>  if (g_pos) {
>  if (g_pos != s->base_reg) {
> -fprintf(stderr, "Error: Bad gdb register numbering for '%s'\n"
> -"Expected %d got %d\n", xml, g_pos, s->base_reg);
> +error_report("Error: Bad gdb register numbering for '%s', "
> + "expected %d got %d", xml, g_pos, s->base_reg);
>  } else {
>  cpu->gdb_num_g_regs = cpu->gdb_num_regs;
>  }
> @@ -890,7 +890,7 @@ static int gdb_handle_packet(GDBState *s, const char 
> *line_buf)
>  }
>  case 'k':
>  /* Kill the target */
> -fprintf(stderr, "\nQEMU: Terminated via GDBstub\n");
> +error_report("QEMU: Terminated via GDBstub");
>  exit(0);
>  case 'D':
>  /* Detach packet */
> @@ -1358,8 +1358,8 @@ void gdb_do_syscallv(gdb_syscall_complete_cb cb, const 
> char *fmt, va_list va)
>  break;
>  default:
>  bad_format:
> -fprintf(stderr, "gdbstub: Bad syscall format string '%s'\n",
> -fmt - 1);
> +error_report("gdbstub: Bad syscall format string '%s'",
> + fmt - 1);
>  break;
>  }
>  } else {
> --
> 2.7.4

Reviewed-by: Thomas Huth

Re: [Qemu-devel] [PATCH RFC v3 14/14] intel_iommu: enable vfio devices

2017-01-18 Thread Peter Xu

On Wed, Jan 18, 2017 at 11:10:53AM +0800, Jason Wang wrote:
> 
> 
> On 2017年01月17日 22:45, Peter Xu wrote:
> >On Mon, Jan 16, 2017 at 05:54:55PM +0800, Jason Wang wrote:
> >>
> >>On 2017年01月16日 17:18, Peter Xu wrote:
> >  static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t 
> > domain_id,
> >hwaddr addr, uint8_t am)
> >  {
> >@@ -1222,6 +1251,7 @@ static void 
> >vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
> >  info.addr = addr;
> >  info.mask = ~((1 << am) - 1);
> >  g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_page, 
> > &info);
> >+vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am);
> Is the case of GLOBAL or DSI flush missed, or we don't care it at all?
> >>>IMHO we don't. For device assignment, since we are having CM=1 here,
> >>>we should have explicit page invalidations even if guest sends
> >>>global/domain invalidations.
> >>>
> >>>Thanks,
> >>>
> >>>-- peterx
> >>Is this spec required?
> >I think not. IMO the spec is very coarse grained on describing cache
> >mode...
> >
> >>Btw, it looks to me that both DSI and GLOBAL are
> >>indeed explicit flush.
> >Actually when cache mode is on, it is unclear to me on how we should
> >treat domain/global invalidations, at least from the spec (as
> >mentioned earlier). My understanding is that they are not "explicit
> >flushes", which IMHO should only mean page selective IOTLB
> >invalidations.
> 
> Probably not, at least from the view of performance. DSI and global should
> be more efficient in some cases.

I agree with you that DSI/GLOBAL flushes are more efficient in some
way. But IMHO that does not mean these invalidations are "explicit
invalidations", and I suspect whether cache mode has to coop with it.

But here I should add one more thing besides PSI - context entry
invalidation should be one of "the explicit invalidations" as well,
which we need to handle just like PSI when cache mode is on.

> 
> >
> >>Just have a quick go through on driver codes and find this something
> >>interesting in intel_iommu_flush_iotlb_psi():
> >>
> >>...
> >> /*
> >>  * Fallback to domain selective flush if no PSI support or the size is
> >>  * too big.
> >>  * PSI requires page size to be 2 ^ x, and the base address is 
> >> naturally
> >>  * aligned to the size
> >>  */
> >> if (!cap_pgsel_inv(iommu->cap) || mask > cap_max_amask_val(iommu->cap))
> >> iommu->flush.flush_iotlb(iommu, did, 0, 0,
> >> DMA_TLB_DSI_FLUSH);
> >> else
> >> iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
> >> DMA_TLB_PSI_FLUSH);
> >>...
> >I think this is interesting... and I doubt its correctness while with
> >cache mode enabled.
> >
> >If so (sending domain invalidation instead of a big range of page
> >invalidations), how should we capture which pages are unmapped in
> >emulated IOMMU?
> 
> We don't need to track individual pages here, since all pages for a specific
> domain were unmapped I believe?

IMHO this might not be the correct behavior.

If we receive one domain specific invalidation, I agree that we should
invalidate the IOTLB cache for all the devices inside the domain.
However, when cache mode is on, we should be depending on the PSIs to
unmap each page (unless we want to unmap the whole address space, in
this case it's very possible that the guest is just unmapping a range,
not the entire space). If we convert several PSIs into one big DSI,
IMHO we will leave those pages mapped/unmapped while we should
unmap/map them.

> 
> >
> >>It looks like DSI_FLUSH is possible even for CM on.
> >>
> >>And in flush_unmaps():
> >>
> >>...
> >> /* In caching mode, global flushes turn emulation expensive */
> >> if (!cap_caching_mode(iommu->cap))
> >> iommu->flush.flush_iotlb(iommu, 0, 0, 0,
> >>  DMA_TLB_GLOBAL_FLUSH);
> >>...
> >>
> >>If I understand the comments correctly, GLOBAL is ok for CM too (though the
> >>code did not do it for performance reason).
> >I think it should be okay to send global flush for CM, but not sure
> >whether we should notify anything when we receive it. Hmm, anyway, I
> >think I need some more reading to make sure I understand the whole
> >thing correctly. :)
> >
> >For example, when I see this commit:
> >
> >commit 78d5f0f500e6ba8f6cfd0673475ff4d941d705a2
> >Author: Nadav Amit 
> >Date:   Thu Apr 8 23:00:41 2010 +0300
> >
> > intel-iommu: Avoid global flushes with caching mode.
> > While it may be efficient on real hardware, emulation of global
> > invalidations is very expensive as all shadow entries must be examined.
> > This patch changes the behaviour when caching mode is enabled (which is
> > the case when IOMMU emulation takes place). In this case, page specific
> > invalidation is used instead.
> >
> >Before I ask someone else besid

Re: [Qemu-devel] [PATCH 3/3] COLO: Don't process failover request while loading VM's state

2017-01-18 Thread Hailiang Zhang


On 2017/1/18 2:24, Eric Blake wrote:

On 01/17/2017 06:57 AM, zhanghailiang wrote:

We should not do failover work while the main thread is loading
VM's state. Otherwise the consistent of VM's memory and
device state will be broken.

We will restart the loading process after jump over the stage,
The new failover status 'RELAUNCH' will help to record if we
need to restart the process.

Cc: Eric Blake 
Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
Reviewed-by: Dr. David Alan Gilbert 
---
  migration/colo.c | 26 ++
  qapi-schema.json |  4 +++-
  2 files changed, 29 insertions(+), 1 deletion(-)




+++ b/qapi-schema.json
@@ -856,10 +856,12 @@
  #
  # @completed: finish the process of failover
  #
+# @relaunch: restart the failover process, from 'none' -> 'completed'


You'll need to add a '(since 2.9)' tag



OK, I'll add it in next version, thanks.


+#
  # Since: 2.8
  ##
  { 'enum': 'FailoverStatus',
-  'data': [ 'none', 'require', 'active', 'completed'] }
+  'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }

  ##
  # @x-colo-lost-heartbeat:

[Qemu-devel] [PATCH v16 0/2] virtio-crypto: virtio crypto device specification

2017-01-18 Thread Gonglei

Changes since v15:
 - use feature bits for non-session mode in order to keep compatibility with
   pre-existing code. [Halil & Michael]
 - introduce VIRTIO_CRYPTO_F_ NON_SESSION_MODE feature bit to control all other
   non-session mode feature bits.
 - fix some typos. [Stefan]
 - introduce struct virtio_crypto_op_data_req_mux to support both session
   and non-session based crypto operations and keep compatibility with
   pre-existing code.

Changes since v14:
 - drop VIRTIO_CRYPTO_S_STARTED status [Halil & Cornelia]
 - correct a sentence about dataqueue and controlq in the first paragraph. 
[Halil]
 - change a MAY to MUST about max_dataqueues. [Halil]
 - add non-session mode support
   a) add four features for different crypto services to identify wheather 
support session mode.
   b) rewrite some

For pervious versions of virtio crypto spec, Pls see:

[v14]:
https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg02212.html

[v13]:
https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg07348.html

For more information, please see:
 http://qemu-project.org/Features/VirtioCrypto

Please help to review, thanks.

Gonglei (2):
  virtio-crypto: Add virtio crypto device specification
  virtio-crypto: Add conformance clauses

 conformance.tex   |   30 ++
 content.tex   |2 +
 virtio-crypto.tex | 1245 +
 3 files changed, 1277 insertions(+)
 create mode 100644 virtio-crypto.tex

-- 
1.7.12.4

[Qemu-devel] [PATCH v16 2/2] virtio-crypto: Add conformance clauses

2017-01-18 Thread Gonglei

Add the conformance targets and clauses for
virtio-crypto device.

Signed-off-by: Gonglei 
---
 conformance.tex | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/conformance.tex b/conformance.tex
index f59e360..3bde4b6 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -146,6 +146,21 @@ An SCSI host driver MUST conform to the following 
normative statements:
 \item \ref{drivernormative:Device Types / SCSI Host Device / Device Operation 
/ Device Operation: eventq}
 \end{itemize}
 
+\subsection{Crypto Driver Conformance}\label{sec:Conformance / Driver 
Conformance / Crypto Driver Conformance}
+
+An Crypto driver MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{drivernormative:Device Types / Crypto Device / Device configuration 
layout}
+\item \ref{drivernormative:Device Types / Crypto Device / Device 
Initialization}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Operation / 
Control Virtqueue / Session operation / Session operation: create session}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Operation / 
Control Virtqueue / Session operation / Session operation: destroy session}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Operation / 
HASH Service operation}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Operation / 
MAC Service operation}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Operation / 
Symmetric algorithms Operation}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Operation / 
AEAD Service operation}
+\end{itemize}
+
 \section{Device Conformance}\label{sec:Conformance / Device Conformance}
 
 A device MUST conform to the following normative statements:
@@ -267,6 +282,21 @@ An SCSI host device MUST conform to the following 
normative statements:
 \item \ref{devicenormative:Device Types / SCSI Host Device / Device Operation 
/ Device Operation: eventq}
 \end{itemize}
 
+\subsection{Crypto Device Conformance}\label{sec:Conformance / Device 
Conformance / Crypto Device Conformance}
+
+An Crypto device MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{devicenormative:Device Types / Crypto Device / Device configuration 
layout}
+\item \ref{devicenormative:Device Types / Crypto Device / Device 
Initialization}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Operation / 
Control Virtqueue / Session operation / Session operation: create session}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Operation / 
Control Virtqueue / Session operation / Session operation: destroy session}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Operation / 
HASH Service operation}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Operation / 
MAC Service operation}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Operation / 
Symmetric algorithms Operation}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Operation / 
AEAD Service operation}
+\end{itemize}
+
 \section{Legacy Interface: Transitional Device and
 Transitional Driver Conformance}\label{sec:Conformance / Legacy
 Interface: Transitional Device and 
-- 
1.7.12.4

[Qemu-devel] [PATCH v16 1/2] virtio-crypto: Add virtio crypto device specification

2017-01-18 Thread Gonglei

The virtio crypto device is a virtual crypto device (ie. hardware
crypto accelerator card). Currently, the virtio crypto device provides
the following crypto services: CIPHER, MAC, HASH, and AEAD.

In this patch, CIPHER, MAC, HASH, AEAD services are introduced.

VIRTIO-153

Signed-off-by: Gonglei 
CC: Michael S. Tsirkin 
CC: Cornelia Huck 
CC: Stefan Hajnoczi 
CC: Lingli Deng 
CC: Jani Kokkonen 
CC: Ola Liljedahl 
CC: Varun Sethi 
CC: Zeng Xin 
CC: Keating Brian 
CC: Ma Liang J 
CC: Griffin John 
CC: Mihai Claudiu Caraman 
---
 content.tex   |2 +
 virtio-crypto.tex | 1245 +
 2 files changed, 1247 insertions(+)
 create mode 100644 virtio-crypto.tex

diff --git a/content.tex b/content.tex
index 4b45678..ab75f78 100644
--- a/content.tex
+++ b/content.tex
@@ -5750,6 +5750,8 @@ descriptor for the \field{sense_len}, \field{residual},
 \field{status_qualifier}, \field{status}, \field{response} and
 \field{sense} fields.
 
+\input{virtio-crypto.tex}
+
 \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
 
 Currently there are three device-independent feature bits defined:
diff --git a/virtio-crypto.tex b/virtio-crypto.tex
new file mode 100644
index 000..732cd30
--- /dev/null
+++ b/virtio-crypto.tex
@@ -0,0 +1,1245 @@
+\section{Crypto Device}\label{sec:Device Types / Crypto Device}
+
+The virtio crypto device is a virtual cryptography device as well as a kind of
+virtual hardware accelerator for virtual machines. The encryption and
+decryption requests are placed in any of the data active queues and are 
ultimately handled by the
+backend crypto accelerators. The second kind of queue is the control queue 
used to create 
+or destroy sessions for symmetric algorithms and will control some advanced
+features in the future. The virtio crypto device provides the following crypto
+services: CIPHER, MAC, HASH, and AEAD.
+
+
+\subsection{Device ID}\label{sec:Device Types / Crypto Device / Device ID}
+
+20
+
+\subsection{Virtqueues}\label{sec:Device Types / Crypto Device / Virtqueues}
+
+\begin{description}
+\item[0] dataq1
+\item[\ldots]
+\item[N-1] dataqN
+\item[N] controlq
+\end{description}
+
+N is set by \field{max_dataqueues}.
+
+\subsection{Feature bits}\label{sec:Device Types / Crypto Device / Feature 
bits}
+
+VIRTIO_CRYPTO_F_NON_SESSION_MODE (0) non-session mode is available.
+VIRTIO_CRYPTO_F_CIPHER_NON_SESSION_MODE (1) non-session mode is available for 
CIPHER service.
+VIRTIO_CRYPTO_F_HASH_NON_SESSION_MODE (2) non-session mode is available for 
HASH service.
+VIRTIO_CRYPTO_F_MAC_NON_SESSION_MODE (3) non-session mode is available for MAC 
service.
+VIRTIO_CRYPTO_F_AEAD_NON_SESSION_MODE (4) non-session mode is available for 
AEAD service.
+
+\subsubsection{Feature bit requirements}\label{sec:Device Types / Crypto 
Device / Feature bits}
+
+Some crypto feature bits require other crypto feature bits
+(see \ref{drivernormative:Basic Facilities of a Virtio Device / Feature Bits}):
+
+\begin{description}
+\item[VIRTIO_CRYPTO_F_CIPHER_NON_SESSION_MODE] Requires 
VIRTIO_CRYPTO_F_NON_SESSION_MODE.
+\item[VIRTIO_CRYPTO_F_HASH_NON_SESSION_MODE] Requires 
VIRTIO_CRYPTO_F_NON_SESSION_MODE.
+\item[VIRTIO_CRYPTO_F_MAC_NON_SESSION_MODE] Requires 
VIRTIO_CRYPTO_F_NON_SESSION_MODE.
+\item[VIRTIO_CRYPTO_F_AEAD_NON_SESSION_MODE] Requires 
VIRTIO_CRYPTO_F_NON_SESSION_MODE.
+\end{description}
+
+\subsection{Device configuration layout}\label{sec:Device Types / Crypto 
Device / Device configuration layout}
+
+The following driver-read-only configuration fields are defined:
+
+\begin{lstlisting}
+struct virtio_crypto_config {
+le32 status;
+le32 max_dataqueues;
+le32 crypto_services;
+/* Detailed algorithms mask */
+le32 cipher_algo_l;
+le32 cipher_algo_h;
+le32 hash_algo;
+le32 mac_algo_l;
+le32 mac_algo_h;
+le32 aead_algo;
+/* Maximum length of cipher key */
+le32 max_cipher_key_len;
+/* Maximum length of authenticated key */
+le32 max_auth_key_len;
+le32 reserved;
+/* Maximum size of each crypto request's content */
+le64 max_size;
+};
+\end{lstlisting}
+
+The value of the \field{status} field is VIRTIO_CRYPTO_S_HW_READY or 
~VIRTIO_CRYPTO_S_HW_READY.
+
+\begin{lstlisting}
+#define VIRTIO_CRYPTO_S_HW_READY  (1 << 0)
+\end{lstlisting}
+
+The VIRTIO_CRYPTO_S_HW_READY flag is used to show whether the hardware is 
ready to work or not.
+
+The following driver-read-only fields include \field{max_dataqueues}, which 
specifies the
+maximum number of data virtqueues (dataq1\ldots dataqN), and 
\field{crypto_services},
+which indicates the crypto services the virtio crypto supports.
+
+The following services are defined:
+
+\begin{lstlisting}
+/* CIPHER service */
+#define VIRTIO_CRYPTO_SERVICE_CIPHER 0
+/* HASH service */
+#define VIRTIO_CRYPTO_SERVICE_HASH   1
+/* MAC (Message Authentication Codes) service */
+#define VIRTIO_CRYPTO_SERVICE_MAC2
+/* AEAD (Authenticated Encryption with Associate

Re: [Qemu-devel] [PATCH RFC v3 14/14] intel_iommu: enable vfio devices

2017-01-18 Thread Jason Wang




On 2017年01月18日 16:11, Peter Xu wrote:

On Wed, Jan 18, 2017 at 11:10:53AM +0800, Jason Wang wrote:


On 2017年01月17日 22:45, Peter Xu wrote:

On Mon, Jan 16, 2017 at 05:54:55PM +0800, Jason Wang wrote:

On 2017年01月16日 17:18, Peter Xu wrote:

  static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
hwaddr addr, uint8_t am)
  {
@@ -1222,6 +1251,7 @@ static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, 
uint16_t domain_id,
  info.addr = addr;
  info.mask = ~((1 << am) - 1);
  g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_page, &info);
+vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am);

Is the case of GLOBAL or DSI flush missed, or we don't care it at all?

IMHO we don't. For device assignment, since we are having CM=1 here,
we should have explicit page invalidations even if guest sends
global/domain invalidations.

Thanks,

-- peterx

Is this spec required?

I think not. IMO the spec is very coarse grained on describing cache
mode...


Btw, it looks to me that both DSI and GLOBAL are
indeed explicit flush.

Actually when cache mode is on, it is unclear to me on how we should
treat domain/global invalidations, at least from the spec (as
mentioned earlier). My understanding is that they are not "explicit
flushes", which IMHO should only mean page selective IOTLB
invalidations.

Probably not, at least from the view of performance. DSI and global should
be more efficient in some cases.

I agree with you that DSI/GLOBAL flushes are more efficient in some
way. But IMHO that does not mean these invalidations are "explicit
invalidations", and I suspect whether cache mode has to coop with it.


Well, the spec does not forbid DSI/GLOBAL with CM and the driver codes 
had used them for almost ten years. I can hardly believe it's wrong.




But here I should add one more thing besides PSI - context entry
invalidation should be one of "the explicit invalidations" as well,
which we need to handle just like PSI when cache mode is on.


Just have a quick go through on driver codes and find this something
interesting in intel_iommu_flush_iotlb_psi():

...
 /*
  * Fallback to domain selective flush if no PSI support or the size is
  * too big.
  * PSI requires page size to be 2 ^ x, and the base address is naturally
  * aligned to the size
  */
 if (!cap_pgsel_inv(iommu->cap) || mask > cap_max_amask_val(iommu->cap))
 iommu->flush.flush_iotlb(iommu, did, 0, 0,
 DMA_TLB_DSI_FLUSH);
 else
 iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
 DMA_TLB_PSI_FLUSH);
...

I think this is interesting... and I doubt its correctness while with
cache mode enabled.

If so (sending domain invalidation instead of a big range of page
invalidations), how should we capture which pages are unmapped in
emulated IOMMU?

We don't need to track individual pages here, since all pages for a specific
domain were unmapped I believe?

IMHO this might not be the correct behavior.

If we receive one domain specific invalidation, I agree that we should
invalidate the IOTLB cache for all the devices inside the domain.
However, when cache mode is on, we should be depending on the PSIs to
unmap each page (unless we want to unmap the whole address space, in
this case it's very possible that the guest is just unmapping a range,
not the entire space). If we convert several PSIs into one big DSI,
IMHO we will leave those pages mapped/unmapped while we should
unmap/map them.


Confused, do you have an example for this? (I fail to understand why DSI 
can't work, at least implementation can convert DSI to several PSIs 
internally).


Thanks




It looks like DSI_FLUSH is possible even for CM on.

And in flush_unmaps():

...
 /* In caching mode, global flushes turn emulation expensive */
 if (!cap_caching_mode(iommu->cap))
 iommu->flush.flush_iotlb(iommu, 0, 0, 0,
  DMA_TLB_GLOBAL_FLUSH);
...

If I understand the comments correctly, GLOBAL is ok for CM too (though the
code did not do it for performance reason).

I think it should be okay to send global flush for CM, but not sure
whether we should notify anything when we receive it. Hmm, anyway, I
think I need some more reading to make sure I understand the whole
thing correctly. :)

For example, when I see this commit:

commit 78d5f0f500e6ba8f6cfd0673475ff4d941d705a2
Author: Nadav Amit 
Date:   Thu Apr 8 23:00:41 2010 +0300

 intel-iommu: Avoid global flushes with caching mode.
 While it may be efficient on real hardware, emulation of global
 invalidations is very expensive as all shadow entries must be examined.
 This patch changes the behaviour when caching mode is enabled (which is
 the case when IOMMU emulation takes place). In this case, page specific
 invalidation is used instead.

Before I ask someone else besides qemu-devel, I am curious about

Re: [Qemu-devel] [PATCH v2 6/6] PC: Support dynamic sysbus on pc_i440fx

2017-01-18 Thread Igor Mammedov

On Mon, 16 Jan 2017 11:20:58 -0800
b...@skyportsystems.com wrote:

> From: Ben Warren 
commit message should give a reason why it's need.

> 
> Signed-off-by: Ben Warren 
> ---
>  hw/i386/pc_piix.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index a54a468..8ac894c 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -435,6 +435,7 @@ static void pc_i440fx_machine_options(MachineClass *m)
>  m->hot_add_cpu = pc_hot_add_cpu;
>  m->default_machine_opts = "firmware=bios-256k.bin";
>  m->default_display = "std";
> +m->has_dynamic_sysbus = true;
>  }
>  
>  static void pc_i440fx_2_8_machine_options(MachineClass *m)

Re: [Qemu-devel] [PATCH RFC v3 14/14] intel_iommu: enable vfio devices

2017-01-18 Thread Peter Xu

On Wed, Jan 18, 2017 at 04:36:05PM +0800, Jason Wang wrote:
> 
> 
> On 2017年01月18日 16:11, Peter Xu wrote:
> >On Wed, Jan 18, 2017 at 11:10:53AM +0800, Jason Wang wrote:
> >>
> >>On 2017年01月17日 22:45, Peter Xu wrote:
> >>>On Mon, Jan 16, 2017 at 05:54:55PM +0800, Jason Wang wrote:
> On 2017年01月16日 17:18, Peter Xu wrote:
> >>>  static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t 
> >>> domain_id,
> >>>hwaddr addr, uint8_t am)
> >>>  {
> >>>@@ -1222,6 +1251,7 @@ static void 
> >>>vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
> >>>  info.addr = addr;
> >>>  info.mask = ~((1 << am) - 1);
> >>>  g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_page, 
> >>> &info);
> >>>+vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am);
> >>Is the case of GLOBAL or DSI flush missed, or we don't care it at all?
> >IMHO we don't. For device assignment, since we are having CM=1 here,
> >we should have explicit page invalidations even if guest sends
> >global/domain invalidations.
> >
> >Thanks,
> >
> >-- peterx
> Is this spec required?
> >>>I think not. IMO the spec is very coarse grained on describing cache
> >>>mode...
> >>>
> Btw, it looks to me that both DSI and GLOBAL are
> indeed explicit flush.
> >>>Actually when cache mode is on, it is unclear to me on how we should
> >>>treat domain/global invalidations, at least from the spec (as
> >>>mentioned earlier). My understanding is that they are not "explicit
> >>>flushes", which IMHO should only mean page selective IOTLB
> >>>invalidations.
> >>Probably not, at least from the view of performance. DSI and global should
> >>be more efficient in some cases.
> >I agree with you that DSI/GLOBAL flushes are more efficient in some
> >way. But IMHO that does not mean these invalidations are "explicit
> >invalidations", and I suspect whether cache mode has to coop with it.
> 
> Well, the spec does not forbid DSI/GLOBAL with CM and the driver codes had
> used them for almost ten years. I can hardly believe it's wrong.

I think we have misunderstanding here. :)

I never thought we should not send DSI/GLOBAL invalidations with cache
mode. I just think we should not do anything special even if we have
cache mode on when we receive these signals.

In the spec, "explicit invalidation" is mentioned in the cache mode
chapter:

The Caching Mode (CM) field in Capability Register indicates if
the hardware implementation caches not-present or erroneous
translation-structure entries. When the CM field is reported as
Set, any software updates to any remapping structures (including
updates to not-present entries or present entries whose
programming resulted in translation faults) requires explicit
invalidation of the caches.

And I thought we were discussion about "what is explicit invalidation"
mentioned above.

> 
> >
> >But here I should add one more thing besides PSI - context entry
> >invalidation should be one of "the explicit invalidations" as well,
> >which we need to handle just like PSI when cache mode is on.
> >
> Just have a quick go through on driver codes and find this something
> interesting in intel_iommu_flush_iotlb_psi():
> 
> ...
>  /*
>   * Fallback to domain selective flush if no PSI support or the size 
>  is
>   * too big.
>   * PSI requires page size to be 2 ^ x, and the base address is 
>  naturally
>   * aligned to the size
>   */
>  if (!cap_pgsel_inv(iommu->cap) || mask > 
>  cap_max_amask_val(iommu->cap))
>  iommu->flush.flush_iotlb(iommu, did, 0, 0,
>  DMA_TLB_DSI_FLUSH);
>  else
>  iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
>  DMA_TLB_PSI_FLUSH);
> ...
> >>>I think this is interesting... and I doubt its correctness while with
> >>>cache mode enabled.
> >>>
> >>>If so (sending domain invalidation instead of a big range of page
> >>>invalidations), how should we capture which pages are unmapped in
> >>>emulated IOMMU?
> >>We don't need to track individual pages here, since all pages for a specific
> >>domain were unmapped I believe?
> >IMHO this might not be the correct behavior.
> >
> >If we receive one domain specific invalidation, I agree that we should
> >invalidate the IOTLB cache for all the devices inside the domain.
> >However, when cache mode is on, we should be depending on the PSIs to
> >unmap each page (unless we want to unmap the whole address space, in
> >this case it's very possible that the guest is just unmapping a range,
> >not the entire space). If we convert several PSIs into one big DSI,
> >IMHO we will leave those pages mapped/unmapped while we should
> >unmap/map them.
> 
> Confused, do you have an example for this? (I fail to understand why DSI

[Qemu-devel] VM Hung issue

2017-01-18 Thread Umar Draz

Hello,

I am running qemu-kvm on CentOS 7.3, I have few vms of CentOS and Windows
2012 servers.

Randomly they are hung, but the status of virtual machine keep running, vnc
and network not responding.

I have checked all /var/log/messages and /var/log/libvirt/qemu/vm.log but
nothing found anything wrong.

Here is the following packages list regarding qemu and libvirt.

libvirt-daemon-driver-qemu-2.0.0-10.el7_3.2.x86_64
libvirt-2.0.0-10.el7_3.2.x86_64
ipxe-roms-qemu-20160127-5.git6366fa7a.el7.noarch
qemu-img-1.5.3-126.el7.x86_64
qemu-guest-agent-2.5.0-3.el7.x86_64
qemu-kvm-1.5.3-126.el7.x86_64
qemu-kvm-common-1.5.3-126.el7.x86_64

Would you please help what should I do more so I can find the issue.

-- Umar

Re: [Qemu-devel] [PATCH v2 5/6] qmp/hmp: add set-vm-generation-id commands

2017-01-18 Thread Igor Mammedov

On Mon, 16 Jan 2017 11:20:57 -0800
b...@skyportsystems.com wrote:

> From: Igor Mammedov 
> 
> Add set-vm-generation-id command to set Virtual Machine
> Generation ID counter.
> 
> QMP command example:
> { "execute": "set-vm-generation-id",
>   "arguments": {
>   "guid": "324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"
>   }
> }
> 
> HMP command example:
> set-vm-generation-id guid=324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87
> 
> Signed-off-by: Ben Warren 
> Cc: Igor Mammedov 
> Cc: Eric Blake 
when you are reposting someone else written patch,
you are supposed to keep SoB lines patch has had and just add
your own SoB after original SoB lines.

If you made non trivial change to the patch then ask
original author(s) if they wish to keep their SoB
before posting modified patch.
(Usually I do it offline).

> ---
>  hmp-commands.hx  | 13 +
>  hmp.c| 12 
>  hmp.h|  1 +
>  qapi-schema.json | 12 
>  stubs/vmgenid.c  |  6 ++
>  5 files changed, 44 insertions(+)
> 
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index 8819281..56744aa 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1775,5 +1775,18 @@ ETEXI
>  },
>  
>  STEXI
> +@item set-vm-generation-id @var{uuid}
> +Set Virtual Machine Generation ID counter to @var{guid}
> +ETEXI
> +
> +{
> +.name   = "set-vm-generation-id",
> +.args_type  = "guid:s",
> +.params = "guid",
> +.help   = "Set Virtual Machine Generation ID counter",
> +.cmd = hmp_set_vm_generation_id,
> +},
> +
> +STEXI
>  @end table
>  ETEXI
> diff --git a/hmp.c b/hmp.c
> index 9ec27ae..a54a312 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -2579,3 +2579,15 @@ void hmp_info_vm_generation_id(Monitor *mon, const 
> QDict *qdict)
>  }
>  qapi_free_GuidInfo(info);
>  }
> +
> +void hmp_set_vm_generation_id(Monitor *mon, const QDict *qdict)
> +{
> +Error *errp = NULL;
> +const char *guid = qdict_get_str(qdict, "guid");
> +
> +qmp_set_vm_generation_id(guid, &errp);
> +if (errp) {
> +hmp_handle_error(mon, &errp);
> +return;
> +}
> +}
> diff --git a/hmp.h b/hmp.h
> index 799fd37..e0ac1e8 100644
> --- a/hmp.h
> +++ b/hmp.h
> @@ -138,5 +138,6 @@ void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict 
> *qdict);
>  void hmp_info_dump(Monitor *mon, const QDict *qdict);
>  void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict);
>  void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict);
> +void hmp_set_vm_generation_id(Monitor *mon, const QDict *qdict);
>  
>  #endif
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 2348391..c5ebea4 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -4796,3 +4796,15 @@
>  # Since 2.9
>  ##
>  { 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
> +
> +##
> +# @set-vm-generation-id
> +#
> +# Set Virtual Machine Generation ID
> +#
> +# @changed: Is the Virtual Machine Generation ID a new value?
> +# @guid: new GUID to set as Virtual Machine Generation ID
> +#
> +# Since 2.9
> +##
> +{ 'command': 'set-vm-generation-id', 'data': {'guid': 'str'} }
> diff --git a/stubs/vmgenid.c b/stubs/vmgenid.c
> index 8c448ac..d25d41b 100644
> --- a/stubs/vmgenid.c
> +++ b/stubs/vmgenid.c
> @@ -6,3 +6,9 @@ GuidInfo *qmp_query_vm_generation_id(Error **errp)
>  error_setg(errp, "this command is not currently supported");
>  return NULL;
>  }
> +
> +void qmp_set_vm_generation_id(const char *guid, Error **errp)
> +{
> +error_setg(errp, "this command is not currently supported");
> +return;
> +}

Re: [Qemu-devel] [PATCH v9 07/11] megasas: undo the overwrites of msi user configuration

2017-01-18 Thread Hannes Reinecke

On 01/17/2017 07:18 AM, Cao jin wrote:
> Commit afea4e14 seems forgetting to undo the overwrites, which is
> unsuitable.
> 
> CC: Hannes Reinecke 
> CC: Paolo Bonzini 
> CC: Markus Armbruster 
> CC: Marcel Apfelbaum 
> CC: Michael S. Tsirkin 
> 
> Reviewed-by: Markus Armbruster 
> Acked-by: Marcel Apfelbaum 
> Signed-off-by: Cao jin 
> ---
>  hw/scsi/megasas.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
> index 14d6e0c6d565..c208d520c4df 100644
> --- a/hw/scsi/megasas.c
> +++ b/hw/scsi/megasas.c
> @@ -2350,11 +2350,10 @@ static void megasas_scsi_realize(PCIDevice *dev, 
> Error **errp)
>  "msi=off with this machine type.\n");
>  error_propagate(errp, err);
>  return;
> -} else if (ret) {
> -/* With msi=auto, we fall back to MSI off silently */
> -s->msi = ON_OFF_AUTO_OFF;
> -error_free(err);
>  }
> +assert(!err || s->msix == ON_OFF_AUTO_AUTO);
> +/* With msi=auto, we fall back to MSI off silently */
> +error_free(err);
>  }
>  
>  memory_region_init_io(&s->mmio_io, OBJECT(s), &megasas_mmio_ops, s,
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [Qemu-devel] [PATCH v9 05/11] megasas: change behaviour of msix switch

2017-01-18 Thread Hannes Reinecke

On 01/17/2017 07:18 AM, Cao jin wrote:
> Resolve the TODO, msix=auto means msix on; if user specify msix=on,
> then device creation fail on msix_init failure.
> Also undo the overwrites of user configuration of msix.
> 
> CC: Michael S. Tsirkin 
> CC: Hannes Reinecke 
> CC: Paolo Bonzini 
> CC: Markus Armbruster 
> CC: Marcel Apfelbaum 
> 
> Reviewed-by: Markus Armbruster 
> Acked-by: Marcel Apfelbaum 
> Signed-off-by: Cao jin 
> ---
>  hw/scsi/megasas.c | 27 ---
>  1 file changed, 20 insertions(+), 7 deletions(-)
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [Qemu-devel] [PATCH v9 03/11] pci: Convert msix_init() to Error and fix callers

2017-01-18 Thread Hannes Reinecke

On 01/17/2017 07:18 AM, Cao jin wrote:
> msix_init() reports errors with error_report(), which is wrong when
> it's used in realize().  The same issue was fixed for msi_init() in
> commit 1108b2f. In order to make the API change as small as possible,
> leave the return value check to later patch.
> 
> For some devices(like e1000e, vmxnet3, nvme) who won't fail because of
> msix_init's failure, suppress the error report by passing NULL error
> object.
> 
> Bonus: add comment for msix_init.
> 
> CC: Jiri Pirko 
> CC: Gerd Hoffmann 
> CC: Dmitry Fleytman 
> CC: Jason Wang 
> CC: Michael S. Tsirkin 
> CC: Hannes Reinecke 
> CC: Paolo Bonzini 
> CC: Alex Williamson 
> CC: Markus Armbruster 
> CC: Marcel Apfelbaum 
> Signed-off-by: Cao jin 
> ---
>  hw/block/nvme.c|  2 +-
>  hw/misc/ivshmem.c  |  8 
>  hw/net/e1000e.c|  2 +-
>  hw/net/rocker/rocker.c |  4 +++-
>  hw/net/vmxnet3.c   |  2 +-
>  hw/pci/msix.c  | 36 +++-
>  hw/scsi/megasas.c  |  4 +++-
>  hw/usb/hcd-xhci.c  |  4 ++--
>  hw/vfio/pci.c  |  8 ++--
>  hw/virtio/virtio-pci.c |  4 ++--
>  include/hw/pci/msix.h  |  5 +++--
>  11 files changed, 57 insertions(+), 22 deletions(-)
> 
For megasas: Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [Qemu-devel] [PATCH v4 0/2] memory: extend "info mtree" with flat view dump

2017-01-18 Thread Paolo Bonzini



On 16/01/2017 09:40, Peter Xu wrote:
> v4:
> - do unref of flatview when no view is there [Dave]
> 
> v3:
> - rather than dump flatview directly in "info mtree", provide a new
>   parameter "-f" for it. [Paolo]
> - replace "RW" chars with the type of memory region [Paolo]
> - (cc dave as well since it touches HMP interface)
> 
> v2:
> - fix a size error in patch 2
> - add r-b for Marc-André in patch 1
> 
> Each address space has its own flatview. It's another way to observe
> memory info besides the default memory region hierachy, for example,
> if we want to know which memory region will handle the write to
> specific address, a flatview will suite more here than the default
> hierachical dump.
> 
> Please review. Thanks,
> 
> Peter Xu (2):
>   memory: tune mtree_print_mr() to dump mr type
>   memory: hmp: add "-f" for "info mtree"
> 
>  hmp-commands-info.hx  |  6 ++--
>  include/exec/memory.h |  2 +-
>  memory.c  | 89 
> ++-
>  monitor.c |  4 ++-
>  4 files changed, 74 insertions(+), 27 deletions(-)
> 

Queued, thanks.

Paolo

Re: [Qemu-devel] [PATCH 0/3] Remove arch_init SMBIOS and ACPI code

2017-01-18 Thread Paolo Bonzini



On 17/01/2017 20:30, Eduardo Habkost wrote:
> This creates stubs for SMBIOS and ACPI option parsing functions,
> so that we don't need arch-specific #ifdefs in arch_init just for
> calling the option parsers.
> 
> Eduardo Habkost (3):
>   stubs: Add smbios_entry_add() stub
>   stubs: acpi_table_add() stub
>   arch_init: Move acpi_table_add() call back to vl.c
> 
>  include/sysemu/arch_init.h |  2 --
>  arch_init.c| 20 
>  stubs/acpi.c   |  7 +++
>  stubs/smbios.c |  7 +++
>  vl.c   |  9 +++--
>  stubs/Makefile.objs|  2 ++
>  6 files changed, 23 insertions(+), 24 deletions(-)
>  create mode 100644 stubs/acpi.c
>  create mode 100644 stubs/smbios.c
> 

Pretty much the same patches are in my pending pull request.

Paolo

Re: [Qemu-devel] [Qemu-arm] [PATCH v2 00/18] arm: Add virtualization to GICv3, and enable EL2 on 64-bit CPUs

2017-01-18 Thread Edgar E. Iglesias

Hi Peter,  I'm on holidays and won't have time to review for a couple of
weeks at least...

Cheers,
Edgar

---
Sent from my phone

On 17 Jan 2017 14:08, "Peter Maydell"  wrote:

> On 9 January 2017 at 16:05, Peter Maydell 
> wrote:
> > This patchset adds support for the Virtualization extensions to QEMU's
> > GICv3 emulation. This was the last missing piece that was stopping
> > us from turning on the EL2 support in the CPU model, so the patchset
> > also adds support for enabling it all on the virt board via the
> > '-machine virtualization=on' option.
>
> > Patches 1, 4-13, 15, 16 still need review.
>
> So is anybody interested in reviewing the complicated
> bits in the middle of this patchset (5-13) or shall I
> just throw it into target-arm.next? :-)
>
> thanks
> -- PMM
>
>

Re: [Qemu-devel] [PATCH] arch_init: Remove unnecessary default_config_files table

2017-01-18 Thread Paolo Bonzini



On 17/01/2017 19:00, Eduardo Habkost wrote:
> The existing default_config_files table in arch_init.c has a
> single entry, making it completely unnecessary. The whole code
> can be replaced by a single qemu_read_config_file() call in vl.c.
> 
> Signed-off-by: Eduardo Habkost 
> ---
>  include/qemu/config-file.h |  4 
>  arch_init.c| 27 ---
>  vl.c   | 18 ++
>  3 files changed, 14 insertions(+), 35 deletions(-)
> 
> diff --git a/include/qemu/config-file.h b/include/qemu/config-file.h
> index 8d4b2b6d94..c80d5c8a33 100644
> --- a/include/qemu/config-file.h
> +++ b/include/qemu/config-file.h
> @@ -23,8 +23,4 @@ int qemu_read_config_file(const char *filename);
>  void qemu_config_parse_qdict(QDict *options, QemuOptsList **lists,
>   Error **errp);
>  
> -/* Read default QEMU config files
> - */
> -int qemu_read_default_config_files(bool userconfig);
> -
>  #endif /* QEMU_CONFIG_FILE_H */
> diff --git a/arch_init.c b/arch_init.c
> index 5cc58b2c35..e9b6d62d18 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -84,33 +84,6 @@ int graphic_depth = 32;
>  
>  const uint32_t arch_type = QEMU_ARCH;
>  
> -static struct defconfig_file {
> -const char *filename;
> -/* Indicates it is an user config file (disabled by -no-user-config) */
> -bool userconfig;
> -} default_config_files[] = {
> -{ CONFIG_QEMU_CONFDIR "/qemu.conf",   true },
> -{ NULL }, /* end of list */
> -};
> -
> -int qemu_read_default_config_files(bool userconfig)
> -{
> -int ret;
> -struct defconfig_file *f;
> -
> -for (f = default_config_files; f->filename; f++) {
> -if (!userconfig && f->userconfig) {
> -continue;
> -}
> -ret = qemu_read_config_file(f->filename);
> -if (ret < 0 && ret != -ENOENT) {
> -return ret;
> -}
> -}
> -
> -return 0;
> -}
> -
>  struct soundhw {
>  const char *name;
>  const char *descr;
> diff --git a/vl.c b/vl.c
> index c643d3ff3a..b563f9b924 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2990,6 +2990,18 @@ static int global_init_func(void *opaque, QemuOpts 
> *opts, Error **errp)
>  return 0;
>  }
>  
> +static int qemu_read_default_config_file(void)
> +{
> +int ret;
> +
> +ret = qemu_read_config_file(CONFIG_QEMU_CONFDIR "/qemu.conf");
> +if (ret < 0 && ret != -ENOENT) {
> +return ret;
> +}
> +
> +return 0;
> +}
> +
>  int main(int argc, char **argv, char **envp)
>  {
>  int i;
> @@ -3117,10 +3129,8 @@ int main(int argc, char **argv, char **envp)
>  }
>  }
>  
> -if (defconfig) {
> -int ret;
> -ret = qemu_read_default_config_files(userconfig);
> -if (ret < 0) {
> +if (defconfig && userconfig) {
> +if (qemu_read_default_config_file() < 0) {
>  exit(1);
>  }
>  }
> 

Looks good!  Please include it yourself in your next pull request.

Reviewed-by: Paolo Bonzini 

Though maybe we should just remove .conf file support completely...
who's using it?!?

Paolo

Re: [Qemu-devel] [PATCH RFC v3 14/14] intel_iommu: enable vfio devices

2017-01-18 Thread Tian, Kevin

> From: Peter Xu [mailto:pet...@redhat.com]
> Sent: Wednesday, January 18, 2017 4:46 PM
> 
> On Wed, Jan 18, 2017 at 04:36:05PM +0800, Jason Wang wrote:
> >
> >
> > On 2017年01月18日 16:11, Peter Xu wrote:
> > >On Wed, Jan 18, 2017 at 11:10:53AM +0800, Jason Wang wrote:
> > >>
> > >>On 2017年01月17日 22:45, Peter Xu wrote:
> > >>>On Mon, Jan 16, 2017 at 05:54:55PM +0800, Jason Wang wrote:
> > On 2017年01月16日 17:18, Peter Xu wrote:
> > >>>  static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t
> domain_id,
> > >>>hwaddr addr, uint8_t am)
> > >>>  {
> > >>>@@ -1222,6 +1251,7 @@ static void
> vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
> > >>>  info.addr = addr;
> > >>>  info.mask = ~((1 << am) - 1);
> > >>>  g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_page,
> &info);
> > >>>+vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am);
> > >>Is the case of GLOBAL or DSI flush missed, or we don't care it at all?
> > >IMHO we don't. For device assignment, since we are having CM=1 here,
> > >we should have explicit page invalidations even if guest sends
> > >global/domain invalidations.
> > >
> > >Thanks,
> > >
> > >-- peterx
> > Is this spec required?
> > >>>I think not. IMO the spec is very coarse grained on describing cache
> > >>>mode...
> > >>>
> > Btw, it looks to me that both DSI and GLOBAL are
> > indeed explicit flush.
> > >>>Actually when cache mode is on, it is unclear to me on how we should
> > >>>treat domain/global invalidations, at least from the spec (as
> > >>>mentioned earlier). My understanding is that they are not "explicit
> > >>>flushes", which IMHO should only mean page selective IOTLB
> > >>>invalidations.
> > >>Probably not, at least from the view of performance. DSI and global should
> > >>be more efficient in some cases.
> > >I agree with you that DSI/GLOBAL flushes are more efficient in some
> > >way. But IMHO that does not mean these invalidations are "explicit
> > >invalidations", and I suspect whether cache mode has to coop with it.
> >
> > Well, the spec does not forbid DSI/GLOBAL with CM and the driver codes had
> > used them for almost ten years. I can hardly believe it's wrong.
> 
> I think we have misunderstanding here. :)
> 
> I never thought we should not send DSI/GLOBAL invalidations with cache
> mode. I just think we should not do anything special even if we have
> cache mode on when we receive these signals.
> 
> In the spec, "explicit invalidation" is mentioned in the cache mode
> chapter:
> 
> The Caching Mode (CM) field in Capability Register indicates if
> the hardware implementation caches not-present or erroneous
> translation-structure entries. When the CM field is reported as
> Set, any software updates to any remapping structures (including
> updates to not-present entries or present entries whose
> programming resulted in translation faults) requires explicit
> invalidation of the caches.
> 
> And I thought we were discussion about "what is explicit invalidation"
> mentioned above.

Check 6.5.3.1 Implicit Invalidation on Page Requests

In addition to the explicit invalidation through invalidation commands 
(see Section 6.5.1 and Section 6.5.2) or through deferred invalidation 
messages (see Section 6.5.4), identified above, Page Requests from 
endpoint devices invalidate entries in the IOTLBs and paging-structure 
caches.

My impression is that above indirectly defines invalidation commands (
PSI/DSI/GLOBAL) as explicit invalidation, because they are explicitly
issued by driver. Then section 6.5.3.1 further describes implicit
invalidations caused by other VT-d operations.

I will check with VT-d spec owner to clarify.

> 
> >
> > >
> > >But here I should add one more thing besides PSI - context entry
> > >invalidation should be one of "the explicit invalidations" as well,
> > >which we need to handle just like PSI when cache mode is on.
> > >
> > Just have a quick go through on driver codes and find this something
> > interesting in intel_iommu_flush_iotlb_psi():
> > 
> > ...
> >  /*
> >   * Fallback to domain selective flush if no PSI support or the 
> >  size is
> >   * too big.
> >   * PSI requires page size to be 2 ^ x, and the base address is 
> >  naturally
> >   * aligned to the size
> >   */
> >  if (!cap_pgsel_inv(iommu->cap) || mask >
> cap_max_amask_val(iommu->cap))
> >  iommu->flush.flush_iotlb(iommu, did, 0, 0,
> >  DMA_TLB_DSI_FLUSH);
> >  else
> >  iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
> >  DMA_TLB_PSI_FLUSH);
> > ...
> > >>>I think this is interesting... and I doubt its correctness while with
> > >>>cache mode enabled.
> > >

Re: [Qemu-devel] [PATCH 9/9] tests: Test case for query-cpu-model-expansion

2017-01-18 Thread David Hildenbrand


Am 17.01.2017 um 02:02 schrieb Eduardo Habkost:

+def checkExpansions(self, model, msg):
+"""Perform multiple expansion operations on model, validate results
+
+@model is a CpuModelExpansionInfo struct, with some extra keys:
+* model['runnable'] should be set to True if the CPU model is
+  runnable on this host
+* model['qom-props'] will be set to the full list of properties for
+  the CPU, if the model is runnable
+"""
+exp_s = self.checkOneExpansion(model, 'static',
+   '%s.static' % (msg))
+exp_f = self.checkOneExpansion(model, 'full',
+   '%s.full' % (msg))
+exp_ss = self.checkOneExpansion(exp_s, 'static',
+'%s.static.static' % (msg))
+exp_sf = self.checkOneExpansion(exp_s, 'full',
+'%s.static.full' % (msg))
+exp_ff = self.checkOneExpansion(exp_f, 'full',
+'%s.full.full' % (msg))
+
+# static expansion twice should result in the same data:
+self.assertEquals(exp_s, exp_ss, '%s: static != static+static' % (msg))
+# full expansion twice should also result in the same data:
+self.assertEquals(exp_f, exp_ff, '%s: full != full+full' % (msg))
+
+# migration-safe CPU models have an extra feature:
+# their static expansion should be equivalent to the full
+# expansion (as their static expansion is also precise)


This is not true for s390x:

"z13-base" is both, static and migration-safe.

Doing a full expansion will expand all features (so your check against
QOM properties should succeed)

Doing a static expansion will expand no features, as z13-base is
already static, so there are no features to expand (no delta changes).

"z13" is only migration-safe.

Doing a full expansion will expand all features.

Doing a static expansion will only expand the features different to
"z13-base". (Remember, delta changes only to minimize reported
features).


And I wonder if that is also true for x86? This should only be true if
the "base" model contains absolutely no features.


+if self.isMigrationSafe(model['model']):
+self.assertEquals(exp_sf['model']['props'], 
exp_f['model']['props'],
+  '%s: props: static+full != full' % (msg))
+self.assertEquals(exp_sf.get('qom-props'), exp_f.get('qom-props'),
+  '%s: qom-props: static+full != full' % (msg))



--

David

Re: [Qemu-devel] [PATCH v5 0/3] Add litmus tests for MTTCG consistency tests

2017-01-18 Thread Fam Zheng

On Wed, 01/18 00:00, no-re...@patchew.org wrote:
> Hi,
> 
> Your series seems to have some coding style problems. See output below for
> more information:

Sorry for the noise, looks like it's a network issue during preparation,
unrelated to this series. I'll look into suppressing such errors in the future.

Fam

> 
> Message-id: 20161201052844.31819-1-bobby.pr...@gmail.com
> Subject: [Qemu-devel] [PATCH v5 0/3] Add litmus tests for MTTCG consistency 
> tests
> Type: series
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> 
> BASE=base
> n=1
> total=$(git log --oneline $BASE.. | wc -l)
> failed=0
> 
> # Useful git options
> git config --local diff.renamelimit 0
> git config --local diff.renames True
> 
> commits="$(git log --format=%H --reverse $BASE..)"
> for c in $commits; do
> echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
> if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; 
> then
> failed=1
> echo
> fi
> n=$((n+1))
> done
> 
> exit $failed
> === TEST SCRIPT END ===
> 
> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> fatal: unable to access 'https://github.com/patchew-project/qemu/': Failed 
> connect to github.com:443; Operation now in progress
> error: Could not fetch 3c8cf5a9c21ff8782164d1def7f44bd888713384
> Traceback (most recent call last):
>   File "/usr/bin/patchew", line 406, in test_one
> git_clone_repo(clone, r["repo"], r["head"], logf)
>   File "/usr/bin/patchew", line 47, in git_clone_repo
> stdout=logf, stderr=logf)
>   File "/usr/lib64/python3.4/subprocess.py", line 561, in check_call
> raise CalledProcessError(retcode, cmd)
> subprocess.CalledProcessError: Command '['git', 'remote', 'add', '-f', 
> '--mirror=fetch', '3c8cf5a9c21ff8782164d1def7f44bd888713384', 
> 'https://github.com/patchew-project/qemu']' returned non-zero exit status 1
> 
> 
> 
> ---
> Email generated automatically by Patchew [http://patchew.org/].
> Please send your feedback to patchew-de...@freelists.org

[Qemu-devel] [Bug 788881] Re: i386-bsd-user and similar don't build on Mac OS X

2017-01-18 Thread Thomas Huth

The bsd-user target is currently unmainted in QEMU, and I think it is
not meant for Mac OS X, but rather for FreeBSD and friends. So let's
close this ticket now...

** Changed in: qemu
   Status: New => Invalid

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/71

Title:
  i386-bsd-user and similar don't build on Mac OS X

Status in QEMU:
  Invalid

Bug description:
  0.14.1 crashes on Mac OS X 64bit with some targets (*-bsd-user):

CCi386-bsd-user/cpu-exec.o
  /Users/michael/Downloads/qemu-0.14.1/cpu-exec.c: In function 
‘cpu_x86_signal_handler’:
  /Users/michael/Downloads/qemu-0.14.1/cpu-exec.c:895: error: dereferencing 
pointer to incomplete type
  /Users/michael/Downloads/qemu-0.14.1/cpu-exec.c:895: error: ‘REG_RIP’ 
undeclared (first use in this function)
  /Users/michael/Downloads/qemu-0.14.1/cpu-exec.c:895: error: (Each undeclared 
identifier is reported only once
  /Users/michael/Downloads/qemu-0.14.1/cpu-exec.c:895: error: for each function 
it appears in.)
  /Users/michael/Downloads/qemu-0.14.1/cpu-exec.c:897: error: dereferencing 
pointer to incomplete type
  /Users/michael/Downloads/qemu-0.14.1/cpu-exec.c:897: error: ‘REG_TRAPNO’ 
undeclared (first use in this function)
  /Users/michael/Downloads/qemu-0.14.1/cpu-exec.c:898: error: dereferencing 
pointer to incomplete type
  /Users/michael/Downloads/qemu-0.14.1/cpu-exec.c:898: error: ‘REG_ERR’ 
undeclared (first use in this function)
  /Users/michael/Downloads/qemu-0.14.1/cpu-exec.c:899: error: dereferencing 
pointer to incomplete type
  make[1]: *** [cpu-exec.o] Error 1

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/71/+subscriptions

[Qemu-devel] [Bug 788886] Re: Crash with -m32 and gcc-4.2 on Mac OS X 64bit

2017-01-18 Thread Thomas Huth

Which version of QEMU were you using? Can you still reproduce this
problem with the latest version of QEMU and the latest version of macOS?

** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/76

Title:
  Crash with -m32 and gcc-4.2 on Mac OS X 64bit

Status in QEMU:
  Incomplete

Bug description:
  For building 32bit Qemu on Mac OS X 10.6.7 , i configure with --extra-
  cflags=-m32 --extra-ldflags=-m32. make with gcc-4.2 then crashes with:

GEN   qemu-options.def
CCqemu-nbd.o
  gcc-4.2: -E, -S, -save-temps and -M options are not allowed with multiple 
-arch flags
  make: *** [qemu-nbd.o] Error 1

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/76/+subscriptions

[Qemu-devel] [Bug 735752] Re: qemu squeeze crashes "BUG: unable to handle kernel NULL pointer dereference at (null)"

2017-01-18 Thread Thomas Huth

Sounds like this was a kernel bug ... you should report those to the
right kernel mailing lists instead. Anyway, closing this ticket now
since there hasn't been any activity since a long time.

** Changed in: qemu
   Status: New => Invalid

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/735752

Title:
  qemu squeeze crashes "BUG: unable to handle kernel NULL pointer
  dereference at   (null)"

Status in QEMU:
  Invalid

Bug description:
  my virtual machine server (qemu+libvirt) regularly breaks down with such a 
record in the logs
  I can not even ping the guest, but i can ping host, but can not do something 
with it (cannot ssh login for example)
  And I dont know how to reproduce the problem :(

  Mar 15 17:58:04 mainhost kernel: [65866.976982] BUG: unable to handle kernel 
NULL pointer dereference at   (null)
  Mar 15 17:58:04 mainhost kernel: [65866.977422] IP: [] 
0x8100efbe  

  Mar 15 17:58:04 mainhost kernel: [65866.977663] PGD 7387b7067 PUD 81b723067 
PMD 0.  
 
  Mar 15 17:58:04 mainhost kernel: [65866.977902] Oops:  [#1] SMP.  
   
  Mar 15 17:58:04 mainhost kernel: [65866.978128] last sysfs file: 
/sys/devices/system/cpu/cpu3/topology/thread_siblings   

  Mar 15 17:58:04 mainhost kernel: [65866.978572] CPU 1.
   
  Mar 15 17:58:04 mainhost kernel: [65866.978577] Modules linked in: nfs lockd 
nfs_acl auth_rpcgss sunrpc ebtable_nat ebtables coretemp bridge stp llc xt_state
  Mar 15 17:58:04 mainhost kernel: [65866.979737].  
   
  Mar 15 17:58:04 mainhost kernel: [65866.979959] Pid: 3369, comm: 
qemu-system-x86 Not tainted 2.6.37-gentoo-r2 #2 Intel S5000VSA/S5000VSA 

  Mar 15 17:58:04 mainhost kernel: [65866.980085] RIP: 
0010:[]  [] 0x8100efbe  

  Mar 15 17:58:04 mainhost kernel: [65866.980085] RSP: 0018:880738767a48  
EFLAGS: 00010246
 
  Mar 15 17:58:04 mainhost kernel: [65866.980085] RAX:  RBX: 
f001 RCX: 88081cbeb948  
  
  Mar 15 17:58:04 mainhost kernel: [65866.980085] RDX: 0022 RSI: 
f001 RDI: 88081cbeb000  
  
  Mar 15 17:58:04 mainhost kernel: [65866.980085] RBP: 0001 R08: 
000fee01 R09: 0022  
  
  Mar 15 17:58:04 mainhost kernel: [65866.980085] R10: 0080 R11: 
ea00 R12: 880818d83490  
  
  Mar 15 17:58:04 mainhost kernel: [65866.980085] R13: 155e5000 R14: 
 R15: 0100  
  
  Mar 15 17:58:04 mainhost kernel: [65866.980085] FS:  7f5f25e4e700() 
GS:88009f68() knlGS:f80001175000
 
  Mar 15 17:58:04 mainhost kernel: [65866.980085] CS:  0010 DS: 002b ES: 002b 
CR0: 8005003b   
 
  Mar 15 17:58:04 mainhost kernel: [65866.980085] CR2:  CR3: 
000806be9000 CR4: 000426e0  
  
  Mar 15 17:58:04 mainhost kernel: [65866.980085] DR0: 0045 DR1: 
 DR2:   
  
  Mar 15 17:58:04 mainhost kernel: [65866.980085] DR3: 0005 DR6: 
0ff0 DR7: 0400  
  
  Mar 15 17:58:04 mainhost kernel: [65866.980085] Process qemu-system-x86 (pid: 
3369, threadinfo 880738766000, task 8808203ac360)  
  Mar 15 17:58:04 mainhost kernel: [65866.980085] Stack:
   
  Mar 15 17:58:04 mainhost kernel: [65866.980085]   
8806a30f3ff8 88075398 8100f06f  
   
  Mar 15 17:58:04 mainhost kernel: [65866.980085]  0ff8 
8807705d6b40 0ff8 810123f0  
   
  Mar 15 17:58:04 mainhost kernel: [65866.980085]   
    
   
  Mar 15 17:58:04 mainhost kern

Re: [Qemu-devel] [PATCH v5 2/2] gdbstub.c: update old error report statements

2017-01-18 Thread Fam Zheng

On Wed, 01/18 16:03, Ziyue Yang wrote:
> From: Ziyue Yang 
> 
> Some updates from fprintf(stderr, ...) to error_report.

In the future, please use "git format-patch --cover-letter" and "git send-email
--thread" to make sure the patches form a thread. See also:

http://wiki.qemu.org/Contribute/SubmitAPatch

Fam

Re: [Qemu-devel] [PATCH v6 0/2] allow blockdev-add for NFS

2017-01-18 Thread Kevin Wolf

Am 17.01.2017 um 16:14 hat Peter Lieven geschrieben:
> Am 31.10.2016 um 18:20 schrieb Kevin Wolf:
> >Am 31.10.2016 um 16:05 hat Ashijeet Acharya geschrieben:
> >>Previously posted series patches:
> >>v5: https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg07580.html
> >>v4: https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg07449.html
> >>v3: https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg06903.html
> >>v2: https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg05844.html
> >>v1: https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg04487.html
> >>
> >>This series adds blockdev-add support for NFS block driver.
> >Thanks, fixed as commented on patch 1 and applied.
> 
> Hi,
> 
> it seems this series breaks passing options via URI.
> 
> 1) in nfs_parse_uri
> 
> parse_uint_full(qp->p[i].value, NULL, 0)
> 
> segfaults, as the routine wants to set *NULL = val.

Yes, you're right.

> 2) all parameters that have a different names in options and qdict
> e.g. readahead-size vs. readahead cannot be passed via URI.
> 
> $ qemu-img convert -p 
> nfs://172.21.200.61/templates/VC_debian8-20170116.qcow2,linux\?readahead=131072
>  
> iscsi://172.21.200.56:3260/iqn.2001-05.com.equallogic:0-8a0906-69d384e0a-aa3004e55e15878d-XXX/0
> 
> qemu-img: Could not open 
> 'nfs://172.21.200.61/vcore-dev-cdrom/templates/VC_debian8-20170116.qcow2,linux?readahead=131072':
>  Block protocol 'nfs' doesn't support the option 'readahead-size'
> 
> Please let me know if the below fix would be correct:

No, this needs to be fixed the other way round: runtime_opts must use
the names as specified in the schema, and nfs_client_open() must access
them as such. Without that, blockdev-add can't work (and the command
line only with the "wrong" old option names from the URL, whereas it
should be using the same names as the QAPI schema).

Kevin

Re: [Qemu-devel] [PATCH v6 wave 2 2/3] hw/isa/lpc_ich9: add broadcast SMI feature

2017-01-18 Thread Igor Mammedov

On Tue, 17 Jan 2017 19:53:21 +0100
Laszlo Ersek  wrote:

> On 01/17/17 15:20, Igor Mammedov wrote:
> > On Tue, 17 Jan 2017 14:46:05 +0100
> > Laszlo Ersek  wrote:
> >   
> >> On 01/17/17 14:20, Igor Mammedov wrote:  
> >>> On Tue, 17 Jan 2017 13:06:13 +0100
> >>> Laszlo Ersek  wrote:
> >>> 
>  On 01/17/17 12:21, Igor Mammedov wrote:
> > On Mon, 16 Jan 2017 21:46:55 +0100
> > Laszlo Ersek  wrote:
> >   
> >> On 01/13/17 14:09, Igor Mammedov wrote:  
> >>> On Thu, 12 Jan 2017 19:24:45 +0100
> >>> Laszlo Ersek  wrote:
> >>> 
>  The generic edk2 SMM infrastructure prefers
>  EFI_SMM_CONTROL2_PROTOCOL.Trigger() to inject an SMI on each 
>  processor. If
>  Trigger() only brings the current processor into SMM, then edk2 
>  handles it
>  in the following ways:
> 
>  (1) If Trigger() is executed by the BSP (which is guaranteed before
>  ExitBootServices(), but is not necessarily true at runtime), 
>  then:
> 
>  (a) If edk2 has been configured for "traditional" SMM 
>  synchronization,
>  then the BSP sends directed SMIs to the APs with APIC 
>  delivery,
>  bringing them into SMM individually. Then the BSP runs the 
>  SMI
>  handler / dispatcher.
> 
>  (b) If edk2 has been configured for "relaxed" SMM 
>  synchronization,
>  then the APs that are not already in SMM are not brought in, 
>  and
>  the BSP runs the SMI handler / dispatcher.
> 
>  (2) If Trigger() is executed by an AP (which is possible after
>  ExitBootServices(), and can be forced e.g. by "taskset -c 1
>  efibootmgr"), then the AP in question brings in the BSP with a
>  directed SMI, and the BSP runs the SMI handler / dispatcher.
> 
>  The smaller problem with (1a) and (2) is that the BSP and AP
>  synchronization is slow. For example, the "taskset -c 1 efibootmgr"
>  command from (2) can take more than 3 seconds to complete, because
>  efibootmgr accesses non-volatile UEFI variables intensively.
> 
>  The larger problem is that QEMU's current behavior diverges from the
>  behavior usually seen on physical hardware, and that keeps exposing
>  obscure corner cases, race conditions and other instabilities in 
>  edk2,
>  which generally expects / prefers a software SMI to affect all CPUs 
>  at
>  once.
> 
>  Therefore introduce the "broadcast SMI" feature that causes QEMU to 
>  inject
>  the SMI on all VCPUs.
> 
>  While the original posting of this patch
>  
>  only intended to speed up (2), based on our recent "stress testing" 
>  of SMM
>  this patch actually provides functional improvements.
> 
>  Cc: "Michael S. Tsirkin" 
>  Cc: Gerd Hoffmann 
>  Cc: Igor Mammedov 
>  Cc: Paolo Bonzini 
>  Signed-off-by: Laszlo Ersek 
>  Reviewed-by: Michael S. Tsirkin 
>  ---
> 
>  Notes:
>  v6:
>  - no changes, pick up Michael's R-b
>  
>  v5:
>  - replace the ICH9_LPC_SMI_F_BROADCAST bit value with the
>    ICH9_LPC_SMI_F_BROADCAST_BIT bit position (necessary for
>    DEFINE_PROP_BIT() in the next patch)
> 
>   include/hw/i386/ich9.h |  3 +++
>   hw/isa/lpc_ich9.c  | 10 +-
>   2 files changed, 12 insertions(+), 1 deletion(-)
> 
>  diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h
>  index da1118727146..18dcca7ebcbf 100644
>  --- a/include/hw/i386/ich9.h
>  +++ b/include/hw/i386/ich9.h
>  @@ -250,4 +250,7 @@ Object *ich9_lpc_find(void);
>   #define ICH9_SMB_HST_D1 0x06
>   #define ICH9_SMB_HOST_BLOCK_DB  0x07
>   
>  +/* bit positions used in fw_cfg SMI feature negotiation */
>  +#define ICH9_LPC_SMI_F_BROADCAST_BIT0
>  +
>   #endif /* HW_ICH9_H */
>  diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
>  index 376b7801a42c..ced6f803a4f2 100644
>  --- a/hw/isa/lpc_ich9.c
>  +++ b/hw/isa/lpc_ich9.c
>  @@ -437,7 +437,15 @@ static void ich9_apm_ctrl_changed(uint32_t val, 
>  void *arg)
>   
>   /* SMI_EN = PMBASE + 30. SMI control and enable register */
>   if (lpc->pm.smi_en & ICH9_PMIO_SMI_EN_APMC_EN) {
>  -cpu_interrupt(current_cpu, CPU_INTERRUPT_SMI);
>  +if (lpc->smi_negotiated_feat

Re: [Qemu-devel] [PATCH RFC v3 14/14] intel_iommu: enable vfio devices

2017-01-18 Thread Jason Wang




On 2017年01月18日 17:38, Tian, Kevin wrote:

From: Peter Xu [mailto:pet...@redhat.com]
Sent: Wednesday, January 18, 2017 4:46 PM

On Wed, Jan 18, 2017 at 04:36:05PM +0800, Jason Wang wrote:


On 2017年01月18日 16:11, Peter Xu wrote:

On Wed, Jan 18, 2017 at 11:10:53AM +0800, Jason Wang wrote:

On 2017年01月17日 22:45, Peter Xu wrote:

On Mon, Jan 16, 2017 at 05:54:55PM +0800, Jason Wang wrote:

On 2017年01月16日 17:18, Peter Xu wrote:

  static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t

domain_id,

hwaddr addr, uint8_t am)
  {
@@ -1222,6 +1251,7 @@ static void

vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,

  info.addr = addr;
  info.mask = ~((1 << am) - 1);
  g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_page,

&info);

+vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am);

Is the case of GLOBAL or DSI flush missed, or we don't care it at all?

IMHO we don't. For device assignment, since we are having CM=1 here,
we should have explicit page invalidations even if guest sends
global/domain invalidations.

Thanks,

-- peterx

Is this spec required?

I think not. IMO the spec is very coarse grained on describing cache
mode...


Btw, it looks to me that both DSI and GLOBAL are
indeed explicit flush.

Actually when cache mode is on, it is unclear to me on how we should
treat domain/global invalidations, at least from the spec (as
mentioned earlier). My understanding is that they are not "explicit
flushes", which IMHO should only mean page selective IOTLB
invalidations.

Probably not, at least from the view of performance. DSI and global should
be more efficient in some cases.

I agree with you that DSI/GLOBAL flushes are more efficient in some
way. But IMHO that does not mean these invalidations are "explicit
invalidations", and I suspect whether cache mode has to coop with it.

Well, the spec does not forbid DSI/GLOBAL with CM and the driver codes had
used them for almost ten years. I can hardly believe it's wrong.

I think we have misunderstanding here. :)

I never thought we should not send DSI/GLOBAL invalidations with cache
mode. I just think we should not do anything special even if we have
cache mode on when we receive these signals.

In the spec, "explicit invalidation" is mentioned in the cache mode
chapter:

 The Caching Mode (CM) field in Capability Register indicates if
 the hardware implementation caches not-present or erroneous
 translation-structure entries. When the CM field is reported as
 Set, any software updates to any remapping structures (including
 updates to not-present entries or present entries whose
 programming resulted in translation faults) requires explicit
 invalidation of the caches.

And I thought we were discussion about "what is explicit invalidation"
mentioned above.

Check 6.5.3.1 Implicit Invalidation on Page Requests

In addition to the explicit invalidation through invalidation commands
(see Section 6.5.1 and Section 6.5.2) or through deferred invalidation
messages (see Section 6.5.4), identified above, Page Requests from
endpoint devices invalidate entries in the IOTLBs and paging-structure
caches.

My impression is that above indirectly defines invalidation commands (
PSI/DSI/GLOBAL) as explicit invalidation, because they are explicitly
issued by driver. Then section 6.5.3.1 further describes implicit
invalidations caused by other VT-d operations.

I will check with VT-d spec owner to clarify.


Good to hear from you.

So I think we should implement DSI and GLOBAL for vfio in this case. We 
can first try to implement it through current VFIO API which can accepts 
a range of iova. If not possible, let's discuss for other possible 
solutions.





But here I should add one more thing besides PSI - context entry
invalidation should be one of "the explicit invalidations" as well,
which we need to handle just like PSI when cache mode is on.


Just have a quick go through on driver codes and find this something
interesting in intel_iommu_flush_iotlb_psi():

...
 /*
  * Fallback to domain selective flush if no PSI support or the size is
  * too big.
  * PSI requires page size to be 2 ^ x, and the base address is naturally
  * aligned to the size
  */
 if (!cap_pgsel_inv(iommu->cap) || mask >

cap_max_amask_val(iommu->cap))

 iommu->flush.flush_iotlb(iommu, did, 0, 0,
 DMA_TLB_DSI_FLUSH);
 else
 iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
 DMA_TLB_PSI_FLUSH);
...

I think this is interesting... and I doubt its correctness while with
cache mode enabled.

If so (sending domain invalidation instead of a big range of page
invalidations), how should we capture which pages are unmapped in
emulated IOMMU?

We don't need to track individual pages here, since all pages for a specific
domain were unmapped I believe?

Re: [Qemu-devel] [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration

2017-01-18 Thread David Hildenbrand


Am 21.12.2016 um 07:52 schrieb Liang Li:

This patch set contains two parts of changes to the virtio-balloon.

One is the change for speeding up the inflating & deflating process,
the main idea of this optimization is to use {pfn|length} to present
the page information instead of the PFNs, to reduce the overhead of
virtio data transmission, address translation and madvise(). This can
help to improve the performance by about 85%.

Another change is for speeding up live migration. By skipping process
guest's unused pages in the first round of data copy, to reduce needless
data processing, this can help to save quite a lot of CPU cycles and
network bandwidth. We put guest's unused page information in a
{pfn|length} array and send it to host with the virt queue of
virtio-balloon. For an idle guest with 8GB RAM, this can help to shorten
the total live migration time from 2Sec to about 500ms in 10Gbps network
environment. For an guest with quite a lot of page cache and with little
unused pages, it's possible to let the guest drop it's page cache before
live migration, this case can benefit from this new feature too.


I agree that both changes make sense (although the second change just 
smells very racy, as you also pointed out in the patch description),

however I am not sure if virtio-balloon is really the right place for
the latter change.

virtio-balloon is all about ballooning, nothing else. What you're doing
is using it as a way to communicate balloon-unrelated data from/to the
hypervisor. Yes, it is also about guest memory, but completely unrelated
to the purpose of the balloon device.

Maybe using virtio-balloon for this purpose is okay - I have mixed
feelings (especially as I can't tell where else this could go). I would
like to get a second opinion on this.

--

David

Re: [Qemu-devel] [PATCH v6 wave 2 2/3] hw/isa/lpc_ich9: add broadcast SMI feature

2017-01-18 Thread Laszlo Ersek

On 01/18/17 11:03, Igor Mammedov wrote:
> On Tue, 17 Jan 2017 19:53:21 +0100
> Laszlo Ersek  wrote:
> 
>> On 01/17/17 15:20, Igor Mammedov wrote:
>>> On Tue, 17 Jan 2017 14:46:05 +0100
>>> Laszlo Ersek  wrote:
>>>   
 On 01/17/17 14:20, Igor Mammedov wrote:  
> On Tue, 17 Jan 2017 13:06:13 +0100
> Laszlo Ersek  wrote:
> 
>> On 01/17/17 12:21, Igor Mammedov wrote:
>>> On Mon, 16 Jan 2017 21:46:55 +0100
>>> Laszlo Ersek  wrote:
>>>   
 On 01/13/17 14:09, Igor Mammedov wrote:  
> On Thu, 12 Jan 2017 19:24:45 +0100
> Laszlo Ersek  wrote:
> 
>> The generic edk2 SMM infrastructure prefers
>> EFI_SMM_CONTROL2_PROTOCOL.Trigger() to inject an SMI on each 
>> processor. If
>> Trigger() only brings the current processor into SMM, then edk2 
>> handles it
>> in the following ways:
>>
>> (1) If Trigger() is executed by the BSP (which is guaranteed before
>> ExitBootServices(), but is not necessarily true at runtime), 
>> then:
>>
>> (a) If edk2 has been configured for "traditional" SMM 
>> synchronization,
>> then the BSP sends directed SMIs to the APs with APIC 
>> delivery,
>> bringing them into SMM individually. Then the BSP runs the 
>> SMI
>> handler / dispatcher.
>>
>> (b) If edk2 has been configured for "relaxed" SMM 
>> synchronization,
>> then the APs that are not already in SMM are not brought in, 
>> and
>> the BSP runs the SMI handler / dispatcher.
>>
>> (2) If Trigger() is executed by an AP (which is possible after
>> ExitBootServices(), and can be forced e.g. by "taskset -c 1
>> efibootmgr"), then the AP in question brings in the BSP with a
>> directed SMI, and the BSP runs the SMI handler / dispatcher.
>>
>> The smaller problem with (1a) and (2) is that the BSP and AP
>> synchronization is slow. For example, the "taskset -c 1 efibootmgr"
>> command from (2) can take more than 3 seconds to complete, because
>> efibootmgr accesses non-volatile UEFI variables intensively.
>>
>> The larger problem is that QEMU's current behavior diverges from the
>> behavior usually seen on physical hardware, and that keeps exposing
>> obscure corner cases, race conditions and other instabilities in 
>> edk2,
>> which generally expects / prefers a software SMI to affect all CPUs 
>> at
>> once.
>>
>> Therefore introduce the "broadcast SMI" feature that causes QEMU to 
>> inject
>> the SMI on all VCPUs.
>>
>> While the original posting of this patch
>> 
>> only intended to speed up (2), based on our recent "stress testing" 
>> of SMM
>> this patch actually provides functional improvements.
>>
>> Cc: "Michael S. Tsirkin" 
>> Cc: Gerd Hoffmann 
>> Cc: Igor Mammedov 
>> Cc: Paolo Bonzini 
>> Signed-off-by: Laszlo Ersek 
>> Reviewed-by: Michael S. Tsirkin 
>> ---
>>
>> Notes:
>> v6:
>> - no changes, pick up Michael's R-b
>> 
>> v5:
>> - replace the ICH9_LPC_SMI_F_BROADCAST bit value with the
>>   ICH9_LPC_SMI_F_BROADCAST_BIT bit position (necessary for
>>   DEFINE_PROP_BIT() in the next patch)
>>
>>  include/hw/i386/ich9.h |  3 +++
>>  hw/isa/lpc_ich9.c  | 10 +-
>>  2 files changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h
>> index da1118727146..18dcca7ebcbf 100644
>> --- a/include/hw/i386/ich9.h
>> +++ b/include/hw/i386/ich9.h
>> @@ -250,4 +250,7 @@ Object *ich9_lpc_find(void);
>>  #define ICH9_SMB_HST_D1 0x06
>>  #define ICH9_SMB_HOST_BLOCK_DB  0x07
>>  
>> +/* bit positions used in fw_cfg SMI feature negotiation */
>> +#define ICH9_LPC_SMI_F_BROADCAST_BIT0
>> +
>>  #endif /* HW_ICH9_H */
>> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
>> index 376b7801a42c..ced6f803a4f2 100644
>> --- a/hw/isa/lpc_ich9.c
>> +++ b/hw/isa/lpc_ich9.c
>> @@ -437,7 +437,15 @@ static void ich9_apm_ctrl_changed(uint32_t val, 
>> void *arg)
>>  
>>  /* SMI_EN = PMBASE + 30. SMI control and enable register */
>>  if (lpc->pm.smi_en & ICH9_PMIO_SMI_EN_APMC_EN) {
>> -cpu_interrupt(current_cpu, CPU_INTERRUPT_SMI);
>

Re: [Qemu-devel] [PATCH v6 wave 2 2/3] hw/isa/lpc_ich9: add broadcast SMI feature

2017-01-18 Thread Laszlo Ersek

On 01/18/17 11:19, Laszlo Ersek wrote:
> On 01/18/17 11:03, Igor Mammedov wrote:
>> On Tue, 17 Jan 2017 19:53:21 +0100
>> Laszlo Ersek  wrote:
>>

[snip]

>>> This is the code I wrote (extracting a new ich9_apm_broadcast_smi()
>>> function from ich9_apm_ctrl_changed(), due to size reasons):
>>>
 static void ich9_apm_broadcast_smi(void)
 {
 CPUState *cs;

 cpu_synchronize_all_states(); /* <- mark this */
>> it probably should be executed after pause_all_vcpus(),
>> maybe it will help.
>>
 CPU_FOREACH(cs) {
 X86CPU *cpu = X86_CPU(cs);
 CPUX86State *env = &cpu->env;

 if (env->smbase == 0x3 && env->eip == 0xfff0) {
 CPUClass *k = CPU_GET_CLASS(cs);
 uint64_t cpu_arch_id = k->get_arch_id(cs);

 trace_ich9_apm_broadcast_smi_skip(cpu_arch_id);
 continue;
 }

 cpu_interrupt(cs, CPU_INTERRUPT_SMI);
 }
 }  
>>>
>> [...]
>>> (b) If I add the cpu_synchronize_all_states() call before the loop, then
>>> the incorrect filter matches go away; QEMU sees the KVM VCPU states
>>> correctly, and the SMI is broad-cast.
>>>
>>> However, in this case, the boot slows to a *crawl*. It's unbearable. All
>>> VCPUs are pegged at 100%, and you can easily follow the OVMF debug log
>>> with the naked eye, almost line by line.
>>> I didn't expect that cpu_synchronize_all_states() would be so costly,
>>> but it makes the idea of checking VCPU state in the APM_CNT handler a
>>> non-starter.
>> I wonder were bottleneck is to slow down guest so much.
>> What is actual cost of calling cpu_synchronize_all_states()?
>>
>> Maybe calling pause_all_vcpus() before cpu_synchronize_all_states()
>> would help.
> 
> If I prepend just a pause_all_vcpus() function call, at the top, then
> the VM freezes (quite understandably) when the first SMI is raised via
> APM_CNT.
> 
> If I, instead, bracket the function body with pause_all_vcpus() and
> resume_all_vcpus(), like this:
> 
> static void ich9_apm_broadcast_smi(void)
> {
> CPUState *cs;
> 
> pause_all_vcpus();
> cpu_synchronize_all_states();
> CPU_FOREACH(cs) {
> X86CPU *cpu = X86_CPU(cs);
> CPUX86State *env = &cpu->env;
> 
> if (env->smbase == 0x3 && env->eip == 0xfff0) {
> CPUClass *k = CPU_GET_CLASS(cs);
> uint64_t cpu_arch_id = k->get_arch_id(cs);
> 
> trace_ich9_apm_broadcast_smi_skip(cpu_arch_id);
> continue;
> }
> 
> cpu_interrupt(cs, CPU_INTERRUPT_SMI);
> }
> resume_all_vcpus();
> }
> 
> then I see the following symptoms:
> - rather than 4 VCPUs being pegged at 100%, only one VCPU is pegged at
>   100%
> - the boot process, as observed from the OVMF log, is just as slow
>   (= crawling) as without the VCPU pausing/resuming (i.e., like with
>   cpu_synchronize_all_states() only); so ultimately the
>   pausing/resuming doesn't help.

I also tried this variant (bracketing only the sync call with pause/resume):

static void ich9_apm_broadcast_smi(void)
{
CPUState *cs;

pause_all_vcpus();
cpu_synchronize_all_states();
resume_all_vcpus();
CPU_FOREACH(cs) {
X86CPU *cpu = X86_CPU(cs);
CPUX86State *env = &cpu->env;

if (env->smbase == 0x3 && env->eip == 0xfff0) {
CPUClass *k = CPU_GET_CLASS(cs);
uint64_t cpu_arch_id = k->get_arch_id(cs);

trace_ich9_apm_broadcast_smi_skip(cpu_arch_id);
continue;
}

cpu_interrupt(cs, CPU_INTERRUPT_SMI);
}
}

This behaves identically to the version where pause/resume bracket the
entire function body (i.e., 1 VCPU pegged, super-slow boot progress).

Laszlo

Re: [Qemu-devel] VM Hung issue

2017-01-18 Thread Stefan Hajnoczi

On Wed, Jan 18, 2017 at 01:51:26PM +0500, Umar Draz wrote:
> I am running qemu-kvm on CentOS 7.3, I have few vms of CentOS and Windows
> 2012 servers.
> 
> Randomly they are hung, but the status of virtual machine keep running, vnc
> and network not responding.

It's not clear whether the guest is hung or QEMU is hung.

Does the VNC client connect successfully?  If the client fails to
connect then QEMU is hung.  If the client connects but the guest doesn't
respond to input then only the guest is hanging.

If QEMU is hung please post a backtrace:

  # gstack $PID_OF_QEMU

Stefan

signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH] throttle: move throttling cmdline options to a separate header file

2017-01-18 Thread Pradeep Kiruvale

On 22 November 2016 at 13:55,  wrote:

> Hi,
>
> Your series seems to have some coding style problems. See output below for
> more information:
>
> Subject: [Qemu-devel] [RFC PATCH] throttle: move throttling cmdline
> options to a separate header file
> Type: series
> Message-id: 147981681351.30309.4854065801791462661.stgit@
> bahia.lab.toulouse-stg.fr.ibm.com
>
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
>
> BASE=base
> n=1
> total=$(git log --oneline $BASE.. | wc -l)
> failed=0
>
> # Useful git options
> git config --local diff.renamelimit 0
> git config --local diff.renames True
>
> commits="$(git log --format=%H --reverse $BASE..)"
> for c in $commits; do
> echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
> if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback
> -; then
> failed=1
> echo
> fi
> n=$((n+1))
> done
>
> exit $failed
> === TEST SCRIPT END ===
>
> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> From https://github.com/patchew-project/qemu
>  * [new tag] patchew/147981681351.30309.4854065801791462661.stgit@
> bahia.lab.toulouse-stg.fr.ibm.com -> patchew/147981681351.30309.
> 4854065801791462661.st...@bahia.lab.toulouse-stg.fr.ibm.com
> Switched to a new branch 'test'
> f5fd85c throttle: move throttling cmdline options to a separate header file
>
> === OUTPUT BEGIN ===
> Checking PATCH 1/1: throttle: move throttling cmdline options to a
> separate header file...
> ERROR: Macros with multiple statements should be enclosed in a do - while
> loop
> #133: FILE: include/qemu/throttle-options.h:14:
>


I did not understand how to fix this issue? I tried putting the do - while
loop still did not work
any idea, how to fix this?

Regards,
Pradeep

+#define THROTTLE_OPTS \
> +{ \
> +.name = "throttling.iops-total", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "limit total I/O operations per second", \
> +},{ \
> +.name = "throttling.iops-read", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "limit read operations per second", \
> +},{ \
> +.name = "throttling.iops-write", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "limit write operations per second", \
> +},{ \
> +.name = "throttling.bps-total", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "limit total bytes per second", \
> +},{ \
> +.name = "throttling.bps-read", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "limit read bytes per second", \
> +},{ \
> +.name = "throttling.bps-write", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "limit write bytes per second", \
> +},{ \
> +.name = "throttling.iops-total-max", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "I/O operations burst", \
> +},{ \
> +.name = "throttling.iops-read-max", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "I/O operations read burst", \
> +},{ \
> +.name = "throttling.iops-write-max", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "I/O operations write burst", \
> +},{ \
> +.name = "throttling.bps-total-max", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "total bytes burst", \
> +},{ \
> +.name = "throttling.bps-read-max", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "total bytes read burst", \
> +},{ \
> +.name = "throttling.bps-write-max", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "total bytes write burst", \
> +},{ \
> +.name = "throttling.iops-total-max-length", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "length of the iops-total-max burst period, in
> seconds", \
> +},{ \
> +.name = "throttling.iops-read-max-length", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "length of the iops-read-max burst period, in
> seconds", \
> +},{ \
> +.name = "throttling.iops-write-max-length", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "length of the iops-write-max burst period, in
> seconds", \
> +},{ \
> +.name = "throttling.bps-total-max-length", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "length of the bps-total-max burst period, in
> seconds", \
> +},{ \
> +.name = "throttling.bps-read-max-length", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "length of the bps-read-max burst period, in
> seconds", \
> +},{ \
> +.name = "throttling.bps-write-max-length", \
> +.type = QEMU_OPT_NUMBER, \
> +.help = "length of the bps-write-max burst period, in
> seconds", \
> +},{ \
>

Re: [Qemu-devel] VM Hung issue

2017-01-18 Thread Umar Draz

Hello Stefan,

Well yes, whenever vm hung, then there is nothing on vnc display, and
network of that vm is also down. I can not ping the vm that time.

Mostly my Windows VMs are affected due to this and sometime same thing
happen with Linux vm.

On Wed, Jan 18, 2017 at 3:29 PM, Stefan Hajnoczi  wrote:

> On Wed, Jan 18, 2017 at 01:51:26PM +0500, Umar Draz wrote:
> > I am running qemu-kvm on CentOS 7.3, I have few vms of CentOS and Windows
> > 2012 servers.
> >
> > Randomly they are hung, but the status of virtual machine keep running,
> vnc
> > and network not responding.
>
> It's not clear whether the guest is hung or QEMU is hung.
>
> Does the VNC client connect successfully?  If the client fails to
> connect then QEMU is hung.  If the client connects but the guest doesn't
> respond to input then only the guest is hanging.
>
> If QEMU is hung please post a backtrace:
>
>   # gstack $PID_OF_QEMU
>
> Stefan
>

Re: [Qemu-devel] [PATCH 13/14] raw-posix: Implement image locking

2017-01-18 Thread Fam Zheng

On Fri, 12/02 03:58, Max Reitz wrote:
> On 31.10.2016 16:38, Fam Zheng wrote:
> > This implements open flag sensible image locking for local file
> > and host device protocol.
> > 
> > virtlockd in libvirt locks the first byte, so we start looking at the
> > file bytes from 1.
> > 
> > Quoting what was proposed by Kevin Wolf , there are
> > four locking modes by combining two bits (BDRV_O_RDWR and
> > BDRV_O_SHARE_RW), and implemented by taking two locks:
> > 
> > Lock bytes:
> > 
> > * byte 1: I can't allow other processes to write to the image
> > * byte 2: I am writing to the image
> > 
> > Lock modes:
> > 
> > * shared writer (BDRV_O_RDWR | BDRV_O_SHARE_RW): Take shared lock on
> >   byte 2. Test whether byte 1 is locked using an exclusive lock, and
> >   fail if so.
> 
> Testing whether something is locked would be easier by using F_OFD_GETLK
> instead of actually creating an exclusive lock and then releasing it.

My attempt to do this shows it doesn't work: fcntl forces the tested lock type
to read lock for some unknown reason, so it cannot be used to test other shared
lockers.

[Qemu-devel] Call for GSoC 2017 mentors & project ideas

2017-01-18 Thread Stefan Hajnoczi

QEMU will be applying to Google Summer of Code again this year.

Do you want to help some of the most talented university students
begin contributing to open source by working on our codebase and
joining our community?

We need to put together a list of project ideas by February 9:
http://qemu-project.org/Google_Summer_of_Code_2017#Project_Ideas

Active contributors are invited to post project ideas.  The project
should be doable in 12 weeks by someone fluent in C programming but
not familiar with the codebase.  Good project ideas have a clear
scope, few dependencies, a high chance of being merged, and can be
broken down into smaller steps.

For background on why QEMU participates in GSoC and how it works,
check out my KVM Forum presentation:
https://www.youtube.com/watch?v=xNVCX7YMUL8
https://vmsplice.net/~stefan/stefanha-kvm-forum-2016.pdf

KVM and Jailhouse are invited to participate under the QEMU umbrella
organization.

Please let me know if you have any questions.

Stefan

Re: [Qemu-devel] [PATCH 2/3] COLO: Shutdown related socket fd while do failover

2017-01-18 Thread Dr. David Alan Gilbert

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:
> If the net connection between primary host and secondary host breaks
> while COLO/COLO incoming threads are doing read() or write().
> It will block until connection is timeout, and the failover process
> will be blocked because of it.
> 
> So it is necessary to shutdown all the socket fds used by COLO
> to avoid this situation. Besides, we should close the corresponding
> file descriptors after failvoer BH shutdown them,
> Or there will be an error.

Hi,
  There are two parts to this patch:
   a) Add some semaphores to sequence failover
   b) Use shutdown()

  At first I wondered if perhaps they should be split; but I see
the reason for the semaphores is mostly to stop the race between
the fd's getting closed and the shutdown() calls; so I think it's
OK.

Do you have any problems with these semaphores during powering off the
guest?

Dave




> Signed-off-by: zhanghailiang 
> Signed-off-by: Li Zhijian 
> Reviewed-by: Dr. David Alan Gilbert 
> Cc: Dr. David Alan Gilbert 
> ---
>  include/migration/migration.h |  3 +++
>  migration/colo.c  | 43 
> +++
>  2 files changed, 46 insertions(+)
> 
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index 487ac1e..7cac877 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -113,6 +113,7 @@ struct MigrationIncomingState {
>  QemuThread colo_incoming_thread;
>  /* The coroutine we should enter (back) after failover */
>  Coroutine *migration_incoming_co;
> +QemuSemaphore colo_incoming_sem;
>  
>  /* See savevm.c */
>  LoadStateEntry_Head loadvm_handlers;
> @@ -182,6 +183,8 @@ struct MigrationState
>  QSIMPLEQ_HEAD(src_page_requests, MigrationSrcPageRequest) 
> src_page_requests;
>  /* The RAMBlock used in the last src_page_request */
>  RAMBlock *last_req_rb;
> +/* The semaphore is used to notify COLO thread that failover is finished 
> */
> +QemuSemaphore colo_exit_sem;
>  
>  /* The semaphore is used to notify COLO thread to do checkpoint */
>  QemuSemaphore colo_checkpoint_sem;
> diff --git a/migration/colo.c b/migration/colo.c
> index 08b2e46..3222812 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -59,6 +59,18 @@ static void secondary_vm_do_failover(void)
>  /* recover runstate to normal migration finish state */
>  autostart = true;
>  }
> +/*
> + * Make sure COLO incoming thread not block in recv or send,
> + * If mis->from_src_file and mis->to_src_file use the same fd,
> + * The second shutdown() will return -1, we ignore this value,
> + * It is harmless.
> + */
> +if (mis->from_src_file) {
> +qemu_file_shutdown(mis->from_src_file);
> +}
> +if (mis->to_src_file) {
> +qemu_file_shutdown(mis->to_src_file);
> +}
>  
>  old_state = failover_set_state(FAILOVER_STATUS_ACTIVE,
> FAILOVER_STATUS_COMPLETED);
> @@ -67,6 +79,8 @@ static void secondary_vm_do_failover(void)
>   "secondary VM", FailoverStatus_lookup[old_state]);
>  return;
>  }
> +/* Notify COLO incoming thread that failover work is finished */
> +qemu_sem_post(&mis->colo_incoming_sem);
>  /* For Secondary VM, jump to incoming co */
>  if (mis->migration_incoming_co) {
>  qemu_coroutine_enter(mis->migration_incoming_co);
> @@ -81,6 +95,18 @@ static void primary_vm_do_failover(void)
>  migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
>MIGRATION_STATUS_COMPLETED);
>  
> +/*
> + * Wake up COLO thread which may blocked in recv() or send(),
> + * The s->rp_state.from_dst_file and s->to_dst_file may use the
> + * same fd, but we still shutdown the fd for twice, it is harmless.
> + */
> +if (s->to_dst_file) {
> +qemu_file_shutdown(s->to_dst_file);
> +}
> +if (s->rp_state.from_dst_file) {
> +qemu_file_shutdown(s->rp_state.from_dst_file);
> +}
> +
>  old_state = failover_set_state(FAILOVER_STATUS_ACTIVE,
> FAILOVER_STATUS_COMPLETED);
>  if (old_state != FAILOVER_STATUS_ACTIVE) {
> @@ -88,6 +114,8 @@ static void primary_vm_do_failover(void)
>   FailoverStatus_lookup[old_state]);
>  return;
>  }
> +/* Notify COLO thread that failover work is finished */
> +qemu_sem_post(&s->colo_exit_sem);
>  }
>  
>  void colo_do_failover(MigrationState *s)
> @@ -361,6 +389,14 @@ out:
>  
>  timer_del(s->colo_delay_timer);
>  
> +/* Hope this not to be too long to wait here */
> +qemu_sem_wait(&s->colo_exit_sem);
> +qemu_sem_destroy(&s->colo_exit_sem);
> +/*
> + * Must be called after failover BH is completed,
> + * Or the failover BH may shutdown the wrong fd that
> + * re-used by other threads after we release here.
> + */
>

Re: [Qemu-devel] Call for GSoC 2017 mentors & project ideas

2017-01-18 Thread Alex Bennée


Stefan Hajnoczi  writes:

> QEMU will be applying to Google Summer of Code again this year.
>
> Do you want to help some of the most talented university students
> begin contributing to open source by working on our codebase and
> joining our community?
>
> We need to put together a list of project ideas by February 9:
> http://qemu-project.org/Google_Summer_of_Code_2017#Project_Ideas

Will Outreachy point here as well or are we going to be duplicating
ideas? I seem to recall a lot of copy and pasting last time.

Could we not write a page-per-project and use categories with
transclusion on the wiki?

> Active contributors are invited to post project ideas.  The project
> should be doable in 12 weeks by someone fluent in C programming but
> not familiar with the codebase.  Good project ideas have a clear
> scope, few dependencies, a high chance of being merged, and can be
> broken down into smaller steps.
>
> For background on why QEMU participates in GSoC and how it works,
> check out my KVM Forum presentation:
> https://www.youtube.com/watch?v=xNVCX7YMUL8
> https://vmsplice.net/~stefan/stefanha-kvm-forum-2016.pdf
>
> KVM and Jailhouse are invited to participate under the QEMU umbrella
> organization.
>
> Please let me know if you have any questions.
>
> Stefan


--
Alex Bennée

Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1

2017-01-18 Thread Fabian Grünbichler

On 17/01/2017 16:03, Paolo Bonzini wrote:
> On 17/01/2017 12:22, Fabian Grünbichler wrote:
>> 6) repeat 3-5 until md5sum does not match, kernel spews error
>> messages, or you are convinced that everything is OK
>>
>> sample kernel message (for ext3):
>> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 FAILED Result: 
>> hostbyte=DID_OK driverbyte=DRIVER_SENSE
>> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Sense Key : Illegal 
>> Request [current]
>> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Add. Sense: Invalid 
>> field in cdb
>> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 CDB: Write(10) 2a 00 
>> 0f 3a 90 00 00 07 d8 00
>> Jan 17 11:39:32 ubuntu kernel: blk_update_request: critical target error, 
>> dev sda, sector 255496192
> 
> Can you reproduce it if QEMU runs under "strace -e ioctl -ff" in the 
> host?  Or also using this systemtap script.
> 
> The important bit would be the lines with a nonzero status, but the
> others can be useful to see what the surroundings look like.
> 

OT: systemtap is not working with your script under Debian Jessie (or
maybe in general under Debian Jessie? not sure).

after some further testing it seems like this change in Qemu exposes
some subtle issue with our specific kernel (it works fine with the
upstream Ubuntu 4.4 one which ours is based on). I am currently
debugging further to narrow down potential causes - if I need further
input from your side or if I suspect Qemu to be at fault I'll resurrect
this thread (and include the strace output).

thanks for your quick reaction anyhow!

Re: [Qemu-devel] [PULL 04/41] virtio: convert to use DMA api

2017-01-18 Thread Paolo Bonzini

On 10/01/2017 06:39, Michael S. Tsirkin wrote:
> -void virtqueue_map(VirtQueueElement *elem)
> +void virtqueue_map(VirtIODevice *vdev, VirtQueueElement *elem)
>  {
> -virtqueue_map_iovec(elem->in_sg, elem->in_addr, &elem->in_num,
> -VIRTQUEUE_MAX_SIZE, 1);
> -virtqueue_map_iovec(elem->out_sg, elem->out_addr, &elem->out_num,
> -VIRTQUEUE_MAX_SIZE, 0);
> +virtqueue_map_iovec(vdev, elem->in_sg, elem->in_addr, &elem->in_num,
> +MIN(ARRAY_SIZE(elem->in_sg), 
> ARRAY_SIZE(elem->in_addr)),
> +1);
> +virtqueue_map_iovec(vdev, elem->out_sg, elem->out_addr, &elem->out_num,
> +MIN(ARRAY_SIZE(elem->out_sg),
> +ARRAY_SIZE(elem->out_addr)),
> +0);

Coverity reports that ARRAY_SIZE(elem->out_sg) (and all the others too)
is wrong because elem->out_sg is a pointer.

However, the check is not in the right place and the max_size argument
of virtqueue_map_iovec can be removed.  The check on in_num/out_num can
be moved to qemu_get_virtqueue_element instead, before the call to
virtqueue_alloc_element.

Thanks,

Paolo

Re: [Qemu-devel] TSC frequency configuration & invtsc migration (was Re: [PATCH 4/4] kvm: Allow migration with invtsc)

2017-01-18 Thread Marcelo Tosatti

On Tue, Jan 10, 2017 at 05:36:48PM +0100, Paolo Bonzini wrote:
> 
> 
> On 05/01/2017 11:48, Marcelo Tosatti wrote:
> >> Host A has TSC scaling, host B doesn't have TSC scaling. We want
> >> to be able to start the VM on host A, and migrate to B. In this
> >> case, the only possible solution is to use B's frequency when
> >> starting the VM. The QEMU process doesn't have enough information
> >> to make that decision.
> > That is a good point. But again, its a special case and 
> > should be supported by -cpu xxx,tsc-frequency=.
> 
> I don't think this is a scenario that can work reliably.  The computed
> TSC frequency may vary by 0.5% or so on every boot (e.g. you may get
> 2497000 kHz or 2511000 kHz for a 2.5 GHz TSC).  You can start the VM on
> host A, reboot host B, and then you'll be unable to migrate.
> 
> Paolo

Including an acceptable error in the comparison seems to be acceptable to work 
around that
case.

Re: [Qemu-devel] [PULL 08/41] intel_iommu: support device iotlb descriptor

2017-01-18 Thread Paolo Bonzini



On 10/01/2017 06:39, Michael S. Tsirkin wrote:
> From: Jason Wang 
> 
> This patch enables device IOTLB support for intel iommu. The major
> work is to implement QI device IOTLB descriptor processing and notify
> the device through iommu notifier.
> 
> Cc: Paolo Bonzini 
> Cc: Richard Henderson 
> Cc: Eduardo Habkost 
> Cc: Michael S. Tsirkin 
> Signed-off-by: Jason Wang 
> Reviewed-by: Michael S. Tsirkin 
> Signed-off-by: Michael S. Tsirkin 
> Reviewed-by: Peter Xu 
> ---
>  hw/i386/intel_iommu_internal.h | 13 ++-
>  include/hw/i386/x86-iommu.h|  1 +
>  hw/i386/intel_iommu.c  | 83 
> +++---
>  hw/i386/x86-iommu.c| 17 +
>  4 files changed, 107 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> index 11abfa2..356f188 100644
> --- a/hw/i386/intel_iommu_internal.h
> +++ b/hw/i386/intel_iommu_internal.h
> @@ -183,6 +183,7 @@
>  /* (offset >> 4) << 8 */
>  #define VTD_ECAP_IRO(DMAR_IOTLB_REG_OFFSET << 4)
>  #define VTD_ECAP_QI (1ULL << 1)
> +#define VTD_ECAP_DT (1ULL << 2)
>  /* Interrupt Remapping support */
>  #define VTD_ECAP_IR (1ULL << 3)
>  #define VTD_ECAP_EIM(1ULL << 4)
> @@ -326,6 +327,7 @@ typedef union VTDInvDesc VTDInvDesc;
>  #define VTD_INV_DESC_TYPE   0xf
>  #define VTD_INV_DESC_CC 0x1 /* Context-cache Invalidate Desc 
> */
>  #define VTD_INV_DESC_IOTLB  0x2
> +#define VTD_INV_DESC_DEVICE 0x3
>  #define VTD_INV_DESC_IEC0x4 /* Interrupt Entry Cache
> Invalidate Descriptor */
>  #define VTD_INV_DESC_WAIT   0x5 /* Invalidation Wait Descriptor 
> */
> @@ -361,6 +363,13 @@ typedef union VTDInvDesc VTDInvDesc;
>  #define VTD_INV_DESC_IOTLB_RSVD_LO  0xff00ULL
>  #define VTD_INV_DESC_IOTLB_RSVD_HI  0xf80ULL
>  
> +/* Mask for Device IOTLB Invalidate Descriptor */
> +#define VTD_INV_DESC_DEVICE_IOTLB_ADDR(val) ((val) & 0xf000ULL)
> +#define VTD_INV_DESC_DEVICE_IOTLB_SIZE(val) ((val) & 0x1)
> +#define VTD_INV_DESC_DEVICE_IOTLB_SID(val) (((val) >> 32) & 0xULL)
> +#define VTD_INV_DESC_DEVICE_IOTLB_RSVD_HI 0xffeULL
> +#define VTD_INV_DESC_DEVICE_IOTLB_RSVD_LO 0xffe0fff8
> +
>  /* Information about page-selective IOTLB invalidate */
>  struct VTDIOTLBPageInvInfo {
>  uint16_t domain_id;
> @@ -399,8 +408,8 @@ typedef struct VTDRootEntry VTDRootEntry;
>  #define VTD_CONTEXT_ENTRY_FPD   (1ULL << 1) /* Fault Processing Disable 
> */
>  #define VTD_CONTEXT_ENTRY_TT(3ULL << 2) /* Translation Type */
>  #define VTD_CONTEXT_TT_MULTI_LEVEL  0
> -#define VTD_CONTEXT_TT_DEV_IOTLB1
> -#define VTD_CONTEXT_TT_PASS_THROUGH 2
> +#define VTD_CONTEXT_TT_DEV_IOTLB(1ULL << 2)
> +#define VTD_CONTEXT_TT_PASS_THROUGH (2ULL << 2)
>  /* Second Level Page Translation Pointer*/
>  #define VTD_CONTEXT_ENTRY_SLPTPTR   (~0xfffULL)
>  #define VTD_CONTEXT_ENTRY_RSVD_LO   (0xff0ULL | ~VTD_HAW_MASK)
> diff --git a/include/hw/i386/x86-iommu.h b/include/hw/i386/x86-iommu.h
> index 0c89d98..361c07c 100644
> --- a/include/hw/i386/x86-iommu.h
> +++ b/include/hw/i386/x86-iommu.h
> @@ -73,6 +73,7 @@ typedef struct IEC_Notifier IEC_Notifier;
>  struct X86IOMMUState {
>  SysBusDevice busdev;
>  bool intr_supported;/* Whether vIOMMU supports IR */
> +bool dt_supported;  /* Whether vIOMMU supports DT */
>  IommuType type; /* IOMMU type - AMD/Intel */
>  QLIST_HEAD(, IEC_Notifier) iec_notifiers; /* IEC notify list */
>  };
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index e39b764..ec62239 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -738,11 +738,18 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, 
> uint8_t bus_num,
>  "context-entry hi 0x%"PRIx64 " lo 0x%"PRIx64,
>  ce->hi, ce->lo);
>  return -VTD_FR_CONTEXT_ENTRY_INV;
> -} else if (ce->lo & VTD_CONTEXT_ENTRY_TT) {
> -VTD_DPRINTF(GENERAL, "error: unsupported Translation Type in "
> -"context-entry hi 0x%"PRIx64 " lo 0x%"PRIx64,
> -ce->hi, ce->lo);
> -return -VTD_FR_CONTEXT_ENTRY_INV;
> +} else {
> +switch (ce->lo & VTD_CONTEXT_ENTRY_TT) {
> +case VTD_CONTEXT_TT_MULTI_LEVEL:
> +/* fall through */
> +case VTD_CONTEXT_TT_DEV_IOTLB:
> +break;
> +default:
> +VTD_DPRINTF(GENERAL, "error: unsupported Translation Type in "
> +"context-entry hi 0x%"PRIx64 " lo 0x%"PRIx64,
> +ce->hi, ce->lo);
> +return -VTD_FR_CONTEXT_ENTRY_INV;
> +}
>  }
>  return 0;
>  }
> @@ -1438,7 +1445,61 @@ static bool vtd_process_inv_iec_desc(IntelIOMMUState 
> *s,
>  vtd_i

Re: [Qemu-devel] [PATCH v3 0/2] Allow migration with invtsc if TSC frequency is explicitly set

2017-01-18 Thread Marcelo Tosatti

On Sun, Jan 08, 2017 at 03:32:32PM -0200, Eduardo Habkost wrote:
> This series makes QEMU accept migration if tsc-frequency is
> explicitly set on configuration. As management software is
> required to keep device configuration the same on migration
> source or destination, explicit tsc-frequency will ensure that
> either:
> 
> * The destination host has a matching TSC frequency; or
> * The destination host has TSC scaling available.
> 
> Changelog
> =
> 
> v2 -> v3:
> * Fix build failure ((missing closing braces)
> 
> v1 -> v2:
> * v1 series subject was:
>   * [PATCH 0/4] Allow migration with invtsc if there's no
> frequency mismatch
> * Removed patches 3/4 and 4/4, that allowed migration
>   if no explicit tsc-frequency was set. Implementing the check on
>   post_load or post_init is not enough to make migration abort,
>   so we will need a more complex solution to implement that
>   feature.
> 
> Plans for future work
> =
> 
> 1) Querying host TSC frequency/scaling capability
> -
> 
> I plan to include TSC frequency/scaling information on
> query-cpu-model-expansion model="host" in a future series. Then
> management software will be able to automatically configure TSC
> frequency when invtsc is enabled, instead of requiring the user
> to configure it explicitly. While we don't implement that, invtsc
> migration will be possible only if the user configures TSC
> frequency manually.
> 
> 2) invtsc migration with no explicit TSC frequency
> --
> 
> A future series can implement migration when TSC frequency is not
> specified explicitly. It will be a bit more complex because it
> requires either letting the destination abort the migration, or
> sending TSC frequency/scaling information from destination to
> source.
> 
> ---
> Cc: Marcelo Tosatti 
> Cc: "Daniel P. Berrange" 
> Cc: Paolo Bonzini 
> Cc: k...@vger.kernel.org
> Cc: Haozhong Zhang 
> Cc: "Michael S. Tsirkin" 
> Cc: Igor Mammedov 
> Cc: libvir-l...@redhat.com
> Cc: Jiri Denemark 
> 
> Eduardo Habkost (2):
>   kvm: Simplify invtsc check
>   kvm: Allow invtsc migration if tsc-khz is set explicitly
> 
>  target/i386/kvm.c | 20 +++-
>  1 file changed, 11 insertions(+), 9 deletions(-)
> 
> -- 
> 2.11.0.259.g40922b1

Looks good to me.

Re: [Qemu-devel] Proposal PCI/PCIe device placement on PAPR guests

2017-01-18 Thread Marcel Apfelbaum


On 01/12/2017 07:53 PM, Laine Stump wrote:

On 01/12/2017 11:35 AM, Michael Roth wrote:

Quoting Laine Stump (2017-01-12 08:52:10)


[...]




Yeah you're right, I'm probably remembering the wrong problem and wrong reason 
for the problem. I just remember there was *some* issue about hotplugging new 
PCI controllers. Possibly the internal
representation of the bus hierarchy wasn't updated unless you forced a rescan 
of all the devices or something? My memory of it is vague, I just remember 
being told it wasn't just a case of the
controller itself being initialized.

Alex or Marcel - since whatever it was I likely heard it from one of you (or 
imagined it in a dream), can you straighten me out?



Hi Laine,

Indeed, hot-plugging a QEMU pci-bridge in X86 is somehow problematic since
it comes with some bits on ACPI tables which are already loaded at hot-plug 
time.
We need the bits since the x86 pci-hotplug is ACPI based. For Q35 machine might 
be easier,
but is not implemented yet as far as I know.

Thanks,
Marcel

[...]

Re: [Qemu-devel] [PATCH V2 0/3] hw/pcie: Introduce Generic PCI Express Root Port

2017-01-18 Thread Marcel Apfelbaum


On 01/16/2017 06:46 PM, Andrea Bolognani wrote:

On Wed, 2017-01-11 at 14:18 +0200, Marcel Apfelbaum wrote:

v1 -> v2:
  - Rebased on master.

The Generic Root Port behaves the same as the
Intel's IOH device with id 3420, without having
Intel specific attributes.

The device has two purposes:
  (1) Can be used on both X86 and ARM machines.
  (2) It will allow us to tweak the behaviour
 (e.g add vendor-specific PCI capabilities)
  - something that obviously cannot be done
on a known device.

Patch 1/3: Introduce a base class for Root Ports - most of the code
is migrated from IOH3420 implementation.
Patch 2/3: Derives the IOH3420 from the new base class
Patch 3/3: Introduces the generic Root Port.

Tested with Linux and Windows guests only on x86 hosts.


I tested this both on x86/q35 (Debian guest) and on
aarch64/virt (Fedora guest) very briefly, eg. started
the guest, performed some network I/O and shut it down.

It seems to be working fine :)


Thanks for testing it! A tested-by tag would be appreciated for the next 
version :)
Marcel



--
Andrea Bolognani / Red Hat / Virtualization

Re: [Qemu-devel] [PATCH v6 wave 2 2/3] hw/isa/lpc_ich9: add broadcast SMI feature

2017-01-18 Thread Igor Mammedov

On Wed, 18 Jan 2017 11:23:48 +0100
Laszlo Ersek  wrote:

> On 01/18/17 11:19, Laszlo Ersek wrote:
> > On 01/18/17 11:03, Igor Mammedov wrote:  
> >> On Tue, 17 Jan 2017 19:53:21 +0100
> >> Laszlo Ersek  wrote:
> >>  
> 
> [snip]
> 
> >>> This is the code I wrote (extracting a new ich9_apm_broadcast_smi()
> >>> function from ich9_apm_ctrl_changed(), due to size reasons):
> >>>  
>  static void ich9_apm_broadcast_smi(void)
>  {
>  CPUState *cs;
> 
>  cpu_synchronize_all_states(); /* <- mark this */  
> >> it probably should be executed after pause_all_vcpus(),
> >> maybe it will help.
> >>  
>  CPU_FOREACH(cs) {
>  X86CPU *cpu = X86_CPU(cs);
>  CPUX86State *env = &cpu->env;
> 
>  if (env->smbase == 0x3 && env->eip == 0xfff0) {
>  CPUClass *k = CPU_GET_CLASS(cs);
>  uint64_t cpu_arch_id = k->get_arch_id(cs);
> 
>  trace_ich9_apm_broadcast_smi_skip(cpu_arch_id);
>  continue;
>  }
> 
>  cpu_interrupt(cs, CPU_INTERRUPT_SMI);
>  }
>  }
> >>>  
> >> [...]  
> >>> (b) If I add the cpu_synchronize_all_states() call before the loop, then
> >>> the incorrect filter matches go away; QEMU sees the KVM VCPU states
> >>> correctly, and the SMI is broad-cast.
> >>>
> >>> However, in this case, the boot slows to a *crawl*. It's unbearable. All
> >>> VCPUs are pegged at 100%, and you can easily follow the OVMF debug log
> >>> with the naked eye, almost line by line.
> >>> I didn't expect that cpu_synchronize_all_states() would be so costly,
> >>> but it makes the idea of checking VCPU state in the APM_CNT handler a
> >>> non-starter.  
> >> I wonder were bottleneck is to slow down guest so much.
> >> What is actual cost of calling cpu_synchronize_all_states()?
> >>
> >> Maybe calling pause_all_vcpus() before cpu_synchronize_all_states()
> >> would help.  
> > 
> > If I prepend just a pause_all_vcpus() function call, at the top, then
> > the VM freezes (quite understandably) when the first SMI is raised via
> > APM_CNT.
> > 
> > If I, instead, bracket the function body with pause_all_vcpus() and
> > resume_all_vcpus(), like this:
> > 
> > static void ich9_apm_broadcast_smi(void)
> > {
> > CPUState *cs;
> > 
> > pause_all_vcpus();
> > cpu_synchronize_all_states();
> > CPU_FOREACH(cs) {
> > X86CPU *cpu = X86_CPU(cs);
> > CPUX86State *env = &cpu->env;
> > 
> > if (env->smbase == 0x3 && env->eip == 0xfff0) {
> > CPUClass *k = CPU_GET_CLASS(cs);
> > uint64_t cpu_arch_id = k->get_arch_id(cs);
> > 
> > trace_ich9_apm_broadcast_smi_skip(cpu_arch_id);
> > continue;
> > }
> > 
> > cpu_interrupt(cs, CPU_INTERRUPT_SMI);
> > }
> > resume_all_vcpus();
> > }
> > 
> > then I see the following symptoms:
> > - rather than 4 VCPUs being pegged at 100%, only one VCPU is pegged at
> >   100%
> > - the boot process, as observed from the OVMF log, is just as slow
> >   (= crawling) as without the VCPU pausing/resuming (i.e., like with
> >   cpu_synchronize_all_states() only); so ultimately the
> >   pausing/resuming doesn't help.  
> 
> I also tried this variant (bracketing only the sync call with pause/resume):
> 
> static void ich9_apm_broadcast_smi(void)
> {
> CPUState *cs;
> 
> pause_all_vcpus();
> cpu_synchronize_all_states();
> resume_all_vcpus();
> CPU_FOREACH(cs) {
> X86CPU *cpu = X86_CPU(cs);
> CPUX86State *env = &cpu->env;
> 
> if (env->smbase == 0x3 && env->eip == 0xfff0) {
> CPUClass *k = CPU_GET_CLASS(cs);
> uint64_t cpu_arch_id = k->get_arch_id(cs);
> 
> trace_ich9_apm_broadcast_smi_skip(cpu_arch_id);
> continue;
> }
> 
> cpu_interrupt(cs, CPU_INTERRUPT_SMI);
> }
> }
> 
> This behaves identically to the version where pause/resume bracket the
> entire function body (i.e., 1 VCPU pegged, super-slow boot progress).
wrapping entire function into pause_all_vcpus()/resume_all_vcpus()
looks better as then cpu_interrupt() would not need to send IPIs to CPUs
since pause_all_vcpus() would have done it.

So the left question is what this 1 CPU does to make things so slow.
Does it send a lot of SMIs or something else?

Re: [Qemu-devel] [PATCH 9/9] tests: Test case for query-cpu-model-expansion

2017-01-18 Thread Eduardo Habkost

On Wed, Jan 18, 2017 at 10:39:46AM +0100, David Hildenbrand wrote:
> Am 17.01.2017 um 02:02 schrieb Eduardo Habkost:
> > +def checkExpansions(self, model, msg):
> > +"""Perform multiple expansion operations on model, validate results
> > +
> > +@model is a CpuModelExpansionInfo struct, with some extra keys:
> > +* model['runnable'] should be set to True if the CPU model is
> > +  runnable on this host
> > +* model['qom-props'] will be set to the full list of properties for
> > +  the CPU, if the model is runnable
> > +"""
> > +exp_s = self.checkOneExpansion(model, 'static',
> > +   '%s.static' % (msg))
> > +exp_f = self.checkOneExpansion(model, 'full',
> > +   '%s.full' % (msg))
> > +exp_ss = self.checkOneExpansion(exp_s, 'static',
> > +'%s.static.static' % (msg))
> > +exp_sf = self.checkOneExpansion(exp_s, 'full',
> > +'%s.static.full' % (msg))
> > +exp_ff = self.checkOneExpansion(exp_f, 'full',
> > +'%s.full.full' % (msg))
> > +
> > +# static expansion twice should result in the same data:
> > +self.assertEquals(exp_s, exp_ss, '%s: static != static+static' % 
> > (msg))
> > +# full expansion twice should also result in the same data:
> > +self.assertEquals(exp_f, exp_ff, '%s: full != full+full' % (msg))
> > +
> > +# migration-safe CPU models have an extra feature:
> > +# their static expansion should be equivalent to the full
> > +# expansion (as their static expansion is also precise)
> 
> This is not true for s390x:
> 
> "z13-base" is both, static and migration-safe.
> 
> Doing a full expansion will expand all features (so your check against
> QOM properties should succeed)
> 
> Doing a static expansion will expand no features, as z13-base is
> already static, so there are no features to expand (no delta changes).
> 
> "z13" is only migration-safe.
> 
> Doing a full expansion will expand all features.
> 
> Doing a static expansion will only expand the features different to
> "z13-base". (Remember, delta changes only to minimize reported
> features).

I think my comment was confusing. By "equivalent" I don't mean
having the same expansion, but resulting in the same set of
features.

This is not comparing full_expansion(model) and
static_expansion(model). It is comparing full_expansion(model)
full_expansion(static_expansion(model)).

In other words, absolutely no feature should be lost or changed
during static expansion, and we verify that by doing a full
expansion after the static expansion (exp_sf) and comparing the
results with the full expansion (exp_f).

I believe this is true on s390x too, isn't it?

> 
> 
> And I wonder if that is also true for x86? This should only be true if
> the "base" model contains absolutely no features.

This is true on x86, but temporarily. "base" still contains no
features, but I plan to add extra information to type=full that
can't appear on type=static.

> 
> > +if self.isMigrationSafe(model['model']):
> > +self.assertEquals(exp_sf['model']['props'], 
> > exp_f['model']['props'],
> > +  '%s: props: static+full != full' % (msg))
> > +self.assertEquals(exp_sf.get('qom-props'), 
> > exp_f.get('qom-props'),
> > +  '%s: qom-props: static+full != full' % (msg))
> 
> 
> -- 
> 
> David

-- 
Eduardo

[Qemu-devel] [PATCH 1/3] cpu: Make the mapping of CPUs and NUMA nodes in cpu_common_realizefn

2017-01-18 Thread Dou Liyang

Current default way of seting the CPUState::numa_node might be wrong
in case on cold/hot-plug CPUs. Making the users confused why the
NUMA info is different beetween the guests and monitor.

Make the mapping of CPUs and NUMA nodes in qom/cpu.c:
cpu_common_realizefn(), where each VCPUs need to realize.

Signed-off-by: Dou Liyang 
---
 qom/cpu.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/qom/cpu.c b/qom/cpu.c
index 61ee0cb..e08dceb 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -23,6 +23,7 @@
 #include "qemu-common.h"
 #include "qom/cpu.h"
 #include "sysemu/kvm.h"
+#include "sysemu/numa.h"
 #include "qemu/notify.h"
 #include "qemu/log.h"
 #include "exec/log.h"
@@ -338,6 +339,18 @@ static void cpu_common_parse_features(const char 
*typename, char *features,
 }
 }
 
+static void cpu_common_map_numa_node(CPUState *cpu)
+{
+int i;
+
+for (i = 0; i < nb_numa_nodes; i++) {
+assert(cpu->cpu_index < max_cpus);
+if (test_bit(cpu->cpu_index, numa_info[i].node_cpu)) {
+cpu->numa_node = i;
+}
+}
+}
+
 static void cpu_common_realizefn(DeviceState *dev, Error **errp)
 {
 CPUState *cpu = CPU(dev);
@@ -347,6 +360,8 @@ static void cpu_common_realizefn(DeviceState *dev, Error 
**errp)
 cpu_resume(cpu);
 }
 
+cpu_common_map_numa_node(cpu);
+
 /* NOTE: latest generic point where the cpu is fully realized */
 trace_init_vcpu(cpu);
 }
-- 
2.5.5

Re: [Qemu-devel] [PATCH 9/9] tests: Test case for query-cpu-model-expansion

2017-01-18 Thread David Hildenbrand

Am 18.01.2017 um 13:39 schrieb Eduardo Habkost:
> On Wed, Jan 18, 2017 at 10:39:46AM +0100, David Hildenbrand wrote:
>> Am 17.01.2017 um 02:02 schrieb Eduardo Habkost:
>>> +def checkExpansions(self, model, msg):
>>> +"""Perform multiple expansion operations on model, validate results
>>> +
>>> +@model is a CpuModelExpansionInfo struct, with some extra keys:
>>> +* model['runnable'] should be set to True if the CPU model is
>>> +  runnable on this host
>>> +* model['qom-props'] will be set to the full list of properties for
>>> +  the CPU, if the model is runnable
>>> +"""
>>> +exp_s = self.checkOneExpansion(model, 'static',
>>> +   '%s.static' % (msg))
>>> +exp_f = self.checkOneExpansion(model, 'full',
>>> +   '%s.full' % (msg))
>>> +exp_ss = self.checkOneExpansion(exp_s, 'static',
>>> +'%s.static.static' % (msg))
>>> +exp_sf = self.checkOneExpansion(exp_s, 'full',
>>> +'%s.static.full' % (msg))
>>> +exp_ff = self.checkOneExpansion(exp_f, 'full',
>>> +'%s.full.full' % (msg))
>>> +
>>> +# static expansion twice should result in the same data:
>>> +self.assertEquals(exp_s, exp_ss, '%s: static != static+static' % 
>>> (msg))
>>> +# full expansion twice should also result in the same data:
>>> +self.assertEquals(exp_f, exp_ff, '%s: full != full+full' % (msg))
>>> +
>>> +# migration-safe CPU models have an extra feature:
>>> +# their static expansion should be equivalent to the full
>>> +# expansion (as their static expansion is also precise)
>>
>> This is not true for s390x:
>>
>> "z13-base" is both, static and migration-safe.
>>
>> Doing a full expansion will expand all features (so your check against
>> QOM properties should succeed)
>>
>> Doing a static expansion will expand no features, as z13-base is
>> already static, so there are no features to expand (no delta changes).
>>
>> "z13" is only migration-safe.
>>
>> Doing a full expansion will expand all features.
>>
>> Doing a static expansion will only expand the features different to
>> "z13-base". (Remember, delta changes only to minimize reported
>> features).
> 
> I think my comment was confusing. By "equivalent" I don't mean
> having the same expansion, but resulting in the same set of
> features.
> 
> This is not comparing full_expansion(model) and
> static_expansion(model). It is comparing full_expansion(model)
> full_expansion(static_expansion(model)).
> 
> In other words, absolutely no feature should be lost or changed
> during static expansion, and we verify that by doing a full
> expansion after the static expansion (exp_sf) and comparing the
> results with the full expansion (exp_f).
> 
> I believe this is true on s390x too, isn't it?

Okay, I actually was confused by this comment. This makes sense!

full(z13) == full(static(z13))

Thanks for clarifying!

-- 

David

[Qemu-devel] [PATCH 2/3] numa: Remove the numa_post_machine_init function

2017-01-18 Thread Dou Liyang

Current default way of seting the CPUState::numa_node in the
numa_post_machine_init() and calling it in vl.c:main() would
make the data incorrect in case on cold/hot-plug CPUs.

Now, we move it to the qom/cpu.c:cpu_common_realizefn().
So, Here we remove it.

Signed-off-by: Dou Liyang 
---
 include/sysemu/numa.h |  1 -
 numa.c| 15 ---
 vl.c  |  2 --
 3 files changed, 18 deletions(-)

diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 8f09dcf..b8015a5 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -25,7 +25,6 @@ typedef struct node_info {
 
 extern NodeInfo numa_info[MAX_NODES];
 void parse_numa_opts(MachineClass *mc);
-void numa_post_machine_init(void);
 void query_numa_node_mem(uint64_t node_mem[]);
 extern QemuOptsList qemu_numa_opts;
 void numa_set_mem_node_id(ram_addr_t addr, uint64_t size, uint32_t node);
diff --git a/numa.c b/numa.c
index 379bc8a..5f68497 100644
--- a/numa.c
+++ b/numa.c
@@ -394,21 +394,6 @@ void parse_numa_opts(MachineClass *mc)
 }
 }
 
-void numa_post_machine_init(void)
-{
-CPUState *cpu;
-int i;
-
-CPU_FOREACH(cpu) {
-for (i = 0; i < nb_numa_nodes; i++) {
-assert(cpu->cpu_index < max_cpus);
-if (test_bit(cpu->cpu_index, numa_info[i].node_cpu)) {
-cpu->numa_node = i;
-}
-}
-}
-}
-
 static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
const char *name,
uint64_t ram_size)
diff --git a/vl.c b/vl.c
index c643d3f..afe40ce 100644
--- a/vl.c
+++ b/vl.c
@@ -4549,8 +4549,6 @@ int main(int argc, char **argv, char **envp)
 
 cpu_synchronize_all_post_init();
 
-numa_post_machine_init();
-
 if (qemu_opts_foreach(qemu_find_opts("fw_cfg"),
   parse_fw_cfg, fw_cfg_find(), NULL) != 0) {
 exit(1);
-- 
2.5.5

Re: [Qemu-devel] [PATCH RFC v2 02/12] vfio: linux-headers update for vfio-ccw

2017-01-18 Thread Cornelia Huck

On Wed, 18 Jan 2017 10:51:17 +0800
Dong Jia Shi  wrote:

> * Alex Williamson  [2017-01-17 14:51:42 -0700]:
> 
> > On Thu, 12 Jan 2017 08:25:03 +0100
> > Dong Jia Shi  wrote:
> > 
> > > From: Xiao Feng Ren 
> > > 
> > > This is a placeholder for a linux-headers update.
> > > 
> > > Signed-off-by: Xiao Feng Ren 
> > > ---
> > >  include/standard-headers/asm-s390/vfio_ccw.h | 28 
> > > 
> > >  linux-headers/linux/vfio.h   | 17 +
> > >  2 files changed, 45 insertions(+)
> > >  create mode 100644 include/standard-headers/asm-s390/vfio_ccw.h
> > > 
> > > diff --git a/include/standard-headers/asm-s390/vfio_ccw.h 
> > > b/include/standard-headers/asm-s390/vfio_ccw.h
> > > new file mode 100644
> > > index 000..cddc09b
> > > --- /dev/null
> > > +++ b/include/standard-headers/asm-s390/vfio_ccw.h
> > > @@ -0,0 +1,28 @@
> > > +/*
> > > + * Interfaces for vfio-ccw
> > > + *
> > > + * Copyright IBM Corp. 2017
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License (version 2 only)
> > > + * as published by the Free Software Foundation.
> > > + *
> > > + * Author(s): Dong Jia Shi 
> > > + */
> > > +
> > > +#ifndef _VFIO_CCW_H_
> > > +#define _VFIO_CCW_H_
> > > +
> > > +#include "standard-headers/linux/types.h"
> > > +
> > > +struct ccw_io_region {
> > > +#define ORB_AREA_SIZE 12
> > > + uint8_t  orb_area[ORB_AREA_SIZE];
> > > +#define SCSW_AREA_SIZE 12
> > > + uint8_t  scsw_area[SCSW_AREA_SIZE];
> > > +#define IRB_AREA_SIZE 96
> > > + uint8_t  irb_area[IRB_AREA_SIZE];
> > > + uint32_t ret_code;
> > > +} QEMU_PACKED;
> > > +
> > > +#endif
> > 
> > This is really part of the uapi for the vfio-ccw mdev device, isn't it?
> Yes, it is.
> 
> > Should it really be buried in asm-s390 in the kernel?
> > 
> We had an internal discussion on this question before, since we think
> this interface is strongly s390 dependent, we put it here. What do you
> suggest? Thanks,

As there are already architecture-specific definitions in the common
file, having everything in the same file is probably better. I'm
certainly not against moving it out from asm-s390.

Re: [Qemu-devel] TSC frequency configuration & invtsc migration (was Re: [PATCH 4/4] kvm: Allow migration with invtsc)

2017-01-18 Thread Eduardo Habkost

On Wed, Jan 18, 2017 at 09:55:43AM -0200, Marcelo Tosatti wrote:
> On Tue, Jan 10, 2017 at 05:36:48PM +0100, Paolo Bonzini wrote:
> > 
> > 
> > On 05/01/2017 11:48, Marcelo Tosatti wrote:
> > >> Host A has TSC scaling, host B doesn't have TSC scaling. We want
> > >> to be able to start the VM on host A, and migrate to B. In this
> > >> case, the only possible solution is to use B's frequency when
> > >> starting the VM. The QEMU process doesn't have enough information
> > >> to make that decision.
> > > That is a good point. But again, its a special case and 
> > > should be supported by -cpu xxx,tsc-frequency=.
> > 
> > I don't think this is a scenario that can work reliably.  The computed
> > TSC frequency may vary by 0.5% or so on every boot (e.g. you may get
> > 2497000 kHz or 2511000 kHz for a 2.5 GHz TSC).  You can start the VM on
> > host A, reboot host B, and then you'll be unable to migrate.
> > 
> > Paolo
> 
> Including an acceptable error in the comparison seems to be
> acceptable to work around that case.

How much error is acceptable when exposing the invtsc flag to the
guest?

-- 
Eduardo

[Qemu-devel] [PATCH 0/3] cpu: numa: Fix the mapping initialization of VCPUs and NUMA nodes

2017-01-18 Thread Dou Liyang

As we fixed a bug(Bug 1) in below links, Named "Method-A":

https://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg03354.html

Then, Eduardo gave us many suggests. Thanks very much!
when we try them, we also find another bugs named "Bug 2".

[Problem]
-

As I use this command:

./x86_64-softmmu/qemu-system-x86_64 \
-hda /image/fedora.img \
-m 1G,slots=4,maxmem=4G \
-enable-kvm \
-smp 2,maxcpus=16,sockets=4,cores=2,threads=2 \
-device qemu64-x86_64-cpu,id=cpu1,socket-id=0,core-id=1,thread-id=0 \
-device qemu64-x86_64-cpu,id=cpu2,socket-id=0,core-id=1,thread-id=1 \
-device qemu64-x86_64-cpu,id=cpu3,socket-id=1,core-id=0,thread-id=0 \
-device qemu64-x86_64-cpu,id=cpu4,socket-id=1,core-id=0,thread-id=1 \
-device qemu64-x86_64-cpu,id=cpu5,socket-id=1,core-id=1,thread-id=0 \
-device qemu64-x86_64-cpu,id=cpu6,socket-id=1,core-id=1,thread-id=1 \
-numa node,nodeid=0,cpus=0-3 \
-numa node,nodeid=1,cpus=4-7 \
-numa node,nodeid=2,cpus=8-11 \
-numa node,nodeid=3,cpus=12-15 \
-monitor stdio \

1. Bug 1

In Qemu monitor:

(qemu) info numa 
4 nodes
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 256 MB
node 1 cpus:
node 1 size: 256 MB
node 2 cpus:
node 2 size: 256 MB
node 3 cpus:
node 3 size: 256 MB

2. Bug 2

(qemu) device_add qemu64-x86_64-cpu,id=cpu7,socket-id=2,core-id=0,thread-id=0

(qemu) info numa
4 nodes
node 0 cpus: 0 1 2 3 4 5 6 7 8
node 0 size: 256 MB
node 1 cpus:
node 1 size: 256 MB
node 2 cpus:
node 2 size: 256 MB
node 3 cpus:
node 3 size: 256 MB

[Method-A]
--

1. Method-A that we provided above: 
  * Ensure the numa_post_machine_init func in the appropriate location in
vl.c::main().

It can fix Bug 1, but, can't work for Bug 2.

1.1. For Bug 1: fixed

(qemu) info numa
4 nodes
node 0 cpus: 0 1 2 3
node 0 size: 256 MB
node 1 cpus: 4 5 6 7
node 1 size: 256 MB
node 2 cpus:
node 2 size: 256 MB
node 3 cpus:
node 3 size: 256 MB

1.2. For Bug 2: can not fixed

(qemu) device_add qemu64-x86_64-cpu,id=cpu7,socket-id=2,core-id=0,thread-id=0

(qemu) info numa
node 0 cpus: 0 1 2 3 8
node 0 size: 256 MB
node 1 cpus: 4 5 6 7
node 1 size: 256 MB
node 2 cpus:
node 2 size: 256 MB
node 3 cpus:
node 3 size: 256 MB

[Solution]
--

Move the CPUState::numa_node initialization to qom/cpu.c:cpu_common_realizefn(),
and remove numa_post_machine_init() completely.

It can fix Bug 1 and Bug 2. The result shows that:

(qemu) info numa
4 nodes
node 0 cpus: 0 1 2 3
node 0 size: 256 MB
node 1 cpus: 4 5 6 7
node 1 size: 256 MB
node 2 cpus:
node 2 size: 256 MB
node 3 cpus:
node 3 size: 256 MB

(qemu) device_add qemu64-x86_64-cpu,id=cpu7,socket-id=2,core-id=0,thread-id=0
(qemu) info numa
4 nodes
node 0 cpus: 0 1 2 3
node 0 size: 256 MB
node 1 cpus: 4 5 6 7
node 1 size: 256 MB
node 2 cpus: 8
node 2 size: 256 MB
node 3 cpus:
node 3 size: 256 MB

(qemu) device_add qemu64-x86_64-cpu,id=cpu8,socket-id=3,core-id=0,thread-id=0
(qemu) info numa
4 nodes
node 0 cpus: 0 1 2 3
node 0 size: 256 MB
node 1 cpus: 4 5 6 7
node 1 size: 256 MB
node 2 cpus: 8
node 2 size: 256 MB
node 3 cpus: 12
node 3 size: 256 MB

(qemu) device_del cpu5
(qemu) info numa
4 nodes
node 0 cpus: 0 1 2 3
node 0 size: 256 MB
node 1 cpus: 4 5 7
node 1 size: 256 MB
node 2 cpus: 8
node 2 size: 256 MB
node 3 cpus: 12
node 3 size: 256 MB

Dou Liyang (3):
  cpu: Make the mapping of CPUs and NUMA nodes in cpu_common_realizefn
  numa: Remove the numa_post_machine_init function
  cpu: make the function of cpu_common_map_numa_node more efficiently

 include/sysemu/numa.h |  1 -
 numa.c| 15 ---
 qom/cpu.c | 16 
 vl.c  |  2 --
 4 files changed, 16 insertions(+), 18 deletions(-)

-- 
2.5.5

Re: [Qemu-devel] [PATCH 1/2] vl: Ensure the numa_post_machine_init func in the appropriate location

2017-01-18 Thread Eduardo Habkost

On Wed, Jan 18, 2017 at 12:02:35PM +0800, Dou Liyang wrote:
> Hi, Eduardo
> 
> At 01/18/2017 04:09 AM, Eduardo Habkost wrote:
> > On Tue, Jan 17, 2017 at 10:42:31PM +0800, Dou Liyang wrote:
> > > In the numa_post_machine_init(), we use CPU_FOREACH macro to set all
> > > CPUs' namu_node. So, we should make sure that we call it after Qemu
> > > has already initialied all the CPUs.
> > > 
> > > As we all know, the CPUs can be created by "-smp"(pc_new_cpu) or
> > > "-device"(qdev_device_add) command. But, before the device init,
> > > Qemu execute the numa_post_machine_init earlier. It makes the mapping
> > > of NUMA nodes and CPUs incorrect.
> > > 
> > > The patch move the numa_post_machine_init func in the appropriate
> > > location.
> > > 
> > > Signed-off-by: Dou Liyang 
> > 
> > I would like to move cpu_index initialization to
> > qom/cpu.c:cpu_common_realizefn(), and remove
> > numa_post_machine_init() completely.
> 
> Thanks, it is a good idea. I will try it later.
> 
> But, I hope to know that:
> 
> If you may want to say the cpu->**numa_node** initialization?
> Because the **cpu_index** initialization is in pc_cpu_pre_plug()
> currently, like that: cs->cpu_index = idx;

Oops, I meant the numa_node initialization. :)

-- 
Eduardo

[Qemu-devel] [PATCH 3/3] cpu: make the function of cpu_common_map_numa_node more efficiently

2017-01-18 Thread Dou Liyang

Current function does some unnecessary operations, such as it makes
the assert() in the loop, and the loop was not stopped in time.

This patch moves the assert() out the loop and stops the loop in
time.

Signed-off-by: Dou Liyang 
---
 qom/cpu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/qom/cpu.c b/qom/cpu.c
index e08dceb..3c655b2 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -343,10 +343,11 @@ static void cpu_common_map_numa_node(CPUState *cpu)
 {
 int i;
 
+assert(cpu->cpu_index < max_cpus);
 for (i = 0; i < nb_numa_nodes; i++) {
-assert(cpu->cpu_index < max_cpus);
 if (test_bit(cpu->cpu_index, numa_info[i].node_cpu)) {
 cpu->numa_node = i;
+return;
 }
 }
 }
-- 
2.5.5

Re: [Qemu-devel] [PATCH v1 07/15] iotests: fix 097 when run with qcow

2017-01-18 Thread Max Reitz

On 17.01.2017 10:59, Daniel P. Berrange wrote:
> On Mon, Jan 16, 2017 at 09:04:31PM +0100, Max Reitz wrote:
>> On 03.01.2017 19:27, Daniel P. Berrange wrote:
>>> The previous commit:
>>>
>>>   commit a3e1505daec31ef56f0489f8c8fff1b8e4ca92bd
>>>   Author: Eric Blake 
>>>   Date:   Mon Dec 5 09:49:34 2016 -0600
>>>
>>> qcow2: Don't strand clusters near 2G intervals during commit
>>>
>>> extended the 097 test case so that it did two passes, once
>>> with an internal snapshot, once without.
>>>
>>> qcow (v1) does not support internal snapshots, so this change
>>> broke test 097 when run against qcow.
>>>
>>> This splits 097 in two, creating a new 173 that tests the
>>> internal snapshot codepath, effectively putting 097 back
>>> to its content before the above commit.
>>>
>>> Signed-off-by: Daniel P. Berrange 
>>> ---
>>>  tests/qemu-iotests/097 |  10 +---
>>>  tests/qemu-iotests/097.out | 125 
>>> ++--
>>>  tests/qemu-iotests/173 | 126 
>>> +
>>>  tests/qemu-iotests/173.out | 119 ++
>>>  tests/qemu-iotests/group   |   1 +
>>>  5 files changed, 251 insertions(+), 130 deletions(-)
>>>  create mode 100755 tests/qemu-iotests/173
>>>  create mode 100644 tests/qemu-iotests/173.out
>>
>> I don't think the effort is worth it, considering that probably
>> literally nobody is still using qcow -- or so I hope, at least.
> 
> The reason I fixed this was because I wanted to be able to verify my
> refactoring to qcow didn't break anything else. It is much easier if
> I can just run "check -qcow" and not have to worry about which failures
> are just test bugs, vs genuine code bugs I've created. IOW, as long as
> qcow.c exists in the code base, IMHO, we should make sure the iotests
> continue to pass

Right, what I meant is that I don't think it's worth fixing tests for
qcow. If they don't work and it isn't qcow that's broken (but just the
test requiring something that qcow does not have), just removing it from
the _supported_fmt list is usually sufficient in my opinion.

> On that point, IMHO, it would be beneficial if we had some CI system
> that was setup to run the iotests on all new changes, across all the
> different disk formats we expect the iotests to work on. We get far
> too many regressions with iotests breaking and no one noticing right
> now :-(

Indeed, but that would require all of the iotests consistently passing
first...

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] arch_init: Remove unnecessary default_config_files table

2017-01-18 Thread Paolo Bonzini



On 18/01/2017 13:52, Eduardo Habkost wrote:
>>
>> Though maybe we should just remove .conf file support completely...
>> who's using it?!?
> You mean removing /etc/qemu.conf, or removing -readconfig
> completely?
> 
> The former doesn't seem to be used often. The latter looks very
> useful for people trying to write scripts around QEMU and to
> avoid command-line length limits, so I will be surprised if
> nobody is using it.

Of course /etc/qemu.conf only (so that -nodefconfig/-nouserconfig become
no-ops).

Paolo

[Qemu-devel] [PATCH v3] hw/core/null-machine: Add the possibility to instantiate a CPU and RAM

2017-01-18 Thread Thomas Huth

Sometimes it is useful to have just a machine with CPU and RAM, without
any further hardware in it, e.g. if you just want to do some instruction
debugging for TCG with a remote GDB attached to QEMU, or run some embedded
code with the "-semihosting" QEMU parameter. qemu-system-m68k already
features a "dummy" machine, and xtensa a "sim" machine for exactly this
purpose.
All target architectures have nowadays also a "none" machine, which would
be a perfect match for this, too - but it currently does not allow to add
CPU and RAM yet. Thus let's add these possibilities in a generic way to the
"none" machine, too, so that we hopefully do not need additional "dummy"
machines in the future anymore (and maybe can also get rid of the already
existing "dummy"/"sim" machines one day).
Note that the default behaviour of the "none" machine is not changed, i.e.
no CPU and no RAM is instantiated by default. You have explicitely got to
specify the CPU model with "-cpu" and the amount of RAM with "-m" to get
these new features.

Signed-off-by: Thomas Huth 
---
 v3:
 - Get rid of the cpu_init_def() wrapper again, make null-machine.o
   target dependent instead and use cpu_init() directly.
 - Omit the loader code for the "-kernel" option for now (users can
   use "-device loader,..." instead). We can add code for the -kernel
   parameter later (either an implementation or a warning), once we've
   decided how it should behave for the "none" machine.

 hw/core/Makefile.objs  |  2 +-
 hw/core/null-machine.c | 27 +--
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/hw/core/Makefile.objs b/hw/core/Makefile.objs
index a4c94e5..0b6c0f1 100644
--- a/hw/core/Makefile.objs
+++ b/hw/core/Makefile.objs
@@ -12,7 +12,6 @@ common-obj-$(CONFIG_XILINX_AXI) += stream.o
 common-obj-$(CONFIG_PTIMER) += ptimer.o
 common-obj-$(CONFIG_SOFTMMU) += sysbus.o
 common-obj-$(CONFIG_SOFTMMU) += machine.o
-common-obj-$(CONFIG_SOFTMMU) += null-machine.o
 common-obj-$(CONFIG_SOFTMMU) += loader.o
 common-obj-$(CONFIG_SOFTMMU) += qdev-properties-system.o
 common-obj-$(CONFIG_SOFTMMU) += register.o
@@ -20,3 +19,4 @@ common-obj-$(CONFIG_SOFTMMU) += or-irq.o
 common-obj-$(CONFIG_PLATFORM_BUS) += platform-bus.o
 
 obj-$(CONFIG_SOFTMMU) += generic-loader.o
+obj-$(CONFIG_SOFTMMU) += null-machine.o
diff --git a/hw/core/null-machine.c b/hw/core/null-machine.c
index 0351ba7..27c8369 100644
--- a/hw/core/null-machine.c
+++ b/hw/core/null-machine.c
@@ -13,18 +13,41 @@
 
 #include "qemu/osdep.h"
 #include "qemu-common.h"
+#include "qemu/error-report.h"
 #include "hw/hw.h"
 #include "hw/boards.h"
+#include "sysemu/sysemu.h"
+#include "exec/address-spaces.h"
+#include "cpu.h"
 
-static void machine_none_init(MachineState *machine)
+static void machine_none_init(MachineState *mch)
 {
+CPUState *cpu = NULL;
+
+/* Initialize CPU (if a model has been specified) */
+if (mch->cpu_model) {
+cpu = cpu_init(mch->cpu_model);
+if (!cpu) {
+error_report("Unable to initialize CPU");
+exit(1);
+}
+}
+
+/* RAM at address zero */
+if (mch->ram_size) {
+MemoryRegion *ram = g_new(MemoryRegion, 1);
+
+memory_region_allocate_system_memory(ram, NULL, "ram", mch->ram_size);
+memory_region_add_subregion(get_system_memory(), 0, ram);
+}
 }
 
 static void machine_none_machine_init(MachineClass *mc)
 {
 mc->desc = "empty machine";
 mc->init = machine_none_init;
-mc->max_cpus = 0;
+mc->max_cpus = 1;
+mc->default_ram_size = 0;
 }
 
 DEFINE_MACHINE("none", machine_none_machine_init)
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH 1/3] cpu: Make the mapping of CPUs and NUMA nodes in cpu_common_realizefn

2017-01-18 Thread Eduardo Habkost

On Wed, Jan 18, 2017 at 08:40:05PM +0800, Dou Liyang wrote:
> Current default way of seting the CPUState::numa_node might be wrong
> in case on cold/hot-plug CPUs. Making the users confused why the
> NUMA info is different beetween the guests and monitor.
> 
> Make the mapping of CPUs and NUMA nodes in qom/cpu.c:
> cpu_common_realizefn(), where each VCPUs need to realize.
> 
> Signed-off-by: Dou Liyang 

parse_numa_opts() is called a long time before any CPU is
created, so this should be safe.

Reviewed-by: Eduardo Habkost 

(But we can squash patch 2/3 and patch 3/3 in this patch).

> ---
>  qom/cpu.c | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/qom/cpu.c b/qom/cpu.c
> index 61ee0cb..e08dceb 100644
> --- a/qom/cpu.c
> +++ b/qom/cpu.c
> @@ -23,6 +23,7 @@
>  #include "qemu-common.h"
>  #include "qom/cpu.h"
>  #include "sysemu/kvm.h"
> +#include "sysemu/numa.h"
>  #include "qemu/notify.h"
>  #include "qemu/log.h"
>  #include "exec/log.h"
> @@ -338,6 +339,18 @@ static void cpu_common_parse_features(const char 
> *typename, char *features,
>  }
>  }
>  
> +static void cpu_common_map_numa_node(CPUState *cpu)
> +{
> +int i;
> +
> +for (i = 0; i < nb_numa_nodes; i++) {
> +assert(cpu->cpu_index < max_cpus);
> +if (test_bit(cpu->cpu_index, numa_info[i].node_cpu)) {
> +cpu->numa_node = i;
> +}
> +}
> +}
> +
>  static void cpu_common_realizefn(DeviceState *dev, Error **errp)
>  {
>  CPUState *cpu = CPU(dev);
> @@ -347,6 +360,8 @@ static void cpu_common_realizefn(DeviceState *dev, Error 
> **errp)
>  cpu_resume(cpu);
>  }
>  
> +cpu_common_map_numa_node(cpu);
> +
>  /* NOTE: latest generic point where the cpu is fully realized */
>  trace_init_vcpu(cpu);
>  }
> -- 
> 2.5.5
> 
> 
> 

-- 
Eduardo

Re: [Qemu-devel] [PATCH] arch_init: Remove unnecessary default_config_files table

2017-01-18 Thread Eduardo Habkost

On Wed, Jan 18, 2017 at 01:55:21PM +0100, Paolo Bonzini wrote:
> 
> 
> On 18/01/2017 13:52, Eduardo Habkost wrote:
> >>
> >> Though maybe we should just remove .conf file support completely...
> >> who's using it?!?
> > You mean removing /etc/qemu.conf, or removing -readconfig
> > completely?
> > 
> > The former doesn't seem to be used often. The latter looks very
> > useful for people trying to write scripts around QEMU and to
> > avoid command-line length limits, so I will be surprised if
> > nobody is using it.
> 
> Of course /etc/qemu.conf only (so that -nodefconfig/-nouserconfig become
> no-ops).

Agreed. Should we print a warning in 2.9 and remove it in 2.10?

-- 
Eduardo

Re: [Qemu-devel] [PATCH 0/3] cpu: numa: Fix the mapping initialization of VCPUs and NUMA nodes

2017-01-18 Thread no-reply

Hi,

Your series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Message-id: 1484743207-10721-1-git-send-email-douly.f...@cn.fujitsu.com
Subject: [Qemu-devel] [PATCH 0/3] cpu: numa: Fix the mapping initialization of 
VCPUs and NUMA nodes
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=16
make docker-test-quick@centos6
make docker-test-mingw@fedora
make docker-test-build@min-glib
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
fdce9a1 cpu: make the function of cpu_common_map_numa_node more efficiently
d79e94c numa: Remove the numa_post_machine_init function
8afaf15 cpu: Make the mapping of CPUs and NUMA nodes in cpu_common_realizefn

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
  BUILD   centos6
make[1]: Entering directory `/var/tmp/patchew-tester-tmp-3q9yfmbs/src'
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPYRUNNER
RUN test-quick in qemu:centos6 
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
ccache-3.1.6-2.el6.x86_64
epel-release-6-8.noarch
gcc-4.4.7-17.el6.x86_64
git-1.7.1-4.el6_7.1.x86_64
glib2-devel-2.28.8-5.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
make-3.81-23.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
tar-1.23-15.el6_8.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel 
glib2-devel SDL-devel pixman-devel epel-release
HOSTNAME=4d6f6388bcc5
TERM=xterm
MAKEFLAGS= -j16
HISTSIZE=1000
J=16
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
LANG=en_US.UTF-8
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES= dtc
DEBUG=
G_BROKEN_FILENAMES=1
CCACHE_HASHDIR=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu 
--prefix=/var/tmp/qemu-build/install
No C++ compiler available; disabling C++ specific optional code
Install prefix/var/tmp/qemu-build/install
BIOS directory/var/tmp/qemu-build/install/share/qemu
binary directory  /var/tmp/qemu-build/install/bin
library directory /var/tmp/qemu-build/install/lib
module directory  /var/tmp/qemu-build/install/lib/qemu
libexec directory /var/tmp/qemu-build/install/libexec
include directory /var/tmp/qemu-build/install/include
config directory  /var/tmp/qemu-build/install/etc
local state directory   /var/tmp/qemu-build/install/var
Manual directory  /var/tmp/qemu-build/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path   /tmp/qemu-test/src
C compilercc
Host C compiler   cc
C++ compiler  
Objective-C compiler cc
ARFLAGS   rv
CFLAGS-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g 
QEMU_CFLAGS   -I/usr/include/pixman-1-pthread -I/usr/include/glib-2.0 
-I/usr/lib64/glib-2.0/include   -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE 
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes 
-Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes 
-fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels -Wmissing-include-dirs 
-Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self 
-Wignored-qualifiers -Wold-style-declaration -Wold-style-definition 
-Wtype-limits -fstack-protector-all
LDFLAGS   -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g 
make  make
install   install
pythonpython -B
smbd  /usr/sbin/smbd
module supportno
host CPU  x86_64
host big endian   no
target list   x86_64-softmmu aarch64-softmmu
tcg debug enabled no
gprof enabled no
sparse enabledno
strip binariesyes
profiler  no
static build  no
pixmansystem
SDL support   yes (1.2.14)
GTK support   no 
GTK GL supportno
VTE support   no 
TLS priority  NORMAL
GNUTLS supportno
GNUTLS rndno
libgcrypt no
libgcrypt kdf no
nettleno 
nettle kdfno
libtasn1  no
curses supportno
virgl support no
curl support  no
mingw32 support   no
Audio drivers oss
Block whitelist (rw) 
Block whitelist (ro) 
VirtFS supportno
VNC support   yes
VNC SASL support  no
VNC JPEG support  no
VNC PNG support   no
xen support   no
brlapi supportno
bluez  supportno
Documentation no
PIE   yes
vde support   no
netmap supportno
Linux AIO support no
ATTR/XATTR support yes
Install blobs yes
KVM support

Re: [Qemu-devel] [PATCH] arch_init: Remove unnecessary default_config_files table

2017-01-18 Thread Eduardo Habkost

On Wed, Jan 18, 2017 at 10:17:17AM +0100, Paolo Bonzini wrote:
> 
> 
> On 17/01/2017 19:00, Eduardo Habkost wrote:
> > The existing default_config_files table in arch_init.c has a
> > single entry, making it completely unnecessary. The whole code
> > can be replaced by a single qemu_read_config_file() call in vl.c.
> > 
> > Signed-off-by: Eduardo Habkost 
> > ---
> >  include/qemu/config-file.h |  4 
> >  arch_init.c| 27 ---
> >  vl.c   | 18 ++
> >  3 files changed, 14 insertions(+), 35 deletions(-)
> > 
> > diff --git a/include/qemu/config-file.h b/include/qemu/config-file.h
> > index 8d4b2b6d94..c80d5c8a33 100644
> > --- a/include/qemu/config-file.h
> > +++ b/include/qemu/config-file.h
> > @@ -23,8 +23,4 @@ int qemu_read_config_file(const char *filename);
> >  void qemu_config_parse_qdict(QDict *options, QemuOptsList **lists,
> >   Error **errp);
> >  
> > -/* Read default QEMU config files
> > - */
> > -int qemu_read_default_config_files(bool userconfig);
> > -
> >  #endif /* QEMU_CONFIG_FILE_H */
> > diff --git a/arch_init.c b/arch_init.c
> > index 5cc58b2c35..e9b6d62d18 100644
> > --- a/arch_init.c
> > +++ b/arch_init.c
> > @@ -84,33 +84,6 @@ int graphic_depth = 32;
> >  
> >  const uint32_t arch_type = QEMU_ARCH;
> >  
> > -static struct defconfig_file {
> > -const char *filename;
> > -/* Indicates it is an user config file (disabled by -no-user-config) */
> > -bool userconfig;
> > -} default_config_files[] = {
> > -{ CONFIG_QEMU_CONFDIR "/qemu.conf",   true },
> > -{ NULL }, /* end of list */
> > -};
> > -
> > -int qemu_read_default_config_files(bool userconfig)
> > -{
> > -int ret;
> > -struct defconfig_file *f;
> > -
> > -for (f = default_config_files; f->filename; f++) {
> > -if (!userconfig && f->userconfig) {
> > -continue;
> > -}
> > -ret = qemu_read_config_file(f->filename);
> > -if (ret < 0 && ret != -ENOENT) {
> > -return ret;
> > -}
> > -}
> > -
> > -return 0;
> > -}
> > -
> >  struct soundhw {
> >  const char *name;
> >  const char *descr;
> > diff --git a/vl.c b/vl.c
> > index c643d3ff3a..b563f9b924 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -2990,6 +2990,18 @@ static int global_init_func(void *opaque, QemuOpts 
> > *opts, Error **errp)
> >  return 0;
> >  }
> >  
> > +static int qemu_read_default_config_file(void)
> > +{
> > +int ret;
> > +
> > +ret = qemu_read_config_file(CONFIG_QEMU_CONFDIR "/qemu.conf");
> > +if (ret < 0 && ret != -ENOENT) {
> > +return ret;
> > +}
> > +
> > +return 0;
> > +}
> > +
> >  int main(int argc, char **argv, char **envp)
> >  {
> >  int i;
> > @@ -3117,10 +3129,8 @@ int main(int argc, char **argv, char **envp)
> >  }
> >  }
> >  
> > -if (defconfig) {
> > -int ret;
> > -ret = qemu_read_default_config_files(userconfig);
> > -if (ret < 0) {
> > +if (defconfig && userconfig) {
> > +if (qemu_read_default_config_file() < 0) {
> >  exit(1);
> >  }
> >  }
> > 
> 
> Looks good!  Please include it yourself in your next pull request.
> 
> Reviewed-by: Paolo Bonzini 
> 
> Though maybe we should just remove .conf file support completely...
> who's using it?!?

You mean removing /etc/qemu.conf, or removing -readconfig
completely?

The former doesn't seem to be used often. The latter looks very
useful for people trying to write scripts around QEMU and to
avoid command-line length limits, so I will be surprised if
nobody is using it.

-- 
Eduardo

Re: [Qemu-devel] [PATCH] arch_init: Remove unnecessary default_config_files table

2017-01-18 Thread Eduardo Habkost

On Wed, Jan 18, 2017 at 10:17:17AM +0100, Paolo Bonzini wrote:
> 
> 
> On 17/01/2017 19:00, Eduardo Habkost wrote:
> > The existing default_config_files table in arch_init.c has a
> > single entry, making it completely unnecessary. The whole code
> > can be replaced by a single qemu_read_config_file() call in vl.c.
> > 
> > Signed-off-by: Eduardo Habkost 
[...]
> 
> Looks good!  Please include it yourself in your next pull request.
> 
> Reviewed-by: Paolo Bonzini 

Applied to machine-next. Thanks!

-- 
Eduardo

[Qemu-devel] [PATCH RFC] acpi: add reset register to fadt

2017-01-18 Thread Phil Dennis-Jordan

About 2 years ago, Reza Jelveh submitted essentially this same patch:
https://lists.gnu.org/archive/html/qemu-devel/2015-03/msg05832.html

It adds the reset register defined in ACPI 2.0 to the x86 FADT, which fixes 
rebooting for Darwin/OS X/macOS guests.

I'm trying to revive this as part of an effort to get recent versions of OS 
X/macOS working in Qemu/KVM with OVMF and without the need for out-of-tree 
patches. I've adapted the original patch so it cleanly applies on current 
master. The unaddressed issues last time around were:

 1. Does this work with a range of Windows versions?

I've tested this new version of the patch with KVM from Ubuntu's 
4.4.0-51-generic kernel, using the following guests:

 * Windows XP SP2 and SP3, x86. "pc" (FX440) machine, IDE disk, SeaBIOS.
 * Windows 7 SP 1, x86. "pc" (FX440) machine, AHCI disk, SeaBIOS.
 * Windows 10 1607, x86-64, q35 machine, AHCI disk, OVMF from 2017-01-17 EDK2 
master.
 * OS X 10.9 to 10.11, macOS 10.12; patched OVMF [1]

In all cases I went through a fresh install, checked basic functionality, and 
rebooting. I noticed nothing different in any of the Windows VMs, and of course 
the Darwin-based ones no longer hang when attempting to reboot.


 2. The reset register is defined in ACPI 2.0, this still advertises 1.0.

I have to admit, I know very little about how ACPI works, so I'm not sure I got 
this right, but: I subsequently added a line
rsdp->revision = 0x02;
to build_rsdp() in acpi-build.c, then booted the aforementioned Windows VMs, 
and the macOS 10.12 one with this change. I noticed no change or ill effect 
whatsoever.

I suspect more might be involved in enabling ACPI 2.0, and it should probably 
be an option so as to avoid regressions. I don't know what the best approach 
would be for this, so comments welcome. Should adding the reset register to the 
FADT also be configurable?


Additionally, please suggest any further guest OSes and configurations I should 
test which seem likely to exhibit regressions with this change.

(First post to this list - apologies if I messed up any of the conventions! 
Thanks, phil)

[1] EDK2 fork with OS X/macOS compatibility patches: 
https://github.com/pmj/edk2/tree/mac-patches

---
 hw/i386/acpi-build.c| 5 +
 include/hw/acpi/acpi-defs.h | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 0c8912f..93781a0 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -296,6 +296,11 @@ static void fadt_setup(AcpiFadtDescriptorRev1 *fadt, 
AcpiPmInfo *pm)
   (1 << ACPI_FADT_F_SLP_BUTTON) |
   (1 << ACPI_FADT_F_RTC_S4));
 fadt->flags |= cpu_to_le32(1 << ACPI_FADT_F_USE_PLATFORM_CLOCK);
+fadt->flags |= cpu_to_le32(1 << ACPI_FADT_F_RESET_REG_SUP);
+fadt->reset_val = 0xf;
+fadt->reset_reg.address_space_id   = 1;
+fadt->reset_reg.register_bit_width = 8;
+fadt->reset_reg.address= ICH9_RST_CNT_IOPORT;
 /* APIC destination mode ("Flat Logical") has an upper limit of 8 CPUs
  * For more than 8 CPUs, "Clustered Logical" mode has to be used
  */
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index 4cc3630..5755570 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -140,6 +140,8 @@ struct AcpiFadtDescriptorRev1
 uint8_t  reserved4a; /* Reserved */
 uint8_t  reserved4b; /* Reserved */
 uint32_t flags;
+Acpi20GenericAddress reset_reg;
+uint8_t reset_val;
 } QEMU_PACKED;
 typedef struct AcpiFadtDescriptorRev1 AcpiFadtDescriptorRev1;
 
-- 
2.3.2 (Apple Git-55)

Re: [Qemu-devel] [PATCH 10/18] tcg/i386: add support for vector opcodes

2017-01-18 Thread Kirill Batuzov

On Tue, 17 Jan 2017, Richard Henderson wrote:

> On 01/17/2017 01:07 AM, Kirill Batuzov wrote:
> > To be able to generate vector operations in a TCG backend we need to do
> > several things.
> > 
> > 1. We need to tell the register allocator about vector target's register.
> >In case of x86 we'll use xmm0..xmm7. xmm7 is designated as a scratch
> >register, others can be used by the register allocator.
> > 
> > 2. We need a new constraint to indicate where to use vector registers. In
> >this commit the 'V' constraint is introduced.
> > 
> > 3. We need to be able to generate bare minimum: load, store and reg-to-reg
> >move. MOVDQU is used for loads and stores. MOVDQA is used for reg-to-reg
> >moves.
> > 
> > 4. Finally we need to support any other opcodes we want. INDEX_op_add_i32x4
> >is the only one for now. The PADDD instruction handles it perfectly.
> > 
> > Signed-off-by: Kirill Batuzov 
> > ---
> >  tcg/i386/tcg-target.h |  24 +-
> >  tcg/i386/tcg-target.inc.c | 109
> > +++---
> >  2 files changed, 125 insertions(+), 8 deletions(-)
> > 
> > diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
> > index 524cfc6..974a58b 100644
> > --- a/tcg/i386/tcg-target.h
> > +++ b/tcg/i386/tcg-target.h
> > @@ -29,8 +29,14 @@
> >  #define TCG_TARGET_TLB_DISPLACEMENT_BITS 31
> > 
> >  #ifdef __x86_64__
> > -# define TCG_TARGET_REG_BITS  64
> > -# define TCG_TARGET_NB_REGS   16
> > +# define TCG_TARGET_HAS_REG128 1
> > +# ifdef TCG_TARGET_HAS_REG128
> > +#  define TCG_TARGET_REG_BITS  64
> > +#  define TCG_TARGET_NB_REGS   24
> > +# else
> > +#  define TCG_TARGET_REG_BITS  64
> > +#  define TCG_TARGET_NB_REGS   16
> > +# endif
> >  #else
> >  # define TCG_TARGET_REG_BITS  32
> >  # define TCG_TARGET_NB_REGS8
> > @@ -56,6 +62,16 @@ typedef enum {
> >  TCG_REG_R13,
> >  TCG_REG_R14,
> >  TCG_REG_R15,
> > +#ifdef TCG_TARGET_HAS_REG128
> > +TCG_REG_XMM0,
> > +TCG_REG_XMM1,
> > +TCG_REG_XMM2,
> > +TCG_REG_XMM3,
> > +TCG_REG_XMM4,
> > +TCG_REG_XMM5,
> > +TCG_REG_XMM6,
> > +TCG_REG_XMM7,
> > +#endif
> 
> There's no need to conditionalize this.  The registers can be always defined
> even if they're not used.  We really really really want to keep ifdefs to an
> absolute minimum.
> 
> Why are you not defining xmm8-15?

At first I thought about supporting both x86_64 and i386 targets, but
put this idea away (at least for the time being). Since defining xmm8-15
does not contradict anything (as I see it now) I'll add them too.

> 
> > @@ -634,9 +662,24 @@ static inline void tgen_arithr(TCGContext *s, int
> > subop, int dest, int src)
> >  static inline void tcg_out_mov(TCGContext *s, TCGType type,
> > TCGReg ret, TCGReg arg)
> >  {
> > +int opc;
> >  if (arg != ret) {
> > -int opc = OPC_MOVL_GvEv + (type == TCG_TYPE_I64 ? P_REXW : 0);
> > -tcg_out_modrm(s, opc, ret, arg);
> > +switch (type) {
> > +#ifdef TCG_TARGET_HAS_REG128
> > +case TCG_TYPE_V128:
> > +ret -= TCG_REG_XMM0;
> > +arg -= TCG_REG_XMM0;
> > +tcg_out_modrm(s, OPC_MOVDQA_R2R, ret, arg);
> > +break;
> > +#endif
> > +case TCG_TYPE_I32:
> > +case TCG_TYPE_I64:
> > +opc = OPC_MOVL_GvEv + (type == TCG_TYPE_I64 ? P_REXW : 0);
> > +tcg_out_modrm(s, opc, ret, arg);
> > +break;
> > +default:
> > +assert(0);
> 
> g_assert_not_reached().
> 
> Again, no ifdefs.
> 
> We probably want to generate avx1 code when the cpu supports it, to avoid mode
> switches in the vector registers.  In this case, simply issue the same opcode,
> vex encoded.
> 
> > +#ifdef TCG_TARGET_HAS_REG128
> > +{ INDEX_op_add_i32x4, { "V", "0", "V" } },
> > +#endif
> 
> And, clearly, you need to rebase.
> 

I was too late to notice that some conflicting tcg-related pull has hit
master after my last rebase. Sorry. v2 will be rebased.

-- 
Kirill

Re: [Qemu-devel] [PATCH 2/3] numa: Remove the numa_post_machine_init function

2017-01-18 Thread Eduardo Habkost

On Wed, Jan 18, 2017 at 08:40:06PM +0800, Dou Liyang wrote:
> Current default way of seting the CPUState::numa_node in the
> numa_post_machine_init() and calling it in vl.c:main() would
> make the data incorrect in case on cold/hot-plug CPUs.
> 
> Now, we move it to the qom/cpu.c:cpu_common_realizefn().
> So, Here we remove it.
> 
> Signed-off-by: Dou Liyang 


Reviewed-by: Eduardo Habkost 

Can be squashed in patch 1/3.

> ---
>  include/sysemu/numa.h |  1 -
>  numa.c| 15 ---
>  vl.c  |  2 --
>  3 files changed, 18 deletions(-)
> 
> diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
> index 8f09dcf..b8015a5 100644
> --- a/include/sysemu/numa.h
> +++ b/include/sysemu/numa.h
> @@ -25,7 +25,6 @@ typedef struct node_info {
>  
>  extern NodeInfo numa_info[MAX_NODES];
>  void parse_numa_opts(MachineClass *mc);
> -void numa_post_machine_init(void);
>  void query_numa_node_mem(uint64_t node_mem[]);
>  extern QemuOptsList qemu_numa_opts;
>  void numa_set_mem_node_id(ram_addr_t addr, uint64_t size, uint32_t node);
> diff --git a/numa.c b/numa.c
> index 379bc8a..5f68497 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -394,21 +394,6 @@ void parse_numa_opts(MachineClass *mc)
>  }
>  }
>  
> -void numa_post_machine_init(void)
> -{
> -CPUState *cpu;
> -int i;
> -
> -CPU_FOREACH(cpu) {
> -for (i = 0; i < nb_numa_nodes; i++) {
> -assert(cpu->cpu_index < max_cpus);
> -if (test_bit(cpu->cpu_index, numa_info[i].node_cpu)) {
> -cpu->numa_node = i;
> -}
> -}
> -}
> -}
> -
>  static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
> const char *name,
> uint64_t ram_size)
> diff --git a/vl.c b/vl.c
> index c643d3f..afe40ce 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4549,8 +4549,6 @@ int main(int argc, char **argv, char **envp)
>  
>  cpu_synchronize_all_post_init();
>  
> -numa_post_machine_init();
> -
>  if (qemu_opts_foreach(qemu_find_opts("fw_cfg"),
>parse_fw_cfg, fw_cfg_find(), NULL) != 0) {
>  exit(1);
> -- 
> 2.5.5
> 
> 
> 

-- 
Eduardo

Re: [Qemu-devel] [PATCH 3/3] cpu: make the function of cpu_common_map_numa_node more efficiently

2017-01-18 Thread Eduardo Habkost

On Wed, Jan 18, 2017 at 08:40:07PM +0800, Dou Liyang wrote:
> Current function does some unnecessary operations, such as it makes
> the assert() in the loop, and the loop was not stopped in time.
> 
> This patch moves the assert() out the loop and stops the loop in
> time.
> 
> Signed-off-by: Dou Liyang 


Reviewed-by: Eduardo Habkost 

Can be squashed in patch 1/3.

> ---
>  qom/cpu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/qom/cpu.c b/qom/cpu.c
> index e08dceb..3c655b2 100644
> --- a/qom/cpu.c
> +++ b/qom/cpu.c
> @@ -343,10 +343,11 @@ static void cpu_common_map_numa_node(CPUState *cpu)
>  {
>  int i;
>  
> +assert(cpu->cpu_index < max_cpus);
>  for (i = 0; i < nb_numa_nodes; i++) {
> -assert(cpu->cpu_index < max_cpus);
>  if (test_bit(cpu->cpu_index, numa_info[i].node_cpu)) {
>  cpu->numa_node = i;
> +return;
>  }
>  }
>  }
> -- 
> 2.5.5
> 
> 
> 

-- 
Eduardo

Re: [Qemu-devel] [PATCH 13/14] raw-posix: Implement image locking

2017-01-18 Thread Max Reitz

On 18.01.2017 11:48, Fam Zheng wrote:
> On Fri, 12/02 03:58, Max Reitz wrote:
>> On 31.10.2016 16:38, Fam Zheng wrote:
>>> This implements open flag sensible image locking for local file
>>> and host device protocol.
>>>
>>> virtlockd in libvirt locks the first byte, so we start looking at the
>>> file bytes from 1.
>>>
>>> Quoting what was proposed by Kevin Wolf , there are
>>> four locking modes by combining two bits (BDRV_O_RDWR and
>>> BDRV_O_SHARE_RW), and implemented by taking two locks:
>>>
>>> Lock bytes:
>>>
>>> * byte 1: I can't allow other processes to write to the image
>>> * byte 2: I am writing to the image
>>>
>>> Lock modes:
>>>
>>> * shared writer (BDRV_O_RDWR | BDRV_O_SHARE_RW): Take shared lock on
>>>   byte 2. Test whether byte 1 is locked using an exclusive lock, and
>>>   fail if so.
>>
>> Testing whether something is locked would be easier by using F_OFD_GETLK
>> instead of actually creating an exclusive lock and then releasing it.
> 
> My attempt to do this shows it doesn't work: fcntl forces the tested lock type
> to read lock for some unknown reason, so it cannot be used to test other 
> shared
> lockers.

Do you have a reproducer? The attached quick and dirty test case works
for me (compile test.c to test and test2.c to test2), producing the
following output (when running ./test):

set rd lock: 0, Success
get lock: 0, Success; read lock in place
set wr lock: -1, Resource temporarily unavailable
unset lock: 0, Resource temporarily unavailable
===
unset lock: 0, Success
get lock: 0, Success; unlocked
set wr lock: 0, Success
unset lock: 0, Success
===
set wr lock: 0, Success
get lock: 0, Success; write lock in place
set wr lock: -1, Resource temporarily unavailable
unset lock: 0, Resource temporarily unavailable

So the l_type field is correctly set after every F_OFD_GETLK query call.

Max
#define _GNU_SOURCE

#include 
#include 
#include 
#include 
#include 
#include 


int main(void)
{
system("touch foo");


int fd = open("foo", O_RDWR);

int ret = fcntl(fd, F_OFD_SETLK, &(struct flock){ .l_type = F_RDLCK });
printf("set rd lock: %i, %s\n", ret, strerror(errno));

system("./test2");

printf("===\n");

ret = fcntl(fd, F_OFD_SETLK, &(struct flock){ .l_type = F_UNLCK });
printf("unset lock: %i, %s\n", ret, strerror(errno));

system("./test2");

printf("===\n");

ret = fcntl(fd, F_OFD_SETLK, &(struct flock){ .l_type = F_WRLCK });
printf("set wr lock: %i, %s\n", ret, strerror(errno));

system("./test2");

close(fd);
return 0;
}
#define _GNU_SOURCE

#include 
#include 
#include 
#include 
#include 


static const char *lock_names[] = {
[F_UNLCK] = "unlocked",
[F_RDLCK] = "read lock in place",
[F_WRLCK] = "write lock in place",
};


int main(void)
{
int fd = open("foo", O_RDWR);

struct flock fl = { .l_type = F_WRLCK };
int ret = fcntl(fd, F_OFD_GETLK, &fl);
printf("get lock: %i, %s; %s\n", ret, strerror(errno), lock_names[fl.l_type]);

ret = fcntl(fd, F_OFD_SETLK, &(struct flock){ .l_type = F_WRLCK });
printf("set wr lock: %i, %s\n", ret, strerror(errno));

close(fd);
return 0;
}


signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] arch_init: Remove unnecessary default_config_files table

2017-01-18 Thread Paolo Bonzini



On 18/01/2017 13:58, Eduardo Habkost wrote:
> On Wed, Jan 18, 2017 at 01:55:21PM +0100, Paolo Bonzini wrote:
>>
>>
>> On 18/01/2017 13:52, Eduardo Habkost wrote:

 Though maybe we should just remove .conf file support completely...
 who's using it?!?
>>> You mean removing /etc/qemu.conf, or removing -readconfig
>>> completely?
>>>
>>> The former doesn't seem to be used often. The latter looks very
>>> useful for people trying to write scripts around QEMU and to
>>> avoid command-line length limits, so I will be surprised if
>>> nobody is using it.
>>
>> Of course /etc/qemu.conf only (so that -nodefconfig/-nouserconfig become
>> no-ops).
> 
> Agreed. Should we print a warning in 2.9 and remove it in 2.10?

I would just remove it, but that sounds like a plan too. :)

Paolo

Re: [Qemu-devel] [PATCH 1/3] cpu: Make the mapping of CPUs and NUMA nodes in cpu_common_realizefn

2017-01-18 Thread Dou Liyang


Hi, Eduardo

At 01/18/2017 08:56 PM, Eduardo Habkost wrote:

On Wed, Jan 18, 2017 at 08:40:05PM +0800, Dou Liyang wrote:

Current default way of seting the CPUState::numa_node might be wrong
in case on cold/hot-plug CPUs. Making the users confused why the
NUMA info is different beetween the guests and monitor.

Make the mapping of CPUs and NUMA nodes in qom/cpu.c:
cpu_common_realizefn(), where each VCPUs need to realize.

Signed-off-by: Dou Liyang 


parse_numa_opts() is called a long time before any CPU is
created, so this should be safe.

Reviewed-by: Eduardo Habkost 

(But we can squash patch 2/3 and patch 3/3 in this patch).


OK, :)

Thanks,
Liyang.




---
 qom/cpu.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/qom/cpu.c b/qom/cpu.c
index 61ee0cb..e08dceb 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -23,6 +23,7 @@
 #include "qemu-common.h"
 #include "qom/cpu.h"
 #include "sysemu/kvm.h"
+#include "sysemu/numa.h"
 #include "qemu/notify.h"
 #include "qemu/log.h"
 #include "exec/log.h"
@@ -338,6 +339,18 @@ static void cpu_common_parse_features(const char 
*typename, char *features,
 }
 }

+static void cpu_common_map_numa_node(CPUState *cpu)
+{
+int i;
+
+for (i = 0; i < nb_numa_nodes; i++) {
+assert(cpu->cpu_index < max_cpus);
+if (test_bit(cpu->cpu_index, numa_info[i].node_cpu)) {
+cpu->numa_node = i;
+}
+}
+}
+
 static void cpu_common_realizefn(DeviceState *dev, Error **errp)
 {
 CPUState *cpu = CPU(dev);
@@ -347,6 +360,8 @@ static void cpu_common_realizefn(DeviceState *dev, Error 
**errp)
 cpu_resume(cpu);
 }

+cpu_common_map_numa_node(cpu);
+
 /* NOTE: latest generic point where the cpu is fully realized */
 trace_init_vcpu(cpu);
 }
--
2.5.5

Re: [Qemu-devel] [PATCH 13/14] raw-posix: Implement image locking

2017-01-18 Thread Fam Zheng

On Wed, 01/18 14:02, Max Reitz wrote:
> >> Testing whether something is locked would be easier by using F_OFD_GETLK
> >> instead of actually creating an exclusive lock and then releasing it.
> > 
> > My attempt to do this shows it doesn't work: fcntl forces the tested lock 
> > type
> > to read lock for some unknown reason, so it cannot be used to test other 
> > shared
> > lockers.
> 
> Do you have a reproducer? The attached quick and dirty test case works
> for me (compile test.c to test and test2.c to test2), producing the
> following output (when running ./test):
> 
> set rd lock: 0, Success
> get lock: 0, Success; read lock in place
> set wr lock: -1, Resource temporarily unavailable
> unset lock: 0, Resource temporarily unavailable
> ===
> unset lock: 0, Success
> get lock: 0, Success; unlocked
> set wr lock: 0, Success
> unset lock: 0, Success
> ===
> set wr lock: 0, Success
> get lock: 0, Success; write lock in place
> set wr lock: -1, Resource temporarily unavailable
> unset lock: 0, Resource temporarily unavailable

You are right, now I see why I was confused with the F_OFD_GETLK interface.

Fam

> 
> So the l_type field is correctly set after every F_OFD_GETLK query call.
> 
> Max

Re: [Qemu-devel] [PATCH 0/3] cpu: numa: Fix the mapping initialization of VCPUs and NUMA nodes

2017-01-18 Thread Dou Liyang


Hi, All


**
ERROR:/tmp/qemu-test/src/tests/vhost-user-test.c:668:test_migrate: assertion failed: 
(qdict_haskey(rsp, "return"))
GTester: last random seed: R02Sf52546c4daff8087416f43fa7c146db8
ftruncate: Permission denied
ftruncate: Permission denied
qemu-system-aarch64: /tmp/qemu-test/src/qom/cpu.c:346: cpu_common_map_numa_node: 
Assertion `cpu->cpu_index < max_cpus' failed.
Broken pipe


I don't know What's the meaning of this log ?

Is the qemu-system-aarch64 can't recognize the
qom/cpu.c:346: assert(cpu->cpu_index < max_cpus);

Thanks,
Liyang



GTester: last random seed: R02Sa39aa674143b4b48a89276d59eee19b3
qemu-system-aarch64: /tmp/qemu-test/src/qom/cpu.c:346: cpu_common_map_numa_node: 
Assertion `cpu->cpu_index < max_cpus' failed.
Broken pipe
GTester: last random seed: R02Sbd557f67296de3764183db6b2105d88b
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
qemu-system-aarch64: /tmp/qemu-test/src/qom/cpu.c:346: cpu_common_map_numa_node: 
Assertion `cpu->cpu_index < max_cpus' failed.
Broken pipe
GTester: last random seed: R02S79c36cf5f50e19956d11fbc925f86df0
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
qemu-system-aarch64: /tmp/qemu-test/src/qom/cpu.c:346: cpu_common_map_numa_node: 
Assertion `cpu->cpu_index < max_cpus' failed.
Broken pipe
GTester: last random seed: R02S1186d0871ff44bed258ec179722caf91
qemu-system-aarch64: /tmp/qemu-test/src/qom/cpu.c:346: cpu_common_map_numa_node: 
Assertion `cpu->cpu_index < max_cpus' failed.
Broken pipe
GTester: last random seed: R02S3429594491b10489f9abab73dcc15151
make: *** [check-qtest-aarch64] Error 1
make: *** Waiting for unfinished jobs
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
Back to tcg accelerator.
make[1]: *** [docker-run] Error 2
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-3q9yfmbs/src'
make: *** [docker-run-test-quick@centos6] Error 2
=== OUTPUT END ===

Test command exited with code: 2


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [PATCH v6 kernel 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration

2017-01-18 Thread Li, Liang Z

> Am 21.12.2016 um 07:52 schrieb Liang Li:
> > This patch set contains two parts of changes to the virtio-balloon.
> >
> > One is the change for speeding up the inflating & deflating process,
> > the main idea of this optimization is to use {pfn|length} to present
> > the page information instead of the PFNs, to reduce the overhead of
> > virtio data transmission, address translation and madvise(). This can
> > help to improve the performance by about 85%.
> >
> > Another change is for speeding up live migration. By skipping process
> > guest's unused pages in the first round of data copy, to reduce
> > needless data processing, this can help to save quite a lot of CPU
> > cycles and network bandwidth. We put guest's unused page information
> > in a {pfn|length} array and send it to host with the virt queue of
> > virtio-balloon. For an idle guest with 8GB RAM, this can help to
> > shorten the total live migration time from 2Sec to about 500ms in
> > 10Gbps network environment. For an guest with quite a lot of page
> > cache and with little unused pages, it's possible to let the guest
> > drop it's page cache before live migration, this case can benefit from this
> new feature too.
> 
> I agree that both changes make sense (although the second change just
> smells very racy, as you also pointed out in the patch description), however I
> am not sure if virtio-balloon is really the right place for the latter change.
> 
> virtio-balloon is all about ballooning, nothing else. What you're doing is 
> using
> it as a way to communicate balloon-unrelated data from/to the hypervisor.
> Yes, it is also about guest memory, but completely unrelated to the purpose
> of the balloon device.
> 
> Maybe using virtio-balloon for this purpose is okay - I have mixed feelings
> (especially as I can't tell where else this could go). I would like to get a 
> second
> opinion on this.
> 

We have ever discussed the implementation for a long time, making use the 
current
virtio balloon seems better than the other solutions and is recommended by 
Michael.

Thanks!
Liang
> --
> 
> David

Re: [Qemu-devel] [PATCH v4 10/25] block: Add bdrv_dirname()

2017-01-18 Thread Max Reitz

On 17.01.2017 00:02, Eric Blake wrote:
> On 01/16/2017 02:49 PM, Max Reitz wrote:
>> This function may be implemented by block drivers to derive a directory
>> name from a BDS. Concatenating this g_free()-able string with a relative
>> filename must result in a valid (not necessarily existing) filename, so
>> this is a function that should generally be not implemented by format
>> drivers, because this is protocol-specific.
>>
>> If a BDS's driver does not implement this function, bdrv_dirname() will
>> fall through to the BDS's file if it exists. If it does not, the
>> exact_filename field will be used to generate a directory name.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>  include/block/block.h |  1 +
>>  include/block/block_int.h |  1 +
>>  block.c   | 26 ++
>>  3 files changed, 28 insertions(+)
> 
>> +++ b/include/block/block_int.h
>> @@ -130,6 +130,7 @@ struct BlockDriver {
>>  int (*bdrv_make_empty)(BlockDriverState *bs);
>>  
>>  void (*bdrv_refresh_filename)(BlockDriverState *bs, QDict *options);
>> +char *(*bdrv_dirname)(BlockDriverState *bs, Error **errp);
> 
> I know we've been lousy at documentation in the past, but should we
> start doing a better job of documenting the contract as we add new
> callbacks, rather than just the commit messages?

Good point, I'll add a comment.

> Based on existing practice, though, I'm okay with:
> Reviewed-by: Eric Blake 

Thanks for reviewing!

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v4 13/25] block/nbd: Implement bdrv_dirname()

2017-01-18 Thread Max Reitz

On 17.01.2017 00:21, Eric Blake wrote:
> On 01/16/2017 02:49 PM, Max Reitz wrote:
>> The idea behind this implementation is that the export name might be
>> interpreted as a path (which is the only apparent interpretation of
>> relative filenames for NBD paths).
>>
>> The default implementation of bdrv_dirname() would handle that just fine
>> for nbd+tcp, but not for nbd+unix, because in that case, the last
>> element of the path is the Unix socket path and not the export name.
>> Therefore, we need to implement an own bdrv_dirname() which uses the
>> legacy NBD URL which has the export name at the end.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>  block/nbd.c | 31 +++
>>  1 file changed, 31 insertions(+)
>>
>> diff --git a/block/nbd.c b/block/nbd.c
>> index 35f24be069..42f0cd638c 100644
>> --- a/block/nbd.c
>> +++ b/block/nbd.c
>> @@ -552,6 +552,34 @@ static void nbd_refresh_filename(BlockDriverState *bs, 
>> QDict *options)
>>  bs->full_open_options = opts;
>>  }
>>  
>> +static char *nbd_dirname(BlockDriverState *bs, Error **errp)
>> +{
>> +BDRVNBDState *s = bs->opaque;
>> +const char *host = NULL, *port = NULL, *path = NULL;
>> +
>> +if (s->saddr->type == SOCKET_ADDRESS_KIND_INET) {
>> +const InetSocketAddress *inet = s->saddr->u.inet.data;
>> +if (!inet->has_ipv4 && !inet->has_ipv6 && !inet->has_to) {
>> +host = inet->host;
>> +port = inet->port;
>> +}
>> +} else if (s->saddr->type == SOCKET_ADDRESS_KIND_UNIX) {
>> +path = s->saddr->u.q_unix.data->path;
>> +}
>> +
> 
> Should we be catering to SOCKET_ADDRESS_KIND_VSOCK or
> SOCKET_ADDRESS_KIND_FD (if only to assert that they aren't possible with
> NBD)?

In those cases, the generic "Cannot generate..." error below will be
returned. I think that should be enough.

>> +if (path) {
>> +return g_strdup_printf("nbd:unix:%s:exportname=%s%s", path,
>> +   s->export ?: "", s->export ? "/" : "");
>> +} else if (host) {
>> +return g_strdup_printf("nbd://%s:%s/%s%s", host, port,
>> +   s->export ?: "", s->export ? "/" : "");
>> +}
>> +
>> +error_setg(errp, "Cannot generate a base directory for NBD node '%s'",
>> +   bs->filename);
>> +return NULL;
>> +}
>> +
> 
> The NBD protocol doesn't require the server to support directories, but
> also doesn't place any requirements on export names, so this approach
> assumes a particular server behavior, but is probably as good as any
> other approach.  But I'm still not sold that we need this, vs. just
> declaring that NBD has no directory structure and therefore we always
> return NULL (even with nbd+tcp, and whether or not the older nbd: URI
> was used).  So I'm not quite ready for R-b on this one yet.

Right. Perhaps it's better to just always make it an error for now and
maybe revisit it later if someone asks for it (which I can't imagine...).

Max



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v2] vl: Ensure the cpu_synchronize_all_post_init() in the appropriate location

2017-01-18 Thread Dou Liyang

At the Qemu initialization, we call the cpu_synchronize_all_post_init()
to synchronize All CPU states to KVM in the ./vl.c::main().

Currently, it is called before we initialize the CPUs, which created by
"-device" command, So, these CPUs may be ignored to synchronize.

The patch moves the numa_post_machine_init func in the appropriate
location to make sure that all the CPUs has already been created
when it is called.

Signed-off-by: Dou Liyang 
---

Change log v1-> v2:
1. Split it from 
https://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg03354.html
2. Rewrite the log.

 vl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index c643d3f..38269be 100644
--- a/vl.c
+++ b/vl.c
@@ -4547,8 +4547,6 @@ int main(int argc, char **argv, char **envp)
 
 audio_init();
 
-cpu_synchronize_all_post_init();
-
 numa_post_machine_init();
 
 if (qemu_opts_foreach(qemu_find_opts("fw_cfg"),
@@ -4571,6 +4569,9 @@ int main(int argc, char **argv, char **envp)
   device_init_func, NULL, NULL)) {
 exit(1);
 }
+
+cpu_synchronize_all_post_init();
+
 rom_reset_order_override();
 
 /* Did we create any drives that we failed to create a device for? */
-- 
2.5.5

Re: [Qemu-devel] [PATCH 0/3] cpu: numa: Fix the mapping initialization of VCPUs and NUMA nodes

2017-01-18 Thread Igor Mammedov

On Wed, 18 Jan 2017 20:40:04 +0800
Dou Liyang  wrote:

> As we fixed a bug(Bug 1) in below links, Named "Method-A":
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg03354.html
> 
> Then, Eduardo gave us many suggests. Thanks very much!
> when we try them, we also find another bugs named "Bug 2".
I have an alternative fix for both issues for which
I've been writing cover letter when I saw this series.

My fix series more intrusive though as it's goal isn't
just to fix 'info numa' but rather switch away from
cpu-index based mapping to socket/core/thread based mapping
and stop using bitmaps for mapping. And as 'info numa' was
getting in the way, I fixed both issues in a slightly
different way.

So pls wait a bit, once travis build is completed,
I'll post series here.

> [Problem]
> -
> 
> As I use this command:
> 
> ./x86_64-softmmu/qemu-system-x86_64 \
>   -hda /image/fedora.img \
>   -m 1G,slots=4,maxmem=4G \
>   -enable-kvm \
>   -smp 2,maxcpus=16,sockets=4,cores=2,threads=2 \
>   -device qemu64-x86_64-cpu,id=cpu1,socket-id=0,core-id=1,thread-id=0 \
>   -device qemu64-x86_64-cpu,id=cpu2,socket-id=0,core-id=1,thread-id=1 \
>   -device qemu64-x86_64-cpu,id=cpu3,socket-id=1,core-id=0,thread-id=0 \
>   -device qemu64-x86_64-cpu,id=cpu4,socket-id=1,core-id=0,thread-id=1 \
>   -device qemu64-x86_64-cpu,id=cpu5,socket-id=1,core-id=1,thread-id=0 \
>   -device qemu64-x86_64-cpu,id=cpu6,socket-id=1,core-id=1,thread-id=1 \
>   -numa node,nodeid=0,cpus=0-3 \
>   -numa node,nodeid=1,cpus=4-7 \
>   -numa node,nodeid=2,cpus=8-11 \
>   -numa node,nodeid=3,cpus=12-15 \
>   -monitor stdio \
> 
> 1. Bug 1
> 
> In Qemu monitor:
> 
> (qemu) info numa 
> 4 nodes
> node 0 cpus: 0 1 2 3 4 5 6 7
> node 0 size: 256 MB
> node 1 cpus:
> node 1 size: 256 MB
> node 2 cpus:
> node 2 size: 256 MB
> node 3 cpus:
> node 3 size: 256 MB
> 
> 2. Bug 2
> 
> (qemu) device_add qemu64-x86_64-cpu,id=cpu7,socket-id=2,core-id=0,thread-id=0
> 
> (qemu) info numa
> 4 nodes
> node 0 cpus: 0 1 2 3 4 5 6 7 8
> node 0 size: 256 MB
> node 1 cpus:
> node 1 size: 256 MB
> node 2 cpus:
> node 2 size: 256 MB
> node 3 cpus:
> node 3 size: 256 MB
> 
> [Method-A]
> --
> 
> 1. Method-A that we provided above: 
>   * Ensure the numa_post_machine_init func in the appropriate location in
> vl.c::main().
> 
> It can fix Bug 1, but, can't work for Bug 2.
> 
> 1.1. For Bug 1: fixed
> 
> (qemu) info numa
> 4 nodes
> node 0 cpus: 0 1 2 3
> node 0 size: 256 MB
> node 1 cpus: 4 5 6 7
> node 1 size: 256 MB
> node 2 cpus:
> node 2 size: 256 MB
> node 3 cpus:
> node 3 size: 256 MB
> 
> 1.2. For Bug 2: can not fixed
> 
> (qemu) device_add qemu64-x86_64-cpu,id=cpu7,socket-id=2,core-id=0,thread-id=0
> 
> (qemu) info numa
> node 0 cpus: 0 1 2 3 8
> node 0 size: 256 MB
> node 1 cpus: 4 5 6 7
> node 1 size: 256 MB
> node 2 cpus:
> node 2 size: 256 MB
> node 3 cpus:
> node 3 size: 256 MB
> 
> [Solution]
> --
> 
> Move the CPUState::numa_node initialization to 
> qom/cpu.c:cpu_common_realizefn(),
> and remove numa_post_machine_init() completely.
> 
> It can fix Bug 1 and Bug 2. The result shows that:
> 
> (qemu) info numa
> 4 nodes
> node 0 cpus: 0 1 2 3
> node 0 size: 256 MB
> node 1 cpus: 4 5 6 7
> node 1 size: 256 MB
> node 2 cpus:
> node 2 size: 256 MB
> node 3 cpus:
> node 3 size: 256 MB
> 
> (qemu) device_add qemu64-x86_64-cpu,id=cpu7,socket-id=2,core-id=0,thread-id=0
> (qemu) info numa
> 4 nodes
> node 0 cpus: 0 1 2 3
> node 0 size: 256 MB
> node 1 cpus: 4 5 6 7
> node 1 size: 256 MB
> node 2 cpus: 8
> node 2 size: 256 MB
> node 3 cpus:
> node 3 size: 256 MB
> 
> (qemu) device_add qemu64-x86_64-cpu,id=cpu8,socket-id=3,core-id=0,thread-id=0
> (qemu) info numa
> 4 nodes
> node 0 cpus: 0 1 2 3
> node 0 size: 256 MB
> node 1 cpus: 4 5 6 7
> node 1 size: 256 MB
> node 2 cpus: 8
> node 2 size: 256 MB
> node 3 cpus: 12
> node 3 size: 256 MB
> 
> (qemu) device_del cpu5
> (qemu) info numa
> 4 nodes
> node 0 cpus: 0 1 2 3
> node 0 size: 256 MB
> node 1 cpus: 4 5 7
> node 1 size: 256 MB
> node 2 cpus: 8
> node 2 size: 256 MB
> node 3 cpus: 12
> node 3 size: 256 MB
> 
> Dou Liyang (3):
>   cpu: Make the mapping of CPUs and NUMA nodes in cpu_common_realizefn
>   numa: Remove the numa_post_machine_init function
>   cpu: make the function of cpu_common_map_numa_node more efficiently
> 
>  include/sysemu/numa.h |  1 -
>  numa.c| 15 ---
>  qom/cpu.c | 16 
>  vl.c  |  2 --
>  4 files changed, 16 insertions(+), 18 deletions(-)
>

[Qemu-devel] [PATCH 2/2] vl: Print warning when a default config file is loaded

2017-01-18 Thread Eduardo Habkost

In case there were options set in the default config file, print
a warning so users can update their scripts.

If somebody wants to keep the config file as-is, avoid the
warning and use a command-line that will work in future QEMU
versions, they can use:

 $QEMU -nodefconfig -readconfig /etc/qemu/qemu.conf

I was going to add an additional message suggesting it as a
solution, but I thought it could make it more confusing. The
solution can be documented in the QEMU 2.9 ChangeLog.

Signed-off-by: Eduardo Habkost 
---
 vl.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/vl.c b/vl.c
index b563f9b924..e80b6da4bd 100644
--- a/vl.c
+++ b/vl.c
@@ -2999,6 +2999,12 @@ static int qemu_read_default_config_file(void)
 return ret;
 }
 
+if (ret > 0) {
+loc_set_none();
+error_report("Warning: Future QEMU versions won't load %s 
automatically",
+ CONFIG_QEMU_CONFDIR "/qemu.conf");
+}
+
 return 0;
 }
 
-- 
2.11.0.259.g40922b1

[Qemu-devel] [PATCH 0/2] vl: Print warning if a non-empty default config-file is found

2017-01-18 Thread Eduardo Habkost

We plan to remove support for /etc/qemu/qemu.conf in the near
future. Make QEMU print a warning in case there a non-empty
/etc/qemu/qemu.conf is loaded, so users have time to adapt.

This series is based on my machine-next branch, at:
  https://github.com/ehabkost/qemu.git machine-next

Eduardo Habkost (2):
  config: qemu_config_parse() return number of config groups
  vl: Print warning when a default config file is loaded

 util/qemu-config.c | 15 +++
 vl.c   |  6 ++
 2 files changed, 13 insertions(+), 8 deletions(-)

-- 
2.11.0.259.g40922b1

[Qemu-devel] [PATCH 1/2] config: qemu_config_parse() return number of config groups

2017-01-18 Thread Eduardo Habkost

Change qemu_config_parse() to return the number of config groups
in success and -EINVAL on error. This will allow callers of
qemu_config_parse() to check if something was really loaded from
the config file.

All existing callers of qemu_config_parse() and
qemu_read_config_file() only check if the return value was
negative, so the change shouldn't affect them.

Signed-off-by: Eduardo Habkost 
---
 util/qemu-config.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/util/qemu-config.c b/util/qemu-config.c
index 5527100a01..560c201bd3 100644
--- a/util/qemu-config.c
+++ b/util/qemu-config.c
@@ -379,6 +379,7 @@ void qemu_config_write(FILE *fp)
 }
 }
 
+/* Returns number of config groups on success, -errno on error */
 int qemu_config_parse(FILE *fp, QemuOptsList **lists, const char *fname)
 {
 char line[1024], group[64], id[64], arg[64], value[1024];
@@ -386,7 +387,8 @@ int qemu_config_parse(FILE *fp, QemuOptsList **lists, const 
char *fname)
 QemuOptsList *list = NULL;
 Error *local_err = NULL;
 QemuOpts *opts = NULL;
-int res = -1, lno = 0;
+int res = -EINVAL, lno = 0;
+int count = 0;
 
 loc_push_none(&loc);
 while (fgets(line, sizeof(line), fp) != NULL) {
@@ -407,6 +409,7 @@ int qemu_config_parse(FILE *fp, QemuOptsList **lists, const 
char *fname)
 goto out;
 }
 opts = qemu_opts_create(list, id, 1, NULL);
+count++;
 continue;
 }
 if (sscanf(line, "[%63[^]]]", group) == 1) {
@@ -417,6 +420,7 @@ int qemu_config_parse(FILE *fp, QemuOptsList **lists, const 
char *fname)
 goto out;
 }
 opts = qemu_opts_create(list, NULL, 0, &error_abort);
+count++;
 continue;
 }
 value[0] = '\0';
@@ -441,7 +445,7 @@ int qemu_config_parse(FILE *fp, QemuOptsList **lists, const 
char *fname)
 error_report("error reading file");
 goto out;
 }
-res = 0;
+res = count;
 out:
 loc_pop(&loc);
 return res;
@@ -458,12 +462,7 @@ int qemu_read_config_file(const char *filename)
 
 ret = qemu_config_parse(f, vm_config_groups, filename);
 fclose(f);
-
-if (ret == 0) {
-return 0;
-} else {
-return -EINVAL;
-}
+return ret;
 }
 
 static void config_parse_qdict_section(QDict *options, QemuOptsList *opts,
-- 
2.11.0.259.g40922b1

[Qemu-devel] [PATCH] x86-KVM: Supply TSC and APIC clock rates to guest like VMWare

2017-01-18 Thread Phil Dennis-Jordan

This fixes timekeeping of x86-64 Darwin/OS X/macOS guests when using KVM.

Darwin/OS X/macOS for x86-64 uses the TSC for timekeeping; it normally 
calibrates this by querying various clock frequency scaling MSRs. Details 
depend on the exact CPU model detected. The local APIC timer frequency is 
extracted from (EFI) firmware.

This is problematic in the presence of virtualisation, as the MSRs in question 
are typically not handled by the hypervisor. VMWare (Fusion) advertises TSC and 
APIC frequency via a custom 0x4010 CPUID leaf, in the eax and ebx registers 
respectively. This is documented at https://lwn.net/Articles/301888/ among 
other places.

Darwin/OS X/macOS looks for the generic 0x4000 hypervisor leaf, and if this 
indicates via eax that leaf 0x4010 might be available, that is in turn 
queried for the two frequencies.

This adds a CPU option "vmware-tsc-apic-clocks" to enable the same behaviour 
when running Qemu with KVM acceleration, if the KVM TSC frequency can be 
established. The virtualised APIC bus cycle is hardcoded to 1GHz in KVM, so ebx 
of the CPUID leaf is also hardcoded to this value.

Signed-off-by: Phil Dennis-Jordan 

---
 target/i386/cpu.c |  1 +
 target/i386/cpu.h |  4 
 target/i386/kvm.c | 40 
 3 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index aba11ae..e5523d4 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3677,6 +3677,7 @@ static Property x86_cpu_properties[] = {
 DEFINE_PROP_BOOL("cpuid-0xb", X86CPU, enable_cpuid_0xb, true),
 DEFINE_PROP_BOOL("lmce", X86CPU, enable_lmce, false),
 DEFINE_PROP_BOOL("l3-cache", X86CPU, enable_l3_cache, true),
+DEFINE_PROP_BOOL("vmware-tsc-apic-clocks", X86CPU, vmware_clock_rates, 
false),
 DEFINE_PROP_END_OF_LIST()
 };
 
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 6c1902b..1d8590b 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1213,6 +1213,10 @@ struct X86CPU {
 bool host_features;
 uint32_t apic_id;
 
+/* Enables publishing of TSC increment and Local APIC bus frequencies to
+ * the guest OS in CPUID page 0x4010, the same way that VMWare does. */
+bool vmware_clock_rates;
+
 /* if true the CPUID code directly forward host cache leaves to the guest 
*/
 bool cache_info_passthrough;
 
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 10a9cd8..7830b3a 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -778,10 +778,14 @@ int kvm_arch_init_vcpu(CPUState *cs)
 }
 
 if (cpu->expose_kvm) {
+uint32_t kvm_max_page = KVM_CPUID_FEATURES | kvm_base;
+if (cpu->vmware_clock_rates && kvm_base == KVM_CPUID_SIGNATURE) {
+kvm_max_page = MAX(kvm_max_page, KVM_CPUID_SIGNATURE | 0x10);
+}
 memcpy(signature, "KVMKVMKVM\0\0\0", 12);
 c = &cpuid_data.entries[cpuid_i++];
 c->function = KVM_CPUID_SIGNATURE | kvm_base;
-c->eax = KVM_CPUID_FEATURES | kvm_base;
+c->eax = kvm_max_page;
 c->ebx = signature[0];
 c->ecx = signature[1];
 c->edx = signature[2];
@@ -910,7 +914,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
 }
 }
 
-cpuid_data.cpuid.nent = cpuid_i;
 
 if (((env->cpuid_version >> 8)&0xF) >= 6
 && (env->features[FEAT_1_EDX] & (CPUID_MCE | CPUID_MCA)) ==
@@ -973,12 +976,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
 vmstate_x86_cpu.unmigratable = 1;
 }
 
-cpuid_data.cpuid.padding = 0;
-r = kvm_vcpu_ioctl(cs, KVM_SET_CPUID2, &cpuid_data);
-if (r) {
-return r;
-}
-
 r = kvm_arch_set_tsc_khz(cs);
 if (r < 0) {
 return r;
@@ -998,6 +995,33 @@ int kvm_arch_init_vcpu(CPUState *cs)
 }
 }
 
+if (cpu->vmware_clock_rates) {
+if (cpu->expose_kvm
+&& kvm_base == KVM_CPUID_SIGNATURE
+&& env->tsc_khz != 0) {
+/* Publish TSC and LAPIC resolution on CPUID page 0x4010
+ * like VMWare for benefit of Darwin guests. */
+c = &cpuid_data.entries[cpuid_i++];
+c->function = KVM_CPUID_SIGNATURE | 0x10;
+c->eax = env->tsc_khz;
+/* LAPIC resolution of 1ns (freq: 1GHz) is hardcoded in KVM's
+ * APIC_BUS_CYCLE_NS*/
+c->ebx = 100;
+c->ecx = c->edx = 0;
+} else {
+error_report(
+"Warning: VMWare-style TSC/LAPIC clock reporting impossible.");
+}
+}
+
+cpuid_data.cpuid.nent = cpuid_i;
+
+cpuid_data.cpuid.padding = 0;
+r = kvm_vcpu_ioctl(cs, KVM_SET_CPUID2, &cpuid_data);
+if (r) {
+return r;
+}
+
 if (has_xsave) {
 env->kvm_xsave_buf = qemu_memalign(4096, sizeof(struct kvm_xsave));
 }
-- 
2.3.2 (Apple Git-55)

[Qemu-devel] [PATCH] hw/usb/dev-hid: add a Mac guest compatibility option to usb-tablet

2017-01-18 Thread Phil Dennis-Jordan

Darwin/OS X/macOS's HID driver stack does not correctly drive Qemu's simulated 
USB Tablet. This adds a boolean option "mac_compat" which subtly changes the 
device so it behaves in a way that Mac guests can handle.

The specific incompatibilities with the regular Qemu USB tablet are:

 1. Absolute pointing devices with HID Report Descriptor usage page of 0x01 
(pointing) are handled by the macOS HID driver as analog sticks, so the 
movement of the cursor ends up being the cumulative deviance from the centre 
position.
 2. The bInterfaceProtocol of 0x02 enables a particular macOS HID driver mode 
which only works properly with mice (relative motion) not absolute pointing 
devices, so spurious events with relative coordinates are generated in addition 
to absolute ones. This manifests as a very jittery cursor.

The workaround is to report a usage page of 0x02 (mouse) and a 
bInterfaceProtocol value of 0x00.

Signed-off-by: Phil Dennis-Jordan 

---
 hw/usb/dev-hid.c | 138 ++-
 1 file changed, 136 insertions(+), 2 deletions(-)

diff --git a/hw/usb/dev-hid.c b/hw/usb/dev-hid.c
index 24d05f7..0f5b796 100644
--- a/hw/usb/dev-hid.c
+++ b/hw/usb/dev-hid.c
@@ -51,6 +51,7 @@ typedef struct USBHIDState {
 uint32_t usb_version;
 char *display;
 uint32_t head;
+bool mac_compat;
 } USBHIDState;
 
 #define TYPE_USB_HID "usb-hid"
@@ -200,6 +201,66 @@ static const USBDescIface desc_iface_tablet2 = {
 },
 };
 
+static const USBDescIface desc_iface_tablet_mac_compat = {
+.bInterfaceNumber  = 0,
+.bNumEndpoints = 1,
+.bInterfaceClass   = USB_CLASS_HID,
+.bInterfaceProtocol= 0x00, /* OSX/macOS can't handle 2 here */
+.ndesc = 1,
+.descs = (USBDescOther[]) {
+{
+/* HID descriptor */
+.data = (uint8_t[]) {
+0x09,  /*  u8  bLength */
+USB_DT_HID,/*  u8  bDescriptorType */
+0x01, 0x00,/*  u16 HID_class */
+0x00,  /*  u8  country_code */
+0x01,  /*  u8  num_descriptors */
+USB_DT_REPORT, /*  u8  type: Report */
+74, 0, /*  u16 len */
+},
+},
+},
+.eps = (USBDescEndpoint[]) {
+{
+.bEndpointAddress  = USB_DIR_IN | 0x01,
+.bmAttributes  = USB_ENDPOINT_XFER_INT,
+.wMaxPacketSize= 8,
+.bInterval = 0x0a,
+},
+},
+};
+
+static const USBDescIface desc_iface_tablet_mac_compat2 = {
+.bInterfaceNumber  = 0,
+.bNumEndpoints = 1,
+.bInterfaceClass   = USB_CLASS_HID,
+.bInterfaceProtocol= 0x00, /* OSX/macOS can't handle 2 here */
+.ndesc = 1,
+.descs = (USBDescOther[]) {
+{
+/* HID descriptor */
+.data = (uint8_t[]) {
+0x09,  /*  u8  bLength */
+USB_DT_HID,/*  u8  bDescriptorType */
+0x01, 0x00,/*  u16 HID_class */
+0x00,  /*  u8  country_code */
+0x01,  /*  u8  num_descriptors */
+USB_DT_REPORT, /*  u8  type: Report */
+74, 0, /*  u16 len */
+},
+},
+},
+.eps = (USBDescEndpoint[]) {
+{
+.bEndpointAddress  = USB_DIR_IN | 0x01,
+.bmAttributes  = USB_ENDPOINT_XFER_INT,
+.wMaxPacketSize= 8,
+.bInterval = 4, /* 2 ^ (4-1) * 125 usecs = 1 ms */
+},
+},
+};
+
 static const USBDescIface desc_iface_keyboard = {
 .bInterfaceNumber  = 0,
 .bNumEndpoints = 1,
@@ -330,6 +391,40 @@ static const USBDescDevice desc_device_tablet2 = {
 },
 };
 
+static const USBDescDevice desc_device_tablet_mac_compat = {
+.bcdUSB= 0x0100,
+.bMaxPacketSize0   = 8,
+.bNumConfigurations= 1,
+.confs = (USBDescConfig[]) {
+{
+.bNumInterfaces= 1,
+.bConfigurationValue   = 1,
+.iConfiguration= STR_CONFIG_TABLET,
+.bmAttributes  = USB_CFG_ATT_ONE | USB_CFG_ATT_WAKEUP,
+.bMaxPower = 50,
+.nif = 1,
+.ifs = &desc_iface_tablet_mac_compat,
+},
+},
+};
+
+static const USBDescDevice desc_device_tablet_mac_compat2 = {
+.bcdUSB= 0x0200,
+.bMaxPacketSize0   = 64,
+.bNumConfigurations= 1,
+.confs = (USBDescConfig[]) {
+{
+.bNumInterfaces= 1,
+.bConfigurationValue   = 1,
+.iConfiguration= STR_CONFIG_TABLET,
+.bmAttributes  = USB_CFG_ATT_O

Re: [Qemu-devel] [PATCH 01/16] aio: introduce aio_co_schedule and aio_co_wake

2017-01-18 Thread Stefan Hajnoczi

On Fri, Jan 13, 2017 at 02:17:16PM +0100, Paolo Bonzini wrote:
> +static void co_schedule_bh_cb(void *opaque)
> +{
> +AioContext *ctx = opaque;
> +QSLIST_HEAD(, Coroutine) straight, reversed;
> +
> +QSLIST_MOVE_ATOMIC(&reversed, &ctx->scheduled_coroutines);
> +QSLIST_INIT(&straight);
> +
> +while (!QSLIST_EMPTY(&reversed)) {
> +Coroutine *co = QSLIST_FIRST(&reversed);
> +QSLIST_REMOVE_HEAD(&reversed, co_scheduled_next);
> +QSLIST_INSERT_HEAD(&straight, co, co_scheduled_next);
> +}
> +
> +while (!QSLIST_EMPTY(&straight)) {
> +Coroutine *co = QSLIST_FIRST(&straight);
> +QSLIST_REMOVE_HEAD(&straight, co_scheduled_next);
> +trace_aio_co_schedule_bh_cb(ctx, co);
> +qemu_coroutine_enter(co);
> +}
> +}

ctx->scheduled_coroutines is a specialized CoQueue.  Was there no way to
modify and then use CoQueue instead of open coding it?

> +void aio_co_wake(struct Coroutine *co)
> +{
> +AioContext *ctx;
> +
> +/* Read coroutine before co->ctx.  Matches smp_wmb in
> + * qemu_coroutine_enter.
> + */
> +smp_read_barrier_depends();
> +ctx = atomic_read(&co->ctx);
> +
> +if (ctx != qemu_get_current_aio_context()) {
> +aio_co_schedule(ctx, co);
> +return;
> +}
> +
> +if (qemu_in_coroutine()) {
> +Coroutine *self = qemu_coroutine_self();
> +assert(self != co);
> +QSIMPLEQ_INSERT_TAIL(&self->co_queue_wakeup, co, co_queue_next);
> +} else {
> +aio_context_acquire(ctx);
> +qemu_coroutine_enter(co);
> +aio_context_release(ctx);

Why is it necessary to acquire AioContext here?  We're already in ctx.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 02/16] block-backend: allow blk_prw from coroutine context

2017-01-18 Thread Stefan Hajnoczi

On Fri, Jan 13, 2017 at 02:17:17PM +0100, Paolo Bonzini wrote:
> qcow2_create2 calls this.  Do not run a nested event loop, as that
> breaks when aio_co_wake tries to queue the coroutine on the co_queue_wakeup
> list of the currently running one.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  block/block-backend.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 03/16] test-thread-pool: use generic AioContext infrastructure

2017-01-18 Thread Stefan Hajnoczi

On Fri, Jan 13, 2017 at 02:17:18PM +0100, Paolo Bonzini wrote:
> Once the thread pool starts using aio_co_wake, it will also need
> qemu_get_current_aio_context().  Make test-thread-pool create
> an AioContext with qemu_init_main_loop, so that stubs/iothread.c
> and tests/iothread.c can provide the rest.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  tests/test-thread-pool.c | 12 +++-
>  1 file changed, 3 insertions(+), 9 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 04/16] io: add methods to set I/O handlers on AioContext

2017-01-18 Thread Stefan Hajnoczi

On Fri, Jan 13, 2017 at 02:17:19PM +0100, Paolo Bonzini wrote:
> This is in preparation for making qio_channel_yield work on
> AioContexts other than the main one.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  include/io/channel.h | 30 ++
>  io/channel-command.c | 13 +
>  io/channel-file.c| 11 +++
>  io/channel-socket.c  | 16 +++-
>  io/channel-tls.c | 12 
>  io/channel-watch.c   |  6 ++
>  io/channel.c | 11 +++
>  7 files changed, 94 insertions(+), 5 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 05/16] io: make qio_channel_yield aware of AioContexts

2017-01-18 Thread Stefan Hajnoczi

On Mon, Jan 16, 2017 at 12:55:59PM +, Daniel P. Berrange wrote:
> On Mon, Jan 16, 2017 at 07:38:24PM +0800, Fam Zheng wrote:
> > On Fri, 01/13 14:17, Paolo Bonzini wrote:
> > > Support separate coroutines for reading and writing, and place the
> > > read/write handlers on the AioContext that the QIOChannel is registered
> > > with.
> > > 
> > > Signed-off-by: Paolo Bonzini 
> > > ---
> > >  include/io/channel.h   | 37 ++
> > >  io/channel.c   | 86 
> > > ++
> > >  tests/Makefile.include |  2 +-
> > >  3 files changed, 96 insertions(+), 29 deletions(-)
> > > 
> > > diff --git a/include/io/channel.h b/include/io/channel.h
> > > index 665edd7..d7bad94 100644
> > > --- a/include/io/channel.h
> > > +++ b/include/io/channel.h
> > > @@ -23,6 +23,7 @@
> > >  
> > >  #include "qemu-common.h"
> > >  #include "qom/object.h"
> > > +#include "qemu/coroutine.h"
> > >  #include "block/aio.h"
> > >  
> > >  #define TYPE_QIO_CHANNEL "qio-channel"
> > > @@ -59,8 +60,6 @@ typedef gboolean (*QIOChannelFunc)(QIOChannel *ioc,
> > > GIOCondition condition,
> > > gpointer data);
> > >  
> > > -typedef struct QIOChannelRestart QIOChannelRestart;
> > > -
> > >  /**
> > >   * QIOChannel:
> > >   *
> > > @@ -84,8 +83,8 @@ struct QIOChannel {
> > >  unsigned int features; /* bitmask of QIOChannelFeatures */
> > >  char *name;
> > >  AioContext *ctx;
> > > -QIOChannelRestart *read_coroutine;
> > > -QIOChannelRestart *write_coroutine;
> > > +Coroutine *read_coroutine;
> > > +Coroutine *write_coroutine;
> > >  #ifdef _WIN32
> > >  HANDLE event; /* For use with GSource on Win32 */
> > >  #endif
> > > @@ -508,13 +507,37 @@ guint qio_channel_add_watch(QIOChannel *ioc,
> > >  
> > >  
> > >  /**
> > > + * qio_channel_set_aio_context:
> > > + * @ioc: the channel object
> > > + * @ctx: the #AioContext to set the handlers on
> > > + *
> > > + * Request that qio_channel_yield() sets I/O handlers on
> > > + * the given #AioContext.  If @ctx is %NULL, qio_channel_yield()
> > > + * uses QEMU's main thread event loop.
> > > + */
> > > +void qio_channel_set_aio_context(QIOChannel *ioc,
> > > + AioContext *ctx);
> > > +
> > > +/**
> > > + * qio_channel_detach_aio_context:
> > > + * @ioc: the channel object
> > > + *
> > > + * Disable any I/O handlers set by qio_channel_yield().  With the
> > > + * help of aio_co_schedule(), this allows moving a coroutine that was
> > > + * paused by qio_channel_yield() to another context.
> > > + */
> > > +void qio_channel_detach_aio_context(QIOChannel *ioc);
> > > +
> > > +/**
> > >   * qio_channel_yield:
> > >   * @ioc: the channel object
> > >   * @condition: the I/O condition to wait for
> > >   *
> > > - * Yields execution from the current coroutine until
> > > - * the condition indicated by @condition becomes
> > > - * available.
> > > + * Yields execution from the current coroutine until the condition
> > > + * indicated by @condition becomes available.  @condition must
> > > + * be either %G_IO_IN or %G_IO_OUT; it cannot contain both.  In
> > > + * addition, no two coroutine can be waiting on the same condition
> > > + * and channel at the same time.
> > >   *
> > >   * This must only be called from coroutine context
> > >   */
> > > diff --git a/io/channel.c b/io/channel.c
> > > index ce470d7..1e043bf 100644
> > > --- a/io/channel.c
> > > +++ b/io/channel.c
> > > @@ -21,7 +21,7 @@
> > >  #include "qemu/osdep.h"
> > >  #include "io/channel.h"
> > >  #include "qapi/error.h"
> > > -#include "qemu/coroutine.h"
> > > +#include "qemu/main-loop.h"
> > >  
> > >  bool qio_channel_has_feature(QIOChannel *ioc,
> > >   QIOChannelFeature feature)
> > > @@ -238,36 +238,80 @@ off_t qio_channel_io_seek(QIOChannel *ioc,
> > >  }
> > >  
> > >  
> > > -typedef struct QIOChannelYieldData QIOChannelYieldData;
> > > -struct QIOChannelYieldData {
> > > -QIOChannel *ioc;
> > > -Coroutine *co;
> > > -};
> > > +static void qio_channel_set_aio_fd_handlers(QIOChannel *ioc);
> > > +
> > > +static void qio_channel_restart_read(void *opaque)
> > > +{
> > > +QIOChannel *ioc = opaque;
> > > +Coroutine *co = ioc->read_coroutine;
> > >  
> > > +ioc->read_coroutine = NULL;
> > > +qio_channel_set_aio_fd_handlers(ioc);
> > > +aio_co_wake(co);
> > > +}
> > >  
> > > -static gboolean qio_channel_yield_enter(QIOChannel *ioc,
> > > -GIOCondition condition,
> > > -gpointer opaque)
> > > +static void qio_channel_restart_write(void *opaque)
> > >  {
> > > -QIOChannelYieldData *data = opaque;
> > > -qemu_coroutine_enter(data->co);
> > > -return FALSE;
> > > +QIOChannel *ioc = opaque;
> > > +Coroutine *co = ioc->write_coroutine;
> > > +
> > > +ioc->write_coroutine = NULL;
> > > +q

[Qemu-devel] [PATCH] numa: access CPU's node id via property in hmp_info_numa()

2017-01-18 Thread Igor Mammedov

Move vcpu's assocciated numa_node field out of generic CPUState
into inherited classes that actually care about cpu<->numa mapping
and make monitor 'info numa' get vcpu's assocciated node id via
node-id property.
It allows to drop implicit node id intialization in
numa_post_machine_init() and would allow switching to mapping
defined by target's CpuInstanceProperties instead of global
numa_info[i].node_cpu bitmaps.

As side effect it fixes 'info numa' displaying wrong mapping
for CPUs specified with -device/device_add.
Before patch following CLI would produce:
QEMU -smp 1,sockets=3,maxcpus=3 \
   -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0 \
   -numa node,nodeid=0,cpus=0 \
   -numa node,nodeid=1,cpus=1 \
   -numa node,nodeid=2,cpus=2
(qemu) device_add qemu64-x86_64-cpu,socket-id=2,core-id=0,thread-id=0
(qemu) info numa
3 nodes
node 0 cpus: 0 1 2
node 0 size: 40 MB
node 1 cpus:
node 1 size: 40 MB
node 2 cpus:
node 2 size: 48 MB

after patch all CPUs are on nodes they are supposed to be:
(qemu) device_add qemu64-x86_64-cpu,socket-id=2,core-id=0,thread-id=0
(qemu) info numa
3 nodes
node 0 cpus: 0
node 0 size: 40 MB
node 1 cpus: 1
node 1 size: 40 MB
node 2 cpus: 2
node 2 size: 48 MB

Signed-off-by: Igor Mammedov 
---
CC: Dou Liyang 
CC: fanc.f...@cn.fujitsu.com
CC: caoj.f...@cn.fujitsu.com
CC: stefa...@redhat.com
CC: izumi.t...@jp.fujitsu.com
CC: vilan...@ac.upc.edu
CC: ehabk...@redhat.com
CC: peter.mayd...@linaro.org
CC: Andrew Jones 
CC: David Gibson 
CC: Thomas Huth 
---
 include/qom/cpu.h   |  2 --
 include/sysemu/numa.h   |  1 -
 target/arm/cpu.h|  2 ++
 target/i386/cpu.h   |  1 +
 target/ppc/cpu.h|  2 ++
 hw/arm/virt.c   | 12 
 hw/i386/pc.c|  5 +
 hw/ppc/spapr.c  |  2 +-
 hw/ppc/spapr_cpu_core.c |  2 +-
 monitor.c   |  7 +--
 numa.c  | 15 ---
 target/arm/cpu.c|  1 +
 target/i386/cpu.c   |  1 +
 target/ppc/translate_init.c |  1 +
 vl.c|  2 --
 15 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 3f79a8e..ae637a9 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -257,7 +257,6 @@ struct qemu_work_item;
  * @cpu_index: CPU index (informative).
  * @nr_cores: Number of cores within this CPU package.
  * @nr_threads: Number of threads within this CPU.
- * @numa_node: NUMA node this CPU is belonging to.
  * @host_tid: Host thread ID.
  * @running: #true if CPU is currently running (lockless).
  * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
@@ -306,7 +305,6 @@ struct CPUState {
 
 int nr_cores;
 int nr_threads;
-int numa_node;
 
 struct QemuThread *thread;
 #ifdef _WIN32
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 8f09dcf..b8015a5 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -25,7 +25,6 @@ typedef struct node_info {
 
 extern NodeInfo numa_info[MAX_NODES];
 void parse_numa_opts(MachineClass *mc);
-void numa_post_machine_init(void);
 void query_numa_node_mem(uint64_t node_mem[]);
 extern QemuOptsList qemu_numa_opts;
 void numa_set_mem_node_id(ram_addr_t addr, uint64_t size, uint32_t node);
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 7bd16ee..ef263f1 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -662,6 +662,8 @@ struct ARMCPU {
 
 ARMELChangeHook *el_change_hook;
 void *el_change_hook_opaque;
+
+int32_t numa_nid;
 };
 
 static inline ARMCPU *arm_env_get_cpu(CPUARMState *env)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 6c1902b..e43dcc2 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1264,6 +1264,7 @@ struct X86CPU {
 int32_t socket_id;
 int32_t core_id;
 int32_t thread_id;
+int32_t numa_nid;
 };
 
 static inline X86CPU *x86_env_get_cpu(CPUX86State *env)
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 2a50c43..2d12ad5 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1154,6 +1154,7 @@ do {\
  * @cpu_dt_id: CPU index used in the device tree. KVM uses this index too
  * @max_compat: Maximal supported logical PVR from the command line
  * @cpu_version: Current logical PVR, zero if in "raw" mode
+ * @numa_nid: Numa node id the CPU belongs to
  *
  * A PowerPC CPU.
  */
@@ -1166,6 +1167,7 @@ struct PowerPCCPU {
 int cpu_dt_id;
 uint32_t max_compat;
 uint32_t cpu_version;
+int32_t numa_nid;
 
 /* Fields related to migration compatibility hacks */
 bool pre_2_8_migration;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 7a03f84..b86b5fd 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -329,7 +329,6 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
 {
 int cpu;
 int addr_cells = 1;
-unsigned int i;
 
 /*
  * From Documentation/devicetree/bindings/arm/cpus.txt
@@ -37

Re: [Qemu-devel] [PATCH v2] hw/arm/virt-acpi - reserve ECAM space as PNP0C02 device

2017-01-18 Thread Ard Biesheuvel

On 17 January 2017 at 09:49, Andrew Jones  wrote:
> On Mon, Jan 16, 2017 at 07:31:33PM +, Ard Biesheuvel wrote:
>> On 16 January 2017 at 18:20, Peter Maydell  wrote:
>> > On 16 January 2017 at 17:30, Ard Biesheuvel  
>> > wrote:
>> >> On 16 January 2017 at 17:25, Peter Maydell  
>> >> wrote:
>> >>> On 13 January 2017 at 17:32, Ard Biesheuvel  
>> >>> wrote:
>>  Linux for arm64 v4.10 and later will complain if the ECAM config space 
>>  is
>>  not reserved in the ACPI namespace:
>> 
>>    acpi PNP0A08:00: [Firmware Bug]: ECAM area [mem 
>>  0x3f00-0x3fff] not reserved in ACPI namespace
>> 
>>  The rationale is that OSes that don't consume the MCFG table should 
>>  still
>>  be able to infer that the PCI config space MMIO region is occupied.
>> 
>>  So update the ACPI table generation routine to add this reservation.
>> 
>>  Signed-off-by: Ard Biesheuvel 
>>  ---
>>   hw/arm/virt-acpi-build.c | 7 +++
>>   1 file changed, 7 insertions(+)
>> 
>>  diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>>  index 085a61117378..50d52f685f68 100644
>>  --- a/hw/arm/virt-acpi-build.c
>>  +++ b/hw/arm/virt-acpi-build.c
>>  @@ -310,6 +310,13 @@ static void acpi_dsdt_add_pci(Aml *scope, const 
>>  MemMapEntry *memmap,
>>   Aml *dev_rp0 = aml_device("%s", "RP0");
>>   aml_append(dev_rp0, aml_name_decl("_ADR", aml_int(0)));
>>   aml_append(dev, dev_rp0);
>>  +
>>  +Aml *dev_res0 = aml_device("%s", "RES0");
>>  +aml_append(dev_res0, aml_name_decl("_HID", aml_string("PNP0C02")));
>>  +crs = aml_resource_template();
>>  +aml_append(crs, aml_memory32_fixed(base_ecam, size_ecam, 
>>  AML_READ_WRITE));
>>  +aml_append(dev_res0, aml_name_decl("_CRS", crs));
>>  +aml_append(dev, dev_res0);
>>   aml_append(scope, dev);
>>   }
>> >>>
>> >>> This needs to be controlled via the machine class back-compat
>> >>> machinery in hw/arm/virt.c so that it only happens for virt-2.9
>> >>> and later.
>> >>>
>> >>
>> >> Why exactly?
>> >
>> > Because the "virt-2.8" machine has to present to the guest
>> > exactly what "virt" did as of the QEMU 2.8 release, including
>> > any bugs or missing things we happened to have in our ACPI
>> > tables. This allows cross-version compatibility (including
>> > VM migration). Drew will have a more detailed explanation
>> > if you need it.
>> >
>>
>> I suspected as much.
>>
>> But in this case, I am not sure if it is worth the trouble: the
>> generated data is only consumed at boot time by the firmware, and I
>> suppose migration involves freezing a VM, including whatever resident
>> firmware image was used to boot the OS, and so this is unlikely to
>> affect migration.
>>
>> But I will let Drew explain ...
>>
>
> In some cases the problem we're solving with the compat guards is
> a bit hypothetical, but, IMHO, nonetheless a good practice. While
> we may be sure that AAVMF and Linux will be fine with this table
> changing under their feet, we can't be sure there aren't other
> mach-virt users that have more sensitive firmwares/OSes. An ACPI-
> sensitive OS may notice the change on its next reboot after a
> migration, and then simply refuse to continue.
>
> Now, that said, I just spoke with Igor in order to learn the x86
> practice. He says that the policy has been more lax than what I
> suggest above. Hypothetical, low-risk issues are left unguarded,
> and only when a bug is found during testing is it then managed.
> The idea is to try and reduce the amount of compat variables and
> conditions needed in the ACPI generation code, but, of course, at
> some level of risk to users expecting their versioned machine type
> to always appear the same.
>
> So far we've been strict with mach-virt, guarding all hypothetical
> issues. Perhaps this patch is a good example to get a discussion
> started on whether or not we should be so strict though.
>

Yes please. I don't mind respinning the patch, but I agree that it
makes sense to consider whether minimal bug fixes like this one
require this treatment in the first place

Re: [Qemu-devel] [PATCH 05/16] io: make qio_channel_yield aware of AioContexts

2017-01-18 Thread Stefan Hajnoczi

On Fri, Jan 13, 2017 at 02:17:20PM +0100, Paolo Bonzini wrote:
>  /**
> + * qio_channel_set_aio_context:
> + * @ioc: the channel object
> + * @ctx: the #AioContext to set the handlers on
> + *
> + * Request that qio_channel_yield() sets I/O handlers on
> + * the given #AioContext.  If @ctx is %NULL, qio_channel_yield()
> + * uses QEMU's main thread event loop.
> + */
> +void qio_channel_set_aio_context(QIOChannel *ioc,
> + AioContext *ctx);
> +
> +/**
> + * qio_channel_detach_aio_context:
> + * @ioc: the channel object
> + *
> + * Disable any I/O handlers set by qio_channel_yield().  With the
> + * help of aio_co_schedule(), this allows moving a coroutine that was
> + * paused by qio_channel_yield() to another context.
> + */
> +void qio_channel_detach_aio_context(QIOChannel *ioc);

The block layer's bdrv_set_aio_context() has different semantics.  It
invokes .detach()/.attach() callbacks and does AioContext locking so the
function can be called safely even while the block driver is waiting for
events.

It's unfortunate to that the block and io channel APIs act differently
despite having similar names.  Was there a reason to choose different
semantics?

> +
> +/**
>   * qio_channel_yield:
>   * @ioc: the channel object
>   * @condition: the I/O condition to wait for
>   *
> - * Yields execution from the current coroutine until
> - * the condition indicated by @condition becomes
> - * available.
> + * Yields execution from the current coroutine until the condition
> + * indicated by @condition becomes available.  @condition must
> + * be either %G_IO_IN or %G_IO_OUT; it cannot contain both.  In
> + * addition, no two coroutine can be waiting on the same condition

s/coroutine/coroutines/


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v2 09/11] aspeed/smc: extend tests for Command mode

2017-01-18 Thread Cédric Le Goater

On 01/17/2017 06:34 PM, mar.krzeminski wrote:
 +static void test_write_page_mem(void)
 +{
 +uint32_t my_page_addr = 0x15000 * PAGE_SIZE;
 +uint32_t page[PAGE_SIZE / 4];
 +int i;
 +
 +/* Enable 4BYTE mode for controller. This is should be strapped by
 + * HW for CE0 anyhow.
 + */
 +spi_ce_ctrl(1 << CRTL_EXTENDED0);
 +
 +/* Enable 4BYTE mode for flash. */
 +spi_conf(CONF_ENABLE_W0);
 +spi_ctrl_start_user();
 +writeb(ASPEED_FLASH_BASE, EN_4BYTE_ADDR);
 +writeb(ASPEED_FLASH_BASE, WREN);
>>> This is a bit tricky, in real HW you need to set WREN
>>> before issue PROGRAM command on most of the devices.
>>> If there is no cache in emulated in SMC controller
>>> (if any in HW), WREN should be issued before every write.
>> ah yes. So to be more precise, WREN should be moved in the
>> loop below.
> Yes.

So, as the test is writing one page only, I think the code is 
currently correct. 

I will come up with your suggested cleanups and more tests 
later on, may be in the 2.9 time frame. 

Thanks for the review, 

C.

[Qemu-devel] [PATCH v3] aspeed/smc: handle dummy bytes when doing fast reads in command mode

2017-01-18 Thread Cédric Le Goater

When doing fast read, a certain amount of dummy bytes should be sent
before the read. This number is configurable in the controler CE0
Control Register and needs to be modeled using fake transfers to the
flash module.

This only supports command mode. User mode requires more work and a
possible extension of the m25p80 device model.

Signed-off-by: Cédric Le Goater 
---

 Changes since v2:

 - handled dummies under read routine and removed the test on the spi
   command READ_FAST (0xb)

 hw/ssi/aspeed_smc.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
index a0a816407fc1..fc3cccec0566 100644
--- a/hw/ssi/aspeed_smc.c
+++ b/hw/ssi/aspeed_smc.c
@@ -69,7 +69,9 @@
 #define R_CTRL0   (0x10 / 4)
 #define   CTRL_CMD_SHIFT   16
 #define   CTRL_CMD_MASK0xff
+#define   CTRL_DUMMY_HIGH_SHIFT14
 #define   CTRL_AST2400_SPI_4BYTE   (1 << 13)
+#define   CTRL_DUMMY_LOW_SHIFT 6 /* 2 bits [7:6] */
 #define   CTRL_CE_STOP_ACTIVE  (1 << 2)
 #define   CTRL_CMD_MODE_MASK   0x3
 #define CTRL_READMODE  0x0
@@ -490,6 +492,16 @@ static uint32_t aspeed_smc_check_segment_addr(const 
AspeedSMCFlash *fl,
 return addr;
 }
 
+static int aspeed_smc_flash_dummies(const AspeedSMCFlash *fl)
+{
+const AspeedSMCState *s = fl->controller;
+uint32_t r_ctrl0 = s->regs[s->r_ctrl0 + fl->id];
+uint32_t dummy_high = (r_ctrl0 >> CTRL_DUMMY_HIGH_SHIFT) & 0x1;
+uint32_t dummy_low = (r_ctrl0 >> CTRL_DUMMY_LOW_SHIFT) & 0x3;
+
+return ((dummy_high << 2) | dummy_low) * 8;
+}
+
 static void aspeed_smc_flash_send_addr(AspeedSMCFlash *fl, uint32_t addr)
 {
 const AspeedSMCState *s = fl->controller;
@@ -526,6 +538,15 @@ static uint64_t aspeed_smc_flash_read(void *opaque, hwaddr 
addr, unsigned size)
 aspeed_smc_flash_select(fl);
 aspeed_smc_flash_send_addr(fl, addr);
 
+/*
+ * Use fake transfers to model dummy bytes. The value should
+ * be configured to some non-zero value in fast read mode and
+ * zero in read mode.
+ */
+for (i = 0; i < aspeed_smc_flash_dummies(fl); i++) {
+ssi_transfer(fl->controller->spi, 0xFF);
+}
+
 for (i = 0; i < size; i++) {
 ret |= ssi_transfer(s->spi, 0x0) << (8 * i);
 }
-- 
2.7.4

[Qemu-devel] [PATCH 1/2] linux-user: fix settime old value location

2017-01-18 Thread Pranith Kumar

From: Marc-André Lureau 

old_value is the 4th argument of timer_settime(), not the 2nd.

Signed-off-by: Marc-André Lureau 
Signed-off-by: Pranith Kumar 
---
 linux-user/syscall.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 7b77503f94..5bd477a71b 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -12027,7 +12027,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 target_to_host_itimerspec(&hspec_new, arg3);
 ret = get_errno(
   timer_settime(htimer, arg2, &hspec_new, &hspec_old));
-host_to_target_itimerspec(arg2, &hspec_old);
+host_to_target_itimerspec(arg4, &hspec_old);
 }
 break;
 }
-- 
2.11.0

[Qemu-devel] [PATCH 2/2] linux-user: fix tcg/mmap test

2017-01-18 Thread Pranith Kumar

From: Marc-André Lureau 

tests/tcg/mmap test fails with values other than default target page
size. When creating a map beyond EOF, extra anonymous pages are added up
to the target page boundary. Currently, this operation is performed only
when qemu_real_host_page_size < TARGET_PAGE_SIZE, but it should be
performed if the configured page size (qemu -p) is larger than
qemu_real_host_page_size too.

(also fixes some style issues to please checkpatch)

Signed-off-by: Marc-André Lureau 
Signed-off-by: Pranith Kumar 
---
 linux-user/mmap.c | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 61685bf79e..0794a4396a 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -418,31 +418,32 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
prot,
 }
 
 /* When mapping files into a memory area larger than the file, accesses
-   to pages beyond the file size will cause a SIGBUS. 
+   to pages beyond the file size will cause a SIGBUS.
 
For example, if mmaping a file of 100 bytes on a host with 4K pages
emulating a target with 8K pages, the target expects to be able to
access the first 8K. But the host will trap us on any access beyond
-   4K.  
+   4K.
 
When emulating a target with a larger page-size than the hosts, we
may need to truncate file maps at EOF and add extra anonymous pages
up to the targets page boundary.  */
 
-if ((qemu_real_host_page_size < TARGET_PAGE_SIZE)
-&& !(flags & MAP_ANONYMOUS)) {
-   struct stat sb;
+if ((qemu_real_host_page_size < qemu_host_page_size) &&
+!(flags & MAP_ANONYMOUS)) {
+struct stat sb;
 
-   if (fstat (fd, &sb) == -1)
-   goto fail;
+if (fstat(fd, &sb) == -1) {
+goto fail;
+}
 
/* Are we trying to create a map beyond EOF?.  */
-   if (offset + len > sb.st_size) {
-   /* If so, truncate the file map at eof aligned with 
-  the hosts real pagesize. Additional anonymous maps
-  will be created beyond EOF.  */
-   len = REAL_HOST_PAGE_ALIGN(sb.st_size - offset);
-   }
+if (offset + len > sb.st_size) {
+/* If so, truncate the file map at eof aligned with
+   the hosts real pagesize. Additional anonymous maps
+   will be created beyond EOF.  */
+len = REAL_HOST_PAGE_ALIGN(sb.st_size - offset);
+}
 }
 
 if (!(flags & MAP_FIXED)) {
-- 
2.11.0

[Qemu-devel] [PATCH] configure: allow enabling seccomp on s390x

2017-01-18 Thread Christian Ehrhardt

Allow enabling seccomp support on s390x if sufficient build
dependencies are provided.

Signed-off-by: Christian Ehrhardt 
---
 configure | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/configure b/configure
index 86f5214..5056ba9 100755
--- a/configure
+++ b/configure
@@ -1927,6 +1927,9 @@ if test "$seccomp" != "no" ; then
 ppc|ppc64)
 libseccomp_minver="2.3.0"
 ;;
+s390|s390x)
+libseccomp_minver="2.3.0"
+;;
 *)
 libseccomp_minver=""
 ;;
-- 
2.7.4

1 2 3 4 >

1 - 100 of 351 matches

Mail list logo