Re: [PATCH v3 5/5] vhost-vdpa: add callback function for configure interrupt

2021-01-27 Thread Jason Wang



On 2021/1/27 下午3:51, Cindy Lu wrote:

+ /*set the fd call back to vdpa driver*/

I guess checkpatch.pl might warn here. Please try to silent checkpath.pl
before submitting patches.


Actually I do have run this script, but seems not warned here. I will
pay attention next time



Oh right, I remember kernel checkpath did the check but I don't know if 
the qemu one check the comment sytle.


Thanks




Re: [PATCH v2 00/12] buildsys: Do not build various objects if not necessary

2021-01-27 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> On 1/26/21 5:09 PM, Markus Armbruster wrote:
>> Philippe Mathieu-Daudé  writes:
>> 
>>> On 1/26/21 3:57 PM, Alex Bennée wrote:

 Philippe Mathieu-Daudé  writes:

> In this series we deselect a bunch of features when they
> not required, so less objects are built.
>
> While this reduce pressure on CI and slow systems, this is
> particularly helpful for developers regularly testing multiple
> build configurations.
>
> All CI tests pass:
> https://gitlab.com/philmd/qemu/-/pipelines/245654160
>
> Supersedes: <20210120151916.1167448-1-phi...@redhat.com>

 Series looks good to me but I guess you need some sub-system feedback.
>>>
>>> Yeah, I will wait for Markus feedback on qapi/ before respining (with
>>> target/ fix), ...
>> 
>> Maybe I'm naive today, but to me this looks like a case of "if it still
>> builds, it's fine".
>> 
>> Anything in particular you want my feedback for?
>
> You are listed as qapi/ maintainer with Michael :)
>
> QAPI
> M: Markus Armbruster 
> M: Michael Roth 
> S: Supported
> F: qapi/
>
> If it is fine to you, then I'll respin addressing Paolo's comments.
>
> Thanks :)
>
> Phil.

Go right ahead!




Re: [Virtio-fs] [PATCH v2] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Miklos Szeredi
On Tue, Jan 26, 2021 at 6:18 PM Greg Kurz  wrote:
>
> On Tue, 26 Jan 2021 10:35:02 +
> Stefan Hajnoczi  wrote:

> The patch looks pretty good to me. It just seems to be missing a change in
> lo_create():
>
> fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & ~O_NOFOLLOW,
> mode);
>
> A malicious guest could have created anything called ${name} in this directory
> before calling FUSE_CREATE and we'll open it blindly, or I'm missing 
> something ?

Right, this seems like an omission.

Also the "& ~O_NOFOLLOW" looks like a copy-paste bug, since unlike
lo_open(), lo_create() is not opening a proc symlink.

So that should be replaced with "| O_NOFOLLOW"

Thanks,
Miklos




[PATCH] target/i86: implement PKS

2021-01-27 Thread Paolo Bonzini
Protection Keys for Supervisor-mode pages is a simple extension of
the PKU feature that QEMU already implements.  For supervisor-mode
pages, protection key restrictions come from a new MSR.  The MSR
has no XSAVE state associated to it.

PKS is only respected in long mode.  However, in principle it is
possible to set the MSR even outside long mode, and in fact
even the XSAVE state for PKRU could be set outside long mode
using XRSTOR.  So do not limit the migration subsections for
PKRU and PKRS to long mode.

Signed-off-by: Paolo Bonzini 
---
 target/i386/cpu.c |  4 ++--
 target/i386/cpu.h |  5 +
 target/i386/helper.c  |  3 +++
 target/i386/machine.c | 24 
 target/i386/tcg/excp_helper.c | 32 
 target/i386/tcg/misc_helper.c | 14 ++
 6 files changed, 64 insertions(+), 18 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index a8db1b415d..cc41a9101a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -667,7 +667,7 @@ static void x86_cpu_vendor_words2str(char *dst, uint32_t 
vendor1,
   CPUID_7_0_EBX_RDSEED */
 #define TCG_7_0_ECX_FEATURES (CPUID_7_0_ECX_PKU | \
   /* CPUID_7_0_ECX_OSPKE is dynamic */ \
-  CPUID_7_0_ECX_LA57)
+  CPUID_7_0_ECX_LA57 | CPUID_7_0_ECX_PKS)
 #define TCG_7_0_EDX_FEATURES 0
 #define TCG_7_1_EAX_FEATURES 0
 #define TCG_APM_FEATURES 0
@@ -964,7 +964,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
 "la57", NULL, NULL, NULL,
 NULL, NULL, "rdpid", NULL,
 NULL, "cldemote", NULL, "movdiri",
-"movdir64b", NULL, NULL, NULL,
+"movdir64b", NULL, NULL, "pks",
 },
 .cpuid = {
 .eax = 7,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index b39ec505de..cc5a26f35b 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -247,6 +247,7 @@ typedef enum X86Seg {
 #define CR4_SMEP_MASK   (1U << 20)
 #define CR4_SMAP_MASK   (1U << 21)
 #define CR4_PKE_MASK   (1U << 22)
+#define CR4_PKS_MASK   (1U << 24)
 
 #define DR6_BD  (1 << 13)
 #define DR6_BS  (1 << 14)
@@ -357,6 +358,7 @@ typedef enum X86Seg {
 
 #define MSR_IA32_TSX_CTRL  0x122
 #define MSR_IA32_TSCDEADLINE0x6e0
+#define MSR_IA32_PKRS   0x6e1
 
 #define FEATURE_CONTROL_LOCKED(1<<0)
 #define FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX (1<<2)
@@ -772,6 +774,8 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_ECX_MOVDIRI   (1U << 27)
 /* Move 64 Bytes as Direct Store Instruction */
 #define CPUID_7_0_ECX_MOVDIR64B (1U << 28)
+/* Protection Keys for Supervisor-mode Pages */
+#define CPUID_7_0_ECX_PKS   (1U << 31)
 
 /* AVX512 Neural Network Instructions */
 #define CPUID_7_0_EDX_AVX512_4VNNIW (1U << 2)
@@ -1487,6 +1491,7 @@ typedef struct CPUX86State {
 uint64_t msr_smi_count;
 
 uint32_t pkru;
+uint32_t pkrs;
 uint32_t tsx_ctrl;
 
 uint64_t spec_ctrl;
diff --git a/target/i386/helper.c b/target/i386/helper.c
index 6bb0c53182..618ad1c409 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -194,6 +194,9 @@ void cpu_x86_update_cr4(CPUX86State *env, uint32_t new_cr4)
 if (!(env->features[FEAT_7_0_ECX] & CPUID_7_0_ECX_PKU)) {
 new_cr4 &= ~CR4_PKE_MASK;
 }
+if (!(env->features[FEAT_7_0_ECX] & CPUID_7_0_ECX_PKS)) {
+new_cr4 &= ~CR4_PKS_MASK;
+}
 
 env->cr[4] = new_cr4;
 env->hflags = hflags;
diff --git a/target/i386/machine.c b/target/i386/machine.c
index 1614e8c2f8..3768a753af 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -980,7 +980,6 @@ static const VMStateDescription vmstate_umwait = {
 }
 };
 
-#ifdef TARGET_X86_64
 static bool pkru_needed(void *opaque)
 {
 X86CPU *cpu = opaque;
@@ -999,7 +998,25 @@ static const VMStateDescription vmstate_pkru = {
 VMSTATE_END_OF_LIST()
 }
 };
-#endif
+
+static bool pkrs_needed(void *opaque)
+{
+X86CPU *cpu = opaque;
+CPUX86State *env = &cpu->env;
+
+return env->pkrs != 0;
+}
+
+static const VMStateDescription vmstate_pkrs = {
+.name = "cpu/pkrs",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = pkrs_needed,
+.fields = (VMStateField[]){
+VMSTATE_UINT32(env.pkrs, X86CPU),
+VMSTATE_END_OF_LIST()
+}
+};
 
 static bool tsc_khz_needed(void *opaque)
 {
@@ -1480,9 +1497,8 @@ VMStateDescription vmstate_x86_cpu = {
 &vmstate_umwait,
 &vmstate_tsc_khz,
 &vmstate_msr_smi_count,
-#ifdef TARGET_X86_64
 &vmstate_pkru,
-#endif
+&vmstate_pkrs,
 &vmstate_spec_ctrl,
 &vmstate_mcg_ext_ctl,
 &vmstate_msr_intel_pt,
diff --git a/target/i386/tcg/excp_helper.c b/target/i386/tcg/excp_helper.c
index a0f44431fe..b7d6259e4a 100644
--- a/target/i386/tcg/excp_helper.c
+++ b/target/i386/tcg/excp_helper.c
@@ -361,6 +361,7 @@ stati

[PULL 0/9] Gitlab-CI and testing improvements

2021-01-27 Thread Thomas Huth
 Hi Peter,

the following changes since commit 9cd69f1a270235b652766f00b94114f48a2d603f:

  Merge remote-tracking branch 
'remotes/stefanberger/tags/pull-tpm-2021-01-25-1' into staging (2021-01-26 
09:51:02 +)

are available in the Git repository at:

  https://gitlab.com/huth/qemu.git tags/pull-request-2021-01-27

for you to fetch changes up to f8a9b4c66569cbc1640722369a91c635102b5264:

  libqtest: Rework qtest_rsp() (2021-01-27 07:18:13 +0100)


* Patches to speed up and improve the gitlab-CI
* Documentation for the decorators in the "acceptance" tests
* One small rework of a libqtest function


Markus Armbruster (1):
  libqtest: Rework qtest_rsp()

Philippe Mathieu-Daudé (4):
  tests/docker: Install static libc package in CentOS 7
  gitlab-ci: Test building linux-user targets on CentOS 7
  configure: Only check for audio drivers if system-mode is selected
  meson: Do not build optional libraries by default

Thomas Huth (3):
  gitlab-ci.yml: Avoid some submodules to speed up the CI a little bit
  gitlab-ci.yml: Exclude some redundant targets in 
build-without-default-features
  gitlab-ci.yml: Avoid recompiling the sources in the test jobs

Wainer dos Santos Moschetta (1):
  docs/devel: Explain how acceptance tests can be skipped

 .gitlab-ci.yml   | 19 -
 configure|  6 +++
 docs/devel/testing.rst   | 62 
 meson.build  |  3 ++
 tests/docker/dockerfiles/centos7.docker  |  1 +
 tests/docker/dockerfiles/centos8.docker  |  1 +
 tests/docker/dockerfiles/debian-amd64.docker |  1 +
 tests/docker/dockerfiles/fedora.docker   |  3 ++
 tests/docker/dockerfiles/ubuntu2004.docker   |  1 +
 tests/qtest/libqtest.c   | 50 +++---
 10 files changed, 121 insertions(+), 26 deletions(-)




[PULL 1/9] tests/docker: Install static libc package in CentOS 7

2021-01-27 Thread Thomas Huth
From: Philippe Mathieu-Daudé 

We need to install the static libc package to be able to run
the TCG tests:

  $ make check-tcg
  ...
BUILD   TCG tests for x86_64-softmmu
BUILD   x86_64-softmmu guest-tests with cc
  /usr/bin/ld: hello: warning: allocated section `.notes' not in segment
  /usr/bin/ld: memory: warning: allocated section `.notes' not in segment
BUILD   TCG tests for x86_64-linux-user
BUILD   x86_64-linux-user guest-tests with cc
  /usr/bin/ld: cannot find -lpthread
  /usr/bin/ld: cannot find -lc
  collect2: error: ld returned 1 exit status
  make[2]: *** [threadcount] Error 1
  make[1]: *** [cross-build-guest-tests] Error 2
  make: *** [build-tcg-tests-x86_64-linux-user] Error 2

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Thomas Huth 
Reviewed-by: Wainer dos Santos Moschetta 
Message-Id: <20210121172829.1643620-2-f4...@amsat.org>
Signed-off-by: Thomas Huth 
---
 tests/docker/dockerfiles/centos7.docker | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/docker/dockerfiles/centos7.docker 
b/tests/docker/dockerfiles/centos7.docker
index 6f11af1989..75fdb53c7c 100644
--- a/tests/docker/dockerfiles/centos7.docker
+++ b/tests/docker/dockerfiles/centos7.docker
@@ -15,6 +15,7 @@ ENV PACKAGES \
 gettext \
 git \
 glib2-devel \
+glibc-static \
 gnutls-devel \
 libaio-devel \
 libepoxy-devel \
-- 
2.27.0




[PULL 3/9] gitlab-ci.yml: Avoid some submodules to speed up the CI a little bit

2021-01-27 Thread Thomas Huth
Since the meson build system rework, the configure script prefers the
git submodules over the system libraries. So we are testing compilation
with capstone, fdt and libslirp as a submodule all over the place,
burning CPU cycles by recompiling these third party modules and wasting
some network bandwidth in the CI by cloning the submodules each time.
Let's stop doing that in at least a couple of jobs and use the system
libraries instead.

While we're at it, also install meson in the Fedora container, since
it is new enough already, so we do not need to check out the meson
submodule here.

Message-Id: <20210121174451.658924-1-th...@redhat.com>
Signed-off-by: Thomas Huth 
---
 .gitlab-ci.yml   | 6 +-
 tests/docker/dockerfiles/centos8.docker  | 1 +
 tests/docker/dockerfiles/debian-amd64.docker | 1 +
 tests/docker/dockerfiles/fedora.docker   | 3 +++
 tests/docker/dockerfiles/ubuntu2004.docker   | 1 +
 5 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index af4d74757d..a7bdf419d5 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -109,6 +109,7 @@ build-system-ubuntu:
   <<: *native_build_job_definition
   variables:
 IMAGE: ubuntu2004
+CONFIGURE_ARGS: --enable-fdt=system --enable-slirp=system
 TARGETS: aarch64-softmmu alpha-softmmu cris-softmmu hppa-softmmu
   moxie-softmmu microblazeel-softmmu mips64el-softmmu
 MAKE_CHECK_ARGS: check-build
@@ -140,6 +141,7 @@ build-system-debian:
   <<: *native_build_job_definition
   variables:
 IMAGE: debian-amd64
+CONFIGURE_ARGS: --enable-fdt=system
 TARGETS: arm-softmmu avr-softmmu i386-softmmu mipsel-softmmu
   riscv64-softmmu sh4eb-softmmu sparc-softmmu xtensaeb-softmmu
 MAKE_CHECK_ARGS: check-build
@@ -185,6 +187,7 @@ build-system-fedora:
   variables:
 IMAGE: fedora
 CONFIGURE_ARGS: --disable-gcrypt --enable-nettle --enable-docs
+ --enable-fdt=system --enable-slirp=system --enable-capstone=system
 TARGETS: tricore-softmmu microblaze-softmmu mips-softmmu
   xtensa-softmmu m68k-softmmu riscv32-softmmu ppc-softmmu sparc64-softmmu
 MAKE_CHECK_ARGS: check-build
@@ -216,7 +219,7 @@ build-system-centos:
   <<: *native_build_job_definition
   variables:
 IMAGE: centos8
-CONFIGURE_ARGS: --disable-nettle --enable-gcrypt
+CONFIGURE_ARGS: --disable-nettle --enable-gcrypt --enable-fdt=system
 TARGETS: ppc64-softmmu or1k-softmmu s390x-softmmu
   x86_64-softmmu rx-softmmu sh4-softmmu nios2-softmmu
 MAKE_CHECK_ARGS: check-build
@@ -248,6 +251,7 @@ build-system-opensuse:
   <<: *native_build_job_definition
   variables:
 IMAGE: opensuse-leap
+CONFIGURE_ARGS: --enable-fdt=system
 TARGETS: s390x-softmmu x86_64-softmmu aarch64-softmmu
 MAKE_CHECK_ARGS: check-build
   artifacts:
diff --git a/tests/docker/dockerfiles/centos8.docker 
b/tests/docker/dockerfiles/centos8.docker
index 64cb7a6eda..a763d55730 100644
--- a/tests/docker/dockerfiles/centos8.docker
+++ b/tests/docker/dockerfiles/centos8.docker
@@ -15,6 +15,7 @@ ENV PACKAGES \
 glib2-devel \
 libaio-devel \
 libepoxy-devel \
+libfdt-devel \
 libgcrypt-devel \
 lzo-devel \
 make \
diff --git a/tests/docker/dockerfiles/debian-amd64.docker 
b/tests/docker/dockerfiles/debian-amd64.docker
index a98314757d..ed546edcd6 100644
--- a/tests/docker/dockerfiles/debian-amd64.docker
+++ b/tests/docker/dockerfiles/debian-amd64.docker
@@ -21,6 +21,7 @@ RUN apt update && \
 libbz2-dev \
 liblzo2-dev \
 libgcrypt20-dev \
+libfdt-dev \
 librdmacm-dev \
 libsasl2-dev \
 libsnappy-dev \
diff --git a/tests/docker/dockerfiles/fedora.docker 
b/tests/docker/dockerfiles/fedora.docker
index 0b5053f2d0..0d7602abbe 100644
--- a/tests/docker/dockerfiles/fedora.docker
+++ b/tests/docker/dockerfiles/fedora.docker
@@ -6,6 +6,7 @@ ENV PACKAGES \
 brlapi-devel \
 bzip2 \
 bzip2-devel \
+capstone-devel \
 ccache \
 clang \
 cyrus-sasl-devel \
@@ -37,6 +38,7 @@ ENV PACKAGES \
 libpng-devel \
 librbd-devel \
 libseccomp-devel \
+libslirp-devel \
 libssh-devel \
 libubsan \
 libudev-devel \
@@ -46,6 +48,7 @@ ENV PACKAGES \
 llvm \
 lzo-devel \
 make \
+meson \
 mingw32-bzip2 \
 mingw32-curl \
 mingw32-glib2 \
diff --git a/tests/docker/dockerfiles/ubuntu2004.docker 
b/tests/docker/dockerfiles/ubuntu2004.docker
index ae889d8482..8519584d2b 100644
--- a/tests/docker/dockerfiles/ubuntu2004.docker
+++ b/tests/docker/dockerfiles/ubuntu2004.docker
@@ -37,6 +37,7 @@ ENV PACKAGES flex bison \
 libsasl2-dev \
 libsdl2-dev \
 libseccomp-dev \
+libslirp-dev \
 libsnappy-dev \
 libspice-protocol-dev \
 libspice-server-dev \
-- 
2.27.0




[PULL 8/9] docs/devel: Explain how acceptance tests can be skipped

2021-01-27 Thread Thomas Huth
From: Wainer dos Santos Moschetta 

Documented under the "Acceptance tests using the Avocado Framework"
section in testing.rst how environment variables are used to skip tests.

Signed-off-by: Wainer dos Santos Moschetta 
Message-Id: <20210115210022.417996-1-waine...@redhat.com>
Reviewed-by: Thomas Huth 
Signed-off-by: Thomas Huth 
---
 docs/devel/testing.rst | 62 ++
 1 file changed, 62 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index 0aa7a13bba..9f8b77c8ec 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -871,6 +871,68 @@ qemu_bin
 
 The exact QEMU binary to be used on QEMUMachine.
 
+Skipping tests
+--
+The Avocado framework provides Python decorators which allow for easily skip
+tests running under certain conditions. For example, on the lack of a binary
+on the test system or when the running environment is a CI system. For further
+information about those decorators, please refer to::
+
+  
https://avocado-framework.readthedocs.io/en/latest/guides/writer/chapters/writing.html#skipping-tests
+
+While the conditions for skipping tests are often specifics of each one, there
+are recurring scenarios identified by the QEMU developers and the use of
+environment variables became a kind of standard way to enable/disable tests.
+
+Here is a list of the most used variables:
+
+AVOCADO_ALLOW_LARGE_STORAGE
+~~~
+Tests which are going to fetch or produce assets considered *large* are not
+going to run unless that `AVOCADO_ALLOW_LARGE_STORAGE=1` is exported on
+the environment.
+
+The definition of *large* is a bit arbitrary here, but it usually means an
+asset which occupies at least 1GB of size on disk when uncompressed.
+
+AVOCADO_ALLOW_UNTRUSTED_CODE
+
+There are tests which will boot a kernel image or firmware that can be
+considered not safe to run on the developer's workstation, thus they are
+skipped by default. The definition of *not safe* is also arbitrary but
+usually it means a blob which either its source or build process aren't
+public available.
+
+You should export `AVOCADO_ALLOW_UNTRUSTED_CODE=1` on the environment in
+order to allow tests which make use of those kind of assets.
+
+AVOCADO_TIMEOUT_EXPECTED
+
+The Avocado framework has a timeout mechanism which interrupts tests to avoid 
the
+test suite of getting stuck. The timeout value can be set via test parameter or
+property defined in the test class, for further details::
+
+  
https://avocado-framework.readthedocs.io/en/latest/guides/writer/chapters/writing.html#setting-a-test-timeout
+
+Even though the timeout can be set by the test developer, there are some tests
+that may not have a well-defined limit of time to finish under certain
+conditions. For example, tests that take longer to execute when QEMU is
+compiled with debug flags. Therefore, the `AVOCADO_TIMEOUT_EXPECTED` variable
+has been used to determine whether those tests should run or not.
+
+GITLAB_CI
+~
+A number of tests are flagged to not run on the GitLab CI. Usually because
+they proved to the flaky or there are constraints on the CI environment which
+would make them fail. If you encounter a similar situation then use that
+variable as shown on the code snippet below to skip the test:
+
+.. code::
+
+  @skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
+  def test(self):
+  do_something()
+
 Uninstalling Avocado
 
 
-- 
2.27.0




[PULL 4/9] configure: Only check for audio drivers if system-mode is selected

2021-01-27 Thread Thomas Huth
From: Philippe Mathieu-Daudé 

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Alex Bennée 
Acked-by: Gerd Hoffmann 
Message-Id: <20210122204441.2145197-2-phi...@redhat.com>
Acked-by: Paolo Bonzini 
Signed-off-by: Thomas Huth 
---
 configure | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/configure b/configure
index dcc5ea7d63..87de49e2c2 100755
--- a/configure
+++ b/configure
@@ -2319,6 +2319,12 @@ if test -z "$want_tools"; then
 fi
 fi
 
+##
+# Disable features only meaningful for system-mode emulation
+if test "$softmmu" = "no"; then
+audio_drv_list=""
+fi
+
 ##
 # Some versions of Mac OS X incorrectly define SIZE_MAX
 cat > $TMPC << EOF
-- 
2.27.0




[PULL 5/9] meson: Do not build optional libraries by default

2021-01-27 Thread Thomas Huth
From: Philippe Mathieu-Daudé 

The following libraries will be selected if a feature requires it:

- capstone
- fdt
- SLiRP

Suggested-by: Paolo Bonzini 
Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20210122204441.2145197-5-phi...@redhat.com>
Acked-by: Paolo Bonzini 
Signed-off-by: Thomas Huth 
---
 meson.build | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/meson.build b/meson.build
index 35a9eddf5c..a58c6f6785 100644
--- a/meson.build
+++ b/meson.build
@@ -1460,6 +1460,7 @@ if capstone_opt == 'internal'
   ]
 
   libcapstone = static_library('capstone',
+   build_by_default: false,
sources: capstone_files,
c_args: capstone_cargs,
include_directories: 'capstone/include')
@@ -1537,6 +1538,7 @@ if have_system
 
 slirp_inc = include_directories('slirp', 'slirp/src')
 libslirp = static_library('slirp',
+  build_by_default: false,
   sources: slirp_files,
   c_args: slirp_cargs,
   include_directories: slirp_inc)
@@ -1582,6 +1584,7 @@ if have_system
 
 fdt_inc = include_directories('dtc/libfdt')
 libfdt = static_library('fdt',
+build_by_default: false,
 sources: fdt_files,
 include_directories: fdt_inc)
 fdt = declare_dependency(link_with: libfdt,
-- 
2.27.0




[PULL 2/9] gitlab-ci: Test building linux-user targets on CentOS 7

2021-01-27 Thread Thomas Huth
From: Philippe Mathieu-Daudé 

Add a configuration tested by Peter Maydell (see [1] and [2])
but not covered in our CI [3]:

  [705/2910] Compiling C object 
libqemu-arm-linux-user.fa.p/linux-user_strace.c.o
  FAILED: libqemu-arm-linux-user.fa.p/linux-user_strace.c.o
  ../linux-user/strace.c: In function 'do_print_sockopt':
  ../linux-user/strace.c:2831:14: error: 'IPV6_ADDR_PREFERENCES' undeclared 
(first use in this function)
   case IPV6_ADDR_PREFERENCES:
^

This job currently takes 31 minutes 32 seconds ([4]).

[1] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg05086.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg05379.html
[3] https://gitlab.com/philmd/qemu/-/jobs/977408284
[4] https://gitlab.com/philmd/qemu/-/jobs/978223286

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Thomas Huth 
Reviewed-by: Wainer dos Santos Moschetta 
Message-Id: <20210121172829.1643620-3-f4...@amsat.org>
Signed-off-by: Thomas Huth 
---
 .gitlab-ci.yml | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index de3a3d25b5..af4d74757d 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -411,6 +411,13 @@ build-user-plugins:
 MAKE_CHECK_ARGS: check-tcg
   timeout: 1h 30m
 
+build-user-centos7:
+  <<: *native_build_job_definition
+  variables:
+IMAGE: centos7
+CONFIGURE_ARGS: --disable-system --disable-tools --disable-docs
+MAKE_CHECK_ARGS: check-tcg
+
 build-some-softmmu-plugins:
   <<: *native_build_job_definition
   variables:
-- 
2.27.0




[PULL 9/9] libqtest: Rework qtest_rsp()

2021-01-27 Thread Thomas Huth
From: Markus Armbruster 

qtest_rsp() is used in two different ways: (1) return some arguments
to caller, which the caller must free, and (2) return no arguments to
caller.  Passing non-zero @expected_args gets you (1), and passing
zero gets you (2).

Having "the return value must be freed" depend on an argument this way
is less than ideal.

Provide separate functions for the two ways: (1) qtest_rsp_args()
takes @expected_args (possibly zero), and returns that number of
arguments.  Caller must free the return value always.  (2) qtest_rsp()
assumes zero, and returns nothing.

Signed-off-by: Markus Armbruster 
Reviewed-by: Thomas Huth 
Reviewed-by: Marc-André Lureau 
Message-Id: <20210126151649.2220902-1-arm...@redhat.com>
Signed-off-by: Thomas Huth 
---
 tests/qtest/libqtest.c | 50 ++
 1 file changed, 26 insertions(+), 24 deletions(-)

diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
index 5249a628cc..fd043b0570 100644
--- a/tests/qtest/libqtest.c
+++ b/tests/qtest/libqtest.c
@@ -503,7 +503,7 @@ static GString *qtest_client_socket_recv_line(QTestState *s)
 return line;
 }
 
-static gchar **qtest_rsp(QTestState *s, int expected_args)
+static gchar **qtest_rsp_args(QTestState *s, int expected_args)
 {
 GString *line;
 gchar **words;
@@ -539,25 +539,27 @@ redo:
 g_assert(words[0] != NULL);
 g_assert_cmpstr(words[0], ==, "OK");
 
-if (expected_args) {
-for (i = 0; i < expected_args; i++) {
-g_assert(words[i] != NULL);
-}
-} else {
-g_strfreev(words);
-words = NULL;
+for (i = 0; i < expected_args; i++) {
+g_assert(words[i] != NULL);
 }
 
 return words;
 }
 
+static void qtest_rsp(QTestState *s)
+{
+gchar **words = qtest_rsp_args(s, 0);
+
+g_strfreev(words);
+}
+
 static int qtest_query_target_endianness(QTestState *s)
 {
 gchar **args;
 int big_endian;
 
 qtest_sendf(s, "endianness\n");
-args = qtest_rsp(s, 1);
+args = qtest_rsp_args(s, 1);
 g_assert(strcmp(args[1], "big") == 0 || strcmp(args[1], "little") == 0);
 big_endian = strcmp(args[1], "big") == 0;
 g_strfreev(args);
@@ -892,14 +894,14 @@ bool qtest_get_irq(QTestState *s, int num)
 void qtest_module_load(QTestState *s, const char *prefix, const char *libname)
 {
 qtest_sendf(s, "module_load %s %s\n", prefix, libname);
-qtest_rsp(s, 0);
+qtest_rsp(s);
 }
 
 static int64_t qtest_clock_rsp(QTestState *s)
 {
 gchar **words;
 int64_t clock;
-words = qtest_rsp(s, 2);
+words = qtest_rsp_args(s, 2);
 clock = g_ascii_strtoll(words[1], NULL, 0);
 g_strfreev(words);
 return clock;
@@ -926,13 +928,13 @@ int64_t qtest_clock_set(QTestState *s, int64_t val)
 void qtest_irq_intercept_out(QTestState *s, const char *qom_path)
 {
 qtest_sendf(s, "irq_intercept_out %s\n", qom_path);
-qtest_rsp(s, 0);
+qtest_rsp(s);
 }
 
 void qtest_irq_intercept_in(QTestState *s, const char *qom_path)
 {
 qtest_sendf(s, "irq_intercept_in %s\n", qom_path);
-qtest_rsp(s, 0);
+qtest_rsp(s);
 }
 
 void qtest_set_irq_in(QTestState *s, const char *qom_path, const char *name,
@@ -942,13 +944,13 @@ void qtest_set_irq_in(QTestState *s, const char 
*qom_path, const char *name,
 name = "unnamed-gpio-in";
 }
 qtest_sendf(s, "set_irq_in %s %s %d %d\n", qom_path, name, num, level);
-qtest_rsp(s, 0);
+qtest_rsp(s);
 }
 
 static void qtest_out(QTestState *s, const char *cmd, uint16_t addr, uint32_t 
value)
 {
 qtest_sendf(s, "%s 0x%x 0x%x\n", cmd, addr, value);
-qtest_rsp(s, 0);
+qtest_rsp(s);
 }
 
 void qtest_outb(QTestState *s, uint16_t addr, uint8_t value)
@@ -973,7 +975,7 @@ static uint32_t qtest_in(QTestState *s, const char *cmd, 
uint16_t addr)
 unsigned long value;
 
 qtest_sendf(s, "%s 0x%x\n", cmd, addr);
-args = qtest_rsp(s, 2);
+args = qtest_rsp_args(s, 2);
 ret = qemu_strtoul(args[1], NULL, 0, &value);
 g_assert(!ret && value <= UINT32_MAX);
 g_strfreev(args);
@@ -1000,7 +1002,7 @@ static void qtest_write(QTestState *s, const char *cmd, 
uint64_t addr,
 uint64_t value)
 {
 qtest_sendf(s, "%s 0x%" PRIx64 " 0x%" PRIx64 "\n", cmd, addr, value);
-qtest_rsp(s, 0);
+qtest_rsp(s);
 }
 
 void qtest_writeb(QTestState *s, uint64_t addr, uint8_t value)
@@ -1030,7 +1032,7 @@ static uint64_t qtest_read(QTestState *s, const char 
*cmd, uint64_t addr)
 uint64_t value;
 
 qtest_sendf(s, "%s 0x%" PRIx64 "\n", cmd, addr);
-args = qtest_rsp(s, 2);
+args = qtest_rsp_args(s, 2);
 ret = qemu_strtou64(args[1], NULL, 0, &value);
 g_assert(!ret);
 g_strfreev(args);
@@ -1082,7 +1084,7 @@ void qtest_memread(QTestState *s, uint64_t addr, void 
*data, size_t size)
 }
 
 qtest_sendf(s, "read 0x%" PRIx64 " 0x%zx\n", addr, size);
-args = qtest_rsp(s, 2);
+args = qtest_rsp_args(s, 2);
 
 for (i = 0; i < size; i++) {
 ptr[i] = hex2n

[PULL 7/9] gitlab-ci.yml: Avoid recompiling the sources in the test jobs

2021-01-27 Thread Thomas Huth
Currently, our check-system-* jobs are recompiling the whole sources
again. This happens due to the fact that the jobs are checking out
the whole source tree and required submodules again, and only try
to use the "build" directory with the binaries and object files as an
artifact from the previous stage - which simply does not work right
anymore (with the current version of meson). Due to some changed
time stamps, meson/ninja are always trying to rebuild the whole tree.

In the long run, we could likely use "meson test --no-rebuild", but
there is still some work going on in that area to improve the user
experience. So until this has been done, simply avoid recompiling the
sources with a trick: pass NINJA=":" to the make process in the test
jobs. Also check out the submodules manually before updating the
timestamps in the build folder, so that the binaries are definitely
newer that all the source files.
This saves ca. 10 - 15 minutes of precious CI cycles in each run.

Suggested-by: Paolo Bonzini 
Message-Id: <20210126065757.403853-1-th...@redhat.com>
Tested-by: Philippe Mathieu-Daudé 
Reviewed-by: Wainer dos Santos Moschetta 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Thomas Huth 
---
 .gitlab-ci.yml | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index d0d8914bff..7c0db64710 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -38,9 +38,12 @@ include:
   stage: test
   image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest
   script:
+- scripts/git-submodule.sh update
+$(grep GIT_SUBMODULES build/config-host.mak | sed 
's/GIT_SUBMODULES=//')
 - cd build
 - find . -type f -exec touch {} +
-- make $MAKE_CHECK_ARGS
+# Avoid recompiling by hiding ninja with NINJA=":"
+- make NINJA=":" $MAKE_CHECK_ARGS
 
 .acceptance_template: &acceptance_definition
   cache:
-- 
2.27.0




[PULL 6/9] gitlab-ci.yml: Exclude some redundant targets in build-without-default-features

2021-01-27 Thread Thomas Huth
The build-without-default-features job is running quite long and sometimes
already hits the 1h time limit. Exclude some targets which do not provide
additional test coverage here (since we e.g. also already test other targets
of the same type, just with different endianess, or a 64-bit superset) to
avoid that we hit the timeout here so easily.

Message-Id: <20210126172345.15947-1-th...@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Wainer dos Santos Moschetta 
Signed-off-by: Thomas Huth 
---
 .gitlab-ci.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index a7bdf419d5..d0d8914bff 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -617,6 +617,7 @@ build-without-default-features:
   variables:
 IMAGE: debian-amd64
 CONFIGURE_ARGS: --without-default-features --disable-user
+
--target-list-exclude=arm-softmmu,i386-softmmu,mipsel-softmmu,mips64-softmmu,ppc-softmmu
 MAKE_CHECK_ARGS: check-unit
 
 check-patch:
-- 
2.27.0




Re: [PATCH v2 8/9] tests/docker: Add dockerfile for Alpine Linux

2021-01-27 Thread Daniel P . Berrangé
On Tue, Jan 26, 2021 at 04:38:57PM -0500, John Snow wrote:
> On 1/19/21 8:41 AM, Thomas Huth wrote:
> > On 18/01/2021 11.33, Daniel P. Berrangé wrote:
> > > On Mon, Jan 18, 2021 at 02:38:07PM +0800, Jiaxun Yang wrote:
> > > > Alpine Linux[1] is a security-oriented, lightweight Linux distribution
> > > > based on musl libc and busybox.
> > > > 
> > > > It it popular among Docker guests and embedded applications.
> > > > 
> > > > Adding it to test against different libc.
> > > > 
> > > > [1]: https://alpinelinux.org/
> > > > 
> > > > Signed-off-by: Jiaxun Yang 
> > > > ---
> > > >   tests/docker/dockerfiles/alpine.docker | 57 ++
> > > >   1 file changed, 57 insertions(+)
> > > >   create mode 100644 tests/docker/dockerfiles/alpine.docker
> > > > 
> > > > diff --git a/tests/docker/dockerfiles/alpine.docker
> > > > b/tests/docker/dockerfiles/alpine.docker
> > > > new file mode 100644
> > > > index 00..5be5198d00
> > > > --- /dev/null
> > > > +++ b/tests/docker/dockerfiles/alpine.docker
> > > > @@ -0,0 +1,57 @@
> > > > +
> > > > +FROM alpine:edge
> > > > +
> > > > +RUN apk update
> > > > +RUN apk upgrade
> > > > +
> > > > +# Please keep this list sorted alphabetically
> > > > +ENV PACKAGES \
> > > > +    alsa-lib-dev \
> > > > +    bash \
> > > > +    bison \
> > > 
> > > This shouldn't be required.
> > 
> > bison and flex were required to avoid some warnings in the past while
> > compiling the dtc submodule ... but I thought we got rid of the problem
> > at one point in time, so this can be removed now, indeed.
> > 
> > > > +    build-base \
> > > 
> > > This seems to be a meta packae that pulls in other
> > > misc toolchain packages. Please list the pieces we
> > > need explicitly instead.
> > 
> > Looking at the "Depends" list on
> > https://pkgs.alpinelinux.org/package/v3.3/main/x86/build-base there are
> > only 6 dependencies and we need most of those for QEMU anyway, so I
> > think it is ok to keep build-base here.
> > 
> > > > +    coreutils \
> > > > +    curl-dev \
> > > > +    flex \
> > > 
> > > This shouldn't be needed.
> > > 
> > > > +    git \
> > > > +    glib-dev \
> > > > +    glib-static \
> > > > +    gnutls-dev \
> > > > +    gtk+3.0-dev \
> > > > +    libaio-dev \
> > > > +    libcap-dev \
> > > 
> > > Should not be required, as we use cap-ng.
> > 
> > Right.
> > 
> > > > +    libcap-ng-dev \
> > > > +    libjpeg-turbo-dev \
> > > > +    libnfs-dev \
> > > > +    libpng-dev \
> > > > +    libseccomp-dev \
> > > > +    libssh-dev \
> > > > +    libusb-dev \
> > > > +    libxml2-dev \
> > > > +    linux-headers \
> > > 
> > > Is this really needed ? We don't install kernel-headers on other
> > > distros AFAICT.
> > 
> > I tried a build without this package, and it works fine indeed.
> > 
> > > > +    lzo-dev \
> > > > +    mesa-dev \
> > > > +    mesa-egl \
> > > > +    mesa-gbm \
> > > > +    meson \
> > > > +    ncurses-dev \
> > > > +    ninja \
> > > > +    paxmark \
> > > 
> > > What is this needed for ?
> > 
> > Seems like it also can be dropped.
> > 
> > > > +    perl \
> > > > +    pulseaudio-dev \
> > > > +    python3 \
> > > > +    py3-sphinx \
> > > > +    shadow \
> > > 
> > > Is this really needed ?
> > 
> > See:
> > https://www.spinics.net/lists/kvm/msg231556.html
> > 
> > I can remove the superfluous packages when picking up the patch, no need
> > to respin just because of this.
> > 
> >   Thomas
> > 
> > 
> 
> You can refer to my post earlier this January for a "minimal" Alpine Linux
> build, if you wish.
> 
> My goal was to find the smallest set of packages possible without passing
> any explicit configure flags.
> 
> I wonder if it's worth having layered "core build" and "test build" images
> so that we can smoke test the minimalistic build from time to time -- I seem
> to recall Dan posting information about a dependency management tool for
> Dockerfiles, but I admit I didn't look too closely at what problem that
> solves, exactly.

I'd rather we avoid layered images entirely as it creates extra stages
in the gitlab pipeline, and also makes it more complex to auto-generate.

Once this initial alpine image is merged, then I intend to add it to the
set which are auto-generated from my other patch series.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH] ui: fix spice display regression

2021-01-27 Thread marcandre . lureau
From: Marc-André Lureau 

Since commit b4e1a342112e50e05b609e857f38c1f2b7aafdc4 ("vl: remove
separate preconfig main_loop"), spice initialization is a bit dodgy, and
the client get stuck waiting for server events.

The initialization order changed, so that qemu_spice_display_start() is
called before the display interfaces are added. The new interfaces
aren't started by spice-server automatically (yet?), so we have to tell
the server explicitely when we are already running.

Signed-off-by: Marc-André Lureau 
---
 ui/spice-core.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/ui/spice-core.c b/ui/spice-core.c
index 5746d0aae7..6eebf12e3c 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -830,6 +830,8 @@ static void qemu_spice_init(void)
 
 static int qemu_spice_add_interface(SpiceBaseInstance *sin)
 {
+int ret;
+
 if (!spice_server) {
 if (QTAILQ_FIRST(&qemu_spice_opts.head) != NULL) {
 error_report("Oops: spice configured but not active");
@@ -848,7 +850,13 @@ static int qemu_spice_add_interface(SpiceBaseInstance *sin)
 qemu_add_vm_change_state_handler(vm_change_state_handler, NULL);
 }
 
-return spice_server_add_interface(spice_server, sin);
+ret = spice_server_add_interface(spice_server, sin);
+/* make sure the newly added interface is started */
+if (ret == 0 && spice_display_is_running) {
+spice_server_vm_start(spice_server);
+}
+
+return ret;
 }
 
 static GSList *spice_consoles;
-- 
2.29.0




Re: [PATCH v4 4/4] meson: Warn when TCI is selected but TCG backend is available

2021-01-27 Thread Daniel P . Berrangé
On Tue, Jan 26, 2021 at 05:24:10PM +, Alex Bennée wrote:
> 
> Stefan Weil  writes:
> 
> > Am 25.01.21 um 23:35 schrieb Richard Henderson:
> >> On 1/25/21 11:02 AM, Stefan Weil wrote:
> >>> Am 25.01.21 um 20:02 schrieb Richard Henderson:
>  On 1/25/21 8:58 AM, Stefan Weil wrote:
> > I have no evidence that TCI is less reliable than TCG, so I would not 
> > write
> > that.
>  It can't pass make check-tcg.
> >>> Where does it fail? Maybe an expected timeout problem which can be solved 
> >>> by
> >>> increasing the timeouts for TCI?
> >>>
> >>> I have just run a local test of `make check-tcg` with native TCG and with 
> >>> TCI
> >>> and did not see a difference. But I noticed that in both cases many tests 
> >>> show
> >>> "skipped".
> >> You need to enable docker or podman for your development, so that you get 
> >> all
> >> of the cross-compilers.
> >>
> >> Then:
> >>
> >>TESTfcvt on arm
> >> TODO ../qemu/tcg/tci.c:614: tcg_qemu_tb_exec()
> >> ../qemu/tcg/tci.c:614: tcg fatal error
> >> qemu: uncaught target signal 11 (Segmentation fault) - core dumped
> >>
> >>TESTfloat_convs on m68k
> >> TODO ../qemu/tcg/tci.c:614: tcg_qemu_tb_exec()
> >> ../qemu/tcg/tci.c:614: tcg fatal error
> >> qemu: uncaught target signal 11 (Segmentation fault) - core dumped
> >>
> >> which is of course one of the TODO assertions.
> >> It's positively criminal those still exist in the code.
> >
> >
> > I installed podman and repeated `make check-tcg`. The log file still 
> > shows 87 lines with "SKIPPED". There is also a gdb core dump, several 
> > warnings, but nothing related to TCI. Both tests cited above seem to 
> > work without a problem.
> 
> I'm going to go out on a limb and guess you have commit:
> 
>   23a77b2d18 (build-system: clean up TCG/TCI configury)
> 
> which temporarily has the effect of disabling TCI. See
> 
>   Subject: Re: [PATCH v4 1/4] configure: Fix --enable-tcg-interpreter
>   From: Paolo Bonzini 
>   Message-ID: <2b8b6291-b54c-b285-ae38-21f067a84...@redhat.com>
>   Date: Mon, 25 Jan 2021 17:36:42 +0100
> 
> with that fix fixed I see the same failures as Richard:
> 
>   ./qemu-arm ./tests/tcg/arm-linux-user/fcvt > /dev/null
>   TODO ../../tcg/tci.c:614: tcg_qemu_tb_exec()
>   ../../tcg/tci.c:614: tcg fatal error
>   qemu: uncaught target signal 11 (Segmentation fault) - core dumped
>   fish: “./qemu-arm ./tests/tcg/arm-linu…” terminated by signal SIGSEGV 
> (Address boundary error)
> 
> which does raise the question before today when was the last time anyone
> attempted to run check-tcg on this?
> 
> > The complete log file is available from 
> > https://qemu.weilnetz.de/test/check-tcg.txt.
> >
> > Daniel, regarding your comment: TCI has 100 % test coverage for the 
> > productive code lines.
> 
> By what tests? The fact you don't hit asserts in your day to day testing
> doesn't mean there are features missing that are easily tripped up or
> that TCI got it right.
> 
> > All code lines which were never tested raise an 
> > assertion, so can easily be identified (and fixed as soon as there is a 
> > test case which triggers such an assertion). The known deficits are 
> > speed, missing TCG opcodes, unimplemented TCG opcodes because of missing 
> > test cases and missing support for some host architectures.
> 
> Passing check-tcg would be a minimum for me.

Passing check-tcg *in gitlab CI* would be the minimum to consider
it on a par with TCG.

The lack of automated GitLab CI for TCI is a reason my proposed wording
described TCI as less reliable than native TCG. We can't claim it has
equivalent reliability unless we have equiv automated testing of TCI.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: Handling multiple inheritance [for CXL]

2021-01-27 Thread Daniel P . Berrangé
On Tue, Jan 26, 2021 at 01:33:52PM -0800, Ben Widawsky wrote:
> I'm working on CXL 2.0 type 3 memory devices [1]. In short, these are PCIe 
> devices
> that have persistent memory on them. As such, it would be nice to inherit from
> both a PCI_DEVICE class as well as an NVDIMM device class.
> 
> Truth be told, using TYPE_MEMORY_DEVICE as the interface does provide most of
> what I need. I'm wondering what the best way to handle this is. Currently, the
> only thing NVDIMM class provides is write/read_label_data, this is driven by
> _DSM. For CXL, the mechanism to read/write the equivalent area is not done via
> _DSM, but done directly via a mailbox interface. However, the intent is the
> same, and so utilizing similar code seems ideal.
> 
> If there's a desire to unify these code paths, I'd need something like 
> multiple
> inheritance. I'm looking for some feedback here on how to do it.

We don't have a direct concept of multiple inheritance in QOM.

The closest you can get is to turn the NVDIMM class into an
interface. You can inherit from PCI_DEVICE and then implement
the NVDIMM interface.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [Virtio-fs] [PATCH v2] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Stefan Hajnoczi
On Tue, Jan 26, 2021 at 06:16:04PM +0100, Greg Kurz wrote:
> On Tue, 26 Jan 2021 10:35:02 +
> Stefan Hajnoczi  wrote:
> 
> > A well-behaved FUSE client does not attempt to open special files with
> > FUSE_OPEN because they are handled on the client side (e.g. device nodes
> > are handled by client-side device drivers).
> > 
> > The check to prevent virtiofsd from opening special files is missing in
> > a few cases, most notably FUSE_OPEN. A malicious client can cause
> > virtiofsd to open a device node, potentially allowing the guest to
> > escape. 
> 
> or pretty much anything nasty you can think of, e.g. DoS if the malicious
> client repeatedly asks virtiofsd to open FIFOs the other side of which is
> never opened.
> 
> > This can be exploited by a modified guest device driver. It is
> > not exploitable from guest userspace since the guest kernel will handle
> > special files inside the guest instead of sending FUSE requests.
> > 
> > This patch adds the missing checks to virtiofsd. This is a short-term
> > solution because it does not prevent a compromised virtiofsd process
> > from opening device nodes on the host.
> > 
> > Reported-by: Alex Xu 
> > Fixes: CVE-2020-35517
> > Reviewed-by: Dr. David Alan Gilbert 
> > Reviewed-by: Vivek Goyal 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> 
> The patch looks pretty good to me. It just seems to be missing a change in
> lo_create():
> 
> fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & ~O_NOFOLLOW,
> mode);
> 
> A malicious guest could have created anything called ${name} in this directory
> before calling FUSE_CREATE and we'll open it blindly, or I'm missing 
> something ?

Good point! I will send another revision.

Stefan


signature.asc
Description: PGP signature


Re: [PATCH] ui: fix spice display regression

2021-01-27 Thread Marc-André Lureau
Hi

On Wed, Jan 27, 2021 at 2:03 PM  wrote:
>
> From: Marc-André Lureau 
>
> Since commit b4e1a342112e50e05b609e857f38c1f2b7aafdc4 ("vl: remove
> separate preconfig main_loop"), spice initialization is a bit dodgy, and
> the client get stuck waiting for server events.
>
> The initialization order changed, so that qemu_spice_display_start() is
> called before the display interfaces are added. The new interfaces
> aren't started by spice-server automatically (yet?), so we have to tell
> the server explicitely when we are already running.
>
> Signed-off-by: Marc-André Lureau 
> ---
>  ui/spice-core.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/ui/spice-core.c b/ui/spice-core.c
> index 5746d0aae7..6eebf12e3c 100644
> --- a/ui/spice-core.c
> +++ b/ui/spice-core.c
> @@ -830,6 +830,8 @@ static void qemu_spice_init(void)
>
>  static int qemu_spice_add_interface(SpiceBaseInstance *sin)
>  {
> +int ret;
> +
>  if (!spice_server) {
>  if (QTAILQ_FIRST(&qemu_spice_opts.head) != NULL) {
>  error_report("Oops: spice configured but not active");
> @@ -848,7 +850,13 @@ static int qemu_spice_add_interface(SpiceBaseInstance 
> *sin)
>  qemu_add_vm_change_state_handler(vm_change_state_handler, NULL);
>  }
>
> -return spice_server_add_interface(spice_server, sin);
> +ret = spice_server_add_interface(spice_server, sin);
> +/* make sure the newly added interface is started */
> +if (ret == 0 && spice_display_is_running) {
> +spice_server_vm_start(spice_server);
> +}
> +
> +return ret;
>  }
>
>  static GSList *spice_consoles;
> --
> 2.29.0
>
>

Oops, it doesn't work reliably. There is some race in spice server now.

spice_server_vm_start() sends RED_WORKER_MESSAGE_START to the QXL
worker thread. But if two of those come, it will assert... It should
probably not, I will send a patch to spice.

I am looking for other options for QEMU though.

-- 
Marc-André Lureau



Re: [PATCH v2] virtio: Add corresponding memory_listener_unregister to unrealize

2021-01-27 Thread Stefano Garzarella

On Mon, Jan 25, 2021 at 08:25:05PM +0100, Eugenio Pérez wrote:

Address space is destroyed without proper removal of its listeners with
current code. They are expected to be removed in
virtio_device_instance_finalize [1], but qemu calls it through
object_deinit, after address_space_destroy call through
device_set_realized [2].

Move it to virtio_device_unrealize, called before device_set_realized
[3] and making it symmetric with memory_listener_register in
virtio_device_realize.

v2: Delete no-op call of virtio_device_instance_finalize.
   Add backtraces.

[1]

#0  virtio_device_instance_finalize (obj=0x57de5120)
at /home/qemu/include/hw/virtio/virtio.h:71
#1  0x55b703c9 in object_deinit (type=0x56639860,
 obj=) at ../qom/object.c:671
#2  object_finalize (data=0x57de5120) at ../qom/object.c:685
#3  object_unref (objptr=0x57de5120) at ../qom/object.c:1184
#4  0x55b4de9d in bus_free_bus_child (kid=0x57df0660)
at ../hw/core/qdev.c:55
#5  0x55c65003 in call_rcu_thread (opaque=opaque@entry=0x0)
at ../util/rcu.c:281

Queued by:

#0  bus_remove_child (bus=0x57de5098,
child=child@entry=0x57de5120) at ../hw/core/qdev.c:60
#1  0x55b4ee31 in device_unparent (obj=)
at ../hw/core/qdev.c:984
#2  0x55b70465 in object_finalize_child_property (
obj=, name=, opaque=0x57de5120)
at ../qom/object.c:1725
#3  0x55b6fa17 in object_property_del_child (
child=0x57de5120, obj=0x57ddcf90) at ../qom/object.c:645
#4  object_unparent (obj=0x57de5120) at ../qom/object.c:664
#5  0x55b4c071 in bus_unparent (obj=)
at ../hw/core/bus.c:147
#6  0x55b70465 in object_finalize_child_property (
obj=, name=, opaque=0x57de5098)
at ../qom/object.c:1725
#7  0x55b6fa17 in object_property_del_child (
child=0x57de5098, obj=0x57ddcf90) at ../qom/object.c:645
#8  object_unparent (obj=0x57de5098) at ../qom/object.c:664
#9  0x55b4ee19 in device_unparent (obj=)
at ../hw/core/qdev.c:981
#10 0x55b70465 in object_finalize_child_property (
obj=, name=, opaque=0x57ddcf90)
at ../qom/object.c:1725
#11 0x55b6fa17 in object_property_del_child (
child=0x57ddcf90, obj=0x5685da10) at ../qom/object.c:645
#12 object_unparent (obj=0x57ddcf90) at ../qom/object.c:664
#13 0x558dc331 in pci_for_each_device_under_bus (
opaque=, fn=, bus=)
at ../hw/pci/pci.c:1654

[2]

Optimizer omits pci_qdev_unrealize, called by device_set_realized, and
do_pci_unregister_device, called by pci_qdev_unrealize and caller of
address_space_destroy.

#0  address_space_destroy (as=0x57ddd1b8)
at ../softmmu/memory.c:2840
#1  0x55b4fc53 in device_set_realized (obj=0x57ddcf90,
 value=, errp=0x7fffeea8f1e0)
at ../hw/core/qdev.c:850
#2  0x55b6eaa6 in property_set_bool (obj=0x57ddcf90,
 v=, name=, opaque=0x56650ba0,
errp=0x7fffeea8f1e0) at ../qom/object.c:2255
#3  0x55b70e07 in object_property_set (
 obj=obj@entry=0x57ddcf90,
 name=name@entry=0x55db99df "realized",
 v=v@entry=0x7fffe46b7500,
 errp=errp@entry=0x565bbf38 )
at ../qom/object.c:1400
#4  0x55b73c5f in object_property_set_qobject (
 obj=obj@entry=0x57ddcf90,
 name=name@entry=0x55db99df "realized",
 value=value@entry=0x7fffe44f6180,
 errp=errp@entry=0x565bbf38 )
at ../qom/qom-qobject.c:28
#5  0x55b71044 in object_property_set_bool (
 obj=0x57ddcf90, name=0x55db99df "realized",
 value=, errp=0x565bbf38 )
at ../qom/object.c:1470
#6  0x55921cb7 in pcie_unplug_device (bus=,
 dev=0x57ddcf90,
 opaque=) at /home/qemu/include/hw/qdev-core.h:17
#7  0x558dc331 in pci_for_each_device_under_bus (
 opaque=, fn=,
 bus=) at ../hw/pci/pci.c:1654

[3]

#0  virtio_device_unrealize (dev=0x57de5120)
at ../hw/virtio/virtio.c:3680
#1  0x55b4fc63 in device_set_realized (obj=0x57de5120,
value=, errp=0x7fffee28df90)
at ../hw/core/qdev.c:850
#2  0x55b6eab6 in property_set_bool (obj=0x57de5120,
v=, name=, opaque=0x56650ba0,
errp=0x7fffee28df90) at ../qom/object.c:2255
#3  0x55b70e17 in object_property_set (
obj=obj@entry=0x57de5120,
name=name@entry=0x55db99ff "realized",
v=v@entry=0x7ffdd8035040,
errp=errp@entry=0x565bbf38 )
at ../qom/object.c:1400
#4  0x55b73c6f in object_property_set_qobject (
obj=obj@entry=0x57de5120,
name=name@entry=0x55db99ff "realized",
value=value@entry=0x7ffdd8035020,
errp=errp@entry=0x565bbf38 )
at ../qom/qom-qobject.c:28
#5  0x55b71054 in object_property_set_bool (
obj=0x57de5120, name=name@entry=0x55db99ff "realized",
value=value@entry=false, errp=0x565bbf38 )
at ../qom/object.c:1470
#6  0x55b4edc5 in qdev_unrealize (dev=)
at 

Re: [Virtio-fs] [PATCH v2] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Greg Kurz
On Wed, 27 Jan 2021 10:25:28 +0100
Miklos Szeredi  wrote:

> On Tue, Jan 26, 2021 at 6:18 PM Greg Kurz  wrote:
> >
> > On Tue, 26 Jan 2021 10:35:02 +
> > Stefan Hajnoczi  wrote:
> 
> > The patch looks pretty good to me. It just seems to be missing a change in
> > lo_create():
> >
> > fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & ~O_NOFOLLOW,
> > mode);
> >
> > A malicious guest could have created anything called ${name} in this 
> > directory
> > before calling FUSE_CREATE and we'll open it blindly, or I'm missing 
> > something ?
> 
> Right, this seems like an omission.
> 
> Also the "& ~O_NOFOLLOW" looks like a copy-paste bug, since unlike
> lo_open(), lo_create() is not opening a proc symlink.
> 
> So that should be replaced with "| O_NOFOLLOW"
> 


Yes, I've realized that later on. We should definitely enforce O_NOFOLLOW
to avoid symlink escapes.

Then comes the case of special files... A well-known case is the FIFO that
causes openat() to block as described in my response. FWIW, we addressed
this one in 9P by adding O_NONBLOCK and fixing the flags to the client
expectation with fcntl(F_SETFL). But this is just a protection against
being blocked. Blindly opening a special file can lead to any kind of
troubles you can think of... so it really looks that the only sane way
to be safe from such an attack is to forbid openat() of special files at
the filesystem level.

> Thanks,
> Miklos
> 




Re: [Virtio-fs] [PATCH v2] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Miklos Szeredi
On Wed, Jan 27, 2021 at 11:20 AM Greg Kurz  wrote:
>
> On Wed, 27 Jan 2021 10:25:28 +0100
> Miklos Szeredi  wrote:
>
> > On Tue, Jan 26, 2021 at 6:18 PM Greg Kurz  wrote:
> > >
> > > On Tue, 26 Jan 2021 10:35:02 +
> > > Stefan Hajnoczi  wrote:
> >
> > > The patch looks pretty good to me. It just seems to be missing a change in
> > > lo_create():
> > >
> > > fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & 
> > > ~O_NOFOLLOW,
> > > mode);
> > >
> > > A malicious guest could have created anything called ${name} in this 
> > > directory
> > > before calling FUSE_CREATE and we'll open it blindly, or I'm missing 
> > > something ?
> >
> > Right, this seems like an omission.
> >
> > Also the "& ~O_NOFOLLOW" looks like a copy-paste bug, since unlike
> > lo_open(), lo_create() is not opening a proc symlink.
> >
> > So that should be replaced with "| O_NOFOLLOW"
> >
>
>
> Yes, I've realized that later on. We should definitely enforce O_NOFOLLOW
> to avoid symlink escapes.
>
> Then comes the case of special files... A well-known case is the FIFO that
> causes openat() to block as described in my response. FWIW, we addressed
> this one in 9P by adding O_NONBLOCK and fixing the flags to the client
> expectation with fcntl(F_SETFL). But this is just a protection against
> being blocked. Blindly opening a special file can lead to any kind of
> troubles you can think of... so it really looks that the only sane way
> to be safe from such an attack is to forbid openat() of special files at
> the filesystem level.

Another solution specifically for O_CREAT without O_EXCL would be to
turn it into an exclusive create.   If that fails with EEXIST then try
the normal open path (open with O_PATH, fstat, open proc symlink).  If
that fails with ENOENT, then retry the whole thing a certain number of
times.  If it still fails then somebody is definitely messing with us
and we can fail the request with EIO.

Rather ugly, but I can't think of anything better.

Thanks,
Miklos




Re: [PATCH v3] machine: add missing doc for memory-backend option

2021-01-27 Thread Michal Privoznik

On 1/21/21 5:15 PM, Igor Mammedov wrote:

Add documentation for '-machine memory-backend' CLI option and
how to use it.

And document that x-use-canonical-path-for-ramblock-id,
is considered to be stable to make sure it won't go away by accident.

x- was intended for unstable/iternal properties, and not supposed to
be stable option. However it's too late to rename (drop x-)
it as it would mean that users will have to mantain both
x-use-canonical-path-for-ramblock-id (for QEMU 5.0-5.2) versions
and prefix-less for later versions.

Signed-off-by: Igor Mammedov 
---
v2:
  - add doc that x-use-canonical-path-for-ramblock-id is considered stable,
  (Peter Krempa )
v3:
  - 
s/x-use-canonical-path-for-ramblock-id=on/x-use-canonical-path-for-ramblock-id=off/
  (Michal Privoznik )
  - add to commit message why x- prefix is preserved
  - drop clause about x-use-canonical-path-for-ramblock-id being stable
from help section, but keep it in code comment above
x-use-canonical-path-for-ramblock-id property. It's sufficient
to prevent option being changed/removed by accident.
  (Peter Maydell )
---
  backends/hostmem.c | 10 ++
  qemu-options.hx| 26 +-
  2 files changed, 35 insertions(+), 1 deletion(-)


Reviewed-by: Michal Privoznik 

Michal




Re: [PATCH v5 02/10] iotests/297: Rewrite in Python and extend reach

2021-01-27 Thread Max Reitz

On 26.01.21 22:28, John Snow wrote:

On 1/18/21 5:57 AM, Max Reitz wrote:

Instead of checking iotests.py only, check all Python files in the
qemu-iotests/ directory.  Of course, most of them do not pass, so there
is an extensive skip list for now.  (The only files that do pass are
209, 254, 283, and iotests.py.)



Chiming in to say that I tried to tackle this before; I wrote some 
preliminary cleanups and sent to the list as an "WIP RFC" or something 
like that in earlyish 2020. I paid attention to qed.py and the other 
non-numerical files.


Maybe badly rotted by now, I don't know.


(Alternatively, we could have the opposite, i.e. an explicit list of
files that we do want to check, but I think it is better to check files
by default.)



I agree. Stop the bleeding first and worry about the rest after.


Unless started in debug mode (./check -d), the output has no information
on which files are tested, so we will not have a problem e.g. with
backports, where some files may be missing when compared to upstream.

Besides the technical rewrite, some more things are changed:

- For the pylint invocation, PYTHONPATH is adjusted.  This mirrors
   setting MYPYPATH for mypy.

- Also, MYPYPATH is now derived from PYTHONPATH, so that we include
   paths set by the environment.  Maybe at some point we want to let the
   check script add '../../python/' to PYTHONPATH so that iotests.py does
   not need to do that.



Does this solve an observed problem, or is it preventative? I ran into 
trouble once by pointing mypy to my system python libraries; it seemed 
to have a check that explicitly warned me against such tricks.


Yes, that happens when you derive it from sys.path.  (Which I tried at 
one point, and then ran into that exact problem.)


PYTHONPATH generally doesn’t include the system libraries, though, 
generally it shouldn’t even be set for the iotests.  So the only thing 
that’s put in there is ../../python/, and we need that in MYPYPATH, too.


(As I wrote, perhaps in the future the check script will add 
../../python/, so we don’t need to do that anywhere in the iotests, and 
then it makes more sense why MYPYPATH should just be $PYTHONPATH.)



I guess for now, if it works, it works. :o)


- Passing --notes=FIXME,XXX to pylint suppresses warnings for TODO
   comments.  TODO is fine, we do not need 297 to complain about such
   comments.



Agreed. You can also edit pylintrc to choose which keywords trigger the 
check -- "TODO" is probably fine, but "FIXME" is maybe a shade worse. 
Season to taste.


Yes, definitely a matter of taste.  I kind of like pylint to complain 
about TODO when I’m running it explicitly, so, well.



- The "Success" line from mypy's output is suppressed, because (A) it
   does not add useful information, and (B) it would leak information
   about the files having been tested to the reference output, which we
   decidedly do not want.

Suggested-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Max Reitz 
---
  tests/qemu-iotests/297 | 112 +
  tests/qemu-iotests/297.out |   5 +-
  2 files changed, 92 insertions(+), 25 deletions(-)

diff --git a/tests/qemu-iotests/297 b/tests/qemu-iotests/297
index 5c5420712b..e3db3e061e 100755
--- a/tests/qemu-iotests/297
+++ b/tests/qemu-iotests/297
@@ -1,4 +1,4 @@
-#!/usr/bin/env bash
+#!/usr/bin/env python3
  #
  # Copyright (C) 2020 Red Hat, Inc.


You could bump it up, if you wanted.


Do I, though? :)


  #
@@ -15,30 +15,98 @@
  # You should have received a copy of the GNU General Public License
  # along with this program.  If not, see .
-seq=$(basename $0)
-echo "QA output created by $seq"
+import os
+import re
+import shutil
+import subprocess
+import sys
-status=1    # failure is the default!
+import iotests
-# get standard environment
-. ./common.rc
-if ! type -p "pylint-3" > /dev/null; then
-    _notrun "pylint-3 not found"
-fi
-if ! type -p "mypy" > /dev/null; then
-    _notrun "mypy not found"
-fi
+# TODO: Empty this list!
+SKIP_FILES = (
+    '030', '040', '041', '044', '045', '055', '056', '057', '065', 
'093',
+    '096', '118', '124', '129', '132', '136', '139', '147', '148', 
'149',
+    '151', '152', '155', '163', '165', '169', '194', '196', '199', 
'202',
+    '203', '205', '206', '207', '208', '210', '211', '212', '213', 
'216',
+    '218', '219', '222', '224', '228', '234', '235', '236', '237', 
'238',
+    '240', '242', '245', '246', '248', '255', '256', '257', '258', 
'260',
+    '262', '264', '266', '274', '277', '280', '281', '295', '296', 
'298',

+    '299', '300', '302', '303', '304', '307',
+    'nbd-fault-injector.py', 'qcow2.py', 'qcow2_format.py', 'qed.py'
+)
-pylint-3 --score=n iotests.py
-MYPYPATH=../../python/ mypy --warn-unused-configs 
--disallow-subclassing-any \

-    --disallow-any-generics --disallow-incomplete-defs \
-    --disallow-untyped-decorators --no-implicit-optional \
-    --warn-redundant-casts --warn-unused-ignores \
-    --no-

Re: [PATCH v3] machine: add missing doc for memory-backend option

2021-01-27 Thread Daniel P . Berrangé
On Thu, Jan 21, 2021 at 11:15:04AM -0500, Igor Mammedov wrote:
> Add documentation for '-machine memory-backend' CLI option and
> how to use it.
> 
> And document that x-use-canonical-path-for-ramblock-id,
> is considered to be stable to make sure it won't go away by accident.
> 
> x- was intended for unstable/iternal properties, and not supposed to
> be stable option. However it's too late to rename (drop x-)
> it as it would mean that users will have to mantain both
> x-use-canonical-path-for-ramblock-id (for QEMU 5.0-5.2) versions
> and prefix-less for later versions.
> 
> Signed-off-by: Igor Mammedov 
> ---
> v2:
>  - add doc that x-use-canonical-path-for-ramblock-id is considered stable,
>  (Peter Krempa )
> v3:
>  - 
> s/x-use-canonical-path-for-ramblock-id=on/x-use-canonical-path-for-ramblock-id=off/
>  (Michal Privoznik )
>  - add to commit message why x- prefix is preserved
>  - drop clause about x-use-canonical-path-for-ramblock-id being stable
>from help section, but keep it in code comment above
>x-use-canonical-path-for-ramblock-id property. It's sufficient
>to prevent option being changed/removed by accident.
>  (Peter Maydell )
> ---
>  backends/hostmem.c | 10 ++
>  qemu-options.hx| 26 +-
>  2 files changed, 35 insertions(+), 1 deletion(-)
> 
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index 9f9ac95edd..813aeb83c9 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -498,6 +498,16 @@ host_memory_backend_class_init(ObjectClass *oc, void 
> *data)
>  host_memory_backend_get_share, host_memory_backend_set_share);
>  object_class_property_set_description(oc, "share",
>  "Mark the memory as private to QEMU or shared");
> +/*
> + * Do not delete/rename option. This option must be considered stable
> + * (as if it didn't have the 'x-' prefix including deprecation period) as
> + * long as 4.0 and older machine types exists.
> + * Option will be used by upper layers to override (disable) canonical 
> path
> + * for ramblock-id set by compat properties on old machine types ( <= 
> 4.0),
> + * to keep migration working when backend is used for main RAM with
> + * -machine memory-backend= option (main RAM historically used 
> prefix-less
> + * ramblock-id).
> + */
>  object_class_property_add_bool(oc, 
> "x-use-canonical-path-for-ramblock-id",
>  host_memory_backend_get_use_canonical_path,
>  host_memory_backend_set_use_canonical_path);
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 62791f56d8..059b1a1d14 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -35,7 +35,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
>  "suppress-vmdesc=on|off disables self-describing 
> migration (default=off)\n"
>  "nvdimm=on|off controls NVDIMM support (default=off)\n"
>  "memory-encryption=@var{} memory encryption object to 
> use (default=none)\n"
> -"hmat=on|off controls ACPI HMAT support (default=off)\n",
> +"hmat=on|off controls ACPI HMAT support (default=off)\n"
> +"memory-backend='backend-id' specifies explicitly 
> provided backend for main RAM (default=none)\n",
>  QEMU_ARCH_ALL)
>  SRST
>  ``-machine [type=]name[,prop=value[,...]]``
> @@ -96,6 +97,29 @@ SRST
>  ``hmat=on|off``
>  Enables or disables ACPI Heterogeneous Memory Attribute Table
>  (HMAT) support. The default is off.
> +
> + ``memory-backend='id'``
> +An alternative to legacy ``-mem-path`` and ``mem-prealloc`` options.
> +Allows to use a memory backend as main RAM.
> +
> +For example:
> +::
> +-object 
> memory-backend-file,id=pc.ram,size=512M,mem-path=/hugetlbfs,prealloc=on,share=on
> +-machine memory-backend=pc.ram
> +-m 512M
> +
> +Migration compatibility note:
> +a) as backend id one shall use value of 'default-ram-id', advertised 
> by
> +machine type (available via ``query-machines`` QMP command), if 
> migration
> +to/from old QEMU (<5.0) is expected.
> +b) for machine types 4.0 and older, user shall
> +use ``x-use-canonical-path-for-ramblock-id=off`` backend option
> +if migration to/from old QEMU (<5.0) is expected.

How does a mgmt app know which machine types need to use this
option ? The machine type names are opaque strings, and apps
must not attempt to parse or interpret the version number
inside the machine type name, as they can be changed by
distros.  IOW, saying to use it for machine types 4.0 and
older isn't a valid usage strategy IMHO.

> +For example:
> +::
> +-object 
> memory-backend-ram,id=pc.ram,size=512M,x-use-canonical-path-for-ramblock-id=off
> +-machine memory-backend=pc.ram
> +-m 512M
>  ERST
>  

Regards,
Daniel
-- 
|: https://berrange.com 

Re: [PATCH] ui: fix spice display regression

2021-01-27 Thread Paolo Bonzini

On 27/01/21 11:18, Marc-André Lureau wrote:

Hi

On Wed, Jan 27, 2021 at 2:03 PM  wrote:


From: Marc-André Lureau 

Since commit b4e1a342112e50e05b609e857f38c1f2b7aafdc4 ("vl: remove
separate preconfig main_loop"), spice initialization is a bit dodgy, and
the client get stuck waiting for server events.

The initialization order changed, so that qemu_spice_display_start() is
called before the display interfaces are added. The new interfaces
aren't started by spice-server automatically (yet?), so we have to tell
the server explicitely when we are already running.

Signed-off-by: Marc-André Lureau 


What is the exact different in initialization order once you add commit 
facf7c60ee ("vl: initialize displays _after_ exiting preconfiguration", 
2021-01-02)?


Thanks,

Paolo


---
  ui/spice-core.c | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/ui/spice-core.c b/ui/spice-core.c
index 5746d0aae7..6eebf12e3c 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -830,6 +830,8 @@ static void qemu_spice_init(void)

  static int qemu_spice_add_interface(SpiceBaseInstance *sin)
  {
+int ret;
+
  if (!spice_server) {
  if (QTAILQ_FIRST(&qemu_spice_opts.head) != NULL) {
  error_report("Oops: spice configured but not active");
@@ -848,7 +850,13 @@ static int qemu_spice_add_interface(SpiceBaseInstance *sin)
  qemu_add_vm_change_state_handler(vm_change_state_handler, NULL);
  }

-return spice_server_add_interface(spice_server, sin);
+ret = spice_server_add_interface(spice_server, sin);
+/* make sure the newly added interface is started */
+if (ret == 0 && spice_display_is_running) {
+spice_server_vm_start(spice_server);
+}
+
+return ret;
  }

  static GSList *spice_consoles;
--
2.29.0




Oops, it doesn't work reliably. There is some race in spice server now.

spice_server_vm_start() sends RED_WORKER_MESSAGE_START to the QXL
worker thread. But if two of those come, it will assert... It should
probably not, I will send a patch to spice.

I am looking for other options for QEMU though.






Re: [PATCH v3] machine: add missing doc for memory-backend option

2021-01-27 Thread Daniel P . Berrangé
On Wed, Jan 27, 2021 at 10:45:11AM +, Daniel P. Berrangé wrote:
> On Thu, Jan 21, 2021 at 11:15:04AM -0500, Igor Mammedov wrote:
> > Add documentation for '-machine memory-backend' CLI option and
> > how to use it.
> > 
> > And document that x-use-canonical-path-for-ramblock-id,
> > is considered to be stable to make sure it won't go away by accident.
> > 
> > x- was intended for unstable/iternal properties, and not supposed to
> > be stable option. However it's too late to rename (drop x-)
> > it as it would mean that users will have to mantain both
> > x-use-canonical-path-for-ramblock-id (for QEMU 5.0-5.2) versions
> > and prefix-less for later versions.
> > 
> > Signed-off-by: Igor Mammedov 
> > ---
> > v2:
> >  - add doc that x-use-canonical-path-for-ramblock-id is considered stable,
> >  (Peter Krempa )
> > v3:
> >  - 
> > s/x-use-canonical-path-for-ramblock-id=on/x-use-canonical-path-for-ramblock-id=off/
> >  (Michal Privoznik )
> >  - add to commit message why x- prefix is preserved
> >  - drop clause about x-use-canonical-path-for-ramblock-id being stable
> >from help section, but keep it in code comment above
> >x-use-canonical-path-for-ramblock-id property. It's sufficient
> >to prevent option being changed/removed by accident.
> >  (Peter Maydell )
> > ---
> >  backends/hostmem.c | 10 ++
> >  qemu-options.hx| 26 +-
> >  2 files changed, 35 insertions(+), 1 deletion(-)
> > 
> > diff --git a/backends/hostmem.c b/backends/hostmem.c
> > index 9f9ac95edd..813aeb83c9 100644
> > --- a/backends/hostmem.c
> > +++ b/backends/hostmem.c
> > @@ -498,6 +498,16 @@ host_memory_backend_class_init(ObjectClass *oc, void 
> > *data)
> >  host_memory_backend_get_share, host_memory_backend_set_share);
> >  object_class_property_set_description(oc, "share",
> >  "Mark the memory as private to QEMU or shared");
> > +/*
> > + * Do not delete/rename option. This option must be considered stable
> > + * (as if it didn't have the 'x-' prefix including deprecation period) 
> > as
> > + * long as 4.0 and older machine types exists.
> > + * Option will be used by upper layers to override (disable) canonical 
> > path
> > + * for ramblock-id set by compat properties on old machine types ( <= 
> > 4.0),
> > + * to keep migration working when backend is used for main RAM with
> > + * -machine memory-backend= option (main RAM historically used 
> > prefix-less
> > + * ramblock-id).
> > + */
> >  object_class_property_add_bool(oc, 
> > "x-use-canonical-path-for-ramblock-id",
> >  host_memory_backend_get_use_canonical_path,
> >  host_memory_backend_set_use_canonical_path);
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index 62791f56d8..059b1a1d14 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -35,7 +35,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
> >  "suppress-vmdesc=on|off disables self-describing 
> > migration (default=off)\n"
> >  "nvdimm=on|off controls NVDIMM support (default=off)\n"
> >  "memory-encryption=@var{} memory encryption object to 
> > use (default=none)\n"
> > -"hmat=on|off controls ACPI HMAT support 
> > (default=off)\n",
> > +"hmat=on|off controls ACPI HMAT support 
> > (default=off)\n"
> > +"memory-backend='backend-id' specifies explicitly 
> > provided backend for main RAM (default=none)\n",
> >  QEMU_ARCH_ALL)
> >  SRST
> >  ``-machine [type=]name[,prop=value[,...]]``
> > @@ -96,6 +97,29 @@ SRST
> >  ``hmat=on|off``
> >  Enables or disables ACPI Heterogeneous Memory Attribute Table
> >  (HMAT) support. The default is off.
> > +
> > + ``memory-backend='id'``
> > +An alternative to legacy ``-mem-path`` and ``mem-prealloc`` 
> > options.
> > +Allows to use a memory backend as main RAM.
> > +
> > +For example:
> > +::
> > +-object 
> > memory-backend-file,id=pc.ram,size=512M,mem-path=/hugetlbfs,prealloc=on,share=on
> > +-machine memory-backend=pc.ram
> > +-m 512M
> > +
> > +Migration compatibility note:
> > +a) as backend id one shall use value of 'default-ram-id', 
> > advertised by
> > +machine type (available via ``query-machines`` QMP command), if 
> > migration
> > +to/from old QEMU (<5.0) is expected.
> > +b) for machine types 4.0 and older, user shall
> > +use ``x-use-canonical-path-for-ramblock-id=off`` backend option
> > +if migration to/from old QEMU (<5.0) is expected.
> 
> How does a mgmt app know which machine types need to use this
> option ? The machine type names are opaque strings, and apps
> must not attempt to parse or interpret the version number
> inside the machine type name, as they can be changed by
> distros.  IOW, saying to use it for machine types 4.0 an

[PULL 2/3] vnc: send extended desktop resize on update requests

2021-01-27 Thread Gerd Hoffmann
Unlike other pseudo-encodings these don't break gtk-vnc
because older versions don't suport the extended desktop
resize extension in the first place.

Signed-off-by: Gerd Hoffmann 
Tested-by: Laszlo Ersek 
Message-Id: <20210125104041.495274-3-kra...@redhat.com>
---
 ui/vnc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/ui/vnc.c b/ui/vnc.c
index 2622f82a5a9f..16bb3be770b2 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -2046,6 +2046,9 @@ static void framebuffer_update_request(VncState *vs, int 
incremental,
 } else {
 vs->update = VNC_STATE_UPDATE_FORCE;
 vnc_set_area_dirty(vs->dirty, vs->vd, x, y, w, h);
+if (vnc_has_feature(vs, VNC_FEATURE_RESIZE_EXT)) {
+vnc_desktop_resize_ext(vs, 0);
+}
 }
 }
 
-- 
2.29.2




[PULL 3/3] hw/display/vmware_vga: Remove dependency on VNC header

2021-01-27 Thread Gerd Hoffmann
From: Peter Maydell 

In commit 2f487a3d40faff1 we fixed a problem observed with using the
vmware-vga device and the VNC UI frontend in a belt-and-braces
manner:
 * we made the VNC frontend handle non-multiple-of-16 surface widths
 * we rounded up the vmware-vga display width to a multiple of 16

However this introduced a spurious dependency of a device model on a
UI frontend header.  vmware-vga isn't special and should not care
about what UI frontend it is using, and the VNC frontend needs to
handle arbitrary surface widths because other display device models
could use them.  Moreover, even if the maximum width in vmware-vga is
made a multiple of 16, the guest itself can always program a
different width.

Remove the dependency on the VNC header.  Since we have been using
the rounded-up width value since 2014, stick with it rather than
introducing a behaviour change, but don't calculate it by rounding up
to VNC_DIRTY_BITS_PER_PIXEL any more.

Signed-off-by: Peter Maydell 
Message-Id: <20210112161608.16055-1-peter.mayd...@linaro.org>
Signed-off-by: Gerd Hoffmann 
---
 hw/display/vmware_vga.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/display/vmware_vga.c b/hw/display/vmware_vga.c
index bef0d7d69a79..e2969a6c81c8 100644
--- a/hw/display/vmware_vga.c
+++ b/hw/display/vmware_vga.c
@@ -29,7 +29,6 @@
 #include "qemu/log.h"
 #include "hw/loader.h"
 #include "trace.h"
-#include "ui/vnc.h"
 #include "hw/pci/pci.h"
 #include "hw/qdev-properties.h"
 #include "migration/vmstate.h"
@@ -220,7 +219,7 @@ enum {
 
 /* These values can probably be changed arbitrarily.  */
 #define SVGA_SCRATCH_SIZE   0x8000
-#define SVGA_MAX_WIDTH  ROUND_UP(2360, 
VNC_DIRTY_PIXELS_PER_BIT)
+#define SVGA_MAX_WIDTH  2368
 #define SVGA_MAX_HEIGHT 1770
 
 #ifdef VERBOSE
-- 
2.29.2




[PULL 0/3] Ui 20210127 patches

2021-01-27 Thread Gerd Hoffmann
The following changes since commit 9cd69f1a270235b652766f00b94114f48a2d603f:

  Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2021-01-25=
-1' into staging (2021-01-26 09:51:02 +)

are available in the Git repository at:

  git://git.kraxel.org/qemu tags/ui-20210127-pull-request

for you to fetch changes up to 15b08119add1e49ccbc7f7d6b3a04932d2430d7e:

  hw/display/vmware_vga: Remove dependency on VNC header (2021-01-27 09:48:04=
 +0100)


vnc: fix gtk-vnc compatibility issues.
vnc: vmware svga cleanup



Gerd Hoffmann (2):
  Revert "vnc: move initialization to framebuffer_update_request"
  vnc: send extended desktop resize on update requests

Peter Maydell (1):
  hw/display/vmware_vga: Remove dependency on VNC header

 hw/display/vmware_vga.c |  3 +--
 ui/vnc.c| 14 ++
 2 files changed, 11 insertions(+), 6 deletions(-)

--=20
2.29.2





[PULL 1/3] Revert "vnc: move initialization to framebuffer_update_request"

2021-01-27 Thread Gerd Hoffmann
This reverts commit 9e1632ad07ca49de99da4bb231e9e2f22f2d8df5.

Older gtk-vnc versions can't deal with non-incremental update
requests sending pseudo-encodings, so trying to send full server
state (including desktop size, cursor etc. which is done using
pseudo-encodings) doesn't fly.  Return to old behavior to send
those only for new connects and when changes happen.

Signed-off-by: Gerd Hoffmann 
Tested-by: Laszlo Ersek 
Message-Id: <20210125104041.495274-2-kra...@redhat.com>
---
 ui/vnc.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index 66f7c1b9361e..2622f82a5a9f 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -687,6 +687,10 @@ static void vnc_desktop_resize(VncState *vs)
 !vnc_has_feature(vs, VNC_FEATURE_RESIZE_EXT))) {
 return;
 }
+if (vs->client_width == pixman_image_get_width(vs->vd->server) &&
+vs->client_height == pixman_image_get_height(vs->vd->server)) {
+return;
+}
 
 assert(pixman_image_get_width(vs->vd->server) < 65536 &&
pixman_image_get_width(vs->vd->server) >= 0);
@@ -2042,10 +2046,6 @@ static void framebuffer_update_request(VncState *vs, int 
incremental,
 } else {
 vs->update = VNC_STATE_UPDATE_FORCE;
 vnc_set_area_dirty(vs->dirty, vs->vd, x, y, w, h);
-vnc_colordepth(vs);
-vnc_desktop_resize(vs);
-vnc_led_state_change(vs);
-vnc_cursor_define(vs);
 }
 }
 
@@ -2189,7 +2189,10 @@ static void set_encodings(VncState *vs, int32_t 
*encodings, size_t n_encodings)
 break;
 }
 }
+vnc_desktop_resize(vs);
 check_pointer_type_change(&vs->mouse_mode_notifier, NULL);
+vnc_led_state_change(vs);
+vnc_cursor_define(vs);
 }
 
 static void set_pixel_conversion(VncState *vs)
-- 
2.29.2




Re: [PATCH] IOMMU and ATS not supported by vhost-user filesystem.

2021-01-27 Thread Stefan Hajnoczi
On Tue, Jan 26, 2021 at 03:23:38PM -0300, lagar...@linux.ibm.com wrote:
> From: Leonardo Garcia 
> 
> Currently, as IOMMU and ATS are not supported, if a user mistakenly set
> any of them and tries to mount the vhost-user filesystem inside the
> guest, whenever the user tries to access the mount point, the system
> will hang forever.
> 
> Signed-off-by: Leonardo Garcia 
> ---
>  hw/virtio/vhost-user-fs-pci.c | 7 +++
>  hw/virtio/vhost-user-fs.c | 5 +
>  2 files changed, 12 insertions(+)
> 
> diff --git a/hw/virtio/vhost-user-fs-pci.c b/hw/virtio/vhost-user-fs-pci.c
> index 2ed8492b3f..564d1fd108 100644
> --- a/hw/virtio/vhost-user-fs-pci.c
> +++ b/hw/virtio/vhost-user-fs-pci.c
> @@ -11,6 +11,8 @@
>   * top-level directory.
>   */
>  
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
>  #include "qemu/osdep.h"
>  #include "hw/qdev-properties.h"
>  #include "hw/virtio/vhost-user-fs.h"
> @@ -45,6 +47,11 @@ static void vhost_user_fs_pci_realize(VirtIOPCIProxy 
> *vpci_dev, Error **errp)
>  vpci_dev->nvectors = dev->vdev.conf.num_request_queues + 2;
>  }
>  
> +if (vpci_dev->flags & VIRTIO_PCI_FLAG_ATS) {
> +error_setg(errp, "ATS is currently not supported with 
> vhost-user-fs-pci");
> +return;
> +}

Why is this check needed in addition to VIRTIO_F_IOMMU_PLATFORM?

What needs to be added to support ATS?

> +
>  qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
>  }
>  
> diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> index ac4fc34b36..914d68b3ee 100644
> --- a/hw/virtio/vhost-user-fs.c
> +++ b/hw/virtio/vhost-user-fs.c
> @@ -203,6 +203,11 @@ static void vuf_device_realize(DeviceState *dev, Error 
> **errp)
>  return;
>  }
>  
> +if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> +error_setg(errp, "IOMMU is currently not supported with 
> vhost-user-fs");
> +return;
> +}
> +
>  if (!vhost_user_init(&fs->vhost_user, &fs->conf.chardev, errp)) {

I thought IOMMU support depends on the vhost-user device backend (e.g.
virtiofsd), so the vhost-user backend should participate in advertising
this feature.

Perhaps the check should be:

ret = vhost_dev_init(&fs->vhost_dev, &fs->vhost_user,
 VHOST_BACKEND_TYPE_USER, 0);
if (ret < 0) {
error_setg_errno(errp, -ret, "vhost_dev_init failed");
goto err_virtio;
}
+
+   if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM) &&
+   !(fs->vhost_dev.hdev_features & (1ull << VIRTIO_F_IOMMU_PLATFORM))) {
+   error_setg(errp, "IOMMU is not supported by the vhost-user device 
backend");
+   goto err_iommu_needed;
+   }

Also, can this logic be made generic for all vhost-user devices? It's
not really specific to vhost-user-fs.

Stefan


signature.asc
Description: PGP signature


Re: [RFC PATCH v2 2/3] vfio: Set the priority of the VFIO VM state change handler explicitly

2021-01-27 Thread Shenming Lu
On 2021/1/27 5:36, Alex Williamson wrote:
> On Wed, 9 Dec 2020 16:09:18 +0800
> Shenming Lu  wrote:
> 
>> In the VFIO VM state change handler, VFIO devices are transitioned
>> in the _SAVING state, which should keep them from sending interrupts.
> 
> Is this comment accurate?  It's my expectation that _SAVING has no
> bearing on a device generating interrupts.  Interrupt generation must
> be allowed to continue so long as the device is _RUNNING.  Thanks,
> 

To be more accurate, the _RUNNING bit in device_state is cleared in the
VFIO VM state change handler when stopping the VM. And if the device continues
to send interrupts after this, how can we save the states of device interrupts
in the stop-and-copy phase?...

Thanks,
Shenming

> Alex
> 
>> Then we can save the pending states of all interrupts in the GIC VM
>> state change handler (on ARM).
>>
>> So we have to set the priority of the VFIO VM state change handler
>> explicitly (like virtio devices) to ensure it is called before the
>> GIC's in saving.
>>
>> Signed-off-by: Shenming Lu 
>> Reviewed-by: Kirti Wankhede 
>> ---
>>  hw/vfio/migration.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 3b9de1353a..97ea82b100 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -862,7 +862,8 @@ static int vfio_migration_init(VFIODevice *vbasedev,
>>  register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, 
>> &savevm_vfio_handlers,
>>   vbasedev);
>>  
>> -migration->vm_state = 
>> qemu_add_vm_change_state_handler(vfio_vmstate_change,
>> +migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
>> +   
>> vfio_vmstate_change,
>> vbasedev);
>>  migration->migration_state.notify = vfio_migration_state_notifier;
>>  add_migration_state_change_notifier(&migration->migration_state);
> 
> .
> 



[PATCH v3] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Stefan Hajnoczi
A well-behaved FUSE client does not attempt to open special files with
FUSE_OPEN because they are handled on the client side (e.g. device nodes
are handled by client-side device drivers).

The check to prevent virtiofsd from opening special files is missing in
a few cases, most notably FUSE_OPEN. A malicious client can cause
virtiofsd to open a device node, potentially allowing the guest to
escape. This can be exploited by a modified guest device driver. It is
not exploitable from guest userspace since the guest kernel will handle
special files inside the guest instead of sending FUSE requests.

This patch adds the missing checks to virtiofsd. This is a short-term
solution because it does not prevent a compromised virtiofsd process
from opening device nodes on the host.

Reported-by: Alex Xu 
Fixes: CVE-2020-35517
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Vivek Goyal 
Signed-off-by: Stefan Hajnoczi 
---
v3:
 * Protect lo_create() [Greg]
v2:
 * Add doc comment clarifying that symlinks are traversed client-side
   [Daniel]

This issue was diagnosed on public IRC and is therefore already known
and not embargoed.

A stronger fix, and the long-term solution, is for users to mount the
shared directory and any sub-mounts with nodev, as well as nosuid and
noexec. Unfortunately virtiofsd cannot do this automatically because
bind mounts added by the user after virtiofsd has launched would not be
detected. I suggest the following:

1. Modify libvirt and Kata Containers to explicitly set these mount
   options.
2. Then modify virtiofsd to check that the shared directory has the
   necessary options at startup. Refuse to start if the options are
   missing so that the user is aware of the security requirements.

As a bonus this also increases the likelihood that other host processes
besides virtiofsd will be protected by nosuid/noexec/nodev so that a
malicious guest cannot drop these files in place and then arrange for a
host process to come across them.

Additionally, user namespaces have been discussed. They seem like a
worthwhile addition as an unprivileged or privilege-separated mode
although there are limitations with respect to security xattrs and the
actual uid/gid stored on the host file system not corresponding to the
guest uid/gid.
---
 tools/virtiofsd/passthrough_ll.c | 104 ++-
 1 file changed, 74 insertions(+), 30 deletions(-)

diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index 5fb36d9407..054ad439a5 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -555,6 +555,30 @@ static int lo_fd(fuse_req_t req, fuse_ino_t ino)
 return fd;
 }
 
+/*
+ * Open a file descriptor for an inode. Returns -EBADF if the inode is not a
+ * regular file or a directory. Use this helper function instead of raw
+ * openat(2) to prevent security issues when a malicious client opens special
+ * files such as block device nodes. Symlink inodes are also rejected since
+ * symlinks must already have been traversed on the client side.
+ */
+static int lo_inode_open(struct lo_data *lo, struct lo_inode *inode,
+ int open_flags)
+{
+g_autofree char *fd_str = g_strdup_printf("%d", inode->fd);
+int fd;
+
+if (!S_ISREG(inode->filetype) && !S_ISDIR(inode->filetype)) {
+return -EBADF;
+}
+
+fd = openat(lo->proc_self_fd, fd_str, open_flags);
+if (fd < 0) {
+return -errno;
+}
+return fd;
+}
+
 static void lo_init(void *userdata, struct fuse_conn_info *conn)
 {
 struct lo_data *lo = (struct lo_data *)userdata;
@@ -684,8 +708,7 @@ static void lo_setattr(fuse_req_t req, fuse_ino_t ino, 
struct stat *attr,
 if (fi) {
 truncfd = fd;
 } else {
-sprintf(procname, "%i", ifd);
-truncfd = openat(lo->proc_self_fd, procname, O_RDWR);
+truncfd = lo_inode_open(lo, inode, O_RDWR);
 if (truncfd < 0) {
 goto out_err;
 }
@@ -1654,9 +1677,11 @@ static void update_open_flags(int writeback, int 
allow_direct_io,
 static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
   mode_t mode, struct fuse_file_info *fi)
 {
+int open_flags = (fi->flags | O_CREAT) & ~O_NOFOLLOW;
 int fd;
 struct lo_data *lo = lo_data(req);
 struct lo_inode *parent_inode;
+struct lo_inode *existing_inode = NULL;
 struct fuse_entry_param e;
 int err;
 struct lo_cred old = {};
@@ -1682,11 +1707,23 @@ static void lo_create(fuse_req_t req, fuse_ino_t 
parent, const char *name,
 
 update_open_flags(lo->writeback, lo->allow_direct_io, fi);
 
-fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & ~O_NOFOLLOW,
-mode);
+/* First, try to create a new file but don't open existing files */
+fd = openat(parent_inode->fd, name, open_flags | O_EXCL, mode);
 err = fd == -1 ? errno : 0;
+
 lo_restore_cred(&old);
 

Re: [RFC PATCH v2 3/3] vfio: Avoid disabling and enabling vectors repeatedly in VFIO migration

2021-01-27 Thread Shenming Lu
On 2021/1/27 5:36, Alex Williamson wrote:
> On Wed, 9 Dec 2020 16:09:19 +0800
> Shenming Lu  wrote:
> 
>> Different from the normal situation when the guest starts, we can
>> know the max unmasked vetctor (at the beginning) after msix_load()
>> in VFIO migration. So in order to avoid ineffectively disabling and
> 
> s/ineffectively/inefficiently/?  It's "effective" either way I think.

Yeah, I should say "inefficiently". :-)

> 
>> enabling vectors repeatedly, let's allocate all needed vectors first
>> and then enable these unmasked vectors one by one without disabling.
>>
>> Signed-off-by: Shenming Lu 
>> ---
>>  hw/pci/msix.c | 17 +
>>  hw/vfio/pci.c | 10 --
>>  include/hw/pci/msix.h |  2 ++
>>  3 files changed, 27 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/pci/msix.c b/hw/pci/msix.c
>> index 67e34f34d6..bf291d3ff8 100644
>> --- a/hw/pci/msix.c
>> +++ b/hw/pci/msix.c
>> @@ -557,6 +557,23 @@ unsigned int msix_nr_vectors_allocated(const PCIDevice 
>> *dev)
>>  return dev->msix_entries_nr;
>>  }
>>  
>> +int msix_get_max_unmasked_vector(PCIDevice *dev)
>> +{
>> +int max_unmasked_vector = -1;
>> +int vector;
>> +
>> +if ((dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &
>> +(MSIX_ENABLE_MASK | MSIX_MASKALL_MASK)) == MSIX_ENABLE_MASK) {
>> +for (vector = 0; vector < dev->msix_entries_nr; vector++) {
>> +if (!msix_is_masked(dev, vector)) {
>> +max_unmasked_vector = vector;
>> +}
>> +}
>> +}
>> +
>> +return max_unmasked_vector;
>> +}
> 
> Comments from QEMU PCI folks?
> 
>> +
>>  static int msix_set_notifier_for_vector(PCIDevice *dev, unsigned int vector)
>>  {
>>  MSIMessage msg;
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 51dc373695..e755ed2514 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -568,6 +568,9 @@ static void vfio_msix_vector_release(PCIDevice *pdev, 
>> unsigned int nr)
>>  
>>  static void vfio_msix_enable(VFIOPCIDevice *vdev)
>>  {
>> +int max_unmasked_vector = msix_get_max_unmasked_vector(&vdev->pdev);
>> +unsigned int used_vector = MAX(max_unmasked_vector, 0);
>> +
> 
> The above PCI function could also be done inline here pretty easily too:
> 
> unsigned int nr, max_vec = 0;
> 
> if (!msix_masked(&vdev->pdev))
> for (nr = 0; nr < msix_nr_vectors_allocated(&vdev->pdev); nr++) {
> if (!msix_is_masked(&vdev->pdev, nr)) {
> max_vec = nr;
> }
> }
> }
> 
> It's a bit cleaner than the msix utility function, imo.

Yeah, it makes sense.

> 
>>  vfio_disable_interrupts(vdev);
>>  
>>  vdev->msi_vectors = g_new0(VFIOMSIVector, vdev->msix->entries);
>> @@ -586,9 +589,12 @@ static void vfio_msix_enable(VFIOPCIDevice *vdev)
>>   * triggering to userspace, then immediately release the vector, leaving
>>   * the physical device with no vectors enabled, but MSI-X enabled, just
>>   * like the guest view.
>> + * If there are unmasked vectors (such as in migration) which will be
>> + * enabled soon, we can allocate them here to avoid ineffectively 
>> disabling
>> + * and enabling vectors repeatedly later.
> 
> It just happens that migration triggers this usage model where the
> MSI-X enable bit is set with vectors unmasked in the vector table, but
> this is not unique to migration, guests can follow this pattern as well.
> Has this been tested with a variety of guests?  Logically it seems
> correct, but always good to prove so.  Thanks,

I have tested it in migration and normal guest startup (only the latest Linux).
And I can try to test with some other kernels, could you be more specific about 
this?

Thanks,
Shenming

> 
> Alex
> 
>>   */
>> -vfio_msix_vector_do_use(&vdev->pdev, 0, NULL, NULL);
>> -vfio_msix_vector_release(&vdev->pdev, 0);
>> +vfio_msix_vector_do_use(&vdev->pdev, used_vector, NULL, NULL);
>> +vfio_msix_vector_release(&vdev->pdev, used_vector);
>>  
>>  if (msix_set_vector_notifiers(&vdev->pdev, vfio_msix_vector_use,
>>vfio_msix_vector_release, NULL)) {
>> diff --git a/include/hw/pci/msix.h b/include/hw/pci/msix.h
>> index 4c4a60c739..4bfb463fa6 100644
>> --- a/include/hw/pci/msix.h
>> +++ b/include/hw/pci/msix.h
>> @@ -23,6 +23,8 @@ void msix_uninit_exclusive_bar(PCIDevice *dev);
>>  
>>  unsigned int msix_nr_vectors_allocated(const PCIDevice *dev);
>>  
>> +int msix_get_max_unmasked_vector(PCIDevice *dev);
>> +
>>  void msix_save(PCIDevice *dev, QEMUFile *f);
>>  void msix_load(PCIDevice *dev, QEMUFile *f);
>>  
> 
> .
> 



Re: [PATCH 3/7] ppc/pnv: Use skiboot addresses to load kernel and ramfs

2021-01-27 Thread Murilo Opsfelder Araújo
On Tuesday, January 26, 2021 2:10:55 PM -03 Cédric Le Goater wrote:
> The current settings are useful to load large kernels (with debug) but
> it moves the initrd image in a memory region not protected by
> skiboot. If skiboot is compiled with DEBUG=1, memory poisoning will
> corrupt the initrd.
>
> Cc: Murilo Opsfelder Araujo 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: Murilo Opsfelder Araujo 

> ---
>
>  If we want to increase the kernel size limit as commit b45b56baeecd
>  ("ppc/pnv: increase kernel size limit to 256MiB") intented to do, I
>  think we should add a machine option.
>
>  hw/ppc/pnv.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index 14fc9758a973..e500c2e2437e 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -65,9 +65,9 @@
>  #define FW_MAX_SIZE (16 * MiB)
>
>  #define KERNEL_LOAD_ADDR0x2000
> -#define KERNEL_MAX_SIZE (256 * MiB)
> -#define INITRD_LOAD_ADDR0x6000
> -#define INITRD_MAX_SIZE (256 * MiB)
> +#define KERNEL_MAX_SIZE (128 * MiB)
> +#define INITRD_LOAD_ADDR0x2800
> +#define INITRD_MAX_SIZE (128 * MiB)
>
>  static const char *pnv_chip_core_typename(const PnvChip *o)
>  {


--
Murilo




Re: macOS (Big Sur, Apple Silicon) 'make check' fails in test-crypto-tlscredsx509

2021-01-27 Thread Daniel P . Berrangé
On Tue, Jan 26, 2021 at 04:41:13PM +, Peter Maydell wrote:
> On Tue, 26 Jan 2021 at 16:37, Daniel P. Berrangé  wrote:
> >
> > On Tue, Jan 26, 2021 at 04:32:08PM +, Peter Maydell wrote:
> > > ** (tests/test-crypto-tlscredsx509:35180): CRITICAL **: 16:23:34.590:
> > > Failed to sign certificate ASN1 parser: Value is not valid.
> > > ERROR test-crypto-tlscredsx509 - Bail out! FATAL-CRITICAL: Failed to
> > > sign certificate ASN1 parser: Value is not valid.
> > > make: *** [run-test-70] Error 1
> > >
> > >
> > > Does this failure ring any bells for anybody?
> >
> > Not seen it before.
> >
> > Is this using a gnutls from homebrew, or one that apple
> > ship themselves ?  Any idea what version it is ?
> 
> Homebrew gnutls, 3.6.15.

On further investigation it seems the error comes from libtasn1,
but unfortunately there are 100's of scenarios it could arise
so difficult one to debug.

In the test_tls_generate_cert method in QEMU tests/crypto-tls-x509-helpers.c

There are conditional lines like

if (req->country) {

if (req->altname1) {
...etc...

I guess one, or more of those, is writing data that libtasn1 is not happy
with.

Some one with easy access to this apple silicon will likely need to start
by incrementally disabling each of those conditionals eg.  if (req->country
&& 0)

until we find out which one (might be more than one) make the 

   Failed to sign certificate ASN1 parser: Value is not valid.

error message go away. NB, once that ASN1 error goes away, the QEMU test
suite will likely give its own error because the certs will no longer
have the data it is expecting. 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] IOMMU and ATS not supported by vhost-user filesystem.

2021-01-27 Thread Leonardo Augusto Guimarães Garcia

On 1/27/21 8:19 AM, Stefan Hajnoczi wrote:

On Tue, Jan 26, 2021 at 03:23:38PM -0300, lagar...@linux.ibm.com wrote:

From: Leonardo Garcia 

Currently, as IOMMU and ATS are not supported, if a user mistakenly set
any of them and tries to mount the vhost-user filesystem inside the
guest, whenever the user tries to access the mount point, the system
will hang forever.

Signed-off-by: Leonardo Garcia 
---
  hw/virtio/vhost-user-fs-pci.c | 7 +++
  hw/virtio/vhost-user-fs.c | 5 +
  2 files changed, 12 insertions(+)

diff --git a/hw/virtio/vhost-user-fs-pci.c b/hw/virtio/vhost-user-fs-pci.c
index 2ed8492b3f..564d1fd108 100644
--- a/hw/virtio/vhost-user-fs-pci.c
+++ b/hw/virtio/vhost-user-fs-pci.c
@@ -11,6 +11,8 @@
   * top-level directory.
   */
  
+#include "qemu/osdep.h"

+#include "qapi/error.h"
  #include "qemu/osdep.h"
  #include "hw/qdev-properties.h"
  #include "hw/virtio/vhost-user-fs.h"
@@ -45,6 +47,11 @@ static void vhost_user_fs_pci_realize(VirtIOPCIProxy 
*vpci_dev, Error **errp)
  vpci_dev->nvectors = dev->vdev.conf.num_request_queues + 2;
  }
  
+if (vpci_dev->flags & VIRTIO_PCI_FLAG_ATS) {

+error_setg(errp, "ATS is currently not supported with 
vhost-user-fs-pci");
+return;
+}

Why is this check needed in addition to VIRTIO_F_IOMMU_PLATFORM?



I don't know if VIRTIO_PCI_FLAG_ATS should depend on 
VIRTIO_F_IOMMU_PLATFORM. At least from a code perspective today, they 
are completely independent. A user can specify one or the other or both. 
And if a user specifies VIRTIO_PCI_FLAG_ATS without specifying 
VIRTIO_F_IOMMU_PLATFORM, the same issue described in the original 
message will happen inside the guest.





What needs to be added to support ATS?



Unfortunately I don't know the answer for this question. Hopefully 
someone else can help with this one.






+
  qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
  }
  
diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c

index ac4fc34b36..914d68b3ee 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -203,6 +203,11 @@ static void vuf_device_realize(DeviceState *dev, Error 
**errp)
  return;
  }
  
+if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {

+error_setg(errp, "IOMMU is currently not supported with 
vhost-user-fs");
+return;
+}
+
  if (!vhost_user_init(&fs->vhost_user, &fs->conf.chardev, errp)) {

I thought IOMMU support depends on the vhost-user device backend (e.g.
virtiofsd), so the vhost-user backend should participate in advertising
this feature.

Perhaps the check should be:

 ret = vhost_dev_init(&fs->vhost_dev, &fs->vhost_user,
  VHOST_BACKEND_TYPE_USER, 0);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "vhost_dev_init failed");
 goto err_virtio;
 }
+
+   if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM) &&
+   !(fs->vhost_dev.hdev_features & (1ull << VIRTIO_F_IOMMU_PLATFORM))) {
+   error_setg(errp, "IOMMU is not supported by the vhost-user device 
backend");
+   goto err_iommu_needed;
+   }

Also, can this logic be made generic for all vhost-user devices? It's
not really specific to vhost-user-fs.



Sure, I can do that. I wasn't sure whether this restriction was only for 
vhost-user-fs or whether it was generic for all vhost-user devices. I 
will include this in a next version of the patch.



Cheers,


Leo




Stefan




[PATCH v2] tcg/tci: Fix some unaligned memory accesses

2021-01-27 Thread Stefan Weil
Signed-off-by: Stefan Weil 
---
v2: Use inline functions from qemu/bswap.h instead of memcpy as suggested by 
Richard.

 tcg/tci.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/tcg/tci.c b/tcg/tci.c
index 42354d8ebb..ddbb259e1d 100644
--- a/tcg/tci.c
+++ b/tcg/tci.c
@@ -166,7 +166,11 @@ static uint64_t tci_uint64(uint32_t high, uint32_t low)
 /* Read constant (native size) from bytecode. */
 static tcg_target_ulong tci_read_i(const uint8_t **tb_ptr)
 {
-tcg_target_ulong value = *(const tcg_target_ulong *)(*tb_ptr);
+#if TCG_TARGET_REG_BITS == 32
+tcg_target_ulong value = ldl_he_p(*tb_ptr);
+#elif TCG_TARGET_REG_BITS == 64
+tcg_target_ulong value = ldq_he_p(*tb_ptr);
+#endif
 *tb_ptr += sizeof(value);
 return value;
 }
@@ -174,7 +178,7 @@ static tcg_target_ulong tci_read_i(const uint8_t **tb_ptr)
 /* Read unsigned constant (32 bit) from bytecode. */
 static uint32_t tci_read_i32(const uint8_t **tb_ptr)
 {
-uint32_t value = *(const uint32_t *)(*tb_ptr);
+uint32_t value = ldl_he_p(*tb_ptr);
 *tb_ptr += sizeof(value);
 return value;
 }
@@ -182,7 +186,7 @@ static uint32_t tci_read_i32(const uint8_t **tb_ptr)
 /* Read signed constant (32 bit) from bytecode. */
 static int32_t tci_read_s32(const uint8_t **tb_ptr)
 {
-int32_t value = *(const int32_t *)(*tb_ptr);
+int32_t value = ldl_he_p(*tb_ptr);
 *tb_ptr += sizeof(value);
 return value;
 }
@@ -191,7 +195,7 @@ static int32_t tci_read_s32(const uint8_t **tb_ptr)
 /* Read constant (64 bit) from bytecode. */
 static uint64_t tci_read_i64(const uint8_t **tb_ptr)
 {
-uint64_t value = *(const uint64_t *)(*tb_ptr);
+uint64_t value = ldq_he_p(*tb_ptr);
 *tb_ptr += sizeof(value);
 return value;
 }
@@ -1129,7 +1133,7 @@ uintptr_t QEMU_DISABLE_CFI tcg_qemu_tb_exec(CPUArchState 
*env,
 /* QEMU specific operations. */
 
 case INDEX_op_exit_tb:
-ret = *(uint64_t *)tb_ptr;
+ret = ldq_he_p(tb_ptr);
 goto exit;
 break;
 case INDEX_op_goto_tb:
-- 
2.29.2




Re: macOS (Big Sur, Apple Silicon) 'make check' fails in test-crypto-tlscredsx509

2021-01-27 Thread Christian Schoenebeck
On Mittwoch, 27. Januar 2021 13:17:23 CET Daniel P. Berrangé wrote:
> On Tue, Jan 26, 2021 at 04:41:13PM +, Peter Maydell wrote:
> > On Tue, 26 Jan 2021 at 16:37, Daniel P. Berrangé  
wrote:
> > > On Tue, Jan 26, 2021 at 04:32:08PM +, Peter Maydell wrote:
> > > > ** (tests/test-crypto-tlscredsx509:35180): CRITICAL **: 16:23:34.590:
> > > > Failed to sign certificate ASN1 parser: Value is not valid.
> > > > ERROR test-crypto-tlscredsx509 - Bail out! FATAL-CRITICAL: Failed to
> > > > sign certificate ASN1 parser: Value is not valid.
> > > > make: *** [run-test-70] Error 1
> > > > 
> > > > 
> > > > Does this failure ring any bells for anybody?
> > > 
> > > Not seen it before.
> > > 
> > > Is this using a gnutls from homebrew, or one that apple
> > > ship themselves ?  Any idea what version it is ?
> > 
> > Homebrew gnutls, 3.6.15.
> 
> On further investigation it seems the error comes from libtasn1,
> but unfortunately there are 100's of scenarios it could arise
> so difficult one to debug.

I haven't looked into the relevant code of this discussion, but is the failing 
code dealing with Apple certificates? Because Apple just announced a switch of 
one of their intermediate certificates, so just in case this might be related.

Best regards,
Christian Schoenebeck





Re: macOS (Big Sur, Apple Silicon) 'make check' fails in test-crypto-tlscredsx509

2021-01-27 Thread Daniel P . Berrangé
On Wed, Jan 27, 2021 at 01:35:37PM +0100, Christian Schoenebeck wrote:
> On Mittwoch, 27. Januar 2021 13:17:23 CET Daniel P. Berrangé wrote:
> > On Tue, Jan 26, 2021 at 04:41:13PM +, Peter Maydell wrote:
> > > On Tue, 26 Jan 2021 at 16:37, Daniel P. Berrangé  
> wrote:
> > > > On Tue, Jan 26, 2021 at 04:32:08PM +, Peter Maydell wrote:
> > > > > ** (tests/test-crypto-tlscredsx509:35180): CRITICAL **: 16:23:34.590:
> > > > > Failed to sign certificate ASN1 parser: Value is not valid.
> > > > > ERROR test-crypto-tlscredsx509 - Bail out! FATAL-CRITICAL: Failed to
> > > > > sign certificate ASN1 parser: Value is not valid.
> > > > > make: *** [run-test-70] Error 1
> > > > > 
> > > > > 
> > > > > Does this failure ring any bells for anybody?
> > > > 
> > > > Not seen it before.
> > > > 
> > > > Is this using a gnutls from homebrew, or one that apple
> > > > ship themselves ?  Any idea what version it is ?
> > > 
> > > Homebrew gnutls, 3.6.15.
> > 
> > On further investigation it seems the error comes from libtasn1,
> > but unfortunately there are 100's of scenarios it could arise
> > so difficult one to debug.
> 
> I haven't looked into the relevant code of this discussion, but is the 
> failing 
> code dealing with Apple certificates? Because Apple just announced a switch 
> of 
> one of their intermediate certificates, so just in case this might be related.

It shouldn't be affected. The test suite is creating its own self-signed
root CA, and cert tree under that, so everything should be self contained.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 4/4] meson: Warn when TCI is selected but TCG backend is available

2021-01-27 Thread Alex Bennée


Daniel P. Berrangé  writes:

> On Tue, Jan 26, 2021 at 05:24:10PM +, Alex Bennée wrote:
>> 
>> Stefan Weil  writes:
>> 
>> > Am 25.01.21 um 23:35 schrieb Richard Henderson:
>> >> On 1/25/21 11:02 AM, Stefan Weil wrote:
>> >>> Am 25.01.21 um 20:02 schrieb Richard Henderson:
>>  On 1/25/21 8:58 AM, Stefan Weil wrote:
>> > I have no evidence that TCI is less reliable than TCG, so I would not 
>> > write
>> > that.
>>  It can't pass make check-tcg.
>> >>> Where does it fail? Maybe an expected timeout problem which can be 
>> >>> solved by
>> >>> increasing the timeouts for TCI?
>> >>>
>> >>> I have just run a local test of `make check-tcg` with native TCG and 
>> >>> with TCI
>> >>> and did not see a difference. But I noticed that in both cases many 
>> >>> tests show
>> >>> "skipped".
>> >> You need to enable docker or podman for your development, so that you get 
>> >> all
>> >> of the cross-compilers.
>> >>
>> >> Then:
>> >>
>> >>TESTfcvt on arm
>> >> TODO ../qemu/tcg/tci.c:614: tcg_qemu_tb_exec()
>> >> ../qemu/tcg/tci.c:614: tcg fatal error
>> >> qemu: uncaught target signal 11 (Segmentation fault) - core dumped
>> >>
>> >>TESTfloat_convs on m68k
>> >> TODO ../qemu/tcg/tci.c:614: tcg_qemu_tb_exec()
>> >> ../qemu/tcg/tci.c:614: tcg fatal error
>> >> qemu: uncaught target signal 11 (Segmentation fault) - core dumped
>> >>
>> >> which is of course one of the TODO assertions.
>> >> It's positively criminal those still exist in the code.
>> >
>> >
>> > I installed podman and repeated `make check-tcg`. The log file still 
>> > shows 87 lines with "SKIPPED". There is also a gdb core dump, several 
>> > warnings, but nothing related to TCI. Both tests cited above seem to 
>> > work without a problem.
>> 
>> I'm going to go out on a limb and guess you have commit:
>> 
>>   23a77b2d18 (build-system: clean up TCG/TCI configury)
>> 
>> which temporarily has the effect of disabling TCI. See
>> 
>>   Subject: Re: [PATCH v4 1/4] configure: Fix --enable-tcg-interpreter
>>   From: Paolo Bonzini 
>>   Message-ID: <2b8b6291-b54c-b285-ae38-21f067a84...@redhat.com>
>>   Date: Mon, 25 Jan 2021 17:36:42 +0100
>> 
>> with that fix fixed I see the same failures as Richard:
>> 
>>   ./qemu-arm ./tests/tcg/arm-linux-user/fcvt > /dev/null
>>   TODO ../../tcg/tci.c:614: tcg_qemu_tb_exec()
>>   ../../tcg/tci.c:614: tcg fatal error
>>   qemu: uncaught target signal 11 (Segmentation fault) - core dumped
>>   fish: “./qemu-arm ./tests/tcg/arm-linu…” terminated by signal SIGSEGV 
>> (Address boundary error)
>> 
>> which does raise the question before today when was the last time anyone
>> attempted to run check-tcg on this?
>> 
>> > The complete log file is available from 
>> > https://qemu.weilnetz.de/test/check-tcg.txt.
>> >
>> > Daniel, regarding your comment: TCI has 100 % test coverage for the 
>> > productive code lines.
>> 
>> By what tests? The fact you don't hit asserts in your day to day testing
>> doesn't mean there are features missing that are easily tripped up or
>> that TCI got it right.
>> 
>> > All code lines which were never tested raise an 
>> > assertion, so can easily be identified (and fixed as soon as there is a 
>> > test case which triggers such an assertion). The known deficits are 
>> > speed, missing TCG opcodes, unimplemented TCG opcodes because of missing 
>> > test cases and missing support for some host architectures.
>> 
>> Passing check-tcg would be a minimum for me.
>
> Passing check-tcg *in gitlab CI* would be the minimum to consider
> it on a par with TCG.
>
> The lack of automated GitLab CI for TCI is a reason my proposed wording
> described TCI as less reliable than native TCG. We can't claim it has
> equivalent reliability unless we have equiv automated testing of TCI.

I should point out that check-tcg is hardly a comprehensive test suite.
Most of our instruction testing for example tends to be done with RISU.
Any program that attempts to use vector instructions is likely to come a
cropper with TCI. I guess we don't even attempt to run check-acceptance
due to speed issues but it would be interesting to see how far it gets.

>
> Regards,
> Daniel


-- 
Alex Bennée



Re: [PATCH v5 00/11] virtio-mem: vfio support

2021-01-27 Thread Michael S. Tsirkin
On Thu, Jan 21, 2021 at 12:05:29PM +0100, David Hildenbrand wrote:
> A virtio-mem device manages a memory region in guest physical address
> space, represented as a single (currently large) memory region in QEMU,
> mapped into system memory address space. Before the guest is allowed to use
> memory blocks, it must coordinate with the hypervisor (plug blocks). After
> a reboot, all memory is usually unplugged - when the guest comes up, it
> detects the virtio-mem device and selects memory blocks to plug (based on
> resize requests from the hypervisor).
> 
> Memory hot(un)plug consists of (un)plugging memory blocks via a virtio-mem
> device (triggered by the guest). When unplugging blocks, we discard the
> memory - similar to memory balloon inflation. In contrast to memory
> ballooning, we always know which memory blocks a guest may actually use -
> especially during a reboot, after a crash, or after kexec (and during
> hibernation as well). Guests agreed to not access unplugged memory again,
> especially not via DMA.
> 
> The issue with vfio is, that it cannot deal with random discards - for this
> reason, virtio-mem and vfio can currently only run mutually exclusive.
> Especially, vfio would currently map the whole memory region (with possible
> only little/no plugged blocks), resulting in all pages getting pinned and
> therefore resulting in a higher memory consumption than expected (turning
> virtio-mem basically useless in these environments).
> 
> To make vfio work nicely with virtio-mem, we have to map only the plugged
> blocks, and map/unmap properly when plugging/unplugging blocks (including
> discarding of RAM when unplugging). We achieve that by using a new notifier
> mechanism that communicates changes.

series

Acked-by: Michael S. Tsirkin 

virtio bits

Reviewed-by: Michael S. Tsirkin 

This needs to go through vfio tree I assume.


> It's important to map memory in the granularity in which we could see
> unmaps again (-> virtio-mem block size) - so when e.g., plugging
> consecutive 100 MB with a block size of 2 MB, we need 50 mappings. When
> unmapping, we can use a single vfio_unmap call for the applicable range.
> We expect that the block size of virtio-mem devices will be fairly large
> in the future (to not run out of mappings and to improve hot(un)plug
> performance), configured by the user, when used with vfio (e.g., 128MB,
> 1G, ...), but it will depend on the setup.
> 
> More info regarding virtio-mem can be found at:
> https://virtio-mem.gitlab.io/
> 
> v5 is located at:
>   g...@github.com:davidhildenbrand/qemu.git virtio-mem-vfio-v5
> 
> v4 -> v5:
> - "vfio: Support for RamDiscardMgr in the !vIOMMU case"
> -- Added more assertions for granularity vs. iommu supported pagesize
> - "vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr"
> -- Fix accounting of mappings
> - "vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus"
> -- Fence off SPAPR and add some comments regarding future support.
> -- Tweak patch description
> - Rebase and retest
> 
> v3 -> v4:
> - "vfio: Query and store the maximum number of DMA mappings
> -- Limit the patch to querying and storing only
> -- Renamed to "vfio: Query and store the maximum number of possible DMA
>mappings"
> - "vfio: Support for RamDiscardMgr in the !vIOMMU case"
> -- Remove sanity checks / warning the user
> - "vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr"
> -- Perform sanity checks by looking at the number of memslots and all
>registered RamDiscardMgr sections
> - Rebase and retest
> - Reshuffled the patches slightly
> 
> v2 -> v3:
> - Rebased + retested
> - Fixed some typos
> - Added RB's
> 
> v1 -> v2:
> - "memory: Introduce RamDiscardMgr for RAM memory regions"
> -- Fix some errors in the documentation
> -- Make register_listener() notify about populated parts and
>unregister_listener() notify about discarding populated parts, to
>simplify future locking inside virtio-mem, when handling requests via a
>separate thread.
> - "vfio: Query and store the maximum number of DMA mappings"
> -- Query number of mappings and track mappings (except for vIOMMU)
> - "vfio: Support for RamDiscardMgr in the !vIOMMU case"
> -- Adapt to RamDiscardMgr changes and warn via generic DMA reservation
> - "vfio: Support for RamDiscardMgr in the vIOMMU case"
> -- Use vmstate priority to handle migration dependencies
> 
> RFC - v1:
> - VFIO migration code. Due to missing kernel support, I cannot really test
>   if that part works.
> - Understand/test/document vIOMMU implications, also regarding migration
> - Nicer ram_block_discard_disable/require handling.
> - s/SparseRAMHandler/RamDiscardMgr/, refactorings, cleanups, documentation,
>   testing, ...
> 
> David Hildenbrand (11):
>   memory: Introduce RamDiscardMgr for RAM memory regions
>   virtio-mem: Factor out traversing unplugged ranges
>   virtio-mem: Implement RamDiscardMgr interface
>   vfio: Support for RamDiscardMgr in the !vIOMMU

Re: [PATCH] vfio/migrate: Move switch of dirty tracking into vfio_memory_listener

2021-01-27 Thread Paolo Bonzini

On 11/01/21 08:34, Keqian Zhu wrote:

+static void vfio_listener_log_start(MemoryListener *listener,
+MemoryRegionSection *section,
+int old, int new)
+{
+VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+
+vfio_set_dirty_page_tracking(container, true);
+}
+
+static void vfio_listener_log_stop(MemoryListener *listener,
+   MemoryRegionSection *section,
+   int old, int new)
+{
+VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+
+vfio_set_dirty_page_tracking(container, false);
+}
+


This would enable dirty page tracking also just for having a framebuffer 
(DIRTY_MEMORY_VGA).  Technically it would be correct, but it would also 
be more heavyweight than expected.


In order to only cover live migration, you can use the log_global_start 
and log_global_stop callbacks instead.


If you want to use log_start and log_stop, you need to add respectively

if (old != 0) {
return;
}

and

if (new != 0) {
return;
}

before the calls to vfio_set_dirty_page_tracking.  But I think it's more 
appropriate for VFIO to use log_global_*.


Thanks,

Paolo




Re: [PATCH v4 0/2] System Generation ID driver and VMGENID backend

2021-01-27 Thread Michael S. Tsirkin
On Thu, Jan 21, 2021 at 10:28:16AM +, Catangiu, Adrian Costin wrote:
> On 12/01/2021, 14:49, "Michael S. Tsirkin"  wrote:
> 
> On Tue, Jan 12, 2021 at 02:15:58PM +0200, Adrian Catangiu wrote:
> > The first patch in the set implements a device driver which exposes a
> > read-only device /dev/sysgenid to userspace, which contains a
> > monotonically increasing u32 generation counter. Libraries and
> > applications are expected to open() the device, and then call read()
> > which blocks until the SysGenId changes. Following an update, read()
> > calls no longer block until the application acknowledges the new
> > SysGenId by write()ing it back to the device. Non-blocking read() calls
> > return EAGAIN when there is no new SysGenId available. Alternatively,
> > libraries can mmap() the device to get a single shared page which
> > contains the latest SysGenId at offset 0.
> 
> Looking at some specifications, the gen ID might actually be located
> at an arbitrary address. How about instead of hard-coding the offset,
> we expose it e.g. in sysfs?
> 
> The functionality is split between SysGenID which exposes an internal u32
> counter to userspace, and an (optional) VmGenID backend which drives
> SysGenID generation changes based on hw vmgenid updates.
> 
> The hw UUID you're referring to (vmgenid) is not mmap-ed to userspace or
> otherwise exposed to userspace. It is only used internally by the vmgenid
> driver to find out about VM generation changes and drive the more generic
> SysGenID.
> 
> The SysGenID u32 monotonic increasing counter is the one that is mmaped to
> userspace, but it is a software counter. I don't see any value in using a 
> dynamic
> offset in the mmaped page. Offset 0 is fast and easy and most importantly it 
> is
> static so no need to dynamically calculate or find it at runtime.

Well you are burning a whole page on it, using an offset the page
can be shared with other functionality.

> Thanks,
> Adrian.
> 
> 
> 
> 
> Amazon Development Center (Romania) S.R.L. registered office: 27A Sf. Lazar 
> Street, UBC5, floor 2, Iasi, Iasi County, 700045, Romania. Registered in 
> Romania. Registration number J22/2621/2005.




Re: [PATCH 1/6] qapi: Add interfaces for alias support to Visitor

2021-01-27 Thread Markus Armbruster
Kevin Wolf  writes:

> This adds functions to the Visitor interface that can be used to define
> aliases and alias scopes.
>
> Signed-off-by: Kevin Wolf 
> ---
>  include/qapi/visitor-impl.h | 12 
>  include/qapi/visitor.h  | 37 +
>  qapi/qapi-visit-core.c  | 21 +
>  3 files changed, 70 insertions(+)
>
> diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
> index 7362c043be..e30da2599c 100644
> --- a/include/qapi/visitor-impl.h
> +++ b/include/qapi/visitor-impl.h
> @@ -113,6 +113,18 @@ struct Visitor
> The core takes care of the return type in the public interface. */
>  void (*optional)(Visitor *v, const char *name, bool *present);
>  
> +/*
> + * Optional; intended for input visitors. If not given, aliases are
> + * ignored.
> + */
> +void (*define_alias)(Visitor *v, const char *alias, const char **source);
> +
> +/* Must be set if define_alias is set */
> +void (*start_alias_scope)(Visitor *v);
> +
> +/* Must be set if define_alias is set */
> +void (*end_alias_scope)(Visitor *v);
> +
>  /* Must be set */
>  VisitorType type;
>  
> diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
> index ebc19ede7f..9bdc0ee03d 100644
> --- a/include/qapi/visitor.h
> +++ b/include/qapi/visitor.h
> @@ -459,6 +459,43 @@ void visit_end_alternate(Visitor *v, void **obj);
>   */
>  bool visit_optional(Visitor *v, const char *name, bool *present);
>  
> +/*
> + * Defines a new alias rule.
> + *
> + * If @alias is non-NULL, the member named @alias in the external
> + * representation of the current struct is defined as an alias for the

Terminology: the big comment uses "object".  See also the FIXME in
visit_start_struct()'s contract.

> + * member described by @source.
> + *
> + * If @alias is NULL, all members of the struct described by @source are
> + * considered to have alias members with the same key in the current
> + * struct.

Define "the current struct".  I believe it's the object being visited.

What happens if we're currently visiting something other than an object,
i.e. the root of a tree, or a list?

> + *
> + * @source is a NULL-terminated array of names that describe the path to
> + * a member, starting from the currently visited struct.

I'm afraid "describe the path to a member" is too vague.  How?

I figure this is what you have in mind:

cur = the currently visited object
for s in source:
cur = the member of cur denoted by s

When @cur is indeed an object, then "the member denoted by @s" makes
sense: you must pass a name when visiting object members, and whatever
is visited with name @s is the member denoted by @s.

"Must pass a name" is documented in the big comment:

 * The @name parameter of visit_type_FOO() describes the relation
 * between this QAPI value and its parent container.  When visiting
 * the root of a tree, @name is ignored; when visiting a member of an
 * object, @name is the key associated with the value; when visiting a
 * member of a list, @name is NULL; and when visiting the member of an
 * alternate, @name should equal the name used for visiting the
 * alternate.

But what if @cur is a list?  I guess that makes no sense.  Say so
explicitly, please.

> + *
> + * The alias stays valid until the current alias scope ends.
> + * visit_start/end_struct() implicitly start/end an alias scope.
> + * Additionally, visit_start/end_alias_scope() can be used to explicitly
> + * create a nested alias scope.
> + */
> +void visit_define_alias(Visitor *v, const char *alias, const char **source);
> +
> +/*
> + * Begins an explicit alias scope.
> + *
> + * Alias definitions after here will only stay valid until the
> + * corresponding visit_end_alias_scope() is called.
> + */
> +void visit_start_alias_scope(Visitor *v);
> +
> +/*
> + * Ends an explicit alias scope.
> + *
> + * Alias definitions between the correspoding visit_start_alias_scope()
> + * call and here go out of scope and won't apply in later code any more.
> + */
> +void visit_end_alias_scope(Visitor *v);
> +
>  /*
>   * Visit an enum value.
>   *
[...]




Re: [PATCH v3] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Miklos Szeredi
On Wed, Jan 27, 2021 at 12:21 PM Stefan Hajnoczi  wrote:
  }
> @@ -1654,9 +1677,11 @@ static void update_open_flags(int writeback, int 
> allow_direct_io,
>  static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
>mode_t mode, struct fuse_file_info *fi)
>  {
> +int open_flags = (fi->flags | O_CREAT) & ~O_NOFOLLOW;
>  int fd;
>  struct lo_data *lo = lo_data(req);
>  struct lo_inode *parent_inode;
> +struct lo_inode *existing_inode = NULL;
>  struct fuse_entry_param e;
>  int err;
>  struct lo_cred old = {};
> @@ -1682,11 +1707,23 @@ static void lo_create(fuse_req_t req, fuse_ino_t 
> parent, const char *name,
>
>  update_open_flags(lo->writeback, lo->allow_direct_io, fi);
>
> -fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & ~O_NOFOLLOW,
> -mode);
> +/* First, try to create a new file but don't open existing files */
> +fd = openat(parent_inode->fd, name, open_flags | O_EXCL, mode);
>  err = fd == -1 ? errno : 0;
> +
>  lo_restore_cred(&old);
>
> +/* Second, open existing files if O_EXCL was not specified */
> +if (err == EEXIST && !(fi->flags & O_EXCL)) {
> +existing_inode = lookup_name(req, parent, name);
> +if (existing_inode) {
> +fd = lo_inode_open(lo, existing_inode, open_flags);
> +if (fd < 0) {
> +err = -fd;
> +}
> +}
> +}
> +
>  if (!err) {
>  ssize_t fh;

It's more of a mess than I thought.

The problem here is there can also be a race between the open and the
subsequent lo_do_lookup().

At this point it's probably enough to verify that fuse_entry_param
refers to the same object as the fh (using fstat and comparing st_dev
and st_ino).

Also O_CREAT open is not supposed to return ENOENT, so failure to open
without O_CREAT (race between O_CREAT open and plain open) should at
least translate error to ESTALE or EIO.

Thanks,
Miklos




Re: [PATCH 2/6] qapi: Remember alias definitions in qobject-input-visitor

2021-01-27 Thread Markus Armbruster
Kevin Wolf  writes:

> This makes qobject-input-visitor remember the currently valid aliases in
> each StackObject. It doesn't actually allow using the aliases yet.
>
> Signed-off-by: Kevin Wolf 
> ---
>  qapi/qobject-input-visitor.c | 115 +++
>  1 file changed, 115 insertions(+)
>
> diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c
> index 23843b242e..a00ac32682 100644
> --- a/qapi/qobject-input-visitor.c
> +++ b/qapi/qobject-input-visitor.c
> @@ -29,6 +29,29 @@
>  #include "qemu/cutils.h"
>  #include "qemu/option.h"
>  
> +typedef struct InputVisitorAlias {
> +   /* Alias name. NULL for any (wildcard alias). */
> +const char *alias;
> +
> +/*
> + * NULL terminated array representing a path.
> + * Must contain at least one non-NULL element if alias is not NULL.

What does .alias = NULL, .src[] empty mean?

The previous patch's contract for visit_define_alias():

   /*
* Defines a new alias rule.
*
* If @alias is non-NULL, the member named @alias in the external
* representation of the current struct is defined as an alias for the
* member described by @source.
*
* If @alias is NULL, all members of the struct described by @source are
* considered to have alias members with the same key in the current
* struct.
*
* @source is a NULL-terminated array of names that describe the path to
* a member, starting from the currently visited struct.
*
* The alias stays valid until the current alias scope ends.
* visit_start/end_struct() implicitly start/end an alias scope.
* Additionally, visit_start/end_alias_scope() can be used to explicitly
* create a nested alias scope.
*/

If I read this correctly, then empty .src[] denotes "the current
struct", and therefore .alias = NULL, .src[] empty means "all members of
the current struct are considered to have alias members with the same
key in the current struct".  Which is be a complicated way to accomplish
nothing.

Am I confused?

> + */
> +const char **src;
> +
> +/* StackObject in which the alias should be looked for */
> +struct StackObject *alias_so;

Clear as mud.  Is it "the current struct"?  If not, what else?  Perhaps
an example would help me understand.

> +
> +/*
> + * The alias remains valid as long as the containing StackObject has

What's "the containing StackObject"?  I guess it's the one that has this
InputVisitorAlias in .aliases.

> + * StackObject.alias_scope_nesting >= InputVisitorAlias.scope_nesting
> + * or until the whole StackObject is removed.
> + */
> +int scope_nesting;
> +
> +QSLIST_ENTRY(InputVisitorAlias) next;
> +} InputVisitorAlias;
> +
>  typedef struct StackObject {
>  const char *name;/* Name of @obj in its parent, if any */
>  QObject *obj;/* QDict or QList being visited */
> @@ -38,6 +61,9 @@ typedef struct StackObject {
>  const QListEntry *entry;/* If @obj is QList: unvisited tail */
>  unsigned index; /* If @obj is QList: list index of @entry */
>  
> +QSLIST_HEAD(, InputVisitorAlias) aliases;
> +int alias_scope_nesting;/* Increase on scope start, decrease on end 
> */

I get what the comment means, but imperative mood is odd for a variable.
"Number of open scopes", perhaps?

> +
>  QSLIST_ENTRY(StackObject) node; /* parent */
>  } StackObject;
>  

I'm afraid I'm too confused about the interface (previous patch) and the
data structures to continue review with reasonable efficiency.  I don't
mean to imply either is bad!  I'm just confused, that's all.

[...]




[PATCH] hw/block/nvme: drain namespaces on sq deletion

2021-01-27 Thread Klaus Jensen
From: Klaus Jensen 

For most commands, when issuing an AIO, the BlockAIOCB is stored in the
NvmeRequest aiocb pointer when the AIO is issued. The purpose of storing
this is to allow the AIO to be cancelled when deleting submission
queues (it is currently not used for Abort).

Since the addition of the Dataset Management command and Zoned
Namespaces, NvmeRequests may involve more than one AIO and the AIOs are
issued without saving a reference to the BlockAIOCB. This is a problem
since nvme_del_sq will attempt to cancel outstanding AIOs, potentially
with an invalid BlockAIOCB.

Fix this by instead of explicitly cancelling the requests, just allow
the AIOs to complete by draining the namespace blockdevs.

Signed-off-by: Klaus Jensen 
---
 hw/block/nvme.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 316858fd8adf..91f6fb6da1e2 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -403,6 +403,7 @@ static void nvme_req_clear(NvmeRequest *req)
 {
 req->ns = NULL;
 req->opaque = NULL;
+req->aiocb = NULL;
 memset(&req->cqe, 0x0, sizeof(req->cqe));
 req->status = NVME_SUCCESS;
 }
@@ -2396,6 +2397,7 @@ static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeRequest *req)
 NvmeSQueue *sq;
 NvmeCQueue *cq;
 uint16_t qid = le16_to_cpu(c->qid);
+int i;
 
 if (unlikely(!qid || nvme_check_sqid(n, qid))) {
 trace_pci_nvme_err_invalid_del_sq(qid);
@@ -2404,12 +2406,18 @@ static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeRequest 
*req)
 
 trace_pci_nvme_del_sq(qid);
 
-sq = n->sq[qid];
-while (!QTAILQ_EMPTY(&sq->out_req_list)) {
-r = QTAILQ_FIRST(&sq->out_req_list);
-assert(r->aiocb);
-blk_aio_cancel(r->aiocb);
+for (i = 1; i <= n->num_namespaces; i++) {
+NvmeNamespace *ns = nvme_ns(n, i);
+if (!ns) {
+continue;
+}
+
+nvme_ns_drain(ns);
 }
+
+sq = n->sq[qid];
+assert(QTAILQ_EMPTY(&sq->out_req_list));
+
 if (!nvme_check_cqid(n, sq->cqid)) {
 cq = n->cq[sq->cqid];
 QTAILQ_REMOVE(&cq->sq_list, sq, entry);
-- 
2.30.0




Re: [Virtio-fs] [PATCH v2] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Greg Kurz
On Wed, 27 Jan 2021 11:34:52 +0100
Miklos Szeredi  wrote:

> On Wed, Jan 27, 2021 at 11:20 AM Greg Kurz  wrote:
> >
> > On Wed, 27 Jan 2021 10:25:28 +0100
> > Miklos Szeredi  wrote:
> >
> > > On Tue, Jan 26, 2021 at 6:18 PM Greg Kurz  wrote:
> > > >
> > > > On Tue, 26 Jan 2021 10:35:02 +
> > > > Stefan Hajnoczi  wrote:
> > >
> > > > The patch looks pretty good to me. It just seems to be missing a change 
> > > > in
> > > > lo_create():
> > > >
> > > > fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & 
> > > > ~O_NOFOLLOW,
> > > > mode);
> > > >
> > > > A malicious guest could have created anything called ${name} in this 
> > > > directory
> > > > before calling FUSE_CREATE and we'll open it blindly, or I'm missing 
> > > > something ?
> > >
> > > Right, this seems like an omission.
> > >
> > > Also the "& ~O_NOFOLLOW" looks like a copy-paste bug, since unlike
> > > lo_open(), lo_create() is not opening a proc symlink.
> > >
> > > So that should be replaced with "| O_NOFOLLOW"
> > >
> >
> >
> > Yes, I've realized that later on. We should definitely enforce O_NOFOLLOW
> > to avoid symlink escapes.
> >
> > Then comes the case of special files... A well-known case is the FIFO that
> > causes openat() to block as described in my response. FWIW, we addressed
> > this one in 9P by adding O_NONBLOCK and fixing the flags to the client
> > expectation with fcntl(F_SETFL). But this is just a protection against
> > being blocked. Blindly opening a special file can lead to any kind of
> > troubles you can think of... so it really looks that the only sane way
> > to be safe from such an attack is to forbid openat() of special files at
> > the filesystem level.
> 
> Another solution specifically for O_CREAT without O_EXCL would be to
> turn it into an exclusive create.  

Would this added O_EXCL then appear on the client side, e.g. to
guest userspace doing fcntl(F_GETFL) ?

> If that fails with EEXIST then try
> the normal open path (open with O_PATH, fstat, open proc symlink).  If

open(O_PATH | O_NOFOLLOW) + fstatat(AT_EMPTY_PATH|AT_SYMLINK_NOFOLLOW)
would indeed allow to filter out anything that isn't a directory and
to safely open the proc symlink.

> that fails with ENOENT, then retry the whole thing a certain number of

Indeed someone could have unlinked the file in the meantime, in which
case the open(O_PATH | O_NOFOLLOW) would fail, but if it succeeds then
we cannot hit ENOENT anymore AFAICT.

> times.  If it still fails then somebody is definitely messing with us
> and we can fail the request with EIO.
> 

Not sure what the retry+timeout is trying to mitigate here... why not
returning EIO right away ?


> Rather ugly, but I can't think of anything better.
> 
> Thanks,
> Miklos
> 




Re: [PATCH 3/6] qapi: Simplify full_name_nth() in qobject-input-visitor

2021-01-27 Thread Markus Armbruster
Kevin Wolf  writes:

> Instead of counting how many elements from the top of the stack we need
> to ignore until we find the thing we're interested in, we can just
> directly pass the StackObject pointer because all callers already know
> it.
>
> We only need a different way now to tell if we want to know the name of
> something contained in the given StackObject or of the StackObject
> itself. Passing name = NULL is the obvious way to request the latter.
>
> This simplifies the interface and makes it easier to use in cases where
> we have the StackObject, but don't know how many steps down the stack it
> is.
>
> Signed-off-by: Kevin Wolf 
> ---
>  qapi/qobject-input-visitor.c | 38 ++--
>  1 file changed, 19 insertions(+), 19 deletions(-)
>
> diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c
> index a00ac32682..1415561828 100644
> --- a/qapi/qobject-input-visitor.c
> +++ b/qapi/qobject-input-visitor.c
> @@ -87,20 +87,16 @@ static QObjectInputVisitor *to_qiv(Visitor *v)
>  }
>  
>  /*
> - * Find the full name of something @qiv is currently visiting.
> - * @qiv is visiting something named @name in the stack of containers
> - * @qiv->stack.
> - * If @n is zero, return its full name.
> - * If @n is positive, return the full name of the @n-th container
> - * counting from the top.  The stack of containers must have at least
> - * @n elements.
> - * The returned string is valid until the next full_name_nth(@v) or
> - * destruction of @v.
> + * Find the full name of something named @name in @so which @qiv is
> + * currently visiting.  If @name is NULL, find the full name of @so
> + * itself.
> + *
> + * The returned string is valid until the next full_name_so(@qiv) or
> + * destruction of @qiv.

How can this distinguish between a list and its member?

Before the patch:

* list member: n = 0, name = NULL
* list: n = 1, name = NULL

Afterwards?

Checking... yes, regression.  Test case:

{"execute": "blockdev-add", "arguments": {"driver": "file", "node-name": 
"blk0", "filename": "tmp.img"}}
{"return": {}}
{"execute": "blockdev-add", "arguments": {"driver": "blkdebug", 
"node-name": "blk1", "image": "blk0", "take-child-perms": [0]}}
{"error": {"class": "GenericError", "desc": "Invalid parameter type for 
'take-child-perms', expected: string"}}

The second command's reply changes from

{"error": {"class": "GenericError", "desc": "Invalid parameter type for 
'take-child-perms[0]', expected: string"}}

to

{"error": {"class": "GenericError", "desc": "Invalid parameter type for 
'take-child-perms', expected: string"}}

The idea of using @so instead of @n may be salvagable.

[...]




Re: [Virtio-fs] [PATCH v2] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Miklos Szeredi
On Wed, Jan 27, 2021 at 2:49 PM Greg Kurz  wrote:
>
> On Wed, 27 Jan 2021 11:34:52 +0100
> Miklos Szeredi  wrote:

> > Another solution specifically for O_CREAT without O_EXCL would be to
> > turn it into an exclusive create.
>
> Would this added O_EXCL then appear on the client side, e.g. to
> guest userspace doing fcntl(F_GETFL) ?

No.  Guest kernel keeps track of open flags.

> > If that fails with EEXIST then try
> > the normal open path (open with O_PATH, fstat, open proc symlink).  If
>
> open(O_PATH | O_NOFOLLOW) + fstatat(AT_EMPTY_PATH|AT_SYMLINK_NOFOLLOW)
> would indeed allow to filter out anything that isn't a directory and
> to safely open the proc symlink.
>
> > that fails with ENOENT, then retry the whole thing a certain number of
>
> Indeed someone could have unlinked the file in the meantime, in which
> case the open(O_PATH | O_NOFOLLOW) would fail, but if it succeeds then
> we cannot hit ENOENT anymore AFAICT.

Right.

> > times.  If it still fails then somebody is definitely messing with us
> > and we can fail the request with EIO.
> >
>
> Not sure what the retry+timeout is trying to mitigate here... why not
> returning EIO right away ?

The semantics of O_CREATE are that it can fail neither because the
file exists nor because it doesn't.  This doesn't matter if the
exported tree is not modified outside of a single guest, because of
locking provided by the guest kernel.

However if we want to support shared access to a tree then O_CREAT
semantics should work even in the face of races due to external
modification of the tree.  I.e. following race:

virtiofsd: open(foo, O_CREAT | O_EXCL) -> EEXIST
other task: unlink(foo)
virtiofsd: open(foo, O_PATH | O_NOFOLLOW) -> ENOENT

To properly support the above the O_CREAT | O_EXCL open would need to
be retried.

Thanks,
Miklos




Re: [PATCH v3] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Stefan Hajnoczi
On Wed, Jan 27, 2021 at 02:01:54PM +0100, Miklos Szeredi wrote:
> On Wed, Jan 27, 2021 at 12:21 PM Stefan Hajnoczi  wrote:
>   }
> > @@ -1654,9 +1677,11 @@ static void update_open_flags(int writeback, int 
> > allow_direct_io,
> >  static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
> >mode_t mode, struct fuse_file_info *fi)
> >  {
> > +int open_flags = (fi->flags | O_CREAT) & ~O_NOFOLLOW;
> >  int fd;
> >  struct lo_data *lo = lo_data(req);
> >  struct lo_inode *parent_inode;
> > +struct lo_inode *existing_inode = NULL;
> >  struct fuse_entry_param e;
> >  int err;
> >  struct lo_cred old = {};
> > @@ -1682,11 +1707,23 @@ static void lo_create(fuse_req_t req, fuse_ino_t 
> > parent, const char *name,
> >
> >  update_open_flags(lo->writeback, lo->allow_direct_io, fi);
> >
> > -fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & 
> > ~O_NOFOLLOW,
> > -mode);
> > +/* First, try to create a new file but don't open existing files */
> > +fd = openat(parent_inode->fd, name, open_flags | O_EXCL, mode);
> >  err = fd == -1 ? errno : 0;
> > +
> >  lo_restore_cred(&old);
> >
> > +/* Second, open existing files if O_EXCL was not specified */
> > +if (err == EEXIST && !(fi->flags & O_EXCL)) {
> > +existing_inode = lookup_name(req, parent, name);
> > +if (existing_inode) {
> > +fd = lo_inode_open(lo, existing_inode, open_flags);
> > +if (fd < 0) {
> > +err = -fd;
> > +}
> > +}
> > +}
> > +
> >  if (!err) {
> >  ssize_t fh;
> 
> It's more of a mess than I thought.
> 
> The problem here is there can also be a race between the open and the
> subsequent lo_do_lookup().
> 
> At this point it's probably enough to verify that fuse_entry_param
> refers to the same object as the fh (using fstat and comparing st_dev
> and st_ino).

Can you describe the race in detail? FUSE_CREATE vs FUSE_OPEN?
FUSE_CREATE vs FUSE_CREATE?

> Also O_CREAT open is not supposed to return ENOENT, so failure to open
> without O_CREAT (race between O_CREAT open and plain open) should at
> least translate error to ESTALE or EIO.

Thanks, will fix.

Sstefan


signature.asc
Description: PGP signature


Re: [PATCH] IOMMU and ATS not supported by vhost-user filesystem.

2021-01-27 Thread Stefan Hajnoczi
On Wed, Jan 27, 2021 at 09:30:35AM -0300, Leonardo Augusto Guimarães Garcia 
wrote:
> On 1/27/21 8:19 AM, Stefan Hajnoczi wrote:
> > On Tue, Jan 26, 2021 at 03:23:38PM -0300, lagar...@linux.ibm.com wrote:
> > > From: Leonardo Garcia 
> > > 
> > > Currently, as IOMMU and ATS are not supported, if a user mistakenly set
> > > any of them and tries to mount the vhost-user filesystem inside the
> > > guest, whenever the user tries to access the mount point, the system
> > > will hang forever.
> > > 
> > > Signed-off-by: Leonardo Garcia 
> > > ---
> > >   hw/virtio/vhost-user-fs-pci.c | 7 +++
> > >   hw/virtio/vhost-user-fs.c | 5 +
> > >   2 files changed, 12 insertions(+)
> > > 
> > > diff --git a/hw/virtio/vhost-user-fs-pci.c b/hw/virtio/vhost-user-fs-pci.c
> > > index 2ed8492b3f..564d1fd108 100644
> > > --- a/hw/virtio/vhost-user-fs-pci.c
> > > +++ b/hw/virtio/vhost-user-fs-pci.c
> > > @@ -11,6 +11,8 @@
> > >* top-level directory.
> > >*/
> > > +#include "qemu/osdep.h"
> > > +#include "qapi/error.h"
> > >   #include "qemu/osdep.h"
> > >   #include "hw/qdev-properties.h"
> > >   #include "hw/virtio/vhost-user-fs.h"
> > > @@ -45,6 +47,11 @@ static void vhost_user_fs_pci_realize(VirtIOPCIProxy 
> > > *vpci_dev, Error **errp)
> > >   vpci_dev->nvectors = dev->vdev.conf.num_request_queues + 2;
> > >   }
> > > +if (vpci_dev->flags & VIRTIO_PCI_FLAG_ATS) {
> > > +error_setg(errp, "ATS is currently not supported with 
> > > vhost-user-fs-pci");
> > > +return;
> > > +}
> > Why is this check needed in addition to VIRTIO_F_IOMMU_PLATFORM?
> 
> 
> I don't know if VIRTIO_PCI_FLAG_ATS should depend on
> VIRTIO_F_IOMMU_PLATFORM. At least from a code perspective today, they are
> completely independent. A user can specify one or the other or both. And if
> a user specifies VIRTIO_PCI_FLAG_ATS without specifying
> VIRTIO_F_IOMMU_PLATFORM, the same issue described in the original message
> will happen inside the guest.

I don't see any PCI ATS-specific code in Linux drivers/virtio/ so I
wonder what the issue is?

Also, I thought PCI ATS is an optional feature. It's still possible to
have IOMMUs without ATS.

Michael: Can you help us understand why enabling ATS on a device in QEMU
would cause issues with a VIRTIO device that does not support
VIRTIO_F_IOMMU_PLATFORM?

> > 
> > > +
> > >   qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
> > >   }
> > > diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> > > index ac4fc34b36..914d68b3ee 100644
> > > --- a/hw/virtio/vhost-user-fs.c
> > > +++ b/hw/virtio/vhost-user-fs.c
> > > @@ -203,6 +203,11 @@ static void vuf_device_realize(DeviceState *dev, 
> > > Error **errp)
> > >   return;
> > >   }
> > > +if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> > > +error_setg(errp, "IOMMU is currently not supported with 
> > > vhost-user-fs");
> > > +return;
> > > +}
> > > +
> > >   if (!vhost_user_init(&fs->vhost_user, &fs->conf.chardev, errp)) {
> > I thought IOMMU support depends on the vhost-user device backend (e.g.
> > virtiofsd), so the vhost-user backend should participate in advertising
> > this feature.
> > 
> > Perhaps the check should be:
> > 
> >  ret = vhost_dev_init(&fs->vhost_dev, &fs->vhost_user,
> >   VHOST_BACKEND_TYPE_USER, 0);
> >  if (ret < 0) {
> >  error_setg_errno(errp, -ret, "vhost_dev_init failed");
> >  goto err_virtio;
> >  }
> > +
> > +   if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM) &&
> > +   !(fs->vhost_dev.hdev_features & (1ull << VIRTIO_F_IOMMU_PLATFORM))) 
> > {
> > +   error_setg(errp, "IOMMU is not supported by the vhost-user device 
> > backend");
> > +   goto err_iommu_needed;
> > +   }
> > 
> > Also, can this logic be made generic for all vhost-user devices? It's
> > not really specific to vhost-user-fs.
> 
> 
> Sure, I can do that. I wasn't sure whether this restriction was only for
> vhost-user-fs or whether it was generic for all vhost-user devices. I will
> include this in a next version of the patch.

Awesome, thanks!

Stefan


signature.asc
Description: PGP signature


Re: [RFC PATCH v2 2/3] vfio: Set the priority of the VFIO VM state change handler explicitly

2021-01-27 Thread Alex Williamson
On Wed, 27 Jan 2021 19:20:06 +0800
Shenming Lu  wrote:

> On 2021/1/27 5:36, Alex Williamson wrote:
> > On Wed, 9 Dec 2020 16:09:18 +0800
> > Shenming Lu  wrote:
> >   
> >> In the VFIO VM state change handler, VFIO devices are transitioned
> >> in the _SAVING state, which should keep them from sending interrupts.  
> > 
> > Is this comment accurate?  It's my expectation that _SAVING has no
> > bearing on a device generating interrupts.  Interrupt generation must
> > be allowed to continue so long as the device is _RUNNING.  Thanks,
> >   
> 
> To be more accurate, the _RUNNING bit in device_state is cleared in the
> VFIO VM state change handler when stopping the VM. And if the device continues
> to send interrupts after this, how can we save the states of device interrupts
> in the stop-and-copy phase?...

Exactly, it's clearing the _RUNNING bit that makes the device stop,
including no longer generating interrupts.  Perhaps I incorrectly
inferred "_SAVING state" as referring to the _SAVING bit when you
actually intended:

   *  +--- _RESUMING
   *  |+-- _SAVING
   *  ||+- _RUNNING
   *  |||
   *  000b => Device Stopped, not saving or resuming
   *  001b => Device running, which is the default state
-> *  010b => Stop the device & save the device state, stop-and-copy state

ie. the full state when only _SAVING is set.

Could we make the comment more clear to avoid this confusion?  Thanks,

Alex

> >> Then we can save the pending states of all interrupts in the GIC VM
> >> state change handler (on ARM).
> >>
> >> So we have to set the priority of the VFIO VM state change handler
> >> explicitly (like virtio devices) to ensure it is called before the
> >> GIC's in saving.
> >>
> >> Signed-off-by: Shenming Lu 
> >> Reviewed-by: Kirti Wankhede 
> >> ---
> >>  hw/vfio/migration.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> >> index 3b9de1353a..97ea82b100 100644
> >> --- a/hw/vfio/migration.c
> >> +++ b/hw/vfio/migration.c
> >> @@ -862,7 +862,8 @@ static int vfio_migration_init(VFIODevice *vbasedev,
> >>  register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, 
> >> &savevm_vfio_handlers,
> >>   vbasedev);
> >>  
> >> -migration->vm_state = 
> >> qemu_add_vm_change_state_handler(vfio_vmstate_change,
> >> +migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
> >> +   
> >> vfio_vmstate_change,
> >> vbasedev);
> >>  migration->migration_state.notify = vfio_migration_state_notifier;
> >>  add_migration_state_change_notifier(&migration->migration_state);  
> > 
> > .
> >   
> 




Re: [RFC PATCH v2 3/3] vfio: Avoid disabling and enabling vectors repeatedly in VFIO migration

2021-01-27 Thread Alex Williamson
On Wed, 27 Jan 2021 19:27:35 +0800
Shenming Lu  wrote:

> On 2021/1/27 5:36, Alex Williamson wrote:
> > On Wed, 9 Dec 2020 16:09:19 +0800
> > Shenming Lu  wrote:
> >   
> >> Different from the normal situation when the guest starts, we can
> >> know the max unmasked vetctor (at the beginning) after msix_load()
> >> in VFIO migration. So in order to avoid ineffectively disabling and  
> > 
> > s/ineffectively/inefficiently/?  It's "effective" either way I think.  
> 
> Yeah, I should say "inefficiently". :-)
> 
> >   
> >> enabling vectors repeatedly, let's allocate all needed vectors first
> >> and then enable these unmasked vectors one by one without disabling.
> >>
> >> Signed-off-by: Shenming Lu 
> >> ---
> >>  hw/pci/msix.c | 17 +
> >>  hw/vfio/pci.c | 10 --
> >>  include/hw/pci/msix.h |  2 ++
> >>  3 files changed, 27 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/hw/pci/msix.c b/hw/pci/msix.c
> >> index 67e34f34d6..bf291d3ff8 100644
> >> --- a/hw/pci/msix.c
> >> +++ b/hw/pci/msix.c
> >> @@ -557,6 +557,23 @@ unsigned int msix_nr_vectors_allocated(const 
> >> PCIDevice *dev)
> >>  return dev->msix_entries_nr;
> >>  }
> >>  
> >> +int msix_get_max_unmasked_vector(PCIDevice *dev)
> >> +{
> >> +int max_unmasked_vector = -1;
> >> +int vector;
> >> +
> >> +if ((dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &
> >> +(MSIX_ENABLE_MASK | MSIX_MASKALL_MASK)) == MSIX_ENABLE_MASK) {
> >> +for (vector = 0; vector < dev->msix_entries_nr; vector++) {
> >> +if (!msix_is_masked(dev, vector)) {
> >> +max_unmasked_vector = vector;
> >> +}
> >> +}
> >> +}
> >> +
> >> +return max_unmasked_vector;
> >> +}  
> > 
> > Comments from QEMU PCI folks?
> >   
> >> +
> >>  static int msix_set_notifier_for_vector(PCIDevice *dev, unsigned int 
> >> vector)
> >>  {
> >>  MSIMessage msg;
> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> >> index 51dc373695..e755ed2514 100644
> >> --- a/hw/vfio/pci.c
> >> +++ b/hw/vfio/pci.c
> >> @@ -568,6 +568,9 @@ static void vfio_msix_vector_release(PCIDevice *pdev, 
> >> unsigned int nr)
> >>  
> >>  static void vfio_msix_enable(VFIOPCIDevice *vdev)
> >>  {
> >> +int max_unmasked_vector = msix_get_max_unmasked_vector(&vdev->pdev);
> >> +unsigned int used_vector = MAX(max_unmasked_vector, 0);
> >> +  
> > 
> > The above PCI function could also be done inline here pretty easily too:
> > 
> > unsigned int nr, max_vec = 0;
> > 
> > if (!msix_masked(&vdev->pdev))
> > for (nr = 0; nr < msix_nr_vectors_allocated(&vdev->pdev); nr++) {
> > if (!msix_is_masked(&vdev->pdev, nr)) {
> > max_vec = nr;
> > }
> > }
> > }
> > 
> > It's a bit cleaner than the msix utility function, imo.  
> 
> Yeah, it makes sense.
> 
> >   
> >>  vfio_disable_interrupts(vdev);
> >>  
> >>  vdev->msi_vectors = g_new0(VFIOMSIVector, vdev->msix->entries);
> >> @@ -586,9 +589,12 @@ static void vfio_msix_enable(VFIOPCIDevice *vdev)
> >>   * triggering to userspace, then immediately release the vector, 
> >> leaving
> >>   * the physical device with no vectors enabled, but MSI-X enabled, 
> >> just
> >>   * like the guest view.
> >> + * If there are unmasked vectors (such as in migration) which will be
> >> + * enabled soon, we can allocate them here to avoid ineffectively 
> >> disabling
> >> + * and enabling vectors repeatedly later.  
> > 
> > It just happens that migration triggers this usage model where the
> > MSI-X enable bit is set with vectors unmasked in the vector table, but
> > this is not unique to migration, guests can follow this pattern as well.
> > Has this been tested with a variety of guests?  Logically it seems
> > correct, but always good to prove so.  Thanks,  
> 
> I have tested it in migration and normal guest startup (only the latest 
> Linux).
> And I can try to test with some other kernels, could you be more specific 
> about this?

Minimally also Windows, ideally a BSD as well.  Thanks,

Alex

> >>   */
> >> -vfio_msix_vector_do_use(&vdev->pdev, 0, NULL, NULL);
> >> -vfio_msix_vector_release(&vdev->pdev, 0);
> >> +vfio_msix_vector_do_use(&vdev->pdev, used_vector, NULL, NULL);
> >> +vfio_msix_vector_release(&vdev->pdev, used_vector);
> >>  
> >>  if (msix_set_vector_notifiers(&vdev->pdev, vfio_msix_vector_use,
> >>vfio_msix_vector_release, NULL)) {
> >> diff --git a/include/hw/pci/msix.h b/include/hw/pci/msix.h
> >> index 4c4a60c739..4bfb463fa6 100644
> >> --- a/include/hw/pci/msix.h
> >> +++ b/include/hw/pci/msix.h
> >> @@ -23,6 +23,8 @@ void msix_uninit_exclusive_bar(PCIDevice *dev);
> >>  
> >>  unsigned int msix_nr_vectors_allocated(const PCIDevice *dev);
> >>  
> >> +int msix_get_max_unmasked_vector(PCIDevice *dev);
> >> +
> >>  void msix_save(PCIDevice *dev, QEMUFile *f);
> >>  void msix_load(PCIDevice *dev, QEMUFile *f);
> >>
> >

Re: [PATCH v3] machine: add missing doc for memory-backend option

2021-01-27 Thread Michal Privoznik

On 1/27/21 11:54 AM, Daniel P. Berrangé wrote:

On Wed, Jan 27, 2021 at 10:45:11AM +, Daniel P. Berrangé wrote:

On Thu, Jan 21, 2021 at 11:15:04AM -0500, Igor Mammedov wrote:





How does a mgmt app know which machine types need to use this
option ? The machine type names are opaque strings, and apps
must not attempt to parse or interpret the version number
inside the machine type name, as they can be changed by
distros.  IOW, saying to use it for machine types 4.0 and
older isn't a valid usage strategy IMHO.


Looking at the libvirt patch, we do indeed use his property
unconditionally for all machine types, precisely because parsing
version numbers from the machine type is not allowed.

https://www.redhat.com/archives/libvir-list/2021-January/msg00633.html

So this doc is telling apps to do something that isn't viable


The other approach that I was suggesting was, that QEMU stops reporting 
'default-ram-id' for affected machine types. The way the switch from '-m 
XMB' to memory-backend-* was implemented in libvirt is that if libvirt 
sees 'default-ram-id' attribute for given machine type it uses 
memory-backend-* otherwise it falls back to -m.


Since we know which machine types are "broken", we can stop reporting 
the attribute and thus stop tickling this bug. I agree that it puts more 
burden on distro maintainers to backport the change, but I think it's 
acceptable risk.


Michal




Re: [PATCH v3] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Miklos Szeredi
On Wed, Jan 27, 2021 at 3:14 PM Stefan Hajnoczi  wrote:
>
> On Wed, Jan 27, 2021 at 02:01:54PM +0100, Miklos Szeredi wrote:

> > The problem here is there can also be a race between the open and the
> > subsequent lo_do_lookup().
> >
> > At this point it's probably enough to verify that fuse_entry_param
> > refers to the same object as the fh (using fstat and comparing st_dev
> > and st_ino).
>
> Can you describe the race in detail? FUSE_CREATE vs FUSE_OPEN?
> FUSE_CREATE vs FUSE_CREATE?

A race between FUSE_CREATE and external modification:

VIRTIOFSD: lo_create() {
VIRTIOFSD: fd = open(foo, O_CREAT | O_EXCL)
EXTERNAL:  unlink(foo)
EXTERNAL:  open(foo, O_CREAT)
VIRTIOFSD: lo_do_lookup() {
VIRTIOFSD: newfd = open(foo, O_PATH | O_NOFOLLOW)

Nothing serious will happen, but there will be a discrepancy between
the open file and the inode that it references.  I.e.  the following
in the client will yield weird results:

open(foo, O_CREAT) -> fd
sprintf(procname, "/proc/self/fd/%i", fd);
open(procname, O_RDONLY) -> fd2
write(fd, buf, bufsize)
read(fd2, buf, bufsize)

This is probably not a security issue, more of a quality of
implementation issue.

Thanks,
Miklos




Re: [PATCH] virtio: move 'use-disabled-flag' property to hw_compat_4_2

2021-01-27 Thread Dr. David Alan Gilbert
* Stefano Garzarella (sgarz...@redhat.com) wrote:
> On Mon, Jan 18, 2021 at 04:03:12PM +, Dr. David Alan Gilbert wrote:
> > * Stefano Garzarella (sgarz...@redhat.com) wrote:
> > > Commit 9d7bd0826f introduced a new 'use-disabled-flag' property
> > > set to true by default.
> > > To allow the migration, we set this property to false in the hw_compat,
> > > but in the wrong place (hw_compat_4_1).
> > > 
> > > Since commit 9d7bd0826f was released with QEMU 5.0, we move
> > > 'use-disabled-flag' property to hw_compat_4_2, so 4.2 machine types
> > > will have the pre-patch behavior and the migration can work.
> > 
> > Be a little careful that fixing this probably causes a migration from
> > 5.2->6.0 to fail with this machine type;  so when we do these
> > type of fixes we often fix an old machine type between some pair of qemu
> > versions and then break it between a different set.
> 
> Good point!
> 
> I did some tests using the example below always using pc-q35-4.2 and it
> seems that works well:
> 
> - 5.2 -> 6.0pass
> - 5.2 -> 4.2FAIL
> - 6.0 -> 5.2pass
> - 6.0 -> 4.2pass
> - 4.2 -> 5.2pass
> - 4.2 -> 6.0pass
> 
> Should I run some more tests?

Apologies for the delay; I had to step back and understand a bit about
what was going on.

The problem here is that you're sending a 'disabled' subsection when
that option is true; your patch doesn't change the 4.1 machine type but
it does change the 4.2 machine type; and that makes the 4.2 machine type
not send it; so that means your patched version *will* work to existing
code (because it's a subsection anyway it doesn't break the stream
format when it's missing).


Reviewed-by: Dr. David Alan Gilbert 

> Thanks,
> Stefano
> 
> > 
> > Dave
> > 
> > > The issue was discovered with vhost-vsock device and 4.2 machine
> > > type without running any kernel in the VM:
> > > $ qemu-4.2 -M pc-q35-4.2,accel=kvm \
> > > -device vhost-vsock-pci,guest-cid=4 \
> > > -monitor stdio -incoming tcp:0:
> > > 
> > > $ qemu-5.2 -M pc-q35-4.2,accel=kvm \
> > > -device vhost-vsock-pci,guest-cid=3 \
> > > -monitor stdio
> > > (qemu) migrate -d tcp:0:
> > > 
> > > # qemu-4.2 output
> > > qemu-system-x86_64: Failed to load virtio-vhost_vsock:virtio
> > > qemu-system-x86_64: error while loading state for instance 0x0 of 
> > > device ':00:03.0/virtio-vhost_vsock'
> > > qemu-system-x86_64: load of migration failed: No such file or 
> > > directory
> > > 
> > > Reported-by: Jing Zhao 
> > > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1907255
> > > Fixes: 9d7bd0826f ("virtio-pci: disable vring processing when 
> > > bus-mastering is disabled")
> > > Cc: mdr...@linux.vnet.ibm.com
> > > CC: qemu-sta...@nongnu.org
> > > Signed-off-by: Stefano Garzarella 
> > > ---
> > >  hw/core/machine.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > > index de3b8f1b31..5d6163ab70 100644
> > > --- a/hw/core/machine.c
> > > +++ b/hw/core/machine.c
> > > @@ -70,12 +70,12 @@ GlobalProperty hw_compat_4_2[] = {
> > >  { "qxl", "revision", "4" },
> > >  { "qxl-vga", "revision", "4" },
> > >  { "fw_cfg", "acpi-mr-restore", "false" },
> > > +{ "virtio-device", "use-disabled-flag", "false" },
> > >  };
> > >  const size_t hw_compat_4_2_len = G_N_ELEMENTS(hw_compat_4_2);
> > > 
> > >  GlobalProperty hw_compat_4_1[] = {
> > >  { "virtio-pci", "x-pcie-flr-init", "off" },
> > > -{ "virtio-device", "use-disabled-flag", "false" },
> > >  };
> > >  const size_t hw_compat_4_1_len = G_N_ELEMENTS(hw_compat_4_1);
> > > 
> > > --
> > > 2.26.2
> > > 
> > -- 
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> > 
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH] MAINTAINERS: Update 9pfs tree URL

2021-01-27 Thread Christian Schoenebeck
On Freitag, 15. Januar 2021 14:50:17 CET Christian Schoenebeck wrote:
> On Freitag, 15. Januar 2021 14:42:24 CET Greg Kurz wrote:
> > I've already moved my repositories to gitlab for extra CI coverage,
> > and I won't use the ones at github anymore.
> > 
> > Signed-off-by: Greg Kurz 
> 
> Reviewed-by: Christian Schoenebeck 
> 
> > ---
> > 
> >  MAINTAINERS |2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index cb0656aec3d4..21038d3fdfce 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -1828,7 +1828,7 @@ X: hw/9pfs/xen-9p*
> > 
> >  F: fsdev/
> >  F: docs/interop/virtfs-proxy-helper.rst
> >  F: tests/qtest/virtio-9p-test.c
> > 
> > -T: git https://github.com/gkurz/qemu.git 9p-next
> > +T: git https://gitlab.com/gkurz/qemu.git 9p-next
> > 
> >  virtio-blk
> >  M: Stefan Hajnoczi 

What's the status of this patch? I would add my T: line below just for the 
records. But I'd rather wait for this patch being merged to main line.

Best regards,
Christian Schoenebeck





Re: [PATCH] virtio: move 'use-disabled-flag' property to hw_compat_4_2

2021-01-27 Thread Stefano Garzarella

On Wed, Jan 27, 2021 at 02:28:13PM +, Dr. David Alan Gilbert wrote:

* Stefano Garzarella (sgarz...@redhat.com) wrote:

On Mon, Jan 18, 2021 at 04:03:12PM +, Dr. David Alan Gilbert wrote:
> * Stefano Garzarella (sgarz...@redhat.com) wrote:
> > Commit 9d7bd0826f introduced a new 'use-disabled-flag' property
> > set to true by default.
> > To allow the migration, we set this property to false in the hw_compat,
> > but in the wrong place (hw_compat_4_1).
> >
> > Since commit 9d7bd0826f was released with QEMU 5.0, we move
> > 'use-disabled-flag' property to hw_compat_4_2, so 4.2 machine types
> > will have the pre-patch behavior and the migration can work.
>
> Be a little careful that fixing this probably causes a migration from
> 5.2->6.0 to fail with this machine type;  so when we do these
> type of fixes we often fix an old machine type between some pair of qemu
> versions and then break it between a different set.

Good point!

I did some tests using the example below always using pc-q35-4.2 and it
seems that works well:

- 5.2 -> 6.0pass
- 5.2 -> 4.2FAIL
- 6.0 -> 5.2pass
- 6.0 -> 4.2pass
- 4.2 -> 5.2pass
- 4.2 -> 6.0pass

Should I run some more tests?


Apologies for the delay; I had to step back and understand a bit about
what was going on.


no problem :-)



The problem here is that you're sending a 'disabled' subsection when
that option is true; your patch doesn't change the 4.1 machine type but
it does change the 4.2 machine type; and that makes the 4.2 machine type
not send it; so that means your patched version *will* work to existing
code (because it's a subsection anyway it doesn't break the stream
format when it's missing).



Thanks for the detailed explanation!



Reviewed-by: Dr. David Alan Gilbert 



Thanks,
Stefano




seems currently QEMU doesn't support file backend for RAM memory region on Windows

2021-01-27 Thread Wu, Wentong
There is a doc about the API of memory-mapped file on Windows 
https://docs.microsoft.com/en-us/previous-versions/ms810613(v=msdn.10)?redirectedfrom=MSDN,
 not sure anyone is working on it.

Thanks
Wentong


[PATCH] docs/interop/qmp-spec: Document the request queue limit

2021-01-27 Thread Markus Armbruster
Signed-off-by: Markus Armbruster 
---
 docs/interop/qmp-spec.txt | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/docs/interop/qmp-spec.txt b/docs/interop/qmp-spec.txt
index cdf5842555..b0e8351d5b 100644
--- a/docs/interop/qmp-spec.txt
+++ b/docs/interop/qmp-spec.txt
@@ -133,9 +133,11 @@ to pass "id" with out-of-band commands.  Passing it with 
all commands
 is recommended for clients that accept capability "oob".
 
 If the client sends in-band commands faster than the server can
-execute them, the server will stop reading the requests from the QMP
-channel until the request queue length is reduced to an acceptable
-range.
+execute them, the server will stop reading requests until the request
+queue length is reduced to an acceptable range.
+
+To ensure commands to be executed out-of-band get read and executed,
+the client should have at most eight in-band commands in flight.
 
 Only a few commands support out-of-band execution.  The ones that do
 have "allow-oob": true in output of query-qmp-schema.
-- 
2.26.2




Re: [PATCH v3 2/8] meson: move up hw subdir (specifically before trace subdir)

2021-01-27 Thread Stefan Hajnoczi
On Thu, Jan 21, 2021 at 01:50:22PM +0100, Gerd Hoffmann wrote:
> Needed so trace/meson.build can see
> stuff done in hw/*/meson.build.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  meson.build | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/meson.build b/meson.build
> index 1f3d48b53a06..7462a50b4c36 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1777,6 +1777,8 @@ if 'CONFIG_VHOST_USER' in config_host
>vhost_user = libvhost_user.get_variable('vhost_user_dep')
>  endif
>  
> +subdir('hw')
> +
>  subdir('qapi')

Please add a comment into the meson.build file explaining the ordering
requirement. This will prevent people accidentally breaking this in the
future.


signature.asc
Description: PGP signature


Re: [PATCH v3 1/8] meson: add trace_events_config[]

2021-01-27 Thread Stefan Hajnoczi
On Thu, Jan 21, 2021 at 01:50:21PM +0100, Gerd Hoffmann wrote:
> It's an array of dicts, where each dict holds the configuration for one
> trace-events file.  For now just fill it from trace_events_subdirs.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  meson.build   |  1 +
>  trace/meson.build | 21 -
>  2 files changed, 17 insertions(+), 5 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH v3 3/8] meson: add module_trace & module_trace_src

2021-01-27 Thread Stefan Hajnoczi
On Thu, Jan 21, 2021 at 01:50:23PM +0100, Gerd Hoffmann wrote:
> module_trace is a dict which keeps track of the trace source files for a
> module.
> 
> module_trace_src collects the trace source files for a given trace-events 
> file,
> which then either added to the source set or to a new module_trace dict
> depending on whenever they are for a module or core qemu.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  meson.build   |  3 ++-
>  trace/meson.build | 16 
>  2 files changed, 14 insertions(+), 5 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [Virtio-fs] [PATCH v2] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Greg Kurz
On Wed, 27 Jan 2021 15:09:50 +0100
Miklos Szeredi  wrote:

> On Wed, Jan 27, 2021 at 2:49 PM Greg Kurz  wrote:
> >
> > On Wed, 27 Jan 2021 11:34:52 +0100
> > Miklos Szeredi  wrote:
> 
> > > Another solution specifically for O_CREAT without O_EXCL would be to
> > > turn it into an exclusive create.
> >
> > Would this added O_EXCL then appear on the client side, e.g. to
> > guest userspace doing fcntl(F_GETFL) ?
> 
> No.  Guest kernel keeps track of open flags.
> 

That was my impression as well as I didn't find a FUSE_GETFL request.
Thanks for confirming that !

> > > If that fails with EEXIST then try
> > > the normal open path (open with O_PATH, fstat, open proc symlink).  If
> >
> > open(O_PATH | O_NOFOLLOW) + fstatat(AT_EMPTY_PATH|AT_SYMLINK_NOFOLLOW)
> > would indeed allow to filter out anything that isn't a directory and
> > to safely open the proc symlink.
> >
> > > that fails with ENOENT, then retry the whole thing a certain number of
> >
> > Indeed someone could have unlinked the file in the meantime, in which
> > case the open(O_PATH | O_NOFOLLOW) would fail, but if it succeeds then
> > we cannot hit ENOENT anymore AFAICT.
> 
> Right.
> 
> > > times.  If it still fails then somebody is definitely messing with us
> > > and we can fail the request with EIO.
> > >
> >
> > Not sure what the retry+timeout is trying to mitigate here... why not
> > returning EIO right away ?
> 
> The semantics of O_CREATE are that it can fail neither because the
> file exists nor because it doesn't.  This doesn't matter if the
> exported tree is not modified outside of a single guest, because of
> locking provided by the guest kernel.
> 

Wrong. O_CREAT can legitimately fail with ENOENT if one element
of the pathname doesn't exist. And even if pathname only has
one element, you can still have O_CREAT to fail the same way
if the path of the parent directory is removed.

cat>enoent.c<
#include 
#include 
#include 

int main(int argc, char **argv)
{
mkdir("foo", 0777);
chdir("foo");
rmdir("../foo");
open("bar", O_CREAT);
}
EOF
make enoent
strace ./enoent

[...]

mkdir("foo", 0777)  = 0
chdir("foo")= 0
rmdir("../foo") = 0
openat(AT_FDCWD, "bar", O_RDONLY|O_CREAT, 0117130) = -1 ENOENT (No such file or 
directory)

> However if we want to support shared access to a tree then O_CREAT
> semantics should work even in the face of races due to external
> modification of the tree.  I.e. following race:
> 

Yeah, handling shared access is where the fun starts :)

> virtiofsd: open(foo, O_CREAT | O_EXCL) -> EEXIST
> other task: unlink(foo)
> virtiofsd: open(foo, O_PATH | O_NOFOLLOW) -> ENOENT
> 
> To properly support the above the O_CREAT | O_EXCL open would need to
> be retried.
> 

But in this case, it seems fine to return ENOENT since
the guest userspace cannot really assume it never happens.

> Thanks,
> Miklos
> 




Re: [PATCH v3] machine: add missing doc for memory-backend option

2021-01-27 Thread Daniel P . Berrangé
On Wed, Jan 27, 2021 at 03:24:26PM +0100, Michal Privoznik wrote:
> On 1/27/21 11:54 AM, Daniel P. Berrangé wrote:
> > On Wed, Jan 27, 2021 at 10:45:11AM +, Daniel P. Berrangé wrote:
> > > On Thu, Jan 21, 2021 at 11:15:04AM -0500, Igor Mammedov wrote:
> 
> 
> > > 
> > > How does a mgmt app know which machine types need to use this
> > > option ? The machine type names are opaque strings, and apps
> > > must not attempt to parse or interpret the version number
> > > inside the machine type name, as they can be changed by
> > > distros.  IOW, saying to use it for machine types 4.0 and
> > > older isn't a valid usage strategy IMHO.
> > 
> > Looking at the libvirt patch, we do indeed use his property
> > unconditionally for all machine types, precisely because parsing
> > version numbers from the machine type is not allowed.
> > 
> > https://www.redhat.com/archives/libvir-list/2021-January/msg00633.html
> > 
> > So this doc is telling apps to do something that isn't viable
> 
> The other approach that I was suggesting was, that QEMU stops reporting
> 'default-ram-id' for affected machine types. The way the switch from '-m
> XMB' to memory-backend-* was implemented in libvirt is that if libvirt sees
> 'default-ram-id' attribute for given machine type it uses memory-backend-*
> otherwise it falls back to -m.
> 
> Since we know which machine types are "broken", we can stop reporting the
> attribute and thus stop tickling this bug. I agree that it puts more burden
> on distro maintainers to backport the change, but I think it's acceptable
> risk.

IIUC That's only a burden for distros if they're creating their own
machine types, in which case they've already decided the burden is
a net win.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [Virtio-fs] [PATCH v2] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Miklos Szeredi
On Wed, Jan 27, 2021 at 4:09 PM Greg Kurz  wrote:
>
> On Wed, 27 Jan 2021 15:09:50 +0100
> Miklos Szeredi  wrote:
> > The semantics of O_CREATE are that it can fail neither because the
> > file exists nor because it doesn't.  This doesn't matter if the
> > exported tree is not modified outside of a single guest, because of
> > locking provided by the guest kernel.
> >
>
> Wrong. O_CREAT can legitimately fail with ENOENT if one element

Let me make my  statement more precise:

O_CREAT cannot fail with ENOENT if parent directory exists throughout
the operation.

I'm sure this property is used all over the place in userspace code,
and hence should be supported in strict coherence (cache=none) mode.

For relaxed coherence (cache=auto) I'm not quite sure.  NFS is usually
the reference, we'd need to look into what guarantees it provides wrt.
O_CREAT and remote racing unlink.

Thanks,
Miklos




Re: [PATCH v3] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Greg Kurz
On Wed, 27 Jan 2021 14:14:30 +
Stefan Hajnoczi  wrote:

> On Wed, Jan 27, 2021 at 02:01:54PM +0100, Miklos Szeredi wrote:
> > On Wed, Jan 27, 2021 at 12:21 PM Stefan Hajnoczi  
> > wrote:
> >   }
> > > @@ -1654,9 +1677,11 @@ static void update_open_flags(int writeback, int 
> > > allow_direct_io,
> > >  static void lo_create(fuse_req_t req, fuse_ino_t parent, const char 
> > > *name,
> > >mode_t mode, struct fuse_file_info *fi)
> > >  {
> > > +int open_flags = (fi->flags | O_CREAT) & ~O_NOFOLLOW;
> > >  int fd;
> > >  struct lo_data *lo = lo_data(req);
> > >  struct lo_inode *parent_inode;
> > > +struct lo_inode *existing_inode = NULL;
> > >  struct fuse_entry_param e;
> > >  int err;
> > >  struct lo_cred old = {};
> > > @@ -1682,11 +1707,23 @@ static void lo_create(fuse_req_t req, fuse_ino_t 
> > > parent, const char *name,
> > >
> > >  update_open_flags(lo->writeback, lo->allow_direct_io, fi);
> > >
> > > -fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & 
> > > ~O_NOFOLLOW,
> > > -mode);
> > > +/* First, try to create a new file but don't open existing files */
> > > +fd = openat(parent_inode->fd, name, open_flags | O_EXCL, mode);
> > >  err = fd == -1 ? errno : 0;
> > > +
> > >  lo_restore_cred(&old);
> > >
> > > +/* Second, open existing files if O_EXCL was not specified */
> > > +if (err == EEXIST && !(fi->flags & O_EXCL)) {
> > > +existing_inode = lookup_name(req, parent, name);
> > > +if (existing_inode) {
> > > +fd = lo_inode_open(lo, existing_inode, open_flags);
> > > +if (fd < 0) {
> > > +err = -fd;
> > > +}
> > > +}
> > > +}
> > > +
> > >  if (!err) {
> > >  ssize_t fh;
> > 
> > It's more of a mess than I thought.
> > 
> > The problem here is there can also be a race between the open and the
> > subsequent lo_do_lookup().
> > 
> > At this point it's probably enough to verify that fuse_entry_param
> > refers to the same object as the fh (using fstat and comparing st_dev
> > and st_ino).
> 
> Can you describe the race in detail? FUSE_CREATE vs FUSE_OPEN?
> FUSE_CREATE vs FUSE_CREATE?
> 
> > Also O_CREAT open is not supposed to return ENOENT, so failure to open
> > without O_CREAT (race between O_CREAT open and plain open) should at
> > least translate error to ESTALE or EIO.
> 
> Thanks, will fix.
> 

Please wait, as explained in another mail, ENOENT can happen with
O_CREAT and guest userspace should be ready to handle it.

> Sstefan



pgpi9UCXNrlOP.pgp
Description: OpenPGP digital signature


Re: [PATCH v3 4/8] meson: move qxl trace events to separate file

2021-01-27 Thread Stefan Hajnoczi
On Thu, Jan 21, 2021 at 01:50:24PM +0100, Gerd Hoffmann wrote:
> Move qxl trace events to separate trace-events-qxl file.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  hw/display/qxl-render.c |  1 +
>  hw/display/qxl.c|  1 +
>  hw/display/meson.build  |  5 +++
>  hw/display/trace-events | 67 -
>  hw/display/trace-events-qxl | 66 
>  5 files changed, 73 insertions(+), 67 deletions(-)
>  create mode 100644 hw/display/trace-events-qxl

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH v3] machine: add missing doc for memory-backend option

2021-01-27 Thread Igor Mammedov
On Wed, 27 Jan 2021 15:24:26 +0100
Michal Privoznik  wrote:

> On 1/27/21 11:54 AM, Daniel P. Berrangé wrote:
> > On Wed, Jan 27, 2021 at 10:45:11AM +, Daniel P. Berrangé wrote:  
> >> On Thu, Jan 21, 2021 at 11:15:04AM -0500, Igor Mammedov wrote:  
> 
> 
> >>
> >> How does a mgmt app know which machine types need to use this
> >> option ? The machine type names are opaque strings, and apps
> >> must not attempt to parse or interpret the version number
> >> inside the machine type name, as they can be changed by
> >> distros.  IOW, saying to use it for machine types 4.0 and
> >> older isn't a valid usage strategy IMHO.
it's possible (but no necessary) to use knob with new machine types
(defaults for these match suggested property value).
Limiting knob usage to 4.0 and older would allow us to drop
without extra efforts once 4.0 is deprecated/removed.

> > Looking at the libvirt patch, we do indeed use his property
> > unconditionally for all machine types, precisely because parsing
> > version numbers from the machine type is not allowed.
> > 
> > https://www.redhat.com/archives/libvir-list/2021-January/msg00633.html
> > 
> > So this doc is telling apps to do something that isn't viable  
> 
> The other approach that I was suggesting was, that QEMU stops reporting 
> 'default-ram-id' for affected machine types. The way the switch from '-m 
> XMB' to memory-backend-* was implemented in libvirt is that if libvirt 
> sees 'default-ram-id' attribute for given machine type it uses 
> memory-backend-* otherwise it falls back to -m.
> 
> Since we know which machine types are "broken", we can stop reporting 
> the attribute and thus stop tickling this bug. I agree that it puts more 
> burden on distro maintainers to backport the change, but I think it's 
> acceptable risk.

default-ram-id is already exposed in wild including old machine types
starting from 5.2

if libvirt will take care this one quirk, then I guess we can
do as suggested. I can post an additional patch to this effect if there
is agreement to go this route.

my take on it, at this point is that it's not worth the effort,
we can just use option unconditionally and flow usual deprecate/remove
process once 4.0 machine type is removed.

> 
> Michal
> 
> 




Re: [Virtio-fs] [PATCH v2] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Greg Kurz
On Wed, 27 Jan 2021 16:22:49 +0100
Miklos Szeredi  wrote:

> On Wed, Jan 27, 2021 at 4:09 PM Greg Kurz  wrote:
> >
> > On Wed, 27 Jan 2021 15:09:50 +0100
> > Miklos Szeredi  wrote:
> > > The semantics of O_CREATE are that it can fail neither because the
> > > file exists nor because it doesn't.  This doesn't matter if the
> > > exported tree is not modified outside of a single guest, because of
> > > locking provided by the guest kernel.
> > >
> >
> > Wrong. O_CREAT can legitimately fail with ENOENT if one element
> 
> Let me make my  statement more precise:
> 
> O_CREAT cannot fail with ENOENT if parent directory exists throughout
> the operation.
> 

True, but I still don't see what guarantees guest userspace that the
parent directory doesn't go away... I must have missed something.
Please elaborate.

> I'm sure this property is used all over the place in userspace code,
> and hence should be supported in strict coherence (cache=none) mode.
> 
> For relaxed coherence (cache=auto) I'm not quite sure.  NFS is usually
> the reference, we'd need to look into what guarantees it provides wrt.
> O_CREAT and remote racing unlink.
> 
> Thanks,
> Miklos
> 




[PATCH v6] Add support for pvpanic pci device

2021-01-27 Thread Mihai Carabas
This patchset adds support for pvpanic pci device.

v3:
- patch 1: made pvpanic isa device available only for PC, compile pvpanic-test
  only when isa device is present
- patch 2: fixed device id to 0x0011, used OBJECT_DECLARE_TYPE,
  PVPANIC_PCI_DEVICE, added VMSTATE_PCI_DEVICE, removed INTERFACE_PCIE_DEVICE
- patch 3: fixed documentation
- patch 4: add a qtest for pvpanic-pci

v4:
- added Rb/Ack on patches
- modify test case to include -action parameter that was recently added and also
  to be on par with the pvpanic ISA device testing

v5:
- added subsystem_vendor_id and subsystem_id needed for MS WHQL tests

v6:
- removed subsystem_vendor_id and subsystem_id as they are filed out by default
  if empty
- do not compile pvpanic-pci-test for ppc64 as we our tests do not support that
  platform

Mihai Carabas (4):
  hw/misc/pvpanic: split-out generic and bus dependent code
  hw/misc/pvpanic: add PCI interface support
  pvpanic : update pvpanic spec document
  tests/qtest: add a test case for pvpanic-pci

 docs/specs/pci-ids.txt |  1 +
 docs/specs/pvpanic.txt | 13 +-
 hw/i386/Kconfig|  2 +-
 hw/misc/Kconfig| 12 +-
 hw/misc/meson.build|  4 +-
 hw/misc/pvpanic-isa.c  | 94 ++
 hw/misc/pvpanic-pci.c  | 94 ++
 hw/misc/pvpanic.c  | 85 +++---
 include/hw/misc/pvpanic.h  | 24 ++-
 include/hw/pci/pci.h   |  1 +
 tests/qtest/meson.build|  3 +-
 tests/qtest/pvpanic-pci-test.c | 94 ++
 12 files changed, 341 insertions(+), 86 deletions(-)
 create mode 100644 hw/misc/pvpanic-isa.c
 create mode 100644 hw/misc/pvpanic-pci.c
 create mode 100644 tests/qtest/pvpanic-pci-test.c

-- 
1.8.3.1




[PATCH v6 4/4] tests/qtest: add a test case for pvpanic-pci

2021-01-27 Thread Mihai Carabas
Add a test case for pvpanic-pci device. The scenario is the same as pvpanic
ISA device, but is using the PCI bus.

Signed-off-by: Mihai Carabas 
Acked-by: Thomas Huth 
Reviewed-by: Peter Maydell 
Signed-off-by: Mihai Carabas 
---
 tests/qtest/meson.build|  1 +
 tests/qtest/pvpanic-pci-test.c | 94 ++
 2 files changed, 95 insertions(+)
 create mode 100644 tests/qtest/pvpanic-pci-test.c

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 0e85343..7ccdf02 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -34,6 +34,7 @@ qtests_i386 = \
config_all_devices.has_key('CONFIG_ISA_IPMI_BT') ? ['ipmi-bt-test'] : []) + 
 \
   (config_all_devices.has_key('CONFIG_WDT_IB700') ? ['wdt_ib700-test'] : []) + 
 \
   (config_all_devices.has_key('CONFIG_PVPANIC_ISA') ? ['pvpanic-test'] : []) + 
 \
+  (config_all_devices.has_key('CONFIG_PVPANIC_PCI') ? ['pvpanic-pci-test'] : 
[]) +  \
   (config_all_devices.has_key('CONFIG_HDA') ? ['intel-hda-test'] : []) +   
 \
   (config_all_devices.has_key('CONFIG_I82801B11') ? ['i82801b11-test'] : []) + 
\
   (config_all_devices.has_key('CONFIG_IOH3420') ? ['ioh3420-test'] : []) + 
 \
diff --git a/tests/qtest/pvpanic-pci-test.c b/tests/qtest/pvpanic-pci-test.c
new file mode 100644
index 000..24b33c6
--- /dev/null
+++ b/tests/qtest/pvpanic-pci-test.c
@@ -0,0 +1,94 @@
+/*
+ * QTest testcase for PV Panic PCI device
+ *
+ * Copyright (C) 2020 Oracle
+ *
+ * Authors:
+ * Mihai Carabas 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "libqos/libqtest.h"
+#include "qapi/qmp/qdict.h"
+#include "libqos/pci.h"
+#include "libqos/pci-pc.h"
+#include "hw/pci/pci_regs.h"
+
+static void test_panic_nopause(void)
+{
+uint8_t val;
+QDict *response, *data;
+QTestState *qts;
+QPCIBus *pcibus;
+QPCIDevice *dev;
+QPCIBar bar;
+
+qts = qtest_init("-device pvpanic-pci,addr=04.0 -action panic=none");
+pcibus = qpci_new_pc(qts, NULL);
+dev = qpci_device_find(pcibus, QPCI_DEVFN(0x4, 0x0));
+qpci_device_enable(dev);
+bar = qpci_iomap(dev, 0, NULL);
+
+qpci_memread(dev, bar, 0, &val, sizeof(val));
+g_assert_cmpuint(val, ==, 3);
+
+val = 1;
+qpci_memwrite(dev, bar, 0, &val, sizeof(val));
+
+response = qtest_qmp_eventwait_ref(qts, "GUEST_PANICKED");
+g_assert(qdict_haskey(response, "data"));
+data = qdict_get_qdict(response, "data");
+g_assert(qdict_haskey(data, "action"));
+g_assert_cmpstr(qdict_get_str(data, "action"), ==, "run");
+qobject_unref(response);
+
+qtest_quit(qts);
+}
+
+static void test_panic(void)
+{
+uint8_t val;
+QDict *response, *data;
+QTestState *qts;
+QPCIBus *pcibus;
+QPCIDevice *dev;
+QPCIBar bar;
+
+qts = qtest_init("-device pvpanic-pci,addr=04.0 -action panic=pause");
+pcibus = qpci_new_pc(qts, NULL);
+dev = qpci_device_find(pcibus, QPCI_DEVFN(0x4, 0x0));
+qpci_device_enable(dev);
+bar = qpci_iomap(dev, 0, NULL);
+
+qpci_memread(dev, bar, 0, &val, sizeof(val));
+g_assert_cmpuint(val, ==, 3);
+
+val = 1;
+qpci_memwrite(dev, bar, 0, &val, sizeof(val));
+
+response = qtest_qmp_eventwait_ref(qts, "GUEST_PANICKED");
+g_assert(qdict_haskey(response, "data"));
+data = qdict_get_qdict(response, "data");
+g_assert(qdict_haskey(data, "action"));
+g_assert_cmpstr(qdict_get_str(data, "action"), ==, "pause");
+qobject_unref(response);
+
+qtest_quit(qts);
+}
+
+int main(int argc, char **argv)
+{
+int ret;
+
+g_test_init(&argc, &argv, NULL);
+qtest_add_func("/pvpanic-pci/panic", test_panic);
+qtest_add_func("/pvpanic-pci/panic-nopause", test_panic_nopause);
+
+ret = g_test_run();
+
+return ret;
+}
-- 
1.8.3.1




[PATCH v6 3/4] pvpanic : update pvpanic spec document

2021-01-27 Thread Mihai Carabas
Add pvpanic PCI device support details in docs/specs/pvpanic.txt.

Signed-off-by: Mihai Carabas 
Reviewed-by: Peter Maydell 
---
 docs/specs/pvpanic.txt | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/docs/specs/pvpanic.txt b/docs/specs/pvpanic.txt
index a90fbca..8afcde1 100644
--- a/docs/specs/pvpanic.txt
+++ b/docs/specs/pvpanic.txt
@@ -1,7 +1,7 @@
 PVPANIC DEVICE
 ==
 
-pvpanic device is a simulated ISA device, through which a guest panic
+pvpanic device is a simulated device, through which a guest panic
 event is sent to qemu, and a QMP event is generated. This allows
 management apps (e.g. libvirt) to be notified and respond to the event.
 
@@ -9,6 +9,9 @@ The management app has the option of waiting for GUEST_PANICKED 
events,
 and/or polling for guest-panicked RunState, to learn when the pvpanic
 device has fired a panic event.
 
+The pvpanic device can be implemented as an ISA device (using IOPORT) or as a
+PCI device.
+
 ISA Interface
 -
 
@@ -24,6 +27,14 @@ bit 1: a guest panic has happened and will be handled by the 
guest;
the host should record it or report it, but should not affect
the execution of the guest.
 
+PCI Interface
+-
+
+The PCI interface is similar to the ISA interface except that it uses an MMIO
+address space provided by its BAR0, 1 byte long. Any machine with a PCI bus
+can enable a pvpanic device by adding '-device pvpanic-pci' to the command
+line.
+
 ACPI Interface
 --
 
-- 
1.8.3.1




Re: [Virtio-fs] [PATCH v2] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Miklos Szeredi
On Wed, Jan 27, 2021 at 4:35 PM Greg Kurz  wrote:
>
> On Wed, 27 Jan 2021 16:22:49 +0100
> Miklos Szeredi  wrote:
>
> > On Wed, Jan 27, 2021 at 4:09 PM Greg Kurz  wrote:
> > >
> > > On Wed, 27 Jan 2021 15:09:50 +0100
> > > Miklos Szeredi  wrote:
> > > > The semantics of O_CREATE are that it can fail neither because the
> > > > file exists nor because it doesn't.  This doesn't matter if the
> > > > exported tree is not modified outside of a single guest, because of
> > > > locking provided by the guest kernel.
> > > >
> > >
> > > Wrong. O_CREAT can legitimately fail with ENOENT if one element
> >
> > Let me make my  statement more precise:
> >
> > O_CREAT cannot fail with ENOENT if parent directory exists throughout
> > the operation.
> >
>
> True, but I still don't see what guarantees guest userspace that the
> parent directory doesn't go away... I must have missed something.
> Please elaborate.

Generally there's no guarantee, however there can be certain
situations where the caller can indeed rely on the existence of the
parent (e.g. /tmp).

Thanks,
Miklos




Re: [PATCH] IOMMU and ATS not supported by vhost-user filesystem.

2021-01-27 Thread Leonardo Augusto Guimarães Garcia

On 1/27/21 8:19 AM, Stefan Hajnoczi wrote:

On Tue, Jan 26, 2021 at 03:23:38PM -0300, lagar...@linux.ibm.com wrote:

From: Leonardo Garcia 

Currently, as IOMMU and ATS are not supported, if a user mistakenly set
any of them and tries to mount the vhost-user filesystem inside the
guest, whenever the user tries to access the mount point, the system
will hang forever.

Signed-off-by: Leonardo Garcia 
---
  hw/virtio/vhost-user-fs-pci.c | 7 +++
  hw/virtio/vhost-user-fs.c | 5 +
  2 files changed, 12 insertions(+)

diff --git a/hw/virtio/vhost-user-fs-pci.c b/hw/virtio/vhost-user-fs-pci.c
index 2ed8492b3f..564d1fd108 100644
--- a/hw/virtio/vhost-user-fs-pci.c
+++ b/hw/virtio/vhost-user-fs-pci.c
@@ -11,6 +11,8 @@
   * top-level directory.
   */
  
+#include "qemu/osdep.h"

+#include "qapi/error.h"
  #include "qemu/osdep.h"
  #include "hw/qdev-properties.h"
  #include "hw/virtio/vhost-user-fs.h"
@@ -45,6 +47,11 @@ static void vhost_user_fs_pci_realize(VirtIOPCIProxy 
*vpci_dev, Error **errp)
  vpci_dev->nvectors = dev->vdev.conf.num_request_queues + 2;
  }
  
+if (vpci_dev->flags & VIRTIO_PCI_FLAG_ATS) {

+error_setg(errp, "ATS is currently not supported with 
vhost-user-fs-pci");
+return;
+}

Why is this check needed in addition to VIRTIO_F_IOMMU_PLATFORM?

What needs to be added to support ATS?


+
  qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
  }
  
diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c

index ac4fc34b36..914d68b3ee 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -203,6 +203,11 @@ static void vuf_device_realize(DeviceState *dev, Error 
**errp)
  return;
  }
  
+if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {

+error_setg(errp, "IOMMU is currently not supported with 
vhost-user-fs");
+return;
+}
+
  if (!vhost_user_init(&fs->vhost_user, &fs->conf.chardev, errp)) {

I thought IOMMU support depends on the vhost-user device backend (e.g.
virtiofsd), so the vhost-user backend should participate in advertising
this feature.

Perhaps the check should be:

 ret = vhost_dev_init(&fs->vhost_dev, &fs->vhost_user,
  VHOST_BACKEND_TYPE_USER, 0);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "vhost_dev_init failed");
 goto err_virtio;
 }
+
+   if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM) &&
+   !(fs->vhost_dev.hdev_features & (1ull << VIRTIO_F_IOMMU_PLATFORM))) {
+   error_setg(errp, "IOMMU is not supported by the vhost-user device 
backend");
+   goto err_iommu_needed;
+   }

Also, can this logic be made generic for all vhost-user devices? It's
not really specific to vhost-user-fs.


Further analyzing this, on vhost-user.c, I see:

    if (virtio_has_feature(features, VIRTIO_F_IOMMU_PLATFORM) &&
    !(virtio_has_feature(dev->protocol_features,
    VHOST_USER_PROTOCOL_F_SLAVE_REQ) &&
 virtio_has_feature(dev->protocol_features,
    VHOST_USER_PROTOCOL_F_REPLY_ACK))) {
    error_report("IOMMU support requires reply-ack and "
 "slave-req protocol features.");
    return -1;
    }

This code was included by commit 6dcdd06. It included support for 
VIRTIO_F_IOMMU_PLATFORM with vhost_net devices. So, the restriction is 
not generic for all vhost-user devices.


Cheers,

Leo



Stefan




[PATCH v6 1/4] hw/misc/pvpanic: split-out generic and bus dependent code

2021-01-27 Thread Mihai Carabas
To ease the PCI device addition in next patches, split the code as follows:
- generic code (read/write/setup) is being kept in pvpanic.c
- ISA dependent code moved to pvpanic-isa.c

Also, rename:
- ISA_PVPANIC_DEVICE -> PVPANIC_ISA_DEVICE.
- TYPE_PVPANIC -> TYPE_PVPANIC_ISA.
- MemoryRegion io -> mr.
- pvpanic_ioport_* in pvpanic_*.

Update the build system with the new files and config structure.

Signed-off-by: Mihai Carabas 
Reviewed-by: Peter Maydell 
---
 hw/i386/Kconfig   |  2 +-
 hw/misc/Kconfig   |  6 ++-
 hw/misc/meson.build   |  3 +-
 hw/misc/pvpanic-isa.c | 94 +++
 hw/misc/pvpanic.c | 85 +++---
 include/hw/misc/pvpanic.h | 23 +++-
 tests/qtest/meson.build   |  2 +-
 7 files changed, 130 insertions(+), 85 deletions(-)
 create mode 100644 hw/misc/pvpanic-isa.c

diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index eea059f..7f91f30 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -14,7 +14,7 @@ config PC
 imply ISA_DEBUG
 imply PARALLEL
 imply PCI_DEVICES
-imply PVPANIC
+imply PVPANIC_ISA
 imply QXL
 imply SEV
 imply SGA
diff --git a/hw/misc/Kconfig b/hw/misc/Kconfig
index cf18ac0..23bc978 100644
--- a/hw/misc/Kconfig
+++ b/hw/misc/Kconfig
@@ -121,9 +121,13 @@ config IOTKIT_SYSCTL
 config IOTKIT_SYSINFO
 bool
 
-config PVPANIC
+config PVPANIC_COMMON
+bool
+
+config PVPANIC_ISA
 bool
 depends on ISA_BUS
+select PVPANIC_COMMON
 
 config AUX
 bool
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index 607cd38..edaaec2 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -13,6 +13,7 @@ softmmu_ss.add(when: 'CONFIG_EMC141X', if_true: 
files('emc141x.c'))
 softmmu_ss.add(when: 'CONFIG_UNIMP', if_true: files('unimp.c'))
 softmmu_ss.add(when: 'CONFIG_EMPTY_SLOT', if_true: files('empty_slot.c'))
 softmmu_ss.add(when: 'CONFIG_LED', if_true: files('led.c'))
+softmmu_ss.add(when: 'CONFIG_PVPANIC_COMMON', if_true: files('pvpanic.c'))
 
 # ARM devices
 softmmu_ss.add(when: 'CONFIG_PL310', if_true: files('arm_l2x0.c'))
@@ -98,7 +99,7 @@ softmmu_ss.add(when: 'CONFIG_IOTKIT_SYSINFO', if_true: 
files('iotkit-sysinfo.c')
 softmmu_ss.add(when: 'CONFIG_ARMSSE_CPUID', if_true: files('armsse-cpuid.c'))
 softmmu_ss.add(when: 'CONFIG_ARMSSE_MHU', if_true: files('armsse-mhu.c'))
 
-softmmu_ss.add(when: 'CONFIG_PVPANIC', if_true: files('pvpanic.c'))
+softmmu_ss.add(when: 'CONFIG_PVPANIC_ISA', if_true: files('pvpanic-isa.c'))
 softmmu_ss.add(when: 'CONFIG_AUX', if_true: files('auxbus.c'))
 softmmu_ss.add(when: 'CONFIG_ASPEED_SOC', if_true: files('aspeed_scu.c', 
'aspeed_sdmc.c', 'aspeed_xdma.c'))
 softmmu_ss.add(when: 'CONFIG_MSF2', if_true: files('msf2-sysreg.c'))
diff --git a/hw/misc/pvpanic-isa.c b/hw/misc/pvpanic-isa.c
new file mode 100644
index 000..27113ab
--- /dev/null
+++ b/hw/misc/pvpanic-isa.c
@@ -0,0 +1,94 @@
+/*
+ * QEMU simulated pvpanic device.
+ *
+ * Copyright Fujitsu, Corp. 2013
+ *
+ * Authors:
+ * Wen Congyang 
+ * Hu Tao 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "sysemu/runstate.h"
+
+#include "hw/nvram/fw_cfg.h"
+#include "hw/qdev-properties.h"
+#include "hw/misc/pvpanic.h"
+#include "qom/object.h"
+#include "hw/isa/isa.h"
+
+OBJECT_DECLARE_SIMPLE_TYPE(PVPanicISAState, PVPANIC_ISA_DEVICE)
+
+/*
+ * PVPanicISAState for ISA device and
+ * use ioport.
+ */
+struct PVPanicISAState {
+ISADevice parent_obj;
+
+uint16_t ioport;
+PVPanicState pvpanic;
+};
+
+static void pvpanic_isa_initfn(Object *obj)
+{
+PVPanicISAState *s = PVPANIC_ISA_DEVICE(obj);
+
+pvpanic_setup_io(&s->pvpanic, DEVICE(s), 1);
+}
+
+static void pvpanic_isa_realizefn(DeviceState *dev, Error **errp)
+{
+ISADevice *d = ISA_DEVICE(dev);
+PVPanicISAState *s = PVPANIC_ISA_DEVICE(dev);
+PVPanicState *ps = &s->pvpanic;
+FWCfgState *fw_cfg = fw_cfg_find();
+uint16_t *pvpanic_port;
+
+if (!fw_cfg) {
+return;
+}
+
+pvpanic_port = g_malloc(sizeof(*pvpanic_port));
+*pvpanic_port = cpu_to_le16(s->ioport);
+fw_cfg_add_file(fw_cfg, "etc/pvpanic-port", pvpanic_port,
+sizeof(*pvpanic_port));
+
+isa_register_ioport(d, &ps->mr, s->ioport);
+}
+
+static Property pvpanic_isa_properties[] = {
+DEFINE_PROP_UINT16(PVPANIC_IOPORT_PROP, PVPanicISAState, ioport, 0x505),
+DEFINE_PROP_UINT8("events", PVPanicISAState, pvpanic.events, 
PVPANIC_PANICKED | PVPANIC_CRASHLOADED),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static void pvpanic_isa_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+dc->realize = pvpanic_isa_realizefn;
+device_class_set_props(dc, pvpanic_isa_properties);
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+}
+
+static

[PATCH v6 2/4] hw/misc/pvpanic: add PCI interface support

2021-01-27 Thread Mihai Carabas
Add PCI interface support for PVPANIC device. Create a new file pvpanic-pci.c
where the PCI specific routines reside and update the build system with the new
files and config structure.

Signed-off-by: Mihai Carabas 
Reviewed-by: Gerd Hoffmann 
Reviewed-by: Peter Maydell 
Signed-off-by: Mihai Carabas 
---
 docs/specs/pci-ids.txt|  1 +
 hw/misc/Kconfig   |  6 +++
 hw/misc/meson.build   |  1 +
 hw/misc/pvpanic-pci.c | 94 +++
 include/hw/misc/pvpanic.h |  1 +
 include/hw/pci/pci.h  |  1 +
 6 files changed, 104 insertions(+)
 create mode 100644 hw/misc/pvpanic-pci.c

diff --git a/docs/specs/pci-ids.txt b/docs/specs/pci-ids.txt
index abbdbca..5e407a6 100644
--- a/docs/specs/pci-ids.txt
+++ b/docs/specs/pci-ids.txt
@@ -64,6 +64,7 @@ PCI devices (other than virtio):
 1b36:000d  PCI xhci usb host adapter
 1b36:000f  mdpy (mdev sample device), linux/samples/vfio-mdev/mdpy.c
 1b36:0010  PCIe NVMe device (-device nvme)
+1b36:0011  PCI PVPanic device (-device pvpanic-pci)
 
 All these devices are documented in docs/specs.
 
diff --git a/hw/misc/Kconfig b/hw/misc/Kconfig
index 23bc978..19c216f 100644
--- a/hw/misc/Kconfig
+++ b/hw/misc/Kconfig
@@ -124,6 +124,12 @@ config IOTKIT_SYSINFO
 config PVPANIC_COMMON
 bool
 
+config PVPANIC_PCI
+bool
+default y if PCI_DEVICES
+depends on PCI
+select PVPANIC_COMMON
+
 config PVPANIC_ISA
 bool
 depends on ISA_BUS
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index edaaec2..6292839 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -100,6 +100,7 @@ softmmu_ss.add(when: 'CONFIG_ARMSSE_CPUID', if_true: 
files('armsse-cpuid.c'))
 softmmu_ss.add(when: 'CONFIG_ARMSSE_MHU', if_true: files('armsse-mhu.c'))
 
 softmmu_ss.add(when: 'CONFIG_PVPANIC_ISA', if_true: files('pvpanic-isa.c'))
+softmmu_ss.add(when: 'CONFIG_PVPANIC_PCI', if_true: files('pvpanic-pci.c'))
 softmmu_ss.add(when: 'CONFIG_AUX', if_true: files('auxbus.c'))
 softmmu_ss.add(when: 'CONFIG_ASPEED_SOC', if_true: files('aspeed_scu.c', 
'aspeed_sdmc.c', 'aspeed_xdma.c'))
 softmmu_ss.add(when: 'CONFIG_MSF2', if_true: files('msf2-sysreg.c'))
diff --git a/hw/misc/pvpanic-pci.c b/hw/misc/pvpanic-pci.c
new file mode 100644
index 000..d629639
--- /dev/null
+++ b/hw/misc/pvpanic-pci.c
@@ -0,0 +1,94 @@
+/*
+ * QEMU simulated PCI pvpanic device.
+ *
+ * Copyright (C) 2020 Oracle
+ *
+ * Authors:
+ * Mihai Carabas 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "sysemu/runstate.h"
+
+#include "hw/nvram/fw_cfg.h"
+#include "hw/qdev-properties.h"
+#include "migration/vmstate.h"
+#include "hw/misc/pvpanic.h"
+#include "qom/object.h"
+#include "hw/pci/pci.h"
+
+OBJECT_DECLARE_SIMPLE_TYPE(PVPanicPCIState, PVPANIC_PCI_DEVICE)
+
+/*
+ * PVPanicPCIState for PCI device
+ */
+typedef struct PVPanicPCIState {
+PCIDevice dev;
+PVPanicState pvpanic;
+} PVPanicPCIState;
+
+static const VMStateDescription vmstate_pvpanic_pci = {
+.name = "pvpanic-pci",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_PCI_DEVICE(dev, PVPanicPCIState),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static void pvpanic_pci_realizefn(PCIDevice *dev, Error **errp)
+{
+PVPanicPCIState *s = PVPANIC_PCI_DEVICE(dev);
+PVPanicState *ps = &s->pvpanic;
+
+pvpanic_setup_io(&s->pvpanic, DEVICE(s), 2);
+
+pci_register_bar(dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &ps->mr);
+}
+
+static Property pvpanic_pci_properties[] = {
+DEFINE_PROP_UINT8("events", PVPanicPCIState, pvpanic.events, 
PVPANIC_PANICKED | PVPANIC_CRASHLOADED),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static void pvpanic_pci_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *pc = PCI_DEVICE_CLASS(klass);
+
+device_class_set_props(dc, pvpanic_pci_properties);
+
+pc->realize = pvpanic_pci_realizefn;
+pc->vendor_id = PCI_VENDOR_ID_REDHAT;
+pc->device_id = PCI_DEVICE_ID_REDHAT_PVPANIC;
+pc->revision = 1;
+pc->class_id = PCI_CLASS_SYSTEM_OTHER;
+dc->vmsd = &vmstate_pvpanic_pci;
+
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+}
+
+static TypeInfo pvpanic_pci_info = {
+.name  = TYPE_PVPANIC_PCI_DEVICE,
+.parent= TYPE_PCI_DEVICE,
+.instance_size = sizeof(PVPanicPCIState),
+.class_init= pvpanic_pci_class_init,
+.interfaces = (InterfaceInfo[]) {
+{ INTERFACE_CONVENTIONAL_PCI_DEVICE },
+{ }
+}
+};
+
+static void pvpanic_register_types(void)
+{
+type_register_static(&pvpanic_pci_info);
+}
+
+type_init(pvpanic_register_types);
diff --git a/include/hw/misc/pvpanic.h b/include/hw/misc/pvpanic.h
index abc9dde..ca3c5bb 100644
--- a/include/hw/misc/pvpanic.h
+++ b/include/hw/misc/pvpanic.h
@@ -18,6 +18,7

Re: [Virtio-fs] [PATCH v2] virtiofsd: prevent opening of special files (CVE-2020-35517)

2021-01-27 Thread Miklos Szeredi
On Wed, Jan 27, 2021 at 4:47 PM Miklos Szeredi  wrote:
>
> On Wed, Jan 27, 2021 at 4:35 PM Greg Kurz  wrote:
> >
> > On Wed, 27 Jan 2021 16:22:49 +0100
> > Miklos Szeredi  wrote:
> >
> > > On Wed, Jan 27, 2021 at 4:09 PM Greg Kurz  wrote:
> > > >
> > > > On Wed, 27 Jan 2021 15:09:50 +0100
> > > > Miklos Szeredi  wrote:
> > > > > The semantics of O_CREATE are that it can fail neither because the
> > > > > file exists nor because it doesn't.  This doesn't matter if the
> > > > > exported tree is not modified outside of a single guest, because of
> > > > > locking provided by the guest kernel.
> > > > >
> > > >
> > > > Wrong. O_CREAT can legitimately fail with ENOENT if one element
> > >
> > > Let me make my  statement more precise:
> > >
> > > O_CREAT cannot fail with ENOENT if parent directory exists throughout
> > > the operation.
> > >
> >
> > True, but I still don't see what guarantees guest userspace that the
> > parent directory doesn't go away... I must have missed something.
> > Please elaborate.
>
> Generally there's no guarantee, however there can be certain
> situations where the caller can indeed rely on the existence of the
> parent (e.g. /tmp).

Example from the virtiofs repo:

https://gitlab.com/virtio-fs/ireg/-/blob/master/ireg.c#L315
https://gitlab.com/virtio-fs/ireg/-/blob/master/ireg.c#L382

In that case breaking O_CREAT would be harmless, since no two
instances are allowed anyway, so it would just give a confusing error.
But it is breakage and it probably wouldn't be hard to find much worse
breakage in real life applications.

Thanks,
Miklos




Re: [PATCH v6 01/14] block: return status from bdrv_append and friends

2021-01-27 Thread Alberto Garcia
On Sat 16 Jan 2021 10:51:56 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> The recommended use of qemu error api assumes returning status together
> with setting errp and avoid void functions with errp parameter. Let's
> improve bdrv_append and some friends to reduce error-propagation
> overhead in further patches.
>
> Choose int return status, because bdrv_replace_node_common() has call
> to bdrv_check_update_perm(), which reports int status, which seems
> correct to propagate.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Reviewed-by: Alberto Garcia 

Berto



Re: [PATCH 1/6] migration: Fix and clean up around @tls-authz

2021-01-27 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> On Thu, Dec 17, 2020 at 02:07:01PM +0100, Markus Armbruster wrote:
>> Daniel P. Berrangé  writes:
>> 
>> > On Mon, Dec 14, 2020 at 11:14:34AM +0100, Markus Armbruster wrote:
>> >> Daniel P. Berrangé  writes:
>> >> 
>> >> > On Fri, Nov 13, 2020 at 07:52:31AM +0100, Markus Armbruster wrote:
>> >> >> Commit d2f1d29b95 "migration: add support for a "tls-authz" migration
>> >> >> parameter" added MigrationParameters member @tls-authz.  Whereas the
>> >> >> other members aren't really optional (see commit 1bda8b3c695), this
>> >> >> one is genuinely optional: migration_instance_init() leaves it absent,
>> >> >> and migration_tls_channel_process_incoming() passes it to
>> >> >> qcrypto_tls_session_new(), which checks for null.
>> >> >> 
>> >> >> Commit d2f1d29b95 has a number of issues, though:
>> >> >> 
>> >> >> * When qmp_query_migrate_parameters() copies migration parameters into
>> >> >>   its reply, it ignores has_tls_authz, and assumes true instead.  When
>> >> >>   it is false,
>> >> >> 
>> >> >>   - HMP info migrate_parameters prints the null pointer (crash bug on
>> >> >> some systems), and
>> >> >> 
>> >> >>   - QMP query-migrate-parameters replies "tls-authz": "" (because the
>> >> >> QObject output visitor silently maps null pointer to "", which it
>> >> >> really shouldn't).
>> >> >> 
>> >> >>   The HMP defect was noticed and fixed in commit 7cd75cbdb8
>> >> >>   'migration: use "" instead of (null) for tls-authz'.  Unfortunately,
>> >> >>   the fix papered over the real bug: it made
>> >> >>   qmp_query_migrate_parameters() map null tls_authz to "".  It also
>> >> >>   dropped the check for has_tls_authz from
>> >> >>   hmp_info_migrate_parameters().
>> >> >> 
>> >> >>   Revert, and fix qmp_query_migrate_parameters() not to screw up
>> >> >>   has_tls_authz.  No change to HMP.  QMP now has "tls-authz" in the
>> >> >>   reply only when it's actually present in
>> >> >>   migrate_get_current()->parameters.  If we prefer to remain
>> >> >>   bug-compatible, we should make tls_authz non-optional there.
>> >> >> 
>> >> >> * migrate_params_test_apply() neglects to apply tls_authz.  Currently
>> >> >>   harmless, because migrate_params_check() doesn't care.  Fix it
>> >> >>   anyway.
>> >> >> 
>> >> >> * qmp_migrate_set_parameters() crashes:
>> >> >> 
>> >> >> {"execute": "migrate-set-parameters", "arguments": {"tls-authz": 
>> >> >> null}}
>> >> >> 
>> >> >>   Add the necessary rewrite of null to "".  For background
>> >> >>   information, see commit 01fa559826 "migration: Use JSON null instead
>> >> >>   of "" to reset parameter to default".
>> >> >> 
>> >> >> Fixes: d2f1d29b95aa45d13262b39153ff501ed6b1ac95
>> >> >> Cc: Daniel P. Berrangé 
>> >> >> Signed-off-by: Markus Armbruster 
>> >> >> ---
>> >> >>  qapi/migration.json   |  2 +-
>> >> >>  migration/migration.c | 17 ++---
>> >> >>  monitor/hmp-cmds.c|  2 +-
>> >> >>  3 files changed, 16 insertions(+), 5 deletions(-)
>> >> >> 
>> >> >> diff --git a/qapi/migration.json b/qapi/migration.json
>> >> >> index 3c75820527..688e8da749 100644
>> >> >> --- a/qapi/migration.json
>> >> >> +++ b/qapi/migration.json
>> >> >> @@ -928,7 +928,7 @@
>> >> >>  ##
>> >> >>  # @MigrationParameters:
>> >> >>  #
>> >> >> -# The optional members aren't actually optional.
>> >> >> +# The optional members aren't actually optional, except for 
>> >> >> @tls-authz.
>> >> >
>> >> > and tls-hostname and tls-creds.
>> >> 
>> >> Really?  See [*] below.
>> >> 
>> >> >>  #
>> >> >>  # @announce-initial: Initial delay (in milliseconds) before sending 
>> >> >> the
>> >> >>  #first announce (Since 4.0)
>> >> >> diff --git a/migration/migration.c b/migration/migration.c
>> >> >> index 3263aa55a9..cad56fbf8c 100644
>> >> >> --- a/migration/migration.c
>> >> >> +++ b/migration/migration.c
>> >> >> @@ -855,9 +855,8 @@ MigrationParameters 
>> >> >> *qmp_query_migrate_parameters(Error **errp)
>> >> params->has_tls_creds = true;
>> >> >>  params->tls_creds = g_strdup(s->parameters.tls_creds);
>> >> >>  params->has_tls_hostname = true;
>> >> >>  params->tls_hostname = g_strdup(s->parameters.tls_hostname);
>> >> 
>> >> [*] Looks non-optional to me.
>> >
>> > I guess it depends on what you mean by "optional" :-)
>> 
>> I meant "non-optional in the value of query-migrate-parameters".  The
>> comment were debating applies to that value, and nothing else.
>> 
>> > When I say they are all optional, I'm talking about from the POV
>> > of the end users / mgmt who first configures a migration operation.
>> >
>> > tls-creds only needs to be set if you want to enable TLS
>> >
>> > tls-hostname only needs to be set if you need to override the
>> > default hostname used for cert validation.
>> >
>> > tls-authz only needs to be set if you want to enable access
>> > control over migration clients.
>> >
>> > IOW, all three are optional from the POV of configuring a
>> > migration.
>> 
>> Understood.
>> 
>> > A

Re: [PATCH] MAINTAINERS: Update 9pfs tree URL

2021-01-27 Thread Greg Kurz
On Wed, 27 Jan 2021 15:33:51 +0100
Christian Schoenebeck  wrote:

> On Freitag, 15. Januar 2021 14:50:17 CET Christian Schoenebeck wrote:
> > On Freitag, 15. Januar 2021 14:42:24 CET Greg Kurz wrote:
> > > I've already moved my repositories to gitlab for extra CI coverage,
> > > and I won't use the ones at github anymore.
> > > 
> > > Signed-off-by: Greg Kurz 
> > 
> > Reviewed-by: Christian Schoenebeck 
> > 
> > > ---
> > > 
> > >  MAINTAINERS |2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > index cb0656aec3d4..21038d3fdfce 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -1828,7 +1828,7 @@ X: hw/9pfs/xen-9p*
> > > 
> > >  F: fsdev/
> > >  F: docs/interop/virtfs-proxy-helper.rst
> > >  F: tests/qtest/virtio-9p-test.c
> > > 
> > > -T: git https://github.com/gkurz/qemu.git 9p-next
> > > +T: git https://gitlab.com/gkurz/qemu.git 9p-next
> > > 
> > >  virtio-blk
> > >  M: Stefan Hajnoczi 
> 
> What's the status of this patch? I would add my T: line below just for the 
> records. But I'd rather wait for this patch being merged to main line.
> 

Drat... I forgot about this one and I'm not about to post a PR
anytime soon.

Laurent,

Any chance you can take this in the trivial tree ?

> Best regards,
> Christian Schoenebeck
> 
> 




Re: [PATCH] MAINTAINERS: Update 9pfs tree URL

2021-01-27 Thread Laurent Vivier
Le 27/01/2021 à 17:02, Greg Kurz a écrit :
> On Wed, 27 Jan 2021 15:33:51 +0100
> Christian Schoenebeck  wrote:
> 
>> On Freitag, 15. Januar 2021 14:50:17 CET Christian Schoenebeck wrote:
>>> On Freitag, 15. Januar 2021 14:42:24 CET Greg Kurz wrote:
 I've already moved my repositories to gitlab for extra CI coverage,
 and I won't use the ones at github anymore.

 Signed-off-by: Greg Kurz 
>>>
>>> Reviewed-by: Christian Schoenebeck 
>>>
 ---

  MAINTAINERS |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/MAINTAINERS b/MAINTAINERS
 index cb0656aec3d4..21038d3fdfce 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
 @@ -1828,7 +1828,7 @@ X: hw/9pfs/xen-9p*

  F: fsdev/
  F: docs/interop/virtfs-proxy-helper.rst
  F: tests/qtest/virtio-9p-test.c

 -T: git https://github.com/gkurz/qemu.git 9p-next
 +T: git https://gitlab.com/gkurz/qemu.git 9p-next

  virtio-blk
  M: Stefan Hajnoczi 
>>
>> What's the status of this patch? I would add my T: line below just for the 
>> records. But I'd rather wait for this patch being merged to main line.
>>
> 
> Drat... I forgot about this one and I'm not about to post a PR
> anytime soon.
> 
> Laurent,
> 
> Any chance you can take this in the trivial tree ?
> 

Applied to my trivial-patches branch.

Thanks,
Laurent





Re: [PATCH] tcg/tci: Restrict tci_write_reg16() to 64-bit hosts

2021-01-27 Thread Laurent Vivier
Le 23/01/2021 à 11:30, Stefan Weil a écrit :
> Am 23.01.21 um 10:41 schrieb Philippe Mathieu-Daudé:
> 
>> Restrict tci_write_reg16() to 64-bit hosts to fix on 32-bit ones:
>>
>>    [520/1115] Compiling C object libqemu-arm-linux-user.fa.p/tcg_tci.c.o
>>    FAILED: libqemu-arm-linux-user.fa.p/tcg_tci.c.o
>>    tcg/tci.c:132:1: error: 'tci_write_reg16' defined but not used 
>> [-Werror=unused-function]
>>     tci_write_reg16(tcg_target_ulong *regs, TCGReg index, uint16_t value)
>>     ^~~
>>
>> Fixes: 2f160e0f979 ("tci: Add implementation for INDEX_op_ld16u_i64")
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>   tcg/tci.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/tcg/tci.c b/tcg/tci.c
>> index 2311aa7d3ab..3fc82d3c79d 100644
>> --- a/tcg/tci.c
>> +++ b/tcg/tci.c
>> @@ -128,11 +128,13 @@ static void tci_write_reg8(tcg_target_ulong *regs, 
>> TCGReg index, uint8_t value)
>>   tci_write_reg(regs, index, value);
>>   }
>>   +#if TCG_TARGET_REG_BITS == 64
>>   static void
>>   tci_write_reg16(tcg_target_ulong *regs, TCGReg index, uint16_t value)
>>   {
>>   tci_write_reg(regs, index, value);
>>   }
>> +#endif
>>     static void
>>   tci_write_reg32(tcg_target_ulong *regs, TCGReg index, uint32_t value)
> 
> 
> Thanks for fixing this. This could optionally be applied via qemu-trivial.
> 
> Reviewed-by: Stefan Weil 
> 
> 

Applied to my trivial-patches branch.

Thanks,
Laurent





Re: [PATCHv2] configure: replace --enable/disable-git-update with --with-git-submodules

2021-01-27 Thread Daniel P . Berrangé
On Tue, Jan 19, 2021 at 12:20:46PM -0500, Dan Streetman wrote:
> Replace the --enable-git-update and --disable-git-update configure params
> with the param --with-git-submodules=(update|validate|ignore) to
> allow 3 options for building from a git repo.
> 
> This is needed because downstream packagers, e.g. Debian, Ubuntu, etc,
> also keep the source code in git, but do not want to enable the
> 'git_update' mode; with the current code, that's not possible even
> if the downstream package specifies --disable-git-update.
> 
> The previous parameters are deprecated but still available; the
> --enable-git-update parameter maps to --with-git-submodules=update and
> --disable-git-update parameter maps to --with-git-submodules=validate.
> 
> The configure script behavior is slightly modified, where previously
> the dtc, capstone, and slirp submodules were not validated when
> --disable-git-update was specified (but were updated with git-update
> enabled), now they are validated when using --with-git-submodules=validate
> and are only ignored when using --with-git-submodules=ignore.
> 
> Signed-off-by: Dan Streetman 
> ---
> v1: https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg04799.html
> changes since v1:
>  - add --help output explaining --with-git-submodules valid values
>  - validate dtc, capstone, slirp submodules also
>  - update commit description text
> 
>  Makefile | 24 ++-
>  configure| 51 ++--
>  scripts/git-submodule.sh | 34 ---
>  3 files changed, 66 insertions(+), 43 deletions(-)

Reviewed-by: Daniel P. Berrangé 

I'll queue this with a batch of misc changes i have pending.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] net/slirp.c: Fix spelling error in error message

2021-01-27 Thread Laurent Vivier
Le 25/01/2021 à 17:08, Philippe Mathieu-Daudé a écrit :
> On 1/22/21 1:42 AM, dje--- via wrote:
>> DNS should be DHCP
>>
>> Signed-off-by: Doug Evans 
>> ---
>>  net/slirp.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Reviewed-by: Philippe Mathieu-Daudé 
> 
> 

Applied to my trivial-patches branch.

Thanks,
Laurent




Re: [PATCH] IOMMU and ATS not supported by vhost-user filesystem.

2021-01-27 Thread Dr. David Alan Gilbert
* Leonardo Augusto Guimarães Garcia (lagar...@linux.ibm.com) wrote:
> On 1/27/21 8:19 AM, Stefan Hajnoczi wrote:
> > On Tue, Jan 26, 2021 at 03:23:38PM -0300, lagar...@linux.ibm.com wrote:
> > > From: Leonardo Garcia 
> > > 
> > > Currently, as IOMMU and ATS are not supported, if a user mistakenly set
> > > any of them and tries to mount the vhost-user filesystem inside the
> > > guest, whenever the user tries to access the mount point, the system
> > > will hang forever.
> > > 
> > > Signed-off-by: Leonardo Garcia 
> > > ---
> > >   hw/virtio/vhost-user-fs-pci.c | 7 +++
> > >   hw/virtio/vhost-user-fs.c | 5 +
> > >   2 files changed, 12 insertions(+)
> > > 
> > > diff --git a/hw/virtio/vhost-user-fs-pci.c b/hw/virtio/vhost-user-fs-pci.c
> > > index 2ed8492b3f..564d1fd108 100644
> > > --- a/hw/virtio/vhost-user-fs-pci.c
> > > +++ b/hw/virtio/vhost-user-fs-pci.c
> > > @@ -11,6 +11,8 @@
> > >* top-level directory.
> > >*/
> > > +#include "qemu/osdep.h"
> > > +#include "qapi/error.h"
> > >   #include "qemu/osdep.h"
> > >   #include "hw/qdev-properties.h"
> > >   #include "hw/virtio/vhost-user-fs.h"
> > > @@ -45,6 +47,11 @@ static void vhost_user_fs_pci_realize(VirtIOPCIProxy 
> > > *vpci_dev, Error **errp)
> > >   vpci_dev->nvectors = dev->vdev.conf.num_request_queues + 2;
> > >   }
> > > +if (vpci_dev->flags & VIRTIO_PCI_FLAG_ATS) {
> > > +error_setg(errp, "ATS is currently not supported with 
> > > vhost-user-fs-pci");
> > > +return;
> > > +}
> > Why is this check needed in addition to VIRTIO_F_IOMMU_PLATFORM?
> > 
> > What needs to be added to support ATS?
> > 
> > > +
> > >   qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
> > >   }
> > > diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> > > index ac4fc34b36..914d68b3ee 100644
> > > --- a/hw/virtio/vhost-user-fs.c
> > > +++ b/hw/virtio/vhost-user-fs.c
> > > @@ -203,6 +203,11 @@ static void vuf_device_realize(DeviceState *dev, 
> > > Error **errp)
> > >   return;
> > >   }
> > > +if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
> > > +error_setg(errp, "IOMMU is currently not supported with 
> > > vhost-user-fs");
> > > +return;
> > > +}
> > > +
> > >   if (!vhost_user_init(&fs->vhost_user, &fs->conf.chardev, errp)) {
> > I thought IOMMU support depends on the vhost-user device backend (e.g.
> > virtiofsd), so the vhost-user backend should participate in advertising
> > this feature.
> > 
> > Perhaps the check should be:
> > 
> >  ret = vhost_dev_init(&fs->vhost_dev, &fs->vhost_user,
> >   VHOST_BACKEND_TYPE_USER, 0);
> >  if (ret < 0) {
> >  error_setg_errno(errp, -ret, "vhost_dev_init failed");
> >  goto err_virtio;
> >  }
> > +
> > +   if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM) &&
> > +   !(fs->vhost_dev.hdev_features & (1ull << VIRTIO_F_IOMMU_PLATFORM))) 
> > {
> > +   error_setg(errp, "IOMMU is not supported by the vhost-user device 
> > backend");
> > +   goto err_iommu_needed;
> > +   }
> > 
> > Also, can this logic be made generic for all vhost-user devices? It's
> > not really specific to vhost-user-fs.
> 
> Further analyzing this, on vhost-user.c, I see:
> 
>         if (virtio_has_feature(features, VIRTIO_F_IOMMU_PLATFORM) &&
>                 !(virtio_has_feature(dev->protocol_features,
>                     VHOST_USER_PROTOCOL_F_SLAVE_REQ) &&
>                  virtio_has_feature(dev->protocol_features,
>                     VHOST_USER_PROTOCOL_F_REPLY_ACK))) {
>             error_report("IOMMU support requires reply-ack and "
>                          "slave-req protocol 
> features.");
>             return -1;
>         }
> 
> This code was included by commit 6dcdd06. It included support for
> VIRTIO_F_IOMMU_PLATFORM with vhost_net devices. So, the restriction is not
> generic for all vhost-user devices.

That test is a slightly different test; that's checking that the
vhost-user device has two underlying features that are needed to make
iommu work; it's not a full test though; it doesn't actually check the
vhost-user device also wants to do iommu.

Dave

> Cheers,
> 
> Leo
> 
> > 
> > Stefan
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH] MAINTAINERS: Update 9pfs tree URL

2021-01-27 Thread Christian Schoenebeck
On Mittwoch, 27. Januar 2021 17:07:38 CET Laurent Vivier wrote:
> Le 27/01/2021 à 17:02, Greg Kurz a écrit :
> > On Wed, 27 Jan 2021 15:33:51 +0100
> > 
> > Christian Schoenebeck  wrote:
> >> On Freitag, 15. Januar 2021 14:50:17 CET Christian Schoenebeck wrote:
> >>> On Freitag, 15. Januar 2021 14:42:24 CET Greg Kurz wrote:
>  I've already moved my repositories to gitlab for extra CI coverage,
>  and I won't use the ones at github anymore.
>  
>  Signed-off-by: Greg Kurz 
> >>> 
> >>> Reviewed-by: Christian Schoenebeck 
> >>> 
>  ---
>  
>   MAINTAINERS |2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>  
>  diff --git a/MAINTAINERS b/MAINTAINERS
>  index cb0656aec3d4..21038d3fdfce 100644
>  --- a/MAINTAINERS
>  +++ b/MAINTAINERS
>  @@ -1828,7 +1828,7 @@ X: hw/9pfs/xen-9p*
>  
>   F: fsdev/
>   F: docs/interop/virtfs-proxy-helper.rst
>   F: tests/qtest/virtio-9p-test.c
>  
>  -T: git https://github.com/gkurz/qemu.git 9p-next
>  +T: git https://gitlab.com/gkurz/qemu.git 9p-next
>  
>   virtio-blk
>   M: Stefan Hajnoczi 
> >> 
> >> What's the status of this patch? I would add my T: line below just for
> >> the
> >> records. But I'd rather wait for this patch being merged to main line.
> > 
> > Drat... I forgot about this one and I'm not about to post a PR
> > anytime soon.

:-)

I suggest I will take over the next 9p queue round of whatever might come up. 
I have some head room now.

> > 
> > Laurent,
> > 
> > Any chance you can take this in the trivial tree ?
> 
> Applied to my trivial-patches branch.
> 
> Thanks,
> Laurent

Thanks Laurent!

I wait for this being merged. No hurry.

Best regards,
Christian Schoenebeck





Re: macOS (Big Sur, Apple Silicon) 'make check' fails in test-crypto-tlscredsx509

2021-01-27 Thread Stefan Weil

Am 27.01.21 um 13:17 schrieb Daniel P. Berrangé:


On Tue, Jan 26, 2021 at 04:41:13PM +, Peter Maydell wrote:

On Tue, 26 Jan 2021 at 16:37, Daniel P. Berrangé  wrote:

On Tue, Jan 26, 2021 at 04:32:08PM +, Peter Maydell wrote:

** (tests/test-crypto-tlscredsx509:35180): CRITICAL **: 16:23:34.590:
Failed to sign certificate ASN1 parser: Value is not valid.
ERROR test-crypto-tlscredsx509 - Bail out! FATAL-CRITICAL: Failed to
sign certificate ASN1 parser: Value is not valid.
make: *** [run-test-70] Error 1


Does this failure ring any bells for anybody?

Not seen it before.

Is this using a gnutls from homebrew, or one that apple
ship themselves ?  Any idea what version it is ?

Homebrew gnutls, 3.6.15.

On further investigation it seems the error comes from libtasn1,
but unfortunately there are 100's of scenarios it could arise
so difficult one to debug.

In the test_tls_generate_cert method in QEMU tests/crypto-tls-x509-helpers.c

There are conditional lines like

 if (req->country) {

 if (req->altname1) {
 ...etc...

I guess one, or more of those, is writing data that libtasn1 is not happy
with.

Some one with easy access to this apple silicon will likely need to start
by incrementally disabling each of those conditionals eg.  if (req->country
&& 0)

until we find out which one (might be more than one) make the

Failed to sign certificate ASN1 parser: Value is not valid.

error message go away. NB, once that ASN1 error goes away, the QEMU test
suite will likely give its own error because the certs will no longer
have the data it is expecting.

Regards,
Daniel



I could debug into gnutls_x509_crt_sign2. gnutls_x509_crt_privkey_sign 
seems to fail.


Disabling the conditionals mentioned above did not help.

And I also have another problem when running "make check-tcg":

% make check-tcg
  BUILD   TCG tests for aarch64-softmmu
  BUILD   aarch64-softmmu guest-tests with cc
/qemu/tests/tcg/aarch64/system/boot.S:81:18: error: unexpected token in 
'.section' directive

 .section .rodata
 ^





Re: [PATCH v20 01/20] multi-process: add the concept description to docs/devel/qemu-multiprocess

2021-01-27 Thread Jag Raman
Hi Cedric,

> On Jan 25, 2021, at 11:57 AM, Cédric Le Goater  wrote:
> 
> Hello, 
> 
> Thanks for this patchset. I have a slightly "off topic" question 
> but I don't want to distract the maintainers from merging any of 
> this. Feel free to ignore.
> 
> I understand that this framework is targetting KVM and mostly PCI 
> devices but I was wondering if it could be of any use for full 
> system emulation. Would it possible to use this framework to 
> interconnect QEMU processes emulating different machines but 
> sharing a common bus ? 

We are eventually moving towards a vfio-user framework which implements
VFIO over socket. What you are envisioning could be something in the
evolution of QEMU devices in the future, although VFIO does not support
all types of devices presently.

I am following the efforts to modularize QEMU devices into runtime libraries,
which probably would be one of the significant steps to emulating devices
in distributed fashion.

> 
> Using the proxy object, could a "slave" QEMU machine (host) connect 
> to a remote device implemented in another QEMU (bmc) machine ? or
> are we considering that remote processes are dedicated to device 
> modeling and no more. 

Presently, we are addressing the case where the remote device is in the
same machine as QEMU. Moving the remote device to a different machine
has a network bottleneck - the performance of such a model would depend
on the performance of various protocols like RDMA (for remote DMA), etc...

Thank you!
—
Jag

> 
> What I have in mind for the moment is LPC, FW address space and 
> some ISA devices, but the list could extend.
> 
> Thanks,
> 
> C.
> 
> 
> 
> On 1/19/21 9:28 PM, Jagannathan Raman wrote:
>> From: John G Johnson 
>> 
>> Signed-off-by: John G Johnson 
>> Signed-off-by: Elena Ufimtseva 
>> Signed-off-by: Jagannathan Raman 
>> Reviewed-by: Stefan Hajnoczi 
>> ---
>> docs/devel/index.rst |   1 +
>> docs/devel/multi-process.rst | 966 
>> +++
>> MAINTAINERS  |   7 +
>> 3 files changed, 974 insertions(+)
>> create mode 100644 docs/devel/multi-process.rst
>> 
>> diff --git a/docs/devel/index.rst b/docs/devel/index.rst
>> index ea0e1e1..5ccaf8b 100644
>> --- a/docs/devel/index.rst
>> +++ b/docs/devel/index.rst
>> @@ -36,3 +36,4 @@ Contents:
>>clocks
>>qom
>>block-coroutine-wrapper
>> +   multi-process
>> diff --git a/docs/devel/multi-process.rst b/docs/devel/multi-process.rst
>> new file mode 100644
>> index 000..6969932
>> --- /dev/null
>> +++ b/docs/devel/multi-process.rst
>> @@ -0,0 +1,966 @@
>> +This is the design document for multi-process QEMU. It does not
>> +necessarily reflect the status of the current implementation, which
>> +may lack features or be considerably different from what is described
>> +in this document. This document is still useful as a description of
>> +the goals and general direction of this feature.
>> +
>> +Please refer to the following wiki for latest details:
>> +https://wiki.qemu.org/Features/MultiProcessQEMU
>> +
>> +Multi-process QEMU
>> +===
>> +
>> +QEMU is often used as the hypervisor for virtual machines running in the
>> +Oracle cloud. Since one of the advantages of cloud computing is the
>> +ability to run many VMs from different tenants in the same cloud
>> +infrastructure, a guest that compromised its hypervisor could
>> +potentially use the hypervisor's access privileges to access data it is
>> +not authorized for.
>> +
>> +QEMU can be susceptible to security attacks because it is a large,
>> +monolithic program that provides many features to the VMs it services.
>> +Many of these features can be configured out of QEMU, but even a reduced
>> +configuration QEMU has a large amount of code a guest can potentially
>> +attack. Separating QEMU reduces the attack surface by aiding to
>> +limit each component in the system to only access the resources that
>> +it needs to perform its job.
>> +
>> +QEMU services
>> +-
>> +
>> +QEMU can be broadly described as providing three main services. One is a
>> +VM control point, where VMs can be created, migrated, re-configured, and
>> +destroyed. A second is to emulate the CPU instructions within the VM,
>> +often accelerated by HW virtualization features such as Intel's VT
>> +extensions. Finally, it provides IO services to the VM by emulating HW
>> +IO devices, such as disk and network devices.
>> +
>> +A multi-process QEMU
>> +
>> +
>> +A multi-process QEMU involves separating QEMU services into separate
>> +host processes. Each of these processes can be given only the privileges
>> +it needs to provide its service, e.g., a disk service could be given
>> +access only to the disk images it provides, and not be allowed to
>> +access other files, or any network devices. An attacker who compromised
>> +this service would not be able to use this exploit to access files or
>> +devices beyond what the disk service was given acces

Re: [PATCH] IOMMU and ATS not supported by vhost-user filesystem.

2021-01-27 Thread Laszlo Ersek
On 01/27/21 12:19, Stefan Hajnoczi wrote:
> On Tue, Jan 26, 2021 at 03:23:38PM -0300, lagar...@linux.ibm.com wrote:
>> From: Leonardo Garcia 
>>
>> Currently, as IOMMU and ATS are not supported, if a user mistakenly set
>> any of them and tries to mount the vhost-user filesystem inside the
>> guest, whenever the user tries to access the mount point, the system
>> will hang forever.
>>
>> Signed-off-by: Leonardo Garcia 
>> ---
>>  hw/virtio/vhost-user-fs-pci.c | 7 +++
>>  hw/virtio/vhost-user-fs.c | 5 +
>>  2 files changed, 12 insertions(+)
>>
>> diff --git a/hw/virtio/vhost-user-fs-pci.c b/hw/virtio/vhost-user-fs-pci.c
>> index 2ed8492b3f..564d1fd108 100644
>> --- a/hw/virtio/vhost-user-fs-pci.c
>> +++ b/hw/virtio/vhost-user-fs-pci.c
>> @@ -11,6 +11,8 @@
>>   * top-level directory.
>>   */
>>  
>> +#include "qemu/osdep.h"
>> +#include "qapi/error.h"
>>  #include "qemu/osdep.h"
>>  #include "hw/qdev-properties.h"
>>  #include "hw/virtio/vhost-user-fs.h"
>> @@ -45,6 +47,11 @@ static void vhost_user_fs_pci_realize(VirtIOPCIProxy 
>> *vpci_dev, Error **errp)
>>  vpci_dev->nvectors = dev->vdev.conf.num_request_queues + 2;
>>  }
>>  
>> +if (vpci_dev->flags & VIRTIO_PCI_FLAG_ATS) {
>> +error_setg(errp, "ATS is currently not supported with 
>> vhost-user-fs-pci");
>> +return;
>> +}
> 
> Why is this check needed in addition to VIRTIO_F_IOMMU_PLATFORM?
> 
> What needs to be added to support ATS?
> 
>> +
>>  qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
>>  }
>>  
>> diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
>> index ac4fc34b36..914d68b3ee 100644
>> --- a/hw/virtio/vhost-user-fs.c
>> +++ b/hw/virtio/vhost-user-fs.c
>> @@ -203,6 +203,11 @@ static void vuf_device_realize(DeviceState *dev, Error 
>> **errp)
>>  return;
>>  }
>>  
>> +if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
>> +error_setg(errp, "IOMMU is currently not supported with 
>> vhost-user-fs");
>> +return;
>> +}
>> +
>>  if (!vhost_user_init(&fs->vhost_user, &fs->conf.chardev, errp)) {
> 
> I thought IOMMU support depends on the vhost-user device backend (e.g.
> virtiofsd), so the vhost-user backend should participate in advertising
> this feature.

... I had the same thought when a few weeks earlier I tried to use
virtio-fs with an SEV guest (just OVMF), and virtiofsd crashed,
apparently :)

(I didn't report it because, well, SEV wants to prevent sharing between
host and guest, and virtio-fs works precisely in the opposite direction;
so the use case may not have much merit.)

Thanks
Laszlo

> 
> Perhaps the check should be:
> 
> ret = vhost_dev_init(&fs->vhost_dev, &fs->vhost_user,
>  VHOST_BACKEND_TYPE_USER, 0);
> if (ret < 0) {
> error_setg_errno(errp, -ret, "vhost_dev_init failed");
> goto err_virtio;
> }
> +
> +   if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM) &&
> +   !(fs->vhost_dev.hdev_features & (1ull << VIRTIO_F_IOMMU_PLATFORM))) {
> +   error_setg(errp, "IOMMU is not supported by the vhost-user device 
> backend");
> +   goto err_iommu_needed;
> +   }
> 
> Also, can this logic be made generic for all vhost-user devices? It's
> not really specific to vhost-user-fs.
> 
> Stefan
> 




Re: macOS (Big Sur, Apple Silicon) 'make check' fails in test-crypto-tlscredsx509

2021-01-27 Thread Daniel P . Berrangé
On Wed, Jan 27, 2021 at 05:44:59PM +0100, Stefan Weil wrote:
> Am 27.01.21 um 13:17 schrieb Daniel P. Berrangé:
> 
> > On Tue, Jan 26, 2021 at 04:41:13PM +, Peter Maydell wrote:
> > > On Tue, 26 Jan 2021 at 16:37, Daniel P. Berrangé  
> > > wrote:
> > > > On Tue, Jan 26, 2021 at 04:32:08PM +, Peter Maydell wrote:
> > > > > ** (tests/test-crypto-tlscredsx509:35180): CRITICAL **: 16:23:34.590:
> > > > > Failed to sign certificate ASN1 parser: Value is not valid.
> > > > > ERROR test-crypto-tlscredsx509 - Bail out! FATAL-CRITICAL: Failed to
> > > > > sign certificate ASN1 parser: Value is not valid.
> > > > > make: *** [run-test-70] Error 1
> > > > > 
> > > > > 
> > > > > Does this failure ring any bells for anybody?
> > > > Not seen it before.
> > > > 
> > > > Is this using a gnutls from homebrew, or one that apple
> > > > ship themselves ?  Any idea what version it is ?
> > > Homebrew gnutls, 3.6.15.
> > On further investigation it seems the error comes from libtasn1,
> > but unfortunately there are 100's of scenarios it could arise
> > so difficult one to debug.
> > 
> > In the test_tls_generate_cert method in QEMU tests/crypto-tls-x509-helpers.c
> > 
> > There are conditional lines like
> > 
> >  if (req->country) {
> > 
> >  if (req->altname1) {
> >  ...etc...
> > 
> > I guess one, or more of those, is writing data that libtasn1 is not happy
> > with.
> > 
> > Some one with easy access to this apple silicon will likely need to start
> > by incrementally disabling each of those conditionals eg.  if (req->country
> > && 0)
> > 
> > until we find out which one (might be more than one) make the
> > 
> > Failed to sign certificate ASN1 parser: Value is not valid.
> > 
> > error message go away. NB, once that ASN1 error goes away, the QEMU test
> > suite will likely give its own error because the certs will no longer
> > have the data it is expecting.
> > 
> > Regards,
> > Daniel
> 
> 
> I could debug into gnutls_x509_crt_sign2. gnutls_x509_crt_privkey_sign seems
> to fail.
> 
> Disabling the conditionals mentioned above did not help.

In $QEMU.git/crypto/init.c can you uncomment  the "#define DEBUG_GNUTLS"
line and then re-build and re-run the test case.

There's a bunch of debug logs in code paths from gnutls_x509_crt_privkey_sign
that might give us useful info.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v3] machine: add missing doc for memory-backend option

2021-01-27 Thread Michal Privoznik

On 1/27/21 4:35 PM, Igor Mammedov wrote:

On Wed, 27 Jan 2021 15:24:26 +0100
Michal Privoznik  wrote:


On 1/27/21 11:54 AM, Daniel P. Berrangé wrote:

On Wed, Jan 27, 2021 at 10:45:11AM +, Daniel P. Berrangé wrote:

On Thu, Jan 21, 2021 at 11:15:04AM -0500, Igor Mammedov wrote:





How does a mgmt app know which machine types need to use this
option ? The machine type names are opaque strings, and apps
must not attempt to parse or interpret the version number
inside the machine type name, as they can be changed by
distros.  IOW, saying to use it for machine types 4.0 and
older isn't a valid usage strategy IMHO.

it's possible (but no necessary) to use knob with new machine types
(defaults for these match suggested property value).
Limiting knob usage to 4.0 and older would allow us to drop
without extra efforts once 4.0 is deprecated/removed.


Problem here is that libvirt treats machine type as an opaque string. 
Therefore, as could be seen in my patch for libvirt, the property is 
disabled for all started VMs, regardless of machine type:


https://www.redhat.com/archives/libvir-list/2021-January/msg00686.html

So it can't really go away ever, can it?




Looking at the libvirt patch, we do indeed use his property
unconditionally for all machine types, precisely because parsing
version numbers from the machine type is not allowed.

https://www.redhat.com/archives/libvir-list/2021-January/msg00633.html

So this doc is telling apps to do something that isn't viable


The other approach that I was suggesting was, that QEMU stops reporting
'default-ram-id' for affected machine types. The way the switch from '-m
XMB' to memory-backend-* was implemented in libvirt is that if libvirt
sees 'default-ram-id' attribute for given machine type it uses
memory-backend-* otherwise it falls back to -m.

Since we know which machine types are "broken", we can stop reporting
the attribute and thus stop tickling this bug. I agree that it puts more
burden on distro maintainers to backport the change, but I think it's
acceptable risk.


default-ram-id is already exposed in wild including old machine types
starting from 5.2


It is, but according to qapi/machine.json it is optional. Mgmt apps have 
to be able to deal with it missing.




if libvirt will take care this one quirk, then I guess we can
do as suggested. I can post an additional patch to this effect if there
is agreement to go this route.


The beauty of this solution is that libvirt wouldn't need to do anything 
:-)  As I said earlier, if no default-ram-id is found then libvirt falls 
back to '-m X'.


I've cooked a dirty patch that works in my testing:

diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index ae0c4a..2214782d72 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -238,8 +238,33 @@ MachineInfoList *qmp_query_machines(Error **errp)
 info->has_default_cpu_type = true;
 }
 if (mc->default_ram_id) {
-info->default_ram_id = g_strdup(mc->default_ram_id);
-info->has_default_ram_id = true;
+int i;
+bool broken = false;
+
+/* Default RAM ID is broken if 
x-use-canonical-path-for-ramblock-id
+ * property of memory-backend is on. That's why it's 
disabled in
+ * create_default_memdev(). However, some machine types 
turn it on

+ * for backwards compatibility. */
+for (i = 0; i < mc->compat_props->len; i++) {
+GlobalProperty *p = g_ptr_array_index(mc->compat_props, i);
+
+if (strcmp(p->driver, TYPE_MEMORY_BACKEND_FILE) != 0)
+continue;
+
+if (strcmp(p->property, 
"x-use-canonical-path-for-ramblock-id") != 0)

+continue;
+
+if (strcmp(p->value, "true") != 0)
+continue;
+
+broken = true;
+break;
+}
+
+if (!broken) {
+info->default_ram_id = g_strdup(mc->default_ram_id);
+info->has_default_ram_id = true;
+}
 }

 QAPI_LIST_PREPEND(mach_list, info);


Michal




Re: macOS (Big Sur, Apple Silicon) 'make check' fails in test-crypto-tlscredsx509

2021-01-27 Thread Stefan Weil

Am 27.01.21 um 17:53 schrieb Daniel P. Berrangé:


In $QEMU.git/crypto/init.c can you uncomment the "#define DEBUG_GNUTLS"
line and then re-build and re-run the test case.

There's a bunch of debug logs in code paths from gnutls_x509_crt_privkey_sign
that might give us useful info.

Regards,
Daniel



% LANG=C.UTF-8 tests/test-crypto-tlscredsx509
# random seed: R02S9b95072a368ad370cdd4c780b8074596
3: ASSERT: mpi.c[wrap_nettle_mpi_print]:60
3: ASSERT: mpi.c[wrap_nettle_mpi_print]:60
2: signing structure using RSA-SHA256
3: ASSERT: common.c[_gnutls_x509_der_encode]:855
3: ASSERT: sign.c[_gnutls_x509_pkix_sign]:174
3: ASSERT: x509_write.c[gnutls_x509_crt_privkey_sign]:1834
3: ASSERT: x509_write.c[gnutls_x509_crt_sign2]:1152
Bail out! FATAL-CRITICAL: Failed to sign certificate ASN1 parser: Value 
is not valid.





Re: Handling multiple inheritance [for CXL]

2021-01-27 Thread Ben Widawsky
On 21-01-27 10:06:48, Daniel P. Berrangé wrote:
> On Tue, Jan 26, 2021 at 01:33:52PM -0800, Ben Widawsky wrote:
> > I'm working on CXL 2.0 type 3 memory devices [1]. In short, these are PCIe 
> > devices
> > that have persistent memory on them. As such, it would be nice to inherit 
> > from
> > both a PCI_DEVICE class as well as an NVDIMM device class.
> > 
> > Truth be told, using TYPE_MEMORY_DEVICE as the interface does provide most 
> > of
> > what I need. I'm wondering what the best way to handle this is. Currently, 
> > the
> > only thing NVDIMM class provides is write/read_label_data, this is driven by
> > _DSM. For CXL, the mechanism to read/write the equivalent area is not done 
> > via
> > _DSM, but done directly via a mailbox interface. However, the intent is the
> > same, and so utilizing similar code seems ideal.
> > 
> > If there's a desire to unify these code paths, I'd need something like 
> > multiple
> > inheritance. I'm looking for some feedback here on how to do it.
> 
> We don't have a direct concept of multiple inheritance in QOM.
> 
> The closest you can get is to turn the NVDIMM class into an
> interface. You can inherit from PCI_DEVICE and then implement
> the NVDIMM interface.
> 
> Regards,
> Daniel

Is there a concise summary of what the tradeoffs would be of moving NVDIMM to an
interface? AFAICT, there's a lot of things done through subclassing that can
just as easily be done as an interface, but I don't understand the reason for
that.



Re: [PATCH v4 1/7] qapi/block-core: Add retry option for error action

2021-01-27 Thread Eric Blake
On 12/15/20 6:30 AM, Jiahui Cen wrote:
> Add a new error action 'retry' to support retry on errors.
> 
> Signed-off-by: Jiahui Cen 
> Signed-off-by: Ying Fang 
> ---
>  blockdev.c   | 2 ++
>  qapi/block-core.json | 4 ++--
>  2 files changed, 4 insertions(+), 2 deletions(-)

> +++ b/qapi/block-core.json
> @@ -1146,7 +1146,7 @@
>  # Since: 1.3
>  ##
>  { 'enum': 'BlockdevOnError',
> -  'data': ['report', 'ignore', 'enospc', 'stop', 'auto'] }
> +  'data': ['report', 'ignore', 'enospc', 'stop', 'auto', 'retry'] }

Missing a documentation line that 'retry' was added in 6.0.

>  
>  ##
>  # @MirrorSyncMode:
> @@ -4770,7 +4770,7 @@
>  # Since: 2.1
>  ##
>  { 'enum': 'BlockErrorAction',
> -  'data': [ 'ignore', 'report', 'stop' ] }
> +  'data': [ 'ignore', 'report', 'stop', 'retry' ] }

Likewise.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v4 4/4] meson: Warn when TCI is selected but TCG backend is available

2021-01-27 Thread Richard Henderson
On 1/26/21 8:53 PM, Stefan Weil wrote:
> And the remaining TODO assertions are a good indicator that the current tests
> are incomplete for native TCG because they obviously don't cover all TCG 
> opcodes.

If the symbol appears in target/, then the opcode can be produced.  I've just
shown you how for 2 of them, via 2 different target architectures.


r~



  1   2   3   >