Re: [PATCH] Adding Cédric's repos in MAINTAINERS file.
Hello,

On 12/8/21 04:52, lagar...@linux.ibm.com wrote:

From: Leonardo Garcia

Signed-off-by: Leonardo Garcia

Here is a description of the branches I have put in place over the years for aspeed and powernv machines on GitHub:

- prev stable branch dev branch - current staging branch (I should call it -staging) -next frozen staging branch -for-upstream pull request branch (created on demand)

GitLab replicates them, but for test purposes only.

I haven't formalized ppc yet, but it should more or less be the same. Thanks for reminding me. I will update when this is clear.

Thanks,

C.

---
 MAINTAINERS | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 7543eb4d59..52c6b99763 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -273,6 +273,7 @@ F: hw/ppc/ppc.c
 F: hw/ppc/ppc_booke.c
 F: include/hw/ppc/ppc.h
 F: disas/ppc.c
+T: git https://gitlab.com/legoater/qemu.git
 
 RISC-V TCG CPUs
 M: Palmer Dabbelt
@@ -390,6 +391,7 @@ R: David Gibson
 R: Greg Kurz
 S: Maintained
 F: target/ppc/kvm.c
+T: git https://gitlab.com/legoater/qemu.git
 
 S390 KVM CPUs
 M: Halil Pasic
@@ -1343,6 +1345,7 @@ F: tests/qtest/libqos/*spapr*
 F: tests/qtest/rtas*
 F: tests/qtest/libqos/rtas*
 F: tests/avocado/ppc_pseries.py
+T: git https://gitlab.com/legoater/qemu.git
 
 PowerNV (Non-Virtualized)
 M: Cédric Le Goater
@@ -1356,6 +1359,7 @@ F: include/hw/ppc/pnv*
 F: include/hw/pci-host/pnv*
 F: pc-bios/skiboot.lid
 F: tests/qtest/pnv*
+T: git https://gitlab.com/legoater/qemu.git powernv-next
 
 virtex_ml507
 M: Edgar E. Iglesias
@@ -1399,6 +1403,7 @@ F: hw/ppc/vof*
 F: include/hw/ppc/vof*
 F: pc-bios/vof/*
 F: pc-bios/vof*
+T: git https://gitlab.com/legoater/qemu.git
 
 RISC-V Machines
 ---
@@ -2244,6 +2249,7 @@ S: Supported
 F: hw/*/*xive*
 F: include/hw/*/*xive*
 F: docs/*/*xive*
+T: git https://gitlab.com/legoater/qemu.git
 
 Renesas peripherals
 R: Yoshinori Sato
[PULL v2 4/7] iotests.py: add qemu_tool_popen()
Split qemu_tool_popen() from qemu_tool_pipe_and_status() to be used separately. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Nikita Lapshin --- tests/qemu-iotests/iotests.py | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py index 83bfedb902..452d047716 100644 --- a/tests/qemu-iotests/iotests.py +++ b/tests/qemu-iotests/iotests.py @@ -138,14 +138,22 @@ def unarchive_sample_image(sample, fname): shutil.copyfileobj(f_in, f_out) +def qemu_tool_popen(args: Sequence[str], +connect_stderr: bool = True) -> 'subprocess.Popen[str]': +stderr = subprocess.STDOUT if connect_stderr else None +# pylint: disable=consider-using-with +return subprocess.Popen(args, +stdout=subprocess.PIPE, +stderr=stderr, +universal_newlines=True) + + def qemu_tool_pipe_and_status(tool: str, args: Sequence[str], connect_stderr: bool = True) -> Tuple[str, int]: """ Run a tool and return both its output and its exit code """ -stderr = subprocess.STDOUT if connect_stderr else None -with subprocess.Popen(args, stdout=subprocess.PIPE, - stderr=stderr, universal_newlines=True) as subp: +with qemu_tool_popen(args, connect_stderr) as subp: output = subp.communicate()[0] if subp.returncode < 0: cmd = ' '.join(args) -- 2.31.1
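The split described in this patch can be sketched outside of iotests.py as follows. This is only a minimal illustration that assumes nothing beyond the Python standard library; tool_popen() and tool_pipe_and_status() are hypothetical stand-ins, not the iotests.py functions themselves (the real qemu_tool_pipe_and_status() also takes a tool name and checks for a negative return code).

```python
import subprocess
import sys
from typing import Sequence, Tuple


def tool_popen(args: Sequence[str],
               connect_stderr: bool = True) -> 'subprocess.Popen[str]':
    """Start a tool; stderr is merged into stdout if connect_stderr is set."""
    stderr = subprocess.STDOUT if connect_stderr else None
    # The caller owns the returned process object and must wait for it.
    return subprocess.Popen(args,
                            stdout=subprocess.PIPE,
                            stderr=stderr,
                            universal_newlines=True)


def tool_pipe_and_status(args: Sequence[str],
                         connect_stderr: bool = True) -> Tuple[str, int]:
    """Run a tool to completion and return (output, exit code)."""
    with tool_popen(args, connect_stderr) as subp:
        output = subp.communicate()[0]
        return output, subp.returncode


if __name__ == '__main__':
    # Hypothetical usage: run the current interpreter as the "tool".
    print(tool_pipe_and_status([sys.executable, '-c', 'print("hello")']))
```

Keeping the Popen() call in its own helper lets a test hold on to a long-running process, while callers that only want the output and exit status keep using the wrapper.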
Re: [PATCH 2/3] scripts/qapi-gen.py: add --add-trace-points option
On 12/21/21 20:35, Vladimir Sementsov-Ogievskiy wrote:
> Add an option to generate trace points. We should generate both trace
> points and trace-events files for further trace point code generation.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy
> ---
>  scripts/qapi/gen.py  | 13 ++---
>  scripts/qapi/main.py | 10 +++---
>  2 files changed, 17 insertions(+), 6 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH qemu master] hw/misc/aspeed_pwm: fix typo
On 12/22/21 11:24, Troy Lee wrote:
> Typo found during developing.
>
> Fixes: 70b3f1a34d3c ("hw/misc: Add basic Aspeed PWM model")
> Signed-off-by: Troy Lee
> ---
>  hw/misc/aspeed_pwm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/misc/aspeed_pwm.c b/hw/misc/aspeed_pwm.c
> index 8ebab5dcef..dbf9634da3 100644
> --- a/hw/misc/aspeed_pwm.c
> +++ b/hw/misc/aspeed_pwm.c
> @@ -96,7 +96,7 @@ static void aspeed_pwm_class_init(ObjectClass *klass, void *data)
>
>      dc->realize = aspeed_pwm_realize;
>      dc->reset = aspeed_pwm_reset;
> -    dc->desc = "Aspeed PWM Controller",
> +    dc->desc = "Aspeed PWM Controller";
>      dc->vmsd = &vmstate_aspeed_pwm;
>  }
>

No need for another patch, since it doesn't build. Simply squash it in your commit 70b3f1a34d3c.
Re: [PATCH 02/15] ppc: Mark the 'taihu' machine as deprecated
On 12/6/21 11:36, Cédric Le Goater wrote:
> From: Thomas Huth
>
> The PPC 405 CPU is a system-on-a-chip, so all 405 machines are very similar,
> except for some external periphery. However, the periphery of the 'taihu'
> machine is hardly emulated at all (e.g. neither the LCD nor the USB part had
> been implemented), so there is not much value added by this board. The users
> can use the 'ref405ep' machine to test their PPC405 code instead.
>
> Signed-off-by: Thomas Huth
> Reviewed-by: Daniel Henrique Barboza
> Message-Id: <20211203164904.290954-2-th...@redhat.com>
> Signed-off-by: Cédric Le Goater
> ---
>  docs/about/deprecated.rst | 9 +
>  hw/ppc/ppc405_boards.c    | 1 +
>  2 files changed, 10 insertions(+)

Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH v2 0/5] hw/qdev: Clarify qdev_connect_gpio_out() documentation
Hi Peter.

Since you reviewed v1, an Acked-by on v2 would be welcome.

Otherwise, if you don't object, I plan to queue this via the machine-next tree.

Thanks,

Phil.

On 12/18/21 14:04, Philippe Mathieu-Daudé wrote:
> Trivial patches clarifying qdev_connect_gpio_out() use,
> basically that the qemu_irq argument is an input.
>
> Since v1:
> - Addressed Yanan Wang and Peter Maydell comments:
>   - Correct qdev_init_gpio_out_named() doc
>   - Drop i8042_setup_a20_line() wrapper
>
> Philippe Mathieu-Daudé (5):
>   hw/qdev: Cosmetic around documentation
>   hw/qdev: Correct qdev_init_gpio_out_named() documentation
>   hw/qdev: Correct qdev_connect_gpio_out_named() documentation
>   hw/qdev: Rename qdev_connect_gpio_out*() 'input_pin' parameter
>   hw/input/pckbd: Open-code i8042_setup_a20_line() wrapper
>
>  include/hw/input/i8042.h |  1 -
>  include/hw/qdev-core.h   | 24 ++--
>  hw/core/gpio.c           | 13 +++--
>  hw/i386/pc.c             |  3 ++-
>  hw/input/pckbd.c         |  5 -
>  5 files changed, 27 insertions(+), 19 deletions(-)
>
[PULL v2 0/7] NBD patches
The following changes since commit 2bf40d0841b942e7ba12953d515e62a436f0af84:

  Merge tag 'pull-user-20211220' of https://gitlab.com/rth7680/qemu into staging (2021-12-20 13:20:07 -0800)

are available in the Git repository at:

  https://src.openvz.org/scm/~vsementsov/qemu.git tags/pull-nbd-2021-12-22-v2

for you to fetch changes up to ab7f7e67a7e7b49964109501dfcde4ec29bae60e:

  iotests: add nbd-reconnect-on-open test (2021-12-23 09:40:34 +0100)

nbd: reconnect-on-open feature

v2: simple fix for mypy and pylint complaints on patch 04

Vladimir Sementsov-Ogievskiy (7):
  nbd: allow reconnect on open, with corresponding new options
  nbd/client-connection: nbd_co_establish_connection(): return real error
  nbd/client-connection: improve error message of cancelled attempt
  iotests.py: add qemu_tool_popen()
  iotests.py: add and use qemu_io_wrap_args()
  iotests.py: add qemu_io_popen()
  iotests: add nbd-reconnect-on-open test

 qapi/block-core.json                          |  9 ++-
 block/nbd.c                                   | 45 +++-
 nbd/client-connection.c                       | 59 ++-
 tests/qemu-iotests/iotests.py                 | 37 ++
 .../qemu-iotests/tests/nbd-reconnect-on-open  | 71 +++
 .../tests/nbd-reconnect-on-open.out           | 11 +++
 6 files changed, 200 insertions(+), 32 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/nbd-reconnect-on-open
 create mode 100644 tests/qemu-iotests/tests/nbd-reconnect-on-open.out

-- 
2.31.1
Re: [PATCH qemu master] hw/misc/aspeed_pwm: fix typo
Hello Troy Lee,

On 12/22/21 11:24, Troy Lee wrote:

Typo found during developing.

Fixes: 70b3f1a34d3c ("hw/misc: Add basic Aspeed PWM model")

PWM is not upstream. I will include the fix in a new aspeed-7.0 branch.

Thanks,

C.

Signed-off-by: Troy Lee
---
 hw/misc/aspeed_pwm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/misc/aspeed_pwm.c b/hw/misc/aspeed_pwm.c
index 8ebab5dcef..dbf9634da3 100644
--- a/hw/misc/aspeed_pwm.c
+++ b/hw/misc/aspeed_pwm.c
@@ -96,7 +96,7 @@ static void aspeed_pwm_class_init(ObjectClass *klass, void *data)
 
     dc->realize = aspeed_pwm_realize;
     dc->reset = aspeed_pwm_reset;
-    dc->desc = "Aspeed PWM Controller",
+    dc->desc = "Aspeed PWM Controller";
     dc->vmsd = &vmstate_aspeed_pwm;
 }
Re: [PATCH for-6.1?] iotest: Further enhance qemu-img-bitmaps
On 21.07.21 22:46, Eric Blake wrote:

Add a regression test to make sure we detect attempts to use 'qemu-img bitmap' to modify an in-use local file.

Suggested-by: Nir Soffer
Signed-off-by: Eric Blake
---

Sadly, this missed my bitmaps pull request today. If there's any reason to respin that pull request, I'm inclined to add this in, as it just touches the iotests; otherwise, if it slips to 6.2 it's not too bad.

(Going through my patches folder...)

Not sure if you’re still interested in this, but if so, we should skip this test case if OFD locks are not available (like 153 does).

Hanna
Re: [PATCH 3/3] meson: generate trace points for qmp commands
23.12.2021 01:11, Paolo Bonzini wrote:

On Tue, 21 Dec 2021, 20:35 Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> wrote:

--- a/trace/meson.build
+++ b/trace/meson.build
@@ -2,10 +2,14 @@
 specific_ss.add(files('control-target.c'))
 
 trace_events_files = []
-foreach dir : [ '.' ] + trace_events_subdirs
-  trace_events_file = meson.project_source_root() / dir / 'trace-events'
+foreach path : [ '.' ] + trace_events_subdirs + qapi_trace_events
+  if path.contains('trace-events')
+    trace_events_file = meson.project_build_root() / 'qapi' / path

Just using "trace_events_file = 'qapi' / path" might work, since the build is nonrecursive.

This says:

ninja: error: '../trace/qapi/qapi-commands-authz.trace-events', needed by 'trace/trace-events-all', missing and no known rule to make it
make[1]: *** [Makefile:162: run-ninja] Error 1
make[1]: Leaving directory '/work/src/qemu/up/up-trace-qmp-commands/build'
make: *** [GNUmakefile:11: all] Error 2

so it considers the path relative to the current "trace" directory.

If it doesn't, use the custom target object, possibly indexing it as ct[index]. You can use a dictionary to store the custom targets and find them from the "path" variable.

Oh! Great, thanks! Magic.

The following hack works:

diff --git a/meson.build b/meson.build
index 20d32fd20d..c42a76a14c 100644
--- a/meson.build
+++ b/meson.build
@@ -39,6 +39,7 @@ qemu_icondir = get_option('datadir') / 'icons'
 config_host_data = configuration_data()
 genh = []
 qapi_trace_events = []
+qapi_trace_events_targets = {}
 
 target_dirs = config_host['TARGET_DIRS'].split()
 have_linux_user = false
diff --git a/qapi/meson.build b/qapi/meson.build
index 333ca60583..d4de04459d 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -139,6 +139,9 @@ foreach output : qapi_util_outputs
   if output.endswith('.h')
     genh += qapi_files[i]
   endif
+  if output.endswith('.trace-events')
+    qapi_trace_events_targets += {output: qapi_files[i]}
+  endif
   util_ss.add(qapi_files[i])
   i = i + 1
 endforeach
@@ -147,6 +150,9 @@ foreach output : qapi_specific_outputs + qapi_nonmodule_outputs
   if output.endswith('.h')
     genh += qapi_files[i]
   endif
+  if output.endswith('.trace-events')
+    qapi_trace_events_targets += {output: qapi_files[i]}
+  endif
   specific_ss.add(when: 'CONFIG_SOFTMMU', if_true: qapi_files[i])
   i = i + 1
 endforeach
diff --git a/trace/meson.build b/trace/meson.build
index 77e44fa68d..daa24c3a2d 100644
--- a/trace/meson.build
+++ b/trace/meson.build
@@ -4,7 +4,7 @@ specific_ss.add(files('control-target.c'))
 trace_events_files = []
 foreach path : [ '.' ] + trace_events_subdirs + qapi_trace_events
   if path.contains('trace-events')
-    trace_events_file = meson.project_build_root() / 'qapi' / path
+    trace_events_file = qapi_trace_events_targets[path]
   else
     trace_events_file = meson.project_source_root() / path / 'trace-events'
   endif

-- 
Best regards,
Vladimir
Re: [PATCH 1/3] block: better document SSH host key fingerprint checking
On 18.11.21 15:35, Daniel P. Berrangé wrote: The docs still illustrate host key fingerprint checking using the old md5 hashes which are considered insecure and obsolete. Change it to illustrate using a sha256 hash. Also show how to extract the hash value from the known_hosts file. Signed-off-by: Daniel P. Berrangé --- docs/system/qemu-block-drivers.rst.inc | 30 ++ 1 file changed, 26 insertions(+), 4 deletions(-) diff --git a/docs/system/qemu-block-drivers.rst.inc b/docs/system/qemu-block-drivers.rst.inc index 16225710eb..2aeeaf6361 100644 --- a/docs/system/qemu-block-drivers.rst.inc +++ b/docs/system/qemu-block-drivers.rst.inc @@ -778,10 +778,32 @@ The optional *HOST_KEY_CHECK* parameter controls how the remote host's key is checked. The default is ``yes`` which means to use the local ``.ssh/known_hosts`` file. Setting this to ``no`` turns off known-hosts checking. Or you can check that the host key -matches a specific fingerprint: -``host_key_check=md5:78:45:8e:14:57:4f:d5:45:83:0a:0e:f3:49:82:c9:c8`` -(``sha1:`` can also be used as a prefix, but note that OpenSSH -tools only use MD5 to print fingerprints). +matches a specific fingerprint. The fingerprint can be provided in +``md5``, ``sha1``, or ``sha256`` format, however, it is strongly +recommended to only use ``sha256``, since the other options are +considered insecure by modern standards. The fingerprint value +must be given as a hex encoded string:: + + host_key_check=sha256:04ce2ae89ff4295a6b9c4111640bdcb3297858ee55cb434d9dd88796e93aa795`` I think the backticks at the end of this line should be dropped. With that done: Reviewed-by: Hanna Reitz + +The key string may optionally contain ":" separators between +each pair of hex digits. + +The ``$HOME/.ssh/known_hosts`` file contains the base64 encoded +host keys. 
These can be converted into the format needed for +QEMU using a command such as:: + + $ for key in `grep 10.33.8.112 known_hosts | awk '{print $3}'` + do + echo $key | base64 -d | sha256sum + done + 6c3aa525beda9dc83eadfbd7e5ba7d976ecb59575d1633c87cd06ed2ed6e366f - + 12214fd9ea5b408086f98ecccd9958609bd9ac7c0ea316734006bc7818b45dc8 - + d36420137bcbd101209ef70c3b15dc07362fbe0fa53c5b135eba6e6afa82f0ce - + +Note that there can be multiple keys present per host, each with +different key ciphers. Care is needed to pick the key fingerprint +that matches the cipher QEMU will negotiate with the remote server. Currently authentication must be done using ssh-agent. Other authentication methods may be supported in future.
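The conversion shown in the shell loop above can also be sketched in Python. This is a minimal illustration, not part of the patch: known_hosts_sha256() is a hypothetical helper, and it assumes a plain known_hosts line whose third whitespace-separated field is the base64 key (the same field the awk command picks); lines starting with a marker such as @cert-authority would need different handling.

```python
import base64
import hashlib


def known_hosts_sha256(line: str) -> str:
    """Return 'sha256:<hex digest>' for one known_hosts line.

    Assumes the layout 'host keytype base64key [comment]', i.e. the key
    is the third field, as in the awk '{print $3}' example.
    """
    fields = line.split()
    key_blob = base64.b64decode(fields[2])
    # Hash the decoded key blob, like 'base64 -d | sha256sum' does.
    return 'sha256:' + hashlib.sha256(key_blob).hexdigest()


if __name__ == '__main__':
    # Made-up key material, for illustration only.
    fake_key = base64.b64encode(b'not a real host key').decode()
    print(known_hosts_sha256('10.33.8.112 ssh-ed25519 ' + fake_key))
```

The printed string has the prefix and hex form that the documentation text gives for host_key_check; picking the line whose key type matches what QEMU negotiates is still up to the user.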
Re: [PATCH 2/3] block: support sha256 fingerprint with pre-blockdev options
On 18.11.21 15:35, Daniel P. Berrangé wrote:

When support for sha256 fingerprint checking was added in

  commit bf783261f0aee6e81af3916bff7606d71ccdc153
  Author: Daniel P. Berrangé
  Date: Tue Jun 22 12:51:56 2021 +0100

    block/ssh: add support for sha256 host key fingerprints

it was only made to work with -blockdev. Getting it working with -drive requires some extra custom parsing.

Signed-off-by: Daniel P. Berrangé
---
 block/ssh.c | 5 +
 1 file changed, 5 insertions(+)

Reviewed-by: Hanna Reitz
Re: Building QEMU as a shared library
Hi Peter,

On 12/15/21 11:10, Peter Maydell wrote:
> On Wed, 15 Dec 2021 at 08:18, Amir Gonnen wrote:
>> My goal is to simulate a mixed architecture system.
>>
>> Today QEMU strongly assumes that the simulated system is a *single architecture*.
>> Changing this assumption and supporting mixed architecture in QEMU proved to be
>> non-trivial and may require significant development effort. Common code such as
>> TCG and others explicitly include architecture specific header files, for example.
>
> Yeah. This is definitely something we'd like to fix some day. It's
> the approach I would prefer for getting multi-architecture machines.

Am I understanding correctly that your preference would be *not* using shared libraries, but having a monolithic process able to use any configuration of heterogeneous architectures?

What are your thoughts on Daniel's idea where (IIUC) cores can be external processes wired via vhost-user? One problem is that not all supported operating systems provide this possibility.

>> Instead, I would like to suggest a new approach we use at Neuroblade to achieve this:
>> Build QEMU as a shared library that can be loaded and used directly in a larger simulation.
>> Today we build qemu-system-nios2 shared library and load it from qemu-system-x86_64 in order
>> to simulate an x86_64 system that also consists of multiple nios2 cores.
>> In our simulation, two independent "main" functions are running on different threads, and
>> simulation synchronization is reduced to synchronizing threads.
>
> I agree with Stefan that you should go ahead and send the code as
> an RFC patchset, but I feel like there is a lot of work required
> to really get the codebase into a state where it is a clean
> shared library...
>
> -- PMM
>
Re: [PATCH] acpi: validate hotplug selector on access
Hi,

On Wed, Dec 22, 2021 at 9:52 PM Michael S. Tsirkin wrote:
>
> On Wed, Dec 22, 2021 at 09:27:51PM +0100, Philippe Mathieu-Daudé wrote:
> > On Wed, Dec 22, 2021 at 9:20 PM Michael S. Tsirkin wrote:
> > > On Wed, Dec 22, 2021 at 08:19:41PM +0100, Philippe Mathieu-Daudé wrote:
> > > > +Mauro & Alex
> > > >
> > > > On 12/21/21 15:48, Michael S. Tsirkin wrote:
> > > > > When bus is looked up on a pci write, we didn't
> > > > > validate that the lookup succeeded.
> > > > > Fuzzers thus can trigger QEMU crash by dereferencing the NULL
> > > > > bus pointer.
> > > > >
> > > > > Fixes: b32bd763a1 ("pci: introduce acpi-index property for PCI device")
> > > > > Cc: "Igor Mammedov"
> > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/770
> > > > > Signed-off-by: Michael S. Tsirkin
> > > >
> > > > It seems this problem is important enough to get a CVE assigned.
> > >
> > > Guest root can crash guest.
> > > I don't see why we would assign a CVE.
> >
> > Well thinking about downstream distributions, if there is a CVE assigned,
> > it helps them to have it written in the commit. Maybe I am mistaken.
> >
> > Unrelated but it seems there is a coordination problem with the
> > qemu-security@ list,
> > if this isn't a security issue, why was a CVE requested?
>
> Right. I don't think a privileged user crashing VM warrants a CVE,
> it can just halt a CPU or whatever. Just cancel the CVE request pls.

While I agree with you that this is kind of borderline and I expressed similar concerns in the past, I was told that:

1) root guest users are not necessarily trustworthy (from the host perspective).

2) NULL pointer deref and similar issues caused by an ill-handled/error condition are CVE worthy, even if triggered by root.

3) In other cases, DoS triggered by root is not a security issue because it's an expected behavior and not an ill-handled/error condition (think of assert failures, for example).

In other words, "ill-handled condition" is the crucial factor that makes a bug CVE worthy or not.

+Prasad, can you shed some light on this? Is my understanding correct?

Also, please note that we regularly get CVE requests for bugs like this and many CVEs have been assigned in the past. Of course that doesn't mean we can't change things going forward, but I think we should make it clear (probably here: https://www.qemu.org/docs/master/system/security.html) that these kinds of bugs are not eligible for CVE assignment.

> > > > Mauro, please update us when you get the CVE number.
> > > > Michael, please amend the CVE number before committing the fix.
> > > >
> > > > FWIW Paolo asked every fuzzed bug reproducer to be committed
> > > > as qtest, see tests/qtest/fuzz*c. Alex has a way to generate
> > > > reproducer in plain C.
> > > >
> > > > Regards,
> > > >
> > > > Phil.
> > > >

-- 
Mauro Matteo Cascella
Red Hat Product Security
PGP-Key ID: BB3410B0
Re: [PATCH 3/3] block: print the server key type and fingerprint on failure
On 18.11.21 15:35, Daniel P. Berrangé wrote: When validating the server key fingerprint fails, it is difficult for the user to know what they got wrong. The fingerprint accepted by QEMU is received in a different format than openssh displays. There can also be keys for multiple different ciphers in known_hosts. It may not be obvious which cipher QEMU will use and whether it will be the same as openssh. Address this by printing the server key type and its corresponding fingerprint in the format QEMU accepts. Signed-off-by: Daniel P. Berrangé --- block/ssh.c | 37 ++--- 1 file changed, 30 insertions(+), 7 deletions(-) Nice! Reviewed-by: Hanna Reitz diff --git a/block/ssh.c b/block/ssh.c index fcc0ab765a..967a2b971e 100644 --- a/block/ssh.c +++ b/block/ssh.c @@ -386,14 +386,28 @@ static int compare_fingerprint(const unsigned char *fingerprint, size_t len, return *host_key_check - '\0'; } +static char *format_fingerprint(const unsigned char *fingerprint, size_t len) +{ +static const char *hex = "0123456789abcdef"; +char *ret = g_new0(char, (len * 2) + 1); +for (size_t i = 0; i < len; i++) { +ret[i * 2] = hex[((fingerprint[i] >> 4) & 0xf)]; +ret[(i * 2) + 1] = hex[(fingerprint[i] & 0xf)]; (I would have found an sn?printf() solution a bit simpler here (snprintf(&ret[i * 2], 2, "%02x", fingerprint[i])), but now you already wrote the code, so...) +} +ret[len * 2] = '\0'; +return ret; +}
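The nibble-lookup encoding used by format_fingerprint() can be modelled in a few lines of Python. This is only a sketch for illustration, written in Python rather than C, and it is not the QEMU code: it mirrors the loop in the hunk above so the output format (two lowercase hex digits per byte, no separators) is easy to check.

```python
HEX_DIGITS = '0123456789abcdef'


def format_fingerprint(fingerprint: bytes) -> str:
    """Hex-encode a fingerprint, one table lookup per nibble."""
    out = []
    for byte in fingerprint:
        out.append(HEX_DIGITS[(byte >> 4) & 0xf])  # high nibble first
        out.append(HEX_DIGITS[byte & 0xf])         # then low nibble
    return ''.join(out)


if __name__ == '__main__':
    # First bytes of the sha256 example from the documentation patch.
    print(format_fingerprint(bytes([0x04, 0xce, 0x2a, 0xe8])))
```

The result should match what bytes.hex() produces for the same input. On the sn?printf() alternative mentioned in the review, it appears each call would need a size of 3 rather than 2, since snprintf() counts the terminating NUL in its size argument.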
[PATCH v2 0/2] block: Minor vhost-user-blk fixes
- Add vhost-user-blk help to qemu-storage-daemon,
- Do not list vhost-user-blk in BlockExportType when
  CONFIG_VHOST_USER_BLK_SERVER is disabled.

Since v1:
- Reword patch 2 description (Markus)
- Fix BlockExportOptions enum build failure (Markus)

Philippe Mathieu-Daudé (2):
  qemu-storage-daemon: Add vhost-user-blk help
  qapi/block: Restrict vhost-user-blk to CONFIG_VHOST_USER_BLK_SERVER

 qapi/block-export.json               |  6 --
 storage-daemon/qemu-storage-daemon.c | 13 +
 2 files changed, 17 insertions(+), 2 deletions(-)

-- 
2.33.1
[PATCH v2 1/2] qemu-storage-daemon: Add vhost-user-blk help
Add missing vhost-user-blk help: $ qemu-storage-daemon -h ... --export [type=]vhost-user-blk,id=,node-name=, addr.type=unix,addr.path=[,writable=on|off] [,logical-block-size=][,num-queues=] export the specified block node as a vhosts-user-blk device over UNIX domain socket --export [type=]vhost-user-blk,id=,node-name=, fd,addr.str=[,writable=on|off] [,logical-block-size=][,num-queues=] export the specified block node as a vhosts-user-blk device over file descriptor ... Fixes: 90fc91d50b7 ("convert vhost-user-blk server to block export API") Reported-by: Qing Wang Signed-off-by: Philippe Mathieu-Daudé --- storage-daemon/qemu-storage-daemon.c | 13 + 1 file changed, 13 insertions(+) diff --git a/storage-daemon/qemu-storage-daemon.c b/storage-daemon/qemu-storage-daemon.c index 52cf17e8ace..0c19e128e3f 100644 --- a/storage-daemon/qemu-storage-daemon.c +++ b/storage-daemon/qemu-storage-daemon.c @@ -104,6 +104,19 @@ static void help(void) " export the specified block node over FUSE\n" "\n" #endif /* CONFIG_FUSE */ +#ifdef CONFIG_VHOST_USER_BLK_SERVER +" --export [type=]vhost-user-blk,id=,node-name=,\n" +" addr.type=unix,addr.path=[,writable=on|off]\n" +" [,logical-block-size=][,num-queues=]\n" +" export the specified block node as a\n" +" vhosts-user-blk device over UNIX domain socket\n" +" --export [type=]vhost-user-blk,id=,node-name=,\n" +" fd,addr.str=[,writable=on|off]\n" +" [,logical-block-size=][,num-queues=]\n" +" export the specified block node as a\n" +" vhosts-user-blk device over file descriptor\n" +"\n" +#endif /* CONFIG_VHOST_USER_BLK_SERVER */ " --monitor [chardev=]name[,mode=control][,pretty[=on|off]]\n" " configure a QMP monitor\n" "\n" -- 2.33.1
[PATCH v2 2/2] qapi/block: Restrict vhost-user-blk to CONFIG_VHOST_USER_BLK_SERVER
When building QEMU with --disable-vhost-user and using introspection, query-qmp-schema lists vhost-user-blk even though it's not actually available: { "execute": "query-qmp-schema" } { "return": [ ... { "name": "312", "members": [ { "name": "nbd" }, { "name": "vhost-user-blk" } ], "meta-type": "enum", "values": [ "nbd", "vhost-user-blk" ] }, Restrict vhost-user-blk in BlockExportType when CONFIG_VHOST_USER_BLK_SERVER is disabled, so it doesn't end listed by query-qmp-schema. Fixes: 90fc91d50b7 ("convert vhost-user-blk server to block export API") Signed-off-by: Philippe Mathieu-Daudé --- v2: Reword + restrict BlockExportOptions union (armbru) --- qapi/block-export.json | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/qapi/block-export.json b/qapi/block-export.json index c1b92ce1c1c..f9ce79a974b 100644 --- a/qapi/block-export.json +++ b/qapi/block-export.json @@ -277,7 +277,8 @@ # Since: 4.2 ## { 'enum': 'BlockExportType', - 'data': [ 'nbd', 'vhost-user-blk', + 'data': [ 'nbd', +{ 'name': 'vhost-user-blk', 'if': 'CONFIG_VHOST_USER_BLK_SERVER' }, { 'name': 'fuse', 'if': 'CONFIG_FUSE' } ] } ## @@ -319,7 +320,8 @@ 'discriminator': 'type', 'data': { 'nbd': 'BlockExportOptionsNbd', - 'vhost-user-blk': 'BlockExportOptionsVhostUserBlk', + 'vhost-user-blk': { 'type': 'BlockExportOptionsVhostUserBlk', + 'if': 'CONFIG_VHOST_USER_BLK_SERVER' }, 'fuse': { 'type': 'BlockExportOptionsFuse', 'if': 'CONFIG_FUSE' } } } -- 2.33.1
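The effect of the 'if' conditions on the enum can be pictured with a small toy model. This is purely an illustration under stated assumptions: visible_members() and BLOCK_EXPORT_TYPE are hypothetical names, and the real QAPI generator does not evaluate conditions this way (it deals with them at build time); the sketch only shows which member names would remain visible for a given set of enabled CONFIG_ symbols.

```python
from typing import Iterable, List, Set, Union

Member = Union[str, dict]

# The 'data' list of BlockExportType after this patch, as Python literals.
BLOCK_EXPORT_TYPE: List[Member] = [
    'nbd',
    {'name': 'vhost-user-blk', 'if': 'CONFIG_VHOST_USER_BLK_SERVER'},
    {'name': 'fuse', 'if': 'CONFIG_FUSE'},
]


def visible_members(data: Iterable[Member], enabled: Set[str]) -> List[str]:
    """Return member names whose 'if' condition (if any) is enabled."""
    names = []
    for member in data:
        if isinstance(member, str):
            names.append(member)            # unconditional member
        elif 'if' not in member or member['if'] in enabled:
            names.append(member['name'])    # condition satisfied
    return names


if __name__ == '__main__':
    # With vhost-user disabled, only 'nbd' and 'fuse' remain visible.
    print(visible_members(BLOCK_EXPORT_TYPE, {'CONFIG_FUSE'}))
```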
[PULL 0/1] "make check" switch to meson test harness
The following changes since commit 2bf40d0841b942e7ba12953d515e62a436f0af84:

  Merge tag 'pull-user-20211220' of https://gitlab.com/rth7680/qemu into staging (2021-12-20 13:20:07 -0800)

are available in the Git repository at:

  https://gitlab.com/bonzini/qemu.git tags/for-upstream-mtest

for you to fetch changes up to 3d2f73ef75e25ba850aff4fcccb36d50137afd0f:

  build: use "meson test" as the test harness (2021-12-23 10:06:19 +0100)

Replace tap-driver.pl with "meson test".

Paolo Bonzini (1):
  build: use "meson test" as the test harness

 Makefile              |   3 +-
 meson.build           |   5 +-
 scripts/mtest2make.py | 112 ++-
 scripts/tap-driver.pl | 379 --
 scripts/tap-merge.pl  | 111 ---
 tests/fp/meson.build  |   2 +-
 6 files changed, 51 insertions(+), 561 deletions(-)
 delete mode 100755 scripts/tap-driver.pl
 delete mode 100755 scripts/tap-merge.pl

-- 
2.33.1
[PULL 1/1] build: use "meson test" as the test harness
"meson test" starting with version 0.57 is just as capable and easy to use as QEMU's own TAP driver. All existing options for "make check" work. The only required code change involves how to mark "slow" tests; they need to belong to an additional "slow" suite. The rules for .tap output are replaced by JUnit XML; GitLab is able to parse that output and present it in the CI pipeline report. Signed-off-by: Paolo Bonzini --- Makefile | 3 +- meson.build | 5 +- scripts/mtest2make.py | 112 + scripts/tap-driver.pl | 379 -- scripts/tap-merge.pl | 111 - tests/fp/meson.build | 2 +- 6 files changed, 51 insertions(+), 561 deletions(-) delete mode 100755 scripts/tap-driver.pl delete mode 100755 scripts/tap-merge.pl diff --git a/Makefile b/Makefile index 74c5b46d38..5d66c35ea5 100644 --- a/Makefile +++ b/Makefile @@ -145,7 +145,8 @@ NINJAFLAGS = $(if $V,-v) $(if $(MAKE.n), -n) $(if $(MAKE.k), -k0) \ $(filter-out -j, $(lastword -j1 $(filter -l% -j%, $(MAKEFLAGS \ ninja-cmd-goals = $(or $(MAKECMDGOALS), all) -ninja-cmd-goals += $(foreach t, $(.tests), $(.test.deps.$t)) +ninja-cmd-goals += $(foreach t, $(.check.build-suites), $(.check-$t.deps)) +ninja-cmd-goals += $(foreach t, $(.bench.build-suites), $(.bench-$t.deps)) makefile-targets := build.ninja ctags TAGS cscope dist clean uninstall # "ninja -t targets" also lists all prerequisites. 
If build system diff --git a/meson.build b/meson.build index f45ecf31bd..f0f1d5ba9d 100644 --- a/meson.build +++ b/meson.build @@ -1,8 +1,11 @@ project('qemu', ['c'], meson_version: '>=0.58.2', default_options: ['warning_level=1', 'c_std=gnu11', 'cpp_std=gnu++11', 'b_colorout=auto', - 'b_staticpic=false'], + 'b_staticpic=false', 'stdsplit=false'], version: files('VERSION')) +add_test_setup('quick', exclude_suites: 'slow', is_default: true) +add_test_setup('slow', env: ['G_TEST_SLOW=1', 'SPEED=slow']) + not_found = dependency('', required: false) keyval = import('keyval') ss = import('sourceset') diff --git a/scripts/mtest2make.py b/scripts/mtest2make.py index 02c0453e67..7067bdadf5 100644 --- a/scripts/mtest2make.py +++ b/scripts/mtest2make.py @@ -13,101 +13,79 @@ class Suite(object): def __init__(self): -self.tests = list() -self.slow_tests = list() -self.executables = set() +self.deps = set() +self.speeds = ['quick'] + +def names(self, base): +return [base if speed == 'quick' else f'{base}-{speed}' for speed in self.speeds] + print(''' SPEED = quick -# $1 = environment, $2 = test command, $3 = test name, $4 = dir -.test-human-tap = $1 $(if $4,(cd $4 && $2),$2) -m $(SPEED) < /dev/null | ./scripts/tap-driver.pl --test-name="$3" $(if $(V),,--show-failures-only) -.test-human-exitcode = $1 $(PYTHON) scripts/test-driver.py $(if $4,-C$4) $(if $(V),--verbose) -- $2 < /dev/null -.test-tap-tap = $1 $(if $4,(cd $4 && $2),$2) < /dev/null | sed "s/^[a-z][a-z]* [0-9]*/& $3/" || true -.test-tap-exitcode = printf "%s\\n" 1..1 "`$1 $(if $4,(cd $4 && $2),$2) < /dev/null > /dev/null || echo "not "`ok 1 $3" -.test.human-print = echo $(if $(V),'$1 $2','Running test $3') && -.test.env = MALLOC_PERTURB_=$${MALLOC_PERTURB_:-$$(( $${RANDOM:-0} % 255 + 1))} +.speed.quick = $(foreach s,$(sort $(filter-out %-slow, $1)), --suite $s) +.speed.slow = $(foreach s,$(sort $1), --suite $s) -# $1 = test name, $2 = test target (human or tap) -.test.run = $(call 
.test.$2-print,$(.test.env.$1),$(.test.cmd.$1),$(.test.name.$1)) $(call .test-$2-$(.test.driver.$1),$(.test.env.$1),$(.test.cmd.$1),$(.test.name.$1),$(.test.dir.$1)) +.mtestargs = --no-rebuild -t 0 +ifneq ($(SPEED), quick) +.mtestargs += --setup $(SPEED) +endif +.mtestargs += $(subst -j,--num-processes , $(filter-out -j, $(lastword -j1 $(filter -j%, $(MAKEFLAGS) -.test.output-format = human -''') +.check.mtestargs = $(MTESTARGS) $(.mtestargs) $(if $(V),--verbose,--print-errorlogs) +.bench.mtestargs = $(MTESTARGS) $(.mtestargs) --benchmark --verbose''') introspect = json.load(sys.stdin) -i = 0 def process_tests(test, targets, suites): -global i -env = ' '.join(('%s=%s' % (shlex.quote(k), shlex.quote(v)) -for k, v in test['env'].items())) executable = test['cmd'][0] try: executable = os.path.relpath(executable) except: pass -if test['workdir'] is not None: -try: -test['cmd'][0] = os.path.relpath(executable, test['workdir']) -except: -test['cmd'][0] = executable -else: -test['cmd'][0] = executable -cmd = ' '.join((shlex.quote(x) for x in test['cmd'])) -driver = test['protocol'] if 'protocol' in test else 'exitcode' - -i += 1 -if test['workdir'] is not None: -print('.test.dir.%d := %s' % (i, shlex.quote(test['workdir']))) deps = (target
Re: [PATCH 3/3] block: print the server key type and fingerprint on failure
On 11/18/21 15:35, Daniel P. Berrangé wrote:
> When validating the server key fingerprint fails, it is difficult for
> the user to know what they got wrong. The fingerprint accepted by QEMU
> is received in a different format than openssh displays. There can also
> be keys for multiple different ciphers in known_hosts. It may not be
> obvious which cipher QEMU will use and whether it will be the same
> as openssh. Address this by printing the server key type and its

"OpenSSH"? (twice)

> corresponding fingerprint in the format QEMU accepts.
>
> Signed-off-by: Daniel P. Berrangé
> ---
>  block/ssh.c | 37 ++---
>  1 file changed, 30 insertions(+), 7 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé
Re: [RFC PATCH v3 18/27] hw/intc: Add LoongArch ls7a interrupt controller support(PCH-PIC)
On 22/12/2021 02:38, yangxiaojuan wrote: Hi, Mark On 12/18/2021 08:33 AM, Mark Cave-Ayland wrote: On 04/12/2021 12:07, Xiaojuan Yang wrote: This patch realize the PCH-PIC interrupt controller. Signed-off-by: Xiaojuan Yang Signed-off-by: Song Gao --- hw/intc/Kconfig | 4 + hw/intc/loongarch_pch_pic.c | 357 hw/intc/meson.build | 1 + hw/intc/trace-events| 5 + hw/loongarch/Kconfig| 1 + include/hw/intc/loongarch_pch_pic.h | 61 + 6 files changed, 429 insertions(+) create mode 100644 hw/intc/loongarch_pch_pic.c create mode 100644 include/hw/intc/loongarch_pch_pic.h diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig index 511dcac537..96da13ad1d 100644 --- a/hw/intc/Kconfig +++ b/hw/intc/Kconfig @@ -76,3 +76,7 @@ config M68K_IRQC config LOONGARCH_IPI bool + +config LOONGARCH_PCH_PIC +bool +select UNIMP diff --git a/hw/intc/loongarch_pch_pic.c b/hw/intc/loongarch_pch_pic.c new file mode 100644 index 00..2ede29ceb0 --- /dev/null +++ b/hw/intc/loongarch_pch_pic.c @@ -0,0 +1,357 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * QEMU Loongson 7A1000 I/O interrupt controller. 
+ * + * Copyright (C) 2021 Loongson Technology Corporation Limited + */ + +#include "qemu/osdep.h" +#include "hw/sysbus.h" +#include "hw/loongarch/loongarch.h" +#include "hw/irq.h" +#include "hw/intc/loongarch_pch_pic.h" +#include "migration/vmstate.h" +#include "trace.h" + +#define for_each_set_bit(bit, addr, size) \ + for ((bit) = find_first_bit((addr), (size));\ + (bit) < (size);\ + (bit) = find_next_bit((addr), (size), (bit) + 1)) + +static void pch_pic_update_irq(loongarch_pch_pic *s, uint64_t mask, int level) +{ +int i; +uint64_t val; +val = mask & s->intirr & (~s->int_mask); + +for_each_set_bit(i, &val, 64) { +if (level == 1) { +if ((s->intisr & (0x1ULL << i)) == 0) { +s->intisr |= 1ULL << i; +qemu_set_irq(s->parent_irq[s->htmsi_vector[i]], 1); +} +} else if (level == 0) { +if (s->intisr & (0x1ULL << i)) { +s->intisr &= ~(0x1ULL << i); +qemu_set_irq(s->parent_irq[s->htmsi_vector[i]], 0); +} +} +} +} The normal pattern would be to use something like: for (i = 0; i < 64; i++) { if (level) { s->intisr |= 1ULL << i; } else { s->intisr &= ~(0x1ULL << i); } qemu_set_irq(s->parent_irq[s->htmsi_vector[i]], level); } Why is it necessary to check the previous value of (s->intisr & (0x1ULL << i)) here? Here check the previous value to avoid Unnecessary write. It seems make things more complicated. I will modify In general a *_update_irq() function should be fine to propagate the IRQ up to the parent directly: I think this is fine in this case because you are directly manipulating the parent_irq elements rather than using e.g. a priority encoder within this device to raise an IRQ to the CPU. I'm presuming this final prioritisation and delivery is done elsewhere? 
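To make the question above concrete, here is a minimal Python model of the simpler update pattern. It is only a sketch and not the QEMU code: `set_bits` and `update_irq` are hypothetical names, and the `calls` dictionary stands in for the `qemu_set_irq()` calls. It keeps the patch's iteration over the set bits of the masked value and drops only the check of the previous `intisr` bit.

```python
def set_bits(value, size=64):
    """Yield the indices of the set bits in value, lowest first.

    This mirrors what a for_each_set_bit() style loop visits.
    """
    for bit in range(size):
        if value & (1 << bit):
            yield bit


def update_irq(isr, pending, level, calls):
    """Apply `level` to every bit set in `pending` and return the new ISR.

    `calls` is a dict {bit: level} that records which lines were driven;
    it is a stand-in for qemu_set_irq() in this model.
    """
    for bit in set_bits(pending):
        if level:
            isr |= 1 << bit
        else:
            isr &= ~(1 << bit)
        # Always forward the level; no check of the previous ISR bit.
        calls[bit] = level
    return isr
```

In this model, driving a line that is already at the requested level is harmless, which is the reason the previous-value check can be dropped when the parent lines are set directly.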
+static void pch_pic_irq_handler(void *opaque, int irq, int level) +{ +loongarch_pch_pic *s = LOONGARCH_PCH_PIC(opaque); + +assert(irq < PCH_PIC_IRQ_NUM); +uint64_t mask = 1ULL << irq; + +trace_pch_pic_irq_handler(s->intedge, irq, level); + +if (s->intedge & mask) { +/* Edge triggered */ +if (level) { +if ((s->last_intirr & mask) == 0) { +s->intirr |= mask; +} +s->last_intirr |= mask; +} else { +s->last_intirr &= ~mask; +} +} else { +/* Level triggered */ +if (level) { +s->intirr |= mask; +s->last_intirr |= mask; +} else { +s->intirr &= ~mask; +s->last_intirr &= ~mask; +} + +} +pch_pic_update_irq(s, mask, level); +} + +static uint64_t loongarch_pch_pic_reg_read(void *opaque, hwaddr addr, + unsigned size) +{ +loongarch_pch_pic *s = LOONGARCH_PCH_PIC(opaque); +uint64_t val = 0; +uint32_t offset = addr & 0xfff; +int64_t offset_tmp; + +if (size == 8) { +switch (offset) { +case PCH_PIC_INT_ID_OFFSET: +val = (PCH_PIC_INT_ID_NUM << 32) | PCH_PIC_INT_ID_VAL; +break; +case PCH_PIC_INT_MASK_OFFSET: +val = s->int_mask; +break; +case PCH_PIC_INT_STATUS_OFFSET: +val = s->intisr & (~s->int_mask); +break; +case PCH_PIC_INT_EDGE_OFFSET: +val = s->intedge; +break; +case PCH_PIC_INT_POL_OFFSET: +val = s->int_polarity; +break; +case PCH_PIC_HTMSI_EN_OFFSET...PCH_PIC_HTMSI_EN_END: +val = s->h
Re: [PATCH 1/3] scripts/qapi/commands: gen_commands(): add add_trace_points argument
21.12.2021 22:35, Vladimir Sementsov-Ogievskiy wrote:

Add possibility to generate trace points for each qmp command. We should generate both trace points and trace-events file, for further trace point code generation.

Signed-off-by: Vladimir Sementsov-Ogievskiy
---
 scripts/qapi/commands.py | 84 ++--
 1 file changed, 73 insertions(+), 11 deletions(-)

diff --git a/scripts/qapi/commands.py b/scripts/qapi/commands.py
index 21001bbd6b..e62f1a4125 100644
--- a/scripts/qapi/commands.py
+++ b/scripts/qapi/commands.py
@@ -53,7 +53,8 @@ def gen_command_decl(name: str,
 def gen_call(name: str,
              arg_type: Optional[QAPISchemaObjectType],
              boxed: bool,
-             ret_type: Optional[QAPISchemaType]) -> str:
+             ret_type: Optional[QAPISchemaType],
+             add_trace_points: bool) -> str:
     ret = ''
     argstr = ''
@@ -71,21 +72,65 @@ def gen_call(name: str,
     if ret_type:
         lhs = 'retval = '

-    ret = mcgen('''
+    qmp_name = f'qmpq_{c_name(name)}'

It was called qmpq_ because qmp_ conflicts with the existing qmp_ trace points for jobs. But looking at those, they add little information beyond the new qmpq_ trace events, so in v2 I'll remove the old qmp_ trace points (there are not many of them) and the newly generated trace points will simply be named qmp_*.

--
Best regards,
Vladimir
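The naming conflict described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: `c_name_simplified` is a hypothetical, much reduced stand-in for QAPI's `c_name()` (the real helper does more than replace characters), and the `existing` set lists only two of the hand-written job trace points as examples.

```python
import re


def c_name_simplified(name):
    """Hypothetical stand-in for c_name(): non-identifier chars become '_'."""
    return re.sub(r'[^A-Za-z0-9_]', '_', name)


def trace_point_name(command, prefix='qmp_'):
    """Build the generated trace point name for a QMP command."""
    return prefix + c_name_simplified(command)


# Two of the hand-written job trace points, used here as examples.
existing = {'qmp_block_job_cancel', 'qmp_job_pause'}

# With the plain qmp_ prefix the generated name collides...
collides = trace_point_name('block-job-cancel') in existing
# ...while the temporary qmpq_ prefix avoids the clash.
avoids = trace_point_name('block-job-cancel', prefix='qmpq_') not in existing
```

Removing the hand-written trace points, as planned for v2, makes the plain `qmp_` prefix usable for the generated ones.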
Re: [PATCH 3/3] meson: generate trace points for qmp commands
23.12.2021 12:33, Vladimir Sementsov-Ogievskiy wrote: 23.12.2021 01:11, Paolo Bonzini wrote: Il mar 21 dic 2021, 20:35 Vladimir Sementsov-Ogievskiy mailto:vsement...@virtuozzo.com>> ha scritto: --- a/trace/meson.build +++ b/trace/meson.build @@ -2,10 +2,14 @@ specific_ss.add(files('control-target.c')) trace_events_files = [] -foreach dir : [ '.' ] + trace_events_subdirs - trace_events_file = meson.project_source_root() / dir / 'trace-events' +foreach path : [ '.' ] + trace_events_subdirs + qapi_trace_events + if path.contains('trace-events') + trace_events_file = meson.project_build_root() / 'qapi' / path Just using "trace_events_file = 'qapi' / path" might work, since the build is nonrecursive. This say: ninja: error: '../trace/qapi/qapi-commands-authz.trace-events', needed by 'trace/trace-events-all', missing and no known rule to make it make[1]: *** [Makefile:162: run-ninja] Error 1 make[1]: Leaving directory '/work/src/qemu/up/up-trace-qmp-commands/build' make: *** [GNUmakefile:11: all] Error 2 so, it consider the path relative to current "trace" directory. If it doesn't, use the custom target object, possibly indexing it as ct[index]. You can use a dictionary to store the custom targets and find them from the "path" variable. O! Great thanks! Magic. 
The following hack works: diff --git a/meson.build b/meson.build index 20d32fd20d..c42a76a14c 100644 --- a/meson.build +++ b/meson.build @@ -39,6 +39,7 @@ qemu_icondir = get_option('datadir') / 'icons' config_host_data = configuration_data() genh = [] qapi_trace_events = [] +qapi_trace_events_targets = {} target_dirs = config_host['TARGET_DIRS'].split() have_linux_user = false diff --git a/qapi/meson.build b/qapi/meson.build index 333ca60583..d4de04459d 100644 --- a/qapi/meson.build +++ b/qapi/meson.build @@ -139,6 +139,9 @@ foreach output : qapi_util_outputs if output.endswith('.h') genh += qapi_files[i] endif + if output.endswith('.trace-events') + qapi_trace_events_targets += {output: qapi_files[i]} + endif util_ss.add(qapi_files[i]) i = i + 1 endforeach @@ -147,6 +150,9 @@ foreach output : qapi_specific_outputs + qapi_nonmodule_outputs if output.endswith('.h') genh += qapi_files[i] endif + if output.endswith('.trace-events') + qapi_trace_events_targets += {output: qapi_files[i]} + endif specific_ss.add(when: 'CONFIG_SOFTMMU', if_true: qapi_files[i]) i = i + 1 endforeach diff --git a/trace/meson.build b/trace/meson.build index 77e44fa68d..daa24c3a2d 100644 --- a/trace/meson.build +++ b/trace/meson.build @@ -4,7 +4,7 @@ specific_ss.add(files('control-target.c')) trace_events_files = [] foreach path : [ '.' ] + trace_events_subdirs + qapi_trace_events if path.contains('trace-events') - trace_events_file = meson.project_build_root() / 'qapi' / path + trace_events_file = qapi_trace_events_targets[path] else trace_events_file = meson.project_source_root() / path / 'trace-events' endif Or even simpler, I can use a list combined from needed qapi_files[] elements. So, the solution is to use custom target objects or their indexed subobjects instead of raw paths. This way Meson resolves dependencies better. -- Best regards, Vladimir
Re: [PATCH qemu] s390x/css: fix PMCW invalid mask
On Wed, 22 Dec 2021 17:46:11 +0100 Cornelia Huck wrote: > On Thu, Dec 16 2021, Nico Boehr wrote: > > > Previously, we required bits 5, 6 and 7 to be zero (0x07 == 0b111). But, > > as per the principles of operation, bit 5 is ignored in MSCH and bits 0, > > 1, 6 and 7 need to be zero. > > > > As both PMCW_FLAGS_MASK_INVALID and ioinst_schib_valid() are only used > > by ioinst_handle_msch(), adjust the mask accordingly. > > > > Fixes: db1c8f53bfb1 ("s390: Channel I/O basic definitions.") > > Signed-off-by: Nico Boehr > > Reviewed-by: Pierre Morel > > Reviewed-by: Halil Pasic > > Reviewed-by: Janosch Frank > > --- > > include/hw/s390x/ioinst.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/include/hw/s390x/ioinst.h b/include/hw/s390x/ioinst.h > > index 3771fff9d44d..ea8d0f244492 100644 > > --- a/include/hw/s390x/ioinst.h > > +++ b/include/hw/s390x/ioinst.h > > @@ -107,7 +107,7 @@ QEMU_BUILD_BUG_MSG(sizeof(PMCW) != 28, "size of PMCW is > > wrong"); > > #define PMCW_FLAGS_MASK_MP 0x0004 > > #define PMCW_FLAGS_MASK_TF 0x0002 > > #define PMCW_FLAGS_MASK_DNV 0x0001 > > -#define PMCW_FLAGS_MASK_INVALID 0x0700 > > +#define PMCW_FLAGS_MASK_INVALID 0xc300 > > Removing bit 5 from this mask makes sense, at it is simply ignored. > > I'm a bit confused about bits 0 and 1, however. They are _QF and _W, > respectively (just out of the context here), which are in the same class > as _DNV (i.e. characteristics of the subchannel that cannot be modified > via msch). Looking at the PoP, I don't see what is supposed to happen if > the program tries to modify the dnv bit (maybe I'm simply overlooking > it.) I would naively assume that the w bit should behave in the same way > (as it does for message subchannels what dnv does for I/O subchannels, > and the rest of the values are not meaningful if it is not set), and > probably also the qf bit (as it doesn't make sense for the program to > turn QDIO capabilities on and off.) 
The main question is whether trying > to modify these bits causes an error or is ignored. The PoP suggests an > error (no idea if the internal architecture agrees, it hopefully does); > what happens for dnv? """ Bits 0, 1, 6, and 7 of word 1, and bits 0-28 of word 6 of the SCHIB operand must be zeros, and bits 9 and 10 of word 1 must not both be ones. When the extended-I/O-measurement-block facility is installed and a format-1 measurement block is specified, bits 26-31 of word 11 must be specified as zeros. """ (IBM z/Architecture Principles of Operation (SA22-7832-10), 14-8) The internal architecture agrees. DNV bit is ignored. Regarding why, I don't know. Probably for historic reasons. The PoP tells us that whatever is not listed as significant or checked and results in an operation exception if not appropriate is ignored: """ The remaining fields of the SCHIB are ignored and do not affect the processing of MODIFY SUBCHANNEL. (For further details, see “Subchannel-Information Block” on page 2 """ (same page) Regarding word 1 of the SCHIB the alignment between PoP and AR is perfect AFAICT. > > We support neither message subchannels nor QDIO in QEMU, so it's > probably not relevant right now; but it would still be good if we could > clarify the expected behaviour here :) > > > > > #define PMCW_CHARS_MASK_ST 0x00e0 > > #define PMCW_CHARS_MASK_MBFC 0x0004 > >
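For readers checking the constants, the arithmetic behind the two masks can be reproduced with a short Python sketch. It assumes that the 16-bit PMCW flags field covers bits 0-15 of word 1 and that the PoP numbers bits from the most significant end (bit 0 is the MSB); `msb0_mask` and `flags_valid` are hypothetical helper names, not QEMU functions.

```python
def msb0_mask(bits, width=16):
    """Build a mask from bit numbers where bit 0 is the most significant bit."""
    mask = 0
    for bit in bits:
        mask |= 1 << (width - 1 - bit)
    return mask


# Old check: bits 5, 6 and 7 had to be zero.
OLD_INVALID = msb0_mask([5, 6, 7])
# New check per the quoted PoP text: bits 0, 1, 6 and 7 must be zero.
NEW_INVALID = msb0_mask([0, 1, 6, 7])


def flags_valid(flags, invalid_mask):
    """Return True if none of the must-be-zero bits are set in flags."""
    return (flags & invalid_mask) == 0
```

Under these assumptions the helper yields 0x0700 for the old mask and 0xc300 for the new one, and a flags value with only bit 5 set (0x0400) is rejected by the old mask but accepted by the new one.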
Re: [RFC PATCH v3 22/27] hw/loongarch: Add some devices support for 3A5000.
On 22/12/2021 08:26, yangxiaojuan wrote: Hi, Mark On 12/18/2021 06:02 PM, Mark Cave-Ayland wrote: On 04/12/2021 12:07, Xiaojuan Yang wrote: 1.Add uart,virtio-net,vga and usb for 3A5000. 2.Add irq set and map for the pci host. Non pci device use irq 0-16, pci device use 16-64. 3.Add some unimplented device to emulate guest unused memory space. Signed-off-by: Xiaojuan Yang Signed-off-by: Song Gao --- hw/loongarch/Kconfig| 8 + hw/loongarch/loongson3.c| 63 +++-- hw/pci-host/ls7a.c | 42 +- include/hw/intc/loongarch_ipi.h | 2 ++ include/hw/pci-host/ls7a.h | 4 +++ softmmu/qdev-monitor.c | 3 +- 6 files changed, 117 insertions(+), 5 deletions(-) diff --git a/hw/loongarch/Kconfig b/hw/loongarch/Kconfig index 468e3acc74..9ea3b92708 100644 --- a/hw/loongarch/Kconfig +++ b/hw/loongarch/Kconfig @@ -1,5 +1,13 @@ config LOONGSON3_LS7A bool +imply VGA_PCI +imply VIRTIO_VGA +imply PARALLEL +imply PCI_DEVICES +select ISA_BUS +select SERIAL +select SERIAL_ISA +select VIRTIO_PCI select PCI_EXPRESS_7A select LOONGARCH_IPI select LOONGARCH_PCH_PIC diff --git a/hw/loongarch/loongson3.c b/hw/loongarch/loongson3.c index c42f830208..e4a02e7c18 100644 --- a/hw/loongarch/loongson3.c +++ b/hw/loongarch/loongson3.c @@ -10,8 +10,11 @@ #include "qemu/datadir.h" #include "qapi/error.h" #include "hw/boards.h" +#include "hw/char/serial.h" #include "sysemu/sysemu.h" #include "sysemu/qtest.h" +#include "hw/irq.h" +#include "net/net.h" #include "sysemu/runstate.h" #include "sysemu/reset.h" #include "hw/loongarch/loongarch.h" @@ -20,6 +23,7 @@ #include "hw/intc/loongarch_pch_pic.h" #include "hw/intc/loongarch_pch_msi.h" #include "hw/pci-host/ls7a.h" +#include "hw/misc/unimp.h" static void loongarch_cpu_reset(void *opaque) @@ -91,11 +95,12 @@ static void sysbus_mmio_map_loongarch(SysBusDevice *dev, int n, memory_region_add_subregion(iocsr, addr, dev->mmio[n].memory); } -static void loongson3_irq_init(MachineState *machine) +static PCIBus *loongson3_irq_init(MachineState *machine) { LoongArchMachineState 
*lams = LOONGARCH_MACHINE(machine); -DeviceState *ipi, *extioi, *pch_pic, *pch_msi, *cpudev; +DeviceState *ipi, *extioi, *pch_pic, *pch_msi, *cpudev, *pciehost; SysBusDevice *d; +PCIBus *pci_bus; int cpu, pin, i; unsigned long ipi_addr; @@ -135,6 +140,10 @@ static void loongson3_irq_init(MachineState *machine) sysbus_realize_and_unref(d, &error_fatal); sysbus_mmio_map(d, 0, LS7A_IOAPIC_REG_BASE); +serial_mm_init(get_system_memory(), LS7A_UART_BASE, 0, + qdev_get_gpio_in(pch_pic, LS7A_UART_IRQ - PCH_PIC_IRQ_OFFSET), + 115200, serial_hd(0), DEVICE_LITTLE_ENDIAN); + /* Connect 64 pch_pic irqs to extioi */ for (int i = 0; i < PCH_PIC_IRQ_NUM; i++) { sysbus_connect_irq(d, i, qdev_get_gpio_in(extioi, i)); @@ -149,6 +158,35 @@ static void loongson3_irq_init(MachineState *machine) sysbus_connect_irq(d, i, qdev_get_gpio_in(extioi, i + PCH_MSI_IRQ_START)); } + +pciehost = qdev_new(TYPE_LS7A_HOST_DEVICE); +d = SYS_BUS_DEVICE(pciehost); +sysbus_realize_and_unref(d, &error_fatal); +pci_bus = PCI_HOST_BRIDGE(pciehost)->bus; + +/* Connect 48 pci irq to pch_pic */ +for (i = 0; i < LS7A_PCI_IRQS; i++) { +qdev_connect_gpio_out(pciehost, i, + qdev_get_gpio_in(pch_pic, i + LS7A_DEVICE_IRQS)); +} + +return pci_bus; +} + +/* Network support */ +static void network_init(PCIBus *pci_bus) +{ +int i; + +for (i = 0; i < nb_nics; i++) { +NICInfo *nd = &nd_table[i]; + +if (!nd->model) { +nd->model = g_strdup("virtio"); +} + +pci_nic_init_nofail(nd, pci_bus, nd->model, NULL); +} } static void loongson3_init(MachineState *machine) @@ -161,6 +199,7 @@ static void loongson3_init(MachineState *machine) MemoryRegion *address_space_mem = get_system_memory(); LoongArchMachineState *lams = LOONGARCH_MACHINE(machine); int i; +PCIBus *pci_bus = NULL; if (!cpu_model) { cpu_model = LOONGARCH_CPU_TYPE_NAME("Loongson-3A5000"); @@ -207,8 +246,26 @@ static void loongson3_init(MachineState *machine) memory_region_add_subregion(address_space_mem, 0x9000, &lams->highmem); offset += highram_size; +/* + * There are 
some invalid guest memory access. + * Create some unimplemented devices to emulate this. + */ +create_unimplemented_device("ls7a-lpc", 0x10002000, 0x14); +create_unimplemented_device("pci-dma-cfg", 0x1001041c, 0x4); +create_unimplemented_device("node-bridge", 0xEFDFB000274, 0x4); +create_unimplemented_device("ls7a-lionlpc", 0x1fe01400, 0x
[PATCH v2 0/4] trace qmp commands
Hi all!

This series aims to add trace points for each qmp command with the help of the qapi code generator.

v2:
01: new
02: use qmp_* naming for new trace-events
03: add Philippe's r-b, thanks!
04: rewrite, so that it works now! Thanks to Paolo for fast help!

Vladimir Sementsov-Ogievskiy (4):
  jobs: drop qmp_ trace points
  scripts/qapi/commands: gen_commands(): add add_trace_points argument
  scripts/qapi-gen.py: add --add-trace-points option
  meson: generate trace points for qmp commands

 meson.build              |  1 +
 blockdev.c               |  8
 job-qmp.c                |  6 ---
 block/trace-events       |  9 -
 qapi/meson.build         |  9 -
 scripts/qapi/commands.py | 84 ++--
 scripts/qapi/gen.py      | 13 +--
 scripts/qapi/main.py     | 10 +++--
 trace-events             |  8
 trace/meson.build        | 11 --
 10 files changed, 107 insertions(+), 52 deletions(-)

--
2.31.1
[PATCH v2 1/4] jobs: drop qmp_ trace points
We are going to implement automatic trace points for qmp commands. These several trace points are in conflict with upcoming ones. So, drop them now. Signed-off-by: Vladimir Sementsov-Ogievskiy --- blockdev.c | 8 job-qmp.c | 6 -- block/trace-events | 9 - trace-events | 8 4 files changed, 31 deletions(-) diff --git a/blockdev.c b/blockdev.c index 0eb2823b1b..10961d81a4 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2586,8 +2586,6 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device, goto out; } -trace_qmp_block_stream(bs); - out: aio_context_release(aio_context); } @@ -3354,7 +3352,6 @@ void qmp_block_job_cancel(const char *device, goto out; } -trace_qmp_block_job_cancel(job); job_user_cancel(&job->job, force, errp); out: aio_context_release(aio_context); @@ -3369,7 +3366,6 @@ void qmp_block_job_pause(const char *device, Error **errp) return; } -trace_qmp_block_job_pause(job); job_user_pause(&job->job, errp); aio_context_release(aio_context); } @@ -3383,7 +3379,6 @@ void qmp_block_job_resume(const char *device, Error **errp) return; } -trace_qmp_block_job_resume(job); job_user_resume(&job->job, errp); aio_context_release(aio_context); } @@ -3397,7 +3392,6 @@ void qmp_block_job_complete(const char *device, Error **errp) return; } -trace_qmp_block_job_complete(job); job_complete(&job->job, errp); aio_context_release(aio_context); } @@ -3411,7 +3405,6 @@ void qmp_block_job_finalize(const char *id, Error **errp) return; } -trace_qmp_block_job_finalize(job); job_ref(&job->job); job_finalize(&job->job, errp); @@ -3435,7 +3428,6 @@ void qmp_block_job_dismiss(const char *id, Error **errp) return; } -trace_qmp_block_job_dismiss(bjob); job = &bjob->job; job_dismiss(&job, errp); aio_context_release(aio_context); diff --git a/job-qmp.c b/job-qmp.c index 829a28aa70..cf0cb9d717 100644 --- a/job-qmp.c +++ b/job-qmp.c @@ -57,7 +57,6 @@ void qmp_job_cancel(const char *id, Error **errp) return; } -trace_qmp_job_cancel(job); job_user_cancel(job, true, errp); 
aio_context_release(aio_context); } @@ -71,7 +70,6 @@ void qmp_job_pause(const char *id, Error **errp) return; } -trace_qmp_job_pause(job); job_user_pause(job, errp); aio_context_release(aio_context); } @@ -85,7 +83,6 @@ void qmp_job_resume(const char *id, Error **errp) return; } -trace_qmp_job_resume(job); job_user_resume(job, errp); aio_context_release(aio_context); } @@ -99,7 +96,6 @@ void qmp_job_complete(const char *id, Error **errp) return; } -trace_qmp_job_complete(job); job_complete(job, errp); aio_context_release(aio_context); } @@ -113,7 +109,6 @@ void qmp_job_finalize(const char *id, Error **errp) return; } -trace_qmp_job_finalize(job); job_ref(job); job_finalize(job, errp); @@ -136,7 +131,6 @@ void qmp_job_dismiss(const char *id, Error **errp) return; } -trace_qmp_job_dismiss(job); job_dismiss(&job, errp); aio_context_release(aio_context); } diff --git a/block/trace-events b/block/trace-events index 549090d453..5be3e3913b 100644 --- a/block/trace-events +++ b/block/trace-events @@ -49,15 +49,6 @@ block_copy_read_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" block_copy_write_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d" block_copy_write_zeroes_fail(void *bcs, int64_t start, int ret) "bcs %p start %"PRId64" ret %d" -# ../blockdev.c -qmp_block_job_cancel(void *job) "job %p" -qmp_block_job_pause(void *job) "job %p" -qmp_block_job_resume(void *job) "job %p" -qmp_block_job_complete(void *job) "job %p" -qmp_block_job_finalize(void *job) "job %p" -qmp_block_job_dismiss(void *job) "job %p" -qmp_block_stream(void *bs) "bs %p" - # file-win32.c file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb %p opaque %p offset %"PRId64" count %d type %d" diff --git a/trace-events b/trace-events index a637a61eba..1265f1e0cc 100644 --- a/trace-events +++ b/trace-events @@ -79,14 +79,6 @@ job_state_transition(void *job, int ret, const char *legal, const char *s0, con job_apply_verb(void *job, const char 
*state, const char *verb, const char *legal) "job %p in state %s; applying verb %s (%s)" job_completed(void *job, int ret) "job %p ret %d" -# job-qmp.c -qmp_job_cancel(void *job) "job %p" -qmp_job_pause(void *job) "job %p" -qmp_job_resume(void *job) "job %p" -qmp_job_complete(void *job) "job %p" -qmp_job_finalize(void *job) "job %p" -qmp_job_dismiss(void *job) "job %p" - ### Guest events, keep at bottom -- 2.31.1
[PATCH v2 2/4] scripts/qapi/commands: gen_commands(): add add_trace_points argument
Add possibility to generate trace points for each qmp command. We should generate both trace points and trace-events file, for further trace point code generation. Signed-off-by: Vladimir Sementsov-Ogievskiy --- scripts/qapi/commands.py | 84 ++-- 1 file changed, 73 insertions(+), 11 deletions(-) diff --git a/scripts/qapi/commands.py b/scripts/qapi/commands.py index 21001bbd6b..9691c11f96 100644 --- a/scripts/qapi/commands.py +++ b/scripts/qapi/commands.py @@ -53,7 +53,8 @@ def gen_command_decl(name: str, def gen_call(name: str, arg_type: Optional[QAPISchemaObjectType], boxed: bool, - ret_type: Optional[QAPISchemaType]) -> str: + ret_type: Optional[QAPISchemaType], + add_trace_points: bool) -> str: ret = '' argstr = '' @@ -71,21 +72,65 @@ def gen_call(name: str, if ret_type: lhs = 'retval = ' -ret = mcgen(''' +qmp_name = f'qmp_{c_name(name)}' +upper = qmp_name.upper() + +if add_trace_points: +ret += mcgen(''' + +if (trace_event_get_state_backends(TRACE_%(upper)s)) { +g_autoptr(GString) req_json = qobject_to_json(QOBJECT(args)); +trace_%(qmp_name)s("", req_json->str); +} +''', + upper=upper, qmp_name=qmp_name) + +ret += mcgen(''' %(lhs)sqmp_%(c_name)s(%(args)s&err); -error_propagate(errp, err); ''', c_name=c_name(name), args=argstr, lhs=lhs) -if ret_type: -ret += mcgen(''' + +ret += mcgen(''' if (err) { +''') + +if add_trace_points: +ret += mcgen(''' +trace_%(qmp_name)s("FAIL: ", error_get_pretty(err)); +''', + qmp_name=qmp_name) + +ret += mcgen(''' +error_propagate(errp, err); goto out; } +''') + +if ret_type: +ret += mcgen(''' qmp_marshal_output_%(c_name)s(retval, ret, errp); ''', c_name=ret_type.c_name()) + +if add_trace_points: +if ret_type: +ret += mcgen(''' + +if (trace_event_get_state_backends(TRACE_%(upper)s)) { +g_autoptr(GString) ret_json = qobject_to_json(*ret); +trace_%(qmp_name)s("RET:", ret_json->str); +} +''', + upper=upper, qmp_name=qmp_name) +else: +ret += mcgen(''' + +trace_%(qmp_name)s("SUCCESS", ""); +''', + qmp_name=qmp_name) + return ret @@ 
-122,10 +167,14 @@ def gen_marshal_decl(name: str) -> str: proto=build_marshal_proto(name)) +def gen_trace(name: str) -> str: +return f'qmp_{c_name(name)}(const char *tag, const char *json) "%s%s"\n' + def gen_marshal(name: str, arg_type: Optional[QAPISchemaObjectType], boxed: bool, -ret_type: Optional[QAPISchemaType]) -> str: +ret_type: Optional[QAPISchemaType], +add_trace_points: bool) -> str: have_args = boxed or (arg_type and not arg_type.is_empty()) if have_args: assert arg_type is not None @@ -180,7 +229,7 @@ def gen_marshal(name: str, } ''') -ret += gen_call(name, arg_type, boxed, ret_type) +ret += gen_call(name, arg_type, boxed, ret_type, add_trace_points) ret += mcgen(''' @@ -238,11 +287,12 @@ def gen_register_command(name: str, class QAPISchemaGenCommandVisitor(QAPISchemaModularCVisitor): -def __init__(self, prefix: str): +def __init__(self, prefix: str, add_trace_points: bool): super().__init__( prefix, 'qapi-commands', ' * Schema-defined QAPI/QMP commands', None, __doc__) self._visited_ret_types: Dict[QAPIGenC, Set[QAPISchemaType]] = {} +self.add_trace_points = add_trace_points def _begin_user_module(self, name: str) -> None: self._visited_ret_types[self._genc] = set() @@ -261,6 +311,15 @@ def _begin_user_module(self, name: str) -> None: ''', commands=commands, visit=visit)) + +if self.add_trace_points and c_name(commands) != 'qapi_commands': +self._genc.add(mcgen(''' +#include "trace/trace-qapi.h" +#include "qapi/qmp/qjson.h" +#include "trace/trace-%(nm)s_trace_events.h" +''', + nm=c_name(commands))) + self._genh.add(mcgen(''' #include "%(types)s.h" @@ -322,7 +381,9 @@ def visit_command(self, with ifcontext(ifcond, self._genh, self._genc): self._genh.add(gen_command_decl(name, arg_type, boxed, ret_type)) self._genh.add(gen_marshal_decl(name)) -self._genc.add(gen_marshal(name, arg_type, boxed, ret_type)) +self._genc.add(gen_marshal(name, arg_type, boxed, ret_type, + self.add_trace_points)) +self._gent.add(gen_trace(name)) with 
self._temp_module('./init'): with ifcontext(ifcond, self._genh, sel
[PATCH v2 3/4] scripts/qapi-gen.py: add --add-trace-points option
Add an option to generate trace points. We should generate both trace
points and trace-events files for further trace point code generation.

Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: Philippe Mathieu-Daudé
---
 scripts/qapi/gen.py  | 13 ++---
 scripts/qapi/main.py | 10 +++---
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/scripts/qapi/gen.py b/scripts/qapi/gen.py
index 995a97d2b8..605b3fe68a 100644
--- a/scripts/qapi/gen.py
+++ b/scripts/qapi/gen.py
@@ -251,7 +251,7 @@ def __init__(self,
         self._builtin_blurb = builtin_blurb
         self._pydoc = pydoc
         self._current_module: Optional[str] = None
-        self._module: Dict[str, Tuple[QAPIGenC, QAPIGenH]] = {}
+        self._module: Dict[str, Tuple[QAPIGenC, QAPIGenH, QAPIGen]] = {}
         self._main_module: Optional[str] = None

     @property
@@ -264,6 +264,11 @@ def _genh(self) -> QAPIGenH:
         assert self._current_module is not None
         return self._module[self._current_module][1]

+    @property
+    def _gent(self) -> QAPIGen:
+        assert self._current_module is not None
+        return self._module[self._current_module][2]
+
     @staticmethod
     def _module_dirname(name: str) -> str:
         if QAPISchemaModule.is_user_module(name):
@@ -293,7 +298,8 @@ def _add_module(self, name: str, blurb: str) -> None:
         basename = self._module_filename(self._what, name)
         genc = QAPIGenC(basename + '.c', blurb, self._pydoc)
         genh = QAPIGenH(basename + '.h', blurb, self._pydoc)
-        self._module[name] = (genc, genh)
+        gent = QAPIGen(basename + '.trace-events')
+        self._module[name] = (genc, genh, gent)
         self._current_module = name

     @contextmanager
@@ -304,11 +310,12 @@ def _temp_module(self, name: str) -> Iterator[None]:
         self._current_module = old_module

     def write(self, output_dir: str, opt_builtins: bool = False) -> None:
-        for name, (genc, genh) in self._module.items():
+        for name, (genc, genh, gent) in self._module.items():
             if QAPISchemaModule.is_builtin_module(name) and not opt_builtins:
                 continue
             genc.write(output_dir)
             genh.write(output_dir)
+            gent.write(output_dir)

     def _begin_builtin_module(self) -> None:
         pass
diff --git a/scripts/qapi/main.py b/scripts/qapi/main.py
index f2ea6e0ce4..3adf0319cf 100644
--- a/scripts/qapi/main.py
+++ b/scripts/qapi/main.py
@@ -32,7 +32,8 @@ def generate(schema_file: str,
              output_dir: str,
              prefix: str,
              unmask: bool = False,
-             builtins: bool = False) -> None:
+             builtins: bool = False,
+             add_trace_points: bool = False) -> None:
     """
     Generate C code for the given schema into the target directory.

@@ -49,7 +50,7 @@ def generate(schema_file: str,
     schema = QAPISchema(schema_file)
     gen_types(schema, output_dir, prefix, builtins)
     gen_visit(schema, output_dir, prefix, builtins)
-    gen_commands(schema, output_dir, prefix)
+    gen_commands(schema, output_dir, prefix, add_trace_points)
     gen_events(schema, output_dir, prefix)
     gen_introspect(schema, output_dir, prefix, unmask)

@@ -74,6 +75,8 @@ def main() -> int:
     parser.add_argument('-u', '--unmask-non-abi-names', action='store_true',
                         dest='unmask',
                         help="expose non-ABI names in introspection")
+    parser.add_argument('--add-trace-points', action='store_true',
+                        help="add trace points to qmp marshals")
     parser.add_argument('schema', action='store')
     args = parser.parse_args()

@@ -88,7 +91,8 @@ def main() -> int:
                  output_dir=args.output_dir,
                  prefix=args.prefix,
                  unmask=args.unmask,
-                 builtins=args.builtins)
+                 builtins=args.builtins,
+                 add_trace_points=args.add_trace_points)
     except QAPIError as err:
         print(f"{sys.argv[0]}: {str(err)}", file=sys.stderr)
         return 1
--
2.31.1
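As a small illustration of how the new option reaches `generate()`, the following sketch isolates the argparse part. It is a reduced model, not the real `scripts/qapi/main.py`: `build_parser` is a hypothetical helper and only the two arguments relevant here are declared.

```python
import argparse


def build_parser():
    """Build a reduced parser with only the new flag and the schema argument."""
    parser = argparse.ArgumentParser(description='reduced qapi-gen CLI model')
    # store_true makes the attribute default to False when the flag is absent.
    parser.add_argument('--add-trace-points', action='store_true',
                        help="add trace points to qmp marshals")
    parser.add_argument('schema', action='store')
    return parser


# argparse turns '--add-trace-points' into the attribute 'add_trace_points'.
without_flag = build_parser().parse_args(['qapi-schema.json'])
with_flag = build_parser().parse_args(['--add-trace-points', 'qapi-schema.json'])
```

Because the flag defaults to False, callers that do not pass it keep the previous behaviour, which matches the `add_trace_points: bool = False` default added to `generate()`.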
[PATCH v2 4/4] meson: generate trace points for qmp commands
1. Use --add-trace-points when generating qmp commands
2. Add corresponding .trace-events files as outputs in qapi_files
   custom target
3. Define global qapi_trace_events list of .trace-events file targets,
   to fill in qapi/meson.build and to use in trace/meson.build
4. In trace/meson.build use the new array as an additional source of
   .trace-events files to be processed

Signed-off-by: Vladimir Sementsov-Ogievskiy
---
 meson.build       |  1 +
 qapi/meson.build  |  9 -
 trace/meson.build | 11 ---
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/meson.build b/meson.build
index 17c7280f78..fcb130f163 100644
--- a/meson.build
+++ b/meson.build
@@ -38,6 +38,7 @@ qemu_icondir = get_option('datadir') / 'icons'

 config_host_data = configuration_data()
 genh = []
+qapi_trace_events = []

 target_dirs = config_host['TARGET_DIRS'].split()
 have_linux_user = false
diff --git a/qapi/meson.build b/qapi/meson.build
index c0c49c15e4..826e6c2a0a 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -114,6 +114,7 @@ foreach module : qapi_all_modules
       'qapi-events-@0@.h'.format(module),
       'qapi-commands-@0@.c'.format(module),
       'qapi-commands-@0@.h'.format(module),
+      'qapi-commands-@0@.trace-events'.format(module),
     ]
   endif
   if module.endswith('-target')
@@ -126,7 +127,7 @@ endforeach
 qapi_files = custom_target('shared QAPI source files',
   output: qapi_util_outputs + qapi_specific_outputs + qapi_nonmodule_outputs,
   input: [ files('qapi-schema.json') ],
-  command: [ qapi_gen, '-o', 'qapi', '-b', '@INPUT0@' ],
+  command: [ qapi_gen, '-o', 'qapi', '-b', '@INPUT0@', '--add-trace-points' ],
   depend_files: [ qapi_inputs, qapi_gen_depends ])

 # Now go through all the outputs and add them to the right sourceset.
@@ -137,6 +138,9 @@ foreach output : qapi_util_outputs
   if output.endswith('.h')
     genh += qapi_files[i]
   endif
+  if output.endswith('.trace-events')
+    qapi_trace_events += qapi_files[i]
+  endif
   util_ss.add(qapi_files[i])
   i = i + 1
 endforeach
@@ -145,6 +149,9 @@ foreach output : qapi_specific_outputs + qapi_nonmodule_outputs
   if output.endswith('.h')
     genh += qapi_files[i]
   endif
+  if output.endswith('.trace-events')
+    qapi_trace_events += qapi_files[i]
+  endif
   specific_ss.add(when: 'CONFIG_SOFTMMU', if_true: qapi_files[i])
   i = i + 1
 endforeach
diff --git a/trace/meson.build b/trace/meson.build
index 573dd699c6..c4794a1f2a 100644
--- a/trace/meson.build
+++ b/trace/meson.build
@@ -2,10 +2,15 @@ specific_ss.add(files('control-target.c'))

 trace_events_files = []
-foreach dir : [ '.' ] + trace_events_subdirs
-  trace_events_file = meson.project_source_root() / dir / 'trace-events'
+foreach item : [ '.' ] + trace_events_subdirs + qapi_trace_events
+  if item in qapi_trace_events
+    trace_events_file = item
+    group_name = item.full_path().split('/')[-1].underscorify()
+  else
+    trace_events_file = meson.project_source_root() / item / 'trace-events'
+    group_name = item == '.' ? 'root' : item.underscorify()
+  endif
   trace_events_files += [ trace_events_file ]
-  group_name = dir == '.' ? 'root' : dir.underscorify()
   group = '--group=' + group_name
   fmt = '@0@-' + group_name + '.@1@'

--
2.31.1
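The group name derivation in trace/meson.build can be modelled in Python to see which names come out. This is only a sketch: `underscorify` approximates Meson's string method of the same name (every character that is not a letter or digit becomes `_`), `group_name` is a hypothetical helper, and the paths used with it are made-up examples.

```python
import re


def underscorify(text):
    """Approximate Meson's str.underscorify(): non-alphanumerics become '_'."""
    return re.sub(r'[^A-Za-z0-9]', '_', text)


def group_name(item, is_qapi_target=False):
    """Derive the trace group name for a subdirectory or a generated file.

    For a generated QAPI file only the last path component is used,
    mirroring full_path().split('/')[-1] in the patch.
    """
    if is_qapi_target:
        return underscorify(item.split('/')[-1])
    return 'root' if item == '.' else underscorify(item)
```

For a generated file named `qapi-commands-authz.trace-events` this gives `qapi_commands_authz_trace_events`, which lines up with the `trace/trace-%(nm)s_trace_events.h` include pattern that patch 2/4 emits into the generated C files.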
Re: [PATCH qemu] s390x/css: fix PMCW invalid mask
On Thu, Dec 23 2021, Halil Pasic wrote: > On Wed, 22 Dec 2021 17:46:11 +0100 > Cornelia Huck wrote: > >> On Thu, Dec 16 2021, Nico Boehr wrote: >> >> > Previously, we required bits 5, 6 and 7 to be zero (0x07 == 0b111). But, >> > as per the principles of operation, bit 5 is ignored in MSCH and bits 0, >> > 1, 6 and 7 need to be zero. >> > >> > As both PMCW_FLAGS_MASK_INVALID and ioinst_schib_valid() are only used >> > by ioinst_handle_msch(), adjust the mask accordingly. >> > >> > Fixes: db1c8f53bfb1 ("s390: Channel I/O basic definitions.") >> > Signed-off-by: Nico Boehr >> > Reviewed-by: Pierre Morel >> > Reviewed-by: Halil Pasic >> > Reviewed-by: Janosch Frank >> > --- >> > include/hw/s390x/ioinst.h | 2 +- >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> > >> > diff --git a/include/hw/s390x/ioinst.h b/include/hw/s390x/ioinst.h >> > index 3771fff9d44d..ea8d0f244492 100644 >> > --- a/include/hw/s390x/ioinst.h >> > +++ b/include/hw/s390x/ioinst.h >> > @@ -107,7 +107,7 @@ QEMU_BUILD_BUG_MSG(sizeof(PMCW) != 28, "size of PMCW >> > is wrong"); >> > #define PMCW_FLAGS_MASK_MP 0x0004 >> > #define PMCW_FLAGS_MASK_TF 0x0002 >> > #define PMCW_FLAGS_MASK_DNV 0x0001 >> > -#define PMCW_FLAGS_MASK_INVALID 0x0700 >> > +#define PMCW_FLAGS_MASK_INVALID 0xc300 >> >> Removing bit 5 from this mask makes sense, at it is simply ignored. >> >> I'm a bit confused about bits 0 and 1, however. They are _QF and _W, >> respectively (just out of the context here), which are in the same class >> as _DNV (i.e. characteristics of the subchannel that cannot be modified >> via msch). Looking at the PoP, I don't see what is supposed to happen if >> the program tries to modify the dnv bit (maybe I'm simply overlooking >> it.) 
I would naively assume that the w bit should behave in the same way >> (as it does for message subchannels what dnv does for I/O subchannels, >> and the rest of the values are not meaningful if it is not set), and >> probably also the qf bit (as it doesn't make sense for the program to >> turn QDIO capabilities on and off.) The main question is whether trying >> to modify these bits causes an error or is ignored. The PoP suggests an >> error (no idea if the internal architecture agrees, it hopefully does); >> what happens for dnv? > > """ > Bits 0, 1, 6, and 7 of word 1, and bits 0-28 of word 6 > of the SCHIB operand must be zeros, and bits 9 and > 10 of word 1 must not both be ones. When the > extended-I/O-measurement-block facility is installed > and a format-1 measurement block is specified, bits > 26-31 of word 11 must be specified as zeros. > """ > (IBM z/Architecture Principles of Operation (SA22-7832-10), 14-8) > > The internal architecture agrees. Thanks for checking. > > DNV bit is ignored. Regarding why, I don't know. Probably for historic > reasons. Yeah, it's a bit odd, "for historic reason" seems plausible. > The PoP tells us that whatever is not listed as significant > or checked and results in an operation exception if not appropriate > is ignored: > """ > The remaining > fields of the SCHIB are ignored and do not affect the > processing of MODIFY SUBCHANNEL. (For further > details, see “Subchannel-Information Block” on > page 2 > """ > (same page) > > Regarding word 1 of the SCHIB the alignment between PoP and AR is > perfect AFAICT. > >> >> We support neither message subchannels nor QDIO in QEMU, so it's >> probably not relevant right now; but it would still be good if we could >> clarify the expected behaviour here :) >> >> > >> > #define PMCW_CHARS_MASK_ST 0x00e0 >> > #define PMCW_CHARS_MASK_MBFC 0x0004 >> >> In that case, Reviewed-by: Cornelia Huck
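As a side note on the mask arithmetic discussed in this thread, here is a minimal sketch of how the two mask values follow from the bit numbers. It assumes the PoP's numbering, where bit 0 is the most significant bit of the 16-bit flags field; the helper names (pmcw_bit, pmcw_invalid_old, pmcw_invalid_new, pmcw_flags_valid) are made up for illustration and are not QEMU functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: PoP bit n (0 = most significant) of a 16-bit field. */
static inline uint16_t pmcw_bit(unsigned n)
{
    return (uint16_t)(0x8000u >> n);
}

/* Old mask: bits 5, 6 and 7 had to be zero. */
static inline uint16_t pmcw_invalid_old(void)
{
    return (uint16_t)(pmcw_bit(5) | pmcw_bit(6) | pmcw_bit(7));
}

/* New mask: bits 0, 1, 6 and 7 have to be zero; bit 5 is ignored. */
static inline uint16_t pmcw_invalid_new(void)
{
    return (uint16_t)(pmcw_bit(0) | pmcw_bit(1) | pmcw_bit(6) | pmcw_bit(7));
}

/* Illustrative check only: flags are acceptable if no masked bit is set. */
static inline bool pmcw_flags_valid(uint16_t flags, uint16_t invalid_mask)
{
    return (flags & invalid_mask) == 0;
}
```

Under these assumptions the old mask works out to 0x0700 and the new one to 0xc300, matching the patch. A flags value with only bit 5 set (0x0400) is rejected by the old mask and accepted by the new one.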
Re: [PATCH v10 1/3] migration/dirtyrate: implement vCPU dirtyrate calculation periodically
Hi, Yong, On Tue, Dec 14, 2021 at 07:07:32PM +0800, huang...@chinatelecom.cn wrote: > From: Hyman Huang(黄勇) > > Introduce the third method GLOBAL_DIRTY_LIMIT of dirty > tracking for calculate dirtyrate periodly for dirty restraint. > > Implement thread for calculate dirtyrate periodly, which will > be used for dirty page limit. > > Add dirtylimit.h to introduce the util function for dirty > limit implementation. Sorry to be late on reading it, my apologies. > > Signed-off-by: Hyman Huang(黄勇) > --- > include/exec/memory.h | 5 +- > include/sysemu/dirtylimit.h | 51 ++ > migration/dirtyrate.c | 160 > +--- > migration/dirtyrate.h | 2 + > 4 files changed, 207 insertions(+), 11 deletions(-) > create mode 100644 include/sysemu/dirtylimit.h > > diff --git a/include/exec/memory.h b/include/exec/memory.h > index 20f1b27..606bec8 100644 > --- a/include/exec/memory.h > +++ b/include/exec/memory.h > @@ -69,7 +69,10 @@ static inline void fuzz_dma_read_cb(size_t addr, > /* Dirty tracking enabled because measuring dirty rate */ > #define GLOBAL_DIRTY_DIRTY_RATE (1U << 1) > > -#define GLOBAL_DIRTY_MASK (0x3) > +/* Dirty tracking enabled because dirty limit */ > +#define GLOBAL_DIRTY_LIMIT (1U << 2) > + > +#define GLOBAL_DIRTY_MASK (0x7) > > extern unsigned int global_dirty_tracking; > > diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h > new file mode 100644 > index 000..34e48f8 > --- /dev/null > +++ b/include/sysemu/dirtylimit.h > @@ -0,0 +1,51 @@ > +/* > + * dirty limit helper functions > + * > + * Copyright (c) 2021 CHINA TELECOM CO.,LTD. > + * > + * Authors: > + * Hyman Huang(黄勇) > + * > + * This work is licensed under the terms of the GNU GPL, version 2 or later. > + * See the COPYING file in the top-level directory. > + */ > +#ifndef QEMU_DIRTYRLIMIT_H > +#define QEMU_DIRTYRLIMIT_H > + > +#define DIRTYLIMIT_CALC_TIME_MS 1000/* 1000ms */ > + > +/** > + * dirtylimit_calc_current > + * > + * get current dirty page rate for specified virtual CPU. 
> + */ > +int64_t dirtylimit_calc_current(int cpu_index); > + > +/** > + * dirtylimit_calc_start > + * > + * start dirty page rate calculation thread. > + */ > +void dirtylimit_calc_start(void); > + > +/** > + * dirtylimit_calc_quit > + * > + * quit dirty page rate calculation thread. > + */ > +void dirtylimit_calc_quit(void); > + > +/** > + * dirtylimit_calc_state_init > + * > + * initialize dirty page rate calculation state. > + */ > +void dirtylimit_calc_state_init(int max_cpus); > + > +/** > + * dirtylimit_calc_state_finalize > + * > + * finalize dirty page rate calculation state. > + */ > +void dirtylimit_calc_state_finalize(void); > +#endif Since dirtylimit and dirtyrate looks so alike, not sure it's easier to just reuse dirtyrate.h; after all you reused dirtyrate.c. > diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c > index d65e744..e8d4e4a 100644 > --- a/migration/dirtyrate.c > +++ b/migration/dirtyrate.c > @@ -27,6 +27,7 @@ > #include "qapi/qmp/qdict.h" > #include "sysemu/kvm.h" > #include "sysemu/runstate.h" > +#include "sysemu/dirtylimit.h" > #include "exec/memory.h" > > /* > @@ -46,6 +47,155 @@ static struct DirtyRateStat DirtyStat; > static DirtyRateMeasureMode dirtyrate_mode = > DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING; > > +struct { > +DirtyRatesData data; > +bool quit; > +QemuThread thread; > +} *dirtylimit_calc_state; > + > +static void dirtylimit_global_dirty_log_start(void) > +{ > +qemu_mutex_lock_iothread(); > +memory_global_dirty_log_start(GLOBAL_DIRTY_LIMIT); > +qemu_mutex_unlock_iothread(); > +} > + > +static void dirtylimit_global_dirty_log_stop(void) > +{ > +qemu_mutex_lock_iothread(); > +memory_global_dirty_log_stop(GLOBAL_DIRTY_LIMIT); > +qemu_mutex_unlock_iothread(); > +} This is merely dirtyrate_global_dirty_log_start/stop but with a different flag. Let's introduce global_dirty_log_change() with BQL? 
  global_dirty_log_change(flag, onoff)
  {
      qemu_mutex_lock_iothread();
      if (onoff) {
          memory_global_dirty_log_start(flag);
      } else {
          memory_global_dirty_log_stop(flag);
      }
      qemu_mutex_unlock_iothread();
  }

Then we merge 4 functions into one. We can also have a BQL version of global_dirty_log_sync() in the same patch if you think the above is helpful.

> +
> +static inline void record_dirtypages(DirtyPageRecord *dirty_pages,
> +                                     CPUState *cpu, bool start)
> +{
> +    if (start) {
> +        dirty_pages[cpu->cpu_index].start_pages = cpu->dirty_pages;
> +    } else {
> +        dirty_pages[cpu->cpu_index].end_pages = cpu->dirty_pages;
> +    }
> +}
> +
> +static void dirtylimit_calc_func(void)

Would you still consider merging this with calculate_dirtyrate_dirty_ring? I still don't see why it can't. Maybe it cannot be directly reused, but the whole logic is really, really simi
Re: [RFC PATCH v2 05/14] block/mirror.c: use of job helpers in drivers to avoid TOC/TOU
On 20/12/2021 11:47, Vladimir Sementsov-Ogievskiy wrote: 20.12.2021 13:34, Emanuele Giuseppe Esposito wrote: On 18/12/2021 12:53, Vladimir Sementsov-Ogievskiy wrote: 04.11.2021 17:53, Emanuele Giuseppe Esposito wrote: Once job lock is used and aiocontext is removed, mirror has to perform job operations under the same critical section, using the helpers prepared in previous commit. Note: at this stage, job_{lock/unlock} and job lock guard macros are *nop*. Signed-off-by: Emanuele Giuseppe Esposito --- block/mirror.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/block/mirror.c b/block/mirror.c index 00089e519b..f22fa7da6e 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -653,7 +653,7 @@ static int mirror_exit_common(Job *job) BlockDriverState *target_bs; BlockDriverState *mirror_top_bs; Error *local_err = NULL; - bool abort = job->ret < 0; + bool abort = job_has_failed(job); int ret = 0; if (s->prepared) { @@ -1161,9 +1161,7 @@ static void mirror_complete(Job *job, Error **errp) s->should_complete = true; /* If the job is paused, it will be re-entered when it is resumed */ - if (!job->paused) { - job_enter(job); - } + job_enter_not_paused(job); } static void coroutine_fn mirror_pause(Job *job) @@ -1182,7 +1180,7 @@ static bool mirror_drained_poll(BlockJob *job) * from one of our own drain sections, to avoid a deadlock waiting for * ourselves. */ - if (!s->common.job.paused && !job_is_cancelled(&job->job) && !s->in_drain) { + if (job_not_paused_nor_cancelled(&s->common.job) && !s->in_drain) { return true; } Why to introduce a separate API function for every use case? Could we instead just use WITH_JOB_LOCK_GUARD() ? This implies making the struct job_mutex public. Is that ok for you? Yes, I think it's OK. Alternatively, you can use job_lock() / job_unlock(), or even rewrite WITH_JOB_LOCK_GUARD() macro using job_lock/job_unlock, to keep mutex private.. But I don't think it really worth it now. 
Note that struct Job is already public, so if we use a per-job mutex in the future it is still not a problem. Only when we decide to make struct Job private will we have to decide something about JOB_LOCK_GUARD(), and at that point we'll just rewrite it to work through some helper function instead of directly touching the mutex.

Ok, I will do that. Just FYI, the initial idea was that drivers like monitor would not need to know about the job_mutex lock; that is why I made the helpers in mirror.c.

Thank you,
Emanuele
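To illustrate the TOC/TOU concern behind this thread, here is a minimal sketch of doing the check and the action inside one critical section. It uses a plain pthread mutex and a made-up DemoJob type; none of these names (DemoJob, demo_job_mutex, demo_job_enter_if_not_paused) exist in QEMU, and the real series uses job_lock()/job_unlock() or WITH_JOB_LOCK_GUARD() instead.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a job; this is not QEMU's struct Job. */
typedef struct DemoJob {
    bool paused;
    int enter_count;   /* how many times the job was (re-)entered */
} DemoJob;

/* A single global lock, mirroring the idea of one job mutex. */
static pthread_mutex_t demo_job_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold demo_job_mutex. */
static void demo_job_enter_locked(DemoJob *job)
{
    job->enter_count++;
}

/*
 * Check and act under the same lock, so "paused" cannot change
 * between the test and the enter. Returns true if the job was entered.
 */
static bool demo_job_enter_if_not_paused(DemoJob *job)
{
    bool entered = false;

    pthread_mutex_lock(&demo_job_mutex);
    if (!job->paused) {
        demo_job_enter_locked(job);
        entered = true;
    }
    pthread_mutex_unlock(&demo_job_mutex);
    return entered;
}
```

Reading job->paused without the lock and then calling an enter function that takes the lock on its own would leave a window in which another thread could pause the job. That window is what either a dedicated helper or a lock guard around the whole sequence is meant to close.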
Re: [RFC PATCH v2 11/14] block_job_query: remove atomic read
On 18/12/2021 13:07, Vladimir Sementsov-Ogievskiy wrote: 04.11.2021 17:53, Emanuele Giuseppe Esposito wrote:

Not sure what the atomic here was supposed to do, since job.busy is protected by the job lock.

In block_job_query() it is protected only since the previous commit. So, before the previous commit, the atomic read makes sense.

To me it doesn't really, because it is protected with job_lock/unlock in job.c, and here it is read with an atomic. But maybe I am missing something.

Hmm, but job_lock() is still a no-op at this point. So, actually, it would be more correct to drop this qatomic_read after patch 14.

Will do.

Thank you,
Emanuele
[PULL 02/15] meson: reuse common_user_inc when building files specific to user-mode emulators
Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- meson.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/meson.build b/meson.build index f45ecf31bd..b0af02b805 100644 --- a/meson.build +++ b/meson.build @@ -2897,6 +2897,7 @@ foreach target : target_dirs else abi = config_target['TARGET_ABI_DIR'] target_type='user' +target_inc += common_user_inc qemu_target_name = 'qemu-' + target_name if target_base_arch in target_user_arch t = target_user_arch[target_base_arch].apply(config_target, strict: false) @@ -2905,7 +2906,6 @@ foreach target : target_dirs endif if 'CONFIG_LINUX_USER' in config_target base_dir = 'linux-user' - target_inc += include_directories('linux-user/host/' / host_arch) endif if 'CONFIG_BSD_USER' in config_target base_dir = 'bsd-user' -- 2.33.1
[PULL 00/15] Build system and KVM changes for 2021-12-23
The following changes since commit 2bf40d0841b942e7ba12953d515e62a436f0af84: Merge tag 'pull-user-20211220' of https://gitlab.com/rth7680/qemu into staging (2021-12-20 13:20:07 -0800) are available in the Git repository at: https://gitlab.com/bonzini/qemu.git tags/for-upstream for you to fetch changes up to c139f026aa685e6b27a5a8ecb3272d4ed1700312: KVM: x86: ignore interrupt_bitmap field of KVM_GET/SET_SREGS (2021-12-23 10:05:28 +0100) * configure and meson cleanups * KVM_GET/SET_SREGS2 support for x86 * fix occasional container build failures for debian-tricore-cross Maxim Levitsky (1): KVM: use KVM_{GET|SET}_SREGS2 when supported. Paolo Bonzini (13): docker: include bison in debian-tricore-cross meson: reuse common_user_inc when building files specific to user-mode emulators user: move common-user includes to a subdirectory of {bsd,linux}-user/ meson: cleanup common-user/ build configure: simplify creation of plugin symbol list configure: do not set bsd_user/linux_user early configure, makefile: remove traces of really old files configure: parse --enable/--disable-strip automatically, flip default configure: move non-command-line variables away from command-line parsing section meson: build contrib/ executables after generated headers configure, meson: move config-poison.h to meson meson: add comments in the target-specific flags section KVM: x86: ignore interrupt_bitmap field of KVM_GET/SET_SREGS Thomas Huth (1): block/file-posix: Simplify the XFS_IOC_DIOINFO handling Makefile | 11 +- block/file-posix.c | 37 ++--- bsd-user/{ => include}/special-errno.h | 0 bsd-user/meson.build | 2 +- common-user/meson.build| 2 +- configure | 182 +++-- contrib/elf2dmp/meson.build| 2 +- contrib/ivshmem-client/meson.build | 2 +- contrib/ivshmem-server/meson.build | 2 +- contrib/rdmacm-mux/meson.build | 2 +- .../{ => include}/host/aarch64/host-signal.h | 0 linux-user/{ => include}/host/alpha/host-signal.h | 0 linux-user/{ => include}/host/arm/host-signal.h| 0 linux-user/{ => 
include}/host/i386/host-signal.h | 0 linux-user/{ => include}/host/mips/host-signal.h | 0 linux-user/{ => include}/host/ppc/host-signal.h| 0 linux-user/{ => include}/host/ppc64/host-signal.h | 0 linux-user/{ => include}/host/riscv/host-signal.h | 0 linux-user/{ => include}/host/s390/host-signal.h | 0 linux-user/{ => include}/host/s390x/host-signal.h | 0 linux-user/{ => include}/host/sparc/host-signal.h | 0 .../{ => include}/host/sparc64/host-signal.h | 0 linux-user/{ => include}/host/x32/host-signal.h| 0 linux-user/{ => include}/host/x86_64/host-signal.h | 0 linux-user/{ => include}/special-errno.h | 0 linux-user/meson.build | 4 +- meson.build| 33 ++-- pc-bios/s390-ccw/Makefile | 2 - plugins/meson.build| 11 +- scripts/make-config-poison.sh | 16 ++ scripts/meson-buildoptions.py | 21 ++- scripts/meson-buildoptions.sh | 3 + target/i386/cpu.h | 3 + target/i386/kvm/kvm.c | 130 +-- target/i386/machine.c | 29 .../docker/dockerfiles/debian-tricore-cross.docker | 1 + 36 files changed, 259 insertions(+), 236 deletions(-) rename bsd-user/{ => include}/special-errno.h (100%) rename linux-user/{ => include}/host/aarch64/host-signal.h (100%) rename linux-user/{ => include}/host/alpha/host-signal.h (100%) rename linux-user/{ => include}/host/arm/host-signal.h (100%) rename linux-user/{ => include}/host/i386/host-signal.h (100%) rename linux-user/{ => include}/host/mips/host-signal.h (100%) rename linux-user/{ => include}/host/ppc/host-signal.h (100%) rename linux-user/{ => include}/host/ppc64/host-signal.h (100%) rename linux-user/{ => include}/host/riscv/host-signal.h (100%) rename linux-user/{ => include}/host/s390/host-signal.h (100%) rename linux-user/{ => include}/host/s390x/host-signal.h (100%) rename linux-user/{ => include}/host/sparc/host-signal.h (100%) rename linux-user/{ => include}/host/sparc64/host-signal.h (100%) rename linux-user/{ => include}/host/x32/host-signal.h (100%) rename linux-user/{ => include}/host/x86_64/host-signal.h (100%) rename linux-user/{ => 
include}/special-errno.h (100%) create
[PULL 14/15] KVM: use KVM_{GET|SET}_SREGS2 when supported.
From: Maxim Levitsky This allows to make PDPTRs part of the migration stream and thus not reload them after migration which is against X86 spec. Signed-off-by: Maxim Levitsky Message-Id: <20211101132300.192584-2-mlevi...@redhat.com> Signed-off-by: Paolo Bonzini --- target/i386/cpu.h | 3 ++ target/i386/kvm/kvm.c | 108 +- target/i386/machine.c | 29 3 files changed, 138 insertions(+), 2 deletions(-) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 04f2b790c9..9911d7c871 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1455,6 +1455,9 @@ typedef struct CPUX86State { SegmentCache idt; /* only base and limit are used */ target_ulong cr[5]; /* NOTE: cr1 is unused */ + +bool pdptrs_valid; +uint64_t pdptrs[4]; int32_t a20_mask; BNDReg bnd_regs[4]; diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 13f8e30c2a..d81745620b 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -124,6 +124,7 @@ static uint32_t num_architectural_pmu_fixed_counters; static int has_xsave; static int has_xcrs; static int has_pit_state2; +static int has_sregs2; static int has_exception_payload; static bool has_msr_mcg_ext_ctl; @@ -2324,6 +2325,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s) has_xsave = kvm_check_extension(s, KVM_CAP_XSAVE); has_xcrs = kvm_check_extension(s, KVM_CAP_XCRS); has_pit_state2 = kvm_check_extension(s, KVM_CAP_PIT_STATE2); +has_sregs2 = kvm_check_extension(s, KVM_CAP_SREGS2) > 0; hv_vpindex_settable = kvm_check_extension(s, KVM_CAP_HYPERV_VP_INDEX); @@ -2650,6 +2652,61 @@ static int kvm_put_sregs(X86CPU *cpu) return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_SREGS, &sregs); } +static int kvm_put_sregs2(X86CPU *cpu) +{ +CPUX86State *env = &cpu->env; +struct kvm_sregs2 sregs; +int i; + +sregs.flags = 0; + +if ((env->eflags & VM_MASK)) { +set_v8086_seg(&sregs.cs, &env->segs[R_CS]); +set_v8086_seg(&sregs.ds, &env->segs[R_DS]); +set_v8086_seg(&sregs.es, &env->segs[R_ES]); +set_v8086_seg(&sregs.fs, &env->segs[R_FS]); +set_v8086_seg(&sregs.gs, 
&env->segs[R_GS]); +set_v8086_seg(&sregs.ss, &env->segs[R_SS]); +} else { +set_seg(&sregs.cs, &env->segs[R_CS]); +set_seg(&sregs.ds, &env->segs[R_DS]); +set_seg(&sregs.es, &env->segs[R_ES]); +set_seg(&sregs.fs, &env->segs[R_FS]); +set_seg(&sregs.gs, &env->segs[R_GS]); +set_seg(&sregs.ss, &env->segs[R_SS]); +} + +set_seg(&sregs.tr, &env->tr); +set_seg(&sregs.ldt, &env->ldt); + +sregs.idt.limit = env->idt.limit; +sregs.idt.base = env->idt.base; +memset(sregs.idt.padding, 0, sizeof sregs.idt.padding); +sregs.gdt.limit = env->gdt.limit; +sregs.gdt.base = env->gdt.base; +memset(sregs.gdt.padding, 0, sizeof sregs.gdt.padding); + +sregs.cr0 = env->cr[0]; +sregs.cr2 = env->cr[2]; +sregs.cr3 = env->cr[3]; +sregs.cr4 = env->cr[4]; + +sregs.cr8 = cpu_get_apic_tpr(cpu->apic_state); +sregs.apic_base = cpu_get_apic_base(cpu->apic_state); + +sregs.efer = env->efer; + +if (env->pdptrs_valid) { +for (i = 0; i < 4; i++) { +sregs.pdptrs[i] = env->pdptrs[i]; +} +sregs.flags |= KVM_SREGS2_FLAGS_PDPTRS_VALID; +} + +return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_SREGS2, &sregs); +} + + static void kvm_msr_buf_reset(X86CPU *cpu) { memset(cpu->kvm_msr_buf, 0, MSR_BUF_SIZE); @@ -3330,6 +3387,53 @@ static int kvm_get_sregs(X86CPU *cpu) return 0; } +static int kvm_get_sregs2(X86CPU *cpu) +{ +CPUX86State *env = &cpu->env; +struct kvm_sregs2 sregs; +int i, ret; + +ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_SREGS2, &sregs); +if (ret < 0) { +return ret; +} + +get_seg(&env->segs[R_CS], &sregs.cs); +get_seg(&env->segs[R_DS], &sregs.ds); +get_seg(&env->segs[R_ES], &sregs.es); +get_seg(&env->segs[R_FS], &sregs.fs); +get_seg(&env->segs[R_GS], &sregs.gs); +get_seg(&env->segs[R_SS], &sregs.ss); + +get_seg(&env->tr, &sregs.tr); +get_seg(&env->ldt, &sregs.ldt); + +env->idt.limit = sregs.idt.limit; +env->idt.base = sregs.idt.base; +env->gdt.limit = sregs.gdt.limit; +env->gdt.base = sregs.gdt.base; + +env->cr[0] = sregs.cr0; +env->cr[2] = sregs.cr2; +env->cr[3] = sregs.cr3; +env->cr[4] = sregs.cr4; + +env->efer = 
sregs.efer; + +env->pdptrs_valid = sregs.flags & KVM_SREGS2_FLAGS_PDPTRS_VALID; + +if (env->pdptrs_valid) { +for (i = 0; i < 4; i++) { +env->pdptrs[i] = sregs.pdptrs[i]; +} +} + +/* changes to apic base and cr8/tpr are read back via kvm_arch_post_run */ +x86_update_hflags(env); + +return 0; +} + static int kvm_get_msrs(X86CPU *cpu) { CPUX86State *env = &cpu->env; @@ -4173,7 +4277,7 @@ int kvm_arch_put_registers(CPUState *cpu, int level) assert(
[PULL 01/15] docker: include bison in debian-tricore-cross
Binutils sometimes fail to build if bison is not installed: /bin/sh ./ylwrap `test -f arparse.y || echo ./`arparse.y y.tab.c arparse.c y.tab.h arparse.h y.output arparse.output -- -d ./ylwrap: 109: ./ylwrap: -d: not found (the correct invocation of ylwrap would have "bison -d" after the double dash). Work around by installing it in the container. Cc: Alex Bennée Resolves: https://gitlab.com/qemu-project/qemu/-/issues/596 Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- tests/docker/dockerfiles/debian-tricore-cross.docker | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/docker/dockerfiles/debian-tricore-cross.docker b/tests/docker/dockerfiles/debian-tricore-cross.docker index d8df2c6117..3f6b55562c 100644 --- a/tests/docker/dockerfiles/debian-tricore-cross.docker +++ b/tests/docker/dockerfiles/debian-tricore-cross.docker @@ -16,6 +16,7 @@ MAINTAINER Philippe Mathieu-Daudé RUN apt update && \ DEBIAN_FRONTEND=noninteractive apt install -yy eatmydata && \ DEBIAN_FRONTEND=noninteractive eatmydata apt install -yy \ + bison \ bzip2 \ ca-certificates \ ccache \ -- 2.33.1
[PULL 07/15] configure: do not set bsd_user/linux_user early
Similar to other optional features, leave the variables empty and compute the actual value later. Use the existence of include or source directories to detect whether an OS or CPU supports respectively bsd-user and linux-user. For now, BSD user-mode emulation is buildable even on TCI-only architectures. This probably will change once safe signals are brought over from linux-user. Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- configure | 28 +--- 1 file changed, 17 insertions(+), 11 deletions(-) diff --git a/configure b/configure index 0306f0c8bc..6516ec243c 100755 --- a/configure +++ b/configure @@ -320,8 +320,8 @@ linux="no" solaris="no" profiler="no" softmmu="yes" -linux_user="no" -bsd_user="no" +linux_user="" +bsd_user="" pkgversion="" pie="" qom_cast_debug="yes" @@ -538,7 +538,6 @@ gnu/kfreebsd) ;; freebsd) bsd="yes" - bsd_user="yes" make="${MAKE-gmake}" # needed for kinfo_getvmmap(3) in libutil.h ;; @@ -583,7 +582,6 @@ haiku) ;; linux) linux="yes" - linux_user="yes" vhost_user=${default_feature:-yes} ;; esac @@ -1257,18 +1255,26 @@ if eval test -z "\${cross_cc_$cpu}"; then cross_cc_vars="$cross_cc_vars cross_cc_${cpu}" fi -# For user-mode emulation the host arch has to be one we explicitly -# support, even if we're using TCI. -if [ "$ARCH" = "unknown" ]; then - bsd_user="no" - linux_user="no" -fi - default_target_list="" deprecated_targets_list=ppc64abi32-linux-user deprecated_features="" mak_wilds="" +if [ "$linux_user" != no ]; then +if [ "$targetos" = linux ] && [ -d $source_path/linux-user/host/$cpu ]; then +linux_user=yes +elif [ "$linux_user" = yes ]; then +error_exit "linux-user not supported on this architecture" +fi +fi +if [ "$bsd_user" != no ]; then +if [ "$bsd_user" = "" ]; then +test $targetos = freebsd && bsd_user=yes +fi +if [ "$bsd_user" = yes ] && ! 
[ -d $source_path/bsd-user/$targetos ]; then +error_exit "bsd-user not supported on this host OS" +fi +fi if [ "$softmmu" = "yes" ]; then mak_wilds="${mak_wilds} $source_path/configs/targets/*-softmmu.mak" fi -- 2.33.1
[PULL 04/15] meson: cleanup common-user/ build
It is not necessary to have a separate static_library just for common_user files; using the one that already covers the rest of common_ss is enough unless you need to reuse some source files between emulators and tests. Just place common files for all user-mode emulators in common_ss, similar to what is already done for softmmu_ss in full system emulators. The only disadvantage is that the include_directories under bsd-user/include/ and linux-user/include/ are now enabled for all targets rather than only user mode emulators. This however is not different from how include/sysemu/ is available when building user mode emulators. Tested-by: Richard Henderson Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- common-user/meson.build | 2 +- meson.build | 13 + 2 files changed, 2 insertions(+), 13 deletions(-) diff --git a/common-user/meson.build b/common-user/meson.build index 5cb42bc664..26212dda5c 100644 --- a/common-user/meson.build +++ b/common-user/meson.build @@ -1,6 +1,6 @@ common_user_inc += include_directories('host/' / host_arch) -common_user_ss.add(files( +user_ss.add(files( 'safe-syscall.S', 'safe-syscall-error.c', )) diff --git a/meson.build b/meson.build index b0af02b805..879628ab68 100644 --- a/meson.build +++ b/meson.build @@ -2377,7 +2377,6 @@ blockdev_ss = ss.source_set() block_ss = ss.source_set() chardev_ss = ss.source_set() common_ss = ss.source_set() -common_user_ss = ss.source_set() crypto_ss = ss.source_set() hwcore_ss = ss.source_set() io_ss = ss.source_set() @@ -2629,17 +2628,6 @@ subdir('common-user') subdir('bsd-user') subdir('linux-user') -common_user_ss = common_user_ss.apply(config_all, strict: false) -common_user = static_library('common-user', - sources: common_user_ss.sources(), - dependencies: common_user_ss.dependencies(), - include_directories: common_user_inc, - name_suffix: 'fa', - build_by_default: false) -common_user = declare_dependency(link_with: common_user) - -user_ss.add(common_user) - # needed for fuzzing 
binaries subdir('tests/qtest/libqos') subdir('tests/qtest/fuzz') @@ -2857,6 +2845,7 @@ common_all = common_ss.apply(config_all, strict: false) common_all = static_library('common', build_by_default: false, sources: common_all.sources() + genh, +include_directories: common_user_inc, implicit_include_directories: false, dependencies: common_all.dependencies(), name_suffix: 'fa') -- 2.33.1
[PULL 15/15] KVM: x86: ignore interrupt_bitmap field of KVM_GET/SET_SREGS
This is unnecessary, because the interrupt would be retrieved and queued anyway by KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS respectively, and it makes the flow more similar to the one for KVM_GET/SET_SREGS2. Signed-off-by: Paolo Bonzini --- target/i386/kvm/kvm.c | 24 +--- 1 file changed, 9 insertions(+), 15 deletions(-) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index d81745620b..2c8feb4a6f 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2607,11 +2607,11 @@ static int kvm_put_sregs(X86CPU *cpu) CPUX86State *env = &cpu->env; struct kvm_sregs sregs; +/* + * The interrupt_bitmap is ignored because KVM_SET_SREGS is + * always followed by KVM_SET_VCPU_EVENTS. + */ memset(sregs.interrupt_bitmap, 0, sizeof(sregs.interrupt_bitmap)); -if (env->interrupt_injected >= 0) { -sregs.interrupt_bitmap[env->interrupt_injected / 64] |= -(uint64_t)1 << (env->interrupt_injected % 64); -} if ((env->eflags & VM_MASK)) { set_v8086_seg(&sregs.cs, &env->segs[R_CS]); @@ -3341,23 +3341,17 @@ static int kvm_get_sregs(X86CPU *cpu) { CPUX86State *env = &cpu->env; struct kvm_sregs sregs; -int bit, i, ret; +int ret; ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_SREGS, &sregs); if (ret < 0) { return ret; } -/* There can only be one pending IRQ set in the bitmap at a time, so try - to find it and save its number instead (-1 for none). */ -env->interrupt_injected = -1; -for (i = 0; i < ARRAY_SIZE(sregs.interrupt_bitmap); i++) { -if (sregs.interrupt_bitmap[i]) { -bit = ctz64(sregs.interrupt_bitmap[i]); -env->interrupt_injected = i * 64 + bit; -break; -} -} +/* + * The interrupt_bitmap is ignored because KVM_GET_SREGS is + * always preceded by KVM_GET_VCPU_EVENTS. + */ get_seg(&env->segs[R_CS], &sregs.cs); get_seg(&env->segs[R_DS], &sregs.ds); -- 2.33.1
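For readers unfamiliar with the code being removed, here is a small sketch of the encode/decode logic it implemented: a single pending interrupt number is stored as one set bit in an array of 64-bit words and recovered by finding the first set bit. The names (DEMO_BITMAP_WORDS, demo_bitmap_set, demo_bitmap_first) are illustrative only; the word count of 4 assumes 256 interrupt vectors, and __builtin_ctzll (a GCC/Clang builtin) stands in for QEMU's ctz64().

```c
#include <stdint.h>
#include <string.h>

/* Assumption: 256 vectors, 64 bits per word. */
#define DEMO_BITMAP_WORDS 4

/* Clear the bitmap, then set the bit for irq if it is in range. */
static void demo_bitmap_set(uint64_t bitmap[DEMO_BITMAP_WORDS], int irq)
{
    memset(bitmap, 0, DEMO_BITMAP_WORDS * sizeof(bitmap[0]));
    if (irq >= 0 && irq < DEMO_BITMAP_WORDS * 64) {
        bitmap[irq / 64] |= (uint64_t)1 << (irq % 64);
    }
}

/* Return the number of the first set bit, or -1 if the bitmap is empty. */
static int demo_bitmap_first(const uint64_t bitmap[DEMO_BITMAP_WORDS])
{
    for (int i = 0; i < DEMO_BITMAP_WORDS; i++) {
        if (bitmap[i]) {
            return i * 64 + __builtin_ctzll(bitmap[i]);
        }
    }
    return -1;
}
```

After the patch, kvm_put_sregs() only keeps the memset() step and kvm_get_sregs() does not decode the bitmap at all, since per the commit message the pending interrupt is carried by KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS.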
[PULL 03/15] user: move common-user includes to a subdirectory of {bsd, linux}-user/
Avoid polluting the compilation of common-user/ with local include files; making an include file available to common-user/ should be a deliberate decision in order to keep a clear interface that can be used by both bsd-user/ and linux-user/. Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- bsd-user/{ => include}/special-errno.h | 0 bsd-user/meson.build| 2 +- linux-user/{ => include}/host/aarch64/host-signal.h | 0 linux-user/{ => include}/host/alpha/host-signal.h | 0 linux-user/{ => include}/host/arm/host-signal.h | 0 linux-user/{ => include}/host/i386/host-signal.h| 0 linux-user/{ => include}/host/mips/host-signal.h| 0 linux-user/{ => include}/host/ppc/host-signal.h | 0 linux-user/{ => include}/host/ppc64/host-signal.h | 0 linux-user/{ => include}/host/riscv/host-signal.h | 0 linux-user/{ => include}/host/s390/host-signal.h| 0 linux-user/{ => include}/host/s390x/host-signal.h | 0 linux-user/{ => include}/host/sparc/host-signal.h | 0 linux-user/{ => include}/host/sparc64/host-signal.h | 0 linux-user/{ => include}/host/x32/host-signal.h | 0 linux-user/{ => include}/host/x86_64/host-signal.h | 0 linux-user/{ => include}/special-errno.h| 0 linux-user/meson.build | 4 ++-- 18 files changed, 3 insertions(+), 3 deletions(-) rename bsd-user/{ => include}/special-errno.h (100%) rename linux-user/{ => include}/host/aarch64/host-signal.h (100%) rename linux-user/{ => include}/host/alpha/host-signal.h (100%) rename linux-user/{ => include}/host/arm/host-signal.h (100%) rename linux-user/{ => include}/host/i386/host-signal.h (100%) rename linux-user/{ => include}/host/mips/host-signal.h (100%) rename linux-user/{ => include}/host/ppc/host-signal.h (100%) rename linux-user/{ => include}/host/ppc64/host-signal.h (100%) rename linux-user/{ => include}/host/riscv/host-signal.h (100%) rename linux-user/{ => include}/host/s390/host-signal.h (100%) rename linux-user/{ => include}/host/s390x/host-signal.h (100%) rename linux-user/{ => include}/host/sparc/host-signal.h 
(100%) rename linux-user/{ => include}/host/sparc64/host-signal.h (100%) rename linux-user/{ => include}/host/x32/host-signal.h (100%) rename linux-user/{ => include}/host/x86_64/host-signal.h (100%) rename linux-user/{ => include}/special-errno.h (100%) diff --git a/bsd-user/special-errno.h b/bsd-user/include/special-errno.h similarity index 100% rename from bsd-user/special-errno.h rename to bsd-user/include/special-errno.h diff --git a/bsd-user/meson.build b/bsd-user/meson.build index 9fcb80c3fa..8380fa44c2 100644 --- a/bsd-user/meson.build +++ b/bsd-user/meson.build @@ -4,7 +4,7 @@ endif bsd_user_ss = ss.source_set() -common_user_inc += include_directories('.') +common_user_inc += include_directories('include') bsd_user_ss.add(files( 'bsdload.c', diff --git a/linux-user/host/aarch64/host-signal.h b/linux-user/include/host/aarch64/host-signal.h similarity index 100% rename from linux-user/host/aarch64/host-signal.h rename to linux-user/include/host/aarch64/host-signal.h diff --git a/linux-user/host/alpha/host-signal.h b/linux-user/include/host/alpha/host-signal.h similarity index 100% rename from linux-user/host/alpha/host-signal.h rename to linux-user/include/host/alpha/host-signal.h diff --git a/linux-user/host/arm/host-signal.h b/linux-user/include/host/arm/host-signal.h similarity index 100% rename from linux-user/host/arm/host-signal.h rename to linux-user/include/host/arm/host-signal.h diff --git a/linux-user/host/i386/host-signal.h b/linux-user/include/host/i386/host-signal.h similarity index 100% rename from linux-user/host/i386/host-signal.h rename to linux-user/include/host/i386/host-signal.h diff --git a/linux-user/host/mips/host-signal.h b/linux-user/include/host/mips/host-signal.h similarity index 100% rename from linux-user/host/mips/host-signal.h rename to linux-user/include/host/mips/host-signal.h diff --git a/linux-user/host/ppc/host-signal.h b/linux-user/include/host/ppc/host-signal.h similarity index 100% rename from 
linux-user/host/ppc/host-signal.h rename to linux-user/include/host/ppc/host-signal.h diff --git a/linux-user/host/ppc64/host-signal.h b/linux-user/include/host/ppc64/host-signal.h similarity index 100% rename from linux-user/host/ppc64/host-signal.h rename to linux-user/include/host/ppc64/host-signal.h diff --git a/linux-user/host/riscv/host-signal.h b/linux-user/include/host/riscv/host-signal.h similarity index 100% rename from linux-user/host/riscv/host-signal.h rename to linux-user/include/host/riscv/host-signal.h diff --git a/linux-user/host/s390/host-signal.h b/linux-user/include/host/s390/host-signal.h similarity index 100% rename from linux-user/host/s390/host-signal.h rename to linux-user/include/host/s390/host-signal.h diff --git a/linux-user/host/s390x/host-signal.h b/linux-user/include/host/s390x/host-signal.h similarity inde
[PULL 08/15] configure, makefile: remove traces of really old files
These files have been removed for more than a year in the best case, or for more than ten years for some really old TCG files. Remove any traces of them. Acked-by: Richard Henderson Signed-off-by: Paolo Bonzini --- Makefile | 11 --- configure | 9 - 2 files changed, 4 insertions(+), 16 deletions(-) diff --git a/Makefile b/Makefile index 74c5b46d38..06ad8a61e1 100644 --- a/Makefile +++ b/Makefile @@ -205,14 +205,11 @@ recurse-clean: $(addsuffix /clean, $(ROM_DIRS)) clean: recurse-clean -$(quiet-@)test -f build.ninja && $(NINJA) $(NINJAFLAGS) -t clean || : -$(quiet-@)test -f build.ninja && $(NINJA) $(NINJAFLAGS) clean-ctlist || : -# avoid old build problems by removing potentially incorrect old files - rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h gen-op-arm.h find . \( -name '*.so' -o -name '*.dll' -o -name '*.[oda]' \) -type f \ ! -path ./roms/edk2/ArmPkg/Library/GccLto/liblto-aarch64.a \ ! -path ./roms/edk2/ArmPkg/Library/GccLto/liblto-arm.a \ -exec rm {} + - rm -f TAGS cscope.* *.pod *~ */*~ - rm -f fsdev/*.pod scsi/*.pod + rm -f TAGS cscope.* *~ */*~ VERSION = $(shell cat $(SRC_PATH)/VERSION) @@ -223,10 +220,10 @@ qemu-%.tar.bz2: distclean: clean -$(quiet-@)test -f build.ninja && $(NINJA) $(NINJAFLAGS) -t clean -g || : - rm -f config-host.mak config-host.h* config-poison.h + rm -f config-host.mak config-poison.h rm -f tests/tcg/config-*.mak - rm -f config-all-disas.mak config.status - rm -f roms/seabios/config.mak roms/vgabios/config.mak + rm -f config.status + rm -f roms/seabios/config.mak rm -f qemu-plugins-ld.symbols qemu-plugins-ld64.symbols rm -f *-config-target.h *-config-devices.mak *-config-devices.h rm -rf meson-private meson-logs meson-info compile_commands.json diff --git a/configure b/configure index 6516ec243c..c8b32e7277 100755 --- a/configure +++ b/configure @@ -3665,9 +3665,6 @@ fi # so the build tree will be missing the link back to the new file, and # tests might fail.
Prefer to keep the relevant files in their own # directory and symlink the directory instead. -# UNLINK is used to remove symlinks from older development versions -# that might get into the way when doing "git update" without doing -# a "make distclean" in between. LINKS="Makefile" LINKS="$LINKS tests/tcg/Makefile.target" LINKS="$LINKS pc-bios/optionrom/Makefile" @@ -3679,7 +3676,6 @@ LINKS="$LINKS tests/avocado tests/data" LINKS="$LINKS tests/qemu-iotests/check" LINKS="$LINKS python" LINKS="$LINKS contrib/plugins/Makefile " -UNLINK="pc-bios/keymaps" for bios_file in \ $source_path/pc-bios/*.bin \ $source_path/pc-bios/*.elf \ @@ -3701,11 +3697,6 @@ for f in $LINKS ; do symlink "$source_path/$f" "$f" fi done -for f in $UNLINK ; do -if [ -L "$f" ]; then -rm -f "$f" -fi -done (for i in $cross_cc_vars; do export $i -- 2.33.1
[PULL 05/15] block/file-posix: Simplify the XFS_IOC_DIOINFO handling
From: Thomas Huth The handling for the XFS_IOC_DIOINFO ioctl is currently quite excessive: This is not a "real" feature like the other features that we provide with the "--enable-xxx" and "--disable-xxx" switches for the configure script, since this does not influence lots of code (it's only about one call to xfsctl() in file-posix.c), so people don't gain much with the ability to disable this with "--disable-xfsctl". It's also unfortunate that the ioctl will be disabled on Linux in case the user did not install the right xfsprogs-devel package before running configure. Thus let's simplify this by providing the ioctl definition on our own, so we can completely get rid of the header dependency and thus the related code in the configure script. Suggested-by: Paolo Bonzini Signed-off-by: Thomas Huth Message-Id: <20211215125824.250091-1-th...@redhat.com> Signed-off-by: Paolo Bonzini --- block/file-posix.c | 37 - configure | 31 --- meson.build| 1 - 3 files changed, 16 insertions(+), 53 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index b283093e5b..1f1756e192 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -106,10 +106,6 @@ #include #endif -#ifdef CONFIG_XFS -#include -#endif - /* OS X does not have O_DSYNC */ #ifndef O_DSYNC #ifdef O_SYNC @@ -156,9 +152,6 @@ typedef struct BDRVRawState { int perm_change_flags; BDRVReopenState *reopen_state; -#ifdef CONFIG_XFS -bool is_xfs:1; -#endif bool has_discard:1; bool has_write_zeroes:1; bool discard_zeroes:1; @@ -409,14 +402,22 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp) if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) { bs->bl.request_alignment = 0; } -#ifdef CONFIG_XFS -if (s->is_xfs) { -struct dioattr da; -if (xfsctl(NULL, fd, XFS_IOC_DIOINFO, &da) >= 0) { -bs->bl.request_alignment = da.d_miniosz; -/* The kernel returns wrong information for d_mem */ -/* s->buf_align = da.d_mem; */ -} + +#ifdef __linux__ +/* + * The XFS ioctl definitions are 
shipped in extra packages that might + * not always be available. Since we just need the XFS_IOC_DIOINFO ioctl + * here, we simply use our own definition instead: + */ +struct xfs_dioattr { +uint32_t d_mem; +uint32_t d_miniosz; +uint32_t d_maxiosz; +} da; +if (ioctl(fd, _IOR('X', 30, struct xfs_dioattr), &da) >= 0) { +bs->bl.request_alignment = da.d_miniosz; +/* The kernel returns wrong information for d_mem */ +/* s->buf_align = da.d_mem; */ } #endif @@ -798,12 +799,6 @@ static int raw_open_common(BlockDriverState *bs, QDict *options, #endif s->needs_alignment = raw_needs_alignment(bs); -#ifdef CONFIG_XFS -if (platform_test_xfs_fd(s->fd)) { -s->is_xfs = true; -} -#endif - bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK; if (S_ISREG(st.st_mode)) { /* When extending regular files, we get zeros from the OS */ diff --git a/configure b/configure index 8ccfe51673..b66ab31834 100755 --- a/configure +++ b/configure @@ -291,7 +291,6 @@ EXTRA_CXXFLAGS="" EXTRA_LDFLAGS="" xen_ctrl_version="$default_feature" -xfs="$default_feature" membarrier="$default_feature" vhost_kernel="$default_feature" vhost_net="$default_feature" @@ -1019,10 +1018,6 @@ for opt do ;; --enable-opengl) opengl="yes" ;; - --disable-xfsctl) xfs="no" - ;; - --enable-xfsctl) xfs="yes" - ;; --disable-zlib-test) ;; --enable-guest-agent) guest_agent="yes" @@ -1429,7 +1424,6 @@ cat << EOF avx512f AVX512F optimization support replication replication support opengl opengl support - xfsctl xfsctl support qom-cast-debug cast debugging support tools build qemu-io, qemu-nbd and qemu-img tools bochs bochs image format support @@ -2321,28 +2315,6 @@ EOF fi fi -## -# xfsctl() probe, used for file-posix.c -if test "$xfs" != "no" ; then - cat > $TMPC << EOF -#include /* NULL */ -#include -int main(void) -{ -xfsctl(NULL, 0, 0, NULL); -return 0; -} -EOF - if compile_prog "" "" ; then -xfs="yes" - else -if test "$xfs" = "yes" ; then - feature_not_found "xfs" "Install xfsprogs/xfslibs devel" -fi -xfs=no - 
fi -fi - ## # plugin linker support probe @@ -3454,9 +3426,6 @@ echo "CONFIG_BDRV_RO_WHITELIST=$block_drv_ro_whitelist" >> $config_host_mak if test "$block_drv_whitelist_tools" = "yes" ; then echo "CONFIG_BDRV_WHITELIST_TOOLS=y" >> $config_host_mak fi -if test "$xfs" = "yes" ; then - echo "CONFIG_XFS=y" >> $config_host_mak -fi qemu_version=$(head $source_path/VERSION) echo "PKGVERSION=$pkgversion" >>$config_host_mak
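For readers wondering what request number the open-coded `_IOR('X', 30, struct xfs_dioattr)` above expands to, here is a minimal sketch of the arithmetic in Python. It assumes the generic Linux ioctl encoding (8 bits for the number, 8 for the type, 14 for the size, 2 for the direction, with "read" encoded as 2); a few architectures use a different layout, so the constant below is not universal. The helper names `ior` and `query_dio_info` are made up for illustration and are not part of QEMU.

```python
import struct

# Generic Linux _IOC layout (assumption: the asm-generic encoding applies).
_IOC_NRBITS, _IOC_TYPEBITS, _IOC_SIZEBITS = 8, 8, 14
_IOC_NRSHIFT = 0
_IOC_TYPESHIFT = _IOC_NRSHIFT + _IOC_NRBITS
_IOC_SIZESHIFT = _IOC_TYPESHIFT + _IOC_TYPEBITS
_IOC_DIRSHIFT = _IOC_SIZESHIFT + _IOC_SIZEBITS
_IOC_READ = 2

# Mirrors struct xfs_dioattr from the patch: d_mem, d_miniosz, d_maxiosz.
XFS_DIOATTR = struct.Struct("=III")


def ior(type_char, nr, size):
    """Hypothetical helper: Python equivalent of the C _IOR() macro."""
    return ((_IOC_READ << _IOC_DIRSHIFT) | (size << _IOC_SIZESHIFT) |
            (ord(type_char) << _IOC_TYPESHIFT) | (nr << _IOC_NRSHIFT))


XFS_IOC_DIOINFO = ior("X", 30, XFS_DIOATTR.size)


def query_dio_info(fd):
    """Hypothetical helper: return (d_mem, d_miniosz, d_maxiosz) for fd.

    Raises OSError if the file system does not implement the ioctl.
    """
    import fcntl  # Unix-only, so imported lazily
    buf = bytearray(XFS_DIOATTR.size)
    fcntl.ioctl(fd, XFS_IOC_DIOINFO, buf)
    return XFS_DIOATTR.unpack(buf)
```

As in the patch, only `d_miniosz` would be of interest to a caller; the diff deliberately leaves `d_mem` unused.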
[PULL 06/15] configure: simplify creation of plugin symbol list
--dynamic-list is present on all supported ELF (not Windows or Darwin) platforms, since it dates back to 2006; -exported_symbols_list is likewise present on all supported versions of macOS. Do not bother doing a functional test in configure. Remove the file creation from configure as well: for Darwin, move the creation of the Darwin-formatted symbols to meson; for ELF, use the file in the source path directly and switch from -Wl, to -Xlinker to not break weird paths that include a comma. Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- configure | 80 - plugins/meson.build | 11 +-- 2 files changed, 8 insertions(+), 83 deletions(-) diff --git a/configure b/configure index b66ab31834..0306f0c8bc 100755 --- a/configure +++ b/configure @@ -78,7 +78,6 @@ TMPC="${TMPDIR1}/${TMPB}.c" TMPO="${TMPDIR1}/${TMPB}.o" TMPCXX="${TMPDIR1}/${TMPB}.cxx" TMPE="${TMPDIR1}/${TMPB}.exe" -TMPTXT="${TMPDIR1}/${TMPB}.txt" rm -f config.log @@ -2315,69 +2314,6 @@ EOF fi fi -## -# plugin linker support probe - -if test "$plugins" != "no"; then - -# -# See if --dynamic-list is supported by the linker - -ld_dynamic_list="no" -cat > $TMPTXT < $TMPC < -void foo(void); - -void foo(void) -{ - printf("foo\n"); -} - -int main(void) -{ - foo(); - return 0; -} -EOF - -if compile_prog "" "-Wl,--dynamic-list=$TMPTXT" ; then -ld_dynamic_list="yes" -fi - -# -# See if -exported_symbols_list is supported by the linker - -ld_exported_symbols_list="no" -cat > $TMPTXT <> $config_host_mak -# Copy the export object list to the build dir -if test "$ld_dynamic_list" = "yes" ; then - echo "CONFIG_HAS_LD_DYNAMIC_LIST=yes" >> $config_host_mak - ld_symbols=qemu-plugins-ld.symbols - cp "$source_path/plugins/qemu-plugins.symbols" $ld_symbols -elif test "$ld_exported_symbols_list" = "yes" ; then - echo "CONFIG_HAS_LD_EXPORTED_SYMBOLS_LIST=yes" >> $config_host_mak - ld64_symbols=qemu-plugins-ld64.symbols - echo "# Automatically generated by configure - do not modify" > $ld64_symbols - grep 'qemu_'
"$source_path/plugins/qemu-plugins.symbols" | sed 's/;//g' | \ - sed -E 's/^[[:space:]]*(.*)/_\1/' >> $ld64_symbols -else - error_exit \ - "If \$plugins=yes, either \$ld_dynamic_list or " \ - "\$ld_exported_symbols_list should have been set to 'yes'." -fi fi if test -n "$gdb_bin"; then diff --git a/plugins/meson.build b/plugins/meson.build index b3de57853b..d0a2ee94cf 100644 --- a/plugins/meson.build +++ b/plugins/meson.build @@ -1,10 +1,15 @@ plugin_ldflags = [] # Modules need more symbols than just those in plugins/qemu-plugins.symbols if not enable_modules - if 'CONFIG_HAS_LD_DYNAMIC_LIST' in config_host -plugin_ldflags = ['-Wl,--dynamic-list=qemu-plugins-ld.symbols'] - elif 'CONFIG_HAS_LD_EXPORTED_SYMBOLS_LIST' in config_host + if targetos == 'darwin' +qemu_plugins_symbols_list = configure_file( + input: files('qemu-plugins.symbols'), + output: 'qemu-plugins-ld64.symbols', + capture: true, + command: ['sed', '-ne', 's/^[[:space:]]*\\(qemu_.*\\);/_\\1/p', '@INPUT@']) plugin_ldflags = ['-Wl,-exported_symbols_list,qemu-plugins-ld64.symbols'] + else +plugin_ldflags = ['-Xlinker', '--dynamic-list=' + (meson.project_source_root() / 'plugins/qemu-plugins.symbols')] endif endif -- 2.33.1
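To make the new meson `sed` command easier to follow, the sketch below reproduces its effect in Python: lines of the ELF dynamic list that contain a `qemu_...;` entry are printed with a leading underscore and without the semicolon, and every other line (braces, blanks) is dropped. This is only an illustration of the transformation, not code from the tree; the function name `to_darwin_symbols` and the sample list are assumptions made for the example.

```python
import re

# Same pattern as the sed expression: optional leading whitespace,
# then a symbol starting with "qemu_", then a semicolon.
_SYMBOL = re.compile(r"^\s*(qemu_.*);")


def to_darwin_symbols(text):
    """Hypothetical helper: ELF --dynamic-list text -> ld64 symbol list."""
    out = []
    for line in text.splitlines():
        new_line, count = _SYMBOL.subn(r"_\1", line, count=1)
        if count:  # like "sed -n ... p": print only lines that matched
            out.append(new_line)
    return "".join(line + "\n" for line in out)


# Illustrative input in the shape of plugins/qemu-plugins.symbols.
SAMPLE = """{
  qemu_plugin_uninstall;
  qemu_plugin_reset;
};
"""

DARWIN_LIST = to_darwin_symbols(SAMPLE)
```

With the sample above, `DARWIN_LIST` holds one `_qemu_plugin_...` name per line, which is the format `-exported_symbols_list` expects.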
[PULL 10/15] configure: move non-command-line variables away from command-line parsing section
This makes it easier to identify candidates for moving to Meson. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Paolo Bonzini --- configure | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/configure b/configure index 302d58102b..8eb8e4c2cc 100755 --- a/configure +++ b/configure @@ -307,16 +307,12 @@ debug="no" sanitizers="no" tsan="no" fortify_source="$default_feature" -mingw32="no" gcov="no" EXESUF="" modules="no" module_upgrades="no" prefix="/usr/local" qemu_suffix="qemu" -bsd="no" -linux="no" -solaris="no" profiler="no" softmmu="yes" linux_user="" @@ -330,8 +326,6 @@ opengl="$default_feature" cpuid_h="no" avx2_opt="$default_feature" guest_agent="$default_feature" -guest_agent_with_vss="no" -guest_agent_ntddscsi="no" vss_win32_sdk="$default_feature" win_sdk="no" want_tools="$default_feature" @@ -526,6 +520,10 @@ fi # OS specific +mingw32="no" +bsd="no" +linux="no" +solaris="no" case $targetos in windows) mingw32="yes" @@ -2546,6 +2544,7 @@ fi ## # check if we have VSS SDK headers for win +guest_agent_with_vss="no" if test "$mingw32" = "yes" && test "$guest_agent" != "no" && \ test "$vss_win32_sdk" != "no" ; then case "$vss_win32_sdk" in @@ -2576,7 +2575,6 @@ EOF echo "ERROR: The headers are extracted in the directory \`inc'." feature_not_found "VSS support" fi -guest_agent_with_vss="no" fi fi @@ -2603,6 +2601,7 @@ fi ## # check if mingw environment provides a recent ntddscsi.h +guest_agent_ntddscsi="no" if test "$mingw32" = "yes" && test "$guest_agent" != "no"; then cat > $TMPC << EOF #include -- 2.33.1
[PULL 09/15] configure: parse --enable/--disable-strip automatically, flip default
Always include the STRIP variable in config-host.mak (it's only used by the s390-ccw firmware build, and it adds a default if configure omitted it), and use meson-buildoptions.sh to turn --enable/--disable-strip into -Dstrip. The default is now not to strip the binaries like for almost every other package that has a configure script. Signed-off-by: Paolo Bonzini --- configure | 10 +- pc-bios/s390-ccw/Makefile | 2 -- scripts/meson-buildoptions.py | 21 ++--- scripts/meson-buildoptions.sh | 3 +++ 4 files changed, 18 insertions(+), 18 deletions(-) diff --git a/configure b/configure index c8b32e7277..302d58102b 100755 --- a/configure +++ b/configure @@ -307,7 +307,6 @@ debug="no" sanitizers="no" tsan="no" fortify_source="$default_feature" -strip_opt="yes" mingw32="no" gcov="no" EXESUF="" @@ -890,7 +889,6 @@ for opt do debug_tcg="yes" debug_mutex="yes" debug="yes" - strip_opt="no" fortify_source="no" ;; --enable-sanitizers) sanitizers="yes" @@ -901,8 +899,6 @@ for opt do ;; --disable-tsan) tsan="no" ;; - --disable-strip) strip_opt="no" - ;; --disable-slirp) slirp="disabled" ;; --enable-slirp) slirp="enabled" @@ -1365,7 +1361,6 @@ Advanced options (experts only): --enable-debug enable common debug build options --enable-sanitizers enable default sanitizers --enable-tsanenable thread sanitizer - --disable-strip disable stripping binaries --disable-werror disable compilation abort on warning --disable-stack-protector disable compiler-provided stack protection --audio-drv-list=LISTset audio drivers to try if -audiodev is not used @@ -3312,9 +3307,6 @@ echo "GIT_SUBMODULES_ACTION=$git_submodules_action" >> $config_host_mak if test "$debug_tcg" = "yes" ; then echo "CONFIG_DEBUG_TCG=y" >> $config_host_mak fi -if test "$strip_opt" = "yes" ; then - echo "STRIP=${strip}" >> $config_host_mak -fi if test "$mingw32" = "yes" ; then echo "CONFIG_WIN32=y" >> $config_host_mak if test "$guest_agent_with_vss" = "yes" ; then @@ -3591,6 +3583,7 @@ echo "GLIB_CFLAGS=$glib_cflags" >> 
$config_host_mak echo "GLIB_LIBS=$glib_libs" >> $config_host_mak echo "QEMU_LDFLAGS=$QEMU_LDFLAGS" >> $config_host_mak echo "LD_I386_EMULATION=$ld_i386_emulation" >> $config_host_mak +echo "STRIP=$strip" >> $config_host_mak echo "EXESUF=$EXESUF" >> $config_host_mak echo "LIBS_QGA=$libs_qga" >> $config_host_mak @@ -3805,7 +3798,6 @@ if test "$skip_meson" = no; then -Doptimization=$(if test "$debug" = yes; then echo 0; else echo 2; fi) \ -Ddebug=$(if test "$debug_info" = yes; then echo true; else echo false; fi) \ -Dwerror=$(if test "$werror" = yes; then echo true; else echo false; fi) \ --Dstrip=$(if test "$strip_opt" = yes; then echo true; else echo false; fi) \ -Db_pie=$(if test "$pie" = yes; then echo true; else echo false; fi) \ -Db_coverage=$(if test "$gcov" = yes; then echo true; else echo false; fi) \ -Db_lto=$lto -Dcfi=$cfi -Dtcg=$tcg -Dxen=$xen \ diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile index cee9d2c63b..0eb68efc7b 100644 --- a/pc-bios/s390-ccw/Makefile +++ b/pc-bios/s390-ccw/Makefile @@ -44,8 +44,6 @@ build-all: s390-ccw.img s390-netboot.img s390-ccw.elf: $(OBJECTS) $(call quiet-command,$(CC) $(LDFLAGS) -o $@ $(OBJECTS),"BUILD","$(TARGET_DIR)$@") -STRIP ?= strip - s390-ccw.img: s390-ccw.elf $(call quiet-command,$(STRIP) --strip-unneeded $< -o $@,"STRIP","$(TARGET_DIR)$@") diff --git a/scripts/meson-buildoptions.py b/scripts/meson-buildoptions.py index 96969d89ee..98ae944148 100755 --- a/scripts/meson-buildoptions.py +++ b/scripts/meson-buildoptions.py @@ -36,6 +36,10 @@ "trace_file", } +BUILTIN_OPTIONS = { +"strip", +} + LINE_WIDTH = 76 @@ -90,14 +94,17 @@ def allow_arg(opt): return not (set(opt["choices"]) <= {"auto", "disabled", "enabled"}) +def filter_options(json): +if ":" in json["name"]: +return False +if json["section"] == "user": +return json["name"] not in SKIP_OPTIONS +else: +return json["name"] in BUILTIN_OPTIONS + + def load_options(json): -json = [ -x -for x in json -if x["section"] == "user" -and ":" not in x["name"] 
-and x["name"] not in SKIP_OPTIONS -] +json = [x for x in json if filter_options(x)] return sorted(json, key=lambda x: x["name"]) diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh index ae8f18edc2..46360e541d 100644 --- a/scripts/meson-buildoptions.sh +++ b/scripts/meson-buildoptions.sh @@ -13,6 +13,7 @@ meson_options_help() { printf "%s\n" ' jemalloc/system/tcmalloc)' printf "%s\n" ' --enable-slirp[=CHOICE] Whether and how to find the slirp library' printf "%s\n" ' (choices: auto/disabl
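The effect of the new `filter_options()` helper can be tried in isolation. The sketch below copies its logic from the diff and runs it on a small hand-written option list; the sample entries, their `section` values and the reduced `SKIP_OPTIONS` set are assumptions for illustration only, since the real input comes from Meson's build option introspection.

```python
# Reduced sets for the example; the real script lists more names.
SKIP_OPTIONS = {"trace_file"}
BUILTIN_OPTIONS = {"strip"}


def filter_options(opt):
    # Options of subprojects ("sub:name") are never exposed.
    if ":" in opt["name"]:
        return False
    # User options are exposed unless explicitly skipped...
    if opt["section"] == "user":
        return opt["name"] not in SKIP_OPTIONS
    # ...while built-in Meson options must be explicitly allowed.
    return opt["name"] in BUILTIN_OPTIONS


def load_options(options):
    return sorted((o for o in options if filter_options(o)),
                  key=lambda o: o["name"])


# Hypothetical sample data in the shape the script expects.
SAMPLE = [
    {"name": "strip", "section": "core"},
    {"name": "werror", "section": "core"},
    {"name": "slirp", "section": "user"},
    {"name": "trace_file", "section": "user"},
    {"name": "sub:opt", "section": "user"},
]

NAMES = [o["name"] for o in load_options(SAMPLE)]
```

Here `NAMES` ends up as `['slirp', 'strip']`: `strip` passes because it is allow-listed in `BUILTIN_OPTIONS`, while `werror`, another built-in option, is still filtered out.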
[PATCH v2 05/23] dma: Let dma_memory_read/write() take MemTxAttrs argument
Let devices specify transaction attributes when calling dma_memory_read() or dma_memory_write(). Patch created mechanically using spatch with this script: @@ expression E1, E2, E3, E4; @@ ( - dma_memory_read(E1, E2, E3, E4) + dma_memory_read(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED) | - dma_memory_write(E1, E2, E3, E4) + dma_memory_write(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED) ) Reviewed-by: Richard Henderson Reviewed-by: Li Qiang Reviewed-by: Edgar E. Iglesias Signed-off-by: Philippe Mathieu-Daudé Acked-by: Stefan Hajnoczi Message-Id: <20210702092439.989969-6-phi...@redhat.com> --- v4: Merged conflict in hw/dma/pl330.c --- include/hw/ppc/spapr_vio.h| 6 -- include/sysemu/dma.h | 20 hw/arm/musicpal.c | 13 +++-- hw/arm/smmu-common.c | 3 ++- hw/arm/smmuv3.c | 14 +- hw/core/generic-loader.c | 3 ++- hw/dma/pl330.c| 12 hw/dma/sparc32_dma.c | 16 ++-- hw/dma/xlnx-zynq-devcfg.c | 6 -- hw/dma/xlnx_dpdma.c | 10 ++ hw/i386/amd_iommu.c | 16 +--- hw/i386/intel_iommu.c | 28 +--- hw/ide/macio.c| 2 +- hw/intc/xive.c| 7 --- hw/misc/bcm2835_property.c| 3 ++- hw/misc/macio/mac_dbdma.c | 10 ++ hw/net/allwinner-sun8i-emac.c | 18 -- hw/net/ftgmac100.c| 25 - hw/net/imx_fec.c | 32 hw/net/npcm7xx_emc.c | 20 hw/nvram/fw_cfg.c | 9 ++--- hw/pci-host/pnv_phb3.c| 5 +++-- hw/pci-host/pnv_phb3_msi.c| 9 ++--- hw/pci-host/pnv_phb4.c| 5 +++-- hw/sd/allwinner-sdhost.c | 14 -- hw/sd/sdhci.c | 35 ++- hw/usb/hcd-dwc2.c | 8 hw/usb/hcd-ehci.c | 6 -- hw/usb/hcd-ohci.c | 18 +++--- hw/usb/hcd-xhci.c | 18 +++--- 30 files changed, 241 insertions(+), 150 deletions(-) diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h index c90e74a67dd..5d2ea8e6656 100644 --- a/include/hw/ppc/spapr_vio.h +++ b/include/hw/ppc/spapr_vio.h @@ -97,14 +97,16 @@ static inline bool spapr_vio_dma_valid(SpaprVioDevice *dev, uint64_t taddr, static inline int spapr_vio_dma_read(SpaprVioDevice *dev, uint64_t taddr, void *buf, uint32_t size) { -return (dma_memory_read(&dev->as, taddr, buf, size) != 0) ? 
+return (dma_memory_read(&dev->as, taddr, +buf, size, MEMTXATTRS_UNSPECIFIED) != 0) ? H_DEST_PARM : H_SUCCESS; } static inline int spapr_vio_dma_write(SpaprVioDevice *dev, uint64_t taddr, const void *buf, uint32_t size) { -return (dma_memory_write(&dev->as, taddr, buf, size) != 0) ? +return (dma_memory_write(&dev->as, taddr, + buf, size, MEMTXATTRS_UNSPECIFIED) != 0) ? H_DEST_PARM : H_SUCCESS; } diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index e8ad42226f6..522682bf386 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -143,12 +143,14 @@ static inline MemTxResult dma_memory_rw(AddressSpace *as, dma_addr_t addr, * @addr: address within that address space * @buf: buffer with the data transferred * @len: length of the data transferred + * @attrs: memory transaction attributes */ static inline MemTxResult dma_memory_read(AddressSpace *as, dma_addr_t addr, - void *buf, dma_addr_t len) + void *buf, dma_addr_t len, + MemTxAttrs attrs) { return dma_memory_rw(as, addr, buf, len, - DMA_DIRECTION_TO_DEVICE, MEMTXATTRS_UNSPECIFIED); + DMA_DIRECTION_TO_DEVICE, attrs); } /** @@ -162,12 +164,14 @@ static inline MemTxResult dma_memory_read(AddressSpace *as, dma_addr_t addr, * @addr: address within that address space * @buf: buffer with the data transferred * @len: the number of bytes to write + * @attrs: memory transaction attributes */ static inline MemTxResult dma_memory_write(AddressSpace *as, dma_addr_t addr, - const void *buf, dma_addr_t len) + const void *buf, dma_addr_t len, + MemTxAttrs attrs) { return dma_memory_rw(as, addr, (void *)buf, len, - DMA_DIRECTION_FROM_DEVICE, MEMTXATTRS_UNSPECIFIED); + DMA_DIRECTION_FROM_DEVICE, attrs); } /** @@ -239,7 +243,7 @@ static inline void dma_memory_unmap(AddressSpace *as,
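For anyone without Coccinelle at hand, the mechanical change described by the semantic patch can be approximated with a few lines of Python. This is a rough sketch, not a replacement for spatch: it matches call text by name and balanced parentheses only, so it would also rewrite prototypes and does not understand comments or string literals. The function name `add_attrs` is invented for this example.

```python
import re

# Matches dma_memory_read( and dma_memory_write(, but not the
# *_relaxed variants or names that merely end in these identifiers.
_CALL = re.compile(r"\bdma_memory_(?:read|write)\(")


def add_attrs(source, attrs="MEMTXATTRS_UNSPECIFIED"):
    """Hypothetical helper: append `attrs` as last argument of each call."""
    out = []
    pos = 0
    while True:
        match = _CALL.search(source, pos)
        if match is None:
            out.append(source[pos:])
            break
        # Walk forward to the parenthesis that closes this call.
        depth = 1
        i = match.end()
        while i < len(source) and depth:
            if source[i] == "(":
                depth += 1
            elif source[i] == ")":
                depth -= 1
            i += 1
        if depth:  # unbalanced input: leave the remainder untouched
            out.append(source[pos:])
            break
        close = i - 1
        out.append(source[pos:close])
        out.append(", " + attrs)
        pos = close
    return "".join(out)


BEFORE = "return (dma_memory_read(&dev->as, taddr, buf, size) != 0) ?"
AFTER = add_attrs(BEFORE)
```

`AFTER` matches the shape of the `spapr_vio_dma_read()` hunk above, apart from the line wrapping that was applied by hand in the patch.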
[PULL 11/15] meson: build contrib/ executables after generated headers
This will be needed as soon as config-poison.h moves from configure to a meson custom_target (which is built at "ninja" time). Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Paolo Bonzini --- contrib/elf2dmp/meson.build| 2 +- contrib/ivshmem-client/meson.build | 2 +- contrib/ivshmem-server/meson.build | 2 +- contrib/rdmacm-mux/meson.build | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/contrib/elf2dmp/meson.build b/contrib/elf2dmp/meson.build index 4d86cb390a..6707d43c4f 100644 --- a/contrib/elf2dmp/meson.build +++ b/contrib/elf2dmp/meson.build @@ -1,5 +1,5 @@ if curl.found() - executable('elf2dmp', files('main.c', 'addrspace.c', 'download.c', 'pdb.c', 'qemu_elf.c'), + executable('elf2dmp', files('main.c', 'addrspace.c', 'download.c', 'pdb.c', 'qemu_elf.c'), genh, dependencies: [glib, curl], install: true) endif diff --git a/contrib/ivshmem-client/meson.build b/contrib/ivshmem-client/meson.build index 1b171efb4f..ce8dcca84d 100644 --- a/contrib/ivshmem-client/meson.build +++ b/contrib/ivshmem-client/meson.build @@ -1,4 +1,4 @@ -executable('ivshmem-client', files('ivshmem-client.c', 'main.c'), +executable('ivshmem-client', files('ivshmem-client.c', 'main.c'), genh, dependencies: glib, build_by_default: targetos == 'linux', install: false) diff --git a/contrib/ivshmem-server/meson.build b/contrib/ivshmem-server/meson.build index 3a53942201..c6c3c82e89 100644 --- a/contrib/ivshmem-server/meson.build +++ b/contrib/ivshmem-server/meson.build @@ -1,4 +1,4 @@ -executable('ivshmem-server', files('ivshmem-server.c', 'main.c'), +executable('ivshmem-server', files('ivshmem-server.c', 'main.c'), genh, dependencies: [qemuutil, rt], build_by_default: targetos == 'linux', install: false) diff --git a/contrib/rdmacm-mux/meson.build b/contrib/rdmacm-mux/meson.build index 6cc5016747..7674f54cc5 100644 --- a/contrib/rdmacm-mux/meson.build +++ b/contrib/rdmacm-mux/meson.build @@ -2,7 +2,7 @@ if 'CONFIG_PVRDMA' in config_host # if not found, CONFIG_PVRDMA 
should not be set # FIXME: broken on big endian architectures libumad = cc.find_library('ibumad', required: true) - executable('rdmacm-mux', files('main.c'), + executable('rdmacm-mux', files('main.c'), genh, dependencies: [glib, libumad], build_by_default: false, install: false) -- 2.33.1
[PULL 12/15] configure, meson: move config-poison.h to meson
This ensures that the file is regenerated properly whenever config-target.h or config-devices.h files change. Signed-off-by: Paolo Bonzini --- Makefile | 2 +- configure | 11 --- meson.build | 12 scripts/make-config-poison.sh | 16 4 files changed, 29 insertions(+), 12 deletions(-) create mode 100755 scripts/make-config-poison.sh diff --git a/Makefile b/Makefile index 06ad8a61e1..2f80f56a4a 100644 --- a/Makefile +++ b/Makefile @@ -220,7 +220,7 @@ qemu-%.tar.bz2: distclean: clean -$(quiet-@)test -f build.ninja && $(NINJA) $(NINJAFLAGS) -t clean -g || : - rm -f config-host.mak config-poison.h + rm -f config-host.mak rm -f tests/tcg/config-*.mak rm -f config.status rm -f roms/seabios/config.mak diff --git a/configure b/configure index 8eb8e4c2cc..d2f12bc2d6 100755 --- a/configure +++ b/configure @@ -3827,17 +3827,6 @@ if test -n "${deprecated_features}"; then echo " features: ${deprecated_features}" fi -# Create list of config switches that should be poisoned in common code... -# but filter out CONFIG_TCG and CONFIG_USER_ONLY which are special. -target_configs_h=$(ls *-config-devices.h *-config-target.h 2>/dev/null) -if test -n "$target_configs_h" ; then -sed -n -e '/CONFIG_TCG/d' -e '/CONFIG_USER_ONLY/d' \ --e '/^#define / { s///; s/ .*//; s/^/#pragma GCC poison /p; }' \ -$target_configs_h | sort -u > config-poison.h -else -:> config-poison.h -fi - # Save the configure command line for later reuse. cat
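As a reading aid for the `sed ... | sort -u` pipeline removed from configure above, here is a sketch of the same transformation in Python. It is based only on the removed shell code; the contents of the new scripts/make-config-poison.sh are not visible in this excerpt, so nothing here should be read as a description of that script. The function name and the sample header text are assumptions for illustration.

```python
def make_config_poison(header_texts):
    """Hypothetical helper mirroring the removed configure pipeline.

    header_texts: iterable of strings, each the contents of one
    *-config-target.h or *-config-devices.h file.
    """
    names = set()
    for text in header_texts:
        for line in text.splitlines():
            # '/CONFIG_TCG/d' and '/CONFIG_USER_ONLY/d': drop such lines.
            if "CONFIG_TCG" in line or "CONFIG_USER_ONLY" in line:
                continue
            # '/^#define / { s///; s/ .*//; ... }': keep the macro name.
            if line.startswith("#define "):
                rest = line[len("#define "):]
                names.add(rest.split(" ", 1)[0])
    # 'sort -u': unique, sorted output (byte order, as in the C locale).
    return "".join("#pragma GCC poison %s\n" % n for n in sorted(names))


# Illustrative input only.
TARGET_H = '#define CONFIG_KVM 1\n#define CONFIG_TCG 1\n#define TARGET_NAME "x86_64"\n'
DEVICES_H = "#define CONFIG_KVM 1\n#define CONFIG_USER_ONLY 1\n"

POISON = make_config_poison([TARGET_H, DEVICES_H])
```

The duplicate `CONFIG_KVM` definition is emitted once, and the two special-cased switches never reach the output.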
[PATCH v2 01/23] dma: Let dma_memory_valid() take MemTxAttrs argument
Let devices specify transaction attributes when calling dma_memory_valid(). Reviewed-by: Richard Henderson Reviewed-by: Li Qiang Reviewed-by: Edgar E. Iglesias Signed-off-by: Philippe Mathieu-Daudé Acked-by: Stefan Hajnoczi Message-Id: <20210702092439.989969-2-phi...@redhat.com> --- include/hw/ppc/spapr_vio.h | 2 +- include/sysemu/dma.h | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h index 4bea87f39cc..4c45f1579fa 100644 --- a/include/hw/ppc/spapr_vio.h +++ b/include/hw/ppc/spapr_vio.h @@ -91,7 +91,7 @@ static inline void spapr_vio_irq_pulse(SpaprVioDevice *dev) static inline bool spapr_vio_dma_valid(SpaprVioDevice *dev, uint64_t taddr, uint32_t size, DMADirection dir) { -return dma_memory_valid(&dev->as, taddr, size, dir); +return dma_memory_valid(&dev->as, taddr, size, dir, MEMTXATTRS_UNSPECIFIED); } static inline int spapr_vio_dma_read(SpaprVioDevice *dev, uint64_t taddr, diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index 3201e7901db..296f3b57c9c 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -73,11 +73,11 @@ static inline void dma_barrier(AddressSpace *as, DMADirection dir) * dma_memory_{read,write}() and check for errors */ static inline bool dma_memory_valid(AddressSpace *as, dma_addr_t addr, dma_addr_t len, -DMADirection dir) +DMADirection dir, MemTxAttrs attrs) { return address_space_access_valid(as, addr, len, dir == DMA_DIRECTION_FROM_DEVICE, - MEMTXATTRS_UNSPECIFIED); + attrs); } static inline MemTxResult dma_memory_rw_relaxed(AddressSpace *as, -- 2.33.1
[PATCH v2 07/23] dma: Have dma_buf_rw() take a void pointer
DMA operations are run on any kind of buffer, not only arrays of uint8_t. Convert dma_buf_rw() to take a void pointer argument to save us pointless casts to uint8_t *. Reviewed-by: Klaus Jensen Signed-off-by: Philippe Mathieu-Daudé --- softmmu/dma-helpers.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c index 3c06a2feddd..09e29997ee5 100644 --- a/softmmu/dma-helpers.c +++ b/softmmu/dma-helpers.c @@ -294,9 +294,10 @@ BlockAIOCB *dma_blk_write(BlockBackend *blk, } -static uint64_t dma_buf_rw(uint8_t *ptr, int32_t len, QEMUSGList *sg, +static uint64_t dma_buf_rw(void *buf, int32_t len, QEMUSGList *sg, DMADirection dir) { +uint8_t *ptr = buf; uint64_t resid; int sg_cur_index; -- 2.33.1
[PULL 13/15] meson: add comments in the target-specific flags section
Signed-off-by: Paolo Bonzini --- meson.build | 5 + 1 file changed, 5 insertions(+) diff --git a/meson.build b/meson.build index a61eb7cee5..3519ed51e3 100644 --- a/meson.build +++ b/meson.build @@ -233,6 +233,7 @@ endif # Target-specific checks and dependencies # ### +# Fuzzing if get_option('fuzzing') and get_option('fuzzing_engine') == '' and \ not cc.links(''' #include @@ -244,6 +245,7 @@ if get_option('fuzzing') and get_option('fuzzing_engine') == '' and \ error('Your compiler does not support -fsanitize=fuzzer') endif +# Tracing backends if 'ftrace' in get_option('trace_backends') and targetos != 'linux' error('ftrace is supported only on Linux') endif @@ -257,6 +259,7 @@ if 'syslog' in get_option('trace_backends') and not cc.compiles(''' error('syslog is not supported on this system') endif +# Miscellaneous Linux-only features if targetos != 'linux' and get_option('mpath').enabled() error('Multipath is supported only on Linux') endif @@ -266,6 +269,7 @@ if targetos != 'linux' and get_option('multiprocess').enabled() endif multiprocess_allowed = targetos == 'linux' and not get_option('multiprocess').disabled() +# Target-specific libraries and flags libm = cc.find_library('m', required: false) threads = dependency('threads') util = cc.find_library('util', required: false) @@ -306,6 +310,7 @@ elif targetos == 'openbsd' endif endif +# Target-specific configuration of accelerators accelerators = [] if not get_option('kvm').disabled() and targetos == 'linux' accelerators += 'CONFIG_KVM' -- 2.33.1
[PATCH v2 03/23] dma: Let dma_memory_rw_relaxed() take MemTxAttrs argument
We will add the MemTxAttrs argument to dma_memory_rw() in the next commit. Since dma_memory_rw_relaxed() is only used by dma_memory_rw(), modify it first in a separate commit to keep the next commit easier to review. Reviewed-by: Richard Henderson Reviewed-by: Li Qiang Reviewed-by: Edgar E. Iglesias Signed-off-by: Philippe Mathieu-Daudé Acked-by: Stefan Hajnoczi Message-Id: <20210702092439.989969-4-phi...@redhat.com> --- include/sysemu/dma.h | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index d23516f020a..3be803cf3ff 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -83,9 +83,10 @@ static inline bool dma_memory_valid(AddressSpace *as, static inline MemTxResult dma_memory_rw_relaxed(AddressSpace *as, dma_addr_t addr, void *buf, dma_addr_t len, -DMADirection dir) +DMADirection dir, +MemTxAttrs attrs) { -return address_space_rw(as, addr, MEMTXATTRS_UNSPECIFIED, +return address_space_rw(as, addr, attrs, buf, len, dir == DMA_DIRECTION_FROM_DEVICE); } @@ -93,7 +94,9 @@ static inline MemTxResult dma_memory_read_relaxed(AddressSpace *as, dma_addr_t addr, void *buf, dma_addr_t len) { -return dma_memory_rw_relaxed(as, addr, buf, len, DMA_DIRECTION_TO_DEVICE); +return dma_memory_rw_relaxed(as, addr, buf, len, + DMA_DIRECTION_TO_DEVICE, + MEMTXATTRS_UNSPECIFIED); } static inline MemTxResult dma_memory_write_relaxed(AddressSpace *as, @@ -102,7 +105,8 @@ static inline MemTxResult dma_memory_write_relaxed(AddressSpace *as, dma_addr_t len) { return dma_memory_rw_relaxed(as, addr, (void *)buf, len, - DMA_DIRECTION_FROM_DEVICE); + DMA_DIRECTION_FROM_DEVICE, + MEMTXATTRS_UNSPECIFIED); } /** @@ -124,7 +128,8 @@ static inline MemTxResult dma_memory_rw(AddressSpace *as, dma_addr_t addr, { dma_barrier(as, dir); -return dma_memory_rw_relaxed(as, addr, buf, len, dir); +return dma_memory_rw_relaxed(as, addr, buf, len, dir, + MEMTXATTRS_UNSPECIFIED); } /** -- 2.33.1
[PATCH v2 04/23] dma: Let dma_memory_rw() take MemTxAttrs argument
Let devices specify transaction attributes when calling dma_memory_rw(). Reviewed-by: Richard Henderson Reviewed-by: Li Qiang Reviewed-by: Edgar E. Iglesias Signed-off-by: Philippe Mathieu-Daudé Acked-by: Stefan Hajnoczi Message-Id: <20210702092439.989969-5-phi...@redhat.com> --- include/hw/pci/pci.h | 3 ++- include/sysemu/dma.h | 11 ++- hw/intc/spapr_xive.c | 3 ++- hw/usb/hcd-ohci.c | 10 ++ softmmu/dma-helpers.c | 3 ++- 5 files changed, 18 insertions(+), 12 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index e7cdf2d5ec5..4383f1c95e0 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -808,7 +808,8 @@ static inline MemTxResult pci_dma_rw(PCIDevice *dev, dma_addr_t addr, void *buf, dma_addr_t len, DMADirection dir) { -return dma_memory_rw(pci_get_address_space(dev), addr, buf, len, dir); +return dma_memory_rw(pci_get_address_space(dev), addr, buf, len, + dir, MEMTXATTRS_UNSPECIFIED); } /** diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index 3be803cf3ff..e8ad42226f6 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -121,15 +121,15 @@ static inline MemTxResult dma_memory_write_relaxed(AddressSpace *as, * @buf: buffer with the data transferred * @len: the number of bytes to read or write * @dir: indicates the transfer direction + * @attrs: memory transaction attributes */ static inline MemTxResult dma_memory_rw(AddressSpace *as, dma_addr_t addr, void *buf, dma_addr_t len, -DMADirection dir) +DMADirection dir, MemTxAttrs attrs) { dma_barrier(as, dir); -return dma_memory_rw_relaxed(as, addr, buf, len, dir, - MEMTXATTRS_UNSPECIFIED); +return dma_memory_rw_relaxed(as, addr, buf, len, dir, attrs); } /** @@ -147,7 +147,8 @@ static inline MemTxResult dma_memory_rw(AddressSpace *as, dma_addr_t addr, static inline MemTxResult dma_memory_read(AddressSpace *as, dma_addr_t addr, void *buf, dma_addr_t len) { -return dma_memory_rw(as, addr, buf, len, DMA_DIRECTION_TO_DEVICE); +return dma_memory_rw(as, addr, buf, len, + 
DMA_DIRECTION_TO_DEVICE, MEMTXATTRS_UNSPECIFIED); } /** @@ -166,7 +167,7 @@ static inline MemTxResult dma_memory_write(AddressSpace *as, dma_addr_t addr, const void *buf, dma_addr_t len) { return dma_memory_rw(as, addr, (void *)buf, len, - DMA_DIRECTION_FROM_DEVICE); + DMA_DIRECTION_FROM_DEVICE, MEMTXATTRS_UNSPECIFIED); } /** diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c index 4ec659b93e1..eae95c716f1 100644 --- a/hw/intc/spapr_xive.c +++ b/hw/intc/spapr_xive.c @@ -1684,7 +1684,8 @@ static target_ulong h_int_esb(PowerPCCPU *cpu, mmio_addr = xive->vc_base + xive_source_esb_mgmt(xsrc, lisn) + offset; if (dma_memory_rw(&address_space_memory, mmio_addr, &data, 8, - (flags & SPAPR_XIVE_ESB_STORE))) { + (flags & SPAPR_XIVE_ESB_STORE), + MEMTXATTRS_UNSPECIFIED)) { qemu_log_mask(LOG_GUEST_ERROR, "XIVE: failed to access ESB @0x%" HWADDR_PRIx "\n", mmio_addr); return H_HARDWARE; diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c index 1cf2816772c..56e2315c734 100644 --- a/hw/usb/hcd-ohci.c +++ b/hw/usb/hcd-ohci.c @@ -586,7 +586,8 @@ static int ohci_copy_td(OHCIState *ohci, struct ohci_td *td, if (n > len) n = len; -if (dma_memory_rw(ohci->as, ptr + ohci->localmem_base, buf, n, dir)) { +if (dma_memory_rw(ohci->as, ptr + ohci->localmem_base, buf, + n, dir, MEMTXATTRS_UNSPECIFIED)) { return -1; } if (n == len) { @@ -595,7 +596,7 @@ static int ohci_copy_td(OHCIState *ohci, struct ohci_td *td, ptr = td->be & ~0xfffu; buf += n; if (dma_memory_rw(ohci->as, ptr + ohci->localmem_base, buf, - len - n, dir)) { + len - n, dir, MEMTXATTRS_UNSPECIFIED)) { return -1; } return 0; @@ -613,7 +614,8 @@ static int ohci_copy_iso_td(OHCIState *ohci, if (n > len) n = len; -if (dma_memory_rw(ohci->as, ptr + ohci->localmem_base, buf, n, dir)) { +if (dma_memory_rw(ohci->as, ptr + ohci->localmem_base, buf, + n, dir, MEMTXATTRS_UNSPECIFIED)) { return -1; } if (n == len) { @@ -622,7 +624,7 @@ static int ohci_copy_iso_td(OHCIState *ohci, ptr = end_addr & ~0xfffu; buf += n; if 
(dma_memory_rw(ohci->as, ptr + ohci->localmem_base, buf, - len - n, dir)) { + len - n, dir,
[PATCH v2 00/23] hw: Have DMA APIs take MemTxAttrs arg & propagate MemTxResult (full)
Hi Peter and Paolo. This series contains all the uncontroversial patches from the earlier "improve DMA situations, avoid re-entrancy issues" series. The rest will be discussed on top. The only operations added are: - take MemTxAttrs argument - propagate MemTxResult All patches are reviewed. If you don't see any objection, I plan to send this via a pull request by the end of next week. Regards, Phil. Philippe Mathieu-Daudé (23): dma: Let dma_memory_valid() take MemTxAttrs argument dma: Let dma_memory_set() take MemTxAttrs argument dma: Let dma_memory_rw_relaxed() take MemTxAttrs argument dma: Let dma_memory_rw() take MemTxAttrs argument dma: Let dma_memory_read/write() take MemTxAttrs argument dma: Let dma_memory_map() take MemTxAttrs argument dma: Have dma_buf_rw() take a void pointer dma: Have dma_buf_read() / dma_buf_write() take a void pointer dma: Let pci_dma_rw() take MemTxAttrs argument dma: Let dma_buf_rw() take MemTxAttrs argument dma: Let dma_buf_write() take MemTxAttrs argument dma: Let dma_buf_read() take MemTxAttrs argument dma: Let dma_buf_rw() propagate MemTxResult dma: Let dma_buf_read() / dma_buf_write() propagate MemTxResult dma: Let st*_dma() take MemTxAttrs argument dma: Let ld*_dma() take MemTxAttrs argument dma: Let st*_dma() propagate MemTxResult dma: Let ld*_dma() propagate MemTxResult hw/scsi/megasas: Use uint32_t for reply queue head/tail values pci: Let st*_pci_dma() take MemTxAttrs argument pci: Let ld*_pci_dma() take MemTxAttrs argument pci: Let st*_pci_dma() propagate MemTxResult pci: Let ld*_pci_dma() propagate MemTxResult include/hw/pci/pci.h | 38 +-- include/hw/ppc/spapr_vio.h| 30 include/sysemu/dma.h | 90 +-- hw/arm/musicpal.c | 13 ++--- hw/arm/smmu-common.c | 3 +- hw/arm/smmuv3.c | 14 -- hw/audio/intel-hda.c | 13 +++-- hw/core/generic-loader.c | 3 +- hw/display/virtio-gpu.c | 10 ++-- hw/dma/pl330.c| 12 +++-- hw/dma/sparc32_dma.c | 16 --- hw/dma/xlnx-zynq-devcfg.c | 6 ++- hw/dma/xlnx_dpdma.c | 10 ++-- hw/hyperv/vmbus.c | 8 ++--
hw/i386/amd_iommu.c | 16 --- hw/i386/intel_iommu.c | 28 ++- hw/ide/ahci.c | 18 --- hw/ide/macio.c| 2 +- hw/intc/pnv_xive.c| 7 +-- hw/intc/spapr_xive.c | 3 +- hw/intc/xive.c| 7 +-- hw/misc/bcm2835_property.c| 3 +- hw/misc/macio/mac_dbdma.c | 10 ++-- hw/net/allwinner-sun8i-emac.c | 18 --- hw/net/eepro100.c | 49 +++ hw/net/ftgmac100.c| 25 ++ hw/net/imx_fec.c | 32 - hw/net/npcm7xx_emc.c | 20 hw/net/tulip.c| 36 +++--- hw/nvme/ctrl.c| 5 +- hw/nvram/fw_cfg.c | 16 --- hw/pci-host/pnv_phb3.c| 5 +- hw/pci-host/pnv_phb3_msi.c| 9 ++-- hw/pci-host/pnv_phb4.c| 5 +- hw/scsi/esp-pci.c | 2 +- hw/scsi/megasas.c | 86 ++--- hw/scsi/mptsas.c | 16 +-- hw/scsi/scsi-bus.c| 4 +- hw/scsi/vmw_pvscsi.c | 20 +--- hw/sd/allwinner-sdhost.c | 14 +++--- hw/sd/sdhci.c | 35 +- hw/usb/hcd-dwc2.c | 8 ++-- hw/usb/hcd-ehci.c | 6 ++- hw/usb/hcd-ohci.c | 28 ++- hw/usb/hcd-xhci.c | 26 ++ hw/usb/libhw.c| 3 +- hw/virtio/virtio.c| 6 ++- softmmu/dma-helpers.c | 32 - hw/scsi/trace-events | 8 ++-- 49 files changed, 542 insertions(+), 332 deletions(-) -- 2.33.1
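As a rough illustration of the two operations listed in the cover letter (take an attributes argument, propagate the transaction result), the sketch below models a 32-bit load helper before and after such a change. Every name here (`ToyResult`, `ToyAttrs`, `ToySpace`, `toy_read`, `toy_ld32_old`, `toy_ld32`) is a hypothetical stand-in invented for this example, and the "address space" is just a byte array; this is not the QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the QEMU definitions. */
typedef uint32_t ToyResult;
#define TOY_OK    0u
#define TOY_ERROR 1u

typedef struct { unsigned unspecified : 1; unsigned secure : 1; } ToyAttrs;
#define TOY_ATTRS_UNSPECIFIED ((ToyAttrs){ .unspecified = 1 })

typedef struct { uint8_t *ram; size_t size; } ToySpace;

/* Core accessor: fails if the access falls outside the toy RAM. */
static inline ToyResult toy_read(ToySpace *as, size_t addr, void *buf,
                                 size_t len, ToyAttrs attrs)
{
    (void)attrs; /* a real bus could route or filter on these */
    assert(as && buf);
    if (addr > as->size || len > as->size - addr) {
        return TOY_ERROR;
    }
    memcpy(buf, as->ram + addr, len);
    return TOY_OK;
}

/* Before: value returned directly, attributes fixed, result discarded. */
static inline uint32_t toy_ld32_old(ToySpace *as, size_t addr)
{
    uint32_t val = 0;
    toy_read(as, addr, &val, sizeof(val), TOY_ATTRS_UNSPECIFIED);
    return val;
}

/* After: caller picks the attributes and can see whether the load failed. */
static inline ToyResult toy_ld32(ToySpace *as, size_t addr, uint32_t *val,
                                 ToyAttrs attrs)
{
    return toy_read(as, addr, val, sizeof(*val), attrs);
}
```

With the old shape a failed load is indistinguishable from a load of zero; with the new shape the caller can check the returned result before trusting `*val`.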
[PATCH v2 06/23] dma: Let dma_memory_map() take MemTxAttrs argument
Let devices specify transaction attributes when calling dma_memory_map(). Patch created mechanically using spatch with this script: @@ expression E1, E2, E3, E4; @@ - dma_memory_map(E1, E2, E3, E4) + dma_memory_map(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED) Reviewed-by: Richard Henderson Reviewed-by: Li Qiang Reviewed-by: Edgar E. Iglesias Signed-off-by: Philippe Mathieu-Daudé Acked-by: Stefan Hajnoczi Message-Id: <20210702092439.989969-7-phi...@redhat.com> --- include/hw/pci/pci.h| 3 ++- include/sysemu/dma.h| 5 +++-- hw/display/virtio-gpu.c | 10 ++ hw/hyperv/vmbus.c | 8 +--- hw/ide/ahci.c | 8 +--- hw/usb/libhw.c | 3 ++- hw/virtio/virtio.c | 6 -- softmmu/dma-helpers.c | 3 ++- 8 files changed, 29 insertions(+), 17 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 4383f1c95e0..1acefc2a4c3 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -875,7 +875,8 @@ static inline void *pci_dma_map(PCIDevice *dev, dma_addr_t addr, { void *buf; -buf = dma_memory_map(pci_get_address_space(dev), addr, plen, dir); +buf = dma_memory_map(pci_get_address_space(dev), addr, plen, dir, + MEMTXATTRS_UNSPECIFIED); return buf; } diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index 522682bf386..97ff6f29f8c 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -202,16 +202,17 @@ MemTxResult dma_memory_set(AddressSpace *as, dma_addr_t addr, * @addr: address within that address space * @len: pointer to length of buffer; updated on return * @dir: indicates the transfer direction + * @attrs: memory attributes */ static inline void *dma_memory_map(AddressSpace *as, dma_addr_t addr, dma_addr_t *len, - DMADirection dir) + DMADirection dir, MemTxAttrs attrs) { hwaddr xlen = *len; void *p; p = address_space_map(as, addr, &xlen, dir == DMA_DIRECTION_FROM_DEVICE, - MEMTXATTRS_UNSPECIFIED); + attrs); *len = xlen; return p; } diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index d78b9700c7d..c6dc818988c 100644 --- 
a/hw/display/virtio-gpu.c +++ b/hw/display/virtio-gpu.c @@ -814,8 +814,9 @@ int virtio_gpu_create_mapping_iov(VirtIOGPU *g, do { len = l; -map = dma_memory_map(VIRTIO_DEVICE(g)->dma_as, - a, &len, DMA_DIRECTION_TO_DEVICE); +map = dma_memory_map(VIRTIO_DEVICE(g)->dma_as, a, &len, + DMA_DIRECTION_TO_DEVICE, + MEMTXATTRS_UNSPECIFIED); if (!map) { qemu_log_mask(LOG_GUEST_ERROR, "%s: failed to map MMIO memory for" " element %d\n", __func__, e); @@ -1252,8 +1253,9 @@ static int virtio_gpu_load(QEMUFile *f, void *opaque, size_t size, for (i = 0; i < res->iov_cnt; i++) { hwaddr len = res->iov[i].iov_len; res->iov[i].iov_base = -dma_memory_map(VIRTIO_DEVICE(g)->dma_as, - res->addrs[i], &len, DMA_DIRECTION_TO_DEVICE); +dma_memory_map(VIRTIO_DEVICE(g)->dma_as, res->addrs[i], &len, + DMA_DIRECTION_TO_DEVICE, + MEMTXATTRS_UNSPECIFIED); if (!res->iov[i].iov_base || len != res->iov[i].iov_len) { /* Clean up the half-a-mapping we just created... */ diff --git a/hw/hyperv/vmbus.c b/hw/hyperv/vmbus.c index dbce3b35fba..8aad29f1bb2 100644 --- a/hw/hyperv/vmbus.c +++ b/hw/hyperv/vmbus.c @@ -373,7 +373,8 @@ static ssize_t gpadl_iter_io(GpadlIter *iter, void *buf, uint32_t len) maddr = (iter->gpadl->gfns[idx] << TARGET_PAGE_BITS) | off_in_page; -iter->map = dma_memory_map(iter->as, maddr, &mlen, iter->dir); +iter->map = dma_memory_map(iter->as, maddr, &mlen, iter->dir, + MEMTXATTRS_UNSPECIFIED); if (mlen != pgleft) { dma_memory_unmap(iter->as, iter->map, mlen, iter->dir, 0); iter->map = NULL; @@ -490,7 +491,8 @@ int vmbus_map_sgl(VMBusChanReq *req, DMADirection dir, struct iovec *iov, goto err; } -iov[ret_cnt].iov_base = dma_memory_map(sgl->as, a, &l, dir); +iov[ret_cnt].iov_base = dma_memory_map(sgl->as, a, &l, dir, + MEMTXATTRS_UNSPECIFIED); if (!l) { ret = -EFAULT; goto err; @@ -566,7 +568,7 @@ static vmbus_ring_buffer *ringbuf_map_hdr(VMBusRingBufCommon *ringbuf) dma_addr_t mlen = sizeof(*rb); rb = dma_memory_map(ringbuf->as, ringbuf->rb_addr, &mlen, -DMA
[PATCH v2 18/23] dma: Let ld*_dma() propagate MemTxResult
dma_memory_read() returns a MemTxResult type. Do not discard it, return it to the caller. Update the few callers. Reviewed-by: Richard Henderson Reviewed-by: Cédric Le Goater Signed-off-by: Philippe Mathieu-Daudé --- include/hw/pci/pci.h | 6 -- include/hw/ppc/spapr_vio.h | 6 +- include/sysemu/dma.h | 25 - hw/intc/pnv_xive.c | 8 hw/usb/hcd-xhci.c | 7 --- 5 files changed, 29 insertions(+), 23 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 0613308b1b6..8c5f2ed5054 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -854,8 +854,10 @@ static inline MemTxResult pci_dma_write(PCIDevice *dev, dma_addr_t addr, static inline uint##_bits##_t ld##_l##_pci_dma(PCIDevice *dev, \ dma_addr_t addr) \ { \ -return ld##_l##_dma(pci_get_address_space(dev), addr, \ -MEMTXATTRS_UNSPECIFIED);\ +uint##_bits##_t val; \ +ld##_l##_dma(pci_get_address_space(dev), addr, &val, \ + MEMTXATTRS_UNSPECIFIED); \ +return val; \ } \ static inline void st##_s##_pci_dma(PCIDevice *dev, \ dma_addr_t addr, uint##_bits##_t val) \ diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h index d2ec9b0637f..7eae1a48478 100644 --- a/include/hw/ppc/spapr_vio.h +++ b/include/hw/ppc/spapr_vio.h @@ -127,7 +127,11 @@ static inline int spapr_vio_dma_set(SpaprVioDevice *dev, uint64_t taddr, #define vio_stq(_dev, _addr, _val) \ (stq_be_dma(&(_dev)->as, (_addr), (_val), MEMTXATTRS_UNSPECIFIED)) #define vio_ldq(_dev, _addr) \ -(ldq_be_dma(&(_dev)->as, (_addr), MEMTXATTRS_UNSPECIFIED)) +({ \ +uint64_t _val; \ +ldq_be_dma(&(_dev)->as, (_addr), &_val, MEMTXATTRS_UNSPECIFIED); \ +_val; \ +}) int spapr_vio_send_crq(SpaprVioDevice *dev, uint8_t *crq); diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index 725e8e90f88..e6776586613 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -240,14 +240,15 @@ static inline void dma_memory_unmap(AddressSpace *as, } #define DEFINE_LDST_DMA(_lname, _sname, _bits, _end) \ -static inline uint##_bits##_t 
ld##_lname##_##_end##_dma(AddressSpace *as, \ -dma_addr_t addr, \ -MemTxAttrs attrs) \ -{ \ -uint##_bits##_t val;\ -dma_memory_read(as, addr, &val, (_bits) / 8, attrs); \ -return _end##_bits##_to_cpu(val); \ -} \ +static inline MemTxResult ld##_lname##_##_end##_dma(AddressSpace *as, \ +dma_addr_t addr, \ +uint##_bits##_t *pval, \ +MemTxAttrs attrs) \ +{ \ +MemTxResult res = dma_memory_read(as, addr, pval, (_bits) / 8, attrs); \ +_end##_bits##_to_cpus(pval); \ +return res; \ +} \ static inline MemTxResult st##_sname##_##_end##_dma(AddressSpace *as, \ dma_addr_t addr, \ uint##_bits##_t val, \ @@ -257,12 +258,10 @@ static inline void dma_memory_unmap(AddressSpace *as, return dma_memory_write(as, addr, &val, (_bits) / 8, attrs); \ } -static inline uint8_t ldub_dma(AddressSpace *as, dma_addr_t addr, MemTxAttrs attrs) +static inline MemTxResult ldub_dma(AddressSpace *as, dma_addr_t addr, + uint8_t *val, MemTxAttrs attrs) { -uint8_t val; - -dma_memory_read(as, addr, &val, 1, attrs); -return val; +return dma_memory_read(as, addr, val, 1, attrs); } static inline MemTxResult stb_dma(AddressSpace *as, dma_addr_t addr, diff --git a/hw/intc/pnv_xive.c b/hw/intc/pnv_xive.c index d9249bbc0c1..bb207514f2d 100644 --- a/hw/intc/pnv_xive.c +++ b/hw/intc/pnv_xive.c @@ -172,7 +172,7 @@ static uint64_t pnv_xive_vst_addr_indirect(PnvXive *xive, uint32_t type, /* Get the page size of the indirect table. */ vsd_addr = vsd & VSD_ADDRESS_MASK; -vsd = ldq_be_dma(&address_space_memory, vsd_addr, MEMTXATTRS_UNSPECIFIED); +ldq_be_dma(&address_space_memory, vsd_addr, &vsd, MEMTXATTRS_UNSPECIFIED); if (!(vsd & VSD_ADDRESS_MASK)) { #ifdef XIVE_DEBUG @@ -195,8 +195,8 @@ static uint6
[PATCH v2 09/23] dma: Let pci_dma_rw() take MemTxAttrs argument
Let devices specify transaction attributes when calling pci_dma_rw(). Keep the default MEMTXATTRS_UNSPECIFIED in the few callers. Reviewed-by: Klaus Jensen Signed-off-by: Philippe Mathieu-Daudé --- include/hw/pci/pci.h | 10 ++ hw/audio/intel-hda.c | 3 ++- hw/scsi/esp-pci.c| 2 +- 3 files changed, 9 insertions(+), 6 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 1acefc2a4c3..a751ab5a75d 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -806,10 +806,10 @@ static inline AddressSpace *pci_get_address_space(PCIDevice *dev) */ static inline MemTxResult pci_dma_rw(PCIDevice *dev, dma_addr_t addr, void *buf, dma_addr_t len, - DMADirection dir) + DMADirection dir, MemTxAttrs attrs) { return dma_memory_rw(pci_get_address_space(dev), addr, buf, len, - dir, MEMTXATTRS_UNSPECIFIED); + dir, attrs); } /** @@ -827,7 +827,8 @@ static inline MemTxResult pci_dma_rw(PCIDevice *dev, dma_addr_t addr, static inline MemTxResult pci_dma_read(PCIDevice *dev, dma_addr_t addr, void *buf, dma_addr_t len) { -return pci_dma_rw(dev, addr, buf, len, DMA_DIRECTION_TO_DEVICE); +return pci_dma_rw(dev, addr, buf, len, + DMA_DIRECTION_TO_DEVICE, MEMTXATTRS_UNSPECIFIED); } /** @@ -845,7 +846,8 @@ static inline MemTxResult pci_dma_read(PCIDevice *dev, dma_addr_t addr, static inline MemTxResult pci_dma_write(PCIDevice *dev, dma_addr_t addr, const void *buf, dma_addr_t len) { -return pci_dma_rw(dev, addr, (void *) buf, len, DMA_DIRECTION_FROM_DEVICE); +return pci_dma_rw(dev, addr, (void *) buf, len, + DMA_DIRECTION_FROM_DEVICE, MEMTXATTRS_UNSPECIFIED); } #define PCI_DMA_DEFINE_LDST(_l, _s, _bits) \ diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c index 8ce9df64e3e..fb3d34a4a0c 100644 --- a/hw/audio/intel-hda.c +++ b/hw/audio/intel-hda.c @@ -427,7 +427,8 @@ static bool intel_hda_xfer(HDACodecDevice *dev, uint32_t stnr, bool output, dprint(d, 3, "dma: entry %d, pos %d/%d, copy %d\n", st->be, st->bp, st->bpl[st->be].len, copy); -pci_dma_rw(&d->pci, 
st->bpl[st->be].addr + st->bp, buf, copy, !output); +pci_dma_rw(&d->pci, st->bpl[st->be].addr + st->bp, buf, copy, !output, + MEMTXATTRS_UNSPECIFIED); st->lpib += copy; st->bp += copy; buf += copy; diff --git a/hw/scsi/esp-pci.c b/hw/scsi/esp-pci.c index dac054aeed4..1792f84cea6 100644 --- a/hw/scsi/esp-pci.c +++ b/hw/scsi/esp-pci.c @@ -280,7 +280,7 @@ static void esp_pci_dma_memory_rw(PCIESPState *pci, uint8_t *buf, int len, len = pci->dma_regs[DMA_WBC]; } -pci_dma_rw(PCI_DEVICE(pci), addr, buf, len, dir); +pci_dma_rw(PCI_DEVICE(pci), addr, buf, len, dir, MEMTXATTRS_UNSPECIFIED); /* update status registers */ pci->dma_regs[DMA_WBC] -= len; -- 2.33.1
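The layering used in this patch (the generic read/write helper gains an attributes parameter, while the read and write convenience wrappers keep their signatures and pass the default) can be sketched in isolation. The types and functions below (`ToyDevice`, `toy_dma_rw`, `toy_dma_read`, `toy_dma_write`) are hypothetical and only mimic the shape of the change; the toy device records the attributes it last saw so the forwarding is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the QEMU definitions. */
typedef struct { unsigned unspecified : 1; unsigned secure : 1; } ToyAttrs;
#define TOY_ATTRS_UNSPECIFIED ((ToyAttrs){ .unspecified = 1 })

typedef enum { TOY_TO_DEVICE = 0, TOY_FROM_DEVICE = 1 } ToyDirection;

typedef struct {
    unsigned char ram[64];
    ToyAttrs last_attrs;   /* attributes seen by the most recent access */
} ToyDevice;

/* Generic accessor: the attributes are now the caller's choice. */
static inline int toy_dma_rw(ToyDevice *dev, size_t addr, void *buf,
                             size_t len, ToyDirection dir, ToyAttrs attrs)
{
    assert(dev && buf);
    dev->last_attrs = attrs;
    if (addr > sizeof(dev->ram) || len > sizeof(dev->ram) - addr) {
        return -1;
    }
    if (dir == TOY_FROM_DEVICE) {
        memcpy(dev->ram + addr, buf, len);   /* device writes memory */
    } else {
        memcpy(buf, dev->ram + addr, len);   /* device reads memory */
    }
    return 0;
}

/* Convenience wrappers keep their old signature and pass the default. */
static inline int toy_dma_read(ToyDevice *dev, size_t addr,
                               void *buf, size_t len)
{
    return toy_dma_rw(dev, addr, buf, len,
                      TOY_TO_DEVICE, TOY_ATTRS_UNSPECIFIED);
}

static inline int toy_dma_write(ToyDevice *dev, size_t addr,
                                const void *buf, size_t len)
{
    return toy_dma_rw(dev, addr, (void *)buf, len,
                      TOY_FROM_DEVICE, TOY_ATTRS_UNSPECIFIED);
}
```

Existing callers of the wrappers compile unchanged, and only callers that need non-default attributes have to use the generic function directly.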
[PATCH v2 02/23] dma: Let dma_memory_set() take MemTxAttrs argument
Let devices specify transaction attributes when calling dma_memory_set(). Reviewed-by: Richard Henderson Reviewed-by: Li Qiang Reviewed-by: Edgar E. Iglesias Signed-off-by: Philippe Mathieu-Daudé Acked-by: Stefan Hajnoczi Message-Id: <20210702092439.989969-3-phi...@redhat.com> --- include/hw/ppc/spapr_vio.h | 3 ++- include/sysemu/dma.h | 3 ++- hw/nvram/fw_cfg.c | 3 ++- softmmu/dma-helpers.c | 5 ++--- 4 files changed, 8 insertions(+), 6 deletions(-) diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h index 4c45f1579fa..c90e74a67dd 100644 --- a/include/hw/ppc/spapr_vio.h +++ b/include/hw/ppc/spapr_vio.h @@ -111,7 +111,8 @@ static inline int spapr_vio_dma_write(SpaprVioDevice *dev, uint64_t taddr, static inline int spapr_vio_dma_set(SpaprVioDevice *dev, uint64_t taddr, uint8_t c, uint32_t size) { -return (dma_memory_set(&dev->as, taddr, c, size) != 0) ? +return (dma_memory_set(&dev->as, taddr, + c, size, MEMTXATTRS_UNSPECIFIED) != 0) ? H_DEST_PARM : H_SUCCESS; } diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index 296f3b57c9c..d23516f020a 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -175,9 +175,10 @@ static inline MemTxResult dma_memory_write(AddressSpace *as, dma_addr_t addr, * @addr: address within that address space * @c: constant byte to fill the memory * @len: the number of bytes to fill with the constant byte + * @attrs: memory transaction attributes */ MemTxResult dma_memory_set(AddressSpace *as, dma_addr_t addr, - uint8_t c, dma_addr_t len); + uint8_t c, dma_addr_t len, MemTxAttrs attrs); /** * address_space_map: Map a physical memory region into a host virtual address. diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index c06b30de112..f7803fe3c30 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -399,7 +399,8 @@ static void fw_cfg_dma_transfer(FWCfgState *s) * tested before. 
*/ if (read) { -if (dma_memory_set(s->dma_as, dma.address, 0, len)) { +if (dma_memory_set(s->dma_as, dma.address, 0, len, + MEMTXATTRS_UNSPECIFIED)) { dma.control |= FW_CFG_DMA_CTL_ERROR; } } diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c index 7d766a5e89a..1f07217ad4a 100644 --- a/softmmu/dma-helpers.c +++ b/softmmu/dma-helpers.c @@ -19,7 +19,7 @@ /* #define DEBUG_IOMMU */ MemTxResult dma_memory_set(AddressSpace *as, dma_addr_t addr, - uint8_t c, dma_addr_t len) + uint8_t c, dma_addr_t len, MemTxAttrs attrs) { dma_barrier(as, DMA_DIRECTION_FROM_DEVICE); @@ -31,8 +31,7 @@ MemTxResult dma_memory_set(AddressSpace *as, dma_addr_t addr, memset(fillbuf, c, FILLBUF_SIZE); while (len > 0) { l = len < FILLBUF_SIZE ? len : FILLBUF_SIZE; -error |= address_space_write(as, addr, MEMTXATTRS_UNSPECIFIED, - fillbuf, l); +error |= address_space_write(as, addr, attrs, fillbuf, l); len -= l; addr += l; } -- 2.33.1
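The fill loop touched by this patch writes a constant byte in fixed-size chunks and ORs the per-chunk results together, so a single failing chunk marks the whole fill as failed. Below is a minimal standalone model of that loop, assuming a toy byte-array address space; `ToySpace`, `toy_write`, `toy_memory_set` and `TOY_FILLBUF_SIZE` are hypothetical names chosen for this sketch, and the chunk size is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the QEMU definitions. */
typedef uint32_t ToyResult;
#define TOY_OK    0u
#define TOY_ERROR 1u

typedef struct { unsigned unspecified : 1; } ToyAttrs;
#define TOY_ATTRS_UNSPECIFIED ((ToyAttrs){ .unspecified = 1 })

typedef struct { uint8_t *ram; size_t size; } ToySpace;

#define TOY_FILLBUF_SIZE 16

/* Bounds-checked write into the toy RAM. */
static inline ToyResult toy_write(ToySpace *as, size_t addr, ToyAttrs attrs,
                                  const void *buf, size_t len)
{
    (void)attrs;
    if (addr > as->size || len > as->size - addr) {
        return TOY_ERROR;
    }
    memcpy(as->ram + addr, buf, len);
    return TOY_OK;
}

/* Fill [addr, addr + len) with byte c, one chunk at a time. */
static inline ToyResult toy_memory_set(ToySpace *as, size_t addr, uint8_t c,
                                       size_t len, ToyAttrs attrs)
{
    uint8_t fillbuf[TOY_FILLBUF_SIZE];
    ToyResult error = TOY_OK;

    assert(as && as->ram);
    memset(fillbuf, c, sizeof(fillbuf));
    while (len > 0) {
        size_t l = len < sizeof(fillbuf) ? len : sizeof(fillbuf);
        /* Accumulate errors; the same attrs apply to every chunk. */
        error |= toy_write(as, addr, attrs, fillbuf, l);
        len -= l;
        addr += l;
    }
    return error;
}
```

Note that in this model a failing chunk does not stop the loop: earlier chunks stay written and later chunks are still attempted, which is why the caller only learns that some part of the fill failed.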
[PATCH v2 08/23] dma: Have dma_buf_read() / dma_buf_write() take a void pointer
DMA operations are run on any kind of buffer, not arrays of uint8_t. Convert dma_buf_read/dma_buf_write functions to take a void pointer argument and save us pointless casts to uint8_t *. Remove these pointless casts in the megasas device model. Reviewed-by: Klaus Jensen Signed-off-by: Philippe Mathieu-Daudé --- include/sysemu/dma.h | 4 ++-- hw/scsi/megasas.c | 22 +++--- softmmu/dma-helpers.c | 4 ++-- 3 files changed, 15 insertions(+), 15 deletions(-) diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index 97ff6f29f8c..0d5b836013d 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -302,8 +302,8 @@ BlockAIOCB *dma_blk_read(BlockBackend *blk, BlockAIOCB *dma_blk_write(BlockBackend *blk, QEMUSGList *sg, uint64_t offset, uint32_t align, BlockCompletionFunc *cb, void *opaque); -uint64_t dma_buf_read(uint8_t *ptr, int32_t len, QEMUSGList *sg); -uint64_t dma_buf_write(uint8_t *ptr, int32_t len, QEMUSGList *sg); +uint64_t dma_buf_read(void *ptr, int32_t len, QEMUSGList *sg); +uint64_t dma_buf_write(void *ptr, int32_t len, QEMUSGList *sg); void dma_acct_start(BlockBackend *blk, BlockAcctCookie *cookie, QEMUSGList *sg, enum BlockAcctType type); diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c index 8f357841004..dc28302f96d 100644 --- a/hw/scsi/megasas.c +++ b/hw/scsi/megasas.c @@ -848,7 +848,7 @@ static int megasas_ctrl_get_info(MegasasState *s, MegasasCmd *cmd) MFI_INFO_PDMIX_SATA | MFI_INFO_PDMIX_LD); -cmd->iov_size -= dma_buf_read((uint8_t *)&info, dcmd_size, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg); return MFI_STAT_OK; } @@ -878,7 +878,7 @@ static int megasas_mfc_get_defaults(MegasasState *s, MegasasCmd *cmd) info.disable_preboot_cli = 1; info.cluster_disable = 1; -cmd->iov_size -= dma_buf_read((uint8_t *)&info, dcmd_size, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg); return MFI_STAT_OK; } @@ -899,7 +899,7 @@ static int megasas_dcmd_get_bios_info(MegasasState *s, MegasasCmd *cmd)
info.expose_all_drives = 1; } -cmd->iov_size -= dma_buf_read((uint8_t *)&info, dcmd_size, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg); return MFI_STAT_OK; } @@ -910,7 +910,7 @@ static int megasas_dcmd_get_fw_time(MegasasState *s, MegasasCmd *cmd) fw_time = cpu_to_le64(megasas_fw_time()); -cmd->iov_size -= dma_buf_read((uint8_t *)&fw_time, dcmd_size, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&fw_time, dcmd_size, &cmd->qsg); return MFI_STAT_OK; } @@ -937,7 +937,7 @@ static int megasas_event_info(MegasasState *s, MegasasCmd *cmd) info.shutdown_seq_num = cpu_to_le32(s->shutdown_event); info.boot_seq_num = cpu_to_le32(s->boot_event); -cmd->iov_size -= dma_buf_read((uint8_t *)&info, dcmd_size, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg); return MFI_STAT_OK; } @@ -1006,7 +1006,7 @@ static int megasas_dcmd_pd_get_list(MegasasState *s, MegasasCmd *cmd) info.size = cpu_to_le32(offset); info.count = cpu_to_le32(num_pd_disks); -cmd->iov_size -= dma_buf_read((uint8_t *)&info, offset, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&info, offset, &cmd->qsg); return MFI_STAT_OK; } @@ -1172,7 +1172,7 @@ static int megasas_dcmd_ld_get_list(MegasasState *s, MegasasCmd *cmd) info.ld_count = cpu_to_le32(num_ld_disks); trace_megasas_dcmd_ld_get_list(cmd->index, num_ld_disks, max_ld_disks); -resid = dma_buf_read((uint8_t *)&info, dcmd_size, &cmd->qsg); +resid = dma_buf_read(&info, dcmd_size, &cmd->qsg); cmd->iov_size = dcmd_size - resid; return MFI_STAT_OK; } @@ -1221,7 +1221,7 @@ static int megasas_dcmd_ld_list_query(MegasasState *s, MegasasCmd *cmd) info.size = dcmd_size; trace_megasas_dcmd_ld_get_list(cmd->index, num_ld_disks, max_ld_disks); -resid = dma_buf_read((uint8_t *)&info, dcmd_size, &cmd->qsg); +resid = dma_buf_read(&info, dcmd_size, &cmd->qsg); cmd->iov_size = dcmd_size - resid; return MFI_STAT_OK; } @@ -1390,7 +1390,7 @@ static int megasas_dcmd_cfg_read(MegasasState *s, MegasasCmd *cmd) ld_offset += sizeof(struct 
mfi_ld_config); } -cmd->iov_size -= dma_buf_read((uint8_t *)data, info->size, &cmd->qsg); +cmd->iov_size -= dma_buf_read(data, info->size, &cmd->qsg); return MFI_STAT_OK; } @@ -1420,7 +1420,7 @@ static int megasas_dcmd_get_properties(MegasasState *s, MegasasCmd *cmd) info.ecc_bucket_leak_rate = cpu_to_le16(1440); info.expose_encl_devices = 1; -cmd->iov_size -= dma_buf_read((uint8_t *)&info, dcmd_size, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg); return MFI_STAT_OK; } @@ -1465,7 +1465,7 @@ static int megasas
[PATCH v2 14/23] dma: Let dma_buf_read() / dma_buf_write() propagate MemTxResult
Since the previous commit, dma_buf_rw() returns a MemTxResult type. Do not discard it, return it to the caller. Since both dma_buf_read/dma_buf_write functions were previously returning the QEMUSGList size not consumed, add an extra argument where the unconsumed size can be stored. Update the few callers. Reviewed-by: Klaus Jensen Signed-off-by: Philippe Mathieu-Daudé --- include/sysemu/dma.h | 6 -- hw/ide/ahci.c | 8 hw/nvme/ctrl.c| 4 ++-- hw/scsi/megasas.c | 48 ++- hw/scsi/scsi-bus.c| 4 ++-- softmmu/dma-helpers.c | 18 ++-- 6 files changed, 52 insertions(+), 36 deletions(-) diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index fd8f16003dd..d11c1d794f9 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -302,8 +302,10 @@ BlockAIOCB *dma_blk_read(BlockBackend *blk, BlockAIOCB *dma_blk_write(BlockBackend *blk, QEMUSGList *sg, uint64_t offset, uint32_t align, BlockCompletionFunc *cb, void *opaque); -uint64_t dma_buf_read(void *ptr, int32_t len, QEMUSGList *sg, MemTxAttrs attrs); -uint64_t dma_buf_write(void *ptr, int32_t len, QEMUSGList *sg, MemTxAttrs attrs); +MemTxResult dma_buf_read(void *ptr, int32_t len, uint64_t *residp, + QEMUSGList *sg, MemTxAttrs attrs); +MemTxResult dma_buf_write(void *ptr, int32_t len, uint64_t *residp, + QEMUSGList *sg, MemTxAttrs attrs); void dma_acct_start(BlockBackend *blk, BlockAcctCookie *cookie, QEMUSGList *sg, enum BlockAcctType type); diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 205dfdc6622..0c7d31ceada 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -1384,9 +1384,9 @@ static void ahci_pio_transfer(const IDEDMA *dma) const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED; if (is_write) { -dma_buf_write(s->data_ptr, size, &s->sg, attrs); +dma_buf_write(s->data_ptr, size, NULL, &s->sg, attrs); } else { -dma_buf_read(s->data_ptr, size, &s->sg, attrs); +dma_buf_read(s->data_ptr, size, NULL, &s->sg, attrs); } } @@ -1479,9 +1479,9 @@ static int ahci_dma_rw_buf(const IDEDMA *dma, bool is_write) } if (is_write) {
-dma_buf_read(p, l, &s->sg, MEMTXATTRS_UNSPECIFIED); +dma_buf_read(p, l, NULL, &s->sg, MEMTXATTRS_UNSPECIFIED); } else { -dma_buf_write(p, l, &s->sg, MEMTXATTRS_UNSPECIFIED); +dma_buf_write(p, l, NULL, &s->sg, MEMTXATTRS_UNSPECIFIED); } /* free sglist, update byte count */ diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index 462f79a1f60..fa410a179a6 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -1150,9 +1150,9 @@ static uint16_t nvme_tx(NvmeCtrl *n, NvmeSg *sg, uint8_t *ptr, uint32_t len, uint64_t residual; if (dir == NVME_TX_DIRECTION_TO_DEVICE) { -residual = dma_buf_write(ptr, len, &sg->qsg, attrs); +dma_buf_write(ptr, len, &residual, &sg->qsg, attrs); } else { -residual = dma_buf_read(ptr, len, &sg->qsg, attrs); +dma_buf_read(ptr, len, &residual, &sg->qsg, attrs); } if (unlikely(residual)) { diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c index fe36de10a21..87101705d01 100644 --- a/hw/scsi/megasas.c +++ b/hw/scsi/megasas.c @@ -738,6 +738,7 @@ static int megasas_ctrl_get_info(MegasasState *s, MegasasCmd *cmd) size_t dcmd_size = sizeof(info); BusChild *kid; int num_pd_disks = 0; +uint64_t resid; memset(&info, 0x0, dcmd_size); if (cmd->iov_size < dcmd_size) { @@ -848,7 +849,8 @@ static int megasas_ctrl_get_info(MegasasState *s, MegasasCmd *cmd) MFI_INFO_PDMIX_SATA | MFI_INFO_PDMIX_LD); -cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg, MEMTXATTRS_UNSPECIFIED); +dma_buf_read(&info, dcmd_size, &resid, &cmd->qsg, MEMTXATTRS_UNSPECIFIED); +cmd->iov_size -= resid; return MFI_STAT_OK; } @@ -856,6 +858,7 @@ static int megasas_mfc_get_defaults(MegasasState *s, MegasasCmd *cmd) { struct mfi_defaults info; size_t dcmd_size = sizeof(struct mfi_defaults); +uint64_t resid; memset(&info, 0x0, dcmd_size); if (cmd->iov_size < dcmd_size) { @@ -878,7 +881,8 @@ static int megasas_mfc_get_defaults(MegasasState *s, MegasasCmd *cmd) info.disable_preboot_cli = 1; info.cluster_disable = 1; -cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg, MEMTXATTRS_UNSPECIFIED); 
+dma_buf_read(&info, dcmd_size, &resid, &cmd->qsg, MEMTXATTRS_UNSPECIFIED); +cmd->iov_size -= resid; return MFI_STAT_OK; } @@ -886,6 +890,7 @@ static int megasas_dcmd_get_bios_info(MegasasState *s, MegasasCmd *cmd) { struct mfi_bios_data info; size_t dcmd_size = sizeof(info); +uint64_t resid; m
[PATCH v2 10/23] dma: Let dma_buf_rw() take MemTxAttrs argument
Let devices specify transaction attributes when calling dma_buf_rw(). Keep the default MEMTXATTRS_UNSPECIFIED in the 2 callers. Reviewed-by: Klaus Jensen Signed-off-by: Philippe Mathieu-Daudé --- softmmu/dma-helpers.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c index 7f37548394e..fa81d2b386c 100644 --- a/softmmu/dma-helpers.c +++ b/softmmu/dma-helpers.c @@ -295,7 +295,7 @@ BlockAIOCB *dma_blk_write(BlockBackend *blk, static uint64_t dma_buf_rw(void *buf, int32_t len, QEMUSGList *sg, - DMADirection dir) + DMADirection dir, MemTxAttrs attrs) { uint8_t *ptr = buf; uint64_t resid; @@ -307,8 +307,7 @@ static uint64_t dma_buf_rw(void *buf, int32_t len, QEMUSGList *sg, while (len > 0) { ScatterGatherEntry entry = sg->sg[sg_cur_index++]; int32_t xfer = MIN(len, entry.len); -dma_memory_rw(sg->as, entry.base, ptr, xfer, dir, - MEMTXATTRS_UNSPECIFIED); +dma_memory_rw(sg->as, entry.base, ptr, xfer, dir, attrs); ptr += xfer; len -= xfer; resid -= xfer; @@ -319,12 +318,14 @@ static uint64_t dma_buf_rw(void *buf, int32_t len, QEMUSGList *sg, uint64_t dma_buf_read(void *ptr, int32_t len, QEMUSGList *sg) { -return dma_buf_rw(ptr, len, sg, DMA_DIRECTION_FROM_DEVICE); +return dma_buf_rw(ptr, len, sg, DMA_DIRECTION_FROM_DEVICE, + MEMTXATTRS_UNSPECIFIED); } uint64_t dma_buf_write(void *ptr, int32_t len, QEMUSGList *sg) { -return dma_buf_rw(ptr, len, sg, DMA_DIRECTION_TO_DEVICE); +return dma_buf_rw(ptr, len, sg, DMA_DIRECTION_TO_DEVICE, + MEMTXATTRS_UNSPECIFIED); } void dma_acct_start(BlockBackend *blk, BlockAcctCookie *cookie, -- 2.33.1
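The scatter-gather loop modified above walks the list entry by entry, transfers at most the entry length each time, and returns the number of list bytes left unconsumed. A simplified standalone model follows, assuming a toy byte-array memory and made-up types (`ToySgEntry`, `ToySgList`, `toy_buf_rw`); it mirrors only the control flow visible in the diff, and its transfers cannot fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the QEMU definitions. */
typedef struct { unsigned unspecified : 1; } ToyAttrs;
#define TOY_ATTRS_UNSPECIFIED ((ToyAttrs){ .unspecified = 1 })

typedef enum { TOY_TO_DEVICE = 0, TOY_FROM_DEVICE = 1 } ToyDirection;

typedef struct { size_t base; size_t len; } ToySgEntry;

typedef struct {
    uint8_t *ram;            /* toy guest memory */
    const ToySgEntry *sg;    /* scatter-gather entries */
    size_t nsg;              /* number of entries */
    size_t size;             /* sum of all entry lengths */
} ToySgList;

/* Copy between buf and the list; return the unconsumed list size. */
static inline uint64_t toy_buf_rw(void *buf, size_t len, ToySgList *sg,
                                  ToyDirection dir, ToyAttrs attrs)
{
    uint8_t *ptr = buf;
    uint64_t resid = sg->size;
    size_t cur = 0;

    (void)attrs; /* would be forwarded to each per-entry access */
    if (len > resid) {
        len = (size_t)resid;     /* never run past the end of the list */
    }
    while (len > 0) {
        assert(cur < sg->nsg);
        ToySgEntry entry = sg->sg[cur++];
        size_t xfer = len < entry.len ? len : entry.len;
        if (dir == TOY_FROM_DEVICE) {
            memcpy(sg->ram + entry.base, ptr, xfer);
        } else {
            memcpy(ptr, sg->ram + entry.base, xfer);
        }
        ptr += xfer;
        len -= xfer;
        resid -= xfer;
    }
    return resid;
}
```

A non-zero return value means the buffer was shorter than the list, which is the residual that callers of the read and write wrappers inspect.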
[PATCH v2 17/23] dma: Let st*_dma() propagate MemTxResult
dma_memory_write() returns a MemTxResult type. Do not discard it, return it to the caller. Reviewed-by: Richard Henderson Reviewed-by: Cédric Le Goater Signed-off-by: Philippe Mathieu-Daudé --- include/sysemu/dma.h | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index f3cf60d222d..725e8e90f88 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -248,13 +248,13 @@ static inline void dma_memory_unmap(AddressSpace *as, dma_memory_read(as, addr, &val, (_bits) / 8, attrs); \ return _end##_bits##_to_cpu(val); \ } \ -static inline void st##_sname##_##_end##_dma(AddressSpace *as, \ - dma_addr_t addr, \ - uint##_bits##_t val, \ - MemTxAttrs attrs) \ -{ \ -val = cpu_to_##_end##_bits(val);\ -dma_memory_write(as, addr, &val, (_bits) / 8, attrs); \ +static inline MemTxResult st##_sname##_##_end##_dma(AddressSpace *as, \ +dma_addr_t addr, \ +uint##_bits##_t val, \ +MemTxAttrs attrs) \ +{ \ +val = cpu_to_##_end##_bits(val); \ +return dma_memory_write(as, addr, &val, (_bits) / 8, attrs); \ } static inline uint8_t ldub_dma(AddressSpace *as, dma_addr_t addr, MemTxAttrs attrs) @@ -265,10 +265,10 @@ static inline uint8_t ldub_dma(AddressSpace *as, dma_addr_t addr, MemTxAttrs att return val; } -static inline void stb_dma(AddressSpace *as, dma_addr_t addr, - uint8_t val, MemTxAttrs attrs) +static inline MemTxResult stb_dma(AddressSpace *as, dma_addr_t addr, + uint8_t val, MemTxAttrs attrs) { -dma_memory_write(as, addr, &val, 1, attrs); +return dma_memory_write(as, addr, &val, 1, attrs); } DEFINE_LDST_DMA(uw, w, 16, le); -- 2.33.1
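The macro-generated store helpers in this patch convert the value to the target byte order and then return the result of the underlying write instead of dropping it. The sketch below reproduces that pattern with a token-pasting macro over a toy address space. The names (`TOY_DEFINE_ST_DMA`, `toy_stw_be_dma`, `toy_stl_be_dma`, `toy_cpu_to_be16`, `toy_cpu_to_be32`, `toy_write`) are hypothetical, and only big-endian variants are generated to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the QEMU definitions. */
typedef uint32_t ToyResult;
#define TOY_OK    0u
#define TOY_ERROR 1u

typedef struct { unsigned unspecified : 1; } ToyAttrs;
#define TOY_ATTRS_UNSPECIFIED ((ToyAttrs){ .unspecified = 1 })

typedef struct { uint8_t *ram; size_t size; } ToySpace;

/* Bounds-checked write into the toy RAM. */
static inline ToyResult toy_write(ToySpace *as, size_t addr, ToyAttrs attrs,
                                  const void *buf, size_t len)
{
    (void)attrs;
    assert(as && buf);
    if (addr > as->size || len > as->size - addr) {
        return TOY_ERROR;
    }
    memcpy(as->ram + addr, buf, len);
    return TOY_OK;
}

/* Host-independent conversion: build the big-endian byte image. */
static inline uint16_t toy_cpu_to_be16(uint16_t v)
{
    uint8_t b[2] = { (uint8_t)(v >> 8), (uint8_t)v };
    uint16_t out;
    memcpy(&out, b, sizeof(out));
    return out;
}

static inline uint32_t toy_cpu_to_be32(uint32_t v)
{
    uint8_t b[4] = { (uint8_t)(v >> 24), (uint8_t)(v >> 16),
                     (uint8_t)(v >> 8), (uint8_t)v };
    uint32_t out;
    memcpy(&out, b, sizeof(out));
    return out;
}

/* Generate toy_st<name>_be_dma(); the write result is returned as-is. */
#define TOY_DEFINE_ST_DMA(_sname, _bits)                                   \
static inline ToyResult toy_st##_sname##_be_dma(ToySpace *as, size_t addr, \
                                                uint##_bits##_t val,       \
                                                ToyAttrs attrs)            \
{                                                                          \
    val = toy_cpu_to_be##_bits(val);                                       \
    return toy_write(as, addr, attrs, &val, (_bits) / 8);                  \
}

TOY_DEFINE_ST_DMA(w, 16)
TOY_DEFINE_ST_DMA(l, 32)
```

Because the generated functions previously returned `void`, callers that ignore the new return value keep compiling, so the change can land before any caller starts checking it.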
[PATCH v2 11/23] dma: Let dma_buf_write() take MemTxAttrs argument
Let devices specify transaction attributes when calling dma_buf_write(). Keep the default MEMTXATTRS_UNSPECIFIED in the few callers. Reviewed-by: Klaus Jensen Signed-off-by: Philippe Mathieu-Daudé --- include/sysemu/dma.h | 2 +- hw/ide/ahci.c | 6 -- hw/nvme/ctrl.c| 3 ++- hw/scsi/megasas.c | 2 +- hw/scsi/scsi-bus.c| 2 +- softmmu/dma-helpers.c | 5 ++--- 6 files changed, 11 insertions(+), 9 deletions(-) diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index 0d5b836013d..e3dd74a9c4f 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -303,7 +303,7 @@ BlockAIOCB *dma_blk_write(BlockBackend *blk, QEMUSGList *sg, uint64_t offset, uint32_t align, BlockCompletionFunc *cb, void *opaque); uint64_t dma_buf_read(void *ptr, int32_t len, QEMUSGList *sg); -uint64_t dma_buf_write(void *ptr, int32_t len, QEMUSGList *sg); +uint64_t dma_buf_write(void *ptr, int32_t len, QEMUSGList *sg, MemTxAttrs attrs); void dma_acct_start(BlockBackend *blk, BlockAcctCookie *cookie, QEMUSGList *sg, enum BlockAcctType type); diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 8e77ddb660f..079d2977f23 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -1381,8 +1381,10 @@ static void ahci_pio_transfer(const IDEDMA *dma) has_sglist ? 
"" : "o"); if (has_sglist && size) { +const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED; + if (is_write) { -dma_buf_write(s->data_ptr, size, &s->sg); +dma_buf_write(s->data_ptr, size, &s->sg, attrs); } else { dma_buf_read(s->data_ptr, size, &s->sg); } @@ -1479,7 +1481,7 @@ static int ahci_dma_rw_buf(const IDEDMA *dma, bool is_write) if (is_write) { dma_buf_read(p, l, &s->sg); } else { -dma_buf_write(p, l, &s->sg); +dma_buf_write(p, l, &s->sg, MEMTXATTRS_UNSPECIFIED); } /* free sglist, update byte count */ diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index 5f573c417b3..e1a531d5d6c 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -1146,10 +1146,11 @@ static uint16_t nvme_tx(NvmeCtrl *n, NvmeSg *sg, uint8_t *ptr, uint32_t len, assert(sg->flags & NVME_SG_ALLOC); if (sg->flags & NVME_SG_DMA) { +const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED; uint64_t residual; if (dir == NVME_TX_DIRECTION_TO_DEVICE) { -residual = dma_buf_write(ptr, len, &sg->qsg); +residual = dma_buf_write(ptr, len, &sg->qsg, attrs); } else { residual = dma_buf_read(ptr, len, &sg->qsg); } diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c index dc28302f96d..da1c88167ee 100644 --- a/hw/scsi/megasas.c +++ b/hw/scsi/megasas.c @@ -1465,7 +1465,7 @@ static int megasas_dcmd_set_properties(MegasasState *s, MegasasCmd *cmd) dcmd_size); return MFI_STAT_INVALID_PARAMETER; } -dma_buf_write(&info, dcmd_size, &cmd->qsg); +dma_buf_write(&info, dcmd_size, &cmd->qsg, MEMTXATTRS_UNSPECIFIED); trace_megasas_dcmd_unsupported(cmd->index, cmd->iov_size); return MFI_STAT_OK; } diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c index 77325d8cc7a..64a506a3975 100644 --- a/hw/scsi/scsi-bus.c +++ b/hw/scsi/scsi-bus.c @@ -1423,7 +1423,7 @@ void scsi_req_data(SCSIRequest *req, int len) if (req->cmd.mode == SCSI_XFER_FROM_DEV) { req->resid = dma_buf_read(buf, len, req->sg); } else { -req->resid = dma_buf_write(buf, len, req->sg); +req->resid = dma_buf_write(buf, len, req->sg, MEMTXATTRS_UNSPECIFIED); } scsi_req_continue(req); } 
diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c index fa81d2b386c..2f1a241b81a 100644 --- a/softmmu/dma-helpers.c +++ b/softmmu/dma-helpers.c @@ -322,10 +322,9 @@ uint64_t dma_buf_read(void *ptr, int32_t len, QEMUSGList *sg) MEMTXATTRS_UNSPECIFIED); } -uint64_t dma_buf_write(void *ptr, int32_t len, QEMUSGList *sg) +uint64_t dma_buf_write(void *ptr, int32_t len, QEMUSGList *sg, MemTxAttrs attrs) { -return dma_buf_rw(ptr, len, sg, DMA_DIRECTION_TO_DEVICE, - MEMTXATTRS_UNSPECIFIED); +return dma_buf_rw(ptr, len, sg, DMA_DIRECTION_TO_DEVICE, attrs); } void dma_acct_start(BlockBackend *blk, BlockAcctCookie *cookie, -- 2.33.1
[PATCH v2 20/23] pci: Let st*_pci_dma() take MemTxAttrs argument
Let devices specify transaction attributes when calling st*_pci_dma(). Keep the default MEMTXATTRS_UNSPECIFIED in the few callers. Reviewed-by: Richard Henderson Signed-off-by: Philippe Mathieu-Daudé --- include/hw/pci/pci.h | 11 ++- hw/audio/intel-hda.c | 10 ++ hw/net/eepro100.c| 29 ++--- hw/net/tulip.c | 18 ++ hw/scsi/megasas.c| 15 ++- hw/scsi/vmw_pvscsi.c | 3 ++- 6 files changed, 52 insertions(+), 34 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 8c5f2ed5054..9f51ef2c3c2 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -859,11 +859,12 @@ static inline MemTxResult pci_dma_write(PCIDevice *dev, dma_addr_t addr, MEMTXATTRS_UNSPECIFIED); \ return val; \ } \ -static inline void st##_s##_pci_dma(PCIDevice *dev, \ -dma_addr_t addr, uint##_bits##_t val) \ -{ \ -st##_s##_dma(pci_get_address_space(dev), addr, val, \ - MEMTXATTRS_UNSPECIFIED); \ +static inline void st##_s##_pci_dma(PCIDevice *dev, \ +dma_addr_t addr, \ +uint##_bits##_t val, \ +MemTxAttrs attrs) \ +{ \ +st##_s##_dma(pci_get_address_space(dev), addr, val, attrs); \ } PCI_DMA_DEFINE_LDST(ub, b, 8); diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c index fb3d34a4a0c..3309ae0ea18 100644 --- a/hw/audio/intel-hda.c +++ b/hw/audio/intel-hda.c @@ -345,6 +345,7 @@ static void intel_hda_corb_run(IntelHDAState *d) static void intel_hda_response(HDACodecDevice *dev, bool solicited, uint32_t response) { +const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED; HDACodecBus *bus = HDA_BUS(dev->qdev.parent_bus); IntelHDAState *d = container_of(bus, IntelHDAState, codecs); hwaddr addr; @@ -367,8 +368,8 @@ static void intel_hda_response(HDACodecDevice *dev, bool solicited, uint32_t res ex = (solicited ? 
0 : (1 << 4)) | dev->cad; wp = (d->rirb_wp + 1) & 0xff; addr = intel_hda_addr(d->rirb_lbase, d->rirb_ubase); -stl_le_pci_dma(&d->pci, addr + 8*wp, response); -stl_le_pci_dma(&d->pci, addr + 8*wp + 4, ex); +stl_le_pci_dma(&d->pci, addr + 8 * wp, response, attrs); +stl_le_pci_dma(&d->pci, addr + 8 * wp + 4, ex, attrs); d->rirb_wp = wp; dprint(d, 2, "%s: [wp 0x%x] response 0x%x, extra 0x%x\n", @@ -394,6 +395,7 @@ static void intel_hda_response(HDACodecDevice *dev, bool solicited, uint32_t res static bool intel_hda_xfer(HDACodecDevice *dev, uint32_t stnr, bool output, uint8_t *buf, uint32_t len) { +const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED; HDACodecBus *bus = HDA_BUS(dev->qdev.parent_bus); IntelHDAState *d = container_of(bus, IntelHDAState, codecs); hwaddr addr; @@ -428,7 +430,7 @@ static bool intel_hda_xfer(HDACodecDevice *dev, uint32_t stnr, bool output, st->be, st->bp, st->bpl[st->be].len, copy); pci_dma_rw(&d->pci, st->bpl[st->be].addr + st->bp, buf, copy, !output, - MEMTXATTRS_UNSPECIFIED); + attrs); st->lpib += copy; st->bp += copy; buf += copy; @@ -451,7 +453,7 @@ static bool intel_hda_xfer(HDACodecDevice *dev, uint32_t stnr, bool output, if (d->dp_lbase & 0x01) { s = st - d->st; addr = intel_hda_addr(d->dp_lbase & ~0x01, d->dp_ubase); -stl_le_pci_dma(&d->pci, addr + 8*s, st->lpib); +stl_le_pci_dma(&d->pci, addr + 8 * s, st->lpib, attrs); } dprint(d, 3, "dma: --\n"); diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c index 16e95ef9cc9..83c4431b1ad 100644 --- a/hw/net/eepro100.c +++ b/hw/net/eepro100.c @@ -700,6 +700,8 @@ static void set_ru_state(EEPRO100State * s, ru_state_t state) static void dump_statistics(EEPRO100State * s) { +const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED; + /* Dump statistical data. Most data is never changed by the emulation * and always 0, so we first just copy the whole block and then those * values which really matter. 
@@ -707,16 +709,18 @@ static void dump_statistics(EEPRO100State * s) */ pci_dma_write(&s->dev, s->statsaddr, &s->statistics, s->stats_size); stl_le_pci_dma(&s->dev, s->statsaddr + 0, - s->statistics.tx_good_frames); + s->statistics.tx_good_frames, attrs); stl_le_pci_dma(&s->dev, s->statsaddr + 36, - s->statistics.rx_good_frames); + s->statistics.rx_good_frames, attrs); stl_le_pci_dma(&s->dev, s->statsaddr + 48, - s->statistics.rx_resource_errors); + s->statistics.rx_reso
[PATCH v2 21/23] pci: Let ld*_pci_dma() take MemTxAttrs argument
Let devices specify transaction attributes when calling ld*_pci_dma(). Keep the default MEMTXATTRS_UNSPECIFIED in the few callers. Reviewed-by: Richard Henderson Signed-off-by: Philippe Mathieu-Daudé --- include/hw/pci/pci.h | 6 +++--- hw/audio/intel-hda.c | 2 +- hw/net/eepro100.c| 19 +-- hw/net/tulip.c | 18 ++ hw/scsi/megasas.c| 16 ++-- hw/scsi/mptsas.c | 10 ++ hw/scsi/vmw_pvscsi.c | 3 ++- hw/usb/hcd-xhci.c| 1 + 8 files changed, 46 insertions(+), 29 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 9f51ef2c3c2..7a46c1fa226 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -852,11 +852,11 @@ static inline MemTxResult pci_dma_write(PCIDevice *dev, dma_addr_t addr, #define PCI_DMA_DEFINE_LDST(_l, _s, _bits) \ static inline uint##_bits##_t ld##_l##_pci_dma(PCIDevice *dev, \ - dma_addr_t addr) \ + dma_addr_t addr, \ + MemTxAttrs attrs) \ { \ uint##_bits##_t val; \ -ld##_l##_dma(pci_get_address_space(dev), addr, &val, \ - MEMTXATTRS_UNSPECIFIED); \ +ld##_l##_dma(pci_get_address_space(dev), addr, &val, attrs); \ return val; \ } \ static inline void st##_s##_pci_dma(PCIDevice *dev, \ diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c index 3309ae0ea18..e34b7ab0e92 100644 --- a/hw/audio/intel-hda.c +++ b/hw/audio/intel-hda.c @@ -335,7 +335,7 @@ static void intel_hda_corb_run(IntelHDAState *d) rp = (d->corb_rp + 1) & 0xff; addr = intel_hda_addr(d->corb_lbase, d->corb_ubase); -verb = ldl_le_pci_dma(&d->pci, addr + 4*rp); +verb = ldl_le_pci_dma(&d->pci, addr + 4 * rp, MEMTXATTRS_UNSPECIFIED); d->corb_rp = rp; dprint(d, 2, "%s: [rp 0x%x] verb 0x%08x\n", __func__, rp, verb); diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c index 83c4431b1ad..eb82e9cb118 100644 --- a/hw/net/eepro100.c +++ b/hw/net/eepro100.c @@ -737,6 +737,7 @@ static void read_cb(EEPRO100State *s) static void tx_command(EEPRO100State *s) { +const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED; uint32_t tbd_array = s->tx.tbd_array_addr; uint16_t tcb_bytes = 
s->tx.tcb_bytes & 0x3fff; /* Sends larger than MAX_ETH_FRAME_SIZE are allowed, up to 2600 bytes. */ @@ -772,11 +773,14 @@ static void tx_command(EEPRO100State *s) /* Extended Flexible TCB. */ for (; tbd_count < 2; tbd_count++) { uint32_t tx_buffer_address = ldl_le_pci_dma(&s->dev, -tbd_address); +tbd_address, +attrs); uint16_t tx_buffer_size = lduw_le_pci_dma(&s->dev, - tbd_address + 4); + tbd_address + 4, + attrs); uint16_t tx_buffer_el = lduw_le_pci_dma(&s->dev, -tbd_address + 6); +tbd_address + 6, +attrs); tbd_address += 8; TRACE(RXTX, logout ("TBD (extended flexible mode): buffer address 0x%08x, size 0x%04x\n", @@ -792,9 +796,12 @@ static void tx_command(EEPRO100State *s) } tbd_address = tbd_array; for (; tbd_count < s->tx.tbd_count; tbd_count++) { -uint32_t tx_buffer_address = ldl_le_pci_dma(&s->dev, tbd_address); -uint16_t tx_buffer_size = lduw_le_pci_dma(&s->dev, tbd_address + 4); -uint16_t tx_buffer_el = lduw_le_pci_dma(&s->dev, tbd_address + 6); +uint32_t tx_buffer_address = ldl_le_pci_dma(&s->dev, tbd_address, +attrs); +uint16_t tx_buffer_size = lduw_le_pci_dma(&s->dev, tbd_address + 4, + attrs); +uint16_t tx_buffer_el = lduw_le_pci_dma(&s->dev, tbd_address + 6, +attrs); tbd_address += 8; TRACE(RXTX, logout ("TBD (flexible mode): buffer address 0x%08x, size 0x%04x\n", diff --git a/hw/net/tulip.c b/hw/net/tulip.c index 1f2c79dd58b..c76e4868f73 100644 --- a/hw/net/tulip.c +++ b/hw/net/tulip.c @@ -70,16 +70,18 @@ static const VMStateDescription vmstate_pci
[PATCH v2 12/23] dma: Let dma_buf_read() take MemTxAttrs argument
Let devices specify transaction attributes when calling dma_buf_read(). Keep the default MEMTXATTRS_UNSPECIFIED in the few callers. Reviewed-by: Klaus Jensen Signed-off-by: Philippe Mathieu-Daudé --- include/sysemu/dma.h | 2 +- hw/ide/ahci.c | 4 ++-- hw/nvme/ctrl.c| 2 +- hw/scsi/megasas.c | 24 hw/scsi/scsi-bus.c| 2 +- softmmu/dma-helpers.c | 5 ++--- 6 files changed, 19 insertions(+), 20 deletions(-) diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index e3dd74a9c4f..fd8f16003dd 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -302,7 +302,7 @@ BlockAIOCB *dma_blk_read(BlockBackend *blk, BlockAIOCB *dma_blk_write(BlockBackend *blk, QEMUSGList *sg, uint64_t offset, uint32_t align, BlockCompletionFunc *cb, void *opaque); -uint64_t dma_buf_read(void *ptr, int32_t len, QEMUSGList *sg); +uint64_t dma_buf_read(void *ptr, int32_t len, QEMUSGList *sg, MemTxAttrs attrs); uint64_t dma_buf_write(void *ptr, int32_t len, QEMUSGList *sg, MemTxAttrs attrs); void dma_acct_start(BlockBackend *blk, BlockAcctCookie *cookie, diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index 079d2977f23..205dfdc6622 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -1386,7 +1386,7 @@ static void ahci_pio_transfer(const IDEDMA *dma) if (is_write) { dma_buf_write(s->data_ptr, size, &s->sg, attrs); } else { -dma_buf_read(s->data_ptr, size, &s->sg); +dma_buf_read(s->data_ptr, size, &s->sg, attrs); } } @@ -1479,7 +1479,7 @@ static int ahci_dma_rw_buf(const IDEDMA *dma, bool is_write) } if (is_write) { -dma_buf_read(p, l, &s->sg); +dma_buf_read(p, l, &s->sg, MEMTXATTRS_UNSPECIFIED); } else { dma_buf_write(p, l, &s->sg, MEMTXATTRS_UNSPECIFIED); } diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index e1a531d5d6c..462f79a1f60 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -1152,7 +1152,7 @@ static uint16_t nvme_tx(NvmeCtrl *n, NvmeSg *sg, uint8_t *ptr, uint32_t len, if (dir == NVME_TX_DIRECTION_TO_DEVICE) { residual = dma_buf_write(ptr, len, &sg->qsg, attrs); } else { -residual = 
dma_buf_read(ptr, len, &sg->qsg); +residual = dma_buf_read(ptr, len, &sg->qsg, attrs); } if (unlikely(residual)) { diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c index da1c88167ee..fe36de10a21 100644 --- a/hw/scsi/megasas.c +++ b/hw/scsi/megasas.c @@ -848,7 +848,7 @@ static int megasas_ctrl_get_info(MegasasState *s, MegasasCmd *cmd) MFI_INFO_PDMIX_SATA | MFI_INFO_PDMIX_LD); -cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg, MEMTXATTRS_UNSPECIFIED); return MFI_STAT_OK; } @@ -878,7 +878,7 @@ static int megasas_mfc_get_defaults(MegasasState *s, MegasasCmd *cmd) info.disable_preboot_cli = 1; info.cluster_disable = 1; -cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg, MEMTXATTRS_UNSPECIFIED); return MFI_STAT_OK; } @@ -899,7 +899,7 @@ static int megasas_dcmd_get_bios_info(MegasasState *s, MegasasCmd *cmd) info.expose_all_drives = 1; } -cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg, MEMTXATTRS_UNSPECIFIED); return MFI_STAT_OK; } @@ -910,7 +910,7 @@ static int megasas_dcmd_get_fw_time(MegasasState *s, MegasasCmd *cmd) fw_time = cpu_to_le64(megasas_fw_time()); -cmd->iov_size -= dma_buf_read(&fw_time, dcmd_size, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&fw_time, dcmd_size, &cmd->qsg, MEMTXATTRS_UNSPECIFIED); return MFI_STAT_OK; } @@ -937,7 +937,7 @@ static int megasas_event_info(MegasasState *s, MegasasCmd *cmd) info.shutdown_seq_num = cpu_to_le32(s->shutdown_event); info.boot_seq_num = cpu_to_le32(s->boot_event); -cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&info, dcmd_size, &cmd->qsg, MEMTXATTRS_UNSPECIFIED); return MFI_STAT_OK; } @@ -1006,7 +1006,7 @@ static int megasas_dcmd_pd_get_list(MegasasState *s, MegasasCmd *cmd) info.size = cpu_to_le32(offset); info.count = cpu_to_le32(num_pd_disks); -cmd->iov_size -= 
dma_buf_read(&info, offset, &cmd->qsg); +cmd->iov_size -= dma_buf_read(&info, offset, &cmd->qsg, MEMTXATTRS_UNSPECIFIED); return MFI_STAT_OK; } @@ -1100,7 +1100,7 @@ static int megasas_pd_get_info_submit(SCSIDevice *sdev, int lun, info->connected_port_bitmap = 0x1; info->device_speed = 1; info->link_speed = 1; -resid = dma_buf_read(cmd->iov_buf, dcmd_size, &cmd->qsg); +resid = dma_buf_read(cmd->iov_buf, dcmd_s
[PATCH v3 kvm/queue 02/16] mm/memfd: Introduce MFD_INACCESSIBLE flag
Introduce a new memfd_create() flag indicating that the content of the created memfd is inaccessible from userspace. It does this by force-setting the F_SEAL_INACCESSIBLE seal when the file is created. It also sets F_SEAL_SEAL to prevent future sealing, which means it cannot coexist with MFD_ALLOW_SEALING. Signed-off-by: Chao Peng --- include/uapi/linux/memfd.h | 1 + mm/memfd.c | 12 +++- 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/memfd.h b/include/uapi/linux/memfd.h index 7a8a26751c23..48750474b904 100644 --- a/include/uapi/linux/memfd.h +++ b/include/uapi/linux/memfd.h @@ -8,6 +8,7 @@ #define MFD_CLOEXEC 0x0001U #define MFD_ALLOW_SEALING 0x0002U #define MFD_HUGETLB 0x0004U +#define MFD_INACCESSIBLE 0x0008U /* * Huge page size encoding when MFD_HUGETLB is specified, and a huge page diff --git a/mm/memfd.c b/mm/memfd.c index 9f80f162791a..c898a007fb76 100644 --- a/mm/memfd.c +++ b/mm/memfd.c @@ -245,7 +245,8 @@ long memfd_fcntl(struct file *file, unsigned int cmd, unsigned long arg) #define MFD_NAME_PREFIX_LEN (sizeof(MFD_NAME_PREFIX) - 1) #define MFD_NAME_MAX_LEN (NAME_MAX - MFD_NAME_PREFIX_LEN) -#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB) +#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB | \ + MFD_INACCESSIBLE) SYSCALL_DEFINE2(memfd_create, const char __user *, uname, @@ -267,6 +268,10 @@ SYSCALL_DEFINE2(memfd_create, return -EINVAL; } + /* Disallow sealing when MFD_INACCESSIBLE is set. */ + if (flags & MFD_INACCESSIBLE && flags & MFD_ALLOW_SEALING) + return -EINVAL; + /* length includes terminating zero */ len = strnlen_user(uname, MFD_NAME_MAX_LEN + 1); if (len <= 0) @@ -315,6 +320,11 @@ SYSCALL_DEFINE2(memfd_create, *file_seals &= ~F_SEAL_SEAL; } + if (flags & MFD_INACCESSIBLE) { + file_seals = memfd_file_seals_ptr(file); + *file_seals &= F_SEAL_SEAL | F_SEAL_INACCESSIBLE; + } + fd_install(fd, file); kfree(name); return fd; -- 2.17.1
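For anyone reading along without a kernel tree at hand, here is a small userspace model of the flag handling this patch describes. It is a sketch only, not kernel code: the `toy_*` helpers are made up for illustration, the MFD_HUGETLB page-size encoding bits are ignored, and MFD_INACCESSIBLE / F_SEAL_INACCESSIBLE exist only in this proposed series (the seal value is taken from patch 01/16).

```c
#include <assert.h>
#include <errno.h>

/* Values as used in this series. MFD_INACCESSIBLE and F_SEAL_INACCESSIBLE
 * are proposals from these patches, not part of a released uapi. */
#define MFD_CLOEXEC          0x0001U
#define MFD_ALLOW_SEALING    0x0002U
#define MFD_HUGETLB          0x0004U
#define MFD_INACCESSIBLE     0x0008U
#define MFD_ALL_FLAGS        (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB | \
                              MFD_INACCESSIBLE)

#define F_SEAL_SEAL          0x0001U
#define F_SEAL_INACCESSIBLE  0x0020U

/* Hypothetical helper: mirrors the flag checks described for
 * memfd_create(). Huge page size encoding bits are not modelled. */
int toy_memfd_check_flags(unsigned int flags)
{
    if (flags & ~MFD_ALL_FLAGS) {
        return -EINVAL;
    }
    /* An inaccessible memfd is sealed against further sealing, so asking
     * for MFD_ALLOW_SEALING at the same time is contradictory. */
    if ((flags & MFD_INACCESSIBLE) && (flags & MFD_ALLOW_SEALING)) {
        return -EINVAL;
    }
    return 0;
}

/* Hypothetical helper: the seals a new file ends up with. Assumes a fresh
 * file starts with only F_SEAL_SEAL set. */
unsigned int toy_memfd_initial_seals(unsigned int flags)
{
    unsigned int seals = F_SEAL_SEAL;

    assert(toy_memfd_check_flags(flags) == 0);
    if (flags & MFD_ALLOW_SEALING) {
        seals &= ~F_SEAL_SEAL;
    }
    if (flags & MFD_INACCESSIBLE) {
        seals |= F_SEAL_SEAL | F_SEAL_INACCESSIBLE;
    }
    return seals;
}
```

Note that the sketch ORs the two seals in, which is what the description asks for ("force-setting"). The hunk as posted combines them with `&=`, which on its own cannot set a bit that was clear, so that line may deserve a second look in review.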
[PATCH v2 22/23] pci: Let st*_pci_dma() propagate MemTxResult
st*_dma() returns a MemTxResult type. Do not discard it, return it to the caller. Reviewed-by: Richard Henderson Signed-off-by: Philippe Mathieu-Daudé --- include/hw/pci/pci.h | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 7a46c1fa226..c90cecc85c0 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -859,12 +859,12 @@ static inline MemTxResult pci_dma_write(PCIDevice *dev, dma_addr_t addr, ld##_l##_dma(pci_get_address_space(dev), addr, &val, attrs); \ return val; \ } \ -static inline void st##_s##_pci_dma(PCIDevice *dev, \ -dma_addr_t addr, \ -uint##_bits##_t val, \ -MemTxAttrs attrs) \ +static inline MemTxResult st##_s##_pci_dma(PCIDevice *dev, \ + dma_addr_t addr, \ + uint##_bits##_t val, \ + MemTxAttrs attrs) \ { \ -st##_s##_dma(pci_get_address_space(dev), addr, val, attrs); \ +return st##_s##_dma(pci_get_address_space(dev), addr, val, attrs); \ } PCI_DMA_DEFINE_LDST(ub, b, 8); -- 2.33.1
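To illustrate what "return it to the caller" buys a device model, here is a standalone sketch of a two-layer store helper. It is not QEMU code: `ToyAddressSpace`, `ToyPCIDevice` and the `toy_*` functions are invented stand-ins, and only the `MemTxResult` naming follows QEMU.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for QEMU's MemTxResult codes. */
typedef uint32_t MemTxResult;
#define MEMTX_OK            0U
#define MEMTX_DECODE_ERROR  (1U << 1)

/* Hypothetical "address space": a flat byte array with a fixed size. */
typedef struct {
    uint8_t *ram;
    uint64_t size;
} ToyAddressSpace;

/* Hypothetical device that owns an address space, like a PCIDevice. */
typedef struct {
    ToyAddressSpace *as;
} ToyPCIDevice;

/* Lower layer: a 32-bit little-endian store that can fail. */
MemTxResult toy_stl_le_dma(ToyAddressSpace *as, uint64_t addr, uint32_t val)
{
    uint8_t bytes[4] = {
        (uint8_t)val, (uint8_t)(val >> 8),
        (uint8_t)(val >> 16), (uint8_t)(val >> 24),
    };

    if (addr > as->size || as->size - addr < sizeof(bytes)) {
        return MEMTX_DECODE_ERROR;      /* nothing is mapped there */
    }
    memcpy(as->ram + addr, bytes, sizeof(bytes));
    return MEMTX_OK;
}

/* Upper layer: returns the lower layer's result instead of dropping it. */
MemTxResult toy_stl_le_pci_dma(ToyPCIDevice *dev, uint64_t addr, uint32_t val)
{
    assert(dev && dev->as);
    return toy_stl_le_dma(dev->as, addr, val);
}
```

With a `void` wrapper the failed store would be silent; with the result propagated, a caller can compare against `MEMTX_OK` and, for example, raise a device error instead of continuing.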
[PATCH v2 13/23] dma: Let dma_buf_rw() propagate MemTxResult
dma_memory_rw() returns a MemTxResult type. Do not discard it, return it to the caller. Since dma_buf_rw() was previously returning the QEMUSGList size not consumed, add an extra argument where this size can be stored. Update the 2 callers. Reviewed-by: Klaus Jensen Signed-off-by: Philippe Mathieu-Daudé --- softmmu/dma-helpers.c | 25 +++-- 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c index a391773c296..b0be1564797 100644 --- a/softmmu/dma-helpers.c +++ b/softmmu/dma-helpers.c @@ -294,12 +294,14 @@ BlockAIOCB *dma_blk_write(BlockBackend *blk, } -static uint64_t dma_buf_rw(void *buf, int32_t len, QEMUSGList *sg, - DMADirection dir, MemTxAttrs attrs) +static MemTxResult dma_buf_rw(void *buf, int32_t len, uint64_t *residp, + QEMUSGList *sg, DMADirection dir, + MemTxAttrs attrs) { uint8_t *ptr = buf; uint64_t resid; int sg_cur_index; +MemTxResult res = MEMTX_OK; resid = sg->size; sg_cur_index = 0; @@ -307,23 +309,34 @@ static uint64_t dma_buf_rw(void *buf, int32_t len, QEMUSGList *sg, while (len > 0) { ScatterGatherEntry entry = sg->sg[sg_cur_index++]; int32_t xfer = MIN(len, entry.len); -dma_memory_rw(sg->as, entry.base, ptr, xfer, dir, attrs); +res |= dma_memory_rw(sg->as, entry.base, ptr, xfer, dir, attrs); ptr += xfer; len -= xfer; resid -= xfer; } -return resid; +if (residp) { +*residp = resid; +} +return res; } uint64_t dma_buf_read(void *ptr, int32_t len, QEMUSGList *sg, MemTxAttrs attrs) { -return dma_buf_rw(ptr, len, sg, DMA_DIRECTION_FROM_DEVICE, attrs); +uint64_t resid; + +dma_buf_rw(ptr, len, &resid, sg, DMA_DIRECTION_FROM_DEVICE, attrs); + +return resid; } uint64_t dma_buf_write(void *ptr, int32_t len, QEMUSGList *sg, MemTxAttrs attrs) { -return dma_buf_rw(ptr, len, sg, DMA_DIRECTION_TO_DEVICE, attrs); +uint64_t resid; + +dma_buf_rw(ptr, len, &resid, sg, DMA_DIRECTION_TO_DEVICE, attrs); + +return resid; } void dma_acct_start(BlockBackend *blk, BlockAcctCookie *cookie, -- 2.33.1
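A standalone model of the reworked loop may help when reviewing the new calling convention: the status becomes the return value and the residual moves to an optional out-parameter. This is a sketch under stated assumptions, not the QEMU implementation: `ToySGEntry`, `ToySGList` and the `toy_*` functions are hypothetical, only the write direction is modelled, and the bound on the entry index is an addition of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for QEMU's MemTxResult codes. */
typedef uint32_t MemTxResult;
#define MEMTX_OK     0U
#define MEMTX_ERROR  (1U << 0)

/* Hypothetical scatter/gather entry: a host buffer standing in for a guest
 * region, plus a flag that makes accesses to it fail. */
typedef struct {
    uint8_t *base;
    int32_t len;
    int faulty;
} ToySGEntry;

typedef struct {
    ToySGEntry *sg;
    int nsg;
    uint64_t size;      /* sum of all entry lengths */
} ToySGList;

/* Stand-in for dma_memory_rw(): copy into the entry unless it is faulty. */
static MemTxResult toy_memory_write(ToySGEntry *e, const uint8_t *src,
                                    int32_t len)
{
    if (e->faulty) {
        return MEMTX_ERROR;
    }
    memcpy(e->base, src, (size_t)len);
    return MEMTX_OK;
}

/* Same shape as the reworked dma_buf_rw(): the status is the return value,
 * the unconsumed list size goes through the optional residp pointer. */
MemTxResult toy_dma_buf_write(const void *buf, int32_t len, uint64_t *residp,
                              ToySGList *sg)
{
    const uint8_t *ptr = buf;
    uint64_t resid;
    MemTxResult res = MEMTX_OK;
    int i = 0;

    assert(buf && sg);
    resid = sg->size;

    /* The index bound is an addition of this sketch. */
    while (len > 0 && i < sg->nsg) {
        ToySGEntry *e = &sg->sg[i++];
        int32_t xfer = len < e->len ? len : e->len;

        /* Keep walking the list, but remember any failure. */
        res |= toy_memory_write(e, ptr, xfer);
        ptr += xfer;
        len -= xfer;
        resid -= (uint64_t)xfer;
    }
    if (residp) {
        *residp = resid;
    }
    return res;
}
```

OR-ing the per-entry results means one failing entry taints the whole transfer while the residual is still computed as before, which matches how the two wrappers in the patch keep returning only `resid`.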
[PATCH v3 kvm/queue 03/16] mm/memfd: Introduce MEMFD_OPS
From: "Kirill A. Shutemov"

This patch introduces a new MEMFD_OPS facility around files created by memfd_create(). It allows a third kernel component to make use of memory tracked in a memfd and to be notified when memory in the file is allocated or invalidated. KVM will use it to take a memfd file descriptor as the guest memory backend, and will use MEMFD_OPS to interact with the memfd subsystem. In the future there might be other consumers (e.g. VFIO with encrypted device memory).

It consists of two sets of callbacks:
- memfd_falloc_notifier: callbacks provided by KVM and called by memfd when memory gets allocated/invalidated through fallocate().
- memfd_pfn_ops: callbacks provided by memfd and called by KVM to request a memory page from memfd.

Locking is needed for the above callbacks to prevent race conditions:
- get_owner/put_owner ensure, through a reference mechanism, that the owner is still alive in the invalidate_page_range/fallocate callback handlers.
- The page is locked between get_lock_pfn/put_unlock_pfn to ensure the pfn is still valid while it is used (e.g. when the KVM page fault handler uses it to establish the mapping in the secondary MMU page tables).

Userspace is in charge of the guest memory lifecycle: it can allocate memory with fallocate() or punch a hole to free memory from the guest.

The file descriptor is passed down to KVM as the guest memory backend. KVM registers itself as the owner of the memfd via memfd_register_falloc_notifier() and provides the memfd_falloc_notifier callbacks that need to be called on fallocate() and hole punching.

memfd_register_falloc_notifier() returns the memfd_pfn_ops callbacks that KVM must use to request a new page.

At this time only shmem is supported.

Signed-off-by: Kirill A.
Shutemov Signed-off-by: Chao Peng --- include/linux/memfd.h| 22 ++ include/linux/shmem_fs.h | 16 mm/Kconfig | 4 + mm/memfd.c | 21 ++ mm/shmem.c | 158 +++ 5 files changed, 221 insertions(+) diff --git a/include/linux/memfd.h b/include/linux/memfd.h index 4f1600413f91..0007073b53dc 100644 --- a/include/linux/memfd.h +++ b/include/linux/memfd.h @@ -13,4 +13,26 @@ static inline long memfd_fcntl(struct file *f, unsigned int c, unsigned long a) } #endif +#ifdef CONFIG_MEMFD_OPS +struct memfd_falloc_notifier { + void (*invalidate_page_range)(struct inode *inode, void *owner, + pgoff_t start, pgoff_t end); + void (*fallocate)(struct inode *inode, void *owner, + pgoff_t start, pgoff_t end); + bool (*get_owner)(void *owner); + void (*put_owner)(void *owner); +}; + +struct memfd_pfn_ops { + long (*get_lock_pfn)(struct inode *inode, pgoff_t offset, int *order); + void (*put_unlock_pfn)(unsigned long pfn); + +}; + +extern int memfd_register_falloc_notifier(struct inode *inode, void *owner, + const struct memfd_falloc_notifier *notifier, + const struct memfd_pfn_ops **pfn_ops); +extern void memfd_unregister_falloc_notifier(struct inode *inode); +#endif + #endif /* __LINUX_MEMFD_H */ diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h index 166158b6e917..503adc63728c 100644 --- a/include/linux/shmem_fs.h +++ b/include/linux/shmem_fs.h @@ -12,6 +12,11 @@ /* inode in-kernel data */ +#ifdef CONFIG_MEMFD_OPS +struct memfd_falloc_notifier; +struct memfd_pfn_ops; +#endif + struct shmem_inode_info { spinlock_t lock; unsigned intseals; /* shmem seals */ @@ -24,6 +29,10 @@ struct shmem_inode_info { struct shared_policypolicy; /* NUMA memory alloc policy */ struct simple_xattrsxattrs; /* list of xattrs */ atomic_tstop_eviction; /* hold when working on inode */ +#ifdef CONFIG_MEMFD_OPS + void*owner; + const struct memfd_falloc_notifier *falloc_notifier; +#endif struct inodevfs_inode; }; @@ -96,6 +105,13 @@ extern unsigned long shmem_swap_usage(struct vm_area_struct *vma); 
extern unsigned long shmem_partial_swap_usage(struct address_space *mapping, pgoff_t start, pgoff_t end); +#ifdef CONFIG_MEMFD_OPS +extern int shmem_register_falloc_notifier(struct inode *inode, void *owner, + const struct memfd_falloc_notifier *notifier, + const struct memfd_pfn_ops **pfn_ops); +extern void shmem_unregister_falloc_notifier(struct inode *inode); +#endif + /* Flag allocation requirements to shmem_getpage */ enum sgp_type { SGP_READ, /* don't exceed i_size, don't allocate page */ diff --git a/mm/Kconfig b/mm/Kconfig index 28edafc820ad..9989904d1b56 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -900,6 +900,1
[PATCH v2 23/23] pci: Let ld*_pci_dma() propagate MemTxResult
ld*_dma() returns a MemTxResult type. Do not discard it, return it to the caller. Update the few callers. Reviewed-by: Richard Henderson Signed-off-by: Philippe Mathieu-Daudé --- include/hw/pci/pci.h | 17 - hw/audio/intel-hda.c | 2 +- hw/net/eepro100.c| 25 ++--- hw/net/tulip.c | 16 hw/scsi/megasas.c| 21 - hw/scsi/mptsas.c | 16 +++- hw/scsi/vmw_pvscsi.c | 16 ++-- 7 files changed, 60 insertions(+), 53 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index c90cecc85c0..5b36334a28a 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -850,15 +850,14 @@ static inline MemTxResult pci_dma_write(PCIDevice *dev, dma_addr_t addr, DMA_DIRECTION_FROM_DEVICE, MEMTXATTRS_UNSPECIFIED); } -#define PCI_DMA_DEFINE_LDST(_l, _s, _bits) \ -static inline uint##_bits##_t ld##_l##_pci_dma(PCIDevice *dev, \ - dma_addr_t addr, \ - MemTxAttrs attrs) \ -{ \ -uint##_bits##_t val; \ -ld##_l##_dma(pci_get_address_space(dev), addr, &val, attrs); \ -return val; \ -} \ +#define PCI_DMA_DEFINE_LDST(_l, _s, _bits) \ +static inline MemTxResult ld##_l##_pci_dma(PCIDevice *dev, \ + dma_addr_t addr, \ + uint##_bits##_t *val, \ + MemTxAttrs attrs) \ +{ \ +return ld##_l##_dma(pci_get_address_space(dev), addr, val, attrs); \ +} \ static inline MemTxResult st##_s##_pci_dma(PCIDevice *dev, \ dma_addr_t addr, \ uint##_bits##_t val, \ diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c index e34b7ab0e92..2b55d521503 100644 --- a/hw/audio/intel-hda.c +++ b/hw/audio/intel-hda.c @@ -335,7 +335,7 @@ static void intel_hda_corb_run(IntelHDAState *d) rp = (d->corb_rp + 1) & 0xff; addr = intel_hda_addr(d->corb_lbase, d->corb_ubase); -verb = ldl_le_pci_dma(&d->pci, addr + 4 * rp, MEMTXATTRS_UNSPECIFIED); +ldl_le_pci_dma(&d->pci, addr + 4 * rp, &verb, MEMTXATTRS_UNSPECIFIED); d->corb_rp = rp; dprint(d, 2, "%s: [rp 0x%x] verb 0x%08x\n", __func__, rp, verb); diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c index eb82e9cb118..679f52f80f1 100644 --- a/hw/net/eepro100.c +++ 
b/hw/net/eepro100.c @@ -769,18 +769,16 @@ static void tx_command(EEPRO100State *s) } else { /* Flexible mode. */ uint8_t tbd_count = 0; +uint32_t tx_buffer_address; +uint16_t tx_buffer_size; +uint16_t tx_buffer_el; + if (s->has_extended_tcb_support && !(s->configuration[6] & BIT(4))) { /* Extended Flexible TCB. */ for (; tbd_count < 2; tbd_count++) { -uint32_t tx_buffer_address = ldl_le_pci_dma(&s->dev, -tbd_address, -attrs); -uint16_t tx_buffer_size = lduw_le_pci_dma(&s->dev, - tbd_address + 4, - attrs); -uint16_t tx_buffer_el = lduw_le_pci_dma(&s->dev, -tbd_address + 6, -attrs); +ldl_le_pci_dma(&s->dev, tbd_address, &tx_buffer_address, attrs); +lduw_le_pci_dma(&s->dev, tbd_address + 4, &tx_buffer_size, attrs); +lduw_le_pci_dma(&s->dev, tbd_address + 6, &tx_buffer_el, attrs); tbd_address += 8; TRACE(RXTX, logout ("TBD (extended flexible mode): buffer address 0x%08x, size 0x%04x\n", @@ -796,12 +794,9 @@ static void tx_command(EEPRO100State *s) } tbd_address = tbd_array; for (; tbd_count < s->tx.tbd_count; tbd_count++) { -uint32_t tx_buffer_address = ldl_le_pci_dma(&s->dev, tbd_address, -attrs); -uint16_t tx_buffer_size = lduw_le_pci_dma(&s->dev, tbd_address + 4, - attrs); -uint16_t tx_buffer_el = lduw_le_pci_dma(&s->dev, tbd_address + 6, -attrs); +ldl_le_pci_dma(&s->dev, tbd_address, &tx_buffer_address, attrs); +lduw_le_pci_dma(&s->dev, tbd_address + 4, &tx_buffer_size, attrs); +
[PATCH v2 15/23] dma: Let st*_dma() take MemTxAttrs argument
Let devices specify transaction attributes when calling st*_dma(). Keep the default MEMTXATTRS_UNSPECIFIED in the few callers. Reviewed-by: Richard Henderson Reviewed-by: Cédric Le Goater Signed-off-by: Philippe Mathieu-Daudé --- include/hw/pci/pci.h | 3 ++- include/hw/ppc/spapr_vio.h | 12 include/sysemu/dma.h | 10 ++ hw/nvram/fw_cfg.c | 4 ++-- 4 files changed, 18 insertions(+), 11 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index a751ab5a75d..d07e9707b48 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -859,7 +859,8 @@ static inline MemTxResult pci_dma_write(PCIDevice *dev, dma_addr_t addr, static inline void st##_s##_pci_dma(PCIDevice *dev, \ dma_addr_t addr, uint##_bits##_t val) \ { \ -st##_s##_dma(pci_get_address_space(dev), addr, val);\ +st##_s##_dma(pci_get_address_space(dev), addr, val, \ + MEMTXATTRS_UNSPECIFIED); \ } PCI_DMA_DEFINE_LDST(ub, b, 8); diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h index 5d2ea8e6656..e87f8e6f596 100644 --- a/include/hw/ppc/spapr_vio.h +++ b/include/hw/ppc/spapr_vio.h @@ -118,10 +118,14 @@ static inline int spapr_vio_dma_set(SpaprVioDevice *dev, uint64_t taddr, H_DEST_PARM : H_SUCCESS; } -#define vio_stb(_dev, _addr, _val) (stb_dma(&(_dev)->as, (_addr), (_val))) -#define vio_sth(_dev, _addr, _val) (stw_be_dma(&(_dev)->as, (_addr), (_val))) -#define vio_stl(_dev, _addr, _val) (stl_be_dma(&(_dev)->as, (_addr), (_val))) -#define vio_stq(_dev, _addr, _val) (stq_be_dma(&(_dev)->as, (_addr), (_val))) +#define vio_stb(_dev, _addr, _val) \ +(stb_dma(&(_dev)->as, (_addr), (_val), MEMTXATTRS_UNSPECIFIED)) +#define vio_sth(_dev, _addr, _val) \ +(stw_be_dma(&(_dev)->as, (_addr), (_val), MEMTXATTRS_UNSPECIFIED)) +#define vio_stl(_dev, _addr, _val) \ +(stl_be_dma(&(_dev)->as, (_addr), (_val), MEMTXATTRS_UNSPECIFIED)) +#define vio_stq(_dev, _addr, _val) \ +(stq_be_dma(&(_dev)->as, (_addr), (_val), MEMTXATTRS_UNSPECIFIED)) #define vio_ldq(_dev, _addr) (ldq_be_dma(&(_dev)->as, 
(_addr))) int spapr_vio_send_crq(SpaprVioDevice *dev, uint8_t *crq); diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index d11c1d794f9..ebbc0501681 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -249,10 +249,11 @@ static inline void dma_memory_unmap(AddressSpace *as, } \ static inline void st##_sname##_##_end##_dma(AddressSpace *as, \ dma_addr_t addr, \ - uint##_bits##_t val) \ + uint##_bits##_t val, \ + MemTxAttrs attrs) \ { \ val = cpu_to_##_end##_bits(val);\ -dma_memory_write(as, addr, &val, (_bits) / 8, MEMTXATTRS_UNSPECIFIED); \ +dma_memory_write(as, addr, &val, (_bits) / 8, attrs); \ } static inline uint8_t ldub_dma(AddressSpace *as, dma_addr_t addr) @@ -263,9 +264,10 @@ static inline uint8_t ldub_dma(AddressSpace *as, dma_addr_t addr) return val; } -static inline void stb_dma(AddressSpace *as, dma_addr_t addr, uint8_t val) +static inline void stb_dma(AddressSpace *as, dma_addr_t addr, + uint8_t val, MemTxAttrs attrs) { -dma_memory_write(as, addr, &val, 1, MEMTXATTRS_UNSPECIFIED); +dma_memory_write(as, addr, &val, 1, attrs); } DEFINE_LDST_DMA(uw, w, 16, le); diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index 9b91b15cb08..e5f3c981841 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -360,7 +360,7 @@ static void fw_cfg_dma_transfer(FWCfgState *s) if (dma_memory_read(s->dma_as, dma_addr, &dma, sizeof(dma), MEMTXATTRS_UNSPECIFIED)) { stl_be_dma(s->dma_as, dma_addr + offsetof(FWCfgDmaAccess, control), - FW_CFG_DMA_CTL_ERROR); + FW_CFG_DMA_CTL_ERROR, MEMTXATTRS_UNSPECIFIED); return; } @@ -446,7 +446,7 @@ static void fw_cfg_dma_transfer(FWCfgState *s) } stl_be_dma(s->dma_as, dma_addr + offsetof(FWCfgDmaAccess, control), -dma.control); +dma.control, MEMTXATTRS_UNSPECIFIED); trace_fw_cfg_read(s, 0); } -- 2.33.1
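The macro change in include/sysemu/dma.h is easier to see in isolation. Below is a rough standalone sketch of a macro that generates big-endian store helpers taking an attrs argument. It is not the QEMU macro: `ToyAddressSpace`, `toy_memory_write()` and `DEFINE_TOY_ST_BE` are invented for illustration, the `MemTxAttrs` stand-in models only two bits, and the byte swap is done by hand rather than with `cpu_to_be*()`.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for QEMU's MemTxAttrs; only two illustrative bits are modelled. */
typedef struct {
    unsigned int unspecified : 1;
    unsigned int secure : 1;
} MemTxAttrs;
#define MEMTXATTRS_UNSPECIFIED ((MemTxAttrs){ .unspecified = 1 })

/* Hypothetical address space: 16 bytes of RAM plus a record of the
 * attributes used by the most recent write. */
typedef struct {
    uint8_t ram[16];
    MemTxAttrs last_attrs;
} ToyAddressSpace;

/* Stand-in for dma_memory_write(): byte copy plus a record of the attrs. */
void toy_memory_write(ToyAddressSpace *as, uint64_t addr,
                      const uint8_t *buf, unsigned int len, MemTxAttrs attrs)
{
    assert(addr <= sizeof(as->ram) && len <= sizeof(as->ram) - addr);
    for (unsigned int i = 0; i < len; i++) {
        as->ram[addr + i] = buf[i];
    }
    as->last_attrs = attrs;
}

/* Generates toy_st<name>_be(): a big-endian store of _bits bits that
 * forwards the caller's attrs instead of hard-coding the default. */
#define DEFINE_TOY_ST_BE(_name, _bits)                                   \
void toy_st##_name##_be(ToyAddressSpace *as, uint64_t addr,              \
                        uint##_bits##_t val, MemTxAttrs attrs)           \
{                                                                        \
    uint8_t buf[(_bits) / 8];                                            \
    for (unsigned int i = 0; i < (_bits) / 8; i++) {                     \
        /* Most significant byte first. */                               \
        buf[i] = (uint8_t)(val >> ((_bits) - 8 * (i + 1)));              \
    }                                                                    \
    toy_memory_write(as, addr, buf, (_bits) / 8, attrs);                 \
}

DEFINE_TOY_ST_BE(w, 16)
DEFINE_TOY_ST_BE(l, 32)
```

Existing callers keep their behaviour by passing `MEMTXATTRS_UNSPECIFIED`, as the fw_cfg and spapr_vio hunks do; a device that knows more about its transactions can pass something else without any further API change.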
[PATCH v3 kvm/queue 01/16] mm/shmem: Introduce F_SEAL_INACCESSIBLE
From: "Kirill A. Shutemov" Introduce a new seal, F_SEAL_INACCESSIBLE, indicating that the content of the file is inaccessible from userspace by any means, such as read(), write() or mmap(). It provides the semantics required for KVM guest private memory support: a file descriptor with this seal set is going to be used as the source of guest memory in confidential computing environments such as Intel TDX/AMD SEV, but may not be accessible from host userspace. At this time only shmem implements this seal. Signed-off-by: Kirill A. Shutemov Signed-off-by: Chao Peng --- include/uapi/linux/fcntl.h | 1 + mm/shmem.c | 37 +++-- 2 files changed, 36 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h index 2f86b2ad6d7e..e2bad051936f 100644 --- a/include/uapi/linux/fcntl.h +++ b/include/uapi/linux/fcntl.h @@ -43,6 +43,7 @@ #define F_SEAL_GROW 0x0004 /* prevent file from growing */ #define F_SEAL_WRITE 0x0008 /* prevent writes */ #define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */ +#define F_SEAL_INACCESSIBLE 0x0020 /* prevent file from accessing */ /* (1U << 31) is reserved for signed error codes */ /* diff --git a/mm/shmem.c b/mm/shmem.c index 18f93c2d68f1..faa7e9b1b9bc 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1098,6 +1098,10 @@ static int shmem_setattr(struct user_namespace *mnt_userns, (newsize > oldsize && (info->seals & F_SEAL_GROW))) return -EPERM; + if ((info->seals & F_SEAL_INACCESSIBLE) && + (newsize & ~PAGE_MASK)) + return -EINVAL; + if (newsize != oldsize) { error = shmem_reacct_size(SHMEM_I(inode)->flags, oldsize, newsize); @@ -1364,6 +1368,8 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc) goto redirty; if (!total_swap_pages) goto redirty; + if (info->seals & F_SEAL_INACCESSIBLE) + goto redirty; /* * Our capabilities prevent regular writeback or sync from ever calling @@ -2262,6 +2268,9 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma) if
(ret) return ret; + if (info->seals & F_SEAL_INACCESSIBLE) + return -EPERM; + /* arm64 - allow memory tagging on RAM-based files */ vma->vm_flags |= VM_MTE_ALLOWED; @@ -2459,12 +2468,15 @@ shmem_write_begin(struct file *file, struct address_space *mapping, pgoff_t index = pos >> PAGE_SHIFT; /* i_rwsem is held by caller */ - if (unlikely(info->seals & (F_SEAL_GROW | - F_SEAL_WRITE | F_SEAL_FUTURE_WRITE))) { + if (unlikely(info->seals & (F_SEAL_GROW | F_SEAL_WRITE | + F_SEAL_FUTURE_WRITE | + F_SEAL_INACCESSIBLE))) { if (info->seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE)) return -EPERM; if ((info->seals & F_SEAL_GROW) && pos + len > inode->i_size) return -EPERM; + if (info->seals & F_SEAL_INACCESSIBLE) + return -EPERM; } return shmem_getpage(inode, index, pagep, SGP_WRITE); @@ -2538,6 +2550,21 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to) end_index = i_size >> PAGE_SHIFT; if (index > end_index) break; + + /* +* inode_lock protects setting up seals as well as write to +* i_size. Setting F_SEAL_INACCESSIBLE only allowed with +* i_size == 0. +* +* Check F_SEAL_INACCESSIBLE after i_size. It effectively +* serialize read vs. setting F_SEAL_INACCESSIBLE without +* taking inode_lock in read path. +*/ + if (SHMEM_I(inode)->seals & F_SEAL_INACCESSIBLE) { + error = -EPERM; + break; + } + if (index == end_index) { nr = i_size & ~PAGE_MASK; if (nr <= offset) @@ -2663,6 +2690,12 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset, goto out; } + if ((info->seals & F_SEAL_INACCESSIBLE) && + (offset & ~PAGE_MASK || len & ~PAGE_MASK)) { + error = -EINVAL; + goto out; + } + shmem_falloc.waitq = &shmem_falloc_waitq; shmem_falloc.start = (u64)unmap_start >> PAGE_SHIFT; shmem_falloc.next = (unmap_end + 1) >> PAGE_SHIFT; -- 2.17.1
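The checks this patch spreads over shmem.c fall into two groups: size and range changes must stay page aligned, and content access is refused outright. A compact userspace model, offered as a sketch only: the `toy_*` functions and `TOY_PAGE_*` macros are hypothetical, a 4 KiB page is assumed, and the seal value is the one proposed in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumes 4 KiB pages; the kernel uses the architecture's PAGE_SHIFT. */
#define TOY_PAGE_SHIFT       12
#define TOY_PAGE_SIZE        (UINT64_C(1) << TOY_PAGE_SHIFT)
#define TOY_PAGE_MASK        (~(TOY_PAGE_SIZE - 1))

/* Seal value proposed by this patch. */
#define F_SEAL_INACCESSIBLE  0x0020U

/* Models the shmem_setattr() hunk: a sealed file may only be resized to a
 * whole number of pages. */
int toy_check_setattr(unsigned int seals, uint64_t newsize)
{
    if ((seals & F_SEAL_INACCESSIBLE) && (newsize & ~TOY_PAGE_MASK)) {
        return -EINVAL;
    }
    return 0;
}

/* Models the shmem_fallocate() hunk: offset and length must both be page
 * aligned on a sealed file. */
int toy_check_fallocate(unsigned int seals, uint64_t offset, uint64_t len)
{
    if ((seals & F_SEAL_INACCESSIBLE) &&
        ((offset & ~TOY_PAGE_MASK) || (len & ~TOY_PAGE_MASK))) {
        return -EINVAL;
    }
    return 0;
}

/* Models the read, write and mmap hunks: access to the content of a sealed
 * file is refused outright. */
int toy_check_access(unsigned int seals)
{
    return (seals & F_SEAL_INACCESSIBLE) ? -EPERM : 0;
}
```

So alignment violations are reported as -EINVAL while attempts to touch the content are reported as -EPERM, which is the split the hunks above follow.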
[PATCH v3 kvm/queue 05/16] KVM: Maintain ofs_tree for fast memslot lookup by file offset
Similar to hva_tree for hva range, maintain interval tree ofs_tree for offset range of a fd-based memslot so the lookup by offset range can be faster when memslot count is high. Signed-off-by: Chao Peng --- include/linux/kvm_host.h | 2 ++ virt/kvm/kvm_main.c | 17 + 2 files changed, 15 insertions(+), 4 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 2cd35560c44b..3bd875f9669f 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -451,6 +451,7 @@ static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu) struct kvm_memory_slot { struct hlist_node id_node[2]; struct interval_tree_node hva_node[2]; + struct interval_tree_node ofs_node[2]; struct rb_node gfn_node[2]; gfn_t base_gfn; unsigned long npages; @@ -560,6 +561,7 @@ struct kvm_memslots { u64 generation; atomic_long_t last_used_slot; struct rb_root_cached hva_tree; + struct rb_root_cached ofs_tree; struct rb_root gfn_tree; /* * The mapping table from slot id to memslot. diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b0f7e6eb00ff..47e96d1eb233 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1087,6 +1087,7 @@ static struct kvm *kvm_create_vm(unsigned long type) atomic_long_set(&slots->last_used_slot, (unsigned long)NULL); slots->hva_tree = RB_ROOT_CACHED; + slots->ofs_tree = RB_ROOT_CACHED; slots->gfn_tree = RB_ROOT; hash_init(slots->id_hash); slots->node_idx = j; @@ -1363,7 +1364,7 @@ static void kvm_replace_gfn_node(struct kvm_memslots *slots, * With NULL @old this simply adds @new. * With NULL @new this simply removes @old. * - * If @new is non-NULL its hva_node[slots_idx] range has to be set + * If @new is non-NULL its hva/ofs_node[slots_idx] range has to be set * appropriately. 
*/ static void kvm_replace_memslot(struct kvm *kvm, @@ -1377,6 +1378,7 @@ static void kvm_replace_memslot(struct kvm *kvm, if (old) { hash_del(&old->id_node[idx]); interval_tree_remove(&old->hva_node[idx], &slots->hva_tree); + interval_tree_remove(&old->ofs_node[idx], &slots->ofs_tree); if ((long)old == atomic_long_read(&slots->last_used_slot)) atomic_long_set(&slots->last_used_slot, (long)new); @@ -1388,20 +1390,27 @@ static void kvm_replace_memslot(struct kvm *kvm, } /* -* Initialize @new's hva range. Do this even when replacing an @old +* Initialize @new's hva/ofs range. Do this even when replacing an @old * slot, kvm_copy_memslot() deliberately does not touch node data. */ new->hva_node[idx].start = new->userspace_addr; new->hva_node[idx].last = new->userspace_addr + (new->npages << PAGE_SHIFT) - 1; + if (kvm_slot_is_private(new)) { + new->ofs_node[idx].start = new->ofs; + new->ofs_node[idx].last = new->ofs + + (new->npages << PAGE_SHIFT) - 1; + } /* * (Re)Add the new memslot. There is no O(1) interval_tree_replace(), -* hva_node needs to be swapped with remove+insert even though hva can't -* change when replacing an existing slot. +* hva_node/ofs_node needs to be swapped with remove+insert even though +* hva/ofs can't change when replacing an existing slot. */ hash_add(slots->id_hash, &new->id_node[idx], new->id); interval_tree_insert(&new->hva_node[idx], &slots->hva_tree); + if (kvm_slot_is_private(new)) + interval_tree_insert(&new->ofs_node[idx], &slots->ofs_tree); /* * If the memslot gfn is unchanged, rb_replace_node() can be used to -- 2.17.1
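The closed interval stored in ofs_node by this patch, [ofs, ofs + (npages << PAGE_SHIFT) - 1], and the overlap test an interval-tree lookup relies on can be sketched as follows. This is an illustrative userspace model with hypothetical sketch_* names and an assumed 4 KiB page size; it does not use the kernel's interval tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Closed interval [start, last], the convention used for ofs_node. */
struct sketch_interval {
	uint64_t start;
	uint64_t last;
};

/* Mirrors how the patch fills ofs_node[idx].start/.last for a private slot. */
static struct sketch_interval sketch_slot_ofs_range(uint64_t ofs, uint64_t npages)
{
	struct sketch_interval iv;

	assert(npages > 0); /* an empty slot has no valid closed interval */
	iv.start = ofs;
	iv.last = ofs + (npages << SKETCH_PAGE_SHIFT) - 1;
	return iv;
}

/* Two closed intervals overlap when each starts no later than the other ends. */
static bool sketch_overlaps(struct sketch_interval iv, uint64_t start, uint64_t last)
{
	return iv.start <= last && start <= iv.last;
}
```

For a slot at offset 0x2000 with 4 pages the interval is [0x2000, 0x5fff], so a query starting at 0x6000 does not overlap it.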
[PATCH v3 kvm/queue 09/16] KVM: Split out common memory invalidation code
When fd-based memory is enabled, there will be two types of memory invalidation: - memory invalidation from native MMU through mmu_notifier callback for hva-based memory, and, - memory invalidation from memfd through memfd_notifier callback for fd-based memory. Some code can be shared between these two types of memory invalidation. This patch moves those shared code into one place so that it can be used for both CONFIG_MMU_NOTIFIER and CONFIG_MEMFD_NOTIFIER. Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- virt/kvm/kvm_main.c | 35 +++ 1 file changed, 19 insertions(+), 16 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 19736a0013a0..7b7530b1ea1e 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -469,22 +469,6 @@ void kvm_destroy_vcpus(struct kvm *kvm) EXPORT_SYMBOL_GPL(kvm_destroy_vcpus); #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) -static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn) -{ - return container_of(mn, struct kvm, mmu_notifier); -} - -static void kvm_mmu_notifier_invalidate_range(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, unsigned long end) -{ - struct kvm *kvm = mmu_notifier_to_kvm(mn); - int idx; - - idx = srcu_read_lock(&kvm->srcu); - kvm_arch_mmu_notifier_invalidate_range(kvm, start, end); - srcu_read_unlock(&kvm->srcu, idx); -} typedef bool (*gfn_handler_t)(struct kvm *kvm, struct kvm_gfn_range *range); @@ -611,6 +595,25 @@ static __always_inline int __kvm_handle_useraddr_range(struct kvm *kvm, /* The notifiers are averse to booleans. 
:-( */ return (int)ret; } +#endif + +#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) +static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn) +{ + return container_of(mn, struct kvm, mmu_notifier); +} + +static void kvm_mmu_notifier_invalidate_range(struct mmu_notifier *mn, + struct mm_struct *mm, + unsigned long start, unsigned long end) +{ + struct kvm *kvm = mmu_notifier_to_kvm(mn); + int idx; + + idx = srcu_read_lock(&kvm->srcu); + kvm_arch_mmu_notifier_invalidate_range(kvm, start, end); + srcu_read_unlock(&kvm->srcu, idx); +} static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn, unsigned long start, -- 2.17.1
[PATCH v2 16/23] dma: Let ld*_dma() take MemTxAttrs argument
Let devices specify transaction attributes when calling ld*_dma(). Keep the default MEMTXATTRS_UNSPECIFIED in the few callers. Reviewed-by: Richard Henderson Reviewed-by: Cédric Le Goater Signed-off-by: Philippe Mathieu-Daudé --- include/hw/pci/pci.h | 3 ++- include/hw/ppc/spapr_vio.h | 3 ++- include/sysemu/dma.h | 11 ++- hw/intc/pnv_xive.c | 7 --- hw/usb/hcd-xhci.c | 6 +++--- 5 files changed, 17 insertions(+), 13 deletions(-) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index d07e9707b48..0613308b1b6 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -854,7 +854,8 @@ static inline MemTxResult pci_dma_write(PCIDevice *dev, dma_addr_t addr, static inline uint##_bits##_t ld##_l##_pci_dma(PCIDevice *dev, \ dma_addr_t addr) \ { \ -return ld##_l##_dma(pci_get_address_space(dev), addr); \ +return ld##_l##_dma(pci_get_address_space(dev), addr, \ +MEMTXATTRS_UNSPECIFIED);\ } \ static inline void st##_s##_pci_dma(PCIDevice *dev, \ dma_addr_t addr, uint##_bits##_t val) \ diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h index e87f8e6f596..d2ec9b0637f 100644 --- a/include/hw/ppc/spapr_vio.h +++ b/include/hw/ppc/spapr_vio.h @@ -126,7 +126,8 @@ static inline int spapr_vio_dma_set(SpaprVioDevice *dev, uint64_t taddr, (stl_be_dma(&(_dev)->as, (_addr), (_val), MEMTXATTRS_UNSPECIFIED)) #define vio_stq(_dev, _addr, _val) \ (stq_be_dma(&(_dev)->as, (_addr), (_val), MEMTXATTRS_UNSPECIFIED)) -#define vio_ldq(_dev, _addr) (ldq_be_dma(&(_dev)->as, (_addr))) +#define vio_ldq(_dev, _addr) \ +(ldq_be_dma(&(_dev)->as, (_addr), MEMTXATTRS_UNSPECIFIED)) int spapr_vio_send_crq(SpaprVioDevice *dev, uint8_t *crq); diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h index ebbc0501681..f3cf60d222d 100644 --- a/include/sysemu/dma.h +++ b/include/sysemu/dma.h @@ -241,10 +241,11 @@ static inline void dma_memory_unmap(AddressSpace *as, #define DEFINE_LDST_DMA(_lname, _sname, _bits, _end) \ static inline uint##_bits##_t 
ld##_lname##_##_end##_dma(AddressSpace *as, \ -dma_addr_t addr) \ +dma_addr_t addr, \ +MemTxAttrs attrs) \ { \ uint##_bits##_t val;\ -dma_memory_read(as, addr, &val, (_bits) / 8, MEMTXATTRS_UNSPECIFIED); \ +dma_memory_read(as, addr, &val, (_bits) / 8, attrs); \ return _end##_bits##_to_cpu(val); \ } \ static inline void st##_sname##_##_end##_dma(AddressSpace *as, \ @@ -253,14 +254,14 @@ static inline void dma_memory_unmap(AddressSpace *as, MemTxAttrs attrs) \ { \ val = cpu_to_##_end##_bits(val);\ -dma_memory_write(as, addr, &val, (_bits) / 8, attrs); \ +dma_memory_write(as, addr, &val, (_bits) / 8, attrs); \ } -static inline uint8_t ldub_dma(AddressSpace *as, dma_addr_t addr) +static inline uint8_t ldub_dma(AddressSpace *as, dma_addr_t addr, MemTxAttrs attrs) { uint8_t val; -dma_memory_read(as, addr, &val, 1, MEMTXATTRS_UNSPECIFIED); +dma_memory_read(as, addr, &val, 1, attrs); return val; } diff --git a/hw/intc/pnv_xive.c b/hw/intc/pnv_xive.c index ad43483612e..d9249bbc0c1 100644 --- a/hw/intc/pnv_xive.c +++ b/hw/intc/pnv_xive.c @@ -172,7 +172,7 @@ static uint64_t pnv_xive_vst_addr_indirect(PnvXive *xive, uint32_t type, /* Get the page size of the indirect table. */ vsd_addr = vsd & VSD_ADDRESS_MASK; -vsd = ldq_be_dma(&address_space_memory, vsd_addr); +vsd = ldq_be_dma(&address_space_memory, vsd_addr, MEMTXATTRS_UNSPECIFIED); if (!(vsd & VSD_ADDRESS_MASK)) { #ifdef XIVE_DEBUG @@ -195,7 +195,8 @@ static uint64_t pnv_xive_vst_addr_indirect(PnvXive *xive, uint32_t type, /* Load the VSD we are looking for, if not already done */ if (vsd_idx) { vsd_addr = vsd_addr + vsd_idx * XIVE_VSD_SIZE; -vsd = ldq_be_dma(&address_space_memory, vsd_addr); +vsd = ldq_be_dma(&address_space_memory, vsd_addr, + MEMTXATTRS_UNSPECIFIED); if (!(vsd & VSD_ADDRESS_MASK)) { #ifdef XIVE_DEBUG @@ -542,7 +543,7 @@ stat
[PATCH v3 kvm/queue 08/16] KVM: Special handling for fd-based memory invalidation
For fd-based guest memory, the memory backend (e.g. the fd provider) should notify KVM to unmap/invalidate the private memory from the KVM secondary MMU when userspace punches a hole on the fd (e.g. when userspace converts private memory to shared memory). To support fd-based memory invalidation, the existing hva-based memory invalidation needs to be extended. A new 'inode' for the fd is passed in from memfd_falloc_notifier, and 'start/end' represent the start/end offset in the fd instead of an hva range. During the invalidation KVM needs to check this inode against the one in the memslot. Only when the 'inode' in the memslot equals the passed-in 'inode' should we invalidate the mapping in KVM. Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- virt/kvm/kvm_main.c | 30 -- 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b7a1c4d7eaaa..19736a0013a0 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -494,6 +494,7 @@ typedef void (*on_lock_fn_t)(struct kvm *kvm, unsigned long start, struct kvm_useraddr_range { unsigned long start; unsigned long end; + struct inode *inode; pte_t pte; gfn_handler_t handler; on_lock_fn_t on_lock; @@ -544,14 +545,27 @@ static __always_inline int __kvm_handle_useraddr_range(struct kvm *kvm, struct interval_tree_node *node; slots = __kvm_memslots(kvm, i); - useraddr_tree = &slots->hva_tree; + useraddr_tree = range->inode ? 
&slots->ofs_tree : &slots->hva_tree; kvm_for_each_memslot_in_useraddr_range(node, useraddr_tree, range->start, range->end - 1) { unsigned long useraddr_start, useraddr_end; + unsigned long useraddr_base; + + if (range->inode) { + slot = container_of(node, struct kvm_memory_slot, + ofs_node[slots->node_idx]); + if (!slot->file || + slot->file->f_inode != range->inode) + continue; + useraddr_base = slot->ofs; + } else { + slot = container_of(node, struct kvm_memory_slot, + hva_node[slots->node_idx]); + useraddr_base = slot->userspace_addr; + } - slot = container_of(node, struct kvm_memory_slot, hva_node[slots->node_idx]); - useraddr_start = max(range->start, slot->userspace_addr); - useraddr_end = min(range->end, slot->userspace_addr + + useraddr_start = max(range->start, useraddr_base); + useraddr_end = min(range->end, useraddr_base + (slot->npages << PAGE_SHIFT)); /* @@ -568,10 +582,10 @@ static __always_inline int __kvm_handle_useraddr_range(struct kvm *kvm, * {gfn_start, gfn_start+1, ..., gfn_end-1}. */ gfn_range.start = useraddr_to_gfn_memslot(useraddr_start, - slot, true); + slot, !range->inode); gfn_range.end = useraddr_to_gfn_memslot( useraddr_end + PAGE_SIZE - 1, - slot, true); + slot, !range->inode); gfn_range.slot = slot; if (!locked) { @@ -613,6 +627,7 @@ static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn, .on_lock= (void *)kvm_null_fn, .flush_on_ret = true, .may_block = false, + .inode = NULL, }; return __kvm_handle_useraddr_range(kvm, &range); @@ -632,6 +647,7 @@ static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn .on_lock= (void *)kvm_null_fn, .flush_on_ret = false, .may_block = false, + .inode = NULL, }; return __kvm_handle_useraddr_range(kvm, &range); @@ -700,6 +716,7 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, .on_lock= kvm_inc_notifier_count, .flush_on_ret = true, .may_block = mmu_notifier_range_blockable(range), + .inode = NULL, }; trace_kvm_unmap_hva_range(range-
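The inode check and the clamping of the notifier range to a slot, as done in the __kvm_handle_useraddr_range() hunk of this patch, can be sketched in userspace C. This is only a model: the inode is reduced to a numeric id (0 meaning "no file"), the sketch_* names are hypothetical, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Stand-in for the fields of a fd-based memslot that matter here. */
struct sketch_slot {
	unsigned long inode_id; /* models slot->file->f_inode; 0 = no file */
	uint64_t ofs;           /* byte offset of the slot in the fd */
	uint64_t npages;
};

/* Half-open byte range [start, end). */
struct sketch_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Returns true and fills @out with the part of [start, end) that falls
 * inside @slot, but only when the slot is backed by the same inode.
 */
static bool sketch_clamp_to_slot(const struct sketch_slot *slot,
				 unsigned long inode_id,
				 uint64_t start, uint64_t end,
				 struct sketch_range *out)
{
	uint64_t slot_end = slot->ofs + (slot->npages << SKETCH_PAGE_SHIFT);

	assert(start < end);

	/* Slots without a file, or backed by another inode, are skipped. */
	if (slot->inode_id == 0 || slot->inode_id != inode_id)
		return false;

	out->start = start > slot->ofs ? start : slot->ofs; /* max() */
	out->end = end < slot_end ? end : slot_end;         /* min() */
	return out->start < out->end;
}
```

In the kernel the interval-tree lookup already guarantees an overlap, so the final emptiness check exists only because this sketch has no tree in front of it.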
[PATCH v3 kvm/queue 11/16] KVM: Add kvm_map_gfn_range
This new function establishes the mapping in KVM page tables for a given gfn range. It can be used in the memory fallocate callback for memfd based memory to establish the mapping for KVM secondary MMU when the pages are allocated in the memory backend. Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- arch/x86/kvm/mmu/mmu.c | 47 include/linux/kvm_host.h | 2 ++ virt/kvm/kvm_main.c | 5 + 3 files changed, 54 insertions(+) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 1d275e9d76b5..2856eb662a21 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1568,6 +1568,53 @@ static __always_inline bool kvm_handle_gfn_range(struct kvm *kvm, return ret; } +bool kvm_map_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) +{ + struct kvm_vcpu *vcpu; + kvm_pfn_t pfn; + gfn_t gfn; + int idx; + bool ret = true; + + /* Need vcpu context for kvm_mmu_do_page_fault. */ + vcpu = kvm_get_vcpu(kvm, 0); + if (mutex_lock_killable(&vcpu->mutex)) + return false; + + vcpu_load(vcpu); + idx = srcu_read_lock(&kvm->srcu); + + kvm_mmu_reload(vcpu); + + gfn = range->start; + while (gfn < range->end) { + if (signal_pending(current)) { + ret = false; + break; + } + + if (need_resched()) + cond_resched(); + + pfn = kvm_mmu_do_page_fault(vcpu, gfn << PAGE_SHIFT, + PFERR_WRITE_MASK | PFERR_USER_MASK, + false); + if (is_error_noslot_pfn(pfn) || kvm->vm_bugged) { + ret = false; + break; + } + + gfn++; + } + + srcu_read_unlock(&kvm->srcu, idx); + vcpu_put(vcpu); + + mutex_unlock(&vcpu->mutex); + + return ret; +} + bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) { bool flush = false; diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index be567925831b..8c2359175509 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -241,6 +241,8 @@ struct kvm_gfn_range { pte_t pte; bool may_block; }; + +bool kvm_map_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range); bool kvm_unmap_gfn_range(struct kvm *kvm, struct 
kvm_gfn_range *range); bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range); bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index f495c1a313bd..660ce15973ad 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -471,6 +471,11 @@ EXPORT_SYMBOL_GPL(kvm_destroy_vcpus); #if defined(CONFIG_MEMFD_OPS) ||\ (defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)) +bool __weak kvm_map_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) +{ + return false; +} + typedef bool (*gfn_handler_t)(struct kvm *kvm, struct kvm_gfn_range *range); typedef void (*on_lock_fn_t)(struct kvm *kvm, unsigned long start, -- 2.17.1
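The control flow of kvm_map_gfn_range(), which walks the gfn range one page at a time and stops at the first failed fault-in, can be sketched without any KVM dependencies. In this hypothetical model the page-fault path is replaced by a caller-supplied callback; all sketch_* names and the 4 KiB page size are assumptions, and locking, signal checks and rescheduling are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Stand-in for the fault-in step; returns false on error. */
typedef bool (*sketch_fault_fn)(uint64_t gpa, void *ctx);

/* Walk [start, end) in gfn units and stop at the first failure. */
static bool sketch_map_gfn_range(uint64_t start, uint64_t end,
				 sketch_fault_fn fault, void *ctx)
{
	uint64_t gfn;

	assert(fault != NULL);

	for (gfn = start; gfn < end; gfn++) {
		if (!fault(gfn << SKETCH_PAGE_SHIFT, ctx))
			return false;
	}
	return true;
}

/* Toy backend: every gfn below first_missing_gfn can be mapped. */
struct sketch_backing {
	uint64_t first_missing_gfn;
	uint64_t mapped; /* number of successful fault-ins so far */
};

static bool sketch_fault_in(uint64_t gpa, void *ctx)
{
	struct sketch_backing *b = ctx;

	if ((gpa >> SKETCH_PAGE_SHIFT) >= b->first_missing_gfn)
		return false;
	b->mapped++;
	return true;
}
```

As in the patch, a failure part-way through leaves the earlier pages mapped; the function only reports that the range was not completed.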
[PATCH v3 kvm/queue 06/16] KVM: Implement fd-based memory using MEMFD_OPS interfaces
This patch adds the new memfd facility in KVM using MEMFD_OPS to provide guest memory from a file descriptor created in userspace with memfd_create() instead of traditional userspace hva. It mainly provides two kinds of functions: - Pair/unpair a fd-based memslot to a memory backend that owns the file descriptor when such memslot gets created/deleted. - Get/put a pfn to be used in KVM page fault handler from/to the paired memory backend. At the pairing time, KVM and the memfd subsystem exchange callbacks that each can call into the other side. These callbacks are the major places to implement fd-based guest memory provisioning. KVM->memfd: - get_pfn: get and lock a page at specified offset in the fd. - put_pfn: put and unlock the pfn. Note: page needs to be locked between get_pfn/put_pfn to ensure pfn is valid when KVM uses it to establish the mapping in the secondary MMU page table. memfd->KVM: - invalidate_page_range: called when userspace punches a hole on the fd, KVM should unmap related pages in the secondary MMU. - fallocate: called when userspace fallocates space on the fd, KVM can map related pages in the secondary MMU. - get/put_owner: used to ensure guest is still alive using a reference mechanism when calling above invalidate/fallocate callbacks. Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- arch/x86/kvm/Kconfig | 1 + include/linux/kvm_host.h | 6 +++ virt/kvm/Makefile.kvm| 2 +- virt/kvm/memfd.c | 91 4 files changed, 99 insertions(+), 1 deletion(-) create mode 100644 virt/kvm/memfd.c diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index 03b2ce34e7f4..86655cd660ca 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -46,6 +46,7 @@ config KVM select SRCU select INTERVAL_TREE select HAVE_KVM_PM_NOTIFIER if PM + select MEMFD_OPS help Support hosting fully virtualized guest machines using hardware virtualization extensions. 
You will need a fairly recent diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 3bd875f9669f..21f8b1880723 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -806,6 +806,12 @@ static inline void kvm_irqfd_exit(void) { } #endif + +int kvm_memfd_register(struct kvm *kvm, struct kvm_memory_slot *slot); +void kvm_memfd_unregister(struct kvm_memory_slot *slot); +long kvm_memfd_get_pfn(struct kvm_memory_slot *slot, gfn_t gfn, int *order); +void kvm_memfd_put_pfn(kvm_pfn_t pfn); + int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align, struct module *module); void kvm_exit(void); diff --git a/virt/kvm/Makefile.kvm b/virt/kvm/Makefile.kvm index ffdcad3cc97a..8842128d8429 100644 --- a/virt/kvm/Makefile.kvm +++ b/virt/kvm/Makefile.kvm @@ -5,7 +5,7 @@ KVM ?= ../../../virt/kvm -kvm-y := $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/binary_stats.o +kvm-y := $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/binary_stats.o $(KVM)/memfd.o kvm-$(CONFIG_KVM_VFIO) += $(KVM)/vfio.o kvm-$(CONFIG_KVM_MMIO) += $(KVM)/coalesced_mmio.o kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o diff --git a/virt/kvm/memfd.c b/virt/kvm/memfd.c new file mode 100644 index ..662393a76782 --- /dev/null +++ b/virt/kvm/memfd.c @@ -0,0 +1,91 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * memfd.c: routines for fd based guest memory + * Copyright (c) 2021, Intel Corporation. 
+ * + * Author: + * Chao Peng + */ + +#include +#include + +#ifdef CONFIG_MEMFD_OPS +static const struct memfd_pfn_ops *memfd_ops; + +static void memfd_invalidate_page_range(struct inode *inode, void *owner, + pgoff_t start, pgoff_t end) +{ +} + +static void memfd_fallocate(struct inode *inode, void *owner, + pgoff_t start, pgoff_t end) +{ +} + +static bool memfd_get_owner(void *owner) +{ + return kvm_get_kvm_safe(owner); +} + +static void memfd_put_owner(void *owner) +{ + kvm_put_kvm(owner); +} + +static const struct memfd_falloc_notifier memfd_notifier = { + .invalidate_page_range = memfd_invalidate_page_range, + .fallocate = memfd_fallocate, + .get_owner = memfd_get_owner, + .put_owner = memfd_put_owner, +}; +#endif + +long kvm_memfd_get_pfn(struct kvm_memory_slot *slot, gfn_t gfn, int *order) +{ +#ifdef CONFIG_MEMFD_OPS + pgoff_t index = gfn - slot->base_gfn + (slot->ofs >> PAGE_SHIFT); + + return memfd_ops->get_lock_pfn(slot->file->f_inode, index, order); +#else + return -EOPNOTSUPP; +#endif +} + +void kvm_memfd_put_pfn(kvm_pfn_t pfn) +{ +#ifdef CONFIG_MEMFD_OPS + memfd_ops->put_unlock_pfn(pfn); +#endif +} + +int kvm_memfd_register(struct kvm *kvm, struct kvm_memory_slot *slot) +{ +#ifdef CONFIG_MEMFD_OPS + int ret; + struct fd fd = fdget(slot->fd); + + if (!fd.file) + return -EINVAL; + + ret
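The index arithmetic in kvm_memfd_get_pfn() above, which turns a guest frame number into a page index inside the backing fd, is small enough to check by hand. A minimal sketch, assuming 4 KiB pages and using a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Page index in the fd for @gfn: the distance of @gfn from the start of
 * the slot plus the slot's byte offset in the fd, converted to pages.
 */
static uint64_t sketch_gfn_to_file_index(uint64_t gfn, uint64_t base_gfn,
					 uint64_t ofs)
{
	assert(gfn >= base_gfn); /* gfn must belong to the slot */
	return gfn - base_gfn + (ofs >> SKETCH_PAGE_SHIFT);
}
```

For a slot with base_gfn 0x100 placed at byte offset 0x3000 in the fd, gfn 0x105 lands on page index 5 + 3 = 8.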
[PATCH v3 kvm/queue 13/16] KVM: Add KVM_EXIT_MEMORY_ERROR exit
This new exit allows user space to handle memory-related errors. Currently it supports two types (KVM_EXIT_MEM_MAP_SHARED/PRIVATE) of errors which are used for shared memory <-> private memory conversion in memory encryption usage. After private memory is enabled, there are two places in KVM that can exit to userspace to trigger private <-> shared conversion: - explicit conversion: happens when guest explicitly calls into KVM to map a range (as private or shared), KVM then exits to userspace to do the map/unmap operations. - implicit conversion: happens in KVM page fault handler. * if the fault is due to a private memory access then causes a userspace exit for a shared->private conversion request when the page has not been allocated in the private memory backend. * If the fault is due to a shared memory access then causes a userspace exit for a private->shared conversion request when the page has already been allocated in the private memory backend. Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- include/uapi/linux/kvm.h | 15 +++ 1 file changed, 15 insertions(+) diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 41434322fa23..d68db3b2eeec 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -243,6 +243,18 @@ struct kvm_xen_exit { } u; }; +struct kvm_memory_exit { +#define KVM_EXIT_MEM_MAP_SHARED 1 +#define KVM_EXIT_MEM_MAP_PRIVATE2 + __u32 type; + union { + struct { + __u64 gpa; + __u64 size; + } map; + } u; +}; + #define KVM_S390_GET_SKEYS_NONE 1 #define KVM_S390_SKEYS_MAX1048576 @@ -282,6 +294,7 @@ struct kvm_xen_exit { #define KVM_EXIT_X86_BUS_LOCK 33 #define KVM_EXIT_XEN 34 #define KVM_EXIT_RISCV_SBI35 +#define KVM_EXIT_MEMORY_ERROR 36 /* For KVM_EXIT_INTERNAL_ERROR */ /* Emulate instruction failed. */ @@ -499,6 +512,8 @@ struct kvm_run { unsigned long args[6]; unsigned long ret[2]; } riscv_sbi; + /* KVM_EXIT_MEMORY_ERROR */ + struct kvm_memory_exit mem; /* Fix the size of the union. */ char padding[256]; }; -- 2.17.1
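On the userspace side, a VMM would dispatch on the type field of the new exit. The sketch below mirrors the proposed struct kvm_memory_exit locally, so it does not depend on kernel headers that may not carry it; the SKETCH_* names and the action enum are hypothetical. The choice of action (allocate in the private fd for a map-private request, punch a hole for a map-shared request) is inferred from the other patches in this series and is not spelled out by this patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the values proposed in the patch. */
#define SKETCH_EXIT_MEM_MAP_SHARED  1
#define SKETCH_EXIT_MEM_MAP_PRIVATE 2

/* Local mirror of the proposed struct kvm_memory_exit layout. */
struct sketch_memory_exit {
	uint32_t type;
	union {
		struct {
			uint64_t gpa;
			uint64_t size;
		} map;
	} u;
};

/* What a hypothetical VMM would do with the range in the private fd. */
enum sketch_action {
	SKETCH_ACT_PUNCH_HOLE, /* private -> shared: free the private pages */
	SKETCH_ACT_ALLOCATE,   /* shared -> private: fallocate the range */
	SKETCH_ACT_FAIL,       /* unknown type: treat as fatal */
};

static enum sketch_action
sketch_handle_memory_exit(const struct sketch_memory_exit *ex)
{
	assert(ex != NULL);

	switch (ex->type) {
	case SKETCH_EXIT_MEM_MAP_SHARED:
		return SKETCH_ACT_PUNCH_HOLE;
	case SKETCH_EXIT_MEM_MAP_PRIVATE:
		return SKETCH_ACT_ALLOCATE;
	default:
		return SKETCH_ACT_FAIL;
	}
}
```

A real handler would then act on u.map.gpa and u.map.size before re-entering the guest; that part is omitted here.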
[PATCH v2 19/23] hw/scsi/megasas: Use uint32_t for reply queue head/tail values
While the reply queue values fit in 16-bit, they are accessed as 32-bit: 661:s->reply_queue_head = ldl_le_pci_dma(pcid, s->producer_pa); 662:s->reply_queue_head %= MEGASAS_MAX_FRAMES; 663:s->reply_queue_tail = ldl_le_pci_dma(pcid, s->consumer_pa); 664:s->reply_queue_tail %= MEGASAS_MAX_FRAMES; Having: 41:#define MEGASAS_MAX_FRAMES 2048 /* Firmware limit at 65535 */ In order to update the ld/st*_pci_dma() API to pass the address of the value to access, it is simpler to have the head/tail declared as 32-bit values. Replace the uint16_t by uint32_t, wasting 4 bytes in the MegasasState structure. Acked-by: Richard Henderson Signed-off-by: Philippe Mathieu-Daudé --- hw/scsi/megasas.c| 4 ++-- hw/scsi/trace-events | 8 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c index 87101705d01..266c3d38003 100644 --- a/hw/scsi/megasas.c +++ b/hw/scsi/megasas.c @@ -109,8 +109,8 @@ struct MegasasState { uint64_t reply_queue_pa; void *reply_queue; uint16_t reply_queue_len; -uint16_t reply_queue_head; -uint16_t reply_queue_tail; +uint32_t reply_queue_head; +uint32_t reply_queue_tail; uint64_t consumer_pa; uint64_t producer_pa; diff --git a/hw/scsi/trace-events b/hw/scsi/trace-events index 92d5b40f892..ae8551f2797 100644 --- a/hw/scsi/trace-events +++ b/hw/scsi/trace-events @@ -42,18 +42,18 @@ mptsas_config_sas_phy(void *dev, int address, int port, int phy_handle, int dev_ # megasas.c megasas_init_firmware(uint64_t pa) "pa 0x%" PRIx64 " " -megasas_init_queue(uint64_t queue_pa, int queue_len, uint64_t head, uint64_t tail, uint32_t flags) "queue at 0x%" PRIx64 " len %d head 0x%" PRIx64 " tail 0x%" PRIx64 " flags 0x%x" +megasas_init_queue(uint64_t queue_pa, int queue_len, uint32_t head, uint32_t tail, uint32_t flags) "queue at 0x%" PRIx64 " len %d head 0x%" PRIx32 " tail 0x%" PRIx32 " flags 0x%x" megasas_initq_map_failed(int frame) "scmd %d: failed to map queue" megasas_initq_mapped(uint64_t pa) "queue already mapped at 0x%" PRIx64 
megasas_initq_mismatch(int queue_len, int fw_cmds) "queue size %d max fw cmds %d" megasas_qf_mapped(unsigned int index) "skip mapped frame 0x%x" megasas_qf_new(unsigned int index, uint64_t frame) "frame 0x%x addr 0x%" PRIx64 megasas_qf_busy(unsigned long pa) "all frames busy for frame 0x%lx" -megasas_qf_enqueue(unsigned int index, unsigned int count, uint64_t context, unsigned int head, unsigned int tail, int busy) "frame 0x%x count %d context 0x%" PRIx64 " head 0x%x tail 0x%x busy %d" -megasas_qf_update(unsigned int head, unsigned int tail, unsigned int busy) "head 0x%x tail 0x%x busy %d" +megasas_qf_enqueue(unsigned int index, unsigned int count, uint64_t context, uint32_t head, uint32_t tail, unsigned int busy) "frame 0x%x count %d context 0x%" PRIx64 " head 0x%" PRIx32 " tail 0x%" PRIx32 " busy %u" +megasas_qf_update(uint32_t head, uint32_t tail, unsigned int busy) "head 0x%" PRIx32 " tail 0x%" PRIx32 " busy %u" megasas_qf_map_failed(int cmd, unsigned long frame) "scmd %d: frame %lu" megasas_qf_complete_noirq(uint64_t context) "context 0x%" PRIx64 " " -megasas_qf_complete(uint64_t context, unsigned int head, unsigned int tail, int busy) "context 0x%" PRIx64 " head 0x%x tail 0x%x busy %d" +megasas_qf_complete(uint64_t context, uint32_t head, uint32_t tail, int busy) "context 0x%" PRIx64 " head 0x%" PRIx32 " tail 0x%" PRIx32 " busy %u" megasas_frame_busy(uint64_t addr) "frame 0x%" PRIx64 " busy" megasas_unhandled_frame_cmd(int cmd, uint8_t frame_cmd) "scmd %d: MFI cmd 0x%x" megasas_handle_scsi(const char *frame, int bus, int dev, int lun, void *sdev, unsigned long size) "%s dev %x/%x/%x sdev %p xfer %lu" -- 2.33.1
[PATCH v3 kvm/queue 12/16] KVM: Implement fd-based memory fallocation
KVM gets notified through memfd_notifier when userspace allocates space via fallocate() on the fd which is used for guest memory. KVM can set up the mapping in the secondary MMU page tables at this time. This patch adds a function in KVM to map pfn to gfn when the page is allocated in the memory backend. While it is possible to postpone the mapping of the secondary MMU to the KVM page fault handler, we can reduce some VMExits by also mapping the secondary page tables when a page is mapped in the primary MMU. It reuses the same code as kvm_memfd_invalidate_range, except that it uses kvm_map_gfn_range as its handler. Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- include/linux/kvm_host.h | 2 ++ virt/kvm/kvm_main.c | 22 +++--- virt/kvm/memfd.c | 2 ++ 3 files changed, 23 insertions(+), 3 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 8c2359175509..ad89a0e8bf6b 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -2017,6 +2017,8 @@ static inline void kvm_handle_signal_exit(struct kvm_vcpu *vcpu) #ifdef CONFIG_MEMFD_OPS int kvm_memfd_invalidate_range(struct kvm *kvm, struct inode *inode, unsigned long start, unsigned long end); +int kvm_memfd_fallocate_range(struct kvm *kvm, struct inode *inode, + unsigned long start, unsigned long end); #endif /* CONFIG_MEMFD_OPS */ diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 660ce15973ad..36dd2adcd7fc 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -891,15 +891,17 @@ static int kvm_init_mmu_notifier(struct kvm *kvm) #endif /* CONFIG_MMU_NOTIFIER && KVM_ARCH_WANT_MMU_NOTIFIER */ #ifdef CONFIG_MEMFD_OPS -int kvm_memfd_invalidate_range(struct kvm *kvm, struct inode *inode, - unsigned long start, unsigned long end) +int kvm_memfd_handle_range(struct kvm *kvm, struct inode *inode, + unsigned long start, unsigned long end, + gfn_handler_t handler) + { int ret; const struct kvm_useraddr_range useraddr_range = { .start = start, .end= end, .pte= __pte(0), - 
.handler= kvm_unmap_gfn_range, + .handler= handler, .on_lock= (void *)kvm_null_fn, .flush_on_ret = true, .may_block = false, @@ -914,6 +916,20 @@ int kvm_memfd_invalidate_range(struct kvm *kvm, struct inode *inode, return ret; } + +int kvm_memfd_invalidate_range(struct kvm *kvm, struct inode *inode, + unsigned long start, unsigned long end) +{ + return kvm_memfd_handle_range(kvm, inode, start, end, + kvm_unmap_gfn_range); +} + +int kvm_memfd_fallocate_range(struct kvm *kvm, struct inode *inode, + unsigned long start, unsigned long end) +{ + return kvm_memfd_handle_range(kvm, inode, start, end, + kvm_map_gfn_range); +} #endif /* CONFIG_MEMFD_OPS */ #ifdef CONFIG_HAVE_KVM_PM_NOTIFIER diff --git a/virt/kvm/memfd.c b/virt/kvm/memfd.c index 547f65f5a187..91a17c9fbc49 100644 --- a/virt/kvm/memfd.c +++ b/virt/kvm/memfd.c @@ -23,6 +23,8 @@ static void memfd_invalidate_page_range(struct inode *inode, void *owner, static void memfd_fallocate(struct inode *inode, void *owner, pgoff_t start, pgoff_t end) { + kvm_memfd_fallocate_range(owner, inode, start >> PAGE_SHIFT, + end >> PAGE_SHIFT); } static bool memfd_get_owner(void *owner) -- 2.17.1
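The refactoring in this patch, one range walker parameterised by its handler with two thin wrappers on top, can be sketched in userspace C. The model below is hypothetical throughout: the secondary MMU is reduced to a small array of booleans, and the sketch_* names do not exist in KVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_PAGES 16

/* Toy "secondary MMU": one flag per page frame. */
static bool sketch_mapped[SKETCH_NR_PAGES];

/* Same shape of idea as gfn_handler_t: a handler applied to a gfn range. */
typedef bool (*sketch_gfn_handler_t)(uint64_t start, uint64_t end);

static bool sketch_map_handler(uint64_t start, uint64_t end)
{
	for (uint64_t i = start; i < end; i++)
		sketch_mapped[i] = true;
	return start < end;
}

static bool sketch_unmap_handler(uint64_t start, uint64_t end)
{
	for (uint64_t i = start; i < end; i++)
		sketch_mapped[i] = false;
	return start < end;
}

/* Common code: validate the range once, then defer to the handler. */
static int sketch_handle_range(uint64_t start, uint64_t end,
			       sketch_gfn_handler_t handler)
{
	assert(handler != NULL);
	assert(start <= end && end <= SKETCH_NR_PAGES);
	return handler(start, end);
}

/* The two public entry points differ only in the handler they pass. */
static int sketch_invalidate_range(uint64_t start, uint64_t end)
{
	return sketch_handle_range(start, end, sketch_unmap_handler);
}

static int sketch_fallocate_range(uint64_t start, uint64_t end)
{
	return sketch_handle_range(start, end, sketch_map_handler);
}
```

Keeping the walker generic means any later change to range handling only has to be made in one place, which appears to be the motivation for introducing kvm_memfd_handle_range().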
[PATCH v3 kvm/queue 15/16] KVM: Use kvm_userspace_memory_region_ext
Use the new extended memslot structure kvm_userspace_memory_region_ext, which includes two additional fd/ofs fields compared to the current kvm_userspace_memory_region. The fd/ofs fields will be copied from userspace only when KVM_MEM_PRIVATE is set. Inside KVM, we change all existing uses of kvm_userspace_memory_region to kvm_userspace_memory_region_ext, since the new extended structure covers all the existing fields in kvm_userspace_memory_region. Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- arch/x86/kvm/x86.c | 2 +- include/linux/kvm_host.h | 4 ++-- virt/kvm/kvm_main.c | 19 +-- 3 files changed, 16 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 42bde45a1bc2..52942195def3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11551,7 +11551,7 @@ void __user * __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, } for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { - struct kvm_userspace_memory_region m; + struct kvm_userspace_memory_region_ext m; m.slot = id | (i << 16); m.flags = 0; diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index ad89a0e8bf6b..fabab3b77d57 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -981,9 +981,9 @@ enum kvm_mr_change { }; int kvm_set_memory_region(struct kvm *kvm, - const struct kvm_userspace_memory_region *mem); + const struct kvm_userspace_memory_region_ext *mem); int __kvm_set_memory_region(struct kvm *kvm, - const struct kvm_userspace_memory_region *mem); + const struct kvm_userspace_memory_region_ext *mem); void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *slot); void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen); int kvm_arch_prepare_memory_region(struct kvm *kvm, diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 36dd2adcd7fc..cf8dcb3b8c7f 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1514,7 +1514,7 @@ static void kvm_replace_memslot(struct kvm *kvm, } } -static int 
check_memory_region_flags(const struct kvm_userspace_memory_region *mem) +static int check_memory_region_flags(const struct kvm_userspace_memory_region_ext *mem) { u32 valid_flags = KVM_MEM_LOG_DIRTY_PAGES; @@ -1907,7 +1907,7 @@ static bool kvm_check_memslot_overlap(struct kvm_memslots *slots, int id, * Must be called holding kvm->slots_lock for write. */ int __kvm_set_memory_region(struct kvm *kvm, - const struct kvm_userspace_memory_region *mem) + const struct kvm_userspace_memory_region_ext *mem) { struct kvm_memory_slot *old, *new; struct kvm_memslots *slots; @@ -2011,7 +2011,7 @@ int __kvm_set_memory_region(struct kvm *kvm, EXPORT_SYMBOL_GPL(__kvm_set_memory_region); int kvm_set_memory_region(struct kvm *kvm, - const struct kvm_userspace_memory_region *mem) + const struct kvm_userspace_memory_region_ext *mem) { int r; @@ -2023,7 +2023,7 @@ int kvm_set_memory_region(struct kvm *kvm, EXPORT_SYMBOL_GPL(kvm_set_memory_region); static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm, - struct kvm_userspace_memory_region *mem) + struct kvm_userspace_memory_region_ext *mem) { if ((u16)mem->slot >= KVM_USER_MEM_SLOTS) return -EINVAL; @@ -4569,12 +4569,19 @@ static long kvm_vm_ioctl(struct file *filp, break; } case KVM_SET_USER_MEMORY_REGION: { - struct kvm_userspace_memory_region kvm_userspace_mem; + struct kvm_userspace_memory_region_ext kvm_userspace_mem; r = -EFAULT; if (copy_from_user(&kvm_userspace_mem, argp, - sizeof(kvm_userspace_mem))) + sizeof(struct kvm_userspace_memory_region))) goto out; + if (kvm_userspace_mem.flags & KVM_MEM_PRIVATE) { + int offset = offsetof( + struct kvm_userspace_memory_region_ext, ofs); + if (copy_from_user(&kvm_userspace_mem.ofs, argp + offset, + sizeof(kvm_userspace_mem) - offset)) + goto out; + } r = kvm_vm_ioctl_set_memory_region(kvm, &kvm_userspace_mem); break; -- 2.17.1
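The two-step copy in the KVM_SET_USER_MEMORY_REGION hunk, first the legacy-sized prefix and then the extension only when the private flag is set, can be modelled with memcpy() in place of copy_from_user(). This is a sketch under stated assumptions: the struct layouts, the SKETCH_MEM_PRIVATE bit value and all sketch_* names are made up for illustration, and the memset() is a sketch-only addition so the extension fields are defined when they are not copied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: illustrative bit value, not taken from the patch. */
#define SKETCH_MEM_PRIVATE (1u << 2)

/* Stand-in for the legacy structure. */
struct sketch_region {
	uint32_t slot;
	uint32_t flags;
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
};

/* Stand-in for the extended structure: same prefix, then ofs and fd. */
struct sketch_region_ext {
	uint32_t slot;
	uint32_t flags;
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
	uint64_t ofs;
	uint32_t fd;
};

/*
 * Copy the legacy prefix unconditionally; copy the tail starting at
 * 'ofs' only when the caller asked for a private memslot.
 */
static void sketch_copy_region(struct sketch_region_ext *dst, const void *src)
{
	const unsigned char *from = src;
	size_t offset = offsetof(struct sketch_region_ext, ofs);

	assert(dst != NULL && src != NULL);

	memset(dst, 0, sizeof(*dst)); /* sketch-only, see lead-in */
	memcpy(dst, from, sizeof(struct sketch_region));
	if (dst->flags & SKETCH_MEM_PRIVATE)
		memcpy((unsigned char *)dst + offset, from + offset,
		       sizeof(*dst) - offset);
}
```

The point of the split is that callers passing the old, shorter structure never have bytes read beyond it, because the tail is only touched when the flag says it is there.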
[PATCH v3 kvm/queue 14/16] KVM: Handle page fault for private memory
When a page fault from the secondary page table occurs while the guest is running, and the fault falls in a memslot with KVM_MEM_PRIVATE set, KVM must take different paths for private and shared accesses.

- For a private access, KVM checks whether the page is already allocated in the memory backend. If it is, KVM establishes the mapping; otherwise it exits to userspace to convert the shared page to a private one.

- For a shared access, KVM also checks whether the page is already allocated in the memory backend. If it is, KVM exits to userspace to convert the private page to a shared one; otherwise the access is treated as traditional hva-based shared memory, and KVM lets the existing code obtain a pfn with get_user_pages() and establish the mapping.

The code above assumes that private memory is persistent and pre-allocated in the memory backend, so KVM can use the allocation state as an indicator of whether a page is private or shared. The check is performed by calling kvm_memfd_get_pfn(), which is currently implemented as a pagecache search but could in theory be implemented differently (i.e. a different implementation would be needed when the page is not even mapped into the host pagecache).
Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- arch/x86/kvm/mmu/mmu.c | 73 -- arch/x86/kvm/mmu/paging_tmpl.h | 11 +++-- 2 files changed, 77 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 2856eb662a21..fbcdf62f8281 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2920,6 +2920,9 @@ int kvm_mmu_max_mapping_level(struct kvm *kvm, if (max_level == PG_LEVEL_4K) return PG_LEVEL_4K; + if (kvm_slot_is_private(slot)) + return max_level; + host_level = host_pfn_mapping_level(kvm, gfn, pfn, slot); return min(host_level, max_level); } @@ -3950,7 +3953,59 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, kvm_vcpu_gfn_to_hva(vcpu, gfn), &arch); } -static bool kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, int *r) +static bool kvm_vcpu_is_private_gfn(struct kvm_vcpu *vcpu, gfn_t gfn) +{ + /* +* At this time private gfn has not been supported yet. Other patch +* that enables it should change this. 
+*/ + return false; +} + +static bool kvm_faultin_pfn_private(struct kvm_vcpu *vcpu, + struct kvm_page_fault *fault, + bool *is_private_pfn, int *r) +{ + int order; + int mem_convert_type; + struct kvm_memory_slot *slot = fault->slot; + long pfn = kvm_memfd_get_pfn(slot, fault->gfn, &order); + + if (kvm_vcpu_is_private_gfn(vcpu, fault->addr >> PAGE_SHIFT)) { + if (pfn < 0) + mem_convert_type = KVM_EXIT_MEM_MAP_PRIVATE; + else { + fault->pfn = pfn; + if (slot->flags & KVM_MEM_READONLY) + fault->map_writable = false; + else + fault->map_writable = true; + + if (order == 0) + fault->max_level = PG_LEVEL_4K; + *is_private_pfn = true; + *r = RET_PF_FIXED; + return true; + } + } else { + if (pfn < 0) + return false; + + kvm_memfd_put_pfn(pfn); + mem_convert_type = KVM_EXIT_MEM_MAP_SHARED; + } + + vcpu->run->exit_reason = KVM_EXIT_MEMORY_ERROR; + vcpu->run->mem.type = mem_convert_type; + vcpu->run->mem.u.map.gpa = fault->gfn << PAGE_SHIFT; + vcpu->run->mem.u.map.size = PAGE_SIZE; + fault->pfn = -1; + *r = -1; + return true; +} + +static bool kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, + bool *is_private_pfn, int *r) { struct kvm_memory_slot *slot = fault->slot; bool async; @@ -3984,6 +4039,10 @@ static bool kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, } } + if (kvm_slot_is_private(slot) && + kvm_faultin_pfn_private(vcpu, fault, is_private_pfn, r)) + return *r == RET_PF_FIXED ? false : true; + async = false; fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, &async, fault->write, &fault->map_writable, @@ -4044,6 +4103,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault bool is_tdp_mmu_fault = is_tdp_mmu(vcpu->arch.mmu); unsigned long mmu_seq; + bool is_private_pfn = false; int r; fault->gfn = fault->addr >> PAGE_SHIFT; @@ -4063,7 +4123,7 @@ static int direct_page_fault(struct kvm
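The four outcomes described in the commit message of this patch can be written as a small decision function. This is a sketch, not the code from the patch: classify_fault() and the enum names are hypothetical, and the two boolean inputs stand in for kvm_vcpu_is_private_gfn() and for whether kvm_memfd_get_pfn() found a page in the backend.

```c
#include <assert.h>
#include <stdbool.h>

enum fault_action {
    MAP_PRIVATE_PFN,       /* install the backend pfn in the secondary MMU */
    EXIT_CONVERT_PRIVATE,  /* exit to userspace: convert shared -> private */
    EXIT_CONVERT_SHARED,   /* exit to userspace: convert private -> shared */
    MAP_SHARED_HVA,        /* fall through to the get_user_pages() path */
};

/*
 * private_access:   the guest accessed the gfn as private memory.
 * backend_has_page: the memory backend already has a page at this offset.
 */
static enum fault_action classify_fault(bool private_access,
                                        bool backend_has_page)
{
    if (private_access)
        return backend_has_page ? MAP_PRIVATE_PFN : EXIT_CONVERT_PRIVATE;

    return backend_has_page ? EXIT_CONVERT_SHARED : MAP_SHARED_HVA;
}
```

In the patch itself, the two exit cases correspond to KVM_EXIT_MEM_MAP_PRIVATE and KVM_EXIT_MEM_MAP_SHARED, and kvm_vcpu_is_private_gfn() is still a stub that returns false, so only the shared-access half is reachable at this point in the series.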
Re: [PATCH v4 18/19] iotests.py: implement unsupported_imgopts
On 03.12.21 14:07, Vladimir Sementsov-Ogievskiy wrote:
> We have added support for some additional IMGOPTS in python iotests like
> in bash iotests. Similarly to bash iotests, we want a way to skip some
> tests which can't work with specific IMGOPTS.
>
> Globally for python iotests we now don't support things like
> 'data_file=$TEST_IMG.ext_data_file' in IMGOPTS, so, forbid this globally
> in iotests.py.
>
> Suggested-by: Hanna Reitz
> Signed-off-by: Vladimir Sementsov-Ogievskiy
> ---
> tests/qemu-iotests/iotests.py | 15 ++-
> 1 file changed, 14 insertions(+), 1 deletion(-)

Reviewed-by: Hanna Reitz

Can we move this and the next patch before patch 2, though? Otherwise, the tests adjusted in the next patch will be broken after patch 2 (when given those unsupported options).

The move seems trivial, just wondering whether you know of anything that would prohibit this.
[PATCH v3 kvm/queue 16/16] KVM: Register/unregister private memory slot to memfd
Expose KVM_MEM_PRIVATE flag and register/unregister private memory slot to memfd when userspace sets the flag. KVM_MEM_PRIVATE is disallowed by default but architecture code can turn on it by implementing kvm_arch_private_memory_supported(). Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 34 -- 2 files changed, 33 insertions(+), 2 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index fabab3b77d57..5173c52e70d4 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1229,6 +1229,7 @@ bool kvm_arch_dy_has_pending_interrupt(struct kvm_vcpu *vcpu); int kvm_arch_post_init_vm(struct kvm *kvm); void kvm_arch_pre_destroy_vm(struct kvm *kvm); int kvm_arch_create_vm_debugfs(struct kvm *kvm); +bool kvm_arch_private_memory_supported(struct kvm *kvm); #ifndef __KVM_HAVE_ARCH_VM_ALLOC /* diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index cf8dcb3b8c7f..1caebded52c4 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1514,10 +1514,19 @@ static void kvm_replace_memslot(struct kvm *kvm, } } -static int check_memory_region_flags(const struct kvm_userspace_memory_region_ext *mem) +bool __weak kvm_arch_private_memory_supported(struct kvm *kvm) +{ + return false; +} + +static int check_memory_region_flags(struct kvm *kvm, + const struct kvm_userspace_memory_region_ext *mem) { u32 valid_flags = KVM_MEM_LOG_DIRTY_PAGES; + if (kvm_arch_private_memory_supported(kvm)) + valid_flags |= KVM_MEM_PRIVATE; + #ifdef __KVM_HAVE_READONLY_MEM valid_flags |= KVM_MEM_READONLY; #endif @@ -1756,6 +1765,8 @@ static void kvm_delete_memslot(struct kvm *kvm, struct kvm_memory_slot *old, struct kvm_memory_slot *invalid_slot) { + if (old->flags & KVM_MEM_PRIVATE) + kvm_memfd_unregister(old); /* * Remove the old memslot (in the inactive memslots) by passing NULL as * the "new" slot, and for the invalid version in the active slots. 
@@ -1836,6 +1847,14 @@ static int kvm_set_memslot(struct kvm *kvm, kvm_invalidate_memslot(kvm, old, invalid_slot); } + if (new->flags & KVM_MEM_PRIVATE && change == KVM_MR_CREATE) { + r = kvm_memfd_register(kvm, new); + if (r) { + mutex_unlock(&kvm->slots_arch_lock); + return r; + } + } + r = kvm_prepare_memory_region(kvm, old, new, change); if (r) { /* @@ -1850,6 +1869,10 @@ static int kvm_set_memslot(struct kvm *kvm, } else { mutex_unlock(&kvm->slots_arch_lock); } + + if (new->flags & KVM_MEM_PRIVATE && change == KVM_MR_CREATE) + kvm_memfd_unregister(new); + return r; } @@ -1917,7 +1940,7 @@ int __kvm_set_memory_region(struct kvm *kvm, int as_id, id; int r; - r = check_memory_region_flags(mem); + r = check_memory_region_flags(kvm, mem); if (r) return r; @@ -1974,6 +1997,10 @@ int __kvm_set_memory_region(struct kvm *kvm, if ((kvm->nr_memslot_pages + npages) < kvm->nr_memslot_pages) return -EINVAL; } else { /* Modify an existing slot. */ + /* Private memslots are immutable, they can only be deleted. */ + if (mem->flags & KVM_MEM_PRIVATE) + return -EINVAL; + if ((mem->userspace_addr != old->userspace_addr) || (npages != old->npages) || ((mem->flags ^ old->flags) & KVM_MEM_READONLY)) @@ -2002,6 +2029,9 @@ int __kvm_set_memory_region(struct kvm *kvm, new->npages = npages; new->flags = mem->flags; new->userspace_addr = mem->userspace_addr; + new->fd = mem->fd; + new->file = NULL; + new->ofs = mem->ofs; r = kvm_set_memslot(kvm, old, new, change); if (r) -- 2.17.1
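The flag gating in check_memory_region_flags() follows a common pattern: build a mask of permitted bits, then reject any request with a bit outside it. A minimal sketch, assuming local copies of the flag values and a hypothetical check_flags() helper (the real function also makes the read-only flag conditional on __KVM_HAVE_READONLY_MEM, which is assumed always present here):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the flag bits used in this series. */
#define MEM_LOG_DIRTY_PAGES (1u << 0)
#define MEM_READONLY        (1u << 1)
#define MEM_PRIVATE         (1u << 2)

/*
 * Return 0 if every bit in 'flags' is permitted, -EINVAL otherwise.
 * 'arch_supports_private' stands in for the weak arch hook, which
 * defaults to false.
 */
static int check_flags(uint32_t flags, bool arch_supports_private)
{
    uint32_t valid = MEM_LOG_DIRTY_PAGES | MEM_READONLY;

    if (arch_supports_private)
        valid |= MEM_PRIVATE;

    return (flags & ~valid) ? -EINVAL : 0;
}
```

Because the default hook returns false, userspace that sets the private flag on an architecture without support gets -EINVAL rather than a silently ignored bit.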
Re: [PATCH v4 19/19] iotests: specify some unsupported_imgopts for python iotests
On 03.12.21 14:07, Vladimir Sementsov-Ogievskiy wrote: We support IMGOPTS for python iotests now. Still a lot of tests are unprepared to common IMGOPTS that are used with bash iotests. So we should define corresponding unsupported_imgopts. Signed-off-by: Vladimir Sementsov-Ogievskiy --- tests/qemu-iotests/044 | 3 ++- tests/qemu-iotests/065 | 3 ++- tests/qemu-iotests/163 | 3 ++- tests/qemu-iotests/165 | 3 ++- tests/qemu-iotests/196 | 3 ++- tests/qemu-iotests/242 | 3 ++- tests/qemu-iotests/246 | 3 ++- tests/qemu-iotests/254 | 3 ++- tests/qemu-iotests/260 | 4 ++-- tests/qemu-iotests/274 | 3 ++- tests/qemu-iotests/281 | 3 ++- tests/qemu-iotests/303 | 3 ++- tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test | 3 ++- tests/qemu-iotests/tests/migrate-bitmaps-test | 3 ++- tests/qemu-iotests/tests/migrate-during-backup | 3 ++- tests/qemu-iotests/tests/remove-bitmap-from-backing| 3 ++- 16 files changed, 32 insertions(+), 17 deletions(-) Few of these tests look like they could be made to support refcount_bits if we filtered qemu-img info output accordingly, but I don’t mind just marking the option as unsupported, so I’m good with your approach. 
diff --git a/tests/qemu-iotests/044 b/tests/qemu-iotests/044 index 714329eb16..a5ee9a7ded 100755 --- a/tests/qemu-iotests/044 +++ b/tests/qemu-iotests/044 @@ -118,4 +118,5 @@ class TestRefcountTableGrowth(iotests.QMPTestCase): if __name__ == '__main__': iotests.activate_logging() iotests.main(supported_fmts=['qcow2'], - supported_protocols=['file']) + supported_protocols=['file'], + unsupported_imgopts=['refcount_bits']) diff --git a/tests/qemu-iotests/065 b/tests/qemu-iotests/065 index 4b3c5c6c8c..f7c1b68dad 100755 --- a/tests/qemu-iotests/065 +++ b/tests/qemu-iotests/065 @@ -139,4 +139,5 @@ TestQMP = None if __name__ == '__main__': iotests.main(supported_fmts=['qcow2'], - supported_protocols=['file']) + supported_protocols=['file'], + unsupported_imgopts=['refcount_bits']) diff --git a/tests/qemu-iotests/163 b/tests/qemu-iotests/163 index dedce8ef43..0b00df519c 100755 --- a/tests/qemu-iotests/163 +++ b/tests/qemu-iotests/163 @@ -169,4 +169,5 @@ ShrinkBaseClass = None if __name__ == '__main__': iotests.main(supported_fmts=['raw', 'qcow2'], - supported_protocols=['file']) + supported_protocols=['file'], + unsupported_imgopts=['compat=0.10']) Works for my case (I use -o compat=0.10), but compat=v2 is also allowed. For cases that don’t support anything but refcount_bits=16, you already disallow specifying any refcount_bits value, even refcount_bits=16 (which would work fine in most cases, I believe). Perhaps we should then also just disallow any compat option instead of compat=0.10 specifically? [...] 
diff --git a/tests/qemu-iotests/tests/migrate-during-backup b/tests/qemu-iotests/tests/migrate-during-backup index 34103229ee..12cc4dde2e 100755 --- a/tests/qemu-iotests/tests/migrate-during-backup +++ b/tests/qemu-iotests/tests/migrate-during-backup @@ -94,4 +94,5 @@ class TestMigrateDuringBackup(iotests.QMPTestCase): if __name__ == '__main__': iotests.main(supported_fmts=['qcow2'], - supported_protocols=['file']) + supported_protocols=['file'], + unsupported_imgopts=['compat=0.10']) It seems to me like this test can handle compat=0.10 just fine, though. Hanna
[PATCH v3 kvm/queue 00/16] KVM: mm: fd-based approach for supporting KVM guest private memory
This is the third version of this series, which tries to implement fd-based KVM guest private memory. Earlier this week I sent another v3 at:

  https://lore.kernel.org/linux-mm/20211222012223.ga22...@chaop.bj.intel.com/T/

That version is based on the latest TDX codebase. In contrast, the one you are reading is the same code rebased onto the latest kvm/queue branch at commit:

  c34c87a69727 KVM: x86: Update vPMCs when retiring branch instructions

Some changes were made to fit the kvm/queue branch, but the logic of the two versions is generally the same. The testing also differs. I tested the private memory feature of the previous version with TDX, but I cannot test the feature in this rebased version because this branch lacks TDX support. I did run a simple regression test on this version.

Introduction
------------
In general, this patch series introduces an fd-based memslot, which provides guest memory through a memfd file descriptor, fd[offset,size], instead of hva/size. The fd can be created from a supported memory filesystem such as tmpfs or hugetlbfs, which we refer to as the memory backend. KVM and the memory backend exchange callbacks when such a memslot is created. At runtime, KVM calls into the callbacks provided by the backend to get the pfn for a given fd+offset. The memory backend in turn calls into KVM callbacks when userspace calls fallocate() or punches a hole on the fd, to notify KVM to map/unmap the secondary MMU page tables.

Compared to the existing hva-based memslot, this new type of memslot allows guest memory to be unmapped from host userspace (such as QEMU) and even from the kernel itself, which reduces the attack surface and guards against userspace bugs.

Based on this fd-based memslot, we can build guest private memory for use in confidential computing environments such as Intel TDX and AMD SEV. When supported, the memory backend can provide more enforcement on the fd, and KVM can use a single memslot to hold both the private and shared parts of guest memory.
Memfd/shmem extension
---------------------
Introduces a new MFD_INACCESSIBLE flag for memfd_create(). A file created with this flag cannot be accessed through read(), write(), mmap(), etc.

In addition, two sets of callbacks are introduced as the new MEMFD_OPS:
- memfd_falloc_notifier: memfd -> KVM notifier, called when memory gets allocated/invalidated through fallocate().
- memfd_pfn_ops: KVM -> memfd, used to get a pfn for a given fd+offset.

Memslot extension
-----------------
Add the private fd and the offset into the fd to the existing 'shared' memslot so that both private and shared guest memory can live in a single memslot. A page in the memslot is either private or shared. A page is private only when it is already allocated in the backend fd; in all other cases it is treated as shared, which includes pages already mapped as shared as well as pages that have not been mapped. This means the memory backend is the source of truth for which pages are private.

Private memory map/unmap and conversion
---------------------------------------
Userspace maps and unmaps private memory by calling fallocate() on the backend fd.
- map: default fallocate() with mode=0.
- unmap: fallocate() with FALLOC_FL_PUNCH_HOLE.
The map/unmap triggers the memfd_falloc_notifier described above, which lets KVM map/unmap the secondary MMU page tables.

Test
----
NOTE: the test below is for the previous TDX-based version. For this version I only tested regular VM booting.

The code has been tested with the latest TDX code patches hosted at (https://github.com/intel/tdx/tree/kvm-upstream) with minimal TDX adaptation and QEMU support.

Example QEMU command line:
-object tdx-guest,id=tdx \
-object memory-backend-memfd-private,id=ram1,size=2G \
-machine q35,kvm-type=tdx,pic=no,kernel_irqchip=split,memory-encryption=tdx,memory-backend=ram1

Changelog
---------
v3:
- Added locking protection when calling invalidate_page_range/fallocate callbacks.
- Changed memslot structure to keep using useraddr for shared memory.
- Re-organized F_SEAL_INACCESSIBLE and MEMFD_OPS.
- Added MFD_INACCESSIBLE flag to force F_SEAL_INACCESSIBLE.
- Commit message improvement.
- Many small fixes for comments from the last version. Links of previous discussions - [1] Original design proposal: https://lkml.kernel.org/kvm/20210824005248.200037-1-sea...@google.com/ [2] Updated proposal and RFC patch v1: https://lkml.kernel.org/linux-fsdevel/2021141352.26311-1-chao.p.p...@linux.intel.com/ [3] RFC patch v2: https://x-lore.kernel.org/qemu-devel/2029134739.20218-1-chao.p.p...@linux.intel.com/ Chao Peng (14): mm/memfd: Introduce MFD_INACCESSIBLE flag KVM: Extend the memslot to support fd-based private memory KVM: Maintain ofs_tree for fast memslot lookup by file offset KVM: Implement fd-based memory using MEMFD_OPS interfaces KVM: Refactor hva based memory invalidation code KVM: Special handling for fd-based memory invalidation KVM: Split out common memory invalidation code KVM: Implement fd-based memory inva
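The map/unmap operations described in the cover letter come down to two fallocate() calls on the backend fd. The sketch below uses an ordinary memfd, since MFD_INACCESSIBLE is only proposed by this series; backend_create(), backend_map() and backend_unmap() are hypothetical helper names. Per the fallocate(2) man page, FALLOC_FL_PUNCH_HOLE has to be combined with FALLOC_FL_KEEP_SIZE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Create an anonymous shmem-backed file to act as the memory backend. */
static int backend_create(void)
{
    return memfd_create("guest-ram", MFD_CLOEXEC);
}

/* "map": allocate backing pages for [offset, offset + len). */
static int backend_map(int fd, off_t offset, off_t len)
{
    return fallocate(fd, 0, offset, len);
}

/* "unmap": free the backing pages again without changing the file size. */
static int backend_unmap(int fd, off_t offset, off_t len)
{
    return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     offset, len);
}
```

With the series applied, each of these calls would additionally fire the memfd_falloc_notifier so that KVM can update the secondary MMU page tables; on a plain memfd they only allocate and free pages.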
Re: [PATCH] Supporting AST2600 HACE engine accumulative mode
[ Adding Klaus ] On 12/22/21 03:22, Troy Lee wrote: Accumulative mode will supply a initial state and append padding bit at the end of hash stream. However, the crypto library will padding those bit automatically, so ripped it off from iov array. Signed-off-by: Troy Lee --- hw/misc/aspeed_hace.c | 30 -- include/hw/misc/aspeed_hace.h | 1 + 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/hw/misc/aspeed_hace.c b/hw/misc/aspeed_hace.c index 10f00e65f4..7c1794d6d0 100644 --- a/hw/misc/aspeed_hace.c +++ b/hw/misc/aspeed_hace.c @@ -27,6 +27,7 @@ #define R_HASH_SRC (0x20 / 4) #define R_HASH_DEST (0x24 / 4) +#define R_HASH_KEY_BUFF (0x28 / 4) #define R_HASH_SRC_LEN (0x2c / 4) #define R_HASH_CMD (0x30 / 4) @@ -94,7 +95,10 @@ static int hash_algo_lookup(uint32_t reg) return -1; } -static void do_hash_operation(AspeedHACEState *s, int algo, bool sg_mode) +static void do_hash_operation(AspeedHACEState *s, + int algo, + bool sg_mode, + bool acc_mode) { struct iovec iov[ASPEED_HACE_MAX_SG]; g_autofree uint8_t *digest_buf; @@ -103,6 +107,7 @@ static void do_hash_operation(AspeedHACEState *s, int algo, bool sg_mode) if (sg_mode) { uint32_t len = 0; +uint32_t total_len = 0; for (i = 0; !(len & SG_LIST_LEN_LAST); i++) { uint32_t addr, src; @@ -127,6 +132,21 @@ static void do_hash_operation(AspeedHACEState *s, int algo, bool sg_mode) plen = iov[i].iov_len; iov[i].iov_base = address_space_map(&s->dram_as, addr, &plen, false, MEMTXATTRS_UNSPECIFIED); + +total_len += plen; +if (acc_mode && len & SG_LIST_LEN_LAST) { +/* + * Read the message length in bit from last 64/128 bits + * and tear the padding bits from iov + */ +uint64_t stream_len; + +memcpy(&stream_len, iov[i].iov_base + iov[i].iov_len - 8, 8); +stream_len = __bswap_64(stream_len) / 8; + +if (total_len > stream_len) +iov[i].iov_len -= total_len - stream_len; +} } } else { hwaddr len = s->regs[R_HASH_SRC_LEN]; @@ -210,6 +230,9 @@ static void aspeed_hace_write(void *opaque, hwaddr addr, uint64_t data, case 
R_HASH_DEST: data &= ahc->dest_mask; break; +case R_HASH_KEY_BUFF: +data &= ahc->key_mask; +break; case R_HASH_SRC_LEN: data &= 0x0FFF; break; @@ -234,7 +257,7 @@ static void aspeed_hace_write(void *opaque, hwaddr addr, uint64_t data, __func__, data & ahc->hash_mask); break; } -do_hash_operation(s, algo, data & HASH_SG_EN); +do_hash_operation(s, algo, data & HASH_SG_EN, data & HASH_DIGEST_ACCUM); if (data & HASH_IRQ_EN) { qemu_irq_raise(s->irq); @@ -333,6 +356,7 @@ static void aspeed_ast2400_hace_class_init(ObjectClass *klass, void *data) ahc->src_mask = 0x0FFF; ahc->dest_mask = 0x0FF8; +ahc->key_mask = 0x0FC0; ahc->hash_mask = 0x03ff; /* No SG or SHA512 modes */ } @@ -351,6 +375,7 @@ static void aspeed_ast2500_hace_class_init(ObjectClass *klass, void *data) ahc->src_mask = 0x3fff; ahc->dest_mask = 0x3ff8; +ahc->key_mask = 0x3FC0; ahc->hash_mask = 0x03ff; /* No SG or SHA512 modes */ } @@ -369,6 +394,7 @@ static void aspeed_ast2600_hace_class_init(ObjectClass *klass, void *data) ahc->src_mask = 0x7FFF; ahc->dest_mask = 0x7FF8; +ahc->key_mask = 0x7FF8; ahc->hash_mask = 0x00147FFF; } diff --git a/include/hw/misc/aspeed_hace.h b/include/hw/misc/aspeed_hace.h index 94d5ada95f..2242945eb4 100644 --- a/include/hw/misc/aspeed_hace.h +++ b/include/hw/misc/aspeed_hace.h @@ -37,6 +37,7 @@ struct AspeedHACEClass { uint32_t src_mask; uint32_t dest_mask; +uint32_t key_mask; uint32_t hash_mask; };
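The padding removal in do_hash_operation() relies on the SHA padding layout: the final bytes of a padded stream hold the message length in bits, stored big-endian. A simplified sketch for a single contiguous buffer follows (the patch itself accumulates the total across scatter-gather entries). read_be64_tail() and unpadded_len() are hypothetical helpers, and the byte-wise read is a substitution for the __bswap_64() call in the patch so that the result does not depend on host endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the big-endian 64-bit bit count stored in the last 8 bytes of 'buf'. */
static uint64_t read_be64_tail(const uint8_t *buf, size_t len)
{
    uint64_t v = 0;

    assert(len >= 8);
    for (size_t i = len - 8; i < len; i++)
        v = (v << 8) | buf[i];
    return v;
}

/* Return how many leading bytes of the padded buffer are message data. */
static size_t unpadded_len(const uint8_t *buf, size_t len)
{
    uint64_t msg_bytes = read_be64_tail(buf, len) / 8;

    /* A length field larger than the buffer is bogus: trim nothing. */
    return msg_bytes < len ? (size_t)msg_bytes : len;
}
```

For example, the padded SHA-256 block for the 3-byte message "abc" is 64 bytes long and ends in the bit count 24, so unpadded_len() reports 3. Like the patch, the sketch reads only the low 64 bits of the length field, which for SHA-512 is 128 bits wide.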
[PATCH v3 kvm/queue 04/16] KVM: Extend the memslot to support fd-based private memory
Extend the memslot definition to provide fd-based private memory support by adding two new fields(fd/ofs). The memslot then can maintain memory for both shared and private pages in a single memslot. Shared pages are provided in the existing way by using userspace_addr(hva) field and get_user_pages() while private pages are provided through the new fields(fd/ofs). Since there is no 'hva' concept anymore for private memory we cannot call get_user_pages() to get a pfn, instead we rely on the newly introduced MEMFD_OPS callbacks to do the same job. This new extension is indicated by a new flag KVM_MEM_PRIVATE. Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- include/linux/kvm_host.h | 10 ++ include/uapi/linux/kvm.h | 12 2 files changed, 22 insertions(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index f8ed799e8674..2cd35560c44b 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -460,8 +460,18 @@ struct kvm_memory_slot { u32 flags; short id; u16 as_id; + u32 fd; + struct file *file; + u64 ofs; }; +static inline bool kvm_slot_is_private(const struct kvm_memory_slot *slot) +{ + if (slot && (slot->flags & KVM_MEM_PRIVATE)) + return true; + return false; +} + static inline bool kvm_slot_dirty_track_enabled(const struct kvm_memory_slot *slot) { return slot->flags & KVM_MEM_LOG_DIRTY_PAGES; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 1daa45268de2..41434322fa23 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -103,6 +103,17 @@ struct kvm_userspace_memory_region { __u64 userspace_addr; /* start of the userspace allocated memory */ }; +struct kvm_userspace_memory_region_ext { + __u32 slot; + __u32 flags; + __u64 guest_phys_addr; + __u64 memory_size; /* bytes */ + __u64 userspace_addr; /* hva */ + __u64 ofs; /* offset into fd */ + __u32 fd; + __u32 padding[5]; +}; + /* * The bit 0 ~ bit 15 of kvm_memory_region::flags are visible for userspace, * other bits are reserved for kvm 
internal use which are defined in @@ -110,6 +121,7 @@ struct kvm_userspace_memory_region { */ #define KVM_MEM_LOG_DIRTY_PAGES(1UL << 0) #define KVM_MEM_READONLY (1UL << 1) +#define KVM_MEM_PRIVATE(1UL << 2) /* for KVM_IRQ_LINE */ struct kvm_irq_level { -- 2.17.1
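With both userspace_addr and fd/ofs in one memslot, a guest frame number resolves to one of two backings. The sketch below shows the address arithmetic only, with a cut-down struct slot and hypothetical helper names; it assumes 4 KiB pages and that ofs is a byte offset, which matches how the later patches in this series subtract it before shifting by PAGE_SHIFT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define MEM_PRIVATE (1u << 2)   /* mirrors KVM_MEM_PRIVATE */

/* Cut-down stand-in for struct kvm_memory_slot. */
struct slot {
    uint64_t base_gfn;
    uint64_t npages;
    uint32_t flags;
    uint64_t userspace_addr;  /* backing for shared pages (hva) */
    uint64_t ofs;             /* byte offset into the private fd */
};

static bool slot_is_private(const struct slot *s)
{
    return s && (s->flags & MEM_PRIVATE);
}

static bool slot_contains(const struct slot *s, uint64_t gfn)
{
    return gfn >= s->base_gfn && gfn - s->base_gfn < s->npages;
}

/* Shared pages: gfn -> host virtual address. */
static uint64_t slot_gfn_to_hva(const struct slot *s, uint64_t gfn)
{
    assert(slot_contains(s, gfn));
    return s->userspace_addr + ((gfn - s->base_gfn) << PAGE_SHIFT);
}

/* Private pages: gfn -> byte offset in the backend file. */
static uint64_t slot_gfn_to_file_offset(const struct slot *s, uint64_t gfn)
{
    assert(slot_is_private(s) && slot_contains(s, gfn));
    return s->ofs + ((gfn - s->base_gfn) << PAGE_SHIFT);
}
```

The same gfn therefore has both a shared address and a private offset; which one is live for a given page is decided by the memory backend, not by the memslot.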
[PATCH v3 kvm/queue 07/16] KVM: Refactor hva based memory invalidation code
The purpose of this patch is for fd-based memslot to reuse the same mmu_notifier based guest memory invalidation code for private pages. No functional changes except renaming 'hva' to more neutral 'useraddr' so that it can also cover 'offset' in a fd that private pages live in. Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- include/linux/kvm_host.h | 8 -- virt/kvm/kvm_main.c | 55 ++-- 2 files changed, 36 insertions(+), 27 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 21f8b1880723..07863ff855cd 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1464,9 +1464,13 @@ static inline int memslot_id(struct kvm *kvm, gfn_t gfn) } static inline gfn_t -hva_to_gfn_memslot(unsigned long hva, struct kvm_memory_slot *slot) +useraddr_to_gfn_memslot(unsigned long useraddr, struct kvm_memory_slot *slot, + bool addr_is_hva) { - gfn_t gfn_offset = (hva - slot->userspace_addr) >> PAGE_SHIFT; + unsigned long useraddr_base = addr_is_hva ? slot->userspace_addr + : slot->ofs; + + gfn_t gfn_offset = (useraddr - useraddr_base) >> PAGE_SHIFT; return slot->base_gfn + gfn_offset; } diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 47e96d1eb233..b7a1c4d7eaaa 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -486,16 +486,16 @@ static void kvm_mmu_notifier_invalidate_range(struct mmu_notifier *mn, srcu_read_unlock(&kvm->srcu, idx); } -typedef bool (*hva_handler_t)(struct kvm *kvm, struct kvm_gfn_range *range); +typedef bool (*gfn_handler_t)(struct kvm *kvm, struct kvm_gfn_range *range); typedef void (*on_lock_fn_t)(struct kvm *kvm, unsigned long start, unsigned long end); -struct kvm_hva_range { +struct kvm_useraddr_range { unsigned long start; unsigned long end; pte_t pte; - hva_handler_t handler; + gfn_handler_t handler; on_lock_fn_t on_lock; bool flush_on_ret; bool may_block; @@ -515,13 +515,13 @@ static void kvm_null_fn(void) #define IS_KVM_NULL_FN(fn) ((fn) == (void *)kvm_null_fn) /* Iterate over each 
memslot intersecting [start, last] (inclusive) range */ -#define kvm_for_each_memslot_in_hva_range(node, slots, start, last) \ - for (node = interval_tree_iter_first(&slots->hva_tree, start, last); \ +#define kvm_for_each_memslot_in_useraddr_range(node, tree, start, last) \ + for (node = interval_tree_iter_first(tree, start, last); \ node; \ node = interval_tree_iter_next(node, start, last)) \ -static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, - const struct kvm_hva_range *range) +static __always_inline int __kvm_handle_useraddr_range(struct kvm *kvm, + const struct kvm_useraddr_range *range) { bool ret = false, locked = false; struct kvm_gfn_range gfn_range; @@ -540,17 +540,19 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, idx = srcu_read_lock(&kvm->srcu); for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { + struct rb_root_cached *useraddr_tree; struct interval_tree_node *node; slots = __kvm_memslots(kvm, i); - kvm_for_each_memslot_in_hva_range(node, slots, + useraddr_tree = &slots->hva_tree; + kvm_for_each_memslot_in_useraddr_range(node, useraddr_tree, range->start, range->end - 1) { - unsigned long hva_start, hva_end; + unsigned long useraddr_start, useraddr_end; slot = container_of(node, struct kvm_memory_slot, hva_node[slots->node_idx]); - hva_start = max(range->start, slot->userspace_addr); - hva_end = min(range->end, slot->userspace_addr + - (slot->npages << PAGE_SHIFT)); + useraddr_start = max(range->start, slot->userspace_addr); + useraddr_end = min(range->end, slot->userspace_addr + + (slot->npages << PAGE_SHIFT)); /* * To optimize for the likely case where the address @@ -562,11 +564,14 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, gfn_range.may_block = range->may_block; /* -* {gfn(page) | page intersects with [hva_start, hva_end)} = +* {gfn(page) | page intersects with [useraddr_start, useraddr_end)} =
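The range handling in __kvm_handle_useraddr_range() has two steps: clamp the invalidated range to the slot, then convert the clamped byte range into a half-open gfn range that covers every page it touches. A sketch with a cut-down slot and hypothetical names, assuming 4 KiB pages; the base field holds either userspace_addr or ofs, which is the choice that addr_is_hva makes in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ull << PAGE_SHIFT)

/* Cut-down slot: 'base' is userspace_addr (hva) or ofs (fd offset). */
struct slot {
    uint64_t base_gfn;
    uint64_t npages;
    uint64_t base;
};

static uint64_t addr_to_gfn(const struct slot *s, uint64_t addr)
{
    return s->base_gfn + ((addr - s->base) >> PAGE_SHIFT);
}

/*
 * Clamp [start, end) to the slot and convert it to [*gfn_start, *gfn_end).
 * Returns false when the range does not intersect the slot.
 */
static bool range_to_gfns(const struct slot *s, uint64_t start, uint64_t end,
                          uint64_t *gfn_start, uint64_t *gfn_end)
{
    uint64_t slot_end = s->base + (s->npages << PAGE_SHIFT);
    uint64_t lo = start > s->base ? start : s->base;
    uint64_t hi = end < slot_end ? end : slot_end;

    if (lo >= hi)
        return false;

    *gfn_start = addr_to_gfn(s, lo);
    /* Round the end up so a partially covered last page is included. */
    *gfn_end = addr_to_gfn(s, hi + PAGE_SIZE - 1);
    return true;
}
```

Rounding the end up by PAGE_SIZE - 1 before shifting is what makes the result cover every page that intersects the range, including a last page that is only partly inside it.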
[PATCH v3 kvm/queue 10/16] KVM: Implement fd-based memory invalidation
KVM gets notified when userspace punches a hole in a fd which is used for guest memory. KVM should invalidate the mapping in the secondary MMU page tables. This is the same logic as MMU notifier invalidation except the fd related information is carried around to indicate the memory range. KVM hence can reuse most of existing MMU notifier invalidation code including looping through the memslots and then calling into kvm_unmap_gfn_range() which should do whatever needed for fd-based memory unmapping (e.g. for private memory managed by TDX it may need call into SEAM-MODULE). Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- include/linux/kvm_host.h | 8 - virt/kvm/kvm_main.c | 69 +++- virt/kvm/memfd.c | 2 ++ 3 files changed, 63 insertions(+), 16 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 07863ff855cd..be567925831b 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -233,7 +233,7 @@ bool kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu); #endif -#ifdef KVM_ARCH_WANT_MMU_NOTIFIER +#if defined(KVM_ARCH_WANT_MMU_NOTIFIER) || defined(CONFIG_MEMFD_OPS) struct kvm_gfn_range { struct kvm_memory_slot *slot; gfn_t start; @@ -2012,4 +2012,10 @@ static inline void kvm_handle_signal_exit(struct kvm_vcpu *vcpu) /* Max number of entries allowed for each kvm dirty ring */ #define KVM_DIRTY_RING_MAX_ENTRIES 65536 +#ifdef CONFIG_MEMFD_OPS +int kvm_memfd_invalidate_range(struct kvm *kvm, struct inode *inode, + unsigned long start, unsigned long end); +#endif /* CONFIG_MEMFD_OPS */ + + #endif diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 7b7530b1ea1e..f495c1a313bd 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -468,7 +468,8 @@ void kvm_destroy_vcpus(struct kvm *kvm) } EXPORT_SYMBOL_GPL(kvm_destroy_vcpus); -#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) +#if defined(CONFIG_MEMFD_OPS) ||\ + 
(defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)) typedef bool (*gfn_handler_t)(struct kvm *kvm, struct kvm_gfn_range *range); @@ -595,6 +596,30 @@ static __always_inline int __kvm_handle_useraddr_range(struct kvm *kvm, /* The notifiers are averse to booleans. :-( */ return (int)ret; } + +static void mn_active_invalidate_count_inc(struct kvm *kvm) +{ + spin_lock(&kvm->mn_invalidate_lock); + kvm->mn_active_invalidate_count++; + spin_unlock(&kvm->mn_invalidate_lock); + +} + +static void mn_active_invalidate_count_dec(struct kvm *kvm) +{ + bool wake; + + spin_lock(&kvm->mn_invalidate_lock); + wake = (--kvm->mn_active_invalidate_count == 0); + spin_unlock(&kvm->mn_invalidate_lock); + + /* +* There can only be one waiter, since the wait happens under +* slots_lock. +*/ + if (wake) + rcuwait_wake_up(&kvm->mn_memslots_update_rcuwait); +} #endif #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) @@ -732,9 +757,7 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, * * Pairs with the decrement in range_end(). */ - spin_lock(&kvm->mn_invalidate_lock); - kvm->mn_active_invalidate_count++; - spin_unlock(&kvm->mn_invalidate_lock); + mn_active_invalidate_count_inc(kvm); __kvm_handle_useraddr_range(kvm, &useraddr_range); @@ -773,21 +796,11 @@ static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, .may_block = mmu_notifier_range_blockable(range), .inode = NULL, }; - bool wake; __kvm_handle_useraddr_range(kvm, &useraddr_range); /* Pairs with the increment in range_start(). */ - spin_lock(&kvm->mn_invalidate_lock); - wake = (--kvm->mn_active_invalidate_count == 0); - spin_unlock(&kvm->mn_invalidate_lock); - - /* -* There can only be one waiter, since the wait happens under -* slots_lock. 
-*/ - if (wake) - rcuwait_wake_up(&kvm->mn_memslots_update_rcuwait); + mn_active_invalidate_count_dec(kvm); BUG_ON(kvm->mmu_notifier_count < 0); } @@ -872,6 +885,32 @@ static int kvm_init_mmu_notifier(struct kvm *kvm) #endif /* CONFIG_MMU_NOTIFIER && KVM_ARCH_WANT_MMU_NOTIFIER */ +#ifdef CONFIG_MEMFD_OPS +int kvm_memfd_invalidate_range(struct kvm *kvm, struct inode *inode, + unsigned long start, unsigned long end) +{ + int ret; + const struct kvm_useraddr_range useraddr_range = { + .start = start, + .end= end, + .pte= __pte(0), + .handler= kvm_unmap_gfn_range, + .on_lock= (void *)kvm_
Re: [PATCH v2] audio: Add sndio backend
On Montag, 20. Dezember 2021 16:41:31 CET Christian Schoenebeck wrote: > On Freitag, 17. Dezember 2021 10:38:32 CET Alexandre Ratchov wrote: > > sndio is the native API used by OpenBSD, although it has been ported to > > other *BSD's and Linux (packages for Ubuntu, Debian, Void, Arch, etc.). > > > > Signed-off-by: Brad Smith > > Signed-off-by: Alexandre Ratchov > > --- > > > > Thank you for the reviews and all the comments. Here's a second diff > > with all the suggested changes: > > > > - Replace ISC license by SPDX-License-Identifier header > > - Fix units (milli- vs micro-) in comment about SNDIO_LATENCY_US > > - Drop outdated comment about the "size" argument of > > sndio_get_buffer_out() > > - Fix AUDIO_FORMAT_U32 handling (missing "break" statement) > > - Set {read,write] methods to audio_generic_{read,write} (fixes craches) > > - Check if backend is enabled in sndio_poll_event() > > - Use https://sndio.org in description > > - Mark options as available after 7.0 release (instead of 6.2) > > - Describe sndio-specific options (dev, latency) in qemu-options.hx > > - Add myself as reviewer to MAINTAINERS > > - Style fixes: no space after function names, use 4-space indent > > - Don't use "return foo()" if foo() returns void > > - Include backend to audio_drivers_priority[] > > > > Tested on OpenBSD, works as expected! 
> > > > MAINTAINERS | 5 + > > audio/audio.c | 1 + > > audio/audio_template.h | 2 + > > audio/meson.build | 1 + > > audio/sndioaudio.c | 555 + > > meson.build | 9 +- > > meson_options.txt | 4 +- > > qapi/audio.json | 25 +- > > qemu-options.hx | 16 ++ > > tests/vm/freebsd | 3 + > > 10 files changed, 618 insertions(+), 3 deletions(-) > > create mode 100644 audio/sndioaudio.c > > > > diff --git a/MAINTAINERS b/MAINTAINERS > > index 7543eb4d59..76bdad064f 100644 > > --- a/MAINTAINERS > > +++ b/MAINTAINERS > > @@ -2307,6 +2307,7 @@ X: audio/jackaudio.c > > > > X: audio/ossaudio.c > > X: audio/paaudio.c > > X: audio/sdlaudio.c > > > > +X: audio/sndio.c > > > > X: audio/spiceaudio.c > > F: qapi/audio.json > > > > @@ -2349,6 +2350,10 @@ R: Thomas Huth > > > > S: Odd Fixes > > F: audio/sdlaudio.c > > > > +Sndio Audio backend > > +R: Alexandre Ratchov > > +F: audio/sndio.c > > + > > Thanks Alexandre for volunteering as reviewer! > > Gerd, would it be OK to set you as maintainer for now until new > maintainer(s) adopt audio sections? Or should this start with "S: Orphan" > instead? Alexandre, if Gerd does not reply in a week or so, then please add "S: Orphan" to MAINTAINERS for now to make it clear that there is no maintainer for sndio yet to increase the chance for somebody to adopt it. From Volker's response I assume you will be posting a v3 anyway. If nobody takes care to queue your patch then let me know. Maybe I can push it through my queue this time, provided that there are enough reviews. I also saw your patch just by coincidence BTW, so please CC maintainers of affected files as suggested by Volker. Best regards, Christian Schoenebeck
Re: [PATCH] pci: Skip power-off reset when pending unplug
On Wed, Dec 22, 2021 at 04:10:07PM -0700, Alex Williamson wrote: > On Wed, 22 Dec 2021 15:48:24 -0500 > "Michael S. Tsirkin" wrote: > > > On Wed, Dec 22, 2021 at 12:08:09PM -0700, Alex Williamson wrote: > > > On Tue, 21 Dec 2021 18:40:09 -0500 > > > "Michael S. Tsirkin" wrote: > > > > > > > On Tue, Dec 21, 2021 at 09:36:56AM -0700, Alex Williamson wrote: > > > > > On Mon, 20 Dec 2021 18:03:56 -0500 > > > > > "Michael S. Tsirkin" wrote: > > > > > > > > > > > On Mon, Dec 20, 2021 at 11:26:59AM -0700, Alex Williamson wrote: > > > > > > > The below referenced commit introduced a change where devices > > > > > > > under a > > > > > > > root port slot are reset in response to removing power to the > > > > > > > slot. > > > > > > > This improves emulation relative to bare metal when the slot is > > > > > > > powered > > > > > > > off, but introduces an unnecessary step when devices under that > > > > > > > slot > > > > > > > are slated for removal. > > > > > > > > > > > > > > In the case of an assigned device, there are mandatory delays > > > > > > > associated with many device reset mechanisms which can stall the > > > > > > > hot > > > > > > > unplug operation. Also, in cases where the unplug request is > > > > > > > triggered > > > > > > > via a release operation of the host driver, internal device > > > > > > > locking in > > > > > > > the host kernel may result in a failure of the device reset > > > > > > > mechanism, > > > > > > > which generates unnecessary log warnings. > > > > > > > > > > > > > > Skip the reset for devices that are slated for unplug. > > > > > > > > > > > > > > Cc: qemu-sta...@nongnu.org > > > > > > > Fixes: d5daff7d3126 ("pcie: implement slot power control for pcie > > > > > > > root ports") > > > > > > > Signed-off-by: Alex Williamson > > > > > > > > > > > > I am not sure this is safe. 
IIUC pending_deleted_event > > > > > > is normally set after host admin requested device removal, > > > > > > while the reset could be triggered by guest for its own reasons > > > > > > such as suspend or driver reload. > > > > > > > > > > Right, the case where I mention that we get the warning looks exactly > > > > > like the admin doing a device eject, it calls qdev_unplug(). I'm not > > > > > trying to prevent arbitrary guest resets of the device, in fact there > > > > > are cases where the guest really should be able to reset the device, > > > > > nested assignment in addition to the cases you mention. Gerd noted > > > > > that this was an unintended side effect of the referenced patch to > > > > > reset device that are imminently being removed. > > > > > > > > > > > Looking at this some more, I am not sure I understand the > > > > > > issue completely. > > > > > > We have: > > > > > > > > > > > > if ((sltsta & PCI_EXP_SLTSTA_PDS) && (val & PCI_EXP_SLTCTL_PCC) > > > > > > && > > > > > > (val & PCI_EXP_SLTCTL_PIC_OFF) == PCI_EXP_SLTCTL_PIC_OFF && > > > > > > (!(old_slt_ctl & PCI_EXP_SLTCTL_PCC) || > > > > > > (old_slt_ctl & PCI_EXP_SLTCTL_PIC_OFF) != > > > > > > PCI_EXP_SLTCTL_PIC_OFF)) { > > > > > > pcie_cap_slot_do_unplug(dev); > > > > > > } > > > > > > pcie_cap_update_power(dev); > > > > > > > > > > > > so device unplug triggers first, reset follows and by that time > > > > > > there should be no devices under the bus, if there are then > > > > > > it's because guest did not clear the power indicator. > > > > > > > > > > Note that the unplug only triggers here if the Power Indicator Control > > > > > is OFF, I see writes to SLTCTL in the following order: > > > > > > > > > > 01f1 - > 02f1 -> 06f1 -> 07f1 > > > > > > > > > > So PIC changes to BLINK, then PCC changes the slot to OFF (this > > > > > triggers the reset), then PIC changes to OFF triggering the unplug. > > > > > > > > > > The unnecessary reset that occurs here is universal. 
Should the > > > > > unplug > > > > > be occurring when: > > > > > > > > > > (val & PCI_EXP_SLTCTL_PIC_OFF) != PCI_EXP_SLTCTL_PIC_ON > > > > > > > > > > ? > > > > > > > > well blinking generally means "do not remove yet". > > > > > > Blinking indicates that the slot is in a transition phase, > > > > Well the spec seems to state that blinking indicates it's waiting > > to see user does not change his/her mind by pressing the > > button again. > > We're dealing with the Power Indicator, not the Attention Indicator > here. Let's make sure we are talking about the same thing here: The Attention Indicator, which must be yellow or amber in color, indicates that an operational problem exists or that the hot-plug slot is being identified so that a human operator can locate it easily. and Attention Indicator Blinking A blinking Attention Indicator indicates that system software is identifying this slot for a human operator to find. This behavior is controlled by a user (for example, from a software user interface or management tool). On the other ha
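The SLTCTL write sequence quoted earlier in this thread (01f1 -> 02f1 -> 06f1 -> 07f1) can be decoded with a small standalone sketch. The bit positions follow the PCIe Slot Control register layout (Power Indicator Control in bits 9:8, Power Controller Control in bit 10); the macro and function names below are local to this example and are not the QEMU ones. unplug_triggers() mirrors the condition quoted from the QEMU code, with the presence-detect bit passed in as a plain flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local names for this sketch only; values follow the Slot Control layout. */
#define SLTCTL_PIC_MASK   0x0300u  /* Power Indicator Control, bits 9:8 */
#define SLTCTL_PIC_ON     0x0100u
#define SLTCTL_PIC_BLINK  0x0200u
#define SLTCTL_PIC_OFF    0x0300u
#define SLTCTL_PCC        0x0400u  /* Power Controller Control: 1 = power off */

/* Return a printable name for the Power Indicator state in a SLTCTL value. */
const char *pic_state(uint16_t sltctl)
{
    switch (sltctl & SLTCTL_PIC_MASK) {
    case SLTCTL_PIC_ON:    return "on";
    case SLTCTL_PIC_BLINK: return "blink";
    case SLTCTL_PIC_OFF:   return "off";
    default:               return "reserved";
    }
}

/* True when this write switches slot power from on to off. */
bool power_turned_off(uint16_t old, uint16_t val)
{
    return !(old & SLTCTL_PCC) && (val & SLTCTL_PCC);
}

/* Slot is powered off and its power indicator is off. */
static bool off_and_dark(uint16_t v)
{
    return (v & SLTCTL_PCC) && (v & SLTCTL_PIC_MASK) == SLTCTL_PIC_OFF;
}

/*
 * Same shape as the quoted condition: the device is present, the new
 * value is "power off + indicator off", and the old value was not.
 */
bool unplug_triggers(uint16_t old, uint16_t val, bool present)
{
    return present && off_and_dark(val) && !off_and_dark(old);
}
```

Walking the sequence through these helpers, the 02f1 -> 06f1 write is the one that removes power (and so triggers the reset under discussion), while only the final 06f1 -> 07f1 write satisfies the unplug condition.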
Re: [PATCH] acpi: validate hotplug selector on access
On Thu, Dec 23, 2021 at 10:58:14AM +0100, Mauro Matteo Cascella wrote: > Hi, > > On Wed, Dec 22, 2021 at 9:52 PM Michael S. Tsirkin wrote: > > > > On Wed, Dec 22, 2021 at 09:27:51PM +0100, Philippe Mathieu-Daudé wrote: > > > On Wed, Dec 22, 2021 at 9:20 PM Michael S. Tsirkin > > > wrote: > > > > On Wed, Dec 22, 2021 at 08:19:41PM +0100, Philippe Mathieu-Daudé wrote: > > > > > +Mauro & Alex > > > > > > > > > > On 12/21/21 15:48, Michael S. Tsirkin wrote: > > > > > > When bus is looked up on a pci write, we didn't > > > > > > validate that the lookup succeeded. > > > > > > Fuzzers thus can trigger QEMU crash by dereferencing the NULL > > > > > > bus pointer. > > > > > > > > > > > > Fixes: b32bd763a1 ("pci: introduce acpi-index property for PCI > > > > > > device") > > > > > > Cc: "Igor Mammedov" > > > > > > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/770 > > > > > > Signed-off-by: Michael S. Tsirkin > > > > > > > > > > It seems this problem is important enough to get a CVE assigned. > > > > > > > > Guest root can crash guest. > > > > I don't see why we would assign a CVE. > > > > > > Well thinking about downstream distributions, if there is a CVE assigned, > > > it helps them to have it written in the commit. Maybe I am mistaken. > > > > > > Unrelated but it seems there is a coordination problem with the > > > qemu-security@ list, > > > if this isn't a security issue, why was a CVE requested? > > > > Right. I don't think a privileged user crashing VM warrants a CVE, > > it can just halt a CPU or whatever. Just cancel the CVE request pls. > > While I agree with you that this is kind of borderline and I expressed > similar concerns in the past, I was told that: > > 1) root guest users are not necessarily trustworthy (from the host > perspective). > 2) NULL pointer deref and similar issues caused by an > ill-handled/error condition are CVE worthy, even if triggered by root. 
> 3) In other cases, DoS triggered by root is not a security issue > because it's an expected behavior and not an ill-handled/error > condition (think of assert failures, for example). > > In other words, "ill-handled condition" is the crucial factor that > makes a bug CVE worthy or not. I guess the point is that a downstream might have a slightly different code path where it would be more serious ... OK then, not a big deal for me. So what's the CVE # then? > +Prasad, can you shed some light on this? Is my understanding correct? > > Also, please note that we regularly get CVE requests for bugs like > this and many CVEs have been assigned in the past. Of course that > doesn't mean we can't change things going forward, but I think we > should make it clear (probably here: > https://www.qemu.org/docs/master/system/security.html) that these > kinds of bugs are not eligible for CVE assignment. That would be good, yes. > > > > > Mauro, please update us when you get the CVE number. > > > > > Michael, please amend the CVE number before committing the fix. > > > > > > > > > > FWIW Paolo asked every fuzzed bug reproducer to be committed > > > > > as qtest, see tests/qtest/fuzz*c. Alex has a way to generate > > > > > reproducer in plain C. > > > > > > > > > > Regards, > > > > > > > > > > Phil. > > > > > > > > -- > Mauro Matteo Cascella > Red Hat Product Security > PGP-Key ID: BB3410B0
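The class of bug discussed in this thread (a guest-controlled selector is used to look up a bus, and the result is dereferenced without checking that the lookup succeeded) can be illustrated with a minimal standalone sketch. Everything below is hypothetical: the struct, the table, and the function names are invented for illustration and are not the QEMU ACPI hotplug code.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_BUSES 4  /* hypothetical table size for this sketch */

struct bus {
    uint32_t eject_mask;  /* pending eject requests, one bit per slot */
};

static struct bus buses[NUM_BUSES];
/* Only selectors 0 and 2 refer to a bus that actually exists. */
static const int bus_present[NUM_BUSES] = { 1, 0, 1, 0 };

/* Look up a bus by selector; returns NULL when there is no such bus. */
struct bus *find_bus(uint32_t selector)
{
    if (selector >= NUM_BUSES || !bus_present[selector]) {
        return NULL;
    }
    return &buses[selector];
}

/*
 * Handle a guest write. The selector comes from the guest and cannot be
 * trusted, so the lookup result is validated before it is dereferenced.
 * Returns 0 on success, -1 when the selector does not name a bus.
 */
int handle_eject_write(uint32_t selector, uint32_t data)
{
    struct bus *b = find_bus(selector);

    if (!b) {
        return -1;  /* ignore the write instead of crashing */
    }
    b->eject_mask |= data;
    return 0;
}
```

Without the `if (!b)` check, a write with selector 1 or any out-of-range value would dereference NULL, which is the "ill-handled condition" the thread is debating.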
Re: [PATCH v1 1/2] hw/misc: Implementating dummy AST2600 I3C model
Hello, On 12/22/21 10:23, Troy Lee wrote: Introduce a dummy AST2600 I3C model. Aspeed 2600 SDK enables I3C support by default. The I3C driver will try to reset the device controller and setup through device address table register. This dummy model response these register with default value listed on ast2600v10 datasheet chapter 54.2. If the device address table register doesn't set correctly, it will cause guest machine kernel panic due to reference to invalid address. Overall looks good. Some comments, Signed-off-by: Troy Lee --- hw/misc/aspeed_i3c.c | 258 +++ hw/misc/meson.build | 1 + include/hw/misc/aspeed_i3c.h | 30 3 files changed, 289 insertions(+) create mode 100644 hw/misc/aspeed_i3c.c create mode 100644 include/hw/misc/aspeed_i3c.h diff --git a/hw/misc/aspeed_i3c.c b/hw/misc/aspeed_i3c.c new file mode 100644 index 00..9d2bda203e --- /dev/null +++ b/hw/misc/aspeed_i3c.c @@ -0,0 +1,258 @@ +/* + * ASPEED I3C Controller + * + * Copyright (C) 2021 ASPEED Technology Inc. + * + * This code is licensed under the GPL version 2 or later. See + * the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qemu/log.h" +#include "qemu/error-report.h" +#include "hw/misc/aspeed_i3c.h" +#include "qapi/error.h" +#include "migration/vmstate.h" + +/* I3C Controller Registers */ +#define R_I3CG_REG0(x) (((x * 0x10) + 0x10) / 4) +#define I3CG_REG0_SDA_PULLUP_EN_MASK GENMASK(29, 28) GENMASK() is a macro defined in the FSI model which is not upstream. There are other ways to define bitfield masks in QEMU. Please take a look at include/hw/registerfields.h. 
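As a rough illustration of the registerfields.h suggestion above, the sketch below describes the I3CG_REG1 fields from this patch by shift and length instead of GENMASK(). To stay self-contained it defines simplified local stand-ins for FIELD(), FIELD_EX32() and FIELD_DP32(); these are assumptions modeled on the QEMU header, not copies of it, and a real patch would include hw/registerfields.h instead. The helper i3cg_reg1_pack() is hypothetical.

```c
#include <stdint.h>

/* Simplified stand-ins for this sketch; the real macros live in QEMU. */
#define FIELD(reg, field, shift, length)                          \
    enum { R_##reg##_##field##_SHIFT = (shift) };                 \
    enum { R_##reg##_##field##_LENGTH = (length) };               \
    enum { R_##reg##_##field##_MASK =                             \
           (int)((~0u >> (32 - (length))) << (shift)) };

/* Extract 'length' bits starting at bit 'start' (1 <= length <= 32). */
static inline uint32_t extract32(uint32_t value, int start, int length)
{
    return (value >> start) & (~0u >> (32 - length));
}

/* Return 'value' with the given bit range replaced by 'fieldval'. */
static inline uint32_t deposit32(uint32_t value, int start, int length,
                                 uint32_t fieldval)
{
    uint32_t mask = (~0u >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

#define FIELD_EX32(storage, reg, field)                           \
    extract32((storage), R_##reg##_##field##_SHIFT,               \
              R_##reg##_##field##_LENGTH)
#define FIELD_DP32(storage, reg, field, val)                      \
    deposit32((storage), R_##reg##_##field##_SHIFT,               \
              R_##reg##_##field##_LENGTH, (val))

/* I3CG_REG1 layout, taken from the GENMASK()/BIT() defines in the patch. */
FIELD(I3CG_REG1, I2C_MODE,    0, 1)
FIELD(I3CG_REG1, TEST_MODE,   1, 1)
FIELD(I3CG_REG1, ACT_MODE,    2, 2)
FIELD(I3CG_REG1, PENDING_INT, 4, 4)
FIELD(I3CG_REG1, SA,          8, 7)
FIELD(I3CG_REG1, SA_EN,      15, 1)
FIELD(I3CG_REG1, INST_ID,    16, 4)

/* Hypothetical helper: build a REG1 value from a slave address and ID. */
uint32_t i3cg_reg1_pack(uint32_t sa, uint32_t inst_id)
{
    uint32_t reg = 0;

    reg = FIELD_DP32(reg, I3CG_REG1, SA, sa);
    reg = FIELD_DP32(reg, I3CG_REG1, INST_ID, inst_id);
    return reg;
}
```

With this style each field is declared once, and the shift, length and mask constants are derived from that single declaration rather than maintained by hand.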
+#define I3CG_REG0_SDA_PULLUP_EN_2K BIT(28) +#define I3CG_REG0_SDA_PULLUP_EN_750BIT(29) +#define I3CG_REG0_SDA_PULLUP_EN_545(BIT(29) | BIT(28)) + +#define R_I3CG_REG1(x) (((x * 0x10) + 0x14) / 4) +#define I3CG_REG1_I2C_MODE BIT(0) +#define I3CG_REG1_TEST_MODEBIT(1) +#define I3CG_REG1_ACT_MODE_MASKGENMASK(3, 2) +#define I3CG_REG1_ACT_MODE(x) (((x) << 2) & I3CG_REG1_ACT_MODE_MASK) +#define I3CG_REG1_PENDING_INT_MASK GENMASK(7, 4) +#define I3CG_REG1_PENDING_INT(x) (((x) << 4) & I3CG_REG1_PENDING_INT_MASK) +#define I3CG_REG1_SA_MASK GENMASK(14, 8) +#define I3CG_REG1_SA(x)(((x) << 8) & I3CG_REG1_SA_MASK) +#define I3CG_REG1_SA_ENBIT(15) +#define I3CG_REG1_INST_ID_MASK GENMASK(19, 16) +#define I3CG_REG1_INST_ID(x) (((x) << 16) & I3CG_REG1_INST_ID_MASK) + +/* I3C Device Registers */ +#define R_DEVICE_CTRL (0x00 / 4) +#define R_DEVICE_ADDR (0x04 / 4) +#define R_HW_CAPABILITY (0x08 / 4) +#define R_COMMAND_QUEUE_PORT(0x0c / 4) +#define R_RESPONSE_QUEUE_PORT (0x10 / 4) +#define R_RX_TX_DATA_PORT (0x14 / 4) +#define R_IBI_QUEUE_STATUS (0x18 / 4) +#define R_IBI_QUEUE_DATA(0x18 / 4) +#define R_QUEUE_THLD_CTRL (0x1c / 4) +#define R_DATA_BUFFER_THLD_CTRL (0x20 / 4) +#define R_IBI_QUEUE_CTRL(0x24 / 4) +#define R_IBI_MR_REQ_REJECT (0x2c / 4) +#define R_IBI_SIR_REQ_REJECT(0x30 / 4) +#define R_RESET_CTRL(0x34 / 4) +#define R_SLV_EVENT_CTRL(0x38 / 4) +#define R_INTR_STATUS (0x3c / 4) +#define R_INTR_STATUS_EN(0x40 / 4) +#define R_INTR_SIGNAL_EN(0x44 / 4) +#define R_INTR_FORCE(0x48 / 4) +#define R_QUEUE_STATUS_LEVEL(0x4c / 4) +#define R_DATA_BUFFER_STATUS_LEVEL (0x50 / 4) +#define R_PRESENT_STATE (0x54 / 4) +#define R_CCC_DEVICE_STATUS (0x58 / 4) +#define R_DEVICE_ADDR_TABLE_POINTER (0x5c / 4) +#define DEVICE_ADDR_TABLE_DEPTH(x) (((x) & GENMASK(31, 16)) >> 16) +#define DEVICE_ADDR_TABLE_ADDR(x) ((x) & GENMASK(7, 0)) +#define R_DEV_CHAR_TABLE_POINTER(0x60 / 4) +#define R_VENDOR_SPECIFIC_REG_POINTER (0x6c / 4) +#define R_SLV_MIPI_PID_VALUE(0x70 / 4) +#define R_SLV_PID_VALUE (0x74 / 4) 
+#define R_SLV_CHAR_CTRL (0x78 / 4) +#define R_SLV_MAX_LEN (0x7c / 4) +#define R_MAX_READ_TURNAROUND (0x80 / 4) +#define R_MAX_DATA_SPEED(0x84 / 4) +#define R_SLV_DEBUG_STATUS (0x88 / 4) +#define R_SLV_INTR_REQ (0x8c / 4) +#define R_DEVICE_CTRL_EXTENDED (0xb0 / 4) +#define R_SCL_I3C_OD_TIMING (0xb4 / 4) +#define R_SCL_I3C_PP_TIMING (0xb8 / 4) +#define R_SCL_I2C_FM_TIMING (0xbc / 4) +#define R_SCL_I2C_FMP_TIMING(0xc0 / 4) +#define R_SCL_EXT_LCNT_TIMING (0xc8 / 4) +#define R_SCL_EXT_TERMN_LCNT_TIMING (0xcc / 4) +#define R_BUS_FREE_TIMING (0xd4 / 4) +#define R_BUS_IDLE_TIMIN
Re: [PATCH v1 2/2] hw/arm/aspeed_ast2600: create i3c instance
On 12/22/21 10:23, Troy Lee wrote: This patch includes i3c instance in ast2600 soc. Signed-off-by: Troy Lee Looks good but it is based on the QEMU aspeed branch for OpenBMC. You should rebase on upstream. Thanks, C. --- hw/arm/aspeed_ast2600.c | 12 include/hw/arm/aspeed_soc.h | 3 +++ 2 files changed, 15 insertions(+) diff --git a/hw/arm/aspeed_ast2600.c b/hw/arm/aspeed_ast2600.c index f2fef9d706..219b025bc2 100644 --- a/hw/arm/aspeed_ast2600.c +++ b/hw/arm/aspeed_ast2600.c @@ -63,6 +63,7 @@ static const hwaddr aspeed_soc_ast2600_memmap[] = { [ASPEED_DEV_VUART] = 0x1E787000, [ASPEED_DEV_FSI1] = 0x1E79B000, [ASPEED_DEV_FSI2] = 0x1E79B100, +[ASPEED_DEV_I3C] = 0x1E7A, [ASPEED_DEV_SDRAM] = 0x8000, }; @@ -112,6 +113,7 @@ static const int aspeed_soc_ast2600_irqmap[] = { [ASPEED_DEV_FSI1] = 100, [ASPEED_DEV_FSI2] = 101, [ASPEED_DEV_DP]= 62, +[ASPEED_DEV_I3C] = 102, /* 102 -> 107 */ }; static qemu_irq aspeed_soc_get_irq(AspeedSoCState *s, int ctrl) @@ -230,6 +232,8 @@ static void aspeed_soc_ast2600_init(Object *obj) object_initialize_child(obj, "pwm", &s->pwm, TYPE_ASPEED_PWM); +object_initialize_child(obj, "i3c", &s->i3c, TYPE_ASPEED_I3C); + object_initialize_child(obj, "fsi[*]", &s->fsi[0], TYPE_ASPEED_APB2OPB); } @@ -542,6 +546,14 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, Error **errp) sysbus_connect_irq(SYS_BUS_DEVICE(&s->pwm), 0, aspeed_soc_get_irq(s, ASPEED_DEV_PWM)); +/* I3C */ +if (!sysbus_realize(SYS_BUS_DEVICE(&s->i3c), errp)) { +return; +} +sysbus_mmio_map(SYS_BUS_DEVICE(&s->i3c), 0, sc->memmap[ASPEED_DEV_I3C]); +sysbus_connect_irq(SYS_BUS_DEVICE(&s->i3c), 0, + aspeed_soc_get_irq(s, ASPEED_DEV_I3C)); + /* FSI */ if (!sysbus_realize(SYS_BUS_DEVICE(&s->fsi[0]), errp)) { return; diff --git a/include/hw/arm/aspeed_soc.h b/include/hw/arm/aspeed_soc.h index 0db200d813..0c950fab3c 100644 --- a/include/hw/arm/aspeed_soc.h +++ b/include/hw/arm/aspeed_soc.h @@ -21,6 +21,7 @@ #include "hw/timer/aspeed_timer.h" #include "hw/rtc/aspeed_rtc.h" #include 
"hw/i2c/aspeed_i2c.h" +#include "hw/misc/aspeed_i3c.h" #include "hw/ssi/aspeed_smc.h" #include "hw/misc/aspeed_hace.h" #include "hw/watchdog/wdt_aspeed.h" @@ -53,6 +54,7 @@ struct AspeedSoCState { AspeedRtcState rtc; AspeedTimerCtrlState timerctrl; AspeedI2CState i2c; +AspeedI3CState i3c; AspeedSCUState scu; AspeedHACEState hace; AspeedXDMAState xdma; @@ -148,6 +150,7 @@ enum { ASPEED_DEV_FSI2, ASPEED_DEV_DPMCU, ASPEED_DEV_DP, +ASPEED_DEV_I3C, }; #endif /* ASPEED_SOC_H */