[PATCH] semihosting/uaccess.c: Replaced a malloc call with g_malloc.
Replaced a call to malloc() and its respective call to free() with g_malloc() and g_free(). Signed-off-by: dinglimin V3 -> V4:Delete null checks after g malloc(). g_malloc() is preferred more than g_try_* functions, which return NULL on error, when the size of the requested allocation is small. This is because allocating few bytes should not be a problem in a healthy system. Otherwise, the system is already in a critical state. V2 -> V3:softmmu_unlock_user changes free to g free. V1 -> V2:if cpu_memory_rw_debug failed, still need to set p=NULL Signed-off-by: dinglimin --- semihosting/uaccess.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/semihosting/uaccess.c b/semihosting/uaccess.c index 8018828069..81a0f80e9f 100644 --- a/semihosting/uaccess.c +++ b/semihosting/uaccess.c @@ -14,10 +14,10 @@ void *softmmu_lock_user(CPUArchState *env, target_ulong addr, target_ulong len, bool copy) { -void *p = malloc(len); +void *p = g_malloc(len); if (p && copy) { if (cpu_memory_rw_debug(env_cpu(env), addr, p, len, 0)) { -free(p); +g_free(p); p = NULL; } } @@ -87,5 +87,5 @@ void softmmu_unlock_user(CPUArchState *env, void *p, if (len) { cpu_memory_rw_debug(env_cpu(env), addr, p, len, 1); } -free(p); +g_free(p); } -- 2.30.0.windows.2
Re: [PATCH v2 1/6] python/machine: move socket setup out of _base_args property
> On 25-Jul-2023, at 11:33 PM, John Snow wrote: > > This property isn't meant to do much else besides return a list of > strings, so move this setup back out into _pre_launch(). > > Signed-off-by: John Snow Reviewed-by: Ani Sinha > --- > python/qemu/machine/machine.py | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py > index c16a0b6fed..8be0f684fe 100644 > --- a/python/qemu/machine/machine.py > +++ b/python/qemu/machine/machine.py > @@ -300,9 +300,7 @@ def _base_args(self) -> List[str]: > > if self._qmp_set: > if self._sock_pair: > -fd = self._sock_pair[0].fileno() > -os.set_inheritable(fd, True) > -moncdev = f"socket,id=mon,fd={fd}" > +moncdev = f"socket,id=mon,fd={self._sock_pair[0].fileno()}" > elif isinstance(self._monitor_address, tuple): > moncdev = "socket,id=mon,host={},port={}".format( > *self._monitor_address > @@ -339,6 +337,7 @@ def _pre_launch(self) -> None: > if self._qmp_set: > if self._monitor_address is None: > self._sock_pair = socket.socketpair() > +os.set_inheritable(self._sock_pair[0].fileno(), True) > sock = self._sock_pair[1] > if isinstance(self._monitor_address, str): > self._remove_files.append(self._monitor_address) > -- > 2.41.0 >
Re: how to build qemu 8.1 - keycodemapdb?
26.07.2023 09:41, Michael Tokarev wrote: Looks like scripts/make-release.sh needs some updating. make-release.sh apparently does the right thing. But the published tarball does not include the 3 required sub-projects anyway. Is it about how the release is made? What is used to make the actual release tarball, is it not make-release.sh? /mjt
回复: [PATCH v1] block/stream:add flush l2_table_cache,ensure data integrity
>On 25.07.23 18:13, Denis V. Lunev wrote: >>On 7/25/23 16:25, Vladimir Sementsov-Ogievskiy wrote: >>>On 24.07.23 10:30, Evanzhang wrote: On 7/26/23 01:41, Vladimir Sementsov-Ogievskiy wrote: block_stream will not actively flush l2_table_cache,when qemu process exception exit,causing disk data loss Signed-off-by: Evanzhang --- block/stream.c | 6 ++ 1 file changed, 6 insertions(+) >>> >diff --git a/block/stream.c b/block/stream.c index e522bbd..a5e08da 100644 --- a/block/stream.c +++ b/block/stream.c @@ -207,6 +207,12 @@ static int coroutine_fn stream_run(Job *job, Error **errp) } } + /* + * Complete stream_populate,force flush l2_table_cache,to + * avoid unexpected termination of process, l2_table loss + */ + qcow2_cache_flush(bs, ((BDRVQcow2State +*)bs->opaque)->l2_table_cache); + /* Do not remove the backing file if an error was there but ignored. */ return error; } >>> >>> Hi! >>> >>> I think, it's more correct just call bdrv_co_flush(bs), which should do all >>> the job. Also, stream_run() should fail if flush fails. >>> >>> Also, I remember I've done it for all (or at least several) blockjobs >>> generically, so that any blockjob must succesfully flush target to report >>> success.. But now I can find neither my patches nor the code :( Den, Kevin, >>> Hanna, don't you remember this topic? >>> >> This was a part of compressed write cache series, which was postponed. >> >> https://lore.kernel.org/all/20210305173507.393137-1-vsementsov@virtuoz >> zo.com/T/#m87315593ed5ab16e5d0e4e7a5ae6d776fbbaec77 >> >> We have it ported to 7.0 QEMU. >> >> Not a problem to port to master and resend. >> Will this make a sense? >> >O, thanks! Patch 01 applies with a little conflict to master, so I'll just >resend it myself. > Thanks all ! With best regards !
Re: [PATCH v7 4/4] tests/qtest: Introduce tests for UFS
Hi! On 26/07/2023 07.30, Jeuk Kim wrote: This patch includes the following tests Test mmio read Test ufs device initialization and ufs-lu recognition Test I/O (Performs a write followed by a read to verify) Signed-off-by: Jeuk Kim --- ... diff --git a/tests/qtest/ufs-test.c b/tests/qtest/ufs-test.c new file mode 100644 index 00..5104a0a56a --- /dev/null +++ b/tests/qtest/ufs-test.c @@ -0,0 +1,575 @@ +/* + * QTest testcase for UFS + * + * Copyright (c) 2023 Samsung Electronics Co., Ltd. All rights reserved. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "qemu/osdep.h" +#include "qemu/module.h" +#include "qemu/units.h" +#include "libqtest.h" +#include "libqos/qgraph.h" +#include "libqos/pci.h" +#include "scsi/constants.h" +#include "include/block/ufs.h" + +/* Test images sizes in Bytes */ +#define TEST_IMAGE_SIZE (64 * 1024 * 1024) +/* Timeout for various operations, in seconds. */ +#define TIMEOUT_SECONDS 5 From what I've seen in the past, it's possible that a process gets paused for 3 - 4 seconds on a very loaded CI machine, so 5 seconds is already close ... I'd suggest to use 8 - 10 seconds for a timeout instead, just to be on the safe side. +static char *drive_create(void) +{ +int fd, ret; +char *t_path; + +/* Create a temporary raw image */ +fd = g_file_open_tmp("qtest.XX", &t_path, NULL); Could you maybe use "qtest-ufs.XX" or something more prominent instead? ... in case the files don't get deleted correctly, it's easier to know at which test to look at later. With that change: Acked-by: Thomas Huth
Re: [PATCH v2 2/6] python/machine: close sock_pair in cleanup path
> On 25-Jul-2023, at 11:48 PM, Daniel P. Berrangé wrote: > > On Tue, Jul 25, 2023 at 02:03:33PM -0400, John Snow wrote: >> If everything has gone smoothly, we'll already have closed the socket we >> gave to the child during post_launch. The other half of the pair that we >> gave to the QMP connection should, likewise, be definitively closed by >> now. >> >> However, in the cleanup path, it's possible we've created the socketpair >> but flubbed the launch and need to clean up resources. These resources >> *would* be handled by the garbage collector, but that can happen at >> unpredictable times. Nicer to just clean them up synchronously on the >> exit path, here. >> >> Signed-off-by: John Snow Reviewed-by: Ani Sinha >> --- >> python/qemu/machine/machine.py | 5 + >> 1 file changed, 5 insertions(+) >> >> diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py >> index 8be0f684fe..26f0fb8a81 100644 >> --- a/python/qemu/machine/machine.py >> +++ b/python/qemu/machine/machine.py >> @@ -395,6 +395,11 @@ def _post_shutdown(self) -> None: >> finally: >> assert self._qmp_connection is None >> >> +if self._sock_pair: >> +self._sock_pair[0].close() >> +self._sock_pair[1].close() >> +self._sock_pair = None >> + > > Reviewed-by: Daniel P. Berrangé > > With regards, > Daniel > -- > |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o-https://fstop138.berrange.com :| > |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| >
Re: [PATCH 1/2] block/blkio: fix opening virtio-blk drivers
On Tue, Jul 25, 2023 at 04:00:38PM -0400, Stefan Hajnoczi wrote: On Mon, Jul 24, 2023 at 05:46:10PM +0200, Stefano Garzarella wrote: libblkio 1.3.0 added support of "fd" property for virtio-blk-vhost-vdpa driver. In QEMU, starting from commit cad2ccc395 ("block/blkio: use qemu_open() to support fd passing for virtio-blk") we are using `blkio_get_int(..., "fd")` to check if the "fd" property is supported for all the virtio-blk-* driver. Unfortunately that property is also available for those driver that do not support it, such as virtio-blk-vhost-user. Indeed now QEMU is failing if used with virtio-blk-vhost-user in this way: -blockdev node-name=drive0,driver=virtio-blk-vhost-user,path=vhost-user-blk.sock,cache.direct=on: Could not open 'vhost-user-blk.sock': No such device or address So, `blkio_get_int()` is not enough to check whether the driver supports the `fd` property or not. This is because the virito-blk common libblkio driver only checks whether or not `fd` is set during `blkio_connect()` and fails for those transports that do not support it (all except vhost-vdpa for now). So for now let's also check that the driver is virtio-blk-vhost-vdpa, since that's the only one that supports it. What happens when more virtio-blk-* libblkio drivers gain support for `fd`? I think we'll be back to the same problem because QEMU will be unable to distinguish between old and new libraries. If we release a v1.3.1 version of libblkio with https://gitlab.com/libblkio/libblkio/-/merge_requests/208 we can set a minimum requirement in QEMU and use blkio_set_fd() here. How about retrying with `path` if opening with `fd` fails? IIUC the only way is to check if blkio_connect() will fail with -EINVAL, that can also be generated by other issues, then retry forcing `path`. Do you see other ways? The code wouldn't be great, but we could do it for now and then when we release a new version of libblkio, do the revert and use blkio_set_int(fd) to see if it's supported or not. I don't know if it is worth it, or if it is better to merge this, release libblkio v1.3.1 and force the minimum requirement. WDYT? Thanks, Stefano
Re: [PATCH 2/2] block/blkio: use blkio_set_int("fd") to check fd support
On Tue, Jul 25, 2023 at 04:05:38PM -0400, Stefan Hajnoczi wrote: On Mon, Jul 24, 2023 at 05:46:11PM +0200, Stefano Garzarella wrote: The way the virtio-blk driver is implemented in libblkio, it's much easier to use blkio_set_int() instead of blkio_get_int() and have it fail right away to see if `fd` is supported by the transport. See https://gitlab.com/libblkio/libblkio/-/merge_requests/208 The commit description is vague about what's going on here. My understanding is: Setting the `fd` property fails with virtio-blk-* libblkio drivers that do not support fd passing since https://gitlab.com/libblkio/libblkio/-/merge_requests/208. Getting the `fd` property, on the other hand, always succeeds for virtio-blk-* libblkio drivers even when they don't support fd passing. This patch switches to setting the `fd` property because it is a better mechanism for probing fd passing support than getting the `fd` property. Please update the commit description. Thanks! Ack, I'll update it. Thanks, Stefano Signed-off-by: Stefano Garzarella --- block/blkio.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blkio.c b/block/blkio.c index ca1149042a..719b19324b 100644 --- a/block/blkio.c +++ b/block/blkio.c @@ -665,7 +665,7 @@ static int blkio_virtio_blk_common_open(BlockDriverState *bs, const char *blkio_driver = bs->drv->protocol_name; BDRVBlkioState *s = bs->opaque; bool fd_supported = false; -int fd, ret; +int ret; if (!path) { error_setg(errp, "missing 'path' option"); @@ -678,7 +678,7 @@ static int blkio_virtio_blk_common_open(BlockDriverState *bs, } if (strcmp(blkio_driver, "virtio-blk-vhost-vdpa") == 0 && -blkio_get_int(s->blkio, "fd", &fd) == 0) { +blkio_set_int(s->blkio, "fd", -1) == 0) { fd_supported = true; } @@ -688,7 +688,7 @@ static int blkio_virtio_blk_common_open(BlockDriverState *bs, * layer through the "/dev/fdset/N" special path. */ if (fd_supported) { -int open_flags; +int open_flags, fd; if (flags & BDRV_O_RDWR) { open_flags = O_RDWR; -- 2.41.0
Re: avocado test failing INTERRUPTED for "Missing asset"
On 25/7/23 19:13, Peter Maydell wrote: Currently this CI job is failing: https://gitlab.com/qemu-project/qemu/-/jobs/4737819946 because: (05/59) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_arm_exynos4210_initrd: INTERRUPTED: Missing asset https://snapshot.debian.org/archive/debian/20190928T224601Z/pool/main/l/linux/linux-image-4.19.0-6-armmp_4.19.67-2+deb10u1_armhf.deb\nRunner error occurred: Timeout reached\nOriginal status: CANCEL\n{'name': '05-tests/avocado/boot_linux_console... (90.67 s) Why is a "Missing asset" causing a timeout after 90 seconds, rather than being accounted as a "SKIP" ("missing requirements in the test environment" sounds like what we have here) ? Maybe something to report to upstream Avocado. That said, we want CI to be reproducible. If we fetch assets from unreliable sources, we can't be reproducible. If we are unable to provide a assets cache, we'll keep hitting this problem. If we can not find a way to have assets stored (requiring sysadmin time setting up some infra, possibly only for GitLab), then I'd consider stopping running tests depending on external assets on CI; otherwise at some point we'll get tired to waste time figuring out the same problem. As a maintainer I'm happy to run the avocado tests using my local assets cache, and I would rather keep using the framework. But then my cache is likely different from others (users, maintainers, CI). Similarly few users/maintainers end up having the same cache and running the same set of tests. $ du -chs ~/avocado/data/cache/ 5.7G/Users/philmd/avocado/data/cache/ Some files are older than 3 years, and I'm happy to still run the tests depending on them (although they disappeared from their original http server). I don't understand the debug.log, because it says all of * that it retrieved the URL * that it wanted to cancel the test * that the test timed out Here it is: 16:03:16 DEBUG| PARAMS (key=arch, path=*, default=arm) => 'arm' 16:03:16 DEBUG| PARAMS (key=cpu, path=*, default=None) => None 16:03:16 DEBUG| PARAMS (key=qemu_bin, path=*, default=./qemu-system-arm) => './qemu-system-arm' 16:03:16 DEBUG| PARAMS (key=machine, path=*, default=smdkc210) => 'smdkc210' 16:03:16 INFO | Asset not in cache, fetching it. 16:03:16 INFO | Fetching https://snapshot.debian.org/archive/debian/20190928T224601Z/pool/main/l/linux/linux-image-4.19.0-6-armmp_4.19.67-2+deb10u1_armhf.deb -> /builds/qemu-project/qemu/avocado-cache/by_location/5f20376efeb69c8898caaff3edf7de45b4540163/linux-image-4.19.0-6-armmp_4.19.67-2+deb10u1_armhf.deb.ooffovd_ 16:04:05 DEBUG| Retrieved URL "https://snapshot.debian.org/archive/debian/20190928T224601Z/pool/main/l/linux/linux-image-4.19.0-6-armmp_4.19.67-2+deb10u1_armhf.deb": content-length 33882084, date: "Tue, 25 Jul 2023 16:03:16 GMT", last-modified: "Tue, 24 Sep 2019 22:31:23 GMT" 16:04:46 ERROR| RuntimeError: Test interrupted by SIGTERM 16:04:46 ERROR| 16:04:46 ERROR| Reproduced traceback from: /builds/qemu-project/qemu/build/tests/venv/lib/python3.9/site-packages/avocado/core/test.py:767 16:04:46 ERROR| Traceback (most recent call last): 16:04:46 ERROR| File "/builds/qemu-project/qemu/build/tests/venv/lib/python3.9/site-packages/avocado/core/test.py", line 1043, in fetch_asset 16:04:46 ERROR| return asset_obj.fetch() 16:04:46 ERROR| File "/builds/qemu-project/qemu/build/tests/venv/lib/python3.9/site-packages/avocado/utils/asset.py", line 381, in fetch 16:04:46 ERROR| raise OSError("Failed to fetch %s (%s)." % (self.asset_name, error)) 16:04:46 ERROR| OSError: Failed to fetch linux-image-4.19.0-6-armmp_4.19.67-2+deb10u1_armhf.deb (Test interrupted by SIGTERM). 16:04:46 ERROR| 16:04:46 ERROR| During handling of the above exception, another exception occurred: 16:04:46 ERROR| 16:04:46 ERROR| Traceback (most recent call last): 16:04:46 ERROR| File "/builds/qemu-project/qemu/build/tests/avocado/boot_linux_console.py", line 514, in test_arm_exynos4210_initrd 16:04:46 ERROR| deb_path = self.fetch_asset(deb_url, asset_hash=deb_hash) 16:04:46 ERROR| File "/builds/qemu-project/qemu/build/tests/avocado/avocado_qemu/__init__.py", line 260, in fetch_asset 16:04:46 ERROR| return super().fetch_asset(name, 16:04:46 ERROR| File "/builds/qemu-project/qemu/build/tests/venv/lib/python3.9/site-packages/avocado/core/test.py", line 1049, in fetch_asset 16:04:46 ERROR| self.cancel("Missing asset {}".format(name)) 16:04:46 ERROR| File "/builds/qemu-project/qemu/build/tests/venv/lib/python3.9/site-packages/avocado/core/test.py", line 988, in cancel 16:04:46 ERROR| raise exceptions.TestCancel(message) 16:04:46 ERROR| avocado.core.exceptions.TestCancel: Missing asset https://snapshot.debian.org/archive/debian/20190928T224601Z/pool/main/l/linux/linux-image-4.19.0-6-armmp_4.19.67-2+deb10u1_armhf.deb 16:04:46 ERROR| 16:04:46 ERROR| CANCEL 05-tests/avocado/boot_linux_console.py:BootLinuxConsole.test_arm_exynos4210_initrd -> TestCancel: Missing asset https://snapshot.deb
[RESEND PATCH v3 1/1] target/riscv: Add Zihintntl extension ISA string to DTS
RVA23 Profiles states: The RVA23 profiles are intended to be used for 64-bit application processors that will run rich OS stacks from standard binary OS distributions and with a substantial number of third-party binary user applications that will be supported over a considerable length of time in the field. The chapter 4 of the unprivileged spec introduces the Zihintntl extension and Zihintntl is a mandatory extension presented in RVA23 Profiles, whose purpose is to enable application and operating system portability across different implementations. Thus the DTS should contain the Zihintntl ISA string in order to pass to software. The unprivileged spec states: Like any HINTs, these instructions may be freely ignored. Hence, although they are described in terms of cache-based memory hierarchies, they do not mandate the provision of caches. These instructions are encoded with non-used opcode, e.g. ADD x0, x0, x2, which QEMU already supports, and QEMU does not emulate cache. Therefore these instructions can be considered as a no-op, and we only need to add a new property for the Zihintntl extension. Reviewed-by: Frank Chang Reviewed-by: Alistair Francis Signed-off-by: Jason Chien --- target/riscv/cpu.c | 2 ++ target/riscv/cpu_cfg.h | 1 + 2 files changed, 3 insertions(+) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 921c19e6cd..a49e934b41 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -87,6 +87,7 @@ static const struct isa_ext_data isa_edata_arr[] = { ISA_EXT_DATA_ENTRY(zicond, PRIV_VERSION_1_12_0, ext_zicond), ISA_EXT_DATA_ENTRY(zicsr, PRIV_VERSION_1_10_0, ext_icsr), ISA_EXT_DATA_ENTRY(zifencei, PRIV_VERSION_1_10_0, ext_ifencei), +ISA_EXT_DATA_ENTRY(zihintntl, PRIV_VERSION_1_10_0, ext_zihintntl), ISA_EXT_DATA_ENTRY(zihintpause, PRIV_VERSION_1_10_0, ext_zihintpause), ISA_EXT_DATA_ENTRY(zmmul, PRIV_VERSION_1_12_0, ext_zmmul), ISA_EXT_DATA_ENTRY(zawrs, PRIV_VERSION_1_12_0, ext_zawrs), @@ -1763,6 +1764,7 @@ static Property riscv_cpu_extensions[] = { DEFINE_PROP_BOOL("sscofpmf", RISCVCPU, cfg.ext_sscofpmf, false), DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true), DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true), +DEFINE_PROP_BOOL("Zihintntl", RISCVCPU, cfg.ext_zihintntl, true), DEFINE_PROP_BOOL("Zihintpause", RISCVCPU, cfg.ext_zihintpause, true), DEFINE_PROP_BOOL("Zawrs", RISCVCPU, cfg.ext_zawrs, true), DEFINE_PROP_BOOL("Zfa", RISCVCPU, cfg.ext_zfa, true), diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h index 2bd9510ba3..518686eaa3 100644 --- a/target/riscv/cpu_cfg.h +++ b/target/riscv/cpu_cfg.h @@ -66,6 +66,7 @@ struct RISCVCPUConfig { bool ext_icbom; bool ext_icboz; bool ext_zicond; +bool ext_zihintntl; bool ext_zihintpause; bool ext_smstateen; bool ext_sstc; -- 2.17.1
[RESEND PATCH v3 0/1] target/riscv: Add Zihintntl extension ISA string to DTS
In v2, I rebased the patch on https://github.com/alistair23/qemu/tree/riscv-to-apply.next However, I forgot to add "Reviewed-by" in v2, so I add them in v3. Jason Chien (1): target/riscv: Add Zihintntl extension ISA string to DTS target/riscv/cpu.c | 2 ++ target/riscv/cpu_cfg.h | 1 + 2 files changed, 3 insertions(+) -- 2.17.1
[PATCH v2] block/blkio: do not use open flags in qemu_open()
qemu_open() in blkio_virtio_blk_common_open() is used to open the character device (e.g. /dev/vhost-vdpa-0 or /dev/vfio/vfio) or in the future eventually the unix socket. In all these cases we cannot open the path in read-only mode, when the `read-only` option of blockdev is on, because the exchange of IOCTL commands for example will fail. In order to open the device read-only, we have to use the `read-only` property of the libblkio driver as we already do in blkio_file_open(). Fixes: cad2ccc395 ("block/blkio: use qemu_open() to support fd passing for virtio-blk") Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2225439 Reported-by: Qing Wang Signed-off-by: Stefano Garzarella --- Notes: v2: - added comment on top of qemu_open() [Daniel] v1: https://lore.kernel.org/qemu-devel/2023072555.85426-1-sgarz...@redhat.com/ block/blkio.c | 21 - 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/block/blkio.c b/block/blkio.c index 1798648134..cd6d2e55e7 100644 --- a/block/blkio.c +++ b/block/blkio.c @@ -686,15 +686,18 @@ static int blkio_virtio_blk_common_open(BlockDriverState *bs, * layer through the "/dev/fdset/N" special path. */ if (fd_supported) { -int open_flags; - -if (flags & BDRV_O_RDWR) { -open_flags = O_RDWR; -} else { -open_flags = O_RDONLY; -} - -fd = qemu_open(path, open_flags, errp); +/* + * `path` can contain the path of a character device + * (e.g. /dev/vhost-vdpa-0 or /dev/vfio/vfio) or a unix socket. + * + * So, we should always open it with O_RDWR flag, also if BDRV_O_RDWR + * is not set in the open flags, because the exchange of IOCTL commands + * for example will fail. + * + * In order to open the device read-only, we are using the `read-only` + * property of the libblkio driver in blkio_file_open(). + */ +fd = qemu_open(path, O_RDWR, errp); if (fd < 0) { return -EINVAL; } -- 2.41.0
Re: s390 intermittent test failure in qemu:block / io-qcow2-copy-before-write
Am 25.07.23 um 18:45 schrieb Peter Maydell: There seems to be an intermittent failure on the s390 host in the qemu:block / io-qcow2-copy-before-write test: https://gitlab.com/qemu-project/qemu/-/jobs/4737819873 The log says the test was expecting to do some reading and writing but got an unexpected 'permission denied' error on the read. Any idea why this might happen ? 768/835 qemu:block / io-qcow2-copy-before-write ERROR 12.05s exit status 1 PYTHON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/pyvenv/bin/python3 MALLOC_PERTURB_=101 /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/pyvenv/bin/python3 /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/../tests/qemu-iotests/check -tap -qcow2 copy-before-write --source-dir /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/qemu-iotests --build-dir /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qemu-iotests ― ✀ ― stderr: --- /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/qemu-iotests/tests/copy-before-write.out +++ /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/scratch/qcow2-file-copy-before-write/copy-before-write.out.bad @@ -1,5 +1,21 @@ - +...F +== +FAIL: test_timeout_break_snapshot (__main__.TestCbwError) +-- +Traceback (most recent call last): + File "/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/qemu-iotests/tests/copy-before-write", line 210, in test_timeout_break_snapshot + self.assertEqual(log, """\ +AssertionError: 'wrot[195 chars]read 1048576/1048576 bytes at offset 0\n1 MiB,[46 chars]c)\n' != 'wrot[195 chars]read failed: Permission denied\n' + wrote 524288/524288 bytes at offset 0 + 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + wrote 524288/524288 bytes at offset 524288 + 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) ++ read failed: Permission denied +- read 1048576/1048576 bytes at offset 0 +- 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + + -- Ran 4 tests -OK +FAILED (failures=1) (test program exited with status code 1) ―― Same failure, previous job: https://gitlab.com/qemu-project/qemu/-/jobs/4736463062 This one's a "Failed to get write lock" in io-qcow2-161: https://gitlab.com/qemu-project/qemu/-/jobs/4734846533 See https://lore.kernel.org/qemu-devel/028059df-eaf4-9e65-a195-4733b708a...@linux.ibm.com/ I was not yet able to understand that. And I can only reproduce with Ubuntu (RHEL,SLES,Fedora are fine). Upstream kernel on Ubuntu does not help, so I assume it must be something in the libraries or config.
Re: [PATCH v21 03/20] target/s390x/cpu topology: handle STSI(15) and build the SYSIB
On 7/25/23 17:41, Nina Schoetterl-Glausch wrote: On Fri, 2023-06-30 at 11:17 +0200, Pierre Morel wrote: On interception of STSI(15.1.x) the System Information Block (SYSIB) is built from the list of pre-ordered topology entries. Signed-off-by: Pierre Morel --- MAINTAINERS | 1 + qapi/machine-target.json | 14 ++ include/hw/s390x/cpu-topology.h | 25 +++ include/hw/s390x/sclp.h | 1 + target/s390x/cpu.h | 76 hw/s390x/cpu-topology.c | 4 +- target/s390x/kvm/kvm.c | 5 +- target/s390x/kvm/stsi-topology.c | 310 +++ target/s390x/kvm/meson.build | 3 +- 9 files changed, 436 insertions(+), 3 deletions(-) create mode 100644 target/s390x/kvm/stsi-topology.c [...] typedef struct S390Topology { uint8_t *cores_per_socket; +bool polarization; You don't use this as a bool and since it's no longer called vertical_polarization, it's not longer entirely clear what the value means so I think this should be a CpuS390Polarization. That also makes the assignment in patch 12 clearer since it assigns the same type. [...] right, I make the change Thanks Pierre
Re: vdpa: use io_uring passthrough command for IOCTLs [was Re: [PATCH 1/2] Reduce vdpa initialization / startup overhead]
On Tue, Jul 18, 2023 at 6:32 PM Stefano Garzarella wrote: > > On Thu, Apr 20, 2023 at 6:20 AM Jason Wang wrote: > > > > On Wed, Apr 19, 2023 at 11:33 PM Eugenio Perez Martin > > wrote: > > > > > > On Wed, Apr 19, 2023 at 12:56 AM wrote: > > > > > > > > From: Pei Li > > > > > > > > Currently, part of the vdpa initialization / startup process > > > > needs to trigger many ioctls per vq, which is very inefficient > > > > and causing unnecessary context switch between user mode and > > > > kernel mode. > > > > > > > > This patch creates an additional ioctl() command, namely > > > > VHOST_VDPA_GET_VRING_GROUP_BATCH, that will batching > > > > commands of VHOST_VDPA_GET_VRING_GROUP into a single > > > > ioctl() call. > > > > I'd expect there's a kernel patch but I didn't see that? > > > > If we want to go this way. Why simply have a more generic way, that is > > introducing something like: > > > > VHOST_CMD_BATCH which did something like > > > > struct vhost_cmd_batch { > > int ncmds; > > struct vhost_ioctls[]; > > }; > > > > Then you can batch other ioctls other than GET_VRING_GROUP? > > > > Just restarting this discussion, since I recently worked more with > io_uring passthrough commands and I think it can help here. > > The NVMe guys had a similar problem (ioctl too slow for their use > case)[1][2], so they developed a new feature in io_uring that > basically allows you to do IOCTLs asynchronously and in batches using > io_uring. > > The same feature is also used by ublk [3] and I recently talked about > this at DevConf with German [4]. > > Basically, there's a new callback in fops (struct file_operations.uring_cmd). > IIUC for NVMe (drivers/nvme/host/ioctl.c) they used exactly the same > values used for IOCTLs also for the new uring_cmd callback. > > We could do the same. The changes in the vhost-vdpa kernel module > should be simple, and we could share the code for handling ioctl and > uring_cmd. > That way any new command can be supported with both for compatibility. > > In QEMU then we can start using it to optimize the control path. > > What do you think? This looks interesting. > > If it's interesting, I could throw down an RFC with the changes or if > anyone is interested in working on it, I can help with the details. Please do that. Thanks > > Thanks, > Stefano > > [1] https://lpc.events/event/11/contributions/989/ > [2] https://lpc.events/event/16/contributions/1382/ > [3] https://lwn.net/Articles/903855/ > [4] https://www.youtube.com/watch?v=6JqNPirreoY >
Re: [PATCH] Open file as read only on private mapping in qemu_ram_alloc_from_file
On 25.07.23 18:01, ThinerLogoer wrote: At 2023-07-25 19:42:30, "David Hildenbrand" wrote: Hi, patch subject should start with "softmmu/physmem: Open ..." Sorry I am newbie to the patch submission part. I will resubmit a version of patch if the final acceptable patch after discussion is mostly the same. (For example, if this patch finally involves adding another parameter and adding various hooks, then I may feel it hard to finish the patch myself, both due to lack of knowledge of qemu source code tree, and due to lack of various environment to test every case out) No worries. I'll be happy to guide you. But if you feel more comfortable that I take over, just let me know. Anyway thanks to all your suggestions. On 25.07.23 12:52, Thiner Logoer wrote: An read only file can be mapped with read write as long as the mapping is private, which is very common case. Make At least in the environments I know, using private file mappings is a corner case ;) What is you use case? VM templating? Mostly, if I understand the terminology correct. I was experimenting on vm snapshoting that uses MAP_PRIVATE when recovering memory, similar to what firecracker says in this documentation. https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/snapshot-support.md And in my experiment qemu supports recovering from a memory file + a guest state file out of the box. In fact, `-mem-path filename4pc.ram` works out of the box (since the default parameter is map_private+readwrite), only that vanilla setup requires memory file to be writeable Oh, you're saying "-mem-path" results in a MAP_PRIVATE mapping? That sounds very nasty :/ It probably was introduced only for hugetlb handling, and wouldn't actually share memory with another process. In fact, we added MAP_SHARED handling later commit dbcb8981183592be129b2e624b7bcd4245e75fbc Author: Paolo Bonzini Date: Tue Jun 10 19:15:24 2014 +0800 hostmem: add property to map memory with MAP_SHARED A new "share" property can be used with the "memory-file" backend to map memory with MAP_SHARED instead of MAP_PRIVATE. Even one doc in docs/devel/multi-process.rst is wrong: "Note guest memory must be backed by file descriptors, such as when QEMU is given the *-mem-path* command line option." ... no, that won't work with a MAP_PRIVATE mapping. though the file never gets written. (the actual memory file & guest state file require separated hacking) And at least the patch provided here have been the solution to this last problem for me for a while. By the way the commit: "Commit 134253a4, machine: do not crash if default RAM backend name has been stolen" disallows me to use a memory backed file directly as pc.ram and make `-object memory-backed-file,*` based setup more complex (I cannot easily make the memory Can't you simply do -object memory-backed-file,id=mem1 \ -machine q35,memory-backend=mem1,share=off \ Or what would be the problem with that? unbacked by any file before snapshoting and backed by file after recovery from snapshot after this patch). This is the reason why I prefer `-mem-path` despite the doc tells that this usage is close to deprecated, and that `-mem-path` has less configurable parameters. qemu_ram_alloc_from_file open file as read only when the mapping is private, otherwise open will fail when file does not allow write. If this file does not exist or is a directory, the flag is not used, so it should be OK. from https://gitlab.com/qemu-project/qemu/-/issues/1689 Signed-off-by: Thiner Logoer --- softmmu/physmem.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/softmmu/physmem.c b/softmmu/physmem.c index 3df73542e1..e8036ee335 100644 --- a/softmmu/physmem.c +++ b/softmmu/physmem.c @@ -1945,8 +1945,15 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion*mr, int fd; bool created; RAMBlock *block; + ^ .git/rebase-apply/patch:13: trailing whitespace. I remembered I have deleted this whitespace before. Obviously I have messed up with different version of patch files, sorry about that ... No worries :) +/* + * If map is private, the fd does not need to be writable. + * This only get effective when the file is existent. "This will get ignored if the file does not yet exist." + */ +bool open_as_readonly = readonly || !(ram_flags & RAM_SHARED); -fd = file_ram_open(mem_path, memory_region_name(mr), readonly, &created, +fd = file_ram_open(mem_path, memory_region_name(mr), + open_as_readonly, &created, errp); if (fd < 0) { return NULL; Opening a file R/O will also make operations like fallocate/ftruncate/ ... fail. I saw fallocate in softmmu/physmem.c on somewhere, though I was not sure how it is actually used. Your response fills in this part. For example, this will make fallocate(FALLOC_FL_P
Re: [PATCH v7 4/4] tests/qtest: Introduce tests for UFS
On 7/26/2023 4:21 PM, Thomas Huth wrote: Hi! On 26/07/2023 07.30, Jeuk Kim wrote: This patch includes the following tests Test mmio read Test ufs device initialization and ufs-lu recognition Test I/O (Performs a write followed by a read to verify) Signed-off-by: Jeuk Kim --- ... diff --git a/tests/qtest/ufs-test.c b/tests/qtest/ufs-test.c new file mode 100644 index 00..5104a0a56a --- /dev/null +++ b/tests/qtest/ufs-test.c @@ -0,0 +1,575 @@ +/* + * QTest testcase for UFS + * + * Copyright (c) 2023 Samsung Electronics Co., Ltd. All rights reserved. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "qemu/osdep.h" +#include "qemu/module.h" +#include "qemu/units.h" +#include "libqtest.h" +#include "libqos/qgraph.h" +#include "libqos/pci.h" +#include "scsi/constants.h" +#include "include/block/ufs.h" + +/* Test images sizes in Bytes */ +#define TEST_IMAGE_SIZE (64 * 1024 * 1024) +/* Timeout for various operations, in seconds. */ +#define TIMEOUT_SECONDS 5 From what I've seen in the past, it's possible that a process gets paused for 3 - 4 seconds on a very loaded CI machine, so 5 seconds is already close ... I'd suggest to use 8 - 10 seconds for a timeout instead, just to be on the safe side. +static char *drive_create(void) +{ + int fd, ret; + char *t_path; + + /* Create a temporary raw image */ + fd = g_file_open_tmp("qtest.XX", &t_path, NULL); Could you maybe use "qtest-ufs.XX" or something more prominent instead? ... in case the files don't get deleted correctly, it's easier to know at which test to look at later. With that change: Acked-by: Thomas Huth Okay, I'm going to change the timeout to 10 and rename the file to "qtest-ufs.XX". Thank you for your comment!!
[RFC PATCH] target/i386: Truncate ESP when exiting from long mode
While working on some EFI boot changes for Linux/x86, I noticed that TCG deviates from bare metal when it comes to how it handles the value of the stack pointer register RSP when dropping out of long mode. On bare metal, RSP is truncated to 32 bits, even if the code that runs in 32-bit protected mode never uses the stack at all (and uses a long jump rather than long return to switch back to long mode). This means 64-bit code cannot rely on RSP surviving any excursions into 32-bit protected mode (with paging disabled). Let's align TCG with this behavior, so that code that relies on RSP retaining its value does not inadvertently work while bare metal does not. Observed on Intel Ice Lake cores. Cc: Paolo Bonzini Cc: Richard Henderson Cc: Eduardo Habkost Link: https://lore.kernel.org/all/20230711091453.2543622-11-a...@kernel.org/ Signed-off-by: Ard Biesheuvel --- I used this patch locally to reproduce an issue that was reported on Ice Lake but didn't trigger in my QEMU testing. Hints welcome on where the architectural behavior is specified, and in particular, whether or not other 64-bit GPRs can be relied upon to preserve their full 64-bit length values. target/i386/helper.c | 1 + 1 file changed, 1 insertion(+) diff --git a/target/i386/helper.c b/target/i386/helper.c index 89aa696c6d53d68c..a338da23a87746ed 100644 --- a/target/i386/helper.c +++ b/target/i386/helper.c @@ -149,6 +149,7 @@ void cpu_x86_update_cr0(CPUX86State *env, uint32_t new_cr0) env->efer &= ~MSR_EFER_LMA; env->hflags &= ~(HF_LMA_MASK | HF_CS64_MASK); env->eip &= 0x; +env->regs[R_ESP] &= 0x; } #endif env->cr[0] = new_cr0 | CR0_ET_MASK; -- 2.39.2
Re: [PATCH v2] block/blkio: do not use open flags in qemu_open()
On Wed, Jul 26, 2023 at 09:48:07AM +0200, Stefano Garzarella wrote: > qemu_open() in blkio_virtio_blk_common_open() is used to open the > character device (e.g. /dev/vhost-vdpa-0 or /dev/vfio/vfio) or in > the future eventually the unix socket. > > In all these cases we cannot open the path in read-only mode, > when the `read-only` option of blockdev is on, because the exchange > of IOCTL commands for example will fail. > > In order to open the device read-only, we have to use the `read-only` > property of the libblkio driver as we already do in blkio_file_open(). > > Fixes: cad2ccc395 ("block/blkio: use qemu_open() to support fd passing for > virtio-blk") > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2225439 > Reported-by: Qing Wang > Signed-off-by: Stefano Garzarella > --- > > Notes: > v2: > - added comment on top of qemu_open() [Daniel] > > v1: > https://lore.kernel.org/qemu-devel/2023072555.85426-1-sgarz...@redhat.com/ > > block/blkio.c | 21 - > 1 file changed, 12 insertions(+), 9 deletions(-) Reviewed-by: Daniel P. Berrangé With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH 01/10] hw/arm/virt-acpi-build.c: Move fw_cfg and virtio to common location
On Tue, 25 Jul 2023 22:20:36 +0530 Sunil V L wrote: > On Mon, Jul 24, 2023 at 05:18:59PM +0200, Igor Mammedov wrote: > > On Wed, 12 Jul 2023 22:09:34 +0530 > > Sunil V L wrote: > > > > > The functions which add fw_cfg and virtio to DSDT are same for ARM > > > and RISC-V. So, instead of duplicating in RISC-V, move them from > > > hw/arm/virt-acpi-build.c to common aml-build.c. > > > > > > Signed-off-by: Sunil V L > > > --- > > > hw/acpi/aml-build.c | 41 > > > hw/arm/virt-acpi-build.c| 42 - > > > hw/riscv/virt-acpi-build.c | 16 -- > > > include/hw/acpi/aml-build.h | 6 ++ > > > 4 files changed, 47 insertions(+), 58 deletions(-) > > > > > > diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c > > > > patch looks fine modulo, > > I'd put these into respective device files instead of generic > > aml-build.c which was intended for basic AML primitives > > (it 's got polluted over time with device specific functions > > but that's not the reason to continue doing that). > > > > Also having those functions along with devices models > > goes along with self enumerating ACPI devices (currently > > it works for x86 PCI/ISA device but there is no reason > > that it can't work with other types as well when > > I get there) > > > Thanks!, Igor. Let me add them to device specific files as per your > recommendation. just be careful and build test other targets (while disabling the rest) at least no to regress them due to build deps. (I'd pick 2 with ACPI support that use and not uses affected code) and 1 that uses device model but doesn't use ACPI at all (if such exists) > > Thanks! > Sunil >
Re: [PATCH v4] block-jobs: flush target at the end of .run()
On 7/25/23 19:40, Vladimir Sementsov-Ogievskiy wrote: From: Vladimir Sementsov-Ogievskiy Actually block job is not completed without this final flush. It's rather unexpected to have broken target when job was successfully completed long ago and now we fail to flush or process just crashed/killed. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/backup.c | 7 +-- block/commit.c | 2 +- block/mirror.c | 4 block/stream.c | 7 ++- blockjob.c | 18 ++ include/block/blockjob_int.h | 11 +++ 6 files changed, 45 insertions(+), 4 deletions(-) diff --git a/block/backup.c b/block/backup.c index db3791f4d1..b9ff63359a 100644 --- a/block/backup.c +++ b/block/backup.c @@ -295,10 +295,13 @@ static int coroutine_fn backup_run(Job *job, Error **errp) job_yield(job); } } else { -return backup_loop(s); +ret = backup_loop(s); +if (ret < 0) { +return ret; +} } -return 0; +return block_job_final_target_flush(&s->common, s->target_bs); } static void coroutine_fn backup_pause(Job *job) diff --git a/block/commit.c b/block/commit.c index aa45beb0f0..15df96b4f3 100644 --- a/block/commit.c +++ b/block/commit.c @@ -187,7 +187,7 @@ static int coroutine_fn commit_run(Job *job, Error **errp) } } -return 0; +return block_job_final_target_flush(&s->common, blk_bs(s->base)); } static const BlockJobDriver commit_job_driver = { diff --git a/block/mirror.c b/block/mirror.c index d3cacd1708..cd19b49f7f 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1143,6 +1143,10 @@ immediate_exit: g_free(s->in_flight_bitmap); bdrv_dirty_iter_free(s->dbi); +if (ret >= 0) { +ret = block_job_final_target_flush(&s->common, blk_bs(s->target)); +} + if (need_drain) { s->in_drain = true; bdrv_drained_begin(bs); diff --git a/block/stream.c b/block/stream.c index e522bbdec5..f7e8b35e94 100644 --- a/block/stream.c +++ b/block/stream.c @@ -131,6 +131,7 @@ static int coroutine_fn stream_run(Job *job, Error **errp) BlockDriverState *unfiltered_bs = bdrv_skip_filters(s->target_bs); int64_t len; int64_t offset = 0; +int ret; int error = 0; int64_t n = 0; /* bytes */ @@ -149,7 +150,6 @@ static int coroutine_fn stream_run(Job *job, Error **errp) for ( ; offset < len; offset += n) { bool copy; -int ret; /* Note that even when no rate limit is applied we need to yield * with no pending I/O here so that bdrv_drain_all() returns. @@ -207,6 +207,11 @@ static int coroutine_fn stream_run(Job *job, Error **errp) } } +ret = block_job_final_target_flush(&s->common, s->target_bs); +if (error == 0) { +error = ret; +} + /* Do not remove the backing file if an error was there but ignored. */ return error; } diff --git a/blockjob.c b/blockjob.c index 25fe8e625d..313e586b0d 100644 --- a/blockjob.c +++ b/blockjob.c @@ -611,3 +611,21 @@ AioContext *block_job_get_aio_context(BlockJob *job) GLOBAL_STATE_CODE(); return job->job.aio_context; } + +int coroutine_fn +block_job_final_target_flush(BlockJob *job, BlockDriverState *target_bs) +{ +int ret; + +WITH_GRAPH_RDLOCK_GUARD() { +ret = bdrv_co_flush(target_bs); +} + +if (ret < 0 && !block_job_is_internal(job)) { +qapi_event_send_block_job_error(job->job.id, +IO_OPERATION_TYPE_WRITE, +BLOCK_ERROR_ACTION_REPORT); +} + +return ret; +} diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h index 104824040c..617e40b916 100644 --- a/include/block/blockjob_int.h +++ b/include/block/blockjob_int.h @@ -152,4 +152,15 @@ void block_job_ratelimit_sleep(BlockJob *job); BlockErrorAction block_job_error_action(BlockJob *job, BlockdevOnError on_err, int is_read, int error); +/** + * block_job_final_target_flush: + * @job: The job to signal an error for if flush failed. + * @target_bs: The bs to flush. + * + * The function is intended to be called at the end of .run() for any data + * copying job. + */ +int coroutine_fn +block_job_final_target_flush(BlockJob *job, BlockDriverState *target_bs); + #endif Reviewed-by: Denis V. Lunev
Re: [PATCH v9 06/10] migration: New migrate and migrate-incoming argument 'channels'
On Wed, Jul 26, 2023 at 01:24:48AM +0530, Het Gala wrote: > Sorry, last reply on this patch was accidently replied only to Daniel. > Pasting the reply again so it is received by all the active maintianers > here. Apologies for the error 😅 > > On 26/07/23 12:07 am, Daniel P. Berrangé wrote: > > On Tue, Jul 25, 2023 at 07:34:09PM +0100, Daniel P. Berrangé wrote: > > > On Fri, Jul 21, 2023 at 02:49:31PM +, Het Gala wrote: > > > > MigrateChannelList allows to connect accross multiple interfaces. > > > > Add MigrateChannelList struct as argument to migration QAPIs. > > > > > > > > We plan to include multiple channels in future, to connnect > > > > multiple interfaces. Hence, we choose 'MigrateChannelList' > > > > as the new argument over 'MigrateChannel' to make migration > > > > QAPIs future proof. > > > > > > > > Suggested-by: Aravind Retnakaran > > > > Signed-off-by: Het Gala > > > > Acked-by: Markus Armbruster > > > > --- > > > > migration/migration-hmp-cmds.c | 6 +- > > > > migration/migration.c | 34 -- > > > > qapi/migration.json| 109 - > > > > softmmu/vl.c | 2 +- > > > > 4 files changed, 139 insertions(+), 12 deletions(-) > > > > > > > > diff --git a/migration/migration-hmp-cmds.c > > > > b/migration/migration-hmp-cmds.c > > > > index 9885d7c9f7..49b150f33f 100644 > > > > --- a/migration/migration-hmp-cmds.c > > > > +++ b/migration/migration-hmp-cmds.c > > > > @@ -424,7 +424,7 @@ void hmp_migrate_incoming(Monitor *mon, const QDict > > > > *qdict) > > > > Error *err = NULL; > > > > const char *uri = qdict_get_str(qdict, "uri"); > > > > -qmp_migrate_incoming(uri, &err); > > > > +qmp_migrate_incoming(uri, false, NULL, &err); > > > > hmp_handle_error(mon, err); > > > > } > > > > @@ -705,8 +705,8 @@ void hmp_migrate(Monitor *mon, const QDict *qdict) > > > > const char *uri = qdict_get_str(qdict, "uri"); > > > > Error *err = NULL; > > > > -qmp_migrate(uri, !!blk, blk, !!inc, inc, > > > > -false, false, true, resume, &err); > > > > +qmp_migrate(uri, false, NULL, !!blk, blk, !!inc, inc, > > > > + false, false, true, resume, &err); > > > > if (hmp_handle_error(mon, err)) { > > > > return; > > > > } > > > > diff --git a/migration/migration.c b/migration/migration.c > > > > index f37b388876..bd3a93fc8c 100644 > > > > --- a/migration/migration.c > > > > +++ b/migration/migration.c > > > > @@ -466,10 +466,22 @@ static bool migrate_uri_parse(const char *uri, > > > > return true; > > > > } > > > > -static void qemu_start_incoming_migration(const char *uri, Error > > > > **errp) > > > > +static void qemu_start_incoming_migration(const char *uri, bool > > > > has_channels, > > > > + MigrationChannelList > > > > *channels, > > > > + Error **errp) > > > > { > > > > g_autoptr(MigrationAddress) channel = g_new0(MigrationAddress, 1); > > > > +/* > > > > + * Having preliminary checks for uri and channel > > > > + */ > > > > +if (uri && has_channels) { > > > > +error_setg(errp, "'uri' and 'channels' arguments are mutually " > > > > + "exclusive; exactly one of the two should be > > > > present in " > > > > + "'migrate-incoming' qmp command "); > > > > +return; > > > > +} > > > This checks is both are present. > > > > > > Also needs a check if neither are present as that's invalid. > > Also it should (temporarily) raise an error if "has_channels" is > > set, as while we've added the parameter in QAPI, we've not > > implemented it yet. IOW, raise an error now, and remove the > > error in a later patch. > Ack. So in total there should be 3 checks right. 1) if 'has_channels' is > set, 2) if 'uri' and 'channels' both are present, 3) if 'uri' and 'channels' > both are absent. Basically right now only uri should allowed and should > atleast be present. Correct. > I think overall only 1) would be enough and should be checked before > 'migration_channels_and_uri_compatible()' and if 'has_channels' is set, just > return for now. With this 2) would not be necessary or not come into play in > this patch. 3) will be taken care by > 'migration_channels_and_uri_compatible()' itself IMO. I think all the checks should be in this method, as it is the entrypoint for execution of the QMP command and thus where parameter validation should live. Spreading it across methods will make it very easy to open bugs by refactoring code and not realizing the new codepaths don't do the right checks. With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: i386/xen: prevent guest from binding loopback event channel to itself
On 25/07/2023 11:05, David Woodhouse wrote: From: David Woodhouse Fuzzing showed that a guest could bind an interdomain port to itself, by guessing the next port to be allocated and putting that as the 'remote' port number. By chance, that works because the newly-allocated port has type EVTCHNSTAT_unbound. It shouldn't. Signed-off-by: David Woodhouse --- hw/i386/kvm/xen_evtchn.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) Reviewed-by: Paul Durrant
Re: how to build qemu 8.1 - keycodemapdb?
On Wed, Jul 26, 2023 at 10:15:24AM +0300, Michael Tokarev wrote: > 26.07.2023 09:41, Michael Tokarev wrote: > > > Looks like scripts/make-release.sh needs some updating. FWIW, i see the same problem as you: $ ./configure --target-list=x86_64-softmmu --disable-download Using './build' as the directory for build output ERROR: missing subprojects This is not a GIT checkout but subproject content appears to be missing. Do not use 'git archive' or GitHub download links to acquire QEMU source archives. Non-GIT builds are only supported with source archives linked from: https://www.qemu.org/download/#source Developers working with GIT can use scripts/archive-source.sh if they need to create valid source archives. > make-release.sh apparently does the right thing. But the published > tarball does not include the 3 required sub-projects anyway. > > Is it about how the release is made? What is used to make the > actual release tarball, is it not make-release.sh? make-release is what I expect to be used for making release tarballs. With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v2 1/3] block: add subcluster_size field to BlockDriverInfo
On 11.07.23 20:25, Andrey Drobyshev wrote: This is going to be used in the subsequent commit as requests alignment (in particular, during copy-on-read). This value only makes sense for the formats which support subclusters (currently QCOW2 only). If this field isn't set by driver's own bdrv_get_info() implementation, we simply set it equal to the cluster size thus treating each cluster as having a single subcluster. Reviewed-by: Eric Blake Reviewed-by: Denis V. Lunev Signed-off-by: Andrey Drobyshev Reviewed-by: Vladimir Sementsov-Ogievskiy -- Best regards, Vladimir
Re: how to build qemu 8.1 - keycodemapdb?
26.07.2023 11:50, Daniel P. Berrangé wrote: .. make-release.sh apparently does the right thing. But the published tarball does not include the 3 required sub-projects anyway. Is it about how the release is made? What is used to make the actual release tarball, is it not make-release.sh? make-release is what I expect to be used for making release tarballs. When I run ./scripts/make-release 8.1.0-rc1 , the resulting tarball includes the necessary submodules in subprojects/. It is more: it includes 2 copies of berkeley-softfloat & berkeley-testfloat, one in subprojects/ and one in roms/edk2/ArmPkg/Library/ArmSoftFloatLib/ . But the tarballs published on qemu.org does not include these. So I conclude the tarballs were not created using make-release.sh. /mjt
Re: i386/xen: prevent guest from binding loopback event channel to itself
On Wed, 2023-07-26 at 09:44 +0100, Paul Durrant wrote: > On 25/07/2023 11:05, David Woodhouse wrote: > > From: David Woodhouse > > > > Fuzzing showed that a guest could bind an interdomain port to itself, by > > guessing the next port to be allocated and putting that as the 'remote' > > port number. By chance, that works because the newly-allocated port has > > type EVTCHNSTAT_unbound. It shouldn't. > > > > Signed-off-by: David Woodhouse > > --- > > hw/i386/kvm/xen_evtchn.c | 11 +-- > > 1 file changed, 9 insertions(+), 2 deletions(-) > > > > Reviewed-by: Paul Durrant > Thanks. I'll change the title prefix to 'hw/xen' since it's in hw/ not target/i386. Please can I have also have a review for https://lore.kernel.org/qemu-devel/20076888f6bdf06a65aafc5cf954260965d45b97.ca...@infradead.org/ I'll then send these outstanding patches from my tree as a series for 8.1: David Woodhouse (4): hw/xen: Clarify (lack of) error handling in transaction_commit() hw/xen: fix off-by-one in xen_evtchn_set_gsi() i386/xen: consistent locking around Xen singleshot timers hw/xen: prevent guest from binding loopback event channel to itself smime.p7s Description: S/MIME cryptographic signature
Re: avocado test failing INTERRUPTED for "Missing asset"
On 26/07/2023 09.33, Philippe Mathieu-Daudé wrote: On 25/7/23 19:13, Peter Maydell wrote: Currently this CI job is failing: https://gitlab.com/qemu-project/qemu/-/jobs/4737819946 because: (05/59) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_arm_exynos4210_initrd: INTERRUPTED: Missing asset https://snapshot.debian.org/archive/debian/20190928T224601Z/pool/main/l/linux/linux-image-4.19.0-6-armmp_4.19.67-2+deb10u1_armhf.deb\nRunner error occurred: Timeout reached\nOriginal status: CANCEL\n{'name': '05-tests/avocado/boot_linux_console... (90.67 s) Why is a "Missing asset" causing a timeout after 90 seconds, rather than being accounted as a "SKIP" ("missing requirements in the test environment" sounds like what we have here) ? Maybe something to report to upstream Avocado. We're back to using Avocado v88.1 in QEMU. We first need someone who can update to the latest Avocado release and take care of the remaining problems... This is *very* frustrating. Thomas
Re: [PATCH v2 2/3] block/io: align requests to subcluster_size
On 11.07.23 20:25, Andrey Drobyshev wrote: When target image is using subclusters, and we align the request during copy-on-read, it makes sense to align to subcluster_size rather than cluster_size. Otherwise we end up with unnecessary allocations. This commit renames bdrv_round_to_clusters() to bdrv_round_to_subclusters() and utilizes subcluster_size field of BlockDriverInfo to make necessary alignments. It affects copy-on-read as well as mirror job (which is using bdrv_round_to_clusters()). This change also fixes the following bug with failing assert (covered by the test in the subsequent commit): qemu-img create -f qcow2 base.qcow2 64K qemu-img create -f qcow2 -o extended_l2=on,backing_file=base.qcow2,backing_fmt=qcow2 img.qcow2 64K qemu-io -c "write -P 0xaa 0 2K" img.qcow2 qemu-io -C -c "read -P 0x00 2K 62K" img.qcow2 qemu-io: ../block/io.c:1236: bdrv_co_do_copy_on_readv: Assertion `skip_bytes < pnum' failed. Reviewed-by: Eric Blake Reviewed-by: Denis V. Lunev Signed-off-by: Andrey Drobyshev Reviewed-by: Vladimir Sementsov-Ogievskiy -- Best regards, Vladimir
Re: how to build qemu 8.1 - keycodemapdb?
On Wed, Jul 26, 2023 at 12:05:41PM +0300, Michael Tokarev wrote: > 26.07.2023 11:50, Daniel P. Berrangé wrote: > .. > > > make-release.sh apparently does the right thing. But the published > > > tarball does not include the 3 required sub-projects anyway. > > > > > > Is it about how the release is made? What is used to make the > > > actual release tarball, is it not make-release.sh? > > > > make-release is what I expect to be used for making release > > tarballs. > > When I run ./scripts/make-release 8.1.0-rc1 , the resulting tarball > includes the necessary submodules in subprojects/. > > It is more: it includes 2 copies of berkeley-softfloat & berkeley-testfloat, > one in subprojects/ and one in roms/edk2/ArmPkg/Library/ArmSoftFloatLib/ . > > But the tarballs published on qemu.org does not include these. > > So I conclude the tarballs were not created using make-release.sh. I filed an issue for this and marked it as a release blocker. https://gitlab.com/qemu-project/qemu/-/issues/1791 rc0 was broken in the same way too. With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH] hw/xen: Clarify (lack of) error handling in transaction_commit()
On 20/06/2023 18:58, David Woodhouse wrote: From: David Woodhouse Coverity was unhappy (CID 1508359) because we didn't check the return of init_walk_op() in transaction_commit(), despite doing so at every other call site. Strictly speaking, this is a false positive since it can never fail. It only fails for invalid user input (transaction ID or path), and both of those are hard-coded to known sane values in this invocation. But Coverity doesn't know that, and neither does the casual reader of the code. Returning an error here would be weird, since the transaction *is* committed by this point; all the walk_op is doing is firing watches on the newly-committed changed nodes. So make it a g_assert(!ret), since it really should never happen. Signed-off-by: David Woodhouse --- hw/i386/kvm/xenstore_impl.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) Reviewed-by: Paul Durrant
Re: i386/xen: prevent guest from binding loopback event channel to itself
On 26/07/2023 10:07, David Woodhouse wrote: On Wed, 2023-07-26 at 09:44 +0100, Paul Durrant wrote: On 25/07/2023 11:05, David Woodhouse wrote: From: David Woodhouse Fuzzing showed that a guest could bind an interdomain port to itself, by guessing the next port to be allocated and putting that as the 'remote' port number. By chance, that works because the newly-allocated port has type EVTCHNSTAT_unbound. It shouldn't. Signed-off-by: David Woodhouse --- hw/i386/kvm/xen_evtchn.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) Reviewed-by: Paul Durrant Thanks. I'll change the title prefix to 'hw/xen' since it's in hw/ not target/i386. Yes, makes sense. Please can I have also have a review for https://lore.kernel.org/qemu-devel/20076888f6bdf06a65aafc5cf954260965d45b97.ca...@infradead.org/ Sorry I missed that. Done. Cheers, Paul
Re: [PATCH v2 3/3] tests/qemu-iotests/197: add testcase for CoR with subclusters
On 11.07.23 20:25, Andrey Drobyshev via wrote: Add testcase which checks that allocations during copy-on-read are performed on the subcluster basis when subclusters are enabled in target image. This testcase also triggers the following assert with previous commit not being applied, so we check that as well: qemu-io: ../block/io.c:1236: bdrv_co_do_copy_on_readv: Assertion `skip_bytes < pnum' failed. Reviewed-by: Eric Blake Reviewed-by: Denis V. Lunev Signed-off-by: Andrey Drobyshev Reviewed-by: Vladimir Sementsov-Ogievskiy -- Best regards, Vladimir
Re: [PATCH v8 01/10] parallels: Fix comments formatting inside parallels driver
On 7/18/23 12:44, Alexander Ivanov wrote: This patch is technically necessary as git patch rendering could result in moving some code from one place to the another and that hits checkpatch.pl warning. This problem specifically happens within next series. Signed-off-by: Alexander Ivanov --- block/parallels.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/block/parallels.c b/block/parallels.c index 18e34aef28..c7b2ed5a54 100644 --- a/block/parallels.c +++ b/block/parallels.c @@ -188,7 +188,8 @@ allocate_clusters(BlockDriverState *bs, int64_t sector_num, idx = sector_num / s->tracks; to_allocate = DIV_ROUND_UP(sector_num + *pnum, s->tracks) - idx; -/* This function is called only by parallels_co_writev(), which will never +/* + * This function is called only by parallels_co_writev(), which will never * pass a sector_num at or beyond the end of the image (because the block * layer never passes such a sector_num to that function). Therefore, idx * is always below s->bat_size. @@ -196,7 +197,8 @@ allocate_clusters(BlockDriverState *bs, int64_t sector_num, * exceed the image end. Therefore, idx + to_allocate cannot exceed * s->bat_size. * Note that s->bat_size is an unsigned int, therefore idx + to_allocate - * will always fit into a uint32_t. */ + * will always fit into a uint32_t. + */ assert(idx < s->bat_size && idx + to_allocate <= s->bat_size); space = to_allocate * s->tracks; @@ -230,13 +232,15 @@ allocate_clusters(BlockDriverState *bs, int64_t sector_num, } } -/* Try to read from backing to fill empty clusters +/* + * Try to read from backing to fill empty clusters * FIXME: 1. previous write_zeroes may be redundant *2. most of data we read from backing will be rewritten by * parallels_co_writev. On aligned-to-cluster write we do not need * this read at all. *3. it would be good to combine write of data from backing and new - * data into one write call */ + * data into one write call. + */ if (bs->backing) { int64_t nb_cow_sectors = to_allocate * s->tracks; int64_t nb_cow_bytes = nb_cow_sectors << BDRV_SECTOR_BITS; @@ -864,8 +868,10 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, s->data_end = ROUND_UP(bat_entry_off(s->bat_size), BDRV_SECTOR_SIZE); } if (s->data_end < s->header_size) { -/* there is not enough unused space to fit to block align between BAT - and actual data. We can't avoid read-modify-write... */ +/* + * There is not enough unused space to fit to block align between BAT + * and actual data. We can't avoid read-modify-write... + */ s->header_size = size; } Reviewed-by: Denis V. Lunev
Re: [PATCH] semihosting/uaccess.c: Replaced a malloc call with g_malloc.
(Something went wrong with the quoting in your email. I've fixed it up.) On Wed, 26 Jul 2023 at 05:38, wrote: > Peter Maydell wrote: > > The third part here, is that g_malloc() does not ever > > fail -- it will abort() on out of memory. However > > the code here is still handling g_malloc() returning NULL. > > The equivalent for "we expect this might fail" (which we want > > here, because the guest is passing us the length of memory > > to try to allocate) is g_try_malloc(). > g_malloc() is preferred more than g_try_* functions, which return NULL on > error, > when the size of the requested allocation is small. > This is because allocating few bytes should not be a problem in a healthy > system. This is true. But in this particular case we cannot be sure that the size of the allocation is small, because the size is controlled by the guest. So we want g_try_malloc(). thanks -- PMM
Re: [PATCH v9 06/10] migration: New migrate and migrate-incoming argument 'channels'
On 26/07/23 2:05 pm, Daniel P. Berrangé wrote: On Wed, Jul 26, 2023 at 01:24:48AM +0530, Het Gala wrote: Sorry, last reply on this patch was accidently replied only to Daniel. Pasting the reply again so it is received by all the active maintianers here. Apologies for the error 😅 On 26/07/23 12:07 am, Daniel P. Berrangé wrote: On Tue, Jul 25, 2023 at 07:34:09PM +0100, Daniel P. Berrangé wrote: On Fri, Jul 21, 2023 at 02:49:31PM +, Het Gala wrote: MigrateChannelList allows to connect accross multiple interfaces. Add MigrateChannelList struct as argument to migration QAPIs. We plan to include multiple channels in future, to connnect multiple interfaces. Hence, we choose 'MigrateChannelList' as the new argument over 'MigrateChannel' to make migration QAPIs future proof. Suggested-by: Aravind Retnakaran Signed-off-by: Het Gala Acked-by: Markus Armbruster --- migration/migration-hmp-cmds.c | 6 +- migration/migration.c | 34 -- qapi/migration.json| 109 - softmmu/vl.c | 2 +- 4 files changed, 139 insertions(+), 12 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 9885d7c9f7..49b150f33f 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -424,7 +424,7 @@ void hmp_migrate_incoming(Monitor *mon, const QDict *qdict) Error *err = NULL; const char *uri = qdict_get_str(qdict, "uri"); -qmp_migrate_incoming(uri, &err); +qmp_migrate_incoming(uri, false, NULL, &err); hmp_handle_error(mon, err); } @@ -705,8 +705,8 @@ void hmp_migrate(Monitor *mon, const QDict *qdict) const char *uri = qdict_get_str(qdict, "uri"); Error *err = NULL; -qmp_migrate(uri, !!blk, blk, !!inc, inc, -false, false, true, resume, &err); +qmp_migrate(uri, false, NULL, !!blk, blk, !!inc, inc, + false, false, true, resume, &err); if (hmp_handle_error(mon, err)) { return; } diff --git a/migration/migration.c b/migration/migration.c index f37b388876..bd3a93fc8c 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -466,10 +466,22 @@ static bool migrate_uri_parse(const char *uri, return true; } -static void qemu_start_incoming_migration(const char *uri, Error **errp) +static void qemu_start_incoming_migration(const char *uri, bool has_channels, + MigrationChannelList *channels, + Error **errp) { g_autoptr(MigrationAddress) channel = g_new0(MigrationAddress, 1); +/* + * Having preliminary checks for uri and channel + */ +if (uri && has_channels) { +error_setg(errp, "'uri' and 'channels' arguments are mutually " + "exclusive; exactly one of the two should be present in " + "'migrate-incoming' qmp command "); +return; +} This checks is both are present. Also needs a check if neither are present as that's invalid. Also it should (temporarily) raise an error if "has_channels" is set, as while we've added the parameter in QAPI, we've not implemented it yet. IOW, raise an error now, and remove the error in a later patch. Ack. So in total there should be 3 checks right. 1) if 'has_channels' is set, 2) if 'uri' and 'channels' both are present, 3) if 'uri' and 'channels' both are absent. Basically right now only uri should allowed and should atleast be present. Correct. Ack. I think overall only 1) would be enough and should be checked before 'migration_channels_and_uri_compatible()' and if 'has_channels' is set, just return for now. With this 2) would not be necessary or not come into play in this patch. 3) will be taken care by 'migration_channels_and_uri_compatible()' itself IMO. I think all the checks should be in this method, as it is the entrypoint for execution of the QMP command and thus where parameter validation should live. Spreading it across methods will make it very easy to open bugs by refactoring code and not realizing the new codepaths don't do the right checks. Okay, I got the concern here. Will add these 3 checks in this patch and later remove 'has_channels' check in the patch where we are implementing MigrateChannel. With regards, Daniel Regards, Het Gala
Re: [Stable-8.0.4 00/31] Patch Round-up for stable 8.0.4, freeze on 2023-08-05
Tue, 25 Jul 2023 16:45:17 +0300 Michael Tokarev : > Please respond here or CC qemu-sta...@nongnu.org on any additional patches > you think should (or shouldn't) be included in the release. Consider 497fad38979c16b6412388927401e577eba43d26 ("pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym"). Otherwise it will start to FTBFS in Tumbleweed from now on. Olaf pgpzyW9ApUjPS.pgp Description: Digitale Signatur von OpenPGP
Re: [Stable-8.0.4 00/31] Patch Round-up for stable 8.0.4, freeze on 2023-08-05
26.07.2023 13:07, Olaf Hering пишет: Tue, 25 Jul 2023 16:45:17 +0300 Michael Tokarev : Please respond here or CC qemu-sta...@nongnu.org on any additional patches you think should (or shouldn't) be included in the release. Consider 497fad38979c16b6412388927401e577eba43d26 ("pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym"). Otherwise it will start to FTBFS in Tumbleweed from now on. https://gitlab.com/qemu-project/qemu/-/commits/stable-8.0/ This one is included in 8.0.3 and 7.2.4 already, picked up for the previous stable series/releases. Thanks, /mjt
Re: [Stable-8.0.4 00/31] Patch Round-up for stable 8.0.4, freeze on 2023-08-05
Wed, 26 Jul 2023 13:12:43 +0300 Michael Tokarev : > This one is included in 8.0.3 and 7.2.4 already, picked up for the previous > stable series/releases. Indeed. I just noticed I still had 8.0.2 exported, sorry for the noise. Olaf pgpk0hLzJTvpI.pgp Description: Digitale Signatur von OpenPGP
Re: [PATCH v2 4/6] python/machine: use socketpair() for console connections
> On 25-Jul-2023, at 11:33 PM, John Snow wrote: > > Create a socketpair for the console output. This should help eliminate > race conditions around console text early in the boot process that might > otherwise have been dropped on the floor before being able to connect to > QEMU under "server,nowait". > > Signed-off-by: John Snow Thanks for doing this. I recall we spoke about this late last year in the context of fixing my bios-bits avocado test and adding a console output there. Except the concern below, Reviewed-by: Ani Sinha > --- > python/qemu/machine/machine.py | 30 +++--- > 1 file changed, 27 insertions(+), 3 deletions(-) > > diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py > index 26f0fb8a81..09f214c95c 100644 > --- a/python/qemu/machine/machine.py > +++ b/python/qemu/machine/machine.py > @@ -159,6 +159,8 @@ def __init__(self, > > self._name = name or f"{id(self):x}" > self._sock_pair: Optional[Tuple[socket.socket, socket.socket]] = None > +self._cons_sock_pair: Optional[ > +Tuple[socket.socket, socket.socket]] = None > self._temp_dir: Optional[str] = None > self._base_temp_dir = base_temp_dir > self._sock_dir = sock_dir > @@ -315,8 +317,9 @@ def _base_args(self) -> List[str]: > for _ in range(self._console_index): > args.extend(['-serial', 'null']) > if self._console_set: > -chardev = ('socket,id=console,path=%s,server=on,wait=off' % > - self._console_address) > +assert self._cons_sock_pair is not None > +fd = self._cons_sock_pair[0].fileno() > +chardev = f"socket,id=console,fd={fd}" > args.extend(['-chardev', chardev]) > if self._console_device_type is None: > args.extend(['-serial', 'chardev:console']) > @@ -351,6 +354,10 @@ def _pre_launch(self) -> None: > nickname=self._name > ) > > +if self._console_set: > +self._cons_sock_pair = socket.socketpair() > +os.set_inheritable(self._cons_sock_pair[0].fileno(), True) > + > # NOTE: Make sure any opened resources are *definitely* freed in > # _post_shutdown()! > # pylint: disable=consider-using-with > @@ -368,6 +375,9 @@ def _pre_launch(self) -> None: > def _post_launch(self) -> None: > if self._sock_pair: > self._sock_pair[0].close() > +if self._cons_sock_pair: > +self._cons_sock_pair[0].close() > + > if self._qmp_connection: > if self._sock_pair: > self._qmp.connect() > @@ -518,6 +528,11 @@ def _early_cleanup(self) -> None: > self._console_socket.close() > self._console_socket = None > > +if self._cons_sock_pair: > +self._cons_sock_pair[0].close() > +self._cons_sock_pair[1].close() > +self._cons_sock_pair = None > + > def _hard_shutdown(self) -> None: > """ > Perform early cleanup, kill the VM, and wait for it to terminate. > @@ -878,10 +893,19 @@ def console_socket(self) -> socket.socket: > Returns a socket connected to the console > """ > if self._console_socket is None: > +if not self._console_set: > +raise QEMUMachineError( > +"Attempt to access console socket with no connection") > +assert self._cons_sock_pair is not None > +# os.dup() is used here for sock_fd because otherwise we'd > +# have two rich python socket objects that would each try to > +# close the same underlying fd when either one gets garbage > +# collected. > self._console_socket = console_socket.ConsoleSocket( > -self._console_address, > +sock_fd=os.dup(self._cons_sock_pair[1].fileno()), > file=self._console_log_path, > drain=self._drain_console) > +self._cons_sock_pair[1].close() I am not 100% sure but should we save the new sock_fd here? Like self._cons_sock_pair[1] = sock_fd ; Then next time console_socket() is invoked, the correct fd will be duped. > return self._console_socket > > @property > -- > 2.41.0 >
[PULL 2/5] qapi/block: Tidy up block-latency-histogram-set documentation
Examples come out like Example set new histograms for all io types with intervals [0, 10), [10, 50), [50, 100), [100, +inf): The sentence "set new histograms ..." starts with a lower case letter. Capitalize it. Same for the other examples. Signed-off-by: Markus Armbruster Message-ID: <20230720071610.1096458-3-arm...@redhat.com> Reviewed-by: Philippe Mathieu-Daudé --- qapi/block.json | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/qapi/block.json b/qapi/block.json index 0f25ce3961..535892fddc 100644 --- a/qapi/block.json +++ b/qapi/block.json @@ -547,7 +547,7 @@ # # Example: # -# set new histograms for all io types with intervals [0, 10), [10, +# Set new histograms for all io types with intervals [0, 10), [10, # 50), [50, 100), [100, +inf): # # -> { "execute": "block-latency-histogram-set", @@ -557,7 +557,7 @@ # # Example: # -# set new histogram only for write, other histograms will remain not +# Set new histogram only for write, other histograms will remain not # changed (or not created): # # -> { "execute": "block-latency-histogram-set", @@ -567,7 +567,7 @@ # # Example: # -# set new histograms with the following intervals: read, flush: [0, +# Set new histograms with the following intervals: read, flush: [0, # 10), [10, 50), [50, 100), [100, +inf) write: [0, 1000), [1000, # 5000), [5000, +inf) # @@ -579,7 +579,7 @@ # # Example: # -# remove all latency histograms: +# Remove all latency histograms: # # -> { "execute": "block-latency-histogram-set", # "arguments": { "id": "drive0" } } -- 2.41.0
[PULL 0/5] QAPI patches patches for 2023-07-26
The following changes since commit 6cb2011fedf8c4e7b66b4a3abd6b42c1bae99ce6: Update version for v8.1.0-rc1 release (2023-07-25 20:09:05 +0100) are available in the Git repository at: https://repo.or.cz/qemu/armbru.git tags/pull-qapi-2023-07-26 for you to fetch changes up to 1799bdcb47d8368c92cfc69c3b4f305ec8414977: qapi: Reformat recent doc comments to conform to current conventions (2023-07-26 13:26:53 +0200) QAPI patches patches for 2023-07-26 The patches affect only documentation. Generated code does not change. Markus Armbruster (5): qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation qapi/block: Tidy up block-latency-histogram-set documentation qapi/qdev: Tidy up device_add documentation qapi/trace: Tidy up trace-event-get-state, -set-state documentation qapi: Reformat recent doc comments to conform to current conventions qapi/block-core.json | 85 +++- qapi/block.json | 12 +++ qapi/cxl.json| 4 +-- qapi/machine-target.json | 2 +- qapi/migration.json | 10 +++--- qapi/net.json| 1 - qapi/qdev.json | 6 ++-- qapi/qom.json| 9 ++--- qapi/trace.json | 9 ++--- qapi/ui.json | 2 +- 10 files changed, 66 insertions(+), 74 deletions(-) -- 2.41.0
[PULL 5/5] qapi: Reformat recent doc comments to conform to current conventions
Since commit a937b6aa739 (qapi: Reformat doc comments to conform to current conventions), a number of comments not conforming to the current formatting conventions were added. No problem, just sweep the entire documentation once more. To check the generated documentation does not change, I compared the generated HTML before and after this commit with "wdiff -3". Finds no differences. Comparing with diff is not useful, as the reflown paragraphs are visible there. Signed-off-by: Markus Armbruster Message-ID: <20230720071610.1096458-7-arm...@redhat.com> --- qapi/block-core.json | 78 +++- qapi/block.json | 4 +-- qapi/cxl.json| 4 +-- qapi/machine-target.json | 2 +- qapi/migration.json | 10 +++--- qapi/net.json| 1 - qapi/qom.json| 9 ++--- qapi/trace.json | 2 ++ qapi/ui.json | 2 +- 9 files changed, 55 insertions(+), 57 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index 6ca448b6e6..31c4812b44 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -136,7 +136,7 @@ # # @filename: Name of the extent file # -# @format: Extent type (e.g. FLAT or SPARSE) +# @format: Extent type (e.g. FLAT or SPARSE) # # @virtual-size: Number of bytes covered by this extent # @@ -853,9 +853,8 @@ # @min_wr_latency_ns: Minimum latency of write operations in the # defined interval, in nanoseconds. # -# @min_zone_append_latency_ns: Minimum latency of zone append operations -# in the defined interval, in nanoseconds -# (since 8.1) +# @min_zone_append_latency_ns: Minimum latency of zone append +# operations in the defined interval, in nanoseconds (since 8.1) # # @min_flush_latency_ns: Minimum latency of flush operations in the # defined interval, in nanoseconds. @@ -866,9 +865,8 @@ # @max_wr_latency_ns: Maximum latency of write operations in the # defined interval, in nanoseconds. # -# @max_zone_append_latency_ns: Maximum latency of zone append operations -# in the defined interval, in nanoseconds -# (since 8.1) +# @max_zone_append_latency_ns: Maximum latency of zone append +# operations in the defined interval, in nanoseconds (since 8.1) # # @max_flush_latency_ns: Maximum latency of flush operations in the # defined interval, in nanoseconds. @@ -879,9 +877,8 @@ # @avg_wr_latency_ns: Average latency of write operations in the # defined interval, in nanoseconds. # -# @avg_zone_append_latency_ns: Average latency of zone append operations -# in the defined interval, in nanoseconds -# (since 8.1) +# @avg_zone_append_latency_ns: Average latency of zone append +# operations in the defined interval, in nanoseconds (since 8.1) # # @avg_flush_latency_ns: Average latency of flush operations in the # defined interval, in nanoseconds. @@ -893,8 +890,7 @@ # the defined interval. # # @avg_zone_append_queue_depth: Average number of pending zone append -# operations in the defined interval -# (since 8.1). +# operations in the defined interval (since 8.1). # # Since: 2.5 ## @@ -919,8 +915,8 @@ # # @wr_bytes: The number of bytes written by the device. # -# @zone_append_bytes: The number of bytes appended by the zoned devices -# (since 8.1) +# @zone_append_bytes: The number of bytes appended by the zoned +# devices (since 8.1) # # @unmap_bytes: The number of bytes unmapped by the device (Since 4.2) # @@ -930,8 +926,8 @@ # @wr_operations: The number of write operations performed by the # device. # -# @zone_append_operations: The number of zone append operations performed -# by the zoned devices (since 8.1) +# @zone_append_operations: The number of zone append operations +# performed by the zoned devices (since 8.1) # # @flush_operations: The number of cache flush operations performed by # the device (since 0.15) @@ -946,7 +942,7 @@ # 0.15). # # @zone_append_total_time_ns: Total time spent on zone append writes -# in nanoseconds (since 8.1) +# in nanoseconds (since 8.1) # # @flush_total_time_ns: Total time spent on cache flushes in # nanoseconds (since 0.15). @@ -965,8 +961,8 @@ # @wr_merged: Number of write requests that have been merged into # another request (Since 2.3). # -# @zone_append_merged: Number of zone append requests that have been merged -# into another request (since 8.1) +# @zone_append_merged: Number of zone append requests that have been +# merged into another request (since 8.1) # # @unmap_merged: Number of unmap requests that have been merged into # another request (Since 4.2) @@ -981,9 +977,8 @@ # @failed_wr_operations
[PULL 1/5] qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation
Documentation for member @bin comes out like list of io request counts corresponding to histogram intervals. len("bins") = len("boundaries") + 1 For the example above, "bins" may be something like [3, 1, 5, 2], and corresponding histogram looks like: Note how the equation and the sentence following it run together. Replace the equation: list of io request counts corresponding to histogram intervals, same number of elements as "boundaries". For the example above, "bins" may be something like [3, 1, 5, 2], and corresponding histogram looks like: Cc: Vladimir Sementsov-Ogievskiy Signed-off-by: Markus Armbruster Message-ID: <20230720071610.1096458-2-arm...@redhat.com> --- qapi/block-core.json | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index 5dd5f7e4b0..6ca448b6e6 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -652,10 +652,9 @@ # 10), [10, 50), [50, 100), [100, +inf). # # @bins: list of io request counts corresponding to histogram -# intervals. -# len(@bins) = len(@boundaries) + 1 -# For the example above, @bins may be something like [3, 1, 5, 2], -# and corresponding histogram looks like: +# intervals, same number of elements as @boundaries. For the +# example above, @bins may be something like [3, 1, 5, 2], and +# corresponding histogram looks like: # # :: # -- 2.41.0
[PULL 4/5] qapi/trace: Tidy up trace-event-get-state, -set-state documentation
trace-event-set-state's explanation of how events are selected is under "Features". Doesn't belong there. Simply delete it, as it feels redundant with documentation of member @name. trace-event-get-state's explanation is under "Returns". Tolerable, but similarly redundant. Delete it, too. Cc: Alex Bennée Signed-off-by: Markus Armbruster Message-ID: <20230720071610.1096458-5-arm...@redhat.com> --- qapi/trace.json | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/qapi/trace.json b/qapi/trace.json index 39b752fc88..0819d93016 100644 --- a/qapi/trace.json +++ b/qapi/trace.json @@ -60,9 +60,6 @@ # # Returns: a list of @TraceEventInfo for the matching events # -# An event is returned if its name matches the @name pattern -# (There are no longer any per-vCPU events). -# # Since: 2.2 # # Example: @@ -90,10 +87,8 @@ # @vcpu: The vCPU to act upon (all by default; since 2.7). # # Features: -# @deprecated: Member @vcpu is deprecated, and always ignored. # -# An event is enabled if its name matches the @name pattern -# (There are no longer any per-vCPU events). +# @deprecated: Member @vcpu is deprecated, and always ignored. # # Since: 2.2 # -- 2.41.0
[PULL 3/5] qapi/qdev: Tidy up device_add documentation
The notes section comes out like this: Notes Additional arguments depend on the type. 1. For detailed information about this command, please refer to the ‘docs/qdev-device-use.txt’ file. 2. It’s possible to list device properties by running QEMU with the “-device DEVICE,help” command-line argument, where DEVICE is the device’s name The first item isn't numbered. Fix that: 1. Additional arguments depend on the type. 2. For detailed information about this command, please refer to the ‘docs/qdev-device-use.txt’ file. 3. It’s possible to list device properties by running QEMU with the “-device DEVICE,help” command-line argument, where DEVICE is the device’s name Signed-off-by: Markus Armbruster Message-ID: <20230720071610.1096458-4-arm...@redhat.com> Reviewed-by: Philippe Mathieu-Daudé --- qapi/qdev.json | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/qapi/qdev.json b/qapi/qdev.json index 2d73b27c2a..6bc5a733b8 100644 --- a/qapi/qdev.json +++ b/qapi/qdev.json @@ -53,12 +53,12 @@ # # Notes: # -# Additional arguments depend on the type. +# 1. Additional arguments depend on the type. # -# 1. For detailed information about this command, please refer to the +# 2. For detailed information about this command, please refer to the #'docs/qdev-device-use.txt' file. # -# 2. It's possible to list device properties by running QEMU with the +# 3. It's possible to list device properties by running QEMU with the #"-device DEVICE,help" command-line argument, where DEVICE is the #device's name # -- 2.41.0
Re: [PULL 1/5] qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation
On 26.07.23 14:28, Markus Armbruster wrote: Documentation for member @bin comes out like list of io request counts corresponding to histogram intervals. len("bins") = len("boundaries") + 1 For the example above, "bins" may be something like [3, 1, 5, 2], and corresponding histogram looks like: Note how the equation and the sentence following it run together. Replace the equation: list of io request counts corresponding to histogram intervals, same number of elements as "boundaries". For the example above, not same, but one more. N points break the line into N+1 intervals "bins" may be something like [3, 1, 5, 2], and corresponding histogram looks like: Cc: Vladimir Sementsov-Ogievskiy Signed-off-by: Markus Armbruster Message-ID: <20230720071610.1096458-2-arm...@redhat.com> --- qapi/block-core.json | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index 5dd5f7e4b0..6ca448b6e6 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -652,10 +652,9 @@ # 10), [10, 50), [50, 100), [100, +inf). # # @bins: list of io request counts corresponding to histogram -# intervals. -# len(@bins) = len(@boundaries) + 1 -# For the example above, @bins may be something like [3, 1, 5, 2], -# and corresponding histogram looks like: +# intervals, same number of elements as @boundaries. For the +# example above, @bins may be something like [3, 1, 5, 2], and +# corresponding histogram looks like: # # :: # -- Best regards, Vladimir
Re: [PATCH v21 12/20] qapi/s390x/cpu topology: query-cpu-polarization qmp command
On Fri, 2023-06-30 at 11:17 +0200, Pierre Morel wrote: > The query-cpu-polarization qmp command returns the current > CPU polarization of the machine. > > Signed-off-by: Pierre Morel Reviewed-by: Nina Schoetterl-Glausch [...]
Re: [PULL 1/5] qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation
Vladimir Sementsov-Ogievskiy writes: > On 26.07.23 14:28, Markus Armbruster wrote: >> Documentation for member @bin comes out like >> list of io request counts corresponding to histogram intervals. >> len("bins") = len("boundaries") + 1 For the example above, "bins" >> may be something like [3, 1, 5, 2], and corresponding histogram >> looks like: >> >> Note how the equation and the sentence following it run together. >> Replace the equation: >> >> list of io request counts corresponding to histogram intervals, >> same number of elements as "boundaries". For the example above, > > not same, but one more. N points break the line into N+1 intervals Thanks for catching this. What about "one more element than @boundaries has"? >> "bins" may be something like [3, 1, 5, 2], and corresponding >> histogram looks like: >> Cc: Vladimir Sementsov-Ogievskiy >> Signed-off-by: Markus Armbruster >> Message-ID: <20230720071610.1096458-2-arm...@redhat.com>
[PATCH] docs/devel: Add cross-compiling doc
Add instructions for how to cross-compile QEMU for RISC-V. The file is named generically because there's no reason not to collect other architectures steps into the same file, especially because several subsections like those for cross-compiling QEMU dependencies using meson and a cross-file could be shared. Additionally, other approaches to creating sysroots, such as with debootstrap, may be documented in this file in the future. Signed-off-by: Andrew Jones --- docs/devel/cross-compiling.rst | 221 + 1 file changed, 221 insertions(+) create mode 100644 docs/devel/cross-compiling.rst diff --git a/docs/devel/cross-compiling.rst b/docs/devel/cross-compiling.rst new file mode 100644 index ..1b988ba54e4c --- /dev/null +++ b/docs/devel/cross-compiling.rst @@ -0,0 +1,221 @@ +.. SPDX-License-Identifier: GPL-2.0-or-later + + +Cross-compiling QEMU + + +Cross-compiling QEMU first requires the preparation of a cross-toolchain +and the cross-compiling of QEMU's dependencies. While the steps will be +similar across architectures, each architecture will have its own specific +recommendations. This document collects architecture-specific procedures +and hints that may be used to cross-compile QEMU, where typically the host +environment is x86. + +RISC-V +== + +Toolchain +- + +Select a root directory for the cross environment +^ + +Export an environment variable pointing to a root directory +for the cross environment. For example, :: + + $ export PREFIX="$HOME/opt/riscv" + +Create a work directory +^^^ + +Tools and several components will need to be downloaded and built. Create +a directory for all the work, :: + + $ export WORK_DIR="$HOME/work/xqemu" + $ mkdir -p "$WORK_DIR" + +Select and prepare the toolchain + + +Select a toolchain such as [riscv-toolchain]_ and follow its instructions +for building and installing it to ``$PREFIX``, e.g. :: + + $ cd "$WORK_DIR" + $ git clone https://github.com/riscv/riscv-gnu-toolchain + $ cd riscv-gnu-toolchain + $ ./configure --prefix="$PREFIX" + $ make -j$(nproc) linux + +Set the ``$CROSS_COMPILE`` environment variable to the prefix of the cross +tools and add the tools to ``$PATH``, :: + +$ export CROSS_COMPILE=riscv64-unknown-linux-gnu- +$ export PATH="$PREFIX/bin:$PATH" + +Also set ``$SYSROOT``, where all QEMU cross-compiled dependencies will be +installed. The toolchain installation likely created a 'sysroot' directory +at ``$PREFIX/sysroot``, which is the default location for most cross +tools, making it a good location, :: + + $ mkdir -p "$PREFIX/sysroot" + $ export SYSROOT="$PREFIX/sysroot" + +Create a pkg-config wrapper +^^^ + +The build processes of QEMU and some of its dependencies depend on +pkg-config. Create a wrapper script for it which works for the cross +environment: :: + + $ cat <"$PREFIX/bin/${CROSS_COMPILE}pkg-config" + #!/bin/sh + + [ "\$SYSROOT" ] || exit 1 + + export PKG_CONFIG_PATH= + export PKG_CONFIG_LIBDIR="\${SYSROOT}/usr/lib/pkgconfig:\${SYSROOT}/usr/lib64/pkgconfig:\${SYSROOT}/usr/share/pkgconfig" + + exec pkg-config "\$@" + EOF + $ chmod +x "$PREFIX/bin/${CROSS_COMPILE}pkg-config" + +Create a cross-file for meson builds + + +meson setup, used by some of QEMU's dependencies, needs a "cross-file" to +configure the cross environment. Create one, :: + + $ cd "$WORK_DIR" + $ cat
How to tame CI?
Hi Now a not on CI, thas has been really bad. After too many problems with last PULLS, I decided to learn to use qemu CI. On one hand, it is not so difficult, even I can use it O:-) On the other hand, the amount of problems that I got is inmense. Some of them dissapear when I rerun the checks, but I never know if it is my PULL request, the CI system or the tests themselves. So it ends going something like: while (true); do - git pull - git rebase - git push ci blah, blah - Next day comes, and too many errors, so I rebase again The last step takes more time than expected and not always trivial to know how the failure is. This (last) patch is not part of the PULL request, but I have found that it _always_ makes gcov fail. I had to use bisect to find where the problem was. https://gitlab.com/juan.quintela/qemu/-/jobs/4571878922 To make things easier, this is the part that show how it breaks (this is the gcov test): 357/423 qemu:block / io-qcow2-copy-before-write ERROR 6.38s exit status 1 >>> PYTHON=/builds/juan.quintela/qemu/build/pyvenv/bin/python3 >>> MALLOC_PERTURB_=44 /builds/juan.quintela/qemu/build/pyvenv/bin/python3 >>> /builds/juan.quintela/qemu/build/../tests/qemu-iotests/check -tap -qcow2 >>> copy-before-write --source-dir >>> /builds/juan.quintela/qemu/tests/qemu-iotests --build-dir >>> /builds/juan.quintela/qemu/build/tests/qemu-iotests ― ✀ ― stderr: --- /builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write.out +++ /builds/juan.quintela/qemu/build/scratch/qcow2-file-copy-before-write/copy-before-write.out.bad @@ -1,5 +1,21 @@ - +...F +== +FAIL: test_timeout_break_snapshot (__main__.TestCbwError) +-- +Traceback (most recent call last): + File "/builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write", line 210, in test_timeout_break_snapshot +self.assertEqual(log, """\ +AssertionError: 'wrot[195 chars]read 1048576/1048576 bytes at offset 0\n1 MiB,[46 chars]c)\n' != 'wrot[195 chars]read failed: Permission denied\n' + wrote 524288/524288 bytes at offset 0 + 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + wrote 524288/524288 bytes at offset 524288 + 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) ++ read failed: Permission denied +- read 1048576/1048576 bytes at offset 0 +- 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + + -- Ran 4 tests -OK +FAILED (failures=1) (test program exited with status code 1) ―― I could use help to know how a change in test/qtest/migration-test.c can break block layer tests, I am all ears. This is the commit: https://gitlab.com/juan.quintela/qemu/-/commit/7455ee794c01662b5efa1ee67396d85943663ded Yes, I tried several times. It always fails on that patch. The previous commint passes CI with flying colors. Later, Juan.
[PULL 00/25] Migration 20230726 patches
The following changes since commit 6cb2011fedf8c4e7b66b4a3abd6b42c1bae99ce6: Update version for v8.1.0-rc1 release (2023-07-25 20:09:05 +0100) are available in the Git repository at: https://gitlab.com/juan.quintela/qemu.git tags/migration-20230726-pull-request for you to fetch changes up to 697c4c86ab515a728ffb2adc2c3c04b22fa9210f: migration/rdma: Split qemu_fopen_rdma() into input/output functions (2023-07-26 10:55:56 +0200) Migration Pull request Hi This is the migration PULL request. It is the same than yesterday with proper PULL headers. It pass CI. It contains: - Fabiano rosas trheadinfo cleanups - Hyman Huang dirtylimit changes - Part of my changes - Peter Xu documentation - Tejus updato to migration descriptions - Wei want improvements for postocpy and multifd setup Please apply. Thanks, Juan. Fabiano Rosas (2): migration/multifd: Rename threadinfo.c functions migration/multifd: Protect accesses to migration_threads Hyman Huang(黄勇) (8): softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit" qapi/migration: Introduce x-vcpu-dirty-limit-period parameter qapi/migration: Introduce vcpu-dirty-limit parameters migration: Introduce dirty-limit capability migration: Refactor auto-converge capability logic migration: Put the detection logic before auto-converge checking migration: Implement dirty-limit convergence algo migration: Extend query-migrate to provide dirty page limit info Juan Quintela (11): migration-test: Be consistent for ppc migration-test: Make machine_opts regular with other options migration-test: Create arch_opts migration-test: machine_opts is really arch specific migration: skipped field is really obsolete. qemu-file: Rename qemu_file_transferred_ fast -> noflush migration: Change qemu_file_transferred to noflush qemu_file: Make qemu_file_is_writable() static qemu-file: Simplify qemu_file_shutdown() qemu-file: Make qemu_file_get_error_obj() static migration/rdma: Split qemu_fopen_rdma() into input/output functions Peter Xu (1): docs/migration: Update postcopy bits Tejus GK (1): migration: Update error description whenever migration fails Wei Wang (2): migration: enforce multifd and postcopy preempt to be set before incoming qtest/migration-tests.c: use "-incoming defer" for postcopy tests docs/about/deprecated.rst | 10 docs/devel/migration.rst | 94 ++-- qapi/migration.json| 87 ++ include/sysemu/dirtylimit.h| 2 + migration/options.h| 1 + migration/qemu-file.h | 14 ++--- migration/threadinfo.h | 7 +-- migration/migration-hmp-cmds.c | 26 + migration/migration.c | 36 + migration/multifd.c| 4 +- migration/options.c| 87 +- migration/qemu-file.c | 24 ++--- migration/ram.c| 59 + migration/rdma.c | 39 +++--- migration/savevm.c | 6 +-- migration/threadinfo.c | 19 +-- migration/vmstate.c| 4 +- softmmu/dirtylimit.c | 97 ++ tests/qtest/migration-test.c | 48 - migration/trace-events | 1 + 20 files changed, 510 insertions(+), 155 deletions(-) -- 2.40.1
[PULL 12/25] migration-test: Make machine_opts regular with other options
Reviewed-by: Peter Xu Signed-off-by: Juan Quintela Message-ID: <20230608224943.3877-5-quint...@redhat.com> --- tests/qtest/migration-test.c | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index 2296ed4bf5..f51a25e299 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -739,7 +739,7 @@ static int test_migrate_start(QTestState **from, QTestState **to, start_address = S390_TEST_MEM_START; end_address = S390_TEST_MEM_END; } else if (strcmp(arch, "ppc64") == 0) { -machine_opts = "vsmt=8"; +machine_opts = "-machine vsmt=8"; memory_size = "256M"; start_address = PPC_TEST_MEM_START; end_address = PPC_TEST_MEM_END; @@ -751,7 +751,7 @@ static int test_migrate_start(QTestState **from, QTestState **to, arch_target = g_strdup("-nodefaults"); } else if (strcmp(arch, "aarch64") == 0) { init_bootfile(bootpath, aarch64_kernel, sizeof(aarch64_kernel)); -machine_opts = "virt,gic-version=max"; +machine_opts = "-machine virt,gic-version=max"; memory_size = "150M"; arch_source = g_strdup_printf("-cpu max " "-kernel %s", @@ -791,14 +791,13 @@ static int test_migrate_start(QTestState **from, QTestState **to, shmem_opts = g_strdup(""); } -cmd_source = g_strdup_printf("-accel kvm%s -accel tcg%s%s " +cmd_source = g_strdup_printf("-accel kvm%s -accel tcg %s " "-name source,debug-threads=on " "-m %s " "-serial file:%s/src_serial " "%s %s %s %s", args->use_dirty_ring ? ",dirty-ring-size=4096" : "", - machine_opts ? " -machine " : "", machine_opts ? machine_opts : "", memory_size, tmpfs, arch_source, shmem_opts, @@ -811,7 +810,7 @@ static int test_migrate_start(QTestState **from, QTestState **to, &got_src_stop); } -cmd_target = g_strdup_printf("-accel kvm%s -accel tcg%s%s " +cmd_target = g_strdup_printf("-accel kvm%s -accel tcg %s " "-name target,debug-threads=on " "-m %s " "-serial file:%s/dest_serial " @@ -819,7 +818,6 @@ static int test_migrate_start(QTestState **from, QTestState **to, "%s %s %s %s", args->use_dirty_ring ? ",dirty-ring-size=4096" : "", - machine_opts ? " -machine " : "", machine_opts ? machine_opts : "", memory_size, tmpfs, uri, arch_target, shmem_opts, -- 2.40.1
[PULL 01/25] migration/multifd: Rename threadinfo.c functions
From: Fabiano Rosas We're about to add more functions to this file so make it use the same coding style as the rest of the code. Signed-off-by: Fabiano Rosas Reviewed-by: Juan Quintela Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Peter Xu Message-Id: <20230607161306.31425-2-faro...@suse.de> Signed-off-by: Juan Quintela --- migration/threadinfo.h | 5 ++--- migration/migration.c | 4 ++-- migration/multifd.c| 4 ++-- migration/threadinfo.c | 4 ++-- 4 files changed, 8 insertions(+), 9 deletions(-) diff --git a/migration/threadinfo.h b/migration/threadinfo.h index 4d69423c0a..8aa6999d58 100644 --- a/migration/threadinfo.h +++ b/migration/threadinfo.h @@ -23,6 +23,5 @@ struct MigrationThread { QLIST_ENTRY(MigrationThread) node; }; -MigrationThread *MigrationThreadAdd(const char *name, int thread_id); - -void MigrationThreadDel(MigrationThread *info); +MigrationThread *migration_threads_add(const char *name, int thread_id); +void migration_threads_remove(MigrationThread *info); diff --git a/migration/migration.c b/migration/migration.c index 91bba630a8..ae49d42eab 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2953,7 +2953,7 @@ static void *migration_thread(void *opaque) MigThrError thr_error; bool urgent = false; -thread = MigrationThreadAdd("live_migration", qemu_get_thread_id()); +thread = migration_threads_add("live_migration", qemu_get_thread_id()); rcu_register_thread(); @@ -3031,7 +3031,7 @@ static void *migration_thread(void *opaque) migration_iteration_finish(s); object_unref(OBJECT(s)); rcu_unregister_thread(); -MigrationThreadDel(thread); +migration_threads_remove(thread); return NULL; } diff --git a/migration/multifd.c b/migration/multifd.c index 0e3ae87449..0f6b203877 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -651,7 +651,7 @@ static void *multifd_send_thread(void *opaque) int ret = 0; bool use_zero_copy_send = migrate_zero_copy_send(); -thread = MigrationThreadAdd(p->name, qemu_get_thread_id()); +thread = migration_threads_add(p->name, qemu_get_thread_id()); trace_multifd_send_thread_start(p->id); rcu_register_thread(); @@ -767,7 +767,7 @@ out: qemu_mutex_unlock(&p->mutex); rcu_unregister_thread(); -MigrationThreadDel(thread); +migration_threads_remove(thread); trace_multifd_send_thread_end(p->id, p->num_packets, p->total_normal_pages); return NULL; diff --git a/migration/threadinfo.c b/migration/threadinfo.c index 1de8b31855..3dd9b14ae6 100644 --- a/migration/threadinfo.c +++ b/migration/threadinfo.c @@ -14,7 +14,7 @@ static QLIST_HEAD(, MigrationThread) migration_threads; -MigrationThread *MigrationThreadAdd(const char *name, int thread_id) +MigrationThread *migration_threads_add(const char *name, int thread_id) { MigrationThread *thread = g_new0(MigrationThread, 1); thread->name = name; @@ -25,7 +25,7 @@ MigrationThread *MigrationThreadAdd(const char *name, int thread_id) return thread; } -void MigrationThreadDel(MigrationThread *thread) +void migration_threads_remove(MigrationThread *thread) { if (thread) { QLIST_REMOVE(thread, node); -- 2.40.1
[PULL 06/25] migration: Introduce dirty-limit capability
From: Hyman Huang(黄勇) Introduce migration dirty-limit capability, which can be turned on before live migration and limit dirty page rate durty live migration. Introduce migrate_dirty_limit function to help check if dirty-limit capability enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that period can be configured instead of hardcoded. dirty-limit capability is kind of like auto-converge but using dirty limit instead of traditional cpu-throttle to throttle guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela Message-Id: <168618975839.6361.1740763387474768865...@git.sr.ht> Signed-off-by: Juan Quintela --- qapi/migration.json | 13 - migration/options.h | 1 + migration/options.c | 23 ++- softmmu/dirtylimit.c | 18 ++ 4 files changed, 49 insertions(+), 6 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 22104cef08..e094438d74 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -497,6 +497,16 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # +# @dirty-limit: If enabled, migration will use the dirty-limit algo to +# throttle down guest instead of auto-converge algo. +# Throttle algo only works when vCPU's dirtyrate greater +# than 'vcpu-dirty-limit', read processes in guest os +# aren't penalized any more, so this algo can improve +# performance of vCPU during live migration. This is an +# optional performance feature and should not affect the +# correctness of the existing auto-converge algo. +# (since 8.1) +# # Features: # # @unstable: Members @x-colo and @x-ignore-shared are experimental. @@ -512,7 +522,8 @@ 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', - 'zero-copy-send', 'postcopy-preempt', 'switchover-ack'] } + 'zero-copy-send', 'postcopy-preempt', 'switchover-ack', + 'dirty-limit'] } ## # @MigrationCapabilityStatus: diff --git a/migration/options.h b/migration/options.h index 9aaf363322..045e2a41a2 100644 --- a/migration/options.h +++ b/migration/options.h @@ -29,6 +29,7 @@ bool migrate_block(void); bool migrate_colo(void); bool migrate_compress(void); bool migrate_dirty_bitmaps(void); +bool migrate_dirty_limit(void); bool migrate_events(void); bool migrate_ignore_shared(void); bool migrate_late_block_activate(void); diff --git a/migration/options.c b/migration/options.c index 7d2d98830e..7d83f190d6 100644 --- a/migration/options.c +++ b/migration/options.c @@ -27,6 +27,7 @@ #include "qemu-file.h" #include "ram.h" #include "options.h" +#include "sysemu/kvm.h" /* Maximum migrate downtime set to 2000 seconds */ #define MAX_MIGRATE_DOWNTIME_SECONDS 2000 @@ -196,7 +197,7 @@ Property migration_properties[] = { #endif DEFINE_PROP_MIG_CAP("x-switchover-ack", MIGRATION_CAPABILITY_SWITCHOVER_ACK), - +DEFINE_PROP_MIG_CAP("x-dirty-limit", MIGRATION_CAPABILITY_DIRTY_LIMIT), DEFINE_PROP_END_OF_LIST(), }; @@ -242,6 +243,13 @@ bool migrate_dirty_bitmaps(void) return s->capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS]; } +bool migrate_dirty_limit(void) +{ +MigrationState *s = migrate_get_current(); + +return s->capabilities[MIGRATION_CAPABILITY_DIRTY_LIMIT]; +} + bool migrate_events(void) { MigrationState *s = migrate_get_current(); @@ -572,6 +580,19 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) return false; } } +if (new_caps[MIGRATION_CAPABILITY_DIRTY_LIMIT]) { +if (new_caps[MIGRATION_CAPABILITY_AUTO_CONVERGE]) { +error_setg(errp, "dirty-limit conflicts with auto-converge" + " either of then available currently"); +return false; +} + +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "dirty-limit requires KVM with accelerator" + " property 'dirty-ring-size' set"); +return false; +} +} return true; } diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index e80201097a..942d876523 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -24,6 +24,9 @@ #include "hw/boards.h" #include "sysemu/kvm.h" #include "trace.h" +#include "migration/misc.h" +#include "migration/migration.h" +#include "migration/options.h" /* * Dirtylimit stop working if dirty page rate error @@ -75,14 +78,21 @@ static bool
[PULL 14/25] migration-test: machine_opts is really arch specific
And it needs to be in both source and target, so put it on arch_opts. Reviewed-by: Peter Xu Message-ID: <20230608224943.3877-7-quint...@redhat.com> Signed-off-by: Juan Quintela --- tests/qtest/migration-test.c | 14 +- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index c723f083da..fd145e38d9 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -711,7 +711,6 @@ static int test_migrate_start(QTestState **from, QTestState **to, g_autofree char *shmem_opts = NULL; g_autofree char *shmem_path = NULL; const char *arch = qtest_get_arch(); -const char *machine_opts = NULL; const char *memory_size; if (args->use_shmem) { @@ -739,7 +738,6 @@ static int test_migrate_start(QTestState **from, QTestState **to, start_address = S390_TEST_MEM_START; end_address = S390_TEST_MEM_END; } else if (strcmp(arch, "ppc64") == 0) { -machine_opts = "-machine vsmt=8"; memory_size = "256M"; start_address = PPC_TEST_MEM_START; end_address = PPC_TEST_MEM_END; @@ -747,12 +745,12 @@ static int test_migrate_start(QTestState **from, QTestState **to, "'nvramrc=hex .\" _\" begin %x %x " "do i c@ 1 + i c! 1000 +loop .\" B\" 0 " "until'", end_address, start_address); -arch_opts = g_strdup("-nodefaults"); +arch_opts = g_strdup("-nodefaults -machine vsmt=8"); } else if (strcmp(arch, "aarch64") == 0) { init_bootfile(bootpath, aarch64_kernel, sizeof(aarch64_kernel)); -machine_opts = "-machine virt,gic-version=max"; memory_size = "150M"; -arch_opts = g_strdup_printf("-cpu max -kernel %s", bootpath); +arch_opts = g_strdup_printf("-machine virt,gic-version=max -cpu max " +"-kernel %s", bootpath); start_address = ARM_TEST_MEM_START; end_address = ARM_TEST_MEM_END; @@ -787,14 +785,13 @@ static int test_migrate_start(QTestState **from, QTestState **to, shmem_opts = g_strdup(""); } -cmd_source = g_strdup_printf("-accel kvm%s -accel tcg %s " +cmd_source = g_strdup_printf("-accel kvm%s -accel tcg " "-name source,debug-threads=on " "-m %s " "-serial file:%s/src_serial " "%s %s %s %s %s", args->use_dirty_ring ? ",dirty-ring-size=4096" : "", - machine_opts ? machine_opts : "", memory_size, tmpfs, arch_opts ? arch_opts : "", arch_source ? arch_source : "", @@ -808,7 +805,7 @@ static int test_migrate_start(QTestState **from, QTestState **to, &got_src_stop); } -cmd_target = g_strdup_printf("-accel kvm%s -accel tcg %s " +cmd_target = g_strdup_printf("-accel kvm%s -accel tcg " "-name target,debug-threads=on " "-m %s " "-serial file:%s/dest_serial " @@ -816,7 +813,6 @@ static int test_migrate_start(QTestState **from, QTestState **to, "%s %s %s %s %s", args->use_dirty_ring ? ",dirty-ring-size=4096" : "", - machine_opts ? machine_opts : "", memory_size, tmpfs, uri, arch_opts ? arch_opts : "", arch_target ? arch_target : "", -- 2.40.1
[PULL 10/25] migration: Extend query-migrate to provide dirty page limit info
From: Hyman Huang(黄勇) Extend query-migrate to provide throttle time and estimated ring full time with dirty-limit capability enabled, through which we can observe if dirty limit take effect during live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela Message-ID: <168733225273.5845.1587182678887974167...@git.sr.ht> Signed-off-by: Juan Quintela --- qapi/migration.json| 16 +- include/sysemu/dirtylimit.h| 2 ++ migration/migration-hmp-cmds.c | 10 + migration/migration.c | 10 + softmmu/dirtylimit.c | 39 ++ 5 files changed, 76 insertions(+), 1 deletion(-) diff --git a/qapi/migration.json b/qapi/migration.json index e094438d74..440660bced 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -250,6 +250,18 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-time-per-round: Maximum throttle time (in microseconds) of virtual +# CPUs each dirty ring full round, which shows how +# MigrationCapability dirty-limit affects the guest +# during live migration. (since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full time (in microseconds) +# each dirty ring full round, note that the value equals +# dirty ring memory size divided by average dirty page rate +# of virtual CPU, which can be used to observe the average +# memory load of virtual CPU indirectly. Note that zero +# means guest doesn't dirty memory (since 8.1) +# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -267,7 +279,9 @@ '*postcopy-blocktime': 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-time-per-round': 'uint64', + '*dirty-limit-ring-full-time': 'uint64'} } ## # @query-migrate: diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3a6b..d11edb 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +uint64_t dirtylimit_throttle_time_per_round(void); +uint64_t dirtylimit_ring_full_time(void); #endif diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 35e8020bbf..c115ef2d23 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -190,6 +190,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_time_per_round) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIu64 " us\n", + info->dirty_limit_throttle_time_per_round); +} + +if (info->has_dirty_limit_ring_full_time) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIu64 " us\n", + info->dirty_limit_ring_full_time); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/migration/migration.c b/migration/migration.c index 49332251e8..1ea7512291 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -64,6 +64,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "options.h" +#include "sysemu/dirtylimit.h" static NotifierList migration_state_notifiers = NOTIFIER_LIST_INITIALIZER(migration_state_notifiers); @@ -974,6 +975,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->dirty_pages_rate = stat64_get(&mig_stats.dirty_pages_rate); } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_time_per_round = true; +info->dirty_limit_throttle_time_per_round = +dirtylimit_throttle_time_per_round(); + +info->has_dirty_limit_ring_full_time = true; +info->dirty_limit_ring_full_time = dirtylimit_ring_full_time(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index a6d854d161..3c275ee55b 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -565,6 +565,45 @@ out: hmp_handle_error(mon, err); } +/* Return the max throttle time of each virtual CPU */ +uint64_t dirtylimit_throttle_time_per_round(void) +{ +
[PULL 09/25] migration: Implement dirty-limit convergence algo
From: Hyman Huang(黄勇) Implement dirty-limit convergence algo for live migration, which is kind of like auto-converge algo but using dirty-limit instead of cpu throttle to make migration convergent. Enable dirty page limit if dirty_rate_high_cnt greater than 2 when dirty-limit capability enabled, Disable dirty-limit if migration be canceled. Note that "set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit" commands are not allowed during dirty-limit live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Message-ID: <168733225273.5845.1587182678887974167...@git.sr.ht> Reviewed-by: Juan Quintela Signed-off-by: Juan Quintela --- migration/migration.c | 3 +++ migration/ram.c| 36 softmmu/dirtylimit.c | 29 + migration/trace-events | 1 + 4 files changed, 69 insertions(+) diff --git a/migration/migration.c b/migration/migration.c index ae49d42eab..49332251e8 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -166,6 +166,9 @@ void migration_cancel(const Error *error) if (error) { migrate_set_error(current_migration, error); } +if (migrate_dirty_limit()) { +qmp_cancel_vcpu_dirty_limit(false, -1, NULL); +} migrate_fd_cancel(current_migration); } diff --git a/migration/ram.c b/migration/ram.c index 1d9300f4c5..9040d66e61 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -46,6 +46,7 @@ #include "qapi/error.h" #include "qapi/qapi-types-migration.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qapi-commands-migration.h" #include "qapi/qmp/qerror.h" #include "trace.h" #include "exec/ram_addr.h" @@ -59,6 +60,8 @@ #include "multifd.h" #include "sysemu/runstate.h" #include "options.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/kvm.h" #include "hw/boards.h" /* for machine_dump_guest_core() */ @@ -984,6 +987,37 @@ static void migration_update_rates(RAMState *rs, int64_t end_time) } } +/* + * Enable dirty-limit to throttle down the guest + */ +static void migration_dirty_limit_guest(void) +{ +/* + * dirty page rate quota for all vCPUs fetched from + * migration parameter 'vcpu_dirty_limit' + */ +static int64_t quota_dirtyrate; +MigrationState *s = migrate_get_current(); + +/* + * If dirty limit already enabled and migration parameter + * vcpu-dirty-limit untouched. + */ +if (dirtylimit_in_service() && +quota_dirtyrate == s->parameters.vcpu_dirty_limit) { +return; +} + +quota_dirtyrate = s->parameters.vcpu_dirty_limit; + +/* + * Set all vCPU a quota dirtyrate, note that the second + * parameter will be ignored if setting all vCPU for the vm + */ +qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL); +trace_migration_dirty_limit_guest(quota_dirtyrate); +} + static void migration_trigger_throttle(RAMState *rs) { uint64_t threshold = migrate_throttle_trigger_threshold(); @@ -1013,6 +1047,8 @@ static void migration_trigger_throttle(RAMState *rs) trace_migration_throttle(); mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); +} else if (migrate_dirty_limit()) { +migration_dirty_limit_guest(); } } } diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 942d876523..a6d854d161 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -436,6 +436,23 @@ static void dirtylimit_cleanup(void) dirtylimit_state_finalize(); } +/* + * dirty page rate limit is not allowed to set if migration + * is running with dirty-limit capability enabled. + */ +static bool dirtylimit_is_allowed(void) +{ +MigrationState *ms = migrate_get_current(); + +if (migration_is_running(ms->state) && +(!qemu_thread_is_self(&ms->thread)) && +migrate_dirty_limit() && +dirtylimit_in_service()) { +return false; +} +return true; +} + void qmp_cancel_vcpu_dirty_limit(bool has_cpu_index, int64_t cpu_index, Error **errp) @@ -449,6 +466,12 @@ void qmp_cancel_vcpu_dirty_limit(bool has_cpu_index, return; } +if (!dirtylimit_is_allowed()) { +error_setg(errp, "can't cancel dirty page rate limit while" + " migration is running"); +return; +} + if (!dirtylimit_in_service()) { return; } @@ -499,6 +522,12 @@ void qmp_set_vcpu_dirty_limit(bool has_cpu_index, return; } +if (!dirtylimit_is_allowed()) { +error_setg(errp, "can't set dirty page rate limit while" + " migration is running"); +return; +} + if (!dirty_rate) { qmp_cancel_vcpu_dirty_limit(has_cpu_index, cpu_index, errp); return; diff --git a/migration/trace-events b/migration/trace-events index 4e43fe20fc..4666f19325 1
[PULL 05/25] qapi/migration: Introduce vcpu-dirty-limit parameters
From: Hyman Huang(黄勇) Introduce "vcpu-dirty-limit" migration parameter used to limit dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration by qmp migrate-set-parameters. This two parameters are used to help implement the dirty page rate limit algo of migration. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela Message-Id: <168618975839.6361.1740763387474768865...@git.sr.ht> Signed-off-by: Juan Quintela --- qapi/migration.json| 18 +++--- migration/migration-hmp-cmds.c | 8 migration/options.c| 21 + 3 files changed, 44 insertions(+), 3 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index d83b694fef..22104cef08 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -793,6 +793,9 @@ # live migration. Should be in the range 1 to 1000ms, # defaults to 1000ms. (Since 8.1) # +# @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. +#Defaults to 1. (Since 8.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period @@ -816,7 +819,8 @@ 'max-cpu-throttle', 'multifd-compression', 'multifd-zlib-level', 'multifd-zstd-level', 'block-bitmap-mapping', - { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] } ] } + { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] }, + 'vcpu-dirty-limit'] } ## # @MigrateSetParameters: @@ -955,6 +959,9 @@ # live migration. Should be in the range 1 to 1000ms, # defaults to 1000ms. (Since 8.1) # +# @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. +#Defaults to 1. (Since 8.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period @@ -995,7 +1002,8 @@ '*multifd-zstd-level': 'uint8', '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ], '*x-vcpu-dirty-limit-period': { 'type': 'uint64', -'features': [ 'unstable' ] } } } +'features': [ 'unstable' ] }, +'*vcpu-dirty-limit': 'uint64'} } ## # @migrate-set-parameters: @@ -1154,6 +1162,9 @@ # live migration. Should be in the range 1 to 1000ms, # defaults to 1000ms. (Since 8.1) # +# @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. +#Defaults to 1. (Since 8.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period @@ -1191,7 +1202,8 @@ '*multifd-zstd-level': 'uint8', '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ], '*x-vcpu-dirty-limit-period': { 'type': 'uint64', -'features': [ 'unstable' ] } } } +'features': [ 'unstable' ] }, +'*vcpu-dirty-limit': 'uint64'} } ## # @query-migrate-parameters: diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 352e9ec716..35e8020bbf 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -368,6 +368,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %" PRIu64 " ms\n", MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), params->x_vcpu_dirty_limit_period); + +monitor_printf(mon, "%s: %" PRIu64 " MB/s\n", +MigrationParameter_str(MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT), +params->vcpu_dirty_limit); } qapi_free_MigrationParameters(params); @@ -628,6 +632,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_x_vcpu_dirty_limit_period = true; visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); break; +case MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT: +p->has_vcpu_dirty_limit = true; +visit_type_size(v, param, &p->vcpu_dirty_limit, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 1de63ba775..7d2d98830e 100644 --- a/migration/options.c +++ b/migration/options.c @@ -81,6 +81,7 @@ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT1 /* MB/s */ Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, @@ -168,6 +169,9 @@ Property migration_properties[] = { DE
[PULL 08/25] migration: Put the detection logic before auto-converge checking
From: Hyman Huang(黄勇) This commit is prepared for the implementation of dirty-limit convergence algo. The detection logic of throttling condition can apply to both auto-converge and dirty-limit algo, putting it's position before the checking logic for auto-converge feature. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Juan Quintela Message-ID: <168733225273.5845.1587182678887974167...@git.sr.ht> Signed-off-by: Juan Quintela --- migration/ram.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index f31de47a47..1d9300f4c5 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -999,17 +999,18 @@ static void migration_trigger_throttle(RAMState *rs) return; } -if (migrate_auto_converge()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +if (migrate_auto_converge()) { trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); } -- 2.40.1
[PULL 17/25] migration: Update error description whenever migration fails
From: Tejus GK There are places in migration.c where the migration is marked failed with MIGRATION_STATUS_FAILED, but the failure reason is never updated. Hence libvirt doesn't know why the migration failed when it queries for it. Reviewed-by: Daniel P. Berrangé Signed-off-by: Tejus GK Message-ID: <20230621130940.178659-2-tejus...@nutanix.com> Signed-off-by: Juan Quintela --- migration/migration.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 1ea7512291..5528acb65e 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1689,7 +1689,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk, if (!resume_requested) { yank_unregister_instance(MIGRATION_YANK_INSTANCE); } -error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "uri", +error_setg(&local_err, QERR_INVALID_PARAMETER_VALUE, "uri", "a valid migration protocol"); migrate_set_state(&s->state, MIGRATION_STATUS_SETUP, MIGRATION_STATUS_FAILED); @@ -2082,7 +2082,7 @@ migration_wait_main_channel(MigrationState *ms) * Switch from normal iteration to postcopy * Returns non-0 on error */ -static int postcopy_start(MigrationState *ms) +static int postcopy_start(MigrationState *ms, Error **errp) { int ret; QIOChannelBuffer *bioc; @@ -2192,7 +2192,7 @@ static int postcopy_start(MigrationState *ms) */ ret = qemu_file_get_error(ms->to_dst_file); if (ret) { -error_report("postcopy_start: Migration stream errored (pre package)"); +error_setg(errp, "postcopy_start: Migration stream errored (pre package)"); goto fail_closefb; } @@ -2229,7 +2229,7 @@ static int postcopy_start(MigrationState *ms) ret = qemu_file_get_error(ms->to_dst_file); if (ret) { -error_report("postcopy_start: Migration stream errored"); +error_setg(errp, "postcopy_start: Migration stream errored"); migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE, MIGRATION_STATUS_FAILED); } @@ -2750,6 +2750,7 @@ typedef enum { static MigIterateState migration_iteration_run(MigrationState *s) { uint64_t must_precopy, can_postcopy; +Error *local_err = NULL; bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE; bool can_switchover = migration_can_switchover(s); @@ -2773,8 +2774,9 @@ static MigIterateState migration_iteration_run(MigrationState *s) /* Still a significant amount to transfer */ if (!in_postcopy && must_precopy <= s->threshold_size && can_switchover && qatomic_read(&s->start_postcopy)) { -if (postcopy_start(s)) { -error_report("%s: postcopy failed to start", __func__); +if (postcopy_start(s, &local_err)) { +migrate_set_error(s, local_err); +error_report_err(local_err); } return MIG_ITERATE_SKIP; } @@ -3265,8 +3267,10 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) */ if (migrate_postcopy_ram() || migrate_return_path()) { if (open_return_path_on_source(s, !resume)) { -error_report("Unable to open return-path for postcopy"); +error_setg(&local_err, "Unable to open return-path for postcopy"); migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED); +migrate_set_error(s, local_err); +error_report_err(local_err); migrate_fd_cleanup(s); return; } @@ -3290,6 +3294,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) } if (multifd_save_setup(&local_err) != 0) { +migrate_set_error(s, local_err); error_report_err(local_err); migrate_set_state(&s->state, MIGRATION_STATUS_SETUP, MIGRATION_STATUS_FAILED); -- 2.40.1
[PULL 11/25] migration-test: Be consistent for ppc
It makes no sense that we don't have the same configuration on both sides. Reviewed-by: Laurent Vivier Message-ID: <20230608224943.3877-2-quint...@redhat.com> Signed-off-by: Juan Quintela --- tests/qtest/migration-test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index e256da1216..2296ed4bf5 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -748,7 +748,7 @@ static int test_migrate_start(QTestState **from, QTestState **to, "'nvramrc=hex .\" _\" begin %x %x " "do i c@ 1 + i c! 1000 +loop .\" B\" 0 " "until'", end_address, start_address); -arch_target = g_strdup(""); +arch_target = g_strdup("-nodefaults"); } else if (strcmp(arch, "aarch64") == 0) { init_bootfile(bootpath, aarch64_kernel, sizeof(aarch64_kernel)); machine_opts = "virt,gic-version=max"; -- 2.40.1
[PULL 03/25] softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit"
From: Hyman Huang(黄勇) dirty_rate paraemter of hmp command "set_vcpu_dirty_limit" is invalid if less than 0, so add parameter check for it. Note that this patch also delete the unsolicited help message and clean up the code. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Peter Xu Reviewed-by: Juan Quintela Message-Id: <168618975839.6361.1740763387474768865...@git.sr.ht> Signed-off-by: Juan Quintela --- softmmu/dirtylimit.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 015a9038d1..e80201097a 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -515,14 +515,15 @@ void hmp_set_vcpu_dirty_limit(Monitor *mon, const QDict *qdict) int64_t cpu_index = qdict_get_try_int(qdict, "cpu_index", -1); Error *err = NULL; +if (dirty_rate < 0) { +error_setg(&err, "invalid dirty page limit %" PRId64, dirty_rate); +goto out; +} + qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); -if (err) { -hmp_handle_error(mon, err); -return; -} -monitor_printf(mon, "[Please use 'info vcpu_dirty_limit' to query " - "dirty limit for virtual CPU]\n"); +out: +hmp_handle_error(mon, err); } static struct DirtyLimitInfo *dirtylimit_query_vcpu(int cpu_index) -- 2.40.1
[PULL 02/25] migration/multifd: Protect accesses to migration_threads
From: Fabiano Rosas This doubly linked list is common for all the multifd and migration threads so we need to avoid concurrent access. Add a mutex to protect the data from concurrent access. This fixes a crash when removing two MigrationThread objects from the list at the same time during cleanup of multifd threads. Fixes: 671326201d ("migration: Introduce interface query-migrationthreads") Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu Reviewed-by: Juan Quintela Message-Id: <20230607161306.31425-3-faro...@suse.de> Signed-off-by: Juan Quintela --- migration/threadinfo.h | 2 -- migration/threadinfo.c | 15 ++- 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/migration/threadinfo.h b/migration/threadinfo.h index 8aa6999d58..2f356ff312 100644 --- a/migration/threadinfo.h +++ b/migration/threadinfo.h @@ -10,8 +10,6 @@ * See the COPYING file in the top-level directory. */ -#include "qemu/queue.h" -#include "qemu/osdep.h" #include "qapi/error.h" #include "qapi/qapi-commands-migration.h" diff --git a/migration/threadinfo.c b/migration/threadinfo.c index 3dd9b14ae6..262990dd75 100644 --- a/migration/threadinfo.c +++ b/migration/threadinfo.c @@ -10,23 +10,35 @@ * See the COPYING file in the top-level directory. */ +#include "qemu/osdep.h" +#include "qemu/queue.h" +#include "qemu/lockable.h" #include "threadinfo.h" +QemuMutex migration_threads_lock; static QLIST_HEAD(, MigrationThread) migration_threads; +static void __attribute__((constructor)) migration_threads_init(void) +{ +qemu_mutex_init(&migration_threads_lock); +} + MigrationThread *migration_threads_add(const char *name, int thread_id) { MigrationThread *thread = g_new0(MigrationThread, 1); thread->name = name; thread->thread_id = thread_id; -QLIST_INSERT_HEAD(&migration_threads, thread, node); +WITH_QEMU_LOCK_GUARD(&migration_threads_lock) { +QLIST_INSERT_HEAD(&migration_threads, thread, node); +} return thread; } void migration_threads_remove(MigrationThread *thread) { +QEMU_LOCK_GUARD(&migration_threads_lock); if (thread) { QLIST_REMOVE(thread, node); g_free(thread); @@ -39,6 +51,7 @@ MigrationThreadInfoList *qmp_query_migrationthreads(Error **errp) MigrationThreadInfoList **tail = &head; MigrationThread *thread = NULL; +QEMU_LOCK_GUARD(&migration_threads_lock); QLIST_FOREACH(thread, &migration_threads, node) { MigrationThreadInfo *info = g_new0(MigrationThreadInfo, 1); info->name = g_strdup(thread->name); -- 2.40.1
[PULL 15/25] migration: skipped field is really obsolete.
Has return zero for more than 10 years. Specifically we introduced the field in 1.5.0 commit f1c72795af573b24a7da5eb52375c9aba8a37972 Author: Peter Lieven Date: Tue Mar 26 10:58:37 2013 +0100 migration: do not sent zero pages in bulk stage during bulk stage of ram migration if a page is a zero page do not send it at all. the memory at the destination reads as zero anyway. even if there is an madvise with QEMU_MADV_DONTNEED at the target upon receipt of a zero page I have observed that the target starts swapping if the memory is overcommitted. it seems that the pages are dropped asynchronously. this patch also updates QMP to return the number of skipped pages in MigrationStats. but removed its usage in 1.5.3 commit 9ef051e5536b6368a1076046ec6c4ec4ac12b5c6 Author: Peter Lieven Date: Mon Jun 10 12:14:19 2013 +0200 Revert "migration: do not sent zero pages in bulk stage" Not sending zero pages breaks migration if a page is zero at the source but not at the destination. This can e.g. happen if different BIOS versions are used at source and destination. It has also been reported that migration on pseries is completely broken with this patch. This effectively reverts commit f1c72795af573b24a7da5eb52375c9aba8a37972. Reviewed-by: Daniel P. Berrangé Message-ID: <20230612193344.3796-2-quint...@redhat.com> Signed-off-by: Juan Quintela --- docs/about/deprecated.rst | 10 ++ qapi/migration.json | 12 ++-- 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst index 02ea5a839f..1c35f55666 100644 --- a/docs/about/deprecated.rst +++ b/docs/about/deprecated.rst @@ -451,3 +451,13 @@ both, older and future versions of QEMU. The ``blacklist`` config file option has been renamed to ``block-rpcs`` (to be in sync with the renaming of the corresponding command line option). + +Migration +- + +``skipped`` MigrationStats field (since 8.1) + + +``skipped`` field in Migration stats has been deprecated. It hasn't +been used for more than 10 years. + diff --git a/qapi/migration.json b/qapi/migration.json index 440660bced..388425b4c8 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -23,7 +23,8 @@ # # @duplicate: number of duplicate (zero) pages (since 1.2) # -# @skipped: number of skipped zero pages (since 1.5) +# @skipped: number of skipped zero pages. Always zero, only provided for +# compatibility (since 1.5) # # @normal: number of normal pages (since 1.2) # @@ -62,11 +63,18 @@ # between 0 and @dirty-sync-count * @multifd-channels. (since # 7.1) # +# Features: +# +# @deprecated: Member @skipped is always zero since 1.5.3 +# # Since: 0.14 +# ## { 'struct': 'MigrationStats', 'data': {'transferred': 'int', 'remaining': 'int', 'total': 'int' , - 'duplicate': 'int', 'skipped': 'int', 'normal': 'int', + 'duplicate': 'int', + 'skipped': { 'type': 'int', 'features': ['deprecated'] }, + 'normal': 'int', 'normal-bytes': 'int', 'dirty-pages-rate': 'int', 'mbps': 'number', 'dirty-sync-count': 'int', 'postcopy-requests': 'int', 'page-size': 'int', -- 2.40.1
[PULL 21/25] migration: Change qemu_file_transferred to noflush
We do a qemu_fclose() just after that, that also does a qemu_fflush(), so remove one qemu_fflush(). Reviewed-by: Philippe Mathieu-Daudé Message-ID: <20230530183941.7223-3-quint...@redhat.com> Signed-off-by: Juan Quintela --- migration/savevm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/migration/savevm.c b/migration/savevm.c index 0b2583a205..a2cb8855e2 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -3007,7 +3007,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate, goto the_end; } ret = qemu_savevm_state(f, errp); -vm_state_size = qemu_file_transferred(f); +vm_state_size = qemu_file_transferred_noflush(f); ret2 = qemu_fclose(f); if (ret < 0) { goto the_end; -- 2.40.1
[PULL 20/25] qemu-file: Rename qemu_file_transferred_ fast -> noflush
Fast don't say much. Noflush indicates more clearly that it is like qemu_file_transferred but without the flush. Reviewed-by: Philippe Mathieu-Daudé Message-ID: <20230530183941.7223-2-quint...@redhat.com> Signed-off-by: Juan Quintela --- migration/qemu-file.h | 11 +-- migration/qemu-file.c | 2 +- migration/savevm.c| 4 ++-- migration/vmstate.c | 4 ++-- 4 files changed, 10 insertions(+), 11 deletions(-) diff --git a/migration/qemu-file.h b/migration/qemu-file.h index e649718492..aa6eee66da 100644 --- a/migration/qemu-file.h +++ b/migration/qemu-file.h @@ -86,16 +86,15 @@ int qemu_fclose(QEMUFile *f); uint64_t qemu_file_transferred(QEMUFile *f); /* - * qemu_file_transferred_fast: + * qemu_file_transferred_noflush: * - * As qemu_file_transferred except for writable - * files, where no flush is performed and the reported - * amount will include the size of any queued buffers, - * on top of the amount actually transferred. + * As qemu_file_transferred except for writable files, where no flush + * is performed and the reported amount will include the size of any + * queued buffers, on top of the amount actually transferred. * * Returns: the total bytes transferred and queued */ -uint64_t qemu_file_transferred_fast(QEMUFile *f); +uint64_t qemu_file_transferred_noflush(QEMUFile *f); /* * put_buffer without copying the buffer. diff --git a/migration/qemu-file.c b/migration/qemu-file.c index acc282654a..fdf115b5da 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -694,7 +694,7 @@ int coroutine_mixed_fn qemu_get_byte(QEMUFile *f) return result; } -uint64_t qemu_file_transferred_fast(QEMUFile *f) +uint64_t qemu_file_transferred_noflush(QEMUFile *f) { uint64_t ret = f->total_transferred; int i; diff --git a/migration/savevm.c b/migration/savevm.c index 51e40e3a0b..0b2583a205 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -927,9 +927,9 @@ static int vmstate_load(QEMUFile *f, SaveStateEntry *se) static void vmstate_save_old_style(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc) { -uint64_t old_offset = qemu_file_transferred_fast(f); +uint64_t old_offset = qemu_file_transferred_noflush(f); se->ops->save_state(f, se->opaque); -uint64_t size = qemu_file_transferred_fast(f) - old_offset; +uint64_t size = qemu_file_transferred_noflush(f) - old_offset; if (vmdesc) { json_writer_int64(vmdesc, "size", size); diff --git a/migration/vmstate.c b/migration/vmstate.c index af01d54b6f..31842c3afb 100644 --- a/migration/vmstate.c +++ b/migration/vmstate.c @@ -361,7 +361,7 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd, void *curr_elem = first_elem + size * i; vmsd_desc_field_start(vmsd, vmdesc_loop, field, i, n_elems); -old_offset = qemu_file_transferred_fast(f); +old_offset = qemu_file_transferred_noflush(f); if (field->flags & VMS_ARRAY_OF_POINTER) { assert(curr_elem); curr_elem = *(void **)curr_elem; @@ -391,7 +391,7 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd, return ret; } -written_bytes = qemu_file_transferred_fast(f) - old_offset; +written_bytes = qemu_file_transferred_noflush(f) - old_offset; vmsd_desc_field_end(vmsd, vmdesc_loop, field, written_bytes, i); /* Compressed arrays only care about the first element */ -- 2.40.1
[PULL 16/25] docs/migration: Update postcopy bits
From: Peter Xu We have postcopy recovery but not reflected in the document, do an update for that. Add a very small section on postcopy preempt. Touch up the pagemap section, dropping the unsent map because it's already been dropped in the source code in commit 1e7cf8c323 ("migration/postcopy: unsentmap is not necessary for postcopy"). Touch up the postcopy section to remove "network connection" failures as downside, because now it's not fatal and can be recovered. Suggested by Laszlo. Acked-by: Laszlo Ersek Signed-off-by: Peter Xu Reviewed-by: Juan Quintela Message-ID: <20230706115611.371048-1-pet...@redhat.com> Signed-off-by: Juan Quintela --- docs/devel/migration.rst | 94 1 file changed, 67 insertions(+), 27 deletions(-) diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst index 6f65c23b47..c3e1400c0c 100644 --- a/docs/devel/migration.rst +++ b/docs/devel/migration.rst @@ -594,8 +594,7 @@ Postcopy 'Postcopy' migration is a way to deal with migrations that refuse to converge (or take too long to converge) its plus side is that there is an upper bound on the amount of migration traffic and time it takes, the down side is that during -the postcopy phase, a failure of *either* side or the network connection causes -the guest to be lost. +the postcopy phase, a failure of *either* side causes the guest to be lost. In postcopy the destination CPUs are started before all the memory has been transferred, and accesses to pages that are yet to be transferred cause @@ -721,6 +720,42 @@ processing. is no longer used by migration, while the listen thread carries on servicing page data until the end of migration. +Postcopy Recovery +- + +Comparing to precopy, postcopy is special on error handlings. When any +error happens (in this case, mostly network errors), QEMU cannot easily +fail a migration because VM data resides in both source and destination +QEMU instances. On the other hand, when issue happens QEMU on both sides +will go into a paused state. It'll need a recovery phase to continue a +paused postcopy migration. + +The recovery phase normally contains a few steps: + + - When network issue occurs, both QEMU will go into PAUSED state + + - When the network is recovered (or a new network is provided), the admin +can setup the new channel for migration using QMP command +'migrate-recover' on destination node, preparing for a resume. + + - On source host, the admin can continue the interrupted postcopy +migration using QMP command 'migrate' with resume=true flag set. + + - After the connection is re-established, QEMU will continue the postcopy +migration on both sides. + +During a paused postcopy migration, the VM can logically still continue +running, and it will not be impacted from any page access to pages that +were already migrated to destination VM before the interruption happens. +However, if any of the missing pages got accessed on destination VM, the VM +thread will be halted waiting for the page to be migrated, it means it can +be halted until the recovery is complete. + +The impact of accessing missing pages can be relevant to different +configurations of the guest. For example, when with async page fault +enabled, logically the guest can proactively schedule out the threads +accessing missing pages. + Postcopy states --- @@ -765,36 +800,31 @@ ADVISE->DISCARD->LISTEN->RUNNING->END (although it can't do the cleanup it would do as it finishes a normal migration). + - Paused + +Postcopy can run into a paused state (normally on both sides when +happens), where all threads will be temporarily halted mostly due to +network errors. When reaching paused state, migration will make sure +the qemu binary on both sides maintain the data without corrupting +the VM. To continue the migration, the admin needs to fix the +migration channel using the QMP command 'migrate-recover' on the +destination node, then resume the migration using QMP command 'migrate' +again on source node, with resume=true flag set. + - End The listen thread can now quit, and perform the cleanup of migration state, the migration is now complete. -Source side page maps -- - -The source side keeps two bitmaps during postcopy; 'the migration bitmap' -and 'unsent map'. The 'migration bitmap' is basically the same as in -the precopy case, and holds a bit to indicate that page is 'dirty' - -i.e. needs sending. During the precopy phase this is updated as the CPU -dirties pages, however during postcopy the CPUs are stopped and nothing -should dirty anything any more. - -The 'unsent map' is used for the transition to postcopy. It is a bitmap that -has a bit cleared whenever a page is sent to the destination, however during -the transition to postcopy mode it is combined with the migration bitmap -to form a set of pages that: - - a)
[PULL 18/25] migration: enforce multifd and postcopy preempt to be set before incoming
From: Wei Wang qemu_start_incoming_migration needs to check the number of multifd channels or postcopy ram channels to configure the backlog parameter (i.e. the maximum length to which the queue of pending connections for sockfd may grow) of listen(). So enforce the usage of postcopy-preempt and multifd as below: - need to use "-incoming defer" on the destination; and - set_capability and set_parameter need to be done before migrate_incoming Otherwise, disable the use of the features and report error messages to remind users to adjust the commands. Signed-off-by: Wei Wang Reviewed-by: Peter Xu Reviewed-by: Juan Quintela Message-ID: <20230606101910.20456-2-wei.w.w...@intel.com> Signed-off-by: Juan Quintela Acked-by: Juan Quintela --- migration/options.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/migration/options.c b/migration/options.c index 7d83f190d6..1d1e1321b0 100644 --- a/migration/options.c +++ b/migration/options.c @@ -441,6 +441,11 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot, MIGRATION_CAPABILITY_VALIDATE_UUID, MIGRATION_CAPABILITY_ZERO_COPY_SEND); +static bool migrate_incoming_started(void) +{ +return !!migration_incoming_get_current()->transport_data; +} + /** * @migration_caps_check - check capability compatibility * @@ -564,6 +569,12 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) error_setg(errp, "Postcopy preempt not compatible with compress"); return false; } + +if (migrate_incoming_started()) { +error_setg(errp, + "Postcopy preempt must be set before incoming starts"); +return false; +} } if (new_caps[MIGRATION_CAPABILITY_MULTIFD]) { @@ -571,6 +582,10 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) error_setg(errp, "Multifd is not compatible with compress"); return false; } +if (migrate_incoming_started()) { +error_setg(errp, "Multifd must be set before incoming starts"); +return false; +} } if (new_caps[MIGRATION_CAPABILITY_SWITCHOVER_ACK]) { -- 2.40.1
[PULL 25/25] migration/rdma: Split qemu_fopen_rdma() into input/output functions
This is how everything else in QEMUFile is structured. As a bonus they are three less lines of code. Reviewed-by: Peter Xu Message-ID: <20230530183941.7223-17-quint...@redhat.com> Signed-off-by: Juan Quintela --- migration/qemu-file.h | 1 - migration/qemu-file.c | 12 migration/rdma.c | 39 +++ 3 files changed, 19 insertions(+), 33 deletions(-) diff --git a/migration/qemu-file.h b/migration/qemu-file.h index 8b8b7d27fe..47015f5201 100644 --- a/migration/qemu-file.h +++ b/migration/qemu-file.h @@ -102,7 +102,6 @@ uint64_t qemu_file_transferred_noflush(QEMUFile *f); */ void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size, bool may_free); -bool qemu_file_mode_is_not_valid(const char *mode); #include "migration/qemu-file-types.h" diff --git a/migration/qemu-file.c b/migration/qemu-file.c index d30bf3c377..19c33c9985 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -100,18 +100,6 @@ int qemu_file_shutdown(QEMUFile *f) return 0; } -bool qemu_file_mode_is_not_valid(const char *mode) -{ -if (mode == NULL || -(mode[0] != 'r' && mode[0] != 'w') || -mode[1] != 'b' || mode[2] != 0) { -fprintf(stderr, "qemu_fopen: Argument validity check failed\n"); -return true; -} - -return false; -} - static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable) { QEMUFile *f; diff --git a/migration/rdma.c b/migration/rdma.c index dd1c039e6c..ca430d319d 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -4053,27 +4053,26 @@ static void qio_channel_rdma_register_types(void) type_init(qio_channel_rdma_register_types); -static QEMUFile *qemu_fopen_rdma(RDMAContext *rdma, const char *mode) +static QEMUFile *rdma_new_input(RDMAContext *rdma) { -QIOChannelRDMA *rioc; +QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(object_new(TYPE_QIO_CHANNEL_RDMA)); -if (qemu_file_mode_is_not_valid(mode)) { -return NULL; -} +rioc->file = qemu_file_new_input(QIO_CHANNEL(rioc)); +rioc->rdmain = rdma; +rioc->rdmaout = rdma->return_path; +qemu_file_set_hooks(rioc->file, &rdma_read_hooks); -rioc = QIO_CHANNEL_RDMA(object_new(TYPE_QIO_CHANNEL_RDMA)); +return rioc->file; +} -if (mode[0] == 'w') { -rioc->file = qemu_file_new_output(QIO_CHANNEL(rioc)); -rioc->rdmaout = rdma; -rioc->rdmain = rdma->return_path; -qemu_file_set_hooks(rioc->file, &rdma_write_hooks); -} else { -rioc->file = qemu_file_new_input(QIO_CHANNEL(rioc)); -rioc->rdmain = rdma; -rioc->rdmaout = rdma->return_path; -qemu_file_set_hooks(rioc->file, &rdma_read_hooks); -} +static QEMUFile *rdma_new_output(RDMAContext *rdma) +{ +QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(object_new(TYPE_QIO_CHANNEL_RDMA)); + +rioc->file = qemu_file_new_output(QIO_CHANNEL(rioc)); +rioc->rdmaout = rdma; +rioc->rdmain = rdma->return_path; +qemu_file_set_hooks(rioc->file, &rdma_write_hooks); return rioc->file; } @@ -4099,9 +4098,9 @@ static void rdma_accept_incoming_migration(void *opaque) return; } -f = qemu_fopen_rdma(rdma, "rb"); +f = rdma_new_input(rdma); if (f == NULL) { -fprintf(stderr, "RDMA ERROR: could not qemu_fopen_rdma\n"); +fprintf(stderr, "RDMA ERROR: could not open RDMA for input\n"); qemu_rdma_cleanup(rdma); return; } @@ -4224,7 +4223,7 @@ void rdma_start_outgoing_migration(void *opaque, trace_rdma_start_outgoing_migration_after_rdma_connect(); -s->to_dst_file = qemu_fopen_rdma(rdma, "wb"); +s->to_dst_file = rdma_new_output(rdma); migrate_fd_connect(s, NULL); return; return_path_err: -- 2.40.1
[PULL 13/25] migration-test: Create arch_opts
This will contain the options needed for both source and target. Reviewed-by: Peter Xu Message-ID: <20230608224943.3877-6-quint...@redhat.com> Signed-off-by: Juan Quintela --- tests/qtest/migration-test.c | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index f51a25e299..c723f083da 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -702,6 +702,8 @@ static int test_migrate_start(QTestState **from, QTestState **to, { g_autofree gchar *arch_source = NULL; g_autofree gchar *arch_target = NULL; +/* options for source and target */ +g_autofree gchar *arch_opts = NULL; g_autofree gchar *cmd_source = NULL; g_autofree gchar *cmd_target = NULL; const gchar *ignore_stderr; @@ -727,15 +729,13 @@ static int test_migrate_start(QTestState **from, QTestState **to, assert(sizeof(x86_bootsect) == 512); init_bootfile(bootpath, x86_bootsect, sizeof(x86_bootsect)); memory_size = "150M"; -arch_source = g_strdup_printf("-drive file=%s,format=raw", bootpath); -arch_target = g_strdup(arch_source); +arch_opts = g_strdup_printf("-drive file=%s,format=raw", bootpath); start_address = X86_TEST_MEM_START; end_address = X86_TEST_MEM_END; } else if (g_str_equal(arch, "s390x")) { init_bootfile(bootpath, s390x_elf, sizeof(s390x_elf)); memory_size = "128M"; -arch_source = g_strdup_printf("-bios %s", bootpath); -arch_target = g_strdup(arch_source); +arch_opts = g_strdup_printf("-bios %s", bootpath); start_address = S390_TEST_MEM_START; end_address = S390_TEST_MEM_END; } else if (strcmp(arch, "ppc64") == 0) { @@ -743,20 +743,16 @@ static int test_migrate_start(QTestState **from, QTestState **to, memory_size = "256M"; start_address = PPC_TEST_MEM_START; end_address = PPC_TEST_MEM_END; -arch_source = g_strdup_printf("-nodefaults " - "-prom-env 'use-nvramrc?=true' -prom-env " +arch_source = g_strdup_printf("-prom-env 'use-nvramrc?=true' -prom-env " "'nvramrc=hex .\" _\" begin %x %x " "do i c@ 1 + i c! 1000 +loop .\" B\" 0 " "until'", end_address, start_address); -arch_target = g_strdup("-nodefaults"); +arch_opts = g_strdup("-nodefaults"); } else if (strcmp(arch, "aarch64") == 0) { init_bootfile(bootpath, aarch64_kernel, sizeof(aarch64_kernel)); machine_opts = "-machine virt,gic-version=max"; memory_size = "150M"; -arch_source = g_strdup_printf("-cpu max " - "-kernel %s", - bootpath); -arch_target = g_strdup(arch_source); +arch_opts = g_strdup_printf("-cpu max -kernel %s", bootpath); start_address = ARM_TEST_MEM_START; end_address = ARM_TEST_MEM_END; @@ -795,12 +791,14 @@ static int test_migrate_start(QTestState **from, QTestState **to, "-name source,debug-threads=on " "-m %s " "-serial file:%s/src_serial " - "%s %s %s %s", + "%s %s %s %s %s", args->use_dirty_ring ? ",dirty-ring-size=4096" : "", machine_opts ? machine_opts : "", memory_size, tmpfs, - arch_source, shmem_opts, + arch_opts ? arch_opts : "", + arch_source ? arch_source : "", + shmem_opts, args->opts_source ? args->opts_source : "", ignore_stderr); if (!args->only_target) { @@ -815,12 +813,14 @@ static int test_migrate_start(QTestState **from, QTestState **to, "-m %s " "-serial file:%s/dest_serial " "-incoming %s " - "%s %s %s %s", + "%s %s %s %s %s", args->use_dirty_ring ? ",dirty-ring-size=4096" : "", machine_opts ? machine_opts : "", memory_size, tmpfs, uri, - arch_target, shmem_opts, + arch_opts ? arch_opts : "", + arch_target ? arch_target : "", + shmem_opts,
[PULL 19/25] qtest/migration-tests.c: use "-incoming defer" for postcopy tests
From: Wei Wang The Postcopy preempt capability is expected to be set before incoming starts, so change the postcopy tests to start with deferred incoming and call migrate-incoming after the cap has been set. Why the existing tests (without this patch) didn't fail? There could be two reasons: 1) "backlog" specifies the number of pending connections. As long as the server accepts the connections faster than the clients side connecting, connection will succeed. For the preempt test, it uses only 2 channels, so very likely to not have pending connections. 2) per my tests (on kernel 6.2), the number of pending connections allowed is actually "backlog + 1", which is 2 in this case. That said, the implementation of socket_start_incoming_migration_internal expects "migrate defer" to be used, and for safety, change the test to work with the expected usage. Signed-off-by: Wei Wang Reviewed-by: Peter Xu Reviewed-by: Juan Quintela Message-ID: <20230606101910.20456-3-wei.w.w...@intel.com> Signed-off-by: Juan Quintela --- tests/qtest/migration-test.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index fd145e38d9..62d3f37021 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -1239,10 +1239,9 @@ static int migrate_postcopy_prepare(QTestState **from_ptr, QTestState **to_ptr, MigrateCommon *args) { -g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); QTestState *from, *to; -if (test_migrate_start(&from, &to, uri, &args->start)) { +if (test_migrate_start(&from, &to, "defer", &args->start)) { return -1; } @@ -1262,10 +1261,13 @@ static int migrate_postcopy_prepare(QTestState **from_ptr, migrate_ensure_non_converge(from); migrate_prepare_for_dirty_mem(from); +qtest_qmp_assert_success(to, "{ 'execute': 'migrate-incoming'," + " 'arguments': { 'uri': 'tcp:127.0.0.1:0' }}"); /* Wait for the first serial output from the source */ wait_for_serial("src_serial"); +g_autofree char *uri = migrate_get_socket_address(to, "socket-address"); migrate_qmp(from, uri, "{}"); migrate_wait_for_dirty_mem(from, to); -- 2.40.1
[PULL 07/25] migration: Refactor auto-converge capability logic
From: Hyman Huang(黄勇) Check if block migration is running before throttling guest down in auto-converge way. Note that this modification is kind of like code clean, because block migration does not depend on auto-converge capability, so the order of checks can be adjusted. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela Message-Id: <168618975839.6361.1740763387474768865...@git.sr.ht> Signed-off-by: Juan Quintela --- migration/ram.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/migration/ram.c b/migration/ram.c index 0ada6477e8..f31de47a47 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -995,7 +995,11 @@ static void migration_trigger_throttle(RAMState *rs) /* During block migration the auto-converge logic incorrectly detects * that ram migration makes no progress. Avoid this by disabling the * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { +if (blk_mig_bulk_active()) { +return; +} + +if (migrate_auto_converge()) { /* The following detection logic can be refined later. For now: Check to see if the ratio between dirtied bytes and the approx. amount of bytes that just got transferred since the last time -- 2.40.1
Re: [PULL 1/5] qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation
On 26.07.23 15:01, Markus Armbruster wrote: Vladimir Sementsov-Ogievskiy writes: On 26.07.23 14:28, Markus Armbruster wrote: Documentation for member @bin comes out like list of io request counts corresponding to histogram intervals. len("bins") = len("boundaries") + 1 For the example above, "bins" may be something like [3, 1, 5, 2], and corresponding histogram looks like: Note how the equation and the sentence following it run together. Replace the equation: list of io request counts corresponding to histogram intervals, same number of elements as "boundaries". For the example above, not same, but one more. N points break the line into N+1 intervals Thanks for catching this. What about "one more element than @boundaries has"? Sounds good! "bins" may be something like [3, 1, 5, 2], and corresponding histogram looks like: Cc: Vladimir Sementsov-Ogievskiy Signed-off-by: Markus Armbruster Message-ID: <20230720071610.1096458-2-arm...@redhat.com> -- Best regards, Vladimir
[PULL v2 2/5] qapi/block: Tidy up block-latency-histogram-set documentation
Examples come out like Example set new histograms for all io types with intervals [0, 10), [10, 50), [50, 100), [100, +inf): The sentence "set new histograms ..." starts with a lower case letter. Capitalize it. Same for the other examples. Signed-off-by: Markus Armbruster Message-ID: <20230720071610.1096458-3-arm...@redhat.com> Reviewed-by: Philippe Mathieu-Daudé --- qapi/block.json | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/qapi/block.json b/qapi/block.json index 0f25ce3961..535892fddc 100644 --- a/qapi/block.json +++ b/qapi/block.json @@ -547,7 +547,7 @@ # # Example: # -# set new histograms for all io types with intervals [0, 10), [10, +# Set new histograms for all io types with intervals [0, 10), [10, # 50), [50, 100), [100, +inf): # # -> { "execute": "block-latency-histogram-set", @@ -557,7 +557,7 @@ # # Example: # -# set new histogram only for write, other histograms will remain not +# Set new histogram only for write, other histograms will remain not # changed (or not created): # # -> { "execute": "block-latency-histogram-set", @@ -567,7 +567,7 @@ # # Example: # -# set new histograms with the following intervals: read, flush: [0, +# Set new histograms with the following intervals: read, flush: [0, # 10), [10, 50), [50, 100), [100, +inf) write: [0, 1000), [1000, # 5000), [5000, +inf) # @@ -579,7 +579,7 @@ # # Example: # -# remove all latency histograms: +# Remove all latency histograms: # # -> { "execute": "block-latency-histogram-set", # "arguments": { "id": "drive0" } } -- 2.41.0
[PULL v2 1/5] qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation
Documentation for member @bin comes out like list of io request counts corresponding to histogram intervals. len("bins") = len("boundaries") + 1 For the example above, "bins" may be something like [3, 1, 5, 2], and corresponding histogram looks like: Note how the equation and the sentence following it run together. Replace the equation: list of io request counts corresponding to histogram intervals, one more element than "boundaries" has. For the example above, "bins" may be something like [3, 1, 5, 2], and corresponding histogram looks like: Cc: Vladimir Sementsov-Ogievskiy Signed-off-by: Markus Armbruster Message-ID: <20230720071610.1096458-2-arm...@redhat.com> [Off by one fixed] --- qapi/block-core.json | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index 5dd5f7e4b0..dcfd54d15c 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -652,10 +652,9 @@ # 10), [10, 50), [50, 100), [100, +inf). # # @bins: list of io request counts corresponding to histogram -# intervals. -# len(@bins) = len(@boundaries) + 1 -# For the example above, @bins may be something like [3, 1, 5, 2], -# and corresponding histogram looks like: +# intervals, one more element than @boundaries has. For the +# example above, @bins may be something like [3, 1, 5, 2], and +# corresponding histogram looks like: # # :: # -- 2.41.0
Re:Re: [PATCH] Open file as read only on private mapping in qemu_ram_alloc_from_file
At 2023-07-26 16:11:44, "David Hildenbrand" wrote: > >> though the file never gets written. (the actual memory file & guest state >> file require >> separated hacking) >> >> And at least the patch provided here have been the solution to this last >> problem for me >> for a while. >> >> By the way the commit: "Commit 134253a4, machine: do not crash if default >> RAM backend name >> has been stolen" disallows me to use a memory backed file directly as pc.ram >> and make >> `-object memory-backed-file,*` based setup more complex (I cannot easily >> make the memory > >Can't you simply do > >-object memory-backed-file,id=mem1 \ >-machine q35,memory-backend=mem1,share=off \ > >Or what would be the problem with that? I could do this though I have not tested this out. For me the biggest problem is that machines who uses a custom memory backend are likely not compatible with those who uses the default memory backend, and this is not convenient - why they are different types of machines and does not migrate from one to another ... My hack is mostly like using `-machine q35` when making snapshot, and `-machine q35 -mem-path filename4pc.ram` when I like to provide a separated memory file on recovery and `-machine q35` when I do not like to provide a memory file. I prefer flexibility when recovering from snapshot that I can either provide a guest state file containing everything, trading memory initialization cost for convenience and disk usage (memory file is and can only be uncompressed), or, provide a memory file + guest state file that focuses more on memory performance. >> >>> >>> While it doesn't work "in the general case", it works in the "single file >>> owner" case >>> where someone simply forgot to specify "share=on" -- "share=off" is the >>> default for >>> memory-backend-file :( . >>> >>> >>> For example, with hugetlb+virtio-mem the following works if the file does >>> not exists: >>> >>> (note that virtio-mem will fallocate(FALLOC_FL_PUNCH_HOLE) the whole file >>> upfront) >>> >>> ... >>> -object >>> memory-backend-file,share=off,mem-path=/dev/hugepages/vmem0,id=mem2,size=2G >>> \ >>> -device virtio-mem-pci,id=vmem0,memdev=mem2,requested-size=1g,bus=root >>> >>> >>> With you patch, once the file already exists, we would now get >>> >>> qemu-system-x86_64: -device >> virtio-mem-pci,id=vmem0,memdev=mem2,requested-size=1g,bus=root: >> ram_block_discard_range: >> Failed to fallocate :0 +8000 (-9) >>> qemu-system-x86_64: -device >> virtio-mem-pci,id=vmem0,memdev=mem2,requested-size=1g,bus=root: Unexpected >> error >> discarding RAM: Bad file descriptor >>> >>> >>> So this has the potential to break existing setups. >>> >>> The easy fix for these would be to configure "share=on" in these >>> now-failing setups. Hm >> >> I am afraid that the easiest prefix could be to configure `share=on` when >> the path starts >> with "/dev/huge" while firing a warning :D >> >> I am sorry about that if existing systems will be broken because of my patch >> ... >> >> I have learnt that mem-path commonly refer to hugetlb/hugepage, but actually >> I have no >> idea what is the outcome if hugetlb or anything similar was mapped with >> map_private and >> copy-on-write happens - will a whole huge page be copied on write then? >> >> I would suppose that in reality system managers may consider directly remove >> the file >> first if the file will be truncated anyway. However t would be a different >> story if this >> file should be truncated exactly PARTIALLY. >> >> Alternatively maybe another flag "create=on" can be added when private >> semantics are >> required, so that if the file exists, the file should be unlinked or >> truncated first >> before using? >> >> Since I am nowhere familiar to this part of qemu source code, it will be >> hard for me to >> write the additional command line flag part correct, if this is believed to >> be the correct >> solution though. >> >> In summary I am glad to learn more of the backgrounds here. > >The easiest way not break any existing setup would be to open the file >R/O only if opening it R/W failed due to lack of permissions, and we >have a private mapping. So, in case of !RAMP_SHARED, simply retry once >more without write permissions. > >Would that keep your use-case working? > I believe yes. When I want to make sure that memory file is not changed, I only need to make sure that the file is read only on the filesystem level, either by readonly mounting or readonly file modes. Though this is not perfect - it means when I accidentally make the memory file writeable, qemu will open it with readwrite mode. This is a bit scary and counter-intuitive. To think of it, this solution seems most likely to be accepted since it works most of the time, no matter how the other parts of the code changes. It does not even related to whether the memory pages are shared or anything similar. I will give this a shot later. As I have learnt in the messages
[PULL 22/25] qemu_file: Make qemu_file_is_writable() static
It is not used outside of qemu_file, and it shouldn't. Signed-off-by: Juan Quintela Message-ID: <20230530183941.7223-19-quint...@redhat.com> Signed-off-by: Juan Quintela --- migration/qemu-file.h | 1 - migration/qemu-file.c | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/migration/qemu-file.h b/migration/qemu-file.h index aa6eee66da..a081ef6c3f 100644 --- a/migration/qemu-file.h +++ b/migration/qemu-file.h @@ -103,7 +103,6 @@ uint64_t qemu_file_transferred_noflush(QEMUFile *f); void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, size_t size, bool may_free); bool qemu_file_mode_is_not_valid(const char *mode); -bool qemu_file_is_writable(QEMUFile *f); #include "migration/qemu-file-types.h" diff --git a/migration/qemu-file.c b/migration/qemu-file.c index fdf115b5da..9a89e17924 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -228,7 +228,7 @@ void qemu_file_set_error(QEMUFile *f, int ret) qemu_file_set_error_obj(f, ret, NULL); } -bool qemu_file_is_writable(QEMUFile *f) +static bool qemu_file_is_writable(QEMUFile *f) { return f->is_writable; } -- 2.40.1
[PULL 24/25] qemu-file: Make qemu_file_get_error_obj() static
It was not used outside of qemu_file.c anyways. Reviewed-by: Peter Xu Message-ID: <20230530183941.7223-21-quint...@redhat.com> Signed-off-by: Juan Quintela --- migration/qemu-file.h | 1 - migration/qemu-file.c | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/migration/qemu-file.h b/migration/qemu-file.h index a081ef6c3f..8b8b7d27fe 100644 --- a/migration/qemu-file.h +++ b/migration/qemu-file.h @@ -128,7 +128,6 @@ void qemu_file_skip(QEMUFile *f, int size); * accounting information tracks the total migration traffic. */ void qemu_file_credit_transfer(QEMUFile *f, size_t size); -int qemu_file_get_error_obj(QEMUFile *f, Error **errp); int qemu_file_get_error_obj_any(QEMUFile *f1, QEMUFile *f2, Error **errp); void qemu_file_set_error_obj(QEMUFile *f, int ret, Error *err); void qemu_file_set_error(QEMUFile *f, int ret); diff --git a/migration/qemu-file.c b/migration/qemu-file.c index 4c577bdff8..d30bf3c377 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -158,7 +158,7 @@ void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks) * is not 0. * */ -int qemu_file_get_error_obj(QEMUFile *f, Error **errp) +static int qemu_file_get_error_obj(QEMUFile *f, Error **errp) { if (errp) { *errp = f->last_error_obj ? error_copy(f->last_error_obj) : NULL; -- 2.40.1
[PULL 23/25] qemu-file: Simplify qemu_file_shutdown()
Reviewed-by: Peter Xu Message-ID: <20230530183941.7223-20-quint...@redhat.com> Signed-off-by: Juan Quintela --- migration/qemu-file.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/migration/qemu-file.c b/migration/qemu-file.c index 9a89e17924..4c577bdff8 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -65,8 +65,6 @@ struct QEMUFile { */ int qemu_file_shutdown(QEMUFile *f) { -int ret = 0; - /* * We must set qemufile error before the real shutdown(), otherwise * there can be a race window where we thought IO all went though @@ -96,10 +94,10 @@ int qemu_file_shutdown(QEMUFile *f) } if (qio_channel_shutdown(f->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL) < 0) { -ret = -EIO; +return -EIO; } -return ret; +return 0; } bool qemu_file_mode_is_not_valid(const char *mode) -- 2.40.1
Re: [Qemu PATCH v2 5/9] hw/mem/cxl_type3: Add host backend and address space handling for DC regions
On 7/25/23 13:39, Fan Ni wrote: > From: Fan Ni > > Add (file/memory backed) host backend, all the dynamic capacity regions > will share a single, large enough host backend. Set up address space for > DC regions to support read/write operations to dynamic capacity for DCD. > > With the change, following supports are added: > 1. add a new property to type3 device "nonvolatile-dc-memdev" to point to host >memory backend for dynamic capacity; > 2. add namespace for dynamic capacity for read/write support; > 3. create cdat entries for each dynamic capacity region; > 4. fix dvsec range registers to include DC regions. > > Signed-off-by: Fan Ni > --- > hw/cxl/cxl-mailbox-utils.c | 19 +++- > hw/mem/cxl_type3.c | 203 +--- > include/hw/cxl/cxl_device.h | 4 + > 3 files changed, 185 insertions(+), 41 deletions(-) > > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c > index dd5ea95af8..0511b8e6f7 100644 > --- a/hw/cxl/cxl-mailbox-utils.c > +++ b/hw/cxl/cxl-mailbox-utils.c > @@ -388,9 +388,11 @@ static CXLRetCode cmd_firmware_update_get_info(struct > cxl_cmd *cmd, > char fw_rev4[0x10]; > } QEMU_PACKED *fw_info; > QEMU_BUILD_BUG_ON(sizeof(*fw_info) != 0x50); > +CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate); > > if ((cxl_dstate->vmem_size < CXL_CAPACITY_MULTIPLIER) || > -(cxl_dstate->pmem_size < CXL_CAPACITY_MULTIPLIER)) { > +(cxl_dstate->pmem_size < CXL_CAPACITY_MULTIPLIER) || > +(ct3d->dc.total_capacity < CXL_CAPACITY_MULTIPLIER)) { > return CXL_MBOX_INTERNAL_ERROR; > } > > @@ -531,7 +533,8 @@ static CXLRetCode cmd_identify_memory_device(struct > cxl_cmd *cmd, > CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d); > > if ((!QEMU_IS_ALIGNED(cxl_dstate->vmem_size, CXL_CAPACITY_MULTIPLIER)) || > -(!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, CXL_CAPACITY_MULTIPLIER))) { > +(!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, CXL_CAPACITY_MULTIPLIER)) || > +(!QEMU_IS_ALIGNED(ct3d->dc.total_capacity, > CXL_CAPACITY_MULTIPLIER))) { > return CXL_MBOX_INTERNAL_ERROR; > } > > @@ -566,9 +569,11 @@ static CXLRetCode cmd_ccls_get_partition_info(struct > cxl_cmd *cmd, > uint64_t next_pmem; > } QEMU_PACKED *part_info = (void *)cmd->payload; > QEMU_BUILD_BUG_ON(sizeof(*part_info) != 0x20); > +CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate); > > if ((!QEMU_IS_ALIGNED(cxl_dstate->vmem_size, CXL_CAPACITY_MULTIPLIER)) || > -(!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, CXL_CAPACITY_MULTIPLIER))) { > +(!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, CXL_CAPACITY_MULTIPLIER)) || > +(!QEMU_IS_ALIGNED(ct3d->dc.total_capacity, > CXL_CAPACITY_MULTIPLIER))) { > return CXL_MBOX_INTERNAL_ERROR; > } > > @@ -880,7 +885,13 @@ static CXLRetCode cmd_media_clear_poison(struct cxl_cmd > *cmd, > struct clear_poison_pl *in = (void *)cmd->payload; > > dpa = ldq_le_p(&in->dpa); > -if (dpa + CXL_CACHE_LINE_SIZE > cxl_dstate->static_mem_size) { > +if (dpa + CXL_CACHE_LINE_SIZE >= cxl_dstate->static_mem_size > +&& ct3d->dc.num_regions == 0) { > +return CXL_MBOX_INVALID_PA; > +} > + > +if (ct3d->dc.num_regions && dpa + CXL_CACHE_LINE_SIZE >= > +cxl_dstate->static_mem_size + ct3d->dc.total_capacity) { > return CXL_MBOX_INVALID_PA; > } > > diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c > index b29bb2309a..76bbd9f785 100644 > --- a/hw/mem/cxl_type3.c > +++ b/hw/mem/cxl_type3.c > @@ -20,6 +20,7 @@ > #include "hw/pci/spdm.h" > > #define DWORD_BYTE 4 > +#define CXL_CAPACITY_MULTIPLIER (256 * MiB) > > /* Default CDAT entries for a memory region */ > enum { > @@ -33,8 +34,8 @@ enum { > }; > > static int ct3_build_cdat_entries_for_mr(CDATSubHeader **cdat_table, > - int dsmad_handle, MemoryRegion *mr, > - bool is_pmem, uint64_t dpa_base) > +int dsmad_handle, uint8_t flags, > +uint64_t dpa_base, uint64_t size) > { > g_autofree CDATDsmas *dsmas = NULL; > g_autofree CDATDslbis *dslbis0 = NULL; > @@ -53,9 +54,9 @@ static int ct3_build_cdat_entries_for_mr(CDATSubHeader > **cdat_table, > .length = sizeof(*dsmas), > }, > .DSMADhandle = dsmad_handle, > -.flags = is_pmem ? CDAT_DSMAS_FLAG_NV : 0, > +.flags = flags, > .DPA_base = dpa_base, > -.DPA_length = memory_region_size(mr), > +.DPA_length = size, > }; > > /* For now, no memory side cache, plausiblish numbers */ > @@ -137,9 +138,9 @@ static int ct3_build_cdat_entries_for_mr(CDATSubHeader > **cdat_table, > * NV: Reserved - the non volatile from DSMAS matters > * V: EFI_MEMORY_SP > */ > -.EFI_memory_type_attr = is_pmem ? 2 :
Re: [PATCH] docs/devel: Add cross-compiling doc
On 7/26/23 09:07, Andrew Jones wrote: Add instructions for how to cross-compile QEMU for RISC-V. The file is named generically because there's no reason not to collect other architectures steps into the same file, especially because several subsections like those for cross-compiling QEMU dependencies using meson and a cross-file could be shared. Additionally, other approaches to creating sysroots, such as with debootstrap, may be documented in this file in the future. Signed-off-by: Andrew Jones --- Reviewed-by: Daniel Henrique Barboza I've also tested the steps and it works. Not having to compile QEMU inside an emulated risc-v instance will be a dramatic increase in my lifespan. This is the best doc entry of the year for sure. Tested-by: Daniel Henrique Barboza docs/devel/cross-compiling.rst | 221 + 1 file changed, 221 insertions(+) create mode 100644 docs/devel/cross-compiling.rst diff --git a/docs/devel/cross-compiling.rst b/docs/devel/cross-compiling.rst new file mode 100644 index ..1b988ba54e4c --- /dev/null +++ b/docs/devel/cross-compiling.rst @@ -0,0 +1,221 @@ +.. SPDX-License-Identifier: GPL-2.0-or-later + + +Cross-compiling QEMU + + +Cross-compiling QEMU first requires the preparation of a cross-toolchain +and the cross-compiling of QEMU's dependencies. While the steps will be +similar across architectures, each architecture will have its own specific +recommendations. This document collects architecture-specific procedures +and hints that may be used to cross-compile QEMU, where typically the host +environment is x86. + +RISC-V +== + +Toolchain +- + +Select a root directory for the cross environment +^ + +Export an environment variable pointing to a root directory +for the cross environment. For example, :: + + $ export PREFIX="$HOME/opt/riscv" + +Create a work directory +^^^ + +Tools and several components will need to be downloaded and built. Create +a directory for all the work, :: + + $ export WORK_DIR="$HOME/work/xqemu" + $ mkdir -p "$WORK_DIR" + +Select and prepare the toolchain + + +Select a toolchain such as [riscv-toolchain]_ and follow its instructions +for building and installing it to ``$PREFIX``, e.g. :: + + $ cd "$WORK_DIR" + $ git clone https://github.com/riscv/riscv-gnu-toolchain + $ cd riscv-gnu-toolchain + $ ./configure --prefix="$PREFIX" + $ make -j$(nproc) linux + +Set the ``$CROSS_COMPILE`` environment variable to the prefix of the cross +tools and add the tools to ``$PATH``, :: + +$ export CROSS_COMPILE=riscv64-unknown-linux-gnu- +$ export PATH="$PREFIX/bin:$PATH" + +Also set ``$SYSROOT``, where all QEMU cross-compiled dependencies will be +installed. The toolchain installation likely created a 'sysroot' directory +at ``$PREFIX/sysroot``, which is the default location for most cross +tools, making it a good location, :: + + $ mkdir -p "$PREFIX/sysroot" + $ export SYSROOT="$PREFIX/sysroot" + +Create a pkg-config wrapper +^^^ + +The build processes of QEMU and some of its dependencies depend on +pkg-config. Create a wrapper script for it which works for the cross +environment: :: + + $ cat <"$PREFIX/bin/${CROSS_COMPILE}pkg-config" + #!/bin/sh + + [ "\$SYSROOT" ] || exit 1 + + export PKG_CONFIG_PATH= + export PKG_CONFIG_LIBDIR="\${SYSROOT}/usr/lib/pkgconfig:\${SYSROOT}/usr/lib64/pkgconfig:\${SYSROOT}/usr/share/pkgconfig" + + exec pkg-config "\$@" + EOF + $ chmod +x "$PREFIX/bin/${CROSS_COMPILE}pkg-config" + +Create a cross-file for meson builds + + +meson setup, used by some of QEMU's dependencies, needs a "cross-file" to +configure the cross environment. Create one, :: + + $ cd "$WORK_DIR" + $ cat
[PULL 04/25] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
From: Hyman Huang(黄勇) Introduce "x-vcpu-dirty-limit-period" migration experimental parameter, which is in the range of 1 to 1000ms and used to make dirtyrate calculation period configurable. Currently with the "x-vcpu-dirty-limit-period" varies, the total time of live migration changes, test results show the optimal value of "x-vcpu-dirty-limit-period" ranges from 500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made stable once it proves best value can not be determined with developer's experiments. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela Message-Id: <168618975839.6361.1740763387474768865...@git.sr.ht> Signed-off-by: Juan Quintela --- qapi/migration.json| 34 +++--- migration/migration-hmp-cmds.c | 8 migration/options.c| 28 3 files changed, 63 insertions(+), 7 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 2a6565a0f8..d83b694fef 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -789,9 +789,14 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during +# live migration. Should be in the range 1 to 1000ms, +# defaults to 1000ms. (Since 8.1) +# # Features: # -# @unstable: Member @x-checkpoint-delay is experimental. +# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period +#are experimental. # # Since: 2.4 ## @@ -809,8 +814,9 @@ 'multifd-channels', 'xbzrle-cache-size', 'max-postcopy-bandwidth', 'max-cpu-throttle', 'multifd-compression', - 'multifd-zlib-level' ,'multifd-zstd-level', - 'block-bitmap-mapping' ] } + 'multifd-zlib-level', 'multifd-zstd-level', + 'block-bitmap-mapping', + { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] } ] } ## # @MigrateSetParameters: @@ -945,9 +951,14 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during +# live migration. Should be in the range 1 to 1000ms, +# defaults to 1000ms. (Since 8.1) +# # Features: # -# @unstable: Member @x-checkpoint-delay is experimental. +# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period +#are experimental. # # TODO: either fuse back into MigrationParameters, or make # MigrationParameters members mandatory @@ -982,7 +993,9 @@ '*multifd-compression': 'MultiFDCompression', '*multifd-zlib-level': 'uint8', '*multifd-zstd-level': 'uint8', -'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } } +'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ], +'*x-vcpu-dirty-limit-period': { 'type': 'uint64', +'features': [ 'unstable' ] } } } ## # @migrate-set-parameters: @@ -1137,9 +1150,14 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during +# live migration. Should be in the range 1 to 1000ms, +# defaults to 1000ms. (Since 8.1) +# # Features: # -# @unstable: Member @x-checkpoint-delay is experimental. +# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period +#are experimental. # # Since: 2.4 ## @@ -1171,7 +1189,9 @@ '*multifd-compression': 'MultiFDCompression', '*multifd-zlib-level': 'uint8', '*multifd-zstd-level': 'uint8', -'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } } +'*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ], +'*x-vcpu-dirty-limit-period': { 'type': 'uint64', +'features': [ 'unstable' ] } } } ## # @query-migrate-parameters: diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 9885d7c9f7..352e9ec716 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -364,6 +364,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) } } } + +monitor_printf(mon, "%s: %" PRIu64 " ms\n", +MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), +params->x_vcpu_dirty_limit_period); } qapi_free_MigrationParameters(params); @@ -620,6 +624,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict
Re: How to tame CI?
On Wed, 26 Jul 2023 at 13:06, Juan Quintela wrote: > To make things easier, this is the part that show how it breaks (this is > the gcov test): > > 357/423 qemu:block / io-qcow2-copy-before-write > ERROR 6.38s exit status 1 > >>> PYTHON=/builds/juan.quintela/qemu/build/pyvenv/bin/python3 > >>> MALLOC_PERTURB_=44 /builds/juan.quintela/qemu/build/pyvenv/bin/python3 > >>> /builds/juan.quintela/qemu/build/../tests/qemu-iotests/check -tap -qcow2 > >>> copy-before-write --source-dir > >>> /builds/juan.quintela/qemu/tests/qemu-iotests --build-dir > >>> /builds/juan.quintela/qemu/build/tests/qemu-iotests > ― ✀ ― > stderr: > --- /builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write.out > +++ > /builds/juan.quintela/qemu/build/scratch/qcow2-file-copy-before-write/copy-before-write.out.bad > @@ -1,5 +1,21 @@ > - > +...F > +== > +FAIL: test_timeout_break_snapshot (__main__.TestCbwError) > +-- > +Traceback (most recent call last): > + File > "/builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write", line > 210, in test_timeout_break_snapshot > +self.assertEqual(log, """\ > +AssertionError: 'wrot[195 chars]read 1048576/1048576 bytes at offset 0\n1 > MiB,[46 chars]c)\n' != 'wrot[195 chars]read failed: Permission denied\n' > + wrote 524288/524288 bytes at offset 0 > + 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > + wrote 524288/524288 bytes at offset 524288 > + 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > ++ read failed: Permission denied > +- read 1048576/1048576 bytes at offset 0 > +- 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > + This iotest failing is an intermittent that I've seen running pullreqs on master. I tend to see it on the s390 host. I suspect a race condition somewhere where it fails if the host is heavily loaded. -- PMM
[PATCH 09/44] Add GPIO and SD to BCM2838 periph
Signed-off-by: Sergey Kambalin --- hw/arm/bcm2838_peripherals.c | 140 +++ include/hw/arm/bcm2838_peripherals.h | 9 ++ 2 files changed, 149 insertions(+) diff --git a/hw/arm/bcm2838_peripherals.c b/hw/arm/bcm2838_peripherals.c index 864941c231..0c5e716853 100644 --- a/hw/arm/bcm2838_peripherals.c +++ b/hw/arm/bcm2838_peripherals.c @@ -15,22 +15,53 @@ /* Lower peripheral base address on the VC (GPU) system bus */ #define BCM2838_VC_PERI_LOW_BASE 0x7c00 +/* Capabilities for SD controller: no DMA, high-speed, default clocks etc. */ +#define BCM2835_SDHC_CAPAREG 0x52134b4 + static void bcm2838_peripherals_init(Object *obj) { BCM2838PeripheralState *s = BCM2838_PERIPHERALS(obj); BCM2838PeripheralClass *bc = BCM2838_PERIPHERALS_GET_CLASS(obj); +RaspiPeripheralBaseState *s_base = RASPI_PERIPHERALS_BASE(obj); /* Lower memory region for peripheral devices (exported to the Soc) */ memory_region_init(&s->peri_low_mr, obj, "bcm2838-peripherals", bc->peri_low_size); sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->peri_low_mr); +/* Extended Mass Media Controller 2 */ +object_initialize_child(obj, "emmc2", &s->emmc2, TYPE_SYSBUS_SDHCI); + +/* GPIO */ +object_initialize_child(obj, "gpio", &s->gpio, TYPE_BCM2838_GPIO); + +object_property_add_const_link(OBJECT(&s->gpio), "sdbus-sdhci", + OBJECT(&s_base->sdhci.sdbus)); +object_property_add_const_link(OBJECT(&s->gpio), "sdbus-sdhost", + OBJECT(&s_base->sdhost.sdbus)); + +object_initialize_child(obj, "mmc_irq_orgate", &s->mmc_irq_orgate, +TYPE_OR_IRQ); +object_property_set_int(OBJECT(&s->mmc_irq_orgate), "num-lines", 2, +&error_abort); + +object_initialize_child(obj, "dma_7_8_irq_orgate", &s->dma_7_8_irq_orgate, +TYPE_OR_IRQ); +object_property_set_int(OBJECT(&s->dma_7_8_irq_orgate), "num-lines", 2, +&error_abort); + +object_initialize_child(obj, "dma_9_10_irq_orgate", &s->dma_9_10_irq_orgate, +TYPE_OR_IRQ); +object_property_set_int(OBJECT(&s->dma_9_10_irq_orgate), "num-lines", 2, +&error_abort); } static void bcm2838_peripherals_realize(DeviceState *dev, Error **errp) { +MemoryRegion *mphi_mr; BCM2838PeripheralState *s = BCM2838_PERIPHERALS(dev); RaspiPeripheralBaseState *s_base = RASPI_PERIPHERALS_BASE(dev); +int n; raspi_peripherals_common_realize(dev, errp); @@ -42,6 +73,115 @@ static void bcm2838_peripherals_realize(DeviceState *dev, Error **errp) BCM2838_VC_PERI_LOW_BASE, &s->peri_low_mr_alias, 1); +/* Extended Mass Media Controller 2 */ +object_property_set_uint(OBJECT(&s->emmc2), "sd-spec-version", 3, + &error_abort); +object_property_set_uint(OBJECT(&s->emmc2), "capareg", + BCM2835_SDHC_CAPAREG, &error_abort); +object_property_set_bool(OBJECT(&s->emmc2), "pending-insert-quirk", true, + &error_abort); +if (!sysbus_realize(SYS_BUS_DEVICE(&s->emmc2), errp)) { +return; +} + +memory_region_add_subregion( +&s_base->peri_mr, EMMC2_OFFSET, +sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->emmc2), 0)); + +/* According to DTS, EMMC and EMMC2 share one irq */ +if (!qdev_realize(DEVICE(&s->mmc_irq_orgate), NULL, errp)) { +return; +} + +DeviceState *mmc_irq_orgate = DEVICE(&s->mmc_irq_orgate); +sysbus_connect_irq(SYS_BUS_DEVICE(&s->emmc2), 0, +qdev_get_gpio_in(mmc_irq_orgate, 0)); + +sysbus_connect_irq(SYS_BUS_DEVICE(&s_base->sdhci), 0, +qdev_get_gpio_in(mmc_irq_orgate, 1)); + + /* Connect EMMC and EMMC2 to the interrupt controller */ +qdev_connect_gpio_out(mmc_irq_orgate, 0, + qdev_get_gpio_in_named(DEVICE(&s_base->ic), + BCM2835_IC_GPU_IRQ, + INTERRUPT_ARASANSDIO)); + +/* Connect DMA 0-6 to the interrupt controller */ +for (n = 0; n < 7; n++) { +sysbus_connect_irq(SYS_BUS_DEVICE(&s_base->dma), n, + qdev_get_gpio_in_named(DEVICE(&s_base->ic), + BCM2835_IC_GPU_IRQ, + GPU_INTERRUPT_DMA0 + n)); +} + + /* According to DTS, DMA 7 and 8 share one irq */ +if (!qdev_realize(DEVICE(&s->dma_7_8_irq_orgate), NULL, errp)) { +return; +} +DeviceState *dma_7_8_irq_orgate = DEVICE(&s->dma_7_8_irq_orgate); + +/* Connect DMA 7-8 to the interrupt controller */ +sysbus_connect_irq(SYS_BUS_DEVICE(&s_base->dma),
[PATCH 13/44] Add memory region for BCM2837 RPiVid ASB
Signed-off-by: Sergey Kambalin --- hw/arm/bcm2838_peripherals.c | 3 +++ include/hw/arm/bcm2838_peripherals.h | 3 ++- include/hw/arm/raspi_platform.h | 1 + 3 files changed, 6 insertions(+), 1 deletion(-) diff --git a/hw/arm/bcm2838_peripherals.c b/hw/arm/bcm2838_peripherals.c index 0c5e716853..f689e16535 100644 --- a/hw/arm/bcm2838_peripherals.c +++ b/hw/arm/bcm2838_peripherals.c @@ -182,6 +182,9 @@ static void bcm2838_peripherals_realize(DeviceState *dev, Error **errp) sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->gpio), 0)); object_property_add_alias(OBJECT(s), "sd-bus", OBJECT(&s->gpio), "sd-bus"); + +/* BCM2838 RPiVid ASB must be mapped to prevent kernel crash */ +create_unimp(s_base, &s->asb, "bcm2838-asb", RPI4B_ASB_OFFSET, 0x24); } static void bcm2838_peripherals_class_init(ObjectClass *oc, void *data) diff --git a/include/hw/arm/bcm2838_peripherals.h b/include/hw/arm/bcm2838_peripherals.h index aba38a18f0..ebed11dd40 100644 --- a/include/hw/arm/bcm2838_peripherals.h +++ b/include/hw/arm/bcm2838_peripherals.h @@ -66,12 +66,13 @@ struct BCM2838PeripheralState { MemoryRegion mphi_mr_alias; SDHCIState emmc2; -UnimplementedDeviceState clkisp; BCM2838GpioState gpio; OrIRQState mmc_irq_orgate; OrIRQState dma_7_8_irq_orgate; OrIRQState dma_9_10_irq_orgate; + +UnimplementedDeviceState asb; }; struct BCM2838PeripheralClass { diff --git a/include/hw/arm/raspi_platform.h b/include/hw/arm/raspi_platform.h index 30b114f6e0..4a5e8c1cc8 100644 --- a/include/hw/arm/raspi_platform.h +++ b/include/hw/arm/raspi_platform.h @@ -71,6 +71,7 @@ uint64_t board_ram_size(uint32_t board_rev); #define DMA_OFFSET 0x7000 /* DMA controller, channels 0-14 */ #define ARBA_OFFSET 0x9000 #define BRDG_OFFSET 0xa000 +#define RPI4B_ASB_OFFSET0xa000 /* BCM2838 (BCM2711) RPiVid ASB */ #define ARM_OFFSET 0xB000 /* ARM control block */ #define ARMCTRL_OFFSET (ARM_OFFSET + 0x000) #define ARMCTRL_IC_OFFSET (ARM_OFFSET + 0x200) /* Interrupt controller */ -- 2.34.1
[PATCH 04/44] Introduce BCM2838 SoC
Signed-off-by: Sergey Kambalin --- hw/arm/bcm2838.c | 110 +++ hw/arm/bcm2838_peripherals.c | 72 ++ hw/arm/meson.build | 2 + include/hw/arm/bcm2838.h | 26 +++ include/hw/arm/bcm2838_peripherals.h | 36 + 5 files changed, 246 insertions(+) create mode 100644 hw/arm/bcm2838.c create mode 100644 hw/arm/bcm2838_peripherals.c create mode 100644 include/hw/arm/bcm2838.h create mode 100644 include/hw/arm/bcm2838_peripherals.h diff --git a/hw/arm/bcm2838.c b/hw/arm/bcm2838.c new file mode 100644 index 00..dd650c8148 --- /dev/null +++ b/hw/arm/bcm2838.c @@ -0,0 +1,110 @@ +/* + * BCM2838 SoC emulation + * + * Copyright (C) 2022 Ovchinnikov Vitalii + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "qemu/module.h" +#include "hw/arm/raspi_platform.h" +#include "hw/sysbus.h" +#include "hw/arm/bcm2838.h" +#include "trace.h" + +struct BCM2838Class { +/*< private >*/ +BCM283XBaseClass parent_class; +/*< public >*/ +hwaddr peri_low_base; /* Lower peripheral base address seen by the CPU */ +hwaddr gic_base; /* GIC base address inside ARM local peripherals region */ +}; + +#define VIRTUAL_PMU_IRQ 7 + +static void bcm2838_init(Object *obj) +{ +BCM2838State *s = BCM2838(obj); + +object_initialize_child(obj, "peripherals", &s->peripherals, +TYPE_BCM2838_PERIPHERALS); +object_property_add_alias(obj, "board-rev", OBJECT(&s->peripherals), + "board-rev"); +object_property_add_alias(obj, "vcram-size", OBJECT(&s->peripherals), + "vcram-size"); +object_property_add_alias(obj, "command-line", OBJECT(&s->peripherals), + "command-line"); +} + +static void bcm2838_realize(DeviceState *dev, Error **errp) +{ +int n; +BCM2838State *s = BCM2838(dev); +BCM283XBaseState *s_base = BCM283X_BASE(dev); +BCM2838Class *bc = BCM2838_GET_CLASS(dev); +BCM283XBaseClass *bc_base = BCM283X_BASE_GET_CLASS(dev); +BCM2838PeripheralState *ps = BCM2838_PERIPHERALS(&s->peripherals); +RaspiPeripheralBaseState *ps_base = RASPI_PERIPHERALS_BASE(&s->peripherals); + +if (!bcm283x_common_realize(dev, ps_base, errp)) { +return; +} +sysbus_mmio_map_overlap(SYS_BUS_DEVICE(ps), 1, bc->peri_low_base, 1); + +/* bcm2836 interrupt controller (and mailboxes, etc.) */ +if (!sysbus_realize(SYS_BUS_DEVICE(&s_base->control), errp)) { +return; +} +sysbus_mmio_map(SYS_BUS_DEVICE(&s_base->control), 0, bc_base->ctrl_base); + +/* Create cores */ +for (n = 0; n < bc_base->core_count; n++) { +/* TODO: this should be converted to a property of ARM_CPU */ +s_base->cpu[n].core.mp_affinity = (bc_base->clusterid << 8) | n; + +/* start powered off if not enabled */ +if (!object_property_set_bool(OBJECT(&s_base->cpu[n].core), + "start-powered-off", + n >= s_base->enabled_cpus, + errp)) { +return; +} + +if (!qdev_realize(DEVICE(&s_base->cpu[n].core), NULL, errp)) { +return; +} +} +} + +static void bcm2838_class_init(ObjectClass *oc, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(oc); +BCM2838Class *bc = BCM2838_CLASS(oc); +BCM283XBaseClass *bc_base = BCM283X_BASE_CLASS(oc); + +bc_base->cpu_type = ARM_CPU_TYPE_NAME("cortex-a72"); +bc_base->core_count = BCM283X_NCPUS; +bc_base->peri_base = 0xfe00; +bc_base->ctrl_base = 0xff80; +bc_base->clusterid = 0x0; +bc->peri_low_base = 0xfc00; +dc->realize = bcm2838_realize; +} + +static const TypeInfo bcm2838_type = { +.name = TYPE_BCM2838, +.parent = TYPE_BCM283X_BASE, +.instance_size = sizeof(BCM2838State), +.instance_init = bcm2838_init, +.class_size = sizeof(BCM2838Class), +.class_init = bcm2838_class_init, +}; + +static void bcm2838_register_types(void) +{ +type_register_static(&bcm2838_type); +} + +type_init(bcm2838_register_types); diff --git a/hw/arm/bcm2838_peripherals.c b/hw/arm/bcm2838_peripherals.c new file mode 100644 index 00..864941c231 --- /dev/null +++ b/hw/arm/bcm2838_peripherals.c @@ -0,0 +1,72 @@ +/* + * BCM2838 peripherals emulation + * + * Copyright (C) 2022 Ovchinnikov Vitalii + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "qemu/module.h" +#include "hw/arm/raspi_platform.h" +#include "hw/arm/bcm2838_peripherals.h" + +/* Lower peripheral base address on the VC (GPU) system bus */ +#define BCM2838_VC_PERI_LOW_BASE 0x7c00 + +static void bcm2838_peripherals_init(Object *obj) +{ +BCM2838PeripheralState *s = BCM2838_PERIPHERALS(obj
[PATCH 41/44] Add mailbox property tests. Part 2
Signed-off-by: Sergey Kambalin --- tests/qtest/bcm2838-mbox-property-test.c | 196 ++- 1 file changed, 195 insertions(+), 1 deletion(-) diff --git a/tests/qtest/bcm2838-mbox-property-test.c b/tests/qtest/bcm2838-mbox-property-test.c index ac173ed3ff..bcee9971c6 100644 --- a/tests/qtest/bcm2838-mbox-property-test.c +++ b/tests/qtest/bcm2838-mbox-property-test.c @@ -36,7 +36,8 @@ (TEST_TAG_TYPE(testname) * tag); \ static void CHECK_FN_NAME(testname, __VA_ARGS__) \ (TEST_TAG_TYPE(testname) *tag); \ -static void TEST_FN_NAME(testname, __VA_ARGS__)(void) { \ +static void TEST_FN_NAME(testname, __VA_ARGS__)(void) \ +{ \ struct { \ MboxBufHeader header; \ TEST_TAG_TYPE(testname) tag; \ @@ -161,6 +162,179 @@ DECLARE_TEST_CASE_SETUP(GET_MIN_CLOCK_RATE, EMMC) { } /**/ +DECLARE_TEST_CASE(GET_CLOCK_RATE, UART) { +g_assert_cmphex(tag->response.value.clock_id, ==, CLOCK_ID_UART); +g_assert_cmphex(tag->response.value.rate, ==, CLOCK_RATE_UART); +} +DECLARE_TEST_CASE_SETUP(GET_CLOCK_RATE, UART) { +tag->request.value.clock_id = CLOCK_ID_UART; +} + +/**/ +DECLARE_TEST_CASE(GET_MAX_CLOCK_RATE, UART) { +g_assert_cmphex(tag->response.value.clock_id, ==, CLOCK_ID_UART); +g_assert_cmphex(tag->response.value.rate, ==, CLOCK_RATE_UART); +} +DECLARE_TEST_CASE_SETUP(GET_MAX_CLOCK_RATE, UART) { +tag->request.value.clock_id = CLOCK_ID_UART; +} + +/**/ +DECLARE_TEST_CASE(GET_MIN_CLOCK_RATE, UART) { +g_assert_cmphex(tag->response.value.clock_id, ==, CLOCK_ID_UART); +g_assert_cmphex(tag->response.value.rate, ==, CLOCK_RATE_UART); +} +DECLARE_TEST_CASE_SETUP(GET_MIN_CLOCK_RATE, UART) { +tag->request.value.clock_id = CLOCK_ID_UART; +} + +/**/ +DECLARE_TEST_CASE(GET_CLOCK_RATE, CORE) { +g_assert_cmphex(tag->response.value.clock_id, ==, CLOCK_ID_CORE); +g_assert_cmphex(tag->response.value.rate, ==, CLOCK_RATE_CORE); +} +DECLARE_TEST_CASE_SETUP(GET_CLOCK_RATE, CORE) { +tag->request.value.clock_id = CLOCK_ID_CORE; +} + +/**/ +DECLARE_TEST_CASE(GET_MAX_CLOCK_RATE, CORE) { +g_assert_cmphex(tag->response.value.clock_id, ==, CLOCK_ID_CORE); +g_assert_cmphex(tag->response.value.rate, ==, CLOCK_RATE_CORE); +} +DECLARE_TEST_CASE_SETUP(GET_MAX_CLOCK_RATE, CORE) { +tag->request.value.clock_id = CLOCK_ID_CORE; +} + +/**/ +DECLARE_TEST_CASE(GET_MIN_CLOCK_RATE, CORE) { +g_assert_cmphex(tag->response.value.clock_id, ==, CLOCK_ID_CORE); +g_assert_cmphex(tag->response.value.rate, ==, CLOCK_RATE_CORE); +} +DECLARE_TEST_CASE_SETUP(GET_MIN_CLOCK_RATE, CORE) { +tag->request.value.clock_id = CLOCK_ID_CORE; +} + +/**/ +DECLARE_TEST_CASE(GET_CLOCK_RATE, ANY) { +g_assert_cmphex(tag->response.value.clock_id, ==, CLOCK_ID_UNDEFINED); +g_assert_cmphex(tag->response.value.rate, ==, CLOCK_RATE_ANY); +} +DECLARE_TEST_CASE_SETUP(GET_CLOCK_RATE, ANY) { +tag->request.value.clock_id = CLOCK_ID_UNDEFINED; +} + +/**/ +DECLARE_TEST_CASE(GET_MAX_CLOCK_RATE, ANY) { +g_assert_cmphex(tag->response.value.clock_id, ==, CLOCK_ID_UNDEFINED); +g_assert_cmphex(tag->response.value.rate, ==, CLOCK_RATE_ANY); +} +DECLARE_TEST_CASE_SETUP(GET_MAX_CLOCK_RATE, ANY) { +tag->request.value.clock_id = CLOCK_ID_UNDEFINED; +} + +/**/ +DECLARE_TEST_CASE(GET_MIN_CLOCK_RATE, ANY) { +g_assert_cmphex(tag->response.value.clock_id, ==, CLOCK_ID_UNDEFINED); +g_assert_cmphex(tag->response.value.rate, ==, CLOCK_RATE_ANY); +} +DECLARE_TEST_CASE_SETUP(GET_MIN_CLOCK_RATE, ANY) { +tag->request.value.clock_id = CLOCK_ID_UNDEFINED; +} + +/**/ +DECLARE_TEST_CASE(GET_TEMPERATURE) { +g_assert_cmphex(tag->response.value.temperature_id, ==, TEMPERATURE_ID_SOC); +g_assert_cmpint(tag->response.value.temperature, ==, TEMPERATURE_SOC); +} +DECLARE_TEST_CASE_SETUP(GET_TEMPERATURE) { +tag->request.value.
[PATCH 17/44] Add RNG200 skeleton
Signed-off-by: Sergey Kambalin --- hw/misc/bcm2838_rng200.c | 118 +++ hw/misc/meson.build | 1 + hw/misc/trace-events | 10 +++ include/hw/misc/bcm2838_rng200.h | 77 4 files changed, 206 insertions(+) create mode 100644 hw/misc/bcm2838_rng200.c create mode 100644 include/hw/misc/bcm2838_rng200.h diff --git a/hw/misc/bcm2838_rng200.c b/hw/misc/bcm2838_rng200.c new file mode 100644 index 00..a17e8f2cda --- /dev/null +++ b/hw/misc/bcm2838_rng200.c @@ -0,0 +1,118 @@ +/* + * BCM2838 Random Number Generator emulation + * + * Copyright (C) 2022 Sergey Pushkarev + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "hw/qdev-properties.h" +#include "hw/misc/bcm2838_rng200.h" +#include "trace.h" + +static void bcm2838_rng200_rng_reset(BCM2838Rng200State *state) +{ +state->rng_ctrl.value = 0; + +trace_bcm2838_rng200_rng_soft_reset(); +} + +static uint64_t bcm2838_rng200_read(void *opaque, hwaddr offset, +unsigned size) +{ +uint32_t res = 0; + +trace_bcm2838_rng200_read((void *)offset, size, res); +return res; +} + +static void bcm2838_rng200_write(void *opaque, hwaddr offset, + uint64_t value, unsigned size) +{ + +trace_bcm2838_rng200_write((void *)offset, value, size); +} + +static const MemoryRegionOps bcm2838_rng200_ops = { +.read = bcm2838_rng200_read, +.write = bcm2838_rng200_write, +.endianness = DEVICE_NATIVE_ENDIAN, +}; + +static void bcm2838_rng200_realize(DeviceState *dev, Error **errp) +{ +BCM2838Rng200State *s = BCM2838_RNG200(dev); + +if (s->rng == NULL) { +Object *default_backend = object_new(TYPE_RNG_BUILTIN); + +object_property_add_child(OBJECT(dev), "default-backend", + default_backend); +object_unref(default_backend); + +object_property_set_link(OBJECT(dev), "rng", default_backend, + errp); +} + +sysbus_init_irq(SYS_BUS_DEVICE(dev), &s->irq); +} + +static void bcm2838_rng200_init(Object *obj) +{ +BCM2838Rng200State *s = BCM2838_RNG200(obj); +SysBusDevice *sbd = SYS_BUS_DEVICE(obj); + +s->clock = qdev_init_clock_in(DEVICE(s), "rbg-clock", + NULL, s, + ClockPreUpdate); +if (s->clock == NULL) { +error_setg(&error_fatal, "Failed to init RBG clock"); +return; +} + +memory_region_init_io(&s->iomem, obj, &bcm2838_rng200_ops, s, + TYPE_BCM2838_RNG200, 0x28); +sysbus_init_mmio(sbd, &s->iomem); +} + +static void bcm2838_rng200_reset(DeviceState *dev) +{ +BCM2838Rng200State *s = BCM2838_RNG200(dev); +bcm2838_rng200_rng_reset(s); +} + +static Property bcm2838_rng200_properties[] = { +DEFINE_PROP_UINT32("rbg-period", BCM2838Rng200State, rbg_period, 250), +DEFINE_PROP_UINT32("rng-fifo-cap", BCM2838Rng200State, rng_fifo_cap, 128), +DEFINE_PROP_LINK("rng", BCM2838Rng200State, rng, + TYPE_RNG_BACKEND, RngBackend *), +DEFINE_PROP_BOOL("use-timer", BCM2838Rng200State, use_timer, true), +DEFINE_PROP_END_OF_LIST(), +}; + +static void bcm2838_rng200_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); + +dc->realize = bcm2838_rng200_realize; +dc->reset = bcm2838_rng200_reset; +device_class_set_props(dc, bcm2838_rng200_properties); +} + +static const TypeInfo bcm2838_rng200_info = { +.name = TYPE_BCM2838_RNG200, +.parent= TYPE_SYS_BUS_DEVICE, +.instance_size = sizeof(BCM2838Rng200State), +.class_init= bcm2838_rng200_class_init, +.instance_init = bcm2838_rng200_init, +}; + +static void bcm2838_rng200_register_types(void) +{ +type_register_static(&bcm2838_rng200_info); +} + +type_init(bcm2838_rng200_register_types) diff --git a/hw/misc/meson.build b/hw/misc/meson.build index 892f8b91c5..a6230ced43 100644 --- a/hw/misc/meson.build +++ b/hw/misc/meson.build @@ -88,6 +88,7 @@ system_ss.add(when: 'CONFIG_RASPI', if_true: files( 'bcm2835_thermal.c', 'bcm2835_cprman.c', 'bcm2835_powermgt.c', + 'bcm2838_rng200.c' )) system_ss.add(when: 'CONFIG_SLAVIO', if_true: files('slavio_misc.c')) system_ss.add(when: 'CONFIG_ZYNQ', if_true: files('zynq_slcr.c')) diff --git a/hw/misc/trace-events b/hw/misc/trace-events index 4d1a0e17af..d26cd2d22d 100644 --- a/hw/misc/trace-events +++ b/hw/misc/trace-events @@ -297,3 +297,13 @@ virt_ctrl_instance_init(void *dev) "ctrl: %p" lasi_chip_mem_valid(uint64_t addr, uint32_t val) "access to addr 0x%"PRIx64" is %d" lasi_chip_read(uint64_t addr, uint32_t val) "addr 0x%"PRIx64" val 0x%08x" lasi_chip_write(uint64_t addr, uint32_t val) "addr 0x%"PRIx64" val 0x%0
[PATCH 39/44] Add mailbox tests tags. Part 3
Signed-off-by: Sergey Kambalin --- tests/qtest/bcm2838-mailbox.h | 78 +++ 1 file changed, 78 insertions(+) diff --git a/tests/qtest/bcm2838-mailbox.h b/tests/qtest/bcm2838-mailbox.h index 2231c2c596..d8975b11ea 100644 --- a/tests/qtest/bcm2838-mailbox.h +++ b/tests/qtest/bcm2838-mailbox.h @@ -517,6 +517,84 @@ DECLARE_TAG_TYPE(TAG_GET_DMA_CHANNELS_t, uint32_t mask; }); +DECLARE_TAG_TYPE(TAG_GET_THROTTLED_t, +struct {}, +struct { +uint32_t throttled; +}); + +DECLARE_TAG_TYPE(TAG_GET_NUM_DISPLAYS_t, +struct {}, +struct { +uint32_t num_displays; +}); + +DECLARE_TAG_TYPE(TAG_GET_DISPLAY_SETTINGS_t, +struct {}, +struct { +uint32_t display_num; +uint32_t phys_width; +uint32_t phys_height; +uint32_t bpp; +uint16_t pitch; +uint32_t virt_width; +uint32_t virt_height; +uint16_t virt_width_offset; +uint32_t virt_height_offset; +uint32_t fb_bus_address_lo; +uint32_t fb_bus_address_hi; +}); + +DECLARE_TAG_TYPE(TAG_GET_GPIO_CONFIG_t, +struct { +uint32_t gpio_num; +}, +struct { +uint32_t zero; +uint32_t direction; +uint32_t polarity; +uint32_t term_en; +uint32_t term_pull_up; +}); + + +DECLARE_TAG_TYPE(TAG_SET_GPIO_CONFIG_t, +struct { +uint32_t gpio_num; +uint32_t direction; +uint32_t polarity; +uint32_t term_en; +uint32_t term_pull_up; +uint32_t state; +}, +struct { +uint32_t zero; +}); + +DECLARE_TAG_TYPE(TAG_GET_GPIO_STATE_t, +struct { +uint32_t gpio_num; +}, +struct { +uint32_t zero; +uint32_t state; +}); + +DECLARE_TAG_TYPE(TAG_SET_GPIO_STATE_t, +struct { +uint32_t gpio_num; +uint32_t state; +}, +struct { +uint32_t zero; +}); + +DECLARE_TAG_TYPE(TAG_INITIALIZE_VCHIQ_t, +struct {}, +struct { +uint32_t zero; +}); + int mbox0_has_data(void); void mbox0_read_message(uint8_t channel, void *msgbuf, size_t msgbuf_size); void mbox1_write_message(uint8_t channel, uint32_t msg_addr); -- 2.34.1
[PATCH 23/44] Add GENET register structs. Part 1
Signed-off-by: Sergey Kambalin --- include/hw/net/bcm2838_genet.h | 125 - 1 file changed, 124 insertions(+), 1 deletion(-) diff --git a/include/hw/net/bcm2838_genet.h b/include/hw/net/bcm2838_genet.h index f62b24fa2f..89b45eb39f 100644 --- a/include/hw/net/bcm2838_genet.h +++ b/include/hw/net/bcm2838_genet.h @@ -18,8 +18,131 @@ OBJECT_DECLARE_SIMPLE_TYPE(BCM2838GenetState, BCM2838_GENET) #define BCM2838_GENET_REV_MAJOR 6 #define BCM2838_GENET_REV_MINOR 0 +typedef union { +uint32_t value; +struct { +uint32_t gphy_rev:16; +uint32_t minor_rev:4; +uint32_t reserved_20_23:4; +uint32_t major_rev:4; +uint32_t reserved_28_31:4; +} fields; +} BCM2838GenetSysRevCtrl; + +typedef union { +uint32_t value; +struct { +uint32_t scb:1; +uint32_t ephy:1; +uint32_t phy_det_r:1; +uint32_t phy_det_f:1; +uint32_t link_up:1; +uint32_t link_down:1; +uint32_t umac:1; +uint32_t umac_tsv:1; +uint32_t tbuf_underrun:1; +uint32_t rbuf_overflow:1; +uint32_t hfb_sm:1; +uint32_t hfb_mm:1; +uint32_t mpd_r:1; +uint32_t rxdma_mbdone:1; +uint32_t rxdma_pdone:1; +uint32_t rxdma_bdone:1; +uint32_t txdma_mbdone:1; +uint32_t txdma_pdone:1; +uint32_t txdma_bdone:1; +uint32_t reserved_19_22:4; +uint32_t mdio_done:1; +uint32_t mdio_error:1; +uint32_t reserved_25_31:7; +} fields; +} BCM2838GenetIntrl0; + +typedef union { +uint32_t value; +struct { +uint32_t tx_intrs:16; +uint32_t rx_intrs:16; +} fields; +} BCM2838GenetIntrl1; + +typedef struct { +BCM2838GenetSysRevCtrl rev_ctrl; +uint32_t port_ctrl; +uint32_t rbuf_flush_ctrl; +uint32_t tbuf_flush_ctrl; +uint8_t reserved_0x10[0x30]; +} __attribute__((__packed__)) BCM2838GenetRegsSys; + +typedef struct { +uint8_t reserved_0x0[0x40]; +} __attribute__((__packed__)) BCM2838GenetRegsGrBridge; + +typedef struct { +uint32_t pwr_mgmt; +uint8_t reserved_0x4[0x8]; +uint32_t rgmii_oob_ctrl; +uint8_t reserved_0x10[0xC]; +uint32_t gphy_ctrl; +uint8_t reserved_0x20[0x60]; +} __attribute__((__packed__)) BCM2838GenetRegsExt; + +typedef struct { +BCM2838GenetIntrl0 stat; +BCM2838GenetIntrl0 set; +BCM2838GenetIntrl0 clear; +BCM2838GenetIntrl0 mask_status; +BCM2838GenetIntrl0 mask_set; +BCM2838GenetIntrl0 mask_clear; +uint8_t reserved_0x18[0x28]; +} __attribute__((__packed__)) BCM2838GenetRegsIntrl0; + +typedef struct { +BCM2838GenetIntrl1 stat; +BCM2838GenetIntrl1 set; +BCM2838GenetIntrl1 clear; +BCM2838GenetIntrl1 mask_status; +BCM2838GenetIntrl1 mask_set; +BCM2838GenetIntrl1 mask_clear; +uint8_t reserved_0x18[0x28]; +} __attribute__((__packed__)) BCM2838GenetRegsIntrl1; + +typedef struct { +uint32_t ctrl; +uint8_t reserved_0x4[0x8]; +uint32_t status; +uint8_t reserved_0x10[0x4]; +uint32_t chk_ctrl; +uint8_t reserved_0x18[0x7C]; +uint32_t ovfl_cnt; +uint32_t err_cnt; +uint32_t energy_ctrl; +uint8_t reserved_0xA0[0x14]; +uint32_t size_ctrl; +uint8_t reserved_0xB8[0x48]; +} __attribute__((__packed__)) BCM2838GenetRegsRbuf; + +typedef struct { +uint32_t ctrl; +uint8_t reserved_0x4[0x8]; +uint32_t bp_mc; +uint8_t reserved_0x10[0x4]; +uint32_t energy_ctrl; +uint8_t reserved_0x18[0xE8]; +} __attribute__((__packed__)) BCM2838GenetRegsTbuf; + typedef struct { -uint8_t stub_area[0x1]; /* temporary stub */ +BCM2838GenetRegsSys sys; +BCM2838GenetRegsGrBridge gr_bridge; +BCM2838GenetRegsExt ext; +uint8_t reserved_0x100[0x100]; +BCM2838GenetRegsIntrl0 intrl0; +BCM2838GenetRegsIntrl1 intrl1; +uint8_t reserved_0x280[0x80]; +BCM2838GenetRegsRbuf rbuf; +uint8_t reserved_0x400[0x200]; +BCM2838GenetRegsTbuf tbuf; +uint8_t reserved_0x700[0x100]; } __attribute__((__packed__)) BCM2838GenetRegs; struct BCM2838GenetState { -- 2.34.1
[PATCH 02/44] Split out common part of peripherals
Signed-off-by: Sergey Kambalin --- hw/arm/bcm2835_peripherals.c | 198 +++ hw/arm/bcm2836.c | 24 ++-- include/hw/arm/bcm2835_peripherals.h | 29 +++- include/hw/arm/bcm2836.h | 3 +- 4 files changed, 154 insertions(+), 100 deletions(-) diff --git a/hw/arm/bcm2835_peripherals.c b/hw/arm/bcm2835_peripherals.c index 0233038b95..4c0c0b1e7d 100644 --- a/hw/arm/bcm2835_peripherals.c +++ b/hw/arm/bcm2835_peripherals.c @@ -30,9 +30,9 @@ #define SEPARATE_DMA_IRQ_MAX 10 #define ORGATED_DMA_IRQ_COUNT 4 -static void create_unimp(BCM2835PeripheralState *ps, - UnimplementedDeviceState *uds, - const char *name, hwaddr ofs, hwaddr size) +void create_unimp(RaspiPeripheralBaseState *ps, + UnimplementedDeviceState *uds, + const char *name, hwaddr ofs, hwaddr size) { object_initialize_child(OBJECT(ps), name, uds, TYPE_UNIMPLEMENTED_DEVICE); qdev_prop_set_string(DEVICE(uds), "name", name); @@ -45,9 +45,36 @@ static void create_unimp(BCM2835PeripheralState *ps, static void bcm2835_peripherals_init(Object *obj) { BCM2835PeripheralState *s = BCM2835_PERIPHERALS(obj); +RaspiPeripheralBaseState *s_base = RASPI_PERIPHERALS_BASE(obj); + +/* Random Number Generator */ +object_initialize_child(obj, "rng", &s->rng, TYPE_BCM2835_RNG); + +/* Thermal */ +object_initialize_child(obj, "thermal", &s->thermal, TYPE_BCM2835_THERMAL); + +/* GPIO */ +object_initialize_child(obj, "gpio", &s->gpio, TYPE_BCM2835_GPIO); + +object_property_add_const_link(OBJECT(&s->gpio), "sdbus-sdhci", + OBJECT(&s_base->sdhci.sdbus)); +object_property_add_const_link(OBJECT(&s->gpio), "sdbus-sdhost", + OBJECT(&s_base->sdhost.sdbus)); + +/* Gated DMA interrupts */ +object_initialize_child(obj, "orgated-dma-irq", +&s_base->orgated_dma_irq, TYPE_OR_IRQ); +object_property_set_int(OBJECT(&s_base->orgated_dma_irq), "num-lines", +ORGATED_DMA_IRQ_COUNT, &error_abort); +} + +static void raspi_peripherals_base_init(Object *obj) +{ +RaspiPeripheralBaseState *s = RASPI_PERIPHERALS_BASE(obj); +RaspiPeripheralBaseClass *bc = RASPI_PERIPHERALS_BASE_GET_CLASS(obj); /* Memory region for peripheral devices, which we export to our parent */ -memory_region_init(&s->peri_mr, obj,"bcm2835-peripherals", 0x100); +memory_region_init(&s->peri_mr, obj, "bcm2835-peripherals", bc->peri_size); sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->peri_mr); /* Internal memory region for peripheral bus addresses (not exported) */ @@ -98,9 +125,6 @@ static void bcm2835_peripherals_init(Object *obj) object_property_add_const_link(OBJECT(&s->property), "dma-mr", OBJECT(&s->gpu_bus_mr)); -/* Random Number Generator */ -object_initialize_child(obj, "rng", &s->rng, TYPE_BCM2835_RNG); - /* Extended Mass Media Controller */ object_initialize_child(obj, "sdhci", &s->sdhci, TYPE_SYSBUS_SDHCI); @@ -110,25 +134,9 @@ static void bcm2835_peripherals_init(Object *obj) /* DMA Channels */ object_initialize_child(obj, "dma", &s->dma, TYPE_BCM2835_DMA); -object_initialize_child(obj, "orgated-dma-irq", -&s->orgated_dma_irq, TYPE_OR_IRQ); -object_property_set_int(OBJECT(&s->orgated_dma_irq), "num-lines", -ORGATED_DMA_IRQ_COUNT, &error_abort); - object_property_add_const_link(OBJECT(&s->dma), "dma-mr", OBJECT(&s->gpu_bus_mr)); -/* Thermal */ -object_initialize_child(obj, "thermal", &s->thermal, TYPE_BCM2835_THERMAL); - -/* GPIO */ -object_initialize_child(obj, "gpio", &s->gpio, TYPE_BCM2835_GPIO); - -object_property_add_const_link(OBJECT(&s->gpio), "sdbus-sdhci", - OBJECT(&s->sdhci.sdbus)); -object_property_add_const_link(OBJECT(&s->gpio), "sdbus-sdhost", - OBJECT(&s->sdhost.sdbus)); - /* Mphi */ object_initialize_child(obj, "mphi", &s->mphi, TYPE_BCM2835_MPHI); @@ -148,7 +156,72 @@ static void bcm2835_peripherals_init(Object *obj) static void bcm2835_peripherals_realize(DeviceState *dev, Error **errp) { +MemoryRegion *mphi_mr; BCM2835PeripheralState *s = BCM2835_PERIPHERALS(dev); +RaspiPeripheralBaseState *s_base = RASPI_PERIPHERALS_BASE(dev); +int n; + +raspi_peripherals_common_realize(dev, errp); + +/* Extended Mass Media Controller */ +sysbus_connect_irq(SYS_BUS_DEVICE(&s_base->sdhci), 0, +qdev_get_gpio_in_named(DEVICE(&s_base->ic), BCM2835_IC_GPU_IRQ, + INTERRUPT_ARASANSDIO)); + + /* Connect DMA 0-12 to the interrupt controller */ +for (n = 0; n <= SEPARATE_DMA_IRQ_MAX; n++) { +sysbus
[PATCH 10/44] Add BCM2838 checkpoint support
Signed-off-by: Sergey Kambalin --- hw/gpio/bcm2838_gpio.c | 17 + 1 file changed, 17 insertions(+) diff --git a/hw/gpio/bcm2838_gpio.c b/hw/gpio/bcm2838_gpio.c index 7291e473dc..f1121f9c58 100644 --- a/hw/gpio/bcm2838_gpio.c +++ b/hw/gpio/bcm2838_gpio.c @@ -17,6 +17,7 @@ #include "qemu/timer.h" #include "qapi/error.h" #include "hw/sysbus.h" +#include "migration/vmstate.h" #include "hw/sd/sd.h" #include "hw/gpio/bcm2838_gpio.h" #include "hw/irq.h" @@ -324,6 +325,21 @@ static const MemoryRegionOps bcm2838_gpio_ops = { .endianness = DEVICE_NATIVE_ENDIAN, }; +static const VMStateDescription vmstate_bcm2838_gpio = { +.name = "bcm2838_gpio", +.version_id = 1, +.minimum_version_id = 1, +.fields = (VMStateField[]) { +VMSTATE_UINT8_ARRAY(fsel, BCM2838GpioState, BCM2838_GPIO_NUM), +VMSTATE_UINT32(lev0, BCM2838GpioState), +VMSTATE_UINT32(lev1, BCM2838GpioState), +VMSTATE_UINT8(sd_fsel, BCM2838GpioState), +VMSTATE_UINT32_ARRAY(pup_cntrl_reg, BCM2838GpioState, + GPIO_PUP_PDN_CNTRL_NUM), +VMSTATE_END_OF_LIST() +} +}; + static void bcm2838_gpio_init(Object *obj) { BCM2838GpioState *s = BCM2838_GPIO(obj); @@ -355,6 +371,7 @@ static void bcm2838_gpio_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); +dc->vmsd = &vmstate_bcm2838_gpio; dc->realize = &bcm2838_gpio_realize; dc->reset = &bcm2838_gpio_reset; } -- 2.34.1
[PATCH 03/44] Split out raspi machine common part
Signed-off-by: Sergey Kambalin --- hw/arm/raspi.c | 112 ++-- include/hw/arm/raspi_platform.h | 21 ++ 2 files changed, 85 insertions(+), 48 deletions(-) diff --git a/hw/arm/raspi.c b/hw/arm/raspi.c index af866ebce2..7d04734cd2 100644 --- a/hw/arm/raspi.c +++ b/hw/arm/raspi.c @@ -18,6 +18,7 @@ #include "qapi/error.h" #include "hw/arm/boot.h" #include "hw/arm/bcm2836.h" +#include "hw/arm/raspi_platform.h" #include "hw/registerfields.h" #include "qemu/error-report.h" #include "hw/boards.h" @@ -25,6 +26,9 @@ #include "hw/arm/boot.h" #include "qom/object.h" +#define TYPE_RASPI_MACHINE MACHINE_TYPE_NAME("raspi-common") +OBJECT_DECLARE_SIMPLE_TYPE(RaspiMachineState, RASPI_MACHINE) + #define SMPBOOT_ADDR0x300 /* this should leave enough space for ATAGS */ #define MVBAR_ADDR 0x400 /* secure vectors */ #define BOARDSETUP_ADDR (MVBAR_ADDR + 0x20) /* board setup code */ @@ -37,25 +41,10 @@ struct RaspiMachineState { /*< private >*/ -MachineState parent_obj; +RaspiBaseMachineState parent_obj; /*< public >*/ BCM283XState soc; -struct arm_boot_info binfo; -}; -typedef struct RaspiMachineState RaspiMachineState; - -struct RaspiMachineClass { -/*< private >*/ -MachineClass parent_obj; -/*< public >*/ -uint32_t board_rev; }; -typedef struct RaspiMachineClass RaspiMachineClass; - -#define TYPE_RASPI_MACHINE MACHINE_TYPE_NAME("raspi-common") -DECLARE_OBJ_CHECKERS(RaspiMachineState, RaspiMachineClass, - RASPI_MACHINE, TYPE_RASPI_MACHINE) - /* * Board revision codes: @@ -83,6 +72,11 @@ static const struct { [PROCESSOR_ID_BCM2837] = {TYPE_BCM2837, BCM283X_NCPUS}, }; +static void raspi_base_machine_init(MachineState *machine, + BCM283XBaseState *soc); +static void raspi_machine_class_common_init(MachineClass *mc, + uint32_t board_rev); + static uint64_t board_ram_size(uint32_t board_rev) { assert(FIELD_EX32(board_rev, REV_CODE, STYLE)); /* Only new style */ @@ -200,13 +194,12 @@ static void reset_secondary(ARMCPU *cpu, const struct arm_boot_info *info) cpu_set_pc(cs, info->smp_loader_start); } -static void setup_boot(MachineState *machine, RaspiProcessorId processor_id, - size_t ram_size) +static void setup_boot(MachineState *machine, ARMCPU *cpu, + RaspiProcessorId processor_id, size_t ram_size) { -RaspiMachineState *s = RASPI_MACHINE(machine); +RaspiBaseMachineState *s = RASPI_BASE_MACHINE(machine); int r; -s->binfo.board_id = MACH_TYPE_BCM2708; s->binfo.ram_size = ram_size; if (processor_id <= PROCESSOR_ID_BCM2836) { @@ -252,13 +245,13 @@ static void setup_boot(MachineState *machine, RaspiProcessorId processor_id, s->binfo.firmware_loaded = true; } -arm_load_kernel(&s->soc.parent_obj.cpu[0].core, machine, &s->binfo); +arm_load_kernel(cpu, machine, &s->binfo); } -static void raspi_machine_init(MachineState *machine) +static void raspi_base_machine_init(MachineState *machine, + BCM283XBaseState *soc) { -RaspiMachineClass *mc = RASPI_MACHINE_GET_CLASS(machine); -RaspiMachineState *s = RASPI_MACHINE(machine); +RaspiBaseMachineClass *mc = RASPI_BASE_MACHINE_GET_CLASS(machine); uint32_t board_rev = mc->board_rev; uint64_t ram_size = board_ram_size(board_rev); uint32_t vcram_size; @@ -279,19 +272,17 @@ static void raspi_machine_init(MachineState *machine) machine->ram, 0); /* Setup the SOC */ -object_initialize_child(OBJECT(machine), "soc", &s->soc, -board_soc_type(board_rev)); -object_property_add_const_link(OBJECT(&s->soc), "ram", OBJECT(machine->ram)); -object_property_set_int(OBJECT(&s->soc), "board-rev", board_rev, +object_property_add_const_link(OBJECT(soc), "ram", OBJECT(machine->ram)); +object_property_set_int(OBJECT(soc), "board-rev", board_rev, &error_abort); -object_property_set_str(OBJECT(&s->soc), "command-line", +object_property_set_str(OBJECT(soc), "command-line", machine->kernel_cmdline, &error_abort); -qdev_realize(DEVICE(&s->soc), NULL, &error_fatal); +qdev_realize(DEVICE(soc), NULL, &error_fatal); /* Create and plug in the SD cards */ di = drive_get(IF_SD, 0, 0); blk = di ? blk_by_legacy_dinfo(di) : NULL; -bus = qdev_get_child_bus(DEVICE(&s->soc), "sd-bus"); +bus = qdev_get_child_bus(DEVICE(soc), "sd-bus"); if (bus == NULL) { error_report("No SD bus found in SOC object"); exit(1); @@ -300,19 +291,32 @@ static void raspi_machine_init(MachineState *machine) qdev_prop_set_drive_err(carddev, "drive", blk, &error_fatal); qdev_realize_and_unref(carddev, bus, &error_fatal); -vcram_size = object_p
[PATCH 35/44] Add mailbox test stub
Signed-off-by: Sergey Kambalin --- tests/qtest/bcm2838-mailbox.c | 70 +++ tests/qtest/bcm2838-mailbox.h | 48 tests/qtest/meson.build | 1 + 3 files changed, 119 insertions(+) create mode 100644 tests/qtest/bcm2838-mailbox.c create mode 100644 tests/qtest/bcm2838-mailbox.h diff --git a/tests/qtest/bcm2838-mailbox.c b/tests/qtest/bcm2838-mailbox.c new file mode 100644 index 00..211d167ff8 --- /dev/null +++ b/tests/qtest/bcm2838-mailbox.c @@ -0,0 +1,70 @@ +/* + * Helper functions to work with BCM2838 mailbox via qtest interface. + * + * Copyright (c) 2023 Auriga LLC + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "libqtest-single.h" +#include "bcm2838-mailbox.h" + + +static uint32_t qtest_mbox0_read_reg32(QTestState *s, uint32_t offset) +{ +return qtest_readl(s, MBOX0_BASE + offset); +} + +static void qtest_mbox1_write_reg32(QTestState *s, uint32_t offset, uint32_t value) +{ +return qtest_writel(s, MBOX1_BASE + offset, value); +} + +static void qtest_mbox1_write(QTestState *s, uint8_t channel, uint32_t data) +{ +MboxRegWrite reg; + +reg.fields.channel = channel; +reg.fields.data = data; +qtest_mbox1_write_reg32(s, MBOX_REG_WRITE, reg.value); +} + +int qtest_mbox0_has_data(QTestState *s) { +return !(qtest_mbox0_read_reg32(s, MBOX_REG_STATUS) & MBOX_READ_EMPTY); +} + +int mbox0_has_data(void) { +return qtest_mbox0_has_data(global_qtest); +} + +void qtest_mbox0_read_message(QTestState *s, + uint8_t channel, + void *msgbuf, + size_t msgbuf_size) +{ +MboxRegRead reg; +uint32_t msgaddr; + +g_assert(qtest_mbox0_has_data(s)); +reg.value = qtest_mbox0_read_reg32(s, MBOX_REG_READ); +g_assert_cmphex(reg.fields.channel, ==, channel); +msgaddr = reg.fields.data << 4; +qtest_memread(s, msgaddr, msgbuf, msgbuf_size); +} + +void mbox0_read_message(uint8_t channel, void *msgbuf, size_t msgbuf_size) { +qtest_mbox0_read_message(global_qtest, channel, msgbuf, msgbuf_size); +} + +void qtest_mbox1_write_message(QTestState *s, uint8_t channel, uint32_t msg_addr) +{ +qtest_mbox1_write(s, channel, msg_addr >> 4); +} + + +void mbox1_write_message(uint8_t channel, uint32_t msg_addr) +{ +qtest_mbox1_write_message(global_qtest, channel, msg_addr); +} diff --git a/tests/qtest/bcm2838-mailbox.h b/tests/qtest/bcm2838-mailbox.h new file mode 100644 index 00..a81b325af9 --- /dev/null +++ b/tests/qtest/bcm2838-mailbox.h @@ -0,0 +1,48 @@ +/* + * Declarations for BCM2838 mailbox test. + * + * Copyright (c) 2023 Auriga LLC + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +typedef union { +uint32_t value; +struct { +uint32_t channel: 4; +uint32_t data : 28; +} fields; +} MboxRegWrite; + +typedef MboxRegWrite MboxRegRead; + +typedef struct { +uint32_t size; +uint32_t req_resp_code; +} MboxBufHeader; + +#define DECLARE_TAG_TYPE(TypeName, RequestValueType, ResponseValueType) \ +typedef struct {\ +uint32_t id;\ +uint32_t value_buffer_size; \ +union { \ +struct {\ +uint32_t zero; \ +RequestValueType __attribute__((packed)) value; \ +} request; \ +struct {\ +uint32_t size : 31; \ +uint32_t success: 1;\ +ResponseValueType __attribute__((packed)) value;\ +} response; \ +}; \ +} TypeName + + +int mbox0_has_data(void); +void mbox0_read_message(uint8_t channel, void *msgbuf, size_t msgbuf_size); +void mbox1_write_message(uint8_t channel, uint32_t msg_addr); +int qtest_mbox0_has_data(QTestState *s); +void qtest_mbox0_read_message(QTestState *s, uint8_t channel, void *msgbuf, size_t msgbuf_size); +void qtest_mbox1_write_message(QTestState *s, uint8_t channel, uint32_t msg_addr); diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build index b071d400b3..61e9aab835 100644 --- a/tests/qtest/meson.build +++ b/tests/qtest/meson.build @@ -318,6 +318,7 @@ qtests = { 'tpm-tis-device-test': [io, tpmemu_files, 'tpm-tis-util.
[PATCH 14/44] Add BCM2838 PCIE Root Complex
Signed-off-by: Sergey Kambalin --- hw/arm/bcm2838_pcie.c | 65 +++ hw/arm/meson.build| 5 ++- hw/arm/trace-events | 4 +++ include/hw/arm/bcm2838_pcie.h | 44 4 files changed, 117 insertions(+), 1 deletion(-) create mode 100644 hw/arm/bcm2838_pcie.c create mode 100644 include/hw/arm/bcm2838_pcie.h diff --git a/hw/arm/bcm2838_pcie.c b/hw/arm/bcm2838_pcie.c new file mode 100644 index 00..522e19f3cf --- /dev/null +++ b/hw/arm/bcm2838_pcie.c @@ -0,0 +1,65 @@ +/* + * BCM2838 PCIe Root Complex emulation + * + * Copyright (C) 2022 Ovchinnikov Vitalii + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "qemu/osdep.h" +#include "qemu/log.h" +#include "qapi/error.h" +#include "hw/irq.h" +#include "hw/pci-host/gpex.h" +#include "hw/qdev-properties.h" +#include "migration/vmstate.h" +#include "qemu/module.h" +#include "hw/arm/bcm2838_pcie.h" +#include "trace.h" + +/* + * RC root part (D0:F0) + */ + +static void bcm2838_pcie_root_init(Object *obj) +{ +PCIBridge *br = PCI_BRIDGE(obj); +BCM2838PcieRootState *s = BCM2838_PCIE_ROOT(obj); + +br->bus_name = "pcie.1"; +memset(s->regs, 0xFF, sizeof(s->regs)); +} + +static void bcm2838_pcie_root_class_init(ObjectClass *class, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(class); +PCIDeviceClass *k = PCI_DEVICE_CLASS(class); +PCIERootPortClass *rpc = PCIE_ROOT_PORT_CLASS(class); + +dc->desc = "BCM2711 PCIe Bridge"; +/* + * PCI-facing part of the host bridge, not usable without the host-facing + * part, which can't be device_add'ed. + */ +dc->user_creatable = false; +k->vendor_id = BCM2838_PCIE_VENDOR_ID; +k->device_id = BCM2838_PCIE_DEVICE_ID; +k->revision = BCM2838_PCIE_REVISION; +rpc->exp_offset = BCM2838_PCIE_EXP_CAP_OFFSET; +rpc->aer_offset = BCM2838_PCIE_AER_CAP_OFFSET; +} + +static const TypeInfo bcm2838_pcie_root_info = { +.name = TYPE_BCM2838_PCIE_ROOT, +.parent = TYPE_PCIE_ROOT_PORT, +.instance_size = sizeof(BCM2838PcieRootState), +.instance_init = bcm2838_pcie_root_init, +.class_init = bcm2838_pcie_root_class_init, +}; + +static void bcm2838_pcie_register(void) +{ +type_register_static(&bcm2838_pcie_root_info); +} + +type_init(bcm2838_pcie_register) diff --git a/hw/arm/meson.build b/hw/arm/meson.build index 768b2608c1..72680fa534 100644 --- a/hw/arm/meson.build +++ b/hw/arm/meson.build @@ -39,7 +39,10 @@ arm_ss.add(when: 'CONFIG_ALLWINNER_A10', if_true: files('allwinner-a10.c', 'cubi arm_ss.add(when: 'CONFIG_ALLWINNER_H3', if_true: files('allwinner-h3.c', 'orangepi.c')) arm_ss.add(when: 'CONFIG_ALLWINNER_R40', if_true: files('allwinner-r40.c', 'bananapi_m2u.c')) arm_ss.add(when: 'CONFIG_RASPI', if_true: files('bcm2836.c', 'raspi.c')) -arm_ss.add(when: ['CONFIG_RASPI', 'TARGET_AARCH64'], if_true: files('bcm2838.c', 'raspi4b.c')) +arm_ss.add(when: ['CONFIG_RASPI', 'TARGET_AARCH64'], if_true: files( + 'bcm2838.c', + 'bcm2838_pcie.c', + 'raspi4b.c')) arm_ss.add(when: 'CONFIG_STM32F100_SOC', if_true: files('stm32f100_soc.c')) arm_ss.add(when: 'CONFIG_STM32F205_SOC', if_true: files('stm32f205_soc.c')) arm_ss.add(when: 'CONFIG_STM32F405_SOC', if_true: files('stm32f405_soc.c')) diff --git a/hw/arm/trace-events b/hw/arm/trace-events index 4f0167e638..6cfab31539 100644 --- a/hw/arm/trace-events +++ b/hw/arm/trace-events @@ -55,5 +55,9 @@ smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s" smmuv3_inv_notifiers_iova(const char *name, uint16_t asid, uint16_t vmid, uint64_t iova, uint8_t tg, uint64_t num_pages) "iommu mr=%s asid=%d vmid=%d iova=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64 +# bcm2838_pcie.c +bcm2838_pcie_host_read(unsigned int size, uint64_t offset, uint64_t value) "%u bytes @ 0x%04"PRIx64": 0x%016"PRIx64 +bcm2838_pcie_host_write(unsigned int size, uint64_t offset, uint64_t value) "%u bytes @ 0x%04"PRIx64": 0x%016"PRIx64 + # bcm2838.c bcm2838_gic_set_irq(int irq, int level) "gic irq:%d lvl:%d" diff --git a/include/hw/arm/bcm2838_pcie.h b/include/hw/arm/bcm2838_pcie.h new file mode 100644 index 00..b3d39b808d --- /dev/null +++ b/include/hw/arm/bcm2838_pcie.h @@ -0,0 +1,44 @@ +/* + * BCM2838 PCIe Root Complex emulation + * + * Copyright (C) 2022 Ovchinnikov Vitalii + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#ifndef BCM2838_PCIE_H +#define BCM2838_PCIE_H + +#include "exec/hwaddr.h" +#include "hw/sysbus.h" +#include "hw/pci/pci.h" +#include "hw/pci/pcie_host.h" +#include "hw/pci/pcie_port.h" +#include "qom/object.h" + +#define TYPE_BCM2838_PCIE_ROOT "bcm2838-pcie-root" +OBJECT_DECLARE_SIMPLE_TYPE(BCM2838PcieRootState, BCM2838_PCIE_ROOT) + +#define BCM2838_PCIE_VENDOR_ID 0x14E4 +#define BCM2838_PCIE_DEVICE_ID 0x2711 +#define BCM2838_PCIE_REVISION 20 + +#define BCM2838_PCIE_REGS_SIZE 0x9310
[PULL v2 4/5] qapi/trace: Tidy up trace-event-get-state, -set-state documentation
trace-event-set-state's explanation of how events are selected is under "Features". Doesn't belong there. Simply delete it, as it feels redundant with documentation of member @name. trace-event-get-state's explanation is under "Returns". Tolerable, but similarly redundant. Delete it, too. Cc: Alex Bennée Signed-off-by: Markus Armbruster Message-ID: <20230720071610.1096458-5-arm...@redhat.com> --- qapi/trace.json | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/qapi/trace.json b/qapi/trace.json index 39b752fc88..0819d93016 100644 --- a/qapi/trace.json +++ b/qapi/trace.json @@ -60,9 +60,6 @@ # # Returns: a list of @TraceEventInfo for the matching events # -# An event is returned if its name matches the @name pattern -# (There are no longer any per-vCPU events). -# # Since: 2.2 # # Example: @@ -90,10 +87,8 @@ # @vcpu: The vCPU to act upon (all by default; since 2.7). # # Features: -# @deprecated: Member @vcpu is deprecated, and always ignored. # -# An event is enabled if its name matches the @name pattern -# (There are no longer any per-vCPU events). +# @deprecated: Member @vcpu is deprecated, and always ignored. # # Since: 2.2 # -- 2.41.0
[PULL v2 0/5] QAPI patches patches for 2023-07-26
The following changes since commit 6cb2011fedf8c4e7b66b4a3abd6b42c1bae99ce6: Update version for v8.1.0-rc1 release (2023-07-25 20:09:05 +0100) are available in the Git repository at: https://repo.or.cz/qemu/armbru.git tags/pull-qapi-2023-07-26-v2 for you to fetch changes up to 9e272073e1c41acb3ba1e43b69c7a3f9c26089c2: qapi: Reformat recent doc comments to conform to current conventions (2023-07-26 14:51:36 +0200) QAPI patches patches for 2023-07-26 The patches affect only documentation. Generated code does not change. Markus Armbruster (5): qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation qapi/block: Tidy up block-latency-histogram-set documentation qapi/qdev: Tidy up device_add documentation qapi/trace: Tidy up trace-event-get-state, -set-state documentation qapi: Reformat recent doc comments to conform to current conventions qapi/block-core.json | 85 +++- qapi/block.json | 12 +++ qapi/cxl.json| 4 +-- qapi/machine-target.json | 2 +- qapi/migration.json | 10 +++--- qapi/net.json| 1 - qapi/qdev.json | 6 ++-- qapi/qom.json| 9 ++--- qapi/trace.json | 9 ++--- qapi/ui.json | 2 +- 10 files changed, 66 insertions(+), 74 deletions(-) -- 2.41.0
[PATCH 07/44] Implement BCM2838 GPIO functionality
Signed-off-by: Sergey Kambalin --- hw/gpio/bcm2838_gpio.c | 197 - 1 file changed, 193 insertions(+), 4 deletions(-) diff --git a/hw/gpio/bcm2838_gpio.c b/hw/gpio/bcm2838_gpio.c index 59be608250..cc9f4f74a0 100644 --- a/hw/gpio/bcm2838_gpio.c +++ b/hw/gpio/bcm2838_gpio.c @@ -18,6 +18,7 @@ #include "qapi/error.h" #include "hw/sysbus.h" #include "hw/gpio/bcm2838_gpio.h" +#include "hw/irq.h" #define GPFSEL0 0x00 #define GPFSEL1 0x04 @@ -56,14 +57,139 @@ #define RESET_VAL_CNTRL_REG2 0x50AAA95A; #define RESET_VAL_CNTRL_REG3 0x0005; +#define NUM_FSELN_IN_GPFSELN 10 +#define NUM_BITS_FSELN 3 +#define MASK_FSELN 0x7 + #define BYTES_IN_WORD4 +static uint32_t gpfsel_get(BCM2838GpioState *s, uint8_t reg) +{ +int i; +uint32_t value = 0; +for (i = 0; i < NUM_FSELN_IN_GPFSELN; i++) { +uint32_t index = NUM_FSELN_IN_GPFSELN * reg + i; +if (index < sizeof(s->fsel)) { +value |= (s->fsel[index] & MASK_FSELN) << (NUM_BITS_FSELN * i); +} +} +return value; +} + +static void gpfsel_set(BCM2838GpioState *s, uint8_t reg, uint32_t value) +{ +int i; +for (i = 0; i < NUM_FSELN_IN_GPFSELN; i++) { +uint32_t index = NUM_FSELN_IN_GPFSELN * reg + i; +if (index < sizeof(s->fsel)) { +int fsel = (value >> (NUM_BITS_FSELN * i)) & MASK_FSELN; +s->fsel[index] = fsel; +} +} +} + +static int gpfsel_is_out(BCM2838GpioState *s, int index) +{ +if (index >= 0 && index < BCM2838_GPIO_NUM) { +return s->fsel[index] == 1; +} +return 0; +} + +static void gpset(BCM2838GpioState *s, uint32_t val, uint8_t start, + uint8_t count, uint32_t *lev) +{ +uint32_t changes = val & ~*lev; +uint32_t cur = 1; + +int i; +for (i = 0; i < count; i++) { +if ((changes & cur) && (gpfsel_is_out(s, start + i))) { +qemu_set_irq(s->out[start + i], 1); +} +cur <<= 1; +} + +*lev |= val; +} + +static void gpclr(BCM2838GpioState *s, uint32_t val, uint8_t start, + uint8_t count, uint32_t *lev) +{ +uint32_t changes = val & *lev; +uint32_t cur = 1; + +int i; +for (i = 0; i < count; i++) { +if ((changes & cur) && (gpfsel_is_out(s, start + i))) { +qemu_set_irq(s->out[start + i], 0); +} +cur <<= 1; +} + +*lev &= ~val; +} + static uint64_t bcm2838_gpio_read(void *opaque, hwaddr offset, unsigned size) { +BCM2838GpioState *s = (BCM2838GpioState *)opaque; uint64_t value = 0; -qemu_log_mask(LOG_UNIMP, "%s: %s: not implemented for %"HWADDR_PRIx"\n", -TYPE_BCM2838_GPIO, __func__, offset); +switch (offset) { +case GPFSEL0: +case GPFSEL1: +case GPFSEL2: +case GPFSEL3: +case GPFSEL4: +case GPFSEL5: +value = gpfsel_get(s, offset / BYTES_IN_WORD); +break; +case GPSET0: +case GPSET1: +case GPCLR0: +case GPCLR1: +/* Write Only */ +qemu_log_mask(LOG_GUEST_ERROR, "%s: %s: Attempt reading from write only" + " register. %lu will be returned. Address 0x%"HWADDR_PRIx + ", size %u\n", TYPE_BCM2838_GPIO, __func__, value, offset, + size); +break; +case GPLEV0: +value = s->lev0; +break; +case GPLEV1: +value = s->lev1; +break; +case GPEDS0: +case GPEDS1: +case GPREN0: +case GPREN1: +case GPFEN0: +case GPFEN1: +case GPHEN0: +case GPHEN1: +case GPLEN0: +case GPLEN1: +case GPAREN0: +case GPAREN1: +case GPAFEN0: +case GPAFEN1: +/* Not implemented */ +qemu_log_mask(LOG_UNIMP, "%s: %s: not implemented for %"HWADDR_PRIx"\n", + TYPE_BCM2838_GPIO, __func__, offset); +break; +case GPIO_PUP_PDN_CNTRL_REG0: +case GPIO_PUP_PDN_CNTRL_REG1: +case GPIO_PUP_PDN_CNTRL_REG2: +case GPIO_PUP_PDN_CNTRL_REG3: +value = s->pup_cntrl_reg[(offset - GPIO_PUP_PDN_CNTRL_REG0) + / sizeof(s->pup_cntrl_reg[0])]; +break; +default: +qemu_log_mask(LOG_GUEST_ERROR, "%s: %s: bad offset %"HWADDR_PRIx"\n", + TYPE_BCM2838_GPIO, __func__, offset); +break; +} return value; } @@ -71,14 +197,77 @@ static uint64_t bcm2838_gpio_read(void *opaque, hwaddr offset, unsigned size) static void bcm2838_gpio_write(void *opaque, hwaddr offset, uint64_t value, unsigned size) { -qemu_log_mask(LOG_UNIMP, "%s: %s: not implemented for %"HWADDR_PRIx"\n", -TYPE_BCM2838_GPIO, __func__, offset); +BCM2838GpioState *s = (BCM2838GpioState *)opaque; + +switch (offset) { +case GPFSEL0: +case GPFSEL1: +case GPFSEL2: +case GPFSEL3: +case GPFSEL4: +case GPFSEL5: +gpfsel_set(s, off
[PATCH 21/44] Add clock_isp stub
Signed-off-by: Sergey Kambalin --- hw/arm/bcm2838_peripherals.c | 6 ++ include/hw/arm/bcm2838_peripherals.h | 1 + 2 files changed, 7 insertions(+) diff --git a/hw/arm/bcm2838_peripherals.c b/hw/arm/bcm2838_peripherals.c index e7a9db97ab..60ed535673 100644 --- a/hw/arm/bcm2838_peripherals.c +++ b/hw/arm/bcm2838_peripherals.c @@ -17,6 +17,9 @@ #define PCIE_MMIO_ARM_OFFSET0x6 #define PCIE_MMIO_SIZE 0x4000 +#define CLOCK_ISP_OFFSET0xc11000 +#define CLOCK_ISP_SIZE 0x100 + /* Lower peripheral base address on the VC (GPU) system bus */ #define BCM2838_VC_PERI_LOW_BASE 0x7c00 @@ -224,6 +227,9 @@ static void bcm2838_peripherals_realize(DeviceState *dev, Error **errp) memory_region_add_subregion(get_system_memory(), PCIE_MMIO_ARM_OFFSET, &s->pcie_mmio_alias); +create_unimp(s_base, &s->clkisp, "bcm2835-clkisp", CLOCK_ISP_OFFSET, + CLOCK_ISP_SIZE); + /* GPIO */ if (!sysbus_realize(SYS_BUS_DEVICE(&s->gpio), errp)) { return; diff --git a/include/hw/arm/bcm2838_peripherals.h b/include/hw/arm/bcm2838_peripherals.h index bb26aceb13..be4fc20f11 100644 --- a/include/hw/arm/bcm2838_peripherals.h +++ b/include/hw/arm/bcm2838_peripherals.h @@ -72,6 +72,7 @@ struct BCM2838PeripheralState { BCM2838Rng200State rng200; Bcm2838ThermalState thermal; SDHCIState emmc2; +UnimplementedDeviceState clkisp; BCM2838PcieHostState pcie_host; BCM2838GpioState gpio; -- 2.34.1