Re: [PATCH 7/8] tests/unit/test-smp-parse.c: Test the full 7-levels topology hierarchy
On 18/01/2024 15.48, Zhao Liu wrote: From: Zhao Liu Currently, -smp supports up to 7-levels topology hierarchy: -drawers/books/sockets/dies/clusters/cores/threads. Though no machine supports all these 7 levels yet, these 7 levels have the strict containment relationship and together form the generic CPU topology representation of QEMU. Also, note that the maxcpus is calculated by multiplying all 7 levels: maxcpus = drawers * books * sockets * dies * clusters * cores * threads. To cover this code path, it is necessary to test the full topology case (with all 7 levels). This also helps to avoid introducing new issues by further expanding the CPU topology in the future. Signed-off-by: Zhao Liu --- tests/unit/test-smp-parse.c | 143 1 file changed, 143 insertions(+) FWIW: Acked-by: Thomas Huth
Re: [PATCH 8/8] tests/unit/test-smp-parse.c: Test smp_props.has_clusters
On 18/01/2024 15.48, Zhao Liu wrote: From: Zhao Liu The smp_props.has_clusters in MachineClass is not a user configured field, and it indicates if user specifies "clusters" in -smp. After -smp parsing, other module could aware if the cluster level is configured by user. This is used when the machine has only 1 cluster since there's only 1 cluster by default. Add the check to cover this field. Signed-off-by: Zhao Liu --- tests/unit/test-smp-parse.c | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) Acked-by: Thomas Huth
[PULL 1/6] libqos/virtio.c: init all elems in qvring_indirect_desc_setup()
From: Daniel Henrique Barboza The loop isn't setting the values for the last element. Every other element is being initialized with addr = 0, flags = VRING_DESC_F_NEXT and next = i + 1. The last elem is never touched. This became a problem when enabling a RISC-V 'virt' libqos machine in the 'indirect' test of virti-blk-test.c. The 'flags' for the last element will end up being an odd number (since we didn't touch it). Being an odd number it will be mistaken by VRING_DESC_F_NEXT, which happens to be 1. Deep into hw/virt/virtio.c, in virtqueue_split_pop(), into virtqueue_split_read_next_desc(), a check for VRING_DESC_F_NEXT will be made to see if we're supposed to chain. The code will keep up chaining in the last element because the uninitialized value happens to be odd. We'll error out right after that because desc->next (which is also uninitialized) will be >= max. A VIRTQUEUE_READ_DESC_ERROR will be returned, with an error message like this in the stderr: qemu-system-riscv64: Desc next is 49391 Since we never returned, we'll end up timing out at qvirtio_wait_used_elem(): ERROR:../tests/qtest/libqos/virtio.c:236:qvirtio_wait_used_elem: assertion failed: (g_get_monotonic_time() - start_time <= timeout_us) The root cause is using uninitialized values from guest_alloc() in qvring_indirect_desc_setup(). There's no guarantee that the memory pages retrieved will be zeroed, so we can't make assumptions. In fact, commit 5b4f72f5e8 ("tests/qtest: properly initialise the vring used idx") fixed a similar problem stating "It is probably not wise to assume guest memory is zeroed anyway". I concur. Initialize all elems in qvring_indirect_desc_setup(). Fixes: f294b029aa ("libqos: Added indirect descriptor support to virtio implementation") Signed-off-by: Daniel Henrique Barboza Reviewed-by: Alistair Francis Reviewed-by: Thomas Huth Message-ID: <20240217192607.32565-2-dbarb...@ventanamicro.com> Signed-off-by: Thomas Huth --- tests/qtest/libqos/virtio.c | 25 +++-- 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/tests/qtest/libqos/virtio.c b/tests/qtest/libqos/virtio.c index 410513225f..4f39124eba 100644 --- a/tests/qtest/libqos/virtio.c +++ b/tests/qtest/libqos/virtio.c @@ -280,14 +280,27 @@ QVRingIndirectDesc *qvring_indirect_desc_setup(QTestState *qs, QVirtioDevice *d, indirect->elem = elem; indirect->desc = guest_alloc(alloc, sizeof(struct vring_desc) * elem); -for (i = 0; i < elem - 1; ++i) { +for (i = 0; i < elem; ++i) { /* indirect->desc[i].addr */ qvirtio_writeq(d, qs, indirect->desc + (16 * i), 0); -/* indirect->desc[i].flags */ -qvirtio_writew(d, qs, indirect->desc + (16 * i) + 12, - VRING_DESC_F_NEXT); -/* indirect->desc[i].next */ -qvirtio_writew(d, qs, indirect->desc + (16 * i) + 14, i + 1); + +/* + * If it's not the last element of the ring, set + * the chain (VRING_DESC_F_NEXT) flag and + * desc->next. Clear the last element - there's + * no guarantee that guest_alloc() will do it. + */ +if (i != elem - 1) { +/* indirect->desc[i].flags */ +qvirtio_writew(d, qs, indirect->desc + (16 * i) + 12, + VRING_DESC_F_NEXT); + +/* indirect->desc[i].next */ +qvirtio_writew(d, qs, indirect->desc + (16 * i) + 14, i + 1); +} else { +qvirtio_writew(d, qs, indirect->desc + (16 * i) + 12, 0); +qvirtio_writew(d, qs, indirect->desc + (16 * i) + 14, 0); +} } return indirect; -- 2.44.0
[PULL 2/6] libqos/virtio.c: fix 'avail_event' offset in qvring_init()
From: Daniel Henrique Barboza In qvring_init() we're writing vq->used->avail_event at "vq->used + 2 + array_size". The struct pointed by vq->used is, from virtio_ring.h Linux header): * // A ring of used descriptor heads with free-running index. * __virtio16 used_flags; * __virtio16 used_idx; * struct vring_used_elem used[num]; * __virtio16 avail_event_idx; So 'flags' is the word right at vq->used. 'idx' is vq->used + 2. We need to skip 'used_idx' by adding + 2 bytes, and then sum the vector size, to reach avail_event_idx. An example on how to properly access this field can be found in qvirtqueue_kick(): avail_event = qvirtio_readw(d, qts, vq->used + 4 + sizeof(struct vring_used_elem) * vq->size); This error was detected when enabling the RISC-V 'virt' libqos machine. The 'idx' test from vhost-user-blk-test.c errors out with a timeout in qvirtio_wait_used_elem(). The timeout happens because when processing the first element, 'avail_event' is read in qvirtqueue_kick() as non-zero because we didn't initialize it properly (and the memory at that point happened to be non-zero). 'idx' is 0. All of this makes this condition fail because "idx - avail_event" will overflow and be non-zero: /* < 1 because we add elements to avail queue one by one */ if ((flags & VRING_USED_F_NO_NOTIFY) == 0 && (!vq->event || (uint16_t)(idx-avail_event) < 1)) { d->bus->virtqueue_kick(d, vq); } As a result the virtqueue is never kicked and we'll timeout waiting for it. Fixes: 1053587c3f ("libqos: Added EVENT_IDX support") Signed-off-by: Daniel Henrique Barboza Reviewed-by: Alistair Francis Reviewed-by: Thomas Huth Message-ID: <20240217192607.32565-3-dbarb...@ventanamicro.com> Signed-off-by: Thomas Huth --- tests/qtest/libqos/virtio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/qtest/libqos/virtio.c b/tests/qtest/libqos/virtio.c index 4f39124eba..82a6e122bf 100644 --- a/tests/qtest/libqos/virtio.c +++ b/tests/qtest/libqos/virtio.c @@ -265,7 +265,7 @@ void qvring_init(QTestState *qts, const QGuestAllocator *alloc, QVirtQueue *vq, /* vq->used->idx */ qvirtio_writew(vq->vdev, qts, vq->used + 2, 0); /* vq->used->avail_event */ -qvirtio_writew(vq->vdev, qts, vq->used + 2 + +qvirtio_writew(vq->vdev, qts, vq->used + 4 + sizeof(struct vring_used_elem) * vq->size, 0); } -- 2.44.0
[PULL 6/6] chardev/char-socket: Fix TLS io channels sending too much data to the backend
Commit ffda5db65a ("io/channel-tls: fix handling of bigger read buffers") changed the behavior of the TLS io channels to schedule a second reading attempt if there is still incoming data pending. This caused a regression with backends like the sclpconsole that check in their read function that the sender does not try to write more bytes to it than the device can currently handle. The problem can be reproduced like this: 1) In one terminal, do this: mkdir qemu-pki cd qemu-pki openssl genrsa 2048 > ca-key.pem openssl req -new -x509 -nodes -days 365000 -key ca-key.pem -out ca-cert.pem # enter some dummy value for the cert openssl genrsa 2048 > server-key.pem openssl req -new -x509 -nodes -days 365000 -key server-key.pem \ -out server-cert.pem # enter some other dummy values for the cert gnutls-serv --echo --x509cafile ca-cert.pem --x509keyfile server-key.pem \ --x509certfile server-cert.pem -p 8338 2) In another terminal, do this: wget https://download.fedoraproject.org/pub/fedora-secondary/releases/39/Cloud/s390x/images/Fedora-Cloud-Base-39-1.5.s390x.qcow2 qemu-system-s390x -nographic -nodefaults \ -hda Fedora-Cloud-Base-39-1.5.s390x.qcow2 \ -object tls-creds-x509,id=tls0,endpoint=client,verify-peer=false,dir=$PWD/qemu-pki \ -chardev socket,id=tls_chardev,host=localhost,port=8338,tls-creds=tls0 \ -device sclpconsole,chardev=tls_chardev,id=tls_serial QEMU then aborts after a second or two with: qemu-system-s390x: ../hw/char/sclpconsole.c:73: chr_read: Assertion `size <= SIZE_BUFFER_VT220 - scon->iov_data_len' failed. Aborted (core dumped) It looks like the second read does not trigger the chr_can_read() function to be called before the second read, which should normally always be done before sending bytes to a character device to see how much it can handle, so the s->max_size in tcp_chr_read() still contains the old value from the previous read. Let's make sure that we use the up-to-date value by calling tcp_chr_read_poll() again here. Cc: qemu-sta...@nongnu.org Fixes: ffda5db65a ("io/channel-tls: fix handling of bigger read buffers") Buglink: https://issues.redhat.com/browse/RHEL-24614 Reviewed-by: "Daniel P. Berrangé" Message-ID: <20240229104339.42574-1-th...@redhat.com> Reviewed-by: Antoine Damhet Tested-by: Antoine Damhet Reviewed-by: Marc-André Lureau Signed-off-by: Thomas Huth --- chardev/char-socket.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/chardev/char-socket.c b/chardev/char-socket.c index 67e3334423..8a0406cc1e 100644 --- a/chardev/char-socket.c +++ b/chardev/char-socket.c @@ -496,9 +496,9 @@ static gboolean tcp_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque) s->max_size <= 0) { return TRUE; } -len = sizeof(buf); -if (len > s->max_size) { -len = s->max_size; +len = tcp_chr_read_poll(opaque); +if (len > sizeof(buf)) { +len = sizeof(buf); } size = tcp_chr_recv(chr, (void *)buf, len); if (size == 0 || (size == -1 && errno != EAGAIN)) { -- 2.44.0
[PULL 0/6] Misc fixes (libqos vring, Kconfig, TLS io channels, ...)
Hi Peter! The following changes since commit c0c6a0e3528b88aaad0b9d333e295707a195587b: Merge tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu into staging (2024-02-28 17:27:10 +) are available in the Git repository at: https://gitlab.com/thuth/qemu.git tags/pull-request-2024-03-01 for you to fetch changes up to 462945cd22d2bcd233401ed3aa167d83a8e35b05: chardev/char-socket: Fix TLS io channels sending too much data to the backend (2024-03-01 08:27:33 +0100) * Fix some bugs in the vring setup of libqos * Fix GIC settings when using --without-default-devices * Fix USB PCAP streams on Windows * Remove temporary files from test-util-sockets * Fix TLS io channels sending too much data to the backend Benjamin David Lunt (1): hw/usb/bus.c: PCAP adding 0xA in Windows version Daniel Henrique Barboza (2): libqos/virtio.c: init all elems in qvring_indirect_desc_setup() libqos/virtio.c: fix 'avail_event' offset in qvring_init() Thomas Huth (3): hw/intc/Kconfig: Fix GIC settings when using "--without-default-devices" tests/unit/test-util-sockets: Remove temporary file after test chardev/char-socket: Fix TLS io channels sending too much data to the backend chardev/char-socket.c | 6 +++--- hw/usb/bus.c | 5 +++-- tests/qtest/libqos/virtio.c| 27 --- tests/unit/test-util-sockets.c | 1 + hw/intc/Kconfig| 12 ++-- 5 files changed, 33 insertions(+), 18 deletions(-)
[PULL 5/6] tests/unit/test-util-sockets: Remove temporary file after test
test-util-sockets leaves the temporary socket files around in the temporary files folder. Let's better remove them at the end of the testing. Fixes: 4d3a329af5 ("tests/util-sockets: add abstract unix socket cases") Message-ID: <20240226082728.249753-1-th...@redhat.com> Reviewed-by: Marc-André Lureau Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Thomas Huth --- tests/unit/test-util-sockets.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/unit/test-util-sockets.c b/tests/unit/test-util-sockets.c index 63909ccb2b..4c9dd0b271 100644 --- a/tests/unit/test-util-sockets.c +++ b/tests/unit/test-util-sockets.c @@ -326,6 +326,7 @@ static void test_socket_unix_abstract(void) test_socket_unix_abstract_row(&matrix[i]); } +unlink(addr.u.q_unix.path); g_free(addr.u.q_unix.path); } -- 2.44.0
[PULL 3/6] hw/intc/Kconfig: Fix GIC settings when using "--without-default-devices"
When using "--without-default-devices", the ARM_GICV3_TCG and ARM_GIC_KVM settings currently get disabled, though the arm virt machine is only of very limited use in that case. This also causes the migration-test to fail in such builds. Let's make sure that we always keep the GIC switches enabled in the --without-default-devices builds, too. Message-ID: <20240221110059.152665-1-th...@redhat.com> Tested-by: Fabiano Rosas Signed-off-by: Thomas Huth --- hw/intc/Kconfig | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig index 97d550b06b..2b5b2d2301 100644 --- a/hw/intc/Kconfig +++ b/hw/intc/Kconfig @@ -12,10 +12,6 @@ config IOAPIC bool select I8259 -config ARM_GIC -bool -select MSI_NONBROKEN - config OPENPIC bool select MSI_NONBROKEN @@ -25,14 +21,18 @@ config APIC select MSI_NONBROKEN select I8259 +config ARM_GIC +bool +select ARM_GICV3_TCG if TCG +select ARM_GIC_KVM if KVM +select MSI_NONBROKEN + config ARM_GICV3_TCG bool -default y depends on ARM_GIC && TCG config ARM_GIC_KVM bool -default y depends on ARM_GIC && KVM config XICS -- 2.44.0
[PULL 4/6] hw/usb/bus.c: PCAP adding 0xA in Windows version
From: Benjamin David Lunt Since Windows text files use CRLFs for all \n, the Windows version of QEMU inserts a CR in the PCAP stream when a LF is encountered when using USB PCAP files. This is due to the fact that the PCAP file is opened as TEXT instead of BINARY. To show an example, when using a very common protocol to USB disks, the BBB protocol uses a 10-byte command packet. For example, the READ_CAPACITY(10) command will have a command block length of 10 (0xA). When this 10-byte command (part of the 31-byte CBW) is placed into the PCAP file, the Windows file manager inserts a 0xD before the 0xA, turning the 31-byte CBW into a 32-byte CBW. Actual CBW: 0040 55 53 42 43 01 00 00 00 08 00 00 00 80 00 0a 25 USBC...% 0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... PCAP CBW 0040 55 53 42 43 01 00 00 00 08 00 00 00 80 00 0d 0a USBC 0050 25 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 %.. I believe simply opening the PCAP file as BINARY instead of TEXT will fix this issue. Cc: qemu-sta...@nongnu.org Resolves: https://bugs.launchpad.net/qemu/+bug/2054889 Signed-off-by: Benjamin David Lunt Message-ID: <000101da6823$ce1bbf80$6a533e80$@fysnet.net> [thuth: Break long line to avoid checkpatch.pl error] Signed-off-by: Thomas Huth --- hw/usb/bus.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/hw/usb/bus.c b/hw/usb/bus.c index 796769fadb..bfab2807d7 100644 --- a/hw/usb/bus.c +++ b/hw/usb/bus.c @@ -260,13 +260,14 @@ static void usb_qdev_realize(DeviceState *qdev, Error **errp) } if (dev->pcap_filename) { -int fd = qemu_open_old(dev->pcap_filename, O_CREAT | O_WRONLY | O_TRUNC, 0666); +int fd = qemu_open_old(dev->pcap_filename, + O_CREAT | O_WRONLY | O_TRUNC | O_BINARY, 0666); if (fd < 0) { error_setg(errp, "open %s failed", dev->pcap_filename); usb_qdev_unrealize(qdev); return; } -dev->pcap = fdopen(fd, "w"); +dev->pcap = fdopen(fd, "wb"); usb_pcap_init(dev->pcap); } } -- 2.44.0
Re: [PATCH v6 00/23] migration: File based migration with multifd and mapped-ram
On Fri, Mar 01, 2024 at 09:50:32AM +0800, Peter Xu wrote: > On Thu, Feb 29, 2024 at 12:29:54PM -0300, Fabiano Rosas wrote: > > Based-on: 74aa0fb297 (migration: options incompatible with cpr) # > > peterx/migration-next > > > > Hi, > > > > In this v6: > > > > - Minor fixes to 17/23 and 19/23 > > The whole set looks good to me now. I plan to queue it before the > direct-io stuff. Any other comments / concerns from anyone? > > Dan, would it be fine I queue the IO patches together? Yes, that's fine, when the series is ready. With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH] hw/us/bus.c PCAP adding 0xA in Windows version
On 25/02/2024 20.49, benl...@fysnet.net wrote: Since Windows text files use CRLFs for all \n, the Windows version of QEMU inserts a CR in the PCAP stream when a LF is encountered when using USB PCAP files. This is due to the fact that the PCAP file is opened as TEXT instead of BINARY. To show an example, when using a very common protocol to USB disks, the BBB protocol uses a 10-byte command packet. For example, the READ_CAPACITY(10) command will have a command block length of 10 (0xA). When this 10-byte command (part of the 31-byte CBW) is placed into the PCAP file, the Windows file manager inserts a 0xD before the 0xA, turning the 31-byte CBW into a 32-byte CBW. Actual CBW: 0040 55 53 42 43 01 00 00 00 08 00 00 00 80 00 0a 25 USBC...% 0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... PCAP CBW 0040 55 53 42 43 01 00 00 00 08 00 00 00 80 00 0d 0a USBC 0050 25 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 %.. I believe simply opening the PCAP file as BINARY instead of TEXT will fix this issue. Resolves: https://bugs.launchpad.net/qemu/+bug/2054889 Signed-off-by: Benjamin David Lunt --- hw/usb/bus.c | 2 +- 1 file changed, 2 insertions(+), 2 deletions(-) diff -u a/hw/usb/bus.c b/hw/usb/bus.c --- a/hw/usb/bus.c +++ b/hw/usb/bus.c @@ -273,13 +273,13 @@ } if (dev->pcap_filename) { -int fd = qemu_open_old(dev->pcap_filename, O_CREAT | O_WRONLY | O_TRUNC, 0666); +int fd = qemu_open_old(dev->pcap_filename, O_CREAT | O_WRONLY | O_TRUNC | O_BINARY, 0666); Hi Benjamin! Thanks for the patch! Since it's rather a trivial patch and our USB maintainer Gerd is pretty much busy with other stuff right now, I went ahead and put it in my current pull request. Two things to notice: First, something along the way (likely your mail program) added a line break after the "O_WRONLY |" in the above two lines, so I had to undo that change manually before I was able to apply the patch ... please try to use "git send-email" for sending patches, then such things don't happen. And second, please use the scripts/checkpatch.pl script to check your patches - it was complaining about a line getting too long here, so I went ahead and fixed that, too (i.e. no need to resend, just a FYI for future patches). Thanks, Thomas if (fd < 0) { error_setg(errp, "open %s failed", dev->pcap_filename); usb_qdev_unrealize(qdev); return; } -dev->pcap = fdopen(fd, "w"); +dev->pcap = fdopen(fd, "wb"); usb_pcap_init(dev->pcap); } }
[PATCH v1 1/1] memory tier: acpi/hmat: create CPUless memory tiers after obtaining HMAT info
* Introduce `mt_init_with_hmat()` We defer memory tier initialization for those CPUless NUMA nodes until acquiring HMAT info. `mt_init_with_hmat()` is introduced to post-create CPUless memory tiers after obtaining HMAT info. It iterates through each CPUless memory node, creating memory tiers if necessary. Finally, it calculates demotion tables again at the end. * Introduce `hmat_find_alloc_memory_type()` Find or allocate a memory type in the `hmat_memory_types` list. * Make `set_node_memory_tier()` more generic This function can also be used for setting other memory types for a node. To do so, a new argument is added to specify a memory type. * Handle cases where there is no HMAT when creating memory tiers If no HMAT is specified, it falls back to using `default_dram_type`. * Change adist calculation code to use another new lock, mt_perf_lock. Iterating through CPUlist nodes requires holding the `memory_tier_lock`. However, `mt_calc_adistance()` will end up trying to acquire the same lock, leading to a potential deadlock. Therefore, we propose introducing a standalone `mt_perf_lock` to protect `default_dram_perf`. This approach not only avoids deadlock but also prevents holding a large lock simultaneously. Signed-off-by: Ho-Ren (Jack) Chuang Signed-off-by: Hao Xiang --- drivers/acpi/numa/hmat.c | 3 ++ include/linux/memory-tiers.h | 6 +++ mm/memory-tiers.c| 76 3 files changed, 77 insertions(+), 8 deletions(-) diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c index d6b85f0f6082..9f57338b3cb5 100644 --- a/drivers/acpi/numa/hmat.c +++ b/drivers/acpi/numa/hmat.c @@ -1038,6 +1038,9 @@ static __init int hmat_init(void) if (!hmat_set_default_dram_perf()) register_mt_adistance_algorithm(&hmat_adist_nb); + /* Post-create CPUless memory tiers after getting HMAT info */ + mt_init_with_hmat(); + return 0; out_put: hmat_free_structures(); diff --git a/include/linux/memory-tiers.h b/include/linux/memory-tiers.h index 69e781900082..2f845e90c033 100644 --- a/include/linux/memory-tiers.h +++ b/include/linux/memory-tiers.h @@ -48,6 +48,7 @@ int mt_calc_adistance(int node, int *adist); int mt_set_default_dram_perf(int nid, struct access_coordinate *perf, const char *source); int mt_perf_to_adistance(struct access_coordinate *perf, int *adist); +void mt_init_with_hmat(void); #ifdef CONFIG_MIGRATION int next_demotion_node(int node); void node_get_allowed_targets(pg_data_t *pgdat, nodemask_t *targets); @@ -136,5 +137,10 @@ static inline int mt_perf_to_adistance(struct access_coordinate *perf, int *adis { return -EIO; } + +static inline void mt_init_with_hmat(void) +{ + +} #endif /* CONFIG_NUMA */ #endif /* _LINUX_MEMORY_TIERS_H */ diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c index 0537664620e5..7a0a579b3deb 100644 --- a/mm/memory-tiers.c +++ b/mm/memory-tiers.c @@ -35,7 +35,9 @@ struct node_memory_type_map { }; static DEFINE_MUTEX(memory_tier_lock); +static DEFINE_MUTEX(mt_perf_lock); static LIST_HEAD(memory_tiers); +static LIST_HEAD(hmat_memory_types); static struct node_memory_type_map node_memory_types[MAX_NUMNODES]; struct memory_dev_type *default_dram_type; @@ -502,7 +504,7 @@ static inline void __init_node_memory_type(int node, struct memory_dev_type *mem } } -static struct memory_tier *set_node_memory_tier(int node) +static struct memory_tier *set_node_memory_tier(int node, struct memory_dev_type *new_memtype) { struct memory_tier *memtier; struct memory_dev_type *memtype; @@ -514,7 +516,7 @@ static struct memory_tier *set_node_memory_tier(int node) if (!node_state(node, N_MEMORY)) return ERR_PTR(-EINVAL); - __init_node_memory_type(node, default_dram_type); + __init_node_memory_type(node, new_memtype); memtype = node_memory_types[node].memtype; node_set(node, memtype->nodes); @@ -623,6 +625,56 @@ void clear_node_memory_type(int node, struct memory_dev_type *memtype) } EXPORT_SYMBOL_GPL(clear_node_memory_type); +static struct memory_dev_type *hmat_find_alloc_memory_type(int adist) +{ + bool found = false; + struct memory_dev_type *mtype; + + list_for_each_entry(mtype, &hmat_memory_types, list) { + if (mtype->adistance == adist) { + found = true; + break; + } + } + if (!found) { + mtype = alloc_memory_type(adist); + if (!IS_ERR(mtype)) + list_add(&mtype->list, &hmat_memory_types); + } + return mtype; +} + +static void mt_create_with_hmat(int node) +{ + struct memory_dev_type *mtype = NULL; + int adist = MEMTIER_ADISTANCE_DRAM; + + mt_calc_adistance(node, &adist); + if (adist != MEMTIER_ADISTANCE_DRAM) { + mtype = hmat_find_alloc_memory_type
[PATCH v1 0/1] Improved Memory Tier Creation for CPUless NUMA Nodes
The memory tiering component in the kernel is functionally useless for CPUless memory/non-DRAM devices like CXL1.1 type3 memory because the nodes are lumped together in the DRAM tier. https://lore.kernel.org/linux-mm/ph0pr08mb7955e9f08ccb64f23963b5c3a8...@ph0pr08mb7955.namprd08.prod.outlook.com/T/ This patchset automatically resolves the issues. It delays the initialization of memory tiers for CPUless NUMA nodes until they obtain HMAT information at boot time, eliminating the need for user intervention. If no HMAT specified, it falls back to using `default_dram_type`. Example usecase: We have CXL memory on the host, and we create VMs with a new system memory device backed by host CXL memory. We inject CXL memory performance attributes through QEMU, and the guest now sees memory nodes with performance attributes in HMAT. With this change, we enable the guest kernel to construct the correct memory tiering for the memory nodes. Ho-Ren (Jack) Chuang (1): memory tier: acpi/hmat: create CPUless memory tiers after obtaining HMAT info drivers/acpi/numa/hmat.c | 3 ++ include/linux/memory-tiers.h | 6 +++ mm/memory-tiers.c| 76 3 files changed, 77 insertions(+), 8 deletions(-) -- Hao Xiang and Ho-Ren (Jack) Chuang
Re: [PATCH v7 2/2] hw/acpi: Implement the SRAT GI affinity structure
>> >> One more thing. Right now we can't use Generic Initiators as >> HMAT initiators. That also wants fixing given that's their >> normal usecase rather than what you are using them for so it >> should 'work'. > > Something along the lines of this will do the job. Thanks! Will incorporate the patch in the next posting. > > diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h > index 4173ef2afa..825cfe86bc 100644 > --- a/include/sysemu/numa.h > +++ b/include/sysemu/numa.h > @@ -41,6 +41,7 @@ struct NodeInfo { > struct HostMemoryBackend *node_memdev; > bool present; > bool has_cpu; >+ bool has_gi; > uint8_t lb_info_provided; > uint16_t initiator; > uint8_t distance[MAX_NODES]; > diff --git a/hw/acpi/acpi-generic-initiator.c > b/hw/acpi/acpi-generic-initiator.c > index 9179590a42..8a67300320 100644 > --- a/hw/acpi/acpi-generic-initiator.c > +++ b/hw/acpi/acpi-generic-initiator.c > @@ -6,6 +6,7 @@ > #include "qemu/osdep.h" > #include "hw/acpi/acpi-generic-initiator.h" > #include "hw/pci/pci_device.h" > +#include "hw/boards.h" > #include "qapi/error.h" > #include "qapi/qapi-builtin-visit.h" > #include "qapi/visitor.h" > @@ -58,6 +59,7 @@ static void acpi_generic_node_set_node(Object *obj, Visitor > *v, > const char *name, void *opaque, > Error **errp) > { > + MachineState *ms = MACHINE(qdev_get_machine()); > AcpiGenericNode *gn = ACPI_GENERIC_NODE(obj); > uint32_t value; > > @@ -72,6 +74,10 @@ static void acpi_generic_node_set_node(Object *obj, > Visitor *v, > } > > gn->node = value; > + > + if (object_dynamic_cast(obj, TYPE_ACPI_GENERIC_INITIATOR)) { > + ms->numa_state->nodes[gn->node].has_gi = true; > + } > } > > static void acpi_generic_node_class_init(ObjectClass *oc, void *data) > diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c > index b933ae3c06..9b1662b6b8 100644 > --- a/hw/acpi/hmat.c > +++ b/hw/acpi/hmat.c > @@ -225,7 +225,7 @@ static void hmat_build_table_structs(GArray *table_data, > NumaState *numa_state) > } > > for (i = 0; i < numa_state->num_nodes; i++) { > - if (numa_state->nodes[i].has_cpu) { > + if (numa_state->nodes[i].has_cpu || numa_state->nodes[i].has_gi) { > initiator_list[num_initiator++] = i; > } > } > diff --git a/hw/core/numa.c b/hw/core/numa.c > index f08956ddb0..58a32f1564 100644 > --- a/hw/core/numa.c > +++ b/hw/core/numa.c > @@ -229,7 +229,8 @@ void parse_numa_hmat_lb(NumaState *numa_state, > NumaHmatLBOptions *node, > node->target, numa_state->num_nodes); > return; > } > - if (!numa_info[node->initiator].has_cpu) { > + if (!numa_info[node->initiator].has_cpu && > + !numa_info[node->initiator].has_gi) { > error_setg(errp, "Invalid initiator=%d, it isn't an " > "initiator proximity domain", node->initiator); > return; > > Jonathan > > > > > > > >
Re: [PATCH v7 1/2] qom: new object to associate device to numa node
>> As for your suggestion of using acpi-dev as the arg to take both >> pci-dev and acpi-dev.. Would that mean sending a pure pci device >> (not the corner case you mentioned) through the acpi-dev argument >> as well? Not sure if that would appropriate. > > Ah, looking up my description is unclear. I meant two optional parameters > pci-dev or acpi-dev - which one was supplied would indicate the type > of handle to be used. Yes, that makes sense. But for now only have pci-dev until we have any acpi-dev related code added? IIRC, this is what we discussed earlier.
Re: [PATCH 06/19] smbios: get rid of smbios_legacy global
> On 29-Feb-2024, at 19:59, Igor Mammedov wrote: > > On Thu, 29 Feb 2024 16:23:21 +0530 > Ani Sinha wrote: > >>> On 27-Feb-2024, at 21:17, Igor Mammedov wrote: >>> >>> clean up smbios_set_defaults() which is reused by legacy >>> and non legacy machines from being aware of 'legacy' notion >>> and need to turn it off. And push legacy handling up to >>> PC machine code where it's relevant. >>> >>> Signed-off-by: Igor Mammedov Reviewed-by: Ani Sinha >>> --- >>> PS: I've moved/kept legacy smbios_entries to smbios_get_tables() >>> but it at least is not visible to API users. To get rid of it >>> as well, it would be necessary to change how '-smbios' CLI >>> option is processed. Which is done later in the series. >>> --- >>> include/hw/firmware/smbios.h | 2 +- >>> hw/arm/virt.c| 2 +- >>> hw/i386/fw_cfg.c | 7 --- >>> hw/loongarch/virt.c | 2 +- >>> hw/riscv/virt.c | 2 +- >>> hw/smbios/smbios.c | 35 +++ >>> 6 files changed, 23 insertions(+), 27 deletions(-) >>> >>> diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h >>> index a187fbbd3d..0818184834 100644 >>> --- a/include/hw/firmware/smbios.h >>> +++ b/include/hw/firmware/smbios.h >>> @@ -293,7 +293,7 @@ struct smbios_type_127 { >>> void smbios_entry_add(QemuOpts *opts, Error **errp); >>> void smbios_set_cpuid(uint32_t version, uint32_t features); >>> void smbios_set_defaults(const char *manufacturer, const char *product, >>> - const char *version, bool legacy_mode, >>> + const char *version, >>> bool uuid_encoded, SmbiosEntryPointType ep_type); >>> void smbios_set_default_processor_family(uint16_t processor_family); >>> uint8_t *smbios_get_table_legacy(uint32_t expected_t4_count, size_t >>> *length); >>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c >>> index 0af1943697..8588681f27 100644 >>> --- a/hw/arm/virt.c >>> +++ b/hw/arm/virt.c >>> @@ -1633,7 +1633,7 @@ static void virt_build_smbios(VirtMachineState *vms) >>>} >>> >>>smbios_set_defaults("QEMU", product, >>> -vmc->smbios_old_sys_ver ? "1.0" : mc->name, false, >>> +vmc->smbios_old_sys_ver ? "1.0" : mc->name, >>>true, SMBIOS_ENTRY_POINT_TYPE_64); >>> >>>/* build the array of physical mem area from base_memmap */ >>> diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c >>> index fcb4fb0769..c1e9c0fd9c 100644 >>> --- a/hw/i386/fw_cfg.c >>> +++ b/hw/i386/fw_cfg.c >>> @@ -63,15 +63,16 @@ void fw_cfg_build_smbios(PCMachineState *pcms, >>> FWCfgState *fw_cfg) >>>if (pcmc->smbios_defaults) { >>>/* These values are guest ABI, do not change */ >>>smbios_set_defaults("QEMU", mc->desc, mc->name, >>> -pcmc->smbios_legacy_mode, >>> pcmc->smbios_uuid_encoded, >>> +pcmc->smbios_uuid_encoded, >>>pcms->smbios_entry_point_type); >>>} >>> >>>/* tell smbios about cpuid version and features */ >>>smbios_set_cpuid(cpu->env.cpuid_version, cpu->env.features[FEAT_1_EDX]); >>> >>> -smbios_tables = smbios_get_table_legacy(ms->smp.cpus, >>> &smbios_tables_len); >>> -if (smbios_tables) { >>> +if (pcmc->smbios_legacy_mode) { >>> +smbios_tables = smbios_get_table_legacy(ms->smp.cpus, >>> +&smbios_tables_len); >>>fw_cfg_add_bytes(fw_cfg, FW_CFG_SMBIOS_ENTRIES, >>> smbios_tables, smbios_tables_len); >>>return; >>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c >>> index 0ad7d8c887..73fb3522ba 100644 >>> --- a/hw/loongarch/virt.c >>> +++ b/hw/loongarch/virt.c >>> @@ -320,7 +320,7 @@ static void virt_build_smbios(LoongArchMachineState >>> *lams) >>>return; >>>} >>> >>> -smbios_set_defaults("QEMU", product, mc->name, false, >>> +smbios_set_defaults("QEMU", product, mc->name, >>>true, SMBIOS_ENTRY_POINT_TYPE_64); >>> >>>smbios_get_tables(ms, NULL, 0, &smbios_tables, &smbios_tables_len, >>> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c >>> index fd35c74781..e2c9529df2 100644 >>> --- a/hw/riscv/virt.c >>> +++ b/hw/riscv/virt.c >>> @@ -1235,7 +1235,7 @@ static void virt_build_smbios(RISCVVirtState *s) >>>product = "KVM Virtual Machine"; >>>} >>> >>> -smbios_set_defaults("QEMU", product, mc->name, false, >>> +smbios_set_defaults("QEMU", product, mc->name, >>>true, SMBIOS_ENTRY_POINT_TYPE_64); >>> >>>if (riscv_is_32bit(&s->soc[0])) { >>> diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c >>> index 15339d8dbe..c46fc93357 100644 >>> --- a/hw/smbios/smbios.c >>> +++ b/hw/smbios/smbios.c >>> @@ -54,7 +54,6 @@ struct smbios_table { >>> >>> static uint8_t *smbios_entries; >>> static size_t smbios_entries_len; >>> -static bool smbios_lega
Re: [PATCH v6 00/23] migration: File based migration with multifd and mapped-ram
On Thu, Feb 29, 2024 at 12:29:54PM -0300, Fabiano Rosas wrote: > Based-on: 74aa0fb297 (migration: options incompatible with cpr) # > peterx/migration-next > > Hi, > > In this v6: > > - Minor fixes to 17/23 and 19/23 Thanks both for confirming, queued now. -- Peter Xu
Re: [PATCH v2 1/3] qtest: migration: Enhance qtest migration functions to support 'channels' argument
On 29/02/24 6:47 am, Fabiano Rosas wrote: Het Gala writes: On 27/02/24 1:04 am, Het Gala wrote: On 26/02/24 6:31 pm, Fabiano Rosas wrote: Het Gala writes: On 24/02/24 1:42 am, Fabiano Rosas wrote: this was the same first approach that I attempted. It won't work because The final 'migrate' QAPI with channels string would look like { "execute": "migrate", "arguments": { "channels": "[ { "channel-type": "main", "addr": { "transport": "socket", "type": "inet", "host": "10.117.29.84", "port": "4000" }, "multifd-channels": 2 } ]" } } instead of { "execute": "migrate", "arguments": { "channels": [ { "channel-type": "main", "addr": { "transport": "socket", "type": "inet", "host": "10.117.29.84", "port": "4000" }, "multifd-channels": 2 } ] } } It would complain, that channels should be an *array* and not a string. So, that's the reason parsing was required in qtest too. I would be glad to hear if there are any ideas to convert /string -> json object -> add it inside qdict along with uri/ ? Isn't this what the various qobject_from_json do? How does it work with the existing tests? qtest_qmp_assert_success(to, "{ 'execute': 'migrate-incoming'," " 'arguments': { " " 'channels': [ { 'channel-type': 'main'," " 'addr': { 'transport': 'socket'," "'type': 'inet'," "'host': '127.0.0.1'," "'port': '0' } } ] } }"); We can pass this^ string successfully to QMP somehow... I think, here in qtest_qmp_assert_success, we actually can pass the whole QMP command, and it just asserts that return key is present in the response, though I am not very much familiar with qtest codebase to verify how swiftly we can convert string into an actual QObject. [...] I tried with qobject_from_json type of utility functions and the error I got was this : migration-test: /rpmbuild/SOURCES/qemu/include/qapi/qmp/qobject.h:126: qobject_type: Assertion `QTYPE_NONE < obj->base.type && obj->base.type < QTYPE__MAX' failed. And I suppose this was the case because, there are only limited types of QTYPE available typedefenumQType{ QTYPE_NONE, QTYPE_QNULL, QTYPE_QNUM, QTYPE_QSTRING, QTYPE_QDICT, QTYPE_QLIST, QTYPE_QBOOL, QTYPE__MAX, } QType; And 'channels' is a mixture of QDICT and QLIST and hence it is not able to easily convert from string to json. Thoughts on this ? I'm not sure what you tried. This works: g_assert(!qdict_haskey(args, "channels")); if (channels) { channels_obj = qobject_from_json(channels, errp); qdict_put_obj(args, "channels", channels_obj); } Are you sure the above works ? Sorry I want to get this doubt cleared first, Let's me take a step back and just focus on the above part and not focus on port issue for now. Adding a validation test for uri and channels both used together migration_test_add("/migration/validate_uri/channels/both_set", test_validate_uri_channels_both_set); static void test_validate_uri_channels_both_set(void) { QTestState *from, *to; MigrateCommon args = { .connect_channels = "'channels': [ { 'channel-type': 'main'," " 'addr': { 'transport': 'socket'," " 'type': 'inet'," " 'host': '127.0.0.1'," " 'port': '0' } } ]", .listen_uri = "tcp:127.0.0.1:0", .result = MIG_TEST_QMP_ERROR, }; if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { return; } wait_for_serial("src_serial"); if (args.result == MIG_TEST_QMP_ERROR) { migrate_qmp_fail(from, args.connect_uri, args.connect_channels, "{}"); } else { migrate_qmp(from, args.connect_uri, args.connect_channels, "{}"); } test_migrate_end(from, to, false); } In the migration-helpers.c file, inside migrate_qmp_fail function: void migrate_qmp_fail(QTestState *who, const char *uri, const char *channels, const char *fmt, ...) { [...] Error **errp = NULL; QObject *channels_obj = NULL; [...] g_assert(!qdict_haskey(args, "channels")); if (channels) { channels_obj = qobject_from_json(channels, errp); if (!channels_obj) { fprintf(stdout, "Everything is right till here also ?\n"); goto cleanup; } qdict_put_obj(args, "channels", channels_obj); } err = qtest_qmp_assert_failure_ref( who, "{ 'execute': 'migrate', 'arguments': %p}", args); [...] cleanup: qobject_unref(channels_obj); } When I ran the test alone, test passed, but instead of giving output as {"error": {"class": "GenericError", "desc": "need either 'uri' or 'channels' argument"}} It just passed,
[PATCH] migration/multifd: Document two places for mapped-ram
From: Peter Xu Add two documentations for mapped-ram migration on two spots that may not be extremely clear. Signed-off-by: Peter Xu --- Based-on: <20240229153017.2221-1-faro...@suse.de> --- migration/multifd.c | 12 migration/ram.c | 8 +++- 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/migration/multifd.c b/migration/multifd.c index b4e5a9dfcc..2942395ce2 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -709,6 +709,18 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp) { if (p->c) { migration_ioc_unregister_yank(p->c); +/* + * An explicitly close() on the channel here is normally not + * required, but can be helpful for "file:" iochannels, where it + * will include an fdatasync() to make sure the data is flushed to + * the disk backend. + * + * The object_unref() cannot guarantee that because: (1) finalize() + * of the iochannel is only triggered on the last reference, and + * it's not guaranteed that we always hold the last refcount when + * reaching here, and, (2) even if finalize() is invoked, it only + * does a close(fd) without data flush. + */ qio_channel_close(p->c, &error_abort); object_unref(OBJECT(p->c)); p->c = NULL; diff --git a/migration/ram.c b/migration/ram.c index 1f1b5297cf..c79e3de521 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -4258,7 +4258,13 @@ static int ram_load_precopy(QEMUFile *f) switch (flags & ~RAM_SAVE_FLAG_CONTINUE) { case RAM_SAVE_FLAG_MEM_SIZE: ret = parse_ramblocks(f, addr); - +/* + * For mapped-ram migration (to a file) using multifd, we sync + * once and for all here to make sure all tasks we queued to + * multifd threads are completed, so that all the ramblocks + * (including all the guest memory pages within) are fully + * loaded after this sync returns. + */ if (migrate_mapped_ram()) { multifd_recv_sync_main(); } -- 2.44.0
[PATCH v5 10/17] hw/loongarch: fdt adds cpu interrupt controller node
fdt adds cpu interrupt controller node, we use 'loongson,cpu-interrupt-controller'. See: https://github.com/torvalds/linux/blob/v6.7/drivers/irqchip/irq-loongarch-cpu.c https://lore.kernel.org/r/20221114113824.1880-2-liupei...@loongson.cn Signed-off-by: Song Gao --- hw/loongarch/virt.c | 20 1 file changed, 20 insertions(+) diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c index 5f787338a2..a8374c35a4 100644 --- a/hw/loongarch/virt.c +++ b/hw/loongarch/virt.c @@ -81,7 +81,23 @@ static void virt_flash_map(LoongArchMachineState *lams, sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal); memory_region_add_subregion(sysmem, base, sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0)); +} + +static void fdt_add_cpuic_node(LoongArchMachineState *lams, + uint32_t *cpuintc_phandle) +{ +MachineState *ms = MACHINE(lams); +char *nodename; +*cpuintc_phandle = qemu_fdt_alloc_phandle(ms->fdt); +nodename = g_strdup_printf("/cpuic"); +qemu_fdt_add_subnode(ms->fdt, nodename); +qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle", *cpuintc_phandle); +qemu_fdt_setprop_string(ms->fdt, nodename, "compatible", +"loongson,cpu-interrupt-controller"); +qemu_fdt_setprop(ms->fdt, nodename, "interrupt-controller", NULL, 0); +qemu_fdt_setprop_cell(ms->fdt, nodename, "#interrupt-cells", 1); +g_free(nodename); } static void fdt_add_flash_node(LoongArchMachineState *lams) @@ -492,6 +508,7 @@ static void loongarch_irq_init(LoongArchMachineState *lams) CPULoongArchState *env; CPUState *cpu_state; int cpu, pin, i, start, num; +uint32_t cpuintc_phandle; /* * The connection of interrupts: @@ -526,6 +543,9 @@ static void loongarch_irq_init(LoongArchMachineState *lams) memory_region_add_subregion(&lams->system_iocsr, MAIL_SEND_ADDR, sysbus_mmio_get_region(SYS_BUS_DEVICE(ipi), 1)); +/* Add cpu interrupt-controller */ +fdt_add_cpuic_node(lams, &cpuintc_phandle); + for (cpu = 0; cpu < ms->smp.cpus; cpu++) { cpu_state = qemu_get_cpu(cpu); cpudev = DEVICE(cpu_state); -- 2.25.1
[PATCH v5 08/17] hw/loongarch: Init efi_fdt table
Signed-off-by: Song Gao --- include/hw/loongarch/boot.h | 4 hw/loongarch/boot.c | 11 +++ 2 files changed, 15 insertions(+) diff --git a/include/hw/loongarch/boot.h b/include/hw/loongarch/boot.h index ce47056608..bbe8c8dd5d 100644 --- a/include/hw/loongarch/boot.h +++ b/include/hw/loongarch/boot.h @@ -34,6 +34,10 @@ typedef struct { EFI_GUID(0x5568e427, 0x68fc, 0x4f3d, 0xac, 0x74, \ 0xca, 0x55, 0x52, 0x31, 0xcc, 0x68) +#define DEVICE_TREE_GUID \ +EFI_GUID(0xb1b621d5, 0xf19c, 0x41a5, 0x83, 0x0b, \ + 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0) + struct efi_config_table { efi_guid_t guid; uint64_t *ptr; diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c index 6f56d4fd91..fe3e640508 100644 --- a/hw/loongarch/boot.c +++ b/hw/loongarch/boot.c @@ -111,6 +111,16 @@ static void init_efi_initrd_table(struct efi_system_table *systab, initrd_table->size = initrd_size; } +static void init_efi_fdt_table(struct efi_system_table *systab) +{ +efi_guid_t tbl_guid = DEVICE_TREE_GUID; + +/* efi_configuration_table 3 */ +guidcpy(&systab->tables[2].guid, &tbl_guid); +systab->tables[2].table = (void *)0x10; /* fdt_base 1MiB */ +systab->nr_tables = 3; +} + static void init_systab(struct loongarch_boot_info *info, void *p, void *start) { void *bp_tables_start; @@ -136,6 +146,7 @@ static void init_systab(struct loongarch_boot_info *info, void *p, void *start) sizeof(efi_memory_desc_t) * memmap_entries, 64); init_efi_initrd_table(systab, p, start); p += ROUND_UP(sizeof(struct efi_initrd), 64); +init_efi_fdt_table(systab); systab->tables = (struct efi_configuration_table *)(bp_tables_start - start); } -- 2.25.1
[PATCH v5 02/17] hw/loongarch: Add load initrd
we load initrd ramdisk after kernel_high address Signed-off-by: Song Gao --- hw/loongarch/boot.c | 29 - 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c index 0f2bc15fdf..3075c276d4 100644 --- a/hw/loongarch/boot.c +++ b/hw/loongarch/boot.c @@ -21,7 +21,8 @@ static uint64_t cpu_loongarch_virt_to_phys(void *opaque, uint64_t addr) static int64_t load_kernel_info(struct loongarch_boot_info *info) { -uint64_t kernel_entry, kernel_low, kernel_high; +uint64_t kernel_entry, kernel_low, kernel_high, initrd_size; +ram_addr_t initrd_offset; ssize_t kernel_size; kernel_size = load_elf(info->kernel_filename, NULL, @@ -36,6 +37,32 @@ static int64_t load_kernel_info(struct loongarch_boot_info *info) load_elf_strerror(kernel_size)); exit(1); } + +if (info->initrd_filename) { +initrd_size = get_image_size(info->initrd_filename); +if (initrd_size > 0) { +initrd_offset = ROUND_UP(kernel_high + 4 * kernel_size, 64 * KiB); + +if (initrd_offset + initrd_size > info->ram_size) { +error_report("memory too small for initial ram disk '%s'", + info->initrd_filename); +exit(1); +} + +initrd_size = load_image_targphys(info->initrd_filename, initrd_offset, + info->ram_size - initrd_offset); +} + +if (initrd_size == (target_ulong)-1) { +error_report("could not load initial ram disk '%s'", + info->initrd_filename); +exit(1); +} +} else { +error_report("Need initrd!"); +exit(1); +} + return kernel_entry; } -- 2.25.1
[PATCH v5 07/17] hw/loongarch: Init efi_initrd table
Signed-off-by: Song Gao --- include/hw/loongarch/boot.h | 9 + hw/loongarch/boot.c | 23 +-- 2 files changed, 30 insertions(+), 2 deletions(-) diff --git a/include/hw/loongarch/boot.h b/include/hw/loongarch/boot.h index 7ad25080c5..ce47056608 100644 --- a/include/hw/loongarch/boot.h +++ b/include/hw/loongarch/boot.h @@ -30,6 +30,10 @@ typedef struct { EFI_GUID(0x800f683f, 0xd08b, 0x423a, 0xa2, 0x93, \ 0x96, 0x5c, 0x3c, 0x6f, 0xe2, 0xb4) +#define LINUX_EFI_INITRD_MEDIA_GUID \ +EFI_GUID(0x5568e427, 0x68fc, 0x4f3d, 0xac, 0x74, \ + 0xca, 0x55, 0x52, 0x31, 0xcc, 0x68) + struct efi_config_table { efi_guid_t guid; uint64_t *ptr; @@ -83,6 +87,11 @@ struct efi_boot_memmap { efi_memory_desc_t map[32]; }; +struct efi_initrd { +uint64_t base; +uint64_t size; +}; + struct loongarch_boot_info { uint64_t ram_size; const char *kernel_filename; diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c index c086070bc8..6f56d4fd91 100644 --- a/hw/loongarch/boot.c +++ b/hw/loongarch/boot.c @@ -14,6 +14,9 @@ #include "qemu/error-report.h" #include "sysemu/reset.h" +ram_addr_t initrd_offset; +uint64_t initrd_size; + static const unsigned int slave_boot_code[] = { /* Configure reset ebase. */ 0x0400302c, /* csrwr $r12,0xc*/ @@ -93,6 +96,21 @@ static void init_efi_boot_memmap(struct efi_system_table *systab, } } +static void init_efi_initrd_table(struct efi_system_table *systab, + void *p, void *start) +{ +efi_guid_t tbl_guid = LINUX_EFI_INITRD_MEDIA_GUID; +struct efi_initrd *initrd_table = p; + +/* efi_configuration_table 2 */ +guidcpy(&systab->tables[1].guid, &tbl_guid); +systab->tables[1].table = (struct efi_configuration_table *)(p - start); +systab->nr_tables = 2; + +initrd_table->base = initrd_offset; +initrd_table->size = initrd_size; +} + static void init_systab(struct loongarch_boot_info *info, void *p, void *start) { void *bp_tables_start; @@ -116,6 +134,8 @@ static void init_systab(struct loongarch_boot_info *info, void *p, void *start) init_efi_boot_memmap(systab, p, start); p += ROUND_UP(sizeof(struct efi_boot_memmap) + sizeof(efi_memory_desc_t) * memmap_entries, 64); +init_efi_initrd_table(systab, p, start); +p += ROUND_UP(sizeof(struct efi_initrd), 64); systab->tables = (struct efi_configuration_table *)(bp_tables_start - start); } @@ -137,8 +157,7 @@ static uint64_t cpu_loongarch_virt_to_phys(void *opaque, uint64_t addr) static int64_t load_kernel_info(struct loongarch_boot_info *info) { -uint64_t kernel_entry, kernel_low, kernel_high, initrd_size; -ram_addr_t initrd_offset; +uint64_t kernel_entry, kernel_low, kernel_high; ssize_t kernel_size; kernel_size = load_elf(info->kernel_filename, NULL, -- 2.25.1
[PATCH v5 00/17] Add boot LoongArch elf kernel with FDT
Hi, All We already support boot efi kernel with bios, but not support boot elf kernel. This series adds boot elf kernel with FDT. 'LoongArch supports ACPI and FDT. The information that needs to be passed to the kernel includes the memmap, the initrd, the command line, optionally the ACPI/FDT tables, and so on' see [1]. Patch 2-8 : Create efi system table, and three efi configuration table boot_memmap, initd, FDT. Patch 9-17 : Fixes FDT problems. Test: - Start kernel See [2] start_kernel.sh - Start qcow2 See [2] start_qcow2.sh V5: - Rebase; V4: - patch 3 change slave_boot_code[] to const, and 'static void *p ' to 'void *p'; - patch 4 fixes build error; - patch 10-13, add project and commit link. V3: - Load initrd at kernel_high + 4 * kernel_size; - Load 'boot_rom' at [0 - 1M], the 'boot_rom' includes slave_boot_code, cmdline_buf and systab_tables; - R-b and rebase. V2: - FDT pcie node adds cells 'msi-map'; [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/arch/loongarch/booting.rst?h=v6.7-rc4 [2]: https://github.com/gaosong-loongson/loongarch-binary/releases Please review! Thanks. Song Gao Song Gao (17): hw/loongarch: Move boot fucntions to boot.c hw/loongarch: Add load initrd hw/loongarch: Add slave cpu boot_code hw/loongarch: Add init_cmdline hw/loongarch: Init efi_system_table hw/loongarch: Init efi_boot_memmap table hw/loongarch: Init efi_initrd table hw/loongarch: Init efi_fdt table hw/loongarch: Fix fdt memory node wrong 'reg' hw/loongarch: fdt adds cpu interrupt controller node hw/loongarch: fdt adds Extend I/O Interrupt Controller hw/loongarch: fdt adds pch_pic Controller hw/loongarch: fdt adds pch_msi Controller hw/loongarch: fdt adds pcie irq_map node hw/loongarch: fdt remove unused irqchip node hw/loongarch: Add cells missing from uart node hw/loongarch: Add cells missing from rtc node include/hw/intc/loongarch_extioi.h | 1 + include/hw/loongarch/boot.h| 109 + include/hw/loongarch/virt.h| 14 ++ include/hw/pci-host/ls7a.h | 2 + target/loongarch/cpu.h | 2 + hw/loongarch/boot.c| 330 ++ hw/loongarch/virt.c| 364 - hw/loongarch/meson.build | 1 + 8 files changed, 661 insertions(+), 162 deletions(-) create mode 100644 include/hw/loongarch/boot.h create mode 100644 hw/loongarch/boot.c -- 2.25.1
[PATCH v5 16/17] hw/loongarch: Add cells missing from uart node
uart node need interrupts and interrupt-parent cells. Signed-off-by: Song Gao --- hw/loongarch/virt.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c index e2185d7bb4..5d92b2f1aa 100644 --- a/hw/loongarch/virt.c +++ b/hw/loongarch/virt.c @@ -209,7 +209,8 @@ static void fdt_add_rtc_node(LoongArchMachineState *lams) g_free(nodename); } -static void fdt_add_uart_node(LoongArchMachineState *lams) +static void fdt_add_uart_node(LoongArchMachineState *lams, + uint32_t *pch_pic_phandle) { char *nodename; hwaddr base = VIRT_UART_BASE; @@ -222,6 +223,10 @@ static void fdt_add_uart_node(LoongArchMachineState *lams) qemu_fdt_setprop_cells(ms->fdt, nodename, "reg", 0x0, base, 0x0, size); qemu_fdt_setprop_cell(ms->fdt, nodename, "clock-frequency", 1); qemu_fdt_setprop_string(ms->fdt, "/chosen", "stdout-path", nodename); +qemu_fdt_setprop_cells(ms->fdt, nodename, "interrupts", + VIRT_UART_IRQ - VIRT_GSI_BASE, 0x4); +qemu_fdt_setprop_cell(ms->fdt, nodename, "interrupt-parent", + *pch_pic_phandle); g_free(nodename); } @@ -593,7 +598,7 @@ static void loongarch_devices_init(DeviceState *pch_pic, qdev_get_gpio_in(pch_pic, VIRT_UART_IRQ - VIRT_GSI_BASE), 115200, serial_hd(0), DEVICE_LITTLE_ENDIAN); -fdt_add_uart_node(lams); +fdt_add_uart_node(lams, pch_pic_phandle); /* Network init */ pci_init_nic_devices(pci_bus, mc->default_nic); -- 2.25.1
[PATCH v5 17/17] hw/loongarch: Add cells missing from rtc node
rtc node need interrupts and interrupt-parent cells. Signed-off-by: Song Gao --- hw/loongarch/virt.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c index 5d92b2f1aa..6810f78ebd 100644 --- a/hw/loongarch/virt.c +++ b/hw/loongarch/virt.c @@ -195,7 +195,8 @@ static void fdt_add_flash_node(LoongArchMachineState *lams) g_free(nodename); } -static void fdt_add_rtc_node(LoongArchMachineState *lams) +static void fdt_add_rtc_node(LoongArchMachineState *lams, + uint32_t *pch_pic_phandle) { char *nodename; hwaddr base = VIRT_RTC_REG_BASE; @@ -204,8 +205,13 @@ static void fdt_add_rtc_node(LoongArchMachineState *lams) nodename = g_strdup_printf("/rtc@%" PRIx64, base); qemu_fdt_add_subnode(ms->fdt, nodename); -qemu_fdt_setprop_string(ms->fdt, nodename, "compatible", "loongson,ls7a-rtc"); +qemu_fdt_setprop_string(ms->fdt, nodename, "compatible", +"loongson,ls7a-rtc"); qemu_fdt_setprop_sized_cells(ms->fdt, nodename, "reg", 2, base, 2, size); +qemu_fdt_setprop_cells(ms->fdt, nodename, "interrupts", + VIRT_RTC_IRQ - VIRT_GSI_BASE , 0x4); +qemu_fdt_setprop_cell(ms->fdt, nodename, "interrupt-parent", + *pch_pic_phandle); g_free(nodename); } @@ -611,7 +617,7 @@ static void loongarch_devices_init(DeviceState *pch_pic, sysbus_create_simple("ls7a_rtc", VIRT_RTC_REG_BASE, qdev_get_gpio_in(pch_pic, VIRT_RTC_IRQ - VIRT_GSI_BASE)); -fdt_add_rtc_node(lams); +fdt_add_rtc_node(lams, pch_pic_phandle); /* acpi ged */ lams->acpi_ged = create_acpi_ged(pch_pic, lams); -- 2.25.1
[PATCH v5 09/17] hw/loongarch: Fix fdt memory node wrong 'reg'
The right fdt memory node like [1], not [2] [1] memory@0 { device_type = "memory"; reg = <0x00 0x00 0x00 0x1000>; }; [2] memory@0 { device_type = "memory"; reg = <0x02 0x00 0x02 0x1000>; }; Reviewed-by: Bibo Mao Signed-off-by: Song Gao --- hw/loongarch/virt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c index 47e040628f..5f787338a2 100644 --- a/hw/loongarch/virt.c +++ b/hw/loongarch/virt.c @@ -290,7 +290,7 @@ static void fdt_add_memory_node(MachineState *ms, char *nodename = g_strdup_printf("/memory@%" PRIx64, base); qemu_fdt_add_subnode(ms->fdt, nodename); -qemu_fdt_setprop_cells(ms->fdt, nodename, "reg", 2, base, 2, size); +qemu_fdt_setprop_cells(ms->fdt, nodename, "reg", 0, base, 0, size); qemu_fdt_setprop_string(ms->fdt, nodename, "device_type", "memory"); if (ms->numa_state && ms->numa_state->num_nodes) { -- 2.25.1
[PATCH v5 04/17] hw/loongarch: Add init_cmdline
Add init_cmline and set boot_info->a0, a1 Signed-off-by: Song Gao --- include/hw/loongarch/virt.h | 2 ++ target/loongarch/cpu.h | 2 ++ hw/loongarch/boot.c | 19 +++ 3 files changed, 23 insertions(+) diff --git a/include/hw/loongarch/virt.h b/include/hw/loongarch/virt.h index 6e6a5e487a..94ff15ad66 100644 --- a/include/hw/loongarch/virt.h +++ b/include/hw/loongarch/virt.h @@ -31,6 +31,8 @@ #define VIRT_GED_MEM_ADDR (VIRT_GED_EVT_ADDR + ACPI_GED_EVT_SEL_LEN) #define VIRT_GED_REG_ADDR (VIRT_GED_MEM_ADDR + MEMORY_HOTPLUG_IO_LEN) +#define COMMAND_LINE_SIZE 512 + struct LoongArchMachineState { /*< private >*/ MachineState parent_obj; diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h index ec37579fd6..ce02ef3979 100644 --- a/target/loongarch/cpu.h +++ b/target/loongarch/cpu.h @@ -361,6 +361,8 @@ typedef struct CPUArchState { uint32_t mp_state; /* Store ipistate to access from this struct */ DeviceState *ipistate; + +struct loongarch_boot_info *boot_info; #endif } CPULoongArchState; diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c index 2f398260af..897d4636b6 100644 --- a/hw/loongarch/boot.c +++ b/hw/loongarch/boot.c @@ -62,6 +62,16 @@ static const unsigned int slave_boot_code[] = { 0x4c20, /* jirl $r0,$r1,0 */ }; +static void init_cmdline(struct loongarch_boot_info *info, void *p, void *start) +{ +hwaddr cmdline_addr = (hwaddr)p - (hwaddr)start; + +info->a0 = 1; +info->a1 = cmdline_addr; + +memcpy(p, info->kernel_cmdline, COMMAND_LINE_SIZE); +} + static uint64_t cpu_loongarch_virt_to_phys(void *opaque, uint64_t addr) { return addr & MAKE_64BIT_MASK(0, TARGET_PHYS_ADDR_SPACE_BITS); @@ -121,6 +131,10 @@ static void reset_load_elf(void *opaque) cpu_reset(CPU(cpu)); if (env->load_elf) { + if (cpu == LOONGARCH_CPU(first_cpu)) { +env->gpr[4] = env->boot_info->a0; +env->gpr[5] = env->boot_info->a1; +} cpu_set_pc(CPU(cpu), env->elf_address); } } @@ -160,8 +174,13 @@ static void loongarch_firmware_boot(LoongArchMachineState *lams, static void init_boot_rom(struct loongarch_boot_info *info, void *p) { +void *start = p; + memcpy(p, &slave_boot_code, sizeof(slave_boot_code)); p += sizeof(slave_boot_code); + +init_cmdline(info, p, start); +p += COMMAND_LINE_SIZE; } static void loongarch_direct_kernel_boot(struct loongarch_boot_info *info) -- 2.25.1
[PATCH v5 06/17] hw/loongarch: Init efi_boot_memmap table
Signed-off-by: Song Gao --- include/hw/loongarch/boot.h | 27 + include/hw/loongarch/virt.h | 10 ++ hw/loongarch/boot.c | 39 + hw/loongarch/virt.c | 11 ++- 4 files changed, 78 insertions(+), 9 deletions(-) diff --git a/include/hw/loongarch/boot.h b/include/hw/loongarch/boot.h index aa12a60ffb..7ad25080c5 100644 --- a/include/hw/loongarch/boot.h +++ b/include/hw/loongarch/boot.h @@ -21,6 +21,15 @@ typedef struct { uint8_t b[16]; } efi_guid_t __attribute__((aligned(8))); +#define EFI_GUID(a, b, c, d...) (efi_guid_t){ { \ +(a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \ +(b) & 0xff, ((b) >> 8) & 0xff, \ +(c) & 0xff, ((c) >> 8) & 0xff, d } } + +#define LINUX_EFI_BOOT_MEMMAP_GUID \ +EFI_GUID(0x800f683f, 0xd08b, 0x423a, 0xa2, 0x93, \ + 0x96, 0x5c, 0x3c, 0x6f, 0xe2, 0xb4) + struct efi_config_table { efi_guid_t guid; uint64_t *ptr; @@ -56,6 +65,24 @@ struct efi_system_table { struct efi_configuration_table *tables; }; +typedef struct { +uint32_t type; +uint32_t pad; +uint64_t phys_addr; +uint64_t virt_addr; +uint64_t num_pages; +uint64_t attribute; +} efi_memory_desc_t; + +struct efi_boot_memmap { +uint64_t map_size; +uint64_t desc_size; +uint32_t desc_ver; +uint64_t map_key; +uint64_t buff_size; +efi_memory_desc_t map[32]; +}; + struct loongarch_boot_info { uint64_t ram_size; const char *kernel_filename; diff --git a/include/hw/loongarch/virt.h b/include/hw/loongarch/virt.h index 94ff15ad66..d937586dbb 100644 --- a/include/hw/loongarch/virt.h +++ b/include/hw/loongarch/virt.h @@ -33,6 +33,16 @@ #define COMMAND_LINE_SIZE 512 +extern struct memmap_entry *memmap_table; +extern unsigned memmap_entries; + +struct memmap_entry { +uint64_t address; +uint64_t length; +uint32_t type; +uint32_t reserved; +}; + struct LoongArchMachineState { /*< private >*/ MachineState parent_obj; diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c index 769212fce2..c086070bc8 100644 --- a/hw/loongarch/boot.c +++ b/hw/loongarch/boot.c @@ -62,8 +62,40 @@ static const unsigned int slave_boot_code[] = { 0x4c20, /* jirl $r0,$r1,0 */ }; +static inline void *guidcpy(void *dst, const void *src) +{ +return memcpy(dst, src, sizeof(efi_guid_t)); +} + +static void init_efi_boot_memmap(struct efi_system_table *systab, + void *p, void *start) +{ +unsigned i; +struct efi_boot_memmap *boot_memmap = p; +efi_guid_t tbl_guid = LINUX_EFI_BOOT_MEMMAP_GUID; + +/* efi_configuration_table 1 */ +guidcpy(&systab->tables[0].guid, &tbl_guid); +systab->tables[0].table = (struct efi_configuration_table *)(p - start); +systab->nr_tables = 1; + +boot_memmap->desc_size = sizeof(efi_memory_desc_t); +boot_memmap->desc_ver = 1; +boot_memmap->map_size = 0; + +efi_memory_desc_t *map = p + sizeof(struct efi_boot_memmap); +for (i = 0; i < memmap_entries; i++) { +map = (void *)boot_memmap + sizeof(*map); +map[i].type = memmap_table[i].type; +map[i].phys_addr = memmap_table[i].address; +map[i].num_pages = memmap_table[i].length >> 16; /* 64KB align*/ +p += sizeof(efi_memory_desc_t); +} +} + static void init_systab(struct loongarch_boot_info *info, void *p, void *start) { +void *bp_tables_start; struct efi_system_table *systab = p; info->a2 = (uint64_t)p - (uint64_t)start; @@ -79,6 +111,13 @@ static void init_systab(struct loongarch_boot_info *info, void *p, void *start) p += ROUND_UP(sizeof(struct efi_system_table), 64); systab->tables = p; +bp_tables_start = p; + +init_efi_boot_memmap(systab, p, start); +p += ROUND_UP(sizeof(struct efi_boot_memmap) + + sizeof(efi_memory_desc_t) * memmap_entries, 64); + +systab->tables = (struct efi_configuration_table *)(bp_tables_start - start); } static void init_cmdline(struct loongarch_boot_info *info, void *p, void *start) diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c index d929e4f9bf..47e040628f 100644 --- a/hw/loongarch/virt.c +++ b/hw/loongarch/virt.c @@ -342,15 +342,8 @@ static void virt_powerdown_req(Notifier *notifier, void *opaque) acpi_send_event(s->acpi_ged, ACPI_POWER_DOWN_STATUS); } -struct memmap_entry { -uint64_t address; -uint64_t length; -uint32_t type; -uint32_t reserved; -}; - -static struct memmap_entry *memmap_table; -static unsigned memmap_entries; +struct memmap_entry *memmap_table; +unsigned memmap_entries; static void memmap_add_entry(uint64_t address, uint64_t length, uint32_t type) { -- 2.25.1
[PATCH v5 03/17] hw/loongarch: Add slave cpu boot_code
Signed-off-by: Song Gao --- hw/loongarch/boot.c | 70 - 1 file changed, 69 insertions(+), 1 deletion(-) diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c index 3075c276d4..2f398260af 100644 --- a/hw/loongarch/boot.c +++ b/hw/loongarch/boot.c @@ -14,6 +14,54 @@ #include "qemu/error-report.h" #include "sysemu/reset.h" +static const unsigned int slave_boot_code[] = { + /* Configure reset ebase. */ +0x0400302c, /* csrwr $r12,0xc*/ + + /* Disable interrupt. */ +0x0380100c, /* ori$r12,$r0,0x4*/ +0x04000180, /* csrxchg$r0,$r12,0x0*/ + + /* Clear mailbox. */ +0x142d, /* lu12i.w$r13,1(0x1) */ +0x038081ad, /* ori$r13,$r13,0x20 */ +0x06481da0, /* iocsrwr.d $r0,$r13*/ + + /* Enable IPI interrupt. */ +0x142c, /* lu12i.w$r12,1(0x1) */ +0x0400118c, /* csrxchg$r12,$r12,0x4 */ +0x02fffc0c, /* addi.d $r12,$r0,-1(0xfff) */ +0x142d, /* lu12i.w$r13,1(0x1) */ +0x038011ad, /* ori$r13,$r13,0x4 */ +0x064819ac, /* iocsrwr.w $r12,$r13 */ +0x142d, /* lu12i.w$r13,1(0x1) */ +0x038081ad, /* ori$r13,$r13,0x20 */ + + /* Wait for wakeup <.L11>: */ +0x06488000, /* idle 0x0 */ +0x0340, /* andi $r0,$r0,0x0 */ +0x064809ac, /* iocsrrd.w $r12,$r13 */ +0x43fff59f, /* beqz $r12,-12(0x74) # 48 <.L11> */ + + /* Read and clear IPI interrupt. */ +0x142d, /* lu12i.w$r13,1(0x1) */ +0x064809ac, /* iocsrrd.w $r12,$r13 */ +0x142d, /* lu12i.w$r13,1(0x1) */ +0x038031ad, /* ori$r13,$r13,0xc */ +0x064819ac, /* iocsrwr.w $r12,$r13 */ + + /* Disable IPI interrupt.*/ +0x142c, /* lu12i.w$r12,1(0x1) */ +0x04001180, /* csrxchg$r0,$r12,0x4*/ + + /* Read mail buf and jump to specified entry */ +0x142d, /* lu12i.w$r13,1(0x1) */ +0x038081ad, /* ori$r13,$r13,0x20 */ +0x06480dac, /* iocsrrd.d $r12,$r13 */ +0x00150181, /* move $r1,$r12*/ +0x4c20, /* jirl $r0,$r1,0 */ +}; + static uint64_t cpu_loongarch_virt_to_phys(void *opaque, uint64_t addr) { return addr & MAKE_64BIT_MASK(0, TARGET_PHYS_ADDR_SPACE_BITS); @@ -110,8 +158,15 @@ static void loongarch_firmware_boot(LoongArchMachineState *lams, fw_cfg_add_kernel_info(info, lams->fw_cfg); } +static void init_boot_rom(struct loongarch_boot_info *info, void *p) +{ +memcpy(p, &slave_boot_code, sizeof(slave_boot_code)); +p += sizeof(slave_boot_code); +} + static void loongarch_direct_kernel_boot(struct loongarch_boot_info *info) { +void *p, *bp; int64_t kernel_addr = 0; LoongArchCPU *lacpu; CPUState *cs; @@ -123,11 +178,24 @@ static void loongarch_direct_kernel_boot(struct loongarch_boot_info *info) exit(1); } +/* Load 'boot_rom' at [0 - 1MiB] */ +p = g_malloc0(1 * MiB); +bp = p; +init_boot_rom(info, p); +rom_add_blob_fixed("boot_rom", bp, 1 * MiB, 0); + CPU_FOREACH(cs) { lacpu = LOONGARCH_CPU(cs); lacpu->env.load_elf = true; -lacpu->env.elf_address = kernel_addr; +if (cs == first_cpu) { +lacpu->env.elf_address = kernel_addr; +} else { +lacpu->env.elf_address = 0; +} +lacpu->env.boot_info = info; } + +g_free(bp); } void loongarch_load_kernel(MachineState *ms, struct loongarch_boot_info *info) -- 2.25.1
[PATCH v5 15/17] hw/loongarch: fdt remove unused irqchip node
Signed-off-by: Song Gao --- hw/loongarch/virt.c | 31 +-- 1 file changed, 1 insertion(+), 30 deletions(-) diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c index ea73a80628..e2185d7bb4 100644 --- a/hw/loongarch/virt.c +++ b/hw/loongarch/virt.c @@ -410,34 +410,6 @@ static void fdt_add_pcie_node(const LoongArchMachineState *lams, g_free(nodename); } -static void fdt_add_irqchip_node(LoongArchMachineState *lams) -{ -MachineState *ms = MACHINE(lams); -char *nodename; -uint32_t irqchip_phandle; - -irqchip_phandle = qemu_fdt_alloc_phandle(ms->fdt); -qemu_fdt_setprop_cell(ms->fdt, "/", "interrupt-parent", irqchip_phandle); - -nodename = g_strdup_printf("/intc@%lx", VIRT_IOAPIC_REG_BASE); -qemu_fdt_add_subnode(ms->fdt, nodename); -qemu_fdt_setprop_cell(ms->fdt, nodename, "#interrupt-cells", 3); -qemu_fdt_setprop(ms->fdt, nodename, "interrupt-controller", NULL, 0); -qemu_fdt_setprop_cell(ms->fdt, nodename, "#address-cells", 0x2); -qemu_fdt_setprop_cell(ms->fdt, nodename, "#size-cells", 0x2); -qemu_fdt_setprop(ms->fdt, nodename, "ranges", NULL, 0); - -qemu_fdt_setprop_string(ms->fdt, nodename, "compatible", -"loongarch,ls7a"); - -qemu_fdt_setprop_sized_cells(ms->fdt, nodename, "reg", - 2, VIRT_IOAPIC_REG_BASE, - 2, PCH_PIC_ROUTE_ENTRY_OFFSET); - -qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle", irqchip_phandle); -g_free(nodename); -} - static void fdt_add_memory_node(MachineState *ms, uint64_t base, uint64_t size, int node_id) { @@ -959,8 +931,7 @@ static void loongarch_init(MachineState *machine) /* Initialize the IO interrupt subsystem */ loongarch_irq_init(lams); -fdt_add_irqchip_node(lams); -platform_bus_add_all_fdt_nodes(machine->fdt, "/intc", +platform_bus_add_all_fdt_nodes(machine->fdt, "/platic", VIRT_PLATFORM_BUS_BASEADDRESS, VIRT_PLATFORM_BUS_SIZE, VIRT_PLATFORM_BUS_IRQ); -- 2.25.1
[PATCH v5 01/17] hw/loongarch: Move boot fucntions to boot.c
Move some boot functions to boot.c and struct loongarch_boot_info into struct LoongArchMachineState. Signed-off-by: Song Gao --- include/hw/loongarch/boot.h | 21 ++ include/hw/loongarch/virt.h | 2 + hw/loongarch/boot.c | 125 hw/loongarch/virt.c | 123 +++ hw/loongarch/meson.build| 1 + 5 files changed, 157 insertions(+), 115 deletions(-) create mode 100644 include/hw/loongarch/boot.h create mode 100644 hw/loongarch/boot.c diff --git a/include/hw/loongarch/boot.h b/include/hw/loongarch/boot.h new file mode 100644 index 00..3275c1e295 --- /dev/null +++ b/include/hw/loongarch/boot.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Definitions for LoongArch boot. + * + * Copyright (C) 2023 Loongson Technology Corporation Limited + */ + +#ifndef HW_LOONGARCH_BOOT_H +#define HW_LOONGARCH_BOOT_H + +struct loongarch_boot_info { +uint64_t ram_size; +const char *kernel_filename; +const char *kernel_cmdline; +const char *initrd_filename; +uint64_t a0, a1, a2; +}; + +void loongarch_load_kernel(MachineState *ms, struct loongarch_boot_info *info); + +#endif /* HW_LOONGARCH_BOOT_H */ diff --git a/include/hw/loongarch/virt.h b/include/hw/loongarch/virt.h index 6ef9a92394..6e6a5e487a 100644 --- a/include/hw/loongarch/virt.h +++ b/include/hw/loongarch/virt.h @@ -13,6 +13,7 @@ #include "qemu/queue.h" #include "hw/intc/loongarch_ipi.h" #include "hw/block/flash.h" +#include "hw/loongarch/boot.h" #define LOONGARCH_MAX_CPUS 256 @@ -53,6 +54,7 @@ struct LoongArchMachineState { MemoryRegion system_iocsr; MemoryRegion iocsr_mem; AddressSpace as_iocsr; +struct loongarch_boot_info bootinfo; }; #define TYPE_LOONGARCH_MACHINE MACHINE_TYPE_NAME("virt") diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c new file mode 100644 index 00..0f2bc15fdf --- /dev/null +++ b/hw/loongarch/boot.c @@ -0,0 +1,125 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * LoongArch boot helper functions. + * + * Copyright (c) 2023 Loongson Technology Corporation Limited + */ + +#include "qemu/osdep.h" +#include "qemu/units.h" +#include "target/loongarch/cpu.h" +#include "hw/loongarch/virt.h" +#include "hw/loader.h" +#include "elf.h" +#include "qemu/error-report.h" +#include "sysemu/reset.h" + +static uint64_t cpu_loongarch_virt_to_phys(void *opaque, uint64_t addr) +{ +return addr & MAKE_64BIT_MASK(0, TARGET_PHYS_ADDR_SPACE_BITS); +} + +static int64_t load_kernel_info(struct loongarch_boot_info *info) +{ +uint64_t kernel_entry, kernel_low, kernel_high; +ssize_t kernel_size; + +kernel_size = load_elf(info->kernel_filename, NULL, + cpu_loongarch_virt_to_phys, NULL, + &kernel_entry, &kernel_low, + &kernel_high, NULL, 0, + EM_LOONGARCH, 1, 0); + +if (kernel_size < 0) { +error_report("could not load kernel '%s': %s", + info->kernel_filename, + load_elf_strerror(kernel_size)); +exit(1); +} +return kernel_entry; +} + +static void reset_load_elf(void *opaque) +{ +LoongArchCPU *cpu = opaque; +CPULoongArchState *env = &cpu->env; + +cpu_reset(CPU(cpu)); +if (env->load_elf) { +cpu_set_pc(CPU(cpu), env->elf_address); +} +} + +static void fw_cfg_add_kernel_info(struct loongarch_boot_info *info, + FWCfgState *fw_cfg) +{ +/* + * Expose the kernel, the command line, and the initrd in fw_cfg. + * We don't process them here at all, it's all left to the + * firmware. + */ +load_image_to_fw_cfg(fw_cfg, + FW_CFG_KERNEL_SIZE, FW_CFG_KERNEL_DATA, + info->kernel_filename, + false); + +if (info->initrd_filename) { +load_image_to_fw_cfg(fw_cfg, + FW_CFG_INITRD_SIZE, FW_CFG_INITRD_DATA, + info->initrd_filename, false); +} + +if (info->kernel_cmdline) { +fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, + strlen(info->kernel_cmdline) + 1); +fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, + info->kernel_cmdline); +} +} + +static void loongarch_firmware_boot(LoongArchMachineState *lams, +struct loongarch_boot_info *info) +{ +fw_cfg_add_kernel_info(info, lams->fw_cfg); +} + +static void loongarch_direct_kernel_boot(struct loongarch_boot_info *info) +{ +int64_t kernel_addr = 0; +LoongArchCPU *lacpu; +CPUState *cs; + +if (info->kernel_filename) { +kernel_addr = load_kernel_info(info); +} else { +error_report("Need kernel filename\n"); +exit(1); +} + +CPU_FOREACH(cs) { +lacpu = LOONGARCH_CPU(cs); +lacpu-
Re: [PATCH v5 06/12] tests/plugin/mem: migrate to new per_vcpu API
On 2/29/24 5:46 PM, Alex Bennée wrote: Pierrick Bouvier writes: On 2/29/24 11:08 AM, Alex Bennée wrote: Pierrick Bouvier writes: On 2/29/24 2:08 AM, Alex Bennée wrote: Luc Michel writes: Hi Pierrick, My bad. Other plugins enable only inline when both are supplied, so I missed this here. I added an explicit error when user enable callback and inline at the same time on this plugin. We could default to inline unless we need the more features of callbacks? This remark applies for all plugins in general. Now we have safe and correct inline operations, callbacks are not needed in some case. Would that be acceptable for you that we delay this "cleanup" to existing plugins after soft freeze? So we can focus on merging current API changes needed. As long as we fix the double reporting bug sure. Yes, fixed in v6.
Re: [PATCH v6 06/12] tests/plugin/mem: migrate to new per_vcpu API
On 2/29/24 10:08 PM, Alex Bennée wrote: Pierrick Bouvier writes: Reviewed-by: Luc Michel Signed-off-by: Pierrick Bouvier --- tests/plugin/mem.c | 46 +++--- 1 file changed, 31 insertions(+), 15 deletions(-) diff --git a/tests/plugin/mem.c b/tests/plugin/mem.c index 44e91065ba7..b650dddcce1 100644 --- a/tests/plugin/mem.c +++ b/tests/plugin/mem.c @@ -16,9 +16,14 @@ QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION; -static uint64_t inline_mem_count; -static uint64_t cb_mem_count; -static uint64_t io_count; +typedef struct { +uint64_t mem_count; +uint64_t io_count; +} CPUCount; + +static struct qemu_plugin_scoreboard *counts; +static qemu_plugin_u64 mem_count; +static qemu_plugin_u64 io_count; static bool do_inline, do_callback; static bool do_haddr; static enum qemu_plugin_mem_rw rw = QEMU_PLUGIN_MEM_RW; @@ -27,16 +32,16 @@ static void plugin_exit(qemu_plugin_id_t id, void *p) { g_autoptr(GString) out = g_string_new(""); -if (do_inline) { -g_string_printf(out, "inline mem accesses: %" PRIu64 "\n", inline_mem_count); -} -if (do_callback) { -g_string_append_printf(out, "callback mem accesses: %" PRIu64 "\n", cb_mem_count); +if (do_inline || do_callback) { +g_string_printf(out, "mem accesses: %" PRIu64 "\n", +qemu_plugin_u64_sum(mem_count)); } if (do_haddr) { -g_string_append_printf(out, "io accesses: %" PRIu64 "\n", io_count); +g_string_append_printf(out, "io accesses: %" PRIu64 "\n", + qemu_plugin_u64_sum(io_count)); } qemu_plugin_outs(out->str); +qemu_plugin_scoreboard_free(counts); } static void vcpu_mem(unsigned int cpu_index, qemu_plugin_meminfo_t meminfo, @@ -46,12 +51,12 @@ static void vcpu_mem(unsigned int cpu_index, qemu_plugin_meminfo_t meminfo, struct qemu_plugin_hwaddr *hwaddr; hwaddr = qemu_plugin_get_hwaddr(meminfo, vaddr); if (qemu_plugin_hwaddr_is_io(hwaddr)) { -io_count++; +qemu_plugin_u64_add(io_count, cpu_index, 1); } else { -cb_mem_count++; +qemu_plugin_u64_add(mem_count, cpu_index, 1); } } else { -cb_mem_count++; +qemu_plugin_u64_add(mem_count, cpu_index, 1); } } @@ -64,9 +69,10 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb) struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i); if (do_inline) { -qemu_plugin_register_vcpu_mem_inline(insn, rw, - QEMU_PLUGIN_INLINE_ADD_U64, - &inline_mem_count, 1); +qemu_plugin_register_vcpu_mem_inline_per_vcpu( +insn, rw, +QEMU_PLUGIN_INLINE_ADD_U64, +mem_count, 1); } if (do_callback) { qemu_plugin_register_vcpu_mem_cb(insn, vcpu_mem, @@ -117,6 +123,16 @@ QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id, } } +if (do_inline && do_callback) { +fprintf(stderr, +"can't enable inline and callback counting at the same time\n"); +return -1; +} + +counts = qemu_plugin_scoreboard_new(sizeof(CPUCount)); +mem_count = qemu_plugin_scoreboard_u64_in_struct( +counts, CPUCount, mem_count); +io_count = qemu_plugin_scoreboard_u64_in_struct(counts, CPUCount, io_count); qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans); qemu_plugin_register_atexit_cb(id, plugin_exit, NULL); return 0; You need the following to keep check-tcg working: modified tests/tcg/Makefile.target @@ -168,7 +168,7 @@ RUN_TESTS+=$(EXTRA_RUNS) # Some plugins need additional arguments above the default to fully # exercise things. We can define them on a per-test basis here. -run-plugin-%-with-libmem.so: PLUGIN_ARGS=$(COMMA)inline=true$(COMMA)callback=true +run-plugin-%-with-libmem.so: PLUGIN_ARGS=$(COMMA)inline=true ifeq ($(filter %-softmmu, $(TARGET)),) run-%: % Missed that, will fix in next version. Thanks!
Re: [PATCH v5 00/17] Add boot LoongArch elf kernel with FDT
Hi, If there are no new comments, I'll add this series to the loongarch-next branch next week. Thanks. Song Gao 在 2024/3/1 下午5:38, Song Gao 写道: Hi, All We already support boot efi kernel with bios, but not support boot elf kernel. This series adds boot elf kernel with FDT. 'LoongArch supports ACPI and FDT. The information that needs to be passed to the kernel includes the memmap, the initrd, the command line, optionally the ACPI/FDT tables, and so on' see [1]. Patch 2-8 : Create efi system table, and three efi configuration table boot_memmap, initd, FDT. Patch 9-17 : Fixes FDT problems. Test: - Start kernel See [2] start_kernel.sh - Start qcow2 See [2] start_qcow2.sh V5: - Rebase; V4: - patch 3 change slave_boot_code[] to const, and 'static void *p ' to 'void *p'; - patch 4 fixes build error; - patch 10-13, add project and commit link. V3: - Load initrd at kernel_high + 4 * kernel_size; - Load 'boot_rom' at [0 - 1M], the 'boot_rom' includes slave_boot_code, cmdline_buf and systab_tables; - R-b and rebase. V2: - FDT pcie node adds cells 'msi-map'; [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/arch/loongarch/booting.rst?h=v6.7-rc4 [2]: https://github.com/gaosong-loongson/loongarch-binary/releases Please review! Thanks. Song Gao Song Gao (17): hw/loongarch: Move boot fucntions to boot.c hw/loongarch: Add load initrd hw/loongarch: Add slave cpu boot_code hw/loongarch: Add init_cmdline hw/loongarch: Init efi_system_table hw/loongarch: Init efi_boot_memmap table hw/loongarch: Init efi_initrd table hw/loongarch: Init efi_fdt table hw/loongarch: Fix fdt memory node wrong 'reg' hw/loongarch: fdt adds cpu interrupt controller node hw/loongarch: fdt adds Extend I/O Interrupt Controller hw/loongarch: fdt adds pch_pic Controller hw/loongarch: fdt adds pch_msi Controller hw/loongarch: fdt adds pcie irq_map node hw/loongarch: fdt remove unused irqchip node hw/loongarch: Add cells missing from uart node hw/loongarch: Add cells missing from rtc node include/hw/intc/loongarch_extioi.h | 1 + include/hw/loongarch/boot.h| 109 + include/hw/loongarch/virt.h| 14 ++ include/hw/pci-host/ls7a.h | 2 + target/loongarch/cpu.h | 2 + hw/loongarch/boot.c| 330 ++ hw/loongarch/virt.c| 364 - hw/loongarch/meson.build | 1 + 8 files changed, 661 insertions(+), 162 deletions(-) create mode 100644 include/hw/loongarch/boot.h create mode 100644 hw/loongarch/boot.c
[PATCH v5 14/17] hw/loongarch: fdt adds pcie irq_map node
Signed-off-by: Song Gao --- hw/loongarch/virt.c | 73 ++--- 1 file changed, 69 insertions(+), 4 deletions(-) diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c index e1fe1ea97f..ea73a80628 100644 --- a/hw/loongarch/virt.c +++ b/hw/loongarch/virt.c @@ -316,7 +316,62 @@ static void fdt_add_fw_cfg_node(const LoongArchMachineState *lams) g_free(nodename); } -static void fdt_add_pcie_node(const LoongArchMachineState *lams) +static void fdt_add_pcie_irq_map_node(const LoongArchMachineState *lams, + char *nodename, + uint32_t *pch_pic_phandle) +{ +int pin, dev; +uint32_t irq_map_stride = 0; +uint32_t full_irq_map[GPEX_NUM_IRQS *GPEX_NUM_IRQS * 10] = {}; +uint32_t *irq_map = full_irq_map; +const MachineState *ms = MACHINE(lams); + +/* This code creates a standard swizzle of interrupts such that + * each device's first interrupt is based on it's PCI_SLOT number. + * (See pci_swizzle_map_irq_fn()) + * + * We only need one entry per interrupt in the table (not one per + * possible slot) seeing the interrupt-map-mask will allow the table + * to wrap to any number of devices. + */ + +for (dev = 0; dev < GPEX_NUM_IRQS; dev++) { +int devfn = dev * 0x8; + +for (pin = 0; pin < GPEX_NUM_IRQS; pin++) { +int irq_nr = 16 + ((pin + PCI_SLOT(devfn)) % GPEX_NUM_IRQS); +int i = 0; + +/* Fill PCI address cells */ +irq_map[i] = cpu_to_be32(devfn << 8); +i += 3; + +/* Fill PCI Interrupt cells */ +irq_map[i] = cpu_to_be32(pin + 1); +i += 1; + +/* Fill interrupt controller phandle and cells */ +irq_map[i++] = cpu_to_be32(*pch_pic_phandle); +irq_map[i++] = cpu_to_be32(irq_nr); + +if (!irq_map_stride) { +irq_map_stride = i; +} +irq_map += irq_map_stride; +} +} + + +qemu_fdt_setprop(ms->fdt, nodename, "interrupt-map", full_irq_map, + GPEX_NUM_IRQS * GPEX_NUM_IRQS * + irq_map_stride * sizeof(uint32_t)); +qemu_fdt_setprop_cells(ms->fdt, nodename, "interrupt-map-mask", + 0x1800, 0, 0, 0x7); +} + +static void fdt_add_pcie_node(const LoongArchMachineState *lams, + uint32_t *pch_pic_phandle, + uint32_t *pch_msi_phandle) { char *nodename; hwaddr base_mmio = VIRT_PCI_MEM_BASE; @@ -347,6 +402,11 @@ static void fdt_add_pcie_node(const LoongArchMachineState *lams) 2, base_pio, 2, size_pio, 1, FDT_PCI_RANGE_MMIO, 2, base_mmio, 2, base_mmio, 2, size_mmio); +qemu_fdt_setprop_cells(ms->fdt, nodename, "msi-map", + 0, *pch_msi_phandle, 0, 0x1); + +fdt_add_pcie_irq_map_node(lams, nodename, pch_pic_phandle); + g_free(nodename); } @@ -505,7 +565,10 @@ static DeviceState *create_platform_bus(DeviceState *pch_pic) return dev; } -static void loongarch_devices_init(DeviceState *pch_pic, LoongArchMachineState *lams) +static void loongarch_devices_init(DeviceState *pch_pic, + LoongArchMachineState *lams, + uint32_t *pch_pic_phandle, + uint32_t *pch_msi_phandle) { MachineClass *mc = MACHINE_GET_CLASS(lams); DeviceState *gpex_dev; @@ -551,6 +614,9 @@ static void loongarch_devices_init(DeviceState *pch_pic, LoongArchMachineState * gpex_set_irq_num(GPEX_HOST(gpex_dev), i, 16 + i); } +/* Add pcie node */ +fdt_add_pcie_node(lams, pch_pic_phandle, pch_msi_phandle); + serial_mm_init(get_system_memory(), VIRT_UART_BASE, 0, qdev_get_gpio_in(pch_pic, VIRT_UART_IRQ - VIRT_GSI_BASE), @@ -697,7 +763,7 @@ static void loongarch_irq_init(LoongArchMachineState *lams) /* Add PCH MSI node */ fdt_add_pch_msi_node(lams, &eiointc_phandle, &pch_msi_phandle); -loongarch_devices_init(pch_pic, lams); +loongarch_devices_init(pch_pic, lams, &pch_pic_phandle, &pch_msi_phandle); } static void loongarch_firmware_init(LoongArchMachineState *lams) @@ -904,7 +970,6 @@ static void loongarch_init(MachineState *machine) lams->powerdown_notifier.notify = virt_powerdown_req; qemu_register_powerdown_notifier(&lams->powerdown_notifier); -fdt_add_pcie_node(lams); /* * Since lowmem region starts from 0 and Linux kernel legacy start address * at 2 MiB, FDT base address is located at 1 MiB to avoid NULL pointer -- 2.25.1
[PATCH v5 05/17] hw/loongarch: Init efi_system_table
Add init_systab and set boot_info->a2 Signed-off-by: Song Gao --- include/hw/loongarch/boot.h | 48 + hw/loongarch/boot.c | 22 + 2 files changed, 70 insertions(+) diff --git a/include/hw/loongarch/boot.h b/include/hw/loongarch/boot.h index 3275c1e295..aa12a60ffb 100644 --- a/include/hw/loongarch/boot.h +++ b/include/hw/loongarch/boot.h @@ -8,6 +8,54 @@ #ifndef HW_LOONGARCH_BOOT_H #define HW_LOONGARCH_BOOT_H +/* UEFI 2.10 */ +#define EFI_SYSTEM_TABLE_SIGNATURE 0x5453595320494249 +#define EFI_2_100_SYSTEM_TABLE_REVISION ((2<<16) | (100)) +#define EFI_SPECIFICATION_VERSIONEFI_SYSTEM_TABLE_REVISION +#define EFI_SYSTEM_TABLE_REVISIONEFI_2_100_SYSTEM_TABLE_REVISION + +#define FW_VERSION 0x1 +#define FW_PATCHLEVEL 0x0 + +typedef struct { +uint8_t b[16]; +} efi_guid_t __attribute__((aligned(8))); + +struct efi_config_table { +efi_guid_t guid; +uint64_t *ptr; +const char name[16]; +}; + +typedef struct { +uint64_t signature; +uint32_t revision; +uint32_t headersize; +uint32_t crc32; +uint32_t reserved; +} efi_table_hdr_t; + +struct efi_configuration_table { +efi_guid_t guid; +void *table; +}; + +struct efi_system_table { +efi_table_hdr_t hdr; +uint64_t fw_vendor;/* physical addr of CHAR16 vendor string */ +uint32_t fw_revision; +uint64_t con_in_handle; +uint64_t *con_in; +uint64_t con_out_handle; +uint64_t *con_out; +uint64_t stderr_handle; +uint64_t stderr; +uint64_t *runtime; +uint64_t *boottime; +uint64_t nr_tables; +struct efi_configuration_table *tables; +}; + struct loongarch_boot_info { uint64_t ram_size; const char *kernel_filename; diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c index 897d4636b6..769212fce2 100644 --- a/hw/loongarch/boot.c +++ b/hw/loongarch/boot.c @@ -62,6 +62,25 @@ static const unsigned int slave_boot_code[] = { 0x4c20, /* jirl $r0,$r1,0 */ }; +static void init_systab(struct loongarch_boot_info *info, void *p, void *start) +{ +struct efi_system_table *systab = p; + +info->a2 = (uint64_t)p - (uint64_t)start; + +systab->hdr.signature = EFI_SYSTEM_TABLE_SIGNATURE; +systab->hdr.revision = EFI_SPECIFICATION_VERSION; +systab->hdr.revision = sizeof(struct efi_system_table), +systab->fw_revision = FW_VERSION << 16 | FW_PATCHLEVEL << 8; +systab->runtime = 0; +systab->boottime = 0; +systab->nr_tables = 0; + +p += ROUND_UP(sizeof(struct efi_system_table), 64); + +systab->tables = p; +} + static void init_cmdline(struct loongarch_boot_info *info, void *p, void *start) { hwaddr cmdline_addr = (hwaddr)p - (hwaddr)start; @@ -134,6 +153,7 @@ static void reset_load_elf(void *opaque) if (cpu == LOONGARCH_CPU(first_cpu)) { env->gpr[4] = env->boot_info->a0; env->gpr[5] = env->boot_info->a1; +env->gpr[6] = env->boot_info->a2; } cpu_set_pc(CPU(cpu), env->elf_address); } @@ -181,6 +201,8 @@ static void init_boot_rom(struct loongarch_boot_info *info, void *p) init_cmdline(info, p, start); p += COMMAND_LINE_SIZE; + +init_systab(info, p, start); } static void loongarch_direct_kernel_boot(struct loongarch_boot_info *info) -- 2.25.1
[PATCH v5 12/17] hw/loongarch: fdt adds pch_pic Controller
fdt adds pch pic controller, we use 'loongson,pch-pic-1.0' See: https://github.com/torvalds/linux/blob/v6.7/drivers/irqchip/irq-loongson-pch-pic.c https://lore.kernel.org/r/20200528152757.1028711-4-jiaxun.y...@flygoat.com Signed-off-by: Song Gao --- include/hw/pci-host/ls7a.h | 1 + hw/loongarch/virt.c| 30 +- 2 files changed, 30 insertions(+), 1 deletion(-) diff --git a/include/hw/pci-host/ls7a.h b/include/hw/pci-host/ls7a.h index e753449593..fe260f0183 100644 --- a/include/hw/pci-host/ls7a.h +++ b/include/hw/pci-host/ls7a.h @@ -24,6 +24,7 @@ #define VIRT_PCH_REG_BASE0x1000UL #define VIRT_IOAPIC_REG_BASE (VIRT_PCH_REG_BASE) #define VIRT_PCH_MSI_ADDR_LOW0x2FF0UL +#define VIRT_PCH_REG_SIZE0x400 /* * GSI_BASE is hard-coded with 64 in linux kernel, else kernel fails to boot diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c index 4091a14080..87908d1f79 100644 --- a/hw/loongarch/virt.c +++ b/hw/loongarch/virt.c @@ -125,6 +125,31 @@ static void fdt_add_eiointc_node(LoongArchMachineState *lams, g_free(nodename); } +static void fdt_add_pch_pic_node(LoongArchMachineState *lams, + uint32_t *eiointc_phandle, + uint32_t *pch_pic_phandle) +{ +MachineState *ms = MACHINE(lams); +char *nodename; +hwaddr pch_pic_base = VIRT_PCH_REG_BASE; +hwaddr pch_pic_size = VIRT_PCH_REG_SIZE; + +*pch_pic_phandle = qemu_fdt_alloc_phandle(ms->fdt); +nodename = g_strdup_printf("/platic@%" PRIx64, pch_pic_base); +qemu_fdt_add_subnode(ms->fdt, nodename); +qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle", *pch_pic_phandle); +qemu_fdt_setprop_string(ms->fdt, nodename, "compatible", +"loongson,pch-pic-1.0"); +qemu_fdt_setprop_cells(ms->fdt, nodename, "reg", 0, + pch_pic_base, 0, pch_pic_size); +qemu_fdt_setprop(ms->fdt, nodename, "interrupt-controller", NULL, 0); +qemu_fdt_setprop_cell(ms->fdt, nodename, "#interrupt-cells", 2); +qemu_fdt_setprop_cell(ms->fdt, nodename, "interrupt-parent", + *eiointc_phandle); +qemu_fdt_setprop_cell(ms->fdt, nodename, "loongson,pic-base-vec", 0); +g_free(nodename); +} + static void fdt_add_flash_node(LoongArchMachineState *lams) { MachineState *ms = MACHINE(lams); @@ -533,7 +558,7 @@ static void loongarch_irq_init(LoongArchMachineState *lams) CPULoongArchState *env; CPUState *cpu_state; int cpu, pin, i, start, num; -uint32_t cpuintc_phandle, eiointc_phandle; +uint32_t cpuintc_phandle, eiointc_phandle, pch_pic_phandle; /* * The connection of interrupts: @@ -624,6 +649,9 @@ static void loongarch_irq_init(LoongArchMachineState *lams) qdev_connect_gpio_out(DEVICE(d), i, qdev_get_gpio_in(extioi, i)); } +/* Add PCH PIC node */ +fdt_add_pch_pic_node(lams, &eiointc_phandle, &pch_pic_phandle); + pch_msi = qdev_new(TYPE_LOONGARCH_PCH_MSI); start = num; num = EXTIOI_IRQS - start; -- 2.25.1
[PATCH v5 13/17] hw/loongarch: fdt adds pch_msi Controller
fdt adds pch msi controller, we use 'loongson,pch-msi-1.0'. See: https://github.com/torvalds/linux/blob/v6.7/drivers/irqchip/irq-loongson-pch-msi.c https://lore.kernel.org/r/20200528152757.1028711-6-jiaxun.y...@flygoat.com Signed-off-by: Song Gao --- include/hw/pci-host/ls7a.h | 1 + hw/loongarch/virt.c| 33 - 2 files changed, 33 insertions(+), 1 deletion(-) diff --git a/include/hw/pci-host/ls7a.h b/include/hw/pci-host/ls7a.h index fe260f0183..cd7c9ec7bc 100644 --- a/include/hw/pci-host/ls7a.h +++ b/include/hw/pci-host/ls7a.h @@ -25,6 +25,7 @@ #define VIRT_IOAPIC_REG_BASE (VIRT_PCH_REG_BASE) #define VIRT_PCH_MSI_ADDR_LOW0x2FF0UL #define VIRT_PCH_REG_SIZE0x400 +#define VIRT_PCH_MSI_SIZE0x8 /* * GSI_BASE is hard-coded with 64 in linux kernel, else kernel fails to boot diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c index 87908d1f79..e1fe1ea97f 100644 --- a/hw/loongarch/virt.c +++ b/hw/loongarch/virt.c @@ -150,6 +150,34 @@ static void fdt_add_pch_pic_node(LoongArchMachineState *lams, g_free(nodename); } +static void fdt_add_pch_msi_node(LoongArchMachineState *lams, + uint32_t *eiointc_phandle, + uint32_t *pch_msi_phandle) +{ +MachineState *ms = MACHINE(lams); +char *nodename; +hwaddr pch_msi_base = VIRT_PCH_MSI_ADDR_LOW; +hwaddr pch_msi_size = VIRT_PCH_MSI_SIZE; + +*pch_msi_phandle = qemu_fdt_alloc_phandle(ms->fdt); +nodename = g_strdup_printf("/msi@%" PRIx64, pch_msi_base); +qemu_fdt_add_subnode(ms->fdt, nodename); +qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle", *pch_msi_phandle); +qemu_fdt_setprop_string(ms->fdt, nodename, "compatible", +"loongson,pch-msi-1.0"); +qemu_fdt_setprop_cells(ms->fdt, nodename, "reg", + 0, pch_msi_base, + 0, pch_msi_size); +qemu_fdt_setprop(ms->fdt, nodename, "interrupt-controller", NULL, 0); +qemu_fdt_setprop_cell(ms->fdt, nodename, "interrupt-parent", + *eiointc_phandle); +qemu_fdt_setprop_cell(ms->fdt, nodename, "loongson,msi-base-vec", + VIRT_PCH_PIC_IRQ_NUM); +qemu_fdt_setprop_cell(ms->fdt, nodename, "loongson,msi-num-vecs", + EXTIOI_IRQS - VIRT_PCH_PIC_IRQ_NUM); +g_free(nodename); +} + static void fdt_add_flash_node(LoongArchMachineState *lams) { MachineState *ms = MACHINE(lams); @@ -558,7 +586,7 @@ static void loongarch_irq_init(LoongArchMachineState *lams) CPULoongArchState *env; CPUState *cpu_state; int cpu, pin, i, start, num; -uint32_t cpuintc_phandle, eiointc_phandle, pch_pic_phandle; +uint32_t cpuintc_phandle, eiointc_phandle, pch_pic_phandle, pch_msi_phandle; /* * The connection of interrupts: @@ -666,6 +694,9 @@ static void loongarch_irq_init(LoongArchMachineState *lams) qdev_get_gpio_in(extioi, i + start)); } +/* Add PCH MSI node */ +fdt_add_pch_msi_node(lams, &eiointc_phandle, &pch_msi_phandle); + loongarch_devices_init(pch_pic, lams); } -- 2.25.1
[PATCH v5 11/17] hw/loongarch: fdt adds Extend I/O Interrupt Controller
fdt adds Extend I/O Interrupt Controller, we use 'loongson,ls2k2000-eiointc'. See: https://github.com/torvalds/linux/blob/v6.7/drivers/irqchip/irq-loongson-eiointc.c https://lore.kernel.org/r/764e02d924094580ac0f1d15535f4b98308705c6.1683279769.git.zhoubin...@loongson.cn Signed-off-by: Song Gao --- include/hw/intc/loongarch_extioi.h | 1 + hw/loongarch/virt.c| 30 +- 2 files changed, 30 insertions(+), 1 deletion(-) diff --git a/include/hw/intc/loongarch_extioi.h b/include/hw/intc/loongarch_extioi.h index a0a46b888c..410c6e1121 100644 --- a/include/hw/intc/loongarch_extioi.h +++ b/include/hw/intc/loongarch_extioi.h @@ -39,6 +39,7 @@ #define EXTIOI_COREISR_END (0xB20 - APIC_OFFSET) #define EXTIOI_COREMAP_START (0xC00 - APIC_OFFSET) #define EXTIOI_COREMAP_END (0xD00 - APIC_OFFSET) +#define EXTIOI_SIZE 0x800 typedef struct ExtIOICore { uint32_t coreisr[EXTIOI_IRQS_GROUP_COUNT]; diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c index a8374c35a4..4091a14080 100644 --- a/hw/loongarch/virt.c +++ b/hw/loongarch/virt.c @@ -100,6 +100,31 @@ static void fdt_add_cpuic_node(LoongArchMachineState *lams, g_free(nodename); } +static void fdt_add_eiointc_node(LoongArchMachineState *lams, + uint32_t *cpuintc_phandle, + uint32_t *eiointc_phandle) +{ +MachineState *ms = MACHINE(lams); +char *nodename; +hwaddr extioi_base = APIC_BASE; +hwaddr extioi_size = EXTIOI_SIZE; + +*eiointc_phandle = qemu_fdt_alloc_phandle(ms->fdt); +nodename = g_strdup_printf("/eiointc@%" PRIx64, extioi_base); +qemu_fdt_add_subnode(ms->fdt, nodename); +qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle", *eiointc_phandle); +qemu_fdt_setprop_string(ms->fdt, nodename, "compatible", +"loongson,ls2k2000-eiointc"); +qemu_fdt_setprop(ms->fdt, nodename, "interrupt-controller", NULL, 0); +qemu_fdt_setprop_cell(ms->fdt, nodename, "#interrupt-cells", 1); +qemu_fdt_setprop_cell(ms->fdt, nodename, "interrupt-parent", + *cpuintc_phandle); +qemu_fdt_setprop_cell(ms->fdt, nodename, "interrupts", 3); +qemu_fdt_setprop_cells(ms->fdt, nodename, "reg", 0x0, + extioi_base, 0x0, extioi_size); +g_free(nodename); +} + static void fdt_add_flash_node(LoongArchMachineState *lams) { MachineState *ms = MACHINE(lams); @@ -508,7 +533,7 @@ static void loongarch_irq_init(LoongArchMachineState *lams) CPULoongArchState *env; CPUState *cpu_state; int cpu, pin, i, start, num; -uint32_t cpuintc_phandle; +uint32_t cpuintc_phandle, eiointc_phandle; /* * The connection of interrupts: @@ -577,6 +602,9 @@ static void loongarch_irq_init(LoongArchMachineState *lams) } } +/* Add Extend I/O Interrupt Controller node */ +fdt_add_eiointc_node(lams, &cpuintc_phandle, &eiointc_phandle); + pch_pic = qdev_new(TYPE_LOONGARCH_PCH_PIC); num = VIRT_PCH_PIC_IRQ_NUM; qdev_prop_set_uint32(pch_pic, "pch_pic_irq_num", num); -- 2.25.1
Re: [PATCH v5 08/12] tests/plugin/bb: migrate to new per_vcpu API
On 2/29/24 6:21 PM, Alex Bennée wrote: Pierrick Bouvier writes: Signed-off-by: Pierrick Bouvier I did notice there is a discrepancy between what libisns and libb report. The libb looks like an overcount so I wonder if there are some instructions we are not picking up but I can't see where that would be. ➜ ./qemu-hppa -plugin ./tests/plugin/libinsn.so -plugin ./tests/plugin/libbb.so,inline=true -d plugin ./tests/tcg/hppa-linux-user/sha512 1..10 ok 1 - do_test(&tests[i]) ok 2 - do_test(&tests[i]) ok 3 - do_test(&tests[i]) ok 4 - do_test(&tests[i]) ok 5 - do_test(&tests[i]) ok 6 - do_test(&tests[i]) ok 7 - do_test(&tests[i]) ok 8 - do_test(&tests[i]) ok 9 - do_test(&tests[i]) ok 10 - do_test(&tests[i]) CPU0: bb's: 54282, insns: 775697 Total: bb's: 54282, insns: 775697 cpu 0 insns: 774827 total insns: 774827 Although weirdly maybe only an hppa thing. Richard? Do you observe the exact same number if you run only one of the plugin? bb count number of instructions in an executed block, while insn effectively count every instructions ran. Maybe there is hppa specifity that makes some tb exit in the middle, thus executing less instructions than expected from bb count. I don't know how to reproduce this test. Did you run it from a specific docker env? ➜ ./qemu-aarch64 -plugin ./tests/plugin/libinsn.so -plugin ./tests/plugin/libbb.so,inline=true -d plugin ./tests/tcg/aarch64-linux-user/sha512 1..10 ok 1 - do_test(&tests[i]) ok 2 - do_test(&tests[i]) ok 3 - do_test(&tests[i]) ok 4 - do_test(&tests[i]) ok 5 - do_test(&tests[i]) ok 6 - do_test(&tests[i]) ok 7 - do_test(&tests[i]) ok 8 - do_test(&tests[i]) ok 9 - do_test(&tests[i]) ok 10 - do_test(&tests[i]) CPU0: bb's: 41513, insns: 302671 Total: bb's: 41513, insns: 302671 cpu 0 insns: 302671 total insns: 302671 Anyway: Reviewed-by: Alex Bennée
[PATCH 0/1] kvm: add support for guest physical bits
The matching kernel bits are here: https://lore.kernel.org/kvm/20240301101410.356007-1-kra...@redhat.com/T/ Gerd Hoffmann (1): kvm: add support for guest physical bits target/i386/cpu.h | 1 + target/i386/cpu.c | 1 + target/i386/kvm/kvm.c | 8 3 files changed, 10 insertions(+) -- 2.44.0
[PATCH 1/1] kvm: add support for guest physical bits
query kvm for supported guest physical address bits using KVM_CAP_VM_GPA_BITS. Expose the value to the guest via cpuid (leaf 0x8008, eax, bits 16-23). Signed-off-by: Gerd Hoffmann --- target/i386/cpu.h | 1 + target/i386/cpu.c | 1 + target/i386/kvm/kvm.c | 8 3 files changed, 10 insertions(+) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 952174bb6f52..d427218827f6 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -2026,6 +2026,7 @@ struct ArchCPU { /* Number of physical address bits supported */ uint32_t phys_bits; +uint32_t guest_phys_bits; /* in order to simplify APIC support, we leave this pointer to the user */ diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 2666ef380891..1a6cfc75951e 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6570,6 +6570,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) { /* 64 bit processor */ *eax |= (cpu_x86_virtual_addr_width(env) << 8); + *eax |= (cpu->guest_phys_bits << 16); } *ebx = env->features[FEAT_8000_0008_EBX]; if (cs->nr_cores * cs->nr_threads > 1) { diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 42970ab046fa..e06c9d66bb01 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -1716,6 +1716,7 @@ int kvm_arch_init_vcpu(CPUState *cs) X86CPU *cpu = X86_CPU(cs); CPUX86State *env = &cpu->env; uint32_t limit, i, j, cpuid_i; +uint32_t guest_phys_bits; uint32_t unused; struct kvm_cpuid_entry2 *c; uint32_t signature[3]; @@ -1751,6 +1752,13 @@ int kvm_arch_init_vcpu(CPUState *cs) env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY; +guest_phys_bits = kvm_check_extension(cs->kvm_state, KVM_CAP_VM_GPA_BITS); +if (guest_phys_bits && +(cpu->guest_phys_bits == 0 || + cpu->guest_phys_bits > guest_phys_bits)) { +cpu->guest_phys_bits = guest_phys_bits; +} + /* * kvm_hyperv_expand_features() is called here for the second time in case * KVM_CAP_SYS_HYPERV_CPUID is not supported. While we can't possibly handle -- 2.44.0
Re: [PATCH v2 0/5] Add device DM163 (led driver, matrix colors shield & display)
On Wed, 28 Feb 2024 at 12:06, Thomas Huth wrote: > > On 28/02/2024 12.31, Inès Varhol wrote: > > TLDR: how can I provide a test or an example? > > > > I've tested the display by running custom executables and > > comparing to the result on the real board, but I don't > > know how to test it using a QTest. > I'd suggest to try to take a screenshot via QMP or HMP. If you need a third > party binary to get something on the screen, it's maybe best to write an > avocado test instead of a qtest. See for example > tests/avocado/machine_m68k_nextcube.py - it takes a screenshot and then > converts the console messages from it via Tessaract to text and looks for > the expected strings there. Alternatively, have a look at > tests/avocado/machine_s390_ccw_virtio.py - it boots a Linux in the guest > that writes a string into the /dev/fb0 framebuffer, then takes a screendump > and then tries to find the string in the file. The latter could maybe also > be done via a qtest if you know how to set up the framebuffer without a > guest kernel. The other extremely popular approach for testing display devices is "I looked at the screen while I was doing the development, and it was fine" :-) -- PMM
Re: [PULL 26/29] contrib/plugins: extend execlog to track register changes
Zhao Liu writes: > Hi Alex, > > I hit the following warnings (with "./configure --enable-werror"): > > /qemu/contrib/plugins/execlog.c: In function ‘registers_init’: > /qemu/contrib/plugins/execlog.c:330:17: warning: ‘g_pattern_match_string’ is > deprecated: Use 'g_pattern_spec_match_string' instead > [-Wdeprecated-declarations] > 330 | if (g_pattern_match_string(pat, rd->name) || > | ^~ > In file included from /usr/include/glib-2.0/glib.h:65, > from /qemu/contrib/plugins/execlog.c:9: > /usr/include/glib-2.0/glib/gpattern.h:55:15: note: declared here >55 | gboolean g_pattern_match_string (GPatternSpec *pspec, > | ^~ > /qemu/contrib/plugins/execlog.c:331:21: warning: ‘g_pattern_match_string’ is > deprecated: Use 'g_pattern_spec_match_string' instead > [-Wdeprecated-declarations] > 331 | g_pattern_match_string(pat, rd_lower)) { > | ^~ > In file included from /usr/include/glib-2.0/glib.h:65, > from /qemu/contrib/plugins/execlog.c:9: > /usr/include/glib-2.0/glib/gpattern.h:55:15: note: declared here >55 | gboolean g_pattern_match_string (GPatternSpec *pspec, > | ^~ > /qemu/contrib/plugins/execlog.c:339:63: warning: passing argument 2 of > ‘g_ptr_array_add’ discards ‘const’ qualifier from pointer target type > [-Wdiscarded-qualifiers] > 339 | g_ptr_array_add(all_reg_names, > reg->name); Hmm I missed that. Not sure what the neatest solution is in this case - g_ptr_array_new() doesn't have a destroy func so we shouldn't ever attempt to free it. We can manually re-add the const qualifier at the other end for completeness and I guess comment and cast? > |~~~^~ > In file included from /usr/include/glib-2.0/glib.h:31, > from /qemu/contrib/plugins/execlog.c:9: > /usr/include/glib-2.0/glib/garray.h:192:62: note: expected ‘gpointer’ {aka > ‘void *’} but argument is of type ‘const char *’ > 192 |gpointer data); > |~~^~~~ > > In addition, I checked my glic version: > > $ldd --version > ldd (Ubuntu GLIBC 2.35-0ubuntu3.5) 2.35 > > I think it's v2.35. Are these three warning reports valid? It's the glib (not glibc) version that matters here. g_pattern_match_string was deprecated in 2.70 when the suggested alternative was added. However our baseline for glib is still: # When bumping glib minimum version, please check also whether to increase # the _WIN32_WINNT setting in osdep.h according to the value from glib glib_req_ver = '>=2.56.0' glib_pc = dependency('glib-2.0', version: glib_req_ver, required: true, method: 'pkg-config') The usual solution for this is to throw in a compat wrapper in glib-compat.h: --8<---cut here---start->8--- modified include/glib-compat.h @@ -105,6 +105,24 @@ static inline gpointer g_memdup2_qemu(gconstpointer mem, gsize byte_size) } #define g_memdup2(m, s) g_memdup2_qemu(m, s) +/* + * g_pattern_match_string has been deprecated in Glib since 2.70 and + * will complain about it if you try to use it. Fortunately the + * signature of both functions is the same making it easy to work + * around. + */ +static inline +gboolean g_pattern_spec_match_string_qemu(GPatternSpec *pspec, + const gchar *string) +{ +#if GLIB_CHECK_VERSION(2, 70, 0) +return g_pattern_spec_match_string(pspec, string); +#else +return g_pattern_match_string(pspec, string); +#endif +}; +#define g_pattern_spec_match_string(p, s) g_pattern_spec_match_string_qemu(p, s) + #if defined(G_OS_UNIX) /* * Note: The fallback implementation is not MT-safe, and it returns a copy of modified contrib/plugins/execlog.c @@ -334,8 +334,8 @@ static void registers_init(int vcpu_index) for (int p = 0; p < rmatches->len; p++) { g_autoptr(GPatternSpec) pat = g_pattern_spec_new(rmatches->pdata[p]); g_autofree gchar *rd_lower = g_utf8_strdown(rd->name, -1); -if (g_pattern_match_string(pat, rd->name) || -g_pattern_match_string(pat, rd_lower)) { +if (g_pattern_spec_match_string(pat, rd->name) || +g_pattern_spec_match_string(pat, rd_lower)) { Register *reg = init_vcpu_register(vcpu_index, rd); g_ptr_array_add(registers, reg); --8<---cut here---end--->8--- but I hesitated to add it for this case as plugins shouldn't assume they have access to QEMU's internals. Maybe the glib-compat.h header could be treated as a special case. > > I also noticed in target
Re: [PATCH v5 08/12] tests/plugin/bb: migrate to new per_vcpu API
Pierrick Bouvier writes: > On 2/29/24 6:21 PM, Alex Bennée wrote: >> Pierrick Bouvier writes: >> >>> Signed-off-by: Pierrick Bouvier >> I did notice there is a discrepancy between what libisns and libb >> report. The libb looks like an overcount so I wonder if there are some >> instructions we are not picking up but I can't see where that would be. >>➜ ./qemu-hppa -plugin ./tests/plugin/libinsn.so -plugin >> ./tests/plugin/libbb.so,inline=true -d plugin >> ./tests/tcg/hppa-linux-user/sha512 >>1..10 >>ok 1 - do_test(&tests[i]) >>ok 2 - do_test(&tests[i]) >>ok 3 - do_test(&tests[i]) >>ok 4 - do_test(&tests[i]) >>ok 5 - do_test(&tests[i]) >>ok 6 - do_test(&tests[i]) >>ok 7 - do_test(&tests[i]) >>ok 8 - do_test(&tests[i]) >>ok 9 - do_test(&tests[i]) >>ok 10 - do_test(&tests[i]) >>CPU0: bb's: 54282, insns: 775697 >>Total: bb's: 54282, insns: 775697 >>cpu 0 insns: 774827 >>total insns: 774827 >> Although weirdly maybe only an hppa thing. Richard? >> > > Do you observe the exact same number if you run only one of the plugin? > > bb count number of instructions in an executed block, while insn > effectively count every instructions ran. > Maybe there is hppa specifity that makes some tb exit in the middle, > thus executing less instructions than expected from bb count. Almost certainly - I just wasn't sure what would do that on straight line code. Probably some funky aspect of HPPA I'm not aware off ;-) > > I don't know how to reproduce this test. Did you run it from a > specific docker env? If you have docker enabled the "make check-tcg" will build and use a container to build the test cases. If you are on debian you just need: apt install gcc-hppa-linux-gnu libc6-dev-hppa-cross and re-run configure. > >>➜ ./qemu-aarch64 -plugin ./tests/plugin/libinsn.so -plugin >> ./tests/plugin/libbb.so,inline=true -d plugin >> ./tests/tcg/aarch64-linux-user/sha512 >>1..10 >>ok 1 - do_test(&tests[i]) >>ok 2 - do_test(&tests[i]) >>ok 3 - do_test(&tests[i]) >>ok 4 - do_test(&tests[i]) >>ok 5 - do_test(&tests[i]) >>ok 6 - do_test(&tests[i]) >>ok 7 - do_test(&tests[i]) >>ok 8 - do_test(&tests[i]) >>ok 9 - do_test(&tests[i]) >>ok 10 - do_test(&tests[i]) >>CPU0: bb's: 41513, insns: 302671 >>Total: bb's: 41513, insns: 302671 >>cpu 0 insns: 302671 >>total insns: 302671 >> Anyway: >> Reviewed-by: Alex Bennée >> -- Alex Bennée Virtualisation Tech Lead @ Linaro
Re: [PATCH V4 10/14] migration: stop vm for cpr
On 3/1/24 02:28, Peter Xu wrote: On Thu, Feb 29, 2024 at 10:21:14AM -0500, Steven Sistare wrote: On 2/25/2024 9:08 PM, Peter Xu wrote: On Thu, Feb 22, 2024 at 09:28:36AM -0800, Steve Sistare wrote: When migration for cpr is initiated, stop the vm and set state RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the possibility of ram and device state being out of sync, and guarantees that a guest in the suspended state remains suspended, because qmp_cont rejects a cont command in the RUN_STATE_FINISH_MIGRATE state. Signed-off-by: Steve Sistare Reviewed-by: Peter Xu cpr-reboot mode keeps changing behavior. Could we declare it "experimental" until it's solid? Maybe a patch to document this? Normally IMHO we shouldn't merge a feature if it's not complete, however cpr-reboot is so special that the mode itself is already merged in 8.2 before I started to merge patches, and it keeps changing things. I don't know what else we can do here besides declaring it experimental and not declare it a stable feature. Hi Peter, the planned/committed functionality for cpr-reboot changed only once, in: migration: stop vm for cpr Suspension to support vfio is an enhancement which adds to the basic functionality, it does not change it. This was planned all along, but submitted as a separate If VFIO used to migrate without suspension and now it won't, it's a behavior change? series to manage complexity, as I outlined in my qemu community presentation, which I emailed you at the time. Other "changes" that arose during review were just clarifications and explanations. So, I don't think cpr-reboot deserves to be condemned to experimental limbo. IMHO it's not about a feature being condemned, it's about a kindful heads-up to the users that one needs to take risk on using it until it becomes stable, it also makes developers easier because of no limitation on behavior change. If all the changes are landing, then there's no need for such a patch. If so, please propose the planned complete document patch. I had a feeling that cpr is still not fully understood by even many developers on the list. It'll be great that such document will contain all the details one needs to know on the feature, etc. meaning of the name cpr-reboot (what is "cpr"), when to use it, how to use it, how it differs from "normal" mode (etc. lifted limitations on some devices that used to block migration?), what is enforced (vm stop, suspension, etc.) and what is optionally offered (VFIO, shared-mem, etc.), the expected behaviors, etc. When send it, please copy relevant people (Alex & Cedric for VFIO, while Markus could also be a good candidate considering the QMP involvement). I second that. If we could have a file under docs/, it would be great. Thanks, C.
Re: [PATCH v5 08/12] tests/plugin/bb: migrate to new per_vcpu API
On 3/1/24 2:26 PM, Alex Bennée wrote: Pierrick Bouvier writes: On 2/29/24 6:21 PM, Alex Bennée wrote: Pierrick Bouvier writes: Signed-off-by: Pierrick Bouvier I did notice there is a discrepancy between what libisns and libb report. The libb looks like an overcount so I wonder if there are some instructions we are not picking up but I can't see where that would be. ➜ ./qemu-hppa -plugin ./tests/plugin/libinsn.so -plugin ./tests/plugin/libbb.so,inline=true -d plugin ./tests/tcg/hppa-linux-user/sha512 1..10 ok 1 - do_test(&tests[i]) ok 2 - do_test(&tests[i]) ok 3 - do_test(&tests[i]) ok 4 - do_test(&tests[i]) ok 5 - do_test(&tests[i]) ok 6 - do_test(&tests[i]) ok 7 - do_test(&tests[i]) ok 8 - do_test(&tests[i]) ok 9 - do_test(&tests[i]) ok 10 - do_test(&tests[i]) CPU0: bb's: 54282, insns: 775697 Total: bb's: 54282, insns: 775697 cpu 0 insns: 774827 total insns: 774827 Although weirdly maybe only an hppa thing. Richard? Do you observe the exact same number if you run only one of the plugin? bb count number of instructions in an executed block, while insn effectively count every instructions ran. Maybe there is hppa specifity that makes some tb exit in the middle, thus executing less instructions than expected from bb count. Almost certainly - I just wasn't sure what would do that on straight line code. Probably some funky aspect of HPPA I'm not aware off ;-) I don't know how to reproduce this test. Did you run it from a specific docker env? If you have docker enabled the "make check-tcg" will build and use a container to build the test cases. If you are on debian you just need: apt install gcc-hppa-linux-gnu libc6-dev-hppa-cross and re-run configure. Thanks. The difference observed predates this series, so there should definitely be something specific to this arch. ➜ ./qemu-aarch64 -plugin ./tests/plugin/libinsn.so -plugin ./tests/plugin/libbb.so,inline=true -d plugin ./tests/tcg/aarch64-linux-user/sha512 1..10 ok 1 - do_test(&tests[i]) ok 2 - do_test(&tests[i]) ok 3 - do_test(&tests[i]) ok 4 - do_test(&tests[i]) ok 5 - do_test(&tests[i]) ok 6 - do_test(&tests[i]) ok 7 - do_test(&tests[i]) ok 8 - do_test(&tests[i]) ok 9 - do_test(&tests[i]) ok 10 - do_test(&tests[i]) CPU0: bb's: 41513, insns: 302671 Total: bb's: 41513, insns: 302671 cpu 0 insns: 302671 total insns: 302671 Anyway: Reviewed-by: Alex Bennée
Re: [PATCH v2 0/5] Add device DM163 (led driver, matrix colors shield & display)
On 01/03/2024 11.21, Peter Maydell wrote: On Wed, 28 Feb 2024 at 12:06, Thomas Huth wrote: On 28/02/2024 12.31, Inès Varhol wrote: TLDR: how can I provide a test or an example? I've tested the display by running custom executables and comparing to the result on the real board, but I don't know how to test it using a QTest. I'd suggest to try to take a screenshot via QMP or HMP. If you need a third party binary to get something on the screen, it's maybe best to write an avocado test instead of a qtest. See for example tests/avocado/machine_m68k_nextcube.py - it takes a screenshot and then converts the console messages from it via Tessaract to text and looks for the expected strings there. Alternatively, have a look at tests/avocado/machine_s390_ccw_virtio.py - it boots a Linux in the guest that writes a string into the /dev/fb0 framebuffer, then takes a screendump and then tries to find the string in the file. The latter could maybe also be done via a qtest if you know how to set up the framebuffer without a guest kernel. The other extremely popular approach for testing display devices is "I looked at the screen while I was doing the development, and it was fine" :-) Well, but that doesn't help that much against regressions that can occur in the course of time... Thomas
Re: [PATCH v3 2/3] tools: build qemu-vmsr-helper
Hi Paolo, > > > +static void compute_default_paths(void) > > > +{ > > > +socket_path = g_build_filename("/run", "qemu-vmsr-helper.sock", > > > NULL); > > > +pidfile = g_build_filename("/run", "qemu-vmsr-helper.pid", NULL); > > > +} > > > > We shouldn't be hardcoding /run, we need to honour --prefix and > > --localstatedir args given to configure. /var/run is a symlink > > to /run so the end result ends up the same AFAIK > > Indeed; just copy from scsi/qemu-pr-helper.c. > When I copy the same as compute-default() of scsi/qemu-pr-helper.c, the helper is listening to "/var/local/run/qemu-vmsr-helper.sock". Problem is /var/local/run is 700 while /run is 755. So I would need to adjust the qemu-vmsr-helper.service to give the --socket=PATH of qemu-vmsr-helper.socket (i.e /run/qemu-vmsr-helper.sock) Problem is when I do that , I fall into the case: "Unix socket can't be set when using socket activation" So I'm a bit confused what to do about that. > > You never answered my question from the previous posting of this > > > > This check is merely validating the the thread ID in the message > > is a child of the process ID connected to the socket. Any process > > on the entire host can satisfy this requirement. > > > > I don't see what is limiting this to only QEMU as claimed by the > > commit message, unless you're expecting the UNIX socket permissions > > to be such that only processes under the qemu:qemu user:group pair > > can access to the socket ? That would be a libvirt based permissions > > assumption though. > > Yes, this is why the systemd socket uses 600, like > contrib/systemd/qemu-pr-helper.socket. The socket can be passed via > SCM_RIGHTS by libvirt, or its permissions can be changed (e.g. 660 and > root:kvm would make sense on a Debian system), or a separate helper > can be started by libvirt. > > Either way, the policy is left to the user rather than embedding it in > the provided systemd unit. > During this patch test, when I run by hand my VM (without libvirt), the vmsr helper systemd service/socket was like that: [Service] ExecStart=/usr/bin/qemu-vmsr-helper User=root Group=root and [Socket] ListenStream=/run/qemu-vmsr-helper.sock SocketUser=qemu SocketGroup=qemu SocketMode=0660 And it seems to works. So I'm not sure 100% what to do in my patch. Should I follow the pr-helper systemd files anyway ? Regards, Anthony
Re: [QEMU][PATCH v3 1/7] softmmu: physmem: Split ram_block_add()
Vikram Garhwal writes: > Extract ram block list update to a new function ram_block_add_list(). This is > done to support grant mappings which adds a memory region for granted memory > and > updates the ram_block list. > > Signed-off-by: Juergen Gross > Signed-off-by: Vikram Garhwal > Reviewed-by: Stefano Stabellini Reviewed-by: Alex Bennée -- Alex Bennée Virtualisation Tech Lead @ Linaro
Re: [PATCH v7 0/3] string list functions
Steven Sistare writes: > All the changes look good - steve Thanks! I can stick this into my next PR, ETA early next week.
[PATCH] replay: Improve error messages about configuration conflicts
Improve Record/replay feature is not supported for '-rtc base=localtime' Record/replay feature is not supported for 'smp' Record/replay feature is not supported for '-snapshot' to Record/replay is not supported with -rtc base=localtime Record/replay is not supported with multiple CPUs Record/replay is not supported with -snapshot Signed-off-by: Markus Armbruster --- replay/replay.c | 2 +- system/vl.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/replay/replay.c b/replay/replay.c index 3fd241a4fc..a2c576c16e 100644 --- a/replay/replay.c +++ b/replay/replay.c @@ -511,7 +511,7 @@ void replay_add_blocker(const char *feature) { Error *reason = NULL; -error_setg(&reason, "Record/replay feature is not supported for '%s'", +error_setg(&reason, "Record/replay is not supported with %s", feature); replay_blockers = g_slist_prepend(replay_blockers, reason); } diff --git a/system/vl.c b/system/vl.c index e480afd7a0..cc03a17c09 100644 --- a/system/vl.c +++ b/system/vl.c @@ -1932,7 +1932,7 @@ static void qemu_apply_machine_options(QDict *qdict) } if (current_machine->smp.cpus > 1) { -replay_add_blocker("smp"); +replay_add_blocker("multiple CPUs"); } } -- 2.43.0
Re: [PATCH] migration/multifd: Document two places for mapped-ram
pet...@redhat.com writes: > From: Peter Xu > > Add two documentations for mapped-ram migration on two spots that may not > be extremely clear. > > Signed-off-by: Peter Xu Reviewed-by: Fabiano Rosas
Re: [PATCH v2 1/3] qtest: migration: Enhance qtest migration functions to support 'channels' argument
On 01/03/24 2:19 pm, Het Gala wrote: On 29/02/24 6:47 am, Fabiano Rosas wrote: Het Gala writes: On 27/02/24 1:04 am, Het Gala wrote: On 26/02/24 6:31 pm, Fabiano Rosas wrote: Het Gala writes: On 24/02/24 1:42 am, Fabiano Rosas wrote: this was the same first approach that I attempted. It won't work because The final 'migrate' QAPI with channels string would look like [...] I'm not sure what you tried. This works: g_assert(!qdict_haskey(args, "channels")); if (channels) { channels_obj = qobject_from_json(channels, errp); qdict_put_obj(args, "channels", channels_obj); } Are you sure the above works ? Sorry, please ignore the below doubt. I understood what silly mistakes I was doing. you were right, qobject_from_json() works fine and converts string to json object. Will follow your latest reply and send out the patch really soon. Thank you for unblocking me here quickly :) [...] Regards, Het Gala
Re: [PATCH v2] hw/nvme/ns: Add NVMe NGUID property
On Feb 22 17:50, Nabih Estefan wrote: > From: Roque Arcudia Hernandez > > This patch adds a way to specify an NGUID for a given NVMe Namespace using a > string of hexadecimal digits with an optional '-' separator to group bytes. > For > instance: > > -device nvme-ns,nguid="e9accd3b83904e13167cf0593437f57d" > > If provided, the NGUID will be part of the Namespace Identification Descriptor > list and the Identify Namespace data. > > Signed-off-by: Roque Arcudia Hernandez > Signed-off-by: Nabih Estefan > Reviewed-by: Klaus Jensen Thanks, applied to nvme-next! signature.asc Description: PGP signature
Re: [PATCH v5 09/12] contrib/plugins/hotblocks: migrate to new per_vcpu API
On 14:33 Thu 29 Feb , Alex Bennée wrote: > Caution: This message originated from an External Source. Use proper caution > when opening attachments, clicking links, or responding. > > > Luc Michel writes: > > > On 15:09 Tue 27 Feb , Pierrick Bouvier wrote: > >> On 2/27/24 2:54 PM, Luc Michel wrote: > >> > Hi Pierrick, > >> > > >> > On 13:14 Mon 26 Feb , Pierrick Bouvier wrote: > >> > > Signed-off-by: Pierrick Bouvier > >> > > --- > >> > > contrib/plugins/hotblocks.c | 50 > >> > > ++--- > >> > > 1 file changed, 30 insertions(+), 20 deletions(-) > >> > > > >> > > diff --git a/contrib/plugins/hotblocks.c b/contrib/plugins/hotblocks.c > >> > > index 4de1b134944..02bc5078bdd 100644 > >> > > --- a/contrib/plugins/hotblocks.c > >> > > +++ b/contrib/plugins/hotblocks.c > >> > > @@ -34,8 +34,8 @@ static guint64 limit = 20; > >> > >*/ > >> > > typedef struct { > >> > > uint64_t start_addr; > >> > > -uint64_t exec_count; > >> > > -int trans_count; > >> > > +struct qemu_plugin_scoreboard *exec_count; > >> > > +int trans_count; > >> > > unsigned long insns; > >> > > } ExecCount; > >> > > > >> > > @@ -43,7 +43,17 @@ static gint cmp_exec_count(gconstpointer a, > >> > > gconstpointer b) > >> > > { > >> > > ExecCount *ea = (ExecCount *) a; > >> > > ExecCount *eb = (ExecCount *) b; > >> > > -return ea->exec_count > eb->exec_count ? -1 : 1; > >> > > +uint64_t count_a = > >> > > + > >> > > qemu_plugin_u64_sum(qemu_plugin_scoreboard_u64(ea->exec_count)); > >> > > +uint64_t count_b = > >> > > + > >> > > qemu_plugin_u64_sum(qemu_plugin_scoreboard_u64(eb->exec_count)); > >> > > +return count_a > count_b ? -1 : 1; > >> > > +} > >> > > + > >> > > +static void exec_count_free(gpointer key, gpointer value, gpointer > >> > > user_data) > >> > > +{ > >> > > +ExecCount *cnt = value; > >> > > +qemu_plugin_scoreboard_free(cnt->exec_count); > >> > > } > >> > > > >> > > static void plugin_exit(qemu_plugin_id_t id, void *p) > >> > > @@ -52,7 +62,6 @@ static void plugin_exit(qemu_plugin_id_t id, void *p) > >> > > GList *counts, *it; > >> > > int i; > >> > > > >> > > -g_mutex_lock(&lock); > >> > > >> > I encountered cases before where the vCPUs continue executing while > >> > plugin_exit is called. This can happen e.g., when QEMU calls exit(3) > >> > from one CPU thread. Others will continue to run at the same time the > >> > atexit callbacks are called. > >> > > >> > This also means that you can't really free the resources as you do at > >> > the end of plugin_exit. > >> > > >> > >> Interesting... > >> > >> The current documentation [1] mentions it's the right place to free > >> resources, and from what I saw, existing plugins assume this too (see > >> contrib/plugins/cache.c for instance). > >> > >> There is probably a bug related to the case you mention, and I'll try to > >> reproduce this, and see if we can have a proper fix. > >> > >> I'm not sure that removing cleanup code from existing plugins is the > >> right thing to do at the moment, even though there is an existing issue > >> with some programs. > > > > Yes absolutely. The problem is on the QEMU side. FWIW I tried to fix one > > of those exit cases (semihosted exit syscall) some time ago: > > https://lore.kernel.org/qemu-devel/20220621125916.25257-1-lmic...@kalray.eu/ > > The plugin exit handler should flush all instrumented code: > > /* >* Handle exit from linux-user. Unlike the normal atexit() mechanism >* we need to handle the clean-up manually as it's possible threads >* are still running. We need to remove all callbacks from code >* generation, flush the current translations and then we can safely >* trigger the exit callbacks. >*/ > > void qemu_plugin_user_exit(void) > { > enum qemu_plugin_event ev; > CPUState *cpu; > > /* >* Locking order: we must acquire locks in an order that is consistent >* with the one in fork_start(). That is: >* - start_exclusive(), which acquires qemu_cpu_list_lock, >* must be called before acquiring plugin.lock. >* - tb_flush(), which acquires mmap_lock(), must be called >* while plugin.lock is not held. >*/ > start_exclusive(); > > qemu_rec_mutex_lock(&plugin.lock); > /* un-register all callbacks except the final AT_EXIT one */ > for (ev = 0; ev < QEMU_PLUGIN_EV_MAX; ev++) { > if (ev != QEMU_PLUGIN_EV_ATEXIT) { > struct qemu_plugin_cb *cb, *next; > > QLIST_FOREACH_SAFE_RCU(cb, &plugin.cb_lists[ev], entry, next) { > plugin_unregister_cb__locked(cb->ctx, ev); > } > } > } > CPU_FOREACH(cpu) { > qemu_plugin_disable_mem_helpers(cpu); > } > qemu_rec_mutex_unlock(&plugin.lock); > > tb_flush(current_cpu); > end_exclusive(); > >
Re: [PULL v2 1/1] loongarch: Change the UEFI loading mode to loongarch
On 29/2/24 12:38, Song Gao wrote: From: Xianglai Li The UEFI loading mode in loongarch is very different from that in other architectures:loongarch's UEFI code is in rom, while other architectures' UEFI code is in flash. loongarch UEFI can be loaded as follows: -machine virt,pflash=pflash0-format -bios ./QEMU_EFI.fd Other architectures load UEFI using the following methods: -machine virt,pflash0=pflash0-format,pflash1=pflash1-format loongarch's UEFI loading method makes qemu and libvirt incompatible when using NVRAM, and the cost of loongarch's current loading method far outweighs the benefits, so we decided to use the same UEFI loading scheme as other architectures. FYI I'm still trying to find a way to avoid that, planning to discuss more with libvirt folks. Well, maybe it is a waste of my time and I should just stop worrying / caring about this long standing issue. Cc: Andrea Bolognani Cc: maob...@loongson.cn Cc: Philippe Mathieu-Daudé Cc: Song Gao Cc: zhaotian...@loongson.cn Signed-off-by: Xianglai Li Tested-by: Andrea Bolognani Reviewed-by: Song Gao Message-Id: <0bd892aa9b88e0f4cc904cb70efd0251fc1cde29.1708336919.git.lixiang...@loongson.cn> Signed-off-by: Song Gao --- hw/loongarch/acpi-build.c | 29 +-- hw/loongarch/virt.c | 101 ++-- include/hw/loongarch/virt.h | 10 ++-- 3 files changed, 107 insertions(+), 33 deletions(-)
Re: [PATCH v2 3/5] linux-user: Add strace for shmat
On 28/2/24 21:25, Richard Henderson wrote: Signed-off-by: Richard Henderson --- linux-user/strace.c| 23 +++ linux-user/strace.list | 2 +- 2 files changed, 24 insertions(+), 1 deletion(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH v2 1/3] qtest: migration: Enhance qtest migration functions to support 'channels' argument
Het Gala writes: > On 01/03/24 2:19 pm, Het Gala wrote: >> >> On 29/02/24 6:47 am, Fabiano Rosas wrote: >>> Het Gala writes: >>> On 27/02/24 1:04 am, Het Gala wrote: > > On 26/02/24 6:31 pm, Fabiano Rosas wrote: >> Het Gala writes: >> >>> On 24/02/24 1:42 am, Fabiano Rosas wrote: >>> this was the same first approach that I attempted. It won't work >>> because >>> >>> The final 'migrate' QAPI with channels string would look like >>> > [...] >>> I'm not sure what you tried. This works: >>> >>> g_assert(!qdict_haskey(args, "channels")); >>> if (channels) { >>> channels_obj = qobject_from_json(channels, errp); >>> qdict_put_obj(args, "channels", channels_obj); >>> } >> >> Are you sure the above works ? > > Sorry, please ignore the below doubt. I understood what silly mistakes I > was doing. you were right, qobject_from_json() works fine and converts > string to json object. > > Will follow your latest reply and send out the patch really soon. Thank > you for unblocking me here quickly :) Just please don't make it all one patch. Each of those steps could be a separate patch.
[RFC 1/8] virtio/virtio-pci: Handle extra notification data
Add support to virtio-pci devices for handling the extra data sent from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA transport feature has been negotiated. The extra data that's passed to the virtio-pci device when this feature is enabled varies depending on the device's virtqueue layout. In a split virtqueue layout, this data includes: - upper 16 bits: last_avail_idx - lower 16 bits: virtqueue index In a packed virtqueue layout, this data includes: - upper 16 bits: 1-bit wrap counter & 15-bit last_avail_idx - lower 16 bits: virtqueue index Signed-off-by: Jonah Palmer --- hw/virtio/virtio-pci.c | 13 ++--- hw/virtio/virtio.c | 13 + include/hw/virtio/virtio.h | 1 + 3 files changed, 24 insertions(+), 3 deletions(-) diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index 1a7039fb0c..c7c577b177 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -384,7 +384,7 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val) { VirtIOPCIProxy *proxy = opaque; VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus); -uint16_t vector; +uint16_t vector, vq_idx; hwaddr pa; switch (addr) { @@ -408,8 +408,15 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val) vdev->queue_sel = val; break; case VIRTIO_PCI_QUEUE_NOTIFY: -if (val < VIRTIO_QUEUE_MAX) { -virtio_queue_notify(vdev, val); +if (virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) { +vq_idx = val & 0x; +virtio_set_notification_data(vdev, vq_idx, val); +} else { +vq_idx = val; +} + +if (vq_idx < VIRTIO_QUEUE_MAX) { +virtio_queue_notify(vdev, vq_idx); } break; case VIRTIO_PCI_STATUS: diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index d229755eae..a61f69b7fd 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -2052,6 +2052,19 @@ int virtio_set_status(VirtIODevice *vdev, uint8_t val) return 0; } +void virtio_set_notification_data(VirtIODevice *vdev, uint16_t i, uint32_t data) +{ +VirtQueue *vq = &vdev->vq[i]; + +if (virtio_vdev_has_feature(vdev, VIRTIO_F_RING_PACKED)) { +vq->last_avail_wrap_counter = (data >> 31) & 0x1; +vq->last_avail_idx = (data >> 16) & 0x7FFF; +} else { +vq->last_avail_idx = (data >> 16) & 0x; +} +vq->shadow_avail_idx = vq->last_avail_idx; +} + static enum virtio_device_endian virtio_default_endian(void) { if (target_words_bigendian()) { diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h index c8f72850bc..c92d8afc42 100644 --- a/include/hw/virtio/virtio.h +++ b/include/hw/virtio/virtio.h @@ -345,6 +345,7 @@ void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index); void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index); void virtio_update_irq(VirtIODevice *vdev); int virtio_set_features(VirtIODevice *vdev, uint64_t val); +void virtio_set_notification_data(VirtIODevice *vdev, uint16_t i, uint32_t data); /* Base devices. */ typedef struct VirtIOBlkConf VirtIOBlkConf; -- 2.39.3
[RFC 5/8] virtio-ccw: Handle extra notification data
Add support to virtio-ccw devices for handling the extra data sent from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA transport feature has been negotiated. The extra data that's passed to the virtio-ccw device when this feature is enabled varies depending on the device's virtqueue layout. That data passed to the virtio-ccw device is in the same format as the data passed to virtio-pci devices. Signed-off-by: Jonah Palmer --- hw/s390x/s390-virtio-ccw.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c index 62804cc228..b8e193956c 100644 --- a/hw/s390x/s390-virtio-ccw.c +++ b/hw/s390x/s390-virtio-ccw.c @@ -140,9 +140,11 @@ static void subsystem_reset(void) static int virtio_ccw_hcall_notify(const uint64_t *args) { uint64_t subch_id = args[0]; -uint64_t queue = args[1]; +uint64_t data = args[1]; SubchDev *sch; +VirtIODevice *vdev; int cssid, ssid, schid, m; +uint16_t vq_idx; if (ioinst_disassemble_sch_ident(subch_id, &m, &cssid, &ssid, &schid)) { return -EINVAL; @@ -151,12 +153,20 @@ static int virtio_ccw_hcall_notify(const uint64_t *args) if (!sch || !css_subch_visible(sch)) { return -EINVAL; } -if (queue >= VIRTIO_QUEUE_MAX) { + +vdev = virtio_ccw_get_vdev(sch); +if (virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) { +vq_idx = data & 0x; +virtio_set_notification_data(vdev, vq_idx, data); +} else { +vq_idx = data; +} + +if (vq_idx >= VIRTIO_QUEUE_MAX) { return -EINVAL; } -virtio_queue_notify(virtio_ccw_get_vdev(sch), queue); +virtio_queue_notify(vdev, vq_idx); return 0; - } static int virtio_ccw_hcall_early_printk(const uint64_t *args) -- 2.39.3
[RFC 8/8] virtio: Add VIRTIO_F_NOTIFICATION_DATA property definition
Extend the virtio device property definitions to include the VIRTIO_F_NOTIFICATION_DATA feature. The default state of this feature is disabled, allowing it to be explicitly enabled where it's supported. Signed-off-by: Jonah Palmer --- include/hw/virtio/virtio.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h index c92d8afc42..5772737dde 100644 --- a/include/hw/virtio/virtio.h +++ b/include/hw/virtio/virtio.h @@ -369,7 +369,9 @@ typedef struct VirtIORNGConf VirtIORNGConf; DEFINE_PROP_BIT64("packed", _state, _field, \ VIRTIO_F_RING_PACKED, false), \ DEFINE_PROP_BIT64("queue_reset", _state, _field, \ - VIRTIO_F_RING_RESET, true) + VIRTIO_F_RING_RESET, true), \ +DEFINE_PROP_BIT64("notification_data", _state, _field, \ + VIRTIO_F_NOTIFICATION_DATA, false) hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n); bool virtio_queue_enabled_legacy(VirtIODevice *vdev, int n); -- 2.39.3
[RFC 7/8] vhost/vhost-user: Add VIRTIO_F_NOTIFICATION_DATA to vhost feature bits
Add support for the VIRTIO_F_NOTIFICATION_DATA feature across a variety of vhost devices. The inclusion of VIRTIO_F_NOTIFICATION_DATA in the feature bits arrays for these devices ensures that the backend is capable of offering and providing support for this feature, and that it can be disabled if the backend does not support it. Signed-off-by: Jonah Palmer --- hw/block/vhost-user-blk.c| 1 + hw/net/vhost_net.c | 2 ++ hw/scsi/vhost-scsi.c | 1 + hw/scsi/vhost-user-scsi.c| 1 + hw/virtio/vhost-user-fs.c| 2 +- hw/virtio/vhost-user-vsock.c | 1 + net/vhost-vdpa.c | 1 + 7 files changed, 8 insertions(+), 1 deletion(-) diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index 6a856ad51a..983c0657da 100644 --- a/hw/block/vhost-user-blk.c +++ b/hw/block/vhost-user-blk.c @@ -51,6 +51,7 @@ static const int user_feature_bits[] = { VIRTIO_F_RING_PACKED, VIRTIO_F_IOMMU_PLATFORM, VIRTIO_F_RING_RESET, +VIRTIO_F_NOTIFICATION_DATA, VHOST_INVALID_FEATURE_BIT }; diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index e8e1661646..bb1f975b39 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -48,6 +48,7 @@ static const int kernel_feature_bits[] = { VIRTIO_F_IOMMU_PLATFORM, VIRTIO_F_RING_PACKED, VIRTIO_F_RING_RESET, +VIRTIO_F_NOTIFICATION_DATA, VIRTIO_NET_F_HASH_REPORT, VHOST_INVALID_FEATURE_BIT }; @@ -55,6 +56,7 @@ static const int kernel_feature_bits[] = { /* Features supported by others. */ static const int user_feature_bits[] = { VIRTIO_F_NOTIFY_ON_EMPTY, +VIRTIO_F_NOTIFICATION_DATA, VIRTIO_RING_F_INDIRECT_DESC, VIRTIO_RING_F_EVENT_IDX, diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c index 58a00336c2..b8048f18e9 100644 --- a/hw/scsi/vhost-scsi.c +++ b/hw/scsi/vhost-scsi.c @@ -38,6 +38,7 @@ static const int kernel_feature_bits[] = { VIRTIO_RING_F_EVENT_IDX, VIRTIO_SCSI_F_HOTPLUG, VIRTIO_F_RING_RESET, +VIRTIO_F_NOTIFICATION_DATA, VHOST_INVALID_FEATURE_BIT }; diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c index a63b1f4948..0b050805a8 100644 --- a/hw/scsi/vhost-user-scsi.c +++ b/hw/scsi/vhost-user-scsi.c @@ -36,6 +36,7 @@ static const int user_feature_bits[] = { VIRTIO_RING_F_EVENT_IDX, VIRTIO_SCSI_F_HOTPLUG, VIRTIO_F_RING_RESET, +VIRTIO_F_NOTIFICATION_DATA, VHOST_INVALID_FEATURE_BIT }; diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c index cca2cd41be..ae48cc1c96 100644 --- a/hw/virtio/vhost-user-fs.c +++ b/hw/virtio/vhost-user-fs.c @@ -33,7 +33,7 @@ static const int user_feature_bits[] = { VIRTIO_F_RING_PACKED, VIRTIO_F_IOMMU_PLATFORM, VIRTIO_F_RING_RESET, - +VIRTIO_F_NOTIFICATION_DATA, VHOST_INVALID_FEATURE_BIT }; diff --git a/hw/virtio/vhost-user-vsock.c b/hw/virtio/vhost-user-vsock.c index 9431b9792c..802b44a07d 100644 --- a/hw/virtio/vhost-user-vsock.c +++ b/hw/virtio/vhost-user-vsock.c @@ -21,6 +21,7 @@ static const int user_feature_bits[] = { VIRTIO_RING_F_INDIRECT_DESC, VIRTIO_RING_F_EVENT_IDX, VIRTIO_F_NOTIFY_ON_EMPTY, +VIRTIO_F_NOTIFICATION_DATA, VHOST_INVALID_FEATURE_BIT }; diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c index 3726ee5d67..2827d29ce7 100644 --- a/net/vhost-vdpa.c +++ b/net/vhost-vdpa.c @@ -62,6 +62,7 @@ const int vdpa_feature_bits[] = { VIRTIO_F_RING_PACKED, VIRTIO_F_RING_RESET, VIRTIO_F_VERSION_1, +VIRTIO_F_NOTIFICATION_DATA, VIRTIO_NET_F_CSUM, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS, VIRTIO_NET_F_CTRL_MAC_ADDR, -- 2.39.3
[RFC 0/8] virtio,vhost: Add VIRTIO_F_NOTIFICATION_DATA support
The goal of these patches are to add support to a variety of virtio and vhost devices for the VIRTIO_F_NOTIFICATION_DATA transport feature. This feature indicates that a driver will pass extra data (instead of just a virtqueue's index) when notifying the corresponding device. The data passed in by the driver when this feature is enabled varies in format depending on if the device is using a split or packed virtqueue layout: Split VQ - Upper 16 bits: last_avail_idx - Lower 16 bits: virtqueue index Packed VQ - Upper 16 bits: 1-bit wrap counter & 15-bit last_avail_idx - Lower 16 bits: virtqueue index Also, due to the limitations of ioeventfd not being able to carry the extra provided by the driver, ioeventfd is left disabled for any devices using this feature. A significant aspect of this effort has been to maintain compatibility across different backends. As such, the feature is offered by backend devices only when supported, with fallback mechanisms where backend support is absent. Jonah Palmer (8): virtio/virtio-pci: Handle extra notification data virtio-pci: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA virtio-mmio: Handle extra notification data virtio-mmio: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA virtio-ccw: Handle extra notification data virtio-ccw: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA vhost/vhost-user: Add VIRTIO_F_NOTIFICATION_DATA to vhost feature bits virtio: Add VIRTIO_F_NOTIFICATION_DATA property definition hw/block/vhost-user-blk.c| 1 + hw/net/vhost_net.c | 2 ++ hw/s390x/s390-virtio-ccw.c | 18 ++ hw/s390x/virtio-ccw.c| 6 -- hw/scsi/vhost-scsi.c | 1 + hw/scsi/vhost-user-scsi.c| 1 + hw/virtio/vhost-user-fs.c| 2 +- hw/virtio/vhost-user-vsock.c | 1 + hw/virtio/virtio-mmio.c | 18 ++ hw/virtio/virtio-pci.c | 19 ++- hw/virtio/virtio.c | 13 + include/hw/virtio/virtio.h | 5 - net/vhost-vdpa.c | 1 + 13 files changed, 71 insertions(+), 17 deletions(-) -- 2.39.3
[RFC 6/8] virtio-ccw: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA
Prevent ioeventfd from being enabled/disabled when a virtio-ccw device has negotiated the VIRTIO_F_NOTIFICATION_DATA transport feature. Due to the ioeventfd not being able to carry the extra data associated with this feature, the ioeventfd should be left in a disabled state for emulated virtio-ccw devices using this feature. Signed-off-by: Jonah Palmer --- hw/s390x/virtio-ccw.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c index b4676909dd..936ba78fda 100644 --- a/hw/s390x/virtio-ccw.c +++ b/hw/s390x/virtio-ccw.c @@ -530,14 +530,16 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw) if (ret) { break; } -if (!(status & VIRTIO_CONFIG_S_DRIVER_OK)) { +if (!(status & VIRTIO_CONFIG_S_DRIVER_OK) && +!virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) { virtio_ccw_stop_ioeventfd(dev); } if (virtio_set_status(vdev, status) == 0) { if (vdev->status == 0) { virtio_ccw_reset_virtio(dev); } -if (status & VIRTIO_CONFIG_S_DRIVER_OK) { +if ((status & VIRTIO_CONFIG_S_DRIVER_OK) && +!virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) { virtio_ccw_start_ioeventfd(dev); } sch->curr_status.scsw.count = ccw.count - sizeof(status); -- 2.39.3
[RFC 4/8] virtio-mmio: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA
Prevent ioeventfd from being enabled/disabled when a virtio-mmio device has negotiated the VIRTIO_F_NOTIFICATION_DATA transport feature. Due to ioeventfd not being able to carry the extra data associated with this feature, the ioeventfd should be left in a disabled state for emulated virtio-mmio devices using this feature. Signed-off-by: Jonah Palmer --- hw/virtio/virtio-mmio.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c index 2bac77460e..fc780a03b2 100644 --- a/hw/virtio/virtio-mmio.c +++ b/hw/virtio/virtio-mmio.c @@ -424,7 +424,8 @@ static void virtio_mmio_write(void *opaque, hwaddr offset, uint64_t value, virtio_update_irq(vdev); break; case VIRTIO_MMIO_STATUS: -if (!(value & VIRTIO_CONFIG_S_DRIVER_OK)) { +if (!(value & VIRTIO_CONFIG_S_DRIVER_OK) && +!virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) { virtio_mmio_stop_ioeventfd(proxy); } @@ -436,7 +437,8 @@ static void virtio_mmio_write(void *opaque, hwaddr offset, uint64_t value, virtio_set_status(vdev, value & 0xff); -if (value & VIRTIO_CONFIG_S_DRIVER_OK) { +if ((value & VIRTIO_CONFIG_S_DRIVER_OK) && +!virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) { virtio_mmio_start_ioeventfd(proxy); } -- 2.39.3
[RFC 3/8] virtio-mmio: Handle extra notification data
Add support to virtio-mmio devices for handling the extra data sent from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA transport feature has been negotiated. The extra data that's passed to the virtio-mmio device when this feature is enabled varies depending on the device's virtqueue layout. The data passed to the virtio-mmio device is in the same format as the data passed to virtio-pci devices. Signed-off-by: Jonah Palmer --- hw/virtio/virtio-mmio.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c index 22f9fbcf5a..2bac77460e 100644 --- a/hw/virtio/virtio-mmio.c +++ b/hw/virtio/virtio-mmio.c @@ -248,6 +248,7 @@ static void virtio_mmio_write(void *opaque, hwaddr offset, uint64_t value, { VirtIOMMIOProxy *proxy = (VirtIOMMIOProxy *)opaque; VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus); +uint16_t vq_idx; trace_virtio_mmio_write_offset(offset, value); @@ -407,8 +408,15 @@ static void virtio_mmio_write(void *opaque, hwaddr offset, uint64_t value, } break; case VIRTIO_MMIO_QUEUE_NOTIFY: -if (value < VIRTIO_QUEUE_MAX) { -virtio_queue_notify(vdev, value); +if (virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) { +vq_idx = value & 0x; +virtio_set_notification_data(vdev, vq_idx, value); +} else { +vq_idx = value; +} + +if (vq_idx < VIRTIO_QUEUE_MAX) { +virtio_queue_notify(vdev, vq_idx); } break; case VIRTIO_MMIO_INTERRUPT_ACK: -- 2.39.3
[RFC 2/8] virtio-pci: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA
Prevent ioeventfd from being enabled/disabled when a virtio-pci device has negotiated the VIRTIO_F_NOTIFICATION_DATA transport feature. Due to ioeventfd not being able to carry the extra data associated with this feature, the ioeventfd should be left in a disabled state for emulated virtio-pci devices using this feature. Signed-off-by: Jonah Palmer --- hw/virtio/virtio-pci.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c index c7c577b177..fd9717a0f5 100644 --- a/hw/virtio/virtio-pci.c +++ b/hw/virtio/virtio-pci.c @@ -420,13 +420,15 @@ static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val) } break; case VIRTIO_PCI_STATUS: -if (!(val & VIRTIO_CONFIG_S_DRIVER_OK)) { +if (!(val & VIRTIO_CONFIG_S_DRIVER_OK) && +!virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) { virtio_pci_stop_ioeventfd(proxy); } virtio_set_status(vdev, val & 0xFF); -if (val & VIRTIO_CONFIG_S_DRIVER_OK) { +if ((val & VIRTIO_CONFIG_S_DRIVER_OK) && +!virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) { virtio_pci_start_ioeventfd(proxy); } -- 2.39.3
Re: [PATCH 00/19] Workaround Windows failing to find 64bit SMBIOS entry point with SeaBIOS
Am 29.02.24 um 14:18 schrieb Fiona Ebner: > Am 27.02.24 um 16:47 schrieb Igor Mammedov: >> Windows (10) bootloader when running on top of SeaBIOS, fails to find >> >> SMBIOSv3 entry point. Tracing it shows that it looks for v2 anchor markers >> >> only and not v3. Tricking it into believing that entry point is found >> >> lets Windows successfully locate and parse SMBIOSv3 tables. Whether it >> >> will be fixed on Windows side is not clear so here goes a workaround. >> >> >> >> Idea is to try build v2 tables if QEMU configuration permits, >> >> and fallback to v3 tables otherwise. That will mask Windows issue >> >> form majority of users. >> >> However if VM configuration can't be described (typically large VMs) >> >> by v2 tables, QEMU will use SMBIOSv3 and Windows will hit the issue >> >> again. In this case complain to Microsoft and/or use UEFI instead of >> >> SeaBIOS (requires reinstall). >> >> >> >> Default compat setting of smbios-entry-point-type after series >> >> for pc/q35 machines: >> >> * 9.0-newer: 'auto' >> >> * 8.1-8.2: '64' >> >> * 8.0-older: '32' >> >> >> >> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2008 >> > > Thank you! I'm happy to confirm that this series works around the issue :) > While I still didn't do any in-depth testing (don't have enough knowledge for that anyways), I played around a bit more now, check that nothing obvious breaks also with a Linux VM and also ran a successful 'make check'. If that is enough, feel free to add: Tested-by: Fiona Ebner Best Regards, Fiona
Re: [QEMU][PATCH v3 2/7] xen: add pseudo RAM region for grant mappings
Vikram Garhwal writes: > From: Juergen Gross > > Add a memory region which can be used to automatically map granted > memory. It is starting at 0x8000ULL in order to be able to > distinguish it from normal RAM. Is the Xen memory map for HVM guests documented anywhere? I couldn't find anything googling or on the Xen wiki. I'm guessing this is going to be shared across all 64 bit HVM arches in Xen? Anyway: Reviewed-by: Alex Bennée -- Alex Bennée Virtualisation Tech Lead @ Linaro
Re: [RFC 0/4] mirror: implement incremental and bitmap modes
On 29.02.24 17:50, Fiona Ebner wrote: Am 29.02.24 um 13:00 schrieb Vladimir Sementsov-Ogievskiy: But anyway, this all could be simply achieved with bitmap-copying/merging API, if we allow to pass user-given bitmap to the mirror as working bitmap. I see, I'll drop the 'bitmap-mode' in the next version if nobody complains :) Good. It's a golden rule: never make public interfaces which you don't actually need for production. I myself sometimes violate it and spend extra time on developing features, which we later have to just drop as "not needed downstream, no sense in upstreaming". Just wondering which new mode I should allow for the @MirrorSyncMode then? The documentation states: # @incremental: only copy data described by the dirty bitmap. # (since: 2.4) # # @bitmap: only copy data described by the dirty bitmap. (since: 4.2) # Behavior on completion is determined by the BitmapSyncMode. For backup, do_backup_common() just maps @incremental to @bitmap + @bitmap-mode == @on-success. Using @bitmap for mirror would lead to being at odds with the documentation, because it mentions the BitmapSyncMode, which mirror won't have. Using @incremental for mirror would be consistent with the documentation, but behave a bit differently from backup. Opinions? Good question. As we already understood, (block-)job-api needs some spring-cleaning. Unfortunately I don't have much time on it, but still I decided to start from finally depreacting block-job-* API and moving to job-*.. Probably bitmap/bitmap-mode/sync APIs also need some optimization, keeping in mind new block-dirty-bitmap-merge api. So, what I could advice in this situation for newc interfaces: 1. be minimalistic 2. add `x-` prefix when unsure So, following these two rules, what about x-bitmap field, which may be combined only with 'full' mode, and do what you need? About documentation: actually, I never liked that we use for backup job "MirrorSyncMode". Now it looks more like "BackupSyncMode", having two values supported only by backup. I'm also unsure how mode=full&bitmap=some_bitmap differs from mode=bitmap&bitmap=some_bitmap.. So, I'd suggest simply rename MirrorSyncMode to BackupSyncMode, and add separate MirrorSyncMode with only "full", "top" and "none" values. -- Best regards, Vladimir
Re: [PULL 0/6] Misc fixes (libqos vring, Kconfig, TLS io channels, ...)
On Fri, 1 Mar 2024 at 08:09, Thomas Huth wrote: > > Hi Peter! > > The following changes since commit c0c6a0e3528b88aaad0b9d333e295707a195587b: > > Merge tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu > into staging (2024-02-28 17:27:10 +) > > are available in the Git repository at: > > https://gitlab.com/thuth/qemu.git tags/pull-request-2024-03-01 > > for you to fetch changes up to 462945cd22d2bcd233401ed3aa167d83a8e35b05: > > chardev/char-socket: Fix TLS io channels sending too much data to the > backend (2024-03-01 08:27:33 +0100) > > > * Fix some bugs in the vring setup of libqos > * Fix GIC settings when using --without-default-devices > * Fix USB PCAP streams on Windows > * Remove temporary files from test-util-sockets > * Fix TLS io channels sending too much data to the backend > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/9.0 for any user-visible changes. -- PMM
Re: [PULL v2 0/1] loongarch-to-apply queue
On Thu, 29 Feb 2024 at 11:38, Song Gao wrote: > > The following changes since commit bfe8020c814a30479a4241aaa78b63960655962b: > > Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging > (2024-02-28 14:23:21 +) > > are available in the Git repository at: > > https://gitlab.com/gaosong/qemu.git tags/pull-loongarch-20240229 > > for you to fetch changes up to c6e9847fc4becba561c631c4505e3b05d4926184: > > loongarch: Change the UEFI loading mode to loongarch (2024-02-29 19:32:45 > +0800) > > > pull-loongarch-20240229 > > V2: fix build error on mipsel > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/9.0 for any user-visible changes. -- PMM
[PATCH] target/riscv: move ratified/frozen exts to non-experimental
smaia and ssaia were ratified in August 25th 2023 [1]. zvfh and zvfhmin were ratified in August 2nd 2023 [2]. zfbfmin and zvfbf(min|wma) are frozen and moved to public review since Dec 16th 2023 [3]. zaamo and zalrsc are both marked as "Frozen" since January 24th 2024 [4]. [1] https://jira.riscv.org/browse/RVS-438 [2] https://jira.riscv.org/browse/RVS-871 [3] https://jira.riscv.org/browse/RVS-704 [4] https://jira.riscv.org/browse/RVS-1995 Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 22 +- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index fd0c7efdda..f5d30510ef 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -1463,17 +1463,26 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = { MULTI_EXT_CFG_BOOL("zihintntl", ext_zihintntl, true), MULTI_EXT_CFG_BOOL("zihintpause", ext_zihintpause, true), MULTI_EXT_CFG_BOOL("zacas", ext_zacas, false), +MULTI_EXT_CFG_BOOL("zaamo", ext_zaamo, false), +MULTI_EXT_CFG_BOOL("zalrsc", ext_zalrsc, false), MULTI_EXT_CFG_BOOL("zawrs", ext_zawrs, true), MULTI_EXT_CFG_BOOL("zfa", ext_zfa, true), +MULTI_EXT_CFG_BOOL("zfbfmin", ext_zfbfmin, false), MULTI_EXT_CFG_BOOL("zfh", ext_zfh, false), MULTI_EXT_CFG_BOOL("zfhmin", ext_zfhmin, false), MULTI_EXT_CFG_BOOL("zve32f", ext_zve32f, false), MULTI_EXT_CFG_BOOL("zve64f", ext_zve64f, false), MULTI_EXT_CFG_BOOL("zve64d", ext_zve64d, false), +MULTI_EXT_CFG_BOOL("zvfbfmin", ext_zvfbfmin, false), +MULTI_EXT_CFG_BOOL("zvfbfwma", ext_zvfbfwma, false), +MULTI_EXT_CFG_BOOL("zvfh", ext_zvfh, false), +MULTI_EXT_CFG_BOOL("zvfhmin", ext_zvfhmin, false), MULTI_EXT_CFG_BOOL("sstc", ext_sstc, true), +MULTI_EXT_CFG_BOOL("smaia", ext_smaia, false), MULTI_EXT_CFG_BOOL("smepmp", ext_smepmp, false), MULTI_EXT_CFG_BOOL("smstateen", ext_smstateen, false), +MULTI_EXT_CFG_BOOL("ssaia", ext_ssaia, false), MULTI_EXT_CFG_BOOL("svade", ext_svade, false), MULTI_EXT_CFG_BOOL("svadu", ext_svadu, true), MULTI_EXT_CFG_BOOL("svinval", ext_svinval, false), @@ -1561,19 +1570,6 @@ const RISCVCPUMultiExtConfig riscv_cpu_vendor_exts[] = { /* These are experimental so mark with 'x-' */ const RISCVCPUMultiExtConfig riscv_cpu_experimental_exts[] = { -MULTI_EXT_CFG_BOOL("x-smaia", ext_smaia, false), -MULTI_EXT_CFG_BOOL("x-ssaia", ext_ssaia, false), - -MULTI_EXT_CFG_BOOL("x-zaamo", ext_zaamo, false), -MULTI_EXT_CFG_BOOL("x-zalrsc", ext_zalrsc, false), - -MULTI_EXT_CFG_BOOL("x-zvfh", ext_zvfh, false), -MULTI_EXT_CFG_BOOL("x-zvfhmin", ext_zvfhmin, false), - -MULTI_EXT_CFG_BOOL("x-zfbfmin", ext_zfbfmin, false), -MULTI_EXT_CFG_BOOL("x-zvfbfmin", ext_zvfbfmin, false), -MULTI_EXT_CFG_BOOL("x-zvfbfwma", ext_zvfbfwma, false), - DEFINE_PROP_END_OF_LIST(), }; -- 2.43.2
Re: [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler
On 29.02.24 16:21, Markus Armbruster wrote: Thomas Huth writes: On 29/02/2024 08.20, Vladimir Sementsov-Ogievskiy wrote: On 29.02.24 09:32, Markus Armbruster wrote: [...] Anti-pattern: fail without setting an error. There might be more elsewhere in the series. qapi/error.h's big comment: * - On success, the function should not touch *errp. On failure, it * should set a new error, e.g. with error_setg(errp, ...), or * propagate an existing one, e.g. with error_propagate(errp, ...). * * - Whenever practical, also return a value that indicates success / * failure. This can make the error checking more concise, and can * avoid useless error object creation and destruction. Note that * we still have many functions returning void. We recommend * • bool-valued functions return true on success / false on failure, * • pointer-valued functions return non-null / null pointer, and * • integer-valued functions return non-negative / negative. } qemu_put_be64(f, STATTR_FLAG_EOS); return 0; } When adding Error **errp to a function, you must also add code to set an error on failure to every failure path. Adding it in a later patch in the same series can be okay, Personally, I'd prefer not doing so. Creating wrong commits and fixing them in same series - better to merge all fixes into bad commit:) I agree - that might create issues with bisecting later. Please fix it in this patch here already! Depends on the wrongness, really. In my understanding, unset errp on failure path is wrong enough. For example, simple pattern Error *local_err = NULL; int ret = foo(&local_err); if (ret < 0) { error_report_err(local_err); return; } will just crash. When I write code, I expect that "errp is set" === "ret < 0", for all functions returning negative integer on failure. Also, we have enough code that relying on errp for failure checking: $ git grep 'if (local_err)' | wc -l 373 Of course, most of this should be checking failure of functions that return void, but who knows. We don't want broken intermediate states, no argument. But intermediate states that are merely unclean can be acceptable. For instance, my commit a30ecde6e79 (net: Permit incremental conversion of init functions to Error) added such Error ** parameters to a somewhat tangled nest of functions, along with FIXME comments where errors weren't set. The next few commits fixed most, but not all of them. Later commits fixed some more. The one in tap-win32.c is still there today. This was acceptable, because it improved things from "bad error reporting" to "okay error reporting in most cases, unclean and bad error reporting in a few cases marked FIXME", with "a few" over time going down to the one I can't test, and nobody else seems to care about. -- Best regards, Vladimir
Re: [RFC 0/4] mirror: implement incremental and bitmap modes
Am 01.03.24 um 15:14 schrieb Vladimir Sementsov-Ogievskiy: > > As we already understood, (block-)job-api needs some spring-cleaning. > Unfortunately I don't have much time on it, but still I decided to start > from finally depreacting block-job-* API and moving to job-*.. Probably > bitmap/bitmap-mode/sync APIs also need some optimization, keeping in > mind new block-dirty-bitmap-merge api. > > So, what I could advice in this situation for newc interfaces: > > 1. be minimalistic > 2. add `x-` prefix when unsure > > So, following these two rules, what about x-bitmap field, which may be > combined only with 'full' mode, and do what you need? > AFAIU, it should rather be marked as @unstable in QAPI [0]? Then it doesn't need to be renamed if it becomes stable later. > About documentation: actually, I never liked that we use for backup job > "MirrorSyncMode". Now it looks more like "BackupSyncMode", having two > values supported only by backup. > > I'm also unsure how mode=full&bitmap=some_bitmap differs from > mode=bitmap&bitmap=some_bitmap.. > With the current patches, it was an error to specify @bitmap for other modes than 'incremental' and 'bitmap'. > So, I'd suggest simply rename MirrorSyncMode to BackupSyncMode, and add > separate MirrorSyncMode with only "full", "top" and "none" values. > Sounds good to me! [0]: https://gitlab.com/qemu-project/qemu/-/commit/a3c45b3e62962f99338716b1347cfb0d427cea44 Best Regards, Fiona
Re: [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler
On 01.03.24 17:44, Vladimir Sementsov-Ogievskiy wrote: On 29.02.24 16:21, Markus Armbruster wrote: Thomas Huth writes: On 29/02/2024 08.20, Vladimir Sementsov-Ogievskiy wrote: On 29.02.24 09:32, Markus Armbruster wrote: [...] Anti-pattern: fail without setting an error. There might be more elsewhere in the series. qapi/error.h's big comment: * - On success, the function should not touch *errp. On failure, it * should set a new error, e.g. with error_setg(errp, ...), or * propagate an existing one, e.g. with error_propagate(errp, ...). * * - Whenever practical, also return a value that indicates success / * failure. This can make the error checking more concise, and can * avoid useless error object creation and destruction. Note that * we still have many functions returning void. We recommend * • bool-valued functions return true on success / false on failure, * • pointer-valued functions return non-null / null pointer, and * • integer-valued functions return non-negative / negative. } qemu_put_be64(f, STATTR_FLAG_EOS); return 0; } When adding Error **errp to a function, you must also add code to set an error on failure to every failure path. Adding it in a later patch in the same series can be okay, Personally, I'd prefer not doing so. Creating wrong commits and fixing them in same series - better to merge all fixes into bad commit:) I agree - that might create issues with bisecting later. Please fix it in this patch here already! Depends on the wrongness, really. In my understanding, unset errp on failure path is wrong enough. For example, simple pattern Error *local_err = NULL; int ret = foo(&local_err); if (ret < 0) { error_report_err(local_err); return; } will just crash. When I write code, I expect that "errp is set" === "ret < 0", for all functions returning negative integer on failure. Also, we have enough code that relying on errp for failure checking: $ git grep 'if (local_err)' | wc -l 373 Of course, most of this should be checking failure of functions that return void, but who knows. We don't want broken intermediate states, no argument. But intermediate states that are merely unclean can be acceptable. For instance, my commit a30ecde6e79 (net: Permit incremental conversion of init functions to Error) added such Error ** parameters to a somewhat tangled nest of functions, along with FIXME comments where errors weren't set. The next few commits fixed most, but not all of them. Later commits fixed some more. The one in tap-win32.c is still there today. This was acceptable, because it improved things from "bad error reporting" to "okay error reporting in most cases, unclean and bad error reporting in a few cases marked FIXME", with "a few" over time going down to the one I can't test, and nobody else seems to care about. You may be sure, that functions you modify are never used in conditions I've described above. But you can't guarantee that this will not change in future. Of course, if someone will create new call of the function, he should look (at least once) at that function and see "FIXME", but better not rely on this. Also, if someone will add a call to another function that calls function with bad error reporting, most probably he will not see the "FIXME"... Formally, you should add such FIXME to all functions with errp, that may do (nested) calls to a function with FIXME -- Best regards, Vladimir
[PATCH] Fix unexpected Illegal instruction error on RISC-V.
Avoid right-shifting by a negative number of bits when lmul is 8. Signed-off-by: SiHuaN --- target/riscv/vector_helper.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 84cec73eb2..f0158ea237 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -53,10 +53,11 @@ target_ulong HELPER(vsetvl)(CPURISCVState *env, target_ulong s1, * VLEN * LMUL >= SEW * VLEN >> (8 - lmul) >= sew * (vlenb << 3) >> (8 - lmul) >= sew + * Considering that lmul may be 8, the following form cannot be used. * vlenb >> (8 - 3 - lmul) >= sew */ if (vlmul == 4 || -cpu->cfg.vlenb >> (8 - 3 - vlmul) < sew) { +(cpu->cfg.vlenb << 3) >> (8 - vlmul) < sew) { vill = true; } } -- 2.44.0
Re: [RFC 0/4] mirror: implement incremental and bitmap modes
On 01.03.24 17:52, Fiona Ebner wrote: Am 01.03.24 um 15:14 schrieb Vladimir Sementsov-Ogievskiy: As we already understood, (block-)job-api needs some spring-cleaning. Unfortunately I don't have much time on it, but still I decided to start from finally depreacting block-job-* API and moving to job-*.. Probably bitmap/bitmap-mode/sync APIs also need some optimization, keeping in mind new block-dirty-bitmap-merge api. So, what I could advice in this situation for newc interfaces: 1. be minimalistic 2. add `x-` prefix when unsure So, following these two rules, what about x-bitmap field, which may be combined only with 'full' mode, and do what you need? AFAIU, it should rather be marked as @unstable in QAPI [0]? Then it doesn't need to be renamed if it becomes stable later. Right, unstable feature is needed, using "x-" is optional. Recent discussion about it was in my "vhost-user-blk: live resize additional APIs" series: https://patchew.org/QEMU/20231006202045.1161543-1-vsement...@yandex-team.ru/20231006202045.1161543-5-vsement...@yandex-team.ru/ Following it, I think it's OK to not care anymore with "x-" prefixes, and rely on unstable feature. About documentation: actually, I never liked that we use for backup job "MirrorSyncMode". Now it looks more like "BackupSyncMode", having two values supported only by backup. I'm also unsure how mode=full&bitmap=some_bitmap differs from mode=bitmap&bitmap=some_bitmap.. With the current patches, it was an error to specify @bitmap for other modes than 'incremental' and 'bitmap'. Current documentation says: # @bitmap: The name of a dirty bitmap to use. Must be present if sync # is "bitmap" or "incremental". Can be present if sync is "full" # or "top". Must not be present otherwise. # (Since 2.4 (drive-backup), 3.1 (blockdev-backup)) So, I'd suggest simply rename MirrorSyncMode to BackupSyncMode, and add separate MirrorSyncMode with only "full", "top" and "none" values. Sounds good to me! [0]: https://gitlab.com/qemu-project/qemu/-/commit/a3c45b3e62962f99338716b1347cfb0d427cea44 Best Regards, Fiona -- Best regards, Vladimir
Re: [RFC 0/4] mirror: implement incremental and bitmap modes
Am 01.03.24 um 16:02 schrieb Vladimir Sementsov-Ogievskiy: > On 01.03.24 17:52, Fiona Ebner wrote: >> Am 01.03.24 um 15:14 schrieb Vladimir Sementsov-Ogievskiy: >>> >>> As we already understood, (block-)job-api needs some spring-cleaning. >>> Unfortunately I don't have much time on it, but still I decided to start >>> from finally depreacting block-job-* API and moving to job-*.. Probably >>> bitmap/bitmap-mode/sync APIs also need some optimization, keeping in >>> mind new block-dirty-bitmap-merge api. >>> >>> So, what I could advice in this situation for newc interfaces: >>> >>> 1. be minimalistic >>> 2. add `x-` prefix when unsure >>> >>> So, following these two rules, what about x-bitmap field, which may be >>> combined only with 'full' mode, and do what you need? >>> >> >> AFAIU, it should rather be marked as @unstable in QAPI [0]? Then it >> doesn't need to be renamed if it becomes stable later. > > Right, unstable feature is needed, using "x-" is optional. > > Recent discussion about it was in my "vhost-user-blk: live resize > additional APIs" series: > > https://patchew.org/QEMU/20231006202045.1161543-1-vsement...@yandex-team.ru/20231006202045.1161543-5-vsement...@yandex-team.ru/ > > Following it, I think it's OK to not care anymore with "x-" prefixes, > and rely on unstable feature. > Thanks for the confirmation! I'll go without the prefix in the name then. >> >>> About documentation: actually, I never liked that we use for backup job >>> "MirrorSyncMode". Now it looks more like "BackupSyncMode", having two >>> values supported only by backup. >>> >>> I'm also unsure how mode=full&bitmap=some_bitmap differs from >>> mode=bitmap&bitmap=some_bitmap.. >>> >> >> With the current patches, it was an error to specify @bitmap for other >> modes than 'incremental' and 'bitmap'. > > Current documentation says: > # @bitmap: The name of a dirty bitmap to use. Must be present if sync > # is "bitmap" or "incremental". Can be present if sync is "full" > # or "top". Must not be present otherwise. > # (Since 2.4 (drive-backup), 3.1 (blockdev-backup)) > > This is for backup. The documentation (and behavior) for @bitmap added by these patches for mirror is different ;) Best Regards, Fiona
Re: QNX VM hang on Qemu
Hi Faiq, On Fri, Feb 23, 2024 at 3:55 PM Faiq Ali Sayed wrote: > > So as far as my understanding, we provide these binaries using Qemu command > as depicted in the example you provided and there is no way I found to put > them into a single image. > Regarding the overlapping space, I don't have much idea but I think we could > provide a starting address separately to these images something like > addr=0x0010. Where is this 0x0010 address coming from ? Could you confirm with "readelf -h" that this is the entry point of your image ? Alternatively and that's what we used locally, qemu is able to guess the entry point of an image, when passed from -kernel. Therefore, our command simply looks like: | $ qemu-system-aarch64 -M xlnx-zcu102 -m 4G -no-reboot -nographic -kernel qnx.img I'm not the one having built the qnx.img we're using. But it looks pretty standard at the first look, made with "mkifs" and the kernel specs from zcu102 evaluation kit. Hope it helps, Clément > So as per your suggestion, I compared my images and I found that the image > does not show a virtual disk, and other commands like mkdir, do not have > these binaries. > So these binaries are not included at the time of image creation and I don't > exactly know that how can we add these binaries into the QNX image. > > The Image that is currently installed in real hardware does not have a > debugging symbol, so I can't use GDB to debug that. > Now I am looking for a way to create the correct QNX OS image for Qemu. > > Any lead in this regard will be really helpful :) >
Re: [PULL 26/29] contrib/plugins: extend execlog to track register changes
On Fri, Mar 01, 2024 at 10:22:08AM +, Alex Bennée wrote: > Date: Fri, 01 Mar 2024 10:22:08 + > From: Alex Bennée > Subject: Re: [PULL 26/29] contrib/plugins: extend execlog to track register > changes > > Zhao Liu writes: > > > Hi Alex, > > > > I hit the following warnings (with "./configure --enable-werror"): > > > > /qemu/contrib/plugins/execlog.c: In function ‘registers_init’: > > /qemu/contrib/plugins/execlog.c:330:17: warning: ‘g_pattern_match_string’ > > is deprecated: Use 'g_pattern_spec_match_string' instead > > [-Wdeprecated-declarations] > > 330 | if (g_pattern_match_string(pat, rd->name) || > > | ^~ > > In file included from /usr/include/glib-2.0/glib.h:65, > > from /qemu/contrib/plugins/execlog.c:9: > > /usr/include/glib-2.0/glib/gpattern.h:55:15: note: declared here > >55 | gboolean g_pattern_match_string (GPatternSpec *pspec, > > | ^~ > > /qemu/contrib/plugins/execlog.c:331:21: warning: ‘g_pattern_match_string’ > > is deprecated: Use 'g_pattern_spec_match_string' instead > > [-Wdeprecated-declarations] > > 331 | g_pattern_match_string(pat, rd_lower)) { > > | ^~ > > In file included from /usr/include/glib-2.0/glib.h:65, > > from /qemu/contrib/plugins/execlog.c:9: > > /usr/include/glib-2.0/glib/gpattern.h:55:15: note: declared here > >55 | gboolean g_pattern_match_string (GPatternSpec *pspec, > > | ^~ > > /qemu/contrib/plugins/execlog.c:339:63: warning: passing argument 2 of > > ‘g_ptr_array_add’ discards ‘const’ qualifier from pointer target type > > [-Wdiscarded-qualifiers] > > 339 | g_ptr_array_add(all_reg_names, > > reg->name); > > Hmm I missed that. Not sure what the neatest solution is in this case - > g_ptr_array_new() doesn't have a destroy func so we shouldn't ever > attempt to free it. We can manually re-add the const qualifier at the > other end for completeness and I guess comment and cast? I find other palces use 2 ways: * Use g_strdup() to create a copy (e.g., net/net.c, add_nic_model_help()). But I'm not sure if this is OK since you said we shouldn't attempt to free it. May I ask if the free issue you mentioned will affect the use of g_strdup() here? * Another way is the forced conversion to gpointer (also e.g., in net/net.c, qemu_get_nic_models()). Which way do you like? ;-) > > > > |~~~^~ > > In file included from /usr/include/glib-2.0/glib.h:31, > > from /qemu/contrib/plugins/execlog.c:9: > > /usr/include/glib-2.0/glib/garray.h:192:62: note: expected ‘gpointer’ {aka > > ‘void *’} but argument is of type ‘const char *’ > > 192 |gpointer data); > > |~~^~~~ > > > > In addition, I checked my glic version: > > > > $ldd --version > > ldd (Ubuntu GLIBC 2.35-0ubuntu3.5) 2.35 > > > > I think it's v2.35. Are these three warning reports valid? > > It's the glib (not glibc) version that matters here. > g_pattern_match_string was deprecated in 2.70 when the suggested > alternative was added. However our baseline for glib is still: > > # When bumping glib minimum version, please check also whether to increase > # the _WIN32_WINNT setting in osdep.h according to the value from glib > glib_req_ver = '>=2.56.0' > glib_pc = dependency('glib-2.0', version: glib_req_ver, required: true, > method: 'pkg-config') > > The usual solution for this is to throw in a compat wrapper in > glib-compat.h: > > --8<---cut here---start->8--- > modified include/glib-compat.h > @@ -105,6 +105,24 @@ static inline gpointer g_memdup2_qemu(gconstpointer mem, > gsize byte_size) > } > #define g_memdup2(m, s) g_memdup2_qemu(m, s) > > +/* > + * g_pattern_match_string has been deprecated in Glib since 2.70 and > + * will complain about it if you try to use it. Fortunately the > + * signature of both functions is the same making it easy to work > + * around. > + */ > +static inline > +gboolean g_pattern_spec_match_string_qemu(GPatternSpec *pspec, > + const gchar *string) > +{ > +#if GLIB_CHECK_VERSION(2, 70, 0) > +return g_pattern_spec_match_string(pspec, string); > +#else > +return g_pattern_match_string(pspec, string); > +#endif > +}; > +#define g_pattern_spec_match_string(p, s) > g_pattern_spec_match_string_qemu(p, s) > + > #if defined(G_OS_UNIX) > /* > * Note: The fallback implementation is not MT-safe, and it returns a copy of > modified contrib/plugins/execlog.c > @@ -334,8 +334,8 @@ static void registers_init(int vcpu_index) > for (int p = 0; p < rmatches-
[PATCH v9 1/1] hw: add some convenient trace-events for pcie and shpc hotplug
Add trace-events that may help to debug problems with hotplugging. Signed-off-by: Vladimir Sementsov-Ogievskiy --- hw/pci/pcie.c | 56 + hw/pci/shpc.c | 46 + hw/pci/trace-events | 6 + 3 files changed, 108 insertions(+) diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c index 6db0cf69cd..f56079acf5 100644 --- a/hw/pci/pcie.c +++ b/hw/pci/pcie.c @@ -28,6 +28,7 @@ #include "hw/pci/pcie_regs.h" #include "hw/pci/pcie_port.h" #include "qemu/range.h" +#include "trace.h" //#define DEBUG_PCIE #ifdef DEBUG_PCIE @@ -45,6 +46,23 @@ static bool pcie_sltctl_powered_off(uint16_t sltctl) && (sltctl & PCI_EXP_SLTCTL_PIC) == PCI_EXP_SLTCTL_PWR_IND_OFF; } +static const char *pcie_led_state_to_str(uint16_t value) +{ +switch (value) { +case PCI_EXP_SLTCTL_PWR_IND_ON: +case PCI_EXP_SLTCTL_ATTN_IND_ON: +return "on"; +case PCI_EXP_SLTCTL_PWR_IND_BLINK: +case PCI_EXP_SLTCTL_ATTN_IND_BLINK: +return "blink"; +case PCI_EXP_SLTCTL_PWR_IND_OFF: +case PCI_EXP_SLTCTL_ATTN_IND_OFF: +return "off"; +default: +return "invalid"; +} +} + /*** * pci express capability helper functions */ @@ -735,6 +753,28 @@ void pcie_cap_slot_get(PCIDevice *dev, uint16_t *slt_ctl, uint16_t *slt_sta) *slt_sta = pci_get_word(exp_cap + PCI_EXP_SLTSTA); } +static void find_child_fn(PCIBus *bus, PCIDevice *dev, void *opaque) +{ +PCIDevice **child = opaque; + +if (!*child) { +*child = dev; +} +} + +/* + * Returns the plugged device or first function of multifunction plugged device + */ +static PCIDevice *pcie_cap_slot_find_child(PCIDevice *dev) +{ +PCIBus *sec_bus = pci_bridge_get_sec_bus(PCI_BRIDGE(dev)); +PCIDevice *child = NULL; + +pci_for_each_device(sec_bus, pci_bus_num(sec_bus), find_child_fn, &child); + +return child; +} + void pcie_cap_slot_write_config(PCIDevice *dev, uint16_t old_slt_ctl, uint16_t old_slt_sta, uint32_t addr, uint32_t val, int len) @@ -779,6 +819,22 @@ void pcie_cap_slot_write_config(PCIDevice *dev, sltsta); } +if (trace_event_get_state_backends(TRACE_PCIE_CAP_SLOT_WRITE_CONFIG)) { +DeviceState *parent = DEVICE(dev); +DeviceState *child = DEVICE(pcie_cap_slot_find_child(dev)); + +trace_pcie_cap_slot_write_config( +parent->canonical_path, +child ? child->canonical_path : "no-child", +(sltsta & PCI_EXP_SLTSTA_PDS) ? "present" : "not present", +pcie_led_state_to_str(old_slt_ctl & PCI_EXP_SLTCTL_PIC), +pcie_led_state_to_str(val & PCI_EXP_SLTCTL_PIC), +pcie_led_state_to_str(old_slt_ctl & PCI_EXP_SLTCTL_AIC), +pcie_led_state_to_str(val & PCI_EXP_SLTCTL_AIC), +(old_slt_ctl & PCI_EXP_SLTCTL_PWR_OFF) ? "off" : "on", +(val & PCI_EXP_SLTCTL_PWR_OFF) ? "off" : "on"); +} + /* * If the slot is populated, power indicator is off and power * controller is off, it is safe to detach the devices. diff --git a/hw/pci/shpc.c b/hw/pci/shpc.c index d2a5eea69e..aac6f2d034 100644 --- a/hw/pci/shpc.c +++ b/hw/pci/shpc.c @@ -8,6 +8,7 @@ #include "hw/pci/pci.h" #include "hw/pci/pci_bus.h" #include "hw/pci/msi.h" +#include "trace.h" /* TODO: model power only and disabled slot states. */ /* TODO: handle SERR and wakeups */ @@ -123,6 +124,34 @@ #define SHPC_PCI_TO_IDX(pci_slot) ((pci_slot) - 1) #define SHPC_IDX_TO_PHYSICAL(slot) ((slot) + 1) +static const char *shpc_led_state_to_str(uint8_t value) +{ +switch (value) { +case SHPC_LED_ON: +return "on"; +case SHPC_LED_BLINK: +return "blink"; +case SHPC_LED_OFF: +return "off"; +default: +return "invalid"; +} +} + +static const char *shpc_slot_state_to_str(uint8_t value) +{ +switch (value) { +case SHPC_STATE_PWRONLY: +return "power-only"; +case SHPC_STATE_ENABLED: +return "enabled"; +case SHPC_STATE_DISABLED: +return "disabled"; +default: +return "invalid"; +} +} + static uint8_t shpc_get_status(SHPCDevice *shpc, int slot, uint16_t msk) { uint8_t *status = shpc->config + SHPC_SLOT_STATUS(slot); @@ -302,6 +331,23 @@ static void shpc_slot_command(PCIDevice *d, uint8_t target, shpc_set_status(shpc, slot, state, SHPC_SLOT_STATE_MASK); } +if (trace_event_get_state_backends(TRACE_SHPC_SLOT_COMMAND)) { +DeviceState *parent = DEVICE(d); +int pci_slot = SHPC_IDX_TO_PCI(slot); +DeviceState *child = +DEVICE(shpc->sec_bus->devices[PCI_DEVFN(pci_slot, 0)]); + +trace_shpc_slot_command( +parent->canonical_path, pci_slot, +child ? child->canonical_path : "no-child", +
[PATCH v9 0/1] pci hotplug tracking
v9: I was convinced, that adding new qapi interfaces here is questionable. So, to conclude, let's still add two convenient trace-points, which may help to debug hotplug problems. Vladimir Sementsov-Ogievskiy (1): hw: add some convenient trace-events for pcie and shpc hotplug hw/pci/pcie.c | 56 + hw/pci/shpc.c | 46 + hw/pci/trace-events | 6 + 3 files changed, 108 insertions(+) -- 2.34.1
Re: [RFC 0/4] mirror: implement incremental and bitmap modes
On 01.03.24 18:14, Fiona Ebner wrote: Am 01.03.24 um 16:02 schrieb Vladimir Sementsov-Ogievskiy: On 01.03.24 17:52, Fiona Ebner wrote: Am 01.03.24 um 15:14 schrieb Vladimir Sementsov-Ogievskiy: As we already understood, (block-)job-api needs some spring-cleaning. Unfortunately I don't have much time on it, but still I decided to start from finally depreacting block-job-* API and moving to job-*.. Probably bitmap/bitmap-mode/sync APIs also need some optimization, keeping in mind new block-dirty-bitmap-merge api. So, what I could advice in this situation for newc interfaces: 1. be minimalistic 2. add `x-` prefix when unsure So, following these two rules, what about x-bitmap field, which may be combined only with 'full' mode, and do what you need? AFAIU, it should rather be marked as @unstable in QAPI [0]? Then it doesn't need to be renamed if it becomes stable later. Right, unstable feature is needed, using "x-" is optional. Recent discussion about it was in my "vhost-user-blk: live resize additional APIs" series: https://patchew.org/QEMU/20231006202045.1161543-1-vsement...@yandex-team.ru/20231006202045.1161543-5-vsement...@yandex-team.ru/ Following it, I think it's OK to not care anymore with "x-" prefixes, and rely on unstable feature. Thanks for the confirmation! I'll go without the prefix in the name then. About documentation: actually, I never liked that we use for backup job "MirrorSyncMode". Now it looks more like "BackupSyncMode", having two values supported only by backup. I'm also unsure how mode=full&bitmap=some_bitmap differs from mode=bitmap&bitmap=some_bitmap.. With the current patches, it was an error to specify @bitmap for other modes than 'incremental' and 'bitmap'. Current documentation says: # @bitmap: The name of a dirty bitmap to use. Must be present if sync # is "bitmap" or "incremental". Can be present if sync is "full" # or "top". Must not be present otherwise. # (Since 2.4 (drive-backup), 3.1 (blockdev-backup)) This is for backup. The documentation (and behavior) for @bitmap added by these patches for mirror is different ;) I meant backup in "I'm also unsure", just as one more point not consider backup-bitmap-API as a prototype for mirror-bitmap-API. Best Regards, Fiona -- Best regards, Vladimir
Re: [Spam][PATCH] replay: Improve error messages about configuration conflicts
Reviewed-by: Pavel Dovgalyuk On 01.03.2024 15:06, Markus Armbruster wrote: Improve Record/replay feature is not supported for '-rtc base=localtime' Record/replay feature is not supported for 'smp' Record/replay feature is not supported for '-snapshot' to Record/replay is not supported with -rtc base=localtime Record/replay is not supported with multiple CPUs Record/replay is not supported with -snapshot Signed-off-by: Markus Armbruster --- replay/replay.c | 2 +- system/vl.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/replay/replay.c b/replay/replay.c index 3fd241a4fc..a2c576c16e 100644 --- a/replay/replay.c +++ b/replay/replay.c @@ -511,7 +511,7 @@ void replay_add_blocker(const char *feature) { Error *reason = NULL; -error_setg(&reason, "Record/replay feature is not supported for '%s'", +error_setg(&reason, "Record/replay is not supported with %s", feature); replay_blockers = g_slist_prepend(replay_blockers, reason); } diff --git a/system/vl.c b/system/vl.c index e480afd7a0..cc03a17c09 100644 --- a/system/vl.c +++ b/system/vl.c @@ -1932,7 +1932,7 @@ static void qemu_apply_machine_options(QDict *qdict) } if (current_machine->smp.cpus > 1) { -replay_add_blocker("smp"); +replay_add_blocker("multiple CPUs"); } }
Re: [PULL 2/2] aspeed: fix hardcode boot address 0
Hi Jamin, On 27/2/24 13:52, Cédric Le Goater wrote: From: Jamin Lin In the previous design of ASPEED SOCs QEMU model, it set the boot address at "0" which was the hardcode setting for ast10x0, ast2600, ast2500 and ast2400. According to the design of ast2700, it has a bootmcu(riscv-32) which is used for executing SPL and initialize DRAM and copy u-boot image from SPI/Flash to DRAM at address 0x4 at SPL boot stage. Then, CPUs(cortex-a35) execute u-boot, kernel and rofs. Currently, qemu not support emulate two CPU architectures at the same machine. Therefore, qemu will only support to emulate CPU(cortex-a35) side for ast2700 and the boot address is "0x4 ". Fixed hardcode boot address "0" for future models using a different mapping address. Signed-off-by: Troy Lee Signed-off-by: Jamin Lin Reviewed-by: Cédric Le Goater Tip for the email workflow: when someone provide a R-b tag for a patch, please carry it on in your next iterations. https://lore.kernel.org/qemu-devel/09f9ca34-4e0c-4ada-b808-643a8c578...@linaro.org/ See https://www.qemu.org/docs/master/devel/submitting-a-patch.html#proper-use-of-reviewed-by-tags-can-aid-review Signed-off-by: Cédric Le Goater --- include/hw/arm/aspeed_soc.h | 2 -- hw/arm/aspeed.c | 4 +++- hw/arm/aspeed_ast2400.c | 4 ++-- hw/arm/aspeed_ast2600.c | 2 +- 4 files changed, 6 insertions(+), 6 deletions(-)
Re: [PATCH] Fix unexpected Illegal instruction error on RISC-V.
Hi SiHuaN, On 1/3/24 15:55, SiHuaN wrote: Avoid right-shifting by a negative number of bits when lmul is 8. FYI Demin posted a similar patch, see: https://lore.kernel.org/qemu-devel/20240225174114.5298-1-demin@starfivetech.com/ Signed-off-by: SiHuaN --- target/riscv/vector_helper.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 84cec73eb2..f0158ea237 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -53,10 +53,11 @@ target_ulong HELPER(vsetvl)(CPURISCVState *env, target_ulong s1, * VLEN * LMUL >= SEW * VLEN >> (8 - lmul) >= sew * (vlenb << 3) >> (8 - lmul) >= sew + * Considering that lmul may be 8, the following form cannot be used. * vlenb >> (8 - 3 - lmul) >= sew */ if (vlmul == 4 || -cpu->cfg.vlenb >> (8 - 3 - vlmul) < sew) { +(cpu->cfg.vlenb << 3) >> (8 - vlmul) < sew) { vill = true; } }
Re: [PATCH v9 1/1] hw: add some convenient trace-events for pcie and shpc hotplug
On 1/3/24 16:41, Vladimir Sementsov-Ogievskiy wrote: Add trace-events that may help to debug problems with hotplugging. Signed-off-by: Vladimir Sementsov-Ogievskiy --- hw/pci/pcie.c | 56 + hw/pci/shpc.c | 46 + hw/pci/trace-events | 6 + 3 files changed, 108 insertions(+) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH] replay: Improve error messages about configuration conflicts
On 1/3/24 13:06, Markus Armbruster wrote: Improve Record/replay feature is not supported for '-rtc base=localtime' Record/replay feature is not supported for 'smp' Record/replay feature is not supported for '-snapshot' to Record/replay is not supported with -rtc base=localtime Record/replay is not supported with multiple CPUs Record/replay is not supported with -snapshot Signed-off-by: Markus Armbruster --- replay/replay.c | 2 +- system/vl.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH v2 1/6] libqos/virtio.c: init all elems in qvring_indirect_desc_setup()
17.02.2024 22:26, Daniel Henrique Barboza: diff --git a/tests/qtest/libqos/virtio.c b/tests/qtest/libqos/virtio.c index 410513225f..4f39124eba 100644 --- a/tests/qtest/libqos/virtio.c +++ b/tests/qtest/libqos/virtio.c @@ -280,14 +280,27 @@ QVRingIndirectDesc *qvring_indirect_desc_setup(QTestState *qs, QVirtioDevice *d, indirect->elem = elem; indirect->desc = guest_alloc(alloc, sizeof(struct vring_desc) * elem); -for (i = 0; i < elem - 1; ++i) { +for (i = 0; i < elem; ++i) { /* indirect->desc[i].addr */ qvirtio_writeq(d, qs, indirect->desc + (16 * i), 0); -/* indirect->desc[i].flags */ -qvirtio_writew(d, qs, indirect->desc + (16 * i) + 12, - VRING_DESC_F_NEXT); -/* indirect->desc[i].next */ -qvirtio_writew(d, qs, indirect->desc + (16 * i) + 14, i + 1); + +/* + * If it's not the last element of the ring, set + * the chain (VRING_DESC_F_NEXT) flag and + * desc->next. Clear the last element - there's + * no guarantee that guest_alloc() will do it. + */ +if (i != elem - 1) { +/* indirect->desc[i].flags */ +qvirtio_writew(d, qs, indirect->desc + (16 * i) + 12, + VRING_DESC_F_NEXT); + +/* indirect->desc[i].next */ +qvirtio_writew(d, qs, indirect->desc + (16 * i) + 14, i + 1); +} else { +qvirtio_writew(d, qs, indirect->desc + (16 * i) + 12, 0); +qvirtio_writew(d, qs, indirect->desc + (16 * i) + 14, 0); +} } In my view it would be cleaner to add 2 extra function calls after this loop for the i==elem-1 case: for (i = 0; i < elem - 1; ++i) { /* indirect->desc[i].addr */ qvirtio_writeq(d, qs, indirect->desc + (16 * i), 0); /* indirect->desc[i].flags */ qvirtio_writew(d, qs, indirect->desc + (16 * i) + 12, VRING_DESC_F_NEXT); /* indirect->desc[i].next */ qvirtio_writew(d, qs, indirect->desc + (16 * i) + 14, i + 1); } /* clear last element */ qvirtio_writeq(d, qs, indirect->desc + (16 * i), 0); qvirtio_writew(d, qs, indirect->desc + (16 * i) + 12, 0); qvirtio_writew(d, qs, indirect->desc + (16 * i) + 14, 0); FWIW, -- it's too late to change it now I think. /mjt
Re: [PATCH v7 1/2] qom: new object to associate device to numa node
On Fri, 1 Mar 2024 08:33:42 + Ankit Agrawal wrote: > >> As for your suggestion of using acpi-dev as the arg to take both > >> pci-dev and acpi-dev.. Would that mean sending a pure pci device > >> (not the corner case you mentioned) through the acpi-dev argument > >> as well? Not sure if that would appropriate. > > > > Ah, looking up my description is unclear. I meant two optional parameters > > pci-dev or acpi-dev - which one was supplied would indicate the type > > of handle to be used. > > Yes, that makes sense. But for now only have pci-dev until we have any > acpi-dev related code added? IIRC, this is what we discussed earlier. Yep, I think we acknowledged that this device supports either PCI or ACPI devices and we only currently have a use case for PCI devices, so that's what's implemented and what the interface expects. A separate ACPI device interface could be added later or derived from updating the interface to accept a generic device object. Thanks, Alex
Re: Re: [PATCH] Fix unexpected Illegal instruction error on RISC-V.
Hi Philippe, Thanks for the heads up. Sorry I didn't check for this before sending out my patch. I'll track this in Demin's thread. > -原始邮件- > 发件人: "Philippe Mathieu-Daudé" > 发送时间: 2024-03-01 23:51:03 (星期五) > 收件人: SiHuaN , qemu-devel@nongnu.org > 抄送: "demin.han" , qemu-riscv , "Daniel Henrique Barboza" > 主题: Re: [PATCH] Fix unexpected Illegal instruction error on RISC-V. > > Hi SiHuaN, > > On 1/3/24 15:55, SiHuaN wrote: > > Avoid right-shifting by a negative number of bits when lmul is 8. > > FYI Demin posted a similar patch, see: > https://lore.kernel.org/qemu-devel/20240225174114.5298-1-demin@starfivetech.com/ > > > Signed-off-by: SiHuaN > > --- > > target/riscv/vector_helper.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c > > index 84cec73eb2..f0158ea237 100644 > > --- a/target/riscv/vector_helper.c > > +++ b/target/riscv/vector_helper.c > > @@ -53,10 +53,11 @@ target_ulong HELPER(vsetvl)(CPURISCVState *env, target_ulong s1, > >* VLEN * LMUL >= SEW > >* VLEN >> (8 - lmul) >= sew > >* (vlenb << 3) >> (8 - lmul) >= sew > > + * Considering that lmul may be 8, the following form cannot be used. > >* vlenb >> (8 - 3 - lmul) >= sew > >*/ > > if (vlmul == 4 || > > -cpu->cfg.vlenb >> (8 - 3 - vlmul) < sew) { > > +(cpu->cfg.vlenb << 3) >> (8 - vlmul) < sew) { > > vill = true; > > } > > }
Re: [PATCH v1 0/5] hw/ppc: SPI model
Hello, I would greatly appreciate the review comments/suggestions on PATCH V1. Thank You and Regards, Chalapathi On 07-02-2024 21:38, Chalapathi V wrote: Hello, In this series of patchset, SPI controller and responder models for Power10 processor are modelled. Serial peripheral interface provides full-duplex synchronous serial communication between single controller and multiple responder devices. The current configuration supports a single SPI controller reside on the SPI bus. In p10, SPI controller device model supports a connection to a single SPI responder such as SPI seeproms, TPM, flash device and an ADC controller. SPI controller model is divided into configuration unit, sequencer FSM and shifter engine. All SPI function control is mapped into the SPI register space to enable full control by firmware. SPI configuration component is modelled which contains all SPI configuration and status registers as well as the hold registers for data to be sent or having been received. Shift engine performs serialization and de-serialization according to the control by the sequencer and according to the setup defined in the configuration registers. Sequencer implements the main control logic and FSM to handle data transmit and data receive control of the shift engine. Microchip's 25CSM04 SEEPROM device is modelled and connected to SPI bus "spi_bus2" of SPI controller "PIB_SPIC[2]". Patches overview in V1. PATCH1: Create a SPI responder model which includes responder methods and SPI bus implementation. PATCH2: Create a SPI controller model and implement configuration unit to model SCOM registers. PATCH3: SPI controller model: implement sequencer FSM and shift engine. PATCH4: create SPI SEEPROM model. PATCH5: Connect SPI controllers to p10 chip and create keystore seeprom device on spi_bus2 of PIB_SPIC[2]. Thank You, Chalapathi Chalapathi V (5): hw/ppc: SPI responder model hw/ppc: SPI controller model - registers implementation hw/ppc: SPI controller model - sequencer and shifter hw/ppc: SPI SEEPROM model hw/ppc: SPI controller wiring to P10 chip and create seeprom device include/hw/ppc/pnv_chip.h |4 + include/hw/ppc/pnv_spi_controller.h | 101 ++ include/hw/ppc/pnv_spi_responder.h | 109 ++ include/hw/ppc/pnv_spi_seeprom.h| 70 ++ include/hw/ppc/pnv_xscom.h |3 + hw/ppc/pnv.c| 32 + hw/ppc/pnv_spi_controller.c | 1609 +++ hw/ppc/pnv_spi_responder.c | 166 +++ hw/ppc/pnv_spi_seeprom.c| 989 hw/ppc/meson.build |3 + 10 files changed, 3086 insertions(+) create mode 100644 include/hw/ppc/pnv_spi_controller.h create mode 100644 include/hw/ppc/pnv_spi_responder.h create mode 100644 include/hw/ppc/pnv_spi_seeprom.h create mode 100644 hw/ppc/pnv_spi_controller.c create mode 100644 hw/ppc/pnv_spi_responder.c create mode 100644 hw/ppc/pnv_spi_seeprom.c
Re: [PULL 26/29] contrib/plugins: extend execlog to track register changes
Zhao Liu writes: > On Fri, Mar 01, 2024 at 10:22:08AM +, Alex Bennée wrote: >> Date: Fri, 01 Mar 2024 10:22:08 + >> From: Alex Bennée >> Subject: Re: [PULL 26/29] contrib/plugins: extend execlog to track register >> changes >> >> Zhao Liu writes: >> >> > Hi Alex, >> > >> > I hit the following warnings (with "./configure --enable-werror"): >> > >> > /qemu/contrib/plugins/execlog.c: In function ‘registers_init’: >> > /qemu/contrib/plugins/execlog.c:330:17: warning: ‘g_pattern_match_string’ >> > is deprecated: Use 'g_pattern_spec_match_string' instead >> > [-Wdeprecated-declarations] >> > 330 | if (g_pattern_match_string(pat, rd->name) || >> > | ^~ >> > In file included from /usr/include/glib-2.0/glib.h:65, >> > from /qemu/contrib/plugins/execlog.c:9: >> > /usr/include/glib-2.0/glib/gpattern.h:55:15: note: declared here >> >55 | gboolean g_pattern_match_string (GPatternSpec *pspec, >> > | ^~ >> > /qemu/contrib/plugins/execlog.c:331:21: warning: ‘g_pattern_match_string’ >> > is deprecated: Use 'g_pattern_spec_match_string' instead >> > [-Wdeprecated-declarations] >> > 331 | g_pattern_match_string(pat, rd_lower)) { >> > | ^~ >> > In file included from /usr/include/glib-2.0/glib.h:65, >> > from /qemu/contrib/plugins/execlog.c:9: >> > /usr/include/glib-2.0/glib/gpattern.h:55:15: note: declared here >> >55 | gboolean g_pattern_match_string (GPatternSpec *pspec, >> > | ^~ >> > /qemu/contrib/plugins/execlog.c:339:63: warning: passing argument 2 of >> > ‘g_ptr_array_add’ discards ‘const’ qualifier from pointer target type >> > [-Wdiscarded-qualifiers] >> > 339 | g_ptr_array_add(all_reg_names, >> > reg->name); >> >> Hmm I missed that. Not sure what the neatest solution is in this case - >> g_ptr_array_new() doesn't have a destroy func so we shouldn't ever >> attempt to free it. We can manually re-add the const qualifier at the >> other end for completeness and I guess comment and cast? > > I find other palces use 2 ways: > * Use g_strdup() to create a copy (e.g., net/net.c, > add_nic_model_help()). But I'm not sure if this is OK since you said > we shouldn't attempt to free it. May I ask if the free issue you > mentioned will affect the use of g_strdup() here? If we g_strdup we have to remember to free later and ensure the g_ptr_array has a free func. > * Another way is the forced conversion to gpointer (also e.g., in > net/net.c, qemu_get_nic_models()). I think this is ok, but with a comment on all_reg_names just to remind any potential users that the strings are const and allocated by QEMU so are not to be modified or freed. > > Which way do you like? ;-) > >> >> but I hesitated to add it for this case as plugins shouldn't assume they >> have access to QEMU's internals. Maybe the glib-compat.h header could be >> treated as a special case. > > Thanks! This works on my side! > > I support to fix the compatibility as the above, after all it's always > confusing when we allow users to use newer glib and see warnings at > compile time! > >> > >> > I also noticed in target/arm/helper.c, there's another >> > g_pattern_match_string() but I haven't met the warning. >> >> Hmm that's weird. I suspect glib suppresses the warnings with: >> >> /* Ask for warnings for anything that was marked deprecated in >>* the defined version, or before. It is a candidate for rewrite. >>*/ >> #define GLIB_VERSION_MIN_REQUIRED GLIB_VERSION_2_56 >> > > I'm not too familiar with the QEMU build framework, but based on this, a > natural question is, can this rule be applied to plugins code as well? > If so, this would also avoid warning. I think the simplest solution would be to add: #include "glib-compat.h" into qemu-plugin.h so plugins have the same deprecation rules as the QEMU they come from. I checked with Michael on IRC and Debian currently doesn't package any header files but if anyone starts shipping a qemu-dev package we'll need to make sure we include glib-compat.h in the same directory and qemu-plugin.h. > > Thanks, > Zhao -- Alex Bennée Virtualisation Tech Lead @ Linaro
Re: [PATCH v1 0/5] hw/ppc: SPI model
Chalapathi, On 3/1/24 17:17, Chalapathi V wrote: Hello, I would greatly appreciate the review comments/suggestions on PATCH V1. Thank You and Regards, I didn't forget but I lacked the time in this release cycle. Sorry about that. I have one quick comment though. There are already a few models implementing SPI controllers in QEMU. When I skimmed through the patches, I was surprised to see no use of the available SSI framework. Doesn't the current framework fit your needs ? Please take a look at files in : include/hw/ssi/* hw/ssi/* Same comment for the Serial EEPROM model. This device is generic and not POWER specific. It should be possible to attach the device model on other machines and different SPI bus provided by QEMU. This doesn't mean rewriting everything, but the "HW" interface probably needs to be reworked. It would make it easier to write unit test (see ests/qtest/) and ease the review also. Thanks, C. Chalapathi On 07-02-2024 21:38, Chalapathi V wrote: Hello, In this series of patchset, SPI controller and responder models for Power10 processor are modelled. Serial peripheral interface provides full-duplex synchronous serial communication between single controller and multiple responder devices. The current configuration supports a single SPI controller reside on the SPI bus. In p10, SPI controller device model supports a connection to a single SPI responder such as SPI seeproms, TPM, flash device and an ADC controller. SPI controller model is divided into configuration unit, sequencer FSM and shifter engine. All SPI function control is mapped into the SPI register space to enable full control by firmware. SPI configuration component is modelled which contains all SPI configuration and status registers as well as the hold registers for data to be sent or having been received. Shift engine performs serialization and de-serialization according to the control by the sequencer and according to the setup defined in the configuration registers. Sequencer implements the main control logic and FSM to handle data transmit and data receive control of the shift engine. Microchip's 25CSM04 SEEPROM device is modelled and connected to SPI bus "spi_bus2" of SPI controller "PIB_SPIC[2]". Patches overview in V1. PATCH1: Create a SPI responder model which includes responder methods and SPI bus implementation. PATCH2: Create a SPI controller model and implement configuration unit to model SCOM registers. PATCH3: SPI controller model: implement sequencer FSM and shift engine. PATCH4: create SPI SEEPROM model. PATCH5: Connect SPI controllers to p10 chip and create keystore seeprom device on spi_bus2 of PIB_SPIC[2]. Thank You, Chalapathi Chalapathi V (5): hw/ppc: SPI responder model hw/ppc: SPI controller model - registers implementation hw/ppc: SPI controller model - sequencer and shifter hw/ppc: SPI SEEPROM model hw/ppc: SPI controller wiring to P10 chip and create seeprom device include/hw/ppc/pnv_chip.h | 4 + include/hw/ppc/pnv_spi_controller.h | 101 ++ include/hw/ppc/pnv_spi_responder.h | 109 ++ include/hw/ppc/pnv_spi_seeprom.h | 70 ++ include/hw/ppc/pnv_xscom.h | 3 + hw/ppc/pnv.c | 32 + hw/ppc/pnv_spi_controller.c | 1609 +++ hw/ppc/pnv_spi_responder.c | 166 +++ hw/ppc/pnv_spi_seeprom.c | 989 hw/ppc/meson.build | 3 + 10 files changed, 3086 insertions(+) create mode 100644 include/hw/ppc/pnv_spi_controller.h create mode 100644 include/hw/ppc/pnv_spi_responder.h create mode 100644 include/hw/ppc/pnv_spi_seeprom.h create mode 100644 hw/ppc/pnv_spi_controller.c create mode 100644 hw/ppc/pnv_spi_responder.c create mode 100644 hw/ppc/pnv_spi_seeprom.c
Re: [PATCH 3/9] backends/confidential-guest-support: Add functions to support IGVM
On Tue, Feb 27, 2024 at 02:50:09PM +, Roy Hopkins wrote: > In preparation for supporting the processing of IGVM files to configure > guests, this adds a set of functions to ConfidentialGuestSupport > allowing configuration of secure virtual machines that can be > implemented for each supported isolation platform type such as Intel TDX > or AMD SEV-SNP. These functions will be called by IGVM processing code > in subsequent patches. > > This commit provides a default implementation of the functions that > either perform no action or generate a warning or error when they are > called. Targets that support ConfidentalGuestSupport should override > these implementations. > > Signed-off-by: Roy Hopkins > --- > backends/confidential-guest-support.c | 26 > include/exec/confidential-guest-support.h | 76 +++ > 2 files changed, 102 insertions(+) > > diff --git a/backends/confidential-guest-support.c > b/backends/confidential-guest-support.c > index da436fb736..42628be8d7 100644 > --- a/backends/confidential-guest-support.c > +++ b/backends/confidential-guest-support.c > @@ -14,6 +14,7 @@ > #include "qemu/osdep.h" > > #include "exec/confidential-guest-support.h" > +#include "qemu/error-report.h" > > OBJECT_DEFINE_ABSTRACT_TYPE(ConfidentialGuestSupport, > confidential_guest_support, > @@ -45,8 +46,33 @@ static void > confidential_guest_support_class_init(ObjectClass *oc, void *data) > #endif > } > > +static int check_support(ConfidentialGuestPlatformType platform, > + uint16_t platform_version, uint8_t highest_vtl, > + uint64_t shared_gpa_boundary) > +{ > +/* Default: no support. */ > +return 0; > +} > + > +static int set_guest_state(hwaddr gpa, uint8_t *ptr, uint64_t len, > + ConfidentialGuestPageType memory_type, > + uint16_t cpu_index) > +{ > +warn_report("Confidential guest memory not supported"); > +return -1; > +} > + > +static int get_mem_map_entry(int index, ConfidentialGuestMemoryMapEntry > *entry) > +{ > +return 1; > +} IIUC, all these can reports errors, and as such I think they should have an 'Error **errp' parameter, so we can report precise errors in these methods, rather than less useful generic errors in the caller. The above 'warn_report' ought to be an error too, since it is returning an failure code (-1) > + > static void confidential_guest_support_init(Object *obj) > { > +ConfidentialGuestSupport *cgs = CONFIDENTIAL_GUEST_SUPPORT(obj); > +cgs->check_support = check_support; > +cgs->set_guest_state = set_guest_state; > +cgs->get_mem_map_entry = get_mem_map_entry; > } > > static void confidential_guest_support_finalize(Object *obj) > diff --git a/include/exec/confidential-guest-support.h > b/include/exec/confidential-guest-support.h > index b08ad8de4d..c43a1a32f1 100644 > --- a/include/exec/confidential-guest-support.h > +++ b/include/exec/confidential-guest-support.h > @@ -21,10 +21,46 @@ > #ifndef CONFIG_USER_ONLY > > #include "qom/object.h" > +#include "exec/hwaddr.h" > + > +#if defined(CONFIG_IGVM) > +#include "igvm/igvm.h" > +#endif > > #define TYPE_CONFIDENTIAL_GUEST_SUPPORT "confidential-guest-support" > OBJECT_DECLARE_SIMPLE_TYPE(ConfidentialGuestSupport, > CONFIDENTIAL_GUEST_SUPPORT) > > +typedef enum ConfidentialGuestPlatformType { > +CGS_PLATFORM_SEV, > +CGS_PLATFORM_SEV_ES, > +CGS_PLATFORM_SEV_SNP, > +CGS_PLATFORM_TDX, QEMU only has support for SEV & SEV_ES today. We should leave the others until we actually get an impl of SEV-SNP/TDX in QEMU that supports those platforms. > +} ConfidentialGuestPlatformType; > + > +typedef enum ConfidentialGuestMemoryType { > +CGS_MEM_RAM, > +CGS_MEM_RESERVED, > +CGS_MEM_ACPI, > +CGS_MEM_NVS, > +CGS_MEM_UNUSABLE, > +} ConfidentialGuestMemoryType; > + > +typedef struct ConfidentialGuestMemoryMapEntry { > +uint64_t gpa; > +uint64_t size; > +ConfidentialGuestMemoryType type; > +} ConfidentialGuestMemoryMapEntry; > + > +typedef enum ConfidentialGuestPageType { > +CGS_PAGE_TYPE_NORMAL, > +CGS_PAGE_TYPE_VMSA, > +CGS_PAGE_TYPE_ZERO, > +CGS_PAGE_TYPE_UNMEASURED, > +CGS_PAGE_TYPE_SECRETS, > +CGS_PAGE_TYPE_CPUID, > +CGS_PAGE_TYPE_REQUIRED_MEMORY, > +} ConfidentialGuestPageType; > + > struct ConfidentialGuestSupport { > Object parent; > > @@ -60,6 +96,46 @@ struct ConfidentialGuestSupport { > */ > char *igvm_filename; > #endif > + > +/* > + * The following virtual methods need to be implemented by systems that > + * support confidential guests that can be configured with IGVM and are > + * used during processing of the IGVM file with process_igvm(). > + */ > + > +/* > + * Check for to see if this confidential guest supports a particular > + * platform or configuration > + */ > +in
Re: [PATCH] hw/misc: zynq_slcr: set SLC_RST bit in REBOOT_STATUS register
On Wed, 28 Feb 2024 at 01:40, Gregory Anders wrote: > > When the CPU is reset using PSS_RST_CTRL in the SLCR, bit 19 in > REBOOT_STATUS should be set. > > Refer to page 1602 of the Xilinx Zynq 7000 Technical Reference Manual. > > Signed-off-by: Gregory Anders > --- > hw/misc/zynq_slcr.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/hw/misc/zynq_slcr.c b/hw/misc/zynq_slcr.c > index d2ac2e77f2..a8f1792bf6 100644 > --- a/hw/misc/zynq_slcr.c > +++ b/hw/misc/zynq_slcr.c > @@ -120,6 +120,7 @@ REG32(RS_AWDT_CTRL, 0x24c) > REG32(RST_REASON, 0x250) > > REG32(REBOOT_STATUS, 0x258) > +FIELD(REBOOT_STATUS, SLC_RST, 19, 1) > REG32(BOOT_MODE, 0x25c) > > REG32(APU_CTRL, 0x300) > @@ -562,6 +563,7 @@ static void zynq_slcr_write(void *opaque, hwaddr offset, > switch (offset) { > case R_PSS_RST_CTRL: > if (FIELD_EX32(val, PSS_RST_CTRL, SOFT_RST)) { > +s->regs[R_REBOOT_STATUS] |= R_REBOOT_STATUS_SLC_RST_MASK; > qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET); > } > break; > -- The manual also says that "This field is written by ROM code", so as a model of the hardware should QEMU be writing it? I've cc'd the Zynq maintainers for their opinion. thanks -- PMM
Re: [PATCH 4/9] backends/igvm: Implement parsing and processing of IGVM files
On Tue, Feb 27, 2024 at 02:50:10PM +, Roy Hopkins wrote: > This commit adds an implementation of an IGVM loader which parses the > file specified as a pararameter to ConfidentialGuestSupport and provides > a function that uses the interface in the same object to configure and > populate guest memory based on the contents of the file. > > The IGVM file is parsed when a filename is provided but the code to > process the IGVM file is not yet hooked into target systems. This will > follow in a later commit. > > Signed-off-by: Roy Hopkins > --- > backends/confidential-guest-support.c | 4 + > backends/igvm.c | 718 ++ > backends/meson.build | 1 + > include/exec/confidential-guest-support.h | 5 + > include/exec/igvm.h | 35 ++ > 5 files changed, 763 insertions(+) > create mode 100644 backends/igvm.c > create mode 100644 include/exec/igvm.h > > diff --git a/backends/confidential-guest-support.c > b/backends/confidential-guest-support.c > index 42628be8d7..cb7651a8d0 100644 > --- a/backends/confidential-guest-support.c > +++ b/backends/confidential-guest-support.c > @@ -15,6 +15,7 @@ > > #include "exec/confidential-guest-support.h" > #include "qemu/error-report.h" > +#include "exec/igvm.h" > > OBJECT_DEFINE_ABSTRACT_TYPE(ConfidentialGuestSupport, > confidential_guest_support, > @@ -33,6 +34,9 @@ static void set_igvm(Object *obj, const char *value, Error > **errp) > ConfidentialGuestSupport *cgs = CONFIDENTIAL_GUEST_SUPPORT(obj); > g_free(cgs->igvm_filename); > cgs->igvm_filename = g_strdup(value); > +#if defined(CONFIG_IGVM) > +igvm_file_init(cgs); > +#endif > } > #endif > > diff --git a/backends/igvm.c b/backends/igvm.c > new file mode 100644 > index 00..a6261d796f > --- /dev/null > +++ b/backends/igvm.c > @@ -0,0 +1,718 @@ > +/* > + * QEMU IGVM configuration backend for Confidential Guests > + * > + * Copyright (C) 2023-2024 SUSE > + * > + * Authors: > + * Roy Hopkins > + * > + * This work is licensed under the terms of the GNU GPL, version 2 or later. > + * See the COPYING file in the top-level directory. > + */ > + > +#include "qemu/osdep.h" > + > +#if defined(CONFIG_IGVM) > + > +#include "exec/confidential-guest-support.h" > +#include "qemu/queue.h" > +#include "qemu/typedefs.h" > + > +#include "exec/igvm.h" > +#include "qemu/error-report.h" > +#include "hw/boards.h" > +#include "qapi/error.h" > +#include "exec/address-spaces.h" > + > +#include > +#include > +#include > + > +typedef struct IgvmParameterData { > +QTAILQ_ENTRY(IgvmParameterData) next; > +uint8_t *data; > +uint32_t size; > +uint32_t index; > +} IgvmParameterData; > + > +static QTAILQ_HEAD(, IgvmParameterData) parameter_data; > + > +static void directive_page_data(ConfidentialGuestSupport *cgs, int i, > +uint32_t compatibility_mask); > +static void directive_vp_context(ConfidentialGuestSupport *cgs, int i, > + uint32_t compatibility_mask); > +static void directive_parameter_area(ConfidentialGuestSupport *cgs, int i, > + uint32_t compatibility_mask); > +static void directive_parameter_insert(ConfidentialGuestSupport *cgs, int i, > + uint32_t compatibility_mask); > +static void directive_memory_map(ConfidentialGuestSupport *cgs, int i, > + uint32_t compatibility_mask); > +static void directive_vp_count(ConfidentialGuestSupport *cgs, int i, > + uint32_t compatibility_mask); > +static void directive_environment_info(ConfidentialGuestSupport *cgs, int i, > + uint32_t compatibility_mask); > +static void directive_required_memory(ConfidentialGuestSupport *cgs, int i, > + uint32_t compatibility_mask); > + > +struct IGVMDirectiveHandler { > +uint32_t type; > +void (*handler)(ConfidentialGuestSupport *cgs, int i, > +uint32_t compatibility_mask); > +}; > + > +static struct IGVMDirectiveHandler directive_handlers[] = { > +{ IGVM_VHT_PAGE_DATA, directive_page_data }, > +{ IGVM_VHT_VP_CONTEXT, directive_vp_context }, > +{ IGVM_VHT_PARAMETER_AREA, directive_parameter_area }, > +{ IGVM_VHT_PARAMETER_INSERT, directive_parameter_insert }, > +{ IGVM_VHT_MEMORY_MAP, directive_memory_map }, > +{ IGVM_VHT_VP_COUNT_PARAMETER, directive_vp_count }, > +{ IGVM_VHT_ENVIRONMENT_INFO_PARAMETER, directive_environment_info }, > +{ IGVM_VHT_REQUIRED_MEMORY, directive_required_memory }, > +}; > + > +static void directive(uint32_t type, ConfidentialGuestSupport *cgs, int i, > + uint32_t compatibility_mask) > +{ > +size_t handler; > +for (handler = 0; handler < (sizeof(directive_handlers) / > +