[PATCH] qapi: Misc cleanups to migrate QAPIs

2024-02-16 Thread Het Gala
Signed-off-by: Het Gala --- qapi/migration.json | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 5a565d9b8d..5756e650b0 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -1728,6 +1728,7 @@ # # -> { "exe

Re: [PATCH v4 00/10] Optimize buffer_is_zero

2024-02-16 Thread Alexander Monakov
On Thu, 15 Feb 2024, Richard Henderson wrote: > On 2/15/24 13:37, Alexander Monakov wrote: > > Ah, I guess you might be running at low perf_event_paranoid setting that > > allows unprivileged sampling of kernel events? In our submissions the > > percentage was for perf_event_paranoid=2, i.e. rel

[PATCH 10/13] esp.c: introduce esp_update_drq() and update esp_fifo_{push, pop}_buf() to use it

2024-02-16 Thread Mark Cave-Ayland
This new function sets the DRQ line correctly according to the current transfer mode, direction and FIFO contents. Update esp_fifo_push_buf() and esp_fifo_pop_buf() to use it so that DRQ is always set correctly when reading/writing multiple bytes to/from the FIFO. Signed-off-by: Mark Cave-Ayland

[PATCH 13/13] esp.c: remove explicit setting of DRQ within ESP state machine

2024-02-16 Thread Mark Cave-Ayland
Now that esp_update_drq() is called for all reads/writes to the FIFO, there is no need to manually raise and lower the DRQ signal. Signed-off-by: Mark Cave-Ayland Fixes: https://gitlab.com/qemu-project/qemu/-/issues/611 Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1831 --- hw/scsi/esp.c |

[PATCH 11/13] esp.c: update esp_fifo_{push, pop}() to call esp_update_drq()

2024-02-16 Thread Mark Cave-Ayland
This ensures that the DRQ line is always set correctly when reading/writing single bytes to/from the FIFO. Signed-off-by: Mark Cave-Ayland --- hw/scsi/esp.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c index ca0fa5098d..2150cd4

[PATCH 04/13] esp.c: change esp_fifo_push() to take ESPState

2024-02-16 Thread Mark Cave-Ayland
Now that all users of esp_fifo_push() operate on the main FIFO there is no need to pass the FIFO explicitly. Signed-off-by: Mark Cave-Ayland --- hw/scsi/esp.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c index 7a24515bb9..b898e43e2b

[PATCH 01/13] esp.c: replace cmdfifo use of esp_fifo_pop_buf() in do_command_phase()

2024-02-16 Thread Mark Cave-Ayland
The aim is to restrict the esp_fifo_*() functions so that they only operate on the hardware FIFO. When reading from cmdfifo in do_command_phase() use the underlying Fifo8 functions directly. Signed-off-by: Mark Cave-Ayland --- hw/scsi/esp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(

[PATCH 02/13] esp.c: replace cmdfifo use of esp_fifo_pop_buf() in do_message_phase()

2024-02-16 Thread Mark Cave-Ayland
The aim is to restrict the esp_fifo_*() functions so that they only operate on the hardware FIFO. When reading from cmdfifo in do_message_phase() use the underlying Fifo8 functions directly. Signed-off-by: Mark Cave-Ayland --- hw/scsi/esp.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletio

[PATCH 06/13] esp.c: use esp_fifo_push() instead of fifo8_push()

2024-02-16 Thread Mark Cave-Ayland
There are still a few places that use fifo8_push() instead of esp_fifo_push() in order to push a value into the FIFO. Update those places to use esp_fifo_push() instead. Signed-off-by: Mark Cave-Ayland --- hw/scsi/esp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/hw

[PATCH 08/13] esp.c: introduce esp_fifo_push_buf() function for pushing to the FIFO

2024-02-16 Thread Mark Cave-Ayland
Instead of pushing data into the FIFO directly with fifo8_push_all(), add a new esp_fifo_push_buf() function and use it accordingly. Signed-off-by: Mark Cave-Ayland --- hw/scsi/esp.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c ind

[PATCH 12/13] esp.c: ensure esp_pdma_write() always calls esp_fifo_push()

2024-02-16 Thread Mark Cave-Ayland
This ensures that esp_update_drq() is called via esp_fifo_push() whenever the host uses PDMA to transfer data to a SCSI device. Signed-off-by: Mark Cave-Ayland --- hw/scsi/esp.c | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c index 215

[PATCH 00/13] esp: avoid explicit setting of DRQ within ESP state machine

2024-02-16 Thread Mark Cave-Ayland
The ESP device has a DRQ (DMA request) signal that is used to handle flow control during DMA transfers. At the moment the DRQ signal is explicitly raised and lowered at various points in the ESP state machine as required, rather than implementing the logic described in the datasheet: "DREQ will r

[PATCH 05/13] esp.c: change esp_fifo_pop() to take ESPState

2024-02-16 Thread Mark Cave-Ayland
Now that all users of esp_fifo_pop() operate on the main FIFO there is no need to pass the FIFO explicitly. Signed-off-by: Mark Cave-Ayland --- hw/scsi/esp.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c index b898e43e2b..0e42ff50e7

[PATCH 03/13] esp.c: replace cmdfifo use of esp_fifo_pop() in do_message_phase()

2024-02-16 Thread Mark Cave-Ayland
Signed-off-by: Mark Cave-Ayland --- hw/scsi/esp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c index 100560244b..7a24515bb9 100644 --- a/hw/scsi/esp.c +++ b/hw/scsi/esp.c @@ -312,7 +312,8 @@ static void do_message_phase(ESPState *s) uint

[PATCH 09/13] esp.c: move esp_set_phase() and esp_get_phase() towards the beginning of the file

2024-02-16 Thread Mark Cave-Ayland
This allows these functions to be used earlier in the file without needing a separate forward declaration. Signed-off-by: Mark Cave-Ayland --- hw/scsi/esp.c | 36 ++-- 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c in

[PATCH 07/13] esp.c: change esp_fifo_pop_buf() to take ESPState

2024-02-16 Thread Mark Cave-Ayland
Now that all users of esp_fifo_pop_buf() operate on the main FIFO there is no need to pass the FIFO explicitly. Signed-off-by: Mark Cave-Ayland --- hw/scsi/esp.c | 28 ++-- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c index

[PATCH] tcg/aarch64: Apple does not align __int128_t in even registers

2024-02-16 Thread Richard Henderson
Cc: qemu-sta...@nongnu.org Fixes: 5427a9a7604 ("tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2169 Signed-off-by: Richard Henderson --- See the gitlab issue for complete discussion of the ABI. r~ --- tcg/aarch64/tcg-target.h | 6 +- 1 fil

Re: [PATCH v4 00/10] Optimize buffer_is_zero

2024-02-16 Thread Richard Henderson
On 2/16/24 10:20, Alexander Monakov wrote: FWIW, in situations like these I always recommend to run perf with fixed sampling rate, i.e. 'perf record -e cycles:P -c 10' or 'perf record -e cycles/period=10/P' to make sample counts between runs of different duration directly comparable (disp

Re: [PATCH v2 3/3] target/riscv/translate.c: set vstart_eq_zero in mark_vs_dirty()

2024-02-16 Thread Daniel Henrique Barboza
On 2/16/24 15:56, Richard Henderson wrote: On 2/16/24 03:57, Daniel Henrique Barboza wrote: The 'vstart_eq_zero' flag is used to determine whether some insns, like vector reduction operations, should SIGILL. At this moment the flag is being updated only during cpu_get_tb_cpu_state(), at the s

[PATCH v2 1/7] migration/multifd: Add new migration option zero-page-detection.

2024-02-16 Thread Hao Xiang
This new parameter controls where the zero page checking is running. 1. If this parameter is set to 'legacy', zero page checking is done in the migration main thread. 2. If this parameter is set to 'none', zero page checking is disabled. Signed-off-by: Hao Xiang --- hw/core/qdev-properties-syste

[PATCH v2 5/7] migration/multifd: Add new migration test cases for legacy zero page checking.

2024-02-16 Thread Hao Xiang
Now that zero page checking is done on the multifd sender threads by default, we still provide an option for backward compatibility. This change adds a qtest migration test case to set the zero-page-detection option to "legacy" and run multifd migration with zero page checking on the migration main

[PATCH v2 3/7] migration/multifd: Zero page transmission on the multifd thread.

2024-02-16 Thread Hao Xiang
1. Implements the zero page detection and handling on the multifd threads for non-compression, zlib and zstd compression backends. 2. Added a new value 'multifd' in ZeroPageDetection enumeration. 3. Add proper asserts to ensure pages->normal are used for normal pages in all scenarios. Signed-off-b

[PATCH v2 2/7] migration/multifd: Support for zero pages transmission in multifd format.

2024-02-16 Thread Hao Xiang
This change adds zero page counters and updates multifd send/receive tracing format to track the newly added counters. Signed-off-by: Hao Xiang --- migration/multifd.c| 43 ++ migration/multifd.h| 21 - migration/ram.c|

[PATCH v2 0/7] Introduce multifd zero page checking.

2024-02-16 Thread Hao Xiang
v2 update: * Implement zero-page-detection switch with enumeration "legacy", "none" and "multifd". * Move normal/zero pages from MultiFDSendParams to MultiFDPages_t. * Add zeros and zero_bytes accounting. This patchset is based on Juan Quintela's old series here https://lore.kernel.org/all/2022080

[PATCH v2 4/7] migration/multifd: Enable zero page checking from multifd threads.

2024-02-16 Thread Hao Xiang
This change adds a dedicated handler for MigrationOps::ram_save_target_page in multifd live migration. Now zero page checking can be done in the multifd threads and this becomes the default configuration. We still provide backward compatibility where zero page checking is done from the migration

[PATCH v2 6/7] migration/multifd: Add zero pages and zero bytes counter to migration status interface.

2024-02-16 Thread Hao Xiang
This change extends the MigrationStatus interface to track zero pages and zero bytes counter. Signed-off-by: Hao Xiang --- migration/migration-hmp-cmds.c | 4 migration/migration.c | 2 ++ qapi/migration.json | 15 ++- tests/migration/guestpe

[PATCH v2 7/7] Update maintainer contact for migration multifd zero page checking acceleration.

2024-02-16 Thread Hao Xiang
Add myself to maintain multifd zero page checking acceleration function. Signed-off-by: Hao Xiang --- MAINTAINERS | 5 + 1 file changed, 5 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index a24c2b51b6..3ca407cb58 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3403,6 +3403,11 @@ F: t

Re: [PATCH v3 1/3] hw/i2c: core: Add reset

2024-02-16 Thread Joe Komlodi
On Thu, Feb 8, 2024 at 8:39 AM Peter Maydell wrote: > > On Fri, 2 Feb 2024 at 20:48, Joe Komlodi wrote: > > > > It's possible for a reset to come in the middle of a transaction, which > > causes the bus to be in an old state when a new transaction comes in. > > > > Signed-off-by: Joe Komlodi > >

Re: [PATCH v2 3/3] target/riscv/translate.c: set vstart_eq_zero in mark_vs_dirty()

2024-02-16 Thread Richard Henderson
On 2/16/24 12:40, Daniel Henrique Barboza wrote: After reading the reviews of patches 1 and 3 what I'm considering here is: 1 - drop patch 1; Ok. 2 - there's a patch from Ivan Klokov sent 2 months ago: "[PATCH 1/1] target/riscv: Clear vstart_qe_zero flag" https://lore.kernel.org/qemu-riscv/

[PATCH RFC 8/8] target/riscv: Add counter delegation/configuration support

2024-02-16 Thread Atish Patra
From: Kaiwen Xue The Smcdeleg/Ssccfg adds the support for counter delegation via S*indcsr and Ssccfg. It also adds a new shadow CSR scountinhibit and menvcfg enable bit (CDE) to enable this extension and scountovf virtualization. Signed-off-by: Kaiwen Xue Co-developed-by: Atish Patra Signed-o

[PATCH RFC 0/8] Add Counter delegation ISA extension support

2024-02-16 Thread Atish Patra
This series adds the counter delegation extension support. The counter delegation ISA extension (Smcdeleg/Ssccfg) actually depends on multiple ISA extensions. 1. S[m|s]csrind : The indirect CSR extension[1], which defines 5 additional registers ([M|S|VS]IREG2-[M|S|VS]IREG6) to address size limitat

[PATCH RFC 2/8] target/riscv: Decouple AIA processing from xiselect and xireg

2024-02-16 Thread Atish Patra
From: Kaiwen Xue Since xiselect and xireg also will be of use in sxcsrind, AIA should have its own separated interface when those CSRs are accessed. Signed-off-by: Atish Patra Signed-off-by: Kaiwen Xue --- target/riscv/csr.c | 147 + 1 file changed,

[PATCH RFC 6/8] target/riscv: Add counter delegation definitions

2024-02-16 Thread Atish Patra
From: Kaiwen Xue This adds definitions for counter delegation, including the new scountinhibit register and the mstateen.CD bit. Signed-off-by: Atish Patra Signed-off-by: Kaiwen Xue --- target/riscv/cpu.h | 1 + target/riscv/cpu_bits.h | 8 +++- target/riscv/machine.c | 1 + 3 files

[PATCH RFC 1/8] target/riscv: Add properties for Indirect CSR Access extension

2024-02-16 Thread Atish Patra
From: Kaiwen Xue This adds the properties for sxcsrind. Definitions of new registers and implementations will come with future patches. Signed-off-by: Atish Patra Signed-off-by: Kaiwen Xue --- target/riscv/cpu.c | 4 target/riscv/cpu_cfg.h | 2 ++ 2 files changed, 6 insertions(+) di

[PATCH RFC 7/8] target/riscv: Add select value range check for counter delegation

2024-02-16 Thread Atish Patra
From: Kaiwen Xue This adds checks in ops performed on xireg and xireg2-xireg6 so that the counter delegation function will receive a valid xiselect value with the proper extensions enabled. Co-developed-by: Atish Patra Signed-off-by: Kaiwen Xue Signed-off-by: Atish Patra --- target/riscv/csr

[PATCH RFC 5/8] target/riscv: Add smcdeleg/ssccfg properties

2024-02-16 Thread Atish Patra
From: Kaiwen Xue This adds the properties of smcdeleg/ssccfg. Implementation will be in future patches. Signed-off-by: Atish Patra Signed-off-by: Kaiwen Xue --- target/riscv/cpu.c | 4 target/riscv/cpu_cfg.h | 2 ++ 2 files changed, 6 insertions(+) diff --git a/target/riscv/cpu.c b/

[PATCH RFC 3/8] target/riscv: Enable S*stateen bits for AIA

2024-02-16 Thread Atish Patra
As per the ratified AIA spec v1.0, three stateen bits control AIA CSR access: Bit 60 controls the indirect CSRs; Bit 59 controls most of the AIA CSR state; Bit 58 controls the IMSIC state such as stopei and vstopei. Enable the corresponding bits in [m|h]stateen and enable corresponding checks in the CS

[PATCH RFC 4/8] target/riscv: Support generic CSR indirect access

2024-02-16 Thread Atish Patra
From: Kaiwen Xue This adds the indirect access registers required by sscsrind/smcsrind and the operations on them. Note that xiselect and xireg are used for both AIA and sxcsrind, and the behavior of accessing them depends on whether each extension is enabled and the value stored in xiselect. Co

[PATCH v5 06/10] util/bufferiszero: Improve scalar variant

2024-02-16 Thread Richard Henderson
Split less-than and greater-than 256 cases. Use unaligned accesses for head and tail. Avoid using out-of-bounds pointers in loop boundary conditions. Signed-off-by: Richard Henderson --- util/bufferiszero.c | 86 +++-- 1 file changed, 52 insertions(+), 34

[PATCH v5 01/10] util/bufferiszero: Remove SSE4.1 variant

2024-02-16 Thread Richard Henderson
From: Alexander Monakov The SSE4.1 variant is virtually identical to the SSE2 variant, except for using 'PTEST+JNZ' in place of 'PCMPEQB+PMOVMSKB+CMP+JNE' for testing if an SSE register is all zeroes. The PTEST instruction decodes to two uops, so it can be handled only by the complex decoder, and

[PATCH v5 02/10] util/bufferiszero: Remove AVX512 variant

2024-02-16 Thread Richard Henderson
From: Alexander Monakov Thanks to early checks in the inline buffer_is_zero wrapper, the SIMD routines are invoked much more rarely in normal use when most buffers are non-zero. This makes use of AVX512 unprofitable, as it incurs extra frequency and voltage transition periods during which the CPU

[PATCH v5 04/10] util/bufferiszero: Remove useless prefetches

2024-02-16 Thread Richard Henderson
From: Alexander Monakov Use of prefetching in bufferiszero.c is quite questionable: - prefetches are issued just a few CPU cycles before the corresponding line would be hit by demand loads; - they are done for simple access patterns, i.e. where hardware prefetchers can perform better; - th

[PATCH v5 07/10] util/bufferiszero: Introduce biz_accel_fn typedef

2024-02-16 Thread Richard Henderson
Signed-off-by: Richard Henderson --- util/bufferiszero.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/util/bufferiszero.c b/util/bufferiszero.c index a904b747c7..61ea59d2e0 100644 --- a/util/bufferiszero.c +++ b/util/bufferiszero.c @@ -26,7 +26,8 @@ #include "qemu

[PATCH v5 08/10] util/bufferiszero: Simplify test_buffer_is_zero_next_accel

2024-02-16 Thread Richard Henderson
Because the three alternatives are monotonic, we don't need to keep a couple of bitmasks, just identify the strongest alternative at startup. Signed-off-by: Richard Henderson --- util/bufferiszero.c | 56 ++--- 1 file changed, 22 insertions(+), 34 deletion

[PATCH v5 09/10] util/bufferiszero: Add simd acceleration for aarch64

2024-02-16 Thread Richard Henderson
Because non-embedded aarch64 is expected to have AdvSIMD enabled, merely double-check with the compiler flags for __ARM_NEON and don't bother with a runtime check. Otherwise, model the loop after the x86 SSE2 function, and use VADDV to reduce the four vector comparisons. Signed-off-by: Richard He

[PATCH v5 03/10] util/bufferiszero: Reorganize for early test for acceleration

2024-02-16 Thread Richard Henderson
From: Alexander Monakov Test for length >= 256 inline, where it is often a constant. Before calling into the accelerated routine, sample three bytes from the buffer, which handles most non-zero buffers. Signed-off-by: Alexander Monakov Signed-off-by: Mikhail Romanov Message-Id: <20240206204809

[PATCH v5 10/10] tests/bench: Add bufferiszero-bench

2024-02-16 Thread Richard Henderson
Benchmark each acceleration function vs an aligned buffer of zeros. Signed-off-by: Richard Henderson --- tests/bench/bufferiszero-bench.c | 42 tests/bench/meson.build | 4 ++- 2 files changed, 45 insertions(+), 1 deletion(-) create mode 100644 tests/b

[PATCH v5 05/10] util/bufferiszero: Optimize SSE2 and AVX2 variants

2024-02-16 Thread Richard Henderson
From: Alexander Monakov Increase unroll factor in SIMD loops from 4x to 8x in order to move their bottlenecks from ALU port contention to load issue rate (two loads per cycle on popular x86 implementations). Avoid using out-of-bounds pointers in loop boundary conditions. Follow SSE2 implementat

[PATCH v5 00/10] Optimize buffer_is_zero

2024-02-16 Thread Richard Henderson
v3: https://patchew.org/QEMU/20240206204809.9859-1-amona...@ispras.ru/ v4: https://patchew.org/QEMU/20240215081449.848220-1-richard.hender...@linaro.org/ Changes for v5: - Move 3 byte sample back inline; document it. - Drop AArch64 SVE alternative; neoverse-v2 still recommends simd for memcpy

Re: [PATCH v3 1/3] hw/i2c: core: Add reset

2024-02-16 Thread Corey Minyard
On Thu, Feb 08, 2024 at 04:39:10PM +, Peter Maydell wrote: > On Fri, 2 Feb 2024 at 20:48, Joe Komlodi wrote: > > > > It's possible for a reset to come in the middle of a transaction, which > > causes the bus to be in an old state when a new transaction comes in. > > > > Signed-off-by: Joe Koml

[PATCH v2] target: hppa: Fix unaligned double word accesses for hppa64

2024-02-16 Thread Guenter Roeck
Unaligned 64-bit accesses were found in Linux to clobber carry bits, resulting in bad results if an arithmetic operation involving a carry bit was executed after an unaligned 64-bit operation. hppa 2.0 defines additional carry bits in PSW register bits 32..39. When restoring PSW after executing an

Re: [PATCH v2 3/3] target/riscv/translate.c: set vstart_eq_zero in mark_vs_dirty()

2024-02-16 Thread Daniel Henrique Barboza
On 2/16/24 20:41, Richard Henderson wrote: On 2/16/24 12:40, Daniel Henrique Barboza wrote: After reading the reviews of patches 1 and 3 what I'm considering here is: 1 - drop patch 1; Ok. 2 - there's a patch from Ivan Klokov sent 2 months ago: "[PATCH 1/1] target/riscv: Clear vstart_qe

Re: [PATCH] i386: load kernel on xen using DMA

2024-02-16 Thread Marek Marczykowski-Górecki
On Fri, Jun 18, 2021 at 09:54:14AM +0100, Alex Bennée wrote: > > Marek Marczykowski-Górecki writes: > > > Kernel on Xen is loaded via fw_cfg. Previously it used non-DMA version, > > which loaded the kernel (and initramfs) byte by byte. Change this > > to DMA, to load in bigger chunks. > > This c

Re: [PATCH] tests/cdrom-test: Add cdrom test for LoongArch virt machine

2024-02-16 Thread maobibo
On 2024/2/6 5:20 PM, Thomas Huth wrote: On 06/02/2024 03.29, maobibo wrote: Hi Philippe, On 2024/2/5 8:58 PM, Philippe Mathieu-Daudé wrote: Hi Bibo, On 5/2/24 03:13, Bibo Mao wrote: The cdrom test is skipped on LoongArch systems with command "make check"; this patch enables the cdrom test

Re: [PATCH v2 3/7] migration/multifd: Zero page transmission on the multifd thread.

2024-02-16 Thread Richard Henderson
On 2/16/24 12:39, Hao Xiang wrote:
+void multifd_zero_page_check_recv(MultiFDRecvParams *p)
+{
+    for (int i = 0; i < p->zero_num; i++) {
+        void *page = p->host + p->zero[i];
+        if (!buffer_is_zero(page, p->page_size)) {
+            memset(page, 0, p->page_size);
+        }
+    }
