[PATCH 01/33] target/ppc: introduce do_ea_calc

2021-10-21 Thread matheus . ferst
From: pherde The do_ea_calc function calculates the effective address (EA) according to PowerISA 3.1. The part of do_ldst() that calculated the EA is replaced by a call to this new function. Signed-off-by: Fernando Eckhardt Valle (pherde) Signed-off-by: Matheus Ferst --- target/ppc/transl

[PATCH 05/33] target/ppc: Move LQ and STQ to decodetree

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Signed-off-by: Matheus Ferst --- target/ppc/insn32.decode | 11 ++ target/ppc/translate.c | 156 + target/ppc/translate/fixedpoint-impl.c.inc | 98 + 3 files changed, 114 insertions(+), 151 deletions(-)

[PATCH 14/33] target/ppc: Implement vsldbi/vsrdbi instructions

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Signed-off-by: Matheus Ferst --- target/ppc/insn32.decode| 8 +++ target/ppc/translate/vmx-impl.c.inc | 78 + 2 files changed, 86 insertions(+) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index 4666c06f55..257b11

[PATCH 06/33] target/ppc: Implement PLQ and PSTQ

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Signed-off-by: Matheus Ferst --- target/ppc/insn64.decode | 4 target/ppc/translate/fixedpoint-impl.c.inc | 12 2 files changed, 16 insertions(+) diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode index 11e5ea81d6..48756cd4

[PATCH 03/33] target/ppc: Move load and store floating point instructions to decodetree

2021-10-21 Thread matheus . ferst
From: pherde Move load floating point instructions (lfs, lfsu, lfsx, lfsux, lfd, lfdu, lfdx, lfdux) and store floating point instructions (stfs, stfsu, stfsx, stfsux, stfd, stfdu, stfdx, stfdux) from legacy system to decodetree. Signed-off-by: Fernando Eckhardt Valle Signed-off-by: Matheus Fer

[PATCH 09/33] target/ppc: Implement pdepd instruction

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Signed-off-by: Matheus Ferst --- target/ppc/helper.h| 1 + target/ppc/insn32.decode | 1 + target/ppc/int_helper.c| 18 ++ target/ppc/translate/fixedpoint-impl.c.inc | 12 4 files ch

[PATCH 15/33] target/ppc: Implement Vector Insert from GPR using GPR index insns

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Implements the following PowerISA v3.1 instructions: vinsblx: Vector Insert Byte from GPR using GPR-specified Left-Index vinshlx: Vector Insert Halfword from GPR using GPR-specified Left-Index vinswlx: Vector Insert Word from GPR using GPR-specified Left-Index vinsdlx: Vector

[PATCH 08/33] target/ppc: Implement cnttzdm

2021-10-21 Thread matheus . ferst
From: Luis Pires Implement the following PowerISA v3.1 instruction: cnttzdm: Count Trailing Zeros Doubleword Under Bit Mask Signed-off-by: Luis Pires Signed-off-by: Matheus Ferst --- target/ppc/helper.h| 1 + target/ppc/insn32.decode | 1 + target/p
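
As an illustration of what "count trailing zeros under bit mask" means, here is a minimal C sketch. It assumes the usual reading of the instruction: only the bits of the source selected by the mask are examined, starting from the least significant end, and counting stops at the first selected bit that is 1. The function name is hypothetical and this is not the helper from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical reference model (not the QEMU helper): count the zero
 * bits of src that are selected by mask, scanning from bit 0 upward,
 * and stop at the first selected bit that is set.
 */
static unsigned cnttzdm_model(uint64_t src, uint64_t mask)
{
    unsigned count = 0;

    for (int i = 0; i < 64; i++) {
        uint64_t bit = UINT64_C(1) << i;

        if (mask & bit) {
            if (src & bit) {
                break;          /* first selected 1 bit ends the count */
            }
            count++;            /* selected bit is 0 */
        }
    }
    return count;
}
```

With an all-ones mask this degenerates to an ordinary count-trailing-zeros that yields 64 for a zero input.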

[PATCH 07/33] target/ppc: Implement cntlzdm

2021-10-21 Thread matheus . ferst
From: Luis Pires Implement the following PowerISA v3.1 instruction: cntlzdm: Count Leading Zeros Doubleword Under Bit Mask Signed-off-by: Luis Pires Signed-off-by: Matheus Ferst --- target/ppc/helper.h| 1 + target/ppc/insn32.decode | 1 + target/pp
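
A minimal C sketch of the "count leading zeros under bit mask" operation may help here. It assumes the instruction examines only the source bits selected by the mask, starting from the most significant end, and stops at the first selected bit that is 1. The function name is hypothetical and the code is a reference model, not the helper added by the patch.

```c
#include <stdint.h>

/*
 * Hypothetical reference model (not the QEMU helper): count the zero
 * bits of src that are selected by mask, scanning from bit 63 downward,
 * and stop at the first selected bit that is set.
 */
static unsigned cntlzdm_model(uint64_t src, uint64_t mask)
{
    unsigned count = 0;

    for (int i = 63; i >= 0; i--) {
        uint64_t bit = UINT64_C(1) << i;

        if (mask & bit) {
            if (src & bit) {
                break;          /* first selected 1 bit ends the count */
            }
            count++;            /* selected bit is 0 */
        }
    }
    return count;
}
```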

[PATCH 12/33] target/ppc: Implement vclzdm/vctzdm instructions

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Signed-off-by: Luis Pires Signed-off-by: Matheus Ferst --- target/ppc/insn32.decode| 2 ++ target/ppc/translate/vmx-impl.c.inc | 36 + 2 files changed, 38 insertions(+) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decod

[PATCH 11/33] target/ppc: Move vcfuged to vmx-impl.c.inc

2021-10-21 Thread matheus . ferst
From: Matheus Ferst There's no reason to keep vector-impl.c.inc separate from vmx-impl.c.inc. Additionally, let GVec handle the multiple calls to helper_cfuged for us. Signed-off-by: Matheus Ferst --- target/ppc/helper.h| 2 +- target/ppc/int_helper.c

[PATCH 32/33] target/ppc: Implement xxblendvb/xxblendvh/xxblendvw/xxblendvd instructions

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Signed-off-by: Bruno Larsen (billionai) Signed-off-by: Matheus Ferst --- target/ppc/helper.h | 4 +++ target/ppc/insn64.decode| 19 ++ target/ppc/int_helper.c | 15 target/ppc/translate/vsx-impl.c.inc | 55 ++

[PATCH 19/33] target/ppc: Implement Vector Extract Double to VSR using GPR index insns

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Implement the following PowerISA v3.1 instructions: vextdubvlx: Vector Extract Double Unsigned Byte to VSR using GPR-specified Left-Index vextduhvlx: Vector Extract Double Unsigned Halfword to VSR using GPR-specified Left-Index vextduwvlx: Vector Extrac

[PATCH 17/33] target/ppc: Implement Vector Insert from VSR using GPR index insns

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Implements the following PowerISA v3.1 instructions: vinsbvlx: Vector Insert Byte from VSR using GPR-specified Left-Index vinshvlx: Vector Insert Halfword from VSR using GPR-specified Left-Index vinswvlx: Vector Insert Word from VSR using GPR-specified Left-Index vin

[PATCH 13/33] target/ppc: Implement vpdepd/vpextd instruction

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Signed-off-by: Luis Pires Signed-off-by: Matheus Ferst --- target/ppc/insn32.decode| 2 ++ target/ppc/translate/vmx-impl.c.inc | 36 + 2 files changed, 38 insertions(+) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decod

[PATCH 20/33] target/ppc: Introduce REQUIRE_VSX macro

2021-10-21 Thread matheus . ferst
From: "Bruno Larsen (billionai)" Introduce the macro to centralize checking if the VSX facility is enabled and handle it correctly. Signed-off-by: Bruno Larsen (billionai) Signed-off-by: Luis Pires Signed-off-by: Matheus Ferst --- target/ppc/translate.c | 8 1 file changed, 8 insert

[PATCH 21/33] target/ppc: moved stxv and lxv from legacy to decodtree

2021-10-21 Thread matheus . ferst
From: "Lucas Mateus Castro (alqotel)" Moved stxv and lxv implementation from the legacy system to decodetree. Signed-off-by: Luis Pires Signed-off-by: Lucas Mateus Castro (alqotel) Signed-off-by: Matheus Ferst --- target/ppc/insn32.decode| 8 target/ppc/translate.c

Re: [PATCH v3 03/22] host-utils: introduce uabs64()

2021-10-21 Thread Richard Henderson
On 9/10/21 4:26 AM, Luis Pires wrote: Introduce uabs64(), a function that returns the absolute value of a 64-bit int as an unsigned value. This avoids the undefined behavior for common abs implementations, where abs of the most negative value is undefined. I do question the comment there wrt un
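
The idea behind uabs64() can be sketched in portable C as follows. This is an illustration under the assumption that the goal is simply to negate in the unsigned domain, where wraparound is well defined; the name uabs64_sketch is hypothetical and the body is not necessarily the one in the patch.

```c
#include <stdint.h>

/*
 * Sketch only: return |v| as an unsigned 64-bit value. The negation is
 * done on the unsigned conversion of v, so the most negative input maps
 * to 2^63 without relying on signed overflow.
 */
static inline uint64_t uabs64_sketch(int64_t v)
{
    uint64_t u = (uint64_t)v;

    return v < 0 ? -u : u;
}
```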

[PATCH 16/33] target/ppc: Implement Vector Insert Word from GPR using Immediate insns

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Implements the following PowerISA v3.1 instructions: vinsw: Vector Insert Word from GPR using immediate-specified index vinsd: Vector Insert Doubleword from GPR using immediate-specified index Signed-off-by: Matheus Ferst --- target/ppc/insn32.decode| 6

Re: [PATCH v3 05/22] host-utils: move checks out of divu128/divs128

2021-10-21 Thread Richard Henderson
On 9/10/21 4:26 AM, Luis Pires wrote: In preparation for changing the divu128/divs128 implementations to allow for quotients larger than 64 bits, move the div-by-zero and overflow checks to the callers. Signed-off-by: Luis Pires Reviewed-by: Richard Henderson Frederic, I had forgotten about

[PATCH 22/33] target/ppc: moved stxvx and lxvx from legacy to decodtree

2021-10-21 Thread matheus . ferst
From: "Lucas Mateus Castro (alqotel)" Moved stxvx and lxvx implementation from the legacy system to decodetree. Signed-off-by: Lucas Mateus Castro (alqotel) Signed-off-by: Matheus Ferst --- target/ppc/insn32.decode| 5 ++ target/ppc/translate/vsx-impl.c.inc | 127 ---

[PATCH 23/33] target/ppc: added the instructions LXVP and STXVP

2021-10-21 Thread matheus . ferst
From: "Lucas Mateus Castro (alqotel)" Implemented the instructions lxvp and stxvp using decodetree Signed-off-by: Luis Pires Signed-off-by: Lucas Mateus Castro (alqotel) Signed-off-by: Matheus Ferst --- target/ppc/insn32.decode| 5 target/ppc/translate/vsx-impl.c.inc | 40 +

[PATCH 25/33] target/ppc: added the instructions PLXV and PSTXV

2021-10-21 Thread matheus . ferst
From: "Lucas Mateus Castro (alqotel)" Implemented the instructions plxv and pstxv using decodetree Signed-off-by: Lucas Mateus Castro (alqotel) Signed-off-by: Matheus Ferst --- target/ppc/insn64.decode| 10 ++ target/ppc/translate/vsx-impl.c.inc | 16 2 fi

[PATCH 26/33] target/ppc: added the instructions PLXVP and PSTXVP

2021-10-21 Thread matheus . ferst
From: "Lucas Mateus Castro (alqotel)" Implemented the instructions plxvp and pstxvp using decodetree Signed-off-by: Lucas Mateus Castro (alqotel) Signed-off-by: Matheus Ferst --- target/ppc/insn64.decode| 9 + target/ppc/translate/vsx-impl.c.inc | 2 ++ 2 files changed, 11

[PATCH 24/33] target/ppc: added the instructions LXVPX and STXVPX

2021-10-21 Thread matheus . ferst
From: "Lucas Mateus Castro (alqotel)" Implemented the instructions lxvpx and stxvpx using decodetree Signed-off-by: Lucas Mateus Castro (alqotel) Signed-off-by: Matheus Ferst --- target/ppc/insn32.decode| 3 +++ target/ppc/translate/vsx-impl.c.inc | 18 -- 2 files

[PATCH 28/33] target/ppc: moved XXSPLTIB to using decodetree

2021-10-21 Thread matheus . ferst
From: "Bruno Larsen (billionai)" Changed the function that handles XXSPLTIB emulation to using decodetree, but still use the same logic as before Signed-off-by: Bruno Larsen (billionai) Signed-off-by: Matheus Ferst --- target/ppc/insn32.decode| 5 + target/ppc/translate/vsx-i

[PATCH 27/33] target/ppc: moved XXSPLTW to using decodetree

2021-10-21 Thread matheus . ferst
From: "Bruno Larsen (billionai)" Changed the function that handles XXSPLTW emulation to using decodetree, but still using the same logic. Signed-off-by: Bruno Larsen (billionai) Signed-off-by: Matheus Ferst --- target/ppc/insn32.decode| 9 + target/ppc/translate/vsx-impl.

[PATCH 31/33] target/ppc: implemented XXSPLTIDP instruction

2021-10-21 Thread matheus . ferst
From: "Bruno Larsen (billionai)" Implemented the instruction XXSPLTIDP using decodetree. Signed-off-by: Bruno Larsen (billionai) Signed-off-by: Matheus Ferst --- target/ppc/insn64.decode| 2 ++ target/ppc/translate/vsx-impl.c.inc | 10 ++ 2 files changed, 12 insertions(+)

[PATCH 33/33] target/ppc: Implement lxvkq instruction

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Signed-off-by: Luis Pires Signed-off-by: Matheus Ferst --- target/ppc/insn32.decode| 7 + target/ppc/translate/vsx-impl.c.inc | 44 + 2 files changed, 51 insertions(+) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.de

[PATCH 29/33] target/ppc: implemented XXSPLTI32DX

2021-10-21 Thread matheus . ferst
From: "Bruno Larsen (billionai)" Implemented XXSPLTI32DX emulation using decodetree Signed-off-by: Bruno Larsen (billionai) Signed-off-by: Matheus Ferst --- target/ppc/insn64.decode| 11 target/ppc/translate/vsx-impl.c.inc | 41 + 2 files chang

[PATCH 18/33] target/ppc: Move vinsertb/vinserth/vinsertw/vinsertd to decodetree

2021-10-21 Thread matheus . ferst
From: Matheus Ferst Signed-off-by: Matheus Ferst --- target/ppc/helper.h | 4 target/ppc/insn32.decode| 5 + target/ppc/int_helper.c | 21 --- target/ppc/translate/vmx-impl.c.inc | 32 - target/ppc/tr

[PATCH 30/33] target/ppc: Implemented XXSPLTIW using decodetree

2021-10-21 Thread matheus . ferst
From: "Bruno Larsen (billionai)" Implemented the XXSPLTIW instruction, using decodetree. Signed-off-by: Bruno Larsen (billionai) Signed-off-by: Matheus Ferst --- target/ppc/insn64.decode| 6 ++ target/ppc/translate/vsx-impl.c.inc | 10 ++ 2 files changed, 16 insertion

Re: [PATCH v3 06/22] host-utils: move udiv_qrnnd() to host-utils

2021-10-21 Thread Richard Henderson
On 9/10/21 4:26 AM, Luis Pires wrote: Move udiv_qrnnd() from include/fpu/softfloat-macros.h to host-utils, so it can be reused by divu128(). Signed-off-by: Luis Pires --- include/fpu/softfloat-macros.h | 82 -- include/qemu/host-utils.h | 81 ++

Re: [PATCH v3 02/22] host-utils: fix missing zero-extension in divs128

2021-10-21 Thread Richard Henderson
On 9/10/21 4:26 AM, Luis Pires wrote: *plow (lower 64 bits of the dividend) is passed into divs128() as a signed 64-bit integer. When building an __int128_t from it, it must be zero-extended, instead of sign-extended. Suggested-by: Richard Henderson Signed-off-by: Luis Pires --- include/qemu/h
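
The difference between the two extensions can be shown with a small sketch. It assumes a compiler that provides the __int128 extension (GCC or Clang); both function names are hypothetical and neither is the code from the patch.

```c
#include <stdint.h>

/*
 * Buggy variant: the signed low half is widened as a signed value, so a
 * low half with bit 63 set smears ones across the high half.
 */
static inline __int128 join128_sign_extended(int64_t high, int64_t low)
{
    unsigned __int128 hi = (unsigned __int128)(uint64_t)high << 64;

    return (__int128)(hi | (unsigned __int128)(__int128)low);
}

/*
 * Fixed variant: the low half is converted to uint64_t first, so it is
 * zero-extended and only occupies the low 64 bits.
 */
static inline __int128 join128_zero_extended(int64_t high, int64_t low)
{
    unsigned __int128 hi = (unsigned __int128)(uint64_t)high << 64;

    return (__int128)(hi | (uint64_t)low);
}
```

With high = 0 and low = -1, the first variant produces -1, while the second produces 2^64 - 1, which is the intended dividend.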

Re: plugins: Missing Store Exclusive Memory Accesses

2021-10-21 Thread Aaron Lindsay via
On Oct 21 13:28, Alex Bennée wrote: > It's a bit clearer if you use the contrib/execlog plugin: > > ./qemu-aarch64 -plugin contrib/plugins/libexeclog.so -d plugin > ./tests/tcg/aarch64-linux-user/stxp > > 0, 0x400910, 0xf9800011, "prfm pstl1strm, [x0] > 0, 0x400914, 0xc87f4410, "ldxp x16,

Re: [PATCH v4 6/6] vfio: defer to commit kvm irq routing when enable msi/msix

2021-10-21 Thread Alex Williamson
On Thu, 14 Oct 2021 08:48:52 +0800 "Longpeng(Mike)" wrote: > In migration resume phase, all unmasked msix vectors need to be > setup when loading the VF state. However, the setup operation would > take longer if the VM has more VFs and each VF has more unmasked > vectors. > > The hot spot is kvm

[PATCH v3 02/48] tcg/optimize: Split out OptContext

2021-10-21 Thread Richard Henderson
Provide what will become a larger context for splitting the very large tcg_optimize function. Reviewed-by: Alex Bennée Reviewed-by: Luis Pires Signed-off-by: Richard Henderson --- tcg/optimize.c | 77 ++ 1 file changed, 40 insertions(+), 37 delet

[PATCH v3 00/48] tcg: optimize redundant sign extensions

2021-10-21 Thread Richard Henderson
Currently, we have support for optimizing redundant zero extensions, which I think was done with x86 and aarch64 in mind, which zero-extend all 32-bit operations into the 64-bit register. But targets like Alpha, MIPS, and RISC-V do sign-extensions instead. But before that, split the quite massive

[PATCH v3 05/48] tcg/optimize: Move prev_mb into OptContext

2021-10-21 Thread Richard Henderson
This will expose the variable to subroutines that will be broken out of tcg_optimize. Reviewed-by: Alex Bennée Reviewed-by: Luis Pires Signed-off-by: Richard Henderson --- tcg/optimize.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimi

[PATCH v3 03/48] tcg/optimize: Remove do_default label

2021-10-21 Thread Richard Henderson
Break the final cleanup clause out of the main switch statement. When fully folding an opcode to mov/movi, use "continue" to process the next opcode, else break to fall into the final cleanup. Reviewed-by: Alex Bennée Reviewed-by: Luis Pires Signed-off-by: Richard Henderson --- tcg/optimize.c

[PATCH v3 07/48] tcg/optimize: Split out copy_propagate

2021-10-21 Thread Richard Henderson
Continue splitting tcg_optimize. Reviewed-by: Alex Bennée Reviewed-by: Luis Pires Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 22 ++ 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c

[PATCH v3 15/48] tcg/optimize: Split out fold_const{1,2}

2021-10-21 Thread Richard Henderson
Split out a whole bunch of placeholder functions, which are currently identical. That won't last as more code gets moved. Use CASE_32_64_VEC for some logical operators that previously missed the addition of vectors. Reviewed-by: Luis Pires Signed-off-by: Richard Henderson --- tcg/optimize.c |

[PATCH v3 01/48] tcg/optimize: Rename "mask" to "z_mask"

2021-10-21 Thread Richard Henderson
Prepare for tracking different masks by renaming this one. Reviewed-by: Alex Bennée Reviewed-by: Luis Pires Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 142 + 1 file changed, 72 insertions(+), 70 del

[PATCH v3 06/48] tcg/optimize: Split out init_arguments

2021-10-21 Thread Richard Henderson
There was no real reason for calls to have separate code here. Unify init for calls vs non-calls using the call path, which handles TCG_CALL_DUMMY_ARG. Reviewed-by: Alex Bennée Reviewed-by: Luis Pires Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 25

[PATCH v3 08/48] tcg/optimize: Split out fold_call

2021-10-21 Thread Richard Henderson
Calls are special in that they have a variable number of arguments, and need to be able to clobber globals. Reviewed-by: Alex Bennée Reviewed-by: Luis Pires Signed-off-by: Richard Henderson --- tcg/optimize.c | 63 -- 1 file changed, 41 insertion

[PATCH v3 13/48] tcg/optimize: Use a boolean to avoid a mass of continues

2021-10-21 Thread Richard Henderson
Reviewed-by: Luis Pires Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 368457f4a2..699476e2f1 100644 --- a/tcg/optimize.c +++ b/tcg/optimiz

[PATCH v3 12/48] tcg/optimize: Split out finish_folding

2021-10-21 Thread Richard Henderson
Copy z_mask into OptContext, for writeback to the first output within the new function. Reviewed-by: Luis Pires Signed-off-by: Richard Henderson --- tcg/optimize.c | 49 + 1 file changed, 33 insertions(+), 16 deletions(-) diff --git a/tcg/optimiz

[PATCH v3 19/48] tcg/optimize: Split out fold_setcond

2021-10-21 Thread Richard Henderson
Reviewed-by: Luis Pires Signed-off-by: Richard Henderson --- tcg/optimize.c | 23 ++- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 9059e917cf..2086e894c6 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -933,6 +933,17

[PATCH v3 18/48] tcg/optimize: Split out fold_brcond

2021-10-21 Thread Richard Henderson
Reviewed-by: Luis Pires Signed-off-by: Richard Henderson --- tcg/optimize.c | 33 +++-- 1 file changed, 19 insertions(+), 14 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 61a6221ad2..9059e917cf 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -71

[PATCH v3 04/48] tcg/optimize: Change tcg_opt_gen_{mov, movi} interface

2021-10-21 Thread Richard Henderson
Adjust the interface to take the OptContext parameter instead of TCGContext or both. Reviewed-by: Alex Bennée Reviewed-by: Luis Pires Signed-off-by: Richard Henderson --- tcg/optimize.c | 67 +- 1 file changed, 34 insertions(+), 33 deletions(-)

[PATCH v3 14/48] tcg/optimize: Split out fold_mb, fold_qemu_{ld,st}

2021-10-21 Thread Richard Henderson
This puts the separate mb optimization into the same framework as the others. While fold_qemu_{ld,st} are currently identical, that won't last as more code gets moved. Reviewed-by: Luis Pires Signed-off-by: Richard Henderson --- tcg/optimize.c | 89 +

[PATCH v3 16/48] tcg/optimize: Split out fold_setcond2

2021-10-21 Thread Richard Henderson
Reduce some code duplication by folding the NE and EQ cases. Reviewed-by: Luis Pires Signed-off-by: Richard Henderson --- tcg/optimize.c | 145 - 1 file changed, 72 insertions(+), 73 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index

[PATCH v3 25/48] tcg/optimize: Split out fold_deposit

2021-10-21 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/optimize.c | 25 +++-- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 3fffc5b200..9758d83e3e 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -878,6 +878,18 @@ static bool fold_c

[PATCH v3 09/48] tcg/optimize: Drop nb_oargs, nb_iargs locals

2021-10-21 Thread Richard Henderson
Rather than try to keep these up-to-date across folding, re-read nb_oargs at the end, after re-reading the opcode. A couple of asserts need dropping, but that will take care of itself as we split the function further. Reviewed-by: Alex Bennée Reviewed-by: Luis Pires Signed-off-by: Richard Hende

[PATCH v3 23/48] tcg/optimize: Split out fold_extract2

2021-10-21 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/optimize.c | 39 ++- 1 file changed, 22 insertions(+), 17 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index ed5a304089..885380bb22 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -883,6 +883,25 @@ stat

[PATCH v3 32/48] tcg/optimize: Split out fold_xi_to_i

2021-10-21 Thread Richard Henderson
Pull the "op r, a, 0 => movi r, 0" optimization into a function, and use it in the outer opcode fold functions. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 32 +++- 1 file changed, 15 insertions(+), 17 deletions(-) diff
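
The shape of such a helper can be sketched on a toy IR. Everything below (ToyOp, ToyArg, the opcode names) is hypothetical and far simpler than the real TCG structures; it only illustrates rewriting "op r, a, i" into "movi r, i" in place and returning true so the caller can stop processing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy IR, invented for this sketch; not the TCG data structures. */
typedef enum { TOY_AND, TOY_MUL, TOY_MOVI } ToyOpcode;
typedef struct { bool is_const; uint64_t val; } ToyArg;
typedef struct { ToyOpcode opc; ToyArg a, b; } ToyOp;

/*
 * If the second operand is the constant i, the result is known to be i
 * (for example "and r, a, 0" or "mul r, a, 0"), so rewrite the op as a
 * move-immediate. Returns true when the op was folded.
 */
static bool toy_fold_xi_to_i(ToyOp *op, uint64_t i)
{
    if (op->b.is_const && op->b.val == i) {
        op->opc = TOY_MOVI;
        op->a = op->b;          /* the immediate to load */
        return true;
    }
    return false;
}
```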

[PATCH v3 11/48] tcg/optimize: Return true from tcg_opt_gen_{mov, movi}

2021-10-21 Thread Richard Henderson
This will allow callers to tail call to these functions and return true indicating processing complete. Reviewed-by: Luis Pires Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/t

[PATCH v3 20/48] tcg/optimize: Split out fold_mulu2_i32

2021-10-21 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/optimize.c | 37 + 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 2086e894c6..142f445cb1 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -889,6 +889,24 @@ static

[PATCH v3 27/48] tcg/optimize: Split out fold_bswap

2021-10-21 Thread Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 27 --- 1 file changed, 16 insertions(+), 11 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index c54f839434..77b31680f1 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c

[PATCH v3 10/48] tcg/optimize: Change fail return for do_constant_folding_cond*

2021-10-21 Thread Richard Henderson
Return -1 instead of 2 for failure. This allows us to use comparisons against 0 for all cases. Reviewed-by: Luis Pires Signed-off-by: Richard Henderson --- tcg/optimize.c | 145 + 1 file changed, 74 insertions(+), 71 deletions(-) diff --git a/tcg/optimi

[PATCH v3 21/48] tcg/optimize: Split out fold_addsub2_i32

2021-10-21 Thread Richard Henderson
Add two additional helpers, fold_add2_i32 and fold_sub2_i32 which will not be simple wrappers forever. Signed-off-by: Richard Henderson --- tcg/optimize.c | 70 +++--- 1 file changed, 44 insertions(+), 26 deletions(-) diff --git a/tcg/optimize.c b/tcg

[PATCH v3 33/48] tcg/optimize: Add type to OptContext

2021-10-21 Thread Richard Henderson
Compute the type of the operation early. There are at least 4 places that used a def->flags ladder to determine the type of the operation being optimized. There were two places that assumed !TCG_OPF_64BIT means TCG_TYPE_I32, and so could potentially compute incorrect results for vector operations

[PATCH v3 17/48] tcg/optimize: Split out fold_brcond2

2021-10-21 Thread Richard Henderson
Reduce some code duplication by folding the NE and EQ cases. Signed-off-by: Richard Henderson --- tcg/optimize.c | 159 + 1 file changed, 81 insertions(+), 78 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 0eaa0127f3..61a6221ad2 1

[PATCH v3 31/48] tcg/optimize: Split out fold_xx_to_x

2021-10-21 Thread Richard Henderson
Pull the "op r, a, a => mov r, a" optimization into a function, and use it in the outer opcode fold functions. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 39 --- 1 file changed, 24 insertions(+), 15 deletions(-)

[PATCH v3 36/48] tcg/optimize: Split out fold_xi_to_x

2021-10-21 Thread Richard Henderson
Pull the "op r, a, i => mov r, a" optimization into a function, and use it in the outermost logical operations. Signed-off-by: Richard Henderson --- tcg/optimize.c | 60 +- 1 file changed, 25 insertions(+), 35 deletions(-) diff --git a/tcg/opt

[PATCH v3 22/48] tcg/optimize: Split out fold_movcond

2021-10-21 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/optimize.c | 56 -- 1 file changed, 31 insertions(+), 25 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index eb6f1581ac..ed5a304089 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -917,6 +917

[PATCH v3 24/48] tcg/optimize: Split out fold_extract, fold_sextract

2021-10-21 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/optimize.c | 48 ++-- 1 file changed, 30 insertions(+), 18 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 885380bb22..3fffc5b200 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -883,6 +883,1

[PATCH v3 28/48] tcg/optimize: Split out fold_dup, fold_dup2

2021-10-21 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/optimize.c | 53 +- 1 file changed, 31 insertions(+), 22 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 77b31680f1..2d626c604a 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -915,6 +915

[PATCH v3 29/48] tcg/optimize: Split out fold_mov

2021-10-21 Thread Richard Henderson
This is the final entry in the main switch that was in a different form. After this, we have the option to convert the switch into a function dispatch table. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 27 ++- 1 file changed

[PATCH v3 34/48] tcg/optimize: Split out fold_to_not

2021-10-21 Thread Richard Henderson
Split out the conditional conversion from a more complex logical operation to a simple NOT. Create a couple more helpers to make this easy for the outer-most logical operations. Signed-off-by: Richard Henderson --- tcg/optimize.c | 154 +++-- 1 file c

[PATCH v3 38/48] tcg/optimize: Split out fold_masks

2021-10-21 Thread Richard Henderson
Move all of the known-zero optimizations into the per-opcode functions. Use fold_masks when there is a possibility of the result being determined, and simply set ctx->z_mask otherwise. Signed-off-by: Richard Henderson --- tcg/optimize.c | 545 ++--- 1

[PATCH v3 26/48] tcg/optimize: Split out fold_count_zeros

2021-10-21 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/optimize.c | 32 ++-- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 9758d83e3e..c54f839434 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -873,6 +873,20 @@ static bool

[PATCH v3 39/48] tcg/optimize: Expand fold_mulu2_i32 to all 4-arg multiplies

2021-10-21 Thread Richard Henderson
Rename to fold_multiply2, and handle muls2_i32, mulu2_i64, and muls2_i64. Signed-off-by: Richard Henderson --- tcg/optimize.c | 44 +++- 1 file changed, 35 insertions(+), 9 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index f0086ee789..efd5f5

[PATCH v3 37/48] tcg/optimize: Split out fold_ix_to_i

2021-10-21 Thread Richard Henderson
Pull the "op r, 0, b => movi r, 0" optimization into a function, and use it in fold_shift. Signed-off-by: Richard Henderson --- tcg/optimize.c | 28 ++-- 1 file changed, 10 insertions(+), 18 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index af26429175..6c1c

[PATCH v3 43/48] tcg/optimize: Stop forcing z_mask to "garbage" for 32-bit values

2021-10-21 Thread Richard Henderson
This "garbage" setting pre-dates the addition of the type changing opcodes INDEX_op_ext_i32_i64, INDEX_op_extu_i32_i64, and INDEX_op_extr{l,h}_i64_i32. So now we have definitive points at which to adjust z_mask to eliminate such bits from the 32-bit operands. Signed-off-by: Richard Henderson -

[PATCH v3 44/48] tcg/optimize: Optimize sign extensions

2021-10-21 Thread Richard Henderson
Certain targets, like riscv, produce signed 32-bit results. This can lead to lots of redundant extensions as values are manipulated. Begin by tracking only the obvious sign-extensions, and converting them to simple copies when possible. Signed-off-by: Richard Henderson --- tcg/optimize.c | 129

[PATCH v3 30/48] tcg/optimize: Split out fold_xx_to_i

2021-10-21 Thread Richard Henderson
Pull the "op r, a, a => movi r, 0" optimization into a function, and use it in the outer opcode fold functions. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 41 - 1 file changed, 24 insertions(+), 17 deletions(

[PATCH v3 40/48] tcg/optimize: Expand fold_addsub2_i32 to 64-bit ops

2021-10-21 Thread Richard Henderson
Rename to fold_addsub2. Use Int128 to implement the wider operation. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 65 ++ 1 file changed, 44 insertions(+), 21 deletions(-) diff --git a/tcg/optimize.c b/

[PATCH v3 35/48] tcg/optimize: Split out fold_sub_to_neg

2021-10-21 Thread Richard Henderson
Even though there is only one user, place this more complex conversion into its own helper. Signed-off-by: Richard Henderson --- tcg/optimize.c | 84 -- 1 file changed, 47 insertions(+), 37 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c

[PATCH v3 42/48] tcg/optimize: Add more simplifications for orc

2021-10-21 Thread Richard Henderson
Two simplifications that were missing from before the split to fold functions, and are now easy to provide. Signed-off-by: Richard Henderson --- tcg/optimize.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tcg/optimize.c b/tcg/optimize.c index 92b35a8c3f..dc7744d41a 100644 --- a/tcg/opti

[PATCH v3 47/48] tcg/optimize: Propagate sign info for bit counting

2021-10-21 Thread Richard Henderson
The results are generally 6 bit unsigned values, though the count leading and trailing bits may produce any value for a zero input. Signed-off-by: Richard Henderson --- tcg/optimize.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 64d3

Re: [PATCH v3 07/22] host-utils: add 128-bit quotient support to divu128/divs128

2021-10-21 Thread Richard Henderson
On 9/10/21 4:26 AM, Luis Pires wrote: These will be used to implement new decimal floating point instructions from Power ISA 3.1. A new argument, prem, was added to divu128/divs128 to receive the remainder, freeing up phigh to receive the high 64 bits of the quotient. Signed-off-by: Luis Pires
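
The calling convention described above can be sketched as follows. The parameter names plow, phigh and prem come from the message; the exact signature, the function name divu128_sketch, and the use of the compiler's unsigned __int128 extension (GCC or Clang) are assumptions for illustration, not the implementation in the patch.

```c
#include <stdint.h>

/*
 * Sketch only: divide the 128-bit value (*phigh:*plow) by divisor.
 * On return, *plow and *phigh hold the low and high 64 bits of the
 * quotient and *prem holds the remainder. The caller is assumed to
 * have rejected divisor == 0 beforehand.
 */
static void divu128_sketch(uint64_t *plow, uint64_t *phigh,
                           uint64_t *prem, uint64_t divisor)
{
    unsigned __int128 dividend =
        ((unsigned __int128)*phigh << 64) | *plow;
    unsigned __int128 quotient = dividend / divisor;

    *plow = (uint64_t)quotient;
    *phigh = (uint64_t)(quotient >> 64);
    *prem = (uint64_t)(dividend % divisor);
}
```

Dividing 2^64 by 1 is an example of a quotient that no longer fits in 64 bits and therefore needs the high half.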

Re: [PATCH v3 08/22] host-utils: add unit tests for divu128/divs128

2021-10-21 Thread Richard Henderson
On 9/10/21 4:26 AM, Luis Pires wrote: Signed-off-by: Luis Pires --- tests/unit/meson.build | 1 + tests/unit/test-div128.c | 197 +++ 2 files changed, 198 insertions(+) create mode 100644 tests/unit/test-div128.c Reviewed-by: Richard Henderson r~

[PATCH v3 41/48] tcg/optimize: Sink commutative operand swapping into fold functions

2021-10-21 Thread Richard Henderson
Most of these are handled by creating a fold_const2_commutative to handle all of the binary operators. The rest were already handled on a case-by-case basis in the switch, and have their own fold function in which to place the call. We now have only one major switch on TCGOpcode. Signed-off-by:

[PATCH v3 45/48] tcg/optimize: Propagate sign info for logical operations

2021-10-21 Thread Richard Henderson
Sign repetitions are perforce all identical, whether they are 1 or 0. Bitwise operations preserve the relative quantity of the repetitions. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/optimize.c | 29 + 1 file changed, 29 insertions(+
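
The claim can be checked with a small sketch that counts sign repetitions directly. The function names are hypothetical and this is a property check, not the bookkeeping done in tcg/optimize.c: if the top bits of both inputs are copies of their respective sign bits, the same number of top bits of an AND, OR or XOR result are copies of the result's sign bit.

```c
#include <stdint.h>

/* Number of bits directly below bit 63 that are equal to bit 63. */
static int sign_reps(int64_t v)
{
    uint64_t u = (uint64_t)v;
    uint64_t msb = u >> 63;
    int n = 0;

    for (int i = 62; i >= 0 && ((u >> i) & 1) == msb; i--) {
        n++;
    }
    return n;
}

/*
 * Returns 1 when and/or/xor of a and b each keep at least as many sign
 * repetitions as the smaller of the two inputs.
 */
static int logical_ops_keep_sign_reps(int64_t a, int64_t b)
{
    int ra = sign_reps(a);
    int rb = sign_reps(b);
    int keep = ra < rb ? ra : rb;

    return sign_reps(a & b) >= keep
        && sign_reps(a | b) >= keep
        && sign_reps(a ^ b) >= keep;
}
```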

[PATCH v3 48/48] tcg/optimize: Propagate sign info for shifting

2021-10-21 Thread Richard Henderson
For constant shifts, we can simply shift the s_mask. For variable shifts, we know that sar does not reduce the s_mask, which helps for sequences like ext32s_i64 t, in sar_i64 t, t, v ext32s_i64 out, t allowing the final extend to be eliminated. Signed-off-by: Richard Henderson
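
The redundancy of the final extend in that sequence can be demonstrated with a short sketch. The helper names are hypothetical stand-ins for the TCG opcodes, and the code relies on two implementation-defined behaviours that GCC and Clang provide: wrapping conversion to int32_t and arithmetic right shift of negative values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for ext32s_i64: sign-extend the low 32 bits. */
static int64_t ext32s(int64_t x)
{
    return (int64_t)(int32_t)(uint32_t)x;
}

/* Stand-in for sar_i64 with a variable shift count. */
static int64_t sar64(int64_t x, unsigned v)
{
    return x >> (v & 63);
}

/*
 * Models "ext32s t, in; sar t, t, v; ext32s out, t" and reports whether
 * the last extend changed nothing, i.e. whether it could be dropped.
 */
static bool final_extend_is_redundant(int64_t in, unsigned v)
{
    int64_t t = sar64(ext32s(in), v);

    return ext32s(t) == t;
}
```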

Re: [PATCH v3 15/22] target/ppc: Implement DCTFIXQQ

2021-10-21 Thread Richard Henderson
On 9/10/21 4:26 AM, Luis Pires wrote: --- a/target/ppc/dfp_helper.c +++ b/target/ppc/dfp_helper.c @@ -51,6 +51,12 @@ static void set_dfp128(ppc_fprp_t *dfp, ppc_vsr_t *src) dfp[1].VsrD(0) = src->VsrD(1); } +static void set_dfp128_to_avr(ppc_avr_t *dst, ppc_vsr_t *src) +{ +dst->Vsr

[PATCH v3 46/48] tcg/optimize: Propagate sign info for setcond

2021-10-21 Thread Richard Henderson
The result is either 0 or 1, which means that we have a 2 bit signed result, and thus 62 bits of sign. For clarity, use the smask_from_zmask function. Signed-off-by: Richard Henderson --- tcg/optimize.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tcg/optimize.c b/tcg/optimize.c index d
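
The arithmetic behind "62 bits of sign" can be sketched as below. The function is modelled on the smask_from_zmask name mentioned in the message, but the body is an assumption written for illustration, using the GCC/Clang builtin __builtin_clzll; it is not copied from tcg/optimize.c.

```c
#include <stdint.h>

/*
 * Sketch only: given a mask of bits that may be nonzero (zmask), return
 * a left-aligned mask of bits that are known to be copies of the sign
 * bit. If the msb itself may be set there is no sign information.
 */
static uint64_t smask_from_zmask_sketch(uint64_t zmask)
{
    /* __builtin_clzll(0) is undefined, so handle zero separately. */
    int rep = zmask ? __builtin_clzll(zmask) : 64;

    if (rep == 0) {
        return 0;
    }
    rep -= 1;                   /* one leading zero is the sign bit itself */
    return ~(~UINT64_C(0) >> rep);
}
```

For a setcond result, zmask is 1: there are 63 leading zeros, one of which is the sign bit, which leaves 62 repetitions.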

Re: [PATCH v3 12/22] target/ppc: Implement DCFFIXQQ

2021-10-21 Thread Richard Henderson
On 9/10/21 4:26 AM, Luis Pires wrote: +void helper_DCFFIXQQ(CPUPPCState *env, ppc_fprp_t *t, ppc_avr_t *b) +{ +struct PPC_DFP dfp; Space here after the variable declaration would help. +dfp_prepare_decimal128(&dfp, NULL, NULL, env); +decNumberFromInt128(&dfp.t, (uint64_t)b->VsrD(1

Re: [PATCH v2 0/6] hw/riscv: Use MachineState::ram and MachineClass::default_ram_id in all machines

2021-10-21 Thread Alistair Francis
On Wed, Oct 20, 2021 at 11:41 AM Bin Meng wrote: > > As of today, all RISC-V machines (except virt) are still using > memory_region_init_ram() > to initialize the system RAM, which can't possibly handle vhost-user, and > can't > work as expected with '-numa node,memdev' options. > > Change to us

Re: [PATCH] multiboot: Use DMA instead port-based transfer

2021-10-21 Thread Marcus Hähnel
On Tuesday, October 19, 2021 6:45:44 PM CEST Paolo Bonzini wrote: > On my system (a relatively recent laptop) I get 15-20 MiB per second, > which is slow but not as slow as what you got. Out of curiosity, can > you test what you get with the following kernel patch? > > diff --git a/arch/x86/kvm

Re: [PATCH v1 5/9] hw/intc: sifive_plic: Cleanup the irq_request function

2021-10-21 Thread Alistair Francis
On Thu, Oct 21, 2021 at 5:33 PM Bin Meng wrote: > > On Mon, Oct 18, 2021 at 10:39 AM Alistair Francis > wrote: > > > > From: Alistair Francis > > > > Signed-off-by: Alistair Francis > > --- > > hw/intc/sifive_plic.c | 10 -- > > 1 file changed, 4 insertions(+), 6 deletions(-) > > > > R

Re: [PATCH v3 16/22] target/ppc: Move dtstdc[q]/dtstdg[q] to decodetree

2021-10-21 Thread Richard Henderson
On 9/10/21 4:26 AM, Luis Pires wrote: +&Z22_bf_fra bf fra dm +@Z22_bf_fra .. bf:3 .. fra:5 dm:6 . . &Z22_bf_fra + +%z22_frap 17:4 !function=times_2 +@Z22_bf_frap.. bf:3 .. 0 dm:6 . . &Z22_bf_fra fra=%z22_frap How confusing. The

Re: [PATCH v5 1/8] target/riscv: zfh: half-precision load and store

2021-10-21 Thread Alistair Francis
On Fri, Oct 22, 2021 at 2:30 AM wrote: > > From: Kito Cheng > > Signed-off-by: Kito Cheng > Signed-off-by: Chih-Min Chao > Signed-off-by: Frank Chang > Reviewed-by: Richard Henderson Reviewed-by: Alistair Francis Alistair > --- > target/riscv/cpu.h| 1 + > target

Re: [PATCH v3 17/22] target/ppc: Move d{add,sub,mul,div,iex}[q] to decodetree

2021-10-21 Thread Richard Henderson
On 9/10/21 4:26 AM, Luis Pires wrote: -#define GEN_DFP_T_A_B_Rc(name) \ -static void gen_##name(DisasContext *ctx)\ -{\ -TCGv_ptr rd, ra, rb; \ -if (unlikely(!ctx->fpu_enabled)) { \

Re: [PULL 0/1] Block patches

2021-10-21 Thread Richard Henderson
On 10/21/21 10:41 AM, Stefan Hajnoczi wrote: The following changes since commit afc9fcde55296b83f659de9da3cdf044812a6eeb: Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging (2021-10-20 06:10:51 -0700) are available in the Git repository at: https://gitlab.com/stef

Re: [PATCH v3 20/22] target/ppc: Move dqua[q], drrnd[q] to decodetree

2021-10-21 Thread Richard Henderson
On 9/10/21 4:26 AM, Luis Pires wrote: -#define GEN_DFP_T_A_B_I32_Rc(name, i32fld) \ -static void gen_##name(DisasContext *ctx)\ -{\ -TCGv_ptr rt, ra, rb; \ -TCGv_i32 i32;\

Re: [PATCH v3 18/22] target/ppc: Move dcmp{u,o}[q],dts{tex,tsf,tsfi}[q] to decodetree

2021-10-21 Thread Richard Henderson
On 9/10/21 4:26 AM, Luis Pires wrote: -#define GEN_DFP_BF_A_B(name) \ -static void gen_##name(DisasContext *ctx) \ -{ \ -TCGv_ptr ra, rb; \ -if (unlikely(!ctx->fpu_enabled)) {

RE: [gdbstub] redirecting qemu console output to a debugger

2021-10-21 Thread Sid Manning
> -Original Message- > From: Alex Bennée > Sent: Thursday, October 21, 2021 9:52 AM > To: Philippe Mathieu-Daudé > Cc: Sid Manning ; Marc-André Lureau > ; Paolo Bonzini ; > qemu-devel@nongnu.org > Subject: Re: [gdbstub] redirecting qemu console output to a debugger > > WARNING: This emai

Re: [PATCH 8/8] x86-iommu: Fail early if vIOMMU specified after vfio-pci

2021-10-21 Thread Alex Williamson
On Thu, 21 Oct 2021 18:42:59 +0800 Peter Xu wrote: > Scan the pci bus to make sure there's no vfio-pci device attached before > vIOMMU > is realized. Sorry, I'm not onboard with this solution at all. It would be really useful though if this commit log or a code comment described exactly the in
