From: "Lucas Mateus Castro (alqotel)"
This patch moves VADDCUW and VSUBCUW to decodetree with gvec using an
implementation based on the helper, with the main difference being
changing the -1 (aka all bits set to 1) result returned by cmp when
true to +1. It also implemented a .fni4
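The -1 versus +1 point in the snippet above can be illustrated with a minimal sketch. It is written in Python for readability and is not the TCG code from the patch. The names `cmp_ltu_mask`, `vaddcuw_word` and `vsubcuw_word` are hypothetical, and the sketch assumes the carry-out is derived from an unsigned comparison that yields an all-ones mask when true.

```python
MASK32 = 0xFFFFFFFF  # one 32-bit word element


def cmp_ltu_mask(a, b):
    """Mimic a vector compare: all bits set (-1) if a < b unsigned, else 0."""
    return MASK32 if a < b else 0


def mask_to_one(mask):
    """Turn an all-ones mask (-1) into +1 by negating modulo 2**32."""
    return (-mask) & MASK32


def vaddcuw_word(a, b):
    """Carry-out (0 or 1) of the unsigned 32-bit addition a + b."""
    total = (a + b) & MASK32
    # The wrapped sum is smaller than an operand exactly when a carry occurred.
    return mask_to_one(cmp_ltu_mask(total, a))


def vsubcuw_word(a, b):
    """Carry-out (0 or 1) of a + ~b + 1, which is 1 exactly when a >= b."""
    geu_mask = ~cmp_ltu_mask(a, b) & MASK32
    return mask_to_one(geu_mask)
```

For example, `vaddcuw_word(0xFFFFFFFF, 1)` wraps around and reports a carry, while `vsubcuw_word(3, 5)` reports none because a borrow is needed.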
From: "Lucas Mateus Castro (alqotel)"
Used gvec to translate XVTSTDCSP and XVTSTDCDP.
xvtstdcsp:
rept loop imm master version prev version current version
25 4000 0 0,206200 0,040730 (-80.2%) 0,040740 (-80.2%)
25 4000 1
From: "Lucas Mateus Castro (alqotel)"
Moved XSTSTDCSP, XSTSTDCDP and XSTSTDCQP to decodetree and moved some of
its decoding away from the helper as previously the DCMX, XB and BF were
calculated in the helper with the help of cpu_env, now that part was
moved to the decodetree wit
From: "Lucas Mateus Castro (alqotel)"
Moved XVABSSP, XVABSDP, XVNABSSP,XVNABSDP, XVNEGSP and XVNEGDP to
decodetree and used gvec to translate them.
xvabssp:
rept loop master patch
8 12500 0,00477900 0,00476000 (-0.4%)
25 4000
From: "Lucas Mateus Castro (alqotel)"
Moved VABSDUB, VABSDUH and VABSDUW to decodetree and use gvec to
translate them.
vabsdub:
rept loop master patch
8 12500 0,03601600 0,00688500 (-80.9%)
25 4000 0,03651000 0,00532100 (-85.4%)
10
From: "Lucas Mateus Castro (alqotel)"
Moved the instructions VAVGUB, VAVGUH, VAVGUW, VAVGSB, VAVGSH and VAVGSW
to decodetree and used gvec with them. For these, the right shift
had to be done before the sum to avoid an overflow, so 1 is added at the
end if any of the entries had 1 in its
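The shift-before-sum idea in the snippet above can be sketched as follows. This is an illustration in Python, not the gvec code from the patch. The function name `avg_round` is hypothetical, and the sketch assumes the truncated sentence refers to the least significant bit of either operand.

```python
def avg_round(a, b):
    """Rounded average (a + b + 1) >> 1 computed without a wider intermediate.

    Each operand is halved first, so the partial sum stays within the
    element range. The bit lost by the shifts is restored by adding 1
    when either operand has its least significant bit set.
    """
    return (a >> 1) + (b >> 1) + ((a | b) & 1)
```

With 8-bit unsigned inputs the largest intermediate value is 127 + 127 + 1 = 255, which still fits in the element, whereas a + b + 1 could reach 511.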
From: "Lucas Mateus Castro (alqotel)"
Moved VPRTYBW and VPRTYBD to gvec, and moved both of them along with
VPRTYBQ to decodetree. VPRTYBW and VPRTYBD now also use .fni4 and .fni8,
respectively.
vprtybw:
rept loop master patch
8 12500 0,01198900 0,00703100 (
From: "Lucas Mateus Castro (alqotel)"
Moved the instructions VNEGW and VNEGD to decodetree and used gvec to
translate them.
vnegw:
rept loop master patch
8 12500 0,01053200 0,00548400 (-47.9%)
25 4000 0,01030500 0,0039 (-62.2%)
10
From: "Lucas Mateus Castro (alqotel)"
This patch moves VMHADDSHS and VMHRADDSHS to decodetree; I couldn't find
a satisfactory implementation with TCG inline.
vmhaddshs:
rept loop master patch
8 12500 0,02983400 0,02648500 (-11.2%)
25 400
From: "Lucas Mateus Castro (alqotel)"
Patches missing review: 12
v2 -> v3:
- Used ctpop in i32 and i64 vprtyb
- Changed gvec set up in xvtstdc[ds]p
v1 -> v2:
- Implemented instructions with fni4/fni8 and dropped the helper:
* VSUBCUW
* VADDCUW
From: "Lucas Mateus Castro (alqotel)"
This patch moves VMLADDUHM to decodetree and creates a gvec implementation
using mul_vec and add_vec.
rept loop master patch
8 12500 0,01810500 0,00903100 (-50.1%)
25 4000 0,01739400 0,00747700 (-
From: "Lucas Mateus Castro (alqotel)"
Moved XVCPSGNSP and XVCPSGNDP to decodetree and used gvec to translate
them.
xvcpsgnsp:
rept loop master patch
8 12500 0,00561400 0,00537900 (-4.2%)
25 4000 0,00562100 0,0040 (-28.8%)
10
From: "Lucas Mateus Castro (alqotel)"
Moved XVTSTDCSP and XVTSTDCDP to decodetree and restructured the helper
to be simpler and do all decoding in the decodetree (so XB, XT and DCMX
are all calculated outside the helper).
Obs: The tests in this one are slightly different, these are
From: "Lucas Mateus Castro (alqotel)"
Used gvec to translate XVTSTDCSP and XVTSTDCDP.
xvtstdcsp:
rept loop imm prev version current version
25 4000 0 0,047550 0,040820 (-14.2%)
25 4000 1 0,069520 0,053520 (-23.0%)
25
From: "Lucas Mateus Castro (alqotel)"
Moved XSTSTDCSP, XSTSTDCDP and XSTSTDCQP to decodetree and moved some of
its decoding away from the helper as previously the DCMX, XB and BF were
calculated in the helper with the help of cpu_env, now that part was
moved to the decodetree wit
From: "Lucas Mateus Castro (alqotel)"
Moved XVTSTDCSP and XVTSTDCDP to decodetree and restructured the helper
to be simpler and do all decoding in the decodetree (so XB, XT and DCMX
are all calculated outside the helper).
Obs: The tests in this one are slightly different, these are
From: "Lucas Mateus Castro (alqotel)"
Moved VABSDUB, VABSDUH and VABSDUW to decodetree and use gvec to
translate them.
vabsdub:
rept loop master patch
8 12500 0,03601600 0,00688500 (-80.9%)
25 4000 0,03651000 0,00532100 (-85.4%)
10
From: "Lucas Mateus Castro (alqotel)"
Moved the instructions VAVGUB, VAVGUH, VAVGUW, VAVGSB, VAVGSH and VAVGSW
to decodetree and used gvec with them. For these, the right shift
had to be done before the sum to avoid an overflow, so 1 is added at the
end if any of the entries had 1 in its
From: "Lucas Mateus Castro (alqotel)"
Patches missing review: 3,5,9,11,12
v1 -> v2:
- Implemented instructions with fni4/fni8 and dropped the helper:
* VSUBCUW
* VADDCUW
* VPRTYBW
* VPRTYBD
- Reworked patch12 to only use gvec implementati
From: "Lucas Mateus Castro (alqotel)"
Moved the instructions VNEGW and VNEGD to decodetree and used gvec to
translate them.
vnegw:
rept loop master patch
8 12500 0,01053200 0,00548400 (-47.9%)
25 4000 0,01030500 0,0039 (-62.2%)
10
From: "Lucas Mateus Castro (alqotel)"
Moved XVCPSGNSP and XVCPSGNDP to decodetree and used gvec to translate
them.
xvcpsgnsp:
rept loop master patch
8 12500 0,00561400 0,00537900 (-4.2%)
25 4000 0,00562100 0,0040 (-28.8%)
10
From: "Lucas Mateus Castro (alqotel)"
Moved XVABSSP, XVABSDP, XVNABSSP,XVNABSDP, XVNEGSP and XVNEGDP to
decodetree and used gvec to translate them.
xvabssp:
rept loop master patch
8 12500 0,00477900 0,00476000 (-0.4%)
25 4000
From: "Lucas Mateus Castro (alqotel)"
Moved VPRTYBW and VPRTYBD to gvec, and moved both of them along with
VPRTYBQ to decodetree. VPRTYBW and VPRTYBD now also use .fni4 and .fni8,
respectively.
vprtybw:
rept loop master patch
8 12500 0,00991200 0,00626300 (
From: "Lucas Mateus Castro (alqotel)"
This patch moves VADDCUW and VSUBCUW to decodetree with gvec using an
implementation based on the helper, with the main difference being
changing the -1 (aka all bits set to 1) result returned by cmp when
true to +1. It also implemented a .fni4
From: "Lucas Mateus Castro (alqotel)"
This patch moves VMHADDSHS and VMHRADDSHS to decodetree; I couldn't find
a satisfactory implementation with TCG inline.
vmhaddshs:
rept loop master patch
8 12500 0,02983400 0,02648500 (-11.2%)
25 400
From: "Lucas Mateus Castro (alqotel)"
This patch moves VMLADDUHM to decodetree and creates a gvec implementation
using mul_vec and add_vec.
rept loop master patch
8 12500 0,01810500 0,00903100 (-50.1%)
25 4000 0,01739400 0,00747700 (-
From: "Lucas Mateus Castro (alqotel)"
Used gvec to translate XVTSTDCSP and XVTSTDCDP.
xvtstdcsp:
rept loop patch10 patch12
8 12500 2,70288900 1,24050300 (-54.1%)
25 4000 2,65665700 1,14078900 (-57.1%)
100 1000
From: "Lucas Mateus Castro (alqotel)"
Moved XSTSTDCSP, XSTSTDCDP and XSTSTDCQP to decodetree and moved some of
its decoding away from the helper as previously the DCMX, XB and BF were
calculated in the helper with the help of cpu_env, now that part was
moved to the decodetree wit
From: "Lucas Mateus Castro (alqotel)"
Moved XVTSTDCSP and XVTSTDCDP to decodetree and restructured the helper
to be simpler and do all decoding in the decodetree (so XB, XT and DCMX
are all calculated outside the helper).
Obs: The tests in this one are slightly different, these are
From: "Lucas Mateus Castro (alqotel)"
Moved the instructions VAVGUB, VAVGUH, VAVGUW, VAVGSB, VAVGSH and VAVGSW
to decodetree and used gvec with them. For these, the right shift
had to be done before the sum to avoid an overflow, so 1 is added at the
end if any of the entries had 1 in its
From: "Lucas Mateus Castro (alqotel)"
Moved XVABSSP, XVABSDP, XVNABSSP,XVNABSDP, XVNEGSP and XVNEGDP to
decodetree and used gvec to translate them.
xvabssp:
rept loop master patch
8 12500 0,00477900 0,00476000 (-0.4%)
25 4000
From: "Lucas Mateus Castro (alqotel)"
Moved VPRTYBW and VPRTYBD to gvec, and moved both of them along with
VPRTYBQ to decodetree.
vprtybw:
rept loop master patch
8 12500 0,01215900 0,00705600 (-42.0%)
25 4000 0,01198700 0,00574400 (-52.1%)
10
From: "Lucas Mateus Castro (alqotel)"
Moved VABSDUB, VABSDUH and VABSDUW to decodetree and use gvec to
translate them.
vabsdub:
rept loop master patch
8 12500 0,03601600 0,00688500 (-80.9%)
25 4000 0,03651000 0,00532100 (-85.4%)
10
From: "Lucas Mateus Castro (alqotel)"
This patch moves VADDCUW and VSUBCUW to decodetree with gvec using an
implementation based on the helper, with the main difference being
changing the -1 (aka all bits set to 1) result returned by cmp when
true to +1
vaddcuw:
rept loop
From: "Lucas Mateus Castro (alqotel)"
This patch moves VMLADDUHM to decodetree and creates a gvec implementation
using mul_vec and add_vec.
rept loop master patch
8 12500 0,01810500 0,00903100 (-50.1%)
25 4000 0,01739400 0,00747700 (-
From: "Lucas Mateus Castro (alqotel)"
Moved XVCPSGNSP and XVCPSGNDP to decodetree and used gvec to translate
them.
xvcpsgnsp:
rept loop master patch
8 12500 0,00722000 0,00587700 (-18.6%)
25 4000 0,00604300 0,00521500 (-13.7%)
10
From: "Lucas Mateus Castro (alqotel)"
Moved the instructions VNEGW and VNEGD to decodetree and used gvec to
translate them.
vnegw:
rept loop master patch
8 12500 0,01053200 0,00548400 (-47.9%)
25 4000 0,01030500 0,0039 (-62.2%)
10
From: "Lucas Mateus Castro (alqotel)"
This patch series moves some instructions from decode legacy to
decodetree and translates said instructions with gvec. In some cases using
gvec ended up bigger, more complex and slower, so those
instructions were only moved to decodetree.
In
From: "Lucas Mateus Castro (alqotel)"
This patch moves VMHADDSHS and VMHRADDSHS to decodetree; I couldn't find
a satisfactory implementation with TCG inline.
vmhaddshs:
rept loop master patch
8 12500 0,02983400 0,02648500 (-11.2%)
25 400
This patch series aims to make it easier to set up a compilation and CI
environment on PPC64 and PPC64LE machines.
v3:
Changed patch 1 to respect alphabetical order
v2:
This patch series is only patches 2-4 of v1 and an alternative to patch 1
suggested by Daniel.
Lucas Mateus Castro (alqotel) (4
From: "Lucas Mateus Castro (alqotel)"
The alpine docker image only comes with busybox, which doesn't have the
'-e' option on its readlink, so change it to 'realpath' to avoid that
problem.
Suggested-by: Daniel P. Berrangé
Signed-off-by: Lucas Mateus Castro (
From: "Lucas Mateus Castro (alqotel)"
Changed build-environment.yml to only install spice-server on x86_64 and
aarch64 as this package is only available on those architectures.
Signed-off-by: Lucas Mateus Castro (alqotel)
Reviewed-by: Philippe Mathieu-Daudé
---
scripts/ci/s
From: "Lucas Mateus Castro (alqotel)"
XEN hypervisor is only available on ARM and x86, but the yaml only
checked if the architecture is different from s390x, so change it to
a more accurate test.
Tested this change on a Ubuntu 20.04 ppc64le.
Signed-off-by: Lucas Mateus Castro (alqotel)
From: "Lucas Mateus Castro (alqotel)"
ninja-build is missing from the RHEL environment, so a system prepared
with that script would still fail to compile QEMU.
Tested on a Fedora 36
Signed-off-by: Lucas Mateus Castro (alqotel)
---
scripts/ci/setup/build-environment.yml | 1 +
1 fi
From: "Lucas Mateus Castro (alqotel)"
The alpine docker image only comes with busybox, which doesn't have the
'-e' option on its readlink, so change it to 'realpath' to avoid that
problem.
Suggested-by: Daniel P. Berrangé
Signed-off-by: Lucas Mateus Castro (
From: "Lucas Mateus Castro (alqotel)"
XEN hypervisor is only available on ARM and x86, but the yaml only
checked if the architecture is different from s390x, so change it to
a more accurate test.
Tested this change on a Ubuntu 20.04 ppc64le.
Signed-off-by: Lucas Mateus Castro (alqotel)
From: "Lucas Mateus Castro (alqotel)"
ninja-build is missing from the RHEL environment, so a system prepared
with that script would still fail to compile QEMU.
Tested on a Fedora 36
Signed-off-by: Lucas Mateus Castro (alqotel)
---
scripts/ci/setup/build-environment.yml | 1 +
1 fi
From: "Lucas Mateus Castro (alqotel)"
This patch series aims to make it easier to set up a compilation and CI
environment on PPC64 and PPC64LE machines.
v2:
This patch series is only patches 2-4 of v1 and an alternative to patch 1
suggested by Daniel.
Lucas Mateus Castro (alqotel) (4):
From: "Lucas Mateus Castro (alqotel)"
Changed build-environment.yml to only install spice-server on x86_64 and
aarch64 as this package is only available on those architectures.
Signed-off-by: Lucas Mateus Castro (alqotel)
---
scripts/ci/setup/build-environment.yml | 12 ++
Added a test to see if the adjustment is being made correctly when an
underflow occurs and UE is set.
Signed-off-by: Lucas Mateus Castro (alqotel)
---
This patch will also fail without the underflow with UE set bugfix
Message-Id:<20220805141522.412864-3-lucas.ara...@eldorado.org.br>
---
Added a test to see if the adjustment is being made correctly when an
overflow occurs and OE is set.
Signed-off-by: Lucas Mateus Castro (alqotel)
---
The prctl patch is not ready yet, so this patch does as Richard
Henderson suggested and check the fp register in the signal handler
This patch
From: "Lucas Mateus Castro (alqotel)"
Added a test to see if the adjustment is being made correctly when an
underflow occurs and UE is set.
Signed-off-by: Lucas Mateus Castro (alqotel)
---
This patch will also fail without the underflow with UE set bugfix
Message-Id:<2022080514
From: "Lucas Mateus Castro (alqotel)"
Added a test to see if the adjustment is being made correctly when an
overflow occurs and OE is set.
Signed-off-by: Lucas Mateus Castro (alqotel)
---
The prctl patch is not ready yet, so this patch does as Richard
Henderson suggested and ch
From: "Lucas Mateus Castro (alqotel)"
Added the possibility of recalculating a result if it overflows or
underflows, if the result overflow and the rebias bool is true then the
intermediate result should have 3/4 of the total range subtracted from
the exponent. The same for underf
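The "3/4 of the total range" rule in the snippet above can be sketched with a small calculation. This is a Python illustration under stated assumptions, not the softfloat code from the patch. The function names and the `exp_bits` parameter are hypothetical, and the underflow branch assumes the adjustment is added back, as other messages in this thread describe.

```python
def re_bias(exp_bits):
    """Return 3/4 of the total exponent range for exp_bits exponent bits."""
    total_range = 1 << exp_bits
    return (3 * total_range) // 4


def adjust_exponent(exp, exp_bits, overflow):
    """Subtract the rebias value on overflow, add it on underflow."""
    delta = re_bias(exp_bits)
    return exp - delta if overflow else exp + delta


single = re_bias(8)    # 8 exponent bits, range 256, so 192
double = re_bias(11)   # 11 exponent bits, range 2048, so 1536
```

The two values match the 192 and 1536 adjustments quoted from the ISA elsewhere in this thread for single and double precision.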
From: "Lucas Mateus Castro (alqotel)"
When an overflow exception occurs and OE is set the intermediate result
should be adjusted (by subtracting from the exponent) to avoid rounding
to inf. The same applies to an underflow exception and UE (but adding
to the exponent). To do th
From: "Lucas Mateus Castro (alqotel)"
Changes in v2:
- Completely reworked the solution:
* Created re_bias in FloatFmt, it is 3/4 of the total exponent
range of a FP type
* Added rebias bools that dictates if the result should have
its ex
From: "Lucas Mateus Castro (alqotel)"
DO NOT MERGE
This patch adds a test to check if the add/sub of the intermediate
result when an overflow or underflow exception with the corresponding
enabling bit being set (i.e. OE/UE), but linux-user currently can't
disable MSR.FE0 and MSR
From: "Lucas Mateus Castro (alqotel)"
Change fdiv in the same way of fadd/fsub to handle overflow/underflow if
OE/UE is set (i.e. function that receives a value to add/subtract from
the exponent if an overflow/underflow occurs).
Signed-off-by: Lucas Mateus Castro (alqotel)
---
fpu/s
From: "Lucas Mateus Castro (alqotel)"
Change fmul in the same way of fadd/fsub to handle overflow/underflow if
OE/UE is set (i.e. function that receives a value to add/subtract from
the exponent if an overflow/underflow occurs).
Signed-off-by: Lucas Mateus Castro (alqotel)
---
fpu/s
From: "Lucas Mateus Castro (alqotel)"
As mentioned in the functions float_overflow_excp and
float_underflow_excp, the result should be adjusted as mentioned in the
ISA (subtracted 192/1536 from the exponent of the intermediate result if
an overflow occurs with OE set and added 192/1
e used in
the docker command.
Signed-off-by: Lucas Mateus Castro(alqotel)
---
tests/docker/docker.py | 15 ---
tests/docker/dockerfiles/alpine.docker | 2 ++
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/tests/docker/docker.py b/tests/docker/docker
Minicloud doesn't have a RHEL image, but it does have Fedora 34 and 35
images and both use DNF as package manager, so just change the ansible facts
to check if it's RHEL or Fedora
Signed-off-by: Lucas Mateus Castro(alqotel)
---
scripts/ci/setup/build-environment.yml | 12 --
ninja-build is missing from the RHEL environment, so a system prepared
with that script would still fail to compile QEMU.
Tested on a Fedora 36
Signed-off-by: Lucas Mateus Castro(alqotel)
---
scripts/ci/setup/build-environment.yml | 1 +
1 file changed, 1 insertion(+)
diff --git a/scripts/ci
XEN hypervisor is only available on ARM and x86, but the yaml only
checked if the architecture is different from s390x, so change it to
a more accurate test.
Tested this change on a Ubuntu 20.04 ppc64le.
Signed-off-by: Lucas Mateus Castro(alqotel)
---
scripts/ci/setup/build-environment.yml | 2
Minicloud has a PPC64 BE Debian11 image which can be used for the CI,
so add Debian to the build-environment.yml so it can be configured with
ansible-playbook.
Signed-off-by: Lucas Mateus Castro(alqotel)
---
scripts/ci/setup/build-environment.yml | 31 +-
1 file changed
Added ppc64le so that the gitlab-runner.yml could be used to set up
ppc64le runners.
Signed-off-by: Lucas Mateus Castro(alqotel)
---
scripts/ci/setup/vars.yml.template | 1 +
1 file changed, 1 insertion(+)
diff --git a/scripts/ci/setup/vars.yml.template
b/scripts/ci/setup/vars.yml.template
Currently the run script uses 'readlink -e' but the image only has the
busybox readlink; this commit adds the coreutils package, which
contains the readlink with the '-e' option.
Signed-off-by: Lucas Mateus Castro(alqotel)
---
tests/docker/dockerfiles/alpine.docker | 1
Changed build-environment.yml to only install spice-server on x86_64 and
aarch64 as this package is only available on those architectures.
Signed-off-by: Lucas Mateus Castro(alqotel)
---
scripts/ci/setup/build-environment.yml | 12 +++-
1 file changed, 11 insertions(+), 1 deletion
way to run the docker tests in PPC64LE.
Lucas Mateus Castro(alqotel) (8):
tests/docker: Fix alpine dockerfile
scripts/ci/setup: ninja missing from build-environment
scripts/ci/setup: Fix libxen requirements
scripts/ci/setup: spice-server only on x86 aarch64
scripts/ci/setup: Add ppc64le
: 20201105154208.12442-1-ganqi...@huawei.com
Signed-off-by: Lucas Mateus Castro(alqotel)
---
Currently there's a disagreement between the checkpatch code and the
documentation, this RFC just changes the checkpatch to match the
documentation.
But there was a discussion in 2020 as the best way to deal
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
vmodsq: Vector Modulo Signed Quadword
vmoduq: Vector Modulo Unsigned Quadword
Signed-off-by: Lucas Mateus Castro (alqotel)
Reviewed-by: Richard Henderson
Resolves: https://gitlab.com/qemu-pr
From: "Lucas Mateus Castro (alqotel)"
Based on already existing QEMU implementation, created an unsigned 256
bit by 128 bit division needed to implement the vector divide extended
unsigned instruction from PowerISA3.1
Signed-off-by: Lucas Mateus Castro (alqotel)
---
This patch ha
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
vdivesw: Vector Divide Extended Signed Word
vdiveuw: Vector Divide Extended Unsigned Word
Signed-off-by: Lucas Mateus Castro (alqotel)
---
target/ppc/insn32.decode| 3 ++
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
vmodsw: Vector Modulo Signed Word
vmoduw: Vector Modulo Unsigned Word
vmodsd: Vector Modulo Signed Doubleword
vmodud: Vector Modulo Unsigned Doubleword
Signed-off-by: Lucas Mateus Castr
From: "Lucas Mateus Castro (alqotel)"
Based on already existing QEMU implementation created a signed
256 bit by 128 bit division needed to implement the vector divide
extended signed quadword instruction from PowerISA 3.1
Signed-off-by: Lucas Mateus Castro (alqotel)
Reviewed-b
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
vdivsw: Vector Divide Signed Word
vdivuw: Vector Divide Unsigned Word
vdivsd: Vector Divide Signed Doubleword
vdivud: Vector Divide Unsigned Doubleword
Signed-off-by: Lucas Mateus Castr
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
vdivsq: Vector Divide Signed Quadword
vdivuq: Vector Divide Unsigned Quadword
Signed-off-by: Lucas Mateus Castro (alqotel)
Reviewed-by: Richard Henderson
---
target/ppc/helper.h
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
vdivesd: Vector Divide Extended Signed Doubleword
vdiveud: Vector Divide Extended Unsigned Doubleword
vdivesq: Vector Divide Extended Signed Quadword
vdiveuq: Vector Divide Extended Unsigne
From: Joel Stanley
These are new hwcap bits added for power10.
Signed-off-by: Joel Stanley
Signed-off-by: Lucas Mateus Castro (alqotel)
Reviewed-by: Richard Henderson
---
linux-user/elfload.c | 4
1 file changed, 4 insertions(+)
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
xvbf16ger2: VSX Vector bfloat16 GER (rank-2 update)
xvbf16ger2nn: VSX Vector bfloat16 GER (rank-2 update) Negative multiply,
Negative accumulate
xvbf16ger2np: VSX Vector bfloat16 GER (ran
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
pmxvi4ger8: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer
GER (rank-4 update)
pmxvi4ger8pp: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer
GER (rank-4 update) Positiv
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
pmxvf16ger2: Prefixed Masked VSX Vector 16-bit Floating-Point GER
(rank-2 update)
pmxvf16ger2nn: Prefixed Masked VSX Vector 16-bit Floating-Point GER
(rank-2 update) Negative multiply, Negative
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
xxmfacc: VSX Move From Accumulator
xxmtacc: VSX Move To Accumulator
xxsetaccz: VSX Set Accumulator to Zero
The PowerISA 3.1 mentions that for the current version of the
architecture, "
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
xvf32ger: VSX Vector 32-bit Floating-Point GER (rank-1 update)
xvf32gernn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative
multiply, Negative accumulate
xvf32gernp: VSX Vec
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
xvf16ger2: VSX Vector 16-bit Floating-Point GER (rank-2 update)
xvf16ger2nn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative
multiply, Negative accumulate
xvf16ger2np: VSX Vec
From: "Lucas Mateus Castro (alqotel)"
Based-on: https://gitlab.com/danielhb/qemu/-/tree/ppc-next
This patch series implements the Matrix-Multiply Assist (MMA)
instructions from the PowerISA 3.1
This patch series was created based on Victor's target/pp
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
xvi4ger8: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update)
xvi4ger8pp: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update)
Positive multiply, Positive accumulate
xvi8ger
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
pmxvf16ger2: Prefixed Masked VSX Vector 16-bit Floating-Point GER
(rank-2 update)
pmxvf16ger2nn: Prefixed Masked VSX Vector 16-bit Floating-Point GER
(rank-2 update) Negative multiply, Negative
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
xvf16ger2: VSX Vector 16-bit Floating-Point GER (rank-2 update)
xvf16ger2nn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative
multiply, Negative accumulate
xvf16ger2np: VSX Vec
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
xvbf16ger2: VSX Vector bfloat16 GER (rank-2 update)
xvbf16ger2nn: VSX Vector bfloat16 GER (rank-2 update) Negative multiply,
Negative accumulate
xvbf16ger2np: VSX Vector bfloat16 GER (ran
From: Joel Stanley
These are new hwcap bits added for power10.
Signed-off-by: Joel Stanley
Signed-off-by: Lucas Mateus Castro (alqotel)
Reviewed-by: Richard Henderson
---
linux-user/elfload.c | 4
1 file changed, 4 insertions(+)
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
xxmfacc: VSX Move From Accumulator
xxmtacc: VSX Move To Accumulator
xxsetaccz: VSX Set Accumulator to Zero
The PowerISA 3.1 mentions that for the current version of the
architecture, "
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
xvi4ger8: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update)
xvi4ger8pp: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update)
Positive multiply, Positive accumulate
xvi8ger
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
xvf32ger: VSX Vector 32-bit Floating-Point GER (rank-1 update)
xvf32gernn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative
multiply, Negative accumulate
xvf32gernp: VSX Vec
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
pmxvi4ger8: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer
GER (rank-4 update)
pmxvi4ger8pp: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer
GER (rank-4 update) Positiv
From: "Lucas Mateus Castro (alqotel)"
Based-on: <20220517161522.36132-1-victor.colo...@eldorado.org.br>
This patch series implements the Matrix-Multiply Assist (MMA)
instructions from the PowerISA 3.1
These and the VDIV/VMOD implementation are the last
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
pmxvf16ger2: Prefixed Masked VSX Vector 16-bit Floating-Point GER
(rank-2 update)
pmxvf16ger2nn: Prefixed Masked VSX Vector 16-bit Floating-Point GER
(rank-2 update) Negative multiply, Negative
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
pmxvi4ger8: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer
GER (rank-4 update)
pmxvi4ger8pp: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer
GER (rank-4 update) Positiv
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
xvbf16ger2: VSX Vector bfloat16 GER (rank-2 update)
xvbf16ger2nn: VSX Vector bfloat16 GER (rank-2 update) Negative multiply,
Negative accumulate
xvbf16ger2np: VSX Vector bfloat16 GER (ran
From: "Lucas Mateus Castro (alqotel)"
Implement the following PowerISA v3.1 instructions:
xxmfacc: VSX Move From Accumulator
xxmtacc: VSX Move To Accumulator
xxsetaccz: VSX Set Accumulator to Zero
The PowerISA 3.1 mentions that for the current version of the
architecture, "