Add some opcodes for compound logic operations that were so far marked as TODO. Implement those for PPC and S390X.
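
For reference, the three compound operations follow the usual bitwise definitions. The sketch below is only a scalar illustration on one 64-bit lane; the function names are made up for this note and are not identifiers from the series, and the backends emit native vector instructions rather than anything like this.

```c
#include <stdint.h>

/*
 * Scalar reference semantics for the compound logic operations,
 * shown on a single 64-bit lane.  The ref_* names are hypothetical
 * and exist only for this illustration.
 */
static inline uint64_t ref_nand(uint64_t a, uint64_t b)
{
    return ~(a & b);    /* NOT of AND */
}

static inline uint64_t ref_nor(uint64_t a, uint64_t b)
{
    return ~(a | b);    /* NOT of OR */
}

static inline uint64_t ref_eqv(uint64_t a, uint64_t b)
{
    return ~(a ^ b);    /* NOT of XOR: bits set where a and b agree */
}
```

A vector version applies the same expression to every lane independently.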

We do not want to implement 512-bit width operations, because those
trigger a cluster clock slowdown on current Intel CPUs.  But there are
new operations in AVX512 that apply to 128-bit and 256-bit vectors,
which do not trigger the slowdown, and those are very interesting.


r~


Richard Henderson (20):
  tcg/optimize: Fix folding of vector ops
  tcg: Add opcodes for vector nand, nor, eqv
  tcg/ppc: Implement vector NAND, NOR, EQV
  tcg/s390x: Implement vector NAND, NOR, EQV
  tcg/i386: Detect AVX512
  tcg/i386: Add tcg_out_evex_opc
  tcg/i386: Use tcg_can_emit_vec_op in expand_vec_cmp_noinv
  tcg/i386: Implement avx512 variable shifts
  tcg/i386: Implement avx512 scalar shift
  tcg/i386: Implement avx512 immediate sari shift
  tcg/i386: Implement avx512 immediate rotate
  tcg/i386: Implement avx512 variable rotate
  tcg/i386: Support avx512vbmi2 vector shift-double instructions
  tcg/i386: Expand vector word rotate as avx512vbmi2 shift-double
  tcg/i386: Remove rotls_vec from tcg_target_op_def
  tcg/i386: Expand scalar rotate with avx512 insns
  tcg/i386: Implement avx512 min/max/abs
  tcg/i386: Implement avx512 multiply
  tcg/i386: Implement more logical operations for avx512
  tcg/i386: Implement bitsel for avx512

 include/qemu/cpuid.h          |  20 +-
 include/tcg/tcg-opc.h         |   3 +
 include/tcg/tcg.h             |   3 +
 tcg/aarch64/tcg-target.h      |   3 +
 tcg/arm/tcg-target.h          |   3 +
 tcg/i386/tcg-target-con-set.h |   1 +
 tcg/i386/tcg-target.h         |  17 +-
 tcg/i386/tcg-target.opc.h     |   3 +
 tcg/ppc/tcg-target.h          |   3 +
 tcg/s390x/tcg-target.h        |   3 +
 tcg/optimize.c                |  61 ++++--
 tcg/tcg-op-vec.c              |  27 ++-
 tcg/tcg.c                     |   6 +
 tcg/i386/tcg-target.c.inc     | 386 ++++++++++++++++++++++++++++------
 tcg/ppc/tcg-target.c.inc      |  15 ++
 tcg/s390x/tcg-target.c.inc    |  17 ++
 16 files changed, 472 insertions(+), 99 deletions(-)

-- 
2.25.1