The PowerPC ISA 3.0 (power9) has new instructions that make it feasible to
allow QImode and HImode to be allocated to vector registers.  The new
instructions are:

    * load byte with zero extend
    * load half word with zero extend
    * store byte
    * store half word
    * extract byte from vector and zero extend
    * extract half word from vector and zero extend
    * sign extend byte to word/double word
    * sign extend half word to word/double word
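As a rough illustration of the source-level code these instructions are aimed
at, here is a minimal sketch in C.  The function names are hypothetical and do
not come from the patch or its tests.  The vector typedef uses GCC's generic
vector extension, and the comments only say which kind of instruction a
compiler might pick on ISA 3.0; the code actually generated depends on the
compiler version and options.

```c
/* Hypothetical examples, not taken from the patch or its testsuite.  */

/* 16 unsigned bytes, using GCC's generic vector extension.  */
typedef unsigned char v16qi_t __attribute__ ((vector_size (16)));

/* Byte in memory -> double.  With QImode allowed in vector registers,
   this is a candidate for a byte load with zero extend followed by an
   integer to floating point conversion, without going through a GPR.  */
double
uchar_to_double (const unsigned char *p)
{
  return (double) *p;
}

/* Signed byte in memory -> double.  The value has to be sign extended
   before the conversion.  */
double
schar_to_double (const signed char *p)
{
  return (double) *p;
}

/* Double -> half word in memory.  The truncated result is a candidate
   for a half word store directly from a vector register.  */
void
double_to_ushort (unsigned short *p, double d)
{
  *p = (unsigned short) d;
}

/* Extract one byte element from a vector and zero extend it.  */
unsigned long
extract_byte3 (v16qi_t v)
{
  return v[3];
}
```

Building such a file with something like -mcpu=power9 -O2 and reading the
assembly is one way to see whether the small integer value stays in the vector
registers.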

I have bootstrapped a previous version of the changes on a little endian Power8
system, and I'm now repeating the bootstrap on both a big endian Power8 and a
little endian Power8.  Assuming there are no regressions with the patches, can
I check these patches into the trunk?

I have built the SPEC CPU 2006 benchmark suite with these changes, and the
power8 (ISA 2.07) code generation does not change.

I have also built the SPEC CPU 2006 benchmark suite for power9.  The following
15 (out of 30) benchmarks had code changes:

    * perlbench         (char <-> floating point)
    * gcc               (one extra ld/std)
    * gamess            (int <-> floating point)
    * gromacs           (one fmr instead of li/mtvsrd)
    * cactusADM         (char/int <-> floating point)
    * namd              (floating point -> int)
    * gobmk             (floating point -> int)
    * dealII            (int/long in vector regs. vs. gprs)
    * povray            (char/int <-> floating point)
    * calculix          (int -> zero extend to long)
    * hmmer             (floating point -> int)
    * h264ref           (zero extend short)
    * tonto             (floating point -> int)
    * omnetpp           (floating point -> int)
    * wrf               (floating point -> unsigned/int)

[gcc]
2016-11-09  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): If ISA 3.0,
        enable HImode and QImode to go in vector registers by default if
        the -mvsx-small-integer option is enabled.
        (rs6000_secondary_reload_simple_move): Likewise.
        (rs6000_preferred_reload_class): Don't force integer constants
        that we can easily place in memory (or create in the GPRs and move
        over with direct move) to be loaded into vector registers.
        * config/rs6000/vsx.md (UNSPEC_P9_MEMORY): Delete, no longer
        used.
        (vsx_extract_<mode>): Rework V4SImode, V8HImode, and V16QImode
        vector extraction on ISA 3.0 when the scalar integer can be
        allocated in vector registers.  Generate the VEC_SELECT directly,
        and don't use UNSPECs to avoid having the scalar type in a vector
        register.  Make the expander target registers, and let the
        combiner fold in results storing to memory, if the machine
        supports stores.
        (vsx_extract_<mode>_di): Likewise.
        (vsx_extract_<mode>_p9): Likewise.
        (vsx_extract_<mode>_di_p9): Likewise.
        (vsx_extract_<mode>_store_p9): Likewise.
        (vsx_extract_si): Likewise.
        (vsx_extract_<mode>_p8): Likewise.
        (p9_lxsi<wd>zx): Delete, no longer used.
        (p9_stxsi<wd>x): Likewise.
        * config/rs6000/rs6000.md (INT_ISA3): New mode iterator for
        integers in vector registers for ISA 3.0.
        (QHI): Update comment.
        (zero_extendqi<mode>2): Add support for ISA 3.0 scalar load or
        vector extract instructions in sign/zero extend.
        (zero_extendhi<mode>2): Likewise.
        (extendqi<mode>2): Likewise.
        (extendhi<mode>2): Likewise.
        (HImode splitter for load/sign extend in vector register):
        Likewise.
        (float<QHI:mode><FP_ISA3:mode>2): Eliminate old method of
        optimizing floating point conversions to/from small data types and
        rewrite it to support QImode/HImode being allowed in vector
        registers on ISA 3.0.
        (float<QHI:mode><FP_ISA3:mode>2_internal): Likewise.
        (floatuns<QHI:mode><FP_ISA3:mode>2): Likewise.
        (floatuns<QHI:mode><FP_ISA3:mode>2_internal): Likewise.
        (fix_trunc<SFDF:mode><QHI:mode>2): Likewise.
        (fix_trunc<SFDF:mode><QHI:mode>2_internal): Likewise.
        (fixuns_trunc<SFDF:mode><QHI:mode>2): Likewise.
        (fixuns_trunc<SFDF:mode><QHI:mode>2_internal): Likewise.
        (movhi_internal): Combine movhi_internal and movqi_internal into
        one mov<mode>_internal with an iterator.  Add support for QImode
        and HImode being allowed in vector registers.  Make the large
        number of attributes and constraints easier to read.
        (movqi_internal): Likewise.
        (mov<mode>_internal): Likewise.
        (movdi_internal64): Fix constraint to allow loading -16..15 with
        VSPLITISW on ISA 2.07.
        (integer XXSPLTIB splitter): Add support for QI, HI, and SImode as
        well as DImode.

[gcc/testsuite]
2016-11-09  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * gcc.target/powerpc/vsx-qimode.c: New test for QImode, HImode
        being allowed in vector registers.
        * gcc.target/powerpc/vsx-qimode2.c: Likewise.
        * gcc.target/powerpc/vsx-qimode3.c: Likewise.
        * gcc.target/powerpc/vsx-himode.c: Likewise.
        * gcc.target/powerpc/vsx-himode2.c: Likewise.
        * gcc.target/powerpc/vsx-himode3.c: Likewise.
        * gcc.target/powerpc/p9-extract-1.c: Change MFVSRD to just MFVSR,
        to allow matching MFVSRD or MFVSRW.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
