Christophe Lyon <christophe.l...@linaro.org> writes:
> On Wed, 18 Sep 2019 at 11:41, Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> Richard Biener <richard.guent...@gmail.com> writes:
>> > On Tue, Sep 17, 2019 at 4:33 PM Richard Sandiford
>> > <richard.sandif...@arm.com> wrote:
>> >>
>> >> assemble_real used GEN_INT to create integers directly from the
>> >> longs returned by real_to_target.  assemble_integer then went on
>> >> to interpret the const_ints as though they had the mode corresponding
>> >> to the accompanying size parameter:
>> >>
>> >>       imode = mode_for_size (size * BITS_PER_UNIT, mclass, 0).require ();
>> >>
>> >>       for (i = 0; i < size; i += subsize)
>> >>         {
>> >>           rtx partial = simplify_subreg (omode, x, imode, i);
>> >>
>> >> But in the assemble_real case, X might not be canonical for IMODE.
>> >>
>> >> If the interface to assemble_integer is supposed to allow outputting
>> >> (say) the low 4 bytes of a DImode integer, then the simplify_subreg
>> >> above is wrong.  But if the number of bytes passed to assemble_integer
>> >> is supposed to be the number of bytes that the integer actually contains,
>> >> assemble_real is wrong.
>> >>
>> >> This patch takes the latter interpretation and makes assemble_real
>> >> generate const_ints that are canonical for the number of bytes passed.
>> >>
>> >> The flip_storage_order handling assumes that each long is a full
>> >> SImode, which e.g. excludes BITS_PER_UNIT != 8 and float formats
>> >> whose memory size is not a multiple of 32 bits (which includes
>> >> HFmode at least).  The patch therefore leaves that code alone.
>> >> If interpreting each integer as SImode is correct, the const_ints
>> >> that it generates are also correct.
>> >>
>> >> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  Also tested
>> >> by making sure that there were no new errors from a range of
>> >> cross-built targets.  OK to install?
>> >>
>> >> Richard
>> >>
>> >>
>> >> 2019-09-17  Richard Sandiford  <richard.sandif...@arm.com>
>> >>
>> >> gcc/
>> >>         * varasm.c (assemble_real): Generate canonical const_ints.
>> >>
>> >> Index: gcc/varasm.c
>> >> ===================================================================
>> >> --- gcc/varasm.c        2019-09-05 08:49:30.829739618 +0100
>> >> +++ gcc/varasm.c        2019-09-17 15:30:10.400740515 +0100
>> >> @@ -2873,25 +2873,27 @@ assemble_real (REAL_VALUE_TYPE d, scalar
>> >>    real_to_target (data, &d, mode);
>> >>
>> >>    /* Put out the first word with the specified alignment.  */
>> >> +  unsigned int chunk_nunits = MIN (nunits, units_per);
>> >>    if (reverse)
>> >>      elt = flip_storage_order (SImode, gen_int_mode (data[nelts - 1], SImode));
>> >>    else
>> >> -    elt = GEN_INT (data[0]);
>> >> -  assemble_integer (elt, MIN (nunits, units_per), align, 1);
>> >> -  nunits -= units_per;
>> >> +    elt = GEN_INT (sext_hwi (data[0], chunk_nunits * BITS_PER_UNIT));
>> >
>> > why the apparent difference between the storage-order flipping
>> > variant using gen_int_mode vs. the GEN_INT with sext_hwi?
>> > Can't we use gen_int_mode in the non-flipping path and be done with that?
>>
>> Yeah, I mentioned this in the covering note.  The flip_storage_order
>> stuff only seems to work for floats that are a multiple of 32 bits in
>> size, so it doesn't e.g. handle HFmode or 80-bit floats, whereas the
>> new "else" does.  Hard-coding SImode also hard-codes BITS_PER_UNIT==8,
>> unlike the "else".
>>
>> So if anything, it's flip_storage_order that might need to change
>> to avoid hard-coding SImode.  That doesn't look like a trivial change
>> though.  E.g. the number of bytes passed to assemble_integer would need
>> to match the number of bytes in data[nelts - 1] rather than data[0].
>> The alignment code below would also need to be adjusted.  Fixing that
>> (if it is a bug) seems like a separate change and TBH I'd rather not
>> touch it here.
>>
>
> Hi Richard,
>
> I suspect you've probably noticed already, but in case you haven't:
> this patch causes a regression on arm:
> FAIL: gcc.target/arm/fp16-compile-alt-3.c scan-assembler \t.short\t49152
> FAIL: gcc.target/arm/fp16-compile-ieee-3.c scan-assembler \t.short\t49152

Hadn't noticed that actually (but should have) -- thanks for the heads up.
I've applied the below as obvious after testing on armeb-eabi.

Richard


2019-09-26  Richard Sandiford  <richard.sandif...@arm.com>

gcc/testsuite/
        * gcc.target/arm/fp16-compile-alt-3.c: Expect (__fp16) -2.0
        to be written as a negative short rather than a positive one.
        * gcc.target/arm/fp16-compile-ieee-3.c: Likewise.
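
For readers following the arithmetic: (__fp16) -2.0 has the bit pattern
0xC000 in both the IEEE and ARM alternative half-precision formats.  Read
as an unsigned 16-bit value that is 49152; sign-extended from 16 bits,
which is the form a canonical const_int for a 2-byte integer takes, it is
-16384.  Both spellings describe the same 16-bit pattern.  The sketch
below is illustrative only and is not part of the patch;
sign_extend_low_bits is a hypothetical stand-in for GCC's sext_hwi,
written against int64_t rather than HOST_WIDE_INT.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not GCC code): sign-extend the low PREC bits of
   SRC to a 64-bit signed value.  Assumes 0 < PREC < 64.  */
static int64_t
sign_extend_low_bits (uint64_t src, unsigned int prec)
{
  assert (prec > 0 && prec < 64);

  uint64_t mask = ((uint64_t) 1 << prec) - 1;   /* low PREC bits set */
  uint64_t sign = (uint64_t) 1 << (prec - 1);   /* sign bit of the field */
  uint64_t low = src & mask;

  /* Flipping the sign bit and then subtracting it maps
     [sign, 2^PREC) onto [-sign, 0) and leaves [0, sign) unchanged,
     without relying on implementation-defined shifts or conversions.  */
  return (int64_t) (low ^ sign) - (int64_t) sign;
}
```

With this helper, sign_extend_low_bits (0xC000, 16) gives -16384, while
a positive value such as 0x4000 (the encoding of 2.0) is unchanged at
16384.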

Index: gcc/testsuite/gcc.target/arm/fp16-compile-alt-3.c
===================================================================
--- gcc/testsuite/gcc.target/arm/fp16-compile-alt-3.c   2019-03-08 18:14:28.836998325 +0000
+++ gcc/testsuite/gcc.target/arm/fp16-compile-alt-3.c   2019-09-26 11:42:47.502378676 +0100
@@ -7,4 +7,4 @@
 __fp16 xx = -2.0;
 
 /* { dg-final { scan-assembler "\t.size\txx, 2" } } */
-/* { dg-final { scan-assembler "\t.short\t49152" } } */
+/* { dg-final { scan-assembler "\t.short\t-16384" } } */
Index: gcc/testsuite/gcc.target/arm/fp16-compile-ieee-3.c
===================================================================
--- gcc/testsuite/gcc.target/arm/fp16-compile-ieee-3.c  2019-03-08 18:14:28.732998720 +0000
+++ gcc/testsuite/gcc.target/arm/fp16-compile-ieee-3.c  2019-09-26 11:42:47.506378645 +0100
@@ -6,4 +6,4 @@
 __fp16 xx = -2.0;
 
 /* { dg-final { scan-assembler "\t.size\txx, 2" } } */
-/* { dg-final { scan-assembler "\t.short\t49152" } } */
+/* { dg-final { scan-assembler "\t.short\t-16384" } } */
