On Thu, Aug 26, 2021 at 1:16 AM Jeff Law <jeffreya...@gmail.com> wrote: > > > > On 8/24/2021 3:44 AM, Hongtao Liu via Gcc-patches wrote: > > On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu <crazy...@gmail.com> wrote: > > On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu <crazy...@gmail.com> wrote: > > On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu <crazy...@gmail.com> wrote: > > On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford > <richard.sandif...@arm.com> wrote: > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > > On Fri, Aug 6, 2021 at 5:32 AM liuhongt <hongtao....@intel.com> wrote: > > Hi: > --- > OK, I think sth is amiss here upthread. insv/extv do look like they > are designed > to work on integer modes (but docs do not say anything about this here). > In fact the caller of extract_bit_field_using_extv is named > extract_integral_bit_field. Of course nothing seems to check what kind of > modes we're dealing with, but we're for example happily doing > expand_shift in 'mode'. In the extract_integral_bit_field call 'mode' is > some integer mode and op0 is HFmode? From the above I get it's > the other way around? In that case we should wrap the > call to extract_integral_bit_field, extracting in an integer mode with the > same size as 'mode' and then converting the result as (subreg:HF (reg:HI > ...)). > --- > This is a separate patch as a follow up of upper comments. > > gcc/ChangeLog: > > * expmed.c (extract_bit_field_1): Wrap the call to > extract_integral_bit_field, extracting in an integer mode with > the same size as 'tmode' and then converting the result > as (subreg:tmode (reg:imode)). > > gcc/testsuite/ChangeLog: > * gcc.target/i386/float16-5.c: New test. > --- > gcc/expmed.c | 19 +++++++++++++++++++ > gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++ > 2 files changed, 31 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c > > diff --git a/gcc/expmed.c b/gcc/expmed.c > index 3143f38e057..72790693ef0 100644 > --- a/gcc/expmed.c > +++ b/gcc/expmed.c > @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, > poly_uint64 bitnum, > op0_mode = opt_scalar_int_mode (); > } > > + /* Make sure we are playing with integral modes. Pun with subregs > + if we aren't. When tmode is HFmode, op0 is SImode, there will be ICE > + in extract_integral_bit_field. */ > + if (int_mode_for_mode (tmode).exists (&imode) > > check !INTEGRAL_MODE_P (tmode) before, that should be slightly > cheaper. Then imode should always be != tmode. Maybe > even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure > how it behaves for composite modes. > > Of course the least surprises would happen when we restrict this > to FLOAT_MODE_P (tmode). > > Richard - any preferences? > > If the bug is that extract_integral_bit_field is being called with > a non-integral mode parameter, then it looks odd that we can still > fall through to it without an integral mode (when exists is false). > > If calling extract_integral_bit_field without an integral mode is > a bug then I think we should have: > > int_mode_for_mode (mode).require () > > whenever mode is not already SCALAR_INT_MODE_P/is_a<scalar_int_mode>. > Ideally we'd make the mode parameter scalar_int_mode too. > > extract_integral_bit_field currently has: > > /* Find a correspondingly-sized integer field, so we can apply > shifts and masks to it. */ > scalar_int_mode int_mode; > if (!int_mode_for_mode (tmode).exists (&int_mode)) > /* If this fails, we should probably push op0 out to memory and then > do a load. */ > int_mode = int_mode_for_mode (mode).require (); > > which would seem to be redundant after this change. > > I'm not sure what exactly the bug is, but extract_integral_bit_field ends > up creating a lowpart subreg that's not allowed and that ICEs (and I > can't see a way to check beforehand). So it seems to me at least > part of that function doesn't expect non-integral extraction modes. > > But who knows - the code is older than I am (OK, not, but older than > my involvment in GCC ;)) > > How about attached patch w/ below changelog > > gcc/ChangeLog: > > * expmed.c (extract_bit_field_1): Make sure we're playing with > integral modes before call extract_integral_bit_field. > (extract_integral_bit_field): Add a parameter of type > scalar_int_mode which corresponds to of tmode. > And call extract_and_convert_fixed_bit_field instead of > extract_fixed_bit_field and convert_extracted_bit_field. > (extract_and_convert_fixed_bit_field): New function, it's a > combination of extract_fixed_bit_field and > convert_extracted_bit_field. > > gcc/testsuite/ChangeLog: > * gcc.target/i386/float16-5.c: New test. > > I'd like to ping this patch, or maybe we can use the patch before with > richi's comments. > > Rebased and ping^2, there are plenty of avx512fp16 patches blocked by > this patch, i'd like someone to help review this patch. > > Please ignore the former attached patch, should be the patch attached here. > > Richard. > > + && imode != tmode > + && imode != GET_MODE (op0)) > + { > + rtx ret = extract_integral_bit_field (op0, op0_mode, > + bitsize.to_constant (), > + bitnum.to_constant (), unsignedp, > + NULL, imode, imode, > + reverse, fallback_p); > + gcc_assert (ret); > + > + if (!REG_P (ret)) > + ret = force_reg (imode, ret); > + return gen_lowpart_SUBREG (tmode, ret); > + } > + > /* It's possible we'll need to handle other cases here for > polynomial bitnum and bitsize. */ > > Minor nit, but since the code is using to_constant, it should go after > this comment rather than before it. > > Thanks, > Richard > > diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c > b/gcc/testsuite/gcc.target/i386/float16-5.c > new file mode 100644 > index 00000000000..ebc0af1490b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/float16-5.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-msse2 -O2" } */ > +_Float16 > +foo (int a) > +{ > + union { > + int a; > + _Float16 b; > + }c; > + c.a = a; > + return c.b; > +} > -- > 2.27.0 > > > > -- > BR, > Hongtao > > > -- > BR, > Hongtao > > > -- > BR, > Hongtao > > > > 0001-Make-sure-we-re-playing-with-integral-modes-before-c.patch > > From 9c77ac15e69b567156a82debe45e3ced10df1110 Mon Sep 17 00:00:00 2001 > From: liuhongt <hongtao....@intel.com> > Date: Fri, 6 Aug 2021 10:18:43 +0800 > Subject: [PATCH] Make sure we're playing with integral modes before call > extract_integral_bit_field. > > gcc/ChangeLog: > > * expmed.c (extract_bit_field_1): Make sure we're playing with > integral modes before call extract_integral_bit_field. > (extract_integral_bit_field): Add a parameter of type > scalar_int_mode which corresponds to of tmode. > And call extract_and_convert_fixed_bit_field instead of > extract_fixed_bit_field and convert_extracted_bit_field. > (extract_and_convert_fixed_bit_field): New function, it's a > combination of extract_fixed_bit_field and > convert_extracted_bit_field. > > gcc/testsuite/ChangeLog: > * gcc.target/i386/float16-5.c: New test. > > I bet this is all getting triggered due to the introduction of HFmode. > Wrapping with a subreg to get an integral mode may work, but I'd be more > comfortable if we had other instances where we knew wrapping an SF/DF mode > with SI/DI was enough to make all this code safe. I fear we're just pushing > the bug down in one spot and it's going to pop up elsewhere. > > Another approach would be to force the object into memory, but I suspect > y'all don't want to do that ;-) > > So in the end, it may be reasonable, but I wouldn't be surprised if we trip > over more problems in this code with FP modes.
One thought I had is whether we can "fix" validate_subreg to have less "weird" allowed float-int special cases. As said upthread I think that we either should allow all of those, implying that subregs work semantically as if there's subregs to same-sized integer modes inbetween or disallow them all and make sure we're actually doing that explicitely. For example /* ??? Similarly, e.g. with (subreg:DF (reg:TI)). Though store_bit_field is the culprit here, and not the backends. */ else if (known_ge (osize, regsize) && known_ge (isize, osize)) ; I can't decipther rtl.text as to what the semantics of such a subreg is given the docs hand-wave about WORDS_BIG_ENDIAN vs. FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens when you mix those in a subreg. So maybe the above should have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN. But then the world would be much simpler if subregs of non-same size modes have explicit documentation for the mode kinds we have. Richard. > jeff >