subreg vs vec_select

2020-09-09 Thread Ilya Leoshkevich via Gcc
Hi!

I have a vector pseudo containing a single 128-bit value (V1TFmode) and
I need to access its last 64 bits (DFmode). Which of the two options
is better?

(subreg:DF (reg:V1TF) 8)

or

(vec_select:DF (subreg:V2DF (reg:V1TF) 0) (parallel [(const_int 1)]))

If I use the first one, I run into a problem with set_noop_p (): it
thinks that

(set (subreg:DF (reg:TF %f0) 8) (subreg:DF (reg:V1TF %f0) 8))

is a no-op, because it doesn't check the mode after stripping the
subreg:

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/rtlanal.c;h=5ae38b79#l1616

However this is not correct, because SET_DEST is the second register in
a register pair, and SET_SRC is half of a vector register that overlaps
the first register in the corresponding pair. So it looks as if mode
needs to be considered there.

This helps:

--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -1619,6 +1619,8 @@ set_noop_p (const_rtx set)
         return 0;
       src = SUBREG_REG (src);
       dst = SUBREG_REG (dst);
+      if (GET_MODE (src) != GET_MODE (dst))
+        return 0;
     }

but I'm not sure whether I'm not missing something about subreg
semantics in the first place.
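
To make the failure mode concrete, here is a minimal sketch in plain C++. It does not use GCC's real RTL structures: `Reg`, `Subreg` and both functions are hypothetical stand-ins that only mirror the comparison set_noop_p () performs after stripping the subregs, with and without the extra mode check.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for RTL objects; these are NOT GCC's real data structures.
struct Reg {
  std::string mode;  // inner mode, e.g. "TF" or "V1TF"
  int regno;         // hard register number
};

struct Subreg {
  std::string outer_mode;  // e.g. "DF"
  Reg inner;               // what SUBREG_REG would return
  int byte;                // what SUBREG_BYTE would return
};

// Mirrors the unpatched logic: strip both subregs, then compare the
// registers without looking at their modes.
inline bool noop_without_mode_check(const Subreg &dst, const Subreg &src) {
  if (dst.byte != src.byte) return false;
  return dst.inner.regno == src.inner.regno;
}

// Mirrors the proposed patch: additionally require equal inner modes.
inline bool noop_with_mode_check(const Subreg &dst, const Subreg &src) {
  if (dst.byte != src.byte) return false;
  if (dst.inner.mode != src.inner.mode) return false;
  return dst.inner.regno == src.inner.regno;
}
```

With `dst = {"DF", {"TF", 0}, 8}` and `src = {"DF", {"V1TF", 0}, 8}`, the first function reports a no-op and the second does not, which is the behaviour difference the patch is after.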

Best regards,
Ilya



Adding conditional move (movsicc) to MD

2020-09-09 Thread m

Hello!

I am trying to get movmodecc (movsicc) going for my MRISC32 machine 
description, but I am unable to get GCC to use my define_expand pattern.


I have tried different variants, but here is one example that I think 
should work:


    (define_expand "movsicc"
      [(set (match_operand:SI 0 "register_operand")
            (if_then_else:SI (match_operator 1 "comparison_operator"
                               [(match_operand:SI 2 "register_operand")
                                (match_operand:SI 3 "register_operand")])
                             (match_operand:SI 4 "register_operand")
                             (match_operand:SI 5 "register_operand")))]
      ""
    {
      if (!mrisc32_expand_conditional_move (operands))
        FAIL;
      DONE;
    })

As far as I can see, mrisc32_expand_conditional_move() is never called, 
even for simple code like:


    int cmov_ne(int a, int b, int c, int d) {
      return a != b ? c : d;
    }

Am I missing something? Do I need to explicitly enable conditional moves 
somewhere in the MD?


Regards,

  Marcus



Re: PowerPC long double Mangling

2020-09-09 Thread Jonathan Wakely via Gcc
Sorry for the slow reply to this.

On Fri, 7 Aug 2020 at 22:14, Michael Meissner  wrote:
>
> One issue with doing the transition is what mangling should be used with the
> new long double.
>
> At the moment, the current mangling is:
> long double   "g"
> __float128    "u9__ieee128"
> __ibm128      "g"
>
> Obviously this will have to change in the future.  It is unfortunate that we
> chose "g" to mean IBM extended double many many years ago, when it should have
> been used for IEEE 128-bit floating point.  But that is long ago, so I think we
> need to keep it.
>
> But assuming we want compatibility with libraries like glibc and libstdc++, I
> think we will have to continue to use "g" for __ibm128.
>
> With the long double change, I tend to view this as an ABI change.  But if the
> user doesn't use long double, they should be able to link without changes.
>
> I would propose using a new mangling for IEEE 128-bit long double.  I would
> prefer to get agreement on what the new mangling should be so we don't have an
> issue like we had in GCC 8.1 going to GCC 8.2, where we changed the mangling,
> and had to provide aliases for the old name.
>
> At the moment I think the mangling should be:
> long double   "g" if long double is IBM
> long double   "u12__ieee128_ld" if long double is IEEE
> __float128    "u9__ieee128"
> __ibm128      "g"

What's the benefit of having __float128 and IEEE long double be
distinct types? That complicates things for libraries like libstdc++.
If we want to support using "__float128" with C++ iostreams then we
need yet another set of I/O routines, even though it's identical to
one of the types we already handle. Why not just keep __float128 and
__ieee128 and "long double when long double is IEEE" as three
different aliases for the same type, so that C++ code like
_Z4funcu9__ieee128 works for all of them, instead of needing to also
define _Z4funcu12__ieee128_ld?

What about the "__ieee128" type, would that be mangled as
"u12__ieee128_ld" or "u9__ieee128"?

Currently it's the latter, i.e. __ieee128 is the same type as
__float128, which is the same type as "long double when
-mabi=ieeelongdouble". That seems useful to me, because it means that
I can always use either __ibm128 or __ieee128 to refer to "the type
that is sometimes known as 'long double'" irrespective of which -mabi
is actually used.

Today's mangling means I can declare a function in a header file as:

void func(long double);

and then in a library provide two definitions of that function, with
the right one being chosen by the linker depending what mangled name
it uses for "func(long double)" e.g. the definitions would be:

void func(__ibm128) { }
void func(__ieee128) { }

If __ieee128 is mangled to u9__ieee128, then if you compile the header
with -mabi=ieeelongdouble you can't link to the definition. To make it
work with the proposed mangling, the definitions would have to be:

void func(__ibm128) { }
void func(long double) { }

and then using -mabi=ieeelongdouble to compile the file would be
mandatory. That isn't terrible, but it seems inconsistent and
asymmetrical.

I'd also find it surprising if __ieee128 is not mangled to u9__ieee128
since that's its name :-) But that's far less important than the
practical matter of which types are mangled to the same and which
overloads are equivalent to each other.
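
The linking argument above can be sketched with a toy mangler. This is only an illustration: `vendor_type`, `mangle` and the two `long_double_code_*` functions are hypothetical helpers, not the compiler's mangler, and they cover just the one-parameter free-function case discussed here.

```cpp
#include <cassert>
#include <string>

// Which 128-bit format "long double" denotes (selected by -mabi).
enum class LongDoubleAbi { Ibm, Ieee };

// Hypothetical helper: a vendor-extended type is 'u' <length> <name>.
inline std::string vendor_type(const std::string &name) {
  return "u" + std::to_string(name.size()) + name;
}

// Current scheme as described above: IEEE long double is the same type
// as __ieee128, so it shares that type's code.
inline std::string long_double_code_current(LongDoubleAbi abi) {
  return abi == LongDoubleAbi::Ibm ? "g" : vendor_type("__ieee128");
}

// Proposed scheme: IEEE long double gets a code of its own.
inline std::string long_double_code_proposed(LongDoubleAbi abi) {
  return abi == LongDoubleAbi::Ibm ? "g" : vendor_type("__ieee128_ld");
}

// Mangle a one-parameter free function: _Z <length> <name> <param code>.
inline std::string mangle(const std::string &func,
                          const std::string &param_code) {
  return "_Z" + std::to_string(func.size()) + func + param_code;
}
```

In this model a library definition `func(__ieee128)` has the symbol `_Z4funcu9__ieee128`. A caller compiled against `func(long double)` in IEEE mode references the same symbol under the current scheme, but `_Z4funcu12__ieee128_ld` under the proposed one, so the two no longer meet at link time.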


JUMP_LABEL returns NULL for the just created jump instruction

2020-09-09 Thread Anton Youdkevitch
Folks,

I'm trying to deal with CFG construction at RTL level and I bumped into a
problem when I created a jump to a certain label. After the jump is created
I try to extract the label using JUMP_LABEL but I get nothing.

The code looks like this:

begin_sequence ();
rtx_code_label *lab = gen_label_rtx ();
rtx x = gen_rtx_GT (...);
x = gen_rtx_IF_THEN_ELSE (VOIDmode, x, gen_rtx_LABEL_REF (Pmode, lab), pc_rtx);
rtx_insn *j = emit_jump_insn (gen_rtx_SET (pc_rtx, x));
end_sequence ();

rtx lab1 = JUMP_LABEL (j);  // <--- I get NULL here

What am I missing?

-- 
  Thanks,
  Anton


Re: PowerPC long double Mangling

2020-09-09 Thread Segher Boessenkool
Hi!

On Wed, Sep 09, 2020 at 02:42:36PM +0100, Jonathan Wakely wrote:
> On Fri, 7 Aug 2020 at 22:14, Michael Meissner  wrote:
> > But assuming we want compatibility with libraries like glibc and libstdc++,
> > I think we will have to continue to use "g" for __ibm128.

Yes.

> > With the long double change, I tend to view this as an ABI change.

You can use both __ibm128 and __ieee128 in one program, so it isn't an
ABI change.  Only the default of what "long double" means changes.  And
we have been there before, there is the "e" mangling as well...

> > I would propose using a new mangling for IEEE 128-bit long double.  I would
> > prefer to get agreement on what the new mangling should be so we don't have
> > an issue like we had in GCC 8.1 going to GCC 8.2, where we changed the
> > mangling, and had to provide aliases for the old name.
> >
> > At the moment I think the mangling should be:
> > long double   "g" if long double is IBM
> > long double   "u12__ieee128_ld" if long double is IEEE

Why?  Why can't this be exactly the same as __ieee128?

> > __float128    "u9__ieee128"
> > __ibm128      "g"

(There is DF128_ as well btw, oh joy).

> What's the benefit of having __float128 and IEEE long double be
> distinct types? That complicates things for libraries like libstdc++.

Yeah.

> If we want to support using "__float128" with C++ iostreams then we
> need yet another set of I/O routines, even though it's identical to
> one of the types we already handle. Why not just keep __float128 and
> __ieee128 and "long double when long double is IEEE" as three
> different aliases for the same type, so that C++ code like
> _Z4funcu9__ieee128 works for all of them, instead of needing to also
> define _Z4funcu12__ieee128_ld?

Both __ieee128 and __float128 are in the implementation namespace, so
we can do whatever we want with it, indeed.

> What about the "__ieee128" type, would that be mangled as
> "u12__ieee128_ld" or "u9__ieee128"?

The latter.  Anything else would be an ABI change.

> Currently it's the latter, i.e. __ieee128 is the same type as
> __float128, which is the same type as "long double when
> -mabi=ieeelongdouble". That seems useful to me, because it means that
> I can always use either __ibm128 or __ieee128 to refer to "the type
> that is sometimes known as 'long double'" irrespective of which -mabi
> is actually used.

Yes.

> I'd also find it surprising if __ieee128 is not mangled to u9__ieee128
> since that's its name :-) But that's far less important than the
> practical matter of which types are mangled to the same and which
> overloads are equivalent to each other.

Mangling to u12__ieee128_ld means we have a type named __ieee128_ld.
We don't.


Segher


Re: irange best practices document

2020-09-09 Thread Martin Sebor via Gcc

On 9/3/20 1:14 AM, Aldy Hernandez via Gcc wrote:
Below is a document we have drafted to provide guidance on using
iranges and converting passes to them.  This will future-proof any such
passes so that they will work with the ranger, or any other mechanism
using multiple sub-ranges (as opposed to the more limited value_range).


The official document will live here, but is included below for discussion:

 https://gcc.gnu.org/wiki/irange-best-practices

Feel free to respond to this post with any questions or comments.


Thanks for the writeup!

The biggest question on my mind at the moment is probably about
how to go about rewriting classes that maintain their own ranges
(most often as an array of two offset_int) to use one of the Ranger
classes.  Two examples are the access_ref class in builtins.h and
the builtin_memref class in gimple-ssa-warn-restrict.c.

Both of these compute the size of an object (as a simple range)
and the offset into it (also as a range).  In the future, they will
keep track of the sizes of multiple objects (from PHI nodes).

My thinking is to do this in two steps: In 1) replace each
offset_int[2] member array with an instance of int_range<1> and
then rewrite all the code that manipulates the array with calls
to the ranger API. That should be fairly straightforward. In 2)
replace the simple int_range<1> with something more interesting
(int_range_max?) and rewrite the final code that processes
the results to scan all the subranges for overflow or overlap,
as well as the code that presents them in warnings/notes.  It
would be nice to have support for the formatting of ranges in
the pretty-printer to cut down on the repetitive tests that
determine how to format a constant (%E, vs a range [%E, %E],
vs a multi-range [%E, %E][%E, %E], ...[%E, %E]).
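
As a rough sketch of the formatting helper wished for above, the three cases (constant, single range, multi-range) can be folded into one function. The `SubRange` alias and `format_ranges` are hypothetical and work on plain integers rather than trees or the pretty-printer's %E directive.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy multi-range: sorted, non-overlapping inclusive [lo, hi] pairs.
// This is not GCC's irange.
using SubRange = std::pair<long, long>;

inline std::string format_ranges(const std::vector<SubRange> &ranges) {
  if (ranges.empty()) return "[]";
  // A single sub-range with equal bounds prints as a plain constant.
  if (ranges.size() == 1 && ranges[0].first == ranges[0].second)
    return std::to_string(ranges[0].first);
  // Otherwise print every sub-range as [lo, hi], back to back.
  std::string out;
  for (const SubRange &r : ranges)
    out += "[" + std::to_string(r.first) + ", " +
           std::to_string(r.second) + "]";
  return out;
}
```

Callers then pass whatever sub-ranges they have and no longer need to test for the constant or single-range case themselves.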

I suppose I/we should write this up as a case study after I/we
do the first rewrite.

Longer term, the size of an object (or objects) and an offset
into it/them seems like a generally useful property that would
be best associated with a pointer somehow, so that it could be
queried by an API similar to the one exposed by the irange class.
I imagine it would benefit both warnings and optimizations.

I do have one concern with a wholesale switch over to the ranger
classes and APIs.  Being designed with conservative assumptions
suitable for optimizers, the current APIs aren't necessarily
ideal for warnings.  A basic example is integer overflow and
wrapping.  In the general case, the optimizer must assume that
the range of the argument in 'malloc (n + 1)' might overflow (or
wrap).  A warning, though, should detect when it does and complain.
So I'd like to see a mode where the ranger evaluates expressions
in infinite precision.
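
A small sketch of the difference, assuming a 32-bit unsigned size type and using 64-bit arithmetic as a stand-in for "infinite precision". `Range32`, `WideRange`, `add_wide` and `add_wrapped` are hypothetical names, not ranger APIs.

```cpp
#include <cassert>
#include <cstdint>

// Toy range of 32-bit unsigned values, inclusive bounds. Not a GCC type.
struct Range32 {
  std::uint32_t lo, hi;
};

// Result of adding a constant with the bounds kept in a wider type, so
// values past UINT32_MAX stay visible instead of wrapping.
struct WideRange {
  std::uint64_t lo, hi;
  bool may_overflow;  // true if some value in the range exceeds UINT32_MAX
};

inline WideRange add_wide(Range32 r, std::uint32_t c) {
  assert(r.lo <= r.hi);
  WideRange w{std::uint64_t{r.lo} + c, std::uint64_t{r.hi} + c, false};
  w.may_overflow = w.hi > UINT32_MAX;
  return w;
}

// What 32-bit modular arithmetic yields for a single bound.
inline std::uint32_t add_wrapped(std::uint32_t v, std::uint32_t c) {
  return v + c;  // unsigned arithmetic wraps by definition
}
```

For n in [0, UINT32_MAX], `add_wide` reports that n + 1 can exceed the type, which is what a warning wants to know, while `add_wrapped (UINT32_MAX, 1)` silently yields 0.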

Martin



Aldy & Andrew


INTRODUCTION


irange is a class for storing and manipulating multi-ranges of
integers.  It is meant to be a replacement for value_range's, and can
currently inter-operate seamlessly with them.

An original value_range can contain a range of integers, say from 10 to
20 inclusive, represented as [10, 20].  It also has a way of
representing an anti-range, which is the inverse of a range.  For
example, everything except 10 to 20 is represented as the anti-range
~[10, 20].  This is really shorthand for the union
[MIN_INT, 9] U [21, MAX_INT].
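
The expansion of an anti-range into its two halves can be sketched as follows. This is a toy over plain `int`; `SubRange` and `expand_anti_range` are hypothetical names and not part of the irange API.

```cpp
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

using SubRange = std::pair<int, int>;  // inclusive [lo, hi]

// Expand the anti-range ~[lo, hi] over int into explicit sub-ranges.
inline std::vector<SubRange> expand_anti_range(int lo, int hi) {
  assert(lo <= hi);
  std::vector<SubRange> out;
  // Everything below lo, unless lo is already the smallest int.
  if (lo > INT_MIN) out.push_back({INT_MIN, lo - 1});
  // Everything above hi, unless hi is already the largest int.
  if (hi < INT_MAX) out.push_back({hi + 1, INT_MAX});
  return out;
}
```

`expand_anti_range (10, 20)` yields the two sub-ranges [INT_MIN, 9] and [21, INT_MAX]; when one bound sits at the edge of the type, only one sub-range remains.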

The value_range representation has the limitation that higher
granularity is not representable without losing precision.  For
example, you cannot specify the range [10, 15] U [20, 30]; instead,
it is represented imprecisely with a range of [10, 30].

Multi-ranges with the irange class, on the other hand, can represent
an arbitrary number of sub-ranges.  More formally, multi-ranges have
0 or more non-intersecting sub-ranges with integral bounds.  For
example, you can specify a range containing the numbers in [10, 15]
and [20, 30] with an irange of [10, 15][20, 30].  With irange, instead
of using the anti-range ~[10, 20], the underlying sub-ranges can
be represented accurately with [MIN_INT, 9][21, MAX_INT].
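
The precision difference can be illustrated with a toy container that holds at most N sub-ranges. `ToyRange` is a hypothetical class only loosely modelled on the idea of a sub-range capacity; in particular, the policy of folding the last sub-range into its predecessor when the capacity is exceeded is an assumption made for this sketch, not a description of how irange behaves.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using SubRange = std::pair<int, int>;  // inclusive [lo, hi]

// Toy multi-range holding at most N sub-ranges.
template <std::size_t N>
class ToyRange {
  static_assert(N >= 1, "need room for at least one sub-range");
  std::vector<SubRange> subs_;  // sorted and non-overlapping

 public:
  // Union with [lo, hi], then shrink back to at most N sub-ranges.
  void union_with(int lo, int hi) {
    assert(lo <= hi);
    subs_.push_back({lo, hi});
    std::sort(subs_.begin(), subs_.end());

    // Merge overlapping neighbours.
    std::vector<SubRange> merged;
    for (const SubRange &r : subs_) {
      if (!merged.empty() && r.first <= merged.back().second)
        merged.back().second = std::max(merged.back().second, r.second);
      else
        merged.push_back(r);
    }

    // Over capacity: fold the last sub-range into its predecessor.
    // This is where precision is lost.
    while (merged.size() > N) {
      SubRange last = merged.back();
      merged.pop_back();
      merged.back().second = last.second;
    }
    subs_ = merged;
  }

  const std::vector<SubRange> &subs() const { return subs_; }
};
```

Adding [10, 15] and [20, 30] to a `ToyRange<2>` keeps both sub-ranges, while a `ToyRange<1>` ends up with the single range [10, 30], which also admits 16 through 19.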

Multi-ranges are not limited to 1 or 2 sub-ranges.  Instead, you can
specify any number of sub-ranges.  For example:

  int_range<5> five_pairs;
  int_range<2> two_pairs;
  int_range<1> legacy_value_range;
  widest_irange huge_irange;  // currently 255 sub-ranges.

The special case of int_range<1> provides legacy support for 
value_range.  Currently, value_range is just a typedef for int_range<1>.


The specially named widest_irange is used for "unlimited" sub-ranges,
and is meant to be used when calculating intermediate results.
Currently its limit is a large number (255), but this could be changed
without prior notice.  Note that "widest" does not have anything to do
with the range of the underlying integers, but with the maximum number
of sub-range pairs available for calculation.

Here are some examples of calculations with different sub-range
granularity:

 // Assume:
 tree n10 = build_int_cst (integer_type

Re: PowerPC long double Mangling

2020-09-09 Thread Thomas Koenig via Gcc

Am 09.09.20 um 17:36 schrieb Segher Boessenkool:

> You can use both __ibm128 and __ieee128 in one program, so it isn't an
> ABI change.  Only the default of what "long double" means changes.  And
> we have been there before, there is the "e" mangling as well...


For Fortran, it is an ABI change unless we define an additional KIND
number for __ieee128.

Best regards

Thomas


Re: PowerPC long double Mangling

2020-09-09 Thread Segher Boessenkool
On Wed, Sep 09, 2020 at 07:06:41PM +0200, Thomas Koenig wrote:
> Am 09.09.20 um 17:36 schrieb Segher Boessenkool:
> >You can use both __ibm128 and __ieee128 in one program, so it isn't an
> >ABI change.  Only the default of what "long double" means changes.  And
> >we have been there before, there is the "e" mangling as well...
> 
> For Fortran, it is an ABI change unless we define an additional KIND
> number for __ieee128.

Yes, Fortran has existing problems here (now *already*).


Segher


Re: PowerPC long double Mangling

2020-09-09 Thread Jakub Jelinek via Gcc
On Wed, Sep 09, 2020 at 12:32:22PM -0500, Segher Boessenkool wrote:
> On Wed, Sep 09, 2020 at 07:06:41PM +0200, Thomas Koenig wrote:
> > Am 09.09.20 um 17:36 schrieb Segher Boessenkool:
> > >You can use both __ibm128 and __ieee128 in one program, so it isn't an
> > >ABI change.  Only the default of what "long double" means changes.  And
> > >we have been there before, there is the "e" mangling as well...
> > 
> > For Fortran, it is an ABI change unless we define an additional KIND
> > number for __ieee128.
> 
> Yes, Fortran has existing problems here (now *already*).

Well, the Fortran kind case is the same thing as the change of the meaning
of long double from __ibm128 to __ieee128.
Neither C nor Fortran has special mangling for that, so for those languages
it is a real ABI change, for C++ it is an ABI change too, but one that can
be dealt with for selected libraries through mangling and compiling stuff that
refers to long double twice (e.g. libstdc++).  For glibc I guess it can be
dealt with using asm redirects of the math functions.

Jakub



Re: PowerPC long double Mangling

2020-09-09 Thread Segher Boessenkool
On Wed, Sep 09, 2020 at 07:41:02PM +0200, Jakub Jelinek wrote:
> On Wed, Sep 09, 2020 at 12:32:22PM -0500, Segher Boessenkool wrote:
> > On Wed, Sep 09, 2020 at 07:06:41PM +0200, Thomas Koenig wrote:
> > > Am 09.09.20 um 17:36 schrieb Segher Boessenkool:
> > > >You can use both __ibm128 and __ieee128 in one program, so it isn't an
> > > >ABI change.  Only the default of what "long double" means changes.  And
> > > >we have been there before, there is the "e" mangling as well...
> > > 
> > > For Fortran, it is an ABI change unless we define an additional KIND
> > > number for __ieee128.
> > 
> > Yes, Fortran has existing problems here (now *already*).
> 
> Well, the Fortran kind case is the same thing as the change of the meaning
> of long double from __ibm128 to __ieee128.
> Neither C nor Fortran has special mangling for that, so for those languages
> it is a real ABI change, for C++ it is an ABI change too, but one that can
> be dealt for selected libraries through mangling and compiling stuff that
> refers to long double twice (e.g. libstdc++).  For glibc I guess it can be
> dealt with using asm redirects of the math functions.

My point is that you already have *both* __ibm128 and __ieee128, and you
can have them in one source file even, and that just works.  In C.

Which of those some configuration uses by default matters a lot for libs
that use long double of course, but there is an ELF attribute for that,
so problems are not hard to spot usually.

But for Fortran this still does not work at all.


Segher


Re: subreg vs vec_select

2020-09-09 Thread Segher Boessenkool
Hi Ilya,

On Wed, Sep 09, 2020 at 11:50:56AM +0200, Ilya Leoshkevich via Gcc wrote:
> I have a vector pseudo containing a single 128-bit value (V1TFmode) and
> I need to access its last 64 bits (DFmode). Which of the two options
> is better?
> 
> (subreg:DF (reg:V1TF) 8)
> 
> or
> 
> (vec_select:DF (subreg:V2DF (reg:V1TF) 0) (parallel [(const_int 1)]))
> 
> If I use the first one, I run into a problem with set_noop_p (): it
> thinks that
> 
> (set (subreg:DF (reg:TF %f0) 8) (subreg:DF (reg:V1TF %f0) 8))
> 
> is a no-op, because it doesn't check the mode after stripping the
> subreg:
> 
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/rtlanal.c;h=5ae38b79#l1616
> 
> However this is not correct, because SET_DEST is the second register in
> a register pair, and SET_SRC is half of a vector register that overlaps
> the first register in the corresponding pair. So it looks as if mode
> needs to be considered there.

Yes.

> This helps:
> 
> --- a/gcc/rtlanal.c
> +++ b/gcc/rtlanal.c
> @@ -1619,6 +1619,8 @@ set_noop_p (const_rtx set)
>          return 0;
>        src = SUBREG_REG (src);
>        dst = SUBREG_REG (dst);
> +      if (GET_MODE (src) != GET_MODE (dst))
> +        return 0;
>      }
> 
> but I'm not sure whether I'm not missing something about subreg
> semantics in the first place.

You probably should just see if both modes are the same number of hard
registers?  HARD_REGNO_NREGS.
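
A sketch of that alternative check, again as a toy rather than GCC code. The register counts in `toy_nregs` are assumptions taken from the description earlier in the thread (TF occupying a register pair, V1TF a single vector register); a real port would ask the target instead of using a table.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-mode hard register counts for one register number.
inline int toy_nregs(const std::string &mode) {
  static const std::map<std::string, int> table = {
      {"DF", 1}, {"TF", 2}, {"V1TF", 1}, {"V2DF", 1}};
  return table.at(mode);  // throws std::out_of_range for unknown modes
}

// After stripping the subregs, keep treating the set as a possible
// no-op only if both inner modes span the same number of registers.
inline bool inner_modes_compatible(const std::string &dst_mode,
                                   const std::string &src_mode) {
  return toy_nregs(dst_mode) == toy_nregs(src_mode);
}
```

Unlike a strict mode comparison, this still accepts pairs such as V1TF and V2DF that differ in mode but cover the same registers, while rejecting the TF versus V1TF case from the original report.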


Segher


Re: A problem with one instruction multiple latencies and pipelines

2020-09-09 Thread Segher Boessenkool
Hi!

On Mon, Sep 07, 2020 at 09:20:59PM +0100, Richard Sandiford wrote:
> This is just personal opinion, but in general (from the point of view
> of a new port, or a new subport like SVE), I think the best approach
> to handling the "type" attribute is to start with the coarsest
> classification that makes sense, then split these relatively coarse
> types up whenever there's a specific need.

Agreed.

> When taking that approach, it's OK (and perhaps even a good sign)
> for an existing type to sometimes be too coarse for a new CPU.
> 
> So thanks for asking about this, and please don't feel constrained
> by the existing "type" classification.  IMO we should split existing
> types wherever that makes sense for new CPUs.

You can also use some other attributes to classify instructions, you
don't have to put it all in one "type" attribute.  This can of course be
done later, at a time when it is clearer what a good design will be.
Sometimes it is obvious from the start though :-)

(This primarily makes the pipeline descriptions much simpler, but also
custom scheduling code and the like.  If one core has two types of "A"
insn, say "Aa" and "Ab", it isn't nice if all other cores now have to
handle both "Aa" and "Ab" instead of just "A").
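
The point can be sketched outside of .md syntax with a toy model. `Insn`, the `subtype` attribute and all latency numbers below are invented for illustration; they do not describe any real core.

```cpp
#include <cassert>
#include <string>

// Toy instruction description: a coarse "type" shared by all cores plus
// a second attribute that only one core cares about.
struct Insn {
  std::string type;     // e.g. "A"
  std::string subtype;  // e.g. "a" or "b"; hypothetical extra attribute
};

// Core X distinguishes the two variants of "A".
inline int latency_core_x(const Insn &i) {
  if (i.type == "A") return i.subtype == "b" ? 4 : 2;
  return 1;
}

// Core Y keeps matching on the coarse type alone and ignores subtype.
inline int latency_core_y(const Insn &i) {
  return i.type == "A" ? 3 : 1;
}
```

Had "A" been split into two types instead, `latency_core_y` and every other core's description would have to list both of them.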


Segher


RE: A problem with one instruction multiple latencies and pipelines

2020-09-09 Thread Qian, Jianhua
Hi Segher

> You can also use some other attributes to classify instructions, you don't
> have to put it all in one "type" attribute.  This can of course be done
> later, at a time when it is clearer what a good design will be.
> Sometimes it is obvious from the start though :-)

Thank you for your advice. It is also a good idea.
Considering other cores (existing and future),
I think it is better to keep the type attribute unchanged
and add other attributes to classify instructions.

Regards
Qian

> -----Original Message-----
> From: Segher Boessenkool 
> Sent: Thursday, September 10, 2020 5:23 AM
> To: Qian, Jianhua/钱 建华 ; gcc@gcc.gnu.org;
> richard.sandif...@arm.com
> Subject: Re: A problem with one instruction multiple latencies and pipelines
> 
> Hi!
> 
> On Mon, Sep 07, 2020 at 09:20:59PM +0100, Richard Sandiford wrote:
> > This is just personal opinion, but in general (from the point of view
> > of a new port, or a new subport like SVE), I think the best approach
> > to handling the "type" attribute is to start with the coarsest
> > classification that makes sense, then split these relatively coarse
> > types up whenever there's a specific need.
> 
> Agreed.
> 
> > When taking that approach, it's OK (and perhaps even a good sign) for
> > an existing type to sometimes be too coarse for a new CPU.
> >
> > So thanks for asking about this, and please don't feel constrained by
> > the existing "type" classification.  IMO we should split existing
> > types wherever that makes sense for new CPUs.
> 
> You can also use some other attributes to classify instructions, you don't
> have to put it all in one "type" attribute.  This can of course be done
> later, at a time when it is clearer what a good design will be.
> Sometimes it is obvious from the start though :-)
> 
> (This primarily makes the pipeline descriptions much simpler, but also custom
> scheduling code and the like.  If one core has two types of "A"
> insn, say "Aa" and "Ab", it isn't nice if all other cores now have to handle
> both "Aa" and "Ab" instead of just "A").
> 
> 
> Segher
>