subreg vs vec_select
Hi!

I have a vector pseudo containing a single 128-bit value (V1TFmode) and I need to access its last 64 bits (DFmode). Which of the two options is better?

    (subreg:DF (reg:V1TF) 8)

or

    (vec_select:DF (subreg:V2DF (reg:V1TF) 0) (parallel [(const_int 1)]))

If I use the first one, I run into a problem with set_noop_p (): it thinks that

    (set (subreg:DF (reg:TF %f0) 8) (subreg:DF (reg:V1TF %f0) 8))

is a no-op, because it doesn't check the mode after stripping the subreg:

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/rtlanal.c;h=5ae38b79#l1616

However this is not correct, because SET_DEST is the second register in a register pair, and SET_SRC is half of a vector register that overlaps the first register in the corresponding pair. So it looks as if mode needs to be considered there.

This helps:

--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -1619,6 +1619,8 @@ set_noop_p (const_rtx set)
        return 0;
       src = SUBREG_REG (src);
       dst = SUBREG_REG (dst);
+      if (GET_MODE (src) != GET_MODE (dst))
+       return 0;
     }

but I'm not sure whether I'm not missing something about subreg semantics in the first place.

Best regards,
Ilya
Adding conditional move (movsicc) to MD
Hello!

I am trying to get movmodecc (movsicc) going for my MRISC32 machine description, but I am unable to get GCC to use my define_expand pattern. I have tried different variants, but here is one example that I think should work:

    (define_expand "movsicc"
      [(set (match_operand:SI 0 "register_operand")
            (if_then_else:SI
              (match_operator 1 "comparison_operator"
                [(match_operand:SI 2 "register_operand")
                 (match_operand:SI 3 "register_operand")])
              (match_operand:SI 4 "register_operand")
              (match_operand:SI 5 "register_operand")))]
      ""
    {
      if (!mrisc32_expand_conditional_move (operands))
        FAIL;
      DONE;
    })

As far as I can see, mrisc32_expand_conditional_move() is never called, even for simple code like:

    int cmov_ne(int a, int b, int c, int d) {
      return a != b ? c : d;
    }

Am I missing something? Do I need to explicitly enable conditional moves somewhere in the MD?

Regards,
Marcus
Re: PowerPC long double Mangling
Sorry for the slow reply to this.

On Fri, 7 Aug 2020 at 22:14, Michael Meissner wrote:
>
> One issue with doing the transition is what mangling should be used with the
> new long double.
>
> At the moment, the current mangling is:
>     long double     "g"
>     __float128      "u9__ieee128"
>     __ibm128        "g"
>
> Obviously this will have to change in the future. It is unfortunate that we
> choose "g" to mean IBM extended double many many years ago, when it should have
> been used for IEEE 128-bit floating point. But that is long ago, so I think we
> need to keep it.
>
> But assuming we want compatibility with libraries like glibc and libstdc++, I
> think we will have to continue to use "g" for __ibm128.
>
> With the long double change, I tend to view this as an ABI change. But if the
> user doesn't use long double, they should be able to link without changes.
>
> I would propose using a new mangling for IEEE 128-bit long double. I would
> prefer to get agreement on what the new mangling should be so we don't have an
> issue like we had in GCC 8.1 going to GCC 8.2, where we changed the mangling,
> and had to provide aliases for the old name.
>
> At the moment I think the mangling should be:
>     long double     "g"               if long double is IBM
>     long double     "u12_ieee128_ld"  if long double is IEEE
>     __float128      "u9__ieee128"
>     __ibm128        "g"

What's the benefit of having __float128 and IEEE long double be distinct types? That complicates things for libraries like libstdc++.

If we want to support using "__float128" with C++ iostreams then we need yet another set of I/O routines, even though it's identical to one of the types we already handle. Why not just keep __float128 and __ieee128 and "long double when long double is IEEE" as three different aliases for the same type, so that C++ code like _Z4funcu9__ieee128 works for all of them, instead of needing to also define _Z4funcu12__ieee128_ld?

What about the "__ieee128" type, would that be mangled as "u12_ieee128_ld" or "u9__ieee128"?

Currently it's the latter, i.e. __ieee128 is the same type as __float128, which is the same type as "long double when -mabi=ieeelongdouble". That seems useful to me, because it means that I can always use either __ibm128 or __ieee128 to refer to "the type that is sometimes known as 'long double'" irrespective of which -mabi is actually used.

Today's mangling means I can declare a function in a header file as:

    void func(long double);

and then in a library provide two definitions of that function, with the right one being chosen by the linker depending what mangled name it uses for "func(long double)" e.g. the definitions would be:

    void func(__ibm128) { }
    void func(__ieee128) { }

If __ieee128 is mangled to u9__ieee128, then if you compile the header with -mabi=ieeelongdouble you can't link to the definition. To make it work with the proposed mangling, the definitions would have to be:

    void func(__ibm128) { }
    void func(long double) { }

and then using -mabi=ieeelongdouble to compile the file would be mandatory. That isn't terrible, but it seems inconsistent and asymmetrical.

I'd also find it surprising if __ieee128 is not mangled to u9__ieee128 since that's its name :-) But that's far less important than the practical matter of which types are mangled to the same and which overloads are equivalent to each other.
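To make the mangled names in this message easier to read, here is a minimal sketch of how they are put together under the Itanium C++ ABI scheme: a vendor extended type is 'u' followed by the identifier's length and the identifier itself, and a plain function is "_Z", the length-prefixed name, then the parameter types. The helper names (source_name, vendor_type, mangle_unary_function) are made up for illustration and are not GCC functions; the sketch only covers a non-member function with one parameter.

```cpp
#include <cassert>
#include <string>

// Length-prefixed identifier, e.g. "func" -> "4func".
std::string source_name(const std::string& id) {
    assert(!id.empty());  // an empty identifier cannot be mangled
    return std::to_string(id.size()) + id;
}

// Vendor extended type: 'u' followed by the length-prefixed identifier,
// e.g. "__ieee128" -> "u9__ieee128".
std::string vendor_type(const std::string& id) {
    return "u" + source_name(id);
}

// Hypothetical helper: mangle a non-member, non-template function taking a
// single parameter whose type is already encoded (e.g. "g" or "u9__ieee128").
std::string mangle_unary_function(const std::string& name,
                                  const std::string& encoded_param) {
    return "_Z" + source_name(name) + encoded_param;
}
```

With these helpers, mangle_unary_function("func", vendor_type("__ieee128")) gives the _Z4funcu9__ieee128 symbol mentioned above, and the "12" in u12__ieee128_ld is just the length of the identifier __ieee128_ld.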
JUMP_LABEL returns NULL for the just created jump instruction
Folks,

I'm trying to deal with CFG construction at RTL level and I bumped into a problem when I created a jump to a certain label. After the jump is created I try to extract the label using JUMP_LABEL but I get nothing. The code looks like this:

    begin_sequence ();
    code_label lab = gen_label_rtx ();
    rtx x = gen_rtx_GT (...);
    x = gen_rtx_IF_THEN_ELSE (VOIDmode, x, gen_rtx_LABEL_REF (Pmode, lab));
    rtx_insn *j = emit_jump_insn (gen_rtx_SET (x));
    end_sequence ();
    rtx lab1 = JUMP_LABEL (j); // <--- I get NULL here

What am I missing?

--
Thanks,
Anton
Re: PowerPC long double Mangling
Hi!

On Wed, Sep 09, 2020 at 02:42:36PM +0100, Jonathan Wakely wrote:
> On Fri, 7 Aug 2020 at 22:14, Michael Meissner wrote:
> > But assuming we want compatibility with libraries like glibc and libstdc++, I
> > think we will have to continue to use "g" for __ibm128.

Yes.

> > With the long double change, I tend to view this as an ABI change.

You can use both __ibm128 and __ieee128 in one program, so it isn't an ABI change. Only the default of what "long double" means changes. And we have been there before, there is the "e" mangling as well...

> > I would propose using a new mangling for IEEE 128-bit long double. I would
> > prefer to get agreement on what the new mangling should be so we don't have an
> > issue like we had in GCC 8.1 going to GCC 8.2, where we changed the mangling,
> > and had to provide aliases for the old name.
> >
> > At the moment I think the mangling should be:
> >     long double     "g"               if long double is IBM
> >     long double     "u12_ieee128_ld"  if long double is IEEE

Why? Why can't this be exactly the same as __ieee128?

> >     __float128      "u9__ieee128"
> >     __ibm128        "g"

(There is DF128_ as well btw, oh joy).

> What's the benefit of having __float128 and IEEE long double be
> distinct types? That complicates things for libraries like libstdc++.

Yeah.

> If we want to support using "__float128" with C++ iostreams then we
> need yet another set of I/O routines, even though it's identical to
> one of the types we already handle. Why not just keep __float128 and
> __ieee128 and "long double when long double is IEEE" as three
> different aliases for the same type, so that C++ code like
> _Z4funcu9__ieee128 works for all of them, instead of needing to also
> define _Z4funcu12__ieee128_ld?

Both __ieee128 and __float128 are in the implementation namespace, so we can do whatever we want with it, indeed.

> What about the "__ieee128" type, would that be mangled as
> "u12_ieee128_ld" or "u9__ieee128"?

The latter. Anything else would be an ABI change.

> Currently it's the latter, i.e. __ieee128 is the same type as
> __float128, which is the same type as "long double when
> -mabi=ieeelongdouble". That seems useful to me, because it means that
> I can always use either __ibm128 or __ieee128 to refer to "the type
> that is sometimes known as 'long double'" irrespective of which -mabi
> is actually used.

Yes.

> I'd also find it surprising if __ieee128 is not mangled to u9__ieee128
> since that's its name :-) But that's far less important than the
> practical matter of which types are mangled to the same and which
> overloads are equivalent to each other.

Mangling to u12__ieee128_ld means we have a type named __ieee128_ld. We don't.

Segher
Re: irange best practices document
On 9/3/20 1:14 AM, Aldy Hernandez via Gcc wrote:
> Below is a document we have drafted to provide guidance on using irange's and
> converting passes to it. This will future proof any such passes so that they
> will work with the ranger, or any other mechanism using multiple sub-ranges
> (as opposed to the more limited value_range).
>
> The official document will live here, but is included below for discussion:
>
> https://gcc.gnu.org/wiki/irange-best-practices
>
> Feel free to respond to this post with any questions or comments.

Thanks for the writeup! The biggest question on my mind at the moment is probably about how to go about rewriting classes that maintain their own ranges (most often as an array of two offset_int) to use one of the Ranger classes.

Two examples are the access_ref class in builtins.h and the builtin_memref class in gimple-ssa-warn-restrict.c. Both of these compute the size of an object (as a simple range) and the offset into it (also as a range). In the future, they will track the sizes of multiple objects (from PHI nodes).

My thinking is to do this in two steps:

In 1) replace each offset_int[2] member array with an instance of int_range<1> and then rewrite all the code that manipulates the array with calls to the ranger API. That should be fairly straightforward.

In 2) replace the simple int_range<1> with something more interesting (int_range_max?) and rewrite the final code that processes the results to scan all the subranges for overflow or overlap, as well as the code that presents them in warnings/notes. It would be nice to have support for the formatting of ranges in the pretty-printer to cut down on the repetitive tests that determine how to format a constant (%E, vs a range [%E, %E], vs a multi-range [%E, %E][%E, %E], ...[%E, %E]).

I suppose I/we should write this up as a case study after I/we do the first rewrite.
Longer term, the size of an object (or objects) and an offset into it/them seems like a generally useful property that would be best associated with a pointer somehow, so that it could be queried by an API similar to the one exposed by the irange class. I imagine it would benefit both warnings and optimizations.

I do have one concern with a wholesale switch over to the ranger classes and APIs. Being designed with conservative assumptions suitable for optimizers, the current APIs aren't necessarily ideal for warnings. A basic example is integer overflow and wrapping. In the general case, the optimizer must assume that the range of the argument in 'malloc (n + 1)' might overflow (or wrap). A warning, though, should detect when it does and complain. So I'd like to see a mode where the ranger evaluates expressions in infinite precision.

Martin

> Aldy & Andrew
>
> INTRODUCTION
>
> irange is a class for storing and manipulating multi-ranges of integers. It is
> meant to be a replacement for value_range's, and can currently inter-operate
> seamlessly with them.
>
> Original value_range's can contain a range of integers, say from 10 to 20
> inclusive, represented as [10, 20]. It also has a way of representing an
> anti-range, which is the inverse of a range. For example, everything except 10
> to 20 is represented as an anti-range of ~[10, 20]. This is really shorthand
> for the union of [MIN_INT, 9] U [21, MAX_INT]. The value_range representation
> has the limitation that higher granularity is not representable without losing
> precision. For example, you cannot specify the range of [10, 15] U [20, 30],
> instead it is represented ambiguously with a range of [10, 30].
>
> On the other hand, multi-ranges with the irange class can represent an
> arbitrary number of sub-ranges. More formally, multi-ranges have 0 or more
> non-intersecting sub-ranges with integral bounds. For example, you can specify
> a range containing the numbers of [10, 15] and [20, 30] with an irange of
> [10, 15][20, 30].
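The difference between an anti-range, a collapsed single range, and a multi-range can be sketched with a toy model. This is deliberately NOT the GCC irange class: MultiRange, invert and collapse are hypothetical names, the bounds are plain int rather than trees, and the input is assumed to be sorted and non-overlapping.

```cpp
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

// Toy stand-in for a multi-range: sorted, non-overlapping, inclusive pairs.
using SubRange = std::pair<int, int>;
using MultiRange = std::vector<SubRange>;

// Complement over [INT_MIN, INT_MAX]; this is what an anti-range such as
// ~[10, 20] expands to when sub-ranges are available.
MultiRange invert(const MultiRange& r) {
    MultiRange out;
    long long next = INT_MIN;  // lowest value not yet covered by r
    for (const SubRange& s : r) {
        assert(s.first <= s.second);  // each pair must be well formed
        assert(next <= s.first);      // input must be sorted and disjoint
        if (next < s.first)
            out.push_back({static_cast<int>(next), s.first - 1});
        next = static_cast<long long>(s.second) + 1;
    }
    if (next <= INT_MAX)
        out.push_back({static_cast<int>(next), INT_MAX});
    return out;
}

// What a representation limited to one pair has to do with several
// sub-ranges: keep only the outer bounds and lose the gap in between.
SubRange collapse(const MultiRange& r) {
    assert(!r.empty());
    return {r.front().first, r.back().second};
}
```

In this model, inverting [10, 20] yields the two sub-ranges [INT_MIN, 9][21, INT_MAX], while collapsing [10, 15][20, 30] yields [10, 30], which also admits 16 through 19.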
> With irange, instead of using anti-ranges for ~[10, 20] the underlying number
> of ranges can be represented accurately with [MIN_INT, 9][21, MAX_INT].
>
> Multi-ranges are not limited to 1 or 2 sub-ranges. Instead, you can specify
> any number of sub-ranges. For example:
>
>     int_range<5> five_pairs;
>     int_range<2> two_pairs;
>     int_range<1> legacy_value_range;
>     widest_irange huge_irange; // currently 255 sub-ranges.
>
> The special case of int_range<1> provides legacy support for value_range.
> Currently, value_range is just a typedef for int_range<1>.
>
> The specially named widest_irange is used for "unlimited" sub-ranges, and is
> meant to be used when calculating intermediate results. Currently it is a
> large number (255), but could be changed without prior notice.
>
> Note that "widest" does not have anything to do with the range of the
> underlying integers, but the maximum number of sub-range pairs available for
> calculation.
>
> Here are some examples of calculations with different sub-range granularity:
>
> // Assume: tree n10 = build_int_cst (integer_type
Re: PowerPC long double Mangling
On 09.09.20 at 17:36, Segher Boessenkool wrote:
> You can use both __ibm128 and __ieee128 in one program, so it isn't an
> ABI change. Only the default of what "long double" means changes. And
> we have been there before, there is the "e" mangling as well...

For Fortran, it is an ABI change unless we define an additional KIND number for __ieee128.

Best regards
Thomas
Re: PowerPC long double Mangling
On Wed, Sep 09, 2020 at 07:06:41PM +0200, Thomas Koenig wrote:
> On 09.09.20 at 17:36, Segher Boessenkool wrote:
> > You can use both __ibm128 and __ieee128 in one program, so it isn't an
> > ABI change. Only the default of what "long double" means changes. And
> > we have been there before, there is the "e" mangling as well...
>
> For Fortran, it is an ABI change unless we define an additional KIND
> number for __ieee128.

Yes, Fortran has existing problems here (now *already*).

Segher
Re: PowerPC long double Mangling
On Wed, Sep 09, 2020 at 12:32:22PM -0500, Segher Boessenkool wrote:
> On Wed, Sep 09, 2020 at 07:06:41PM +0200, Thomas Koenig wrote:
> > On 09.09.20 at 17:36, Segher Boessenkool wrote:
> > > You can use both __ibm128 and __ieee128 in one program, so it isn't an
> > > ABI change. Only the default of what "long double" means changes. And
> > > we have been there before, there is the "e" mangling as well...
> >
> > For Fortran, it is an ABI change unless we define an additional KIND
> > number for __ieee128.
>
> Yes, Fortran has existing problems here (now *already*).

Well, the Fortran kind case is the same thing as the change of the meaning of long double from __ibm128 to __ieee128. Neither C nor Fortran has special mangling for that, so for those languages it is a real ABI change. For C++ it is an ABI change too, but one that can be dealt with for selected libraries through mangling and compiling stuff that refers to long double twice (e.g. libstdc++). For glibc I guess it can be dealt with using asm redirects of the math functions.

Jakub
Re: PowerPC long double Mangling
On Wed, Sep 09, 2020 at 07:41:02PM +0200, Jakub Jelinek wrote:
> On Wed, Sep 09, 2020 at 12:32:22PM -0500, Segher Boessenkool wrote:
> > On Wed, Sep 09, 2020 at 07:06:41PM +0200, Thomas Koenig wrote:
> > > On 09.09.20 at 17:36, Segher Boessenkool wrote:
> > > > You can use both __ibm128 and __ieee128 in one program, so it isn't an
> > > > ABI change. Only the default of what "long double" means changes. And
> > > > we have been there before, there is the "e" mangling as well...
> > >
> > > For Fortran, it is an ABI change unless we define an additional KIND
> > > number for __ieee128.
> >
> > Yes, Fortran has existing problems here (now *already*).
>
> Well, the Fortran kind case is the same thing as the change of the meaning
> of long double from __ibm128 to __ieee128.
> Neither C nor Fortran has special mangling for that, so for those languages
> it is a real ABI change, for C++ it is an ABI change too, but one that can
> be dealt for selected libraries through mangling and compiling stuff that
> refers to long double twice (e.g. libstdc++). For glibc I guess it can be
> dealt with using asm redirects of the math functions.

My point is that you already have *both* __ibm128 and __ieee128, and you can have them in one source file even, and that just works. In C. Which of those some configuration uses by default matters a lot for libs that use long double of course, but there is an ELF attribute for that, so problems are not hard to spot usually.

But for Fortran this still does not work at all.

Segher
Re: subreg vs vec_select
Hi Ilya,

On Wed, Sep 09, 2020 at 11:50:56AM +0200, Ilya Leoshkevich via Gcc wrote:
> I have a vector pseudo containing a single 128-bit value (V1TFmode) and
> I need to access its last 64 bits (DFmode). Which of the two options
> is better?
>
> (subreg:DF (reg:V1TF) 8)
>
> or
>
> (vec_select:DF (subreg:V2DF (reg:V1TF) 0) (parallel [(const_int 1)]))
>
> If I use the first one, I run into a problem with set_noop_p (): it
> thinks that
>
> (set (subreg:DF (reg:TF %f0) 8) (subreg:DF (reg:V1TF %f0) 8))
>
> is a no-op, because it doesn't check the mode after stripping the
> subreg:
>
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/rtlanal.c;h=5ae38b79#l1616
>
> However this is not correct, because SET_DEST is the second register in
> a register pair, and SET_SRC is half of a vector register that overlaps
> the first register in the corresponding pair. So it looks as if mode
> needs to be considered there.

Yes.

> This helps:
>
> --- a/gcc/rtlanal.c
> +++ b/gcc/rtlanal.c
> @@ -1619,6 +1619,8 @@ set_noop_p (const_rtx set)
>         return 0;
>        src = SUBREG_REG (src);
>        dst = SUBREG_REG (dst);
> +      if (GET_MODE (src) != GET_MODE (dst))
> +       return 0;
>      }
>
> but I'm not sure whether I'm not missing something about subreg
> semantics in the first place.

You probably should just see if both modes are the same number of hard registers? HARD_REGNO_NREGS.

Segher
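As a rough illustration of the suggestion above, here is a toy model of the check, not GCC code. It assumes a hypothetical target where scalar modes are held in 8-byte registers (so TF needs a pair) and vector modes in 16-byte registers; Mode, Subreg, toy_hard_regno_nregs and subreg_set_is_noop are invented names, and the real HARD_REGNO_NREGS also depends on the register number, which this sketch ignores.

```cpp
#include <cassert>

// Toy machine modes, loosely following the ones named in this thread.
enum class Mode { DF, TF, V1TF, V2DF };

// Size of each toy mode in bytes.
unsigned mode_size(Mode m) {
    switch (m) {
    case Mode::DF:   return 8;
    case Mode::TF:   return 16;
    case Mode::V1TF: return 16;
    case Mode::V2DF: return 16;
    }
    assert(false && "unknown mode");
    return 0;
}

bool is_vector_mode(Mode m) {
    return m == Mode::V1TF || m == Mode::V2DF;
}

// ASSUMPTION: scalar values live in 8-byte registers, vector values in
// 16-byte registers. A stand-in for the target's real register count query.
unsigned toy_hard_regno_nregs(Mode m) {
    unsigned unit = is_vector_mode(m) ? 16 : 8;
    return (mode_size(m) + unit - 1) / unit;
}

// (subreg:outer (reg:inner regno) byte)
struct Subreg {
    unsigned regno;
    Mode inner;
    Mode outer;
    unsigned byte;
};

// A set between two subregs of the same register is only treated as a no-op
// when the inner modes also occupy the same number of hard registers.
bool subreg_set_is_noop(const Subreg& dst, const Subreg& src) {
    if (dst.regno != src.regno || dst.byte != src.byte || dst.outer != src.outer)
        return false;
    return toy_hard_regno_nregs(dst.inner) == toy_hard_regno_nregs(src.inner);
}
```

Under these assumptions the set from the original question, with a TF destination and a V1TF source at byte 8 of the same register, is no longer classified as a no-op, whereas V1TF versus V2DF still is.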
Re: A problem with one instruction multiple latencies and pipelines
Hi!

On Mon, Sep 07, 2020 at 09:20:59PM +0100, Richard Sandiford wrote:
> This is just personal opinion, but in general (from the point of view
> of a new port, or a new subport like SVE), I think the best approach
> to handling the "type" attribute is to start with the coarsest
> classification that makes sense, then split these relatively coarse
> types up whenever there's a specific need.

Agreed.

> When taking that approach, it's OK (and perhaps even a good sign)
> for an existing type to sometimes be too coarse for a new CPU.
>
> So thanks for asking about this, and please don't feel constrained
> by the existing "type" classification. IMO we should split existing
> types wherever that makes sense for new CPUs.

You can also use some other attributes to classify instructions, you don't have to put it all in one "type" attribute. This can of course be done later, at a time when it is clearer what a good design will be. Sometimes it is obvious from the start though :-)

(This primarily makes the pipeline descriptions much simpler, but also custom scheduling code and the like. If one core has two types of "A" insn, say "Aa" and "Ab", it isn't nice if all other cores now have to handle both "Aa" and "Ab" instead of just "A").

Segher
RE: A problem with one instruction multiple latencies and pipelines
Hi Segher

> You can also use some other attributes to classify instructions, you don't have
> to put it all in one "type" attribute. This can of course be done later, at a time
> when it is clearer what a good design will be.
> Sometimes it is obvious from the start though :-)

Thank you for your advice. It is also a good idea. Considering other cores (existing and future), I think it is better to keep the type attribute unchanged, and add other attributes to classify instructions.

Regards
Qian

> -----Original Message-----
> From: Segher Boessenkool
> Sent: Thursday, September 10, 2020 5:23 AM
> To: Qian, Jianhua/钱 建华 ; gcc@gcc.gnu.org;
> richard.sandif...@arm.com
> Subject: Re: A problem with one instruction multiple latencies and pipelines
>
> Hi!
>
> On Mon, Sep 07, 2020 at 09:20:59PM +0100, Richard Sandiford wrote:
> > This is just personal opinion, but in general (from the point of view
> > of a new port, or a new subport like SVE), I think the best approach
> > to handling the "type" attribute is to start with the coarsest
> > classification that makes sense, then split these relatively coarse
> > types up whenever there's a specific need.
>
> Agreed.
>
> > When taking that approach, it's OK (and perhaps even a good sign) for
> > an existing type to sometimes be too coarse for a new CPU.
> >
> > So thanks for asking about this, and please don't feel constrained by
> > the existing "type" classification. IMO we should split existing
> > types wherever that makes sense for new CPUs.
>
> You can also use some other attributes to classify instructions, you don't have
> to put it all in one "type" attribute. This can of course be done later, at a time
> when it is clearer what a good design will be.
> Sometimes it is obvious from the start though :-)
>
> (This primarily makes the pipeline descriptions much simpler, but also custom
> scheduling code and the like. If one core has two types of "A"
> insn, say "Aa" and "Ab", it isn't nice if all other cores now have to handle both
> "Aa" and "Ab" instead of just "A").
>
>
> Segher
>