[Bug libstdc++/84654] New: libstdc++ tries to use __float128 when compiling with -mno-float128

2018-03-01 Thread tuliom at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84654

Bug ID: 84654
   Summary: libstdc++ tries to use __float128 when compiling with
-mno-float128
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tuliom at linux dot vnet.ibm.com
  Target Milestone: ---

Using the following test program:

$ cat test.cpp
#include <math.h>
int main() {
  return 0;
}

$ g++-8 -mno-float128 test.cpp 
In file included from /usr/include/c++/8/cmath:47,
 from /usr/include/c++/8/math.h:36,
 from test.cpp:1:
/usr/include/c++/8/bits/std_abs.h:101:3: error: ‘__float128’ does not name a
type; did you mean ‘__floorl’?
   __float128
   ^~
   __floorl

I reproduced this problem on powerpc64le.

I have a patch to fix this.
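
For illustration only (this is not the patch mentioned above): a minimal sketch of how code can avoid naming __float128 when the keyword is disabled. It assumes that the compiler predefines `__FLOAT128__` or `__SIZEOF_FLOAT128__` only when the type is usable. That is an assumption about compiler predefines, not something stated in this report. `HAVE_FLOAT128_KEYWORD` and `widest_float` are hypothetical names.

```cpp
// Assumption: one of these macros is predefined only when the __float128
// keyword is actually available (e.g. not under -mno-float128 on powerpc64le).
#if defined(__FLOAT128__) || defined(__SIZEOF_FLOAT128__)
# define HAVE_FLOAT128_KEYWORD 1
typedef __float128 widest_float;   // keyword available: use it
#else
# define HAVE_FLOAT128_KEYWORD 0
typedef long double widest_float;  // fallback that never names __float128
#endif

// Hypothetical helper so callers can query the choice at run time.
inline bool have_float128_keyword() { return HAVE_FLOAT128_KEYWORD != 0; }
```

Any header that mentions __float128 unconditionally would need a guard of this kind to build with -mno-float128.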

[Bug c++/84788] New: Parenthesis changes a constant to non-constant

2018-03-09 Thread tuliom at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84788

Bug ID: 84788
   Summary: Parenthesis changes a constant to non-constant
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tuliom at linux dot vnet.ibm.com
  Target Milestone: ---

Reproduced with GCC 8.0.1 rev. 258059:

$ cat test-cxx.cpp 
template 
class PackedCache {
 public:
  static const int kValuebits = 7;
  static const int kValueMask = 1 << ((kValuebits)-1);
};

$ g++ -c test-cxx.cpp
test-cxx.cpp:5:53: error: non-constant in-class initialization invalid for
static member ‘PackedCache::kValueMask’
   static const int kValueMask = 1 << ((kValuebits)-1);
 ^
test-cxx.cpp:5:53: note: (an out of class initialization is required)


If the parentheses around kValuebits are removed, the error disappears, i.e.:
  static const int kValueMask = 1 << (kValuebits-1);

I can't reproduce this error with GCC 7.
This file used to build with rev. 253122.
I can build the file when using -std=gnu++11.
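
A workaround sketch that follows from the observations above: drop the redundant parentheses. The template parameter list (`typename T`) is a placeholder, since the original one is not shown in this report, and `packed_cache_mask` is a hypothetical helper added only to read the value back.

```cpp
// Same members as the test case, minus the redundant parentheses around
// kValuebits. `typename T` is a placeholder template parameter.
template <typename T>
class PackedCache {
 public:
  static const int kValuebits = 7;
  static const int kValueMask = 1 << (kValuebits - 1);
};

// Both members are usable in constant expressions: 1 << 6 == 64.
static_assert(PackedCache<int>::kValueMask == 64, "mask should be 1 << 6");

// Hypothetical helper: reads the constant by value.
int packed_cache_mask() { return PackedCache<int>::kValueMask; }
```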

[Bug target/81193] PowerPC GCC __builtin_cpu_is and __builtin_cpu_supports should warn about old libraries

2017-06-27 Thread tuliom at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81193

Tulio Magno Quites Machado Filho  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tuliom at linux dot vnet.ibm.com

--- Comment #7 from Tulio Magno Quites Machado Filho  ---
(In reply to Michael Meissner from comment #4)
> I have no problems with restricting use of __builtin_cpu_ and
> target_clone to GLIBC 2.19 or newer (or whatever the release version is).

That's glibc 2.23.

[Bug c/82636] New: powerpc: Unnecessary copy of __ieee128 parameter

2017-10-20 Thread tuliom at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82636

Bug ID: 82636
   Summary: powerpc: Unnecessary copy of __ieee128 parameter
   Product: gcc
   Version: 7.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tuliom at linux dot vnet.ibm.com
  Target Milestone: ---

Tested with GCC 7.2.1 on powerpc64le.

The copy of vs36 (v4) to vs32 (v0) shouldn't be required, i.e. I'd expect to
have xsmaddqp v4,v2,v3.

$ cat s_fmaf128-power9.c
__ieee128
__fmaf128_power9 (__ieee128 x, __ieee128 y, __ieee128 z)
{
  asm ("xsmaddqp\t%0, %1, %2" : "+v" (z) : "v" (x), "v" (y));
  return z;
}

$ gcc -mcpu=power9 -mfloat128 -O3 -c s_fmaf128-power9.c -o test.o

$ objdump -d test.o
...
 <__fmaf128_power9>:
   0:   97 24 04 f0 xxlor   vs32,vs36,vs36
   4:   08 1b 02 fc xsmaddqp v0,v2,v3
   8:   97 04 40 f0 xxlor   vs34,vs32,vs32
   c:   20 00 80 4e blr
...

[Bug target/67281] New: HTM builtins aren't treated as compiler barriers on powerpc

2015-08-19 Thread tuliom at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67281

Bug ID: 67281
   Summary: HTM builtins aren't treated as compiler barriers on
powerpc
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tuliom at linux dot vnet.ibm.com
  Target Milestone: ---

Depending on the optimization level, GCC moves a memory access outside a
transaction, which breaks the atomicity of the transaction.

This is a small testcase that reproduces this behavior:

$ cat tbegin-barrier.c
long
foo (long dest, long src0, long src1, long tries)
{
  long i;
  for (i = 0; i < tries; i++)
    {
      __builtin_tbegin (0);
      dest = src0 + src1;
      __builtin_tend (0);
    }
  return dest;
}

compiling using: -O2 -S -mcpu=power8 tbegin-barrier.c
gives
foo:
cmpdi 0,6,0
blelr 0
mtctr 6
add 3,4,5
.p2align 4,,15
.L3:
tbegin. 0
tend. 0
bdnz .L3
blr


[Bug target/67281] HTM builtins aren't treated as compiler barriers on powerpc

2015-08-19 Thread tuliom at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67281

--- Comment #2 from Tulio Magno Quites Machado Filho  ---
Oooops.  My bad.
What about this one?

$ cat tbegin-barrier.c
long
foo (long dest, long *src0, long src1, long tries)
{
  long i;
  for (i = 0; i < tries; i++)
    {
      __builtin_tbegin (0);
      dest = *src0 + src1;
      __builtin_tend (0);
    }
  return dest;
}

If we compile it the same way:

foo:
cmpdi 0,6,0
blelr 0
mtctr 6
ld 3,0(4)
.p2align 4,,15
.L3:
tbegin. 0
tend. 0
bdnz .L3
add 3,3,5
blr


[Bug target/67281] HTM builtins aren't treated as compiler barriers on powerpc

2015-08-19 Thread tuliom at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67281

--- Comment #4 from Tulio Magno Quites Machado Filho  ---
(In reply to Andrew Pinski from comment #3)
> Since there are no stores, the load seems like it can be pulled out of the
> loop too.

I disagree with you.
If I use the value of dest to make a decision inside the transaction, I need
the memory barrier before the access to *src0.

Here's an example:

long
foo (long dest, long *src0, long src1, long tries)
{
  long i;
  for (i = 0; i < tries; i++)
    {
      __builtin_tbegin (0);
      dest = *src0 + src1;
      if (dest == 13)
        __builtin_tabort (0);
      __builtin_tend (0);
    }
  return dest;
}

In other words, if you access *src0 before the memory barrier, its value may
change by the time the memory barrier is created.  This matters particularly
when dest indicates whether a lock has been acquired by another thread.

For reference, this was found in the glibc source code:
https://sourceware.org/ml/libc-alpha/2015-07/msg00986.html
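
Independent of any compiler fix, the ordering requirement described above can be expressed by pairing each HTM builtin with an explicit compiler barrier. The sketch below assumes GCC-style inline asm. The `tx_begin`/`tx_end` helpers and the `COMPILER_BARRIER` macro are hypothetical names. On targets where `__HTM__` is not defined, the helpers reduce to the barrier alone, so the function still compiles but is no longer transactional. Like the test cases above, it ignores the result of __builtin_tbegin.

```cpp
// Empty asm with a "memory" clobber: emits no instruction, but tells the
// compiler that memory may change here, so loads and stores are not moved
// across it.
#define COMPILER_BARRIER() __asm__ __volatile__ ("" ::: "memory")

static inline void tx_begin() {
#if defined(__powerpc64__) && defined(__HTM__)
  __builtin_tbegin(0);   // start the transaction (result ignored here)
#endif
  COMPILER_BARRIER();    // accesses below must not be hoisted above tbegin
}

static inline void tx_end() {
  COMPILER_BARRIER();    // accesses above must not sink below tend
#if defined(__powerpc64__) && defined(__HTM__)
  __builtin_tend(0);     // commit the transaction
#endif
}

long foo(long dest, long *src0, long src1, long tries) {
  for (long i = 0; i < tries; i++) {
    tx_begin();
    dest = *src0 + src1;  // the load of *src0 stays between the barriers
    tx_end();
  }
  return dest;
}
```

With the barriers in place, the load of *src0 has to be repeated on every iteration instead of being hoisted out of the loop.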


[Bug target/71977] New: powerpc64: Use VSR when operating on float and integer

2016-07-22 Thread tuliom at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71977

Bug ID: 71977
   Summary: powerpc64: Use VSR when operating on float and integer
   Product: gcc
   Version: 6.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tuliom at linux dot vnet.ibm.com
  Target Milestone: ---

The following code is common in libm:

#include <stdint.h>

typedef union
{
  float value;
  uint32_t word;
} ieee_float_shape_type;

float
mask_float (float f, uint32_t mask)
{ 
  ieee_float_shape_type u;

  u.value = f;
  u.word &= mask;

  return u.value;
}
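
As a quick check of what this operation computes, here is a target-independent sketch of the same masking. It uses memcpy instead of the union and assumes IEEE 754 binary32 floats. `mask_float_portable` is a hypothetical name, not part of libm.

```cpp
#include <cstdint>
#include <cstring>

// Same bitwise operation as mask_float: reinterpret the float as a 32-bit
// word, AND it with the mask, and reinterpret the result as a float.
float mask_float_portable(float f, std::uint32_t mask) {
  std::uint32_t word;
  static_assert(sizeof word == sizeof f, "float must be 32 bits wide");
  std::memcpy(&word, &f, sizeof word);  // float bits -> integer
  word &= mask;
  std::memcpy(&f, &word, sizeof f);     // integer bits -> float
  return f;
}
```

For example, masking 1.5f with 0xFF800000 keeps only the sign and exponent bits and yields 1.0f, and masking with 0x7FFFFFFF clears the sign bit.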

GCC 6.1.1 is executing the operation in the GPR:

mask_float:
xscvdpspn 12,1
mfvsrd 9,12
srdi 9,9,32
and 4,4,9
sldi 9,4,32
mtvsrd 1,9
xscvspdpn 1,1
blr

However, operating in the VSR reduces the number of instructions and relieves
GPR pressure, e.g.:

mask_float2:
sldi 4,4,32
mtvsrd 0,4
xvcvdpsp 1, 1
xxland 1, 1, 0
xvcvspdp 1, 1
blr