Re: Two suggestions for gcc C compiler to extend C language (by WD Smith)

2016-07-27 Thread Erik Trulsson

Quoting Warren D Smith:


> Also, I'm somewhat amazed how it is argued to me that a 9-bit machine
> the PDP-10 is covered by C fine, but yet, C insists on having everything
> a multiple of 8 bits with padding bits disallowed, and that too is fine,
> and both these facts refute me.



Wrong.
The C language does not insist on multiples of 8.
As far as the C standard is concerned it would be perfectly fine to have
9 bit chars, 18 bit short ints, 27 bit ints, 36 bit long, and 72 bit
long long.
I do not believe GCC has support for any targets like that, but it is
allowed by C.


And, except for unsigned char, any of the integer types may have some
kind of padding bits or disallowed bit patterns.
Again, perhaps not on any of the targets that GCC supports, but it is
allowed by the C standard.
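
A minimal sketch of how a program can check these properties on its own target instead of assuming 8-bit multiples. It relies only on <limits.h>; the helper names value_bits, OBJECT_BITS and print_integer_layout are made up for this illustration. The difference between object bits and value bits is exactly the number of padding bits, which is zero on the targets GCC supports.

```c
#include <limits.h>
#include <stdio.h>

/* Count the value bits of an unsigned type from its maximum value. */
static unsigned value_bits(unsigned long long max)
{
    unsigned bits = 0;
    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* Bits in the object representation: value bits plus any padding bits. */
#define OBJECT_BITS(type) ((unsigned)(sizeof(type) * CHAR_BIT))

/* Print the layout of the standard unsigned types on this target. */
static void print_integer_layout(void)
{
    printf("CHAR_BIT = %d\n", CHAR_BIT);
    printf("unsigned char:      %u object bits, %u value bits\n",
           OBJECT_BITS(unsigned char), value_bits(UCHAR_MAX));
    printf("unsigned short:     %u object bits, %u value bits\n",
           OBJECT_BITS(unsigned short), value_bits(USHRT_MAX));
    printf("unsigned int:       %u object bits, %u value bits\n",
           OBJECT_BITS(unsigned int), value_bits(UINT_MAX));
    printf("unsigned long:      %u object bits, %u value bits\n",
           OBJECT_BITS(unsigned long), value_bits(ULONG_MAX));
    printf("unsigned long long: %u object bits, %u value bits\n",
           OBJECT_BITS(unsigned long long), value_bits(ULLONG_MAX));
}
```

Calling print_integer_layout() from main() prints one line per type; on a hypothetical 9-bit-char target the same code would report CHAR_BIT = 9 without any change.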




Re: Two suggestions for gcc C compiler to extend C language (by WD Smith)

2016-07-27 Thread David Brown
On 26/07/16 21:06, Warren D Smith wrote:
> OK, you just said you've used packed nybble arrays a couple of times.

Yes, a couple of times in 20+ years.  And I work with the kind of
programming where something like nibble arrays could conceivably be
useful.  For most C programmers, "int" is the only integer type they
ever need or use, and they will probably not even have heard of a
"nibble".  It is absolutely a non-issue.

> Multiplying you by 100,000 that proves if you put it in GCC,
> you'd save 200,000 programmer hours, which is equivalent to saving
> over 2 lives.

That claim is absurd in every way imaginable.

> 
> You just said you've written your own double-length multiply.
> Same proof.

Again, you are completely wrong.  Double-length multiplies are very
rarely necessary, so there is no point in having extra compiler features
in order to make them simpler.  In the rare cases where they /are/
needed, it is not hard to implement them yourself.  And gcc has already
made the process easier and more efficient over the years, by making
real and beneficial advances that have the side-effect of making
double-length multiplies easier and more efficient (general optimisation
improvements make the code more efficient, the C99 long long removed
almost all use-cases for manual extended integer arithmetic, and the
__builtin_XXX_overflow functions are available).
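
As an illustration of "not hard to implement yourself", here is a sketch of a double-length multiply in plain C99. It is not code from this thread; the names mul_32x32_to_64, mul_64x64_to_128 and struct u128 are invented for the example. The 32-bit case only needs a widening cast; the 64-bit case uses the schoolbook method with four 32-bit partial products.

```c
#include <stdint.h>

/* 32x32 -> 64: widen one operand first so the product is formed in 64 bits. */
static uint64_t mul_32x32_to_64(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* Result of a 64x64 -> 128 multiply, split into two 64-bit halves. */
struct u128 {
    uint64_t hi;
    uint64_t lo;
};

/* 64x64 -> 128 using four 32x32 -> 64 partial products. */
static struct u128 mul_64x64_to_128(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* contributes at bit 0  */
    uint64_t p1 = a_lo * b_hi;   /* contributes at bit 32 */
    uint64_t p2 = a_hi * b_lo;   /* contributes at bit 32 */
    uint64_t p3 = a_hi * b_hi;   /* contributes at bit 64 */

    /* Sum of everything that lands in bits 32..63; cannot overflow 64 bits. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    struct u128 r;
    r.lo = (mid << 32) | (uint32_t)p0;
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

With optimisation enabled, a compiler is free to turn the 32-bit version into a single widening multiply instruction where the target has one; whether it does so for the 64-bit version depends on the compiler and target.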

It is hard to imagine how you could be more wrong in what you write.  I
don't know whether you are just ignorant and angry at something, or
trolling.

> 
> Thank you for proving my point.
> 
> How many deaths does it take before it is worth putting into GCC?

You are now claiming that adding a double-length "mul" builtin to gcc
would save lives?  Seriously?

> And it isn't like I'm suggesting something hard, or which would be
> unattractive to users.

Making a double-length multiply built-in would not be very hard - but it
is far easier to write it in C source code than to add it to the
compiler.  And yes, it would be unattractive to users - for almost
everybody, it would be useless, yet it would claim an identifier "mul"
that could conflict with their own code.

And of course there is the key point that the gcc developers are not
lazy, despite your claims - they are extremely dedicated and hard
working, often in their free time.  There are large numbers of /useful/
features (and occasional bug fixes!) that are on their "todo" lists.
Any time spent on meaningless ideas that can already be handled perfectly
well is time they cannot spend on something that people actually
want.

> 
> And thanks for the tip on how to do add-with-carry.
> That's nice.   Now I have to ask, now you've helpfully demonstrated
> how nice it can be, why not put that niceness inside GCC?  

gcc /has/ that "niceness" - the code I wrote works fine with gcc.  (And
gcc has the __builtin_XXX_overflow functions - I suspect you did not
bother looking at the documentation about the features gcc supports,
before ranting about what's missing.)
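
For readers who have not seen the overflow built-ins, a hedged sketch of add-with-carry written on top of __builtin_add_overflow (documented for GCC 5 and later). This is not the code referred to above, which is not reproduced in this message; add_with_carry and add_limbs are names chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compute *sum = a + b + carry_in and return the carry out (0 or 1). */
static unsigned add_with_carry(uint64_t a, uint64_t b, unsigned carry_in,
                               uint64_t *sum)
{
    uint64_t partial;
    bool c1 = __builtin_add_overflow(a, b, &partial);
    bool c2 = __builtin_add_overflow(partial, (uint64_t)carry_in, sum);
    /* At most one of the two additions can wrap, so OR is enough. */
    return (unsigned)(c1 | c2);
}

/* Add two n-limb numbers stored least significant limb first. */
static unsigned add_limbs(uint64_t *dst, const uint64_t *x,
                          const uint64_t *y, size_t n)
{
    unsigned carry = 0;
    for (size_t i = 0; i < n; i++)
        carry = add_with_carry(x[i], y[i], carry, &dst[i]);
    return carry;
}
```

Whether the loop compiles down to a chain of hardware add-with-carry instructions depends on the GCC version, the target and the optimisation level.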

> I mean, if
> GCC already is going to
> provide div(a,b) -- it was not me who asked for that, it was GCC that
> decided to provide it --

"div" has been specified since the first C standard (C89/C90); it was
not something gcc decided to provide.

> which I could have just got in plain C using  q=a/b; r=a%b;  and depended on
> optimizer, zero thought required -- then how can you justify GCC
> *not* providing addc(a,b) when it is trickier for the programmer, so
> you are clearly providing something more helpful since was more
> tricky?

"div" is documented in the standards - the compiler or library has no
option but to support it, even if people can now get at least as good
code (in most cases) by writing the operations individually.

It is true that it was the gcc folks that made the decision to have a
builtin version as an alternative to a library call.  They have tried to
do that for many simple or commonly used standard library functions,
because that is useful for a lot of code.
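
The two spellings being compared, side by side, as a small sketch. div() and div_t come from <stdlib.h>; struct quot_rem and the two wrapper functions are illustrative names only. Since C99 both forms truncate toward zero, so they agree for negative operands too.

```c
#include <stdlib.h>

/* Quotient and remainder pair, independent of how it was computed. */
struct quot_rem {
    int quot;
    int rem;
};

/* Plain operators: the optimizer is expected to share one division. */
static struct quot_rem divide_with_operators(int a, int b)
{
    struct quot_rem r = { a / b, a % b };
    return r;
}

/* Standard library div(): required by the C standard since C89. */
static struct quot_rem divide_with_div(int a, int b)
{
    div_t d = div(a, b);
    struct quot_rem r = { d.quot, d.rem };
    return r;
}
```

Both functions have undefined behaviour for b == 0 and for INT_MIN / -1, exactly like the underlying operations.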

> 
> Why am I bothering?  You prove my point then act as though you proved 
> opposite.
> 
> Concrete examples?  Hell, I suggested stdint.h years and years before
> it came along, and I was told I was an idiot.  I suggested making a
> lot of library functions be builtins, told I was an idiot, and now lo
> and behold, years and years later, gcc makes many library functions be
> builtins.  I complained the stdio library was a disaster waiting to
> happen with
> buffer overflows, told I was an idiot, and lo and behold, years and
> years later people keep trying to work around that, with at least two
> people having written nonstandard replacement libraries to try for
> safety, and huge billions of dollars estimated to be lost due to this
> bad design.

Judging by your writing here, you were told that you were an idiot
because you are an idiot.  There may be the occasional good idea hidden
behind the piles of nonsense, insults, rants, accusations and
complaints, but they would be hard to see.  An

Re: [gimplefe] hacking pass manager

2016-07-27 Thread Richard Biener
On Tue, Jul 26, 2016 at 11:38 PM, Prathamesh Kulkarni
 wrote:
> On 27 July 2016 at 00:20, Prasad Ghangal  wrote:
>> On 20 July 2016 at 18:28, Richard Biener  wrote:
>>> On Wed, Jul 20, 2016 at 1:46 PM, Prathamesh Kulkarni
>>>  wrote:
 On 20 July 2016 at 11:34, Richard Biener  
 wrote:
> On Tue, Jul 19, 2016 at 10:09 PM, Prasad Ghangal
>  wrote:
>> On 19 July 2016 at 11:04, Richard Biener  
>> wrote:
>>> On July 18, 2016 11:05:58 PM GMT+02:00, David Malcolm 
>>>  wrote:
On Tue, 2016-07-19 at 00:52 +0530, Prasad Ghangal wrote:
> On 19 July 2016 at 00:25, Richard Biener 
> wrote:
> > On July 18, 2016 8:28:15 PM GMT+02:00, Prasad Ghangal <
> > prasad.ghan...@gmail.com> wrote:
> > > On 15 July 2016 at 16:13, Richard Biener <
> > > richard.guent...@gmail.com>
> > > wrote:
> > > > On Sun, Jul 10, 2016 at 6:13 PM, Prasad Ghangal
> > > >  wrote:
> > > > > On 8 July 2016 at 13:13, Richard Biener <
> > > > > richard.guent...@gmail.com>
> > > wrote:
> > > > > > On Thu, Jul 7, 2016 at 9:45 PM, Prasad Ghangal
> > >  wrote:
> > > > > > > On 6 July 2016 at 14:24, Richard Biener
> > >  wrote:
> > > > > > > > On Wed, Jul 6, 2016 at 9:51 AM, Prasad Ghangal
> > >  wrote:
> > > > > > > > > On 30 June 2016 at 17:10, Richard Biener
> > >  wrote:
> > > > > > > > > > On Wed, Jun 29, 2016 at 9:13 PM, Prasad Ghangal
> > > > > > > > > >  wrote:
> > > > > > > > > > > On 29 June 2016 at 22:15, Richard Biener
> > >  wrote:
> > > > > > > > > > > > On June 29, 2016 6:20:29 PM GMT+02:00,
> > > > > > > > > > > > Prathamesh Kulkarni
> > >  wrote:
> > > > > > > > > > > > > On 18 June 2016 at 12:02, Prasad Ghangal
> > > 
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I tried hacking the pass manager to execute only given passes. For this I
> > > > > > > > > > > > > > > > am adding a new member, opt_pass *custom_pass_list, to the function
> > > > > > > > > > > > > > > > structure to store the passes that need to be executed, and providing
> > > > > > > > > > > > > > > > custom_pass_list to the execute_pass_list() function instead of all passes.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > For a test case like
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > int a;
> > > > > > > > > > > > > > > > void __GIMPLE (execute ("tree-ccp1", "tree-fre1")) foo()
> > > > > > > > > > > > > > > > {
> > > > > > > > > > > > > > > > bb_1:
> > > > > > > > > > > > > > > >   a = 1 + a;
> > > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > it will execute only the given passes, i.e. the ccp1 and fre1 passes, on the function,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > and for a test case like
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > int a;
> > > > > > > > > > > > > > > > void __GIMPLE (startwith ("tree-ccp1")) foo()
> > > > > > > > > > > > > > > > {
> > > > > > > > > > > > > > > > bb_1:
> > > > > > > > > > > > > > > >   a = 1 + a;
> > > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > it will act as an entry point to the pipeline and will execute passes
> > > > > > > > > > > > > > > > starting from the given pass.
> > > > > > > > > > > > > > > Bike-shedding:
> > > > > > > > > > > > > > > Would it make sense to have syntax for defining pass ranges to execute?
> > > > > > > > > > > > > > > For instance:
> > > > > > > > > > > > > > > void __GIMPLE(execute (pass_start : pass_end))
> > > > > > > > > > > > > > > which would execute all the passes within the range [pass_start, pass_end],
> > > > > > > > > > > > > > > which would be convenient if the range is large.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > But it would rely on a particular pass pipeline, e.g. pass-start appearing before pass-end.
> > > > > > > > > > > > > >
> > > > > > > > > > > > Currently control

Re: Thread-safety of a profiled binary (and GCOV runtime library)

2016-07-27 Thread Martin Liška
On 07/26/2016 01:15 AM, Andi Kleen wrote:
> You definitely need a new flag: atomic or per thread instrumentation
> will almost certainly have significant overhead (either at run time
> or in memory). Just making an existing facility a lot slower
> without a way around it is not a good idea.

Hi

Agree with that!

> 
> BTW iirc there were patches from google on this a few years back.
> May be still in some branch.

That's great, I'm CCing Google people who worked on PGO, hope they will
help me to find patches/discussion about the problem.

Martin

> 
> -Andi



GCC 6.2?

2016-07-27 Thread Paul Smith
Hi all.  Don't want to be a noodge but is there any info on a timeline
for the 6.2 release?

I'm planning a major tools upgrade (from GCC 4.9.2) and I've been kind
of putting it off until 6.2 is out so I can jump to that... but the
natives are getting restless as they want some C++ features that aren't
available in 4.9.2.

Usually it seems like the "first patch release" for a new major release
is out right around now (3 months after the initial release).  Just
wondering if there is any info on this or if things are going to be
very different for 6.2.

Cheers!


Re: GCC 6.2?

2016-07-27 Thread Richard Biener
On Wed, Jul 27, 2016 at 4:03 PM, Paul Smith  wrote:
> Hi all.  Don't want to be a noodge but is there any info on a timeline
> for the 6.2 release?
>
> I'm planning a major tools upgrade (from GCC 4.9.2) and I've been kind
> of putting it off until 6.2 is out so I can jump to that... but the
> natives are getting restless as they want some C++ features that aren't
> available in 4.9.2.
>
> Usually it seems like the "first patch release" for a new major release
> is out right around now (3 months after the initial release).  Just
> wondering if there is any info on this or if things are going to be
> very different for 6.2.

I'm doing 4.9.4 now and 6.2 only afterwards so you can expect 6.2 earliest
in about three weeks.

Richard.

> Cheers!


[RFD] Extremely large alignment of read-only strings

2016-07-27 Thread Thorsten Glaser
Hi,

(tl;dr: skip to “questions from me to the GCC developers”)

I’ve recently spotted the following in some code I inherited:

struct foo {
char name[4];
uint8_t len;
uint8_t type;
};

static const struct foo fooinfo[] = {
/* list of 43 members */
};

I’ve seen the obvious two-byte padding (on i386, with -Os)
and thought to restructure this into:

static const char fooname[][4] = { … };
static const uint8_t foolen[] = { … };
static const uint8_t footype[] = { … };

Colour me surprised when this made the code over fifty
bytes longer. After some debugging, I found that the
assembly code generated had “.align 32” in front of
each of the new arrays.

After some (well, lots) more debugging, I eventually
discovered -fdump-translation-unit (which, in the version
I was using, also worked for C, not just C++), which showed
me that the alignment was 256 even (only later reduced to
32 as that’s the maximum alignment for i386).
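
A rough way to observe the effect described here without reading the assembly output: compare the payload size of the two layouts and check which alignment the objects actually received at run time. This is only a sketch; NENTRIES, aos_bytes, soa_bytes, actual_alignment and report_layout are names invented for the illustration, the arrays are left zero-initialised, and the numbers printed depend on compiler, target and options.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NENTRIES 43

struct foo {
    char name[4];
    uint8_t len;
    uint8_t type;
};

/* Array-of-structs layout versus the three split arrays. */
static const struct foo fooinfo[NENTRIES];
static const char fooname[NENTRIES][4];
static const uint8_t foolen[NENTRIES];
static const uint8_t footype[NENTRIES];

/* Payload bytes of each layout, ignoring padding between objects. */
static size_t aos_bytes(void) { return sizeof fooinfo; }
static size_t soa_bytes(void)
{
    return sizeof fooname + sizeof foolen + sizeof footype;
}

/* Largest power of two dividing the address, i.e. the alignment
   the object ended up with (not the one its type requires). */
static uintptr_t actual_alignment(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    return addr & (~addr + 1u);
}

static void report_layout(void)
{
    printf("AoS payload: %zu bytes, SoA payload: %zu bytes\n",
           aos_bytes(), soa_bytes());
    printf("fooinfo aligned to %lu\n", (unsigned long)actual_alignment(fooinfo));
    printf("fooname aligned to %lu\n", (unsigned long)actual_alignment(fooname));
    printf("foolen  aligned to %lu\n", (unsigned long)actual_alignment(foolen));
    printf("footype aligned to %lu\n", (unsigned long)actual_alignment(footype));
}
```

If each split array is rounded up to a 32-byte boundary, the padding between the three objects can outweigh the bytes saved inside the struct, which matches the size increase reported above.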

Lots of digging later, I found this gem in gcc/config/i386/i386.c:

int
ix86_data_alignment (tree type, int align)
{
  if (AGGREGATE_TYPE_P (type)
      && TYPE_SIZE (type)
      && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
      && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 256
          || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 256)
    return 256;

Voilà, there we have my culprit – commenting this out resulted
in a 12-byte yield (not 42*2 byte, as the code generated for e.g.
“strncmp(foo, opname[i], (size_t)oplen[i])” is a bit less optimal
than for “strncmp(foo, opinfo[i].name, (size_t)opinfo[i].len)”,
but that’s okay)… and a 206-byte reduction for the rest of the
codebase.

Seeing that ix86_data_alignment() also contains amd64-specific
alignment, and that MMX stuff generally needs more alignment,
I first did this:

&& (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 256
-  || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 256)
+  || TREE_INT_CST_HIGH (TYPE_SIZE (type)))
+   && (TARGET_MMX || !optimize_size)
+   && align < 256)
 return 256;

The idea here being that both TARGET_SSE and TARGET_64BIT
enable TARGET_MMX, and to do this only for -Os.

Then I went into the svn history for this function and
discovered that its predecessor in gcc/config/i386/i386.h
(the DATA_ALIGNMENT(TYPE, ALIGN) macro) was added around
2000, before MMX support was even a thing in GCC, to “improve floating
point performance”, something that other architectures apparently can
do without.

Now I’m trying roughly this:

[…]
 ix86_constant_alignment (tree exp, int align)
 {
+  if (optimize_size && !TARGET_MMX)
+    return align;
[…]
 ix86_data_alignment (tree type, int align)
 {
+  if (optimize_size && !TARGET_MMX)
+    return align;
[…]
 ix86_local_alignment (tree type, int align)
 {
+  if (optimize_size && !TARGET_MMX)
+    return align;
[…]

This opens up some questions from me to the GCC developers
though:

– Is this safe to do? (My baseline here is 3.4.6, so
  if someone still remembers, please do answer, but
  the scope of this eMail in total goes beyond that.)

– Is this something that GCC trunk could benefit from?

– I’ve also been wondering whether this applies to
  regular strings (not arrays that technically are
  strings too) as well…

– Is the exclusion of MMX and 64BIT required? (Since
  this code has been there “ever” since even before
  MMX support landed in GCC, I fear that some of the
  “required alignment” are done inside this function
  instead of in other places.)

– Even better: is this something we could do for *all*
  platforms in general? Something like this, in gcc/varasm.c:

 #ifdef DATA_ALIGNMENT
-  align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
+  if (!optimize_size)
+    align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
 #endif

My aim here is to tighten the density (reduce the size
of the individual sections in the .o file and, ideally,
the file size of the final executable) of the generated
code for -Os while not breaking anything, and leaving
the case of not-Os completely alone.

Of course I’ll do a full rebuild of MirBSD (which uses
-Os in almost all code; only some legacy crap from the
1970s like AT&T nroff uses -O1 or even -O0, as the code
doesn’t conform to ISO C) to see if things break, but
I’m also interested in the bigger picture; besides, I
am invested in embedded systems (FreeWRT/OpenADK,
but also dietlibc, klibc, etc.), which love small code.

Thanks in advance,
//mirabilos
PS:  Please do Cc me, I’m not subscribed.
PPS: I’ve exchanged assignment papers with the FSF about GCC,
 so feel free to just commit anything, if it makes sense.
-- 
FWIW, I'm quite impressed with mksh interactively. I thought it was much
*much* more bare bones. But it turns out it beats the living hell out of
ksh93 in that respect. I'd even consider it for my daily use if I hadn't
wasted half my life on my zsh setup. :-) -- Frank Terbeck in #!/bin/mksh


Re: [RFD] Extremely large alignment of read-only strings

2016-07-27 Thread Thorsten Glaser
(apologies for the double post, GMane had a hiccup. The latter,
which this is a reply to, has one discussion item more, so please
ignore the other)


Re: Thread-safety of a profiled binary (and GCOV runtime library)

2016-07-27 Thread Xinliang David Li
Resend in plain text mode.

On Wed, Jul 27, 2016 at 9:07 AM, Xinliang David Li  wrote:
> Our experience is that non-atomic counter updates (the current
> implementation) rarely result in a corrupted profile (in heavily threaded
> environments) -- they usually result in some profile insanity which can be
> corrected with -fprofile-correction -- otherwise we would have been forced
> to go the TLS route.
>
> Profile corruption usually happens when a server program crashes during
> dumping (server programs usually do not exit, so forcing one to exit to dump
> the profile can cause problems). Our solution is to introduce a profile
> runtime API to be invoked by the user.
>
> We added the atomic support mostly for Linux kernel FDO. The support is in
> the google/gcc_49 branch. The option description is:
>
> ; fprofile-generate-atomic=0: disable atomic updates.
> ; fprofile-generate-atomic=1: atomically update edge profile counters.
> ; fprofile-generate-atomic=2: atomically update value profile counters.
> ; fprofile-generate-atomic=3: atomically update edge and value profile
> ; counters.
> ; other values will be ignored (fall back to the default of 0).
> fprofile-generate-atomic=
> Common Joined UInteger Report Var(flag_profile_gen_atomic) Init(0) Optimization
> fprofile-generate-atomic=[0..3] Atomically increment profile counters.
>
> thanks,
>
> David
>
>
> On Wed, Jul 27, 2016 at 5:05 AM, Martin Liška  wrote:
>>
>> On 07/26/2016 01:15 AM, Andi Kleen wrote:
>> > You definitely need a new flag: atomic or per thread instrumentation
>> > will almost certainly have significant overhead (either at run time
>> > or in memory). Just making an existing facility a lot slower
>> > without a way around it is not a good idea.
>>
>> Hi
>>
>> Agree with that!
>>
>> >
>> > BTW iirc there were patches from google on this a few years back.
>> > May be still in some branch.
>>
>> That's great, I'm CCing Google people who worked on PGO, hope they will
>> help me to find patches/discussion about the problem.
>>
>> Martin
>>
>> >
>> > -Andi
>>
>
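
To make the trade-off in this thread concrete, here is a sketch of the two kinds of counter update being compared. It is not the gcov runtime or the code from the google branch; profile_counter, bump_plain, bump_atomic and count_with are hypothetical names. The atomic variant uses GCC's __atomic_fetch_add built-in with relaxed ordering, which is enough for a pure event count.

```c
#include <stdint.h>

/* Hypothetical stand-in for one profile counter. */
typedef int64_t profile_counter;

/* Plain update: load, add, store. Two threads doing this at the same
   time may lose increments, which shows up as an inconsistent profile. */
static void bump_plain(profile_counter *counter)
{
    *counter += 1;
}

/* Atomic update: no increment is lost, at the cost of a more
   expensive read-modify-write on every execution of the edge. */
static void bump_atomic(profile_counter *counter)
{
    __atomic_fetch_add(counter, 1, __ATOMIC_RELAXED);
}

/* Single-threaded driver: apply one update function n times. */
static profile_counter count_with(void (*bump)(profile_counter *), int n)
{
    profile_counter counter = 0;
    for (int i = 0; i < n; i++)
        bump(&counter);
    return counter;
}
```

In a single thread both variants give the same count; the difference only appears when several threads update the same counter, which is where the extra cost of the atomic form has to be justified.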


Re: Need help with PR71976 combine.c::get_last_value returns a wrong result

2016-07-27 Thread Segher Boessenkool
On Tue, Jul 26, 2016 at 03:38:18PM +0200, Georg-Johann Lay wrote:
> >>@@ -13206,6 +13206,13 @@ get_last_value (const_rtx x)
> >>   && DF_INSN_LUID (rsp->last_set) >= subst_low_luid)
> >>     return 0;
> >>
> >>+  /* If the lookup is for a hard register make sure that value contains at least
> >>+     as many bits as x does.  */
> >>+
> >>+  if (regno < FIRST_PSEUDO_REGISTER
> >>+      && GET_MODE_PRECISION (rsp->last_set_mode) < GET_MODE_PRECISION (GET_MODE (x)))
> >>+    return 0;
> >>+
> >>   /* If the value has all its registers valid, return it.  */
> >>   if (get_last_value_validate (&value, rsp->last_set, rsp->last_set_label, 0))
> >>     return value;
> >
> >That might be a bit harsh.
> 
> In what respect?

It disables all kinds of optimisations that work now.

> >First things first: is the information recorded correct?
> 
> I think yes, but care must be taken what may be concluded from that 
> information.  From a set of 8 bits you cannot draw conclusion about all 64 
> bits; this should be obvious enough :-)

:-)

> In the above case rsp[regno] holds only information for 1 sub-byte.  In 
> order to get the complete DImode story we would have to get the info for 
> all sub-parts and then put them together...

Yes, and the rsp stuff does not do that.

I am testing the following patch.  Thanks,


Segher


diff --git a/gcc/combine.c b/gcc/combine.c
index 77e0d2b..dec6226 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -9977,6 +9977,9 @@ reg_num_sign_bit_copies_for_combine (const_rtx x, machine_mode mode,
       return NULL;
     }
 
+  if (GET_MODE_PRECISION (rsp->last_set_mode) != GET_MODE_PRECISION (mode))
+    return NULL;
+
   tem = get_last_value (x);
   if (tem != 0)
     return tem;
-- 
1.9.3



Remove deprecated std::has_trivial_xxx traits

2016-07-27 Thread Jonathan Wakely

I propose that we remove the following non-standard traits in GCC 7:

 /// has_trivial_default_constructor (temporary legacy)
 template<typename _Tp>
   struct has_trivial_default_constructor
   : public integral_constant<bool, __has_trivial_constructor(_Tp)>
   { } _GLIBCXX_DEPRECATED;

 /// has_trivial_copy_constructor (temporary legacy)
 template<typename _Tp>
   struct has_trivial_copy_constructor
   : public integral_constant<bool, __has_trivial_copy(_Tp)>
   { } _GLIBCXX_DEPRECATED;

 /// has_trivial_copy_assign (temporary legacy)
 template<typename _Tp>
   struct has_trivial_copy_assign
   : public integral_constant<bool, __has_trivial_assign(_Tp)>
   { } _GLIBCXX_DEPRECATED;

They were in C++0x drafts but were removed before the final C++11
standard, so have never been part of ISO C++. They've been deprecated
since GCC 5.1: https://gcc.gnu.org/gcc-5/changes.html

People who need that functionality can still use the built-ins
directly, we don't need to define non-standard traits.

Alternatively, we could move them to __gnu_cxx, so they don't pollute
the namespace by default, or only define them when __STRICT_ANSI__ is
not defined. I prefer simply removing them.




Re: Remove deprecated std::has_trivial_xxx traits

2016-07-27 Thread Daniel Krügler
2016-07-27 20:25 GMT+02:00 Jonathan Wakely :
> I propose that we remove the following non-standard traits in GCC 7:
>
>  /// has_trivial_default_constructor (temporary legacy)
>  template<typename _Tp>
>    struct has_trivial_default_constructor
>    : public integral_constant<bool, __has_trivial_constructor(_Tp)>
>    { } _GLIBCXX_DEPRECATED;
>
>  /// has_trivial_copy_constructor (temporary legacy)
>  template<typename _Tp>
>    struct has_trivial_copy_constructor
>    : public integral_constant<bool, __has_trivial_copy(_Tp)>
>    { } _GLIBCXX_DEPRECATED;
>
>  /// has_trivial_copy_assign (temporary legacy)
>  template<typename _Tp>
>    struct has_trivial_copy_assign
>    : public integral_constant<bool, __has_trivial_assign(_Tp)>
>    { } _GLIBCXX_DEPRECATED;
>
> They were in C++0x drafts but were removed before the final C++11
> standard, so have never been part of ISO C++. They've been deprecated
> since GCC 5.1: https://gcc.gnu.org/gcc-5/changes.html
>
> People who need that functionality can still use the built-ins
> directly, we don't need to define non-standard traits.
>
> Alternatively, we could move them to __gnu_cxx, so they don't pollute
> the namespace by default, or only define them when __STRICT_ANSI__ is
> not defined. I prefer simply removing them.

+1 for removing them.

- Daniel


Re: Need help with PR71976 combine.c::get_last_value returns a wrong result

2016-07-27 Thread Georg-Johann Lay

Segher Boessenkool wrote:
> On Tue, Jul 26, 2016 at 03:38:18PM +0200, Georg-Johann Lay wrote:
> > >>@@ -13206,6 +13206,13 @@ get_last_value (const_rtx x)
> > >>   && DF_INSN_LUID (rsp->last_set) >= subst_low_luid)
> > >>     return 0;
> > >>
> > >>+  /* If the lookup is for a hard register make sure that value contains at least
> > >>+     as many bits as x does.  */
> > >>+
> > >>+  if (regno < FIRST_PSEUDO_REGISTER
> > >>+      && GET_MODE_PRECISION (rsp->last_set_mode) < GET_MODE_PRECISION (GET_MODE (x)))
> > >>+    return 0;
> > >>+
> > >>   /* If the value has all its registers valid, return it.  */
> > >>   if (get_last_value_validate (&value, rsp->last_set, rsp->last_set_label, 0))
> > >>     return value;
> > >
> > >That might be a bit harsh.
> >
> > In what respect?
>
> It disables all kinds of optimisations that work now.
>
> > >First things first: is the information recorded correct?
> >
> > I think yes, but care must be taken what may be concluded from that
> > information.  From a set of 8 bits you cannot draw conclusion about all 64
> > bits; this should be obvious enough :-)
>
> :-)
>
> > In the above case rsp[regno] holds only information for 1 sub-byte.  In
> > order to get the complete DImode story we would have to get the info for
> > all sub-parts and then put them together...
>
> Yes, and the rsp stuff does not do that.
>
> I am testing the following patch.  Thanks,
>
>
> Segher
>
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 77e0d2b..dec6226 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -9977,6 +9977,9 @@ reg_num_sign_bit_copies_for_combine (const_rtx x, machine_mode mode,
>        return NULL;
>      }
>
> +  if (GET_MODE_PRECISION (rsp->last_set_mode) != GET_MODE_PRECISION (mode))
> +    return NULL;
> +
>    tem = get_last_value (x);
>    if (tem != 0)
>      return tem;


But the problem is inside get_last_value.  You'd have to add such a 
check at /all/ call sites of that function.  Using a value for a hard 
reg that's been set in a smaller mode than the mode of get_last_value is 
just wrong.


For example combine tracks bits which are known to be zero, and if 
get_last_value is used in a similar situation, then the conclusion about 
zero-bits is also wrong.
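
The concern can be shown with a toy model in plain C. This is deliberately not GCC code: struct last_set_info, lookup_unchecked, lookup_checked and known_zero_bits are hypothetical and only mimic the idea that a value recorded by a narrow set says nothing about the bits above it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a per-register "last value" record. */
struct last_set_info {
    uint64_t value;       /* value written by the last set */
    unsigned precision;   /* number of low bits that set actually wrote */
};

/* Unchecked lookup: treats the record as if it described every bit. */
static uint64_t lookup_unchecked(const struct last_set_info *info)
{
    return info->value;
}

/* Checked lookup: refuse to answer when more bits are requested
   than the recorded set covered. */
static bool lookup_checked(const struct last_set_info *info,
                           unsigned wanted_precision, uint64_t *out)
{
    if (info->precision < wanted_precision)
        return false;
    *out = info->value;
    return true;
}

/* Bits of a width-bit quantity that a known value proves to be zero. */
static uint64_t known_zero_bits(uint64_t value, unsigned width)
{
    uint64_t mask = width >= 64 ? UINT64_MAX : (UINT64_C(1) << width) - 1;
    return ~value & mask;
}
```

With a record of { 0x05, 8 }, feeding lookup_unchecked() into known_zero_bits(..., 64) claims that the upper 56 bits are zero although nothing is known about them, while lookup_checked(..., 64, ...) declines to answer.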


Would be interesting to patch get_last_value so that it issues some 
diagnostic if


if (regno < FIRST_PSEUDO_REGISTER
    && (GET_MODE_PRECISION (rsp->last_set_mode)
        < GET_MODE_PRECISION (GET_MODE (x))))

and then run the testsuite against that compiler.  I would expect that 
the condition doesn't even trigger on x86.


Johann



Re: Need help with PR71976 combine.c::get_last_value returns a wrong result

2016-07-27 Thread Segher Boessenkool
On Wed, Jul 27, 2016 at 09:14:27PM +0200, Georg-Johann Lay wrote:
> >diff --git a/gcc/combine.c b/gcc/combine.c
> >index 77e0d2b..dec6226 100644
> >--- a/gcc/combine.c
> >+++ b/gcc/combine.c
> >@@ -9977,6 +9977,9 @@ reg_num_sign_bit_copies_for_combine (const_rtx x, machine_mode mode,
> >   return NULL;
> > }
> > 
> >+  if (GET_MODE_PRECISION (rsp->last_set_mode) != GET_MODE_PRECISION (mode))
> >+return NULL;
> >+
> >   tem = get_last_value (x);
> >   if (tem != 0)
> > return tem;
> 
> But the problem is inside get_last_value.  You'd have to add such a 
> check at /all/ call sites of that function.  Using a value for a hard 
> reg that's been set in a smaller mode than the mode of get_last_value is 
> just wrong.

Things like nonzero_bits explicitly deal with it.  For MODE_INT values
at least; the code looks a bit shaky.

> For example combine tracks bits which are known to be zero, and if 
> get_last_value is used in a similar situation, then the conclusion about 
> zero-bits is also wrong.

How so?  I don't see it.

> Would be interesting to patch get_last_value so that it issues some 
> diagnostic if
> 
> if (regno < FIRST_PSEUDO_REGISTER
>     && (GET_MODE_PRECISION (rsp->last_set_mode)
>         < GET_MODE_PRECISION (GET_MODE (x))))
> 
> and then run the testsuite against that compiler.  I would expect that 
> the condition doesn't even trigger on x86.

Why restrict this to hard regs at all?  Yes I know it isn't supposed
to happen for pseudos, but that also means you do not need to check?


Segher


gcc-4.9-20160727 is now available

2016-07-27 Thread gccadmin
Snapshot gcc-4.9-20160727 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20160727/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch 
revision 238801

You'll find:

 gcc-4.9-20160727.tar.bz2 Complete GCC

  MD5=91bcf7e0231ab8a729c8581b6258fdd0
  SHA1=4a86f0fd083df1fbc353b524d31844245aa31404

Diffs from 4.9-20160720 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.