Re: incremental compiler project

2015-09-08 Thread Diego Novillo
On Thu, Sep 3, 2015 at 1:02 PM, Jeff Law  wrote:

> Agreed.  I think the google project went further, but with Lawrence retiring, 
> I think it's been abandoned.

We got up to the point where we could store and re-use pre-parsed
images of headers.  The big problem was those headers with exposed
references (e.g. stddef.h).  Our goal was to create a precursor for
the future C++ modules.  We did not pursue the project any longer
after we started focusing more on Clang/LLVM.

Right now, we are focusing on C++ modules.  In our codebase, not
having to parse headers over and over is the single largest win in
compile time.


Diego.


Re: How to allocate memory safely in RTL, preferably on the stack? (relating to the RTL-level if-converter)

2015-09-08 Thread Jeff Law

On 09/08/2015 12:05 PM, Abe wrote:

> Dear all,
>
> In order to be able to implement this idea for stores, I think I need
> to make some changes to the RTL if-converter such that it will
> sometimes add -- to the code being compiled -- a new slot/variable in
> the stack frame.  This memory needs to be addressable via a pointer
> in the code being generated, so AFAIK just allocating a new
> pseudo-register won't work and AFAIK using an RTL "scratch" register
> also won't work.  I also want to do my best to ensure that this
> memory is thread-local.  For those reasons, I'm asking about the
> stack.

Look at assign_stack_local.


Jeff


Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Aditya K
IIUC, in the haifa-sched.c, the default scheduling algorithm seems to be 
top-down (before reload).
Is there a way to schedule the other way (bottom up), or both ways?

As a use case for bottom-up or some other heuristic:
Currently, the first priority in the selection is given to the longest path, in 
some cases this may produce code with stalls at the end of the basic block. 
Whereas in the case of combined top-down + bottom-up scheduling we would end up 
having stalls in the middle of the basic block.
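
To make the question concrete, here is a minimal sketch of top-down list
scheduling with longest-path priority.  It is a toy single-issue model in
Python, not haifa-sched.c; the insn names, latencies and function names
are all made up for illustration, and the dependence graph is assumed to
be acyclic.

```python
from collections import defaultdict


def longest_path_priority(latency, succs):
    """Latency-weighted longest path from each insn to the end of the block."""
    prio = {}

    def visit(n):
        if n not in prio:
            prio[n] = latency[n] + max((visit(s) for s in succs[n]), default=0)
        return prio[n]

    for n in latency:
        visit(n)
    return prio


def top_down_schedule(latency, deps):
    """Forward list scheduling on a hypothetical single-issue machine.

    latency: {insn: cycles until its result is available}
    deps:    iterable of (producer, consumer) pairs (must form a DAG)
    Returns {insn: issue cycle}.
    """
    succs, preds = defaultdict(list), defaultdict(list)
    for producer, consumer in deps:
        succs[producer].append(consumer)
        preds[consumer].append(producer)
    prio = longest_path_priority(latency, succs)

    issue, cycle = {}, 0
    while len(issue) < len(latency):
        ready = [n for n in latency
                 if n not in issue
                 and all(p in issue and issue[p] + latency[p] <= cycle
                         for p in preds[n])]
        if ready:
            # Highest longest-path priority first; the name breaks ties.
            best = min(ready, key=lambda n: (-prio[n], n))
            issue[best] = cycle
        cycle += 1  # an empty ready list means this cycle is a stall
    return issue


def stall_cycles(issue):
    """Cycles up to the last issue in which nothing was issued."""
    used = set(issue.values())
    return [c for c in range(max(used) + 1) if c not in used]


# Toy block: load -> add -> store, plus one independent insn.
latency = {"load": 3, "add": 1, "store": 1, "i1": 1}
deps = [("load", "add"), ("add", "store")]
issue = top_down_schedule(latency, deps)
print(sorted(issue.items(), key=lambda kv: kv[1]))
print("stall cycles:", stall_cycles(issue))
```

In this toy block there is only one insn off the critical path, so one
cycle of the load latency cannot be covered whichever direction is used.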

Thanks,
-Aditya

  

Re: Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Segher Boessenkool
On Tue, Sep 08, 2015 at 06:39:19PM +, Aditya K wrote:
> IIUC, in the haifa-sched.c, the default scheduling algorithm seems to be 
> top-down (before reload).
> Is there a way to schedule the other way (bottom up), or both ways?
> 
> As a use case for bottom-up or some other heuristic:
> Currently, the first priority in the selection is given to the longest path, 
> in some cases this may produce code with stalls at the end of the basic 
> block. Whereas in the case of combined top-down + bottom-up scheduling we 
> would end up having stalls in the middle of the basic block.

Why would that be better?


Segher


Re: Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Jeff Law

On 09/08/2015 12:39 PM, Aditya K wrote:

> IIUC, in the haifa-sched.c, the default scheduling algorithm seems to
> be top-down (before reload). Is there a way to schedule the other way
> (bottom up), or both ways?

Not that I'm aware of.  Note that region scheduling allows insns to move
between basic blocks to help fill the bubbles that can occur at the end
of a block.

> As a use case for bottom-up or some other heuristic: Currently, the
> first priority in the selection is given to the longest path, in some
> cases this may produce code with stalls at the end of the basic
> block. Whereas in the case of combined top-down + bottom-up
> scheduling we would end up having stalls in the middle of the basic
> block.

GCC's original scheduler worked bottom-up until ~1997.  IBM Haifa's work
turned it into a top-down model and was a small, but clear improvement.


There's certainly better things that can be done than strictly top-down 
or bottom-up, but revamping the scheduler again hasn't been seen as a 
major win for the most common processors GCC targets these days.  Thus 
it hasn't been a significant area of focus.


Jeff


Re: Why scheduler do not re-emit REG_DEAD notes?

2015-09-08 Thread Jeff Law

On 09/07/2015 10:05 AM, Konstantin Vladimirov wrote:

> Hi,
>
> In a private backend for GCC 5.2.0, we have a target-specific scheduler
> (running in the TARGET_SCHED_FINISH hook) that does some instruction
> packing/pairing in sched2 and relies on REG_DEAD notes being
> correct.
>
> But they aren't, because inside haifa-sched.c, which runs first
> in the sched2 pass, the reemit_notes function processes only the
> REG_SAVE_NOTE case.  After this scheduler, some insns with REG_DEAD on
> a register, say r1, might be moved before a previous use of r1 (input
> dependency case), and things become totally wrong.
>
> For now I have applied a minimal patch locally to fix it (I just added
> a separate REG_DEAD case).
>
> But maybe this is part of the design, and maybe it is generally true
> that we can't rely on correct REG_DEAD notes in a platform-specific
> scheduler?

You cannot rely on death notes within the scheduler.  That's been in
its design as long as I can remember (circa 1992).
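
To illustrate why the notes go stale and what recomputing them involves,
here is a toy model in Python.  It is only a sketch: insns are
hypothetical (defs, uses) pairs over register names, and a backward
liveness scan stands in for real note computation.  In GCC itself, as far
as I know, the usual way to get fresh notes after scheduling is to rerun
the df note problem (df_note_add_problem followed by df_analyze) rather
than to extend reemit_notes; please verify that against your GCC version.

```python
def recompute_dead_notes(insns, live_out):
    """Toy REG_DEAD recomputation for one straight-line block.

    insns:    list of (defs, uses) pairs of register-name sets, in
              program order (a hypothetical representation, not RTL)
    live_out: registers live at the end of the block
    Returns one set per insn: the registers whose last use is that insn.
    """
    live = set(live_out)
    notes = [set() for _ in insns]
    for i in reversed(range(len(insns))):
        defs, uses = insns[i]
        # A used register dies here if nothing after this insn needs it.
        # Registers the insn also sets are skipped in this toy model.
        notes[i] = {r for r in uses if r not in live and r not in defs}
        # Standard backward liveness step: live_in = (live_out - defs) | uses
        live = (live - set(defs)) | set(uses)
    return notes


# Two insns that both only read r1 (the "input dependency" case):
#   A: r2 = r1 + 1
#   B: r3 = r1 * 2      <- r1 dies here in the original order
a = ({"r2"}, {"r1"})
b = ({"r3"}, {"r1"})

print(recompute_dead_notes([a, b], live_out={"r2", "r3"}))
# If the scheduler swaps them to B, A, a note carried along with B would
# wrongly claim r1 is dead before A reads it.  Recomputing moves the
# death to A, which is now the last reader.
print(recompute_dead_notes([b, a], live_out={"r2", "r3"}))
```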


jeff


RE: Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Aditya K



> Subject: Re: Combined top-down and bottom-up instruction scheduler
> To: hiradi...@msn.com; gcc@gcc.gnu.org
> CC: vmaka...@redhat.com
> From: l...@redhat.com
> Date: Tue, 8 Sep 2015 12:51:24 -0600
>
> On 09/08/2015 12:39 PM, Aditya K wrote:
>> IIUC, in the haifa-sched.c, the default scheduling algorithm seems to
>> be top-down (before reload). Is there a way to schedule the other way
>> (bottom up), or both ways?
> Not that I'm aware of. Note that region scheduling allows insns to move
> between basic blocks to help fill the bubbles that can occur at the end
> of a block.
>
>>
>> As a use case for bottom-up or some other heuristic: Currently, the
>> first priority in the selection is given to the longest path, in some
>> cases this may produce code with stalls at the end of the basic
>> block. Whereas in the case of combined top-down + bottom-up
>> scheduling we would end up having stalls in the middle of the basic
>> block.
> GCC's original scheduler worked bottom-up until ~1997. IBM Haifa's work
> turned it into a top-down model and was a small, but clear improvement.
>
> There's certainly better things that can be done than strictly top-down
> or bottom-up, but revamping the scheduler again hasn't been seen as a
> major win for the most common processors GCC targets these days. Thus
> it hasn't been a significant area of focus.

Do you have pointers on places to look if I want to explore bottom-up, or
maybe a combination of the two?

Thanks,
-Aditya

>
> Jeff
  

Re: Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Sebastian Pop
Segher Boessenkool wrote:
> On Tue, Sep 08, 2015 at 06:39:19PM +, Aditya K wrote:
> > IIUC, in the haifa-sched.c, the default scheduling algorithm seems to be 
> > top-down (before reload).
> > Is there a way to schedule the other way (bottom up), or both ways?
> > 
> > As a use case for bottom-up or some other heuristic:
> > Currently, the first priority in the selection is given to the longest 
> > path, in some cases this may produce code with stalls at the end of the 
> > basic block. Whereas in the case of combined top-down + bottom-up 
> > scheduling we would end up having stalls in the middle of the basic block.
> 
> Why would that be better?
> 

Top-down scheduling does a good job at the beginning of a basic block.
Bottom-up does a good job at the end of a basic block and then a poor job at the
beginning of the basic block.

When you combine the two (like in the swing scheduler, schedule one insn at the
top, then one insn at the bottom, and alternate), the result is good code at the
top, good code at the bottom, and in the middle you may have some poor code.

We have seen an example where GCC does a poor job at scheduling at the end of a
block, and we do not have a way to improve the situation, because we don't have
enough insns outside the critical path to be scheduled towards the end of the 
bb.
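
The alternating placement can be sketched on a toy dependence DAG.  This
only illustrates the discipline of placing one insn at the top, then one
at the bottom: it ignores cycle timing and resources, it is not GCC's
swing modulo scheduler, and all names are hypothetical.

```python
from collections import defaultdict


def alternate_schedule(latency, deps):
    """Place insns alternately at the top and the bottom of a block.

    latency: {insn: latency}; deps: (producer, consumer) pairs (a DAG).
    Returns the resulting instruction order as a list.
    """
    succs, preds = defaultdict(set), defaultdict(set)
    for producer, consumer in deps:
        succs[producer].add(consumer)
        preds[consumer].add(producer)

    def longest(n, neighbours, memo):
        if n not in memo:
            memo[n] = latency[n] + max(
                (longest(m, neighbours, memo) for m in neighbours[n]),
                default=0)
        return memo[n]

    height, depth = {}, {}
    for n in latency:
        longest(n, succs, height)  # longest path to the end of the block
        longest(n, preds, depth)   # longest path from the start of the block

    top, bottom, placed = [], [], set()
    from_top = True
    while len(placed) < len(latency):
        if from_top:
            # Ready from the top: every producer is already placed on top.
            ready = [n for n in latency
                     if n not in placed and preds[n] <= set(top)]
            pick = min(ready, key=lambda n: (-height[n], n))
            top.append(pick)
        else:
            # Ready from the bottom: every consumer is already at the bottom.
            ready = [n for n in latency
                     if n not in placed and succs[n] <= set(bottom)]
            pick = min(ready, key=lambda n: (-depth[n], n))
            bottom.append(pick)
        placed.add(pick)
        from_top = not from_top
    return top + bottom[::-1]


def is_valid_order(order, deps):
    """True if every producer precedes its consumer."""
    pos = {n: i for i, n in enumerate(order)}
    return all(pos[p] < pos[c] for p, c in deps)


latency = {"load": 3, "add": 1, "store": 1, "i1": 1, "i2": 1}
deps = [("load", "add"), ("add", "store")]
order = alternate_schedule(latency, deps)
print(order, is_valid_order(order, deps))
```

Whatever is left over when the two ends meet lands in the middle, which
is where this scheme tends to put the poorer code.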

Sebastian


Re: Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Vladimir Makarov

On 09/08/2015 02:51 PM, Jeff Law wrote:

> On 09/08/2015 12:39 PM, Aditya K wrote:
>> IIUC, in the haifa-sched.c, the default scheduling algorithm seems to
>> be top-down (before reload). Is there a way to schedule the other way
>> (bottom up), or both ways?
> Not that I'm aware of.  Note that region scheduling allows insns to
> move between basic blocks to help fill the bubbles that can occur at
> the end of a block.

Also, the current scheduler has a lot of algorithms to reduce the
problem, such as the backtracking scheduler written by Bernd Schmidt or
some form of lookahead.

>> As a use case for bottom-up or some other heuristic: Currently, the
>> first priority in the selection is given to the longest path, in some
>> cases this may produce code with stalls at the end of the basic
>> block. Whereas in the case of combined top-down + bottom-up
>> scheduling we would end up having stalls in the middle of the basic
>> block.
> GCC's original scheduler worked bottom-up until ~1997.  IBM Haifa's
> work turned it into a top-down model and was a small, but clear
> improvement.

As I remember, it was written by Mike Tiemann.  As a rule, a bottom-up
scheduler generates worse code than a top-down one.  By the way,
implementing a bottom-up scheduler in GCC would require implementing a
reverse (N)DFA from a processor description to recognize resource
constraints.

> There's certainly better things that can be done than strictly
> top-down or bottom-up, but revamping the scheduler again hasn't been
> seen as a major win for the most common processors GCC targets these
> days. Thus it hasn't been a significant area of focus.

Yes, that is true for OOO execution processors, which can rearrange insns
and execute them speculatively looking through several branches.  For
such processors, software pipelining is more important, as the processors
can look only through a few branches whereas software pipelining could
look through any number of branches.  That is why the Intel compiler did
not have any insn scheduler (but had software pipelining) until the
introduction of Intel Atom, which was originally an in-order processor.


Actually, I believe dealing with the variable/unknown latency of load insns
(depending on where data are placed in a cache or memory) would be more
important than a bottom-up or hybrid scheduler.  Balanced scheduling,
which deals with this problem, was implemented by Alexander Monakov about
7-8 years ago as Google internship work, but it was not included because
at that time its advantages were not confirmed on SPEC2000.  It would be
interesting to reconsider and re-evaluate it on modern processors and
scientific benchmarks with big data.


For in-order processors, we also have another scheduler (the selective
one), which does additional transformations (like register renaming and
non-modulo software pipelining) that could be more important than
top-down/bottom-up scheduling.  It gave a 1-2% improvement on Itanium
SPEC2000 in comparison with the Haifa scheduler.


Re: Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Jeff Law

On 09/08/2015 01:40 PM, Vladimir Makarov wrote:



> As I remember, it was written by Mike Tiemann.

Correct.

> As a rule, a bottom-up scheduler generates worse code than a top-down
> one.

Indeed, that was one of the key things we were looking to get from the
Haifa scheduler, along with improved superscalar support and some support
for region scheduling & speculation.

> Yes, that is true for OOO execution processors, which can rearrange insns
> and execute them speculatively looking through several branches.  For
> such processors, software pipelining is more important, as the processors
> can look only through a few branches whereas software pipelining could
> look through any number of branches.  That is why the Intel compiler did
> not have any insn scheduler (but had software pipelining) until the
> introduction of Intel Atom, which was originally an in-order processor.

Correct.  Latency scheduling just isn't that important for OOO and
instead you look at scheduling to mitigate costs for large latency
operations (i.e., cache misses and transcendental functions).  You might
also attack secondary issues like throughput at the retirement stage for
example.

> Actually, I believe dealing with the variable/unknown latency of load insns
> (depending on where data are placed in a cache or memory) would be more
> important than a bottom-up or hybrid scheduler.

Agreed.  This is in line with what the HP guys were seeing as they
transitioned to the PA8000.

> Balanced scheduling, which deals with this problem, was implemented by
> Alexander Monakov about 7-8 years ago as Google internship work, but it
> was not included because at that time its advantages were not confirmed
> on SPEC2000.  It would be interesting to reconsider and re-evaluate it
> on modern processors and scientific benchmarks with big data.

Agreed.

> For in-order processors, we also have another scheduler (the selective
> one), which does additional transformations (like register renaming and
> non-modulo software pipelining) that could be more important than
> top-down/bottom-up scheduling.  It gave a 1-2% improvement on Itanium
> SPEC2000 in comparison with the Haifa scheduler.

Right.

Jeff



Re: Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Jeff Law

On 09/08/2015 01:24 PM, Aditya K wrote:





> > Subject: Re: Combined top-down and bottom-up instruction scheduler
> > To: hiradi...@msn.com; gcc@gcc.gnu.org
> > CC: vmaka...@redhat.com
> > From: l...@redhat.com
> > Date: Tue, 8 Sep 2015 12:51:24 -0600
> >
> > On 09/08/2015 12:39 PM, Aditya K wrote:
> >> IIUC, in the haifa-sched.c, the default scheduling algorithm seems to
> >> be top-down (before reload). Is there a way to schedule the other way
> >> (bottom up), or both ways?
> > Not that I'm aware of. Note that region scheduling allows insns to move
> > between basic blocks to help fill the bubbles that can occur at the end
> > of a block.
> >
> >> As a use case for bottom-up or some other heuristic: Currently, the
> >> first priority in the selection is given to the longest path, in some
> >> cases this may produce code with stalls at the end of the basic
> >> block. Whereas in the case of combined top-down + bottom-up
> >> scheduling we would end up having stalls in the middle of the basic
> >> block.
> > GCC's original scheduler worked bottom-up until ~1997. IBM Haifa's work
> > turned it into a top-down model and was a small, but clear improvement.
> >
> > There's certainly better things that can be done than strictly top-down
> > or bottom-up, but revamping the scheduler again hasn't been seen as a
> > major win for the most common processors GCC targets these days. Thus
> > it hasn't been a significant area of focus.
>
> Do you have pointers on places to look if I want to explore bottom-up, or
> maybe a combination of the two?

Not immediately handy.  I'd comb through PLDI through the 1990s and 
early 2000s and possibly Morgan's compiler book.


jeff



RE: Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Evandro Menezes
> > Yes, that is true for OOO execution processors, which can rearrange
> > insns and execute them speculatively looking through several branches.
> > For such processors, software pipelining is more important, as the
> > processors can look only through a few branches whereas software
> > pipelining could look through any number of branches.  That is why the
> > Intel compiler did not have any insn scheduler (but had software
> > pipelining) until the introduction of Intel Atom, which was originally
> > an in-order processor.
> Correct.  Latency scheduling just isn't that important for OOO and instead
> you look at scheduling to mitigate costs for large latency operations (i.e.,
> cache misses and transcendental functions).  You might also attack secondary
> issues like throughput at the retirement stage for example.

Our motivation stems from the fact that even modern, aggressively OOO
processors don't have orthogonal resources.  Some insns depend on expensive
circuitry (area- or power-wise) that is added only once, making such insns
simply scalar, though most other insns enjoy multiple resources capable
of executing them as superscalar.  That's why we believe that a hybrid
approach might yield good results.  We don't have data yet, since getting
it probably requires implementing the approach first.

I'd also argue that looking at an OOO pipeline in a steady state is not the
only approach.  It's also important to consider how quickly the pipeline can
be replenished or warmed up to reach a steady state.

-- 
Evandro Menezes  Austin, TX



Re: Combined top-down and bottom-up instruction scheduler

2015-09-08 Thread Jeff Law

On 09/08/2015 03:12 PM, Evandro Menezes wrote:

>> cache misses and transcendental functions).  You might also attack
>> secondary issues like throughput at the retirement stage for example.
>
> Our motivation stems from the fact that even modern, aggressively OOO
> processors don't have orthogonal resources.  Some insns depend on expensive
> circuitry (area- or power-wise) that is added only once, making such insns
> simply scalar, though most other insns enjoy multiple resources capable
> of executing them as superscalar.  That's why we believe that a hybrid
> approach might yield good results.  We don't have data yet, since getting
> it probably requires implementing the approach first.
>
> I'd also argue that looking at an OOO pipeline in a steady state is not the
> only approach.  It's also important to consider how quickly the pipeline can
> be replenished or warmed up to reach a steady state.

Which is why I mentioned optimizing for throughput at the retirement
stage rather than traditional latency scheduling.


That's from a real world case -- the PA8000 where retirement bandwidth 
was at a premium (relative to functional unit bandwidth).


jeff


gcc-5-20150908 is now available

2015-09-08 Thread gccadmin
Snapshot gcc-5-20150908 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/5-20150908/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-5-branch 
revision 227567

You'll find:

 gcc-5-20150908.tar.bz2   Complete GCC

  MD5=07f2be533c15d7b0668d507aeab9179a
  SHA1=61c14513a4e0d86dd9eadb09844d17eb05f02209

Diffs from 5-20150901 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-5
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


GCC branches/st (was Re: Offer of help with move to git)

2015-09-08 Thread Jason Merrill
[David, we're talking about moving the GCC repository to Git, and how to 
handle subdirectory branches.]


On 09/04/2015 12:17 PM, Joseph Myers wrote:

> branches/st is more complicated than simply being a container for
> subdirectory branches.  It has a README file, five cli* subdirectories
> that look like branches of GCC, two subdirectories binutils/ and
> mono-based-binutils/ that are essentially separate projects (this is not
> of course any problem for git - having a branch sharing no ancestry with
> other branches is absolutely fine), and a subdirectory tags that contains
> tags of those various branches (I think).  So you want to say:
> branches/st/tags/* are tags; branches/st/* (subdirectories other than
> tags/) are branches; branches/st/README I don't know how you should handle
> (I suppose it could be a branch on its own, that just contains a README
> file, with commits affecting both README and other branches being split
> into separate commits to each piece; it is something that's still
> meaningful after the conversion and that ought to end up in the converted
> repository in some form).
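
The mapping rule described above can be written down as a small
classifier.  This is a sketch in Python of the rule only, not a
configuration for any particular conversion tool; the function name, the
returned labels and the example tag name are hypothetical.

```python
def classify_st_path(svn_path):
    """Classify an SVN path under branches/st/.

    Returns ("tag", name), ("branch", name), ("undecided", "README"),
    or None for paths outside branches/st/ or for the containers
    themselves.
    """
    prefix = "branches/st/"
    if not svn_path.startswith(prefix):
        return None
    parts = [p for p in svn_path[len(prefix):].split("/") if p]
    if not parts:
        return None
    if parts[0] == "README":
        # How to convert the README is still an open question.
        return ("undecided", "README")
    if parts[0] == "tags":
        # branches/st/tags/* are tags; the bare tags/ directory is not.
        return ("tag", parts[1]) if len(parts) > 1 else None
    # Every other subdirectory of branches/st/ is a branch.
    return ("branch", parts[0])


for path in ("branches/st/binutils/ld/Makefile",
             "branches/st/mono-based-binutils/README",
             "branches/st/tags/example-tag/gcc/ChangeLog",
             "branches/st/README"):
    print(path, "->", classify_st_path(path))
```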


Hmm.  The README does complicate things; maybe we should just leave it
as a single branch and let interested people choose how to deal with it.
David, you were the last committer; any opinions?


Jason