What to do with hardware exception (unaligned access) ? ARM920T processor
Hi! Processor ARM920T, chip Atmel at91rm9200. Simple C code: char c[30]; unsigned short *pN = &c[1]; *pN = 0x1234; causes a hardware exception - a memory abort (used to implement memory protection or virtual memory). We have a lot of source code, with pieces of code like this, which must be ported from x86 to ARM9. Are there any compiler options to handle this exception? I compile it using arm-elf-gcc 4.3.2 under Linux, binutils-2.18. Thanks. Best regards, Vladimir
Re: What to do with hardware exception (unaligned access) ? ARM920T processor
On 10/1/08, Vladimir Sterjantov <[EMAIL PROTECTED]> wrote: > Processor ARM920T, chip Atmel at91rm9200. > > char c[30]; > unsigned short *pN = &c[1]; > > *pN = 0x1234; Accesses to shorts on ARM need to be aligned to an even address, and longs to a 4-byte address. Otherwise the access (e.g., for a 4-byte word pointer) returns *(p & ~3) >>> (p & 3) (where >>> is byte rotate, not bit shift), or causes a memory fault, if that's how your system is configured. If you don't want to make the code portable and you are running a recent Linux, a fast fix is to echo 2 > /proc/cpu/alignment which should make the kernel trap misaligned accesses and fix them up for you, with a loss in performance of course. The real answer is to fix the code... M
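For reference, the usual portable fix is to route the misaligned store through memcpy and let the compiler choose the access sequence for the target. A minimal sketch (the helper name and surrounding code are illustrative, not from the original sources):

#include <string.h>

/* Store a 16-bit value at a possibly misaligned address.  On
   strict-alignment targets such as the ARM920T this compiles to byte
   stores; on x86 it typically becomes a single 16-bit store. */
static void store_u16_unaligned (void *p, unsigned short v)
{
  memcpy (p, &v, sizeof v);
}

int main (void)
{
  char c[30];
  store_u16_unaligned (&c[1], 0x1234);   /* safe even though &c[1] is odd */
  return 0;
}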
RE: Defining a common plugin machinery
Dear all, I noticed a long discussion about plugins for GCC. It seems that it's currently moving toward important legal issues; however, I wanted to backtrack and just mention that we at INRIA and in the MILEPOST project are clearly interested in having a proper plugin system in the mainline of GCC, which will simplify our work on automatically tuning optimization heuristics (cost models, etc) or easily adding new transformations and modules for program analysis. We currently have a simple plugin system within the Interactive Compilation Interface (http://gcc-ici.sourceforge.net) that can load external DLL plugins (transparently, through environment variables to avoid changing project Makefiles, or through the command line), can substitute the original Pass Manager to be able to call any passes (new analysis passes, for example) in any legal order, and has an event mechanism to raise events at any place in GCC and pass data to the plugin (such as information about cost model dependencies) or return parameters (such as decisions about transformations, for example). Since it's relatively simple, we are currently able to port it to new versions of GCC without problems; however, naturally, we would like to have this functionality within the GCC mainline with a defined API (this is what I discussed with Taras from Mozilla during the GCC Summit this year). I believe it may help make GCC a modular compiler and simplify future designs (and can be in line with the idea to use C++ for GCC development if Ian moves this project forward). By the way, Hugh Leather also developed an interesting plugin system for GCC that allows substituting internal GCC functions with external ones within plugins to enable hooks inside GCC (he mentioned that he plans to release it soon)... Furthermore, we will then be able to use our current MILEPOST tools and Collective Compilation Framework to automatically tune the default GCC optimization heuristics for performance, code size or power consumption (instead of using -O1,2,3 levels) for a particular architecture before new versions of GCC are actually released (or for additional testing of a compiler using different combinations and orders of passes). And when the compiler is released, users can further tune their particular programs interactively or automatically through the external tools and GCC plugins. By the way, we are extending the current ICI for GCC 4.4 to add cost-model tuning for major optimizations (GRAPHITE, vectorization, inlining, scheduling, register allocation, unrolling, etc) and provide function cloning with different optimizations, and naturally would like to make it compatible with the potential future common GCC plugin system, so I hope we will be able to agree on a common plugin design soon and move it forward ;) ... Regards, Grigori = Grigori Fursin, INRIA, France http://fursin.net/research > -----Original Message----- > From: Taras Glek [mailto:[EMAIL PROTECTED] > Sent: Tuesday, September 16, 2008 11:43 PM > To: Diego Novillo > Cc: Basile STARYNKEVITCH; gcc@gcc.gnu.org; Sean Callanan; Albert Cohen; > [EMAIL PROTECTED] > Subject: Re: Defining a common plugin machinery > > Basile STARYNKEVITCH wrote: > > Hello Diego and all, > > > > Diego Novillo wrote: > >> > >> After the FSF gives final approval on the plugin feature, we will need > >> to coordinate towards one common plugin interface for GCC. I don't > >> think we should be adding two different and incompatible plugin > >> harnesses. > > > > What exactly did happen on the FSF side after the last GCC summit?
> > I heard nothing more detailed than the Steering Committee Q&A BOFS and > > the early draft of some legal licence involving plugins. What happened > > on the Steering Committee or legal side since August 2008? Is there any > > announcement regarding FSF approving plugins? > > > >> I am CCing everyone who I know is working on plugin features. > >> Apologies if I missed anyone. > >> > >> I would like to start by understanding what the plugin API should > >> have. What features do we want to support? What should we export? > >> What should be the user interface to incorporate plugin code? At > >> what point in the compilation stage should it kick in? > >> > >> Once we define a common API, we can take the implementation > >> from the existing branches. Perhaps create a common branch? I would > >> also like to understand what the status and features of the > >> different branches are. > > > > > > The MELT plugin machinery is quite specific in its details, and I > > don't believe it can be used -in its current form- for other plugins. > > It really expects the plugin to be a MELT one. > > > > From what I remember of the plugin BOFS (but I may be wrong), an easy > > consensus seems to be that plugins should be loadable thru the command > > line (probably a -fplugin=foo meaning that some foo.so should be > > dlopen-ed), that they could take a single s
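For concreteness, the loading step being discussed (a -fplugin=foo option causing some foo.so to be dlopen-ed) boils down to something like the sketch below; the entry-point name plugin_init and its signature are illustrative assumptions, not an agreed GCC plugin API:

#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical entry point every plugin would export. */
typedef int (*plugin_init_fn) (const char *args);

static int
load_plugin (const char *path, const char *args)
{
  void *handle = dlopen (path, RTLD_NOW | RTLD_GLOBAL);
  if (!handle)
    {
      fprintf (stderr, "cannot load plugin %s: %s\n", path, dlerror ());
      return 1;
    }

  plugin_init_fn init = (plugin_init_fn) dlsym (handle, "plugin_init");
  if (!init)
    {
      fprintf (stderr, "%s does not export plugin_init: %s\n", path, dlerror ());
      dlclose (handle);
      return 1;
    }

  /* Hand the plugin its argument string and let it register itself. */
  return init (args);
}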
Re: What to do with hardware exception (unaligned access) ? ARM920T processor
On Wednesday 01 October 2008, Martin Guy wrote: > If you don't want to make the code portable and you are running a > recent Linux, a fast fix is to > echo 2 > /proc/cpu/alignment > which should make the kernel trap misaligned accesses and fix them up > for you, with a loss in performance of course. The real answer is to > fix the code... ...and this is where -Wcast-align should help. The OP should also have a look at -Wpadded and -Wpacked, because this may expose similar pitfalls. This writeup looks like a good start for the OP: http://lecs.cs.ucla.edu/wiki/index.php/XScale_alignment
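A minimal example of what -Wcast-align catches - essentially the original snippet with the pointer conversion made explicit (the warning fires on an explicit cast, and only on targets where the cast increases the required alignment):

char buf[30];

unsigned short *
get_misaligned (void)
{
  /* char * has 1-byte alignment, unsigned short * needs 2 bytes on ARM,
     so -Wcast-align flags this cast. */
  return (unsigned short *) &buf[1];
}

Compiling this with something like arm-elf-gcc -Wcast-align -c t.c should produce a warning along the lines of "cast increases required alignment of target type".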
Re: GCC 4.2.2 arm-linux-gnueabi: c++ exceptions handling?
Hello all, I've found the cause of my problem - it's binutils 2.17.50. Using ld 2.18, or even 2.17.90 creates workable libstdc++.so. Regards, Sergei Sergei Poselenov wrote: Hello all, I've built the above cross-compiler and ran the GCC testsuite. Noted a lot of c++ tests failed with the same output: ... terminate called after throwing an instance of 'int' terminate called recursively Aborted ... Compiler details: Reading specs from /opt/eldk-4.2-arm-2008-09-24/usr/bin/../lib/gcc/arm-linux-gnueabi/4.2.2/specs Target: arm-linux-gnueabi Configured with: /work/psl/eldk-builds/arm-2008-09-24/work/usr/src/denx/BUILD/crosstool-0.43/build/gcc-4.2.2-glibc-20070515T2025-eldk/arm-linux-gnueabi/gcc-4.2.2/configure --target=arm-linux-gnueabi --host=i686-host_pc-linux-gnu --prefix=/var/tmp/eldk.Jb5047/usr/crosstool/gcc-4.2.2-glibc-20070515T2025-eldk/arm-linux-gnueabi --disable-hosted-libstdcxx --with-headers=/var/tmp/eldk.Jb5047/usr/crosstool/gcc-4.2.2-glibc-20070515T2025-eldk/arm-linux-gnueabi/arm-linux-gnueabi/include --with-local-prefix=/var/tmp/eldk.Jb5047/usr/crosstool/gcc-4.2.2-glibc-20070515T2025-eldk/arm-linux-gnueabi/arm-linux-gnueabi --disable-nls --enable-threads=posix --enable-symvers=gnu --enable-__cxa_atexit --enable-languages=c,c++,java --enable-shared --enable-c99 --enable-long-long --without-x Thread model: posix gcc version 4.2.2 However, testing results in http://gcc.gnu.org/ml/gcc-testresults/2007-09/msg00570.html states that this should work. I even downloaded and built the exact pre-release version used in the above tests and tried it - all the same. I wonder could it be the kernel or Glibc/binutils issue? I'm running 2.6.21.5, Glibc is 2.6 (Fedora Core 7 release), binutils is 2.17.50.0.12. Could someone having the 4.2 release series compiler configured for ARM EABI target try this simple test: extern "C" void abort(void); #define CI(stmt) try { stmt; abort(); } catch (int) { } struct has_destructor { ~has_destructor() { } }; struct no_destructor { }; int PI(int& i) { return i++; } int main(int argc, char *argv[]) { (argc+1 ? has_destructor() : throw 0); CI((argc+1 ? throw 0 : has_destructor())); } Build as arm-linux-gnueabi-g++ -o cond1 cond1.C Thanks for any feedback, Sergei
Re: Defining a common plugin machinery
Sorry, I think this bounced twice. Hugh Leather wrote: Hi All, Thanks, Grigori, for mentioning my plugin system, libplugin, which can be found at http://libplugin.sourceforge.net/. I have been meaning to release it but finding the time to finish off the documentation and upload all the newest code to SourceForge has been difficult (both code and docs on SourceForge are some months out of date). The plugin system was built to support MilePost goals for GCC, as we need to be able to capture events during compilation as well as change compilation behaviour. Here are some of the features of the system: *Application agnostic.* The plugin system is not GCC specific but can be used to make any application plugin aware. Plugin management is handled through a shared library which GCC, or any other application, can link to. I think if GCC took a similar approach then it would benefit from the exposure the system received elsewhere and the wider community would also have access to a professionally built plugin system. As the plugin system becomes more powerful, GCC reaps the rewards without having to change a line of code. The other, huge advantage of this, together with the design I'll describe below, is that GCC only has ~10 lines of plugin code; to initialise the library. The rest is working out how to refactor GCC to make it extensible. This way GCC won't be cluttered with nasty plugin stuff that obscures the meaning of the code. Finally, plugins for different applications can coexist. For example, we might have some plugins for the linker, some for driver, some for compiler and some that work in any of those. *Eclipse Inspired.* I've take inspiration from the excellent plugin system for the Eclipse IDE, http://www.eclipse.org. It has proved very successful, so it seemed like a good starting point. *What it is.* * Each plugin has an XML file describing it * Plugins have ExtensionPoints that other plugins can extend * Plugins can have shared libraries * Requires libxml2, libffi, libltdl An ExtensionPoint is one of the fundamental parts of the system. It provides the links between plugins. Each ExtensionPoint is really just an object with one method: bool extend( ExtensionPoint* self, Plugin* extendingPlugin, xmlNodePtr specfication ) This method tells the ExtensionPoint that some other plugin is extending it and gives it the XML that plugin uses. The ExtensionPoint can do whatever it likes with that XML. It might contain symbols pointing to functions it should use, it might be markup for logging text. It could be a list of unroll factors, one for each function or a list of passes to apply to a particular function. You can describe anything in XML. * An Example.* Maybe that's a bit confusing, so here's an example. Suppose we have a plugin which offers a logging or message service. It would have a plugin specification in XML like this: That says it's a plugin for GCC, it has id "simple-message", it uses a certain shared library. It also says it has an extension point called "simple-message.print" and gives the function to call when anyone extends that extension point. This function is called "simpleMessage_extend" and is in the shared library the plugin specified. It looks like this: bool simpleMessage_extend( ExtensionPoint* self, Plugin* extendingPlugin, xmlNodePtr specfication ) { printf( "%s\n", xmlNodeGetContent( specification )); return TRUE; } It simply prints the text content of any plugin that extends it. Another plugin might come along later and have this as its specification: Hello,World! 
Hopefully that little guy should be clear, it prints "Hello, World!" Now the plugin system has taken care of all the dependency management, only required plugins are loaded, etc. All the appropriate extension points are created (only those used) and extensions are applied. There's also a plugin lifecycle allowing things to happen at appropriate times. We could have had our plugins exchange code, remember things until later, do anything in fact. If you can describe it in XML then the plugin system lets you do it. *Ease of Use with Events and JoinPoints* Such simple extension points provide all the power you ever need, but not the ease of use. So, the system also lets you do common things with almost no code in GCC, just a slight refactoring and tiny description in XML. The most common things people want from a plugin system is to be able to listen to events or to replace behaviours. I'll show you how events are added here. Suppose GCC (or another plugin) is going to raise an event called "myEvent". The event will take an int and a string. Here's
query regarding adding a pass to undo final value replacement.
Hi, Based on the conversation in the thread at http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a pass trying to undo final value replacement going. The initial implementation was done by Pranav Bhandarkar when he was employed at Azingo as part of work sponsored by Icera Semiconductor. I've been trying to get this working with my private port over here. We intend to contribute this back once our copyright assignments are sorted and if this will be acceptable to all. I've been getting a few compile time ICEs with this approach and haven't been able to resolve them well enough yet. Whilst doing so, I wanted to check on the approach as outlined below and ask if there's anything that we might have missed or any problem that one can see with us going along these lines. Thanks for your time and patience. cheers Ramana

1) Understanding what scalar evolution does.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Consider the following pseudo code.

function memcpy (src_pointer, dst_pointer)
  src_1 = src_pointer;
  dst_1 = dst_pointer;
L1:
  *dst_1 = *src_1 (Word copy)
  dst++;
  src++;      <-- Inc by 4 bytes, i.e. 1 word.
  conditional jump to L1

  /* This is the exit block of loop 1. The following PHI nodes are added by the loopinit pass to convert the SSA form into "closed loop SSA" (see rewrite_into_loop_closed_ssa in tree-ssa-loop-manip.c). */

  src_2 = PHI <src_1>
  dst_2 = PHI <dst_1>

L2:
  *dst_2 = *src_2 (Byte Copy)
  dst++;
  src++;

Now scalar evolution converts this into

function memcpy (src_pointer, dst_pointer)
  src_1 = src_pointer;
  dst_1 = dst_pointer;
L1:
  *dst_1 = *src_1 (Word copy)
  dst++;
  src++;      <-- Inc by 4 bytes, i.e. 1 word.
  conditional jump to L1

  /* This is the exit block of loop 1. The following PHI nodes are added by the loopinit pass to convert the SSA form into "closed loop SSA" (see rewrite_into_loop_closed_ssa in tree-ssa-loop-manip.c). */

  D.1234_11 = 4 * n  (where 'n' is the number of iterations of L1)
  src_2 = src_pointer + D.1234_11
  D.1235_22 = 4 * n
  dst_2 = dst_pointer + D.1235_22

L2:
  *dst_2 = *src_2 (Byte Copy)
  dst++;
  src++;

Therefore a PHI node is replaced by the final values of src_1 and dst_1, thus introducing extra computations.

2) How to undo what scalar evolution does.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To undo what scalar evolution does, we need to record the changes that scalar evolution makes and then, after the loop optimizations are run, we need to put the PHI nodes that were removed earlier back in place of the computations introduced by scalar evolution. A high level description of the process is listed here. (see tree-scalar-evolution.c and grep for DXP_SPECIFIC)

Explanation of this sub-pass.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Part 1: Record final value replacement related changes.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Final value replacement replaces PHI nodes at the exits of loops with computations based on the number of iterations of the loop. For example:

L1:
  x_1 = src + 4
  ...
  ...
  conditional jump to L1.
  (Loop Exit)
  x_2 = PHI <x_1>   <-- PHI node added by rewrite_into_loop_closed_ssa (see the loopinit pass).

Final value replacement replaces the PHI node by

  ssa_temp_var = 4 * no_of_iterations_of_loop
  x_2 = src + ssa_temp_var;

Therefore a PHI node is replaced by computations. Recording final value replacement related changes is controlled by the variable record_scalar_evolution_changes. When set to a non-zero value, the function record_changed_stmts records the changes made. The changes are recorded in a hashtable changed_stmts_table. The hashtable contains the stmt added, the PHI node for which this stmt was added, and hashcodes for both the stmt and the phi_node. We also note how many computations have been added for each of the removed PHI nodes.
This is done in a linked list pointed to by phi_nodes_info_head.

Part 2: Undo final value replacement related changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is the part where the new computations are removed and the PHI nodes that they replaced are inserted back in. This replacement is contingent on a few conditions. a) All the computations that were added are still present in the basic block, i.e. all the computations are still present in the form in which they were added and haven't been touched by any of the loop optimization passes that run between the scalar evolution pass (i.e. the pass when Part 1 is executed) and the 'loopdone' pass. We go through the exit basic block and look up each stmt in changed_stmts_table. If found, we look up the corresponding PHI node in the phi_node_info linked list and decrement its count by 1 (count here denotes the number of computations added. When count is 0 it means all the computations added in the scalar evolution pass have been found in the same form in the loopdone pass; such a PHI node can be inserted back in if 'b' is a
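For concreteness, a plain-C shape of the memcpy pseudocode above looks roughly like this (a sketch for illustration, not the actual testcase; alignment and aliasing details are glossed over):

void
my_memcpy (char *dst, const char *src, unsigned long n)
{
  unsigned long words = n / sizeof (long);
  unsigned long bytes = n % sizeof (long);

  /* Word-copy loop (L1).  At its exit, the loop-closed SSA PHIs for
     src and dst are candidates for final value replacement. */
  while (words--)
    {
      *(long *) dst = *(const long *) src;
      dst += sizeof (long);
      src += sizeof (long);
    }

  /* Byte-copy loop (L2) uses the final values of src and dst. */
  while (bytes--)
    *dst++ = *src++;
}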
Re: query regarding adding a pass to undo final value replacement.
On Wed, Oct 1, 2008 at 3:22 PM, Ramana Radhakrishnan <[EMAIL PROTECTED]> wrote: > Hi , > > Based on the conversation in the thread at > http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a > pass trying to undo final value replacement going. The initial > implementation was done by Pranav Bhandarkar when he was employed at > Azingo as part of work sponsored by Icera Semiconductor. I've been > trying to get this working with my private port over here. We intend > to contribute this back once our copyright assignments are sorted and > if this will be acceptable to all. I've been getting a few compile > time ICEs with this approach and haven't been able to resolve them > well enough yet. Whilst doing so, I wanted to check on the approach as > outlined below and ask if there's anything that we might have missed > or any problem that one can see with us going along these lines. > Thanks for your time and patience. Some quick comments. First, do you have a non-pseudo-code testcase that exposes the extra computations? Second, I think rather than trying to undo what SCEV const(!)-prop is doing adjust its cost model (maybe there isn't one) to not create the costly substitutions. Thanks, Richard. > cheers > Ramana > > > > 1) Understanding what scalar evolution does. > ~ > > Consider the following pseudo code. > > function memcpy (src_pointer, dst_pointer) > src_1 = src_pointer; > dst_1 = dst_pointer; > L1: > *dst_1 = *src_1 (Word copy) > dst++; > src++;< Inc by 4 bytes i.e 1 word. > conditional jump to L1 > > /* This is the exit block of loop 1. The following PHI nodes > are added by loopinit pass to convert the SSA form into "closed loop > SSA" (see rewrite_into_loop_closed_ssa" in tree-ssa-loop-manip.c */ > > src_2 = PHI > dst_2 = PHI > > > L2: > *dst_2 = *src_2 (Byte Copy) > dst++; > src++; > > > > Now scalar evolution convertes this into > > Function memcpy (src_pointer, dst_pointer) > src_1 = src_pointer; > dst_1 = dst_pointer; > L1: > *dst_1 = *src_1 (Word copy) > dst++; > src++;< Inc by 4 bytes i.e 1 word. > conditional jump to L1 > > > /* This is the exit block of loop 1. The following PHI nodes > are added by loopinit pass to convert the SSA form into "closed loop > SSA" (see rewrite_into_loop_closed_ssa" in tree-ssa-loop-manip.c */ > > D.1234_11 = 4 * n (where 'n' is the number of iterations of L1) > src_2 = src_pointer + D.1234_11 > D.1235_22 = 4 * n > dst_2 = dst_pointer + D.1235_22 > > > L2: > *dst_2 = *src_2 (Byte Copy) > dst++; > src++; > > > > > Therefore a PHI Node is replaced by the final value of src_1 and > dst_1, thus introducing extra computations. > > > 2) How to undo what scalar evolution does. > ~~ > > To undo what scalar evolution does, we need to record the changes that > scalar evolution makes and then after the loop optimizations are run > we need to replace the PHI nodes that were removed earlier in place of > the computations introduced by scalar evolution. > > A high level description of the process is listed here. (see > tree-scalar-evolution.c and grep for DXP_SPECIFIC) > > Explanation of this sub-pass. > ~~ > > Part 1: Record Final Value replacement related changes. > ~~ > Final value replacement replaces PHI nodes at the exits of loops with > computations based on the number of iterations of the loop. > For e.g. > > L1: > > x_1 = src + 4 > ... > ... > conditional jump to L1. >(Loop Exit) > x_2 = PHI < Phi node added by rewrite_into_loop_closed_ssa. > (see the loopinit pass). 
> > Final Value replacement replaces the PHI node in by > > ssa_temp_var = 4 * no_of_iterations_of_loop > x_2 = src + ssa_temp_var; > > Therefore a PHI node is replaced by computations. > > Recording final value replacement related changes is controlled by the > variable record_scalar_evolution_changes. When set to a non-zero value > the function record_changed_stmts records the changes made. The changes > are recorded in a hashtable changed_stmts_table. The hashtable contains > the stmt added, the PHI node for which this stmt was added and hashcodes > for both the stmt and the phi_node. We also note how many computations > have been added for each of the removed PHI nodes. This is done in a > link list pointed to by phi_nodes_info_head. > > Part 2: Undo Final value replacement related changes > ~~ > This is the part where the new computations are removed and the PHI nodes > that they are replaced are inserted back in. This replacement is > contingent to a few conditions. > a) All the computations that were added are still present in the basic > block. i.e all the computations are still present in the form in > which they were added and havent been touched by any of the loop > optimizations pas
Re: IRA accumulated costs
Hi, Richard. Returning to accurate cost accumulation issue you found recently. Here is the patch fixing it. You could try, if you want, how MIPS will behave with it. The patch also more accurately calculates ALLOCNO_CALL_FREQ which affects decision to spill allocno in assign_hard_reg if it is more profitable. 2008-10-01 Vladimir Makarov <[EMAIL PROTECTED]> * ira-int.h (ira_allocno): Add member updated_cover_class_cost. (ALLOCNO_UPDATED_COVER_CLASS_COST): New. (ira_fast_allocation): Remove the prototype. * ira-color.c (update_copy_costs, allocno_cost_compare_func, assign_hard_reg, calculate_allocno_spill_cost): Use updated costs. (color_pass): Modify the updated costs. (ira_color): Rename to color. Make it static. (ira_fast_allocation): Rename to fast_allocation. Make it static. (ira_color): New function. * ira-conflicts.c (process_regs_for_copy): Propagate hard reg cost change. * ira-lives.c (last_call_num, allocno_saved_at_call): New variables. (set_allocno_live, clear_allocno_live, mark_ref_live, mark_ref_dead): Invalidate corresponding element of allocno_saved_at_call. (process_bb_node_lives): Increment last_call_num. Setup allocno_saved_at_call. Don't increase ALLOCNO_CALL_FREQ if the allocno was already saved. (ira_create_allocno_live_ranges): Initiate last_call_num and allocno_saved_at_call. * ira-build.c (ira_create_allocno): Initiate ALLOCNO_UPDATED_COVER_CLASS_COST. (create_cap_allocno, propagate_allocno_info, remove_unnecessary_allocnos): Remove setting updated costs. (ira_flattening): Set up ALLOCNO_UPDATED_COVER_CLASS_COST. * ira.c (ira): Don't call ira_fast_allocation. * ira-costs.c (setup_allocno_cover_class_and_costs): Don't set up updated costs. Index: ira-conflicts.c === --- ira-conflicts.c (revision 140793) +++ ira-conflicts.c (working copy) @@ -337,6 +337,7 @@ process_regs_for_copy (rtx reg1, rtx reg enum reg_class rclass, cover_class; enum machine_mode mode; ira_copy_t cp; + ira_loop_tree_node_t parent; gcc_assert (REG_SUBREG_P (reg1) && REG_SUBREG_P (reg2)); only_regs_p = REG_P (reg1) && REG_P (reg2); @@ -388,13 +389,23 @@ process_regs_for_copy (rtx reg1, rtx reg cost = ira_register_move_cost[mode][cover_class][rclass] * freq; else cost = ira_register_move_cost[mode][rclass][cover_class] * freq; - ira_allocate_and_set_costs -(&ALLOCNO_HARD_REG_COSTS (a), cover_class, - ALLOCNO_COVER_CLASS_COST (a)); - ira_allocate_and_set_costs -(&ALLOCNO_CONFLICT_HARD_REG_COSTS (a), cover_class, 0); - ALLOCNO_HARD_REG_COSTS (a)[index] -= cost; - ALLOCNO_CONFLICT_HARD_REG_COSTS (a)[index] -= cost; + for (;;) +{ + ira_allocate_and_set_costs + (&ALLOCNO_HARD_REG_COSTS (a), cover_class, + ALLOCNO_COVER_CLASS_COST (a)); + ira_allocate_and_set_costs + (&ALLOCNO_CONFLICT_HARD_REG_COSTS (a), cover_class, 0); + ALLOCNO_HARD_REG_COSTS (a)[index] -= cost; + ALLOCNO_CONFLICT_HARD_REG_COSTS (a)[index] -= cost; + if (ALLOCNO_HARD_REG_COSTS (a)[index] < ALLOCNO_COVER_CLASS_COST (a)) + ALLOCNO_COVER_CLASS_COST (a) = ALLOCNO_HARD_REG_COSTS (a)[index]; + if (ALLOCNO_CAP (a) != NULL) + a = ALLOCNO_CAP (a); + else if ((parent = ALLOCNO_LOOP_TREE_NODE (a)->parent) == NULL + || (a = parent->regno_allocno_map[ALLOCNO_REGNO (a)]) == NULL) + break; +} return true; } Index: ira-int.h === --- ira-int.h (revision 140793) +++ ira-int.h (working copy) @@ -258,9 +258,9 @@ struct ira_allocno /* Register class which should be used for allocation for given allocno. NO_REGS means that we should use memory. 
*/ enum reg_class cover_class; - /* Minimal accumulated cost of usage register of the cover class for - the allocno. */ - int cover_class_cost; + /* Minimal accumulated and updated costs of usage register of the + cover class for the allocno. */ + int cover_class_cost, updated_cover_class_cost; /* Minimal accumulated, and updated costs of memory for the allocno. At the allocation start, the original and updated costs are equal. The updated cost may be changed after finishing @@ -451,6 +451,7 @@ struct ira_allocno #define ALLOCNO_LEFT_CONFLICTS_NUM(A) ((A)->left_conflicts_num) #define ALLOCNO_COVER_CLASS(A) ((A)->cover_class) #define ALLOCNO_COVER_CLASS_COST(A) ((A)->cover_class_cost) +#define ALLOCNO_UPDATED_COVER_CLASS_COST(A) ((A)->updated_cover_class_cost) #define ALLOCNO_MEMORY_COST(A) ((A)->memory_cost) #define ALLOCNO_UPDATED_MEMORY_COST(A) ((A)->updated_memory_cost) #define ALLOCNO_EXCESS_PRESSURE_POINTS_NUM(A) ((A)->excess_pressure_points_num) @@ -897,7 +898,6 @@ extern void ira_reassign_conflict_allocn extern void ira_initiate_assign (void); extern void ira_finish_assign (void); extern void ira_color (void); -extern void
Re: Defining a common plugin machinery
Aye up all, I've now been reading through some of the list archive. Some of the posts were about how to tell GCC which plugins to load. I thought I'd tell you how libplugin does it. First there is a plugin path. This tells the system where plugin XML specifications can be found (each plugin needs one specification file). The path can be set by environment variable and/or command line argument. The application can also add to the plugin path itself, so GCC could include plugins from its own installation directory (although it doesn't at the moment). The plugin system will look at every specification file in the plugin path. Those files must be well formed XML with processing directives saying that the plugin is for the current application. for only GCC 4.3.1 or for any GCC 4.X.X or for any application - a few of the provided plugins work with all applications - like logging support. For the patch I have for GCC, only the compiler is plugin aware, not the driver, linker, etc. We could easily have processing directives for each so that you could define a plugin that worked on any set of the compiler applications. A plugin with both the processing directives below would work on both the linker and the driver, but not anything else: Plugins can be marked as lazy or eager (eager by default). Lazy plugins aren't loaded unless explicitly asked for or unless needed by another plugin. This allows users to setup (or for the compiler to set up) default plugins. You could, for example, remove all passes from GCC and distribute them as plugins with a small number being required (not that you should). The user can specify a list plugins with environment variable (GCC_PLUGINS=id,id) and/or command line argument (-plugins id,id). These are just comma separated lists of plugin ids (actually they can be glob patterns, too). Every plugin with such a matching id is marked as eager if it isn't already. The system fails if a requested plugin can't be found. So, we start loading all eager plugins. This means setting up their extension points, loading their libraries, executing lifecycle methods, etc., etc.. If any loaded plugin needs a lazy plugin, that lazy plugin is marked eager and will also be loaded. Plugins 'need' each other by either: * Having an explicit 'requires' element in their specfication, e.g. * Extending an extension point from the other plugin. This can be either by 'extension' elements in the specifications or by code (for example in lifecycle methods or, well, pretty much anything). This means that you don't need to know what plugins provide extension points you want, the system just handles it for you. This also means that you can have one plugin which loads up lots of other plugins. If we had all non-essential passes as plugins for example, one plugin called "O3" could load up a certain number of them and set parameters - I'm not suggesting we do that, though :-) Finally, you can specify arguments to plugins. This can either be via command line (-plugin-var id=value;id=value) and/or environment variable (GCC_PLUGIN_VAR=id=value;id=value). Plugin XML specifications can directly use these arguments, specify their own, use expansion over them etc. Plugin XML files can also use other sources of variable, such as any environment variable with variable names like "env.PATH". Plugin shared libraries also have an API to access these arguments. The variables also have an escaping syntax so that characters like '=' and ';' can be represented. Cheers, Hugh. 
Re: m32c: pointer math vs sizetype again
Is this related to the loop termination bug I reported on the m32c? http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37665 The generated code is using the lower 16-bits of the address for end of the array rather than the full 24-bit address. --joel DJ Delorie wrote: I've got a partial patch which works with older (4.3) gccs, but fails gimple's check for trunk (attached). My trivial test case... char * foo (char *a, int b) { return a-b; } ...fails thusly: constant 32> unit size constant 4> align 8 symtab 0 alias set -1 canonical type 0xb7f52c30 precision 32 min max > constant 16> unit size constant 2> align 8 symtab 0 alias set -1 canonical type 0xb7efc000 precision 16 min max > useless false: ../../gcc/gcc/tree-ssa.c 1092 dj.c: In function 'foo': dj.c:2: error: type mismatch in pointer plus expression D.1194 = a + D.1196; char * char * D.1194 = a + D.1196; dj.c:2: internal compiler error: verify_gimple failed I'm obviously doing something wrong in the cast-to-bigger step. How can I get this to pass gimple? What I'm trying to accomplish is this: 1. Values added to pointers need to be treated as signed (at least, if they're signed types, certainly if you're going to use a NEGATE_EXPR). 2. If sizeof(size_t) < sizeof(void *), sign extend the intop to be pointer-sized before adding it. Index: c-common.c === --- c-common.c (revision 140759) +++ c-common.c (working copy) @@ -3337,20 +3337,28 @@ pointer_int_sum (enum tree_code resultco intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype), TYPE_UNSIGNED (sizetype)), intop); /* Replace the integer argument with a suitable product by the object size. Do this multiplication as signed, then convert to the appropriate type for the pointer operation. */ - intop = convert (sizetype, + intop = convert (ssizetype, build_binary_op (EXPR_LOCATION (intop), MULT_EXPR, intop, convert (TREE_TYPE (intop), size_exp), 1)); /* Create the sum or difference. */ if (resultcode == MINUS_EXPR) -intop = fold_build1 (NEGATE_EXPR, sizetype, intop); +intop = fold_build1 (NEGATE_EXPR, ssizetype, intop); + + if (TREE_CODE (result_type) == POINTER_TYPE + && TYPE_PRECISION (result_type) > TYPE_PRECISION (TREE_TYPE (intop))) +{ + tree iptr_type = c_common_type_for_mode (TYPE_MODE (result_type), + TYPE_UNSIGNED (result_type)); + intop = fold_build1 (NOP_EXPR, iptr_type, intop); +} ret = fold_build2 (POINTER_PLUS_EXPR, result_type, ptrop, intop); fold_undefer_and_ignore_overflow_warnings (); return ret; Index: tree.c === --- tree.c (revision 140759) +++ tree.c (working copy) @@ -3283,15 +3283,21 @@ build2_stat (enum tree_code code, tree t if ((code == MINUS_EXPR || code == PLUS_EXPR || code == MULT_EXPR) && arg0 && arg1 && tt && POINTER_TYPE_P (tt)) gcc_assert (TREE_CODE (arg0) == INTEGER_CST && TREE_CODE (arg1) == INTEGER_CST); if (code == POINTER_PLUS_EXPR && arg0 && arg1 && tt) -gcc_assert (POINTER_TYPE_P (tt) && POINTER_TYPE_P (TREE_TYPE (arg0)) - && INTEGRAL_TYPE_P (TREE_TYPE (arg1)) - && useless_type_conversion_p (sizetype, TREE_TYPE (arg1))); +{ + gcc_assert (POINTER_TYPE_P (tt)); + gcc_assert (POINTER_TYPE_P (TREE_TYPE (arg0))); + gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (arg1))); +#if 0 + gcc_assert (useless_type_conversion_p (sizetype, TREE_TYPE (arg1)) + || useless_type_conversion_p (ssizetype, TREE_TYPE (arg1))); +#endif +} t = make_node_stat (code PASS_MEM_STAT); TREE_TYPE (t) = tt; /* Below, we automatically set TREE_SIDE_EFFECTS and TREE_READONLY for the result based on those same flags for the arguments. But if the -- Joel Sherrill, Ph.D. 
Director of Research & Development [EMAIL PROTECTED]On-Line Applications Research Ask me about RTEMS: a free RTOS Huntsville AL 35805 Support Available (256) 722-9985
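To make the failure mode concrete: with 24-bit pointers but a 16-bit sizetype, a loop like the sketch below can miscompare if the end-of-array address is formed from a truncated 16-bit quantity. Sizes and names here are illustrative, not the PR 37665 testcase:

#define LEN 100

char buf[LEN];

void
fill (char v)
{
  char *p = buf;
  char *end = buf + LEN;   /* pointer + integer goes through sizetype */

  while (p != end)         /* misbehaves if 'end' kept only the low 16 bits */
    *p++ = v;
}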
Re: query regarding adding a pass to undo final value replacement.
On Wed, Oct 1, 2008 at 3:59 PM, Richard Guenther <[EMAIL PROTECTED]> wrote: > On Wed, Oct 1, 2008 at 3:22 PM, Ramana Radhakrishnan <[EMAIL PROTECTED]> > wrote: >> Hi , >> >> Based on the conversation in the thread at >> http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a >> pass trying to undo final value replacement going. The initial >> implementation was done by Pranav Bhandarkar when he was employed at >> Azingo as part of work sponsored by Icera Semiconductor. I've been >> trying to get this working with my private port over here. We intend >> to contribute this back once our copyright assignments are sorted and >> if this will be acceptable to all. I've been getting a few compile >> time ICEs with this approach and haven't been able to resolve them >> well enough yet. Whilst doing so, I wanted to check on the approach as >> outlined below and ask if there's anything that we might have missed >> or any problem that one can see with us going along these lines. >> Thanks for your time and patience. > > Some quick comments. First, do you have a non-pseudo-code testcase > that exposes the extra computations? Second, I think rather than > trying to undo what SCEV const(!)-prop is doing adjust its cost > model (maybe there isn't one) to not create the costly substitutions. Indeed the comment on scev_const_prop says "Also perform final value replacement in loops, in case the replacement expressions are cheap." but no such check for cheapness is done. Whatever cost model we add we need to make sure to not disable empty loop removal - that is, loops that only care for the final value of their induction variable. A sensible simple cost model is that the replacement is either a constant, a SSA_NAME or an expression of the form CST * name + CST (which may be a common thing). In your testcase there are even divisions inserted. I guess the empty-loop removal interaction makes fixing this one harder, but trying to record things to undo this transformation doesn't look right either. Richard.
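For reference, the empty-loop-removal case mentioned above is the classic one below: once final value replacement rewrites the loop-closed PHI for i as n, the loop body is dead and the whole loop can be deleted (a minimal sketch):

unsigned int
count_up (unsigned int n)
{
  unsigned int i = 0;

  while (i < n)
    i++;

  /* After final value replacement this is simply: return n; */
  return i;
}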
Re: Defining a common plugin machinery
Hugh Leather wrote: Aye up all, I've now been reading through some of the list archive. Some of the posts were about how to tell GCC which plugins to load. I thought I'd tell you how libplugin does it. Thanks for the nice explanation. I'm not sure I understand exactly how libplugin deals with adding passes; apparently, the entire pass manager (i.e. gcc/passes.c) has been rewritten or enhanced. Also, I did not understand the exact conceptual differences between libplugin & other proposals. Apparently libplugin is much more ambitious. So we now have many plugin proposals & experiments. However, we do know that there are some legal/political/license issues on these points (with the GCC community rightly wanting as hard as possible to avoid proprietary plugins), that some interaction seems to be happening (notably between the Steering Committee & the FSF), and that the work is going slowly (because of a lack of resources & labor & funding? at the FSF). My perception is that the issues are not mostly technical, but still political (and probably, as Ian Taylor mentioned in http://gcc.gnu.org/ml/gcc/2008-09/msg00442.html, a lack of lawyers or other human resources at the FSF, which cost much more than any reasonable person could afford individually). I actually might not understand why exactly plugins are not permitted by the current GCC licenses. What I don't understand is:

* what exactly do we call a plugin? I feel (but I am not a lawyer) that (on Linux) it is any *.so file which is fed to dlopen. I'm not able to point to what parts of the GCC license prohibit that (I actually hope that nothing prohibits it right now, if the *.so is compiled from GPLv3-ed, FSF-copyrighted code; the MELT branch is doing exactly that right now).

* will the runtime license be ready by Christmas 2008? [Some messages made me think not, that it is too much lawyer work; other messages made me a bit more optimistic; I really am confused.] Of course, I don't want any hard date, but I am in absolute darkness on the actual work already done on improving the runtime license, and even more on what needs to be fixed. Also, I have no idea of the work involved in writing new licenses (I only know that the GPLv3 effort lasted much more than one year). Did I say that I am not a lawyer, and do not understand even the basic principles of US law (or perhaps even French law)?

* what kind of intrusiveness do we want for the plugin machinery? Do we want it to be clean and hence to touch a lot of files (in particular the details of passes & the pass manager), or do we first want some quick and dirty plugin trick merged into the trunk, even if it is imperfect?

* what is the plugin machinery useful for? Only adding optimisation passes, or something much more ambitious (adding new front ends, back ends, targets)?

* what is the interaction between the plugin machinery & the rest of GCC (e.g. GGC, dump files, ...)?

* what is the granularity plugins are wanted or needed for? Only whole passes, or something smaller than that (e.g. some specific functions inside specific passes)?

* who really wants plugins to happen quickly, and which company would invest money [not only code] in that?

* what host systems do we want the plugin to work with? Is the libtool dyn loader enough? Could every non-static symbol inside cc1 be visible to the plugin?

* do we really want one single (fits-all) plugin machinery inside GCC?
My feeling is that a lot of varied technical effort has already been put into plugins, but that the future runtime license may (or may not) impact the technicalities (perhaps making some proposed technical solutions impossible). I really don't understand what the hard limit is, i.e. what exactly the FSF or the Steering Committee wants to avoid (obviously proprietary plugins implementing new machine targets are unwanted, but what else; is the goal to only permit FSF-copyrighted GPLed plugins; what would be the review policy for code going into plugins?). I've got no idea how hard it would be to get any plugin system accepted into the GCC trunk, or when that work could start (i.e. when to send plugin patches to gcc-patches@). I tend to believe that this is the main issue now. Are plugin patches supposed to be welcome - on the gcc-patches@ mailing list, for trunk acceptance - when GCC goes back into stage 1? Will the first plugin patches (submitted to gcc-patches@ for acceptance into trunk) be huge or tiny patches? Technically both are possible (of course with different goals & features). I don't even know what a plugin legally is. For instance, in my MELT branch code is indeed dlopen-ed, but [currently] the C code of the plugin is generated (by the plugin itself) from MELT lisp-like files, which are all inside the MELT branch (GPL-ed, FSF copyrighted). Perhaps that does not even count, from a legal point of view, as a plugin? [I really hope I am not unknowingly doing illegal things on the MELT branch; to calm eve
Re: Does IRA support stack slot sharing for locals and spilled pseudos?
Pat Haugen wrote: Alexander Monakov <[EMAIL PROTECTED]> wrote on 09/29/2008 01:34:12 PM: I'm seeing a miscompilation on sel-sched branch that at first sight looks related to IRA merge. alias.c::anti_dependence disambiguates references to (mem/c:DI (reg:DI 122 r122 [121]) [64 ivtmp.743+0 S8 A64]) and (mem/c:DI (reg:DI 122 r122) [64 ivtmp.1640+0 S8 A64]) while there are no stores to r122 between corresponding insns. It does so because nonoverlapping_memrefs_p returns TRUE for these mems, which is, in turn, due to this code: 2118 /* If either RTL is not a MEM, it must be a REG or CONCAT, meaning they 2119 can't overlap unless they are the same because we never reuse that part 2120 of the stack frame used for locals for spilled pseudos. */ 2121 if ((!MEM_P (rtlx) || !MEM_P (rtly)) 2122 && ! rtx_equal_p (rtlx, rtly)) 2123 return 1; Corresponding RTL_DECLS are: rtlx = (reg:DI 97 r105 [orig:850 ivtmp.743 ] [850]) rtly = (mem/c:DI (plus:DI (reg/f:DI 111 r119) (const_int -1456 [0xfa50])) [64 ivtmp.1640+0 S8 A64]) Does IRA support stack slot sharing described in the comment? We just got done walking through a failure with 200.sixtrack that looks like the same thing. The two insns involved are: (insn 33168 33162 33175 27 maincr.f:1 (set (reg/f:DI 14 14 [orig:614 ivtmp.1309 ] [614]) (mem/c:DI (plus:DI (reg:DI 11 11) (const_int -7080 [0xe458])) [101 ivtmp.1309+0 S8 A64])) 349 {*movdi_internal64} (nil)) (insn 33175 33168 33176 27 maincr.f:1 (set (mem/c:DF (plus:DI (reg:DI 11 11 [5]) (const_int -7080 [0xe458])) [101 D.3497+0 S8 A64]) (reg:DF 45 13 [orig:765 D.3497 ] [765])) 336 {*movdf_hardfloat64} (expr_list:REG_DEAD (reg:DF 45 13 [orig:765 D.3497 ] [765]) (nil))) The MEM refs are not seen as overlapping which then allows the scheduler to reorder the store to MEM above the load. The problem is brought about because an additional register is needed to access the stack location since it is beyond the 32K limit for PPC. So before these references we have an insn 'r11 = r1 + 64K'. The code in alias.c:stack_addr_p() does not recognize r11 as pointing to the stack and therefor the IRA code in nonoverlapping_memrefs_p() does not recognize the above MEMs as being stack references and use the special code for reused ira spill slots. It seems like stack_addr_p() doesn't handle reg+reg addressing also, only recognizing reg+const references (unless those are meant to be caught elsewhere). Yea, I don't see how stack_addr_p handles cases such as secondary reloads due to an out of range displacement in a reg+d style addressing mode. Given the unpredictable nature of how out of range slot addresses are reloaded, I'm not sure that following the use-def chains back to the definition sites would be useful either. I'm certainly at a loss for a good way to fix this. Vlad -- any thoughts? jeff
Re: query regarding adding a pass to undo final value replacement.
Hi, > b) If any PHI node has count zero it can be inserted back and its > corresponding computations removed, iff the argument of the PHI node > still exists as an SSA variable. This means that we can insert > a_1 = PHI <D.10_1> if D.10_1 still exists and hasn't been removed by > any of the passes between the scalar evolution pass and the > loopdone pass.

this does not work:
-- we reuse ssa names, so it can happen that the argument of the PHI node is eliminated, then reused for a different purpose
-- in case more complex loop transformations were performed (e.g., loop reversal), the final value of the ssa name might have changed.

Zdenek
Re: query regarding adding a pass to undo final value replacement.
Hi, > > Based on the conversation in the thread at > > http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a > > pass trying to undo final value replacement going. The initial > > implementation was done by Pranav Bhandarkar when he was employed at > > Azingo as part of work sponsored by Icera Semiconductor. I've been > > trying to get this working with my private port over here. We intend > > to contribute this back once our copyright assignments are sorted and > > if this will be acceptable to all. I've been getting a few compile > > time ICEs with this approach and haven't been able to resolve them > > well enough yet. Whilst doing so, I wanted to check on the approach as > > outlined below and ask if there's anything that we might have missed > > or any problem that one can see with us going along these lines. > > Thanks for your time and patience. > > Some quick comments. First, do you have a non-pseudo-code testcase > that exposes the extra computations? Second, I think rather than > trying to undo what SCEV const(!)-prop is doing adjust its cost > model (maybe there isn't one) to not create the costly substitutions. I would disagree on that. Whether a final value replacement is profitable or not largely depends on whether it makes further optimization of the loop possible or not; this makes it difficult to find a good cost model. I think undoing FVR is a good approach to solve this problem (unfortunately, the proposed implementation does not work), Zdenek
Re: query regarding adding a pass to undo final value replacement.
Hi, > On Wed, Oct 1, 2008 at 3:59 PM, Richard Guenther > <[EMAIL PROTECTED]> wrote: > > On Wed, Oct 1, 2008 at 3:22 PM, Ramana Radhakrishnan <[EMAIL PROTECTED]> > > wrote: > >> Hi , > >> > >> Based on the conversation in the thread at > >> http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a > >> pass trying to undo final value replacement going. The initial > >> implementation was done by Pranav Bhandarkar when he was employed at > >> Azingo as part of work sponsored by Icera Semiconductor. I've been > >> trying to get this working with my private port over here. We intend > >> to contribute this back once our copyright assignments are sorted and > >> if this will be acceptable to all. I've been getting a few compile > >> time ICEs with this approach and haven't been able to resolve them > >> well enough yet. Whilst doing so, I wanted to check on the approach as > >> outlined below and ask if there's anything that we might have missed > >> or any problem that one can see with us going along these lines. > >> Thanks for your time and patience. > > > > Some quick comments. First, do you have a non-pseudo-code testcase > > that exposes the extra computations? Second, I think rather than > > trying to undo what SCEV const(!)-prop is doing adjust its cost > > model (maybe there isn't one) to not create the costly substitutions. > > Indeed the comment on scev_const_prop says > > "Also perform final value replacement in loops, >in case the replacement expressions are cheap." > > but no such check for cheapness is done. sorry for the leftover comment -- there used to be a test for the cost of the computation, but it caused so many (missed optimization) problems that I removed it in the end, Zdenek
Re: query regarding adding a pass to undo final value replacement.
Hi Zdenek, On Wed, Oct 1, 2008 at 5:19 PM, Zdenek Dvorak <[EMAIL PROTECTED]> wrote: > Hi, > >> b) If any PHI node has count zero it can be inserted back and its >> corresponding computations removed, iff the argument of the PHI node >> still exists as an SSA variable. This means that we can insert >> a_1 = PHI <D.10_1> if D.10_1 still exists and hasn't been removed by >> any of the passes between the scalar evolution pass and the >> loopdone pass. > > this does not work: > -- we reuse ssa names, so it can happen that the argument of the PHI node > is eliminated, then reused for a different purpose I wasn't sure about that - isn't condition a) from the proposal strong enough to catch this case? i.e. a) All the computations that were added are still present in the basic block, i.e. all the computations are still present in the form in which they were added and haven't been touched by any of the loop optimization passes that run between the scalar evolution pass (i.e. the pass when Part 1 is executed) and the 'loopdone' pass. We go through the exit basic block and look up each stmt in changed_stmts_table. If found, we look up the corresponding PHI node in the phi_node_info linked list and decrement its count by 1 (count here denotes the number of computations added. When count is 0 it means all the computations added in the scalar evolution pass have been found in the same form in the loopdone pass; such a PHI node can be inserted back in if 'b' is also true). So if the ssa_names are in fact reused they won't be the same computations. We do store the statements that were introduced, and if we see a difference in the statements based on the hashes calculated we don't undo the change. > -- in case more complex loop transformations were performed > (e.g., loop reversal), the final value of the ssa name might have > changed. Could you give an example of this? Is there anything else you might suggest in terms of undoing the transformations from scev const-prop? cheers Ramana > > Zdenek > -- Ramana Radhakrishnan
Re: Does IRA support stack slot sharing for locals and spilled pseudos?
Alexander Monakov wrote: Hello, I'm seeing a miscompilation on sel-sched branch that at first sight looks related to IRA merge. alias.c::anti_dependence disambiguates references to (mem/c:DI (reg:DI 122 r122 [121]) [64 ivtmp.743+0 S8 A64]) and (mem/c:DI (reg:DI 122 r122) [64 ivtmp.1640+0 S8 A64]) while there are no stores to r122 between corresponding insns. It does so because nonoverlapping_memrefs_p returns TRUE for these mems, which is, in turn, due to this code: 2118 /* If either RTL is not a MEM, it must be a REG or CONCAT, meaning they 2119 can't overlap unless they are the same because we never reuse that part 2120 of the stack frame used for locals for spilled pseudos. */ 2121 if ((!MEM_P (rtlx) || !MEM_P (rtly)) 2122 && ! rtx_equal_p (rtlx, rtly)) 2123 return 1; Corresponding RTL_DECLS are: rtlx = (reg:DI 97 r105 [orig:850 ivtmp.743 ] [850]) rtly = (mem/c:DI (plus:DI (reg/f:DI 111 r119) (const_int -1456 [0xfa50])) [64 ivtmp.1640+0 S8 A64]) Does IRA support stack slot sharing described in the comment? Yes. There's code at the start of nonoverlapping_memrefs_p to handle these cases, but as Pat pointed out, it doesn't work for large offsets from the stack/frame pointer (large enough to cause a secondary reload). I'm not sure offhand how to best fix this. jeff
Re: Does IRA support stack slot sharing for locals and spilled pseudos?
Jeff Law wrote: (mem/c:DI (reg:DI 122 r122 [121]) [64 ivtmp.743+0 S8 A64]) and (mem/c:DI (reg:DI 122 r122) [64 ivtmp.1640+0 S8 A64]) ... Yes. There's code at the start of nonoverlapping_memrefs_p to handle these cases, but as Pat pointed out, it doesn't work for large offsets from the stack/frame pointer (large enough to cause a secondary reload). I'm not sure offhand how to best fix this. How about setting the MEM_EXPR to a fake "spill_slot_base" symbol, plus the full frame pointer offset number? Since the slots are being shared, the original decl as the MEM_EXPR isn't terribly useful. r~
gcc source: how can I access asmspec_tree in function push_parm_decl
Hello, in gcc/c-decl.c I see that finish_decl (tree decl, tree init, tree asmspec_tree) has access to asmspec_tree, but the function push_parm_decl has no asmspec_tree parameter. Is there a way to get access to it without many code changes, or can the function below be hooked in at a better place? The AmigaOS target needs these functions. Can somebody tell me an easier way to add this feature, so that not so much GCC source has to change when making an AmigaOS port? Here is a short test program showing what is needed; the feature is to say which variable must be put in which register, or which register must be put in which variable:

long GfxBase;
void (*Old_Text)(long rp asm("a1"), long string asm("a0"), long count asm("d0"), long GfxBase asm("a6"));
void New_Text(long rp __asm("a1"), long string __asm("a0"), long count __asm("d0"))
{
  (*Old_Text)(rp, string, count, GfxBase);
}

But it would be much easier if there were a way to get access to asmspec without needing an additional parameter. The current way is to change many lines in c-parse.in (see below); here is the change that is needed in c-decl.c. The changes are against 3.4.0, as far as I could find out. diff -rupN gcc-3.4.0/gcc/c-decl.c gcc-3.4.0-gg/gcc/c-decl.c --- gcc-3.4.0/gcc/c-decl.c Mon Mar 22 18:58:18 2004 +++ gcc-3.4.0-gg/gcc/c-decl.c Tue Apr 27 11:12:30 2004 @@ -2943,7 +2943,7 @@ finish_decl (tree decl, tree init, tree and push that on the current scope. */ void -push_parm_decl (tree parm) +push_parm_decl (tree parm, tree asmspec) { tree decl; @@ -2956,6 +2956,75 @@ push_parm_decl (tree parm) TREE_PURPOSE (TREE_PURPOSE (parm)), PARM, 0, NULL); decl_attributes (&decl, TREE_VALUE (parm), 0); + + /* begin-GG-local: explicit register specification for parameters */ + if (asmspec) +#ifdef TARGET_AMIGAOS +{ + const char *regname=TREE_STRING_POINTER(asmspec); + int regnum; + if ((regnum=decode_reg_name(regname))>=0) + { + tree type=TREE_TYPE(decl); + if (HARD_REGNO_MODE_OK(regnum, TYPE_MODE(type))) + { + tree t, attrs; + /* Build tree for __attribute__ ((asm(regnum))). */ +#if 0 + /* This doesn't work well because of a bug in +attribute_list_contained(), which passes list of arguments to +simple_cst_equal() instead of passing every argument +separately. */ + attrs=tree_cons(get_identifier("asm"), tree_cons(NULL_TREE, + build_int_2_wide(regnum, 0), NULL_TREE), NULL_TREE); +#else + attrs=tree_cons(get_identifier("asm"), + build_int_2_wide(regnum, 0), NULL_TREE); +#endif +#if 0 + /* build_type_attribute_variant() would seem to be more +appropriate here. However, that function does not support +attributes for parameters properly. It modifies +TYPE_MAIN_VARIANT of a new type. As a result, comptypes() +thinks that types of parameters in prototype and definition +are different and issues error messages. See also comment +below. */ + type=build_type_attribute_variant(type, attrs); +#else + /* First check whether such a type already exists - if yes, use +that one. This is very important, since otherwise +common_type() would think that it sees two different +types and would try to merge them - this could result in +warning messages. */ + for (t=TYPE_MAIN_VARIANT(type); t; t=TYPE_NEXT_VARIANT(t)) + if (comptypes(t, type, COMPARE_STRICT)==1 + && attribute_list_equal(TYPE_ATTRIBUTES(t), attrs)) + break; + if (t) + type=t; + else + { + /* Create a new variant, with differing attributes. +(Hack! Type with differing attributes should no longer be +a variant of its main type. See comment above for +explanation why this was necessary).
*/ + type=build_type_copy(type); + TYPE_ATTRIBUTES(type)=attrs; + } +#endif + TREE_TYPE(decl)=type; + } + else + error ("%Jregister specified for '%D' isn't suitable for data type", + decl, decl); + } + else + error ("invalid register name `%s'", regname); +} +#else /* !TARGET_AMIGAOS */ +error("explicit register specification for parameters is not supported for this target"); +#endif /* !TARGET_AMIGAOS */ + /* end-GG-local */ decl = pushdecl (decl); .. diff -rupN gcc-3.4.0/gcc/c-parse.in gcc-3.4.0-gg/gcc/c-parse.in --- gcc-3.4.0/gcc/c-parse.inSun Feb 8 21:56:44 2004 +++ gcc-3.4.0-gg/gcc/c-parse.in Tue Apr 27 11:12:30 2004 @@ -29,7 +29,7 @@ Software Foundation, 59 Temple Place - S written by AT&T, but I have never seen it. */ @@ifc -%expect 10 /* shift/reduce conflicts, and no reduce/reduce conflicts. */ +%expect 11 /* shift/reduce conflicts, and no reduce/reduce conflicts. */ @@end_ifc
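For comparison, stock GCC already lets you pin an ordinary local variable to a named register, but it offers no such syntax for parameters; that is exactly what the asmspec change above provides. A rough sketch of the existing facility (the m68k register name here is only illustrative):

  long GfxBase;

  void New_Text(long rp, long string, long count)
  {
    /* A local register variable is accepted by stock GCC; there is no
       equivalent annotation allowed on the parameters themselves.  */
    register long base asm("a6") = GfxBase;
    /* ... */
  }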
Re: query regarding adding a pass to undo final value replacement.
On Wed, Oct 1, 2008 at 6:22 PM, Zdenek Dvorak <[EMAIL PROTECTED]> wrote: > Hi, > >> > Based on the conversation in the thread at >> > http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a >> > pass trying to undo final value replacement going. The initial >> > implementation was done by Pranav Bhandarkar when he was employed at >> > Azingo as part of work sponsored by Icera Semiconductor. I've been >> > trying to get this working with my private port over here. We intend >> > to contribute this back once our copyright assignments are sorted and >> > if this will be acceptable to all. I've been getting a few compile >> > time ICEs with this approach and haven't been able to resolve them >> > well enough yet. Whilst doing so, I wanted to check on the approach as >> > outlined below and ask if there's anything that we might have missed >> > or any problem that one can see with us going along these lines. >> > Thanks for your time and patience. >> >> Some quick comments. First, do you have a non-pseudo-code testcase >> that exposes the extra computations? Second, I think rather than >> trying to undo what SCEV const(!)-prop is doing adjust its cost >> model (maybe there isn't one) to not create the costly substitutions. > > I would disagree on that. Whether a final value replacement is > profitable or not largely depends on whether it makes further > optimization of the loop possible or not; this makes it difficult > to find a good cost model. I think undoing FVR is a good approach > to solve this problem (unfortunately, the proposed implementation > does not work), Ok, fair enough. Ideally we would then be able to retain the PHI nodes and somehow record an equivalency in the IL from which we later could remove either of the definitions. Something like def_1 = PHI < ... > def_2 = compute def_3 = EQUIV (def_3 = ASSERT_EXPR ?) much similar to REG_EQUAL notes. This means that both def_1 and def_2 are conditionally dead if the EQUIV is the only remaining use. No idea if this is feasible and useful enough in general though. Do you remember what kind of missed optimizations you saw (apart from missed dead loop removal)? Thanks, Richard.
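P.S. For anyone following along, a minimal (hypothetical) test case of the kind where final value replacement fires; whether it is a win depends entirely on whether the loop then becomes dead:

  int f (int n)
  {
    int i, s = 0;
    for (i = 0; i < n; i++)
      s += 4;
    /* SCEV const-prop rewrites this use of s as roughly
       (n > 0 ? 4 * n : 0).  If the loop is then removed as dead, good;
       if the loop has to stay for other reasons, the multiply is an
       extra computation.  */
    return s;
  }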
Re: Defining a common plugin machinery
On Wed, Oct 01, 2008 at 06:03:21PM +0200, Basile STARYNKEVITCH wrote: > So we now have many plugin proposals & experiments. However, we do know > that there are some legal/political/license issues on these points (with > the GCC community rightly wanting as hard as possible to avoid > proprietary plugins), that some interaction seems to happen (notably > between Steering Committee & FSF), that the work is going slowly > (because of lack of resource & labor & funding? at FSF). That impression isn't really right; we're getting close now to a resolution. There should be some news soon.
Convert Blanket Write Privileges to Global Reviewers
On my recommendation, and with the support of all those with blanket write privileges on the SC, the GCC SC has decided to eliminate "blanket write privileges" in favor of "global reviewers". Those who previously held blanket write privileges are now global reviewers. Global reviewers may now review and approve patches to any portion of the compiler and/or associated libraries, but cannot approve their own patches. Global reviewers who are also maintainers of particular parts of GCC may continue to approve their own patches to those portions. So, for example, I can still check in a C++ front-end patch without review, but cannot check in a loop optimizer patch without review. This change is being made to encourage peer-review of patches and to avoid any appearance of impropriety on the part of those of us who had blanket write privileges. I will commit the attached patch momentarily. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713 2008-10-01 Mark Mitchell <[EMAIL PROTECTED]> * MAINTAINERS (Blanket Write Privs): Change to Global Reviewers. Index: MAINTAINERS === --- MAINTAINERS (revision 140816) +++ MAINTAINERS (working copy) @@ -18,7 +18,7 @@ To report problems in GCC, please visit: Maintainers === - Blanket Write Privs. + Global Reviewers Richard Earnshaw [EMAIL PROTECTED] Richard Henderson [EMAIL PROTECTED] @@ -32,6 +32,9 @@ Mark Mitchell [EMAIL PROTECTED] Bernd Schmidt [EMAIL PROTECTED] Jim Wilson [EMAIL PROTECTED] +Note that while global reviewers can approve changes to any part of +the compiler or associated libraries, they still need approval for +their own patches from other maintainers or reviewers. CPU Port Maintainers(CPU alphabetical order)
a solution to getch()
/* Solution provided by kermi3 from this web posting:
   http://cboard.cprogramming.com/archive/index.php/t-27714.html */

#include <stdio.h>
#include <termios.h>
#include <unistd.h>

int mygetch(void)
{
    struct termios oldt, newt;
    int ch;

    tcgetattr( STDIN_FILENO, &oldt );
    newt = oldt;
    newt.c_lflag &= ~( ICANON | ECHO );          /* unbuffered, no echo */
    tcsetattr( STDIN_FILENO, TCSANOW, &newt );
    ch = getchar();
    tcsetattr( STDIN_FILENO, TCSANOW, &oldt );   /* restore the terminal */
    return ch;
}

/* This point down was coded by Brandon Camadine.
   Reads length-1 characters into array, NUL-terminates it, and echoes
   the characters when display is non-zero. */
void mygetchs(int length, char array[], int display)
{
    int x;

    for (x = 0; x < length - 1; x++)
    {
        array[x] = mygetch();
        if (display != 0)
            putchar(array[x]);
    }
    array[x] = '\0';
}
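A minimal usage sketch (the buffer size and prompt are only illustrative; it assumes the two functions above are in the same file):

int main(void)
{
    char name[16];

    printf("name: ");
    mygetchs(sizeof name, name, 1);   /* reads exactly sizeof name - 1
                                         characters and echoes them */
    printf("\nhello, %s\n", name);
    return 0;
}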
Re: [lto] Adding -fwhopr
On Wed, Oct 1, 2008 at 13:19, Ollie Wild <[EMAIL PROTECTED]> wrote: > On Tue, Sep 30, 2008 at 3:31 PM, Diego Novillo <[EMAIL PROTECTED]> wrote: >> >> -flto: as described above. >> -fwhopr: similar to what -fwpa does today, but it is accepted by the >> driver and can take either source code or object code. In this case, >> we'd move -fwpa and -fltrans to be an lto1-only flag. > > Sounds reasonable. Just to clarify, are you thinking of -flto and -fwhopr > as mutually exclusive options, or is -fwhopr just an additional mode of > -flto. I think the only time where both -flto and -fwhopr are virtually identical is in the LGEN phase. So, $ gcc -c -flto file.c should be the same as $ gcc -c -fwhopr file.c Though, I think this may not even be true long term, at some point we may want to do different things for both. The case that really matters is the actual link phase: $ gcc -o binary -flto *.o vs $ gcc -o binary -fwhopr *.o So, I'm leaning towards making them mutually exclusive always. > If the latter, what does being able to "take either source code or > object code" mean? Well, simply that -fwpa should only be accepted by lto1, and lto1 does not take source code, only .o files with GIMPLE in them. Perhaps the distinction is not important, as the driver can always call the corresponding front end to generate gimple before calling lto1 (as we do now) , but the multiplicity of LTO flags may be confusing for the user: -flto, -fwhopr, -fwpa, -fltrans. The last two are really only meaningful when calling lto1, so I would simply not accept it at the driver level (i.e., gcc -fwpa *.c would error out). Diego.
Re: Defining a common plugin machinery
Aye up Basile, Thanks for wading through my gibberish :-) *Differences with other proposals.* I'll have a stab at some differences between this system and the others. But, this is going to be a bit difficult since I haven't seen them all :-) *Separating Plugin system from appliction* Libplugin ships as a library. Apart from a few lines of code in toplev.c, the only other changes to GCC will be refactorings and maybe calling a few functions through pointers. I think it's important to separate the plugin system from the application. Doing plugins well, IMO, requires a lot of code. It shouldn't be spread through the app. It also cleanly separates plugin mechanism from the actual extensions the app wants. Finally, plugins have to be extensible too. They should really be on a nearly equal footing with the app. Otherwise plugin developers who want the plugins to be extensible will need to reimplement there own extensibility system. *Pull vs push* Libplugin has a 'push' architecture, not a 'pull' one. What I mean is that the system pushes plugin awareness onto the application rather than requiring the application to call out to the plugin system all the time. Here's an example of that. In GCC, passes have execute and gate functions which are already function pointers. With libplugin you can make these replaceable/extensible/event-like without changing a single line of code in GCC. An external plugin, the "gcc-pass-manager" plugin, tells the world that it has a join point for each gate and execute function of every pass in the system. A quick aside on join points. Suppose you have a function int myHeuristic( basic_block bb, rtx insn ) { // blah, blah return x; } If we redefine that function to be called myHeuristic_default and setup a function pointer with same name: static int myHeuristic_default( basic_block bb, rtx insn ) { ... } int ( *myHeuristic )( basic_block bb, rtx insn ) = myHeuristic_default; Now we can use the heuristic unchanged in the code. But if we tell libplugin that that is a join point with id="my-heuristic" (in the XML for some plugin) it will create 1. An event called "my-heuristic.before" with signature "void (basic_block, rtx)" 2. A replaceble function stack called "my-heuristic.around" with signature "int (basic_block, rtx)" 3. An event called "my-heuristic.after" with signature "void (int, basic_block, rtx)" If anyone extends any of those, then the function pointer, myHeuristic, will be replaced with a dynamically built function which does, roughly: int myHeuristic_dynamic( basic_block bb, rtx insn ) { // call listeners to before foreach f in my-heuristic.before.eventHandlers { f( bb, insn ); } // do the behaviour of the heuristic top = my-heuristic.around.topOfAdviceStack; // top is initially myHeuristic_default unless someone overrode it // top can also access the rest of the advice stack, but I ignore that here int rval = top( bb, insn ); // call listeners to after foreach f in my-heuristic.after.eventHandlers { f( rval, bb, insn ); } return rval } It then sets myHeuristic = myHeuristic_dynamic. Note that if no one listens to the events of pushes advice on the around stack, then the original function pointer isn't changed - no performance cost. Now the dynamic functions are pushed onto each passes' gate or execute only if someone wants to extends them. Not one line of code was changed in GCC. This is what I mean by push not pull. Consider the alternative, which I call 'pull' because it has to pull plugin awareness from the system. 
It would require each pass and gate to check if anyone was interested, lots of changes to the code. Or every calling site would have to do it, similarly unpleasant for most uses. This is great when you already have function pointers. If you don't you have to make only minimal changes. Your code remains efficient if no one extends it. *Scalable and Granularity* The system is very scalable. Really this is due to the push architecture. Consider if events were implemented by something like this. A single function: void firePluginEvent( int eventId, void* data ); Every event would be fired by calling through this one function. Plugins would register a callback function. This is fine when you only have a few events but look what happens when you have very fine grained events happ
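To make the join-point idea concrete, here is a rough, self-contained C sketch of the function-pointer wrapping described above; every name in it is invented for illustration and is not libplugin's actual API:

  #include <stdio.h>

  /* The original heuristic, renamed, as in the description above.  */
  static int my_heuristic_default (int bb_freq, int insn_cost)
  {
    return bb_freq * insn_cost;
  }

  /* The pointer the rest of the application keeps calling through.  */
  static int (*my_heuristic) (int, int) = my_heuristic_default;

  /* A "before" handler some plugin registered.  */
  static void trace_before (int bb_freq, int insn_cost)
  {
    printf ("scoring freq=%d cost=%d\n", bb_freq, insn_cost);
  }

  /* What the framework would synthesize once someone subscribes: run the
     "before" handlers, then the "around" stack (here just the default);
     "after" handlers would see the return value.  */
  static int my_heuristic_dynamic (int bb_freq, int insn_cost)
  {
    trace_before (bb_freq, insn_cost);
    return my_heuristic_default (bb_freq, insn_cost);
  }

  int main (void)
  {
    printf ("%d\n", my_heuristic (10, 3));   /* unextended: 30 */
    my_heuristic = my_heuristic_dynamic;     /* done by the framework */
    printf ("%d\n", my_heuristic (10, 3));   /* now also fires the event */
    return 0;
  }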
Re: Defining a common plugin machinery
Hugh Leather wrote: Aye up Basile, Thanks for wading through my gibberish :-) *Differences with other proposals.* I'll have a stab at some differences between this system and the others. But, this is going to be a bit difficult since I haven't seen them all :-) *Separating Plugin system from appliction* Libplugin ships as a library. Apart from a few lines of code in toplev.c, the only other changes to GCC will be refactorings and maybe calling a few functions through pointers. The point is who will do the refactoring you mention? We should not expect tons of other people to rewrite their pass for us... This won't happen (and conversely I am not able to rewrite other people's passes; GCC is really complex for me). I actually don't understand well what kind of plugin proposal will make into the trunk. Let's assume that the trunk will go in stage 1 on Christmas 2008, and that all the legal issues are solved (ie a runtime license is *defined* and accepted and explains what kind of plugins are GCC compatible) at the same time. It is late for me, and I am dreaming of Santa Claus :-) So let's suppose that Santa Claus came and give us (on end of december 2008) the runtime license and the stage 1. What happens next? What kind of patches is sent to gcc-patches@ and who will have time to review them? I think it's important to separate the plugin system from the application. Doing plugins well, IMO, requires a lot of code. It shouldn't be spread through the app. It also cleanly separates plugin mechanism from the actual extensions the app wants. Finally, plugins have to be extensible too. They should really be on a nearly equal footing with the app. Otherwise plugin developers who want the plugins to be extensible will need to reimplement there own extensibility system. The issue I see here is to get a consensus on these ideas and on your code. (I have same frightenning feeling about trying to get some day MELT into the trunk). *Pull vs push* Libplugin has a 'push' architecture, not a 'pull' one. What I mean is that the system pushes plugin awareness onto the application rather than requiring the application to call out to the plugin system all the time. I'm not sure to understand (I need to go to sleep) and I am concerned about having any plugin (be it your's or Sean's or Tarek's etc...) related code accepted into the trunk some day. I would dream of plugins to be inside GCC before 2010, but I am not sure of that! Here's an example of that. In GCC, passes have execute and gate functions which are already function pointers. With libplugin you can make these replaceable/extensible/event-like without changing a single line of code in GCC. I'll read again in detail your proposal later. But my main concern is not technical (how to do plugins - we each have our ideas, and we hopefully would be able to merge them), but more social: what are the (future) constaints (notably legal & licence contraints)? what are the technical counterparts? how to make some plugin code accepted in the trunk (I don't care whose code it is; and I don't have any generic plugin machinery myself: I feel that MELT could fit in most reasonably defined & well documented plugin subsystems)? Maybe some much simpler plugin mechanism code (perhaps from Mozilla Treehydra) might be easier to accept into the trunk, not because it is better, but just because it is simpler, and quicker to be accepted into the trunk? 
Actually, I still don't understand how exactly (socially speaking) are big patches accepted into the trunk (at stage one) and how to help a branch (or some other code somewhere else) to be more easily accepted into GCC? Should big patches absolutely be cut into small understandable parts? If in practice a succession of small patches has a much bigger chance to be accepted into the trunk than a bigger one, I don't know what will happen next. In addition, I would suppose that the runtime license could *requires* some technical behavior by the plugin machinery, and such requirements may perhaps prohibit more technically advanced solutions. I'm really impatient to understand what kind of plugins will be permitted (and when) in GCC, and also what kind of plugins will be disallowed [this could mean more than the definition of a proprietary plugin]. So far, I have no idea (because I am not a lawyer and because I don't know anything about the current work on the runtime license). Perhaps a microscopic plugin feature is better than a better designed one. A miniplugin mechanism is enough to add more advanced mechanisms (like your libplugin proposal, & perhaps like my MELT branch) inside, themselves implemented in dlopen-ed *.so. Maybe we should wait till at least end of october for some input (even unofficial rumors) by the few people working on the runtime license. I think that most of the is
Need help in a linking error
Hi, I appreciate if someone can help me with my linking error: In my "c++" options , i already have ' -L/usr/local/lib -lgnet-2.0'. I get a number of '7: undefined reference to `gnet_conn_readline'' errors. Can you please tell me why the linking fails? I have the 'gnet.h' include in my .cpp and I apparently compile fine. c++ -fno-rtti -fno-exceptions -Wall -Wpointer-arith -Woverloaded-virtual -Wsynth -Wno-ctor-dtor-privacy -Wno-non-virtual-dtor -Wcast-align -Wno-invalid-offsetof -Wno-long-long -pedantic -fno-strict-aliasing -fshort-wchar -pthread -pipe -DDEBUG -D_DEBUG -DDEBUG_scheung -DTRACING -g -fno-inline -Os -freorder-blocks -fno-reorder-functions -finline-limit=50 -I/usr/include/gtk-2.0 -I/usr/lib/gtk-2.0/include -I/usr/include/atk-1.0 -I/usr/include/cairo -I/usr/include/pango-1.0 -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/freetype2 -I/usr/include/libpng12 -I/usr/include/pixman-1 -I/usr/include/gtk-unix-print-2.0 -o TestGNet TestGNet.o -lpthread -Wl,-rpath-link,/media/sdb3/src/tracemonkey/src/firefox-objdir/dist/bin -Wl,-rpath-link,/lib -L../../../../dist/bin -L../../../../dist/lib -lX11 /media/sdb3/src/tracemonkey/src/firefox-objdir/dist/lib/libxpcomglue.a -lasound -ldl -lm -L/usr/local/lib -lgnet-2.0 -lgtk-x11-2.0 -latk-1.0 -lgdk-x11-2.0 -lgdk_pixbuf-2.0 -lm -lpangocairo-1.0 -lpango-1.0 -lcairo -lgobject-2.0 -lgmodule-2.0 -ldl -lglib-2.0 TestGNet.o: In function `main': /media/sdb3/src/tests/TestGNet.cpp:304: undefined reference to `gnet_conn_new' /media/sdb3/src/tests/TestGNet.cpp:310: undefined reference to `gnet_conn_connect' /media/sdb3/src/tests/TestGNet.cpp:312: undefined reference to `gnet_conn_set_watch_error' /media/sdb3/src/tests/TestGNet.cpp:314: undefined reference to `gnet_conn_timeout' TestGNet.o: In function `ob_sig_int': /media/sdb3/src/tests/TestGNet.cpp:1380: undefined reference to `gnet_conn_write' /media/sdb3/src/tests/TestGNet.cpp:1382: undefined reference to `gnet_conn_readline' TestGNet.o: In function `ob_conn_func': /media/sdb3/src/tests/TestGNet.cpp:1307: undefined reference to `gnet_conn_readline' /media/sdb3/src/tests/TestGNet.cpp:1344: undefined reference to `gnet_conn_delete' /media/sdb3/src/tests/TestGNet.cpp:1212: undefined reference to `gnet_conn_timeout' /media/sdb3/src/tests/TestGNet.cpp:1227: undefined reference to `gnet_conn_write' /media/sdb3/src/tests/TestGNet.cpp:1229: undefined reference to `gnet_conn_readline' /media/sdb3/src/tests/TestGNet.cpp:1315: undefined reference to `gnet_conn_delete' /media/sdb3/src/tests/TestGNet.cpp:1330: undefined reference to `gnet_conn_delete' /media/sdb3/src/tests/TestGNet.cpp:1251: undefined reference to `gnet_conn_delete' /usr/bin/ld: TestGNet: hidden symbol `gnet_conn_connect' isn't defined /usr/bin/ld: final link failed: Nonrepresentable section on output collect2: ld returned 1 exit status gmake[5]: *** [TestGNet] Error 1 Thank you for any help.
Re: Does IRA support stack slot sharing for locals and spilled pseudos?
Richard Henderson wrote: Jeff Law wrote: (mem/c:DI (reg:DI 122 r122 [121]) [64 ivtmp.743+0 S8 A64]) and (mem/c:DI (reg:DI 122 r122) [64 ivtmp.1640+0 S8 A64]) ... Yes. There's code at the start of nonoverlapping_memrefs_p to handle these cases, but as Pat pointed out, it doesn't work for large offsets from the stack/frame pointer (large enough to cause a secondary reload). I'm not sure offhand how to best fix this. How about setting the MEM_EXPR to a fake "spill_slot_base" symbol, plus the full frame pointer offset number? Since the slots are being shared, the original decl as the MEM_EXPR isn't terribly useful. Presumably mucking around with the MEM_EXPR on the DECL isn't going to mess up inlining? What about debugging? Jeff
Re: IRA accumulated costs
Hi Vlad, Thanks for the great reply, and sorry for not replying sooner. Things have been a bit hectic for me recently. Vladimir Makarov <[EMAIL PROTECTED]> writes: > Richard Sandiford wrote: >> Although I suspect it isn't intentional, I can imagine it doesn't show >> up much on targets whose memory move costs are significantly higher than >> their register move costs. >> >> Which brings us to MIPS. The big question there is: what do we >> do with the accumulator registers? Do we put them in the same >> cover class as GPRs? >> >> Or perhaps that's jumping the gun. Perhaps the first question is: >> should we mark the accumulator registers as fixed, or at least hide them >> from the register allocator? I'm planning to do the latter for MIPS16, >> but I don't think it's a good idea for normal MIPS for two reasons: >> >> - The DSP ASE provides 4 accumulators. We want to apply >> normal register allocation to them. >> >> - Some targets have multiply-accumulate instructions that operate on >> LO and HI. But it isn't always a win to use them. If a target has >> both multiply-accumulate _and_ pipelined three-operand multiplication >> instructions, it is often better to use the latter for parallel >> multiply-accumulate chains. We've traditionally treated the >> choice as a register-allocation problem, which seems to have >> worked reasonably well. >> >> Also, the macc instruction on some targets can copy the LO result >> to a GPR too. The register allocator can currently take advantage >> of that, allocating a GPR when it's useful and not wasting one >> otherwise. (From what I've seen in the past, JPEG FFT loops tend >> to be on the borderline as far as register pressure on MIPS goes, >> so this can be an important win.) >> >> But there are only a limited number of accumulator registers (1 or 4, >> depending on the target). There's quite a high likelihood that >> any given value will need to be spilled from the accumulators >> at some point. When that happens, it's better to spill to a GPR >> than it is to spill to memory, since any load or store has to go >> through a GPR anyway. It therefore seems better to put GPRs and >> accumulator registers in the same cover class. >> >> > It better to put GPRs and ACCs in the same class if it is better to > spill ACC_REGS to GPR than to memory but it should be reflected in some > way in memory and register costs. Great! That's what the WIP patch did, and it seems to work well with IRA for the most part. This unnecessary spilling thing was the only real problem I've found. >> We currently give moves between GPRs and accumulators a higher cost than >> GPR loads and stores.[*] On the targets for which this cost is accurate, >> we _don't_ want to use LO and HI as spill space. We also don't want >> to move between one accumulator and another if we can help it. >> And IRA generally seems happy with this. >> >> [*] Which isn't an accurate reflection of all targets, but that's >> another story. We ought eventually to put this in the CPU cost >> table. >> >> The hitch is that the cost of storing an accumulator to memory >> is the cost of a GPR<->accumulator move plus the cost of a load >> or store. The cost of moving between one accumulator and another >> is the cost of two GPR<->accumulator moves. Both of these aggregate >> costs are accurate, to the extent that the constituent costs are >> accurate (see [*] above). So we have a situation in which the >> worst-case register<->register cost (acc<->acc) outweighs the >> worst-cost register<->memory cost (acc<->mem). 
And that goes >> against the cover class documentation. >> >> > The documentation can be changed. It is just a recommendation or how I > see it. We have to separate reg classes into non-intersected ones > because Chaitin-Briggs coloring needs this for its work. The bigger > cover classes, the more probability that RA puts more pseudos into hard > registers. On the other hand CB coloring does not understand register > and memory move costs (assign_hard_reg does understand but it can be > used for other coloring algorithms as Chow's priority coloring). It is > hard to find a balance in defining cover classes to put more pseudos > into hard-registers and still generate cheap code. I should think about > better IRA_COVER_CLASSES description. Thanks. (This certainly wasn't an attack on the documentation btw. My point was more: "I realise that, because the MIPS set-up isn't really sanctioned by the documentation, it would be reasonable to classify this as a bug in the MIPS port".) But it's good to hear this is more a recommendation than a hard rule. Like I say, IRA seems to cope pretty well with things despite the unusual costs. >> For the most part things Just Work. But in the code quoted >> above, this cost: >> >>cost = (ira_register_move_cost[mode][rclass][rclass] >>* (
Re: IRA accumulated costs
Vladimir Makarov <[EMAIL PROTECTED]> writes: > Hi, Richard. Returning to accurate cost accumulation issue you found > recently. Here is the patch fixing it. You could try, if you want, how > MIPS will behave with it. The patch also more accurately calculates > ALLOCNO_CALL_FREQ which affects decision to spill allocno in > assign_hard_reg if it is more profitable. Thanks. I'll give the patch a go. Richard
Re: m32c: pointer math vs sizetype again
> Is this related to the loop termination bug I reported > on the m32c? > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37665 Probably related, but I don't know if a patch to fix one will fix the other.
Re: m32c: pointer math vs sizetype again
On Wed, Oct 1, 2008 at 12:20 AM, DJ Delorie <[EMAIL PROTECTED]> wrote: > > I've got a partial patch which works with older (4.3) gccs, but fails > gimple's check for trunk (attached). My trivial test case... > > char * > foo (char *a, int b) > { > return a-b; > } > > ...fails thusly: > > size > constant 32> >unit size int> constant 4> >align 8 symtab 0 alias set -1 canonical type 0xb7f52c30 precision 32 min > max > > size > constant 16> >unit size int> constant 2> >align 8 symtab 0 alias set -1 canonical type 0xb7efc000 precision 16 min > max > > useless false: ../../gcc/gcc/tree-ssa.c 1092 > dj.c: In function 'foo': > dj.c:2: error: type mismatch in pointer plus expression > D.1194 = a + D.1196; > > char * > > char * > > > > D.1194 = a + D.1196; > > dj.c:2: internal compiler error: verify_gimple failed > > > I'm obviously doing something wrong in the cast-to-bigger step. How > can I get this to pass gimple? What I'm trying to accomplish is this: > > 1. Values added to pointers need to be treated as signed (at least, if > they're signed types, certainly if you're going to use a > NEGATE_EXPR). > > 2. If sizeof(size_t) < sizeof(void *), sign extend the intop to be > pointer-sized before adding it. > > > > Index: c-common.c > === > --- c-common.c (revision 140759) > +++ c-common.c (working copy) > @@ -3337,20 +3337,28 @@ pointer_int_sum (enum tree_code resultco > intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype), > TYPE_UNSIGNED (sizetype)), intop); > > /* Replace the integer argument with a suitable product by the object size. > Do this multiplication as signed, then convert to the appropriate > type for the pointer operation. */ > - intop = convert (sizetype, > + intop = convert (ssizetype, > build_binary_op (EXPR_LOCATION (intop), >MULT_EXPR, intop, >convert (TREE_TYPE (intop), size_exp), 1)); > > /* Create the sum or difference. */ > if (resultcode == MINUS_EXPR) > -intop = fold_build1 (NEGATE_EXPR, sizetype, intop); > +intop = fold_build1 (NEGATE_EXPR, ssizetype, intop); > + > + if (TREE_CODE (result_type) == POINTER_TYPE > + && TYPE_PRECISION (result_type) > TYPE_PRECISION (TREE_TYPE (intop))) > +{ > + tree iptr_type = c_common_type_for_mode (TYPE_MODE (result_type), > + TYPE_UNSIGNED (result_type)); > + intop = fold_build1 (NOP_EXPR, iptr_type, intop); > +} > > ret = fold_build2 (POINTER_PLUS_EXPR, result_type, ptrop, intop); > > fold_undefer_and_ignore_overflow_warnings (); > > return ret; I think this is the wrong place to fix this. If you would override the sizetypes precision from your target, would that fix it? That is, in stor-layout.c set_sizetype make the target allow adjusting the passed type (which is supposed to be sizetype). If at all then these types should be consistent. Richard.
gcc-4.2-20081001 is now available
Snapshot gcc-4.2-20081001 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20081001/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.2 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_2-branch revision 140822 You'll find: gcc-4.2-20081001.tar.bz2 Complete GCC (includes all of below) gcc-core-4.2-20081001.tar.bz2 C front end and core compiler gcc-ada-4.2-20081001.tar.bz2 Ada front end and runtime gcc-fortran-4.2-20081001.tar.bz2 Fortran front end and runtime gcc-g++-4.2-20081001.tar.bz2 C++ front end and runtime gcc-java-4.2-20081001.tar.bz2 Java front end and runtime gcc-objc-4.2-20081001.tar.bz2 Objective-C front end and runtime gcc-testsuite-4.2-20081001.tar.bz2The GCC testsuite Diffs from 4.2-20080924 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.2 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: m32c: pointer math vs sizetype again
> I think this is the wrong place to fix this. If you would override > the sizetypes precision from your target, would that fix it? That > is, in stor-layout.c set_sizetype make the target allow adjusting > the passed type (which is supposed to be sizetype). If at all then > these types should be consistent. The problem is that the chip has 24 bit pointers, but 16 bit registers. It has math operations for 16 bit numbers and some 32 bit numbers (the rest are emulated). It has a few operations for 24 bit numbers. There are no C types for 24 bit numbers (PSImode is 32 bits wide with 24 bit precision, if I tweak its precision manually it tries to use bitfield instructions all over the place, if I don't it uses "long int" which is wrong). All I want for now is to treat ptr+int as a signed addition, not an unsigned one. My patch is just trying to detect the case where a sign extension is needed at all, and insert it.
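Concretely (a hypothetical illustration of the 24-bit-pointer, 16-bit-int situation):

  char *back_one (char *p)
  {
    int i = -1;     /* 16 bits: 0xFFFF */
    return p + i;   /* must be computed as p + 0xFFFFFF, i.e. p - 1;
                       zero-extending the 16-bit offset would give
                       p + 0x00FFFF and land 64K bytes too far */
  }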
Re: m32c: pointer math vs sizetype again
DJ Delorie wrote: I think this is the wrong place to fix this. If you would override the sizetypes precision from your target, would that fix it? That is, in stor-layout.c set_sizetype make the target allow adjusting the passed type (which is supposed to be sizetype). If at all then these types should be consistent. The problem is that the chip has 24 bit pointers, but 16 bit registers. It has math operations for 16 bit numbers and some 32 bit numbers (the rest are emulated). It has a few operations for 24 bit numbers. There are no C types for 24 bit numbers (PSImode is 32 bits wide with 24 bit precision, if I tweak its precision manually it tries to use bitfield instructions all over the place, if I don't it uses "long int" which is wrong). All I want for now is to treat ptr+int as a signed addition, not an unsigned one. My patch is just trying to detect the case where a sign extension is needed at all, and insert it. Can you look in the CVS/SVN archives and see what the mn102 port did -- it had the same core properties as the chip you're describing. It was a 16/24 bit chip (true 24bit address registers), mostly 16bit ops with a few 24bit ops. All 32bit ops were synthesized. Jeff
Re: Does IRA support stack slot sharing for locals and spilled pseudos?
Jeff Law wrote: Presumably mucking around with the MEM_EXPR on the DECL isn't going to mess up inlining? What about debugging? I'm certain it won't mess up inlining, because all that is long done with by the time we're in rtl -- it's all just one big function by this time. I wouldn't have thought it would mess up debugging, but we'll have to see what happens with Alex's enhanced var-tracking... I have attached a patch to pr 37447. r~
Re: Need help in a linking error
This list is for discussing GCC development, not for dealing with usage problems. Please try asking on [EMAIL PROTECTED] instead. Thanks, Ben
Re: Defining a common plugin machinery
I have notes inline below, following is my summary of libplugin from what i understand of your posts: * It exists as a fraemwork that works with GCC now * It uses xml files to define plugins (Allows making new plugins as combinations of others without making a new shared library, i.e. just create an xml file that describes the plugin) * It handles issues with inter-dependencies between plugins * It uses a "push" framework, where function pointers are replaced/chained in the original application rather than explicit calls to plugins (Provides more extensibility in a application that makes heavy use of function pointers, but produces a less explicit set of entry points or hooks for plugins) * Currently it provides automatic loading of plugins without specific user request * It already has a framework for allowing plugins to interact with the pass manager If you can think of any other points to summarize the features it might be helpful as you are closer to it. The issues i see with this framework: * it seems to provide a lot of features that we may not necessarily need (That should be up for discussion) * plugin entry points are not well defined but can be "any function pointer call" Some questions: * How does the framework interact with the compile command line arguments? * Does this work on platforms that dont support -rdynamic or can it be modified to do so in the future? Hugh Leather wrote: >*Separating Plugin system from appliction* >Libplugin ships as a library. Apart from a few lines of code in >toplev.c, the only other changes to GCC will be refactorings and >maybe calling a few functions through pointers. As i understand the difference between the pull vs push, a plugin will load, and then modify existing function pointers in GCC to insert its own code and chain the existing code to be called after it. Is this correct? Doing this will be able to make use of existing function pointers as plugin hook locations, but some hooks we may want are not already called by function pointers and so would need to be changed. This means that plugin hook locations are not explicitly defined, but rather any place where a function pointer is used can be modified. Personally i prefer explicit specification of plugin hook locations. >I think it's important to separate the plugin system from the >application. Doing plugins well, IMO, requires a lot of code. It >shouldn't be spread through the app. It also cleanly separates >plugin mechanism from the actual extensions the app wants. >Finally, plugins have to be extensible too. They should really be on >a nearly equal footing with the app. Otherwise plugin developers >who want the plugins to be extensible will need to reimplement there >own extensibility system. Without the use of plugin meta-data in XML files and auto-loading and many of the things discussed, i am not so sure that plugins will be such a large body of code. It is really a matter of deciding if such features that libplugin provides are desirable for GCC. If so, then there is a lot of code required for plugins and libplugin becomes a good idea IMO. If not, then libplugin may just be more than we need. It really depends on what "doing plugins well" means for the specific application. >*Scalable and Granularity* >The system is very scalable. Really this is due to the push >architecture. The granularity as i understand it is only as fine/coarse as the number of function pointers in the system that can be overwritten. This is no different from the pull method (i.e. 
The granularity depends on where you put the hook locations) except that function pointers "may already exist". Though i may have mis-understood something... I.e. For the "pull" method you can: Add a "pull" for firePluginEvent() or add a "pull" inside each existing event handler. Where as the push method requires that the existing event handlers are called via function pointers and the "push" chains itself to that. I have used a similar method for the "push" plugin in python. The advantage here is that basically "anything" can be pushed in python so the system becomes very flexible to extend via "plugins". In C/C++ the areas that can be extended need to be defined and turned into function pointers for the push method to work. Again, assuming i have understood how it works. >*Mutliple cooperating plugins >*I think some of the proposals don't allow multiple plugins or >plugins aren't able to be extended in the same way that the >application is. In libplugin you can have lots of plugins all >depending on each other. Plugins can provide extension points as >well as the application - this means it isn't just a matter of the >application deciding what's important and everyone else having to >make do. > >In some senses, this is the difference between a plugin system and >loading a few shared libraries. A plugin system provides a
Re: m32c: pointer math vs sizetype again
> Can you look in the CVS/SVN archives and see what the mn102 port did -- It used SImode for size_type but I think I tried that and it blew up in useless_type_conversion_p. I can try again if you're interested in the details.
Re: query regarding adding a pass to undo final value replacement.
Hi, > >> b) If any PHI node has count zero it can be inserted back and its > >> corresponding computations removed, iff the argument of the PHI > >> node > >> still exists as an SSA variable. This means that we can insert > >> a_1 = PHI if D.10_1 still exists and hasnt been removed by > >> any of the passes between the scalar evolution pass and the > >> loopdone pass. > > > > this does not work: > > -- we reuse ssa names, so it can happen that the argument of the PHI node > > is eliminated, then reused for a different purpose > > I wasn't sure if from the proposal strong enough to catch this case ? i.e. if > > > So if the ssa_names are infact reused they won't be the same > computations. do you also check this for ssa names inside the loop (in your example, D.10_1? > > -- in case more complex loop transformations were performed > > (e.g., loop reversal), the final value of the ssa name might have > > changed. > > Could you give an example for this ? for (i = 100; i > 0; i--) a[i] = i; transformed to for (i = 1; i <= 100; i++) a[i] = i; the final value of i was originally 0, now it is 101. > Is there anything else you might > suggest in terms of undoing the transformations from scalar cprop.? I would probably try to somehow pass the information from scev analysis to value numbering, and let PRE take care of the issue, Zdenek
Re: query regarding adding a pass to undo final value replacement.
Hi, > > I would disagree on that. Whether a final value replacement is > > profitable or not largely depends on whether it makes further > > optimization of the loop possible or not; this makes it difficult > > to find a good cost model. I think undoing FVR is a good approach > > to solve this problem (unfortunately, the proposed implementation > > does not work), > > Ok, fair enough. Ideally we would then be able to retain the PHI nodes > and somehow record an equivalency in the IL from which we later could > remove either of the definitions. Something like > > def_1 = PHI < ... > > > def_2 = compute > > def_3 = EQUIV > (def_3 = ASSERT_EXPR ?) > > much similar to REG_EQUAL notes. This means that both def_1 and def_2 > are conditionally dead if the EQUIV is the only remaining use. > > No idea if this is feasible and useful enough in general though. > > Do you remember what kind of missed optimizations you saw (apart from > missed dead loop removal)? vectorization and linear loop transformations did not like values used outside of the loop; I am not sure whether (our implementation of) graphite handles them or not, Zdenek
Re: Defining a common plugin machinery
Hello All, Brendon Costa wrote: Some questions: * How does the framework interact with the compile command line arguments? * Does this work on platforms that dont support -rdynamic or can it be modified to do so in the future? [I'm skipping the rest of an interesting post] I thought that for the first plugin machinery we don't care about platforms without -rdynamic (or those without dlopen or tl_dlopen). I believe we should first focus (when the runtime license will permit that) on making whatever plugin machinery available and merged into the trunk (when it comes back to stage one). This is not an easy task. In practice, I think that we should first try to get some code into the trunk which make some plugin work on some common easy host system (Linux), and only after try to generalize the work to harder hosts. At last, I believe that the plugin system will at first be something which can be disabled at configure time, and will be disabled by default. My main concern is plugins & passes. Regards. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basilestarynkevitchnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***
Re: Defining a common plugin machinery
> I believe we should first focus (when the runtime license will permit > that) on making whatever plugin machinery available and merged into > the trunk (when it comes back to stage one). This is not an easy task. Isn't the point of this discussion to decide what features to put into a plugin framework? I.e. We need a "whatever plugin machinery available" to exist before we can even think about merging that into the trunk and defining what that looks like is the point of this discussion i thought. Possible steps for moving forward with this: 1) Define what features we need for the first release, and think about what we may want in the future 2) See which frameworks currently exist and how each meets the necessary features identified 3) Either use one of the above frameworks as a base or start a new framework on the plugin branch 4) Work on the "base set of features" for a first release 5) Make sure the branch is up to date/tracking the trunk 6) Look at merging into the trunk when licensing is complete We are still at 1 (and partially identifying projects for 2) as far as i understand. I was going to start itemizing the features we have discussed, and the frameworks mentioned on the wiki. But I am not going to have time to do so for a number of weeks now. If someone else wants to do it it may get done a bit faster. So far, i think libplugin seems to be the most "general" plugin framework for GCC i have had a chance to look at (It was easy to look at because it has some decent documentation online). > In practice, I think that we should first try to get some code into > the trunk which make some plugin work on some common easy host system > (Linux), and only after try to generalize the work to harder hosts. I agree, that providing working code for only simple to implement platforms (and basic plugin features) at first is a good idea (but do so on a branch first, then merge that to the trunk once it is operational). However we do not want to start with a framework that will need to be completely redesigned in the future to later support other platforms or usages. I.e. Thinking ahead but not necessarily implementing ahead... > My main concern is plugins & passes. Yes. We have not really looked at this more important aspect in much detail, how to manage passes with plugins. It looks like libplugin has some ideas for pass management that may help? Any thoughts?
Re: Defining a common plugin machinery
Brendon Costa wrote: I believe we should first focus (when the runtime license will permit that) on making whatever plugin machinery available and merged into the trunk (when it comes back to stage one). This is not an easy task. Isn't the point of this discussion to decide what features to put into a plugin framework? I.e. We need a "whatever plugin machinery available" to exist before we can even think about merging that into the trunk and defining what that looks like is the point of this discussion i thought. I entirely agree. Apologies to everyone if I badly expressed myself. Possible steps for moving forward with this: 1) Define what features we need for the first release, and think about what we may want in the future 2) See which frameworks currently exist and how each meets the necessary features identified 3) Either use one of the above frameworks as a base or start a new framework on the plugin branch 4) Work on the "base set of features" for a first release 5) Make sure the branch is up to date/tracking the trunk 6) Look at merging into the trunk when licensing is complete We are still at 1 (and partially identifying projects for 2) as far as i understand. I also agree. What I don't understand is if having a simple crude plugin mechanism make it easier to be accepted in the trunk. I don't understand what makes features & code easy to be accepted in a stage one trunk. I don't understand if havving a small plugin machinery (there already exists some) make it easier to be accepted in the trunk. I still do not understand how and when big patches get accepted in the trunk. What are the social issues involved? What is the best way to get code reviewers (those able to approve a patch) interested by any big plugin patch? (And FYI I am asking myself the same question for MELT: what should I do now to get some day in the future MELT accepted in the trunk?). So far, i think libplugin seems to be the most "general" plugin framework for GCC i have had a chance to look at (It was easy to look at because it has some decent documentation online). In practice, I think that we should first try to get some code into the trunk which make some plugin work on some common easy host system (Linux), and only after try to generalize the work to harder hosts. I agree, that providing working code for only simple to implement platforms (and basic plugin features) at first is a good idea (but do so on a branch first, then merge that to the trunk once it is operational). However we do not want to start with a framework that will need to be completely redesigned in the future to later support other platforms or usages. I.e. Thinking ahead but not necessarily implementing ahead... I fully agree. But who thinks that the libplugin patch (or any other plugin machinery) could be accepted into the trunk? My main concern is plugins & passes. Yes. We have not really looked at this more important aspect in much detail, how to manage passes with plugins. It looks like libplugin has some ideas for pass management that may help? Any thoughts? Apparently they have. But we need to have a more exact picture of what the GCC steering commitee & the FSF wants (and even more importantly do not wants) regarding plugins. I could imagine that they (& perhaps us) want some tricks that make proprietary plugins impractical, but I have no idea of what that would technically mean (because I have no understanding of the legal & social system involved). 
My hypothesis is that several plugin mechanisms for GCC already exist (on some branches or somewhere else). If a small plugin patch has a better chance to get accepted into the trunk, we should limit ourselves to such a small thing. If big plugin machinery could be accepted (I would prefer that) we should understand what would make them more acceptable. In both cases, plugins have probably some requirements defined by the future runtime license, which I don't know yet. Regards. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basilestarynkevitchnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***