What to do with hardware exception (unaligned access) ? ARM920T processor
Hi! Processor ARM920T, chip Atmel at91rm9200. Simple C code: char c[30]; unsigned short *pN = &c[1]; *pN = 0x1234; causes a hardware exception - a memory abort (used to implement memory protection or virtual memory). We have a lot of source code, with pieces of code like this, which must be ported from x86 to ARM9. Are there any compiler options to handle this exception? I compile it using arm-elf-gcc 4.3.2 under Linux, binutils-2.18. Thanks. Best regards, Vladimir
Re: What to do with hardware exception (unaligned access) ? ARM920T processor
On 10/1/08, Vladimir Sterjantov <[EMAIL PROTECTED]> wrote: > Processor ARM920T, chip Atmel at91rm9200. > > char c[30]; > unsigned short *pN = &c[1]; > > *pN = 0x1234; Accesses to shorts on ARM need to be aligned to an even address, and longs to a 4-byte address. Otherwise the access (e.g., for a 4-byte word pointer) returns *(p & ~3) >>> (p & 3) (where >>> is byte rotate, not bit shift), or causes a memory fault, if that's how your system is configured. If you don't want to make the code portable and you are running a recent Linux, a fast fix is to echo 2 > /proc/cpu/alignment which should make the kernel trap misaligned accesses and fix them up for you, with a loss in performance of course. The real answer is to fix the code... M
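For reference, the usual portable fix is to route the misaligned store through memcpy and let the compiler choose the access sequence for the target. A minimal sketch (the helper name and surrounding code are illustrative, not from the original sources):

#include <string.h>

/* Store a 16-bit value at a possibly misaligned address.  On
   strict-alignment targets such as the ARM920T this compiles to byte
   stores; on x86 it typically becomes a single 16-bit store. */
static void store_u16_unaligned (void *p, unsigned short v)
{
  memcpy (p, &v, sizeof v);
}

int main (void)
{
  char c[30];
  store_u16_unaligned (&c[1], 0x1234);   /* safe even though &c[1] is odd */
  return 0;
}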
RE: Defining a common plugin machinery
Dear all, I noticed a long discussion about plugins for GCC. It seems that it's currently moving toward important legal issues; however, I wanted to backtrack and just mention that we at INRIA and in the MILEPOST project are clearly interested in having a proper plugin system in the mainline of GCC, which will simplify our work on automatically tuning optimization heuristics (cost models, etc) or easily adding new transformations and modules for program analysis. We currently have a simple plugin system within the Interactive Compilation Interface (http://gcc-ici.sourceforge.net) that can load external DLL plugins (transparently, through environment variables to avoid changing project Makefiles, or through the command line), can substitute the original Pass Manager to be able to call any passes (new analysis passes, for example) in any legal order, and has an event mechanism to raise events at any place in GCC and pass data to the plugin (such as information about cost model dependencies) or return parameters (such as decisions about transformations, for example). Since it's relatively simple, we are currently able to port it to new versions of GCC without problems; however, naturally, we would like to have this functionality within the GCC mainline with a defined API (this is what I discussed with Taras from Mozilla during the GCC Summit this year). I believe it may help make GCC a modular compiler and simplify future designs (and can be in line with the idea to use C++ for GCC development if Ian moves this project forward). By the way, Hugh Leather also developed an interesting plugin system for GCC that allows substituting internal GCC functions with external ones within plugins to enable hooks inside GCC (he mentioned that he plans to release it soon)... Furthermore, we will then be able to use our current MILEPOST tools and Collective Compilation Framework to automatically tune the default GCC optimization heuristics for performance, code size or power consumption (instead of using -O1,2,3 levels) for a particular architecture before new versions of GCC are actually released (or for additional testing of a compiler using different combinations and orders of passes). And when the compiler is released, users can further tune their particular programs interactively or automatically through the external tools and GCC plugins. By the way, we are extending the current ICI for GCC 4.4 to add cost-model tuning for major optimizations (GRAPHITE, vectorization, inlining, scheduling, register allocation, unrolling, etc) and provide function cloning with different optimizations, and naturally would like to make it compatible with the potential future common GCC plugin system, so I hope we will be able to agree on a common plugin design soon and move it forward ;) ... Regards, Grigori = Grigori Fursin, INRIA, France http://fursin.net/research > -----Original Message----- > From: Taras Glek [mailto:[EMAIL PROTECTED] > Sent: Tuesday, September 16, 2008 11:43 PM > To: Diego Novillo > Cc: Basile STARYNKEVITCH; gcc@gcc.gnu.org; Sean Callanan; Albert Cohen; > [EMAIL PROTECTED] > Subject: Re: Defining a common plugin machinery > > Basile STARYNKEVITCH wrote: > > Hello Diego and all, > > > > Diego Novillo wrote: > >> > >> After the FSF gives final approval on the plugin feature, we will need > >> to coordinate towards one common plugin interface for GCC. I don't > >> think we should be adding two different and incompatible plugin > >> harnesses. > > > > What exactly did happen on the FSF side after the last GCC summit?
> > I heard nothing more detailed than the Steering Committee Q&A BOFS and > > the early draft of some legal licence involving plugins. What happened > > on the Steering Committee or legal side since August 2008? Is there any > > announcement regarding FSF approving plugins? > > > >> I am CCing everyone who I know is working on plugin features. > >> Apologies if I missed anyone. > >> > >> I would like to start by understanding what the plugin API should > >> have. What features do we want to support? What should we export? > >> What should be the user interface to incorporate plugin code? At > >> what point in the compilation stage should it kick in? > >> > >> Once we define a common API, we can take the implementation > >> from the existing branches. Perhaps create a common branch? I would > >> also like to understand what the status and features of the > >> different branches are. > > > > > > The MELT plugin machinery is quite specific in its details, and I > > don't believe it can be used -in its current form- for other plugins. > > It really expects the plugin to be a MELT one. > > > > From what I remember of the plugin BOFS (but I may be wrong), an easy > > consensus seems to be that plugins should be loadable thru the command > > line (probably a -fplugin=foo meaning that some foo.so should be > > dlopen-ed), that they could take a single s
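For concreteness, the loading step being discussed (a -fplugin=foo option causing some foo.so to be dlopen-ed) boils down to something like the sketch below; the entry-point name plugin_init and its signature are illustrative assumptions, not an agreed GCC plugin API:

#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical entry point every plugin would export. */
typedef int (*plugin_init_fn) (const char *args);

static int
load_plugin (const char *path, const char *args)
{
  void *handle = dlopen (path, RTLD_NOW | RTLD_GLOBAL);
  if (!handle)
    {
      fprintf (stderr, "cannot load plugin %s: %s\n", path, dlerror ());
      return 1;
    }

  plugin_init_fn init = (plugin_init_fn) dlsym (handle, "plugin_init");
  if (!init)
    {
      fprintf (stderr, "%s does not export plugin_init: %s\n", path, dlerror ());
      dlclose (handle);
      return 1;
    }

  /* Hand the plugin its argument string and let it register itself. */
  return init (args);
}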
Re: What to do with hardware exception (unaligned access) ? ARM920T processor
On Wednesday 01 October 2008, Martin Guy wrote: > If you don't want to make the code portable and you are running a > recent Linux, a fast fix is to > echo 2 > /proc/cpu/alignment > which should make the kernel trap misaligned accesses and fix them up > for you, with a loss in performance of course. The real answer is to > fix the code... ...and this is where -Wcast-align should help. The OP should also have a look at -Wpadded and -Wpacked, because this may expose similar pitfalls. This writeup looks like a good start for the OP: http://lecs.cs.ucla.edu/wiki/index.php/XScale_alignment
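A minimal example of what -Wcast-align catches - essentially the original snippet with the pointer conversion made explicit (the warning fires on an explicit cast, and only on targets where the cast increases the required alignment):

char buf[30];

unsigned short *
get_misaligned (void)
{
  /* char * has 1-byte alignment, unsigned short * needs 2 bytes on ARM,
     so -Wcast-align flags this cast. */
  return (unsigned short *) &buf[1];
}

Compiling this with something like arm-elf-gcc -Wcast-align -c t.c should produce a warning along the lines of "cast increases required alignment of target type".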
Re: GCC 4.2.2 arm-linux-gnueabi: c++ exceptions handling?
Hello all, I've found the cause of my problem - it's binutils 2.17.50. Using ld 2.18, or even 2.17.90 creates workable libstdc++.so. Regards, Sergei Sergei Poselenov wrote: Hello all, I've built the above cross-compiler and ran the GCC testsuite. Noted a lot of c++ tests failed with the same output: ... terminate called after throwing an instance of 'int' terminate called recursively Aborted ... Compiler details: Reading specs from /opt/eldk-4.2-arm-2008-09-24/usr/bin/../lib/gcc/arm-linux-gnueabi/4.2.2/specs Target: arm-linux-gnueabi Configured with: /work/psl/eldk-builds/arm-2008-09-24/work/usr/src/denx/BUILD/crosstool-0.43/build/gcc-4.2.2-glibc-20070515T2025-eldk/arm-linux-gnueabi/gcc-4.2.2/configure --target=arm-linux-gnueabi --host=i686-host_pc-linux-gnu --prefix=/var/tmp/eldk.Jb5047/usr/crosstool/gcc-4.2.2-glibc-20070515T2025-eldk/arm-linux-gnueabi --disable-hosted-libstdcxx --with-headers=/var/tmp/eldk.Jb5047/usr/crosstool/gcc-4.2.2-glibc-20070515T2025-eldk/arm-linux-gnueabi/arm-linux-gnueabi/include --with-local-prefix=/var/tmp/eldk.Jb5047/usr/crosstool/gcc-4.2.2-glibc-20070515T2025-eldk/arm-linux-gnueabi/arm-linux-gnueabi --disable-nls --enable-threads=posix --enable-symvers=gnu --enable-__cxa_atexit --enable-languages=c,c++,java --enable-shared --enable-c99 --enable-long-long --without-x Thread model: posix gcc version 4.2.2 However, testing results in http://gcc.gnu.org/ml/gcc-testresults/2007-09/msg00570.html states that this should work. I even downloaded and built the exact pre-release version used in the above tests and tried it - all the same. I wonder could it be the kernel or Glibc/binutils issue? I'm running 2.6.21.5, Glibc is 2.6 (Fedora Core 7 release), binutils is 2.17.50.0.12. Could someone having the 4.2 release series compiler configured for ARM EABI target try this simple test: extern "C" void abort(void); #define CI(stmt) try { stmt; abort(); } catch (int) { } struct has_destructor { ~has_destructor() { } }; struct no_destructor { }; int PI(int& i) { return i++; } int main(int argc, char *argv[]) { (argc+1 ? has_destructor() : throw 0); CI((argc+1 ? throw 0 : has_destructor())); } Build as arm-linux-gnueabi-g++ -o cond1 cond1.C Thanks for any feedback, Sergei
Re: Defining a common plugin machinery
Sorry, I think this bounced twice. Hugh Leather wrote: Hi All, Thanks, Grigori, for mentioning my plugin system, libplugin, which can be found at http://libplugin.sourceforge.net/. I have been meaning to release it but finding the time to finish off the documentation and upload all the newest code to SourceForge has been difficult (both code and docs on SourceForge are some months out of date). The plugin system was built to support MilePost goals for GCC, as we need to be able to capture events during compilation as well as change compilation behaviour. Here are some of the features of the system: *Application agnostic.* The plugin system is not GCC specific but can be used to make any application plugin aware. Plugin management is handled through a shared library which GCC, or any other application, can link to. I think if GCC took a similar approach then it would benefit from the exposure the system received elsewhere and the wider community would also have access to a professionally built plugin system. As the plugin system becomes more powerful, GCC reaps the rewards without having to change a line of code. The other, huge advantage of this, together with the design I'll describe below, is that GCC only has ~10 lines of plugin code; to initialise the library. The rest is working out how to refactor GCC to make it extensible. This way GCC won't be cluttered with nasty plugin stuff that obscures the meaning of the code. Finally, plugins for different applications can coexist. For example, we might have some plugins for the linker, some for driver, some for compiler and some that work in any of those. *Eclipse Inspired.* I've take inspiration from the excellent plugin system for the Eclipse IDE, http://www.eclipse.org. It has proved very successful, so it seemed like a good starting point. *What it is.* * Each plugin has an XML file describing it * Plugins have ExtensionPoints that other plugins can extend * Plugins can have shared libraries * Requires libxml2, libffi, libltdl An ExtensionPoint is one of the fundamental parts of the system. It provides the links between plugins. Each ExtensionPoint is really just an object with one method: bool extend( ExtensionPoint* self, Plugin* extendingPlugin, xmlNodePtr specfication ) This method tells the ExtensionPoint that some other plugin is extending it and gives it the XML that plugin uses. The ExtensionPoint can do whatever it likes with that XML. It might contain symbols pointing to functions it should use, it might be markup for logging text. It could be a list of unroll factors, one for each function or a list of passes to apply to a particular function. You can describe anything in XML. * An Example.* Maybe that's a bit confusing, so here's an example. Suppose we have a plugin which offers a logging or message service. It would have a plugin specification in XML like this: That says it's a plugin for GCC, it has id "simple-message", it uses a certain shared library. It also says it has an extension point called "simple-message.print" and gives the function to call when anyone extends that extension point. This function is called "simpleMessage_extend" and is in the shared library the plugin specified. It looks like this: bool simpleMessage_extend( ExtensionPoint* self, Plugin* extendingPlugin, xmlNodePtr specfication ) { printf( "%s\n", xmlNodeGetContent( specification )); return TRUE; } It simply prints the text content of any plugin that extends it. Another plugin might come along later and have this as its specification: Hello,World! 
Hopefully that little guy should be clear, it prints "Hello, World!" Now the plugin system has taken care of all the dependency management, only required plugins are loaded, etc. All the appropriate extension points are created (only those used) and extensions are applied. There's also a plugin lifecycle allowing things to happen at appropriate times. We could have had our plugins exchange code, remember things until later, do anything in fact. If you can describe it in XML then the plugin system lets you do it. *Ease of Use with Events and JoinPoints* Such simple extension points provide all the power you ever need, but not the ease of use. So, the system also lets you do common things with almost no code in GCC, just a slight refactoring and tiny description in XML. The most common things people want from a plugin system is to be able to listen to events or to replace behaviours. I'll show you how events are added here. Suppose GCC (or another plugin) is going to raise an event called "myEvent". The event will take an int and a string. Here's
query regarding adding a pass to undo final value replacement.
Hi, Based on the conversation in the thread at http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a pass trying to undo final value replacement going. The initial implementation was done by Pranav Bhandarkar when he was employed at Azingo as part of work sponsored by Icera Semiconductor. I've been trying to get this working with my private port over here. We intend to contribute this back once our copyright assignments are sorted and if this will be acceptable to all. I've been getting a few compile time ICEs with this approach and haven't been able to resolve them well enough yet. Whilst doing so, I wanted to check on the approach as outlined below and ask if there's anything that we might have missed or any problem that one can see with us going along these lines. Thanks for your time and patience. cheers Ramana

1) Understanding what scalar evolution does.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Consider the following pseudo code.

function memcpy (src_pointer, dst_pointer)
  src_1 = src_pointer;
  dst_1 = dst_pointer;
L1:
  *dst_1 = *src_1 (Word copy)
  dst++;
  src++;      <-- Inc by 4 bytes, i.e. 1 word.
  conditional jump to L1

  /* This is the exit block of loop 1. The following PHI nodes are added by the loopinit pass to convert the SSA form into "closed loop SSA" (see rewrite_into_loop_closed_ssa in tree-ssa-loop-manip.c). */

  src_2 = PHI <src_1>
  dst_2 = PHI <dst_1>

L2:
  *dst_2 = *src_2 (Byte Copy)
  dst++;
  src++;

Now scalar evolution converts this into

function memcpy (src_pointer, dst_pointer)
  src_1 = src_pointer;
  dst_1 = dst_pointer;
L1:
  *dst_1 = *src_1 (Word copy)
  dst++;
  src++;      <-- Inc by 4 bytes, i.e. 1 word.
  conditional jump to L1

  /* This is the exit block of loop 1. The following PHI nodes are added by the loopinit pass to convert the SSA form into "closed loop SSA" (see rewrite_into_loop_closed_ssa in tree-ssa-loop-manip.c). */

  D.1234_11 = 4 * n  (where 'n' is the number of iterations of L1)
  src_2 = src_pointer + D.1234_11
  D.1235_22 = 4 * n
  dst_2 = dst_pointer + D.1235_22

L2:
  *dst_2 = *src_2 (Byte Copy)
  dst++;
  src++;

Therefore a PHI node is replaced by the final values of src_1 and dst_1, thus introducing extra computations.

2) How to undo what scalar evolution does.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To undo what scalar evolution does, we need to record the changes that scalar evolution makes and then, after the loop optimizations are run, we need to put the PHI nodes that were removed earlier back in place of the computations introduced by scalar evolution. A high level description of the process is listed here. (see tree-scalar-evolution.c and grep for DXP_SPECIFIC)

Explanation of this sub-pass.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Part 1: Record final value replacement related changes.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Final value replacement replaces PHI nodes at the exits of loops with computations based on the number of iterations of the loop. For example:

L1:
  x_1 = src + 4
  ...
  ...
  conditional jump to L1.
  (Loop Exit)
  x_2 = PHI <x_1>   <-- PHI node added by rewrite_into_loop_closed_ssa (see the loopinit pass).

Final value replacement replaces the PHI node by

  ssa_temp_var = 4 * no_of_iterations_of_loop
  x_2 = src + ssa_temp_var;

Therefore a PHI node is replaced by computations. Recording final value replacement related changes is controlled by the variable record_scalar_evolution_changes. When set to a non-zero value, the function record_changed_stmts records the changes made. The changes are recorded in a hashtable changed_stmts_table. The hashtable contains the stmt added, the PHI node for which this stmt was added, and hashcodes for both the stmt and the phi_node. We also note how many computations have been added for each of the removed PHI nodes.
This is done in a linked list pointed to by phi_nodes_info_head.

Part 2: Undo final value replacement related changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is the part where the new computations are removed and the PHI nodes that they replaced are inserted back in. This replacement is contingent on a few conditions. a) All the computations that were added are still present in the basic block, i.e. all the computations are still present in the form in which they were added and haven't been touched by any of the loop optimization passes that run between the scalar evolution pass (i.e. the pass when Part 1 is executed) and the 'loopdone' pass. We go through the exit basic block and look up each stmt in changed_stmts_table. If found, we look up the corresponding PHI node in the phi_node_info linked list and decrement its count by 1 (count here denotes the number of computations added. When count is 0 it means all the computations added in the scalar evolution pass have been found in the same form in the loopdone pass; such a PHI node can be inserted back in if 'b' is a
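For concreteness, a plain-C shape of the memcpy pseudocode above looks roughly like this (a sketch for illustration, not the actual testcase; alignment and aliasing details are glossed over):

void
my_memcpy (char *dst, const char *src, unsigned long n)
{
  unsigned long words = n / sizeof (long);
  unsigned long bytes = n % sizeof (long);

  /* Word-copy loop (L1).  At its exit, the loop-closed SSA PHIs for
     src and dst are candidates for final value replacement. */
  while (words--)
    {
      *(long *) dst = *(const long *) src;
      dst += sizeof (long);
      src += sizeof (long);
    }

  /* Byte-copy loop (L2) uses the final values of src and dst. */
  while (bytes--)
    *dst++ = *src++;
}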
Re: query regarding adding a pass to undo final value replacement.
On Wed, Oct 1, 2008 at 3:22 PM, Ramana Radhakrishnan <[EMAIL PROTECTED]> wrote: > Hi , > > Based on the conversation in the thread at > http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a > pass trying to undo final value replacement going. The initial > implementation was done by Pranav Bhandarkar when he was employed at > Azingo as part of work sponsored by Icera Semiconductor. I've been > trying to get this working with my private port over here. We intend > to contribute this back once our copyright assignments are sorted and > if this will be acceptable to all. I've been getting a few compile > time ICEs with this approach and haven't been able to resolve them > well enough yet. Whilst doing so, I wanted to check on the approach as > outlined below and ask if there's anything that we might have missed > or any problem that one can see with us going along these lines. > Thanks for your time and patience. Some quick comments. First, do you have a non-pseudo-code testcase that exposes the extra computations? Second, I think rather than trying to undo what SCEV const(!)-prop is doing adjust its cost model (maybe there isn't one) to not create the costly substitutions. Thanks, Richard. > cheers > Ramana > > > > 1) Understanding what scalar evolution does. > ~ > > Consider the following pseudo code. > > function memcpy (src_pointer, dst_pointer) > src_1 = src_pointer; > dst_1 = dst_pointer; > L1: > *dst_1 = *src_1 (Word copy) > dst++; > src++;< Inc by 4 bytes i.e 1 word. > conditional jump to L1 > > /* This is the exit block of loop 1. The following PHI nodes > are added by loopinit pass to convert the SSA form into "closed loop > SSA" (see rewrite_into_loop_closed_ssa" in tree-ssa-loop-manip.c */ > > src_2 = PHI > dst_2 = PHI > > > L2: > *dst_2 = *src_2 (Byte Copy) > dst++; > src++; > > > > Now scalar evolution convertes this into > > Function memcpy (src_pointer, dst_pointer) > src_1 = src_pointer; > dst_1 = dst_pointer; > L1: > *dst_1 = *src_1 (Word copy) > dst++; > src++;< Inc by 4 bytes i.e 1 word. > conditional jump to L1 > > > /* This is the exit block of loop 1. The following PHI nodes > are added by loopinit pass to convert the SSA form into "closed loop > SSA" (see rewrite_into_loop_closed_ssa" in tree-ssa-loop-manip.c */ > > D.1234_11 = 4 * n (where 'n' is the number of iterations of L1) > src_2 = src_pointer + D.1234_11 > D.1235_22 = 4 * n > dst_2 = dst_pointer + D.1235_22 > > > L2: > *dst_2 = *src_2 (Byte Copy) > dst++; > src++; > > > > > Therefore a PHI Node is replaced by the final value of src_1 and > dst_1, thus introducing extra computations. > > > 2) How to undo what scalar evolution does. > ~~ > > To undo what scalar evolution does, we need to record the changes that > scalar evolution makes and then after the loop optimizations are run > we need to replace the PHI nodes that were removed earlier in place of > the computations introduced by scalar evolution. > > A high level description of the process is listed here. (see > tree-scalar-evolution.c and grep for DXP_SPECIFIC) > > Explanation of this sub-pass. > ~~ > > Part 1: Record Final Value replacement related changes. > ~~ > Final value replacement replaces PHI nodes at the exits of loops with > computations based on the number of iterations of the loop. > For e.g. > > L1: > > x_1 = src + 4 > ... > ... > conditional jump to L1. >(Loop Exit) > x_2 = PHI < Phi node added by rewrite_into_loop_closed_ssa. > (see the loopinit pass). 
> > Final Value replacement replaces the PHI node in by > > ssa_temp_var = 4 * no_of_iterations_of_loop > x_2 = src + ssa_temp_var; > > Therefore a PHI node is replaced by computations. > > Recording final value replacement related changes is controlled by the > variable record_scalar_evolution_changes. When set to a non-zero value > the function record_changed_stmts records the changes made. The changes > are recorded in a hashtable changed_stmts_table. The hashtable contains > the stmt added, the PHI node for which this stmt was added and hashcodes > for both the stmt and the phi_node. We also note how many computations > have been added for each of the removed PHI nodes. This is done in a > link list pointed to by phi_nodes_info_head. > > Part 2: Undo Final value replacement related changes > ~~ > This is the part where the new computations are removed and the PHI nodes > that they are replaced are inserted back in. This replacement is > contingent to a few conditions. > a) All the computations that were added are still present in the basic > block. i.e all the computations are still present in the form in > which they were added and havent been touched by any of the loop > optimizations pas
Re: IRA accumulated costs
Hi, Richard. Returning to accurate cost accumulation issue you found recently. Here is the patch fixing it. You could try, if you want, how MIPS will behave with it. The patch also more accurately calculates ALLOCNO_CALL_FREQ which affects decision to spill allocno in assign_hard_reg if it is more profitable. 2008-10-01 Vladimir Makarov <[EMAIL PROTECTED]> * ira-int.h (ira_allocno): Add member updated_cover_class_cost. (ALLOCNO_UPDATED_COVER_CLASS_COST): New. (ira_fast_allocation): Remove the prototype. * ira-color.c (update_copy_costs, allocno_cost_compare_func, assign_hard_reg, calculate_allocno_spill_cost): Use updated costs. (color_pass): Modify the updated costs. (ira_color): Rename to color. Make it static. (ira_fast_allocation): Rename to fast_allocation. Make it static. (ira_color): New function. * ira-conflicts.c (process_regs_for_copy): Propagate hard reg cost change. * ira-lives.c (last_call_num, allocno_saved_at_call): New variables. (set_allocno_live, clear_allocno_live, mark_ref_live, mark_ref_dead): Invalidate corresponding element of allocno_saved_at_call. (process_bb_node_lives): Increment last_call_num. Setup allocno_saved_at_call. Don't increase ALLOCNO_CALL_FREQ if the allocno was already saved. (ira_create_allocno_live_ranges): Initiate last_call_num and allocno_saved_at_call. * ira-build.c (ira_create_allocno): Initiate ALLOCNO_UPDATED_COVER_CLASS_COST. (create_cap_allocno, propagate_allocno_info, remove_unnecessary_allocnos): Remove setting updated costs. (ira_flattening): Set up ALLOCNO_UPDATED_COVER_CLASS_COST. * ira.c (ira): Don't call ira_fast_allocation. * ira-costs.c (setup_allocno_cover_class_and_costs): Don't set up updated costs. Index: ira-conflicts.c === --- ira-conflicts.c (revision 140793) +++ ira-conflicts.c (working copy) @@ -337,6 +337,7 @@ process_regs_for_copy (rtx reg1, rtx reg enum reg_class rclass, cover_class; enum machine_mode mode; ira_copy_t cp; + ira_loop_tree_node_t parent; gcc_assert (REG_SUBREG_P (reg1) && REG_SUBREG_P (reg2)); only_regs_p = REG_P (reg1) && REG_P (reg2); @@ -388,13 +389,23 @@ process_regs_for_copy (rtx reg1, rtx reg cost = ira_register_move_cost[mode][cover_class][rclass] * freq; else cost = ira_register_move_cost[mode][rclass][cover_class] * freq; - ira_allocate_and_set_costs -(&ALLOCNO_HARD_REG_COSTS (a), cover_class, - ALLOCNO_COVER_CLASS_COST (a)); - ira_allocate_and_set_costs -(&ALLOCNO_CONFLICT_HARD_REG_COSTS (a), cover_class, 0); - ALLOCNO_HARD_REG_COSTS (a)[index] -= cost; - ALLOCNO_CONFLICT_HARD_REG_COSTS (a)[index] -= cost; + for (;;) +{ + ira_allocate_and_set_costs + (&ALLOCNO_HARD_REG_COSTS (a), cover_class, + ALLOCNO_COVER_CLASS_COST (a)); + ira_allocate_and_set_costs + (&ALLOCNO_CONFLICT_HARD_REG_COSTS (a), cover_class, 0); + ALLOCNO_HARD_REG_COSTS (a)[index] -= cost; + ALLOCNO_CONFLICT_HARD_REG_COSTS (a)[index] -= cost; + if (ALLOCNO_HARD_REG_COSTS (a)[index] < ALLOCNO_COVER_CLASS_COST (a)) + ALLOCNO_COVER_CLASS_COST (a) = ALLOCNO_HARD_REG_COSTS (a)[index]; + if (ALLOCNO_CAP (a) != NULL) + a = ALLOCNO_CAP (a); + else if ((parent = ALLOCNO_LOOP_TREE_NODE (a)->parent) == NULL + || (a = parent->regno_allocno_map[ALLOCNO_REGNO (a)]) == NULL) + break; +} return true; } Index: ira-int.h === --- ira-int.h (revision 140793) +++ ira-int.h (working copy) @@ -258,9 +258,9 @@ struct ira_allocno /* Register class which should be used for allocation for given allocno. NO_REGS means that we should use memory. 
*/ enum reg_class cover_class; - /* Minimal accumulated cost of usage register of the cover class for - the allocno. */ - int cover_class_cost; + /* Minimal accumulated and updated costs of usage register of the + cover class for the allocno. */ + int cover_class_cost, updated_cover_class_cost; /* Minimal accumulated, and updated costs of memory for the allocno. At the allocation start, the original and updated costs are equal. The updated cost may be changed after finishing @@ -451,6 +451,7 @@ struct ira_allocno #define ALLOCNO_LEFT_CONFLICTS_NUM(A) ((A)->left_conflicts_num) #define ALLOCNO_COVER_CLASS(A) ((A)->cover_class) #define ALLOCNO_COVER_CLASS_COST(A) ((A)->cover_class_cost) +#define ALLOCNO_UPDATED_COVER_CLASS_COST(A) ((A)->updated_cover_class_cost) #define ALLOCNO_MEMORY_COST(A) ((A)->memory_cost) #define ALLOCNO_UPDATED_MEMORY_COST(A) ((A)->updated_memory_cost) #define ALLOCNO_EXCESS_PRESSURE_POINTS_NUM(A) ((A)->excess_pressure_points_num) @@ -897,7 +898,6 @@ extern void ira_reassign_conflict_allocn extern void ira_initiate_assign (void); extern void ira_finish_assign (void); extern void ira_color (void); -extern void
Re: Defining a common plugin machinery
Aye up all, I've now been reading through some of the list archive. Some of the posts were about how to tell GCC which plugins to load. I thought I'd tell you how libplugin does it. First there is a plugin path. This tells the system where plugin XML specifications can be found (each plugin needs one specification file). The path can be set by environment variable and/or command line argument. The application can also add to the plugin path itself, so GCC could include plugins from its own installation directory (although it doesn't at the moment). The plugin system will look at every specification file in the plugin path. Those files must be well formed XML with processing directives saying that the plugin is for the current application. for only GCC 4.3.1 or for any GCC 4.X.X or for any application - a few of the provided plugins work with all applications - like logging support. For the patch I have for GCC, only the compiler is plugin aware, not the driver, linker, etc. We could easily have processing directives for each so that you could define a plugin that worked on any set of the compiler applications. A plugin with both the processing directives below would work on both the linker and the driver, but not anything else: Plugins can be marked as lazy or eager (eager by default). Lazy plugins aren't loaded unless explicitly asked for or unless needed by another plugin. This allows users to setup (or for the compiler to set up) default plugins. You could, for example, remove all passes from GCC and distribute them as plugins with a small number being required (not that you should). The user can specify a list plugins with environment variable (GCC_PLUGINS=id,id) and/or command line argument (-plugins id,id). These are just comma separated lists of plugin ids (actually they can be glob patterns, too). Every plugin with such a matching id is marked as eager if it isn't already. The system fails if a requested plugin can't be found. So, we start loading all eager plugins. This means setting up their extension points, loading their libraries, executing lifecycle methods, etc., etc.. If any loaded plugin needs a lazy plugin, that lazy plugin is marked eager and will also be loaded. Plugins 'need' each other by either: * Having an explicit 'requires' element in their specfication, e.g. * Extending an extension point from the other plugin. This can be either by 'extension' elements in the specifications or by code (for example in lifecycle methods or, well, pretty much anything). This means that you don't need to know what plugins provide extension points you want, the system just handles it for you. This also means that you can have one plugin which loads up lots of other plugins. If we had all non-essential passes as plugins for example, one plugin called "O3" could load up a certain number of them and set parameters - I'm not suggesting we do that, though :-) Finally, you can specify arguments to plugins. This can either be via command line (-plugin-var id=value;id=value) and/or environment variable (GCC_PLUGIN_VAR=id=value;id=value). Plugin XML specifications can directly use these arguments, specify their own, use expansion over them etc. Plugin XML files can also use other sources of variable, such as any environment variable with variable names like "env.PATH". Plugin shared libraries also have an API to access these arguments. The variables also have an escaping syntax so that characters like '=' and ';' can be represented. Cheers, Hugh. 
Re: m32c: pointer math vs sizetype again
Is this related to the loop termination bug I reported on the m32c? http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37665 The generated code is using the lower 16-bits of the address for end of the array rather than the full 24-bit address. --joel DJ Delorie wrote: I've got a partial patch which works with older (4.3) gccs, but fails gimple's check for trunk (attached). My trivial test case... char * foo (char *a, int b) { return a-b; } ...fails thusly: constant 32> unit size constant 4> align 8 symtab 0 alias set -1 canonical type 0xb7f52c30 precision 32 min max > constant 16> unit size constant 2> align 8 symtab 0 alias set -1 canonical type 0xb7efc000 precision 16 min max > useless false: ../../gcc/gcc/tree-ssa.c 1092 dj.c: In function 'foo': dj.c:2: error: type mismatch in pointer plus expression D.1194 = a + D.1196; char * char * D.1194 = a + D.1196; dj.c:2: internal compiler error: verify_gimple failed I'm obviously doing something wrong in the cast-to-bigger step. How can I get this to pass gimple? What I'm trying to accomplish is this: 1. Values added to pointers need to be treated as signed (at least, if they're signed types, certainly if you're going to use a NEGATE_EXPR). 2. If sizeof(size_t) < sizeof(void *), sign extend the intop to be pointer-sized before adding it. Index: c-common.c === --- c-common.c (revision 140759) +++ c-common.c (working copy) @@ -3337,20 +3337,28 @@ pointer_int_sum (enum tree_code resultco intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype), TYPE_UNSIGNED (sizetype)), intop); /* Replace the integer argument with a suitable product by the object size. Do this multiplication as signed, then convert to the appropriate type for the pointer operation. */ - intop = convert (sizetype, + intop = convert (ssizetype, build_binary_op (EXPR_LOCATION (intop), MULT_EXPR, intop, convert (TREE_TYPE (intop), size_exp), 1)); /* Create the sum or difference. */ if (resultcode == MINUS_EXPR) -intop = fold_build1 (NEGATE_EXPR, sizetype, intop); +intop = fold_build1 (NEGATE_EXPR, ssizetype, intop); + + if (TREE_CODE (result_type) == POINTER_TYPE + && TYPE_PRECISION (result_type) > TYPE_PRECISION (TREE_TYPE (intop))) +{ + tree iptr_type = c_common_type_for_mode (TYPE_MODE (result_type), + TYPE_UNSIGNED (result_type)); + intop = fold_build1 (NOP_EXPR, iptr_type, intop); +} ret = fold_build2 (POINTER_PLUS_EXPR, result_type, ptrop, intop); fold_undefer_and_ignore_overflow_warnings (); return ret; Index: tree.c === --- tree.c (revision 140759) +++ tree.c (working copy) @@ -3283,15 +3283,21 @@ build2_stat (enum tree_code code, tree t if ((code == MINUS_EXPR || code == PLUS_EXPR || code == MULT_EXPR) && arg0 && arg1 && tt && POINTER_TYPE_P (tt)) gcc_assert (TREE_CODE (arg0) == INTEGER_CST && TREE_CODE (arg1) == INTEGER_CST); if (code == POINTER_PLUS_EXPR && arg0 && arg1 && tt) -gcc_assert (POINTER_TYPE_P (tt) && POINTER_TYPE_P (TREE_TYPE (arg0)) - && INTEGRAL_TYPE_P (TREE_TYPE (arg1)) - && useless_type_conversion_p (sizetype, TREE_TYPE (arg1))); +{ + gcc_assert (POINTER_TYPE_P (tt)); + gcc_assert (POINTER_TYPE_P (TREE_TYPE (arg0))); + gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (arg1))); +#if 0 + gcc_assert (useless_type_conversion_p (sizetype, TREE_TYPE (arg1)) + || useless_type_conversion_p (ssizetype, TREE_TYPE (arg1))); +#endif +} t = make_node_stat (code PASS_MEM_STAT); TREE_TYPE (t) = tt; /* Below, we automatically set TREE_SIDE_EFFECTS and TREE_READONLY for the result based on those same flags for the arguments. But if the -- Joel Sherrill, Ph.D. 
Director of Research & Development [EMAIL PROTECTED]On-Line Applications Research Ask me about RTEMS: a free RTOS Huntsville AL 35805 Support Available (256) 722-9985
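To make the failure mode concrete: with 24-bit pointers but a 16-bit sizetype, a loop like the sketch below can miscompare if the end-of-array address is formed from a truncated 16-bit quantity. Sizes and names here are illustrative, not the PR 37665 testcase:

#define LEN 100

char buf[LEN];

void
fill (char v)
{
  char *p = buf;
  char *end = buf + LEN;   /* pointer + integer goes through sizetype */

  while (p != end)         /* misbehaves if 'end' kept only the low 16 bits */
    *p++ = v;
}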
Re: query regarding adding a pass to undo final value replacement.
On Wed, Oct 1, 2008 at 3:59 PM, Richard Guenther <[EMAIL PROTECTED]> wrote: > On Wed, Oct 1, 2008 at 3:22 PM, Ramana Radhakrishnan <[EMAIL PROTECTED]> > wrote: >> Hi , >> >> Based on the conversation in the thread at >> http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a >> pass trying to undo final value replacement going. The initial >> implementation was done by Pranav Bhandarkar when he was employed at >> Azingo as part of work sponsored by Icera Semiconductor. I've been >> trying to get this working with my private port over here. We intend >> to contribute this back once our copyright assignments are sorted and >> if this will be acceptable to all. I've been getting a few compile >> time ICEs with this approach and haven't been able to resolve them >> well enough yet. Whilst doing so, I wanted to check on the approach as >> outlined below and ask if there's anything that we might have missed >> or any problem that one can see with us going along these lines. >> Thanks for your time and patience. > > Some quick comments. First, do you have a non-pseudo-code testcase > that exposes the extra computations? Second, I think rather than > trying to undo what SCEV const(!)-prop is doing adjust its cost > model (maybe there isn't one) to not create the costly substitutions. Indeed the comment on scev_const_prop says "Also perform final value replacement in loops, in case the replacement expressions are cheap." but no such check for cheapness is done. Whatever cost model we add we need to make sure to not disable empty loop removal - that is, loops that only care for the final value of their induction variable. A sensible simple cost model is that the replacement is either a constant, a SSA_NAME or an expression of the form CST * name + CST (which may be a common thing). In your testcase there are even divisions inserted. I guess the empty-loop removal interaction makes fixing this one harder, but trying to record things to undo this transformation doesn't look right either. Richard.
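For reference, the empty-loop-removal case mentioned above is the classic one below: once final value replacement rewrites the loop-closed PHI for i as n, the loop body is dead and the whole loop can be deleted (a minimal sketch):

unsigned int
count_up (unsigned int n)
{
  unsigned int i = 0;

  while (i < n)
    i++;

  /* After final value replacement this is simply: return n; */
  return i;
}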
Re: Defining a common plugin machinery
Hugh Leather wrote: Aye up all, I've now been reading through some of the list archive. Some of the posts were about how to tell GCC which plugins to load. I thought I'd tell you how libplugin does it. Thanks for the nice explanation. I'm not sure I understand exactly how libplugin deals with adding passes; apparently, the entire pass manager (i.e. gcc/passes.c) has been rewritten or enhanced. Also, I did not understand the exact conceptual differences between libplugin & other proposals. Apparently libplugin is much more ambitious. So we now have many plugin proposals & experiments. However, we do know that there are some legal/political/license issues on these points (with the GCC community rightly wanting as hard as possible to avoid proprietary plugins), that some interaction seems to be happening (notably between the Steering Committee & the FSF), and that the work is going slowly (because of a lack of resources & labor & funding? at the FSF). My perception is that the issues are not mostly technical, but still political (and probably, as Ian Taylor mentioned in http://gcc.gnu.org/ml/gcc/2008-09/msg00442.html, a lack of lawyers or other human resources at the FSF, which cost much more than any reasonable person could afford individually). I actually might not understand why exactly plugins are not permitted by the current GCC licenses. What I don't understand is:

* what exactly do we call a plugin? I feel (but I am not a lawyer) that (on Linux) it is any *.so file which is fed to dlopen. I'm not able to point to what parts of the GCC license prohibit that (I actually hope that nothing prohibits it right now, if the *.so is compiled from GPLv3-ed, FSF-copyrighted code; the MELT branch is doing exactly that right now).

* will the runtime license be ready by Christmas 2008? [Some messages made me think not, that it is too much lawyer work; other messages made me a bit more optimistic; I really am confused.] Of course, I don't want any hard date, but I am in absolute darkness on the actual work already done on improving the runtime license, and even more on what needs to be fixed. Also, I have no idea of the work involved in writing new licenses (I only know that the GPLv3 effort lasted much more than one year). Did I say that I am not a lawyer, and do not understand even the basic principles of US law (or perhaps even French law)?

* what kind of intrusiveness do we want for the plugin machinery? Do we want it to be clean and hence to touch a lot of files (in particular the details of passes & the pass manager), or do we first want some quick and dirty plugin trick merged into the trunk, even if it is imperfect?

* what is the plugin machinery useful for? Only adding optimisation passes, or something much more ambitious (adding new front ends, back ends, targets)?

* what is the interaction between the plugin machinery & the rest of GCC (e.g. GGC, dump files, ...)?

* what is the granularity plugins are wanted or needed for? Only whole passes, or something smaller than that (e.g. some specific functions inside specific passes)?

* who really wants plugins to happen quickly, and which company would invest money [not only code] in that?

* what host systems do we want the plugin to work with? Is the libtool dyn loader enough? Could every non-static symbol inside cc1 be visible to the plugin?

* do we really want one single (fits-all) plugin machinery inside GCC?
My feeling is that a lot of varied technical effort has already been put into plugins, but that the future runtime license may (or may not) impact the technicalities (perhaps making some proposed technical solutions impossible). I really don't understand what the hard limit is, i.e. what exactly the FSF or the Steering Committee wants to avoid (obviously proprietary plugins implementing new machine targets are unwanted, but what else; is the goal to only permit FSF-copyrighted GPLed plugins; what would be the review policy for code going into plugins?). I've got no idea how hard it would be to get any plugin system accepted into the GCC trunk, or when that work could start (i.e. when to send plugin patches to gcc-patches@). I tend to believe that this is the main issue now. Are plugin patches supposed to be welcome - on the gcc-patches@ mailing list, for trunk acceptance - when GCC goes back into stage 1? Will the first plugin patches (submitted to gcc-patches@ for acceptance into trunk) be huge or tiny patches? Technically both are possible (of course with different goals & features). I don't even know what a plugin legally is. For instance, in my MELT branch code is indeed dlopen-ed, but [currently] the C code of the plugin is generated (by the plugin itself) from MELT lisp-like files, which are all inside the MELT branch (GPL-ed, FSF copyrighted). Perhaps that does not even count, from a legal point of view, as a plugin? [I really hope I am not unknowingly doing illegal things on the MELT branch; to calm eve
Re: Does IRA support stack slot sharing for locals and spilled pseudos?
Pat Haugen wrote: Alexander Monakov <[EMAIL PROTECTED]> wrote on 09/29/2008 01:34:12 PM: I'm seeing a miscompilation on sel-sched branch that at first sight looks related to IRA merge. alias.c::anti_dependence disambiguates references to (mem/c:DI (reg:DI 122 r122 [121]) [64 ivtmp.743+0 S8 A64]) and (mem/c:DI (reg:DI 122 r122) [64 ivtmp.1640+0 S8 A64]) while there are no stores to r122 between corresponding insns. It does so because nonoverlapping_memrefs_p returns TRUE for these mems, which is, in turn, due to this code: 2118 /* If either RTL is not a MEM, it must be a REG or CONCAT, meaning they 2119 can't overlap unless they are the same because we never reuse that part 2120 of the stack frame used for locals for spilled pseudos. */ 2121 if ((!MEM_P (rtlx) || !MEM_P (rtly)) 2122 && ! rtx_equal_p (rtlx, rtly)) 2123 return 1; Corresponding RTL_DECLS are: rtlx = (reg:DI 97 r105 [orig:850 ivtmp.743 ] [850]) rtly = (mem/c:DI (plus:DI (reg/f:DI 111 r119) (const_int -1456 [0xfa50])) [64 ivtmp.1640+0 S8 A64]) Does IRA support stack slot sharing described in the comment? We just got done walking through a failure with 200.sixtrack that looks like the same thing. The two insns involved are: (insn 33168 33162 33175 27 maincr.f:1 (set (reg/f:DI 14 14 [orig:614 ivtmp.1309 ] [614]) (mem/c:DI (plus:DI (reg:DI 11 11) (const_int -7080 [0xe458])) [101 ivtmp.1309+0 S8 A64])) 349 {*movdi_internal64} (nil)) (insn 33175 33168 33176 27 maincr.f:1 (set (mem/c:DF (plus:DI (reg:DI 11 11 [5]) (const_int -7080 [0xe458])) [101 D.3497+0 S8 A64]) (reg:DF 45 13 [orig:765 D.3497 ] [765])) 336 {*movdf_hardfloat64} (expr_list:REG_DEAD (reg:DF 45 13 [orig:765 D.3497 ] [765]) (nil))) The MEM refs are not seen as overlapping which then allows the scheduler to reorder the store to MEM above the load. The problem is brought about because an additional register is needed to access the stack location since it is beyond the 32K limit for PPC. So before these references we have an insn 'r11 = r1 + 64K'. The code in alias.c:stack_addr_p() does not recognize r11 as pointing to the stack and therefor the IRA code in nonoverlapping_memrefs_p() does not recognize the above MEMs as being stack references and use the special code for reused ira spill slots. It seems like stack_addr_p() doesn't handle reg+reg addressing also, only recognizing reg+const references (unless those are meant to be caught elsewhere). Yea, I don't see how stack_addr_p handles cases such as secondary reloads due to an out of range displacement in a reg+d style addressing mode. Given the unpredictable nature of how out of range slot addresses are reloaded, I'm not sure that following the use-def chains back to the definition sites would be useful either. I'm certainly at a loss for a good way to fix this. Vlad -- any thoughts? jeff
Re: query regarding adding a pass to undo final value replacement.
Hi, > b) If any PHI node has count zero it can be inserted back and its > corresponding computations removed, iff the argument of the PHI node > still exists as an SSA variable. This means that we can insert > a_1 = PHI <D.10_1> if D.10_1 still exists and hasn't been removed by > any of the passes between the scalar evolution pass and the > loopdone pass.

this does not work:
-- we reuse ssa names, so it can happen that the argument of the PHI node is eliminated, then reused for a different purpose
-- in case more complex loop transformations were performed (e.g., loop reversal), the final value of the ssa name might have changed.

Zdenek
Re: query regarding adding a pass to undo final value replacement.
Hi, > > Based on the conversation in the thread at > > http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a > > pass trying to undo final value replacement going. The initial > > implementation was done by Pranav Bhandarkar when he was employed at > > Azingo as part of work sponsored by Icera Semiconductor. I've been > > trying to get this working with my private port over here. We intend > > to contribute this back once our copyright assignments are sorted and > > if this will be acceptable to all. I've been getting a few compile > > time ICEs with this approach and haven't been able to resolve them > > well enough yet. Whilst doing so, I wanted to check on the approach as > > outlined below and ask if there's anything that we might have missed > > or any problem that one can see with us going along these lines. > > Thanks for your time and patience. > > Some quick comments. First, do you have a non-pseudo-code testcase > that exposes the extra computations? Second, I think rather than > trying to undo what SCEV const(!)-prop is doing adjust its cost > model (maybe there isn't one) to not create the costly substitutions. I would disagree on that. Whether a final value replacement is profitable or not largely depends on whether it makes further optimization of the loop possible or not; this makes it difficult to find a good cost model. I think undoing FVR is a good approach to solve this problem (unfortunately, the proposed implementation does not work), Zdenek
Re: query regarding adding a pass to undo final value replacement.
Hi, > On Wed, Oct 1, 2008 at 3:59 PM, Richard Guenther > <[EMAIL PROTECTED]> wrote: > > On Wed, Oct 1, 2008 at 3:22 PM, Ramana Radhakrishnan <[EMAIL PROTECTED]> > > wrote: > >> Hi , > >> > >> Based on the conversation in the thread at > >> http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a > >> pass trying to undo final value replacement going. The initial > >> implementation was done by Pranav Bhandarkar when he was employed at > >> Azingo as part of work sponsored by Icera Semiconductor. I've been > >> trying to get this working with my private port over here. We intend > >> to contribute this back once our copyright assignments are sorted and > >> if this will be acceptable to all. I've been getting a few compile > >> time ICEs with this approach and haven't been able to resolve them > >> well enough yet. Whilst doing so, I wanted to check on the approach as > >> outlined below and ask if there's anything that we might have missed > >> or any problem that one can see with us going along these lines. > >> Thanks for your time and patience. > > > > Some quick comments. First, do you have a non-pseudo-code testcase > > that exposes the extra computations? Second, I think rather than > > trying to undo what SCEV const(!)-prop is doing adjust its cost > > model (maybe there isn't one) to not create the costly substitutions. > > Indeed the comment on scev_const_prop says > > "Also perform final value replacement in loops, >in case the replacement expressions are cheap." > > but no such check for cheapness is done. sorry for the leftover comment -- there used to be a test for the cost of the computation, but it caused so many (missed optimization) problems that I removed it in the end, Zdenek
Re: query regarding adding a pass to undo final value replacement.
Hi Zdenek, On Wed, Oct 1, 2008 at 5:19 PM, Zdenek Dvorak <[EMAIL PROTECTED]> wrote: > Hi, > >> b) If any PHI node has count zero it can be inserted back and its >> corresponding computations removed, iff the argument of the PHI node >> still exists as an SSA variable. This means that we can insert >> a_1 = PHI <D.10_1> if D.10_1 still exists and hasn't been removed by >> any of the passes between the scalar evolution pass and the >> loopdone pass. > > this does not work: > -- we reuse ssa names, so it can happen that the argument of the PHI node > is eliminated, then reused for a different purpose I wasn't sure about that - isn't condition a) from the proposal strong enough to catch this case? i.e. a) All the computations that were added are still present in the basic block, i.e. all the computations are still present in the form in which they were added and haven't been touched by any of the loop optimization passes that run between the scalar evolution pass (i.e. the pass when Part 1 is executed) and the 'loopdone' pass. We go through the exit basic block and look up each stmt in changed_stmts_table. If found, we look up the corresponding PHI node in the phi_node_info linked list and decrement its count by 1 (count here denotes the number of computations added. When count is 0 it means all the computations added in the scalar evolution pass have been found in the same form in the loopdone pass; such a PHI node can be inserted back in if 'b' is also true). So if the ssa_names are in fact reused they won't be the same computations. We do store the statements that were introduced, and if we see a difference in the statements based on the hashes calculated we don't undo the change. > -- in case more complex loop transformations were performed > (e.g., loop reversal), the final value of the ssa name might have > changed. Could you give an example of this? Is there anything else you might suggest in terms of undoing the transformations from scev const-prop? cheers Ramana > > Zdenek > -- Ramana Radhakrishnan
Re: Does IRA support stack slot sharing for locals and spilled pseudos?
Alexander Monakov wrote: Hello, I'm seeing a miscompilation on sel-sched branch that at first sight looks related to IRA merge. alias.c::anti_dependence disambiguates references to (mem/c:DI (reg:DI 122 r122 [121]) [64 ivtmp.743+0 S8 A64]) and (mem/c:DI (reg:DI 122 r122) [64 ivtmp.1640+0 S8 A64]) while there are no stores to r122 between corresponding insns. It does so because nonoverlapping_memrefs_p returns TRUE for these mems, which is, in turn, due to this code: 2118 /* If either RTL is not a MEM, it must be a REG or CONCAT, meaning they 2119 can't overlap unless they are the same because we never reuse that part 2120 of the stack frame used for locals for spilled pseudos. */ 2121 if ((!MEM_P (rtlx) || !MEM_P (rtly)) 2122 && ! rtx_equal_p (rtlx, rtly)) 2123 return 1; Corresponding RTL_DECLS are: rtlx = (reg:DI 97 r105 [orig:850 ivtmp.743 ] [850]) rtly = (mem/c:DI (plus:DI (reg/f:DI 111 r119) (const_int -1456 [0xfa50])) [64 ivtmp.1640+0 S8 A64]) Does IRA support stack slot sharing described in the comment? Yes. There's code at the start of nonoverlapping_memrefs_p to handle these cases, but as Pat pointed out, it doesn't work for large offsets from the stack/frame pointer (large enough to cause a secondary reload). I'm not sure offhand how to best fix this. jeff
Re: Does IRA support stack slot sharing for locals and spilled pseudos?
Jeff Law wrote: (mem/c:DI (reg:DI 122 r122 [121]) [64 ivtmp.743+0 S8 A64]) and (mem/c:DI (reg:DI 122 r122) [64 ivtmp.1640+0 S8 A64]) ... Yes. There's code at the start of nonoverlapping_memrefs_p to handle these cases, but as Pat pointed out, it doesn't work for large offsets from the stack/frame pointer (large enough to cause a secondary reload). I'm not sure offhand how to best fix this. How about setting the MEM_EXPR to a fake "spill_slot_base" symbol, plus the full frame pointer offset number? Since the slots are being shared, the original decl as the MEM_EXPR isn't terribly useful. r~
gcc source: how can I access asmspec_tree in function push_parm_decl
Hello, in gcc/c-decl.c I see that finish_decl (tree decl, tree init, tree asmspec_tree) has access to asmspec_tree, but the function push_parm_decl has no asmspec_tree parameter. Is there a way to get access to it without many code changes, or can the function below be hooked in at a better place? The AmigaOS target needs these functions. Can somebody tell me an easier way to add this feature, so that not so much GCC source has to change when making an AmigaOS port? Here is a short test program showing what is needed; the feature is to say which variable must be put in which register, or which register must be put in which variable:

long GfxBase;
void (*Old_Text)(long rp asm("a1"), long string asm("a0"), long count asm("d0"), long GfxBase asm("a6"));
void New_Text(long rp __asm("a1"), long string __asm("a0"), long count __asm("d0"))
{
  (*Old_Text)(rp, string, count, GfxBase);
}

But it would be much easier if there were a way to get access to asmspec without needing an additional parameter. The current way is to change many lines in c-parse.in (see below); here is the change that is needed in c-decl.c. The changes are against 3.4.0, as far as I could find out. diff -rupN gcc-3.4.0/gcc/c-decl.c gcc-3.4.0-gg/gcc/c-decl.c --- gcc-3.4.0/gcc/c-decl.c Mon Mar 22 18:58:18 2004 +++ gcc-3.4.0-gg/gcc/c-decl.c Tue Apr 27 11:12:30 2004 @@ -2943,7 +2943,7 @@ finish_decl (tree decl, tree init, tree and push that on the current scope. */ void -push_parm_decl (tree parm) +push_parm_decl (tree parm, tree asmspec) { tree decl; @@ -2956,6 +2956,75 @@ push_parm_decl (tree parm) TREE_PURPOSE (TREE_PURPOSE (parm)), PARM, 0, NULL); decl_attributes (&decl, TREE_VALUE (parm), 0); + + /* begin-GG-local: explicit register specification for parameters */ + if (asmspec) +#ifdef TARGET_AMIGAOS +{ + const char *regname=TREE_STRING_POINTER(asmspec); + int regnum; + if ((regnum=decode_reg_name(regname))>=0) + { + tree type=TREE_TYPE(decl); + if (HARD_REGNO_MODE_OK(regnum, TYPE_MODE(type))) + { + tree t, attrs; + /* Build tree for __attribute__ ((asm(regnum))). */ +#if 0 + /* This doesn't work well because of a bug in +attribute_list_contained(), which passes list of arguments to +simple_cst_equal() instead of passing every argument +separately. */ + attrs=tree_cons(get_identifier("asm"), tree_cons(NULL_TREE, + build_int_2_wide(regnum, 0), NULL_TREE), NULL_TREE); +#else + attrs=tree_cons(get_identifier("asm"), + build_int_2_wide(regnum, 0), NULL_TREE); +#endif +#if 0 + /* build_type_attribute_variant() would seem to be more +appropriate here. However, that function does not support +attributes for parameters properly. It modifies +TYPE_MAIN_VARIANT of a new type. As a result, comptypes() +thinks that types of parameters in prototype and definition +are different and issues error messages. See also comment +below. */ + type=build_type_attribute_variant(type, attrs); +#else + /* First check whether such a type already exists - if yes, use +that one. This is very important, since otherwise +common_type() would think that it sees two different +types and would try to merge them - this could result in +warning messages. */ + for (t=TYPE_MAIN_VARIANT(type); t; t=TYPE_NEXT_VARIANT(t)) + if (comptypes(t, type, COMPARE_STRICT)==1 + && attribute_list_equal(TYPE_ATTRIBUTES(t), attrs)) + break; + if (t) + type=t; + else + { + /* Create a new variant, with differing attributes. +(Hack! Type with differing attributes should no longer be +a variant of its main type. See comment above for +explanation why this was necessary).
*/ + type=build_type_copy(type); + TYPE_ATTRIBUTES(type)=attrs; + } +#endif + TREE_TYPE(decl)=type; + } + else + error ("%Jregister specified for '%D' isn't suitable for data type", + decl, decl); + } + else + error ("invalid register name `%s'", regname); +} +#else /* !TARGET_AMIGAOS */ +error("explicit register specification for parameters is not supported for this target"); +#endif /* !TARGET_AMIGAOS */ + /* end-GG-local */ decl = pushdecl (decl); .. diff -rupN gcc-3.4.0/gcc/c-parse.in gcc-3.4.0-gg/gcc/c-parse.in --- gcc-3.4.0/gcc/c-parse.inSun Feb 8 21:56:44 2004 +++ gcc-3.4.0-gg/gcc/c-parse.in Tue Apr 27 11:12:30 2004 @@ -29,7 +29,7 @@ Software Foundation, 59 Temple Place - S written by AT&T, but I have never seen it. */ @@ifc -%expect 10 /* shift/reduce conflicts, and no reduce/reduce conflicts. */ +%expect 11 /* shift/reduce conflicts, and no reduce/reduce conflicts. */ @@end_ifc
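For comparison, stock GCC already lets you pin an ordinary local variable to a named register, but it offers no such syntax for parameters; that is exactly what the asmspec change above provides. A rough sketch of the existing facility (the m68k register name here is only illustrative):

  long GfxBase;

  void New_Text(long rp, long string, long count)
  {
    /* A local register variable is accepted by stock GCC; there is no
       equivalent annotation allowed on the parameters themselves.  */
    register long base asm("a6") = GfxBase;
    /* ... */
  }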
Re: query regarding adding a pass to undo final value replacement.
On Wed, Oct 1, 2008 at 6:22 PM, Zdenek Dvorak <[EMAIL PROTECTED]> wrote: > Hi, > >> > Based on the conversation in the thread at >> > http://gcc.gnu.org/ml/gcc/2008-03/msg00513.html , we've tried to get a >> > pass trying to undo final value replacement going. The initial >> > implementation was done by Pranav Bhandarkar when he was employed at >> > Azingo as part of work sponsored by Icera Semiconductor. I've been >> > trying to get this working with my private port over here. We intend >> > to contribute this back once our copyright assignments are sorted and >> > if this will be acceptable to all. I've been getting a few compile >> > time ICEs with this approach and haven't been able to resolve them >> > well enough yet. Whilst doing so, I wanted to check on the approach as >> > outlined below and ask if there's anything that we might have missed >> > or any problem that one can see with us going along these lines. >> > Thanks for your time and patience. >> >> Some quick comments. First, do you have a non-pseudo-code testcase >> that exposes the extra computations? Second, I think rather than >> trying to undo what SCEV const(!)-prop is doing adjust its cost >> model (maybe there isn't one) to not create the costly substitutions. > > I would disagree on that. Whether a final value replacement is > profitable or not largely depends on whether it makes further > optimization of the loop possible or not; this makes it difficult > to find a good cost model. I think undoing FVR is a good approach > to solve this problem (unfortunately, the proposed implementation > does not work), Ok, fair enough. Ideally we would then be able to retain the PHI nodes and somehow record an equivalency in the IL from which we later could remove either of the definitions. Something like def_1 = PHI < ... > def_2 = compute def_3 = EQUIV (def_3 = ASSERT_EXPR ?) much similar to REG_EQUAL notes. This means that both def_1 and def_2 are conditionally dead if the EQUIV is the only remaining use. No idea if this is feasible and useful enough in general though. Do you remember what kind of missed optimizations you saw (apart from missed dead loop removal)? Thanks, Richard.
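P.S. For anyone following along, a minimal (hypothetical) test case of the kind where final value replacement fires; whether it is a win depends entirely on whether the loop then becomes dead:

  int f (int n)
  {
    int i, s = 0;
    for (i = 0; i < n; i++)
      s += 4;
    /* SCEV const-prop rewrites this use of s as roughly
       (n > 0 ? 4 * n : 0).  If the loop is then removed as dead, good;
       if the loop has to stay for other reasons, the multiply is an
       extra computation.  */
    return s;
  }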
Re: Defining a common plugin machinery
On Wed, Oct 01, 2008 at 06:03:21PM +0200, Basile STARYNKEVITCH wrote: > So we now have many plugin proposals & experiments. However, we do know > that there are some legal/political/license issues on these points (with > the GCC community rightly wanting as hard as possible to avoid > proprietary plugins), that some interaction seems to happen (notably > between Steering Committee & FSF), that the work is going slowly > (because of lack of resource & labor & funding? at FSF). That impression isn't really right; we're getting close now to a resolution. There should be some news soon.
Convert Blanket Write Privileges to Global Reviewers
On my recommendation, and with the support of all those with blanket write privileges on the SC, the GCC SC has decided to eliminate "blanket write privileges" in favor of "global reviewers". Those who previously held blanket write privileges are now global reviewers. Global reviewers may now review and approve patches to any portion of the compiler and/or associated libraries, but cannot approve their own patches. Global reviewers who are also maintainers of particular parts of GCC may continue to approve their own patches to those portions. So, for example, I can still check in a C++ front-end patch without review, but cannot check in a loop optimizer patch without review. This change is being made to encourage peer-review of patches and to avoid any appearance of impropriety on the part of those of us who had blanket write privileges. I will commit the attached patch momentarily. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713 2008-10-01 Mark Mitchell <[EMAIL PROTECTED]> * MAINTAINERS (Blanket Write Privs): Change to Global Reviewers. Index: MAINTAINERS === --- MAINTAINERS (revision 140816) +++ MAINTAINERS (working copy) @@ -18,7 +18,7 @@ To report problems in GCC, please visit: Maintainers === - Blanket Write Privs. + Global Reviewers Richard Earnshaw [EMAIL PROTECTED] Richard Henderson [EMAIL PROTECTED] @@ -32,6 +32,9 @@ Mark Mitchell [EMAIL PROTECTED] Bernd Schmidt [EMAIL PROTECTED] Jim Wilson [EMAIL PROTECTED] +Note that while global reviewers can approve changes to any part of +the compiler or associated libraries, they still need approval for +their own patches from other maintainers or reviewers. CPU Port Maintainers(CPU alphabetical order)
a solution to getch()
/* Solution provided by kermi3 from this web posting:
   http://cboard.cprogramming.com/archive/index.php/t-27714.html */

#include <stdio.h>
#include <termios.h>
#include <unistd.h>

int mygetch(void)
{
    struct termios oldt, newt;
    int ch;

    tcgetattr( STDIN_FILENO, &oldt );
    newt = oldt;
    newt.c_lflag &= ~( ICANON | ECHO );          /* unbuffered, no echo */
    tcsetattr( STDIN_FILENO, TCSANOW, &newt );
    ch = getchar();
    tcsetattr( STDIN_FILENO, TCSANOW, &oldt );   /* restore the terminal */
    return ch;
}

/* This point down was coded by Brandon Camadine.
   Reads length-1 characters into array, NUL-terminates it, and echoes
   the characters when display is non-zero. */
void mygetchs(int length, char array[], int display)
{
    int x;

    for (x = 0; x < length - 1; x++)
    {
        array[x] = mygetch();
        if (display != 0)
            putchar(array[x]);
    }
    array[x] = '\0';
}
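A minimal usage sketch (the buffer size and prompt are only illustrative; it assumes the two functions above are in the same file):

int main(void)
{
    char name[16];

    printf("name: ");
    mygetchs(sizeof name, name, 1);   /* reads exactly sizeof name - 1
                                         characters and echoes them */
    printf("\nhello, %s\n", name);
    return 0;
}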
Re: [lto] Adding -fwhopr
On Wed, Oct 1, 2008 at 13:19, Ollie Wild <[EMAIL PROTECTED]> wrote: > On Tue, Sep 30, 2008 at 3:31 PM, Diego Novillo <[EMAIL PROTECTED]> wrote: >> >> -flto: as described above. >> -fwhopr: similar to what -fwpa does today, but it is accepted by the >> driver and can take either source code or object code. In this case, >> we'd move -fwpa and -fltrans to be an lto1-only flag. > > Sounds reasonable. Just to clarify, are you thinking of -flto and -fwhopr > as mutually exclusive options, or is -fwhopr just an additional mode of > -flto. I think the only time where both -flto and -fwhopr are virtually identical is in the LGEN phase. So, $ gcc -c -flto file.c should be the same as $ gcc -c -fwhopr file.c Though, I think this may not even be true long term, at some point we may want to do different things for both. The case that really matters is the actual link phase: $ gcc -o binary -flto *.o vs $ gcc -o binary -fwhopr *.o So, I'm leaning towards making them mutually exclusive always. > If the latter, what does being able to "take either source code or > object code" mean? Well, simply that -fwpa should only be accepted by lto1, and lto1 does not take source code, only .o files with GIMPLE in them. Perhaps the distinction is not important, as the driver can always call the corresponding front end to generate gimple before calling lto1 (as we do now) , but the multiplicity of LTO flags may be confusing for the user: -flto, -fwhopr, -fwpa, -fltrans. The last two are really only meaningful when calling lto1, so I would simply not accept it at the driver level (i.e., gcc -fwpa *.c would error out). Diego.
Re: Defining a common plugin machinery
Aye up Basile, Thanks for wading through my gibberish :-) *Differences with other proposals.* I'll have a stab at some differences between this system and the others. But, this is going to be a bit difficult since I haven't seen them all :-) *Separating Plugin system from appliction* Libplugin ships as a library. Apart from a few lines of code in toplev.c, the only other changes to GCC will be refactorings and maybe calling a few functions through pointers. I think it's important to separate the plugin system from the application. Doing plugins well, IMO, requires a lot of code. It shouldn't be spread through the app. It also cleanly separates plugin mechanism from the actual extensions the app wants. Finally, plugins have to be extensible too. They should really be on a nearly equal footing with the app. Otherwise plugin developers who want the plugins to be extensible will need to reimplement there own extensibility system. *Pull vs push* Libplugin has a 'push' architecture, not a 'pull' one. What I mean is that the system pushes plugin awareness onto the application rather than requiring the application to call out to the plugin system all the time. Here's an example of that. In GCC, passes have execute and gate functions which are already function pointers. With libplugin you can make these replaceable/extensible/event-like without changing a single line of code in GCC. An external plugin, the "gcc-pass-manager" plugin, tells the world that it has a join point for each gate and execute function of every pass in the system. A quick aside on join points. Suppose you have a function int myHeuristic( basic_block bb, rtx insn ) { // blah, blah return x; } If we redefine that function to be called myHeuristic_default and setup a function pointer with same name: static int myHeuristic_default( basic_block bb, rtx insn ) { ... } int ( *myHeuristic )( basic_block bb, rtx insn ) = myHeuristic_default; Now we can use the heuristic unchanged in the code. But if we tell libplugin that that is a join point with id="my-heuristic" (in the XML for some plugin) it will create 1. An event called "my-heuristic.before" with signature "void (basic_block, rtx)" 2. A replaceble function stack called "my-heuristic.around" with signature "int (basic_block, rtx)" 3. An event called "my-heuristic.after" with signature "void (int, basic_block, rtx)" If anyone extends any of those, then the function pointer, myHeuristic, will be replaced with a dynamically built function which does, roughly: int myHeuristic_dynamic( basic_block bb, rtx insn ) { // call listeners to before foreach f in my-heuristic.before.eventHandlers { f( bb, insn ); } // do the behaviour of the heuristic top = my-heuristic.around.topOfAdviceStack; // top is initially myHeuristic_default unless someone overrode it // top can also access the rest of the advice stack, but I ignore that here int rval = top( bb, insn ); // call listeners to after foreach f in my-heuristic.after.eventHandlers { f( rval, bb, insn ); } return rval } It then sets myHeuristic = myHeuristic_dynamic. Note that if no one listens to the events of pushes advice on the around stack, then the original function pointer isn't changed - no performance cost. Now the dynamic functions are pushed onto each passes' gate or execute only if someone wants to extends them. Not one line of code was changed in GCC. This is what I mean by push not pull. Consider the alternative, which I call 'pull' because it has to pull plugin awareness from the system. 
It would require each pass and gate to check if anyone was interested, lots of changes to the code. Or every calling site would have to do it, similarly unpleasant for most uses. This is great when you already have function pointers. If you don't you have to make only minimal changes. Your code remains efficient if no one extends it. *Scalable and Granularity* The system is very scalable. Really this is due to the push architecture. Consider if events were implemented by something like this. A single function: void firePluginEvent( int eventId, void* data ); Every event would be fired by calling through this one function. Plugins would register a callback function. This is fine when you only have a few events but look what happens when you have very fine grained events happ
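To make the join-point idea concrete, here is a rough, self-contained C sketch of the function-pointer wrapping described above; every name in it is invented for illustration and is not libplugin's actual API:

  #include <stdio.h>

  /* The original heuristic, renamed, as in the description above.  */
  static int my_heuristic_default (int bb_freq, int insn_cost)
  {
    return bb_freq * insn_cost;
  }

  /* The pointer the rest of the application keeps calling through.  */
  static int (*my_heuristic) (int, int) = my_heuristic_default;

  /* A "before" handler some plugin registered.  */
  static void trace_before (int bb_freq, int insn_cost)
  {
    printf ("scoring freq=%d cost=%d\n", bb_freq, insn_cost);
  }

  /* What the framework would synthesize once someone subscribes: run the
     "before" handlers, then the "around" stack (here just the default);
     "after" handlers would see the return value.  */
  static int my_heuristic_dynamic (int bb_freq, int insn_cost)
  {
    trace_before (bb_freq, insn_cost);
    return my_heuristic_default (bb_freq, insn_cost);
  }

  int main (void)
  {
    printf ("%d\n", my_heuristic (10, 3));   /* unextended: 30 */
    my_heuristic = my_heuristic_dynamic;     /* done by the framework */
    printf ("%d\n", my_heuristic (10, 3));   /* now also fires the event */
    return 0;
  }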
Re: Defining a common plugin machinery
Hugh Leather wrote: Aye up Basile, Thanks for wading through my gibberish :-) *Differences with other proposals.* I'll have a stab at some differences between this system and the others. But, this is going to be a bit difficult since I haven't seen them all :-) *Separating Plugin system from appliction* Libplugin ships as a library. Apart from a few lines of code in toplev.c, the only other changes to GCC will be refactorings and maybe calling a few functions through pointers. The point is who will do the refactoring you mention? We should not expect tons of other people to rewrite their pass for us... This won't happen (and conversely I am not able to rewrite other people's passes; GCC is really complex for me). I actually don't understand well what kind of plugin proposal will make into the trunk. Let's assume that the trunk will go in stage 1 on Christmas 2008, and that all the legal issues are solved (ie a runtime license is *defined* and accepted and explains what kind of plugins are GCC compatible) at the same time. It is late for me, and I am dreaming of Santa Claus :-) So let's suppose that Santa Claus came and give us (on end of december 2008) the runtime license and the stage 1. What happens next? What kind of patches is sent to gcc-patches@ and who will have time to review them? I think it's important to separate the plugin system from the application. Doing plugins well, IMO, requires a lot of code. It shouldn't be spread through the app. It also cleanly separates plugin mechanism from the actual extensions the app wants. Finally, plugins have to be extensible too. They should really be on a nearly equal footing with the app. Otherwise plugin developers who want the plugins to be extensible will need to reimplement there own extensibility system. The issue I see here is to get a consensus on these ideas and on your code. (I have same frightenning feeling about trying to get some day MELT into the trunk). *Pull vs push* Libplugin has a 'push' architecture, not a 'pull' one. What I mean is that the system pushes plugin awareness onto the application rather than requiring the application to call out to the plugin system all the time. I'm not sure to understand (I need to go to sleep) and I am concerned about having any plugin (be it your's or Sean's or Tarek's etc...) related code accepted into the trunk some day. I would dream of plugins to be inside GCC before 2010, but I am not sure of that! Here's an example of that. In GCC, passes have execute and gate functions which are already function pointers. With libplugin you can make these replaceable/extensible/event-like without changing a single line of code in GCC. I'll read again in detail your proposal later. But my main concern is not technical (how to do plugins - we each have our ideas, and we hopefully would be able to merge them), but more social: what are the (future) constaints (notably legal & licence contraints)? what are the technical counterparts? how to make some plugin code accepted in the trunk (I don't care whose code it is; and I don't have any generic plugin machinery myself: I feel that MELT could fit in most reasonably defined & well documented plugin subsystems)? Maybe some much simpler plugin mechanism code (perhaps from Mozilla Treehydra) might be easier to accept into the trunk, not because it is better, but just because it is simpler, and quicker to be accepted into the trunk? 
Actually, I still don't understand how exactly (socially speaking) are big patches accepted into the trunk (at stage one) and how to help a branch (or some other code somewhere else) to be more easily accepted into GCC? Should big patches absolutely be cut into small understandable parts? If in practice a succession of small patches has a much bigger chance to be accepted into the trunk than a bigger one, I don't know what will happen next. In addition, I would suppose that the runtime license could *requires* some technical behavior by the plugin machinery, and such requirements may perhaps prohibit more technically advanced solutions. I'm really impatient to understand what kind of plugins will be permitted (and when) in GCC, and also what kind of plugins will be disallowed [this could mean more than the definition of a proprietary plugin]. So far, I have no idea (because I am not a lawyer and because I don't know anything about the current work on the runtime license). Perhaps a microscopic plugin feature is better than a better designed one. A miniplugin mechanism is enough to add more advanced mechanisms (like your libplugin proposal, & perhaps like my MELT branch) inside, themselves implemented in dlopen-ed *.so. Maybe we should wait till at least end of october for some input (even unofficial rumors) by the few people working on the runtime license. I think that most of the is
Need help in a linking error
Hi, I appreciate if someone can help me with my linking error: In my "c++" options , i already have ' -L/usr/local/lib -lgnet-2.0'. I get a number of '7: undefined reference to `gnet_conn_readline'' errors. Can you please tell me why the linking fails? I have the 'gnet.h' include in my .cpp and I apparently compile fine. c++ -fno-rtti -fno-exceptions -Wall -Wpointer-arith -Woverloaded-virtual -Wsynth -Wno-ctor-dtor-privacy -Wno-non-virtual-dtor -Wcast-align -Wno-invalid-offsetof -Wno-long-long -pedantic -fno-strict-aliasing -fshort-wchar -pthread -pipe -DDEBUG -D_DEBUG -DDEBUG_scheung -DTRACING -g -fno-inline -Os -freorder-blocks -fno-reorder-functions -finline-limit=50 -I/usr/include/gtk-2.0 -I/usr/lib/gtk-2.0/include -I/usr/include/atk-1.0 -I/usr/include/cairo -I/usr/include/pango-1.0 -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/freetype2 -I/usr/include/libpng12 -I/usr/include/pixman-1 -I/usr/include/gtk-unix-print-2.0 -o TestGNet TestGNet.o -lpthread -Wl,-rpath-link,/media/sdb3/src/tracemonkey/src/firefox-objdir/dist/bin -Wl,-rpath-link,/lib -L../../../../dist/bin -L../../../../dist/lib -lX11 /media/sdb3/src/tracemonkey/src/firefox-objdir/dist/lib/libxpcomglue.a -lasound -ldl -lm -L/usr/local/lib -lgnet-2.0 -lgtk-x11-2.0 -latk-1.0 -lgdk-x11-2.0 -lgdk_pixbuf-2.0 -lm -lpangocairo-1.0 -lpango-1.0 -lcairo -lgobject-2.0 -lgmodule-2.0 -ldl -lglib-2.0 TestGNet.o: In function `main': /media/sdb3/src/tests/TestGNet.cpp:304: undefined reference to `gnet_conn_new' /media/sdb3/src/tests/TestGNet.cpp:310: undefined reference to `gnet_conn_connect' /media/sdb3/src/tests/TestGNet.cpp:312: undefined reference to `gnet_conn_set_watch_error' /media/sdb3/src/tests/TestGNet.cpp:314: undefined reference to `gnet_conn_timeout' TestGNet.o: In function `ob_sig_int': /media/sdb3/src/tests/TestGNet.cpp:1380: undefined reference to `gnet_conn_write' /media/sdb3/src/tests/TestGNet.cpp:1382: undefined reference to `gnet_conn_readline' TestGNet.o: In function `ob_conn_func': /media/sdb3/src/tests/TestGNet.cpp:1307: undefined reference to `gnet_conn_readline' /media/sdb3/src/tests/TestGNet.cpp:1344: undefined reference to `gnet_conn_delete' /media/sdb3/src/tests/TestGNet.cpp:1212: undefined reference to `gnet_conn_timeout' /media/sdb3/src/tests/TestGNet.cpp:1227: undefined reference to `gnet_conn_write' /media/sdb3/src/tests/TestGNet.cpp:1229: undefined reference to `gnet_conn_readline' /media/sdb3/src/tests/TestGNet.cpp:1315: undefined reference to `gnet_conn_delete' /media/sdb3/src/tests/TestGNet.cpp:1330: undefined reference to `gnet_conn_delete' /media/sdb3/src/tests/TestGNet.cpp:1251: undefined reference to `gnet_conn_delete' /usr/bin/ld: TestGNet: hidden symbol `gnet_conn_connect' isn't defined /usr/bin/ld: final link failed: Nonrepresentable section on output collect2: ld returned 1 exit status gmake[5]: *** [TestGNet] Error 1 Thank you for any help.
Re: Does IRA support stack slot sharing for locals and spilled pseudos?
Richard Henderson wrote: Jeff Law wrote: (mem/c:DI (reg:DI 122 r122 [121]) [64 ivtmp.743+0 S8 A64]) and (mem/c:DI (reg:DI 122 r122) [64 ivtmp.1640+0 S8 A64]) ... Yes. There's code at the start of nonoverlapping_memrefs_p to handle these cases, but as Pat pointed out, it doesn't work for large offsets from the stack/frame pointer (large enough to cause a secondary reload). I'm not sure offhand how to best fix this. How about setting the MEM_EXPR to a fake "spill_slot_base" symbol, plus the full frame pointer offset number? Since the slots are being shared, the original decl as the MEM_EXPR isn't terribly useful. Presumably mucking around with the MEM_EXPR on the DECL isn't going to mess up inlining? What about debugging? Jeff
Re: IRA accumulated costs
Hi Vlad, Thanks for the great reply, and sorry for not replying sooner. Things have been a bit hectic for me recently. Vladimir Makarov <[EMAIL PROTECTED]> writes: > Richard Sandiford wrote: >> Although I suspect it isn't intentional, I can imagine it doesn't show >> up much on targets whose memory move costs are significantly higher than >> their register move costs. >> >> Which brings us to MIPS. The big question there is: what do we >> do with the accumulator registers? Do we put them in the same >> cover class as GPRs? >> >> Or perhaps that's jumping the gun. Perhaps the first question is: >> should we mark the accumulator registers as fixed, or at least hide them >> from the register allocator? I'm planning to do the latter for MIPS16, >> but I don't think it's a good idea for normal MIPS for two reasons: >> >> - The DSP ASE provides 4 accumulators. We want to apply >> normal register allocation to them. >> >> - Some targets have multiply-accumulate instructions that operate on >> LO and HI. But it isn't always a win to use them. If a target has >> both multiply-accumulate _and_ pipelined three-operand multiplication >> instructions, it is often better to use the latter for parallel >> multiply-accumulate chains. We've traditionally treated the >> choice as a register-allocation problem, which seems to have >> worked reasonably well. >> >> Also, the macc instruction on some targets can copy the LO result >> to a GPR too. The register allocator can currently take advantage >> of that, allocating a GPR when it's useful and not wasting one >> otherwise. (From what I've seen in the past, JPEG FFT loops tend >> to be on the borderline as far as register pressure on MIPS goes, >> so this can be an important win.) >> >> But there are only a limited number of accumulator registers (1 or 4, >> depending on the target). There's quite a high likelihood that >> any given value will need to be spilled from the accumulators >> at some point. When that happens, it's better to spill to a GPR >> than it is to spill to memory, since any load or store has to go >> through a GPR anyway. It therefore seems better to put GPRs and >> accumulator registers in the same cover class. >> >> > It better to put GPRs and ACCs in the same class if it is better to > spill ACC_REGS to GPR than to memory but it should be reflected in some > way in memory and register costs. Great! That's what the WIP patch did, and it seems to work well with IRA for the most part. This unnecessary spilling thing was the only real problem I've found. >> We currently give moves between GPRs and accumulators a higher cost than >> GPR loads and stores.[*] On the targets for which this cost is accurate, >> we _don't_ want to use LO and HI as spill space. We also don't want >> to move between one accumulator and another if we can help it. >> And IRA generally seems happy with this. >> >> [*] Which isn't an accurate reflection of all targets, but that's >> another story. We ought eventually to put this in the CPU cost >> table. >> >> The hitch is that the cost of storing an accumulator to memory >> is the cost of a GPR<->accumulator move plus the cost of a load >> or store. The cost of moving between one accumulator and another >> is the cost of two GPR<->accumulator moves. Both of these aggregate >> costs are accurate, to the extent that the constituent costs are >> accurate (see [*] above). So we have a situation in which the >> worst-case register<->register cost (acc<->acc) outweighs the >> worst-cost register<->memory cost (acc<->mem). 
And that goes >> against the cover class documentation. >> >> > The documentation can be changed. It is just a recommendation or how I > see it. We have to separate reg classes into non-intersected ones > because Chaitin-Briggs coloring needs this for its work. The bigger > cover classes, the more probability that RA puts more pseudos into hard > registers. On the other hand CB coloring does not understand register > and memory move costs (assign_hard_reg does understand but it can be > used for other coloring algorithms as Chow's priority coloring). It is > hard to find a balance in defining cover classes to put more pseudos > into hard-registers and still generate cheap code. I should think about > better IRA_COVER_CLASSES description. Thanks. (This certainly wasn't an attack on the documentation btw. My point was more: "I realise that, because the MIPS set-up isn't really sanctioned by the documentation, it would be reasonable to classify this as a bug in the MIPS port".) But it's good to hear this is more a recommendation than a hard rule. Like I say, IRA seems to cope pretty well with things despite the unusual costs. >> For the most part things Just Work. But in the code quoted >> above, this cost: >> >>cost = (ira_register_move_cost[mode][rclass][rclass] >>* (
Re: IRA accumulated costs
Vladimir Makarov <[EMAIL PROTECTED]> writes: > Hi, Richard. Returning to accurate cost accumulation issue you found > recently. Here is the patch fixing it. You could try, if you want, how > MIPS will behave with it. The patch also more accurately calculates > ALLOCNO_CALL_FREQ which affects decision to spill allocno in > assign_hard_reg if it is more profitable. Thanks. I'll give the patch a go. Richard
Re: m32c: pointer math vs sizetype again
> Is this related to the loop termination bug I reported > on the m32c? > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37665 Probably related, but I don't know if a patch to fix one will fix the other.
Re: m32c: pointer math vs sizetype again
On Wed, Oct 1, 2008 at 12:20 AM, DJ Delorie <[EMAIL PROTECTED]> wrote: > > I've got a partial patch which works with older (4.3) gccs, but fails > gimple's check for trunk (attached). My trivial test case... > > char * > foo (char *a, int b) > { > return a-b; > } > > ...fails thusly: > > size > constant 32> >unit size int> constant 4> >align 8 symtab 0 alias set -1 canonical type 0xb7f52c30 precision 32 min > max > > size > constant 16> >unit size int> constant 2> >align 8 symtab 0 alias set -1 canonical type 0xb7efc000 precision 16 min > max > > useless false: ../../gcc/gcc/tree-ssa.c 1092 > dj.c: In function 'foo': > dj.c:2: error: type mismatch in pointer plus expression > D.1194 = a + D.1196; > > char * > > char * > > > > D.1194 = a + D.1196; > > dj.c:2: internal compiler error: verify_gimple failed > > > I'm obviously doing something wrong in the cast-to-bigger step. How > can I get this to pass gimple? What I'm trying to accomplish is this: > > 1. Values added to pointers need to be treated as signed (at least, if > they're signed types, certainly if you're going to use a > NEGATE_EXPR). > > 2. If sizeof(size_t) < sizeof(void *), sign extend the intop to be > pointer-sized before adding it. > > > > Index: c-common.c > === > --- c-common.c (revision 140759) > +++ c-common.c (working copy) > @@ -3337,20 +3337,28 @@ pointer_int_sum (enum tree_code resultco > intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype), > TYPE_UNSIGNED (sizetype)), intop); > > /* Replace the integer argument with a suitable product by the object size. > Do this multiplication as signed, then convert to the appropriate > type for the pointer operation. */ > - intop = convert (sizetype, > + intop = convert (ssizetype, > build_binary_op (EXPR_LOCATION (intop), >MULT_EXPR, intop, >convert (TREE_TYPE (intop), size_exp), 1)); > > /* Create the sum or difference. */ > if (resultcode == MINUS_EXPR) > -intop = fold_build1 (NEGATE_EXPR, sizetype, intop); > +intop = fold_build1 (NEGATE_EXPR, ssizetype, intop); > + > + if (TREE_CODE (result_type) == POINTER_TYPE > + && TYPE_PRECISION (result_type) > TYPE_PRECISION (TREE_TYPE (intop))) > +{ > + tree iptr_type = c_common_type_for_mode (TYPE_MODE (result_type), > + TYPE_UNSIGNED (result_type)); > + intop = fold_build1 (NOP_EXPR, iptr_type, intop); > +} > > ret = fold_build2 (POINTER_PLUS_EXPR, result_type, ptrop, intop); > > fold_undefer_and_ignore_overflow_warnings (); > > return ret; I think this is the wrong place to fix this. If you would override the sizetypes precision from your target, would that fix it? That is, in stor-layout.c set_sizetype make the target allow adjusting the passed type (which is supposed to be sizetype). If at all then these types should be consistent. Richard.
gcc-4.2-20081001 is now available
Snapshot gcc-4.2-20081001 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20081001/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.2 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_2-branch revision 140822 You'll find: gcc-4.2-20081001.tar.bz2 Complete GCC (includes all of below) gcc-core-4.2-20081001.tar.bz2 C front end and core compiler gcc-ada-4.2-20081001.tar.bz2 Ada front end and runtime gcc-fortran-4.2-20081001.tar.bz2 Fortran front end and runtime gcc-g++-4.2-20081001.tar.bz2 C++ front end and runtime gcc-java-4.2-20081001.tar.bz2 Java front end and runtime gcc-objc-4.2-20081001.tar.bz2 Objective-C front end and runtime gcc-testsuite-4.2-20081001.tar.bz2The GCC testsuite Diffs from 4.2-20080924 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.2 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: m32c: pointer math vs sizetype again
> I think this is the wrong place to fix this. If you would override > the sizetypes precision from your target, would that fix it? That > is, in stor-layout.c set_sizetype make the target allow adjusting > the passed type (which is supposed to be sizetype). If at all then > these types should be consistent. The problem is that the chip has 24 bit pointers, but 16 bit registers. It has math operations for 16 bit numbers and some 32 bit numbers (the rest are emulated). It has a few operations for 24 bit numbers. There are no C types for 24 bit numbers (PSImode is 32 bits wide with 24 bit precision, if I tweak its precision manually it tries to use bitfield instructions all over the place, if I don't it uses "long int" which is wrong). All I want for now is to treat ptr+int as a signed addition, not an unsigned one. My patch is just trying to detect the case where a sign extension is needed at all, and insert it.
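Concretely (a hypothetical illustration of the 24-bit-pointer, 16-bit-int situation):

  char *back_one (char *p)
  {
    int i = -1;     /* 16 bits: 0xFFFF */
    return p + i;   /* must be computed as p + 0xFFFFFF, i.e. p - 1;
                       zero-extending the 16-bit offset would give
                       p + 0x00FFFF and land 64K bytes too far */
  }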
Re: m32c: pointer math vs sizetype again
DJ Delorie wrote: I think this is the wrong place to fix this. If you would override the sizetypes precision from your target, would that fix it? That is, in stor-layout.c set_sizetype make the target allow adjusting the passed type (which is supposed to be sizetype). If at all then these types should be consistent. The problem is that the chip has 24 bit pointers, but 16 bit registers. It has math operations for 16 bit numbers and some 32 bit numbers (the rest are emulated). It has a few operations for 24 bit numbers. There are no C types for 24 bit numbers (PSImode is 32 bits wide with 24 bit precision, if I tweak its precision manually it tries to use bitfield instructions all over the place, if I don't it uses "long int" which is wrong). All I want for now is to treat ptr+int as a signed addition, not an unsigned one. My patch is just trying to detect the case where a sign extension is needed at all, and insert it. Can you look in the CVS/SVN archives and see what the mn102 port did -- it had the same core properties as the chip you're describing. It was a 16/24 bit chip (true 24bit address registers), mostly 16bit ops with a few 24bit ops. All 32bit ops were synthesized. Jeff
Re: Does IRA support stack slot sharing for locals and spilled pseudos?
Jeff Law wrote: Presumably mucking around with the MEM_EXPR on the DECL isn't going to mess up inlining? What about debugging? I'm certain it won't mess up inlining, because all that is long done with by the time we're in rtl -- it's all just one big function by this time. I wouldn't have thought it would mess up debugging, but we'll have to see what happens with Alex's enhanced var-tracking... I have attached a patch to pr 37447. r~
Re: Need help in a linking error
This list is for discussing GCC development, not for dealing with usage problems. Please try asking on [EMAIL PROTECTED] instead. Thanks, Ben
Re: Defining a common plugin machinery
I have notes inline below, following is my summary of libplugin from what i understand of your posts: * It exists as a fraemwork that works with GCC now * It uses xml files to define plugins (Allows making new plugins as combinations of others without making a new shared library, i.e. just create an xml file that describes the plugin) * It handles issues with inter-dependencies between plugins * It uses a "push" framework, where function pointers are replaced/chained in the original application rather than explicit calls to plugins (Provides more extensibility in a application that makes heavy use of function pointers, but produces a less explicit set of entry points or hooks for plugins) * Currently it provides automatic loading of plugins without specific user request * It already has a framework for allowing plugins to interact with the pass manager If you can think of any other points to summarize the features it might be helpful as you are closer to it. The issues i see with this framework: * it seems to provide a lot of features that we may not necessarily need (That should be up for discussion) * plugin entry points are not well defined but can be "any function pointer call" Some questions: * How does the framework interact with the compile command line arguments? * Does this work on platforms that dont support -rdynamic or can it be modified to do so in the future? Hugh Leather wrote: >*Separating Plugin system from appliction* >Libplugin ships as a library. Apart from a few lines of code in >toplev.c, the only other changes to GCC will be refactorings and >maybe calling a few functions through pointers. As i understand the difference between the pull vs push, a plugin will load, and then modify existing function pointers in GCC to insert its own code and chain the existing code to be called after it. Is this correct? Doing this will be able to make use of existing function pointers as plugin hook locations, but some hooks we may want are not already called by function pointers and so would need to be changed. This means that plugin hook locations are not explicitly defined, but rather any place where a function pointer is used can be modified. Personally i prefer explicit specification of plugin hook locations. >I think it's important to separate the plugin system from the >application. Doing plugins well, IMO, requires a lot of code. It >shouldn't be spread through the app. It also cleanly separates >plugin mechanism from the actual extensions the app wants. >Finally, plugins have to be extensible too. They should really be on >a nearly equal footing with the app. Otherwise plugin developers >who want the plugins to be extensible will need to reimplement there >own extensibility system. Without the use of plugin meta-data in XML files and auto-loading and many of the things discussed, i am not so sure that plugins will be such a large body of code. It is really a matter of deciding if such features that libplugin provides are desirable for GCC. If so, then there is a lot of code required for plugins and libplugin becomes a good idea IMO. If not, then libplugin may just be more than we need. It really depends on what "doing plugins well" means for the specific application. >*Scalable and Granularity* >The system is very scalable. Really this is due to the push >architecture. The granularity as i understand it is only as fine/coarse as the number of function pointers in the system that can be overwritten. This is no different from the pull method (i.e. 
The granularity depends on where you put the hook locations) except that function pointers "may already exist". Though i may have mis-understood something... I.e. For the "pull" method you can: Add a "pull" for firePluginEvent() or add a "pull" inside each existing event handler. Where as the push method requires that the existing event handlers are called via function pointers and the "push" chains itself to that. I have used a similar method for the "push" plugin in python. The advantage here is that basically "anything" can be pushed in python so the system becomes very flexible to extend via "plugins". In C/C++ the areas that can be extended need to be defined and turned into function pointers for the push method to work. Again, assuming i have understood how it works. >*Mutliple cooperating plugins >*I think some of the proposals don't allow multiple plugins or >plugins aren't able to be extended in the same way that the >application is. In libplugin you can have lots of plugins all >depending on each other. Plugins can provide extension points as >well as the application - this means it isn't just a matter of the >application deciding what's important and everyone else having to >make do. > >In some senses, this is the difference between a plugin system and >loading a few shared libraries. A plugin system provides a
Re: m32c: pointer math vs sizetype again
> Can you look in the CVS/SVN archives and see what the mn102 port did -- It used SImode for size_type but I think I tried that and it blew up in useless_type_conversion_p. I can try again if you're interested in the details.
Re: query regarding adding a pass to undo final value replacement.
Hi, > >> b) If any PHI node has count zero it can be inserted back and its > >> corresponding computations removed, iff the argument of the PHI > >> node > >> still exists as an SSA variable. This means that we can insert > >> a_1 = PHI if D.10_1 still exists and hasnt been removed by > >> any of the passes between the scalar evolution pass and the > >> loopdone pass. > > > > this does not work: > > -- we reuse ssa names, so it can happen that the argument of the PHI node > > is eliminated, then reused for a different purpose > > I wasn't sure if from the proposal strong enough to catch this case ? i.e. if > > > So if the ssa_names are infact reused they won't be the same > computations. do you also check this for ssa names inside the loop (in your example, D.10_1? > > -- in case more complex loop transformations were performed > > (e.g., loop reversal), the final value of the ssa name might have > > changed. > > Could you give an example for this ? for (i = 100; i > 0; i--) a[i] = i; transformed to for (i = 1; i <= 100; i++) a[i] = i; the final value of i was originally 0, now it is 101. > Is there anything else you might > suggest in terms of undoing the transformations from scalar cprop.? I would probably try to somehow pass the information from scev analysis to value numbering, and let PRE take care of the issue, Zdenek
Re: query regarding adding a pass to undo final value replacement.
Hi, > > I would disagree on that. Whether a final value replacement is > > profitable or not largely depends on whether it makes further > > optimization of the loop possible or not; this makes it difficult > > to find a good cost model. I think undoing FVR is a good approach > > to solve this problem (unfortunately, the proposed implementation > > does not work), > > Ok, fair enough. Ideally we would then be able to retain the PHI nodes > and somehow record an equivalency in the IL from which we later could > remove either of the definitions. Something like > > def_1 = PHI < ... > > > def_2 = compute > > def_3 = EQUIV > (def_3 = ASSERT_EXPR ?) > > much similar to REG_EQUAL notes. This means that both def_1 and def_2 > are conditionally dead if the EQUIV is the only remaining use. > > No idea if this is feasible and useful enough in general though. > > Do you remember what kind of missed optimizations you saw (apart from > missed dead loop removal)? vectorization and linear loop transformations did not like values used outside of the loop; I am not sure whether (our implementation of) graphite handles them or not, Zdenek
Re: Defining a common plugin machinery
Hello All, Brendon Costa wrote: Some questions: * How does the framework interact with the compile command line arguments? * Does this work on platforms that dont support -rdynamic or can it be modified to do so in the future? [I'm skipping the rest of an interesting post] I thought that for the first plugin machinery we don't care about platforms without -rdynamic (or those without dlopen or tl_dlopen). I believe we should first focus (when the runtime license will permit that) on making whatever plugin machinery available and merged into the trunk (when it comes back to stage one). This is not an easy task. In practice, I think that we should first try to get some code into the trunk which make some plugin work on some common easy host system (Linux), and only after try to generalize the work to harder hosts. At last, I believe that the plugin system will at first be something which can be disabled at configure time, and will be disabled by default. My main concern is plugins & passes. Regards. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basilestarynkevitchnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***
Re: Defining a common plugin machinery
> I believe we should first focus (when the runtime license will permit > that) on making whatever plugin machinery available and merged into > the trunk (when it comes back to stage one). This is not an easy task. Isn't the point of this discussion to decide what features to put into a plugin framework? I.e. We need a "whatever plugin machinery available" to exist before we can even think about merging that into the trunk and defining what that looks like is the point of this discussion i thought. Possible steps for moving forward with this: 1) Define what features we need for the first release, and think about what we may want in the future 2) See which frameworks currently exist and how each meets the necessary features identified 3) Either use one of the above frameworks as a base or start a new framework on the plugin branch 4) Work on the "base set of features" for a first release 5) Make sure the branch is up to date/tracking the trunk 6) Look at merging into the trunk when licensing is complete We are still at 1 (and partially identifying projects for 2) as far as i understand. I was going to start itemizing the features we have discussed, and the frameworks mentioned on the wiki. But I am not going to have time to do so for a number of weeks now. If someone else wants to do it it may get done a bit faster. So far, i think libplugin seems to be the most "general" plugin framework for GCC i have had a chance to look at (It was easy to look at because it has some decent documentation online). > In practice, I think that we should first try to get some code into > the trunk which make some plugin work on some common easy host system > (Linux), and only after try to generalize the work to harder hosts. I agree, that providing working code for only simple to implement platforms (and basic plugin features) at first is a good idea (but do so on a branch first, then merge that to the trunk once it is operational). However we do not want to start with a framework that will need to be completely redesigned in the future to later support other platforms or usages. I.e. Thinking ahead but not necessarily implementing ahead... > My main concern is plugins & passes. Yes. We have not really looked at this more important aspect in much detail, how to manage passes with plugins. It looks like libplugin has some ideas for pass management that may help? Any thoughts?
Re: Defining a common plugin machinery
Brendon Costa wrote: I believe we should first focus (when the runtime license will permit that) on making whatever plugin machinery available and merged into the trunk (when it comes back to stage one). This is not an easy task. Isn't the point of this discussion to decide what features to put into a plugin framework? I.e. We need a "whatever plugin machinery available" to exist before we can even think about merging that into the trunk and defining what that looks like is the point of this discussion i thought. I entirely agree. Apologies to everyone if I badly expressed myself. Possible steps for moving forward with this: 1) Define what features we need for the first release, and think about what we may want in the future 2) See which frameworks currently exist and how each meets the necessary features identified 3) Either use one of the above frameworks as a base or start a new framework on the plugin branch 4) Work on the "base set of features" for a first release 5) Make sure the branch is up to date/tracking the trunk 6) Look at merging into the trunk when licensing is complete We are still at 1 (and partially identifying projects for 2) as far as i understand. I also agree. What I don't understand is if having a simple crude plugin mechanism make it easier to be accepted in the trunk. I don't understand what makes features & code easy to be accepted in a stage one trunk. I don't understand if havving a small plugin machinery (there already exists some) make it easier to be accepted in the trunk. I still do not understand how and when big patches get accepted in the trunk. What are the social issues involved? What is the best way to get code reviewers (those able to approve a patch) interested by any big plugin patch? (And FYI I am asking myself the same question for MELT: what should I do now to get some day in the future MELT accepted in the trunk?). So far, i think libplugin seems to be the most "general" plugin framework for GCC i have had a chance to look at (It was easy to look at because it has some decent documentation online). In practice, I think that we should first try to get some code into the trunk which make some plugin work on some common easy host system (Linux), and only after try to generalize the work to harder hosts. I agree, that providing working code for only simple to implement platforms (and basic plugin features) at first is a good idea (but do so on a branch first, then merge that to the trunk once it is operational). However we do not want to start with a framework that will need to be completely redesigned in the future to later support other platforms or usages. I.e. Thinking ahead but not necessarily implementing ahead... I fully agree. But who thinks that the libplugin patch (or any other plugin machinery) could be accepted into the trunk? My main concern is plugins & passes. Yes. We have not really looked at this more important aspect in much detail, how to manage passes with plugins. It looks like libplugin has some ideas for pass management that may help? Any thoughts? Apparently they have. But we need to have a more exact picture of what the GCC steering commitee & the FSF wants (and even more importantly do not wants) regarding plugins. I could imagine that they (& perhaps us) want some tricks that make proprietary plugins impractical, but I have no idea of what that would technically mean (because I have no understanding of the legal & social system involved). 
My hypothesis is that several plugin mechanisms for GCC already exist (on some branches or somewhere else). If a small plugin patch has a better chance to get accepted into the trunk, we should limit ourselves to such a small thing. If big plugin machinery could be accepted (I would prefer that) we should understand what would make them more acceptable. In both cases, plugins have probably some requirements defined by the future runtime license, which I don't know yet. Regards. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basilestarynkevitchnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***