Re: bootstrap on powerpc fails
Maybe in another six years cpu improvements will outpace gcc bootstrap times enough to reconsider. We'll have 60 cores per CPU, and 1 minute after invoking "make" we'll be cursing how long it takes for insn-attrtab.c to finish compiling. :-)

Paolo
Re: bootstrap on powerpc fails
> I can sympathize with that, I have a slightly different problem. Right
> now there are some java tests that time out 10x on solaris2.10. I run
> four passes of the testsuite with different options each time, so
> that's 40 timeouts. (This is without any extra RTL checking turned
> on.) At 5 minutes each it adds up fast!
> http://gcc.gnu.org/ml/gcc-testresults/2006-11/msg00294.html

This happened at some point during 4.1 development too. It turned out to be a code generation bug that was butchering the PTHREAD_* initializer macros.

-- Eric Botcazou
Re: bootstrap on powerpc fails
> Figures? Tree checking is not cheap with GCC 4.x either.

Here are mine (Athlon64 2.4 GHz, 1 GB; c,c++,objc,obj-c++,java,fortran,ada), gcc version 4.3.0 20061103 (experimental):

  assert,runtime (aka release):            115 min
  assert,runtime,misc:                     176 min
  assert,runtime,misc,gc:                  186 min
  assert,runtime,misc,gc,tree:             203 min
  assert,runtime,misc,gc,tree,rtl,rtlflag: 266 min

So I was wrong, tree checking is still relatively cheap; misc and rtl are not.

-- Eric Botcazou
Re: Abt long long support
On 11/7/06, Mike Stump <[EMAIL PROTECTED]> wrote:
> On Nov 6, 2006, at 9:30 PM, Mohamed Shafi wrote:
> > My target (non gcc/private one) fails for long long testcases
>
> Does it work flawlessly otherwise? If not, fix all those problems
> first. After those are all fixed, then you can see if it then just
> works. In particular, you will want to ensure that 32 bit things work
> fine, first.

Well, the test cases fail only under one condition: when main calls a function, like llabs, to find the absolute value of a negative number, and the function performs the action with

  return (arg < 0 ? -arg : arg);

The program works fine if I:
1. pass a positive value
2. use the -fomit-frame-pointer flag while compiling (with a negative value)
3. use another variable in the function body to return, i.e.

  long long foo(long long x)
  {
    long long k;
    k = (x < 0 ? -x : x);
    return k;
  }

When I diff the rtl dumps for programs passing a negative value with and without the frame pointer, I find changes from file.greg. That's when the frame pointer issue kicks in. This is a small test case which produces the bug:

  #include <stdio.h>

  long long fun(long long k)
  {
    return (k > 0 ?
k : -k); } int main() { long long i= -1; if(fun(i) == 1) printf("\nsuccess \n"); else printf("\nfailure \n"); } here the relevant rtl dump for the function fun from .greg file ; Hard regs used: 0 1 2 3 12 13 14 21 (note 2 0 9 NOTE_INSN_DELETED) ;; Start of basic block 0, registers live: 0 [d0] 1 [d1] 14 [a6] 15 [a7] 22 [vAP] (note 9 2 4 0 [bb 0] NOTE_INSN_BASIC_BLOCK) (insn 4 9 5 0 (parallel [ (set (reg/f:SI 13 a5 [31]) (plus:SI (reg/f:SI 14 a6) (const_int -8 [0xfff8]))) (clobber (reg:CC 21 cc)) ]) 29 {addsi3} (nil) (nil)) (insn 5 4 6 0 (set (mem/c/i:SI (reg/f:SI 13 a5 [31]) [0 k+0 S4 A32]) (reg:SI 0 d0 [ k ])) 16 {movsi_store} (nil) (nil)) (insn 6 5 7 0 (set (mem/c/i:SI (plus:SI (reg/f:SI 13 a5 [31]) (const_int 4 [0x4])) [0 k+4 S4 A32]) (reg:SI 1 d1 [orig:0 k+4 ] [0])) 16 {movsi_store} (nil) (nil)) (note 7 6 13 0 NOTE_INSN_FUNCTION_BEG) (insn 13 7 14 0 (parallel [ (set (reg/f:SI 13 a5 [33]) (plus:SI (reg/f:SI 14 a6) (const_int -8 [0xfff8]))) (clobber (reg:CC 21 cc)) ]) 29 {addsi3} (nil) (nil)) (insn 14 13 63 0 (set (reg:SI 0 d0) (mem/c/i:SI (reg/f:SI 13 a5 [33]) [0 k+0 S4 A32])) 15 {movsi_load} (nil) (nil)) (insn 63 14 64 0 (set (reg:SI 12 a4) (const_int -16 [0xfff0])) 17 {movsi_short_const} (nil) (nil)) (insn 64 63 65 0 (parallel [ (set (reg:SI 12 a4) (plus:SI (reg:SI 12 a4) (reg/f:SI 14 a6))) (clobber (reg:CC 21 cc)) ]) 29 {addsi3} (nil) (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6) (const_int -16 [0xfff0])) (nil))) (insn 65 64 15 0 (set (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S8 A32]) (reg:SI 0 d0)) 16 {movsi_store} (nil) (nil)) (insn 15 65 68 0 (set (reg:SI 0 d0) (mem/c/i:SI (plus:SI (reg/f:SI 13 a5 [33]) (const_int 4 [0x4])) [0 k+4 S4 A32])) 15 {movsi_load} (nil) (nil)) (insn 68 15 69 0 (set (reg:SI 12 a4) (const_int -12 [0xfff4])) 17 {movsi_short_const} (nil) (nil)) (insn 69 68 70 0 (parallel [ (set (reg:SI 12 a4) (plus:SI (reg:SI 12 a4) (reg/f:SI 14 a6))) (clobber (reg:CC 21 cc)) ]) 29 {addsi3} (nil) (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6) (const_int 
-12 [0xfff4])) (nil))) (insn 70 69 73 0 (set (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S8 A32]) (reg:SI 0 d0)) 16 {movsi_store} (nil) (nil)) (insn 73 70 74 0 (set (reg:SI 12 a4) (const_int -16 [0xfff0])) 17 {movsi_short_const} (nil) (nil)) (insn 74 73 75 0 (parallel [ (set (reg:SI 12 a4) (plus:SI (reg:SI 12 a4) (reg/f:SI 14 a6))) (clobber (reg:CC 21 cc)) ]) 29 {addsi3} (nil) (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6) (const_int -16 [0xfff0])) (nil))) (insn 75 74 17 0 (set (reg:SI 12 a4) (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S8 A32])) 15 {movsi_load} (nil) (nil)) (insn 17 75 18 0 (set (reg:CC 21 cc) (compare:CC (reg:SI 12 a4) (const_int 0 [0x0]))) 67 {*cmpsi_internal0} (nil) (nil)) (jump_insn 18 17 50 0 (set (pc) (if_then_else (gt:CC (reg:CC 21 cc) (const_int 0 [0x0])) (label_ref 32) (pc))) 41 {*branch_true} (nil) (nil)) ;; End of basic block 0, registers live: 14 [a6] 15 [a7] 22 [vAP] 28 ;; Start of basic block 2, registers live: 14 [a6] 15 [a7] 22 [vAP] 28 (note 50 18 78 2 [bb 2] NOTE_INSN_BASIC_BLOCK) (insn 78 50 79 2 (set (reg:SI 13 a5) (const_int -16 [0xfff0])) 17 {movsi_short_const} (nil) (
Re: Obtaining type equivalence in C front end
> How do I determine if two type nodes in the C front end are equivalent?
> In C++ I use same_type_p() but do not see an equivalent for the C front end.

The function you want is comptypes.

Thanks,
Andrew Pinski
Re: Abt long long support
On Nov 9, 2006, at 6:39 AM, Mohamed Shafi wrote:
> When i diff the rtl dumps for programs passing negative value with and
> without frame pointer i find changes from file.greg .

And, is that change bad? We do expect changes in codegen; you didn't say if those changes are invalid, or what was invalid about them. If they are valid, which pass is the first pass that contains invalid rtl? If this was the first pass with invalid rtl, which instruction was invalid and why? What assembly do you get in both cases? Which instruction is wrong? What's wrong about it?

Did you examine:

  long long l, k;
  l = -k;

for correctness by itself? Was it valid or invalid?

[ read ahead for spoilers, I'd rather you pull this information out of the dump and present it to us... ]

A quick glance at the rtl shows that insn 95 tries to use [a4+4] but insn 94 clobbered a4 already; also d3 is used by insn 93, but there isn't a set for it. The way the instructions are numbered suggests that the code went wrong before this point. You have to read and understand all the dumps, whether they are right or wrong and why, track down the code in the compiler that is creating the wrong code, and then see if you can guess why. If not, fire up gdb, watch it add/remove/reorder instructions, see why it was doing it and the conditions it checked before doing the transformation, and then reason about why it is wrong and what it should be doing instead. I'd suspect the bug lies in your port file and gcc is using information from it and coming up with the bad code. For example, what pass removed the setting of d3?
Planned LTO driver work
This message outlines a plan for modifying the GCC driver to support compilation in LTO mode. The goal is that:

  gcc --lto foo.c bar.o

will generate LTO information for foo.c while compiling it, then invoke the LTO front end for foo.o and bar.o, and then invoke the linker.

However, as a first step, the LTO front end will be invoked separately for foo.o and bar.o -- meaning that the LTO front end will not actually do any link-time optimization. The reason for this first step is that it's easier, and that it will allow us to run through the GCC testsuite in LTO mode, eliminating failures in single-file mode, before we move on to multi-file mode.

The key idea is to leverage the existing collect2 functionality for reinvoking the compiler. That's presently used for static constructor/destructor handling and for instantiating templates in -frepo mode. So, the work plan is as follows:

1. Add a --lto option to collect2. When collect2 sees this option, treat all .o files as if they were .rpo files and recompile them. We will do this after all C++ template instantiation has been done, since we want to optimize the .o files after the program can actually link.

2. Modify the driver so that --lto passes -flto to the C front end and --lto to collect2.

Any objections to this plan?

-- Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
Re: Canonical type nodes, or, comptypes considered harmful
> I can dig out actual real live numbers, if you're curious. For
> example, when calling comptypes, the no answers are (were) 34x more
> likely than yes answers. If you cannot return false immediately when
> pointer_to_type1 != pointer_to_type2, you then have to run a
> structural equality tester, and once you do that, you spend 120ns per
> depth in the tree as you fault everything into cache; what's that,
> some 300 instructions. 21,980 were fast, 336,523 were slow; the slow
> path dominated.

I think in order to handle the C type system, with its non-transitive type compatibility, effectively, for each type we have to pre-compute the most general variant, even if that has no direct representative in the current program. I.e. for an array, point to the corresponding incomplete array. (Fortunately, C allows only one dimension to be incomplete.) For a pointer to a struct, point to the type where the struct type is incomplete. If an array appears in a context of another type where an incomplete array is not allowed, we can use the complete array for computing the most general variant of that other type. Types can only be compatible if their most general variants are equal.

In addition to this most generalized type, each complete type can also have a pointer to a representative of its equivalence class, and be flagged as complete; two complete types are compatible iff they are the same. If a type is not in the same equivalence class as its most general variant, it needs to describe all the 'optional' bits, i.e. struct types pointed to, array dimensions, cv-qualifiers. I'm not sure if this is better done by having all the semantics there (that can be a win if there are lots of places where cv-qualifiers could be added without breaking type compatibility, but not many cv-qualifiers are actually encountered), or if it should only contain a bare data field for each item (e.g. an integer for an array dimension), with the most general variant having a checklist of how to compare them, and its description of the overall type saying what the data actually means when it comes to operating on the type.
Re: Canonical type nodes, or, comptypes considered harmful
On Thu, Nov 09, 2006 at 09:06:42PM +, Joern RENNECKE wrote:
> > I can dig out actual real live numbers, if you're curious. For
> > example, when calling comptypes, the no answers are (were) 34x more
> > likely than yes answers. If you cannot return false immediately when
> > pointer_to_type1 != pointer_to_type2, you then have to run a
> > structural equality tester,

How about using a hash, so mismatches fail quickly?
Re: Obtaining type equivalence in C front end
> The function you want is comptypes.

Thanks. That is working well.

> Hi Brendon,
> Wouldn't the C++ one (mostly) be a superset of the C?

Types are reasonably different between the C and C++ front ends, though you do have the common ones because, as you said, C++ is a superset of C. The C++ front end has a similar comptypes function, which is called by same_type_p; however it is not the same as the C one. Rather than trying to write my own based on the C++ one, I was sure there would already exist a function to do it somewhere in the C front end...

Thanks for the help.
Brendon.
Re: Planned LTO driver work
Mark Mitchell <[EMAIL PROTECTED]> writes:

> 1. Add a --lto option to collect2. When collect2 sees this option,
> treat all .o files as if they were .rpo files and recompile them. We
> will do this after all C++ template instantiation has been done, since
> we want to optimize the .o files after the program can actually link.
>
> 2. Modify the driver so that --lto passes -flto to the C front-end and
> --lto to collect2.

Sounds workable in general. I note that in your example of

  gcc --lto foo.c bar.o

this presumably means that bar.o will be recompiled using the compiler options specified on that command line, rather than, say, the compiler options specified when bar.o was first compiled. This is probably the correct way to handle -march= options.

I assume that in the long run, the gcc driver with --lto will invoke the LTO frontend rather than collect2. And that the LTO frontend will then open all the .o files which it is passed.

Ian
gcc-4.0-20061109 is now available
Snapshot gcc-4.0-20061109 is now available on

  ftp://gcc.gnu.org/pub/gcc/snapshots/4.0-20061109/

and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.0 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_0-branch revision 118630

You'll find:

  gcc-4.0-20061109.tar.bz2           Complete GCC (includes all of below)
  gcc-core-4.0-20061109.tar.bz2      C front end and core compiler
  gcc-ada-4.0-20061109.tar.bz2       Ada front end and runtime
  gcc-fortran-4.0-20061109.tar.bz2   Fortran front end and runtime
  gcc-g++-4.0-20061109.tar.bz2       C++ front end and runtime
  gcc-java-4.0-20061109.tar.bz2      Java front end and runtime
  gcc-objc-4.0-20061109.tar.bz2      Objective-C front end and runtime
  gcc-testsuite-4.0-20061109.tar.bz2 The GCC testsuite

Diffs from 4.0-20061102 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.0 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Compile Farm for GCC developers and free software developers
Reminder: if you need access to x86 machines to run your boring batches or to test your software with GCC snapshots with even more boring batches, the GCC compile farm is for you:

  http://gcc.gnu.org/wiki/CompileFarm

<< GCC Compile Farm Project

The GCC CompileFarm Project is seeking volunteers to maintain script machinery to help with GCC development on nine bi-Pentium 3 machines, as well as GCC developers that are lacking x86 machine access.

How to Get Involved?

If you are a GCC developer and want access to the compile farm for GCC development and testing, or if you are a free software developer wishing to set up automated testing of a piece of free software with the current GCC development version (preferably with a test suite), please send

1. your ssh public key (HOME/.ssh/authorized_keys format) *in attachment* and not inline in the email, and
2. your preferred UNIX login

to laurent at guerby dot net. [...] >>

Laurent
http://guerby.org/blog/
Re: Canonical type nodes, or, comptypes considered harmful
On Nov 8, 2006, at 5:59 AM, Doug Gregor wrote:
> However, this approach could have some odd side effects when there are
> multiple mappings within one context. For instance, we could have
> something like:
>
>   typedef int foo_t;
>   typedef int bar_t;
>
>   foo_t* x = strlen("oops");

x is a decl, the decl has a type, the context of that instance of the type is x. map(int,x) == foo_t. It is this, because we know that foo_t was used to create x and we set map(int,x) equal to foo_t as it is created. It can never be wrong. It can never be wrong, because any use of the type that would have had the wrong value comes from a specific context, and that specific context map(type,context) can be set to _any_ value, including the right value. Put another way, any time one carries around a type outside of a specific context, one also needs to carry around the context the type came from.

> The error message that pops out would likely reference "bar_t *"

map(int,x) doesn't yield bar_t.

> This approach wouldn't help with the implementation of concepts,
> because we need to be able to take two distinct types (say, template
> type parameters T and U) and consider them equivalent in the type
> system.

I'd need to see more specifics, but from just the above... Any data you need that would make them different, you put into map(type,context); we're not restricted to just the typedef name. Once you do that, then you discover that what's left is identical, and since they are identical, they have the same address, and the same address makes them the same type. The two things this doesn't work on are if you have two different notions of equality (my scheme, unaltered, can only handle 1 definition of equality), or some of the temporal aspects, like, we didn't know T1 and T2 were the same before, but now we do, because they are both int. The latter case I'm expecting to not be an issue, as to form the type, you do the substitution, and after you do it, you replace T1 with int (creating data in map(int,context), if you later need to know this was a T1 for any reason (debugging, error messages)). These bubble up and one is left with the real type, and then equality remains fast, post-substitution. Reasoning about type equality pre-substitution remains slow.

You can even get fast unsubstituted comparisons for a particular definition of equality. You boost the substitution bits out as variants; notice then, you have nothing left, and nothing is nothing, so the values wind up being the same again. Now to get comptypes to work, you just have to add code to compare the boosted variants at the top of comptypes. Now, before you say that that is as bad as what we had before, no, it isn't. If the base type is not equal, then you can immediately fail the comparison; this takes care of 90% of the calls. After that you check the variants for equality and return that. The single address compare doesn't hit memory and can answer most of the equations by itself. The variants are all on one cache line, and if the cost to compare them is cheap, it is just two memory hits.

> We can't literally combine T and U into a single canonical type node,
> because they start out as different types.

?

> Granted, we could layer a union-find implementation (that better
> supports concepts) on top of this approach.

Ah, but once you break the fundamental quality that different addresses imply different types, you limit things to structural equality, and that is slow.

>   type = type_representative (TREE_TYPE (exp));
>   if (TREE_CODE (type) == REFERENCE_TYPE)
>     type = TREE_TYPE (type);
>
> We could find all of these places by "poisoning" TREE_CODE for
> TYPE_ALIAS_TYPE nodes, then patch up the compiler to make the
> appropriate type_representative calls. We'd want to save the original
> type for diagnostics.

Or, you can just save the context the type came from:

  type = TREE_TYPE (exp);
  type_context = &TREE_TYPE (exp);

the same amount of work on the use side, but much faster equality checking.

> An alternative to poisoning TREE_CODE would be to have TREE_TYPE do
> the mapping itself and have another macro to access the original
> (named) type:
>
>   #define TREE_TYPE(NODE) type_representative ((NODE)->common.type)
>   #define TREE_ORIGINAL_TYPE(NODE) ((NODE)->common.type)

Likewise, given those, we could do:

  #define TREE_TYPE(NODE) ((NODE)->common.type)
  #define TREE_ORIGINAL_TYPE(NODE) \
    (map((NODE)->common.type, &(NODE)->common.type) \
     ? map((NODE)->common.type, &(NODE)->common.type) \
     : (NODE)->common.type)

and remain fast for equality.

> Since we know that type canonicalization is incremental, could we work
> toward type canonicalization in the GCC 4.3 time frame?

If by we you mean you, I don't see why that would be a bad idea. :-) The risk is if one invests all this effort, and the win turns out to be < 1% on real code and 10x on benchmark code, one feels bad. ConceptGCC has hit the
Re: Canonical type nodes, or, comptypes considered harmful
On Nov 8, 2006, at 7:14 AM, Ian Lance Taylor wrote:
> The way to canonicalize them is to have all equivalent types point to
> a single canonical type for the equivalence set. The comparison is one
> memory dereference and one pointer comparison, not the current
> procedure of checking for structural equivalence.

Once not-equal addresses might mean equal types, you have to do a structure walk to compare types, and you're right back where we started. The only way to save yourself is to be able to say: different addresses _must_ be different types.

An example, are these two types the same:

  A
  B

given that A and B are the same type. Your way, you need to walk two trees, hitting memory 40 times. The cost is 40 cache misses, each one takes 100 ns, so we're up to 4000 ns to compare them. In my scheme, the addresses are the same, so for codegen you get:

  cmp p1, p2

which is 1 machine instruction, no memory hits, and this runs in around 0.2 ns, a 20,000x speedup. Now, imagine real live template code. 20 deep is nothing, and the branching is worse than one, I suspect.
Re: Canonical type nodes, or, comptypes considered harmful
On Nov 9, 2006, at 4:54 PM, Mike Stump wrote:
> On Nov 8, 2006, at 7:14 AM, Ian Lance Taylor wrote:
> > The way to canonicalize them is to have all equivalent types point
> > to a single canonical type for the equivalence set. The comparison
> > is one memory dereference and one pointer comparison, not the
> > current procedure of checking for structural equivalence.
>
> Once not equal addresses might mean equal types, you have to do a
> structure walk to compare types, and you're right back were we
> started. The only way to save yourself, is to be able to say,
> different addresses, _must_ be different types.

I had in mind something like:

  if (p1 == p2)
    equal
  else if (p1->ptr_equality_suffices_for_this_type
           || p2->ptr_equality_suffices_for_this_type)
    not equal
  else
    tree walk

Don't know how workable that is.
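The check sketched above can be written out as runnable code; the structure and field names below are illustrative stand-ins, not actual GCC internals. A per-type flag records that the node is its own canonical representative, so pointer inequality alone proves type inequality and the tree walk is skipped:

```c
#include <stddef.h>

/* Illustrative type node: 'ptr_equality_suffices' marks a canonical
   type, for which a different address implies a different type.  */
struct type {
    int ptr_equality_suffices;
    int code;                 /* kind of type */
    struct type *inner;       /* e.g. pointed-to type, or NULL */
};

/* Placeholder for the full structural comparison.  */
static int tree_walk_equal(const struct type *a, const struct type *b)
{
    if (a->code != b->code)
        return 0;
    if (a->inner && b->inner)
        return tree_walk_equal(a->inner, b->inner);
    return a->inner == b->inner;   /* equal only if both NULL */
}

int types_equal(const struct type *p1, const struct type *p2)
{
    if (p1 == p2)
        return 1;                  /* same node: trivially equal */
    if (p1->ptr_equality_suffices || p2->ptr_equality_suffices)
        return 0;                  /* canonical node, different address */
    return tree_walk_equal(p1, p2);  /* slow path */
}
```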
Re: [m32c-elf] losing track of register lifetime in combine?
I compared the generated code with an equivalent explicit test, and discovered that gcc uses a separate rtx for the intermediate:

  i = 0xf;
  if (j >= 16)
    {
      int i2;
      i2 = i >> 8;
      i = i2 >> 8;
      j -= 16;
    }

This seems to avoid the combiner problem, because you don't have the same register being set and being used in one insn. Does this explain why combine was having a problem, or was this a legitimate thing to do and the combiner is still wrong? Using a temp in the expander works around the problem.
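The actual workaround lives in the RTL expander, but it has a direct source-level analogue. This standalone sketch (names are illustrative) splits a 16-bit shift into two 8-bit shifts through a temporary, so no step both sets and uses the same value:

```c
/* Source-level analogue of the expander workaround described above:
   shift through an intermediate instead of modifying the value in
   place, so no single operation both sets and uses the same
   register.  */
unsigned int shift16_via_temp(unsigned int i)
{
    unsigned int i2 = i >> 8;  /* first half of the shift  */
    return i2 >> 8;            /* second half of the shift */
}
```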
Re: Canonical type nodes, or, comptypes considered harmful
On Thu, Nov 09, 2006 at 04:54:23PM -0800, Mike Stump wrote:
> Once not equal addresses might mean equal types, you have to do a
> structure walk to compare types, and you're right back where we
> started.

Not quite. A structure walk is required to be certain of equality, but if inequality is the more common case, some kind of hash could speed that case up. For example, in

> A
> B

if each node in the type tree contains some kind of hash code, which is built as a function of the leaves when the tree is constructed, then you avoid the tree walk. However, for

> A
> A

at different addresses, you still have to do it. Someone said earlier in this thread that inequality was 34x as common as equality, so ...
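The hash-first comparison suggested above can be sketched as follows; the structures are illustrative, not GCC's, and the hash is assumed to be computed once when the type node is built:

```c
#include <stddef.h>

/* Each type node carries a structural hash, set at creation time.
   Unequal hashes prove inequality without a tree walk; equal hashes
   still require the structural comparison.  */
struct type_node {
    unsigned hash;            /* structural hash, set when built */
    int code;                 /* kind of type */
    struct type_node *inner;  /* e.g. pointed-to type, or NULL */
};

int types_compatible(const struct type_node *a, const struct type_node *b)
{
    if (a == b)
        return 1;             /* same node: trivially equal */
    if (a->hash != b->hash)
        return 0;             /* fast path: most mismatches die here */
    if (a->code != b->code)
        return 0;             /* hash collision, different kinds */
    if (a->inner && b->inner)
        return types_compatible(a->inner, b->inner);  /* slow walk */
    return a->inner == b->inner;
}
```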
Re: Canonical type nodes, or, comptypes considered harmful
On Nov 9, 2006, at 1:06 PM, Joern RENNECKE wrote:
> I think in order to handle the C type system with the non-transitive
> type compatibility effectively, for each type we have to pre-compute
> the most general variant, even if that has no direct representative in
> the current program.

The scheme you describe is logically the same as mine, where the things I was calling variants are present in the non-most-general-variant bits but not in the most-general-variant bits. I think the data layout of your scheme is better, as then you don't have to do log n map lookups to find data; the data are right there, and for comparison, instead of address equality of the main node, you do address equality of the 'most general variant'. In my scheme, I was calling that field just the type. I'm sorry if the others were thinking of that type of scheme; I thought they weren't.

Now, what are the benefits and weaknesses between mine and yours? You don't have to carry around type_context the way mine would; that's a big win. You don't have to do anything special to move a reference to a type around; that's a big win. You have to do a structural walk if there are any bits that are used for type equality; in my scheme, I don't have to. I just have a vector of items; they are right next to each other, in the same cache line. In your scheme, you have to walk all of memory to find them, which is slow. So, if you want speed, I have a feeling mine is still faster. If you want ease of implementation or conversion, yours may be better.
Re: Canonical type nodes, or, comptypes considered harmful
On Nov 8, 2006, at 5:11 AM, Richard Kenner wrote:
> My confusion here is how can you "canonicalize" types that are
> different (meaning have different names) without messing up debug
> information.

If you have:

  typedef int Foo;
  Foo xyz;

then:

  TREE_TYPE (xyz) == int
  map(int, &TREE_TYPE (xyz)) == Foo

The debug information for xyz is name: "xyz", type: map(TREE_TYPE (decl), &TREE_TYPE (decl)), which happens to be Foo.
Re: Canonical type nodes, or, comptypes considered harmful
On Nov 9, 2006, at 5:00 PM, Dale Johannesen wrote:
> On Nov 9, 2006, at 4:54 PM, Mike Stump wrote:
>
>   else if (p1->ptr_equality_suffices_for_this_type
>            || p2->ptr_equality_suffices_for_this_type)
>     not equal
>   else
>     tree walk

For trivial things, those things that are fast anyway, you make them fast; for slow things, you make them slow; so there isn't a net change in speed.
Re: Canonical type nodes, or, comptypes considered harmful
On Nov 9, 2006, at 5:11 PM, Joe Buck wrote:
> On Thu, Nov 09, 2006 at 04:54:23PM -0800, Mike Stump wrote:
> > Once not equal addresses might mean equal types, you have to do a
> > structure walk to compare types, and you're right back were we
> > started.
>
> Not quite.

Ah, you're right, thanks for spotting that.

> A structure walk is required to be certain of equality,

:-(
Re: Planned LTO driver work
On Thu, 2006-11-09 at 12:32 -0800, Mark Mitchell wrote:
> 1. Add a --lto option to collect2. When collect2 sees this option,
> treat all .o files as if they were .rpo files and recompile them. We
> will do this after all C++ template instantiation has been done, since
> we want to optimize the .o files after the program can actually link.
>
> 2. Modify the driver so that --lto passes -flto to the C front-end and
> --lto to collect2.
>
> Any objections to this plan?

Maybe not an objection, but a suggestion with respect to static libraries: it might be useful also to look into archives for files with LTO info in them, and be able to read them inside the compiler too.

Thanks,
Andrew Pinski
Re: Planned LTO driver work
Andrew Pinski wrote:
> On Thu, 2006-11-09 at 12:32 -0800, Mark Mitchell wrote:
>> 1. Add a --lto option to collect2. When collect2 sees this option,
>> treat all .o files as if they were .rpo files and recompile them. We
>> will do this after all C++ template instantiation has been done, since
>> we want to optimize the .o files after the program can actually link.
>>
>> 2. Modify the driver so that --lto passes -flto to the C front-end and
>> --lto to collect2.
>>
>> Any objections to this plan?
>
> Maybe not an objection but a suggestion with respect of static
> libraries. It might be useful to also to look into archives for files
> with LTO info in them and be able to read them inside the compiler also.

Definitely -- but not yet. :-)

-- Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
Re: Planned LTO driver work
Ian Lance Taylor wrote:
> Mark Mitchell <[EMAIL PROTECTED]> writes:
>
>> 1. Add a --lto option to collect2. When collect2 sees this option,
>> treat all .o files as if they were .rpo files and recompile them. We
>> will do this after all C++ template instantiation has been done, since
>> we want to optimize the .o files after the program can actually link.
>>
>> 2. Modify the driver so that --lto passes -flto to the C front-end and
>> --lto to collect2.
>
> Sounds workable in general. I note that in your example of
>   gcc --lto foo.c bar.o
> this presumably means that bar.o will be recompiled using the compiler
> options specified on that command line, rather than, say, the compiler
> options specified when bar.o was first compiled. This is probably the
> correct way to handle -march= options.

I think so. Of course, outright conflicting options (e.g., different ABIs between the original and subsequent compilation) should be detected and an error issued. There has to be one set of options for LTO, so I don't see much benefit in recording the original options and trying to reuse them. We can't generate code for two different CPUs, or optimize both for speed and for size, for example. (At least not without a lot more stuff that we don't presently have.)

> I assume that in the long run, the gcc driver with --lto will invoke
> the LTO frontend rather than collect2. And that the LTO frontend will
> then open all the .o files which it is passed.

Either that, or, at least, collect2 will invoke LTO once with all of the .o files. I'm not sure if it matters whether it's the driver or collect2 that does the invocation. What do you think?

In any case, for now, I'm just trying to move forward, and the collect2 route looks a bit easier. If you're concerned about that, then I'll take note to revisit and discuss before anything goes to mainline.

Thanks,

-- Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
Re: Abt long long support
Thanks for the input and the questions.

> Did you examine:
>   long long l, k;
>   l = -k;
> for correctness by itself? Was it valid or invalid?

Yes, this is working.

> [ read ahead for spoilers, I'd rather you pull this information out of
> the dump and present it to us... ]
>
> A quick glance at the rtl shows that insn 95 tries to use [a4+4] but
> insn 94 clobbered a4 already, also d3 is used by insn 93, but there
> isn't a set for it.

Looks like you have found the problem, but I need to look into it more.

> The way the instructions are numbered suggests that the code went
> wrong before this point. You have to read and understand all the

The instructions are numbered randomly and not in increasing order... but looking at the diff of working and non-working code I thought it was not an issue. Is this natural? Those are ids for insns, and each insn has a unique id. Is it wrong for the insn ids to be in a jumbled fashion?

And one more thing: in the dumps I noticed that before using a register in DI mode they are all clobbered first, like

  (insn 30 54 28 6 (clobber (reg:DI 34)) -1 (nil) (nil))

What is the use of this insn? Why do we need to clobber these registers before the use? After some pass they are not seen in the dump.

Regards,
Shafi.
Getting "char" from INTEGER_TYPE node
I am having some trouble with getting type names as declared by the user in source.  In particular, if I have two functions:

void Function(int i);
void Function(char c);

when processing the parameters I get an INTEGER_TYPE node in the parameter list for both functions, as expected; however, IDENTIFIER_POINTER(DECL_NAME(TYPE_NAME(node))) returns the string "int" for both nodes.  I would have expected one to be "int" and the other to be "char".  Looking at the TYPE_PRECISION for these nodes I do get correct values, though: one has 8-bit precision, the other 32-bit.  How can I get the "char" string when a user uses "char" types instead of the "int" string?  Thanks, Brendon.
cp_parser_parameter_declaration_clause
The function cp_parser_parameter_declaration_clause says that it returns NULL if the parameters are (...).  However, there is a line of code that is:

  /* Parse the parameter-declaration-list.  */
  parameters = cp_parser_parameter_declaration_list (parser, &is_error);
  /* If a parse error occurred while parsing the
     parameter-declaration-list, then the entire
     parameter-declaration-clause is erroneous.  */
  if (is_error)
    return NULL;

So how does one tell if this function has returned an error?  Thanks, Sohail
Re: Planned LTO driver work
Mark Mitchell <[EMAIL PROTECTED]> writes:

> > I assume that in the long run, the gcc driver with --lto will invoke
> > the LTO frontend rather than collect2.  And that the LTO frontend will
> > then open all the .o files which it is passed.
>
> Either that, or, at least, collect2 will invoke LTO once with all of the
> .o files.  I'm not sure if it matters whether it's the driver or
> collect2 that does the invocation.  What do you think?

I think in the long run the driver should invoke the LTO frontend directly.  The LTO frontend will then presumably emit a single .s file.  Then the driver should invoke the assembler as usual, and then the linker.  That will save a process--if collect2 does the invocation, we have to run the driver twice.

Bad way:

gcc
  collect2
    gcc        <-- there is the extra process
      lto1
      as
    ld

Good way:

gcc
  lto1
  as
  collect2
    ld

(or else we have to teach collect2 how to invoke as directly, which just sounds painful).

> In any case, for now, I'm just trying to move forward, and the collect2
> route looks a bit easier.  If you're concerned about that, then I'll
> take note to revisit and discuss before anything goes to mainline.

No worries on my part.

Ian
Re: Abt long long support
"Mohamed Shafi" <[EMAIL PROTECTED]> writes: > and one more thing. In the dumps i noticed that before using a > register in DI mode they are all clobbred first, like > > (insn 30 54 28 6 (clobber (reg:DI 34)) -1 (nil) > (nil)) > > What is the use of this insns ... Why do we need to clobber these > registers befor the use? After some pass they are not seen in the > dump. It's a hack to tell the flow pass that the register is not used before it is set. Otherwise when the code initializes half of the register, flow will think that the other half is live, perhaps having been initialized before the function started. The clobber tells flow that the register is completely dead, and is initialized one half at a time. Ian
Re: Canonical type nodes, or, comptypes considered harmful
Mike Stump <[EMAIL PROTECTED]> writes:

> On Nov 8, 2006, at 7:14 AM, Ian Lance Taylor wrote:
> > The way to canonicalize them is to have all equivalent types point to
> > a single canonical type for the equivalence set.  The comparison is
> > one memory dereference and one pointer comparison, not the current
> > procedure of checking for structural equivalence.
>
> Once not equal addresses might mean equal types, you have to do a
> structure walk to compare types, and you're right back where we
> started.  The only way to save yourself is to be able to say:
> different addresses _must_ be different types.

I have no idea what you mean by this.  I meant something very simple: for every type, there is a TYPE_CANONICAL field.  This is how you tell whether two types are equivalent:

  TYPE_CANONICAL (a) == TYPE_CANONICAL (b)

That is what I mean when I say one memory dereference and one pointer comparison.

> An example, are these two types the same:
>
>   A
>   B
>
> given that A and B are the same type.  Your way, you need to walk two
> trees, hitting memory 40 times.

No.  When you create *A, you also create * (TYPE_CANONICAL (A)) (this may be the same as *A, of course).  You set TYPE_CANONICAL (*A) to that type.  And the same for *B.  Since TYPE_CANONICAL (A) == TYPE_CANONICAL (B) by assumption, you make sure that TYPE_CANONICAL (*A) == TYPE_CANONICAL (*B).

Ian
Re: [m32c-elf] losing track of register lifetime in combine?
DJ Delorie <[EMAIL PROTECTED]> writes:

> I compared the generated code with an equivalent explicit test,
> and discovered that gcc uses a separate rtx for the intermediate:
>
>   i = 0xf;
>   if (j >= 16)
>     {
>       int i2;
>       i2 = i >> 8;
>       i = i2 >> 8;
>       j -= 16;
>     }
>
> This seems to avoid the combiner problem, because you don't have the
> same register being set and being used in one insn.  Does this explain
> why combine was having a problem, or was this a legitimate thing to do
> and the combiner is still wrong?  Using a temp in the expander works
> around the problem.

Interesting.  Using a temporary is the natural way to implement this code.  But not using a temporary should be valid, so I think there is a bug in combine.  Since using a temporary will give more CSE opportunities, though, I think you should use a temporary.  And you shouldn't worry about fixing combine, since all that code is going to have to change on dataflow-branch anyhow.  (Actually, it probably just works on dataflow-branch.)

Ian
How to create both -option-name-* and -option-name=* options?
The Fortran front end currently has a lang.opt entry of the following form:

ffixed-line-length-
Fortran RejectNegative Joined UInteger

I would like to add to this the following option, which differs in the last character but should be treated identically:

ffixed-line-length=
Fortran RejectNegative Joined UInteger

(Why do I want to do this horrible thing?  Well, the second is really the syntax we should be using, but I would like to just undocument the first version rather than removing it, so as not to break backwards compatibility with everyone's makefiles.)

Anyhow, if I try this, I get the following error (trimmed slightly for clarity):

gcc -c [...] ../../svn-source/gcc/genconstants.c
In file included from tm.h:7,
                 from ../../svn-source/gcc/genconstants.c:32:
options.h:659: error: redefinition of `OPT_ffixed_line_length_'
options.h:657: error: `OPT_ffixed_line_length_' previously defined here

This is because both the '=' and the '-' in the option name reduce to a '_' in the enumeration name, which of course causes the enumerator to get defined twice -- and that's a problem, even though I'm quite happy for the options to both be treated identically.

There's not really any good way around this problem, is there?

- Brooks
Re: Planned LTO driver work
Ian Lance Taylor wrote:
> Mark Mitchell <[EMAIL PROTECTED]> writes:
>
>>> I assume that in the long run, the gcc driver with --lto will invoke
>>> the LTO frontend rather than collect2.  And that the LTO frontend will
>>> then open all the .o files which it is passed.
>>
>> Either that, or, at least, collect2 will invoke LTO once with all of the
>> .o files.  I'm not sure if it matters whether it's the driver or
>> collect2 that does the invocation.  What do you think?
>
> I think in the long run the driver should invoke the LTO frontend
> directly.
> That will save a process--if collect2 does the invocation, we have to
> run the driver twice.

Good point.  Probably not a huge deal in the context of optimizing the whole program, but still, why be stupid?

Though, if we *are* doing the template-repository dance, we'll have to do that for a while, declare victory, then invoke the LTO front end, and, finally, the actual linker, which will be a bit complicated.  It might be that we should move the invocation of the real linker back into gcc.c, so that collect2's job just becomes generating the right pile of object files via template instantiation and static constructor/destructor generation?

-- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713