[PATCH] trivial fix for typo in gcc/configure.ac
Good morning,

The following trivial patch fixes an apparent typo in configure.ac in the gcc/ subdirectory. This is HEAD; I did not check whether this is needed on any of the branches as well.

Regards, Jan van Dijk

2006-11-06  Jan van Dijk  <[EMAIL PROTECTED]>

	* configure.ac: Fixed typo in case statement: :: changed to ;;

Index: configure.ac
===================================================================
--- configure.ac	(revision 118516)
+++ configure.ac	(working copy)
@@ -1438,7 +1438,7 @@
    *)
      AC_CHECK_FUNC(__cxa_atexit,[use_cxa_atexit=yes],
        [echo "__cxa_atexit can't be enabled on this target"])
-      ::
+      ;;
  esac
else
  # We can't check for __cxa_atexit when building a cross, so assume

--
If I haven't seen further, it is by standing in the footprints of giants
Re: compiling very large functions.
Kenneth Zadeck wrote on 11/04/06 15:17:

> 1) defining the set of optimizations that need to be skipped.
> 2) defining the set of functions that trigger the special processing.

This seems too simplistic. The number of variables/blocks/statements is a factor, but they may interact in ways that are difficult or impossible to compute until after the optimization has started (it may depend on how many blocks have this or that property, in/out degree, number of variables referenced in statements, grouping of something or other, etc).

So, in my view, each pass should be responsible for throttling itself. The pass gate functions already give us the mechanism for on/off. I agree that we need more graceful throttles. And then we have components of the pipeline that cannot really be turned on/off (like alias analysis) but could throttle themselves based on size (working on that).

> The compilation manager could then look at the options, in particular
> the -O level and perhaps some new options to indicate that this is a
> small machine or in the other extreme "optimize all functions come
> hell or high water!!" and skip those passes which will cause
> performance problems.

All this information is already available to the gate functions. There isn't a lot here that the pass manager needs to do. We already know the compilation options, target machine features, and overall optimization level. What we do need is for each pass to learn to throttle itself and/or turn itself off.

Turning the pass off statically and quickly could be done in the gating function; a quick analysis of the CFG made by the pass itself may be enough to decide. We could provide a standard group of heuristics with standard metrics that lazy passes could use, say a 'cfg_too_big_p' or 'cfg_too_jumpy_p' that passes could call to decide not to run, or to set internal flags that partially disable parts of the pass (much like DCE can work with or without control-dependence information), as sketched below.
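A minimal sketch of what such a gate could look like; the pass, its flag, and the --param are hypothetical, only the gating mechanism itself is existing practice:

  /* Gate for a hypothetical pass: skip it entirely when the CFG is
     too large.  PARAM_FROBNICATE_MAX_BLOCKS is an assumed --param,
     named here only for illustration.  */
  static bool
  gate_tree_frobnicate (void)
  {
    if (!flag_tree_frobnicate)
      return false;
    /* A cheap size check; a shared 'cfg_too_big_p' helper could
       wrap a test like this.  */
    if (n_basic_blocks > PARAM_VALUE (PARAM_FROBNICATE_MAX_BLOCKS))
      return false;
    return true;
  }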
Re: defunct fortran built by default for cross-compiler
Joern Rennecke wrote:

> It appears that most of the errors are of the form:
>
> collect-ld: cannot find -lgfortranbegin

I've found that the problem was related to configure deciding to build fortran and enable runtime tests for it when doing check-gcc, even though libgfortran was not present; I had made my script remove that some time ago because libgfortran was not supported. When I tried to configure with libgfortran present and add the make target all-target-libgfortran, I get after about three hours:

/mnt/scratch/nightly/2006-11-02-softfp/sh-elf/./gcc/xgcc -B/mnt/scratch/nightly/2006-11-02-softfp/sh-elf/./gcc/ -nostdinc -B/mnt/scratch/nightly/2006-11-02-softfp/sh-elf/sh-multi-elf/newlib/ -isystem /mnt/scratch/nightly/2006-11-02-softfp/sh-elf/sh-multi-elf/newlib/targ-include -isystem /mnt/scratch/nightly/2006-11-02-softfp/srcw/newlib/libc/include -B/usr/local/sh-multi-elf/bin/ -B/usr/local/sh-multi-elf/lib/ -isystem /usr/local/sh-multi-elf/include -isystem /usr/local/sh-multi-elf/sys-include -L/mnt/scratch/nightly/2006-11-02-softfp/sh-elf/./ld -DHAVE_CONFIG_H -I. -I../../../srcw/libgfortran -I. -iquote../../../srcw/libgfortran/io -I../../../srcw/libgfortran/../gcc -I../../../srcw/libgfortran/../gcc/config -I../.././gcc -D_GNU_SOURCE -std=gnu99 -Wall -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wextra -Wwrite-strings -O2 -g -O2 -c ../../../srcw/libgfortran/runtime/error.c -o error.o
../../../srcw/libgfortran/runtime/error.c: In function 'show_locus':
../../../srcw/libgfortran/runtime/error.c:288: warning: format '%d' expects type 'int', but argument 2 has type 'GFC_INTEGER_4'
../../../srcw/libgfortran/runtime/error.c: At top level:
../../../srcw/libgfortran/runtime/error.c:334: error: '_gfortran_runtime_error' aliased to undefined symbol '__gfortrani_runtime_error'
make[2]: *** [error.lo] Error 1
make[2]: Leaving directory `/mnt/scratch/nightly/2006-11-02-softfp/sh-elf/sh-multi-elf/libgfortran'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/mnt/scratch/nightly/2006-11-02-softfp/sh-elf/sh-multi-elf/libgfortran'
make: *** [all-target-libgfortran] Error 2

So it appears the only way to do a regression test now is to hard-code, with --enable-languages, the set of languages that are known to generally work, i.e. c,c++ and objc.
Re: compiling very large functions.
On Sat, 2006-11-04 at 15:17 -0500, Kenneth Zadeck wrote:

> However, I think that before anyone starts hacking anything, we should
> come to a consensus on an overall strategy and implement something
> consistent for the entire compiler rather than attack some particular
> pass in a manner that only gets us past the next PR.

I think we really do need to deal with it on a per-pass basis. We occasionally get a monster PR which shows half a dozen (or more) serious time hogs. We more often get a pathological case for a specific pass.

In each and every one of these cases, someone has taken a look at how to improve the time hog(s). Usually the result is a leaner/faster and improved pass, although it sometimes takes a new release for it to happen :-). Occasionally, we turn something off. A couple of PRs with massive functions are primarily responsible for the pending changes to out-of-ssa... And those changes will benefit small functions as well as the large ones.

If we start throttling some of these passes simply because a big function is coming through, we are likely to miss out on some of these further improvements triggered by large functions.

Another reason I favour per-pass is because there may very well be throttling options available which don't completely turn off the pass. There may still be useful things that can be done with less precise information, for instance... Only the individual pass can properly make that decision. I think the effort put into a pass manager would be better put into the passes themselves.

And any time a pass is actually throttled, it should be well documented as to why it is being throttled, the PR/testcase which is causing it to be throttled, and ideas/suggestions for changes in the future which will allow it to become un-throttled. When we start a new release, someone could take a quick visit to each of these throttled passes and see if there are any proposed projects to help with them. Encourage these types of projects, maybe even disable all throttling in stage 1. And finally, maybe in stage 2, take a quick run through and see if the throttling is still necessary. Perhaps other infrastructural changes have helped some, and the throttling decision needs to be changed. I think it's important we not throttle-and-forget optimizations.

Andrew
Volatile operations and PRE
Hello,

I have discovered that volatile expressions can cause the tree-ssa pre pass to loop forever in "compute_antic". The problem seems to be that the expression is assigned a different value number at each iteration, hence the fixed point required to exit the loop is never reached.

This can be fixed with the attached patch, which modifies "can_value_number_operation" to return false for volatile expressions. I think this makes sense, because you cannot value number volatile expressions (in the same sense that you cannot value number function calls that are not pure or const).

I cannot easily provide a testcase because this problem appears only with a gcc frontend that I am writing. With this fix, volatile accesses work correctly (without it they work correctly only if this pass is disabled).

Do you think this patch is correct?

Thanks,
Ricardo.

Index: gcc/tree-ssa-pre.c
===================================================================
--- gcc/tree-ssa-pre.c	(revision 557)
+++ gcc/tree-ssa-pre.c	(working copy)
@@ -2133,12 +2133,13 @@
 static bool
 can_value_number_operation (tree op)
 {
-  return UNARY_CLASS_P (op)
-    || BINARY_CLASS_P (op)
-    || COMPARISON_CLASS_P (op)
-    || REFERENCE_CLASS_P (op)
-    || (TREE_CODE (op) == CALL_EXPR
-	&& can_value_number_call (op));
+  return (UNARY_CLASS_P (op)
+	  || BINARY_CLASS_P (op)
+	  || COMPARISON_CLASS_P (op)
+	  || REFERENCE_CLASS_P (op)
+	  || (TREE_CODE (op) == CALL_EXPR
+	      && can_value_number_call (op)))
+    && !TREE_THIS_VOLATILE (op);
 }
Re: Volatile operations and PRE
On 11/6/06, Ricardo FERNANDEZ PASCUAL <[EMAIL PROTECTED]> wrote:

> Hello, I have discovered that volatile expressions can cause the
> tree-ssa pre pass to loop forever in "compute_antic". The problem
> seems to be that the expression is assigned a different value number
> at each iteration, hence the fixed point required to exit the loop is
> never reached.

This should not be possible; it would imply that you have SSA names marked as volatile, or a statement whose operands are marked volatile but whose ann->has_volatile_ops is false.

> This can be fixed with the attached patch, which modifies
> "can_value_number_operation" to return false for volatile
> expressions. I think this makes sense, because you cannot value
> number volatile expressions (in the same sense that you cannot value
> number function calls that are not pure or const).

This is wrong. The only place that can_value_number_operation is used is inside an if block that says:

  else if (TREE_CODE (stmt) == MODIFY_EXPR
	   && !ann->has_volatile_ops
	   && TREE_CODE (TREE_OPERAND (stmt, 0)) == SSA_NAME
	   && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (TREE_OPERAND (stmt, 0)))
    {
      ...
      if (can_value_number_operation (rhs))
	{
	}
    }

Any statement which contains volatile operands should have ann->has_volatile_ops set. That is your real bug.

> I cannot easily provide a testcase because this problem appears only
> with a gcc frontend that I am writing. With this fix, volatile
> accesses work correctly (without it they work correctly only if this
> pass is disabled). Do you think this patch is correct?

Nope, for the reasons above.
Re: Volatile operations and PRE
Thank you for your answer. I give some more information below.

Daniel Berlin wrote:

> This should not be possible; it would imply that you have SSA names
> marked as volatile, or a statement whose operands are marked volatile
> but whose ann->has_volatile_ops is false.

Actually, I have a MODIFY_EXPR with a volatile operand whose ann->has_volatile_ops is not being set.

> Any statement which contains volatile operands should have
> ann->has_volatile_ops set. That is your real bug.

Ok, the tree that is giving me problems is the following one:

[debug_tree dump mangled in archiving; the angle-bracketed node headers were stripped. What remains shows a MODIFY_EXPR with side-effects whose arg 0 is an SSA_NAME (version 3) and whose arg 1 is a COMPONENT_REF marked side-effects and volatile, referencing a non-volatile FIELD_DECL of a static record variable, at volatile.exe line 1.]

Notice that the arg 1 of the MODIFY_EXPR is a COMPONENT_REF which is marked as volatile. Notice also that the arg 1 of the COMPONENT_REF is not marked as such, because that field is not volatile itself and there are other accesses to it which are not volatile. This is because in my source language individual loads or stores can be marked as volatile, not the variables.

So, I think the real question is: are COMPONENT_REF nodes allowed to be marked as volatile by themselves? I think they should be, and actually it seems to work (the generated code looks correct). If it is allowed, the attached patch would solve my problem too. Would it be correct this time?

Thanks for your help,
Ricardo.

Index: ../gcc/tree-ssa-operands.c
===================================================================
--- ../gcc/tree-ssa-operands.c	(revision 557)
+++ ../gcc/tree-ssa-operands.c	(working copy)
@@ -1960,7 +1960,7 @@
       if (code == COMPONENT_REF)
 	{
-	  if (s_ann && TREE_THIS_VOLATILE (TREE_OPERAND (expr, 1)))
+	  if (s_ann && (TREE_THIS_VOLATILE (TREE_OPERAND (expr, 1)) || TREE_THIS_VOLATILE (expr)))
 	    s_ann->has_volatile_ops = true;
 	  get_expr_operands (stmt, &TREE_OPERAND (expr, 2), opf_none);
 	}
Re: compiling very large functions.
Andrew MacLeod wrote:

> On Sat, 2006-11-04 at 15:17 -0500, Kenneth Zadeck wrote:
>
>> However, I think that before anyone starts hacking anything, we
>> should come to a consensus on an overall strategy and implement
>> something consistent for the entire compiler rather than attack some
>> particular pass in a manner that only gets us past the next PR.
>
> I think we really do need to deal with it on a per-pass basis. We
> occasionally get a monster PR which shows half a dozen (or more)
> serious time hogs. We more often get a pathological case for a
> specific pass.
>
> In each and every one of these cases, someone has taken a look at how
> to improve the time hog(s). Usually the result is a leaner/faster and
> improved pass, although it sometimes takes a new release for it to
> happen :-). Occasionally, we turn something off. A couple of PRs with
> massive functions are primarily responsible for the pending changes
> to out-of-ssa... And those changes will benefit small functions as
> well as the large ones.

The problem with trying to solve this problem on a per-pass basis rather than coming up with an integrated solution is that we are completely leaving the user out of the thought process. There are some users who have big machines or a lot of time on their hands and want "damn the torpedoes, full speed ahead", and there are some users who want reasonable decisions made even at high optimization. We need to give them an easy-to-turn knob.

I am not saying that my original proposal was the best of all possible worlds, but hacking things on a pass-by-pass or PR-by-PR basis is not really solving the problem.

kenny
Re: Volatile operations and PRE
Ricardo FERNANDEZ PASCUAL writes:

> Notice that the arg 1 of the MODIFY_EXPR is a COMPONENT_REF which is
> marked as volatile. Notice also that the arg 1 of the COMPONENT_REF
> is not marked as such, because that field is not volatile itself and
> there are other accesses to it which are not volatile. This is
> because in my source language individual loads or stores can be
> marked as volatile, not the variables.
>
> So, I think the real question is: are COMPONENT_REF nodes allowed to
> be marked as volatile by themselves? I think they should be, and
> actually it seems to work (the generated code looks correct).

volatile is a type qualifier. The type of a COMPONENT_REF is the type of the operand to which it refers. If you want to change the effective type of a reference, you should generate a suitable CONVERT_EXPR. Like this:

  tree exp_type = TREE_TYPE (exp);
  tree v_type = build_qualified_type (exp_type,
				      TYPE_QUALS (exp_type) | TYPE_QUAL_VOLATILE);
  tree addr = build_fold_addr_expr (exp);
  v_type = build_pointer_type (v_type);
  addr = fold_convert (v_type, addr);
  exp = build_fold_indirect_ref (addr);

Andrew.
Re: compiling very large functions.
> The problem with trying to solve this problem on a per-pass basis
> rather than coming up with an integrated solution is that we are
> completely leaving the user out of the thought process. There are
> some users who have big machines or a lot of time on their hands and
> want "damn the torpedoes, full speed ahead", and there are some users
> who want reasonable decisions made even at high optimization. We need
> to give them an easy-to-turn knob.

This is why we have -O2 vs -O3, and why we have -fexpensive-optimizations.

> I am not saying that my original proposal was the best of all
> possible worlds, but hacking things on a pass-by-pass or PR-by-PR
> basis is not really solving the problem.

Sure it is. The problem with your approach is that most of the algorithms in GCC that sometimes have very bad times with large functions do not have such simple qualities as "they take a long time when n_basic_blocks is large". First, almost none of the algorithms are super-linear in any easy-to-calculate-on-a-global-basis way. The only easy ones are regalloc and non-modulo scheduling. Everything else is just not stupid enough to be N^2 in the number of instructions or basic blocks.

Take points-to analysis, one of our N^3 worst-case algorithms. You can throttle PTA by losing precision for speed very easily. However, the time bound is very complex. It takes N iterations, where N is the length of the largest uncollapseable cycle in the constraint graph. Each iteration takes time proportional to V*S, where V is the number of non-unifiable variables and S is the size of the pointed-to sets. Uncollapseable cycles only occur when you have address-taking of pointed-to fields, or pointer arithmetic on pointers to structures (i.e., a = a->next). Variables can't be unified or collapsed only in some odd cases. In almost *all* cases, it acts linear. I can tell you whether points-to is going to take an hour or two *after* we've constructed and simplified the constraint graph (i.e., before we spend time solving it). I certainly can't tell you based on the number of basic blocks or instructions.

Want another example? Take GVN-PRE, which is potentially N^2. The time bound is related to the largest number of distinct values that occur in a function. Even on very large functions, this may not be a lot. We have plenty of incredibly large functions that just don't take a lot of time in PRE.

The only way to make reasonable decisions about time is on a per-pass basis, because our passes just don't have bad cases that correspond to global heuristics.
Re: compiling very large functions.
Kenneth Zadeck wrote on 11/06/06 12:54:

> I am not saying that my original proposal was the best of all
> possible worlds, but hacking things on a pass-by-pass or PR-by-PR
> basis is not really solving the problem.

I don't think it's a hackish approach. We have policy setting at the high level (-O[123]) and local implementation of that policy via the gating functions. Providing common predicates that every pass can use to decide whether to switch itself off is fine (size estimators, high connectivity, etc.), but ultimately the analysis required to determine whether a function is too expensive for a pass may not be the same from one pass to the other.

OTOH, just using the gating function is not enough. Sometimes you want the pass to work in a partially disabled mode (like the CD-DCE example I mentioned earlier).

In terms of machinery, I don't think we are missing a lot. All the information is already there. What we are missing is the implementation of more throttling/disabling mechanisms.
Re: Where is the splitting of MIPS %hi and %lo relocations handled?
David Daney <[EMAIL PROTECTED]> writes:

> I am going to try to fix:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29721
>
> which is a problem where a %lo relocation gets separated from its
> corresponding %hi.
>
> What is the mechanism that tries to prevent this from happening? And
> where is it implemented?

This is implemented by having the assembler sort the relocations so that each %lo relocation follows the appropriate set of %hi relocations. It is implemented in gas/config/tc-mips.c in append_insn. Look for reloc_needs_lo_p and mips_frob_file.

At first glance the assembler does appear to handle %got correctly, so I'm not sure why it is failing for you.

Ian
Re: Abt long long support
"Mohamed Shafi" <[EMAIL PROTECTED]> writes: > Looking at a .md file of a backend it there a way to know whether a > target supports long long gcc always supports "long long" for all targets. Can you ask a more precise question? Ian
Re: Where is the splitting of MIPS %hi and %lo relocations handled?
Ian Lance Taylor wrote:

> This is implemented by having the assembler sort the relocations so
> that each %lo relocation follows the appropriate set of %hi
> relocations. It is implemented in gas/config/tc-mips.c in
> append_insn. Look for reloc_needs_lo_p and mips_frob_file.
>
> At first glance the assembler does appear to handle %got correctly,
> so I'm not sure why it is failing for you.

Did you look at the assembly fragment in the PR? Is it correct in that there is a pair of %got/%lo in the middle of another %got/%lo pair?

David Daney
Re: Where is the splitting of MIPS %hi and %lo relocations handled?
David Daney <[EMAIL PROTECTED]> writes:

> Did you look at the assembly fragment in the PR?
>
> Is it correct in that there is a pair of %got/%lo in the middle of
> another %got/%lo pair?

Sure, why not? They can be disambiguated by looking at which symbol the %got/%lo applies to. That is just what the assembler reloc sorting implements in mips_frob_file. (Or, at least, is supposed to implement, though apparently something is going wrong in your case.) The assembler sorts the relocations so that the linker always sees the %lo reloc immediately after the corresponding %got reloc(s).

Ian
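To make that concrete, a hedged illustration of the kind of interleaving being asked about; the symbols x and y and the register choices are made up, not taken from the PR:

	lw	$2, %got(x)($28)	# %got reloc against x
	lw	$3, %got(y)($28)	# %got reloc against y, inside x's pair
	addiu	$2, $2, %lo(x)		# %lo reloc against x
	addiu	$3, $3, %lo(y)		# %lo reloc against y

The sorting is supposed to reorder the relocation table entries (not the instructions) so that each %lo entry immediately follows the %got entries for the same symbol.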
Re: Where is the splitting of MIPS %hi and %lo relocations handled?
Ian Lance Taylor wrote:

> Sure, why not? They can be disambiguated by looking at which symbol
> the %got/%lo applies to. That is just what the assembler reloc
> sorting implements in mips_frob_file. (Or, at least, is supposed to
> implement, though apparently something is going wrong in your case.)
> The assembler sorts the relocations so that the linker always sees
> the %lo reloc immediately after the corresponding %got reloc(s).

OK, thanks. I will look at the assembler. It will not be the first binutils bug that has affected the MIPS GCC.

David Daney
RE: [PATCH] trivial fix for typo in gcc/configure.ac
> 2006-11-06  Jan van Dijk  <[EMAIL PROTECTED]>
>
> 	* configure.ac: Fixed typo in case statement: :: changed to ;;

Sorry, that was my typo. I have committed your patch, with the additional

	* configure: Regenerate.

as obvious. Thanks.

Danny
Re: Abt long long support
On Mon, Nov 06, 2006 at 10:52:00AM +0530, Mohamed Shafi wrote:

> Hello all,
>
> Looking at a .md file of a backend, is there a way to know whether a
> target supports long long?
> Should I look for patterns with machine mode DI?

No. For example, 8-bit, 16-bit and 32-bit targets should normally not define patterns such as anddi3, iordi3 and xordi3. It is possible that a target could have no patterns with mode DI but still support long long, although probably with significant slowdown. E.g. the middle end can synthesize adddi3 and subdi3 from SImode operations, but I think most targets can easily improve 10x in terms of speed and size on that code.

Watch out for targets where units are larger than 8 bits. An example is the c4x, where a unit is 32 bits and HImode is 64 bits.

> Is there some other way?

This depends a lot on exactly what you mean when you say support, but grep for LONG_TYPE_SIZE and LONG_LONG_TYPE_SIZE in the .h file and compare the two.

--
Rask Ingemann Lambertsen
Re: differences between dg-do compile and dg-do assemble
On Nov 5, 2006, at 6:52 PM, Manuel López-Ibáñez wrote:

> Although I understand the difference between dg-do compile and dg-do
> assemble, I have noticed that there are many testcases that use
> either dg-do compile or dg-do assemble and do nothing with the
> output. Thus, I would like to know: is {dg-do compile} or {dg-do
> assemble} faster?

Our assembler is in the 1-2% range of compile time, so using the right one might speed things up 1-2%, if we didn't test on 2-4 processor boxes; but we do, so we don't care. On a 1-processor box it is so marginal as to be almost not worth worrying about, though I'd still tend to pick the right one for tests I author.

> Is it appropriate to always use the faster one if the testcase just
> checks for the presence/absence of warnings and errors?

Yes, it is appropriate to use the right one. As for which one is right: the one that FAILs when the compiler has the bug and PASSes when the compiler has been fixed is a good first-order approximation. Beyond that, if assemble is enough to do that, you can use it. Some testcases will need compile to elicit a FAIL. It is natural that some people will tend to use compile instead of assemble occasionally, when assemble would have worked; don't worry about it, it is healthy to have a mix.
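For reference, a minimal sketch of a diagnostic-only test using these directives; the function and the expected warning text are made up for illustration:

	/* { dg-do assemble } */
	int f (void)
	{
	  return g ();	/* { dg-warning "implicit declaration" } */
	}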
Encouraging indirect addressing mode
Hello,

I've been working with some old programs that have been built with other compilers and moving them to GCC. The code is for an embedded m68k (mcpu32) application with no onboard OS (yet). I've been disappointed with the size of the code that I've seen generated by the compiler, and after looking at the disassembly, I think that a lot of the bloat is due to the fact that the compiler is using little to no indirect address access, even with -Os. Here's an example of something I might see:

file: test.c

static int main_line = 0;

int foo(int *a, int *b, int *c, int *d)
{
#define DOLINE if(*pmain_line < __LINE__) *pmain_line = __LINE__
    register int *const pmain_line = &main_line;
    DOLINE; *a = *b+*c;
    DOLINE; *b = *c+*d;
    DOLINE; *c = *d+*a;
    DOLINE; *d = *a+*b;
    return *a;
}

m68k-elf-gcc -mcpu32 -Os -g -c test.c -o test
m68k-elf-objdump -S test

test:     file format elf32-m68k

Disassembly of section .text:

00000000 <foo>:
static int main_line = 0;
int foo(int *a, int *b, int *c, int *d)
{
   0:	4e56		linkw %fp,#0
   4:	2f0b		movel %a3,%sp@-
   6:	2f0a		movel %a2,%sp@-
   8:	226e 0008	moveal %fp@(8),%a1
   c:	266e 000c	moveal %fp@(12),%a3
  10:	206e 0010	moveal %fp@(16),%a0
  14:	246e 0014	moveal %fp@(20),%a2
#define DOLINE if(*pmain_line < __LINE__) *pmain_line = __LINE__
    register int *const pmain_line = &main_line;
    DOLINE; *a = *b+*c;
  18:	7005		moveq #5,%d0
  1a:	b0b9		cmpl 0 <main_line>,%d0
  20:	6d0a		blts 2c <foo+0x2c>
  22:	103c 0006	moveb #6,%d0
  26:	23c0		movel %d0,0 <main_line>
  2c:	2013		movel %a3@,%d0
  2e:	d090		addl %a0@,%d0
  30:	2280		movel %d0,%a1@
    DOLINE; *b = *c+*d;
  32:	7006		moveq #6,%d0
  34:	b0b9		cmpl 0 <main_line>,%d0
  3a:	6d0a		blts 46 <foo+0x46>
  3c:	103c 0007	moveb #7,%d0
  40:	23c0		movel %d0,0 <main_line>
  46:	2010		movel %a0@,%d0
  48:	d092		addl %a2@,%d0
  4a:	2680		movel %d0,%a3@
    DOLINE; *c = *d+*a;
  4c:	7007		moveq #7,%d0
  4e:	b0b9		cmpl 0 <main_line>,%d0
  54:	6d0a		blts 60 <foo+0x60>
  56:	103c 0008	moveb #8,%d0
  5a:	23c0		movel %d0,0 <main_line>
  60:	2012		movel %a2@,%d0
  62:	d091		addl %a1@,%d0
  64:	2080		movel %d0,%a0@
    DOLINE; *d = *a+*b;
  66:	7008		moveq #8,%d0
  68:	b0b9		cmpl 0 <main_line>,%d0
  6e:	6d0a		blts 7a <foo+0x7a>
  70:	103c 0009	moveb #9,%d0
  74:	23c0		movel %d0,0 <main_line>
  7a:	2011		movel %a1@,%d0
  7c:	d093		addl %a3@,%d0
  7e:	2480		movel %d0,%a2@
    return *a;
}
  80:	2011		movel %a1@,%d0
  82:	245f		moveal %sp@+,%a2
  84:	265f		moveal %sp@+,%a3
  86:	4e5e		unlk %fp
  88:	4e75		rts

Here I've used a macro to keep track of the farthest place reached in the code. As you can see, I've even tried to set it up in such a way that it will use a register to access the value. However, I don't get that result, as I guess that is optimized out. Instead each comparison uses the full address of the variable, creating two more words for the read and for the write. I'd prefer a sequence to read something like:

	movel #main_line,%a0	/* only once, at the start of the function */
	moveq #(LINE-1),%d0
	cmpl  %a0@,%d0
	blt   skip
	moveb #LINE,%d0
	movel %d0,%a0@
skip:	...

I haven't seen any options that encourage more use of indirect addressing. Are there any that I have missed? If not, I assume I will need to work with the machine description. I've downloaded the gcc internals book, but it's a lot of material and it's hard to figure out where to start. Can anybody point me in the right direction?

Thanks,

Luke Powell
Project Engineer
BJ Services
Should GMP 4.1+ and MPFR 2.2+ be needed when we're not building gfortran?
Hello, The configure changes on the trunk require GMP 4.1+ and MPFR 2.2+. If I understand things correctly, these libraries are only needed for gfortran. Would it be possible to disable the checks for GMP and MPFR when building with --enable-languages=[something not including fortran] ? Please CC me on any replies; I'm not on the main GCC list. Cheers, Doug
Re: Should GMP 4.1+ and MPFR 2.2+ be needed when we're not building gfortran?
Doug Gregor <[EMAIL PROTECTED]> writes: > The configure changes on the trunk require GMP 4.1+ and MPFR 2.2+. If > I understand things correctly, these libraries are only needed for > gfortran. Would it be possible to disable the checks for GMP and MPFR > when building with --enable-languages=[something not including > fortran] ? They are now required to build the generic middle-end. They are no longer only required for Fortran. Ian
Re: Encouraging indirect addressing mode
[EMAIL PROTECTED] writes:

> Here I've used a macro to keep track of the farthest place reached in
> the code. As you can see, I've even tried to set it up in such a way
> that it will use a register to access the value. However, I don't get
> that result, as I guess that is optimized out. Instead each
> comparison uses the full address of the variable, creating two more
> words for the read and for the write. I'd prefer a sequence to read
> something like:
>
>	movel #main_line,%a0	/* only once, at the start of the function */
>	moveq #(LINE-1),%d0
>	cmpl  %a0@,%d0
>	blt   skip
>	moveb #LINE,%d0
>	movel %d0,%a0@
> skip:	...
>
> I haven't seen any options that encourage more use of indirect
> addressing. Are there any that I have missed? If not, I assume I will
> need to work with the machine description.

The first thing to try is to use TARGET_ADDRESS_COST to make the cost of register indirect smaller than the cost of an absolute address reference, at least when optimize_size is true. m68k.c doesn't seem to have a definition of TARGET_ADDRESS_COST, so you will have to add one. You should also take a look at TARGET_RTX_COSTS.

Ian
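A concrete shape for that hook; this is a sketch only, with illustrative cost values rather than real m68k tuning:

	/* Make plain register indirect the cheapest addressing form,
	   and penalize absolute addresses (which cost extra extension
	   words) when optimizing for size.  Costs are illustrative.  */
	static int
	m68k_address_cost (rtx x)
	{
	  if (REG_P (x))			/* %a0@ */
	    return 1;
	  if (GET_CODE (x) == PLUS)		/* %a0@(d16) */
	    return 2;
	  if (GET_CODE (x) == SYMBOL_REF
	      || GET_CODE (x) == CONST
	      || GET_CODE (x) == CONST_INT)	/* absolute */
	    return optimize_size ? 4 : 2;
	  return 3;
	}

	#undef TARGET_ADDRESS_COST
	#define TARGET_ADDRESS_COST m68k_address_cost

With such a definition in place, passes that compare addressing-mode costs will see absolute references as more expensive and will be more inclined to keep the address in an address register when that wins.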
Re: compiling very large functions.
Kenneth Zadeck wrote:

> The problem with trying to solve this problem on a per-pass basis
> rather than coming up with an integrated solution is that we are
> completely leaving the user out of the thought process. There are
> some users who have big machines or a lot of time on their hands and
> want "damn the torpedoes, full speed ahead", and there are some users
> who want reasonable decisions made even at high optimization. We need
> to give them an easy-to-turn knob.

Is there a need for any fine-grained control on this knob, though, or would it be sufficient to add an -O4 option that's equivalent to -O3 but with no optimization throttling?

- Brooks
Re: compiling very large functions.
Brooks Moses wrote on 11/06/06 17:41:

> Is there a need for any fine-grained control on this knob, though, or
> would it be sufficient to add an -O4 option that's equivalent to -O3
> but with no optimization throttling?

We need to distinguish two orthogonal issues here: effort and enabled transformations. Currently, -O3 means enabling transformations that (a) may not result in an optimization improvement, and (b) may change the semantics of the program. -O3 will also enable "maximum effort" out of every transformation.

In terms of effort, we currently have individual knobs in the form of -f and/or --params settings. It should not be hard to introduce a global -Oeffort=xxx parameter, but it will take some tweaking to coordinate which -f/--params/-m switches that should enable.
Compiling gcc 3.2.3, AMD, x86_64,
I'm trying to compile gcc v3.2.3 and I'm getting through most of it, but make stops with the following error:

/bin/sh: ./../../../configure: No such file or directory
configure: error: ./../../../configure failed for libU77

If I could get some help troubleshooting this problem, I'd be grateful. Please look at my wiki notes at http://hadron.physics.fsu.edu/wiki/index.php/CentOS_via_Rocks_installation_does_not_install_gcc32#Compile_Attempt_.232

PhilipC
Re: Compiling gcc 3.2.3, AMD, x86_64,
On Nov 6, 2006, at 5:25 PM, Philip Coltharp wrote:

> I'm trying to compile gcc v3.2.3 and I'm getting through most of it,
> but make stops with the following error:
>
> /bin/sh: ./../../../configure: No such file or directory

I suspect the answer is: don't do

  ../configure

instead, do

  ../gcc/configure

This would be a FAQ if my guess is correct. gcc-help is more appropriate for user questions like this.
Re: Compiling gcc 3.2.3, AMD, x86_64,
On Nov 6, 2006, at 6:57 PM, Mike Stump wrote:

> I suspect the answer is: don't do
>
>   ../configure
>
> instead, do
>
>   ../gcc/configure

Ah, I read through more of your wiki; I guessed wrong. You didn't give all the commands you issued, and I still suspect the same issue. Try this:

  untar gcc
  mkdir obj
  cd obj
  ../gcc/configure [...]
  make

or annotate your wiki with _all_ the commands you did. Also, start off in a new directory; no make clean, it just isn't worth it.
Re: build failure, GMP not available
On Tue, 31 Oct 2006, Ian Lance Taylor wrote:

> "Kaveh R. GHAZI" <[EMAIL PROTECTED]> writes:
>
>> Should that message refer to this:
>> ftp://gcc.gnu.org/pub/gcc/infrastructure/
>>
>> or this:
>> ftp://ftp.gnu.org/gnu/gmp/
>> http://www.mpfr.org/mpfr-current/
>>
>> or this (perhaps with more details):
>> http://gcc.gnu.org/install/prerequisites.html
>
> The first, I think.
>
>> I prefer the latter one, to avoid duplicating the info in more than
>> one place. If prerequisites needs more info, I'll fill it in there.
>
> I think the primary goal should be making it as simple and obvious as
> possible for people to build gcc. If that can be done without
> duplicating information, that is good. But the primary goal should be
> making it very very easy to build gcc.
>
> If we encounter the problem whose solution is "download mpfr from
> gcc.gnu.org and untar it," then I don't think it helps to point people
> to the long list at http://gcc.gnu.org/install/prerequisites.html,
> which is irrelevant for most people.
>
> Ian

I ended up including both your preference and mine. Hopefully one or the other (or both) ends up being useful to users. Tested on sparc-sun-solaris2.10 via configure, with and without specifying the gmp/mpfr location, to see the error message and to pass it.

Okay for mainline?

--Kaveh

2006-11-06  Kaveh R. Ghazi  <[EMAIL PROTECTED]>

	* configure.in: Robustify error message for missing GMP/MPFR.

	* configure: Regenerate.

diff -rup orig/egcc-SVN20061105/configure.in egcc-SVN20061105/configure.in
--- orig/egcc-SVN20061105/configure.in	2006-10-21 10:02:13.0 -0400
+++ egcc-SVN20061105/configure.in	2006-11-06 22:28:49.178608073 -0500
@@ -1118,7 +1118,11 @@ fi
 CFLAGS="$saved_CFLAGS"
 
 if test x$have_gmp != xyes; then
-  AC_MSG_ERROR([Building GCC requires GMP 4.1+ and MPFR 2.2+. Try the --with-gmp and/or --with-mpfr options.])
+  AC_MSG_ERROR([Building GCC requires GMP 4.1+ and MPFR 2.2+.
+Try the --with-gmp and/or --with-mpfr options to specify their locations.
+Copies of these libraries' source code can be found at their respective
+hosting sites as well as at ftp://gcc.gnu.org/pub/gcc/infrastructure/.
+See also http://gcc.gnu.org/install/prerequisites.html for additional info.])
 fi
 
 # Flags needed for both GMP and/or MPFR
Problem with listing i686-apple-darwin as a Primary Platform
Hi,

Right now, patches by the Apple folks mean that you need a newer dwarfutils, which doesn't exist outside of Apple, so the free software and GCC community is not helped by making Darwin a primary platform. Maybe we should list a specific version of Darwin, which would remove the confusion about which Darwin (Mac OS X) version is supposed to be able to compile GCC right out of the box. Right now on the PowerPC side, Darwin before 8 (so Panther and before) is broken bootstrapping the mainline and is also broken on the 4.2 branch.

We should request that the Apple folks at least support the latest release version of their OS and the previous version, or we should remove Darwin from being a primary/secondary platform until they are able to.

Thanks,
Andrew Pinski
Re: build failure, GMP not available
> I ended up including both your preference and mine. Hopefully one or
> the other (or both) ends up being useful to users.

Thanks, this will help with some of the questions I received internally today.

-eric
Re: Problem with listing i686-apple-darwin as a Primary Platform
> Right now, patches by the Apple folks mean that you need a newer
> dwarfutils, which doesn't exist outside of Apple, so the free software
> and GCC community is not helped by making Darwin a primary platform.
> Maybe we should list a specific version of Darwin, which would remove
> the confusion about which Darwin (Mac OS X) version is supposed to be
> able to compile GCC right out of the box. Right now on the PowerPC
> side, Darwin before 8 (so Panther and before) is broken bootstrapping
> the mainline and is also broken on the 4.2 branch.

We're in stage1; breakages happen - see the current fun with gmp/mpfr as well as c99 inlining. File a bug or bring a problem up for discussion.

As far as 4.2, this is the first I've heard of it. What's the problem?

-eric
Re: Problem with listing i686-apple-darwin as a Primary Platform
On Mon, 2006-11-06 at 20:57 -0800, Eric Christopher wrote: > As far as 4.2 this is the first I've heard of it. What's the problem? Well you need a new cctools which does not exist for 10.2. Thanks, Andrew Pinski
Re: Problem with listing i686-apple-darwin as a Primary Platform
On Mon, 2006-11-06 at 20:57 -0800, Eric Christopher wrote:

> We're in stage1; breakages happen - see the current fun with gmp/mpfr
> as well as c99 inlining. File a bug or bring a problem up for
> discussion.

Except this is a different issue, as the patch is for Darwin:

http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00168.html

If the patch was not specific to Darwin, I would not have a problem with keeping Darwin as a primary platform. This is also one reason why I suggested a specific version of Darwin that is required. Oh, and 10.0, 10.1 and 10.2 compiling with GCC are all broken (so is 10.3).

Thanks,
Andrew Pinski
Re: Problem with listing i686-apple-darwin as a Primary Platform
On Nov 6, 2006, at 8:59 PM, Andrew Pinski wrote:

>> As far as 4.2, this is the first I've heard of it. What's the problem?
>
> Well you need a new cctools which does not exist for 10.2.

While I'm sure you could be less specific, would you be more specific in this case? And, no offense, but 10.2 is quite old; I'm not sure that Apple is even supporting it these days, but I'll leave that to the people that know better. Perhaps odcctools would work for you?

-eric
Re: Problem with listing i686-apple-darwin as a Primary Platform
> Except this is a different issue, as the patch is for Darwin.
> http://gcc.gnu.org/ml/gcc-patches/2006-11/msg00168.html

Geoff appears to have given a workaround for the problem and has promised to inquire further about more up-to-date solutions. Another solution, of course, is to revert the default back to stabs. Or use Shantonu's workaround. Personally, I agree with Daniel's complaint on the issue, and we may need to temporarily revert that single patch until newer dwarf utilities can be distributed.

> This is also one reason why I suggested a specific version of Darwin
> that is required.

That may not be a bad idea.

> Oh, and 10.0, 10.1 and 10.2 compiling with GCC are all broken (so is
> 10.3).

I'd probably suggest at least 10.3.9 myself. I'm not sure, since 10.3.x predates my employment at Apple, what the current policy is regarding it.

-eric
Re: Problem with listing i686-apple-darwin as a Primary Platform
I would be more worried about the second issue if gcc 4.2 was remotely close to release. However, at the rate regressions are being fixed (or not) in the gcc 4.2 branch, I wouldn't hold my breath as to which is released first (gcc 4.2 or Leopard). Once Leopard is released, Darwin8 will become the 'previous' release and the problem on Darwin PPC will go away.

Jack

On Mon, Nov 06, 2006 at 07:49:59PM -0800, Andrew Pinski wrote:
> Hi,
> Right now, patches by the Apple folks mean that you need a newer
> dwarfutils, which doesn't exist outside of Apple, so the free software
> and GCC community is not helped by making Darwin a primary platform.
> Maybe we should list a specific version of Darwin, which would remove
> the confusion about which Darwin (Mac OS X) version is supposed to be
> able to compile GCC right out of the box. Right now on the PowerPC
> side, Darwin before 8 (so Panther and before) is broken bootstrapping
> the mainline and is also broken on the 4.2 branch.
>
> We should request that the Apple folks at least support the latest
> release version of their OS and the previous version, or we should
> remove Darwin from being a primary/secondary platform until they are
> able to.
>
> Thanks,
> Andrew Pinski
Re: Problem with listing i686-apple-darwin as a Primary Platform
On Nov 6, 2006, at 9:10 PM, Eric Christopher wrote:

>> Oh, and 10.0, 10.1 and 10.2 compiling with GCC are all broken (so is
>> 10.3).
>
> I'd probably suggest at least 10.3.9 myself.

My take: 10.2 and on should work. I think it is wrong to put things into darwin.[ch] that don't work on earlier systems. And I don't think that on 10.2 dwarf should be the default, as the gdb on that system isn't going to work. I think that for 10.5, the default being dwarf would be fine. For 10.4, I tend to think it won't work, so there is no point in even trying. Geoff?
Re: Abt long long support
Thanks for the reply.

My target (a non-gcc/private one) fails for long long testcases, and there are cases (with long long) which get through, but not with the right output. When I replace long long with long, the testcases run fine, even those giving wrong output. The target is not able to compile properly even simple statements like

  long long a = 10;

So when I looked into the .md file I saw no patterns with the DI machine mode, used for long long (am I right?), except

  define_insn "adddi3" and define_insn "subdi3"

The .md file says that this is to prevent gcc from synthesising it, though I didn't understand what that means. That's when I started to doubt whether the backend provides support for long long. But if what Rask is saying is true, which I guess it has to be since you guys are saying so, then the middle end should take care of synthesizing long long. The 32-bit target has this defined in the .h file:

  LONG_TYPE_SIZE 32
  LONG_LONG_TYPE_SIZE 64

Is there anything else that I should provide in the back end to make sure that the rest of gcc is synthesizing long long properly? Any thoughts?

On 11/7/06, Rask Ingemann Lambertsen <[EMAIL PROTECTED]> wrote:

> On Mon, Nov 06, 2006 at 10:52:00AM +0530, Mohamed Shafi wrote:
>
>> Looking at a .md file of a backend, is there a way to know whether a
>> target supports long long? Should I look for patterns with machine
>> mode DI?
>
> No. For example, 8-bit, 16-bit and 32-bit targets should normally not
> define patterns such as anddi3, iordi3 and xordi3. It is possible
> that a target could have no patterns with mode DI but still support
> long long, although probably with significant slowdown. E.g. the
> middle end can synthesize adddi3 and subdi3 from SImode operations,
> but I think most targets can easily improve 10x in terms of speed and
> size on that code.
>
> Watch out for targets where units are larger than 8 bits. An example
> is the c4x, where a unit is 32 bits and HImode is 64 bits.
>
>> Is there some other way?
>
> This depends a lot on exactly what you mean when you say support, but
> grep for LONG_TYPE_SIZE and LONG_LONG_TYPE_SIZE in the .h file and
> compare the two.
Re: Abt long long support
"Mohamed Shafi" <[EMAIL PROTECTED]> writes: > So when i looked into the .md file i saw no patterns with DI machine > mode ,used for long long(am i right?), execpt > > define_insn "adddi3" and define_insn "subdi3" > > The .md file says that this is to prevent gcc from synthesising it, > though i didnt understand what that means. If there is no adddi3 pattern, then gcc will handle 64-bit addition by doing 32-bit additions, and similarly for 64-bit subtraction. Putting an adddi3 pattern in the MD file lets the backend specify how to do 64-bit addition. The backend can do it more efficiently than the generic approach if the target provides an add-with-carry instruction. I don't see how we can say anything else useful since we don't know anything about how your examples are failing to compile. Ian
Re: Abt long long support
On Nov 6, 2006, at 9:30 PM, Mohamed Shafi wrote:

> My target (a non-gcc/private one) fails for long long testcases

Does it work flawlessly otherwise? If not, fix all those problems first. After those are all fixed, then you can see if it then just works. In particular, you will want to ensure that 32-bit things work fine first.

> So when I looked into the .md file I saw no patterns with the DI
> machine mode, used for long long (am I right?), except
>
> define_insn "adddi3" and define_insn "subdi3"
>
> The .md file says that this is to prevent gcc from synthesising it,
> though I didn't understand what that means.

Does this mean that in your md file you define adddi3 and subdi3? And are the definitions right or wrong? If wrong, why not fix them? I suspect they are wrong. For example, if they expand to "", that is certainly wrong.

> Any thoughts?

Yes: having us guess why your port doesn't work isn't productive for us. If you want to enable us to know why it doesn't work and help you, you can put the entire port up for easy access somewhere.

To do a port, you have to be able to synthesize testcases, run them through the compiler, read the output of the compiler, understand the target assembly language to know why it is wrong, and then map that back to the target files. How many of these steps were you able to do for this case? For the last step, reading and understanding the gcc porting manual is useful, as is studying ports that are similar to yours.

You can help us answer your questions by including all the relevant details. Things that are relevant include the testcase, the output of -da and -fdump-tree-all, the generated .s file, and the relevant portions of the .md file.
Abt RTL expression - combining instruction
Hi all,

I am trying to combine the compare and branch instructions, but my instructions are not getting generated, as my operands are not matched properly.

Previously, for individual compare instructions, I had:
  operand 0 - register operand
  operand 1 - non-memory operand

For the branch instruction:
  operator 0 - compare operator
  operand 1 - label

So when I combined compare and branch, I just superimposed both statements with the same conditions, with:
  operand 0 - register operand
  operand 1 - non-memory operand
  operator 2 - comparison operator
  operand 3 - label

1. Is this the right way to match operands and operators while combining instructions?
2. How do I check where my instruction matching goes wrong?

regards
Rohit

---

Thanks for finally giving the complete program and the RTL. I think you said earlier that this is a private target, not a standard gcc target. On that basis, I would say that the problem appears to be that you have a cc0 machine of a type which gcc does not handle naturally.

If your comparison instructions set a fixed hard register, and simple register-to-register moves clobber that hard register, then you must handle comparison instructions specially before reload. You must emit a combined compare_and_branch instruction, which does the comparison and the branch in a single insn. Then you may write a define_split which runs after reload and splits the instruction back into its components for the second scheduling pass.

You've encountered a somewhat subtle way in which gcc fails if you don't do this. There is a much more obvious way it will fail: reload will wind up clobbering the condition register every time it has to load or store a register.

Writing the patterns to avoid using the fixed condition register before reload is tedious but typically not difficult. Writing the define_splits which run after reload is typically more complicated. Note that in general the define_splits are only useful if scheduling helps your processor. The C4X is an example of a standard target which has some of these issues.

(In fairness I should say that there is an alternative, which is to use (cc0) for the condition register. But I do not recommend this, as support for (cc0) may be removed from the compiler at some point in the future.)

Ian
Re: Abt RTL expression - combining instruction
"Rohit Arul Raj" <[EMAIL PROTECTED]> writes: > I am trying to combine the compare and branch instruction. But my > instructions are not getting generated as my operands are not matched > properly. > > Previously for individual compare instructions, i had > operand 0 - Register operand > operand 1 - Non memory operand. > > For branch instruction, > operator 0 - compare operator > operand 1 - label. > > So when i combined compare and branch, i just superimposed both > statements with same conditions with > operand 0 - Register operand > operand 1 - Non memory operand > operator 2 - comparison operator > operand 3 - label. > > 1. Is it the right way to match operands and operators while combining > instruction? > 2. How to check where my instruction matching goes wrong? When writing an MD file you have to think about what you generate (insns/expanders with standard names) and what you recognize (insns). You have to always generate something which you can recognize. Anyhow, the easier way to generate a combined compare and branch instruction is to use an insn or expander with the cbranchMM4 standard name. See the documentation. Ian
Re: Problem with listing i686-apple-darwin as a Primary Platform
Jack Howarth wrote:

> I would be more worried about the second issue if gcc 4.2 was
> remotely close to release. However, at the rate regressions are being
> fixed (or not) in the gcc 4.2 branch, I wouldn't hold my breath as to
> which is released first (gcc 4.2 or Leopard). Once Leopard is
> released, Darwin8 will become the 'previous' release and the problem
> on Darwin PPC will go away.

All this would be fine, were it not that (correct me if I'm wrong; I'd love to be corrected) you have to pay to get the latest Mac OS. That may be a very good reason why some people are stuck with 10.3.9, for example. If I had to do it with my own money, I would not have spent 120 euros for Tiger.

Paolo