sjlj exceptions?
Hi, when porting gcc (still 3.4.4), how do I exactly know whether I need to pass --enable-sjlj-exceptions to configure? Is there a test case which fails if I need it and have it not enabled, and passes otherwise (disabled and not needed, or enabled)? TIA, //mirabile -- > Hi, does anyone sell openbsd stickers by themselves and not packaged > with other products? No, the only way I've seen them sold is for $40 with a free OpenBSD CD. -- Haroon Khalid and Steve Shockley in gmane.os.openbsd.misc
Re[2]: object code execution statistics
Hello James, Tuesday, April 26, 2005, 6:36:56 AM, you wrote: JEW> Sergei Tovpeko wrote: >> Is there any util that would produce result containing the asm code >> execution statistics ??? JEW> I assume you want assembly instruction execution counts. You could JEW> produce this info from gcov with a bit of work, as gcov already gives JEW> you execution counts for basic blocks. You just need to disassemble the JEW> code, apply the counts to the assembly instructions, and then print JEW> whatever statistics you want. Probably not easy, but feasible. JEW> Otherwise, there is nothing in gcc that will help you here. Does that mean gcov has much more data (in its binary format) than it processes and presents to the user in human-readable form? -- Best regards, Sergei mailto:[EMAIL PROTECTED]
Re: Side-effect latency in DFA scheduler
Jon, > How is the latency of instructions that have side effects modeled in the DFA > scheduler? For example, define_insn_reservation only has one latency value, > yet instructions such as loads with post increment addressing have two > outputs, possibly with different latencies. Do both outputs get the same > latency? You should set the latency to the larger of those two values. You can then insert bypasses for the shorter one. Look at the arm schedulers, which have instances of that going on. nathan -- Nathan Sidwell:: http://www.codesourcery.com :: CodeSourcery LLC [EMAIL PROTECTED]:: http://www.planetfall.pwp.blueyonder.co.uk
Re: Store scheduling with DFA scheduler
Jon, > (define_insn_reservation "arith" 1 (eq_attr "type" "arith") "x") > (define_insn_reservation "loads" 2 (eq_attr "type" "load") "x,m") > (define_insn_reservation "stores" 3 (eq_attr "type" "store") "x,m*2") Stores don't really have a 'result', why have you set the cycle count to 3? Shouldn't it be '1'? (then you won't need store bypasses for autoincrements) nathan -- Nathan Sidwell:: http://www.codesourcery.com :: CodeSourcery LLC [EMAIL PROTECTED]:: http://www.planetfall.pwp.blueyonder.co.uk
Re: Regression involving COMMON(?)
Andrew, You were right: > I think this is caused by: > 2005-04-25 Nathan Sidwell <[EMAIL PROTECTED]> > * tree-ssa-alias.c (fieldoff_t): Remove. > (fieldoff_s): typedef the structure itself. Create a vector of objects. > (push_fields_onto_fieldstack): Return count of fields pushed. Remove peeling of first field. Adjust. > (fieldoff_compare): Adjust. > (create_overlap_variables_for): Adjust. > by the way Paul, what is the error? For each of the FAILs: internal compiler error: in push_fields_onto_fieldstack, at tree-ssa-alias.c:2834 Paul T
Re: Regression involving COMMON(?)
Paul Thomas wrote: > Andrew, > You were right: >> I think this is caused by: >> 2005-04-25 Nathan Sidwell <[EMAIL PROTECTED]> >> * tree-ssa-alias.c (fieldoff_t): Remove. >> (fieldoff_s): typedef the structure itself. Create a vector of objects. >> (push_fields_onto_fieldstack): Return count of fields pushed. Remove peeling of first field. Adjust. >> (fieldoff_compare): Adjust. >> (create_overlap_variables_for): Adjust. >> by the way Paul, what is the error? > For every each one of the FAILs - internal compiler error: in > push_fields_onto_fieldstack, at tree-ssa-alias.c:2834 I am looking at it. -- Nathan Sidwell:: http://www.codesourcery.com :: CodeSourcery LLC [EMAIL PROTECTED]:: http://www.planetfall.pwp.blueyonder.co.uk
Free-Standing Implementation
Hello Everyone, I want to know what is to be expected out of a 'Free-Standing' implementation of gcc, glibc and newlib that conforms to the C89 standard. We have gcc ported to a new custom processor and the porting company says it is a free-standing version. So, what all can I expect out of it and the libraries? I've tried searching on Google, but couldn't find enough explanation. Any help is appreciated. Thanks and Regards, Sriharsha. -- Sriharsha Vedurmudi Software Engineer Redpine Signals Inc. Gate No 395, Sagar Society, Road No 2, Banjara Hills, Hyderabad - 500 034 Andhra Pradesh India Phone: +91-40-23559911 Website: http://www.redpinesignals.com
RE: Store scheduling with DFA scheduler
> Jon, > > (define_insn_reservation "arith" 1 (eq_attr "type" "arith") "x") > > (define_insn_reservation "loads" 2 (eq_attr "type" "load") "x,m") > > (define_insn_reservation "stores" 3 (eq_attr "type" > "store") "x,m*2") > > Stores don't really have a 'result', why have you set the > cycle count to 3? Shouldn't it be '1'? (then you won't need > store bypasses for autoincrements) Primarily because that's how it appears to be coded in the ARM port (e.g. store_wbuf in arm-generic.md). I had tried both ways though, and for this particular problem, changing this value appears to have no effect. I can see that it would for autoinc though. Cheers, Jon
RE: Free-Standing Implementation
Original Message >From: Sriharsha Vedurmudi >Sent: 26 April 2005 12:37 > Hello Everyone, > > I want to know what is to be expected out of a 'Free-Standard' > implementation of gcc, glibc and newlib that confirms to C89 standard. > We have gcc ported to a new custom processor and the porting company > says it is a free-standing version. So, what all can I expect out of it > and the libraries? I've tried searching in google, but couldnt get > enough explanation. > Any help is appreciated. > > Thanks and Regards, > Sriharsha. Try "info gcc" and search for "freestanding" in that. Then read the C language spec for the full definition of what "freestanding" implies. cheers, DaveK -- Can't think of a witty .sigline today
Re: Store scheduling with DFA scheduler
Jon Beniston wrote: >> Jon, >> (define_insn_reservation "arith" 1 (eq_attr "type" "arith") "x") >> (define_insn_reservation "loads" 2 (eq_attr "type" "load") "x,m") >> (define_insn_reservation "stores" 3 (eq_attr "type" "store") "x,m*2") >> Stores don't really have a 'result', why have you set the cycle count to 3? >> Shouldn't it be '1'? (then you won't need store bypasses for autoincrements) > Primarily because that's how it appears to be coded in the ARM port (e.g. > store_wbuf in arm-generic.md). I had tried both ways though, and for this > particular problem, changing this value appears to have no effect. I can see > that it would for autoinc though. ah, the arm1026ejs and arm1136jfs don't do that. nathan -- Nathan Sidwell:: http://www.codesourcery.com :: CodeSourcery LLC [EMAIL PROTECTED]:: http://www.planetfall.pwp.blueyonder.co.uk
Build report for AIX 5.1
Hi, I just built GCC 4.0.0 on AIX 5.1 using the following commands:

../gcc-4.0.0/configure --with-libiconv-prefix=/usr --disable-nls --disable-multilib
make bootstrap-lean
make install

$ config.guess
powerpc-ibm-aix5.1.0.0
$ gcc -v
Using built-in specs.
Target: powerpc-ibm-aix5.1.0.0
Configured with: /home/linke/temp/gcc-4.0.0/configure --with-libiconv-prefix=/usr --disable-nls --disable-multilib
Thread model: aix
gcc version 4.0.0

The system is an IBM pSeries M80 with AIX 5.1 at the latest patch level. The build compiler is gcc 3.4.3. Make is GNU make 3.80. The --disable-xxx configure options shouldn't be necessary; I used them to save build time and space. The whole build took less than two hours. Mario Linke
RE: Store scheduling with DFA scheduler
On Tue, 2005-04-26 at 12:52, Jon Beniston wrote: > > Jon, > > > (define_insn_reservation "arith" 1 (eq_attr "type" "arith") "x") > > > (define_insn_reservation "loads" 2 (eq_attr "type" "load") "x,m") > > > (define_insn_reservation "stores" 3 (eq_attr "type" > > "store") "x,m*2") > > > > Stores don't really have a 'result', why have you set the > > cycle count to 3? Shouldn't it be '1'? (then you won't need > > store bypasses for autoincrements) > > Primilary because that's how it appears to be coded in the ARM port (e.g > store_wbuf in arm-generic.md). I had tried both ways though, and for this > particular problem, changing this value appears to have no effect. I can see > that it would for autoinc though. The store_wbuf code always was a little suspect, even in the days before the DFA scheduler. These days it's only used for cores that have no other scheduling constraints.
EABI stack alignment for ppc
Hello, PPC EABI targets are currently configured with both BIGGEST_ALIGNMENT and PREFERRED_STACK_BOUNDARY set to 128, I believe to accommodate "a long double member within a structure or union shall start at the lowest available offset aligned on a 16byte boundary". Besides, for 32-bit non-AltiVec targets, we have 64 for STACK_BOUNDARY. There is code in expand_main_function to force the stack pointer alignment to PREFERRED_STACK_BOUNDARY in such situations, not triggered for C-like languages on those targets because FORCE_PREFERRED_STACK_BOUNDARY_IN_MAIN is not defined (maybe it should be, btw). We still apply this dynamic re-alignment to every subprogram with foreign convention in Ada, because alignment requests must always be obeyed unless they are rejected. It turns out that the current re-alignment code doesn't work for PPC. It first 'rounds' the stack pointer in place, and then resorts to allocate_dynamic_stack_space to "pick up the pieces", as the comment says. The latter triggers the 'allocate_stack' expander, which copies the back chain from the current stack pointer, which in turn retrieves garbage when the initial 'rounding' has had a real effect. One way of addressing that would be to adjust the re-alignment code so that it does not touch the stack pointer before calling allocate_dynamic_stack_space. Now, I'm a bit unclear on the meaning of the ABI statement quoted above, and on the real implications this should have in the compiler. Does it imply that a long double field *address* should always be a multiple of 16, or just that the *offset* should be such a multiple? In the latter case, is bumping BIGGEST_ALIGNMENT and PREFERRED_STACK_BOUNDARY the only option? Other thoughts? Thanks in advance, Olivier
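The in-place 'rounding' step described in this message amounts to clearing the low bits of the stack pointer. A minimal C sketch of that arithmetic follows, assuming a downward-growing stack and a boundary given in bits, as PREFERRED_STACK_BOUNDARY is. The helper name round_sp_down is made up for illustration and is not GCC code.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GCC: round a stack pointer value down
   to a boundary expressed in bits (e.g. 128 bits -> 16 bytes).
   boundary_bits is assumed to be a power of two and at least 8. */
static uintptr_t round_sp_down(uintptr_t sp, unsigned boundary_bits)
{
    uintptr_t bytes = boundary_bits / 8u;   /* 128 bits -> 16 bytes */
    return sp & ~(bytes - 1u);              /* clear the low bits */
}
```

With an incoming value that is only 8-byte aligned (a STACK_BOUNDARY of 64), round_sp_down(0x1008, 128) moves the pointer by 8 bytes. That is the situation in which the rounding "has had a real effect" and a back chain read through the new pointer no longer finds the old one.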
Re: gcc 4.0.0 optimization vs. id strings (RCS, SCCS, etc.)
On Mon, Apr 25, 2005 at 05:52:33PM -0700, Zack Weinberg wrote: > Bruce Lilly <[EMAIL PROTECTED]> writes: > > Earlier versions of gcc retain static character strings in object > > files which can be used for identification via ident (RCS) or what > > (SCCS). Gcc 4.0.0 removes them above optimization level 1. > > The first observation I'd like to make is that we (the GCC developers, > collectively) don't really understand the need for this. We don't put > $Id$ strings in our own source code, never mind the object files. $Id$ strings in object files are useful when you goof and forget or neglect to tag your code base immediately before building and releasing. If you can't tell with certainty exactly which revision of each source file went into your build, other people (typically nontechnical managers) almost inevitably start distracting from the real issue (poor release management) with spurious concerns over your use of "freeware" like CVS, SDCC and GCC instead of "professional" packages like MS Visual SourceSafe. Having been in this very situation a few weeks ago (where a development snapshot suddenly got retroactively promoted to "release" a month after), I can say that it could have made life easier if I had been able to extract the $Id$ strings from the executable. Think of it as a safety net in case you or someone else goofs and forgets to do 'cvs tag todays-release'. Doesn't cover uncommitted changes, though... > The attitude is that the version number of the program as a whole > suffices to figure out which release branch to go looking for the bug > in. That's because GCC, AFAICT, has a pretty good release process.
Re: Store scheduling with DFA scheduler
Jon Beniston wrote: > Hi, > I'm trying to get the DFA scheduler in GCC 4.0.0 to schedule loads and > stores, but I can only get it to work for loads. I have an automaton > defined as follows: > (define_automaton "cpu") > (define_cpu_unit "x" "cpu") > (define_cpu_unit "m" "cpu") > (define_insn_reservation "arith" 1 (eq_attr "type" "arith") "x") > (define_insn_reservation "loads" 2 (eq_attr "type" "load") "x,m") > (define_insn_reservation "stores" 3 (eq_attr "type" "store") "x,m*2") > All instructions take one cycle in "x". Loads then take one "m" cycle, > while stores take two "m" cycles. Basically stores aren't fully pipelined. There is not enough information to say what is wrong. It would be better if you sent the gcc output produced with -fsched-verbose=10. Vlad
RE: EABI stack alignment for ppc
Original Message >From: Olivier Hainque >Sent: 26 April 2005 14:25 > "a long double member within a structure or union shall start at the >lowest available offset aligned on a 16byte boundary" > > Now, I'm a bit unclear on the meaning of the ABI statement quoted above, > and on the real implications this should have in the compiler. > > Does it imply that a long double field *address* should always be a > multiple of 16, or just that the *offset* should be such a multiple ? It only implies that the offset should be such a multiple, but since the struct itself will have to be aligned to a multiple of 16 if any of its members have to be aligned to a multiple of 16 (at least according to the C language rules), it works out the same: the base address is aligned, the offset is aligned because of the paragraph above, so the actual member address (offset + base) is also aligned. cheers, DaveK -- Can't think of a witty .sigline today
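The "base aligned plus offset aligned gives member aligned" argument can be checked with a small C11 sketch. It assumes the ordinary C rules Dave refers to, not the EABI wording, and it uses _Alignas(16) on a double as a stand-in because the natural alignment of long double differs between targets. The names sample and wide_is_aligned are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a member the ABI wants on a 16-byte boundary; _Alignas is
   used because long double's own alignment is target dependent. */
struct sample {
    char tag;
    _Alignas(16) double wide;
};

/* Returns 1 when the member's address is a multiple of 16. */
static int wide_is_aligned(const struct sample *s)
{
    return ((uintptr_t)&s->wide % 16u) == 0;
}
```

Under these rules the compiler raises the alignment of the whole struct to 16, so offsetof(struct sample, wide) is a multiple of 16 and so is the address of any object of that type. An ABI that only aligns the aggregate to 8 bytes breaks the first half of that argument, which is the case the original question is about.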
Re: Store scheduling with DFA scheduler
> Nathan Sidwell writes: >> (define_insn_reservation "arith" 1 (eq_attr "type" "arith") "x") >> (define_insn_reservation "loads" 2 (eq_attr "type" "load") "x,m") >> (define_insn_reservation "stores" 3 (eq_attr "type" "store") "x,m*2") Nathan> Stores don't really have a 'result', why have you set the cycle Nathan> count to 3? Shouldn't it be '1'? (then you won't need store bypasses Nathan> for autoincrements) Stores do have results: memory. If one does not have a store bypass in the processor, one needs to model the delay for the result to appear in the cache and be available for a subsequent load. David
Re: Store scheduling with DFA scheduler
David Edelsohn wrote: >> Nathan Sidwell writes: >>> (define_insn_reservation "arith" 1 (eq_attr "type" "arith") "x") >>> (define_insn_reservation "loads" 2 (eq_attr "type" "load") "x,m") >>> (define_insn_reservation "stores" 3 (eq_attr "type" "store") "x,m*2") >> Nathan> Stores don't really have a 'result', why have you set the cycle >> Nathan> count to 3? Shouldn't it be '1'? (then you won't need store bypasses >> Nathan> for autoincrements) > Stores do have results: memory. If one does not have a store bypass in > the processor, one needs to model the delay for the result to appear in the > cache and be available for a subsequent load. Is that modelled by that bit of the scheduler? I thought the cycle count in define_insn_reservation was for a register output that would be available to a subsequent instruction. The RAW contention you describe would need to be modelled separately (as I've done in at least one port), because you could start the load insn before the store had completed -- provided the load was processed in the memory unit after the store has sufficiently completed. An absence_set would be the way to model stalling a load whilst a store was being processed. nathan -- Nathan Sidwell:: http://www.codesourcery.com :: CodeSourcery LLC [EMAIL PROTECTED]:: http://www.planetfall.pwp.blueyonder.co.uk
RE: Store scheduling with DFA scheduler
Hi Vlad,

> There is not enough information to say what is wrong. It
> would be better if you send gcc output when
> -fsched-verbose=10 is used.

Cheers, Jon

;; ==
;; -- basic block 0 from 18 to 32 -- before reload
;; ==
;; --- forward dependences:
;; --- Region Dependences --- b 0 bb 0
;; insn  code  bb  dep  prio  cost  reservation
;; --  ---  ---
;;   18    10   0    0     6     2  x,m    : 20 19
;;   19    92   0    1     4     1  x      : 20
;;   20    10   0    2     3     3  x,m*2  :
;;   22    10   0    0     6     2  x,m    : 24 23
;;   23    92   0    1     4     1  x      : 24
;;   24    10   0    2     3     3  x,m*2  :
;;   26    10   0    0     6     2  x,m    : 28 27
;;   27    92   0    1     4     1  x      : 28
;;   28    10   0    2     3     3  x,m*2  :
;;   30    10   0    0     6     2  x,m    : 32 31
;;   31    92   0    1     4     1  x      : 32
;;   32    10   0    2     3     3  x,m*2  :
;; Ready list after queue_to_ready: 30 26 22 18
;; Ready list after ready_sort: 30 26 22 18
;; Ready list (t = 0): 30 26 22 18
;; 0--> 18 r41=[`x'] :x,m
;; dependences resolved: insn 19 into queue with cost=2
;; Ready-->Q: insn 19: queued for 2 cycles.
;; Ready list (t = 0): 30 26 22
;; Ready list after queue_to_ready: 30 26 22
;; Ready list after ready_sort: 30 26 22
;; Ready list (t = 1): 30 26 22
;; 1--> 22 r43=[`y'] :x,m
;; dependences resolved: insn 23 into queue with cost=2
;; Ready-->Q: insn 23: queued for 2 cycles.
;; Ready list (t = 1): 30 26
;; Q-->Ready: insn 19: moving to ready without stalls
;; Ready list after queue_to_ready: 19 30 26
;; Ready list after ready_sort: 19 30 26
;; Ready list (t = 2): 19 30 26
;; 2--> 26 r45=[`z'] :x,m
;; dependences resolved: insn 27 into queue with cost=2
;; Ready-->Q: insn 27: queued for 2 cycles.
;; Ready list (t = 2): 19 30
;; Q-->Ready: insn 23: moving to ready without stalls
;; Ready list after queue_to_ready: 23 19 30
;; Ready list after ready_sort: 23 19 30
;; Ready list (t = 3): 23 19 30
;; 3--> 30 r47=[`w'] :x,m
;; dependences resolved: insn 31 into queue with cost=2
;; Ready-->Q: insn 31: queued for 2 cycles.
;; Ready list (t = 3): 23 19
;; Q-->Ready: insn 27: moving to ready without stalls
;; Ready list after queue_to_ready: 27 23 19
;; Ready list after ready_sort: 27 23 19
;; Ready list (t = 4): 27 23 19
;; 4--> 19 {r41=r41+0x1;clobber System, CC.A;}:x
;; dependences resolved: insn 20 into queue with cost=1
;; Ready-->Q: insn 20: queued for 1 cycles.
;; Ready list (t = 4): 27 23
;; Q-->Ready: insn 20: moving to ready without stalls
;; Q-->Ready: insn 31: moving to ready without stalls
;; Ready list after queue_to_ready: 31 20 27 23
;; Ready list after ready_sort: 20 31 27 23
;; Ready list (t = 5): 20 31 27 23
;; 5--> 23 {r43=r43+0x1;clobber System, CC.A;}:x
;; dependences resolved: insn 24 into queue with cost=1
;; Ready-->Q: insn 24: queued for 1 cycles.
;; Ready list (t = 5): 20 31 27
;; Q-->Ready: insn 24: moving to ready without stalls
;; Ready list after queue_to_ready: 24 20 31 27
;; Ready list after ready_sort: 24 20 31 27
;; Ready list (t = 6): 24 20 31 27
;; 6--> 27 {r45=r45+0x1;clobber System, CC.A;}:x
;; dependences resolved: insn 28 into queue with cost=1
;; Ready-->Q: insn 28: queued for 1 cycles.
;; Ready list (t = 6): 24 20 31
;; Q-->Ready: insn 28: moving to ready without stalls
;; Ready list after queue_to_ready: 28 24 20 31
;; Ready list after ready_sort: 28 24 20 31
;; Ready list (t = 7): 28 24 20 31
;; 7--> 31 {r47=r47+0x1;clobber System, CC.A;}:x
;; dependences resolved: insn 32 into queue with cost=1
;; Ready-->Q: insn 32: queued for 1 cycles.
;; Ready list (t = 7): 28 24 20
;; Q-->Ready: insn 32: moving to ready without stalls
;; Ready list after queue_to_ready: 32 28 24 20
;;
New gcc 4.0.0 warnings seem spurious
Demonstration code:
--
#define AAA 0x1U
#define BBB 0x2U

struct foo {
	unsigned int bar:8;
};

struct foo foos[] = {
	{ ~(AAA) },
	{ ~(BBB) },
	{ ~(AAA|BBB) },
	{ ~(AAA&BBB) }
};
--
compiling with gcc 3.x produced no warnings, as expected (no problems as all values fit easily within the defined structure's bit field). gcc 4.0.0 produces:
gcctest.c:9: warning: large integer implicitly truncated to unsigned type
gcctest.c:10: warning: large integer implicitly truncated to unsigned type
gcctest.c:11: warning: large integer implicitly truncated to unsigned type
gcctest.c:12: warning: large integer implicitly truncated to unsigned type
Re: Store scheduling with DFA scheduler
Jon Beniston wrote:
> Hi Vlad,
>> There is not enough information to say what is wrong. It would be better
>> if you send gcc output when -fsched-verbose=10 is used.
> Cheers, Jon
> ;; Ready list (t = 10): 32 28 24
> ;; 10--> 24 [`y']=r43 :x,m*2
> ;; Ready list (t = 10): 32 28
> ;; Ready list after queue_to_ready: 32 28
> ;; Ready list after ready_sort: 32 28
> ;; Ready list (t = 11): 32 28
> ;; Ready-->Q: insn 28: queued for 1 cycles.
> ;; Ready list (t = 11): 32
> ;; Ready-->Q: insn 32: queued for 1 cycles.
> ;; Ready list (t = 11):
> ;; Q-->Ready: insn 32: moving to ready without stalls
> ;; Q-->Ready: insn 28: moving to ready without stalls
> ;; Ready list after queue_to_ready: 28 32
> ;; Ready list after ready_sort: 32 28
> ;; Ready list (t = 12): 32 28
> ;; 12--> 28 [`z']=r45 :x,m*2
> ;; Ready list (t = 12): 32
> ;; Ready list after queue_to_ready: 32
> ;; Ready list after ready_sort: 32
> ;; Ready list (t = 13): 32
> ;; Ready-->Q: insn 32: queued for 1 cycles.
> ;; Ready list (t = 13):
> ;; Q-->Ready: insn 32: moving to ready without stalls
> ;; Ready list after queue_to_ready: 32
> ;; Ready list after ready_sort: 32
> ;; Ready list (t = 14): 32
> ;; 14--> 32 [`w']=r47 :x,m*2
> ;; Ready list (t = 14):
> ;; Ready list (final):
> ;; total time = 14
> ;; new head = 18
> ;; new tail = 32

It looks like the DFA pipeline hazard recognizer works well. Even when the data is ready for the stores, there is a 2-cycle delay between stores because the memory unit is reserved by the previous store insn. The problem is in the heuristics used to sort ready insns. The highest-priority heuristic is critical path length. Loads have the biggest value, 6, then additions have value 4, and finally stores have value 3. So the heuristic does not work well for you. But the experience of many compiler developers shows that it is the best heuristic for list insn scheduling. It could be fixed by using other, more sophisticated algorithms or optimal algorithms, which as a rule cannot be used in an industrial compiler because they are too slow. Vlad
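The critical-path values Vlad quotes (6 for loads, 4 for additions, 3 for stores) can be reproduced from the latencies in Jon's define_insn_reservation lines (2, 1 and 3) with a small C sketch. This is a simplified model for a straight load -> add -> store chain, written for illustration; it is not the scheduler's own code, and chain_priorities is a made-up name.

```c
#include <stddef.h>

/* Simplified model, not GCC's implementation: for a straight dependence
   chain insn[0] -> insn[1] -> ... -> insn[n-1], the priority of an insn is
   its own cost plus the priority of the insn that depends on it. */
static void chain_priorities(const int *cost, int *prio, size_t n)
{
    int remaining = 0;
    for (size_t i = n; i-- > 0; ) {
        remaining += cost[i];   /* cost of this insn and everything after it */
        prio[i] = remaining;
    }
}
```

Feeding it the costs {2, 1, 3} yields {6, 4, 3}. Because every load heads a chain, all four loads outrank every addition and store, which is why a priority-sorted ready list issues the loads first and leaves the non-pipelined stores to the end.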
Re: New gcc 4.0.0 warnings seem spurious
On Tue, 26 Apr 2005, Bruce Lilly wrote: > Demonstration code: > -- > #define AAA 0x1U > #define BBB 0x2U > > struct foo { > unsigned int bar:8; > }; > > struct foo foos[] = { > { ~(AAA) }, > { ~(BBB) }, > { ~(AAA|BBB) }, > { ~(AAA&BBB) } > }; > -- > > compiling with gcc 3.x produced no warnings, as expected (no problems as > all values fit easily within the defined structure's bit field). I don't see why you think the warnings are spurious. ~(AAA), for example, is 4294967294, which being greater than 255 certainly does not fit within the type unsigned:8. Previous GCC versions had a long-known bug whereby they did not diagnose this; that bug has been fixed in GCC 4. -- Joseph S. Myers http://www.srcf.ucam.org/~jsm28/gcc/ [EMAIL PROTECTED] (personal mail) [EMAIL PROTECTED] (CodeSourcery mail) [EMAIL PROTECTED] (Bugzilla assignments and CCs)
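If the intent was to store only the low 8 bits of each complemented constant, one way to say so explicitly is to mask before initializing. The sketch below assumes that intent; LOW8 is a made-up helper macro, not something from the original code.

```c
#include <limits.h>

/* Hypothetical helper: keep only the low 8 bits so the value fits an
   `unsigned int bar:8` bit-field without implicit truncation. */
#define LOW8(x) ((x) & 0xFFu)

struct foo {
    unsigned int bar:8;
};

static const struct foo foos[] = {
    { LOW8(~0x1U) },   /* 0xFE */
    { LOW8(~0x2U) },   /* 0xFD */
};
```

~0x1U has every bit of unsigned int set except the lowest, which is 4294967294 when unsigned int is 32 bits wide, so the unmasked initializer really is out of range for an 8-bit field. After masking, the initializers are 254 and 253 and fit.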
Re: EABI stack alignment for ppc
Dave Korn wrote: > > "a long double member within a structure or union shall start at the > >lowest available offset aligned on a 16byte boundary" > It only implies that the offset should be such a multiple, but since the > struct itself will have to be aligned to a multiple of 16 if any of its > members have to be aligned to a multiple of 16 (at least according to the C > language rules), it works out the same: the base address is aligned, the > offset is aligned because of the paragraph above, so the actual member > address (offset + base) is also aligned. Hmm, doesn't "member has to be aligned on a multiple of X" apply to the address of the member? My understanding was that the maximum of those member address alignment constraints has to be propagated to the aggregate so that it can be translated into local constraints on offsets. Actually the whole EABI paragraph reads: "Unlike the SVR4 ABI, an array, structure or union containing a long double shall start aligned on an 8 byte boundary. However, as in the SVR4 ABI, a long double member within a structure or union shall start at the lowest available offset aligned on a 16byte boundary, and the size of the structure or union with a long double member shall be a multiple of 16 bytes." What I find very surprising is that satisfying an EABI rule on long double offsets in aggregates has to translate into painful-to-maintain constraints on the stack pointer, contrary to one of the EABI's major points (relaxing the stack pointer alignment). Thanks for your feedback, Olivier
Re: Heads-up: volatile and C++
Marcin Dalecki <[EMAIL PROTECTED]> writes: | On 2005-04-15, at 23:59, Mike Stump wrote: | | > On Friday, April 15, 2005, at 02:52 PM, Marcin Dalecki wrote: | >>> My god, you didn't actually buy into that did you? Hint, it was | >>> is, and always will be a joke. | >> | >> You dare to explain what's so funny about it? | > | > Oh, it wasn't funny. Maybe the English is slightly too idiomatic? | > I'd need someone that understands the English and German to | > translate. | | > You can read it as, it was and will always be, just a bad idea. | | When will be a full and standard conforming template implementation in | GCC finished then? when someone finishes it, meaning you or any other GCC contributor. But, in the meantime, that does not prevent you from enjoying other templates benefits in embedded world. -- Gaby
Re: object code execution statistics
On Tue, Apr 26, 2005 at 12:57:54PM +0400, Sergei Tovpeko wrote: > Hello James, > > Tuesday, April 26, 2005, 6:36:56 AM, you wrote: > > JEW> Sergei Tovpeko wrote: > >> Is there any util that would produce result containing the asm code > >> execution staticstics ??? > > JEW> I assume you want assembly instruction execution counts. You could > JEW> produce this info from gcov with a bit of work, as gcov already gives > JEW> you execution counts for basic blocks. You just need to disassemble the > JEW> code, apply the counts to the assembly instructions, and then print > JEW> whatever statistics you want. Probably not easy, but feasible. > > JEW> Otherwise, there is nothing in gcc that will help you here. > > Does it mean that GCOV have much more data (in its binary format) > but don't treat them and out to user in human format? It depends on what you want. If you have inline assembly and want to know how often it is executed, you could try the source-code annotation feature of gcov, to get counts for each line. Or you could use the line-level information to try to figure out how often the assembly is executed. But that isn't easy. The instrumentation that gcov relies on has to be inserted into the code by gcc, and doesn't get inserted into assembly language.
Re: gcc 4.0.0 optimization vs. id strings (RCS, SCCS, etc.)
On Mon April 25 2005 20:52, Zack Weinberg wrote: > Bruce Lilly <[EMAIL PROTECTED]> writes: > > > Earlier versions of gcc retain static character strings in object > > files which can be used for identification via ident (RCS) or what > > (SCCS). Gcc 4.0.0 removes them above optimization level 1. > > The first observation I'd like to make is that we (the GCC developers, > collectively) don't really understand the need for this. We don't put > $Id$ strings in our own source code, never mind the object files. The > attitude is that the version number of the program as a whole suffices > to figure out which release branch to go looking for the bug in. That depends on the specifics of the project. For one particular project, which is a library rather than a program, there are about 250 source files. Several are gperf source files generated from some data by awk scripts. So in order to "go looking for the bug", I may need to know which version of an awk script is used, with which version of awk/gawk/mawk/nawk/oawk, what version of gperf is used and with which arguments, and the specific version of data used. Moreover, as the project is open source, I need to be able to determine if some extensions or other modifications are involved. Because the project is a library, that library might have been built by a different person at a different time using a different set of tools from who/what/when for some application program that uses that library. So "the version number of the program as a whole" really says nothing, at least in this case. Other applications of identification strings include copyright information, which in many cases is required to be retained by licensing terms. > Therefore, I hesitate to give you any sort of advice, because as long > as there's no one on the GCC team who understands why this needs to > work, people are going to continue to break it. 
By analogy with unreferenced static functions, I think gcc should at least provide some warning to the programmer that a static constant string is unreferenced. IMO, that is all that should happen; it should be up to the programmer to either elide unused cruft or to retain content which has some use that the compiler is not able to discern. At the moment, gcc 4.0.0 acts more like a programmer's nanny, removing content with not so much as a warning to the programmer. That has two undesirable effects: 1. it runs counter to the principle of least surprise; if a programmer includes a static constant string, he expects it to be there. 2. it provides no incentive to clean up unused cruft, leading to clutter in code which results in high maintenance costs. > I do have three suggestions for you: > > 1) The current way to tell the compiler not to throw away >apparently-unused data is __attribute__((used)), like this: > > static const char __attribute__((used)) rcs_sccs_id[] = > "$Id: @(#)%M% %I% 20%E% %U% copyright 2005 %Q% string1\\ $"; OK, thanks to all who pointed that out. >You can hide this behind a macro so it doesn't interfere with other >compilers. Maybe; that remains to be seen. > 2) #ident currently does something useful if and only if the back-end >defines what to do with it. However, we could define a sensible >default, e.g. converting it to a global string declaration like the >one above. I'd be happy to take a patch that made that change. > >However, the purpose of -fno-ident is to make #ident do nothing; >we're not going to change that. I'm a little confused why you even >bring it up. The idea is to produce identification strings in a portable and dependable manner. Some mechanisms are non-portable, others are not dependable (due to different behavior with different options); some are neither portable nor dependable. > 3) #sccs does nothing because no one has ever told me what it was >supposed to do, or even what its argument syntax was. 
From what >you write, it's functionally identical to #ident; I'd be happy to >take a patch that made #sccs an alias for #ident. That's what it looks like; but as noted, it's also non-portable, so truly supporting it (as opposed to claiming it as a GNU extension and then ignoring it) doesn't help for new code intended to be portable, although it might improve compatibility for reuse of old code (e.g. on AT&T 3B1s, which used #sccs extensively in header files, IIRC).
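The "hide this behind a macro" suggestion from point 1 could look like the sketch below. ID_KEEP is a made-up macro name, and the test on __GNUC__ is an assumption about how one might detect a compiler that accepts the attribute; very old GCC releases that predate the attribute would need a version check as well.

```c
/* ID_KEEP is a made-up macro name. On compilers that define __GNUC__ it
   expands to the attribute quoted above; elsewhere it expands to nothing. */
#if defined(__GNUC__)
#define ID_KEEP __attribute__((used))
#else
#define ID_KEEP
#endif

/* The string is unreferenced on purpose; the attribute asks the compiler
   to emit it anyway so that ident or what can find it in the object. */
static const char ID_KEEP rcs_id[] = "$Id$";
```

On other compilers the macro does nothing, so whether the string survives there still depends on that compiler's own treatment of unreferenced static data.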
RE: gcc 4.0.0 optimization vs. id strings (RCS, SCCS, etc.)
We use the feature of placing strings into the object file somewhat differently. We record configuration and compilation-related information in strings which are coalesced into their own linkage section. A runtime component traverses this configuration section to ensure that the various separately linked modules have been compiled with consistent settings. Yes, this might be better done by a host-based tool like collect, but that requires more work and more mechanism, and the simpler approach works fine for now.
Re: Re[2]: object code execution statistics
On Tue, 2005-04-26 at 01:57, Sergei Tovpeko wrote: > Does it mean that GCOV have much more data (in its binary format) > but don't treat them and out to user in human format? We have branch taken/not-taken counts, and from that, we can compute basic block execution counts and branch probabilities. gcov then uses a basic block to line number map to convert this into line number execution counts for the pretty printed source output. There are certainly other things that could be done with the info. Applying execution counts to assembly instructions is probably not very useful to the end user. It is too hard to map source code to assembly code after optimization. It might be interesting to generate statistics though, e.g. computing the percentage of executed instructions that were nops, the percent that were moves, etc. -- Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
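The kind of statistic Jim mentions, such as the percentage of executed instructions that were nops, reduces to a weighted sum once each basic block has an execution count and has been disassembled. A minimal C sketch follows, with hypothetical input arrays; nothing here reads gcov's data files, and executed_percent is a made-up name.

```c
#include <stddef.h>

/* Hypothetical inputs: exec[b] is how often block b ran (from profile data),
   total[b] is the number of instructions in block b, and match[b] is how
   many of those belong to the class of interest (e.g. nops).
   Returns the percentage of executed instructions in that class. */
static double executed_percent(const unsigned long *exec,
                               const unsigned *total,
                               const unsigned *match, size_t nblocks)
{
    unsigned long all = 0, hit = 0;
    for (size_t b = 0; b < nblocks; b++) {
        all += exec[b] * total[b];
        hit += exec[b] * match[b];
    }
    return all ? 100.0 * (double)hit / (double)all : 0.0;
}
```

A block that never ran contributes nothing, however many nops it contains, which is what distinguishes this from a static instruction count.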
Re: sjlj exceptions?
On Apr 26, 2005, at 1:00 AM, Thorsten Glaser wrote:
> when porting gcc (still 3.4.4), how do I exactly know whether
> I need to pass --enable-sjlj-exceptions to configure?

You should never need it.

> Is there a test case which fails if I need it and have it not
> enabled, and passes otherwise (disabled and not needed, or enabled)?

You can try eh*.C from check-g++; if they mostly all pass, you're set. If not, the above might work around the fact that there is more work to be done.
libstdc++ problem after compiling gcc-4.0 with the -fvisibility-inlines
I just compiled gcc-4.0 with the -fvisibility-inlines-hidden option, and I get undefined symbols when linking C++ code with libstdc++. For example, this simple C++ file fails to link:

#include <iostream>
#include <string>
using namespace std;

int main (void)
{
  basic_string<char> a = "thing one";
  string b = "thing two";
  cout << a.c_str() << endl;
  return (a == b);
}

I get the following errors:

tmp/ccK2jGQY.o(.text+0x24): In function `main':
test.cpp: undefined reference to `std::allocator<char>::allocator()'
/tmp/ccK2jGQY.o(.text+0x51):test.cpp: undefined reference to `std::allocator<char>::~allocator()'
/tmp/ccK2jGQY.o(.text+0x6a):test.cpp: undefined reference to `std::allocator<char>::~allocator()'
/tmp/ccK2jGQY.o(.text+0x75):test.cpp: undefined reference to `std::allocator<char>::allocator()'
/tmp/ccK2jGQY.o(.text+0xa2):test.cpp: undefined reference to `std::allocator<char>::~allocator()'
/tmp/ccK2jGQY.o(.text+0xb2):test.cpp: undefined reference to `std::allocator<char>::~allocator()'
/tmp/ccK2jGQY.o(.text+0xbd):test.cpp: undefined reference to `std::basic_string<char, std::char_traits<char>, std::allocator<char> >::c_str() const'
/tmp/ccK2jGQY.o(.text+0xdd):test.cpp: undefined reference to `std::basic_ostream<char, std::char_traits<char> >::operator<<(std::basic_ostream<char, std::char_traits<char> >& (*)(std::basic_ostream<char, std::char_traits<char> >&))'
/tmp/ccK2jGQY.o(.text+0x100):test.cpp: undefined reference to `std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()'
/tmp/ccK2jGQY.o(.text+0x113):test.cpp: undefined reference to `std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()'
/tmp/ccK2jGQY.o(.text+0x129):test.cpp: undefined reference to `std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()'
/tmp/ccK2jGQY.o(.text+0x142):test.cpp: undefined reference to `std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()'
/tmp/ccK2jGQY.o(.gnu.linkonce.t._ZSteqIcSt11char_traitsIcESaIcEEbRKSbIT_T0_T1_ES8_[bool std::operator==<char, std::char_traits<char>, std::allocator<char> >(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)]+0x14): In function `bool std::operator==<char, std::char_traits<char>, std::allocator<char> >(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)':
test.cpp: undefined reference to `std::basic_string<char, std::char_traits<char>, std::allocator<char> >::compare(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const'
collect2: ld returned 1 exit status

So is it not safe to compile gcc with the -fvisibility-inlines-hidden option?

Regards
Papadakos Panagiotis
Re: libstdc++ problem after compiling gcc-4.0 with the -fvisibility-inlines
Panagiotis Papadakos wrote:
> I just compiled gcc-4.0 with the fvisibility-inlines-hidden option,
> and I get undefined symbols when linking c++ code with libstdc++.
> For example this simple c++ file does not compile:

Can you please compare what you are seeing with libstdc++/19664? I believe it's the same thing. I have the trivial libstdc++-v3 patch somewhere, and it seems to be needed in any case, but I'm holding it back because there are also issues with the compiler proper. Honestly, I'm not sure I fully understand the whole complex tangle at the moment :-(

Paolo.
Re: New gcc 4.0.0 warnings seem spurious
On Tue April 26 2005 11:10, Joseph S. Myers wrote:
> On Tue, 26 Apr 2005, Bruce Lilly wrote:
>
> > Demonstration code:
> > --
> > #define AAA 0x1U
> > #define BBB 0x2U
> >
> > struct foo {
> >     unsigned int bar:8;
> > };
> >
> > struct foo foos[] = {
> >     { ~(AAA) },
> >     { ~(BBB) },
> >     { ~(AAA|BBB) },
> >     { ~(AAA&BBB) }
> > };
> > --
> >
> > compiling with gcc 3.x produced no warnings, as expected (no problems as
> > all values fit easily within the defined structure's bit field).
>
> I don't see why you think the warnings are spurious. ~(AAA), for example,
> is 4294967294,

No, in this context it is 254 (an 8-bit unsigned field with the LSB clear).

> which being greater than 255 certainly does not fit within
> the type unsigned:8.

But 254 is certainly < 255. The 'U' in the constant simply means unsigned; if it had been specified as "LU" you might have a point. But it wasn't.

> Previous GCC versions had a long-known bug whereby
> they did not diagnose this; that bug has been fixed in GCC 4.

Looks more like several bugs were introduced; consider:

#if 0
#define AAA 0x1U
#define BBB 0x2U
#else
static const unsigned char AAA = 0x1U;
static const unsigned char BBB = 0x2U;
#endif

struct foo {
    unsigned int bar:8;
};

struct foo foos[] = {
    { ~(AAA) },
    { ~(BBB) },
    { ~(AAA|BBB) },
    { ~(AAA&BBB) }
};

gcc 4.0.0 reports:

gcctest.c:14: error: initializer element is not constant
gcctest.c:14: error: (near initialization for 'foos[0].bar')
gcctest.c:15: error: initializer element is not constant
gcctest.c:15: error: (near initialization for 'foos[1].bar')
gcctest.c:16: error: initializer element is not constant
gcctest.c:16: error: (near initialization for 'foos[2].bar')
gcctest.c:17: error: initializer element is not constant
gcctest.c:17: error: (near initialization for 'foos[3].bar')

Now it's claiming that two *explicitly declared* const values aren't constant!
Re: New gcc 4.0.0 warnings seem spurious
Bruce Lilly <[EMAIL PROTECTED]> writes: >> I don't see why you think the warnings are spurious. ~(AAA), for example, >> is 4294967294, > > No, in this context it is 254 (an 8-bit unsigned field with the LSB clear). C does not work the way you think. AAA has type unsigned int. The expression ~(AAA) also has type unsigned int, and the value that Joseph stated. It always has that type and value, no matter what context it appears in. The initializer thus tries to give a variable with type unsigned:8 a value that it cannot hold. The diagnostic is correct. > static const unsigned char AAA = 0x1U; > static const unsigned char BBB = 0x2U; Again, C does not work the way you think. These are not constants. They are variables, which happen to be read-only. You cannot use them in initializers, just as you cannot use any other variable in an initializer. Furthermore, with that definition, AAA has type unsigned char, but ~(AAA) has type *signed* int and the value -2, because all arithmetic operations on types smaller than int, signed or unsigned, are first promoted to int (this is a slight simplification but is correct as far as it goes). zw
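The two rules described above can be checked with a small sketch. The helper function names are made up for illustration, and the value 4294967294 quoted in the thread assumes a 32-bit unsigned int, so the check below is written against UINT_MAX instead. The last function shows one way to state the intended 8-bit value explicitly; it is a possible rewrite, not something proposed in the thread.

```c
#include <limits.h>

#define AAA 0x1U

struct foo {
    unsigned int bar:8;
};

/* ~(AAA) is computed in unsigned int, whatever it later initializes:
   every bit except the lowest is set. */
static unsigned int complement_of_macro(void)
{
    return ~(AAA);
}

/* An unsigned char operand is promoted to int before ~ is applied,
   so the result is a negative int (two's complement assumed). */
static int complement_of_uchar(void)
{
    const unsigned char aaa = 0x1U;
    return ~aaa;
}

/* Masking to the field width yields a value the 8-bit field can hold. */
static unsigned int masked_field_value(void)
{
    struct foo f = { ~(AAA) & 0xFFU };
    return f.bar;
}
```

Only the masked form produces 254; the unmasked expressions have the values given in the reply above regardless of where they are used.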
Re: Where did the include files go?
Øystein Johansen wrote:
> But why is the /gcc4.1/include/ directory empty?

I think if you build only the C compiler, and your target doesn't support mudflap, then you won't get any files here. This is because the C compiler doesn't have anything to put there. Otherwise, there will be files here.

> cd /gcc4.1
> find . -name 'stdio.h' -print

stdio.h comes from the C library, not from gcc. Try looking for 'include' or 'stddef.h'. The rest of the header files are in /usr/include as usual.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
Re: Propagating attributes for to structure elements (needed for different address spaces)
Martin Koegler wrote:
> typedef struct x ax __attribute__ ((eeprom));
> void test1(ax* x)

One possible solution is to change your syntax. eeprom is supposed to be an attribute that applies to a decl. You are using a trick here to apply it to a type via a typedef, which takes advantage of the fact that typedefs are implemented internally as decls. So, maybe you should stop using tricks, and apply the attribute to the decl where it belongs, e.g. something like "struct x __attribute__ ((eeprom)) * x". This also lets you eliminate the TYPE_DECL hackery in your attribute handler. This requires that the attribute work a bit like the const type qualifier, since we can have pointers to eeprom and eeprom pointers, and they need to be handled differently.

> In this case, while expanding x->y.x=1, the type of x->y is integer.
> Even the expression, which the MEM_EXPR gets, contains not the
> information, that the eeprom attribute is present. Similar problems
> occur, if pointers to elements of structures (or pointer to array
> elements) are used.

Presumably the info is there somewhere in the expander. It should simply be a matter of keeping track of it and carrying it along. Maybe this works better with the attribute on the decl instead of on the type?

> The solution would be to add also for the base type of an array the
> eeprom attribute. Additionally all elements of structures and unions
> must also have the eeprom attribute added.

Bad idea. Far too much hackery.

> Another alternative would be to implement backend specific qualifiers
> (like volatile) in the tree representation. This would need large
> changes in GCC (at each location, where a new tree expression is
> created, these qualifiers would also need to be set).

Yes, this seems reasonable. This seems to agree with what I mentioned at the top, and also seems to agree with the proposed address spaces extension that Joseph Myers pointed at.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
Re: A plan for eliminating cc0
Sorry, I dropped the ball on this one.

On Mar 24, 2005, Ian Lance Taylor wrote:

> Alexandre Oliva <[EMAIL PROTECTED]> writes:
>> I realize the sequence construct is already taken for delayed
>> branches, but that's only in the outermost insn pattern. We could
>> overload the meaning, or just enclose the (sequence) in some other
>> construct (a dummy parallel?), give it a different mode (a CC mode?)
>> or something else to indicate that this is a combined pattern. Or
>> create a different rtl code for cc0 sequences of instructions. The
>> syntax is not all that important at this point.

> I don't understand why you are suggesting a sequence at all. Why not
> this:

> (define_insn_and_split "cc0_condbranch_cmpsi"
>   [(set (pc)
>     (if_then_else (match_operator 3 "comparison_operator"
>       (match_operand:SI 0 "register_operand" "!*d*a*x,dax")
>       (match_operand:SI 1 "nonmemory_operand" "*0,daxi"
>     (label_ref (match_operand 2 "" ""))
>     (pc)))]
>  )]

Because the above assumes the insn that sets cc is a single set, without anything else in parallel. If you have to handle parallels, you may be tempted to just add them to the combined insn, and then in some ugly cases you can end up having ambiguities. Consider a (clobber (reg X)) in parallel with the (set (cc)), for example. Consider that some instructions that use cc may also have such clobbers, and some don't, and that the clobber is the only difference between them. How would you deal with this? Consider, for the worst case, that whatever strategy you use to decide where to add the clobber is the strategy the port maintainer took when deciding where to add the explicit clobber to the two insn variants that use cc.

Besides, I feel the combined sequences might reduce the actual effects of the pattern explosion, since you won't have to generate code to match the whole thing and the components, you just match a sequence with the given components. But maybe it doesn't make any significant difference in this regard.
-- Alexandre Oliva http://www.ic.unicamp.br/~oliva/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: A plan for eliminating cc0
On Mar 28, 2005, Paul Schlie <[EMAIL PROTECTED]> wrote: > More specifically, if GCC enabled set to optionally specify multiple targets > for a single rtl source expression, i.e.: > (set ((reg:xx %0) (reg CC) ...) (some-expression:xx ...)) There's always (set (parallel (...)) (some-expression)). We use parallels with similar semantics in cumulative function arguments, so this wouldn't be entirely new, but I suppose most rtl handlers would need a bit of work to fully understand the implications of this. Also, the fact that reg CC has a different mode might require some further tweaking. Given this, we could figure out some way to create lisp-like macros to translate input shorthands such as (set_cc (match_operand) (value)) into the uglier (set (parallel (match_operand) (cc0)) (value)), including the possibility of a port introducing multiple variants/modes for set_cc, corresponding to different sets of flags that various instructions may set. -- Alexandre Oliva http://www.ic.unicamp.br/~oliva/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}
Re: EABI stack alignment for ppc
Olivier Hainque <[EMAIL PROTECTED]> writes:

> Hello,
>
> PPC EABI targets are currently configured with both BIGGEST_ALIGNMENT and
> PREFERRED_STACK_BOUNDARY set to 128, I believe to accommodate
>
>   "a long double member within a structure or union shall start at the lowest
>    available offset aligned on a 16byte boundary"

BIGGEST_ALIGNMENT is 128 for a number of reasons, but PREFERRED_STACK_BOUNDARY is 128 primarily so that code compiled with -meabi can also be used on Linux and other SVR4 targets, and for Altivec support.

> Now, I'm a bit unclear on the meaning of the ABI statement quoted above, and
> on the real implications this should have in the compiler.
>
> Does it imply that a long double field *address* should always be a multiple
> of 16, or just that the *offset* should be such a multiple ?

It says offset, so it probably means offset.
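The offset/address distinction in the question can be made concrete with a short sketch. The structure and function names are invented for illustration, and the sketch says nothing about what the PPC EABI actually requires: it only shows that the member's offset is a fixed property of the layout, while its address also depends on where the enclosing object was placed (for a local variable, on the stack alignment).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure with a long double member after a char. */
struct sample {
    char tag;
    long double value;
};

/* Offset of the member within the structure: fixed at compile time. */
static size_t value_offset(void)
{
    return offsetof(struct sample, value);
}

/* Does the offset respect the member's alignment on this target? */
static int offset_is_aligned(void)
{
    return value_offset() % _Alignof(long double) == 0;
}

/* Is the member's run-time address a multiple of 16?  This depends on
   the alignment of the whole object, not only on the offset. */
static int address_is_16_aligned(const struct sample *s)
{
    return ((uintptr_t) &s->value) % 16 == 0;
}
```

The first check holds for any conforming layout; the second can differ between two objects of the same type if the target does not align the enclosing object to 16 bytes.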
Re: [RFA] Invalid mmap(2) assumption in pch (ggc-common.c)
Matt Thomas <[EMAIL PROTECTED]> writes:

> Running the libstdc++ testsuite on NetBSD/sparc or NetBSD/sparc64
> results in most tests failing like:
>
> :1: fatal error: had to relocate PCH
> compilation terminated.
> compiler exited with status 1
>
> This is due to a misassumption in ggc-common.c:654
> (mmap_gt_pch_use_address):
>
>   This version assumes that the kernel honors the START operand of mmap
>   even without MAP_FIXED if START through START+SIZE are not currently
>   mapped with something.
>
> That is not true for NetBSD. Due to MMU idiosyncrasies, some architectures
> (like sparc and sparc64) will align mmap requests that don't have MAP_FIXED
> set for architecture specific reasons.

Such systems should work anyway, so long as the alignment is consistent; the address being requested is one which was returned from mmap() in a previous invocation of the compiler. If the alignment isn't consistent, then you might look at overriding this routine for your host. There are custom versions for Solaris, Linux, Darwin, and HPPA, each of which uses a different implementation technique.
RE: Ada and bad configury architecture.
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of
> Nathanael Nerode
> Sent: Monday, April 25, 2005 8:47 PM
[...]
>
> Actually, I was going to try to convince y'all to allow the *configury*
> to be put in the *configure* files. All of it. The current scheme of
> stuffing the configury in the Makefile, although I know the Ada
> maintainers like it, is just trouble, and is fundamentally the source of
> most or all of the endless Ada cross-build problems.

We implement an experimental dialect of C, called UPC, which targets SIMD class machines. One of the changes between 3.3 and 3.4 that has caused us the most grief is the decision to defer per-language configuration to the make step. This means that the dialect-specific configuration runs after gcc configuration, and we can no longer, for example, overlay (or add to) the basic configuration.

As an example, we need to introduce dialect-specific runtime start and end object files (serving a similar function to crtbegin.o and crtend.o) but the common start and end files are now built well before the UPC language files are even configured. Thus, there is no mechanism to add language-specific components onto the list of files that come with the base level compiler. For 3.4 we've worked around the problem, but the workaround is kludgy.

In a related matter, I find it difficult to debug the makefiles that make use of included makefile fragments. I can see some advantages of these included files for developers who happen to be working on those fragments, but overall, the include files make life more difficult. Same thing goes for the included configure fragments, IMO.

And while I'm ranting, I'd much prefer it if the make files were 'for loop free'; that is, that they listed explicit dependencies and built those dependents in a classic make file fashion, rather than implementing iteration in the make step.
Most of these suggestions argue for a method to generate make files in a more automated fashion.
Re: Java failures [Re: 75 GCC HEAD regressions, 0 new, with your patch on 2005-04-20T14:39:10Z.]
Andrew Haley wrote:
> * postreload-gcse.c (hash_scan_set): Remove bogus assertion.

I agree with Roger here; we need to add code to handle REG_EH_REGION notes here instead of just dropping the gcc_assert call. See my two-week-old message on the gcc list from when this first came up:
http://gcc.gnu.org/ml/gcc/2005-04/msg00643.html
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
Re: [PATCH] Debugging Vector Types
Devang Patel wrote: * dbxout.c (dbxout_type): Emit attribute vector. You are setting have_used_extensions without first checking use_gnu_debug_info_extensions, which is wrong. If you look at the code, you will see that this idiom is used everywhere in dbxout.c. Bootstrapped and tested on powerpc-darwin. All changes to debug output files require running the gdb testsuite. Did you do that? The usual gcc bootstrap and make check isn't sufficient, as there are no debug info checks in gcc. -- Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
Re: Submission Status: CRX port ?
Paul Woegerer wrote: two weeks ago i've posted a new port to gcc-patches and it seems that no global maintainer took a look at it so far. Maybe now that the 4.0 is released there is someone who can take a look at it :) I should be able to help with this if no one else can, as I am trying to review patches again. I am a bit rusty though. I see I accidentally posted two gcc-patches followups to the gcc list today. A classic mistake. It will take me some time to get fully back up to speed. I exchanged a bit of mail with Tal Agmon last year, so I already know a little bit about the port. There is also the MAXQ port that is waiting review which I should look at also. I suspect this one is not as far along as the CRX port though. -- Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
Re: [PATCH] Debugging Vector Types
On Tue, Apr 26, 2005 at 05:55:21PM -0700, James E Wilson wrote: > Devang Patel wrote: > >* dbxout.c (dbxout_type): Emit attribute vector. > > You are setting have_used_extensions without first checking > use_gnu_debug_info_extensions, which is wrong. If you look at the code, > you will see that this idiom is used everywhere in dbxout.c. I think there are already some exceptions. Do those checks still add value over the -gstabs/-gstabs+ distinction? -- Daniel Jacobowitz CodeSourcery, LLC
[PATCH] VAX: cleanup; move macros from config/vax/vax.h to normal in config/vax/vax.c
This doesn't change any functionality, it just moves and cleans up a large number of complicated macros in vax.h to normal C code in vax.c. It's the first major step to integrating PIC support that I did for gcc 2.95.3. It also switches from using SYMBOL_REF_FLAG to SYMBOL_REF_LOCAL_P. Committed. -- Matt Thomas email: [EMAIL PROTECTED] 3am Software Foundry www: http://3am-software.com/bio/matt/ Cupertino, CA disclaimer: I avow all knowledge of this message. 2005-03-26 Matt Thomas <[EMAIL PROTECTED]> * config/vax/vax.c (legitimate_constant_address_p): New. Formerly CONSTANT_ADDRESS_P in config/vax/vax.h (legitimate_constant_p): New. Formerly CONSTANT_P in vax.h. (INDEX_REGISTER_P): New. (BASE_REGISTER_P): New. (indirectable_constant_address_p): New. Adapted from INDIRECTABLE_CONSTANT_ADDRESS_P in vax.h. Use SYMBOL_REF_LOCAL_P. (indirectable_address_p): New. Adapted from INDIRECTABLE_ADDRESS_P in vax.h. (nonindexed_address_p): New. Adapted from GO_IF_NONINDEXED_ADDRESS in vax.h. (index_temp_p): New. Adapted from INDEX_TERM_P in vax.h. (reg_plus_index_p): New. Adapted from GO_IF_REG_PLUS_INDEX in vax.h. (legitimate_address_p): New. Adapted from GO_IF_LEGITIMATE_ADDRESS in vax.h (vax_mode_dependent_address_p): New. Adapted from GO_IF_MODE_DEPENDENT_ADDRESS in vax.h * config/vax/vax.h (CONSTANT_ADDRESS_P): Use legitimate_constant_address_p (CONSTANT_P): Use legitimate_constant_p. (INDIRECTABLE_CONSTANT_ADDRESS_P): Removed. (INDIRECTABLE_ADDRESS_P): Removed. (GO_IF_NONINDEXED_ADDRESS): Removed. (INDEX_TEMP_P): Removed. (GO_IF_REG_PLUS_INDEX): Removed. (GO_IF_LEGITIMATE_ADDRESS): Use legitimate_address_p. Two definitions, depending on whether REG_OK_STRICT is defined. (GO_IF_MODE_DEPENDENT_ADDRESS): Use vax_mode_dependent_address_p. Two definitions, depending on whether REG_OK_STRICT is defined. * config/vax/vax-protos.h (legitimate_constant_address_p): Prototype added. (legitimate_constant_p): Prototype added. (legitimate_address_p): Prototype added. 
(vax_mode_dependent_address_p): Prototype added. Index: vax.c === RCS file: /cvs/gcc/gcc/gcc/config/vax/vax.c,v retrieving revision 1.60 diff -u -3 -p -r1.60 vax.c --- vax.c 7 Apr 2005 21:44:57 - 1.60 +++ vax.c 26 Apr 2005 20:45:42 - @@ -1100,3 +1100,227 @@ vax_output_conditional_branch (enum rtx_ } } +/* 1 if X is an rtx for a constant that is a valid address. */ + +int +legitimate_constant_address_p (rtx x) +{ + return (GET_CODE (x) == LABEL_REF || GET_CODE (x) == SYMBOL_REF + || GET_CODE (x) == CONST_INT || GET_CODE (x) == CONST + || GET_CODE (x) == HIGH); +} + +/* Nonzero if the constant value X is a legitimate general operand. + It is given that X satisfies CONSTANT_P or is a CONST_DOUBLE. */ + +int +legitimate_constant_p (rtx x ATTRIBUTE_UNUSED) +{ + return 1; +} + +/* The other macros defined here are used only in legitimate_address_p (). */ + +/* Nonzero if X is a hard reg that can be used as an index + or, if not strict, if it is a pseudo reg. */ +#defineINDEX_REGISTER_P(X, STRICT) +(GET_CODE (X) == REG && (!(STRICT) || REGNO_OK_FOR_INDEX_P (REGNO (X + +/* Nonzero if X is a hard reg that can be used as a base reg + or, if not strict, if it is a pseudo reg. */ +#defineBASE_REGISTER_P(X, STRICT) +(GET_CODE (X) == REG && (!(STRICT) || REGNO_OK_FOR_BASE_P (REGNO (X + +#ifdef NO_EXTERNAL_INDIRECT_ADDRESS + +/* Re-definition of CONSTANT_ADDRESS_P, which is true only when there + are no SYMBOL_REFs for external symbols present. */ + +static int +indirectable_constant_address_p (rtx x) +{ + if (!CONSTANT_ADDRESS_P (x)) +return 0; + if (GET_CODE (x) == CONST && GET_CODE (XEXP ((x), 0)) == PLUS) +x = XEXP (XEXP (x, 0), 0); + if (GET_CODE (x) == SYMBOL_REF && !SYMBOL_REF_LOCAL_P (x)) +return 0; + + return 1; +} + +#else /* not NO_EXTERNAL_INDIRECT_ADDRESS */ + +static int +indirectable_constant_address_p (rtx x) +{ + return CONSTANT_ADDRESS_P (x); +} + +#endif /* not NO_EXTERNAL_INDIRECT_ADDRESS */ + +/* Nonzero if X is an address which can be indirected. 
External symbols + could be in a sharable image library, so we disallow those. */ + +static int +indirectable_address_p(rtx x, int strict) +{ + if (indirectable_constant_address_p (x)) +return 1; + if (BASE_REGISTER_P (x, strict)) +return 1; + if (GET_CODE (x) == PLUS + && BASE_REGISTER_P (XEXP (x, 0), strict) + && indirectable_constant_address_p (XEXP (x, 1))) +return
Re: [PATCH] Debugging Vector Types
On Apr 26, 2005, at 5:55 PM, James E Wilson wrote: Devang Patel wrote: * dbxout.c (dbxout_type): Emit attribute vector. You are setting have_used_extensions without first checking use_gnu_debug_info_extensions, which is wrong. If you look at the code, you will see that this idiom is used everywhere in dbxout.c. OK, I'll fix it. Bootstrapped and tested on powerpc-darwin. All changes to debug output files require running the gdb testsuite. Did you do that? The usual gcc bootstrap and make check isn't sufficient, as there are no debug info checks in gcc. GDB testsuite run using Darwin GDB did not reveal new failures. - Devang
GCC 4.1: Buildable on GHz machines only?
Over the past month I've been making sure that GCC 4.1 works on NetBSD. I've completed bootstraps on sparc, sparc64, arm, x86_64, i386, alpha, mipsel, mipseb, and powerpc. I've done cross-build targets for vax. Results have been sent to gcc-testsuite.

The times to complete bootstraps on older machines have been bothering me. It took nearly 72 hours for a 233MHz StrongArm with 64MB to complete a bootstrap (with libjava). It took over 48 hours for a 120MHz MIPS R4400 (little endian) with 128MB to finish (without libjava) and a bit over 24 hours for a 250MHz MIPS R4400 (big endian) with 256MB to finish (again, no libjava). That doesn't even include the time to run the testsuites.

I have a 50MHz 68060 with 96MB of memory (MVME177) approaching 100 hours (48 hours just to exit stage3 and start on the libraries) doing a bootstrap, knowing that it's going to die when doing the ranlib of libjava. The kernel for the 060 isn't configured with a large enough dataspace to complete the ranlib.

Most of the machines I've listed above are relatively powerful machines near the apex of performance of their target architecture. And yet GCC 4.1 can barely be bootstrapped on them.

I do most of my GCC work on a 2GHz x86_64 because it's so fast. I'm afraid the widespread availability of such fast machines hides the fact that the current performance of GCC on older architectures is appalling.

I'm going to run some bootstraps with --disable-checking just to see how much faster they are. I hope I'm going to be pleasantly surprised but I'm not counting on it.
--
Matt Thomas email: [EMAIL PROTECTED]
3am Software Foundry www: http://3am-software.com/bio/matt/
Cupertino, CA disclaimer: I avow all knowledge of this message.
Re: GCC 4.1: Buildable on GHz machines only?
On Tue, Apr 26, 2005 at 07:50:40PM -0700, Matt Thomas wrote: > Over the past month I've been making sure that GCC 4.1 works on NetBSD. > I've completed bootstraps on sparc, sparc64, arm, x86_64, i386, alpha, > mipsel, mipseb, and powerpc. I've done cross-build targets for vax. > Results have been sent to gcc-testsuite. > > The times to complete bootstraps on older machines has been bothering me. > It took nearly 72 hours for 233MHz StrongArm with 64MB to complete a > bootstrap (with libjava). It took over 48 hours for a 120MHz MIPS R4400 > (little endian) with 128MB to finish (without libjava) and a bit over 24 > hours for a 250MHz MIPS R4400 (big endian) with 256MB to finish (again, > no libjava). That doesn't even include the time to run the testsuites. > > I have a 50MHz 68060 with 96MB of memory (MVME177) approaching 100 hours > (48 hours just to exit stage3 and start on the libraries) doing a bootstrap > knowing that it's going to die when doing the ranlib of libjava. The kernel > for the 060 isn't configured with a large enough dataspace to complete the > ranlib. > > Most of the machines I've listed above are relatively powerful machines > near the apex of performance of their target architecture. And yet GCC4.1 > can barely be bootstrapped on them. Note that the MIPSen are not near the top of modern MIPS performance. The ARM isn't quite there either, but the higher-powered ARMs are a bit scarcer than the MIPS. None of this detracts from your point, though. > I'm going to run some bootstraps with --disable-checking just to see how > much faster they are. I hope I'm going to pleasantly surprised but I'm > not counting on it. I would expect it to be drastically faster. However this won't show up clearly in the bootstrap. The, bar none, longest bit of the bootstrap is building stage2; and stage1 is always built with optimization off and (IIRC) checking on. -- Daniel Jacobowitz CodeSourcery, LLC
Re: Build gcc-4.0.0
Jean-Paul Rigault wrote: - I had to use the --enable-languages option to get the Ada compiler; without it, and contrarily to what is suggested in the installation doc, Ada was not built. - the HTML documentation is generated in /objdir//gcc/HTML, not in /objdir//HTML as indicated in the documentation. Thanks for the info. I have posted a proposed patch on the gcc-patches mailing list here: http://gcc.gnu.org/ml/gcc-patches/2005-04/msg02720.html -- Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
Re: A plan for eliminating cc0
> From: Alexandre Oliva <[EMAIL PROTECTED]>
>> On Mar 28, 2005, Paul Schlie <[EMAIL PROTECTED]> wrote:
>
>> More specifically, if GCC enabled set to optionally specify multiple targets
>> for a single rtl source expression, i.e.:
>
>> (set ((reg:xx %0) (reg CC) ...) (some-expression:xx ...))
>
> There's always (set (parallel (...)) (some-expression)). We use
> parallels with similar semantics in cumulative function arguments, so
> this wouldn't be entirely new, but I suppose most rtl handlers would
> need a bit of work to fully understand the implications of this.
>
> Also, the fact that reg CC has a different mode might require some
> further tweaking.
>
> Given this, we could figure out some way to create lisp-like macros to
> translate input shorthands such as (set_cc (match_operand) (value))
> into the uglier (set (parallel (match_operand) (cc0)) (value)),
> including the possibility of a port introducing multiple
> variants/modes for set_cc, corresponding to different sets of flags
> that various instructions may set.

Understood; any thoughts as to how it may be similarly specified which cc-mode regs may be clobbered, in addition to updated, or left alone?

Which leads me to wonder if it may be worthwhile potentially accepting and classifying something like the following as a single-set:

  [(set (match_operand %0) (some-expression ...))
   (update (ccx) (ccy))
   (clobber (ccz))]

Specifying that:

- %0 is not bound to (some-expression)
- ccx, ccy are now equivalent to %0
- ccz now has no equivalency (i.e. undefined)
- any other potentially defined cc-mode regs' equivalencies are unchanged.

(as possibly a simpler and somewhat more familiar and flexible approach?)
Re: A plan for eliminating cc0
> From: Alexandre Oliva <[EMAIL PROTECTED]>
>> On Mar 28, 2005, Paul Schlie <[EMAIL PROTECTED]> wrote:
>
>> More specifically, if GCC enabled set to optionally specify multiple targets
>> for a single rtl source expression, i.e.:
>
>> (set ((reg:xx %0) (reg CC) ...) (some-expression:xx ...))
>
> There's always (set (parallel (...)) (some-expression)). We use
> parallels with similar semantics in cumulative function arguments, so
> this wouldn't be entirely new, but I suppose most rtl handlers would
> need a bit of work to fully understand the implications of this.
>
> Also, the fact that reg CC has a different mode might require some
> further tweaking.
>
> Given this, we could figure out some way to create lisp-like macros to
> translate input shorthands such as (set_cc (match_operand) (value))
> into the uglier (set (parallel (match_operand) (cc0)) (value)),
> including the possibility of a port introducing multiple
> variants/modes for set_cc, corresponding to different sets of flags
> that various instructions may set.

(sorry, had to fix a typo; should be somewhat more sensible now):

Understood; any thoughts as to how it may be similarly specified which cc-mode regs may be clobbered, in addition to updated, or left alone?

Which leads me to wonder if it may be worthwhile potentially accepting and classifying something like the following as a single-set:

  [(set (match_operand %0) (some-expression ...))
   (update (ccx) (ccy))
   (clobber (ccz))]

Specifying that:

- %0 is bound to (some-expression)
- ccx, ccy are now equivalent to %0
- ccz now has no equivalency (i.e. undefined)
- any other potentially defined cc-mode regs' equivalencies are unchanged.

(as possibly a simpler and somewhat more familiar and flexible approach?)
Re: GCC 4.1: Buildable on GHz machines only?
On Tue, Apr 26, 2005 at 10:57:07PM -0400, Daniel Jacobowitz wrote: > I would expect it to be drastically faster. However this won't show up > clearly in the bootstrap. The, bar none, longest bit of the bootstrap > is building stage2; and stage1 is always built with optimization off and > (IIRC) checking on. Which is why I essentially always supply STAGE1_CFLAGS='-O -g' when building on risc machines. r~
Re: GCC 4.1: Buildable on GHz machines only?
Richard Henderson wrote: > On Tue, Apr 26, 2005 at 10:57:07PM -0400, Daniel Jacobowitz wrote: > >>I would expect it to be drastically faster. However this won't show up >>clearly in the bootstrap. The, bar none, longest bit of the bootstrap >>is building stage2; and stage1 is always built with optimization off and >>(IIRC) checking on. > > > Which is why I essentially always supply STAGE1_CFLAGS='-O -g' when > building on risc machines. Alas, the --disable-checking and STAGE1_CFLAGS="-O2 -g" (which I was already doing) only decreased the bootstrap time by 10%. By far, the longest bit of the bootstrap is building libjava. -- Matt Thomas email: [EMAIL PROTECTED] 3am Software Foundry www: http://3am-software.com/bio/matt/ Cupertino, CA disclaimer: I avow all knowledge of this message.
[RFA] Which is better? More and simpler patterns? Fewer patterns with more embedded code?
Back when I modified gcc 2.95.3 to produce PIC code for NetBSD/vax, I changed the patterns in vax.md to be more specific with the instructions that got matched. The one advantage (to me as the writer) was that it made it much easier to track down which pattern caused which instruction to be emitted. For instance:

  (define_insn "*pushal"
    [(set (match_operand:SI 0 "push_operand" "=g")
          (match_operand:SI 1 "address_operand" "p"))]
    ""
    "pushal %a1")

I like the more and simpler patterns approach, but I'm wondering what the general recommendation is?

--
Matt Thomas              email: [EMAIL PROTECTED]
3am Software Foundry     www: http://3am-software.com/bio/matt/
Cupertino, CA            disclaimer: I avow all knowledge of this message.
RE: GCC 4.1: Buildable on GHz machines only?
> -----Original Message----- > From: Matt Thomas > Sent: Tuesday, April 26, 2005 10:42 PM [...] > > Alas, the --disable-checking and STAGE1_CFLAGS="-O2 -g" (which I was > already doing) only decreased the bootstrap time by 10%. By far, the > longest bit of the bootstrap is building libjava. > Is it fair to compare current build times, with libjava included, against past build times when it didn't exist? Would a closer apples-to-apples comparison be to bootstrap GCC Core only on the older sub-GHz platforms?
Re: GCC 4.1: Buildable on GHz machines only?
Matt Thomas <[EMAIL PROTECTED]> writes: > I have a 50MHz 68060 with 96MB of memory (MVME177) approaching 100 hours > (48 hours just to exit stage3 and start on the libraries) doing a bootstrap > knowing that it's going to die when doing the ranlib of libjava. The kernel > for the 060 isn't configured with a large enough dataspace to complete the > ranlib. If so, that is particularly irritating since the ranlib is completely unnecessary. GNU ar always builds an archive symbol table by default. Ian
Re: GCC 4.1: Buildable on GHz machines only?
Gary Funck wrote: > >>-----Original Message----- >>From: Matt Thomas >>Sent: Tuesday, April 26, 2005 10:42 PM > > [...] > >>Alas, the --disable-checking and STAGE1_CFLAGS="-O2 -g" (which I was >>already doing) only decreased the bootstrap time by 10%. By far, the >>longest bit of the bootstrap is building libjava. >> > > > Is it fair to compare current build times, with libjava included, > against past build times when it didn't exist? Would a closer > apples-to-apples comparison be to bootstrap GCC Core only on > the older sub-GHz platforms? libjava is built on everything but vax and mips. Bootstrapping core might be better, but due to the configure on the fly it's not as easy as it used to be. It would be nice if bootstrap emitted timestamps when it was started and when it completed a stage so one could just look at the make output. Regardless, GCC 4.1 is a computational pig. -- Matt Thomas email: [EMAIL PROTECTED] 3am Software Foundry www: http://3am-software.com/bio/matt/ Cupertino, CA disclaimer: I avow all knowledge of this message.
Re: GCC 4.1: Buildable on GHz machines only?
Matt Thomas <[EMAIL PROTECTED]> writes: > libjava is built on everything but vax and mips. Bootstrapping core > might be better but do the configure on the fly it's not as easy as > it used to be. --enable-languages=c,c++ (or even perhaps --enable-languages=c) doesn't work for you? Also, I believe "make all-gcc TARGET-gcc=bootstrap" will bootstrap the compiler normally but not build the runtime libraries. > It would be nice if bootstrap emitted timestamps when it was started > and when it completed a stage so one could just look at the make output. Patches are welcome. zw
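Until such a patch exists, a rough workaround is to timestamp the make output from outside. The following is a minimal sketch, assuming Python is available on the build host; the script name stamp.py and the function stamp_lines are made up here. It does not know anything about stages: it only prefixes every output line with the seconds elapsed since the filter started, so stage boundaries still have to be found by reading the log.

```python
import sys
import time


def stamp_lines(lines, clock=time.monotonic):
    """Yield each input line prefixed with seconds elapsed since start."""
    start = clock()
    for line in lines:
        yield "[%.1fs] %s" % (clock() - start, line.rstrip("\n"))


if __name__ == "__main__":
    # Assumed usage: make bootstrap 2>&1 | python3 stamp.py
    for stamped in stamp_lines(sys.stdin):
        print(stamped, flush=True)
```

The clock is passed in as a parameter only so the filter can be exercised with a fake clock; in normal use the default monotonic clock is what you want, since it is unaffected by the system time being changed during a multi-day build.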
folding after TER notes
Just some notes I've gathered on folding statements modified by TER...

First, a lot of the changes made do not affect the code we generate in a meaningful way. That's because a lot of the changes merely reorder operands in conditionals, arithmetic expressions and the like. For example, after TER we might have something like

  if (a->b != c) ...

Folding will rewrite that into

  if (c != a->b)

[ In general we will try to have _DECL nodes as the first operand and expressions as the second operand to binary operators. ]

I've disabled the code in fold which rearranges operands so that I can better evaluate the real effects of folding expressions resulting from TER.

There are definitely cases where folding after TER does result in simpler trees and sometimes better code generation. For example, I've seen folding after TER change stuff like this:

  t = *&t->exp.operands[0];

into something like this:

  t = t->exp.operands[0];

This is by far the most common effect I'm seeing for folding after TER. It's happening enough that I suspect we're missing propagations earlier in our SSA optimization path. I don't think it's really advisable to evaluate folding after TER without first looking at why we're seeing so many *& expressions showing up after TER.

In a few places we change conditionals like

  (x & 128)

turns into:

  (signed char)x < 0

I'm not sure if that's really an improvement or not, but it's something that I'm seeing relatively often.

In a few places we're able to kill some casts. For example:

  ! pretmp.428 = (int) (reg_class) D.18943;

might turn into:

  ! pretmp.428 = D.18943;

Or in some cases we remove redundant operations:

  ! *p = (char) (val & 255);

Turns into:

  ! *p = (char) val;

In a small number of cases we turn pointer arithmetic into array indexing:

  ! bodyp = &D.20390->exp.operands[0] + 4B;

Is folded into:

  ! bodyp = &D.20390->exp.operands[1];

This is something I would have expected to happen before TER.

In a few places we're turning simple conditionals like

  x->bitfield != 0

into

  (BIT_FIELD_REF (...) & ) != 0

Which may or may not be a good thing to do.

Sometimes we get multiple transformations. For example, turning pointer arithmetic into array indexing and removal of *& occur relatively often together in the same expression.

Anyway, I'm going to look into why we're seeing so many *& expressions during TER. It just seems to me that such expressions should have been exposed and optimized before TER and that we might get some secondary benefits from fixing this oversight.

Jeff
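As a sanity check on two of the folds in the notes above, here is a small model, written in Python rather than C so it can be run anywhere. It assumes 8-bit two's-complement chars and uses ctypes.c_int8 to stand in for the C conversions; the function names are made up for this sketch. It checks that testing bit 7 with a mask agrees with the sign test after narrowing, and that masking with 255 before a cast to char is redundant.

```python
import ctypes


def sign_bit_by_mask(x):
    """Model of the C test (x & 128) != 0 for an 8-bit value x."""
    return (x & 128) != 0


def sign_bit_by_cast(x):
    """Model of the C test (signed char) x < 0, using a two's-complement int8."""
    return ctypes.c_int8(x).value < 0


def narrow_to_char(val):
    """Model of the C cast (char) val, assuming an 8-bit signed char."""
    return ctypes.c_int8(val).value


if __name__ == "__main__":
    # Both forms of the sign-bit test agree for every 8-bit value.
    assert all(sign_bit_by_mask(x) == sign_bit_by_cast(x) for x in range(256))
    # Masking with 255 before narrowing never changes the narrowed value.
    assert all(narrow_to_char(v & 255) == narrow_to_char(v)
               for v in range(-70000, 70000))
    print("both folds preserve the value in this model")
```

This only shows the rewrites are value-preserving under the stated assumptions; whether the folded form leads to better generated code is a separate question, as the notes say.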
Re: folding after TER notes
Wasn't TER a temporary kludge that should be going away? zw
Re: [RFA] Which is better? More and simpler patterns? Fewer patterns with more embedded code?
> I like the more and simpler patterns approach but I'm wondering what > the general recommendation is? Mostly what I go for in individual insns, though I try to make sure that the lengths are equal and it's something generated by the named patterns. I.e. make sure that the patterns you do have don't have a lot of multiple insns to accomplish a single task, but also make sure that you're generating the insns in the first place :) -eric
Re: GCC 4.1: Buildable on GHz machines only?
Matt Thomas <[EMAIL PROTECTED]> writes:

> Richard Henderson wrote:
> > On Tue, Apr 26, 2005 at 10:57:07PM -0400, Daniel Jacobowitz wrote:
> >
> >>I would expect it to be drastically faster. However this won't show up
> >>clearly in the bootstrap. The, bar none, longest bit of the bootstrap
> >>is building stage2; and stage1 is always built with optimization off and
> >>(IIRC) checking on.
> >
> >
> > Which is why I essentially always supply STAGE1_CFLAGS='-O -g' when
> > building on risc machines.
>
> Alas, the --disable-checking and STAGE1_CFLAGS="-O2 -g" (which I was

I don't think that is enough, also edit gcc/Makefile.in and change the line:

  STAGE1_CHECKING = -DENABLE_CHECKING -DENABLE_ASSERT_CHECKING

to be

  STAGE1_CHECKING =

Is there a better way to do this? STAGE1_CHECKING is not passed from the toplevel make, so one cannot put it on the make bootstrap command line...