Re: hang in acats testsuite test cxg2014 on hppa2.0w-hp-hpux11.00
Rainer Emrich wrote:
> acats test cxg2014 hangs forever on hppa2.0w-hp-hpux11.00.
>
> ,.,. CXG2014 ACATS 2.5 06-02-10 18:02:32
> - CXG2014 Check the accuracy of the SINH and COSH functions.
> * CXG2014 sinh(1) actual: 7.49632E-01 expected:
>     1.17520119364380146E+00 difference:
>     -4.25201193643801825E-01 max err:
>     1.01932454980493402E-18.
>
> Any Ideas ?

I'm not sure about the hang. For the floating point operations, we are
currently using

*** gcc/config/pa/pa.h.ori Tue Mar 30 11:42:04 2004
--- gcc/config/pa/pa.h Tue Mar 30 11:45:20 2004
***************
*** 461,466 ****
--- 461,471 ----
  #define UNITS_PER_WORD (TARGET_64BIT ? 8 : 4)
  #define MIN_UNITS_PER_WORD 4

+ /* The widest floating point format supported by the hardware.  Note that
+    setting this influences some Ada floating point type sizes, currently
+    required for GNAT to operate properly.  */
+ #define WIDEST_HARDWARE_FP_SIZE 64
+

in our GCC 3.4 based tree.

I haven't yet been able to check the situation in 4.X, which is why no
patch submission has been issued as of today. You might still want to
give it a try, though, because I think it is likely to remain applicable.

It probably incurs a difference in size between C and Ada long_doubles,
but this is minor compared to the kind of damage you are observing.

Olivier
Re: Massive FORTRAN test failures
> I have got massive FORTRAN test failures on Linux/ia64 and
> Linux/x86-64:
>
> http://gcc.gnu.org/ml/gcc-testresults/2006-02/msg00730.html
> http://gcc.gnu.org/ml/gcc-testresults/2006-02/msg00729.html
>
> Most of the failures look like:
>
> /net/gnu-13/export/gnu/src/gcc/gcc/gcc/testsuite/gfortran.dg/char_result_11.f90:0:
> internal compiler error: Segmentation fault
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <URL:http://gcc.gnu.org/bugs.html> for instructions.
> compiler exited with status 1

I did see the following on my nightly regtest on x86_64-linux:

FAIL: gfortran.dg/char_transpose_1.f90 -O3 -g (test for excess errors)
WARNING: gfortran.dg/char_transpose_1.f90 -O3 -g compilation failed to produce executable

FX
Re: pruning unused debugging types (enums/PR23336)
Aldy Hernandez wrote:
>> You could combine the two ideas: a global hash table of types used in
>> casts, where each entry had a list of functions using those types. That
>> should take up no more storage than the per-function vectors. Then,
>> you'd have to walk the entire hash table, writing out each type for
>> which at least one of the associated functions was written out,
>> including being inlined into another function.
>
> Do we keep a hash of functions that have been written out somewhere?

You need a list of functions that have been written or scheduled to be
emitted? Each node in the call graph has a field 'output' set to true
when the function is marked for generation.
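The combined scheme quoted above (a global table from cast types to the functions using them, walked once at the end) can be sketched roughly as follows. This is only an illustration in standalone C++, not GCC code: `FunctionNode`, `CastTypeTable` and `types_to_emit` are hypothetical names, and the `output` member merely stands in for the call-graph node field mentioned in the reply.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a call-graph node; 'output' mirrors the flag
// described in the reply (true once the function is marked for generation).
struct FunctionNode {
    std::string name;
    bool output;
};

// Hypothetical global table: type name -> functions that use it in a cast.
using CastTypeTable =
    std::unordered_map<std::string, std::vector<const FunctionNode*>>;

// Walk the whole table and keep every type for which at least one of the
// associated functions is being output. The result is sorted only so that
// the order does not depend on hash-table iteration order.
std::vector<std::string> types_to_emit(const CastTypeTable& table) {
    std::vector<std::string> result;
    for (const auto& entry : table) {
        for (const FunctionNode* fn : entry.second) {
            if (fn->output) {
                result.push_back(entry.first);
                break;  // one emitted user is enough
            }
        }
    }
    std::sort(result.begin(), result.end());
    return result;
}
```

A type used only by functions that are never output is dropped, which is the pruning effect the thread is after. Inlined users would need their `output` flag (or an equivalent) to be set for this to hold.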
Re: PATCH: [4.1/4.2 Regression]: Miscompiled FORTRAN program
Jim Wilson wrote:
> I don't believe this is safe. If you look at the uses of
> regno_clobbered_p in reload.c, the comments clearly indicate that we
> are looking for registers used in clobbers. So unconditionally adding
> code that handles REG_INC notes will break these uses. You have to add
> the REG_INC support the same way that the sets support was added, by
> adding another argument (or reusing the sets argument), and then
> modifying the one place we know is broken (choose_reload_regs) to use
> the new argument (or new sets argument value).

Jim,

Did I understand your idea correctly? Can you comment new patch version.
It isn't fully tested but am I going in right direction?

2006-02-13  Denis Nagorny  <[EMAIL PROTECTED]>

    PR rtl-optimization/25603
    * reload.c (reg_inc_found_and_valid_p): New.  Check REG_INC note.
    (regno_clobbered_p): Use it.  Reusing SETS argument for REG_INC case.
    * reload1.c (choose_reload_regs): Added call of regno_clobbered_p
    with new meaning of SETS.

Index: reload.c
===================================================================
*** reload.c (revision 110905)
--- reload.c (working copy)
*************** static int find_inc_amount (rtx, rtx);
*** 281,286 ****
--- 281,287 ----
  static int refers_to_mem_for_reload_p (rtx);
  static int refers_to_regno_for_reload_p (unsigned int, unsigned int,
                                           rtx, rtx *);
+ static int reg_inc_found_and_valid_p(unsigned int, unsigned int, rtx);

  /* Determine if any secondary reloads are needed for loading (if IN_P is
     nonzero) or storing (if IN_P is zero) X to or from a reload register of
*************** find_inc_amount (rtx x, rtx inced)
*** 6941,6949 ****
    return 0;
  }

  /* Return 1 if register REGNO is the subject of a clobber in insn INSN.
!    If SETS is nonzero, also consider SETs.  REGNO must refer to a hard
!    register.  */

  int
  regno_clobbered_p (unsigned int regno, rtx insn, enum machine_mode mode,
--- 6942,6978 ----
    return 0;
  }

+ /* Return 1 if registers from REGNO to ENDREGNO are the subjects of a
+    REG_INC note in insn INSN.  REGNO must refer to a hard register.  */
+
+ static int
+ reg_inc_found_and_valid_p(unsigned int regno ATTRIBUTE_UNUSED,
+                           unsigned int endregno ATTRIBUTE_UNUSED,
+                           rtx insn ATTRIBUTE_UNUSED)
+ {
+ #ifdef AUTO_INC_DEC
+   rtx link;
+
+   gcc_assert (insn);
+
+   if (! INSN_P (insn))
+     return 0;
+
+   for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
+     if (REG_NOTE_KIND (link) == REG_INC)
+       {
+         unsigned int test = (int) REGNO (XEXP (link, 0));
+         if (test >= regno && test < endregno)
+           return 1;
+       }
+ #endif
+   return 0;
+
+ }
+
  /* Return 1 if register REGNO is the subject of a clobber in insn INSN.
!    If SETS is 1, also consider SETs.  If SETS is 2, enable checking REG_INC.
!    REGNO must refer to a hard register.  */

  int
  regno_clobbered_p (unsigned int regno, rtx insn, enum machine_mode mode,
*************** regno_clobbered_p (unsigned int regno, r
*** 6958,6964 ****
    endregno = regno + nregs;

    if ((GET_CODE (PATTERN (insn)) == CLOBBER
!        || (sets && GET_CODE (PATTERN (insn)) == SET))
        && REG_P (XEXP (PATTERN (insn), 0)))
      {
        unsigned int test = REGNO (XEXP (PATTERN (insn), 0));
--- 6987,6993 ----
    endregno = regno + nregs;

    if ((GET_CODE (PATTERN (insn)) == CLOBBER
!        || (sets == 1 && GET_CODE (PATTERN (insn)) == SET))
        && REG_P (XEXP (PATTERN (insn), 0)))
      {
        unsigned int test = REGNO (XEXP (PATTERN (insn), 0));
*************** regno_clobbered_p (unsigned int regno, r
*** 6966,6971 ****
--- 6995,7003 ----
        return test >= regno && test < endregno;
      }

+   if (sets == 2 && reg_inc_found_and_valid_p(regno, endregno, insn))
+     return 1;
+
    if (GET_CODE (PATTERN (insn)) == PARALLEL)
      {
        int i = XVECLEN (PATTERN (insn), 0) - 1;
*************** regno_clobbered_p (unsigned int regno, r
*** 6974,6980 ****
          {
            rtx elt = XVECEXP (PATTERN (insn), 0, i);
            if ((GET_CODE (elt) == CLOBBER
!                || (sets && GET_CODE (PATTERN (insn)) == SET))
                && REG_P (XEXP (elt, 0)))
              {
                unsigned int test = REGNO (XEXP (elt, 0));
--- 7006,7012 ----
          {
            rtx elt = XVECEXP (PATTERN (insn), 0, i);
            if ((GET_CODE (elt) == CLOBBER
!                || (sets == 1 && GET_CODE (PATTERN (insn)) == SET))
                && REG_P (XEXP (elt, 0)))
              {
                unsigned int test = REGNO (XEXP (elt, 0));
*************** regno_clobbered_p (unsigned int regno, r
*** 6982,6987 ****
--- 7014,7021 ----
                if (test >= regno && test < endregno)
                  return 1;
              }
+           if (sets == 2 && reg_inc_found_and_valid_p(regno, endregno, elt))
+             return 1;
          }
      }

Index: reload1.c
===================================================================
*** reload1.c (revision 110905)
--- reload1.c (wor
Re: Compilation performance comparison of GCC 4.0.1 and GCC 4.1.0 20060210 on MICO 2.3.12 sources
On 2/11/06, Karel Gardas <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> it's been a while since my last comparison of various GCC's compilation
> performance on MICO sources. A lot happened in GCC development since
> then and so I'm here with more up-to-date measurements. This time I've
> used MICO 2.3.12 release sources and again measured time of
> compilation of orb subdirectory files. I've compared my GCC 4.0.1
> release with GCC 4.1.0 20060210 prerelease on AMD64 generating code
> for this platform. Whole tables are below, but overall results are,
> that 4.1.0 is slower on all compilations (sums) while using
> -O0/1/2/3/s optimization levels, actual numbers are:

Apart from compile-time, is there a runtime benchmarking suite for MICO
(or is runtime generally not that interesting here)? Or numbers on
object size, if runtime is not that important. Just to get an idea if
there is anything we get back from the increased compile times.

Thanks,
Richard.
Re: "cscope" type functionality
How do I keep track of the "LTO" design process? I may not be able to
help 'cause I don't know enough about gcc's internals but I'd like to
if possible.

On Feb 13, 2006, at 2:47 PM, Mark Mitchell wrote:
> Gabriel Dos Reis wrote:
>> Tom Tromey <[EMAIL PROTECTED]> writes:
>>
>> [...]
>>
>> | I think it would be more advisable to design something with AST
>> | database generation as an explicit goal.
>>
>> I believe that is a sensible approach, one that I thought a
>> by-product of the "Link Time Optimization" proposal. Such a format
>> will be tremendously useful.
>
> Yes, that was part of the motivation for the LTO design that Kenny, I,
> and others developed: such a database "falls out" of that design, with
> relatively minimal additional effort.
>
> However, whether or not that proposal will be implemented is still an
> open question, dependent on demand, technological evaluation relative
> to other approaches for link-time optimization (notably, LLVM), and
> available resources.
>
> --
> Mark Mitchell
> CodeSourcery
> [EMAIL PROTECTED]
> (650) 331-3385 x713
Re: Massive FORTRAN test failures
On Wed, Feb 15, 2006 at 10:07:42AM +0100, François-Xavier Coudert wrote:
> > I have got massive FORTRAN test failures on Linux/ia64 and

The name of the language is "Fortran" not "FORTRAN".

> I did see the following on my nightly regtest on x86_64-linux:
>
> FAIL: gfortran.dg/char_transpose_1.f90 -O3 -g (test for excess errors)
> WARNING: gfortran.dg/char_transpose_1.f90 -O3 -g compilation failed
> to produce executable

This has been reported. I've posted a backtrace where a gcc_assert is
being triggered because a variable magically goes missing! At least it
is missing in a gdb session.

--
Steve
Recursive Destructors?
I am assuming I am doing something wrong but I am hoping someone can
give me a clue as to where to look.

I'm trying to write an AIX device driver using g++. The drivers do
not have a __start or main, so I have to call the global constructors
and destructors myself. I've written code attempting to make sure
they get constructed before any globals are accessed and no global
access happens after they are destroyed. But my trace shows that the
object that is supposed to do that is getting its destructor called
recursively.

The essence is that I have an automatic variable. The constructor
calls the global constructors; the destructor calls the global
destructors. At lines 1 and 10 you see this variable's destructor
getting called. The second call returns at line 13. The first call
returns at line 23. The second call does nothing because I have already
flipped the bit that says that the global destructors need to be
called. So the second call comes and goes without much fuss. The
only thing unusual about the config_lock class is it has two static
members. I could move those and just have global C statics if
necessary. The static members are an int and a lock_t (C types).

I can write code to cope with this but I assume that this is not
supposed to happen. My fear is that I am trashing something and just
getting lucky that I am not crashing. I first want to make sure (from
comments from this mailing list) that it is not supposed to happen.

The second thing I'm hoping for is some suggestions on what may be
happening. The first call appears to be destroying objects, then
calls itself again, then continues destroying objects, then returns.
Why is it calling itself again? Any ideas?

  1 config_lock_dtor
  2 get_lockl_ctor
  3 get_lockl_ctor ret 0 from line 108
  4 global_dtors
  5 global_dtors 004E
  6 dtor
  7 dtor ret 0 from line 24
  8 dvr_simple_lock_dtor
  9 dvr_simple_lock_dtor ret 0 from line 31
 10 config_lock_dtor
 11 get_lockl_ctor
 12 get_lockl_ctor ret 0 from line 108
 13 config_lock_dtor ret 0 from line 75
 14 get_lockl_dtor
 15 get_lockl_dtor ret 0 from line 116
 16 free ptr 70BDDFC0
 17 free ret 0 from line 32
 18 free ptr 70BCF480
 19 free ret 0 from line 32
 20 free ptr 70BCF820
 21 free ret 0 from line 32
 22 global_dtors ret 0 from line 82
 23 config_lock_dtor ret 0 from line 75

Thank you for your help.
Perry Smith
RE: Recursive Destructors?
On 15 February 2006 15:27, Perry Smith wrote:
> I am assuming I am doing something wrong but I am hoping someone can
> give me a clue as to where to look.
>
> I'm trying to write an AIX device driver using g++.
                                                 ^^^
  Clue: Look here! ;-)

> The drivers do
> not have a __start or main, so I have to call the global constructors
> and destructors myself. I've written code attempting to make sure
> they get constructed before any globals are accessed and no global
> access happens after they are destroyed. But my trace shows that the
> object that is supposed to do that is getting its destructor called
> recursively.

Ah, I see you've already worked out just /why/ C++ is a tricky choice
for system-level code.

> The essence is that I have an automatic variable. The constructor
> calls the global constructors; the destructor calls the global
> destructors. At lines 1 and 10 you see this variable's destructor
> getting called. The second call returns at line 13. The first call
> returns at line 23. The second call does nothing because I have already
> flipped the bit that says that the global destructors need to be
> called. So the second call comes and goes without much fuss. The
> only thing unusual about the config_lock class is it has two static
> members. I could move those and just have global C statics if
> necessary. The static members are an int and a lock_t (C types).
>
> I can write code to cope with this but I assume that this is not
> supposed to happen. My fear is that I am trashing something and just
> getting lucky that I am not crashing. I first want to make sure (from
> comments from this mailing list) that it is not supposed to happen.

What, global destructors that infinitely recurse until the entire stack
is filled? I suppose that's one way to get a process terminated, but I'm
sure it's not the right one. No, this probably should not be happening.
Nor should the presence of a couple of static class members necessitate
calling the d-tor at global d-tor time, although if either of them were
non-PODs they might have had d-tors of their own.

> The second thing I'm hoping for is some suggestions on what may be
> happening. The first call appears to be destroying objects, then
> calls itself again, then continues destroying objects, then returns.

Most likely you've written a bug in your code somewhere.

> Why is it calling itself again? Any ideas?

Well, I would like to suggest that you have inadvertently instantiated a
static instance of your class, and so there is a call to the class d-tor
in amongst the list of d-tors in the .dtors section.

>   1 config_lock_dtor
>   2 get_lockl_ctor
>   3 get_lockl_ctor ret 0 from line 108
>   4 global_dtors
>   5 global_dtors 004E
>   6 dtor
>   7 dtor ret 0 from line 24
>   8 dvr_simple_lock_dtor
>   9 dvr_simple_lock_dtor ret 0 from line 31
>  10 config_lock_dtor
>  11 get_lockl_ctor
>  12 get_lockl_ctor ret 0 from line 108
>  13 config_lock_dtor ret 0 from line 75
>  14 get_lockl_dtor
>  15 get_lockl_dtor ret 0 from line 116
>  16 free ptr 70BDDFC0
>  17 free ret 0 from line 32
>  18 free ptr 70BCF480
>  19 free ret 0 from line 32
>  20 free ptr 70BCF820
>  21 free ret 0 from line 32
>  22 global_dtors ret 0 from line 82
>  23 config_lock_dtor ret 0 from line 75
>
> Thank you for your help.
> Perry Smith

The most useful information would be to print out the value of 'this' in
config_lock_dtor and see if it's the same or a different one each time.
If it's different, you just need to find the static instantiation and
delete it from your program. If it's the same, you must be doing
something wrong in the way that your object's d-tor goes about calling
the global d-tors.

BTW, this is probably a gcc-help@ post really. It's only marginally
related to the internals of gcc.

cheers,
DaveK
--
Can't think of a witty .sigline today
Re: hang in acats testsuite test cxg2014 on hppa2.0w-hp-hpux11.00
> + /* The widest floating point format supported by the hardware.  Note that
> +    setting this influences some Ada floating point type sizes, currently
> +    required for GNAT to operate properly.  */
> + #define WIDEST_HARDWARE_FP_SIZE 64

I missed this "new" define and will try it. Perhaps, this should
take account of the situation when TARGET_SOFT_FLOAT is true. For
example,

#define WIDEST_HARDWARE_FP_SIZE (TARGET_SOFT_FLOAT ? 0 : 64)

Dave
--
J. David Anglin [EMAIL PROTECTED]
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
Re: hang in acats testsuite test cxg2014 on hppa2.0w-hp-hpux11.00
On Feb 15, 2006, at 11:44, John David Anglin wrote:
> I missed this "new" define and will try it. Perhaps, this should
> take account of the situation when TARGET_SOFT_FLOAT is true. For
> example,

When emulating all software floating-point, we still don't want to
use 128-bit floats. The whole idea is that Long_Long_Float is the
widest supported type that will still give reasonable performance.
In many cases this type is used for all computation, and changing
that to a 128-bit type is not a good idea until such a type is
really supported efficiently by hardware.

Accuracy requirements mandated by Annex G of the Ada standard make
it quite difficult to implement this correctly, since all real and
complex elementary functions will need to compute accurate results.
Because 128-bit double extended IEEE floating-point is not supported
yet, this work has not been done, and it will be a tremendous effort,
since there are no good system math libraries yet.

-Geert
Re: hang in acats testsuite test cxg2014 on hppa2.0w-hp-hpux11.00
> On Feb 15, 2006, at 11:44, John David Anglin wrote:
> > I missed this "new" define and will try it. Perhaps, this should
> > take account of the situation when TARGET_SOFT_FLOAT is true. For
> > example,
>
> When emulating all software floating-point, we still don't want to
> use 128-bit floats. The whole idea is that Long_Long_Float is the

Understood. My question was what should the define for
WIDEST_HARDWARE_FP_SIZE be when generating code for a target with no
hardware floating point support (e.g., when TARGET_SOFT_FLOAT is true)?

Besides the accuracy issue, it has occurred to me that the DWARF2 EH
support probably can't unwind successfully from exceptions thrown by
the HP quad IEEE routines.

Dave
--
J. David Anglin [EMAIL PROTECTED]
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
Re: hang in acats testsuite test cxg2014 on hppa2.0w-hp-hpux11.00
On Feb 15, 2006, at 13:28, John David Anglin wrote:
> Understood. My question was what should the define for
> WIDEST_HARDWARE_FP_SIZE be when generating code for a target with no
> hardware floating point support (e.g., when TARGET_SOFT_FLOAT is true)?

Practically, I'd say it should be 64, as it's a bit of a universal
assumption that you at least have 32-bit and 64-bit float types, and
possibly an 80-bit one (formatted up to 128 bits). Of course, the idea
with soft float is not to reflect reality, but rather to have a
reasonable match with expectations of the software you'd want to run.
Re: Design a microcontroller for gcc
DJ Delorie wrote:
>> So If I use 16 bits registers, do I have to handle pairs of them to form
>> 32 bits ?
>
> Well, you don't *have* to if your word size is only 16 bits. GCC will
> still pair them, but you'll need to tell gcc how to split them back up
> for the opcodes you have available.
>
> Note that there are some operations that gcc assumes you have 32-bit
> opcodes for, though. Or at least insns that emulate it.

Like what ?

> You really want to have native support for sizeof(int) values and
> sizeof(void *) values, bigger things can be emulated or broken up.

Ok, then so it will be ;)

Anyway, thanks for all the advice. I'll try to come up with a good
instruction set (both in regards to efficient implementation and
efficient for running code).

Sylvain
Re: Design a microcontroller for gcc
> > Note that there are some operations that gcc assumes you have 32-bit > > opcodes for, though. Or at least insns that emulate it. > > Like what ? Well, cmpsi2 for example. and divsi2.
Re: Design a microcontroller for gcc
On Wednesday 15 February 2006 20:06, DJ Delorie wrote: > > > Note that there are some operations that gcc assumes you have 32-bit > > > opcodes for, though. Or at least insns that emulate it. > > > > Like what ? > > Well, cmpsi2 for example. and divsi2. You mean divsi3? Many targets don't have div at all. Paul
Re: Recursive Destructors?
On Feb 15, 2006, at 7:27 AM, Perry Smith wrote:
> I am assuming I am doing something wrong but I am hoping someone can
> give me a clue as to where to look.

I'd fire up a debugger and type "up" a couple of times from a
breakpoint in the dtor.

If you want to randomly try things, if you inserted code to call the
dtors, remove it, and then see if everything then works.

Or, you should be able to statically analyze something like:

  int bar = 0;
  struct S { ~S() { ++bar; } } a;

if you'd rather do that.
Re: Design a microcontroller for gcc
> > Well, cmpsi2 for example. and divsi2. > > You mean divsi3? Many targets don't have div at all. Er, right. divsi3.
Re: PATCH: [4.1/4.2 Regression]: Miscompiled FORTRAN program
On Wed, 2006-02-15 at 05:52, Denis Nagorny wrote:
> Did I understand your idea correctly? Can you comment new patch version.
> It isn't fully tested but am I going in right direction?

Yes, that is what I was suggesting.

In choose_reload_regs, you only need one regno_clobbered_p call, with
sets == 2, as that will check for both clobbers and REG_INC notes.

There are some indentation problems with the patch. It looks like maybe
you have inconsistent usage of spaces/tabs for indentation, and/or that
your mailer converted the tabs to spaces. You probably should put
patches in a MIME attachment.

In both the reg_inc_found_and_valid_p declaration and definition, there
should be a space before the open parenthesis. Also, the two calls in
regno_clobbered_p have the same problem.

At the end of the reg_inc_found_and_valid_p, there is a spurious blank
line.
--
Jim Wilson, GNU Tools Support, http://www.specifix.com
Re: Design a microcontroller for gcc
On Wed, 15 Feb 2006, Sylvain Munaut wrote:
> * 2 flags Carry & Zero for testing.

I think most of your questions have been answered, so let me
just add that if nothing else, the port will be much simplified
if you make sure that only specific compare instructions set
condition codes, i.e. not as a nice side-effect of move, add and
sub - or at least make such condition-code side-effects
optional. It depends on too many undisclosed details like
pipeline restrictions to say whether performance is generally
better or worse, but I can tell for sure that the GCC port will
be simpler with a specific set of condition-code setting insns.

BTW, it depends on the compare (and branch) instructions whether
just two flags are sufficient.

brgds, H-P
Re: Recursive Destructors?
Thanks guys. I was misusing a shared_ptr. I had two shared_ptr's
pointing to the same object (that didn't know about each other).

I apologize if this was more appropriate for gcc-help. Some of the
details of how dtors get called are still magic to me and I thought it
may be gcc specific.

On Feb 15, 2006, at 2:40 PM, Mike Stump wrote:
> On Feb 15, 2006, at 7:27 AM, Perry Smith wrote:
>> I am assuming I am doing something wrong but I am hoping someone can
>> give me a clue as to where to look.
>
> I'd fire up a debugger and type "up" a couple of times from a
> breakpoint in the dtor.
>
> If you want to randomly try things, if you inserted code to call the
> dtors, remove it, and then see if everything then works.
>
> Or, you should be able to statically analyze something like:
>
>   int bar = 0;
>   struct S { ~S() { ++bar; } } a;
>
> if you'd rather do that.
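The failure mode described here (two shared_ptrs that own the same object without knowing about each other) can be reproduced safely with a deleter that only counts. This is a sketch under assumptions: it uses C++11 `std::shared_ptr` rather than whichever shared_ptr implementation the driver used, and `Lock`, `CountingDeleter` and both function names are invented for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the driver's lock object.
struct Lock {
    int id;
};

// Deleter that only counts how often it is asked to destroy the object.
// It deliberately does NOT delete anything, so the demonstration stays
// well defined even when the "destruction" happens twice.
struct CountingDeleter {
    int* calls;
    void operator()(Lock*) const { ++*calls; }
};

// Two shared_ptrs built independently from the same raw pointer: each
// gets its own control block, so the deleter runs once per owner.
int deleter_calls_with_independent_owners() {
    int calls = 0;
    Lock lock{1};
    {
        std::shared_ptr<Lock> a(&lock, CountingDeleter{&calls});
        std::shared_ptr<Lock> b(&lock, CountingDeleter{&calls});
        assert(a.use_count() == 1 && b.use_count() == 1);
    }
    return calls;
}

// The second owner is a copy of the first: one shared control block,
// and a single deleter call when the last copy goes away.
int deleter_calls_with_shared_owner() {
    int calls = 0;
    Lock lock{1};
    {
        std::shared_ptr<Lock> a(&lock, CountingDeleter{&calls});
        std::shared_ptr<Lock> b = a;
        assert(a.use_count() == 2);
    }
    return calls;
}
```

With a real deleter the first variant would destroy the object twice, which matches the repeated destructor entry in the trace; copying an existing shared_ptr avoids it.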
Re: Design a microcontroller for gcc
Hans-Peter Nilsson wrote:
> On Wed, 15 Feb 2006, Sylvain Munaut wrote:
>> * 2 flags Carry & Zero for testing.
>
> BTW, it depends on the compare (and branch) instructions whether
> just two flags are sufficient.

That's true. MIPS for example seems to get by with no flags, although
it makes multi-word addition/subtraction sequences a bit large.

David Daney
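To illustrate why multi-word addition gets a bit larger without a carry flag, here is a small sketch of the usual approach: the carry out of the low word is recovered with an unsigned comparison (the role `sltu` plays on MIPS) and then added into the high word. The `Pair32` type and the function name are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit value held as two 32-bit words, as on a 32-bit machine.
struct Pair32 {
    std::uint32_t hi;
    std::uint32_t lo;
};

// 64-bit addition using only 32-bit operations and no carry flag.
Pair32 add64_without_carry_flag(Pair32 a, Pair32 b) {
    Pair32 r;
    r.lo = a.lo + b.lo;  // wraps modulo 2^32
    // If the low sum wrapped, it is smaller than either operand; that
    // comparison yields the carry as an ordinary 0/1 value.
    std::uint32_t carry = (r.lo < a.lo) ? 1u : 0u;
    r.hi = a.hi + b.hi + carry;
    return r;
}
```

Compared with an add/add-with-carry pair, this costs one extra compare and one extra add per additional word.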
Re: CAN_ELIMINATE question
Denis Chertykov <[EMAIL PROTECTED]> writes:

> Code fragment from reload1.c: reload()
>
>       if (caller_save_needed)
>         setup_save_areas ();
>
>       /* If we allocated another stack slot, redo elimination bookkeeping.  */
>       if (starting_frame_size != get_frame_size ())
>         continue;
>
>       if (caller_save_needed)
>         {
>           save_call_clobbered_regs ();
>           /* That might have allocated new insn_chain structures.  */
>           reload_firstobj = obstack_alloc (&reload_obstack, 0);
>         }
>
>       calculate_needs_all_insns (global);
>
>
> Call to setup_save_areas () can change frame size but only offsets on
> eliminable registers will be changed before call to calculate_needs_all_insns.
> calculate_needs_all_insns will calculate wrong needs for elimination.
>
> Example for AVR:
>
> avr target can eliminate fp -> sp only if get_frame_size () == 0.
>
> Before call to setup_save_areas() frame size was 0
> (CAN_ELIMINATE (FP,SP) != 0)
>
> setup_save_areas() increase frame size.
> set_initial_elim_offsets() correct offsets but can_eliminate isn't changed.
>
> save_call_clobbered_regs () emit save insn
> (insn 659 161 162 16 (set (mem/c:HI (plus:HI (reg/f:HI 28 r28) # it's FP
>                 (const_int 1 [0x1])) [29 S2 A8])
>         (reg:HI 24 r24)) 12 {*movhi} (nil)
>     (nil))
>
> calculate_needs_all_insns() try to eliminate (reg/f:HI 28 r28) to SP.
> It's wrong because get_frame_size () != 0 and CAN_ELIMINATE (FP,SP) == 0

But then we'll call update_eliminables(), notice that something
changed, and go around the loop again.

There may be a bug here, but it needs more explanation.

> I think that better to call update_eliminables() somewhere after
> setup_save_areas()

Exactly. We do that. About 15 lines after the lines you quoted above.

What am I missing?

Ian
Re: Design a microcontroller for gcc
Hans-Peter Nilsson wrote:
> On Wed, 15 Feb 2006, Sylvain Munaut wrote:
>
>> * 2 flags Carry & Zero for testing.
>
> I think most of your questions have been answered, so let me
> just add that if nothing else, the port will be much simplified
> if you make sure that only specific compare instructions set
> condition codes, i.e. not as a nice side-effect of move, add and
> sub - or at least make such condition-code side-effects
> optional. It depends on too many undisclosed details like
> pipeline restrictions to say whether performance is generally
> better or worse, but I can tell for sure that the GCC port will
> be simpler with a specific set of condition-code setting insns.

Making it optional is neither hard nor expensive in hardware; the
problem is that my opcodes need to be 18 bits and I won't have space to
stuff in another option bit ...

What I was thinking for the moment was to have:
 - sign is always the msb of the last ALU output
 - add/sub modify all flags
 - move/xor/and/not/or only affect zero (and sign)
 - shift operations always affect carry and zero
 - Have some specific instructions like compare and test, but these
   would only operate on registers (and not on immediates)

What's so bad about having the flag as side-effects? Here it's a simple
MCU, it doesn't have a very long pipeline and that pipeline is 'almost'
invisible to the end-user, except for memory fetch and io/access ...

> BTW, it depends on the compare (and branch) instructions whether
> just two flags are sufficient.

Adding Sign and Overflow is pretty easy. And the compare
instruction/logic path shouldn't be a problem either.

MIPS has no flags??? How does branching work?

Finally, about immediates, I'm thinking that an instruction like add
could have 4 different forms:

  add rD, rA, rB
  add rA, rA, imm
  add rA, rA, imm<<8
  add rA, rA, signextend(imm)

Is that kind of manipulation of the immediate well understood by gcc
internals? Or maybe just allow immediates in the mov, but that seems
like a big penalty ...

Sylvain
Re: Design a microcontroller for gcc
> What's so bad about having the flag as side-effects?

You can't put any other insn between the compare and the jump. Like,
if you wanted to move an address into a register to do the jump, you'd
lose the condition bits.

The advantage of having most insns set flags is that you can sometimes
avoid the compare completely.

> MIPS has no flags??? How does branching work?

Some chips combine the compare and jump into one insn, like
"jeq $r0,4,label".

> Or maybe just allow immediates in the mov, but that seems like a big
> penalty ...

Most RISC chips have more move insns than other opcodes. So, you'd
have two adds (register and sign- or zero-extended immediate), and a
variety of moves (lower, upper, extended, etc). You have to think
about what kind of constants are going to be common in your software,
and plan accordingly.
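One way to "think about what kind of constants are going to be common" is to classify candidate constants by which immediate form could encode them. The sketch below assumes an 8-bit immediate field and 16-bit registers; neither width is stated in the thread (the `imm<<8` form merely hints at it), and the enum and function names are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// The three immediate forms proposed earlier in the thread, plus a
// fallback for constants that none of them can encode on their own.
enum class ImmForm { ZeroExtended, ShiftedBy8, SignExtended, NeedsTwoInsns };

// Classify a 16-bit constant, ASSUMING an 8-bit immediate field.
ImmForm classify_constant(std::uint16_t value) {
    if (value <= 0x00FFu)
        return ImmForm::ZeroExtended;   // 0x0000..0x00FF: plain imm
    if ((value & 0x00FFu) == 0)
        return ImmForm::ShiftedBy8;     // 0xNN00: imm<<8
    if (value >= 0xFF80u)
        return ImmForm::SignExtended;   // -128..-1: signextend(imm)
    return ImmForm::NeedsTwoInsns;      // e.g. load upper, then add lower
}
```

Running such a classifier over the constants in representative code would show how often the fallback case is hit, and therefore whether the extra forms are worth their opcode space.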
Re: Design a microcontroller for gcc
On Thu, 16 Feb 2006, Sylvain Munaut wrote:
> What I was thinking for the moment was to have:
>  - sign is always the msb of the last ALU output
>  - add/sub modify all flags
>  - move/xor/and/not/or only affect zero (and sign)
>  - shift operations always affect carry and zero
>  - Have some specific instructions like compare and test, but these
>    would only operate on registers (and not on immediates)

No, really. Just use compare insns. (And perhaps some way for
carry propagation for multi-word add/sub, if that mechanism
interferes. BTW, carry-out from shifts is very rarely used in
compiled code.)

> What's so bad about having the flag as side-effects?

Besides what DJ said about performance (both pros and cons
there), the problem is as I said with port complexity, because
of the way you have to handle condition codes in gcc. (_Should_
now, _have_to_ in the future -- or actually now, as you say you
need scheduling.) Flag setting really should be explicit, so
with your way, you have to show that add and sub etc. also set
condition codes. And that's where you notice the complexity in
the port, because (partly because of gcc peculiarities) unless
you want to lose performance-wise, you need to show that most of
the time, the flag register result is just clobbered by those
operations and not used.

Anyway, at least keep a way to add reg+reg and reg+integer, load
and store of memory and load of integer and address without
condition code effects and your port has a chance to avoid the
related bloat. Sorry, I won't spend the time to spell out the
details.

Whatever: if you're determined on your way to do it and won't
take advice you asked for, by all means feel free. You _have_
been warned, though.

brgds, H-P
Re: Design a microcontroller for gcc
> BTW, carry-out from shifts is very rarely used in compiled code.)

Unless you've expanded SI shifts into a pair of HI shifts.

> Besides what DJ said about performance (both pros and cons
> there), the problem is as I said with port complexity, because
> of the way you have to handle condition codes in gcc.

Unless you tell gcc that the condition codes are a hard register?

That's what m32c does; it has separate cmp/jmp and most opcodes set
flags, so I just set an attribute that says which flags are set by
each insn. Then, I can add a reorg pass to delete the cmps if the
previous insn that set the flags happened to set them right.

> Anyway, at least keep a way to add reg+reg and reg+integer, load and
> store of memory and load of integer and address without condition
> code effects and your port has a chance to avoid the related bloat.

At least, move/load/store shouldn't touch flags.
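The case mentioned here, a 32-bit (SI) shift expanded into a pair of 16-bit (HI) shifts, is where the carry-out of a shift does get used: the bit shifted out of the low half has to be fed into the high half. A rough sketch, with `Halves` and both function names being illustrative only:

```cpp
#include <cassert>
#include <cstdint>

// A 32-bit value held in two 16-bit registers.
struct Halves {
    std::uint16_t hi;
    std::uint16_t lo;
};

// Shift a 32-bit value left by one using two 16-bit shifts. The bit
// leaving the low half plays the role of the carry flag and is shifted
// into the high half (shift, then rotate-through-carry in hardware).
Halves shift_left_one(Halves v) {
    std::uint16_t carry = static_cast<std::uint16_t>((v.lo >> 15) & 1u);
    Halves r;
    r.lo = static_cast<std::uint16_t>(v.lo << 1);
    r.hi = static_cast<std::uint16_t>((v.hi << 1) | carry);
    return r;
}

// Helper to compare against an ordinary 32-bit shift.
std::uint32_t to_u32(Halves v) {
    return (static_cast<std::uint32_t>(v.hi) << 16) | v.lo;
}
```

Without a carry-out from the first shift, the port would need an extra instruction or two to extract bit 15 of the low half before shifting.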
Re: Design a microcontroller for gcc
On Wed, 15 Feb 2006, DJ Delorie wrote:
> > BTW, carry-out from shifts is very rarely used in compiled code.)
> Unless you've expanded SI shifts into a pair of HI shifts.
>
> > Besides what DJ said about performance (both pros and cons
> > there), the problem is as I said with port complexity, because
> > of the way you have to handle condition codes in gcc.
>
> Unless you tell gcc that the condition codes are a hard register?

No "unless" here. You either have a clobber or a set in a
parallel with the main feature, and you lose out on all the
single_set-directed optimizations if you put in a "set" early.

> That's what m32c does; it has separate cmp/jmp and most opcodes set
> flags, so I just set an attribute that says which flags are set by
> each insn. Then, I can add a reorg pass to delete the cmps if the
> previous insn that set the flags happened to set them right.

A machine dependent reorg pass isn't something I'd recommend
given that there are other possibilities. FWIW, I use
peephole2s and condition code modes in CRIS w.i.p. Works ok,
except for all the things that don't like insns with parallels
that I have to weed out to get performance on par with the cc0
representation.

brgds, H-P
Re: Design a microcontroller for gcc
On Wed, 15 Feb 2006, Hans-Peter Nilsson wrote:
> FWIW, I use
> peephole2s and condition code modes in CRIS w.i.p.

...and cbranch (cc setter + user in one combined insn) which are split after reload.

brgds, H-P
Re: Design a microcontroller for gcc
> No "unless" here. You either have a clobber or a set in a parallel
> with the main feature, and you lose out on all the
> single_set-directed optimizations if you put in a "set" early.

"Oh, crap"

I hope I can stick with my cmp/jmp model and manage them myself still, though, because there's a LOT of patterns in m32c where the set of flags affected depends on which alternative you select, and most patterns affect the flags in some (usually nonorthogonal) way.

Or is gcc going to start putting things between the cmp and jmp?
Re: Design a microcontroller for gcc
> ...and cbranch (cc setter + user in one combined insn) which are
> split after reload.

I have the cbranch and split, but allow it before reload. So far that hasn't been a problem, although I split it only to delete the cmp if I can.
Re: Design a microcontroller for gcc
On Wed, 15 Feb 2006, DJ Delorie wrote:
> I hope I can stick with my cmp/jmp model and manage them myself still,
> though, because there's a LOT of patterns in m32c where the set of
> flags affected depends on which alternative you select, and most
> patterns affect the flags in some (usually nonorthogonal) way.

Unless I'm delirious (it's way past bedtime) I see a m32c port and it's cc0-free. Is there a problem?

> Or is gcc going to start putting things between the cmp and jmp?

Yes. At least reload wants to do that. The choice a port has is to either have cc-free reload insns (like i386) or keep the cc setter and user combined at least until after reload (cbranch, but you don't have to use the cbranchM4 name; you can do the combination to a cbranch-type insn in the CC user). Not my idea, so it's probably sane. :-)

brgds, H-P

PS. There may be other choices, but none that caught my attention.
Re: Design a microcontroller for gcc
On Wed, 15 Feb 2006, Hans-Peter Nilsson wrote:
> Unless I'm delirious (it's way past bedtime) I see a m32c port
> and it's cc0-free. Is there a problem?

I see, in the code in svn trunk the compares aren't optimized away yet. You must be having a lot of fun right now. ;-)

brgds, H-P
Re: Design a microcontroller for gcc
> Unless I'm delirious (it's way past bedtime) I see a m32c port
> and it's cc0-free. Is there a problem?

m32c has a separate $flg register defined, not a cc0 port. Hence, this pattern:

(define_insn_and_split "cbranch4"
  [(set (pc)
        (if_then_else
          (match_operator 0 "m32c_cmp_operator"
            [(match_operand:QHPSI 1 "mra_operand" "RraSd")
             (match_operand:QHPSI 2 "mrai_operand" "iRraSd")])
          (label_ref (match_operand 3 "" ""))
          (pc)))]
  ""
  "#"
  ""
  [(set (reg:CC FLG_REGNO)
        (compare (match_dup 1) (match_dup 2)))
   (set (pc)
        (if_then_else (match_dup 4)
                      (label_ref (match_dup 3))
                      (pc)))]
  "operands[4] = m32c_cmp_flg_0 (operands[0]);"
)
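To make the cbranch pattern above and the earlier point about reload more concrete, here is a toy model in Python of splitting a combined compare-and-branch into a flag-setting compare plus a flag-using jump. Nothing here is GCC code: the tuple encoding, `split_cbranch`, `flags_intact`, and the `FLAG_CLOBBERING` set are hypothetical, and the inserted `add` merely stands in for whatever flag-clobbering insn reload might emit.

```python
# Opcodes assumed (for this toy target only) to overwrite the flags.
FLAG_CLOBBERING = {"add", "sub"}


def split_cbranch(insns):
    """Split each combined ("cbranch", cond, a, b, label) into cmp + jcc."""
    out = []
    for insn in insns:
        if insn[0] == "cbranch":
            _, cond, a, b, label = insn
            out.append(("cmp", a, b))         # sets the flags register
            out.append(("jcc", cond, label))  # reads only the flags register
        else:
            out.append(insn)
    return out


def flags_intact(insns):
    """True if every jcc sees flags from a cmp with no clobber in between."""
    valid = False  # do the flags currently hold a cmp result?
    for insn in insns:
        op = insn[0]
        if op == "cmp":
            valid = True
        elif op in FLAG_CLOBBERING:
            valid = False
        elif op == "jcc" and not valid:
            return False
    return True


combined = [("mov", "r1", "r2"), ("cbranch", "eq", "r0", "r1", "L1")]
reload_insn = ("add", "r1", "sp")  # hypothetical flag-clobbering reload insn

# Split late: reload can only place its insn next to the combined insn,
# so it cannot land between the compare and the jump.
late = split_cbranch(combined[:1] + [reload_insn] + combined[1:])

# Split early: the compare and jump are already separate insns, so an
# inserted insn may end up between them.
early = split_cbranch(combined)
early = early[:2] + [reload_insn] + early[2:]
```

Here `flags_intact(late)` holds and `flags_intact(early)` does not. This is the toy version of the choice described in the thread: either the insns reload emits must not clobber flags, or the setter and user stay combined until reload is done.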
Re: Design a microcontroller for gcc
> I see, in the code in svn trunk the compares aren't optimized away
> yet. You must be having a lot of fun right now. ;-)

*That* is an understatement. Unfortunately, reload hates me (see archives for that thread) so I can't commit anything yet.
Re: "cscope" type functionality
Perry Smith <[EMAIL PROTECTED]> writes:
> How do I keep track of the "LTO" design process? I may not be able
> to help 'cause I don't know enough about gcc's internals but I'd like
> to if possible.

You read the mailing list [EMAIL PROTECTED] So you're already doing the right thing.

Ian