Re: revamping synth_mult()
On 7/19/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> Folks,
>
> I'm currently looking at substantively revamping synth_mult(), the gcc
> routine for reducing multiplicative constants to shift/add/sub
> sequences.
>
> My perception here, from experimentation, is that synth_mult() is:
>
> 1. slow (deeply recursive)
> 2. a bag of tricks, including factoring (factoring is hard)
> 3. merely fair to mediocre in its results
>
> Others may think differently.
>
> I have also been able to determine, from experimentation, that a
> "pretty good" result is readily achievable with linear effort, and
> furthermore that a "near optimal" or "optimal" result is often only one
> shift/add pair shorter than the "pretty good" result, but obtaining
> that near optimal result is non-linear in computational complexity.
>
> Maybe it isn't worth it. Maybe we enumerate the oddball cases that fall
> out (i.e. the ones that are notably poorly coded after application of a
> systematic algorithm).
>
> A search of the literature indicates that a small set of such
> systematic algorithms, of which factoring is one, has been applied to
> the problem.
>
> I think my objective here is to do, in a systematic way, at least as
> well as the existing algorithm in the average case and better in the
> worst case, and to do it much faster.
>
> The code improvement probably won't make a measurable difference on
> standard benchmarks.
>
> If anyone has work in progress in this area, or a bug/tweak/suggestion,
> I'd be happy to hear about it. I'm not yet subscribed to the
> gcc@gcc.gnu.org mailing list, so copy me on any direct reply.

There is the pending

  http://gcc.gnu.org/ml/gcc-patches/2006-05/msg00313.html

which improves multiplication sequences.

Richard.
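As a rough illustration of the "linear effort" versus "one shift/add pair fewer" trade-off discussed above, here is a minimal C sketch. It is not GCC's synth_mult() and not necessarily the algorithm the poster has in mind; the function names are made up for this example. The first routine uses the plain binary method (one add per set bit of the constant), the second uses non-adjacent-form recoding so that runs of ones collapse into one subtract and one add.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GCC: multiply x by the constant c
   using shifts and adds only (binary method).  Returns x * c modulo
   2^32 and stores the number of shifted terms used in *terms.  */
static uint32_t
mul_shift_add (uint32_t x, uint32_t c, unsigned *terms)
{
  uint32_t acc = 0;
  unsigned bit;

  *terms = 0;
  for (bit = 0; c != 0; bit++, c >>= 1)
    if (c & 1)
      {
        acc += x << bit;        /* bit is at most 31 here */
        (*terms)++;
      }
  return acc;
}

/* Hypothetical helper, not part of GCC: same product, but the constant
   is recoded into non-adjacent form, so digits are +1, 0 or -1 and a
   run of ones such as 15 = 16 - 1 costs two terms instead of four.  */
static uint32_t
mul_shift_addsub (uint32_t x, uint32_t c, unsigned *terms)
{
  uint32_t acc = 0;
  uint64_t v = c;               /* 64 bits: v + 1 may reach 2^32 */
  unsigned bit = 0;

  *terms = 0;
  while (v != 0)
    {
      if (v & 1)
        {
          /* Shift in 64 bits so that bit == 32 is well defined.  */
          uint32_t term = (uint32_t) ((uint64_t) x << bit);

          if ((v & 3) == 3)
            {
              acc -= term;      /* digit -1 */
              v += 1;
            }
          else
            {
              acc += term;      /* digit +1 */
              v -= 1;
            }
          (*terms)++;
        }
      v >>= 1;
      bit++;
    }
  return acc;
}
```

Both routines run in time linear in the width of the constant. Finding the truly shortest sequence (for instance by also trying factorizations) is where the search becomes expensive, which is the trade-off the message describes.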
Indexed address problem
Hello,

We are trying to write a new backend for GCC. The target machine is
16-bit, with 24-bit pointers. However, the indexed addressing mode has a
24-bit base and a 16-bit index, so we want to generate RTXs such as

  (mem:QI (plus:SI (reg:HI )
                   (reg:SI )))

  (mem:HI (plus:SI (mult:SI (reg:HI )
                            (const_int 2))
                   (reg:SI )))

etc.

But I've not managed to get this to work reliably. Can this be done
relatively painlessly? Also, we are currently working on a gcc 3.4
branch; are there any changes in gcc 4.x which would make this easier?

Thanks for your help,
Saajan.

This email is from Cambridge Consultants Limited, Science Park, Milton
Road, Cambridge CB4 0DW with registered number 1036298 England. It may
contain confidential information. It is intended for the addressee only
and may not be copied or disclosed to any third party without our
permission. If you are not the intended recipient please contact the
sender as soon as possible and delete the material from any computer. If
this email has been sent as a personal message to the addressee, the
sender is not acting in his/her capacity as an employee or officer of
Cambridge Consultants Limited and no liability is accepted for the
content of any such email. Outgoing email may be monitored for the
purpose of ensuring compliance with our email policy and relevant laws.
JIT exception handling
This is just to tell you that now it is working.

I have succeeded in making my JIT generate the right tables for gcc.

Both gcc 4.1 and gcc 3.3 seem to work OK. Can anyone confirm this?

There isn't any difference between gcc-3.x and gcc-4.x at this level,
is there?

jacob
Re: JIT exception handling
jacob navia writes:
> This is just to tell you that now it is working.
>
> I have succeeded in making my JIT generate the right tables for gcc

Excellent.

> As it seems, both gcc 4.1 and gcc 3.3 seem to work OK.
> Can anyone confirm this?

That they work OK? No, you are the only person who has done this.

> There isn't any difference between gcc-3.x and gcc4.x at this
> level isn't it?

There have been changes in this area, but they shouldn't affect
compatibility.

It would be nice if you told us what you did to make it work.

Andrew.
Re: JIT exception handling
Andrew Haley wrote:

> jacob navia writes:
> > This is just to tell you that now it is working.
> >
> > I have succeeded in making my JIT generate the right tables for gcc
>
> Excellent.
>
> > As it seems, both gcc 4.1 and gcc 3.3 seem to work OK.
> > Can anyone confirm this?
>
> That they work OK? No, you are the only person who has done this.
>
> > There isn't any difference between gcc-3.x and gcc4.x at this
> > level isn't it?
>
> There have been changes in this area, but they shouldn't affect
> compatibility.
>
> It would be nice if you told us what you did to make it work.
>
> Andrew.

Well, remember that I posted here that the lsb specs had a bug?

I did post the bug, but *I did not correct the code*!!!

Can you imagine something more stupid than that?

There was a point in my code where I did not correct for that mistake,
that is all. I followed the code in the debugger (after finishing
building all the debug libraries needed) and I noticed it.

I corrected it, and it worked.

jacob
query regarding ivopts.
Hi,

I am upgrading a port from 3.4.5 to 4.1.x. In the course of this I see
some performance regressions in memcpy. I have narrowed down the test
case to the function below.

ivopts generates ivtmps for each of the address calculations, as shown
in the attached log, instead of coalescing them into accesses off a
single base.

I have constant-offset-based indexed addressing available. 3.4.x works
just fine and transforms these into accesses off a single base.

From the little I understand of the way ivopts works, I assume this has
to do with costs from my backend. I tweaked TARGET_ADDRESS_COST to
return 0 always, as well as specifically for POST_INC, POST_DEC and
friends, but this did not help.

Any suggestions would be great! Thanks for your time.

cheers
Ramana

---
Ramana Radhakrishnan
GNU Tools
Celunite Inc

memcpyreduced.i
Description: Binary data

memcpyreduced.i.t81.ivopts
Description: Binary data
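The attached test case is not reproduced in the archive. Purely as an illustration of what "accesses off a single base" means here, the following C sketch shows an unrolled copy loop written two ways. It is a hypothetical stand-in and not the poster's memcpyreduced.i; the function names are invented for this example. The first form indexes every access from the loop counter, which a compiler may lower to a separate induction variable per address. The second form is the shape the poster wants: one pointer per array, with the individual accesses reached through constant offsets.

```c
#include <stddef.h>

/* Hypothetical example, not the poster's test case: every access is
   indexed from the loop counter i.  */
void
copy_indexed (unsigned int *dst, const unsigned int *src, size_t n)
{
  size_t i;

  for (i = 0; i + 4 <= n; i += 4)
    {
      dst[i] = src[i];
      dst[i + 1] = src[i + 1];
      dst[i + 2] = src[i + 2];
      dst[i + 3] = src[i + 3];
    }
  for (; i < n; i++)
    dst[i] = src[i];
}

/* Hypothetical example: the same copy with one moving base per array
   and constant offsets 0..3, which suits base + constant-offset
   addressing.  */
void
copy_single_base (unsigned int *dst, const unsigned int *src, size_t n)
{
  size_t blocks = n / 4;
  size_t rest = n % 4;

  while (blocks--)
    {
      dst[0] = src[0];
      dst[1] = src[1];
      dst[2] = src[2];
      dst[3] = src[3];
      dst += 4;
      src += 4;
    }
  while (rest--)
    *dst++ = *src++;
}
```

Both functions copy the same data; only the addressing pattern differs, and that pattern is what the induction variable optimization pass is expected to choose.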
Re: Modifying ARM code generator for elimination of 8bit writes - need help
On Wed, Jul 19, 2006 at 07:52:32AM +0200, Wolfgang Mües wrote:
> Hello,
>
> after getting a "working" version of the gcc 4.0.2 with the Nintendo
> 8-bit-write problem, I was busy the last weeks trying to adapt the
> linux system (replacing I/O with writeb() macros, removing strb
> assembler calls).

It is good to hear from you again.

> However, it turned out that the sources of the linux kernel are a far
> more demanding test than every single small test case.

Such is life. Since the ARM backend has many arm_expand_fubar()
functions, it wouldn't be too surprising if one of them now generated
invalid insns.

> I have tried my very best to implement the last patch from Rask (thank
> you very much!). There was one place I was not sure I have coded the
> right solution:
>
> +   (match_code "reg" "0")))
>
> My patch (without the second operand for match_code):
>
> > (match_test "GET_CODE (XEXP (op, 0)) == REG")))
>
> Is this the right substitution?

I believe it is.

> If I compile the linux kernel with this patch, many files get compiled
> without problems, but in fs/vfat/namei.c I get:
>
> > fs/vfat/namei.c: In function 'vfat_add_entry':
> > fs/vfat/namei.c:694: error: unrecognizable insn:
> > (insn 2339 2338 2340 188 (set (mem/s/j:QI (reg:SI 14 lr) [0 .attr+0 S1 A8])
> >         (reg:QI 12 ip)) -1 (nil)
> >     (nil))
> > fs/vfat/namei.c:694: internal compiler error: in extract_insn, at
> > recog.c:2020
> > Please submit a full bug report,
>
> I can't see what is going on here...

The (clobber ...) part is missing. The first thing to do is to compile
with -fdump-rtl-all and see which pass creates this invalid insn. Grep
is your friend.

I've spotted a function named emit_set_insn() in arm.c. It might be the
problem, because it uses gen_rtx_SET() directly. Thus, if Y is a general
operand, the resulting insn should match one of the move patterns, but
the (clobber ...) expression needed for "_arm_movqi_insn_swp" will be
missing.

For example, arm_split_constant() contains these potentially problematic
lines:

    emit_set_insn (target, GEN_INT (val));

    rtx temp = subtargets ? gen_reg_rtx (mode) : target;
    emit_set_insn (temp, GEN_INT (val));

There are other functions which call emit_set_insn(). I have no idea
which one, if any, of these calls is causing the problem. It could also
be one of the RTL passes.

The function named emit_move_insn() ought to do the trick here, but is
perhaps a bit heavyweight for this purpose. Anyway, try this patch
(untested), which should plug this particular hole:

Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 115021)
+++ gcc/config/arm/arm.c	(working copy)
@@ -709,12 +709,15 @@
   TLS_LE32
 };
 
-/* Emit an insn that's a simple single-set.  Both the operands must be known
-   to be valid.  */
+/* Emit an insn that's a simple single-set if Y isn't a general operand.
+   Both the operands must be known to be valid.  */
 inline static rtx
 emit_set_insn (rtx x, rtx y)
 {
-  return emit_insn (gen_rtx_SET (VOIDmode, x, y));
+  if (general_operand (y, GET_MODE (x))
+    return emit_move_insn (x, y);
+  else
+    return emit_insn (gen_rtx_SET (VOIDmode, x, y));
 }
 
 /* Return the number of bits set in VALUE.  */

-- 
Rask Ingemann Lambertsen
Re: Indexed address problem
On Wed, Jul 19, 2006 at 09:44:12AM +0100, [EMAIL PROTECTED] wrote:

> We are trying to write a new backend for GCC. The target machine is
> 16-bit, with 24-bit pointers. However, the indexed addressing mode has a
> 24-bit base and a 16-bit index, so we want to generate RTXs such as
>
> (mem:QI (plus:SI (reg:HI )
>                  (reg:SI )))
>
> (mem:HI (plus:SI (mult:SI (reg:HI )
>                           (const_int 2))
>                  (reg:SI )))
>
> But I've not managed to get this to work reliably.

You are not making it easy to help you. Please tell us what makes you
conclude that it isn't working reliably (error messages, code which is
slower and/or larger than expected, etc) and how you tried to make it
work, such as: Your definition of GO_IF_LEGITIMATE_ADDRESS() and
TARGET_ADDRESS_COST(), if any.

I would imagine that you need some (zero_extend ...) expressions and
also to use PSImode for pointers. Your second example above would then
look like

  (mem:HI (plus:PSI (mult:PSI (zero_extend:PSI (reg:HI ))
                              (const_int 2))
                    (reg:PSI )))

(Some target machines use zero extension and others use sign extension,
please adjust accordingly.)

> Can this be done
> relatively painlessly? Also, we are currently working on a gcc 3.4 branch;
> are there any changes in gcc 4.x which would make this easier?

Starting with GCC 4.2, maybe this one:
<http://gcc.gnu.org/ml/gcc-patches/2006-03/msg01311.html>.

On <http://gcc.gnu.org/lists.html>, please read the part which follows:
"Please do not include or reference confidentiality notices".

-- 
Rask Ingemann Lambertsen
Re: [PATCH] Install drivers from gcc/Makefile.in
> The attached patch moves the basic installation of the compiler
> drivers from gcc/*/Make-lang.in to gcc/Makefile.in. The Make-lang.in
> has only to inform the driver's name.

What about Ada? Will things still work after your change?

It would seem cleaner (if not mandatory) to take all languages into
account in your change.

Thanks in advance.

> 2006-07-13  Rafael Ávila de Espíndola  <[EMAIL PROTECTED]>
>
>     * gcc/java/Make-lang.in (DRIVERS): New
>     (java.install-common): Don't Install the driver
>     * gcc/cp/Make-lang.in (DRIVERS): New
>     (c++.install-common): Don't install the driver
>     * gcc/fortran/Make-lang.in (DRIVERS): New
>     (fortran.install-common): Don't install the driver
>     * gcc/treelang/Make-lang.in (DRIVERS): New
>     (treelang.install.common.done): Don't install the driver
>     * gcc/Makefile.in (DRIVERS): New
>     (LANG_INSTALL_COMMONS): New
>     (install-drivers): New
Re: Project RABLET
On Fri, Jun 23, 2006 at 03:23:04PM -0400, Andrew MacLeod wrote:

> A new register allocator written from scratch is a very long term
> project (measured in years), and there is no guarantee after all that
> work that we'd end up with something which is remarkably better. One
> would hope that it is a lot more maintainable, but the generated code is
> a crapshoot. It will surely look better but will it really run faster?
> The current plate of spaghetti we call the register allocator has had a
> lot of fine tuning go into it over the years, and it generally generates
> pretty darn good code IF it doesn't have to spill much, which is much of
> the time.

One area where the current register allocator is really lousy is when
faced with pseudos spanning multiple registers. A recent example:
<http://gcc.gnu.org/ml/gcc-help/2006-04/msg00064.html>. And 16-bit
targets suffer from this much of the time. I can only imagine that the
AVR camp must be tearing their hair out in frustration.

The register allocator really needs to be able to allocate subregs
independently.

-- 
Rask Ingemann Lambertsen
Re: Modifying ARM code generator for elimination of 8bit writes - need help
On Wed, Jul 19, 2006 at 01:24:59PM +0200, Rask Ingemann Lambertsen wrote:
>
> The function named emit_move_insn() ought to do the trick here, but
> is perhaps a bit heavyweight for this purpose. Anyway, try this patch
> (untested), which should plug this particular hole:

There was an unbalanced parenthesis. Here's an updated patch:

Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 115021)
+++ gcc/config/arm/arm.c	(working copy)
@@ -709,12 +709,15 @@
   TLS_LE32
 };
 
-/* Emit an insn that's a simple single-set.  Both the operands must be known
-   to be valid.  */
+/* Emit an insn that's a simple single-set if Y isn't a general operand.
+   Both the operands must be known to be valid.  */
 inline static rtx
 emit_set_insn (rtx x, rtx y)
 {
-  return emit_insn (gen_rtx_SET (VOIDmode, x, y));
+  if (general_operand (y, GET_MODE (x)))
+    return emit_move_insn (x, y);
+  else
+    return emit_insn (gen_rtx_SET (VOIDmode, x, y));
 }
 
 /* Return the number of bits set in VALUE.  */

-- 
Rask Ingemann Lambertsen
Re: Indexed address problem
Rask Ingemann Lambertsen wrote:

> You are not making it easy to help you. Please tell us what makes you
> conclude that it isn't working reliably (error messages, code which is
> slower and/or larger than expected, etc) and how you tried to make it
> work, such as: Your definition of GO_IF_LEGITIMATE_ADDRESS() and
> TARGET_ADDRESS_COST(), if any.

I have defined GO_IF_LEGITIMATE_ADDRESS() to only accept indexed
addresses where the index is HImode, and then LEGITIMIZE_ADDRESS to
replace SImode index registers with a subreg. This sometimes causes ICEs
in emit_move_insn or copy_to_mode_reg because the mode of the index
register is different to the mode of the PLUS or the MULT.

I'll try using a zero_extend as you suggested.

> I would imagine that you need some (zero_extend ...) expressions and
> also to use PSImode for pointers. Your second example above would then
> look like

What is the advantage of using PSImode? As far as I can see there is
nothing which specifies how many bits are actually used, so it would
seem to be treated the same as SImode. The target machine doesn't have
special instructions for manipulating 24-bit pointers; all pointer moves
and arithmetic are done with 32-bit instructions.

> On <http://gcc.gnu.org/lists.html>, please read the part which follows:
> "Please do not include or reference confidentiality notices".

Sorry!

Thanks for your help,
Saajan.
using threads with gcc on fedora (undefined reference to pthread_create)
I am using threads in my application.

When I try to compile the code, gcc says that "pthread_create" is an
undefined reference. I have included the library pthread.h. Is there
something else that I need to do?

What options do I have to use when compiling my code? Is there any
document on this?

thanks,
Abid Ghufran.
Re: using threads with gcc on fedora (undefined reference to pthread_create)
On Wed, Jul 19, 2006 at 02:49:18PM +0100, Abid Ghufran wrote:
> I am using thread in my application.
>
> When i try to compile the code, the gcc says that the "pthread_create"
> is an undefined reference. I have included the library pthread.h. Is
> there something else that i need to do.
>
> What options do I have to use when compiling my code?
> Is there any document on this?

This is a list for the developers of GCC. You may get more help on the
gcc-help list. It sounds like you didn't link with -lpthread.

-- 
Daniel Jacobowitz
CodeSourcery
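For readers who hit the same error, a minimal sketch of a threaded routine follows. Including pthread.h only supplies the declarations; the definitions come from the thread library, so the link step must name it. Assuming the code is saved as threads.c (a file name chosen for this example), building with "gcc threads.c -lpthread" as suggested above, or with GCC's -pthread option, should resolve the reference. The names worker and run_worker are made up for the illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical thread body: doubles the int that arg points to.  */
static void *
worker (void *arg)
{
  int *value = arg;

  *value *= 2;
  return NULL;
}

/* Hypothetical helper: runs worker() in a new thread, waits for it to
   finish, and returns the doubled value, or -1 if a pthread call
   fails.  */
int
run_worker (int input)
{
  pthread_t tid;
  int value = input;

  if (pthread_create (&tid, NULL, worker, &value) != 0)
    return -1;
  if (pthread_join (tid, NULL) != 0)
    return -1;
  return value;
}
```

Calling run_worker (21) from main() should return 42 once the program links. Without the thread library on the link line, older toolchains stop with the "undefined reference to pthread_create" message from the question.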
Re: Indexed address problem
On Wed, Jul 19, 2006 at 02:12:09PM +0100, Saajan Singh Chana wrote:

> I have defined GO_IF_LEGITIMATE_ADDRESS() to only accept indexed
> addresses where the index is HImode,

I was trying to get you to copy and paste your definition of
GO_IF_LEGITIMATE_ADDRESS() into your message. :-)

> and then LEGITIMIZE_ADDRESS to replace SImode index registers
> with a subreg.

I don't think you can do that.

  (plus:SI (reg:SI ) (reg:SI ) 0)) (reg:SI ))

unless you can somehow prove that the index fits inside 16 bits. Here's
an example where you can't:

int foo (int *base, unsigned long int index)
{
  return (base[index]);
}

> This sometimes causes
> ICEs in emit_move_insn or copy_to_mode_reg because the mode of the index
> register is different to the mode of the PLUS or the MULT.

> I'll try using a zero_extend as you suggested.

Btw, I suppose you might see things like

  (mem:XX (plus:SI (zero_extend:SI (subreg:HI (reg:SI ) 0)) (reg:SI )))

as well as

  (mem:XX (plus:SI (zero_extend:SI (reg:HI )) (reg:SI )))

and I think GO_IF_LEGITIMATE_ADDRESS() should accept either form. You'd
see the former before the reload pass and the latter both after and
before reload.

(Does GCC ever use (truncate:HI (reg:SI ...)) instead of
(subreg:HI (reg:SI ...) 0)? Just wondering.)

> What is the advantage of using PSImode? As far as I can see there is
> nothing which specifies how many bits are actually used, so it would
> seem to be treated the same as SImode. The target machine doesn't have
> special instructions for manipulating 24-bit pointers, all pointer moves
> and arithmetic are done with 32-bit instructions.

OK, use SImode.

-- 
Rask Ingemann Lambertsen
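To make the point about the index width concrete, here is a small C model of the address arithmetic. It is only a sketch under stated assumptions: the 24-bit wrap-around mask and the function names are invented for the illustration and are not taken from the target's documentation. The first function models the hardware's indexed mode (24-bit base plus a zero-extended 16-bit index, scaled); the second models what the source program asks for when the index is a full 32-bit value, as in foo() above.

```c
#include <stdint.h>

/* Assumption for this sketch: addresses wrap at 24 bits.  */
#define ADDRESS_MASK 0xFFFFFFu

/* Hypothetical model of the indexed addressing mode: the index is only
   16 bits wide and is zero-extended before scaling.  */
static uint32_t
indexed_mode_address (uint32_t base, uint16_t index, uint32_t scale)
{
  return (base + (uint32_t) index * scale) & ADDRESS_MASK;
}

/* Hypothetical model of the address the source program requires when
   the index is a 32-bit value.  */
static uint32_t
required_address (uint32_t base, uint32_t index, uint32_t scale)
{
  return (base + index * scale) & ADDRESS_MASK;
}
```

With base 0x100000 and a scale of 2, an index of 0x1234 gives 0x102468 from both functions. An index of 0x12345 gives 0x12468A from required_address() but 0x10468A once it has been cut down to 16 bits, which is why replacing an SImode index with its low HImode part is only safe when the index is known to fit in 16 bits.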
Re: query regarding ivopts.
Hello,

> I am upgrading a port from 3.4.5 to 4.1.x .In course of this I see some
> regressions in terms of performance in memcpy . I have narrowed down the
> test case to the function below.
>
> ivopts generates ivtmps for each of the address calculations as shown in
> the attached log instead of coalescing them into accesses off a single
> base .
>
> I have constant offset based indexed addressing available. 3.4.x works
> just fine and transforms these into accesses off a single base.
>
> From the little I understand of the way ivopts works- I assume this has
> to do with costs from my backend. I tweaked TARGET_ADDRESS_COST to return
> 0 always as well as specifically for POST_INC, POST_DEC and friends , but
> this did not help .
>
> Any suggestions would be great ! Thanks for your time .

This is a bug in ivopts; the costs have nothing to do with it. It
appears to be fixed in 4.2 (I cannot reproduce it with your example
there).

Zdenek
Re: query regarding ivopts.
Hi Zdenek,

I can't seem to reproduce this on 4.1.x with any other port. Maybe I
need to sync up with the latest svn of 4.1.x. I'll try looking at 4.2
head also to spot differences, if any.

Thanks for your time

cheers
Ramana

Ramana Radhakrishnan
GNU Tools
Celunite Inc

On Wed Jul 19 8:09 , Zdenek Dvorak sent:

> Hello,
>
> > I am upgrading a port from 3.4.5 to 4.1.x .In course of this I see some
> > regressions in terms of performance in memcpy . I have narrowed down
> > the test case to the function below.
> >
> > ivopts generates ivtmps for each of the address calculations as shown
> > in the attached log instead of coalescing them into accesses off a
> > single base .
> >
> > I have constant offset based indexed addressing available. 3.4.x works
> > just fine and transforms these into accesses off a single base.
> >
> > From the little I understand of the way ivopts works- I assume this
> > has to do with costs from my backend. I tweaked TARGET_ADDRESS_COST to
> > return 0 always as well as specifically for POST_INC, POST_DEC and
> > friends , but this did not help .
> >
> > Any suggestions would be great ! Thanks for your time .
>
> this is a bug in ivopts, the costs have nothing to do with it. It
> appears to be fixed in 4.2 (I cannot reproduce it with your example
> there).
>
> Zdenek
Re: JIT exception handling
On Jul 19, 2006, at 3:08 AM, jacob navia wrote:

> This is just to tell you that now it is working.

Yeah. Glad to hear it, and thanks for the update.
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Hello Rask,

On Wednesday 19 July 2006 13:24, Rask Ingemann Lambertsen wrote:
> I've spotted a function named emit_set_insn() in arm.c. It might be
> the problem, because it uses gen_rtx_SET() directly.

But it's not the only function which uses gen_rtx_SET. There are also
many places with

> emit_constant_insn (cond,
>                     gen_rtx_SET (VOIDmode, target, source));

Wouldn't it be better to replace gen_rtx_SET?

regards

Wolfgang
-- 
We're back to the times when men were men and wrote their own device
drivers.

(Linus Torvalds)