Re: Mis-handled ColdFire submission?
>My only criticism is that surely, all these improvements weren't carried
>out just last week. I.e. some of them could have been submitted earlier,
>thereby making them available to users earlier as well as preventing
>duplicate work. An example is PR target/28181, which was reported half a
>year ago and at least three people worked on fixing. So once your patches
>are ready, go ahead and submit them.

28181 has been popping up over the last several years in various forms (5373, 13803, 18421, 23695, etc).

-- Peter Barada [EMAIL PROTECTED]
Re: bug management: WAITING bugs that have timed out
> The description of WORKSFORME sounds closest: we don't know how to
> reproduce the bug. Should that be used? The only other choices
> are FIXED (wrong), DUPLICATE (wrong), INVALID (we don't know that),
> or WONTFIX (we're not saying we won't fix it if we get a testcase).
>
> This came up because RMS raised a concern about the large number
> of wrong-code bugs; many of those not marked "regression" are WAITING,
> or should be WAITING. I'd like to knock off a few (and of course
> they can be re-opened if we get more data).

Create a new state then. After all, it can't be that hard compared to working on GCC, right? Perhaps "STUCK-FOR-MORE-INFO" would make sense. I read "WAITING" as meaning that either there's not enough information or no adequate testcase.

Even if its state is WORKSFORME, I can see keeping the testcase around on some automated tester to verify the compiler doesn't bork in the future. Any bug that creates an ICE should DEFINITELY remain open, and maintainers should pursue the reporter for a testcase, if only to fix it or add it to an automated tester.

As an example, a bunch of the ColdFire "move.b, ,%aY" bugs (5753 for example), with testcases, keep getting hammered closed in various releases, only to pop up again in the future. Saddest is that a batch of related bugs was closed with the blanket comment "M68k/ColdFire is not a primary platform - CLOSED". Again, this misleads any analysis of how good GCC is as a compiler for non-primary targets.

Is there a way of converting these to DEFERRED so they don't get magically CLOSED, and so their testcases get pulled into some automated tester? I can't see blindly punting bugs that someone has gone to at least the effort of reporting, only for them to be ignored if no further information is forthcoming. Perhaps the description of the issue is enough for some energetic intern to come along and create a testcase, who knows?

-- Peter Barada [EMAIL PROTECTED]
matching constraints in asm operands question
I'm trying to improve atomic operations for ColdFire in a 2.4 kernel, and I tried the following, based on the current online manual at:

http://gcc.gnu.org/onlinedocs/gcc-3.4.3/gcc/Extended-Asm.html#Extended-Asm

    static __inline__ void atomic_inc(atomic_t *v)
    {
      __asm__ __volatile__("addql #1,%0" : "=m" (*v) : "0" (*v));
    }

but that generates *lots* of warning messages about "matching constraint doesn't allow a register".

The manual states that if I *don't* use "0", then the compiler may have the input and output operand in separate locations, and predicts unknown bad things can happen.

Searching the archives I see that people are using:

    static __inline__ void atomic_inc(atomic_t *v)
    {
      __asm__ __volatile__("addql #1,%0" : "=m" (*v) : "m" (*v));
    }

which seems to work, but I'm really concerned about the manual's warning of the input and output operands being in separate places.

Which form is correct?

-- Peter Barada [EMAIL PROTECTED]
Re: matching constraints in asm operands question
>> which seems to work, but I'm really concerned about the manual's
>> warning of the input and output operands being in separate places.
>>
>> Which form is correct?
>
>static __inline__ void atomic_inc(atomic_t *v)
>{
> __asm__ __volatile__("addql #1,%0" : "+m" (*v));
>}
>
>Works just fine, everywhere I know of. It is the same as your last
>example also.

Ugh, in the hopes of simplifying the example, I made it somewhat trivial...

    static __inline__ void atomic_add(atomic_t *v, int i)
    {
      __asm__ __volatile__("addl %2,%0" : "=m" (*v) : "m" (*v), "d" (i));
    }

Is that correct? And if so, then isn't the documentation wrong?

-- Peter Barada [EMAIL PROTECTED]
Re: matching constraints in asm operands question
>> Ugh, in the hopes of simplifying the example, I made it somewhat trivial...
>>
>> static __inline__ void atomic_add(atomic_t *v, int i)
>> {
>> __asm__ __volatile__("addl %2,%0" : "=m" (*v) : "m" (*v), "d" (i));
>> }
>>
>> Is that correct? And if so, then isn't the documentation wrong?
>
>This is correct, as is a version using "+m" (*v) : "d"(i),
>though of course the %2 gets moved to %1 in that case.

If I try:

    static __inline__ void atomic_add(atomic_t *v, int i)
    {
      __asm__ __volatile__("addl %1,%0" : "+m" (*v) : "d" (i));
    }

Then the compiler complains with:

    /asm/atomic.h:33: warning: read-write constraint does not allow a register

So is the warning wrong?

-- Peter Barada [EMAIL PROTECTED]
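For comparison, the two operand forms discussed in this thread can be put side by side. This is only a sketch under stated assumptions: the atomic_t layout and the names atomic_add_split and atomic_add_rw are made up for illustration, the m68k branches simply restate the snippets quoted above and have not been checked against any particular GCC release, and the plain C fallback (which is NOT atomic) exists only so the file builds on a non-m68k host.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's atomic_t. */
typedef struct { int counter; } atomic_t;

/* Form 1: a separate "=m" output and "m" input naming the same lvalue.
   The addend is operand %2. */
static inline void atomic_add_split(atomic_t *v, int i)
{
    assert(v != 0);
#if defined(__m68k__)
    __asm__ __volatile__("addl %2,%0"
                         : "=m" (v->counter)
                         : "m" (v->counter), "d" (i));
#else
    v->counter += i;   /* host fallback for illustration, NOT atomic */
#endif
}

/* Form 2: a single read-write "+m" operand.  With one operand fewer,
   the addend moves from %2 to %1. */
static inline void atomic_add_rw(atomic_t *v, int i)
{
    assert(v != 0);
#if defined(__m68k__)
    __asm__ __volatile__("addl %1,%0"
                         : "+m" (v->counter)
                         : "d" (i));
#else
    v->counter += i;   /* host fallback for illustration, NOT atomic */
#endif
}
```

Both functions are meant to have the same effect; the only difference is how the memory operand is described to the compiler and how the operands are numbered.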
[m68k]: Trouble trying to figure out LEGITIMIZE_RELOAD_ADDRESS
I'm in the midst of fixing the m68k prologue/epilogue code for ColdFire and its FPU, and stumbled across a problem.

The following code when compiled with -O2 -mcfv4e -fomit-frame-pointer (with the v4e code in):

    double func(int i1, int i2, int i3, int i4, double a, double b)
    {
      int stuff[8192];
      double total = 0.0;
      int i,j,k,l;
      for (i=0; i
Re: [m68k]: Trouble trying to figure out LEGITIMIZE_RELOAD_ADDRESS
>Peter Barada wrote:
>> I'd like to make the reload look like:
>> (set (reg:SI y) (plus:SI (reg:SI 16) (const_int 32832)))
>> (set (reg:DF x) (mem:DF (reg:SI y)))
>
>Reload already knows how to make this transformation, so it should not
>be necessary to resort to LEGITIMIZE_RELOAD_ADDRESS. This is only
>needed for target specific tricks that reload can not and does not know
>about.

I figured out how to make it work using LEGITIMIZE_RELOAD_ADDRESS (at least for gcc-3.4.3) via:

    /* Our implementation of LEGITIMIZE_RELOAD_ADDRESS.  Returns a value to
       replace the input X, or NULL_RTX if no replacement is called for.
       If a new X is returned, then goto WIN in the invoking macro.

       Reload will try to handle large displacements off a base register
       by pushing the displacement into a register and using mode 6, but
       this doesn't work for the ColdFire FPU, so in that case, if the
       address is not valid for the mode, and is either mode 5 or mode 6,
       then push the *entire* address into a register and use mode 2 to
       access the value.  */
    rtx
    m68k_legitimize_reload_address (rtx *x, enum machine_mode mode,
                                    int opnum, int type,
                                    int ind_levels ATTRIBUTE_UNUSED)
    {
      /* If the address is valid for 'mode', accept it.  */
      if (strict_memory_address_p (mode, *x))
        return NULL_RTX;

      /* The ColdFire v4e can't handle FP mode 6 addresses.  Unfortunately
         reload tries to remap a mode 5 address with the offset out of
         range into a mode 6.  Push an FP mode 5 with displacement out of
         range or mode 6 address into a register and use mode 2
         addressing instead.  */
      if (TARGET_CFV4E
          && GET_MODE_CLASS (mode) == MODE_FLOAT
          && GET_CODE (*x) == PLUS
          && !(GET_CODE (XEXP (*x, 0)) == REG
               && GET_CODE (XEXP (*x, 1)) == CONST_INT
               && INTVAL (XEXP (*x, 1)) >= -32768
               && INTVAL (XEXP (*x, 1)) <= 32767))
        {
          push_reload (*x, NULL_RTX, x, NULL, BASE_REG_CLASS,
                       GET_MODE (*x), VOIDmode, 0, 0, opnum, type);
          return *x;
        }
      return NULL_RTX;
    }

Hopefully I got it right. If not, yell. :)

>See the code in find_reloads_address with the comment
> /* If we have address of a stack slot but it's not valid because the
> displacement is too large, compute the sum in a register.
>
>The problem here seems to be that double_reg_address_ok is true, but you
>don't want it to be true. It can only be true if
>GO_IF_LEGITIMATE_ADDRESS accepts REG+REG+4. Perhaps the problem here is
>that a double address reg is OK for integer loads, but not for FP loads,
>in which case the double_reg_address_ok logic in reload needs to be
>generalized a bit. Maybe an array based on mode instead of a single
>variable? Or maybe just int and fp versions are good enough for now.

For ColdFire v4e, FP loads and stores can not use REG+REG addressing. I think you are correct that this current hack can be improved by making an array indexed by mode to determine if double_reg_address_ok should be true, but I think in the end a more flexible scheme should be thought about, since this isn't the *only* peculiarity of the ColdFire. One is that pc-relative addressing is only available on the *source*, not the destination, and currently neither GO_IF_LEGITIMATE_ADDRESS nor LEGITIMIZE_ADDRESS has any way of knowing whether it's a source or a destination.

Unfortunately I don't have any spare time to try that approach this week. If I can find the time next week, I'll put together a patch against mainline that addresses this, but first I'd have to reproduce the problem there (which will give a testcase).

-- Peter Barada [EMAIL PROTECTED]
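The range test at the heart of the hook above can be looked at in isolation. The sketch below is a standalone model and not GCC code: struct base_disp, disp16_ok and fp_needs_full_reload are hypothetical names, and the model assumes only what the message states, namely that a base+displacement (mode 5) address carries a signed 16-bit displacement and that anything outside that range has to be computed into a register for FP accesses.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical view of a base+displacement address. */
struct base_disp {
    int  base_regno;   /* hard register used as the base */
    long disp;         /* constant displacement in bytes */
};

/* True if the displacement fits the signed 16-bit field of mode 5. */
static bool disp16_ok(long disp)
{
    return disp >= -32768 && disp <= 32767;
}

/* True if an FP access through this address would need the whole
   address pushed into a register (mode 2) in this model. */
static bool fp_needs_full_reload(const struct base_disp *a)
{
    assert(a != 0);
    return !disp16_ok(a->disp);
}
```

With the displacement from the quoted RTL, 32832, disp16_ok fails by 65 bytes, which is why the address from the example cannot be used directly.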
Re: Sorry for the noise: Bootstrap fails on HEAD 4.1 for AVR
>When trying to figure out the origin of the problem, I have realized so far,
>that it obviously stems from a problem during my local configure process:
>The xgcc I'm just building tries to pipe the asm result through my "host-as"
>instead of the "target-as". I will myself have to look for why configure
>chose the wrong assembler. Unfortunately, the error message I got from "make"
>was not really instructive. So: Sorry for the noise.

When you configured the cross compiler, did you have the target assembler in your PATH? If not, configure will use 'as' in your path and find your host assembler instead.

-- Peter Barada [EMAIL PROTECTED]
[m68k]: More trouble with byte moves into Address registers
This is driving me up a tree. I have a fix for 18421 (on mainline & gcc-3.4.3) that uses HARD_REGNO_MODE_OK to prevent bytes going into address registers, and modified movqi for ColdFire to drop the '*a' in the d*a/di*a constraints, as well as modified addsi3_5200 to use 'i' instead of 's'.

My current problem is when I'm compiling Perl for the ColdFire v4e using gcc-3.4.3 for m68k-linux, and I'm seeing:

[EMAIL PROTECTED] tmp]$ /opt/logicpd/ColdFire-new12/m68k-linux/gcc-3.4.3-glibc-2.3.2/bin/m68k-linux-gcc -DPERL_CORE -mcfv4e -fno-strict-aliasing -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm -O2 pp_pack.c -S -o pp_pack.s -da
pp_pack.c: In function `S_unpack_rec':
pp_pack.c:2220: error: unable to find a register to spill in class `ADDR_REGS'
pp_pack.c:2220: error: this is the insn:
(insn 5559 5558 5560 694 pp_pack.c:2144 (set (reg:SI 8 %a0 [1421])
        (plus:SI (subreg:SI (reg:QI 1420) 0)
            (const_int -32 [0xffe0]))) 121 {*addsi3_5200} (insn_list 5558 (nil))
    (nil))
pp_pack.c:2220: confused by earlier errors, bailing out

The RTL surrounding this from pp_pack.c.24.lreg is:

;; Start of basic block 694, registers live: 14 [%a6] 15 [%sp] 24 [argptr] 31 35 36 37 38 39 43 45 46 47 48 50 52 57 70 1369 1370 1371 1378 1391 1402 1413 1725 1734 1735 1736 1738 1740 1756
(note 6967 5556 5558 694 [bb 694] NOTE_INSN_BASIC_BLOCK)

(insn 5558 6967 5559 694 pp_pack.c:2144 (set (reg:QI 1420)
        (mem:QI (reg:SI 1756 [ s ]) [0 S1 A8])) 43 {movqi_cfv4} (nil)
    (expr_list:REG_DEAD (reg:SI 1756 [ s ])
        (nil)))

(insn 5559 5558 5560 694 pp_pack.c:2144 (set (reg:SI 1421)
        (plus:SI (subreg:SI (reg:QI 1420) 0)
            (const_int -32 [0xffe0]))) 121 {*addsi3_5200} (insn_list 5558 (nil))
    (nil))

(insn:QI 5560 5559 5561 694 pp_pack.c:2144 (set (cc0)
        (compare (subreg:QI (reg:SI 1421) 3)
            (const_int 64 [0x40]))) 15 {cfv4_cmpqi} (insn_list 5559 (nil))
    (expr_list:REG_DEAD (reg:SI 1421)
        (nil)))

pp_pack.c.25.greg stops with the line 'Spilling for insn 5559.' That line also appears twice earlier in the dump: the first time with no information attached (just 'Spilling for insn 5559.'), and the second time with 'Using reg 9 for reload 0' after it.

This also fails on gcc-3.4.3 for -m5407, but compiles for -m5200. The 5407 has more instructions and less restrictive addressing modes on some instructions than the 5200 has.

Can anyone take a stab at describing *how* to debug this? Is this just a case where there are so many live registers that reload has just backed itself into a corner?

Any suggestions are appreciated!

-- Peter Barada [EMAIL PROTECTED]
Re: [m68k]: More trouble with byte moves into Address registers
>For some reason reload has decided that it needs ADDR_REGS for the >register being reloaded, namely (reg:QI 1420). So gcc looks for a >register in ADDR_REGS which can hold QImode. Because of your changes, >it doesn't find one. So it crashes. > >The question is why reload thinks that it needs ADDR_REGS for this >register. Look at the local-alloc debugging dump to see where >regclass thinks that the register should go. Which debugging dump has the output from "local-alloc"? If its pp_pack.c.24.lreg, then that is the output I supplied in the original message which contains(for all bits regarding register 1420 up until the compilation fails): Register 1420 costs: DATA_REGS:84 GENERAL_REGS:210 DATA_OR_FP_REGS:294 ALL_REGS:294 MEM:441 Register 1420 pref DATA_REGS Register 1420 costs: DATA_REGS:84 GENERAL_REGS:210 DATA_OR_FP_REGS:294 ALL_REGS:294 MEM:441 Register 1420 used 3 times across 6 insns; set 1 time; 1 bytes; pref DATA_REGS. Registers live at end: 14 [%a6] 15 [%sp] 24 [argptr] 31 35 36 37 38 39 43 45 46 47 48 50 52 57 70 1369 1370 1371 1378 1391 1402 1413 1420 1725 1734 1735 1736 1738 1740 Registers live at start: 14 [%a6] 15 [%sp] 24 [argptr] 31 35 36 37 38 39 43 45 46 47 48 50 52 57 70 1369 1370 1371 1378 1391 1402 1413 1420 1725 1734 1735 1736 1738 1740 ;; Start of basic block 694, registers live: 14 [%a6] 15 [%sp] 24 [argptr] 31 35 36 37 38 39 43 45 46 47 48 50 52 57 70 1369 1370 1371 1378 1391 1402 1413 1725 1734 1735 1736 1738 1740 1756 (note 6967 5556 5558 694 [bb 694] NOTE_INSN_BASIC_BLOCK) (insn 5558 6967 5559 694 pp_pack.c:2144 (set (reg:QI 1420) (mem:QI (reg:SI 1756 [ s ]) [0 S1 A8])) 43 {movqi_cfv4} (nil) (expr_list:REG_DEAD (reg:SI 1756 [ s ]) (nil))) (insn 5559 5558 5560 694 pp_pack.c:2144 (set (reg:SI 1421) (plus:SI (subreg:SI (reg:QI 1420) 0) (const_int -32 [0xffe0]))) 121 {*addsi3_5200} (insn_list 5558 (nil)) (nil)) (insn:QI 5560 5559 5561 694 pp_pack.c:2144 (set (cc0) (compare (subreg:QI (reg:SI 1421) 3) (const_int 64 [0x40]))) 15 {cfv4_cmpqi} 
(insn_list 5559 (nil)) (expr_list:REG_DEAD (reg:SI 1421) (nil))) BTW: the patterns mentioned in the dump: (define_insn "movqi_cfv4" [(set (match_operand:QI 0 "nonimmediate_operand" "=d,dmU,U,d") (match_operand:QI 1 "general_src_operand" "dmi,d,Ui,di"))] "TARGET_CFV4" "* return output_move_qimode (operands);") (define_insn "*addsi3_5200" [(set (match_operand:SI 0 "nonimmediate_operand" "=m,?a,?a,r") (plus:SI (match_operand:SI 1 "general_operand" "%0,a,rJK,0") (match_operand:SI 2 "general_src_operand" "dIL,rJK,a,mrIKLi")))] "TARGET_COLDFIRE" "* return output_addsi3 (operands);") (define_insn "cfv4_cmpqi" [(set (cc0) (compare (match_operand:QI 0 "nonimmediate_operand" "mdKs,d") (match_operand:QI 1 "general_operand" "d,mdKs")))] "TARGET_CFV4" "* { if (REG_P (operands[1]) || (!REG_P (operands[0]) && GET_CODE (operands[0]) != MEM)) { cc_status.flags |= CC_REVERSED; #ifdef SGS_CMP_ORDER return \"cmp%.b %d1,%d0\"; #else return \"cmp%.b %d0,%d1\"; #endif } #ifdef SGS_CMP_ORDER return \"cmp%.b %d0,%d1\"; #else return \"cmp%.b %d1,%d0\"; #endif }") And the function called for HARD_REGNO_MODE_OK is: /* Value is 1 if hard register REGNO can hold a value of machine-mode MODE. On the 68000, the cpu registers can hold any mode except bytes in address registers, but the 68881 registers can hold only SFmode or DFmode. 
*/
int
m68k_hard_regno_mode_ok(int regno, enum machine_mode mode)
{
  if (regno < 8)
    {
      /* Data Registers can hold anything if it fits into the data registers */
      if ((regno) + GET_MODE_SIZE (mode) / 4 > 8)
        return 0;
      return 1;
    }
  else if (regno < 16)
    {
      /* Address Registers, can't hold bytes, can hold aggregate if fits in */
      if (GET_MODE_SIZE (mode) == 1)
        return 0;
      if (!((regno) < 8 && (regno) + GET_MODE_SIZE (mode) / 4 > 16))
        return 1;
    }
  else if (regno < 24)
    {
      /* FPU registers, hold float or complex float of long double or smaller */
      if ((GET_MODE_CLASS (mode) == MODE_FLOAT
           || GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT)
          && (((GET_MODE_UNIT_SIZE (mode) <= 12) && TARGET_68881)
              || ((GET_MODE_UNIT_SIZE (mode) <= 8) && TARGET_CFV4E)))
        return 1;
    }
  return 0;
}

Any further insight or suggestions are *really* appreciated!

-- Peter Barada [EMAIL PROTECTED]
Re: [m68k]: More trouble with byte moves into Address registers
>For some reason reload has decided that it needs ADDR_REGS for the >register being reloaded, namely (reg:QI 1420). So gcc looks for a >register in ADDR_REGS which can hold QImode. Because of your changes, >it doesn't find one. So it crashes. > >The question is why reload thinks that it needs ADDR_REGS for this >register. Look at the local-alloc debugging dump to see where >regclass thinks that the register should go. Which debugging dump has the output from "local-alloc"? If its pp_pack.c.24.lreg, then that is the output I supplied in the original message which contains(for all bits regarding register 1420 up until the compilation fails): Register 1420 costs: DATA_REGS:84 GENERAL_REGS:210 DATA_OR_FP_REGS:294 ALL_REGS:294 MEM:441 Register 1420 pref DATA_REGS Register 1420 costs: DATA_REGS:84 GENERAL_REGS:210 DATA_OR_FP_REGS:294 ALL_REGS:294 MEM:441 Register 1420 used 3 times across 6 insns; set 1 time; 1 bytes; pref DATA_REGS. Registers live at end: 14 [%a6] 15 [%sp] 24 [argptr] 31 35 36 37 38 39 43 45 46 47 48 50 52 57 70 1369 1370 1371 1378 1391 1402 1413 1420 1725 1734 1735 1736 1738 1740 Registers live at start: 14 [%a6] 15 [%sp] 24 [argptr] 31 35 36 37 38 39 43 45 46 47 48 50 52 57 70 1369 1370 1371 1378 1391 1402 1413 1420 1725 1734 1735 1736 1738 1740 ;; Start of basic block 694, registers live: 14 [%a6] 15 [%sp] 24 [argptr] 31 35 36 37 38 39 43 45 46 47 48 50 52 57 70 1369 1370 1371 1378 1391 1402 1413 1725 1734 1735 1736 1738 1740 1756 (note 6967 5556 5558 694 [bb 694] NOTE_INSN_BASIC_BLOCK) (insn 5558 6967 5559 694 pp_pack.c:2144 (set (reg:QI 1420) (mem:QI (reg:SI 1756 [ s ]) [0 S1 A8])) 43 {movqi_cfv4} (nil) (expr_list:REG_DEAD (reg:SI 1756 [ s ]) (nil))) (insn 5559 5558 5560 694 pp_pack.c:2144 (set (reg:SI 1421) (plus:SI (subreg:SI (reg:QI 1420) 0) (const_int -32 [0xffe0]))) 121 {*addsi3_5200} (insn_list 5558 (nil)) (nil)) (insn:QI 5560 5559 5561 694 pp_pack.c:2144 (set (cc0) (compare (subreg:QI (reg:SI 1421) 3) (const_int 64 [0x40]))) 15 {cfv4_cmpqi} 
(insn_list 5559 (nil)) (expr_list:REG_DEAD (reg:SI 1421) (nil))) BTW: the patterns mentioned in the dump: (define_insn "movqi_cfv4" [(set (match_operand:QI 0 "nonimmediate_operand" "=d,dmU,U,d") (match_operand:QI 1 "general_src_operand" "dmi,d,Ui,di"))] "TARGET_CFV4" "* return output_move_qimode (operands);") (define_insn "*addsi3_5200" [(set (match_operand:SI 0 "nonimmediate_operand" "=m,?a,?a,r") (plus:SI (match_operand:SI 1 "general_operand" "%0,a,rJK,0") (match_operand:SI 2 "general_src_operand" "dIL,rJK,a,mrIKLi")))] "TARGET_COLDFIRE" "* return output_addsi3 (operands);") (define_insn "cfv4_cmpqi" [(set (cc0) (compare (match_operand:QI 0 "nonimmediate_operand" "mdKs,d") (match_operand:QI 1 "general_operand" "d,mdKs")))] "TARGET_CFV4" "* { if (REG_P (operands[1]) || (!REG_P (operands[0]) && GET_CODE (operands[0]) != MEM)) { cc_status.flags |= CC_REVERSED; #ifdef SGS_CMP_ORDER return \"cmp%.b %d1,%d0\"; #else return \"cmp%.b %d0,%d1\"; #endif } #ifdef SGS_CMP_ORDER return \"cmp%.b %d0,%d1\"; #else return \"cmp%.b %d1,%d0\"; #endif }") And the function called for HARD_REGNO_MODE_OK is: /* Value is 1 if hard register REGNO can hold a value of machine-mode MODE. On the 68000, the cpu registers can hold any mode except bytes in address registers, but the 68881 registers can hold only SFmode or DFmode. 
*/
int
m68k_hard_regno_mode_ok(int regno, enum machine_mode mode)
{
  if (regno < 8)
    {
      /* Data Registers can hold anything if it fits into them */
      if (((regno) + GET_MODE_SIZE (mode) / 4) <= 8)
        return 1;
    }
  else if (regno < 16)
    {
      /* Address Registers can't hold bytes, can hold aggregate if it fits into them */
      if (GET_MODE_SIZE (mode) == 1)
        return 0;
      if (((regno) + GET_MODE_SIZE (mode) / 4) <= 16)
        return 1;
    }
  else if (regno < 24)
    {
      /* FPU registers, hold float or complex float of long double or smaller */
      if ((GET_MODE_CLASS (mode) == MODE_FLOAT
           || GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT)
          && (((GET_MODE_UNIT_SIZE (mode) <= 12) && TARGET_68881)
              || ((GET_MODE_UNIT_SIZE (mode) <= 8) && TARGET_CFV4E)))
        return 1;
    }
  return 0;
}

Any further insight or suggestions are *really* appreciated!

-- Peter Barada [EMAIL PROTECTED]
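To make the intended behaviour of the function above easier to poke at outside of GCC, here is a toy model of it. Everything in it is an assumption made for illustration: struct toy_mode, toy_hard_regno_mode_ok and the fp_max_size parameter are hypothetical, a mode is reduced to a kind plus a size in bytes, and the TARGET_68881/TARGET_CFV4E distinction is collapsed into the single size limit passed by the caller. The register numbering (0-7 data, 8-15 address, 16-23 FPU) follows the checks in the function above.

```c
#include <assert.h>

/* A machine mode reduced to what the check needs. */
enum toy_kind { TOY_INT, TOY_FLOAT };

struct toy_mode {
    enum toy_kind kind;
    int size;            /* size in bytes */
};

/* Returns 1 if hard register REGNO may hold a value of mode M.
   FP_MAX_SIZE is the largest float size the FPU accepts (an
   assumption standing in for the target flags). */
static int toy_hard_regno_mode_ok(int regno, struct toy_mode m, int fp_max_size)
{
    assert(m.size > 0);
    if (regno < 0)
        return 0;
    if (regno < 8)                      /* data registers */
        return regno + m.size / 4 <= 8;
    if (regno < 16) {                   /* address registers */
        if (m.size == 1)                /* no byte values in %aN */
            return 0;
        return regno + m.size / 4 <= 16;
    }
    if (regno < 24)                     /* FPU registers */
        return m.kind == TOY_FLOAT && m.size <= fp_max_size;
    return 0;
}
```

In this model a 1-byte integer is accepted in register 0 (a data register) but rejected in register 8 (an address register), which is the restriction the thread is trying to get reload to respect.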
Re: [m68k]: More trouble with byte moves into Address registers
SS (GET_MODE (X)) == MODE_FLOAT) \ ? ((TARGET_68881|TARGET_CFV4E) && (CLASS == FP_REGS || CLASS == DATA_OR_FP_REGS) \ ? FP_REGS : NO_REGS) \ : (TARGET_PCREL \ && (GET_CODE (X) == SYMBOL_REF || GET_CODE (X) == CONST \ || GET_CODE (X) == LABEL_REF))\ ? ADDR_REGS \ : (CLASS)) -- Peter Barada [EMAIL PROTECTED]
Re: [m68k]: More trouble with byte moves into Address registers
>> Would it help to rearrange the constraints to have reg +=
>> mem|reg|constant before the addreg += ... ?
>
>Probably not in this case. You could try it. It is true that when
>two alternatives have the same cost, reload will pick the first one
>listed.

With my luck that will cause a bigger problem somewhere else. :) I'll try it once I get past this.

>> >I don't know where else register 1421 is being used, so my tentative
>> >guess would be that gcc is picking an address register based on the
>> >constraints in addsi3_5200. Perhaps you need to change "?a" to "*a".
>> >After all, you probably don't want to encourage pseudos to go into the
>> >address registers merely because you add values to them.
>>
>> 1421 is only used in 5559 and 5560. from the lreg dump:
>>
>> Register 1421 used 2 times across 2 insns in block 694; set 1 time; pref
>> DATA_REGS.
>
>That is odd. Your earlier e-mail seemed to show that register 1421
>was specifically put into an address register, specifically %a0. Why
>would that happen if it has a preference for DATA_REGS? Was it
>assigned by local-alloc? In that case you would see a line like ";;
>Register 2421 in NNN." in the .lreg dump file.

Uggh. I've been trying out changes in the meantime, so I no longer have the original compiler to regenerate the output.

>> The problem is that there is no valid QImode instruction that can move
>> values in/out of an address register
>
>I know. I'm suggesting that QImode values have to move in and out of
>address registers via data registers, so you just put the QImode value
>into the data register, and then move it into the address register, or
>vice-versa.

Is there another backend in GCC I can look at for examples of how this is done?

>> Which looks like it allows QImode into ADDR_REGS instead
>> of insisting on DATA_REGS. Do you think this should be:
>
>No, that will break reload. If it calls PREFERRED_RELOAD_CLASS with
>ADDR_REGS, then it has already selected an alternative which requires
>ADDR_REGS. Returning a register class which does not permit any
>register in ADDR_REGS will give you a constraint violation later on.

Well, I tried it anyway and was able to build a complete C toolchain (including glibc-2.3.2), but it blew up at the same point building perl, with a constraint violation of:

pp_pack.c: In function `S_unpack_rec':
pp_pack.c:2220: error: insn does not satisfy its constraints:
(insn 5559 8594 8595 694 pp_pack.c:2144 (set (reg:SI 8 %a0 [1421])
        (plus:SI (reg:SI 0 %d0)
            (const_int -32 [0xffe0]))) 121 {*addsi3_5200} (insn_list 5558 (nil))
    (nil))
pp_pack.c:2220: internal compiler error: in reload_cse_simplify_operands, at postreload.c:391

This would be easy to fix by adding "?r/r/mi" to addsi3_5200 and emitting code that first does 'move.l %1,%0' and then appends the string from output_addsi3(). That is assuming that changing PREFERRED_RELOAD_CLASS this way doesn't severely break reload.

Where in reload would you think it decides that it absolutely has to have an ADDR_REG for 1421? I'll dig into it with gdb, but there's so much code in reload that a clue or two would *really* help :)

I'll undo the change to PREFERRED_RELOAD_CLASS, and then change the '?a' to '*a' in addsi3_5200 to see if that helps reload to not pick an ADDR_REG for the value. If it still fails, I'll regenerate all the information as I did in the 2nd email to you.

Thanks!

-- Peter Barada [EMAIL PROTECTED]
Re: [m68k]: More trouble with byte moves into Address registers
>For some reason reload has decided that it needs ADDR_REGS for the
>register being reloaded, namely (reg:QI 1420). So gcc looks for a
>register in ADDR_REGS which can hold QImode. Because of your changes,
>it doesn't find one. So it crashes.
>
>The question is why reload thinks that it needs ADDR_REGS for this
>register. Look at the local-alloc debugging dump to see where
>regclass thinks that the register should go.

Ok, will do. The previous suggestion of converting from ?a to *a with the original version of PREFERRED_RELOAD_CLASS causes the build of glibc to crash with:

../sysdeps/generic/printf_fphex.c: In function `__printf_fphex':
../sysdeps/generic/printf_fphex.c:490: error: unable to find a register to spill in class `ADDR_REGS'
../sysdeps/generic/printf_fphex.c:490: error: this is the insn:
(insn 3081 3080 3075 216 ./_itowa.h:58 (set (subreg:SI (reg/v:QI 49 [ leading ]) 0)
        (plus:SI (subreg:SI (reg/v:QI 49 [ leading ]) 0)
            (const_int 1 [0x1]))) 121 {*addsi3_5200} (nil)
    (nil))
../sysdeps/generic/printf_fphex.c:490: confused by earlier errors, bailing out

I'm rebuilding the toolchain without the ?a change.

What in the .lreg dump am I looking for that will tell me "where regclass thinks that the register should go"? Is it:

;; Register 1421 in 0.

-- Peter Barada [EMAIL PROTECTED]
Re: [m68k]: More trouble with byte moves into Address registers
>> pp_pack.c:2220: error: unable to find a register to spill in class
>> `ADDR_REGS'
>> pp_pack.c:2220: error: this is the insn:
>> (insn 5559 5558 5560 694 pp_pack.c:2144 (set (reg:SI 8 %a0 [1421])
>> (plus:SI (subreg:SI (reg:QI 1420) 0)
>> (const_int -32 [0xffe0]))) 121 {*addsi3_5200} (insn_list
>> 5558 (nil))
>
>You might want to look at CANNOT_CHANGE_MODE_CLASS, which can be used to
>change how reload handles (subreg:SI (reg:QI)). That might help avoid
>generating QImode ADDR_REG reloads in the first place.
>
>But if they are generated, then you need second reloads to resolve them
>as Ian mentioned. There is probably no way to avoid implementing this.
>
>You should also look at MODES_TIEABLE_P, which may also help prevent
>getting QImode ADDR_REG reloads.
>
>Even if you fix both of these, you will probably still need the
>secondary reload support for this case.

I added:

    /* Try to suppress (subreg:SI (reg:QI)) from ending up in ADDR_REGS */
    #define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) \
      (((FROM) == QImode || (TO) == QImode) \
       ? reg_classes_intersect_p (ADDR_REGS, (CLASS)) : 0)

to my gcc-3.4.3 for ColdFire v4e, and that allows perl-5.9.2 to build. I'm in the midst of rebuilding the entire linux kernel and user environment to make sure it didn't cause anything else to go wrong, but so far, so good.

I have my fingers crossed that this is both correct and sufficient. (Should I try (GET_MODE_SIZE (FROM) == 1 || GET_MODE_SIZE (TO) == 1) instead?) I'm really leery of twisting any more knobs without a *clear* understanding of what impact the knobs actually have, and the documentation just isn't giving me that.

I've got a "Using and Porting GNU CC" manual for rev 2.95, and am looking around for a newer one and can't find it anywhere. Does anyone know if a newer printed manual is available (and if so, where I can find it)?

Eventually I'll have to try my changes on gcc-4.0 to see what that does.

-- Peter Barada [EMAIL PROTECTED]
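The effect of that macro can be modelled without GCC to see which (FROM, TO, CLASS) combinations it forbids. The sketch below is an illustration only: the toy_ names are hypothetical, register classes are represented as bitmasks over sixteen hard registers (bits 0-7 data, bits 8-15 address), and toy_classes_intersect_p is a stand-in for reg_classes_intersect_p written for this model rather than GCC's implementation.

```c
#include <assert.h>

enum toy_mode { TOY_QI, TOY_HI, TOY_SI };

/* Register classes as bitmasks: bit N set means hard register N
   belongs to the class. */
#define TOY_DATA_REGS    0x00ffu
#define TOY_ADDR_REGS    0xff00u
#define TOY_GENERAL_REGS 0xffffu

/* Stand-in for reg_classes_intersect_p: do the classes share a register? */
static int toy_classes_intersect_p(unsigned a, unsigned b)
{
    return (a & b) != 0;
}

/* Model of the macro: a mode change to or from QImode is refused for
   any class that could place the value in an address register. */
static int toy_cannot_change_mode_class(enum toy_mode from, enum toy_mode to,
                                        unsigned rclass)
{
    assert(rclass != 0);
    if (from == TOY_QI || to == TOY_QI)
        return toy_classes_intersect_p(TOY_ADDR_REGS, rclass);
    return 0;
}
```

Note that in this model the check also fires for TOY_GENERAL_REGS, since that class overlaps the address registers; only a class made up purely of data registers lets a QImode subreg through.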
Re: [m68k]: More trouble with byte moves into Address registers
>> I've got a "Using and Porting GNU CC" manual for rev 2.95, and am
>> looking around for a newer one and can't find it anywhere. Does
>> anyone know if a newer printed manual is available(and if so, where I
>> can find it)?
>
>At the risk of stating the dreadfully obvious, the manual is online at
>http://gcc.gnu.org/onlinedocs/gccint/
>and you can print your own copy by running "make dvi" and using your
>favorite DVI printing program, or running dvipdf to produce a PDF.

Yes, I've done that before, when I had punched paper and a duplex laser printer. Here at home I have neither.

>If you mean a printed and bound book published by somebody else, I
>don't think there is a newer one available.

I like the printed book since I can dog-ear pages and scribble notes in it. As it is, my 2.95 version's binding is nearly falling apart :)

-- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>>> The alternative of course is to do only crossbuilds. Is it reasonable
>>> to say that, for platforms where a bootstrap is no longer feasible, a
>>> successful crossbuild is an acceptable test procedure to use instead?
>>>
>> Sure, and get flamed and trounced by Uli on glibc when you talk
>> about problems with crossbuilding.
>
>He'll flame you anyway for talking about platforms
>that are no longer relevant to the desktop.
>(Well, maybe he wouldn't. See http://lwn.net/Articles/19297 )
>
>A successful crossbuild is certainly the minimum conceivable standard.
>Perhaps one should also require bootstrapping the C compiler alone;
>that would provide at least some sanity-checking.

Unfortunately for some of the embedded targets (like the ColdFire V4e work I'm doing), a bootstrap is impossible due to limited memory and no usable mass-storage device on the hardware I have available, so hopefully a successful crossbuild will suffice.

-- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>Well, yes. 1 second/file is still slow! I want "make" to complete >instantaneously! Don't you? Actually I want it to complete before I even start, but I don't want to get too greedy. :) What's really sad is that for cross-compilation of the toolchain, we have to repeat a few steps (build gcc twice, build glibc twice) because glibc and gcc assume that a near-complete environment is available(such as gcc needing headers, and glibc needing -lgcc-eh), so even really fast machines(2.4Ghz P4) take an hour to do a cross-build from scratch. -- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>> What's really sad is that for cross-compilation of the toolchain, we >> have to repeat a few steps (build gcc twice, build glibc twice) >> because glibc and gcc assume that a near-complete environment is >> available(such as gcc needing headers, and glibc needing -lgcc-eh), so >> even really fast machines(2.4Ghz P4) take an hour to do a cross-build >> from scratch. > >This could be made substantially easier if libgcc moved to the top >level. You wanna help out with that? Uh, ok. What do you mean by "move to the top level"? -- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>> What's really sad is that for cross-compilation of the toolchain, we
>> have to repeat a few steps (build gcc twice, build glibc twice)
>> because glibc and gcc assume that a near-complete environment is
>> available(such as gcc needing headers, and glibc needing -lgcc-eh), so
>> even really fast machines(2.4Ghz P4) take an hour to do a cross-build
>> from scratch.
>
>That sounds comparable to the time required to build RTEMS toolsets. I
>just looked at the timestamp on the build logs for a gcc 4.0.0 CVS build
>with newlib 1.13.0 and it is on the order of 60-90 minutes per target on
>a 2.4 Ghz P4 w/512 MB RAM. This is just C and C++ and the variance is
>probably mostly due to the number of multilibs.

This is for an m68k-linux build (with coldfire-linux config for glibc), and it's only the C compiler, so adding C++ will obviously make it take longer.

>A 2.4 Ghz P4 isn't what I would consider an obsolete machine and it took
>90 minutes for "make" -- not a full bootstrap.

Even on a 3.0Ghz P4 with HT, 1Gb DDR and a hardware RAID with SATA drives it takes about 30 minutes, so there's a *lot* of work going on, and I'd call that machine near cutting-edge.

-- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>Not that I'm really complaining: you can get quite a lot of mileage >out of multiple CPUs as it is, more than enough (in my opinion) to >justify purchasing some nice build servers by software shops that do a >lot of GCC work. (I won't post the actual bootstrap times out of fear >of being lynched.) This might show up more as people start moving >towards dual-core and/or multiple CPU systems even on the low end. That's great if the software can be cross-built. As it is, cross-building a toolchain requires a lot of extra work, and if it weren't for Dan Kegel's commitment, I dare say it would be near impossible. I've watched the sometimes near-indifference to the problems we have trying to put together toolchains for non-hosted environments. Even when I have a cross-toolchain, it's still a *long* uphill battle since there are too many OSS packages out there that can't cross-configure/compile (openssh, perl as examples off the top of my head) without a *lot* of work. It's just that it takes a lot of time and work to cross-build a non-x86 Linux environment to verify any changes in the toolchain. And comments like "get a faster machine" are a non-starter. -- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>> Unfortunately for some of the embedded targets (like the ColdFire V4e >> work I'm doing), a bootstrap is impossible due to limited memory and >> no usable mass-storage device on the hardware I have available, so >> hopefully a successful crossbuild will suffice. > >How about a successful crossbuild plus >passing some regression test suite, >e.g. gcc's, glibc's, and/or ltp's? >Any one of them would provide a nice reality check. I'm open to running them if there's a *really* clear how-to that takes remote hardware into account. -- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>We're not talking about 5% speedup; if the linker starts thrashing because >of insufficient memory you pay far more than that. And certainly anyone >with an older computer who is dissatisfied with its performance, but >doesn't have a lot of money, should look into getting more memory before >anything else. Still, the GNU project shouldn't be telling people in the >third world with cast-off machines that they are out of luck; to many of >them, 256M is more than they have. Also don't forget us embedded people who are *desperately* trying to do native compilations using an NFSroot with limited main memory and don't have a disk in the hardware design to swap to. -- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>> Also don't forget us embedded people that are *desperately* trying to >> do native compilations using an NFSroot with limited main memory and >> don't have a disk in the hardware design to swap to. > >Why would you work in such a crippled environment? Agh! Believe me, I do as much work as I can on a 3+GHz, 2GB DDR x86 box, but then I'm literally screwed by the plethora of Linux packages that just can't cross-build because their configure thinks it can build/run test programs to figure out things like byte ordering, etc. Take perl, zlib, and openssh as examples. Also there are so many interdependencies between packages that we have to build a pile of libraries and support stuff that is never used on the target just so we can get a package that we do need to configure/build (like sed and perl). Until package maintainers take cross-compilation *seriously*, I have no choice but to do native compilation of a large hunk of the packages on eval boards, which can literally take *DAYS* to build. We embedded Linux developers have been harping on this for the past couple of years, but no one really takes our problem seriously. Instead we keep getting "get faster hardware" as the patent cure-all for execution speed problems, but in my case, there is no other hardware I can use. -- Peter Barada [EMAIL PROTECTED]
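To make the byte-ordering complaint above concrete, here is a minimal sketch in C. The first function is the sort of probe a configure script compiles *and runs* on the build machine, which is exactly what breaks under cross-compilation. The second asks the compiler instead, so nothing has to run on the target. The function names are made up for illustration, and the `__BYTE_ORDER__` / `__ORDER_LITTLE_ENDIAN__` macros are an assumption: they are predefined by later GCC releases, not necessarily by the compilers discussed in this thread.

```c
#include <stdint.h>
#include <string.h>

/* Run-time probe: store the value 1 in a 32-bit word and look at the
   byte at the lowest address. This answers the question for the machine
   the program RUNS on, so it is useless if it can only run on the build
   host while the code is meant for a different target. */
int runtime_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Compile-time alternative: ask the (cross-)compiler about the target.
   Assumes the compiler predefines __BYTE_ORDER__ and
   __ORDER_LITTLE_ENDIAN__; returns -1 when it does not. */
int compiletime_is_little_endian(void)
{
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
    return __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;
#else
    return -1;
#endif
}
```

When the program is built and run natively, the two functions should agree; when cross-compiling, only the second one can be evaluated without target hardware.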
Re: GCC 4.1: Buildable on GHz machines only?
>But AFAICT even the developers who work on embedded targets focus >on code quality and new features, instead of on the compile time >and memory footprint issues that you would expect their group of >users to complain about. I think that most of us embedded developers are trying to keep up with where GCC is going. Personally I spend most of my time in gcc/config/m68k instead of the optimizers since it's the target description that I know, not the optimizers. Also the mainline developers for x86 don't have the constraints that we have, so it's a case of "out of sight, out of mind", and a batch of them have those glitzy workstations that they build native code for instead of the hardware us embedded developers have. Since I don't have any choice but to build natively on what to GCC developers is "crippled hardware" (only 263 BogoMips), it takes somewhere around 20 times as long to build the packages, and a "minor" 3% slowdown means it takes a *lot* longer to go through a build cycle. This also means that I can't track snapshots, since they show up faster than I can find the raw compute time to just build everything while hoping that the build doesn't blow its brains out due to a "minor" increase in memory consumption. -- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>> Also there are so many interdependencies >> between packages that we have to build a pile of libraries and support >> stuff that is never used on the target just so we can get a package >> that we do need to configure/build (like sed and perl). > >Please give me as much information as possible on sed. AFAIK, >configuring with --disable-nls should be enough to skip libiconv, >libintl, etc. and cross-build. I don't think sed has a problem cross-building; it's all the junk that each package uses in its configure that compounds the problem if it *has* to be natively built. -- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>What I have problem understanding is the last sentence of this paragraph >in the light of your claim that it will result in swapping especially >when we consider developers' machines with 512MB/1GB RAM, i.e. machines >where memory is not "tight". Sure, and this is the point. Pick a number for the RSS and stick to it. Crank it up if you don't want to be bothered, but this "yardstick" can be used to measure trends in the compiler's footprint by measuring the number of page swaps. It might also be a way to measure/improve the locality of reference, since poor locality will cause thrashing if the RSS is set low enough. Of course if the RSS is set too low then *any* pattern of page access will cause thrashing. -- Peter Barada [EMAIL PROTECTED]
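One way the "yardstick" above could be sketched in C: fix a soft RSS limit, run the compiler, then read back how many major page faults were charged. This is only an illustration under assumptions. The function names are made up, `RLIMIT_RSS` is not in POSIX (it is available on Linux and the BSDs), and whether the kernel actually enforces it depends on the OS and version, so on some systems the memory would have to be constrained some other way for the fault counts to mean anything.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Set the soft RSS limit to `bytes`, never above the hard limit.
   Returns 0 on success, -1 on error. Enforcement of RLIMIT_RSS is
   OS-dependent (assumption: the kernel in use honours it). */
int set_rss_limit(rlim_t bytes)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_RSS, &lim) != 0)
        return -1;
    if (lim.rlim_max != RLIM_INFINITY && bytes > lim.rlim_max)
        bytes = lim.rlim_max;
    lim.rlim_cur = bytes;
    return setrlimit(RLIMIT_RSS, &lim);
}

/* Major page faults (the ones that needed I/O) charged to `who`:
   RUSAGE_SELF for this process, RUSAGE_CHILDREN for children that have
   terminated and been waited for, e.g. a finished compiler run.
   Returns -1 on error. */
long major_faults(int who)
{
    struct rusage usage;

    if (getrusage(who, &usage) != 0)
        return -1;
    return usage.ru_majflt;
}
```

A driver would call `set_rss_limit()`, fork and wait for the compiler, and then log `major_faults(RUSAGE_CHILDREN)` per release to watch the trend.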
Re: GCC 4.1: Buildable on GHz machines only?
>BS. Even the large distro builders do cross compilations a lot. Yeah, I know. I did consulting for a 'large distro builder'. Do you have a clue how long it takes to build the base packages for a PXA255 board (including X11 that won't even run on the board but is required due to package dependencies)? Can you think in *days*? And that was more than a year ago. Even then we were all concerned about the trend in compilation speed. Speak of what you *personally* know. >I am getting pretty sick of this. Can we now start discussing >what GCC does do well, or otherwise, for further complaints >remove me from the CC: please. I've pulled you from the CC list, but I'm passing it on to the GCC list in hopes that someone there cares more than you. The RSS bloat problem is *not* going to go away, and *wishing* it away won't help. >I can't say all is good about GCC. There are always ways to do >things better. But, as Dewar already pointed out, GCC just can >not be perfect for everyone's needs. I, for one, am very happy >that we are finally pulling GCC out of the 80s, into the 21st >century. The compile time and memory consumption problems are >obviously there, but just complaining is not going to fix them. No, gcc is not perfect for all things, but the trend in resource consumption is getting pretty serious. As others have pointed out before, no one complains about a resource problem until it gets large enough to make things inconvenient, if not impossible. You don't complain to your car dealer when your car runs fine, but if it craps out on the way to work, you'll be complaining pretty damn loudly, especially if it's nearly brand new. I develop GCC for ColdFire, and I have been contributing back changes to GCC in the hopes that it will be a world-class compiler that I can use for my work.
Unfortunately, due to circumstances that have *nothing* to do with GCC, I have no choice but to build packages using a GCC that runs natively in a Linux environment on my ColdFire V4e embedded board, where the resource constraints are *extremely* severe, and possibly an extra MB of RSS usage by GCC version-to-version will be the difference between success and failure. I have great faith in OSS and FSF code, and I don't want to demean the valued contributions that people have made to it, but please understand that Linux systems are built using GCC, whether it's for a workstation or an embedded Linux device, and as such GCC *should* consider the problems that both encounter and not just favor the workstation end. -- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>> Yes, but Ralf was complaining about embedded cross-compiling development >> for RTEMS. I have not tried to reply to Peter Barada who complains about >> GCC inability to be run on embedded targets directly. > >Logically Peter's situation is the same as the NetBSD issue with >building and testing on old hosts. He is running GNU/Linux on >ColdFire and I suspect his ColdFire target is probably faster and better >equipped than the old UNIX boxes the BSD folks mentioned. It's a 266MHz ColdFire v4e machine, about 263 BogoMips, 1/20 the BogoMips of my workstation, and with an NFS rootfs, it gets network bound pretty rapidly and runs even slower compared to a NetBSD machine with a local disk :) -- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>> It's a 266MHz ColdFire v4e machine, about 263 BogoMips, 1/20 the >> BogoMips of my workstation, and with an NFS rootfs, it gets network >> bound pretty rapidly and runs even slower compared to a NetBSD machine >> with a local disk :) > >I would have thought the CPU itself was comparable to or faster than >a 68040 or 68060 since they were always at much lower clock speeds. Oh, it's faster than the 040/060 since those topped out at 75MHz, but at the same time, the more restricted addressing modes/instructions require more instructions to achieve the same amount of work, though on the whole it is faster (the infamous RISC debate). >Do you have any local disk or is everything NFS? If so, that would >be killer for performance. I remember an old pair of SparcStation 10's >we used to have. Network builds vs. local disk was already like 5-10x >slower. It's currently NFS all the way. :) >How much RAM? 128MB. I do have some experimental kernel hacks in to allow swapping via NFS, so you can understand why it can take *days* to build stuff. -- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>> It's a 266MHz ColdFire v4e machine, about 263 BogoMips, 1/20 the >> BogoMips of my workstation, and with an NFS rootfs, it gets network >> bound pretty rapidly and runs even slower compared to a NetBSD machine >> with a local disk :) > >Hmmm, GHz wise and BogoMips wise, this is about half what I have (a 550 >MHz G4 PowerBook). > >Nevertheless, you don't hear *me* complain ... No, but I don't have a disk, so everything has to come over the network. I also don't have much RAM, so if I start running out of RAM, it slows to a crawl since it can't cache any source. >I build GCC while at work (i.e., while away from the notebook at home :-) > >Try it ... it works, Huh? I can cross-compile GCC; it's all the packages that require native configuration/building that are the problem. -- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>> ATM, I am experiencing objects being generated by GCC to increase in >> size and forcing us to gradually abandon targets with tight memory >> requirements. At least one cause of this seems to be GCC abandoning COFF >> in favor of ELF, which seems to imply larger memory requirements. > >I don't see at all why this should increase target memory requirements, >can you elucidate? Perhaps it is that the ELF executables are larger in total size (not code size) than the corresponding COFF executables, and if they are stored on the target, then the footprint increases, causing "larger memory requirements" of flash, not DRAM. -- Peter Barada [EMAIL PROTECTED]
Re: GCC 4.1: Buildable on GHz machines only?
>> It's a 266MHz ColdFire v4e machine, about 263 BogoMips, 1/20 the >> BogoMips of my workstation, and with an NFS rootfs, it gets network > >BogoMips are called BogoMips because they are not comparable among >different CPUs. All they measure is how often the CPU needs to run a >particular near-empty loop to delay a certain time. I know exactly what a BogoMips is. >There usually is a small factor which can convert between BogoMips and CPU >MHz for every CPU model. It would seem to be 1 for your ColdFire; it >happens to be 1/2 for my Athlon (bogomips: 2287.20, cpu MHz: 1145.142). > >Comparisons like yours are worse than meaningless. I wouldn't call it meaningless. I don't have other benchmark numbers for the chip, and it was meant to show that it isn't a blazingly fast processor (as compared to desktop machines). -- Peter Barada [EMAIL PROTECTED]
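The conversion factor argued about above is plain arithmetic, sketched here in C using only the figures quoted in the thread (263 BogoMips at 266 MHz for the ColdFire, 2287.20 BogoMips at 1145.142 MHz for the Athlon). The function name is made up for illustration.

```c
/* MHz per BogoMips: the per-CPU-model factor the quoted poster refers to.
   Roughly 1 means one pass of the delay loop per clock; roughly 1/2 means
   the loop is credited with two "BogoMips" per MHz. */
double mhz_per_bogomips(double bogomips, double mhz)
{
    return mhz / bogomips;
}
```

With the quoted numbers, `mhz_per_bogomips(263.0, 266.0)` comes out close to 1 and `mhz_per_bogomips(2287.20, 1145.142)` close to 1/2, so a raw BogoMips ratio between the two machines overstates the clock-rate gap by about a factor of two. It still says nothing about real throughput on either chip.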
problems configuring/building gcc/libstdc++ for ColdFire v4e
I'm trying to cross-build a Linux toolchain for a ColdFire v4e, where I have to pass -mcfv4e to the compiler to select the ColdFire v4e, as well as to the linker (to make sure it picks the right library). I have this working when building gcc-3.4.3 with --enable-languages=c, but when I try to build with --enable-languages=c,c++, it fails while building libstdc++. What is the preferred method of configuring gcc such that options will be passed to the cross-compiler, as well as to the linker, when building libraries? The reason I need to pass options is that the ColdFire is a variant of the m68k, and needs options to select ColdFire-specific behavior. Thanks in advance! -- Peter Barada [EMAIL PROTECTED]
Re: Minimum/maximum operators are deprecated?
>> It was an ill-defined and poorly maintained language extension that was >> broken in many cases. Proper replacements exist in standard C++: > >I am well aware of std::min/max. But they are not what I would call a 'proper >replacement', but that probably depends on the point of view. > >I just hit an ICE when it comes to overwriting those operators. Maybe nobody >has ever tried to do this before :-) So there is probably no point in >submitting a bug report? Your best hope of getting a bug fixed is to submit it to bugzilla with a minimal testcase. Hopefully some kind volunteer will spend their valuable time to fix it. But if you don't report it, tough, don't complain about it... -- Peter Barada [EMAIL PROTECTED]
CVS access to the uberbaum tree
Does the uberbaum tree exist on savannah, or is it only on sources.redhat.com? If so, what is the procedure for accessing it? Thanks in advance... -- Peter Barada [EMAIL PROTECTED]
Re: warning: '' may be used uninitialized in this function
>I am somewhat confused about the status of the >"may be used uninitialized" warning... This list is more for discussing the internals of the GCC compiler, not how to use it. As for your question, if cnt is less than or equal to zero, or if a[i] is always less than zero, then the assignment to best.d never happens in the loop, which leaves trash in best.d since best is allocated off the stack and holds trash until initialized. Hence the warning for reading a possibly uninitialized variable. Initialize best.d where you initialize best.score to quiet the warning. -- Peter Barada [EMAIL PROTECTED]
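The original poster's code is not quoted in this message, so the following is only a hypothetical reconstruction in C of the situation described above, with the suggested fix applied. The struct layout, the function name, and the sentinel value -1 are assumptions. The names cnt, a, best.score and best.d follow the message.

```c
struct best_pick {
    int score;  /* best value seen so far */
    int d;      /* index at which it was seen */
};

/* Returns the index of the largest non-negative element of a[0..cnt-1],
   or -1 if there is none. */
int pick_best(const int *a, int cnt)
{
    struct best_pick best;
    int i;

    best.score = -1;
    best.d = -1;    /* the fix: without this line, best.d is only written
                       inside the loop and holds stack trash when cnt <= 0
                       or every a[i] is negative */

    for (i = 0; i < cnt; i++) {
        if (a[i] > best.score) {
            best.score = a[i];
            best.d = i;
        }
    }
    return best.d;
}
```

Removing the `best.d = -1;` line recreates the case where the compiler cannot prove that best.d is assigned before the final read, which is the kind of code the warning is aimed at.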
Re: Feature request - a macro defined for GCC
> I think an important point was missed in the discussion. Some seem to > focus on the dishonest definition of __GNUC__ by non-GNU C compilers. > That was not my point. My point is that if __GNUC__ is defined by > CPP, not the GNU C compiler proper, (and this seems to be supported by > the CPP Manual,) and any (non-GNU) C compiler can use CPP, then those > non-GNU C compilers would "inadvertently" define __GNUC__ and lead > people to believe that they are GNU C. That is why I think the GNU C > compiler should define a macro independently from CPP. Or, > alternatively, __GNUC__ should be defined by the GCC compiler proper, > not CPP. And do what with the preprocessor symbol? If the symbol is defined by the compiler *after* preprocessing occurs (as in by the compiler and not the preprocessor), then it can't be used to selectively preprocess code... -- Peter Barada [EMAIL PROTECTED]
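A short C sketch of why the symbol has to exist at preprocessing time: the `#ifdef` below is resolved by the preprocessor, so the discarded branch never reaches the compiler proper. A macro that only came into existence after preprocessing could not steer this choice. The `UNUSED_PARAM` macro and the function names are made up for illustration.

```c
/* Resolved entirely by the preprocessor: only one return statement
   survives into the text the compiler proper sees. */
int preprocessor_saw_gnuc(void)
{
#ifdef __GNUC__
    return 1;
#else
    return 0;
#endif
}

/* Typical use: enable a GNU extension only where __GNUC__ is defined,
   and expand to nothing elsewhere. */
#ifdef __GNUC__
#define UNUSED_PARAM __attribute__((unused))
#else
#define UNUSED_PARAM
#endif

int ignore_argument(int value UNUSED_PARAM)
{
    return 0;
}
```

Running the file through the preprocessor alone (for GCC, the `-E` option) shows that the `__attribute__` text is already present or absent before any compilation happens.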