Re: Bug in expand_builtin_setjmp_receiver ?
On 21 October 2010 16:49, Nathan Froyd wrote:
>> Is it easy to test lm32 on some simulator?
>
> lm32 has a gdb simulator available, so it should be fairly easy to write
> a board file for it if one doesn't already exist.
>
> Unfortunately, building lm32-elf is broken in several different ways
> right now.

OK... what's the best way forward on this? Do we just leave it as it is and wait until an official port complains about it? Should it be filed in Bugzilla?

Cheers,
Fred
question on ssa representation of aggregates
Hi,

In the paper "Memory SSA-A Unified Approach for Sparsely Representing Memory Operations", section 2.2, it says:

"Whenever possible, compiler will create symbolic names to represent distinct regions inside aggregates (called structure field tags or SFT). For instance, in Figure 2(b), GCC will create three SFT symbols for this structure, namely SFT.0 for A.x, SFT.1 for A.b and SFT.2 for A.a"

I tried GCC 4.4.1 (MIPS target) with the following piece of code:

---start
struct tag_1
{
  int *i;
  int *j;
  int *x;
  int y;
}a;
struct tag_2
{
  struct tag_1 t1[100];
  int x[200];
  int *y;
}s;
int func(int **p)
{
  int *c = *p;
  if (a.y > 0)
    s.y = *p1;
  else
    *c = *s.y;
  return 0;
}
---end

The "055t.alias" dump looks like this:

---start
func (int * * p)
{
  int * c;
  int * gp.2;
  int g.1;
  int D.1352;
  int * D.1351;
  int * D.1349;
  int * * p1.0;
  int D.1345;

:
  # VUSE
  c_2 = *p_1(D);
  # VUSE
  D.1345_3 = a.y;
  if (D.1345_3 > 0)
    goto ;
  else
    goto ;

:
  # VUSE
  p1.0_4 = p1;
  # VUSE
  D.1349_5 = *p1.0_4;
  # s_18 = VDEF
  s.y = D.1349_5;
  goto ;

:
  # VUSE
  D.1351_6 = s.y;
  # VUSE
  D.1352_7 = *D.1351_6;
  # g_21 = VDEF
  # a_22 = VDEF
  # s_23 = VDEF
  # SMT.14_24 = VDEF
  *c_2 = D.1352_7;
---end

It seems structures a and s are treated as array variables; no SFT is created. Did I miss anything, or is the implementation different?

Thanks.

--
Best Regards.
Re: peephole2: dead regs not marked as dead
Ian Lance Taylor wrote:
> Georg Lay writes:
>
>> Regs that are "naturally" dead because the function ends are not marked as
>> dead, and therefore some optimization opportunities pass by unnoticed, e.g.
>> together with recog.c::peep2_reg_dead_p() et al.
>
> I don't understand what you mean. All registers other than the return
> register, stack pointer, and frame pointer die at the end of the
> function, and they should be marked accordingly. Can you give an
> example?

Unfortunately, not all dead regs are marked as dead. The example is from a private port, for this C source:

int f (int);

int and (int x)
{
    return f (x & 0x00018000);
}

After .ira, the RTL looks like this:

(insn 9 8 21 2 peep2.c:5 (set (reg:SI 15 d15)
        (and:SI (reg:SI 4 d4 [ x ])
            (const_int 98304 [0x18000]))) 433 {*and3_zeroes-2.insert.ic} (nil))

(insn 21 9 10 2 peep2.c:5 (set (reg:SI 4 d4)
        (reg:SI 15 d15)) 2 {*movsi_insn} (nil))

(call_insn/j 10 21 11 2 peep2.c:5 (parallel [
            (set (reg:SI 2 d2)
                (call (mem:HI (symbol_ref:SI ("f") [flags 0x41] ) [0 S2 A16])
                    (const_int 0 [0x0])))
            (use (const_int 1 [0x1]))
        ]) 92 {call_value_insn} (nil)
    (expr_list:REG_DEP_TRUE (use (reg:SI 4 d4))
        (nil)))
;; End of basic block 2 -> ( 1)
;; lr out 2 [d2] 26 [SP] 27 [a11]
;; live out 2 [d2] 26 [SP] 27 [a11]

;; Succ edge EXIT [100.0%] (ab,sibcall)

(barrier 11 10 20)

The first insn, the AND, has an early-clobber output operand, D15. Functions receive their first arg in D4, so reload generates a move.
Then the first insn gets split after reload and before peephole2:

(insn 22 8 23 2 peep2.c:5 (set (reg:SI 15 d15)
        (and:SI (reg:SI 4 d4 [ x ])
            (const_int -98305 [0xfffe7fff]))) 143 {*and3_zeroes.insert.{SI}.ic} (nil))

(insn 23 22 21 2 peep2.c:5 (set (reg:SI 15 d15)
        (xor:SI (reg:SI 15 d15)
            (reg:SI 4 d4 [ x ]))) 39 {*xorsi3} (nil))

(insn 21 23 10 2 peep2.c:5 (set (reg:SI 4 d4)
        (reg:SI 15 d15)) 2 {*movsi_insn} (nil))

(call_insn/j 10 21 11 2 peep2.c:5 (parallel [
            (set (reg:SI 2 d2)
                (call (mem:HI (symbol_ref:SI ("f") [flags 0x41] ) [0 S2 A16])
                    (const_int 0 [0x0])))
            (use (const_int 1 [0x1]))
        ]) 92 {call_value_insn} (nil)
    (expr_list:REG_DEP_TRUE (use (reg:SI 4 d4))
        (nil)))
;; End of basic block 2 -> ( 1)
;; lr out 2 [d2] 26 [SP] 27 [a11]
;; live out 2 [d2] 26 [SP] 27 [a11]

;; Succ edge EXIT [100.0%] (ab,sibcall)

(barrier 11 10 20)

The second insn (XOR) and the third insn (SET) could be combined into one insn because the xor insn can handle three different regs. This is the peep2:

(define_peephole2
  [(set (match_operand:SI 0 "register_operand" "")
        (match_operator:SI 4 "tric_s10_operator"
                           [(match_operand:SI 1 "register_operand" "")
                            (match_operand:SI 2 "reg_or_s10_operand" "")]))
   (set (match_operand:SI 3 "register_operand" "")
        (match_dup 0))]
  "peep2_reg_dead_p (2, operands[0])"
  ...

with XOR an element of "tric_s10_operator".

This peep2 fails because op0, in this case D15, is not marked as dead, or rather peep2_reg_dead_p does not report it as dead.

D15 is a call-saved register. The architecture automatically saves regs in CALL and restores them in RETURN, so most functions have no prologue (except where SP has to be changed) and no epilogue except a RETURN. D15 is advantageous in many instructions even though it is call-saved.

I already tried to fix this by introducing a different return pattern, i.e. a PARALLEL of return and a bunch of clobbers of unused regs. That fixes this problem but has many other disadvantages compared to a simple return.

Georg Lay
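The post-reload split in this message replaces an AND with mask 0x18000 by an AND with the inverted mask (0xfffe7fff) followed by an XOR with the original value. That rests on the identity (x & ~m) ^ x == x & m. A minimal sketch in plain C, not taken from the port: the names and_direct, and_via_xor and check_identity are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the single-insn form before the split, x & mask. */
uint32_t and_direct(uint32_t x, uint32_t mask)
{
    return x & mask;
}

/* Hypothetical helper mirroring the two insns after the split:
     tmp = x & ~mask;   like insn 22 (AND with 0xfffe7fff for mask 0x18000)
     tmp = tmp ^ x;     like insn 23 (XOR)
   Bits outside the mask cancel (x ^ x == 0); bits inside the mask
   were cleared by the AND, so the XOR copies them back from x. */
uint32_t and_via_xor(uint32_t x, uint32_t mask)
{
    uint32_t tmp = x & ~mask;
    return tmp ^ x;
}

/* Assert that both forms agree for one input pair. */
void check_identity(uint32_t x, uint32_t mask)
{
    assert(and_via_xor(x, mask) == and_direct(x, mask));
}
```

Calling check_identity(x, 0x18000u) for any x should pass. The sketch says nothing about register liveness; it only shows why the two-insn sequence computes the same value as the original AND.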
Re: Describing multi-register values in RTL
Frédéric RISS writes:

>> The lower subreg pass will do that for you if you have the right set of
>> insns.
>
> Could you expand a bit on what the 'right set of instructions' is or
> even better give an example of an md file where we could find an
> example?

E.g., on a 32-bit system, start with a normal adddi3 insn which just does

  (set (reg:DI) (plus:DI (op) (op)))

That will work for combine, the RTL CSE and loop optimizers, etc. Then have a splitter for that insn into something like

  (parallel
    (set (reg:SI) (plus:SI (op-low) (op-low)))
    (set (reg:SI) (plus:SI
                    (plus:SI (op-high) (op-high))
                    (truncate:SI
                      (lshiftrt:DI
                        (plus:DI (zero_extend:DI (op-low))
                                 (zero_extend:DI (op-high

That split will happen after the RTL passes which care about the DImode add. The point of the parallel is to express a DImode addition as a pair of SImode additions, adding in the carry bit to the upper value.

The lower-subreg pass will run after the split. It will see that the value is accessed only as SImode registers and will split it into two independent SImode registers. That will let the register allocator handle them separately.

Then you need another splitter which takes the parallel above and splits it into two independent insns, which can be scheduled independently.

Ian
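The arithmetic that the parallel in this message expresses, a DImode add done as two SImode adds with the carry folded into the upper half, can be sketched in C. This is only an illustration under assumptions: the struct di_pair type and all function names are hypothetical, and the shift count of 32 is assumed because the RTL in the message is cut off before the shift amount.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of 32-bit halves standing in for a DImode value. */
struct di_pair {
    uint32_t low;
    uint32_t high;
};

/* Split a 64-bit value into its two 32-bit halves. */
struct di_pair split_di(uint64_t v)
{
    struct di_pair p;
    p.low = (uint32_t)v;
    p.high = (uint32_t)(v >> 32);
    return p;
}

/* Rebuild the 64-bit value from its halves. */
uint64_t join_di(struct di_pair p)
{
    return ((uint64_t)p.high << 32) | p.low;
}

/* 64-bit add expressed as two 32-bit adds.  The carry is computed the way
   the parallel describes it: zero-extend both low halves, add them in
   64 bits, shift right (by 32, an assumption), truncate to 32 bits. */
struct di_pair add_di_as_si(struct di_pair a, struct di_pair b)
{
    struct di_pair r;
    uint32_t carry = (uint32_t)(((uint64_t)a.low + (uint64_t)b.low) >> 32);

    r.low = a.low + b.low;             /* low add, wraps modulo 2^32 */
    r.high = a.high + b.high + carry;  /* high add plus carry from low */
    return r;
}

/* Assert that the split form matches a direct 64-bit addition. */
void check_add_di(uint64_t a, uint64_t b)
{
    assert(join_di(add_di_as_si(split_di(a), split_di(b))) == a + b);
}
```

Once the value is only ever touched through its two halves, as in add_di_as_si, each half can live in its own 32-bit register, which is the situation the lower-subreg pass is described as exploiting.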
Re: Questions about selective scheduler and PowerPC
On 10/20/2010 7:48 PM, Jie Zhang wrote:
>> Running CPU2006, with the hack removed I see about a 1% improvement in
>> specint (10% in 456.hmmer, a couple others in the 3% range, -3%
>> 401.bzip2) and a 1% degradation in specfp (mainly due to a 13%
>> degradation in 435.gromacs). But 454.calculix also fails for me (output
>> miscompare), so assume we're generating incorrect code for some reason
>> with the hack removed.
>
> Thanks for benchmarking! Since there is a bug in max_issue, issue_rate
> is not really honored. Could you try this patch
>
> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01719.html
>
> with and without the hack?

With your patch applied I see pretty similar results as before, except for a couple additional specint benchmarks that degraded a couple percent with the hack removed.

-Pat
Re: Questions about selective scheduler and PowerPC
On 10/23/2010 01:50 AM, Pat Haugen wrote:
> On 10/20/2010 7:48 PM, Jie Zhang wrote:
>>> Running CPU2006, with the hack removed I see about a 1% improvement in
>>> specint (10% in 456.hmmer, a couple others in the 3% range, -3%
>>> 401.bzip2) and a 1% degradation in specfp (mainly due to a 13%
>>> degradation in 435.gromacs). But 454.calculix also fails for me (output
>>> miscompare), so assume we're generating incorrect code for some reason
>>> with the hack removed.
>>
>> Thanks for benchmarking! Since there is a bug in max_issue, issue_rate
>> is not really honored. Could you try this patch
>>
>> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01719.html
>>
>> with and without the hack?
>
> With your patch applied I see pretty similar results as before, except
> for a couple additional specint benchmarks that degraded a couple
> percent with the hack removed.

Thanks for testing! Seems rs6000 port still has to keep that hack for now.

--
Jie Zhang
CodeSourcery
The Linux binutils 2.20.51.0.12 is released
This is the beta release of binutils 2.20.51.0.12 for Linux, which is based on binutils 2010 1020 in CVS on sourceware.org plus various changes. It is purely for Linux.

All relevant patches in patches have been applied to the source tree. You can take a look at patches/README to see what has been applied and in what order.

Starting from the 2.20.51.0.4 release, no diffs against the previous release will be provided.

You can enable both gold and bfd ld with --enable-gold=both. Gold will be installed as ld.gold and bfd ld will be installed as ld.bfd. By default, ld.bfd will be installed as ld. You can use the configure option, --enable-gold=both/gold to choose gold as the default linker, ld. IA-32 binary and x86_64 binary tar balls are configured with --enable-gold=both/ld --enable-plugins --enable-threads.

Starting from the 2.18.50.0.4 release, the x86 assembler no longer accepts

	fnstsw %eax

fnstsw stores 16bit into %ax and the upper 16bit of %eax is unchanged. Please use

	fnstsw %ax

Starting from the 2.17.50.0.4 release, the default output section LMA (load memory address) has changed for allocatable sections from being equal to VMA (virtual memory address), to keeping the difference between LMA and VMA the same as the previous output section in the same region.

For

	.data.init_task : { *(.data.init_task) }

LMA of .data.init_task section is equal to its VMA with the old linker. With the new linker, it depends on the previous output section. You can use

	.data.init_task : AT (ADDR(.data.init_task)) { *(.data.init_task) }

to ensure that LMA of .data.init_task section is always equal to its VMA. The linker script in the older 2.6 x86-64 kernel depends on the old behavior. You can add AT (ADDR(section)) to force LMA of .data.init_task section equal to its VMA. It will work with both old and new linkers. The x86-64 kernel linker script in kernel 2.6.13 and above is OK.
The new x86_64 assembler no longer accepts

	monitor %eax,%ecx,%edx

You should use

	monitor %rax,%ecx,%edx

or

	monitor

which works with both old and new x86_64 assemblers. They should generate the same opcode.

The new i386/x86_64 assemblers no longer accept instructions for moving between a segment register and a 32bit memory location, i.e.,

	movl (%eax),%ds
	movl %ds,(%eax)

To generate instructions for moving between a segment register and a 16bit memory location without the 16bit operand size prefix, 0x66,

	mov (%eax),%ds
	mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The assembler starting from 2.16.90.0.1 will also support

	movw (%eax),%ds
	movw %ds,(%eax)

without the 0x66 prefix. Patches for 2.4 and 2.6 Linux kernels are available at

http://www.kernel.org/pub/linux/devel/binutils/linux-2.4-seg-4.patch
http://www.kernel.org/pub/linux/devel/binutils/linux-2.6-seg-5.patch

The ia64 assembler now defaults to tuning for Itanium 2 processors. To build a kernel for Itanium 1 processors, you will need to add

ifeq ($(CONFIG_ITANIUM),y)
	CFLAGS += -Wa,-mtune=itanium1
	AFLAGS += -Wa,-mtune=itanium1
endif

to arch/ia64/Makefile in your kernel source tree.

Please report any bugs related to binutils 2.20.51.0.12 to hjl.to...@gmail.com and http://www.sourceware.org/bugzilla/

Changes from binutils 2.20.51.0.11:

1. Update from binutils 2010 1020.
2. Add plugin support to ld.
3. Support mixing REL and RELA relocations.
4. Mark ELF linker generated dynamic symbols as ELF. PR 11812.
5. Improve ar/nm plugin support. PR 12004/12088.
6. Avoid unnecessary relaxation in assembler. PR 12049.
7. Add .d32 suffix support to x86 assembler to force 32bit displacement.
8. Improve ELF linker diagnostic with incompatible inputs. PR 11944/11933.
9. Speed up ELF linker hash table size computation. PR 11843.
10. Update elfedit to update ELF OSABI.
11. Fix linker for moving location counter backwards. PR 12066.
12. Fix invalid memory access in ELF linker. PR 11946.
13. Fix ld --build-id crash with non-ELF input. PR 11937.
14. Update expression evaluation in linker scripts.
15. Properly dump relocation addend as signed in readelf.
16. Fix x86-64 Window 64bit immediate in assembler. PR 11974.
17. Fix readelf crashes. PR 11889.
18. Fix objcopy. PR 11953.
19. Fix a linker crash. PR 11939.
20. Improve handling of invalid ELF section flags in assembler. PR 12011.
21. Improve gold.
22. Improve VMS support.
23. Improve Windows SEH support.
24. Improve alpha support.
25. Improve arm support.
26. Improve bfin support.
27. Improve mips support.
28. Improve spu support.
29. Improve tic6x support.

Changes from binutils 2.20.51.0.10:

1. Update from binutils 2010 0810.
2. Properly support compressed debug sections in all binutils programs. Add --compress-debug-sections/--decompress-debug-sections to objcopy. PR 11819.
3. Fix linker crash on undefined symbol errors with DWARF. PR 11817.
4. Don't gen
Re: Bug in expand_builtin_setjmp_receiver ?
Frederic Riss writes:

> On 21 October 2010 16:49, Nathan Froyd wrote:
>>> Is it easy to test lm32 on some simulator?
>>
>> lm32 has a gdb simulator available, so it should be fairly easy to write
>> a board file for it if one doesn't already exist.
>>
>> Unfortunately, building lm32-elf is broken in several different ways
>> right now.
>
> OK... what's the best way forward on this? Do we just leave it as it
> is and wait until an official port complains about it? Should it
> be filed in Bugzilla?

Did you just happen to come across this, or is this relevant for a port you are working on? If you are not working on a port, then I think the best thing to do right now is to add a FIXME comment in the source code.

Ian
Re: question on ssa representation of aggregates
"Amker.Cheng" writes: >In paper "Memory SSA-A Unified Approach for Sparsely Representing > Memory Operations", > Did I miss anything or the implementation is different? Thanks. The implementation of this stuff changes fairly regularly. The people who like this kind of thing are still honing in on the best way to handle aliasing information. Richard Guenther is the main guy working in this area today. Ian
Re: peephole2: dead regs not marked as dead
Georg Lay writes: > Unfortunately, not all dead regs are marked as dead. OK, you have a good example. And my response is: it seems to me that d15 should be marked as dead. So the question is why that is not happening. I don't know the answer. Ian
G++ test suite picking up incorrect libstdc++
Hi --

I'm seeing test suite failures in g++ caused by linking with the wrong libstdc++.so.

It looks like g++.exp always appends the default directory

	append flags -L${gccpath}/libstdc++-v3/src/.libs

instead of

	append flags -L${gccpath}//libstdc++-v3/src/.libs

Has anyone else run into this problem? Is this supposed to work in a different way? Has anyone come up with a fix?

--
Michael Eager ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: G++ test suite picking up incorrect libstdc++
On 10/22/2010 08:43 PM, Michael Eager wrote:
> Hi --
>
> I'm seeing test suite failures in g++ caused by
> linking with the wrong libstdc++.so.
>
> It looks like g++.exp always appends the default
> directory
> append flags -L${gccpath}/libstdc++-v3/src/.libs
> instead of
> append flags -L${gccpath}//libstdc++-v3/src/.libs

Without having looked into the issue in any detail, the issue seems weird to me: for sure many people regularly build multilib (myself and HJ on gcc-testresults included, for example) without any problem whatsoever. I would suggest figuring out first what's special about your setup.

Paolo.
Re: G++ test suite picking up incorrect libstdc++
Paolo Carlini wrote:
> On 10/22/2010 08:43 PM, Michael Eager wrote:
>> Hi --
>>
>> I'm seeing test suite failures in g++ caused by
>> linking with the wrong libstdc++.so.
>>
>> It looks like g++.exp always appends the default
>> directory
>> append flags -L${gccpath}/libstdc++-v3/src/.libs
>> instead of
>> append flags -L${gccpath}//libstdc++-v3/src/.libs
>
> Without having looked into the issue in any detail, the issue seems
> weird to me: for sure many people regularly build multilib (myself and
> HJ on gcc-testresults included, for example) without any problem
> whatsoever. I would suggest figuring out first what's special about your
> setup.

I don't know that there's anything special about my setup. g++.exp is adding -L paths to the wrong libstdc++ directory. When running GCC tests, only the -B option is added. The correct multilib directory is selected by the gcc driver.

Do you run "make check" with default options, or do you specify compiler options which should result in linking non-default C++ libraries?

I'm going to run the test_installed script. This should use the gcc driver to select the multilib, rather than g++.exp.

--
Michael Eager ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: G++ test suite picking up incorrect libstdc++
On Fri, Oct 22, 2010 at 1:35 PM, Michael Eager wrote:
> Paolo Carlini wrote:
>>
>> On 10/22/2010 08:43 PM, Michael Eager wrote:
>>>
>>> Hi --
>>>
>>> I'm seeing test suite failures in g++ caused by
>>> linking with the wrong libstdc++.so.
>>>
>>> It looks like g++.exp always appends the default
>>> directory
>>> append flags -L${gccpath}/libstdc++-v3/src/.libs
>>> instead of
>>> append flags -L${gccpath}//libstdc++-v3/src/.libs
>
>> Without having looked into the issue in any detail, the issue seems
>> weird to me: for sure many people regularly build multilib (myself and
>> HJ on gcc-testresults included, for example) without any problem
>> whatsoever. I would suggest figuring out first what's special about your
>> setup.
>
> I don't know that there's anything special about my setup.
> g++.exp is adding -L paths to the wrong libstdc++ directory.
> When running GCC tests, only the -B option is added. The
> correct multilib directory is selected by the gcc driver.
>
> Do you run "make check" with default options, or do you
> specify compiler options which should result in linking
> non-default c++ libraries?

I use

# make check RUNTESTFLAGS="--target_board 'unix{-m32,}'"

to test both 32bit/64bit on Intel64.

H.J.
Re: question on ssa representation of aggregates
> The implementation of this stuff changes fairly regularly. The people > who like this kind of thing are still honing in on the best way to > handle aliasing information. Richard Guenther is the main guy working > in this area today. thanks very much for clarification. -- Best Regards.