Re: Bug in expand_builtin_setjmp_receiver ?

2010-10-22 Thread Frederic Riss
On 21 October 2010 16:49, Nathan Froyd  wrote:
>> Is it easy to test lm32 on some simulator?
>
> lm32 has a gdb simulator available, so it should be fairly easy to write
> a board file for it if one doesn't already exist.
>
> Unfortunately, building lm32-elf is broken in several different ways
> right now.

OK... what's the best way forward on this? Do we just leave it as it
is and wait until an official port complains about it? Should it
be filed in bugzilla?

Cheers,
Fred



question on ssa representation of aggregates

2010-10-22 Thread Amker.Cheng
Hi :
   In paper "Memory SSA-A Unified Approach for Sparsely Representing
Memory Operations",
section 2.2, it says :

"Whenever possible, compiler will create symbolic names to represent distinct
regions inside aggregates(called structure field tags or SFT). For instance,
in Figure 2(b), GCC will create three SFT symbols for this structure, namely
SFT.0 for A.x, SFT.1 for A.b and SFT.2 for A.a"

I tried GCC 4.4.1 (MIPS target) with the following piece of code:
---start
struct tag_1
{
  int *i;
  int *j;
  int *x;
  int y;
}a;
struct tag_2
{
  struct tag_1 t1[100];
  int x[200];
  int *y;
}s;
int **p1;  /* assumed global: missing from the post, but the dump below loads p1 */
int func(int **p)
{
  int *c = *p;
  if (a.y > 0)
    s.y = *p1;
  else
    *c = *s.y;

  return 0;
}
---end
The "055t.alias" dump looks like this:
---start
func (int * * p)
{
  int * c;
  int * gp.2;
  int g.1;
  int D.1352;
  int * D.1351;
  int * D.1349;
  int * * p1.0;
  int D.1345;

:
  # VUSE 
  c_2 = *p_1(D);
  # VUSE 
  D.1345_3 = a.y;
  if (D.1345_3 > 0)
goto ;
  else
goto ;

:
  # VUSE 
  p1.0_4 = p1;
  # VUSE 
  D.1349_5 = *p1.0_4;
  # s_18 = VDEF 
  s.y = D.1349_5;
  goto ;

:
  # VUSE 
  D.1351_6 = s.y;
  # VUSE 
  D.1352_7 = *D.1351_6;
  # g_21 = VDEF 
  # a_22 = VDEF 
  # s_23 = VDEF 
  # SMT.14_24 = VDEF 
  *c_2 = D.1352_7;
---end.

It seems structures a and s are treated as array variables; no SFT is created.

Did I miss anything, or is the implementation different? Thanks.
-- 
Best Regards.


Re: peephole2: dead regs not marked as dead

2010-10-22 Thread Georg Lay
Ian Lance Taylor schrieb:
> Georg Lay  writes:
> 
>> Regs that are "naturally" dead because the function ends are not marked as
>> dead, and therefore some optimization opportunities pass by unnoticed, e.g.
>> together with recog.c::peep2_reg_dead_p() et al.
> 
> I don't understand what you mean.  All registers other than the return
> register, stack pointer, and frame pointer die at the end of the
> function, and they should be marked accordingly.  Can you give an
> example?

Unfortunately, not all dead regs are marked as dead. The example is from a
private port, for this C source:

int f (int);

int and (int x)
{
  return f (x & 0x00018000);
}

After .ira, RTL looks like this:


(insn 9 8 21 2 peep2.c:5 (set (reg:SI 15 d15)
(and:SI (reg:SI 4 d4 [ x ])
(const_int 98304 [0x18000]))) 433 {*and3_zeroes-2.insert.ic} (nil))

(insn 21 9 10 2 peep2.c:5 (set (reg:SI 4 d4)
(reg:SI 15 d15)) 2 {*movsi_insn} (nil))

(call_insn/j 10 21 11 2 peep2.c:5 (parallel [
(set (reg:SI 2 d2)
(call (mem:HI (symbol_ref:SI ("f") [flags 0x41]  ) [0 S2 A16])
(const_int 0 [0x0])))
(use (const_int 1 [0x1]))
]) 92 {call_value_insn} (nil)
(expr_list:REG_DEP_TRUE (use (reg:SI 4 d4))
(nil)))

;; End of basic block 2 -> ( 1)
;; lr  out   2 [d2] 26 [SP] 27 [a11]
;; live  out 2 [d2] 26 [SP] 27 [a11]
;; Succ edge  EXIT [100.0%]  (ab,sibcall)

(barrier 11 10 20)

The first insn, AND, has an early-clobber output operand, D15. Functions
receive their first argument in D4, so reload generates a move.

Then the first insn gets split after reload and before peephole2:

(insn 22 8 23 2 peep2.c:5 (set (reg:SI 15 d15)
(and:SI (reg:SI 4 d4 [ x ])
(const_int -98305 [0xfffe7fff]))) 143 {*and3_zeroes.insert.{SI}.ic}
(nil))

(insn 23 22 21 2 peep2.c:5 (set (reg:SI 15 d15)
(xor:SI (reg:SI 15 d15)
(reg:SI 4 d4 [ x ]))) 39 {*xorsi3} (nil))

(insn 21 23 10 2 peep2.c:5 (set (reg:SI 4 d4)
(reg:SI 15 d15)) 2 {*movsi_insn} (nil))

(call_insn/j 10 21 11 2 peep2.c:5 (parallel [
(set (reg:SI 2 d2)
(call (mem:HI (symbol_ref:SI ("f") [flags 0x41]  ) [0 S2 A16])
(const_int 0 [0x0])))
(use (const_int 1 [0x1]))
]) 92 {call_value_insn} (nil)
(expr_list:REG_DEP_TRUE (use (reg:SI 4 d4))
(nil)))
;; End of basic block 2 -> ( 1)
;; lr  out   2 [d2] 26 [SP] 27 [a11]
;; live  out 2 [d2] 26 [SP] 27 [a11]
;; Succ edge  EXIT [100.0%]  (ab,sibcall)

(barrier 11 10 20)
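
The split above rests on a bit identity: clearing the mask bits of x and then XORing the result with the original x leaves exactly the mask bits of x. A minimal C sketch of that identity, using the two constants from the dumps; the function names are hypothetical and are not taken from the port:

```c
#include <stdint.h>

/* Direct form: what insn 9 computes before the split. */
static uint32_t and_direct(uint32_t x)
{
    return x & UINT32_C(0x00018000);
}

/* Split form: insn 22 clears the mask bits (AND with 0xfffe7fff),
   then insn 23 XORs the intermediate result with the original x. */
static uint32_t and_split(uint32_t x)
{
    uint32_t t = x & UINT32_C(0xfffe7fff);  /* corresponds to insn 22 */
    return t ^ x;                           /* corresponds to insn 23 */
}
```

Both functions should agree for every x, which is why the intermediate value in D15 has no further use once the XOR result has been produced.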


The second insn (XOR) and the third insn (SET) could be combined into one insn
because the xor-insn can handle three different regs. This is the peep2:

(define_peephole2
  [(set (match_operand:SI 0 "register_operand" "")
(match_operator:SI 4 "tric_s10_operator"
   [(match_operand:SI 1 "register_operand"  "")
(match_operand:SI 2 "reg_or_s10_operand" "")]))
   (set (match_operand:SI 3 "register_operand" "")
(match_dup 0))]
  "peep2_reg_dead_p (2, operands[0])"
  ...

where XOR is an element of "tric_s10_operator".
This peep2 fails because op0, in this case D15, is not marked as dead; more
precisely, peep2_reg_dead_p does not report it as dead. D15 is a call-saved register.
The architecture automatically saves regs in CALL and restores them in RETURN,
so most functions have no prologue (except in cases SP has to be changed) and no
epilogue except a RETURN. D15 is advantageous in many instructions even though
it is call-saved.

I already tried to fix this by introducing a different return pattern, i.e. a
PARALLEL of a return and a bunch of clobbers of unused regs. That fixes this
problem but has many other disadvantages compared to a simple return.

Georg Lay


Re: Describing multi-register values in RTL

2010-10-22 Thread Ian Lance Taylor
Frédéric RISS  writes:

>> The lower subreg pass will do that for you if you have the right set of
>> insns.
>
> Could you expand a bit on what the 'right set of instructions' is or
> even better give an example of an md file where we could find an
> example?

E.g., on a 32-bit system, start with a normal adddi3 insn which just
does
   (set (reg:DI) (plus:DI (op) (op)))
That will work for combine, the RTL CSE and loop optimizers, etc.

Then have a splitter for that insn into something like
   (parallel
    (set (reg:SI) (plus:SI (op-low) (op-low)))
    (set (reg:SI) (plus:SI (plus:SI (op-high) (op-high))
                           (truncate:SI
                            (lshiftrt:DI
                             (plus:DI
                              (zero_extend:DI (op-low))
                              (zero_extend:DI (op-low)))
                             (const_int 32))))))

That split will happen after the RTL passes which care about the DImode
add.  The point of the parallel is to express a DImode addition as a
pair of SImode additions, adding in the carry bit to the upper value.

The lower-subreg pass will run after the split.  It will see that the
value is accessed only as SImode registers and will split it into two
independent SImode registers.  That will let the register allocator
handle them separately.

Then you need another splitter which takes the parallel above and splits
it into two independent insns, which can be scheduled independently.
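
The arithmetic behind such a split can be sketched in C. This is only an illustration of a 64-bit add expressed as two 32-bit adds with the carry out of the low-part addition added to the high part; the type and function names are hypothetical and nothing here is GCC code:

```c
#include <stdint.h>

/* A DImode value held as two SImode halves (hypothetical type). */
struct di_pair {
    uint32_t low;
    uint32_t high;
};

static struct di_pair add_di_by_halves(struct di_pair a, struct di_pair b)
{
    struct di_pair r;

    /* Widen the low halves so the carry lands in bit 32. */
    uint64_t wide = (uint64_t)a.low + (uint64_t)b.low;
    uint32_t carry = (uint32_t)(wide >> 32);

    r.low = a.low + b.low;             /* low-part add, wraps modulo 2^32 */
    r.high = a.high + b.high + carry;  /* high-part add plus the carry bit */
    return r;
}
```

Adding {low 0xffffffff, high 1} and {low 1, high 2} this way gives low 0 and high 4, the same as a single 64-bit addition.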

Ian


Re: Questions about selective scheduler and PowerPC

2010-10-22 Thread Pat Haugen

On 10/20/2010 7:48 PM, Jie Zhang wrote:
>> Running CPU2006, with the hack removed I see about a 1% improvement in
>> specint (10% in 456.hmmer, a couple others in the 3% range, -3%
>> 401.bzip2) and a 1% degradation in specfp (mainly due to a 13%
>> degradation in 435.gromacs). But 454.calculix also fails for me (output
>> miscompare), so assume we're generating incorrect code for some reason
>> with the hack removed.
>
> Thanks for benchmarking! Since there is a bug in max_issue, issue_rate
> is not really honored. Could you try this patch
>
> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01719.html
>
> with and without the hack?

With your patch applied I see pretty similar results as before, except
for a couple additional specint benchmarks that degraded a couple
percent with the hack removed.


-Pat




Re: Questions about selective scheduler and PowerPC

2010-10-22 Thread Jie Zhang

On 10/23/2010 01:50 AM, Pat Haugen wrote:
> On 10/20/2010 7:48 PM, Jie Zhang wrote:
>>> Running CPU2006, with the hack removed I see about a 1% improvement in
>>> specint (10% in 456.hmmer, a couple others in the 3% range, -3%
>>> 401.bzip2) and a 1% degradation in specfp (mainly due to a 13%
>>> degradation in 435.gromacs). But 454.calculix also fails for me (output
>>> miscompare), so assume we're generating incorrect code for some reason
>>> with the hack removed.
>>
>> Thanks for benchmarking! Since there is a bug in max_issue, issue_rate
>> is not really honored. Could you try this patch
>>
>> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01719.html
>>
>> with and without the hack?
>
> With your patch applied I see pretty similar results as before, except
> for a couple additional specint benchmarks that degraded a couple
> percent with the hack removed.

Thanks for testing! It seems the rs6000 port still has to keep that hack for now.

--
Jie Zhang
CodeSourcery


The Linux binutils 2.20.51.0.12 is released

2010-10-22 Thread H.J. Lu
This is the beta release of binutils 2.20.51.0.12 for Linux, which is
based on binutils 2010 1020 in CVS on sourceware.org plus various
changes. It is purely for Linux.

All relevant patches in the patches directory have been applied to the
source tree. You can take a look at patches/README to see what has been
applied and in what order.

Starting from the 2.20.51.0.4 release, no diffs against the previous
release will be provided.

You can enable both gold and bfd ld with --enable-gold=both.  Gold will
be installed as ld.gold and bfd ld will be installed as ld.bfd.  By
default, ld.bfd will be installed as ld.  You can use the configure
option, --enable-gold=both/gold to choose gold as the default linker,
ld.  IA-32 and x86_64 binary tarballs are configured with
--enable-gold=both/ld --enable-plugins --enable-threads.

Starting from the 2.18.50.0.4 release, the x86 assembler no longer
accepts

fnstsw %eax

fnstsw stores 16bit into %ax and the upper 16bit of %eax is unchanged.
Please use

fnstsw %ax

Starting from the 2.17.50.0.4 release, the default output section LMA
(load memory address) has changed for allocatable sections from being
equal to VMA (virtual memory address), to keeping the difference between
LMA and VMA the same as the previous output section in the same region.

For

.data.init_task : { *(.data.init_task) }

LMA of .data.init_task section is equal to its VMA with the old linker.
With the new linker, it depends on the previous output section. You
can use

.data.init_task : AT (ADDR(.data.init_task)) { *(.data.init_task) }

to ensure that LMA of .data.init_task section is always equal to its
VMA. The linker script in the older 2.6 x86-64 kernel depends on the
old behavior.  You can add AT (ADDR(section)) to force LMA of
.data.init_task section equal to its VMA. It will work with both old
and new linkers. The x86-64 kernel linker script in kernel 2.6.13 and
above is OK.
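
The change in the default LMA can be modelled with a few lines of C. This is a simplified sketch of the rule as described above, not linker code: it ignores memory regions and alignment, and the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Addresses of an already placed output section (hypothetical type). */
struct out_section {
    uint64_t vma;  /* virtual memory address */
    uint64_t lma;  /* load memory address */
};

/* Old default: the LMA of a new section equals its VMA. */
static uint64_t default_lma_old(uint64_t vma)
{
    return vma;
}

/* New default: keep the LMA - VMA difference of the previous
   output section in the same region. */
static uint64_t default_lma_new(struct out_section prev, uint64_t vma)
{
    return vma + (prev.lma - prev.vma);
}
```

With a previous section at VMA 0x1000 and LMA 0x100000, a following section at VMA 0x2000 gets LMA 0x2000 under the old rule and 0x101000 under the new one. Writing AT (ADDR(section)) in the script pins the LMA to the VMA in both cases.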

The new x86_64 assembler no longer accepts

monitor %eax,%ecx,%edx

You should use

monitor %rax,%ecx,%edx

or

monitor

which works with both old and new x86_64 assemblers. They should
generate the same opcode.

The new i386/x86_64 assemblers no longer accept instructions for moving
between a segment register and a 32bit memory location, i.e.,

movl (%eax),%ds
movl %ds,(%eax)

To generate instructions for moving between a segment register and a
16bit memory location without the 16bit operand size prefix, 0x66,

mov (%eax),%ds
mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The
assembler starting from 2.16.90.0.1 will also support

movw (%eax),%ds
movw %ds,(%eax)

without the 0x66 prefix. Patches for 2.4 and 2.6 Linux kernels are
available at

http://www.kernel.org/pub/linux/devel/binutils/linux-2.4-seg-4.patch
http://www.kernel.org/pub/linux/devel/binutils/linux-2.6-seg-5.patch

The ia64 assembler now tunes for Itanium 2 processors by default.
To build a kernel for Itanium 1 processors, you will need to add

ifeq ($(CONFIG_ITANIUM),y)
CFLAGS += -Wa,-mtune=itanium1
AFLAGS += -Wa,-mtune=itanium1
endif

to arch/ia64/Makefile in your kernel source tree.

Please report any bugs related to binutils 2.20.51.0.12 to
hjl.to...@gmail.com

and

http://www.sourceware.org/bugzilla/

Changes from binutils 2.20.51.0.11:

1. Update from binutils 2010 1020.
2. Add plugin support to ld. 
3. Support mixing REL and RELA relocations. 
4. Mark ELF linker generated dynamic symbols as ELF.  PR 11812.
5. Improve ar/nm plugin support.  PR 12004/12088.
6. Avoid unnecessary relaxation in assembler.  PR 12049.
7. Add .d32 suffix support to x86 assembler to force 32bit displacement.
8. Improve ELF linker diagnostic with incompatible inputs.  PR 11944/11933.
9. Speed up ELF linker hash table size computation.  PR 11843.
10. Update elfedit to update ELF OSABI.
11. Fix linker for moving location counter backwards.  PR 12066.
12. Fix invalid memory access in ELF linker.  PR 11946.
13. Fix ld --build-id crash with non-ELF input.  PR 11937.
14. Update expression evaluation in linker scripts.
15. Properly dump relocation addend as signed in readelf.
16. Fix x86-64 Window 64bit immediate in assembler.  PR 11974.
17. Fix readelf crashes.  PR 11889.
18. Fix objcopy.  PR 11953.
19. Fix a linker crash.  PR 11939.
20. Improve handling of invalid ELF section flags in assembler. PR 12011.
21. Improve gold.
22. Improve VMS support.
23. Improve Windows SEH support.
24. Improve alpha support.
25. Improve arm support.
26. Improve bfin support.
27. Improve mips support.
28. Improve spu support.
29. Improve tic6x support.

Changes from binutils 2.20.51.0.10:

1. Update from binutils 2010 0810.
2. Properly support compressed debug sections in all binutils programs.
Add --compress-debug-sections/--decompress-debug-sections to objcopy.
PR 11819.
3. Fix linker crash on undefined symbol errors with DWARF.  PR 11817.
4. Don't gen

Re: Bug in expand_builtin_setjmp_receiver ?

2010-10-22 Thread Ian Lance Taylor
Frederic Riss  writes:

> On 21 October 2010 16:49, Nathan Froyd  wrote:
>>> Is it easy to test lm32 on some simulator?
>>
>> lm32 has a gdb simulator available, so it should be fairly easy to write
>> a board file for it if one doesn't already exist.
>>
>> Unfortunately, building lm32-elf is broken in several different ways
>> right now.
>
> OK... what's the best way forward on this? Do we just leave it as it
> is and wait until an official port complains about it? Should it
> be filed in bugzilla?

Did you just happen to come across this, or is this relevant for a port
you are working on?

If you are not working on a port, then I think the best thing to do
right now is to add a FIXME comment in the source code.

Ian


Re: question on ssa representation of aggregates

2010-10-22 Thread Ian Lance Taylor
"Amker.Cheng"  writes:

>In paper "Memory SSA-A Unified Approach for Sparsely Representing
> Memory Operations",

> Did I miss anything or the implementation is different? Thanks.

The implementation of this stuff changes fairly regularly.  The people
who like this kind of thing are still honing in on the best way to
handle aliasing information.  Richard Guenther is the main guy working
in this area today.

Ian


Re: peephole2: dead regs not marked as dead

2010-10-22 Thread Ian Lance Taylor
Georg Lay  writes:

> Unfortunately, not all dead regs are marked as dead.

OK, you have a good example.  And my response is: it seems to me that
d15 should be marked as dead.  So the question is why that is not
happening.  I don't know the answer.

Ian


G++ test suite picking up incorrect libstdc++

2010-10-22 Thread Michael Eager

Hi --

I'm seeing test suite failures in g++ caused by
linking with the wrong libstdc++.so.

It looks like g++.exp always appends the default
directory
  append flags -L${gccpath}/libstdc++-v3/src/.libs
instead of
  append flags -L${gccpath}//libstdc++-v3/src/.libs

Has anyone else run into this problem?
Is this supposed to work in a different way?
Anyone come up with a fix?


--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


Re: G++ test suite picking up incorrect libstdc++

2010-10-22 Thread Paolo Carlini
On 10/22/2010 08:43 PM, Michael Eager wrote:
> Hi --
>
> I'm seeing test suite failures in g++ caused by
> linking with the wrong libstdc++.so.
>
> It looks like g++.exp always appends the default
> directory
>   append flags -L${gccpath}/libstdc++-v3/src/.libs
> instead of
>   append flags -L${gccpath}//libstdc++-v3/src/.libs
Without having looked into the issue in any detail, the issue seems
weird to me: for sure many people regularly build multilib (myself and
HJ on gcc-testresults included, for example) without any problem
whatsoever. I would suggest figuring out first what's special about your
setup.

Paolo.


Re: G++ test suite picking up incorrect libstdc++

2010-10-22 Thread Michael Eager

Paolo Carlini wrote:
> On 10/22/2010 08:43 PM, Michael Eager wrote:
>> Hi --
>>
>> I'm seeing test suite failures in g++ caused by
>> linking with the wrong libstdc++.so.
>>
>> It looks like g++.exp always appends the default
>> directory
>>   append flags -L${gccpath}/libstdc++-v3/src/.libs
>> instead of
>>   append flags -L${gccpath}//libstdc++-v3/src/.libs
>
> Without having looked into the issue in any detail, the issue seems
> weird to me: for sure many people regularly build multilib (myself and
> HJ on gcc-testresults included, for example) without any problem
> whatsoever. I would suggest figuring out first what's special about your
> setup.


I don't know that there's anything special about my setup.
g++.exp is adding -L paths to the wrong libstdc++ directory.
When running GCC tests, only the -B option is added.  The
correct multilib directory is selected by the gcc driver.

Do you run "make check" with default options, or do you
specify compiler options which should result in linking
non-default c++ libraries?

I'm going to run the test_installed script.  This should
use the gcc driver to select the multilib, rather than g++.exp.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


Re: G++ test suite picking up incorrect libstdc++

2010-10-22 Thread H.J. Lu
On Fri, Oct 22, 2010 at 1:35 PM, Michael Eager  wrote:
> Paolo Carlini wrote:
>>
>> On 10/22/2010 08:43 PM, Michael Eager wrote:
>>>
>>> Hi --
>>>
>>> I'm seeing test suite failures in g++ caused by
>>> linking with the wrong libstdc++.so.
>>>
>>> It looks like g++.exp always appends the default
>>> directory
>>>  append flags -L${gccpath}/libstdc++-v3/src/.libs
>>> instead of
>>>  append flags -L${gccpath}//libstdc++-v3/src/.libs
>
>> Without having looked into the issue in any detail, the issue seems
>> weird to me: for sure many people regularly build multilib (myself and
>> HJ on gcc-testresults included, for example) without any problem
>> whatsoever. I would suggest figuring out first what's special about your
>> setup.
>
> I don't know that there's anything special about my setup.
> g++.exp is adding -L paths to the wrong libstdc++ directory.
> When running GCC tests, only the -B option is added.  The
> correct multilib directory is selected by the gcc driver.
>
> Do you run "make check" with default options, or do you
> specify compiler options which should result in linking
> non-default c++ libraries?

I use

# make check RUNTESTFLAGS="--target_board 'unix{-m32,}'"

to test both 32bit/64bit on Intel64.

H.J.


Re: question on ssa representation of aggregates

2010-10-22 Thread Amker.Cheng
> The implementation of this stuff changes fairly regularly.  The people
> who like this kind of thing are still honing in on the best way to
> handle aliasing information.  Richard Guenther is the main guy working
> in this area today.

Thanks very much for the clarification.


-- 
Best Regards.