LRA reloads of subregs

2015-09-03 Thread David Miller

I'm working on converting sparc to LRA, and thanks probably to the
work the powerpc folks did this is going much better than when I
last tried this.

The first major stumbling block I've run into is when LRA forces
a reload for a SUBREG, and specifically there is a MEM involved
that itself needs a reload due to having an invalid address.

For example, simplify_operand_subreg() is working on this insn:

(insn 18631 1099 1100 14 (set (reg:SI 13423)
(subreg:SI (mem/c:QI (plus:SI (reg/f:SI 101 %sfp)
(const_int -14269 [0xc843])) [0 %sfp+-14269 S1 
A8]) 0)) x.c:104 63 {*movsi_insn}
 (expr_list:REG_DEAD (reg:QI 287)
(nil)))

lra_emit_move() (via insert_move_for_subreg()) is called (here, 'reg'
is the MEM expression).

Because the expression is a MEM, all of the special cased code in
lra_emit_move() meant to avoid invalid displacements and indexes is
not used, and it just performs a plain emit_move_insn().

Calling emit_move_insn() does not work properly because it emits code
which needs reloads, to handle the too large CONST_INT offset in the
MEM expression.

We abort because lra_process_new_insns() expects everything emitted
by insert_move_for_subreg() to be recognizable, and with that too
large offset it cannot.

I wonder why another target's LRA conversion hasn't hit this :-)

Vlad I wonder how you'd like this to be handled?  The code to handle
this kind of situation is there in the process_address infrastructure.


Re: LRA reloads of subregs

2015-09-03 Thread David Miller
From: Segher Boessenkool 
Date: Thu, 3 Sep 2015 20:26:51 -0500

> On Thu, Sep 03, 2015 at 03:33:56PM -0700, David Miller wrote:
>> (insn 18631 1099 1100 14 (set (reg:SI 13423)
>> (subreg:SI (mem/c:QI (plus:SI (reg/f:SI 101 %sfp)
>> (const_int -14269 [0xc843])) [0 %sfp+-14269 
>> S1 A8]) 0)) x.c:104 63 {*movsi_insn}
>>  (expr_list:REG_DEAD (reg:QI 287)
>> (nil)))
> 
>> I wonder why another target's LRA conversion hasn't hit this :-)
> 
> Maybe a stupid question but... why are you seeing subregs of mem at all?
> Does sparc not have INSN_SCHEDULING defined?

The paradoxical subreg restriction in general_operand() is only
enforced when reload_completed is true, which will not be the
case while LRA is working.


Re: LRA reloads of subregs

2015-09-04 Thread David Miller
From: Vladimir Makarov 
Date: Fri, 4 Sep 2015 10:00:54 -0400

> LRA porting frequently needs changing in constraints.md, .c,
> and .md files.

I did make such changes, trust me :-)

First obstacle was that, unlike reload, LRA is very strict about
register constraints.  If a constraint doesn't evaluate to a
register class, LRA refuses to consider it a register.

So we had this ugly thing:

(define_constraint "U"
 "Pseudo-register or hard even-numbered integer register"
 (and (match_test "TARGET_ARCH32")
  (match_code "reg")
  (ior (match_test "REGNO (op) < FIRST_PSEUDO_REGISTER")
   (not (match_test "reload_in_progress && reg_renumber [REGNO (op)] < 
0")))
  (match_test "register_ok_for_ldd (op)")))

A few years ago I tried the simple thing, changing this to a plain
GENERAL_REGS register constraint and hoping that HARD_REGNO_OK would
properly enforce the even register number requirement.  Back then it
didn't work but now it appears to work properly.

I've included the full patch I am working with below in case you are
curious.

> I don't think we should add a new LRA code calling process_address
> before adding insns for further processing.  LRA just needs to get
> operands from insns to make them valid.  So again I'd try to make insn
> recognizable for LRA first and only if it does not work then think
> about other solutions in case when such change creates other problems
> (it is hard for me to predict LRA behaviour definitely just reading
> source files and not knowing sparc port well).

If LRA is prepared to do a blind emit_move_insn() on an arbitrary MEM,
before it even validates the displacements in such a MEM as needing
reloads or not, it has to do something to accomodate this situation.

If LRA had not done the special SUBREG processing on this insn, it
indeed would have fixed up the invalid displacement using a reload.

Anyways, here is the patch I am working with.

diff --git a/gcc/config/sparc/constraints.md b/gcc/config/sparc/constraints.md
index e12efa1..7a18879 100644
--- a/gcc/config/sparc/constraints.md
+++ b/gcc/config/sparc/constraints.md
@@ -44,6 +44,8 @@
 (define_register_constraint "h" "(TARGET_V9 && TARGET_V8PLUS ? I64_REGS : 
NO_REGS)"
  "64-bit global or out register in V8+ mode")
 
+(define_register_constraint "U" "(TARGET_ARCH32 ? GENERAL_REGS : NO_REGS)")
+
 ;; Floating-point constant constraints
 
 (define_constraint "G"
@@ -135,51 +137,6 @@
   (match_code "mem")
   (match_test "memory_ok_for_ldd (op)")))
 
-;; This awkward register constraint is necessary because it is not
-;; possible to express the "must be even numbered register" condition
-;; using register classes.  The problem is that membership in a
-;; register class requires that all registers of a multi-regno
-;; register be included in the set.  It is add_to_hard_reg_set
-;; and in_hard_reg_set_p which populate and test regsets with these
-;; semantics.
-;;
-;; So this means that we would have to put both the even and odd
-;; register into the register class, which would not restrict things
-;; at all.
-;;
-;; Using a combination of GENERAL_REGS and HARD_REGNO_MODE_OK is not a
-;; full solution either.  In fact, even though IRA uses the macro
-;; HARD_REGNO_MODE_OK to calculate which registers are prohibited from
-;; use in certain modes, it still can allocate an odd hard register
-;; for DImode values.  This is due to how IRA populates the table
-;; ira_useful_class_mode_regs[][].  It suffers from the same problem
-;; as using a register class to describe this restriction.  Namely, it
-;; sets both the odd and even part of an even register pair in the
-;; regset.  Therefore IRA can and will allocate odd registers for
-;; DImode values on 32-bit.
-;;
-;; There are legitimate cases where DImode values can end up in odd
-;; hard registers, the most notable example is argument passing.
-;;
-;; What saves us is reload and the DImode splitters.  Both are
-;; necessary.  The odd register splitters cannot match if, for
-;; example, we have a non-offsetable MEM.  Reload will notice this
-;; case and reload the address into a single hard register.
-;;
-;; The real downfall of this awkward register constraint is that it does
-;; not evaluate to a true register class like a bonafide use of
-;; define_register_constraint would.  This currently means that we cannot
-;; use LRA on Sparc, since the constraint processing of LRA really depends
-;; upon whether an extra constraint is for registers or not.  It uses
-;; reg_class_for_constraint, and checks it against NO_REGS.
-(define_constraint "U"
- "Pseudo-register or hard even-numbered integer register"
- (and (match_test "TARGET_ARCH32")
-  (match_code "reg")
-  (ior (match_test "REGNO (op) < FIRST_PSEUDO_REGISTER")
-  (not (match_test "reload_in_progress && reg_renumber [REGNO (op)] < 
0")))
-  (match_test "register_ok_for_ldd (op)")))
-
 ;; Equivalent to 'T' but available in 64-bit mode
 (define_memory_constraint "W"
  "Memory refere

Re: LRA reloads of subregs

2015-09-04 Thread David Miller
From: Segher Boessenkool 
Date: Fri, 4 Sep 2015 06:46:04 -0500

> On Thu, Sep 03, 2015 at 11:19:43PM -0700, David Miller wrote:
>> The paradoxical subreg restriction in general_operand() is only
>> enforced when reload_completed is true, which will not be the
>> case while LRA is working.
> 
> This one?
> 
> #ifdef INSN_SCHEDULING
>   /* On machines that have insn scheduling, we want all memory
>reference to be explicit, so outlaw paradoxical SUBREGs.
>However, we must allow them after reload so that they can
>get cleaned up by cleanup_subreg_operands.  */
>   if (!reload_completed && MEM_P (sub)
> && GET_MODE_SIZE (mode) > GET_MODE_SIZE (GET_MODE (sub)))
>   return 0;
> #endif
> 
> I think you misread that.  Also doc/rtl.texi makes pretty clear that
> you really shouldn't see subregs of mem.  So where does it come from?

I see what you are saying, I'll take a look into this.

Thanks.


Re: LRA reloads of subregs

2015-09-04 Thread David Miller
From: David Miller 
Date: Fri, 04 Sep 2015 11:30:26 -0700 (PDT)

> From: Segher Boessenkool 
> Date: Fri, 4 Sep 2015 06:46:04 -0500
> 
>> On Thu, Sep 03, 2015 at 11:19:43PM -0700, David Miller wrote:
>>> The paradoxical subreg restriction in general_operand() is only
>>> enforced when reload_completed is true, which will not be the
>>> case while LRA is working.
>> 
>> This one?
>> 
>> #ifdef INSN_SCHEDULING
>>   /* On machines that have insn scheduling, we want all memory
>>   reference to be explicit, so outlaw paradoxical SUBREGs.
>>   However, we must allow them after reload so that they can
>>   get cleaned up by cleanup_subreg_operands.  */
>>   if (!reload_completed && MEM_P (sub)
>>&& GET_MODE_SIZE (mode) > GET_MODE_SIZE (GET_MODE (sub)))
>>  return 0;
>> #endif
>> 
>> I think you misread that.  Also doc/rtl.texi makes pretty clear that
>> you really shouldn't see subregs of mem.  So where does it come from?
> 
> I see what you are saying, I'll take a look into this.

It looks like it is created in LRA itself, initially LRA is looking
at:

(insn 1100 1099 1101 14 (set (reg:SI 3376)
(ior:SI (subreg:SI (reg:QI 287) 0)
(subreg:SI (reg:QI 289) 0))) x.c:104 234 {iorsi3}
 (expr_list:REG_DEAD (reg:QI 289)
(expr_list:REG_DEAD (reg:QI 287)
(nil

in curr_insn_transform(), and emits the move:

(set (reg:SI 13423) (subreg:SI (reg:QI 287) 0))

Later the reg inside of this subreg appears to get transformed into
an on-stack MEM.

(insn 18631 1099 1100 14 (set (reg:SI 13423)
(subreg:SI (mem/c:QI (plus:SI (reg/f:SI 101 %sfp)
(const_int -14269 [0xc843])) [0 %sfp+-14269 S1 
A8]) 0)) x.c:104 63 {*movsi_insn}
 (expr_list:REG_DEAD (reg:QI 287)
(nil)))

I suppose perhaps I need to make the input_operand predicate more
strict on sparc.  So I'll look into that now.



Re: LRA reloads of subregs

2015-09-04 Thread David Miller
From: David Miller 
Date: Fri, 04 Sep 2015 11:27:31 -0700 (PDT)

> From: Vladimir Makarov 
> Date: Fri, 4 Sep 2015 10:00:54 -0400
> 
>> I don't think we should add a new LRA code calling process_address
>> before adding insns for further processing.  LRA just needs to get
>> operands from insns to make them valid.  So again I'd try to make insn
>> recognizable for LRA first and only if it does not work then think
>> about other solutions in case when such change creates other problems
>> (it is hard for me to predict LRA behaviour definitely just reading
>> source files and not knowing sparc port well).

I've taken some time to see exactly what is going on here, perhaps
you can give me some guidance, I'm quite happy to implement anything
:-)

We start with:

(insn 1100 1099 1101 14 (set (reg:SI 3376)
(ior:SI (subreg:SI (reg:QI 287) 0)
(subreg:SI (reg:QI 289) 0))) x.c:104 234 {iorsi3}
 (expr_list:REG_DEAD (reg:QI 289)
(expr_list:REG_DEAD (reg:QI 287)
(nil

LRA emits, in curr_insn_transform():

(set (reg:SI 13423) (subreg:SI (reg:QI 287) 0))

LRA then spills the subreg onto the stack, which gives us:

(insn 18631 1099 1100 14 (set (reg:SI 13423)
(subreg:SI (mem/c:QI (plus:SI (reg/f:SI 101 %sfp)
(const_int -14269 [0xc843])) [0 %sfp+-14269 S1 
A8]) 0)) x.c:104 63 {*movsi_insn}
 (expr_list:REG_DEAD (reg:QI 287)
(nil)))

And this is where we run into trouble in simplify_operand_subreg(),
which seems to force reloads for all SUBREGs of MEM.

Normally, if there were no SUBREG here, LRA would run
process_address() over the MEMs in this instruction and all would be
well.

It is also the case that I cannot do anything special in the SPARC
move emitter to handle this, as address validization is disabled when
lra_in_progress is true.


Re: Converting to LRA (calling all maintainers)

2016-09-17 Thread David Miller
From: Eric Botcazou 
Date: Fri, 16 Sep 2016 23:43:43 +0200

>> p.s.  Are there plans for converting the SPARC port?
> 
> There are more than plans - actual patches by DaveM that were installed at 
> some point and then reverted quickly because of unexpected fallout.

Yeah, sparc64 failed to bootstrap and it was the stage2 that was
miscompiled.

I lacked the time to debug it properly so we reverted.


Re: Converting to LRA (calling all maintainers)

2016-09-17 Thread David Miller
From: Eric Botcazou 
Date: Sat, 17 Sep 2016 10:18:23 +0200

>> I lacked the time to debug it properly so we reverted.
> 
> Do you plan to give it a try again in the near future?

I was going to work on this over the past summer, but other
responsibilities took up all of my time.

Probably the earliest I could look into this again would be
November.


Re: Converting to LRA (calling all maintainers)

2017-01-03 Thread David Miller
From: Eric Botcazou 
Date: Tue, 03 Jan 2017 22:22:05 +0100

>> p.s.  Are there plans for converting the SPARC port?
> 
> The SPARC port has now been converted.

Thanks so much for doing this work, I wish I could have been more
helpful.


Re: [sparc64] kernel OOPS with gcc 7.1 / 7.2

2017-08-15 Thread David Miller
From: Anthony Yznaga 
Date: Tue, 15 Aug 2017 17:45:12 -0700

> I compiled a kernel with gcc 7 and found that the compiler inserted a
> call to __multi3() in mq_attr_ok().  The sparc64 implementation of
> __multi3() was added by 1b4af13ff2cc specifically for gcc 7 and later,
> but it clobbers %g4 and %g5.  Not sure if that was intended but it
> looks like __multi3() is not safe to call from kernel code.

Good catch, we have to redo the register allocation in that routine
to fix this.


Re: [sparc64] kernel OOPS with gcc 7.1 / 7.2

2017-08-15 Thread David Miller
From: Anatoly Pugachev 
Date: Tue, 15 Aug 2017 21:50:45 +0300

> Together with Dmitry (ldv) , we've discovered that running test suite
> from strace produces kernel OOPS, when kernel is compiled with gcc 7.1
> or with gcc 7.2 , but not with gcc 6 :

Please try this patch:

diff --git a/arch/sparc/lib/multi3.S b/arch/sparc/lib/multi3.S
index d6b6c97..703127a 100644
--- a/arch/sparc/lib/multi3.S
+++ b/arch/sparc/lib/multi3.S
@@ -5,26 +5,26 @@
.align  4
 ENTRY(__multi3) /* %o0 = u, %o1 = v */
mov %o1, %g1
-   srl %o3, 0, %g4
-   mulx%g4, %g1, %o1
+   srl %o3, 0, %o4
+   mulx%o4, %g1, %o1
srlx%g1, 0x20, %g3
-   mulx%g3, %g4, %g5
-   sllx%g5, 0x20, %o5
-   srl %g1, 0, %g4
+   mulx%g3, %o4, %g7
+   sllx%g7, 0x20, %o5
+   srl %g1, 0, %o4
sub %o1, %o5, %o5
srlx%o5, 0x20, %o5
-   addcc   %g5, %o5, %g5
+   addcc   %g7, %o5, %g7
srlx%o3, 0x20, %o5
-   mulx%g4, %o5, %g4
+   mulx%o4, %o5, %o4
mulx%g3, %o5, %o5
sethi   %hi(0x8000), %g3
-   addcc   %g5, %g4, %g5
-   srlx%g5, 0x20, %g5
+   addcc   %g7, %o4, %g7
+   srlx%g7, 0x20, %g7
add %g3, %g3, %g3
movcc   %xcc, %g0, %g3
-   addcc   %o5, %g5, %o5
-   sllx%g4, 0x20, %g4
-   add %o1, %g4, %o1
+   addcc   %o5, %g7, %o5
+   sllx%o4, 0x20, %o4
+   add %o1, %o4, %o1
add %o5, %g3, %g2
mulx%g1, %o2, %g1
add %g1, %g2, %g1


Re: [sparc64] kernel OOPS with gcc 7.1 / 7.2

2017-08-16 Thread David Miller
From: Anatoly Pugachev 
Date: Wed, 16 Aug 2017 11:42:43 +0300

> On Wed, Aug 16, 2017 at 7:30 AM, David Miller  wrote:
>> From: Anatoly Pugachev 
>> Date: Tue, 15 Aug 2017 21:50:45 +0300
>>
>>> Together with Dmitry (ldv) , we've discovered that running test suite
>>> from strace produces kernel OOPS, when kernel is compiled with gcc 7.1
>>> or with gcc 7.2 , but not with gcc 6 :
>>
>> Please try this patch:
> 
> Dave,
> 
> this patch fixes OOPS, thanks. Tested on ldom (gcc 7.2, git kernel +
> patch, git strace).

Thanks for testing.


Re: ARM unaligned MMIO access with attribute((packed))

2011-02-02 Thread David Miller
From: Russell King - ARM Linux 
Date: Wed, 2 Feb 2011 16:37:02 +

> 1. there's no way to tell GCC that the inline assembly is a load
>instruction and therefore it needs to schedule the following
>instructions appropriately.

Just add a dummy '"m" (pointer)' asm input argument to the inline asm
statement.  Just make sure "typeof(pointer)" has a size matching the
size of the load your are performing.

> 2. GCC will needlessly reload pointers from structures and other such
>behaviour because it can't be told clearly what the inline assembly
>is doing, so the inline asm needs to have a "memory" clobber.

This behavior is correct, and in fact needed.  Writing to chip registers
can trigger changes to arbitrary main memory locations.

> 3. It seems to misses out using the pre-index addressing, prefering to
>create add/sub instructions prior to each inline assembly load/store.

Yes, this is indeed a problem.

But you really need that memory clobber there whether you like it or
not, see above.


Re: ARM unaligned MMIO access with attribute((packed))

2011-02-02 Thread David Miller
From: Russell King - ARM Linux 
Date: Wed, 2 Feb 2011 21:45:22 +

> On Wed, Feb 02, 2011 at 01:38:31PM -0800, David Miller wrote:
>> From: Russell King - ARM Linux 
>> Date: Wed, 2 Feb 2011 16:37:02 +
>> 
>> > 1. there's no way to tell GCC that the inline assembly is a load
>> >instruction and therefore it needs to schedule the following
>> >instructions appropriately.
>> 
>> Just add a dummy '"m" (pointer)' asm input argument to the inline asm
>> statement.  Just make sure "typeof(pointer)" has a size matching the
>> size of the load your are performing.
> 
> That involves this problematical cast from a packed struct pointer to
> an unsigned long pointer, which according to the C standard and GCC
> folk is undefined.

It's alignment may be undefined, but it's size definitely is well
defined and that's what matters here.

> Practice over the last 15 years on ARM has also shown that this is not
> necessary.

Sorry oh big super man, little ole' me is only a kernel newbie.


Re: ARM unaligned MMIO access with attribute((packed))

2011-02-02 Thread David Miller
From: Måns Rullgård 
Date: Wed, 02 Feb 2011 23:08:01 +

> David Miller  writes:
> 
>> From: Russell King - ARM Linux 
>> Date: Wed, 2 Feb 2011 16:37:02 +
>>
>>> 1. there's no way to tell GCC that the inline assembly is a load
>>>instruction and therefore it needs to schedule the following
>>>instructions appropriately.
>>
>> Just add a dummy '"m" (pointer)' asm input argument to the inline asm
>> statement.  Just make sure "typeof(pointer)" has a size matching the
>> size of the load your are performing.
> 
> That should be "m"(*pointer).

Right, thanks for the correction.

>> But you really need that memory clobber there whether you like it or
>> not, see above.
> 
> I don't know of any device where the side-effects are not explicitly
> indicated by other means in the code triggering them, so it probably
> is safe without the clobber as Russel says.

You're probably right.


float "op-and-halve"

2011-09-30 Thread David Miller

I'm planning to support some new instructions found in recent sparc
cpus, specifically VIS 3.0 adds a series of "X and halve"
floating-point instructions where X is one of "add" or "subtract".

There are variants which negate the result as well.

They operate similar to FMA in that all the operations are performed
and then rounding occurs only one time at the very end.

The advantage of having these "halve" variants is that since the chip
does the calculations with a larger amount of precision internally,
the final result cannot overflow.

Does any other cpu support this, and in particular in GCC?

If not, does anyone have any suggestions on how do you model this?
It'd be real disappointing to have to unspec these things and only
be able to access them using builtins.


Re: VIS2 pattern review

2011-10-13 Thread David Miller
From: Richard Henderson 
Date: Wed, 12 Oct 2011 17:49:19 -0700

> There's a code sample 7-1 that illustrates a 16x16 multiply:
> 
>   fmul8sux16 %f0, %f1, %f2
>   fmul8ulx16 %f0, %f1, %f3
>   fpadd16%f2, %f3, %f4

Be wary of code examples that don't even assemble (even numbered
float registers are required here).

fmul8sux16 basically does, for each element:

src1 = (rs1 >> 8) & 0xff;
src2 = rs2 & 0x;

product = src1 * src2;

scaled = (product & 0x0000) >> 8;
if (product & 0x80)
scaled++;

rd = scaled & 0x;

fmul8ulx16 does the same except the assignment to src1 is:

src1 = rs1 & 0xff;

Therefore, I think this "16 x 16 multiply" operation isn't the kind
you think it is, and it's therefore not appropriate to use this in the
compiler for vector multiplies.

Just for shits and grins I tried it and the slp-7 testcase, as expected,
fails.  The main multiply loop in that test case is compiled to:

sethi   %hi(.LLC6), %i3
sethi   %hi(in2), %g1
ldd [%i3+%lo(.LLC6)], %f22
sethi   %hi(.LLC7), %i4
sethi   %hi(.LLC8), %i2
sethi   %hi(.LLC9), %i3
add %fp, -256, %g2
ldd [%i4+%lo(.LLC7)], %f20
or  %g1, %lo(in2), %g1  
ldd [%i2+%lo(.LLC8)], %f18
mov %fp, %i5
ldd [%i3+%lo(.LLC9)], %f16
mov %g1, %g4
mov %g2, %g3
.LL10:
ldd [%g4+8], %f14
ldd [%g4+16], %f12
fmul8sux16  %f14, %f22, %f26
ldd [%g4+24], %f10
fmul8ulx16  %f14, %f22, %f24
ldd [%g4], %f8
fmul8sux16  %f12, %f20, %f34
fmul8ulx16  %f12, %f20, %f32
fmul8sux16  %f10, %f18, %f30
fpadd16 %f26, %f24, %f14
fmul8ulx16  %f10, %f18, %f28
fmul8sux16  %f8, %f16, %f26
fmul8ulx16  %f8, %f16, %f24
fpadd16 %f34, %f32, %f12
std %f14, [%g3+8]
fpadd16 %f30, %f28, %f10
std %f12, [%g3+16]
fpadd16 %f26, %f24, %f8
std %f10, [%g3+24]
std %f8, [%g3]
add %g3, 32, %g3
cmp %g3, %i5
bne,pt  %icc, .LL10
 add%g4, 32, %g4

and it simply gives the wrong results.

The entire out2[] array is all zeros.


Re: VIS2 pattern review

2011-10-13 Thread David Miller
From: David Miller 
Date: Thu, 13 Oct 2011 14:26:36 -0400 (EDT)

>   product = src1 * src2;
> 
>   scaled = (product & 0x0000) >> 8;
>   if (product & 0x80)
>   scaled++;

In fact, all of the partitioned multiply instructions scale the result
by 8 bits with rounding towards positive infinity.

Therefore, we have to use an unspec for all of them.


Re: VIS2 pattern review

2011-10-13 Thread David Miller
From: Richard Henderson 
Date: Wed, 12 Oct 2011 17:49:19 -0700

> The comment for fpmerge_vis is not correct.
> I believe that the operation is representable with
> 
>   (vec_select:V8QI
> (vec_concat:V8QI
>   (match_operand:V4QI 1 ...)
>   (match_operand:V4QI 2 ...)
> (parallel [
>   0 4 1 5 2 6 3 7
>   ]))
> 
> which can be used as the basis for both of the
> 
>   vec_interleave_lowv8qi
>   vec_interleave_highv8qi
> 
> named patterns.

Agreed.

> AFAICS, this needs an unspec, like fmul8x16al.
> Similarly for fmul8sux16_vis, fmuld8sux16_vis,

Yes, as we found all the partitioned multiplies need to be unspecs.

>> (define_code_iterator vis3_addsub_ss [ss_plus ss_minus])
>> (define_code_attr vis3_addsub_ss_insn
>>   [(ss_plus "fpadds") (ss_minus "fpsubs")])
>> 
>> (define_insn "_vis"
>>   [(set (match_operand:VASS 0 "register_operand" "=")
>> (vis3_addsub_ss:VASS (match_operand:VASS 1 "register_operand" 
>> "")
>>  (match_operand:VASS 2 "register_operand" 
>> "")))]
>>   "TARGET_VIS3"
>>   "\t%1, %2, %0")
> 
> These should be exposed as "ssadd3" "sssub3".

Agreed.

I'm currently regstrapping the patch at the end of this mail and will
commit it to trunk if no regressions pop up.

I'll look into the rest of your feedback.  But I want to look into a more
fundamental issue with VIS support before moving much further.

I worked for several evenings on adding support for the VIS3
instructions that move directly between float and integer regs.  I tried
really hard to get the compiler to do something sensible but it's next
to impossible for two reasons:

1) We don't represent single entry vectors using vector modes, we just
   use SImode, DImode etc.

   I think this is a huge mistake, because the compiler now thinks it
   can do "SImode stuff" in the float regs.  Other backends are able
   to segregate vector vs. non-vector operations by using the single
   entry vector modes.

2) In addition to that, because of how we number the registers for
   allocation on sparc for leaf functions, the compiler starts trying
   to reload SImode and DImode values into the floating point registers
   before trying to use the non-leaf integer regs.

   Because the leaf allocation order is "leaf integer regs", "float
   regs", "non-leaf integer regs".

Even if I jacked up the register move cost for the cases where the
VIS3 instructions applied, it still did these kinds of reloads.

This also gets reload into trouble because it believes that if it can
move an address value (say Pmode == SImode) from one place to another,
then a plus on the same operands can be performed (with perhaps minor
reloading).  But that doesn't work when this "move" is "move float reg
to int reg" and therefore the operands are "%f12" and "%g3".  It
tries to do things like "(plus:SI (reg:SI %f12) (reg:SI %g3))"

All of these troubles would be eliminated if we used vector modes for
all the VIS operations instead of using SImode and DImode for the single
entry vector cases.

Unfortunately, that would involve some ABI changes for the VIS
builtins.  I'm trending towards considering just changing things
anyways since the VIS intrinsics were next to unusable beforehand.
I've scoured the net for examples of people actually using the GCC
intrinsics before all of my recent changes, and they all fall into two
categories: 1) they use inline assembler because the VIS intrinsics
don't work and 2) they try to use the intrinsics but the code is
disabled because it "doesn't work".


Fix the RTL of some sparc VIS patterns.

* config/sparc/sparc.md (UNSPEC_FPMERGE): Delete.
(UNSPEC_MUL16AU, UNSPEC_MUL8, UNSPEC_MUL8SU, UNSPEC_MULDSU): New
unspecs.
(fpmerge_vis): Remove inaccurate comment, represent using vec_select
of a vec_concat.
(vec_interleave_lowv8qi, vec_interleave_highv8qi): New insns.
(fmul8x16_vis, fmul8x16au_vis, fmul8sux16_vis, fmuld8sux16_vis):
Reimplement as unspecs and remove inaccurate comments.
(vis3_shift_patname): New code attr.
(_vis): Rename to 
"v3".
(vis3_addsub_ss_patname): New code attr.
(_vis): Rename to 
"".

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 017594f..ae36634 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,18 @@
 2011-10-12  David S. Miller  
 
+   * config/sparc/sparc.md (UNSPEC_FPMERGE): Delete.
+   (UNSPEC_MUL16AU, UNSPEC_MUL8, UNSPEC_MUL8SU, UNSPEC_MULDSU): New
+   unspecs.
+   (fpmerge_vis): Remove inaccurate comment, represent using vec_select
+   of a vec_concat.
+   (vec_interleave_lowv8qi, vec_interleave_highv8qi): New insns.
+   (fmul8x16_vis, fmul8x16au_vis, fmul8sux16_vis, fmuld8sux16_vis):
+   Reimplement as unspecs and remove inaccurate comments.
+   (vis3_shift_patname): New code attr.
+   (_vis): Rename to 
"v3".
+   (vis3_addsub_ss_patname): New code attr.
+   (_vis): Rename to 
"".
+
 

Re: VIS2 pattern review

2011-10-13 Thread David Miller
From: Richard Henderson 
Date: Thu, 13 Oct 2011 13:06:19 -0700

> On 10/13/2011 12:55 PM, David Miller wrote:
>> -(define_insn "_vis"
>> +(define_insn ""
> 
> Missing a "3" on the end.  Otherwise these look ok.

Thanks for finding that.

>> Unfortunately, that would involve some ABI changes for the VIS
>> builtins.  I'm trending towards considering just changing things
>> anyways since the VIS intrinsics were next to unusable beforehand.
> 
> Why?  You can do just about anything you like inside the builtin
> expander, including frobbing the modes around.

Hmmm, ok, I'll look into approaching the change that way.

Thanks again Richard.


Re: VIS2 pattern review

2011-10-13 Thread David Miller
From: Eric Botcazou 
Date: Fri, 14 Oct 2011 00:41:42 +0200

>> Unfortunately, that would involve some ABI changes for the VIS
>> builtins.  I'm trending towards considering just changing things
>> anyways since the VIS intrinsics were next to unusable beforehand.
> 
> Could you elaborate?  The calling conventions for vectors (like for the other 
> classes) shouldn't depend on the mode but only on the type.

Right and as Richard said I can munge the modes during expansion of
existing builtins when needed.


extending fpmuls

2011-10-24 Thread David Miller

While working on some test cases I noticed that the 'fsmuld'
instruction on sparc was not being matched by the combiner for
things like:

double fsmuld (float a, float b)
{
  return a * b;
}

Combine does try to match:

(set x (float_extend:DF (mul:SF y z)))

instead of what backends (and in particular at least Sparc and Alpha)
seem to use canonically for this pattern which is:

(set x (mul:DF (float_extend:DF y) (float_extend:DF y)))

Something similar happens for:

double fnsmuld (float a, float b)
{
  return -(a * b);
}

which combine should match to the *fnsmuld sparc.md pattern,
but similar to above combine tries:

(set x (float_extend:DF (mul:SF (neg:SF y) z)))

instead of:

(set x (mul:DF (neg:DF (float_extend:DF y) (float_extend:DF z

Which is right?  "Canonicalization of Instructions" in the internals
documentation doesn't give any guidance :-)


Re: extending fpmuls

2011-10-25 Thread David Miller
From: Jakub Jelinek 
Date: Tue, 25 Oct 2011 10:00:50 +0200

> I bet
> double fsmuld (float a, float b)
> {
>   return (double) a * b;
> }
> instead will match your pattern, then the operands are first extended
> into double and then multiplied into a double product.

Right, in existing testcases I've used "(double)a * (double) b"
to trigger the pattern.

Thanks for explaining.


cprop_reg problem on sparc

2011-10-27 Thread David Miller

Although copy_value() in regcprop.c tries to avoid recording cases
where substitutions would be illegal, there are some bad cases it
still can let through.

On 64-bit sparc, integer regs are 64-bit and float regs are
(basically) 32-bit.  So HARD_REGNO_NREGS(float_reg, DFmode) is 2, and
HARD_REGNO_NREGS(integer_reg, DImode) is 1.

cprop sees the sequence:

(insn 330 172 230 .. (set (reg:DI %g2) const_int)
(insn 171 330 173 .. (set (reg:DF %f10) (reg:DF %g2)))
(insn 173 171 222 .. (set (reg:DF %f2) (reg:DF %f10)))
(insn 222 173 223 .. (set (MEM:SI ..) (reg:SI %f10)))
(insn 223 222 174 .. (set (MEM:SI ..) (reg:SI %f11)))

And then it believes that in insn 222 it can replace %f10 with %g2,
but this is not a correct transformation.

cprop uses hard_regno_nregs[][] to attempt to detect illegal cases
like this one, but such checks will not trigger here because
hard_regno_nregs[][] is '1' for all of the registers being inspected:

hard_regno_nregs[][] (reg:SI f10)   1
hard_regno_nregs[][] (reg:DI g2)1

The (set (reg:DI %g2) const_int) is generated by the *movdf_insn_sp64 insn
which in turn triggers a splitter for loading float constants into
integer registers.

The MEM:SI stores are reloads generated by IRA for a pseudo that has
to live across a call.  For whatever reason it allocated only a 4-byte
aligned stack location, and I suppose that is why the reload is split
into 2 SImode pieces.

To reproduce build gcc.c-torture/execute/ieee/mzero.c with
"-m64 -mcpu=niagara3 -O2" on sparc.

I'm suspecting that perhaps cprop is ok, and the real issue is that
sparc's definition of CANNOT_CHANGE_MODE_CLASS needs to be adjusted.


Re: cprop_reg problem on sparc

2011-10-27 Thread David Miller
From: Eric Botcazou 
Date: Thu, 27 Oct 2011 15:17:40 +0200

>> To reproduce build gcc.c-torture/execute/ieee/mzero.c with
>> "-m64 -mcpu=niagara3 -O2" on sparc.
> 
> AFAICS there is no such file as gcc.c-torture/execute/ieee/mzero.c.

Sorry, the final path component should be "mzero2.c"


Re: cprop_reg problem on sparc

2011-10-27 Thread David Miller
From: Eric Botcazou 
Date: Thu, 27 Oct 2011 15:17:40 +0200

>> On 64-bit sparc, integer regs are 64-bit and float regs are
>> (basically) 32-bit.  So HARD_REGNO_NREGS(float_reg, DFmode) is 2, and
>> HARD_REGNO_NREGS(integer_reg, DImode) is 1.
>>
>> cprop sees the sequence:
>>
>> (insn 330 172 230 .. (set (reg:DI %g2) const_int)
>> (insn 171 330 173 .. (set (reg:DF %f10) (reg:DF %g2)))
>> (insn 173 171 222 .. (set (reg:DF %f2) (reg:DF %f10)))
>> (insn 222 173 223 .. (set (MEM:SI ..) (reg:SI %f10)))
>> (insn 223 222 174 .. (set (MEM:SI ..) (reg:SI %f11)))
>>
>> And then it believes that in insn 222 it can replace %f10 with %g2,
>> but this is not a correct transformation.
>>
>> cprop uses hard_regno_nregs[][] to attempt to detect illegal cases
>> like this one, but such checks will not trigger here because
>> hard_regno_nregs[][] is '1' for all of the registers being inspected:
>>
>>  hard_regno_nregs[][] (reg:SI f10)   1
>>  hard_regno_nregs[][] (reg:DI g2)1
> 
> There seems to be a hole in the checks, as the number of registers is 2 for 
> some of the intermediate steps.

I think cprop_reg can legitimately only consider the candidate read
replacement (reg:SI %f10) with the most recent store to it's
equivalent value (reg:DI %g2).

In fact, that's exactly what cprop_reg's core algorithm is :-)

The issue is how to test properly that accesses to a value in two
register accesses is equivalent after a move.  This pass is currently
using hard_regno_nregs[][] and a paradoxical subreg test to achieve
that, but it's obviously not sufficient.

Note that it would actually be legal to make the transformation on
insn 223, replacing %f11 with %g2.

But I'll note that if IRA had properly spilled %f10 to the stack using
an 8-byte aligned stack slot and DFmode, we wouldn't even be having to
consider this situation.

I think cprop_reg should cope with this properly, but I also think that
it would be nice if we worked to minimize unnecessary mode changes like
those seen here.  The const-->DFmode splitter in sparc.md is partly to
blame, as is IRA.


Re: cprop_reg problem on sparc

2011-10-27 Thread David Miller
From: Eric Botcazou 
Date: Thu, 27 Oct 2011 23:11:33 +0200

>> Sorry, the final path component should be "mzero2.c"
> 
> Thanks.  I think we that need the same treatment in:
 ...
> as:
 ...
> i.e. we need to bail out if we are narrowing and this is a big-endian target.

I quickly tried the patch below, but this does not prevent the
transformation.

diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index ad92a64..54e008b 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -448,6 +448,14 @@ find_oldest_value_reg (enum reg_class cl, rtx reg, struct 
value_data *vd)
   if (hard_regno_nregs[regno][mode]
  > hard_regno_nregs[regno][vd->e[regno].mode])
return NULL_RTX;
+
+  /* And likewise, if we are narrowing on big endian the transformation
+is also invalid.  */
+  if (hard_regno_nregs[regno][mode]
+ < hard_regno_nregs[regno][vd->e[regno].mode]
+ && (GET_MODE_SIZE (vd->e[regno].mode) > UNITS_PER_WORD
+ ? WORDS_BIG_ENDIAN : BYTES_BIG_ENDIAN))
+   return NULL_RTX;
 }
 
   for (i = vd->e[regno].oldest_regno; i != regno; i = vd->e[i].next_regno)


Re: cprop_reg problem on sparc

2011-10-27 Thread David Miller
From: Eric Botcazou 
Date: Thu, 27 Oct 2011 23:55:00 +0200

>> I quickly tried the patch below, but this does not prevent the
>> transformation.
> 
> The quoted code is in copyprop_hardreg_forward_1.

Indeed :-)

This patch below works for the specific test case, and I'll post to
gcc-patches and commit it after regstrapping.

Thanks Eric!


Fix illegal register substitutions on big-endian during cprop_reg.

* regcprop.c (copyprop_hardreg_forward_1): Reject the
transformation when we narrow the mode on big endian.
---
 gcc/ChangeLog  |5 +
 gcc/regcprop.c |8 
 2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 403fb60..54e059e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2011-10-27  David S. Miller  
+
+   * regcprop.c (copyprop_hardreg_forward_1): Reject the
+   transformation when we narrow the mode on big endian.
+
 2011-10-27  Jakub Jelinek  
 
* config/i386/sse.md (avx_cvtpd2dq256_2, avx_cvttpd2dq256_2,
diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index ad92a64..b0f0343 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -824,6 +824,14 @@ copyprop_hardreg_forward_1 (basic_block bb, struct 
value_data *vd)
  if (hard_regno_nregs[regno][mode]
  > hard_regno_nregs[regno][vd->e[regno].mode])
goto no_move_special_case;
+
+ /* And likewise, if we are narrowing on big endian the 
transformation
+is also invalid.  */
+ if (hard_regno_nregs[regno][mode]
+ < hard_regno_nregs[regno][vd->e[regno].mode]
+ && (GET_MODE_SIZE (vd->e[regno].mode) > UNITS_PER_WORD
+ ? WORDS_BIG_ENDIAN : BYTES_BIG_ENDIAN))
+   goto no_move_special_case;
}
 
  /* If the destination is also a register, try to find a source
-- 
1.7.6.401.g6a319



scalar vector shift expansion problem on 64-bit

2011-10-27 Thread David Miller

I'm getting an ICE on 64-bit sparc for some vector test cases but
I'm not sure where the fix belongs.

When the compiler expands a vecor shift by scalar into a vector
shift by a vector it uses expand_vector_broadcast(), which has
a comment which states:

"The mode of OP must be the element mode of VMODE."

But no such guarentee exists, and during the compilation of
gcc.dg/vect/pr33953.c we end up trying to expand a broadcast where the
vector mode is V2SI and the scalar object is of type DImode.

The tree node for the shift count that gets expanded by
expand_normal() into this DImode rtx looks like:

 
unit size 
align 32 symtab 0 alias set 2 canonical type 0xf71e63c0 precision 32 
min  max 
pointer_to_this >
visited var def_stmt GIMPLE_NOP

version 12>

Anyways, in the sparc backend I made use of the invariant
expand_vector_broadcast() mentions, and assumed that the vector
inner mode equals the mode of the RTX object being broadcast.

This results in a crash because I end up making a emit_move_insn()
call where the modes don't match up.

So should expand_vector_broadcast() really provide this invariant to
the vec_init expander, or does the vec_init expander need to tidy
things up with gen_lowpart() etc. calls?


PR c++/39480 not really fixed

2011-10-28 Thread David Miller

g++.dg/init/copy7.C makes sure that memcpy() is not emitted with
src and dst equal.

The fix installed absolutely relies upon a backend implementing
the movmem pattern, and essentially that such a pattern will
always succeed to emit for arbitrary circumstances.

However 1) not all platforms implement the pattern, it is not
required, therefore it cannot be relied upon for correct code
generation and 2) even those that implement this pattern have certain
restrictions which could legitimately be triggered and thus cause the
memcpy() with src and dst equal to still be emitted.

Sparc is currently failing this testcase for reason #1, but it would
be trivial to make minor modifications to the copy7.C test case to get
it to fail on a score of several targets as well (even those that
provide a movmem pattern).

So we should either install a more complete fix or open the bug back
up.


Re: PR c++/39480 not really fixed

2011-10-28 Thread David Miller
From: Richard Guenther 
Date: Fri, 28 Oct 2011 11:27:25 +0200

> On Fri, Oct 28, 2011 at 9:48 AM, David Miller  wrote:
>>
>> g++.dg/init/copy7.C makes sure that memcpy() is not emitted with
>> src and dst equal.
> 
> The testcase is bogus and should be removed.  See the patch I posted
> (and the PR).  Unfortunately the patch didn't get any review yet.

Thanks.  Which patch and PR are you talking about, 39480?

Meanwhile, if the conclusion is that memcpy(x, x, ...) is legal, I
disagree.  I can think of implementations where this will break.

For example, if the memcpy implementation does a cache initializing
store on the first cache line of the destination before fetching the
beginning of the source, such memcpy() calls with src==dst will
produce garbage.



Re: PR c++/39480 not really fixed

2011-10-28 Thread David Miller
From: Richard Guenther 
Date: Fri, 28 Oct 2011 12:47:30 +0200

> Then we have to fix the middle-end which will happily expand
> block-moves to memcpy with exact overlap (a = a is valid in C).
> See the PR and the C testcases therein.
> 
> Just trying to avoid this in the C++ frontend is bogus.

Agreed.

> Of course, as Linus would say, such implementation would be "broken" ;)
> (I expect that such implementations can (and would) easily avoid the situation
> with almost no cost by doing an early out for src == dst).

memcpy is not required to be mindful of overlapping buffers.

Otherwise there is no point in having seperate memmove() and memcpy()
interfaces in the first place.

Even internally inside of GCC there is confusion about the movmem
patterns.  The documentation explicitly states that movmem need not
give special consideration to the posibility that source and
destination might overlap.

But the comments about the expander in the MIPS backend mention that
they might overlap.

And reality shows that the middle-end will end up expanding such
overlapping cases.


vector shift regression on sparc

2011-10-29 Thread David Miller

gcc.dg/pr48616.c segfaults on sparc as of a day or two ago

vectorizable_shift() crashes because op1_vectype is NULL and
we hit this code path:

  /* Vector shifted by vector.  */
  if (!scalar_shift_arg)
{
  optab = optab_for_tree_code (code, vectype, optab_vector);
  if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vector/vector shift/rotate found.");
=>if (TYPE_MODE (op1_vectype) != TYPE_MODE (vectype))

dt[1] is vect_external_def and slp_node is non-NULL.

Indeed, when the 'dt' arg to vect_is_simple_use_1() is
vect_external_def *vectype will be set to NULL.

And scalar_shift_arg gets set to false because of this loop
check:

  if (slp_node)
{
  VEC (gimple, heap) *stmts = SLP_TREE_SCALAR_STMTS (slp_node);
  gimple slpstmt;

  FOR_EACH_VEC_ELT (gimple, stmts, k, slpstmt)
if (!operand_equal_p (gimple_assign_rhs2 (slpstmt), op1, 0))
  scalar_shift_arg = false;
}

Indeed, this crashing test was added by this change:

2011-10-28  Jakub Jelinek  

   * tree-vect-stmts.c (vectorizable_shift): Give up if op1 has different
   vector mode from vectype's mode.


Re: [PATCH 1/1] sparc leon: add -Aleon architecture to GAS

2011-11-01 Thread David Miller

Please post binutils patches with the binutils development list CC:'d.


Re: [PATCH 1/1] sparc leon: Use -Aleon assembler switch for -mcpu=leon arch

2011-11-01 Thread David Miller

GCC patches are to be posted to gcc-patches, not gcc.


Re: [PATCH 1/1] sparc leon: add -Aleon architecture to GAS

2011-11-01 Thread David Miller
From: Konrad Eisele 
Date: Tue, 01 Nov 2011 10:19:04 +0100

> David Miller wrote:
>> 
>> Please post binutils patches with the binutils development list CC:'d.
>> 
>> 
> 
> Is the binutils development list bug-binut...@gnu.org ?

No, it's binut...@sourceware.org


Re: scalar vector shift expansion problem on 64-bit

2011-11-01 Thread David Miller
From: David Miller 
Date: Fri, 28 Oct 2011 01:05:54 -0400 (EDT)

> So should expand_vector_broadcast() really provide this invariant to
> the vec_init expander, or does the vec_init expander need to tidy
> things up with gen_lowpart() etc. calls?

Richard I don't know if you had a chance to look into this at all yet,
but I wanted to make a comment about vec_init in general.

I've come to find that I want the compiler to do as little as possible
with the expressions that get put into vector initializers.

I don't want it to modify the mode of the individual inner elements in
the assignment. I also don't want it to force mems into registers.

In fact, the less it does the better.

I want to make use of the special VIS load instructions that can take
a QImode or HImode value in memory and load it zero extended into a
64-bit float register.

For example:

int x;

__v8qi test_v8qi(void)
{
  __v8qi ret = { x, x, x, x, x, x, x, x };

  return ret;
}

I want to be able to generate:

test_v8qi:
sethi %hi(x + 3), %g1
or%g1, %lo(x + 3), %g1
ldda  [%g1] ASI_FL8_P, %f2
sethi %hi(0x), %g2
or%g2, %lo(0x), %g2
bmask %g2, %g0, %g0
retl
 bshuffle  %f2, %f2, %f0

but I can't because the vec_init expander sees:

(parallel:V8QI [
(reg:QI 110 [ D.2249 ])
(reg:QI 110 [ D.2249 ])
(reg:QI 110 [ D.2249 ])
(reg:QI 110 [ D.2249 ])
(reg:QI 110 [ D.2249 ])
(reg:QI 110 [ D.2249 ])
(reg:QI 110 [ D.2249 ])
(reg:QI 110 [ D.2249 ])
])

in operands[1].


serious libgcc regression added recently

2011-11-02 Thread David Miller

My sparc-linux-gnu builds with --enable-targets=all is failing with:

../../../../gcc/libgcc/config/sparc/lb1spc.S: Assembler messages:
../../../../gcc/libgcc/config/sparc/lb1spc.S:124: Error: detected global 
register use not covered by .register pseudo-op
../../../../gcc/libgcc/config/sparc/lb1spc.S:134: Error: detected global 
register use not covered by .register pseudo-op
...

It looks like it's trying to build 32-bit sparc files during the -m64
64-bit multiarch build or similar.  Or, alternatively, doing the
32-bit multiarch build with 64-bit options.

And others have reported link failures during the testsuite on
other targets.

Rainer it seems it might be your changes?


Re: serious libgcc regression added recently

2011-11-02 Thread David Miller
From: Joel Sherrill 
Date: Wed, 2 Nov 2011 16:29:16 -0500

> Is this similar to what I just got for sparc-rtems when compiling
> libgcc2 with -mcpu=v8?
> 
> /tmp/cczMc4jN.s: Assembler messages:
> /tmp/cczMc4jN.s:16: Error: Hardware capability "mul32" not enabled for
> "smul".
> /tmp/cczMc4jN.s:18: Error: Hardware capability "mul32" not enabled for
> "smul".
> /tmp/cczMc4jN.s:22: Error: Hardware capability "mul32" not enabled for
> "umul".
> 
> I can prepare a PR if you think it is different.

I don't think so.  The bug I'm hitting seems to be simply that
config/sparc/t-linux64 wasn't migrated up into the libgcc configure
area properly the way that config/sparc/t-linux was.

I'm working on a fix right now.


Re: serious libgcc regression added recently

2011-11-02 Thread David Miller
From: David Miller 
Date: Wed, 02 Nov 2011 18:30:56 -0400 (EDT)

> From: Joel Sherrill 
> Date: Wed, 2 Nov 2011 16:29:16 -0500
> 
>> Is this similar to what I just got for sparc-rtems when compiling
>> libgcc2 with -mcpu=v8?
>> 
>> /tmp/cczMc4jN.s: Assembler messages:
>> /tmp/cczMc4jN.s:16: Error: Hardware capability "mul32" not enabled for
>> "smul".
>> /tmp/cczMc4jN.s:18: Error: Hardware capability "mul32" not enabled for
>> "smul".
>> /tmp/cczMc4jN.s:22: Error: Hardware capability "mul32" not enabled for
>> "umul".
>> 
>> I can prepare a PR if you think it is different.
> 
> I don't think so.  The bug I'm hitting seems to be simply that
> config/sparc/t-linux64 wasn't migrated up into the libgcc configure
> area properly the way that config/sparc/t-linux was.

Actually the problem is that libgcc/config.host checks ${host}
to decide whether to append config/sparc/t-softmul to the tmake
variable.

But when multilibbing 64-bit libraries on a 32-bit hosted build
configured with --enable-targets=all, the host will be sparc-*-linux
even when building the 64-bit libgcc.

So t-softmul gets appended anyways, and this causes us to try and
build config/sparc/lb1spc.S for the 64-bit libgcc which we should
never do.

And it will do the wrong thing in the opposite case too, when building
a 64-bit hosted sparc gcc, host will be sparc64-*-* when building the
multilib 32-bit libgcc, and in that cast config.host will not append
t-softfp even when it should.

I sure wish these changes got a lot more testing before they were
installed :-/



Re: serious libgcc regression added recently

2011-11-02 Thread David Miller
From: David Miller 
Date: Wed, 02 Nov 2011 18:43:52 -0400 (EDT)

> So t-softmul gets appended anyways, and this causes us to try and
> build config/sparc/lb1spc.S for the 64-bit libgcc which we should
> never do.

I tried the patch below but it just results in syntax errors in the
Makefile.

Is this the way differences between multilib cases are going to be
handled now in libgcc, with these backtick shell conditionals that (of
all things) looks at the destination directory?

What if I want to put 64-bit libraries in a different location such as
plain 'lib/' to create a 64-bit pure system or similar?

I definitely prefer how this stuff worked beforehand wherein we would
know the actual "target" we're building for and we bring in the
appropriate "target" makefile fragments based upon that "target".

Now we just seem to look at the host and essentially include every
possible target makefile that could be multilibbed out of that host.

diff --git a/libgcc/config.host b/libgcc/config.host
index 05f084b..47e0e73 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1052,7 +1052,8 @@ sparc64-*-freebsd*|ultrasparc-*-freebsd*)
;;
 sparc64-*-linux*)  # 64-bit SPARC's running GNU/Linux
extra_parts="$extra_parts crtfastmath.o"
-   tmake_file="${tmake_file} t-crtfm sparc/t-linux sparc/t-linux64"
+   tmake_file="${tmake_file} t-crtfm sparc/t-linux sparc/t-linux64 \
+sparc/t-softmul"
md_unwind_header=sparc/linux-unwind.h
;;
 sparc64-*-netbsd*)
diff --git a/libgcc/config/sparc/t-softmul b/libgcc/config/sparc/t-softmul
index 7142200..5489a37 100644
--- a/libgcc/config/sparc/t-softmul
+++ b/libgcc/config/sparc/t-softmul
@@ -1,2 +1,4 @@
-LIB1ASMSRC = sparc/lb1spc.S
-LIB1ASMFUNCS = _mulsi3 _divsi3 _modsi3
+LIB1ASMSRC = `if test x$$($(CC) -print-multi-os-directory) \
+   = x../lib64; then echo sparc/lb1spc.S; fi`
+LIB1ASMFUNCS = `if test x$$($(CC) -print-multi-os-directory) \
+   = x../lib64; then echo _mulsi3 _divsi3 _modsi3; fi`


Re: serious libgcc regression added recently

2011-11-02 Thread David Miller
From: Andrew Pinski 
Date: Wed, 2 Nov 2011 16:40:13 -0700

> On Wed, Nov 2, 2011 at 4:28 PM, David Miller  wrote:
>> +LIB1ASMSRC = `if test x$$($(CC) -print-multi-os-directory) \
>> +                       = x../lib64; then echo sparc/lb1spc.S; fi`
>> +LIB1ASMFUNCS = `if test x$$($(CC) -print-multi-os-directory) \
>> +                       = x../lib64; then echo _mulsi3 _divsi3 _modsi3; fi`
>>
> 
> -print-multi-directory is most likely easier to handle than os-directory.

I was just copying the construct Rainer was already using in
sparc/t-linux64 :-)


Re: serious libgcc regression added recently

2011-11-02 Thread David Miller
From: "Joseph S. Myers" 
Date: Thu, 3 Nov 2011 00:22:49 + (UTC)

> On Wed, 2 Nov 2011, David Miller wrote:
> 
>> Actually the problem is that libgcc/config.host checks ${host}
>> to decide whether to append config/sparc/t-softmul to the tmake
>> variable.
> 
> ${host} is the *target* when configuring target libraries.

It doesn't represent the 'target' we're generating code for in
a multilib instance so we can conditionalize off of it correctly.

The only way sparc/t-softfp can be added to tmake is if the host
matches sparc-*-* and that's what happens even when we're building
the 64-bit libgcc in my case.

It's sparc-*-* for both the 32-bit and 64-bit libgcc builds.


Re: serious libgcc regression added recently

2011-11-02 Thread David Miller
From: "Joseph S. Myers" 
Date: Thu, 3 Nov 2011 01:21:35 + (UTC)

> What is new is that you can now put tests in libgcc/configure.ac
> such as the "Check 32bit or 64bit for x86." one, and select t-*
> files based on those tests - whereas in the gcc/ directory there is
> no possibility at all for the choice of t-* files to depend on the
> multilib.

That works for me, I'm bootstrapping the following:


[PATCH] Fix multilib build of libgcc on Linux/sparc.

* configure.ac: Set host_address on sparc too.
* configure: Regenerate.
* config.host: Add sparc/t-linux64 and sparc/t-softmul conditionally
based upon host_address.
* config/sparc/t-linux64: Set CRTSTUFF_T_CFLAGS unconditionally.
---
 libgcc/ChangeLog  |8 
 libgcc/config.host|   17 ++---
 libgcc/config/sparc/t-linux64 |3 +--
 libgcc/configure  |7 ---
 libgcc/configure.ac   |7 ---
 5 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index 1bbe29a..3944193 100644
--- a/libgcc/ChangeLog
+++ b/libgcc/ChangeLog
@@ -1,3 +1,11 @@
+2011-11-02  David S. Miller  
+
+   * configure.ac: Set host_address on sparc too.
+   * configure: Regenerate.
+   * config.host: Add sparc/t-linux64 and sparc/t-softmul conditionally
+   based upon host_address.
+   * config/sparc/t-linux64: Set CRTSTUFF_T_CFLAGS unconditionally.
+
 2011-11-02  Rainer Orth  
 
* gthr-single.h, gthr.h: New files.
diff --git a/libgcc/config.host b/libgcc/config.host
index 05f084b..647c6a1 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1008,7 +1008,10 @@ sparc-*-elf*)
extra_parts="$extra_parts crti.o crtn.o crtfastmath.o"
;;
 sparc-*-linux*)# SPARC's running GNU/Linux, libc6
-   tmake_file="${tmake_file} t-crtfm sparc/t-linux64"
+   tmake_file="${tmake_file} t-crtfm"
+   if test "${host_address}" = 64; then
+   tmake_file="$tmake_file sparc/t-linux64"
+   fi
case ${host} in
*-leon*)
tmake_file="${tmake_file} t-fdpbit"
@@ -1021,7 +1024,9 @@ sparc-*-linux*)   # SPARC's running GNU/Linux, 
libc6
*-leon[3-9]*)
;;
*)
-   tmake_file="$tmake_file sparc/t-softmul"
+   if test "${host_address}" = 32; then
+   tmake_file="$tmake_file sparc/t-softmul"
+   fi
;;
esac
extra_parts="$extra_parts crtfastmath.o"
@@ -1052,7 +1057,13 @@ sparc64-*-freebsd*|ultrasparc-*-freebsd*)
;;
 sparc64-*-linux*)  # 64-bit SPARC's running GNU/Linux
extra_parts="$extra_parts crtfastmath.o"
-   tmake_file="${tmake_file} t-crtfm sparc/t-linux sparc/t-linux64"
+   tmake_file="${tmake_file} t-crtfm sparc/t-linux"
+   if test "${host_address}" = 64; then
+   tmake_file="${tmake_file} sparc/t-linux64"
+   fi
+   if test "${host_address}" = 32; then
+   tmake_file="${tmake_file} sparc/t-softmul"
+   fi
md_unwind_header=sparc/linux-unwind.h
;;
 sparc64-*-netbsd*)
diff --git a/libgcc/config/sparc/t-linux64 b/libgcc/config/sparc/t-linux64
index ca4a892..6583fe2 100644
--- a/libgcc/config/sparc/t-linux64
+++ b/libgcc/config/sparc/t-linux64
@@ -1,2 +1 @@
-CRTSTUFF_T_CFLAGS = `if test x$$($(CC) -print-multi-os-directory) \
-   = x../lib64; then echo -mcmodel=medany; fi`
+CRTSTUFF_T_CFLAGS = -mcmodel=medany
diff --git a/libgcc/configure b/libgcc/configure
index 0d91645..0f18037 100644
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -4609,11 +4609,12 @@ fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $libgcc_cv_cfi" >&5
 $as_echo "$libgcc_cv_cfi" >&6; }
 
-# Check 32bit or 64bit for x86.
+# Check 32bit or 64bit for x86 and sparc.
 case ${host} in
-i?86*-*-* | x86_64*-*-*)
+i?86*-*-* | x86_64*-*-* | sparc*-*-*)
   cat > conftest.c < conftest.c <

Re: serious libgcc regression added recently

2011-11-03 Thread David Miller
From: Jakub Jelinek 
Date: Thu, 3 Nov 2011 09:22:51 +0100

> On Wed, Nov 02, 2011 at 11:41:08PM -0400, David Miller wrote:
>> --- a/libgcc/configure.ac
>> +++ b/libgcc/configure.ac
>> @@ -255,11 +255,12 @@ AC_CACHE_CHECK([whether assembler supports CFI 
>> directives], [libgcc_cv_cfi],
>>[libgcc_cv_cfi=yes],
>>[libgcc_cv_cfi=no])])
>>  
>> -# Check 32bit or 64bit for x86.
>> +# Check 32bit or 64bit for x86 and sparc.
>>  case ${host} in
>> -i?86*-*-* | x86_64*-*-*)
>> +i?86*-*-* | x86_64*-*-* | sparc*-*-*)
>>cat > conftest.c <> -#ifdef __x86_64__
>> +#if defined(__x86_64__) || \
>> +(defined(__sparc__) && defined(__arch64__))
>>  host_address=64
>>  #else
>>  host_address=32
> 
> I think much better would be to handle sparc*/s390*/powerpc* differently
> here, just using #ifdef __LP64__ test.  i?86/x86_64 is different because
> of the third weirdo multilib option.

Yes, using __LP64__ for non-x86 is much better.  Then we can completely
remove the ${host} conditional and nobody will have to hack on this
piece of configure code ever again.

I'll try to find time ot hack this together if nobody beats me to it.


Re: serious libgcc regression added recently

2011-11-04 Thread David Miller
From: Jakub Jelinek 
Date: Thu, 3 Nov 2011 09:22:51 +0100

> I think much better would be to handle sparc*/s390*/powerpc* differently
> here, just using #ifdef __LP64__ test.  i?86/x86_64 is different because
> of the third weirdo multilib option.

I just tested and committed the following, it seemed to make no sense
to keep the case statement there and now the host_address should be
available to anyone who wants to make use of it in config.host


[PATCH] Tweak libgcc configure test for 64-bit.

libgcc/

* configure.ac: Test for 64-bit addresses on !x86 using __LP64__.
* configure: Rebuild.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@181000 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgcc/ChangeLog|5 +
 libgcc/configure|   15 +--
 libgcc/configure.ac |   15 +--
 3 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index ec06a09..d3f091e 100644
--- a/libgcc/ChangeLog
+++ b/libgcc/ChangeLog
@@ -1,3 +1,8 @@
+2011-11-04  David S. Miller  
+
+   * configure.ac: Test for 64-bit addresses on !x86 using __LP64__.
+   * configure: Rebuild.
+
 2011-11-04  Andreas Krebbel  
 
* config/s390/t-crtstuff: Add -fPIC to CRTSTUFF_T_CFLAGS_S
diff --git a/libgcc/configure b/libgcc/configure
index 0f18037..1895a76 100644
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -4609,21 +4609,16 @@ fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $libgcc_cv_cfi" >&5
 $as_echo "$libgcc_cv_cfi" >&6; }
 
-# Check 32bit or 64bit for x86 and sparc.
-case ${host} in
-i?86*-*-* | x86_64*-*-* | sparc*-*-*)
-  cat > conftest.c < conftest.c < conftest.c < conftest.c <

bootstrap regression on sparc

2011-11-11 Thread David Miller

While building libstdc++ I get an assertion failure in haifa-sched.c,
specifically the assertion on line 3437 is failing:

  gcc_assert (!jump_p
  || ((common_sched_info->sched_pass_id == SCHED_RGN_PASS)
  && IS_SPECULATION_BRANCHY_CHECK_P (insn))
  || (common_sched_info->sched_pass_id
  == SCHED_EBB_PASS));

I haven't looked more deeply at it, but the first recent suspicious change
are the basic block handling changes Alan made two days ago:

2011-11-09  Alan Modra  

* function.c (bb_active_p): Delete.
(dup_block_and_redirect, active_insn_between): New functions.
(convert_jumps_to_returns, emit_return_for_exit): New functions,
split out from..
(thread_prologue_and_epilogue_insns): ..here.  Delete
shadowing variables.  Don't do prologue register clobber tests
when shrink wrapping already failed.  Delete all last_bb_active
code.  Instead compute tail block candidates for duplicating
exit path.  Remove these from antic set.  Duplicate tails when
reached from both blocks needing a prologue/epilogue and
blocks not needing such.
* ifcvt.c (dead_or_predicable): Test both flag_shrink_wrap and
HAVE_simple_return.
* bb-reorder.c (get_uncond_jump_length): Make global.
* bb-reorder.h (get_uncond_jump_length): Declare.
* cfgrtl.c (rtl_create_basic_block): Comment typo fix.
(rtl_split_edge): Likewise.  Warning fix.
(rtl_duplicate_bb): New function.
(rtl_cfg_hooks): Enable can_duplicate_block_p and duplicate_block.
* Makefile.in (function.o): Update dependencies.


Re: bootstrap regression on sparc

2011-11-11 Thread David Miller
From: David Miller 
Date: Fri, 11 Nov 2011 20:41:23 -0500 (EST)

> I haven't looked more deeply at it, but the first recent suspicious change
> are the basic block handling changes Alan made two days ago:
> 
> 2011-11-09  Alan Modra  
> 
>   * function.c (bb_active_p): Delete.
 ...

I've verified that the bootstrap succeeds with this change reverted.


Re: bootstrap regression on sparc

2011-11-12 Thread David Miller
From: Dennis Clarke 
Date: Sat, 12 Nov 2011 12:51:18 -0500 (EST)

>> While building libstdc++ I get an assertion failure in haifa-sched.c,
>> specifically the assertion on line 3437 is failing:
> 
>   I am seeing no major problems on Sparc at all. What rev of GCC are you
> referring to please?

As always, with head.

I do my builds and bootstraps configured using "--with-cpu=niagara3".


Re: bootstrap regression on sparc

2011-11-12 Thread David Miller
From: Joel Sherrill 
Date: Sat, 12 Nov 2011 08:34:29 -0600

> From my perspective, the head doesn't look so good. :(

I'm extremely disappointed with how the last 2 weeks have gone
as well.  I can't work on any of the bugs I want to work on
because the tree keeps being broken.

I guess the end of stage 1 means "dump as much of your half ready
not-really-tested crap into the tree as possible".

The tree was more stable, and broke less often, while we were still in
the midst of stage 1.


Re: bootstrap regression on sparc

2011-11-12 Thread David Miller
From: Hans-Peter Nilsson 
Date: Sat, 12 Nov 2011 07:25:46 -0500 (EST)

> On Fri, 11 Nov 2011, David Miller wrote:
>>
>> While building libstdc++ I get an assertion failure in haifa-sched.c,
>> specifically the assertion on line 3437 is failing:
> 
>> I haven't looked more deeply at it, but the first recent suspicious change
>> are the basic block handling changes Alan made two days ago:
> 
> Try <http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01409.html>
> and maybe approve or commit the patch there.

That patch indeed does fix the problem for me.

However, I don't feel qualified enough on CFG issues to approve or
commit that patch.  Someone else will need give it the OK.

Thanks!


ICE in int_mode_for_mode()

2011-11-18 Thread David Miller

For a few days a lot of new testsuite failures have popped up on sparc,
wherein int_mode_for_mode() gets called with "VOIDmode" as an argument
from extract_bit_field_1 because "op0" is "(const_int 0)"

I have a feeling this is a known problem, but I couldn't find any
discussions about this.

I strongly suspect the following change caused this problem:

2011-11-16  Andreas Krebbel  

PR middle-end/50325
* expmed.c (store_bit_field_1): Use extract_bit_field on big
endian targets if the source cannot be exactly covered by word
mode chunks.


Re: Memory corruption due to word sharing

2012-02-01 Thread David Miller
From: Michael Matz 
Date: Wed, 1 Feb 2012 18:41:05 +0100 (CET)

> One problem is that it's not a new problem, GCC emitted similar code since 
> about forever, and still they turned up only now (well, probably because 
> ia64 is dead, but sparc64 should have similar problems). 

Indeed, on sparc64 it does do the silly 64-bit access too:

wrong:
ldx [%o0+8], %g2
sethi   %hi(2147483648), %g1
or  %g2, %g1, %g1
jmp %o7+8
 stx%g1, [%o0+8]

Personally I've avoided C bitfields like the plague in any code I've
written.


Re: Preparing to merge ARM/hard_vfp_branch to trunk

2009-08-05 Thread David Miller
From: Eric Botcazou 
Date: Wed, 5 Aug 2009 17:59:01 +0200

>> I believe that I could legitimately approve that patch myself (it's
>> pretty trivial and I didn't author it), but I'd prefer to get approval
>> from one of the SPARC maintainers.  Here's your chance:
>>
>>  http://gcc.gnu.org/ml/gcc-patches/2009-04/msg01027.html
> 
> OK.

It looks fine to me too.


LTO and asm specs...

2010-03-12 Thread David Miller

There is one g++ LTO test case (g++.lto/20090303) that fails on sparc,
it compiles the intermediate objects with -fPIC but the final
compilation creates an executable.

The problem is that when LTO re-instantiates the options for the
individual builds, the proper ASM specs of the target are not
executed, so in this case "-K PIC" is not passed down to the assembler
in response to "-fPIC".

As a consequence, relocations against _GLOBAL_OFFSET_TABLE_
in code like this:

sethi   %hi(_GLOBAL_OFFSET_TABLE_), %g1

use the R_SPARC_HI22 relocation instead of R_SPARC_PC22.

Thus the program crashes.

I couldn't figure out immediately how to fix this as the
way LTO does spec overriding and such looked non-trivial.

Thanks.


fixincl 'make check' regressions...

2010-03-15 Thread David Miller

Ever since your changes installed on March 12th, I've been getting
fixincludes testsuite failures of the form below.

I also notice that none of these changes added ChangeLog entries, and
furthermore the SVN commit messages were extremely terse so it was
hard to diagnose the intent or reasoning behind your changes.

iso/math_c99.h 
/home/davem/src/GIT/GCC/gcc/fixincludes/tests/base/iso/math_c99.h differ: char 
1366, line 52
*** iso/math_c99.h  Mon Mar 15 22:55:36 2010
--- /home/davem/src/GIT/GCC/gcc/fixincludes/tests/base/iso/math_c99.h   Thu Jan 
21 04:06:11 2010
***
*** 49,55 
   ? __builtin_signbitf(x) \
   : sizeof(x) == sizeof(long double) \
 ? __builtin_signbitl(x) \
!: __builtin_signbit(x));
  #endif  /* SOLARIS_MATH_8_CHECK */
  
  
--- 49,55 
   ? __builtin_signbitf(x) \
   : sizeof(x) == sizeof(long double) \
 ? __builtin_signbitl(x) \
!: __builtin_signbit(x))
  #endif  /* SOLARIS_MATH_8_CHECK */
  
  

There were fixinclude test FAILURES


Re: LTO and asm specs...

2010-03-16 Thread David Miller
From: Richard Henderson 
Date: Tue, 16 Mar 2010 11:31:44 -0700

> On 03/12/2010 09:33 PM, David Miller wrote:
>> I couldn't figure out immediately how to fix this as the
>> way LTO does spec overriding and such looked non-trivial.
> 
> It would not be a bad thing, IMO, if the sparc assembler
> were extended to be able to emit any reloc directly, without
> needing a specific command-line option.  Then you'd only
> encounter this problem with legacy assemblers.

It's not the assemblers fault.

We're using %hi() and expecting the assembler to emit a
PC relative relcation just because the symbol name happens
to be _GLOBAL_OFFSET_TABLE_  And it will do this, but only
when -PIC.  Changing that is pretty dangerous.

But even if we got past that, we need to get the assembler options
right in order to enable instruction classes.  For example we have to
get -Av9a there when using VIS instructions.

Other platforms are going to hit things like this too.

LTO really needs to evaluate the specs correctly.


Re: LTO and asm specs...

2010-03-16 Thread David Miller
From: Richard Henderson 
Date: Tue, 16 Mar 2010 12:53:47 -0700

> On 03/16/2010 12:28 PM, David Miller wrote:
>> It's not the assemblers fault.
>> 
>> We're using %hi() and expecting the assembler to emit a
>> PC relative relcation just because the symbol name happens
>> to be _GLOBAL_OFFSET_TABLE_  And it will do this, but only
>> when -PIC.  Changing that is pretty dangerous.
> 
> It is the assembler's fault because it doesn't provide %pcrelhi() or
> some such to allow the compiler (or asm programmer) to emit exactly
> the relocation that's desired.

There is %pc22() and %pc10.  I don't know if it's safe to
change gcc to use them in all cases though.

>> But even if we got past that, we need to get the assembler options
>> right in order to enable instruction classes.  For example we have to
>> get -Av9a there when using VIS instructions.
> 
> How about ".arch v9a" like other platforms emit?
> 
> Command-line options that control what the assembler emits for
> the exact same bit of text are a Really Bad Idea, as we've seen
> from other platforms time and time again.

I think this distracts from the issue that LTO needs to
process specs properly.

Are you seriously against fixing that LTO bug?


Re: fixincl 'make check' regressions...

2010-03-18 Thread David Miller

You said you would fix this several nights ago, but I still
haven't seen any changes to fixincludes since then.

When will you get around to fixing these regressions you
introduced?

Thank you.


Re: DWARF register numbering discrepancy on SPARC between GCC and GDB

2009-01-21 Thread David Miller
From: Eric Botcazou 
Date: Wed, 21 Jan 2009 15:22:19 +0100

> > Obviously the GCC folks broke backwards compatibility with themselves.
> > So unless we find evidence that contradicts the wiki page you cite, I
> > think GCC needs to be fixed.
> 
> Yes, the SVR4 definition used to be masked by that of the sol2.h file on 
> Solaris and is not anymore.  But the SVR4 definition is the one used for
> the various BSD variants.

Ok, so it seems the fix is to reinstate the override in sol2.h,
right?


Re: RFD: simple instruction cache code layout heuristics

2009-01-27 Thread David Miller
From: Ian Lance Taylor 
Date: Tue, 27 Jan 2009 13:16:31 -0800

> A co-worker of mine at Google did some experiments along those lines
> using gold.  He was not able to demonstrate any performance
> improvements, unfortunately (x86_64 target, proprietary test cases).
> He was hacking linker scripts.

This sort of optimization helps more on cpus with f.e. software
managed TLBs.  The higher the TLB miss cost, the more the benefit.

This is why Sun's tools have supported this optimization for a long
time, since 64-bit sparc has been predominantly using software managed
TLB refill.

I think it's the "-xlinkopt" options that do this with the SunPRO
compiler tools.


Re: The state of glibc libm

2012-02-29 Thread David Miller
From: "Joseph S. Myers" 
Date: Wed, 29 Feb 2012 17:17:17 + (UTC)

Thanks for looking into all of these issues.

> (c) Various functions do not set errno correctly (many cases) or raise the 
> proper floating-point exceptions (a smaller number of cases - both 
> spurious exceptions where not permitted by ISO C, and failing to raise 
> required overflow/underflow exceptions).  In general this is a separate 
> bug for each function (filed as many separate bugs in glibc Bugzilla) and 
> can be fixed by a separate local patch for each function (adding a 
> testcase, of course - note that glibc's main libm-test.inc presently only 
> tests invalid and divide-by-zero exceptions, so if working on these error 
> handling issues it might be useful to extend it to cover other exceptions 
> as well as errno values).

This reminds me that there are math tests local to the powerpc port
that therefore only run on powerpc.  For example, I've looked at
sysdeps/powerpc/test-arith{,f}.c and they don't seem so non-portable
that we couldn't run them everywhere with just some small tweaks.


Re: The state of glibc libm

2012-03-14 Thread David Miller
From: Jeff Law 
Date: Wed, 14 Mar 2012 10:41:27 -0600

> The better performance with potential loss of accuracy is an across
> the board request, it's not just a sin/cos issue.  All the trig, exp,
> pow, log, etc, which I don't think are necessarily covered by using
> the old x87 fp unit.

This is no surprise to me, vendors have commonly provided some kind of
"optimized libm" library that provided better performance with less
accuracy.

For example, Sun has always had a variant of libm optimized in this
way, called "libmopt", with it's SunPRO compilers.


expansion of vector shifts...

2012-10-27 Thread David Miller

On sparc a simple test like (from the PR tree-optimization/53410 testcase):


typedef int V __attribute__((vector_size (4 * sizeof (int;
typedef unsigned int W __attribute__((vector_size (4 * sizeof (int;

void
f10 (W *p, W *q)
{
  *p = *p < (((const W) { 1U, 1U, 1U, 1U }) << *q);
}


aborts in convert_move() because we're trying to move a TImode value
into a V2SImode one.  How does that happen?

On sparc the generic tree vector layer turns the above expression into
two V2SImode shifts.  The *q parts of each shift are represented as:

(subreg:V2SI (reg:TI xxx) 0)
(subreg:V2SI (reg:TI xxx) 8)

When we get down into expand_shift_1(), that SUBREG is stripped out by
the SHIFT_COUNT_TRUNCATED code, and that's how we end up in the crash
by the time we reach convert_move() (via expand_binop() -->
expand_binop_directly() --> convert_modes() --> convert_move()).

Perhaps we should elide the SUBREG stripping if the subreg has a
vector mode?

Actually, what seems to confuse this code is that we're passing
TImode values around for this vector that the target doesn't have
direct support for.  The SUBREG stripper explicitly checks for
INTEGRAL_MODE_P, and indeed TImode is integral.


Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

2012-10-29 Thread David Miller
From: Jakub Jelinek 
Date: Mon, 29 Oct 2012 18:56:42 +0100

> I'd like to close the stage 1 phase of GCC 4.8 development
> on Monday, November 5th.  If you have still patches for new features you'd
> like to see in GCC 4.8, please post them for review soon.  Patches
> posted before the freeze, but reviewed shortly after the freeze, may
> still go in, further changes should be just bugfixes and documentation
> fixes.

I'd like to get the Sparc cbcond stuff in (3 revisions posted) which
is waiting for Eric B. to do some Solaris specific work.

I'd also like to enable LRA for at least 32-bit sparc, even if I can't
find the time to work on auditing 64-bit completely.


Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

2012-10-29 Thread David Miller
From: Eric Botcazou 
Date: Mon, 29 Oct 2012 20:25:15 +0100

>> I'd like to get the Sparc cbcond stuff in (3 revisions posted) which
>> is waiting for Eric B. to do some Solaris specific work.
>> 
>> I'd also like to enable LRA for at least 32-bit sparc, even if I can't
>> find the time to work on auditing 64-bit completely.
> 
> End of stage #1 isn't a hard limit for architecture-specific patches, so we 
> need not make a decision about LRA immediately.  I don't think we want to 
> half 
> enable it though, so it's all or nothing.

Upon further consideration, agreed.  I'll only turn this on if I can
get the whole backend working.

FWIW, I think we should consider delaying stage1 for another reason.
A large number of North American developers are about to be hit by a
major natural disaster, and may be without power for weeks.



LRA unaligned reloads

2012-11-03 Thread David Miller

On 32-bit sparc with LRA enabled we have the following (this
generated for gcc.dg/vect/pr51581-4.c with -flto):

(insn 252 142 165 4 (set (reg:HI 234 [ D.1511 ])
(mem/c:HI (plus:SI (reg/f:SI 1307)
(const_int 24 [0x18])) [4 b+24 S2 A64])) test.c:23 59 
{*movhi_insn}
 (expr_list:REG_EQUIV (mem/c:HI (plus:SI (reg/f:SI 1307)
(const_int 24 [0x18])) [4 b+24 S2 A64])
(expr_list:REG_EQUAL (mem/c:HI (const:SI (plus:SI (symbol_ref:SI 
("b.1425") [flags 0x2]  <\
var_decl 0xf7519740 b>)
(const_int 24 [0x18]))) [4 b+24 S2 A64])
(nil

 ...

(insn 255 221 233 4 (set (reg:SI 1285)
(ashift:SI (subreg:SI (reg:HI 234 [ D.1511 ]) 0)
(const_int 16 [0x10]))) test.c:23 306 {ashlsi3}
 (expr_list:REG_DEAD (reg:HI 234 [ D.1511 ])
(nil)))

When LRA does reloads for insn 255, it tries to strip the subregs
for pseudo 234 and use SImode moves.

(insn 1367 1732 1712 4 (set (reg:SI 12 %o4 [1425])
(mem/c:SI (plus:SI (reg/f:SI 3 %g3 [1307])
(const_int 22 [0x16])) [4 b+22 S4 A16])) test.c:23 61 
{*movsi_insn}
 (nil))
(insn 255 1712 1366 4 (set (reg:SI 12 %o4 [1285])
(ashift:SI (reg:SI 12 %o4 [1425])
(const_int 16 [0x10]))) test.c:23 306 {ashlsi3}
 (expr_list:REG_DEAD (reg:SI 12 %o4 [1425])
(nil)))
(insn 1366 255 1711 4 (set (reg:SI 12 %o4 [1285])
(reg:SI 12 %o4 [1285])) test.c:23 61 {*movsi_insn}
 (expr_list:REG_DEAD (reg:SI 12 %o4 [1285])
(nil)))

Unfortunately, this is illegal, because insn 1367 is an SImode
load to a non-4-byte-aligned location (%g3 + 22).

This does work, and might even be profitable, on non-STRICT_ALIGNMENT
targets.  But for those which are STRICT_ALIGNMENT, we cannot make
transformations like this.



LRA best_reload_nregs

2012-11-04 Thread David Miller

Unlike the other variables that track the state of the current
instruction being analyzed by the LRA constraints code, I don't
see anything which initializes best_reload_nregs when we start
looking at a new instruction.


Re: expansion of vector shifts...

2012-11-15 Thread David Miller
From: Richard Sandiford 
Date: Mon, 29 Oct 2012 10:14:53 +

> ...given that the code is like you say written:
> 
>   if (SHIFT_COUNT_TRUNCATED)
> {
>   if (CONST_INT_P (op1)
> ...
>   else if (GET_CODE (op1) == SUBREG
>  && subreg_lowpart_p (op1)
>  && INTEGRAL_MODE_P (GET_MODE (SUBREG_REG (op1
>   op1 = SUBREG_REG (op1);
> }
> 
> INTEGRAL_MODE_P (GET_MODE (op1)) might be better than an explicit
> VECTOR_MODE_P check.  The code really doesn't make sense for anything
> other than integers.
> 
> (It amounts to the same thing in practice, of course...)

Agreed, I've just committed the following.  Thanks!


Fix gcc.c-torture/compile/pr53410-2.c on sparc.

* expmed.c (expand_shift_1): Don't strip non-integral SUBREGs.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@193547 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog | 2 ++
 gcc/expmed.c  | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9abd396..62bde4e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,7 @@
 2012-11-15  David S. Miller  
 
+   * expmed.c (expand_shift_1): Don't strip non-integral SUBREGs.
+
* configure.ac: Add check for assembler SPARC4 instruction
support.
* configure: Rebuild.
diff --git a/gcc/expmed.c b/gcc/expmed.c
index 5b697a1..8640427 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -2165,7 +2165,8 @@ expand_shift_1 (enum tree_code code, enum machine_mode 
mode, rtx shifted,
   % GET_MODE_BITSIZE (mode));
   else if (GET_CODE (op1) == SUBREG
   && subreg_lowpart_p (op1)
-  && INTEGRAL_MODE_P (GET_MODE (SUBREG_REG (op1
+  && INTEGRAL_MODE_P (GET_MODE (SUBREG_REG (op1)))
+  && INTEGRAL_MODE_P (GET_MODE (op1)))
op1 = SUBREG_REG (op1);
 }
 
-- 
1.7.12.2.dirty



var-tracking wrt. leaf regs on sparc

2013-02-05 Thread David Miller

Hello Eric, this is in regards to your HAVE_window_save code in
var-tracking.c and elsewhere added for PR target/48220.

I'm trying to fix all of the guality failures on sparc and this issue
below is the first one I was able to comprehend.

All of this special register window debugging handling in
var-tracking.c can get tripped up any time we have a leaf function, in
fact I would venture that most if not all of the guality failures on
sparc are directly or indirectly related to this issue.

A good example is pr54200.c, where for function foo:

int __attribute__((noinline,noclone))
foo (int z, int x, int b)
{
  if (x == 1)
{
  bar ();
  return z;
}
  else
{
  int a = (x + z) + b;
  return a; /* { dg-final { gdb-test 20 "z" "3" } } */
}
}

with -O1 and higher we end up with a leaf function here.  And incoming
argument register %o0 is used for computing the function's return
value.  But var-tracking.c doesn't see these modifications at all.

The reason is that the window tracking code converts the incoming
arguments %iN registers into the outgoing %oN registers.

But this is not the right thing to do because all of the RTL being
inspected has not had leaf register remapping applied to it yet.

So var-tracking.c sees modifications of %i0 in the RTL but doesn't
grok that this is a modification of incoming argument 'z' because it
thinks that 'z' lives in %o0.

Leaf register remapping of the RTL stream is only done as the very
last step in final.c

And when debugging information is emitted, the debugging backends
perform the leaf register remapping manually so that the emitted
debugging regno references match what final.c is going to emit.

This essentially means every other part of the compiler should look at
the RTL and incoming arguments unmolested.

As a quick check I found that if I just remove all of the
HAVE_window_save code in var-tracking.c, the pr54200.c test case
passes at all optimization levels.

Of course there is more to it than this, looking quickly at the
var-tracking rtl dump, it seems to now think in this test case that
the incoming arguments live in both %oN and %iN.  I would need to
dig more deeply to figure out why that's happening.

The only other alternative I can see would be to get everything in
var-tracking.c and the other subsystems it uses to do leaf register
remapping, but that seems completely like the wrong way to handle
this.


Re: var-tracking wrt. leaf regs on sparc

2013-02-05 Thread David Miller
From: David Miller 
Date: Tue, 05 Feb 2013 18:18:39 -0500 (EST)

> The only other alternative I can see would be to get everything in
> var-tracking.c and the other subsystems it uses to do leaf register
> remapping, but that seems completely like the wrong way to handle
> this.

Following up to myself... :-)

Now that I understand fully what we're trying to accomplish with the
DT_OP_GNU_entry_value and DT_OP_GNU_call_site_parameter extensions, it
does in fact seem like we will need to do leaf register remapping in
var-tracking.c

Here below is a patch I'm playing with.  It's a rough draft but it
definitely fixes the pr54200.c problem completely.

Another way to do this would be to not translate the incoming
parameter registers (leave them at %i*) if we don't see the window
save.  That way we only have to play the regno remapping game for
these specific incoming argument pieces, rather than for everything we
look at in the RTL stream.

diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 714acb69..14635b9 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -1057,6 +1057,32 @@ adjust_mem_stores (rtx loc, const_rtx expr, void *data)
 }
 }
 
+/* Given a regno from the RTL instruction stream, return the
+   actual register number that will be used by final and debug
+   info emission.  */
+static unsigned int
+real_regno (unsigned int regno)
+{
+#ifdef LEAF_REG_REMAP
+  if (regno < FIRST_PSEUDO_REGISTER
+  && crtl->uses_only_leaf_regs)
+{
+  int remapped = LEAF_REG_REMAP (regno);
+
+  if (remapped >= 0)
+   regno = (unsigned int) remapped;
+}
+#endif
+
+  return regno;
+}
+
+static unsigned int
+real_regno_rtx (rtx reg)
+{
+  return real_regno (REGNO (reg));
+}
+
 /* Simplify INSN.  Remove all {PRE,POST}_{INC,DEC,MODIFY} rtxes,
replace them with their value in the insn and add the side-effects
as other sets to the insn.  */
@@ -1804,12 +1830,12 @@ var_reg_decl_set (dataflow_set *set, rtx loc, enum 
var_init_status initialized,
   if (decl_p)
 dv = dv_from_decl (var_debug_decl (dv_as_decl (dv)));
 
-  for (node = set->regs[REGNO (loc)]; node; node = node->next)
+  for (node = set->regs[real_regno_rtx (loc)]; node; node = node->next)
 if (dv_as_opaque (node->dv) == dv_as_opaque (dv)
&& node->offset == offset)
   break;
   if (!node)
-attrs_list_insert (&set->regs[REGNO (loc)], dv, offset, loc);
+attrs_list_insert (&set->regs[real_regno_rtx (loc)], dv, offset, loc);
   set_variable_part (set, loc, dv, offset, initialized, set_src, iopt);
 }
 
@@ -1875,7 +1901,7 @@ var_reg_delete_and_set (dataflow_set *set, rtx loc, bool 
modify,
   if (initialized == VAR_INIT_STATUS_UNKNOWN)
 initialized = get_init_value (set, loc, dv_from_decl (decl));
 
-  nextp = &set->regs[REGNO (loc)];
+  nextp = &set->regs[real_regno_rtx (loc)];
   for (node = *nextp; node; node = next)
 {
   next = node->next;
@@ -1904,7 +1930,7 @@ var_reg_delete_and_set (dataflow_set *set, rtx loc, bool 
modify,
 static void
 var_reg_delete (dataflow_set *set, rtx loc, bool clobber)
 {
-  attrs *nextp = &set->regs[REGNO (loc)];
+  attrs *nextp = &set->regs[real_regno_rtx (loc)];
   attrs node, next;
 
   if (clobber)
@@ -2386,7 +2412,7 @@ val_bind (dataflow_set *set, rtx val, rtx loc, bool 
modified)
   if (REG_P (loc))
 {
   if (modified)
-   var_regno_delete (set, REGNO (loc));
+   var_regno_delete (set, real_regno_rtx (loc));
   var_reg_decl_set (set, loc, VAR_INIT_STATUS_INITIALIZED,
dv_from_value (val), 0, NULL_RTX, INSERT);
 }
@@ -2584,7 +2610,7 @@ val_resolve (dataflow_set *set, rtx val, rtx loc, rtx 
insn)
 {
   attrs node, found = NULL;
 
-  for (node = set->regs[REGNO (loc)]; node; node = node->next)
+  for (node = set->regs[real_regno_rtx (loc)]; node; node = node->next)
if (dv_is_value_p (node->dv)
&& GET_MODE (dv_as_value (node->dv)) == GET_MODE (loc))
  {
@@ -2838,7 +2864,8 @@ variable_union (variable src, dataflow_set *set)
{
  if (!((REG_P (node2->loc)
 && REG_P (node->loc)
-&& REGNO (node2->loc) == REGNO (node->loc))
+&& (real_regno_rtx (node2->loc)
+== real_regno_rtx (node->loc)))
|| rtx_equal_p (node2->loc, node->loc)))
{
  if (node2->init < node->init)
@@ -2871,7 +2898,8 @@ variable_union (variable src, dataflow_set *set)
  for (node = src->var_part[i].loc_chain; node; node = node->next)
if (!((REG_P (dstnode->loc)
   && REG_P (node->loc)
-  

Re: var-tracking wrt. leaf regs on sparc

2013-02-06 Thread David Miller
From: Eric Botcazou 
Date: Wed, 06 Feb 2013 11:13:30 +0100

> I think testing crtl->uses_only_leaf_regs is sufficient here (and
> while you're at it, you could also test the value of
> HAVE_window_save, which can be 0 if -mflat is passed on the SPARC),
> so
> 
> #ifdef HAVE_window_save
> if (HAVE_window_save && !crtl->uses_only_leaf_regs)
>   {
> 
>   }
> #endif

Yes, this works perfectly, Jakub any objections?

gcc/

2013-02-06  David S. Miller  

* var-tracking.c (vt_add_function_parameter): Test the presence of
HAVE_window_save properly and do not remap argument registers when
we have a leaf function.

diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 714acb69..0db1562 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -9502,31 +9502,34 @@ vt_add_function_parameter (tree parm)
   /* DECL_INCOMING_RTL uses the INCOMING_REGNO of parameter registers.
  If the target machine has an explicit window save instruction, the
  actual entry value is the corresponding OUTGOING_REGNO instead.  */
-  if (REG_P (incoming)
-  && HARD_REGISTER_P (incoming)
-  && OUTGOING_REGNO (REGNO (incoming)) != REGNO (incoming))
+  if (HAVE_window_save && !crtl->uses_only_leaf_regs)
 {
-  parm_reg_t p;
-  p.incoming = incoming;
-  incoming
-   = gen_rtx_REG_offset (incoming, GET_MODE (incoming),
- OUTGOING_REGNO (REGNO (incoming)), 0);
-  p.outgoing = incoming;
-  vec_safe_push (windowed_parm_regs, p);
-}
-  else if (MEM_P (incoming)
-  && REG_P (XEXP (incoming, 0))
-  && HARD_REGISTER_P (XEXP (incoming, 0)))
-{
-  rtx reg = XEXP (incoming, 0);
-  if (OUTGOING_REGNO (REGNO (reg)) != REGNO (reg))
+  if (REG_P (incoming)
+ && HARD_REGISTER_P (incoming)
+ && OUTGOING_REGNO (REGNO (incoming)) != REGNO (incoming))
{
  parm_reg_t p;
- p.incoming = reg;
- reg = gen_raw_REG (GET_MODE (reg), OUTGOING_REGNO (REGNO (reg)));
- p.outgoing = reg;
+ p.incoming = incoming;
+ incoming
+   = gen_rtx_REG_offset (incoming, GET_MODE (incoming),
+ OUTGOING_REGNO (REGNO (incoming)), 0);
+ p.outgoing = incoming;
  vec_safe_push (windowed_parm_regs, p);
- incoming = replace_equiv_address_nv (incoming, reg);
+   }
+  else if (MEM_P (incoming)
+  && REG_P (XEXP (incoming, 0))
+  && HARD_REGISTER_P (XEXP (incoming, 0)))
+   {
+ rtx reg = XEXP (incoming, 0);
+ if (OUTGOING_REGNO (REGNO (reg)) != REGNO (reg))
+   {
+ parm_reg_t p;
+ p.incoming = reg;
+ reg = gen_raw_REG (GET_MODE (reg), OUTGOING_REGNO (REGNO (reg)));
+ p.outgoing = reg;
+ vec_safe_push (windowed_parm_regs, p);
+ incoming = replace_equiv_address_nv (incoming, reg);
+   }
}
 }
 #endif


Re: var-tracking wrt. leaf regs on sparc

2013-02-06 Thread David Miller
From: Jakub Jelinek 
Date: Wed, 6 Feb 2013 07:56:44 +0100

> so achieving zero failures might be too hard

I don't believe this, all the sparc failures I see look
like a similar bug just showing up in multiple tests.



Re: var-tracking wrt. leaf regs on sparc

2013-02-07 Thread David Miller
From: Jakub Jelinek 
Date: Thu, 7 Feb 2013 18:22:32 +0100

> Then supposedly somewhere in dwarf2out we do some adjustment,
> but still end up with d/e loclist of:
> .LLST2:
> .uaxword.LVL0-.Ltext0   ! Location list begin address 
> (*.LLST2)
> .uaxword.LVL1-.Ltext0   ! Location list end address (*.LLST2)
> .uahalf 0x6 ! Location expression size
> .byte   0x88! DW_OP_breg24
> .byte   0   ! sleb128 0
> .byte   0x89! DW_OP_breg25
> .byte   0   ! sleb128 0
> .byte   0x22! DW_OP_plus
> .byte   0x9f! DW_OP_stack_value
> .uaxword.LVL1-.Ltext0   ! Location list begin address 
> (*.LLST2)
> .uaxword.LFE0-.Ltext0   ! Location list end address (*.LLST2)
> .uahalf 0x1 ! Location expression size
> .byte   0x58! DW_OP_reg8
> .uaxword0   ! Location list terminator begin (*.LLST2)
> .uaxword0   ! Location list terminator end (*.LLST2)
> where I'd expect breg8/breg9 instead.

The fix for this is trivial, just a missing leaf renumbering in dwarf2out.c:

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 06cfb18..765d5c5 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -10864,7 +10864,16 @@ based_loc_descr (rtx reg, HOST_WIDE_INT offset,
}
 }
 
-  regno = DWARF_FRAME_REGNUM (REGNO (reg));
+  regno = REGNO (reg);
+#ifdef LEAF_REG_REMAP
+  if (crtl->uses_only_leaf_regs)
+{
+  int leaf_reg = LEAF_REG_REMAP (regno);
+  if (leaf_reg != -1)
+   regno = (unsigned) leaf_reg;
+}
+#endif
+  regno = DWARF_FRAME_REGNUM (regno);
 
   if (!optimize && fde
   && (fde->drap_reg == regno || fde->vdrap_reg == regno))


Re: var-tracking wrt. leaf regs on sparc

2013-02-07 Thread David Miller
From: David Miller 
Date: Thu, 07 Feb 2013 14:38:18 -0500 (EST)

> From: Jakub Jelinek 
> Date: Thu, 7 Feb 2013 18:22:32 +0100
> 
>> Then supposedly somewhere in dwarf2out we do some adjustment,
>> but still end up with d/e loclist of:
>> .LLST2:
>> .uaxword.LVL0-.Ltext0   ! Location list begin address 
>> (*.LLST2)
>> .uaxword.LVL1-.Ltext0   ! Location list end address (*.LLST2)
>> .uahalf 0x6 ! Location expression size
>> .byte   0x88! DW_OP_breg24
>> .byte   0   ! sleb128 0
>> .byte   0x89! DW_OP_breg25
>> .byte   0   ! sleb128 0
>> .byte   0x22! DW_OP_plus
>> .byte   0x9f! DW_OP_stack_value
>> .uaxword.LVL1-.Ltext0   ! Location list begin address 
>> (*.LLST2)
>> .uaxword.LFE0-.Ltext0   ! Location list end address (*.LLST2)
>> .uahalf 0x1 ! Location expression size
>> .byte   0x58! DW_OP_reg8
>> .uaxword0   ! Location list terminator begin (*.LLST2)
>> .uaxword0   ! Location list terminator end (*.LLST2)
>> where I'd expect breg8/breg9 instead.
> 
> The fix for this is trivial, just a missing leaf renumbering in dwarf2out.c:

So the combined patch is below, any objections?

Here is the testsuite diff:

@@ -155,8 +148,8 @@ FAIL: gcc.dg/guality/vla-2.c  -O2 -flto

=== gcc Summary ===

-# of expected passes   2128
-# of unexpected failures   122
+# of expected passes   2135
+# of unexpected failures   115
 # of unexpected successes  31
 # of expected failures 17
 # of unsupported tests 136

This is undoubtedly an improvement.

gcc/

2013-02-07  David S. Miller  

* dwarf2out.c (based_loc_descr): Perform leaf register remapping
on 'reg'.
* var-tracking.c (vt_add_function_parameter): Test the presence of
HAVE_window_save properly and do not remap argument registers when
we have a leaf function.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 06cfb18..765d5c5 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -10864,7 +10864,16 @@ based_loc_descr (rtx reg, HOST_WIDE_INT offset,
}
 }
 
-  regno = DWARF_FRAME_REGNUM (REGNO (reg));
+  regno = REGNO (reg);
+#ifdef LEAF_REG_REMAP
+  if (crtl->uses_only_leaf_regs)
+{
+  int leaf_reg = LEAF_REG_REMAP (regno);
+  if (leaf_reg != -1)
+   regno = (unsigned) leaf_reg;
+}
+#endif
+  regno = DWARF_FRAME_REGNUM (regno);
 
   if (!optimize && fde
   && (fde->drap_reg == regno || fde->vdrap_reg == regno))
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 714acb69..0db1562 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -9502,31 +9502,34 @@ vt_add_function_parameter (tree parm)
   /* DECL_INCOMING_RTL uses the INCOMING_REGNO of parameter registers.
  If the target machine has an explicit window save instruction, the
  actual entry value is the corresponding OUTGOING_REGNO instead.  */
-  if (REG_P (incoming)
-  && HARD_REGISTER_P (incoming)
-  && OUTGOING_REGNO (REGNO (incoming)) != REGNO (incoming))
+  if (HAVE_window_save && !crtl->uses_only_leaf_regs)
 {
-  parm_reg_t p;
-  p.incoming = incoming;
-  incoming
-   = gen_rtx_REG_offset (incoming, GET_MODE (incoming),
- OUTGOING_REGNO (REGNO (incoming)), 0);
-  p.outgoing = incoming;
-  vec_safe_push (windowed_parm_regs, p);
-}
-  else if (MEM_P (incoming)
-  && REG_P (XEXP (incoming, 0))
-  && HARD_REGISTER_P (XEXP (incoming, 0)))
-{
-  rtx reg = XEXP (incoming, 0);
-  if (OUTGOING_REGNO (REGNO (reg)) != REGNO (reg))
+  if (REG_P (incoming)
+ && HARD_REGISTER_P (incoming)
+ && OUTGOING_REGNO (REGNO (incoming)) != REGNO (incoming))
{
  parm_reg_t p;
- p.incoming = reg;
- reg = gen_raw_REG (GET_MODE (reg), OUTGOING_REGNO (REGNO (reg)));
- p.outgoing = reg;
+ p.incoming = incoming;
+ incoming
+   = gen_rtx_REG_offset (incoming, GET_MODE (incoming),
+ OUTGOING_REGNO (REGNO (incoming)), 0);
+ p.outgoing = incoming;
  vec_safe_push (windowed_parm_regs, p);
- incoming = replace_equiv_address_nv (incoming, reg);
+   }
+  else if (MEM_P (incoming)
+  && REG_P (XEXP (incoming, 0))
+  && HARD_REGISTER_P (XEXP (incoming, 0)))
+   {
+ rtx reg = XEXP (incoming, 0);
+ if (OUTGOING_REGNO (REGNO (reg)) != REGNO (reg))
+   {
+ parm_reg_t p;
+ p.incoming = reg;
+ reg = gen_raw_REG (GET_MODE (reg), OUTGOING_REGNO (REGNO (reg)));
+ p.outgoing = reg;
+ vec_safe_push (windowed_parm_regs, p);
+ incoming = replace_equiv_address_nv (incoming, reg);
+   }
}
 }
 #endif


Re: var-tracking wrt. leaf regs on sparc

2013-02-07 Thread David Miller
From: Jakub Jelinek 
Date: Thu, 7 Feb 2013 20:43:32 +0100

> This and earlier patch are ok, if it bootstraps/regtests fine, and suitable
> ChangeLog entry is provided.
> Running gdb testsuite before and after wouldn't hurt though.

I've done all of this, and committed to trunk and the gcc-4.7 branch,
thanks.

In looking at the remaining failures, several have to do with
an early clobber if the first incoming argument register.

The issue is that this is where return values are placed, so we run
into a situation where that incoming argument value can't be
reconstituted in any way by the variable tracking code and thus gdb
says that it has been optimized out.

Many non-x86 cpus are going to run into this problem.  For example,
from pr36728-1.c:

foo:
save%sp, -96, %sp
add %sp, -40, %sp
mov 2, %g2
add %sp, 123, %g1
mov 25, %g4
and %g1, -32, %g1
sethi   %hi(b), %g3
st  %g2, [%g1]
ld  [%fp+92], %g2
nop
ld  [%g1], %i0
add %g2, 14, %g2
and %g2, -8, %g2
sub %sp, %g2, %sp
stb %g4, [%sp+96]
add %sp, 96, %g2
sethi   %hi(a), %g4
nop
return  %i7+8
 nop

Here %i0 is written early, and then the tests can't view 'arg1'
properly later in the function.

Also, I noticed that calculation of the on-stack address of values
with alignment regressed in gcc-4.8 vs. gcc-4.7 Again, in pr36728-1.c,
'y' can be printed properly in gcc-4.7 but in gcc-4.8 it cannot.

I think it might be getting the base register wrong, I'll look
more deeply if I get a chance.


Re: expansion of vector shifts...

2013-02-12 Thread David Miller
From: David Miller 
Date: Fri, 16 Nov 2012 00:33:05 -0500 (EST)

> From: Richard Sandiford 
> Date: Mon, 29 Oct 2012 10:14:53 +
> 
>> ...given that the code is like you say written:
>> 
>>   if (SHIFT_COUNT_TRUNCATED)
>> {
>>   if (CONST_INT_P (op1)
>> ...
>>   else if (GET_CODE (op1) == SUBREG
>> && subreg_lowpart_p (op1)
>> && INTEGRAL_MODE_P (GET_MODE (SUBREG_REG (op1
>>  op1 = SUBREG_REG (op1);
>> }
>> 
>> INTEGRAL_MODE_P (GET_MODE (op1)) might be better than an explicit
>> VECTOR_MODE_P check.  The code really doesn't make sense for anything
>> other than integers.
>> 
>> (It amounts to the same thing in practice, of course...)
> 
> Agreed, I've just committed the following.  Thanks!
> 
> 
> Fix gcc.c-torture/compile/pr53410-2.c on sparc.
> 
>   * expmed.c (expand_shift_1): Don't strip non-integral SUBREGs.

This is broken on sparc again, although I'm confused about how this
has happened.

The suggestion was to use INTEGRAL_MODE_P as the test, so what's there
in expand_shift_1() is:

  else if (GET_CODE (op1) == SUBREG
   && subreg_lowpart_p (op1)
   && INTEGRAL_MODE_P (GET_MODE (SUBREG_REG (op1)))
   && INTEGRAL_MODE_P (GET_MODE (op1)))
op1 = SUBREG_REG (op1);

but INTEGRAL_MODE_P accepts vectors.  This is really confusing because
I was absolutely sure I re-ran the test case with the fix I committed
and it didn't crash any more.

Maybe what we really mean to do here is check both op1 and SUBREG_REG
(op1) against SCALAR_INT_MODE_P instead of INTEGRAL_MODE_P?

Something like this:

gcc/

2013-02-12  David S. Miller  

* expmed.c (expand_shift_1): Only strip scalar integer subregs.

diff --git a/gcc/expmed.c b/gcc/expmed.c
index 4a6ddb0..954a360 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -2116,8 +2116,8 @@ expand_shift_1 (enum tree_code code, enum machine_mode 
mode, rtx shifted,
   % GET_MODE_BITSIZE (mode));
   else if (GET_CODE (op1) == SUBREG
   && subreg_lowpart_p (op1)
-  && INTEGRAL_MODE_P (GET_MODE (SUBREG_REG (op1)))
-  && INTEGRAL_MODE_P (GET_MODE (op1)))
+  && SCALAR_INT_MODE_P (GET_MODE (SUBREG_REG (op1)))
+  && SCALAR_INT_MODE_P (GET_MODE (op1)))
op1 = SUBREG_REG (op1);
 }
 




Re: expansion of vector shifts...

2013-02-13 Thread David Miller
From: Richard Biener 
Date: Wed, 13 Feb 2013 12:15:13 +0100

> On Tue, Feb 12, 2013 at 11:31 PM, David Miller  wrote:
>> Maybe what we really mean to do here is check both op1 and SUBREG_REG
>> (op1) against SCALAR_INT_MODE_P instead of INTEGRAL_MODE_P?
> 
> Yes.

Ok, I'll commit this after doing some regstraps, thanks.

>> Something like this:
>>
>> gcc/
>>
>> 2013-02-12  David S. Miller  
>>
>> * expmed.c (expand_shift_1): Only strip scalar integer subregs.
>>
>> diff --git a/gcc/expmed.c b/gcc/expmed.c
>> index 4a6ddb0..954a360 100644
>> --- a/gcc/expmed.c
>> +++ b/gcc/expmed.c
>> @@ -2116,8 +2116,8 @@ expand_shift_1 (enum tree_code code, enum machine_mode 
>> mode, rtx shifted,
>>% GET_MODE_BITSIZE (mode));
>>else if (GET_CODE (op1) == SUBREG
>>&& subreg_lowpart_p (op1)
>> -  && INTEGRAL_MODE_P (GET_MODE (SUBREG_REG (op1)))
>> -  && INTEGRAL_MODE_P (GET_MODE (op1)))
>> +  && SCALAR_INT_MODE_P (GET_MODE (SUBREG_REG (op1)))
>> +  && SCALAR_INT_MODE_P (GET_MODE (op1)))
>> op1 = SUBREG_REG (op1);
>>  }
>>
>>
>>
> 


Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

2008-03-05 Thread David Miller
From: Aurelien Jarno <[EMAIL PROTECTED]>
Date: Wed, 05 Mar 2008 21:52:14 +0100

> H. Peter Anvin a écrit :
> > The best would be if this could be controlled by a flag, which we can
> > flip once kernel fixes has been around for long enough.
> 
> I have to agree there. Whatever the decision that gcc will take,
> distributions will reenable the old behaviour for some time for to allow
> upgrades from a previous version.

I don't think this approach is tenable.

If a distribution should ship with a "fixed" kernel and
compiler enabling the new direction flag behavior, any
binary you create on that system will be broken on any
other existing system.

I think we really are stuck with this forever, overwhelming
practice over the past 15 years has dictated to us what the
real ABI is.


Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

2008-03-05 Thread David Miller
From: "Richard Guenther" <[EMAIL PROTECTED]>
Date: Wed, 5 Mar 2008 22:40:59 +0100

> Right.  So this problem is over-exaggerated.  It's not like
> "any binary you create on that system will be broken on any
> other existing system."

I will be sure to hunt you down to help debug when someone reports
that once every few weeks their multi-day simulation gives incorrect
results :-)

This is one of those cases where the bug is going to be a huge
issue to people who actually hit it, and since we know about the
problem, knowingly shipping something in that state is unforgivable.


Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

2008-03-05 Thread David Miller
From: Michael Matz <[EMAIL PROTECTED]>
Date: Wed, 5 Mar 2008 22:43:33 +0100 (CET)

> The error is arcane and happens seldomly if at all.  And only on
> unfixed kernels.

Which translates right now into "all kernels."


Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

2008-03-05 Thread David Miller
From: Adrian Bunk <[EMAIL PROTECTED]>
Date: Thu, 6 Mar 2008 00:13:04 +0200

> On Wed, Mar 05, 2008 at 10:59:21PM +0100, Michael Matz wrote:
> > The problem is with old kernels, which by definition stay unfixed.
> 
> Compiling older kernels with new gcc versions has never been supported.

Adrian we're talking about userland binaries compiled by
gcc-4.3, not the kernel.

Please follow the discussion if you'd like to contribute.

Thanks.


Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

2008-03-05 Thread David Miller
From: Michael Matz <[EMAIL PROTECTED]>
Date: Thu, 6 Mar 2008 00:07:39 +0100 (CET)

> The fix lies in the kernel, the work-around in gcc.

This depends upon how you interpret this ABI situation.

There is at least some agreement that how things have
actually been implemented by these kernels for more
than 15 years trumps whatever a paper standard states.


Re: US-CERT Vulnerability Note VU#162289

2008-04-11 Thread David Miller
From: Ian Lance Taylor <[EMAIL PROTECTED]>
Date: Fri, 11 Apr 2008 11:04:38 -0700

> "Robert C. Seacord" <[EMAIL PROTECTED]> writes:
> 
> >>  What you really mean is,
> >> "Use an older GCC or some other compiler that is known not to
> >> take advantage of this optimization."
> >>   
> > i think we mean what we say, which is "*Avoid newer versions of gcc"
> > and *"avoiding the use of gcc versions 4.2 and later."  i don't see
> > any verbiage that says "use a different compiler".
> 
> I know I'm biased, but I think "use a different compiler" is clearly
> implied by the text of the advisory.  If the advisory mentioned that
> other compilers also implement the same optimization, then that
> implication would not be there.

I completely agree.



Re: US-CERT Vulnerability Note VU#162289

2008-04-23 Thread David Miller
From: Chad Dougherty <[EMAIL PROTECTED]>
Date: Wed, 23 Apr 2008 07:52:26 -0400

>   We won't include information about other vendors without either a 
> statement from them or independent verification of their affectedness.

How, may I ask, did that policy apply to the GCC "vendor"
when this all got started?


Re: US-CERT Vulnerability Note VU#162289

2008-04-23 Thread David Miller
From: Chad Dougherty <[EMAIL PROTECTED]>
Date: Wed, 23 Apr 2008 08:37:11 -0400

> David Miller wrote:
> > How, may I ask, did that policy apply to the GCC "vendor"
> > when this all got started?
> 
> Our own testing of multiple versions of gcc on multiple platforms and 
> subsequent confirmation by Mark that it was intentional, desired 
> behavior.  This all occurred prior to even the initial version of the note.

CERT is asking these vendors for "approval" for the text they will add
mentioning anything about their product.  That's the bit I'm talking
about.

They are getting protection and consideration that was not really
afforded to GCC.

CERT treated GCC differently.


Re: US-CERT Vulnerability Note VU#162289

2008-04-23 Thread David Miller
From: Joe Buck <[EMAIL PROTECTED]>
Date: Wed, 23 Apr 2008 08:24:44 -0700

> If CERT is to maintain its reputation, it needs to do better.  The warning
> is misdirected in any case; given the very large number of compilers that
> these coding practices cause trouble for, you need to focus on the bad
> coding practices, not on unfair demonization of new GCC releases.

In my opinion CERT's advisory has been nothing but an unfair FUD
campaign on compilers, and GCC specifically, and has seriously
devalued CERT's advisories, in general, which were already of low
value to begin with.

It looks similar to a news article run by a newspaper that is losing
money and has no real news to write about, but yet they have to write
about something.

The worst part of this fiasco is that GCCs reputation has been
unfairly harmed in one way or another, and there is nothing CERT can
do to rectify the damage they've caused.


Re: Should we remove java from the default bootstrap languages?

2008-06-20 Thread David Miller
From: "Steven Bosscher" <[EMAIL PROTECTED]>
Date: Sat, 21 Jun 2008 00:09:26 +0200

> What is far more worrying to me, actually, is that libjava grows
> bigger and bigger and bigger with every release, so that testing it
> costs developers who care zilch about java (i.e. most people) get
> penalized more and more with increased bootstrap and test times.

I agree and will admit that this is the one thing that has curtailed
severely my contributions to GCC in the past 4 to 5 years.

I used to be able to bootstrap gcc fully in minutes on average
hardware 6 or so years ago.  Those days are long gone.  On my largest
64 cpu and 128 cpu boxes it takes forever these days.

The libjava build is notoriously not helped by parallelization because
certain compiles are extremely expensive, which effectively
single-threads the build.


Re: Should we remove java from the default bootstrap languages?

2008-06-21 Thread David Miller
From: Ralf Wildenhues <[EMAIL PROTECTED]>
Date: Sat, 21 Jun 2008 10:58:55 +0200

> Would that be compiles of object files that end up in libgcj (as opposed
> to the link, or stuff that depends on libgcj)?  If yes, the lack of
> parallelism should be fixable.

It's the compilation of the object files, not the linking.


Re: Should we remove java from the default bootstrap languages?

2008-06-22 Thread David Miller
From: Laurent GUERBY <[EMAIL PROTECTED]>
Date: Sat, 21 Jun 2008 22:12:24 +0200

> I'm curious at how many GCC developpers use non x86/_64 as their
> main development machine (and how many non x86/_64 core they use).

I definitely am one.

Or, maybe you are asking the wrong question, which likely
should be how many maintainers of other cpus backends are
in this situation :-)


Re: Should we remove java from the default bootstrap languages?

2008-06-22 Thread David Miller
From: Laurent GUERBY <[EMAIL PROTECTED]>
Date: Sun, 22 Jun 2008 09:21:32 +0200

> On Sun, 2008-06-22 at 00:04 -0700, David Miller wrote:
> > From: Laurent GUERBY <[EMAIL PROTECTED]>
> > Date: Sat, 21 Jun 2008 22:12:24 +0200
> > 
> > > I'm curious at how many GCC developpers use non x86/_64 as their
> > > main development machine (and how many non x86/_64 core they use).
> > 
> > I definitely am one.
> 
> How many core does your main development machine have?

8 cores and 8 cpu threads per core on one, 64 cpus total.
16 cores and 8 cpu threads per core on another, 128 cpus total.

It still takes an hour or so to bootstrap on these machines because
the individual cpu threads are slow and individual expensive
compilations become the bottleneck, especially in the libjava build.

And the way the testsuite works it can't even come close to using
even a significant fraction of those cpus.


Re: Should we remove java from the default bootstrap languages?

2008-06-22 Thread David Miller
From: Laurent GUERBY <[EMAIL PROTECTED]>
Date: Sun, 22 Jun 2008 10:05:26 +0200

> On Sun, 2008-06-22 at 00:45 -0700, David Miller wrote:
> > > How many core does your main development machine have?
> > 
> > 8 cores and 8 cpu threads per core on one, 64 cpus total.
> > 16 cores and 8 cpu threads per core on another, 128 cpus total.
> > 
> > It still takes an hour or so to bootstrap on these machines because
> > the individual cpu threads are slow and individual expensive
> > compilations become the bottleneck, especially in the libjava build.
> 
> Did you measure how long insn-attrtab.o generation takes?
> See my other email, on a 2.2 GHz barcelona it's about 5mn20s (for each
> stage) and it's the main limiting factor (but I assume it's architecture
> dependant).

On Sparc, which is what these systems are and the cpu I target with
gcc, insn-attrtab.o compilation is not expensive.


Re: [10 PATCHES] inline functions to avoid stack overflow

2008-06-25 Thread David Miller
From: Mikulas Patocka <[EMAIL PROTECTED]>
Date: Wed, 25 Jun 2008 08:53:10 -0400 (EDT)

> Even worse, gcc doesn't use these additional bytes. If you try this:
> 
> extern void f(int *i);
> void g()
> {
>  int a;
>  f(&a);
> }
> 
> , it allocates additional 16 bytes for the variable "a" (so there's total 
> 208 bytes), even though it could place the variable into 48-byte 
> ABI-mandated area that it inherited from the caller or into it's own 
> 16-byte padding that it made when calling "f".

The extra 16 bytes of space allocated is so that GCC can perform a
secondary reload of a quad floating point value.  It always has to be
present, because we can't satisfy a secondary reload by emitting yet
another reload, it's the end of the possible level of recursions
allowed by the reload pass.

GCC could be smart and eliminate that slot when it's not used, but
such a thing is not implemented yet.

It would also require quite a bit of new code to determine cases
like you mention above, where the incoming arg slots from the
caller are unused, assuming this would be legal.

And that legality is doubtful.  We'd need to be careful because I
think the caller is allowed to assume that those slots are untouched
by the callee, and thus can be assumed to have whatever values the
caller put there even after the callee returns.


Re: [10 PATCHES] inline functions to avoid stack overflow

2008-06-26 Thread David Miller
From: "Bart Van Assche" <[EMAIL PROTECTED]>
Date: Thu, 26 Jun 2008 08:32:35 +0200

> On Thu, Jun 26, 2008 at 12:09 AM, David Miller <[EMAIL PROTECTED]> wrote:
> > The extra 16 bytes of space allocated is so that GCC can perform a
> > secondary reload of a quad floating point value.  It always has to be
> > present, because we can't satisfy a secondary reload by emitting yet
> > another reload, it's the end of the possible level of recursions
> > allowed by the reload pass.
> 
> Is there any floating-point code present in the Linux kernel ?

Yes, but not coming from C compiled code.  Floating point is
used in most of the memcpy/memset implementations of the
sparc64 kernel.

> Would it be a good idea to add an option to gcc that tells gcc that
> the compiled code does not contain floating-point instructions, such
> that gcc knows that no space has to be provided for a quad floating
> point value ?

I think it exists already, it's called -mno-fpu :-)


Re: [10 PATCHES] inline functions to avoid stack overflow

2008-07-01 Thread David Miller
From: Mikulas Patocka <[EMAIL PROTECTED]>
Date: Wed, 2 Jul 2008 00:39:35 -0400 (EDT)

> The ABI is very vague about it. The V9 ABI just displays that 6-word space 
> in a figure bug doesn't say anything about it's usage. The V8 ABI just 
> says that "the function may write incoming arguments there". If it may 
> write anything other, it is unknown --- probably yes, but it is not said 
> in the document.
> 
> The document nicely specifies who owns which registers, but doesn't say 
> that about the stack space :-(

Actually, I know for a fact that you have to have those slots there.

A long time ago in the sparc64 kernel, in the trap entry code, I tried
only giving 128 bytes of stack frame as the trap entry called into C
code.  And it did not work, I had to put the 6 slots there.


Re: Bootstrap failure on sparc-sun-solaris2.10

2008-09-08 Thread David Miller
From: Rainer Orth <[EMAIL PROTECTED]>
Date: Mon, 8 Sep 2008 17:18:50 +0200 (MEST)

> Eric Botcazou writes:
> 
> > Confirmed (on Solaris 9).  Would you mind opening a PR?  There is already 
> > one 
> > for Linux (37344) but the failure is a little different.  Thanks in advance.
> 
> Sure, done: PR bootstrap/37424.

BTW, I'm also seeing the sparc-*-linux failure, and it seems the
compiler is outputting an unaligned memory access somehow.


Re: Is Sun putting much effort into supporting the gcc/binutils toolchain on sparc64 ?

2007-09-12 Thread David Miller
From: Andrew Walrond <[EMAIL PROTECTED]>
Date: Wed, 12 Sep 2007 12:37:03 +0100

> I have to make buying decisions, and having tested a Sun T1000 for a
> while I am impressed with Suns hardware. But, we are 100% gnu/linux and
> it disturbs me that David Miller seems to be a (very impressive) team of
> 1 on the sparclinux ML (My impression; perhaps I am wrong?)
> 
> So I wonder, is Sun putting any more effort into developing and
> supporting the gcc/binutils toolchain?

They are sending me specs and systems very early these days,
so in that perspective it's been great.

I do also have contacts which I use to report firmware and
hypervisor issues I'd like fixed or clarified, so that's great
too.

I tend to do the processor specific gcc instruction scheduler
and costs entries, and adding the new instruction support to
binutils as well.

So no, Sun really isn't helping with any actual development.


Re: Is Sun putting much effort into supporting the gcc/binutils toolchain on sparc64 ?

2007-09-13 Thread David Miller
From: Sunil Amitkumar Janki <[EMAIL PROTECTED]>
Date: Thu, 13 Sep 2007 11:16:00 +0200

> As far as I know, GCC for SPARC appears to be a front end
> to the Sun Studio compiler which happens to translate GCC
> flags to ones suitable for their own compiler.

That's correct, it's essentially GCC's language front ends
plugged into Sun's compiler backend.


Re: Is Sun putting much effort into supporting the gcc/binutils toolchain on sparc64 ?

2007-09-13 Thread David Miller
From: Andrew Walrond <[EMAIL PROTECTED]>
Date: Thu, 13 Sep 2007 15:02:08 +0100

> So, Sun really _don't_ give a damn about gnu/linux on sparc64.

That is a gross mischaracterization of the situation, don't
say this, it isn't true at all.

I have a full rack of Niagara systems that proves that Sun
cares to some extent.  I get early hardware access and
documentation access, plus engineers to talk to and ask
questions of.

That is miles away from what you are claiming.

I know what you are after, and it doesn't happen overnight.  So don't
be spastic and just whine as the only possible effect that will have
is to scare them off and put Sun in reverse gear instead of continuing
forward.


Re: Is Sun putting much effort into supporting the gcc/binutils toolchain on sparc64 ?

2007-09-14 Thread David Miller
From: Andrew Walrond <[EMAIL PROTECTED]>
Date: Fri, 14 Sep 2007 22:43:16 +0100

> And I apologise (to everyone) for any unnecessary rhetoric on my part; I
> freely admit that I designed my posts specifically to sparc (ahem) this
> debate, but I guess most of you knew that already ;)

The big issue you keep missing in all of your rediculious complaints
is that it takes a lot of time for a company to invest in future
potential revenue.

Most of Sun's resources are invested in things that make them money
right now and keep the lights on at their buildings and their employee
salaries paid.

Investing in better Linux Niagara support in the way that you want is
a huge risk, and you are only one very vocal customer and that does
not, in and of itself, translate into lots of revenue and lots of
profit for Sun.  In fact, it is not even guarenteed that you yourself
will buy lots of Niagara boxes to run Linux on, things change
sometimes unexpectedly.

So your expectations and judgments of Sun as a company are wholly
disconnected from reality.

Please stop this banter, and work on things to support what Sun has
done so far and things which will encourage them to do more in the
future.

And _NO_ this does mean continuing to say that Sun isn't doing enough
to satisfy you, that discourages rather than encourages in case you
haven't gotten that message LOUD and CLEAR by now.



  1   2   >