On Wed, 10 Oct 2012, Richard Sandiford wrote:
> > I've thought of -miterative-mdu or suchlike a knob that would override
> > the default cost of multiplication/division as appropriate (i.e. to 32/64
> > plus any extra operation-specific constant as required), perhaps by
> > forcing the 4Kp ins
"Maciej W. Rozycki" writes:
> On Sun, 7 Oct 2012, Richard Sandiford wrote:
>
>> > So I think this can't really be selected automatically for all cores,
>> > some human-supplied knowledge about the MD unit used is required -- that
>> > obviously affects other operations too, e.g. some multiplica
On Sun, 7 Oct 2012, Richard Sandiford wrote:
> > So I think this can't really be selected automatically for all cores,
> > some human-supplied knowledge about the MD unit used is required -- that
> > obviously affects other operations too, e.g. some multiplications
> > involving a constant tha
"Maciej W. Rozycki" writes:
> On Tue, 25 Sep 2012, Richard Sandiford wrote:
>> Although I see the 4kp with its 32-cycle MULTs and MADDs is one where
>> MULT $0,$0 would be a really bad choice. Sigh. The amount of effort
>> required for this optimisation is getting a bit ridiculous.
>
> I have d
On Tue, 25 Sep 2012, Richard Sandiford wrote:
> >> According to my sources the R4650 has a 4-cycle MULT latency (MAD is 3-4
> >> cycles on that processor). An MTHI/MTLO pair will take 2 cycles;
> >> obviously the resulting larger code may adversely affect cache performance
> >> in some scenar
Richard Sandiford writes:
> "Maciej W. Rozycki" writes:
>> On Mon, 24 Sep 2012, Richard Sandiford wrote:
>>
>>> > From the context I am assuming none of this matters for the 74K (and
>>> > presumably the 24KE/34K) and a MULT $0, $0 is indeed faster, but overall
>>> > isn't it something that sh
"Maciej W. Rozycki" writes:
> On Mon, 24 Sep 2012, Richard Sandiford wrote:
>
>> > From the context I am assuming none of this matters for the 74K (and
>> > presumably the 24KE/34K) and a MULT $0, $0 is indeed faster, but overall
>> > isn't it something that should be decided based on instructi
On Mon, 24 Sep 2012, Richard Sandiford wrote:
> > From the context I am assuming none of this matters for the 74K (and
> > presumably the 24KE/34K) and a MULT $0, $0 is indeed faster, but overall
> > isn't it something that should be decided based on instruction costs from
> > DFA schedulers?
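
A rough sketch of what a cost-based choice could look like, using only the figures quoted in this thread (R4650: 4-cycle MULT, 2-cycle MTHI/MTLO pair; 4Kp: 32-cycle MULT). The struct, field, and function names are hypothetical and are not GCC's internal cost or DFA interfaces. The 2-cycle pair cost for the 4Kp is an assumption carried over from the R4650 figure.

```c
#include <stdbool.h>

/* Hypothetical per-core cost record; not GCC's actual cost tables. */
struct mdu_costs {
    const char *core;
    int mult_cycles;       /* latency of a MULT on this core */
    int mthi_mtlo_cycles;  /* combined cost of an MTHI/MTLO pair */
};

/* Figures quoted in the thread; the 4Kp pair cost is an assumption. */
static const struct mdu_costs r4650_costs = { "r4650", 4, 2 };
static const struct mdu_costs m4kp_costs  = { "4kp",  32, 2 };

/* Prefer MULT $0,$0 only when it is no slower than the MTHI/MTLO pair.
   Code size and cache effects, also raised in the thread, are ignored. */
static bool use_mult_zero(const struct mdu_costs *c)
{
    return c->mult_cycles <= c->mthi_mtlo_cycles;
}
```

Under these numbers both the R4650 and the 4Kp would keep the MTHI/MTLO pair; a core with a fast multiplier would flip the comparison.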
"Maciej W. Rozycki" writes:
> On Tue, 18 Sep 2012, Richard Sandiford wrote:
>
>> > Have you had time to think about this some more? I am not sure I can
>> > guess how you'd like me to fix this patch now without some more specific
>> > review and/or suggestions about where the optimization shoul
On Tue, 18 Sep 2012, Richard Sandiford wrote:
> > Have you had time to think about this some more? I am not sure I can
> > guess how you'd like me to fix this patch now without some more specific
> > review and/or suggestions about where the optimization should happen and
> > what cases it sho
Sandra Loosemore writes:
> On 08/27/2012 10:36 AM, Richard Sandiford wrote:
>> Sandra Loosemore writes:
>>> On 08/19/2012 11:22 AM, Richard Sandiford wrote:
Not sure whether a peephole is the right choice here. In practice,
I'd imagine these opportunities would only come from a DI
On 08/27/2012 10:36 AM, Richard Sandiford wrote:
Sandra Loosemore writes:
On 08/19/2012 11:22 AM, Richard Sandiford wrote:
Not sure whether a peephole is the right choice here. In practice,
I'd imagine these opportunities would only come from a DImode move of
$0 into a doubleword register, s
Sandra Loosemore writes:
> On 08/19/2012 11:22 AM, Richard Sandiford wrote:
>>
>> Not sure whether a peephole is the right choice here. In practice,
>> I'd imagine these opportunities would only come from a DImode move of
>> $0 into a doubleword register, so we could simply emit the pattern in
>>
On 08/19/2012 11:22 AM, Richard Sandiford wrote:
Not sure whether a peephole is the right choice here. In practice,
I'd imagine these opportunities would only come from a DImode move of
$0 into a doubleword register, so we could simply emit the pattern in
mips_split_doubleword_move.
That would
Sandra Loosemore writes:
> This patch adds a peephole optimization to use a clever trick to
> zero-initialize the two halves of an accumulator register with one
> instruction instead of a mtlo/mthi pair. OK to check in?
>
> -Sandra
>
> 2012-08-16 Sandra Loosemore
> Julian Brown
>
This patch adds a peephole optimization to use a clever trick to
zero-initialize the two halves of an accumulator register with one
instruction instead of a mtlo/mthi pair. OK to check in?
-Sandra
2012-08-16 Sandra Loosemore
Julian Brown
MIPS Technologies, Inc.
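
For readers unfamiliar with the trick, here is a small C model of why MULT $0,$0 (the instruction named later in the thread) clears both halves of the accumulator. It assumes the usual MIPS32 semantics of MULT, a signed 32x32 multiply whose 64-bit product is split across HI and LO. The type and function names are made up for illustration.

```c
#include <stdint.h>

/* Illustrative model of the HI/LO accumulator pair. */
struct hilo {
    uint32_t hi;
    uint32_t lo;
};

/* Model of MULT rs, rt: signed 32x32 -> 64-bit product,
   upper word to HI, lower word to LO. */
static struct hilo mult_model(int32_t rs, int32_t rt)
{
    int64_t product = (int64_t)rs * (int64_t)rt;
    struct hilo acc;
    acc.hi = (uint32_t)((uint64_t)product >> 32);
    acc.lo = (uint32_t)((uint64_t)product & 0xffffffffu);
    return acc;
}
```

Since $0 always reads as zero, mult_model(0, 0) gives hi == 0 and lo == 0, which is the same architectural state an MTHI $0 / MTLO $0 pair would leave behind.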