On 10/4/06, Richard Kenner <[EMAIL PROTECTED]> wrote:
What's happening is that the insn that combine makes from those three is
likely algebraically the same as that insn, but looks different. Use a
debugger to find out what it made when it combined the three insns.
This, plus tuning the costs,
So let's say that I have an instruction that the combiner finds.
For instance:
"mac":
(set (match_operand:SI 0 "register_operand" "=r")
(plus:SI (mult:SI (match_operand:SI 1 "register_operand" "r")
(match_operand:SI 2 "register_operand" "r"))
On 9/29/06, David Edelsohn <[EMAIL PROTECTED]> wrote:
The GCC register allocator allocates objects that span multiple
registers in adjacent registers. For instance, a 64-bit doubleword
integer (long long int) will be allocated in two adjacent hardware
registers when GCC is targeted at a
On 9/29/06, David Edelsohn <[EMAIL PROTECTED]> wrote:
>>>>> Erich Plondke writes:
Erich> rs6000 and Sparc ports seem to use a peephole2 to get the ldd or lfq
Erich> instructions (respectively), but it looks like there's no reason for
Erich> the register alloc
rs6000 and Sparc ports seem to use a peephole2 to get the ldd or lfq
instructions (respectively), but it looks like there's no reason for
the register allocator to allocate registers together. The peephole2
just picks up loads to adjacent memory locations if the allocator
happens to choose adjace
Dorit Nuzman wrote:
Indeed on altivec we implement the 'mask_for_load(addr)' builtin using
'lvsr(neg(addr))', that feeds the 'realign_load' (which is a 'vperm' on
altivec).
I'm not too familiar with the ARM WMMX ISA, but couldn't you use a similar
trick - i.e., instead of using the low bits of the
I've been tinkering with the autovectorizer. It's really cool.
I particularly like the realignment support.
I've noticed just a few things while tinkering with it (in 4.1.1):
0) The realignment code takes the floor of the unaligned pointer, and we
increment the unaligned pointer in the loop. T
I've noticed while tinkering with 3.4 and 4.1 that some
code sequences turn out much better in 4.1. However, other
code sequences turn out substantially worse in 4.1.
The most frustrating is the reduction in use of postmodify
addressing modes. It looks like tree-ssa-loop-ivopts converts
a loop
I'm trying to add a hook for aligning vectors for loads.
I'm using the altivec rs6000 code as a baseline.
However, the instruction is like the iwmmxt_walign instruction in the
ARM port; it takes a normalish register and uses the bottom bits... it
doesn't use a full-width vector.
GCC complains w
I'm doing some research on a pretty plain 32-bit RISC architecture that has
some extra facilities for doing vector operations. Not exactly new, I know.
The difference with this one is that the vectors are pairs of normal
registers.
This isn't all that new; lots of architectures have normal regi
10 matches