Re: recombination?

2006-10-04 Thread Erich Plondke
On 10/4/06, Richard Kenner wrote: What's happening is that the insn that combine makes from those three is likely algebraically the same as that insn, but looks different. Use a debugger to find out what it made when it combined the three insn. This, plus tuning the costs,

recombination?

2006-10-03 Thread Erich Plondke
So let's say that I have an instruction that the combiner finds. For instance: "mac": (set (match_operand:SI 0 "register_operand" "=r") (plus:SI (mult:SI (match_operand:SI 1 "register_operand" "r") (match_operand:SI 2 "register_operand" "r"))

Re: paired register loads and stores

2006-09-29 Thread Erich Plondke
On 9/29/06, David Edelsohn wrote: The GCC register allocator allocates objects that span multiple registers in adjacent registers. For instance, a 64-bit doubleword integer (long long int) will be allocated in two adjacent hardware registers when GCC is targeted at a

Re: paired register loads and stores

2006-09-29 Thread Erich Plondke
On 9/29/06, David Edelsohn wrote: >>>>> Erich Plondke writes: Erich> rs6000 and Sparc ports seem to use a peephole2 to get the ldd or lfq Erich> instructions (respectively), but it looks like there's no reason for Erich> the register alloc

paired register loads and stores

2006-09-28 Thread Erich Plondke
rs6000 and Sparc ports seem to use a peephole2 to get the ldd or lfq instructions (respectively), but it looks like there's no reason for the register allocator to allocate registers together. The peephole2 just picks up loads to adjacent memory locations if the allocator happens to choose adjace

Re: Notes from tinkering with the autovectorizer (4.1.1)

2006-09-27 Thread Erich Plondke
Dorit Nuzman wrote: Indeed on altivec we implement the 'mask_for_load(addr)' builtin using 'lvsr(neg(addr))', that feeds the 'realign_load' (which is a 'vperm' on altivec). I'm not too familiar with the ARM WMMX ISA, but couldn't you use a similar trick - i.e., instead of using the low bits of the

Notes from tinkering with the autovectorizer (4.1.1)

2006-09-26 Thread Erich Plondke
I've been tinkering with the autovectorizer. It's really cool. I particularly like the realignment support. I've noticed just a few things while tinkering with it (in 4.1.1): 0) The realignment code takes the floor of the unaligned pointer, and we increment the unaligned pointer in the loop. T

3.4 vs. 4.1 performance issues

2006-09-26 Thread Erich Plondke
I've noticed while tinkering with 3.4 and 4.1 that some code sequences turn out much better in 4.1. However, other code sequences turn out substantially worse in 4.1. The most frustrating is the reduction in use of postmodify addressing modes. It looks like tree-ssa-loop-ivopts converts a loop

Type yielded by TARGET_VECTORIZE_BUILTIN_MASK_FOR_LOAD hook?

2006-09-15 Thread Erich Plondke
I'm trying to add a hook for aligning vectors for loads. I'm using the altivec rs6000 code as a baseline. However, the instruction is like the iwmmxt_walign instruction in the ARM port; it takes a normalish register and uses the bottom bits... it doesn't use a full-width vector. GCC complains w

How do I teach GCC about automatic vec_concat and vec_select?

2006-08-18 Thread Erich Plondke
I'm doing some research on a pretty plain 32-bit RISC architecture that has some extra facilities for doing vector operations. Not exactly new, I know. The difference with this one is that the vectors are pairs of normal registers. This isn't all that new; lots of architectures have normal regi