Hello there! I'm trying to add some vector registers to a 32-bit MIPS architecture. This architecture has 32 128-bit registers that can essentially be treated as V4SF. So far I'm using this test:
    volatile float foo __attribute__ ((vector_size (16)));
    volatile float bar __attribute__ ((vector_size (16)));

    int main() { foo = foo + bar; }

This produces the right SSE/AVX instructions for x86 but fails on my MIPS cross compiler with my modifications. The modifications I have made so far are:

- Added 32 new registers: added a register class, updated/added bit fields, and updated the other macros that deal with register allocation (like caller-saved and so on). Also incremented the first pseudo register value.
- Added 3 define_insn patterns that load, store and add vectors.
- Tweaked some things here and there to let the compiler know that the V4SF type is available.

So far the compiler falls back to scalar code: it is not working properly at the veclower pass. My test.c.123t.veclower21 looks like:

    <bb 2>:
    foo.0_2 ={v} foo;
    bar.1_3 ={v} bar;
    _6 = BIT_FIELD_REF <foo.0_2, 32, 0>;
    _7 = BIT_FIELD_REF <bar.1_3, 32, 0>;
    _8 = _6 + _7;
    _9 = BIT_FIELD_REF <foo.0_2, 32, 32>;
    _10 = BIT_FIELD_REF <bar.1_3, 32, 32>;
    _11 = _9 + _10;
    _12 = BIT_FIELD_REF <foo.0_2, 32, 64>;
    _13 = BIT_FIELD_REF <bar.1_3, 32, 64>;
    _14 = _12 + _13;
    _15 = BIT_FIELD_REF <foo.0_2, 32, 96>;
    _16 = BIT_FIELD_REF <bar.1_3, 32, 96>;
    _17 = _15 + _16;
    foo.2_4 = {_8, _11, _14, _17};
    foo ={v} foo.2_4;
    return 0;

Any ideas on what I'm missing and/or how to further debug this? I don't really want autovectorization, just to be able to use the vector registers "manually".

Thanks!
David