A minor feature that should also be considered: if you have two clone functions, one of which calls the other, we should optimize the call so that it avoids the indirect call set up by ifunc.
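To make the difference concrete, here is a rough hand-written sketch of the two call paths. It is only an illustration, not what the compiler emits: a plain function pointer stands in for the ifunc-resolved symbol so that it builds on any target, and the names callee_default, callee_avx, resolve_callee, callee_ptr, caller_avx_indirect, caller_avx_direct and check_paths_agree are all hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-ins for the per-target clones of callee. */
int callee_default (void) { return 10; }
int callee_avx (void) { return 10; }

typedef int (*callee_fn) (void);

/* Stand-in for the ifunc resolver; a real one would query the CPU
   instead of taking a flag.  */
callee_fn
resolve_callee (int have_avx)
{
  return have_avx ? callee_avx : callee_default;
}

/* Plays the role of the ifunc-resolved symbol.  */
callee_fn callee_ptr;

/* Without the optimization: the avx clone of caller still goes
   through the pointer, i.e. an indirect call.  */
int
caller_avx_indirect (void)
{
  return -callee_ptr ();
}

/* With the optimization: the avx clone only runs when AVX is
   available, so it can call the avx clone of callee directly.  */
int
caller_avx_direct (void)
{
  return -callee_avx ();
}

/* Both paths must return the same value; only the call sequence
   differs.  */
void
check_paths_agree (void)
{
  callee_ptr = resolve_callee (1);
  assert (caller_avx_indirect () == caller_avx_direct ());
}
```

The direct form also gives the compiler a chance to inline the callee clone, which the indirect form generally does not.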
For example:

    extern __attribute__((target_clones("default","avx","avx2"))) int caller ();
    extern __attribute__((target_clones("default","avx","avx2"))) int callee ();

    __attribute__((target_clones("default","avx","avx2")))
    int caller (void)
    {
      return -callee ();
    }

    __attribute__((target_clones("default","avx","avx2")))
    int callee (void)
    {
      return 10;
    }

I.e. caller.avx should call callee.avx, not callee (or callee.ifunc), and caller.avx2 should call callee.avx2.

Do people think this is useful?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797