A minor feature that should also be considered: if you have two clone
functions, one of which calls the other, we should optimize the call to avoid
the indirect call set up by ifunc.

For example:

        extern __attribute__((target_clones("default","avx","avx2"))) int caller ();
        extern __attribute__((target_clones("default","avx","avx2"))) int callee ();

        __attribute__((target_clones("default","avx","avx2")))
        int caller (void)
        {
          return -callee ();
        }

        __attribute__((target_clones("default","avx","avx2")))
        int callee (void)
        {
          return 10;
        }

I.e. caller.avx should call callee.avx, not callee (or callee.ifunc), and
caller.avx2 should call callee.avx2.  Do people think this is useful?
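
To make the intended binding concrete, here is a hand-written sketch of roughly
what the clones would look like after such an optimization. It is only an
illustration: the names callee_avx, callee_avx2, caller_avx and caller_avx2 are
hypothetical stand-ins for the compiler-generated callee.avx, callee.avx2,
caller.avx and caller.avx2, and the target attributes are left out so the
sketch builds on any machine.

```c
/* Hypothetical stand-ins for the compiler-generated clones callee.avx and
   callee.avx2.  In a real build each clone would be compiled for its own
   ISA; the attributes are omitted here so this compiles anywhere.  */
int callee_avx (void)
{
  return 10;
}

int callee_avx2 (void)
{
  return 10;
}

/* Stand-in for caller.avx: it calls the AVX clone of callee directly
   instead of going through the ifunc-resolved symbol callee.  */
int caller_avx (void)
{
  return -callee_avx ();
}

/* Stand-in for caller.avx2: it calls the AVX2 clone of callee directly.  */
int caller_avx2 (void)
{
  return -callee_avx2 ();
}
```

The reasoning is that a caller clone only runs when the resolver has already
picked that ISA, so, assuming both functions are resolved by the same rules,
the matching callee clone is the one the indirect call would have reached
anyway.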

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
