https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111064
--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
>
> [liuhongt@intel gather_emulation]$ ./gather.out;./nogather_xmm.out;./nogather_ymm.out
> elapsed time: 1.75997 seconds for gather with 30000000 iterations
> elapsed time: 2.42473 seconds for no_gather_xmm with 30000000 iterations
> elapsed time: 1.86436 seconds for no_gather_ymm with 30000000 iterations
>

For 510.parest_r, enabling gather emulation for ymm brings back 3% performance, but it is still not as good as the gather instruction because of the throughput bound.
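
For readers unfamiliar with the pattern being discussed, here is a minimal sketch of an indexed-load (gather) loop. It is not the benchmark from this report: the function name `gather_copy` and its parameters are hypothetical and chosen only for illustration. Depending on the target and tuning, a vectorizer may turn the load `data[idx[i]]` into a hardware gather instruction or into emulated gather (separate scalar loads assembled into a vector), which is the trade-off measured above.

```c
#include <stddef.h>

/* Hypothetical kernel, not taken from the bug report.
   Each iteration loads data[idx[i]], so the load address depends on
   the index array. This is the access pattern that can be implemented
   either with a gather instruction or with emulated scalar loads. */
void gather_copy(double *restrict out, const double *restrict data,
                 const int *restrict idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = data[idx[i]];   /* indirect (gathered) load */
}
```

Whether this loop is vectorized at all, and with which vector width, depends on the compiler version, target, and options, so timings like those quoted above have to be measured per configuration.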