On 3/19/19 10:05 AM, Aleksandar Markovic wrote:
> May I ask you to redo this segment of code as Richard
> describe (the exact invocations of TCG functions are in
> a Richard's comment to some of the previous versions of
> this patch). This means redo ILVEV.W handling. Then you
> can compare the performance of two versions, and attach
> the results here. You can also (using -d out-asm or similar
> QEMU options) find out what code is generated for both
> alternatives, and attach the generated code here, maybe
> some folks will find them interesting (I will).

To be fair, this will not affect x86 as a host at the moment. But the
underlying deposit operation is supported by AArch64, PowerPC, and S390
as hosts.

There is some support for deposit on x86 as a host within my
tcg_gen_extract2 patch set,

https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg02036.html

which will, in these cases, use rol+shld instead of shr+movq+and+or.


r~
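
For reference, a minimal sketch of what "deposit" means here, written as
plain C rather than as TCG calls. It is not the code from the patch. The
function names are hypothetical, and the operand order (the low word of one
source kept in place, the low word of the other deposited into the high
half) is an assumption about how one 64-bit half of ILVEV.W is laid out.
The second function spells out the same result as separate shift, mask,
and or steps, which is roughly what a host without a deposit instruction
has to emit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone model of a 64-bit deposit: replace the
 * 'length' bits of 'value' starting at bit 'start' with the low
 * 'length' bits of 'field'.
 */
static uint64_t deposit64_sketch(uint64_t value, int start, int length,
                                 uint64_t field)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    uint64_t mask = (~0ULL >> (64 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/*
 * Assumed layout for one 64-bit half of ILVEV.W: keep the low (even)
 * word of lo_src, and place the low (even) word of hi_src in bits 63..32.
 */
static uint64_t ilvev_w_deposit(uint64_t lo_src, uint64_t hi_src)
{
    return deposit64_sketch(lo_src, 32, 32, hi_src);
}

/* The same result computed with explicit mask, shift, and or steps. */
static uint64_t ilvev_w_shift_mask_or(uint64_t lo_src, uint64_t hi_src)
{
    uint64_t low = lo_src & 0xffffffffULL;  /* keep bits 31..0 */
    uint64_t high = hi_src << 32;           /* move bits 31..0 up */
    return low | high;
}
```

With a single deposit the translator hands the whole bitfield insert to the
backend, so a host that has a matching instruction can use it directly.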