Hi,

The AArch64 zip_* intrinsics are currently implemented with temporary inline asm, which the compiler cannot analyse or optimise through. This series replaces those asm blocks with equivalent calls to __builtin_shuffle, which produce the same assembler instructions (unless gcc can do better).
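For illustration, a minimal sketch of the kind of replacement involved. This is not the text of the patch: the type and function names (v4si, v4usi, sketch_zip1q_s32, sketch_zip2q_s32) are hypothetical, and it uses GCC's generic vector extension instead of arm_neon.h so that it builds on any target. The masks assume little-endian lane numbering; a big-endian build would presumably need different masks.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the NEON vector types, built on GCC's
   generic vector extension (four 32-bit lanes in 16 bytes).  */
typedef int32_t v4si __attribute__ ((vector_size (16)));
typedef uint32_t v4usi __attribute__ ((vector_size (16)));

/* Interleave the low halves of a and b: {a0, b0, a1, b1}.
   In the two-input form of __builtin_shuffle, mask values 0-3 select
   lanes of a and 4-7 select lanes of b.  */
static inline v4si
sketch_zip1q_s32 (v4si a, v4si b)
{
  return __builtin_shuffle (a, b, (v4usi) {0, 4, 1, 5});
}

/* Interleave the high halves of a and b: {a2, b2, a3, b3}.  */
static inline v4si
sketch_zip2q_s32 (v4si a, v4si b)
{
  return __builtin_shuffle (a, b, (v4usi) {2, 6, 3, 7});
}
```

Because the shuffle is an ordinary expression and not an opaque asm, the compiler is free to constant-fold it or combine it with neighbouring operations.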

The first patch adds a bunch of tests, which pass with the current asm implementation;
The second patch reimplements the intrinsics with __builtin_shuffle;
The third patch reuses the test bodies in equivalent tests on the ARM architecture.

OK for stage 1?

Cheers, Alan
