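Hi all,

The pattern "(x | y) - y" can be optimized to the simpler "(x & ~y)" andn pattern. Every bit set in y is also set in (x | y), so the subtraction never borrows and simply clears those bits.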
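As a quick sanity check of the identity (not part of the patch), the two forms can be compared exhaustively on a narrow type. The sketch below assumes 8-bit unsigned operands purely to keep the search space small; the helper names or_minus, and_not and count_mismatches are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The form from the mail: OR in y, then subtract it again.
   (Illustrative helper, not taken from the patch.) */
static uint8_t or_minus(uint8_t x, uint8_t y)
{
  return (uint8_t) ((x | y) - y);
}

/* The and-not form it is expected to match. */
static uint8_t and_not(uint8_t x, uint8_t y)
{
  return (uint8_t) (x & ~y);
}

/* Count the (x, y) pairs on which the two forms disagree,
   over all 256 * 256 combinations of 8-bit inputs. */
static unsigned count_mismatches(void)
{
  unsigned bad = 0;
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      if (or_minus((uint8_t) x, (uint8_t) y) != and_not((uint8_t) x, (uint8_t) y))
        bad++;
  return bad;
}
```

Calling count_mismatches() from a small main and checking that it returns 0 exercises every input pair. The same borrow-free argument applies at any width, which is why the 8-bit case is a reasonable stand-in for int and long long.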
So, for the example code:

$ cat main.c
int f_i(int x, int y)
{
  return (x | y) - y;
}

long long f_l(long long x, long long y)
{
  return (x | y) - y;
}

typedef int v4si __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

v4si f_v4si(v4si a, v4si b)
{
  return (a | b) - b;
}

v2di f_v2di(v2di a, v2di b)
{
  return (a | b) - b;
}

void f(v4si *d, v4si *a, v4si *b)
{
  for (int i=0; i<N; i++)
    d[i] = (a[i] | b[i]) - b[i];
}

Before this patch:

$ ./aarch64-none-linux-gnu-gcc -S -O2 main.c -dp

f_i:
	orr	w0, w0, w1	// 8	[c=4 l=4]  iorsi3/0
	sub	w0, w0, w1	// 14	[c=4 l=4]  subsi3
	ret			// 24	[c=0 l=4]  *do_return

f_l:
	orr	x0, x0, x1	// 8	[c=4 l=4]  iordi3/0
	sub	x0, x0, x1	// 14	[c=4 l=4]  subdi3/0
	ret			// 24	[c=0 l=4]  *do_return

f_v4si:
	orr	v0.16b, v0.16b, v1.16b	// 8	[c=8 l=4]  iorv4si3/0
	sub	v0.4s, v0.4s, v1.4s	// 14	[c=8 l=4]  subv4si3
	ret				// 24	[c=0 l=4]  *do_return

f_v2di:
	orr	v0.16b, v0.16b, v1.16b	// 8	[c=8 l=4]  iorv2di3/0
	sub	v0.2d, v0.2d, v1.2d	// 14	[c=8 l=4]  subv2di3
	ret				// 24	[c=0 l=4]  *do_return

After this patch:

$ ./aarch64-none-linux-gnu-gcc -S -O2 main.c -dp

f_i:
	bic	w0, w0, w1	// 13	[c=8 l=4]  *bic_and_not_si3
	ret			// 23	[c=0 l=4]  *do_return

f_l:
	bic	x0, x0, x1	// 13	[c=8 l=4]  *bic_and_not_di3
	ret			// 23	[c=0 l=4]  *do_return

f_v4si:
	bic	v0.16b, v0.16b, v1.16b	// 13	[c=16 l=4]  *bic_and_not_simd_v4si3
	ret				// 23	[c=0 l=4]  *do_return

f_v2di:
	bic	v0.16b, v0.16b, v1.16b	// 13	[c=16 l=4]  *bic_and_not_simd_v2di3
	ret				// 23	[c=0 l=4]  *do_return

Bootstrapped and tested on aarch64-none-linux-gnu.

OK for master?

Cheers,
Przemyslaw

gcc/ChangeLog:

	PR tree-optimization/94880
	* config/aarch64/aarch64.md (bic_and_not_<mode>3): New define_insn.
	* config/aarch64/aarch64-simd.md (bic_and_not_simd_<mode>3): New
	define_insn.

gcc/testsuite/ChangeLog:

	PR tree-optimization/94880
	* gcc.target/aarch64/bic_and_not_di3.c: New test.
	* gcc.target/aarch64/bic_and_not_si3.c: New test.
	* gcc.target/aarch64/bic_and_not_v2di3.c: New test.
	* gcc.target/aarch64/bic_and_not_v4si3.c: New test.
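
In the output above, both f_v4si and f_v2di end up as the same whole-register "bic v0.16b". One way to see why the lane width stops mattering: the subtraction never borrows, so the result is purely bitwise. The sketch below checks this lane by lane using the GCC vector extension and the same typedefs as main.c. It is an illustration only and is not the content of the new tests; the helper names and the sample values are hypothetical.

```c
#include <assert.h>

typedef int v4si __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

/* Return 1 if (a | b) - b and a & ~b agree in every 32-bit lane. */
static int lanes_agree_v4si(v4si a, v4si b)
{
  v4si lhs = (a | b) - b;
  v4si rhs = a & ~b;
  for (int i = 0; i < 4; i++)
    if (lhs[i] != rhs[i])
      return 0;
  return 1;
}

/* Return 1 if (a | b) - b and a & ~b agree in every 64-bit lane. */
static int lanes_agree_v2di(v2di a, v2di b)
{
  v2di lhs = (a | b) - b;
  v2di rhs = a & ~b;
  for (int i = 0; i < 2; i++)
    if (lhs[i] != rhs[i])
      return 0;
  return 1;
}

/* Run both checks on a few hand-picked bit patterns, including
   all-ones and sign-bit cases.  Returns 1 if every lane agrees. */
static int check_vectors(void)
{
  v4si a4 = { 0x0f0f0f0f, -1, 0, 0x12345678 };
  v4si b4 = { 0x00ff00ff, 0x7fffffff, -1, 0 };
  v2di a2 = { 0x00ff00ff00ff00ffLL, -1LL };
  v2di b2 = { 0x0f0f0f0f0f0f0f0fLL, 0x7fffffffffffffffLL };
  return lanes_agree_v4si(a4, b4) && lanes_agree_v2di(a2, b2);
}
```

A handful of sample vectors is not a proof, but check_vectors() returning 1 is consistent with the 4s and 2d subtractions both collapsing to a single bitwise operation.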