https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95199
            Bug ID: 95199
           Summary: Remove extra variable created for memory reference in
                    loop vectorization.
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zhoukaipeng3 at huawei dot com
  Target Milestone: ---

The function vect_create_data_ref_ptr creates two identical pointer variables
for two identical memory references.

gcc version 11.0.0 20200515 (experimental) (GCC)
Target: aarch64-unknown-linux-gnu
Configured with: ../configure

Command:
gcc -O2 -march=armv8.2-a+fp+sve -ftree-vectorize test.c -S

Testcase:
void foo (double *a, double *b, double m, int inc_x, int inc_y)
{
  int ix = 0, iy = 0;
  for (int i = 0; i < 1000; ++i)
    {
      a[ix] += m * b[iy];
      ix += inc_x;
      iy += inc_y;
    }
  return;
}

Assembly code:
.L5:
        ld1d    z3.d, p0/z, [x5, z2.d, lsl 3]
        ld1d    z1.d, p0/z, [x3, z4.d, lsl 3]
        fmad    z1.d, p1/m, z0.d, z3.d
        st1d    z1.d, p0, [x2, z2.d, lsl 3]
        incd    x1
        add     x5, x5, x6
        add     x3, x3, x4
        add     x2, x2, x6
        whilelo p0.d, x1, x0
        b.any   .L5

x2 is the same as x5: both serve as the base pointer for a[ix] and are
advanced by the same step (x6), so one of them is redundant.
vectorizable_load and vectorizable_store each call vect_create_data_ref_ptr
for a[ix], so the pointer is created twice.

Dump log in test.c.161.vect:
test.c:4:2: note:  create real_type-pointer variable to type: double vectorizing a pointer ref: *a_16(D)
test.c:4:2: note:  created a_16(D)
test.c:4:2: note:  add new stmt: vect__4.5_94 = .MASK_GATHER_LOAD (vectp_a.3_91, _90, 8, { 0.0, ... }, loop_mask_93);
...
test.c:4:2: note:  create real_type-pointer variable to type: double vectorizing a pointer ref: *a_16(D)
test.c:4:2: note:  created a_16(D)
test.c:4:2: note:  add new stmt: .MASK_SCATTER_STORE (vectp_a.11_117, _116, 8, vect__10.10_108, loop_mask_93);

I plan to add a hash_map to loop_vec_info that maps each dr to the pointer
created for it by vect_create_data_ref_ptr. If the dr->ref has already been
handled, the function returns the existing pointer instead of creating a new
one.

Optimized assembly code:
.L3:
        ld1d    z2.d, p0/z, [x0, z1.d, lsl 3]
        ld1d    z0.d, p0/z, [x1, z4.d, lsl 3]
        fmad    z0.d, p1/m, z3.d, z2.d
        st1d    z0.d, p0, [x0, z1.d, lsl 3]
        incd    x2
        add     x0, x0, x5
        add     x1, x1, x4
        whilelo p0.d, w2, w3
        b.any   .L3
        ret
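
To make the proposed caching concrete, here is a minimal standalone sketch of the idea. It is not GCC code: DataRef, LoopInfo and create_data_ref_ptr are hypothetical stand-ins for data_reference, loop_vec_info and vect_create_data_ref_ptr, and std::unordered_map stands in for GCC's hash_map. The memory reference is modelled as a string, whereas a real patch would have to compare tree expressions and presumably also check that the other inputs to pointer creation (offset, step, insertion point) match before reusing a pointer.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a data reference. In GCC the reference is a
// tree (dr->ref), not a string.
struct DataRef {
    std::string ref;  // textual form of the memory reference, e.g. "*a_16(D)"
};

// Hypothetical stand-in for loop_vec_info, extended with the proposed cache.
struct LoopInfo {
    // Maps an already-handled memory reference to the pointer made for it.
    std::unordered_map<std::string, std::string> ref_to_ptr;
    int next_id = 0;  // used only to generate fresh pointer names
    int created = 0;  // number of pointer variables actually created
};

// Hypothetical analogue of vect_create_data_ref_ptr: return the cached
// pointer if this reference was seen before, otherwise create a new one
// and remember it.
std::string create_data_ref_ptr(LoopInfo &loop, const DataRef &dr) {
    auto it = loop.ref_to_ptr.find(dr.ref);
    if (it != loop.ref_to_ptr.end())
        return it->second;  // reference already handled: reuse its pointer

    std::string ptr = "vectp." + std::to_string(loop.next_id++);
    ++loop.created;
    loop.ref_to_ptr.emplace(dr.ref, ptr);
    return ptr;
}
```

In this model, the gather load and the scatter store of a[ix] both pass a DataRef for "*a_16(D)" and receive the same pointer name, so only two pointers (one for a, one for b) are created for the loop rather than three.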