Hi Bin,

> Seems Richi added code like below comparing costs between aligned and
> unaligned loads, and only peeling if it's beneficial:
>
>   /* In case there are only loads with different unknown misalignments, use
>      peeling only if it may help to align other accesses in the loop or
>      if it may help improving load bandwith when we'd end up using
>      unaligned loads.  */
>   tree dr0_vt = STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr0)));
>   if (!first_store
>       && !STMT_VINFO_SAME_ALIGN_REFS (
>             vinfo_for_stmt (DR_STMT (dr0))).length ()
>       && (vect_supportable_dr_alignment (dr0, false)
>           != dr_unaligned_supported
>           || (builtin_vectorization_cost (vector_load, dr0_vt, 0)
>               == builtin_vectorization_cost (unaligned_load, dr0_vt, -1))))
>     do_peeling = false;

Yes, this is the "special case" I was referring to.  It successfully
avoids peeling when there is no store (after we had set the vectorization
costs).  My patch tries to check the costs for all references.

Regards
 Robin