https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107199
Bug ID: 107199
Summary: AVX512 fully masked loop vectorization needs extract_last pattern for vectorization of live variables
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

For fully masked vectorization with AVX512 we'd need to define extract_last, which, given a loop mask {kN}, extracts the lane of the vector corresponding to the last set bit in {kN} (the last iteration of the loop). I don't see a kOP for this, so I suppose moving {kN} to a GPR and then doing BSR, moving it back and then doing a compress might work.
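The suggested sequence can be modelled in portable scalar C to make the intended semantics concrete. This is only a sketch, not a proposed expander: the names `extract_last_model` and `compress_model` are hypothetical, a 16-lane vector of 32-bit elements is assumed, and "moving it back" is read here as materialising a single-bit mask from the BSR result before the compress. `__builtin_clz` stands in for BSR.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 16  /* assumed: one 512-bit vector of 32-bit elements */

/* Scalar model of a zero-masking compress (as VPCOMPRESSD does):
   lanes selected by MASK are packed into the low lanes of DST,
   the remaining lanes of DST are zeroed.  */
static void compress_model(int32_t dst[LANES], uint16_t mask,
                           const int32_t src[LANES])
{
    int out = 0;
    for (int i = 0; i < LANES; i++)
        if (mask & (1u << i))
            dst[out++] = src[i];
    while (out < LANES)
        dst[out++] = 0;
}

/* Hypothetical model of extract_last: return the lane of VEC that
   corresponds to the highest set bit of LOOP_MASK.  */
static int32_t extract_last_model(uint16_t loop_mask,
                                  const int32_t vec[LANES])
{
    /* BSR is undefined for a zero input; a live loop mask is assumed
       to have at least one active lane.  */
    assert(loop_mask != 0);

    /* Step 1: move the mask register to a GPR (KMOV k -> r).  */
    uint32_t gpr = loop_mask;

    /* Step 2: BSR, i.e. the index of the highest set bit.  */
    int idx = 31 - __builtin_clz(gpr);

    /* Step 3: build a single-bit mask and move it back (KMOV r -> k).  */
    uint16_t last = (uint16_t)(1u << idx);

    /* Step 4: compress with that mask; the selected lane lands in
       lane 0, from which it can be read as a scalar.  */
    int32_t packed[LANES];
    compress_model(packed, last, vec);
    return packed[0];
}
```

For example, with a loop mask of 0x0007 (three active lanes in the final iteration) the model returns lane 2 of the vector; with a full mask of 0xFFFF it returns lane 15.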