https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116058
--- Comment #4 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Note there isn't anything inherently wrong with having a clobber that references the same hard register as another operand. If the clobber occurs before the inputs are consumed, then the clobber needs to be marked as earlyclobber in the constraint.

I think the parallel is redundant here, but it's not obvious why removing it helps. In a define_insn, if you have multiple elements, they are implicitly wrapped in a parallel. In a define_expand you have to explicitly use a parallel, as the multiple elements would otherwise be considered distinct insns to emit.

So why exactly does removing the explicit parallel and relying on the implicit parallel help here?
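
To make the define_insn versus define_expand distinction concrete, here is a minimal sketch. It is not the pattern from this bug: the pattern names, the SImode operands, and the output template are all hypothetical and assume a generic target with register operands. The "&" in the scratch constraint is the earlyclobber modifier mentioned above.

```
;; Hypothetical patterns for illustration only; names, modes and the
;; output template are assumptions, not taken from the port in this PR.

;; define_insn: the two elements of the vector are implicitly wrapped
;; in a single parallel, so this matches one insn with a set and a clobber.
;; "=&r" marks the scratch as earlyclobber: it may be written before
;; operands 1 and 2 are consumed, so it must not share a register with them.
(define_insn "*example_add_clobber"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "r")
                 (match_operand:SI 2 "register_operand" "r")))
   (clobber (match_scratch:SI 3 "=&r"))]
  ""
  "add\t%0,%1,%2")

;; define_expand: without the explicit parallel, the set and the clobber
;; would be emitted as two separate insns, so the parallel is required
;; to produce a single insn that the define_insn above can match.
(define_expand "example_add"
  [(parallel
     [(set (match_operand:SI 0 "register_operand")
           (plus:SI (match_operand:SI 1 "register_operand")
                    (match_operand:SI 2 "register_operand")))
      (clobber (match_scratch:SI 3))])]
  ""
  "")
```

Under these assumptions, writing an explicit parallel inside the define_insn vector as well would describe the same insn shape, which is why the explicit parallel looks redundant there.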