https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68780
Guille <guille at cal dot berkeley.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #2 from Guille <guille at cal dot berkeley.edu> ---
Thanks for the explanation, you are exactly right. Apologies for my
misunderstanding.

(In reply to James Almer from comment #1)
> What i assume you want is _mm256_mullo_epi32(a, b), which maps to the
> vpmulld instruction (Multiply the packed 32-bit integers in a and b,
> producing intermediate 64-bit integers, and store the low 32 bits of the
> intermediate integers in dst), which for your testcase would result in eight
> 32-bit integers with value 2.
>
> _mm256_mul_epi32(a, b) maps to vpmuldq (Multiply the low 32-bit integers
> from each packed 64-bit element in a and b, and store the signed 64-bit
> results in dst), which for your testcase correctly gives four 64-bit
> integers with value 2.
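
For anyone landing on this report later, the difference described in comment #1 can be sketched in plain C without AVX2 hardware. This is only a scalar model of the lane semantics, not the intrinsics themselves: the names model_mullo_epi32, model_mul_epi32 and check_assumed_testcase are made up for illustration, and the inputs (1 and 2 in every 32-bit lane) are an assumption chosen to match the results quoted above, since the original testcase is not part of this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of _mm256_mullo_epi32 (vpmulld):
 * eight independent 32x32 multiplies, keeping the low 32 bits of each. */
static void model_mullo_epi32(const int32_t a[8], const int32_t b[8],
                              int32_t dst[8])
{
    for (int i = 0; i < 8; i++) {
        int64_t wide = (int64_t)a[i] * (int64_t)b[i];
        /* Truncate to the low 32 bits. The final conversion back to a
         * signed type is implementation-defined in C; GCC wraps modulo 2^32. */
        dst[i] = (int32_t)(uint32_t)wide;
    }
}

/* Hypothetical scalar model of _mm256_mul_epi32 (vpmuldq):
 * only the low 32 bits of each 64-bit element are used, i.e. the
 * even-indexed 32-bit lanes, and each product is a full signed 64-bit value. */
static void model_mul_epi32(const int32_t a[8], const int32_t b[8],
                            int64_t dst[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = (int64_t)a[2 * i] * (int64_t)b[2 * i];
}

/* Assumed testcase: every 32-bit lane of a is 1 and of b is 2. */
static void check_assumed_testcase(void)
{
    int32_t a[8], b[8], lo[8];
    int64_t wide[4];

    for (int i = 0; i < 8; i++) {
        a[i] = 1;
        b[i] = 2;
    }
    model_mullo_epi32(a, b, lo);
    model_mul_epi32(a, b, wide);

    for (int i = 0; i < 8; i++)
        assert(lo[i] == 2);    /* eight 32-bit results */
    for (int i = 0; i < 4; i++)
        assert(wide[i] == 2);  /* four 64-bit results */
}
```

Reinterpreting the four 64-bit results as eight 32-bit integers gives 2, 0, 2, 0, ... on a little-endian target, which is presumably what made the _mm256_mul_epi32 output look wrong at first. The real intrinsics are declared in <immintrin.h> and need AVX2 enabled when compiling with GCC (for example with -mavx2).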