https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121910
--- Comment #2 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
I think this is a register pressure issue.

Consider the following IR:

  unsigned char _11;
  _9 = (sizetype) x_75;
  _10 = src_72 + _9;
  _11 = *_10;
  _12 = (int) _11;   ---> EEW8->EEW32.

If maximum LMUL = 4 is picked, the number of live vector registers should be
4 * 4 = 16 in a single IR statement.

However, the register pressure analysis seems to have missed it:

/app/example.c:19:27: note: Compute local program points for bb 4:
/app/example.c:19:27: note: program point 1: _11 = *_10;
/app/example.c:19:27: note: program point 2: _16 = *_15;
/app/example.c:19:27: note: program point 3: _21 = *_20;
/app/example.c:19:27: note: program point 4: _26 = *_25;
/app/example.c:19:27: note: program point 5: _33 = (unsigned char) _31;
/app/example.c:19:27: note: program point 6: *_32 = _33;

Before vectorization, the bb 4 IR is:

  _9 = (sizetype) x_75;
  _10 = src_72 + _9;
  _11 = *_10;
  _12 = (int) _11;
  _13 = _12 * cA_46;
  _14 = _9 + 1;
  _15 = src_72 + _14;
  _16 = *_15;
  _17 = (int) _16;
  _18 = _17 * cB_47;
  _19 = _13 + _18;
  _20 = srcp_73 + _9;
  _21 = *_20;
  _22 = (int) _21;
  _23 = _22 * cC_48;
  _24 = _19 + _23;
  _25 = srcp_73 + _14;
  _26 = *_25;
  _27 = (int) _26;
  _28 = _27 * cD_49;
  _29 = _24 + _28;
  _30 = _29 + 32;
  _31 = _30 >> 6;
  _32 = dst_71 + _9;
  _33 = (unsigned char) _31;
  *_32 = _33;

I think we have missed multiple IR statements (program points) during the
analysis.

In
https://github.com/gcc-mirror/gcc/blob/master/gcc/config/riscv/riscv-vector-costs.cc
I guess the condition here:

  if (STMT_VINFO_RELEVANT_P (stmt_info))
    {
      stmt_point info = {point, gsi_stmt (si), stmt_info};
      program_points.safe_push (info);
      point++;
      if (dump_enabled_p ())
        dump_printf_loc (MSG_NOTE, vect_location,
                         "program point %d: %G", info.point,
                         gsi_stmt (si));
    }

missed some IR statements.
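
To make the gap between the dump and the basic block concrete, here is a small standalone sketch (not GCC code). It replays the shape of the quoted loop over the 26 statements of bb 4. The `Stmt`, `ProgramPoint`, `bb4`, `collect_program_points` and `skipped_statements` names are hypothetical, and the `recorded` flag is simply copied from the dump above rather than computed the way STMT_VINFO_RELEVANT_P would compute it.

```cpp
// Toy model of the program-point numbering quoted above. Not GCC code.
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a GIMPLE statement. `recorded` is taken from
// the dump (true if the statement shows up as a program point).
struct Stmt {
  std::string text;
  bool recorded;
};

struct ProgramPoint {
  unsigned point;
  std::string text;
};

// The statements of bb 4 before vectorization, in order.
std::vector<Stmt> bb4() {
  return {
      {"_9 = (sizetype) x_75;", false},  {"_10 = src_72 + _9;", false},
      {"_11 = *_10;", true},             {"_12 = (int) _11;", false},
      {"_13 = _12 * cA_46;", false},     {"_14 = _9 + 1;", false},
      {"_15 = src_72 + _14;", false},    {"_16 = *_15;", true},
      {"_17 = (int) _16;", false},       {"_18 = _17 * cB_47;", false},
      {"_19 = _13 + _18;", false},       {"_20 = srcp_73 + _9;", false},
      {"_21 = *_20;", true},             {"_22 = (int) _21;", false},
      {"_23 = _22 * cC_48;", false},     {"_24 = _19 + _23;", false},
      {"_25 = srcp_73 + _14;", false},   {"_26 = *_25;", true},
      {"_27 = (int) _26;", false},       {"_28 = _27 * cD_49;", false},
      {"_29 = _24 + _28;", false},       {"_30 = _29 + 32;", false},
      {"_31 = _30 >> 6;", false},        {"_32 = dst_71 + _9;", false},
      {"_33 = (unsigned char) _31;", true},
      {"*_32 = _33;", true},
  };
}

// Same shape as the quoted loop: only flagged statements get a number,
// and numbering starts at 1 as in the dump.
std::vector<ProgramPoint> collect_program_points(const std::vector<Stmt>& stmts) {
  std::vector<ProgramPoint> points;
  unsigned point = 1;
  for (const Stmt& s : stmts) {
    if (s.recorded) {
      points.push_back({point, s.text});
      point++;
    }
  }
  assert(points.size() + 1 == point);  // numbering is dense
  return points;
}

// Statements that never become a program point.
std::vector<std::string> skipped_statements(const std::vector<Stmt>& stmts) {
  std::vector<std::string> skipped;
  for (const Stmt& s : stmts)
    if (!s.recorded)
      skipped.push_back(s.text);
  return skipped;
}
```

Calling `collect_program_points(bb4())` reproduces the six program points in the dump, while `skipped_statements(bb4())` lists the remaining statements, including the widening conversion `_12 = (int) _11;`. Some of those (for example the scalar address computations) may be skipped legitimately; the sketch only shows which statements are absent, not which ones should have been present.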