https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104151

--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
_1 1 times scalar_store costs 12 in body
_2 1 times scalar_store costs 12 in body
__builtin_bswap64 (_8) 1 times scalar_stmt costs 4 in body
__builtin_bswap64 (_10) 1 times scalar_stmt costs 4 in body
BIT_FIELD_REF <a_3(D), 64, 64> 1 times scalar_stmt costs 4 in body
BIT_FIELD_REF <a_3(D), 64, 0> 1 times scalar_stmt costs 4 in body
1<unknown> 1 times vec_perm costs 4 in body
__builtin_bswap64 (_8) 1 times vector_stmt costs 4 in prologue
__builtin_bswap64 (_8) 1 times vec_perm costs 4 in body
_1 1 times vector_store costs 16 in body
test.cc:9:10: note: Cost model analysis for part in loop 0:
  Vector cost: 28
  Scalar cost: 40
...
  <bb 2> [local count: 1073741824]:
  _8 = BIT_FIELD_REF <a_3(D), 64, 64>;
  _11 = VIEW_CONVERT_EXPR<vector(2) long unsigned int>(a_3(D));
  _13 = VIEW_CONVERT_EXPR<vector(2) long unsigned int>(a_3(D));
  _12 = VEC_PERM_EXPR <_11, _13, { 1, 0 }>;
  _14 = VIEW_CONVERT_EXPR<vector(16) char>(_12);
  _15 = VEC_PERM_EXPR <_14, _14, { 7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8 }>;
  _16 = VIEW_CONVERT_EXPR<vector(2) long unsigned int>(_15);
  _1 = __builtin_bswap64 (_8);
  _10 = BIT_FIELD_REF <a_3(D), 64, 0>;
  _2 = __builtin_bswap64 (_10);
  MEM <vector(2) long unsigned int> [(long unsigned int *)&y] = _16;
  _7 = MEM <uint128_t> [(char * {ref-all})&y];
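
For orientation, a function of roughly the following shape would be expected
to lower to GIMPLE like the above. This is only a sketch reconstructed from the
dump, not the testcase attached to the bug: the function name bswap128_sketch
and the exact source form are assumptions, and the half ordering is described
for a little-endian x86-64 target.

```cpp
#include <cstdint>
#include <cstring>

typedef unsigned __int128 uint128_t;

// Hypothetical reconstruction (not the testcase from the bug report):
// byte-swap each 64-bit half of `a` and store the halves in swapped order.
uint128_t bswap128_sketch(uint128_t a)
{
    uint64_t y[2];
    // High half of `a` (BIT_FIELD_REF <a, 64, 64> in the dump), byte-swapped.
    y[0] = __builtin_bswap64(static_cast<uint64_t>(a >> 64));
    // Low half of `a` (BIT_FIELD_REF <a, 64, 0> in the dump), byte-swapped.
    y[1] = __builtin_bswap64(static_cast<uint64_t>(a));
    uint128_t r;
    // Read the two stores back as one 128-bit value.
    std::memcpy(&r, y, sizeof r);
    return r;
}
```

In the dump, the two scalar stores to y are the ones replaced by a single
vector store, with the half swap and the per-half byte swap each expressed as
a VEC_PERM_EXPR.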


1. According to the ABI, uint128 is passed in 2 GPRs, so there should be an
extra cost for _11 = VIEW_CONVERT_EXPR<vector(2) long unsigned int>(a_3(D));
2. Why is it "<unknown> 1 times vec_perm costs 4 in body"? Should it be 2
times vec_perm?
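
As a quick check of the arithmetic, the totals reported by the cost model can
be re-derived from the individual entries in the dump. The sketch below only
adds up the numbers quoted above. The last constant is a hypothetical what-if
that assumes a second vec_perm would also cost 4. It leaves out any
GPR-to-vector move cost from point 1, since the dump gives no figure for it.

```cpp
// Per-statement costs copied from the dump above.
constexpr int scalar_cost = 12 + 12  // two scalar_store
                          + 4 + 4    // two __builtin_bswap64 scalar_stmt
                          + 4 + 4;   // two BIT_FIELD_REF scalar_stmt

constexpr int vector_cost = 4        // <unknown> vec_perm (body)
                          + 4        // __builtin_bswap64 vector_stmt (prologue)
                          + 4        // __builtin_bswap64 vec_perm (body)
                          + 16;      // vector_store

static_assert(scalar_cost == 40, "matches 'Scalar cost: 40'");
static_assert(vector_cost == 28, "matches 'Vector cost: 28'");

// Hypothetical: count the <unknown> vec_perm twice, assuming cost 4 each.
constexpr int vector_cost_two_perms = vector_cost + 4;
static_assert(vector_cost_two_perms == 32, "28 + 4");
```

Under that assumption the vector total would be 32, still below the scalar
total of 40.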
