dmgreen added a comment.

Hello. We've received reports that this is bloating the code size of some code, 
quite a lot in places. There is an example in https://godbolt.org/z/66TEKa1xK. 
Essentially, glomming the reads/writes together into i32s (in our case) 
helps to reduce the total number of loads/stores needed. Splitting them up into 
individual i8/i16 accesses creates a lot more load/mask/load/mask/or/store sequences.
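
As a rough illustration of the kind of code affected (a hypothetical sketch, 
not the code behind the godbolt link; the struct and function names are made 
up), consider several adjacent bitfields that are all written together. With 
a single combined i32 access the four assignments can share one 
load/modify/store, whereas narrower per-field accesses may each need their own.

```c
/* Hypothetical layout: four bitfields that together fill 32 bits. */
struct Flags {
    unsigned kind  : 8;
    unsigned level : 8;
    unsigned count : 12;
    unsigned valid : 4;
};

/* Writes every field of the same storage unit. How many loads, masks and
 * stores this becomes depends on the access width the compiler picks. */
void init_flags(struct Flags *f, unsigned kind, unsigned level, unsigned count) {
    f->kind = kind;
    f->level = level;
    f->count = count;
    f->valid = 1;
}

/* Reads a field that is not byte-sized, so it needs a shift and a mask. */
unsigned read_count(const struct Flags *f) {
    return f->count;
}
```

Comparing the assembly for `init_flags` before and after the patch at -O2 
or -Os should show whether the accesses are merged or split on a given target.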

I suspect it's better in some cases and worse in others. It depends on where 
exactly the bitfield delimiters lie, and whether they are more likely to be on 
i32/i64 boundaries (which I believe is common in a lot of the cases where 
bitfields are used) or to be more randomly distributed.
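
To make the two situations concrete, here is a small sketch with two 
hypothetical layouts (the names and field widths are assumptions chosen for 
illustration). In the first, every field starts and ends on a byte boundary; 
in the second, the fields straddle byte boundaries, so no narrow access 
covers a whole field without masking.

```c
/* Hypothetical: fields line up with i8/i16 boundaries inside one i32. */
struct Aligned {
    unsigned a : 8;
    unsigned b : 8;
    unsigned c : 16;
};

/* Hypothetical: same total width, but the fields cross byte boundaries. */
struct Straddling {
    unsigned a : 5;
    unsigned b : 13;
    unsigned c : 14;
};

/* Each function reads all three fields of one 32-bit storage unit. */
unsigned sum_aligned(const struct Aligned *s) {
    return s->a + s->b + s->c;
}

unsigned sum_straddling(const struct Straddling *s) {
    return s->a + s->b + s->c;
}
```

Which layout is more representative of real code is the open question here; 
the sketch only shows the two shapes being compared.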


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79155/new/

https://reviews.llvm.org/D79155

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
