On Tue, Jun 06, 2017 at 01:24:11 -0700, Richard Henderson wrote:
> On 06/05/2017 03:49 PM, Emilio G. Cota wrote:
> >+TranslationBlock *tcg_tb_alloc(TCGContext *s)
> >+{
> >+    void *aligned;
> >+
> >+    aligned = (void *)ROUND_UP((uintptr_t)s->code_gen_ptr, QEMU_CACHELINE_SIZE);
> >+    if (unlikely(aligned + sizeof(TranslationBlock) > s->code_gen_highwater)) {
> >+        return NULL;
> >+    }
> >+    s->code_gen_ptr += aligned - s->code_gen_ptr + sizeof(TranslationBlock);
> >+    return aligned;
>
> We don't really need the 2/3 patch.  We don't gain anything by telling the
> compiler that the structure is more aligned than it needs to be.

The compile-time requirement is for the compiler to pad the structs
appropriately; this is critical to avoid false sharing when allocating
arrays of structs like those test programs do.

> We can query the line size at runtime, as suggested by Pranith, and use that
> for the alignment here.  Which means that the binary isn't tied to a
> particular cpu implementation, which is clearly preferable for
> distributions.

For this particular case we can get away without padding the structs if
we're OK with having the end of a TB struct immediately followed by its
translated code, instead of having that code on the following cache line.

		E.
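
To make the two cases above concrete, here is a minimal sketch. It is not
QEMU code: ASSUMED_CACHELINE, struct counter, struct padded_counter,
struct region, struct hdr (a stand-in for TranslationBlock) and the helper
names are all hypothetical, the 64-byte value is only a guess, and
_SC_LEVEL1_DCACHE_LINESIZE is a glibc extension that may be missing or
report 0 elsewhere. The first half shows compile-time padding for arrays
of structs; the second half shows a bump allocator that aligns only the
start of each header to a line size queried at run time, so whatever
follows the header may share its last cache line.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Assumed line size for the compile-time case (a guess, not a QEMU value). */
#define ASSUMED_CACHELINE 64

/* Unpadded: neighbouring array elements can land on the same cache line. */
struct counter {
    unsigned long val;
};

/*
 * Compile-time alignment (GCC/Clang attribute): the compiler pads sizeof()
 * up to a multiple of the alignment, so each array element owns its line.
 */
struct padded_counter {
    unsigned long val;
} __attribute__((aligned(ASSUMED_CACHELINE)));

/* Round p up to a power-of-two alignment that is only known at run time. */
static uintptr_t round_up_ptr(uintptr_t p, uintptr_t align)
{
    return (p + align - 1) & ~(align - 1);
}

/* Query the L1 data cache line size, falling back to the assumed value. */
static size_t runtime_cacheline(void)
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    long sz = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);

    /* Accept only positive powers of two. */
    if (sz > 0 && (sz & (sz - 1)) == 0) {
        return (size_t)sz;
    }
#endif
    return ASSUMED_CACHELINE;
}

/* Hypothetical stand-ins for the code buffer and for TranslationBlock. */
struct region {
    char *ptr;        /* next free byte */
    char *highwater;  /* end of usable space */
};

struct hdr {
    int fields[10];   /* deliberately not a multiple of the line size */
};

/*
 * Bump-allocate one header whose start is cache-line aligned. Only the
 * start is aligned; the header is not padded, so the bytes that follow
 * it may share its last cache line.
 */
static void *hdr_alloc(struct region *r, size_t line)
{
    uintptr_t aligned = round_up_ptr((uintptr_t)r->ptr, line);

    if (aligned + sizeof(struct hdr) > (uintptr_t)r->highwater) {
        return NULL;
    }
    r->ptr = (char *)(aligned + sizeof(struct hdr));
    return (void *)aligned;
}
```

With this layout, two consecutive hdr_alloc() calls return headers that
start on distinct cache lines, which is all the allocator needs; the
padded struct only matters when elements are laid out back to back in an
array.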