On 03/06/16 21:27, Pranith Kumar wrote:
> On Thu, Jun 2, 2016 at 5:18 PM, Richard Henderson <r...@twiddle.net> wrote:
>> Hum. That does seem helpful-ish. But I'm not certain how helpful it is to
>> complicate the helper functions even further.
>>
>> What if we have tcg_canonicalize_memop (or some such) split off the barriers
>> into separate opcodes. E.g.
>>
>> MO_BAR_LD_B = 32 // prevent earlier loads from crossing current op
>> MO_BAR_ST_B = 64 // prevent earlier stores from crossing current op
>> MO_BAR_LD_A = 128 // prevent later loads from crossing current op
>> MO_BAR_ST_A = 256 // prevent later stores from crossing current op
>> MO_BAR_LDST_B = MO_BAR_LD_B | MO_BAR_ST_B
>> MO_BAR_LDST_A = MO_BAR_LD_A | MO_BAR_ST_A
>> MO_BAR_MASK = MO_BAR_LDST_B | MO_BAR_LDST_A
>>
>> // Match Sparc MEMBAR as the most flexible host.
>> TCG_BAR_LD_LD = 1 // #LoadLoad barrier
>> TCG_BAR_ST_LD = 2 // #StoreLoad barrier
>> TCG_BAR_LD_ST = 4 // #LoadStore barrier
>> TCG_BAR_ST_ST = 8 // #StoreStore barrier
>> TCG_BAR_SYNC = 64 // SEQ_CST barrier
> I really like this format. I would also like to add to the frontend:
>
> MO_BAR_ACQUIRE
> MO_BAR_RELEASE
>
> and the following to the backend:
>
> TCG_BAR_ACQUIRE
> TCG_BAR_RELEASE
>
> since these are one-way barriers and the previous barrier types do not
> cover them.

Actually, the acquire barrier is a combined load-load + load-store barrier;
and the release barrier is a combo of load-store + store-store barriers.

Kind regards,
Sergey

>
>> where
>>
>> tcg_gen_qemu_ld_i32(x, y, i, m | MO_BAR_LD_BEFORE | MO_BAR_ST_AFTER)
>>
>> emits
>>
>> mb TCG_BAR_LD_LD
>> qemu_ld_i32 x, y, i, m
>> mb TCG_BAR_LD_ST
>>
>> We can then add an optimization pass which folds barriers with no memory
>> operations in between, so that duplicates are eliminated.
>>
> Yes, folding/eliding these barriers in an optimization pass sounds
> like a good idea.
>
> Thanks,
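
To make the point about acquire and release concrete, here is a minimal
sketch in C. It assumes the TCG_BAR_* bit values proposed in the quoted
message. The composite definitions of TCG_BAR_ACQUIRE and TCG_BAR_RELEASE
and the helper tcg_bar_covers() are hypothetical, written only for
illustration, and are not taken from the QEMU tree.

```c
#include <stdbool.h>

/* Barrier bits as proposed in the quoted message. */
enum {
    TCG_BAR_LD_LD = 1,   /* #LoadLoad barrier */
    TCG_BAR_ST_LD = 2,   /* #StoreLoad barrier */
    TCG_BAR_LD_ST = 4,   /* #LoadStore barrier */
    TCG_BAR_ST_ST = 8,   /* #StoreStore barrier */

    /*
     * Hypothetical composites (assumption, for illustration only):
     * acquire = load-load + load-store,
     * release = load-store + store-store.
     */
    TCG_BAR_ACQUIRE = TCG_BAR_LD_LD | TCG_BAR_LD_ST,
    TCG_BAR_RELEASE = TCG_BAR_LD_ST | TCG_BAR_ST_ST,
};

/*
 * Hypothetical helper: returns true if the barrier mask `have`
 * already provides every ordering bit requested in `want`.
 */
static bool tcg_bar_covers(unsigned have, unsigned want)
{
    return (have & want) == want;
}
```

With these definitions, neither composite includes TCG_BAR_ST_LD, so
tcg_bar_covers(TCG_BAR_ACQUIRE | TCG_BAR_RELEASE, TCG_BAR_ST_LD) is false:
acquire plus release together still leave store-load reordering
unconstrained, which is what a full barrier would have to add.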