On 03/06/16 21:27, Pranith Kumar wrote:
> On Thu, Jun 2, 2016 at 5:18 PM, Richard Henderson <r...@twiddle.net> wrote:
>> Hum.  That does seem helpful-ish.  But I'm not certain how helpful it is to
>> complicate the helper functions even further.
>>
>> What if we have tcg_canonicalize_memop (or some such) split off the barriers
>> into separate opcodes.  E.g.
>>
>> MO_BAR_LD_B = 32        // prevent earlier loads from crossing current op
>> MO_BAR_ST_B = 64        // prevent earlier stores from crossing current op
>> MO_BAR_LD_A = 128       // prevent later loads from crossing current op
>> MO_BAR_ST_A = 256       // prevent later stores from crossing current op
>> MO_BAR_LDST_B = MO_BAR_LD_B | MO_BAR_ST_B
>> MO_BAR_LDST_A = MO_BAR_LD_A | MO_BAR_ST_A
>> MO_BAR_MASK = MO_BAR_LDST_B | MO_BAR_LDST_A
>>
>> // Match Sparc MEMBAR as the most flexible host.
>> TCG_BAR_LD_LD = 1       // #LoadLoad barrier
>> TCG_BAR_ST_LD = 2       // #StoreLoad barrier
>> TCG_BAR_LD_ST = 4       // #LoadStore barrier
>> TCG_BAR_ST_ST = 8       // #StoreStore barrier
>> TCG_BAR_SYNC  = 64      // SEQ_CST barrier
> I really like this format. I would also like to add to the frontend:
>
> MO_BAR_ACQUIRE
> MO_BAR_RELEASE
>
> and the following to the backend:
>
> TCG_BAR_ACQUIRE
> TCG_BAR_RELEASE
>
> since these are one-way barriers and the previous barrier types do not
> cover them.

Actually, the acquire barrier is a combined load-load + load-store
barrier, and the release barrier is a combined load-store + store-store
barrier, so both can already be expressed with the TCG_BAR_* bits
proposed above.
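
To make that concrete, here is a rough sketch in C. The four TCG_BAR_*
values are copied from Richard's proposal above. The composite names
TCG_BAR_ACQUIRE and TCG_BAR_RELEASE and the bar_covers() helper are only
illustrative assumptions, not existing QEMU definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Backend barrier bits, values as proposed in the quoted message.
 * Naming follows Sparc MEMBAR: TCG_BAR_<earlier>_<later>. */
enum {
    TCG_BAR_LD_LD = 1,  /* earlier loads ordered before later loads   */
    TCG_BAR_ST_LD = 2,  /* earlier stores ordered before later loads  */
    TCG_BAR_LD_ST = 4,  /* earlier loads ordered before later stores  */
    TCG_BAR_ST_ST = 8,  /* earlier stores ordered before later stores */
};

/* Hypothetical composites (assumption, not part of the proposal):
 * acquire keeps later accesses from moving above an earlier load,
 * release keeps earlier accesses from moving below a later store. */
enum {
    TCG_BAR_ACQUIRE = TCG_BAR_LD_LD | TCG_BAR_LD_ST,
    TCG_BAR_RELEASE = TCG_BAR_LD_ST | TCG_BAR_ST_ST,
};

/* Neither composite needs a bit outside the four basic ones. */
static_assert(TCG_BAR_ACQUIRE == 5, "acquire = LD_LD | LD_ST");
static_assert(TCG_BAR_RELEASE == 12, "release = LD_ST | ST_ST");

/* Illustrative helper: true if the barrier set 'have' provides every
 * ordering requested in 'want'. */
static bool bar_covers(int have, int want)
{
    return (have & want) == want;
}
```

Note that even TCG_BAR_ACQUIRE | TCG_BAR_RELEASE does not include
TCG_BAR_ST_LD, so store-load ordering still needs a separate, stronger
barrier.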

Kind regards,
Sergey

>
>> where
>>
>>   tcg_gen_qemu_ld_i32(x, y, i, m | MO_BAR_LD_B | MO_BAR_ST_A)
>>
>> emits
>>
>>   mb            TCG_BAR_LD_LD
>>   qemu_ld_i32   x, y, i, m
>>   mb            TCG_BAR_LD_ST
>>
>> We can then add an optimization pass which folds barriers with no memory
>> operations in between, so that duplicates are eliminated.
>>
> Yes, folding/eliding these barriers in an optimization pass sounds
> like a good idea.
>
> Thanks,

