On Tue, Oct 5, 2021 at 3:07 AM Richard Biener <rguent...@suse.de> wrote:
>
> On Mon, 4 Oct 2021, H.J. Lu wrote:
>
> > commit adedd5c173388ae505470df152b9cb3947339566
> > Author: Jakub Jelinek <ja...@redhat.com>
> > Date:   Tue May 3 13:37:25 2016 +0200
> >
> >     re PR target/49244 (__sync or __atomic builtins will not emit 'lock
> > bts/btr/btc')
> >
> > optimized bit test on atomic builtin return with lock bts/btr/btc.  But
> > it works only for unsigned integers since atomic builtins operate on the
> > 'uintptr_t' type.  It fails on bool:
> >
> >   _1 = atomic builtin;
> >   _4 = (_Bool) _1;
> >
> > and signed integers:
> >
> >   _1 = atomic builtin;
> >   _2 = (int) _1;
> >   _5 = _2 & (1 << N);
> >
> > Improve bit test on atomic builtin return by converting:
> >
> >   _1 = atomic builtin;
> >   _4 = (_Bool) _1;
> >
> > to
> >
> >   _1 = atomic builtin;
> >   _5 = _1 & (1 << 0);
> >   _4 = (_Bool) _5;
> >
> > and converting:
> >
> >   _1 = atomic builtin;
> >   _2 = (int) _1;
> >   _5 = _2 & (1 << N);
> >
> > to
> >   _1 = atomic builtin;
> >   _6 = _1 & (1 << N);
> >   _5 = (int) _6;
>
> Why not do this last bit with match.pd patterns (and independent on
> whether _1 is defined by an atomic builtin)?  For the first suggested

The full picture is converting

  _1 = __atomic_fetch_or_* (ptr_6, mask, _3);
  _2 = (int) _1;
  _5 = _2 & mask;

to

  _1 = __atomic_fetch_or_* (ptr_6, mask, _3);
  _6 = _1 & mask;
  _5 = (int) _6;

It is useful only if the two masks are the same.

> transform that's likely going to be undone by folding, no?
>

The bool case is converting

  _1 = __atomic_fetch_or_* (ptr_6, 1, _3);
  _4 = (_Bool) _1;

to

  _1 = __atomic_fetch_or_* (ptr_6, 1, _3);
  _5 = _1 & 1;
  _4 = (_Bool) _5;

Without __atomic_fetch_or_*, the conversion isn't needed.  After the
conversion, optimize_atomic_bit_test_and will immediately optimize the
code sequence to

  _6 = .ATOMIC_BIT_TEST_AND_SET (&v, 0, 0, 0);
  _4 = (_Bool) _6;

and there is nothing to fold after it.

-- 
H.J.
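
For reference, here is a minimal C sketch of the kind of source this thread is about: an atomic fetch-or on a signed int followed by a bit test that uses the same mask. The helper name test_and_set_bit is hypothetical and is not taken from the patch or its testsuite. Whether the compiler actually emits lock bts for it depends on the GCC version, the target, and the optimization level, so read it as an illustration of the source shape only.

```c
#include <stdbool.h>

/* Hypothetical helper (not from the patch): atomically set bit `bit`
   of *v and report whether that bit was already set.  */
static bool
test_and_set_bit (int *v, int bit)
{
  unsigned int mask = 1u << bit;

  /* The same mask is passed to the atomic OR and used in the test;
     this is the "two masks are the same" case discussed above.  */
  return (__atomic_fetch_or (v, mask, __ATOMIC_RELAXED) & mask) != 0;
}
```

Calling it twice on the same bit returns false the first time and true the second time, since the first call has already set the bit.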