On 8/14/20 12:43 PM, Stefan Kanthak wrote:
Hi @ll,
in his ACM queue article <https://queue.acm.org/detail.cfm?id=3372264>,
Matt Godbolt used the function
| bool isWhitespace(char c)
| {
| return c == ' '
| || c == '\r'
| || c == '\n'
| || c == '\t';
| }
as an example, for which GCC 9.1 emits the following assembly for AMD64
processors (see <https://godbolt.org/z/acm19_conds>):
| xor eax, eax ; result = false
| cmp dil, 32 ; is c > 32
| ja .L4 ; if so, exit with false
| movabs rax, 4294977024 ; rax = 0x100002600
| shrx rax, rax, rdi ; rax >>= c
| and eax, 1 ; result = rax & 1
|.L4:
| ret
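
For readers following along, the quoted GCC sequence can be read back into C roughly as follows. This is only a sketch: the function name `is_whitespace_bitmask` is made up for illustration, and the cast to `unsigned char` is an assumption that mirrors the unsigned byte compare (`cmp dil, 32` / `ja`) in the listing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; a C reading of the GCC 9.1 output quoted above. */
bool is_whitespace_bitmask(char c)
{
    /* cmp dil, 32 / ja .L4: unsigned compare on the low byte */
    unsigned char uc = (unsigned char)c;
    if (uc > 32)
        return false;

    /* movabs rax, 4294977024: bits 9 ('\t'), 10 ('\n'), 13 ('\r'), 32 (' ') */
    const uint64_t mask = UINT64_C(0x100002600);

    /* shrx rax, rax, rdi / and eax, 1: test bit number c of the mask */
    return ((mask >> uc) & 1) != 0;
}
```

The early exit matters here because a 64-bit shift only looks at the low six bits of the count, so without it a value such as 64 + 9 would alias to the tab bit.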
But this code is not optimal!
What evidence do you have that your alternative sequence performs better? Have
you benchmarked it? (I tried, but your code doesn't assemble.)
It has more instructions and cannot speculate past the setnz. (As I understand it,
x86_64 speculates branch instructions, but doesn't speculate cmov -- so
perversely branches are faster!)
The following equivalent and branchless code works on i386 too,
it needs neither an AMD64 processor nor the SHRX instruction,
which is not available on older processors:
mov ecx, edi
mov eax, 2600h ; eax = (1 << '\r') | (1 << '\n') | (1 << '\t')
test cl, cl
setnz al ; eax |= (c != '\0')
shr eax, cl ; eax >>= (c % ' ')
^^ operand type mismatch on this instruction
xor edx, edx
cmp ecx, 33 ; CF = c <= ' '
adc edx, edx ; edx = (c <= ' ')
and eax, edx
ret
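
To make the intent of the proposed sequence easier to check, independent of whether it assembles, here is a C sketch of the same arithmetic. The name `is_whitespace_branchless` is hypothetical, and the sketch assumes the argument is treated as an unsigned byte, whereas the listing compares the full `ecx`. Whether a compiler turns this into branch-free code is not guaranteed, and the sketch says nothing about which version is faster.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; a C reading of the proposed i386-compatible sequence. */
bool is_whitespace_branchless(char c)
{
    uint32_t uc = (unsigned char)c;

    /* mov eax, 2600h / setnz al: bits 9, 10, 13, plus bit 0 when c != '\0' */
    uint32_t bits = 0x2600u | (uint32_t)(uc != 0);

    /* shr eax, cl: a 32-bit shift uses the count modulo 32, so c == ' '
       shifts by 0 and leaves the (c != '\0') flag in bit 0 */
    bits >>= (uc & 31u);

    /* cmp ecx, 33 / adc edx, edx: 1 when c <= ' ', else 0 */
    uint32_t in_range = (uint32_t)(uc < 33u);

    /* and eax, edx: keep bit 0 only for in-range characters */
    return (bits & in_range) != 0;
}
```

The `in_range` mask is what rejects characters above 32 whose low five bits happen to match a whitespace code, for example 41 (32 + 9).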
regards
Stefan Kanthak
--
Nathan Sidwell