On 8/16/20 9:54 AM, Stefan Kanthak wrote:
"Nathan Sidwell" <nat...@acm.org> wrote:
What evidence do you have that your alternative sequence performs
better?
45+ years experience in writing assembly code!
Have you benchmarked it?
Of course! Did you?
I didn't include the numbers in my initial post since I don't have
a processor which supports BMI2 and thus can't run the original code.
I benchmarked the following equivalent code (input character is in
ECX instead of EDI):
You seem very angry about being asked for data. As I said, I couldn't benchmark
your code, because of the incorrect assembly.
As someone with 45+ years of writing assembly, you'll be aware that processor
microarchitectures have changed dramatically over that time, and one can very
easily be misled by 'intuition'.
Because I dared to show code for the old(er) i386 alias x86 processor,
not for the AMD64 alias x86_64.
Which I did find bizarre -- if you're targeting an x86_64 ISA, why are you
writing code for a different processor?
Anyway, you've made it clear you do not wish to engage in constructive
discussion.
BTW, I have come up with a sequence as short as GCC's but without the
conditional branch. Sadly the margin is too small to write it.
Good day, sir
nathan
--
Nathan Sidwell