On Dec 4, 2010, at 5:22 AM, Florian Weimer wrote:

> * Joe Buck:
>
>> It's wasted code if the multiply instruction detects the overflow.
>> It's true that the cost is small (maybe just one extra instruction
>> and the same number of tests, maybe one more on architectures where you
>> have to load a large constant), but it is slightly worse code than what
>> Chris Lattner showed.
>
> It's possible to improve slightly on the LLVM code by using the
> overflow flag (at least on i386/amd64), as explained in this blog
> post:
>
> <http://blogs.msdn.com/b/michael_howard/archive/2005/12/06/500629.aspx>

Ah, great point.  I improved the clang codegen to this:

$ cat t.cc
void *test(long count) {
  return new int[count];
}
$ clang t.cc -S -o - -O3 -mkernel -fomit-frame-pointer -mllvm -show-mc-encoding
        .section        __TEXT,__text,regular,pure_instructions
        .globl  __Z4testl
        .align  4, 0x90
__Z4testl:                              ## @_Z4testl
## BB#0:                                ## %entry
        movl    $4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
        movq    %rdi, %rax              ## encoding: [0x48,0x89,0xf8]
        mulq    %rcx                    ## encoding: [0x48,0xf7,0xe1]
        movq    $-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
        cmovnoq %rax, %rdi              ## encoding: [0x48,0x0f,0x41,0xf8]
        jmp     __Znam                  ## TAILCALL
                                        ## encoding: [0xeb,A]
                                        ##   fixup A - offset: 1, value: __Znam-1, kind: FK_PCRel_1
.subsections_via_symbols

This could be further improved by inverting the cmov condition to avoid
the first movq, which we'll tackle as a general regalloc improvement.

Thanks for the pointer!

-Chris
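
For readers who want the same check at the source level, here is a minimal sketch of what the listing above computes: multiply the element count by the element size, and if the multiply overflows, pass an all-ones size to the allocator so the request fails instead of silently wrapping. The helper name checked_array_bytes is hypothetical and does not come from clang or from this thread, and __builtin_mul_overflow is a GCC/Clang extension, so this assumes one of those compilers. Whether a given compiler lowers it to the same mulq/cmovnoq sequence is not guaranteed.

```cpp
#include <cstddef>

// Hypothetical helper, for illustration only: returns the number of bytes
// needed for `count` elements of `elem_size` bytes each, or an all-ones
// size_t (the same value the listing loads with `movq $-1`) on overflow.
inline std::size_t checked_array_bytes(std::size_t count, std::size_t elem_size) {
    std::size_t bytes;
    // GCC/Clang builtin: stores the wrapped product in `bytes` and returns
    // true if the mathematically exact product did not fit in size_t.
    if (__builtin_mul_overflow(count, elem_size, &bytes))
        return static_cast<std::size_t>(-1);
    return bytes;
}
```

A negative long count converted to size_t becomes a very large unsigned value, so it takes the overflow path as well, which matches the unsigned mulq in the listing.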