On Thu, Feb 14, 2008 at 09:25:35PM +0100, Ingo Molnar wrote: > The per function call overhead from stackprotector is already pretty > serious IMO, but at least that's something that GCC _could_ be doing > (much) smarter (why doesnt it jne forward out to __check_stk_failure, > instead of generating 4 instructions, one of them a default-mispredicted > branch instruction??), so that overhead could in theory be something > like 4 fall-through instructions per function, instead of the current 6.
Where do you see a mispredicted branch? int foo (void) { char buf[64]; bar (buf); return 6; } -O2 -fstack-protector -m64: subq $88, %rsp movq %fs:40, %rax movq %rax, 72(%rsp) xorl %eax, %eax movq %rsp, %rdi call bar movq 72(%rsp), %rdx xorq %fs:40, %rdx movl $6, %eax jne .L5 addq $88, %rsp ret .L5: .p2align 4,,6 .p2align 3 call __stack_chk_fail -O2 -fstack-protector -m32: pushl %ebp movl %esp, %ebp subl $88, %esp movl %gs:20, %eax movl %eax, -4(%ebp) xorl %eax, %eax leal -68(%ebp), %eax movl %eax, (%esp) call bar movl $6, %eax movl -4(%ebp), %edx xorl %gs:20, %edx jne .L5 leave ret .L5: .p2align 4,,7 .p2align 3 call __stack_chk_fail -O2 -fstack-protector -m64 -mcmodel=kernel: subq $88, %rsp movq %gs:40, %rax movq %rax, 72(%rsp) xorl %eax, %eax movq %rsp, %rdi call bar movq 72(%rsp), %rdx xorq %gs:40, %rdx movl $6, %eax jne .L5 addq $88, %rsp ret .L5: .p2align 4,,6 .p2align 3 call __stack_chk_fail both with gcc 4.1.x and 4.3.0. BTW, you can use -fstack-protector --param=ssp-buffer-size=4 etc. to tweak the size of buffers to trigger stack protection, the default is 8, but e.g. whole Fedora is compiled with 4. Jakub -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/