From: Nicholas Piggin
> Sent: 20 April 2020 02:10
...
> >> Yes, but does it really matter to optimize this specific usage case
> >> for size? glibc, for instance, tries to leverage the syscall mechanism
> >> by adding some complex pre-processor asm directives. It optimizes
> >> the syscall code size in most cases. For instance, kill in static case
> >> generates on x86_64:
> >>
> >> 0000000000000000 <__kill>:
> >>    0: b8 3e 00 00 00       mov    $0x3e,%eax
> >>    5: 0f 05                syscall
> >>    7: 48 3d 01 f0 ff ff    cmp    $0xfffffffffffff001,%rax
> >>    d: 0f 83 00 00 00 00    jae    13 <__kill+0x13>

Hmmm... that cmp + jae is unnecessary here.
It is also a 32bit offset jump.
I also suspect it gets predicted very badly.

> >>   13: c3                   retq
> >>
> >> While on musl:
> >>
> >> 0000000000000000 <kill>:
> >>    0: 48 83 ec 08          sub    $0x8,%rsp
> >>    4: 48 63 ff             movslq %edi,%rdi
> >>    7: 48 63 f6             movslq %esi,%rsi
> >>    a: b8 3e 00 00 00       mov    $0x3e,%eax
> >>    f: 0f 05                syscall
> >>   11: 48 89 c7             mov    %rax,%rdi
> >>   14: e8 00 00 00 00       callq  19 <kill+0x19>
> >>   19: 5a                   pop    %rdx
> >>   1a: c3                   retq
> >
> > Wow that's some extraordinarily bad codegen going on by gcc... The
> > sign-extension is semantically needed and I don't see a good way
> > around it (glibc's asm is kinda a hack taking advantage of kernel not
> > looking at high bits, I think), but the gratuitous stack adjustment
> > and refusal to generate a tail call isn't. I'll see if we can track
> > down what's going on and get it fixed.

A suitable cast might get rid of the sign extension.
Possibly just (unsigned int).

	David
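
For reference, a rough C sketch of the two things being discussed; it is
not copied from glibc or musl, and all three function names are made up
for illustration. The first helper shows the usual Linux convention that
a raw return value in [-4095, -1] means -errno, which is what the
cmp $0xfffffffffffff001 / jae pair tests as an unsigned compare. The
other two show why the argument type decides whether the compiler has to
sign-extend (movslq) a 32-bit argument before the syscall. It assumes a
64-bit long, as on x86_64 Linux.

```c
#include <errno.h>

/* Hypothetical helper: turn a raw kernel return value into the usual
 * libc result. Values in [-4095, -1] encode -errno; viewed as unsigned
 * that is r > -4096UL, i.e. r >= 0xfffffffffffff001 with a 64-bit long. */
static long syscall_ret_sketch(unsigned long r)
{
    if (r > -4096UL) {
        errno = (int)-r;   /* e.g. raw -3 becomes errno 3 */
        return -1;
    }
    return (long)r;
}

/* Passing an int as a long syscall argument: the conversion must
 * sign-extend, which is where a movslq would come from. */
static long arg_sign_extended(int pid)
{
    return (long)pid;
}

/* Going through unsigned int first: the conversion zero-extends, so the
 * upper 32 bits are simply zero and no sign extension is needed. This is
 * only equivalent if the kernel ignores the high bits of that argument,
 * which the thread suspects but does not confirm. */
static long arg_zero_extended(int pid)
{
    return (long)(unsigned int)pid;
}
```

With pid = -1 the first conversion yields -1 and the second 4294967295,
so the cast changes the register contents for negative arguments even
if the kernel ends up treating both the same.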