I've uploaded them here:
http://www.kernel.org/pub/linux/kernel/people/mst/
You can't see them on the mirrors yet, but you will be able to soon, once
the kernel.org mirroring system picks them up.
There is no difference in optimizations except that here:
    for (i = 0; i < aiocb->aio_niov && count; ++i) {
one of the two versions actually does "count && i < aiocb->aio_niov" due
to hashing vagaries. This is irrelevant anyway. Same inlining, same
loop optimization decisions, same everything else. So a GCC bug can be
ruled out, IMHO.
The only difference, as someone already suspected, is the padding: the
sigset is placed between the top of the frame and the other variables,
which may hide an overrun. This is quite amazing for a function that
has no arrays, but it is still the only evidence.
I suggest trying to make the sigset_t static, since that generates
exactly the same code as the "nohang" case, and exactly the same stack
layout as the "hang" case. The next obvious step would be placing a
watchpoint somewhere.
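
To make the suggestion concrete, here is a rough sketch of the two
variants. The function names and bodies are made up for illustration and
are not the actual code under discussion; the only point is where the
sigset_t lives. Note that the static variant is not thread-safe, so it is
meant as a debugging experiment only, not as a fix.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Hypothetical variant A: the set is an automatic variable, so it takes
 * up room in the stack frame, next to the function's other locals. */
int usr1_in_full_set_auto(void)
{
    sigset_t set;

    if (sigfillset(&set) != 0)
        return -1;
    /* sigismember() returns 1 if the signal is in the set. */
    return sigismember(&set, SIGUSR1);
}

/* Hypothetical variant B: same calls, but the set has static storage,
 * so it no longer occupies any space in the stack frame. */
int usr1_in_full_set_static(void)
{
    static sigset_t set;

    if (sigfillset(&set) != 0)
        return -1;
    return sigismember(&set, SIGUSR1);
}
```

Both functions do the same work; only the frame layout differs. If the
hang comes back with the static variant, that would point at the padding
hiding an overrun rather than at the signal calls themselves.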
Cheers,
Paolo