Hello Mike, hello Linus,
Some minutes ago, I wrote:
> I think I have found the reason for our bugs. It seems GCC really
> miscompiles buffer.c:bdflush_init without frame pointers. I'll try harder
> now to understand what excactly is going on, but it seems it is smashing
> its local stack space by decrementing its stack pointer too early, then
> calling an assembler function (__down_failed). It might be that GCC is
> confused by this.
[...]
> Any comments on this? I'll now try to split up the stack space operation in
> two parts, the first after call kernel_thread: addl $12, %esp (as in the
> first call), and an additional addl $64, %esp just before leaving (before
> popl %ebx). And I'll report what happened, later - but I have a good
> feeling that I have caught the bug.
... and my good feeling was right. Changing the bogus assembly code made the
bug go away. I'll try to prepare a simpler testcase for the GCC maintainers
tomorrow. For short, this is what happens: GCC tries to free its stack frame
for the local variables far too early. It then calls __down_failed(), which
pushes some things on the stack - thereby corrupting the semaphore pointer!
So __down() works on a random memory location instead of the semaphore, which
is guaranteed to fail badly.
I've added linux-kernel as CC again, so everybody can now hear that this is
definitely a GCC bug, and not a kernel issue.
Greetings,
Andreas
--
->>>----------------------- Andreas Franck --------<<<-
---<<<---- [EMAIL PROTECTED] --->>>---
->>>---- Keep smiling! ----------------------------<<<-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/