On Wed, 25 Jul 2007, Oleg Nesterov wrote:
On 07/24, Jeremy Katz wrote:
Sorry. That should have been "without apparent effect".
Sorry. I was completely confused.
So. You mean that even with that patch you _still_ see the
BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free() ?
Yes. I did not notice
On 07/24, Jeremy Katz wrote:
>
> On Tue, 24 Jul 2007, Oleg Nesterov wrote:
>
> >Interesting. Could you show the patch? Where does sys_timer_create() set
> >counter == 1?
>
> --- kernel/posix-timers.c.old 2007-07-24 11:21:29.0 -0700
> +++ kernel/posix-timers.c 2007-07-20 15:49:5
On Tue, 24 Jul 2007, Oleg Nesterov wrote:
On 07/23, Jeremy Katz wrote:
On Fri, 20 Jul 2007, Oleg Nesterov wrote:
I still can't believe we have a double-free problem, this looks impossible.
Do you see the
"idr_remove called for id=%d which is not allocated.\n"
in syslog?
No. I al
On 07/23, Jeremy Katz wrote:
>
> On Fri, 20 Jul 2007, Oleg Nesterov wrote:
>
> >I still can't believe we have a double-free problem, this looks impossible.
> >Do you see the
> >
> > "idr_remove called for id=%d which is not allocated.\n"
> >
> >in syslog?
>
> No. I also added some accounting
On Fri, 20 Jul 2007, Oleg Nesterov wrote:
On 07/18, Jeremy Katz wrote:
On Wed, 18 Jul 2007, Oleg Nesterov wrote:
Jeremy, I agree with Thomas that your patch should not be right, but it
does make a difference. Perhaps this is just the timing, but who knows.
Could you add some printk's to be s
On 07/18, Jeremy Katz wrote:
>
> On Wed, 18 Jul 2007, Oleg Nesterov wrote:
>
> >Jeremy, I agree with Thomas that your patch should not be right, but it
> >does make a difference. Perhaps this is just the timing, but who knows.
> >Could you add some printk's to be sure that lock_timer() actually fa
On Thu, 19 Jul 2007, Thomas Gleixner wrote:
On Wed, 2007-07-18 at 16:43 -0700, Jeremy Katz wrote:
On Wed, 18 Jul 2007, Jeremy Katz wrote:
On Wed, 18 Jul 2007, Thomas Gleixner wrote:
Also can you please enable CONFIG_PROVE_LOCKING, which might catch any
locking problem, which might be relate
On Wed, 2007-07-18 at 16:43 -0700, Jeremy Katz wrote:
> On Wed, 18 Jul 2007, Jeremy Katz wrote:
>
> > On Wed, 18 Jul 2007, Thomas Gleixner wrote:
> >
> >>> Also can you please enable CONFIG_PROVE_LOCKING, which might catch any
> >>> locking problem, which might be related to this.
> >>
> >> Anoth
On Wed, 18 Jul 2007, Jeremy Katz wrote:
On Wed, 18 Jul 2007, Thomas Gleixner wrote:
Also can you please enable CONFIG_PROVE_LOCKING, which might catch any
locking problem, which might be related to this.
Another test: Can you please disable CONFIG_SCHED_SMT to narrow it down
further ?
I'll
On Wed, 18 Jul 2007, Oleg Nesterov wrote:
Jeremy, I agree with Thomas that your patch should not be right, but it
does make a difference. Perhaps this is just the timing, but who knows.
Could you add some printk's to be sure that lock_timer() actually fails
while it never should?
Agreed.
Unfo
On Wed, 18 Jul 2007, Thomas Gleixner wrote:
On Wed, 2007-07-18 at 08:05 +0200, Thomas Gleixner wrote:
On Tue, 2007-07-17 at 16:58 -0700, Jeremy Katz wrote:
EFLAGS: 00010246 (2.6.22.1-WR1.4aq_cgl #2)
Hmm. Are there any other patches on that kernel ?
Just hrt6 and your proposed fix. The
On 07/17, Jeremy Katz wrote:
>
> This is with the patch (and 2.6.22.1 and hrt6):
>
> [ cut here ]
> Kernel BUG at c0125adb [verbose debug info unavailable]
> invalid opcode: [#1]
> SMP
> Modules linked in:
> CPU:3
> EIP:0060:[]Not tainted VLI
> EFLAGS: 0001
On Wed, 2007-07-18 at 08:05 +0200, Thomas Gleixner wrote:
> On Tue, 2007-07-17 at 16:58 -0700, Jeremy Katz wrote:
>
> > Scratch that. I had infrastructure problems, and ended up using the wrong
> > build.
>
> > EFLAGS: 00010246 (2.6.22.1-WR1.4aq_cgl #2)
>
> Hmm. Are there any other patches o
On Tue, 2007-07-17 at 16:58 -0700, Jeremy Katz wrote:
> Scratch that. I had infrastructure problems, and ended up using the wrong
> build.
> EFLAGS: 00010246 (2.6.22.1-WR1.4aq_cgl #2)
Hmm. Are there any other patches on that kernel ?
Is there a chance that you can whip up a test program whi
On Tue, 17 Jul 2007, Jeremy Katz wrote:
On Tue, 17 Jul 2007, Thomas Gleixner wrote:
With 2.6.14 or with current mainline ?
I haven't been keeping notes quite as studiously as I should have been, but
this just occurred with 2.6.22.1 + the hrt6 patch + your proposed fix:
Scratch that. I ha
On Tue, 17 Jul 2007, Thomas Gleixner wrote:
On Tue, 2007-07-17 at 11:39 -0700, Jeremy Katz wrote:
I tried the patch with my test case, but still see the issue.
Here's my explanation of the double free race:
CPU 0 CPU 1
sys_timer_delete():
lock_timer();
On Tue, 2007-07-17 at 11:39 -0700, Jeremy Katz wrote:
> I tried the patch with my test case, but still see the issue.
> Here's my explanation of the double free race:
> CPU 0 CPU 1
> sys_timer_delete():
> lock_timer();
> ...
> unlock_timer();
On Tue, 17 Jul 2007, Thomas Gleixner wrote:
Jeremy Katz experienced a posix-timer related bug on 2.6.14. This is
caused by a subtle race, which is there since the original posix timer
commit and persists until today.
timer_delete does:
lock_timer();
timer->it_process = NULL;
unlock_timer();
re
On Tue, 17 Jul 2007, Ingo Molnar wrote:
nice one! The race looks pretty narrow - Jeremy, do your Xeons have
hyperthreading? (or are there any heavy SMI sources perhaps that could
open up this race.) If not then there might be some other bug lurking in
there as well.
Affirmative. 2 cores, 2 h
On Tue, 2007-07-17 at 17:07 +0400, Oleg Nesterov wrote:
> I think we can make a simpler patch,
>
> --- posix-timers.c~ 2007-06-29 14:45:04.0 +0400
> +++ posix-timers.c 2007-07-17 16:59:45.0 +0400
> @@ -449,6 +449,9 @@ static void release_posix_timer(struct k
> id
On 07/17, Thomas Gleixner wrote:
>
> Jeremy Katz experienced a posix-timer related bug on 2.6.14. This is
> caused by a subtle race, which is there since the original posix timer
> commit and persists until today.
>
> timer_delete does:
> lock_timer();
> timer->it_process = NULL;
> unlock_timer();
* Thomas Gleixner <[EMAIL PROTECTED]> wrote:
> Jeremy Katz experienced a posix-timer related bug on 2.6.14. This is
> caused by a subtle race, which is there since the original posix timer
> commit and persists until today.
>
> timer_delete does:
> lock_timer();
> timer->it_process = NULL;
> unl
Jeremy Katz experienced a posix-timer related bug on 2.6.14. This is
caused by a subtle race, which is there since the original posix timer
commit and persists until today.
timer_delete does:
lock_timer();
timer->it_process = NULL;
unlock_timer();
release_posix_timer();
timer->it_process is check
23 matches