Leonid, On Tue, 5 Feb 2013, Leonid Shatz wrote:
Please stop top posting! > The explanation were submitted as possible scenario which could explain how > the bug in kernel could happen and it does not mean that serious designer > could do exactly that. As I said before, it's also possible that a race > between hrtimer_cancel and hrtimer_start can trigger the bug. The idea is to > have kernel more robust. I'm not against making the kernel more robust and I already applied the patch. > There are already locks used inside hrtimer code, so why should > users of the hrtimer add another layer of locks and get involved in > the intricacy of which cases are protected by internal hrtimer lock > and which are not? Groan. The hrtimer locks are there to protect the internal data structures of the hrtimer code and to ensure that hrtimer functions are proper protected against concurrent running callbacks. But that does not give you any kind of protection versus multiple users of your hrtimer resource. Look at the following scenario: CPU0 CPU1 hrtimer_cancel() hrtimer_start() teardown_crap() hrtimer_callback() runs That's probably not what you want and magic serialization in the hrtimer code does not help at all. There is also no protection against: CPU0 CPU1 hrtimer_cancel() hrtimer_start() hrtimer_forward() Which leaves the hrtimer enqueued on CPU1 with a wrong expiry value. So while concurrent hrtimer_start() is protected, other things are not. So do we need to create a list of functions which can be abused by a programmer without proper protection of the resource and which not? If you want to use any kind of resource (including hrtimers) concurrently you better have proper serialization in that code. Everything else is voodoo programming of the worst kind. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/