On 04.08.2005 [11:38:33 +0200], Roman Zippel wrote: > Hi, > > Andrew, please drop this patch. > Nish, please stop fucking around with kernel APIs.
The comment for schedule_timeout() claims: * Make the current task sleep until @timeout jiffies have * elapsed. Currently, it does not do so. I was simply trying to make the function do what it claims it does. > On Wed, 3 Aug 2005, Nishanth Aravamudan wrote: > > > > The "jiffies_to_msecs(msecs_to_jiffies(timeout_msecs) + 1)" case (when > > > the > > > process is immediately woken up again) makes the caller suspectible to > > > timeout manipulations and requires constant reauditing, that no caller > > > gets it wrong, so it's better to avoid this error source completely. > > Nish, did you read this? Is my English this bad? Your English is fine. My point was that the +1 issues (potential infinite timeouts) are a problem with *jiffies* not milliseconds. And thus need to be pushed down to the jiffies layer. I think my explanation was pretty clear. > > > --- 2.6.13-rc5/kernel/timer.c 2005-08-01 12:31:53.000000000 -0700 > > +++ 2.6.13-rc5-dev/kernel/timer.c 2005-08-03 17:30:10.000000000 -0700 > > @@ -1134,7 +1134,7 @@ fastcall signed long __sched schedule_ti > > } > > } > > > > - expire = timeout + jiffies; > > + expire = timeout + jiffies + 1; > > > > init_timer(&timer); > > timer.expires = expire; > > And a little later it does: > > timeout = expire - jiffies; > > which means callers can get back a larger timeout. Hrm, maybe I will need to cache the parameter. But the only way you would get a return value greater than requested is if schedule() returns before the next jiffy? Which I guess could happen if a wait-queue event or signal (if the task is set as such) occurs before the next interrupt *and* there are no other threads running, I believe, as process_timeout() -> try_to_wake_up() only puts the task back on the run queue; I don't think it actually preempts the currently running task, does it? Just an FYI, though, that is a problem in the current code, in the sense that we will pass back the exact same value again Let me take a look, maybe we need to do some -1 later (or just cache the request, as I mentioned). Thanks for pointing this out. > Nish, did you check and fix _all_ users? I can easily find a number of > users which immediately use the return value as next timeout. I haven't fixed all users yet. I plan on trying to do that today. Would those callers that do immediately use the return value as the next timeout fall under the functions that should be using time_after()/time_before()? > There are _a_lot_ of schedule_timeout(1) for small busy loops, these are > asking for "please schedule until next tick". Did you check that these are > still ok? According to the comment for schedule_timeout(), that is not true. They are asking to sleep for *1* ticks, no the *next* tick. If the assumption in the code is the latter, they have been calling this function inappropriately for a while. They probably should be requesting schedule_timeout(0), i.e. no sleep at all (but we still will add a timer internally and it won't expire until the next interrupt occurs). > schedule_timeout() is arguably a broken API, but can we please _first_ > come up with a plan to fix this, before we break even more? It *is* broken, and my patch *does* fix it. I'm not suggesting it go to Linus today, but it's ok in Andrew's tree, especially if I'm able to get the depending patches sent out soon. I am pretty sure there is no way to fix the problems of adding 1 without an inter-tick position. The +1 is the closest thing we have to a "ceiling" function with jiffies. Thanks, Nish - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/