Matt Mackall wrote: > On Thu, Jun 07, 2007 at 06:18:52AM -0400, Mark Hounschell wrote: >> Matt Mackall wrote: >>> On Wed, Jun 06, 2007 at 10:28:28AM -0700, Andrew Morton wrote: >>>> On Wed, 06 Jun 2007 09:12:04 -0400 Mark Hounschell <[EMAIL PROTECTED]> >>>> wrote: >>>> >>>>>> As far as a 100% CPU bound task being a valid thing to do, it has been >>>>>> done for many years on SMP machines. Any kernel limitation on this >>>>>> surely must be considered a bug? >>>>>> >>>>> Could someone authoritatively comment on this? Is a SCHED_RR/SCHED_FIFO >>>>> 100% Cpu bound process supported in an SMP env on Linux? (vanilla or -rt) >>>> It will kill the kernel, sorry. >>>> >>>> The only way in which we can fix that is to allow kernel threads to preempt >>>> rt-priority userspace threads. But if we were to do that (to benefit the >>>> few) it would cause _all_ people's rt-prio processes to experience glitches >>>> due to kernel activity, which we believe to be worse. >>>> >>>> So we're between a rock and a hard place here. >>>> >>>> If we really did want to solve this then I guess the kernel would need some >>>> new code to detect a 100%-busy rt-prio process and to then start premitting >>>> preemption of it for kernel thread activity. That detector would need to >>>> be smart enough to detect a number of 100%-busy rt-prio processes which are >>>> yielding to each other, and one rt-prio process which keeps forking others, >>>> etc. It might get tricky. >>> The usual alternative is to manually chrt the relevant kernel threads >>> to RT priority and adjust the priority scheme of their processes >>> appropriately. >>> >> >From an earlier thread member: >> >>>> Mark writes: >>>> Again I don't understand why flush_scheduled_work() running on behalf >>>> of a process affinitized to processor-1 requires cooperation from >>>> events/2 (affinitized to processor-2) >>>> when there is an events/1 already affinitized to processor 1? >>> Oleg write: >>> flush_workqueue() blocks until any scheduled work on any CPU has run to >>> completion. If we have some work_struct pending on CPU 2, it can be >>> completed only when events/2 executes it. >> Could not flush_scheduled_work() just follow the affinity mask of the >> task that caused the call to begin with. If calling task had a cpu-mask >> of 3 then flush_scheduled_work() would do the events/0 and events/1 >> thing and if the calling task had an affinity mask of 1 then only >> events/0 would be done? > > The kernel's internal event API doesn't track any of this stuff and > it's not clear we'd want it to. It'd be a bit simpler perhaps to > simply allow SIGSTOPing events/0. This might even work today from > userspace. > > In general, it's considered a mistake to mark CPU hogs as RT precisely > because they present a starvation risk to everything else in the > system, not just kernel threads. We could add kernel infrastructure to > make events survive this sort of thing, but that will very likely just > expose another kernel or userspace livelock. >
In general maybe, but we are not really talking in general terms here. For everything other than kernel stuff, I (userland) would assume responsibility. That is why I force all other userland tasks that I want to run onto all the other processors before my RT-HOG starts. >From a userland point of view, if I have an 8 processor box, I should be able to have 7 RT-HOGS running on it while all other proceses are on the 8th. Yet in Linux that I can't have even one because it breaks the kernel. I'm sorry, this does not sound right to me. To me and people in my world, this is clearly a kernel deficiency. Regards Mark - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/