On Wed, Feb 14, 2007 at 01:06:59PM -0800, Davide Libenzi wrote:
> Bear with me Ben, and let's follow this up :) If you are in the middle of
> an MMX copy operation, inside the syscall, you are:
>
> - Userspace, on task A, calls sys_async_exec
>
> - Userspace is _not_ doing any MMX stuff before the call

That's an incorrect assumption. Every task/thread in the system has FPU
state associated with it, in part because glibc has to change some of the
rounding mode bits, making them different from the defaults of a freshly
initialized state.

> - We wake task B that will return to userspace

At which point task B has to touch the FPU in userspace as part of the
cleanup, which adds an expensive operation back into the whole process.
The whole context switch mechanism is a zero sum game -- everything that
occurs does so because it *must* be done. If you remove something at one
point, then it has to occur somewhere else.

My opinion of this whole thread is that it implies that our thread creation
and/or context switch is too slow. If that is the case, improve those
elements first. At least some of those optimizations have to be done in
hardware on x86, while on other platforms they are probably unnecessary.

Fwiw, there are patches floating around that did AIO via kernel threads for
file descriptors that didn't implement AIO (and remember: kernel thread
context switches are cheaper than userland thread context switches). At
least take a stab at measuring what the performance differences are and what
optimizations are possible before prematurely introducing a new "fast" way
of doing things that adds a bunch more to maintain.

		-ben
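As one possible starting point for the measurement suggested above, here is a minimal userspace sketch that bounces a single byte between two processes over a pair of pipes and reports the average round-trip time. It is an illustration only: the helper name pingpong_ns is made up for this example, it uses processes rather than threads, and on a multi-CPU machine the two sides may run on different CPUs, in which case the number reflects wake-up latency more than a pure context switch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not from any existing patch): returns the average
 * round-trip time in nanoseconds for one byte bounced between a parent
 * and a child process over two pipes, or -1.0 on error. */
double pingpong_ns(int rounds)
{
    int to_child[2], to_parent[2];
    char byte = 'x';
    struct timespec start, end;

    assert(rounds > 0);
    if (pipe(to_child) < 0 || pipe(to_parent) < 0)
        return -1.0;

    pid_t pid = fork();
    if (pid < 0)
        return -1.0;

    if (pid == 0) {
        /* Child: echo every byte back until the parent closes its end. */
        close(to_child[1]);
        close(to_parent[0]);
        while (read(to_child[0], &byte, 1) == 1)
            if (write(to_parent[1], &byte, 1) != 1)
                break;
        _exit(0);
    }

    /* Parent: keep only the ends it uses. */
    close(to_child[0]);
    close(to_parent[1]);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < rounds; i++) {
        if (write(to_child[1], &byte, 1) != 1 ||
            read(to_parent[0], &byte, 1) != 1) {
            close(to_child[1]);
            close(to_parent[0]);
            waitpid(pid, NULL, 0);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(to_child[1]);   /* EOF makes the child's read() return 0 */
    close(to_parent[0]);
    waitpid(pid, NULL, 0);

    double elapsed = (end.tv_sec - start.tv_sec) * 1e9 +
                     (end.tv_nsec - start.tv_nsec);
    return elapsed / rounds;
}
```

Calling pingpong_ns(100000) from a small main() and printing the result gives a rough per-round-trip figure. Comparing that against the same loop run between two threads, or with both sides pinned to one CPU, would be the kind of baseline to collect before arguing about a new fast path.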