Wow! You really helped Zach out ;)
On Tue, 13 Feb 2007, Ingo Molnar wrote:

> +The Syslet Atom:
> +----------------
> +
> +The syslet atom is a small, fixed-size (44 bytes on 32-bit) piece of
> +user-space memory, which is the basic unit of execution within the syslet
> +framework. A syslet represents a single system-call and its arguments.
> +In addition it also has condition flags attached to it that allows the
> +construction of larger programs (syslets) from these atoms.
> +
> +Arguments to the system call are implemented via pointers to arguments.
> +This not only increases the flexibility of syslet atoms (multiple syslets
> +can share the same variable for example), but is also an optimization:
> +copy_uatom() will only fetch syscall parameters up until the point it
> +meets the first NULL pointer. 50% of all syscalls have 2 or less
> +parameters (and 90% of all syscalls have 4 or less parameters).

Why do you need to have an extra memory indirection per parameter in
copy_uatom()? It also forces the pointed-to parameters to be "long" (or
pointers), instead of their natural POSIX type (like fd being "int" for
example). Also, array pointers (think about a "char buf[];" passed to an
async read(2)) have to be saved into a pointer variable, and the pointer
to the latter passed to the async system. Same for all structures (i.e.
stat(2) "struct stat").
Let them be real arguments and add an nparams argument to the structure:

	struct syslet_atom {
		unsigned long flags;
		unsigned int nr;
		unsigned int nparams;
		long __user *ret_ptr;
		struct syslet_uatom __user *next;
		unsigned long args[6];
	};

I can understand that chaining syscalls requires variable sharing, but the
majority of the parameters passed to syscalls are just direct ones.
Maybe a smart method that allows you to know if a parameter is a direct
one or a pointer to one? An "unsigned int pmap" where bit N is 1 if param
N is an indirection? Hmm?

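To make the pmap idea concrete, here is a minimal user-space sketch of how
the argument fetch could work. Everything in it is an assumption made for
illustration: the names atom_sketch, resolve_args() and demo_shared_count()
are hypothetical, the plain pointer dereference stands in for whatever
user-memory access the kernel would really do, and it assumes "unsigned
long" is pointer-sized (as it is on Linux). It is not code from the patch.

```c
#include <assert.h>

#define SYSLET_MAX_ARGS 6

/* Hypothetical model of the proposed atom: direct arguments, an explicit
 * count, and a bitmap marking which slots are indirections. */
struct atom_sketch {
	unsigned long flags;
	unsigned int nr;	/* syscall number (unused in this sketch) */
	unsigned int nparams;	/* number of valid entries in args[] */
	unsigned int pmap;	/* bit N set: args[N] points to the value */
	unsigned long args[SYSLET_MAX_ARGS];
};

/* Resolve the atom's arguments into out[].
 * Returns 0 on success, -1 if nparams is out of range. */
static int resolve_args(const struct atom_sketch *a, unsigned long *out)
{
	unsigned int i;

	if (a->nparams > SYSLET_MAX_ARGS)
		return -1;

	for (i = 0; i < a->nparams; i++) {
		if (a->pmap & (1u << i))
			/* indirect: one extra fetch, only where asked for */
			out[i] = *(const unsigned long *)a->args[i];
		else
			/* direct: the common case, no extra fetch */
			out[i] = a->args[i];
	}
	return 0;
}

/* Usage: a read(2)-like atom with fd and buf passed directly and the
 * byte count shared through a variable. Returns the resolved count. */
static unsigned long demo_shared_count(void)
{
	char buf[64];
	unsigned long count = sizeof(buf);	/* shared variable */
	unsigned long resolved[SYSLET_MAX_ARGS] = { 0 };
	struct atom_sketch a = {
		.nparams = 3,
		.pmap = 1u << 2,		/* only arg 2 is indirect */
		.args = { 3, (unsigned long)buf, (unsigned long)&count },
	};
	int rc = resolve_args(&a, resolved);

	assert(rc == 0);
	(void)rc;
	assert(resolved[0] == 3);
	assert(resolved[1] == (unsigned long)buf);
	return resolved[2];
}
```

With this layout only the parameters flagged in pmap cost an extra memory
access, and buf can be handed over as-is instead of through a separate
pointer variable.
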
> +Running Syslets:
> +----------------
> +
> +Syslets can be run via the sys_async_exec() system call, which takes
> +the first atom of the syslet as an argument. The kernel does not need
> +to be told about the other atoms - it will fetch them on the fly as
> +execution goes forward.
> +
> +A syslet might either be executed 'cached', or it might generate a
> +'cachemiss'.
> +
> +'Cached' syslet execution means that the whole syslet was executed
> +without blocking. The system-call returns the submitted atom's address
> +in this case.
> +
> +If a syslet blocks while the kernel executes a system-call embedded in
> +one of its atoms, the kernel will keep working on that syscall in
> +parallel, but it immediately returns to user-space with a NULL pointer,
> +so the submitting task can submit other syslets.
> +
> +Completion of asynchronous syslets:
> +-----------------------------------
> +
> +Completion of asynchronous syslets is done via the 'completion ring',
> +which is a ringbuffer of syslet atom pointers user user-space memory,
> +provided by user-space in the sys_async_register() syscall. The
> +kernel fills in the ringbuffer starting at index 0, and user-space
> +must clear out these pointers. Once the kernel reaches the end of
> +the ring it wraps back to index 0. The kernel will not overwrite
> +non-NULL pointers (but will return an error), user-space has to
> +make sure it completes all events it asked for.

Sigh, I really dislike shared userspace/kernel stuff when all we're
transferring to userspace is pointers. Did you actually bench it against a:

	int async_wait(struct syslet_uatom **r, int n);

I can fully understand sharing userspace buffers with the kernel if we're
talking about KBs transferred during a block or net I/O DMA operation, but
for transferring a pointer? Behind each pointer transfer (4/8 bytes) there
is a whole syscall execution, which makes the 4/8 byte transfer have a
relative cost of 0.01% *maybe*.
A different case is an O_DIRECT read of 16KB of data, where the cost of
the memory transfer relative to the syscall can be pretty high.
The syscall saving argument is moot too, because syscalls are cheap, and if
there's a lot of async traffic, you'll be fetching lots of completions to
keep your dispatch loop pretty busy for a while.
And the API is *certainly* cleaner.

- Davide