Hi,
I've been thinking about the various ways to get either DTrace or SystemTap working. I've come up with 4 options:

1. DTrace would need to sit entirely outside the kernel. That means rewriting about 3k lines of simple code, and the FSF lawyers are OK with this option. Since it must sit in userland, and it has a VM as well as several supporting processes, it needs the RPC mechanism. We can interrupt the kernel at the probe, suspend that task, switch into a different task (we'll call it the dtrace kernel task), have that one head to userland and give the CPU over to the dtrace process. We shouldn't let other processes run while dtrace does, so that they don't mess up the system state; priority inversion handles the task switching between dtrace tasks, so there is no need for the scheduler. We also can't let it do everything, like call arbitrary processes, because it's so special to the scheduler and rather sensitive in terms of kernel state. There would be some places in the kernel from which we simply can't switch into userland immediately, even if we implement this by having a dedicated kernel thread for dtrace. Finding all of those places may be tricky. This option basically adds a special type of thread to Mach and is rather invasive, although it allows modification of the kernel state, and that may well be a desired option. This is the only way to support DTrace that I can see. Since this is R/W it may well be good for other tasks as well, potentially moving other things, like the Linux drivers, into userland.

2. This one is ugly, and I'd rather not do it. It involves adding module loading support to do the same thing that SystemTap does on Linux: compile code, add a prelude so that it becomes a regular module, load it in and use kprobes to call it.

3. It would be far nicer if SystemTap-generated code could sit in userland. But perhaps we don't want something as invasive as the DTrace solution, since SystemTap is strictly for R/O observation. We could, on a probe point, execute code to copy (using copy-on-write for efficiency) the data that the module will need and the relevant kernel state. Then, when we get to a point in the kernel that is amenable to servicing these probe points, we can switch into userland, the SystemTap code can run as any regular process, and then return when ready. There's a nice solution, although a strange one, to telling the kernel which memory we need a copy of: DWARF. It's designed to let debuggers know exactly this, and we could just drop something like libelf into SystemTap, plus some extra ELF support in the kernel, thereby getting this mechanism mostly for free. This seems to be the least invasive solution. It also seems the most flexible and the most Hurd-ish, and it doesn't add any major mechanisms to Mach. On the downside this is inherently R/O, although that does increase security significantly.

4. We could do 1 with SystemTap instead of DTrace, thereby giving it R/W access. The downside is that we sacrifice a lot of security, although we do gain a lot of power while doing it. I think if the trouble for 1 is worth it, and it may well be, we should do DTrace instead of SystemTap.

I prefer #3. It all depends: do we want userspace instrumentation to have the possibility of R/W access?

As things progress, more information will be here:
http://csclub.uwaterloo.ca/~abarbu/hurd/

Andrei
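
P.S. To make the flow of option 3 a bit more concrete, here is a toy userland model in Python. It is only a sketch under loud assumptions, not kernel code: the names DeferredProbes, probe_hit, safe_point and the "sched_switch" probe are all made up for illustration, copy.deepcopy stands in for the copy-on-write snapshot, and the explicit field list stands in for what DWARF information would tell us to copy.

```python
import copy
from collections import deque


class DeferredProbes:
    """Toy model of option 3: snapshot at the probe, run handlers later."""

    def __init__(self):
        self.handlers = {}      # probe name -> (fields to copy, handler)
        self.pending = deque()  # snapshots waiting for a safe point

    def register(self, probe, fields, handler):
        # 'fields' is a stand-in for what DWARF would describe.
        self.handlers[probe] = (fields, handler)

    def probe_hit(self, probe, kernel_state):
        # Called at the probe point: copy only what the handler needs
        # and queue it. No handler code runs here.
        if probe not in self.handlers:
            return
        fields, _ = self.handlers[probe]
        snapshot = {f: copy.deepcopy(kernel_state[f]) for f in fields}
        self.pending.append((probe, snapshot))

    def safe_point(self):
        # Called where switching to userland is acceptable: hand each
        # queued snapshot to its handler, in the order the probes fired.
        results = []
        while self.pending:
            probe, snapshot = self.pending.popleft()
            _, handler = self.handlers[probe]
            results.append(handler(snapshot))
        return results


# Hypothetical kernel state and handler, for illustration only.
kernel_state = {"pid": 7, "runqueue": [1, 2, 3], "secret": "x"}
seen = []


def handler(snap):
    snap["runqueue"].append(99)  # only the copy changes (R/O for the kernel)
    seen.append(snap)
    return len(snap["runqueue"])


probes = DeferredProbes()
probes.register("sched_switch", ["pid", "runqueue"], handler)
probes.probe_hit("sched_switch", kernel_state)
kernel_state["runqueue"].append(4)  # the kernel keeps going after the probe
results = probes.safe_point()
```

The handler sees the state as it was when the probe fired, and whatever it does to its copy never reaches kernel_state, which is the R/O property I mentioned. A real implementation would of course have to deal with the cost of the copy and with bounding the queue, neither of which this sketch tries to model.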