Re: [seL4] How to (and how not to) use seL4 IPC

Gernot.Heiser Sun, 10 Mar 2019 03:27:19 -0700

On 10 Mar 2019, at 17:13, Andrew Warkentin 
<[email protected]<mailto:[email protected]>> wrote:


The caveat is that seL4 may have SMP scalability issues with a
general-purpose OS running a wide variety of applications on a wide
variety of hardware like UX/RT will be. In a general-purpose OS,
processes shouldn't have to be statically assigned to run on specific
cores (although of course that should be an option), and they
shouldn't have to care about which core a server is running on.

You need to keep in mind that, as I said in my blog, seL4 is designed as a 
minimal wrapper, just enough to allow securely multiplexing hardware. seL4 not 
migrating threads on its own is a feature, not a bug, as any migration would 
require a policy, and that needs to be defined at user level. seL4 gives you 
all the mechanisms you need. Eg you can implement a Linux-like scheduler on top 
(in fact, a simplified version of this was done in our EuroSys’18 paper on 
scheduling contexts, i.e. the one describing the MCS kernel, 
https://ts.data61.csiro.au/publications/csiroabstracts/Lyons_MAH_18.abstract.pml).

AFAIK, seL4 doesn’t have scalability issues if used properly (and it’s not the 
kernel’s job to support ill-considered uses).

To understand this, think of a typical Arm multicore SoC: 4–16 or so cores, 
shared L2. Inter-core cache line migration latency 10–20 cycles. Even assuming 
a syscall needs to access about 10 write-shared lines, that’s still well below 
the syscall latency. A shared kernel image makes sense, as anything else would 
force high overheads (from explicit cross-core communication/synchronisation at 
user level).

Now think of a high-end x86 server processor. Latency of shipping a single 
cache line from one core at the periphery to the opposite end costs many 100s, 
probably >1,000 cycles, just for a single line. If the syscall needs 10 of 
those, then the latency of migrating cache lines is over an OoM larger than the 
cost of a fast syscall. It makes no sense whatsoever to do this, you’d be 
forcing high overheads onto every syscall.

A single, shared seL4 image only makes sense in a domain where the cache-line 
migration cost remains small compared to the syscall cost. Beyond that, you 
want separate kernel images, i.e. a multikernel.

But this doesn’t limit what you can do on top. If you want to run a single Unix 
system across the whole thing, this can make sense (Unix syscalls are OmM more 
expensive than seL4 syscalls, so the trade-offs are quite different). And you 
can do this across a set of seL4 clusters, they basically present you with a 
NUMA system. Which makes sense, as the hardware will be NUMA as well.

This clustered multikernel story is described in a paper that’s now a few years 
old: 
https://ts.data61.csiro.au/publications/nictaabstracts/Peters_DEH_15.abstract.pml
 Unfortunately, we never got around to build the middleware supporting this (as 
people are throwing money at us to do other cool things), so you’re on your own 
there for now. But there’s nothing inherent in seL4 that stops you.

If I can't make seL4 scale well for
UX/RT I may end up having to fork it. Of all the existing open source
microkernels it is still the closest to what I consider ideal in terms
of architecture, despite the possible SMP scalability issues.

As I said, I don’t think there is a scalability issue. And forking is generally 
a bad idea, in the case of seL4 it’s an especially bad one, as you’ll lose its 
biggest asset, the verification story (or have to do it yourself).

Gernot

_______________________________________________
Devel mailing list
[email protected]
https://sel4.systems/lists/listinfo/devel

Re: [seL4] How to (and how not to) use seL4 IPC

Reply via email to