Dear seL4 community,

I'm announcing the release of v0.2 of Neptune OS, an operating system that
I have been developing that aims to create a Windows NT personality on top
of the seL4 microkernel, by implementing the upper layer of the Windows
kernel called the NT Executive, as well as Windows kernel drivers, as
userspace processes under seL4. For those who remember the release of v0.1
two years ago and have been wondering: yes, the project is still alive. I
was a bit busy with postdoc work (academia is hard!!) last year but have
lately found enough free time to get work done that I think warrants a
v0.2 release. The project can be found on GitHub
(https://github.com/cl91/NeptuneOS).

The biggest feature of the v0.2 release is a reasonably complete (I hope)
file system stack supporting read-ahead and write-back caching. It
includes a FAT12/16/32 file system driver and a floppy controller driver,
both ported over from ReactOS (but running in userspace instead of kernel
space). I'll explain the technical points below, but you
can watch a short video (https://www.youtube.com/watch?v=o3FLRnkh0ic) that
demonstrates the basic file system commands, including cd, dir, copy, move,
del, md, and mount/umount.

Arguably the biggest challenge for a userspace file system stack is
minimizing the performance penalties of the additional context switches.
There are radical solutions to this problem along the lines of Unikernels
that require linking the FS code with the applications. These solutions are
not very practical for everyday desktop or server OSes. On traditional OSes
there is FUSE on Linux/Unix which is often regarded as having inferior
performance compared to equivalent kernel-mode drivers. To some extent this
problem is inevitable, as context switches are always more expensive than
function calls, no matter how much you optimize the former. However, that
doesn't mean we can't optimize our OS design so that the performance
penalties of running file systems in userspace are small enough that the
benefit of userspace file systems outweighs their performance cons.

The key to this performance goal, I believe, is an efficient cache design.
Indeed, for FUSE it is possible to tune the cache parameters to drastically
increase its performance (see [1] and the citation therein). On Neptune OS
we cache aggressively: not only do we cache the underlying block device
pages, we also maintain the relationship between file regions and their
underlying disk regions. In other words, the primary role of the file
system driver is to translate an offset within a file object into an
offset on the underlying storage device, and this mapping is cached since
it is not expected to change, at least for currently open files. Furthermore,
we cache file attributes as well as the file hierarchy itself: once a
process has opened a file and the file system driver has returned a valid
handle, we do not need to query the FS driver again (at least not before
the first process closes the handle) when another process opens the same
file with the same attributes, because we already know the file exists and
know its attributes. Of course, we need to be careful when files get
closed, deleted, moved, etc., so we need a mechanism to synchronize this
information with the file system driver process. This is done by sending
messages over seL4 IPC.

What is amazing is that despite these changes to the inner workings of
the NT cache manager, not to mention that all drivers now run in
userspace, the Windows kernel driver API remains largely unchanged: it is
indeed possible to port the relevant ReactOS drivers to our architecture
without too much work (I believe I spent three to four man-weeks porting
the FAT file system itself).
The vast majority of work is in fact removing code rather than writing new
code, because of the simplifications of locking and other synchronization
issues that simply disappear when drivers run in userspace. For those
interested in learning more about our architecture, and perhaps in
porting drivers from ReactOS, I am in the process of writing a book
(found under docs in the GitHub repo) called "Neptune OS Developer Guide"
that explains all these things. Note this is very much a work in
progress.

In addition to the cache manager work, several other system components
had to be implemented for the file system stack to function, most notably
DMA and Win32 SEH (structured exception handling). Our DMA architecture
supports both PCI bus mastering and the rather quirky ISA DMA that must
go through the ISA DMA controller. The latter requires managing the
so-called "map registers" that are shared across different ISA devices.
Again we find that the Windows kernel API can be adapted to a seL4-based
userspace driver model almost straightforwardly. This is perhaps not so
surprising, because NT was allegedly originally designed as a microkernel
OS (this may well be an urban legend, but early NT design documents do
refer to the code beneath the NT Executive as the "microkernel", so there
is some reason to believe it).

To summarize, what I hope is that with the design outlined above, we can in
fact have a practical, reasonably performant userspace file system stack
without resorting to radical departures from traditional desktop/server
computing paradigms. Of course, it's ridiculous to talk about performance
for floppy disk controllers, so the primary goal of the next release is to
port the ATA/AHCI hard disk drivers from ReactOS to Neptune OS, in order to
produce a valid benchmark against equivalent kernel-mode designs. Anyway,
if you have read this far, I hope you find the work interesting (I
personally do!). If you have any comments, or even better, if you ran the
code and found bugs/issues, please do open an issue in the GitHub repo.
Thank you!

[1]
https://medium.com/@xiaolongjiang/linux-fuse-file-system-performance-learning-efb23a1fb83f

---
Dr. Chang Liu, PhD.
github.com/cl91/NeptuneOS
_______________________________________________
Devel mailing list -- devel@sel4.systems
To unsubscribe send an email to devel-leave@sel4.systems
