Hi,
I work on OS software and know little about network virtualization, so I have
a question. Assuming the following technologies come true, we could let users
install their own choice of vswitch (or similar "devices") inside their own
VMs, without any risk of affecting the CSP's hypervisor software; and we could
allow users to build their own network topology overlaid on the existing
network, giving them more flexibility and control. Would this be an
improvement?
The techniques described below can be used in many other fields, but if they
can help vswitches, that is a good place to start. After all, once they have
been tested in demanding networking use cases, they should not have many
problems elsewhere. The following text briefly explains these techniques, in
case you are interested.
Tech 1. Guest user-mode semi-RT environment: a VM (level 1, 2, 3, ...) could
provide a semi-RT (explained later) environment for running real-time
applications. For example, a vswitch running in such an environment can
achieve latency as good as in the hypervisor kernel, without doing any of the
usual expensive things (CPU shielding, RT threads, preallocating the maximum
possible memory, and so on), and from the outside such a process schedules
like a normal application. (I know this sounds absurd, even contradictory, so
please keep reading.) Users could then install a vswitch entirely in the VM's
user mode, without bothering their CSP.
To do so, we need a middleware layer that takes care of para-virtualized
memory management and scheduling (not as complex as it sounds), including the
following functions:
a) Asynchronous memory controls: allocating, releasing, locking (making
unswappable), big TLBs, cross-VM sharing (useful in some cases), etc., without
trapping into the host or guest kernel. This eliminates most trap
opportunities (page faults, mmap).
b) Short-time scheduling protection: user applications can acquire/release a
short window of preemption protection (10 us, or 1 ms) very efficiently. That
is sufficient for most simple packet handling. Of course it also opens the
possibility of user-mode spinlocks, so we can easily use the power of SMP.
This short-time protection can be very useful.
c) Grouped scheduling: say we need two CPU cores' worth of power to run
worker threads in the worst case. Then we may start up six cloned siblings,
each on a different core. The kernel can apply various scheduling algorithms
to them and try to guarantee that at least two of them run simultaneously.
So we end up hiding a semi-RT application inside an apparently normal process
or VM, even a level 2 or 3 VM. Such a vswitch can behave as well as a real
hardware box in many respects, such as latency and software efficiency.
Tech 2. Virtualized wires and the spider-server.
One of the basic elements of a network is the "wire", so why don't we have
virtualized wires? Basically a two-way FIFO from anywhere to anywhere: an
abstraction over cross-process, cross-VM, app-kernel, and network
communication. Based on tech 1, we can build very simple and efficient wires
and link up any physically reachable elements. Users can then link up their
own VMs and vswitches as they wish; the CSP only needs to make those elements
mutually reachable.
Every physical wire is unique, so we need to give each v-wire a unique name,
maintained by a spider-server. These servers record information like this:
v-wire name:"t...@jerry.com:vm1_nic1_line1"
path:"tom's machine1" <-> aaa.bbb.ccc.ddd:eeee <-> xxx.yyy.zzz.www:vvvv <->
"jerry's server1"
statistics:
latency:
status:
duplication/hubs:
The hypervisor OS takes on these responsibilities: answering the
spider-server's wire-setup requests; authority control for opening/closing
wires; providing the v-wire service if a wire crosses networks; and
transmitting data properly.
Tech 3. Snapshot memory, which may take care of reliability and data sharing.
Those are two things I think are crucial for distributed vswitches.
If one process can take a "snapshot" and make a piece of memory visible to
another process on another machine, without worrying about consistency, then
data sharing can be simplified considerably. And if you take a snapshot of a
process group from time to time, then when those processes unfortunately go
down, their images can be resurrected elsewhere; that cannot be a bad thing
for reliability. Based on techs 1 and 2, this can be done more efficiently
than most people would expect.
Since you have read down to here, most of you have probably figured out that
what I am saying about virtual networking is half guesswork. But nothing
about the OS part is. We are not discussing a complex design with many dirty
tricks; these mechanisms are very simple, and not difficult to adapt into
most existing hypervisor/guest OSes.
If you tell me it IS useful for virtual networking, my team will make it
happen.
Best regards,
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss