Re: Networking design proposal

Olivier Péningault Sun, 27 Oct 2002 06:57:20 -0800

Hi all,

here is the updated version of my proposal for re-implementing networking
in the Hurd. There are changes, since hurd-net no longer exists. It is
replaced by putting more functionality into some of the layers. People may
think that this breaks the usual layer design a little, but I will try to
explain and justify why it works, and why IMHO the usual layer design is
not broken.



Overview
========
The network stack will be divided into several translators :
- layer 2 translators. One of these will run per real physical device. It
will provide an interface for the layer 3 protocols, and it will hide _all_
the data link details from the upper layers. It will also provide means for
basic routing features.
At the moment, the code will be split between kernel space and
user-land. When drivers move to user-land, maybe they will implement
this interface themselves to provide better speed.
A rendez-vous point could be /dev/(lo|eth0|ppp0|atm0|tokenring0)

- layer 3 translators. There might be one translator per network
address, but it is possible for people to code a translator that
works for many network addresses. Layer 3 protocols register with the layer
2 translator they depend on. They will also be responsible for
storing binding information; programs that want to bind a port on
a specific address will have to ask the layer 3 translator that is
responsible for that address. They will be able to provide routing, with the
help of layer 2 protocols; this is explained later. A means of
configuring addresses and options will be available.
These protocols should have rendez-vous points such as
/servers/interfaces/<network_address>
One network address will be associated with only one translator. Users
will be able to set up their own layer 3 translators.

- layer 4 translators. These translators can be set up in many ways.
They can be combined with layer 3 protocols (giving one big
translator a la pfinet). There can also be big translators with all layer
4 protocols mixed together. We can also imagine a one-translator-per-protocol
policy. Users will be totally free to replace the default
translators. Since many layer 4 translators can run on top of a given layer
3 translator, the best way to have a "good" binding policy is IMHO to
put this information in the layer 3 translator; otherwise, it would be
nearly impossible to keep all layer 4 protocols in sync.

- sockets interface. As Niels said, part of the sockets code could be in libc.
libc will be able to use the layer 3-4 translators for some of the work.
I have mainly thought about the layer 2 and 3 design, and I have verified
that it can work with many protocols. As the ideas for layers 2 and 3 become
more precise, I will think of a good way to implement sockets. Backward
compatibility with pfinet must be provided, at least at first.


Layer 2 translators
===================
These translators will be able to implement at least ethernet+arp, ppp+ncp,
atm+aal, tokenring+llc[+snap], depending on the kind of physical device
available.
Other layer 2 protocols may become available later.

The translator will be composed of three parts:
data transmission (ethernet, ppp, atm, tokenring), layer 3 register, and
layer 2 - layer 3 specific stuff.

Data transmission interface could be :
--------------------------------------
send (data,to); data to send to the network
receive (data); data received from the network
shutdown (); this layer 2 translator is exiting. It is a notification
message to all the layer 3 translators that are registered.

Layer 3 register interface :
----------------------------
register (); this is the function that tells the layer 2
translator that a layer 3 translator runs on top of it. It sends
several pieces of information that will be stored in the layer 2 translator,
such as: <mach port>, <network address>, <netmask of the address>, <length of
the address>, <protocol number>
This information allows the layer 2 translator to receive packets
and forward them to the appropriate layer 3 translator. It will also be
useful for layer 2-layer 3 work.
update (); changes some of the information previously sent with register ().
unregister (); the layer 3 translator wants to stop using this interface.
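
To make the register interface more concrete, here is a minimal sketch in
Python of the table a layer 2 translator might keep. It is only an
illustration: the class names, the use of a string as a stand-in for a mach
port, and the choice to store the netmask as a prefix length are all
assumptions, not part of the proposal.

```python
from dataclasses import dataclass


@dataclass
class Registration:
    port: str          # stand-in for the <mach port> (assumption)
    address: bytes     # <network address>, raw bytes
    prefix_len: int    # <netmask of the address>, as a prefix length in bits
    proto: int         # <protocol number>, e.g. 0x800 for ip

    @property
    def addr_len(self) -> int:
        # <length of the address>, derived from the address itself
        return len(self.address)


class Layer2Registry:
    """Hypothetical in-memory table kept by one layer 2 translator."""

    def __init__(self):
        self._entries = {}  # port -> Registration

    def register(self, reg: Registration) -> None:
        if reg.port in self._entries:
            raise ValueError("port already registered")
        self._entries[reg.port] = reg

    def update(self, port: str, **changes) -> None:
        # Only the fields sent at registration time may be changed.
        reg = self._entries[port]
        for name, value in changes.items():
            if name not in ("address", "prefix_len", "proto"):
                raise AttributeError(name)
            setattr(reg, name, value)

    def unregister(self, port: str) -> None:
        del self._entries[port]

    def entries(self):
        return list(self._entries.values())
```

A real translator would key the table on the mach port right it receives,
and would probably allow several entries per layer 3 translator.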

Layer 2-layer 3 specific stuff :
--------------------------------
This is where arp (say) will be implemented. The register interface
provides all the information needed to perform this work. In this way,
upper layer protocols can totally ignore it, while the work is done
internally.
Example: we receive a gratuitous arp packet. No problem: we check whether
the address is registered in the local table. Layer 3 protocols will
not know about it.
Example 2: we don't have in the arp cache the ethernet address of the
host we have to send data to. We send an arp request to the network, and
when the reply comes, we can send the packet. The layer 3 translator does
not know anything about it.
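
Example 2 could be sketched as follows in Python. This is a rough
illustration under assumptions: the class ArpResolver and the two callbacks
(send_frame, send_arp_request) are hypothetical names, and timeouts, retries
and cache expiry are left out.

```python
class ArpResolver:
    """Hypothetical helper living inside a layer 2 translator."""

    def __init__(self, send_frame, send_arp_request):
        self._send_frame = send_frame              # callable(mac, payload)
        self._send_arp_request = send_arp_request  # callable(ip)
        self._cache = {}     # ip -> mac
        self._pending = {}   # ip -> payloads waiting for an arp reply

    def send(self, ip, payload):
        mac = self._cache.get(ip)
        if mac is not None:
            self._send_frame(mac, payload)
            return
        waiting = self._pending.setdefault(ip, [])
        if not waiting:
            # First packet for this host: ask the network once.
            self._send_arp_request(ip)
        waiting.append(payload)

    def on_arp_reply(self, ip, mac):
        # Also covers gratuitous arp: the cache is updated, nothing is pending.
        self._cache[ip] = mac
        for payload in self._pending.pop(ip, []):
            self._send_frame(mac, payload)
```

The layer 3 translator only ever calls send(); the queueing and the arp
exchange stay hidden in layer 2.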


Layer 3 translators
====================
These translators will implement protocols such as : ip4+icmp4,
ip6+icmp6, and maybe others will become available.

This will be very ip(4-6) oriented, but it might apply to other
protocols.
In theory, layer 3 with ip is divided into 3 protocols.
- arp : encapsulated in ethernet (say), it performs the layer 2 to layer 3
mapping. I would prefer to implement it in the layer 2 translator, in
order to hide all these awful low level checks.
- ip : it is only designed to transmit data (without control).
- icmp : performs the control work that is not present in ip.

As icmp can be used by all layer 4 protocols, not implementing it in
layer 3 would duplicate this code in each upper-level translator.
Implementing ip and icmp together allows us to present 2 program
interfaces : data transmission, and control.

In order to allow programs to choose which interface(s) they want to
bind to, and because it would be a big problem to store this information in
layer 4 translators, we store it here, so a third program interface will
be available for registering bindings.

We also need to offer a fourth interface, for configuration purposes.

Data transmission :
-------------------
begin_session (); starts a session
close_session (); stops a session
send_data (); sends a packet to the network
receive_data (); receives a packet from the network

Control :
---------
send_error ();
receive_error ();

Binding :
---------
port_register ();
port_release ();
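
As an illustration of why the binding information sits in layer 3, here is a
small Python sketch of a port table shared by several layer 4 translators.
The arguments (protocol, port, owner) and the boolean return values are
assumptions; the proposal does not define the signatures of port_register ()
and port_release ().

```python
class BindingTable:
    """Hypothetical port table kept by one layer 3 translator."""

    def __init__(self):
        self._owners = {}  # (protocol, port) -> owner

    def port_register(self, protocol: str, port: int, owner: str) -> bool:
        key = (protocol, port)
        if key in self._owners:
            return False          # already bound by someone
        self._owners[key] = owner
        return True

    def port_release(self, protocol: str, port: int, owner: str) -> bool:
        key = (protocol, port)
        if self._owners.get(key) != owner:
            return False          # not bound, or bound by another owner
        del self._owners[key]
        return True
```

Because every layer 4 translator on top of a given address asks the same
table, two of them cannot both obtain, say, tcp port 80 on that address.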

Configuration :
---------------
get_address (); gets network address + interface information (netmask,
...)
set_address (); sets address + interface information
get_opts (); gets information about options
set_opts (); sets information about options


Layer 4 translators
====================

These translators can be implemented as described in the "Overview"
section.

Connectionless protocols (such as udp) will open a mach port to the
underlying layer translators the first time they send data to them,
and the port can stay open as long as both translators run.

Connection-oriented protocols (like tcp) will open a new port every time a
new session begins, and will close this port when the session ends.
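
The two port lifetimes can be contrasted with a short Python sketch. All
names here (FakeLayer3, open_port, close_port and the two protocol classes)
are hypothetical stand-ins; integers play the role of mach ports.

```python
import itertools


class FakeLayer3:
    """Stand-in for a layer 3 translator; hands out fake port numbers."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.open_ports = set()

    def open_port(self):
        port = next(self._ids)
        self.open_ports.add(port)
        return port

    def close_port(self, port):
        self.open_ports.discard(port)


class ConnectionlessProto:
    """udp-like: one port, opened lazily on the first send and then kept."""

    def __init__(self, layer3):
        self._layer3 = layer3
        self._port = None

    def send(self, data):
        if self._port is None:
            self._port = self._layer3.open_port()
        return self._port  # the port the data would be sent on


class ConnectionOrientedProto:
    """tcp-like: one port per session, closed when the session ends."""

    def __init__(self, layer3):
        self._layer3 = layer3

    def open(self):
        return self._layer3.open_port()

    def close(self, port):
        self._layer3.close_port(port)
```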

They will have to implement one interface.
------------------------------------------
open (); begins a session
close (); ends a session
send (); sends data to the network
receive (); receives data from the network
Maybe other functions (options configuration) should be available.


Comments
========

Some things have not been said above, but are available.

Routing between interfaces
---------------------------
You have 2 interfaces, eth0 and eth1.
You have two addresses, 192.168.1.1 (eth0) and 192.168.2.1 (eth1).

The translator responsible for 192.168.1.1 will register itself as :
eth0 : 192.168.1.1/32 - proto=0x800
eth1 : 192.168.1.0/24 - proto=0x800

The translator responsible for 192.168.2.1 will register itself as :
eth0 : 192.168.2.0/24 - proto=0x800
eth1 : 192.168.2.1/32 - proto=0x800

If a packet arrives on eth0 with proto=0x800 (ip), then we take the 5th
32-bit word of the ip header (dest_addr) and compare it to the registered
entries. If it matches 192.168.1.1, we send it to the translator responsible
for that address; if it matches 192.168.2.0/24, we send it to the other
translator.

If we have a default interface for sending packets, a layer 3 translator
will register as 0.0.0.0/0 - proto=0x800.

The search order will be based on the netmask length (the
number after the "/"). We begin with the longest netmask (32) and decrease
the number until we reach the default route (0). This default interface
will have to have a "good" address (not 0.0.0.0).
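
The lookup described above is a longest-prefix match. A minimal Python
sketch, using the standard ipaddress and struct modules, could look like
this. The table ETH0 mirrors the eth0 registrations from the example plus a
default entry; the translator names are made up for illustration.

```python
import ipaddress
import struct


def dest_addr(ip_header: bytes) -> ipaddress.IPv4Address:
    # The 5th 32-bit word of an IPv4 header (bytes 16..19) is the destination.
    (raw,) = struct.unpack_from("!I", ip_header, 16)
    return ipaddress.IPv4Address(raw)


def lookup(table, dest):
    """table: list of (network, translator name); the longest netmask wins."""
    ordered = sorted(table, key=lambda entry: entry[0].prefixlen, reverse=True)
    for network, translator in ordered:
        if dest in network:
            return translator
    return None  # no match and no default route registered


# Hypothetical registrations seen by the eth0 layer 2 translator.
ETH0 = [
    (ipaddress.ip_network("192.168.1.1/32"), "translator-1"),
    (ipaddress.ip_network("192.168.2.0/24"), "translator-2"),
    (ipaddress.ip_network("0.0.0.0/0"), "default"),
]
```

A real implementation would keep the table sorted (or use a trie) instead of
sorting on every packet, and would first select entries by protocol number.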

Network file system ala Plan 9
-------------------------------
Niels dreams about it.
Me too.

We first must have a working network stack, but then, it can be done.
For example :
$ cat /netfs/eth0/2001:660:305:f001::4/udp/dns/alpha.gnu.org
will return ip address(es) of alpha.gnu.org
$ cp /netfs/eth1/1.2.3.4/tcp/ftp/alpha.gnu.org/pub/gnu/hurd/contrib/marcus/gnu-latest.tar.gz .
will download the latest snapshot of GNU. :)
But we'll have to implement translators for dns, ftp, http, ssh, telnet,
.... that can work well and fast with the network stack (for ftp and
http, we already have things).
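
Reading the two example paths as
/netfs/<interface>/<address>/<transport>/<service>/<rest>, a translator
could split them as in the Python sketch below. This layout is only inferred
from the examples; the function name and the field names are assumptions.

```python
from typing import NamedTuple


class NetfsPath(NamedTuple):
    interface: str
    address: str
    transport: str
    service: str
    rest: str  # handed over to the service translator (dns, ftp, ...)


def parse_netfs_path(path: str) -> NetfsPath:
    # Split into at most 7 pieces so that <rest> keeps its own slashes.
    parts = path.split("/", 6)
    if len(parts) != 7 or parts[0] != "" or parts[1] != "netfs":
        raise ValueError("not a /netfs path: " + path)
    return NetfsPath(*parts[2:])
```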

Differences with Niels' proposal
--------------------------------

Since hurd-net no longer exists, our views converge. The differences
are that my layer 2 translator corresponds to Niels' layer 2 translator
and layer 2 part 1 stuff. Niels wants to implement icmp in the layer 4
translators. The interfaces differ, and the rendez-vous names differ
(this point is not really important).


That's all folks !
==================

I have many other ideas, but they will wait a little, because the
first step is to have a good idea of how it will work, then we have to
make it work, and afterwards we can enjoy coding many features based on
this stack.

I hope many people will give their opinions or ask questions. I prefer
to explain something simple ten times, because the 11th question might
reveal a serious problem, which will help us design something that
works better, with fewer issues (otherwise, it will have to be
re-implemented...).

olivier



