I've been pondering what the interfaces to the layer 2 and layer 3 code should look like. I'm thinking primarily of ipv4 and ipv6 over ethernet, but I hope it won't be too difficult to generalize to other media.
Summary
~~~~~~~

I propose splitting the code into the following parts: layer 2 (device driverish), layer 3 part 1 (media dependent ip functionality), layer 3 part 2 (media independent ip stuff, and interface management), and layer 4 (implements tcp, udp, and icmp, and replaces the current pfinet, in one way or another).

Layer 2 should get its own translator in the filesystem; the main reason is to make it possible to run several pfinets in parallel. The rest of the code should, at least for a start, be a single process. Code for accessing layer 2 and layer 3 should preferably be put into a library libif, analogous to libstore.

Layer 2 (Ethernet)
~~~~~~~~~~~~~~~~~~

Basically, this piece of code represents a real physical ethernet card. Each interface should have a rendezvous point somewhere in the filesystem, e.g. /device/eth0. It could be implemented as a kernel device, as a userspace translator, or with the work divided between kernel and userspace. Access would usually be restricted, but not necessarily to root only; for instance, one could make the filesystem node owned by a group "network".

The supported operations:

open()
  Gives you a port to the device.

close()
  Stop using it.

write(frame)
  Accepts a raw ethernet frame as argument, and puts it onto the wire.

listen(code, dst)
  Tells the device what traffic you want to see. Code is the ethernet type code, and dst is the destination address on frames. The first argument distinguishes e.g. between ipv4 and ipv6, and the second is needed for multicast. You can listen on several (code, dst) descriptions at once, and if several processes have the device open at the same time, they can listen on the same or different codes/addresses. This call works as a filter, and it also lets the device configure the card to do filtering in hardware, as well as putting the card in promiscuous mode as necessary.

ignore(code, dst)
  The opposite of listen.

read(buffer)
  Reads a raw frame into a buffer.

Furthermore, the device should implement the usual calls needed by select(), and there should be some calls to ask the device about its type, maximum mtu, hardware mac-address and other properties (I'd *still* like to see general property lists on inodes, MUAHAHAHA).

Layer 3 (IPv4 and IPv6), part 1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It would be cute to have the interfaces described in this section available in the filesystem, but I don't think that's terribly important.

The "layer 3, part 1" interface is similar to raw sockets. The code that implements the interface (which could be a library libif, or some separate process) talks to a single layer 2 device, and provides the following interface (via rpc, or ordinary function calls) to its users:

open(layer-2-device)
  Initialization.

close()
  Shutdown.

write(packet)
  Sends a raw ip-packet on the interface. Source and destination addresses, all headers, checksums, etc, are filled in by the caller.

listen(ip-address)
  Tells the interface that we're interested in packets with the given ip-address. Can be a unicast or multicast address. It's not clear whether or not the same layer 3 component should handle both ipv4 and ipv6; one way might be to use ipv6 exclusively, and represent ipv4 addresses as ipv6-mapped addresses. The wildcard address is valid, which is useful for a packet forwarding process.

ignore(ip-address)
  The inverse of the above.

read(buffer)
  Reads a raw ip-packet into the buffer.

This is pretty similar to the layer 2 interface.
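To make the shape of these two interfaces a bit more concrete, here is a rough sketch of what the layer 2 operations might look like as plain C prototypes. All names and types below are made up for illustration (this is not an existing Hurd, libif or pfinet API); in practice the calls would probably be MIG-generated rpc stubs on the port returned by open():

    /* Illustrative sketch only; every identifier here is hypothetical.  */
    #include <stddef.h>
    #include <stdint.h>
    #include <sys/types.h>

    /* Opaque handle for an open layer 2 device such as /device/eth0,
       e.g. wrapping the port obtained from the translator.  */
    typedef struct l2_device l2_device_t;

    int l2_open (const char *node, l2_device_t **dev);
    int l2_close (l2_device_t *dev);

    /* Put one complete raw ethernet frame onto the wire.  */
    int l2_write (l2_device_t *dev, const void *frame, size_t len);

    /* Ask for traffic matching an ethernet type code and destination
       address; several (code, dst) filters can be active at once.  */
    int l2_listen (l2_device_t *dev, uint16_t code, const uint8_t dst[6]);
    int l2_ignore (l2_device_t *dev, uint16_t code, const uint8_t dst[6]);

    /* Read the next matching raw frame into the caller's buffer.  */
    ssize_t l2_read (l2_device_t *dev, void *buffer, size_t len);

The "layer 3, part 1" calls would have the same shape, with raw ip-packets instead of frames, and ip-addresses instead of (type code, destination address) pairs as the filter.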
As with the layer 2 interface, we also need calls to get the interface's mtu, ip netmask, and perhaps other properties, such as any hardware-based link-local address. If rpc:s are used, we should use the same rpc:s as for layer 2.

Layer 3 (IPv4 and IPv6), part 2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This component talks to one or more "layer 3, part 1" interfaces. It is responsible for routing decisions and the like. The point is that it is independent of the underlying media; everything media specific is done by layer 3, part 1, while the rest is done here.

Operations:

open()
  Initialization.

close()
  Shutdown.

list_interfaces()
delete_interface(interface-index)
add_interface(interface)
  Manage interfaces.

add_address(interface-index, ip-address)
delete_address(ip-address)
list_addresses(interface-index)
  Manage ip-address assignment.

write(packet)
  Writes a raw ip-packet. Source and destination addresses are provided by the caller. Automatically chooses an appropriate interface.

listen(ip-address)
  Says what addresses the caller is interested in. The typical cases are (i) get packets with a specific ip-address as destination, (ii) get packets with a destination address assigned to any of the interfaces, and (iii) get all packets, no matter what the destination address is. The latter is for packet forwarding.

read(buffer)
  Reads an ip-packet into the given buffer.

select_address(src-set, dst-set)
  Given a set of possible source addresses, defaulting to all addresses on any of the interfaces, and a set of possible destination addresses, choose the best source and destination address (according to appropriate address selection rules). Actually, this could be a plain library function, but if there's any local configuration of the rules, that configuration must be stored somewhere, and this seems like as good a place as any.

Layer 4 interface
~~~~~~~~~~~~~~~~~

This is the interface that is closest to what socket-using applications will use. It implements tcp, icmp and udp (and perhaps other protocols as well).

I'm not really sure what this would look like. It could be the current pfinet interface (where is that defined? I looked in hurd/hurd, but the only rpc defined by hurd/hurd/pfinet.defs is pfinet_siocgifconf, which isn't terribly interesting for now), or something plan-9-ish with nodes /.../<ip>/tcp/<port>.

For now, I think the nicest way is to have a directory tree with ip-addresses, port numbers etc, including wildcard addresses and ports, and then link socket applications with an -lsocket library that knows how to deal with that tree. As far as possible, one should use ordinary file rpc:s. Perhaps e.g. SO_REUSEADDR could be mapped to O_EXCL in some way?

It's also conceivable to do something more low-level; for instance, handling of wildcard addresses and ports could be delegated to -lsocket. Or one could go one step further, with the layer 4 server only handling management of ip addresses and port numbers and not much more. Clients would get handles that are associated with socket quadruples <src-ip, src-port, dst-ip, dst-port>, on which they can send and receive packets, with the details of tcp, udp and icmp handled by -lsocket.

Regards,
/Niels