On Tue, 7 Apr 2009 11:48:52 -0500 Marco Peereboom <sl...@peereboom.us> wrote:
> > The design involves a technology called "Express Ether" though it is
> > typically written as "ExpEther," and it is basically a way to run a
> > PCIe bus over ethernet. Though this might be the first you've heard
> > of it, ExpEther has been in development at NEC for the last five
> > years, and yes, I'm currently working on getting the documentation
> > released for the existing silicon.
>
> Getting these docs would be kick ass. I was vaguely aware that this
> was happening and what I don't know is how the silicon works or looks
> like. Got any free docs on that?
>

I'm working simultaneously on five (5) different fronts:

1.) Getting Documentation Open
2.) Getting Existing Code Open
3.) Preventing NDA Nonsense
4.) Providing Hardware
5.) Reciprocation

To put it as bluntly as possible, *I* *WANT* the code quality and
reliability of OpenBSD in the system I'm working on, so I'm doing
everything I can to make it happen. This is the first time I've had the
opportunity to strongly influence, if not decide, practice, policy and
procedure, so when it comes to getting OpenBSD involved, I *am* making
things up as I go along.

Needless to say, changing the thinking and standard operating
procedures at multiple bureaucratic corporations may prove to be
nothing more than wishful thinking on my part, but the only way to find
out is to try.

Think for a moment of the recent announcement about commercial use of
OpenSSH versus the support given to the OpenSSH authors, and all
similar situations. Since reciprocation from corporate entities to open
source software authors is typically lacking, reciprocation is an
important issue to mention publicly. Nonetheless, the details of
reciprocation are a private matter.

As for getting docs, hardware, code and whatnot open and into the hands
of interested developers, I'm doing it as fast as possible. Since I'm
an NDA-slave, I have some of it here, so it's mostly a matter of
getting the required permissions to release it.
The (stupidly) super secret sauce documentation I have here is from us
forcing NEC to provide accurate translations to English of their
internal Japanese docs. They are dated "March 30, 2009," so this is all
brutally new.

ExpEther has been one of the secret research projects developed at NEC
labs, and has been in development for over five years. It works, but
just getting example and/or reference hardware for partner corporations
has, thus far, been a royal pain (limited supply issues). I'll be
getting this issue fixed shortly.

> >
> > http://www.nec.co.jp/press/en/0702/0801.html
> > http://www.expether.org/
> >
> > In short, you can think of ExpEther as something between a bus
> > extender and a bridge (PCIe<->ethernet), so basically anything you
> > can plug into a PCIe slot can be made available to a remote
> > machine. Yep, you can even partition attached devices into VLANs
> > and basically "build" a computer on the fly out of available parts
> > attached to the network. For example if your VPN or secure website
> > is running a little slow, you would usually halt the machine and
> > add a crypto accelerator, but with ExpEther, you just export a
> > crypto accelerator device on another system to the system that
> > needs it and the recipient system assumes the device is attached to
> > it's local PCIe bus.
>
> So this is where all the work comes in. We need a new pci bridge (or
> bus) device that does all the magic. Once this is in place one could
> trivially hook hardware up and make it work regardless of distance
> (latency would have to be considered obviously).
>
> I am a little confused here though; if this is done right it should be
> transparent to the OS and no code would have to be written at all
> (minus management obviously). Why do we need code?
>

Outside of getting the ExpEther driver ported/written for OpenBSD,
there's only one place where it has been suggested that new code is
needed: getting the pseudo-device created by softraid, and the
pseudo-device pretending to be a scsi controller (where the softraid
device is attached), to appear to be attached to a PCIe bus.

As far as I can tell, you've *already* written most of the code, and
have recently been attempting to do something (roughly) similar to
ExpEther with your softraid-AoE code.

> > One of the first applications I'm working on is exporting a softraid
> > volume over ExpEther. I was asked if it was possible to build a shim
> > that makes a block device like a softraid sd0a look like an ATA
> > device sitting on a (fictitious) ATA controller on the PCIe bus?
>
> Sure it could easily be used for that however if you want to make this
> much more usable see my previous paragraph. You really want to solve
> the problem only once and not multiple times.
>
> > Though it's certainly an uncommon thing to try to do, there's just
> > something about this approach that makes me wonder if it's a
> > crazy/stupid idea, or absolutely brilliant?
>
> Fine hack to prove a concept however a pci bridge (or bus) is the
> device you really need and should write.
>

I agree.

> > To *me* (complete idiot), I'm wondering if this is being approached
> > at the wrong level, namely shimming a block device like sd0a to be
> > seen as a ata/scsi device on a fictitious controller, versus
> > shimming something below it, i.e.
> > scsibus0 at softraid0
> > sd0 at scsibus0
>
> Softraid is nothing but a virtual HBA. Or a shim or a
> $insert_fancy_name_here.
>

Yep, the "shim" they asked about would take something that is already
virtualized (a softraid block device sd0a), and virtualize it again as
an HBA.
Personally, I think this double-virtualization is a stupid idea, and it
does *not* qualify as a "fine hack to prove a concept" because I can
see nothing (but overhead and wasted time) being gained from it. To
*me* it makes far more sense to just use softraid as intended.

I still need to get through the existing ExpEther driver source, but I
suspect it must supply a virtual PCIe bus, so it's really just a matter
of getting the pseudo-HBA of softraid attached to it. I also suspect
the existing driver must provide a pci bridge so it can talk to
physically connected PCIe devices.

> >
> > The *consumer* of the resource is expecting to see a disk attached
> > to a (fictitious) scsi/ata controller on it's local PCIe bus (which
> > is imported via ExpEther).
> >
> > The *provider* of the resource needs to take a softraid volume and
> > make it look like just a (fictitious) disk attached to a
> > (fictitious) scsi/ata controller on a (fictitious) PCIe bus (which
> > is exported via ExpEther).
>
> Sure all this is done in softraid today. See the disabled AOE code as
> an example.
>

I read your AoE code once briefly, and drooled on myself, but once I
get through the other docs (and finish beating up the required people
to get them released), I'll give the AoE code another read.

-- 
J.C. Roberts