Hi,

Sorry for answering so late again. We do seem to have very bad timing -- that's the second time your mail arrives just at the beginning of a longer period of email abstinence on my side ;-)
On Sat, Jan 19, 2008 at 01:57:55PM +0100, Carl Fredrik Hammar wrote:
> <[EMAIL PROTECTED]> writes:
>
> No, a hub is an object similar to a channel, except it deals with
> filesystem requests, i.e. opens. Which results in a channel.

I think it's clearer now: You are talking about the distinction between the object doing the channel management -- i.e. handling opening the connection to a channel-aware FS node, optionally uploading to the client -- and the object that implements the calls on the opened connection, the one that actually gets uploaded.

There might be some confusion here (at least I must admit that it wasn't so clear to myself up to now): Opening a channel session is really orthogonal to opening files! Think of translators that expose a whole directory tree rather than just one node: If we want to optimize stacking of such translators, it's not sufficient to open channel sessions whenever opening one of the files exposed by the translator. Rather, we want all operations -- both FS and IO -- invoked on the directory tree exposed by this translator to happen in a single large session, created on first access to the translator's directory tree.

Considering this, it seems obvious that the connection management is completely distinct from the invocations (FS, IO, or else) done on an existing connection. No call for trying to unify them in any way, I would say...

> My original idea was to implement hubs as channels, using channels'
> ability to implement extra interfaces and not implementing io. This
> didn't fit with the channel concept. Expanding the concept to allow
> it, we can throw out hub as a special concept, it's ``just a channel
> implementing the fs interface''. (The channel fs interface would
> return channels instead of ports on `open()'.)

I see. Terminology is really problematic here... AIUI the channel concept as originally planned has two major elements: One is the connection management.
The other is an API that can work both with modules loaded in the client and with actual RPCs to the translator; and that implements the standard IO interface as well as hooks for any family-specific extensions. Your idea now was to use the extension mechanism to implement the management interface as well, right?

I don't remember how you intended to implement the extension hooks in your original design -- or maybe I never really knew. Thinking about it now, I don't quite see a reason why channels would need any special handling for extensions at all. It seems to me that the family-specific extensions are Hurd interfaces like any others -- each coming with a .defs file for MIG, and a library for the programmer's convenience (and for optimized stacking...). In this sense, "implementing something as a channel" just means that the respective convenience library supports libchannel. (Or better even something with a more appropriate name, like libtstack or so...)

For the FS and IO interfaces ideally this would be libc, or perhaps some special libstackio offering a similar interface, if there are objections against implementing it in libc itself. For stores it would be libstore; for audio and network it would be libnoize and libpacket, or something like that.

> > Not sure though what exactly you mean by "Hurd objects" in this
> > context. Did I mention already that the word "object" is way
> > overloaded? ;-)
>
> By Hurd objects I meant server-side objects, as in objects servers of
> the Hurd provide. Typically file objects, but also others, for
> instance user identities and processes through the auth and proc
> servers respectively. (I'm not suggesting these particular objects
> are useful in the context of channels.)

OK, that's what I guessed. Just wanted to make sure :-)

> I'll stick with the term `server-side object' instead.

I didn't say "Hurd object" is a bad term. It's not fully self-explanatory, but neither probably would be any other term one could come up with...
> > Considering the more generic scope, maybe we should drop the whole
> > "channel" terminology altogether, and try to find something more
> > intuitively describing translator stacking... I have no suggestions
> > offhand, though :-)
>
> How about `virtual port' or `vport' for short.

Well, the "virtual" part has some merit; but "port" is rather confusing IMHO. Note that a connection to a translator exporting several files can have many open ports. (In fact, it could even with a single-file translator -- though this is not a very likely case...)

> > (For stores for example it's probably not really useful most of the
> > time... It seems to me that the major motivation behind libstore was
> > actually to allow the root FS to run without relying on other
> > processes. Personally, I'm not convinced this was really a good
> > idea. But well, what do I know :-) )
>
> Interesting. I think I agree with you; at the very least libstore
> seems a much too complicated solution with respect to the problem.

You think so? To be honest, *if* the root FS running all in a single process is considered a requirement, I can't think of any simpler solution that wouldn't lose a lot of flexibility... It's really only the requirement I have doubts about.

> > The idea of channels (or more generally, optimizing translator
> > stacking) is *not* merely to avoid actual RPC calls. I don't think
> > that would be a worthwhile goal. The actual IPC is very slow on
> > Mach, but modern microkernels show that it is possible to do much
> > better. The cost of IPC itself is not exactly negligible, but only
> > in a few situations really a relevant performance factor.
>
> I would argue that (potentially deep) translator stacks are one such
> situation.

The depth is not really decisive here. IPC overhead only becomes a relevant factor when having many calls, each doing only very little work. That can happen without any stacking just as well.
Also, even if it's a relevant *factor*, it's still not necessarily a *problem*, unless we have really lots of calls in absolute numbers -- with Mach, on the order of at least tens of thousands per second; on a modern machine, probably more like hundreds of thousands. There are not really that many situations where we reach such numbers, I'd guess... I'm not saying IPC overhead is meaningless. But we shouldn't overestimate it.

(Admittedly, I have heard claims that on Mach there is a much larger indirect overhead from IPC, because of poor scheduling. I don't know enough about the details to form a good opinion on whether that's true, and if so, whether it can be fixed...)

> Also my impression is that we will be stuck with Mach for quite a
> while, and that IPC on Mach is inherently slow. So the fast IPC
> argument doesn't really apply.

Well, opinions on that diverge vastly. Marcus and Neal seem to consider the existing implementation useless, and to believe that we should all focus on new designs instead. (At least that is my impression... I hope I won't be accused of misrepresenting their opinion again.) On the other hand, there are people like me, convinced that the research on new designs is interesting and will give some inspiration for future developments, but otherwise has little direct effect on current Hurd development. Convinced that before we switch to a totally new design, we should first take the existing design (and Mach) as far as we can; and only when we have reached the real limits of what can be done with it, go for a design that addresses precisely these limits -- based on knowledge, not speculation.

While this means sticking with Mach for the time being, it doesn't mean we need to take it as given. The process of improving the existing implementation can very well encompass improvements to Mach as well.
Presently, there is lower-hanging fruit for improving Hurd performance; but once we get to a stage where IPC performance becomes the major bottleneck, I hope that we can improve on it, rather than trying to work around it... It seems, for example, that a good part of what makes Mach IPC so slow is owed to network transparency at the kernel level -- which we don't make use of anyway. Considerable simplification might be possible here; the question is only whether it can be done without changing the semantics too much, so that everything would need to be rewritten...

> > The main overhead of RPCs is not from the actual calls, but from the
> > implications of an RPC interface -- from the fact that client and
> > server can run in different processes (address spaces), possibly
> > even on different hosts. (Though we don't employ this latter
> > possibility presently, and I'm not convinced it is really useful to
> > preserve network transparency at such a low level.) Meaning the
> > client needs to be prepared for communication to fail; meaning that
> > the interface is constrained to passing mostly plain values, no
> > pointers, no global variables, no function pointers etc.; meaning
> > client and server don't have access to the same resources; meaning
> > server and client threads run asynchronously (unless using passive
> > objects, which we don't).
>
> I'll tackle the issues one at a time, in the order you enumerated
> them.
>
> * Communication failure
>
> We have to deal with this in either case, since we might be using a
> port wrapper.

That's exactly my point: *If* you want to do everything the same as if it were a real port you are communicating through, you will have to handle things like possible communication failure, even when the code is actually loaded in the client and communication failure thus can't happen. That is precisely why pretending that we are always talking through ports is not really helpful.
We need to do things at an abstraction level where it is possible to skip these things when they are not necessary.

> Even if not using a port wrapper directly, the bottom layer of a
> channel stack probably uses IPC (it is most likely a port wrapper).

Not at all. Translator stacks that are mostly standalone, where incoming requests are handled internally rather than being forwarded to some other entity, are perfectly possible -- and in fact it's those that can profit most from stacking optimization. More importantly, the fact that communication can fail at some lower layer does not mean that *every* layer has to guard against it.

> * No pointers
>
> The problem here is unnecessary copying. To illustrate this, let's
> compare the Hurd's `io_read' to POSIX's `read'.
>
> `io_read' optionally takes a buffer as input and returns a buffer
> which is either the input buffer or a newly allocated one. Note
> that the input buffer is deallocated from the client on a successful
> send, and that the output buffer is deallocated on a successful
> reply. Mach can take advantage of this and avoid copying the buffer
> if the server reuses the input buffer.
>
> Instead of taking a buffer, `read' takes a pointer as input and
> writes to the underlying memory, thus avoiding any copy.
>
> The problem with `io_read' is that we have to pass page-aligned
> data. (We can pass unaligned data, but the entire page would be
> visible to the server.) This means that we have to resort to copying
> if we want to store the data at an unaligned address. Unfortunately,
> this is quite common; for instance it's needed for buffering.

Yes, out-of-band transfer can help in some cases. However, the need for alignment already extremely narrows down its use: It means that you can't just use it for any piece of memory you wish to transfer, but only for specially prepared buffers. It also means that it can only be used in situations with few, large buffers.
Furthermore, while it can avoid the cost of copying, it is far from free -- the VM manipulations necessary are quite expensive too. (In fact, I once saw a claim on lkml that VM tricks tend to be *more* expensive than copying! My intuition tells me that with the standard 4k page size, this might very well be true.)

While these points strongly reduce the usefulness of out-of-band transfer as a remedy for the lack of pointers in the case of one-shot buffer transfers, that's not even the worst of it. Pointers can do much more than that. For one, a pointer once transferred allows both caller and library to update the referred data repeatedly. With RPC, each subsequent update needs to be communicated explicitly. (Shared memory can be used to avoid that, but is expensive at setup time.) While in some cases this can actually be considered a good thing, as it tends to be more robust, you must see that it can be an enormous cost.

And there is yet more. Pointers are crucial for complex data structures. For RPC, any data structure needs to be flattened into a set of arrays and indices, and reconstructed into a proper data structure on the receiver side. (Or used in the awkward flattened representation.) Also, when transferring data over RPC, the caller either needs to have a very good understanding of what will actually be needed, or in some cases has to transfer much more than really necessary -- which is especially painful when the data structure needs to be converted...

> * No globals
>
> The use of globals implies that memory be shared between different
> clients, each having a channel from the same translator. I think we
> can agree that is a bad thing in this context (unless read-only,
> like code).

I don't see how globals are related to a translator having multiple clients. I'm talking about server and client code sharing global variables. Of course that also means that when a client contacts multiple instances of the same translator, i.e.
one uploaded module is used multiple times, all the instances share the same variables -- which obviously limits its application somewhat...

> * No function pointers
>
> Right. But we do have ports, which can do the same thing: just send
> one and listen for call-backs.

That's not quite the same thing. Ports are more expensive, both in terms of resources and of complexity, by several orders of magnitude -- prohibitively expensive in all but a very few cases.

> * Asynchronism
>
> While Mach's IPC primitive `mach_msg' is asynchronous, we are only
> interested in RPCs, and these are synchronous. In some sense an RPC
> is just a function call to a function in another address space.

I guess you are right on this one. While I'm not convinced that the RPC mechanism fully hides the underlying asynchronicity, I can't think of any situation right now where it would add complexity above the RPC level... Maybe I was too hasty here.

> * etc.
>
> It's hard for me to counter this one. I hope you don't mind me
> skipping it. ;-)

You left out access to resources... But that's beside the point anyway. I must have expressed myself very badly indeed if I left you with the belief that by commenting on some individual issues I picked out, you could in any way disprove the general problem I'm presenting here.

You can hardly question the fact that RPC mechanisms put very severe restrictions on communication interfaces, and by that, on the structure of the whole program. Things that work naturally in a library interface require explicit, all but trivial handling when dealing with RPCs. We have to put up with vast overheads; lots of code to handle all the communication, the context, the possible error conditions -- where the actual functionality of the translator might boil down to a simple function of no more than a few lines.
On top of the direct and indirect cost of communication itself, the constraints and inefficiency of RPC often call for redundant checks and calculations at the various layers, and for data and control flow patterns not at all optimal when transferred to a library-call environment. And a framework working at the lowest level, unaware of the semantics of the communication, has no means to offer anything that might reduce this inefficiency -- anything that could help avoid redundancies or rearrange patterns.

> > (I must confess that I don't know the actual store interfaces; but
> > my guess is that libstore uses such a special interface between the
> > modules internally?)
>
> It seems the only difference is that offsets are mandatory and given
> in blocks instead of bytes; the amount to be read is in bytes, but
> must be a multiple of the block size. In this case it would have been
> better to keep it wholly consistent with the io interface, and just
> use block-aligned offsets.
>
> Also it has some funky functionality to remap the blocks of a store
> without any cooperation from the back-end. But this just seems
> awfully complex and could probably be reimplemented through a store
> module with only a slight loss of performance.
>
> It seems that libstore really could use a clean-up. :-/

Don't be too hasty in your judgements. These things were designed by some very smart folks, and I'm sure they did have something in mind there. The problem is that we do not know what it was... (If you put down specific questions, and address the mail directly to Roland and/or Thomas, you *might* have a chance of getting a response; but don't count on it :-( )

> > In some cases the actual functionality of the individual layers
> > perhaps could even be implemented using some kind of abstract
> > description, rather than C code.
>
> I don't really see how that would work, do you want to elaborate?
Take the example of stores: Many of them just do some kind of remapping of the blocks of underlying stores (striping, concatenating), or other trivial operations (zero store). These can easily be expressed mathematically. In such a case, instead of passing the client requests through all the layers one by one, each doing some transformation, the framework could get the mathematical descriptions from all the layers, and assemble them into a single translation function.

Note that I'm not claiming this is really feasible or useful in practice. I just wanted to give an idea of what possibilities exist when working at a higher abstraction level...

> So where do we go from here? As I see it we have two extremes, which
> I will call dynamic and static channels (at least for now).

For the sake of understanding, let me present my own take on this -- which seems mostly to restate what you are saying below, in slightly different terms. I see some three or four distinct levels at which translator stacking could be optimized.

Level 1 would just replace RPC by library calls, having no clue about the semantics of the communication, and giving no clue to the actual translator implementations. This is pretty similar in spirit to what I was originally pondering way back -- only that my crazy mind was actually thinking even lower level (let's call it 0.5): hooking right into the program execution mechanisms... The library variant seems much simpler and just as effective, though.

The great advantage of such an approach is simplicity of use: Once the necessary machinery is implemented, everything can make use of it, with hardly any modification at all. However, I'm no longer convinced that such a mechanism would be really worthwhile -- that the limited performance increase would make up for the disadvantages of giving up the process boundaries.
If we already have to coax all communication through the restrictions of RPC, we can just as well reap the benefits of having distinct hardware-protected address spaces... (I mentioned in the "vision" mail that I was thinking of using LLVM, in the hope that it would optimize away, after linking, the remaining overhead of RPC-based communication that I described above. But I have serious doubts about LLVM's ability to optimize a lot at such a high level.)

This approach might have some merit perhaps, because I was also pondering a similar approach to do network-transparent RPC in user space -- both things might be handled in a common RPC abstraction framework. OTOH, tschwinge suggested a much simpler approach for network RPC... (Just using a port-forwarding translator.) Network transparency could of course also be achieved at a higher abstraction level, e.g. FS-like as in Plan 9. This way it could be smarter, but also less universal. I'm not sure which approach is better.

Translator stacking at level 2 is totally different from level 1: It works directly with the public POSIX (for FS and IO) and POSIX-like APIs. It knows the semantics of these interfaces. It is also fully aware of the difference between real RPC and a library interface, and intelligently maps the APIs in either case to minimize overhead. This level is still fairly simple: It requires an adapted library for each interface; but actual translators and applications can still reap the benefits with little or no explicit support. At the same time it should already cut considerably into the communication overhead.

(Some intermediate level, say 1.5, might also be possible: hooking at some internal interfaces below the level of the public APIs; already aware of the differences between communication over real RPC and library calls, but not having a full understanding of the semantics of the specific API. I can't see any benefit in this approach, though...)
I believe that level 2 should be the baseline for the translator stacking framework, working for all stackable translator families. Individual families like stores can optionally support even more optimized interfaces, for internal use by translators of that family. This would be level 3. It's pretty much the same as the existing libstore (and libchannel as originally intended), except that it uses a common base framework for all families, and that modules are loaded directly from the server rather than from some external location.

> Dynamic channels are the ones I have presented, sans the non-RPC
> interfaces. That is, they closely emulate the existing RPC interfaces,
> and appear as nothing more than fast RPCs to clients.

That seems to be what I'm describing as level 1 above.

> The big selling point here is transparency. Using a channel is just
> like using a port, so existing clients need little change to benefit
> from using channels. Same thing with servers: since they implement
> RPCs already, it's mostly a matter of hooking the existing RPC
> implementations to channels instead.

Yeah, the simplicity of use, as I described it.

> Unfortunately we miss out on the convenience offered by libc, unless
> we reimplement them over channels and supply them also. (We could
> also integrate channels into libc directly, but I suspect that
> wouldn't happen anytime soon, if at all.)

Indeed, I also fear that there might be problems with that. Which would be rather unfortunate, as not only level 1 is in this situation, but also level 2 (or 1.5).

> Other benefits include that different channel families need not be
> aware of each other to benefit from using the channel interface.

That is also true for level 2, with the FS and IO interfaces forming the common denominator for all translator families. Only level 3 offers no compatibility per se -- but then, as I said, I'd only implement level 3 as optional extensions beside level 2, so no problem here.
> Though I suspect interoperability isn't very useful. Why would an
> audio channel want to be layered over a network channel?

Oh, this is actually a boring case: Streaming an audio channel over the network is a way too obvious application. But what about the inverse, layering a network channel stack over an audio channel? Now that calls for a bit more imagination :-) How about pushing network packets through an audio link? (Think analog modem!) Or maybe just listening to the network stream to get a feel for the traffic? Or maybe an art project? ;-)

Obviously, I'm not totally serious with the latter use cases. But my point is to demonstrate that just because applications are not obvious, we should not assume that there are none. In fact, for me the whole point of the Hurd is that it's much more flexible in its use of interfaces and mechanisms than other operating systems -- that it's much more open to implementing new ideas!

> With static channels each channel belongs to a family, where each
> family corresponds to its own abstraction, and where each channel
> implements an interface optimized for this abstraction.
>
> That is, we introduce a libaudio, libnet, etc. for each family. Each
> being like libstore currently is, only cleaner and sharing common code
> through libchannel. Where libchannel itself doesn't introduce a
> channel abstraction per se, it's just a support library.

I'm not certain about the details of what you are describing here. I think you mean something akin to my level 3, but I'm not entirely sure.

> (Although in this case I'd much rather split libchannel into smaller,
> clear-cut pieces, for instance a `libenc' to deal with encoding and
> decoding transferred data.)

Splitting out sub-libraries only makes sense if there will actually be other users of them...

> The thing about static channels is that they are simple. Both simple
> to implement and to use, because their interface can be brought
> closer to the problem domain.
>
> The downside being that a suitable abstraction must be engineered for
> each channel family, including support libraries to implement the
> translators corresponding to each module.

Well, simpler by what measure? This is really hard to classify. Writing a translator using the level 3 interface is completely different from writing a traditional translator centered around POSIX interfaces. On one hand, the interface can be much better optimized for the problem domain; and the programmer is also relieved from handling all the gory details of FS-like interfaces. On the other hand, this means losing a common language, and a very intuitive abstraction -- FS interfaces, while usually far from optimal for specific applications, also have strong advantages! So it's really hard to come to a definite conclusion as to which way it's easier to write translators...

> Also any interoperability must be explicit, by creating modules that
> adapt one channel type to another.

Not at all: I personally believe that every translator using a specific interface should also expose an FS-based compatibility interface, at least for the basic operations. Special RPCs more suitable for the specific application should be implemented as additions to, not replacements for, the FS interfaces. The nice thing when working with level 3 libraries is that handling of the compatibility interface can be done automatically by the library. So compatibility in fact gets easier: Once the support is implemented in the level 3 library, the programmers of the actual translators don't need to think about it.

> The middle ground as I see it is dynamic channels with non-RPC
> interfaces, where each such interface corresponds to a channel family.

I'm not sure what you mean here. Is it related to my level 2?

> Somehow I think providing *both* static and dynamic channels would be
> cleaner and more straightforward, using whichever handles the task
> well enough.
As I said, I believe we should provide level 2, and additionally level 3 (what you seem to call "static") for some families. I don't see much use in also providing level 1 ("dynamic")...

-antrik-

_______________________________________________
Bug-hurd mailing list
Bug-hurd@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-hurd