J. Roeleveld <jo...@antarean.org> wrote:
>>
>>No, it wouldn't, since jobs just finishing and wanting to report their
>>status cannot do this when there is no server. You would need a rather
>>involved protocol to deal with such situations dynamically.
>>It can certainly be done, but it is not something which can
>>easily be "added" as a feature: If this is required, it has to be the
>>fundamental concept from the very beginning and everything else has to
>>follow this first aim. You need different protocols than TCP sockets,
>>to start with; something like "dbus over IP" with servers being able
>>to announce their new presence, etc.
>
> I think it's doable with standard networking protocols.
Yes, you can "tunnel" such a protocol over existing protocols, but
"essentially" you must use a different one. Unless you want a static
setup (use server A, if that fails use server B, and server A reports
everything to server B), it cannot be done in a simple way with only
one port open on the server: the client also needs a port open so that
it can be informed about the "current" server. Even worse, you need a
"daemon" running for each client to handle this port. In such a case,
you might as well make each client its own server, by spreading all
changes to all clients immediately.

> But, either you have a master server which controls everything.
> Or you have a master process which has failover functionality
> using classical distributed software techniques.

This summarizes it quite well. The concept of my "schedule" is to
follow the first path (with the advantage of being simple, having only
one part, clients do nothing while their "task" is running).
If you want to follow the latter, you need a rather different CLI and
a different protocol - which is practically everything "schedule"
consists of; so it is probably simpler to rewrite this from scratch.

As I said: it is not a "feature" you can easily add later on; it is
a fundamental decision you must make at the very beginning.
While you are at it, you should probably also encrypt the communication
and establish methods for authentication, which is also something
I currently omitted in "schedule" for simplicity (although this might
be easier to add later on).
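
To illustrate the static setup mentioned above (use server A, if that fails use server B), here is a minimal Python sketch of the client side only. It is not how "schedule" works: the addresses, the function names and the plain-text status message are all assumptions made up for the example, and the part where server A reports everything to server B is not covered.

```python
import socket

# Hypothetical, statically configured server list (A first, then B).
# "schedule" itself defines no such list; this is only an illustration.
SERVERS = [("127.0.0.1", 8001), ("127.0.0.1", 8002)]


def connect_first_available(servers, timeout=2.0):
    """Return (socket, address) for the first server accepting a connection."""
    last_error = None
    for address in servers:
        try:
            sock = socket.create_connection(address, timeout=timeout)
            return sock, address
        except OSError as err:
            # Server unreachable or refusing: remember why, try the next one.
            last_error = err
    raise ConnectionError(f"no server reachable: {last_error}")


def report_status(servers, message):
    """Send a status message (bytes) to the first reachable server.

    Returns the address of the server that received the message.
    """
    sock, address = connect_first_available(servers)
    with sock:
        sock.sendall(message)
    return address
```

A finishing job would call something like report_status(SERVERS, b"job 1 finished\n"). Note that this only needs one port per server and no daemon on the client, which is exactly why it is limited to a fixed list: a server that appears later cannot announce itself to the clients.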