On Thu, Oct 29, 2015 at 11:52 AM, Chris Angelico <ros...@gmail.com> wrote: > On Thu, Oct 29, 2015 at 9:35 PM, Marc Aymerich <glicer...@gmail.com> wrote: >> 1) Each node on the cluster needs to keep track of *all* the changes >> that ever ocurred. So far, each node is storing each change as >> individual lines on a file (the "historical state log" I was referring >> to, the concept is very similar to the bitcoin blockchain) >> >> 2) The main communication channel is driven by a UDP gossip protocol. >> From the performance perspective, it makes a huge difference if the >> whole log entry fits into the UDP payload (512B), otherwise the log >> entry has to be transferred by other means. Because config files are >> mostly text, almost every single one of them can fit into a UDP >> packet, if properly compressed. >> >> After reading your replies I'm concluding that >> >> 1) I should use the most space-efficient encoding *only* for >> transferring the log entry, just lzma compress it. >> 2) I should use the most readable one for storing the block on the log >> file. Leave metadata as text and compress+base64 the "actual file >> content" so it fits in an space-less ascii block, something like: > > I agree with those conclusions. If anything goes wrong, you have the > tidy log in a form that's easily dug into, and then compression is > used for transmission only. > > A couple of points of interest, though: > > 1) Conflicts - since you lack any concept of central authority, > there's the possibility that two peers will simultaneously make > incompatible changes, and then begin propagating them through the > farm. What happens when a node receives a change it can't apply?
Yes, my solution is very similar to the bitcoin blockchain double spending solution [1]. I have one blockchain for every path, all log entries contain the hash of its ancestor, when more than one log entry has the same ancestor we have a conflict (branching). Solving this conflict just means that all nodes on the network should agree on choosing the same branch. Bitcoin uses proof-of-work: the longest chain (more CPU power) is always chosen. My nodes will chose the branch with: 1) more log entries with higher authority signatures (allowing to support key revoking)** 2) if equal, the one with more different valid signatures (more participants say its the good one) 3) if equal, higher hash (arbitrary but it has to be deterministic) :P Notice that the file system is eventual consistent, the branch that is valid now, may not be tomorrow. **My solution for decentralized authority is having a .keys file on each directory containing the keys that are authorized to write in it, higher authority just means that a log entry signature belongs to a key that appears higher in the file system hierarchy (kind of proof-of-stake, the branch with higher authority wins so key revoking is possible) [1] https://en.bitcoin.it/wiki/How_bitcoin_works#Double_spending > 2) UDP is unreliable. What happens if a node misses out on a change? > Can it figure out that it's missed something, and go ask? I'm still pending to read some papers on the matter (just starting), but as far as I can tell gossip protocols take a probabilistic approach when disseminating information, like the spread of a disease, one particular node is exposed multiple times to the same messages by different nodes, making gossip protocols very resilient to packet loss. Also when a node recovers from failure (or rejoins the cluster after a network partition) I perform a full state sync, but I still don't know if this is sufficient or I'll need to do periodic syncs, still learning. Btw, I'm using serf[1] which in turn uses SWIM[2], and there are a lot of other interesting papers, just didn't read them, yet. [1] https://www.serfdom.io/docs/internals/gossip.html [2] https://www.cs.cornell.edu/~asdas/research/dsn02-swim.pdf -- Marc -- https://mail.python.org/mailman/listinfo/python-list