Re: Most space-efficient way to store log entries

Marc Aymerich Thu, 29 Oct 2015 04:59:43 -0700

On Thu, Oct 29, 2015 at 11:52 AM, Chris Angelico <ros...@gmail.com> wrote:
> On Thu, Oct 29, 2015 at 9:35 PM, Marc Aymerich <glicer...@gmail.com> wrote:
>> 1) Each node on the cluster needs to keep track of *all* the changes
>> that ever ocurred. So far, each node is storing each change as
>> individual lines on a file (the "historical state log" I was referring
>> to, the concept is very similar to the bitcoin blockchain)
>>
>> 2) The main communication channel is driven by a UDP gossip protocol.
>> From the performance perspective, it makes a huge difference if the
>> whole log entry fits into the UDP payload (512B), otherwise the log
>> entry has to be transferred by other means. Because config files are
>> mostly text, almost every single one of them can fit into a UDP
>> packet, if properly compressed.
>>
>> After reading your replies I'm concluding that
>>
>> 1) I should use the most space-efficient encoding *only* for
>> transferring the log entry, just lzma compress it.
>> 2) I should use the most readable one for storing the block on the log
>> file. Leave metadata as text and compress+base64 the "actual file
>> content" so it fits in an space-less ascii block, something like:
>
> I agree with those conclusions. If anything goes wrong, you have the
> tidy log in a form that's easily dug into, and then compression is
> used for transmission only.
>
> A couple of points of interest, though:
>
> 1) Conflicts - since you lack any concept of central authority,
> there's the possibility that two peers will simultaneously make
> incompatible changes, and then begin propagating them through the
> farm. What happens when a node receives a change it can't apply?


Yes, my solution is very similar to the bitcoin blockchain double
spending solution [1]. I have one blockchain for every path, all log
entries contain the hash of its ancestor, when more than one log entry
has the same ancestor we have a conflict (branching). Solving this
conflict just means that all nodes on the network should agree on
choosing the same branch. Bitcoin uses proof-of-work: the longest
chain (more CPU power) is always chosen. My nodes will chose the
branch with:
  1) more log entries with higher authority signatures (allowing to
support key revoking)**
  2) if equal, the one with more different valid signatures (more
participants say its the good one)
  3) if equal, higher hash (arbitrary but it has to be deterministic) :P

Notice that the file system is eventual consistent, the branch that is
valid now, may not be tomorrow.

**My solution for decentralized authority is having a .keys file on
each directory containing the keys that are authorized to write in it,
higher authority just means that a log entry signature belongs to a
key that appears higher in the file system hierarchy (kind of
proof-of-stake, the branch with higher authority wins so key revoking
is possible)

[1] https://en.bitcoin.it/wiki/How_bitcoin_works#Double_spending

> 2) UDP is unreliable. What happens if a node misses out on a change?
> Can it figure out that it's missed something, and go ask?

I'm still pending to read some papers on the matter (just starting),
but as far as I can tell gossip protocols take a probabilistic
approach when disseminating information, like the spread of a disease,
one particular node is exposed multiple times to the same messages by
different nodes, making gossip protocols very resilient to packet
loss. Also when a node recovers from failure (or rejoins the cluster
after a network partition) I perform a full state sync, but I still
don't know if this is sufficient or I'll need to do periodic syncs,
still learning. Btw, I'm using serf[1] which in turn uses SWIM[2], and
there are a lot of other interesting papers, just didn't read them,
yet.

[1] https://www.serfdom.io/docs/internals/gossip.html
[2] https://www.cs.cornell.edu/~asdas/research/dsn02-swim.pdf



-- 
Marc
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Most space-efficient way to store log entries

Reply via email to