Leveled and Anti-Entropy

2017-07-21 Thread Martin Sumner
I've added some anti-entropy features to Leveled (the pure-Erlang KV store
designed as a Riak backend).  These features are in part an experiment in
how to approach both anti-entropy and full-sync multi-data centre
replication in the future.

There's a long write-up, including some history of AAE in Riak:

https://github.com/martinsumner/leveled/blob/master/docs/ANTI_ENTROPY.md

In summary, Riak's current AAE is based on cryptographically strong Merkle
trees, and this experiment is based on removing that cryptographic strength,
as it isn't relevant to the context in which it is used.  Instead Leveled now
has Merkle trees that can be merged and can also be built incrementally (i.e.
built key by key even when the keys are not in segment order).
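
To illustrate why dropping the cryptographic requirement makes this possible,
here is a minimal sketch in Python.  It is not Leveled's code: it assumes each
segment simply holds the XOR of the hashes of its entries, and uses a flat list
of segments in place of a tree.  The names SEGMENTS, h32, new_tree, add and
merge are all hypothetical.

```python
import hashlib

SEGMENTS = 16  # hypothetical segment count, kept tiny for illustration


def h32(text: str) -> int:
    """32-bit hash; truncated SHA-256 is used purely for convenience."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")


def new_tree() -> list:
    return [0] * SEGMENTS


def add(tree: list, key: str, value: str) -> None:
    """Fold one key/value into its segment; XOR makes the order irrelevant."""
    tree[h32(key) % SEGMENTS] ^= h32(key + "\x00" + value)


def merge(a: list, b: list) -> list:
    """Combine two trees that were built over disjoint sets of keys."""
    return [x ^ y for x, y in zip(a, b)]


items = [(f"key{i}", f"value{i}") for i in range(100)]

# Build the same tree twice, adding the keys in opposite orders.
forward, backward = new_tree(), new_tree()
for k, v in items:
    add(forward, k, v)
for k, v in reversed(items):
    add(backward, k, v)

# Build two partial trees over disjoint halves of the keys.
left, right = new_tree(), new_tree()
for k, v in items[:40]:
    add(left, k, v)
for k, v in items[40:]:
    add(right, k, v)

print(forward == backward)            # insertion order does not matter
print(merge(left, right) == forward)  # merged halves match the whole
```

Because XOR is commutative and associative, neither the insertion order nor
the way the keys are split between trees changes the result, which is what a
conventional hash-of-child-hashes Merkle tree cannot offer.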

Using these new trees (named TicTac trees to fit into Leveled's terrible
naming convention), we can build AAE trees incrementally in folds, and hence
at a lower cost, and also merge trees across independent stores.  In the
future, trees can be built from folds using Riak coverage queries, across
either indexes or objects in the store - and compared between different
database clusters even where those clusters are partitioned differently,
e.g. with different ring-sizes.
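
As a rough sketch of that last point (again in Python, and again an
illustration under assumptions rather than anything taken from Leveled or
Riak): two toy clusters hold the same data split over 4 and 8 partitions, each
partition builds its own tree, and the per-partition trees are merged before
comparison.  The XOR-per-segment scheme and the names partition_trees,
merge_all and differing_segments are hypothetical.

```python
import hashlib
from functools import reduce
from operator import xor

SEGMENTS = 8  # hypothetical segment count


def h32(text: str) -> int:
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")


def partition_trees(data: dict, ring_size: int) -> list:
    """Build one list of segment hashes per partition of a toy cluster."""
    trees = [[0] * SEGMENTS for _ in range(ring_size)]
    for key, value in data.items():
        partition = h32("ring:" + key) % ring_size
        trees[partition][h32(key) % SEGMENTS] ^= h32(key + "\x00" + value)
    return trees


def merge_all(trees: list) -> list:
    """Merge per-partition trees into a single cluster-wide tree."""
    return [reduce(xor, column) for column in zip(*trees)]


def differing_segments(a: list, b: list) -> list:
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


data = {f"k{i}": f"v{i}" for i in range(50)}
stale = dict(data, k7="old")  # the second cluster missed an update to k7

cluster_a = merge_all(partition_trees(data, ring_size=4))
cluster_b = merge_all(partition_trees(data, ring_size=8))
cluster_c = merge_all(partition_trees(stale, ring_size=8))

print(differing_segments(cluster_a, cluster_b))  # same data: no segments differ
print(differing_segments(cluster_a, cluster_c))  # only the segment holding k7
```

The segment a key falls into depends only on the key, not on the partition
that owns it, so the merged trees are comparable whatever the ring-size.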

The expectation is that there will be more flexibility in what we can
choose to compare at run time - not just whether the objects are consistent,
but also whether the indexes are consistent.  Also, once freed from partition
constraints, there will be more flexibility in what we can compare
against - e.g. making it easier to compare with a different database.

Coupled with this there's a demonstration of using temporary indexes in
Leveled (index entries that auto-expire after a TTL), and we've shown how
this can be used with tree-creating folds to compare recent changes between
stores at a lower cost than comparing the whole database state, with the
added advantage that the long-term footprint of the database is not
extended by maintaining a separate copy of all the keys and hashes.
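
The idea can be sketched as follows.  This is a toy Python model and not the
Leveled implementation: RecentChangeIndex, TTL_SECONDS, put, expire and
fold_tree are hypothetical names, time is passed in explicitly, and the tree is
the same assumed XOR-per-segment list.

```python
import hashlib

SEGMENTS = 8          # hypothetical segment count
TTL_SECONDS = 3600    # hypothetical time-to-live for index entries


def h32(text: str) -> int:
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")


class RecentChangeIndex:
    """Toy stand-in for a temporary index of recently modified keys."""

    def __init__(self):
        self.entries = {}  # key -> (modified_at, hash of key and value)

    def put(self, key, value, now):
        self.entries[key] = (now, h32(key + "\x00" + value))

    def expire(self, now):
        """Drop entries older than the TTL, keeping the footprint bounded."""
        self.entries = {k: e for k, e in self.entries.items()
                        if now - e[0] < TTL_SECONDS}

    def fold_tree(self, start, end):
        """Build a tree over only the keys modified in [start, end)."""
        tree = [0] * SEGMENTS
        for key, (modified_at, value_hash) in self.entries.items():
            if start <= modified_at < end:
                tree[h32(key) % SEGMENTS] ^= value_hash
        return tree


store_a, store_b = RecentChangeIndex(), RecentChangeIndex()
for index in (store_a, store_b):
    index.put("k1", "v1", now=100)
    index.put("k2", "v2", now=200)
store_a.put("k3", "v3", now=250)  # a change that never reached store_b

in_sync = store_a.fold_tree(0, 240) == store_b.fold_tree(0, 240)
diverged = store_a.fold_tree(0, 300) != store_b.fold_tree(0, 300)

store_a.expire(now=250 + TTL_SECONDS)
remaining = len(store_a.entries)
```

Only keys changed inside the chosen window are folded over, and expired
entries disappear from the index, so nothing has to be kept for the full key
space.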

Alongside this, we now have some other work ongoing in the space of
replication and anti-entropy:

- @russelldb is continuing to test and improve his open source real-time
replication solution (rabl) which uses RabbitMQ.  He's hoping to be able to
talk further on progress with this by the end of August.
- I'm working on implementing in riak_core a core_node_worker_pool, which
is intended to complement the core_vnode_worker_pool but allow for coverage
queries where snapshots are taken on a covering set of vnodes, but folds
are then scheduled to run one-at-a-time on each node.  This can then be
used to regulate the impact of anti-entropy folds.
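
The scheduling idea in the second item can be sketched in Python as below.
This is only an analogy and says nothing about how riak_core implements its
worker pools: NodeWorkerPools, submit_fold, vnode_owner and vnode_data are
hypothetical, a dict copy stands in for a store snapshot, and a single-worker
thread pool per node stands in for the node-level queue.

```python
from concurrent.futures import ThreadPoolExecutor


class NodeWorkerPools:
    """One single-worker pool per node, so folds on a node run one at a time."""

    def __init__(self, nodes):
        self.pools = {node: ThreadPoolExecutor(max_workers=1) for node in nodes}

    def submit_fold(self, node, fold_fun, snapshot):
        # The snapshot is taken by the caller before queueing, so the fold sees
        # the store as it was at request time however long it waits its turn.
        return self.pools[node].submit(fold_fun, snapshot)

    def shutdown(self):
        for pool in self.pools.values():
            pool.shutdown(wait=True)


# Hypothetical layout: four vnodes spread over two nodes.
vnode_owner = {"vnode0": "node_a", "vnode1": "node_a",
               "vnode2": "node_b", "vnode3": "node_b"}
vnode_data = {v: {f"{v}-key{i}": i for i in range(10)} for v in vnode_owner}

pools = NodeWorkerPools(set(vnode_owner.values()))
futures = []
for vnode, node in vnode_owner.items():
    snapshot = dict(vnode_data[vnode])  # copy stands in for a store snapshot
    futures.append(pools.submit_fold(node, len, snapshot))

# Each fold here just counts keys; results are gathered once all have run.
key_count = sum(f.result() for f in futures)
pools.shutdown()
```

All snapshots are taken up front, but each node works through its queue of
folds serially, which caps the load a large fold can place on any one node.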

Our current target is to have a release candidate of open-source
replication (both real-time and full-sync) by the end of September.  This
will initially be focused only on replication between two Riak clusters.

Regards

Martin (@masleeds)

P.S. Hopefully next Friday we should also be able to report back on the
improvements and test enhancements that followed on from the work on
riak_core claim.
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Leveled and Anti-Entropy

2017-07-21 Thread Heinz N. Gies
Have you taken a look at the changes here?

https://github.com/Kyorai/riak_core/pull/24

It pulls the AAE work for riak_kv into riak_core.






Re: Leveled and Anti-Entropy

2017-07-21 Thread Martin Sumner
Heinz,

No I haven't, but I will.

Thanks

Martin
