Hey James,

Massive respect to any mailing list response about performance that includes graphs. Thanks!
I plan to keep optimising right up until we release, so hopefully we can improve performance further. From the look of your graphs there is work still to be done. Are these graphs for Set or Map additions? Please keep us posted with your testing/assessment.

Cheers

Russell

On 1 Apr 2014, at 20:18, James Moore <james.mo...@tapjoy.com> wrote:

> Thanks Russell,
>
> I'm mainly observing this issue around the add operation on hashes when
> running an operation similar to this one:
>
> https://gist.github.com/russelldb/d330d796ca1b25d1879d
>
> The behavior I was seeing shows up in the following graphs. The horizontal
> axis is the iteration of the add operation, and the y axis is milliseconds:
>
> <image.png>
> <image.png>
>
> From that chart it looks as though the fix in compression has resolved the
> issue :) (and also, in hindsight, the results don't quite look O(n)).
>
> For larger add operations I'm still seeing a linear performance degradation,
> but add operations are still pretty quick through 10k operations:
>
> <image.png>
>
> In answer to your question on performance, I believe the switch up to pre20
> should resolve the poor add performance for us, and overall I've found it
> to be quite impressive so far.
>
> --james
>
> On Tue, Apr 1, 2014 at 2:21 AM, Russell Brown <russell.br...@me.com> wrote:
> Hey James,
>
> I haven't analysed the complexity of the data types. Off hand I know that
> operations on Maps, Sets, Counters etc. are not O(n). Merges sometimes will
> be, if every entry must be compared, and we're looking at ways to optimise
> this. The `value` operation on Maps and Sets must be O(n), since we have to
> derive the correct value for each entry. I'm not sure how we would optimise
> that.
>
> We've optimised the `context` for operations down to a single version vector
> (before, it was a binary of the entire Map or Set), which should have got us
> some performance improvements.
>
> Right now Sets and Maps still use orddict.
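[Editorial aside: the orddict Russell mentions is Erlang's sorted association list, which is why adds degrade linearly. As a language-neutral illustration, here is a minimal Python sketch of the same data structure; the class and method names are invented for this example, not riak_dt code.]

```python
import bisect

# Toy analogue of Erlang's orddict: a list of (key, value) pairs kept sorted
# by key. Each store must scan for the slot and shift later elements, so a
# single insert is O(n), and building an n-element Map or Set this way costs
# O(n^2) overall -- the linear per-add degradation seen in the graphs.
class SortedAssocList:
    def __init__(self):
        self.pairs = []

    def store(self, key, value):
        keys = [k for k, _ in self.pairs]       # O(n) scan
        i = bisect.bisect_left(keys, key)
        if i < len(self.pairs) and self.pairs[i][0] == key:
            self.pairs[i] = (key, value)        # update in place
        else:
            self.pairs.insert(i, (key, value))  # O(n) shift of the tail

d = SortedAssocList()
for n in [3, 1, 2, 1]:
    d.store(n, n * n)
print(d.pairs)  # [(1, 1), (2, 4), (3, 9)]
```

A tree- or hash-backed structure (such as the TreeMap port mentioned below) would make each store O(log n) or O(1) instead.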
> We have a round of performance and scalability testing starting, and
> hopefully some optimisations will come out of that.
>
> I think you're referring to Sean's port of Elixir's TreeMap to Erlang. We
> have not yet tested that code, nor have we integrated it with riak_dt. I
> don't know if anyone even plans to.
>
> Have you observed poor performance from the data types? The earlier releases'
> main performance issue was to/from binary encoding. We have since switched to
> Erlang's built-in t2b/b2t functions, with compression, and have found the
> performance to be acceptable.
>
> I guess the short answer is: we're working on it, but with the 2.0 release
> looming, we might not get all the way there in time.
>
> There is some cost to using CRDTs; I don't think it could be any other way,
> as there are no free lunches. My aim is to reduce that cost below the pain
> barrier of siblings and writing complex merge functions.
>
> Cheers
>
> Russell
>
> On 31 Mar 2014, at 23:06, James Moore <james.mo...@tapjoy.com> wrote:
>
> > Hey All,
> >
> > Has there been any progress on resolving the O(n) behavior of CRDT sets
> > and maps?
> >
> > Sean mentioned in a previous thread that there was potential for a fix to
> > resolve the poor performance of Erlang's orddict.
> >
> > Thanks!
> >
> > --James
> > _______________________________________________
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
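[Editorial aside: t2b/b2t above are Erlang's term_to_binary/binary_to_term; the `{compressed, Level}` option zlib-compresses the serialised payload, which is the "fix in compression" James credits. As a sketch of the same round-trip in Python, with pickle and zlib standing in for the Erlang external term format; the function names here are invented for the example.]

```python
import pickle
import zlib

# Rough analogue of Erlang's term_to_binary(Term, [{compressed, 6}]):
# serialise the term, then zlib-compress the payload.
def to_binary(term, compressed=True):
    data = pickle.dumps(term)
    return zlib.compress(data, 6) if compressed else data

# Analogue of binary_to_term/1: decompress, then deserialise.
def from_binary(blob, compressed=True):
    data = zlib.decompress(blob) if compressed else blob
    return pickle.loads(data)

big_set = set(range(10_000))
blob = to_binary(big_set)
assert from_binary(blob) == big_set  # round-trips losslessly
# Repetitive structure compresses well, so the stored blob shrinks:
print(len(blob) < len(to_binary(big_set, compressed=False)))  # True
```

Compression trades a little CPU on each encode/decode for much smaller objects on disk and on the wire, which is usually a win for large Sets and Maps.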