On Nov 19, 2010, at 2:15 PM, Parker Thompson wrote:

> I'm experimenting with Riak by trying to port a simple a/b testing framework
> that's currently SQL backed. Since I'm using Ripple/riak-client my code below
> is in Ruby/JS.
>
> The domain model is fairly simple. I have visitors, which get created for any
> user who hits the site, visitors see alternatives (currently these are
> ActiveRecord objects) and are tracked by creating experiences (the joining of
> an alternative ID and a visitor). Finally, as visitors do things we track
> events, which are distinguished from one another by their classes.

The first concept you'll have to give up with Riak is "join tables", since you
can't have indexes on them in the same way as you can with a relational DB. A
more natural model would be to have a "double" of the ActiveRecord object,
which has the same key/id and links to all visitors who viewed that
alternative. That is, you'd have another model (or maybe just an RObject,
depending on how you want to deal with it), like so:

class Riak::Alternative
  include Ripple::Document
  many :visitors, :class_name => "Riak::Visitor"
  property :alternative_id, Integer, :presence => true
  key_on :alternative_id
end

Some portions of your MapReduce query will then become simpler, some more
difficult. Below I'm using a technique I blogged about called "forwarding",
which puts the data you want to return at the end of the query in the keyData
for subsequent phases. In a relational DB you'd probably use a nested SELECT or
some crazy group by/having combination. The Riak version feels more like a
fanout (and double-back).

########
def visitors_who_shared
  Riak::MapReduce.new(Ripple.client).
    add("riak_alternatives", ar_id.to_s).
    link(:bucket => 'riak_visitors').
    map(link_to_events_forward_visitor).
    map(map_share_event_to_visitor).
    reduce(["riak_kv_mapreduce", "reduce_set_union"]).
    map(map_identity, :keep => true).
    run
end

# Inspect the links, select the ones that point to events, put the visitor's key as the keyData
# You could also put the whole object in the keyData, but this saves bandwidth and computation.
def link_to_events_forward_visitor
  <<-FUNCTION
    function(object, keyData, arg){
      return object.values[0].metadata.Links.reduce(function(acc, link){
        if(link[0] == "events")
          acc.push([link[0], link[1], object.key]);
        return acc;
      }, []);
    }
  FUNCTION
end

# If the data is a ShareEvent, map to the visitor who created it
def map_share_event_to_visitor
  <<-FUNCTION
    function(v, keyData){
      var data = JSON.parse(v.values[0].data);
      if(data._type == "Riak::ShareEvent"){
        return [["visitors", keyData]];
      } else {
        return [];
      }
    }
  FUNCTION
end

def map_identity
  <<-FUNCTION
    function(v){ return [v]; }
  FUNCTION
end
#########

The result of your visitors_who_shared method could then be used to vivify
Visitor objects (it's straightforward, but I'm not putting the code here).
Long term, you'll want to be creating your own Javascript built-in functions
instead of passing the source along with every query.

I've also only solved one issue with your schema above (denormalizing the
"experiences" into "alternatives"). Please ask again if you have other
questions/issues.

Sean Cribbs <s...@basho.com>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/
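
For the vivification step mentioned above, a minimal sketch might look like the
following. It assumes each MapReduce result arrives as a Hash with a "key"
entry (the parsed JSON of the object that map_identity returns) and that a
finder such as Riak::Visitor.find is available in your app. The names
visitor_keys, vivify_visitors, and the finder argument are hypothetical and
not part of Ripple or riak-client.

```ruby
# Sketch only: turns raw MapReduce results into visitor objects.
# Assumption: each result is a Hash with a "key" entry; results without
# one are skipped.
def visitor_keys(results)
  results.map { |obj| obj["key"] }.compact.uniq
end

# `finder` is a hypothetical callable that loads one visitor by key,
# for example lambda { |key| Riak::Visitor.find(key) } in a Ripple app.
# Keys for which the finder returns nil are dropped.
def vivify_visitors(results, finder)
  visitor_keys(results).map { |key| finder.call(key) }.compact
end
```

Usage would be roughly
vivify_visitors(visitors_who_shared, lambda { |key| Riak::Visitor.find(key) }).
Passing the finder in keeps the helper testable without a running Riak node.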