Thanks, Eric.  This makes sense.  Right now we're indeed handling the update on 
the client side.

Your point about considering normalization in cases where we have to duplicate 
a lot of data is well taken.  This actually applies to a couple of use cases 
for us already - in one, we keep only a single duplicate of the data (two 
copies total); in another, we could see as many as 10-50 copies.

For the first case, we can stick to client-side updates of both copies for 
now.  For the second case, if we actually see that many copies starting to 
crop up, we can certainly de-duplicate things a bit.
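
For reference, here's the shape of what we're doing now.  This is a minimal 
sketch in terms of the Erlang PB client (riakc) for concreteness; we actually 
go over HTTP, and the bucket and key names here are made up:

    -module(client_sync).
    -export([put_both/1]).

    %% Client-side sync: write the same payload under both keys.
    put_both(Payload) ->
        {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
        ok = riakc_pb_socket:put(Pid,
               riakc_obj:new(<<"docs">>, <<"a1">>, Payload, "application/json")),
        %% The second put is the "sync" step; if it fails, the copies drift.
        ok = riakc_pb_socket:put(Pid,
               riakc_obj:new(<<"docs">>, <<"a2">>, Payload, "application/json")),
        riakc_pb_socket:stop(Pid).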

In case we do go about experimenting with post-commit hooks... my understanding 
was that they can only be written in Erlang.  I'm not entirely sure how that 
squares with your mention of "the number of pooled connections in JavaScript." 
Forgive me if I'm missing something obvious there.  Is there some config 
setting I can look at to find out how many pooled connections we have 
available to handle those post-commit hooks?
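
For what it's worth, the closest thing I've turned up in app.config is the JS 
VM pool sizing in the riak_kv section.  I'm guessing hook_js_vm_count is the 
pool you mean (the values below are, as far as I can tell, the shipped 
defaults), but please correct me if not:

    %% app.config, riak_kv section (my guess at the relevant knobs)
    {riak_kv, [
        {map_js_vm_count, 8},     %% JS VMs for MapReduce map phases
        {reduce_js_vm_count, 6},  %% JS VMs for reduce phases
        {hook_js_vm_count, 2}     %% JS VMs reserved for JS hooks
    ]}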

-f

From: Eric Redmond [mailto:eredm...@basho.com]
Sent: Thursday, November 29, 2012 11:17 AM
To: Felix Terkhorn
Cc: riak-users@lists.basho.com
Subject: Re: Best practice -- duplicating and syncing objects

There's no general best practice for keeping denormalized data in sync, beyond 
the obvious case, which is to update all values through whatever client you use 
to update one. If your number of keys is small, this is not going to be a hard 
hit on your updates. If you have an unbounded number of keys, you may want to 
consider normalizing your data model a bit to reduce duplicate data.
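
If you go the normalization route, the shape is roughly this (a sketch only, 
using the Erlang client with made-up names): keep one canonical copy and make 
the other keys cheap pointers to it, so a write touches one key and reads pay 
one extra GET:

    -module(normalized).
    -export([write/1, read_alias/1]).

    %% One canonical copy; each alias stores only the canonical key.
    write(BigPayload) ->
        {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
        ok = riakc_pb_socket:put(Pid,
               riakc_obj:new(<<"docs">>, <<"canonical">>, BigPayload,
                             "application/json")),
        ok = riakc_pb_socket:put(Pid,
               riakc_obj:new(<<"docs">>, <<"alias-1">>, <<"canonical">>,
                             "text/plain")),
        riakc_pb_socket:stop(Pid).

    %% Readers dereference the pointer with a second GET.
    read_alias(Alias) ->
        {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
        {ok, Ptr} = riakc_pb_socket:get(Pid, <<"docs">>, Alias),
        {ok, Obj} = riakc_pb_socket:get(Pid, <<"docs">>,
                                        riakc_obj:get_value(Ptr)),
        riakc_pb_socket:stop(Pid),
        riakc_obj:get_value(Obj).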

Correct, you do not have to wait for a post-commit hook to fire (actually, you 
can't).

You could functionally update objects in a post-commit hook, though I don't 
know how commonly this is done. If the post-commit job is long-running, you 
might run out of pooled connections in JavaScript. You'd also have to be very 
careful to avoid your aforementioned "infinite loop": whether via link walking 
or post-commit hooks, you still run the risk of objects updating each other 
recursively.
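
If you do experiment, the skeleton looks something like this.  Treat it as a 
sketch: the module and bucket names are made up, and the riak:local_client/0 
internal API is what I believe hooks use for writes, so verify it against your 
version.  Installing the hook on the source bucket only, and writing copies 
into a hook-free bucket, is what breaks the recursion:

    -module(sync_copies).
    -export([postcommit/1]).

    %% Post-commit hooks receive the committed riak_object; the
    %% return value is ignored, so failures need their own logging.
    postcommit(Object) ->
        Key = riak_object:key(Object),
        Val = riak_object:get_value(Object),
        {ok, C} = riak:local_client(),
        %% "copies" has no post-commit hook, so this put can't loop.
        C:put(riak_object:new(<<"copies">>, Key, Val), 1).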

Eric

On Nov 29, 2012, at 7:53 AM, Felix Terkhorn 
<fterkh...@exacttarget.com> wrote:


Greetings!

In the event that we have several documents, [A1, A2, ..., An], which contain 
the same data accessed via different keys, what is the best practice for 
keeping the data in sync?

Also, do post-commit hooks fire after the client receives a successful 201 or 
200 status on a PUT?  That is to say, we don't have to wait for all post-commit 
hooks to fire in order for our client to receive an HTTP success status, right?

That's our assumption, and if it's true, we'd like to exploit it to keep the 
response time of the PUT low.  Basically, the client could PUT A1, and we could 
let Riak handle the necessary updates in a post-processing step.

We could keep the list [A1, A2, ..., An] somewhere else, and simply walk that 
list every time any document in the list is updated, excluding the document 
itself.  Is this a standard approach?
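
Something like this sketch is what we had in mind (Erlang client for 
concreteness, names made up; the manifest here is just a comma-separated list 
of keys):

    -module(manifest_walk).
    -export([sync_from/2]).

    %% After PUTting UpdatedKey, rewrite every other key listed in
    %% the manifest so all copies converge on Payload.
    sync_from(UpdatedKey, Payload) ->
        {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
        {ok, MObj} = riakc_pb_socket:get(Pid, <<"docs">>, <<"manifest">>),
        Keys = binary:split(riakc_obj:get_value(MObj), <<",">>, [global]),
        [ok = riakc_pb_socket:put(Pid,
                riakc_obj:new(<<"docs">>, K, Payload, "application/json"))
         || K <- Keys, K =/= UpdatedKey],
        riakc_pb_socket:stop(Pid).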

We thought of linking objects together, and having them update each other on 
post-commit, but that seems like it will bring us into infinite loop territory. 
:-D

Thanks,
Felix
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
