Re: Storing JSON via Erlang Client

2013-10-17 Thread Daniil Churikov
{ok, Worker} = riakc_pb_socket:start_link("my_riak_node_1", 8087),
%% Set the content type so other clients know the value is JSON.
Obj = riakc_obj:new(<<"my_bucket">>, <<"my_key">>, <<"{\"key\":\"val\"}">>,
<<"application/json">>),
ok = riakc_pb_socket:put(Worker, Obj),
ok = riakc_pb_socket:stop(Worker).
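
Reading the object back might look like this (a sketch; run it before the stop
above, while the connection is still open):

%% Fetch the stored object and extract the raw JSON payload.
{ok, Fetched} = riakc_pb_socket:get(Worker, <<"my_bucket">>, <<"my_key">>),
Json = riakc_obj:get_value(Fetched).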




--
View this message in context: 
http://riak-users.197444.n3.nabble.com/Storing-JSON-via-Erlang-Client-tp4029489p4029491.html
Sent from the Riak Users mailing list archive at Nabble.com.

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


riak (riak-cs) ring problem

2013-10-17 Thread Jordi Valverde
Hello,


this is my first post. I've been searching around and am having trouble right now.

I really apologize if this question seems newbie; I've only been playing with
Riak a little until now.


My problem is that today I tried to update Riak. I was working with these
versions:

ii  riak   1.4.0-1   amd64  
  Riak is a distributed data store
ii  riak-cs1.3.1-1   amd64  
  Riak CS

to update to:

ii  riak   1.4.2-1   amd64  
  Riak is a distributed data store
ii  riak-cs1.4.0-1   amd64  
  Riak CS


After upgrading to these versions, the main node, where I have my Stanchion
server and "master" node, crashed when I tried to start the updated node.

I tried downgrading the upgraded server, and it now starts normally, but I get
an error. I paused the node and in the shell got this:

root@rcs1:~# riak-admin ringready
FALSE Node 'riak@10.0.0.1' and 'riak@10.0.0.2' list different partition owners

Googling, I couldn't find info about this or what the next step is.

Could you help me out or point me where I could get info for solving this?


Thanks,

cheers,
Jordi.
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Read Before Writes on Distributed Counters

2013-10-17 Thread Sam Elliott
It is perfectly safe with Counters to "blindly" issue an update. Clients (for 
counters) should allow a way to blindly send updates.

You should only be aware that your updates are *not* idempotent - if you retry 
an update to a counter, both updates could be preserved.
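
For example, a blind increment from the Erlang client might look like this (a
sketch; counter_incr/4 and counter_val/3 are in the 1.4-era riak-erlang-client,
and the bucket, which must have allow_mult enabled, and key are illustrative):

%% Send the increment without fetching first; no object or vclock is needed.
{ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
ok = riakc_pb_socket:counter_incr(Pid, <<"counters">>, <<"page_views">>, 1),
%% Reading the value is a separate, optional operation.
{ok, Value} = riakc_pb_socket:counter_val(Pid, <<"counters">>, <<"page_views">>).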

Sam
-- 
Sam Elliott
Engineer
sam.elli...@basho.com
--


On Thursday, 17 October 2013 at 10:03AM, Weston Jossey wrote:

> In the context of using distributed counters (introduced in 1.4), is it 
> strictly necessary to perform a read prior to issuing a write for a given key? 
> A la, if I want to blindly increment a value by 1, regardless of what its 
> current value is, is it sufficient to issue the write without previously 
> having read the object?
> 
> I ask because looking at some of the implementations for counters in the open 
> source community, it's common to perform a read before a write, which impacts 
> performance ceilings on clusters with high volume reads / writes. I want to 
> verify before issuing some PRs that this is in fact safe behavior.
> 
> Thank you!
> -Wes Jossey
> ___
> riak-users mailing list
> riak-users@lists.basho.com (mailto:riak-users@lists.basho.com)
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com




___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Read Before Writes on Distributed Counters

2013-10-17 Thread Weston Jossey
In the context of using distributed counters (introduced in 1.4), is it 
strictly necessary to perform a read prior to issuing a write for a given key?  A 
la, if I want to blindly increment a value by 1, regardless of what its current 
value is, is it sufficient to issue the write without previously having read 
the object?

I ask because looking at some of the implementations for counters in the open 
source community, it's common to perform a read before a write, which impacts 
performance ceilings on clusters with high volume reads / writes.  I want to 
verify before issuing some PRs that this is in fact safe behavior.

Thank you!
-Wes Jossey
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Read Before Writes on Distributed Counters

2013-10-17 Thread Russell Brown
Hi Wes,

The client application does not need to perform a read before a write; the riak 
server must read from disk before updating the counter. Or at least it must 
with our current implementation.

What PRs did you have in mind? I'm curious.

Oh, it looks like Sam beat me to it…to elaborate on his "not idempotent" line, 
that means when riak tells you "error" for some counter increment, it may only 
be a partial failure, and re-running the operation may lead to over counting.

Cheers

Russell

On 17 Oct 2013, at 16:03, Weston Jossey  wrote:

> In the context of using distributed counters (introduced in 1.4), is it 
> strictly necessary to perform a read prior to issuing a write for a given key?  
> A la, if I want to blindly increment a value by 1, regardless of what its 
> current value is, is it sufficient to issue the write without previously 
> having read the object?
> 
> I ask because looking at some of the implementations for counters in the open 
> source community, it's common to perform a read before a write, which impacts 
> performance ceilings on clusters with high volume reads / writes.  I want to 
> verify before issuing some PRs that this is in fact safe behavior.
> 
> Thank you!
> -Wes Jossey
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Read Before Writes on Distributed Counters

2013-10-17 Thread Weston Jossey
Great everyone, thank you.

@Russell:  I specifically work with either Go (
https://github.com/tpjg/goriakpbc) or Ruby (basho client).  I haven't
tested the ruby client, but I'd assume it will perform the write without
the read (based on my reading of the code).  The Go library, on the other
hand, currently always performs a read prior to the write.  It's an easy
patch that I've already applied locally for benchmarking, I just didn't
want to submit the PR till I was sure this was the correct behavior.

Somewhat off topic, but I don't want to open up another thread if it's
unnecessary.  This question arose because I've been doing extensive
benchmarking around distributed counters.  Are there pre-existing
benchmarks out there that I can measure myself against?  I haven't stumbled
across many at this point, probably because of how new it is.

Cheers,
Wes


On Thu, Oct 17, 2013 at 10:21 AM, Russell Brown wrote:

> Hi Wes,
>
> The client application does not need to perform a read before a write, the
> riak server must read from disk before updating the counter. Or at least it
> must with our current implementation.
>
> What PRs did you have in mind? I'm curious.
>
> Oh, it looks like Sam beat me to it…to elaborate on his "not idempotent"
> line, that means when riak tells you "error" for some counter increment, it
> may only be a partial failure, and re-running the operation may lead to
> over counting.
>
> Cheers
>
> Russell
>
> On 17 Oct 2013, at 16:03, Weston Jossey  wrote:
>
> > In the context of using distributed counters (introduced in 1.4), is it
> strictly necessary to perform a read prior to issuing a write for a given
> key?  A la, if I want to blindly increment a value by 1, regardless of what
> its current value is, is it sufficient to issue the write without
> previously having read the object?
> >
> > I ask because looking at some of the implementations for counters in the
> open source community, it's common to perform a read before a write, which
> impacts performance ceilings on clusters with high volume reads / writes.
>  I want to verify before issuing some PRs that this is in fact safe
> behavior.
> >
> > Thank you!
> > -Wes Jossey
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Read Before Writes on Distributed Counters

2013-10-17 Thread Russell Brown
I have some from a while back, if I can find my graphs I'll put them up 
somewhere.

Cheers

Russell

On 17 Oct 2013, at 16:35, Weston Jossey  wrote:

> Great everyone, thank you.  
> 
> @Russell:  I specifically work with either Go 
> (https://github.com/tpjg/goriakpbc) or Ruby (basho client).  I haven't tested 
> the ruby client, but I'd assume it will perform the write without the read 
> (based on my reading of the code).  The Go library, on the other hand, 
> currently always performs a read prior to the write.  It's an easy patch that 
> I've already applied locally for benchmarking, I just didn't want to submit 
> the PR till I was sure this was the correct behavior.
> 
> Somewhat off topic, but I don't want to open up another thread if it's 
> unnecessary.  This question arose because I've been doing extensive 
> benchmarking around distributed counters.  Are there pre-existing benchmarks 
> out there that I can measure myself against?  I haven't stumbled across many 
> at this point, probably because of how new it is.
> 
> Cheers,
> Wes
> 
> 
> On Thu, Oct 17, 2013 at 10:21 AM, Russell Brown  wrote:
> Hi Wes,
> 
> The client application does not need to perform a read before a write, the 
> riak server must read from disk before updating the counter. Or at least it 
> must with our current implementation.
> 
> What PRs did you have in mind? I'm curious.
> 
> Oh, it looks like Sam beat me to it…to elaborate on his "not idempotent" 
> line, that means when riak tells you "error" for some counter increment, it 
> may only be a partial failure, and re-running the operation may lead to over 
> counting.
> 
> Cheers
> 
> Russell
> 
> On 17 Oct 2013, at 16:03, Weston Jossey  wrote:
> 
> > In the context of using distributed counters (introduced in 1.4), is it 
> > strictly necessary to perform a read prior to issuing a write for a given 
> > key?  A la, if I want to blindly increment a value by 1, regardless of what 
> > its current value is, is it sufficient to issue the write without 
> > previously having read the object?
> >
> > I ask because looking at some of the implementations for counters in the 
> > open source community, it's common to perform a read before a write, which 
> > impacts performance ceilings on clusters with high volume reads / writes.  
> > I want to verify before issuing some PRs that this is in fact safe behavior.
> >
> > Thank you!
> > -Wes Jossey
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: vm.swappiness ?

2013-10-17 Thread Alex Rice
Aha! thanks for that tip


On Wed, Oct 16, 2013 at 3:05 PM, Jared Morrow  wrote:
> It is checked by 'riak-admin diag' if you run that to check your system.
>
> -Jared
>
>
>
>
> On Wed, Oct 16, 2013 at 2:33 PM, Alex Rice  wrote:
>>
>> Thanks for confirming, Matthew! That might be a good check for the
>> startup script, much like I think it currently complains about ulimit
>> is not set to the Max.
>>
>> On Wed, Oct 16, 2013 at 2:31 PM, Matthew Von-Maszewski
>>  wrote:
>> > recommended value for Riak is zero.
>> >
>> > On Oct 16, 2013, at 4:28 PM, Alex Rice  wrote:
>> >
>> >> Just an informal poll- what is preferred for Linux vm.swappiness
>> >> setting for Riak in a *cloud* environment? The default is 60 - a lot
>> >> of stuff gets swapped out. This is good for OS disk cache.
>> >>
>> >> I am thinking vm.swappiness = 0
>> >>
>> >> - Avoid slow & potentially costly I/O operations in the virtualized
>> >> environment
>> >> - Make it easier to see if bitcask is running low on memory, as soon
>> >> as the Swap partition has been touched at all (1st out-of-memory
>> >> condition)
>> >> - Disk cache won't be as effective.
>> >> - I just like seeing Swap: 0% when I log in :0
>> >>
>> >> Does this setting even matter at all, practically speaking?
>> >>
>> >> ___
>> >> riak-users mailing list
>> >> riak-users@lists.basho.com
>> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> >
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Read Before Writes on Distributed Counters

2013-10-17 Thread Daniil Churikov
Correct me if I'm wrong, but when you blindly do an update without a previous
read, you create a sibling, which should be resolved on read. If you make a lot
of increments to a counter and rarely read it, that will lead to sibling
explosion.

I am not familiar with the new counter datatypes, so I am curious.



--
View this message in context: 
http://riak-users.197444.n3.nabble.com/Read-Before-Writes-on-Distributed-Counters-tp4029492p4029498.html
Sent from the Riak Users mailing list archive at Nabble.com.

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Read Before Writes on Distributed Counters

2013-10-17 Thread Jeremiah Peschka
When you 'update' a counter, you send in an increment operation. That's
added to an internal list in Riak. The operations are then zipped up to
provide the correct counter value on read. The worst that you'll do is add
a large(ish) number of values to the op list inside Riak.

Siblings will be created, but they will not be visible to the end user who
is reading from the counter.

Check out this demo of the new counter types from Sean Cribbs:
https://vimeo.com/43903960

---
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop


On Thu, Oct 17, 2013 at 9:55 AM, Daniil Churikov  wrote:

> Correct me if I'm wrong, but when you blindly do an update without a previous
> read, you create a sibling, which should be resolved on read. If you make a
> lot of increments to a counter and rarely read it, that will lead to sibling
> explosion.
>
> I am not familiar with the new counter datatypes, so I am curious.
>
>
>
> --
> View this message in context:
> http://riak-users.197444.n3.nabble.com/Read-Before-Writes-on-Distributed-Counters-tp4029492p4029498.html
> Sent from the Riak Users mailing list archive at Nabble.com.
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Read Before Writes on Distributed Counters

2013-10-17 Thread Sean Cribbs
The reasons counters are interesting are:

1) You send an "increment" or "decrement" operation rather than the new
value.
2) Any conflicts that were created by that operation get resolved
automatically.

So, no, sibling explosion will not occur.


On Thu, Oct 17, 2013 at 3:55 PM, Daniil Churikov  wrote:

> Correct me if I'm wrong, but when you blindly do an update without a previous
> read, you create a sibling, which should be resolved on read. If you make a
> lot of increments to a counter and rarely read it, that will lead to sibling
> explosion.
>
> I am not familiar with the new counter datatypes, so I am curious.
>
>
>
> --
> View this message in context:
> http://riak-users.197444.n3.nabble.com/Read-Before-Writes-on-Distributed-Counters-tp4029492p4029498.html
> Sent from the Riak Users mailing list archive at Nabble.com.
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>



-- 
Sean Cribbs 
Software Engineer
Basho Technologies, Inc.
http://basho.com/
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Read Before Writes on Distributed Counters

2013-10-17 Thread Sean Cribbs
Since Jeremiah loves it when I'm pedantic, it bears mentioning that the
list of operations is rolled up immediately (not kept around), grouping by
which partition took the increment. So if I increment by 2 and then by 50,
and the increment goes to different replicas, my counter will look like
[{a, 2}, {b, 50}], for a sum of 52.
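
A toy illustration of that arithmetic (not Riak's actual data structure; plain
Erlang for the merge-then-sum Sean describes):

%% Each replica contributes one {Actor, Count} entry; merging keeps the
%% maximum per actor, and the counter's value is the sum of all entries.
Merge = fun(A, B) -> orddict:merge(fun(_Actor, X, Y) -> max(X, Y) end, A, B) end,
Value = fun(Counter) -> lists:sum([C || {_Actor, C} <- Counter]) end,
Value(Merge([{a, 2}], [{b, 50}])).  %% => 52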


On Thu, Oct 17, 2013 at 4:21 PM, Jeremiah Peschka <
jeremiah.pesc...@gmail.com> wrote:

> When you 'update' a counter, you send in an increment operation. That's
> added to an internal list in Riak. The operations are then zipped up to
> provide the correct counter value on read. The worst that you'll do is add
> a large(ish) number of values to the op list inside Riak.
>
> Siblings will be created, but they will not be visible to the end user who
> is reading from the counter.
>
> Check out this demo of the new counter types from Sean Cribbs:
> https://vimeo.com/43903960
>
> ---
> Jeremiah Peschka - Founder, Brent Ozar Unlimited
> MCITP: SQL Server 2008, MVP
> Cloudera Certified Developer for Apache Hadoop
>
>
> On Thu, Oct 17, 2013 at 9:55 AM, Daniil Churikov  wrote:
>
>> Correct me if I'm wrong, but when you blindly do an update without a previous
>> read, you create a sibling, which should be resolved on read. If you make a
>> lot of increments to a counter and rarely read it, that will lead to sibling
>> explosion.
>>
>> I am not familiar with the new counter datatypes, so I am curious.
>>
>>
>>
>> --
>> View this message in context:
>> http://riak-users.197444.n3.nabble.com/Read-Before-Writes-on-Distributed-Counters-tp4029492p4029498.html
>> Sent from the Riak Users mailing list archive at Nabble.com.
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>


-- 
Sean Cribbs 
Software Engineer
Basho Technologies, Inc.
http://basho.com/
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Read Before Writes on Distributed Counters

2013-10-17 Thread Jeremiah Peschka
That's why I linked to the video - it's 60 minutes of Cribbs™ brand
pedantry.

---
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop


On Thu, Oct 17, 2013 at 10:45 AM, Sean Cribbs  wrote:

> Since Jeremiah loves it when I'm pedantic, it bears mentioning that the
> list of operations is rolled up immediately (not kept around), grouping by
> which partition took the increment. So if I increment by 2 and then by 50,
> and the increment goes to different replicas, my counter will look like
> [{a, 2}, {b, 50}], for a sum of 52.
>
>
> On Thu, Oct 17, 2013 at 4:21 PM, Jeremiah Peschka <
> jeremiah.pesc...@gmail.com> wrote:
>
>> When you 'update' a counter, you send in an increment operation. That's
>> added to an internal list in Riak. The operations are then zipped up to
>> provide the correct counter value on read. The worst that you'll do is add
>> a large(ish) number of values to the op list inside Riak.
>>
>> Siblings will be created, but they will not be visible to the end user
>> who is reading from the counter.
>>
>> Check out this demo of the new counter types from Sean Cribbs:
>> https://vimeo.com/43903960
>>
>> ---
>> Jeremiah Peschka - Founder, Brent Ozar Unlimited
>> MCITP: SQL Server 2008, MVP
>> Cloudera Certified Developer for Apache Hadoop
>>
>>
>> On Thu, Oct 17, 2013 at 9:55 AM, Daniil Churikov wrote:
>>
>>> Correct me if I'm wrong, but when you blindly do an update without a
>>> previous read, you create a sibling, which should be resolved on read. If
>>> you make a lot of increments to a counter and rarely read it, that will
>>> lead to sibling explosion.
>>>
>>> I am not familiar with the new counter datatypes, so I am curious.
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://riak-users.197444.n3.nabble.com/Read-Before-Writes-on-Distributed-Counters-tp4029492p4029498.html
>>> Sent from the Riak Users mailing list archive at Nabble.com.
>>>
>>> ___
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>
>
> --
> Sean Cribbs 
> Software Engineer
> Basho Technologies, Inc.
> http://basho.com/
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Read Before Writes on Distributed Counters

2013-10-17 Thread Russell Brown
Hi Daniil,

On 17 Oct 2013, at 16:55, Daniil Churikov  wrote:

> Correct me if I'm wrong, but when you blindly do an update without a previous
> read, you create a sibling, which should be resolved on read. If you make a
> lot of increments to a counter and rarely read it, that will lead to sibling
> explosion.
> 
> I am not familiar with the new counter datatypes, so I am curious.

The counters in riak 1.4 are the first of a few data types we are building. The 
main change, conceptually, is that Riak knows about the type of the data you're 
storing in a counter.
Riak already detects conflicting writes (writes that are causally concurrent), 
but doesn't know how to merge your data to a single value; instead it presents 
all the conflicting values to the client to resolve. However, in the case of a 
counter Riak _does_ know the meaning of your data and we're using a data type 
that can automatically merge to a correct value.

There is code running on Riak that will automatically merge counter siblings on 
write. And if siblings are detected on read, they are merged so that a single 
value is presented to the client application.

I think Sean Cribbs has replied faster than me this time, and he's hinted at 
how the data type is implemented.

Cheers

Russell

> 
> 
> 
> --
> View this message in context: 
> http://riak-users.197444.n3.nabble.com/Read-Before-Writes-on-Distributed-Counters-tp4029492p4029498.html
> Sent from the Riak Users mailing list archive at Nabble.com.
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Storing JSON via Erlang Client

2013-10-17 Thread Eric Redmond
For building json you should also check out a tool like mochijson2.
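
A sketch of that approach, assuming mochijson2 is on the code path (bucket and
key as in the original example):

%% Build the JSON from an Erlang term instead of hand-escaping a binary.
Json = iolist_to_binary(mochijson2:encode({struct, [{<<"key">>, <<"val">>}]})),
Obj = riakc_obj:new(<<"my_bucket">>, <<"my_key">>, Json, <<"application/json">>).
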
On Oct 17, 2013 6:51 AM, "Daniil Churikov"  wrote:

> {ok, Worker} = riakc_pb_socket:start_link("my_riak_node_1", 8087),
> Obj = riakc_obj:new(<<"my_bucket">>, <<"my_key">>,
> <<"{\"key\":\"val\"}">>,
> <<"application/json">>),
> ok = riakc_pb_socket:put(Worker, Obj),
> ok = riakc_pb_socket:stop(Worker).
>
>
>
>
> --
> View this message in context:
> http://riak-users.197444.n3.nabble.com/Storing-JSON-via-Erlang-Client-tp4029489p4029491.html
> Sent from the Riak Users mailing list archive at Nabble.com.
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Storing JSON via Erlang Client

2013-10-17 Thread Christopher Meiklejohn
I'd also recommend jsx [1], which doesn't require wrapping your objects in 
struct tuples. 

[1] https://github.com/talentdeficit/jsx
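
With jsx the same payload can be built from a bare proplist (a sketch; assumes
jsx is on the code path):

%% No {struct, ...} wrapper needed.
Json = jsx:encode([{<<"key">>, <<"val">>}]).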

- Chris 

-- 
Christopher Meiklejohn
Software Engineer
Basho Technologies, Inc.



On Thursday, October 17, 2013 at 12:08 PM, Eric Redmond wrote:

> For building json you should also check out a tool like mochijson2.
> On Oct 17, 2013 6:51 AM, "Daniil Churikov"  (mailto:ddo...@gmail.com)> wrote:
> > {ok, Worker} = riakc_pb_socket:start_link("my_riak_node_1", 8087),
> > Obj = riakc_obj:new(<<"my_bucket">>, <<"my_key">>, <<"{\"key\":\"val\"}">>,
> > <<"application/json">>),
> > ok = riakc_pb_socket:put(Worker, Obj),
> > ok = riakc_pb_socket:stop(Worker).
> > 
> > 
> > 
> > 
> > --
> > View this message in context: 
> > http://riak-users.197444.n3.nabble.com/Storing-JSON-via-Erlang-Client-tp4029489p4029491.html
> > Sent from the Riak Users mailing list archive at Nabble.com 
> > (http://Nabble.com).
> > 
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com (mailto:riak-users@lists.basho.com)
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com (mailto:riak-users@lists.basho.com)
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com




___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Question about Hooks & Nodes

2013-10-17 Thread Tristan Foureur
Hi,

My question is simple, but really I cannot find a clear answer anywhere in
the documentation. I understand how a cluster works, and how a hook works,
but if you have a hook on a certain bucket and commit to a node on that
bucket, is the hook triggered only on that node, on all the nodes, or only on
the nodes that will actually host that key?

I ask this because I deployed my pre-commit hook to one node only, and then
sometimes the insert failed telling me my precommit hook was "undef", and it
was fixed by deploying the hook to all vnodes.

Thanks!
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Storing JSON via Erlang Client

2013-10-17 Thread Konstantin Kalin
I used mochijson2 and ejson. I found that ejson works faster since it's
built using NIFs. But both libraries wrap proplists in a tuple, so I
developed a few wrapper functions to manipulate fields.
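
That kind of wrapper might look like this (a sketch over mochijson2's
{struct, Proplist} representation; the function names are illustrative):

%% Read a field from a decoded JSON object.
get_field(Key, {struct, Props}) ->
    proplists:get_value(Key, Props).

%% Set (or replace) a field, returning the updated object.
set_field(Key, Value, {struct, Props}) ->
    {struct, lists:keystore(Key, 1, Props, {Key, Value})}.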

Thank you,
Konstantin.


On Thu, Oct 17, 2013 at 9:08 AM, Eric Redmond  wrote:

> For building json you should also check out a tool like mochijson2.
> On Oct 17, 2013 6:51 AM, "Daniil Churikov"  wrote:
>
>> {ok, Worker} = riakc_pb_socket:start_link("my_riak_node_1", 8087),
>> Obj = riakc_obj:new(<<"my_bucket">>, <<"my_key">>,
>> <<"{\"key\":\"val\"}">>,
>> <<"application/json">>),
>> ok = riakc_pb_socket:put(Worker, Obj),
>> ok = riakc_pb_socket:stop(Worker).
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://riak-users.197444.n3.nabble.com/Storing-JSON-via-Erlang-Client-tp4029489p4029491.html
>> Sent from the Riak Users mailing list archive at Nabble.com.
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Storing JSON via Erlang Client

2013-10-17 Thread Daniil Churikov
To Konstantin Kalin:
This is not a good place to start a discussion about NIFs, but check out
http://ferd.ca/rtb-where-erlang-blooms.html, especially the last passage.



--
View this message in context: 
http://riak-users.197444.n3.nabble.com/Storing-JSON-via-Erlang-Client-tp4029489p4029509.html
Sent from the Riak Users mailing list archive at Nabble.com.

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Question about Hooks & Nodes

2013-10-17 Thread Eric Redmond
Apologies that it's unclear, and I'll update the docs to correct this.

http://docs.basho.com/riak/latest/ops/advanced/install-custom-code/

When you install custom code, you must install that code on every node.

Eric

On Oct 17, 2013, at 9:17 AM, Tristan Foureur  wrote:

> Hi,
> 
> My question is simple, but really I cannot find a clear answer anywhere in 
> the documentation. I understand how a cluster works, and how a hook works, 
> but if you have a hook on a certain bucket and commit to a node on that 
> bucket, is the hook triggered only on that node, on all the nodes, or only on 
> the nodes that will actually host that key?
> 
> I ask this because I deployed my pre-commit hook to one node only, and then 
> sometimes the insert failed telling me my precommit hook was "undef", and it 
> was fixed by deploying the hook to all vnodes.
> 
> Thanks!
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Read Before Writes on Distributed Counters

2013-10-17 Thread Russell Brown
On 17 Oct 2013, at 17:21, Jeremiah Peschka  wrote:

> When you 'update' a counter, you send in an increment operation. That's added 
> to an internal list in Riak. The operations are then zipped up to provide the 
> correct counter value on read. The worst that you'll do is add a large(ish) 
> number of values to the op list inside Riak. 

Just to borrow some Cribbs-brand pedantry here: that isn't true. We read the 
data from disk, increment an entry in what is essentially a version vector, and 
write it back, (then replicate the result to N-1 vnodes.) The size of the 
counter depends on the number of actors that have incremented it (typically N) 
not the number of operations.

> 
> Siblings will be created, but they will not be visible to the end user who is 
> reading from the counter.

There won't be siblings on disk (we do create a temporary one in memory, does 
that count?) _unless_

1. you also write an object to that same key in a normal riak kv  way (don't do 
that)
2. AAE or MDC cause a sibling to be created (this is because we use the 
operation of incrementing a counter to identify a key as a counter; to the rest 
of riak it is just a riak object)

In that last case, an increment operation to the key will resolve the 
sibling(s).

Cheers

Russell

> 
> Check out this demo of the new counter types from Sean Cribbs: 
> https://vimeo.com/43903960
> 
> ---
> Jeremiah Peschka - Founder, Brent Ozar Unlimited
> MCITP: SQL Server 2008, MVP
> Cloudera Certified Developer for Apache Hadoop
> 
> 
> On Thu, Oct 17, 2013 at 9:55 AM, Daniil Churikov  wrote:
> Correct me if I'm wrong, but when you blindly do an update without a previous
> read, you create a sibling, which should be resolved on read. If you make a
> lot of increments to a counter and rarely read it, that will lead to sibling
> explosion.
> 
> I am not familiar with the new counter datatypes, so I am curious.
> 
> 
> 
> --
> View this message in context: 
> http://riak-users.197444.n3.nabble.com/Read-Before-Writes-on-Distributed-Counters-tp4029492p4029498.html
> Sent from the Riak Users mailing list archive at Nabble.com.
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Read Before Writes on Distributed Counters

2013-10-17 Thread wjossey
And, just to close the loop, I went ahead and patched the Go library to
support the above functionality.

Thanks for the help everyone.  



--
View this message in context: 
http://riak-users.197444.n3.nabble.com/Read-Before-Writes-on-Distributed-Counters-tp4029492p4029513.html
Sent from the Riak Users mailing list archive at Nabble.com.

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Question about Hooks & Nodes

2013-10-17 Thread Sam Elliott
On Thursday, 17 October 2013 at 12:45PM, Eric Redmond wrote:
> Apologies that it's unclear, and I'll update the docs to correct this.
> 
> http://docs.basho.com/riak/latest/ops/advanced/install-custom-code/
> 
> When you install custom code, you must install that code on every node.
> 
> Eric
> 
> On Oct 17, 2013, at 9:17 AM, Tristan Foureur  (mailto:e...@me.com)> wrote:
> > Hi,
> > 
> > My question is simple, but really I cannot find a clear answer anywhere in 
> > the documentation. I understand how a cluster works, and how a hook works, 
> > but if you have a hook on a certain bucket and commit to a node on that 
> > bucket, is the hook triggered only on that node, on all the nodes, or only 
> > on the nodes that will actually host that key?
The hook is only triggered on one node, the write coordinator. 

This is not always the exact same node that you sent the request to. 

Because you can't predict which node the coordinator will be on, the code must 
be loaded on every single node, as Eric pointed out.
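
For reference, a minimal precommit hook module might look like this (a sketch;
the module name and validation are illustrative, and the compiled beam must be
on every node's code path):

-module(my_precommit).
-export([precommit/1]).

%% A precommit hook receives the riak_object and returns it (possibly
%% modified) to allow the write, or {fail, Reason} to reject it.
precommit(Object) ->
    case riak_object:get_value(Object) of
        <<>> -> {fail, <<"empty value not allowed">>};
        _    -> Object
    end.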

Sam 
> > 
> > I ask this because I deployed my pre-commit hook to one node only, and then 
> > sometimes the insert failed telling me my precommit hook was "undef", and 
> > it was fixed by deploying the hook to all vnodes.
> > 
> > Thanks! ___
> > riak-users mailing list
> > riak-users@lists.basho.com (mailto:riak-users@lists.basho.com)
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com (mailto:riak-users@lists.basho.com)
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com




___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Getting a list of events using secondary index

2013-10-17 Thread Alex Robson
Working on an event-sourcing approach and would really appreciate some
advice.

1. Every event is tagged with a secondary index ("aggregate_id")
2. Every event's id is k-ordered (using a Flake compatible id generator)
3. Every aggregate has last_event_id

I would like the ability to select all event ids for a given aggregate_id that
occurred after that aggregate's last_event_id.

At the moment, I am using the secondary index only to get all event ids for
a particular aggregate. I then filter the id list so that I only have the
ids that occurred after last_event_id. I then issue a multi-key get and
retrieve the events I want.

Latency is fairly important in this case and so I wanted to see if there
were a better way (or if what I'm doing is just an awful misuse of Riak). I
got the impression from reading docs that map/reduce is not ideal for
real-time operations and intended more for batch stuff that runs out of
band. This series of operations would occur for every read of the aggregate.

Thanks for your help and thanks for Riak :)

Alex
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Getting a list of events using secondary index

2013-10-17 Thread Brady Wetherington
You don't say _how_ you get the last_event_id for a particular aggregate -
but presuming that's a relatively trivial operation - you could change
around your secondary index so you just have to make a range query.

Instead of - or possibly in addition to - the aggregate_id, you could have
an aggregate_id_event_id composite index, which contains something like
":". Then, for a query for aggregate_id "x", you
would do a range query for x: to x:

After that, you'd still have to do the multi-key get to grab your events.
But if the number of events per aggregate is 'very large' and the number of
events per aggregate greater than last_event_id is "small", then it could
help.
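
From the Erlang client, that range query might look like this (a sketch; the
index name and key values are illustrative, LastEventId is assumed bound, and
get_index_range is in the 1.4-era riak-erlang-client):

%% Keys under the composite "aggregate_event" binary index, from just after
%% the last seen event id to the end of aggregate "x"; the matching keys
%% come back inside the Results record.
StartKey = <<"x:", LastEventId/binary>>,
EndKey   = <<"x:~">>,  %% "~" sorts after the flake-style ids used here
{ok, Results} = riakc_pb_socket:get_index_range(Pid, <<"events">>,
    {binary_index, "aggregate_event"}, StartKey, EndKey).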

-B.


On Thu, Oct 17, 2013 at 3:53 PM, Alex Robson  wrote:

> Working on an event-sourcing approach and would really appreciate some
> advice.
>
> 1. Every event is tagged with a secondary index ("aggregate_id")
> 2. Every event's id is k-ordered (using a Flake compatible id generator)
> 3. Every aggregate has last_event_id
>
> I would like the ability to select all event ids for a given aggregate_id
> that occurred after that aggregate's last_event_id.
>
> At the moment, I am using the secondary index only to get all event ids
> for a particular aggregate. I then filter the id list so that I only have
> the ids that occurred after last_event_id. I then issue a multi-key get and
> retrieve the events I want.
>
> Latency is fairly important in this case and so I wanted to see if there
> were a better way (or if what I'm doing is just an awful misuse of Riak). I
> got the impression from reading docs that map/reduce is not ideal for
> real-time operations and intended more for batch stuff that runs out of
> band. This series of operations would occur for every read of the aggregate.
>
> Thanks for your help and thanks for Riak :)
>
> Alex
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Proxy Setting with Kerl?

2013-10-17 Thread Dave King
May have been a network issue, it's started working.

- Peace
Dave



On Thu, Oct 17, 2013 at 5:19 PM, Dave King  wrote:

> I'm trying to install erlang on a machine with Proxy values.  curl picks
> up these values.  Kerl on the other hand just seems to sit and wait.  Is
> there a way to pass proxy settings to Kerl?
>
> Is there a good page on Kerl?  Google doesn't seem to recognize it.
>
> Dave
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Proxy Setting with Kerl?

2013-10-17 Thread Dave King
I'm trying to install erlang on a machine with Proxy values.  curl picks up
these values.  Kerl on the other hand just seems to sit and wait.  Is there
a way to pass proxy settings to Kerl?

Is there a good page on Kerl?  Google doesn't seem to recognize it.

Dave
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Proxy Setting with Kerl?

2013-10-17 Thread Luke Bakken
Hi Dave,

Since kerl uses curl to download files, you should be able to set your
proxy this way and have it picked up:

$ export http_proxy=http://proxy.server.com:3128


--
Luke Bakken
CSE
lbak...@basho.com


On Thu, Oct 17, 2013 at 4:19 PM, Dave King  wrote:
> I'm trying to install erlang on a machine with Proxy values.  curl picks up
> these values.  Kerl on the other hand just seems to sit and wait.  Is there
> a way to pass proxy settings to Kerl?
>
> Is there a good page on Kerl?  Google doesn't seem to recognize it.
>
> Dave
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


C++ Not Found

2013-10-17 Thread Dave King
Trying to Build Riak on SUSE LE 11 SP2
make rel fails with
*
*
==> ebloom (compile)
Compiled src/ebloom.erl
Compiling /home/cstatmgrd/riak/riak-1.4.2/deps/ebloom/c_src/ebloom_nifs.cpp
sh: line 0: exec: c++: not found
ERROR: compile failed while processing
/home/cstatmgrd/riak/riak-1.4.2/deps/ebloom: rebar_abort
make: *** [compile] Error 1

gcc -v says

Using built-in specs.
Target: x86_64-suse-linux
Configured with: ../configure --prefix=/usr --infodir=/usr/share/info
--mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64
--enable-languages=c,c++,objc,fortran,obj-c++,java,ada
--enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.3
--enable-ssp --disable-libssp
--with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux'
--disable-libgcj --disable-libmudflap
--with-slibdir=/lib64 --with-system-zlib --enable-__cxa_atexit
--enable-libstdcxx-allocator=new --disable-libstdcxx-pch
--enable-version-specific-runtime-libs --program-suffix=-4.3
--enable-linux-futex --without-system-libunwind --with-cpu=generic
--build=x86_64-suse-linux
Thread model: posix
gcc version 4.3.4 [gcc-4_3-branch revision 152973] (SUSE Linux)


The important part would seem to be that

--enable-languages=c,c++,objc,fortran,obj-c++,java,ada

has c++ in the list.

I'm in no way a gcc expert, so where do I go from here?

- Peace
Dave
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Riak consumes too much memory

2013-10-17 Thread ZhouJianhua
Hi

I installed Riak v1.4.2 on Ubuntu 12.04 (64-bit, 4G RAM) with apt-get, ran it
with the default app.config but changed the backend to leveldb, and tested it
with https://github.com/tpjg/goriakpbc .

I just keep putting (key, value) pairs into a bucket; the memory keeps
increasing, and in the end Riak crashed because it could not allocate memory.

Should I change the configuration, or something else?
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak consumes too much memory

2013-10-17 Thread wjossey
4GB of memory is not very much, and you'll likely exhaust it quickly.  If
you're attempting to do development work with that little memory, you'll want
to lower the memory consumption for leveldb by tweaking its configuration
parameters (such as cache_size).

-Wes



--
View this message in context: 
http://riak-users.197444.n3.nabble.com/Riak-consumes-too-much-memory-tp4029521p4029522.html
Sent from the Riak Users mailing list archive at Nabble.com.

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak consumes too much memory

2013-10-17 Thread Matthew Von-Maszewski
Greetings,

The default config targets 5 servers and 16 to 32G of RAM.  Yes, the app.config 
needs some adjustment to achieve happiness for you:

- change ring_creation_size from 64 to 16 (remove the % from the beginning of 
the line)
- add this line before "{data_root, ...}" in the eleveldb section: 
"{max_open_files, 40}," (be sure the comma is at the end of this line); a sketch 
of both fragments in place follows.

Good luck,
Matthew


On Oct 17, 2013, at 8:23 PM, ZhouJianhua  wrote:

> Hi
> 
> I installed Riak v1.4.2 on Ubuntu 12.04 (64-bit, 4G RAM) with apt-get, ran it 
> with the default app.config but changed the backend to leveldb, and tested it 
> with https://github.com/tpjg/goriakpbc . 
> 
> I just keep putting (key, value) pairs into a bucket; the memory keeps 
> increasing, and in the end Riak crashed because it could not allocate memory. 
> 
> Should I change the configuration, or something else?
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak consumes too much memory

2013-10-17 Thread Eric Redmond
How many nodes are you running? You should aim for around 8-16 vnodes per 
server (must be a power of 2). So if you're running 5 nodes, you should be fine 
with 4GB since it'll be approx 12 vnodes per. If you're only running on 1 
server, you'll be running 64 vnodes on that single server (which is too many), 
in which case 4GB of RAM is not nearly enough.

Eric

On Oct 17, 2013, at 5:23 PM, ZhouJianhua  wrote:

> Hi
> 
> I installed Riak v1.4.2 on Ubuntu 12.04 (64-bit, 4G RAM) with apt-get, ran it 
> with the default app.config but changed the backend to leveldb, and tested it 
> with https://github.com/tpjg/goriakpbc . 
> 
> I just keep putting (key, value) pairs into a bucket; the memory keeps 
> increasing, and in the end Riak crashed because it could not allocate memory. 
> 
> Should I change the configuration, or something else?
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak consumes too much memory

2013-10-17 Thread darren
But why isn't riak smart enough to adjust itself to the available memory or 
lack thereof?

No serious enterprise technology should just consume everything and crash.


Sent from my Verizon Wireless 4G LTE Smartphone

 Original message 
From: Matthew Von-Maszewski  
Date: 10/17/2013  8:38 PM  (GMT-05:00) 
To: ZhouJianhua  
Cc: riak-users@lists.basho.com 
Subject: Re: Riak consumes too much memory 
 
Greetings,

The default config targets 5 servers and 16 to 32G of RAM.  Yes, the app.config 
needs some adjustment to achieve happiness for you:

- change ring_creation_size from 64 to 16 (remove the % from the beginning of 
the line)
> - add this line before "{data_root, ...}" in the eleveldb section: 
> "{max_open_files, 40}," (be sure the comma is at the end of this line).

Good luck,
Matthew


On Oct 17, 2013, at 8:23 PM, ZhouJianhua  wrote:

Hi

I installed Riak v1.4.2 on Ubuntu 12.04 (64-bit, 4G RAM) with apt-get, ran it
with the default app.config but changed the backend to leveldb, and tested it
with https://github.com/tpjg/goriakpbc .

I just keep putting (key, value) pairs into a bucket; the memory keeps
increasing, and in the end Riak crashed because it could not allocate memory.

Should I change the configuration, or something else?
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak consumes too much memory

2013-10-17 Thread wjossey
Hi Darren,
One can always configure swap to be turned on, which can prevent OOM
killing; however, the performance impact of doing this is detrimental and
not recommended.  I'd recommend Matthew's suggestion above as a starting
point if you are indeed limited to 4GB of RAM.

Cheers,
Wes



--
View this message in context: 
http://riak-users.197444.n3.nabble.com/Riak-consumes-too-much-memory-tp4029521p4029526.html
Sent from the Riak Users mailing list archive at Nabble.com.

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak consumes too much memory

2013-10-17 Thread Matthew Von-Maszewski
It is already in test and available for your download now:

https://github.com/basho/leveldb/tree/mv-flexcache

Discussion is here:

https://github.com/basho/leveldb/wiki/mv-flexcache

This code is slated for Riak 2.0.  Enjoy!!

Matthew

On Oct 17, 2013, at 20:50, darren  wrote:

> But why isn't riak smart enough to adjust itself to the available memory or 
> lack thereof?
> 
> No serious enterprise technology should just consume everything and crash.
> 
> 
> Sent from my Verizon Wireless 4G LTE Smartphone
> 
> 
> 
>  Original message 
> From: Matthew Von-Maszewski  
> Date: 10/17/2013 8:38 PM (GMT-05:00) 
> To: ZhouJianhua  
> Cc: riak-users@lists.basho.com 
> Subject: Re: Riak consumes too much memory 
> 
> 
> Greetings,
> 
> The default config targets 5 servers and 16 to 32G of RAM.  Yes, the 
> app.config needs some adjustment to achieve happiness for you:
> 
> - change ring_creation_size from 64 to 16 (remove the % from the beginning of 
> the line)
> - add this line before "{data_root, ...}" in the eleveldb section: 
> "{max_open_files, 40}," (be sure the comma is at the end of this line).
> 
> Good luck,
> Matthew
> 
> 
> On Oct 17, 2013, at 8:23 PM, ZhouJianhua  wrote:
> 
>> Hi
>> 
>> I installed Riak v1.4.2 on Ubuntu 12.04 (64-bit, 4G RAM) with apt-get, ran it 
>> with the default app.config but changed the backend to leveldb, and tested it 
>> with https://github.com/tpjg/goriakpbc . 
>> 
>> I just keep putting (key, value) pairs into a bucket; the memory keeps 
>> increasing, and in the end Riak crashed because it could not allocate memory. 
>> 
>> Should I change the configuration, or something else?
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Deleting data in a map-reduce job

2013-10-17 Thread Daniel Abrahamsson
Hi,

Does anyone have any experience with a similar setup?

We have resolved questions 4 and 5 - they occurred due to a firewall
misconfiguration, but I would still very much like to hear if there are any
drawbacks with deleting data in the map reduce job itself compared to just
collecting the keys and then deleting the data with the client.
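
For reference, the collect-then-delete alternative might look roughly like this
from the client side (a sketch; the bucket name and the JavaScript map phase
that returns keys are illustrative):

%% Collect matching keys with MapReduce, then delete them from the client.
{ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
Query = [{map, {jsanon, <<"function(v){ return [v.key]; }">>}, undefined, true}],
{ok, [{0, Keys}]} = riakc_pb_socket:mapred_bucket(Pid, <<"events">>, Query),
[ok = riakc_pb_socket:delete(Pid, <<"events">>, K) || K <- Keys].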

Regards,
Daniel Abrahamsson


On Thu, Oct 10, 2013 at 11:14 AM, Daniel Abrahamsson <
daniel.abrahams...@klarna.com> wrote:

> Hi, I've some questions regarding map-reduce jobs. The main one regards
> deleting data in a map-reduce job.
>
> I have a map-reduce job that traverses an entire bucket to clean up
> old/unusable data once a day. The deletion of objects is done in the
> map-reduce job itself. Here is an example map-reduce job expressed as a
> qfun explaining what I am doing:
>
> fun({error, notfound}, _, _)   -> [];
>(O, _, Condition) ->
>   Obj = case hd(riak_object:get_values(O)) of
> <<>> -> % Ignore tombstones
>   default_object_not_to_be_deleted();
> Bin -> binary_to_term(Bin)
>   end,
>   case should_be_deleted(Obj, Condition) of
> false -> [{total, 1}, {removed, 0}];
> true ->
>Key = riak_object:key(O),
>Bucket = riak_object:bucket(O),
>{ok, Client} = riak:local_client(),
>Client:delete(Bucket,Key),
>[{total, 1}, {removed, 1}]
>   end
> end.
>
> And now to the questions:
> 1. I have noted that deleting data this way leaves the keys around if I do
> a subsequent
>list_keys() operation. They are pruned when I try to get the objects
> and get {error, notfound}.
>With this approach, will the keys ever be removed unless someone tries
> to get them first?
>
> 2. Are there any other drawbacks with deleting data in the map-reduce job
> itself, rather than
>reading up the keys with the job, and then using the regular riak client
> to delete the objects?
>
> 3. Handling of tombstones in map-reduce jobs is very poorly documented.
> The approach above has worked for us. However, the approach feels very 
> awkward, with both an {error, notfound} clause and checking for an empty 
> binary as value. I know you can also check for the "X-Riak-Deleted" flag in 
> the metadata. Under what circumstances do the different values appear, and
> most importantly, which is the recommended way of dealing with tombstones
> in map-reduce jobs?
>
> Considering that we will have quite a lot of data in our bucket, we run the
> job during off-hours so as not to
> disturb regular traffic. However, when we run the job we often get an
> "error, disconnected" error after approximately 15 minutes, even though our
> timeout is greater than that. Running the job manually afterwards
> usually takes just ~30 seconds.
>
> 4. Have anyone else experienced this with a "cold" database? We have not
> yet configured all the tuning parameters reported by "riak diag", but will
> do so soon. Might this have an effect in this case?
>
> 5. What does the "disconnected" message mean, considering that the timeout
> value has not yet been reached?
>
> Regards,
> Daniel Abrahamsson
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com