Re: Are integer indexes in 2i 64-bit or 32-bit?
On 7 Jun 2012, at 22:55, Guido Medina wrote:

> All points to 32 bits, at least for the Java client side (indexes can be of type Integer, not Long, which is the 64-bit one). Look for RiakIndex.java, that will give you some answers.

That's a mistake on the part of the client developer at that time (me). They should probably be BigInteger, since integers can be arbitrarily large in Erlang. I'm pretty sure Brian Roach (the new, smarter, Java developer) is addressing this: https://github.com/basho/riak-java-client/issues/112

Russell

> I don't know the exact answer though.
>
> Regards,
>
> Guido.
>
> From: Alexander Sicular
> Sent: Thursday, June 07, 2012 10:43 PM
> To: Berend Ozceri
> Cc: riak-users@lists.basho.com
> Subject: Re: Are integer indexes in 2i 64-bit or 32-bit?
>
> I would say yes... probably, if you're on a 64-bit system. Unless you're shifting stuff through JavaScript, in which case I doubt it, 'cause last I checked, JS doesn't speak 64-bit int.
>
> @siculars on twitter
> http://siculars.posterous.com
>
> Sent from my iRotaryPhone
>
> On Jun 7, 2012, at 17:08, Berend Ozceri wrote:
>
>> I apologize for asking this question if it's an FAQ or is documented somewhere, but I don't see anything specific mentioned about the size of integer indexes in 2i:
>>
>> http://wiki.basho.com/Secondary-Indexes.html
>>
>> I certainly could dive into the source code to answer this question, but in case someone here knows, what's the size of an integer index in 2i? I'm hoping that the answer will be that it's 64 bits…
>>
>> Thanks,
>>
>> Berend
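To make the limitation concrete, here is a minimal, hypothetical sketch of a domain class using the legacy client's @RiakIndex annotation as discussed above; the class and field names are illustrative only, and the exact annotation usage is an assumption based on the client of that era:

    import com.basho.riak.client.convert.RiakIndex;
    import com.basho.riak.client.convert.RiakKey;

    // Hypothetical domain class: the "age" field is mapped to the 2i
    // index "age_int". Because the field is a Java Integer, the client
    // caps index values at 32 bits, even though Riak itself stores
    // arbitrarily large Erlang integers.
    public class Person {

        @RiakKey
        private String name;

        @RiakIndex(name = "age")
        private Integer age; // 32-bit: cannot round-trip values > 2^31 - 1

        public Person(String name, Integer age) {
            this.name = name;
            this.age = age;
        }
    }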
Re: Throughput issue contd. on Joyent Riak SmartMachine
On 27 Jun 2012, at 11:50, Yousuf Fauzan wrote:

> It's not about the difference in throughput in the two approaches I took. Rather, the issue is that even 200 writes/sec is a bit on the lower side. I could be doing something wrong with the configuration, because people are reporting throughputs of 2-3k ops/sec.
>
> If anyone here could guide me in setting up a cluster which would give that kind of throughput.

To get that kind of throughput I use multiple threads/workers. Have you looked at basho_bench[1]? It is a simple, reliable tool for benchmarking Riak clusters.

Cheers

Russell

[1] Basho Bench - https://github.com/basho/basho_bench and http://wiki.basho.com/Benchmarking.html

> Thanks,
> Yousuf
>
> On Wed, Jun 27, 2012 at 4:02 PM, Eric Anderson wrote:
>
> On Jun 27, 2012, at 5:13 AM, Yousuf Fauzan wrote:
>
>> Hi,
>>
>> I set up a 3-machine Riak SM cluster. Each machine used 4GB RAM and the Riak open-source SmartMachine image.
>>
>> Afterwards I tried loading data using the following two methods:
>>
>> 1. Bash script
>>
>> #!/bin/bash
>> echo $(date)
>> for (( c=1; c<=1000; c++ ))
>> do
>>     curl -s -d 'this is a test' -H "Content-Type: text/plain" http://127.0.0.1:8098/buckets/test/keys
>> done
>> echo $(date)
>>
>> 2. Python Riak client
>>
>> c = riak.RiakClient("10.112.2.185")
>> b = c.bucket("test")
>> for i in xrange(1): o = b.new(str(i), str(i)).store()
>>
>> For case 1, throughput was 25 writes/sec.
>> For case 2, throughput was 200 writes/sec.
>>
>> Maybe I am making a fundamental mistake somewhere. I tried the above two scripts on EC2 clusters too and still got the same performance.
>>
>> Please, someone help.
>
> The major difference between these two is that the first is executing a binary, which basically has to create everything (connection, payload, etc.) every time through the loop. The second does not - it creates the client once, then iterates over it, keeping the same client and presumably the same connection as well. That makes a huge difference.
>
> I would not use curl to do performance testing. What you probably want is something like your Python script that will work in many threads/processes at once (or fire them up many times).
>
> Eric Anderson
> Co-Founder
> CopperEgg
Re: Throughput issue contd. on Joyent Riak SmartMachine
On 27 Jun 2012, at 12:05, Yousuf Fauzan wrote:

> I did use basho_bench on my clusters. It showed throughput of around 150.

Could you share the config you used, please?
Re: Throughput issue contd. on Joyent Riak SmartMachine
On 27 Jun 2012, at 12:09, Yousuf Fauzan wrote:

> I used examples/riakc_pb.config:
>
> {mode, max}.
>
> {duration, 10}.
>
> {concurrent, 1}.

Try upping this. On my local 3-node cluster, with 8GB RAM and an old, cheap quad-core per box, I'd set concurrency to 10 workers.

> {driver, basho_bench_driver_riakc_pb}.
>
> {key_generator, {int_to_bin, {uniform_int, 1}}}.
>
> {value_generator, {fixed_bin, 1}}.
>
> {riakc_pb_ips, [{}]}.

I add all the IPs here, one entry per node.

> {riakc_pb_replies, 1}.
>
> {operations, [{get, 1}, {update, 1}]}.
Re: Throughput issue contd. on Joyent Riak SmartMachine
On 27 Jun 2012, at 12:36, Yousuf Fauzan wrote:

> So I changed concurrency to 10 and put all the IPs of the nodes in the basho_bench config.
> Throughput is now around 1500.

I guess you can now try 5 or 15 concurrent workers and see which is optimal for that setup, to get a good feel for the sizing of any connection pools for your application. You can also see how adding nodes and adding workers affects your results, to help you size the cluster you need for your expected usage.

Cheers

Russell
Re: Java client: byte arrays as keys?
Hi Kaspar,

Sorry for the slow reply.

On 16 Jul 2012, at 07:49, Kaspar Thommen wrote:

> Anyone please?
>
> On Jun 26, 2012 8:57 PM, "Kaspar Thommen" wrote:
>
> Hi,
>
> The high-level API in the Java client library (IRiakClient) does not allow one to use byte[] arrays as keys, only Strings, whereas the underlying PB and HTTP APIs (e.g. com.basho.riak.pbc.RiakClient) do, via the fetch(ByteString, ...) methods. Any reason for this?

Oversight or oversimplification by me.

> Is it planned to add byte array keys to IRiakClient as well at some point?

We should. Please will you raise an issue for it on the RJC GitHub repo[1] to ensure we get to it?

Cheers

Russell

[1] Java client issues - https://github.com/basho/riak-java-client/issues?direction=desc&sort=created&state=open

> Thanks,
> Kaspar
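Until that lands, the low-level PB client mentioned above can work with raw byte keys directly. A minimal sketch, assuming the fetch(ByteString, ...) method family the thread refers to; the return type shown is an assumption, not a confirmed signature:

    import com.basho.riak.pbc.RiakClient;
    import com.basho.riak.pbc.RiakObject;
    import com.google.protobuf.ByteString;

    public class ByteKeyFetch {
        public static void main(String[] args) throws Exception {
            // The low-level PB client, rather than the high-level IRiakClient.
            RiakClient pbc = new RiakClient("127.0.0.1");

            // An arbitrary binary key that has no safe String representation.
            byte[] rawKey = new byte[] { 0x00, 0x01, (byte) 0xFF };

            ByteString bucket = ByteString.copyFromUtf8("my_bucket");
            ByteString key = ByteString.copyFrom(rawKey);

            // fetch(ByteString, ...) is the method family mentioned above.
            RiakObject[] result = pbc.fetch(bucket, key);
            if (result.length > 0) {
                System.out.println(result[0].getValue().toStringUtf8());
            }
        }
    }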
Re: riak 1.2 on OS X 10.7 is not starting
On 16 Jul 2012, at 16:33, Senthilkumar Peelikkampatti wrote:

> I tried building from git master and the riak 1.2 "zip" download to see if it runs. I end up always getting the following error; any idea what I did wrong?

You did nothing wrong; this is a bug. There seems to be an issue with the start order of riak apps in master. I'm working on a fix (in fact, I have one, I just need to raise the PR). Basically riak_api is starting before riak_core, but it depends on riak_core for the stat cache. I've seen this with Erlang R14B0* but not with R15B0*. Will have a fix really soon.

Cheers

Russell

> Erlang R15B (erts-5.9) [source] [smp:4:4] [async-threads:64] [hipe] [kernel-poll:true]
>
> 21:00:14.839 [info] Application lager started on node 'riak@127.0.0.1'
> 21:00:14.875 [error] CRASH REPORT Process <0.59.0> with 0 neighbours exited with reason: no such process or port in call to gen_server:call(riak_core_stat_cache, {register,riak_api,{riak_api_stat,produce_stats,[]},5}) in application_master:init/4 line 138
> {"Kernel pid terminated",application_controller,"{application_start_failure,riak_api,{bad_return,{{riak_api_app,start,[normal,[]]},{'EXIT',{noproc,{gen_server,call,[riak_core_stat_cache,{register,riak_api,{riak_api_stat,produce_stats,[]},5}]}}"}
>
> Crash dump was written to: ./log/erl_crash.dump
> Kernel pid terminated (application_controller) ({application_start_failure,riak_api,{bad_return,{{riak_api_app,start,[normal,[]]},{'EXIT',{noproc,{gen_server,call,[riak_core_stat_cache,{register,ria
>
> Thanks,
> Senthil
Re: Reasons for using gen_server to gather statistics with folsom
Hi Sergey,

First, sorry for missing your first post. I just didn't see it. I'll try and answer your questions.

> 1. Why were separate gen_servers (riak_api_stat, riak_core_stat, riak_kv_stat) used to gather statistics, instead of direct calls to folsom_metrics through some more high-level API?

Time. There will be a high-level API provided by riak-core, or folsom, soon. The idea being that you declaratively register stats, riak-core starts a/some processes for you, and you just use the API to update stats. I started work on this structure but didn't finish it in time for 1.2. It is what I am working on next. I'll keep you posted. If you follow the existing model, hopefully porting to the new API will be relatively simple. Sorry for not getting it solidified sooner.

The reason for gen_servers, of course, is to cast the calls to folsom rather than blocking on ets when doing critical Riak ops like writing and reading data. There are a number of table ownership/crashing issues in folsom, as well as a couple of race conditions. I'll be working with Joe Williams of Boundary to resolve these and refactor folsom as part of my ongoing stats work for Riak. Watch that repo to keep informed.

> 2. What is the purpose of riak_core_stat_cache and what is it intended to do?

Calculating the histograms for stats is expensive, especially when there are a lot of readings. In some cases it can take a few seconds to calculate stats for some metrics on a busy node. The cache is there for two reasons:

1. To have only one process calculating stats at a time, so if multiple calls to get stats happen at once, one process actually calculates and the rest are parked and notified when the answer comes.
2. To actually cache the results so they're not calculated more often than needed.

There are stats gathered on how long it takes to calculate stats, and the idea was to have the mean time to calculate stats for an application be the cache TTL. That is work still to be done. But in many ways the cache is there to support backwards compatibility for Riak's /stats endpoint and the riak-admin commands. In future I'd rather expose the folsom stats directly over REST and the CLI, so you can request only the stat you want and not waste time calculating a load of stats you're not interested in. This is the next, next thing I'll be working on.

> As far as I understand, riak_core_stat_cache caches stats using ets, so I'm wondering why statistics that are stored in ets are cached using ets?

So why cache stats in ets that are already in ets: the cache is for groups of stats that have had the _expensive_ calculations run on them already; folsom stores the raw readings in ets.

> Is it correct that calls to folsom_metrics are done via gen_server to decrease the possibility of losing ets tables that are bound to a concrete process?

Really, calls are done via gen_server so that calls to folsom are cast. Originally the code called folsom directly, in-process, but benchmarking showed this to be slower and more damaging in the case of an error/crash in folsom. I mention the ets ownership/crashing issues above; there is an example of one here[1]. I'm going to work on refactoring folsom to have a more coherent strategy of table ownership.

I hope this helps; if I've missed anything, please ask. The short-term aim was to stabilise stats in Riak and fix known issues, and I think I accomplished that. Next is to better structure the code so that riak-core provides a stats service.

Cheers

Russell

[1] https://github.com/boundary/folsom/issues/30

On 13 Aug 2012, at 09:54, Zhemzhitsky Sergey wrote:

> Hi guys,
>
> Any updates on these questions?
>
> I've read the following blog entry http://basho.com/blog/technical/2012/07/02/folsom-backed-stats-riak-1-2/ and still haven't found the answers.
>
> Best Regards,
>
> Sergey
Re: Java map-reduce and result keys
On 14 Sep 2012, at 14:24, Deepak Balasubramanyam wrote:

> Hi,
>
> I've written a map-reduce query on the Riak Java client like so...
>
> client.mapReduce(BUCKET).addKeyFilter(keyFilter)
>     .addLinkPhase(BUCKET, "_", false)
>     .addMapPhase(new NamedJSFunction("Riak.mapValuesJson"), false)
>     .addReducePhase(phaseFunction).execute();
> Collection<MyType> types = result.getResult(MyType.class);
>
> This is the class definition for MyType:
>
> public class MyType
> {
>     @RiakKey
>     private String myKey;
>     private String property1;
>     private String property2;
>
>     /* Getters / setters go here */
> }
>
> When the object mapper deserializes the results into Collection<MyType>, none of the types have the myKey property populated in them. When I debugged the calls made by Riak, I realized that the result of the /mapred call does not contain any key information in the body. It only contains the value that each key represents. So that explains why the keys are null in the result.

The Java client doesn't add the value of the @RiakKey field to the value stored in Riak.

> On the contrary, a link walk in Riak returns the Location header for each multipart form entry in the response (Location: /riak/bucket/key). So I guess there is at least some way to tweak a client to parse the location to get the keys, but you lose out on the map-reduce goodness.
>
> Is there some way a map-reduce query can be formed to allow the resulting type's RiakKey to be populated? What are my options?

A custom map function may do what you want. Get the key from the key data passed to the map function and add it to the JSON value returned. Jackson should then take care of deserialising it into your values.

Cheers

Russell

> Thanks
> Deepak Bala
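A sketch of that suggestion, reusing the names from the question above (client, BUCKET, keyFilter, MyType) and assuming the legacy client's JSSourceFunction helper for inline JavaScript map phases; it is illustrative, not a confirmed snippet:

    import com.basho.riak.client.query.functions.JSSourceFunction;

    // A map phase that parses each object's JSON value and copies the
    // object's key into a "myKey" property, so Jackson can populate the
    // @RiakKey-annotated field when deserialising the results.
    JSSourceFunction keyInjectingMap = new JSSourceFunction(
        "function(v) {"
      + "  var data = Riak.mapValuesJson(v)[0];" // parsed JSON value
      + "  data.myKey = v.key;"                  // inject the object key
      + "  return [data];"
      + "}");

    Collection<MyType> types = client.mapReduce(BUCKET)
        .addKeyFilter(keyFilter)
        .addMapPhase(keyInjectingMap, true)
        .execute()
        .getResult(MyType.class);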
Re: Specified w/dw/pw values invalid for bucket n value of 1
On 19 Sep 2012, at 14:54, Ingo Rockel wrote:

> Hi Mark,
>
> thanks for looking into this.
>
> I'm using the riak-java-client for accessing riak,

Are you using HTTP or PB, please? And can you let me know the version, too?

Thanks

Russell

> the bucket is created with the following code:
>
> msgBucket = riakClient.createBucket(MSG_BUCKET_ID).
>     nVal(1).w(1).dw(1).pw(1).
>     execute();
>
> and the write operation is:
>
> msgBucket.store("Mailbox|" + messageDto.getReceptor(), createMailbox(messageDto)).execute();
>
> without doing any modification of w/dw/pw for the single write.
>
> Ingo
>
> Am 19.09.2012 07:56, schrieb Mark Phillips:
>
>> Hi Ingo
>>
>> I just built a single-node Riak 1.2 from source on my laptop, cranked the n, w, dw, and pw values down to "1", and was able to write successfully via curl. For the sake of completeness, I attempted to write with "w=2" and was able to trigger the "Specified w/dw/pw values invalid for bucket n value of 1" error that you reported.
>>
>> Any chance you have some app code that's changing the W value? Can you share the specifics of the request that's failing?
>>
>> Mark
>>
>> On Tue, Sep 18, 2012 at 9:33 AM, Ingo Rockel wrote:
>>
>>> Hi Mark,
>>>
>>> yes, I'm running 1.2.
>>>
>>> Ingo
>>>
>>> Am 18.09.2012 02:11, schrieb Mark Phillips:
>>>
>>>> Hi Ingo,
>>>>
>>>> Sorry for the holdup here. Riak shouldn't be throwing this error if all your R and W values are set to "1". Are you running Riak 1.2?
>>>>
>>>> Mark
>>>>
>>>> On Mon, Sep 17, 2012 at 3:10 AM, Ingo Rockel wrote:
>>>>
>>>>> Anyone?
>>>>>
>>>>> Am 30.08.2012 18:34, schrieb Ingo Rockel:
>>>>>
>>>>>> Hi List,
>>>>>>
>>>>>> I'm trying to set the n_val to 1 for my single-node test server but always fail with the following error:
>>>>>>
>>>>>> Specified w/dw/pw values invalid for bucket n value of 1
>>>>>>
>>>>>> This is my bucket configuration:
>>>>>>
>>>>>> {"props":{"allow_mult":false,"basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"dw":1,"last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":1,"name":"messages","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":1,"precommit":[],"pw":1,"r":1,"rw":1,"small_vclock":50,"w":1,"young_vclock":20}}
>>>>>>
>>>>>> I can't see what's wrong with it; maybe someone can shed some light.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Ingo
Re: Specified w/dw/pw values invalid for bucket n value of 1
On 24 Sep 2012, at 13:38, Ingo Rockel wrote:

> Hi,
>
> I finally found the reason for this message: although the bucket was initialized with r=1, the Java client silently appended an "r=2" to all fetches, which caused this message.

I've found the code that does this. It is deep in the legacy client code, but it is a bug. Very sorry about this, and thanks for finding the issue.

https://github.com/basho/riak-java-client/blob/master/src/main/java/com/basho/riak/client/http/util/ClientHelper.java#L231

> The combination of two things really cost me a lot of time to find the reason for this:
>
> * the error message from Riak complains about wrong write parameters but in fact the read parameter was wrong.

This is also a bug, in Riak, in the riak_kv_wm_object module, and I've opened an issue for this, too.

> * the Java client library silently attaches an "r=2" to all read queries and I didn't find anything in the documentation about this. Is this intended or a bug?

It looks like it was _intended_ by the initial author, but it is a bad idea, and therefore a bug.

Again, thanks for finding those, and sorry for the time/effort lost. I opened the following two issues:

https://github.com/basho/riak-java-client/issues/167
https://github.com/basho/riak_kv/issues/398

Cheers

Russell

> Setting the used Apache HTTP client to trace really helped a lot; maybe you should add something about this to the Java client documentation.
>
> Ingo
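Until the fixes above shipped, an explicit per-request r value would sidestep the silently appended default. A minimal, hypothetical sketch, assuming the legacy client's fetch builder exposes r(); the bucket and key names come from the thread and are illustrative:

    import com.basho.riak.client.IRiakClient;
    import com.basho.riak.client.IRiakObject;
    import com.basho.riak.client.RiakFactory;
    import com.basho.riak.client.bucket.Bucket;

    public class ExplicitRFetch {
        public static void main(String[] args) throws Exception {
            IRiakClient client = RiakFactory.httpClient();
            Bucket bucket = client.fetchBucket("messages").execute();

            // Passing r(1) explicitly overrides whatever default the HTTP
            // helper would otherwise append to the query string.
            IRiakObject obj = bucket.fetch("Mailbox|someReceptor").r(1).execute();
            if (obj != null) {
                System.out.println(obj.getValueAsString());
            }
            client.shutdown();
        }
    }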
Re: Cluster setup
Hey,

'eaddrinuse' from your previous mail suggests that the address you're binding to is already in use. Maybe Riak is already running? Or something else is bound to the address?

Cheers

Russell

On 11 Dec 2012, at 19:06, Kevin Burton wrote:

> Any more information on this, or something I can do to help diagnose my own problem?
>
> Again, app.config looks like:
>
> [
>  %% Riak Client APIs config
>  {riak_api, [
>      %% pb_backlog is the maximum length to which the queue of pending
>      %% connections may grow. If set, it must be an integer >= 0.
>      %% By default the value is 5. If you anticipate a huge number of
>      %% connections being initialised *simultaneously*, set this number
>      %% higher.
>      %% {pb_backlog, 64},
>
>      %% pb_ip is the IP address that the Riak Protocol Buffers interface
>      %% will bind to. If this is undefined, the interface will not run.
>      {pb_ip, "10.79.110.52"},
>
>      %% pb_port is the TCP port that the Riak Protocol Buffers interface
>      %% will bind to
>      {pb_port, 8089}
>  ]},
>
>  %% Riak Core config
>  {riak_core, [
>      %% Default location of ringstate
>      {ring_state_dir, "/var/lib/riak/ring"},
>
>      %% Default ring creation size. Make sure it is a power of 2,
>      %% e.g. 16, 32, 64, 128, 256, 512 etc
>      %{ring_creation_size, 64},
>
>      %% http is a list of IP addresses and TCP ports that the Riak
>      %% HTTP interface will bind.
>      {http, [ {"10.79.110.52", 8099} ]},
>
> ifconfig looks like:
>
> eth0  Link encap:Ethernet  HWaddr 00:15:5D:50:3B:27
>       inet addr:10.79.110.52  Bcast:10.79.111.255  Mask:255.255.254.0
>       inet6 addr: fe80::215:5dff:fe50:3b27/64 Scope:Link
>       UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>       RX packets:879 errors:0 dropped:0 overruns:0 frame:0
>       TX packets:1353 errors:0 dropped:0 overruns:0 carrier:0
>       collisions:0 txqueuelen:1000
>       RX bytes:107098 (104.5 KiB)  TX bytes:204947 (200.1 KiB)
>
> lo    Link encap:Local Loopback
>       inet addr:127.0.0.1  Mask:255.0.0.0
>       inet6 addr: ::1/128 Scope:Host
>       UP LOOPBACK RUNNING  MTU:16436  Metric:1
>       RX packets:656 errors:0 dropped:0 overruns:0 frame:0
>       TX packets:656 errors:0 dropped:0 overruns:0 carrier:0
>       collisions:0 txqueuelen:0
>       RX bytes:38513 (37.6 KiB)  TX bytes:38513 (37.6 KiB)
>
> Finally, the error.log:
>
> 2012-12-11 05:24:49.785 [error] <0.170.0> CRASH REPORT Process <0.170.0> with 0 neighbours exited with reason: eaddrinuse in gen_server:init_it/6 line 320
> 2012-12-11 14:11:31.723 [info] <0.7.0> Application lager started on node 'riak@10.79.110.52'
> 2012-12-11 14:11:31.904 [warning] <0.154.0>@riak_core_ring_manager:reload_ring:231 No ring file available.
> 2012-12-11 14:11:32.091 [error] <0.170.0> CRASH REPORT Process <0.170.0> with 0 neighbours exited with reason: eaddrinuse in gen_server:init_it/6 line 320
> 2012-12-11 16:31:34.125 [info] <0.7.0> Application lager started on node 'riak@10.79.110.52'
> 2012-12-11 16:31:34.415 [warning] <0.151.0>@riak_core_ring_manager:reload_ring:231 No ring file available.
> 2012-12-11 16:31:34.711 [error] <0.170.0> CRASH REPORT Process <0.170.0> with 0 neighbours exited with reason: eaddrinuse in gen_server:init_it/6 line 320
> 2012-12-11 17:57:02.388 [info] <0.7.0> Application lager started on node 'riak@10.79.110.52'
> 2012-12-11 17:57:02.648 [warning] <0.154.0>@riak_core_ring_manager:reload_ring:231 No ring file available.
> 2012-12-11 17:57:02.899 [error] <0.170.0> CRASH REPORT Process <0.170.0> with 0 neighbours exited with reason: eaddrinuse in gen_server:init_it/6 line 320
Re: Riak KV coordinators
On 12 Dec 2012, at 19:20, David Fox wrote:

> Hey everyone,
>
> I'm currently using riak_kv as a reference for how to implement riak_core and see that whenever a new coordinator process is needed, a new one is created via its supervisor. But in the case of the get and put coordinators, the supervisor is not used.
>
> Besides eliminating the supervisor as a potential bottleneck, is there any reason why the supervisor is not used?

AFAIK it is only that (eliminating a potential bottleneck). It is an optimisation added to the kv code in the last few weeks.

Cheers

Russell

> https://github.com/basho/riak_kv/blob/master/src/riak_client.erl#L80
Re: What is the "riak_kv_vnodeq_total" metric?
Hi Dave,

On 16 Jan 2013, at 11:29, Dave Brady wrote:

> Greetings,
>
> I won't bore everyone with details here: the short story is I ran "riak-admin cluster leave/plan/commit" to remove a node and got a lot of grief from our five-node ring.
>
> The ring was pretty well destabilized. One or more nodes would be down, then up, when repeatedly running "riak-admin ring-status".
>
> I have finally isolated a wildly misbehaving node (not the one I was trying to make "leave", by the way).
>
> None of the existing metrics I was graphing highlighted a problem, so I went through "/stats" (yet again), looking at the undocumented metrics to see what looked interesting.
>
> I noticed that riak_kv_vnodeq_total was showing up with a non-zero value, so I set up a graph which plots the difference between the previous and current value (like I do for the other "*_total" metrics).
>
> The results were *very* interesting! The other four nodes showed occasional values of 1, 2, even 3 once or twice. Our troublesome node showed 152, 8000, 704... !!
>
> Does anyone know what riak_kv_vnodeq_total indicates?

It is the total number of messages in the queues of all the riak_kv vnodes running on the node. Large queues mean that one or more vnodes are not able to keep up with the requests made of them.

Cheers

Russell

> Thanks!
>
> --
> Dave Brady
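For anyone who wants to watch this metric the way Dave describes, here is a small, hypothetical monitor that polls the /stats endpoint and prints the delta between readings. The JSON extraction is deliberately crude (a real client would use a JSON library), and the host, port, and interval are illustrative:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class VnodeqWatch {

        // Crude extraction of an integer field from the /stats JSON body.
        static long statValue(String json, String name) {
            int i = json.indexOf("\"" + name + "\":");
            if (i < 0) return -1;
            int start = i + name.length() + 3; // skip quotes and colon
            int end = start;
            while (end < json.length() && Character.isDigit(json.charAt(end))) end++;
            return Long.parseLong(json.substring(start, end));
        }

        public static void main(String[] args) throws Exception {
            long previous = -1;
            while (true) {
                StringBuilder body = new StringBuilder();
                try (BufferedReader r = new BufferedReader(new InputStreamReader(
                        new URL("http://127.0.0.1:8098/stats").openStream()))) {
                    for (String line; (line = r.readLine()) != null; ) body.append(line);
                }
                long current = statValue(body.toString(), "riak_kv_vnodeq_total");
                if (previous >= 0) {
                    System.out.println("vnodeq delta: " + (current - previous));
                }
                previous = current;
                Thread.sleep(10000);
            }
        }
    }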
Re: riak_dt
Hi Petter,

> Hi
>
> I would like to use Riak, but counters are a must in this project. They are so handy :-)
>
> I tried to build riak-dt, but the build system seems broken. I get this when I do make devrel:
>
> ERROR: generate failed while processing /Users/petter/erlang/riak/rel:
> {'EXIT',{{badmatch,{error,"luke: : Missing application directory."}},
>     [{rebar_reltool,generate,2,[]},
>      {rebar_core,run_modules,4,[]},
>      {rebar_core,execute,5,[]},
>      {rebar_core,process_dir0,6,[]},
>      {rebar_core,process_dir,4,[]},
>      {rebar_core,process_commands,2,[]},
>      {rebar,main,1,[]},
>      {escript,run,2,[{file,"escript.erl"},{line,741}]}]}}
>
> Any clues?

I think the riak-dt branch of riak has got a bit stale and needs a rebase. I'll do that today and post back when I have it working.

> riak-dt does also seem a little bit forgotten - no commits in a couple of months. Is it still planned?

Riak_dt, or something like it, _is_ still planned. We're working out how best to integrate CRDTs in general (and counters in particular) into riak_kv. In the meantime I'll fix up that experimental branch.

Cheers

Russell

> Cheers,
>
> Petter
Re: riak_dt
I just pushed an updated 'riak-dt' branch to the riak repo on GitHub. Just pull the branch and build, and you should have a working riak_dt.

Let me know if you have any issues.

Cheers

Russell

On 26 Jan 2013, at 10:54, Petter Egesund wrote:

> Great, thanks - I will stay tuned :-)
>
> Petter
Re: Tune Riak for fast inserts - populate DB
Hi,

On 13 Feb 2013, at 07:37, Bogdan Flueras wrote:

> Hello all,
> I've got a 5-node cluster with Riak 1.2.1; all machines are multicore, with min 4GB RAM.
>
> I want to insert something like 50 million records into Riak with the Java client (Protobuf used) with default settings. I've also tried the HTTP protocol and set w = 1, but got some problems.
>
> However, the process is very slow: it doesn't write more than 6GB/hour, or approx. 280 KB/second. To have all my data filled in, it would take approx. 2 days!!
>
> What can I do to have the data filled into Riak ASAP? How should I configure the cluster (vm.args/app.config)? I don't care so much about consistency at this point.

If you are certain to be only inserting new data, setting your bucket(s) to last-write-wins will speed things up. Also, are you using multiple threads for the Java client insert? Spreading the load across all five nodes? Are you using the "withoutFetch()" option on the Java client?

Cheers

Russell

> Thank you,
> ing. Bogdan Flueras
Re: Tune Riak for fast inserts - populate DB
On 13 Feb 2013, at 08:07, Bogdan Flueras wrote:

> How to set the bucket to last write? Is it in the builder?

Something like:

    Bucket b = client.createBucket("my_bucket").lastWriteWins(true);

Also, after you've created the bucket, do you use it from all threads? You don't re-fetch the bucket per insert operation, do you?

But the "withoutFetch()" option is probably going to be the biggest performance increase, and it is safe if you are only doing inserts.

Cheers

Russell

> I'll have a look.
> Yes, I use more threads, and the bucket is configured to spread the load across all nodes.
>
> Thanks, I'll have a deeper look into the API and let you know about my results.
>
> ing. Bogdan Flueras
Re: Tune Riak for fast inserts - populate DB
On 13 Feb 2013, at 09:44, Bogdan Flueras wrote:

> Ok, so I've done something like this:
>
> Bucket bucket = client.createBucket("foo"); // lastWriteWins(true) doesn't work for Protobuf
>
> and when I insert I have:
>
> bucket.store(someKey, someValue).withoutFetch().pw(1).execute();
>
> It looks like it's 20% faster than before. Is there something I could further tweak?

The .pw(1) is redundant. All riak inserts are pw(1) by default. Setting last-write-wins will get you a speed gain, since it means the Riak node will not attempt a read before it writes your data. Maybe use the HTTP client to set this property?

How many threads are you using? How are you getting the data to the writing processes?

Cheers

Russell

> On Wed, Feb 13, 2013 at 10:19 AM, Bogdan Flueras wrote:
>
> Each thread has its own bucket instance (pointing to the same location) and I don't re-fetch the bucket per insert.
> Thank you very much!
>
> ing. Bogdan Flueras
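Pulling the thread's advice together, here is a minimal, hypothetical bulk-load sketch: last-write-wins set once over HTTP (since, per the thread, the PB client of this era could not set it), then inserts over PB with withoutFetch() from a pool of worker threads. Hosts, ports, key and value choices, and counts are illustrative only:

    import com.basho.riak.client.IRiakClient;
    import com.basho.riak.client.RiakFactory;
    import com.basho.riak.client.bucket.Bucket;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class BulkLoader {
        public static void main(String[] args) throws Exception {
            // Set last_write_wins once, over HTTP.
            IRiakClient httpClient = RiakFactory.httpClient();
            httpClient.createBucket("foo").lastWriteWins(true).execute();
            httpClient.shutdown();

            // Then load over PB from a pool of worker threads.
            IRiakClient pbClient = RiakFactory.pbcClient("10.0.0.1", 8087);
            final Bucket bucket = pbClient.fetchBucket("foo").execute();

            ExecutorService pool = Executors.newFixedThreadPool(10);
            for (int i = 0; i < 1000000; i++) {
                final String key = String.valueOf(i);
                pool.submit(new Runnable() {
                    public void run() {
                        try {
                            // withoutFetch() skips the client-side fetch that
                            // normally precedes a store; safe for new keys.
                            bucket.store(key, "value-" + key)
                                  .withoutFetch().execute();
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
            pbClient.shutdown();
        }
    }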
Re: Secondary index maintenance
On 20 Feb 2013, at 14:35, Theo Bot wrote:

> Hi
>
> It's not that I want to use the Erlang client. It's just that I want to know how to create HTTP queries to maintain the secondary indexes.

Ah, OK. Sorry for the confusion.

Updating the indexes is just like updating a value or any other object metadata in Riak: you need to fetch the whole value, change it, and send it back.

To delete all the indexes for a key, just POST that key's value to Riak's HTTP interface without the x-riak-index headers. To remove particular index(es), just POST that key's value to Riak's HTTP interface minus the x-riak-index headers for the index(es) you want to remove.

For example, say you created (as per the Riak docs example[1]):

curl -X POST \
  -H 'x-riak-index-twitter_bin: jsmith123' \
  -H 'x-riak-index-email_bin: jsm...@basho.com' \
  -d '...user data...' \
  http://localhost:8098/buckets/users/keys/john_smith

When you read the data back with `curl -v localhost:8098/buckets/users/keys/john_smith` you get:

< HTTP/1.1 200 OK
< X-Riak-Vclock: a85hYGBgzGDKBVIcypz/fgbKiqhmMCUy57EyfImZeYovCwA=
< x-riak-index-twitter_bin: jsmith123
< x-riak-index-email_bin: jsm...@basho.com
< Vary: Accept-Encoding
< Server: MochiWeb/1.1 WebMachine/1.9.2 (someone had painted it blue)
< Link: ; rel="up"
< Last-Modified: Wed, 20 Feb 2013 14:43:00 GMT
< ETag: "7DdYGiY7JujKiCVTTXp51M"
< Date: Wed, 20 Feb 2013 14:43:10 GMT
< Content-Type: application/x-www-form-urlencoded
< Content-Length: 15
<
...user data...

Those x-riak-index headers are your indexes. To remove one, just POST back the object, minus the index you want to remove. So to drop the 'twitter' index:

curl -X POST http://localhost:8098/buckets/users/keys/john_smith \
  -d '...user data...' \
  -H 'X-Riak-Vclock: a85hYGBgzGDKBVIcypz/fgbKiqhmMCUy57EyfImZeYovCwA=' \
  -H 'x-riak-index-email_bin: jsm...@basho.com'

When you read the value again you'll see it now has only one index.

I guess your client will need to parse those index headers into some useful structure, expose a way for the user to add/edit/remove indexes, and then generate headers to POST back to Riak.

Does that cover it?

Cheers

Russell

[1] http://docs.basho.com/riak/latest/tutorials/querying/Secondary-Indexes/#Query-Interfaces-and-Examples

> On Wed, Feb 20, 2013 at 3:30 PM, Christian Dahlqvist wrote:
>
> Hi Theo,
>
> The Riak HTTP client for Erlang uses the 'riakc_obj' from the PB client to represent records. You can therefore use any utility functions available there to manipulate metadata. The HTTP client for Erlang does however currently not support secondary indexes [1], meaning that these will not be parsed nor sent when getting or putting an object.
>
> [1] http://docs.basho.com/riak/1.3.0rc4/references/Client-Libraries/#HTTP
>
> Best regards,
>
> Christian
>
> On 20 Feb 2013, at 08:41, Theo Bot wrote:
>
>> Hi
>>
>> I found in the Erlang client that there are methods (clear_secondary_indexes, delete_secondary_index) in the PB API to maintain the secondary index of objects. However, in the HTTP API I cannot find such methods.
>>
>> Theo Bot
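For a client author working against the HTTP interface, the fetch/modify/POST cycle above can be sketched with nothing but the JDK. The sketch below drops the 'twitter' index from the docs example; it simplifies multi-valued headers to their first value, and all names come from the example above rather than a real application:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class DropIndex {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://localhost:8098/buckets/users/keys/john_smith");

            // 1. Fetch the object, keeping its vclock, body, content type,
            //    and every x-riak-index-* header except the one to drop.
            HttpURLConnection get = (HttpURLConnection) url.openConnection();
            String vclock = get.getHeaderField("X-Riak-Vclock");
            String ctype = get.getHeaderField("Content-Type");
            Map<String, String> keptIndexes = new HashMap<>();
            for (Map.Entry<String, List<String>> h : get.getHeaderFields().entrySet()) {
                String name = h.getKey(); // null for the status line
                if (name != null
                        && name.toLowerCase().startsWith("x-riak-index-")
                        && !name.equalsIgnoreCase("x-riak-index-twitter_bin")) {
                    keptIndexes.put(name, h.getValue().get(0));
                }
            }
            StringBuilder body = new StringBuilder();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(get.getInputStream()))) {
                for (String line; (line = r.readLine()) != null; ) {
                    body.append(line);
                }
            }

            // 2. POST it back, minus the dropped index header.
            HttpURLConnection post = (HttpURLConnection) url.openConnection();
            post.setRequestMethod("POST");
            post.setDoOutput(true);
            post.setRequestProperty("X-Riak-Vclock", vclock);
            post.setRequestProperty("Content-Type", ctype);
            for (Map.Entry<String, String> idx : keptIndexes.entrySet()) {
                post.setRequestProperty(idx.getKey(), idx.getValue());
            }
            try (OutputStream out = post.getOutputStream()) {
                out.write(body.toString().getBytes("UTF-8"));
            }
            System.out.println("POST status: " + post.getResponseCode());
        }
    }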
Re: Understanding read_repairs
Hi,

Thanks for trying Riak.

On 21 Feb 2013, at 23:48, Belai Beshah wrote:

> Hi All,
>
> We are evaluating Riak to see if it can be used to cache large blobs of data. Here is our test cluster setup:
>
> * six Ubuntu LTS 12.04 dedicated nodes with 8-core 2.6 GHz CPU, 32 GB RAM, 3.6T disk
> * {pb_backlog, 64},
> * {ring_creation_size, 256},
> * {default_bucket_props, [{n_val, 2}, {allow_mult,false},{last_write_wins,true}]},
> * using bitcask as the backend
>
> Everything else default except the above. There is an HAProxy load balancer in front of the nodes, configured according to the Basho wiki, that the clients talk to. Due to the nature of the application we are integrating, we do about 1200 writes/s of approximately 40-50KB each and read them back almost immediately. We noticed a lot of read repairs, and since that was one of the things that could indicate a performance problem, we got worried. So we wrote a simple Java client application that simulates our use case. The test program is dead simple:
>
> * generate keys using random UUIDs and values using Apache Commons RandomStringUtils
> * create a thread pool of 5 and store key/value using "bucket.store()"
> * read the values back using "bucket.fetch()" multiple times
>
> I could provide the spike code if needed. What we noticed is that we get a lot of read repairs all over the place. We even made it use a single thread to read/write, played with the write/read quorum, and even put a delay of 5 minutes between the writes before the reads start, to give the cluster time to become eventually consistent. Nothing helps: we always see a lot of read repairs, sometimes as many as the number of inserts.

It sounds like you are experiencing this bug: https://github.com/basho/riak_kv/pull/334

It is fixed in master, but it doesn't look like it made it into 1.3.0. If you're OK with building from source, I tried it and a patch from 8895d2877576af2441bee755028df1a6cf2174c7 goes cleanly onto 1.3.0.

Cheers

Russell

> The good thing is that in all of these tests we have not seen any read failures. Performance is also not bad; a few maxes here and there we don't like, but 90% looks good. Even when we killed a node, the reads are still successful.
>
> We are wondering what the expected ratio of read repairs is, and what is a reasonable time for the cluster not to resort to read repair to fulfill a read request, or is there something we are missing in our setup?
>
> Thanks
> Belai
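For readers who want to reproduce this, here is a minimal, hypothetical version of the spike described above (UUID keys, random values from Commons Lang, a pool of five writers, immediate read-back); the host, bucket name, payload size, and iteration count are illustrative:

    import com.basho.riak.client.IRiakClient;
    import com.basho.riak.client.RiakFactory;
    import com.basho.riak.client.bucket.Bucket;
    import org.apache.commons.lang.RandomStringUtils;
    import java.util.UUID;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class ReadRepairSpike {
        public static void main(String[] args) throws Exception {
            final IRiakClient client = RiakFactory.pbcClient("127.0.0.1", 8087);
            final Bucket bucket = client.fetchBucket("blobs").execute();

            ExecutorService pool = Executors.newFixedThreadPool(5);
            for (int i = 0; i < 10000; i++) {
                pool.submit(new Runnable() {
                    public void run() {
                        try {
                            String key = UUID.randomUUID().toString();
                            // ~40-50KB random payload, as in the test above.
                            String value = RandomStringUtils.randomAlphanumeric(45 * 1024);
                            bucket.store(key, value).execute();
                            // Read it back immediately, several times.
                            for (int j = 0; j < 3; j++) {
                                bucket.fetch(key).execute();
                            }
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                });
            }
            pool.shutdown();
        }
    }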
Re: Understanding read_repairs
On 24 Feb 2013, at 15:01, Sebastian Cohnen wrote:

> Can you confirm that the PR didn't make it into 1.3.0? It was closed about four weeks ago...

It is not in 1.3. We actually had the code freeze for 1.3.0 a _long_ time ago (I think it was 16th Nov 2012) and this fix did not make it in time for the freeze.

Sorry,

Russell

> I think I'm hitting the same issue here. In one project I'm using last-write-wins and the read repair count is ridiculously high. I'm hoping that this might also help reduce the latency problem we are seeing for the 99th+ percentiles of requests.
>
> On 22.02.2013, at 09:24, Russell Brown wrote:
>
>> It is fixed in master, but it doesn't look like it made it into 1.3.0. If you're OK with building from source, I tried it and a patch from 8895d2877576af2441bee755028df1a6cf2174c7 goes cleanly onto 1.3.0.
Re: Understanding read_repairs
On 1 Mar 2013, at 17:39, Belai Beshah wrote: > Nothing fancy really the set method throws an exception > "com.basho.riak.client.RiakRetryFailedException: java.io.EOFException". Tried > to find anything that could explain it in the error or console logs but > nothing. Some questions: Are you using the PB client? Do you have anything in your riak logs that points at a pb socket crash? What version of the RJC are you using, please? Cheers Russell > > > From: Kresten Krab Thorup [k...@trifork.com] > Sent: Friday, March 01, 2013 5:40 AM > To: Belai Beshah > Cc: Jared Morrow; riak-users@lists.basho.com; Russell Brown > Subject: Re: Understanding read_repairs > > Interesting. What does the failure look like? > > Kresten > > On Feb 27, 2013, at 11:25 PM, Belai Beshah > mailto:belai.bes...@nwgeo.com>> wrote: > > I see my post is not clear, the 0.1% is a get/set failure not slowdown. We > will have been ok with a slow response but a failed response from the AAE was > not something we can tolerate. Since the Java client by deafult does 3 > retiries I didn't see any point in adding more retries to see if it works > with more. > > > From: Jared Morrow [ja...@basho.com<mailto:ja...@basho.com>] > Sent: Wednesday, February 27, 2013 2:21 PM > To: Belai Beshah > Cc: Russell Brown; > riak-users@lists.basho.com<mailto:riak-users@lists.basho.com> > Subject: Re: Understanding read_repairs > > Belai, > > Active Anti-Entropy is doing work building trees and checking data, so it > will slow down gets/puts slightly. If you can't accept the slight > performance hit, disabling it is the right choice. In our testing, if you > use eLevelDB, 1.3.0 with AAE enabled is faster than 1.2.1 without AAE in most > cases due to the other speedups added to eLevelDB in 1.3.0. Since Bitcask > runs about the limit of what a filesystem can handle, AAE definitely shows a > slight performance hit since it is accessing the filesystem as well. > > Glad to hear the patch solved your other issues. > > -Jared > > > > On Wed, Feb 27, 2013 at 1:13 PM, Belai Beshah > mailto:belai.bes...@nwgeo.com>> wrote: > Patch worked good on 1.3, no more continuous read repairs. However, we > started seeing problems with Set/Get of about 0.1% which was not there in the > 1.2 release. Since this happens even without the patch on a clean 1.3 > install we narrowed it down to being Active Anti-Entropy since it looks like > it is always actively fixing data, may it is our write and read immediately > pattern or the fact that we have only a single 4TB disk behind each node and > they can't keep up. With Active Anti-Entropy turned off all our tests passed > and performance returned to 1.2 levels without any read repairs. For now we > are happy to continue our tests with Active Anti-Entropy turned off but it > will be great if we can get some pointer from the experts that could explain > the behavior we saw. Thanks you guys for the help. > > > From: Jared Morrow [ja...@basho.com<mailto:ja...@basho.com>] > Sent: Friday, February 22, 2013 11:56 AM > To: Belai Beshah > Cc: Russell Brown; > riak-users@lists.basho.com<mailto:riak-users@lists.basho.com> > Subject: Re: Understanding read_repairs > > > Belai, > > One other option is to use our "basho-patches" functionality. We use it to > run new code on current installations where sending a new .beam file is > easier than remaking the packages or compiling from source. On your ubuntu > system using our packages, the folder should be in > /usr/lib/riak/lib/basho-patches. 
> > To do this you just need the one file changed in the PR pointed to by Russell. > > Here are the steps to make that happen: > > * Install Erlang R15B01: > http://docs.basho.com/riak/latest/tutorials/installation/Installing-Erlang/ > * Get riak_kv: git clone > git://github.com/basho/riak_kv.git > * compile riak_kv with just 'make' > * copy the resulting .beam file in the ebin folder to the machines that need > the new file: scp ebin/riak_kv_vnode.beam > user@myriaknode:/usr/lib/riak/lib/basho-patches > * stop each node and restart them one at a time > * If you want to convince yourself you are using the new code, you can do a > 'riak attach' to attach to the node and run code:which('riak_kv_vnode'). > (Don't forget the '.' at the end) > > For example on my dev install here is the command before the file is in > basho-patches
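For illustration, that check from 'riak attach' looks roughly like this (the paths shown are illustrative for a 1.3.0 package install, not taken from Jared's machine):

    %% before the patch, the module loads from the release's own ebin:
    (riak@127.0.0.1)1> code:which(riak_kv_vnode).
    "/usr/lib/riak/lib/riak_kv-1.3.0/ebin/riak_kv_vnode.beam"

    %% after the new .beam is copied into basho-patches and the node restarted:
    (riak@127.0.0.1)2> code:which(riak_kv_vnode).
    "/usr/lib/riak/lib/basho-patches/riak_kv_vnode.beam"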
Re: Stats bug in riak 1.3.1
That is indeed a bug. I guess you would have seen it in 1.3.0 too, if you called the /stats endpoint. I've opened an issue here https://github.com/basho/riak_kv/issues/528 Thanks for letting us know, I'll get a patch out as soon as I can. Cheers Russell On 5 Apr 2013, at 13:15, Chris Read wrote: > Greetings all... > > It appears I've found a small bug in the stats changes introduced in 1.3.1. > > We don't use protocol buffers at all, but as of 1.3.1 we now see the > following filling up console.log: > > 2013-04-05 07:13:05.900 [warning] <0.12654.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_api,pbc_connects,active} with > exit:{noproc,{gen_server,call,[riak_api_pb_sup,count_children,infinity]}} > > The only real problem is that it's spamming the log file, but it's not really > a warning - it's expected behaviour... > > Chris > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Retrieve all keys of an index?
Hi Jeff, On 10 Apr 2013, at 02:54, Jeff Peck wrote: > Hello, > > In Riak, is it possible to retrieve all of the keys of an index? I do not > want the object keys in this case, but rather the actual index keys. I think this is covered by a feature I'm adding, if I understand what you're asking for. Does this example give you the values you want? Say in current riak the query you might do would be curl localhost:8098/buckets/your_bucket/index/catalog_bin/1000/2000, which would get you a list of keys. In 1.4 there'll be the option to get the index values as well, so the results would be an array of pairs like {"results": [{"1001": "primary_key"}]} where the first element of each pair is the value of the catalog_bin index and the second is the primary key of the object. Is that what you're asking for? Cheers Russell > > I am not sure that I am using the correct terminology, but to illustrate, > consider a Riak bucket with the following objects, where "catalog" is indexed > to "catalog_bin": > > {'url':'http://www.google.com', 'catalog':'1001'} > {'url':'http://www.yahoo.com', 'catalog':'1001'} > {'url':'http://www.blah.com', 'catalog':'1002'} > {'url':'http://www.test123.com', 'catalog':'1002'} > {'url':'http://www.test12345.com', 'catalog':'1003'} > > I would like to retrieve all of the keys for the index catalog_bin. From the > above example, that would be: > 1001 > 1002 > 1003 > > To illustrate further, it would be the equivalent of the following in MySQL, > if the above data were to be in a table called "urls": > > SELECT catalog FROM urls GROUP BY catalog; > > I would appreciate any advice as to how to query for this in Riak, or if this > is not feasible, then perhaps a suggestion for the best way to organize the data. > > Thank you, > Jeff > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
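For Erlang client users, a minimal sketch of that 1.4 query (the return_terms option and the positional shape of the index_results_v1 record are assumptions about the then pre-release client):

    {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
    %% return_terms asks for {IndexValue, PrimaryKey} pairs rather than bare keys
    {ok, {index_results_v1, _Keys, Terms, _Continuation}} =
        riakc_pb_socket:get_index_range(Pid, <<"your_bucket">>,
                                        {binary_index, "catalog"},
                                        <<"1000">>, <<"2000">>,
                                        [{return_terms, true}]),
    %% usort the index values to get the distinct index keys Jeff asked for
    Distinct = lists:usort([V || {V, _Key} <- Terms]).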
Re: Simple mapreduce with 2i returns different result
On 17 Apr 2013, at 08:54, Mattias Sjölinder wrote: > Thanks for your help. Your query returned the same number over and over again > just as expected. > > I think I have found the reason for my problem though. The client lib > CorrugatedIron seems to wrap each document in the MapReduce result into a > array turning the result into a nested array looking like: > [[{doc1}], [{doc2}], [{doc3}]] counting 3 arrays & 3 docs > and sometimes > [[{doc1}], [{doc2}, {doc3}]] counting 2 arrays and 2 docs (according to my > client code... :) > > Why the result for same query in CorrugatedIron and for a simple curl differs > I don't know but I will investigate it further. My guess is the client library streams the mapreduce results and gathers the separate messages. So depending on how many chunks are sent by riak you get a different nested array of responses. > > > Thanks again for your help! > > Best regards, > Mattias > > > > 2013/4/16 Christian Dahlqvist > Hi Mattias, > > The following curl query simply counts the number of inputs, and has worked > well for me in the past. Can you please run it against the cluster a couple > of times and see if it also return varying number of results? > > curl -XPOST http://localhost:8098/mapred > -H 'Content-Type: application/json' > -d '{"inputs":{ >"bucket":"som-bucket", >"index":"userid_bin", >"key":"18481123123" >}, >"query":[{"reduce":{"language":"erlang", >"module":"riak_kv_mapreduce", >"function":"reduce_count_inputs" >}}]}' > > Best regards, > > Christian > > > > > > On 16 Apr 2013, at 15:52, Mattias Sjölinder wrote: > >> 18481123123 > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
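As a toy illustration of that guess: a client that appends each streamed chunk as-is sees a shape that depends on how riak happened to batch the results, while flattening gives a stable count (plain Erlang, illustrative only):

    Chunk1 = [doc1], Chunk2 = [doc2, doc3],    %% two chunks from riak
    Nested = [Chunk1, Chunk2],                 %% what a naive client accumulates
    3 = length(lists:flatten(Nested)).         %% stable however the results were chunked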
Re: Reformatting 2i in Riak 1.3.1
On 30 Apr 2013, at 09:47, Daniel Iwan wrote: > When doing migration from pre-1.3.1 do I run > > riak-admin reformat-indexes [<concurrency>] [<batch size>] > > on every node that is part of the cluster or just one and then it magically > applies the change to all of them? The changelog says: > Riak 1.3.1 includes a utility, as part of riak-admin, that will perform the > reformatting of these indexes while the node is online > > which suggests I need to run it on every node. Yes, please run it on every node. > > Also can I upgrade from 1.2 to 1.3.1 or do I have to go via intermediate > upgrade 1.3.0 ? Yes, you can upgrade straight from 1.2 to 1.3.1; there is no need for an intermediate 1.3.0 step. Cheers Russell > > Thanks > Daniel > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Failed to calculate stat
Hi, That warning means that riak failed to calculate the leveldb read block error count stat. This is caused by a bug fixed in 1.3.2. The stat code picks a random vnode from 1 to num_partitions on the node and asks it for the read block error stat. If your node has one or fewer partitions, this error occurs. It is an edge case, but one that people seem to hit from time to time. As I said, fixed for 1.3.2 and 1.4, as far as I know: what version of Riak are you running when you see this? Cheers Russell On 4 Jul 2013, at 05:31, fenix.ser...@gmail.com wrote: > Hi > > What does it mean!? > > On a newly installed node, after joining the cluster, the log shows: > > > 2013-07-04 11:24:58.733 [warning] <0.4686.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:25:03.762 [warning] <0.4742.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:25:08.790 [warning] <0.5057.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:25:13.811 [warning] <0.5093.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:25:18.828 [warning] <0.5228.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:25:23.847 [warning] <0.5435.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:25:28.868 [warning] <0.5754.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:25:33.894 [warning] <0.5787.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:25:38.921 [warning] <0.6027.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:25:43.943 [warning] <0.6062.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:25:48.971 [warning] <0.6301.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:25:53.990 [warning] <0.6335.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:25:58.007 [warning] <0.6588.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:26:03.037 [warning] <0.6625.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:26:08.058 [warning] <0.6834.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:26:13.082 [warning] <0.6870.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:26:18.104 [warning] <0.7080.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat 
{riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:26:23.126 [warning] <0.7114.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > 2013-07-04 11:26:28.142 [warning] <0.7370.0>@riak_core_stat_q:log_error:123 > Failed to calculate stat {riak_kv,vnode,backend,leveldb,read_block_error} > with error:badarg > ... > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Dangling keys/objects after a batch of sequential inserts (for going on 3 days)
Hi, On 21 Jul 2013, at 02:09, Siraaj Khandkar wrote: > I (sequentially) made 146204 inserts of unique objects to a single bucket. > Several secondary indices (most with unique values) were set for each object, > one of which was "bucket" = BucketName (to use 2i for listing all keys). There is a special $bucket index for this already, please see the docs here http://docs.basho.com/riak/latest/dev/using/2i/ > > 6 of the objects appear to have been lost - they're consistently not found by > GETs (by key) and are not found by 2i queries to the indices with unique > values. Are you sure they were inserted? Was there an error during your batch insert? > > However, the "bucket" index search returns them _sometimes_. Oh. Erm. Have you deleted some keys? 2i is essentially an r=1 query. > > Now, I understand there may be a replication lag, but this state has remained > for over 3 days now. > > "What is fucked, and why?" :) Good question. Could you provide some more details to help me figure it out: How many nodes are you running? Can you provide an example of the 2i queries you're running? If this is just a dev cluster, can you verify the keys are present / absent using either a range 2i $keys query, or a key list, please? Cheers Russell > > > System info: > >OS: Ubuntu 12.04 >Riak: 1.4 >N: 3 >W: 2 > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Dangling keys/objects after a batch of sequential inserts (for going on 3 days)
On 21 Jul 2013, at 14:20, Siraaj Khandkar wrote: > On 07/21/2013 07:24 AM, Russell Brown wrote:> Hi, > > > > On 21 Jul 2013, at 02:09, Siraaj Khandkar wrote: > > > >> I (sequentially) made 146204 inserts of unique objects to a single > >> bucket. Several secondary indices (most with unique values) were set > >> for each object, one of which was "bucket" = BucketName (to use 2i > >> for listing all keys). > > > > There is a special $bucket index for this already, please see the docs > > here http://docs.basho.com/riak/latest/dev/using/2i/ > > > > Yeah... I stumbled on that piece of info in another doc about two days > ago - made me feel both stupid and validated :) > > However, it doesn't seem to work for me - I always get: {ok,{keys,[]}} Curious. How do you make the 2i query to the $bucket index? > > > >> > >> 6 of the objects appear to have been lost - they're consistently not > >> found by GETs (by key) and are not found by 2i queries to the indices > >> with unique values. > > > > Oh. Erm. Have you deleted some keys? 2i is essentially an r=1 query. > > > > Sort-of. This was a second instance of this batch insertion (a slightly > extended set of keys), the first one was deleted ~6 hours prior to > executing the second one. > > At the end of the deletion there _were_ some tombstones left. Frankly I > do not remember with certainty if there are overlaps between tombstones > from previous delete and the keys in question. In retrospect - it was > big failure on my part not to take note of those. > > After the second instance of the set insertion - there were _no_ > more deletions. > > So, in summary: > > 1) Inserted the set > 2) Deleted the set > 3) 6 hours passed > 4) Inserted the set > 5) Observed the problem What is your delete_mode setting, please (http://docs.basho.com/riak/latest/ops/advanced/configs/configuration-files/)? Did the second insert do a fetch to get a tombstone vclock before trying to overwrite the key, or a PUT with an empty vclock? > > > >> > >> Now, I understand there may be a replication lag, but this state has > >> remained for over 3 days now. > >> > >> "What is fucked, and why?" :) > > > > Good question. > > > > I was hoping this list would appreciate the reference :) > > > > Could you provide some more details to help me figure it out: How many > > nodes are you running? > > 5 > > > > Can you provide an example of the 2i queries you're running? > > This is how I am testing it: > >Compare = fun(PID, Bucket) -> >B = Bucket, >L1 = riakc_pb_socket:get_index(PID, B, {binary_index, "bucket"}, B), >L2 = riakc_pb_socket:get_index(PID, B, {binary_index, "bucket"}, B), >io:format("L1: ~b, L2: ~b~n",[length(L1), length(L2)]), >Diff_L1_L2 = L1 -- L2, >Diff_L2_L1 = L2 -- L1, >io:format("=== L1 -- L2 ===~n~p~n~n", [Diff_L1_L2]), >io:format("=== L2 -- L1 ===~n~p~n~n", [Diff_L2_L1]), >Fetch = fun(Key) -> >case riakc_pb_socket:get(PID, B, Key) of >{ok, _}-> io:format("FOUND: ~p~n", [Key]); >{error, _} -> io:format("NOT FOUND: ~p~n", [Key]) >end >end, >io:format("=== L1 -- L2 ===~n"), >lists:foreach(Fetch, Diff_L1_L2), >io:format("=== L2 -- L1 ===~n"), >lists:foreach(Fetch, Diff_L2_L1) >end. > > Which results in differences _sometimes_, but _always_ fails on get. > > > > If this is just a dev cluster, can you verify the keys are present / > > absent using either a range 2i $keys query, or a key list, please? > > > > Unfortunately this is prod, so brute-force key list is out of the > question. 
> > Running: >curl "http://127.0.0.1:8098/buckets/$bucket/index/\$keys_bin/0/z"; > > Returns: >{"keys":[]} > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: How to create a Jinterface node and cluster for Java Client Benchmark with Basho Bench
On 21 Jul 2013, at 18:34, jerryjyuan wrote: > I am trying to run some Java client benchmark testing with Basho Bench by > following the Riak document: > http://docs.basho.com/riak/latest/references/appendices/Java-Client-Benchmark/ > > The Basho Bench configuration file for this requires a JInterface > node, see below: > > %% The list of remote Java client nodes you want to bench. > %% Each entry is a tuple of the format > %% {node(), inet:ip4_address(), inet:port_number()} > %% Where host is the JInterface node and > %% ip and port form the address of the > %% Riak interface that the Java client should call > > {riakc_java_nodes, [{'java@127.0.0.1', {127,0,0,1}, 10018}]}. > > And I don't see any description of the JInterface node in the Riak online > documentation. If you have information about the document pages or the > procedure to create a JInterface node, it would be highly appreciated. Poor as it is, old and unmaintained, here is the README for the bench_shim that is mentioned on the page you linked to. https://github.com/basho/bench_shim The bench_shim is a JInterface wrapper for the RJC. I don't know if it works with the current version of the client though. Cheers Russell > > Thanks. > Jerry > > > > > -- > View this message in context: > http://riak-users.197444.n3.nabble.com/How-to-create-a-Jinterface-node-and-cluster-for-Java-Client-Benchmark-with-Basho-Bench-tp4028467.html > Sent from the Riak Users mailing list archive at Nabble.com. > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Dangling keys/objects after a batch of sequential inserts (for going on 3 days)
On 21 Jul 2013, at 19:15, Siraaj Khandkar wrote: > On 07/21/2013 04:54 PM, Russell Brown wrote: >> >> On 21 Jul 2013, at 14:20, Siraaj Khandkar wrote: >> >>> On 07/21/2013 07:24 AM, Russell Brown wrote: > Hi, >>>> >>>> On 21 Jul 2013, at 02:09, Siraaj Khandkar wrote: >>>> >>>>> I (sequentially) made 146204 inserts of unique objects to a single >>>>> bucket. Several secondary indices (most with unique values) were set >>>>> for each object, one of which was "bucket" = BucketName (to use 2i >>>>> for listing all keys). >>>> >>>> There is a special $bucket index for this already, please see the docs >>>> here http://docs.basho.com/riak/latest/dev/using/2i/ >>>> >>> >>> Yeah... I stumbled on that piece of info in another doc about two days >>> ago - made me feel both stupid and validated :) >>> >>> However, it doesn't seem to work for me - I always get: {ok,{keys,[]}} >> >> Curious. How do you make the 2i query to the $bucket index? > > Just as below, but with "$bucket" instead of "bucket": > > Index = {binary_index, "$bucket"}, > riakc_pb_socket:get_index(PID, Bucket, Index, Bucket). riakc_pb_socket:get_index(Pid, Bucket, <<"$bucket">>, Bucket). That is, the "$bucket" index is not a binary index ({binary_index, "$bucket"} expands to $bucket_bin, which is what you've been querying inadvertently). Sorry it is not better documented. >>>> >>>> Oh. Erm. Have you deleted some keys? 2i is essentially an r=1 query. >>>> >>> >>> Sort-of. This was a second instance of this batch insertion (a slightly >>> extended set of keys), the first one was deleted ~6 hours prior to >>> executing the second one. >>> >>> At the end of the deletion there _were_ some tombstones left. Frankly I >>> do not remember with certainty if there are overlaps between tombstones >>> from previous delete and the keys in question. In retrospect - it was >>> a big failure on my part not to take note of those. >>> >>> After the second instance of the set insertion - there were _no_ >>> more deletions. >>> >>> So, in summary: >>> >>> 1) Inserted the set >>> 2) Deleted the set >>> 3) 6 hours passed >>> 4) Inserted the set >>> 5) Observed the problem >> >> What is your delete_mode setting, please >> (http://docs.basho.com/riak/latest/ops/advanced/configs/configuration-files/)? >> > > It is not configured explicitly, so I am assuming the default 3 second delay. > > > > >> Did the second insert do a fetch to get a tombstone vclock before trying to >> overwrite the key, or a PUT with an empty vclock? > > > > PUT with an empty vclock. I need to look into this more, take some time to reproduce it, but I imagine it is something to do with deletes and then re-inserting the keys. I'll post when I get something for you. Cheers Russell > > >>>>> >>>>> Now, I understand there may be a replication lag, but this state has >>>>> remained for over 3 days now. >>>>> >>>>> "What is fucked, and why?" :) >>>> >>>> Good question. >>>> >>> >>> I was hoping this list would appreciate the reference :) >>> >>> >>>> Could you provide some more details to help me figure it out: How many >>>> nodes are you running? >>> >>> 5 >>> >>> >>>> Can you provide an example of the 2i queries you're running? 
>>> >>> This is how I am testing it: >>> >>>Compare = fun(PID, Bucket) -> >>>B = Bucket, >>>L1 = riakc_pb_socket:get_index(PID, B, {binary_index, "bucket"}, B), >>>L2 = riakc_pb_socket:get_index(PID, B, {binary_index, "bucket"}, B), >>>io:format("L1: ~b, L2: ~b~n",[length(L1), length(L2)]), >>>Diff_L1_L2 = L1 -- L2, >>>Diff_L2_L1 = L2 -- L1, >>>io:format("=== L1 -- L2 ===~n~p~n~n", [Diff_L1_L2]), >>>io:format("=== L2 -- L1 ===~n~p~n~n", [Diff_L2_L1]), >>>Fetch = fun(Key) -> >>>case riakc_pb_socket:get(PID, B, Key) of >>>{ok, _}-> io:format("FOUND: ~p~n", [Key]); >>>{error, _} -> io:format("NOT FOUND: ~p~n", [Key]) >>>end >>>end, >>>io:format("=== L1 -- L2 ===~n"), >>>lists:foreach(Fetch, Diff_L1_L2), >>>io:format("=== L2 -- L1 ===~n"), >>>lists:foreach(Fetch, Diff_L2_L1) >>>end. >>> >>> Which results in differences _sometimes_, but _always_ fails on get. >>> >>> >>>> If this is just a dev cluster, can you verify the keys are present / >>>> absent using either a range 2i $keys query, or a key list, please? >>>> >>> >>> Unfortunately this is prod, so brute-force key list is out of the >>> question. >>> >>> Running: >>>curl "http://127.0.0.1:8098/buckets/$bucket/index/\$keys_bin/0/z"; >>> >>> Returns: >>>{"keys":[]} >>> >> > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
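Since the thread turns on PUTs with an empty vclock over tombstones, here is a rough sketch of the safer fetch-before-write pattern: fetch first, even when the key is deleted, so the re-insert carries the tombstone's vclock rather than an empty one. The deletedvclock option is from the 1.4-era Erlang client; treat the whole snippet as illustrative:

    Obj = case riakc_pb_socket:get(Pid, B, K, [deletedvclock]) of
              {error, {notfound, VClock}} ->
                  %% a tombstone was seen: reuse its vclock on the new write
                  riakc_obj:set_vclock(riakc_obj:new(B, K, Val), VClock);
              {error, notfound} ->
                  riakc_obj:new(B, K, Val);
              {ok, Old} ->
                  riakc_obj:update_value(Old, Val)
          end,
    ok = riakc_pb_socket:put(Pid, Obj).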
RFC: CRDTs in Riak
Hi, Although I haven't finished the write up for GC, here is the RFC for CRDTs in Riak https://github.com/basho/riak/issues/354 Please let me know what you think on the Github issue. Many thanks Russell ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: 2i timeouts in 1.4
Hi Sean, I'm very sorry to say that you've found a featurebug. There was a fix put in here https://github.com/basho/riak_core/pull/332 But that means that the default timeout of 60 seconds is now honoured. In the past it was not. As far as I can see the 2i endpoint never accepted a timeout argument, and it still does not. The fix would be to add the timeout to the 2i API endpoints, and I'll do that straight away. In the meantime, I wonder if streaming the results would help, or if you'd still hit the overall timeout? Very sorry that you've run into this. Let me know if streaming helps, I've raised an issue here[1] if you want to track this bug Cheers Russell [1] https://github.com/basho/riak_kv/issues/610 On 26 Jul 2013, at 17:59, Sean McKibben wrote: > I should have mentioned that I also tried: > curl -H "X-Riak-Timeout: 26" > "http://127.0.0.1:8098/buckets/mybucket/index/test_bin/myval?timeout=26"; > -i > but still receive the 500 error below exactly at the 60 second mark. Is this > a bug? > > Secondary to getting this working at all, is this documented anywhere? and > any way to set this timeout using the ruby riak client? > > Stream may well work, but I'm going to have to make a number of changes on > the client side to deal with the results. > > Sean > > On Jul 26, 2013, at 3:53 PM, Brian Roach wrote: > >> Sean - >> >> The timeout isn't via a header, it's a query param -> &timeout= >> >> You can also use stream=true to stream the results. >> >> - Roach >> >> Sent from my iPhone >> >> On Jul 26, 2013, at 3:43 PM, Sean McKibben wrote: >> >>> We just upgraded to 1.4 and are having a big problem with some of our >>> larger 2i queries. We have a few key queries that takes longer than 60 >>> seconds (usually about 110 seconds) to execute, but after going to 1.4 we >>> can't seem to get around a 60 second timeout. >>> >>> I've tried: >>> curl -H "X-Riak-Timeout: 26" >>> "http://127.0.0.1:8098/buckets/mybucket/index/test_bin/myval?x-riak-timeout=26"; >>> -i >>> >>> But I always get >>> HTTP/1.1 500 Internal Server Error >>> Vary: Accept-Encoding >>> Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact) >>> Date: Fri, 26 Jul 2013 21:41:28 GMT >>> Content-Type: text/html >>> Content-Length: 265 >>> Connection: close >>> >>> 500 Internal Server >>> ErrorInternal Server ErrorThe server >>> encountered an error while processing this >>> request:{error,{error,timeout}}mochiweb+webmachine >>> web server >>> >>> Right at the 60 second mark. What can I set to give my secondary index >>> queries more time?? >>> >>> This is causing major problems for us :( >>> >>> Sean >>> ___ >>> riak-users mailing list >>> riak-users@lists.basho.com >>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: 2i timeouts in 1.4
For a workaround you could use streaming and pagination. Request smaller pages of data (i.e. sub 60 seconds' worth) and use streaming to get the results to your client sooner. In HTTP this would look like http://127.0.0.1:8098/buckets/mybucket/index/test_bin/myval?max_results=1&stream=true your results will include a continuation like {"continuation":"g2gCYgAAFXttBDU0OTk="} and you can use that to get the next N results. Breaking your query up that way should duck the timeout. Furthermore, adding &stream=true will mean the first result is received very rapidly. I don't think the Ruby client is up to date for the new 2i features, but you could monkeypatch as before. Cheers Russell On 26 Jul 2013, at 19:00, Sean McKibben wrote: > Thank you for looking in to this. This is a major problem for our production > cluster, and we're in a bit of a bind right now trying to figure out a > workaround in the interim. It sounds like maybe a mapreduce might handle the > timeout properly, so hopefully we can do that in the meantime. > If there is any way we can have a hotfix ASAP though, that would be > preferable. Certainly would not be a problem for us to edit a value in the > config file (and given the lack of support in the ruby client for the timeout > setting, the ability to edit the global default would be preferred). > In the ruby client I had to monkeypatch it like this to even submit the > timeout value, which is not ideal: > > module Riak > class Client >class HTTPBackend > def get_index(bucket, index, query) >bucket = bucket.name if Bucket === bucket >path = case query > when Range > raise ArgumentError, t('invalid_index_query', :value => > query.inspect) unless String === query.begin || Integer === query.end > index_range_path(bucket, index, query.begin, query.end) > when String, Integer > index_eq_path(bucket, index, query, 'timeout' => '26') > else > raise ArgumentError, t('invalid_index_query', :value => > query.inspect) > end >response = get(200, path) >JSON.parse(response[:body])['keys'] > end >end > end > end > > Thanks for the update, > Sean > > > > On Jul 26, 2013, at 4:49 PM, Russell Brown wrote: > >> Hi Sean, >> I'm very sorry to say that you've found a featurebug. >> >> There was a fix put in here https://github.com/basho/riak_core/pull/332 >> >> But that means that the default timeout of 60 seconds is now honoured. In >> the past it was not. >> >> As far as I can see the 2i endpoint never accepted a timeout argument, and >> it still does not. >> >> The fix would be to add the timeout to the 2i API endpoints, and I'll do >> that straight away. >> >> In the meantime, I wonder if streaming the results would help, or if you'd >> still hit the overall timeout? >> >> Very sorry that you've run into this. Let me know if streaming helps, I've >> raised an issue here[1] if you want to track this bug >> >> Cheers >> >> Russell >> >> [1] https://github.com/basho/riak_kv/issues/610 >> >> >> On 26 Jul 2013, at 17:59, Sean McKibben wrote: >> >>> I should have mentioned that I also tried: >>> curl -H "X-Riak-Timeout: 26" >>> "http://127.0.0.1:8098/buckets/mybucket/index/test_bin/myval?timeout=26"; >>> -i >>> but still receive the 500 error below exactly at the 60 second mark. Is >>> this a bug? >>> >>> Secondary to getting this working at all, is this documented anywhere? and >>> any way to set this timeout using the ruby riak client? >>> >>> Stream may well work, but I'm going to have to make a number of changes on >>> the client side to deal with the results. 
>>> >>> Sean >>> >>> On Jul 26, 2013, at 3:53 PM, Brian Roach wrote: >>> >>>> Sean - >>>> >>>> The timeout isn't via a header, it's a query param -> &timeout= >>>> >>>> You can also use stream=true to stream the results. >>>> >>>> - Roach >>>> >>>> Sent from my iPhone >>>> >>>> On Jul 26, 2013, at 3:43 PM, Sean McKibben wrote: >>>> >>>>> We just upgraded to 1.4 and are having a big problem with some of our >>>>&g
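For Erlang client users following the thread, the paginated workaround above looks roughly like this (the max_results/continuation option names and the positional shape of the result record are assumed from the 1.4 client):

    {ok, {index_results_v1, Keys, _Terms, Cont}} =
        riakc_pb_socket:get_index_eq(Pid, <<"mybucket">>,
                                     {binary_index, "test"}, <<"myval">>,
                                     [{max_results, 1000}]),
    %% pass the returned continuation back in to fetch the next page
    {ok, {index_results_v1, MoreKeys, _T, Cont2}} =
        riakc_pb_socket:get_index_eq(Pid, <<"mybucket">>,
                                     {binary_index, "test"}, <<"myval">>,
                                     [{max_results, 1000}, {continuation, Cont}]).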
Re: 2i timeouts in 1.4
Hi Sean, This one really is a bug with 2i pagination on Equals term queries[1]. I've fixed it already[2], I'm very sorry that this one got out into the wild. It is something that worked then stopped working at some point in a merge. It was a pretty bad oversight that we didn't have a test for this condition to catch the regression. I've added a test[3] to the suite for the issue and verified the patch. I guess there will be a 1.4.1 soon. Apologies, again. Russell [1] https://github.com/basho/riak_kv/issues/611 [2] https://github.com/basho/riak_kv/pull/612 [3] https://github.com/basho/riak_test/pull/340 On 27 Jul 2013, at 01:11, Sean McKibben wrote: > So when I try to use pagination, it doesn't seem to be picking up my > continuation. I'm having trouble parsing the json I get back using > stream=true (and there is still a timeout) so I went to just using > pagination. Perhaps I'm doing it wrong, (likely, it has been a long day) but > riak seems to be ignoring my continuation: > > (pardon the sanitization) > curl > 'http://127.0.0.1:8098/buckets/mybucket/index/test_bin/myval?max_results=5' > {"keys":["1","2","3","4","5"],"continuation":"g20AAABAMDAwMDE1ZWVjMmNiZjY3Y2Y4YmU3ZTVkMWNiZTVjM2ZkYjg2YWU0MGIwNzNjMTE3NDYyZjEzMTNlMDQ5YmI2ZQ=="} > > curl > 'http://127.0.0.1:8098/buckets/mybucket/index/test_bin/myval?max_results=5&continuation=g20AAABAMDAwMDE1ZWVjMmNiZjY3Y2Y4YmU3ZTVkMWNiZTVjM2ZkYjg2YWU0MGIwNzNjMTE3NDYyZjEzMTNlMDQ5YmI2ZQ==' > {"keys":["1","2","3","4","5"],"continuation":"g20AAABAMDAwMDE1ZWVjMmNiZjY3Y2Y4YmU3ZTVkMWNiZTVjM2ZkYjg2YWU0MGIwNzNjMTE3NDYyZjEzMTNlMDQ5YmI2ZQ=="} > > The same keys and continuation value are returned regardless of whether my > request contains a continuation value. I've tried swapping the order of > max_results and continuation without any luck. I also made sure that my > continuation value was url encoded. Hopefully I'm not missing something > obvious here. Well, come to think of it, hopefully I am missing something > obvious! > > Sean > > On Jul 26, 2013, at 6:43 PM, Russell Brown wrote: > >> For a work around you could use streaming and pagination. >> >> Request smaller pages of data (i.e. sub 60 seconds worth) and use streaming >> to get the results to your client sooner. >> >> In HTTP this would look like >> >> http://127.0.0.1:8098/buckets/mybucket/index/test_bin/myval?max_results=1&stream=true >> >> your results will include a continuation like >> >> {"continuation":"g2gCYgAAFXttBDU0OTk="} >> >> and you can use that to get the next N results. Breaking your query up that >> way should duck the timeout. >> >> Furthermore, adding &stream=true will mean the first results is received >> very rapidly. >> >> I don't think the Ruby client is up to date for the new 2i features, but you >> could monkeypatch as before. >> >> Cheers >> >> Russell >> >> On 26 Jul 2013, at 19:00, Sean McKibben wrote: >> >>> Thank you for looking in to this. This is a major problem for our >>> production cluster, and we're in a bit of a bind right now trying to figure >>> out a workaround in the interim. It sounds like maybe a mapreduce might >>> handle the timeout properly, so hopefully we can do that in the meantime. >>> If there is any way we can have a hotfix ASAP though, that would be >>> preferable. Certainly would not be a problem for us to edit a value in the >>> config file (and given the lack of support in the ruby client for the >>> timeout setting, the ability to edit the global default would be preferred). 
>>> In the ruby client i had to monkeypatch it like this to even submit the >>> timeout value, which is not ideal: >>> >>> module Riak >>> class Client >>> class HTTPBackend >>>def get_index(bucket, index, query) >>> bucket = bucket.name if Bucket === bucket >>> path = case query >>> when Range >>> raise ArgumentError, t('invalid_index_query', :value => >>> query.inspect) unless String === query.begin || Integer === query.end >>> index_range_path(bucket, index, query.begin, query.end) >>> when String, Integer >>> index_eq_path(bucket, index, query, 'timeout' => '26') >>> else >&g
Re: MapReduce notfound error in 1.4
Hi Yan, Another 2i bug in 1.4. I've raised an issue here[1]; the fix is very simple[2]. We're getting together a few fixes this week, and expect to cut a 1.4.1 very soon. This issue only affects range query inputs to MR ($key is unaffected, as are equals queries). Sorry for the trouble, fixes coming soon Cheers Russell [1] https://github.com/basho/riak_kv/issues/617 [2] https://github.com/basho/riak_kv/pull/618 On 26 Jul 2013, at 20:01, Yan Martins wrote: > Hi guys, > > I'm having this trouble after migrating to 1.4 from 1.3. I get a bunch of > notfound errors, even though I know the objects are there. > > 10> riakc_pb_socket:get_index(Conn, <<"iops">>, {integer_index, "time"}, > 1374787015, 1374787025). > {ok,{index_results_v1,[<<"C3e5e4ffc0001H0N525400f52ae2hS0-93">>, ><<"C3e5e4ffc0001H1N52540065404fhS0-92">>, ><<"C3e5e4ffc0001H2N525400847d32hS0-92">>, ><<"C3e5e4ffc0001H4N525400b54073hS0-92">>, ><<"C3e5e4ffc0001H3N525400962a0ehS0-92">>], > undefined,undefined}} > > 11> riakc_pb_socket:mapred(Conn,{index,<<"iops">>,<<"time_int">>, 1374787015, > 1374787025}, > 11> [{map, {modfun, riak_kv_mapreduce, map_identity}, undefined, true}]). > > {ok,[{0, > [{error,notfound}, >{error,notfound}, >{error,notfound}, >{error,notfound}, >{error,notfound}]}]} > > > Again, I'm sure the keys were not deleted (I can fetch them by key and there > is no X-Riak-Deleted tag). Was there any change to the PB interface that could > cause this issue? > > What am I doing wrong? Did anyone else have this problem? > > Thanks > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Secondary index reverse sort
Hi Lucas, I'm sorry, as easy as it would have been to add with the latest changes, we just ran out of time. It is something I'd love to add in future. Or maybe something a contributor could add? (Happy to advise / review.) Many thanks Russell On 31 Jul 2013, at 02:04, Lucas Cooper wrote: > I've seen that the results of secondary index queries are sorted on index > values by default. > I was wondering if there's something I'm missing that would allow me to fetch > those keys but reverse sorted. > > I have indexes based on UNIX timestamps and I'd like to grab the most recent > keys. > I'd like this query to be running on demand so I'd like to avoid MapReduce if > at all possible. > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Using MapReduce with Counters
On 2 Aug 2013, at 16:56, João Machado wrote: > Hi Sean, > > Thanks for your quick response. If I follow the steps from Sam, it works as > expected. I tried the same steps but with my own bucket (and data) and it > worked too. The difference between what I was trying and what Sam did was > that I used JavaScript and Sam used Erlang. > > Is there any trick to using JavaScript? Hi João, Not right now, sorry: the internal representation is a binary format stored in a regular riak_object, which requires the riak_kv_counter module to decode. Russell > > > []'s > > > > João Machado > joao.mach...@ideais.com.br > Tel.: 55 21 3553 1301 R: 215 > Cel.: 55 21 8124 3531 > Site: www.ideais.com.br > > > On Fri, Aug 2, 2013 at 5:09 PM, Sean Cribbs wrote: > Hi João, > > You might want to try the steps shown in Sam Elliott's "cookbook": > https://github.com/basho/riak_crdt_cookbook/blob/master/counters/README.md > > > On Fri, Aug 2, 2013 at 2:56 PM, João Machado > wrote: > Hello, > > Has anyone tried to use MR with counters? > > I'm trying with the following steps: > > Increment the counter: > -> curl -X POST http://localhost:8098/buckets/BUCKET/counters/MY_COUNTER -d 1 > > Confirm the actual value: > -> curl http://localhost:8098/buckets/BUCKET/counters/MY_COUNTER > 1 > > Execute mapreduce: > -> curl -X POST -H "content-type: application/json" > http://127.0.0.1:8098/mapred -d @map_red.js > {"error":"bad_utf8_character_code"} > > Where map_red.js is: > > { > "inputs":"BUCKET", > "query":[ >{"map":{"language":"javascript","name":"Riak.mapValues"}} >,{"reduce":{"language":"javascript","name":"Riak.reduceSum"}} > > ] > } > > > > > > []'s > > > > João Machado > joao.mach...@ideais.com.br > Tel.: 55 21 3553 1301 R: 215 > Cel.: 55 21 8124 3531 > Site: www.ideais.com.br > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > > -- > Sean Cribbs > Software Engineer > Basho Technologies, Inc. > http://basho.com/ > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
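For the Erlang-inclined, a hedged sketch of what a counter-aware map phase looks like, assuming riak_kv_counter:value/1 is the server-side decode function (as used in Sam's cookbook):

    %% map phase: decode the counter riak_object to its integer value
    map_counter_value(RObj, _KeyData, _Arg) ->
        [riak_kv_counter:value(RObj)].

    %% a reduce phase of {modfun, riak_kv_mapreduce, reduce_sum} then totals them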
Re: Riak Crash
Are you using riak search? On 9 Aug 2013, at 18:17, Lucas Cooper wrote: > I had a crash of an entire cluster early this morning, I'm not entirely sure > why, seems to be something with the indexer or Protocol Buffers (or my use of > them). Here are the various logs: http://egf.me/logs.tar.xz > > Any idea what's up? All this crashing is kinda putting me off Riak ._. > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Invoking Python script as part of a M/R job?
Not that it answers your immediate need, but I thought I'd point you at this post where Brian Lee Yung Rowe attempts to integrate Riak MapReduce and R. http://cartesianfaith.com/2011/08/17/teaser-running-r-as-a-mapreduce-job-from-riak/ On 26 Sep 2013, at 05:33, jeffrey k eliasen wrote: > I'm trying to do some image processing using OpenCV. Later I'll be doing some > video processing as well. In a future project I will be using R to do deep > analysis on some data I'm collecting. In all these cases, what I want to do > is very simple with external languages but very hard with both Erlang and > Javascript. > > What I want to do is simply invoke an external script on each element in a > bucket in the general case so that I can use advanced external tools in an > arbitrary manner. I was told by someone at Basho a long time ago (about a > year, which is a long time in internet years) that this could be done by > invoking scripts from Erlang, but I haven't heard back from him since then > and was hoping someone on the list could point me at an example demonstrating > this. > > -- > > jeffrey k eliasen > > Find and follow me on: > Blog: http://jeff.jke.net > Twitter: http://twitter.com/jeffreyeliasen > Facebook: http://facebook.com/jeffrey.eliasen > > On Sep 26, 2013, at 10:41 , Luke Bakken wrote: > >> Hi Jeff, >> >> Erlang and Javascript are the only two supported "in process" >> languages for map/reduce. Can you explain the process or provide the >> python you want to use and perhaps someone on the list could help out >> translating it to erlang? >> -- >> Luke Bakken >> CSE >> lbak...@basho.com >> >> >> On Wed, Sep 25, 2013 at 4:47 PM, Jeffrey Eliasen wrote: >>> I would like to write a job that invokes a python script to execute the >>> processing of each node in a bucket. I can't find a way to do this using >>> Javascript and I don't really know Erlang well enough to make this work... >>> is there a sample piece of code somewhere that demonstrates this? >>> >>> Thanks in advance! >>> >>> jeff >>> >>> ___ >>> riak-users mailing list >>> riak-users@lists.basho.com >>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >>> > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
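If you do go the Erlang route, the shell-out itself is a one-liner. A minimal sketch (the script path is hypothetical, the value is not escaped, and os:cmd spawns a shell per object, so expect it to be slow; a port program would be more robust):

    map_external(RObj, _KeyData, _Arg) ->
        Val = binary_to_list(riak_object:get_value(RObj)),
        %% pipe the object value through an external script and collect stdout
        [os:cmd("printf '%s' '" ++ Val ++ "' | /usr/local/bin/process.py")].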
Re: are Siblings ordered?
I'd very much like to see the same thing. I have a working branch and test here https://github.com/basho/riak_kv/pull/688 and https://github.com/basho/riak_test/tree/feature/rdb-sib-ex This isn't using the DVVSets but a sort of rough hack, where we store the event dot for each write in the metadata dictionary for that value. As you can see from the test it works. There is some work to be done to productionise it, I really hope I get time for that. Feel free to pitch in (the riak_object tests that are broken would be the first place to help.) My only concern is with client id vclocks (which are still supported in Riak.) Cheers Russell On 8 Oct 2013, at 05:40, Pedram Nimreezi wrote: > I'd like to see the dotted version vectors get launched; even with merging > siblings, high concurrency = a high amount of unnecessary siblings, > which wastes space and impacts performance. Dotted version vectors help > improve this. Proof of concept and paper available... > > On Oct 7, 2013 3:06 PM, "Sam Elliott" wrote: > Each sibling has its own last-modified date, which you should be able to use > to sort them when you get them back. > > However, I'd suggest the following: if they're siblings, they were created > from concurrent edits. Thus, create a merge function that is entirely > deterministic without using the timestamp. This should save you from any > clock skew issues, and also from the fact that an edit may not have been > performed with the most up-to-date information. > > I guess people can't wait for our CRDTs to launch. > > Sam > > -- > Sam Elliott > Engineer > sam.elli...@basho.com > -- > > > On Monday, 7 October 2013 at 2:47PM, Alex Rice wrote: > > > Yes, exactly that's what I'm working on. By knowing which sibling is > > the oldest and which is the newest it seems like I can usually figure > > out how to merge/apply the modifications. Things like Player profiles, > > friends lists, etc. > > > > On Mon, Oct 7, 2013 at 12:43 PM, Jeremiah Peschka wrote: > > > There's no guarantee of return order as far as I know. Since you can't > > > count > > > on clocks anyway... > > > > > > Are you trying to determine which data modifications to apply from > > > multiple > > > siblings? > > > > > > --- > > > sent from a tiny portion of the hive mind... > > > in this case, a phone > > > > > > On Oct 7, 2013, at 11:40 AM, Alex Rice wrote: > > > > > > > > Are they ordered by timestamp, and is the ordering guaranteed? (within > > > > the clock accuracy of course). Using the C# client Thanks, > > > > Alex > > > > > > > > ___ > > > > riak-users mailing list > > > > riak-users@lists.basho.com > > > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > > > > ___ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
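To illustrate Sam's "deterministic merge" advice with Alex's friends-list case, a toy resolver that unions the sibling values without looking at timestamps, so every ordering of siblings merges to the same result (illustrative only; assumes each sibling value is a term_to_binary'd list):

    merge_friends(RObj) ->
        Lists = [binary_to_term(V) || V <- riakc_obj:get_values(RObj)],
        %% usort each sibling's list, then union them: the outcome is
        %% independent of the order the siblings arrive in
        lists:umerge([lists:usort(L) || L <- Lists]).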
Re: Read Before Writes on Distributed Counters
Hi Wes, The client application does not need to perform a read before a write; the riak server must read from disk before updating the counter. Or at least it must with our current implementation. What PRs did you have in mind? I'm curious. Oh, it looks like Sam beat me to it…to elaborate on his "not idempotent" line, that means when riak tells you "error" for some counter increment, it may only be a partial failure, and re-running the operation may lead to overcounting. Cheers Russell On 17 Oct 2013, at 16:03, Weston Jossey wrote: > In the context of using distributed counters (introduced in 1.4), is it > strictly necessary to perform a read prior to issuing a write for a given key? > A la, if I want to blindly increment a value by 1, regardless of what its > current value is, is it sufficient to issue the write without previously > having read the object? > > I ask because looking at some of the implementations for counters in the open > source community, it's common to perform a read before a write, which impacts > performance ceilings on clusters with high volume reads / writes. I want to > verify before issuing some PRs that this is in fact safe behavior. > > Thank you! > -Wes Jossey > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
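For reference, the blind increment from the Erlang client is a single call with no application-side read (counter_incr/4 and counter_val/3 are the 1.4-era function names, assumed here; the Go and Ruby clients should mirror them):

    %% riak does the disk read internally; the client just sends the delta
    ok = riakc_pb_socket:counter_incr(Pid, <<"mybucket">>, <<"hits">>, 1),
    {ok, Value} = riakc_pb_socket:counter_val(Pid, <<"mybucket">>, <<"hits">>).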
Re: Read Before Writes on Distributed Counters
I have some from a while back; if I can find my graphs I'll put them up somewhere. Cheers Russell On 17 Oct 2013, at 16:35, Weston Jossey wrote: > Great everyone, thank you. > > @Russell: I specifically work with either Go > (https://github.com/tpjg/goriakpbc) or Ruby (basho client). I haven't tested > the ruby client, but I'd assume it will perform the write without the read > (based on my reading of the code). The Go library, on the other hand, > currently always performs a read prior to the write. It's an easy patch that > I've already applied locally for benchmarking, I just didn't want to submit > the PR till I was sure this was the correct behavior. > > Somewhat off topic, but I don't want to open up another thread if it's > unnecessary. This question arose because I've been doing extensive > benchmarking around distributed counters. Are there pre-existing benchmarks > out there that I can measure myself against? I haven't stumbled across many > at this point, probably because of how new it is. > > Cheers, > Wes > > > On Thu, Oct 17, 2013 at 10:21 AM, Russell Brown wrote: > Hi Wes, > > The client application does not need to perform a read before a write, the > riak server must read from disk before updating the counter. Or at least it > must with our current implementation. > > What PRs did you have in mind? I'm curious. > > Oh, it looks like Sam beat me to it…to elaborate on his "not idempotent" > line, that means when riak tells you "error" for some counter increment, it > may only be a partial failure, and re-running the operation may lead to over > counting. > > Cheers > > Russell > > On 17 Oct 2013, at 16:03, Weston Jossey wrote: > > > In the context of using distributed counters (introduced in 1.4), is it > > strictly necessary to perform a read prior to issuing a write for a given > > key? A la, if I want to blindly increment a value by 1, regardless of what > > its current value is, is it sufficient to issue the write without > > previously having read the object? > > > > I ask because looking at some of the implementations for counters in the > > open source community, it's common to perform a read before a write, which > > impacts performance ceilings on clusters with high volume reads / writes. > > I want to verify before issuing some PRs that this is in fact safe behavior. > > > > Thank you! > > -Wes Jossey > > ___ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Read Before Writes on Distributed Counters
Hi Daniil, On 17 Oct 2013, at 16:55, Daniil Churikov wrote: > Correct me if I am wrong, but when you blindly do an update without a previous read, > you create a sibling, which must be resolved on read. If you make > a lot of increments to a counter and rarely read it, that will lead to a sibling > explosion. > > I am not familiar with the new counter datatypes, so I am curious. The counters in riak 1.4 are the first of a few data types we are building. The main change, conceptually, is that Riak knows about the type of the data you're storing in a counter. Riak already detects conflicting writes (writes that are causally concurrent), but doesn't know how to merge your data to a single value; instead it presents all the conflicting values to the client to resolve. However, in the case of a counter Riak _does_ know the meaning of your data and we're using a data type that can automatically merge to a correct value. There is code running on Riak that will automatically merge counter siblings on write. And if siblings are detected on read, they are merged so that a single value is presented to the client application. I think Sean Cribbs has replied faster than me this time, and he's hinted at how the data type is implemented. Cheers Russell > > > > -- > View this message in context: > http://riak-users.197444.n3.nabble.com/Read-Before-Writes-on-Distributed-Counters-tp4029492p4029498.html > Sent from the Riak Users mailing list archive at Nabble.com. > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Read Before Writes on Distributed Counters
On 17 Oct 2013, at 17:21, Jeremiah Peschka wrote: > When you 'update' a counter, you send in an increment operation. That's added > to an internal list in Riak. The operations are then zipped up to provide the > correct counter value on read. The worst that you'll do is add a large(ish) > number of values to the op list inside Riak. Just to borrow some Cribbs-brand pedantry here: that isn't true. We read the data from disk, increment an entry in what is essentially a version vector, and write it back (then replicate the result to N-1 vnodes). The size of the counter depends on the number of actors that have incremented it (typically N), not the number of operations. > > Siblings will be created, but they will not be visible to the end user who is > reading from the counter. There won't be siblings on disk (we do create a temporary one in memory, does that count?) _unless_ 1. you also write an object to that same key in a normal riak kv way (don't do that) 2. AAE or MDC cause a sibling to be created (this is because we use the operation of incrementing a counter to identify a key as a counter; to the rest of riak it is just a riak object) In that last case, an increment operation to the key will resolve the sibling(s). Cheers Russell > > Check out this demo of the new counter types from Sean Cribbs: > https://vimeo.com/43903960 > > --- > Jeremiah Peschka - Founder, Brent Ozar Unlimited > MCITP: SQL Server 2008, MVP > Cloudera Certified Developer for Apache Hadoop > > > On Thu, Oct 17, 2013 at 9:55 AM, Daniil Churikov wrote: > Correct me if I am wrong, but when you blindly do an update without a previous read, > you create a sibling, which must be resolved on read. If you make > a lot of increments to a counter and rarely read it, that will lead to a sibling > explosion. > > I am not familiar with the new counter datatypes, so I am curious. > > > > -- > View this message in context: > http://riak-users.197444.n3.nabble.com/Read-Before-Writes-on-Distributed-Counters-tp4029492p4029498.html > Sent from the Riak Users mailing list archive at Nabble.com. > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
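To make the "version vector, not an op list" point concrete, here is a toy counter merge in the same spirit (illustrative only; not riak_kv_counter's actual code, and it only grows, so no decrements):

    %% one {Actor, Count} entry per actor; merging takes the per-actor max,
    %% so the structure grows with the number of actors (typically N),
    %% not with the number of increment operations
    merge(C1, C2) ->
        orddict:merge(fun(_Actor, A, B) -> max(A, B) end, C1, C2).

    value(C) ->
        lists:sum([Count || {_Actor, Count} <- orddict:to_list(C)]).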
Re: secondary indexes limit
Hi Louis-Philippe, It costs to create secondary indexes. Nothing is free. But I'm not sure what "1000 different secondary indexes" means. When you add secondary indexes we store the name of the index and the value it indexes as part of the object metadata and on disk in an index. I'm sure our ProServe and CSE folks could give better examples, but one customer I've worked with stores ~20 indexes per object, and has ~60 million objects. If you mean 1000s of different indexes per key, yes, that would make reading and writing those keys slower, and would add some overhead to communications channels. If you mean 1000s of indexes across your dataset, not an issue; if you mean 1000s of index entries, not an issue. Does that answer your question? The answer to your previous question, "how many index keys?", depends on how big your cluster is, how big your disks are, and all the things that determine how much data you can store, as indexes are just data too. Would you care to share some more of your use case or intended use so I can help you decide if Riak is suitable? Cheers Russell On 21 Oct 2013, at 15:13, Louis-Philippe Perron wrote: > I had heard from a possibly unfounded source that creating over 1000 > different secondary indexes could place a burden on cluster performance. Can > anyone confirm that? > > L-P > > > On Fri, Oct 18, 2013 at 5:55 PM, Alexander Sicular wrote: > I think that's just a memory limit. > > mo 2i mo problems. > > > @siculars > http://siculars.posthaven.com > > Sent from my iRotaryPhone > > > On Oct 18, 2013, at 16:44, Louis-Philippe Perron wrote: > > > > Hi, > > this question has probably been answered already, still I can't get to it, > > so, > > > > What is the maximum number of unique secondary index keys a riak cluster > > can manage before running into trouble? > > > > thanks! > > > > L-P > > ___ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Updating Counters on Riak 2.0 pre whatever
Yes! Sorry! We broke backwards compatibility during development. We merged the patch today[1]. The develop branch works right now. It will get into the next pre (or whatever the next tag is called.) Apologies again, it was easier for me to do the new work without thinking about backwards compatibility, then re-add old counters when the 2.0 stuff was done. Let me know if you have any more issues going forward, please. Cheers Russell [1] https://github.com/basho/riak_kv/pull/697 On 22 Oct 2013, at 19:56, Jeremiah Peschka wrote: > I'm attempting to create a counter on Riak 2.0 built from the develop on > Sunday. When I send a counter increment message using RpbCounterUpdateReq, I > get the following back from Riak: > > Riak returned an error. Code '0'. Message: Error processing incoming message: > error:undef:[{riak_kv_counter,supported, > [],[]}, > {riak_kv_pb_counter,process, > 2, > [{file, > > "src/riak_kv_pb_counter.erl"}, > {line,114}]}, > {riak_api_pb_server, > process_message,4, > [{file, > > "src/riak_api_pb_server.erl"}, > {line,386}]}, > {riak_api_pb_server, > connected,2, > [{file, > > "src/riak_api_pb_server.erl"}, > {line,228}]}, > {riak_api_pb_server, > decode_buffer,2, > [{file, > > "src/riak_api_pb_server.erl"}, > {line,362}]}, > {gen_fsm,handle_msg,7, > [{file,"gen_fsm.erl"}, > {line,505}]}, > {proc_lib,init_p_do_apply,3, > [{file,"proc_lib.erl"}, > {line,239}]}] > --- > Jeremiah Peschka - Founder, Brent Ozar Unlimited > MCITP: SQL Server 2008, MVP > Cloudera Certified Developer for Apache Hadoop > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Updating Counters on Riak 2.0 pre whatever
On 22 Oct 2013, at 20:24, Jeremiah Peschka wrote: > Hah! No worries at all. Thanks for the clarification. I'll rebuild. > > I wasn't sure if I needed to try to figure out (from a client perspective) if > I needed to use the CRDT syntax for 2.0 and newer and RpbCounterUpdateReq for > 1.4.x and earlier. Good to know that I can blindly proceed. Ok, this is a good question. And something we've been working on. 1.4 had counters, 2.0 has counters. Are they the same? Yes and no. In 2.0 we add bucket types. We decided that CRDTs in 2.0 should take advantage of bucket types. The reason being that there is no sensible way to merge across types (how do you merge a Map with a Set? (we thought of a few ways, but none were intuitive)) It comes down to the reason for CRDTs: no conflicts, no siblings. If we allow different types in the same bucket then you can create the same key with two different types, which means sibling types: just as complex as siblings. So, we restrict you to one CRDT type per bucket. But in 1.4 we didn't have this restriction. And we need to support 1.4 style counters. If you use the 1.4 API or store counters in the default (untyped) bucket using the 2.0 API then you're interoperable between versions. Sorry if it seems confusing. The point is that as a client developer you can access 1.4 counters through the 2.0 API by using the default bucket type OR continue support for the 1.4 API (or both.) Whichever you prefer. I realise this might be an information overload, so feel free to ask questions about anything I wasn't clear enough about. Cheers Russell > > --- > Jeremiah Peschka - Founder, Brent Ozar Unlimited > MCITP: SQL Server 2008, MVP > Cloudera Certified Developer for Apache Hadoop > > > On Tue, Oct 22, 2013 at 2:07 PM, Russell Brown wrote: > Yes! Sorry! > > We broke backwards compatibility during development. We merged the patch > today[1]. The develop branch works right now. It will get into the next pre > (or whatever the next tag is called.) > > Apologies again, it was easier for me to do the new work without thinking > about backwards compatibility, then re-add old counters when the 2.0 stuff > was done. > > Let me know if you have any more issues going forward, please. > > Cheers > > Russell > > [1] https://github.com/basho/riak_kv/pull/697 > On 22 Oct 2013, at 19:56, Jeremiah Peschka wrote: > > > I'm attempting to create a counter on Riak 2.0 built from the develop on > > Sunday. When I send a counter increment message using RpbCounterUpdateReq, > > I get the following back from Riak: > > > > Riak returned an error. Code '0'. Message: Error processing incoming > > message: error:undef:[{riak_kv_counter,supported,[],[]}, > > {riak_kv_pb_counter,process,2, > > [{file,"src/riak_kv_pb_counter.erl"},{line,114}]}, > > {riak_api_pb_server,process_message,4, > > [{file,"src/riak_api_pb_server.erl"},{line,386}]}, > > {riak_api_pb_server,connected,2, > > [{file,"src/riak_api_pb_server.erl"},{line,228}]}, > > {riak_api_pb_server,decode_buffer,2, > > [{file,"src/riak_api_pb_server.erl"},{line,362}]}, > > {gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,505}]}, > > {proc_lib,init_p_do_apply,3, > >
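A sketch of the two routes from the Erlang client's point of view. The 1.4 calls below are the released API; the 2.0 names are taken from the pre-release client and should be treated as assumptions that may change before the final release:

    %% 1.4-style counter API (also reaches 1.4 counters stored in the
    %% default, untyped bucket on a 2.0 cluster):
    ok = riakc_pb_socket:counter_incr(Pid, <<"scores">>, <<"game42">>, 5),
    {ok, N} = riakc_pb_socket:counter_val(Pid, <<"scores">>, <<"game42">>),

    %% 2.0-style data type API against a bucket type created with
    %% datatype = counter (bucket type and names are illustrative):
    C = riakc_counter:increment(5, riakc_counter:new()),
    ok = riakc_pb_socket:update_type(Pid, {<<"counters">>, <<"scores">>},
                                     <<"game42">>, riakc_counter:to_op(C)).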
Re: Import big data to Riak
Hi Georgi, All Guido’s (below) advice is good. If you are just importing unique items, I would set the bucket property to LWW=true for the import, it will be much faster since Riak will not do N local reads for vclock data. Cheers Russell On 29 Oct 2013, at 15:21, Guido Medina wrote: > Your tests are not close to what you are going to have in production IMHO, > here are few recommendations: > • Build a cluster with at least 5 nodes with N=3 and R=W=2 (You can > update your bucket properties via PBC with Java) > • Use PBC instead of HTTP. > • If you are only importing data call > .store()withoutFetch().execute() to avoid unnecessary roundtrips. > If you test using unrealistic scenarios you will find unpleasant surprises > when you are about to be go live so better to set your expectations right at > the beginning. > HTH, > Guido. > On 29/10/13 14:59, Georgi Ivanov wrote: >> Hello, >> I am importing some big data to Riak. >> I am importing like 10GB per day and i have to import one year of data. >> The task is to speed up the initial import. After that i will import on >> daily >> basis, so the speed is not very important. >> >> I am using JAVA HTTP client. So far my test show that the fastest setup is >> to >> use n_val 1 and import to single server. >> >> I tested importing on 2 servers (with n_val:2), but it is actually slower. >> My JAVA client is multi-threaded. >> >> My idea is to use n_val:1 on single node, then increase the n_val:2 and add >> one more node to the cluster. The problem is that i don't see the storage to >> grow when i change n_val : 2 >> I was looking at Riak Active Anti-Entropy feature and i am expecting my >> storage to grow after i increase the n_val. Unfortunately this is not the >> case >> or i don't understand AAE feature >> I can't any changes in storage size at all. I don't want to go in direction >> of >> force repair as it would take forever. >> >> Can anyone shed some light on AAE ? Or any tips for speeding up the import >> in >> general. >> >> To summarize the situation : >> 1. One Riak node with n_val : 1 , eLevelDb as back-end >> 2. Import data. >> 3. Change n_val to 2 >> 4. Join one more node to the cluster. >> >> What i expect to happen : >> To have all the keys distributed to 2 riak nodes with n_val:2 >> So if i had 1TB of data on node1 with n_val:1 , after changing to n_val 2 >> and >> joining one more node, to have 1TB of data on each node. >> >> >> ___ >> riak-users mailing list >> >> riak-users@lists.basho.com >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
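In Erlang client terms (the thread's examples are Java), the suggestion amounts to something like the following. The bucket name is invented, and it assumes a 1.4+ client and server where last_write_wins can be set over protocol buffers; on older versions set it via the HTTP bucket props API instead:

    %% One-time setup: skip the per-put local read for vclock data.
    %% Only safe when every key is written once, as in this import.
    setup(Pid) ->
        riakc_pb_socket:set_bucket(Pid, <<"imports">>,
                                   [{last_write_wins, true}, {n_val, 3}]).

    %% A plain put per item, no fetch beforehand (the Erlang equivalent
    %% of the Java client's withoutFetch()):
    store_item(Pid, Key, Value) ->
        riakc_pb_socket:put(Pid,
            riakc_obj:new(<<"imports">>, Key, Value, "application/json")).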
Re: Forcing Siblings to Occur
Hi Mark, It is pretty easy. Set your bucket to allow_mult=true. Send a put to bucket, key. Send another one to the same bucket/key. If you’re using a well-behaved client like the Riak-Java-Client, or any other that gets a vclock before doing a put, use whatever option disables that fetch. With the pb client it is as simple as: https://gist.github.com/russelldb/46153e2ab9d2b9206f63 Hope that helps Russell On 8 Nov 2013, at 18:29, Mark A. Basil, Jr. wrote: > Is there some method that is either guaranteed or very highly likely to > create Siblings of an object (that isn’t a counter)? I would like to have a > reliable method to test code which is meant to handle them. > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
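For the archive, the gist above boils down to something like this with the Erlang PB client (bucket, key and values invented):

    %% allow_mult=true, then two puts with no vclock = siblings.
    ok = riakc_pb_socket:set_bucket(Pid, <<"b">>, [{allow_mult, true}]),
    ok = riakc_pb_socket:put(Pid, riakc_obj:new(<<"b">>, <<"k">>, <<"v1">>)),
    ok = riakc_pb_socket:put(Pid, riakc_obj:new(<<"b">>, <<"k">>, <<"v2">>)),

    %% The next read carries both values:
    {ok, O} = riakc_pb_socket:get(Pid, <<"b">>, <<"k">>),
    2 = riakc_obj:value_count(O),
    [<<"v1">>, <<"v2">>] = lists:sort(riakc_obj:get_values(O)).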
Re: Multiple named counters per key
Riak 2.0 will support this with the new Map data type. Until then, I’m afraid a Key is a counter or a Key is a riak object. On 8 Nov 2013, at 19:49, Mark A. Basil, Jr. wrote: > Just a thought. It would be handy if one could add many named counters per > buket/key to more closely handle a shopping cart scenario. Otherwise, unless > I’m mistaken, one would have to have a bucket that would be the “cart” and > keys which are the “items”. That doesn’t make a lot of sense to me. > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
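For what the Map version of this might look like: a sketch against the pre-release 2.0 Erlang client, with one counter per named item in a single "cart" key. All the names here, and the API itself, were still subject to change at the time:

    %% A cart is one Map; each item is a named counter inside it.
    Cart0 = riakc_map:new(),
    Cart1 = riakc_map:update({<<"apples">>, counter},
                fun(C) -> riakc_counter:increment(2, C) end, Cart0),
    Cart2 = riakc_map:update({<<"pears">>, counter},
                fun(C) -> riakc_counter:increment(1, C) end, Cart1),
    ok = riakc_pb_socket:update_type(Pid, {<<"maps">>, <<"carts">>},
                                     <<"customer-1">>, riakc_map:to_op(Cart2)).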
Re: Forcing Siblings to Occur
On 13 Nov 2013, at 10:03, Carlos Baquero wrote: > > Its interesting to see a use case where a grow only set is sufficient. I > believe Riak 2.0 will offer optimized OR-Sets that allow item removal at the > expense of some extra complexity in element storage and logarithmic metadata > growth per operation. But for your case a simple direct set of elements with > server side merge by set union looks perfect. Its not efficient at all to > keep all those siblings if a simple server side merge can reduce them. > > Maybe it is a good idea to not overlook the potential usefulness of simple > grow only sets and add that datatype to the 2.0 server side CRDTs library. > And maybe even 2P-Sets that only allow deleting once, might be useful for > some cases. We plan to add more data types in future, but I don’t think they’ll make it into 2.0. You can use an ORSet as a G-Set, though, just only ever add to it. The overhead is pretty small. The difficulty is exposing different “flavours” of CRDTs in a non-confusing way. We chose to go with the name “data type” and name the implementations generically (set, map, counter.) I wonder if we painted ourselves into a corner. Cheers Russell > > Regards, > Carlos > > - > Carlos Baquero > HASLab / INESC TEC & > Universidade do Minho, > Portugal > > c...@di.uminho.pt > http://gsd.di.uminho.pt/cbm > > > > > > On 12/11/2013, at 22:10, Jason Campbell wrote: > >> I am currently forcing siblings for time series data. The maximum bucket >> sizes are very predictable due to the nature of the data. I originally used >> the get/update/set cycle, but as I approach the end of the interval, reading >> and writing 1MB+ objects at a high frequency kills network bandwidth. So >> now, I append siblings, and I have a cron that merges the previous siblings >> (a simple set union works for me, only entire objects are ever deleted). >> >> I can see how it can be dangerous to insert siblings, bit if you have some >> other method of knowing how much data is in one, I don't see size being an >> issue. I have also considered using a counter to know how large an object is >> without fetching it, which shouldn't be off by more than a few siblings >> unless there is a network partition. >> >> So aside from size issues, which can be roughly predicted or worked around, >> is there any reason to not create hundreds or thousands of siblings and >> resolve them later? I realise sets could work well for my use case, but they >> seem overkill for simple append operations when I don't need delete >> functionality. Creating your own CRDTs are trivial if you never need to >> delete. >> >> Thoughts are welcome, >> Jason >> From: John Daily >> Sent: Wednesday, 13 November 2013 3:10 AM >> To: Olav Frengstad >> Cc: riak-users >> Subject: Re: Forcing Siblings to Occur >> >> Forcing siblings other than for testing purposes is not typically a good >> idea; as you indicate, the object size can easily become a problem as all >> siblings will live inside the same Riak value. >> >> Your counter-example sounds a lot like a use case for server-side CRDTs; >> data structures that allow the application to add values without retrieving >> the server-side content first, and siblings are resolved by Riak. >> >> These will arrive with Riak 2.0; see >> https://gist.github.com/russelldb/f92f44bdfb619e089a4d for an overview. >> >> -John >> >> On Nov 12, 2013, at 7:13 AM, Olav Frengstad wrote: >> >>> Do you consider forcing siblings a good idea? I would like to get some >>> input on possible use cases and pitfalls. 
>>> For instance i have considered to force siblings and then merge them on >>> read instead of fetching an object every time i want to update it >>> (especially with larger objects). >>> >>> It's not clear from the docs if there are any limitations, will the maximum >>> object size be the limitation:? >>> >>> A section of the docs[1] comes to mind: >>> >>> "Having an enormous object in your node can cause reads of that object to >>> crash the entire node. Other issues are increased cluster latency as the >>> object is replicated and out of memory errors." >>> >>> [1] >>> http://docs.basho.com/riak/latest/theory/concepts/Vector-Clocks/#Siblings >>> >>> 2013/11/9 Brian Roach >>> On Fri, Nov 8, 2013 at 11:38 AM, Russell Brown wrote: >>> >>>> If you’re using a well b
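Jason's scheme above -- append siblings, union them later -- is a grow-only set by another name. A minimal client-side resolver, assuming each sibling value is a term_to_binary'd ordset (module and function names are invented for the sketch):

    -module(gset_resolve).
    -export([resolve/1]).

    %% Union all sibling values into one set; safe because elements
    %% are only ever added, never removed.
    resolve(SiblingValues) ->
        Sets = [binary_to_term(B) || B <- SiblingValues],
        lists:foldl(fun ordsets:union/2, ordsets:new(), Sets).

Called as gset_resolve:resolve(riakc_obj:get_values(Obj)) after a fetch, before writing the merged value back.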
Re: Degraded response times with massive increase in Erlang VM process memory use
Hi Dave, Are you sure that "No queries are running”? The log you posted shows the index coverage fsm running as well as the streaming merge sort buffer. My guess would be you have some (many?) 2i queries with large page_size set on the results and a slow vnode causing all the results to be buffered in memory. Can you check again that your regular workload doesn’t include a bunch of 2i queries with large result sets? Cheers Russell On 14 Nov 2013, at 21:32, Dave Brady wrote: > Hi Luke, > > Thanks for responding! I've been unavailable most of the day, hence my late > reply. > > I'll gather up the those logs (tomorrow). > > No queries are running, and no one has tried to getting the a key list. We > restarted our programs to clear any connections they had to the slow nodes > after disabling those nodes in haproxy. > > One node has slowly, over the last five hours, started to head back to > normal. Peak usage is down to 5.2 GB. > > The other node has gotten even worse. It's now ranging from 6.5 GB to 23 GB. > > -- > Dave Brady > > From: "Luke Bakken" > To: "Dave Brady" > Cc: "riak-users" > Sent: Jeudi 14 Novembre 2013 18:14:43 > Subject: Re: Degraded response times with massive increase in Erlang VM > process memory use > > Hi Dave, > > A few people have chimed in to ask what kinds of queries are running / have > been run recently against this cluster - Map/Reduce, list keys, 2i? > > -- > Luke Bakken > CSE > lbak...@basho.com > > > On Thu, Nov 14, 2013 at 1:56 AM, Dave Brady wrote: > Hello Everyone, > > Two of our five nodes seeing the 100% GET/PUT times (node_[get | > put]_fsm_time_100) increase to as high as 8 seconds, and looking at our > available metrics we see huge amounts memory being used by Erlang processes > (memory_processed_used). > > We normally see Erlang processes use tens of MBs, and occasionally a few > hundred MBs for short periods. One node is now using between 5.2 GB to 18.5 > GB. The other one is just little lower: 4 GB to 14 GB. > > Our average object size is roughly 25 KB. > > The logs on these two nodes have lots of: > > 2013-11-14 09:26:03.961 [info] > <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc > <0.19362.2932> > [{initial_call,{riak_core_coverage_fsm,init,1}},{almost_current_function,{sms,sms,1}},{message_queue_len,41}] > > [{timeout,5356},{old_heap_block_size,0},{heap_block_size,870001580},{mbuf_size,0},{stack_size,54},{old_heap_size,0},{heap_size,336260414}] > 2013-11-14 09:26:03.961 [info] > <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap > <0.19362.2932> > [{initial_call,{riak_core_coverage_fsm,init,1}},{almost_current_function,{sms,sms,1}},{message_queue_len,41}] > > [{old_heap_block_size,0},{heap_block_size,870001580},{mbuf_size,0},{stack_size,54},{old_heap_size,0},{heap_size,336260414}] > 2013-11-14 09:26:03.968 [error] <0.3205.3273> CRASH REPORT Process > <0.3205.3273> with 0 neighbours crashed with reason: no function clause > matching webmachine_request:peer_from_peername({error,enotconn}, > {webmachine_request,{wm_reqstate,#Port<0.38822194>,[],undefined,undefined,undefined,{wm_reqdata,...},...}}) > line 150 > > Anyone seen this before? 
> -- > Dave Brady > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
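If large 2i result sets are the culprit, paging bounds the buffering Russell describes. With a 1.4+ Erlang client it looks roughly like this (bucket and index invented; the ?INDEX_RESULTS record macro is assumed to come from riakc.hrl):

    -include_lib("riakc/include/riakc.hrl").

    %% Fetch at most 1000 results per coverage query, then continue.
    {ok, ?INDEX_RESULTS{keys = Page1, continuation = Cont}} =
        riakc_pb_socket:get_index_range(Pid, <<"events">>,
            {integer_index, "ts"}, 0, 1384470000,
            [{max_results, 1000}]),
    {ok, ?INDEX_RESULTS{keys = Page2}} =
        riakc_pb_socket:get_index_range(Pid, <<"events">>,
            {integer_index, "ts"}, 0, 1384470000,
            [{max_results, 1000}, {continuation, Cont}]).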
Re: Riak as backend for Titian graph DB
To add to what Eric said: I don’t know Titan, but the phrase “atomic edge operations” suggests they require properties from the datastore that Riak’s eventually consistent datatypes can’t satisfy. All that said, we have been asked by a customer to look at integration with Titan, and when time permits, we will. Cheers Russell On 16 Nov 2013, at 06:58, Eric Redmond wrote: > The new sets aren't sorted, they're just proper sets. > > Riak isn't fundamentally changing from a key value store, the values of the > new data types are just convergent. So you still retrieve the whole value at > a time, no partial values (eg. subsets). > > Eric > > > On Nov 15, 2013, at 6:39 PM, bernie wrote: > >> Hi, >> >> there was a thread on the Google's Titan group about using Riak with Titan. >> Matthias (one of the lead developers) initial response was that Riak's data >> model was difficult to work with for their use case. >> When I asked him about the new 2.0 data types he answered the following >> >> "Sets seem close to what we need. At least that will allow us to do atomic >> edge operations. Is it possible to specify a sort order on those sets and >> retrieve subsets or can I only retrieve the set as a whole?" >> >> Since I could not find the information, I am hoping somebody here will be >> able to help out. >> >> Thanks in advance, >> Bernie >> >> https://groups.google.com/forum/#!topic/aureliusgraphs/sP0uS6A47IA >> ___ >> riak-users mailing list >> riak-users@lists.basho.com >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Riak 1.4x counter examples using riakc (erlang-riak-client)
Hi Mark, The Counter is just a riak_object under the hood (at the Riak end of things); to the erlang client it is modelled as an integer and operations on an integer. We’ll get around to the README, sorry about that. Using the counter is pretty simple. First you need to set whatever bucket you wish to store the counter in to allow_mult=true. Then just use the increment or fetch functions. Here is an example session: https://gist.github.com/russelldb/7596268 Please note, you can also use the regular R,W,PR,PW etc options on increment and fetch. Sorry about the lack of examples in the README, hope this makes up for it a little. Cheers Russell On 21 Nov 2013, at 00:17, Mark Allen wrote: > Hi - > > I'm trying to puzzle through how to use the PN counters in Riak 1.4.x via the > riakc client. It *looks* like > a 1.4 counter is a special kind of riak_obj metadata tuple. So do you just > set the tuple in a riakc_obj with > an undefined value? > > The README in the client repo also doesn't seem to have an PN counter > examples. Any guidance would > be appreciated. > > Thanks. > > Mark > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
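The linked gist amounts to roughly the following session. Host, bucket and key are invented here, and the options-taking variants are assumptions based on the 1.4 client:

    {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
    %% Counters require allow_mult=true on their bucket:
    ok = riakc_pb_socket:set_bucket(Pid, <<"counters">>, [{allow_mult, true}]),
    ok = riakc_pb_socket:counter_incr(Pid, <<"counters">>, <<"hits">>, 1),
    %% R, W, PR, PW etc are passed as options:
    ok = riakc_pb_socket:counter_incr(Pid, <<"counters">>, <<"hits">>, 10,
                                      [{w, 2}]),
    {ok, 11} = riakc_pb_socket:counter_val(Pid, <<"counters">>, <<"hits">>,
                                           [{r, 2}]).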
Re: Riak 2.0 Sets
Hi Massimiliano, Answers (and explanations) inline below: On 16 Dec 2013, at 12:20, Massimiliano Ciancio wrote: > Hi all, > I've two questions about new Riak 2.0 sets: > 1) how many elements can be added to a set? Are 1-10-100 millions of > keys, about 20-30 chars each, reasonable for a single set? All the new data types in riak 2.0 are really just riak objects. Anything that you wouldn’t do with a regular riak key-value, you shouldn’t do with the data types. Think of them as a way for us to implement your merge functions for you, in the server. We expose an operations based API (add/remove element) but at the node level, this is just a riak object operation, so the full object is read from disk, and the update applied, and the result replicated. Smaller objects are better. Millions of elements in a Set is a bad idea. If you need big sets like that, consider sharding over multiple keys. > 2) is it possible to do union/intersection/diffs with a riak set? No, not yet. I’d love to add more Redis-like functionality to our data types in future releases. > Any update about 2.0 availability? You can get pre5[1] now to start experimenting with. We’re working toward the first RC for 2.0 at the moment, expect something final in the first quarter of 2014. Cheers Russell [1] http://docs.basho.com/riak/2.0.0pre5/downloads/ > Thanks > MC > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
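A small sketch of the pre-release Set API, plus the sharding idea, in Erlang. The bucket type "sets" is assumed to exist with datatype = set, and all names (and the shard helper) are invented:

    S0 = riakc_set:new(),
    S1 = riakc_set:add_element(<<"alice">>, S0),
    S2 = riakc_set:add_element(<<"bob">>, S1),
    ok = riakc_pb_socket:update_type(Pid, {<<"sets">>, <<"followers">>},
                                     <<"user-9">>, riakc_set:to_op(S2)),
    {ok, S} = riakc_pb_socket:fetch_type(Pid, {<<"sets">>, <<"followers">>},
                                         <<"user-9">>),
    true = riakc_set:is_element(<<"alice">>, S).

    %% For sets that would otherwise grow to millions of elements,
    %% shard across K subset keys by hashing the element:
    shard_key(Base, Elem, K) ->
        N = erlang:phash2(Elem, K),
        <<Base/binary, $-, (integer_to_binary(N))/binary>>.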
Re: May allow_mult cause DoS?
Hi, Can you describe your use case a little? Maybe it would be easier for us to help. On 18 Dec 2013, at 04:32, Viable Nisei wrote: > On Wed, Dec 18, 2013 at 8:32 AM, Erik Søe Sørensen wrote: > It really is not a good idea to use siblings to represent 1-to-many > relations. That's not what it's intended for, nor what it's optimized for... > Ok, understood. > > Can you tell us exactly why you need Bitcask rather than LevelDB? 2i would > probably do it. > 1) According to > http://docs.basho.com/riak/latest/ops/running/backups/#LevelDB-Backups , it's > real pain to implement backups with leveldb. > 2) According to > http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/ , reads may > be slower comparing to bitcask, it's critical for us > > Otherwise, storing a list of items under each key could be a solution, > depending of course on the number of items per key. (But do perform conflict > resolution.) > Why any conflict resolving is required? As far as I understood, with > allow_mult=true riak should just collect all the values written to key > without anything additional work? What design decision leads to exponential > slowdown and crashes when multiple values allowed for any single key?.. So, > what's the REAL purpose of allow_mult=true if it's bad idea to use it for > unlimited values per single key? The real purpose of allow_mult=true is so that writes are never dropped. In the case where your application concurrently writes to the same key on two different nodes, or on two partitioned nodes, Riak keeps both values. Other data stores will lose one of the writes based on timestamp, serialise your writes (slow) or simply refuse to accept one or more of them. It is the job of the client to aggregate those multiple writes into a single value when it detects the conflict on read. Conflict resolution is required because your data is opaque to Riak. Riak doesn’t know that you’re storing lists of values, or JPEGs or JSON. It can’t possibly know how to resolve two conflicting values unless it knows the semantics of the values. Riak _does_ collect all the values written to a key, but it does so as a temporary measure, it expects your application to resolve them to a single value. How many are you writing per Key? Riak’s sweetspot is highly write available applications. If you have the time read the Amazon Dynamo paper[1], as it explains the _problems_ Riak solves as well as the way in which it solves them. If you don’t have these problems, maybe Riak is not the right datastore for you. Solving these problems comes with some developer complexity costs. You’ve run into one of them. We have many customers who think the trade-off is worth it: that the high availability and low-latency makes up for having eventual consistency. > > Ok, documentation contains the following paragraph: > > > Sibling explosion occurs when an object rapidly collects siblings without > > being reconciled. This can lead to a myriad of issues. Having an enormous > > object in your node can cause reads of that object to crash the entire > > node. Other issues are increased cluster latency as the object is > > replicated and out of memory errors. > > But there is no point if it related to allow_mult=false or both cases. Sorry, but I don’t understand what you mean by this statement. The point of allow_mult=true is so that writes are not arbitrarily dropped. It allows Riak nodes to continue to be available to take writes even if they can’t communicate with each other. 
Have a look at Kyle Kingsbury’s Jepsen[2] post on Riak. > > So, the only solution is leveldb+2i? Maybe. Or maybe just use the client as it is intended to resolve sibling values and send that value and a vector clock back to Riak. Or maybe roll your own indexes like in this blog post[3]. With Riak 2.0 there are a few data types added to Riak that are not opaque. Maybe Riak’s Sets would suit your purpose (depending on the size of your Set.) There is a wealth of data modelling experience at Basho and on this list. The more information you give us about your problem, (rather than describing what you perceive to be Riak’s shortcomings), the more likely you are to be able to benefit from that experience. You’re fighting the database at the moment, rather than working with it. The properties of Riak buy you some wonderful things (high availability, partition tolerance, low latency) but you have to want / need those properties, and then you have to accept that there is a data modelling / developer complexity price to pay. We don’t think that price is too high. We have many customers who agree. We’re always working to lower that price (see Strong Consistency, Yokozuna, Data Types etc in Riak 2.0[4].) You seem to have had a very negative first experience of Riak (and Basho.) I think that is because you misunderstand what it is for and how it should be used. I'm very keen to fix that. If it turns out that Riak
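"Use the client as it is intended" means a read-resolve-write cycle like the sketch below. The merge shown (a union of list values) is only an example of application semantics, and all names are invented:

    resolve_and_write(Pid, B, K) ->
        {ok, O0} = riakc_pb_socket:get(Pid, B, K),
        case riakc_obj:get_contents(O0) of
            [{_MD, _V}] ->
                ok; %% a single value, nothing to resolve
            Contents = [{MD, _} | _] ->
                %% App-specific merge: here values are assumed to be
                %% term_to_binary'd lists, resolved by union.
                Merged = lists:usort(lists:append(
                           [binary_to_term(V) || {_, V} <- Contents])),
                O1 = riakc_obj:update_metadata(
                       riakc_obj:update_value(O0, term_to_binary(Merged)), MD),
                %% The put carries O0's vclock, so the written value
                %% causally dominates all the siblings it merged.
                ok = riakc_pb_socket:put(Pid, O1)
        end.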
Re: accessing CRDTs in riak 2.0
Hi James, We’re working on docs. There are some edocs at the top of riak_kv_wm_crdt that describe the HTTP API, that I’ve put on DropBox here https://www.dropbox.com/s/bcdn2q2owgv4jxl/riak_kv_wm_crdt.html, though we are still pre-freeze on this code, so APIs change. As for the PB messages, it might be best at this time to look at the proto file here https://github.com/basho/riak_pb/blob/develop/src/riak_dt.proto Sorry that that is all there is right now. When we have examples, I’ll get them on the list. In the meantime, there is the riak-erlang-client and the riak-erlang-http-client, as others have said. Cheers Russell On 17 Dec 2013, at 20:08, James Moore wrote: > mainly a spec for the straight http or pb APIs, as far as I understand the > only client with explicit support right now is erlang. > > thanks! > > --James > > > On Tue, Dec 17, 2013 at 3:06 PM, Brian Roach wrote: > Hi James, > > Do you mean via the Erlang client, or one of the other client libs, or ... ? > > Thanks, > - Roach > > On Tue, Dec 17, 2013 at 12:42 PM, James Moore wrote: > > Hey all, > > > > I'm working on testing out some of the CRDT features but haven't been able > > to sort through the incantations to store/query any CRDT other than a > > counter. any tips? > > > > thanks, > > > > --James > > > > ___ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: riak performance
Hi Georgio, With data that small I doubt there is a difference in perf. Can I get some more info, please? Are you getting 2400 reqs a second against a single key? What backend are you using? What is the spec of the machines? Are they real or on some cloud? Network? Is this perf figure against cs or riak? Have you tried the PB driver? What is the ring size? What “requests” are you doing? Are they serial in a loop using CURL or parallel using basho_bench? Something is wrong, because on my local 3-node desktop cluster I can get well over 5000 ops/sec. If you can provide more info then we can probably help you get performance closer to your expectations. Cheers Russell On 20 Dec 2013, at 08:33, Georgio Pandarez wrote: > Hi Guys, > > I'm still evaluating riak & riak-cs. > > On my five node cluster, with riak only, I am only able to do 2400 requests > per second. I have two keys, one a one byte key, and another a 4kb key, the > performance doesn't change against either. > > Is this the best that I can expect to get from this, I don't consider 2400 > rps to be significantly high for 5 machines. I'm testing using the http > interface. > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: riak performance
Thanks Georgio, I just noticed that we went off list so I’m copying riak-users back in. I always forget to reply-all. Also realised I never asked, what version of Riak are you running (looks 1.4.something?) On 20 Dec 2013, at 10:01, Georgio Pandarez wrote: > Hi Russell, > > > What about the drives? Actually, with a single Key you’re probably not > > touching the drives at all. > > Just a single SATA disk. I've only got two keys and am testing a single key > at a time. > > > So is there a single HTTP connection? Many? How many? Does it use a pool of > > connections that are kept alive, establishing an HTTP connection is pricey. > > Is all the load going to one node? Spread across all 5? > > I am using haproxy to load balance requests against the 5 servers. I am using > keepalive connections and have tested from 5 connections to 200 connections. > After a certain point increasing the connections only increases latency. > That makes sense, there is usually a sweetspot for number of connections. All your clients are contending for the same key. > Looking at the servers, riak is using most of the CPU, I'm not sure what it > is doing. > > running riak-admin top from one of the servers shows the below, I've tried > disabling riak_kv_stat but it still hangs around: You can’t disable stats in riak 1.4.x, which it looks like you’re using (from the inclusion of sidejob in the etop output)
>
> Load:  cpu        114  Memory: total     90204  binary      672
>        procs      670          processes  8103  code      10535
>        runq         1          atom        493  ets       30233
>
> Pid             Name or Initial Func     Time    Reds  Memory  MsgQ  Current Function
> ---
> <6210.577.0>    proc_lib:init_p/5         '-' 7415450   34688     0  gen_fsm:loop/7
> <6210.2977.0>   proc_lib:init_p/5         '-'  761452    2880     1  prim_file:drv_get_response/1
> <6210.304.0>    riak_kv_stat_sj_1         '-'  371484    3856     0  gen_server:loop/6
> <6210.305.0>    riak_kv_stat_sj_2         '-'  363239    3856     0  gen_server:loop/6
> <6210.1054.19>  proc_lib:init_p/5         '-'  125656   34328     0  mochiweb_http:request/3
> <6210.753.19>   proc_lib:init_p/5         '-'  125588   34328     0  mochiweb_http:request/3
> <6210.1041.19>  proc_lib:init_p/5         '-'  125588   34328     0  mochiweb_http:request/3
> <6210.1052.19>  proc_lib:init_p/5         '-'  125588   34328     0  mochiweb_http:request/3
> <6210.664.19>   proc_lib:init_p/5         '-'  124041   21552     0  riak_client:wait_for_reqid/3
> <6210.694.19>   proc_lib:init_p/5         '-'  124041   21552     0  riak_client:wait_for_reqid/3
It doesn’t look like Riak is doing a whole lot at this point. I’ll speak with my colleagues when they get online for some more guidance, as I’m not sure what numbers you should expect for a single key. Are you going to try with 100s, 1000s, millions of keys (which is far more common usage)? What is the use case that requires you get the same key thousands of times a second? Cheers Russell > > > > On Fri, Dec 20, 2013 at 8:51 PM, Russell Brown wrote: > Hi Georgio, > > Thanks for getting back to me. > > On 20 Dec 2013, at 09:23, Georgio Pandarez wrote: > > > Hi Russell, > > > > Thanks for your reply. > > > > > Are you getting 2400 reqs a second against a single key? > > > > Yes. > > Ah, OK. Well that is OKish for a single key, though I’ve seen better. > > > > > > What backend are you using? > > > > bitcask > > > > > What is the spec of the machines? > > > > dual core, 4gig, physical, 100mbit between the machines > > What about the drives? Actually, with a single Key you’re probably not > touching the drives at all. > > > > > > Is this perf figure against cs or riak? > > > > This perf figure is against riak. I was originally trying riak-cs and > > wasn't g
Re: riak performance
On 20 Dec 2013, at 10:18, Georgio Pandarez wrote: > Hi Russell, > > Thanks for the prompt response. > > > Also realised I never asked, what version of Riak are you running > > 1.4.2 > > > It doesn’t look like Riak is doing a whole lot at this point > > So is it normal for the cpu to spin? Again, I’ll check with colleagues later. I don’t see that behaviour locally; I have a suspicion this is the stats-gathering process running in the background. Though you can’t turn it off, you can give it a really long interval and see if that helps. You can do this by adding {stat_cache_ttl, $SomeTimeInSeconds} to the riak_core section in your app.config > > > . I’ll speak with my colleagues when they get online for some more > > guidance, as I’m not sure what numbers you should expect for a single key. > > Are you going to try with 100s, 1000s, millions of keys (which is far more > > common usage)? What is the use case that requires you get the same key > > thousands of times a second? > > Yes, I will be trying with many keys, I just wanted to see if riak is > suitable for this application. > > What kind of rps should I be expecting on a typical 5 node cluster of a read > heavy workload? I know there are so many variables, but should each node be > generating 1000 rps? Yes, you should be able to achieve 1000s ops/sec per node. We have a tuning section in our docs that is worth a read http://docs.basho.com/riak/1.4.0/cookbooks/Linux-Performance-Tuning/ Hopefully I can get some more concrete answers for you later in the day. Cheers Russell > > > > On Fri, Dec 20, 2013 at 9:08 PM, Russell Brown wrote: > Thanks Georgio, > > I just noticed that we went off list so I’m copying riak-users back in. I > always forget to reply-all. > > Also realised I never asked, what version of Riak are you running (looks > 1.4.something?) > > On 20 Dec 2013, at 10:01, Georgio Pandarez wrote: > > > Hi Russell, > > > > > What about the drives? Actually, with a single Key you’re probably not > > > touching the drives at all. > > > > Just a single SATA disk. I've only got two keys and am testing a single key > > at a time. > > > > > So is there a single HTTP connection? Many? How many? Does it use a pool > > > of connections that are kept alive, establishing an HTTP connection is > > > pricey. Is all the load going to one node? Spread across all 5? > > > > I am using haproxy to load balance requests against the 5 servers. I am > > using keepalive connections and have tested from 5 connections to 200 > > connections. After a certain point increasing the connections only > > increases latency. > > > > That makes sense, there is usually a sweetspot for number of connections. All > your clients are contending for the same key. > > > Looking at the servers, riak is using most of the CPU, I'm not sure what it > > is doing. 
> >
> > running riak-admin top from one of the servers shows the below, I've tried > > disabling riak_kv_stat but it still hangs around: > > You can’t disable stats in riak 1.4.x, which it looks like you’re using (from > the inclusion of sidejob in the etop output)
> >
> > Load:  cpu        114  Memory: total     90204  binary      672
> >        procs      670          processes  8103  code      10535
> >        runq         1          atom        493  ets       30233
> >
> > Pid             Name or Initial Func     Time    Reds  Memory  MsgQ  Current Function
> > ---
> > <6210.577.0>    proc_lib:init_p/5         '-' 7415450   34688     0  gen_fsm:loop/7
> > <6210.2977.0>   proc_lib:init_p/5         '-'  761452    2880     1  prim_file:drv_get_response/1
> > <6210.304.0>    riak_kv_stat_sj_1         '-'  371484    3856     0  gen_server:loop/6
> > <6210.305.0>    riak_kv_stat_sj_2         '-'  363239    3856     0  gen_server:loop/6
> > <6210.1054.19>  proc_lib:init_p/5         '-'  125656   34328     0  mochiweb_http:request/3
> > <6210.753.19>   proc_lib:init_p/5         '-'  125588   34328     0  mochiweb_http:request/3
> > <6210.1041.19>  proc_lib:
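The setting Russell names above goes into app.config like this; 3600 is an arbitrary example value, in seconds:

    {riak_core, [
        %% ... existing riak_core settings ...
        %% Refresh the cached stats very rarely, to test whether the
        %% stats-gathering process is what is spinning the CPU:
        {stat_cache_ttl, 3600}
    ]},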
Re: May allow_mult cause DoS?
Hi, On 20 Dec 2013, at 23:16, Jason Campbell wrote: > > - Original Message - > From: "Andrew Stone" > To: "Jason Campbell" > Cc: "Sean Cribbs" , "riak-users" > , "Viable Nisei" > Sent: Saturday, 21 December, 2013 10:01:29 AM > Subject: Re: May allow_mult cause DoS? > > >> Think of an object with thousands of siblings. That's an object that has 1 >> copy of the data for each sibling. That object could be on the order of 100s >> of megabytes. Everytime an object is read off disk and returned to the >> client 100mb is being transferred. Furthermore leveldb must rewrite the >> entire 100mb to disk everytime a new sibling is added. And it just got >> larger with that write. If a merge occurs, the amount of data is a single >> copy of the data at that key instead of what amounts to approximately 1 >> copies of the same sized data, when all you care about is one of those >> 10,000. > > This makes sense for concurrent writes, but the use case that was being > talked about was siblings with no parent object. What is a "sibling with no parent object”? I think I understand what you’re getting at, when each sibling is some fragment of the whole, is that it? > I understand the original use case being discussed was tens of millions of > objects, and the metadata alone would likely exceed recommended object sizes > in Riak. > I've mentioned my use case before, which is trying to get fast writes on > large objects. I abuse siblings to some extent, although by the nature of > the data, there will never be more than a few thousand small siblings (under > a hundred bytes). I merge them on read and write the updated object back. > Even with sibling metadata, I doubt the bloated object is over a few MB, > especially with snappy compression which handles duplicate content quite > well. Even if Riak merges the object on every write, it's still much faster > than transferring the whole object over the network every time I want to do a > small write. Is there a more efficient way to do this? I thought about > writing single objects and using a custom index, but that results in a read > and 2 writes, and the index could grow quite large compared to the amount of > data I'm writing. This is similar, i suppose, to Riak 2.0 data types. We send an operation to Riak, and apply that inside the database rather then fetching, mutating, writing at the client. Think of adding to a Set, you just send the thing to be added and Riak merges it for you. For your use case would a user defined merge function in the database be a valuable feature? It would be every better if Riak stored data differently (incrementally, append-only rather than read-merge-write at the vnode.) These are things we’re going to be working on soon (I hope!) I had no idea that people used siblings this way. It’s interesting. Cheers Russell > > Thanks, > Jason > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: [ANN] Riak 2.0pre1
Hi Elias, Answers inline below: On 20 Jan 2014, at 19:31, Elias Levy wrote: > > On Sun, Jan 19, 2014 at 9:00 AM, wrote: > From: Luc Perkins > * Reduced sibling creation, inspired by the dotted versions vectors research > from Preguiça, Baquero, et al[1] > > [1] http://arxiv.org/abs/1011.5808 > > A quick skim over the paper seems to indicate that version vectors with > per-server entries cannot track causality among concurrent updates > coordinated by the same replica node. > > Isn't version vectors with per server entries what Riak was using previous to > this change? If so, did this lack of causality tracking apply to previous > versions? Short answer: yes-ish. Longer answer: Riak gave users the option of client or _vnode_ ids in version vectors. By default Riak uses vnode ids. Riak erred on the side of caution, and would create false concurrency, rather than lose writes. In the usual case this isn’t a problem. You may generate a couple of extra sibling values. In the pathological case it can lead to “sibling explosion”. In very rare cases such explosions create very large objects that can impact node, and even cluster performance. So DVV-style causality tracking is very much a bug fix. We’ve also added other forms of sibling count and large object protection in Riak 2.0, which are configurable limits in the riak.conf. Cheers Russell > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: [ANN] Riak 2.0pre1
On 20 Jan 2014, at 20:35, Elias Levy wrote: > On Mon, Jan 20, 2014 at 12:14 PM, Russell Brown wrote: > Longer answer: Riak gave users the option of client or _vnode_ ids in version > vectors. By default Riak uses vnode ids. Riak erred on the side of caution, > and would create false concurrency, rather than lose writes. > > I am curious as to how that was accomplished when using vnode version > vectors. As section 3.2 of the paper mentions, the node with concurrent > updates could detect they are concurrent and it could reject the update, but > how could you encode the causal history using the version vectors so as to > generate a sibling? That section ends with a statement saying no such > version vector could be generated. > > Presumably Riak implemented a version vector that is somewhat different from > that described in the paper? I guess you must be right. Riak’s vnode version vectors, in the case described in 3.2, would generate siblings. The put of `v` with an empty VV would lead to the value `v` and VV {b, 1}, but the put of `w` with no VV would not lead to a VV of {b, 2} and overwriting of {b, 1}=`v`. What Riak does is this: look at the incoming VV, and increment it (so {b, 1}=`w` for that second client put), read the state on disk ({b, 1}=`v`), realise that these values are concurrent (the incoming value does not dominate the local value), so keep them both, and generate a VV that dominates ({b, 2}=`v`, `w`). Does that answer your question? Cheers Russell > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
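The dominance check in that explanation can be spelled out with toy version vectors, represented as lists of {Actor, Counter} pairs. This is a shell-sized sketch, not Riak's vclock module:

    Descends = fun(Va, Vb) ->
                   lists:all(fun({Actor, Cb}) ->
                                 proplists:get_value(Actor, Va, 0) >= Cb
                             end, Vb)
               end,
    %% The generated frontier [{b,2}] covers the stored [{b,1}], so both
    %% v and w can safely be kept as siblings beneath it:
    true  = Descends([{b, 2}], [{b, 1}]),
    %% ...while one increment alone does not dominate the frontier:
    false = Descends([{b, 1}], [{b, 2}]).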
Re: riak2 erlang mapreduce counters
On 23 Jan 2014, at 20:51, Eric Redmond wrote: > For version 1.4 counters, riak_kv_pncounter. For 2.0 CRDT counters, > riak_dt_pncounter. As in, if the data was written in 1.4, or in 2.0 using the legacy, backwards-compatible 1.4 API endpoints, then the type is riak_kv_pncounter. If the counter is a 2.0, bucket-typed counter, then riak_dt_pncounter. Really, we need to re-introduce the riak_kv_counter module for backwards compatibility, and add some friendly `value’ functions to riak_kv_crdt. I’m opening an issue for just this now. The other option is to include the riak_kv_types.hrl and use the macros ?MAP_TYPE, ?SET_TYPE, ?V1_COUNTER_TYPE, ?COUNTER_TYPE for now, and assume that we’ll have some helper functions for MapReduce in place before 2.0. Cheers Russell > > Eric > > On Jan 23, 2014, at 3:44 PM, Bryce Verdier wrote: > >> In 1.4 there was just the simple function riak_kv_counters:value. In 2.0 I >> found the riak_kv_crdt module, which has a value function in it. But I'm not >> sure what "type" to use for second value argument for a counter. >> >> Can someone share that with me? >> >> Thanks in advance, >> Bryce >> >> ___ >> riak-users mailing list >> riak-users@lists.basho.com >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: riak2 erlang mapreduce counters
Hi Bryce, Sorry about this, and thanks for the detailed info. I do need to add MapReduce-friendly functions. On 23 Jan 2014, at 23:51, Bryce Verdier wrote: > Thank you both Eric & Russell for the answer, sadly it leads to more > questions. Regardless of the type (though I can say in this case the counters > were pushed from the python 2.0.2 client, so I assume its riak_dt_pncounter) It is a riak_kv_pncounter; I don’t think any of the clients support the new API endpoints (bucket types), so you’ll be using the 1.4 counter. > > I get this error: > {"phase":0,"error":"badarg","input":"{ok,{r_object,<<\"ogir-fp\">>,<<\"682l2fp6\">>,[{r_content,{dict,4,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[[<<\"content-type\">>,97,112,112,108,105,99,97,116,105,111,110,47,114,105,97,107,95,99,111,117,110,116,101,114],[<<\"X-Riak-VTag\">>,52,103,66,88,71,122,56,55,111,105,66,112,103,65,75,99,54,72,55,69,110,79]],[[<<\"index\">>]],[],[[<<\"X-Riak-Last-Modified\">>|{1390,508268,984013}]],[],[]}}},<<69,1,71,1,0,0,0,29,70,1,131,108,0,0,0,1,104,2,109,...>>}],...},...}","type":"error","stack":"[{erlang,apply,[<<\"riak_kv_pncounter\">>,new,[]],[]},{riak_kv_crdt,crdt_value,2,[{file,\"src/riak_kv_crdt.erl\"},{line,94}]},{riak_kv_crdt,value,2,[{file,\"src/riak_kv_crdt.erl\"},{line,86}]},{mr_kv_counters,value,3,[{file,\"mr_kv_counters.erl\"},{line,38}]},{riak_kv_mrc_map,map,3,[{file,\"src/riak_kv_mrc_map.erl\"},{line,165}]},{riak_kv_mrc_map,process,3,[{file,\"src/riak_kv_mrc_map.erl\"},{line,141}]},{riak_pipe_vnode_worker,process_input,3,[{file,\"src/riak_pipe_vnode_worker.erl\"},{line,445}]},{riak_pipe_vnode_worker,...}]”} That binary (<<69,1,71…>>) says CRDT(69), Version 1(1), riak_kv_pncounter (71). The error says that erlang is trying to apply the Module, Function, Arguments of <<“riak_kv_pncounter”>>, new, []. But modules need to be atoms. > Just to help, this is my erlang MR code: > value(RiakObject, _KeyData, _Arg) -> >Key = riak_object:key(RiakObject), >Count = riak_kv_crdt:value(RiakObject, <<"riak_kv_pncounter">>), Type needs to be an atom, as it is the module name (so riak_kv_pncounter). Also, the return value is now a tuple of {{Context :: binary(), Value :: term()}, Stats :: proplist()}. You probably only want the Value bit. So: {{_Ctx, Count}, _Stats} = riak_kv_crdt:value(RiakObject, riak_kv_pncounter), Should be what you need, let me know if that works, please? Cheers Russell >[ {Key, Count} ]. > > What am I doing wrong? I can't seem to figure it out... I'm sure its > something simple thing I'm just not seeing. > > Thanks again, > Bryce > > On 01/23/2014 01:07 PM, Russell Brown wrote: >> On 23 Jan 2014, at 20:51, Eric Redmond wrote: >> >>> For version 1.4 counters, riak_kv_pncounter. For 2.0 CRDT counters, >>> riak_dt_pncounter. >> As in, if the data was written in 1.4, or in 2.0 using the legacy, backwards >> compatible 1.4 API endpoints, then the type is riak_kv_pncounter. If the >> counter is a 2.0, bucket-typed counter, then riak_dt_pncounter. >> >> Really, we need to re-introduce the riak_kv_counter module for backwards >> compatibility, and add some friendly `value’ functions to riak_kv_crdt. I’m >> opening an issue for just this now. >> >> The other option is to include the riak_kv_types.hrl and use the macros >> ?MAP_TYPE, ?SET_TYPE, ?V1_COUNTER_TYPE, ?COUNTER_TYPE for now, and assume >> that we’ll have some helper functions for MapReduce in place before 2.0. 
>> >> Cheers >> >> Russell >> >>> Eric >>> >>> On Jan 23, 2014, at 3:44 PM, Bryce Verdier wrote: >>> >>>> In 1.4 there was just the simple function riak_kv_counters:value. In 2.0 I >>>> found the riak_kv_crdt module, which has a value function in it. But I'm >>>> not sure what "type" to use for second value argument for a counter. >>>> >>>> Can someone share that with me? >>>> >>>> Thanks in advance, >>>> Bryce >>>> >>>> ___ >>>> riak-users mailing list >>>> riak-users@lists.basho.com >>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >>> >>> ___ >>> riak-users mailing list >>> riak-users@lists.basho.com >>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
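Pulling the fix from this thread together, the whole map phase compiles to the module below. The module name follows the error report above, and riak_object and riak_kv_crdt are available because map phases run inside Riak:

    -module(mr_kv_counters).
    -export([value/3]).

    %% Map phase: emit {Key, CounterValue} for each 1.4 counter object.
    value(RiakObject, _KeyData, _Arg) ->
        Key = riak_object:key(RiakObject),
        {{_Ctx, Count}, _Stats} =
            riak_kv_crdt:value(RiakObject, riak_kv_pncounter),
        [{Key, Count}].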
Re: Listing all keys and 2i $key query on a bucket
On 25 Jan 2014, at 18:50, Daniel Iwan wrote: > How "heavy" for the cluster are those two operations for Riak cluster 3-5 > nodes? > Listing all keys and filtering on client side is definitely not recommended > but is 2i query via $key for given bucket equally heavy and not recommended? It is a coverage query, so it hits 1 / N of the vnodes in your cluster. It then folds over the whole key space for the given bucket. Riak has back pressure for 2i queries, but it is a reasonably expensive operation. I recommend you try it on a test cluster, using basho_bench maybe, to set up a representative workload for your application and see if the impact is tolerable for your application. > > On related note is there a $bucket query to find all the buckets in the > cluster and if there is how heavy is that operation? There is not. It would basically be traversing the entire key set. > > Thanks > Daniel > > > > -- > View this message in context: > http://riak-users.197444.n3.nabble.com/Listing-all-keys-and-2i-key-query-on-a-bucket-tp4030332.html > Sent from the Riak Users mailing list archive at Nabble.com. > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
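For reference, a paged $key range query with the Erlang client looks something like this (bucket and key range invented; paging keeps the cost of the coverage fold bounded, as with any 2i query):

    -include_lib("riakc/include/riakc.hrl").

    {ok, ?INDEX_RESULTS{keys = Keys, continuation = Cont}} =
        riakc_pb_socket:get_index_range(Pid, <<"users">>, <<"$key">>,
                                        <<"user-0000">>, <<"user-9999">>,
                                        [{max_results, 500}]).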
Re: last_write_wins
On 29 Jan 2014, at 09:57, Edgar Veiga wrote: > tl;dr > > If I guarantee that the same key is only written with a 5 second interval, is > last_write_wins=true profitable? It depends. Does the value you write depend in any way on the value you read, or is it always that you are just getting a totally new value that replaces what is in Riak (regardless of what is in Riak)? > > > On 27 January 2014 23:25, Edgar Veiga wrote: > Hi there everyone! > > I would like to know, if my current application is a good use case to set > last_write_wins to true. > > Basically I have a cluster of node.js workers reading and writing to riak. > Each node.js worker is responsible for a set of keys, so I can guarantee some > kind of non distributed cache... > The real deal here is that the writing operation is not run evertime an > object is changed but each 5 seconds in a "batch insertion/update" style. > This brings the guarantee that the same object cannot be write to riak at the > same time, not event at the same seconds, there's always a 5 second window > between each insertion/update. > > That said, is it profitable to me if I set last_write_wins to true? I've been > facing some massive writting delays under high loads and it would be nice if > I have some kind of way to tune riak. > > Thanks a lot and keep up the good work! > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Copying counters from cluster A to B using Java client.
On 29 Jan 2014, at 11:27, Guido Medina wrote: > Hi, > > We are using Riak Java client 1.4.x and we want to copy all counters from > cluster A to cluster B (all counters will be stored on a single to very few > buckets), if I list the keys using special 2i bucket index and then treat > each key as IRiakObject, will that be enough to copy counters, or will > counter siblings stop me from doing that? There shouldn’t ever be a sibling in a counter object, so that _should_ work. However, since you've mentioned copying from cluster to cluster, expect a sales call any minute…to preempt that, let me tell you about a wonderful piece of software we sell at Basho that could solve this very problem for you…http://basho.com/riak-enterprise/ Cheers Russell > Since at Riak Java client CounterObject is not an IRiakObject, it is instead > an operation. > > Comments at a working method: > > Source bucket: Bucket from Riak client pointing to source cluster. > Dest bucket: Bucket from Riak client pointing to dest bucket. > > protected void copyOneItem(final Bucket sourceBucket, final Bucket > destBucket, final String key) throws RiakRetryFailedException > { > final IRiakObject riakObject=sourceBucket.fetch(key).execute(); > if(riakObject!=null){ > destBucket.store(riakObject).withoutFetch().execute(); > } > } > > > Regards, > > Guido. > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Copying counters from cluster A to B using Java client.
Oh damn, wait. You said 1.4.*. There might, therefore be siblings, do a counter increment before the copy to ensure siblings are resolved (if you can.) Or use RiakEE MDC. On 29 Jan 2014, at 11:27, Guido Medina wrote: > Hi, > > We are using Riak Java client 1.4.x and we want to copy all counters from > cluster A to cluster B (all counters will be stored on a single to very few > buckets), if I list the keys using special 2i bucket index and then treat > each key as IRiakObject, will that be enough to copy counters, or will > counter siblings stop me from doing that? Since at Riak Java client > CounterObject is not an IRiakObject, it is instead an operation. > > Comments at a working method: > > Source bucket: Bucket from Riak client pointing to source cluster. > Dest bucket: Bucket from Riak client pointing to dest bucket. > > protected void copyOneItem(final Bucket sourceBucket, final Bucket > destBucket, final String key) throws RiakRetryFailedException > { > final IRiakObject riakObject=sourceBucket.fetch(key).execute(); > if(riakObject!=null){ > destBucket.store(riakObject).withoutFetch().execute(); > } > } > > > Regards, > > Guido. > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
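A hedged sketch of that advice with the Erlang client (names invented; it assumes the server accepts a zero-amount increment, which would merge any siblings without changing the value -- if not, any real increment has the same collapsing effect):

    collapse_then_copy(Src, Dst, Bucket, Key) ->
        %% The increment forces a merged, sibling-free counter object...
        ok = riakc_pb_socket:counter_incr(Src, Bucket, Key, 0),
        %% ...which can then be copied as a plain riak object:
        {ok, Obj} = riakc_pb_socket:get(Src, Bucket, Key),
        riakc_pb_socket:put(Dst, Obj).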
Re: last_write_wins
On 30 Jan 2014, at 10:37, Edgar Veiga wrote: > Also, > > Using last_write_wins = true, do I need to always send the vclock while on a > PUT request? In the official documentation it says that riak will look only at > the timestamp of the requests. Ok, from what you’ve said it sounds like you are always wanting to replace what is at a key with the new information you are putting. If that is the case, then you have the perfect use case for LWW=true. And indeed, you do not need to pass a vclock with your put request. And it sounds like there is no need for you to fetch-before-put, since that is only to get the context / resolve siblings. Curious about your use case if you can share more. Cheers Russell > > Best regards, > > > On 29 January 2014 10:29, Edgar Veiga wrote: > Hi Russell, > > No, it doesn't depend. It's always a new value. > > Best regards > > > On 29 January 2014 10:10, Russell Brown wrote: > > On 29 Jan 2014, at 09:57, Edgar Veiga wrote: > >> tl;dr >> >> If I guarantee that the same key is only written with a 5 second interval, >> is last_write_wins=true profitable? > > It depends. Does the value you write depend in any way on the value you read, > or is it always that you are just getting a totally new value that replaces > what is in Riak (regardless of what is in Riak)? > >> >> >> On 27 January 2014 23:25, Edgar Veiga wrote: >> Hi there everyone! >> >> I would like to know if my current application is a good use case to set >> last_write_wins to true. >> >> Basically I have a cluster of node.js workers reading and writing to riak. >> Each node.js worker is responsible for a set of keys, so I can guarantee >> some kind of non distributed cache... >> The real deal here is that the writing operation is not run every time an >> object is changed but every 5 seconds in a "batch insertion/update" style. >> This brings the guarantee that the same object cannot be written to riak at >> the same time, not even in the same second; there's always a 5 second >> window between each insertion/update. >> >> That said, is it profitable to me if I set last_write_wins to true? I've >> been facing some massive writing delays under high loads and it would be >> nice if I had some kind of way to tune riak. >> >> Thanks a lot and keep up the good work! >> >> >> ___ >> riak-users mailing list >> riak-users@lists.basho.com >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
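As a sketch, a "blind" put under LWW needs no prior fetch and no vclock; the bucket, key, and value below are invented for illustration, and Pid is assumed to be a riakc_pb_socket connection as before:

    %% No get-before-put: build a fresh object and store it. Under
    %% lww=true the write with the highest timestamp simply wins.
    Obj = riakc_obj:new(<<"node_cache">>, <<"worker-42">>,
                        <<"{\"state\":42}">>, "application/json"),
    ok = riakc_pb_socket:put(Pid, Obj).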
Re: last_write_wins
On 30 Jan 2014, at 10:58, Guido Medina wrote: > Hi, > > Now I'm curious too. According to > http://docs.basho.com/riak/latest/ops/advanced/configs/configuration-files/ > the default value for the Erlang property last_write_wins is false. Now, if 95% > of the buckets/keys have no siblings (or conflict resolution), does that mean > that for such buckets last_write_wins is set to true? I'm wondering what's > the effect (if any) when allow_mult on a bucket is false. > > In other words; could I assume that: > If allow_mult is true then last_write_wins will be ignored 'cause vclock is > needed for conflict resolution? > If allow_mult is false then last_write_wins is true? They’re independent settings, but allow_mult=true + lww=true makes no sense (in reality, in the code, I’m pretty sure the lww=true will be applied.) allow_mult=false+lww=false means at each vnode there is a read-before-write, and causally dominated values are dropped, while sibling values are created, but before we write to disk (or return to the user on get) we pick the sibling with the highest timestamp. This means that you get _one_ of the causally concurrent values, the one with the largest timestamp. allow_mult=false+lww=true means that at the coordinating vnode we just increment whatever vclock the put has (probably none, right?) and write it to disk (no read of the local value first), and downstream at the replicas, the same thing, just store it. I need to check, but on a get, if there are siblings, we just pick the highest timestamp. I really think, for riak, 90% of the time, allow_mult=true is your best choice. John Daily did a truly exhaustive set of blog posts on this http://basho.com/understanding-riaks-configurable-behaviors-part-1/ I highly recommend it. If your data is always overwritten, maybe LWW makes sense for you. If it is write once, read ever after, LWW is perfect. Cheers Russell > Correct me if I'm wrong, > Again, we have a very similar scenario, where we create/modify keys and we > are certain we have the latest version, so for us last_write_wins... > Regards, > > Guido. > > On 30/01/14 10:46, Russell Brown wrote: >> >> On 30 Jan 2014, at 10:37, Edgar Veiga wrote: >> >>> Also, >>> >>> Using last_write_wins = true, do I need to always send the vclock while on >>> a PUT request? In the official documentation it says that riak will look only >>> at the timestamp of the requests. >> >> Ok, from what you’ve said it sounds like you are always wanting to replace >> what is at a key with the new information you are putting. If that is the >> case, then you have the perfect use case for LWW=true. And indeed, you do >> not need to pass a vclock with your put request. And it sounds like there is >> no need for you to fetch-before-put, since that is only to get the context >> / resolve siblings. Curious about your use case if you can share more. >> >> Cheers >> >> Russell >> >> >>> >>> Best regards, >>> >>> >>> On 29 January 2014 10:29, Edgar Veiga wrote: >>> Hi Russell, >>> >>> No, it doesn't depend. It's always a new value. >>> >>> Best regards >>> >>> >>> On 29 January 2014 10:10, Russell Brown wrote: >>> >>> On 29 Jan 2014, at 09:57, Edgar Veiga wrote: >>> >>>> tl;dr >>>> >>>> If I guarantee that the same key is only written with a 5 second interval, >>>> is last_write_wins=true profitable? >>> >>> It depends. Does the value you write depend in any way on the value you >>> read, or is it always that you are just getting a totally new value that >>> replaces what is in Riak (regardless of what is in Riak)? 
>>> >>>> >>>> On 27 January 2014 23:25, Edgar Veiga wrote: >>>> Hi there everyone! >>>> >>>> I would like to know if my current application is a good use case to set >>>> last_write_wins to true. >>>> >>>> Basically I have a cluster of node.js workers reading and writing to riak. >>>> Each node.js worker is responsible for a set of keys, so I can guarantee >>>> some kind of non distributed cache... >>>> The real deal here is that the writing operation is not run every time an >>>> object is changed but every 5 seconds in a "batch insertion/update" style. >>>> This brings the guarantee that the same object cannot be written to riak at >>>> the same time, not even in the same second; there's always a 5 second >>>> window between each insertion/update.
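For contrast with LWW, a rough sketch of the fetch-resolve-write cycle that allow_mult=true implies, using the Erlang client; resolve/1 is a hypothetical stand-in for whatever application-specific merge you would write:

    %% Fetch, and if the object has siblings, merge them and write the
    %% winner back. The fetched object carries the vclock for us.
    {ok, O} = riakc_pb_socket:get(Pid, <<"accounts">>, <<"k1">>),
    case riakc_obj:value_count(O) of
        1 -> {ok, riakc_obj:get_value(O)};
        _ -> Winner = resolve(riakc_obj:get_values(O)), %% hypothetical merge
             O1 = riakc_obj:update_value(O, Winner),
             riakc_pb_socket:put(Pid, O1)
    end.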
Re: Max/Min Integer CRDTs?
Hi Elias, This is a great time for you to ask, if you’re asking what I think you’re asking. On 8 Feb 2014, at 22:35, Elias Levy wrote: > Does Basho have any plans for implementing a CRDT that maintains the minimum > or maximum value for an integer? It would come in handy in our application > and it would be very simple to implement. Do you mean some kind of bounded counter that cannot be incremented beyond (say 1000), or decremented below a certain bound (i.e. non-negative counter?) If so, then the plan is yes, but I’m not sure it is simple. If you have a design for such a thing please share it. We’re working with a team from Universidade Nova de Lisboa as part of the SyncFree project on this, but I’d love to hear your ideas for an implementation. If you want to keep it private feel free to email me off list. Cheers Russell > > Elias Levy > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Max/Min Integer CRDTs?
On 8 Feb 2014, at 23:00, Jason Campbell wrote: > My understanding of what Elias wanted was a counter that simply stored the > minimum and maximum values it has ever reached, an optional reset would > probably be nice as well. It would be quite helpful when dealing with > statistics counters that can decrement. Ah, in that case, see Sean’s message (yes, trivial; yes, planned). Sorry for the misunderstanding; I’ve been thinking about bounded counters, so that context made me misconstrue the question. Oops. > > Then again, I could be wrong. > > - Original Message - > From: "Russell Brown" > To: "Elias Levy" > Cc: "riak-users" > Sent: Sunday, 9 February, 2014 9:53:42 AM > Subject: Re: Max/Min Integer CRDTs? > > Hi Elias, > > This is a great time for you to ask, if you’re asking what I think you’re > asking. > > On 8 Feb 2014, at 22:35, Elias Levy wrote: > >> Does Basho have any plans for implementing a CRDT that maintains the minimum >> or maximum value for an integer? It would come in handy in our application >> and it would be very simple to implement. > > Do you mean some kind of bounded counter that cannot be incremented beyond > (say 1000), or decremented below a certain bound (i.e. non-negative counter?) > If so, then the plan is yes, but I’m not sure it is simple. If you have a > design for such a thing please share it. We’re working with a team from > Universidade Nova de Lisboa as part of the SyncFree project on this, but I’d > love to hear your ideas for an implementation. If you want to keep it private > feel free to email me off list. > > Cheers > > Russell > >> >> Elias Levy >> ___ >> riak-users mailing list >> riak-users@lists.basho.com >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
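For what it's worth, a state-based max-register really is trivial; this stand-alone Erlang sketch (not a Basho API) converges because max is commutative, associative, and idempotent, and a min variant just swaps max for min:

    -module(max_register).
    -export([new/0, assign/2, merge/2, value/1]).

    %% State is the largest integer observed so far.
    new() -> undefined.

    assign(X, undefined) -> X;
    assign(X, S) -> max(X, S).

    %% Merge of two replicas: pointwise maximum.
    merge(undefined, S) -> S;
    merge(S, undefined) -> S;
    merge(S1, S2) -> max(S1, S2).

    value(S) -> S.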
Re: CRDT on Riak 2.0
And you only need to send the context object if you’re removing things, so if you can partition your work between adds and removes, you can have more efficient adds. On 4 Mar 2014, at 16:27, Sam Elliott wrote: > Yes, batch your updates, it'll be much more efficient that way. > > Do not try to decode the `context` object. Use it as an opaque value, as the > data it holds could change without warning. > > Sam > > -- > Sam Elliott > Engineer > sam.elli...@basho.com > -- > > > On Tuesday, 4 March 2014 at 10:22AM, EmiNarcissus wrote: > >> >> >> Hi Sean, >> >> >> >> Thanks very much, that’s very helpful. Anyways, I’ve noticed in update_dt >> function they are preferred to apply a context dict, which in the mail list >> described as a encoded version of original object(dict,set, so to say on a >> single update action, each time a new record is being added/removed it will >> apply the full original object to the server?), so for frequently >> operations, how much performance difference is between batch/single actions? >> >> >> >> I’m only thinking about the possibilities on large dataset, but really in >> fact the dataset I’m planning to use at most contains 1-2k records is >> already big enough. currently I’m working on a project which using Riak Link >> to hold the identical items,also need to update the object data itself a >> lot(inside have a few counters(but indeed currently is just a number,didn’t >> put it into a separated counter yet), so is very non-efficient to update >> when new record is being inserted, don’t know if CRDT will provide more >> efficiency on that. >> >> >> On 2014年3月4日 at 下午11:12:03, Sam Elliott (sam.elli...@basho.com >> (mailto:sam.elli...@basho.com)) wrote: >>> To answer another thing brought up in your message: >>> >>> When you say "big enough" set sizes of 10k, be very careful. Riak Data >>> Types should not >>> be larger than you would make a normal Riak Object. There's more guidance >>> in this thread: >>> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2014-February/014722.html >>> >>> >>> Sam >>> >>> -- >>> Sam Elliott >>> Engineer >>> sam.elli...@basho.com (mailto:sam.elli...@basho.com) >>> -- >>> >>> >>> On Tuesday, 4 March 2014 at 10:02AM, Sean Cribbs wrote: >>> Hi Tim, We punted on sub-type queries for 2.0. We intend to address them in 2.1, so yes you must >>> fetch the entire set or map in order to find out things like membership and >>> cardinality. On Tue, Mar 4, 2014 at 8:01 AM, EmiNarcissus >>> wrote: > Hi, > > I’m now porting the riak 2.0 driver for twisted, it works beautifully now > with what >>> Yokozuna provides, also have a great back-port ability,really appreciate >>> everything >>> what this team have brought us XD. But because it still in lacks of >>> document, I must read >>> the implementation both from ruby and riakc-erl repo to get it >>> started(erlang is okay >>> for me, but I’m not quite familiar with ruby,sadly looks like only ruby’s >>> client implementation >>> is throughly right now). > > So here is my question on the datatype implementation on CRDT system. > > From the code I can tell , fetch_dt/update_dt/modify_dt is what have been > exposed >>> from the pbc interface. Now I’m more focused on Set object, so each time >>> client will fetch >>> the whole set(fetch_dt) from the server and build the set on the local end, >>> and maintain >>> a add/remove operation list to send to the server when user does a >>> update/add/remove >>> action. 
> > But I’m a little bit confused here, like what redis provides, a set have > a ismember function >>> is being done on the server instead of fetch/test manner. Does this >>> available for Riak >>> 2.0 Set datatype(just like what Riak1.4 Counter object provided, it will do >>> the operation >>> on the server side,MISMEMBER a b, it will either return True/False). >>> Currently I only >>> can see add/update/delete operation on pbc proto file, and don’t have >>> something alike, >>> is that in a Todo list ? or will not implement at all(I will reconsider how >>> the data should >>> be structured if so). This feature will be really helpful when the dataset >>> is big enough(like >>> more than 10k values in it.) > > > Best, > Tim Lee > > > > ___ > riak-users mailing list > riak-users@lists.basho.com (mailto:riak-users@lists.basho.com) > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com -- Sean Cribbs Software Engineer Basho Technologies, Inc. http://basho.com/ ___ riak-users mailing list riak-users@lists.basho.com (mailto:riak-users@lists.basho.com) http://lists.basho.com/mai
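To make the adds/removes split concrete, a sketch with the 2.0 Erlang client; the bucket type, bucket, and key names are assumptions:

    %% Adds need no causal context, so they can be batched "blind".
    Adds = lists:foldl(fun(E, S) -> riakc_set:add_element(E, S) end,
                       riakc_set:new(), [<<"a">>, <<"b">>, <<"c">>]),
    ok = riakc_pb_socket:update_type(Pid, {<<"sets">>, <<"things">>}, <<"k">>,
                                     riakc_set:to_op(Adds)),
    %% Removes should start from a fetched set so its context rides
    %% along with the remove operation.
    {ok, S0} = riakc_pb_socket:fetch_type(Pid, {<<"sets">>, <<"things">>}, <<"k">>),
    S1 = riakc_set:del_element(<<"b">>, S0),
    ok = riakc_pb_socket:update_type(Pid, {<<"sets">>, <<"things">>}, <<"k">>,
                                     riakc_set:to_op(S1)).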
Re: Implemented Riak 2.0 API for python twisted framework
On 13 Mar 2014, at 13:27, EmiNarcissus wrote: > Hi Dear basho team, > > …CRDT currently is not really to use via http… Can you let us know what you think is missing from the CRDT HTTP API, please? I thought it was ‘done’ in the more recent 2.0pre releases. > > -- > Best Regards > Tim Lee > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: O(n) behavior on crdts
Hey James, I haven’t analysed the complexity of the data types. Offhand, I know that operations on Maps, Sets, Counters etc. are not O(n). Merges sometimes will be, if every entry must be compared, and we’re looking at ways to optimise this. The `value` operation on Maps and Sets must be O(n) since we have to derive the correct value for each entry. I’m not sure how we would optimise that. We’ve optimised the `context` for operations down to a single version vector (before it was a binary of the entire Map or Set), which should have got us some performance improvements. Right now Sets and Maps still use orddict. We have a round of performance and scalability testing starting, and hopefully some optimisations will come out of that. I think you’re referring to Sean’s porting of Elixir’s TreeMap to Erlang. We have not yet tested that code, nor have we integrated it with riak_dt. I don’t know if anyone even plans to. Have you observed poor performance from the data types? The earlier releases' main performance issue was to/from binary encoding. We have since switched to Erlang's built-in t2b/b2t (term_to_binary/binary_to_term) functions, with compression, and have found the performance to be acceptable. I guess the short answer is: we’re working on it, but with the 2.0 release looming, we might not get all the way there in time. There is some cost to using CRDTs; I don’t think it could be any other way, there are no free lunches. My aim is to reduce that cost below the pain barrier of siblings, and writing complex merge functions. Cheers Russell On 31 Mar 2014, at 23:06, James Moore wrote: > Hey All, > > has there been any progress on resolving the O(n) behavior of crdt sets and > maps? > > Sean mentioned in a previous thread that there was a potential for a fix to > resolve the poor performance of erlang's orddict. > > thanks! > > --James > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
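The encoding switch mentioned above amounts to something like the following round trip (illustrative only; riak_dt wraps this in versioned to_binary/from_binary functions):

    %% Serialize a data type with Erlang's built-in serializer,
    %% compression enabled, then deserialize it back.
    Set = riak_dt_orswot:new(),
    Bin = term_to_binary(Set, [{compressed, 1}]),
    Set = binary_to_term(Bin).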
Re: O(n) behavior on crdts
Hey James, Massive respect to any mailing list response about performance that includes graphs, thanks! I plan to keep optimising right up until we release, so hopefully we can further improve the performance. From the look of your graphs there is work to be done. Are these graphs for Set or Map additions, please? Keep us posted with your testing/assessment, please. Cheers Russell On 1 Apr 2014, at 20:18, James Moore wrote: > Thanks Russell, > > I'm mainly observing this issue around the add operation on hashes when > running a similar operation to this one > > https://gist.github.com/russelldb/d330d796ca1b25d1879d > > the behavior I was seeing shows up in the following graphs > > the horizontal axis is the iteration of the add operation, and the y axis is > milliseconds > > [graph images attached] > > From that chart it looks as though the fix in compression has resolved the > issue :) ( and also in hindsight the results don't quite look O(n) ) > > for larger add operations I'm still seeing a linear performance degradation, > but add operations are still pretty quick through 10k operations > > [graph images attached] > > In answer to your question on performance I believe the switch up to pre20 > should resolve the poor add performance for us, and overall I've found it > to be quite impressive so far. > > --james > > > > > > > > On Tue, Apr 1, 2014 at 2:21 AM, Russell Brown wrote: > Hey James, > > I haven’t analysed the complexity of the data types. Offhand, I know that > operations on Maps, Sets, Counters etc. are not O(n). Merges sometimes will > be, if every entry must be compared, and we’re looking at ways to optimise > this. The `value` operation on Maps and Sets must be O(n) since we have to > derive the correct value for each entry. I’m not sure how we would optimise > that. > > We’ve optimised the `context` for operations down to a single version vector > (before it was a binary of the entire Map or Set), which should have got us > some performance improvements. > > Right now Sets and Maps still use orddict. We have a round of performance and > scalability testing starting, and hopefully some optimisations will come out > of that. > > I think you’re referring to Sean’s porting of Elixir’s TreeMap to Erlang. We > have not yet tested that code, nor have we integrated it with riak_dt. I > don’t know if anyone even plans to. > > Have you observed poor performance from the data types? The earlier releases' > main performance issue was to/from binary encoding. We have since switched to > Erlang's built-in t2b/b2t (term_to_binary/binary_to_term) functions, with > compression, and have found the performance to be acceptable. > > I guess the short answer is: we’re working on it, but with the 2.0 release > looming, we might not get all the way there in time. > > There is some cost to using CRDTs; I don’t think it could be any other way, > there are no free lunches. My aim is to reduce that cost below the pain > barrier of siblings, and writing complex merge functions. > > Cheers > > Russell > > On 31 Mar 2014, at 23:06, James Moore wrote: > > > Hey All, > > > > has there been any progress on resolving the O(n) behavior of crdt sets and > > maps? > > > > Sean mentioned in a previous thread that there was a potential for a fix to > > resolve the poor performance of erlang's orddict. > > > > thanks! 
> > > > --James > > ___ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: allow_mult defaults to false for 2.0.0pre20 in my tests
Hi David, Sorry about the hokey-cokey on this. In 2.0, allow_mult=false is the default for default/untyped buckets. That is to support legacy applications and rolling upgrades with the least surprise. allow_mult=true is the default for typed buckets, as we think this is the correct way to run Riak, and there are no legacy typed buckets. Information on Bucket Types can be found here http://docs.basho.com/riak/2.0.0pre20/dev/advanced/bucket-types/ Cheers Russell On 2 Apr 2014, at 08:59, David James wrote: > In my tests, allow_mult defaults to false for 2.0.0pre20. This was not the > case for 2.0.0pre11; my tests behave correctly under pre11. > > This according to my testing with my Clojure Riak driver, Kria: > https://github.com/bluemont/kria > > My understanding is that Riak intends allow_mult to default to true. > > I'm using the Mac builds from: > * http://docs.basho.com/riak/2.0.0pre11/downloads/ > * http://docs.basho.com/riak/2.0.0pre20/downloads/ > > I also confirmed this to be the case with a Homebrew installation: > * https://github.com/Homebrew/homebrew/blob/master/Library/Formula/riak.rb > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: allow_mult defaults to false for 2.0.0pre20 in my tests
On 2 Apr 2014, at 09:21, David James wrote: > What version(s) of Riak have allow_mult=true by default? Which ones have > allow_mult=false by default? All released versions of Riak have allow_mult=false for default buckets. All released versions of Riak only have default buckets. 2.0 will have allow_mult=true as default for typed buckets, and allow_mult=false for untyped/legacy buckets. Which pre did it change in? I don’t know. But certainly from pre20 it is the case. I think it would be very hard for any client to support different pre releases. Why? As stated by Eric and myself, for backwards compatibility with existing data and clusters. But the change to `true` for typed buckets is because we think that is the correct way to run Riak. > > I thought this was decided in the early days of Riak. Why the back and forth > now? > It was, there is a lot of history. Originally Basho wanted allow_mult=true, I think they opted for false for ease of take-up/adoption, but have since realised that Riak is safest with allow_mult=true. The back-and-forth in the press reflects, I suppose, the difficulty of making a decision, and also, the difficulty of doing the “right” thing after the “wrong” thing has been done. Cheers Russell > > On Wed, Apr 2, 2014 at 4:16 AM, Eric Redmond wrote: > This was changed back a few weeks ago, where allow_mult is back to false > for buckets without a type (default), but is true for buckets with a type. > > Sorry for the back and forth, but we decided it would be better to keep it as > false so as to not break existing users. However, we strongly encourage that > all users choose typed buckets going forward. > > Eric > > On Apr 2, 2014 1:01 AM, "David James" wrote: > In my tests, allow_mult defaults to false for 2.0.0pre20. This was not the > case for 2.0.0pre11; my tests behave correctly under pre11. > > This according to my testing with my Clojure Riak driver, Kria: > https://github.com/bluemont/kria > > My understanding is that Riak intends allow_mult to default to true. > > I'm using the Mac builds from: > * http://docs.basho.com/riak/2.0.0pre11/downloads/ > * http://docs.basho.com/riak/2.0.0pre20/downloads/ > > I also confirmed this to be the case with a Homebrew installation: > * https://github.com/Homebrew/homebrew/blob/master/Library/Formula/riak.rb > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
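A quick way to see the split from the 2.0 Erlang client; the bucket and bucket-type names here are assumptions, and the type must already have been created and activated with riak-admin:

    %% The default/untyped bucket keeps the legacy setting...
    {ok, Legacy} = riakc_pb_socket:get_bucket(Pid, <<"old_bucket">>),
    false = proplists:get_value(allow_mult, Legacy),
    %% ...while a typed bucket defaults to allow_mult=true.
    {ok, Typed} = riakc_pb_socket:get_bucket_type(Pid, <<"mytype">>),
    true = proplists:get_value(allow_mult, Typed).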
Re: oddness when using java client within storm
HTTP or PB? Pretty sure the HTTP client defaults to a pool of 50 connections. On 14 Apr 2014, at 16:50, Sean Allen wrote: > We fire off 100 requests for the items in the batch and wait on the futures > to complete. > > > On Mon, Apr 14, 2014 at 11:40 AM, Alexander Sicular > wrote: > I'm not sure what "looking up entries... in batches of 100 from Riak" > devolves into in the java client but riak doesn't have a native multiget. It > either does 100 get ops or a [search>]mapreduce. That might inform some of > your performance issues. > > -Alexander > > @siculars > http://siculars.posthaven.com > > Sent from my iRotaryPhone > > > On Apr 14, 2014, at 8:26, Sean Allen wrote: > > > > I'm seeing something very odd trying to scale out part of code I'm working > > on. > > > > It runs inside of Storm and lookups up entries from 10 node riak cluster. > > I've hit a wall that we can't get past. We are looking up entries (json > > representation of a job) > > in batches of 100 from Riak, each batch gets handled by a bolt in Storm, > > adding more > > bolts (an instance of the bolt class with a dedicated thread) results in no > > increase > > in performance. I instrumted the code and saw that waiting for all riak > > futures to finish > > increases as more bolts are added. Thinking that perhaps there was > > contention around the > > RiakCluster object that we were sharing per jvm, I tried giving each bolt > > instance its own > > cluster object and there wasn't any change. > > > > Note that changing Thread spool size given to withExecutor not > > withExecutionAttempts value > > has any impact. > > > > We're working off of the develop branch for the java client. We've been > > using d3cc30d but I also tried with cef7570 and had the same issue. > > > > A simplied version of the scala code running this: > > > > // called once upon bolt initialization. > > def prepare(config: JMap[_, _], > > context: TopologyContext, > > collector: OutputCollector): Unit = { > > ... > > > > val nodes = RiakNode.Builder.buildNodes(new RiakNode.Builder, (1 to > > 10).map(n => s"riak-beavis-$n").toList.asJava) > > riak = new RiakCluster.Builder(nodes) > > // varying this has made no difference > > .withExecutionAttempts(1) > > // nor has varying this > > .withExecutor(new ScheduledThreadPoolExecutor(200)) > > .build() > > riak.start > > > > ... > > } > > > > private def get(jobLocationId: String): > > RiakFuture[FetchOperation.Response] = { > > val location = new > > Location("jobseeker-job-view").setBucketType("no-siblings").setKey(jobLocationId) > > val fop = new > > FetchOperation.Builder(location).withTimeout(75).withR(1).build > > > > riak.execute(fop) > > } > > > > def execute(tuple: Tuple): Unit = { > > val indexType = tuple.getStringByField("index_type") > > val indexName = tuple.getStringByField("index_name") > > val batch = tuple.getValueByField("batch").asInstanceOf[Set[Payload]] > > > > var lookups: Set[(Payload, RiakFuture[FetchOperation.Response])] = > > Set.empty > > > > // this always returns in a standard time based on batch size > > time("dispatch-calls") { > > lookups = batch.filter(_.key.isDefined).map { > > payload => {(payload, get(payload.key.get))} > > } > > } > > > > val futures = lookups.map(_._2) > > > > // this is what takes longer and longer when more bolts are added. > > // it doesnt matter what the sleep time is. 
> > time("waiting-on-futures") { > > while (futures.count(!_.isDone) > 0) { > > Thread.sleep(25L) > > } > > } > > > > > > // everything from here to the end returns in a fixed amount of time > > // and doesn't change with the number of bolts > > ... > > > > } > > > > > > It seems like we are running into contention somewhere in the riak java > > client. > > My first thought was the LinkedBlockingQueue that serves as the retry queue > > in RiakCluster > > but, I've tried running with only a single execution attempt as well as a > > custom client > > version where I removed all retries from the codebase and still experience > > the same problem. > > > > I'm still digging through the code looking for possible points of > > contention. > > > > Any thoughts? > > > > ___ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > -- > > Ce n'est pas une signature > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Errors messages from 2i/range queries, or something else?
How large is your max_results size for 2i queries, or, if you’re not using pagination, what do you estimate the result size is? Do you require sorted results? If you don’t, and you’re not using pagination, 1.4.8 might solve your issues since it doesn’t buffer and sort results in memory (see Sorting under http://docs.basho.com/riak/1.4.8/dev/using/2i/#Querying) Cheers Russell On 30 May 2014, at 10:27, Dave Brady wrote: > Our five-node cluster, running 1.4.7, is spewing lots of these messages: > > 2014-05-30 08:49:20.683 [info] > <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc > <0.8208.361> > [{initial_call,{erlang,apply,2}},{almost_current_function,{lists,rmerge3_21_3,6}},{message_queue_len,0}] > > [{timeout,155},{old_heap_block_size,0},{heap_block_size,317811},{mbuf_size,0},{stack_size,59},{old_heap_size,0},{heap_size,271310}] > 2014-05-30 08:52:30.532 [info] > <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc > <0.17361.375> > [{initial_call,{erlang,apply,2}},{almost_current_function,{lists,rmerge2_1,4}},{message_queue_len,0}] > > [{timeout,103},{old_heap_block_size,0},{heap_block_size,514229},{mbuf_size,0},{stack_size,59},{old_heap_size,0},{heap_size,252851}] > 2014-05-30 08:52:31.123 [info] > <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc > <0.17361.375> > [{initial_call,{erlang,apply,2}},{almost_current_function,{lists,rmerge2_2,5}},{message_queue_len,0}] > > [{timeout,103},{old_heap_block_size,0},{heap_block_size,514229},{mbuf_size,0},{stack_size,58},{old_heap_size,0},{heap_size,25}] > 2014-05-30 08:52:37.041 [info] > <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc > <0.29376.375> > [{initial_call,{erlang,apply,2}},{almost_current_function,{lists,split_1_1,6}},{message_queue_len,0}] > > [{timeout,103},{old_heap_block_size,0},{heap_block_size,196418},{mbuf_size,0},{stack_size,56},{old_heap_size,0},{heap_size,190521}] > 2014-05-30 08:52:37.149 [info] > <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc > <0.29376.375> > [{initial_call,{erlang,apply,2}},{almost_current_function,{lists,split_1,5}},{message_queue_len,0}] > > [{timeout,101},{old_heap_block_size,0},{heap_block_size,317811},{mbuf_size,0},{stack_size,56},{old_heap_size,0},{heap_size,196361}] > 2014-05-30 08:52:37.977 [info] > <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc > <0.29376.375> > [{initial_call,{erlang,apply,2}},{almost_current_function,{lists,rmerge3_21_3,6}},{message_queue_len,0}] > > [{timeout,123},{old_heap_block_size,0},{heap_block_size,514229},{mbuf_size,0},{stack_size,59},{old_heap_size,0},{heap_size,251541}] > 2014-05-30 09:02:00.731 [info] > <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc > <0.16282.433> > [{initial_call,{erlang,apply,2}},{almost_current_function,{erlang,max,2}},{message_queue_len,0}] > > [{timeout,103},{old_heap_block_size,0},{heap_block_size,196418},{mbuf_size,0},{stack_size,65},{old_heap_size,0},{heap_size,125642}] > > GET and PUT times are shooting into multi-second response. > > Can someone give me any hints about where to look, please? > > -- > Dave Brady > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Errors messages from 2i/range queries, or something else?
On 30 May 2014, at 11:55, Dave Brady wrote: > Hi Russell, > > Thanks for clarifying that it is our 2i queries! I’m not 100% sure, but it seems possible, those messages look similar to the streaming merge sort code working. Which reminds me, are you streaming the results? It’s way better than buffering all the results in memory before returning them. > > I myself don't know the size of the result sets. I know pagination is used > in some of our programs, though I don't know if all of them use it. > > I didn't know that 1.4.8 had those improvements. I'll look into upgrading, > too. > > Thanks again! > > -- > Dave Brady > > - Original Message - > From: "Russell Brown" > To: "Dave Brady" > Cc: riak-users@lists.basho.com > Sent: Friday, May 30, 2014 11:33:45 AM > Subject: Re: Errors messages from 2i/range queries, or something else? > > How large is your max_results size for 2i queries, or, if you’re not using > pagination, what do you estimate the result size is? > > Do you require sorted results? If you don’t, and you’re not using pagination, > 1.4.8 might solve your issues since it doesn’t buffer and sort results in > memory (see Sorting under > http://docs.basho.com/riak/1.4.8/dev/using/2i/#Querying) > > Cheers > > Russell > > On 30 May 2014, at 10:27, Dave Brady wrote: > >> Our five-node cluster, running 1.4.7, is spewing lots of these messages: >> >> 2014-05-30 08:49:20.683 [info] >> <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc >> <0.8208.361> >> [{initial_call,{erlang,apply,2}},{almost_current_function,{lists,rmerge3_21_3,6}},{message_queue_len,0}] >> >> [{timeout,155},{old_heap_block_size,0},{heap_block_size,317811},{mbuf_size,0},{stack_size,59},{old_heap_size,0},{heap_size,271310}] >> 2014-05-30 08:52:30.532 [info] >> <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc >> <0.17361.375> >> [{initial_call,{erlang,apply,2}},{almost_current_function,{lists,rmerge2_1,4}},{message_queue_len,0}] >> >> [{timeout,103},{old_heap_block_size,0},{heap_block_size,514229},{mbuf_size,0},{stack_size,59},{old_heap_size,0},{heap_size,252851}] >> 2014-05-30 08:52:31.123 [info] >> <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc >> <0.17361.375> >> [{initial_call,{erlang,apply,2}},{almost_current_function,{lists,rmerge2_2,5}},{message_queue_len,0}] >> >> [{timeout,103},{old_heap_block_size,0},{heap_block_size,514229},{mbuf_size,0},{stack_size,58},{old_heap_size,0},{heap_size,25}] >> 2014-05-30 08:52:37.041 [info] >> <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc >> <0.29376.375> >> [{initial_call,{erlang,apply,2}},{almost_current_function,{lists,split_1_1,6}},{message_queue_len,0}] >> >> [{timeout,103},{old_heap_block_size,0},{heap_block_size,196418},{mbuf_size,0},{stack_size,56},{old_heap_size,0},{heap_size,190521}] >> 2014-05-30 08:52:37.149 [info] >> <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc >> <0.29376.375> >> [{initial_call,{erlang,apply,2}},{almost_current_function,{lists,split_1,5}},{message_queue_len,0}] >> >> [{timeout,101},{old_heap_block_size,0},{heap_block_size,317811},{mbuf_size,0},{stack_size,56},{old_heap_size,0},{heap_size,196361}] >> 2014-05-30 08:52:37.977 [info] >> <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc >> <0.29376.375> >> [{initial_call,{erlang,apply,2}},{almost_current_function,{lists,rmerge3_21_3,6}},{message_queue_len,0}] >> >> [{timeout,123},{old_heap_block_size,0},{heap_block_size,514229},{mbuf_size,0},{stack_size,59},{old_heap_size,0},{heap_size,251541}] >> 
2014-05-30 09:02:00.731 [info] >> <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc >> <0.16282.433> >> [{initial_call,{erlang,apply,2}},{almost_current_function,{erlang,max,2}},{message_queue_len,0}] >> >> [{timeout,103},{old_heap_block_size,0},{heap_block_size,196418},{mbuf_size,0},{stack_size,65},{old_heap_size,0},{heap_size,125642}] >> >> GET and PUT times are shooting into multi-second response. >> >> Can someone give me any hints about where to look, please? >> >> -- >> Dave Brady >> >> ___ >> riak-users mailing list >> riak-users@lists.basho.com >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
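For reference, the paginated 2i range query discussed above looks roughly like this from the Erlang client; the index and bounds come from the thread, while the options and result shape are recalled from the 1.4.8-era client, so treat it as a sketch:

    %% Ask for at most 1000 results; the reply is an index-results
    %% record carrying keys plus a continuation token. Pass the token
    %% back as [{continuation, C}] to fetch the next page.
    {ok, Res} = riakc_pb_socket:get_index_range(Pid, <<"conversation">>,
                    {integer_index, "createdat"}, 0, 23182680,
                    [{max_results, 1000}]).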
Re: Error decodeing type with PBC API
Hi, For this type of issue, client and server versions would be very useful, please. Russell On 23 Jun 2014, at 03:55, japhar81 wrote: > Hello all, > I've been banging my head against the PBC API all weekend and I can't seem to > figure out why I'm getting an error. I'm able to call RpbListBuckets (empty > message) without issue, but when I try to call RpbListKeys I get an error > (pasted below in full) -- Error decodeing type with PBC API. > > I've pulled out the encoded message (trying to list the rekon bucket as a > test), it is: > 00-00-00-0A-11-00-10-6E-6F-6B-65-72-05-0A > > I've confirmed that the length (0x0A = 10) and message type (0x11 = > RpbListKeysReq) are correct. I'm unsure how I'd manually decode the message > itself, but it matches what protobuf is giving me, so I'm assuming it to be > correct as well. > > I'd appreciate any pointers as to what I'm doing wrong, I'm completely > stumped. > > The full error I'm getting: > > Error processing incoming message: throw:{error,"Error decodeing type"}: > [{protobuffs,read_field_num_and_wire_type,1,[{file,"src/protobuffs.erl"},{line,203}]}, > {protobuffs,next_field_num,1,[{file,"src/protobuffs.erl"},{line,242}]}, > {riak_kv_pb,decode,3,[{file,"src/riak_kv_pb.erl"},{line,105}]}, > {riak_kv_pb,decode,2,[{file,"src/riak_kv_pb.erl"},{line,100}]}, > {riak_kv_pb_bucket,decode,2,[{file,"src/riak_kv_pb_bucket.erl"},{line,67}]}, > {riak_api_pb_server,handle_message,3,[{file,"src/riak_api_pb_server.erl"},{line,196}]}, > {riak_api_pb_server,decode_buffer,…
Re: search-cmd - could not read
Hi Mark, Answers inline below. On 23 Jun 2014, at 15:58, Mark Richard Thomas wrote: > Hello > > Why does the search-cmd return a “Could not read” error? > > · search-cmd show-schema mybucket > /root/schema.txt > > · search-cmd set-schema mybucket /root/schema.txt > > :: Updating schema for 'mybucket'... > :: ERROR: Could not read '/root/schema.txt'. > RPC to 'r...@abc.def.ntg.ghi.com' failed: {'EXIT',-1} Does the user have permission to access “/root/schema.txt”? Does it exist? Also, if you want to use search with Riak, and you’re not going live in the next month or so, I strongly recommend that you look at Yokozuna, or Riak Search 2.0: Legacy Search as shipped in the 1.* series has many issues. The new Search in 2.0 is a completely different product, and the future for Riak. Unless you have an application in production, please consider riak 2.0 search first. http://docs.basho.com/riak/2.0.0beta1/dev/using/search/ for docs and getting started. Cheers Russell > > Mark Thomas | Software Engineer | Equifax UK > > p: +44 (0)208 941 0573 > m: +44 (0)7908 798 270 > e: mark.tho...@equifax.com > > Equifax Ltd, Capital House, 25 Chapel Street, London, NW1 5DS > > Equifax Limited is registered in England with Registered No. 2425920. > Registered Office: Capital House, 25 Chapel Street, London NW1 5DS. Equifax > Limited is authorised and regulated by the Financial Conduct Authority. > Equifax Touchstone Limited is registered in Scotland with Registered No. > SC113401. Registered Office: 54 Deerdykes View, Westfield Park, Cumbernauld > G68 9HN. > Equifax Commercial Services Limited is registered in the Republic of Ireland > with Registered No. 215393. Registered Office: IDA Business & Technology > Park, Rosslare Road, Drinagh, Wexford. > > This message contains information from Equifax which may be confidential and > privileged. If you are not an intended recipient, please refrain from any > disclosure, copying, distribution or use of this information and note that > such actions are prohibited. If you have received this transmission in error, > please notify by e-mail postmas...@equifax.com. > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Java CRDTs implementation
There is an Akka implementation here https://github.com/patriknw/akka-datareplication LWW-Set and LWW-Register should be pretty easy to make if you can’t use the Akka code. On 1 Jul 2014, at 10:45, David Lopes wrote: > Hi, > > do you know where I can find a Java CRDT implementation of LWW-Sets or > Registers? > > Thanks, > David > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Java CRDTs implementation
I guess, or read the papers[1][2], they contain specs. Or read the erlang source for riak_dt[3], or maybe even look at the C++[4] code from one of the creators of CRDTs. Happy to give pointers on this list or #riak on freenode. Cheers Russell [1] http://hal.inria.fr/docs/00/55/55/88/PDF/techreport.pdf [2] http://arxiv.org/abs/1210.3368 [3] https://github.com/basho/riak_dt [4] https://github.com/SyncFree/delta-enabled-crdts (great links to papers too!) On 11 Jul 2014, at 12:19, Mohan Radhakrishnan wrote: > Hi, > > What does one read to implement this in Java ? Look at source code? > > Thanks, > Mohan > > > On Tue, Jul 1, 2014 at 4:45 PM, Russell Brown wrote: > There is an akka implementation here > https://github.com/patriknw/akka-datareplication > > LWW-Set and LWW-Regsiter should be pretty easy to make, if you can’t use the > akka code. > > On 1 Jul 2014, at 10:45, David Lopes wrote: > > > Hi, > > > > do you know where I can find some CRDTs Java implementation of LWW-Sets or > > Registers? > > > > Thanks, > > David > > ___ > > riak-users mailing list > > riak-users@lists.basho.com > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
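Since LWW-Register keeps coming up, a minimal state-based sketch in Erlang (the papers above give the proper spec). The state is a {Value, Timestamp} pair and merge keeps the newest write; the wall clock is, as ever, the weak point of LWW:

    -module(lww_register).
    -export([new/0, assign/2, merge/2, value/1]).

    new() -> {undefined, {0, 0, 0}}.

    %% Assign overwrites unconditionally, stamping with the local clock.
    assign(V, _S) -> {V, os:timestamp()}.

    %% Merge keeps the write with the larger timestamp; ties fall back
    %% to comparing values so every replica picks the same winner.
    merge({_, T1} = A, {_, T2}) when T1 > T2 -> A;
    merge({_, T1}, {_, T2} = B) when T2 > T1 -> B;
    merge({V1, _} = A, {V2, _} = B) -> if V1 >= V2 -> A; true -> B end.

    value({V, _}) -> V.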
Re: Too many siblings error
Hi Bryce, Single node? I’m pretty surprised by this; in 1.4 every write to a counter resolves the siblings at the coordinating vnode. Do you know what your sibling limit is set to? How are you typically using counters in your application? Are you reusing the same bucket/key for non-counter objects? Cheers Russell On 28 Jul 2014, at 17:40, Bryce Verdier wrote: > Hi, > > In my application I'm using riak counters, but in the very recent past I saw > a large number of these show up in the console logs: > <0.1432.0>@riak_kv_vnode:encode_and_put:1795 Too many siblings for object > <<"BUCKET">>/<<"KEY">> > > Are there known limitations to how many siblings can be created for use with > counters? Any ideas on how to fix this? Also, other important information > that might be relevant: this is a single node running 1.4.8. > > Warm regards, > Bryce > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: repair-2i stops with "bad argument in call to eleveldb:async_write"
Hi Simon, So the earlier “this is on wheezy, rest are on squeeze” thing is no longer a factor? Any and all 2i repair you do ends with the same error? Cheers Russell On 30 Jul 2014, at 07:29, Effenberg, Simon wrote: > I tried it now with one partition on 6 different machines and everywhere the > same result: index_scan_timeout and the info: bad argument in call to > eleveldb:async_get (2x) or async_write (4x). > > > Sent from Samsung Mobile > > > Original message > From: "Effenberg, Simon" > Date: 30.07.2014 07:49 (GMT+01:00) > To: bryan hunt > Cc: riak-users@lists.basho.com > Subject: RE: repair-2i stops with "bad argument in call to > eleveldb:async_write" > > Hi, > > I tried it on two different nodes with one partition each. Both multiple > times before the upgrade and after the upgrade. > > I will try it on other machines in a minute but because I tried it already on > two different nodes and one of them is 2 weeks old and stored on a HP 3par I > bet that this is not a disk corruption issue.. > > Simon > > > Sent from Samsung Mobile > > > Original message > From: bryan hunt > Date: 29.07.2014 18:21 (GMT+01:00) > To: "Effenberg, Simon" > Cc: riak-users@lists.basho.com > Subject: Re: repair-2i stops with "bad argument in call to > eleveldb:async_write" > > Hi Simon, > > Does the problem persist if you run it again? > > Does it happen if you run it against any other partition? > > Best Regards, > > Bryan > > > > Bryan Hunt - Client Services Engineer - Basho Technologies Limited - > Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431 > > On 29 Jul 2014, at 09:35, Effenberg, Simon wrote: > > > Hi, > > > > we have some issues with 2i queries like that: > > > > seffenberg@kriak46-1:~$ while :; do curl -s > > localhost:8098/buckets/conversation/index/createdat_int/0/23182680 | ruby > > -rjson -e "o = JSON.parse(STDIN.read); puts o['keys'].size"; sleep 1; done > > > > 13853 > > 13853 > > 0 > > 557 > > 557 > > 557 > > 13853 > > 0 > > > > > > ... > > > > So I tried to start a repair-2i first on one vnode/partition on one node > > (which is quite new in the cluster.. 2 weeks or so). 
> > > > The command is failing with the following log entries: > > > > seffenberg@kriak46-7:~$ sudo riak-admin repair-2i > > 22835963083295358096932575511191922182123945984 > > Will repair 2i on these partitions: > >22835963083295358096932575511191922182123945984 > > Watch the logs for 2i repair progress reports > > seffenberg@kriak46-7:~$ 2014-07-29 08:20:22.729 UTC [info] > > <0.5929.1061>@riak_kv_2i_aae:init:139 Starting 2i repair at speed 100 for > > partitions [22835963083295358096932575511191922182123945984] > > 2014-07-29 08:20:22.729 UTC [info] > > <0.5930.1061>@riak_kv_2i_aae:repair_partition:257 Acquired lock on > > partition 22835963083295358096932575511191922182123945984 > > 2014-07-29 08:20:22.729 UTC [info] > > <0.5930.1061>@riak_kv_2i_aae:repair_partition:259 Repairing indexes in > > partition 22835963083295358096932575511191922182123945984 > > 2014-07-29 08:20:22.740 UTC [info] > > <0.5930.1061>@riak_kv_2i_aae:create_index_data_db:324 Creating temporary > > database of 2i data in /var/lib/riak/anti_entropy/2i/tmp_db > > 2014-07-29 08:20:22.751 UTC [info] > > <0.5930.1061>@riak_kv_2i_aae:create_index_data_db:361 Grabbing all index > > data for partition 22835963083295358096932575511191922182123945984 > > 2014-07-29 08:25:22.752 UTC [info] > > <0.5929.1061>@riak_kv_2i_aae:next_partition:160 Finished 2i repair: > >Total partitions: 1 > >Finished partitions: 1 > >Speed: 100 > >Total 2i items scanned: 0 > >Total tree objects: 0 > >Total objects fixed: 0 > > With errors: > > Partition: 22835963083295358096932575511191922182123945984 > > Error: index_scan_timeout > > > > > > 2014-07-29 08:25:22.752 UTC [error] <0.4711.1061> gen_server <0.4711.1061> > > terminated with reason: bad argument in call to > > eleveldb:async_write(#Ref<0.0.10120.211816>, <<>>, > > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], > > []) in eleveldb:write/3 line 155 > > 2014-07-29 08:25:22.753 UTC [error] <0.4711.1061> CRASH REPORT Process > > <0.4711.1061> with 0 neighbours exited with reason: bad argument in call to > > eleveldb:async_write(#Ref<0.0.10120.211816>, <<>>, > > [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], > > []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747 > > 2014-07-29 08:25:22.753 UTC [error] <0.1031.0> Supervisor > > {<0.1031.0>,poolboy_sup} had child riak_core_vnode_worker started with > > {riak_core_vnode_worker,start_link,undefined} at <0.4711.1061> exit with > > reason bad argument in call to eleveldb:async_write(#Ref<0.0.10120.211816>, > > <<>>, > > [{put,<<131,104,2,109,0,0,0,20,99,11
Re: repair-2i stops with "bad argument in call to eleveldb:async_write"
>Total objects fixed: 0 > With errors: > Partition: 34253944624943037145398863266787883273185918976 > Error: index_scan_timeout > > > 2014-07-30 06:16:09.154 UTC [error] <0.4086.0> gen_server <0.4086.0> > terminated with reason: bad argument in call to > eleveldb:async_get(#Ref<0.0.2698.198008>, <<>>, > <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>, > []) in eleveldb:get/3 line 143 > 2014-07-30 06:16:09.154 UTC [error] <0.4086.0> CRASH REPORT Process > <0.4086.0> with 0 neighbours exited with reason: bad argument in call to > eleveldb:async_get(#Ref<0.0.2698.198008>, <<>>, > <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>, > []) in eleveldb:get/3 line 143 in gen_server:terminate/6 line 747 > 2014-07-30 06:16:09.154 UTC [error] <0.4085.0> Supervisor > {<0.4085.0>,poolboy_sup} had child riak_core_vnode_worker started with > {riak_core_vnode_worker,start_link,undefined} at <0.4086.0> exit with reason > bad argument in call to eleveldb:async_get(#Ref<0.0.2698.198008>, <<>>, > <<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,101,116,...>>, > []) in eleveldb:get/3 line 143 in context child_terminated > > On Wed, Jul 30, 2014 at 09:50:22AM +0100, Russell Brown wrote: >> Hi Simon, >> So the earlier “this is on wheezy, rest are on squeeze” thing is no longer a >> factor? >> >> Any and all 2i repair you do ends with the same error? >> >> Cheers >> >> Russell >> >> On 30 Jul 2014, at 07:29, Effenberg, Simon wrote: >> >>> I tried it now with one partition on 6 different machines and everywhere >>> the same result: index_scan_timeout and the info: bad argument in call to >>> eleveldb:async_get (2x) or async_write (4x). >>> >>> >>> Sent from Samsung Mobile >>> >>> >>> Original message >>> From: "Effenberg, Simon" >>> Date: 30.07.2014 07:49 (GMT+01:00) >>> To: bryan hunt >>> Cc: riak-users@lists.basho.com >>> Subject: RE: repair-2i stops with "bad argument in call to >>> eleveldb:async_write" >>> >>> Hi, >>> >>> I tried it on two different nodes with one partition each. Both multiple >>> times before the upgrade and after the upgrade. >>> >>> I will try it on other machines in a minute but because I tried it already >>> on two different nodes and one of them is 2 weeks old and stored on a HP >>> 3par I bet that this is not a disk corruption issue.. >>> >>> Simon >>> >>> >>> Sent from Samsung Mobile >>> >>> >>> Original message >>> From: bryan hunt >>> Date: 29.07.2014 18:21 (GMT+01:00) >>> To: "Effenberg, Simon" >>> Cc: riak-users@lists.basho.com >>> Subject: Re: repair-2i stops with "bad argument in call to >>> eleveldb:async_write" >>> >>> Hi Simon, >>> >>> Does the problem persist if you run it again? >>> >>> Does it happen if you run it against any other partition? >>> >>> Best Regards, >>> >>> Bryan >>> >>> >>> >>> Bryan Hunt - Client Services Engineer - Basho Technologies Limited - >>> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431 >>> >>> On 29 Jul 2014, at 09:35, Effenberg, Simon >>> wrote: >>> >>>> Hi, >>>> >>>> we have some issues with 2i queries like that: >>>> >>>> seffenberg@kriak46-1:~$ while :; do curl -s >>>> localhost:8098/buckets/conversation/index/createdat_int/0/23182680 | ruby >>>> -rjson -e "o = JSON.parse(STDIN.read); puts o['keys'].size"; sleep 1; done >>>> >>>> 13853 >>>> 13853 >>>> 0 >>>> 557 >>>> 557 >>>> 557 >>>> 13853 >>>> 0 >>>> >>>> >>>> ... 
>>>> >>>> So I tried to start a repair-2i first on one vnode/partition on one node >>>> (which is quiet new in the cluster.. 2 weeks or so). >>>> >>>> The command is failing with the following log entries: >>>> >>>>
Re: repair-2i stops with "bad argument in call to eleveldb:async_write"
Hi Simon, Sorry for the delays. I’m on vacation for a couple of days. Will pick this up on Monday. Cheers Russell On 1 Aug 2014, at 09:56, Effenberg, Simon wrote: > Hi Russell, @basho > > any updates on this? We still have the issues with 2i (repair is also > still not possible) and searching for the 2i indexes is reproducable > creating (for one range I tested) 3 different values. > > I would love to provide anything you need to debug that issue. > > Cheers > Simon > > On Wed, Jul 30, 2014 at 09:22:56AM +, Effenberg, Simon wrote: >> Great. Thanks Russell.. >> >> if you need me to do something.. feel free to ask. >> >> Cheers >> Simon >> >> On Wed, Jul 30, 2014 at 10:19:56AM +0100, Russell Brown wrote: >>> Thanks Simon, >>> >>> I’m going to spend a some time on this day. >>> >>> Cheers >>> >>> Russell >>> >>> On 30 Jul 2014, at 10:05, Effenberg, Simon >>> wrote: >>> >>>> Hi Russel, >>>> >>>> still one machine out of 13 is on wheezy and the rest on squeeze but the >>>> software is the same and basho is providing even the erlang stuff. So >>>> their should no real difference inside the application. >>>> >>>> And the errors are almost the same (except the async_write/read >>>> difference). >>>> >>>> I paste them: >>>> >>>> -- node 1 --- >>>> >>>> 2014-07-30 06:16:07.728 UTC [info] >>>> <0.14871.336>@riak_kv_2i_aae:next_partition:160 Finished 2i repair: >>>> Total partitions: 1 >>>> Finished partitions: 1 >>>> Speed: 100 >>>> Total 2i items scanned: 0 >>>> Total tree objects: 0 >>>> Total objects fixed: 0 >>>> With errors: >>>> Partition: 12559779695812446953312916531172001681702912 >>>> Error: index_scan_timeout >>>> >>>> >>>> 2014-07-30 06:16:07.728 UTC [error] <0.1525.0> gen_server <0.1525.0> >>>> terminated with reason: bad argument in call to >>>> eleveldb:async_write(#Ref<0.0.324.211123>, <<>>, >>>> [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97 >>>> ,116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 >>>> line 155 >>>> 2014-07-30 06:16:07.728 UTC [error] <0.1525.0> CRASH REPORT Process >>>> <0.1525.0> with 0 neighbours exited with reason: bad argument in call to >>>> eleveldb:async_write(#Ref<0.0.324.211123>, <<>>, >>>> [{put,<<131,104,2,109,0,0,0,20,99,11 >>>> 1,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], >>>> []) in eleveldb:write/3 line 155 in gen_server:terminate/6 line 747 >>>> 2014-07-30 06:16:07.728 UTC [error] <0.1517.0> Supervisor >>>> {<0.1517.0>,poolboy_sup} had child riak_core_vnode_worker started with >>>> {riak_core_vnode_worker,start_link,undefined} at <0.1525.0> exit with >>>> reason bad argument in call >>>> to eleveldb:async_write(#Ref<0.0.324.211123>, <<>>, >>>> [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97,116,105,111,110,95,115,101,99,114,...>>,...}], >>>> []) in eleveldb:write/3 line 155 in context child_terminated >>>> >>>> >>>> -- node 2 --- >>>> >>>> 2014-07-30 06:16:07.791 UTC [info] >>>> <0.8083.314>@riak_kv_2i_aae:next_partition:160 Finished 2i repair: >>>> Total partitions: 1 >>>> Finished partitions: 1 >>>> Speed: 100 >>>> Total 2i items scanned: 0 >>>> Total tree objects: 0 >>>> Total objects fixed: 0 >>>> With errors: >>>> Partition: 622279994019798508141412682679979879462877528064 >>>> Error: index_scan_timeout >>>> >>>> >>>> 2014-07-30 06:16:07.791 UTC [error] <0.1884.0> gen_server <0.1884.0> >>>> terminated with reason: bad argument in call to >>>> eleveldb:async_write(#Ref<0.0.318.96628>, <<>>, >>>> [{put,<<131,104,2,109,0,0,0,20,99,111,110,118,101,114,115,97, >>>> 
116,105,111,110,95,115,101,99,114,...>>,...}], []) in eleveldb:write/3 >>>> line 155 >>>> 20
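For anyone following along: 2i repair is driven per-partition from the command line. A minimal sketch of the kind of invocation behind the log above, assuming Riak 1.4.x and reusing the partition id and speed from the node 1 summary (check the riak-admin usage output for the exact options on your build):

    riak-admin repair-2i --speed 100 12559779695812446953312916531172001681702912

With no partition arguments, riak-admin repair-2i repairs every partition the node owns.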
Re: riak-erlang-client
Have you included deps/riak_pb/ebin in your Erlang code path? On 29 Sep 2014, at 17:18, Jon Brisbin wrote: > I’m trying to use the riak-erlang-client from master to talk to Riak 2.0 and > I’m getting an error when I use the test commands found in the README: > > (rabbit@localhost)1> {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", > 8087). > {ok,<0.685.0>} > (rabbit@localhost)2> riakc_pb_socket:ping(Pid). > ** exception exit: undef > in function riak_pb_messages:msg_code/1 > called as riak_pb_messages:msg_code(rpbpingreq) > in call from riak_pb_codec:encode/1 (src/riak_pb_codec.erl, line 73) > in call from riakc_pb_socket:encode_request_message/1 > (src/riakc_pb_socket.erl, line 2094) > in call from riakc_pb_socket:send_request/2 (src/riakc_pb_socket.erl, > line 2077) > in call from riakc_pb_socket:handle_call/3 (src/riakc_pb_socket.erl, > line 1258) > in call from gen_server:handle_msg/5 (gen_server.erl, line 585) > in call from proc_lib:init_p_do_apply/3 (proc_lib.erl, line 239) > > Do I have something wrong here? I have installed the riak-erlang-client .ez > file as a RabbitMQ plugin via a dependency from my riak-exchange plugin, which > I’m trying to get updated to 2.0. > > Any thoughts? > > > Thanks! > > Jon Brisbin > http://jbrisbin.com > @JonBrisbin | @mformonochrome > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
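The undef on riak_pb_messages:msg_code/1 is the classic symptom of riak_pb missing from the code path. A minimal sketch of the fix, assuming the client was built in place with rebar so its dependencies sit under deps/:

    %% start the shell with every dependency's ebin on the path:
    $ erl -pa ebin deps/*/ebin

    %% or add riak_pb to an already-running node before connecting:
    1> code:add_pathz("deps/riak_pb/ebin").
    true
    2> {ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087).
    3> riakc_pb_socket:ping(Pid).
    pong

The same idea applies in the RabbitMQ plugin case: a riak_pb .ez would need to be installed alongside the riak-erlang-client one so both land on the broker's code path.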
Re: Proper way to delete datatypes (map)
Hi Alexander, I think you are deleting data-types the proper way. What is your `delete_mode` setting, please? I would guess that the sibling you are seeing is a tombstone, which suggests you have some update concurrent with the delete. You will only ever have a single CRDT sibling, and one (or possibly more) tombstone siblings. If you write back just the binary CRDT sibling as the value in the normal K/V interface, any concurrent data type updates will merge OK to a single value. I don’t know about your Yokozuna question, but I think it is just that Yokozuna indexes each sibling as an independent document. Let me know if that works out OK for you, please? Cheers Russell On 2 Oct 2014, at 12:54, Alexander Popov wrote: > I have a map bucket with some data (with nested maps); > approx. structure (I don't know whether it matters or not): > { > "update": { > "some_register": "value", > "some_flag": "disable", > "nested_map": { > "update": { > "nested1_map": { > "update": { > "some_register": "value", > "some_flag": "disable", > } > }, > "nested1_map": { > "update": { > "some_register": "value", > "some_flag": "disable", > } > } > } > }, > "some_counter": 13, > } > } > > > Updates work fine, even simultaneous ones. > But sometimes I need to recreate the entire value, so I delete it using > curl -XDELETE http://host:8098/types/maps/buckets/mybucket/keys/some > > After that, siblings sometimes appear. > > > curl -H "Accept: multipart/mixed" > http://host:8098/types/maps/buckets/mybucket/keys/some > > shows the conflict with the delete: > > --XZ98hy0TJbr4sVETS44XBEJf7Yt > Last-Modified: Thu, 02 Oct 2014 11:29:15 GMT > > E > ��A�6- some binary > --XZ98hy0TJbr4sVETS44XBEJf7Yt > Content-Type: application/octet-stream > Link: ; rel="up" > Etag: 1MqocFt6qWeQxIw8bE1B8e > Last-Modified: Thu, 02 Oct 2014 11:29:03 GMT > X-Riak-Deleted: true > > > --XZ98hy0TJbr4sVETS44XBEJf7Yt-- > > > Further updates to the datatype using > http://host:8098/types/maps/buckets/mybucket/datatypes/some do NOT create a new > sibling; they replace the previous one. > > Problems: > > 1. Should I remove the datatype with a different method? Or how do I resolve such > conflicts? The data I receive is binary, and a query like > curl -H "Accept: multipart/mixed" > http://host:8098/types/maps/buckets/mybucket/datatypes/some > to get JSON data does not work. Should I post it back as binary? > > > 2. I also have a search index on this bucket. > Each further update to this datatype before resolution creates new > records in Solr, because > _yz_id includes the sibling id: > 1*maps*mybucket*some*34*46hGXxyhuW3yn3L8bRHIml > > The good news is that when I delete the record again, all entries in Solr are deleted too. > > > > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
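To make the write-back Russell describes concrete, here is a minimal sketch with the Erlang client. It assumes Pid is a connected riakc_pb_socket, reuses the bucket/key from the curl example above, and assumes the tombstone sibling has an empty value (as in the multipart dump, where only the CRDT sibling carries bytes):

    %% fetch through the normal K/V interface, siblings included
    {ok, Obj} = riakc_pb_socket:get(Pid, {<<"maps">>, <<"mybucket">>}, <<"some">>),
    %% keep the single non-empty sibling: the binary CRDT and its metadata
    [{MD, Crdt}] = [{M, V} || {M, V} <- riakc_obj:get_contents(Obj), V =/= <<>>],
    %% write it back under the same vclock to collapse the siblings
    Obj2 = riakc_obj:update_value(riakc_obj:update_metadata(Obj, MD), Crdt),
    ok = riakc_pb_socket:put(Pid, Obj2).

Because the value is a CRDT, any data-type update that races with this write-back will still merge cleanly.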
Re: occasional timeout when deleting key on multi-node Riak 1.4
On 2 Oct 2014, at 17:59, Igor Senderovich wrote: > There are no other errors in any of the logs at exactly the same time but > there are periodic errors in error.log and console.log of the following form > (and these occurred seconds before and after the crash): > > > ** Reason for termination = > ** > {{case_clause,"immediate"},[{riak_kv_vnode,do_delete,3,[{file,"src/riak_kv_vnode.erl"},{line,1321}]},{riak_core_vnode,vnode_command,3,[{file,"src/riak_core_vnode.erl"},{line,299}]},{gen_fsm,handle_m > sg,7,[{file,"gen_fsm.erl"},{line,494}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]} > 2014-10-02 12:07:57 =CRASH REPORT > crasher: > initial call: poolboy:init/1 > pid: <0.30125.18> > registered_name: [] > exception exit: > {{{case_clause,"immediate"},[{riak_kv_vnode,do_delete,3,[{file,"src/riak_kv_vnode.erl"},{line,1321}]},{riak_core_vnode,vnode_command,3, Can I see your config? Looks like you have delete_mode configured with the string “immediate” rather than the atom ‘immediate’. Cheers Russell > > > On Thu, Oct 2, 2014 at 12:20 PM, Dmitri Zagidulin > wrote: > Thanks. Are there entries in any of the other logs (like the crash dump)? > > Can you also provide more info on the nodes themselves? What size AWS > instances are you running? Is the delete timeout happening while load > testing? > > On Thu, Oct 2, 2014 at 12:11 PM, Igor Senderovich > wrote: > Thanks for your help, Dmitri, > > I get the following in error.log: > 2014-10-02 12:05:45.037 [error] <0.6359.19> Webmachine error at path > "/buckets/imc/keys/5134a18660494ea5553d2c90ef9eea2f" : "Service Unavailable" > > And no, there is no load balancer on our cluster. > Thank you > > > On Thu, Oct 2, 2014 at 11:52 AM, Dmitri Zagidulin > wrote: > One other question - are you using a load balancer for your cluster (like > HAProxy or the like)? If so, take a look at its logs, also. > > On Thu, Oct 2, 2014 at 11:51 AM, Dmitri Zagidulin > wrote: > Igor, > Can you look in the riak log directory, in the error.log (and console log and > crash dump file), to see if there are any entries around the time of the delete > operation? And post them here? > > > > On Thu, Oct 2, 2014 at 11:45 AM, Igor Senderovich > wrote: > Hi, > > I get a timeout when deleting a key, reproducible about 1 in 10 times: > $ curl -i -vvv > http://myhost:8098/buckets/imc/keys/5134a18660494ea5553d2c90ef9eea2f > > * About to connect() to dp1.prod6.ec2.cmg.net port 8098 > * Trying 10.12.239.90... connected > * Connected to dp1.prod6.ec2.cmg.net (10.12.239.90) port 8098 > > DELETE /buckets/imc/keys/5134a18660494ea5553d2c90ef9eea2f HTTP/1.1 > > User-Agent: curl/7.15.5 (x86_64-redhat-linux-gnu) libcurl/7.15.5 > > OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5 > > Host: dp1.prod6.ec2.cmg.net:8098 > > Accept: */* > > > < HTTP/1.1 503 Service Unavailable > HTTP/1.1 503 Service Unavailable > < Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact) > Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact) > < Date: Wed, 01 Oct 2014 16:11:41 GMT > Date: Wed, 01 Oct 2014 16:11:41 GMT > < Content-Type: text/plain > Content-Type: text/plain > < Content-Length: 18 > Content-Length: 18 > > request timed out > * Connection #0 to host dp1.prod6.ec2.cmg.net left intact > * Closing connection #0 > > > This is on Riak 1.4 on a 5 node cluster with an n-value of 3.
> Thank you for your help > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
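For anyone else hitting the {case_clause,"immediate"} crash: delete_mode is read as a bare Erlang term, so it must be an atom (or an integer number of milliseconds), never a string. A sketch of the relevant riak_kv section of app.config, on that assumption:

    {riak_kv, [
        %% {delete_mode, "immediate"}  %% wrong: the string crashes do_delete
        {delete_mode, immediate}       %% right: the bare atom
        %% also valid: keep, or a delay in milliseconds, e.g. 3000 (the default)
    ]}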
Re: map CRDT performance/limits
Hi, Sorry to say that Maps are just Riak objects underneath, so the same size limit applies. Anything over 1 MB and you'll start to feel the pain. Cheers Russell On 14 Nov 2014, at 19:34, Mark Rechler wrote: > Hi Everyone, > > I'm curious if anyone has had experience using maps as containers for > a large amount of data. > > Are there any inherent size limits for maps? Could you put sets > equivalent to several gigabytes of data into one map? I'd also be > curious as to the performance impact of deleting said map. Would it be > the same impact as scanning keys in a bucket and deleting > sequentially? > > I'm exploring using a map to organize data within a time bucket, if you > will, to get around the LevelDB back-end not having TTL support. > > Thanks in advance! > > -- > Mark Rechler > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
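One way to sketch the time-bucket idea with the 2.0 Erlang client: everything for a window goes into one map keyed by the window, so expiry is a single delete rather than a key scan. Pid, the maps bucket type, and the key/field names below are all illustrative assumptions:

    %% record an event in this hour's map (one map per hour, hypothetical scheme)
    M = riakc_map:update({<<"events">>, set},
                         fun(S) -> riakc_set:add_element(<<"event-1">>, S) end,
                         riakc_map:new()),
    ok = riakc_pb_socket:update_type(Pid, {<<"maps">>, <<"by_hour">>},
                                     <<"2014-11-14T19">>, riakc_map:to_op(M)).

The caveat above still applies: each window's map is read and written as one Riak object, so size the windows to stay comfortably under 1 MB.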
Re: Weird RIAK behavior
Did you check that you had the intermediate value after the second write, as expected? Do you have allow_mult set to true? On 23 Dec 2014, at 03:20, Claudio Cesar Sanchez Tejeda wrote: > Hi, > > On Mon, Dec 22, 2014 at 11:54 PM, Alexander Sicular > wrote: >> Same client code writing to all 5 clusters? > > Yes, it is the same code. > >> >> How does the config of the 5th cluster differ from the first 4? >> > > Two clusters have an additional memory backend configured (they are > configured with the multi-backend). The cluster with issues is one of these > two clusters. > >> Quick notes: >> Minimum of 5 nodes for a production deployment, to ensure the default 3 >> replicas are all on different physical nodes, which is a good segue into the >> fact that you shouldn't run multiple Riak nodes on the same physical >> hardware. Performance aside, if you lose that physical machine you lose all >> your data. > > Yes, these clusters (that are located on the same physical machine) > are used by the development team. > > Regards. > >> >> >> @siculars >> http://siculars.posthaven.com >> >> Sent from my iRotaryPhone >> >>> On Dec 22, 2014, at 18:59, Claudio Cesar Sanchez Tejeda >>> wrote: >>> >>> I'm a sysadmin and I'm managing 5 clusters of Riak: >>> >>> - two of them are LXC containers on the same physical machine (3 nodes >>> per cluster) >>> - one of them consists of LXC containers located on different physical >>> machines (6 nodes) >>> - one of them consists of LXC containers located on different physical >>> machines and XEN VMs (6 nodes) >>> - and the last of them consists of VMware ESX VMs (3 nodes) >>> >>> Our application works correctly on the first four clusters, but it >>> doesn't work as we expected on the last one. >>> >>> When we update a key and we retrieve this key in order to write it >>> again, it has an old value (it doesn't have the latest value that we >>> wrote), for example: >>> >>> The key has: lalala >>> We retrieve the key, and add lololo, so it should be lalala,lololo >>> We retrieve the key again, and try to add lelele, so it should now be: >>> lalala,lololo,lelele, but when we retrieve it again, we only have: >>> lalala,lelele >>> >>> In the second write action, when we retrieve the key, we obtain a >>> key with the old value. We set r, w, pr and rw to 3 on the REST >>> requests, but it doesn't help. >>> >>> All the configuration files are very similar and we don't have any >>> major differences in the disk I/O and network performance of the nodes >>> of the clusters. >>> >>> Has anyone had a similar issue? >>> >>> Regards. >>> >>> ___ >>> riak-users mailing list >>> riak-users@lists.basho.com >>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
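A quick way to rule the bucket settings in or out, as a sketch with the Erlang client (Pid a connected riakc_pb_socket, bucket name illustrative): with allow_mult false, two concurrent writes quietly collapse to one, which is exactly the lost "lololo" pattern above.

    {ok, Props} = riakc_pb_socket:get_bucket(Pid, <<"mybucket">>),
    proplists:get_value(allow_mult, Props),
    %% enable siblings so concurrent writes surface instead of vanishing:
    ok = riakc_pb_socket:set_bucket(Pid, <<"mybucket">>, [{allow_mult, true}]).

With allow_mult true the application must then merge siblings on read (for append-style data like this, the union of the values), but no write is silently discarded.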
Re: Random timeouts on Riak
On 29 Dec 2014, at 12:09, Jason Ryan wrote: > All types/buckets we use are set to allow_mult: false - last_write_wins:true Did you change to these settings after these keys were written? Looks like a bug in Riak, so I’m going to open a ticket: hd([]) should never be called in reconcile. But any further help we can get from you would be appreciated. Is there any way you can get each of the N values from their backends separately so I can see their content lists? > > > > > Jason Ryan > VP Engineering > > Trustev > Real Time, Online Identity Verification > > email: jason.r...@trustev.com > skype: jason_j_ryan > web: www.trustev.com > > Trustev Ltd, 2100 Cork Airport Business Park, Cork, Ireland. > > On 29 December 2014 at 12:08, Sargun Dhillon wrote: > The bucket (type) that you're working with -- what are your > allow_mult, and last_write_wins settings? > > On Mon, Dec 29, 2014 at 4:05 AM, Jason Ryan wrote: > > It seems to move between 4 keys in particular; these keys are actually empty > > at the moment (i.e. an empty JSON document). > > > > CPU utilization is close to zero. > > > > Can't see anything in particular, bar the error message I just posted > > before. > > > > Jason > > > > > > On 29 December 2014 at 11:58, Ciprian Manea wrote: > >> > >> Hi Jason, > >> > >> Are these random timeouts happening for only one key, or is it common for > >> more? > >> > >> What is the CPU utilisation in the cluster when you're experiencing these > >> timeouts? > >> > >> Can you spot anything peculiar in your server's $ dmesg outputs? Any I/O > >> errors there? > >> > >> > >> Regards, > >> Ciprian > >> > >> On Mon, Dec 29, 2014 at 1:55 PM, Sargun Dhillon wrote: > >>> > >>> Several things: > >>> 1) I recommend you have a 5-node cluster: > >>> http://basho.com/why-your-riak-cluster-should-have-at-least-five-nodes/ > >>> 2) What version of Riak are you using? > >>> 3) What backend(s) are you using? > >>> 4) What's the size of your keyspace? > >>> 5) Are you actively rewriting keys, or writing keys to the cluster? > >>> 6) Do you know how much I/O the cluster is currently doing? > >>> > >>> On Mon, Dec 29, 2014 at 2:51 AM, Jason Ryan > >>> wrote: > >>> > Hi, > >>> > > >>> > We are getting random timeouts from our application (>60 seconds) when > >>> > we try > >>> > to retrieve a key from our Riak cluster (4 nodes with a load balancer > >>> > in > >>> > front of them). Our application just uses the standard REST API to > >>> > query > >>> > Riak. > >>> > > >>> > We are pretty new to Riak - so we would like to understand how best to > >>> > debug > >>> > this issue. Are there any good pointers on what to start with? This is > >>> > our > >>> > production cluster. > >>> > > >>> > Thanks, > >>> > Jason > >>> > > >>> > > >>> > This message is for the named person's use only. If you received this > >>> > message in error, please immediately delete it and all copies and > >>> > notify the > >>> > sender. You must not, directly or indirectly, use, disclose, > >>> > distribute, > >>> > print, or copy any part of this message if you are not the intended > >>> > recipient. Any views expressed in this message are those of the > >>> > individual > >>> > sender and not Trustev Ltd. Trustev is registered in Ireland No. 516425 > >>> > and > >>> > trades from 2100 Cork Airport Business Park, Cork, Ireland.
> >>> > > >>> > > >>> > ___ > >>> > riak-users mailing list > >>> > riak-users@lists.basho.com > >>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > >>> > > >>> > >>> ___ > >>> riak-users mailing list > >>> riak-users@lists.basho.com > >>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > >> > >> > > > > > > This message is for the named person's use only. If you received this > > message in error, please immediately delete it and all copies and notify the > > sender. You must not, directly or indirectly, use, disclose, distribute, > > print, or copy any part of this message if you are not the intended > > recipient. Any views expressed in this message are those of the individual > > sender and not Trustev Ltd. Trustev is registered in Ireland No. 516425 and > > trades from 2100 Cork Airport Business Park, Cork, Ireland. > > > This message is for the named person's use only. If you received this message > in error, please immediately delete it and all copies and notify the sender. > You must not, directly or indirectly, use, disclose, distribute, print, or > copy any part of this message if you are not the intended recipient. Any > views expressed in this message are those of the individual sender and not > Trustev Ltd. Trustev is registered in Ireland No. 516425 and trades from 2100 > Cork Airport Business Park, Cork, Ireland. > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.
Re: Random timeouts on Riak
Hi Jason, I opened https://github.com/basho/riak_kv/issues/1069. Feel free to add any information to it that you think is pertinent. On 29 Dec 2014, at 12:26, Jason Ryan wrote: > No - those settings were set during the setup of the cluster. > > >> Is there any way you can get each of the N values from their backends separately so > >> I can see their content lists? > > The N value is 3 - or is it a particular command you want us to run on each > node? I will try and get this for you. It will involve attaching to the nodes in question and running some Erlang in the console, probably. Cheers Russell > > Thanks, > Jason > > > On 29 December 2014 at 12:19, Russell Brown wrote: > > On 29 Dec 2014, at 12:09, Jason Ryan wrote: > >> All types/buckets we use are set to allow_mult: false - last_write_wins:true > > Did you change to these settings after these keys were written? > > Looks like a bug in Riak, so I’m going to open a ticket: hd([]) should never > be called in reconcile. But any further help we can get from you would be > appreciated. Is there any way you can get each of the N values from their backends > separately so I can see their content lists? > >> >> >> On 29 December 2014 at 12:08, Sargun Dhillon wrote: >> The bucket (type) that you're working with -- what are your >> allow_mult, and last_write_wins settings? >> >> On Mon, Dec 29, 2014 at 4:05 AM, Jason Ryan wrote: >> > It seems to move between 4 keys in particular; these keys are actually >> > empty >> > at the moment (i.e. an empty JSON document). >> > >> > CPU utilization is close to zero. >> > >> > Can't see anything in particular, bar the error message I just posted >> > before. >> > >> > Jason >> > >> > >> > On 29 December 2014 at 11:58, Ciprian Manea wrote: >> >> >> >> Hi Jason, >> >> >> >> Are these random timeouts happening for only one key, or is it common for >> >> more? >> >> >> >> What is the CPU utilisation in the cluster when you're experiencing these >> >> timeouts? >> >> >> >> Can you spot anything peculiar in your server's $ dmesg outputs? Any I/O >> >> errors there? >> >> >> >> >> >> Regards, >> >> Ciprian >> >> >> >> On Mon, Dec 29, 2014 at 1:55 PM, Sargun Dhillon wrote: >> >>> >> >>> Several things: >> >>> 1) I recommend you have a 5-node cluster: >> >>> http://basho.com/why-your-riak-cluster-should-have-at-least-five-nodes/ >> >>> 2) What version of Riak are you using? >> >>> 3) What backend(s) are you using? >> >>> 4) What's the size of your keyspace? >> >>> 5) Are you actively rewriting keys, or writing keys to the cluster? >> >>> 6) Do you know how much I/O the cluster is currently doing? >> >>> >> >>> On Mon, Dec 29, 2014 at 2:51 AM, Jason Ryan >> >>> wrote: >> >>> > Hi, >> >>> > >> >>> > We are getting random timeouts from our application (>60 seconds) when >> >>> > we try >> >>> > to retrieve a key from our Riak cluster (4 nodes with a load balancer >> >>> > in >> >>> > front of them). Our application just uses the standard REST API to >> >>> > query >> >>> > Riak. >> >>> > >> >>> > We are pretty new to Riak - so we would like to understand how best to >> >>> > debug >> >>> > this issue. Are there any good pointers on what to start with? This is >> >>> > our >> >>> > production cluster. >> >>> > >> >>> > Thanks, >> >>> > Jason >> >>> > >> >>> > >> >>> > This message is for the named person's use only. If you received this >> >>> > message in error, please immediately delete it and all copies and >> >>> > notify the >> >>> > sender.
You must not, directly or indirectly, use, disclose, >> >>> > distribute, >> >>> > print, or copy any part of this message if you are not the intended >> >>> > recipient. Any views expressed in this message are those of the >> >>> > individual >> >>> > sender and not Trustev Ltd. Trustev is registered in Ireland No. 51
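For reference, the console session Russell mentions usually starts by locating the key's replicas. A minimal sketch from riak attach, assuming the default n_val of 3 and illustrative bucket/key names:

    %% hash the bucket/key pair and look up its primary preflist
    1> DocIdx = riak_core_util:chash_key({<<"mybucket">>, <<"mykey">>}).
    2> riak_core_apl:get_primary_apl(DocIdx, 3, riak_kv).
    %% -> [{{Partition, Node}, primary}, ...]: the three vnodes holding the key

Each {Partition, Node} pair identifies a node and backend partition whose copy of the object can then be inspected individually.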