Riak cluster creation when node named riak@127.0.0.1

2012-05-27 Thread Matt Black
Hey list,

I'm looking for some advice on the best steps to remedy the situation below.

I have an already running (in production) node using the Bitcask backend,
which I would like to migrate to eLevelDB for 2i support. My plan was to
introduce another node to the cluster with the eLevelDB backend and allow
the two nodes to synchronise, before taking out the Bitcask node. Please
advise if this is a bad approach.

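For reference, the relevant app.config change on the new node would be
roughly this - a minimal sketch, assuming the stock eLevelDB backend
module:

{riak_kv, [
    {storage_backend, riak_kv_eleveldb_backend}
]},
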
The major problem I have, however, is that the Bitcask node was never
properly configured, and is currently named "riak@127.0.0.1", listening on
127.0.0.1:8098. I don't see any issues with changing the listening IP and
restarting the node, but obviously I can't change the node name.

So, will there be a problem with introducing, say, two new nodes which are
correctly configured and then decommissioning this first node after the
data has been shared out? If the node name is just that - a name - then I
think I'll be okay..

Not an ideal situation to be in, so any and all suggestions appreciated.

Thanks
Matt
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Secondary indexes on Multi Backend

2012-05-27 Thread Matt Black
Hey list,

Can someone confirm if 2i will work with multi backend please? I get this
error when running an index phase in map reduce:
"indexes_not_supported,riak_kv_multi_backend".

Yet, the following pull request suggests that this would work:

https://github.com/basho/riak_kv/pull/258/commits

And commit 8e11d114 appears in tag 1.1.1 which I'm running:

https://github.com/basho/riak_kv/commits/1.1.1?page=3

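For reference, the multi backend section of my app.config looks roughly
like this (a sketch - the backend names are illustrative):

{riak_kv, [
    {storage_backend, riak_kv_multi_backend},
    {multi_backend_default, <<"eleveldb_mult">>},
    {multi_backend, [
        {<<"eleveldb_mult">>, riak_kv_eleveldb_backend, []},
        {<<"bitcask_mult">>, riak_kv_bitcask_backend, []}
    ]}
]},
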
Thanks
Matt
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Map Reduce with Python

2012-06-14 Thread Matt Black
Hello list,

In an M/R query, I'd like to be able to merge objects from two different
buckets in my output. So my process is a map phase for each bucket, with a
link phase in the middle, and a reduce phase to do a bit of merge
processing at the end.

client = riak.RiakClient(RIAK_HOST, RIAK_PORT)
query = client.add("carts")
query.map("function(v) { return [[ v.bucket, v.key, Riak.mapValuesJson(v)[0] ]]; }", {'keep': False})
query.link("cart-products", "cart-products", False)
query.link("products", "product", False)
query.map("function(v) { return [ Riak.mapValuesJson(v)[0] ]; }", {'keep': False})
query.reduce("function(values) { return values; }")

This query will return only objects from the "products" bucket.

I tried using the "keep" flags on each map phase, but in this case I get
output from each map phase and the reduce phase appears to be ignored -
which is somewhat unexpected.

Please advise on changes or things I could try.

Thanks
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Riak behind a Load Balancer

2012-06-24 Thread Matt Black
Dear list,

Does anyone have an opinion on the concept of putting a Riak cluster behind
a load balancer?

We wish to be able to automatically add/remove nodes from the cluster, so
adding an extra layer at the front is desirable. We should also benefit
from incoming requests being shared across all nodes.

Can anyone see any drawbacks / problems with doing this?

Thanks
Matt
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak behind a Load Balancer

2012-09-09 Thread Matt Black
I'm currently running six Riak nodes on EC2 small instances behind an ELB.
This works fine for us - although when running a large map reduce task we
connect directly to a single node rather than routing through the ELB. I
don't have any actual performance statistics to hand, but I could get some
if the list is interested.

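For anyone curious, the client-side split is trivial - a sketch using the
Python bindings, with hypothetical hostnames:

import riak

# Day-to-day traffic goes through the ELB hostname
lb_client = riak.RiakClient('riak-elb.example.internal', 8098)
# Large map reduce jobs go straight at a single node
mr_client = riak.RiakClient('10.0.1.15', 8098)
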
On 9 September 2012 03:25, gibraltar  wrote:

>
> I wonder what would happen if one ran a Riak cluster on 5/10/15+ EC2
> micro (or small) Linux machines behind an Elastic Load Balancer? Do you
> think it would perform well enough for a web site with moderate traffic?
> The idea is having many "small" machines rather than a couple of "big"
> machines.
>
> Does anyone have experience with something similar?
>
> Thanks,
> Gibraltar
>
> On Aug 30, 2012, at 5:09 PM, Sean Carey  wrote:
>
>  Matt,
> Haproxy is my load balancer of choice. You can always run multiple copies
> of haproxy and use some type of dynamic dns with it.
>
> We do this in many cases. Haproxy scales well. I've seen a single node
> sustain multiple gigabits per second with almost no sweat.
>
>
> Thanks.
>
>
> Sean
>
> On Monday, June 25, 2012 at 7:36 AM, Matt Black wrote:
>
> Dear list,
>
> Does anyone have an opinion on the concept of putting a Riak cluster
> behind a load balancer?
>
> We wish to be able to automatically add/remove nodes from the cluster, so
> adding an extra layer at the front is desirable. We should also benefit
> from incoming requests being shared across all nodes.
>
> Can anyone see any drawbacks / problems with doing this?
>
> Thanks
> Matt
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
>  ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>


-- 

*Matt Black*
Senior Developer

[image: JBA Digital] <http://www.jbadigital.com/>

*JBA Online Consultancy*

M: +61 (0) 421 073 321
W: www.jbadigital.com
A:  Level 1, 2 Darling Street, South Yarra, Melbourne 3141

The information contained in this email is confidential and is intended for
the use of the individual or entity named above. If the receiver of this
message is not the intended recipient, you are hereby notified that any
dissemination, distribution or copy of this email is strictly prohibited.
If you have received this e-mail in error, please notify our office by
telephone. JB/A and their employees do not represent that this transmission
is free from viruses or other defects and you should see it as your
responsibility to check for viruses and defects. JB/A disclaims any
liability to any person for loss or damage resulting (directly or
indirectly) from the receipt of electronic mail (including enclosures).
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Riak upgrade 1.1.1 -> 1.2

2012-09-11 Thread Matt Black
Hey list,

I'm going to be doing our first full-cluster upgrade in production soon.
The docs suggest this is as easy as stopping a node, upgrading, and
restarting the node - doing each node one at a time. Before I start, it'd
be great if someone could confirm that I have this right.. :)

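For the record, my per-node plan looks roughly like the following sketch
(package and node names are illustrative):

riak stop
sudo dpkg -i riak_1.2.0-1_amd64.deb
riak start
riak-admin wait-for-service riak_kv riak@10.0.1.15
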
Thanks
Matt
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Cloning a cluster / copying all cluster data

2012-09-11 Thread Matt Black
Hey again list,

Can anyone recommend a process for either cloning a cluster, or backing up
a whole cluster's data to restore elsewhere? This would be incredibly
useful for creating a new environment to test upgrades and run application
regression testing.

In the dark old days, it would be a simple combination of mysqldump &
mysql..

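For a small enough dataset I'd reach for riak-admin backup/restore -
roughly the following sketch, with node names, cookie and paths
illustrative:

riak-admin backup riak@10.0.1.15 riak /backups/riak-all.bak all
riak-admin restore riak@10.0.2.20 riak /backups/riak-all.bak

But I'm hoping there's something better suited to a whole cluster.
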
Cheers
Matt
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Riak 1.2 Ubuntu Precise .deb

2012-09-23 Thread Matt Black
Hey basho,

There's no 32-bit deb listed under Downloads on your website.. I found it
eventually, but it should probably be listed for all varieties of Ubuntu.

Cheers!
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Riak Python Bindings 1.5.0

2012-09-24 Thread Matt Black
Hey list,

I'm upgrading to Riak 1.2 and moving over to the latest Python bindings at
the same time. What's the current recommended install process for this? I'd
(obviously) like to use pip for simplicity - but it seems that's not
currently working (details below). I recall manually installing protobuf
once upon a time, but I thought we were beyond this now?

Thanks y'all

From pip install riak==1.5.0:

Downloading/unpacking riak-pb>=1.2.0,<1.3.0 (from riak)
  Running setup.py egg_info for package riak-pb
Traceback (most recent call last):
  File "", line 14, in 
  File "/home/ubuntu/build/riak-pb/setup.py", line 4, in 
from proto_cmd import build_proto, clean_proto
ImportError: No module named proto_cmd
Complete output from command python setup.py egg_info:
Traceback (most recent call last):

  File "", line 14, in 

  File "/home/ubuntu/build/riak-pb/setup.py", line 4, in 

from proto_cmd import build_proto, clean_proto

ImportError: No module named proto_cmd
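
In case it helps: the manual workaround I vaguely remember was installing
protobuf by hand before the riak package - something like the following
sketch, with illustrative versions:

pip install protobuf==2.4.1
pip install riak==1.5.0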
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Cloning a cluster / copying all cluster data

2012-09-27 Thread Matt Black
Thanks for your answers.

So let's say I have a production cluster with six nodes, and my data is
then evenly distributed across all six. How would I approach cloning that
into a three node cluster?

I actually tried a background process which rsync'd a six node cluster into
a new one - each node one-for-one.. And it didn't work. No nodes could join
the cluster and Riak was crashing out periodically. I didn't really have
time to investigate exactly why - I'm just trying out some process stuff.

Any thoughts?

On 12 September 2012 10:18, Matthew Tovbin  wrote:

> Matt,
>
> We copied all the data key by key, since we used incorrect
> 'ring_creation_size' value.
> (
> http://riak-users.197444.n3.nabble.com/Cluster-migration-due-to-incorrect-quot-ring-creation-size-quot-value-td4024509.html
> )
>
> So, you can use this copy tool - https://github.com/tovbinm/riak-tools,
> which proved itself copying 200m+ items for us.
>
> -Matthew
>
>
>
> On Tue, Sep 11, 2012 at 5:09 PM, Kresten Krab Thorup wrote:
>
>> Simply copying the riak/data and riak/etc directories should do the
>> trick. You can use tar, rsync or simply cp as you feel like, and you can do
>> all that while Riak is running at full throttle.  The beauty of log
>> structured storage (Bitcask, LevelDB, HanoiDB).
>>
>> At least that's the idea. I seem to remember that there was an issue with
>> LevelDB that made basho recommend that you stop a node before doing backup,
>> I dunno if that was fixed.
>>
>> If its just for debugging/dev purposes you can just stop sending requests
>> to riak while doing the backup and you should definitively be fine.
>>
>> Kresten, Trifork
>>
>> Den 12/09/2012 kl. 01.59 skrev "Matt Black" :
>>
>> > Hey again list,
>> >
>> > Can anyone recommend a process for either cloning a cluster, or backing
>> up a whole cluster's data to restore elsewhere? This would be incredibly
>> useful for creating a new environment to test upgrades and run application
>> regression testing.
>> >
>> > In the dark old days, it would be a simple combination of mysqldump &
>> mysql..
>> >
>> > Cheers
>> > Matt
>> > ___
>> > riak-users mailing list
>> > riak-users@lists.basho.com
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


noproc error on Map/Reduce

2012-10-10 Thread Matt Black
Hello list,

I've written a large map/reduce query using the Python bindings - basically
it's an expanded version of the users/tags example in this post:

http://basho.com/blog/technical/2010/04/14/practical-map-reduce:-forwarding-and-collecting/

When I run the query it fails with this error, with which I am unfamiliar..
Any thoughts on how to diagnose?

Exception: Error running MapReduce operation. Headers: {'date': 'Thu, 11
Oct 2012 03:57:39 GMT', 'content-length': '1211', 'content-type':
'application/json', 'http_code': 500, 'server': 'MochiWeb/1.1
WebMachine/1.9.0 (someone had painted it blue)'} Body:
'{"phase":0,"error":"{noproc,{gen_server,call,[riak_kv_js_map,{reserve_vm,<0.23487.1151>},infinity]}}","input":"{ok,{r_object,<<\\"carts\\">>,<<\\"1284510f1d9013cf43d44cce3fde7847\\">>,[{r_content,{dict,6,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[[<<\\"Links\\">>,{{<<\\"cart-products\\">>,<<\\"1284510f1d9013cf43d44cce3fde7847-0\\">>},<<\\"cart-products\\">>},{{<<\\"users\\">>,<<\\"1284510f1d9013cf43d44cce3fde7847\\">>},<<\\"user\\">>}]],[],[],[],[],[],[],[],[[<<\\"content-type\\">>,97,112,112,108,105,99,97,116,105,111,110,47,106,115,111,110],[<<\\"X-Riak-VT...\\">>,...]],...}}},...}],...},...}","type":"exit","stack":"[{gen_server,call,3,[{file,\\"gen_server.erl\\"},{line,188}]},{riak_kv_js_manager,blocking_dispatch,4,[{file,\\"src/riak_kv_js_manager.erl\\"},{line,250}]},{riak_kv_mrc_map,map_js,3,[{file,\\"src/riak_kv_mrc_map.erl\\"},{line,192}]},{riak_kv_mrc_map,process,3,[{file,\\"src/riak_kv_mrc_map.erl\\"},{line,140}]},{riak_pipe_vnode_worker,process_input,3,[{file,\\"src/riak_pipe_vnode_worker.erl\\"},{line,445}]},{riak_pipe_vnode_worker,wait_for_input,2,[{file,\\"src/riak_pipe_vnode_worker.erl\\"},{line,377}]},{gen_fsm,handle_msg,...},...]"}'

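A note for anyone hitting this later: the noproc on riak_kv_js_map
suggests the JavaScript VM pool may be exhausted, so raising the pool
sizes in app.config might help - a sketch, with illustrative values:

{riak_kv, [
    {map_js_vm_count, 24},
    {reduce_js_vm_count, 18},
    {hook_js_vm_count, 2}
]},
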
Thanks
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: riak cluster mitosis

2012-10-18 Thread Matt Black
This is something I've been wondering about of late. It seems to me it
would be useful to at least be able to tell a single node to take a whole
copy of all the data and then leave the cluster.

You could then start a new cluster from that node. It would (possibly) save
time in backup and restores..


On 19 October 2012 08:35, David Lowell  wrote:

> If I have a cluster of say, 4 nodes, is it possible to split that cluster
> into two clusters of 2 nodes, each with a full complement of the original
> cluster's data, while the data is continuously being served? Obviously, we
> would require that the data be small enough to fit on 2 nodes.
>
> Thanks for your help,
>
> Dave
>
> --
> Dave Lowell
> d...@connectv.com
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: riak cluster mitosis

2012-10-24 Thread Matt Black
Hi Simon,

My intention for this is to create a pre-prod test environment - so having
an exact replica of the data isn't an issue. Previously I have used the
riak-admin backup/restore, but the file has grown to about 70GB, and so
takes around a day to backup, copy and restore into a new cluster.

Multi-data centre replication looks interesting - and probably we'll end up
using that for redundancy reasons. In the short term I'm looking for an
as-quick-as-possible spawn of a new cluster, with as-accurate-as-possible
data in there..

Asking too much?! Don't want to sound ungrateful ;)


On 24 October 2012 01:07, Simon Vans-Colina  wrote:

> Hi David, Matt,
>
> We've just discussed this and its a bit tricky. For 4 servers with N=3 it
> is possible to stop 2 nodes, (say 3 and 4) and then send a "force-remove"
> command to the remaining servers (1 and 2).
>
> Then you could start (3 and 4), and send *them* a force-remove of (1 and
> 2).
>
> This is risky because you're only guaranteed one replica of the data on
> each of the new clusters.
>
> The other risk is that new data written to (1 and 2) will not be
> replicated over to (3 and 4). Are you planning on re-joining the cluster
> back up afterwards?
>
> There are better ways than doing "mitosis" (although I love the name).
> Look into Multi Data Center Replication, or just add new nodes and do a
> force-replace. If the data is small enough, you could use the backup
> command.
>
> Hope I got this right, I'm quite new to this myself.
>
> Cheers
>
> On Thu, Oct 18, 2012 at 10:35 PM, David Lowell  wrote:
>
>> If I have a cluster of say, 4 nodes, is it possible to split that cluster
>> into two clusters of 2 nodes, each with a full complement of the original
>> cluster's data, while the data is continuously being served? Obviously, we
>> would require that the data be small enough to fit on 2 nodes.
>>
>> Thanks for your help,
>>
>> Dave
>>
>> --
>> Dave Lowell
>> d...@connectv.com
>>
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>
>
> --
> Simon Vans-Colina - Client Services Engineer - Basho
>
> Tel:+44 744 791 4640
> Twitter: @simonvc
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Migrating cluster from 1.1.1 to 1.2.1

2012-10-29 Thread Matt Black
Hello list,

I just started a cluster migration, and encountered a problem with
introducing the first replacement node :)

I'm not able to either "cluster plan" or "cluster commit" with the new node:

root@ip-10-138-x-221:~# riak-admin cluster join riak@10.130.x.125
Attempting to restart script through sudo -H -u riak
Success: staged join request for 'riak@10.138.x.221' to 'riak@10.130.x.125'

root@ip-10-138-x-221:~# riak-admin cluster plan
Attempting to restart script through sudo -H -u riak
RPC to 'riak@10.138.x.221' failed: {'EXIT',
    {noproc,
     {gen_server,call,
      [{riak_core_claimant,'riak@10.128.x.82'},
       plan,infinity]}}}

root@ip-10-138-x-221:~# riak-admin cluster commit
Attempting to restart script through sudo -H -u riak
RPC to 'riak@10.138.x.221' failed: {'EXIT',
    {noproc,
     {gen_server,call,
      [{riak_core_claimant,'riak@10.128.x.82'},
       commit,infinity]}}}

Now, since the "cluster" group of commands was new in 1.2 - should I be
using the older join commands for this migration?

Thanks
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Migrating cluster from 1.1.1 to 1.2.1

2012-10-29 Thread Matt Black
Following up on my own email, minutes after I sent it...

The node join appeared stalled before I wrote the message below - as
though I needed to issue the "cluster commit" command. However, I just
checked "member_status" and partitions are actually being transferred to
my new node.


On 30 October 2012 13:42, Matt Black  wrote:

> Hello list,
>
> I just started a cluster migration, and encountered a problem with
> introducing the first replacement node :)
>
> I'm not able to either "cluster plan" or "cluster commit" with the new
> node:
>
> root@ip-10-138-x-221:~# riak-admin cluster join riak@10.130.x.125
> Attempting to restart script through sudo -H -u riak
> Success: staged join request for 'riak@10.138.x.221' to 'riak@10.130.x.125
> '
>
> root@ip-10-138-x-221:~# riak-admin cluster plan
> Attempting to restart script through sudo -H -u riak
> RPC to 'riak@10.138.x.221' failed: {'EXIT',
>   {noproc,
>{gen_server,call,
> [{riak_core_claimant,
>   'riak@10.128.x.82'},
>  plan,infinity]}}}
>
> root@ip-10-138-x-221:~# riak-admin cluster commit
> Attempting to restart script through sudo -H -u riak
> RPC to 'riak@10.138.x.221' failed: {'EXIT',
>   {noproc,
>{gen_server,call,
> [{riak_core_claimant,
>   'riak@10.128.x.82'},
>  commit,infinity]}}}
>
> Now, since the "cluster" group of commands was new in 1.2 - should I be
> using the older join commands for this migration?
>
> Thanks
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: More Migration Questions

2012-11-13 Thread Matt Black
I still haven't really gotten to the bottom of the best way to do this
(short of paying for MDC):

http://lists.basho.com/pipermail/riak-users_lists.basho.com/2012-October/009951.html

Previously, I've used backup/restore for situations like this, but our
backup has now grown to around 100GB - so it has become impractical.

Shane, in your maintenance window could you:
* create your new cluster
* stop any new data being added to the old cluster
* run a riak-admin backup
* run a riak-admin restore into the new one

The maintenance window here saves you a lot of trouble... Unfortunately,
most people won't get one ;)

Cheers
Matt



On 14 November 2012 09:44, Martin Woods  wrote:

> Hi Tom
>
> I'd be very interested to know if Shane's approach should work, or if you
> know of any good reason why that approach would cause issues.
>
> Also, aren't there several very real business use cases here that users of
> Riak will inevitably encounter, and must be able to satisfy? Shane mentions
> two use cases below: creation of a test environment using a copy of data
> from a production cluster; and the migration of data within one cloud
> provider from one set of systems to a distinct, separate set of systems.
>
> To add to this, what about the case where a Riak customer needs to move
> from one cloud provider to another? How does this customer take his data
> with him?
>
> All of the above cases require that a separate cluster be spun up from the
> original cluster, with different names and IP addresses for the Riak nodes
>  involved in the cluster.
>
> None of these use cases are satisfied by using the riak-admin cluster
> command.
>
> It seemed that this was the purpose of the reip command, but if Basho is
> poised to deprecate this command, and indeed no longer recommends its use,
> how are the previous cases supported? Surely these are important scenarios
> for users of Riak, and therefore Basho?
>
> At one level, it seems it should be entirely possible to simply copy the
> data directory from each Riak node and tell Riak that the node names and IP
> addresses have changed (reip!). So what's the problem with doing this?
>
> Regards,
> Martin.
>
>
> On 13 November 2012 17:16, Thomas Santero  wrote:
>
>> Hi Shane,
>>
>> I'm sorry for the delay on this. Over the weekend I was working to
>> replicate your setup so I can answer your question from experience. Alas,
>> time got the best of me and I have not yet finished.
>>
>> That said, I'm inclined to suggest upgrading riak on your current cluster
>> first and then using riak-admin replace to move off of the VM's and onto
>> metal.
>>
>> * In this scenario, do a rolling upgrade (including making backups) of
>> the current cluster.
>> * Install riak onto the new machines
>> * join the first machine to the cluster
>> * use riak-admin replace to replace one of the old nodes with the new node
>> * wait for ring-ready, then repeat for the other nodes.
>>
>> Tom
>>
>>
>> On Tue, Nov 13, 2012 at 11:59 AM, Shane McEwan wrote:
>>
>>> Anyone? Beuller? :-)
>>>
>>> Installing Riak 1.1.1 on the new nodes, copying the data directories
>>> from the old nodes, issuing a "reip" on all the new nodes, starting up,
>>> waiting for partition handoffs to complete, shutting down, upgrading to
>>> 1.2.1 and starting up again got us to where we want to be. But this is not
>>> very convenient.
>>>
>>> What do I do when I come to creating our test environment where I'll be
>>> wanting to copy production data onto the test nodes on a regular basis? At
>>> that point I won't have the "luxury" of downgrading to 1.1.1 to have a
>>> working "reip" command.
>>>
>>> Surely there's gotta be an easier way to spin up a new cluster with new
>>> names and IPs but with old data?
>>>
>>> Shane.
>>>
>>>
>>> On 08/11/12 21:10, Shane McEwan wrote:
>>>
 G'day!

 Just to add to the list of people asking questions about migrating to
 1.2.1 . . .

 We're about to migrate our 4 node production Riak database from 1.1.1 to
 1.2.1. At the same time we're also migrating from virtual machines to
 physical machines. These machines will have new names and IP addresses.

 The process of doing rolling upgrades is well documented but I'm unsure
 of the correct procedure for moving to an entirely new cluster.

 We have the luxury of a maintenance window so we don't need to keep
 everything running during the migration. Therefore the current plan is
 to stop the current cluster, copy the Riak data directories to the new
 machines and start up the new cluster. The hazy part of the process is
 how we "reip" the database so it will work in the new cluster.

 We've tried using the "riak-admin reip" command but we

Re: Riak and host names

2013-01-07 Thread Matt Black
Thanks for this Charlie.

I'm running a production Riak cluster on AWS which runs constantly, and
I've been wondering how I might be able to easily stop and start AWS nodes
for a testing and benchmarking cluster (to save on cost).

By using the 'riaknode1.priv' hostname method you describe, would I be able
to stop and then restart a whole cluster of nodes at once? (As described by
Deepak, AWS assigns new IPs when a VM starts).

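For context, my understanding of that method is just a hosts entry plus a
matching -name - a sketch with a hypothetical private IP:

# /etc/hosts on the node
10.0.0.5    riaknode1.priv

# vm.args
-name riak@riaknode1.priv
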
Thanks
Matt


On 8 January 2013 01:31, Charlie Voiselle  wrote:

> Deepak:
>
> When you name a node in app.config with -name it has to have a '.' in it,
> like r...@hostname.net  As you have surmised, you can get around that if
> you use the -sname argument instead.
>
> They have to be done consistently.  In your example, had you used the
> -sname argument, `riak@riaknode1` would work.  Making a host entry
> `riaknode1.priv` that points to the local address would work with the -name
> argument.
>
> The important thing about -name and -sname is that they can't mix within a
> cluster.
>
> Cluster replace is designed to replace a node with a new one and transfer
> all the partitions. You can cheat and use it to rename a node though.
>
> The process to do this would look like the following:
>
>- Stop the node to rename with `riak stop`
>- Mark it 'down' *from another node in the cluster* using `riak-admin
>down «old nodename»`.
>- Rename the node in vm.args.
>- Delete the ring directory.
>- Start the node with `riak start`.
>- It will come up as a single instance which you can verify with
>`riak-admin member-status`.
>- Join the node to the cluster with `riak-admin cluster join «cluster
>nodename» `
>- Set it to replace the old instance of itself with `riak-admin
>cluster replace «old nodename» «new nodename»`
>- Plan the changes with `riak-admin cluster plan`
>- Commit the changes with `riak-admin cluster commit`
>
>
> As you can see, this is a very large effort, so best to use hostnames that
> aren't moving around.  Apologies for you getting this twice, Deepak. I
> failed to reply to the list as well.
>
> Hope this makes sense...
> Charlie
> On Jan 1, 2013, at 2:43 PM, Deepak Balasubramanyam 
> wrote:
>
> I took the AWS EC2 riak image for a spin today. I have a query regarding
> riak nodes and how they behave when the machine reboots.
>
> When an EC2 instance reboots, the internal ip / internal DNS / external
> DNS change. This renders the app.config and -name argument on vm.args
> incorrect. I was exploring solutions to deal with this problem.
>
> *1. Preventive measures*
>
> Someone on this thread dated May 2011 suggested
> using host file entries that point to the local internal IP address. That
> does not seem to work. Riak fails with the following error when I add a new
> entry to /etc/hosts and configure vm.args with -name riak@riaknode1
>
> Hostname riaknode1 is illegal
>
> I confirmed that riaknode1 pings correctly before starting riak. I guess
> erlang tries to match the hostname of the system resulting in this failure
> ? Can anyone throw some light on this ?
>
> *2. Use -sname*
>
> Is starting the erlang VM with the sname flag an option if it will help
> prevent the 'illegal hostname' error ?
> Disclaimer: My knowledge of erlang is close to zilch, so sorry if that
> option sounded like something you could dismiss easily :)
>
> *3. Use cluster replace*
>
> a. I understand that the IPs in app.config and vm.args can be replaced
> with the correct IP on a restart and using a subsequent 'cluster replace'
> command will do. Will executing the 'cluster plan' and 'cluster commit'
> commands now produce network chatter ?
>
> b. What happens if 2 nodes go down and one was joined with the other.
> They both have 2 different IP addresses on restart. How will 'cluster
> replace' work now ?
>
> Do let me know your thoughts.
>
> Thanks
> -Deepak
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Riak and host names

2013-01-09 Thread Matt Black
A quick update on this subject.

Using an Elastic IP won't help with AWS since that only binds to the public
interface - not the internal private one. The hostname command still
returns the same internal IP address as before, which is what's seen by
Riak.

In AWS an internal IP address will actually persist across reboots. It does
not persist across shutdown and startup.


On 8 January 2013 10:16, Richard Shaw  wrote:

> Hi Matt,
>
> Just to add to Charlie's suggestion, you could take a look at EC2 elastic
> IP addresses which would allow you to permanently map a public and private
> address to an EC2 instance, assign DNS hostnames and not have them change
> on reboot [1]
>
> [1] http://aws.amazon.com/articles/1346
>
> Regards
>
> Richard
>
> On 7 Jan 2013, at 23:03, Charlie Voiselle  wrote:
>
> > Matt:
> >
> > You would need to use (or implement your own) DNS service that you could
> programmatically access--Route 53 has an API that you could use to create
> DNS entries that point to the internal addresses of your nodes.   In very
> carefully re-reading the thread Deepak mentions, one problem that will
> occur is that each node needs to be able to resolve the other nodes by name
> also.  The only way for this to occur reasonably, would be to register the
> internal addresses with a single point that they share.  Some examples of
> free services that you might use for this are DynDns[1], DNSDynamic[2], or
> DNS-O-Matic[3].   I have also seen some projects floating around the web
> that might enable you to create a self-hosted dynamic DNS like opendyn[4]
> and GnuDIP[5]; however, I have had no occasion to use something like this
> in my own environment.   Some additional discussion about creating your own
> Dynamic DNS server is also at
> http://unix.stackexchange.com/questions/29049/how-to-create-a-custom-dynamic-dns-solution
> >
> > Hope this helps!
> > Charlie
> >
> > [1] http://www.dyn.com
> > [2] http://www.dnsdynamic.org
> > [3] http://www.dnsomatic.com
> > [4] http://code.google.com/p/opendyn/
> > [5] http://gnudip2.sourceforge.net/
> >
> > On Jan 7, 2013, at 5:00 PM, Matt Black 
> wrote:
> >
> >> Thanks for this Charlie.
> >>
> >> I'm running a production Riak cluster on AWS which runs constantly, and
> I've been wondering how I might be able to easily stop and start AWS nodes
> for a testing and benchmarking cluster (to save on cost).
> >>
> >> By using the 'riaknode1.priv' hostname method you describe, would I be
> able to stop and then restart a whole cluster of nodes at once? (As
> described by Deepak, AWS assigns new IPs when a VM starts).
> >>
> >> Thanks
> >> Matt
> >>
> >>
> >> On 8 January 2013 01:31, Charlie Voiselle  wrote:
> >> Deepak:
> >>
> >> When you name a node in app.config with -name it has to have a '.' in
> it,  like r...@hostname.net  As you have surmised, you can get around
> that if you use the -sname argument instead.
> >>
> >> They have to be done consistently.  In your example, had you used the
> -sname argument, `riak@riaknode1` would work.  Making a host entry
> `riaknode1.priv` that points to the local address would work with the -name
> argument.
> >>
> >> The important thing about -name and -sname is that they can't mix
> within a cluster.
> >>
> >> Cluster replace is designed to replace a node with a new one and
> transfer all the partitions. You can cheat and use it to rename a node
> though.
> >>
> >> The process to do this would look like the following:
> >>
> >>  • Stop the node to rename with `riak stop`
> >>  • Mark it 'down' from another node in the cluster using
> `riak-admin down «old nodename».
> >>  • Rename the node in vm.args.
> >>  • Delete the ring directory.
> >>  • Start the node with `riak start`.
> >>  • It will come up as a single instance which you can verify with
> `riak-admin member-status`.
> >>  • Join the node to the cluster with `riak-admin cluster join
> «cluster nodename» `
> >>  • Set it to replace the old instance of itself with `riak-admin
> cluster replace «old nodename» «new nodename»
> >>  • Plan the changes with `riak-admin cluster plan`
> >>  • Commit the changes with `riak-admin cluster commit`
> >>
> >> As you can see, this is a very large effort, so best to use hostnames
> that aren't moving around.  Apologies for you getting this twice, Deepak. I
> failed to reply to the list as well.

Re: Riak Node Failure

2013-01-13 Thread Matt Black
Hi Pavel,

Have you run "riak console" on the server to see what the output is? Could
you paste that here? I'm no expert, but I may be able to help diagnose a
problem with the node starting.


On 14 January 2013 15:19, Pavel Kogan  wrote:

> Hi all,
>
> I have been running a cluster of 5 Riak nodes on CentOS (Bitcask backend)
> for a couple of months.
> Until today everything was fine, but today one of the nodes just stopped
> working.
> When trying riak start/stop/... it just claims that the node doesn't
> respond to pings.
> Restarting the server didn't help. What should I do to recover this node?
>
> Thanks,
>Pavel
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Exception: {"phase":0,"error":"[timeout]

2013-01-24 Thread Matt Black
Hi David,

This is a known error that has been resolved in trunk, but is yet to be
released:

https://github.com/basho/riak_kv/issues/290

Any word on a release date, Basho guys?


On 25 January 2013 12:52, David Montgomery wrote:

> Hi,
>
> Why will riak throw this type of exception when I have a timeout of
> TIMEOUT=300?
>
> What does this error mean?
>
> As of now I can't get data out of a production system.
>
>
>
>
> 'Traceback (most recent call last):
>   File "gg.py", line 324, in 
> results =
> getDomainReportIndex(bucket=bucket,date_start=int(utc_start_date),date_end=int(utc_end_date))
>   File "gg.py", line 137, in getDomainReportIndex
> for result in query.run(timeout=TIMEOUT):
>   File
> "/usr/local/lib/python2.7/dist-packages/riak-1.5.1-py2.7.egg/riak/mapreduce.py",
> line 234, in run
> result = t.mapred(self._inputs, query, timeout)
>   File
> "/usr/local/lib/python2.7/dist-packages/riak-1.5.1-py2.7.egg/riak/transports/pbc.py",
> line 454, in mapred
> _handle_response)
>   File
> "/usr/local/lib/python2.7/dist-packages/riak-1.5.1-py2.7.egg/riak/transports/pbc.py",
> line 548, in send_msg_multi
> msg_code, resp = self.recv_msg(conn, expect)
>   File
> "/usr/local/lib/python2.7/dist-packages/riak-1.5.1-py2.7.egg/riak/transports/pbc.py",
> line 589, in recv_msg
> raise Exception(msg.errmsg)
> Exception:
> {"phase":0,"error":"[timeout]","input":"{<<\"impressions\">>,<<\"322c0473-9eeb-4c9c-81fe-4e898bc50416:cid5989410021:agid7744464312:2012122316:SG\">>}","type":"forward_preflist","stack":"[]"}
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: leveldb and noatime?

2013-03-18 Thread Matt Black
This warning is described in the docs at:

http://docs.basho.com/riak/latest/tutorials/choosing-a-backend/LevelDB/#Tuning-LevelDB
http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/Bitcask/#Tuning-Bitcask

In order to fix this you will need to update /etc/fstab and reboot.

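A minimal sketch of the fstab entry (device and mount point are
hypothetical):

/dev/xvdb    /var/lib/riak    ext4    defaults,noatime    0    2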

On 19 March 2013 08:27,  wrote:

> I am running 'riak-admin diag disk' and am getting the warning:
>
> 16:22:29.851 [notice] Data directory /var/lib/riak/leveldb is not mounted
> with 'noatime'. Please remount its disk with the 'noatime' flag to improve
> performance.
>
> Since I only have one partition I cannot very well 'remount' the disk.
> What is at the root of this warning? What can I do to satisfy this warning?
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Getting multiple values: is iterating or MapReduce preferred?

2013-03-25 Thread Matt Black
This is very interesting.

I've just been testing a multi-phase map reduce job which was intended to
replace some code where we look up several objects sequentially after a
small map-reduce job. Sounds like the current application code is the right
way to do it..

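For the archives, the sequential-GET version of a multi-get is about as
simple as it sounds - a sketch using the Python bindings:

def multi_get(client, bucket_name, keys):
    # Plain GETs: each request only touches that key's preference list
    bucket = client.bucket(bucket_name)
    return [bucket.get(key).get_data() for key in keys]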

On 26 March 2013 11:14, John Caprice  wrote:

> Rob,
>
> Performing GET requests either serially or concurrently is more efficient
> than using MapReduce to query for values.  MapReduce has additional
> overhead that GET requests do not have.  One example of this is that a GET
> is sent only to the nodes in the preference list for a given key, while a
> MapReduce query is sent to all nodes.
>
> There are appropriate uses of MapReduce.  Using MapReduce in a controlled
> manner outside of your peak production hours can minimize performance
> effects.  For example, using MapReduce nightly to
> perform maintenance, build reports etc.  It is important to ensure that
> MapReduce queries remain bounded.  Replacing serial / concurrent GETs in
> your application with MapReduce queries provides the opportunity for
> unbounded use, which can have severe performance consequences.
>
> Making separate requests, either serially or concurrently, is the optimal
> way to query data in Riak.  To an application developer, this might not
> look as elegant however it is much more efficient for Riak.
>
> Thanks,
>
> John
>
>
> On Mon, Mar 25, 2013 at 3:07 PM, Rob Speer  wrote:
>
>> I've looked at the archives of this mailing list to find a way to
>> implement a "multi-get" using Riak, for the very common case where there
>> are multiple keys to look up. Making a separate round-trip to the server
>> for each key seems inefficient, after all.
>>
>> I came across the suggestion to use MapReduce, so I tried implementing it
>> this way (using riak-python-client):
>>
>> def multi_get(self, bucket_name, ids):
>> if len(ids) == 0:
>> return []
>> mr = RiakMapReduce(self.riak)
>> for uid in ids:
>> mr.add(bucket_name, uid)
>> query = mr.map_values_json()
>> return query.run()
>>
>> After this I noticed significant load on the Riak servers, and the client
>> code would sometimes stall for a long time, even on a multi_get that was
>> only returning 6 documents. Is this actually an inappropriate use of
>> MapReduce? (And are there appropriate uses of MapReduce in NoSQL databases
>> besides stress-testing them?)
>>
>> Is it better to make a separate request for each ID, to use MapReduce, or
>> to use some other method I haven't thought of?
>> -- Rob
>>
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Large map reduce query

2013-03-25 Thread Matt Black
Hi list,

I have a non-trivial map reduce query which traverses links between objects
across several map phases, constructing a composite object en route. One of
our links is one-to-many, so I use a reduce phase to flatten my set of
objects at the end.

Would a query of this type be better done as a smaller M/R query to get the
base set of objects, and then multiple concurrent gets for the related
objects? Is it sensible to do map reduce jobs with many map and reduce
phases?

I appreciate this question is somewhat open ended, but any input would be
welcome.
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Large map reduce query

2013-03-25 Thread Matt Black
Hi John, thanks for the response.

In my case this is a nightly process which will be farming older data out
of Riak for analysis elsewhere. I can certainly run a few tests with the
M/R as it is, and with a more simple version which does sequential gets.
Which metrics should I be interested in when measuring this do you think?

Thanks
Matt


On 26 March 2013 15:35, John Caprice  wrote:

> Matt,
>
> (I'll finish this response this time!)
>
> How often is this MapReduce query being run?  Is the execution of this
> MapReduce query done in a controlled manner (for instance, not initiated by
> users of your application)?
>
> The actual use case of MapReduce queries are important in determining the
> sensibility of using MapReduce.  A MapReduce query that has the potential
> to be executed at an untested frequency can cause unintended load on the
> cluster.  If the frequency of the MapReduce query is known, it can be
> tested under production load to determine the overall affect it has on the
> cluster.
>
> You can also test this against your second suggestion, combining multiple
> MapReduce queries with successive / concurrent GETs to determine which
> method is more efficient in your situation.
>
> Thanks,
>
> John Caprice
>
>
> On Mon, Mar 25, 2013 at 9:28 PM, John Caprice  wrote:
>
>> Matt,
>>
>>
>> How often is this MapReduce query being run?  Is the execution of this
>> MapReduce query done in a controlled manner (for instance, not initiated by
>> users of your application)?
>>
>> The actual use case of MapReduce queries are important in determining
>>
>> Thanks,
>>
>> John Caprice
>>
>>
>> On Mon, Mar 25, 2013 at 9:17 PM, Matt Black wrote:
>>
>>> Hi list,
>>>
>>> I have a non-trivial map reduce query which traverses links between
>>> objects across several map phases, constructing a composite object en route.
>>> One of our links is one-to-many, so I use a reduce phase to flatten my set
>>> of objects at the end.
>>>
>>> Would a query of this type be better done as a smaller M/R query to get
>>> the base set of objects, and then multiple concurrent gets for the related
>>> objects? Is it sensible to do map reduce jobs with many map and reduce
>>> phases?
>>>
>>> I appreciate this question is somewhat open ended, but any input would
>>> be welcome.
>>>
>>> ___
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Cluster Leave Behaviour Appears Dependent On Node Leaving?

2013-03-27 Thread Matt Black
I believe this is to do with the distribution of data across your cluster.

Your data will spread evenly across 2^n nodes (4, 8, 16) - so with six
nodes your data is spread unevenly and you will get different plans
depending on which node is asked to leave.

This command will show you the percentage data spread across the cluster:

riak-admin member_status


On 27 March 2013 22:25, Ralph Williams  wrote:

>  We have a six node riak cluster and have been testing the situation
> where one node is removed using the "riak-admin cluster leave" command.
>
>
>
> Depending on which server is requested to leave, a different plan can
> result. The difference lies in the amount of data handoffs occurring
> between the nodes which remain in the cluster.
>
>
>
> Is there a simple explanation for this?
>
>
>
> I have already read http://lists.basho.com/pipermail/riak-
> users_lists.basho.com/2012-April/008195.html
>
>
>  The information included in this email and any files transmitted with it
> may contain information that is confidential and it must not be used by, or
> its contents or attachments copied or disclosed, to persons other than the
> intended addressee. If you have received this email in error, please notify
> BJSS. In the absence of written agreement to the contrary BJSS' relevant
> standard terms of contract for any work to be undertaken will apply. Please
> carry out virus or such other checks as you consider appropriate in respect
> of this email. BJSS do not accept responsibility for any adverse effect
> upon your system or data in relation to this email or any files transmitted
> with it. BJSS Limited, a company registered in England and Wales (Company
> Number 2777575), VAT Registration Number 613295452, Registered Office
> Address, First Floor, Coronet House, Queen Street, Leeds, LS1 2TW
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Map phase timeout

2013-04-07 Thread Matt Black
Dear list,

I'm currently getting a timeout during a single phase of a multi-phase map
reduce query. Is there anything I can do to assist this in running?

Its purpose is to back up and remove objects from Riak, so it will run
periodically during quiet times, moving old data out of Riak into file
storage.

Traceback (most recent call last):
  File "./tools/rolling_backup.py", line 185, in 
main()
  File "./tools/rolling_backup.py", line 181, in main
args.func(**kwargs)
  File "/srv/backup/tools/mapreduce.py", line 295, in do_map_reduce
raise e
Exception:
{"phase":2,"error":"timeout","input":"[<<\"cart-products\">>,<<\"cd67d7f6e2688bc2089e6fa79506ac05-2\">>,{struct,[{<<\"uid\">>,<<\"cd67d7f6e2688bc2089e6fa79506ac05\">>},{<<\"cart\">>,{struct,[{<<\"expired_ts\">>,<<\"2013-03-05T19:12:23.906228\">>},{<<\"last_updated\">>,<<\"2013-03-05T19:12:23.906242\">>},{<<\"tags\">>,{struct,[{<<\"type\">>,<<\"AB\">>}]}},{<<\"completed\">>,false},{<<\"created\">>,<<\"2013-03-04T02:10:18.638413\">>},{<<\"products\">>,[{struct,[{<<\"cost\">>,0},{<<\"bundleName\">>,<<\"Product\">>},...]},...]},...]}},...]}]","type":"exit","stack":"[{riak_kv_w_reduce,'-js_runner/1-fun-0-',3,[{file,\"src/riak_kv_w_reduce.erl\"},{line,283}]},{riak_kv_w_reduce,reduce,3,[{file,\"src/riak_kv_w_reduce.erl\"},{line,206}]},{riak_kv_w_reduce,maybe_reduce,2,[{file,\"src/riak_kv_w_reduce.erl\"},{line,157}]},{riak_pipe_vnode_worker,process_input,3,[{file,\"src/riak_pipe_vnode_worker.erl\"},{line,444}]},{riak_pipe_vnode_worker,wait_for_input,2,[{file,\"src/riak_pipe_vnode_worker.erl\"},{line,376}]},{gen_fsm,handle_msg,7,[{file,\"gen_fsm.erl\"},{line,494}]},{proc_lib,...}]"}
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Map phase timeout

2013-04-08 Thread Matt Black
Thanks for the reply, Christian.

I didn't explain well enough in my first post - the map reduce operation
merely loads a bunch of objects, and a Python script which makes the
connection to Riak then writes these objects to disk. (It's probably
obvious, but I'm using JavaScript and the Riak Python client.)

The query itself has many map phases where a composite object is built up
from related objects spread across many buckets.

I was hoping there may be some kind of timeout I could adjust on a per-map
phase basis - clutching at straws really.

Cheers
Matt


On 8 April 2013 17:14, Christian Dahlqvist  wrote:

> Hi,
>
> Without having access to the mapreduce functions you are running, I would
> assume that a mapreduce job both writing data to disk as well as deleting
> the written record from Riak might be quite slow. This is not really a use
> case mapreduce was designed for, and when a mapreduce job crashes or times
> out it is difficult to know how far along the processing of different
> records it got.
>
> I would therefore recommend considering running this type of archiving and
> delete job as an external batch process instead as it will give you better
> control over the execution and avoid timeout problems.
>
> Best regards,
>
> Christian
>
>
>
> On 8 Apr 2013, at 00:49, Matt Black  wrote:
>
> > Dear list,
> >
> > I'm currently getting a timeout during a single phase of a multi-phase
> map reduce query. Is there anything I can do to assist this in running?
> >
> > Its purpose is to back up and remove objects from Riak, so it will run
> periodically during quiet times, moving old data out of Riak into file
> storage.
> >
> > Traceback (most recent call last):
> >   File "./tools/rolling_backup.py", line 185, in 
> > main()
> >   File "./tools/rolling_backup.py", line 181, in main
> > args.func(**kwargs)
> >   File "/srv/backup/tools/mapreduce.py", line 295, in do_map_reduce
> > raise e
> > Exception:
> {"phase":2,"error":"timeout","input":"[<<\"cart-products\">>,<<\"cd67d7f6e2688bc2089e6fa79506ac05-2\">>,{struct,[{<<\"uid\">>,<<\"cd67d7f6e2688bc2089e6fa79506ac05\">>},{<<\"cart\">>,{struct,[{<<\"expired_ts\">>,<<\"2013-03-05T19:12:23.906228\">>},{<<\"last_updated\">>,<<\"2013-03-05T19:12:23.906242\">>},{<<\"tags\">>,{struct,[{<<\"type\">>,<<\"AB\">>}]}},{<<\"completed\">>,false},{<<\"created\">>,<<\"2013-03-04T02:10:18.638413\">>},{<<\"products\">>,[{struct,[{<<\"cost\">>,0},{<<\"bundleName\">>,<<\"Product\">>},...]},...]},...]}},...]}]","type":"exit","stack":"[{riak_kv_w_reduce,'-js_runner/1-fun-0-',3,[{file,\"src/riak_kv_w_reduce.erl\"},{line,283}]},{riak_kv_w_reduce,reduce,3,[{file,\"src/riak_kv_w_reduce.erl\"},{line,206}]},{riak_kv_w_reduce,maybe_reduce,2,[{file,\"src/riak_kv_w_reduce.erl\"},{line,157}]},{riak_pipe_vnode_worker,process_input,3,[{file,\"src/riak_pipe_vnode_worker.erl\"},{line,444}]},{riak_pipe_vnode_worker,wait_for_input,2,[{file,\"src/riak_pipe_vnode_worker.erl\"},{line,376}]},{gen_fsm,handle_msg,7,[{file,\"gen_fsm.erl\"},{line,494}]},{proc_lib,...}]"}
> >
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Map phase timeout

2013-04-08 Thread Matt Black
All,

Huge thanks for your replies. It seems to me that our approach with
MapReduce queries has been fundamentally wrong, and that I should rewrite
my backup script to use sequential GETs. Currently we're on the bitcask
backend, and on our roadmap is a move over to eleveldb and the application
of appropriate 2i across the whole dataset. Looks like that will be the
next step - before doing any backup of old data.

Matt


On 9 April 2013 01:01, Dmitri Zagidulin  wrote:

> Matt,
>
> My recommendation to you is - don't use MapReduce for this use case. Fetch
> the objects via regular Riak GETs (using connection pooling and
> multithreading, preferably).
>
> I'm assuming that you have a list of keys (either by keeping track of them
> externally to Riak, or via a Secondary Index query or a Search query), and
> you want to back up those objects.
>
> The natural inclination, once you know the keys, is to want to fetch all
> of those objects via a single query, and MapReduce immediately comes to
> mind. (And to most developers, writing the MR function in Javascript is
> easier and more familiar than in Erlang). Unfortunately, as Christian
> mentioned, it's very easy for the JS VMs to run out of resources and crash
> or time out. In addition, I've found that rewriting the MapReduce in Erlang
> affords only a bit more resources -- once you hit a certain number of keys
> that you want to fetch, or a certain object size threshold, even Erlang MR
> jobs can time out (keep in mind, while the Map phase can happen in parallel
> on all of the nodes in a cluster, all the object values have to be
> serialized on the single coordinating node, which becomes the bottleneck).
>
> The workaround for this, even though it might seem counter-intuitive, is
> -- if you know the list of keys, fetch them using GETs. Even a naive
> single-threaded "while loop" way of fetching the objects can often be
> faster than a MapReduce job (for this use case), and it doesn't time out.
> Add to that connection-pooling and multiple worker threads, and this method
> is invariably faster.
>
> Dmitri
>
>
> On Mon, Apr 8, 2013 at 4:27 AM, Christian Dahlqvist 
> wrote:
>
>> Hi Matt,
>>
>> If you have a complicated mapreduce job containing multiple phases
>> implemented in JavaScript, you will most likely see a lot of contention for
>> the JavaScript VMs which will cause problems. While you can tune the
>> configuration [1], you may find that you will need a very large pool size
>> in order to properly support your job, especially for map phases as these
>> run in parallel.
>>
>> The best way to speed up the mapreduce job and get around the VM pool
>> contention is to implement the mapreduce functions in Erlang.
>>
>> Best regards,
>>
>> Christian
>>
>> [1]
>> http://docs.basho.com/riak/1.2.0/references/appendices/MapReduce-Implementation/#Configuration-Tuning-for-Javascript
>>
>>
>>
>> 
>> Christian Dahlqvist
>> Client Services Engineer
>> Basho Technologies
>> EMEA Office
>> E-mail: christ...@basho.com
>> Skype: c.dahlqvist
>> Mobile: +44 7890 590 910
>>
>> On 8 Apr 2013, at 08:20, Matt Black  wrote:
>>
>> Thanks for the reply, Christian.
>>
>> I didn't explain well enough in my first post - the map reduce operation
>> merely loads a bunch of objects, and a Python script which makes the
>> connection to Riak then writes these objects to disk. (It's probably
>> obvious, but I'm using JavaScript and the Riak Python client.)
>>
>> The query itself has many map phases where a composite object is built up
>> from related objects spread across many buckets.
>>
>> I was hoping there may be some kind of timeout I could adjust on a
>> per-map phase basis - clutching at straws really.
>>
>> Cheers
>> Matt
>>
>>
>> On 8 April 2013 17:14, Christian Dahlqvist  wrote:
>>
>>> Hi,
>>>
>>> Without having access to the mapreduce functions you are running, I
>>> would assume that a mapreduce job both writing data to disk as well as
>>> deleting the written record from Riak might be quite slow. This is not
>>> really a use case mapreduce was designed for, and when a mapreduce job
>>> crashes or times out it is difficult to know how far along the processing
>>> of different records it got.
>>>
>>> I would therefore recommend considering running this type of archiving
>>> and delete job as an external batch process instead as it will give you
>>> better control over the execution and avoid timeout problems.

Re: The suitability of MapReduce

2013-04-08 Thread Matt Black
I think a short and explicit discussion of using sequential GETs would be
good to add to the docs in [1]. It'll be helpful to put the alternate
option in the reader's head so they can evaluate as they're going through
the article.

Cheers
Matt


On 9 April 2013 02:02, Jeremiah Peschka  wrote:

> I want to follow up on the recent "Map phase timeout" thread [2]. In part
> out of curiosity and in part as a documentation clean up... Should the
> documentation at [1] be changed? Specifically, the docs say MR should be
> used:
>
>- *When you know the set of objects you want to MapReduce over (the
>bucket-key pairs) *(emphasis added)
>- When you want to return actual objects or pieces of the object – not
>just the keys, as do Search & Secondary Indexes
>- When you need utmost flexibility in querying your data. MapReduce
>gives you full access to your object and lets you pick it apart any way you
>want.
>
> It seems to me that a lot of discussions around MR in Riak come down to
> "You're close but this isn't the best use case of MapReduce in Riak." Would
> it be better, for the purposes of a general discussion, to say that
> MapReduce is the appropriate paradigm when you want to:
>
>- manipulate a large amount of data inside the Riak cluster in bulk -
>e.g. read all of my sales orders and where the version is 1, perform the
>changes necessary to update the order format to version 2.
>- burn a lot of I/O and make your admin sad
>- move data from one bucket to another
>- re-write an entire bucket so all data is indexed for 2i, search, etc
>- Anything where the query can be resumed with no knowledge of state
>at the time the last run of the query failed.
>
> Are there other use cases when MR is the better approach?
>
> [1]:
> http://docs.basho.com/riak/latest/tutorials/querying/MapReduce/#When-to-Use-MapReduce
> [2]:
> http://riak.markmail.org/search/?q=#query:+page:1+mid:4o27v64qf55ejzwc+state:results
>
>  ---
> Jeremiah Peschka - Founder, Brent Ozar Unlimited
> MCITP: SQL Server 2008, MVP
> Cloudera Certified Developer for Apache Hadoop
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Map phase timeout

2013-04-09 Thread Matt Black
Hi Dmitri,

Thanks for your clarification. I was pretty sure that was how it would work
- and so I had planned a different way of migrating to a new backend. I
intended to introduce new nodes which have the eleveldb backend configured,
and presumed that Riak would move data into this backend as the node joined
the cluster. Then I would migrate out the bitcask nodes one-by-one.

Would this approach work? Or will I need to look at a migration tool?

Matt


On 10 April 2013 00:06, Dmitri Zagidulin  wrote:

> Matt,
>
> Just for clarity - you mention that you plan to move the backend to
> LevelDB before backing up old data.
> I just want to caution and say - if you switch the config setting from
> Bitcask to LevelDB and restart the cluster, Riak does not automatically
> migrate the data for you, to the new back end.
>
> Meaning, if you just switch to LevelDB (without backing up data), you'll
> have an empty cluster running on leveldb, and you'd have no way to access
> the old data in Bitcask. Backing up and restoring data is helpful precisely
> in the areas of migrating to a different back end (or to a different ring
> size).
>
> (You probably knew this, and have a migration plan in mind already, but I
> just wanted to make sure).
>
> If you need a good "logical backup" tool, take a look at
> https://github.com/dankerrigan/riak-data-migrator (it's java-based, but
> is pretty good at backing up the contents of one or more buckets to disk,
> and then restoring afterwards). (As opposed to "file based backup" as
> described in http://docs.basho.com/riak/latest/cookbooks/Backups/ , which
> is the recommended approach for backups for a production cluster, but won't
> help you in migrating to a different backend).
>
> Dmitri
>
>
> On Mon, Apr 8, 2013 at 7:20 PM, Matt Black wrote:
>
>> All,
>>
>> Huge thanks for your replies. It seems to me that our approach with
>> MapReduce queries has been fundamentally wrong, and that I should rewrite
>> my backup script to use sequential GETs. Currently we're on the bitcask
>> backend, and on our roadmap is a move over to eleveldb and the application
>> of appropriate 2i across the whole dataset. Looks like that will be the
>> next step - before doing any backup of old data.
>>
>> Matt
>>
>>
>>
>> On 9 April 2013 01:01, Dmitri Zagidulin  wrote:
>>
>>> Matt,
>>>
>>> My recommendation to you is - don't use MapReduce for this use case.
>>> Fetch the objects via regular Riak GETs (using connection pooling and
>>> multithreading, preferably).
>>>
>>> I'm assuming that you have a list of keys (either by keeping track of
>>> them externally to Riak, or via a Secondary Index query or a Search query),
>>> and you want to back up those objects.
>>>
>>> The natural inclination, once you know the keys, is to want to fetch all
>>> of those objects via a single query, and MapReduce immediately comes to
>>> mind. (And to most developers, writing the MR function in Javascript is
>>> easier and more familiar than in Erlang). Unfortunately, as Christian
>>> mentioned, it's very easy for the JS VMs to run out of resources and crash
>>> or time out. In addition, I've found that rewriting the MapReduce in Erlang
>>> affords only a bit more resources -- once you hit a certain number of keys
>>> that you want to fetch, or a certain object size threshold, even Erlang MR
>>> jobs can time out (keep in mind, while the Map phase can happen in parallel
>>> on all of the nodes in a cluster, all the object values have to be
>>> serialized on the single coordinating node, which becomes the bottleneck).
>>>
>>> The workaround for this, even though it might seem counter-intuitive, is
>>> -- if you know the list of keys, fetch them using GETs. Even a naive
>>> single-threaded "while loop" way of fetching the objects can often be
>>> faster than a MapReduce job (for this use case), and it doesn't time out.
>>> Add to that connection-pooling and multiple worker threads, and this method
>>> is invariably faster.
>>>
>>> Dmitri
>>>
>>>
>>> On Mon, Apr 8, 2013 at 4:27 AM, Christian Dahlqvist >> > wrote:
>>>
>>>> Hi Matt,
>>>>
>>>> If you have a complicated mapreduce job containing multiple phases
>>>> implemented in JavaScript, you will most likely see a lot of contention for
>>>> the JavaScript VMs which will cause problems. While you can tune the
>>>> configuration [1], you m

Re: Production server specs for Riak node

2013-04-22 Thread Matt Black
There's a lot of information in here:

http://docs.basho.com/riak/latest/cookbooks/Linux-Performance-Tuning/

And the sister article, specific to AWS:

http://docs.basho.com/riak/latest/cookbooks/Performance-Tuning-AWS/

As ever with Riak, the answer to your question depends on what you're doing
:)


On 23 April 2013 02:07, Tom Zeng  wrote:

> Hi,
>
> I am wondering if there are any docs/recommendations on server hardware
> specs for production Riak instances - cores, clock speed, network
> interfaces/bandwidth, hard drive raid vs ssd, and etc.
>
> Tom
> Thanks,
> --
> Tom Zeng
> Director of Engineering
> Intridea, Inc. | www.intridea.com
> t...@intridea.com
> (o) 888.968.4332 x519
> (c) 240-643-8728
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Call me maybe blog post

2013-05-20 Thread Matt Black
Dear list,

What is your take on the conclusions drawn in the following blog post?

In our setup, we occasionally lose data through simultaneous writes to the
same key - which I mitigate to some extent with locks in the application
layer. In fact, the resolution in the application is essentially the "merge"
technique he describes with the CRDTs solution at the end.

http://aphyr.com/posts/285-call-me-maybe-riak

Would the CRDT approach he talks about replace the locks in my application
layer (where last-write-wins on simultaneous writes silently overwrites a
single object)?
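
To make that concrete, here is roughly what such a merge could look like,
sketched with the Python client and allow_mult enabled - treating the value
as a grow-only set, so deletes would need tombstone entries (attribute names
are from the 2.0 client):

import riak

client = riak.RiakClient()
bucket = client.bucket("carts")
bucket.allow_mult = True  # keep siblings rather than last-write-wins

def merge(obj):
    # Union all sibling values, then collapse to a single content.
    if len(obj.siblings) > 1:
        merged = set()
        for sibling in obj.siblings:
            merged.update(sibling.data or [])
        obj.siblings = [obj.siblings[0]]
        obj.data = sorted(merged)
        obj.store()
    return obj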

I suppose I'm left with the feeling that I don't understand enough of the
workings of Riak and the implications of CAP to really understand this post
on my own.

Anything anyone can offer on the subject would be most appreciated!

Cheers
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Installing protobuf 2.5.0 for Riak Python 2.0

2013-07-30 Thread Matt Black
Hello list,

I've been eagerly awaiting the latest update to the python bindings, so it
was with great enthusiasm that I started on it this morning!

However, I'm unable to install the latest v2.5 of protobuf. Has anyone else
had problems? Presumably it works for others on different setups. (Sean?)

(test)vagrant@boomerang:/tmp/protobuf-2.5.0/python > python setup.py build
running build
running build_py
Generating google/protobuf/unittest_pb2.py...
google/protobuf/unittest_import.proto:53:8: Expected a string naming the
file to import.
google/protobuf/unittest.proto: Import
"google/protobuf/unittest_import.proto" was not found or had errors.
google/protobuf/unittest.proto:97:12:
"protobuf_unittest_import.ImportMessage" is not defined.
google/protobuf/unittest.proto:101:12:
"protobuf_unittest_import.ImportEnum" is not defined.
google/protobuf/unittest.proto:107:12:
"protobuf_unittest_import.PublicImportMessage" is not defined.
google/protobuf/unittest.proto:135:12:
"protobuf_unittest_import.ImportMessage" is not defined.
google/protobuf/unittest.proto:139:12:
"protobuf_unittest_import.ImportEnum" is not defined.
google/protobuf/unittest.proto:165:12:
"protobuf_unittest_import.ImportEnum" is not defined.
google/protobuf/unittest.proto:216:12:
"protobuf_unittest_import.ImportMessage" is not defined.
google/protobuf/unittest.proto:221:12:
"protobuf_unittest_import.ImportEnum" is not defined.
google/protobuf/unittest.proto:227:12:
"protobuf_unittest_import.PublicImportMessage" is not defined.
google/protobuf/unittest.proto:256:12:
"protobuf_unittest_import.ImportMessage" is not defined.
google/protobuf/unittest.proto:261:12:
"protobuf_unittest_import.ImportEnum" is not defined.
google/protobuf/unittest.proto:291:12:
"protobuf_unittest_import.ImportEnum" is not defined.

vagrant@boomerang:~ > uname -a
Linux boomerang 3.2.0-29-virtual #46-Ubuntu SMP Fri Jul 27 17:23:50 UTC
2012 x86_64 x86_64 x86_64 GNU/Linux

vagrant@boomerang:~ > lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:Ubuntu 12.04.2 LTS
Release:12.04
Codename:   precise
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Installing protobuf 2.5.0 for Riak Python 2.0

2013-08-05 Thread Matt Black
Ah okay, thanks Sean.

The reason I went down that road is that the docs say you need v2.5:

http://docs.basho.com/riak/latest/dev/taste-of-riak/python/


On 31 July 2013 23:05, Sean Cribbs  wrote:

> Matt,
>
> For compatibility reasons, we use 2.4.1, which is pinned in the
> requirements of the riak_pb package. We intend to move to 2.5 for later
> releases.
>
>
> On Tue, Jul 30, 2013 at 11:02 PM, Matt Black wrote:
>
>> Hello list,
>>
>> I've been eagerly awaiting the latest update to the python bindings, so
>> it was with great enthusiasm that I started on it this morning!
>>
>> However, I'm unable to install the latest v2.5 of protobuf. Has anyone
>> else had problems? Presumably it works for others on different setups.
>> (Sean?)
>>
>> (test)vagrant@boomerang:/tmp/protobuf-2.5.0/python > python setup.py
>> build
>> running build
>> running build_py
>> Generating google/protobuf/unittest_pb2.py...
>> google/protobuf/unittest_import.proto:53:8: Expected a string naming the
>> file to import.
>> google/protobuf/unittest.proto: Import
>> "google/protobuf/unittest_import.proto" was not found or had errors.
>> google/protobuf/unittest.proto:97:12:
>> "protobuf_unittest_import.ImportMessage" is not defined.
>> google/protobuf/unittest.proto:101:12:
>> "protobuf_unittest_import.ImportEnum" is not defined.
>> google/protobuf/unittest.proto:107:12:
>> "protobuf_unittest_import.PublicImportMessage" is not defined.
>> google/protobuf/unittest.proto:135:12:
>> "protobuf_unittest_import.ImportMessage" is not defined.
>> google/protobuf/unittest.proto:139:12:
>> "protobuf_unittest_import.ImportEnum" is not defined.
>> google/protobuf/unittest.proto:165:12:
>> "protobuf_unittest_import.ImportEnum" is not defined.
>> google/protobuf/unittest.proto:216:12:
>> "protobuf_unittest_import.ImportMessage" is not defined.
>> google/protobuf/unittest.proto:221:12:
>> "protobuf_unittest_import.ImportEnum" is not defined.
>> google/protobuf/unittest.proto:227:12:
>> "protobuf_unittest_import.PublicImportMessage" is not defined.
>> google/protobuf/unittest.proto:256:12:
>> "protobuf_unittest_import.ImportMessage" is not defined.
>> google/protobuf/unittest.proto:261:12:
>> "protobuf_unittest_import.ImportEnum" is not defined.
>> google/protobuf/unittest.proto:291:12:
>> "protobuf_unittest_import.ImportEnum" is not defined.
>>
>> vagrant@boomerang:~ > uname -a
>> Linux boomerang 3.2.0-29-virtual #46-Ubuntu SMP Fri Jul 27 17:23:50 UTC
>> 2012 x86_64 x86_64 x86_64 GNU/Linux
>>
>> vagrant@boomerang:~ > lsb_release -a
>> No LSB modules are available.
>> Distributor ID: Ubuntu
>> Description:Ubuntu 12.04.2 LTS
>> Release:12.04
>> Codename:   precise
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>
>
> --
> Sean Cribbs 
> Software Engineer
> Basho Technologies, Inc.
> http://basho.com/
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


RiakNode class in Python client 2.0

2013-08-05 Thread Matt Black
I've been looking through the changes to the RiakClient class - and have a
question about the nodes list and RiakNode/Decaying class.

It seems that the Decaying class is used internally to track error_rates
(request failure_rate?) of nodes in the available pool.

In the application I maintain, we're accessing Riak in many concurrent
celery workers - so my assumption would be that this internal error_rate
tracking wouldn't do much good. Have I got this right? Is the client's
internal "nodes" list only really useful in a long running application?
(ie, when I'm not instantiating RiakClient fresh for each query).

We currently have a "pool" of Riak nodes in some configuration, from which
a random IP is chosen for each request.

Thanks
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


GPG error in Apt

2013-08-06 Thread Matt Black
Hey Basho peeps,

Looks like you might have signed the latest Riak release with new
certificate (or something) - Apt is reporting that the key from
http://apt.basho.com/gpg/basho.apt.key is incorrect this morning.

> apt-get update
W: GPG error: http://apt.basho.com precise Release: The following
signatures were invalid: BADSIG F933E597DDF2E833 Basho Technologies (Debian
/ Ubuntu signing key) 
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: RiakNode class in Python client 2.0

2013-08-06 Thread Matt Black
Hi Sean,

I would indeed like to take advantage of the pooling features of the new
client. Sharing objects across Celery workers isn't something I'd ever
really looked at before - but it seems the only real way to share across
workers (ie, processes) is to use memcache or similar. Which makes sense.

Cheers
Matt
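
For anyone following along, the module-level client ends up as small as
this (a sketch - module name and node addresses made up; each forked worker
process gets its own copy of the pool):

# riak_conn.py
import riak

NODES = [
    {"host": "10.0.1.1", "pb_port": 8087},
    {"host": "10.0.1.2", "pb_port": 8087},
    {"host": "10.0.1.3", "pb_port": 8087},
]

# Constructed once at import; tasks do `from riak_conn import client`
# and reuse its connections instead of building a client per query.
client = riak.RiakClient(protocol="pbc", nodes=NODES)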


On 6 August 2013 23:04, Sean Cribbs  wrote:

> Hi Matt,
>
> You are correct about the Decaying class, it is a wrapper for an
> exponentially-decaying error rate. The idea is, you configure your client
> to connect to multiple Riak nodes, and if one goes down or is restarted,
> the possibility of its selection for new connections can be automatically
> reduced by detecting network errors. Of course, this is moot when you are
> connecting to the cluster via a load-balancer; at that point the
> error-tracking logic can't help you.
>
> If you are creating and throwing away RiakClient objects frequently,
> neither the error-tracking nor the connection pool will help you much.
> However, I urge you to consider keeping a single client around, in some
> globally accessible place (module constant? config object?). Your
> concurrent workers will get to take advantage of existing connections, and
> your socket/file-handle usage will be less.
>
>
> On Tue, Aug 6, 2013 at 1:07 AM, Matt Black wrote:
>
>> I've been looking through the changes to the RiakClient class - and have
>> a question about the nodes list and RiakNode/Decaying class.
>>
>> It seems that the Decaying class is used internally to track error_rates
>> (request failure_rate?) of nodes in the available pool.
>>
>> In the application I maintain, we're accessing Riak in many concurrent
>> celery workers - so my assumption would be that this internal error_rate
>> tracking wouldn't do much good. Have I got this right? Is the client's
>> internal "nodes" list only really useful in a long running application?
>> (ie, when I'm not instantiating RiakClient fresh for each query).
>>
>> We currently have a "pool" of Riak nodes in some configuration, from
>> which a random IP is chosen for each request.
>>
>> Thanks
>>
>>
>>
>>
>> ___
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>
>
> --
> Sean Cribbs 
> Software Engineer
> Basho Technologies, Inc.
> http://basho.com/
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: GPG error in Apt

2013-08-07 Thread Matt Black
Confirmed as working from my end.


On 8 August 2013 05:58, Hector Castro  wrote:

> Hey Matt,
>
> That issue should be resolved now. Please ping us if you hit any further
> issues.
>
> --
> Hector
>
>
> On Tue, Aug 6, 2013 at 7:42 PM, Matt Black 
> wrote:
> > Hey Basho peeps,
> >
> > Looks like you might have signed the latest Riak release with new
> > certificate (or something) - Apt is reporting that the key from
> > http://apt.basho.com/gpg/basho.apt.key is incorrect this morning.
> >
> >> apt-get update
> > W: GPG error: http://apt.basho.com precise Release: The following
> signatures
> > were invalid: BADSIG F933E597DDF2E833 Basho Technologies (Debian / Ubuntu
> > signing key) 
> >
> >
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Multi-backend, eLevelDB, & Too many open files

2013-09-08 Thread Matt Black
Hey list,

We're migrating from our currently operational cluster to a new one in a
different EC2 region. As part of this migration, we'd like to move from
Bitcask to eLevelDB - mostly for the benefits provided by secondary indexes.

We also use Multi-backend configuration, in order to split our various
client's data into entirely separate spaces on disk (this is useful for
legal reasons). I made the simple change from "riak_kv_bitcask_backend" to
"riak_kv_eleveldb_backend" in our config, and did some calculations for
"max_open_files", and then started a new node.

It fails with the following error in short order:

> cat error.log
2013-09-09 03:49:51.391 [error] <0.715.0>@riak_kv_vnode:init:375 Failed to
start riak_kv_multi_backend Reason: [{riak_kv_eleveldb_backend,{db_open,"IO
error:
/mnt/riak/eleveldb/client1/365375409332725729550921208179070754913983135744/15.dbtmp:
Too many open files"}}]
2013-09-09 03:49:51.392 [error] <0.940.0>@riak_kv_vnode:init:375 Failed to
start riak_kv_multi_backend Reason: [{riak_kv_eleveldb_backend,{db_open,"IO
error:
/mnt/riak/eleveldb/client2/570899077082383952423314387779798054553098649600/CURRENT:
Too many open files"}},{riak_kv_eleveldb_backend,{db_open,"IO error:
/mnt/riak/eleveldb/client3/570899077082383952423314387779798054553098649600/CURRENT:
Too many open files"}},{riak_kv_eleveldb_backend,{db_open,"IO error:
/mnt/riak/eleveldb/client4/570899077082383952423314387779798054553098649600/CURRENT:
Too many open files"}},{riak_kv_eleveldb_backend,{db_open,"IO error:
/mnt/riak/eleveldb/client5/570899077082383952423314387779798054553098649600/CURRENT:
Too many open files"}},{riak_kv_eleveldb_backend,{db_open,"IO error:
/mnt/riak/eleveldb/client6/570899077082383952423314387779798054553098649600/CURRENT:
Too many open files"}}]

And some relevant app.config:

{multi_backend_default, <<"default">>},
{multi_backend, [
    %% Default fallback: this is unused
    {<<"default">>, riak_kv_eleveldb_backend, [
        {data_root, "/mnt/riak/eleveldb/default"}
    ]},
    {<<"eleveldb_client1">>, riak_kv_eleveldb_backend, [
        {data_root, "/mnt/riak/eleveldb/client1"}
    ]},
    {<<"eleveldb_client2">>, riak_kv_eleveldb_backend, [
        {data_root, "/mnt/riak/eleveldb/client2"}
    ]},
    {<<"eleveldb_client3">>, riak_kv_eleveldb_backend, [
        {data_root, "/mnt/riak/eleveldb/client3"}
    ]},
    {<<"eleveldb_client4">>, riak_kv_eleveldb_backend, [
        {data_root, "/mnt/riak/eleveldb/client4"}
    ]},
    {<<"eleveldb_client5">>, riak_kv_eleveldb_backend, [
        {data_root, "/mnt/riak/eleveldb/client5"}
    ]},
    {<<"eleveldb_client6">>, riak_kv_eleveldb_backend, [
        {data_root, "/mnt/riak/eleveldb/client6"}
    ]}
]},

.. snip ..

%% Default cache size of 8MB
{cache_size, 8388608},
%% Maximum number of files open at once per partition
{max_open_files, 50}

I've set the "riak soft/hard nofile 65536" in /etc/security/limits.conf, so
presumably this "Too many open files" error is referring to the
"max_open_files" option as part of eLevelDB config. The RAM in each machine
is 3.5GB free, so I calculated on a 5 node cluster this 50 max_open_files
limit.

- Is there something about Multi-backend I haven't taken into account?
- Do I need a larger max_open_files? (And thus more RAM? :)
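
For reference, the rough arithmetic - on the assumption that each vnode
opens one leveldb instance per configured backend, and that max_open_files
applies to each instance - looks like this:

ring_size = 64        # assumed default ring_creation_size
nodes = 5
backends = 7          # default + client1..client6
max_open_files = 50

vnodes_per_node = ring_size // nodes + 1   # ~13 partitions per node
handles = vnodes_per_node * backends * max_open_files
print(handles)        # ~4550 -- fine under a 65536 ulimit, fatal
                      # under the stock 1024 default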

Cheers!
Matt
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Multi-backend, eLevelDB, & Too many open files

2013-09-09 Thread Matt Black
Hi Luke,

Thanks for the response.

I played around with the various config options - and in the end I had to
include the /etc/defaults/riak file, as well as the
/etc/security/limits.conf config in order to get Riak to stay up. That must
be relatively new - our existing v1.4.0 cluster is operating fine without
it.

Cheers
Matt


On 10 September 2013 01:32, Luke Bakken  wrote:

> Hi Matt -
>
> Can you attach to a node in your cluster with "riak attach" and run
> this erlang snippet (before the node errors out)?
>
> os:cmd("ulimit -n").
>
> This will show the actual value for ulimit for that running riak node.
> I suspect that this value hasn't been updated for the running process,
> even though you've modified limits.conf.
>
> Depending on your OS there may be more configuration necessary for the
> open files limit to "stick":
> http://docs.basho.com/riak/latest/ops/tuning/open-files-limit/#Linux
>
> On Sun, Sep 8, 2013 at 9:15 PM, Matt Black 
> wrote:
> > Hey list,
> >
> > We're migrating from our currently operational cluster to a new one in a
> > different EC2 region. As part of this migration, we'd like to move from
> > Bitcask to eLevelDB - mostly for the benefits provided by secondary
> indexes.
> >
> > We also use Multi-backend configuration, in order to split our various
> > client's data into entirely separate spaces on disk (this is useful for
> > legal reasons). I made the simple change from "riak_kv_bitcask_backend"
> to
> > "riak_kv_eleveldb_backend" in our config, and did some calculations for
> > "max_open_files", and then started a new node.
> >
> > It fails with the following error in short order:
> >
> >> cat error.log
> > 2013-09-09 03:49:51.391 [error] <0.715.0>@riak_kv_vnode:init:375 Failed
> to
> > start riak_kv_multi_backend Reason:
> [{riak_kv_eleveldb_backend,{db_open,"IO
> > error:
> >
> /mnt/riak/eleveldb/client1/365375409332725729550921208179070754913983135744/15.dbtmp:
> > Too many open files"}}]
> > 2013-09-09 03:49:51.392 [error] <0.940.0>@riak_kv_vnode:init:375 Failed
> to
> > start riak_kv_multi_backend Reason:
> [{riak_kv_eleveldb_backend,{db_open,"IO
> > error:
> >
> /mnt/riak/eleveldb/client2/570899077082383952423314387779798054553098649600/CURRENT:
> > Too many open files"}},{riak_kv_eleveldb_backend,{db_open,"IO error:
> >
> /mnt/riak/eleveldb/client3/570899077082383952423314387779798054553098649600/CURRENT:
> > Too many open files"}},{riak_kv_eleveldb_backend,{db_open,"IO error:
> >
> /mnt/riak/eleveldb/client4/570899077082383952423314387779798054553098649600/CURRENT:
> > Too many open files"}},{riak_kv_eleveldb_backend,{db_open,"IO error:
> >
> /mnt/riak/eleveldb/client5/570899077082383952423314387779798054553098649600/CURRENT:
> > Too many open files"}},{riak_kv_eleveldb_backend,{db_open,"IO error:
> >
> /mnt/riak/eleveldb/client6/570899077082383952423314387779798054553098649600/CURRENT:
> > Too many open files"}}]
> >
> > And some relevant app.config:
> >
> > {multi_backend_default, <<"default">>},
> > {multi_backend, [
> > %% Default fallback: this is unused
> > {<<"default">>, riak_kv_eleveldb_backend, [
> > {data_root, "/mnt/riak/eleveldb/default"}
> > ]},
> > {<<"eleveldb_client1">>, riak_kv_eleveldb_backend, [
> > {data_root, "/mnt/riak/eleveldb/client1"}
> > ]},
> > {<<"eleveldb_client2">>, riak_kv_eleveldb_backend, [
> > {data_root, "/mnt/riak/eleveldb/client2"}
> > ]},
> > {<<"eleveldb_client3">>, riak_kv_eleveldb_backend, [
> > {data_root, "/mnt/riak/eleveldb/client3"}
> > ]},
> > {<<"eleveldb_client4">>, riak_kv_eleveldb_backend, [
> > {data_root, "/mnt/riak/eleveldb/client4"}
> > ]},
> > {<<"eleveldb_client5">>, riak_kv_eleveldb_backend, [
> > {data_root, "/mnt/riak/eleveldb/client5"}
> > ]},
> > {<<"eleveldb_client6">>, riak_kv_eleveldb_backend, [

Storing dates in Secondary Indexes

2013-09-22 Thread Matt Black
Hey list.

A quick question on best practices really:

   - Should I use a bin index with ISO8601 format?
   - Or should I use a Unix timestamp as an integer index?
   - Is there likely to be a performance difference? (There will certainly
   be a storage size difference.)
   - Has anyone had any specific success with either approach? (See the
   sketch below.)
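
Both options side by side with the Python client (a sketch - bucket and
field names made up; ISO8601 sorts lexicographically in a _bin index, so
range queries work either way):

import calendar
import datetime
import riak

client = riak.RiakClient()
bucket = client.bucket("orders")

now = datetime.datetime.utcnow()
obj = bucket.new("order-1", data={"total": 42})
# Option 1: ISO8601 string in a _bin index.
obj.add_index("created_bin", now.strftime("%Y-%m-%dT%H:%M:%SZ"))
# Option 2: Unix timestamp in an _int index.
obj.add_index("created_int", calendar.timegm(now.timetuple()))
obj.store()

# Range queries look the same in both cases:
keys = bucket.get_index("created_int", 1379808000, 1379894400)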

Thanks
Matt
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Filtering not_found in reduce JS causes SyntaxError

2013-10-20 Thread Matt Black
Hey list,

A script recently introduced to clean up old data by deleting it has caused
one of our old reporting scripts to start failing with “not_found”. I’d
encountered this once before - so I thought the simple introduction of a
reduce phase using Riak.filterNotFound would fix it.

However, now I’m receiving this error - removing the one line addition of
query.reduce("Riak.filterNotFound") gives me my old “not_found” error
straight back.

Exception: 
{"phase":1,"error":"[{<<\"lineno\">>,466},{<<\"message\">>,<<\"SyntaxError:
syntax 
error\">>},{<<\"source\">>,<<\"()\">>}]","input":"{ok,{r_object,<<\"carts\">>,<<\"dd2bcd07fa8019b2d1fc1d4832c41c74\">>,[{r_content,{dict,4,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[[<<\"X-Riak-VTag\">>,52,68,115,107,113,49,105,69,66,109,103,79,106,87,104,75,75,97,53,98,54,65]],[[<<\"index\">>]],[[<<\"X-Riak-Deleted\">>,116,114,117,101]],[[<<\"X-Riak-Last-Modified\">>|{1381,978330,755498}]],[],[]}}},<<>>}],[{<<250,120,75,127,79,209,93,62>>,{6,63516323103}},{<<31,103,165,230,79,209,...>>,...},...],...},...}"}

Any thoughts?

Thanks y'all
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Filtering not_found in reduce JS causes SyntaxError

2013-10-20 Thread Matt Black
BTW, this cluster is running 1.4.0 still. If 1.4.2 would fix this issue I
could update.


On 21 October 2013 10:42, Matt Black  wrote:

> Hey list,
>
> A script recently introduced to cleanup old data by deleting it has caused
> one of our old reporting scripts to start failing with “not_found”. I’d
> encountered this once before - so I thought the simple introduction of a
> reduce phase using Riak.filterNotFound would fix it.
>
> However, now I’m receiving this error - removing the one line addition of
> query.reduce("Riak.filterNotFound") gives me my old “not_found” error
> straight back.
>
> Exception: 
> {"phase":1,"error":"[{<<\"lineno\">>,466},{<<\"message\">>,<<\"SyntaxError: 
> syntax 
> error\">>},{<<\"source\">>,<<\"()\">>}]","input":"{ok,{r_object,<<\"carts\">>,<<\"dd2bcd07fa8019b2d1fc1d4832c41c74\">>,[{r_content,{dict,4,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[[<<\"X-Riak-VTag\">>,52,68,115,107,113,49,105,69,66,109,103,79,106,87,104,75,75,97,53,98,54,65]],[[<<\"index\">>]],[[<<\"X-Riak-Deleted\">>,116,114,117,101]],[[<<\"X-Riak-Last-Modified\">>|{1381,978330,755498}]],[],[]}}},<<>>}],[{<<250,120,75,127,79,209,93,62>>,{6,63516323103}},{<<31,103,165,230,79,209,...>>,...},...],...},...}"}
>
> Any thoughts?
>
> Thanks y'all
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Filtering not_found in reduce JS causes SyntaxError

2013-10-20 Thread Matt Black
The plot thickens. Having run the same query a couple more times just now -
I see a different error! (No changes were made to the code.)

Exception: Error processing stream message:
exit:{ucs,{bad_utf8_character_code}}:
    [{xmerl_ucs,from_utf8,1,[{file,"xmerl_ucs.erl"},{line,185}]},
     {mochijson2,json_encode_string,2,[{file,"src/mochijson2.erl"},{line,186}]},
     {mochijson2,'-json_encode_proplist/2-fun-0-',3,[{file,"src/mochijson2.erl"},{line,167}]},
     {lists,foldl,3,[{file,"lists.erl"},{line,1197}]},
     {mochijson2,json_encode_proplist,2,[{file,"src/mochijson2.erl"},{line,170}]},
     {riak_kv_pb_mapred,process_stream,3,[{file,"src/riak_kv_pb_mapred.erl"},{line,115}]},
     {riak_api_pb_server,process_stream,5,[{file,"src/riak_api_pb_server.erl"},{line,246}]},
     {riak_api_pb_server,handle_info,2,[{file,"src/riak_api_pb_server.erl"},{line,129}]}]


On 21 October 2013 11:58, Matt Black  wrote:

> BTW, this cluster is running 1.4.0 still. If 1.4.2 would fix this issue I
> could update.
>
>
> On 21 October 2013 10:42, Matt Black  wrote:
>
>> Hey list,
>>
>> A script recently introduced to cleanup old data by deleting it has
>> caused one of our old reporting scripts to start failing with “not_found”.
>> I’d encountered this once before - so I thought the simple introduction of
>> a reduce phase using Riak.filterNotFound would fix it.
>>
>> However, now I’m receiving this error - removing the one line addition of
>> query.reduce("Riak.filterNotFound") gives me my old “not_found” error
>> straight back.
>>
>> Exception: 
>> {"phase":1,"error":"[{<<\"lineno\">>,466},{<<\"message\">>,<<\"SyntaxError: 
>> syntax 
>> error\">>},{<<\"source\">>,<<\"()\">>}]","input":"{ok,{r_object,<<\"carts\">>,<<\"dd2bcd07fa8019b2d1fc1d4832c41c74\">>,[{r_content,{dict,4,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],

Re: Filtering not_found in reduce JS causes SyntaxError

2013-10-20 Thread Matt Black
I side-stepped this error by adding this little block of code into the top
of my map phase (which we are using elsewhere in the same project):

if (v.values[0].metadata['X-Riak-Deleted'] !== undefined) {
    return [];
}
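
Wired into a full map phase it looks something like this (a sketch; the
bucket name is made up):

import riak

client = riak.RiakClient()
query = client.add("carts")
query.map("""
function(v) {
    if (v.values[0].metadata['X-Riak-Deleted'] !== undefined) {
        return [];  // skip tombstones left behind by deletes
    }
    return [Riak.mapValuesJson(v)[0]];
}""")
results = query.run()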

Unfortunately I now have a different problem, which I’ll detail in a
separate thread.

On 21 October 2013 12:15, Matt Black  wrote:

> The plot thickens. Having run the same query a couple more times just now
> - I see a different error! (No changes were made to the code.)
>
> Exception: Error processing stream message:
> exit:{ucs,{bad_utf8_character_code}}:
>     [{xmerl_ucs,from_utf8,1,[{file,"xmerl_ucs.erl"},{line,185}]},
>      {mochijson2,json_encode_string,2,[{file,"src/mochijson2.erl"},{line,186}]},
>      {mochijson2,'-json_encode_proplist/2-fun-0-',3,[{file,"src/mochijson2.erl"},{line,167}]},
>      {lists,foldl,3,[{file,"lists.erl"},{line,1197}]},
>      {mochijson2,json_encode_proplist,2,[{file,"src/mochijson2.erl"},{line,170}]},
>      {riak_kv_pb_mapred,process_stream,3,[{file,"src/riak_kv_pb_mapred.erl"},{line,115}]},
>      {riak_api_pb_server,process_stream,5,[{file,"src/riak_api_pb_server.erl"},{line,246}]},
>

Phase 2 Timeout

2013-10-20 Thread Matt Black
Following on from some earlier errors I was getting, I’m now kind of stuck
between a rock and a hard place.

One of our statistics reports fails with a timeout during a
query.filter_not_found() phase:

Exception: 
{"phase":2,"error":"timeout","input":"[<<\"users\">>,<<\"33782eee0470cac583b136fd063decdc\">>,{struct,[{<<\"brid\">>,<<\"33782eee0470cac583b136fd063decdc\">>},{<<\"field1\">>,<<\"2012-11-19T04:53:00Z\">>},{<<\"field2\">>,..
SNIP 
,...]}]","type":"exit","stack":"[{riak_kv_w_reduce,'-js_runner/1-fun-0-',3,[{file,\"src/riak_kv_w_reduce.erl\"},{line,283}]},{riak_kv_w_reduce,reduce,3,[{file,\"src/riak_kv_w_reduce.erl\"},{line,206}]},{riak_kv_w_reduce,maybe_reduce,2,[{file,\"src/riak_kv_w_reduce.erl\"},{line,157}]},{riak_pipe_vnode_worker,process_input,3,[{file,\"src/riak_pipe_vnode_worker.erl\"},{line,445}]},{riak_pipe_vnode_worker,wait_for_input,2,[{file,\"src/riak_pipe_vnode_worker.erl\"},{line,377}]},{gen_fsm,handle_msg,7,[{file,\"gen_fsm.erl\"},{line,494}]},{proc_lib,...}]"}

This is exactly the same problem discussed way back on this very list:


https://groups.google.com/forum/#!topic/nosql-databases/iHYDyqyidkM

Unfortunately this time I’m unable to rewrite the query to work in a
different way - if I remove the query.filter_not_found() phase I receive a
different error - exit:{json_encode, {bad_term, {not_found (which was
covered in more detail in my previous emails).

Any thoughts on how I can attempt to work around this?
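
One knob worth noting: MapReduce timeouts are per job, not per phase, and
the job document accepts a top-level "timeout" in milliseconds. A sketch
against the HTTP API (host/port and the `requests` dependency are
assumptions):

import json
import requests

job = {
    "inputs": "users",
    "query": [
        {"map": {"language": "javascript", "name": "Riak.mapValuesJson"}},
        {"reduce": {"language": "javascript", "name": "Riak.filterNotFound"}},
    ],
    "timeout": 300000,  # five minutes for the whole job
}
resp = requests.post("http://127.0.0.1:8098/mapred",
                     data=json.dumps(job),
                     headers={"Content-Type": "application/json"})
print(resp.status_code, resp.text[:200])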
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: LevelDB tuning questions.

2013-11-07 Thread Matt Black
Interesting thread!

I had used the old 1.2 spreadsheet to calculate cache_size and max_files -
and having used the 1.4 spreadsheet, it looks like I should change my
config :)

However, I'm not seeing anything like the below in my log files. Which log
file should I be looking in?

Cheers
Matt


On 8 November 2013 11:50, Matthew Von-Maszewski  wrote:

> Agreed
>
> Matthew Von-Maszewski
>
>
> On Nov 7, 2013, at 19:44, kzhang  wrote:
>
> > Thanks!
> >
> > The cluster has been in production for 4 months. I found this in leveldb
> log
> > file:
> > 013/10/25-12:43:10.678239 7fb895781700 compacted to: files[ 0 1 5 31 51
> 0 0
> > ]
> > 2013/10/29-16:57:26.280633 7fb895781700 compacted to: files[ 0 2 5 31 51
> 0 0
> > ]
> > 2013/11/03-11:46:15.935006 7fb895781700 compacted to: files[ 0 3 5 31 51
> 0 0
> > ]
> > 2013/11/07-12:04:18.308733 7fb895781700 compacted to: files[ 0 4 5 31 51
> 0 0
> > ]
> > 2013/11/07-12:04:29.921077 7fb8958bf700 compacted to: files[ 0 0 20 31
> 51 0
> > 0 ]
> > 2013/11/07-12:04:31.629970 7fb8958bf700 compacted to: files[ 0 0 19 31
> 51 0
> > 0 ]
> > 2013/11/07-12:04:34.008481 7fb8958bf700 compacted to: files[ 0 0 18 31
> 51 0
> > 0 ]
> > 2013/11/07-12:04:38.480066 7fb8958bf700 compacted to: files[ 0 0 17 32
> 51 0
> > 0 ]
> > 2013/11/07-12:04:39.197229 7fb8958bf700 compacted to: files[ 0 0 15 35
> 51 0
> > 0 ]
> > 2013/11/07-12:04:41.440583 7fb8958bf700 compacted to: files[ 0 0 14 35
> 51 0
> > 0 ]
> > 2013/11/07-12:04:44.895349 7fb8958bf700 compacted to: files[ 0 0 13 35
> 51 0
> > 0 ]
> > 2013/11/07-12:04:47.098208 7fb8958bf700 compacted to: files[ 0 0 12 36
> 51 0
> > 0 ]
> > 2013/11/07-12:04:51.675148 7fb8958bf700 compacted to: files[ 0 0 11 39
> 51 0
> > 0 ]
> > 2013/11/07-12:04:53.454434 7fb8958bf700 compacted to: files[ 0 0 10 41
> 51 0
> > 0 ]
> > 2013/11/07-12:04:58.335885 7fb8958bf700 compacted to: files[ 0 0 9 44 51
> 0 0
> > ]
> > 2013/11/07-12:05:02.312521 7fb8958bf700 compacted to: files[ 0 0 8 44 51
> 0 0
> > ]
> > 2013/11/07-12:05:06.239789 7fb8958bf700 compacted to: files[ 0 0 7 44 51
> 0 0
> > ]
> > 2013/11/07-12:05:09.034111 7fb8958bf700 compacted to: files[ 0 0 6 45 51
> 0 0
> > ]
> > 2013/11/07-12:05:11.293695 7fb8958bf700 compacted to: files[ 0 0 6 44 51
> 0 0
> > ]
> > 2013/11/07-12:05:12.762098 7fb8958bf700 compacted to: files[ 0 0 6 43 51
> 0 0
> > ]
> > 2013/11/07-12:05:15.484027 7fb8958bf700 compacted to: files[ 0 0 6 42 51
> 0 0
> > ]
> > 2013/11/07-12:05:18.960306 7fb8958bf700 compacted to: files[ 0 0 6 41 51
> 0 0
> > ]
> >
> > looks like the sum never exceeded 110. I guess I should go ahead change
> the
> > max open file to 138?
> >
> > Thanks,
> >
> > Kathleen
> >
> >
> >
> >
> > --
> > View this message in context:
> http://riak-users.197444.n3.nabble.com/LevelDB-tuning-questions-tp4029246p4029725.html
> > Sent from the Riak Users mailing list archive at Nabble.com.
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Links and uniqueness

2013-11-18 Thread Matt Black
Hello list,

Once upon a time, a link from one object to another was unique - you
couldn't add two links from object A to object B. I know this because I had
to code around it in our app.

At some stage that limitation has been removed - in either the Python
bindings or Riak itself.

Can anyone else confirm this? Basho peeps, are non-unique links the
intended behaviour?

Thanks
Matt Black
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Links and uniqueness

2013-11-21 Thread Matt Black
Apologies for the bump!

Basho guys, can I get a confirmation on the uniqueness of links between two
objects please? (Before I go and modify the code in my app to suit)

Thanks
Matt


On 19 November 2013 14:31, Matt Black  wrote:

> Hello list,
>
> Once upon a time, a link from one object to another was unique - you
> couldn't add two links from object A onto object B. I know this as I had to
> code around it in our app.
>
> At some stage that limitation has been removed - in either the Python
> bindings or Riak itself.
>
> Can anyone else confirm this? Basho peeps, are non-unique links the
> intended behaviour?
>
> Thanks
> Matt Black
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Links and uniqueness

2013-11-21 Thread Matt Black
Thanks Brian, I suspected it was a constraint applied in the client tools
(although in my case the Python ones).
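
A quick way to see the behaviour with the 2.0 Python client, where links
are plain (bucket, key, tag) tuples on a list and nothing deduplicates them
(bucket/key names made up):

import riak

client = riak.RiakClient()
bucket = client.bucket("carts")
obj = bucket.new("a-cart", data={})
obj.links.append(("products", "widget", "cart-products"))
obj.links.append(("products", "widget", "cart-products"))  # same link twice
obj.store()

print(len(bucket.get("a-cart").links))  # 2 -- both links kept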


On 22 November 2013 13:05, Brian Roach  wrote:

> Matt -
>
> This has never been a restriction in Riak itself AFAIK. I fixed the
> same issue in the Java client over a year ago - it was using a hashmap
> for links so duplicates were discarded;
> https://github.com/basho/riak-java-client/pull/165
>
> - Roach
>
> On Thu, Nov 21, 2013 at 7:00 PM, Matt Black 
> wrote:
> > Apologies for the bump!
> >
> > Basho guys, can I get a confirmation on the uniqueness of links between
> two
> > objects please? (Before I go an modify the code in my app to suit)
> >
> > Thanks
> > Matt
> >
> >
> >
> > On 19 November 2013 14:31, Matt Black  wrote:
> >>
> >> Hello list,
> >>
> >> Once upon a time, a link from one object to another was unique - you
> >> couldn't add two links from object A onto object B. I know this as I
> had to
> >> code around it in our app.
> >>
> >> At some stage that limitation has been removed - in either the Python
> >> bindings or Riak itself.
> >>
> >> Can anyone else confirm this? Basho peeps, are non-unique links the
> >> intended behaviour?
> >>
> >> Thanks
> >> Matt Black
> >>
> >
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: using salt stack with riak

2014-01-22 Thread Matt Black
Hi Matt,

We manage all our Riak infrastructure with a couple of Salt states and a
custom module I wrote which you can see here:

https://github.com/saltstack/salt-contrib/blob/master/modules/riak.py

There's another Riak module in Salt core, but last time I checked it had
less functionality. (I talked with them a while back about merging the two
modules - perhaps I should bring that up again).

I can send our Salt states as well, if you're interested :)


On 23 January 2014 07:05, Matt Davis  wrote:

> Hey all,
>
> We're implementing salt stack for configuration management, and I've been
> trying out how it works with riak, specifically remote command execution.
>
> Anyone out there in riak-land been successfully integrating it with salt?
>
> I've hit a couple of "arroo?" moments and am curious what others have
> experienced.
>
> -matt
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: using salt stack with riak

2014-01-23 Thread Matt Black
This state is the meat of it, and should be pretty self-explanatory if
you’re already writing Salt states:

riak-ulimit-pam:
  file.append:
    - name: /etc/pam.d/common-session
    - text: "session\trequired\tpam_limits.so"

riak-ulimit-pam-noninteractive:
  file.append:
    - name: /etc/pam.d/common-session-noninteractive
    - text: "session\trequired\tpam_limits.so"

riak-ulimit:
  file.append:
    - name: /etc/security/limits.conf
    - text:
      - "riak soft nofile 65536"
      - "riak hard nofile 65536"
    - require:
      - file: riak-ulimit-pam

python-software-properties:
  pkg.installed

basho-pkgrepo:
  pkgrepo.managed:
    - humanname: Basho PPA
    - name: deb http://apt.basho.com precise main
    - file: /etc/apt/sources.list.d/basho.list
    - key_url: http://apt.basho.com/gpg/basho.apt.key
    - require:
      - pkg: python-software-properties

riak:
  pkg.installed:
    - version: 1.4.2-1
    - require:
      - pkgrepo: basho-pkgrepo
  riak.running:
    - require:
      - pkg: riak
      - file: /etc/riak/app.config
      - file: /etc/riak/vm.args
      - file: riak-ulimit

/etc/riak/app.config:
  file.managed:
    - source: salt://riak/app.config
    - mode: 644
    - template: jinja
    - require:
      - pkg: riak
    - defaults:
        internal_ip: {{ salt['cmd.exec_code']('bash', 'hostname -I') }}

/etc/riak/vm.args:
  file.managed:
    - source: salt://riak/vm.args
    - mode: 644
    - template: jinja
    - require:
      - pkg: riak
    - defaults:
        internal_ip: {{ salt['cmd.exec_code']('bash', 'hostname -I') }}



On 24 January 2014 04:22, Matt Davis  wrote:

> Nicely done Matt! Sure would love to see your states... I've got a fairly
> good one for riak-cs, would love to see some others.
>
>
> On Wed, Jan 22, 2014 at 2:14 PM, Matt Black wrote:
>
>> Hi Matt,
>>
>> We manage all our Riak infrastructure with a couple of Salt states and a
>> custom module I wrote which you can see here:
>>
>> https://github.com/saltstack/salt-contrib/blob/master/modules/riak.py
>>
>> There's another Riak module in Salt core, but last time I checked it had
>> less functionality. (I talked with them a while back about merging the two
>> modules - perhaps I should bring that up again).
>>
>> I can send our Salt states as well, if you're interested :)
>>
>>
>> On 23 January 2014 07:05, Matt Davis  wrote:
>>
>>> Hey all,
>>>
>>> We're implementing salt stack for configuration management, and I've
>>> been trying out how it works with riak, specifically remote command
>>> execution.
>>>
>>> Anyone out there in riak-land been successfully integrating it with salt?
>>>
>>> I've hit a couple of "arroo?" moments and am curious what others have
>>> experienced.
>>>
>>> -matt
>>>
>>>
>>> ___
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Multi-backend and retiring storage

2014-08-26 Thread Matt Black
Hello list,

Apologies in advance for the longish preamble - the question (when I get to
it) is fairly straightforward.

We’ve been happily running Riak for a couple of years now using
multi-backend to map various buckets onto different disk locations (for
infosec compliance reasons).

Our config is essentially this:

{riak_kv, [
    {storage_backend, riak_kv_multi_backend},

    %% Configure multiple eleveldb storage locations
    {multi_backend_default, <<"default">>},
    {multi_backend, [
        %% Default fallback: this is unused
        {<<"default">>, riak_kv_eleveldb_backend, [
            {data_root, "/mnt/riak/eleveldb/default"}
        ]},
        %% Storage1
        {<<"eleveldb_storage1">>, riak_kv_eleveldb_backend, [
            {data_root, "/mnt/riak/eleveldb/storage1"}
        ]},
        %% Storage2
        {<<"eleveldb_storage2">>, riak_kv_eleveldb_backend, [
            {data_root, "/mnt/riak/eleveldb/storage2"}
        ]},
        ...

Bucket properties then map Riak buckets onto the relevant backend.

Again for infosec compliance reasons, we need to “retire” one of our
backends, and the data contained therein.

What would the process be for doing this? What happens across the cluster
as I remove the config from each node? And at what point can I safely
delete the data directories?

Thanks
Matt
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com