Most efficient way to determine if 1000 specific keys exist?

2012-05-02 Thread Tim Haines
Hey guys,

Still a relative newbie here.

I was hoping to be able to set up a MapReduce job that I could feed 1000
keys to, and have it tell me which of the 1000 exist in the bucket.
I was hoping this could use the key index (such a thing exists, right?)
without having to read the objects.

The methods I've tried for doing this fail as soon as the first
non-existent key is found, though.

Is there a way to do this?

Or alternatively, is there a way to check for the presence of one key at a
time without Riak having to read the object?

Cheers,

Tim.
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Most efficient way to determine if 1000 specific keys exist?

2012-05-02 Thread Jeremiah Peschka
You can use the special $key index to grab this information out of Riak. You
could also write an MR job that returns just the key for every key that
exists, but you would still have to read everything from disk... or so I
think; I could be wrong.
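
To make that concrete, here is a sketch of such an MR job against the HTTP
API (untested; the host, bucket, and key names are placeholders). The map
function returns the key when the object exists and an empty list on
not_found, so one missing key no longer kills the whole job -- which may be
what your current attempts are running into:

   curl -s http://127.0.0.1:8098/mapred \
     -H 'Content-Type: application/json' \
     -d '{"inputs": [["mybucket","key1"],["mybucket","key2"]],
          "query": [{"map": {"language": "javascript", "source":
            "function(v){ if (v.not_found) return []; return [v.key]; }"}}]}'

And for checking one key at a time, a HEAD request returns only the status
line and headers (200 vs. 404), so no object body crosses the wire -- though
Riak still has to read the object on the server side:

   curl -sI http://127.0.0.1:8098/riak/mybucket/key1 | head -n 1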

---
Jeremiah Peschka, Managing Director, Brent Ozar PLF, LLC
Microsoft SQL Server MVP



Riak Join is Screwed Up

2012-05-02 Thread Rebecca Meritz
I'm testing a script that joins my Riak ring on all machines. There is no
data in the database yet, but I have repeatedly joined and separated the
ring while working on the script that sets up the environment on my
machines. The ring is now in a bizarre state:

[rebecca]$ riak-admin member_status
Attempting to restart script through sudo -u riak
================================ Membership =================================
Status     Ring       Pending    Node
-----------------------------------------------------------------------------
valid      50.0%      --         'riak@xx.xxx.xx.10'
valid      50.0%      --         'riak@xx.xxx.xx.12'
-----------------------------------------------------------------------------
Valid:2 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
[pay@payment-testing-gw2 ~]$ riak-admin join riak@xx.xxx.xx.14
Attempting to restart script through sudo -u riak
*Failed: This node is already a member of a cluster*
[pay@payment-testing-gw2 ~]$ riak-admin force-remove riak@xx.xxx.xx.14
Attempting to restart script through sudo -u riak
*Failed: "riak@xx.xxx.xx.14" is not a member of the cluster.*

I cannot join a new member nor can I remove it.

I've tried stopping them all and getting them to leave; if they wouldn't
leave, I forced their removal. I stopped them all. I even deleted the whole
old ring file before restarting.
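
Roughly, the sequence I ran on each node looked like this (a sketch from
memory; exact commands and order varied, and the ring path is the packaged
default):

   riak-admin leave                      # ask the node to leave the ring
   riak-admin force-remove riak@<ip>     # when leave failed
   riak stop
   rm /var/lib/riak/ring/*               # delete the old ring files
   riak start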

How can I fix this situation? What causes the above error?

Thanks,
Rebecca


Reip(ing) riak node created two copies in the cluster

2012-05-02 Thread Nitish Sharma
Hi,
We have a 12-node Riak cluster. Until now we were naming every new node as 
riak@<ip>. We then decided to rename all the nodes to 
riak@<hostname>, which makes troubleshooting easier. 
After issuing the reip command to two nodes, we noticed in the "status" output 
that those 2 nodes were now appearing in the cluster under the old name as well 
as the new name. Other nodes were trying to hand off partitions to the "new" 
nodes, but apparently they were not able to. After this the whole cluster went 
down and completely stopped responding to any read/write requests. 
member_status displayed the old Riak names in "legacy" mode. Since this is our 
production cluster, we are desperately looking for some quick remedies. Issuing 
"force-remove" for the old names, restarting all the nodes, changing the Riak 
names back to the old ones - none of it helped.
Currently, we are hosting a limited amount of data. What's an elegant way to 
recover from this mess? Would shutting down all the nodes, deleting the ring 
directory, and forming the cluster again work?

Cheers
Nitish  


Re: Reip(ing) riak node created two copies in the cluster

2012-05-02 Thread Mark Phillips
First question: what version of Riak are you running?

Mark




Re: Reip(ing) riak node created two copies in the cluster

2012-05-02 Thread Jon Meredith
Hi Nitish,

If you rebuild the cluster with the same ring size, the data will
eventually get back to the right place.  While the rebuild is taking place
you may see notfounds for gets until the data has been handed off to the
newly assigned owner (as it will be secondary handoff, not primary
ownership handoff, that gets the data back).  If you don't have a lot of
data stored in the cluster it shouldn't take too long.

The process would be to stop all nodes, move the files out of the ring
directory to a safe place, start all nodes, and rejoin.  If you're using
1.1.x and you have capacity in your hardware, you may want to increase
handoff_concurrency to something like 4 to permit more transfers to happen
across the cluster.
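
In concrete terms, the rebuild might look like this (a sketch assuming the
packaged default paths -- adjust for your install; <node1-ip> stands for
whichever node you bring up first):

   riak stop                                  # on every node
   mkdir -p /var/lib/riak/ring-backup
   mv /var/lib/riak/ring/* /var/lib/riak/ring-backup/   # stash the ring files
   # optional on 1.1.x: raise {handoff_concurrency, 4} in the riak_core
   # section of app.config before restarting
   riak start                                 # on every node
   riak-admin join riak@<node1-ip>            # on every node except the first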


Jon.






-- 
Jon Meredith
Platform Engineering Manager
Basho Technologies, Inc.
jmered...@basho.com


Re: Reip(ing) riak node created two copies in the cluster

2012-05-02 Thread Nitish Sharma
Hi Jon,
Thanks for your input. I've already started working along those lines. 
I stopped all the nodes, moved the ring directory from one node, brought that 
one up, and issued the join command on one other node (after moving its ring 
directory) - node2. While they were busy redistributing the partitions, I 
started another node (node3) and issued the join command (before riak_kv was 
running, since it takes some time to load existing data).
But after this, data handoffs are occurring only between node1 and node2. 
"member_status" says that node3 owns 0% of the ring and 0% is pending.
We have a lot of data - each node serves around 200 million documents. The 
Riak cluster is running 1.1.2.
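
For reference, I've been watching progress with these commands (availability
may vary by version):

   riak-admin member_status   # ownership percentages per node
   riak-admin transfers       # partitions still waiting to hand off
   riak-admin ringready       # whether all nodes agree on the ring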
Any suggestions?

Cheers
Nitish


Re: Reip(ing) riak node created two copies in the cluster

2012-05-02 Thread Jon Meredith
Hi Nitish, for this to work you'll have to stop all the nodes at the same
time, clear the ring on all nodes, start up all nodes, then rejoin.

If you clear the rings one node at a time, then when you rejoin the nodes,
the ring with the old- and new-style names will be gossiped back and you'll
still have both names.

I didn't realize you had a large amount of data - originally you said
"Currently, we are hosting a limited amount of data", but 200 million docs
per node seems like a fair amount.  Rebuilding a cluster of that size may
take a long time.

Your options, as I see them, are:

  1) If you have backups of the ring files, you could revert the node name
changes and get the cluster stable again on riak@IP (see the sketch after
this list).  The ring files have a timestamp associated with them, but we
only keep a few of the last ring files, so if enough gossip has happened
the pre-rename rings will have been destroyed.  You would have to stop all
nodes, put the ring files back as they were before the change, fix the
names in vm.args, and then restart the nodes.

  2) You can continue with the rebuild plan: stop all nodes, set the new
names in vm.args, start the nodes again, and rebuild the cluster, adding as
many nodes as you can at once so they rebalance at the same time.  When new
nodes are added, the claimant node works out ownership changes and starts a
sequence of transfers.  If new nodes are added once a sequence is under
way, the claimant will wait for that to complete, then check whether there
are any new nodes and repeat until all nodes are assigned.  If you add all
the nodes at once you will do fewer transfers overall.
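
For option 1, the revert might look like this sketch on each node (packaged
default paths assumed; <old-ip> is that node's pre-rename address):

   riak stop
   cp /path/to/ring-backup/* /var/lib/riak/ring/   # restore pre-rename rings
   # in /etc/riak/vm.args, set the name back:  -name riak@<old-ip>
   riak start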


If the cluster cannot be stopped, there are other things we might be able
to do, but they're a bit more complex.  What are your uptime requirements?

Jon





-- 
Jon Meredith
Platform Engineering Manager
Basho Technologies, Inc.
jmered...@basho.com

Re: Reip(ing) riak node created two copies in the cluster

2012-05-02 Thread Nitish Sharma

On May 2, 2012, at 6:12 PM, Jon Meredith wrote:

> Hi Nitish, for this to work you'll have to stop all the nodes at the same 
> time, clear the ring on all nodes, start up all nodes, then rejoin.
> 
> If you clear the rings one node at a time, then when you rejoin the nodes, 
> the ring with the old- and new-style names will be gossiped back and you'll 
> still have both names.
Sorry for the confusion. I didn't clear the rings one node at a time while 
keeping the other nodes live. These are the steps I followed:
1. Stop Riak on all the nodes.
2. Remove the ring directory from all nodes.
3. Start the nodes and rejoin.

> I didn't realize you had a large amount of data - originally you said 
> "Currently, we are hosting a limited amount of data", but 200 million docs 
> per node seems like a fair amount.  Rebuilding a cluster of that size may 
> take a long time.
> 
Yeah, we are currently serving only a very limited amount because of the Riak 
outage. In total, we have almost 750 million documents served by Riak.
> 
> If the cluster cannot be stopped, there are other things we might be able to 
> do, but they're a bit more complex.  What are your uptime requirements?
> 
We have currently stopped the cluster and are running on a small amount of 
data. We can wait for the partition redistribution to complete on Riak, 
though I don't have a strong feeling about it. "member_status" doesn't give 
us a correct picture: http://pastie.org/3849548. Is this expected behavior? 
I should also mention that all the nodes are still loading existing data and 
it will take a few hours (2-3) until Riak KV is running on all of them.

Cheers
Nitish