Re: Multiple disks

2011-03-27 Thread Dan Reverri
Hi Joe, Your observation regarding question 5 is correct. The coordinating FSM would attempt to send the request to the failed vnode and receive either an error or no reply. A request may still succeed if enough of the other vnodes respond; "enough" would be determined by the "r", "w", "dw", or "rw"…
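
(A minimal illustration of those per-request quorum parameters, assuming Riak's stock HTTP interface on port 8098 and a hypothetical bucket/key:)

    # Read succeeds once 2 of the N replica vnodes answer, even if one
    # replica is down:
    curl "http://127.0.0.1:8098/riak/logs/k1?r=2"

    # Write requiring 2 vnode acks, one of them durable on disk:
    curl -X PUT -H "Content-Type: text/plain" --data "value" \
      "http://127.0.0.1:8098/riak/logs/k1?w=2&dw=1"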

Re: Multiple disks

2011-03-23 Thread Joseph Blomstedt
Sorry, I don't have a lot of time right now. I'll try to write a more detailed response later. >>> With a few hours of investigation today, your patch is looking >>> promising. Maybe you can give some more detail on what you did in your >>> experiments a few months ago? I'll try to write something…

Re: Multiple disks

2011-03-23 Thread Nico Meyer
After reading today's recap, I am a bit unsure: 5) Q --- Would Riak handle an individual vnode failure the same way as an entire node failure? (from grourk via #riak) A --- Yes. The request to that vnode would fail and will be routed to the next available vnode. Is it really handled the same…

Re: Multiple disks

2011-03-23 Thread Nico Meyer
Hi Greg, I don't think the vnodes will always die. I have seen some situations (disk full, filesystem becoming read-only due to device errors, corrupted bitcask files after a machine crash) where the vnode did not crash, but the get and/or put requests returned errors. Even if the process crashes…

Re: Multiple disks

2011-03-23 Thread Greg Nelson
Hi Joe, With a few hours of investigation today, your patch is looking promising. Maybe you can give some more detail on what you did in your experiments a few months ago? What I did was set up an Ubuntu VM with three loopback file systems, then built Riak 0.14.1 with your patch, configured as…
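
(The loopback setup described can be reproduced with something like the following; the image paths, sizes, and ext3 are assumptions, run as root:)

    # Three file-backed "disks", each mounted at its own path:
    for i in 1 2 3; do
      dd if=/dev/zero of=/var/disk$i.img bs=1M count=1024
      mkfs.ext3 -F /var/disk$i.img
      mkdir -p /mnt/disk$i
      mount -o loop /var/disk$i.img /mnt/disk$i
    done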

Re: Multiple disks

2011-03-22 Thread Jeremiah Peschka
As an aside about RAID 10: If you have a decent RAID controller, you may get reads that are 2x as fast because the controller will read from both stripes at the same time. It's not just a redundancy issue: it's also fast. Like a cheetah. On meth. RAID 10 will really improve performance if you're…

Re: Multiple disks

2011-03-22 Thread Joseph Blomstedt
> Do the redundant writes still go to different physical nodes or just vnodes > that might be on the same host and fail at the same time? This is a standard riak question, and you can find a lot of good discussion if you search the mailing list. riak predominantly operates at the vnode level; the…
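
(One relevant knob here is riak_core's target_n_val, which the claim algorithm uses when spreading adjacent partitions across distinct physical nodes; a minimal app.config fragment, default value assumed:)

    %% Keep any target_n_val adjacent partitions on distinct nodes,
    %% so an N=3 preflist normally spans three physical machines:
    {riak_core, [
        {target_n_val, 4}
    ]}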

Re: Multiple disks

2011-03-22 Thread Les Mikesell
On 3/22/2011 11:54 AM, Joseph Blomstedt wrote: Let's consider a simple scenario under normal riak. The key concept here is to realize that riak's vnodes are completely independent, and that failure and partition ownership changes are handled through handoff alone. Let's say we have an 8-partition…
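
(A sketch of the layout that example builds on, assuming an 8-partition ring claimed round-robin by nodes A, B, C with N=3; a key's preflist is roughly the N partitions clockwise from its hash:)

    partition: p0 p1 p2 p3 p4 p5 p6 p7
    owner:     A  B  C  A  B  C  A  B
    key hashing near p1  ->  preflist [p1@B, p2@C, p3@A]
    if B goes down       ->  a fallback vnode for p1 starts on another
                             node and hands off to B when it returns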

Re: Multiple disks

2011-03-22 Thread Greg Nelson
Thanks for all the responses! Regarding RAID 10, I think for now that's out because we can't afford to cut storage capacity in half when we're already using 2-3x for Riak-level redundancy. And adding more redundancy at the RAID layer just seems... incorrect. Running multiple instances on each…

Re: Multiple disks

2011-03-22 Thread Joseph Blomstedt
You're forgetting how awesome riak actually is. Given how riak is implemented, my patches should work without any operational headaches at all. Let me explain. First, there was the one issue from yesterday. My initial patch didn't reuse the same partition bitcask on the same node. I've fixed that…
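
(A hypothetical sketch of the layout the patch allows: several bitcask roots, with each partition's vnode pinned to one of them; a restarting vnode must reopen the same root, hence the partition-to-directory index mentioned later in the thread:)

    /mnt/disk1/bitcask/<partition-index>/
    /mnt/disk2/bitcask/<partition-index>/
    /mnt/disk3/bitcask/<partition-index>/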

Re: Multiple disks

2011-03-22 Thread Alexander Sicular
Ya, my original message just highlighted the standard 0,1,5 that most people/hardware should know/be able to support. There are better options and 10 would be one of them. @siculars on twitter http://siculars.posterous.com Sent from my iPhone On Mar 22, 2011, at 8:43, Ryan Zezeski wrote: …

Re: Multiple disks

2011-03-22 Thread Ryan Zezeski
On Tue, Mar 22, 2011 at 10:01 AM, Alexander Sicular wrote: > Save your ops dudes the headache and just use raid 5 and be done with it. Depending on the number of disks available I might even argue running software RAID 10 for better throughput and less chance of data loss (as long as you can…
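
(A minimal software RAID 10 sketch with mdadm; the device names are hypothetical and the commands destroy whatever is on those disks:)

    mdadm --create /dev/md0 --level=10 --raid-devices=4 \
        /dev/sdb /dev/sdc /dev/sdd /dev/sde
    mkfs.ext4 /dev/md0               # or ext3/xfs, per preference
    mount /dev/md0 /var/lib/riak     # put the riak data root on the array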

Re: Multiple disks

2011-03-22 Thread Alexander Sicular
This kinda thing is an operational nightmare. At the very least, I imagine, you are going to need symlinks on all your nodes for all your vnode/directory combos. How does this get managed in failure scenarios or when adding/removing nodes? Think about it a bit if you were to do th…
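
(For concreteness, the symlink workaround being criticized would look roughly like this, assuming a 64-partition ring, bitcask's default data_root, and hand-picked mount points:)

    # Pin individual vnode directories to different disks by hand:
    ln -s /mnt/disk1/bitcask/0 /var/lib/riak/bitcask/0
    ln -s /mnt/disk2/bitcask/22835963083295358096932575511191922182123945984 \
          /var/lib/riak/bitcask/22835963083295358096932575511191922182123945984
    # ...repeated per partition, on every node, and kept correct across
    # every failure and ownership change.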

Re: Multiple disks

2011-03-22 Thread Joseph Blomstedt
Oh, I just realized that was only a partial solution to the problem. I forgot to commit related logic that handles selecting the same directory on vnode restart. That's what I get for sending out code late at night. You'll want to maintain a partition->directory index somewhere to really make it work…

Re: Multiple disks

2011-03-22 Thread Joseph Blomstedt
Each vnode already opens a separate bitcask, so nothing inherently prevents the desired behavior. It's just not coded that way. While an individual bitcask must be a single directory, there is no reason all vnodes need to open their bitcasks within a shared root directory. Luckily…
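
(For context, the stock layout: every vnode opens its bitcask as a subdirectory, named by partition index, under one shared data_root; an app.config fragment with the packaged default path:)

    {bitcask, [
        {data_root, "/var/lib/riak/bitcask"}
    ]}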

Re: Multiple disks

2011-03-21 Thread Luke Monahan
On Tue, Mar 22, 2011 at 11:14 AM, Seth Falcon wrote: > Perhaps another option is to simply run multiple separate riak nodes > on the same machine, each pointed at its own disk. That's what I thought when first reading this, but a hardware failure would be likely to take out all the nodes on a…

Re: Multiple disks

2011-03-21 Thread Seth Falcon
On Mon, Mar 21, 2011 at 4:51 PM, Alexander Sicular wrote: > In short, no. Vnodes cannot be pointed to individual disks. Whichever > backend you use for riak, all the files will live in one directory. Perhaps another option is to simply run multiple separate riak nodes on the same machine, each pointed at its own disk…
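
(A sketch of that setup: two riak nodes on one machine, each with its own node name, ports, and data root; the names, paths, and port numbers below are illustrative:)

    # node1: vm.args    -name riak1@10.0.0.5
    #        app.config {bitcask, [{data_root, "/mnt/disk1/bitcask"}]}
    #        http 8098, pb 8087, handoff 8099
    # node2: vm.args    -name riak2@10.0.0.5
    #        app.config {bitcask, [{data_root, "/mnt/disk2/bitcask"}]}
    #        http 8198, pb 8187, handoff 8199  (every port must differ)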

Re: Multiple disks

2011-03-21 Thread Alexander Sicular
In short, no. Vnodes cannot be pointed to individual disks. Whichever backend you use for riak, all the files will live in one directory. Your only option is raid, and to select the raid level that is appropriate for your application. You basically have 3 options when it comes to raid levels: S…
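
(For reference, the standard trade-offs behind those options; textbook definitions, not recovered from the truncated message:)

    RAID 0  striping, no redundancy :  full capacity, fastest; one lost
                                       disk loses everything
    RAID 1  mirroring               :  half capacity; survives a disk loss
    RAID 5  striping + parity       :  (n-1)/n capacity; survives one disk
                                       loss; slower small writes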