This kind of thing is an operational nightmare. At the very least, I imagine, you are going to need symlinks on all your nodes for every vnode/directory combination. How does that get managed in failure scenarios, or when adding/removing nodes?

Thinking about it a bit: if you were to do this, I think the most reliable way to get the info out to the cluster would be to use riak_core to gossip around vnode metadata that includes the path. Still, that path would have to exist on all your nodes or be created dynamically when a vnode gets assigned.
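Roughly, a sketch of what that might look like (untested and purely illustrative -- the {vnode_path, Partition} metadata key is invented here, and this assumes riak_core's gossiped ring metadata is a reasonable place for per-vnode paths):

    %% Sketch: store a vnode's data path in the gossiped ring metadata.
    set_vnode_path(Partition, Path) ->
        riak_core_ring_manager:ring_trans(
          fun(Ring, _Args) ->
                  {new_ring,
                   riak_core_ring:update_meta({vnode_path, Partition},
                                              Path, Ring)}
          end, []).

    %% Look the path back up from the local copy of the ring.
    get_vnode_path(Partition) ->
        {ok, Ring} = riak_core_ring_manager:get_my_ring(),
        riak_core_ring:get_meta({vnode_path, Partition}, Ring).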

Again, imho, this is an "incredibly bad idea" (tm) that will introduce all sorts of new pain into your life in ways you don't want to think about. The incredible headache really isn't worth the disk-space savings you'd get by mapping vnodes to disks. Save your ops dudes the headache and just use RAID 5 and be done with it. You may just keep your ops from killing themselves - and hey, the life you save may be your own.

(not to say your patch isn't cool though, Joseph)

-Alexander


@siculars on twitter
http://siculars.posterous.com

Sent from my iPhone

On Mar 22, 2011, at 0:47, Joseph Blomstedt <joseph.blomst...@gmail.com> wrote:

Oh, I just realized that was only a partial solution to the problem. I
forgot to commit related logic that handles selecting the same
directory on vnode restart. That's what I get for sending out code
late at night. You'll want to maintain a partition->directory index
somewhere to really make it work (or search all directories for an
existing bitcask corresponding to the partition).
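As a rough sketch, the "search all directories" fallback could look something like this (assuming, as the stock backend does, that each vnode's bitcask lives in a subdirectory named after its partition number):

    %% Sketch: check each configured root for an existing bitcask
    %% directory belonging to this partition before creating a new one.
    find_existing_dir(Partition, Roots) ->
        PartStr = integer_to_list(Partition),
        case [Dir || Root <- Roots,
                     Dir <- [filename:join(Root, PartStr)],
                     filelib:is_dir(Dir)] of
            [Found | _] -> {ok, Found};
            []          -> not_found
        end.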

For what it's worth, my experiments a few months back in this area
just used a deterministic function to map partitions to a directory.
That's another approach.
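For illustration, such a deterministic mapping (a hypothetical helper, not part of the commit above) can be as simple as hashing the partition onto the list of roots, so the same partition always lands in the same directory with no stored index:

    %% Sketch: erlang:phash2/2 returns a stable hash in [0, N), so a
    %% given partition always resolves to the same root directory.
    dir_for_partition(Partition, Roots) ->
        Idx = erlang:phash2(Partition, length(Roots)),
        filename:join(lists:nth(Idx + 1, Roots),
                      integer_to_list(Partition)).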

-Joe

On Tue, Mar 22, 2011 at 1:25 AM, Joseph Blomstedt
<joseph.blomst...@gmail.com> wrote:
Each vnode already opens a separate bitcask, so there's nothing
fundamental preventing the desired behavior; it's just not coded that
way. While an individual bitcask must live in a single directory, there
is no reason all vnodes need to open their bitcasks within a shared
root directory.

Luckily, it's easy to change this behavior. In fact, I played around
with the idea a while back. This question prompted me to find/release
the code:
https://github.com/jtuple/riak_kv/commit/a8ab33224651e6850aed385e4c05c1993916a3e5

That commit should apply against riak-0.14.1. It extends the bitcask
data_root config option to allow multiple root paths as well as a
selection strategy (random or spread). Random simply chooses one of the
directories at random. Spread picks the directory containing the fewest
already-opened bitcasks -- although this is only a soft guarantee,
since no effort is made to handle multiple vnodes choosing a directory
concurrently.
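A spread strategy along those lines could be as simple as the following sketch (a hypothetical helper that counts directories on disk rather than tracking open handles; the race described above still applies):

    %% Sketch: pick the root currently holding the fewest bitcask
    %% directories. Two vnodes starting at the same time can observe
    %% identical counts and pick the same root -- hence the soft guarantee.
    spread(Roots) ->
        Count = fun(Root) ->
                        length([D || D <- filelib:wildcard("*", Root),
                                     filelib:is_dir(filename:join(Root, D))])
                end,
        {_N, Best} = lists:min([{Count(Root), Root} || Root <- Roots]),
        Best.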

Using paths that correspond to different mounted drives should do the trick.
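For example, a config along these lines (the exact option names here are a guess -- check the commit for the real ones):

    %% Hypothetical app.config snippet: one data root per mounted
    %% drive, plus a selection strategy. Option names illustrative only.
    {bitcask, [
        {data_root, ["/mnt/disk1/bitcask",
                     "/mnt/disk2/bitcask",
                     "/mnt/disk3/bitcask"]},
        {data_root_strategy, spread}   %% or: random
    ]}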

-Joe


On Mon, Mar 21, 2011 at 5:29 PM, Greg Nelson <gro...@dropcam.com> wrote:
Hello,

We are currently evaluating Riak for an application that will store large amounts of data in a write-heavy pattern. We'd like to pack many disks into each machine. Currently, it appears that Bitcask uses exactly one directory to store data. What is the best way to have it use multiple disks? Is this something Innostore would handle better?

We'd like to avoid RAID since we'll be paying for redundancy at a higher level with Riak (N=3, etc.).

We'd also like to avoid a JBOD-type setup where a single disk failure brings the whole node down, as we'll obviously be increasing those odds with each disk.

What I'm wondering is: can each node distribute its vnodes across many disks? And if one of those disks fails, will Riak handle that appropriately (i.e., the other vnodes continue to operate normally and hand off data when the new disk comes online)?

Thanks!
Greg