Riak Enterprise: can it be used to migrate to a new configuration?
Can Riak Enterprise replicate between rings where each ring has a different number of partitions? Our five-node ring was originally configured with 64 partitions, and I saw that Basho is recommending 512 for that number of machines. Any ideas on how to make as-painless-a-migration-as-possible are welcome, of course! -- Dave Brady ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Riak Enterprise: can it be used to migrate to a new configuration?
Yes, we have done exactly that. When we migrated from 256 to 128 partitions in a live dual-cluster system, we took one cluster down, wiped the data, changed the number of partitions, brought it back up and synced all data back with a full sync. Then we did the same with the other cluster.

However, I must disagree with the recommendation of 512 partitions for 5 nodes. You should go for 128 or 256 unless you plan on scaling out to 10+ nodes per cluster.

There are downsides to having many partitions. The price of the higher granularity is that more storage backend processes use more resources for housekeeping. If you use the multi-backend, the resources used are multiplied yet again by the number of backends, because each vnode will run a backend process per configured backend.

Say you go with 512 partitions and have a multi-backend config with 4 backends, because you need to back up 4 different types of data independently. That gives you about 2k running backend processes across the cluster — roughly 412 per node actively in use in a normal running scenario, and more when you're doing handoff. That's a lot of resources just to run these, which you might otherwise have used for doing business.

When you increase the number of partitions you should consider:
- Number of open files, especially when using eleveldb.
- Late triggering of bitcask compaction. The default is no compaction of any file before it hits 2 GB. That means up to 2 GB of dead space per vnode. This can, however, be configured down to something smaller than the 2 GB default, which is crazy high in almost any use case involving deletes, expiry or updates of data.
- The leveldb cache is per vnode, so you need to lower that number in order not to use all memory, which would lead to death by swapping.
- With a high number of vnodes per node, each vnode's leveldb cache will be comparatively small, leading to (slightly) less efficient cache usage.

Please be in touch if you need onsite or offsite professional assistance configuring, testing or running your Riak clusters.

BR Rune Skou Larsen
Trifork - We do Riak

--
Best regards / Venlig hilsen
Rune Skou Larsen
Trifork Public A/S / Team Riak
Margrethepladsen 4, 8000 Århus C, Denmark
Phone: +45 3160 2497  Skype: runeskoularsen  twitter: @RuneSkouLarsen

On 19-10-2012 12:38, Dave Brady wrote:
Can Riak Enterprise replicate between rings where each ring has a different number of partitions? Our five-node ring was originally configured with 64 partitions, and I saw that Basho is recommending 512 for that number of machines. Any ideas on how to make as-painless-a-migration-as-possible are welcome, of course! -- Dave Brady

___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
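To put rough numbers on the backend-process point above (assuming vnodes are spread evenly across the nodes, and ignoring handoff):

    512 partitions / 5 nodes      ~= 103 vnodes per node
    103 vnodes x 4 backends       ~= 412 backend processes per node
    512 partitions x 4 backends    = 2048 backend processes cluster-wide
    (with 64 partitions: ~13 vnodes and ~52 backend processes per node)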
Re: Is Riak suitable for a small-record write-intensive billion-records application?
Riak is all about high availability. If eventually consistent data is not a problem, OR you can cover those aspects of the CAP trade-off with an in-memory caching system and some sort of locking mechanism to emulate the core atomic action of your application (put-if-absent), then I would say you are in the right place.

Now, Riak uses Bloom filters and hashing code from Google (this is not my expertise, though, so I could be wrong); you should be fine letting Riak manage your hashing and equality concerns. From the Java world: if object A equals object B, then A's hash equals B's hash — but not the other way around; two objects can have the same hash and still not be equal. Maybe that is what you are referring to by "symmetric". Hashing a key is a no-brainer job if Riak delegates it to whatever best-practice hashing algorithm should be used; I strongly believe they are using Google's algorithms — everything indicates they do, because of the Bloom filters they are using from Google (they should tie the two concepts together somehow).

All this said, it is in your hands, with the right tools, to have an in-memory cache and a locking mechanism.

HTH!!!

Guido.

On 19/10/12 07:10, Yassen Damyanov wrote:
On Fri, Oct 19, 2012, Yassen Damyanov wrote: Whatever the solution, it needs to be symmetric, that is, all nodes must be equivalent.
With "symmetric" I mean more "interchangable" than "functionally equal". That is, if a node plays a central role and goes down, the system should be able to pick a new "master" on its own and any other node should be able to become such.
Guys, your input is MUCH appreciated. Thank you!
Yassen

___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
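A minimal Java illustration of the equals/hashCode contract referred to above (the class name is made up for the example):

    import java.util.Objects;

    final class PlateNumber {
        private final String value;

        PlateNumber(String value) { this.value = value; }

        // Equal objects MUST produce equal hashes...
        @Override public boolean equals(Object o) {
            return o instanceof PlateNumber && ((PlateNumber) o).value.equals(value);
        }

        @Override public int hashCode() { return Objects.hash(value); }
    }

    // ...but equal hashes do NOT imply equal objects. For instance,
    // "FB".hashCode() == "Ea".hashCode() is true for java.lang.String,
    // while "FB".equals("Ea") is false.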
Re: Is Riak suitable for a small-record write-intensive billion-records application?
On Fri, Oct 19, 2012 at 6:57 AM, Guido Medina wrote: > Riak is all about high availability, if eventually consistent data is not a > problem What is the 'eventually consistent' result of simultaneous inserts of different values for a new key at different nodes? Does partitioning affect this case? > OR, you can cover those aspects of the CAP concept with an in-memory > caching system and a sort of a locking mechanism to emulate the core atomic > action of your application (put-if-absent) then I would say, you are in the > right place, What happens if the partitioning that riak is so concerned about happens between the inserter and the lock - or the nodes providing redundancy for the lock? > All this said, it is at your hands and tools to have an in-memory cache and > locking mechanism. If you have more than one writer, doesn't this need to be just as distributed and robust as riak? -- Les Mikesell lesmikes...@gmail.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Is Riak suitable for a small-record write-intensive billion-records application?
It depends. If you have siblings enabled on the bucket, then you need to resolve the conflicts using the object's vclock; if you are not using siblings, last write wins. Either way, I haven't had good results delegating that task to Riak: with siblings, the write rate eventually overwhelmed Riak and made it fail (due to LevelDB write speed?), and with last write wins I don't think you would want the unexpected results. Hence my recommendation: we use two things to resolve such issues — an in-memory cache plus a locking mechanism.

If you are concerned about the speed of the locking mechanism, you can use MapMaker from the Guava framework (at least in Java), which provides a configurable concurrency level and keeps your application, concurrently speaking, fast! For the cache you could use either Guava or EHCache. Now, what I don't have is a distributed locking mechanism (one of these days I will build a distributed re-entrant locking mechanism based on REST, just for the sake of it).

For the last quote: a well-designed locking mechanism will always take care of that.

Regards,

Guido.

On 19/10/12 13:42, Les Mikesell wrote:
On Fri, Oct 19, 2012 at 6:57 AM, Guido Medina wrote: Riak is all about high availability, if eventually consistent data is not a problem
What is the 'eventually consistent' result of simultaneous inserts of different values for a new key at different nodes? Does partitioning affect this case?
OR, you can cover those aspects of the CAP concept with an in-memory caching system and a sort of a locking mechanism to emulate the core atomic action of your application (put-if-absent) then I would say, you are in the right place,
What happens if the partitioning that riak is so concerned about happens between the inserter and the lock - or the nodes providing redundancy for the lock?
All this said, it is at your hands and tools to have an in-memory cache and locking mechanism.
If you have more than one writer, doesn't this need to be just as distributed and robust as riak?

___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
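To make the cache + locking recommendation concrete — a minimal sketch of a local put-if-absent guard built on Guava's MapMaker, as mentioned above (the class and method names are made up, and this only coordinates writers inside a single JVM, which is exactly Les's objection):

    import com.google.common.collect.MapMaker;
    import java.util.concurrent.ConcurrentMap;

    public class LocalPutIfAbsentGuard {
        // MapMaker builds a ConcurrentMap with a configurable concurrency
        // level (the number of internal lock stripes).
        private final ConcurrentMap<String, String> cache =
                new MapMaker().concurrencyLevel(16).makeMap();

        /**
         * Returns true only for the first caller to claim the key in this
         * JVM; only that caller should go on and write the key to Riak.
         */
        public boolean claim(String key, String value) {
            return cache.putIfAbsent(key, value) == null;
        }
    }

Across several writer processes you would still need the distributed lock Guido says he doesn't have yet — which is what the Chubby/DLM link later in the thread is about.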
Re: Is Riak suitable for a small-record write-intensive billion-records application?
On Fri, Oct 19, 2012 at 8:02 AM, Guido Medina wrote: > It depends, if you have siblings enabled at the bucket, then you need to > resolve the conflicts using the object vclock, How does that work for simultaneous initial inserts? > if you are not using > siblings, last write wins, either way, I haven't got any good results by > delegating that tasks to Riak, with siblings, eventually I ran Riak out in > speed of the writes making Riak fail (Due to LevelDB write speed?). And with > last write wins then I don't think you would want unexpected results, and > hence my recommendation: We use two things to resolve such issues; in-memory > cache + locking mechanism. The problem is where the inserting client should handle new keys and updates differently, or at least be aware that its insert failed or will be ignored later. > For the last quote, the locking mechanism if well designed will always take > care of that. If it is easy, why doesn't riak handle it? -- Les Mikesell lesmikes...@gmail.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Is Riak suitable for a small-record write-intensive billion-records application?
A locking mechanism on a single server is easy; on a cluster it is not — that's why you don't see many multi-master databases, right? Riak instead focuses on high availability and partition tolerance, not consistency. If you notice, consistency is tied to locking (a single access per key at a time), so you have to decide which one you focus on.

From the Java world, and specifically the ConcurrentMap idiom, you use put-if-absent, or in pseudo-language:

    lock(key)   // synchronized block or re-entrant lock, doesn't matter
    {
        ...
    }

Once you get the lock, you verify whether the key exists: if it doesn't, create it; if it does, exit the lock ASAP, since this is meant to be a very quick atomic operation.

Regarding siblings: Riak allows you to create many copies of the same key, and when you fetch that key you get all the copies, so YOU have to figure out how to assemble a consistent copy of your data based on all the written versions you have (because there is no distributed lock per key).

I don't think I can explain it in two more paragraphs; you will have to watch this presentation: http://www.slideshare.net/seancribbs/eventuallyconsistent-data-structures

I'm limited to a certain level...

On 19/10/12 16:32, Les Mikesell wrote:
On Fri, Oct 19, 2012 at 8:02 AM, Guido Medina wrote: It depends, if you have siblings enabled at the bucket, then you need to resolve the conflicts using the object vclock,
How does that work for simultaneous initial inserts?
if you are not using siblings, last write wins, either way, I haven't got any good results by delegating that tasks to Riak, with siblings, eventually I ran Riak out in speed of the writes making Riak fail (Due to LevelDB write speed?). And with last write wins then I don't think you would want unexpected results, and hence my recommendation: We use two things to resolve such issues; in-memory cache + locking mechanism.
The problem is where the inserting client should handle new keys and updates differently, or at least be aware that its insert failed or will be ignored later.
For the last quote, the locking mechanism if well designed will always take care of that.
If it is easy, why doesn't riak handle it?

___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
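On the siblings point, the pattern the linked slides describe is to store a value you know how to merge (a set, a counter, etc.) and fold all sibling versions together on read. A rough, client-library-agnostic sketch of a set-union merge — the Sibling type here is hypothetical, standing in for whatever your Riak client hands back:

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class SiblingResolver {

        /** Hypothetical view of one stored version of the object. */
        public interface Sibling {
            Set<String> members();   // the value, decoded as a set of strings
        }

        /**
         * Merge all sibling versions by set union. Union is commutative,
         * associative and idempotent, so the result does not depend on the
         * order in which siblings are seen. Deletes need extra care
         * (tombstones), which is one reason last-write-wins is not a
         * general substitute for a proper merge.
         */
        public Set<String> resolve(List<Sibling> siblings) {
            Set<String> merged = new HashSet<String>();
            for (Sibling s : siblings) {
                merged.addAll(s.members());
            }
            return merged;
        }
    }

The merged value is then written back with the vclock returned by the fetch, so Riak knows the siblings it handed out have been resolved.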
Re: MapReduce questions
So... no answers. I guess there are no smart minds at Basho working on M/R currently. Too bad, but I guess a company has to choose its priority. On Tue, Oct 16, 2012 at 11:03 AM, Callixte Cauchois wrote: > Hi there, > > as part of my evaluation of Riak, I am looking at the M/R capabilities and > I have several questions: > 1/ the doc states that " Riak MapReduce is intended for batch processing, > not real-time querying." But as of now, you always get the results and > cannot automatically store them in a bucket like you would do with MongoDB > for example. Is taht something that is on the roadmap? > 2/ Basho blog has two articles from last year on Hadoop on top of Riak. Is > this project still live? It is on github, so I can always dig it up and > work on it, but I am wondering if there is something like Yokozuna for > Hadoop integration. > > Thanks. > C. > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Is Riak suitable for a small-record write-intensive billion-records application?
About distributed locking mechanism, you might wanna take a look at Google services, something called Chubby? Ctrl + F on that link: http://en.wikipedia.org/wiki/Distributed_lock_manager Regards, Guido. On 19/10/12 16:47, Guido Medina wrote: Locking mechanism on a single server is easy, on a cluster is not, that's why you don't see too many multi masters databases right? Riak instead focused on high availability and partitioning, but no consistency, if you notice, consistency is related with locking, with 1 single access per key, so you have to decide which one you focus. From the Java world and specifically from the ConcurrentMap idiom, you use put-if-absent or As Pseudo language you: *lock(key) (be synchronized or re-entrant lock, doesn't matter)* { ... .. } Once you get the lock, you verify if exists, if it doesn't create it, if it does, exit the lock ASAP, since it is meant to be a very quick atomic operation. Regarding siblings, Riak allow you to create many copies of the same key, and when you fetch that key, you get all the copies so YOU have to figure out how to assemble a consistent copy of your data base on all the written versions you have (because there is no distribute lock per key) I don't think I can explain in two more paragraph, you will have to watch this presentation: http://www.slideshare.net/seancribbs/eventuallyconsistent-data-structures I'm limited to a certain level... On 19/10/12 16:32, Les Mikesell wrote: On Fri, Oct 19, 2012 at 8:02 AM, Guido Medina wrote: It depends, if you have siblings enabled at the bucket, then you need to resolve the conflicts using the object vclock, How does that work for simultaneous initial inserts? if you are not using siblings, last write wins, either way, I haven't got any good results by delegating that tasks to Riak, with siblings, eventually I ran Riak out in speed of the writes making Riak fail (Due to LevelDB write speed?). And with last write wins then I don't think you would want unexpected results, and hence my recommendation: We use two things to resolve such issues; in-memory cache + locking mechanism. The problem is where the inserting client should handle new keys and updates differently, or at least be aware that its insert failed or will be ignored later. For the last quote, the locking mechanism if well designed will always take care of that. If it is easy, why doesn't riak handle it? ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: riak search - creating many indexes for one inserted object
Pawel,

On Tue, Oct 9, 2012 at 5:21 PM, kamiseq wrote:
> hi all,
> right now we are using solr as search index and we are inserting data manually. so there is nothing to stop us from creating many indexes (sort of views) on same entity, aggregate data and so on.
> can something like that be achieved with riak search??

Just to be sure I understand you: when you say "many indexes" do you mean something like writing to multiple Solr cores? If so, no, Riak Search cannot do that. It writes to an index named after the bucket you have the hook on.

> I think that commit hooks are good point to start with but as I read search index is kept in different format than bucket data and I would love to still use solr-like api to search the index.

Yes, Riak Search stores index data in a backend called merge_index. Riak Search has a Solr _like_ interface, but it lacks many features and doesn't have the same semantics or performance characteristics.

There is a new project underway called Yokozuna which tightly integrates Riak and Solr. If you like Solr then keep an eye on this. I'm looking for people who want to prototype on it, so if that interests you please email me directly.

https://github.com/rzezeski/yokozuna

> example
> I have two entities cars and parking_lots, each car references parking lot it belongs to.
> when I create/update/delete car object I would like to not only update car index (so I can search by car type, name, number plates, etc) but also update parking index to easily check how many cars given lot has (plus search lots by cars, or search cars with given property).

Why have a separate index at all? Is it not good enough to have just the car index? Each doc would have a 'parking_lot_s' field.

"How many cars a given lot has" -- would be numFound on q=parking_lot_s:foo.

"Search lots by cars" -- I'm guessing you mean something like "tell me what lots have cars like this", which sounds like a facet on 'parking_lot_s', right?

"Search cars with a given property" -- like the last query but no facet.

> probably all this can be achieved in many other ways. I can imagine storing array of direct references in parking object and update this object when car object also changed. but this way I need to issue two asynchronous write request with no guaranties that both will be persisted.

Yes. This is a problem with two Solr cores as well.

I'm not sure if this is a toy example but I don't see the need for 2 indexes. I potentially see 2 buckets: 'cars' and 'lots'. But that doesn't mean it has to be two indexes.

Does that make sense?

-Z

___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
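For the numFound idea above, a minimal sketch of hitting Riak Search's Solr-like HTTP interface from Java. The /solr/<index>/select path and port 8098 are the documented defaults, and the cars/parking_lot_s/lot42 names are just the example from this thread — adjust all of them for a real deployment; the rows=0 parameter only suppresses the returned documents and can be dropped if your version objects:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLEncoder;

    public class LotCarCount {
        public static void main(String[] args) throws Exception {
            // Ask the 'cars' index how many docs reference parking lot "lot42".
            // The numFound attribute in the response is the count.
            String q = URLEncoder.encode("parking_lot_s:lot42", "UTF-8");
            URL url = new URL("http://localhost:8098/solr/cars/select?q=" + q + "&rows=0");

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"));
            for (String line; (line = in.readLine()) != null; ) {
                System.out.println(line);
            }
            in.close();
        }
    }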
Re: MapReduce questions
On Fri, Oct 19, 2012 at 8:48 AM, Callixte Cauchois wrote: > So... no answers. > I guess there are no smart minds at Basho working on M/R currently. Too bad, > but I guess a company has to choose its priority. A lovely "good morning" to you, too. > as part of my evaluation of Riak, I am looking at the M/R capabilities and I have several questions: 1/ the doc states that " Riak MapReduce is intended for batch processing, not real-time querying." But as of now, you always get the results and cannot automatically store them in a bucket like you would do with MongoDB for example. Is taht something that is on the roadmap? > This isn't something that's on the roadmap. That doesn't mean we won't do it, but no one is working on code that does this at the moment. That said, I'm told by one of our smart, busy minds here that works on m/r that it would not be the most complicated addition. It may well end up being part of future work but, again, there's no timeline on it. In the mean time, any use case details you could share (short of "like you would do with MongoDB") that would help inform the development would be stellar. If you were feeling adventurous and wanted to write some erlang code, this behavior could be written as part of map or reduce job. > 2/ Basho blog has two articles from last year on Hadoop on top of Riak. Is this project still live? It is on github, so I can always dig it up and work on it, but I am wondering if there is something like Yokozuna for Hadoop integration. > The work on this specific code has stalled. That said, there are some people using Riak in production alongside hadoop, and we have plans to build deeper integration between the two at some point. As an aside, if you're looking for lower latency responses on implementation questions, you might want to consider hanging out in #riak on freenode. Mark twitter.com/pharkmillups > > > On Tue, Oct 16, 2012 at 11:03 AM, Callixte Cauchois > wrote: >> >> Hi there, >> >> as part of my evaluation of Riak, I am looking at the M/R capabilities and >> I have several questions: >> 1/ the doc states that " Riak MapReduce is intended for batch processing, >> not real-time querying." But as of now, you always get the results and >> cannot automatically store them in a bucket like you would do with MongoDB >> for example. Is taht something that is on the roadmap? >> 2/ Basho blog has two articles from last year on Hadoop on top of Riak. Is >> this project still live? It is on github, so I can always dig it up and work >> on it, but I am wondering if there is something like Yokozuna for Hadoop >> integration. >> >> Thanks. >> C. > > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: MapReduce questions
Hey Mark, really sorry if I sounded aggressive or whatever. English is not my primary language and sometimes I do not sound the way I intend... I just wanted to acknowledge that no answer was kind of an answer to my questions.

And yes, I will share how I would like M/R to behave, for future reference, even if it is not going to be implemented short- or mid-term. Unfortunately, I do not speak Erlang, so it would be hard for me to contribute, even though I would love to. Maybe I'll feel extra adventurous and give it a try, though.

Thank you for your response. I also wanted to say that I love the new documentation site — compared to how much of a struggle it can be to understand how some competitors' products are even installed, it really kicks ass.

C.

On Fri, Oct 19, 2012 at 9:52 AM, Mark Phillips wrote:
> On Fri, Oct 19, 2012 at 8:48 AM, Callixte Cauchois wrote:
> > So... no answers.
> > I guess there are no smart minds at Basho working on M/R currently. Too bad, but I guess a company has to choose its priority.
>
> A lovely "good morning" to you, too.
>
> > as part of my evaluation of Riak, I am looking at the M/R capabilities and I have several questions:
> > 1/ the doc states that "Riak MapReduce is intended for batch processing, not real-time querying." But as of now, you always get the results and cannot automatically store them in a bucket like you would do with MongoDB for example. Is taht something that is on the roadmap?
>
> This isn't something that's on the roadmap. That doesn't mean we won't do it, but no one is working on code that does this at the moment. That said, I'm told by one of our smart, busy minds here that works on m/r that it would not be the most complicated addition. It may well end up being part of future work but, again, there's no timeline on it. In the mean time, any use case details you could share (short of "like you would do with MongoDB") that would help inform the development would be stellar.
>
> If you were feeling adventurous and wanted to write some erlang code, this behavior could be written as part of map or reduce job.
>
> > 2/ Basho blog has two articles from last year on Hadoop on top of Riak. Is this project still live? It is on github, so I can always dig it up and work on it, but I am wondering if there is something like Yokozuna for Hadoop integration.
>
> The work on this specific code has stalled. That said, there are some people using Riak in production alongside hadoop, and we have plans to build deeper integration between the two at some point.
>
> As an aside, if you're looking for lower latency responses on implementation questions, you might want to consider hanging out in #riak on freenode.
>
> Mark
> twitter.com/pharkmillups
>
> > On Tue, Oct 16, 2012 at 11:03 AM, Callixte Cauchois <ccauch...@virtuoz.com> wrote:
> >> Hi there,
> >> as part of my evaluation of Riak, I am looking at the M/R capabilities and I have several questions:
> >> 1/ the doc states that "Riak MapReduce is intended for batch processing, not real-time querying." But as of now, you always get the results and cannot automatically store them in a bucket like you would do with MongoDB for example. Is taht something that is on the roadmap?
> >> 2/ Basho blog has two articles from last year on Hadoop on top of Riak. Is this project still live? It is on github, so I can always dig it up and work on it, but I am wondering if there is something like Yokozuna for Hadoop integration.
> >> Thanks.
> >> C.
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: Riak Enterprise: can it be used to migrate to a new configuration?
Dave, 64 is fine for a 6 node cluster. Rune gives a great rundown of the downsides of large rings on small numbers of machines in his post. Usually our recommendation is for ~10 ring partitions per physical machine, rounded up to the next power of two. Where did you see the recommendation for 512 from us? Rune, Basho's replication won't work in the situation that you've described. Are you talking about an in-house replication product? Our full-sync doesn't work between clusters of different ring sizes. On Fri, Oct 19, 2012 at 4:50 AM, Rune Skou Larsen wrote: > Yes, we have done excatly that. When we migrated from 256 to 128 partitions > in a live dual-cluster system, we took one cluster down. Wiped the data, > changed number of partitions, brought it back up and synced all data back > with a full sync. Then we did the same with the other cluster. > > However, I must disagree with the recomendation of 512 partitions for 5 > nodes. You should go for 128 or 256 unless you plan on scaling out to 10+ > nodes pr. cluster. > > There are downsides to having many partitions. The price of the higher > granularity is that the more storage backend processes use more resources > for housekeeping. If you do multibackend, the ressources used are multiplied > yet again with the number of backends because each vnode will have a number > of running backend processes. > > Say you go with the 512 partitions and have a multibackend config with 4 > backends, because you need to backup 4 different types of data > independently. That gives you 2k running backends on each node of which 412 > will be actively in use in normal running scenario and more when you're > doing handoff. Thats a lot of ressources just to run these, that you might > otherwise have used for doing business. > > When you increase the number of partitions you should consider: > - Number of open files. Especially when using eleveldb. > - Late triggering of bitcask compaction. The default is no compaction of any > file before it hits 2GB. That means up to 2G of dead space per vnode. This > can however be configured down to a smaller number than the 2 gigs, which is > crazy high in almost any use case involving delete, expiry or update of > data. > - Leveldb cache is pr. vnode, so you need to lower the number, in order to > not use all memory, which will lead to death by swapping. > - With a high number of vnodes pr. node, each vnode's leveldb cache will be > comparatively small leading to (slighty) less effecient cache usage. > > Please be in touch if you need onsite or offsite professional assistance > configuring, testing or running your Riak clusters. > > BR Rune Skou Larsen > > Trifork > - We do Riak PS. > > -- > > Best regards / Venlig hilsen > > Rune Skou Larsen > Trifork Public A/S / Team Riak > Margrethepladsen 4, 8000 Århus C, Denmark > Phone: +45 3160 2497 Skype: runeskoularsen twitter: @RuneSkouLarsen > > > > Den 19-10-2012 12:38, Dave Brady skrev: > > Can Riak Enterprise replicate between rings where each ring has a different > number of partitions? > > Our five-node ring was originally configured with 64 partitions, and I saw > that Basho is recommending 512 for that number of machines. > > Any ideas on how to make as-painless-a-migration-as-possible are welcome, of > course! 
> > -- > Dave Brady > > > > ___ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
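For reference, the rule of thumb above works out like this for the cluster sizes discussed in this thread:

    5 machines  x ~10 partitions/machine = 50  -> next power of two = 64
    10 machines x ~10 partitions/machine = 100 -> next power of two = 128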