> The reason for me looking at virtual nodes is because of terrible experiences
> we had with 0.8 repairs and as per documentation (and logically) the virtual
> nodes seems like it will help repairs being smoother. Is this true?

I've not thought too much about how they help repairs run smoother. What was the documentation you read?

> Also how to get the right number of virtual nodes?

Use the default, 256.

Hope that helps.

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/08/2013, at 7:39 AM, rash aroskar <rashmi.aros...@gmail.com> wrote:

> Thanks for helpful responses. The upgrade from 0.8 to 1.2 is not direct, we
> have set up a test cluster where we did upgrade from 0.8 to 1.1 and then 1.2.
> Also we will do a whole different cluster with 1.2, the 0.8 cluster will not
> be upgraded. But the data will be moved from 0.8 cluster to 1.2 cluster.
> The reason for me looking at virtual nodes is because of terrible experiences
> we had with 0.8 repairs and as per documentation (and logically) the virtual
> nodes seems like it will help repairs being smoother. Is this true? Also how
> to get the right number of virtual nodes? David suggested 64 vnodes for 20
> machines. Is there a formula or a thought process to be followed to get this
> number right?
>
>
> On Mon, Jul 29, 2013 at 4:15 AM, aaron morton <aa...@thelastpickle.com> wrote:
> I would *strongly* recommend against upgrading from 0.8 directly to 1.2.
> Skipping a major version is generally not recommended, skipping 3 would seem
> like carelessness.
>
>> I second Romain, do the upgrade and make sure the health is good first.
>
> +1 but I would also recommend deciding if you actually need to use virtual
> nodes. The shuffle process can take a long time and people have had mixed
> experiences with it.
>
> If you wanted to move to 1.2 and get vNodes I would consider spinning up a
> new cluster and bulk loading into it. You could do an initial load and then
> do delta loads using snapshots, there would however be a period of stale data
> in the new cluster until the last delta snapshot is loaded.
>
> Cheers
>
> -----------------
> Aaron Morton
> Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 27/07/2013, at 3:36 AM, David McNelis <dmcne...@gmail.com> wrote:
>
>> I second Romain, do the upgrade and make sure the health is good first.
>>
>> If you have or plan to have a large number of nodes, you might consider
>> using fewer than 256 as your initial vnodes amount. I think that number is
>> inflated from reasonable in the docs, as we've had some people talk about
>> potential performance degradation if you have a large number of nodes and a
>> very high number of vnodes. If I had it to do over again, I'd have done 64
>> vnodes as my default (across 20 nodes).
>>
>> Another thing to be very cognizant of before shuffle is disk space. You
>> *must* have less than 50% used in order to do the shuffle successfully
>> because no data is removed (cleaned) from a node during the shuffle process
>> and the shuffle process essentially doubles the amount of data until you're
>> able to run a clean.
>>
>>
>> On Fri, Jul 26, 2013 at 11:25 AM, Romain HARDOUIN
>> <romain.hardo...@urssaf.fr> wrote:
>> Vnodes are a great feature. More nodes are involved during operations such
>> as bootstrap, decommission, etc.
>> DataStax documentation is definitely a must read.
>> That said, if I were you, I'd wait somewhat before shuffling the ring. I'd
>> focus on cluster upgrade and monitoring the nodes (number of file handles,
>> memory usage, latency, etc).
>> Upgrading from 0.8 to 1.2 can be tricky, there are so many changes since
>> then. Be careful about compaction strategies you choose and double check the
>> options.
>>
>> Regards,
>> Romain
>>
>> rash aroskar <rashmi.aros...@gmail.com> wrote on 25/07/2013 23:25:11:
>>
>> > From: rash aroskar <rashmi.aros...@gmail.com>
>> > To: user@cassandra.apache.org,
>> > Date: 25/07/2013 23:25
>> > Subject: cassandra 1.2.5- virtual nodes (num_token) pros/cons?
>> >
>> > Hi,
>> > I am upgrading my cassandra cluster from 0.8 to 1.2.5.
>> > In cassandra 1.2.5 the 'num_token' attribute confuses me.
>> > I understand that it distributes multiple tokens per node but I am
>> > not clear how that is helpful for performance or load balancing. Can
>> > anyone elaborate? has anyone used this feature and knows its
>> > advantages/disadvantages?
>> >
>> > Thanks,
>> > Rashmi
>> >
>
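
On the original question of how multiple tokens per node can help load balancing, here is a minimal Python sketch. It is not Cassandra code: it assumes each node simply picks its tokens at random on a ring (a simplification of what happens when num_tokens is set in cassandra.yaml), and the ring size, the function names ownership() and spread(), and the seed are all made up for illustration. It only shows the general tendency that more tokens per node evens out how much of the ring each node owns.

```python
import random

RING_SIZE = 2**64  # illustrative token space, not Cassandra's exact range


def ownership(num_nodes, tokens_per_node, seed=0):
    """Return the fraction of the ring owned by each node (hypothetical model)."""
    rng = random.Random(seed)
    # Assumption: every node picks tokens_per_node random positions on the ring.
    ring = sorted(
        (rng.randrange(RING_SIZE), node)
        for node in range(num_nodes)
        for _ in range(tokens_per_node)
    )
    owned = [0] * num_nodes
    # A token owns the range from the previous token (exclusive) up to itself.
    prev = ring[-1][0] - RING_SIZE  # wrap around from the last token on the ring
    for token, node in ring:
        owned[node] += token - prev
        prev = token
    return [o / RING_SIZE for o in owned]


def spread(fractions):
    """Ratio of the largest ownership share to the smallest."""
    return max(fractions) / min(fractions)


if __name__ == "__main__":
    # 20 nodes, as in David's cluster; 1 token per node mimics the pre-vnode setup
    # when tokens are not balanced by hand.
    for tokens in (1, 64, 256):
        ratio = spread(ownership(20, tokens, seed=1))
        print(f"{tokens:>3} tokens/node: max/min ownership ratio = {ratio:.2f}")
```

In this toy model the max/min ratio should shrink as the token count grows, which is the load-balancing argument for vnodes. It says nothing about the per-token overhead David mentions, so it cannot settle the 64 versus 256 question on its own.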