No one had ever tried vnodes with Hadoop until the OP did, or they would have noticed this. Judging from the last ticket I mentioned, no one had used them extensively with secondary indexes either.
My mistake, they are not a default. I do think vnodes are awesome, and it's great that c* has the longer release cycle. Just saying I do not know what .0 and .1 releases are. They just seem like extended betas to me.

Edward

On Fri, Feb 15, 2013 at 11:10 PM, Eric Evans <eev...@acunu.com> wrote:
> On Fri, Feb 15, 2013 at 7:01 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>> Seems like the hadoop Input format should combine the splits that are
>> on the same node into the same map task, like Hadoop's
>> CombinedInputFormat can. I am not sure who recommends vnodes as the
>> default, because this is now the second problem (that I know of) of
>> this class where vnodes has extra overhead,
>> https://issues.apache.org/jira/browse/CASSANDRA-5161
>>
>> This seems to be the standard operating practice in c* now, enable
>> things in the default configuration like new partitioners and newer
>> features like vnodes, even though they are not heavily tested in the
>> wild or well understood, then deal with fallout.
>
> Except that it is not in fact enabled by default; The default remains
> 1-token-per-node.
>
> That said, the only way that a feature like this will ever be heavily
> tested in the wild, and well understood, is if it is actually put to
> use. Speaking only for myself, I am grateful to users like Cem who
> test new features and report the issues they find.
>
>> On Fri, Feb 15, 2013 at 11:52 AM, cem <cayiro...@gmail.com> wrote:
>>> Hi All,
>>>
>>> I have just started to use virtual nodes. I set the number of nodes to 256
>>> as recommended.
>>>
>>> The problem that I have is when I run a mapreduce job it creates node * 256
>>> mappers. It creates node * 256 splits. this effects the performance since
>>> the range queries have a lot of overhead.
>>>
>>> Any suggestion to improve the performance? It seems like I need to lower the
>>> number of virtual nodes.
>>>
>>> Best Regards,
>>> Cem
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu
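
For illustration only, here is a rough sketch of the split-combining idea raised in the quoted thread: group the per-vnode splits by the node that owns them, so each node gets a handful of map tasks instead of 256. It is similar in spirit to what Hadoop's CombineFileInputFormat does for small files, but it is not Cassandra's or Hadoop's actual API. `TokenSplit`, `combine_splits_by_host`, and `max_per_task` are hypothetical names made up for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TokenSplit:
    """Hypothetical stand-in for one input split: a token range owned by a host."""
    host: str
    start_token: int
    end_token: int


def combine_splits_by_host(
    splits: List[TokenSplit], max_per_task: int = 256
) -> List[List[TokenSplit]]:
    """Group splits by owning host, then chunk each group into map tasks.

    Each returned inner list represents the work handed to a single map task.
    max_per_task (an assumed tuning knob) caps how many splits one task takes.
    """
    if max_per_task < 1:
        raise ValueError("max_per_task must be at least 1")

    # Bucket every split under the host that owns its token range.
    by_host: Dict[str, List[TokenSplit]] = defaultdict(list)
    for split in splits:
        by_host[split.host].append(split)

    # Cut each host's bucket into chunks of at most max_per_task splits.
    tasks: List[List[TokenSplit]] = []
    for host in sorted(by_host):
        group = by_host[host]
        for i in range(0, len(group), max_per_task):
            tasks.append(group[i:i + max_per_task])
    return tasks
```

With 3 nodes and 256 vnodes each, the 768 per-vnode splits collapse into 3 map tasks under the default cap, or 9 tasks with `max_per_task=100`. Whether this actually helps depends on how much of the overhead comes from task startup versus the range queries themselves, which this sketch does not address.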