Re: Virtual node support for Hadoop workloads

2013-10-18 Thread Jeremiah D Jordan
Paulo, if you have large data sizes, the vnodes-with-Hadoop issue is moot: you will get that many splits with or without vnodes. The problems come when you don't have a lot of data, because all the extra splits slow everything down to a crawl: there are 256 times as many tasks created as y
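
To put rough numbers on the point above, here is a minimal back-of-the-envelope sketch. It assumes a simplified model, not the actual input format logic: every token range produces at least one split, and a range is cut further only when it holds more rows than the split size. The function name `estimate_splits`, the 4-node cluster, the row counts, and the 65536-row split size are all illustrative assumptions.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding float rounding."""
    return -(-a // b)


def estimate_splits(total_rows: int, nodes: int, tokens_per_node: int,
                    split_size_rows: int = 65536) -> int:
    """Estimate the number of Hadoop input splits (one map task each).

    Simplified model (an assumption, not the real splitter):
    data is spread evenly over all token ranges, and each range
    yields max(1, ceil(rows_in_range / split_size_rows)) splits.
    """
    ranges = nodes * tokens_per_node
    rows_per_range = ceil_div(total_rows, ranges)
    splits_per_range = max(1, ceil_div(rows_per_range, split_size_rows))
    return ranges * splits_per_range


# Small data set: the per-range minimum of one split dominates.
small_single = estimate_splits(100_000, nodes=4, tokens_per_node=1)    # 4
small_vnodes = estimate_splits(100_000, nodes=4, tokens_per_node=256)  # 1024

# Large data set: the split size dominates, so vnodes barely matter.
large_single = estimate_splits(10_000_000_000, nodes=4, tokens_per_node=1)    # 152588
large_vnodes = estimate_splits(10_000_000_000, nodes=4, tokens_per_node=256)  # 153600

print(small_vnodes / small_single)  # 256.0
print(large_vnodes / large_single)  # about 1.007
```

Under these assumptions the small data set goes from 4 tasks to 1024, the 256x blow-up mentioned above, while the large one changes by under one percent.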

Virtual node support for Hadoop workloads

2013-10-17 Thread Paulo Motta
Hello, According to the DSE 3.1 documentation [1], "DataStax recommends using virtual nodes only on data centers running purely Cassandra workloads. You should disable virtual nodes on data centers running either Hadoop or Solr workloads by setting num_tokens to 1." There was a thread in this mailing