Paulo,
If your data set is large, the vnodes-with-Hadoop issue is moot: you
will get that many splits with or without vnodes. The problems arise when
you don't have much data, and all the extra splits slow everything down to a
crawl because there are 256 times as many tasks created as y
Hello,
According to the DSE 3.1 documentation [1], "DataStax recommends using virtual
nodes only on data centers running purely Cassandra workloads. You should
disable virtual nodes on data centers running either Hadoop or Solr
workloads by setting num_tokens to 1."
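
For reference, a minimal sketch of what that recommendation would look like
in cassandra.yaml on a node in the Hadoop or Solr data center. Only
num_tokens comes from the quoted documentation; the note about initial_token
is my assumption about a typical single-token setup, and the value is
cluster-specific, so it is left unset here.

```yaml
# cassandra.yaml (fragment), analytics/search data center nodes only
# One token per node, i.e. virtual nodes disabled, as the quoted docs suggest.
num_tokens: 1

# Assumption: with a single token per node, each node's token is usually
# assigned by hand. The value depends on your cluster and partitioner.
# initial_token:
```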
There was a thread in this mailing