On Thu, Oct 18, 2012 at 12:00 PM, Michael Kjellman <mkjell...@barracuda.com> wrote:
> Unless you have Brisk (however as far as I know there was one fork that got
> it working on 1.0 but nothing for 1.1 and is not being actively maintained
> by Datastax) or go with CFS (which comes with DSE) you are not guaranteed
> all data is on that hadoop node. You can take a look at the forks if
> interested here: https://github.com/riptano/brisk/network but I'd personally
> be afraid to put my eggs in a basket that is certainly not super supported
> anymore.
>
> job.getConfiguration().set("cassandra.consistencylevel.read", "QUORUM");
> should get you started.

This is what I don't understand. With QUORUM you read data from at least two
nodes. If so, you don't benefit from data locality. What's the point of using
Hadoop? I can run an application on any machine(s) and iterate through the
column family. What is the difference?
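
To spell out the arithmetic behind "at least two nodes": this is a minimal sketch that assumes the usual quorum rule, floor(RF / 2) + 1, and a replication factor of 3. The replication factor is my assumption, since it is not stated anywhere in this thread. The class and method names are made up for illustration and are not part of any Cassandra API.

```java
// Hypothetical helper, not a Cassandra class: it only illustrates the quorum rule.
final class QuorumSketch {

    // Number of replicas that must answer a QUORUM read for a given replication factor.
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        // Integer division gives floor(RF / 2).
        return replicationFactor / 2 + 1;
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor, not taken from the thread
        int needed = quorum(rf);
        // At most one of the required replicas can be the local node,
        // so the rest have to be contacted over the network.
        int remoteAtLeast = needed - 1;
        System.out.println("RF=" + rf + " quorum=" + needed
                + " remote replicas contacted (at least)=" + remoteAtLeast);
    }
}
```

So with RF=3 every QUORUM read involves at least one replica besides the local one, which is why I am asking about locality.
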
Thank you, Andrey