Thanks Hiller and Shamim. Let me share more details. I want to use Cassandra MapReduce to calculate some KPIs on data that is continuously being written to Cassandra. Fetching all of the data from Cassandra on every run seems like unnecessary overhead to me.
The row key I'm using has the form "(timestamp/60000)_otherid". This CF holds references to the row keys of the actual data, which is stored in another CF. So to calculate a KPI, I would work on one particular minute, fetch the referenced data from the other CF, and process it.
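To make the key layout concrete, here is a minimal sketch in plain Java of how such a key could be built and matched against one minute. It assumes the timestamp is in epoch milliseconds and that "otherid" is an opaque string; the class and method names (MinuteRowKey, minuteBucket, rowKey, inMinute) are made up for illustration and are not part of Cassandra or Hadoop.

```java
// Hypothetical helper, not a Cassandra or Hadoop API.
final class MinuteRowKey {
    // Assumption: timestamps are epoch milliseconds, so 60000 ms = 1 minute.
    private static final long MILLIS_PER_MINUTE = 60_000L;

    private MinuteRowKey() {}

    // Integer-divide the timestamp to get the minute bucket,
    // i.e. the "(timestamp/60000)" part of the key.
    static long minuteBucket(long epochMillis) {
        return Math.floorDiv(epochMillis, MILLIS_PER_MINUTE);
    }

    // Build the full key "<minuteBucket>_<otherId>".
    static String rowKey(long epochMillis, String otherId) {
        return minuteBucket(epochMillis) + "_" + otherId;
    }

    // True if the key belongs to the given minute bucket. The trailing
    // underscore in the prefix keeps bucket 2 from matching keys of bucket 22.
    static boolean inMinute(String rowKey, long bucket) {
        return rowKey.startsWith(bucket + "_");
    }
}
```

For example, MinuteRowKey.rowKey(120000L, "abc") gives "2_abc". Note that a prefix check like inMinute only helps once the keys are already in hand, for instance inside a mapper; it does not by itself avoid reading the rows.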