Do you mean running get_range_slices from a single? Yes, that would be reasonable for a relatively small key range. But when it comes to analyzing a really big range in a really big data collection (like the one we are currently populating), distributing the reads across the cluster seems to be the only reasonable solution.
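To make the idea of distributing the reads concrete, here is a minimal sketch of how a large time range could be cut into smaller key ranges, each of which could then be handed to a separate worker doing its own range scan. The function name `split_key_range` and the one-day step are my own illustration, not anything Cassandra provides, and the sketch assumes the keys sort lexicographically by their timestamp prefix (which, as far as I understand, needs an order-preserving partitioner).

```python
from datetime import datetime, timedelta

# Matches the timestamp prefix of keys like 2009.01.01.00.00.00.RANDOM
KEY_FORMAT = "%Y.%m.%d.%H.%M.%S"


def split_key_range(start, end, step=timedelta(days=1)):
    """Yield (start_key, end_key) prefix pairs covering [start, end) in step-sized chunks.

    Hypothetical helper: each pair is meant to be used as the bounds of
    one independent range scan.
    """
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield cursor.strftime(KEY_FORMAT), upper.strftime(KEY_FORMAT)
        cursor = upper


# Split the 2009.01 to 2009.02 period into one sub-range per day.
ranges = list(split_key_range(datetime(2009, 1, 1), datetime(2009, 2, 1)))
```

Each sub-range only contains the timestamp prefix; a full key such as 2009.01.01.00.00.00.RANDOM sorts between the two bounds of its day because the prefix compares first.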
In the current situation, the best option might be to distribute the range among ColumnFamilies (say, one CF per day) and to empty each CF after analyzing its data, so that it can be reused for another day's range. Can you suggest a workaround for this?

On Fri, Apr 30, 2010 at 3:22 PM, Jonathan Ellis <jbel...@gmail.com> wrote:

> Sounds like doing this w/o m/r with get_range_slices is a reasonable way to
> go.
>
> On Thu, Apr 29, 2010 at 6:04 PM, Utku Can Topçu <u...@topcu.gen.tr> wrote:
> > I'm currently writing collected data continuously to Cassandra, having keys
> > starting with a timestamp and a unique identifier (like
> > 2009.01.01.00.00.00.RANDOM) for being able to query in time ranges.
> >
> > I'm thinking of running periodical mapreduce jobs which will go through a
> > designated time period. I might want to analyze the data only between
> > 2009.01 and 2009.02.
> > I had done this previously with HBase however I thought cassandra would be a
> > better choice for continuously storing data in a safe manner.
> >
> > I guess this briefly explains my designated use case.
> >
> > Best Regards,
> > Utku
> >
> > On Thu, Apr 29, 2010 at 11:32 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >>
> >> It's technically possible but 0.6 does not support this, no.
> >>
> >> What is the use case you are thinking of?
> >>
> >> On Thu, Apr 29, 2010 at 11:14 AM, Utku Can Topçu <u...@topcu.gen.tr>
> >> wrote:
> >> > Hi,
> >> >
> >> > I've been trying to use Cassandra for some kind of a supplementary input
> >> > source for Hadoop MapReduce jobs.
> >> >
> >> > The default usage of the ColumnFamilyInputFormat does a full columnfamily
> >> > scan for using within the MapReduce framework as map input.
> >> >
> >> > However I believe that, it should be possible to give a keyrange to scan the
> >> > specified range.
> >> >
> >> > Is it anymeans possible?
> >> >
> >> > Best Regards,
> >> >
> >> > Utku
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder of Riptano, the source for professional Cassandra support
> >> http://riptano.com
> >
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com