Hi Tharindu,

Have a look at Brisk (http://www.datastax.com/products/brisk). It integrates Hadoop with Cassandra and ships with Hive for SQL-style analysis. You can then install Sqoop (http://www.cloudera.com/downloads/sqoop/) on top of Hadoop to import and export data between Hadoop and MySQL.

Does this sound OK to you?
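To make the hourly task from your message concrete, here is a rough sketch of its logic in plain Python. It is only an illustration and not Brisk, Hive or Sqoop code: the dicts and the list are in-memory stand-ins for column family X, column family Y and the MySQL table, the names aggregate_rows and run_job are made up, and I am assuming "aggregate" means summing the columns.

```python
def aggregate_rows(rows, columns=("A", "B", "C")):
    """Sum the chosen columns over (key, row) pairs.

    Returns the totals and the key of the last row seen.
    Summing is an assumption; the original task only says "aggregate".
    """
    totals = dict.fromkeys(columns, 0)
    last_key = None
    for key, row in rows:
        for col in columns:
            # Missing columns count as 0, since rows may be sparse.
            totals[col] += row.get(col, 0)
        last_key = key
    return totals, last_key


def run_job(column_family_x, column_family_y, checkpoint_table, run_id):
    """One run of the job, using in-memory stand-ins for the real stores."""
    # Read rows in key order so "last aggregated row" is well defined.
    totals, last_key = aggregate_rows(sorted(column_family_x.items()))
    # Stand-in for writing the aggregate back to column family Y.
    column_family_y[run_id] = totals
    # Stand-in for inserting the last aggregated row into the MySQL table.
    checkpoint_table.append({"run_id": run_id, "last_row": last_key})
    return totals, last_key


if __name__ == "__main__":
    x = {"row1": {"A": 1, "B": 2, "C": 3}, "row2": {"A": 10, "B": 20}}
    y, checkpoints = {}, []
    run_job(x, y, checkpoints, "2011-08-29T10")
    print(y)
    print(checkpoints)
```

In a Hadoop setup the aggregation step would be what a Hive or Pig job does, and the last step is the kind of export Sqoop is meant for.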
2011/8/29 Tharindu Mathew <mcclou...@gmail.com>

> Hi,
>
> I have an already running system where I define a simple data flow (using a
> simple custom data flow language) and configure jobs to run against stored
> data. I use quartz to schedule and run these jobs and the data exists on
> various data stores (mainly Cassandra but some data exists in RDBMS like
> mysql as well).
>
> Thinking about scalability and already existing support for standard data
> flow languages in the form of Pig and HiveQL, I plan to move my system to
> Hadoop.
>
> I've seen some efforts on the integration of Cassandra and Hadoop. I've
> been reading up and still am contemplating on how to make this change.
>
> It would be great to hear the recommended approach of doing this on Hadoop
> with the integration of Cassandra and other RDBMS. For example, a sample
> task that already runs on the system is "once in every hour, get rows from
> column family X, aggregate data in columns A, B and C and write back to
> column family Y, and enter details of last aggregated row into a table in
> mysql"
>
> Thanks in advance.
>
> --
> Regards,
>
> Tharindu
>

--
*Eric Djatsa Yota*
*Double degree MsC Student in Computer Science Engineering and Communication Networks
Télécom ParisTech (FRANCE) - Politecnico di Torino (ITALY)*
*Intern at AMADEUS S.A.S Sophia Antipolis*
djatsa...@gmail.com
*Tel : 0601791859*