Hi All,
   The problem I am trying to address is: store raw files (the files are
in XML format and around 700 MB each) in Cassandra, later fetch them and
process them in a Hadoop cluster, and then write the processed data back to
Cassandra.  Regarding this, I wanted a few clarifications:

1) The FAQ (
https://wiki.apache.org/cassandra/FAQ#large_file_and_blob_storage) says
that a single column value should be limited to around 64 MB, but it also
mentions the JIRA issue https://issues.apache.org/jira/browse/CASSANDRA-16,
which was resolved back in version 0.6.  So, in the present version of
Cassandra (2.0.11), is there still a limit on the size of a file stored in
a column, and if so, what is it?
2) Can I replace HDFS with Cassandra, so that I don't have to sync/fetch
the files from Cassandra to HDFS when I want to process them in the Hadoop
cluster?
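As a workaround for the size limit, I was considering splitting each file
into chunks keyed by a file id and a chunk index before inserting.  A
minimal sketch of the chunking logic (names and the 64 MB chunk size are
just my assumptions, not a real schema):

```python
# Hypothetical sketch: split a large blob into fixed-size chunks keyed by
# chunk index, so each Cassandra column stays under the practical limit.
# The actual inserts (e.g. via the DataStax Python driver) are omitted.

CHUNK_SIZE = 64 * 1024 * 1024  # assumed 64 MB per chunk


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (chunk_index, chunk_bytes) pairs for a blob."""
    for offset in range(0, len(data), chunk_size):
        yield offset // chunk_size, data[offset:offset + chunk_size]


def reassemble(chunks):
    """Rebuild the original blob from (chunk_index, bytes) pairs."""
    return b"".join(part for _, part in sorted(chunks))
```

Each (file_id, chunk_index) pair would then map to one row/column, and a
fetch would read the chunks in index order and concatenate them.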

Regards,
Seenu.