Look at contrib/bmt_example, with the caveat that it's usually premature optimization.
On Tue, Jul 13, 2010 at 12:31 PM, Mubarak Seyed <mubarak.se...@gmail.com> wrote:
> Thanks Torsten.
>
> Jonathan's blog post "Fact vs. Fiction" says:
>
>   Fact: It has always been straightforward to send the output of Hadoop jobs
>   to Cassandra, and Facebook, Digg, and others have been using Hadoop like
>   this as a Cassandra bulk-loader for over a year.
>
> Can anyone from Facebook or Digg share details on how to use the Cassandra
> BulkLoader? I could see some details in Arin's presentation on Cassandra @
> Digg about the data load from MySQL -> Hadoop -> Cassandra.
>
> Can someone please help me?
>
> Thanks,
> Mubarak
>
> On Tue, Jul 13, 2010 at 1:27 AM, Torsten Curdt <tcu...@vafer.org> wrote:
>>
>> On Tue, Jul 13, 2010 at 04:35, Mubarak Seyed <mubarak.se...@gmail.com>
>> wrote:
>> > Where can I find the documentation for BinaryMemtable (bmt_example in
>> > contrib) to use CassandraBulkLoader? What input does CassandraBulkLoader
>> > expect? How do I form the input data, and what format should it be in?
>>
>> The code is the documentation, I fear.
>>
>> I'll see if I can get permission to contribute our updated code.
>> We added some command-line fu and are using it to import large TSVs.
>>
>> > Do I need HDFS to store my storage-conf.xml?
>>
>> Why HDFS?
>>
>> The machine running the bulk loader joins the Cassandra ring, kind of
>> like a temporary node, so you will need the storage-conf.xml on that
>> machine.
>>
>> cheers
>> --
>> Torsten
>
> --
> Thanks,
> Mubarak Seyed.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
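For the "Hadoop output -> Cassandra" path the thread discusses (as distinct from the BinaryMemtable route in contrib/bmt_example), a minimal sketch of the reduce side might look like the following. This is not the bmt_example code itself: the TSV layout (`key \t column \t value`), the keyspace/column family names, and the host address are all assumptions for illustration, and the actual cluster write via the pycassa Thrift client is left commented out since it needs a running cluster.

```python
# Sketch: group TSV lines (key \t column \t value) into per-row
# mutation dicts, then push each batch with a Thrift-based client.
# The TSV layout, keyspace, column family, and host are assumptions.

def rows_from_tsv(lines):
    """Group TSV lines into a {row_key: {column: value}} dict."""
    rows = {}
    for line in lines:
        key, column, value = line.rstrip("\n").split("\t")
        rows.setdefault(key, {})[column] = value
    return rows

if __name__ == "__main__":
    import sys
    rows = rows_from_tsv(sys.stdin)
    # Hypothetical write path -- requires a reachable cluster:
    # import pycassa
    # pool = pycassa.ConnectionPool("Keyspace1", ["127.0.0.1:9160"])
    # cf = pycassa.ColumnFamily(pool, "Standard1")
    # cf.batch_insert(rows)
```

Batching the columns per row key before writing is what makes this a bulk load rather than one insert per cell; the BinaryMemtable approach goes further by bypassing the Thrift/commit-log path entirely, which is why Jonathan flags it as usually premature optimization.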