Look at contrib/bmt_example, with the caveat that it's usually premature optimization.
On Tue, Jul 13, 2010 at 12:31 PM, Mubarak Seyed <mubarak.se...@gmail.com> wrote:
> Thanks Torsten.
>
> Jonathan's blog post "Fact vs. Fiction" says:
>
>   Fact: It has always been straightforward to send the output of Hadoop jobs
>   to Cassandra, and Facebook, Digg, and others have been using Hadoop like
>   this as a Cassandra bulk-loader for over a year.
>
> Can anyone from Facebook or Digg share details on how to use the Cassandra
> BulkLoader? I could see some details in Arin's presentation on Cassandra @
> Digg about the data load from MySQL -> Hadoop -> Cassandra.
>
> Can someone please help me?
>
> Thanks,
> Mubarak
>
> On Tue, Jul 13, 2010 at 1:27 AM, Torsten Curdt <tcu...@vafer.org> wrote:
>>
>> On Tue, Jul 13, 2010 at 04:35, Mubarak Seyed <mubarak.se...@gmail.com>
>> wrote:
>> > Where can I find the documentation for BinaryMemtable (bmt_example in
>> > contrib) to use CassandraBulkLoader? What input does CassandraBulkLoader
>> > expect? How do I form the input data, and what format should it be in?
>>
>> The code is the documentation, I fear.
>>
>> I'll see if I can get permission to contribute our updated code.
>> We added some command-line fu and are using it to import large TSVs.
>>
>> > Do I need HDFS to store my storage-conf.xml?
>>
>> Why HDFS?
>>
>> The machine running the bulk loader joins the Cassandra ring, kind of
>> like a temporary node, so you will need the storage-conf.xml on that
>> machine.
>>
>> cheers
>> --
>> Torsten
>
> --
> Thanks,
> Mubarak Seyed.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
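For the "Hadoop output -> Cassandra" path the thread discusses (as distinct from the BinaryMemtable route in contrib/bmt_example), a minimal sketch of the reduce side might look like the following. This is not the bmt_example code itself: the TSV layout (`key \t column \t value`), the keyspace/column family names, and the host address are all assumptions for illustration, and the actual cluster write via the pycassa Thrift client is left commented out since it needs a running cluster.

```python
# Sketch: group TSV lines (key \t column \t value) into per-row
# mutation dicts, then push each batch with a Thrift-based client.
# The TSV layout, keyspace, column family, and host are assumptions.

def rows_from_tsv(lines):
    """Group TSV lines into a {row_key: {column: value}} dict."""
    rows = {}
    for line in lines:
        key, column, value = line.rstrip("\n").split("\t")
        rows.setdefault(key, {})[column] = value
    return rows

if __name__ == "__main__":
    import sys
    rows = rows_from_tsv(sys.stdin)
    # Hypothetical write path -- requires a reachable cluster:
    # import pycassa
    # pool = pycassa.ConnectionPool("Keyspace1", ["127.0.0.1:9160"])
    # cf = pycassa.ColumnFamily(pool, "Standard1")
    # cf.batch_insert(rows)
```

Batching the columns per row key before writing is what makes this a bulk load rather than one insert per cell; the BinaryMemtable approach goes further by bypassing the Thrift/commit-log path entirely, which is why Jonathan flags it as usually premature optimization.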