Re: Ingesting Large Number of files

2015-11-17 Thread Robert Coli
On Tue, Nov 17, 2015 at 6:32 AM, Tushar Agrawal wrote:
> We get periodic bulk load (twice a month) in form of delimited data files.
> We get about 10K files with average size of 50 MB. Each record is a row in
> Cassandra table.

http://www.pythian.com/blog/bulk-loading-options-for-cassandra/ =

Re: Ingesting Large Number of files

2015-11-17 Thread areddyraja
This is about 5 GB at one time. Let us say the network speed is 200 Mb/sec and you have a 10-node cluster. Choose your partition key in such a way that writes are spread across all nodes. That means about 0.5 GB per node. At 200 Mb/sec, 500 MB takes 500*8/200, which would give 20 secs total time for
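
The back-of-the-envelope estimate above can be sketched as a small calculation. This is only an illustration under several assumptions that are not stated in the thread: the network speed is read as 200 megabits per second (the multiplication by 8 suggests bits), 1 GB is taken as 1000 MB, the partition key spreads data perfectly evenly, each node has its own link, and replication and write overhead are ignored. The function name `transfer_seconds` is hypothetical. The second call uses the figures from the original question (10K files of 50 MB each), which come to roughly 500 GB rather than 5 GB.

```python
def transfer_seconds(total_mb: float, nodes: int, link_mbit_per_s: float) -> float:
    """Seconds for each node to receive an even share of the data.

    Assumes a perfectly even spread across nodes, one link per node,
    and no replication or protocol overhead (all simplifications).
    """
    per_node_mb = total_mb / nodes            # megabytes each node receives
    return per_node_mb * 8 / link_mbit_per_s  # megabytes -> megabits, then divide by rate


# Figures from the reply: 5 GB (taken as 5000 MB), 10 nodes, 200 Mb/s.
reply_estimate = transfer_seconds(5_000, 10, 200)

# Figures from the question: 10K files at 50 MB each, same cluster and link.
question_estimate = transfer_seconds(10_000 * 50, 10, 200)

print(f"reply figures:    {reply_estimate:.0f} s")
print(f"question figures: {question_estimate:.0f} s")
```

With the question's volume, the same reasoning gives a transfer time on the order of half an hour per node rather than seconds, before any Cassandra write cost is counted.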

Ingesting Large Number of files

2015-11-17 Thread Tushar Agrawal
We get a periodic bulk load (twice a month) in the form of delimited data files. We get about 10K files with an average size of 50 MB. Each record is a row in a Cassandra table. What is the fastest way to ingest this data into Cassandra? Thank you, Tushar