Hi all,

I am dealing with a scenario where I receive a .csv file every 10 minutes,
averaging about 300 MB. I need to update a Cassandra cluster with the data
from each .csv file, after applying some processing functions.

The current approach keeps a HashMap in memory and updates it from the
processed .csv files, accumulating the data to be written (mostly updates
to counters). Then periodically (say, every 2 s) the values in the HashMap
are read one by one and written to Cassandra.
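
For illustration, a minimal sketch of this approach in Python. The names
CounterBuffer, apply_update and run_flusher, and the key/delta column
layout, are assumptions made up for the sketch; apply_update is only a
placeholder for the real Cassandra counter update, not a driver call.

```python
import csv
import threading
from collections import Counter


class CounterBuffer:
    """In-memory map of counter deltas that is flushed periodically."""

    def __init__(self, apply_update):
        # apply_update(key, delta) is a hypothetical stand-in for the
        # Cassandra counter update; it is not a real driver API.
        self._apply_update = apply_update
        self._pending = Counter()
        self._lock = threading.Lock()

    def ingest_csv(self, lines):
        """Accumulate deltas from one processed CSV.

        `lines` is any iterable of text lines (e.g. an open file) with an
        assumed header of `key,delta`.
        """
        local = Counter()
        for row in csv.DictReader(lines):
            local[row["key"]] += int(row["delta"])
        # Merge under the lock so a concurrent flush sees a consistent map.
        with self._lock:
            self._pending.update(local)

    def flush(self):
        """Swap out the map, then write each accumulated delta once."""
        with self._lock:
            batch, self._pending = self._pending, Counter()
        for key, delta in batch.items():
            self._apply_update(key, delta)
        return len(batch)


def run_flusher(buffer, stop_event, interval_s=2.0):
    """Call buffer.flush() every interval_s seconds until stop_event is set."""
    while not stop_event.wait(interval_s):
        buffer.flush()
```

Several rows for the same key collapse into a single write per flush, which
is why the map sits in front of Cassandra in the first place.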

I have tried generating SSTables and loading the data in batches via
sstableloader, but that is a lot slower than I can afford, since I need
near-real-time results.

Are there any hints on what I could try? Is it possible to do something
like updating values directly in a memtable (instead of using the HashMap)
and sending that to Cassandra, rather than loading via SSTables?



-- 
Pushpalanka Jayawardhana
