Hi all, I am dealing with a scenario where I receive a .csv file every 10 minutes, averaging about 300 MB. After some processing, I need to update a Cassandra cluster according to the data in each file.
My current approach is to keep a HashMap in memory and update it from the processed .csv files, accumulating the data to be written (mostly counter updates). Then periodically (say, every 2 seconds) the values in the HashMap are read one by one and written to Cassandra. I have also tried generating SSTables and loading them in batches via sstableloader, but that is a lot slower than I can accept, since I need near-real-time results. Are there any hints on what I can try? Is it possible to update values directly in a memtable (instead of using the HashMap) and send that to Cassandra, rather than loading via SSTables?
-- Pushpalanka Jayawardhana
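
For reference, here is a minimal sketch of the buffering approach described above: counter deltas are accumulated in memory and drained periodically. The class name `CounterBuffer`, the `String` keys, and the `sink` callback are assumptions for illustration only, not the actual code. In a real setup the sink would issue a CQL counter update (for example `UPDATE my_counters SET value = value + ? WHERE key = ?`, where the table and column names are hypothetical) through whichever Cassandra client is in use, and `flush` would be called from a scheduler every 2 seconds.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Hypothetical helper: accumulates counter deltas between flushes.
final class CounterBuffer {
    // Pending increment per key, accumulated since that key was last flushed.
    private final ConcurrentHashMap<String, Long> pending = new ConcurrentHashMap<>();

    // Called by the CSV-processing code for every counter change.
    public void add(String key, long delta) {
        pending.merge(key, delta, Long::sum);
    }

    // Removes each key atomically and passes its accumulated delta to the sink.
    // An add() that races with the removal starts a fresh entry, so no delta is lost.
    // Returns the number of keys handed to the sink.
    public int flush(BiConsumer<String, Long> sink) {
        int flushed = 0;
        for (String key : pending.keySet()) {
            Long delta = pending.remove(key);
            if (delta != null && delta != 0L) {
                sink.accept(key, delta); // assumption: the sink performs the Cassandra write
                flushed++;
            }
        }
        return flushed;
    }

    public static void main(String[] args) {
        CounterBuffer buffer = new CounterBuffer();
        buffer.add("page:home", 2);
        buffer.add("page:home", 3);
        buffer.add("page:about", 1);
        // Stand-in sink that only prints; a real one would send the increment to Cassandra.
        buffer.flush((key, delta) -> System.out.println(key + " += " + delta));
    }
}
```

Because only the net delta per key is sent, many changes to the same counter within one interval collapse into a single write. Whether that alone is enough to meet the near-real-time requirement depends on the number of distinct keys per interval, which is not stated here.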