I hope the Cassandra community can help me make a decision. The project I am currently working on is located in an industrial plant: machines are connected to a server, and every 5 minutes I get data from each machine about its status. We are talking about a production with 100+ machines, so the amount of data is very high:
Each machine produces one row every 5 minutes, which means 12 rows per hour and roughly 120 rows per day per machine. With 100+ machines that is 12,000+ rows per day; multiplied by 20, that is 240,000 rows per month and 2,880,000 rows per year. I have to keep the last 3 years and I must be able to do analytics on this data. In the end I am dealing with roughly 10 million rows (each row has 12 columns holding text and numbers).

Okay, this is not really "big data", is it? But for me it is a lot of data to handle anyway. Currently I am holding all this data in an Oracle database, but I think doing analytics on so many rows there is not a good, modern approach. As the company is successful, it will grow, which means more machines and even more data to handle.

So I thought Big Data technologies might be a possible solution for storing my data. I have since learned that Apache Hadoop is not the right tool for this kind of thing because it does not scale down. But maybe Cassandra?

This is my question to you: do you think Cassandra is the right store for this kind of data? I am thinking about 2 nodes, maybe virtual. Let me know what you think. If Cassandra is not the right tool, please tell me, and if you know any alternatives, please tell me about them. Maybe I am already doing the right thing by storing this much data in an Oracle database, and maybe one of you is doing the same. If so, please let me know as well.

Thank you very much.

Web: http://www.teufel.net
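
P.S. For anyone who wants to check or adjust my numbers, here is a small Python sketch of the estimate above. It is only back-of-envelope arithmetic. The 10 operating hours per day is an assumption (it is what 120 rows per day per machine implies), and I read the factor 20 as production days per month, which is also an assumption. The function and parameter names are just for illustration.

```python
# Back-of-envelope estimate of row volume for the machine status data.
# HOURS_PER_DAY and DAYS_PER_MONTH are assumptions, not measured values.

MACHINES = 100            # "100+ machines"
ROWS_PER_HOUR = 60 // 5   # one row every 5 minutes -> 12 rows per hour
HOURS_PER_DAY = 10        # assumption: implied by ~120 rows/day per machine
DAYS_PER_MONTH = 20       # assumption: the factor 20 read as production days
YEARS_TO_KEEP = 3         # retention requirement


def estimate_rows(machines=MACHINES,
                  rows_per_hour=ROWS_PER_HOUR,
                  hours_per_day=HOURS_PER_DAY,
                  days_per_month=DAYS_PER_MONTH,
                  years=YEARS_TO_KEEP):
    """Return row counts per day, month, year and for the whole retention period."""
    per_machine_day = rows_per_hour * hours_per_day
    per_day = per_machine_day * machines
    per_month = per_day * days_per_month
    per_year = per_month * 12
    total = per_year * years
    return {
        "per_machine_day": per_machine_day,
        "per_day": per_day,
        "per_month": per_month,
        "per_year": per_year,
        "total": total,
    }


if __name__ == "__main__":
    for name, rows in estimate_rows().items():
        print(f"{name:>16}: {rows:,}")
```

With these inputs the 3-year total comes out at 8,640,000 rows, which is where my "roughly 10 million" comes from. If the machines ran around the clock (24 hours a day, 30 days a month), the same calculation would give about 31 million rows, so the operating hours matter a lot for sizing.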