Re: performance and cluster size required

2014-06-05 Thread Nitin Pawar
On the first part of your question (what the cluster size should be): it depends entirely on 1) what type of queries you are running, 2) what type of cluster you have, i.e. whether it is shared or dedicated to you only, 3) compressed file format drives the query performance based if the compression t

performance and cluster size required

2014-06-05 Thread Bogala, Chandra Reddy
Hi, I get a 300 MB compressed file (structured CSV data) in the spool directory every 3 minutes from the collector. I have around 6 collectors. I move the data from the spool directory to an HDFS directory and add it as a Hive partition for every 15 minutes of data. Then I run different aggregation queries and post data to Hba
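
For readers following the thread, the "add as a Hive partition every 15 minutes" step could look roughly like the sketch below. This is only an illustration under assumptions: the table name `events`, the partition columns `dt` and `slot`, and the base path `/data/events` are hypothetical and do not come from the original message. The function floors a timestamp to its 15-minute window and builds the corresponding `ALTER TABLE ... ADD PARTITION` statement.

```python
from datetime import datetime


def partition_ddl(ts, table="events", base="/data/events"):
    """Build a Hive ADD PARTITION statement for the 15-minute window containing ts.

    The table name, partition columns (dt, slot) and base path are
    hypothetical placeholders, not taken from the original post.
    """
    # Floor the timestamp to the start of its 15-minute window.
    slot = ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)
    day = slot.strftime("%Y-%m-%d")
    hhmm = slot.strftime("%H%M")
    # HDFS directory assumed to already hold the files moved from the spool dir.
    location = f"{base}/dt={day}/slot={hhmm}"
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (dt='{day}', slot='{hhmm}') "
        f"LOCATION '{location}'"
    )


if __name__ == "__main__":
    print(partition_ddl(datetime(2014, 6, 5, 10, 37)))
```

The resulting statement would then be passed to Hive (for example via `hive -e`) once the files for that window are in place. `IF NOT EXISTS` keeps the step safe to re-run if the job is retried.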