Hello all,

My use case is:

1) I have a relational database (MS SQL Server) that holds a very large amount of data.
2) I want to analyze this data and generate reports from the results. There will be several reports, each based on a different analysis.

I tried to implement this using Hive. What I did is:

1) I imported all the tables from MS SQL Server into Hive using Sqoop.
2) I wrote many Hive queries, which are executed over JDBC against the Hive Thrift Server.
3) I get the correct results in table form, as expected.
4) The problem is that execution takes far too long. (My complete program runs for about 3-4 hours on a *small amount of data*.)

My organization expects this task to complete in less than about half an hour. Given how long the Hive implementation currently takes, what should I do?

I want to ask:

*Is this use case possible with Hive?* If it is, what should I change in my program to improve performance?

*And if it is not, what would be a better way to implement this use case?*

Please reply. Thanks.

--
Regards,
Bhavesh Shah