I'd recommend Amazon Elastic MapReduce (EMR). Otherwise you will spend time preparing and setting up the Hadoop and Hive packages to run on EC2 yourself. EMR does the heavy lifting for you and still lets you customize Hadoop and Hive by pointing them at your own XML property files (for example, stored in S3). EMR also lets you run Hive jobs either in batch mode (through the EMR command-line client tools or the AWS console) or in interactive mode for testing and debugging. Another benefit is that the Hive on EMR has enhancements that are otherwise not available, such as passing parameters from the command line and loading all partitions of a table from S3 automatically instead of adding them one at a time. The EMR FAQ linked below covers these; see the answer to "Are there new features in Hive specific to Amazon Elastic MapReduce?"
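To make the batch-mode and command-line-parameter points a bit more concrete, here is a minimal Python sketch of what a Hive step definition might look like. It only builds a plain dictionary and makes no AWS calls. Everything specific in it is an assumption for illustration: the helper name `hive_step`, the bucket, the script path and the variable names are hypothetical, and the `command-runner.jar` / `hive-script` argument layout is the form I believe current EMR releases accept, so check it against the EMR documentation for your release before relying on it.

```python
def hive_step(name, script_uri, params):
    """Build one EMR step definition that runs a Hive script.

    Each entry in `params` is passed as `-d KEY=VALUE`, so the Hive
    script can refer to it as ${KEY}.
    """
    # Assumed argument layout for running a Hive script via command-runner.jar.
    args = ["hive-script", "--run-hive-script", "--args", "-f", script_uri]
    # Sort the keys so the generated argument list is deterministic.
    for key in sorted(params):
        args += ["-d", f"{key}={params[key]}"]
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {"Jar": "command-runner.jar", "Args": args},
    }


# Hypothetical bucket, script and variables, for illustration only.
step = hive_step(
    "daily-report",
    "s3://my-bucket/scripts/report.q",
    {"INPUT": "s3://my-bucket/logs", "DAY": "2011-08-30"},
)
print(step["HadoopJarStep"]["Args"])
```

A dictionary of this shape is what boto3's EMR client takes in the `Steps` list of `add_job_flow_steps`. Inside report.q the values would then be referenced as ${INPUT} and ${DAY}.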
http://aws.amazon.com/elasticmapreduce/faqs/

Michael

________________________________
From: "Aggarwal, Vaibhav" <vagg...@amazon.com>
To: "d...@hive.apache.org" <d...@hive.apache.org>; "user@hive.apache.org" <user@hive.apache.org>
Sent: Tuesday, August 30, 2011 11:51 AM
Subject: RE: Hive in EC2

You could also choose to look at Amazon ElasticMapReduce. It allows you to provision an EC2 cluster of your choice preinstalled with Hive and Hadoop.

https://cwiki.apache.org/confluence/display/Hive/HiveAmazonElasticMapReduce

Thanks
Vaibhav

-----Original Message-----
From: MIS [mailto:misapa...@gmail.com]
Sent: Monday, August 29, 2011 11:03 PM
To: user@hive.apache.org; hive
Subject: Hive in EC2

Hi,

Can somebody point me to production level setup of Hive in EC2. The intent is to know the setup best practices being employed.

Thanks.