Hi all,
I am trying to get an idea of how people set up the Hive metastore when 
using Amazon EMR.

For those of you using Amazon EMR:

1) Do you have a dedicated RDS instance external to your EMR Hive+Hadoop 
cluster that you use as a persistent metastore for all your cluster 
instantiations?

2) Do you use the MySQL DB that comes pre-installed on the master node, 
exporting its data to something like S3 on cluster teardown and importing it 
from S3 during cluster bring-up?

3) Do you use a local installation of Hive (instead of the one on EMR) so that 
you can use an in-house dedicated metastore while still using the Hadoop 
cluster on EMR? (i.e. local Hive + EMR Hadoop)

4) Do you do something really simple and naive like scripting up all your 
"create external table" commands and running them every time you bring up a 
cluster?
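
To make option 4 concrete, here is a minimal sketch of the kind of script I 
have in mind. It assumes Python is available wherever the cluster is launched 
from; the table names, columns, tab delimiter and S3 paths are all made-up 
placeholders, not a real schema.

```python
# Sketch only: regenerate "create external table" statements on every
# cluster bring-up. All table definitions below are hypothetical.
TABLES = [
    {
        "name": "page_views",
        "columns": [("user_id", "STRING"), ("url", "STRING"), ("ts", "BIGINT")],
        "location": "s3://example-bucket/page_views/",
    },
    {
        "name": "trades",
        "columns": [("trade_id", "BIGINT"), ("symbol", "STRING"), ("price", "DOUBLE")],
        "location": "s3://example-bucket/trades/",
    },
]


def create_statement(table):
    """Build one CREATE EXTERNAL TABLE statement for a table definition."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in table["columns"])
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table['name']} ({cols}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' "
        f"LOCATION '{table['location']}';"
    )


def build_script(tables):
    """Join all statements into one script, one statement per line."""
    return "\n".join(create_statement(t) for t in tables) + "\n"


if __name__ == "__main__":
    # Redirect this output to a file and run it with Hive on the master node.
    print(build_script(TABLES), end="")
```

The output could be saved to a file (say create_tables.hql, a name I just made 
up) and run with "hive -f create_tables.hql" once the cluster is up. Since the 
tables are external, dropping the cluster loses only the definitions, not the 
data in S3.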

Or do you do something else not mentioned above? :-)

Thank you in advance for sharing!

Mark

Mark Grover, Business Intelligence Analyst
OANDA Corporation 

www: oanda.com www: fxtrade.com 

"Best Trading Platform" - World Finance's Forex Awards 2009. 
"The One to Watch" - Treasury Today's Adam Smith Awards 2009. 

