Hi all, I am trying to get an idea of how people set up the Hive metastore when using Amazon EMR.
For those of you using Amazon EMR:

1) Do you have a dedicated RDS instance external to your EMR Hive+Hadoop cluster that you use as a persistent metastore for all your cluster instantiations?
2) Do you use the MySQL DB that comes pre-installed on the master node and export its data (on cluster tear down) to something like S3 and import it from S3 during cluster bring up?
3) Do you use a local installation of Hive (instead of that on EMR) so that you could make use of an in-house dedicated metastore while utilizing Hadoop cluster on EMR? (i.e. local Hive + EMR Hadoop)
4) Do you do something really simple and naive like scripting up all your "create external table" commands and running them every time you bring up a cluster?

Or, do you do something else not mentioned above? :-)

Thank you in advance for sharing!

Mark

Mark Grover, Business Intelligence Analyst
OANDA Corporation
www: oanda.com
www: fxtrade.com
"Best Trading Platform" - World Finance's Forex Awards 2009.
"The One to Watch" - Treasury Today's Adam Smith Awards 2009.
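
P.S. To make option 4 concrete, here is a rough sketch of the kind of script I mean. It only generates the "create external table" statements from a list of table definitions. The table name, columns, and S3 path are made-up placeholders, and the output file name in the note below is hypothetical too.

```python
# Sketch for option 4: regenerate "create external table" statements
# on every cluster bring-up. All table definitions here are hypothetical.
TABLES = [
    {
        "name": "page_views",  # placeholder table name
        "columns": [("user_id", "STRING"), ("url", "STRING"), ("ts", "BIGINT")],
        "location": "s3://example-bucket/page_views/",  # placeholder S3 path
    },
]


def build_ddl(table):
    """Return one CREATE EXTERNAL TABLE statement for a table definition."""
    cols = ", ".join(f"{name} {ctype}" for name, ctype in table["columns"])
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table['name']} ({cols}) "
        f"LOCATION '{table['location']}';"
    )


def build_script(tables):
    """Join the statements for all tables into a single HiveQL script."""
    return "\n".join(build_ddl(t) for t in tables) + "\n"


if __name__ == "__main__":
    # Print the script so it can be redirected to a file.
    print(build_script(TABLES), end="")
```

The output could be redirected to a file (say, create_tables.hql) and run with "hive -f" as a step when the cluster comes up. Partitioned tables would also need their partitions re-added, which this sketch ignores.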