Re: Amazon EMR Best Practices for Hive metastore

2012-03-06 Thread Sam Wilson
We also do #4. Initially we had lots of conversations about all the other options and we should do this or that... Ultimately we focused on just going live as quickly as possible and getting more involved in the setup later. Since then the only thing we've needed to do is hack a few of the basel

Re: rainstor

2012-01-25 Thread Sam Wilson
Google? Sent from my iPhone On Jan 25, 2012, at 7:34 PM, Dalia Sobhy wrote: > Does anyone have any idea about rainstor? > > Open source? How to download? How to use? Performance?

Re: drop table -> java.lang.OutOfMemoryError: Java heap space

2012-01-05 Thread Sam Wilson
I recommend trying a daily partitioning scheme over an hourly one. We had a similar setup and ran into the same problem and ultimately found that daily works fine for us, even with larger file sizes. At the very least it is worth evaluating. Sent from my iPhone On Jan 5, 2012, at 2:23 PM, Mat
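
To make the daily-versus-hourly trade-off concrete, here is a minimal sketch that counts how many partitions each scheme creates over the same date range. The thread does not say why hourly partitioning triggered the error, so linking it to partition count is an assumption; the function name, the key formats, and the one-year range are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def partition_values(start, end, granularity):
    """Return the partition key strings covering the half-open range [start, end)."""
    if granularity == "daily":
        step, fmt = timedelta(days=1), "%Y-%m-%d"
    elif granularity == "hourly":
        step, fmt = timedelta(hours=1), "%Y-%m-%d-%H"
    else:
        raise ValueError("granularity must be 'daily' or 'hourly'")

    values = []
    current = start
    while current < end:
        values.append(current.strftime(fmt))
        current += step
    return values


# Hypothetical example: one year of data for a single table.
start = datetime(2011, 1, 1)
end = datetime(2012, 1, 1)
daily = partition_values(start, end, "daily")
hourly = partition_values(start, end, "hourly")
print("daily partitions: ", len(daily))
print("hourly partitions:", len(hourly))
```

Over any range, the hourly scheme produces 24 times as many partitions as the daily one, so any operation that has to touch every partition of a table has 24 times as many entries to handle.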

Re: Hive Metadata URI error

2011-12-11 Thread Sam Wilson
Try file:// in front of the property value... Sent from my iPhone On Dec 12, 2011, at 12:07 AM, "Periya.Data" wrote: > Hi, > I am trying to create Hive tables on an EC2 instance. I get this strange > error about URI schema and log4j properties not found. I do not know how to > fix this. >

Re: Building out Hive in EC2/S3 versus dedicated servers

2011-11-22 Thread Sam Wilson
We recently adopted Hadoop and Hive for doing some significant data processing. We went the Amazon route. My own $.02 is as follows: If you are already incredibly experienced with Hadoop and Hive and have someone on staff who has previously built a cluster at least as big as the one you are pr

Re: Asynchronous query execution

2011-11-15 Thread Sam Wilson
If you go this route, you may want to use nohup. This way your processes will continue running even if you lose connection to your terminal session. Other options: 1) You can write your queries to a DB/queue and have a process running on the Hive server that reads from the DB/queue and runs the
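
The nohup suggestion above can also be approximated from a script. The following is a minimal sketch, not the poster's setup: it starts a command in its own session so that it is not tied to the launching terminal, and sends its output to a log file. It uses `start_new_session=True` rather than nohup itself, which has a similar effect on POSIX systems but is not identical. The helper name `launch_detached` and the file names in the comment are hypothetical.

```python
import subprocess


def launch_detached(argv, log_path):
    """Start argv in a new session, appending stdout and stderr to log_path.

    Returns the Popen object so the caller can record the PID or poll later.
    """
    log_file = open(log_path, "ab")
    try:
        proc = subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,   # no terminal input for a background job
            stdout=log_file,
            stderr=subprocess.STDOUT,   # keep errors in the same log
            start_new_session=True,     # POSIX: detach from the controlling terminal
        )
    finally:
        # The child holds its own copy of the file descriptor.
        log_file.close()
    return proc


# Hypothetical usage, assuming the hive CLI is on PATH and report.hql exists:
#   proc = launch_detached(["hive", "-f", "report.hql"], "report.log")
#   print("started query with PID", proc.pid)
```

Returning the process object instead of waiting on it is what makes the call asynchronous; the caller can check `proc.poll()` later or simply read the log file once the query finishes.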