Glad to hear, Chunky. Out of curiosity, what errors did you get when using msck?
On Tue, Nov 6, 2012 at 5:14 AM, Chunky Gupta <chunky.gu...@vizury.com> wrote:

> Hi Mark,
> I tried msck, but it is not working for me. I have written a python script
> to partition the data individually.
>
> Thank you Edward, Mark and Dean.
> Chunky.
>
>
> On Mon, Nov 5, 2012 at 11:08 PM, Mark Grover
> <grover.markgro...@gmail.com> wrote:
>
>> Chunky,
>> I have used "recover partitions" command on EMR, and that worked fine.
>>
>> However, take a look at https://issues.apache.org/jira/browse/HIVE-874. Seems
>> like msck command in Apache Hive does the same thing. Try it out and let us
>> know how it goes.
>>
>> Mark
>>
>> On Mon, Nov 5, 2012 at 7:56 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>>
>>> Recover partitions should work the same way for different file systems.
>>>
>>> Edward
>>>
>>> On Mon, Nov 5, 2012 at 9:33 AM, Dean Wampler
>>> <dean.wamp...@thinkbiganalytics.com> wrote:
>>> > Writing a script to add the external partitions individually is the only way
>>> > I know of.
>>> >
>>> > Sent from my rotary phone.
>>> >
>>> >
>>> > On Nov 5, 2012, at 8:19 AM, Chunky Gupta <chunky.gu...@vizury.com> wrote:
>>> >
>>> > Hi Dean,
>>> >
>>> > Actually I was having Hadoop and Hive cluster on EMR and I have S3 storage
>>> > containing logs which updates daily and having partition with date(dt). And
>>> > I was using this recover partition.
>>> > Now I wanted to shift to EC2 and have my own Hadoop and Hive cluster. So,
>>> > what is the alternate of using recover partition in this case, if you have
>>> > any idea ?
>>> > I found one way of individually partitioning all dates, so I have to write
>>> > script for that to do so for all dates. Is there any easiest way other than
>>> > this ?
>>> >
>>> > Thanks,
>>> > Chunky
>>> >
>>> >
>>> >
>>> > On Mon, Nov 5, 2012 at 6:28 PM, Dean Wampler
>>> > <dean.wamp...@thinkbiganalytics.com> wrote:
>>> >>
>>> >> The RECOVER PARTITIONS is an enhancement added by Amazon to their version
>>> >> of Hive.
>>> >>
>>> >> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html
>>> >>
>>> >> <shameless-plug>
>>> >> Chapter 21 of Programming Hive discusses this feature and other aspects
>>> >> of using Hive in EMR.
>>> >> </shameless-plug>
>>> >>
>>> >> dean
>>> >>
>>> >>
>>> >> On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta <chunky.gu...@vizury.com>
>>> >> wrote:
>>> >>>
>>> >>> Hi,
>>> >>>
>>> >>> I am having a cluster setup on EC2 with Hadoop version 0.20.2 and Hive
>>> >>> version 0.8.1 (I configured everything) . I have created a table using :-
>>> >>>
>>> >>> CREATE EXTERNAL TABLE XXX ( YYY ) PARTITIONED BY ( ZZZ ) ROW FORMAT
>>> >>> DELIMITED FIELDS TERMINATED BY 'WWW' LOCATION 's3://my-location/data/';
>>> >>>
>>> >>> Now I am trying to recover partition using :-
>>> >>>
>>> >>> ALTER TABLE XXX RECOVER PARTITIONS;
>>> >>>
>>> >>> but I am getting this error :- "FAILED: Parse Error: line 1:12 cannot
>>> >>> recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table statement"
>>> >>>
>>> >>> Doing same steps on a cluster setup on EMR with Hadoop version 1.0.3 and
>>> >>> Hive version 0.8.1 (Configured by EMR), works fine.
>>> >>>
>>> >>> So is this a version issue or am I missing some configuration changes in
>>> >>> EC2 setup ?
>>> >>> I am not able to find exact solution for this problem on internet. Please
>>> >>> help me.
>>> >>>
>>> >>> Thanks,
>>> >>> Chunky.
>>> >>
>>> >> --
>>> >> Dean Wampler, Ph.D.
>>> >> thinkbiganalytics.com
>>> >> +1-312-339-1330
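
For anyone following the script route discussed in this thread, here is a rough sketch of what such a script might look like. It is not the script Chunky actually wrote. It only prints one ALTER TABLE ... ADD PARTITION statement per day for a date range. The table name XXX, the dt partition column and the s3://my-location/data/ location come from the messages above; the dt=YYYY-MM-DD directory layout, the date range and the function name are assumptions for illustration, so adjust them to match how the data is really laid out. IF NOT EXISTS is there so re-runs are harmless; drop it if your Hive version rejects it.

```python
from datetime import date, timedelta


def partition_statements(table, base_location, start, end, column="dt"):
    """Return one ALTER TABLE ... ADD PARTITION statement per day, inclusive.

    Assumes (hypothetically) that each day's data lives under
    <base_location>/<column>=YYYY-MM-DD/.
    """
    base = base_location.rstrip("/")
    statements = []
    day = start
    while day <= end:
        value = day.isoformat()  # formats the date as YYYY-MM-DD
        statements.append(
            "ALTER TABLE {t} ADD IF NOT EXISTS PARTITION ({c}='{v}') "
            "LOCATION '{b}/{c}={v}/';".format(t=table, c=column, v=value, b=base)
        )
        day += timedelta(days=1)
    return statements


if __name__ == "__main__":
    # Example range only; pick the dates that actually exist in the bucket.
    for stmt in partition_statements(
        "XXX", "s3://my-location/data/", date(2012, 11, 1), date(2012, 11, 5)
    ):
        print(stmt)
```

One way to use it would be to redirect the output to a file (say add_partitions.hql, a made-up name) and run that file with hive -f, so the generated statements can be inspected before they touch the metastore.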