Chunky, I just tried it myself. It turns out that the directory you are adding as a partition has to be empty for msck repair to work. This is obviously sub-optimal, and there is a JIRA in place (https://issues.apache.org/jira/browse/HIVE-3231) to fix it.
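Given the empty-directory limitation above, the ordering of the steps matters. A minimal Python sketch of that ordering follows; the table name (`logs`), partition column (`dt`), and paths are hypothetical, and the generated shell commands are only illustrative:

```python
# Sketch of the HIVE-3231 workaround: create the (still empty) partition
# directory, run MSCK REPAIR while it is empty, and only then copy data in.
# Table name, partition column, and paths are hypothetical examples.

def msck_workaround_steps(table, partition_col, day, data_src):
    """Return the shell commands for one day's partition, in the order
    that works around msck's empty-directory requirement."""
    part_dir = f"/user/hive/warehouse/{table}/{partition_col}={day}"
    return [
        f"hadoop fs -mkdir {part_dir}",             # 1. create the empty dir
        f'hive -e "MSCK REPAIR TABLE {table};"',    # 2. register the partition
        f"hadoop fs -cp {data_src} {part_dir}/",    # 3. only now add the data
    ]

for cmd in msck_workaround_steps("logs", "dt", "2012-11-06",
                                 "s3://my-bucket/2012-11-06/*"):
    print(cmd)
```

The point is simply that step 2 must run between steps 1 and 3; the exact copy mechanism (`fs -cp`, distcp, etc.) is up to you.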
So, I'd suggest you keep an eye out for the next version for that fix to come in. In the meanwhile, run msck after you create your partition directory but before you populate the directory with data.

Mark

On Tue, Nov 6, 2012 at 10:33 PM, Chunky Gupta <chunky.gu...@vizury.com> wrote:

> Hi Mark,
> Sorry, I forgot to mention. I have also tried
> msck repair table <Table name>;
> and got the same output that I got from msck alone.
> Do I need to change any other settings for this to work? I prepared the
> Hadoop and Hive setup from scratch on EC2.
>
> Thanks,
> Chunky.
>
> On Wed, Nov 7, 2012 at 11:58 AM, Mark Grover <grover.markgro...@gmail.com> wrote:
>
>> Chunky,
>> You should have run:
>> msck repair table <Table name>;
>>
>> Sorry, I should have made it clear in my last reply. I have added an
>> entry to the Hive wiki for the benefit of others:
>> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Recoverpartitions
>>
>> Mark
>>
>> On Tue, Nov 6, 2012 at 9:55 PM, Chunky Gupta <chunky.gu...@vizury.com> wrote:
>>
>>> Hi Mark,
>>> I didn't get any error.
>>> I ran this on the hive console:
>>> "msck table Table_Name;"
>>> It said OK and showed an execution time of 1.050 sec.
>>> But when I checked the partitions for the table using
>>> "show partitions Table_Name;"
>>> it didn't show me any partitions.
>>>
>>> Thanks,
>>> Chunky.
>>>
>>> On Tue, Nov 6, 2012 at 10:38 PM, Mark Grover <grover.markgro...@gmail.com> wrote:
>>>
>>>> Glad to hear, Chunky.
>>>>
>>>> Out of curiosity, what errors did you get when using msck?
>>>>
>>>> On Tue, Nov 6, 2012 at 5:14 AM, Chunky Gupta <chunky.gu...@vizury.com> wrote:
>>>>
>>>>> Hi Mark,
>>>>> I tried msck, but it is not working for me. I have written a Python
>>>>> script to add the partitions individually.
>>>>>
>>>>> Thank you Edward, Mark and Dean.
>>>>> Chunky.
>>>>> On Mon, Nov 5, 2012 at 11:08 PM, Mark Grover <grover.markgro...@gmail.com> wrote:
>>>>>
>>>>>> Chunky,
>>>>>> I have used the "recover partitions" command on EMR, and that worked fine.
>>>>>>
>>>>>> However, take a look at
>>>>>> https://issues.apache.org/jira/browse/HIVE-874. It seems the msck
>>>>>> command in Apache Hive does the same thing. Try it out and let us know
>>>>>> how it goes.
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>> On Mon, Nov 5, 2012 at 7:56 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>>>>>>
>>>>>>> Recover partitions should work the same way for different file
>>>>>>> systems.
>>>>>>>
>>>>>>> Edward
>>>>>>>
>>>>>>> On Mon, Nov 5, 2012 at 9:33 AM, Dean Wampler <dean.wamp...@thinkbiganalytics.com> wrote:
>>>>>>>
>>>>>>> > Writing a script to add the external partitions individually is
>>>>>>> > the only way I know of.
>>>>>>> >
>>>>>>> > Sent from my rotary phone.
>>>>>>> >
>>>>>>> > On Nov 5, 2012, at 8:19 AM, Chunky Gupta <chunky.gu...@vizury.com> wrote:
>>>>>>> >
>>>>>>> > Hi Dean,
>>>>>>> >
>>>>>>> > Actually, I had a Hadoop and Hive cluster on EMR, with S3 storage
>>>>>>> > containing logs that update daily and are partitioned by date (dt),
>>>>>>> > and I was using this recover partitions command.
>>>>>>> > Now I want to shift to EC2 and run my own Hadoop and Hive cluster.
>>>>>>> > So, what is the alternative to recover partitions in this case, if
>>>>>>> > you have any idea?
>>>>>>> > I found one way, adding the partitions individually for all dates,
>>>>>>> > so I would have to write a script to do that for every date. Is
>>>>>>> > there any easier way than this?
>>>>>>> >
>>>>>>> > Thanks,
>>>>>>> > Chunky
>>>>>>> >
>>>>>>> > On Mon, Nov 5, 2012 at 6:28 PM, Dean Wampler <dean.wamp...@thinkbiganalytics.com> wrote:
>>>>>>> >>
>>>>>>> >> RECOVER PARTITIONS is an enhancement added by Amazon to their
>>>>>>> >> version of Hive.
>>>>>>> >>
>>>>>>> >> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html
>>>>>>> >>
>>>>>>> >> <shameless-plug>
>>>>>>> >> Chapter 21 of Programming Hive discusses this feature and other
>>>>>>> >> aspects of using Hive in EMR.
>>>>>>> >> </shameless-plug>
>>>>>>> >>
>>>>>>> >> dean
>>>>>>> >>
>>>>>>> >> On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta <chunky.gu...@vizury.com> wrote:
>>>>>>> >>>
>>>>>>> >>> Hi,
>>>>>>> >>>
>>>>>>> >>> I have a cluster setup on EC2 with Hadoop version 0.20.2 and Hive
>>>>>>> >>> version 0.8.1 (I configured everything). I have created a table using:
>>>>>>> >>>
>>>>>>> >>> CREATE EXTERNAL TABLE XXX ( YYY ) PARTITIONED BY ( ZZZ ) ROW FORMAT
>>>>>>> >>> DELIMITED FIELDS TERMINATED BY 'WWW' LOCATION 's3://my-location/data/';
>>>>>>> >>>
>>>>>>> >>> Now I am trying to recover partitions using:
>>>>>>> >>>
>>>>>>> >>> ALTER TABLE XXX RECOVER PARTITIONS;
>>>>>>> >>>
>>>>>>> >>> but I am getting this error: "FAILED: Parse Error: line 1:12 cannot
>>>>>>> >>> recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table statement"
>>>>>>> >>>
>>>>>>> >>> Doing the same steps on a cluster setup on EMR with Hadoop version
>>>>>>> >>> 1.0.3 and Hive version 0.8.1 (configured by EMR) works fine.
>>>>>>> >>>
>>>>>>> >>> So is this a version issue, or am I missing some configuration
>>>>>>> >>> changes in my EC2 setup?
>>>>>>> >>> I am not able to find an exact solution for this problem on the
>>>>>>> >>> internet. Please help me.
>>>>>>> >>>
>>>>>>> >>> Thanks,
>>>>>>> >>> Chunky.
>>>>>>> >>
>>>>>>> >> --
>>>>>>> >> Dean Wampler, Ph.D.
>>>>>>> >> thinkbiganalytics.com
>>>>>>> >> +1-312-339-1330
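For readers in the same situation: the "script to add partitions individually" that Dean suggests and Chunky ended up writing might look roughly like the Python sketch below, which just emits one ALTER TABLE statement per day for a dt-partitioned external table. The table name, S3 location, and date range are hypothetical, and `IF NOT EXISTS` may not be supported on very old Hive releases:

```python
# Sketch: generate ALTER TABLE ... ADD PARTITION statements for a range of
# dates, as an alternative to EMR's RECOVER PARTITIONS on a plain EC2 setup.
# Table name, S3 prefix, and dates are hypothetical examples.
from datetime import date, timedelta

def add_partition_statements(table, start, end, location_prefix):
    """Yield one ADD PARTITION statement per day in [start, end] inclusive."""
    day = start
    while day <= end:
        dt = day.isoformat()
        yield (f"ALTER TABLE {table} ADD IF NOT EXISTS "
               f"PARTITION (dt='{dt}') LOCATION '{location_prefix}/dt={dt}';")
        day += timedelta(days=1)

# The output could be written to a .hql file and run with `hive -f`.
for stmt in add_partition_statements("logs", date(2012, 11, 1),
                                     date(2012, 11, 3), "s3://my-location/data"):
    print(stmt)
```

Since each statement names its partition and location explicitly, this approach does not depend on msck's empty-directory behavior or on any Amazon-specific extension.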