Okay Mark, I will be looking into this JIRA regularly.
Thanks again for helping.

Chunky.

On Wed, Nov 7, 2012 at 12:22 PM, Mark Grover <grover.markgro...@gmail.com> wrote:

> Chunky,
> I just tried it myself. It turns out that the directory you are adding as
> a partition has to be empty for msck repair to work. This is obviously
> sub-optimal and there is a JIRA in place
> (https://issues.apache.org/jira/browse/HIVE-3231) to fix it.
>
> So, I'd suggest you keep an eye out for the next version for that fix to
> come in. In the meantime, run msck after you create your partition
> directory but before you populate your directory with data.
>
> Mark
>
> On Tue, Nov 6, 2012 at 10:33 PM, Chunky Gupta <chunky.gu...@vizury.com> wrote:
>
>> Hi Mark,
>> Sorry, I forgot to mention. I have also tried
>> msck repair table <Table name>;
>> and I got the same output that I got from msck alone.
>> Do I need to change any other settings for this to work? I have
>> prepared the Hadoop and Hive setup from scratch on EC2.
>>
>> Thanks,
>> Chunky.
>>
>> On Wed, Nov 7, 2012 at 11:58 AM, Mark Grover <grover.markgro...@gmail.com> wrote:
>>
>>> Chunky,
>>> You should have run:
>>> msck repair table <Table name>;
>>>
>>> Sorry, I should have made it clear in my last reply. I have added an
>>> entry to the Hive wiki for the benefit of others:
>>>
>>> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Recoverpartitions
>>>
>>> Mark
>>>
>>> On Tue, Nov 6, 2012 at 9:55 PM, Chunky Gupta <chunky.gu...@vizury.com> wrote:
>>>
>>>> Hi Mark,
>>>> I didn't get any error.
>>>> I ran this on the hive console:
>>>> "msck table Table_Name;"
>>>> It said OK and showed the execution time as 1.050 sec.
>>>> But when I checked the partitions for the table using
>>>> "show partitions Table_Name;"
>>>> it didn't show me any partitions.
>>>>
>>>> Thanks,
>>>> Chunky.
>>>>
>>>> On Tue, Nov 6, 2012 at 10:38 PM, Mark Grover <grover.markgro...@gmail.com> wrote:
>>>>
>>>>> Glad to hear, Chunky.
>>>>>
>>>>> Out of curiosity, what errors did you get when using msck?
>>>>>
>>>>> On Tue, Nov 6, 2012 at 5:14 AM, Chunky Gupta <chunky.gu...@vizury.com> wrote:
>>>>>
>>>>>> Hi Mark,
>>>>>> I tried msck, but it is not working for me. I have written a python
>>>>>> script to add the partitions individually.
>>>>>>
>>>>>> Thank you Edward, Mark and Dean.
>>>>>> Chunky.
>>>>>>
>>>>>> On Mon, Nov 5, 2012 at 11:08 PM, Mark Grover <grover.markgro...@gmail.com> wrote:
>>>>>>
>>>>>>> Chunky,
>>>>>>> I have used the "recover partitions" command on EMR, and that worked
>>>>>>> fine.
>>>>>>>
>>>>>>> However, take a look at
>>>>>>> https://issues.apache.org/jira/browse/HIVE-874. Seems like the msck
>>>>>>> command in Apache Hive does the same thing. Try it out and let us know
>>>>>>> how it goes.
>>>>>>>
>>>>>>> Mark
>>>>>>>
>>>>>>> On Mon, Nov 5, 2012 at 7:56 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Recover partitions should work the same way for different file
>>>>>>>> systems.
>>>>>>>>
>>>>>>>> Edward
>>>>>>>>
>>>>>>>> On Mon, Nov 5, 2012 at 9:33 AM, Dean Wampler
>>>>>>>> <dean.wamp...@thinkbiganalytics.com> wrote:
>>>>>>>> > Writing a script to add the external partitions individually is
>>>>>>>> > the only way I know of.
>>>>>>>> >
>>>>>>>> > Sent from my rotary phone.
>>>>>>>> >
>>>>>>>> > On Nov 5, 2012, at 8:19 AM, Chunky Gupta <chunky.gu...@vizury.com> wrote:
>>>>>>>> >
>>>>>>>> > Hi Dean,
>>>>>>>> >
>>>>>>>> > Actually I had a Hadoop and Hive cluster on EMR, and I have S3
>>>>>>>> > storage containing logs which are updated daily and partitioned
>>>>>>>> > by date (dt). I was using this recover partitions command there.
>>>>>>>> > Now I want to shift to EC2 and have my own Hadoop and Hive
>>>>>>>> > cluster. So, what is the alternative to recover partitions in
>>>>>>>> > this case, if you have any idea?
>>>>>>>> > I found one way, adding the partition for each date individually,
>>>>>>>> > so I would have to write a script to do that for all dates. Is
>>>>>>>> > there any easier way other than this?
>>>>>>>> >
>>>>>>>> > Thanks,
>>>>>>>> > Chunky
>>>>>>>> >
>>>>>>>> > On Mon, Nov 5, 2012 at 6:28 PM, Dean Wampler
>>>>>>>> > <dean.wamp...@thinkbiganalytics.com> wrote:
>>>>>>>> >>
>>>>>>>> >> RECOVER PARTITIONS is an enhancement added by Amazon to their
>>>>>>>> >> version of Hive.
>>>>>>>> >>
>>>>>>>> >> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html
>>>>>>>> >>
>>>>>>>> >> <shameless-plug>
>>>>>>>> >> Chapter 21 of Programming Hive discusses this feature and other
>>>>>>>> >> aspects of using Hive in EMR.
>>>>>>>> >> </shameless-plug>
>>>>>>>> >>
>>>>>>>> >> dean
>>>>>>>> >>
>>>>>>>> >> On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta <chunky.gu...@vizury.com>
>>>>>>>> >> wrote:
>>>>>>>> >>>
>>>>>>>> >>> Hi,
>>>>>>>> >>>
>>>>>>>> >>> I have a cluster set up on EC2 with Hadoop version 0.20.2 and
>>>>>>>> >>> Hive version 0.8.1 (I configured everything). I have created a
>>>>>>>> >>> table using:
>>>>>>>> >>>
>>>>>>>> >>> CREATE EXTERNAL TABLE XXX ( YYY ) PARTITIONED BY ( ZZZ ) ROW FORMAT
>>>>>>>> >>> DELIMITED FIELDS TERMINATED BY 'WWW' LOCATION 's3://my-location/data/';
>>>>>>>> >>>
>>>>>>>> >>> Now I am trying to recover the partitions using:
>>>>>>>> >>>
>>>>>>>> >>> ALTER TABLE XXX RECOVER PARTITIONS;
>>>>>>>> >>>
>>>>>>>> >>> but I am getting this error: "FAILED: Parse Error: line 1:12
>>>>>>>> >>> cannot recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter
>>>>>>>> >>> table statement"
>>>>>>>> >>>
>>>>>>>> >>> Doing the same steps on a cluster set up on EMR with Hadoop
>>>>>>>> >>> version 1.0.3 and Hive version 0.8.1 (configured by EMR) works
>>>>>>>> >>> fine.
>>>>>>>> >>>
>>>>>>>> >>> So is this a version issue, or am I missing some configuration
>>>>>>>> >>> changes in the EC2 setup?
>>>>>>>> >>> I am not able to find an exact solution for this problem on the
>>>>>>>> >>> internet. Please help me.
>>>>>>>> >>>
>>>>>>>> >>> Thanks,
>>>>>>>> >>> Chunky.
>>>>>>>> >>>
>>>>>>>> >>
>>>>>>>> >> --
>>>>>>>> >> Dean Wampler, Ph.D.
>>>>>>>> >> thinkbiganalytics.com
>>>>>>>> >> +1-312-339-1330
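
For anyone landing on this thread later: the workaround discussed above (a script that adds each external partition individually) could look roughly like the sketch below. It is not the script Chunky wrote, which was never posted. It assumes the table is partitioned by a single `dt` column (as mentioned in the thread) and that each day's data sits under a `dt=YYYY-MM-DD/` directory below the table location; that directory layout, the function name `add_partition_statements`, and the date range are illustrative assumptions. The script only generates `ALTER TABLE ... ADD PARTITION` statements and does not talk to Hive itself.

```python
from datetime import date, timedelta


def add_partition_statements(table, base_location, start, end):
    """Yield one ALTER TABLE ... ADD PARTITION statement per day in [start, end].

    Hypothetical helper: assumes a single partition column named dt and a
    dt=YYYY-MM-DD/ directory per day under base_location.
    """
    base = base_location.rstrip("/")
    day = start
    while day <= end:
        dt = day.isoformat()  # e.g. '2012-11-05'
        yield (
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION (dt='{dt}') LOCATION '{base}/dt={dt}/';"
        )
        day += timedelta(days=1)


if __name__ == "__main__":
    # Table name and location are the placeholders used in the thread.
    for stmt in add_partition_statements(
        "XXX", "s3://my-location/data/", date(2012, 11, 5), date(2012, 11, 7)
    ):
        print(stmt)
```

One way to use it is to redirect the output to a file and run that file with `hive -f`. Generating plain statements keeps the script independent of any particular Hive client library, and `IF NOT EXISTS` is meant to make a daily re-run harmless, assuming your Hive version accepts that clause.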