Glad to hear it, Chunky.

Out of curiosity, what errors did you get when using msck?
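
For anyone who finds this thread later, here is a rough sketch of the kind of script discussed in the quoted messages: it prints one ALTER TABLE ... ADD PARTITION statement per day. It is not Chunky's actual script. The function name, the date range, and the assumption that each day's logs live in a "dt=YYYY-MM-DD" directory under the table location are all hypothetical; XXX and s3://my-location/data/ are the placeholders from the original post.

```python
from datetime import date, timedelta


def add_partition_statements(table, location, start, end, key="dt"):
    """Yield one ADD PARTITION statement per day in [start, end].

    Assumes (hypothetically) that each day's data sits in a directory
    named <key>=<YYYY-MM-DD> directly under the table location.
    """
    base = location.rstrip("/")
    day = start
    while day <= end:
        value = day.isoformat()  # e.g. "2012-11-05"
        yield (
            f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION ({key}='{value}') "
            f"LOCATION '{base}/{key}={value}/';"
        )
        day += timedelta(days=1)


if __name__ == "__main__":
    # Placeholder table name and location taken from the original post;
    # the date range is made up for illustration.
    for statement in add_partition_statements(
        "XXX", "s3://my-location/data/", date(2012, 11, 1), date(2012, 11, 3)
    ):
        print(statement)
```

The output can be redirected to a file and run with "hive -f <file>". If the directories are laid out differently, only the LOCATION string needs to change.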

On Tue, Nov 6, 2012 at 5:14 AM, Chunky Gupta <chunky.gu...@vizury.com> wrote:

> Hi Mark,
> I tried msck, but it did not work for me. I have written a Python script
> to add the partitions individually.
>
> Thank you Edward, Mark and Dean.
> Chunky.
>
>
> On Mon, Nov 5, 2012 at 11:08 PM, Mark Grover
> <grover.markgro...@gmail.com> wrote:
>
>> Chunky,
>> I have used the "recover partitions" command on EMR, and that worked fine.
>>
>> However, take a look at https://issues.apache.org/jira/browse/HIVE-874. It
>> seems the msck command in Apache Hive does the same thing. Try it out and
>> let us know how it goes.
>>
>> Mark
>>
>> On Mon, Nov 5, 2012 at 7:56 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>>
>>> Recover partitions should work the same way for different file systems.
>>>
>>> Edward
>>>
>>> On Mon, Nov 5, 2012 at 9:33 AM, Dean Wampler
>>> <dean.wamp...@thinkbiganalytics.com> wrote:
>>> > Writing a script to add the external partitions individually is the
>>> > only way I know of.
>>> >
>>> > Sent from my rotary phone.
>>> >
>>> >
>>> > On Nov 5, 2012, at 8:19 AM, Chunky Gupta <chunky.gu...@vizury.com> wrote:
>>> >
>>> > Hi Dean,
>>> >
>>> > I previously ran a Hadoop and Hive cluster on EMR, with S3 storage
>>> > containing logs that are updated daily and partitioned by date (dt),
>>> > and I was using RECOVER PARTITIONS there.
>>> > Now I want to move to EC2 and run my own Hadoop and Hive cluster. What
>>> > is the alternative to RECOVER PARTITIONS in this case, if you have any
>>> > idea?
>>> > I found one way, which is to add the partition for each date
>>> > individually, so I would have to write a script to do that for all
>>> > dates. Is there an easier way than this?
>>> >
>>> > Thanks,
>>> > Chunky
>>> >
>>> >
>>> >
>>> > On Mon, Nov 5, 2012 at 6:28 PM, Dean Wampler
>>> > <dean.wamp...@thinkbiganalytics.com> wrote:
>>> >>
>>> >> RECOVER PARTITIONS is an enhancement added by Amazon to their version
>>> >> of Hive.
>>> >>
>>> >>
>>> >>
>>> >> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html
>>> >>
>>> >> <shameless-plug>
>>> >>   Chapter 21 of Programming Hive discusses this feature and other
>>> >>   aspects of using Hive in EMR.
>>> >> </shameless-plug>
>>> >>
>>> >> dean
>>> >>
>>> >>
>>> >> On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta <chunky.gu...@vizury.com>
>>> >> wrote:
>>> >>>
>>> >>> Hi,
>>> >>>
>>> >>> I have a cluster set up on EC2 with Hadoop version 0.20.2 and Hive
>>> >>> version 0.8.1 (I configured everything myself). I created a table
>>> >>> using:
>>> >>>
>>> >>> CREATE EXTERNAL TABLE XXX ( YYY )
>>> >>> PARTITIONED BY ( ZZZ )
>>> >>> ROW FORMAT DELIMITED FIELDS TERMINATED BY 'WWW'
>>> >>> LOCATION 's3://my-location/data/';
>>> >>>
>>> >>> Now I am trying to recover the partitions using:
>>> >>>
>>> >>> ALTER TABLE XXX RECOVER PARTITIONS;
>>> >>>
>>> >>> but I am getting this error: "FAILED: Parse Error: line 1:12 cannot
>>> >>> recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table
>>> >>> statement"
>>> >>>
>>> >>> The same steps work fine on a cluster set up on EMR with Hadoop
>>> >>> version 1.0.3 and Hive version 0.8.1 (configured by EMR).
>>> >>>
>>> >>> So is this a version issue, or am I missing some configuration
>>> >>> changes in the EC2 setup?
>>> >>> I have not been able to find an exact solution to this problem on the
>>> >>> internet. Please help me.
>>> >>>
>>> >>> Thanks,
>>> >>> Chunky.
>>> >>>
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Dean Wampler, Ph.D.
>>> >> thinkbiganalytics.com
>>> >> +1-312-339-1330
>>> >>
>>> >>
>>> >
>>>
>>
>>
>
