Re: Best practice for automating jobs

Alexander Alten-Lorenz Thu, 10 Jan 2013 23:24:07 -0800

+1

Oozie is the best solution for automating jobs like these.
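
On the partition-metadata question, here is a rough sketch of what each scheduled job could run, not a definitive recipe. It assumes a table called `events` partitioned by a string column `dt` with data under `/data/events/`; those names are made up for illustration. `IF NOT EXISTS` keeps the statement safe to re-run when jobs overlap, and `hive -e` starts a fresh CLI process per job instead of sharing one Thrift server.

```python
import shlex


def add_partition_sql(table, dt, location):
    """Build an idempotent ADD PARTITION statement for an external table.

    The partition column name `dt` is a hypothetical choice for this sketch.
    """
    # IF NOT EXISTS makes the statement a no-op if another job already
    # registered the same partition.
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (dt='{dt}') LOCATION '{location}'"
    )


def hive_cli_command(sql):
    """Return the argv for a one-shot Hive CLI invocation (hive -e '<sql>')."""
    return ["hive", "-e", sql]


if __name__ == "__main__":
    # Hypothetical table name and HDFS path, for illustration only.
    sql = add_partition_sql("events", "2013-01-10", "/data/events/dt=2013-01-10")
    # Print the shell command rather than executing it, so the sketch is
    # safe to run on a machine without Hive installed.
    print(" ".join(shlex.quote(arg) for arg in hive_cli_command(sql)))
    # To actually run it: subprocess.check_call(hive_cli_command(sql))
```

A scheduler step would call this once per new partition before the query that reads it, so the partition bookkeeping lives in the metastore and not in whichever Hive instance happened to run the job.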


cheers,
 Alex

On Jan 10, 2013, at 11:11 PM, Sean McNamara <sean.mcnam...@webtrends.com> wrote:

>> I want to know if there are any accepted patterns or best practices for
>> this?
> 
> http://oozie.apache.org/
> 
> 
> 
>> New partitions will be added regularly
> 
> What type of partitions are you adding? Why frequently?
> 
> 
> 
> 
> Sean
> 
> 
> On 1/10/13 3:03 PM, "Tom Brown" <tombrow...@gmail.com> wrote:
> 
>> All,
>> 
>> I want to automate jobs against Hive (using an external table with
>> ever growing partitions), and I'm running into a few challenges:
>> 
>> Concurrency - If I run Hive as a thrift server, I can only safely run
>> one job at a time. As such, it seems like my best bet will be to run
>> it from the command line and set up a brand new instance for each job.
>> That's quite a bit of a hassle to solve a seemingly common problem, so
>> I want to know if there are any accepted patterns or best practices
>> for this?
>> 
>> Partition management - New partitions will be added regularly. If I
>> have to set up multiple instances of Hive for each (potentially)
>> overlapping job, it will be difficult to keep track of the partitions
>> that have been added. In the context of the preceding question, what
>> is the best way to add metadata about new partitions?
>> 
>> Thanks in advance!
>> 
>> --Tom
> 

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF
