> I want to know if there are any accepted patterns or best practices for
> this?
http://oozie.apache.org/

> New partitions will be added regularly

What type of partitions are you adding? Why frequently?

Sean

On 1/10/13 3:03 PM, "Tom Brown" <tombrow...@gmail.com> wrote:

>All,
>
>I want to automate jobs against Hive (using an external table with
>ever-growing partitions), and I'm running into a few challenges:
>
>Concurrency - If I run Hive as a thrift server, I can only safely run
>one job at a time. As such, it seems like my best bet will be to run
>it from the command line and set up a brand new instance for each job.
>That's quite a bit of a hassle to solve a seemingly common problem, so
>I want to know if there are any accepted patterns or best practices
>for this?
>
>Partition management - New partitions will be added regularly. If I
>have to set up multiple instances of Hive for each (potentially)
>overlapping job, it will be difficult to keep track of the partitions
>that have been added. In the context of the preceding question, what
>is the best way to add metadata about new partitions?
>
>Thanks in advance!
>
>--Tom