We are building an application that involves chains of M/R jobs, most likely 
all written in Hive.  We need to start a Hive job when one or more 
prerequisite data sets appear (defined in the Hive sense as a new partition 
having been populated with data) OR when a particular time has been reached.
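
To make the trigger condition concrete, here is a rough sketch in Python of 
the rule we have in mind.  It is only an illustration: the function name, the 
partition names, and the idea of passing in the set of already-populated 
partitions are all assumptions made for this example, not part of Hive, Oozie, 
or Pentaho.

```python
from datetime import datetime


def should_start(required, available, now, deadline):
    """Decide whether a downstream Hive job may start (hypothetical helper).

    required  -- names of the prerequisite partitions
    available -- names of the partitions already populated with data
    now       -- current time
    deadline  -- time at which the job starts regardless of the data
    """
    # Data condition: every prerequisite partition has been populated.
    data_ready = set(required) <= set(available)
    # Time condition: the deadline has been reached.
    timed_out = now >= deadline
    return data_ready or timed_out


# Hypothetical partition names, for illustration only.
required = {"clicks/dt=2011-06-01", "views/dt=2011-06-01"}
available = {"clicks/dt=2011-06-01"}
deadline = datetime(2011, 6, 2, 3, 0)

early = should_start(required, available, datetime(2011, 6, 2, 1, 0), deadline)
late = should_start(required, available, datetime(2011, 6, 2, 3, 0), deadline)
```

In this sketch `early` is False because one partition is still missing and the 
deadline has not passed, while `late` is True because the deadline has been 
reached.  How the set of populated partitions is discovered is exactly the 
part we hope a scheduler would handle for us.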

We know of two scheduling packages that appear to solve this problem: Oozie and 
Pentaho (for which my company has a license).

Does anyone have actual experience using either of these (or something else) to 
schedule Hive jobs?

William Kornfeld
Baynote
