Thanks for the good advice and input, everyone!
While we were talking, Pinterest open-sourced their workflow engine:
http://engineering.pinterest.com/post/113376157699/open-sourcing-pinball.
It looks similar to Luigi in terms of architecture.
My current plan is to use Aurora in the manner described in
> Date: Wednesday, March 11, 2015 at 3:21 PM
> To: "dev@aurora.incubator.apache.org" <dev@aurora.incubator.apache.org>
> Subject: Re: Data processing pipeline workflow management
>
> >Hey,
> >
> >This is a great question. See my comments inline below.
Hey,
This is a great question. See my comments inline below.
On Tue, Mar 10, 2015 at 8:28 AM, Lars Albertsson wrote:
> We are evaluating Aurora as a workflow management tool for batch
> processing pipelines. We basically need a tool that regularly runs
> batch processes that are connected as producers/consumers of data,
> typically stored in HDFS or S3.
I'm afraid in general the use cases you describe are not things that Aurora
currently intends to fulfill. Though, that's not to say that you could not
do this on top of Aurora if you wanted to.
> Does anyone have experience with building workflows with Aurora?
I do not. I could opine about how o
We are evaluating Aurora as a workflow management tool for batch
processing pipelines. We basically need a tool that regularly runs
batch processes that are connected as producers/consumers of data,
typically stored in HDFS or S3.
The alternative tools would be Azkaban, Luigi, and Oozie, but I am