Yes, of course.  I already feel a bit less intelligent for having asked the
question ;-)

The status now is that I managed to puzzle it all together.  Copying the
files from S3 to an ephemeral volume takes all of 2 seconds, so it's really
not an issue.  The cluster starts, and our fat jar and the Apache Hop
MainBeam class are found and started.

The only thing that remains is figuring out how to configure the Flink
cluster itself.  I have a couple of m5.large EC2 instances in a node group
on EKS and I set taskmanager.numberOfTaskSlots to "4".  However, the tasks
in the pipeline can't seem to find resources to start.

Caused by:
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
Slot request bulk is not fulfillable! Could not allocate the required slot
within slot request timeout

Parallelism was set to 1 for the runner and there are only 2 tasks in my
first Beam pipeline, so it should be simple enough, but it just times out.
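
To check my own reasoning, here is the slot arithmetic as I understand it,
as a rough Python sketch.  It assumes Flink's default slot sharing, where a
job needs as many slots as its highest operator parallelism.  The helper
names are mine and the TaskManager counts are hypothetical; I have not yet
verified how many TaskManagers actually registered.

```python
# Back-of-the-envelope slot check. Assumes default slot sharing, so the
# job needs as many slots as its highest operator parallelism.

def slots_needed(max_parallelism: int) -> int:
    """Slots a job requires when all operators share the default slot group."""
    return max_parallelism


def slots_offered(registered_taskmanagers: int, slots_per_taskmanager: int) -> int:
    """Slots the cluster can offer: only registered TaskManagers count."""
    return registered_taskmanagers * slots_per_taskmanager


needed = slots_needed(1)      # runner parallelism from this mail
slots_per_tm = 4              # taskmanager.numberOfTaskSlots from this mail

# Hypothetical TaskManager counts, to see which case matches the timeout.
for taskmanagers in (0, 1, 2):
    offered = slots_offered(taskmanagers, slots_per_tm)
    verdict = "fits" if offered >= needed else "cannot be scheduled"
    print(f"{taskmanagers} TaskManager(s): {offered} slot(s) offered, "
          f"{needed} needed: {verdict}")
```

If that arithmetic holds, a single registered TaskManager would already be
enough, so my suspicion is that no TaskManager is registering with the
JobManager at all, rather than the slot count being too low.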

The next step for me is to document the result, which will end up on
hop.apache.org.  I'll probably also want to demo this in Austin at the
upcoming Beam summit.

Thanks a lot for your time and help so far!

Cheers,
Matt
