Hey there!
You are correct that this is focused on the higher-level API, but it
doesn't preclude using the lower-level API. In fact, I was at the same
point you are not long ago and had a very productive conversation on the
list: look for "Question about custom StreamJob/Factory" in the list
archive from the past couple of months.
I'll quote Jagadish Venkatraman from that thread:
"For the section on the low-level API, can you use
LocalApplicationRunner#runTask()? It basically creates a new
StreamProcessor and runs it. Remember to provide task.class and set it
to your implementation of StreamTask or AsyncStreamTask. Please note
that this is an evolving API and hence, subject to change."
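
To make that concrete, here is a rough sketch of what a minimal low-level
launcher could look like. The task class, job name, and ZooKeeper address
are placeholders you'd swap for your own, and since the API is evolving
(as Jagadish notes) the exact config keys and method names may shift
between releases, so treat this as a starting point rather than gospel:

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.samza.config.Config;
    import org.apache.samza.config.MapConfig;
    import org.apache.samza.runtime.LocalApplicationRunner;

    public class LowLevelTaskLauncher {
      public static void main(String[] args) {
        Map<String, String> cfg = new HashMap<>();

        // Point task.class at your existing StreamTask or AsyncStreamTask.
        // "com.example.MyStreamTask" is a placeholder.
        cfg.put("task.class", "com.example.MyStreamTask");
        cfg.put("job.name", "my-low-level-job");

        // Use the standalone ZooKeeper-based coordinator instead of YARN,
        // so the job runs inside a plain JVM/container.
        cfg.put("job.coordinator.factory",
            "org.apache.samza.zk.ZkJobCoordinatorFactory");
        cfg.put("job.coordinator.zk.connect", "zookeeper:2181"); // placeholder

        // ...plus your usual system, serde, and input stream configs...

        Config config = new MapConfig(cfg);
        LocalApplicationRunner runner = new LocalApplicationRunner(config);
        runner.runTask();       // builds a StreamProcessor around task.class
        runner.waitForFinish(); // block until the processor shuts down
      }
    }

The nice part is that this is just an ordinary main() in your own JVM, so
it can go straight into a Docker image and be scaled by adding replicas,
with the ZooKeeper-based coordinator distributing partitions across the
running instances.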
I ended up just switching to the high-level API because I don't have any
existing Tasks and the Kubernetes story is a little more straightforward
there (there's only one container/configuration to deploy).
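
For what it's worth, the high-level path is the same runner wiring, just
run(app) instead of runTask(); MyStreamApplication below is a stand-in
for whatever implements StreamApplication in your own code:

    // Same Config/LocalApplicationRunner setup as in the sketch above,
    // minus task.class. MyStreamApplication is a placeholder for your own
    // org.apache.samza.application.StreamApplication implementation.
    LocalApplicationRunner runner = new LocalApplicationRunner(config);
    runner.run(new MyStreamApplication());
    runner.waitForFinish();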
Best,
Tom
Thunder Stumpges <tstump...@ntent.com> writes:
Hi all,
We are using Samza (0.12.0) in about two dozen jobs implementing several processing pipelines. We have also begun a
significant move of other services within our company to Docker/Kubernetes. Right now our Hadoop/YARN cluster has a
mix of streaming and batch "MapReduce" jobs (many reporting and other batch processing jobs). We would really like to
move our stream processing off of Hadoop/YARN and onto Kubernetes.
When I read about some of the new progress in 0.13 and 0.14, I got really excited! We would love to have our jobs
run as simple libraries in our own JVM and use the Kafka high-level consumer for partition distribution and
such. This would let us dockerize our application and run/scale it in Kubernetes.
However, as I read it, this new deployment model is ONLY for the new(er) high-level API, correct? Are there plans
and/or resources for adapting this back to existing low-level tasks? How complicated a task is that? Do I have any
other options to make this transition easier?
Thanks in advance.
Thunder