Re: Question about custom StreamJob/Factory

2018-02-05 Thread Jagadish Venkatraman
>> When using a KVSerde on the input stream, deserializing messages always failed. I was producing messages with the console producer so maybe I just did it wrong? From the stack-trace, it looks like you are getting a "BufferUnderflowException". This is usually caused when Samza tries to read 4-by

Re: Question about custom StreamJob/Factory

2018-02-04 Thread Tom Davis
Thanks for all the follow-up, it definitely helped solidify my understanding of things a bit. Given that, I set about confirming for myself that running multiple instances of the StreamApplication in different Pods, provided an appropriate Partition count in Kafka, would result in different Contai

Re: Question about custom StreamJob/Factory

2018-01-31 Thread Jagadish Venkatraman
Hey Tom, >> As promised, here's the link to the repository: https://github.com/sonian/samza-kubernetes I just reviewed your repo for Kubernetes integration. Really nice work on integrating the high-level API and Kubernetes with the ZkJobCoordinator!! Were you also able to spawn multiple instanc

Re: Question about custom StreamJob/Factory

2018-01-31 Thread Jagadish Venkatraman
>> Thanks for the timely and thorough reply! Based on your explanation, it sounds like when using the high-level API I don't need to go through the JobRunner or `run-job.sh` at all -- is that correct? Your understanding is right. You can just run multiple instances of the run-app.sh script. You ca

Re: Question about custom StreamJob/Factory

2018-01-28 Thread Tom Davis
As promised, here's the link to the repository: https://github.com/sonian/samza-kubernetes The section "Your Job Image" covers my remaining questions on the low-level API. We use Clojure on the backend, so I'm using that to sanity-check the example high-level app and will update the example if i

Re: Question about custom StreamJob/Factory

2018-01-27 Thread Tom Davis
Thanks for the timely and thorough reply! Based on your explanation, it sounds like when using the high-level API I don't need to go through the JobRunner or `run-job.sh` at all -- is that correct? I can simply run as many instances of, e.g., `WikipediaZkLocalApplication` as I want and Samza will

Re: Question about custom StreamJob/Factory

2018-01-27 Thread Jagadish Venkatraman
+Yi Hi Tom, Thank you for your feedback on Samza's architecture. Pluggability has been a differentiator that has enabled us to support a wide range of use-cases - from stand-alone deployments to managed services, from streaming to batch inputs and integrations with various systems from Kafka, Kin

Question about custom StreamJob/Factory

2018-01-27 Thread Tom Davis
Hi there! First off, thanks for the continued work on Samza -- I looked into many DC/stream processors and Samza was a real standout with its smart architecture and pluggable design. I'm working on a custom StreamJob/Factory for running Samza jobs on Kubernetes. Those two classes are pretty strai