Hi Sachin,

Some quick answers, and a link to some documentation to read more:
- If you restart the application, it will resume from the point where it crashed (possibly reprocessing a small window of records).
- You can run more than one instance of the application. They coordinate by virtue of being part of a Kafka consumer group; if one instance crashes, the partitions it was reading from are picked up by the other instances.
- When running more than one instance, the tasks are distributed among the instances.

Confluent's docs on the Kafka Streams architecture go into a lot more detail:
http://docs.confluent.io/3.0.0/streams/architecture.html

On Thu, Dec 8, 2016 at 9:05 PM, Sachin Mittal <sjmit...@gmail.com> wrote:
> Hi All,
> We were able to run a stream processing application against a fairly
> decent load of messages in a production environment.
>
> To make the system robust, say the stream processing application crashes:
> is there a way to make it auto-start from the point when it crashed?
>
> Also, is there any concept like running the same application in a cluster,
> where if one fails, another takes over until we bring the failed streams
> application node back up?
>
> If yes, are there any guidelines or a knowledge base we can look at to
> understand how this would work?
>
> Is there anything like in Spark, where the driver program distributes the
> tasks across various nodes in a cluster? Is there something similar in
> Kafka Streams too?
>
> Thanks
> Sachin
>
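To make the above concrete: none of this needs extra code, it falls out of configuration. Every instance is started with the same application.id, which puts them in one consumer group; Kafka Streams then handles partition assignment, task distribution, and failover. A minimal sketch of the relevant settings (the broker address and state directory are placeholders, not values from your setup):

```properties
# Shared by every instance: instances with the same application.id
# form one consumer group and divide the input partitions between them.
application.id=my-streams-app

# Placeholder broker address
bootstrap.servers=broker1:9092

# Optional: keep standby copies of local state on other instances so a
# failed instance's tasks can resume on another node with less catch-up.
num.standby.replicas=1

# Local directory for state stores. State is also backed by changelog
# topics in Kafka, which is why a restarted instance resumes from
# (roughly) where it crashed rather than reprocessing everything.
state.dir=/var/lib/kafka-streams
```

Restarting a crashed instance, or simply launching the same jar on another machine with this config, is all the "cluster" setup Kafka Streams requires.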