Spark can benefit from data locality and will try to launch tasks on the node where the Kafka partition's leader resides. However, I think that in production many organizations run a dedicated Kafka cluster, in which case executors and brokers do not share hosts and that locality benefit does not apply. A rough sketch of the relevant setting follows below.
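In the newer spark-streaming-kafka-0-10 integration the placement preference is chosen explicitly via a LocationStrategy. A minimal sketch, assuming that integration; the broker addresses, group id, and topic name below are placeholder values:

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object KafkaLocalitySketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-locality-sketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Placeholder Kafka connection settings.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "kafka1:9092,kafka2:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "locality-demo"
    )

    // PreferBrokers schedules each partition's tasks on the host running that
    // partition's leader broker, which only helps when executors and brokers
    // share nodes. With a dedicated Kafka cluster, PreferConsistent simply
    // distributes partitions evenly across the available executors.
    val locationStrategy = LocationStrategies.PreferConsistent

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      locationStrategy,
      // "events" is a placeholder topic name.
      ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams)
    )

    stream.map(record => (record.key, record.value)).print()

    ssc.start()
    ssc.awaitTermination()
  }
}

Swapping PreferConsistent for LocationStrategies.PreferBrokers only pays off when the executors actually run on the broker hosts, which is the co-location question raised in the original message below.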
On Sat, Feb 6, 2016 at 11:27 PM, Diwakar Dhanuskodi <diwakar.dhanusk...@gmail.com> wrote:
> Yes. To reduce network latency.
>
> Sent from Samsung Mobile.
>
> -------- Original message --------
> From: fanooos <dev.fano...@gmail.com>
> Date: 07/02/2016 09:24 (GMT+05:30)
> To: user@spark.apache.org
> Subject: Apache Spark data locality when integrating with Kafka
>
> Dears,
>
> If I use Kafka as a streaming source for some Spark jobs, is it advisable
> to install Spark on the same nodes as the Kafka cluster?
>
> What are the benefits and drawbacks of such a decision?
>
> Regards