Fanoos, where do you want the solution to be deployed? On-premise or cloud?
Regards,
Diwakar

Sent from Samsung Mobile.

-------- Original message --------
From: "Yuval.Itzchakov" <yuva...@gmail.com>
Date: 07/02/2016 19:38 (GMT+05:30)
To: user@spark.apache.org
Subject: Re: Apache Spark data locality when integrating with Kafka

I would definitely try to avoid hosting Kafka and Spark on the same servers. Kafka and Spark do a lot of IO between them, so you'll want to maximize those resources rather than share them on one machine. Each Kafka broker should be on a dedicated server, as should your Spark master and workers. If you're hosting them on Amazon EC2 instances, put them in the same availability zone so you benefit from the low latency within that zone. If you're on dedicated servers, consider creating a VPC between the two clusters so you can, again, benefit from low IO latency and high throughput.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-data-locality-when-integrating-with-Kafka-tp26165p26170.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
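As a rough illustration of the separation described above, the two clusters might be configured along these lines. This is only a sketch: all hostnames (`kafka-1.internal`, `spark-master.internal`, etc.) and the availability-zone choice are hypothetical, not taken from the thread.

```properties
# --- Kafka: server.properties on each dedicated broker host ---
# One broker per server (kafka-1.internal, kafka-2.internal, ...),
# all launched in the same EC2 availability zone as the Spark cluster
# (hypothetical example: us-east-1a).
broker.id=1
listeners=PLAINTEXT://kafka-1.internal:9092
log.dirs=/var/kafka/logs

# --- Spark: spark-defaults.conf on the driver host ---
# Master and workers run on their own servers, separate from the brokers.
spark.master    spark://spark-master.internal:7077
```

A streaming job would then reach Kafka over the network rather than a co-located process, e.g. by setting `kafka.bootstrap.servers` to `kafka-1.internal:9092,kafka-2.internal:9092` in its source options; keeping both clusters in one availability zone (or one VPC) keeps that hop cheap.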