Hi, this is my first mail to this group. I am writing to get a fair understanding of how Zeppelin can be integrated with Spark.

Our use case is to load a few tables from a database into Spark and run some transformations. Once that is done, we want to expose the data through Zeppelin for analytics. I have a few questions around this, mainly to sound out any gross architectural flaws:

1. How does Zeppelin connect to Spark? Thrift JDBC? How is that different from the Spark JDBC server?

2. What is the scope of a Spark application when it is used from Zeppelin? For example, if I run a sequence of operations in Zeppelin such as map, filter, reduceByKey, filter, collect, I assume this translates to an application that gets submitted to Spark. But if I later want to reuse part of the data from that earlier application (for example, the result of the first map transformation), can I? Or will that be another application and another spark-submit? In our use case the data will already be loaded into RDDs, so how can Zeppelin access them? Is that even possible?

3. How can I restrict access to specific RDDs for specific users in Zeppelin (assuming we have implemented some login mechanism in Zeppelin and have a mapping between Zeppelin users and their LDAP accounts)? Is that even possible?

I would appreciate any help/pointers/guidance.
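To make question 2 concrete, here is a plain-Python sketch of the transformation chain I have in mind (no Spark; the data, keys, and thresholds are made up). In Spark each step would be an RDD operation, and whether the intermediate results can be cached and reused by later Zeppelin paragraphs is exactly what I am asking about:

```python
# Plain-Python analogue of: map -> filter -> reduceByKey -> filter -> collect.
# All data here is hypothetical. In Spark, persisting an intermediate RDD
# (e.g. rdd.cache()) is what would allow a later paragraph in the same
# interpreter session to reuse it without recomputation.
from collections import defaultdict

records = ["a,1", "b,2", "a,3", "c,4", "b,5"]

# map: parse "key,value" strings into (key, int) pairs
mapped = [(k, int(v)) for k, v in (r.split(",") for r in records)]

# filter: keep only values greater than 1
filtered = [(k, v) for k, v in mapped if v > 1]

# reduceByKey: sum the values per key
sums = defaultdict(int)
for k, v in filtered:
    sums[k] += v

# filter again: keep keys whose total exceeds 4
result = {k: v for k, v in sums.items() if v > 4}

# collect: in Spark this would materialize the result on the driver
print(sorted(result.items()))  # → [('b', 7)]
```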