For JDBC to work, start spark-submit with the appropriate JDBC driver jars (using --jars); the driver will then be available on the executors.
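For reference, the invocation might look something like the following (the jar path, connector version, main class, and application jar are all placeholders):

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --jars /path/to/mysql-connector-java-5.1.40.jar \
      --class com.example.MysqlDumpJob \
      my-job.jar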
For acquiring connections, create a singleton connection per executor. I think the DataFrame JDBC reader (sqlContext.read.jdbc) already takes care of this.

Finally, if you want multiple MySQL tables to be accessed in a single Spark job, you can create a list of tables and run a map over that list. Something like:

    def getTable(tableName: String): DataFrame
    def saveTable(df: DataFrame): Unit

    val tables = sc.parallelize(<list of tables>)
    tables.map(getTable).map(saveTable)

(A fuller sketch of this idea, with its assumptions spelled out, follows after the quoted thread below.)

On Wed, Mar 22, 2017 at 9:41 AM, Shashank Mandil <mandil.shash...@gmail.com> wrote:

> I am using Spark to dump data from MySQL into HDFS.
> The way I am doing this is by creating a Spark dataframe with the metadata
> of the different MySQL tables to dump from multiple MySQL hosts, and then
> running a map over that data frame to dump each MySQL table's data into
> HDFS inside the executor.
>
> The reason I want a Spark context is that I would like to use Spark JDBC
> to read the MySQL table and then the Spark writer to write to HDFS.
>
> Thanks,
> Shashank
>
> On Tue, Mar 21, 2017 at 3:37 PM, ayan guha <guha.a...@gmail.com> wrote:
>
>> What is your use case? I am sure there must be a better way to solve
>> it....
>>
>> On Wed, Mar 22, 2017 at 9:34 AM, Shashank Mandil <
>> mandil.shash...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I am using Spark in yarn-cluster mode.
>>> When I run a YARN application it creates multiple executors on the
>>> Hadoop datanodes for processing.
>>>
>>> Is it possible for me to create a local Spark context (master=local) on
>>> these executors to be able to get a Spark context?
>>>
>>> Theoretically, since each executor is a Java process, this should be
>>> doable, isn't it?
>>>
>>> Thanks,
>>> Shashank
>>>
>>
>>
>> --
>> Best Regards,
>> Ayan Guha
>
>

--
Best Regards,
Ayan Guha
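P.S. Here is the fuller sketch of the getTable/saveTable idea referenced above. It keeps the table list as a plain Scala collection on the driver (the DataFrame reads and writes are issued from the driver, and Spark distributes the actual work), and it assumes a sqlContext is already in scope (e.g. in spark-shell). The JDBC URL, credentials, output path, and table names are all placeholders:

    import java.util.Properties
    import org.apache.spark.sql.DataFrame

    // Placeholders: adjust the URL, credentials, output path, and table list.
    val jdbcUrl = "jdbc:mysql://mysql-host:3306/mydb"
    val outputBase = "hdfs:///data/mysql-dump"
    val connProps = new Properties()
    connProps.setProperty("user", "dbuser")
    connProps.setProperty("password", "dbpassword")

    // Read one MySQL table into a DataFrame via the JDBC reader.
    def getTable(tableName: String): DataFrame =
      sqlContext.read.jdbc(jdbcUrl, tableName, connProps)

    // Write a DataFrame to HDFS as Parquet, one directory per table.
    def saveTable(tableName: String, df: DataFrame): Unit =
      df.write.parquet(s"$outputBase/$tableName")

    val tables = Seq("table_a", "table_b")   // placeholder table list
    tables.foreach(t => saveTable(t, getTable(t)))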