Can you provide the JDBC connector JAR version, and if possible the full JAR name and the full command you ran Spark with?
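If it is easier than digging out the file name, a minimal sketch along these lines could be pasted into the shell to report which JAR a class was loaded from and which class loader loaded it. Note that `describeClass` is a hypothetical helper written for this thread, not a Spark API, and it relies only on standard JDK reflection calls.

```scala
import scala.util.Try

// Hypothetical helper (not part of Spark): reports where a class was loaded from.
def describeClass(className: String): String =
  Try(Class.forName(className)).map { cls =>
    // getCodeSource is null for classes loaded by the bootstrap class loader
    val location = Option(cls.getProtectionDomain.getCodeSource)
      .flatMap(cs => Option(cs.getLocation))
      .map(_.toString)
      .getOrElse("<bootstrap or unknown>")
    // getClassLoader is also null for bootstrap classes
    val loader = Option(cls.getClassLoader).map(_.toString).getOrElse("<bootstrap>")
    s"$className | jar=$location | loader=$loader"
  }.getOrElse(s"$className | not found on this class path")

// Class name taken from the original message
println(describeClass("net.sourceforge.jtds.jdbc.Driver"))
```

The printed location should include the JAR file name, and the loader field shows whether the class came from the application class loader or from a different one.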

On Wed, Apr 15, 2015 at 11:27 AM, Nathan McCarthy <nathan.mccar...@quantium.com.au> wrote:

> Just an update, tried with the old JdbcRDD and that worked fine.
>
> From: Nathan <nathan.mccar...@quantium.com.au>
> Date: Wednesday, 15 April 2015 1:57 pm
> To: "user@spark.apache.org" <user@spark.apache.org>
> Subject: SparkSQL JDBC Datasources API when running on YARN - Spark 1.3.0
>
> Hi guys,
>
> Trying to use a Spark SQL context’s .load("jdbc", …) method to create a
> DF from a JDBC data source. All seems to work well locally (master =
> local[*]), however as soon as we try and run on YARN we have problems.
>
> We seem to be running into problems with the class path and loading up
> the JDBC driver. I’m using the jTDS 1.3.1 driver,
> net.sourceforge.jtds.jdbc.Driver.
>
>     ./bin/spark-shell --jars /tmp/jtds-1.3.1.jar --master yarn-client
>
> When trying to run I get an exception;
>
>     scala> sqlContext.load("jdbc", Map("url" ->
>       "jdbc:jtds:sqlserver://blah:1433/MyDB;user=usr;password=pwd", "dbtable" ->
>       "CUBE.DIM_SUPER_STORE_TBL"))
>
>     java.sql.SQLException: No suitable driver found for
>     jdbc:jtds:sqlserver://blah:1433/MyDB;user=usr;password=pwd
>
> Thinking maybe we need to force load the driver, if I supply
> "driver" -> "net.sourceforge.jtds.jdbc.Driver" to .load we get;
>
>     scala> sqlContext.load("jdbc", Map("url" ->
>       "jdbc:jtds:sqlserver://blah:1433/MyDB;user=usr;password=pwd", "driver" ->
>       "net.sourceforge.jtds.jdbc.Driver", "dbtable" ->
>       "CUBE.DIM_SUPER_STORE_TBL"))
>
>     java.lang.ClassNotFoundException: net.sourceforge.jtds.jdbc.Driver
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:191)
>         at org.apache.spark.sql.jdbc.DefaultSource.createRelation(JDBCRelation.scala:97)
>         at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:290)
>         at org.apache.spark.sql.SQLContext.load(SQLContext.scala:679)
>         at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:21)
>
> Yet if I run a Class.forName() just from the shell;
>
>     scala> Class.forName("net.sourceforge.jtds.jdbc.Driver")
>     res1: Class[_] = class net.sourceforge.jtds.jdbc.Driver
>
> No problem finding the JAR. I’ve tried in both the shell, and running
> with spark-submit (packing the driver in with my application as a fat JAR).
> Nothing seems to work.
>
> I can also get a connection in the driver/shell no problem;
>
>     scala> import java.sql.DriverManager
>     import java.sql.DriverManager
>     scala> DriverManager.getConnection("jdbc:jtds:sqlserver://blah:1433/MyDB;user=usr;password=pwd")
>     res3: java.sql.Connection = net.sourceforge.jtds.jdbc.JtdsConnection@2a67ecd0
>
> I’m probably missing some class path setting here. In
> *jdbc.DefaultSource.createRelation* it looks like the call to
> *Class.forName* doesn’t specify a class loader so it just uses the
> default Java behaviour to reflectively get the class loader. It almost
> feels like its using a different class loader.
>
> I also tried seeing if the class path was there on all my executors by
> running;
>
>     import scala.collection.JavaConverters._
>
>     sc.parallelize(Seq(1,2,3,4)).flatMap(_ => java.sql.DriverManager.
>       getDrivers().asScala.map(d => s"$d | ${d.acceptsURL(
>       "jdbc:jtds:sqlserver://blah:1433/MyDB;user=usr;password=pwd")}"
>       )).collect().foreach(println)
>
> This successfully returns;
>
>     15/04/15 01:07:37 INFO scheduler.DAGScheduler: Job 0 finished: collect
>     at Main.scala:46, took 1.495597 s
>     org.apache.derby.jdbc.AutoloadedDriver40 | false
>     com.mysql.jdbc.Driver | false
>     net.sourceforge.jtds.jdbc.Driver | true
>     org.apache.derby.jdbc.AutoloadedDriver40 | false
>     com.mysql.jdbc.Driver | false
>     net.sourceforge.jtds.jdbc.Driver | true
>     org.apache.derby.jdbc.AutoloadedDriver40 | false
>     com.mysql.jdbc.Driver | false
>     net.sourceforge.jtds.jdbc.Driver | true
>     org.apache.derby.jdbc.AutoloadedDriver40 | false
>     com.mysql.jdbc.Driver | false
>     net.sourceforge.jtds.jdbc.Driver | true
>
> As a final test we tried with postgres driver and had the same problem.
> Any ideas?
>
> Cheers,
> Nathan

--
Deepak