Hello,

I am testing Spark's interoperation with SQL Server over JDBC, using Microsoft's JDBC 
Driver 4.2. Reading from the database works fine, but I have run into a couple of 
issues writing back. With Scala 2.10 I can write back to the database except for a 
couple of types.
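For reference, this is roughly the round trip I am attempting (the URL, table names, 
and credentials below are placeholders, not my actual configuration):

    import java.util.Properties

    // Placeholder connection details -- substitute real server/database/credentials.
    val url = "jdbc:sqlserver://myhost:1433;databaseName=mydb"
    val props = new Properties()
    props.setProperty("user", "myuser")
    props.setProperty("password", "mypassword")
    props.setProperty("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")

    // sqlContext is the one provided by spark-shell / the notebook.
    // Reading works: datetime columns arrive as java.sql.Timestamp in the DataFrame.
    val df = sqlContext.read.jdbc(url, "dbo.SourceTable", props)

    // Writing back is where the type-mapping errors described below occur.
    df.write.jdbc(url, "dbo.TargetTable", props)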


1.      When I read a DataFrame from a table that contains a datetime column, it 
comes in as a java.sql.Timestamp object in the DataFrame. That is fine for Spark's 
purposes, but when I write it back to the database with df.write.jdbc(…) it errors 
out, because Spark tries to create the column as TIMESTAMP, which in T-SQL is not a 
date/time type. I think it should be writing DATETIME instead, but I'm not sure how 
to tell Spark this.



2.      A related problem happens when I try to write a java.lang.Boolean to the 
database: it errors out because Spark specifies a width for the BIT type, which is 
illegal in SQL Server (error msg: Cannot specify a column width on data type bit). Do 
I need to edit the Spark source to fix this behavior, or is there a configuration 
option somewhere that I am not aware of? A sketch of the kind of workaround I have in 
mind follows below.
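In case it helps frame the question, this is what I imagine a workaround might look 
like, based on the JdbcDialect API that appeared in Spark 1.4. I have not confirmed 
that this is the intended approach, and the target type names are my guesses:

    import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects, JdbcType}
    import org.apache.spark.sql.types._

    // Hypothetical custom dialect overriding the DDL types Spark emits
    // when it creates tables on SQL Server.
    val sqlServerDialect = new JdbcDialect {
      override def canHandle(url: String): Boolean =
        url.startsWith("jdbc:sqlserver")

      override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
        // Issue 1: emit DATETIME instead of TIMESTAMP (a row-version type in T-SQL).
        case TimestampType => Some(JdbcType("DATETIME", java.sql.Types.TIMESTAMP))
        // Issue 2: emit BIT without a width, since SQL Server rejects BIT(1).
        case BooleanType   => Some(JdbcType("BIT", java.sql.Types.BIT))
        case _             => None
      }
    }

    // Register before calling df.write.jdbc(...).
    JdbcDialects.registerDialect(sqlServerDialect)

Is something along these lines the recommended way to change the generated column 
types, or is there a simpler option I am missing?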


Separately, when I attempt to write back to SQL Server from an IPython notebook, py4j 
seems unable to convert a Python dict into a Java HashMap, which is necessary for 
passing the connection properties. I have documented the details of this problem, 
with code examples, here: 
http://stackoverflow.com/questions/31417653/java-util-hashmap-missing-in-pyspark-session
Any advice would be appreciated.

Thank you for your time,

-- Matthew Young
