Hi guys,

I recently ran into an error, and the 'JDBC_PUSHDOWN_PREDICATE' option does not
seem to work. The background is as follows.

import java.util.Properties

val url = "jdbc:sqlserver://XXXXXX"
val properties = new Properties()
val df = spark.read.jdbc(url, "movies", properties)
df.filter("rated == true").show()

I am using this code to read from SQL Server and then apply a filter.
However, it fails with this exception:
Job aborted due to stage failure. Caused by: SQLServerException: Invalid column 
name 'true'.

The original table contains a column 'rated' of data type 'bit'. Digging into
the code, I found that 'bit' is translated to the Boolean type. Following the
pushdown logic, in the MsSqlServerDialect compileValue() method, the Boolean
value is translated to 'true'/'false', which doesn't match the '1'/'0' that
T-SQL expects for a bit column. That is what finally causes this issue.
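
To illustrate what I mean, here is a minimal standalone sketch. It is not
Spark's actual source code; the object PredicateSketch and the helpers
genericLiteral, tsqlLiteral and equalTo are hypothetical names I made up, and
column quoting is left out for brevity. It only shows how rendering a Boolean
as the bare word true produces a predicate that SQL Server reads as a column
name, while rendering it as 1 gives valid T-SQL for a bit column.

```scala
// Simplified illustration only; NOT Spark's actual implementation.
object PredicateSketch {
  // Hypothetical generic rendering: Booleans become the bare words true/false.
  def genericLiteral(value: Any): String = value match {
    case s: String  => "'" + s.replace("'", "''") + "'" // escape single quotes
    case b: Boolean => b.toString
    case other      => other.toString
  }

  // Hypothetical T-SQL-friendly rendering: a bit column is compared with 1/0.
  def tsqlLiteral(value: Any): String = value match {
    case b: Boolean => if (b) "1" else "0"
    case other      => genericLiteral(other)
  }

  // Builds a simple equality predicate for a WHERE clause.
  def equalTo(column: String, literal: String): String = s"$column = $literal"

  def main(args: Array[String]): Unit = {
    // SQL Server treats the bare word true as a column name here.
    println(equalTo("rated", genericLiteral(true))) // prints: rated = true
    // This form is valid T-SQL for a bit column.
    println(equalTo("rated", tsqlLiteral(true)))    // prints: rated = 1
  }
}
```

With the first form, the generated WHERE clause matches the "Invalid column
name 'true'" error above.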

After figuring out the issue, I tried the 'pushDownPredicate' option to avoid
pushing the filter logic down into the SQL query. The code looks like this:

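import java.util.Properties
import org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions

val url = "jdbc:sqlserver://XXXXXX"
val properties = new Properties()
// Added this line, but it still does not work
properties.setProperty(JDBCOptions.JDBC_PUSHDOWN_PREDICATE, "false")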
val df = spark.read.jdbc(url, "movies", properties)
df.filter("rated == true").show()

However, it still fails with the same error message. It seems that setting
pushDownPredicate to false has no effect at all. So my questions are why the
pushDownPredicate option does not work as expected, and whether there are other
mitigations for this issue.


Best,
Xiaojin
