[
https://issues.apache.org/jira/browse/SPARK-16893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409183#comment-15409183
]
Aseem Bansal commented on SPARK-16893:
--------------------------------------
Reading a CSV causes an exception. The code used and the exception are below.
Both are also in the GitHub issue that I have referenced here.
{code}
public static void main(String[] args) {
    SparkSession spark = SparkSession
            .builder()
            .appName("my app")
            .getOrCreate();
    Dataset<Row> df = spark.read()
            .format("com.databricks.spark.csv")
            .option("header", "true")
            .option("nullValue", "")
            .csv("/home/aseem/data.csv");
    df.show();
}
{code}
bq. Exception in thread "main" java.lang.RuntimeException: Multiple sources
found for csv (org.apache.spark.sql.execution.datasources.csv.CSVFileFormat,
com.databricks.spark.csv.DefaultSource15), please specify the fully qualified
class name.
People need to use format("csv"). I think that is counterintuitive, seeing that
I am already calling the csv() method.
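
The exception text suggests that two classes answer to the same short name "csv" and that the lookup refuses to pick one. Below is a simplified model of that behaviour, not Spark's actual code: the class SourceLookupSketch, the REGISTERED map and the resolve method are hypothetical names, and the fallback to a fully qualified class name is an assumption taken from the wording of the error message.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of a short-name lookup; NOT Spark's implementation.
class SourceLookupSketch {
    // Implementation class name -> short name it registers.
    // Both class names are copied from the exception message above.
    static final Map<String, String> REGISTERED = new LinkedHashMap<>();
    static {
        REGISTERED.put("org.apache.spark.sql.execution.datasources.csv.CSVFileFormat", "csv");
        REGISTERED.put("com.databricks.spark.csv.DefaultSource15", "csv");
    }

    // Resolve a provider string to exactly one implementation class name.
    static String resolve(String provider) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : REGISTERED.entrySet()) {
            if (entry.getValue().equalsIgnoreCase(provider)) {
                matches.add(entry.getKey());
            }
        }
        if (matches.size() == 1) {
            return matches.get(0);
        }
        if (matches.size() > 1) {
            // Ambiguous short name: fail instead of guessing.
            throw new RuntimeException("Multiple sources found for " + provider
                    + " (" + String.join(", ", matches)
                    + "), please specify the fully qualified class name.");
        }
        // Assumption: anything that is not a registered short name is
        // treated as a fully qualified class name and returned as-is.
        return provider;
    }

    public static void main(String[] args) {
        // A fully qualified name is unambiguous in this model.
        System.out.println(resolve("org.apache.spark.sql.execution.datasources.csv.CSVFileFormat"));
        try {
            resolve("csv");
        } catch (RuntimeException e) {
            // The short name matches two registered classes.
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the ambiguity exists only while both classes are registered under "csv". Removing one entry from REGISTERED, or passing a fully qualified name, makes resolve return a single class.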
> Spark CSV Provider option is not documented
> -------------------------------------------
>
> Key: SPARK-16893
> URL: https://issues.apache.org/jira/browse/SPARK-16893
> Project: Spark
> Issue Type: Documentation
> Affects Versions: 2.0.0
> Reporter: Aseem Bansal
> Priority: Minor
>
> I was working with the databricks spark-csv library and came across an error. I
> have logged the issue on their GitHub, but it would be good to document it
> in Apache Spark's documentation as well.
> I faced it with CSV. Someone else faced it with JSON:
> http://stackoverflow.com/questions/38761920/spark2-0-error-multiple-sources-found-for-json-when-read-json-file
> Complete issue details are here:
> https://github.com/databricks/spark-csv/issues/367