[jira] [Commented] (SPARK-16893) Spark CSV Provider option is not documented

Aseem Bansal (JIRA) Fri, 05 Aug 2016 01:53:29 -0700

    [ https://issues.apache.org/jira/browse/SPARK-16893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409183#comment-15409183 ]


Aseem Bansal commented on SPARK-16893:
--------------------------------------

Reading a CSV file causes an exception. The code used and the exception are 
below. Both are also in the GitHub issue that I have referenced here.

{code}
public static void main(String[] args) {
    SparkSession spark = SparkSession
            .builder()
            .appName("my app")
            .getOrCreate();

    Dataset<Row> df = spark.read()
            .format("com.databricks.spark.csv")
            .option("header", "true")
            .option("nullValue", "")
            .csv("/home/aseem/data.csv");

    df.show();
}
{code}

bq. Exception in thread "main" java.lang.RuntimeException: Multiple sources 
found for csv (org.apache.spark.sql.execution.datasources.csv.CSVFileFormat, 
com.databricks.spark.csv.DefaultSource15), please specify the fully qualified 
class name.

People need to use format("csv"). I think that is counterintuitive, seeing that 
I am already calling the csv() method.
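
For anyone hitting the same error, here is a rough sketch of why it happens, as far as I can tell. In Spark 2.0, csv(path) appears to behave like format("csv").load(path), so the earlier format("com.databricks.spark.csv") call is overridden and the short name "csv" is looked up among the registered data sources. With spark-csv on the classpath, two sources answer to that name. The class and method names below (ProviderLookupSketch, Source, resolve) are hypothetical and only model the lookup; this is not Spark's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of short-name resolution; not Spark's actual code.
public class ProviderLookupSketch {

    /** One registered data source: a short alias plus its implementing class. */
    public static final class Source {
        final String shortName;
        final String className;

        public Source(String shortName, String className) {
            this.shortName = shortName;
            this.className = className;
        }
    }

    /** Resolves a provider string to a class name, failing on ambiguous aliases. */
    public static String resolve(String provider, List<Source> registered) {
        List<String> matches = new ArrayList<>();
        for (Source s : registered) {
            if (s.shortName.equalsIgnoreCase(provider)) {
                matches.add(s.className);
            }
        }
        if (matches.isEmpty()) {
            // No alias matched: assume the caller passed a fully qualified class name.
            return provider;
        }
        if (matches.size() == 1) {
            return matches.get(0);
        }
        throw new RuntimeException("Multiple sources found for " + provider + " ("
                + String.join(", ", matches) + "), please specify the fully qualified class name.");
    }

    public static void main(String[] args) {
        // Both the built-in source and spark-csv claim the alias "csv".
        List<Source> registered = Arrays.asList(
                new Source("csv", "org.apache.spark.sql.execution.datasources.csv.CSVFileFormat"),
                new Source("csv", "com.databricks.spark.csv.DefaultSource15"));

        // A fully qualified name matches no alias, so it resolves to itself.
        System.out.println(resolve("com.databricks.spark.csv.DefaultSource15", registered));

        try {
            // csv(path) behaves like format("csv").load(path), so the alias is looked up.
            resolve("csv", registered);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this model is right, the ambiguity goes away either by passing a fully qualified class name to format() and calling load(path) instead of csv(path), or by dropping the spark-csv dependency so that only the built-in source registers "csv".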

> Spark CSV Provider option is not documented
> -------------------------------------------
>
>                 Key: SPARK-16893
>                 URL: https://issues.apache.org/jira/browse/SPARK-16893
>             Project: Spark
>          Issue Type: Documentation
>    Affects Versions: 2.0.0
>            Reporter: Aseem Bansal
>            Priority: Minor
>
> I was working with the Databricks spark-csv library and came across an error. I 
> have logged the issue on their GitHub, but it would be good to document it 
> in Apache Spark's documentation as well.
> I faced it with CSV. Someone else faced it with JSON:
> http://stackoverflow.com/questions/38761920/spark2-0-error-multiple-sources-found-for-json-when-read-json-file
> Complete issue details are here:
> https://github.com/databricks/spark-csv/issues/367



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

