Re: CSV spark package not working in v0.6.1, spark 2.0, scala 2.11

Mina Lee Wed, 17 Aug 2016 03:33:50 -0700

Hi Abul,

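As of Spark 2.0, the CSV data source is built into Spark itself, so you no
longer need to load the spark-csv dependency. The NoSuchMethodError is
probably a conflict between the univocity-parsers version that spark-csv
pulls in and the one Spark 2.0 expects, so removing the dependency from the
interpreter settings may be enough.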


Could you try the following instead?

val df = sqlContext.read.
options(Map("header" -> "true", "inferSchema" -> "true")).
csv("hdfs:// ... /S&P")

df.printSchema
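
If it helps to check the built-in reader separately from your HDFS data, here
is a minimal sketch that only assumes a local Spark 2.x session. The temp file,
its two sample rows, the app name, and the local master are made up for
illustration and are not from your setup. On a multi-node cluster a local temp
file will not be visible to the executors, so this is for local mode only.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// In Zeppelin or spark-shell, getOrCreate() returns the session that already exists.
// The app name and local master are assumptions for a standalone run.
val spark = SparkSession.builder().
  appName("builtin-csv-sketch").
  master("local[*]").
  getOrCreate()

// Hypothetical sample data, written to a temp file so the sketch is self-contained.
val csvPath = Files.createTempFile("sample", ".csv")
Files.write(csvPath, "id,name\n1,a\n2,b\n".getBytes("UTF-8"))

val csvOptions = Map("header" -> "true", "inferSchema" -> "true")

// Built-in CSV reader: no spark-csv package on the classpath.
val df = spark.read.
  options(csvOptions).
  csv(csvPath.toUri.toString)

// Equivalent long form, using the built-in short name "csv"
// in place of "com.databricks.spark.csv".
val df2 = spark.read.
  format("csv").
  options(csvOptions).
  load(csvPath.toUri.toString)

df.printSchema()
```

Both forms go through the same built-in data source, so your existing code
should only need the format name changed (or the csv(...) shortcut) once the
external package is removed.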

Hope this solves your issue!

Mina

On Wed, Aug 17, 2016 at 11:43 AM Abul Basar <aba...@einext.com> wrote:

> Hello,
>
> It is exciting to see the new 0.6.1 release so soon after the 0.6 release.
>
> I am test driving 0.6.1 with Spark 2.0 (Scala 2.11). RDD and DataFrame
> operations are working fine, but I am facing a problem while using the csv
> package (https://github.com/databricks/spark-csv).
>
> I added "com.databricks:spark-csv_2.11:1.4.0" to the interpreter
> dependencies using the UI, restarted Zeppelin, and am trying the following
> code.
>
>
> val df = spark.sqlContext.read.
> format("com.databricks.spark.csv").
> options(Map("header" -> "true", "inferSchema" -> "true")).
> load("hdfs:// ... /S&P")
>
> df.printSchema
>
>
> The above statement errors out with the following message:
>
> java.lang.NoSuchMethodError: com.univocity.parsers.csv.CsvParserSettings.setUnescapedQuoteHandling(Lcom/univocity/parsers/csv/UnescapedQuoteHandling;)V
>   at org.apache.spark.sql.execution.datasources.csv.CsvReader.parser$lzycompute(CSVParser.scala:50)
>   at org.apache.spark.sql.execution.datasources.csv.CsvReader.parser(CSVParser.scala:35)
>   at org.apache.spark.sql.execution.datasources.csv.LineCsvReader.parseLine(CSVParser.scala:117)
>   at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:59)
>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$15.apply(DataSource.scala:392)
>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$15.apply(DataSource.scala:392)
>   at scala.Option.orElse(Option.scala:289)
>   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:391)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:132)
>   ... 46 elided
>
>
> I successfully tested the same code using the REPL. The above error seems
> to be a bug introduced in 0.6.1. It works fine in 0.6.0.
>
> Any ideas about how to resolve the issue?
>
> Thanks!
> - AB
>
>
>
