Having issues reading a csv file into a DataSet using Spark 2.1

Keith Chapman Wed, 22 Mar 2017 16:18:42 -0700

Hi,

I'm trying to read a CSV file into a Dataset but keep running into compilation
issues. I'm using Spark 2.1, and the following small program exhibits the
issue. I've searched around but haven't found a solution that works. I've
added "import sqlContext.implicits._" as suggested, but no luck. What am I
missing? I would appreciate some advice.


import org.apache.spark.sql.functions._
import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.sql.{Encoder,Encoders}

object DatasetTest{

  def main(args: Array[String]) {
    val sparkConf = new SparkConf().setAppName("DatasetTest")
    val sc = new SparkContext(sparkConf)
    case class Foo(text: String)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._
    val ds : org.apache.spark.sql.Dataset[Foo] =
      sqlContext.read.csv(args(1)).as[Foo]
    ds.show
  }
}

Compiling the above program gives the error below. I'd expect it to work, as
it's a simple case class. Changing it to as[String] works, but I would like to
get the case class to work.

[error] /home/keith/dataset/DataSetTest.scala:13: Unable to find encoder
for type stored in a Dataset.  Primitive types (Int, String, etc) and
Product types (case classes) are supported by importing spark.implicits._
Support for serializing other types will be added in future releases.
[error]     val ds : org.apache.spark.sql.Dataset[Foo] =
sqlContext.read.csv(args(1)).as[Foo]


Regards,
Keith.
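
One possible fix, offered as a sketch rather than a confirmed diagnosis: this
error is commonly reported when the case class is declared inside a method,
because the implicit Product encoder needs a TypeTag that is not available for
method-local classes. The version below moves Foo to the top level. It also
makes two assumptions that are not in the program above: it uses SparkSession
(the Spark 2.x entry point) instead of SQLContext, and it assumes the CSV has
no header row, so the first column is named "_c0" and is renamed to "text" to
match the case class field.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Declared at the top level (not inside main) so that an implicit
// Encoder[Foo] can be derived via spark.implicits._
case class Foo(text: String)

object DatasetTest {

  def main(args: Array[String]): Unit = {
    // Assumption: SparkSession replaces the SparkContext/SQLContext pair.
    val spark = SparkSession.builder().appName("DatasetTest").getOrCreate()
    import spark.implicits._

    // Assumption: no header row, so Spark names the first column "_c0".
    // Select it under the name "text" so it lines up with Foo's field.
    // args(1) is kept as in the original program.
    val ds: Dataset[Foo] =
      spark.read.csv(args(1)).select($"_c0".as("text")).as[Foo]

    ds.show()
    spark.stop()
  }
}
```

If the file does have a header row whose column is already called "text",
reading with option("header", "true") and dropping the select should give the
same result.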
