You need this import to solve the problem:

import org.apache.spark.SparkContext._
Yishu

On Mar 10, 2014, at 2:46 PM, Yishu Lin <yishutheco...@gmail.com> wrote:

> I have the same question and tried approach 1, but get a compilation error:
>
> [error] …. could not find implicit value for parameter kcf: () =>
> org.apache.spark.WritableConverter[String]
> [error] val t2 = sc.sequenceFile[String, Int]("/test/data", 20)
>
> Yishu
>
> On Mar 9, 2014, at 12:21 AM, Shixiong Zhu <zsxw...@gmail.com> wrote:
>
>> Hi Kane,
>>
>> In the sequence file, the class is org.apache.hadoop.io.Text. You need to
>> convert Text to String. There are two approaches:
>>
>> 1. Use implicit conversions to convert Text to String automatically. I
>> recommend this one. E.g.,
>>
>> val t2 = sc.sequenceFile[String, String]("/user/hdfs/e1Mseq")
>> t2.groupByKey().take(5)
>>
>> 2. Use "classOf[Text]" to specify the correct class in the sequence file
>> and convert Text to String explicitly. E.g.,
>>
>> import org.apache.hadoop.io.Text
>> val t2 = sc.sequenceFile("/user/hdfs/e1Mseq", classOf[Text], classOf[Text])
>> t2.map { case (k, v) => (k.toString, v.toString) }.groupByKey().take(5)
>>
>> Best Regards,
>>
>> Shixiong Zhu
>>
>> 2014-03-09 13:30 GMT+08:00 Kane <kane.ist...@gmail.com>:
>>
>> When I try to open a sequence file:
>>
>> val t2 = sc.sequenceFile("/user/hdfs/e1Mseq", classOf[String],
>> classOf[String])
>> t2.groupByKey().take(5)
>>
>> I get:
>>
>> org.apache.spark.SparkException: Job aborted: Task 25.0:0 had a not
>> serializable result: java.io.NotSerializableException:
>> org.apache.hadoop.io.Text
>>
>> Another thing: t2.take(5) returns 5 identical items. I guess I have to
>> map/clone the items, but then I get something like
>> "org.apache.hadoop.io.Text cannot be cast to java.lang.String".
>> How do I clone it?
>>
>> Thanks.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/sequenceFile-and-groupByKey-tp2428.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
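Putting the thread together, here is a minimal sketch of both approaches with the missing `SparkContext._` import in place. The path `/user/hdfs/e1Mseq` comes from the thread; the `local[*]` SparkContext setup and the `SequenceFileExample` object name are illustrative only, and this assumes a Spark 1.x-era build where the `WritableConverter` implicits live in `SparkContext._`:

```scala
import org.apache.spark.{SparkConf, SparkContext}
// This import brings the implicit WritableConverters into scope,
// which is what makes approach 1 compile (Yishu's "could not find
// implicit value for parameter kcf" error is this import missing).
import org.apache.spark.SparkContext._
import org.apache.hadoop.io.Text

object SequenceFileExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("seqfile-example").setMaster("local[*]"))

    // Approach 1: let the implicit converters turn Text into String.
    val t1 = sc.sequenceFile[String, String]("/user/hdfs/e1Mseq")
    t1.groupByKey().take(5).foreach(println)

    // Approach 2: read the raw Text objects and convert explicitly.
    // The map(...) also copies each record into a new String, which
    // avoids the Hadoop record-reuse behavior behind Kane's symptom
    // of take(5) returning 5 identical items.
    val t2 = sc.sequenceFile("/user/hdfs/e1Mseq", classOf[Text], classOf[Text])
    t2.map { case (k, v) => (k.toString, v.toString) }
      .groupByKey()
      .take(5)
      .foreach(println)

    sc.stop()
  }
}
```

Note the design point behind approach 2: Hadoop's RecordReader reuses the same Writable instance for every record, so an RDD of raw `Text` objects must be mapped to immutable values (e.g. `String`) before caching or collecting.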