How to convert RDD to DF for this case -

Aakash Basu Fri, 17 Feb 2017 01:37:20 -0800

Hi all,


I tried to build a DF without using a case class, so that I can run joins
and other filters on it later. But I'm getting an
ArrayIndexOutOfBoundsException when I call show on the DF.


1) Importing SQLContext=

import org.apache.spark.sql.SQLContext._

import org.apache.spark.sql.SQLContext



2) Initializing SQLContext=

val sqlContext = new SQLContext(sc)



3) Importing implicits package for toDF conversion=

import sqlContext.implicits._



4) Reading the Station and Storm files=

val stat = sc.textFile("/user/root/spark_demo/scala/data/Stations.txt")

val stor = sc.textFile("/user/root/spark_demo/scala/data/Storms.txt")





stat.foreach(println)


uihgf   Paris   56   5

asfsds   ***   43   1

fkwsdf   London   45   6

gddg   ABCD   32   2

grgzg   *CSD   35   3

gsrsn   ADR*   22   4


5) Creating rows by splitting each line of the tab-delimited file into
columns, before converting into a DF=


val stati = stat.map(x => (x.split("\t")(0), x.split("\t")(1),
x.split("\t")(2), x.split("\t")(3)))
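A more defensive variant of this step could split each line only once and drop any line that does not yield exactly four fields. The sketch below is plain Scala with no Spark dependency; the names StationParsing and parseStation are hypothetical and do not appear in the original job.

```scala
object StationParsing {
  // Hypothetical helper (not part of the original post).
  // Splits the line on tabs once; the -1 limit keeps trailing empty fields.
  // Returns the four columns, or None when the line is malformed.
  def parseStation(line: String): Option[(String, String, String, String)] =
    line.split("\t", -1) match {
      case Array(a, b, c, d) => Some((a, b, c, d))
      case _                 => None
    }
}
```

Assuming stat is the RDD[String] from step 4, something like stat.flatMap(StationParsing.parseStation) would keep only the well-formed rows, which could then be passed to toDF() as before. Silently dropping rows can hide data problems, so counting the rejected lines may be worthwhile.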



6)      Converting into DF=

val station = stati.toDF()

station.show gives the error below:

17/02/17 08:46:35 ERROR Executor: Exception in task 0.0 in stage 9.0 (TID 15)
java.lang.ArrayIndexOutOfBoundsException: 1
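One possible reading of this exception, which the post itself does not confirm: at least one line in the file contains no tab character (for example, it is space-separated or empty), so split("\t") returns a single-element array and index 1 is out of range. The plain-Scala sketch below illustrates that behaviour; SplitDemo and its methods are hypothetical names, and the sample lines are assumptions modelled on the printed output above.

```scala
import scala.util.Try

object SplitDemo {
  // Number of fields produced by splitting a line on the tab character.
  def tabFieldCount(line: String): Int = line.split("\t").length

  // Mirrors x.split("\t")(1) from step 5, wrapped in Try to capture a failure.
  def secondField(line: String): Try[String] = Try(line.split("\t")(1))

  def main(args: Array[String]): Unit = {
    val tabbed = "uihgf\tParis\t56\t5"      // assumed tab-separated line
    val spaced = "uihgf   Paris   56   5"   // assumed space-separated line
    println(tabFieldCount(tabbed))          // four fields
    println(tabFieldCount(spaced))          // one field: no tab to split on
    println(secondField(spaced).isFailure)  // index 1 does not exist
  }
}
```

An empty line behaves the same way, since "".split("\t") also yields a single element.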


Please help!

Thanks,
Aakash.
