Hi all,
I am trying to build a DataFrame without using a case class, so that I can
run joins and filters on it later. However, I get an
ArrayIndexOutOfBoundsException when I call show on the DataFrame.
1) Importing SQLContext=
import org.apache.spark.sql.SQLContext._
import org.apache.spark.sql.SQLContext
2) Initializing SQLContext=
val sqlContext = new SQLContext(sc)
3) Importing implicits package for toDF conversion=
import sqlContext.implicits._
4) Reading the Station and Storm Files=
val stat = sc.textFile("/user/root/spark_demo/scala/data/Stations.txt")
val stor = sc.textFile("/user/root/spark_demo/scala/data/Storms.txt")
stat.foreach(println)
uihgf Paris 56 5
asfsds *** 43 1
fkwsdf London 45 6
gddg ABCD 32 2
grgzg *CSD 35 3
gsrsn ADR* 22 4
5) Splitting each line of the tab-delimited file into columns to build a row,
before converting into a DF=
val stati = stat.map(x => (x.split("\t")(0), x.split("\t")(1),
x.split("\t")(2), x.split("\t")(3)))
6) Converting into DF=
val station = stati.toDF()
station.show gives the error below:
17/02/17 08:46:35 ERROR Executor: Exception in task 0.0 in stage 9.0 (TID 15)
java.lang.ArrayIndexOutOfBoundsException: 1
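An index of 1 being out of bounds means split("\t") returned a single-element
array for at least one line, i.e. that line contains no tab (a blank line, or a
line separated by spaces, would both do this). Below is a minimal sketch of a
more defensive version of step 5, not a confirmed fix. The helper name
parseStation and the sample lines are hypothetical and only for illustration:

```scala
// Hypothetical helper (not in the original code): parse one tab-delimited
// line into four fields, or return None when the line has fewer than four.
def parseStation(line: String): Option[(String, String, String, String)] = {
  // Split once instead of four times; limit -1 keeps trailing empty fields.
  val fields = line.split("\t", -1)
  if (fields.length >= 4) Some((fields(0), fields(1), fields(2), fields(3)))
  else None
}

// Plain Scala check with made-up sample lines, no Spark needed.
val sample = Seq("uihgf\tParis\t56\t5", "", "gddg ABCD 32 2")
val parsed = sample.flatMap(line => parseStation(line))
// The blank line and the space-separated line are skipped instead of throwing.
```

In the shell, the same helper could presumably be applied as
stat.flatMap(line => parseStation(line)).toDF(), and
stat.filter(_.split("\t", -1).length < 4) should list the lines that do not
split into four fields.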
Please help!
Thanks,
Aakash.