Hi all,
Without using a case class, I tried building a DataFrame to use later for joins and other filtering, but I'm getting an ArrayIndexOutOfBoundsException when I call show on the DataFrame.

1) Importing SQLContext:
import org.apache.spark.sql.SQLContext._
import org.apache.spark.sql.SQLContext

2) Initializing SQLContext:
val sqlContext = new SQLContext(sc)

3) Importing the implicits package for the toDF conversion:
import sqlContext.implicits._

4) Reading the Station and Storm files:
val stat = sc.textFile("/user/root/spark_demo/scala/data/Stations.txt")
val stor = sc.textFile("/user/root/spark_demo/scala/data/Storms.txt")

stat.foreach(println) prints:
uihgf   Paris    56   5
asfsds  ***      43   1
fkwsdf  London   45   6
gddg    ABCD     32   2
grgzg   *CSD     35   3
gsrsn   ADR*     22   4

5) Creating a row per line by splitting the tab-delimited columns, before converting into a DataFrame:
val stati = stat.map(x => (x.split("\t")(0), x.split("\t")(1), x.split("\t")(2), x.split("\t")(3)))

6) Converting into a DataFrame:
val station = stati.toDF()

station.show then fails with:
17/02/17 08:46:35 ERROR Executor: Exception in task 0.0 in stage 9.0 (TID 15)
java.lang.ArrayIndexOutOfBoundsException: 1

Please help!

Thanks,
Aakash.
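PS: The exception index 1 suggests at least one line in Stations.txt yields fewer than two fields when split on "\t" (for example a blank line, or a line separated by spaces instead of tabs). A minimal plain-Scala sketch of the guard I'm considering — illustrative only, with `lines` standing in for the stat RDD, and not yet tested against my actual data:

```scala
// Sketch of the parsing in step 5, with a guard for lines that
// do not contain four tab-separated fields.
val lines = Seq(
  "uihgf\tParis\t56\t5",   // well-formed: four tab-separated fields
  "asfsds ***  43 1"       // space-separated: split("\t") yields ONE field
)

// Calling split("\t")(1) on the second line would throw
// java.lang.ArrayIndexOutOfBoundsException: 1.
// Splitting once and filtering on field count avoids it:
val parsed = lines
  .map(_.split("\t"))
  .filter(_.length >= 4)
  .map(a => (a(0), a(1), a(2), a(3)))
// Only the well-formed line survives: ("uihgf", "Paris", "56", "5")
```

The same filter should work unchanged on the RDD, i.e. stat.map(_.split("\t")).filter(_.length >= 4).map(a => (a(0), a(1), a(2), a(3))), which also splits each line once instead of four times.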