I got there in the end by specifying my Row record like this ... but surely there must be a neater way of doing it?
val trainRDD = rawTrainData
  .map(rawRow => rawRow.split(","))
  .map(p => Row(
    p(0), p(1), p(2), p(3), p(4), p(5), p(6), p(7), p(8), p(9),
    p(10), p(11), p(12), p(13), p(14), p(15), p(16), p(17), p(18), p(19),
    // ... p(20) through p(769) written out the same way ...
    p(770), p(771), p(772), p(773), p(774), p(775), p(776), p(777), p(778), p(779),
    p(780), p(781), p(782), p(783), p(784)))

i.e. by writing out all 785 elements explicitly.
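One neater route, I think, would be to build each Row from the whole split array instead of naming every field. This is just a sketch, assuming all 785 columns are meant to stay Strings: Row has a varargs apply, and Row.fromSeq takes a Seq, so either form below should do the same thing. (Note that split(",") drops trailing empty fields; split(",", -1) keeps them if that matters for your data.)

  import org.apache.spark.sql.Row

  // Sketch: construct each Row from the full split array rather than
  // listing p(0) .. p(784) by hand. Assumes every column stays a String.
  val trainRDD = rawTrainData
    .map(_.split(","))
    .map(p => Row.fromSeq(p))   // or equivalently: Row(p: _*)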