You can extract nested fields directly in SQL: SELECT field.nestedField ... If you don't do that, then nested fields are represented as rows within rows and can be retrieved as follows:
t.getAs[Row](0).getInt(0)

Also, I would write t.getAs[Buffer[CharSequence]](12) as t.getAs[Seq[String]](12), since we don't guarantee the return type will be a Buffer.

On Wed, Nov 19, 2014 at 1:33 PM, Simone Franzini <captainfr...@gmail.com> wrote:
> I have been using Spark SQL to read in JSON data, like so:
> val myJsonFile = sqc.jsonFile(args("myLocation"))
> myJsonFile.registerTempTable("myTable")
> sqc.sql("mySQLQuery").map { row =>
>   myFunction(row)
> }
>
> Then in myFunction(row) I can read the various columns with the
> Row.getX methods. However, these methods only work for basic types (string,
> int, ...).
> I was having some trouble reading columns that are arrays or maps (i.e.
> other JSON objects).
>
> I am now using Spark 1.2 from the Cloudera snapshot and I noticed that
> there is a new method, getAs. I was able to use it to read, for example, an
> array of strings like so:
> t.getAs[Buffer[CharSequence]](12)
>
> However, if I try to read a column with a nested JSON object like this:
> t.getAs[Map[String, Any]](11)
>
> I get the following error:
> java.lang.ClassCastException:
> org.apache.spark.sql.catalyst.expressions.GenericRow cannot be cast to
> scala.collection.immutable.Map
>
> How can I read such a field? Am I just missing something small, or should I
> be looking for a completely different alternative to reading JSON?
>
> Simone Franzini, PhD
>
> http://www.linkedin.com/in/simonefranzini
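Putting both suggestions together, here is a minimal sketch (assuming the Spark 1.x SQLContext API; the file name, table name, and the "name"/"address"/"tags" fields are hypothetical, just to illustrate the two access patterns):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Row, SQLContext}

object NestedJsonExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("NestedJsonExample"))
    val sqc = new SQLContext(sc)

    // Hypothetical input; suppose each JSON record looks like:
    // {"name": "Alice", "address": {"city": "Rome", "zip": "00187"}, "tags": ["a", "b"]}
    val people = sqc.jsonFile("people.json")
    people.registerTempTable("people")

    // Option 1: flatten the nested field in the SQL query with dot notation.
    val cities = sqc.sql("SELECT name, address.city FROM people").map { row =>
      (row.getString(0), row.getString(1))
    }

    // Option 2: fetch the nested object as a Row and index into it.
    // The ordinal used here depends on the inferred schema (check with
    // people.printSchema()), not on the order of keys in the JSON text.
    val citiesToo = sqc.sql("SELECT name, address FROM people").map { row =>
      val address = row.getAs[Row](1) // nested JSON objects come back as Rows
      (row.getString(0), address.getString(0))
    }

    // Arrays of strings: ask for Seq[String] rather than Buffer[CharSequence],
    // since the concrete collection type is not guaranteed.
    val tags = sqc.sql("SELECT tags FROM people").map { row =>
      row.getAs[Seq[String]](0)
    }

    sc.stop()
  }
}
```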