A value in a Row can be accessed either through generic access by ordinal, which incurs boxing overhead for primitives, or through native primitive access. An example of generic access by ordinal:
import org.apache.spark.sql._

val row = Row(1, true, "a string", null)
// row: Row = [1,true,a string,null]
val firstValue = row(0)
// firstValue: Any = 1
val fourthValue = row(3)
// fourthValue: Any = null

For native primitive access, it is invalid to use the native primitive
interface to retrieve a value that is null; instead, check isNullAt before
attempting to retrieve a value that might be null. An example of native
primitive access:

// using the row from the previous example.
val firstValue = row.getInt(0)
// firstValue: Int = 1
val isNull = row.isNullAt(3)
// isNull: Boolean = true

In Scala, fields in a Row object can be extracted in a pattern match.
Example:

import org.apache.spark.sql._

// assumes an SQLContext's sql method is in scope
val pairs = sql("SELECT key, value FROM src").rdd.map {
  case Row(key: Int, value: String) => key -> value
}

Hope this helps. For more info, please refer to
https://spark.apache.org/docs/1.4.0/api/java/org/apache/spark/sql/Row.html

On 10 October 2016 at 04:50, Koert Kuipers <ko...@tresata.com> wrote:

> the Spark SQL Row trait has a schema that by default is null. when the
> schema is null, operations that rely on fieldIndex, such as
> getAs[T](fieldName: String): T, do not work.
>
> i noticed that when i convert a DataFrame to RDD[Row], the Row objects
> do have schemas. can i rely on this?
>
> when can i be sure that the schema is not null? what is the expectation
> here?
>
> thanks! koert
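
PS: on the schema question itself: a Row constructed by hand (as in the
examples above) has a null schema, while rows coming out of a DataFrame are
typically backed by GenericRowWithSchema and do carry one. I would not
treat that as a guaranteed contract, though. A safer pattern is to resolve
field indices from the DataFrame's own schema up front and use positional
access on the rows. A minimal sketch, assuming a Spark 1.4-style sqlContext
and the "src" table from the example above:

import org.apache.spark.sql.Row

// A hand-built Row carries no schema, so name-based access fails:
val bare = Row(1, "a")
// bare.schema == null; bare.getAs[String]("value") would throw here

// Resolve the field index once from the DataFrame's schema instead of
// relying on each Row having a schema attached:
val df = sqlContext.sql("SELECT key, value FROM src")
val valueIndex = df.schema.fieldIndex("value")
val values = df.rdd.map { r =>
  if (r.isNullAt(valueIndex)) None else Some(r.getString(valueIndex))
}

This way the name lookup happens once on the driver, and the per-row code
uses only positional access, which does not depend on each Row having a
schema attached.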