Dear all, I have three questions about equality of org.apache.spark.sql.Row.

(1) If a Row has a complex type (e.g. Array), is the following behavior expected? If two Rows hold the same array instance, Row.equals returns true (the second assert). If two Rows hold different array instances (a1 and a2) that have the same elements, Row.equals returns false (the third assert).

    import org.apache.spark.sql.Row

    val a1 = Array(3, 4)
    val a2 = Array(3, 4)
    val r1 = Row(a1)
    val r2 = Row(a2)
    assert(a1.sameElements(a2))       // SUCCESS
    assert(Row(a1).equals(Row(a1)))   // SUCCESS
    assert(Row(a1).equals(Row(a2)))   // FAILURE

This is because the two objects are compared by "o1 != o2" instead of "o1.equals(o2)" at
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala#L408

(2) If (1) is expected, where is this behavior described or defined? I cannot find a description in the API documentation.
https://spark.apache.org/docs/1.6.1/api/java/org/apache/spark/sql/Row.html
https://home.apache.org/~pwendell/spark-releases/spark-2.0.0-preview-docs/api/scala/index.html#org.apache.spark.sql.Row

(3) If (1) is expected, is there a recommended way to write an equality check for two Rows that contain an Array or another complex type (e.g. Map)?

Best Regards,
Kazuaki Ishizaki, @kiszk
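
P.S. To make question (3) concrete, here is a minimal, Spark-free sketch of the kind of comparison I have in mind. It assumes the values of each Row can be obtained as a Seq[Any] (for example via Row.toSeq). DeepEquality and deepEquals are hypothetical names used only for illustration; they are not part of the Spark API, and this is not meant as the recommended approach.

```scala
// Hypothetical helper, not part of Spark: structural equality that looks
// inside arrays, sequences and maps instead of comparing array references.
object DeepEquality {
  def deepEquals(a: Any, b: Any): Boolean = (a, b) match {
    // Arrays (including primitive arrays): compare element by element.
    case (x: Array[_], y: Array[_]) =>
      x.length == y.length && x.indices.forall(i => deepEquals(x(i), y(i)))
    // Sequences: same length and pairwise deep-equal elements.
    case (x: scala.collection.Seq[_], y: scala.collection.Seq[_]) =>
      x.length == y.length && x.zip(y).forall { case (l, r) => deepEquals(l, r) }
    // Maps: same size, and every key of x maps to a deep-equal value in y.
    // Keys are looked up with ordinary hashing, so array keys are not handled.
    case (x: scala.collection.Map[_, _], y: scala.collection.Map[_, _]) =>
      x.size == y.size && x.forall { case (k, v) =>
        y.asInstanceOf[scala.collection.Map[Any, Any]].get(k).exists(deepEquals(v, _))
      }
    // Everything else: fall back to ordinary equality.
    case _ => a == b
  }
}
```

With Spark on the classpath, the check for the example above would presumably be written as DeepEquality.deepEquals(r1.toSeq, r2.toSeq), but I have not verified this against every Spark data type.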