On Tue, May 13, 2014 at 8:26 AM, Michael Malak <michaelma...@yahoo.com>wrote:

> Reposting here on dev since I didn't see a response on user:
>
> I'm seeing different Serializable behavior in Spark Shell vs. Scala Shell.
> In the Spark Shell, equals() fails when I use the canonical equals()
> pattern of match{}, but works when I subsitute with isInstanceOf[]. I am
> using Spark 0.9.0/Scala 2.10.3.
>
> Is this a bug?
>
> Spark Shell (equals uses match{})
> =================================
>
> class C(val s:String) extends Serializable {
>   override def equals(o: Any) = o match {
>     case that: C => that.s == s
>     case _ => false
>   }
> }
>
> val x = new C("a")
> val bos = new java.io.ByteArrayOutputStream()
> val out = new java.io.ObjectOutputStream(bos)
> out.writeObject(x);
> val b = bos.toByteArray();
> out.close
> bos.close
> val y = new java.io.ObjectInputStream(new
> java.io.ByteArrayInputStream(b)).readObject().asInstanceOf[C]
> x.equals(y)
>
> res: Boolean = false
>
> Spark Shell (equals uses isInstanceOf[])
> ========================================
>
> class C(val s:String) extends Serializable {
>   override def equals(o: Any) = if (o.isInstanceOf[C])
> (o.asInstanceOf[C].s == s) else false
> }
>
> val x = new C("a")
> val bos = new java.io.ByteArrayOutputStream()
> val out = new java.io.ObjectOutputStream(bos)
> out.writeObject(x);
> val b = bos.toByteArray();
> out.close
> bos.close
> val y = new java.io.ObjectInputStream(new
> java.io.ByteArrayInputStream(b)).readObject().asInstanceOf[C]
> x.equals(y)
>
> res: Boolean = true
>
> Scala Shell (equals uses match{})
> =================================
>
> class C(val s:String) extends Serializable {
>   override def equals(o: Any) = o match {
>     case that: C => that.s == s
>     case _ => false
>   }
> }
>
> val x = new C("a")
> val bos = new java.io.ByteArrayOutputStream()
> val out = new java.io.ObjectOutputStream(bos)
> out.writeObject(x);
> val b = bos.toByteArray();
> out.close
> bos.close
> val y = new java.io.ObjectInputStream(new
> java.io.ByteArrayInputStream(b)).readObject().asInstanceOf[C]
> x.equals(y)
>
> res: Boolean = true
>


Hmm. I see that this can be reproduced without Spark in Scala 2.11, with
and without -Yrepl-class-based command line flag to the repl. Spark's REPL
has the effective behavior of 2.11's -Yrepl-class-based flag. Inspecting
the byte code generated, it appears -Yrepl-class-based results in the
creation of "$outer" field in the generated classes (including class C).
The first case match in equals() is resulting code along the lines of
(simplified):

if (o isinstanceof Cstr && this.$outer == that.$outer) { // do string
compare // }

$outer is the synthetic field object to the outer object in which the
object was created (in this case, the repl environment). Now obviously,
when x is taken through the bytestream and deserialized, it would have a
new $outer created (it may have deserialized in a different jvm or machine
for all we know). So the $outer's mismatching is expected. However I'm
still trying to understand why the outers need to be the same for the case
match.

Reply via email to