Re: Apache Spark-Subtract two datasets

2017-10-13 Thread Nathan Kronenfeld
I think you want a join of type "left_anti". See the log below:

scala> import spark.implicits._
import spark.implicits._

scala> case class Foo (a: String, b: Int)
defined class Foo

scala> case class Bar (a: String, d: Double)
defined class Bar

scala> var fooDs = Seq(Foo("a", 1), Foo("b", 2), Foo("
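
For readers who want to try the "left_anti" suggestion outside the REPL, here is a minimal sketch of how such a join might look as a standalone program. It is not the log from the message: the sample rows, the object name LeftAntiSketch, the local master setting, and the choice of "a" as the join column are all assumptions made for illustration.

```scala
import org.apache.spark.sql.SparkSession

object LeftAntiSketch {
  // Record types mirroring the Foo/Bar case classes named in the message.
  case class Foo(a: String, b: Int)
  case class Bar(a: String, d: Double)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only to make the sketch self-contained.
    val spark = SparkSession.builder()
      .appName("left-anti-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Assumption: illustrative sample data, not the data from the original log.
    val fooDs = Seq(Foo("a", 1), Foo("b", 2), Foo("c", 3)).toDS()
    val barDs = Seq(Bar("b", 2.0)).toDS()

    // A left anti join keeps the rows of fooDs that have no match in barDs
    // on column "a". Only the left-side columns survive, so the result can
    // be converted back to Dataset[Foo].
    val onlyInFoo = fooDs.join(barDs, Seq("a"), "left_anti").as[Foo]

    // With the sample data above, the rows with a = "a" and a = "c" remain.
    onlyInFoo.show()

    spark.stop()
  }
}
```

The same join type string is accepted by the Java Dataset API, so the approach should carry over to the Java setup described in the original question, with the join column adapted to whichever field the two classes share.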

Re: Apache Spark-Subtract two datasets

2017-10-12 Thread Imran Rajjad
If the datasets hold objects of different classes, then you will have to convert both of them to RDDs and rename the columns before you call rdd1.subtract(rdd2).

On Thu, Oct 12, 2017 at 10:16 PM, Shashikant Kulkarni <shashikant.kulka...@gmail.com> wrote:
> Hello,
>
> I have 2 datasets, Datas

Apache Spark-Subtract two datasets

2017-10-12 Thread Shashikant Kulkarni
Hello,

I have 2 datasets: one is a Dataset and the other is a Dataset. I want the list of records which are in the first Dataset but not in the second. How can I do this in Apache Spark using the Java Connector? I am using Apache Spark 2.2.0.

Thank you