We were able to reproduce it with a minimal example. I've opened a JIRA
issue:
https://issues.apache.org/jira/browse/SPARK-15825
On Wed, Jun 8, 2016 at 12:43 PM, Koert Kuipers wrote:
> great!
>
> we weren't able to reproduce it because the unit tests use a
> broadcast-join while on the cluster
Hi all,
We were in the process of porting an RDD program to one which uses
Datasets. Most things were easy to transition, but one hole in
functionality we found was the ability to reduce a Dataset by key,
something akin to PairRDDFunctions.reduceByKey. Our first attempt at adding
the functionality
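
For readers following along, here is a minimal sketch of one way a reduce-by-key could be expressed on a Dataset, using groupByKey followed by reduceGroups from the Spark 2.0 Dataset API. This is not necessarily the approach taken in the original message: the helper name reduceByKey, the object name ReduceByKeySketch, the sample data, and the local[1] master are all assumptions made for illustration.

```scala
import org.apache.spark.sql.{Dataset, Encoder, SparkSession}

object ReduceByKeySketch {
  // Hypothetical helper (not part of Spark): combines all values that share a key
  // using f, similar in spirit to PairRDDFunctions.reduceByKey.
  def reduceByKey[K, V](ds: Dataset[(K, V)])(f: (V, V) => V)(
      implicit kEnc: Encoder[K], kvEnc: Encoder[(K, V)]): Dataset[(K, V)] =
    ds.groupByKey((kv: (K, V)) => kv._1)
      // reduceGroups reduces whole (key, value) rows, so keep the key and fold the values.
      .reduceGroups((a: (K, V), b: (K, V)) => (a._1, f(a._2, b._2)))
      // The result is Dataset[(K, (K, V))]; drop the duplicated outer key.
      .map((pair: (K, (K, V))) => pair._2)

  // Runs the helper on a small in-memory Dataset and returns the result as a Map.
  def run(): Map[String, Int] = {
    val spark = SparkSession.builder()
      .master("local[1]") // assumption: local mode, for illustration only
      .appName("reduce-by-key-sketch")
      .getOrCreate()
    import spark.implicits._
    try {
      val ds = Seq(("a", 1), ("a", 2), ("b", 3), ("c", 4)).toDS()
      reduceByKey(ds)(_ + _).collect().toMap
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = println(run())
}
```

The sketch reduces full (key, value) rows rather than bare values, because reduceGroups operates on the Dataset's element type; the final map strips the extra key that the grouping adds.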
Hi all, I'm getting some odd behavior when using the joinWith functionality
for Datasets. Here is a small test case:
val left = List(("a", 1), ("a", 2), ("b", 3), ("c", 4)).toDS()
val right = List(("a", "x"), ("b", "y"), ("d", "z")).toDS()
val joined = left.toDF("k", "v").as[(String, Int)].