Thanks for the clarification. I would definitely want to require a sort but
only recommend partitioning; I think that would be useful to request
based on details about the incoming dataset.
On Tue, Mar 27, 2018 at 4:55 PM Ryan Blue wrote:
A required clustering would not, but a required sort would. Clustering is
asking for the input dataframe's partitioning, and sorting would be how
each partition is sorted.
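
To make the distinction concrete, here is a minimal plain-Python sketch (not Spark code, and not the proposed API). It assumes a partition can be modeled as a list of rows. The helper names `stable_hash`, `cluster`, and `sort_within_partitions` are hypothetical and chosen only for illustration. Clustering decides which partition a row lands in; sorting decides the order of rows inside each partition. This is loosely analogous to calling `repartition` and then `sortWithinPartitions` on a DataFrame.

```python
import zlib


def stable_hash(value):
    # Python's built-in hash() is randomized per process for strings,
    # so use CRC32 to keep the partition assignment reproducible.
    return zlib.crc32(repr(value).encode("utf-8"))


def cluster(rows, key, num_partitions):
    """Place rows so that all rows sharing key(row) end up in one partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[stable_hash(key(row)) % num_partitions].append(row)
    return partitions


def sort_within_partitions(partitions, key):
    """Order the rows inside each partition; rows never move between partitions."""
    return [sorted(partition, key=key) for partition in partitions]


rows = [("B", 2), ("A", 2), ("C", 1), ("A", 1), ("B", 1), ("C", 2)]

# Clustering on the first column: each letter lives in exactly one partition,
# but rows inside a partition stay in arrival order.
clustered = cluster(rows, key=lambda r: r[0], num_partitions=2)

# Sorting: each partition is now ordered by (column 1, column 2).
ordered = sort_within_partitions(clustered, key=lambda r: r)

for index, partition in enumerate(ordered):
    print(index, partition)
```

A sink that only needs rows with the same key to arrive together would ask for the first step; a sink that also needs them in order would ask for both.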
On Tue, Mar 27, 2018 at 4:53 PM, Russell Spitzer wrote:
I forget, since it's been a while, but does the Clustering support allow
requesting that partitions contain elements in order as well? That would be
a useful trick for me. For example:
Request/Require(SortedOn(Col1))
Partition 1 -> ((A,1), (A,2), (B,1), (B,2), (C,1), (C,2))
On Tue, Mar 27, 2018 at 4:38 PM Ry
Thanks, it makes sense that the existing interface is for aggregation and
not joins. Why are there requirements for the number of partitions that are
returned then?
Does it make sense to design the write-side `Requirement` classes and the
read-side reporting separately?
On Tue, Mar 27, 2018 at 3
Hi Ryan, yeah, you are right that SupportsReportPartitioning doesn't expose
a hash function, so Join can't benefit from this interface: Join doesn't
require a general ClusteredDistribution, but a more specific one
called HashClusteredDistribution.
So currently only Aggregate can benefit from Suppor
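
The point about joins can be illustrated with a small plain-Python sketch. It is not Spark's implementation: the two hash functions and the helpers `cluster`, `is_clustered`, and `co_partitioned` are hypothetical stand-ins, and keys are plain integers for simplicity. The sketch shows that any deterministic hash yields a clustered layout, which is enough for an aggregation, while a partition-wise join additionally needs both sides to use the same hash so that matching keys share a partition index.

```python
def hash_left(key):
    # Hypothetical hash used by one side of the join.
    return key


def hash_right(key):
    # A different, equally valid hash that a source might use internally.
    return key * 31 + 7


def cluster(rows, hash_fn, num_partitions):
    """Assign each (key, value) row to the partition chosen by hash_fn."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in rows:
        partitions[hash_fn(key) % num_partitions].append((key, value))
    return partitions


def is_clustered(partitions):
    """True if no key appears in more than one partition."""
    seen = {}
    for index, partition in enumerate(partitions):
        for key, _ in partition:
            if seen.setdefault(key, index) != index:
                return False
    return True


def co_partitioned(left, right):
    """True if every key present on both sides has the same partition index."""
    left_index = {key: i for i, p in enumerate(left) for key, _ in p}
    right_index = {key: i for i, p in enumerate(right) for key, _ in p}
    shared = left_index.keys() & right_index.keys()
    return all(left_index[k] == right_index[k] for k in shared)


orders = [(k % 8, "order") for k in range(16)]
users = [(k, "user") for k in range(8)]

left = cluster(orders, hash_left, 4)
right_same_hash = cluster(users, hash_left, 4)
right_other_hash = cluster(users, hash_right, 4)

# Each side is clustered on its own, so a per-key aggregation needs no shuffle.
print(is_clustered(left), is_clustered(right_other_hash))

# Only the pair built with the same hash can be joined partition by partition.
print(co_partitioned(left, right_same_hash))
print(co_partitioned(left, right_other_hash))
```

Without knowing the hash a source used, the other side of the join cannot be arranged to match it, which is why reporting clustering alone does not help a join.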
Spark Dev,
On second thought, the below topic seems more appropriate for spark-dev
rather than spark-users:
Spark Users,
>
> In SparkR, RBackend is created in RRunner.main(). This in particular makes
> it difficult to control or use the RBackend. For my use case, I am looking
> to access the JVMO
I just took a look at SupportsReportPartitioning and I'm not sure that it
will work for real use cases. It doesn't specify, as far as I can tell, a
hash function for combining clusters into tasks or a way to provide Spark a
hash function for the other side of a join. It seems unlikely to me that
ma