I think that depends on your use case. If you want to work on the entire
dataset as a whole anyway, you can assign a dummy key (like 0) to all
elements, group by that key, and sort the group on the actual value.
What exactly is your use case? Does the above solution work there?
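
To illustrate the dummy-key idea, here is a rough plain-Java model. It does not use Flink's API; the class and method names are made up for this sketch, and the first-seen ordering of groups is only a stand-in for "no defined order across groups".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Flink: models "group by key, then sort
// inside each group" on an in-memory list.
public class GroupSortSketch {

    public static <K> List<Integer> groupThenSortGroups(
            List<Integer> input, Function<Integer, K> keyOf) {
        // Groups are kept in first-seen order. This is an assumption standing
        // in for "order across groups is not defined".
        Map<K, List<Integer>> groups = new LinkedHashMap<>();
        for (Integer value : input) {
            groups.computeIfAbsent(keyOf.apply(value), k -> new ArrayList<>()).add(value);
        }
        List<Integer> out = new ArrayList<>();
        for (List<Integer> group : groups.values()) {
            Collections.sort(group); // the sort only ever sees one group
            out.addAll(group);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(3, 1, 2);
        // Key = the element itself: every group has one value, nothing gets sorted.
        System.out.println(groupThenSortGroups(data, v -> v)); // [3, 1, 2]
        // Key = dummy 0: one group holds everything, so the group sort is a total sort.
        System.out.println(groupThenSortGroups(data, v -> 0)); // [1, 2, 3]
    }
}
```

The price is that a single group is handled in one place, so this only makes sense once the data is small enough.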
On 15.03.2015 17:39
After building Flink 0.9-SNAPSHOT from source, DataSet.sortPartition is
indeed working as expected.
This is fine, but it raises the question of how to go about sorting in 0.8.1.
On Sun, Mar 15, 2015 at 5:05 PM, Kristoffer Sjögren
wrote:
That's the thing, there is no DataSet.sortPartition method in 0.8.1.
Looking through the git history shows that sortPartition was added on the
20th of February, so I think that's 0.9-SNAPSHOT?
On Sun, Mar 15, 2015 at 4:51 PM, Stephan Ewen wrote:
Hi!
I think sort partition is the right thing, if you have only one partition
(which makes sense, if you want a total order). It is not a parallel
operation any more, so use it only after the data size has been reduced
(filters / aggregations).
What about "data.sortPartition().setParallelism(1)"?
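
As a rough plain-Java model of why the parallelism matters here (again not Flink code: the names are made up, and the round-robin split is just an assumed way of spreading elements over partitions), sorting each partition separately only gives a total order when there is a single partition:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Flink: models "sort every partition
// locally" for a given number of partitions.
public class SortPartitionSketch {

    public static List<Integer> sortWithinPartitions(List<Integer> input, int parallelism) {
        List<List<Integer>> partitions = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            partitions.add(new ArrayList<>());
        }
        // Assumption: elements are spread round-robin over the partitions.
        for (int i = 0; i < input.size(); i++) {
            partitions.get(i % parallelism).add(input.get(i));
        }
        List<Integer> out = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            Collections.sort(partition); // local sort, no merge across partitions
            out.addAll(partition);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(5, 2, 4, 1, 3);
        System.out.println(sortWithinPartitions(data, 2)); // [3, 4, 5, 1, 2]
        System.out.println(sortWithinPartitions(data, 1)); // [1, 2, 3, 4, 5]
    }
}
```

With two partitions each half is sorted but the overall result is not; with one partition the local sort is the total sort.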
Thanks for your answer. I guess I'm a bit infected by writing too much
Crunch code and I also suspected that getDataSet() was the wrong thing to
do :-)
However I was expecting DataSet.sortPartition to do the sorting, but this
method is missing in 0.8.1?
Do you have a minimal example? I was looking
Hi Kristoffer!
There are a few issues with that code:
1) Grouping and then calling "sort group" sorts within the group. In your
case, you group on the entire element, and each group has one value: the
element. Sorting inside the group does not make any difference. There is no
order across group
Hi
This is silly, but I can't understand why the following code doesn't sort
the collection of integers. It seems to be a reasonable thing to do from an
API perspective?
Cheers,
-Kristoffer
final ExecutionEnvironment env =
ExecutionEnvironment.getExecutionEnvironment();
env.fromCollection(Lists