I had that problem/question some time ago, too.
The quick fix is to just put the line number in the line itself. Go for it.
However, we worked out a solution for another distributed processing
system, that did the following:
Read each partition, count the lines, broadcast a map
"partition->lin
On Mon, Feb 1, 2016 at 12:32 PM, Fridtjof Sander
mailto:fsan...@mailbox.tu-berlin.de>>
wrote:
Hi Till,
thanks for your reply!
The problem with that is, that I sometimes combine two elements:
So from x0 -> x1 -> x2 I join (x0, x1) which might
ve to add another field to your input data but
then you don’t have to run the |zipWithIndex| for every iteration.
Cheers,
Till
On Mon, Feb 1, 2016 at 11:37 AM, Fridtjof Sander
mailto:fsan...@mailbox.tu-berlin.de>>
wrote:
(tried to reformat)
Hi,
I have a problem wh
d make my day so hard...
Anyways, thanks for your time!
Best, Fridtjof
Am 01.02.16 um 11:32 schrieb Fridtjof Sander:
Hi,
I have a problem which seems to be unsolvable in Flink at the moment
(1.0-Snapshot, current master branch)
and I would kindly ask for some input, ideas on alternativ
Hi,
I have a problem which seems to be unsolvable in Flink at the moment
(1.0-Snapshot, current master branch)
and I would kindly ask for some input, ideas on alternative approaches or just a
confirmatory "yup, that doesn't work".
### Here's the situation:
I have a dataset and its elements ar
Hi,
I want to register a custom aggregation convergence criterion to a bulk
iteration and I want to use the scala API.
It appears to me that this is not possible at the moment, right?
The AggregatorRegistry is exposed by IterativeDataSet.java, which is
hidden by DataSet.scala:
def iterate