Re: Possibility to get the line numbers?

2016-02-04 Thread Fridtjof Sander
I had that problem/question some time ago, too. The quick fix is to just put the line number in the line itself. Go for it. However, we worked out a solution for another distributed processing system, that did the following: Read each partition, count the lines, broadcast a map "partition->lin

Re: join with no element appearing in multiple join-pairs

2016-02-01 Thread Fridtjof Sander
On Mon, Feb 1, 2016 at 12:32 PM, Fridtjof Sander mailto:fsan...@mailbox.tu-berlin.de>> wrote: Hi Till, thanks for your reply! The problem with that is, that I sometimes combine two elements: So from x0 -> x1 -> x2 I join (x0, x1) which might

Re: join with no element appearing in multiple join-pairs

2016-02-01 Thread Fridtjof Sander
ve to add another field to your input data but then you don’t have to run the |zipWithIndex| for every iteration. Cheers, Till ​ On Mon, Feb 1, 2016 at 11:37 AM, Fridtjof Sander mailto:fsan...@mailbox.tu-berlin.de>> wrote: (tried to reformat) Hi, I have a problem wh

Re: join with no element appearing in multiple join-pairs

2016-02-01 Thread Fridtjof Sander
d make my day so hard... Anyways, thanks for your time! Best, Fridtjof Am 01.02.16 um 11:32 schrieb Fridtjof Sander: Hi, I have a problem which seems to be unsolvable in Flink at the moment (1.0-Snapshot, current master branch) and I would kindly ask for some input, ideas on alternativ

join with no element appearing in multiple join-pairs

2016-02-01 Thread Fridtjof Sander
Hi, I have a problem which seems to be unsolvable in Flink at the moment (1.0-Snapshot, current master branch) and I would kindly ask for some input, ideas on alternative approaches or just a confirmatory "yup, that doesn't work". ### Here's the situation: I have a dataset and its elements ar

How to register aggregation convergence criterion to bulk iteration in scala API?

2016-01-28 Thread Fridtjof Sander
Hi, I want to register a custom aggregation convergence criterion to a bulk iteration and I want to use the scala API. It appears to me that this is not possible at the moment, right? The AggregatorRegistry is exposed by IterativeDataSet.java, which is hidden by DataSet.scala: def iterate