Hello,

In Flink, a commonly used way to access data from multiple DataSets at the same time is to perform a join, just as in the database world (what Flink calls simply a "join" is an equi-join [1]).
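To make the join idea concrete, here is a small plain-Java sketch (ordinary collections, not Flink API code) of what joining a list of edges (u,v) with a table A of (index, value) pairs produces. The names `EdgeJoinSketch`, `joinEdgesWithA` and `EdgeWithValues` and the sample data are made up for illustration, and I am assuming that every index occurs at most once in A. In Flink itself the corresponding step on tuple DataSets would be along the lines of `edges.join(a).where(0).equalTo(0)`, with hypothetical DataSets `edges` and `a`.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EdgeJoinSketch {

    /** One joined record: the edge (u,v) together with A[u] and A[v]. */
    public static final class EdgeWithValues {
        public final long u;
        public final long v;
        public final double au;
        public final double av;

        public EdgeWithValues(long u, long v, double au, double av) {
            this.u = u;
            this.v = v;
            this.au = au;
            this.av = av;
        }
    }

    /**
     * Emulates two inner equi-joins:
     *   1) edges with A on edge.u == A.index  (adds A[u])
     *   2) that result with A on edge.v == A.index  (adds A[v])
     * Each edge is a long[]{u, v}; A is a list of (index, value) pairs.
     */
    public static List<EdgeWithValues> joinEdgesWithA(
            List<long[]> edges, List<Map.Entry<Long, Double>> a) {
        // Index A by its key, the way a hash join builds its lookup side.
        Map<Long, Double> aByIndex = new HashMap<>();
        for (Map.Entry<Long, Double> pair : a) {
            aByIndex.put(pair.getKey(), pair.getValue());
        }

        List<EdgeWithValues> result = new ArrayList<>();
        for (long[] edge : edges) {
            Double au = aByIndex.get(edge[0]); // first join: u == index
            Double av = aByIndex.get(edge[1]); // second join: v == index
            // Inner-join semantics: edges without a match on both sides are dropped.
            if (au != null && av != null) {
                result.add(new EdgeWithValues(edge[0], edge[1], au, av));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<long[]> edges = new ArrayList<>();
        edges.add(new long[] {1, 2});
        edges.add(new long[] {2, 3});
        edges.add(new long[] {3, 4}); // dropped below: A has no entry for index 4

        List<Map.Entry<Long, Double>> a = new ArrayList<>();
        a.add(new AbstractMap.SimpleEntry<>(1L, 10.0));
        a.add(new AbstractMap.SimpleEntry<>(2L, 20.0));
        a.add(new AbstractMap.SimpleEntry<>(3L, 30.0));

        for (EdgeWithValues r : joinEdgesWithA(edges, a)) {
            System.out.println("(" + r.u + "," + r.v + ") A[u]=" + r.au + " A[v]=" + r.av);
        }
    }
}
```

The loop body is where a map UDF would sit in the Flink version: it sees the edge and both looked-up values in a single record.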
For example, in the algorithm that you linked, you access A[u] for every edge (u,v). I assume that you have stored A in a DataSet of (index, value) pairs. You can achieve this access pattern by performing a join whose join condition specifies that the first endpoint of the edge must equal the index of A. This gives you a DataSet in which every record contains an edge (u,v) together with A[u], so you can then apply a map whose UDF receives (u,v) and A[u].

Your algorithm also accesses A[v]. This can be achieved with a second join that is similar to the first and takes the result of the first join as its input, this time matching the second endpoint of the edge against the index of A.

However, updating P will be trickier to translate to Flink. I'm not sure I understand the linked algorithm correctly: does every element of P contain a list, and does the + mean appending an element to that list (in the line P[v] = P[u] + v)?

Best,
Gábor

[1] https://en.wikipedia.org/wiki/Join_(SQL)#Equi-join

2016-10-30 8:25 GMT+01:00 otherwise777 <wou...@onzichtbaar.net>:

> Currently I'm trying to implement this algorithm [1], which requires me to
> loop over one DataSet (the edges) and access another DataSet (the vertices).
> For this loop I use a Mapping (I'm not sure if this is the correct way of
> looping over a DataSet), but I don't know how to access the elements of
> another DataSet while I'm looping over one.
>
> I know Gelly also has iterative support for this kind of thing, but it
> loops over the vertices and not the edges.
>
> [1] http://prntscr.com/d0qeyd
>
> --
> View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Looping-over-a-DataSet-and-accesing-another-DataSet-tp9778.html
> Sent from the Apache Flink User Mailing List archive at Nabble.com.