A quick fix would be to take the first join and give it a
"JoinHint.REPARTITION_HASH_SECOND" hint.
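
Roughly like the sketch below. It uses the repartition hash strategy that builds the hash table on the second input, which is `JoinHint.REPARTITION_HASH_SECOND` in the DataSet API. The datasets `left` and `right`, their contents, and the object name are made up for illustration and are not taken from Ricarda's program; only the way the hint is passed to `join` is the point.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint

object JoinHintSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Hypothetical inputs standing in for the two sides of the first join.
    val left = env.fromElements((1, "a"), (2, "b"))
    val right = env.fromElements((1, 10L), (2, 20L))

    // Pass the hint as the second argument of join: both inputs are
    // repartitioned on the key and the hash table is built from `right`.
    val joined = left
      .join(right, JoinHint.REPARTITION_HASH_SECOND)
      .where(0)
      .equalTo(0) { (l, r) => (l._1, l._2, r._2) }

    joined.print()
  }
}
```

In the real program the hint would go on the first join of the iteration, with the key selectors left as they are.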


The best thing would be to have batch exchanges for iterations.


The second best thing would be for the optimizer to recognize that a batch
exchange cannot happen (when inside an iteration) and instead set the
receiver task to break the pipeline (via TempMode.makePipelineBreaker()).


On Tue, Sep 8, 2015 at 12:43 PM, Ufuk Celebi <u...@apache.org> wrote:

>
> > On 08 Sep 2015, at 10:12, Schueler, Ricarda <
> ricarda.schue...@student.hpi.uni-potsdam.de> wrote:
> >
> > Hi,
> >
> > we tested it with version 0.9.1, but unfortunately the issue persists.
>
> Thanks for helping me debug this, Ricarda! :)
>
> From what I can tell, this is not a deadlock in the network runtime, but a
> join deadlock within an iteration.
>
> https://gist.github.com/uce/3fd5ca45383402ed1b16
>
> @Stephan, Fabian: What’s the best way to fix this for good?
>
> @Ricarda: you can work your way around this by providing
> JoinHint.REPARTITION_SORT_MERGE as a join hint in the bulk iteration, i.e.
>
> joinedtriangles = joinedtriangles.join(graph,
> JoinHint.REPARTITION_SORT_MERGE).where({triangle =>
> (triangle.edge3.vertex1, triangle.edge3.vertex2)}).equalTo("vertex1",
> "vertex2"){
>   (triangle, edge) =>
>     triangle.edge3.triangleCount = edge.triangleCount
>     triangle
> }.name("third triangle edge join")
>
> I saw that you were benchmarking this for a project. This should impact
> the runtime of your program, so you might need to re-run the experiments.
>
> – Ufuk
>
>
