[ https://issues.apache.org/jira/browse/FLINK-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293429#comment-15293429 ]
ASF GitHub Bot commented on FLINK-3780:
---------------------------------------

Github user vasia commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1980#discussion_r64048334

    --- Diff: docs/apis/batch/libs/gelly.md ---
    @@ -2250,14 +2250,33 @@ graph.run(new TranslateVertexValues(new LongValueAddOffset(vertexCount)));
         </tr>

         <tr>
    -      <td>translate.<br/><strong>TranslateEdgeValues</strong></td>
    +      <td>asm.translate.<br/><strong>TranslateEdgeValues</strong></td>
           <td>
             <p>Translate edge values using the given <code>TranslateFunction</code>.</p>
    {% highlight java %}
    graph.run(new TranslateEdgeValues(new Nullify()));
    {% endhighlight %}
           </td>
         </tr>
    +
    +    <tr>
    +      <td>library.similarity.<br/><strong>JaccardIndex</strong></td>
    +      <td>
    +        <p>Measures the similarity between vertex neighborhoods. The Jaccard Index score is computed as the number of shared numbers divided by the number of distinct neighbors. Scores range from 0.0 (no shared neighbors) to 1.0 (all neighbors are shared).</p>
    --- End diff --

    Why did you add this here and not in the "Usage" section of the library method? I find it a bit confusing... You describe graph algorithms as building blocks for other algorithms. Does Jaccard index fall in this category?


> Jaccard Similarity
> ------------------
>
>                 Key: FLINK-3780
>                 URL: https://issues.apache.org/jira/browse/FLINK-3780
>             Project: Flink
>          Issue Type: New Feature
>          Components: Gelly
>    Affects Versions: 1.1.0
>            Reporter: Greg Hogan
>            Assignee: Greg Hogan
>             Fix For: 1.1.0
>
>
> Implement a Jaccard Similarity algorithm computing all non-zero similarity
> scores. This algorithm is similar to {{TriangleListing}} but instead of
> joining two-paths against an edge list we count two-paths.
> {{flink-gelly-examples}} currently has {{JaccardSimilarityMeasure}} which
> relies on {{Graph.getTriplets()}} so only computes similarity scores for
> neighbors but not neighbors-of-neighbors.
> This algorithm is easily modified for other similarity scores such as
> Adamic-Adar similarity where the sum of endpoint degrees is replaced by the
> degree of the middle vertex.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
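
For readers who want to see the two-path counting idea from the issue description in concrete form, here is a minimal single-machine sketch in plain Java. It is not the Gelly implementation from the pull request and uses no Flink API; the class name `JaccardSketch`, the method `scores`, and the `"u,v"` string keys are all hypothetical, chosen only for illustration. It assumes a simple undirected graph without self-loops, and it computes the number of distinct neighbors as the sum of the endpoint degrees minus the shared count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative sketch only: names and data layout are assumptions, not Gelly's API.
public class JaccardSketch {

    /**
     * Returns Jaccard scores keyed by "u,v" (with u < v) for every vertex pair
     * that shares at least one neighbor. Pairs with a zero score are omitted.
     * Assumes an undirected graph with no self-loops.
     */
    public static Map<String, Double> scores(int[][] edges) {
        // Build a sorted neighbor set per vertex; sets also drop duplicate edges.
        Map<Integer, Set<Integer>> adj = new TreeMap<>();
        for (int[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new TreeSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new TreeSet<>()).add(e[0]);
        }

        // Count two-paths: each path u - w - v means w is a shared neighbor of (u, v).
        Map<String, Integer> shared = new TreeMap<>();
        for (Set<Integer> neighbors : adj.values()) {
            List<Integer> list = new ArrayList<>(neighbors);
            for (int i = 0; i < list.size(); i++) {
                for (int j = i + 1; j < list.size(); j++) {
                    shared.merge(list.get(i) + "," + list.get(j), 1, Integer::sum);
                }
            }
        }

        // distinct neighbors = deg(u) + deg(v) - shared neighbors
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : shared.entrySet()) {
            String[] pair = entry.getKey().split(",");
            int degU = adj.get(Integer.parseInt(pair[0])).size();
            int degV = adj.get(Integer.parseInt(pair[1])).size();
            int s = entry.getValue();
            result.put(entry.getKey(), (double) s / (degU + degV - s));
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] edges = {{1, 2}, {1, 3}, {2, 3}, {3, 4}};
        scores(edges).forEach((pair, score) -> System.out.println(pair + " -> " + score));
    }
}
```

Because pairs are generated from the neighbor list of each middle vertex, non-adjacent pairs such as vertices 1 and 4 in the sample graph also receive a score, which is the neighbors-of-neighbors case the description says the triplet-based approach misses.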