[ https://issues.apache.org/jira/browse/FLINK-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232200#comment-15232200 ]
ASF GitHub Bot commented on FLINK-2909: --------------------------------------- Github user vasia commented on a diff in the pull request: https://github.com/apache/flink/pull/1807#discussion_r59026954 --- Diff: docs/apis/batch/libs/gelly.md --- @@ -1734,3 +1734,547 @@ vertex represents a group of vertices and each edge represents a group of edges vertex and edge in the output graph stores the common group value and the number of represented elements. {% top %} + +Graph Generators +----------- + +Gelly provides a collection of scalable graph generators. Each generator is + +* parallelizable, in order to create large datasets +* scale-free, generating the same graph regardless of parallelism +* thrifty, using as few operators as possible + +### Complete Graph + +An undirected graph connecting every distinct pair of vertices. + +<div class="codetabs" markdown="1"> +<div data-lang="java" markdown="1"> +{% highlight java %} +ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); + +Graph<LongValue,NullValue,NullValue> graph = new CompleteGraph(env, 5) + .generate(); +{% endhighlight %} +</div> + +<div data-lang="scala" markdown="1"> +{% highlight scala %} +import org.apache.flink.api.scala._ +import org.apache.flink.graph.generator.CompleteGraph + +val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment + +val graph = new CompleteGraph(env.getJavaEnv, 5).generate() +{% endhighlight %} +</div> +</div> + +<svg class="graph" width="540" height="540" + xmlns="http://www.w3.org/2000/svg" + xmlns:xlink="http://www.w3.org/1999/xlink"> + + <line x1="270" y1="40" x2="489" y2="199" /> + <line x1="270" y1="40" x2="405" y2="456" /> + <line x1="270" y1="40" x2="135" y2="456" /> + <line x1="270" y1="40" x2="51" y2="199" /> + + <line x1="489" y1="199" x2="405" y2="456" /> + <line x1="489" y1="199" x2="135" y2="456" /> + <line x1="489" y1="199" x2="51" y2="199" /> + + <line x1="405" y1="456" x2="135" y2="456" /> + <line x1="405" y1="456" 
x2="51" y2="199" /> + + <line x1="135" y1="456" x2="51" y2="199" /> + + <circle cx="270" cy="40" r="20" /> + <text x="270" y="40">0</text> + + <circle cx="489" cy="199" r="20" /> + <text x="489" y="199">1</text> + + <circle cx="405" cy="456" r="20" /> + <text x="405" y="456">2</text> + + <circle cx="135" cy="456" r="20" /> + <text x="135" y="456">3</text> + + <circle cx="51" cy="199" r="20" /> + <text x="51" y="199">4</text> +</svg> + +### Cycle Graph + +An undirected graph where all edges form a single cycle. + +<div class="codetabs" markdown="1"> +<div data-lang="java" markdown="1"> +{% highlight java %} +ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); + +Graph<LongValue,NullValue,NullValue> graph = new CycleGraph(env, 5) + .generate(); +{% endhighlight %} +</div> + +<div data-lang="scala" markdown="1"> +{% highlight scala %} +import org.apache.flink.api.scala._ +import org.apache.flink.graph.generator.CycleGraph + +val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment + +val graph = new CycleGraph(env.getJavaEnv, 5).generate() +{% endhighlight %} +</div> +</div> + +<svg class="graph" width="540" height="540" + xmlns="http://www.w3.org/2000/svg" + xmlns:xlink="http://www.w3.org/1999/xlink"> + + <line x1="270" y1="40" x2="489" y2="199" /> + <line x1="489" y1="199" x2="405" y2="456" /> + <line x1="405" y1="456" x2="135" y2="456" /> + <line x1="135" y1="456" x2="51" y2="199" /> + <line x1="51" y1="199" x2="270" y2="40" /> + + <circle cx="270" cy="40" r="20" /> + <text x="270" y="40">0</text> + + <circle cx="489" cy="199" r="20" /> + <text x="489" y="199">1</text> + + <circle cx="405" cy="456" r="20" /> + <text x="405" y="456">2</text> + + <circle cx="135" cy="456" r="20" /> + <text x="135" y="456">3</text> + + <circle cx="51" cy="199" r="20" /> + <text x="51" y="199">4</text> +</svg> + +### Empty Graph + +The graph containing no edges. 
+ +<div class="codetabs" markdown="1"> +<div data-lang="java" markdown="1"> +{% highlight java %} +ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); + +Graph<LongValue,NullValue,NullValue> graph = new EmptyGraph(env, 5) + .generate(); +{% endhighlight %} +</div> + +<div data-lang="scala" markdown="1"> +{% highlight scala %} +import org.apache.flink.api.scala._ +import org.apache.flink.graph.generator.EmptyGraph + +val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment + +val graph = new EmptyGraph(env.getJavaEnv, 5).generate() +{% endhighlight %} +</div> +</div> + +<svg class="graph" width="540" height="80" + xmlns="http://www.w3.org/2000/svg" + xmlns:xlink="http://www.w3.org/1999/xlink"> + + <circle cx="30" cy="40" r="20" /> + <text x="30" y="40">0</text> + + <circle cx="150" cy="40" r="20" /> + <text x="150" y="40">1</text> + + <circle cx="270" cy="40" r="20" /> + <text x="270" y="40">2</text> + + <circle cx="390" cy="40" r="20" /> + <text x="390" y="40">3</text> + + <circle cx="510" cy="40" r="20" /> + <text x="510" y="40">4</text> +</svg> + +### Grid Graph + +An undirected graph connecting vertices in a regular tiling in one or more dimensions. + +<div class="codetabs" markdown="1"> +<div data-lang="java" markdown="1"> +{% highlight java %} +ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); + +Graph<LongValue,NullValue,NullValue> graph = new GridGraph(env) + .addDimension(2, false) --- End diff -- I think we should explain what `addDimension` does and what its arguments represent. > Gelly Graph Generators > ---------------------- > > Key: FLINK-2909 > URL: https://issues.apache.org/jira/browse/FLINK-2909 > Project: Flink > Issue Type: New Feature > Components: Gelly > Affects Versions: 1.0.0 > Reporter: Greg Hogan > Assignee: Greg Hogan > > Include a selection of graph generators in Gelly. 
Generated graphs will be > useful for performing scalability, stress, and regression testing as well as > benchmarking and comparing algorithms, for both Flink users and developers. > Generated data is infinitely scalable yet described by a few simple > parameters and can often substitute for user data or sharing large files when > reporting issues. > There are multiple categories of graphs as documented by > [NetworkX|https://networkx.github.io/documentation/latest/reference/generators.html] > and elsewhere. > Graphs may be well-defined, e.g. the [Chvátal > graph|https://en.wikipedia.org/wiki/Chv%C3%A1tal_graph]. These may be > sufficiently small to populate locally. > Graphs may be scalable, e.g. complete and star graphs. These should use > Flink's distributed parallelism. > Graphs may be stochastic, e.g. [RMat > graphs|http://snap.stanford.edu/class/cs224w-readings/chakrabarti04rmat.pdf]. > A key consideration is that the graphs should source randomness from a > seedable PRNG and generate the same Graph regardless of parallelism.
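The requirement that a stochastic generator "generate the same Graph regardless of parallelism" can be illustrated with a minimal plain-Java sketch. This is not Gelly's implementation: the class `SeededBlocks`, its methods, and the block size are hypothetical names chosen for illustration, and the workers are simulated sequentially rather than run on Flink. The idea shown is one common approach: split the output into fixed blocks, seed a fresh PRNG per block from the user seed and the block index, and emit blocks in index order, so the result does not depend on which worker handled which block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not part of Flink or Gelly.
public class SeededBlocks {
    // Assumed fixed block size: each block always yields this many edges.
    static final int EDGES_PER_BLOCK = 4;

    // Generates one block of random edges. The output depends only on
    // (seed, blockIndex, vertexCount), never on the worker that runs it.
    static long[][] generateBlock(long seed, int blockIndex, int vertexCount) {
        // Derive a per-block seed so that no PRNG state is shared across blocks.
        Random random = new Random(seed ^ (blockIndex * 0x9E3779B97F4A7C15L));
        long[][] edges = new long[EDGES_PER_BLOCK][];
        for (int i = 0; i < EDGES_PER_BLOCK; i++) {
            edges[i] = new long[] {random.nextInt(vertexCount), random.nextInt(vertexCount)};
        }
        return edges;
    }

    // Simulates `parallelism` workers, each taking every parallelism-th block,
    // then concatenates the blocks in index order.
    static List<String> generate(long seed, int vertexCount, int blockCount, int parallelism) {
        long[][][] byBlock = new long[blockCount][][];
        for (int worker = 0; worker < parallelism; worker++) {
            for (int block = worker; block < blockCount; block += parallelism) {
                byBlock[block] = generateBlock(seed, block, vertexCount);
            }
        }
        List<String> edges = new ArrayList<>();
        for (long[][] block : byBlock) {
            for (long[] edge : block) {
                edges.add(edge[0] + "->" + edge[1]);
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        List<String> serial = generate(42L, 10, 8, 1);
        List<String> parallel = generate(42L, 10, 8, 4);
        // Same seed and parameters give the same edge list at any parallelism.
        System.out.println(serial.equals(parallel)); // prints "true"
    }
}
```

The block count is fixed by the generator's parameters rather than by the number of workers, which is what makes the output independent of parallelism. A real generator would likely use a stronger seed-mixing scheme than the XOR shown here.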