Great, thanks for sharing, Martin!

On 24 November 2015 at 15:00, Martin Junghanns <m.jungha...@mailbox.org> wrote:
> Hi,
>
> I wrote a short blog post about the ldbc-flink tool including a short
> overview of Flink and a Gelly example.
>
> http://ldbcouncil.org/blog/ldbc-and-apache-flink
>
> Best,
> Martin
>
> On 06.10.2015 11:00, Martin Junghanns wrote:
> > Hi Vasia,
> >
> > No problem. Sure, Gelly is just a map() call away :)
> >
> > Best,
> > Martin
> >
> > On 06.10.2015 10:53, Vasiliki Kalavri wrote:
> >> Hi Martin,
> >>
> >> Thanks a lot for sharing! This is a very useful tool.
> >> I only had a quick look, but if we merge label and payload inside a
> >> Tuple2, then it should also be Gelly-compatible :)
> >>
> >> Cheers,
> >> Vasia.
> >>
> >> On 6 October 2015 at 10:03, Martin Junghanns <m.jungha...@mailbox.org>
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> For our benchmarks with Flink, we are using a data generator provided
> >>> by the LDBC project (Linked Data Benchmark Council) [1][2]. The
> >>> generator uses MapReduce to create directed, labeled, attributed
> >>> graphs that mimic properties of real online social networks (e.g.,
> >>> degree distribution, diameter). The output is stored in several files,
> >>> either locally or in HDFS. Each file represents a vertex, edge or
> >>> multi-valued property class.
> >>>
> >>> I wrote a little tool that parses and transforms the LDBC output into
> >>> two datasets representing vertices and edges. Each vertex has a unique
> >>> ID, a label and a payload according to the LDBC schema. Each edge has
> >>> a unique ID, a label, source and target vertex IDs, and also a payload
> >>> according to the schema.
> >>>
> >>> I thought this may be useful for others, so I put it on GitHub [3]. It
> >>> currently uses Flink 0.10-SNAPSHOT as it depends on some fixes made in
> >>> there.
> >>>
> >>> Best,
> >>> Martin
> >>>
> >>> [1] http://ldbcouncil.org/
> >>> [2] https://github.com/ldbc/ldbc_snb_datagen
> >>> [3] https://github.com/s1ck/ldbc-flink-import
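
For anyone reading along, the "map() call" mentioned in the thread can be sketched roughly as below. This is plain Python with stand-in types, not the Flink or Gelly API and not the tool's actual classes: LdbcVertex, LdbcEdge, to_gelly_vertex and to_gelly_edge are hypothetical names, and the sample labels and properties are only illustrative. It assumes the Gelly shapes of (id, value) for a vertex and (source, target, value) for an edge, with label and payload merged into one value as Vasia suggested. Keeping the edge ID inside the edge value is my own assumption.

```python
from typing import Any, Dict, NamedTuple, Tuple


class LdbcVertex(NamedTuple):
    """Hypothetical stand-in for a vertex emitted by the import tool."""
    vertex_id: int
    label: str
    payload: Dict[str, Any]


class LdbcEdge(NamedTuple):
    """Hypothetical stand-in for an edge emitted by the import tool."""
    edge_id: int
    label: str
    source_id: int
    target_id: int
    payload: Dict[str, Any]


def to_gelly_vertex(v: LdbcVertex) -> Tuple[int, Tuple[str, Dict[str, Any]]]:
    # Gelly-style vertex: (id, value), where value merges label and payload.
    return (v.vertex_id, (v.label, v.payload))


def to_gelly_edge(e: LdbcEdge) -> Tuple[int, int, Tuple[int, str, Dict[str, Any]]]:
    # Gelly-style edge: (source, target, value).
    # Assumption: the edge ID travels inside the value so it is not lost.
    return (e.source_id, e.target_id, (e.edge_id, e.label, e.payload))


# Illustrative sample data only.
vertices = [
    LdbcVertex(1, "person", {"firstName": "Alice"}),
    LdbcVertex(2, "person", {"firstName": "Bob"}),
]
edges = [LdbcEdge(10, "knows", 1, 2, {"creationDate": "2015-10-06"})]

# The "map() call": one pass over each dataset.
gelly_vertices = list(map(to_gelly_vertex, vertices))
gelly_edges = list(map(to_gelly_edge, edges))
```

In Flink itself, the same reshaping would presumably be a MapFunction over each DataSet before handing both to Gelly.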