Hi,

I wrote a short blog post about the ldbc-flink tool, including a short overview of Flink and a Gelly example:
http://ldbcouncil.org/blog/ldbc-and-apache-flink

Best,
Martin

On 06.10.2015 11:00, Martin Junghanns wrote:
> Hi Vasia,
>
> No problem. Sure, Gelly is just a map() call away :)
>
> Best,
> Martin
>
> On 06.10.2015 10:53, Vasiliki Kalavri wrote:
>> Hi Martin,
>>
>> thanks a lot for sharing! This is a very useful tool.
>> I only had a quick look, but if we merge label and payload inside a Tuple2,
>> then it should also be Gelly-compatible :)
>>
>> Cheers,
>> Vasia.
>>
>> On 6 October 2015 at 10:03, Martin Junghanns <m.jungha...@mailbox.org>
>> wrote:
>>
>>> Hi all,
>>>
>>> For our benchmarks with Flink, we are using a data generator provided by
>>> the LDBC project (Linked Data Benchmark Council) [1][2]. The generator uses
>>> MapReduce to create directed, labeled, attributed graphs that mimic
>>> properties of real online social networks (e.g., degree distribution,
>>> diameter). The output is stored in several files, either locally or in HDFS.
>>> Each file represents a vertex, edge or multi-valued property class.
>>>
>>> I wrote a little tool that parses and transforms the LDBC output into two
>>> datasets representing vertices and edges. Each vertex has a unique id, a
>>> label and payload according to the LDBC schema. Each edge has a unique id,
>>> a label, source and target vertex IDs and also payload according to the
>>> schema.
>>>
>>> I thought this may be useful for others so I put it on GitHub [3]. It
>>> currently uses Flink 0.10-SNAPSHOT as it depends on some fixes made in
>>> there.
>>>
>>> Best,
>>> Martin
>>>
>>> [1] http://ldbcouncil.org/
>>> [2] https://github.com/ldbc/ldbc_snb_datagen
>>> [3] https://github.com/s1ck/ldbc-flink-import
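
For readers wondering what "merge label and payload inside a Tuple2" could look like, here is a minimal plain-Java sketch of the reshaping step. It deliberately avoids any Flink dependency so it runs on its own. All type and method names (LdbcVertex, LdbcEdge, Pair, GellyVertex, GellyEdge, toGellyVertex, toGellyEdge) are hypothetical stand-ins and are not taken from the ldbc-flink-import code. The only assumption about Gelly is its general shape: a vertex is an (id, value) pair and an edge is a (source, target, value) triple. In a real Flink job the two functions would be the bodies of the map() calls mentioned in the thread.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class LdbcToGellySketch {

    // Hypothetical stand-ins for the importer's output as described in the thread:
    // a vertex has an id, a label and a payload; an edge additionally has
    // source and target vertex ids.
    record LdbcVertex(long id, String label, Map<String, String> payload) {}
    record LdbcEdge(long id, String label, long sourceId, long targetId,
                    Map<String, String> payload) {}

    // Plain-Java stand-in for a Tuple2 holding (label, payload).
    record Pair<A, B>(A f0, B f1) {}

    // Shapes mirroring Gelly's model: vertex = (id, value), edge = (source, target, value).
    record GellyVertex<K, V>(K id, V value) {}
    record GellyEdge<K, V>(K source, K target, V value) {}

    // Merge label and payload into a single vertex value.
    static GellyVertex<Long, Pair<String, Map<String, String>>> toGellyVertex(LdbcVertex v) {
        return new GellyVertex<>(v.id(), new Pair<>(v.label(), v.payload()));
    }

    // Same idea for edges. The edge id is dropped in this sketch because the
    // (source, target, value) shape has no slot for it; it could be carried
    // inside the value if it is needed downstream.
    static GellyEdge<Long, Pair<String, Map<String, String>>> toGellyEdge(LdbcEdge e) {
        return new GellyEdge<>(e.sourceId(), e.targetId(),
                new Pair<>(e.label(), e.payload()));
    }

    public static void main(String[] args) {
        // Made-up sample data, only for illustration.
        List<LdbcVertex> vertices = List.of(
                new LdbcVertex(1L, "person", Map.of("firstName", "Alice")),
                new LdbcVertex(2L, "person", Map.of("firstName", "Bob")));
        List<LdbcEdge> edges = List.of(
                new LdbcEdge(10L, "knows", 1L, 2L, Map.of("since", "2015")));

        // In Flink these would be DataSet.map() calls; streams play that role here.
        var gellyVertices = vertices.stream()
                .map(LdbcToGellySketch::toGellyVertex)
                .collect(Collectors.toList());
        var gellyEdges = edges.stream()
                .map(LdbcToGellySketch::toGellyEdge)
                .collect(Collectors.toList());

        System.out.println(gellyVertices);
        System.out.println(gellyEdges);
    }
}
```

Keeping label and payload together in one value means the graph needs only a single vertex value type and a single edge value type, which is what the Tuple2 suggestion amounts to.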