Hi Vasia, Here <https://docs.google.com/document/d/1SChonW4bXNiqU2Pz9PPjuKDuKrBGmESrPcJXamj4dlI/edit?usp=sharing> is the Google doc.
-- Saumitra S. Shahapure On Thu, Oct 22, 2015 at 11:10 AM, Vasiliki Kalavri < vasilikikala...@gmail.com> wrote: > Hi Saumitra, > > could you maybe add your ideas above in a google doc and share the link, so > we can easily comment and/or edit? > > Thank you! > -Vasia. > > On 22 October 2015 at 10:53, Andra Lungu <lungu.an...@gmail.com> wrote: > > > Hi Saumitra, > > > > As you already noticed, the first version (with duplicates) is highly > > inefficient and consumes a lot of memory. So, I suggest we drop it for > now. > > The version with the label makes a lot of modifications on the base Graph > > class, and this, in my opinion would make it more difficult to grasp. > > Bipartite Graphs are not as common as regular graphs. And then, when you > > would add an Isomorphic Graph class (stupid example), you'll need to > strip > > stuff out again. > > > > I believe we should go with a BipartiteGraph class which extends Graph > and > > which has a DataSet of topVertices and a DataSet of bottomVertices. Apart > > from getTopVertices() methods, we would still need functions such as > > getNumberOfTopVertices(). For the time being, just support the funstiones > > mentiones as well as the basics: fromCollection, fromDataSet, etc, get*, > > join*, map*, filter*. > > > > Wait for a few more opinions and dig in. I believe we should discuss this > > even further and propose Jiras for examples of algos for bipartite > graphs. > > > > Once we have the design specs well figured out, and if I find some time > > I'll gladly chime in :) > > > > Cheers! > > Andra > > > > On Wed, Oct 21, 2015 at 8:45 PM, Saumitra Shahapure < > > saumitra.offic...@gmail.com> wrote: > > > > > In FLINK-2254 <https://issues.apache.org/jira/browse/FLINK-2254> , we > > > want to extend Graph API of Gelly by adding support for bipartite > graphs > > > too. In the long term, the support for newer graph types can be added > in > > > the same way. Also specialised algorithms for special types of graphs > > > should be implemented. > > > > > > For bipartite graph, a new class BipartiteGraph can be implemented > which > > > extends Graph. Other graph algorithms which are written for Graph can > be > > > used for BipartiteGraph too, because of inheritance. > > > Initialisation functions like public static <K, VV, EV> Graph<K, VV, > EV> > > > fromCollection(Collection<Vertex<K, VV>> vertices,Collection<Edge<K, > EV>> > > > edges, ExecutionEnvironment context) need to be duplicated as public > > static > > > <K, VV, EV> BipartiteGraph<K, VV, EV> > fromCollection(Collection<Vertex<K, > > > VV>> topVertices,Collection<Vertex<K, VV>> bottomVertices, > > > Collection<Edge<K, EV>> edges, ExecutionEnvironment context) > > > Graph validation does not need to happen implicitly. user can call > > > BipartiteGraph.validate() in case he wants to check sanity of data. > > > Vertex modes is primary requirement. BipartiteGraph should have > > functions, > > > getTopVertices() and getBottomVertices(). There are three ways to > > implement > > > it: > > > Simplest and least efficient way is to maintain vertices variable in > > Graph > > > in addition to two more Datasets topVertices, bottomVertices. Benefit > of > > > this approach is that Graph class would not need any modification at > all. > > > this.vertices variable access inside Graph class would work correctly. > > > Disadvantage is that multiple copies of vertex dataset are created and > > > maintained. So this is inefficient in terms of memory. > > > Another way is to keep topVertices and bottomVertices variables in > > > BipartiteGraph. vertices variable in Graph would stay empty. > > getVertices() > > > function of Graph would be overloaded by BipartiteGraph and > reimplemented > > > as union of of topVertices and bottomVertices. Since, vertices is a > > private > > > variable, external view of vertices stays unaffected. All functions of > > > Graph class which use vertices local variable (e.g. > > getVerticesAsTuple2()) > > > need to use getVertices() instead of this.vertices. Disadvantage of > this > > > method is Graph.vertices variable would have invalid value throughout > for > > > BipartiteGraph. > > > Another way is to ‘label’ vertices with an integer. So public class > > > BipartiteGraph<K,VV,EV> extends Graph<K, Tuple2<VV,Integer>, EV> would > be > > > the signature. Initialisers would tag the vertices according to their > > mode. > > > getVertices() should be overridden to strip-off the tag values so that > > > users would get consistent view of vertex dataset for Graph and > > > BipartiteGraph. getTopVertices/getBottomVertices would be filter > > functions > > > on vertex tags. > > > I personally feel method 2 to be most elegant. > > > Functions like addVertices(vertexList) are not relevant without > > specifying > > > whether they are top or bottom. A possibility could be to add them to > > > topVertices by default. And have overloaded function > > > addVertices(vertexList, mode) to add them to specific partition. > > > Specialised algorithms for Bipartite graphs can implement new > > > BipartiteGraphAlgorithm interface. It would be similar to > GraphAlgorithm. > > > In future, newer types of graphs can be similarly derived from Graph > and > > > type hierarchy of Graph can be created. Class hierarchy would be better > > > here rather than interface hierarchy, because the functions which are > > > applicable to all derived classes can be implemented in the base class. > > > As a part of future refactoring, Graph transformation functions should > > > rather be a new class implementing GraphAlgorithm rather than function > in > > > Graph class. This may make implementing complex graph types easier. > > > PS. Do I need to add it in wiki too? I can copy the contents to wiki if > > > you say so, > > > -- Saumitra Shahapure > > > > > > > > >