Re: [Proposal] Addition to Gelly

2015-08-16 Thread Fabian Hueske
I think skewed graphs can be considered quite common, i.e., not a corner case. So if there is code to significantly speed up computations on such graphs, this would definitely be interesting for Gelly, IMO. Would it be possible to integrate your approach with existing library algorithms and offer

Re: [Proposal] Addition to Gelly

2015-08-12 Thread Stephan Ewen
Same here as for Max, I am not familiar enough any more to make really good comments. Some generic comments, though: - Check whether you really need a technique. Techniques that improve corner cases, but make the code much more complex and make the behavior of algorithms less robust are often b

Re: [Proposal] Addition to Gelly

2015-08-12 Thread Maximilian Michels
I think this is a decision to be made by the people involved in the Gelly library. I'm not very familiar with graph processing libraries. Thus, it is hard for me to assess the value of this contribution. However, you outlined pretty well that for highly skewed graphs your technique results in a muc

Re: [Proposal] Addition to Gelly

2015-08-12 Thread Andra Lungu
I would love to get some feedback from the guys at data Artisans about this one. So far, the comments originated and spread in the Stockholm area :) On Tue, Aug 11, 2015 at 6:33 PM, Andra Lungu wrote: > Hi Samia, > > A good method to statistically determine skewed vertices was beyond the > purpo

Re: [Proposal] Addition to Gelly

2015-08-11 Thread Andra Lungu
Hi Samia, A good method to statistically determine skewed vertices was beyond the scope of my thesis. Unfortunately, the statistical methods that fit a power law distribution don't do a good job. So what I do is plot the degree distribution and then visually determine the threshold. That

Re: [Proposal] Addition to Gelly

2015-08-11 Thread Samia Khalid
Dear Andra, The idea seems pretty nice. I wonder how you decide the threshold to separate the high degree vertices from the low degree vertices. Regards, Samia On Tue, Aug 11, 2015 at 3:41 PM, Andra Lungu wrote: > Hi Paris, > > Nice to virtually meet you too :) > > Maybe it makes sense to sha

Re: [Proposal] Addition to Gelly

2015-08-11 Thread Andra Lungu
Hi Paris, Nice to virtually meet you too :) Maybe it makes sense to share my freshest chart: https://drive.google.com/file/d/0BwnaKJcSLc43Qm9fZV9RUE5zT1E/view?usp=sharing This is for the Community Detection algorithm [1] in which you basically find communities by continuously rescoring vertices.

Re: [Proposal] Addition to Gelly

2015-08-11 Thread Paris Carbone
Hi Andra, and nice to meet you btw :) It sounds like a very fancy way to deal with skew; I like the idea even though I am not a graph analytics expert. Have you run any experiments or benchmarks to see when this is preferable? Users should be aware of when they will benefit from using it since node s

Re: [Proposal] Addition to Gelly

2015-08-11 Thread Andra Lungu
Hi Vasia, I shall polish the functions a bit, but this is more or less what I had in mind: GSA Jaccard [what we have in Gelly right now]: https://github.com/andralungu/gelly-partitioning/blob/master/src/main/java/example/GSAJaccardSimilarityMeasure.java The same version with node split: https://gi

Re: [Proposal] Addition to Gelly

2015-08-11 Thread Vasiliki Kalavri
Hi Andra, thanks for offering to add this work to Gelly and for starting the discussion! What do you think this would look like from an API point of view? Is it easy to make it transparent to the application? Could you give us a simple example of what you have in mind? Apart from usability, we sh