Re: Flink Scala performance

2015-07-18 Thread Michele Bertoni
Hi, actually the same happens to me on my MacBook Pro when it is running on battery rather than plugged in, and it is twice as slow if I am using HDFS. In my case it seems that in power-saving mode JVM commands have a very high latency, i.e. a simple "hdfs dfs -ls /" takes about 20 seconds when only on battery, so it is

Re: Flink Scala performance

2015-07-18 Thread Vinh June
That sounds unreasonable to me, because I'm also working on other Java projects and none of them takes that long to fire up the JVM. Strange! Do you have any suggestions to fix this?

Re: Containment Join Support

2015-07-18 Thread Vasiliki Kalavri
Hi Martin, I'm really glad to see that you've started using Gelly :) I think that a graph summarization library method would be a great addition! Let me know if you need help and if you want to discuss ideas or other methods. Cheers, Vasia. On 17 July 2015 at 12:25, Martin Junghanns wrote: >

Re: Gelly EOFException

2015-07-18 Thread Vasiliki Kalavri
Hi Flavio, Gelly currently performs no sanity checks on the input graph data. We decided to leave it to the user to check that they have a valid graph, for performance reasons. That means that there might exist Gelly methods that assume that your input graph is valid, i.e. no duplicate vertice
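
Since the message above leaves validity checks to the user, a pre-flight check for duplicate vertex IDs could look roughly like the sketch below. It is only an illustration under stated assumptions: `GraphSanityCheck` and `duplicateVertexIds` are hypothetical names, not part of Gelly, and the check runs on a plain in-memory Java list rather than on a Flink DataSet.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not a Gelly class): checks vertex IDs held in a plain Java list. */
final class GraphSanityCheck {

    private GraphSanityCheck() {}

    /** Returns each vertex ID that occurs more than once, ordered by first repetition. */
    static <K> List<K> duplicateVertexIds(List<K> vertexIds) {
        Set<K> seen = new HashSet<>();
        Set<K> duplicates = new LinkedHashSet<>();
        for (K id : vertexIds) {
            // Set.add returns false when the ID was already present.
            if (!seen.add(id)) {
                duplicates.add(id);
            }
        }
        return new ArrayList<>(duplicates);
    }
}
```

An empty result means no vertex ID is repeated. For data that does not fit in memory the same idea would have to be expressed as a distributed grouping or counting step, which this sketch does not attempt.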

Re: HBase on 4 machine cluster - OutOfMemoryError

2015-07-18 Thread Lydia Ickler
Hi, yes, it is in one row. Each row represents a patient that has values of 20,000 different genes stored in one column family and one health status value in a second column family. > On 18.07.2015 at 15:38, Stephan Ewen wrote: > > This error is in the HBase RPC Service. Apparently the R

Re: HBase on 4 machine cluster - OutOfMemoryError

2015-07-18 Thread Stephan Ewen
This error is in the HBase RPC Service. Apparently the RPC message is very large. Is the data that you request in one row? On 18.07.2015 at 00:50, "Lydia Ickler" wrote: > Hi all, > > I am trying to read a data set from HBase within a cluster application. > The data is about 90 MB in size. > > When I ru

Re: Flink deadLetters

2015-07-18 Thread Flavio Pompermaier
The job is quite simple: it just reads 10 Parquet directories, extracts some information from the Thrift objects and generates Tuple3, then applies a project() and a distinct() so that an external service is called only for some of the extracted IDs (the external service translates the local ID into a global one). Then there are