Re: read .gz files

2015-02-19 Thread Sebastian
Upgrading to 0.8.1 helped, thx! On 19.02.2015 22:08, Robert Metzger wrote: Hey, are you using Flink 0.8.0? I think we've added support for Hadoop input formats with Scala in 0.8.1 and 0.9 (master). The following code just printed me the list of all page titles of the Catalan Wikipedia ;) (bui

Re: Flink test examples

2015-02-19 Thread Robert Metzger
Hi Plamen, thank you for your interest in Flink! It seems you've already found the Flink examples. Please note that not all examples are contained in the "examples/" directory of the binary release. There are some more examples "hidden" in the source code. Here are the Java examples: https://

Re: read .gz files

2015-02-19 Thread Robert Metzger
Hey, are you using Flink 0.8.0? I think we've added support for Hadoop input formats with Scala in 0.8.1 and 0.9 (master). The following code just printed me the list of all page titles of the Catalan Wikipedia ;) (built against master) def main(args: Array[String]) { val env = ExecutionEnvi
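
(The archived snippet above is cut off. Below is a hedged sketch of the approach, not Robert's exact code: reading a gzipped text file through Hadoop's TextInputFormat from the Scala API, assuming Flink 0.8.1+ with the Hadoop-compatibility classes on the classpath. The input path is a placeholder, and the package of the Scala HadoopInputFormat wrapper may differ between releases.)

import org.apache.flink.api.scala._
import org.apache.flink.api.scala.hadoop.mapreduce.HadoopInputFormat
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.input.{FileInputFormat, TextInputFormat}

object ReadGzFile {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val job = Job.getInstance

    // TextInputFormat picks a decompression codec (e.g. GzipCodec) from the file extension.
    val hadoopInput = new HadoopInputFormat[LongWritable, Text](
      new TextInputFormat, classOf[LongWritable], classOf[Text], job)
    FileInputFormat.addInputPath(job, new Path("file:///path/to/input.xml.gz")) // placeholder path

    // Each record is (byte offset, line); keep only the line text.
    val lines = env.createInput(hadoopInput).map(_._2.toString)
    lines.print()
    // On 0.8.x, print() only registers a sink, so follow it with env.execute().
  }
}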

Re: read .gz files

2015-02-19 Thread Sebastian
I tried to follow the example on the web page like this: --- implicit val env = ExecutionEnvironment.getExecutionEnvironment val job = Job.getInstance val hadoopInput = new HadoopInputFormat[LongWritable,Text]( new TextInput

Re: read .gz files

2015-02-19 Thread Robert Metzger
I just had a look at Hadoop's TextInputFormat. In hadoop-common-2.2.0.jar, the following compression codecs are contained: org.apache.hadoop.io.compress.BZip2Codec org.apache.hadoop.io.compress.DefaultCodec org.apache.hadoop.io.compress.DeflateCodec org.apache.hadoop.io.compress.GzipCodec org

Re: read .gz files

2015-02-19 Thread Robert Metzger
Hi, right now Flink itself only has support for reading ".deflate" files. It's basically the same algorithm as gzip, but gzip files seem to have some header which makes the two formats incompatible. But you can easily use HadoopInputFormats with Flink. I'm sure there is a Hadoop IF for reading gzip
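
(A minimal sketch of the built-in route mentioned here, my own example rather than code from the thread: it assumes Flink picks the deflate decompression from the ".deflate" file extension, and the path is a placeholder. For gzip, see the Hadoop input format sketch earlier in this digest.)

import org.apache.flink.api.scala._

object ReadDeflateFile {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    // Flink's own text reader handles ".deflate" files; gzip is not supported natively here.
    val lines = env.readTextFile("file:///path/to/part-00000.deflate") // placeholder path
    lines.print() // on 0.8.x, follow with env.execute()
  }
}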

read .gz files

2015-02-19 Thread Sebastian
Hi, does Flink support reading gzipped files? I haven't found any info about this on the website. Best, Sebastian

Re: Using Spargel's FilterOnVertices gets stuck.

2015-02-19 Thread Carsten Brandt
Sounds like what is getting stuck here is not Flink but your while loop, which never reaches the point where there are no nodes left in the graph. On 18.02.2015 22:39, HungChang wrote: > Thank you for the information you provided. > > Yes, it runs an iterative algorithm on a graph and feeds the resul

Re: Exception: Insufficient number of network buffers: required 120, but only 2 of 2048 available

2015-02-19 Thread Henry Saputra
Would it be helpful to add an additional hint to the error message in NetworkBufferPool#createBufferPool, telling users to check the taskmanager.network.numberOfBuffers property? - Henry On Wed, Feb 18, 2015 at 4:32 PM, Yiannis Gkoufas wrote: > Perfect! It worked! Thanks a lot for the help! > > On 18 February
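
(Presumably the fix Yiannis confirms above was increasing this property. A hedged illustration, assuming it is set in conf/flink-conf.yaml; 4096 is only an example value, while the exception above reports a pool of 2048 buffers.)

taskmanager.network.numberOfBuffers: 4096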

Re: Flink test examples

2015-02-19 Thread Plamen L. Simeonov
Well, I found some bundled examples on the Flink page, but is there a more elaborate test set with more details (e.g. a Google keyword search simulation) around? Thanks++ ___ Dr.-Ing. Plamen L. Simeonov Department 1: Geodesy and Remote Sensing Section 1.5: Geoinformatics

Re: Flink test examples

2015-02-19 Thread Plamen L. Simeonov
Dear all, can somebody help us with some information about ready-to-run examples for a fresh Flink installation? We simply wish to know how it works when we feed it some data (what comes out and how it performs). Many thanks! ___ Dr.-Ing. Plamen L.

Re: Efficient datatypes?

2015-02-19 Thread Stephan Ewen
Hey! All data types are always kept serialized for caching/hashing/sorting. Deserialization is sometimes needed in the internal algorithms (on hash collisions and sort-prefix collisions). The most efficient data types for that are actually Tuples. POJOs and other data types are a little less effic
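
(A small illustration of the Tuple case described here, my own sketch rather than code from the thread: grouping and aggregating (word, count) pairs by tuple position, the kind of job where the runtime can keep records serialized as described above.)

import org.apache.flink.api.scala._

object TupleTypesExample {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    // (word, count) pairs as tuples -- the data type named above as the most efficient.
    val counts = env.fromElements(("flink", 1), ("hadoop", 1), ("flink", 1))
    counts.groupBy(0).sum(1).print() // group and aggregate by tuple position
  }
}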

Efficient datatypes?

2015-02-19 Thread Kruse, Sebastian
Hi everyone, I think that during one of the meetups it was mentioned that Flink can in some cases operate on serialized data. Assuming I understood that correctly, which cases would those be, i.e., which data types and operators support such a feature? Cheers, Sebastian --- Sebastian Kruse Doktor