Re: How can handles Exist ,not Exist query on flink

2015-07-10 Thread hagersaleh
I would like an example of using join or coGroup to handle EXISTS or NOT EXISTS queries. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-can-handles-Exist-not-Exist-query-on-flink-tp1939p2006.html Sent from the Apache Flink User Mailing List archive. mail
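
One way to think about the question, as a minimal sketch and not Flink API code: EXISTS behaves like a semi-join and NOT EXISTS like an anti-join, and both follow from a coGroup-style function that sees, per key, all records from each side. The plain-Python helpers below (`co_group`, `exists`, `not_exists`, and the sample `customers`/`orders` data are hypothetical names chosen for illustration) imitate that grouping in memory. In a Flink job the same per-key check, "is the other side's group empty or not", would presumably sit inside the coGroup function.

```python
from collections import defaultdict


def co_group(left, right, left_key, right_key):
    """Yield (key, left_records, right_records) for every key on either side.

    This imitates the contract of a coGroup operator in plain Python;
    it is NOT the Flink API.
    """
    groups = defaultdict(lambda: ([], []))
    for rec in left:
        groups[left_key(rec)][0].append(rec)
    for rec in right:
        groups[right_key(rec)][1].append(rec)
    for key, (lefts, rights) in groups.items():
        yield key, lefts, rights


def exists(left, right, left_key, right_key):
    """Semi-join: keep left records with at least one match on the right."""
    return [rec
            for _, lefts, rights in co_group(left, right, left_key, right_key)
            if rights
            for rec in lefts]


def not_exists(left, right, left_key, right_key):
    """Anti-join: keep left records with no match on the right."""
    return [rec
            for _, lefts, rights in co_group(left, right, left_key, right_key)
            if not rights
            for rec in lefts]


# Hypothetical sample data: (customer_id, name) and (order_id, customer_id).
customers = [(1, "Ann"), (2, "Bob"), (3, "Cid")]
orders = [(100, 1), (101, 1), (102, 3)]

with_orders = exists(customers, orders, lambda c: c[0], lambda o: o[1])
without_orders = not_exists(customers, orders, lambda c: c[0], lambda o: o[1])
```

Note that each customer appears at most once in `with_orders` even when several orders match, which is what distinguishes EXISTS from a regular inner join.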

Re: how can handles Any , All query on flink

2015-07-10 Thread hagersaleh
please help -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/how-can-handles-Any-All-query-on-flink-tp1997p2005.html

Re: DelimitedInputFormat reads entire buffer when splitLength is 0

2015-07-10 Thread Stephan Ewen
Hi Robert! This clearly sounds like unintended behavior. Thanks for reporting this. Apparently, the 0 line length was supposed to have a double meaning, but it goes haywire in this case. Let me try to come up with a fix for this... Greetings, Stephan On Fri, Jul 10, 2015 at 6:05 PM, Robert Schmi

DelimitedInputFormat reads entire buffer when splitLength is 0

2015-07-10 Thread Robert Schmidtke
Hey everyone, I just noticed that when processing input splits from a DelimitedInputFormat (specifically, I have a text file with words in it), if the splitLength is 0, the entire readbuffer is filled (see https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/a

Multiple ElasticSearch sinks

2015-07-10 Thread Flavio Pompermaier
Hi all, I have a Flink job that produces JSON objects that I'd like to index in different Elasticsearch indices depending on the "type" attribute of each JSON object (e.g. "people", "places", etc.). Has anyone attempted something like that in Flink before? I was thinking of using the EsHado
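
The routing idea in this question can be sketched independently of any particular sink: pick the target index from each object's "type" attribute and collect documents per index. The snippet below is plain Python, not the Flink or Elasticsearch API; `index_for`, `route_by_type`, the `default_index` fallback for objects without a "type", and the sample records are all assumptions made for illustration.

```python
import json
from collections import defaultdict


def index_for(doc, default_index="unknown"):
    """Choose the target index name from the document's "type" attribute."""
    return doc.get("type", default_index)


def route_by_type(json_lines, default_index="unknown"):
    """Parse JSON strings and group the documents by their target index."""
    batches = defaultdict(list)
    for line in json_lines:
        doc = json.loads(line)
        batches[index_for(doc, default_index)].append(doc)
    return dict(batches)


# Hypothetical sample records, one JSON object per string.
records = [
    '{"type": "people", "name": "Ada"}',
    '{"type": "places", "name": "Rome"}',
    '{"type": "people", "name": "Alan"}',
    '{"name": "no type here"}',
]

batches = route_by_type(records)
```

Each entry of `batches` could then be handed to whatever writer targets that index; whether this is done in one sink or one sink per type is a separate design choice.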

Re: TeraSort on Flink and Spark

2015-07-10 Thread Stephan Ewen
Hi Dongwon Kim! Thank you for trying out these changes. The OptimizedText can be sorted more efficiently, because it generates a binary key prefix. That way, the sorting needs to serialize/deserialize less and saves on CPU. In parts of the program, the CPU is then less of a bottleneck and the di
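
To illustrate why a binary key prefix can help, here is a minimal sketch in plain Python; it is not Flink's sorter and not the OptimizedText implementation. Each record carries a fixed-length byte prefix of its serialized key, the comparison looks at the prefixes first, and the full key is decoded only when two prefixes tie. `PREFIX_LEN`, `make_entry`, `full_key`, and `sort_with_prefix` are hypothetical names, and UTF-8 decoding stands in for deserialization.

```python
from functools import cmp_to_key

PREFIX_LEN = 4  # assumed prefix width, chosen only for this example


def make_entry(serialized_key: bytes, payload):
    """Attach a fixed-length, zero-padded binary prefix to a serialized key."""
    prefix = serialized_key[:PREFIX_LEN].ljust(PREFIX_LEN, b"\x00")
    return (prefix, serialized_key, payload)


def full_key(entry):
    """Stand-in for the costly step: deserialize the complete key."""
    return entry[1].decode("utf-8")


def compare(a, b):
    """Compare prefixes bytewise; deserialize only if the prefixes are equal."""
    if a[0] != b[0]:
        return -1 if a[0] < b[0] else 1
    ka, kb = full_key(a), full_key(b)
    return (ka > kb) - (ka < kb)


def sort_with_prefix(entries):
    """Sort entries using the prefix-first comparison."""
    return sorted(entries, key=cmp_to_key(compare))


# Hypothetical sample keys; "alpha" and "alphabet" share the prefix b"alph".
words = ["delta", "alpha", "alpine", "beta", "alphabet"]
entries = [make_entry(w.encode("utf-8"), w) for w in words]
sorted_words = [payload for _, _, payload in sort_with_prefix(entries)]
```

In this sample only the pair sharing a prefix needs the full-key comparison; every other comparison is decided on raw bytes.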

Re: TeraSort on Flink and Spark

2015-07-10 Thread Fabian Hueske
Hi Dongwon Kim, this blog post describes Flink's memory management, serialization, and sort algorithm, and also includes performance numbers from some microbenchmarks. --> http://flink.apache.org/news/2015/05/11/Juggling-with-Bits-and-Bytes.html The difference between Text and OptimizedText is tha