Re: Apache Flink transactions

2015-06-09 Thread Hawin Jiang
Hey Aljoscha, I also sent an email to Bill asking for the latest test results. From Bill's email, Apache Spark's performance looks better than Flink's. What are your thoughts? Best regards, Hawin On Tue, Jun 9, 2015 at 2:29 AM, Aljoscha Krettek wrote: > Hi, > we don't have any current perf

Re: Apache Flink transactions

2015-06-09 Thread Hawin Jiang
On Tue, Jun 9, 2015 at 2:29 AM, Aljoscha Krettek wrote: > Hi, > we don't have any current performance numbers. But the queries mentioned > on the benchmark page should be easy to implement in Flink. It could be > interesting if someone ported these queries and ran them with exactly the > same dat

Re: Load balancing

2015-06-09 Thread Fabian Hueske
Hi Sebastian, I agree, shuffling only specific elements would be a very useful feature, but unfortunately it's not supported (yet). Would you like to open a JIRA for that? Cheers, Fabian 2015-06-09 17:22 GMT+02:00 Kruse, Sebastian : > Hi folks, > > > > I would like to do some load balancing wi

Re: when return value from linkedlist or map and use in filter function display error

2015-06-09 Thread Robert Metzger
Great! I'm happy to hear that it worked. On Tue, Jun 9, 2015 at 5:28 PM, hagersaleh wrote: > I could solve the problem by declaring final Map map = new > HashMap(); > > Thanks very much. > The code now runs on the command line without any error. > public static void main(String[] args) throws Exception { > final Map map

Re: when return value from linkedlist or map and use in filter function display error

2015-06-09 Thread hagersaleh
I could solve the problem by declaring final Map map = new HashMap(); Thanks very much. The code now runs on the command line without any error: public static void main(String[] args) throws Exception { final Map map = new HashMap(); map.put("C_MKTSEGMENT", 2); ExecutionEnvironment env = ExecutionEnv
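
A minimal, self-contained sketch of what the fixed program plausibly looks like (the 0.9-era DataSet API is assumed, and the input data and schema are hypothetical stand-ins for the TPC-H customer table): declaring the map as a final local variable lets the anonymous FilterFunction capture it in its closure, which Flink serializes and ships to the workers.

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;

    public class FilterWithMap {

        public static void main(String[] args) throws Exception {
            // A final local variable can be captured by the anonymous
            // filter function below and is serialized along with it.
            final Map<String, Integer> map = new HashMap<String, Integer>();
            map.put("C_MKTSEGMENT", 2);

            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Hypothetical (segment, customer key) records.
            DataSet<Tuple2<String, Integer>> customers = env.fromElements(
                    new Tuple2<String, Integer>("C_MKTSEGMENT", 1),
                    new Tuple2<String, Integer>("OTHER_SEGMENT", 7));

            DataSet<Tuple2<String, Integer>> filtered = customers
                    .filter(new FilterFunction<Tuple2<String, Integer>>() {
                        @Override
                        public boolean filter(Tuple2<String, Integer> t) {
                            // Reading the captured map works on every worker.
                            return map.containsKey(t.f0);
                        }
                    });

            filtered.print();
        }
    }

By contrast, an instance field cannot even be referenced from the static main() method, which is the compile error reported at the start of this thread (see below).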

Load balancing

2015-06-09 Thread Kruse, Sebastian
Hi folks, I would like to do some load balancing within one of my Flink jobs to achieve good scalability. The rebalance() method is not applicable in my case, as the runtime is dominated by the processing of a very few larger elements in my dataset. Hence, I need to distribute the processing work
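
Until selective shuffling exists, one possible workaround is a custom partitioner that treats heavy and light elements differently. The sketch below assumes the DataSet API's partitionCustom(Partitioner, KeySelector) hook and uses element length as a hypothetical stand-in for processing cost; note that it still ships every element over the network, it merely keeps the few expensive elements from piling up on one subtask.

    import java.util.Random;

    import org.apache.flink.api.common.functions.Partitioner;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.functions.KeySelector;

    public class SkewAwarePartitioning {

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Hypothetical input where length stands in for processing
            // cost; the one long element dominates the runtime.
            DataSet<String> elements = env.fromElements(
                    "a", "b", "c", "a-very-long-and-expensive-element");

            DataSet<String> balanced = elements.partitionCustom(
                    new Partitioner<String>() {
                        private final Random rnd = new Random();

                        @Override
                        public int partition(String key, int numPartitions) {
                            if (key.length() > 10) {
                                // Spread heavy elements over all subtasks.
                                return rnd.nextInt(numPartitions);
                            }
                            // Keep the cheap majority hash-partitioned.
                            return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
                        }
                    },
                    new KeySelector<String, String>() {
                        @Override
                        public String getKey(String value) {
                            return value;
                        }
                    });

            balanced.print();
        }
    }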

Re: Reading from HBase problem

2015-06-09 Thread Hilmi Yildirim
Hi Ufuk, I used the TableInputFormat from flink-addons. Best Regards, Hilmi On 09.06.2015 at 13:17, Ufuk Celebi wrote: Hey Hilmi, thanks for reporting the issue. Sorry for the inconvenience this has caused. I'm not familiar with HBase in combination with Flink. From what I've seen, there a
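
For reference, a sketch of how the flink-addons TableInputFormat is typically used (the abstract method names reflect the 0.9-era flink-hbase module as best I recall; the table name and the emitted one-field schema are assumptions):

    import org.apache.flink.addons.hbase.TableInputFormat;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple1;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseRowKeys {

        // Emits one Tuple1<String> (the row key) per HBase row.
        public static class RowKeyInputFormat extends TableInputFormat<Tuple1<String>> {
            @Override
            protected Scan getScanner() {
                return new Scan(); // full table scan
            }

            @Override
            protected String getTableName() {
                return "myTable"; // hypothetical table name
            }

            @Override
            protected Tuple1<String> mapResultToTuple(Result r) {
                return new Tuple1<String>(Bytes.toString(r.getRow()));
            }
        }

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            DataSet<Tuple1<String>> rows = env.createInput(new RowKeyInputFormat());
            // Should match the table's true row count.
            System.out.println("row count: " + rows.count());
        }
    }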

Re: Reading from HBase problem

2015-06-09 Thread Ufuk Celebi
Hey Hilmi, thanks for reporting the issue. Sorry for the inconvenience this has caused. I'm not familiar with HBase in combination with Flink. From what I've seen, there are two options: either use Flink's TableInputFormat from flink-addons or the Hadoop TableInputFormat, right? Which one are y
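
And a sketch of the second option Ufuk mentions, wrapping the Hadoop mapreduce TableInputFormat in Flink's HadoopInputFormat (class names from flink-java's Hadoop compatibility layer and hbase-server; the table name is hypothetical, and depending on the HBase version the job configuration may additionally need the HBase configuration merged in):

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
    import org.apache.hadoop.mapreduce.Job;

    public class HadoopHBaseRead {

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            Job job = Job.getInstance();
            job.getConfiguration().set(TableInputFormat.INPUT_TABLE, "myTable");
            // May be needed so the format can reach the cluster:
            // HBaseConfiguration.merge(job.getConfiguration(),
            //         HBaseConfiguration.create());

            DataSet<Tuple2<ImmutableBytesWritable, Result>> rows = env.createInput(
                    new HadoopInputFormat<ImmutableBytesWritable, Result>(
                            new TableInputFormat(), ImmutableBytesWritable.class,
                            Result.class, job));

            System.out.println("rows: " + rows.count());
        }
    }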

Re: Reading from HBase problem

2015-06-09 Thread fhueske
Thank you very much! From: Hilmi Yildirim Sent: Tuesday, 9 June 2015, 11:40 To: user@flink.apache.org Done https://issues.apache.org/jira/browse/FLINK-2188 On 09.06.2015 at 11:26, Fabian Hueske wrote: Would you mind opening a JIRA for this issue? -> https://issues.apa

Re: Reading from HBase problem

2015-06-09 Thread Hilmi Yildirim
Done https://issues.apache.org/jira/browse/FLINK-2188 On 09.06.2015 at 11:26, Fabian Hueske wrote: Would you mind opening a JIRA for this issue? -> https://issues.apache.org/jira/browse/FLINK I can do it as well, but you know all the details. Thanks, Fabian 2015-06-09 11:03 GMT+02:00 Hilmi

Re: Apache Flink transactions

2015-06-09 Thread Aljoscha Krettek
Hi, we don't have any current performance numbers. But the queries mentioned on the benchmark page should be easy to implement in Flink. It could be interesting if someone ported these queries and ran them with exactly the same data on the same machines. Bill Sparks wrote on the mailing list some

Re: Reading from HBase problem

2015-06-09 Thread Fabian Hueske
Would you mind opening a JIRA for this issue? -> https://issues.apache.org/jira/browse/FLINK I can do it as well, but you know all the details. Thanks, Fabian 2015-06-09 11:03 GMT+02:00 Hilmi Yildirim : > I want to add that I run the Flink job on a cluster with 13 machines and > each machine

Re: Reading from HBase problem

2015-06-09 Thread Hilmi Yildirim
I want to add that I ran the Flink job on a cluster with 13 machines, each with 13 processing slots, which results in a total of 169 processing slots. On 09.06.2015 at 10:59, Hilmi Yildirim wrote: Correct. I also counted the rows with Spark and Hive. Both returned the same

Re: Reading from HBase problem

2015-06-09 Thread Hilmi Yildirim
Correct. I also counted the rows with Spark and Hive. Both returned the same value, which is nearly 100 million rows. But Flink returns 102 million rows. Best Regards, Hilmi On 09.06.2015 at 10:47, Fabian Hueske wrote: OK, so the problem seems to be with the HBase InputFormat. I guess this issue

Re: Reading from HBase problem

2015-06-09 Thread Fabian Hueske
OK, so the problem seems to be with the HBase InputFormat. I guess this issue needs a bit of debugging. We need to check if records are emitted twice (or more often) and, if that is the case, which records. Unfortunately, this issue only seems to occur with large tables :-( Did I get that right, th
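
A sketch of the duplicate check Fabian suggests (the row-key input is a hypothetical stand-in for the keys read from HBase): count occurrences per row key and keep exactly those keys that were emitted more than once.

    import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;

    public class FindDuplicates {

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Stand-in for the row keys read from HBase; "row2" is duplicated.
            DataSet<String> rowKeys = env.fromElements("row1", "row2", "row2", "row3");

            DataSet<Tuple2<String, Integer>> duplicates = rowKeys
                    .map(new MapFunction<String, Tuple2<String, Integer>>() {
                        @Override
                        public Tuple2<String, Integer> map(String key) {
                            return new Tuple2<String, Integer>(key, 1);
                        }
                    })
                    .groupBy(0)   // group by row key
                    .sum(1)       // occurrences per key
                    .filter(new FilterFunction<Tuple2<String, Integer>>() {
                        @Override
                        public boolean filter(Tuple2<String, Integer> t) {
                            return t.f1 > 1; // keep keys emitted more than once
                        }
                    });

            duplicates.print(); // exactly which records were read twice
        }
    }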

Re: Reading from HBase problem

2015-06-09 Thread Hilmi Yildirim
Hi, I have now tested the "count" method. It returns the same result as the flatmap.groupBy(0).sum(1) method. Furthermore, the HBase table contains nearly 100 million rows, but the result is 102 million. This means that the HBase input format reads more rows than the table contains. Best Regards, Hilmi On 08.06.201
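
For reference, a sketch of the two counting variants being compared, assuming DataSet#count() and #collect() are available in the version used; the input is a small stand-in for the HBase rows, and the second variant approximates the flatmap.groupBy(0).sum(1) pattern from the thread with a map.

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;

    public class CountVariants {

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            DataSet<String> rows = env.fromElements("row1", "row2", "row3");

            // Variant 1: the built-in count.
            long c1 = rows.count();

            // Variant 2: map every row to ("rows", 1) and sum the ones.
            long c2 = rows
                    .map(new MapFunction<String, Tuple2<String, Long>>() {
                        @Override
                        public Tuple2<String, Long> map(String r) {
                            return new Tuple2<String, Long>("rows", 1L);
                        }
                    })
                    .groupBy(0).sum(1)
                    .collect().get(0).f1;

            // Both should agree; if either exceeds the true table size,
            // the input format is emitting rows more than once.
            System.out.println(c1 + " vs " + c2);
        }
    }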

Re: when return value from linkedlist or map and use in filter function display error

2015-06-09 Thread hagersaleh
The error displayed is: non-static variable map cannot be referenced from a static context, at map.put("C_MKTSEGMENT", 2); Code: public Map map = new HashMap(); public static void main(String[] args) throws Exception { map.put("C_MKTSEGMENT", 2); ExecutionEnvironment env = Exe
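
A minimal illustration of why javac rejects this snippet (only the field and key are taken from the snippet; the rest is assumed): main() is static, so it has no "this" from which to read the instance field, while a final local variable, the fix adopted later in the thread, compiles fine.

    import java.util.HashMap;
    import java.util.Map;

    public class StaticContext {

        public Map<String, Integer> map = new HashMap<String, Integer>(); // instance field

        public static void main(String[] args) {
            // map.put("C_MKTSEGMENT", 2);
            // ^ compile error: non-static variable map cannot be
            //   referenced from a static context -- main() has no
            //   enclosing instance to read the field from.

            // The fix used in this thread: declare the map as a final
            // local variable inside main() instead.
            final Map<String, Integer> map = new HashMap<String, Integer>();
            map.put("C_MKTSEGMENT", 2);
        }
    }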