Re: Best way to compute the difference between 2 datasets

2019-07-21 Thread Ken Krugler
Hi Juan, If you want to deduplicate, then you could group by the record, and use a (very simple) reduce function to only emit a record if the group contains one element. There will be performance issues, though - Flink will have to generate all groups first, which typically means spilling to di
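
The grouping idea in this reply can be illustrated outside Flink. The following is a minimal plain-Python sketch, not Flink code and not necessarily what the author had in mind: the function name `dataset_difference` and the "left"/"right" source tags are assumptions made for illustration. Each record is grouped with the set of inputs it came from, and a record is emitted only when its group contains nothing from the second dataset.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set


def dataset_difference(left: Iterable[Hashable], right: Iterable[Hashable]) -> List[Hashable]:
    """Return the distinct records of `left` that do not appear in `right`.

    Hypothetical helper: mimics a group-by-record followed by a simple
    reduce that inspects each group, using an in-memory dict.
    """
    # Group by the record itself; the value is the set of sources seen.
    groups: Dict[Hashable, Set[str]] = defaultdict(set)
    for record in left:
        groups[record].add("left")
    for record in right:
        groups[record].add("right")

    # Emit a record only if its group holds elements from `left` alone.
    return [record for record, sources in groups.items() if sources == {"left"}]


if __name__ == "__main__":
    print(dataset_difference([1, 2, 2, 3], [2, 4]))
```

As in the reply, every group has to be built before anything can be emitted, so the whole input is materialized first; in this sketch that happens in memory.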

Best way to compute the difference between 2 datasets

2019-07-21 Thread Juan Rodríguez Hortalá
Hi, I've been trying to write a function to compute the difference between 2 datasets. By that I mean computing a dataset that has all the elements of one dataset that are not present in another dataset. I first tried using coGroup, but it was very slow in a local execution environment, and ofte

Re: Unable to build Flink1.10 from source

2019-07-21 Thread Haibo Sun
Hi Yebgenya, The reason for this problem can be found in this email (http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/NOTICE-SSL-issue-when-building-flink-mapr-fs-td30757.html). The solution is to add the parameter "-Punsafe-mapr-repo" to the maven command, as given in the e-m

Re: StackOverflow Error

2019-07-21 Thread Vinay Patil
Hi Ravi, The uber jar was correct; setting ClosureCleanerLevel to TOP_LEVEL resolved this issue. Thanks a lot. Is there any disadvantage to setting this explicitly? Regards, Vinay Patil On Sat, Jul 20, 2019 at 10:23 PM Ravi Bhushan Ratnakar < ravibhushanratna...@gmail.com> wrote: > Hi Vinay,

Unable to build Flink1.10 from source

2019-07-21 Thread Yebgenya Lazarkhosrouabadi
Hello, I'm trying to build Flink 1.10 from source, but it fails with this error: [ERROR] Failed to execute goal on project flink-mapr-fs: Could not resolve dependencies for project org.apache.flink:flink-mapr-fs:jar:1.10-SNAPSHOT: Failed to collect dependencies at com.mapr.hadoop:maprfs:jar:5.2

[ANNOUNCE] Weekly Community Update 2019/28

2019-07-21 Thread Konstantin Knauf
Dear community, happy to share this week's community update with news about a Flink Ecosystem website, new Jira permissions, Apache Flink 1.9.0 and a few more topics. As always, please feel free to add additional updates and news to this thread! Flink Development === [ecosystem] Dar