Re: How to handle Flink Job with 400MB+ Uberjar with 800+ containers ?

2019-08-30 Thread Jörn Franke
Increase the replication factor and/or use the HDFS cache: https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html Try to reduce the size of the jar; e.g., the Flink libraries do not need to be included. > On 30.08.2019 at 01:09, Elkhan Dadashov wrote: > > De
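The "Flink libraries do not need to be included" advice is usually implemented by marking them `provided` in the build. A minimal sketch, assuming a Maven build (artifact name and version are illustrative for the Flink 1.x era; adjust to your setup):

```xml
<!-- Hypothetical pom.xml fragment: "provided" scope keeps Flink's own
     libraries out of the shaded uberjar, since the cluster already
     ships them on the classpath. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.11</artifactId>
  <version>1.8.1</version>
  <scope>provided</scope>
</dependency>
```

With the runtime dependencies excluded, the uberjar typically shrinks to just the job code plus connectors, which helps when distributing it to hundreds of containers.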

Re: [DISCUSS] Contributing Chinese website and docs to Apache Flink

2019-01-29 Thread Jörn Franke
Also keep in mind the other direction: e.g., a new or modified version of the Chinese documentation needs to be reflected in the English one. > On 29.01.2019 at 06:35, SteNicholas wrote: > > Hi Jark, > > Thank you for starting this discussion. I am very willing to participate in > flink document tra

Re: [DISCUSS] Integrate Flink SQL well with Hive ecosystem

2018-10-12 Thread Jörn Franke
Thank you, very nice; I fully agree with that. > On 11.10.2018 at 19:31, Zhang, Xuefu wrote: > > Hi Jörn, > > Thanks for your feedback. Yes, I think Hive on Flink makes sense and in fact > it is one of the two approaches that I named in the beginning of the thread. > As also pointed out the

Re: [DISCUSS] Integrate Flink SQL well with Hive ecosystem

2018-10-10 Thread Jörn Franke
Would it maybe make sense to provide Flink as an engine on Hive ("flink-on-Hive")? E.g., to address 4, 5, 6, 8, 9, 10. This could be more loosely coupled than integrating Hive into all possible Flink core modules and thus introducing a very tight dependency on Hive in the core. 1, 2, 3 could be achieved via

Re: Reading a single input file in parallel?

2018-02-18 Thread Jörn Franke
AFAIK Flink has a similar notion of splittable as Hadoop. Furthermore, for custom FileInputFormats you can set the attribute unsplittable = true if your file format cannot be split. > On 18. Feb 2018, at 13:28, Niels Basjes wrote: > > Hi, > > In Hadoop MapReduce there is the notion of "splitta
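A sketch of what that looks like against Flink's (DataSet-era) `FileInputFormat` API. Only the `unsplittable` flag comes from the message above; the class name and read logic are illustrative, and the code assumes the Flink dependency is on the classpath:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.flink.api.common.io.FileInputFormat;

// Hypothetical custom format that refuses to be split: each input file is
// read start-to-end by a single parallel task.
public class WholeFileFormat extends FileInputFormat<byte[]> {

    private boolean end;

    public WholeFileFormat() {
        // Inherited protected flag on FileInputFormat: when true, Flink
        // creates one split per file instead of block-sized splits.
        this.unsplittable = true;
    }

    @Override
    public boolean reachedEnd() {
        return end;
    }

    @Override
    public byte[] nextRecord(byte[] reuse) throws IOException {
        // Read the whole file from the inherited input stream.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = stream.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        end = true;
        return buf.toByteArray();
    }
}
```

Check the field name against the Flink version you run; the effect is the same as Hadoop's `isSplitable()` returning false.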

Re: Submitting jobs via maven coordinates?

2017-11-14 Thread Jörn Franke
I have seen no official script, but what you describe should be easily done with a build tool such as Gradle. The advantage would be that a build tool is already tested and you do not have to maintain scripts for downloading etc. > On 14. Nov 2017, at 07:46, Ron Crocker wrote: > > Internally we
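As a sketch of the Gradle approach: a build file can resolve a job jar purely by its Maven coordinates and stage it locally for submission. The coordinates, configuration name, and output directory below are all hypothetical:

```groovy
// Hypothetical build.gradle: resolve a job jar by Maven coordinates and
// copy it to a local directory, instead of maintaining download scripts.
repositories {
    mavenCentral()
}

configurations {
    jobJar
}

dependencies {
    // Illustrative coordinates of the job artifact to submit.
    jobJar 'com.example:my-flink-job:1.0.0'
}

task fetchJobJar(type: Copy) {
    from configurations.jobJar
    into "$buildDir/job-jars"
}
```

Running `gradle fetchJobJar` then leaves the resolved jar under `build/job-jars/`, where a submission wrapper can pick it up; caching, checksums, and retries come from Gradle for free.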

Re: Switch to Scala 2.11 as a default build profile

2017-06-29 Thread Jörn Franke
EMR has a Flink package. Just go to advanced options and put a checkbox on Flink. No need to build it yourself. > On 29. Jun 2017, at 05:56, Bowen Li wrote: > > +1. > > AWS EMR eco system is using Scala 2.11, and breaks with Scala 2.10. We had > to build several Flink components (e.g. flink-kines