Re: Write bulks files from streaming app

2018-07-22 Thread Jozef Vilcek
like parquet), one needs an additional step, to > encode/compress when the specific destination file is done (if you think in > Hadoop terms, that would be in the "commit" step). > > > On Sun, Jul 22, 2018 at 2:10 PM, Jozef Vilcek > wrote: > >> I looked into Wait.o

Flink cluster crashing going from 1.4.0 -> 1.5.3

2018-08-23 Thread Jozef Vilcek
Hello, I am trying to get my Beam application (run on newer version of Flink (1.5.3) but having trouble with that. When I submit application, everything works fine but after a few mins (as soon as 2 minutes after job start) cluster just goes bad. Logs are full of timeouts for heartbeats, JobManage

Re: Flink cluster crashing going from 1.4.0 -> 1.5.3

2018-08-23 Thread Jozef Vilcek
Yes, on smaller data and therefore smaller resources and parallelism exactly same job runs fine On Thu, Aug 23, 2018, 16:11 Aljoscha Krettek wrote: > Hi, > > So with Flink 1.5.3 but a smaller parallelism the job works fine? > > Best, > Aljoscha > > > On 23. Aug 2

Re: Flink cluster crashing going from 1.4.0 -> 1.5.3

2018-08-23 Thread Jozef Vilcek
allelism are you using? > > Piotrek > > > On 23 Aug 2018, at 16:21, Jozef Vilcek wrote: > > > > Yes, on smaller data and therefore smaller resources and parallelism > > exactly same job runs fine > > > > On Thu, Aug 23, 2018, 16:11 Aljoscha Krettek

Re: Flink cluster crashing going from 1.4.0 -> 1.5.3

2018-08-24 Thread Jozef Vilcek
gnificantly increases > the number of accesses to the ConcurrentHashMaps that store metrics fort > he UI. It could be that this code is just too slow for the amount of > metrics. > > On 23.08.2018 19:06, Jozef Vilcek wrote: > > parallelism is 100. I tried clusters with 1 and 2 slots per TM

Re: Flink cluster crashing going from 1.4.0 -> 1.5.3

2018-08-24 Thread Jozef Vilcek
> That is, if you have 2 sources, with 2 other operators and a parallelism > of 10 you can end up with 400 latency metrics. > 10 (subtasks per source) * 10 (subtasks per operator) * 2 (# operators) * > 2 (#-sources) > > On 24.08.2018 11:28, Jozef Vilcek wrote: > > Fo

Unbalanced processing of KeyedStream

2019-01-02 Thread Jozef Vilcek
Hello, I am facing a problem where KeyedStream is purely parallelised on workers for case where number of keys is close to parallelism. Some workers process zero keys, some more than one. This is because of `KeyGroupRangeAssignment.assignKeyToParallelOperator()` in `KeyGroupStreamPartitioner` as

Re: Unbalanced processing of KeyedStream

2019-01-02 Thread Jozef Vilcek
tils.java#L153> > should generate key values that will be partitioned properly. > > — Ken > > > On Jan 2, 2019, at 12:16 AM, Jozef Vilcek wrote: > > > > Hello, > > > > I am facing a problem where KeyedStream is purely parallelised on workers >

Re: Unbalanced processing of KeyedStream

2019-01-03 Thread Jozef Vilcek
.operatorIndex) == this.numOperators) { … } > > — Ken > > > On Jan 2, 2019, at 11:32 AM, Jozef Vilcek wrote: > > > > Thanks Ken. Yes, similar approach is suggested in post I shared in my > > question. But to me it feels a bit hack-ish. > > I would like to kno

Re: Unbalanced processing of KeyedStream

2019-01-03 Thread Jozef Vilcek
gt; > int hash = calcPositiveHashCode(key); > > if ((hash % this.operatorIndex) == this.numOperators) { … } > > > > — Ken > > > >> On Jan 2, 2019, at 11:32 AM, Jozef Vilcek > wrote: > >> > >> Thanks Ken. Yes, similar approach is suggeste

[jira] [Created] (FLINK-15773) ClosureCleaner fails with joda DateTimeZone

2020-01-27 Thread Jozef Vilcek (Jira)
Jozef Vilcek created FLINK-15773: Summary: ClosureCleaner fails with joda DateTimeZone Key: FLINK-15773 URL: https://issues.apache.org/jira/browse/FLINK-15773 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-9656) Environment java opts for flink run

2018-06-25 Thread Jozef Vilcek (JIRA)
Jozef Vilcek created FLINK-9656: --- Summary: Environment java opts for flink run Key: FLINK-9656 URL: https://issues.apache.org/jira/browse/FLINK-9656 Project: Flink Issue Type: Improvement

[jira] [Created] (FLINK-10226) Latency metrics can choke job-manager

2018-08-27 Thread Jozef Vilcek (JIRA)
Jozef Vilcek created FLINK-10226: Summary: Latency metrics can choke job-manager Key: FLINK-10226 URL: https://issues.apache.org/jira/browse/FLINK-10226 Project: Flink Issue Type