Hi all,
I'm currently using a keyed process function, and I see serialization
happening when I collect the object / update the object to RocksDB. For
me, serialization seems to be the performance bottleneck.
By default the POJO serializer is used, and the time cost of collect /
update to
Dear community,
this is the weekly community update thread #33. Please post any news and
updates you want to share with the community to this thread.
# Flink 1.6.0 has been released
The community released Flink 1.6.0 [1]. Thanks to everyone who made this
release possible.
# Improving tutorials
Thank you Vino & Xingcan.
@Vino: could you help explain in more detail how the DBMS would be used?
Would that be via the Table API, or did you mean reading the DBMS data
directly inside the ProcessFunction?
@Xingcan: I don't know what the benefits of using CoProcess over
RichCoFlatMap are in this case.
Regarding usin
Hi Averell,
What I mean is that if you store the stream_c data in an RDBMS, you can
access the RDBMS directly in the CoFlatMapFunction instead of using the
Table API. This is somewhat similar to a join between a stream and a
dimension table (a sketch of the lookup pattern follows below).
Of course, the premise of adopting this option is that the amount of data
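For illustration, a minimal sketch of that lookup pattern, using a plain
RichFlatMapFunction for brevity (the same open()/query approach applies
inside a CoFlatMapFunction); the JDBC URL, table, and column names are
all hypothetical:

import java.sql.{Connection, DriverManager, PreparedStatement}
import org.apache.flink.api.common.functions.RichFlatMapFunction
import org.apache.flink.configuration.Configuration
import org.apache.flink.util.Collector

// Enriches each (id, value) element with a column looked up from the
// RDBMS that holds the stream_c data. One connection per parallel instance.
class RdbmsLookup extends RichFlatMapFunction[(String, String), (String, String, String)] {

  @transient private var conn: Connection = _
  @transient private var stmt: PreparedStatement = _

  override def open(parameters: Configuration): Unit = {
    conn = DriverManager.getConnection("jdbc:postgresql://host:5432/db", "user", "pass")
    stmt = conn.prepareStatement("SELECT c_value FROM stream_c WHERE id = ?")
  }

  override def flatMap(in: (String, String), out: Collector[(String, String, String)]): Unit = {
    stmt.setString(1, in._1)
    val rs = stmt.executeQuery()
    if (rs.next()) {
      out.collect((in._1, in._2, rs.getString(1)))
    }
    rs.close()
  }

  override def close(): Unit = {
    if (stmt != null) stmt.close()
    if (conn != null) conn.close()
  }
}

A per-record query is the simplest form; in practice you would likely add
a local cache or use async I/O to keep the lookup off the hot path.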
Thanks Gary..
What could be blocking the RPC threads? Slow checkpointing?
In production we're still using a self-built Flink package 1.5-SNAPSHOT,
flink commit 8395508b0401353ed07375e22882e7581d46ac0e, and the jobs are
stable.
Now with 1.5.2 the same jobs are failing due to heartbeat timeouts ev
Vishal, from which version did you upgrade to 1.5.1? Maybe from the 1.5.0
release? Knowing that might help narrow down the source of this.
On Wed, Aug 15, 2018 at 11:38 AM Juho Autio wrote:
> Thanks Gary..
>
> What could be blocking the RPC threads? Slow checkpointing?
>
> In production we're s
Gary, I found another mail thread about similar issue:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Testing-on-Flink-1-5-tp19565p19647.html
Specifically I found this:
> we are observing Akka.ask.timeout error for few of our jobs (JM's
logs[2]), we tried to increase this pa
Hi John,
Watermarks cannot make progress if you have stream partitions that do not
carry any data.
What kind of source are you using?
Best,
Fabian
2018-08-15 4:25 GMT+02:00 vino yang :
> Hi Johe,
>
> In local mode, it should also work.
> When you debug, you can set a breakpoint in the getCurren
Hi Darshan,
This looks like a file system configuration issue to me.
Flink supports different file systems for S3 and there are also a few
tuning knobs.
Did you have a look at the docs for file system configuration [1]?
Best, Fabian
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.
I did some more testing.
Below is a pseudo version of my setup.
kafkaConsumer ->
assignTimestampsAndWatermarks(BoundedOutOfOrdernessTimestampExtractor) ->
process(print1: ctx.timerService().currentWatermark()) ->
keyBy(_.someProp) ->
process(print2: ctx.timerService().currentWatermark()) ->
I am man
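For reference, a self-contained Scala sketch of what that pseudo pipeline
could look like (Flink 1.5/1.6-era APIs; the Kafka 0.11 connector, topic
name, and "someProp,timestamp" event format are assumptions):

import java.util.Properties

import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.functions.ProcessFunction
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011
import org.apache.flink.util.Collector

// Hypothetical event parsed from "someProp,timestampMillis" lines.
case class Event(someProp: String, ts: Long)

// Prints the current watermark at a given stage, then forwards the element.
class PrintWatermark(tag: String) extends ProcessFunction[Event, Event] {
  override def processElement(e: Event, ctx: ProcessFunction[Event, Event]#Context,
                              out: Collector[Event]): Unit = {
    println(s"$tag watermark: ${ctx.timerService().currentWatermark()}")
    out.collect(e)
  }
}

object WatermarkDebug {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    val props = new Properties()
    props.setProperty("bootstrap.servers", "localhost:9092")
    props.setProperty("group.id", "watermark-debug")

    env.addSource(new FlinkKafkaConsumer011[String]("topic", new SimpleStringSchema(), props))
      .map { line => val f = line.split(","); Event(f(0), f(1).toLong) }
      .assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor[Event](Time.seconds(10)) {
          override def extractTimestamp(e: Event): Long = e.ts
        })
      .process(new PrintWatermark("print1"))
      .keyBy(_.someProp)
      .process(new PrintWatermark("print2"))
      .print()

    env.execute("watermark-debug")
  }
}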
Hey,
The problem is that your command does start the Job Manager container, but it
does not start the Task Manager. That is why you have 0 slots. Currently, the
default numberOfTaskSlots is set to the number of CPUs available on the machine.
So, you generally can do 2 things:
1) Start Job Mana
Hi Juho,
the main thread of the RPC endpoint should never be blocked. Blocking on
that
thread is considered an implementation error. Unfortunately, without logs it
is difficult to tell what the exact problem is. If you are able to reproduce
heartbeat timeouts on your test staging environment, can
Hi Mingliang,
first of all, the POJO serializer is not very performant; Tuple or Row
are better. If you want to improve the performance of a collect()
between operators, you could also enable object reuse (see the sketch
below). You can read more about this here [1] (section "Issue 2: Object
Reuse"), but make sure
y
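For illustration, a minimal sketch of those two suggestions; the stream
itself is a made-up placeholder, but enableObjectReuse() is the real
execution-config switch:

import org.apache.flink.streaming.api.scala._

object ReuseAndTuples {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Object reuse lets Flink hand the same instance from one chained
    // operator to the next instead of copying it. Only safe if your
    // functions neither cache nor mutate input objects after emitting.
    env.getConfig.enableObjectReuse()

    // Tuples use a fixed, non-reflective serializer, which is typically
    // cheaper than the POJO serializer on hot paths.
    val stream: DataStream[(String, Long)] = env.fromElements(("a", 1L), ("b", 2L))
    stream.print()

    env.execute("reuse-example")
  }
}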
Hi John,
I guess the local source data are different from the cluster's. And as
Fabian said, it is probably that some partitions don't carry data.
As a quick check, you can set the job parallelism to 1 and compare the
result (see the sketch below).
Best, Hequn
On Wed, Aug 15, 2018 at 5:22 PM, John O wrote:
> I did some more t
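A minimal sketch of that check (assuming the rest of the pipeline stays
as before):

import org.apache.flink.streaming.api.scala._

object SingleParallelismCheck {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // With parallelism 1, a single source subtask reads every partition,
    // so an empty partition can no longer hold the watermark back on its own.
    env.setParallelism(1)
    // ... build the same pipeline as before and compare the watermarks.
  }
}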
Hello!
I'm wondering if there is anywhere I can see Flink's roadmap for Scala 2.12
support. The last email I can find on the list about this was back in
January, and FLINK-7811 [0], the ticket asking for Scala 2.12 support,
hasn't been updated in a few months.
Recently Spark fixed the ClosureCle
Hi Averell,
With the CoProcessFunction, you could get access to the time-related
services, which may be useful when maintaining the elements in Stream_C,
and you could get rid of the type casting that comes with the Either
class (a rough sketch follows below the quote).
Best,
Xingcan
> On Aug 15, 2018, at 3:27 PM, Averell wrote:
>
> Thank you Vino
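For illustration, a minimal sketch of such a CoProcessFunction (the
Animal/Rule/Output types are made-up stand-ins for the two connected
streams):

import org.apache.flink.streaming.api.functions.co.CoProcessFunction
import org.apache.flink.util.Collector

case class Animal(name: String)
case class Rule(name: String, factor: Int)
case class Output(name: String)

class Transformer extends CoProcessFunction[Animal, Rule, Output] {

  // Each input keeps its own type: no Either wrapper, no casting.
  override def processElement1(animal: Animal,
      ctx: CoProcessFunction[Animal, Rule, Output]#Context,
      out: Collector[Output]): Unit = {
    out.collect(Output(animal.name))
  }

  override def processElement2(rule: Rule,
      ctx: CoProcessFunction[Animal, Rule, Output]#Context,
      out: Collector[Output]): Unit = {
    // Time-related services are available here, e.g. to expire buffered
    // elements on a keyed stream:
    // ctx.timerService().registerEventTimeTimer(...)
  }
}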
Thanks Dominik, I will try that.
On Wed, Aug 15, 2018 at 3:10 AM, Dominik Wosiński wrote:
> Hey,
> The problem is that your command does start the Job Manager container, but
> it does not start the Task Manager. That is why you have 0 slots. Currently,
> the default *numberOfTaskSlots* is set to th
You can also, instead of defining 2 services (taskmanager and taskmanager1),
set the scale parameter on taskmanager to the number of desired slots.
Something like this:
taskmanager:
  image: "${FLINK_DOCKER_IMAGE:-flink:1.5.2}"
  scale: 2
  expose:
    - "6121"
    - "6122"
    - "8081"
On Wed, A
Hi there,
I am trying to run a single Flink job on YARN in detached mode. As per
the docs for Flink 1.4.2, I am using -yd to do that.
The problem I am having is that the flink bash script doesn't terminate
and return until I press Control + C. In detached mode, I would expect
the flink C
Hi all,
It looks to me like the OperatorSubtaskState returned from
OneInputStreamOperatorTestHarness.snapshot fails to include any timers that had
been registered via registerProcessingTimeTimer but had not yet fired when the
snapshot was saved.
Is this a known limitation of OneInputStreamOper
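For context, a minimal sketch that reproduces the described scenario
(assuming Flink 1.6 APIs and the flink-streaming-java test-jar on the
classpath; the function and input are made up):

import org.apache.flink.api.common.typeinfo.BasicTypeInfo
import org.apache.flink.api.java.functions.KeySelector
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.streaming.api.operators.KeyedProcessOperator
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord
import org.apache.flink.streaming.util.KeyedOneInputStreamOperatorTestHarness
import org.apache.flink.util.Collector

// Registers a processing-time timer for every element it sees.
class TimerFn extends KeyedProcessFunction[String, String, String] {
  override def processElement(value: String,
      ctx: KeyedProcessFunction[String, String, String]#Context,
      out: Collector[String]): Unit = {
    ctx.timerService().registerProcessingTimeTimer(
      ctx.timerService().currentProcessingTime() + 1000L)
  }
}

object TimerSnapshotRepro {
  def main(args: Array[String]): Unit = {
    val harness = new KeyedOneInputStreamOperatorTestHarness[String, String, String](
      new KeyedProcessOperator[String, String, String](new TimerFn),
      new KeySelector[String, String] { override def getKey(in: String): String = in },
      BasicTypeInfo.STRING_TYPE_INFO)

    harness.open()
    harness.processElement(new StreamRecord[String]("a", 0L))

    // The timer is registered but has not fired yet; the question is
    // whether it is contained in this snapshot.
    val snapshot = harness.snapshot(0L, 0L)
    println(snapshot)

    harness.close()
  }
}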
Hi Madhav,
> ./flink-1.4.2/bin/flink run -m yarn-cluster -yd -yn 2 -yqu "default"
> -ytm 2048 myjar.jar
Modified to: ./flink-1.4.2/bin/flink run -m yarn-cluster -d -yn 2 -yqu
"default" -ytm 2048 myjar.jar
On Thu, Aug 16, 2018 at 5:01 AM, madhav Kelkar wrote:
> Hi there,
>
>
I don't think your exception / code was attached.
In general, this largely depends on your setup. Are you trying to set up
a long-running YARN session cluster, or are you trying to submit a job
directly to the YARN cluster? [1].
We have an open-sourced project [2] with a similar requirement, submitti
Thank you Xingcan.
Regarding that Either, I still see the need to do type casting / case
class matching. Could you please take a look?
val transformed = dog
  .union(cat)
  .connect(transformer)
  .keyBy(r => r.name, r2 => r2.name)
  .process(new Transfo
Hi, Flink guys,
You really made a quick release, it's fantastic!
I've got a situation:
window 1 is time driven, slice is 1 min, trigger is 1 count
window 2 is count driven, slice is 3 counts, trigger is 1 count
1. Then an element leaves window 1 and goes right into window 2.
For example, if there is o
Hi Marvin777,
You are wrong: it uses the Flink on YARN single-job mode and should use
the "-yd" parameter.
Hi Madhav,
I seem to have found the problem; the source code that produces your log
is here [1].
It is based on a check in the method "isUsingInteractiveMode".
The source code for this method is here [2],