Re: PreAggregate operator with timeout trigger

2019-11-05 Thread Felipe Gutierrez
Thanks Piotr, the thing is that I am working on a plain DataStream and not on a keyed stream. So, I cannot use the TimerService concept here. I am triggering a local window. I ended up using java.util.Timer [1] and it seems to satisfy my requirements. [1] https://docs.oracle.com/javase/7/docs/api/java/util/T
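The approach Felipe describes can be sketched without Flink at all: buffer records locally and let a java.util.Timer fire the flush on a timeout, since no key-based TimerService is available. This is a minimal, hypothetical sketch (the class name `PreAggregator` and its methods are illustrative, not Flink API); in a real operator the flushed map would be emitted downstream.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Timer;
import java.util.TimerTask;

/** Hypothetical local pre-aggregator flushed periodically by a java.util.Timer. */
public class PreAggregator {
    private final Map<String, Long> buffer = new HashMap<>();
    private final List<Map<String, Long>> flushed = new ArrayList<>();
    private final Timer timer = new Timer(true); // daemon thread so it won't block shutdown

    public PreAggregator(long timeoutMillis) {
        // Fire the flush periodically, independent of how many keys have arrived.
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override public void run() { flush(); }
        }, timeoutMillis, timeoutMillis);
    }

    public synchronized void add(String key, long value) {
        buffer.merge(key, value, Long::sum); // local pre-aggregation per key
    }

    public synchronized void flush() {
        if (!buffer.isEmpty()) {
            flushed.add(new HashMap<>(buffer)); // in a real operator: emit downstream
            buffer.clear();
        }
    }

    public synchronized List<Map<String, Long>> emitted() {
        return new ArrayList<>(flushed);
    }
}
```

Note that the `synchronized` methods matter here: the Timer runs its task on its own thread, concurrently with the thread calling `add`.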

How can I get the backpressure signals inside my function or operator?

2019-11-05 Thread Felipe Gutierrez
Hi all, let's say that I have a "source -> map -> preAggregate -> keyBy -> reduce -> sink" job and the reducer is sending backpressure signals to the preAggregate, map and source operators. How do I get those signals inside my operator's implementation? I guess inside the function it is not possible.

Re: PreAggregate operator with timeout trigger

2019-11-05 Thread Gyula Fóra
You might have to introduce some dummy keys for a more robust solution that integrates with the fault-tolerance mechanism. Gyula
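Gyula's dummy-key suggestion boils down to assigning each record an artificial key so the stream becomes keyed (which in Flink unlocks keyed state and the TimerService) without the key carrying any meaning. A hedged, Flink-free sketch of the assignment itself, assuming a simple round-robin over a small artificial key space (`DummyKeyAssigner` is a hypothetical name; in Flink this logic would live inside a `KeySelector` passed to `keyBy`):

```java
/** Hypothetical dummy-key assigner: maps each record to one of N artificial keys. */
public class DummyKeyAssigner {
    private final int numKeys;
    private long counter = 0;

    public DummyKeyAssigner(int numKeys) {
        this.numKeys = numKeys;
    }

    /** Round-robin over the artificial key space, ignoring record content entirely. */
    public int nextKey() {
        return (int) (counter++ % numKeys);
    }
}
```

Choosing `numKeys` at least as large as the downstream parallelism spreads the artificial keys over all subtasks; a single constant key would funnel everything to one subtask.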

Re: PreAggregate operator with timeout trigger

2019-11-05 Thread Felipe Gutierrez
@Gyula, I am afraid I haven't got your point.

Re: PreAggregate operator with timeout trigger

2019-11-05 Thread Felipe Gutierrez
Ah, yep. I do create a keyed stream, but one which does not actually partition the data, and I pre-aggregate key-values inside my operator. But I cannot rely on the number of keys to trigger the pre-aggregation, because I can never be sure to receive some specific number of keys. So, the main criterion to pre-aggregate is time. After that, I

Re: PreAggregate operator with timeout trigger

2019-11-05 Thread Piotr Nowojski
Yes, you are right. Good to hear that you have solved your issue :) Piotrek

Re: How can I get the backpressure signals inside my function or operator?

2019-11-05 Thread Zhijiang
Hi Felipe, That is an interesting idea to control the upstream's output based on the downstream's input. If I understood correctly, the preAggregate operator would trigger flushing its output while the reduce operator is idle/hungry. In contrast, the preAggregate would continue aggregating data in the c

Re: RocksDB state on HDFS seems not being cleaned up

2019-11-05 Thread bupt_ljy
This should be sent to the user mailing list. Moving it here... Original Message Sender: bupt_ljy Recipient: dev Date: Tuesday, Nov 5, 2019 21:13 Subject: Re: RocksDB state on HDFS seems not being cleaned up Hi Shuwen, The "shared" means that the state files are shared among multiple checkpoint

Re: The RMClient's and YarnResourceManager's internal state about the number of pending container requests has diverged

2019-11-05 Thread Till Rohrmann
Hi Regina, I've taken another look at the problem. I think we could improve the situation by reordering the calls we do in YarnResourceManager#onContainersAllocated. I've created a PR [1] for the re-opened issue [2]. Would it be possible for you to verify the fix? What you need to do is to check th

Re: How can I get the backpressure signals inside my function or operator?

2019-11-05 Thread Felipe Gutierrez
Hi Zhijiang, thanks for your reply. Yes, you understood correctly. The fact that I cannot get "Shuffle.Netty.Input.Buffers.inputQueueLength" on the operator might be because of the way the Flink runtime architecture was designed. But I was wondering what kind of signal I can get. I guess some backpres
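The metric Felipe mentions, inputQueueLength, is essentially a queue-occupancy reading: a consistently full input queue is a backpressure symptom. As a conceptual illustration only (not Flink API — `BackpressureGauge` and its methods are hypothetical names), the flush-when-idle idea from this thread can be sketched as a producer consulting the occupancy of a bounded queue before deciding whether to keep pre-aggregating:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Conceptual stand-in for an input queue whose length serves as a backpressure signal. */
public class BackpressureGauge {
    private final BlockingQueue<String> inputQueue;
    private final int capacity;

    public BackpressureGauge(int capacity) {
        this.capacity = capacity;
        this.inputQueue = new ArrayBlockingQueue<>(capacity);
    }

    public boolean offer(String record) {
        return inputQueue.offer(record); // non-blocking; false when the queue is full
    }

    public String poll() {
        return inputQueue.poll();
    }

    /** Fraction of the queue in use; a persistently high value suggests the consumer is backpressured. */
    public double occupancy() {
        return (double) inputQueue.size() / capacity;
    }

    /** A pre-aggregator could keep buffering while this is true, and flush once it turns false. */
    public boolean consumerLooksBusy(double threshold) {
        return occupancy() >= threshold;
    }
}
```

In a real deployment the analogous reading would come from Flink's metric system (e.g. via the metrics reporter or REST API) rather than from inside the operator, which matches Felipe's observation that the operator itself cannot see the shuffle metrics.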

Partitioning based on key flink kafka sink

2019-11-05 Thread Vishwas Siravara
Hi all, I am using flink 1.7.0 and using this constructor FlinkKafkaProducer(String topicId, KeyedSerializationSchema serializationSchema, Properties producerConfig) From the doc it says this constructor uses a fixed partitioner. I want to partition based on key, so I tried to use this public Fl
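For key-based partitioning, Flink's Kafka connector lets you supply a custom `FlinkKafkaPartitioner` instead of the default fixed partitioner (the exact constructor overloads should be checked against the 1.7 docs). The core of such a partitioner is just a deterministic mapping from key bytes to one of the available partitions; a hedged, Flink-free sketch of that mapping (`KeyHashPartitioner` is a hypothetical name, and the hash choice here is illustrative, not Kafka's murmur2):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Sketch of key-hash partitioning logic, as a Kafka-style partitioner would compute it. */
public class KeyHashPartitioner {
    /** Deterministically map a key to one of the available partitions. */
    public static int partitionFor(String key, int[] partitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // Mask the sign bit instead of Math.abs, which overflows for Integer.MIN_VALUE.
        int hash = Arrays.hashCode(keyBytes) & 0x7fffffff;
        return partitions[hash % partitions.length];
    }
}
```

In Flink this logic would live in the `partition(...)` method of a `FlinkKafkaPartitioner` subclass passed to the producer; the essential property is that the same key always lands on the same partition.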

Limit max cpu usage per TaskManager

2019-11-05 Thread Lu Niu
Hi, When running a Flink application in YARN mode, is there a way to limit the maximum CPU usage per TaskManager? I tried this application with just a source and a sink operator. The parallelism of the source is 60 and the parallelism of the sink is 1. When running with the default config, there are 60 TaskManagers assigned. I notic
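The usual knobs for this are Flink's YARN vcore and slot settings; a minimal flink-conf.yaml sketch (the option names `yarn.containers.vcores` and `taskmanager.numberOfTaskSlots` are real Flink options, but the values are illustrative):

```yaml
# Request 2 vcores per TaskManager container from YARN (defaults to the number of slots).
yarn.containers.vcores: 2

# Pack more parallel subtasks into each TaskManager so fewer containers are needed.
taskmanager.numberOfTaskSlots: 4
```

One caveat worth noting: as far as I know, YARN only *enforces* vcore limits when the cluster is configured for CPU scheduling and isolation (e.g. DominantResourceCalculator plus cgroups with the LinuxContainerExecutor); with the default memory-only resource calculator, the vcore setting affects scheduling but a container can still burst above it.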