Re: [SURVEY] Is the default restart delay of 0s causing problems?

2019-09-02 Thread Arvid Heise
Hi all, just wanted to share my experience with configurations with you. For non-expert users configurations of Flink can be very daunting. The list of common properties is already helping a lot [1], but it's not clear how they depend on each other and settings common for specific use cases are no

Re: [SURVEY] Is the default restart delay of 0s causing problems?

2019-09-02 Thread Zhu Zhu
1s looks good to me. And I think the conclusion that when a user should override the delay is worth to be documented. Thanks, Zhu Zhu Steven Wu 于2019年9月3日周二 上午4:42写道: > 1s sounds a good tradeoff to me. > > On Mon, Sep 2, 2019 at 1:30 PM Till Rohrmann wrote: > >> Thanks a lot for all your feedb

understanding task manager logs

2019-09-02 Thread Vishwas Siravara
Hi guys, I am using flink 1.7.2 and my application consumes from a kafka topic and publish to another kafka topic which is in its own kafka environment running a different kafka version,. I am using FlinkKafkaConsumer010 from this dependency *"org.apache.flink" %% "flink-connector-kafka-0.10" % fli

question

2019-09-02 Thread ????????
How do you do: My problem is flink table format and table schema mapping. The input data is similar to the following json format?? { "id": "123","serial": "6b0c2d26", "msg": {"f1": "5677"} } The format code for TableSource is as follows: new Json().schema(Types.ROW

Re: kinesis table connector support

2019-09-02 Thread Bowen Li
@Fanbin, I don't think there's one yet. Feel free to create a ticket and submit a PR for it On Mon, Sep 2, 2019 at 8:13 AM Biao Liu wrote: > Hi Fanbin, > > I'm not familiar with table module. Maybe someone else could help. > > @jincheng sun > Do you know there is any plan for kinesis table conn

Re: [SURVEY] Is the default restart delay of 0s causing problems?

2019-09-02 Thread Steven Wu
1s sounds a good tradeoff to me. On Mon, Sep 2, 2019 at 1:30 PM Till Rohrmann wrote: > Thanks a lot for all your feedback. I see there is a slight tendency > towards having a non zero default delay so far. > > However, Yu has brought up some valid points. Maybe I can shed some light > on a). > >

Re: End of Window Marker

2019-09-02 Thread Eduardo Winpenny Tejedor
Hi all, I'll illustrate my approach with an example as it is definitely unorthodox. Here's some sample code. It works for me...I hope there are no (obvious) flaws! //myStream should be a stream of objects associated to a timestamp. the idea is to create a Flink app that //sends each object to kaf

Re: [SURVEY] Is the default restart delay of 0s causing problems?

2019-09-02 Thread Till Rohrmann
Thanks a lot for all your feedback. I see there is a slight tendency towards having a non zero default delay so far. However, Yu has brought up some valid points. Maybe I can shed some light on a). Before FLINK-9158 we set the default delay to 10s because Flink did not support queued scheduling w

Re: End of Window Marker

2019-09-02 Thread Padarn Wilson
Hi Fabian, > but each partition may only be written by a single task Sorry I think I misunderstand something here then: If I have a topic with one partition, but multiple sink tasks (or parallelism > 1).. this means the data must all be shuffled to the single task writing that partition? Padarn

Re: kinesis table connector support

2019-09-02 Thread Biao Liu
Hi Fanbin, I'm not familiar with table module. Maybe someone else could help. @jincheng sun Do you know there is any plan for kinesis table connector? Thanks, Biao /'bɪ.aʊ/ On Sat, 24 Aug 2019 at 02:26, Fanbin Bu wrote: > Hi, > > Looks like Flink table connectors do not include `kinesis`.

Re: checkpoint failure suddenly even state size is into 10 mb around

2019-09-02 Thread Biao Liu
Hi Sushant, Your screenshot shows the checkpoint expired. It means checkpoint did not finish in time. I guess the reason is the heavy back pressure blocks both data and barrier. But I can't tell why there was a heavy back pressure. If this scenario happens again, you could pay more attention to t

Kinesis stream and serialization schemas

2019-09-02 Thread Yoandy Rodríguez
Hi everyone, As I've mention in previous emails, we're currently exploring flink as a substitute for some in house products. One of these products sends JSON data to a Kinesis Data Stream, another product process the records after some time. We've tried to set up the Kinesis producer like this:

Window metadata removal

2019-09-02 Thread gil bl
Hi, I'm interested in why metadata like WindowOperator and InternalTimer are being kept for windowSize + allowedLateness period per each pane.What is the purpose of keeping this data if no new events are expected to enter the pane? Is there any way this metadata can be released earlier?

Re: Is it possible to register a custom TypeInfoFactory without using an annotation?

2019-09-02 Thread Biao Liu
Hi, Java supports customization of serializer/deserializer, see [1]. Could it satisfy your requirement? 1. https://stackoverflow.com/questions/7290777/java-custom-serialization Thanks, Biao /'bɪ.aʊ/ On Mon, 26 Aug 2019 at 16:34, 杨力 wrote: > I'd like to provide a custom serializer for a POJO

Re: tumbling event time window , parallel

2019-09-02 Thread Fabian Hueske
I meant to not use Flink's built-in windows at all but implement your logic in a KeyedProcessFunction. So basically: myDataStream.keyBy(...).process(new MyKeyedProcessFunction) instead of: myDataStream.keyBy(...).window(...).process(new MyWindowProcessFunction) Am Mo., 2. Sept. 2019 um 15:29 Uhr

Re: Kafka producer failed with InvalidTxnStateException when performing commit transaction

2019-09-02 Thread Becket Qin
Hi Tony, >From the symptom it is not quite clear to me what may cause this issue. Supposedly the TransactionCoordinator is independent of the active controller, so bouncing the active controller should not have special impact on the transactions (at least not every time). If this is stably reprodu

Re: End of Window Marker

2019-09-02 Thread Fabian Hueske
Hi Padarn, Regarding your throughput concerns: A sink task may write to multiple partitions, but each partition may only be written by a single task. @Eduardo: Thanks for sharing your approach! Not sure if I understood it correctly, but I think that the approach does not guarantee that all result

Join with slow changing dimensions/ streams

2019-09-02 Thread Hanan Yehudai
I have a very common use case -enriching the stream with some dimension tables. e.g the events stream has a SERVER_ID , and another files have the LOCATION associated with e SERVER_ID. ( a dimension table csv file) in SQL I would simply join. but hen using Flink stream API , as far

Re: checkpoint failure in forever loop suddenly even state size less than 1 mb

2019-09-02 Thread Fabian Hueske
Hi Sushant, It's hard to tell what's going on. Maybe the thread pool of the async io operator is too small for the ingested data rate? This could cause the backpressure on the source and eventually also the failing checkpoints. Which Flink version are you using? Best, Fabian Am Do., 29. Aug. 2

RE: tumbling event time window , parallel

2019-09-02 Thread Hanan Yehudai
Im not sure what you mean by use process function and not window process function , as the window operator takes in a windowprocess function.. From: Fabian Hueske Sent: Monday, August 26, 2019 1:33 PM To: Hanan Yehudai Cc: user@flink.apache.org Subject: Re: tumbling event time window , paralle

TaskManager process continue to work after termination

2019-09-02 Thread Ustinov Anton
Hello, I have a standalone cluster setup with Flink 1.8. Task manager processes configured via systemd units with the always restart policy. An error occurred during execution of the JobGraph and caused termination of the task manager. Logs from task manager: {"time":"2019-09-02 11:33:14.797",

Re: How to handle Flink Job with 400MB+ Uberjar with 800+ containers ?

2019-09-02 Thread Yang Wang
Hi Dadashov, Regarding your questions. > Q1 Do all those 800 nodes download of batch of 3 at a time The 800+ containers will be allocated on different yarn nodes. By default, the LocalResourceVisibility is APPLICATION, so they will be downloaded only once and shared for all taskmanager conta

Re: Assigning UID to Flink SQL queries

2019-09-02 Thread Dawid Wysakowicz
Hi Yuval, Unfortunately currently you cannot assign UIDs in table programs. The reason is that uid is used for reassigning state upon restart. Table programs are automatically compiled to executable program. This program might change depending on many factors: table statistics, applied rules etc.