Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Terry Wang
Congratulations! Best, Terry Wang > 2020年1月17日 14:09,Biao Liu 写道: > > Congrats! > > Thanks, > Biao /'bɪ.aʊ/ > > > > On Fri, 17 Jan 2020 at 13:43, Rui Li > wrote: > Congratulations Dian, well deserved! > > On Thu, Jan 16, 2020 at 5:58 PM jincheng sun

Re: [DISCUSS] Change default for RocksDB timers: Java Heap => in RocksDB

2020-01-16 Thread Biao Liu
+1 I think that's how it should be. Timer should align with other regular state. If user wants a better performance without memory concern, memory or FS statebackend might be considered. Or maybe we could optimize the performance by introducing a specific column family for timer. It could have it

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Biao Liu
Congrats! Thanks, Biao /'bɪ.aʊ/ On Fri, 17 Jan 2020 at 13:43, Rui Li wrote: > Congratulations Dian, well deserved! > > On Thu, Jan 16, 2020 at 5:58 PM jincheng sun > wrote: > >> Hi everyone, >> >> I'm very happy to announce that Dian accepted the offer of the Flink PMC >> to become a committ

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Rui Li
Congratulations Dian, well deserved! On Thu, Jan 16, 2020 at 5:58 PM jincheng sun wrote: > Hi everyone, > > I'm very happy to announce that Dian accepted the offer of the Flink PMC > to become a committer of the Flink project. > > Dian Fu has been contributing to Flink for many years. Dian Fu pl

Old offsets consumed from Kafka after Flink upgrade to 1.9.1 (from 1.2.1)

2020-01-16 Thread Somya Maithani
Hey Team, *Problem* Recently, we were trying to upgrade Flink infrastructure to version 1.9.1 and we noticed that a week old offset was consumed from Kafka even though the configuration says latest. *Pretext* 1. Our current Flink version in production is 1.2.1. 2. We use RocksDB + Hadoop as our b

Job cannot be deployed when use detached mode

2020-01-16 Thread sysuke Lee
Hi all, We've got a jar with hadoop configuration files in it. Previously we use blocking mode to deploy jars on YARN, they run well. Recently we find the client process occupies more and more memory , so we try to use detached mode, but the job failed to deploy with following error information:

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Yuan Mei
Congrats! Best Yuan On Thu, Jan 16, 2020 at 5:59 PM jincheng sun wrote: > Hi everyone, > > I'm very happy to announce that Dian accepted the offer of the Flink PMC to > become a committer of the Flink project. > > Dian Fu has been contributing to Flink for many years. Dian Fu played an > essen

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Paul Lam
Congrats, Dian! Best, Paul Lam > 在 2020年1月17日,10:49,tison 写道: > > Congratulations! Dian > > Best, > tison. > > > Zhu Zhu mailto:reed...@gmail.com>> 于2020年1月17日周五 > 上午10:47写道: > Congratulations Dian. > > Thanks, > Zhu Zhu > > hailongwang <18868816...@163.com > 于

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread tison
Congratulations! Dian Best, tison. Zhu Zhu 于2020年1月17日周五 上午10:47写道: > Congratulations Dian. > > Thanks, > Zhu Zhu > > hailongwang <18868816...@163.com> 于2020年1月17日周五 上午10:01写道: > >> >> Congratulations Dian ! >> >> Best, >> Hailong Wang >> >> >> >> >> 在 2020-01-16 21:15:34,"Congxian Qiu" 写道: >

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Zhu Zhu
Congratulations Dian. Thanks, Zhu Zhu hailongwang <18868816...@163.com> 于2020年1月17日周五 上午10:01写道: > > Congratulations Dian ! > > Best, > Hailong Wang > > > > > 在 2020-01-16 21:15:34,"Congxian Qiu" 写道: > > Congratulations Dian Fu > > Best, > Congxian > > > Jark Wu 于2020年1月16日周四 下午7:44写道: > >> Co

Re: [DISCUSS] Change default for RocksDB timers: Java Heap => in RocksDB

2020-01-16 Thread Jingsong Li
Hi Stephan, Thanks for starting this discussion. +1 for stores times in RocksDB by default. In the past, when Flink didn't save the times with RocksDb, I had a headache. I always adjusted parameters carefully to ensure that there was no risk of Out of Memory. Just curious, how much impact of heap

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread hailongwang
Congratulations Dian ! Best, Hailong Wang 在 2020-01-16 21:15:34,"Congxian Qiu" 写道: Congratulations Dian Fu Best, Congxian Jark Wu 于2020年1月16日周四 下午7:44写道: Congratulations Dian and welcome on board! Best, Jark On Thu, 16 Jan 2020 at 19:32, Jingsong Li wrote: > Congratulatio

Job Manager heap metrics

2020-01-16 Thread RKandoji
Hi, Could someone please tell me what is the best way to check amount of heap consumed by Job Manager? Currently I added huge heap of 20GB for both Job Manager and Task Manager. I'm able to see task manager heap usage on UI but not for Job Manager. I would like to decide how much heap to allocat

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Bowen Li
Congrats! On Thu, Jan 16, 2020 at 13:45 Peter Huang wrote: > Congratulations, Dian! > > > Best Regards > Peter Huang > > On Thu, Jan 16, 2020 at 11:04 AM Yun Tang wrote: > >> Congratulations, Dian! >> >> Best >> Yun Tang >> -- >> *From:* Benchao Li >> *Sent:* Thursd

savepoint failed for finished tasks

2020-01-16 Thread Fanbin Bu
Hi, I couldn't make a savepoint for the following graph: [image: image.png] with stacktrace: Caused by: java.util.concurrent.CompletionException: java.util.concurrent.CompletionException: org.apache.flink.runtime.checkpoint.CheckpointException: Failed to trigger savepoint. Failure reason: Not all

Re: SANSA 0.7.1 (Scalable Semantic Analytics Stack) Released

2020-01-16 Thread Oytun Tez
Looks interesting, Hajira. Thank you for sharing. We should add this to flink-packages.org. -- [image: MotaWord] Oytun Tez M O T A W O R D | CTO & Co-Founder oy...@motaword.com On Thu, Jan 16, 2020 at 5:01 PM Hajira Jabeen wrote: > > Dear all, > >

SANSA 0.7.1 (Scalable Semantic Analytics Stack) Released

2020-01-16 Thread Hajira Jabeen
Dear all, The Smart Data Analytics group (http://sda.tech) is happy to announce SANSA 0.7.1 - the seventh release of the Scalable Semantic Analytics Stack. SANSA employs distributed computing via Apache Spark and Apache Flink in order to allow scalable machine learning, inference and querying capa

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Peter Huang
Congratulations, Dian! Best Regards Peter Huang On Thu, Jan 16, 2020 at 11:04 AM Yun Tang wrote: > Congratulations, Dian! > > Best > Yun Tang > -- > *From:* Benchao Li > *Sent:* Thursday, January 16, 2020 22:27 > *To:* Congxian Qiu > *Cc:* d...@flink.apache.org ;

Re: How to handle startup for mandatory config parameters?

2020-01-16 Thread John Smith
Sorry I should have specified how to handle job specific config parameters using ParameterTool ParameterTool parameters = ... String someConfig = parameters.get("some.config"); <--- This is mandatory Do I check someConfig for what ever requirement and just throw an exception before starting the

How to declare the Row object schema

2020-01-16 Thread Soheil Pourbafrani
Hi, Inserting a DataSet of the type Row using the Flink *JDBCOutputFormat *I continuously go the warning: [DataSink (org.apache.flink.api.java.io.jdbc.JDBCOutputFormat@18be83e4) (1/4)] WARN org.apache.flink.api.java.io.jdbc.JDBCOutputFormat - Unknown column type for column 8. Best effort approach

Re: Flink Batch mode checkpointing

2020-01-16 Thread Soheil Pourbafrani
Thanks a lot. On Thu, Jan 16, 2020 at 8:12 PM Yun Tang wrote: > Hi > > Current Batch API does not really relay on checkpoint mechanism and not > support for checkpoints. In the long term, Flink targets for the > unification of streaming and batch [1]. Once this is completed, we can > enable some

Re: [DISCUSS] Change default for RocksDB timers: Java Heap => in RocksDB

2020-01-16 Thread Yun Tang
Hi Stephan, I am +1 for the change which stores timers in RocksDB by default. Some users hope the checkpoint could be completed as fast as possible, which also need the timer stored in RocksDB to not affect the sync part of checkpoint. Best Yun Tang From: Andrey

Re: Flink Batch mode checkpointing

2020-01-16 Thread Yun Tang
Hi Current Batch API does not really relay on checkpoint mechanism and not support for checkpoints. In the long term, Flink targets for the unification of streaming and batch [1]. Once this is completed, we can enable some checkpoint future when dealing batch cases. [1] https://flink.apache.o

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Yun Tang
Congratulations, Dian! Best Yun Tang From: Benchao Li Sent: Thursday, January 16, 2020 22:27 To: Congxian Qiu Cc: d...@flink.apache.org ; Jingsong Li ; jincheng sun ; Shuo Cheng ; Xingbo Huang ; Wei Zhong ; Hequn Cheng ; Leonard Xu ; Jeff Zhang ; user ; user

Flink Batch mode checkpointing

2020-01-16 Thread Soheil Pourbafrani
Hi, While in Streaming mode I'm using the Flink checkpointing and restart strategy, I could not find any checkpointing or restart strategy for Batch mode! Does Flink have any support for that? Actually I'm gonna read some huge text files and I need the application to be at least restarted on any f

Re: Read CSV file and and create customized field

2020-01-16 Thread Chesnay Schepler
You should add an extra map function. On 16/01/2020 17:10, Soheil Pourbafrani wrote: Hi friends, I'm going to read a CSV file that has 3 columns. I want the final loaded datatype to have other columns inferred by that 3 columns. For example, I would split the first column of the CSV file and c

Read CSV file and and create customized field

2020-01-16 Thread Soheil Pourbafrani
Hi friends, I'm going to read a CSV file that has 3 columns. I want the final loaded datatype to have other columns inferred by that 3 columns. For example, I would split the first column of the CSV file and create 3 new columns. The problem is I did not find a straightforward approach for that. He

Re: [DISCUSS] Change default for RocksDB timers: Java Heap => in RocksDB

2020-01-16 Thread Andrey Zagrebin
Hi Stephan, Thanks for starting this discussion. I am +1 for this change. In general, number of timer state keys can have the same order as number of main state keys. So if RocksDB is used for main state for scalability, it makes sense to have timers there as well unless timers are used for only v

[DISCUSS] Change default for RocksDB timers: Java Heap => in RocksDB

2020-01-16 Thread Stephan Ewen
Hi all! I would suggest a change of the current default for timers. A bit of background: - Timers (for windows, process functions, etc.) are state that is managed and checkpointed as well. - When using the MemoryStateBackend and the FsStateBackend, timers are kept on the JVM heap, like regula

Why would indefinitely growing state an issue for Flink while doing stream to stream joins?

2020-01-16 Thread kant kodali
Hi All, The doc https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/streaming/joins.html#regular-joins says the following. "However, this operation has an important implication: it requires to keep both sides of the join input in Flink’s state forever. Thus, the resource usage will g

Re: Flink ParquetAvroWriters Sink

2020-01-16 Thread aj
Hi Arvid, Thanks for the quick response. I am new to this Avro design so can you please help me understand and design for my use case. I have use case like this : 1. we have an app where a lot of action happened from the user side. 2. for each action we collect some set of information that define

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Benchao Li
Congratulations Dian. Congxian Qiu 于2020年1月16日周四 下午10:15写道: > Congratulations Dian Fu > > Best, > Congxian > > > Jark Wu 于2020年1月16日周四 下午7:44写道: > >> Congratulations Dian and welcome on board! >> >> Best, >> Jark >> >> On Thu, 16 Jan 2020 at 19:32, Jingsong Li wrote: >> >> > Congratulations Di

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Congxian Qiu
Congratulations Dian Fu Best, Congxian Jark Wu 于2020年1月16日周四 下午7:44写道: > Congratulations Dian and welcome on board! > > Best, > Jark > > On Thu, 16 Jan 2020 at 19:32, Jingsong Li wrote: > > > Congratulations Dian Fu. Well deserved! > > > > Best, > > Jingsong Lee > > > > On Thu, Jan 16, 2020 a

Re: Frequently checkpoint failure, could make the flink sql state not clear?

2020-01-16 Thread Congxian Qiu
Hi, AFAIK, whether a timer will fire is irrelevant to checkpoint success or not. Best, Congxian LakeShen 于2020年1月16日周四 下午8:53写道: > Hi community, now I am using Flink sql , and I set the retention time, As > I all know is that Flink will set the timer for per key to clear their > state, if Fli

Re: Flink ParquetAvroWriters Sink

2020-01-16 Thread Arvid Heise
Hi Anuj, you should always avoid having records with different schemas in the same topic/dataset. You will break the compatibility features of the schema registry and your consumer/producer code is always hard to maintain. A common and scalable way to avoid it is to use some kind of envelope form

Flink ParquetAvroWriters Sink

2020-01-16 Thread aj
Hi All, I have a use case where I am getting a different set of Avro records in Kafka. I am using the schema registry to store Avro schema. One topic can also have different types of records. Now I have created a GenericRecord Stream using kafkaAvroDeseralizer by defining custom Deserializer clas

Frequently checkpoint failure, could make the flink sql state not clear?

2020-01-16 Thread LakeShen
Hi community, now I am using Flink sql , and I set the retention time, As I all know is that Flink will set the timer for per key to clear their state, if Flink task always checkpoint failure, are the key state cleared by timer? Thanks to your replay.

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Jark Wu
Congratulations Dian and welcome on board! Best, Jark On Thu, 16 Jan 2020 at 19:32, Jingsong Li wrote: > Congratulations Dian Fu. Well deserved! > > Best, > Jingsong Lee > > On Thu, Jan 16, 2020 at 6:26 PM jincheng sun > wrote: > >> Congrats Dian Fu and welcome on board! >> >> Best, >> Jinchen

Re: Flink Dataset to ParquetOutputFormat

2020-01-16 Thread aj
Hi Arvid, Thanks for the details reply. I am using Dataset API and its a batch job so wondering is the option you provided is works for that. Thanks, Anuj On Wed, Jan 8, 2020 at 7:01 PM Arvid Heise wrote: > Hi Anji, > > StreamingFileSink has a BucketAssigner that you can use for that purpose.

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Jingsong Li
Congratulations Dian Fu. Well deserved! Best, Jingsong Lee On Thu, Jan 16, 2020 at 6:26 PM jincheng sun wrote: > Congrats Dian Fu and welcome on board! > > Best, > Jincheng > > Shuo Cheng 于2020年1月16日周四 下午6:22写道: > >> Congratulations! Dian Fu >> >> > Xingbo Wei Zhong 于2020年1月16日周四 下午6:13写道: >

Taskmanager fails to connect to Jobmanager [Could not find any IPv4 address that is not loopback or link-local. Using localhost address.]

2020-01-16 Thread Kumar Bolar, Harshith
Hi all, We were previously using RHEL for our Flink machines. I'm currently working on moving them over to Ubuntu. When I start the task manager, it fails to connect to the job manager with the following message - 2020-01-16 10:54:42,777 INFO org.apache.flink.runtime.util.LeaderRetrievalUtils

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread jincheng sun
Congrats Dian Fu and welcome on board! Best, Jincheng Shuo Cheng 于2020年1月16日周四 下午6:22写道: > Congratulations! Dian Fu > > > Xingbo Wei Zhong 于2020年1月16日周四 下午6:13写道: >> jincheng sun > 于2020年1月16日周四 下午5:58写道: >

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Xingbo Huang
Congratulations, Dian. Well deserved! Best, Xingbo Wei Zhong 于2020年1月16日周四 下午6:13写道: > Congrats Dian Fu! Well deserved! > > Best, > Wei > > 在 2020年1月16日,18:10,Hequn Cheng 写道: > > Congratulations, Dian. > Well deserved! > > Best, Hequn > > On Thu, Jan 16, 2020 at 6:08 PM Leonard Xu wrote: > >

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Wei Zhong
Congrats Dian Fu! Well deserved! Best, Wei > 在 2020年1月16日,18:10,Hequn Cheng 写道: > > Congratulations, Dian. > Well deserved! > > Best, Hequn > > On Thu, Jan 16, 2020 at 6:08 PM Leonard Xu > wrote: > Congratulations! Dian Fu > > Best, > Leonard > >> 在 2020年1月16日,1

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Hequn Cheng
Congratulations, Dian. Well deserved! Best, Hequn On Thu, Jan 16, 2020 at 6:08 PM Leonard Xu wrote: > Congratulations! Dian Fu > > Best, > Leonard > > 在 2020年1月16日,18:00,Jeff Zhang 写道: > > Congrats Dian Fu ! > > jincheng sun 于2020年1月16日周四 下午5:58写道: > >> Hi everyone, >> >> I'm very happy to a

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Leonard Xu
Congratulations! Dian Fu Best, Leonard > 在 2020年1月16日,18:00,Jeff Zhang 写道: > > Congrats Dian Fu ! > > jincheng sun mailto:sunjincheng...@gmail.com>> > 于2020年1月16日周四 下午5:58写道: > Hi everyone, > > I'm very happy to announce that Dian accepted the offer of the Flink PMC to > become a committer

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Jeff Zhang
Congrats Dian Fu ! jincheng sun 于2020年1月16日周四 下午5:58写道: > Hi everyone, > > I'm very happy to announce that Dian accepted the offer of the Flink PMC > to become a committer of the Flink project. > > Dian Fu has been contributing to Flink for many years. Dian Fu played an > essential role in PyFli

[ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread jincheng sun
Hi everyone, I'm very happy to announce that Dian accepted the offer of the Flink PMC to become a committer of the Flink project. Dian Fu has been contributing to Flink for many years. Dian Fu played an essential role in PyFlink/CEP/SQL/Table API modules. Dian Fu has contributed several major fea