Re: Flink 1.12

2020-12-17 Thread Yang Wang
The latest successful checkpoint pointer is stored in the ConfigMap, as well as the JobGraph pointer. They could help us recover the running jobs before you delete the K8s deployment. If the HA ConfigMaps are deleted, then when you create a Flink cluster with the same cluster-id, it could not recov
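The pointer indirection Yang describes can be pictured as a simple lookup. This is a toy model only, assuming hypothetical key names and paths; the real ConfigMap layout used by Flink's Kubernetes HA services differs:

```python
# Toy model of Kubernetes HA recovery: the ConfigMap stores *pointers*
# (object-store paths) to the JobGraph and latest checkpoint, not the
# data itself. Keys and paths here are made up for illustration.
ha_configmap = {
    "jobGraph": "s3://flink-ha/job-0000/jobgraph",
    "latestCheckpoint": "s3://flink-ha/job-0000/chk-17",
}

def recover(configmap):
    """Return the paths a restarted JobManager would read, or None if the
    HA ConfigMap was deleted (nothing can be recovered in that case)."""
    if not configmap:
        return None
    return configmap["jobGraph"], configmap["latestCheckpoint"]

print(recover(ha_configmap))
print(recover({}))  # deleted ConfigMap -> None: same cluster-id starts fresh
```

This is why deleting the HA ConfigMaps before tearing down the deployment loses the running jobs: the pointers are the only link to the persisted state.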

Re: Flink 1.12

2020-12-17 Thread Boris Lublinsky
Also, re-reading https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/ha/kubernetes_ha.html#high-availability-data-clean-up This does not seem right:

Re: Flink 1.12

2020-12-17 Thread Boris Lublinsky
Thanks Yang, > On Dec 17, 2020, at 8:49 PM, Yang Wang wrote: > > Hi Boris, > > Thanks for your follow up response and trying the new KubernetesHAService. > > 1. It is a valid bug. We are not setting the service account for TaskManager > pod. Before the KubernetesHAService is introduced, it w

Flink eventTime?

2020-12-17 Thread ?g???U?[????
Hi all. When I use SQL with a UDTF, calling the tableEnv.sqlQuery() method throws the following error: Rowtime attributes must not be in the input rows of a regular join. As a workaround you can cast the time attributes of input tables to TIMESTAMP before. I used the to_timestamp functi

Re: Pyflink UDF with ARRAY as input

2020-12-17 Thread Xingbo Huang
Hi Torben, It is indeed a bug, and I have created a JIRA [1]. The workaround is to access the fields by index (written against release-1.12): env = StreamExecutionEnvironment.get_execution_environment() env.set_parallelism(1) t_env = StreamTableEnvironment.create(env, environment_set
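The code in the reply is truncated. As a language-level illustration of the index-based workaround (plain Python standing in for a PyFlink UDF body; PyFlink hands ROW values to the Python function as tuples, and the schema and names below are hypothetical):

```python
# Toy stand-in for a PyFlink scalar UDF body: because ROW values arrive
# as tuples, fields inside an ARRAY of ROW can be read by position even
# when named access hits the bug described above.

def extract_first_fields(list_element):
    """list_element: list of (name, score) tuples; hypothetical schema."""
    # Index 0 stands for the first field of each ROW element.
    return ",".join(str(row[0]) for row in list_element)

rows = [("alice", 1.0), ("bob", 2.0)]
print(extract_first_fields(rows))  # -> alice,bob
```

In an actual job this function body would be wrapped with the `@udf(...)` decorator and registered on the table environment.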

Re: Flink 1.12

2020-12-17 Thread Yang Wang
Hi Boris, Thanks for your follow-up response and trying the new KubernetesHAService. 1. It is a valid bug. We are not setting the service account for the TaskManager pod. Before the KubernetesHAService was introduced, this worked fine because the TaskManager does not need to access the K8s resources (e.g.

Re: Flink 1.11 job hit error "Job leader lost leadership" or "ResourceManager leader changed to new address null"

2020-12-17 Thread Xintong Song
I'm not aware of any significant changes to the HA components between 1.9 and 1.11. Would you mind sharing the complete jobmanager/taskmanager logs? Thank you~ Xintong Song On Fri, Dec 18, 2020 at 8:53 AM Lu Niu wrote: > Hi, Xintong > > Thanks for replying and your suggestion. I did check the ZK

Re: Flink 1.11 job hit error "Job leader lost leadership" or "ResourceManager leader changed to new address null"

2020-12-17 Thread Lu Niu
Hi, Xintong Thanks for replying and your suggestion. I did check the ZK side but there is nothing interesting. The error message actually shows that only one TM thought JM lost leadership while others ran fine. Also, this happened only after we migrated from 1.9 to 1.11. Is it possible this is int

Re: Flink 1.12

2020-12-17 Thread Boris Lublinsky
And K8s native HA works, but there are 2 bugs in this implementation. 1. TaskManager pods run under the default service account, which fails because it does not have access to the ConfigMaps to get endpoint information. I had to add permissions to the default service account to make it work. Ideally bo

Pyflink UDF with ARRAY as input

2020-12-17 Thread Barth, Torben
Dear List, I have a table with the following structure: my_table -- Key: String -- List_element: ARRAY> I want to define a UDF to extract information from the "list_element". I cannot access the information of the array in the UDF. I tried something like: @udf(result_type=DataTypes.STRIN

flink sql read hive table throw java.lang.ArrayIndexOutOfBoundsException: 1024

2020-12-17 Thread house??????
When I use PyFlink Hive SQL to read data and insert it into ES, the following exception is thrown. The environment: flink 1.11.2, flink-sql-connector-hive-3.1.2_2.11-1.11.2.jar, hive 3.1.2. 2020-12-17 21:10:24,398 WARN org.apache.flink.runtime.taskmanager.Task [] - Source: HiveTableSource(

Re: Challenges Deploying Flink With Savepoints On Kubernetes

2020-12-17 Thread Till Rohrmann
Flink should try to pick the latest checkpoint and will only use the savepoint if no newer checkpoint could be found. Cheers, Till On Wed, Dec 16, 2020 at 10:13 PM vishalovercome wrote: > I'm not sure if this addresses the original concern. For instance consider > this sequence: > > 1. Job star
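Till's description of the restore order can be sketched as a small selection function. This is a simplification with made-up names; real recovery goes through Flink's completed-checkpoint store rather than a plain list:

```python
def pick_restore_point(savepoint, checkpoints):
    """Return the point to restore from.

    savepoint:   (path, timestamp) given at submission, or None
    checkpoints: list of (path, timestamp) retained checkpoints

    Flink prefers the latest checkpoint; the savepoint is only used
    when no newer checkpoint exists.
    """
    candidates = list(checkpoints)
    if savepoint is not None:
        candidates.append(savepoint)
    if not candidates:
        return None
    return max(candidates, key=lambda p: p[1])

# A job started from a savepoint at t=100 that later produced checkpoints
# will recover from the newest checkpoint, not the original savepoint.
sp = ("s3://bucket/savepoint-1", 100)
cps = [("s3://bucket/chk-42", 150), ("s3://bucket/chk-43", 200)]
print(pick_restore_point(sp, cps)[0])  # -> s3://bucket/chk-43
```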

Re: Does flink have a plan to support flink sql udf of any language?

2020-12-17 Thread Till Rohrmann
Hi Josh, Currently Flink supports Java, Scala and Python for defining SQL UDFs. There are no concrete plans to extend the set of supported languages at the moment. In general, these kinds of contributions are always welcome. What we have to make sure is to see how it fits into the overall story.

Does flink have a plan to support flink sql udf of any language?

2020-12-17 Thread Joshua Fan
Hi, Does the flink community have a plan to support flink sql udf in any language? For example, a udf in c or php. Because in my company, many developers do not know java or scala, they use c in their usual work. Now we have a workaround to support this situation by creating a process running the

Re: Flink - Create Temporary View and "Rowtime attributes must not be in the input rows of a regular join"

2020-12-17 Thread Timo Walther
Hi Dan, are you intending to use interval joins, regular joins, or a mixture of both? For regular joins you must ensure to cast a rowtime attribute to timestamp as early as possible. For interval joins, you need to make sure that the rowtime attribute is unmodified. Currently, I see COALE

Re: Flink - sending clicks+impressions to AWS Personalize

2020-12-17 Thread Timo Walther
Hi Dan, the exception that you get is a very frequent limitation in Flink SQL at the moment. I tried to summarize the issue recently here: https://stackoverflow.com/questions/64445207/rowtime-attributes-must-not-be-in-the-input-rows-of-a-regular-join-despite-usi/64500296#64500296 The query i

Re: Set TimeZone of Flink Streaming job

2020-12-17 Thread Timo Walther
Hi, Flink does not support time zones currently. However, all time operations work on Java `long` values. It is up to the user what this long value represents. It does not have to be UTC but can also be adjusted for another time zone. Since the DataStream API supports arbitrary Java objects, users can
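Timo's point that the long timestamp need not be UTC can be illustrated with a plain shift of epoch millis. This is a sketch in Python for brevity; in a real job the shift would happen in the timestamp assigner, and the offset value is an assumption:

```python
HOUR_MS = 3600 * 1000

def to_zone(epoch_millis_utc, utc_offset_hours):
    """Shift a UTC epoch-millis value so that time-based windows
    align with local-time boundaries in the given zone."""
    return epoch_millis_utc + utc_offset_hours * HOUR_MS

# 2020-12-17T00:00:00Z shifted by +8 hours for day-window alignment.
ts = 1608163200000
print(to_zone(ts, 8))  # -> 1608192000000
```

The engine never interprets the value as a wall-clock time, so as long as every timestamp in the job is shifted consistently, windowing behaves as if it were zone-aware.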

Re: Changing application configuration when restoring from checkpoint/savepoint

2020-12-17 Thread Timo Walther
Hi, I gave some answers in the other mail thread. Some additional comment: In general I think even configuration can be considered as state in this case. If state is not set, the job can be considered as a fresh start. Once the state is set, it would basically be just a configuration update.

Re: state inside functions

2020-12-17 Thread Timo Walther
Hi, if you would like to dynamically adjust configuration of your streaming job, it might be a good approach to consider the configuration as a stream itself. The connect() API can be used to connect a main stream with a control stream. In any case the configuration should be persisted in st
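The connect()-style pattern Timo describes can be mimicked in plain Python. This is a toy simulation of a connected two-input operator keeping the latest configuration as state; the event shapes and the `threshold` key are invented for illustration:

```python
def run(events):
    """events: iterable of ('config', dict) or ('data', value) tuples,
    standing in for the two inputs of a connected stream."""
    config = {"threshold": 10}   # initial state; in Flink this lives in keyed/broadcast state
    out = []
    for kind, payload in events:
        if kind == "config":
            config.update(payload)        # control stream updates the state
        elif payload > config["threshold"]:
            out.append(payload)           # main stream reads the current config
    return out

events = [("data", 15), ("config", {"threshold": 20}), ("data", 15), ("data", 25)]
print(run(events))  # -> [15, 25]
```

The key property the sketch shows is ordering: a data element is judged against whatever configuration arrived before it, which is why the configuration must also be persisted in state for recovery.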

Re: Will job manager restarts lead to repeated savepoint restoration?

2020-12-17 Thread Till Rohrmann
Hi, if you start a Flink job from a savepoint and the job needs to recover, then it will only reuse the savepoint if no later checkpoint has been created. Flink will always use the latest checkpoint/savepoint taken. Cheers, Till On Wed, Dec 16, 2020 at 9:47 PM vishalovercome wrote: > My flink