[jira] [Created] (FLINK-27301) KafkaSourceE2ECase#restartFromSavepoint fails on Azure

2022-04-19 Thread Jark Wu (Jira)
Jark Wu created FLINK-27301:
---

 Summary: KafkaSourceE2ECase#restartFromSavepoint fails on Azure
 Key: FLINK-27301
 URL: https://issues.apache.org/jira/browse/FLINK-27301
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.16.0
Reporter: Jark Wu






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-27302) Document regular CRD upgrade process

2022-04-19 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-27302:
--

 Summary: Document regular CRD upgrade process
 Key: FLINK-27302
 URL: https://issues.apache.org/jira/browse/FLINK-27302
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.0.0
Reporter: Gyula Fora


The CRD upgrade documentation 
([https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/upgrade/]) 
only deals with the case of breaking CRD changes.

It would be important to add a section detailing the regular CRD 
install/upgrade process using kubectl create/replace, since kubectl apply 
simply does not work with a CRD this large, as seen for example here: 
https://issues.apache.org/jira/browse/FLINK-27288





[jira] [Created] (FLINK-27303) Flink Operator will create a large number of temp log config files

2022-04-19 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-27303:
--

 Summary: Flink Operator will create a large number of temp log 
config files
 Key: FLINK-27303
 URL: https://issues.apache.org/jira/browse/FLINK-27303
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.0.0
Reporter: Gyula Fora
 Fix For: kubernetes-operator-1.0.0


We now use the config builder in multiple different places (observer, 
reconciler, validator, etc.) to generate the effective config.

The effective config generation logic also creates temporary log config files 
(if spec logConfiguration is set), which leads to 3-4 files being generated in 
every reconcile loop for a given job. These files are not cleaned up until the 
operator restarts, leading to a large number of files.

I believe we should change the config generation logic and only apply the 
log config generation right before Flink cluster submission, as that is the 
only step affected by it.
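A possible direction, as a minimal self-contained sketch (the helper name and cleanup strategy below are my assumptions, not the operator's actual code): materialize the temp log config file only in the submission path and clean it up afterwards, so that reconcile loops never touch the filesystem.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LogConfigScratch {
    /** Hypothetical helper: write the log config only when actually submitting a cluster. */
    static Path writeLogConfigForSubmission(String logConfig) throws IOException {
        Path f = Files.createTempFile("log4j-console", ".properties");
        Files.writeString(f, logConfig);
        f.toFile().deleteOnExit(); // safety net in case submission fails before cleanup
        return f;
    }

    public static void main(String[] args) throws IOException {
        // Reconcile loops no longer create files; only the single submission call does.
        Path p = writeLogConfigForSubmission("rootLogger.level = INFO");
        if (!Files.exists(p)) throw new AssertionError("config file should exist");
        Files.deleteIfExists(p); // explicit cleanup right after cluster submission
        if (Files.exists(p)) throw new AssertionError("config file should be cleaned up");
        System.out.println("ok");
    }
}
```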





[jira] [Created] (FLINK-27304) Flink doesn't support Hive primitive type void yet

2022-04-19 Thread tartarus (Jira)
tartarus created FLINK-27304:


 Summary: Flink doesn't support Hive primitive type void yet
 Key: FLINK-27304
 URL: https://issues.apache.org/jira/browse/FLINK-27304
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Hive
Affects Versions: 1.13.1, 1.15.0
Reporter: tartarus


We can reproduce this with a unit test.

Add the test case to HiveDialectITCase:
{code:java}
@Test
public void testHiveVoidType() {
    tableEnv.loadModule("hive", new HiveModule(hiveCatalog.getHiveVersion()));
    tableEnv.executeSql(
            "create table src (a int, b string, c int, sample array)");
    tableEnv.executeSql(
            "select a, one from src lateral view explode(sample) samples as one where a > 0");
} {code}





[jira] [Created] (FLINK-27305) Hive Dialect support implicit conversion

2022-04-19 Thread tartarus (Jira)
tartarus created FLINK-27305:


 Summary: Hive Dialect support implicit conversion
 Key: FLINK-27305
 URL: https://issues.apache.org/jira/browse/FLINK-27305
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Hive
Affects Versions: 1.13.1, 1.15.0
Reporter: tartarus


We can reproduce this with a unit test.

Add the test case to HiveDialectITCase:
{code:java}
@Test
public void testHiveIntEqualsBoolean() {
    tableEnv.executeSql(
            "create table src (x int, y string, z int) partitioned by (p_date string)");
    tableEnv.executeSql("select * from src where x = true");
} {code}





Re: [SPAM] Re: Re: Re: Re: Re: 回复:Re:[DISCUSS] FLIP-214 Support Advanced Function DDL

2022-04-19 Thread Jark Wu
Thank Ron for updating the FLIP.

I think the updated FLIP has addressed Martijn's concern.
I don't have other feedback. So +1 for a vote.

Best,
Jark

On Fri, 15 Apr 2022 at 16:36, 刘大龙  wrote:

> Hi, Jingsong
>
> Thanks for your feedback, we will use the Flink FileSystem abstraction, so HDFS,
> S3 and OSS will be supported.
>
> Best,
>
> Ron
>
> > -Original Message-
> > From: "Jingsong Li" 
> > Sent: 2022-04-14 17:55:03 (Thursday)
> > To: dev 
> > Cc:
> > Subject: [SPAM] Re: Re: Re: Re: Re: 回复:Re:[DISCUSS] FLIP-214 Support Advanced
> Function DDL
> >
> > I agree with Martijn.
> >
> > At least HDFS, S3 and OSS should be supported.
> >
> > Best,
> > Jingsong
> >
> > On Thu, Apr 14, 2022 at 4:46 PM Martijn Visser 
> wrote:
> > >
> > > Hi Ron,
> > >
> > > The FLIP mentions that the priority will be set to support HDFS as a
> > > resource provider. I'm concerned that we end up with a partially
> > > implemented FLIP which only supports local and HDFS and then we move
> on to
> > > other features, as we see happen with others. I would argue that we
> should
> > > not focus on one resource provider, but that at least S3 support is
> > > included in the same Flink release as HDFS support is.
> > >
> > > Best regards,
> > >
> > > Martijn Visser
> > > https://twitter.com/MartijnVisser82
> > > https://github.com/MartijnVisser
> > >
> > >
> > > On Thu, 14 Apr 2022 at 08:50, 刘大龙  wrote:
> > >
> > > > Hi, everyone
> > > >
> > > > First of all, thanks for the valuable suggestions received about this
> > > > FLIP. After some discussion, it looks like all concerns have been
> addressed
> > > > for now, so I will start a vote about this FLIP in two or three days
> later.
> > > > Also, further feedback is very welcome.
> > > >
> > > > Best,
> > > >
> > > > Ron
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: "刘大龙" 
> > > > > Sent: 2022-04-08 10:09:46 (Friday)
> > > > > To: dev@flink.apache.org
> > > > > Cc:
> > > > > Subject: Re: Re: Re: Re: 回复:Re:[DISCUSS] FLIP-214 Support Advanced
> Function
> > > > DDL
> > > > >
> > > > > Hi, Martijn
> > > > >
> > > > > Do you have any question about this FLIP? looking forward to your
> more
> > > > feedback.
> > > > >
> > > > > Best,
> > > > >
> > > > > Ron
> > > > >
> > > > >
> > > > > > -Original Message-
> > > > > > From: "刘大龙" 
> > > > > > Sent: 2022-03-29 19:33:58 (Tuesday)
> > > > > > To: dev@flink.apache.org
> > > > > > Cc:
> > > > > > Subject: Re: Re: Re: 回复:Re:[DISCUSS] FLIP-214 Support Advanced
> Function DDL
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > -Original Message-
> > > > > > > From: "Martijn Visser" 
> > > > > > > Sent: 2022-03-24 16:18:14 (Thursday)
> > > > > > > To: dev 
> > > > > > > Cc:
> > > > > > > Subject: Re: Re: 回复:Re:[DISCUSS] FLIP-214 Support Advanced Function
> DDL
> > > > > > >
> > > > > > > Hi Ron,
> > > > > > >
> > > > > > > Thanks for creating the FLIP. You're talking about both local
> and
> > > > remote
> > > > > > > resources. With regards to remote resources, how do you see
> this
> > > > work with
> > > > > > > Flink's filesystem abstraction? I did read in the FLIP that
> Hadoop
> > > > > > > dependencies are not packaged, but I would hope that we do
> that for
> > > > all
> > > > > > > filesystem implementation. I don't think it's a good idea to
> have
> > > > any tight
> > > > > > > coupling to file system implementations, especially if at some
> point
> > > > we
> > > > > > > could also externalize file system implementations (like we're
> doing
> > > > for
> > > > > > > connectors already). I think the FLIP would be better by not
> only
> > > > > > > referring to "Hadoop" as a remote resource provider, but a more
> > > > generic
> > > > > > > term since there are more options than Hadoop.
> > > > > > >
> > > > > > > I'm also thinking about security/operations implications:
> would it be
> > > > > > > possible for bad actor X to create a JAR that either
> influences other
> > > > > > > running jobs, leaks data or credentials or anything else? If
> so, I
> > > > think it
> > > > > > > would also be good to have an option to disable this feature
> > > > completely. I
> > > > > > > think there are roughly two types of companies who run Flink:
> those
> > > > who
> > > > > > > open it up for everyone to use (here the feature would be
> welcomed)
> > > > and
> > > > > > > those who need to follow certain minimum standards/have a more
> > > > closed Flink
> > > > > > > ecosystem). They usually want to validate a JAR upfront before
> > > > making it
> > > > > > > available, even at the expense of speed, because it gives them
> more
> > > > control
> > > > > > > over what will be running in their environment.
> > > > > > >
> > > > > > > Best regards,
> > > > > > >
> > > > > > > Martijn Visser
> > > > > > > https://twitter.com/MartijnVisser82
> > > > > > >
> > > > > > >
> > > > > > > On Wed, 23 Mar 2022 at 16:47, 刘大龙  wrote:
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > -Original Message-
> > > > > > > > > From: "Peter Huang" 
> > > > > > > > > Sent: 2022-03-23 

[jira] [Created] (FLINK-27306) Chinese translation for 1.15 roadmap update

2022-04-19 Thread Joe Moser (Jira)
Joe Moser created FLINK-27306:
-

 Summary: Chinese translation for 1.15 roadmap update
 Key: FLINK-27306
 URL: https://issues.apache.org/jira/browse/FLINK-27306
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Joe Moser


We just updated the roadmap for the 1.15 release. It would be great if a 
community member could provide the Chinese translation for the Chinese version 
of the page.

https://github.com/apache/flink-web/pull/527





Re: Re: [SPAM] Re: Re: Re: Re: Re: 回复:Re:[DISCUSS] FLIP-214 Support Advanced Function DDL

2022-04-19 Thread 刘大龙
Thanks again for all the discussion about this FLIP. I will open a vote tomorrow.

Best,
Ron


> -Original Message-
> From: "Jark Wu" 
> Sent: 2022-04-19 16:03:22 (Tuesday)
> To: dev 
> Cc: 
> Subject: Re: [SPAM] Re: Re: Re: Re: Re: 回复:Re:[DISCUSS] FLIP-214 Support Advanced 
> Function DDL
> 
> Thank Ron for updating the FLIP.
> 
> I think the updated FLIP has addressed Martijn's concern.
> I don't have other feedback. So +1 for a vote.
> 
> Best,
> Jark
> 
> On Fri, 15 Apr 2022 at 16:36, 刘大龙  wrote:
> 
> > Hi, Jingsong
> >
> > Thanks for your feedback, we will use the Flink FileSystem abstraction, so HDFS,
> > S3 and OSS will be supported.
> >
> > Best,
> >
> > Ron
> >
> > > -Original Message-
> > > From: "Jingsong Li" 
> > > Sent: 2022-04-14 17:55:03 (Thursday)
> > > To: dev 
> > > Cc:
> > > Subject: [SPAM] Re: Re: Re: Re: Re: 回复:Re:[DISCUSS] FLIP-214 Support Advanced
> > Function DDL
> > >
> > > I agree with Martijn.
> > >
> > > At least HDFS, S3 and OSS should be supported.
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Thu, Apr 14, 2022 at 4:46 PM Martijn Visser 
> > wrote:
> > > >
> > > > Hi Ron,
> > > >
> > > > The FLIP mentions that the priority will be set to support HDFS as a
> > > > resource provider. I'm concerned that we end up with a partially
> > > > implemented FLIP which only supports local and HDFS and then we move
> > on to
> > > > other features, as we see happen with others. I would argue that we
> > should
> > > > not focus on one resource provider, but that at least S3 support is
> > > > included in the same Flink release as HDFS support is.
> > > >
> > > > Best regards,
> > > >
> > > > Martijn Visser
> > > > https://twitter.com/MartijnVisser82
> > > > https://github.com/MartijnVisser
> > > >
> > > >
> > > > On Thu, 14 Apr 2022 at 08:50, 刘大龙  wrote:
> > > >
> > > > > Hi, everyone
> > > > >
> > > > > First of all, thanks for the valuable suggestions received about this
> > > > > FLIP. After some discussion, it looks like all concerns have been
> > addressed
> > > > > for now, so I will start a vote about this FLIP in two or three days
> > later.
> > > > > Also, further feedback is very welcome.
> > > > >
> > > > > Best,
> > > > >
> > > > > Ron
> > > > >
> > > > >
> > > > > > -Original Message-
> > > > > > From: "刘大龙" 
> > > > > > Sent: 2022-04-08 10:09:46 (Friday)
> > > > > > To: dev@flink.apache.org
> > > > > > Cc:
> > > > > > Subject: Re: Re: Re: Re: 回复:Re:[DISCUSS] FLIP-214 Support Advanced
> > Function
> > > > > DDL
> > > > > >
> > > > > > Hi, Martijn
> > > > > >
> > > > > > Do you have any question about this FLIP? looking forward to your
> > more
> > > > > feedback.
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Ron
> > > > > >
> > > > > >
> > > > > > > -Original Message-
> > > > > > > From: "刘大龙" 
> > > > > > > Sent: 2022-03-29 19:33:58 (Tuesday)
> > > > > > > To: dev@flink.apache.org
> > > > > > > Cc:
> > > > > > > Subject: Re: Re: Re: 回复:Re:[DISCUSS] FLIP-214 Support Advanced
> > Function DDL
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > -Original Message-
> > > > > > > > From: "Martijn Visser" 
> > > > > > > > Sent: 2022-03-24 16:18:14 (Thursday)
> > > > > > > > To: dev 
> > > > > > > > Cc:
> > > > > > > > Subject: Re: Re: 回复:Re:[DISCUSS] FLIP-214 Support Advanced Function
> > DDL
> > > > > > > >
> > > > > > > > Hi Ron,
> > > > > > > >
> > > > > > > > Thanks for creating the FLIP. You're talking about both local
> > and
> > > > > remote
> > > > > > > > resources. With regards to remote resources, how do you see
> > this
> > > > > work with
> > > > > > > > Flink's filesystem abstraction? I did read in the FLIP that
> > Hadoop
> > > > > > > > dependencies are not packaged, but I would hope that we do
> > that for
> > > > > all
> > > > > > > > filesystem implementation. I don't think it's a good idea to
> > have
> > > > > any tight
> > > > > > > > coupling to file system implementations, especially if at some
> > point
> > > > > we
> > > > > > > > could also externalize file system implementations (like we're
> > doing
> > > > > for
> > > > > > > > connectors already). I think the FLIP would be better by not
> > only
> > > > > > > > referring to "Hadoop" as a remote resource provider, but a more
> > > > > generic
> > > > > > > > term since there are more options than Hadoop.
> > > > > > > >
> > > > > > > > I'm also thinking about security/operations implications:
> > would it be
> > > > > > > > possible for bad actor X to create a JAR that either
> > influences other
> > > > > > > > running jobs, leaks data or credentials or anything else? If
> > so, I
> > > > > think it
> > > > > > > > would also be good to have an option to disable this feature
> > > > > completely. I
> > > > > > > > think there are roughly two types of companies who run Flink:
> > those
> > > > > who
> > > > > > > > open it up for everyone to use (here the feature would be
> > welcomed)
> > > > > and
> > > > > > > > those who need to follow certain minimum standards/have a more
> > > > > closed Flink
> > > > > > > > ecosystem). They usually want to validate a 

[jira] [Created] (FLINK-27307) Flink table store support append-only ingestion without primary keys.

2022-04-19 Thread Zheng Hu (Jira)
Zheng Hu created FLINK-27307:


 Summary: Flink table store support append-only ingestion without 
primary keys.
 Key: FLINK-27307
 URL: https://issues.apache.org/jira/browse/FLINK-27307
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Zheng Hu
 Fix For: table-store-0.2.0


Currently, the Flink table store only supports row data ingestion with defined 
primary keys. Those tables are quite suitable for maintaining CDC events or a 
Flink upsert stream. 

But in fact, there are many real scenarios which don't have the required 
primary keys, such as user click logs, where we only need to maintain the data 
like a Hive/Iceberg table. I mean we don't need to define primary keys, 
and we don't need to maintain all those rows in a sorted LSM. We only need to 
partition the rows into the correct partitions and maintain basic 
statistics to speed up queries. 





Re: [VOTE] Release 1.15.0, release candidate #3

2022-04-19 Thread Timo Walther

-1

We found a regression in one of the core SQL operations (i.e. CAST 
string<->binary) [1]. It would be better to fix this immediately to 
avoid confusion. Also, we can use the time to fix another small 
regression where a PR is also almost ready [2]. Both should be merged 
within the next few hours.


Sorry for any inconvenience,
Timo


[1] https://issues.apache.org/jira/browse/FLINK-27212
[2] https://issues.apache.org/jira/browse/FLINK-27263

On 18.04.22 at 12:45, Yang Wang wrote:

+1(non-binding)

- Verified signature and checksum
- Build image with flink-docker repo
- Run statemachine last-state upgrade via flink-kubernetes-operator which
could verify the following aspects
 - Native K8s integration
 - Multiple Component Kubernetes HA services
- Run Flink application with 5 TM and ZK HA enabled on YARN
 - Verify job result store


Best,
Yang

Guowei Ma wrote on Mon, 18 Apr 2022 at 15:51:


+1(binding)

- Verified the signature and checksum of the release binary
- Run the SqlClient example
- Run the WordCount example
- Compile from the source and success

Best,
Guowei


On Mon, Apr 18, 2022 at 11:13 AM Xintong Song 
wrote:


+1 (binding)

- verified signature and checksum
- build from source
- run example jobs in a standalone cluster, everything looks expected


Thank you~

Xintong Song



On Fri, Apr 15, 2022 at 12:56 PM Yun Gao 
wrote:


Hi everyone,

Please review and vote on the release candidate #3 for the version 1.15.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be
deployed to dist.apache.org [2], which are signed with the key with
fingerprint CBE82BEFD827B08AFA843977EDBF922A7BC84897 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.15.0-rc3" [5],
* website pull request listing the new release and adding the announcement
blog post [6].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Joe, Till and Yun Gao

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12350442
[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.15.0-rc3/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1497/
[5] https://github.com/apache/flink/releases/tag/release-1.15.0-rc3/
[6] https://github.com/apache/flink-web/pull/526







[jira] [Created] (FLINK-27308) Update the Hadoop implementation for filesystems to 3.3.2

2022-04-19 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-27308:
--

 Summary: Update the Hadoop implementation for filesystems to 3.3.2
 Key: FLINK-27308
 URL: https://issues.apache.org/jira/browse/FLINK-27308
 Project: Flink
  Issue Type: Technical Debt
  Components: FileSystems
Reporter: Martijn Visser


Flink currently uses Hadoop version 3.2.2 for the Flink filesystem 
implementations. Upgrading this to version 3.3.2 would provide users with the 
features listed in HADOOP-17566.





[jira] [Created] (FLINK-27309) Allow to load default flink configs in the k8s operator dynamically

2022-04-19 Thread Biao Geng (Jira)
Biao Geng created FLINK-27309:
-

 Summary: Allow to load default flink configs in the k8s operator 
dynamically
 Key: FLINK-27309
 URL: https://issues.apache.org/jira/browse/FLINK-27309
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Biao Geng


The default configs used by the k8s operator are currently saved in the 
/opt/flink/conf dir in the k8s operator pod and are loaded only once, when 
the operator is created.
Since the Flink k8s operator can be a long-running service and users may want 
to modify the default configs (e.g. the metric reporter sampling interval) for 
newly created deployments, it may be better to load the default configs 
dynamically (i.e. parse the latest /opt/flink/conf/flink-conf.yaml) in the 
{{ReconcilerFactory}} and {{ObserverFactory}}, instead of redeploying the 
operator.
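The idea can be sketched in plain Java (the "key: value" parsing below is deliberately naive and the names are hypothetical; the real operator would presumably use Flink's configuration loading utilities): re-read the mounted conf file on every reconcile loop instead of caching it once at startup.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class DynamicDefaultConfig {
    /** Naive flink-conf.yaml-style parsing: one "key: value" per line (illustration only). */
    static Map<String, String> loadDefaults(Path confFile) throws IOException {
        Map<String, String> conf = new HashMap<>();
        for (String line : Files.readAllLines(confFile)) {
            int i = line.indexOf(':');
            if (i > 0 && !line.startsWith("#")) {
                conf.put(line.substring(0, i).trim(), line.substring(i + 1).trim());
            }
        }
        return conf;
    }

    public static void main(String[] args) throws IOException {
        Path conf = Files.createTempFile("flink-conf", ".yaml");
        Files.writeString(conf, "metrics.reporter.interval: 10 SECONDS\n");

        // First reconcile loop: reads the current defaults.
        Map<String, String> c1 = loadDefaults(conf);
        if (!"10 SECONDS".equals(c1.get("metrics.reporter.interval"))) throw new AssertionError();

        // Operator keeps running; an admin edits the mounted config...
        Files.writeString(conf, "metrics.reporter.interval: 60 SECONDS\n");

        // ...and the next reconcile loop picks it up without an operator restart.
        Map<String, String> c2 = loadDefaults(conf);
        if (!"60 SECONDS".equals(c2.get("metrics.reporter.interval"))) throw new AssertionError();
        Files.deleteIfExists(conf);
        System.out.println("ok");
    }
}
```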






[jira] [Created] (FLINK-27310) FlinkOperatorITCase failure due to JobManager replicas less than one

2022-04-19 Thread Usamah Jassat (Jira)
Usamah Jassat created FLINK-27310:
-

 Summary: FlinkOperatorITCase failure due to JobManager replicas 
less than one
 Key: FLINK-27310
 URL: https://issues.apache.org/jira/browse/FLINK-27310
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Reporter: Usamah Jassat


The FlinkOperatorITCase test is currently failing, even in the CI pipeline:

{code:java}
INFO] Running org.apache.flink.kubernetes.operator.FlinkOperatorITCase
Error:  Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
3.178 s <<< FAILURE! - in 
org.apache.flink.kubernetes.operator.FlinkOperatorITCase
Error:  org.apache.flink.kubernetes.operator.FlinkOperatorITCase.test  Time 
elapsed: 2.664 s  <<< ERROR!
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: 
POST at: 
https://192.168.49.2:8443/apis/flink.apache.org/v1beta1/namespaces/flink-operator-test/flinkdeployments.
 Message: Forbidden! User minikube doesn't have permission. admission webhook 
"vflinkdeployments.flink.apache.org" denied the request: JobManager replicas 
should not be configured less than one..
at 
flink.kubernetes.operator@1.0-SNAPSHOT/org.apache.flink.kubernetes.operator.FlinkOperatorITCase.test(FlinkOperatorITCase.java:86){code}

While the test is failing, the CI run still passes; the CI should also be 
fixed so that it fails on this test failure.





[jira] [Created] (FLINK-27311) Rocksdb mapstate behaves unexpectedly

2022-04-19 Thread Jonathan diamant (Jira)
Jonathan diamant created FLINK-27311:


 Summary: Rocksdb mapstate behaves unexpectedly 
 Key: FLINK-27311
 URL: https://issues.apache.org/jira/browse/FLINK-27311
 Project: Flink
  Issue Type: Technical Debt
  Components: Runtime / State Backends, Stateful Functions
Affects Versions: 1.14.4
 Environment: Kubernetes
Reporter: Jonathan diamant
 Fix For: 1.14.5


We use the RocksDB backend for our state and we experience unexpected behavior. 
The state we use is MapState> and when a restart occurs 
and the state is recovered from the last checkpoint, it seems that not 
all of the list corresponding to a certain key has been loaded; only after all 
the state has been recovered does it behave as expected. 

Our guess is that while recovering the state, RocksDB recovers the state in 
chunks and loads entries of the map not as a whole, even though we expect that 
for every key, the value (a list) is loaded as one object at once.





[jira] [Created] (FLINK-27312) Implicit casting should be introduced with Rules during planning

2022-04-19 Thread Marios Trivyzas (Jira)
Marios Trivyzas created FLINK-27312:
---

 Summary: Implicit casting should be introduced with Rules during 
planning
 Key: FLINK-27312
 URL: https://issues.apache.org/jira/browse/FLINK-27312
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Reporter: Marios Trivyzas


Currently we do implicit casting directly in the code generation, and the plan 
is not in sync with what is happening under the hood (in the generated code).

Ideally, implicit casting should be introduced where necessary at an earlier 
stage, during planning, with a Rule, so that one can see exactly from the 
produced plan which operations are introduced automatically (implicit casts) 
to make the SQL executable.

See for example: https://issues.apache.org/jira/browse/FLINK-27247
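A toy illustration of the proposed approach (this is not the planner's actual Rule API; the mini expression tree below is entirely made up): a rewrite pass inserts an explicit CAST node wherever the operand types of an equality disagree, so the cast becomes visible in the plan itself instead of being hidden in generated code.

```java
public class ImplicitCastRule {
    // Minimal expression node: an operator/literal name, a result type, and children.
    record Expr(String op, String type, Expr... children) {
        @Override public String toString() {
            if (children.length == 0) return op;
            StringBuilder sb = new StringBuilder(op).append("(");
            for (int i = 0; i < children.length; i++) {
                if (i > 0) sb.append(", ");
                sb.append(children[i]);
            }
            return sb.append(")").toString();
        }
    }

    /** Toy "rule": insert a CAST node when the operand types of '=' disagree. */
    static Expr apply(Expr e) {
        if (e.op().equals("=") && !e.children()[0].type().equals(e.children()[1].type())) {
            Expr casted = new Expr("CAST_TO_" + e.children()[1].type(),
                    e.children()[1].type(), e.children()[0]);
            return new Expr("=", "BOOLEAN", casted, e.children()[1]);
        }
        return e;
    }

    public static void main(String[] args) {
        // Models "select * from src where x = true" with x of type INT.
        Expr x = new Expr("x", "INT");
        Expr lit = new Expr("true", "BOOLEAN");
        Expr plan = apply(new Expr("=", "BOOLEAN", x, lit));
        // The implicit cast now shows up in the (toy) plan:
        if (!plan.toString().equals("=(CAST_TO_BOOLEAN(x), true)")) throw new AssertionError(plan);
        System.out.println(plan);
    }
}
```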





[jira] [Created] (FLINK-27313) Flink Datadog reporter add custom and dynamic tags to custom metrics when job running

2022-04-19 Thread Huibo Peng (Jira)
Huibo Peng created FLINK-27313:
--

 Summary: Flink Datadog reporter add custom and dynamic tags to 
custom metrics when job running
 Key: FLINK-27313
 URL: https://issues.apache.org/jira/browse/FLINK-27313
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Reporter: Huibo Peng


We use Datadog to receive our Flink jobs' metrics. We found a limitation: Flink 
cannot add custom, dynamic tags to metrics while a job is running, other than 
setting them in flink-conf.yaml. This feature is important to us because we 
need these tags to filter metrics.





Re: Re: [DISCUSS] FLIP-216 Decouple Hive connector with Flink planner

2022-04-19 Thread Jark Wu
Hi Martijn,

I have discussed this with Yuxia offline and improved the design again.

*Here are the improvements:*
1) all the public APIs stay in the flink-table-common or
flink-table-api-java modules,
2) rename "flink-table-planner-spi" to "flink-table-calcite-bridge", which
only contains the Calcite dependency
and internal classes (CalciteContext & CalciteQueryOperation). This module
is only used by parser plugins
to interact with Calcite APIs.

*And here is the migration plan:*
0) expose the parser public APIs and introduce "flink-table-calcite-bridge"
1) make the Hive connector planner-free (i.e. only depend on
"flink-table-calcite-bridge"). ~ 1 week
2) migrate the RelNode translation to the Operation tree one by one. ~ several
months, may cross versions
3) drop the "flink-table-calcite-bridge" module
4) all the other 3rd-party dialects are not affected.

I think this can address your concerns about the final state of the APIs,
and it also allows us a smooth migration path
that guarantees high-quality code.

What do you think about this?

Best,
Jark


On Wed, 13 Apr 2022 at 22:12, 罗宇侠(莫辞)
 wrote:

> Hi all,
> Sorry for the late reply to this thread.
> Decoupling the Hive connector is actually mainly about decoupling the Hive
> dialect. So, I think it's a good time to introduce a pluggable dialect
> mechanism for Flink and make the Hive dialect the first one.
> Based on this point, I have updated FLIP-216 (Introduce pluggable
> dialect and plan for migrating Hive dialect)[1].
> The overview of the FLIP is as follows:
>
> 1: Introduce a slim module with limited public interfaces for the pluggable
> dialect. The public interfaces are in their final state, and other dialects should
> implement the interfaces and follow the style of converting a SQL statement
> to Flink's Operation tree.
>
> 2: Plan for migrating the Hive dialect. For implementing the Hive dialect, it's
> also expected to convert SQL statements to Flink's Operation tree. But
> unfortunately, the current implementation converts SQL statements to
> Calcite RelNodes. It's hard to migrate it to the Operation tree in one shot;
> it'll be better to migrate it step by step. So, the first step is to still
> keep the Calcite dependency and introduce an internal interface called
> CalciteContext to create Calcite RelNodes; then we can decouple from
> flink-table-planner. As a result, we can move the Hive connector out of the Flink
> repository. The second step is to migrate it to the Operation tree, so that we
> can drop the Calcite dependency.
>
> For some questions from previous emails:
> >> What about Scala? Is flink-table-planner-spi going to be a scala module
> with the related suffix?
> flink-table-planner-spi is going to be a Scala-free module; for the Hive
> dialect, it's fine not to expose a couple of types
> which are implemented with Scala.
> >> Are you sure exposing the Calcite interfaces is going to be enough?
> I'm quite sure. As for specific methods like
> FlinkTypeFactory#toLogicalType, I think we can implement a similar
> type converter in the Hive connector itself.
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-216%3A++Introduce+pluggable+dialect+and+plan+for+migrate+Hive+dialect
>
> Best regards,
> Yuxia--
> From: Martijn Visser
> Date: 2022-03-31 17:09:15
> To: dev
> Cc: 罗宇侠(莫辞)
> Subject: Re: [DISCUSS] FLIP-216 Decouple Hive connector with Flink planner
>
> Hi all,
>
> Thanks for opening this discussion. I agree with Francesco that we should
> go for the best solution instead of another bandaid or intermediate
> solution. I think there's actually consensus for that, given the argument
> provided by Yuxia:
>
> > The first way is the ideal way and we should go in that direction. But it
> will take much effort, for it requires rewriting all the code for the Hive
> dialect, and it's hard to do it in one shot. And given we want to move out
> the Hive connector in 1.16, it's more practical to decouple first, and then
> migrate it to the operation tree.
>
> Why should we first invest time in another intermediate solution and then
> spend time afterwards to actually get to the proper solution? I would
> propose to:
>
> - Remove Hive connector for version 1.*, 2.1.* and 2.2* in Flink 1.16
> - Upgrade to the latest supported Hive 2.3.* and 3.1.* in Flink 1.16
> - Get as much work done as possible to decouple the Hive connector from
> Flink in 1.16.
>
> There's customer value in this approach, because we'll support the latest
> supported Hive versions. If customers need support for older Hive versions,
> they can still use older Flink versions. There is also community value in
> this approach, because we're actively working on making our codebase
> maintainable. If the entire decoupling is not possible, then we won't move
> the Hive 2.3.* and 3.1.* connector out in Flink 1.16, but in Flink 1.17.
> Support for the new Hive version 4.* could then also be added in Flink 1.17
> (we should not add support for newer versions of Hive until this

Re: [DISCUSS] FLIP-220: Temporal State

2022-04-19 Thread Nico Kruber
Hi all,
I have read the discussion points from the last emails and would like to add 
my two cents on what I believe are the remaining points to solve:

1. Do we need a TemporalValueState?

I guess, this all boils down to dealing with duplicates / values for the same 
timestamp. Either you always have to account for them and thus always have to 
store a list anyway, or you only need to join with "the latest" (or the only) 
value for a given timestamp and get a nicer API and lower overhead for that 
use case.
At the simplest, you can make the assumption that there is only a single value 
for each timestamp by contract, e.g. by increasing the timestamp precision and 
interpreting timestamps as nanoseconds; maybe milliseconds are already good 
enough. If that contract breaks, however, you will get undefined 
behaviour.
The TemporalRowTimeJoinOperator, for example, currently just assumes that 
there is only a single value on the right side of the join (rightState) and I 
believe many use cases can make that assumption or otherwise you'd have to 
define the expected behaviour for multiple values at the same timestamp, e.g. 
"join with the most recent value at the time of the left side and if there are 
multiple values, choose X".

I lean towards having a ValueState implementation as well (in addition to 
lists).
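For what it's worth, the single-value-per-timestamp contract makes the lookup API quite simple; here is a minimal sketch over a NavigableMap (names and signatures are mine, not the FLIP's):

```java
import java.util.Map;
import java.util.TreeMap;

public class TemporalValueSketch {
    // One value per timestamp by contract; a later write to the same timestamp wins.
    private final TreeMap<Long, String> state = new TreeMap<>();

    void update(long ts, String v) { state.put(ts, v); }

    /** "Most recent value at or before ts" -- the typical temporal-join lookup. */
    String valueAtOrBefore(long ts) {
        Map.Entry<Long, String> e = state.floorEntry(ts);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        TemporalValueSketch s = new TemporalValueSketch();
        s.update(10L, "rate=1.1");
        s.update(20L, "rate=1.3");
        s.update(20L, "rate=1.4"); // duplicate timestamp: last write wins
        if (!"rate=1.1".equals(s.valueAtOrBefore(15L))) throw new AssertionError();
        if (!"rate=1.4".equals(s.valueAtOrBefore(25L))) throw new AssertionError();
        if (s.valueAtOrBefore(5L) != null) throw new AssertionError();
        System.out.println("ok");
    }
}
```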


2. User-facing API (Iterators vs. valueAtOr[Before|After])

I like the iterable-based APIs that David M was proposing, i.e.
- Iterable> readRange(long minTimestamp, long limitTimestamp);
- void clearRange(long minTimestamp, long limitTimestamp);

However, I find Iterables rather cumbersome to work with if you actually only 
need a single value, e.g. the most recent one. 
For iterating over a range of values, on the other hand, they feel more natural 
to me than our proposal.

Actually, if we generalise the key type (see below), we may also need to offer 
additional value[Before|After] functions to cover "+1" iterations where we 
cannot simply add 1 as we do now.

(a) How about offering both Iterables and 
value[AtOrBefore|AtOrAfter|Before|After]?
This would be similar to what NavigableMap [2] is offering, but with a more 
explicit API than "ceiling", "floor", ...
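
For illustration, java.util.NavigableMap already covers both access styles, so 
the proposed methods map onto it roughly like this (stand-in only; a state 
backend would serve this from its own sorted storage, not a TreeMap):

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class NavigableMapAnalogy {
    public static void main(String[] args) {
        NavigableMap<Long, String> state = new TreeMap<>();
        state.put(10L, "x");
        state.put(20L, "y");
        state.put(30L, "z");

        // "valueAtOrBefore(25)" corresponds to floorEntry(25)
        System.out.println(state.floorEntry(25L).getValue());   // y
        // "valueAfter(10)" corresponds to higherEntry(10)
        System.out.println(state.higherEntry(10L).getValue());  // y
        // "readRange(10, 30)" with exclusive upper bound: subMap(10, true, 30, false)
        System.out.println(state.subMap(10L, true, 30L, false).keySet()); // [10, 20]
        // "clearRange(10, 30)": clear the same view
        state.subMap(10L, true, 30L, false).clear();
        System.out.println(state.keySet()); // [30]
    }
}
```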

(b) Our API proposal currently also allows iterating backwards, which is not 
covered by the readRange proposal - we could, however, just do that if 
minTimestamp > limitTimestamp. What do you think?

(c) When implementing the iterators, I actually also see two different modes 
which may differ in performance: I call them iteration with eager vs. lazy 
value retrieval. Eager retrieval may fetch all values in a range at once 
and make them available in memory, e.g. for smaller data sets, similar to what 
TemporalRowTimeJoinOperator is doing for the right side of the join. This can 
spare a lot of Java<->JNI calls and let RocksDB iterate only once (as long 
as things fit into memory). Lazy retrieval would fetch results one-by-one.
-> We could set one as default and allow the user to override that behaviour.
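
A toy sketch of the two modes (again with a TreeMap standing in for the 
backend; in the RocksDB case the eager variant would be the one that batches 
the JNI traversal):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class EagerVsLazy {
    // Eager: copy the whole range into memory in one backend traversal.
    static <V> List<V> readRangeEager(NavigableMap<Long, V> backend, long min, long limit) {
        return new ArrayList<>(backend.subMap(min, true, limit, false).values());
    }

    // Lazy: hand out a view-backed iterator that fetches entries one by one.
    static <V> Iterator<V> readRangeLazy(NavigableMap<Long, V> backend, long min, long limit) {
        return backend.subMap(min, true, limit, false).values().iterator();
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> backend = new TreeMap<>();
        backend.put(1L, "a");
        backend.put(2L, "b");
        backend.put(3L, "c");

        System.out.println(readRangeEager(backend, 1L, 3L)); // [a, b]

        Iterator<String> it = readRangeLazy(backend, 1L, 3L);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```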


3. Should we generalise the Temporal***State to offer arbitrary key types and 
not just Long timestamps?

@Yun Tang: can you describe in more detail where you think this would be 
needed for SQL users? I don't quite get how this would be beneficial. The 
example you linked doesn't quite show the same behaviour.

Other than this, I could see that you can leverage such a generalisation for 
arbitrary joins between, for example, IDs and ID ranges which don't have a 
time component attached to them. Exposing this shouldn't be too difficult 
(the functionality has to exist anyway, but would otherwise be buried in 
Flink's internals); we'd just have to find suitable names.

(a) I don't think TemporalListState is actually SortedMapState<Long, List<V>>, 
because we need efficient "add to list" primitives which cannot 
easily be made available with a single generic SortedMapState...

(b) So the most expressive (yet kind-of ugly) names could be
- SortedMapState<K, V>
- SortedMapOfListsState<K, V>

(c) For both of these, we could then re-use the existing MapStateDescriptor to 
define key and value/list-element types and require that the key type / 
serializer implements a certain RetainingSortOrderSerializer interface (and 
think about a better name for this) which defines the contract that the binary 
sort order is the same as the Java Object one.
-> that can also be verified at runtime to fail early.
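
The runtime check could look roughly like this (my sketch: a hypothetical 
order-preserving long serializer - sign bit flipped, big-endian - plus the 
unsigned lexicographic comparison a byte-oriented backend would use):

```java
import java.nio.ByteBuffer;

public class SortOrderCheck {
    // Serialize a long so that unsigned lexicographic byte order equals
    // numeric order: flip the sign bit, then write big-endian.
    static byte[] serialize(long v) {
        return ByteBuffer.allocate(8).putLong(v ^ Long.MIN_VALUE).array();
    }

    // Unsigned lexicographic comparison, as a byte-oriented backend sorts keys.
    static int compareBytes(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    // The contract to verify: binary order must agree with Java's natural order.
    static boolean preservesOrder(long x, long y) {
        return Integer.signum(compareBytes(serialize(x), serialize(y)))
                == Integer.signum(Long.compare(x, y));
    }

    public static void main(String[] args) {
        long[] samples = {Long.MIN_VALUE, -1L, 0L, 1L, 42L, Long.MAX_VALUE};
        for (long x : samples) {
            for (long y : samples) {
                if (!preservesOrder(x, y)) {
                    throw new AssertionError("sort order broken for " + x + ", " + y);
                }
            }
        }
        System.out.println("binary sort order matches natural order");
    }
}
```

Running such sampled checks when the state descriptor is registered would 
surface a non-conforming serializer early rather than during a range scan.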


4. ChangelogStateBackend: we don't think this needs special attention - it is 
just delegating to the other backends anyway and these methods are already 
adapted in our POC code


@David M, Yun Tang: let me/us know what you think about these proposals



Nico


[2] https://docs.oracle.com/javase/8/docs/api/java/util/NavigableMap.html

On Thursday, 14 April 2022 14:15:53 CEST Yun Tang wrote:
> Hi David Anderson,
> 
> I doubt that there is no motivating use case for this generalization to
> SortedMapState. From our internal stats, SQL users would use many more cases
> of min/max with 

[jira] [Created] (FLINK-27314) Enable active resource management (reactive scaling) in Flink Kubernetes Operator

2022-04-19 Thread Fuyao Li (Jira)
Fuyao Li created FLINK-27314:


 Summary: Enable active resource management (reactive scaling) in 
Flink Kubernetes Operator
 Key: FLINK-27314
 URL: https://issues.apache.org/jira/browse/FLINK-27314
 Project: Flink
  Issue Type: New Feature
Reporter: Fuyao Li


Generally, this is a low-priority task for now.

Flink exposes some system-level metrics; the Flink Kubernetes operator could 
watch these metrics and rescale automatically based on checkpoints (similar to 
standalone reactive mode) and a rescale policy configured by users.

The rescale behavior can be based on CPU utilization or memory utilization.
 # Before rescaling, the Flink operator should check whether the cluster has enough 
resources; if not, the rescaling will be aborted.
 # We can create an additional field to support this feature. The fields below are 
just a rough suggestion.

{code:java}
reactiveScaling:
  enabled: boolean
  scaleMetric: enum ["CPU", "MEM"]
  scaleDownThreshold:
  scaleUpThreshold:
  minimumLimit:
  maximumLimit:
  increasePolicy:
{code}
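
As an illustration only (field names taken from the rough suggestion above; 
the decision logic is hypothetical, not an existing operator feature), the 
thresholds could drive a decision like:

```java
/** Hypothetical scaling decision based on the suggested reactiveScaling fields. */
public class ScaleDecision {
    enum Action { SCALE_UP, SCALE_DOWN, NONE }

    static Action decide(double metricUtilization, double scaleUpThreshold,
                         double scaleDownThreshold, int currentParallelism,
                         int minimumLimit, int maximumLimit) {
        if (metricUtilization > scaleUpThreshold && currentParallelism < maximumLimit) {
            return Action.SCALE_UP;
        }
        if (metricUtilization < scaleDownThreshold && currentParallelism > minimumLimit) {
            return Action.SCALE_DOWN;
        }
        // NONE also covers the abort case: limits reached or, in the real
        // operator, not enough cluster resources for the target parallelism.
        return Action.NONE;
    }

    public static void main(String[] args) {
        System.out.println(decide(0.9, 0.8, 0.2, 4, 1, 8)); // SCALE_UP
        System.out.println(decide(0.1, 0.8, 0.2, 4, 1, 8)); // SCALE_DOWN
        System.out.println(decide(0.5, 0.8, 0.2, 4, 1, 8)); // NONE
    }
}
```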
 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27315) Fix the demo of MemoryStateBackendMigration

2022-04-19 Thread Echo Lee (Jira)
Echo Lee created FLINK-27315:


 Summary: Fix the demo of MemoryStateBackendMigration
 Key: FLINK-27315
 URL: https://issues.apache.org/jira/browse/FLINK-27315
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.14.4
Reporter: Echo Lee
 Fix For: 1.15.0


There is a problem with the MemoryStateBackendMigration demo under the [state 
backends 
doc|https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/state_backends/#code-configuration]:

JobManagerStateBackend should be changed to JobManagerCheckpointStorage.
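
For context, a corrected snippet would look roughly like this (sketch, 
assuming Flink 1.14+ where the legacy MemoryStateBackend is split into state 
backend and checkpoint storage; requires the Flink runtime on the classpath):

```java
import org.apache.flink.runtime.state.hashmap.HashMapStateBackend;
import org.apache.flink.runtime.state.storage.JobManagerCheckpointStorage;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MigrationExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // MemoryStateBackend's two roles are now separate:
        // HashMapStateBackend holds working state on the heap,
        // JobManagerCheckpointStorage keeps checkpoints on the JobManager.
        env.setStateBackend(new HashMapStateBackend());
        env.getCheckpointConfig().setCheckpointStorage(new JobManagerCheckpointStorage());
    }
}
```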



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: Re: [VOTE] Release 1.15.0, release candidate #3

2022-04-19 Thread Yun Gao
Thanks Timo for checking and fixing this issue! Then this RC should be
officially canceled and I'll post the next RC soon. 

Best,
Yun Gao


--
Sender: Timo Walther
Date: 2022/04/19 16:46:51
Recipient:
Subject: Re: [VOTE] Release 1.15.0, release candidate #3

-1

We found a regression in one of the core SQL operations (i.e. CAST 
string<->binary) [1]. It would be better to fix this immediately to 
avoid confusion. Also, we can use the time to fix another small 
regression where a PR is also almost ready [2]. Both should be merged in 
the next hours.

Sorry for any inconvenience,
Timo


[1] https://issues.apache.org/jira/browse/FLINK-27212
[2] https://issues.apache.org/jira/browse/FLINK-27263

Am 18.04.22 um 12:45 schrieb Yang Wang:
> +1(non-binding)
>
> - Verified signature and checksum
> - Build image with flink-docker repo
> - Run statemachine last-state upgrade via flink-kubernetes-operator which
> could verify the following aspects
>  - Native K8s integration
>  - Multiple Component Kubernetes HA services
> - Run Flink application with 5 TM and ZK HA enabled on YARN
>  - Verify job result store
>
>
> Best,
> Yang
>
> Guowei Ma wrote on Mon, 18 Apr 2022 at 15:51:
>
>> +1(binding)
>>
>> - Verified the signature and checksum of the release binary
>> - Run the SqlClient example
>> - Run the WordCount example
>> - Compile from the source and success
>>
>> Best,
>> Guowei
>>
>>
>> On Mon, Apr 18, 2022 at 11:13 AM Xintong Song 
>> wrote:
>>
>>> +1 (binding)
>>>
>>> - verified signature and checksum
>>> - build from source
>>> - run example jobs in a standalone cluster, everything looks expected
>>>
>>>
>>> Thank you~
>>>
>>> Xintong Song
>>>
>>>
>>>
>>> On Fri, Apr 15, 2022 at 12:56 PM Yun Gao 
>>> wrote:
>>>
 Hi everyone,

 Please review and vote on the release candidate #3 for the version
>>> 1.15.0,
 as follows:
 [ ] +1, Approve the release
 [ ] -1, Do not approve the release (please provide specific comments)

 The complete staging area is available for your review, which includes:
 * JIRA release notes [1],
 * the official Apache source release and binary convenience releases to be
 deployed to dist.apache.org [2],
 which are signed with the key with fingerprint
 CBE82BEFD827B08AFA843977EDBF922A7BC84897 [3],
 * all artifacts to be deployed to the Maven Central Repository [4],
 * source code tag "release-1.15.0-rc3" [5],
 * website pull request listing the new release and adding announcement blog
 post [6].

 The vote will be open for at least 72 hours. It is adopted by majority
 approval, with at least 3 PMC affirmative votes.

 Thanks,
 Joe, Till and Yun Gao
 [1]

>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12350442
 [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.15.0-rc3/
 [3] https://dist.apache.org/repos/dist/release/flink/KEYS
 [4]

>> https://repository.apache.org/content/repositories/orgapacheflink-1497/
 [5] https://github.com/apache/flink/releases/tag/release-1.15.0-rc3/
 [6] https://github.com/apache/flink-web/pull/526