[jira] [Created] (FLINK-28216) Hadoop S3FileSystemFactory does not honor fs.s3.impl

2022-06-23 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-28216:
-

 Summary: Hadoop S3FileSystemFactory does not honor fs.s3.impl
 Key: FLINK-28216
 URL: https://issues.apache.org/jira/browse/FLINK-28216
 Project: Flink
  Issue Type: Improvement
  Components: FileSystems
Affects Versions: 1.15.0
Reporter: Prabhu Joseph


Currently Hadoop S3FileSystemFactory has hardcoded the S3 FileSystem 
implementation to S3AFileSystem. It does not allow to configure any other 
implementation specified in fs.s3.impl. Suggest to read the fs.s3.impl from 
Hadoop Config loaded and use the same.

 
{code:java}
@Override
protected org.apache.hadoop.fs.FileSystem createHadoopFileSystem() {
return new S3AFileSystem();
}{code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28217) Bump mysql-connector-java from 8.0.27 to 8.0.28

2022-06-23 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-28217:
--

 Summary: Bump mysql-connector-java from 8.0.27 to 8.0.28
 Key: FLINK-28217
 URL: https://issues.apache.org/jira/browse/FLINK-28217
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / JDBC
Affects Versions: 1.16.0
Reporter: Martijn Visser
Assignee: Martijn Visser


We should bump our test dependency for {{mysql-connector-java}} to make sure 
that we support the latest version of MySQL



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28218) Add readiness and liveliness probe to k8s operator

2022-06-23 Thread Jira
Márton Balassi created FLINK-28218:
--

 Summary: Add readiness and liveliness probe to k8s operator
 Key: FLINK-28218
 URL: https://issues.apache.org/jira/browse/FLINK-28218
 Project: Flink
  Issue Type: New Feature
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.0.0
Reporter: Márton Balassi


Let us add readiness and liveness probes to the operator, we could expose 
relevant HTTP endpoints for this.

I have observed cases recently where even though the operator pod was running 
it could not do its job due to missing rolebindings or misconfigured dynamic 
namespaces.

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28219) Refactor TableStoreCatalog: Introduce a dedicated Catalog for table store

2022-06-23 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-28219:


 Summary: Refactor TableStoreCatalog: Introduce a dedicated Catalog 
for table store
 Key: FLINK-28219
 URL: https://issues.apache.org/jira/browse/FLINK-28219
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.2.0


We currently have developed a Flink's Catalog.
If we expose this Catalog directly to other connector developers, it is not 
good and there will be many unsupported interfaces and capabilities.

So here we create a tablestore dedicated catalog.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [VOTE] FLIP-221: Abstraction for lookup source and metric

2022-06-23 Thread Martijn Visser
Great work on the FLIP.

+1 (binding)

Op do 23 jun. 2022 om 08:07 schreef Leonard Xu :

> Thanks Qingsheng for driving this work.
>
> +1(binding)
>
>
> Best,
> Leonard
> > 2022年6月23日 下午1:37,Jark Wu  写道:
> >
> > +1 (binding)
> >
> > Best,
> > Jark
> >
> > On Thu, 23 Jun 2022 at 12:49, Qingsheng Ren  wrote:
> >
> >> Hi devs,
> >>
> >> I’d like to start a vote thread for FLIP-221: Abstraction for lookup
> >> source and metric. You can find the discussion thread in [2]*.
> >>
> >> The vote will be open for at least 72 hours unless there is an objection
> >> or not enough binding votes.
> >>
> >> Thanks everyone participating in the discussion!
> >>
> >> [1]
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-221%3A+Abstraction+for+lookup+source+cache+and+metric
> >> [2] https://lists.apache.org/thread/9c0fbgkofkbfdr5hvs62m0cxd2bkgwho
> >> [*] The link to the discussion thread might not include all emails.
> Please
> >> search the Apache email archive with keyword "FLIP-221" to get all
> >> discussion histories.
> >>
> >> Best regards,
> >> Qingsheng
>
>


Re: [VOTE] FLIP-241: Completed Jobs Information Enhancement

2022-06-23 Thread Xintong Song
+1 (binding)

Best,

Xintong



On Thu, Jun 23, 2022 at 1:49 PM Yangze Guo  wrote:

> +1 (binding)
>
> Best,
> Yangze Guo
>
> On Thu, Jun 23, 2022 at 1:07 PM junhan yang 
> wrote:
> >
> > Hi everyone,
> >
> > Thanks for the feedbacks on the discussion thread[1]. I would like to
> start
> > a vote thread here for FLIP-241: Completed Jobs Information
> Enhancement[2].
> >
> > The vote will last for at least 72 hours unless there is an objection, I
> > will try to close it by *next Tuesday* if we receive sufficient votes
> until
> > then.
> >
> > Thank you again for your participation in this FLIP discussion.
> >
> > [1] https://lists.apache.org/thread/qycqmxbh37b5qzs72y110rp8457kkxkb
> > [2] https://cwiki.apache.org/confluence/x/dRD1D
> >
> > Best regards,
> > Junhan
>


Re: [DISCUSS] FLIP-242: Introduce configurable CongestionControlStrategy for Async Sink

2022-06-23 Thread Piotr Nowojski
Hi Hong,

As I understand it, this effort is about replacing hardcoded
`AIMDRateLimitingStrategy` with something more flexible? +1 for the general
idea.

If I read it correctly, there are basically three issues:
1. (what) `AIMDRateLimitingStrategy` is only able to limit the size of all
in-flight records across all batches, not the amount of in-flight batches.
2. (when) Currently `AsyncSinkWriter` decides whether and when to scale up
or down. You would like it to be customisable behaviour.
3. (how) The actual `scaleUp()` and `scaleDown()` behaviours are hardcoded,
and this could be customised as well.

Right? Assuming so, I have one main question about the design. Why are you
trying to split it into three different interfaces?
Can not we have a single public interface `RateLimitingStrategy` instead of
three that you proposed, that would have methods like:

`bool shouldRateLimit()` / `bool shouldBlock()`
`void startedRequest(RequestInfo requestInfo)`
`void completedRequest(RequestInfo requestInfo)`

where  `RequestInfo` is a simple POJO similar to `CongestionControlInfo`
that you suggested

public class RequestInfo {
  int failedMessages;
  int batchSize;
  long requestStartTime;
}

I think it would be more flexible and at the same time a simpler public
interface. Also we could provide the same builder that you proposed in
"Example configuring the Congestion Control Strategy using the new
interfaces",
Or am I missing something?

Best Piotrek

pon., 20 cze 2022 o 09:52 Teoh, Hong 
napisał(a):

> Hi all,
>
> I would like to open a discussion on FLIP-242: Introduce configurable
> CongestionControlStrategy for Async Sink.
>
> The Async Sink was introduced as part of FLIP-171 [1], and implements a
> non-configurable congestion control strategy to reduce network congestion
> when the destination rejects or throttles requests. We want to make this
> configurable, to allow the sink implementer to decide the desired
> congestion control behaviour given a specific destination.
>
> This is a step towards reducing the barrier to entry to writing new async
> sinks in Flink!
>
> You can find more details in the FLIP-242 [2]. Looking forward to your
> feedback!
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-242%3A+Introduce+configurable+CongestionControlStrategy+for+Async+Sink
>
>
> Regards,
> Hong
>
>
>


Re: [DISCUSS] FLIP-241: Completed Jobs Information Enhancement

2022-06-23 Thread Chesnay Schepler
The addition of the /jobs/:jobid/jobmanager/config / environment 
exclusively to the HS is a bit of a strange workaround.

How do you intend to document those? (and test compatibility)?

Why not just add a general /jobs/:jobid/environment endpoint that works 
just like jobmanager/environment.

To me that seems like a cleaner solution.
It is somewhat mentioned as an alternative in the FLIP, but I don't 
understand what is supposed to be confusing about it.

Whether the job ID is actually used in the end isn't visible after all.

/jobmanager/config could be integrated into /jobs/:jobid/config.

The same approach could maybe be used for logs; not really sure yet (not 
a fan of displaying logs in the HS in the first place).


On 23/06/2022 06:55, junhan yang wrote:

Hi all,

Thank you all for your feedbacks. As far as I can see, it looks like the
discussion on this FLIP has been converged.

I will start a new vote thread now.

Best regards,
Junhan

Yangze Guo  于2022年6月17日周五 14:05写道:


Thanks for the input, Jiangang.

I think it's a valid demand to distinguish completed jobs with the same
name.
- If they are different jobs, I think users need to give them
different meaningful names respectively.
- If they are exactly the same job, IIUC, what you need is to figure
out the order. ApplicationId in Yarn might help. But in this case, you
can just sort them with the start time.

Best,
Yangze Guo

On Fri, Jun 17, 2022 at 12:13 PM Jiangang Liu 
wrote:

Thanks for the FLIP. It is helpful to track detail infos for completed

jobs.

I want to ask another question. In our environment, sometimes it is hard

to

distinguish jobs since the same job names may appear multi times in the
completed jobs. Because a job may run multi times or different jobs have
the same job names. I wonder that wether we can enhance the complete jobs
display with more information, such as applicationId and application name
in yarn. Maybe it is different in k8s to identify a job.

Best
Jiangang Liu

Yangze Guo  于2022年6月17日周五 11:40写道:


Thanks for the feedback, Aitozi and Jing.


Are each attempts of the TaskManager or JobManager pods (if failure

occurs)
all be shown in the ui?

The info of the prior execution attempts will be archived, you could
refer to `ArchivedExecutionVertex$priorExecutions`.


It seems that most of these metrics are more interesting to batch

jobs.

Does it make sense to calculate them for pure streaming jobs too?

All the proposed metrics will be calculated no matter what the job

type is.

Why "duration is less interesting" which is mentioned in the FLIP?

As a first step, we mainly focus on the most interesting status during
the job lifecycle. The duration of final states like FINISHED and
CANCELED is meaningless, while abnormal conditions like CANCELING will
not be included at the moment.


Could you share your thoughts on "accumulated-busy-time"? It should

describe the time while the task is working as expected, i.e. the happy
path. When do we need it for analytics or diagnosis?

A task could be busy or idle while it is working. Users may adjust the
parallelism or the partition key according to the ratio between them.

Best,
Yangze Guo

On Fri, Jun 17, 2022 at 5:08 AM Jing Ge  wrote:

Hi Junhan

These are must-to-have information for batch processing. Thanks for
bringing it up.

I have some comments:

1. It seems that most of these metrics are more interesting to batch

jobs.

Does it make sense to calculate them for pure streaming jobs too?
2. Why "duration is less interesting" which is mentioned in the FLIP?
3. Could you share your thoughts on "accumulated-busy-time"? It

should

describe the time while the task is working as expected, i.e. the

happy

path. When do we need it for analytics or diagnosis?

BTW, you might want to optimize the format of the FLIP. Some text is
running out of the right border of the wiki page.

Best regards,
Jing

On Thu, Jun 16, 2022 at 4:40 PM Aitozi  wrote:


Thanks Junhan for driving this. It a great improvement for the

batch

jobs.

I'm looking forward to this feature in our internal use case. +1

for

it.

One more question:

Are each attempts of the TaskManager or JobManager pods (if failure

occurs)

all be shown in the ui ?

Best,
Aitozi.

Yang Wang  于2022年6月16日周四 19:10写道:


Thanks Xintong for the explanation.

It makes sense to leave the discussion about job result store in

a

dedicated thread.


Best,
Yang

Xintong Song  于2022年6月16日周四 13:40写道:


My impression of JobResultStore is more about fault tolerance

and

high

availability. Using it for providing information to users

sounds

worth

exploring. We probably need more time to think it through.

Given that it doesn't conflict with what we have proposed in

this

FLIP,

I'd

suggest considering it as a separate thread and exclude it

from the

scope

of this one.

Best,

Xintong



On Thu, Jun 16, 2022 at 11:43 AM Yang Wang <

danrtsey...@gmail.com>

wrote:

This is a very useful feature both for finished streaming and

batc

[jira] [Created] (FLINK-28220) Create Table Like support excluding physical columns

2022-06-23 Thread JustinLee (Jira)
JustinLee created FLINK-28220:
-

 Summary: Create Table Like support excluding physical columns
 Key: FLINK-28220
 URL: https://issues.apache.org/jira/browse/FLINK-28220
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Affects Versions: 1.15.0
Reporter: JustinLee


when users want to Create Table A Like B , they can choose to include or 
exclude options, computed columns ,etc.  But it's mandatory that table A should 
inherit all physical columns of table B, which may cause inconvenience in some 
scenes , such as table A has its own schema and just want to inherit the 
options of table B. 

so I think it would be more flexible to provide the option to include or 
exclude physical columns when Using Create Table .. Like .. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] Release Kubernetes operator 1.0.1

2022-06-23 Thread Yang Wang
Thanks Gyula for preparing the first patch release for Flink Kubernetes
operator.

+1 for this.

Best,
Yang

Őrhidi Mátyás  于2022年6月22日周三 23:38写道:

> +1 for the patch release. Thanks Gyula!
>
> On Wed, Jun 22, 2022 at 5:35 PM Márton Balassi 
> wrote:
>
>> Hi team,
>>
>> +1 for having a 1.0.1 for the Kubernetes Operator.
>>
>> On Wed, Jun 22, 2022 at 4:23 PM Gyula Fóra  wrote:
>>
>> > Hi Devs!
>> >
>> > How do you feel about releasing the 1.0.1 patch release for the
>> Kubernetes
>> > operator?
>> >
>> > We have fixed a few annoying issues that many people tend to hit.
>> >
>> > Given that we are about halfway until the next minor release based on
>> the
>> > proposed schedule I think we could prepare a 1.0.1 RC1 in the next 1-2
>> days
>> > .
>> >
>> > I can volunteer to be the release manager.
>> >
>> > What do you think?
>> >
>> > Cheers,
>> > Gyula
>> >
>>
>


[jira] [Created] (FLINK-28221) Savepoint may corrupt file metas by repeat commit

2022-06-23 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-28221:


 Summary: Savepoint may corrupt file metas by repeat commit
 Key: FLINK-28221
 URL: https://issues.apache.org/jira/browse/FLINK-28221
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.2.0


[https://github.com/apache/flink-table-store/runs/7020439369?check_suite_focus=true]
Error:  Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 46.953 
s <<< FAILURE! - in org.apache.flink.table.store.connector.RescaleBucketITCase 
[32285|https://github.com/apache/flink-table-store/runs/7020439369?check_suite_focus=true#step:4:32286]Error:
  testSuspendAndRecoverAfterRescaleOverwrite Time elapsed: 25.545 s <<< ERROR!
{code:java}
Caused by: java.lang.IllegalStateException: Trying to add file 
{org.apache.flink.table.data.binary.BinaryRowData@9c67b85d, 0, 0, 
data-4756dfaf-e14e-440e-b211-df2b25f2537a-0.orc} which is already added. 
Manifest might be corrupted.
32416   at 
org.apache.flink.util.Preconditions.checkState(Preconditions.java:215)
32417   at 
org.apache.flink.table.store.file.operation.AbstractFileStoreScan.plan(AbstractFileStoreScan.java:189)
32418   at 
org.apache.flink.table.store.table.source.TableScan.plan(TableScan.java:99)
32419   at 
org.apache.flink.table.store.connector.source.FileStoreSource.restoreEnumerator(FileStoreSource.java:117)
32420   at 
org.apache.flink.table.store.connector.source.FileStoreSource.createEnumerator(FileStoreSource.java:93)
32421   at 
org.apache.flink.runtime.source.coordinator.SourceCoordinator.start(SourceCoordinator.java:197)
32422   ... 33 more {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28222) Add sql-csv/json modules

2022-06-23 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-28222:


 Summary: Add sql-csv/json modules
 Key: FLINK-28222
 URL: https://issues.apache.org/jira/browse/FLINK-28222
 Project: Flink
  Issue Type: Technical Debt
  Components: Build System, Formats (JSON, Avro, Parquet, ORC, 
SequenceFile)
Reporter: Chesnay Schepler
 Fix For: 1.16.0


For consistency and maintainability the csv/json formats should have a 
dedicated sql-jar module.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28223) Add artifact-fetcher to the pod-template.yaml example

2022-06-23 Thread Matyas Orhidi (Jira)
Matyas Orhidi created FLINK-28223:
-

 Summary: Add artifact-fetcher to the pod-template.yaml example
 Key: FLINK-28223
 URL: https://issues.apache.org/jira/browse/FLINK-28223
 Project: Flink
  Issue Type: Improvement
Reporter: Matyas Orhidi


We could improve the pod template example to have an artifact fetcher.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28224) Add document for algorithms and features in Flink ML 2.1

2022-06-23 Thread Yunfeng Zhou (Jira)
Yunfeng Zhou created FLINK-28224:


 Summary: Add document for algorithms and features in Flink ML 2.1
 Key: FLINK-28224
 URL: https://issues.apache.org/jira/browse/FLINK-28224
 Project: Flink
  Issue Type: Improvement
  Components: Library / Machine Learning
Affects Versions: ml-2.1.0
Reporter: Yunfeng Zhou


The algorithms and new features introduced in Flink ML 2.1 needs to be 
documented and displayed on Flink ML's document website.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28225) Supports custom ENVs in the Helm chart

2022-06-23 Thread Xin Hao (Jira)
Xin Hao created FLINK-28225:
---

 Summary: Supports custom ENVs in the Helm chart
 Key: FLINK-28225
 URL: https://issues.apache.org/jira/browse/FLINK-28225
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Xin Hao


Can we add custom ENVs supports in the operator Helm?

Such as:
{code:java}
# In the values.yaml

operatorEnvs:
# - name: ""
#   value: ""
webhookEnvs:
# - name: ""
#   value: ""{code}
{code:java}
# In the deployment.yaml
env:
- name: name1
  value: value1
{{- range $k, $v := .Values.operatorEnvs }} {code}
 

 

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] Maintain a Calcite repository for Flink to accelerate the development for Flink SQL features

2022-06-23 Thread Martijn Visser
Hi Jing,

My pleasure. I also saw some reviews coming in from Julian on your Calcite
PR, so that's great :)

> About keeping up with the Calcite updates, I would like to take this
issue. Is it too late to schedule the 1.16 version? How about scheduling
this work on version 1.17?

I'm not sure if there's a reviewer available, but I still think it would
already be super valuable to get a PR opened with the actual work. If we
can still fit it in the 1.16 release cycle (there's 4 weeks left until
feature freeze) we can push it in, else we'll merge it in for 1.17.

Best regards,

Martijn

Op do 23 jun. 2022 om 06:38 schreef Jing Zhang :

> Hi Martijin,
> This is really exciting news.
> Thanks a lot for the effort to improve collaboration and communication with
> the Calcite community.
>
> > My take away from the discussion in the Flink community and the
> discussion
> in the Calcite community is that I believe we should do 3 things.
>
> Agreed on these 3 points.
> About keeping up with the Calcite updates, I would like to take this issue.
> Is it too late to schedule the 1.16 version? How about scheduling this work
> on version 1.17?
>
> Best,
> Jing Zhang
>
>
> Martijn Visser  于2022年6月23日周四 02:01写道:
>
> > Hi everyone,
> >
> > I've recently reached out to the Calcite community to see if we could
> > somehow get something done with regards to the PR that Jing Zhang had
> > opened a long time ago. In that thread, I also mentioned that we had a
> > discussion in the Flink community on potentially forking Calcite. I would
> > recommend reading up on the thread [1]. Specifically the replies from
> other
> > projects/PMCs (Apache Drill, Apache Dremio) are super interesting. These
> > projects have forked Calcite in the past, regret that move, have reverted
> > back to Calcite / are in the process of reverting and are elaborating on
> > that. This thread also gained some traction on Twitter in case you're
> > interested in more opinions. [3]
> >
> > My take away from the discussion in the Flink community and the
> discussion
> > in the Calcite community is that I believe we should do 3 things:
> >
> > 1. We should not fork Calcite. There might be short term benefits but
> long
> > term pain. I think we already are suffering from enough long term pain in
> > the Flink codebase that we shouldn't take a step that will increase that
> > pain even more, scattered over multiple places.
> > 2. I think we should try to help out the Calcite community more. Not only
> > by opening new PRs for new features, but we can also help by reviewing
> > those PRs, reviewing other PRs that could be relevant for Flink or
> propose
> > improvements given our experience at Flink. As you can see in the Calcite
> > thread, Timo has already expressed desire in doing so. Part of the OSS
> > community is also about helping each other; if we improve Calcite, we
> will
> > also improve Flink.
> > 3. I think we need to prioritise keeping up with the Calcite updates.
> They
> > are currently working on releasing version 1.31, while Flink is still at
> > 1.26.0. We don't necessarily need to stay in sync with the latest
> available
> > version, but I definitely think we should be at most 2 versions (and
> > preferably 1 version) behind (so currently that would be 1.28 and 1.29
> > soonish). Not only are we increasing our own tech debt by not updating,
> we
> > are also limiting ourselves in adding new features in the Table/SQL
> space.
> > As you can also see for the 1.26 release notes, there's a warning to only
> > use 1.26 for development since it can corrupt your data [3]. There are
> > already multiple upgrade tickets for Calcite [4] [5] [6].
> >
> > [1] https://lists.apache.org/thread/3lkfhwjpqwy9pfhnvwmfkwmwlfyqs45z
> > [2]
> >
> >
> https://twitter.com/gunnarmorling/status/1539499415337111553?s=21&t=8fGk3PxScOx4FJPJWE5UeA
> > [3] https://calcite.apache.org/news/2020/10/06/release-1.26.0/
> > [4] https://issues.apache.org/jira/browse/FLINK-20873
> > [5] https://issues.apache.org/jira/browse/FLINK-21239
> > [6] https://issues.apache.org/jira/browse/FLINK-27998
> >
> > Best regards,
> >
> > Martijn Visser
> > https://twitter.com/MartijnVisser82
> > https://github.com/MartijnVisser
> >
> > Op do 5 mei 2022 om 10:34 schreef godfrey he :
> >
> > > Hi, Timo & Martijn,
> > >
> > > Sorry for the late reply, thanks for the feedback.
> > >
> > > I strongly agree that the best solution would be to cooperate more
> > > with the Calcite community
> > > and maintain all new features and bug fixes in the Calcite community,
> > > without any forking.
> > > It is a long-term process. I think it's difficult to change community
> > > rules, because the Calcite
> > > project is a neutral lib that serves multiple projects simultaneously.
> > > I don't think fork calcite is the perfect solution, but rather a
> > > better balance within limited resources:
> > > it's possible to introduce some necessary minor features and bug fixes
> > > without having to
> > > upgrade 

[jira] [Created] (FLINK-28226) 'Run kubernetes pyflink application test' fail with "[FAIL] Test script contains errors"

2022-06-23 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-28226:
--

 Summary: 'Run kubernetes pyflink application test' fail with 
"[FAIL] Test script contains errors"
 Key: FLINK-28226
 URL: https://issues.apache.org/jira/browse/FLINK-28226
 Project: Flink
  Issue Type: Bug
  Components: API / Python, Deployment / Kubernetes
Affects Versions: 1.14.6
Reporter: Martijn Visser


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=37103&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=070ff179-953e-5bda-71fa-d6599415701c&l=6592

{code:java}
Jun 23 10:40:36 Stopping minikube ...
Jun 23 10:40:36 * Stopping node "minikube"  ...
Jun 23 10:40:46 * 1 node stopped.
Jun 23 10:40:46 [FAIL] Test script contains errors.
Jun 23 10:40:46 Checking for errors...
Jun 23 10:40:46 No errors in log files.
Jun 23 10:40:46 Checking for exceptions...
Jun 23 10:40:46 No exceptions in log files.
Jun 23 10:40:46 Checking for non-empty .out files...
grep: /home/vsts/work/_temp/debug_files/flink-logs/*.out: No such file or 
directory
Jun 23 10:40:46 No non-empty .out files.
Jun 23 10:40:46 
Jun 23 10:40:46 [FAIL] 'Run kubernetes pyflink application test' failed after 3 
minutes and 36 seconds! Test exited with exit code 1
Jun 23 10:40:46 
{code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28227) Add implementations for classes in o.a.f.streaming.examples based on the new new Source API

2022-06-23 Thread Alexander Fedulov (Jira)
Alexander Fedulov created FLINK-28227:
-

 Summary: Add implementations for classes in 
o.a.f.streaming.examples based on the new new Source API
 Key: FLINK-28227
 URL: https://issues.apache.org/jira/browse/FLINK-28227
 Project: Flink
  Issue Type: Sub-task
Reporter: Alexander Fedulov


Reimplement 
 * CarSource
 * SimpleSource
 * SessionWindowing
 * IterateExample

with the new Source API.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28228) Target generation logic might skip over specs when observing already upgraded clusters

2022-06-23 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-28228:
--

 Summary: Target generation logic might skip over specs when 
observing already upgraded clusters
 Key: FLINK-28228
 URL: https://issues.apache.org/jira/browse/FLINK-28228
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.1.0
Reporter: Gyula Fora
Assignee: Gyula Fora
 Fix For: kubernetes-operator-1.1.0


There are currently 2 problems with detecting already upgraded clusters in the 
AbstractDeploymentObserver:

1. Detecting the initial deployments is not reliable, because the user sends in 
an upgrade before the observer runs the logic will fail because we have a new 
higher generation.

2. When an upgrade was detected we should not simply use 
ReconciliationUtils.updateForSpecReconciliationSuccess because this will mark 
the current spec as reconciled which is not necessarily true



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] Releasing Flink ML 2.1.0

2022-06-23 Thread Xingbo Huang
Hi Yun and Zhipeng,

Thanks a lot for starting the discussion. +1 for the FLINK ML 2.1.0
release. Looking forward for these ML algorithms. I plan to write a blog
about PyFlink + Flink ML after the released.

Best,
Xingbo

Zhipeng Zhang  于2022年6月23日周四 11:15写道:

> Hi devs,
>
> Yun and I would like to start a discussion for releasing Flink ML
>  2.1.0.
>
> In the past few months, we focused on improving the infra (e.g. memory
> management, benchmark infra, online training, python support) of Flink ML
> by implementing, benchmarking, and optimizing 9 new algorithms in Flink ML.
> Our results have shown that Flink ML is able to meet or exceed the
> performance of selected algorithms in alternative popular ML libraries.
>
> Please see below for a detailed list of improvements:
>
> - A set of representative machine learning algorithms:
> - feature engineering
> - MinMaxScaler (https://issues.apache.org/jira/browse/FLINK-25552)
> - StringIndexer (https://issues.apache.org/jira/browse/FLINK-25527
> )
> - VectorAssembler (
> https://issues.apache.org/jira/browse/FLINK-25616
> )
> - StandardScaler (
> https://issues.apache.org/jira/browse/FLINK-26626)
> - Bucketizer (https://issues.apache.org/jira/browse/FLINK-27072)
> - online learning:
> - OnlineKmeans (https://issues.apache.org/jira/browse/FLINK-26313)
> - OnlineLogisiticRegression (
> https://issues.apache.org/jira/browse/FLINK-27170)
> - regression:
> - LinearRegression (
> https://issues.apache.org/jira/browse/FLINK-27093)
> - classification:
> - LinearSVC (https://issues.apache.org/jira/browse/FLINK-27091)
> - Evaluation:
> - BinaryClassificationEvaluator (
> https://issues.apache.org/jira/browse/FLINK-27294)
> - A benchmark framework for Flink ML. (
> https://issues.apache.org/jira/browse/FLINK-26443)
> - A website for Flink ML users (
> https://nightlies.apache.org/flink/flink-ml-docs-stable/)
> - Python support for Flink ML algorithms (
> https://issues.apache.org/jira/browse/FLINK-26268,
> https://issues.apache.org/jira/browse/FLINK-26269)
> - Several optimizations for FlinkML infrastructure (
> https://issues.apache.org/jira/browse/FLINK-27096,
> https://issues.apache.org/jira/browse/FLINK-27877)
>
> With the improvements and throughput benchmarks we have made, we think it
> is time to release Flink ML 2.1.0, so that interested developers in the
> community can try out the new Flink ML infra to develop algorithms with
> high throughput and low latency.
>
> If there is any concern, please let us know.
>
>
> Best,
> Yun and Zhipeng
>


[jira] [Created] (FLINK-28229) Introduce Source API alternatives for StreamExecutionEnvironment#fromCollection() methods

2022-06-23 Thread Alexander Fedulov (Jira)
Alexander Fedulov created FLINK-28229:
-

 Summary: Introduce Source API alternatives for 
StreamExecutionEnvironment#fromCollection() methods
 Key: FLINK-28229
 URL: https://issues.apache.org/jira/browse/FLINK-28229
 Project: Flink
  Issue Type: Sub-task
Reporter: Alexander Fedulov


 
 * FromElementsFunction
 * FromIteratorFunction



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28230) Deduplicate Dependency classes across checks

2022-06-23 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-28230:


 Summary: Deduplicate Dependency classes across checks
 Key: FLINK-28230
 URL: https://issues.apache.org/jira/browse/FLINK-28230
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.16.0


The new Dependency class introduced in FLINK-28201 can also be used for the 
NoticeFileChecker, or after FLINK-28202, the shade-plugin parser utils.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


RE: [DISCUSS] Release Kubernetes operator 1.0.1

2022-06-23 Thread zhengyu chen
+1
for the patch release.

-- 
Best

ConradJam


RE: [DISCUSS] FLIP-223: Implement standalone mode support in the kubernetes operator

2022-06-23 Thread zhengyu chen
I only see LAST_STATE support for zookeeper here(FLINK-27416
), but I don't see
anything about k8s. What do you think about this? I am very willing to move
this work forward

-- 
Best

ConradJam


Re: [DISCUSS] FLIP-223: Implement standalone mode support in the kubernetes operator

2022-06-23 Thread Gyula Fóra
I think Usamah is working right now towards a first prototype PR of the
standalone implementation.
We should wait for that before we start splitting the work :)

Gyula


On Thu, Jun 23, 2022 at 3:25 PM zhengyu chen  wrote:

> I only see LAST_STATE support for zookeeper here(FLINK-27416
> ), but I don't see
> anything about k8s. What do you think about this? I am very willing to move
> this work forward
>
> --
> Best
>
> ConradJam
>


Re: Re: [DISCUSS] FLIP-240: Introduce "ANALYZE TABLE" Syntax

2022-06-23 Thread godfrey he
Hi, everyone.

Thanks for all the inputs.
If there is no feedback any more, I will start the vote tomorrow.

Best,
Godfrey

Jing Ge  于2022年6月22日周三 15:50写道:
>
> sounds good to me. Thanks!
>
> Best regards,
> Jing
>
> On Fri, Jun 17, 2022 at 5:37 AM godfrey he  wrote:
>
> > Hi, Jing.
> >
> > Thanks for the feedback.
> >
> > >When will the converted SELECT statement of the ANALYZE TABLE be
> > > submitted? right after the CREATE TABLE?
> > The SELECT  job will be submitted only when `ANALYZE TABLE` is executed,
> > and there is nothing to do with CREATE TABLE. Because the `ANALYZE TABLE`
> > is triggered manually as needed.
> >
> > >Will it be submitted periodically to keep the statistical data
> > >up-to-date, since the data might be mutable?
> > the `ANALYZE TABLE` is triggered manually as needed.
> > I will update the doc.
> >
> > >It might not be strong enough to avoid human error
> > > I would suggest using FOR ALL PARTITIONS explicitly
> > > just like FOR ALL COLUMNS.
> > Agree, specifying `PARTITION` explicitly is more friendly
> > and safe. I prefer to use `PARTITION(ds, hr)` without
> > specific partition value, hive has the similar syntax.
> > WDYT ?
> >
> > Best,
> > Godfrey
> >
> > Jing Ge  于2022年6月16日周四 03:53写道:
> > >
> > > Hi Godfrey,
> > >
> > > Thanks for driving this! There are some areas where I couldn't find
> > enough
> > > information in the FLIP, just wondering if I could get more
> > > explanation from you w.r.t. the following questions:
> > >
> > > 1. When will the converted SELECT statement of the ANALYZE TABLE be
> > > submitted? right after the CREATE TABLE?
> > >
> > > 2. Will it be submitted periodically to keep the statistical data
> > > up-to-date, since the data might be mutable?
> > >
> > > 3. " If no partition is specified, the statistics will be gathered for
> > all
> > > partitions"  - I think this is fine for multi-level partitions, e.g.
> > PARTITION
> > > (ds='2022-06-01') means two partitions: PARTITION (ds='2022-06-01', hr=1)
> > > and PARTITION (ds='2022-06-01', hr=2), because it will save a lot of code
> > > and therefore help developer work more efficiently. If we use this rule
> > for
> > > top level partitions, It might not be strong enough to avoid human
> > > error, e.g. developer might trigger huge selection on the table with many
> > > partitions, when he forgot to write the partition in the ANALYZE TABLE
> > > script. In this case, I would suggest using FOR ALL PARTITIONS explicitly
> > > just like FOR ALL COLUMNS.
> > >
> > > Best regards,
> > > Jing
> > >
> > >
> > > On Wed, Jun 15, 2022 at 10:16 AM godfrey he  wrote:
> > >
> > > > Hi Jark,
> > > >
> > > > Thanks for the inputs.
> > > >
> > > > >Do we need to provide DESC EXTENDED  statement like
> > Spark[1]
> > > > to
> > > > >show statistic for table/partition/columns?
> > > > We do have supported `DESC EXTENDED` syntax, but currently only table
> > > > schema
> > > > will be display, I think we just need a JIRA to support it.
> > > >
> > > > > is it possible to ignore execution mode and force using batch mode
> > for
> > > > the statement?
> > > > As I replied above, The semantics of `ANALYZE TABLE` does not
> > > > distinguish batch and streaming,
> > > > It works for both batch and streaming, but the result of unbounded
> > > > sources is meaningless.
> > > > Currently, I throw exception for streaming mode,
> > > > and we can support streaming mode with bounded source in the future.
> > > >
> > > > Best,
> > > > Godfrey
> > > >
> > > > Jark Wu  于2022年6月14日周二 17:56写道:
> > > > >
> > > > > Hi Godfrey, thanks for starting this discussion, this is a great
> > feature
> > > > > for batch users.
> > > > >
> > > > > The FLIP looks good to me in general.
> > > > >
> > > > > I only have 2 comments:
> > > > >
> > > > > 1) How do users know whether the given table or partition contains
> > > > required
> > > > > statistics?
> > > > > Do we need to provide DESC EXTENDED  statement like
> > Spark[1]
> > > > to
> > > > > show statistic for table/partition/columns?
> > > > >
> > > > > 2) If ANALYZE TABLE can only run in batch mode, is it possible to
> > ignore
> > > > > execution mode
> > > > > and force using batch mode for the statement? From my perspective,
> > > > ANALYZE
> > > > > TABLE
> > > > > is an auxiliary statement similar to SHOW TABLES but heavier, which
> > > > doesn't
> > > > > care about
> > > > > environment execution mode.
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > > [1]:
> > > > >
> > > >
> > https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-aux-analyze-table.html
> > > > >
> > > > > On Tue, 14 Jun 2022 at 13:52, Jing Ge  wrote:
> > > > >
> > > > > > Hi 华宗
> > > > > >
> > > > > > 退订请发送任意消息至dev-unsubscr...@flink.apache.org
> > > > > > In order to unsubscribe, please send an email to
> > > > > > dev-unsubscr...@flink.apache.org
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > Best regards,
> > > > > > Jing
> > > > > >
> > > > > >
> > > > > > On Tue, Jun 14, 2022 at 2:0

[jira] [Created] (FLINK-28231) Support Ozone file system

2022-06-23 Thread bianqi (Jira)
bianqi created FLINK-28231:
--

 Summary: Support Ozone file system
 Key: FLINK-28231
 URL: https://issues.apache.org/jira/browse/FLINK-28231
 Project: Flink
  Issue Type: New Feature
  Components: fs
Affects Versions: 1.14.5
 Environment: * Flink1.12.1
 * Hadoop3.2.2
 * Ozone1.2.1
Reporter: bianqi


After the ozone environment is currently configured, mapreduce tasks can be 
submitted to read and write ozone, but flink tasks cannot be submitted

The error is as follows
{code:java}
[root@jykj0 ozone-fs-hadoop]# flink run 
/soft/flink13/examples/batch/WordCount.jar --input 
ofs://jykj0.yarn.com/volume/bucket/warehouse/input --output 
ofs://jykj0.yarn.com/volume/bucket/warehouse/output/
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/soft/flink-1.13.5/lib/log4j-slf4j-impl-2.16.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/soft/hadoop-3.2.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2022-06-23 22:30:24,131 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
           [] - Found Yarn properties file under 
/tmp/flink-1.13.5/.yarn-properties-root.
2022-06-23 22:30:24,131 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
           [] - Found Yarn properties file under 
/tmp/flink-1.13.5/.yarn-properties-root.
2022-06-23 22:30:27,151 INFO  org.apache.hadoop.io.retry.RetryInvocationHandler 
           [] - java.lang.IllegalStateException, while invoking 
$Proxy27.submitRequest over nodeId=null,nodeAddress=jykj0.yarn.com:9862 after 1 
failover attempts. Trying to failover after sleeping for 4000ms.
2022-06-23 22:30:31,152 INFO  org.apache.hadoop.io.retry.RetryInvocationHandler 
           [] - java.lang.IllegalStateException, while invoking 
$Proxy27.submitRequest over nodeId=null,nodeAddress=jykj0.yarn.com:9862 after 2 
failover attempts. Trying to failover after sleeping for 6000ms. {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28232) Allow for custom pre-flight checks for SQL UDFs

2022-06-23 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-28232:
--

 Summary: Allow for custom pre-flight checks for SQL UDFs
 Key: FLINK-28232
 URL: https://issues.apache.org/jira/browse/FLINK-28232
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Reporter: Robert Metzger


Currently, implementors of SQL UDFs [1] can not validate the UDF input before 
submitting a SQL query to the runtime. 
Take for example a UDF that computes a regex based on user input. Ideally 
there's a callback for the UDF implementor to check if the user-provided regex 
is valid and compiles, to avoid errors during the execution of the SQL query.

It would be ideal to get access to the schema information resolved by the SQL 
planner in that pre-flight validation to also allow for schema related checks 
pre-flight.


[1] 
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/functions/udfs/



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28233) Successful observe doesn't clear errors due to patching

2022-06-23 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-28233:
--

 Summary: Successful observe doesn't clear errors due to patching
 Key: FLINK-28233
 URL: https://issues.apache.org/jira/browse/FLINK-28233
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.0.0, kubernetes-operator-1.1.0
Reporter: Gyula Fora
Assignee: Gyula Fora
 Fix For: kubernetes-operator-1.1.0


We set the error to `null` when trying to clear it but due to the patching 
mechanism this does nothing. It should be set to an empty string instead.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] Releasing Flink ML 2.1.0

2022-06-23 Thread Dong Lin
Hi Zhipeng and Yun,

Thanks for starting the discussion. +1 for the Flink ML 2.1.0 release.

Cheers,
Dong

On Thu, Jun 23, 2022 at 11:15 AM Zhipeng Zhang 
wrote:

> Hi devs,
>
> Yun and I would like to start a discussion for releasing Flink ML
>  2.1.0.
>
> In the past few months, we focused on improving the infra (e.g. memory
> management, benchmark infra, online training, python support) of Flink ML
> by implementing, benchmarking, and optimizing 9 new algorithms in Flink ML.
> Our results have shown that Flink ML is able to meet or exceed the
> performance of selected algorithms in alternative popular ML libraries.
>
> Please see below for a detailed list of improvements:
>
> - A set of representative machine learning algorithms:
> - feature engineering
> - MinMaxScaler (https://issues.apache.org/jira/browse/FLINK-25552)
> - StringIndexer (https://issues.apache.org/jira/browse/FLINK-25527
> )
> - VectorAssembler (
> https://issues.apache.org/jira/browse/FLINK-25616
> )
> - StandardScaler (
> https://issues.apache.org/jira/browse/FLINK-26626)
> - Bucketizer (https://issues.apache.org/jira/browse/FLINK-27072)
> - online learning:
> - OnlineKmeans (https://issues.apache.org/jira/browse/FLINK-26313)
> - OnlineLogisiticRegression (
> https://issues.apache.org/jira/browse/FLINK-27170)
> - regression:
> - LinearRegression (
> https://issues.apache.org/jira/browse/FLINK-27093)
> - classification:
> - LinearSVC (https://issues.apache.org/jira/browse/FLINK-27091)
> - Evaluation:
> - BinaryClassificationEvaluator (
> https://issues.apache.org/jira/browse/FLINK-27294)
> - A benchmark framework for Flink ML. (
> https://issues.apache.org/jira/browse/FLINK-26443)
> - A website for Flink ML users (
> https://nightlies.apache.org/flink/flink-ml-docs-stable/)
> - Python support for Flink ML algorithms (
> https://issues.apache.org/jira/browse/FLINK-26268,
> https://issues.apache.org/jira/browse/FLINK-26269)
> - Several optimizations for FlinkML infrastructure (
> https://issues.apache.org/jira/browse/FLINK-27096,
> https://issues.apache.org/jira/browse/FLINK-27877)
>
> With the improvements and throughput benchmarks we have made, we think it
> is time to release Flink ML 2.1.0, so that interested developers in the
> community can try out the new Flink ML infra to develop algorithms with
> high throughput and low latency.
>
> If there is any concern, please let us know.
>
>
> Best,
> Yun and Zhipeng
>


Re:Re: [VOTE] FLIP-221: Abstraction for lookup source and metric

2022-06-23 Thread zst...@163.com
Thanks Qingsheng for driving this.




+1 (non-binding)




Best regards,

Yuan




At 2022-06-23 15:44:55, "Martijn Visser"  wrote:
>Great work on the FLIP.
>
>+1 (binding)
>
>Op do 23 jun. 2022 om 08:07 schreef Leonard Xu :
>
>> Thanks Qingsheng for driving this work.
>>
>> +1(binding)
>>
>>
>> Best,
>> Leonard
>> > 2022年6月23日 下午1:37,Jark Wu  写道:
>> >
>> > +1 (binding)
>> >
>> > Best,
>> > Jark
>> >
>> > On Thu, 23 Jun 2022 at 12:49, Qingsheng Ren  wrote:
>> >
>> >> Hi devs,
>> >>
>> >> I’d like to start a vote thread for FLIP-221: Abstraction for lookup
>> >> source and metric. You can find the discussion thread in [2]*.
>> >>
>> >> The vote will be open for at least 72 hours unless there is an objection
>> >> or not enough binding votes.
>> >>
>> >> Thanks everyone participating in the discussion!
>> >>
>> >> [1]
>> >>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-221%3A+Abstraction+for+lookup+source+cache+and+metric
>> >> [2] https://lists.apache.org/thread/9c0fbgkofkbfdr5hvs62m0cxd2bkgwho
>> >> [*] The link to the discussion thread might not include all emails.
>> Please
>> >> search the Apache email archive with keyword "FLIP-221" to get all
>> >> discussion histories.
>> >>
>> >> Best regards,
>> >> Qingsheng
>>
>>


[DISCUSS] Contribution of Multi Cluster Kafka Source

2022-06-23 Thread Mason Chen
Hi community,

We have been working on a Multi Cluster Kafka Source and are looking to
contribute it upstream. I've given a talk about the features and design at
a Flink meetup: https://youtu.be/H1SYOuLcUTI.

The main features that it provides is:
1. Reading multiple Kafka clusters within a single source.
2. Adjusting the clusters and topics the source consumes from dynamically,
without Flink job restart.

Some of the challenging use cases that these features solve are:
1. Transparent Kafka cluster migration without Flink job restart.
2. Transparent Kafka topic migration without Flink job restart.
3. Direct integration with Hybrid Source.

In addition, this is designed with wrapping and managing the existing
KafkaSource components to enable these features, so it can continue to
benefit from KafkaSource improvements and bug fixes. It can be considered
as a form of a composite source.

I think the contribution of this source could benefit a lot of users who
have asked in the mailing list about Flink handling Kafka migrations and
removing topics in the past. I would love to hear and address your thoughts
and feedback, and if possible drive a FLIP!

Best,
Mason


Re: Re: [VOTE] FLIP-221: Abstraction for lookup source and metric

2022-06-23 Thread Jingsong Li
+1

Best,
Jingsong

On Fri, Jun 24, 2022 at 9:30 AM zst...@163.com  wrote:
>
> Thanks Qingsheng for driving this.
>
>
>
>
> +1 (non-binding)
>
>
>
>
> Best regards,
>
> Yuan
>
>
>
>
> At 2022-06-23 15:44:55, "Martijn Visser"  wrote:
> >Great work on the FLIP.
> >
> >+1 (binding)
> >
> >Op do 23 jun. 2022 om 08:07 schreef Leonard Xu :
> >
> >> Thanks Qingsheng for driving this work.
> >>
> >> +1(binding)
> >>
> >>
> >> Best,
> >> Leonard
> >> > 2022年6月23日 下午1:37,Jark Wu  写道:
> >> >
> >> > +1 (binding)
> >> >
> >> > Best,
> >> > Jark
> >> >
> >> > On Thu, 23 Jun 2022 at 12:49, Qingsheng Ren  wrote:
> >> >
> >> >> Hi devs,
> >> >>
> >> >> I’d like to start a vote thread for FLIP-221: Abstraction for lookup
> >> >> source and metric. You can find the discussion thread in [2]*.
> >> >>
> >> >> The vote will be open for at least 72 hours unless there is an objection
> >> >> or not enough binding votes.
> >> >>
> >> >> Thanks everyone participating in the discussion!
> >> >>
> >> >> [1]
> >> >>
> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-221%3A+Abstraction+for+lookup+source+cache+and+metric
> >> >> [2] https://lists.apache.org/thread/9c0fbgkofkbfdr5hvs62m0cxd2bkgwho
> >> >> [*] The link to the discussion thread might not include all emails.
> >> Please
> >> >> search the Apache email archive with keyword "FLIP-221" to get all
> >> >> discussion histories.
> >> >>
> >> >> Best regards,
> >> >> Qingsheng
> >>
> >>


Re: [VOTE] FLIP-221: Abstraction for lookup source and metric

2022-06-23 Thread Alexander Smirnov
+1 (non-binding)

Best,
Alexander

чт, 23 июн. 2022 г., 11:49 Qingsheng Ren :

> Hi devs,
>
> I’d like to start a vote thread for FLIP-221: Abstraction for lookup
> source and metric. You can find the discussion thread in [2]*.
>
> The vote will be open for at least 72 hours unless there is an objection
> or not enough binding votes.
>
> Thanks everyone participating in the discussion!
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-221%3A+Abstraction+for+lookup+source+cache+and+metric
> [2] https://lists.apache.org/thread/9c0fbgkofkbfdr5hvs62m0cxd2bkgwho
> [*] The link to the discussion thread might not include all emails. Please
> search the Apache email archive with keyword "FLIP-221" to get all
> discussion histories.
>
> Best regards,
> Qingsheng


[jira] [Created] (FLINK-28234) Infinite or NaN exception for power(-1, 0.5)

2022-06-23 Thread luoyuxia (Jira)
luoyuxia created FLINK-28234:


 Summary:  Infinite or NaN exception for power(-1, 0.5)
 Key: FLINK-28234
 URL: https://issues.apache.org/jira/browse/FLINK-28234
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Reporter: luoyuxia
 Fix For: 1.16.0






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28235) Min aggregate function support type: ''ARRAY''.

2022-06-23 Thread tartarus (Jira)
tartarus created FLINK-28235:


 Summary: Min aggregate function support type: ''ARRAY''.
 Key: FLINK-28235
 URL: https://issues.apache.org/jira/browse/FLINK-28235
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.15.0
Reporter: tartarus


Min aggregate function does not support type: ''ARRAY''.

We can reproduce it with the following SQL
{code:java}
select app, concat_ws(',', collect_set(cast(user_id as string))) as user_list, 
count(distinct user_id) as uv from test_table group by app {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] Releasing Flink ML 2.1.0

2022-06-23 Thread Becket Qin
+1.

It looks like we have some decent progress on Flink ML :)

Thanks,

Jiangjie (Becket) Qin

On Fri, Jun 24, 2022 at 8:29 AM Dong Lin  wrote:

> Hi Zhipeng and Yun,
>
> Thanks for starting the discussion. +1 for the Flink ML 2.1.0 release.
>
> Cheers,
> Dong
>
> On Thu, Jun 23, 2022 at 11:15 AM Zhipeng Zhang 
> wrote:
>
> > Hi devs,
> >
> > Yun and I would like to start a discussion for releasing Flink ML
> >  2.1.0.
> >
> > In the past few months, we focused on improving the infra (e.g. memory
> > management, benchmark infra, online training, python support) of Flink ML
> > by implementing, benchmarking, and optimizing 9 new algorithms in Flink
> ML.
> > Our results have shown that Flink ML is able to meet or exceed the
> > performance of selected algorithms in alternative popular ML libraries.
> >
> > Please see below for a detailed list of improvements:
> >
> > - A set of representative machine learning algorithms:
> > - feature engineering
> > - MinMaxScaler (
> https://issues.apache.org/jira/browse/FLINK-25552)
> > - StringIndexer (
> https://issues.apache.org/jira/browse/FLINK-25527
> > )
> > - VectorAssembler (
> > https://issues.apache.org/jira/browse/FLINK-25616
> > )
> > - StandardScaler (
> > https://issues.apache.org/jira/browse/FLINK-26626)
> > - Bucketizer (https://issues.apache.org/jira/browse/FLINK-27072)
> > - online learning:
> > - OnlineKmeans (
> https://issues.apache.org/jira/browse/FLINK-26313)
> > - OnlineLogisiticRegression (
> > https://issues.apache.org/jira/browse/FLINK-27170)
> > - regression:
> > - LinearRegression (
> > https://issues.apache.org/jira/browse/FLINK-27093)
> > - classification:
> > - LinearSVC (https://issues.apache.org/jira/browse/FLINK-27091)
> > - Evaluation:
> > - BinaryClassificationEvaluator (
> > https://issues.apache.org/jira/browse/FLINK-27294)
> > - A benchmark framework for Flink ML. (
> > https://issues.apache.org/jira/browse/FLINK-26443)
> > - A website for Flink ML users (
> > https://nightlies.apache.org/flink/flink-ml-docs-stable/)
> > - Python support for Flink ML algorithms (
> > https://issues.apache.org/jira/browse/FLINK-26268,
> > https://issues.apache.org/jira/browse/FLINK-26269)
> > - Several optimizations for FlinkML infrastructure (
> > https://issues.apache.org/jira/browse/FLINK-27096,
> > https://issues.apache.org/jira/browse/FLINK-27877)
> >
> > With the improvements and throughput benchmarks we have made, we think it
> > is time to release Flink ML 2.1.0, so that interested developers in the
> > community can try out the new Flink ML infra to develop algorithms with
> > high throughput and low latency.
> >
> > If there is any concern, please let us know.
> >
> >
> > Best,
> > Yun and Zhipeng
> >
>


[jira] [Created] (FLINK-28236) managed memory  no  matter how much I configure, it will be exhausted

2022-06-23 Thread LCER (Jira)
LCER created FLINK-28236:


 Summary: managed memory  no  matter how much I configure, it will 
be exhausted
 Key: FLINK-28236
 URL: https://issues.apache.org/jira/browse/FLINK-28236
 Project: Flink
  Issue Type: Bug
Reporter: LCER
 Attachments: flink-conf.yaml, image-2022-06-24-10-55-07-475.png

managed memory  no  matter how much I configure, it will be exhausted, 
resulting in taskmanager crash。

!image-2022-06-24-10-55-07-475.png!

 

I have a question, managed memory What resources of the physical machine are 
used? I check the memory occupied by the process through the top command. The 
actual RES usage is only about 10g

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28237) Improve the flink-ml python examples in doc

2022-06-23 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-28237:


 Summary: Improve the flink-ml python examples in doc
 Key: FLINK-28237
 URL: https://issues.apache.org/jira/browse/FLINK-28237
 Project: Flink
  Issue Type: Improvement
  Components: API / Python, Library / Machine Learning
Affects Versions: ml-2.1.0
Reporter: Huang Xingbo
Assignee: Huang Xingbo
 Fix For: ml-2.1.0






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28238) Fix unstable testCancelOperationAndFetchResultInParallel

2022-06-23 Thread Shengkai Fang (Jira)
Shengkai Fang created FLINK-28238:
-

 Summary: Fix unstable testCancelOperationAndFetchResultInParallel
 Key: FLINK-28238
 URL: https://issues.apache.org/jira/browse/FLINK-28238
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Gateway
Affects Versions: 1.16.0
Reporter: Shengkai Fang
 Fix For: 1.16.0


The failed test in 
https://dev.azure.com/martijn0323/Flink/_build/results?buildId=2711&view=logs&j=43a[…]cc-244368da36b4&t=82d122c0-8bbf-56f3-4c0d-8e3d69630d0f&l=11611

It's possible the fetcher fetches the results from the closed operation.




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] FLIP-241: Completed Jobs Information Enhancement

2022-06-23 Thread Xintong Song
Whether the job ID is actually used in the end isn't visible after all.

I'm not sure about this. E.g., for an empty session cluster, users have to
understand they don't need to provide an actual jobid for requesting
jobmanager information via rest.

I believe both ways work. I think this is a trade off between a) explaining
to history server rest api users how the urls are different from jobmanager
and b) explaining to jobmanager rest api users why we need an unused jobid
for some of the cases. I'm leaning toward the current approach, because I'd
expect a smaller set of history server rest api users than (or even a
subset of) that of jobmanager.

The plan is to document which (and how) the urls are different from
jobmanager in the history server page [1].

Compatibility test indeed should be considered. Thanks for pointing it out.
Currently the compatibility of history server rest api is guaranteed by the
compatibility of jobmanager rest api. I think the only thing we need is to
make sure /foo/bar of jobmanager is identical to /jobs/:jobid/foo/bar of
history server. We can introduce an interface, as a subtype of JsonArchivist,
that archives the json with a path that includes the jobid. Then we can
test against all relevant handlers as implementations of this interface.

WDYT?

Best,

Xintong


[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/advanced/historyserver/#available-requests



On Thu, Jun 23, 2022 at 5:07 PM Chesnay Schepler  wrote:

> The addition of the /jobs/:jobid/jobmanager/config / environment
> exclusively to the HS is a bit of a strange workaround.
> How do you intend to document those? (and test compatibility)?
>
> Why not just add a general /jobs/:jobid/environment endpoint that works
> just like jobmanager/environment.
> To me that seems like a cleaner solution.
> It is somewhat mentioned as an alternative in the FLIP, but I don't
> understand what is supposed to be confusing about it.
> Whether the job ID is actually used in the end isn't visible after all.
>
> /jobmanager/config could be integrated into /jobs/:jobid/config.
>
> The same approach could maybe be used for logs; not really sure yet (not
> a fan of displaying logs in the HS in the first place).
>
> On 23/06/2022 06:55, junhan yang wrote:
> > Hi all,
> >
> > Thank you all for your feedbacks. As far as I can see, it looks like the
> > discussion on this FLIP has been converged.
> >
> > I will start a new vote thread now.
> >
> > Best regards,
> > Junhan
> >
> > Yangze Guo  于2022年6月17日周五 14:05写道:
> >
> >> Thanks for the input, Jiangang.
> >>
> >> I think it's a valid demand to distinguish completed jobs with the same
> >> name.
> >> - If they are different jobs, I think users need to give them
> >> different meaningful names respectively.
> >> - If they are exactly the same job, IIUC, what you need is to figure
> >> out the order. ApplicationId in Yarn might help. But in this case, you
> >> can just sort them with the start time.
> >>
> >> Best,
> >> Yangze Guo
> >>
> >> On Fri, Jun 17, 2022 at 12:13 PM Jiangang Liu <
> liujiangangp...@gmail.com>
> >> wrote:
> >>> Thanks for the FLIP. It is helpful to track detail infos for completed
> >> jobs.
> >>> I want to ask another question. In our environment, sometimes it is
> hard
> >> to
> >>> distinguish jobs since the same job names may appear multi times in the
> >>> completed jobs. Because a job may run multi times or different jobs
> have
> >>> the same job names. I wonder that wether we can enhance the complete
> jobs
> >>> display with more information, such as applicationId and application
> name
> >>> in yarn. Maybe it is different in k8s to identify a job.
> >>>
> >>> Best
> >>> Jiangang Liu
> >>>
> >>> Yangze Guo  于2022年6月17日周五 11:40写道:
> >>>
>  Thanks for the feedback, Aitozi and Jing.
> 
> > Are each attempts of the TaskManager or JobManager pods (if failure
>  occurs)
>  all be shown in the ui?
> 
>  The info of the prior execution attempts will be archived, you could
>  refer to `ArchivedExecutionVertex$priorExecutions`.
> 
> > It seems that most of these metrics are more interesting to batch
> >> jobs.
>  Does it make sense to calculate them for pure streaming jobs too?
> 
>  All the proposed metrics will be calculated no matter what the job
> >> type is.
> > Why "duration is less interesting" which is mentioned in the FLIP?
>  As a first step, we mainly focus on the most interesting status during
>  the job lifecycle. The duration of final states like FINISHED and
>  CANCELED is meaningless, while abnormal conditions like CANCELING will
>  not be included at the moment.
> 
> > Could you share your thoughts on "accumulated-busy-time"? It should
>  describe the time while the task is working as expected, i.e. the
> happy
>  path. When do we need it for analytics or diagnosis?
> 
>  A task could be busy or idle while it is working. Users may adjust

Re: [VOTE] FLIP-221: Abstraction for lookup source and metric

2022-06-23 Thread Lincoln Lee
+1 (non-binding)


Best,
Lincoln Lee


Alexander Smirnov  于2022年6月24日周五 09:54写道:

> +1 (non-binding)
>
> Best,
> Alexander
>
> чт, 23 июн. 2022 г., 11:49 Qingsheng Ren :
>
> > Hi devs,
> >
> > I’d like to start a vote thread for FLIP-221: Abstraction for lookup
> > source and metric. You can find the discussion thread in [2]*.
> >
> > The vote will be open for at least 72 hours unless there is an objection
> > or not enough binding votes.
> >
> > Thanks everyone participating in the discussion!
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-221%3A+Abstraction+for+lookup+source+cache+and+metric
> > [2] https://lists.apache.org/thread/9c0fbgkofkbfdr5hvs62m0cxd2bkgwho
> > [*] The link to the discussion thread might not include all emails.
> Please
> > search the Apache email archive with keyword "FLIP-221" to get all
> > discussion histories.
> >
> > Best regards,
> > Qingsheng
>


Re:RE: [DISCUSS] Release Kubernetes operator 1.0.1

2022-06-23 Thread Lihe Ma
+1 for this. Thanks Gyula !


Best,
Lihe Ma














At 2022-06-23 21:19:34, "zhengyu chen"  wrote:
>+1
>for the patch release.
>
>-- 
>Best
>
>ConradJam


??????RE: [DISCUSS] Release Kubernetes operator 1.0.1

2022-06-23 Thread 867127831
+1 for the this release. Thanks Gyula!


Best,
Jun He


--  --
??: 
   "dev"