[jira] [Created] (FLINK-32460) Add doc for list procedures

2023-06-28 Thread luoyuxia (Jira)
luoyuxia created FLINK-32460:


 Summary: Add doc for list procedures
 Key: FLINK-32460
 URL: https://issues.apache.org/jira/browse/FLINK-32460
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: luoyuxia






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] Kafka connector code removed from apache/master

2023-06-28 Thread Martijn Visser
Thank you for this!

On Tue, Jun 27, 2023 at 8:57 PM Mason Chen  wrote:

> Hi all,
>
> I would like to inform you that we have removed the Kafka connector code
> from the Flink main repo. This should reduce the developer confusion of
> which repo to submit PRs.
>
> Regarding a few nuances, we have kept the Confluent avro format in the main
> repo. This is because the format is actually connector agnostic. Also, to
> unblock the code removal for the 1.18 release, we have pinned the connector
> versions in tests/examples, and these are the followup items [1][2][3] to
> refactor the main repo code because the Kafka connector dependency is not
> required in these cases.
>
> A big thanks to Gordon, Martjin, and Chesnay who helped review the work
> from the beginning to the end!
>
> [1] https://issues.apache.org/jira/browse/FLINK-32449
> [2] https://issues.apache.org/jira/browse/FLINK-32451
> [3] https://issues.apache.org/jira/browse/FLINK-32452
>
> Best,
> Mason
>


[jira] [Created] (FLINK-32461) manage operator state increase very large

2023-06-28 Thread wgcn (Jira)
wgcn created FLINK-32461:


 Summary: manage operator state increase very large 
 Key: FLINK-32461
 URL: https://issues.apache.org/jira/browse/FLINK-32461
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.17.1
Reporter: wgcn
 Attachments: image-2023-06-28-15-57-52-615.png

 !image-2023-06-28-15-57-39-557.png! 
This issue doesn't usually occur, but it happens during busy nights when the 
machines are more active. The "manage operator state" will increase 
significantly, and I can see that it stores Kafka offsets inside mostly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-06-28 Thread Hang Ruan
Thanks for Dong and Yunfeng's work.

The FLIP looks good to me. This new version is clearer to understand.

Best,
Hang

Dong Lin  于2023年6月27日周二 16:53写道:

> Thanks Jack, Jingsong, and Zhu for the review!
>
> Thanks Zhu for the suggestion. I have updated the configuration name as
> suggested.
>
> On Tue, Jun 27, 2023 at 4:45 PM Zhu Zhu  wrote:
>
> > Thanks Dong and Yunfeng for creating this FLIP and driving this
> discussion.
> >
> > The new design looks generally good to me. Increasing the checkpoint
> > interval when the job is processing backlogs is easier for users to
> > understand and can help in more scenarios.
> >
> > I have one comment about the new configuration.
> > Naming the new configuration
> > "execution.checkpointing.interval-during-backlog" would be better
> > according to Flink config naming convention.
> > It is also because that nested config keys should be avoided. See
> > FLINK-29372 for more details.
> >
> > Thanks,
> > Zhu
> >
> > Jingsong Li  于2023年6月27日周二 15:45写道:
> > >
> > > Looks good to me!
> > >
> > > Thanks Dong, Yunfeng and all for your discussion and design.
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Tue, Jun 27, 2023 at 3:35 PM Jark Wu  wrote:
> > > >
> > > > Thank you Dong for driving this FLIP.
> > > >
> > > > The new design looks good to me!
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > > 2023年6月27日 14:38,Dong Lin  写道:
> > > > >
> > > > > Thank you Leonard for the review!
> > > > >
> > > > > Hi Piotr, do you have any comments on the latest proposal?
> > > > >
> > > > > I am wondering if it is OK to start the voting thread this week.
> > > > >
> > > > > On Mon, Jun 26, 2023 at 4:10 PM Leonard Xu 
> > wrote:
> > > > >
> > > > >> Thanks Dong for driving this FLIP forward!
> > > > >>
> > > > >> Introducing  `backlog status` concept for flink job makes sense to
> > me as
> > > > >> following reasons:
> > > > >>
> > > > >> From concept/API design perspective, it’s more general and natural
> > than
> > > > >> above proposals as it can be used in HybridSource for bounded
> > records, CDC
> > > > >> Source for history snapshot and general sources like KafkaSource
> for
> > > > >> historical messages.
> > > > >>
> > > > >> From user cases/requirements, I’ve seen many users manually to set
> > larger
> > > > >> checkpoint interval during backfilling and then set a shorter
> > checkpoint
> > > > >> interval for real-time processing in their production environments
> > as a
> > > > >> flink application optimization. Now, the flink framework can make
> > this
> > > > >> optimization no longer require the user to set the checkpoint
> > interval and
> > > > >> restart the job multiple times.
> > > > >>
> > > > >> Following supporting using larger checkpoint for job under backlog
> > status
> > > > >> in current FLIP, we can explore supporting larger
> > parallelism/memory/cpu
> > > > >> for job under backlog status in the future.
> > > > >>
> > > > >> In short, the updated FLIP looks good to me.
> > > > >>
> > > > >>
> > > > >> Best,
> > > > >> Leonard
> > > > >>
> > > > >>
> > > > >>> On Jun 22, 2023, at 12:07 PM, Dong Lin 
> > wrote:
> > > > >>>
> > > > >>> Hi Piotr,
> > > > >>>
> > > > >>> Thanks again for proposing the isProcessingBacklog concept.
> > > > >>>
> > > > >>> After discussing with Becket Qin and thinking about this more, I
> > agree it
> > > > >>> is a better idea to add a top-level concept to all source
> > operators to
> > > > >>> address the target use-case.
> > > > >>>
> > > > >>> The main reason that changed my mind is that isProcessingBacklog
> > can be
> > > > >>> described as an inherent/nature attribute of every source
> instance
> > and
> > > > >> its
> > > > >>> semantics does not need to depend on any specific checkpointing
> > policy.
> > > > >>> Also, we can hardcode the isProcessingBacklog behavior for the
> > sources we
> > > > >>> have considered so far (e.g. HybridSource and MySQL CDC source)
> > without
> > > > >>> asking users to explicitly configure the per-source behavior,
> which
> > > > >> indeed
> > > > >>> provides better user experience.
> > > > >>>
> > > > >>> I have updated the FLIP based on the latest suggestions. The
> > latest FLIP
> > > > >> no
> > > > >>> longer introduces per-source config that can be used by
> end-users.
> > While
> > > > >> I
> > > > >>> agree with you that CheckpointTrigger can be a useful feature to
> > address
> > > > >>> additional use-cases, I am not sure it is necessary for the
> > use-case
> > > > >>> targeted by FLIP-309. Maybe we can introduce CheckpointTrigger
> > separately
> > > > >>> in another FLIP?
> > > > >>>
> > > > >>> Can you help take another look at the updated FLIP?
> > > > >>>
> > > > >>> Best,
> > > > >>> Dong
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> On Fri, Jun 16, 2023 at 11:59 PM Piotr Nowojski <
> > pnowoj...@apache.org>
> > > > >>> wrote:
> > > > >>>
> > > >  Hi Dong,
> > > > 
> > > > > Suppose there are 1000 subtask and each subtask has 1% chance
> of
> > bein

[jira] [Created] (FLINK-32462) Kafka shouldn't rely on Flink-Shaded

2023-06-28 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-32462:
--

 Summary: Kafka shouldn't rely on Flink-Shaded
 Key: FLINK-32462
 URL: https://issues.apache.org/jira/browse/FLINK-32462
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / Kafka
Reporter: Martijn Visser






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32463) annotation word spelling error about @return of method shutdownServiceUninterruptible in TimerService

2023-06-28 Thread StephenLin (Jira)
StephenLin created FLINK-32463:
--

 Summary: annotation word spelling error about @return of method 
shutdownServiceUninterruptible in TimerService
 Key: FLINK-32463
 URL: https://issues.apache.org/jira/browse/FLINK-32463
 Project: Flink
  Issue Type: Improvement
Reporter: StephenLin


annotation word spelling error about @return of method 
shutdownServiceUninterruptible in TimerService 
[org.apache.flink.streaming.runtime.tasks.TimerService]

"@return returns true iff the shutdown was completed."

here "iff" should be "if"



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32464) AssertionError when converting between Table and SQL with selection and type cast

2023-06-28 Thread Yunfeng Zhou (Jira)
Yunfeng Zhou created FLINK-32464:


 Summary: AssertionError when converting between Table and SQL with 
selection and type cast
 Key: FLINK-32464
 URL: https://issues.apache.org/jira/browse/FLINK-32464
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Affects Versions: 1.16.1
Reporter: Yunfeng Zhou


In an attempt to convert table between Table API and SQL API using the 
following program

```java
public static void main(String[] args) {
StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

Table table = tEnv.fromValues(1, 2, 3);

tEnv.createTemporaryView("input_table", table);
table = tEnv.sqlQuery("SELECT MAP[f0, 1] AS f1 from input_table");

table = table.select($("f1").cast(DataTypes.MAP(DataTypes.INT(), 
DataTypes.INT(;

tEnv.createTemporaryView("input_table_2", table);
tEnv.sqlQuery("SELECT * from input_table_2");
}
```
The following exception is thrown.

```
Exception in thread "main" java.lang.AssertionError: Conversion to relational 
algebra failed to preserve datatypes:
validated type:
RecordType((INTEGER, INTEGER) MAP NOT NULL f1-MAP) NOT NULL
converted type:
RecordType((INTEGER, INTEGER) MAP f1-MAP) NOT NULL
rel:
LogicalProject(f1-MAP=[CAST(MAP($0, 1)):(INTEGER, INTEGER) MAP])
  LogicalValues(tuples=[[{ 1 }, { 2 }, { 3 }]])

at 
org.apache.calcite.sql2rel.SqlToRelConverter.checkConvertedType(SqlToRelConverter.java:470)
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:582)
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$rel(FlinkPlannerImpl.scala:215)
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.rel(FlinkPlannerImpl.scala:191)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.toQueryOperation(SqlToOperationConverter.java:1498)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlQuery(SqlToOperationConverter.java:1253)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convertValidatedSqlNode(SqlToOperationConverter.java:374)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:262)
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:106)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:703)
at org.apache.flink.streaming.connectors.redis.RedisSinkITCase.main
```

It seems that there is a bug with the Table-SQL conversion and selection 
process when type cast is involved.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32465) KerberosLoginProvider.isLoginPossible does accidental login with keytab

2023-06-28 Thread Gabor Somogyi (Jira)
Gabor Somogyi created FLINK-32465:
-

 Summary: KerberosLoginProvider.isLoginPossible does accidental 
login with keytab
 Key: FLINK-32465
 URL: https://issues.apache.org/jira/browse/FLINK-32465
 Project: Flink
  Issue Type: Bug
  Components: API / Core
Affects Versions: 1.18.0
Reporter: Gabor Somogyi


In KerberosLoginProvider.isLoginPossible there is a call to 
UserGroupInformation.getCurrentUser() before principal check (keytab usage). 
This triggers an accidental login with either kerberos credentials if 
available, or as the local OS user, based on security settings. This is not 
problematic most of the time since KerberosLoginProvider.doLogin overwrites the 
credentials with keytab. The problem hurts however when login fails for 
whatever reason. Such case the workload is just not starting.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32466) Invalid input strategy for CONCAT allows BINARY strings

2023-06-28 Thread Timo Walther (Jira)
Timo Walther created FLINK-32466:


 Summary: Invalid input strategy for CONCAT allows BINARY strings
 Key: FLINK-32466
 URL: https://issues.apache.org/jira/browse/FLINK-32466
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Reporter: Timo Walther
Assignee: Timo Walther


"string" in SQL terms covers both character strings and binary strings. The 
author of CONCAT might not have known this. In any case, the code gen instead 
of the validator fails when executing:

{code}
TableEnvironment t = 
TableEnvironment.create(EnvironmentSettings.inStreamingMode());
t.createTemporaryView("t", t.fromValues(lit(new byte[] {97})));
t.executeSql("SELECT CONCAT(f0, '-magic') FROM t").print();
{code}

As future work, we should also allow binary strings.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-322 Cooldown period for adaptive scheduler

2023-06-28 Thread Chesnay Schepler
> we should schedule a check that will rescale if 
min-parallelism-increase is met. Then, what it the use of 
scaling-interval.max timeout in that context ?


To force a rescale if min-parallelism-increase is not met (but we could 
still run above the current parallelism).


min-parallelism-increase is a trade-off between the cost of rescaling vs 
the performance benefit of the parallelism increase. Over time the 
balance tips more and more in favor of the parallelism increase, hence 
we should eventually rescale anyway even if the minimum isn't met, or at 
least give users the option to do so.


> I meant the opposite: not having only the cooldown but having only 
the stabilization time. I must have missed something because what I 
wonder is: if every rescale entails a restart of the pipeline and every 
restart entails passing in waiting for resources state, then why 
introduce a cooldown when there is already at each rescale a stable 
resource timeout ?


It is technically correct that the stable resource timeout can be used 
to limit the number of rescale operations per interval, however during 
that time the job isn't running, in contrast to the cooldown.


Having both just gives you a lot more flexibility.
"I want at most 1 rescale operation per hour, and wait at most 1 minute 
for resource to stabilize when a rescale happens".

You can't express this with only one of the options.

On 20/06/2023 14:41, Etienne Chauchot wrote:

Hi Chesnay,

Thanks for your feedback. Comments inline

Le 16/06/2023 à 17:24, Chesnay Schepler a écrit :
1) Options specific to the adaptive scheduler should start with 
"jobmanager.adaptive-scheduler".



ok



2)
There isn't /really /a notion of a "scaling event". The scheduler is 
informed about new/lost slots and job failures, and reacts 
accordingly by maybe rescaling the job.
(sure, you can think of these as events, but you can think of 
practically everything as events)


There shouldn't be a queue for events. All the scheduler should have 
to know is that the next rescale check is scheduled for time T, which 
in practice boils down to a flag and a scheduled action that runs 
Executing#maybeRescale.



Makes total sense, its very simple like this. Thanks for the precision 
and pointer. After the related FLIPs, I'll look at the code now.



With that in mind, we also have to look at how we keep this state 
around. Presumably it is scoped to the current state, such that the 
cooldown is reset if a job fails.
Maybe we should add a separate ExecutingWithCooldown state; not sure 
yet.



Yes loosing cooldown state and cooldown reset upon failure is what I 
suggested in point 3 in previous email. Not sure either for a new 
state, I'll figure it out after experimenting with the code. I'll 
update the FLIP then.





It would be good to clarify whether this FLIP only attempts to cover 
scale up operations, or also scale downs in case of slot losses.



When there are slots loss, most of the time it is due to a TM loss so 
there should be several slots lost at the same time but (hopefully) 
only once. There should not be many scale downs in a row (but still 
cascading failures can happen). I think, we should just protect 
against having scale ups immediately following. For that, I think we 
could just keep the current behavior of transitioning to Restarting 
state and then back to Waiting for Resources state. This state will 
protect us against scale ups immediately following failure/restart.





We should also think about how it relates to the externalized 
declarative resource management. Should we always rescale 
immediately? Should we wait until the cooldown is over?



It relates to point 2, no ? we should rescale immediately only if last 
rescale was done more than scaling-interval.min ago otherwise schedule 
a rescale at last-rescale + scaling-interval.min time.



Related to this, there's the min-parallelism-increase option, that if 
for example set to "2" restricts rescale operations to only occur if 
the parallelism increases by at least 2.



yes I saw that in the code



Ideally however there would be a max timeout for this.

As such we could maybe think about this a bit differently:
Add 2 new options instead of 1:
jobmanager.adaptive-scheduler.scaling-interval.min: The minimum time 
the scheduler will wait for the next effective rescale operations.
jobmanager.adaptive-scheduler.scaling-interval.max: The maximum time 
the scheduler will wait for the next effective rescale operations.



At point 2, we said that when slots change (requirements change or new 
slots available), if last rescale check (call to maybeRescale) was 
done less than scaling-interval.min ago, we should schedule a check 
that will rescale if min-parallelism-increase is met. Then, what it 
the use of scaling-interval.max timeout in that context ?





3) It sounds fine that we lose the cooldown state, because imo we 
want to reset the cooldown anyway on job failures (because a job 
failure inheren

[VOTE] FLIP-303: Support REPLACE TABLE AS SELECT statement

2023-06-28 Thread yuxia
Hi everyone, 
Thanks for all the feedback about FLIP-303: Support REPLACE TABLE AS SELECT 
statement[1]. Based on the discussion [2], we have come to a consensus, so I 
would like to start a vote. 
The vote will be open for at least 72 hours (until July 3th, 10:00AM GMT) 
unless there is an objection or an insufficient number of votes. 


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement
 
[2] https://lists.apache.org/thread/39mwckdsdgck48tzsdfm66hhnxorjtz3 


Best regards, 
Yuxia 


[jira] [Created] (FLINK-32467) Move CleanupOnCloseRpcSystem to rpc-core

2023-06-28 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-32467:


 Summary: Move CleanupOnCloseRpcSystem to rpc-core
 Key: FLINK-32467
 URL: https://issues.apache.org/jira/browse/FLINK-32467
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / RPC
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.18.0


This class is useful for any rpc system implementation and should thus be 
shared.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32468) Replace Akka by Pekko

2023-06-28 Thread Konstantin Knauf (Jira)
Konstantin Knauf created FLINK-32468:


 Summary: Replace Akka by Pekko
 Key: FLINK-32468
 URL: https://issues.apache.org/jira/browse/FLINK-32468
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Reporter: Konstantin Knauf
 Fix For: 1.18.0


Akka 2.6.x will not receive security fixes from September 2023 onwards (see 
https://discuss.lightbend.com/t/2-6-x-maintenance-proposal/9949). 

A mid-term plan to replace Akka is described in FLINK-29281. In the meantime, 
we suggest to replace Akka by Apache Pekko (incubating), which is a fork of 
Akka 2.6.x under the Apache 2.0 license. This way - if needed - we at least 
have the ability to release security fixes ourselves in collaboration with the 
Pekko community. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re:[VOTE] FLIP-303: Support REPLACE TABLE AS SELECT statement

2023-06-28 Thread Mang Zhang
+1 (no-binding)


--

Best regards,
Mang Zhang





At 2023-06-28 17:48:15, "yuxia"  wrote:
>Hi everyone, 
>Thanks for all the feedback about FLIP-303: Support REPLACE TABLE AS SELECT 
>statement[1]. Based on the discussion [2], we have come to a consensus, so I 
>would like to start a vote. 
>The vote will be open for at least 72 hours (until July 3th, 10:00AM GMT) 
>unless there is an objection or an insufficient number of votes. 
>
>
>[1] 
>https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement
> 
>[2] https://lists.apache.org/thread/39mwckdsdgck48tzsdfm66hhnxorjtz3 
>
>
>Best regards, 
>Yuxia 


[jira] [Created] (FLINK-32469) Simplify the implementation of the checkpoint handlers

2023-06-28 Thread Hong Liang Teoh (Jira)
Hong Liang Teoh created FLINK-32469:
---

 Summary: Simplify the implementation of the checkpoint handlers
 Key: FLINK-32469
 URL: https://issues.apache.org/jira/browse/FLINK-32469
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / REST
Affects Versions: 1.17.1, 1.16.2
Reporter: Hong Liang Teoh
 Fix For: 1.18.0


*What*

The checkpoint handlers currently retrieve checkpoint information from the 
`ExecutionGraph`, which is cached in the `AbstractExecutionGraphHandler`. This 
means that this information is potentially stale (depending on the 
`web.refresh-interval`, which defaults to 3s).

 

*Why*

We want to enable programmatic use of the REST API, independent of the Flink 
dashboard.

The current configuration of the `ExecutionGraph` cache is meant to facilitate 
a fluid user experience of the Flink dashboard. On the Job details page, the 
Flink dashboard makes a series of requests (e.g. /jobs/\{jobid}, 
/jobs/\{jobid}/vertices/\{vertexid}){color:#172b4d}. {color}

{color:#172b4d}To ensure that the requests return consistent results, we have 
the execution graph cache.{color}
 
 
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32470) SqlValidateException should be exposed as ValidationException

2023-06-28 Thread Timo Walther (Jira)
Timo Walther created FLINK-32470:


 Summary: SqlValidateException should be exposed as 
ValidationException
 Key: FLINK-32470
 URL: https://issues.apache.org/jira/browse/FLINK-32470
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API, Table SQL / Planner
Reporter: Timo Walther
Assignee: Timo Walther


{{ValidationException}} is the main exception of the Table API and SQL in case 
the user did something wrong. Most exceptions are wrapped into 
{{ValidationException}}.

Since the parser module has no access to it, it introduces a custom 
{{SqlValidateException}}. However, this should not be exposed to users. It 
should only serve as an intermediate exception that is translated in 
{{FlinkPlannerImpl#validate}}.

{{SqlParserException}} and {{SqlParserEOFException}} could also be simplified 
to {{ValidationException}} but at least they are correctly annotated and 
located in the {{o.a.f.table.api}} package.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32471) IS_NOT_NULL can add to SUITABLE_FILTER_TO_PUSH

2023-06-28 Thread grandfisher (Jira)
grandfisher created FLINK-32471:
---

 Summary: IS_NOT_NULL can add to SUITABLE_FILTER_TO_PUSH
 Key: FLINK-32471
 URL: https://issues.apache.org/jira/browse/FLINK-32471
 Project: Flink
  Issue Type: Improvement
Reporter: grandfisher


According to FLINK-31273:

The reason for the error is that other filters conflict with IS_NULL, but in 
fact it won't conflict with IS_NOT_NULL, because operators in 
SUITABLE_FILTER_TO_PUSH  such as 'SqlKind.GREATER_THAN'  has an implicit filter 
'IS_NOT_NULL' according to SQL Semantics.
 
So we think it is feasible to add  IS_NOT_NULL to the SUITABLE_FILTER_TO_PUSH 
list.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-303: Support REPLACE TABLE AS SELECT statement

2023-06-28 Thread Jing Ge
+1(binding)

On Wed, Jun 28, 2023 at 1:51 PM Mang Zhang  wrote:

> +1 (no-binding)
>
>
> --
>
> Best regards,
> Mang Zhang
>
>
>
>
>
> At 2023-06-28 17:48:15, "yuxia"  wrote:
> >Hi everyone,
> >Thanks for all the feedback about FLIP-303: Support REPLACE TABLE AS
> SELECT statement[1]. Based on the discussion [2], we have come to a
> consensus, so I would like to start a vote.
> >The vote will be open for at least 72 hours (until July 3th, 10:00AM GMT)
> unless there is an objection or an insufficient number of votes.
> >
> >
> >[1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement
> >[2] https://lists.apache.org/thread/39mwckdsdgck48tzsdfm66hhnxorjtz3
> >
> >
> >Best regards,
> >Yuxia
>


[DISCUSSION] test connectors against Flink master in PRs

2023-06-28 Thread Etienne Chauchot

Hi all,

Connectors are external to flink. As such, they need to be tested 
against stable (released) versions of Flink.


But I was wondering if it would make sense to test connectors in PRs 
also against latest flink master snapshot to allow to discover failures 
before merging the PRs, ** while the author is still available **, 
rather than discovering them in nightly tests (that test against 
snapshot) after the merge. That would allow the author to anticipate 
potential failures and provide more future proof code (even if master is 
subject to change before the connector release).


Of course, if a breaking change was introduced in master, such tests 
will fail. But they should be considered as a preview of how the code 
will behave against the current snapshot of the next flink version.


WDYT ?


Best

Etienne


Re: [VOTE] FLIP-303: Support REPLACE TABLE AS SELECT statement

2023-06-28 Thread Feng Jin
+1 (no-binding)


Best
Feng

On Wed, Jun 28, 2023 at 11:03 PM Jing Ge  wrote:

> +1(binding)
>
> On Wed, Jun 28, 2023 at 1:51 PM Mang Zhang  wrote:
>
> > +1 (no-binding)
> >
> >
> > --
> >
> > Best regards,
> > Mang Zhang
> >
> >
> >
> >
> >
> > At 2023-06-28 17:48:15, "yuxia"  wrote:
> > >Hi everyone,
> > >Thanks for all the feedback about FLIP-303: Support REPLACE TABLE AS
> > SELECT statement[1]. Based on the discussion [2], we have come to a
> > consensus, so I would like to start a vote.
> > >The vote will be open for at least 72 hours (until July 3th, 10:00AM
> GMT)
> > unless there is an objection or an insufficient number of votes.
> > >
> > >
> > >[1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement
> > >[2] https://lists.apache.org/thread/39mwckdsdgck48tzsdfm66hhnxorjtz3
> > >
> > >
> > >Best regards,
> > >Yuxia
> >
>


[jira] [Created] (FLINK-32472) FLIP-308: Support Time Travel

2023-06-28 Thread Feng Jin (Jira)
Feng Jin created FLINK-32472:


 Summary: FLIP-308: Support Time Travel
 Key: FLINK-32472
 URL: https://issues.apache.org/jira/browse/FLINK-32472
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Reporter: Feng Jin


Umbrella issue for 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-308%3A+Support+Time+Travel



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32473) Introduce base interfaces for time travel

2023-06-28 Thread Feng Jin (Jira)
Feng Jin created FLINK-32473:


 Summary: Introduce base interfaces for time travel
 Key: FLINK-32473
 URL: https://issues.apache.org/jira/browse/FLINK-32473
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32474) Support time travel in table planner

2023-06-28 Thread Feng Jin (Jira)
Feng Jin created FLINK-32474:


 Summary: Support time travel in table planner 
 Key: FLINK-32474
 URL: https://issues.apache.org/jira/browse/FLINK-32474
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32475) Add doc for time travel

2023-06-28 Thread Feng Jin (Jira)
Feng Jin created FLINK-32475:


 Summary: Add doc for time travel
 Key: FLINK-32475
 URL: https://issues.apache.org/jira/browse/FLINK-32475
 Project: Flink
  Issue Type: Sub-task
Reporter: Feng Jin






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Final Reminder: Community Over Code call for presentations closing soon

2023-06-28 Thread Rich Bowen
[Note: You're receiving this email because you are subscribed to one or
more project dev@ mailing lists at the Apache Software Foundation.]

This is your final reminder that the Call for Presentations for
Community Over Code (formerly known as ApacheCon) is closing soon - on
Thursday, 13 July 2023 at 23:59:59 GMT.

https://communityovercode.org/call-for-presentations/

We are looking for talk proposals on all topics related to ASF projects
and open source software.

The event will be held in Halifax, Nova Scotia, Octiber 7th through
10th. More details about the event may be found on the event website at
https://communityovercode.org/

Rich, for the event planners


Re: [DISCUSS] FLIP-314: Support Customized Job Lineage Listener

2023-06-28 Thread Jing Ge
Hi Shammon,

Thanks for your proposal. After reading the FLIP, I'd like to ask
some questions to make sure we are on the same page. Thanks!

1. TableColumnLineageRelation#sinkColumn() should return
TableColumnLineageEntity instead of String, right?

2. Since LineageRelation already contains all information to build the
lineage between sources and sink, do we still need to set the LineageEntity
in the source?

3. About the "Entity" and "Relation" naming, I was confused too, like
Qingsheng mentioned. How about LineageVertex, LineageEdge, and LineageEdges
which contains multiple LineageEdge? E.g. multiple sources join into one
sink, or, edges of columns from one or different tables, etc.

Best regards,
Jing

On Sun, Jun 25, 2023 at 2:06 PM Shammon FY  wrote:

> Hi yuxia and Yun,
>
> Thanks for your input.
>
> For yuxia:
> > 1: What kinds of JobStatus will the `JobExecutionStatusEven` including?
>
> At present, we only need to notify the listener when a job goes to
> termination, but I think it makes sense to add generic `oldStatus` and
> `newStatus` in the listener and users can update the job state in their
> service as needed.
>
> > 2: I'm really confused about the `config()` included in `LineageEntity`,
> where is it from and what is it for ?
>
> The `config` in `LineageEntity` is used for users to get options for source
> and sink connectors. As the examples in the FLIP, users can add
> server/group/topic information in the config for kafka and create lineage
> entities for `DataStream` jobs, then the listeners can get this information
> to identify the same connector in different jobs. Otherwise, the `config`
> in `TableLineageEntity` will be the same as `getOptions` in
> `CatalogBaseTable`.
>
> > 3: Regardless whether `inputChangelogMode` in `TableSinkLineageEntity` is
> needed or not, since `TableSinkLineageEntity` contains
> `inputChangelogMode`, why `TableSourceLineageEntity` don't contain
> changelogmode?
>
> At present, we do not actually use the changelog mode. It can be deleted,
> and I have updated FLIP.
>
> > Btw, since there're a lot interfaces proposed, I think it'll be better to
> give an example about how to implement a listener in this FLIP to make us
> know better about the interfaces.
>
> I have added the example in the FLIP and the related interfaces and
> examples are in branch [1].
>
> For Yun:
> > I have one more question on the lookup-join dim tables, it seems this
> FLIP does not touch them, and will them become part of the
> List sources() or adding another interface?
>
> You're right, currently lookup join dim tables were not considered in the
> 'proposed changed' section of this FLIP. But the interface for lineage is
> universal and we can give `TableLookupSourceLineageEntity` which implements
> `TableSourceLineageEntity` in the future without modifying the public
> interface.
>
> > By the way, if you want to focus on job lineage instead of data column
> lineage in this FLIP, why we must introduce so many column-lineage related
> interface here?
>
> The lineage information in SQL jobs includes table lineage and column
> lineage. Although SQL jobs currently do not support column lineage, we
> would like to support this in the next step. So we have comprehensively
> considered the table lineage and column lineage interfaces here, and
> defined these two interfaces together clearly
>
>
> [1]
>
> https://github.com/FangYongs/flink/commit/d4bfe57e7a5315b790e79b8acef8b11e82c9187c
>
> Best,
> Shammon FY
>
>
> On Sun, Jun 25, 2023 at 4:17 PM Yun Tang  wrote:
>
> > Hi Shammon,
> >
> > I like the idea in general and it will help to analysis the job lineages
> > no matter FlinkSQL or Flink jar jobs in production environments.
> >
> > For Qingsheng's concern, I'd like the name of JobType more than
> > RuntimeExecutionMode, as the latter one is not easy to understand for
> users.
> >
> > I have one more question on the lookup-join dim tables, it seems this
> FLIP
> > does not touch them, and will them become part of the List
> > sources()​ or adding another interface?
> >
> > By the way, if you want to focus on job lineage instead of data column
> > lineage in this FLIP, why we must introduce so many column-lineage
> related
> > interface here?
> >
> >
> > Best
> > Yun Tang
> > 
> > From: Shammon FY 
> > Sent: Sunday, June 25, 2023 16:13
> > To: dev@flink.apache.org 
> > Subject: Re: [DISCUSS] FLIP-314: Support Customized Job Lineage Listener
> >
> > Hi Qingsheng,
> >
> > Thanks for your valuable feedback.
> >
> > > 1. Is there any specific use case to expose the batch / streaming info
> to
> > listeners or meta services?
> >
> > I agree with you that Flink is evolving towards batch-streaming
> > unification, but the lifecycle of them is different. If a job processes a
> > bound dataset, it will end after completing the data processing,
> otherwise,
> > it will run for a long time. In our scenario, we will regularly schedule
> > some Flink jobs to process boun

Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-06-28 Thread feng xiangyu
Hi Dong and Yunfeng,

Thanks for the proposal, your flip sounds very useful from my perspective.
In our business, when we using hybrid source in production we also met the
problem described in your flip.
In our solution, we tend to skip making any checkpoints before all batch
tasks have finished and resume the periodic checkpoint only in streaming
phrase. Within this flip, we can solve our problem in a more generic way.

However, I am wondering if we still want to skip making any checkpoints
during historical phrase, can we set this configuration
"execution.checkpointing.interval-during-backlog" equals "-1" to cover this
case?

Best,
Xiangyu

Hang Ruan  于2023年6月28日周三 16:30写道:

> Thanks for Dong and Yunfeng's work.
>
> The FLIP looks good to me. This new version is clearer to understand.
>
> Best,
> Hang
>
> Dong Lin  于2023年6月27日周二 16:53写道:
>
> > Thanks Jack, Jingsong, and Zhu for the review!
> >
> > Thanks Zhu for the suggestion. I have updated the configuration name as
> > suggested.
> >
> > On Tue, Jun 27, 2023 at 4:45 PM Zhu Zhu  wrote:
> >
> > > Thanks Dong and Yunfeng for creating this FLIP and driving this
> > discussion.
> > >
> > > The new design looks generally good to me. Increasing the checkpoint
> > > interval when the job is processing backlogs is easier for users to
> > > understand and can help in more scenarios.
> > >
> > > I have one comment about the new configuration.
> > > Naming the new configuration
> > > "execution.checkpointing.interval-during-backlog" would be better
> > > according to Flink config naming convention.
> > > It is also because that nested config keys should be avoided. See
> > > FLINK-29372 for more details.
> > >
> > > Thanks,
> > > Zhu
> > >
> > > Jingsong Li  于2023年6月27日周二 15:45写道:
> > > >
> > > > Looks good to me!
> > > >
> > > > Thanks Dong, Yunfeng and all for your discussion and design.
> > > >
> > > > Best,
> > > > Jingsong
> > > >
> > > > On Tue, Jun 27, 2023 at 3:35 PM Jark Wu  wrote:
> > > > >
> > > > > Thank you Dong for driving this FLIP.
> > > > >
> > > > > The new design looks good to me!
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > > > 2023年6月27日 14:38,Dong Lin  写道:
> > > > > >
> > > > > > Thank you Leonard for the review!
> > > > > >
> > > > > > Hi Piotr, do you have any comments on the latest proposal?
> > > > > >
> > > > > > I am wondering if it is OK to start the voting thread this week.
> > > > > >
> > > > > > On Mon, Jun 26, 2023 at 4:10 PM Leonard Xu 
> > > wrote:
> > > > > >
> > > > > >> Thanks Dong for driving this FLIP forward!
> > > > > >>
> > > > > >> Introducing  `backlog status` concept for flink job makes sense
> to
> > > me as
> > > > > >> following reasons:
> > > > > >>
> > > > > >> From concept/API design perspective, it’s more general and
> natural
> > > than
> > > > > >> above proposals as it can be used in HybridSource for bounded
> > > records, CDC
> > > > > >> Source for history snapshot and general sources like KafkaSource
> > for
> > > > > >> historical messages.
> > > > > >>
> > > > > >> From user cases/requirements, I’ve seen many users manually to
> set
> > > larger
> > > > > >> checkpoint interval during backfilling and then set a shorter
> > > checkpoint
> > > > > >> interval for real-time processing in their production
> environments
> > > as a
> > > > > >> flink application optimization. Now, the flink framework can
> make
> > > this
> > > > > >> optimization no longer require the user to set the checkpoint
> > > interval and
> > > > > >> restart the job multiple times.
> > > > > >>
> > > > > >> Following supporting using larger checkpoint for job under
> backlog
> > > status
> > > > > >> in current FLIP, we can explore supporting larger
> > > parallelism/memory/cpu
> > > > > >> for job under backlog status in the future.
> > > > > >>
> > > > > >> In short, the updated FLIP looks good to me.
> > > > > >>
> > > > > >>
> > > > > >> Best,
> > > > > >> Leonard
> > > > > >>
> > > > > >>
> > > > > >>> On Jun 22, 2023, at 12:07 PM, Dong Lin 
> > > wrote:
> > > > > >>>
> > > > > >>> Hi Piotr,
> > > > > >>>
> > > > > >>> Thanks again for proposing the isProcessingBacklog concept.
> > > > > >>>
> > > > > >>> After discussing with Becket Qin and thinking about this more,
> I
> > > agree it
> > > > > >>> is a better idea to add a top-level concept to all source
> > > operators to
> > > > > >>> address the target use-case.
> > > > > >>>
> > > > > >>> The main reason that changed my mind is that
> isProcessingBacklog
> > > can be
> > > > > >>> described as an inherent/nature attribute of every source
> > instance
> > > and
> > > > > >> its
> > > > > >>> semantics does not need to depend on any specific checkpointing
> > > policy.
> > > > > >>> Also, we can hardcode the isProcessingBacklog behavior for the
> > > sources we
> > > > > >>> have considered so far (e.g. HybridSource and MySQL CDC source)
> > > without
> > > > > >>> asking users to explicitly configure the per-source behavior,
> > which
> > > > > >> indeed
> >

[jira] [Created] (FLINK-32476) Support configuring object-reuse for internal operators

2023-06-28 Thread Xuannan Su (Jira)
Xuannan Su created FLINK-32476:
--

 Summary: Support configuring object-reuse for internal operators
 Key: FLINK-32476
 URL: https://issues.apache.org/jira/browse/FLINK-32476
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Task
Reporter: Xuannan Su


Currently, object reuse is disabled by default for streaming jobs in order to 
prevent unexpected behavior. Object reuse becomes problematic when the upstream 
operator stores its output while the downstream operator modifies the input.

However, many operators implemented by Flink, such as Flink SQL operators, do 
not modify the input. This implies that it is safe to reuse the input object in 
such cases. Therefore, we intend to enable object reuse specifically for 
operators that do not modify the input.

As the first step, we will focus on the operators implemented within Flink. We 
will create the FLIP to introduce the API that allows user-defined operators to 
enable object reuse in the future.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32477) flink-connector-jdbc  Fixed missing words in documentation

2023-06-28 Thread baxinyu (Jira)
baxinyu created FLINK-32477:
---

 Summary: flink-connector-jdbc  Fixed missing words in documentation
 Key: FLINK-32477
 URL: https://issues.apache.org/jira/browse/FLINK-32477
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.17.0
Reporter: baxinyu
 Attachments: flink-20230629103251.jpg





--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-06-28 Thread Dong Lin
Hi Feng,

Thanks for the feedback. Yes, you can configure the
execution.checkpointing.interval-during-backlog to effectively disable
checkpoint during backlog.

Prior to your comment, the FLIP allows users to do this by setting the
config value to something large (e.g. 365 day). After thinking about this
more, we think it is more usable to allow users to achieve this goal by
setting the config value to 0. This is consistent with the existing
behavior of execution.checkpointing.interval -- the checkpoint is disabled
if user set execution.checkpointing.interval to 0.

We have updated the description of
execution.checkpointing.interval-during-backlog
to say the following:
... it is not null, the value must either be 0, which means the checkpoint
is disabled during backlog, or be larger than or equal to
execution.checkpointing.interval.

Does this address your need?

Best,
Dong



On Thu, Jun 29, 2023 at 9:23 AM feng xiangyu  wrote:

> Hi Dong and Yunfeng,
>
> Thanks for the proposal, your flip sounds very useful from my perspective.
> In our business, when we using hybrid source in production we also met the
> problem described in your flip.
> In our solution, we tend to skip making any checkpoints before all batch
> tasks have finished and resume the periodic checkpoint only in streaming
> phrase. Within this flip, we can solve our problem in a more generic way.
>
> However, I am wondering if we still want to skip making any checkpoints
> during historical phrase, can we set this configuration
> "execution.checkpointing.interval-during-backlog" equals "-1" to cover this
> case?
>
> Best,
> Xiangyu
>
> Hang Ruan  于2023年6月28日周三 16:30写道:
>
> > Thanks for Dong and Yunfeng's work.
> >
> > The FLIP looks good to me. This new version is clearer to understand.
> >
> > Best,
> > Hang
> >
> > Dong Lin  于2023年6月27日周二 16:53写道:
> >
> > > Thanks Jack, Jingsong, and Zhu for the review!
> > >
> > > Thanks Zhu for the suggestion. I have updated the configuration name as
> > > suggested.
> > >
> > > On Tue, Jun 27, 2023 at 4:45 PM Zhu Zhu  wrote:
> > >
> > > > Thanks Dong and Yunfeng for creating this FLIP and driving this
> > > discussion.
> > > >
> > > > The new design looks generally good to me. Increasing the checkpoint
> > > > interval when the job is processing backlogs is easier for users to
> > > > understand and can help in more scenarios.
> > > >
> > > > I have one comment about the new configuration.
> > > > Naming the new configuration
> > > > "execution.checkpointing.interval-during-backlog" would be better
> > > > according to Flink config naming convention.
> > > > It is also because that nested config keys should be avoided. See
> > > > FLINK-29372 for more details.
> > > >
> > > > Thanks,
> > > > Zhu
> > > >
> > > > Jingsong Li  于2023年6月27日周二 15:45写道:
> > > > >
> > > > > Looks good to me!
> > > > >
> > > > > Thanks Dong, Yunfeng and all for your discussion and design.
> > > > >
> > > > > Best,
> > > > > Jingsong
> > > > >
> > > > > On Tue, Jun 27, 2023 at 3:35 PM Jark Wu  wrote:
> > > > > >
> > > > > > Thank you Dong for driving this FLIP.
> > > > > >
> > > > > > The new design looks good to me!
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > > > 2023年6月27日 14:38,Dong Lin  写道:
> > > > > > >
> > > > > > > Thank you Leonard for the review!
> > > > > > >
> > > > > > > Hi Piotr, do you have any comments on the latest proposal?
> > > > > > >
> > > > > > > I am wondering if it is OK to start the voting thread this
> week.
> > > > > > >
> > > > > > > On Mon, Jun 26, 2023 at 4:10 PM Leonard Xu 
> > > > wrote:
> > > > > > >
> > > > > > >> Thanks Dong for driving this FLIP forward!
> > > > > > >>
> > > > > > >> Introducing  `backlog status` concept for flink job makes
> sense
> > to
> > > > me as
> > > > > > >> following reasons:
> > > > > > >>
> > > > > > >> From concept/API design perspective, it’s more general and
> > natural
> > > > than
> > > > > > >> above proposals as it can be used in HybridSource for bounded
> > > > records, CDC
> > > > > > >> Source for history snapshot and general sources like
> KafkaSource
> > > for
> > > > > > >> historical messages.
> > > > > > >>
> > > > > > >> From user cases/requirements, I’ve seen many users manually to
> > set
> > > > larger
> > > > > > >> checkpoint interval during backfilling and then set a shorter
> > > > checkpoint
> > > > > > >> interval for real-time processing in their production
> > environments
> > > > as a
> > > > > > >> flink application optimization. Now, the flink framework can
> > make
> > > > this
> > > > > > >> optimization no longer require the user to set the checkpoint
> > > > interval and
> > > > > > >> restart the job multiple times.
> > > > > > >>
> > > > > > >> Following supporting using larger checkpoint for job under
> > backlog
> > > > status
> > > > > > >> in current FLIP, we can explore supporting larger
> > > > parallelism/memory/cpu
> > > > > > >> for job under backlog status in the future.
> > > > > > >>
> > > > > >

Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-06-28 Thread Dong Lin
Thanks everyone (and specifically Piotr) for your valuable suggestions and
review!

We will open the voting thread for this FLIP.  We hope to make this feature
available in Flink 1.18 release, which will feature freeze on July 11.

Piotr: we will create a followup FLIP (probably in FLIP-328
)
to allow users to determine isBacklog dynamically based on the event-time
lag and/or source backpressure metrics.



On Thu, Jun 29, 2023 at 10:49 AM Dong Lin  wrote:

> Hi Feng,
>
> Thanks for the feedback. Yes, you can configure the
> execution.checkpointing.interval-during-backlog to effectively disable
> checkpoint during backlog.
>
> Prior to your comment, the FLIP allows users to do this by setting the
> config value to something large (e.g. 365 day). After thinking about this
> more, we think it is more usable to allow users to achieve this goal by
> setting the config value to 0. This is consistent with the existing
> behavior of execution.checkpointing.interval -- the checkpoint is
> disabled if user set execution.checkpointing.interval to 0.
>
> We have updated the description of 
> execution.checkpointing.interval-during-backlog
> to say the following:
> ... it is not null, the value must either be 0, which means the checkpoint
> is disabled during backlog, or be larger than or equal to
> execution.checkpointing.interval.
>
> Does this address your need?
>
> Best,
> Dong
>
>
>
> On Thu, Jun 29, 2023 at 9:23 AM feng xiangyu  wrote:
>
>> Hi Dong and Yunfeng,
>>
>> Thanks for the proposal, your flip sounds very useful from my perspective.
>> In our business, when we using hybrid source in production we also met the
>> problem described in your flip.
>> In our solution, we tend to skip making any checkpoints before all batch
>> tasks have finished and resume the periodic checkpoint only in streaming
>> phrase. Within this flip, we can solve our problem in a more generic way.
>>
>> However, I am wondering if we still want to skip making any checkpoints
>> during historical phrase, can we set this configuration
>> "execution.checkpointing.interval-during-backlog" equals "-1" to cover
>> this
>> case?
>>
>> Best,
>> Xiangyu
>>
>> Hang Ruan  于2023年6月28日周三 16:30写道:
>>
>> > Thanks for Dong and Yunfeng's work.
>> >
>> > The FLIP looks good to me. This new version is clearer to understand.
>> >
>> > Best,
>> > Hang
>> >
>> > Dong Lin  于2023年6月27日周二 16:53写道:
>> >
>> > > Thanks Jack, Jingsong, and Zhu for the review!
>> > >
>> > > Thanks Zhu for the suggestion. I have updated the configuration name
>> as
>> > > suggested.
>> > >
>> > > On Tue, Jun 27, 2023 at 4:45 PM Zhu Zhu  wrote:
>> > >
>> > > > Thanks Dong and Yunfeng for creating this FLIP and driving this
>> > > discussion.
>> > > >
>> > > > The new design looks generally good to me. Increasing the checkpoint
>> > > > interval when the job is processing backlogs is easier for users to
>> > > > understand and can help in more scenarios.
>> > > >
>> > > > I have one comment about the new configuration.
>> > > > Naming the new configuration
>> > > > "execution.checkpointing.interval-during-backlog" would be better
>> > > > according to Flink config naming convention.
>> > > > It is also because that nested config keys should be avoided. See
>> > > > FLINK-29372 for more details.
>> > > >
>> > > > Thanks,
>> > > > Zhu
>> > > >
>> > > > Jingsong Li  于2023年6月27日周二 15:45写道:
>> > > > >
>> > > > > Looks good to me!
>> > > > >
>> > > > > Thanks Dong, Yunfeng and all for your discussion and design.
>> > > > >
>> > > > > Best,
>> > > > > Jingsong
>> > > > >
>> > > > > On Tue, Jun 27, 2023 at 3:35 PM Jark Wu  wrote:
>> > > > > >
>> > > > > > Thank you Dong for driving this FLIP.
>> > > > > >
>> > > > > > The new design looks good to me!
>> > > > > >
>> > > > > > Best,
>> > > > > > Jark
>> > > > > >
>> > > > > > > 2023年6月27日 14:38,Dong Lin  写道:
>> > > > > > >
>> > > > > > > Thank you Leonard for the review!
>> > > > > > >
>> > > > > > > Hi Piotr, do you have any comments on the latest proposal?
>> > > > > > >
>> > > > > > > I am wondering if it is OK to start the voting thread this
>> week.
>> > > > > > >
>> > > > > > > On Mon, Jun 26, 2023 at 4:10 PM Leonard Xu > >
>> > > > wrote:
>> > > > > > >
>> > > > > > >> Thanks Dong for driving this FLIP forward!
>> > > > > > >>
>> > > > > > >> Introducing  `backlog status` concept for flink job makes
>> sense
>> > to
>> > > > me as
>> > > > > > >> following reasons:
>> > > > > > >>
>> > > > > > >> From concept/API design perspective, it’s more general and
>> > natural
>> > > > than
>> > > > > > >> above proposals as it can be used in HybridSource for bounded
>> > > > records, CDC
>> > > > > > >> Source for history snapshot and general sources like
>> KafkaSource
>> > > for
>> > > > > > >> historical messages.
>> > > > > > >>
>> > > > > > >> From user cases/requirements, I’ve seen many users manua

[VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-06-28 Thread Dong Lin
Hi all,

We would like to start the vote for FLIP-309: Support using larger
checkpointing interval when source is processing backlog [1]. This FLIP was
discussed in this thread [2].

Flink 1.18 release will feature freeze on July 11. We hope to make this
feature available in Flink 1.18.

The vote will be open until at least July 4th (at least 72 hours), following
the consensus voting process.

Cheers,
Yunfeng and Dong

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-309%3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
[2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37


Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-06-28 Thread feng xiangyu
Hi Dong,

Thanks for your quick reply. I think this has truly solved our problem and
will enable us upgrade our existing jobs more seamless.

Best,
Xiangyu

Dong Lin  于2023年6月29日周四 10:50写道:

> Hi Feng,
>
> Thanks for the feedback. Yes, you can configure the
> execution.checkpointing.interval-during-backlog to effectively disable
> checkpoint during backlog.
>
> Prior to your comment, the FLIP allows users to do this by setting the
> config value to something large (e.g. 365 day). After thinking about this
> more, we think it is more usable to allow users to achieve this goal by
> setting the config value to 0. This is consistent with the existing
> behavior of execution.checkpointing.interval -- the checkpoint is disabled
> if user set execution.checkpointing.interval to 0.
>
> We have updated the description of
> execution.checkpointing.interval-during-backlog
> to say the following:
> ... it is not null, the value must either be 0, which means the checkpoint
> is disabled during backlog, or be larger than or equal to
> execution.checkpointing.interval.
>
> Does this address your need?
>
> Best,
> Dong
>
>
>
> On Thu, Jun 29, 2023 at 9:23 AM feng xiangyu  wrote:
>
> > Hi Dong and Yunfeng,
> >
> > Thanks for the proposal, your flip sounds very useful from my
> perspective.
> > In our business, when we using hybrid source in production we also met
> the
> > problem described in your flip.
> > In our solution, we tend to skip making any checkpoints before all batch
> > tasks have finished and resume the periodic checkpoint only in streaming
> > phrase. Within this flip, we can solve our problem in a more generic way.
> >
> > However, I am wondering if we still want to skip making any checkpoints
> > during historical phrase, can we set this configuration
> > "execution.checkpointing.interval-during-backlog" equals "-1" to cover
> this
> > case?
> >
> > Best,
> > Xiangyu
> >
> > Hang Ruan  于2023年6月28日周三 16:30写道:
> >
> > > Thanks for Dong and Yunfeng's work.
> > >
> > > The FLIP looks good to me. This new version is clearer to understand.
> > >
> > > Best,
> > > Hang
> > >
> > > Dong Lin  于2023年6月27日周二 16:53写道:
> > >
> > > > Thanks Jack, Jingsong, and Zhu for the review!
> > > >
> > > > Thanks Zhu for the suggestion. I have updated the configuration name
> as
> > > > suggested.
> > > >
> > > > On Tue, Jun 27, 2023 at 4:45 PM Zhu Zhu  wrote:
> > > >
> > > > > Thanks Dong and Yunfeng for creating this FLIP and driving this
> > > > discussion.
> > > > >
> > > > > The new design looks generally good to me. Increasing the
> checkpoint
> > > > > interval when the job is processing backlogs is easier for users to
> > > > > understand and can help in more scenarios.
> > > > >
> > > > > I have one comment about the new configuration.
> > > > > Naming the new configuration
> > > > > "execution.checkpointing.interval-during-backlog" would be better
> > > > > according to Flink config naming convention.
> > > > > It is also because that nested config keys should be avoided. See
> > > > > FLINK-29372 for more details.
> > > > >
> > > > > Thanks,
> > > > > Zhu
> > > > >
> > > > > Jingsong Li  于2023年6月27日周二 15:45写道:
> > > > > >
> > > > > > Looks good to me!
> > > > > >
> > > > > > Thanks Dong, Yunfeng and all for your discussion and design.
> > > > > >
> > > > > > Best,
> > > > > > Jingsong
> > > > > >
> > > > > > On Tue, Jun 27, 2023 at 3:35 PM Jark Wu 
> wrote:
> > > > > > >
> > > > > > > Thank you Dong for driving this FLIP.
> > > > > > >
> > > > > > > The new design looks good to me!
> > > > > > >
> > > > > > > Best,
> > > > > > > Jark
> > > > > > >
> > > > > > > > 2023年6月27日 14:38,Dong Lin  写道:
> > > > > > > >
> > > > > > > > Thank you Leonard for the review!
> > > > > > > >
> > > > > > > > Hi Piotr, do you have any comments on the latest proposal?
> > > > > > > >
> > > > > > > > I am wondering if it is OK to start the voting thread this
> > week.
> > > > > > > >
> > > > > > > > On Mon, Jun 26, 2023 at 4:10 PM Leonard Xu <
> xbjt...@gmail.com>
> > > > > wrote:
> > > > > > > >
> > > > > > > >> Thanks Dong for driving this FLIP forward!
> > > > > > > >>
> > > > > > > >> Introducing  `backlog status` concept for flink job makes
> > sense
> > > to
> > > > > me as
> > > > > > > >> following reasons:
> > > > > > > >>
> > > > > > > >> From concept/API design perspective, it’s more general and
> > > natural
> > > > > than
> > > > > > > >> above proposals as it can be used in HybridSource for
> bounded
> > > > > records, CDC
> > > > > > > >> Source for history snapshot and general sources like
> > KafkaSource
> > > > for
> > > > > > > >> historical messages.
> > > > > > > >>
> > > > > > > >> From user cases/requirements, I’ve seen many users manually
> to
> > > set
> > > > > larger
> > > > > > > >> checkpoint interval during backfilling and then set a
> shorter
> > > > > checkpoint
> > > > > > > >> interval for real-time processing in their production
> > > environments
> > > > > as a
> > > > > > > >> flink application optimizat

Re: [VOTE] Apache Flink ML Release 2.3.0, release candidate #1

2023-06-28 Thread Xin Jiang
Hi Dong,

Thanks for driving this release.

+1 (non-binding)

- Verified that the checksums and GPG files.
- Verified that the source distributions do not contain any binaries.
- Built the source distribution and run all unit tests.
- Verified that all POM files point to the same version.
- Browsed through JIRA release notes files.
- Browsed through README.md files.


Best Regards,
Xin


Re: [VOTE] FLIP-303: Support REPLACE TABLE AS SELECT statement

2023-06-28 Thread liu ron
+1(no-binding)

Best,
Ron

Feng Jin  于2023年6月29日周四 00:22写道:

> +1 (no-binding)
>
>
> Best
> Feng
>
> On Wed, Jun 28, 2023 at 11:03 PM Jing Ge 
> wrote:
>
> > +1(binding)
> >
> > On Wed, Jun 28, 2023 at 1:51 PM Mang Zhang  wrote:
> >
> > > +1 (no-binding)
> > >
> > >
> > > --
> > >
> > > Best regards,
> > > Mang Zhang
> > >
> > >
> > >
> > >
> > >
> > > At 2023-06-28 17:48:15, "yuxia"  wrote:
> > > >Hi everyone,
> > > >Thanks for all the feedback about FLIP-303: Support REPLACE TABLE AS
> > > SELECT statement[1]. Based on the discussion [2], we have come to a
> > > consensus, so I would like to start a vote.
> > > >The vote will be open for at least 72 hours (until July 3th, 10:00AM
> > GMT)
> > > unless there is an objection or an insufficient number of votes.
> > > >
> > > >
> > > >[1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement
> > > >[2] https://lists.apache.org/thread/39mwckdsdgck48tzsdfm66hhnxorjtz3
> > > >
> > > >
> > > >Best regards,
> > > >Yuxia
> > >
> >
>


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-06-28 Thread Jingsong Li
+1 binding

On Thu, Jun 29, 2023 at 11:03 AM Dong Lin  wrote:
>
> Hi all,
>
> We would like to start the vote for FLIP-309: Support using larger
> checkpointing interval when source is processing backlog [1]. This FLIP was
> discussed in this thread [2].
>
> Flink 1.18 release will feature freeze on July 11. We hope to make this
> feature available in Flink 1.18.
>
> The vote will be open until at least July 4th (at least 72 hours), following
> the consensus voting process.
>
> Cheers,
> Yunfeng and Dong
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-309%3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
> [2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

2023-06-28 Thread Leonard Xu
+1 (binding)

Best,
Leonard

> On Jun 29, 2023, at 1:25 PM, Jingsong Li  wrote:
> 
> +1 binding
> 
> On Thu, Jun 29, 2023 at 11:03 AM Dong Lin  wrote:
>> 
>> Hi all,
>> 
>> We would like to start the vote for FLIP-309: Support using larger
>> checkpointing interval when source is processing backlog [1]. This FLIP was
>> discussed in this thread [2].
>> 
>> Flink 1.18 release will feature freeze on July 11. We hope to make this
>> feature available in Flink 1.18.
>> 
>> The vote will be open until at least July 4th (at least 72 hours), following
>> the consensus voting process.
>> 
>> Cheers,
>> Yunfeng and Dong
>> 
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-309%3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
>> [2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37



Re: [VOTE] FLIP-303: Support REPLACE TABLE AS SELECT statement

2023-06-28 Thread Lincoln Lee
+1 (binding)

Best,
Lincoln Lee


liu ron  于2023年6月29日周四 13:22写道:

> +1(no-binding)
>
> Best,
> Ron
>
> Feng Jin  于2023年6月29日周四 00:22写道:
>
> > +1 (no-binding)
> >
> >
> > Best
> > Feng
> >
> > On Wed, Jun 28, 2023 at 11:03 PM Jing Ge 
> > wrote:
> >
> > > +1(binding)
> > >
> > > On Wed, Jun 28, 2023 at 1:51 PM Mang Zhang  wrote:
> > >
> > > > +1 (no-binding)
> > > >
> > > >
> > > > --
> > > >
> > > > Best regards,
> > > > Mang Zhang
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > At 2023-06-28 17:48:15, "yuxia"  wrote:
> > > > >Hi everyone,
> > > > >Thanks for all the feedback about FLIP-303: Support REPLACE TABLE AS
> > > > SELECT statement[1]. Based on the discussion [2], we have come to a
> > > > consensus, so I would like to start a vote.
> > > > >The vote will be open for at least 72 hours (until July 3th, 10:00AM
> > > GMT)
> > > > unless there is an objection or an insufficient number of votes.
> > > > >
> > > > >
> > > > >[1]
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-303%3A+Support+REPLACE+TABLE+AS+SELECT+statement
> > > > >[2]
> https://lists.apache.org/thread/39mwckdsdgck48tzsdfm66hhnxorjtz3
> > > > >
> > > > >
> > > > >Best regards,
> > > > >Yuxia
> > > >
> > >
> >
>


[jira] [Created] (FLINK-32478) SourceCoordinatorAlignmentTest.testAnnounceCombinedWatermarkWithoutStart fails

2023-06-28 Thread Rui Fan (Jira)
Rui Fan created FLINK-32478:
---

 Summary: 
SourceCoordinatorAlignmentTest.testAnnounceCombinedWatermarkWithoutStart fails
 Key: FLINK-32478
 URL: https://issues.apache.org/jira/browse/FLINK-32478
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Common
Affects Versions: 1.18.0, 1.16.3, 1.17.2
Reporter: Rui Fan
Assignee: Rui Fan


SourceCoordinatorAlignmentTest.testAnnounceCombinedWatermarkWithoutStart fails

 

Root cause: multiple sources share the same thread pool, and the second source 
cannot start due to the first source closes the shared thread pool.

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=50611&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=24c3384f-1bcb-57b3-224f-51bf973bbee8&l=8613



--
This message was sent by Atlassian Jira
(v8.20.10#820010)