[jira] [Created] (FLINK-25125) Verify that downloaded artifact is not corrupt

2021-12-01 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-25125:
--

 Summary: Verify that downloaded artifact is not corrupt
 Key: FLINK-25125
 URL: https://issues.apache.org/jira/browse/FLINK-25125
 Project: Flink
  Issue Type: Improvement
  Components: Test Infrastructure
Reporter: Martijn Visser


The bash-based e2e tests use the `get_artifact` function to either grab an 
artifact from the cache or download it for the purpose of testing. We should 
add a verification that the downloaded artifact is not corrupt.
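
A minimal sketch of what such a check could look like, shown in Java for brevity (the actual get_artifact function is bash). It assumes a SHA-512 checksum published alongside the artifact, as Apache mirrors provide; the class and method names are illustrative only:

{code:java}
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

public final class ArtifactVerifier {

    /** Streams the artifact through SHA-512 and compares against the expected hex digest. */
    public static boolean matchesSha512(Path artifact, String expectedHex) throws Exception {
        MessageDigest digest = MessageDigest.getInstance("SHA-512");
        try (InputStream in = Files.newInputStream(artifact)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder actualHex = new StringBuilder();
        for (byte b : digest.digest()) {
            actualHex.append(String.format("%02x", b));
        }
        return actualHex.toString().equalsIgnoreCase(expectedHex.trim());
    }
}
{code}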



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25126) when SET 'execution.runtime-mode' = 'batch' and 'sink.delivery-guarantee' = 'exactly-once', Kafka connector commit will fail

2021-12-01 Thread eric yu (Jira)
eric yu created FLINK-25126:
---

 Summary: when SET 'execution.runtime-mode' = 'batch' and 
'sink.delivery-guarantee' = 'exactly-once', Kafka connector commit will fail
 Key: FLINK-25126
 URL: https://issues.apache.org/jira/browse/FLINK-25126
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.14.0
 Environment: SET 'execution.runtime-mode' = 'batch';

 CREATE TABLE ka15 (
     name String,
     cnt bigint
 ) WITH (
   'connector' = 'kafka',
   'topic' = 'shifang',
   'properties.bootstrap.servers' = 'flinkx1:9092',
   'properties.transaction.timeout.ms' = '80',
   'properties.max.block.ms' = '30',
   'value.format' = 'json',
   'sink.parallelism' = '2',
   'sink.delivery-guarantee' = 'exactly-once',
   'sink.transactional-id-prefix' = 'dtstack');

 

insert into ka15 SELECT
  name,
  cnt
FROM
  (VALUES ('Bob',100), ('Alice',100), ('Greg',100), ('Bob',100)) AS NameTable(name,cnt);
Reporter: eric yu


A Flink SQL task submitted via the SQL client will fail:
 
Caused by: java.lang.IllegalStateException
at org.apache.flink.util.Preconditions.checkState(Preconditions.java:177)
at org.apache.flink.connector.kafka.sink.FlinkKafkaInternalProducer.setTransactionId(FlinkKafkaInternalProducer.java:164)
at org.apache.flink.connector.kafka.sink.KafkaCommitter.getRecoveryProducer(KafkaCommitter.java:144)
at org.apache.flink.connector.kafka.sink.KafkaCommitter.lambda$commit$0(KafkaCommitter.java:76)
at java.util.Optional.orElseGet(Optional.java:267)
at org.apache.flink.connector.kafka.sink.KafkaCommitter.commit(KafkaCommitter.java:76)
... 14 more
 
 
I found the reason why the Kafka commit fails: while the downstream 
CommitterOperator is committing the transaction, the upstream SinkOperator has 
already closed, which aborts the transaction that the CommitterOperator is still 
committing. This occurs when execution.runtime-mode is batch.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25127) Reuse a single Collection in GlobalWindows#assignWindows

2021-12-01 Thread bx123 (Jira)
bx123 created FLINK-25127:
-

 Summary: Reuse a single Collection in GlobalWindows#assignWindows
 Key: FLINK-25127
 URL: https://issues.apache.org/jira/browse/FLINK-25127
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Reporter: bx123


When we use GlobalWindows, the window assigner creates a new collection for 
each incoming stream record. This is unnecessary, as the collection is immutable 
and GlobalWindow is a stateless singleton. We can add a static collection field 
to GlobalWindows and return it, to take pressure off the GC.

 

private static final Collection<GlobalWindow> GLOBAL_WINDOW_COLLECTION =
        Collections.singletonList(GlobalWindow.get());

@Override
public Collection<GlobalWindow> assignWindows(
        Object element, long timestamp, WindowAssignerContext context) {
    return GLOBAL_WINDOW_COLLECTION;
}

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] Deprecate Java 8 support

2021-12-01 Thread Chesnay Schepler
That is correct, yes. During the deprecation period we won't see any 
improvements on our end.


Something that I could envision is that the docker images will default 
to Java 11 after a while, and that we put a bigger emphasis on 
performance on Java 11+ than on Java 8.


On 01/12/2021 02:31, wenlong.lwl wrote:
hi, @Chesnay Schepler  would you explain 
more about what would happen when deprecating Java 8, but still 
support it. IMO, if we still generate packages based on Java 8 which 
seems to be a consensus, we still can not take the advantages you 
mentioned even if we announce that Java 8 support is deprecated.



Best,
Wenlong

On Mon, 29 Nov 2021 at 17:22, Marios Trivyzas  wrote:

+1 from me as well on the Java 8 deprecation!
It's important to make the users aware, and "force" them but also the
communities of other
related projects (like the aforementioned Hive) to start preparing
for the
future drop of Java 8
support and the upgrade to the recent stable versions.


On Sun, Nov 28, 2021 at 11:15 PM Thomas Weise  wrote:

> +1 for Java 8 deprecation. It's an important signal for users and we
> need to give sufficient time to adopt. Thanks Chesnay for
starting the
> discussion! Maybe user@ can be included into this discussion?
>
> Thomas
>
>
> On Fri, Nov 26, 2021 at 6:49 AM Becket Qin
 wrote:
> >
> > Thanks for raising the discussion, Chesnay.
> >
> > I think it is OK to deprecate Java 8 to let the users know
that Java 11
> > migration should be put into the schedule. However, According
to some of
> > the statistics of the Java version adoption[1], a large number
of users
> are
> > still using Java 8 in production. I doubt that Java 8 users
will drop to
> a
> > negligible amount within the next 2 - 3 Flink releases. I
would suggest
> > making the time to drop Java 8 support flexible.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > [1] https://www.infoq.com/news/2021/07/snyk-jvm-2021/
> >
> > On Fri, Nov 26, 2021 at 5:09 AM Till Rohrmann

> wrote:
> >
> > > +1 for the deprecation and reaching out to the user ML to
ask for
> feedback
> > > from our users. Thanks for driving this Chesnay!
> > >
> > > Cheers,
> > > Till
> > >
> > > On Thu, Nov 25, 2021 at 10:15 AM Roman Khachatryan

> > > wrote:
> > >
> > > > The situation is probably a bit different now compared to the
> previous
> > > > upgrade: some users might be using Amazon Corretto (or
other builds)
> > > > which have longer support.
> > > >
> > > > Still +1 for deprecation to trigger migration, and thanks for
> bringing
> > > > this up!
> > > >
> > > > Regards,
> > > > Roman
> > > >
> > > > On Thu, Nov 25, 2021 at 10:09 AM Arvid Heise

> wrote:
> > > > >
> > > > > +1 to deprecate Java 8, so we can hopefully incorporate
the module
> > > > concept
> > > > > in Flink.
> > > > >
> > > > > On Thu, Nov 25, 2021 at 9:49 AM Chesnay Schepler <
> ches...@apache.org>
> > > > wrote:
> > > > >
> > > > > > Users can already use APIs from Java 8/11.
> > > > > >
> > > > > > On 25/11/2021 09:35, Francesco Guardiani wrote:
> > > > > > > +1 with what both Ingo and Matthias said, personally,
I cannot
> wait
> > > to
> > > > > > start using some of
> > > > > > > the APIs introduced in Java 9. And I'm pretty sure
that's the
> same
> > > > for
> > > > > > our users as well.
> > > > > > >
> > > > > > > On Tuesday, 23 November 2021 13:35:07 CET Ingo Bürk
wrote:
> > > > > > >> Hi everyone,
> > > > > > >>
> > > > > > >> continued support for Java 8 can also create
project risks,
> e.g.
> > > if
> > > > a
> > > > > > >> vulnerability arises in Flink's dependencies and we
cannot
> upgrade
> > > > them
> > > > > > >> because they no longer support Java 8. Some
projects already
> > > started
> > > > > > >> deprecating support as well, like Kafka, and other
projects
> will
> > > > likely
> > > > > > >> follow.
> > > > > > >> Let's also keep in mind that the proposal here is
not to drop
> > > > support
> > > > > > right
> > > > > > >> away, but to deprecate it, send the message, and
motivate
> users to
> > > > start
> > > > > > >> migrating. Delaying this process could ironically
mean users
> have
> > > > less
> > > > > > time
> > > > > > >> to prepare for it.
> > > > > > >>
> > > > > > >>
> > > > > > >> Ingo
> > > > > > >>
> > > > > > >> On Tue, Nov 23, 2021 at 8:54 AM Matthias Pohl <
> > > > matth...@ververica.com>
> > > > > > >>
> > > > > > >> wrote:
> > > > > > >>> Thanks for constantly driv

Re: [DISCUSS] Deprecate Java 8 support

2021-12-01 Thread Jing Ge
Thanks Chesnay for bringing this up for discussion. +1 for Java 8
deprecation. Every decision has pros and cons. All concerns mentioned
previously are fair enough. I think that the active support for Java 8
ending in 4 months will have an impact on the projects that still stick
with Java 8 and on Flink users too. Flink should not be the last one to do
the migration. It's good timing to move forward.

Best regards
Jing

On Wed, Dec 1, 2021 at 2:31 AM wenlong.lwl  wrote:

> hi, @Chesnay Schepler  would you explain more about
> what would happen when deprecating Java 8, but still support it. IMO, if we
> still generate packages based on Java 8 which seems to be a  consensus, we
> still can not take the advantages you mentioned even if we announce that
> Java 8 support is deprecated.
>
>
> Best,
> Wenlong
>
> On Mon, 29 Nov 2021 at 17:22, Marios Trivyzas  wrote:
>
> > +1 from me as well on the Java 8 deprecation!
> > It's important to make the users aware, and "force" them but also the
> > communities of other
> > related projects (like the aforementioned Hive) to start preparing for
> the
> > future drop of Java 8
> > support and the upgrade to the recent stable versions.
> >
> >
> > On Sun, Nov 28, 2021 at 11:15 PM Thomas Weise  wrote:
> >
> > > +1 for Java 8 deprecation. It's an important signal for users and we
> > > need to give sufficient time to adopt. Thanks Chesnay for starting the
> > > discussion! Maybe user@ can be included into this discussion?
> > >
> > > Thomas
> > >
> > >
> > > On Fri, Nov 26, 2021 at 6:49 AM Becket Qin 
> wrote:
> > > >
> > > > Thanks for raising the discussion, Chesnay.
> > > >
> > > > I think it is OK to deprecate Java 8 to let the users know that Java
> 11
> > > > migration should be put into the schedule. However, According to some
> > of
> > > > the statistics of the Java version adoption[1], a large number of
> users
> > > are
> > > > still using Java 8 in production. I doubt that Java 8 users will drop
> > to
> > > a
> > > > negligible amount within the next 2 - 3 Flink releases. I would
> suggest
> > > > making the time to drop Java 8 support flexible.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > [1] https://www.infoq.com/news/2021/07/snyk-jvm-2021/
> > > >
> > > > On Fri, Nov 26, 2021 at 5:09 AM Till Rohrmann 
> > > wrote:
> > > >
> > > > > +1 for the deprecation and reaching out to the user ML to ask for
> > > feedback
> > > > > from our users. Thanks for driving this Chesnay!
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Thu, Nov 25, 2021 at 10:15 AM Roman Khachatryan <
> ro...@apache.org
> > >
> > > > > wrote:
> > > > >
> > > > > > The situation is probably a bit different now compared to the
> > > previous
> > > > > > upgrade: some users might be using Amazon Corretto (or other
> builds)
> > > > > > which have longer support.
> > > > > >
> > > > > > Still +1 for deprecation to trigger migration, and thanks for
> > > bringing
> > > > > > this up!
> > > > > >
> > > > > > Regards,
> > > > > > Roman
> > > > > >
> > > > > > On Thu, Nov 25, 2021 at 10:09 AM Arvid Heise 
> > > wrote:
> > > > > > >
> > > > > > > +1 to deprecate Java 8, so we can hopefully incorporate the
> > module
> > > > > > concept
> > > > > > > in Flink.
> > > > > > >
> > > > > > > On Thu, Nov 25, 2021 at 9:49 AM Chesnay Schepler <
> > > ches...@apache.org>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Users can already use APIs from Java 8/11.
> > > > > > > >
> > > > > > > > On 25/11/2021 09:35, Francesco Guardiani wrote:
> > > > > > > > > +1 with what both Ingo and Matthias said, personally, I
> cannot
> > > wait
> > > > > to
> > > > > > > > start using some of
> > > > > > > > > the APIs introduced in Java 9. And I'm pretty sure that's
> the
> > > same
> > > > > > for
> > > > > > > > our users as well.
> > > > > > > > >
> > > > > > > > > On Tuesday, 23 November 2021 13:35:07 CET Ingo Bürk wrote:
> > > > > > > > >> Hi everyone,
> > > > > > > > >>
> > > > > > > > >> continued support for Java 8 can also create project
> risks,
> > > e.g.
> > > > > if
> > > > > > a
> > > > > > > > >> vulnerability arises in Flink's dependencies and we cannot
> > > upgrade
> > > > > > them
> > > > > > > > >> because they no longer support Java 8. Some projects
> already
> > > > > started
> > > > > > > > >> deprecating support as well, like Kafka, and other
> projects
> > > will
> > > > > > likely
> > > > > > > > >> follow.
> > > > > > > > >> Let's also keep in mind that the proposal here is not to
> > drop
> > > > > > support
> > > > > > > > right
> > > > > > > > >> away, but to deprecate it, send the message, and motivate
> > > users to
> > > > > > start
> > > > > > > > >> migrating. Delaying this process could ironically mean
> users
> > > have
> > > > > > less
> > > > > > > > time
> > > > > > > > >> to prepare for it.
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> Ingo
> > > > > > > > >>
> > > > > > > > >> On Tue, Nov 23, 2021 at 8:54 AM Matthias Pohl 

Flink 1.15. Bi-weekly 2021-11-30

2021-12-01 Thread Johannes Moser
Dear Flink Community,

Today we had the second release sync for the 1.15 release.

Some hard facts:

There are 43 features listed, 3 of which are already done. We have now introduced a 
progress percentage for each feature; averaging it (neglecting the size of each 
feature), we end up at 20%. Please update the state (and the 'updated' column) 
before the next bi-weekly.
Issues that are being worked on should be set to "in progress".

Blockers and instabilities
--
We got a total of 125 test instabilities but managed to fix most of the 
build-system-related ones. All open blockers and the most frequently failing 
tests have been assigned.

Have a look at the 1.15 release page for more facts [1].

The next bi-weekly will happen on the 14th of December at 9am CET.

Best,
Till, Yun Gao, Joe


[1] https://cwiki.apache.org/confluence/display/FLINK/1.15+Release

[jira] [Created] (FLINK-25128) Introduce flink-table-planner-loader

2021-12-01 Thread Francesco Guardiani (Jira)
Francesco Guardiani created FLINK-25128:
---

 Summary: Introduce flink-table-planner-loader
 Key: FLINK-25128
 URL: https://issues.apache.org/jira/browse/FLINK-25128
 Project: Flink
  Issue Type: Sub-task
Reporter: Francesco Guardiani


For more details, see 
https://docs.google.com/document/d/12yDUCnvcwU2mODBKTHQ1xhfOq1ujYUrXltiN_rbhT34/edit?usp=sharing



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25129) Update docs and examples to use flink-table-planner-loader instead of flink-table-planner

2021-12-01 Thread Francesco Guardiani (Jira)
Francesco Guardiani created FLINK-25129:
---

 Summary: Update docs and examples to use 
flink-table-planner-loader instead of flink-table-planner
 Key: FLINK-25129
 URL: https://issues.apache.org/jira/browse/FLINK-25129
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Examples, Table SQL / API
Reporter: Francesco Guardiani


For more details, see 
https://docs.google.com/document/d/12yDUCnvcwU2mODBKTHQ1xhfOq1ujYUrXltiN_rbhT34/edit?usp=sharing



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25130) Update flink-table-uber to ship flink-table-planner-loader instead of flink-table-planner

2021-12-01 Thread Francesco Guardiani (Jira)
Francesco Guardiani created FLINK-25130:
---

 Summary: Update flink-table-uber to ship 
flink-table-planner-loader instead of flink-table-planner
 Key: FLINK-25130
 URL: https://issues.apache.org/jira/browse/FLINK-25130
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Francesco Guardiani


This should also be tested via the SQL client.

This change should then be tested by our e2e tests, specifically 
flink-end-to-end-tests/flink-batch-sql-test.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25131) Update sql client to ship flink-table-planner-loader instead of flink-table-planner

2021-12-01 Thread Francesco Guardiani (Jira)
Francesco Guardiani created FLINK-25131:
---

 Summary: Update sql client to ship flink-table-planner-loader 
instead of flink-table-planner
 Key: FLINK-25131
 URL: https://issues.apache.org/jira/browse/FLINK-25131
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Client
Reporter: Francesco Guardiani


See the parent task for more details.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [ANNOUNCE] Open source of remote shuffle project for Flink batch data processing

2021-12-01 Thread Yingjie Cao
Hi Jiangang,

Great to hear that. You are welcome to work together with us to make the project better.

Best,
Yingjie

刘建刚  于2021年12月1日周三 下午3:27写道:

> Good work for flink's batch processing!
> Remote shuffle service can resolve the container lost problem and reduce
> the running time for batch jobs once failover. We have investigated the
> component a lot and welcome Flink's native solution. We will try it and
> help improve it.
>
> Thanks,
> Liu Jiangang
>
> Yingjie Cao  于2021年11月30日周二 下午9:33写道:
>
> > Hi dev & users,
> >
> > We are happy to announce the open source of remote shuffle project [1]
> for
> > Flink. The project originated in Alibaba and the main motivation is to
> > improve batch data processing for both performance & stability and
> further
> > embrace cloud native. For more features about the project, please refer
> to
> > [1].
> >
> > Before going open source, the project has been used widely in production
> > and it behaves well on both stability and performance. We hope you enjoy
> > it. Collaborations and feedbacks are highly appreciated.
> >
> > Best,
> > Yingjie on behalf of all contributors
> >
> > [1] https://github.com/flink-extended/flink-remote-shuffle
> >
>


[jira] [Created] (FLINK-25132) KafkaSource cannot work with object-reusing DeserializationSchema

2021-12-01 Thread Qingsheng Ren (Jira)
Qingsheng Ren created FLINK-25132:
-

 Summary: KafkaSource cannot work with object-reusing 
DeserializationSchema
 Key: FLINK-25132
 URL: https://issues.apache.org/jira/browse/FLINK-25132
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.14.0
Reporter: Qingsheng Ren
 Fix For: 1.14.1


Currently the Kafka source deserializes ConsumerRecords in the split reader and puts 
them into the elementQueue; the task's main thread then polls these records from 
the queue asynchronously. This mechanism cannot cooperate with 
DeserializationSchemas that reuse objects: all records sitting in the element 
queue point to the same object.

A solution would be to move deserialization to the RecordEmitter, which runs in the 
task's main thread.

Note that this issue actually affects all sources that deserialize in the split 
reader.
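
For illustration, a minimal, Flink-independent sketch of the hazard (all names here are made up; this is not Flink code):

{code:java}
import java.util.ArrayDeque;
import java.util.Queue;

public class ObjectReuseHazard {

    static final class Event {
        long value;
    }

    /** A deserializer with object reuse: it always returns the same mutable instance. */
    static final class ReusingDeserializer {
        private final Event reused = new Event();

        Event deserialize(long raw) {
            reused.value = raw;
            return reused;
        }
    }

    public static void main(String[] args) {
        ReusingDeserializer deserializer = new ReusingDeserializer();
        Queue<Event> elementQueue = new ArrayDeque<>();

        // The "split reader" deserializes eagerly and enqueues the results.
        for (long raw = 1; raw <= 3; raw++) {
            elementQueue.add(deserializer.deserialize(raw));
        }

        // The "task main thread" polls later: every element is the same object,
        // so all of them carry the value of the last deserialized record.
        while (!elementQueue.isEmpty()) {
            System.out.println(elementQueue.poll().value); // prints 3, 3, 3
        }
    }
}
{code}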



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] Deprecate Java 8 support

2021-12-01 Thread wenlong.lwl
Thanks for the explanation.

+1 for putting a bigger emphasis on performance on Java 11+. It would be
great if Flink is already well prepared when users want to do the upgrade.


Best,
Wenlong

On Wed, 1 Dec 2021 at 16:54, Chesnay Schepler  wrote:

> That is correct, yes. During the deprecation period we won't see any
> improvements on our end.
>
> Something that I could envision is that the docker images will default to
> Java 11 after a while, and that we put a bigger emphasis on performance on
> Java 11+ than on Java 8.
>
> On 01/12/2021 02:31, wenlong.lwl wrote:
>
> hi, @Chesnay Schepler  would you explain more about
> what would happen when deprecating Java 8, but still support it. IMO, if we
> still generate packages based on Java 8 which seems to be a  consensus,
> we still can not take the advantages you mentioned even if we announce that
> Java 8 support is deprecated.
>
>
> Best,
> Wenlong
>
> On Mon, 29 Nov 2021 at 17:22, Marios Trivyzas  wrote:
>
>> +1 from me as well on the Java 8 deprecation!
>> It's important to make the users aware, and "force" them but also the
>> communities of other
>> related projects (like the aforementioned Hive) to start preparing for the
>> future drop of Java 8
>> support and the upgrade to the recent stable versions.
>>
>>
>> On Sun, Nov 28, 2021 at 11:15 PM Thomas Weise  wrote:
>>
>> > +1 for Java 8 deprecation. It's an important signal for users and we
>> > need to give sufficient time to adopt. Thanks Chesnay for starting the
>> > discussion! Maybe user@ can be included into this discussion?
>> >
>> > Thomas
>> >
>> >
>> > On Fri, Nov 26, 2021 at 6:49 AM Becket Qin 
>> wrote:
>> > >
>> > > Thanks for raising the discussion, Chesnay.
>> > >
>> > > I think it is OK to deprecate Java 8 to let the users know that Java
>> 11
>> > > migration should be put into the schedule. However, According to some
>> of
>> > > the statistics of the Java version adoption[1], a large number of
>> users
>> > are
>> > > still using Java 8 in production. I doubt that Java 8 users will drop
>> to
>> > a
>> > > negligible amount within the next 2 - 3 Flink releases. I would
>> suggest
>> > > making the time to drop Java 8 support flexible.
>> > >
>> > > Thanks,
>> > >
>> > > Jiangjie (Becket) Qin
>> > >
>> > > [1] https://www.infoq.com/news/2021/07/snyk-jvm-2021/
>> > >
>> > > On Fri, Nov 26, 2021 at 5:09 AM Till Rohrmann 
>> > wrote:
>> > >
>> > > > +1 for the deprecation and reaching out to the user ML to ask for
>> > feedback
>> > > > from our users. Thanks for driving this Chesnay!
>> > > >
>> > > > Cheers,
>> > > > Till
>> > > >
>> > > > On Thu, Nov 25, 2021 at 10:15 AM Roman Khachatryan <
>> ro...@apache.org>
>> > > > wrote:
>> > > >
>> > > > > The situation is probably a bit different now compared to the
>> > previous
>> > > > > upgrade: some users might be using Amazon Corretto (or other
>> builds)
>> > > > > which have longer support.
>> > > > >
>> > > > > Still +1 for deprecation to trigger migration, and thanks for
>> > bringing
>> > > > > this up!
>> > > > >
>> > > > > Regards,
>> > > > > Roman
>> > > > >
>> > > > > On Thu, Nov 25, 2021 at 10:09 AM Arvid Heise 
>> > wrote:
>> > > > > >
>> > > > > > +1 to deprecate Java 8, so we can hopefully incorporate the
>> module
>> > > > > concept
>> > > > > > in Flink.
>> > > > > >
>> > > > > > On Thu, Nov 25, 2021 at 9:49 AM Chesnay Schepler <
>> > ches...@apache.org>
>> > > > > wrote:
>> > > > > >
>> > > > > > > Users can already use APIs from Java 8/11.
>> > > > > > >
>> > > > > > > On 25/11/2021 09:35, Francesco Guardiani wrote:
>> > > > > > > > +1 with what both Ingo and Matthias said, personally, I
>> cannot
>> > wait
>> > > > to
>> > > > > > > start using some of
>> > > > > > > > the APIs introduced in Java 9. And I'm pretty sure that's
>> the
>> > same
>> > > > > for
>> > > > > > > our users as well.
>> > > > > > > >
>> > > > > > > > On Tuesday, 23 November 2021 13:35:07 CET Ingo Bürk wrote:
>> > > > > > > >> Hi everyone,
>> > > > > > > >>
>> > > > > > > >> continued support for Java 8 can also create project risks,
>> > e.g.
>> > > > if
>> > > > > a
>> > > > > > > >> vulnerability arises in Flink's dependencies and we cannot
>> > upgrade
>> > > > > them
>> > > > > > > >> because they no longer support Java 8. Some projects
>> already
>> > > > started
>> > > > > > > >> deprecating support as well, like Kafka, and other projects
>> > will
>> > > > > likely
>> > > > > > > >> follow.
>> > > > > > > >> Let's also keep in mind that the proposal here is not to
>> drop
>> > > > > support
>> > > > > > > right
>> > > > > > > >> away, but to deprecate it, send the message, and motivate
>> > users to
>> > > > > start
>> > > > > > > >> migrating. Delaying this process could ironically mean
>> users
>> > have
>> > > > > less
>> > > > > > > time
>> > > > > > > >> to prepare for it.
>> > > > > > > >>
>> > > > > > > >>
>> > > > > > > >> Ingo
>> > > > > > > >>
>> > > > > > > >> On Tue, Nov 23, 2021 at 8:54 AM Matthias Pohl <
>> > > > > matth...@

[jira] [Created] (FLINK-25133) Introduce a Version class that keeps track of all the released versions

2021-12-01 Thread Marios Trivyzas (Jira)
Marios Trivyzas created FLINK-25133:
---

 Summary: Introduce a Version class that keeps track of all the 
released versions
 Key: FLINK-25133
 URL: https://issues.apache.org/jira/browse/FLINK-25133
 Project: Flink
  Issue Type: Improvement
  Components: Release System
Reporter: Marios Trivyzas


It would be helpful to introduce a *Version* (or {*}FlinkVersion{*}) class where 
all the releases (including hotfix ones) are listed. This would be useful in 
various places like tests, classes used to generate docs, etc.

It will become very handy with the upcoming upgrade story for Table/SQL.

 

Currently there is a `MigrationVersion` in `flink-core`, but it only keeps minor 
versions and is used in a limited scope.

 

Inspired by Elasticsearch repo: 
https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/Version.java
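
A minimal sketch of what such a class could look like; the name, shape, and version list are illustrative only:

{code:java}
public enum FlinkVersion {
    // Released versions, including hotfix releases, declared in release order.
    V1_13_0(1, 13, 0),
    V1_13_3(1, 13, 3),
    V1_14_0(1, 14, 0);

    private final int major;
    private final int minor;
    private final int hotfix;

    FlinkVersion(int major, int minor, int hotfix) {
        this.major = major;
        this.minor = minor;
        this.hotfix = hotfix;
    }

    /** Enum declaration order doubles as release order. */
    public boolean isAtLeast(FlinkVersion other) {
        return compareTo(other) >= 0;
    }

    @Override
    public String toString() {
        return major + "." + minor + "." + hotfix;
    }
}
{code}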



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25134) Unused RetryRule in KafkaConsumerTestBase swallows retries

2021-12-01 Thread Fabian Paul (Jira)
Fabian Paul created FLINK-25134:
---

 Summary: Unused RetryRule in KafkaConsumerTestBase swallows retries
 Key: FLINK-25134
 URL: https://issues.apache.org/jira/browse/FLINK-25134
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.13.3, 1.14.0, 1.15.0
Reporter: Fabian Paul


After merging https://issues.apache.org/jira/browse/FLINK-15493 a few tests are 
still not retried because the KafkaConsumerTestBase overwrites the RetryRule 
unnecessarily.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25135) HBase downloads in e2e tests fail often

2021-12-01 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-25135:
--

 Summary: HBase downloads in e2e tests fail often
 Key: FLINK-25135
 URL: https://issues.apache.org/jira/browse/FLINK-25135
 Project: Flink
  Issue Type: Bug
Reporter: Martijn Visser


This is an umbrella ticket for all HBase download related failures during e2e 
testing



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[VOTE] FLIP-193: Snapshots ownership

2021-12-01 Thread Dawid Wysakowicz
Dear devs,

I'd like to open a vote on FLIP-193: Snapshots ownership [1] which was
discussed in this thread [2].
The vote will be open for at least 72 hours unless there is an objection or
not enough votes.

Best,

Dawid

[1] https://cwiki.apache.org/confluence/x/bIyqCw

[2] https://lists.apache.org/thread/zw2crf0c7t7t4cb5cwcwjpvsb3r1ovz2






[jira] [Created] (FLINK-25136) the parallelism does not appear correctly when rescaling via the scheduler in reactive mode

2021-12-01 Thread KevinyhZou (Jira)
KevinyhZou created FLINK-25136:
--

 Summary: the parallelism does not appear correctly when rescaling via the 
scheduler in reactive mode
 Key: FLINK-25136
 URL: https://issues.apache.org/jira/browse/FLINK-25136
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task, Runtime / Web Frontend
Affects Versions: 1.14.0
Reporter: KevinyhZou


As described in the document 
[https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/elastic_scaling/]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [VOTE] FLIP-193: Snapshots ownership

2021-12-01 Thread Konstantin Knauf
Thanks, Dawid.

+1

On Wed, Dec 1, 2021 at 1:23 PM Dawid Wysakowicz 
wrote:

> Dear devs,
>
> I'd like to open a vote on FLIP-193: Snapshots ownership [1] which was
> discussed in this thread [2].
> The vote will be open for at least 72 hours unless there is an objection or
> not enough votes.
>
> Best,
>
> Dawid
>
> [1] https://cwiki.apache.org/confluence/x/bIyqCw
>
> [2] https://lists.apache.org/thread/zw2crf0c7t7t4cb5cwcwjpvsb3r1ovz2
>
>
>

-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk


Re: [VOTE] FLIP-193: Snapshots ownership

2021-12-01 Thread Till Rohrmann
Thanks for creating this FLIP Dawid.

+1

Cheers,
Till

On Wed, Dec 1, 2021 at 2:03 PM Konstantin Knauf  wrote:

> Thanks, Dawid.
>
> +1
>
> On Wed, Dec 1, 2021 at 1:23 PM Dawid Wysakowicz 
> wrote:
>
> > Dear devs,
> >
> > I'd like to open a vote on FLIP-193: Snapshots ownership [1] which was
> > discussed in this thread [2].
> > The vote will be open for at least 72 hours unless there is an objection
> or
> > not enough votes.
> >
> > Best,
> >
> > Dawid
> >
> > [1] https://cwiki.apache.org/confluence/x/bIyqCw
> >
> > [2] https://lists.apache.org/thread/zw2crf0c7t7t4cb5cwcwjpvsb3r1ovz2
> >
> >
> >
>
> --
>
> Konstantin Knauf
>
> https://twitter.com/snntrable
>
> https://github.com/knaufk
>


Re: [DISCUSS] Definition of Done for Apache Flink

2021-12-01 Thread Johannes Moser
Hi everyone,

thanks for discussing this and providing your input. 

Here's a summary of what I got out of your responses:
* There are already a process and a template in place, but they aren't really followed.
* Just adjusting the process or the template won't help.
* Opening PRs shouldn't become more difficult (rather easier)
* Committers should be aware of the 'Definition of Done' and check it.
* The Github template should inform the contributor on what is expected when 
issuing a PR.
* Martijn is suggesting a three step template (code contribution process, 
tests, documentation).
* They could be automated.
* There are silent fans of the current template.
* Manually listing which tests changed has no value.
* We should use checkboxes.
* The bot is not leveraged.

Clarification/thoughts from my end:
* @Francesco, I'm aware of that list and extended it here [1]
* @Fabian, we were hoping that by putting more emphasis on quality, the stability 
of master would also increase.
* Most of the comments circled around the PR template, that's why I assume the 
community is fine with the change in the guide.

Generally speaking, I'm aware that there are a lot of potential improvements 
around the whole contribution process: the bot, awareness, the PR template in 
general... a few of them have been mentioned. With my PRs I mainly wanted to 
address the clarification of the 'definition of done'. While I'm happy to 
contribute to other suggestions in a further step, I'd like to keep the focus on 
the DoD for now.

What I would suggest:
* I will merge the flink-web PR as this seems to be fine. [1]
* I will update the template PR to make it simpler (and use CHECKBOXES) in 
general, while keeping the aspects.

Allow me just one more comment. I'm well aware, as some of you pointed out, that 
this is a matter of culture and habit. The best template is useless if it is not 
followed. My goal was to complete the definition of done and to raise awareness 
for tests and documentation, as well as for what 'done' means to the Apache Flink 
community in general. In the end it is up to every individual to follow what is 
already there and whatever we might agree on adding. By having so many people 
discuss this, we have already come a step closer.

Best, Joe

[1] https://github.com/apache/flink-web/pull/481

> On 25.11.2021, at 10:56, Martijn Visser  wrote:
> 
>> * 187 contain "yes / no / don't know" and thus have not answered at least
> one of the questions
> 
> I think there are quite some PRs that highlight in bold the answer to this
> question while keeping all options there. I'm all in favour of replacing
> this with a checkbox.
> 
> 
> 
> On Thu, 25 Nov 2021 at 10:47, Ingo Bürk  wrote:
> 
>> Hi Till,
>> 
>>> * I agree with Ingo that the "Verifying this change" section can be
>>> cumbersome to fill in. On the other hand it reminds contributors to
>> verify
>>> that his/her changes are covered by tests. Therefore, I would keep it.
>> 
>> IMO it could be replaced with a checkbox "I have made sure all changes are
>> covered by tests", though. I still see no benefit in painstakingly picking
>> out all individual test cases that have been touched or affected.
>> 
>>> In order to verify this I went through
>>> the first two pages of open PRs and I was very positively surprised that
>>> almost all PRs filled out the current template.
>> 
>> Just to try and complete the picture here, counting open PRs by including
>> search terms like "yes / no / don't know" the rough statistics are
>> 
>> * 731 open PRs
>> * 187 contain "yes / no / don't know" and thus have not answered at least
>> one of the questions
>> * 134 PRs seem to have deleted the PR template altogether as the question
>> text does not appear in it
>> 
>> Given that the latter two have to be largely mutually exclusive, that would
>> mean that 40+% of all PRs do not fill out the PR template.
>> 
>>> because what is not used can be removed.
>> 
>> Just because someone answers a question doesn't make it a useful question,
>> though. Some questions undoubtedly have a purpose for PR authors, like
>> having to decide whether the public API is affected. But what purpose do
>> "Anything that affects deployment or recovery" or "The S3 file system
>> connector" serve? I don't think this is useful to authors in any way, so
>> the other question is whether any committer actually looks at these answers
>> and does something with it. Even if we don't remove everything (which I
>> wouldn't want to, either), we can still remove those things that aren't
>> needed (if that is the case).
>> 
>> (Also, in any case I would really love if all questions could be converted
>> to checkboxes instead)
>> 
>> 
>> Ingo
>> 
>> On Thu, Nov 25, 2021 at 10:31 AM Till Rohrmann 
>> wrote:
>> 
>>> When I started writing this response I thought that the current PR
>> template
>>> is mostly ignored by the community. In order to verify this I went
>> through
>>> the first two pages of open PRs and I was very pos

Be able to use a custom KafkaSubscriber to have dynamic custom topic selection

2021-12-01 Thread CARRIERE Etienne
Hello,

Thank you for the work on the dynamic topic/partition topology that is possible 
with KafkaSource, which permits a lot of things (tested in 1.14). But in our 
organization we have a custom need: the list of topics is defined by an external 
source (an SQL database in our case).
We plan to write a custom KafkaSubscriber 
(https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/enumerator/subscriber/KafkaSubscriber.java) 
that will implement our logic.
Unfortunately, we can't inject our custom subscriber into KafkaSourceBuilder (the 
builder only has methods for the standard subscribers), nor into KafkaSource, as 
its constructor is not public.

What would be a good way to implement this need?
Would a PR that adds an "advanced" method to the builder, or that makes the 
KafkaSource constructor public, be accepted?
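
For illustration, a rough sketch of what such a subscriber could look like, assuming the KafkaSubscriber interface linked above (fetchTopicsFromDatabase is a hypothetical placeholder for the SQL lookup described here):

import java.util.HashSet;
import java.util.Set;
import org.apache.flink.connector.kafka.source.enumerator.subscriber.KafkaSubscriber;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.common.TopicPartition;

public class DatabaseBackedSubscriber implements KafkaSubscriber {

    @Override
    public Set<TopicPartition> getSubscribedTopicPartitions(AdminClient adminClient) {
        Set<String> topics = fetchTopicsFromDatabase();
        Set<TopicPartition> partitions = new HashSet<>();
        try {
            // Resolve the partitions of the currently configured topics.
            adminClient.describeTopics(topics).all().get()
                    .forEach((topic, description) ->
                            description.partitions().forEach(info ->
                                    partitions.add(new TopicPartition(topic, info.partition()))));
        } catch (Exception e) {
            throw new RuntimeException("Failed to resolve topic metadata", e);
        }
        return partitions;
    }

    private Set<String> fetchTopicsFromDatabase() {
        // Hypothetical: query the external SQL database for the topic list.
        return Set.of("topic-a", "topic-b");
    }
}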

Thanks in advance,

Regards,

Etienne Carrière
=

Ce message et toutes les pieces jointes (ci-apres le "message")
sont confidentiels et susceptibles de contenir des informations
couvertes par le secret professionnel. Ce message est etabli
a l'intention exclusive de ses destinataires. Toute utilisation
ou diffusion non autorisee interdite.
Tout message electronique est susceptible d'alteration. La SOCIETE GENERALE
et ses filiales declinent toute responsabilite au titre de ce message
s'il a ete altere, deforme falsifie.

=

This message and any attachments (the "message") are confidential,
intended solely for the addresses, and may contain legally privileged
information. Any unauthorized use or dissemination is prohibited.
E-mails are susceptible to alteration. Neither SOCIETE GENERALE nor any
of its subsidiaries or affiliates shall be liable for the message
if altered, changed or falsified.

=


Re: [DISCUSS] FLIP-190: Support Version Upgrades for Table API & SQL Programs

2021-12-01 Thread Timo Walther

Response to Francesco's feedback:

> *Proposed changes #6*: Other than defining this rule of thumb, we 
must also make sure that compiling plans with these objects that cannot 
be serialized in the plan must fail hard


Yes, I totally agree. We will fail hard with a helpful exception. Any 
mistake, e.g. using an inline object in the Table API or an invalid DataStream 
API source without a uid, should immediately fail the plan compilation step. 
I added a remark to the FLIP again.


> What worries me is breaking changes, in particular behavioural 
changes that might happen in connectors/formats


Breaking changes in connectors and formats need to be encoded in the 
options. I could also imagine versioning the factory identifier, e.g. 
`connector=kafka` and `connector=kafka-2`, if this is necessary.


After thinking about your question again, I think we will also need the 
same testing infrastructure for our connectors and formats, especially restore 
tests and completeness tests. I updated the document accordingly. I also 
added a way to generate UIDs for DataStream API providers.


> *Functions:* Are we talking about the function name or the function 
complete signature?


For catalog functions, the identifier contains the catalog name and database 
name. For system functions, the identifier contains only a name, which makes 
function name and identifier identical. I reworked the section again and 
also fixed some of the naming conflicts you mentioned.


> we should perhaps use a logically defined unique id like 
/bigIntToTimestamp/


I added a concrete example for the resolution and restoration. The 
unique id is composed of name + version. Internally, this is represented 
as `$TO_TIMESTAMP_LTZ$1`.


> I think we should rather keep JSON out of the concept

Sounds OK to me. In SQL we also just call it "plan". I will change the 
file sections, but I would suggest keeping the fromJsonString method.


> write it back in the original plan file

I updated the terminology section for what we consider an "upgrade". We 
might need to update the original plan file. This is already considered 
in the COMPILE PLAN ... FROM ... even though this is future work, as is 
savepoint migration.


Thanks for all the feedback!

Timo


On 30.11.21 14:28, Timo Walther wrote:

Response to Wenlongs's feedback:

 > I would prefer not to provide such a shortcut, let users use  COMPILE 
PLAN IF NOT EXISTS and EXECUTE explicitly, which can be understood by 
new users even without inferring the docs.


I would like to hear more opinions on this topic. Personally, I find a 
combined statement very useful. Not only for quicker development and 
debugging but also for readability. It helps in keeping the JSON path 
and the query close to each other in order to know the origin of the plan.


 > but the plan and SQL are not matched. The result would be quite 
confusing if we still execute the plan directly, we may need to add a 
validation.


You are right that there could be a mismatch. But we have a similar 
problem when executing CREATE TABLE IF NOT EXISTS. The schema or options 
of a table could have changed completely in the catalog but the CREATE 
TABLE IF NOT EXISTS is not executed again. So a mismatch could also 
occur there.


Regards,
Timo

On 30.11.21 14:17, Timo Walther wrote:

Hi everyone,

thanks for the feedback so far. Let me answer each email indvidually.

I will start with a response to Ingo's feedback:

 > Will the JSON plan's schema be considered an API?

No, not in the first version. This is explicitly mentioned in the 
`General JSON Plan Assumptions`. I tried to improve the section once 
more to make it clearer. However, the JSON plan is definitely stable 
per minor version. And since the plan is versioned by Flink version, 
external tooling could be build around it. We might make it public API 
once the design has settled.


 > Given that upgrades across multiple versions at once are 
unsupported, do we verify this somehow?


Good question. I extended the `General JSON Plan Assumptions`. Now 
yes: the Flink version is part of the JSON plan and will be verified 
during restore. But keep in mind that we might support more than just 
the last version, at least until the JSON plan has been migrated.


Regards,
Timo

On 30.11.21 09:39, Marios Trivyzas wrote:
I have a question regarding the `COMPILE PLAN OVEWRITE`. If we choose 
to go

with the config option instead,
that doesn't provide the flexibility to overwrite certain plans but not
others, since the config applies globally, isn't that
something to consider?

On Mon, Nov 29, 2021 at 10:15 AM Marios Trivyzas  
wrote:



Hi Timo!

Thanks a lot for taking all that time and effort to put together this
proposal!

Regarding:
For simplification of the design, we assume that upgrades use a 
step size

of a single
minor version. We don't guarantee skipping minor versions (e.g. 1.11 to
1.14).

I think that for this first step we should make it absolutely clear 
to the

users that they would need to go through a

Re: [DISCUSS] Shall casting functions return null or throw exceptions for invalid input

2021-12-01 Thread Marios Trivyzas
FYI: Opened https://github.com/apache/flink/pull/17985 which will introduce
the config option,
so we can continue working on the CAST fixes and improvements. It will be
very easy to flip
the default behaviour (currently on the PR: legacy = ENABLED) when this
discussion concludes,
and update the documentation accordingly.
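
For readers following along, a minimal sketch of the behavioural difference under discussion, assuming the semantics proposed in this thread (CAST raises an error on invalid input, TRY_CAST returns NULL); the snippet is illustrative only:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class CastBehaviourDemo {
    public static void main(String[] args) {
        TableEnvironment env = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Legacy behaviour: returns NULL for unparsable input.
        // Proposed default: the same query fails at runtime instead.
        env.executeSql("SELECT CAST('not-a-number' AS INT)").print();

        // TRY_CAST, as proposed, always returns NULL on failure.
        env.executeSql("SELECT TRY_CAST('not-a-number' AS INT)").print();
    }
}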

On Mon, Nov 29, 2021 at 10:37 AM Marios Trivyzas  wrote:

> I definitely understand the argument to continue supporting the existing
> (return null) as the default behaviour.
> I'd like to point out though that if we decide this (option no.2) it's
> kind of unnatural, to directly drop the flag in *Flink 2.0*
> for example, and force the users at that point to use either *CAST *(throw
> errors) or *TRY_CAST* (define a default return value).
>
> The decision for the default value of this flag, is a commitment, because
> In my opinion, changing this default value in the future
> to throw errors instead, is not an option, as this will definitely confuse
> the users, so the next step (in future versions) would be to
> completely drop the flag and have the users choosing between *CAST* and
> *TRY_CAST.*
>
> Therefore, and speaking from a developing cycle perspective, my personal
> preference is to go with option no.1 which is in line
> with the usual approach (at least to my experience :)) in the open source
> software.
>
> Best regards,
> Marios
>
>
> On Tue, Nov 23, 2021 at 12:59 PM Martijn Visser 
> wrote:
>
>> Hi all,
>>
>> My conclusion at this point is that there is consensus that the default
>> behaviour of CAST should raise errors when it fails and that it should be
>> configurable via a configuration flag.
>>
>> The open discussion is on when we want to fix the current (incorrect)
>> behaviour:
>>
>> 1. Doing it in the next Flink release (1.15) by setting the configuration
>> flag to fail by default
>> 2. Keep the current (incorrect) behaviour in Flink 1.15 by setting the
>> configuration flag to the current situation and only changing this
>> if/whenever a Flink 2.0 version is released.
>>
>> From my perspective, I can understand that going for option 2 is a
>> preferred option for those that are running large Flink setups with a
>> great
>> number of users. I am wondering if those platforms also have the ability
>> to
>> set default values and/or override user configuration. That could be a way
>> to solve this issue for these platform teams.
>>
>> I would prefer to go for option 1, because I think correct execution of
>> CAST is important, especially for new Flink users. These new users should
>> have a smooth user experience and shouldn't need to change configuration
>> flags to get correct behaviour. I do expect that users who have used Flink
>> before are more familiar with checking release notes and interpreting how
>> this potentially affects them. That's why we have release notes. I also
>> doubt that we will have a Flink 2.0 release any time soon, meaning that we
>> are only going to make the pain even bigger for more users if we change
>> this incorrect behaviour at a later time.
>>
>> Best regards,
>>
>> Martijn
>>
>> On Tue, 23 Nov 2021 at 02:10, Kurt Young  wrote:
>>
>> > > This is what I don't really understand here: how adding a
>> configuration
>> > option causes issues here?
>> > This is why: for most Flink production use cases I see, it's not like a
>> > couple of people manage ~5 Flink
>> > jobs, so they can easily track all the big changes in every minor Flink
>> > version. Typically use case are like
>> > a group of people managing some streaming platform, which will provide
>> > Flink as an execution engine
>> > to their users. Alibaba has more than 40K online streaming SQL jobs, and
>> > ByteDance also has a similar
>> > number. Most of the time, whether upgrading Flink version will be
>> > controlled by the user of the platform,
>> > not the platform provider. The platform will most likely provide
>> multiple
>> > Flink version support.
>> >
>> > Even if you can count on the platform provider to read all the release
>> > notes carefully, their users won't. So
>> > we are kind of throw the responsibility to all the platform provider,
>> make
>> > them to take care of the semantic
>> > changes. They have to find some good way to control the impactions when
>> > their users upgrade Flink's version.
>> > And if they don't find a good solution around this, and their users
>> > encounter some online issues, they will be
>> > blamed. And you can guess who they would blame.
>> >
>> > Flink is a very popular engine now, every decision we make will affect
>> the
>> > users a lot. If you want them to make
>> > some changes, I would argue we should make them think it's worth it.
>> >
>> > Best,
>> > Kurt
>> >
>> >
>> > On Mon, Nov 22, 2021 at 11:29 PM Francesco Guardiani <
>> > france...@ververica.com> wrote:
>> >
>> > > > NULL in SQL essentially means "UNKNOWN", it's not as scary as a
>> null in
>> > > java which will easily cause a NPE or some random behavior with a c++

Re: Be able to use a custom KafkaSubscriber to have dynamic custom topic selection

2021-12-01 Thread Till Rohrmann
Hi Etienne,

I think this is a reasonable request. The question we need to answer is
what kind of stability guarantees we want to provide for the
KafkaSubscriber interface. Once this is done, I am sure that we can also
expose this part for advanced use cases. I am pulling in Arvid and Fabian
who are working on these parts.

Cheers,
Till

On Wed, Dec 1, 2021 at 3:10 PM CARRIERE Etienne
 wrote:

> Hello,
>
> Thank you for the work on dynamic topic/partition topology possible with
> KafkaSource which permit to do a lot of things (tested in 1.14) . But, in
> our organization, we have a custom need to have the list of topic defined
> from an external source (SQL database in our case).
> We plan to write a custom KafkaSubscriber<
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/enumerator/subscriber/KafkaSubscriber.java>
> that will implement our logic.
> Unfortunately, we can't inject our custom subscriber in KafkaSourceBuilder
> => the builder has only methods for standard subscriber nor in KafkaSource
> as the constructor is not public.
>
> What can be a good method to implement this need ?
> Would a PR which add "advanced" method to Builder or which put the
> KafkaSource as public be accepted ?
>
> Thanks in advance,
>
> Regards,
>
> Etienne Carrière
> =
>
> Ce message et toutes les pieces jointes (ci-apres le "message")
> sont confidentiels et susceptibles de contenir des informations
> couvertes par le secret professionnel. Ce message est etabli
> a l'intention exclusive de ses destinataires. Toute utilisation
> ou diffusion non autorisee interdite.
> Tout message electronique est susceptible d'alteration. La SOCIETE GENERALE
> et ses filiales declinent toute responsabilite au titre de ce message
> s'il a ete altere, deforme falsifie.
>
> =
>
> This message and any attachments (the "message") are confidential,
> intended solely for the addresses, and may contain legally privileged
> information. Any unauthorized use or dissemination is prohibited.
> E-mails are susceptible to alteration. Neither SOCIETE GENERALE nor any
> of its subsidiaries or affiliates shall be liable for the message
> if altered, changed or falsified.
>
> =
>


[jira] [Created] (FLINK-25137) Cannot do window aggregation after a regular join: Window can only be defined on a time attribute column

2021-12-01 Thread Hongbo (Jira)
Hongbo created FLINK-25137:
--

 Summary: Cannot do window aggregation after a regular join: Window 
can only be defined on a time attribute column
 Key: FLINK-25137
 URL: https://issues.apache.org/jira/browse/FLINK-25137
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.14.0
Reporter: Hongbo


I have a stream joining with a bounded lookup table. Then I want to do a window 
aggregation on this stream.

The code looks like this:

 
{code:java}
var joined = tableEnv.sqlQuery("SELECT test.ts, test.v, lookup.c from test" +
        " join lookup on test.v = lookup.a");
tableEnv.createTemporaryView("joined_table", joined);

var agg = tableEnv.sqlQuery("SELECT window_start, window_end, c, sum(v)\n" +
        " FROM TABLE(\n" +
        "   HOP(TABLE joined_table, DESCRIPTOR(ts), INTERVAL '10' seconds, INTERVAL '5' minutes))\n" +
        " GROUP BY window_start, window_end, c");
{code}
 

However, it failed with an exception:
{code:java}
Window can only be defined on a time attribute column, but is type of TIMESTAMP(3)
org.apache.flink.table.api.ValidationException: Window can only be defined on a time attribute column, but is type of TIMESTAMP(3)
    at org.apache.flink.table.planner.plan.utils.WindowUtil$.convertToWindowingStrategy(WindowUtil.scala:183)
    at org.apache.flink.table.planner.plan.metadata.FlinkRelMdWindowProperties.getWindowProperties(FlinkRelMdWindowProperties.scala:193)
    at GeneratedMetadataHandler_WindowProperties.getWindowProperties_$(Unknown Source)
    at GeneratedMetadataHandler_WindowProperties.getWindowProperties(Unknown Source)
    at org.apache.flink.table.planner.plan.metadata.FlinkRelMetadataQuery.getRelWindowProperties(FlinkRelMetadataQuery.java:261)
    at org.apache.flink.table.planner.plan.metadata.FlinkRelMdWindowProperties.getProjectWindowProperties(FlinkRelMdWindowProperties.scala:89)
    at org.apache.flink.table.planner.plan.metadata.FlinkRelMdWindowProperties.getWindowProperties(FlinkRelMdWindowProperties.scala:75)
    at GeneratedMetadataHandler_WindowProperties.getWindowProperties_$(Unknown Source)
    at GeneratedMetadataHandler_WindowProperties.getWindowProperties(Unknown Source)
    at org.apache.flink.table.planner.plan.metadata.FlinkRelMetadataQuery.getRelWindowProperties(FlinkRelMetadataQuery.java:261)
    at org.apache.flink.table.planner.calcite.RelTimeIndicatorConverter.gatherIndicesToMaterialize(RelTimeIndicatorConverter.java:433)
    at org.apache.flink.table.planner.calcite.RelTimeIndicatorConverter.convertAggInput(RelTimeIndicatorConverter.java:416)
    at org.apache.flink.table.planner.calcite.RelTimeIndicatorConverter.visitAggregate(RelTimeIndicatorConverter.java:402)
    at org.apache.flink.table.planner.calcite.RelTimeIndicatorConverter.visit(RelTimeIndicatorConverter.java:167)
    at org.apache.calcite.rel.AbstractRelNode.accept(AbstractRelNode.java:217)
    at org.apache.flink.table.planner.calcite.RelTimeIndicatorConverter.convert(RelTimeIndicatorConverter.java:133)
    at org.apache.flink.table.planner.plan.optimize.program.FlinkRelTimeIndicatorProgram.optimize(FlinkRelTimeIndicatorProgram.scala:35)
    at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.$anonfun$optimize$1(FlinkChainedProgram.scala:62)
    at scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:156)
    at scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:156)
    at scala.collection.Iterator.foreach(Iterator.scala:937)
    at scala.collection.Iterator.foreach$(Iterator.scala:937)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1425)
    at scala.collection.IterableLike.foreach(IterableLike.scala:70)
    at scala.collection.IterableLike.foreach$(IterableLike.scala:69)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:156)
    at scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:154)
    at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
    at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.optimize(FlinkChainedProgram.scala:58)
    at org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.optimizeTree(StreamCommonSubGraphBasedOptimizer.scala:163)
    at org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.doOptimize(StreamCommonSubGraphBasedOptimizer.scala:81)
    at org.apache.flink.table.planner.plan.optimize.CommonSubGraphBasedOptimizer.optimize(CommonSubGraphBasedOptimizer.scala:77)
    at org.apache.flink.table.planner.delegation.PlannerBase.optimize(PlannerBase.scala:300)
    at org.apache.flink.table.planner.delegation.StreamPlanner.explain(StreamPlanner.scala:104)
    at org.apache.flink.table.planner.delegation.StreamPlanner.explain(S

RE: Re: Be able to use a custom KafkaSubscriber to have dynamic custom topic selection

2021-12-01 Thread Mason Chen
Hi all,

I was also looking into this because I have a use case in which topics could 
change.

This was previously discussed on the user mailing list 
(https://lists.apache.org/thread/ys3171grvbw843lfjph8gcnoydqcwq29) and this is 
the JIRA ticket that was created to track exposing the KafkaSubscriber: 
https://issues.apache.org/jira/browse/FLINK-24660.

I was working on it and thought of some issues regarding the topics 
changing: since the Kafka subscriber is used to discover the current splits, it 
could be possible that some previously consumed topics should not be consumed 
anymore, requiring a split removal mechanism. I talked with Qingsheng about 
this here: https://lists.apache.org/thread/3wxfr39t2rz1wvxw2vsz5hsrp9t8qrwx 
(there is a simple example where splits would need to be removed, because the 
user changes the topic list while the reader state reflects the old topic list).

I think there needs to be some sort of filtering logic on splits, where the 
readers wait for the enumerator to assign splits (although readers may have some 
state), plus a split removal mechanism via enumerator RPC.

Best,
Mason

On 2021/12/01 16:48:12 Till Rohrmann wrote:
> Hi Etienne,
> 
> I think this is a reasonable request. The question we need to answer is
> what kind of stability guarantees we want to provide for the
> KafkaSubscriber interface. Once this is done, I am sure that we can also
> expose this part for advanced use cases. I am pulling in Arvid and Fabian
> who are working on these parts.
> 
> Cheers,
> Till
> 
> On Wed, Dec 1, 2021 at 3:10 PM CARRIERE Etienne
>  wrote:
> 
> > Hello,
> >
> > Thank you for the work on dynamic topic/partition topology possible with
> > KafkaSource which permit to do a lot of things (tested in 1.14) . But, in
> > our organization, we have a custom need to have the list of topic defined
> > from an external source (SQL database in our case).
> > We plan to write a custom KafkaSubscriber<
> > https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/enumerator/subscriber/KafkaSubscriber.java>
> > that will implement our logic.
> > Unfortunately, we can't inject our custom subscriber in KafkaSourceBuilder
> > => the builder has only methods for standard subscriber nor in KafkaSource
> > as the constructor is not public.
> >
> > What can be a good method to implement this need ?
> > Would a PR which add "advanced" method to Builder or which put the
> > KafkaSource as public be accepted ?
> >
> > Thanks in advance,
> >
> > Regards,
> >
> > Etienne Carrière
> >
> 

[jira] [Created] (FLINK-25138) SQL Client improve completion for commands

2021-12-01 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-25138:
---

 Summary: SQL Client improve completion for commands
 Key: FLINK-25138
 URL: https://issues.apache.org/jira/browse/FLINK-25138
 Project: Flink
  Issue Type: Sub-task
Reporter: Sergey Nuyanzin


Command completion could include command descriptions (toggleable via a 
property).
{{SET}} and {{RESET}} could suggest properties as completion candidates (at 
least those mentioned in the config map).
Add completion for other commands such as {{SHOW}}, {{ADD}} and {{INSERT}} 
(only {{INTO}} and {{OVERWRITE}}).
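
As an illustration, a completer along these lines could be hooked into the 
client's line reader. This is only a sketch under the assumption that the SQL 
Client keeps using JLine 3; SetResetPropertyCompleter and the configMap 
parameter are hypothetical names, not existing SQL Client classes.

{code:java}
// A minimal sketch, assuming JLine 3; not the actual SQL Client completer.
import java.util.List;
import java.util.Map;

import org.jline.reader.Candidate;
import org.jline.reader.Completer;
import org.jline.reader.LineReader;
import org.jline.reader.ParsedLine;

public class SetResetPropertyCompleter implements Completer {

    // Known properties, e.g. the entries currently in the config map.
    private final Map<String, String> configMap;

    public SetResetPropertyCompleter(Map<String, String> configMap) {
        this.configMap = configMap;
    }

    @Override
    public void complete(LineReader reader, ParsedLine line, List<Candidate> candidates) {
        // Simplified: only suggest property keys after SET or RESET.
        String buffer = line.line().trim().toUpperCase();
        if (buffer.startsWith("SET") || buffer.startsWith("RESET")) {
            configMap.keySet().forEach(key -> candidates.add(new Candidate(key)));
        }
    }
}
{code}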



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25139) Shaded Hadoop S3A with credentials provider failed on azure

2021-12-01 Thread Yun Gao (Jira)
Yun Gao created FLINK-25139:
---

 Summary: Shaded Hadoop S3A with credentials provider failed on 
azure
 Key: FLINK-25139
 URL: https://issues.apache.org/jira/browse/FLINK-25139
 Project: Flink
  Issue Type: Bug
  Components: FileSystems
Affects Versions: 1.13.3
Reporter: Yun Gao


{code:java}
Dec 01 19:33:39     at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1232) ~[?:?]
Dec 01 19:33:39     at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2169) ~[?:?]
Dec 01 19:33:39     at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149) ~[?:?]
Dec 01 19:33:39     at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088) ~[?:?]
Dec 01 19:33:39     at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1734) ~[?:?]
Dec 01 19:33:39     at org.apache.hadoop.fs.s3a.S3AFileSystem.exists(S3AFileSystem.java:2970) ~[?:?]
Dec 01 19:33:39     at org.apache.flink.fs.s3hadoop.common.HadoopFileSystem.exists(HadoopFileSystem.java:165) ~[?:?]
Dec 01 19:33:39     at org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem.exists(PluginFileSystemFactory.java:148) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.core.fs.FileSystem.initOutPathDistFS(FileSystem.java:984) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.api.common.io.FileOutputFormat.initializeGlobal(FileOutputFormat.java:299) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.runtime.jobgraph.InputOutputFormatVertex.initializeOnMaster(InputOutputFormatVertex.java:110) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.runtime.executiongraph.DefaultExecutionGraphBuilder.buildGraph(DefaultExecutionGraphBuilder.java:174) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.runtime.scheduler.DefaultExecutionGraphFactory.createAndRestoreExecutionGraph(DefaultExecutionGraphFactory.java:107) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.runtime.scheduler.SchedulerBase.createAndRestoreExecutionGraph(SchedulerBase.java:342) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.runtime.scheduler.SchedulerBase.<init>(SchedulerBase.java:190) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.runtime.scheduler.DefaultScheduler.<init>(DefaultScheduler.java:122) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:132) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.runtime.jobmaster.DefaultSlotPoolServiceSchedulerFactory.createScheduler(DefaultSlotPoolServiceSchedulerFactory.java:110) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:340) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:317) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.internalCreateJobMasterService(DefaultJobMasterServiceFactory.java:107) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.lambda$createJobMasterService$0(DefaultJobMasterServiceFactory.java:95) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedSupplier$4(FunctionUtils.java:112) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
Dec 01 19:33:39     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ~[?:1.8.0_292]
Dec 01 19:33:39     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_292]
Dec 01 19:33:39     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_292]
Dec 01 19:33:39     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_292]
Dec 01 19:33:39     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.8.0_292]
Dec 01 19:33:39     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_292]
Dec 01 19:33:39     at java.util

[jira] [Created] (FLINK-25140) Streaming File Sink s3 end-to-end test failed due to job has not started within a timeout of 10 sec

2021-12-01 Thread Yun Gao (Jira)
Yun Gao created FLINK-25140:
---

 Summary: Streaming File Sink s3 end-to-end test failed due to job 
has not started within a timeout of 10 sec
 Key: FLINK-25140
 URL: https://issues.apache.org/jira/browse/FLINK-25140
 Project: Flink
  Issue Type: Bug
  Components: Connectors / FileSystem
Affects Versions: 1.13.3
Reporter: Yun Gao


{code:java}
Dec 01 19:06:38 Starting taskexecutor daemon on host fv-az26-327.
Dec 01 19:06:38 Submitting job.
Dec 01 19:06:54 Job (62f9a00856309492574699642574071c) is not yet running.
Dec 01 19:06:57 Job (62f9a00856309492574699642574071c) is not yet running.
Dec 01 19:07:00 Job (62f9a00856309492574699642574071c) is not yet running.
Dec 01 19:07:03 Job (62f9a00856309492574699642574071c) is not yet running.
Dec 01 19:07:06 Job (62f9a00856309492574699642574071c) is not yet running.
Dec 01 19:07:09 Job (62f9a00856309492574699642574071c) is not yet running.
Dec 01 19:07:12 Job (62f9a00856309492574699642574071c) is not yet running.
Dec 01 19:07:15 Job (62f9a00856309492574699642574071c) is not yet running.
Dec 01 19:07:18 Job (62f9a00856309492574699642574071c) is not yet running.
Dec 01 19:07:21 Job (62f9a00856309492574699642574071c) is not yet running.
Dec 01 19:07:22 Job (62f9a00856309492574699642574071c) has not started within a 
timeout of 10 sec
Dec 01 19:07:22 Stopping job timeout watchdog (with pid=401626)
 {code}
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=27382&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=ff888d9b-cd34-53cc-d90f-3e446d355529&l=12560



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25141) Elasticsearch connector customize sink parallelism

2021-12-01 Thread Ada Wong (Jira)
Ada Wong created FLINK-25141:


 Summary: Elasticsearch connector customize sink parallelism
 Key: FLINK-25141
 URL: https://issues.apache.org/jira/browse/FLINK-25141
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / ElasticSearch
Affects Versions: 1.14.0
Reporter: Ada Wong


Inspired by the JDBC and Kafka connectors, add a 'sink.parallelism' option and 
wire it through via SinkProvider#of(sink, sinkParallelism).
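
A minimal sketch of the proposed wiring follows, assuming it mirrors the 
JDBC/Kafka pattern; the class and helper names below are illustrative, only 
SinkProvider#of and the 'sink.parallelism' key come from the proposal.

{code:java}
// A minimal sketch, assuming the JDBC/Kafka pattern; class name is illustrative.
import javax.annotation.Nullable;

import org.apache.flink.api.connector.sink.Sink;
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;
import org.apache.flink.table.connector.sink.SinkProvider;
import org.apache.flink.table.data.RowData;

public final class ElasticsearchSinkParallelismSketch {

    /** The option a user would set in the WITH clause, e.g. 'sink.parallelism' = '2'. */
    public static final ConfigOption<Integer> SINK_PARALLELISM =
            ConfigOptions.key("sink.parallelism")
                    .intType()
                    .noDefaultValue()
                    .withDescription(
                            "Custom parallelism for the Elasticsearch sink; "
                                    + "by default the planner decides.");

    /**
     * Forwards the configured parallelism to the planner; a null value keeps
     * the planner's default behavior, as in the JDBC and Kafka connectors.
     */
    public static SinkProvider provider(
            Sink<RowData, ?, ?, ?> elasticsearchSink, @Nullable Integer parallelism) {
        return SinkProvider.of(elasticsearchSink, parallelism);
    }
}
{code}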

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: Re: [VOTE] FLIP-193: Snapshots ownership

2021-12-01 Thread Yun Gao
+1 for making the lifecycle of the savepoint / external checkpoint clearer. 
Thanks Dawid for the FLIP~

Best,
Yun



 --Original Mail--
Sender: Till Rohrmann
Send Date: Wed Dec 1 21:28:48 2021
Recipients: dev
Subject:Re: [VOTE] FLIP-193: Snapshots ownership
Thanks for creating this FLIP Dawid.

+1

Cheers,
Till

On Wed, Dec 1, 2021 at 2:03 PM Konstantin Knauf  wrote:

> Thanks, Dawid.
>
> +1
>
> On Wed, Dec 1, 2021 at 1:23 PM Dawid Wysakowicz 
> wrote:
>
> > Dear devs,
> >
> > I'd like to open a vote on FLIP-193: Snapshots ownership [1] which was
> > discussed in this thread [2].
> > The vote will be open for at least 72 hours unless there is an objection
> or
> > not enough votes.
> >
> > Best,
> >
> > Dawid
> >
> > [1] https://cwiki.apache.org/confluence/x/bIyqCw
> >
> > [2] https://lists.apache.org/thread/zw2crf0c7t7t4cb5cwcwjpvsb3r1ovz2
> >
> >
> >
>
> --
>
> Konstantin Knauf
>
> https://twitter.com/snntrable
>
> https://github.com/knaufk
>